HUAWEI NE40E-M2 Series Universal Service Router

Feature Description - Network Reliability


Issue 01
Date 2018-12-05


Copyright © Huawei Technologies Co., Ltd. 2018. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior written
consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

The Huawei logo and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective
holders.

The purchased products, services and features are stipulated by the contract made between Huawei and the
customer. All or part of the products, services and features described in this document may not be within the
purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information,
and recommendations in this document are provided "AS IS" without warranties, guarantees or
representations of any kind, either express or implied.

The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.

Huawei Technologies Co., Ltd.

Address: Huawei Industrial Base
Bantian, Longgang
Shenzhen 518129
People's Republic of China


Issue 01 (2018-12-05) Copyright © Huawei Technologies Co., Ltd. i

HUAWEI NE40E-M2 Series Universal Service Router
Feature Description - Network Reliability Contents


1 About This Document

2 BFD
2.1 Overview of BFD
2.2 Understanding BFD
2.2.1 BFD Basic Concepts
2.2.2 BFD for IP
2.2.3 BFD for PST
2.2.4 Multicast BFD
2.2.5 BFD for PIS
2.2.6 BFD for Link-Bundle
2.2.7 BFD Echo
2.2.8 Board Selection Rules for BFD Sessions
2.2.9 BFD Dampening
2.3 Application Scenarios for BFD
2.3.1 BFD for Static Routes
2.3.2 BFD for RIP
2.3.3 BFD for OSPF
2.3.4 BFD for OSPFv3
2.3.5 BFD for IS-IS
2.3.6 BFD for BGP
2.3.7 BFD for LDP LSP
2.3.8 BFD for P2MP TE
2.3.9 BFD for TE CR-LSP
2.3.10 BFD for TE Tunnel
2.3.11 BFD for RSVP
2.3.12 BFD for VRRP
2.3.13 BFD for PW
2.3.14 BFD for Multicast VPLS
2.3.15 BFD for PIM

3 MPLS OAM
3.1 Overview of MPLS OAM
3.2 Understanding MPLS OAM

3.2.1 Basic Detection
3.2.2 Auto Protocol
3.3 Application Scenarios for MPLS OAM
3.3.1 Application of MPLS OAM in the IP RAN Layer 2 to Edge Scenario
3.3.2 Application of MPLS OAM in VPLS Networking
3.4 Terminology for MPLS OAM

4 MPLS-TP OAM
4.1 Overview of MPLS-TP OAM
4.2 Understanding MPLS-TP OAM
4.2.1 Basic Concepts
4.2.2 Continuity Check and Connectivity Verification
4.2.3 Packet Loss Measurement
4.2.4 Frame Delay Measurement
4.2.5 Remote Defect Indication
4.2.6 Loopback
4.3 Application Scenarios for MPLS-TP OAM
4.3.1 Application of MPLS-TP OAM in the IP RAN Layer 2 to Edge Scenario
4.3.2 Application of MPLS-TP OAM in VPLS Networking
4.4 Terminology for MPLS-TP OAM

5 VRRP
5.1 Overview of VRRP
5.2 Understanding VRRP
5.2.1 Basic VRRP Concepts
5.2.2 VRRP Packets
5.2.3 VRRP Operating Principles
5.2.4 Basic VRRP Functions
5.2.5 mVRRP
5.2.6 Association Between VRRP and a VRRP-disabled Interface
5.2.7 VRRP Tracking an Interface Monitoring Group
5.2.8 BFD for VRRP
5.2.9 VRRP Tracking EFM
5.2.10 VRRP Tracking CFM
5.2.11 VRRP Association with NQA
5.2.12 Association Between a VRRP Backup Group and a Route
5.2.13 Association Between Direct Routes and a VRRP Backup Group
5.2.14 Traffic Forwarding by a Backup Device
5.2.15 Rapid VRRP Switchback
5.2.16 Unicast VRRP
5.3 Application Scenarios for VRRP
5.3.1 IPRAN Gateway Protection Solution
5.4 Terminology for VRRP

6 Ethernet OAM
6.1 Overview of Ethernet OAM
6.2 Understanding EFM
6.2.1 Basic Concepts
6.2.2 Background
6.2.3 Basic Functions
6.3 Understanding CFM
6.3.1 Basic Concepts
6.3.2 Background
6.3.3 Basic Functions
6.3.4 CFM Alarms
6.4 Understanding Y.1731
6.4.1 Background
6.4.2 Basic Functions
6.5 Ethernet OAM Fault Advertisement
6.5.1 Background
6.5.2 Fault Information Advertisement Between EFM and Other Modules
6.5.3 Fault Information Advertisement Between CFM and Other Modules
6.6 Application Scenarios for Ethernet OAM
6.6.1 Ethernet OAM Applications on a MAN
6.6.2 Ethernet OAM Applications on an IPRAN
6.7 Our Advantages
6.7.1 EFM Enhancements

7 Ethernet LPT
7.1 Overview of LPT
7.2 Understanding LPT
7.2.1 Basic Principles
7.3 Application Scenarios for LPT
7.3.1 Point-to-Point Ethernet LPT

8 Dual-Device Backup
8.1 Overview of Dual-Device Backup
8.2 Dual-Device Backup Principles
8.2.1 Overview
8.2.2 Status Control
8.2.3 Service Control
8.2.4 IPv4 Unicast Forwarding Control
8.2.5 IPv4 Multicast Forwarding Control
8.2.6 IPv6 Unicast Forwarding Control
8.3 Application Scenarios for Dual-Device Backup
8.3.1 Dual-Device ARP Hot Backup
8.3.2 Dual-Device IGMP Snooping Hot Backup
8.3.3 Single-Homing Access in a Multi-Node Backup Scenario

8.3.4 Dual-Homing Access in a Multi-Node Backup Scenario
8.3.5 Load Balancing Between Equipment
8.3.6 Load Balancing Between Links
8.3.7 Load Balancing Between VLANs
8.3.8 Load Balancing Based on Odd and Even MAC Addresses
8.3.9 Multicast Hot Backup
8.4 Terminology for Dual-Device Backup

9 Bit-Error-Triggered Protection Switching
9.1 Overview of Bit-Error-Triggered Protection Switching
9.2 Understanding Bit-Error-Triggered Protection Switching
9.2.1 Bit Error Detection
9.2.2 Bit-Error-Triggered Section Switching
9.2.3 Bit-Error-Triggered IGP Route Switching
9.2.4 Bit-Error-Triggered Trunk Update
9.2.5 Bit-Error-Triggered RSVP-TE Tunnel Switching
9.2.6 Bit-Error-Triggered Switching for PW
9.2.7 Bit-Error-Triggered L3VPN Switching
9.2.8 Bit-Error-Triggered Static CR-LSP/PW/E-PW APS
9.2.9 Relationships Among Bit-Error-Triggered Protection Switching Features
9.3 Application Scenarios for Bit-Error-Triggered Protection Switching
9.3.1 Application of Bit-Error-Triggered Protection Switching in a Scenario in Which TE Tunnels Carry an IP RAN
9.3.2 Application of Bit-Error-Triggered Protection Switching in a Scenario in Which LDP LSPs Carry an IP RAN
9.3.3 Application of Bit-Error-Triggered Protection Switching in a Scenario in Which a Static CR-LSP/PW Carries L2VPN Services
9.4 Terminology for Bit-Error-Triggered Protection Switching

1 About This Document

This document describes the network reliability feature in terms of its overview, principles,
and applications.

Related Version
The following table lists the product version related to this document.

Product Name        Version
NE40E-M2 Series     V800R010C10
U2000               V200R018C50

Intended Audience
This document is intended for:
• Network planning engineers
• Commissioning engineers
• Data configuration engineers
• System maintenance engineers

Security Declaration
• Encryption algorithm declaration
The encryption algorithms DES/3DES/RSA (RSA-1024 or lower)/MD5 (in digital
signature scenarios and password encryption)/SHA1 (in digital signature scenarios)
provide low security, which may bring security risks. If the protocols in use allow it,
using more secure encryption algorithms, such as AES/RSA (RSA-2048 or higher)/SHA2/HMAC-SHA2, is
recommended.
• Password configuration declaration
– Do not set both the start and end characters of a password to "%^%#". This causes
the password to be displayed directly in the configuration file.

– To further improve device security, periodically change the password.

• Personal data declaration
Your purchased products, services, or features may use some personal data of users during
service operation or fault locating. You must define user privacy policies in compliance
with local laws and take proper measures to fully protect personal data.
• Feature declaration
– The NetStream feature may be used to analyze the communication information of
terminal customers for network traffic statistics and management purposes. Before
enabling the NetStream feature, ensure that it is performed within the boundaries
permitted by applicable laws and regulations. Effective measures must be taken to
ensure that information is securely protected.
– The mirroring feature may be used to analyze the communication information of
terminal customers for a maintenance purpose. Before enabling the mirroring
function, ensure that it is performed within the boundaries permitted by applicable
laws and regulations. Effective measures must be taken to ensure that information is
securely protected.
– The packet header obtaining feature may be used to collect or store some
communication information about specific customers for transmission fault and
error detection purposes. Huawei cannot offer services to collect or store this
information unilaterally. Before enabling the function, ensure that it is performed
within the boundaries permitted by applicable laws and regulations. Effective
measures must be taken to ensure that information is securely protected.
• Reliability design declaration
Network planning and site design must comply with reliability design principles and
provide device- and solution-level protection. Device-level protection includes the planning
principles of dual-network and inter-board dual-link design, which avoid single points of
failure and single-link failures. Solution-level protection refers to fast convergence
mechanisms, such as FRR and VRRP.

Special Declaration
• This document serves only as a guide. The content is written based on device
information gathered under lab conditions. It is intended as general guidance and
does not cover all scenarios. The content may differ from the information on user device
interfaces due to factors such as version upgrades and differences in device models,
board restrictions, and configuration files. The actual user device information takes
precedence over the content provided by this document. The preceding differences are
beyond the scope of this document.
• The maximum values provided in this document are obtained in specific lab
environments (for example, only a certain type of board or protocol is configured on a
tested device). The maximum values actually obtained may differ from the
maximum values provided in this document due to factors such as differences in
hardware configurations and carried services.
• Interface numbers used in this document are examples. Use the existing interface
numbers on devices for configuration.
• The pictures of hardware in this document are for reference only.

Symbol Conventions
The symbols that may be found in this document are defined as follows.

Symbol     Description

DANGER     Indicates an imminently hazardous situation which, if not
           avoided, will result in death or serious injury.

WARNING    Indicates a potentially hazardous situation which, if not
           avoided, could result in death or serious injury.

CAUTION    Indicates a potentially hazardous situation which, if not
           avoided, may result in minor or moderate injury.

NOTICE     Indicates a potentially hazardous situation which, if not
           avoided, could result in equipment damage, data loss,
           performance deterioration, or unanticipated results.
           NOTICE is used to address practices not related to personal
           injury.

NOTE       Calls attention to important information, best practices and
           tips.
           NOTE is used to address information not related to
           personal injury, equipment damage, and environment
           deterioration.
Change History
Updates between document issues are cumulative. Therefore, the latest document issue
contains all updates made in previous issues.
• Changes in Issue 01 (2018-12-05)
This issue is the first official release. The software version of this issue is

2 BFD


About This Chapter

2.1 Overview of BFD

2.2 Understanding BFD
2.3 Application Scenarios for BFD

2.1 Overview of BFD

Bidirectional Forwarding Detection (BFD) is a fault detection protocol that can quickly
determine a communication failure between devices and notify upper-layer applications.

To minimize the impact of device faults on services and improve network availability, a
network device must be able to quickly detect faults in communication with adjacent devices.
Measures can then be taken to promptly rectify the faults to ensure service continuity.
On a live network, link faults can be detected using either of the following mechanisms:

• Hardware detection: For example, the Synchronous Digital Hierarchy (SDH) alarm
function can be used to quickly detect link faults.
• Hello detection: If hardware detection is unavailable, Hello detection can be used to
detect link faults.
However, the two mechanisms have the following issues:
• Only certain media support hardware detection.
• Hello detection takes more than 1 second to detect a fault. When traffic is transmitted at
gigabit rates, such slow detection causes packet loss.
• On a Layer 3 network, the Hello packet detection mechanism cannot detect faults for all
routes, such as static routes.

BFD resolves these issues by providing:

l A low-overhead, short-duration method to detect faults on the path between adjacent
forwarding engines. The faults can be interface, data link, and even forwarding engine
faults.
l A single, unified mechanism to monitor any media and protocol layers in real time.

BFD offers the following benefits:
l BFD rapidly monitors link or IP route connectivity to improve network performance.
l Adjacent systems running BFD rapidly detect communication failures and establish a
backup channel to restore communications, which improves network reliability.

2.2 Understanding BFD

2.2.1 BFD Basic Concepts

Bidirectional Forwarding Detection (BFD) detects communication faults between forwarding
engines. Specifically, BFD checks the continuity of a data protocol on the path between
systems. The path can be a physical or logical link or a tunnel.
BFD interacts with upper-layer applications in the following manner:
l An upper-layer application provides BFD with parameters, such as the detection address
and time.
l BFD creates, deletes, or modifies sessions based on these parameters and notifies the
upper-layer application of the session status.
BFD has the following characteristics:
l Provides a low-overhead, short-duration method to detect faults on the path between
adjacent forwarding engines.
l Provides a single, unified mechanism to monitor any media and protocol layers in real
time.
The following sections describe the basic principles of BFD, including the BFD detection
mechanism, detected link types, session establishment modes, and session management.

BFD Detection Mechanism

Two systems establish a BFD session and periodically send BFD control packets along the
path between them. If one system does not receive BFD control packets within a specified
period, the system considers the path faulty.
BFD control packets are encapsulated using the User Datagram Protocol (UDP). In the initial
phase of a BFD session, both systems negotiate BFD parameters with each other using BFD
control packets. These parameters include discriminators, required minimum intervals at
which BFD control packets are sent and received, and local BFD session status. After the
negotiation is successful, both systems send BFD control packets along the path between
them at the negotiated intervals.
BFD provides two types of detection modes:

l Asynchronous mode: the primary BFD detection mode. In this mode, both systems
periodically send BFD control packets to each other. If one system fails to receive
several consecutive BFD control packets, the system considers the BFD session Down.
l Demand mode: After a BFD session is established, a system that has an independent way
of verifying connectivity can ask its peer to stop sending periodic BFD control packets.
The system then polls the peer only when connectivity must be verified explicitly.
The echo function can be used in both modes. When the echo function is activated, the local
system sends BFD echo packets and the remote system loops them back through the
forwarding channel. If several consecutive echo packets are not received, the session is
declared Down.
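
The timeout behavior of asynchronous mode can be sketched as follows. This is a minimal
illustration, not the NE40E implementation: it assumes the RFC 5880 convention that the
detection time equals the negotiated interval multiplied by a detect multiplier, and the
class name, field names, and the default multiplier of 3 are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class AsyncDetector:
    interval_ms: int        # negotiated interval at which the peer sends control packets
    detect_mult: int = 3    # assumed number of intervals tolerated without a packet
    last_rx_ms: int = 0     # time at which the last control packet arrived

    @property
    def detection_time_ms(self) -> int:
        return self.interval_ms * self.detect_mult

    def on_control_packet(self, now_ms: int) -> None:
        # Every received control packet restarts the detection timer.
        self.last_rx_ms = now_ms

    def is_down(self, now_ms: int) -> bool:
        # The session is declared Down once the detection time has elapsed.
        return now_ms - self.last_rx_ms >= self.detection_time_ms


detector = AsyncDetector(interval_ms=10)
detector.on_control_packet(now_ms=100)
print(detector.is_down(now_ms=120))   # False: only 20 ms of silence
print(detector.is_down(now_ms=130))   # True: 30 ms of silence
```

With a 10 ms interval and a multiplier of 3, the sketch declares the session Down after 30 ms
without a control packet.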

Types of Links Detected by BFD

Table 2-1 Types of links detected by BFD

Link Type: IP links
Classification:
l Layer 3 physical interfaces
l Ethernet sub-interfaces (including Eth-Trunk sub-interfaces)
Description: If a physical Ethernet interface has multiple sub-interfaces, BFD sessions can
be separately established on the physical Ethernet interface and its sub-interfaces.

Link Type: IP-Trunks
Classification:
l IP-Trunk links
l IP-Trunk member links
Description: Separate BFD sessions can be established to detect link faults on an IP-Trunk
and its member interfaces at the same time.

Link Type: Eth-Trunks
Classification:
l Layer 2 Eth-Trunk links
l Layer 2 Eth-Trunk member links
l Layer 3 Eth-Trunk links
l Layer 3 Eth-Trunk member links
Description: Separate BFD sessions can be established to detect link faults on an Eth-Trunk
and its member interfaces at the same time.

Link Type: VLANIF
Classification:
l VLAN Ethernet member links
l VLANIF interfaces
Description: Separate BFD sessions can be established to detect link faults on a VLANIF
interface and its member interfaces at the same time.

Link Type: MPLS LSPs
Classification:
l In static mode, BFD can detect the following types of LSPs:
– LDP LSPs
– TE tunnels, static CR-LSPs bound to tunnels, and RSVP CR-LSPs bound to tunnels
l In dynamic mode, BFD can detect the following types of LSPs:
– LDP LSPs
– RSVP CR-LSPs bound to tunnels
Description: A BFD session used to check the continuity of a Multiprotocol Label Switching
label switched path (MPLS LSP) can be established in either of the following modes:
l Static mode: Local and remote discriminators are manually configured on interconnected
devices to allow them to negotiate a BFD session.
l Dynamic mode: The BFD discriminator type-length-value (TLV) carried in an LSP ping
packet is used to allow the interconnected devices to negotiate a BFD session.
BFD can detect a TE tunnel that uses CR-Static or RSVP-TE as its signaling protocol and
detect the primary LSP bound to the TE tunnel.
A dynamic BFD session cannot detect the entire TE tunnel.

Link Type: PWs
Classification:
l SS PWs
l MS PWs
Description: BFD can monitor a PW in static (manually configured discriminator) or dynamic
mode.

BFD Session Establishment Modes

BFD sessions can be established in either static or dynamic mode.
BFD identifies sessions based on the My Discriminator (local discriminator) and Your
Discriminator (remote discriminator) fields carried in BFD control packets. The two modes
differ in how the values of these two fields are obtained.

Table 2-2 BFD session establishment modes

BFD Session Establishment Mode: Static mode
Description: BFD session parameters, such as the local and remote discriminators, are
manually configured and delivered for BFD session establishment.
In static mode, configure unique local and remote discriminators for each BFD session.
Unique discriminators prevent incorrect discriminators from affecting BFD sessions that
have correct discriminators and prevent BFD sessions from alternating between Up and Down.

BFD Session Establishment Mode: Dynamic mode
Description: When a BFD session is dynamically established, the system processes the local
and remote discriminators as follows:
l Dynamically allocates the local discriminator. When a system triggers the dynamic
establishment of a BFD session, the system allocates a dynamic discriminator as the local
discriminator of the BFD session. Then, the system sends a BFD control packet with Your
Discriminator set to 0 to the peer for session negotiation.
l Automatically learns the remote discriminator. The local end of a BFD session sends a BFD
control packet with Your Discriminator set to 0 to the remote end. After the remote end
receives the packet, it checks whether the value of Your Discriminator in this packet is the
same as the value of its My Discriminator. If the value of Your Discriminator matches that
of My Discriminator, the remote end learns the value of My Discriminator of the local end
and obtains its Your Discriminator.
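
The dynamic-mode handling of discriminators in Table 2-2 can be sketched as follows. This is
a simplified illustration under stated assumptions: a packet whose Your Discriminator is 0 is
matched to a session by the sender's IP address (the approach RFC 5880 allows for the first
packet), and the class names, the starting discriminator values, and the IP addresses are
hypothetical.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Session:
    peer_ip: str
    my_disc: int          # local discriminator, allocated dynamically
    your_disc: int = 0    # remote discriminator, 0 until it is learned


class SessionTable:
    def __init__(self, first_dynamic_disc: int):
        # first_dynamic_disc is an arbitrary illustrative start of the dynamic range.
        self._next_disc = itertools.count(first_dynamic_disc)
        self._by_disc: Dict[int, Session] = {}
        self._by_peer: Dict[str, Session] = {}

    def create(self, peer_ip: str) -> Session:
        session = Session(peer_ip, next(self._next_disc))
        self._by_disc[session.my_disc] = session
        self._by_peer[peer_ip] = session
        return session

    def receive(self, src_ip: str, my_disc: int, your_disc: int) -> Optional[Session]:
        # A nonzero Your Discriminator selects the session directly; otherwise
        # fall back to the sender's address (assumption for the first packet).
        if your_disc != 0:
            session = self._by_disc.get(your_disc)
        else:
            session = self._by_peer.get(src_ip)
        if session is not None:
            session.your_disc = my_disc   # learn the sender's My Discriminator
        return session


device_a = SessionTable(first_dynamic_disc=0x4000)
device_b = SessionTable(first_dynamic_disc=0x5000)
a = device_a.create(peer_ip="10.1.1.2")
b = device_b.create(peer_ip="10.1.1.1")

# First packet from A carries Your Discriminator = 0; B learns A's discriminator.
device_b.receive("10.1.1.1", my_disc=a.my_disc, your_disc=0)
# B's reply carries the learned value, so A can match it and learn B's discriminator.
device_a.receive("10.1.1.2", my_disc=b.my_disc, your_disc=b.your_disc)
print(hex(a.your_disc), hex(b.your_disc))
```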

BFD Session Management

A BFD session has the following states:

l Down: A BFD session is in the Down state or a request has been sent.
l Init: The local end can communicate with the remote end, and the local end expects the
BFD session to go Up.
l Up: A BFD session is successfully established.
l AdminDown: A BFD session is administratively Down, for example, after the shutdown
command is run.

Session status changes are transmitted using the State field carried in a BFD control packet.
The system changes its session status based on the local session status and received remote
session status from the peer system.

When a BFD session is to be established or deleted, the BFD state machine implements a
three-way handshake to ensure that the two systems detect the status change.

Figure 2-1 shows the status change process of the state machine during the establishment of a
BFD session.

Figure 2-1 Status change process of the state machine
[Figure: Device A and Device B each start in the Down state and exchange BFD control packets
whose State field (Sta) is set to Down, then Init, then Up. Each device changes from Down to
Init and then from Init to Up.]

1. BFD configured on both Device A and Device B independently starts state machines.
The initial status of BFD state machines is Down. Device A and Device B send BFD
control packets with the State field set to Down. If BFD sessions are established in static
mode, the value of Your Discriminator in BFD control packets is manually specified. If
BFD sessions are established in dynamic mode, the value of Your Discriminator is set to
0.
2. After receiving a BFD control packet with the State field set to Down, Device B switches
the session status to Init and sends a BFD control packet with the State field set to Init.

After the local BFD session status of Device B changes to Init, Device B no longer processes the
received BFD control packets with the State field set to Down.
3. The BFD session status change of Device A is the same as that of Device B.
4. After receiving a BFD control packet with the State field set to Init, Device B changes
the local session status to Up.
5. The BFD session status change of Device A is the same as that of Device B.
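
The handshake in the preceding steps can be sketched as a small state machine. The
transition rules below follow the session state machine in RFC 5880; the function and
variable names are illustrative, and the sketch ignores timers and discriminators.

```python
from enum import Enum


class State(Enum):
    ADMIN_DOWN = "AdminDown"
    DOWN = "Down"
    INIT = "Init"
    UP = "Up"


def next_state(local: State, received: State) -> State:
    """Return the new local state after a control packet carrying `received` arrives."""
    if local is State.ADMIN_DOWN:
        return local                       # held Down administratively
    if received is State.ADMIN_DOWN:
        return State.DOWN
    if local is State.DOWN:
        if received is State.DOWN:
            return State.INIT
        if received is State.INIT:
            return State.UP
        return local
    if local is State.INIT:
        # Packets with State set to Down are no longer acted on in the Init state.
        return State.UP if received in (State.INIT, State.UP) else local
    return State.DOWN if received is State.DOWN else local   # local is Up


def handshake():
    a = b = State.DOWN
    trace = [(a, b)]
    for _ in range(2):
        sent_by_a, sent_by_b = a, b        # both devices send their current state
        a, b = next_state(a, sent_by_b), next_state(b, sent_by_a)
        trace.append((a, b))
    return trace


for a_state, b_state in handshake():
    print(a_state.value, b_state.value)
```

Running the sketch prints the Down, Init, and Up stages for both devices, which matches the
sequence in Figure 2-1.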

2.2.2 BFD for IP

A BFD session can be established to quickly detect faults of an IP link.

BFD for IP detects single- and multi-hop IPv4 and IPv6 links:

l Single-hop BFD checks the IP continuity between directly connected systems. The single
hop refers to a hop on an IP link. Single-hop BFD allows only one BFD session to be
established for a specified data protocol on a specified interface.
l Multi-hop BFD detects all paths between two systems. Each path may contain multiple
hops, and these paths may partially overlap.

IPv4 Usage Scenario

Typical application 1:
As shown in Figure 2-2, BFD monitors the single-hop IPv4 path between Device A and
Device B, and BFD sessions are bound to outbound interfaces.

Figure 2-2 Single-hop BFD for IPv4
[Figure: Device A and Device B are connected through their If1 interfaces, and a BFD session
runs between the two devices.]

Typical application 2:
As shown in Figure 2-3, BFD monitors the multi-hop IPv4 path between Device A and
Device C, and BFD sessions are bound only to peer IP addresses.

Figure 2-3 Multi-hop BFD for IPv4
[Figure: Device A (If1) connects to Device B (If1 and If2), which connects to Device C (If2).
A BFD session runs between Device A and Device C.]

IPv6 Usage Scenario

Typical application 3:
As shown in Figure 2-4, BFD monitors the single-hop IPv6 path between Device A and
Device B, and BFD sessions are bound to outbound interfaces.

Figure 2-4 Single-hop BFD for IPv6
[Figure: Device A (If1, 2001::1/64) and Device B (If1, 2001::2/64) are connected, and a BFD
session runs between the two devices.]

Typical application 4:
As shown in Figure 2-5, BFD monitors the multi-hop IPv6 path between Device A and
Device C, and BFD sessions are bound only to peer IP addresses.

Figure 2-5 Multi-hop BFD for IPv6
[Figure: Device A (If1, 2001::1/64) connects to Device B (If1, 2001::2/64; If2, 2002::1/64),
which connects to Device C (If2, 2002::2/64). A BFD session runs between Device A and
Device C.]

In BFD for IP scenarios, BFD for PST is configured on a device. If a link fault occurs, BFD
detects the fault and triggers the PST to go Down. If the device restarts and the link fault
persists, BFD is in the AdminDown state and does not notify the PST of BFD Down. As a
result, the PST is not triggered to go Down and the interface bound to BFD is still Up.

2.2.3 BFD for PST

When Bidirectional Forwarding Detection (BFD) detects a fault, it changes the interface
status in the port state table (PST) to trigger a fast reroute (FRR) switchover. BFD for PST
applies only to single-hop scenarios when BFD sessions are bound to outbound interfaces.
BFD for PST is widely used in FRR applications. If BFD for PST is enabled for a BFD
session bound to an outbound interface, the BFD session is associated with the PST on the
outbound interface. After BFD detects that a link is Down, it sets the bit for the PST to Down
to immediately trigger an FRR switchover.

2.2.4 Multicast BFD

Multicast Bidirectional Forwarding Detection (BFD) can check the continuity of the link
between interfaces that do not have Layer 3 attributes (such as IP addresses) to quickly detect
link faults.

After multicast BFD is configured, multicast BFD packets are sent using the IP layer. If the
link is reachable, the remote interface receives the multicast BFD packets and forwards them
to the BFD module. In this manner, the BFD module detects that the link is normal. If
multicast BFD packets are sent over a trunk member link, they are delivered to the data link
layer for link continuity check. The remote IP address used in a multicast BFD session is the
default known multicast IP address ( to Any packet with the default
known multicast IP address is sent to the BFD module for IP forwarding.

Usage Scenario

Figure 2-6 Multicast BFD
[Figure: Device A and Device B are connected through their If1 interfaces, and a BFD session
runs between the two devices.]

As shown in Figure 2-6, multicast BFD is configured on both Device A and Device B. BFD
sessions are bound to the outbound interface If1, and the default multicast address is used.
After the configuration is complete, multicast BFD quickly checks the continuity of the link
between interfaces.

2.2.5 BFD for PIS

Bidirectional Forwarding Detection (BFD) for process interface status (PIS) is a simple
mechanism in which the behavior of a BFD session is associated with the interface status.
BFD for PIS improves the sensitivity of interfaces in detecting link faults and minimizes the
impact of faults on non-direct links.
After BFD for PIS is configured and BFD detects a link fault, BFD immediately sends a
message indicating the Down state to the associated interface. The interface then enters the
BFD Down state, which is equivalent to the Down state of the link protocol. In the BFD
Down state, interfaces process only BFD packets to quickly detect link faults.
Configure multicast BFD for each BFD session to be associated with the interface status so
that BFD packet forwarding is independent of the IP attributes on the interface.

Usage Scenario

Figure 2-7 BFD for PIS
[Figure: Device A and Device B are connected through their If1 interfaces, and a BFD session
runs between the two devices.]

In Figure 2-7, a BFD session is established between Device A and Device B, and the default
multicast address is used to check the continuity of the single-hop link connected to the
interface If1. After BFD for PIS is configured and BFD detects a link fault, BFD immediately
sends a message indicating the Down state to the associated interface. The interface then
enters the BFD Down state.

2.2.6 BFD for Link-Bundle

Two routing devices are connected through an Eth-Trunk that has multiple member interfaces.
If common BFD is used to monitor the Eth-Trunk, only one single-hop BFD session is created.
After the creation is complete, BFD selects the board on which a member interface resides as
a state machine board and monitors the member interface. If the member interface or state
machine board fails, BFD considers the entire Eth-Trunk failed even if other member
interfaces of the Eth-Trunk are Up. BFD for link-bundle resolves this issue.

Figure 2-8 BFD for link-bundle networking
[Figure: Two devices are connected through an Eth-Trunk. BFD sub-session 1, BFD sub-session
2, and BFD sub-session 3 each run over one Eth-Trunk member link.]

On the network shown in Figure 2-8, a BFD for link-bundle session consists of one main
session and multiple sub-sessions.
l Each sub-session independently monitors an Eth-Trunk member interface and reports the
monitoring results to the main session. Each sub-session uses the same monitoring
parameters as the main session.

l The main session creates a BFD sub-session for each Eth-Trunk member interface,
summarizes the sub-session monitoring results, and determines the status of the Eth-Trunk.
– The main session is Up as long as at least one sub-session is Up.
– If no sub-session is available, the main session goes Down and the Unknown state
is reported to applications. The status of the Eth-Trunk port is not changed.
– If the Eth-Trunk has only one member interface and the corresponding sub-session
is Up, the main session goes Down when the member interface exits the Eth-Trunk.
The status of the Eth-Trunk is Up.
The main session's local discriminator is allocated from the range from 0x00100000 to
0x00103fff without occupying the original BFD session discriminator range. The main
session does not learn the remote discriminator because it does not send or receive packets. A
sub-session's local discriminator is allocated from the original dynamic BFD session
discriminator range using the same algorithm as a dynamic BFD session.
Only sub-sessions consume BFD session resources per board. A sub-session must select the
board on which the physical member interface bound to this sub-session resides as a state
machine board. If no BFD session resources are available on the board, board selection fails.
In this situation, the sub-session's status is not used to determine the main session's status.
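
The way the main session summarizes its sub-sessions can be sketched as follows. This is a
minimal illustration of the rules above, assuming that a sub-session whose board selection
failed is simply excluded from the summary; the class, field, and function names are
hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SubSession:
    member_interface: str
    up: bool
    board_selected: bool = True   # False if the member's board has no free BFD resources


def usable(sub_sessions: Iterable[SubSession]) -> List[SubSession]:
    # Sub-sessions without a state machine board do not influence the main session.
    return [s for s in sub_sessions if s.board_selected]


def main_session_state(sub_sessions: Iterable[SubSession]) -> str:
    # The main session is Up as long as at least one usable sub-session is Up.
    return "Up" if any(s.up for s in usable(sub_sessions)) else "Down"


def reports_unknown(sub_sessions: Iterable[SubSession]) -> bool:
    # With no sub-session available, Unknown is reported to applications.
    return not usable(sub_sessions)


members = [
    SubSession("member-1", up=False),
    SubSession("member-2", up=True),
    SubSession("member-3", up=True, board_selected=False),
]
print(main_session_state(members), reports_unknown(members))   # Up False
print(main_session_state([]), reports_unknown([]))             # Down True
```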

2.2.7 BFD Echo

BFD echo is a rapid fault detection mechanism in which the local system sends BFD echo
packets and the remote system loops back the packets. BFD echo is classified into passive
BFD echo and one-arm BFD echo modes. These two BFD echo modes have the same
detection mechanism but different application scenarios.

Passive BFD Echo

The NE40E supports passive BFD echo for interworking with other vendors' devices.
Passive BFD echo applies only to single-hop IP link scenarios and works with asynchronous
BFD. When a BFD session works in asynchronous echo mode, the two endpoints of the BFD
session perform both slow detection in asynchronous mode and quick detection in echo mode.
As shown in Figure 2-9, Device A is directly connected to Device B, and asynchronous BFD
sessions are established between the two devices. After active BFD echo is enabled on Device
B and passive BFD echo is enabled on Device A, the two devices work in asynchronous echo
mode and send single-hop and echo packets to each other.
If Device A has a higher BFD performance than Device B, for example, the minimum
intervals between receiving BFD packets supported by Device A and Device B are 3 ms and
100 ms respectively, then BFD sessions in asynchronous mode will adopt the larger interval
(100 ms). If BFD echo is enabled, Device A can use echo packets to implement faster link
failure detection. If BFD echo is disabled, Device A and Device B can still use asynchronous
BFD packets to detect link failures. However, the minimum interval between receiving BFD
packets is the larger interval value (100 ms in this example).
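
The arithmetic in this example can be written out as follows. The 3 ms and 100 ms values come
from the text; the detect multiplier of 3 and the assumption that Device A sends echo packets
at its own 3 ms interval are illustrative only and are not stated in this document.

```python
def async_interval_ms(min_rx_a_ms: int, min_rx_b_ms: int) -> int:
    """Asynchronous mode adopts the larger of the two supported intervals."""
    return max(min_rx_a_ms, min_rx_b_ms)


DEVICE_A_MIN_RX_MS = 3
DEVICE_B_MIN_RX_MS = 100
ASSUMED_DETECT_MULT = 3                          # hypothetical multiplier
ASSUMED_ECHO_INTERVAL_MS = DEVICE_A_MIN_RX_MS    # hypothetical echo interval

async_ms = async_interval_ms(DEVICE_A_MIN_RX_MS, DEVICE_B_MIN_RX_MS)
print(async_ms)                                          # 100
print(async_ms * ASSUMED_DETECT_MULT)                    # 300 ms to detect without echo
print(ASSUMED_ECHO_INTERVAL_MS * ASSUMED_DETECT_MULT)    # 9 ms to detect with echo
```

Under these assumptions, echo packets let Device A detect a failure in about 9 ms instead of
300 ms, because the looped-back packets are not limited by Device B's 100 ms interval.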

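Figure 2-9 Passive BFD echo networking
[Figure: Device A and Device B are connected through their Port1 interfaces. One end is
labeled Echo and the other Passive Echo. A BFD single-hop session and a BFD echo session run
between the two devices.]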

The process of establishing a passive BFD echo session as shown in Figure 2-9 is as follows:
1. Device B functions as a BFD session initiator and sends an asynchronous BFD packet to
Device A. The Required Min Echo RX Interval field carried in the packet is a nonzero
value, which specifies that Device A must support BFD echo.
2. After receiving the packet, Device A finds that the value of the Required Min Echo RX
Interval field carried in the packet is a nonzero value. If Device A has passive BFD echo
enabled, it checks whether any ACL that restricts passive BFD echo is referenced. If an
ACL is referenced, only BFD sessions that match specific ACL rules can enter the
asynchronous echo mode. If no ACL is referenced, BFD sessions immediately enter the
asynchronous echo mode.
3. Device B periodically sends BFD echo packets, and Device A sends BFD echo packets
(the source and destination IP addresses are the local IP address, and the destination
physical address is Device B's physical address) at the interval specified by the Required
Min RX Interval field. Both Device A and Device B start a receive timer, with a receive
interval that is the same as the interval at which they each send BFD echo packets.
4. After Device A and Device B receive BFD echo packets from each other, they
immediately loop back the packets at the forwarding layer. Device A and Device B also
send asynchronous BFD packets to each other, but at an interval that is much longer than that
for sending echo packets.

One-Arm BFD Echo

One-arm BFD echo applies only to single-hop IP link scenarios. Generally, one-arm BFD
echo is used when two devices are directly connected and only one of them supports BFD.
Therefore, one-arm BFD echo does not require both ends to negotiate echo capabilities. A
one-arm BFD echo session can be established on a device that supports BFD. After receiving
a one-arm BFD echo session packet, devices that do not support BFD immediately loop back
the packet, implementing quick link failure detection.
The local device that has one-arm BFD echo enabled sends a special BFD packet (both the
source and destination IP addresses in the IP header are the local IP address, and the My
Discriminator (MD) and Your Discriminator (YD) values in the BFD payload are the same).
After receiving the packet, the remote device immediately loops the packet back to the local
device, which uses the returned packet to determine link reachability. One-arm BFD echo can
therefore be used when the remote device is a low-end device that does not support BFD.

Similarities and Differences Between Passive BFD Echo and One-Arm BFD Echo
To ensure that passive BFD echo or one-arm BFD echo can take effect, disable strict URPF
on devices that send BFD echo packets.
Strict URPF prevents attacks that use spoofed source IP addresses. If strict URPF is enabled
on a device, the device obtains the source IP address and inbound interface of a packet and
searches the forwarding table for an entry with the destination IP address set to the source IP
address of the packet. The device then checks whether the outbound interface for the entry
matches the inbound interface. If they do not match, the device considers the source IP
address invalid and discards the packet. After a device enabled with strict URPF receives a
BFD echo packet that is looped back, it checks the source IP address of the packet. As the
source IP address of the echo packet is a local IP address of the device, the packet is sent to
the platform without being forwarded at the lower layer. As a result, the device considers the
packet invalid and discards it.
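
The strict URPF check described above can be sketched as follows to show why a looped-back
echo packet fails it. This is a simplified model, not the device's forwarding logic: the
forwarding table is a plain list, the local address is represented by a host route whose
outbound interface is the placeholder name "local", and all addresses and interface names are
hypothetical.

```python
import ipaddress
from typing import List, Tuple


def strict_urpf_accepts(fib: List[Tuple[str, str]], src_ip: str, inbound_if: str) -> bool:
    """Accept a packet only if the best route back to its source uses the inbound interface."""
    addr = ipaddress.ip_address(src_ip)
    matches = []
    for prefix, outbound_if in fib:
        network = ipaddress.ip_network(prefix)
        if addr in network:
            matches.append((network.prefixlen, outbound_if))
    if not matches:
        return False                       # no route to the source: discard
    _, best_outbound_if = max(matches)     # longest-prefix match
    return best_outbound_if == inbound_if


fib = [
    ("10.0.0.0/24", "If1"),     # directly connected subnet
    ("10.0.0.1/32", "local"),   # the device's own address on If1
]

# A normal packet from the neighbor arrives on If1 and passes the check.
print(strict_urpf_accepts(fib, "10.0.0.2", "If1"))   # True
# A looped-back echo packet carries the device's own address as its source.
print(strict_urpf_accepts(fib, "10.0.0.1", "If1"))   # False
```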

Table 2-3 Differences between BFD echo sessions and common static single-hop sessions

BFD Session: Common static single-hop session
Supported IP Type: IPv4 and IPv6
Session Type: Static single-hop session
Descriptor: MD and YD must be configured.
Negotiation Prerequisite: A matching session must be established on the peer.
IP Header: The source and destination IP addresses are different.

BFD Session: Passive BFD echo session
Supported IP Type: IPv4 and IPv6
Session Type: Dynamic single-hop session
Descriptor: No MD or YD needs to be configured.
Negotiation Prerequisite: A matching session must be established and echo must be enabled on
the peer.
IP Header: Both the source and destination IP addresses are a local IP address of the device.

BFD Session: One-arm BFD echo session
Supported IP Type: IPv4
Session Type: Static single-hop session
Descriptor: Only MD needs to be configured (MD and YD are the same).
Negotiation Prerequisite: A matching session does not need to be established on the peer.
IP Header: Both the source and destination IP addresses are a local IP address of the device.
2.2.8 Board Selection Rules for BFD Sessions

BFD can be deployed in either distributed or integrated mode.
l Distributed mode
By default, BFD works in distributed mode. In this mode:
– If a single-hop BFD session is established and the session is bound to a board that is
BFD-capable in hardware, BFD can work properly. If the session is bound to a
BFD-incapable board, BFD cannot work.

– If a single-hop BFD session is established and the session is bound to a board that is
BFD-incapable in hardware but BFD-capable in software, the BFD session can be
processed by this board.
l Integrated mode
If single-hop BFD sessions are established and the sessions are bound to boards that are
BFD-incapable in hardware but BFD-capable in software, the sessions will be distributed
to the two load-balancing integrated boards. The load-balancing integrated board with
more available BFD resources will be preferentially selected.

Boards that are BFD-incapable in hardware but BFD-capable in software are selected in the
following cases:
l Boards that are BFD-capable in hardware are unavailable.
l The integrated mode is not configured, and BFD for IP sessions bound to a physical interface or its
sub-interfaces are single-hops.
If boards that are BFD-incapable in hardware but BFD-capable in software are already selected and the
integrated mode is configured, sessions will enter the AdminDown state and then be bound to an
integrated board.

Table 2-4 describes the board selection rules for BFD sessions.

Table 2-4 Board selection rules for BFD sessions

Session Type: Multi-hop session
Board Selection Rule: The board with the interface that receives BFD negotiation packets is
preferentially selected. If the board does not have available BFD resources, a load-balancing
integrated board will be selected. If no load-balancing integrated board is available, board
selection fails.

Session Type: Single-hop session bound to a physical interface or its sub-interfaces
Board Selection Rule:
l If the board on which the bound interface or sub-interfaces reside is BFD-capable in
hardware, this board is selected. If the board does not have available BFD resources, board
selection fails.
l If the board on which the bound interface or sub-interfaces reside is BFD-incapable in
hardware but BFD-capable in software and the integrated mode is configured, a load-balancing
integrated board will be selected. If no load-balancing integrated board is available, board
selection fails.
l If the board on which the bound interface or sub-interfaces reside is BFD-incapable in
hardware but BFD-capable in software and the integrated mode is not configured, the board is
still selected. If the board does not have available BFD resources, board selection fails.

Session Type: Single-hop session bound to a trunk interface
Board Selection Rule: A board is selected from the boards on which trunk member interfaces
reside. If none of the boards has available BFD resources, board selection fails.
l If none of these boards is BFD-capable in hardware, a specified integrated board will be
selected based on load balancing.
l If some of these boards are BFD-capable in hardware and the others are BFD-incapable in
hardware, a specified integrated board will be selected. If board selection fails, a board is
selected from those that are BFD-capable in hardware.
l If all of these boards are BFD-capable in hardware, one will be selected based on load
balancing.

Session Type: BFD for LDP LSP session
Board Selection Rule:
l If an outbound interface is configured for a BFD for LDP LSP session, the board on which
the outbound interface resides is preferentially selected.
– If the outbound interface is a tunnel interface, a board is selected based on multi-hop
session rules because tunnel interfaces reside on the IPU that is BFD-incapable in hardware.
– If the board on which the outbound interface resides is BFD-incapable in hardware, a
specified integrated board is selected.
– If the board on which the outbound interface resides is BFD-capable in hardware, this
board is selected.
l If a BFD session is not configured with an outbound interface, a board is selected for the
BFD session based on multi-hop session rules.

Session Type: BFD for TE session
Board Selection Rule: The specified board is preferentially selected. If no board is
specified, a board is selected based on multi-hop session rules.

Session Type: BFD for VLANIF session
Board Selection Rule: The board with the interface that receives BFD negotiation packets is
selected. If the board does not have available BFD resources, board selection fails.
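
As an illustration, the rules in Table 2-4 for a single-hop session bound to a physical
interface or its sub-interfaces can be sketched as follows. This is a simplified model under
stated assumptions: an integrated board counts as available only if it has free BFD
resources, the integrated board with more free resources is preferred, and the class, field,
and board names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Board:
    name: str
    hw_bfd: bool          # BFD-capable in hardware
    sw_bfd: bool          # BFD-capable in software
    free_sessions: int    # remaining BFD session resources


def select_board(bound: Board, integrated_mode: bool,
                 integrated_boards: List[Board]) -> Optional[str]:
    """Return the name of the selected board, or None if board selection fails."""
    if bound.hw_bfd:
        return bound.name if bound.free_sessions > 0 else None
    if not bound.sw_bfd:
        return None
    if integrated_mode:
        candidates = [b for b in integrated_boards if b.free_sessions > 0]
        if not candidates:
            return None
        # Prefer the integrated board with more available BFD resources.
        return max(candidates, key=lambda b: b.free_sessions).name
    return bound.name if bound.free_sessions > 0 else None


hw_board = Board("slot1", hw_bfd=True, sw_bfd=True, free_sessions=10)
sw_board = Board("slot2", hw_bfd=False, sw_bfd=True, free_sessions=5)
integrated = [
    Board("slot7", hw_bfd=True, sw_bfd=True, free_sessions=20),
    Board("slot8", hw_bfd=True, sw_bfd=True, free_sessions=50),
]

print(select_board(hw_board, integrated_mode=False, integrated_boards=integrated))  # slot1
print(select_board(sw_board, integrated_mode=True, integrated_boards=integrated))   # slot8
print(select_board(sw_board, integrated_mode=False, integrated_boards=integrated))  # slot2
```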

2.2.9 BFD Dampening

If an IGP or MPLS link frequently flaps and the flapping interval is greater than the IGP or
MPLS recovery time, BFD detects the link flapping and notifies an upper-layer protocol of
the event. As a result, the upper-layer protocol frequently flaps. BFD dampening prevents link
flapping detected by BFD from causing the frequent flapping of the upper-layer protocol.
BFD dampening delays a BFD session's next negotiation if the number of times that the
session flaps reaches a threshold. IGP and MPLS negotiation is not affected. Specifically,
if a frequently flapping BFD session goes Down, its next negotiation is delayed, which
reduces the number of times that the BFD session flaps.
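
The idea behind BFD dampening can be sketched as follows. The document does not state the
threshold or the length of the delay, so the flap threshold, observation window, and delay
used here are hypothetical values, and the class is an illustration rather than the device's
algorithm.

```python
from collections import deque


class Dampener:
    def __init__(self, max_flaps: int = 3, window_s: float = 60.0, delay_s: float = 30.0):
        # All three parameters are hypothetical illustration values.
        self.max_flaps = max_flaps
        self.window_s = window_s
        self.delay_s = delay_s
        self._flaps = deque()

    def on_session_down(self, now_s: float) -> float:
        """Record a flap and return the earliest time at which renegotiation may start."""
        self._flaps.append(now_s)
        # Forget flaps that fall outside the observation window.
        while now_s - self._flaps[0] > self.window_s:
            self._flaps.popleft()
        if len(self._flaps) >= self.max_flaps:
            return now_s + self.delay_s    # threshold reached: delay the next negotiation
        return now_s                       # below the threshold: renegotiate immediately


dampener = Dampener()
print(dampener.on_session_down(0.0))     # 0.0
print(dampener.on_session_down(10.0))    # 10.0
print(dampener.on_session_down(20.0))    # 50.0, third flap within the window
print(dampener.on_session_down(100.0))   # 100.0, earlier flaps have aged out
```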

2.3 Application Scenarios for BFD

2.3.1 BFD for Static Routes

Different from dynamic routing protocols, static routes do not have a detection mechanism. If
a fault occurs on a network, an administrator must manually address it. Bidirectional
Forwarding Detection (BFD) for static routes is introduced to associate a static route with a
BFD session so that the BFD session can detect the status of the link that the static route
passes through.
After BFD for static routes is configured, each static route can be associated with a BFD
session. In addition to route selection rules, whether a static route can be selected as the
optimal route is subject to BFD session status.
l If the BFD session associated with a static route detects a link failure and goes Down,
the BFD session reports the link failure to the system. The system then deletes the
static route from the IP routing table.
l If the BFD session associated with a static route detects that the faulty link has
recovered and goes Up, the BFD session reports the fault recovery to the system. The
system then adds the static route to the IP routing table again.
l By default, a static route can still be selected even though the BFD session associated
with it is AdminDown (triggered by the shutdown command run either locally or
remotely). If a device is restarted, the BFD session needs to be re-negotiated. In this
case, whether the static route associated with the BFD session can be selected as the
optimal route is subject to the re-negotiated BFD session status.
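
The three rules above reduce to a small decision function. The sketch below assumes the default behavior described in this section; the names BfdState and static_route_selectable are illustrative and do not correspond to any device API.

```python
from enum import Enum


class BfdState(Enum):
    UP = "Up"
    DOWN = "Down"
    ADMIN_DOWN = "AdminDown"


def static_route_selectable(state: BfdState) -> bool:
    """Return True if the bound static route may stay in the IP routing table."""
    # Down means a link failure was detected, so the route is withdrawn.
    # Up and (by default) AdminDown leave the route eligible for selection.
    return state is not BfdState.DOWN


for state in BfdState:
    print(state.value, static_route_selectable(state))
```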
BFD for static routes has two detection modes:
l Single-hop detection
In single-hop detection mode, the configured outbound interface and next hop address
are the information about the directly connected next hop. The outbound interface
associated with the BFD session is the outbound interface of the static route, and the peer
address is the next hop address of the static route.
l Multi-hop detection
In multi-hop detection mode, only the next hop address is configured. Therefore, the
static route must be iterated to the directly connected next hop and outbound interface.
The peer address of the BFD session is the original next hop address of the static route,
and the outbound interface is not specified. In most cases, the original next hop to be
iterated is an indirect next hop. Multi-hop detection is performed on the static routes that
support route iteration.
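
As a rough illustration of the two detection modes, the sketch below derives the BFD session parameters from a static route. The types StaticRoute and BfdBinding and the interface name are assumptions made for the example only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StaticRoute:
    prefix: str
    next_hop: str
    out_interface: Optional[str] = None  # None: the route must be iterated


@dataclass(frozen=True)
class BfdBinding:
    peer_address: str
    out_interface: Optional[str]
    mode: str


def bfd_binding_for(route: StaticRoute) -> BfdBinding:
    """Peer address is always the route's next hop; the outbound interface
    is bound only when the route names one (single-hop detection)."""
    mode = "single-hop" if route.out_interface else "multi-hop"
    return BfdBinding(route.next_hop, route.out_interface, mode)


direct = StaticRoute("10.1.0.0/16", "192.0.2.1", "GE0/1/0")  # hypothetical names
indirect = StaticRoute("10.2.0.0/16", "198.51.100.9")
print(bfd_binding_for(direct))
print(bfd_binding_for(indirect))
```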

For details about BFD, see the HUAWEI NE40E-M2 Series Universal Service Router Feature
Description - Reliability.

2.3.2 BFD for RIP

Routing Information Protocol (RIP)-capable devices monitor the neighbor status by
exchanging Update packets periodically. While a local device is still detecting a link failure,
carriers or users may lose a large number of packets. Bidirectional Forwarding Detection
(BFD) for RIP speeds up fault detection and route convergence, which improves network
reliability.
After BFD for RIP is configured on the Router, BFD can detect a fault (if any) within
milliseconds and notify the RIP module of the fault. The Router then deletes the route that
passes through the faulty link and switches traffic to a backup link. This process speeds up
RIP convergence.
Table 2-5 describes the differences before and after BFD for RIP is configured.

Table 2-5 Differences before and after BFD for RIP is configured
Item                             Link Fault Detection Mechanism   Convergence
BFD for RIP is not configured.   A RIP aging timer expires.       Second-level
BFD for RIP is configured.       A BFD session goes Down.         Millisecond-level


Related Concepts
The BFD mechanism bidirectionally monitors data protocol connectivity over the link
between two routers. After BFD is associated with a routing protocol, BFD can rapidly detect
a fault (if any) and notify the protocol module of the fault, which speeds up route convergence
and minimizes traffic loss.
BFD is classified into the following modes:
l Static BFD
In static BFD mode, BFD session parameters (including local and remote discriminators)
must be configured, and requests must be delivered manually to establish BFD sessions.
Static BFD is applicable to networks on which only a few links require high reliability.
l Dynamic BFD
In dynamic BFD mode, the establishment of BFD sessions is triggered by routing
protocols, and the local discriminator is dynamically allocated, whereas the remote
discriminator is obtained from BFD packets sent by the neighbor.
When a new neighbor relationship is set up, a BFD session is established based on the
neighbor and detection parameters, including source and destination IP addresses. When
a fault occurs on the link, the routing protocol associated with BFD can detect the BFD
session Down event. Traffic is switched to the backup link immediately, which
minimizes data loss.
Dynamic BFD is applicable to networks that require high reliability.
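
The discriminator handling in dynamic BFD mode can be sketched as below. The class DynamicBfdSession and its sequential allocator are assumptions for illustration; the only behavior taken from the text is that the local discriminator is allocated automatically and the remote one is learned from the neighbor's BFD packets.

```python
import itertools
from typing import Optional


class DynamicBfdSession:
    """Toy dynamic BFD session keyed by source and destination IP address."""

    _allocator = itertools.count(1)  # hypothetical local-discriminator pool

    def __init__(self, src_ip: str, dst_ip: str) -> None:
        self.src_ip = src_ip
        self.dst_ip = dst_ip
        self.local_disc = next(self._allocator)  # allocated, not configured
        self.remote_disc: Optional[int] = None   # unknown until a packet arrives

    def on_packet(self, sender_discriminator: int) -> None:
        """Learn the remote discriminator from a received BFD packet."""
        self.remote_disc = sender_discriminator


a = DynamicBfdSession("192.0.2.1", "192.0.2.2")
b = DynamicBfdSession("192.0.2.2", "192.0.2.1")
a.on_packet(b.local_disc)
b.on_packet(a.local_disc)
print(a.local_disc, a.remote_disc, b.local_disc, b.remote_disc)
```

In static BFD mode, by contrast, both values would be supplied by the administrator on each end.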

For details about BFD implementation, see "BFD" in Universal Service Router Feature
Description - Reliability. Figure 2-10 shows a typical network topology for BFD for RIP.
l Dynamic BFD for RIP implementation:
a. RIP neighbor relationships are established among Device A, Device B, and Device
C and between Device B and Device D.
b. BFD for RIP is enabled on Device A and Device B.
c. Device A calculates routes, and the next hop along the route from Device A to
Device D is Device B.
d. If a fault occurs on the link between Device A and Device B, BFD will rapidly
detect the fault and report it to Device A. Device A then deletes the route whose
next hop is Device B from the routing table.
e. Device A recalculates routes and selects a new path Device C → Device B →
Device D.
f. After the link between Device A and Device B recovers, a new BFD session is
established between the two routers. Device A then reselects an optimal link to
forward packets.
l Static BFD for RIP implementation:
a. RIP neighbor relationships are established among Device A, Device B, and Device
C and between Device B and Device D.
b. Static BFD is configured on the interface that connects Device A to Device B.
c. If a fault occurs on the link between Device A and Device B, BFD will rapidly
detect the fault and report it to Device A. Device A then deletes the route whose
next hop is Device B from the routing table.
d. After the link between Device A and Device B recovers, a new BFD session is
established between the two routers. Device A then reselects an optimal link to
forward packets.
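
The rerouting in the steps above can be reproduced with a small path computation. This is a simplified stand-in, not RIP itself: it uses a breadth-first search over an assumed topology (links A-B, A-C, C-B, and B-D) to find the path with the fewest hops, whereas RIP reaches the same result through distance-vector updates.

```python
from collections import deque


def shortest_path(links, src, dst):
    """Breadth-first search: fewest hops, mirroring RIP's hop-count metric."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    queue = deque([[src]])
    visited = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in neighbors.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # destination unreachable


links = [("A", "B"), ("A", "C"), ("C", "B"), ("B", "D")]
before = shortest_path(links, "A", "D")
# BFD reports the A-B link Down, so routes through that link are removed.
after = shortest_path([l for l in links if l != ("A", "B")], "A", "D")
print(before, after)
```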

Figure 2-10 BFD for RIP
[Figure omitted. Surviving labels: DeviceA, DeviceB, DeviceD; Cost = 1.]

Usage Scenario
BFD for RIP is applicable to networks that require high reliability.

BFD for RIP improves network reliability and enables devices to rapidly detect link faults,
which speeds up route convergence on RIP networks.

2.3.3 BFD for OSPF

Bidirectional Forwarding Detection (BFD) is a mechanism to detect communication faults
between forwarding engines.
To be specific, BFD detects the connectivity of a data protocol along a path between two
systems. The path can be a physical link, a logical link, or a tunnel.
In BFD for OSPF, a BFD session is associated with OSPF. The BFD session quickly detects a
link fault and then notifies OSPF of the fault, which speeds up OSPF's response to network
topology changes.

A link fault or a topology change causes routers to recalculate routes. Routing protocol
convergence must be as quick as possible to improve network availability. Link faults are
inevitable, and therefore a solution must be provided to quickly detect faults and notify
routing protocols.
BFD for Open Shortest Path First (OSPF) associates BFD sessions with OSPF. After BFD for
OSPF is configured, BFD quickly detects link faults and notifies OSPF of the faults. BFD for
OSPF accelerates OSPF response to network topology changes.
Table 2-6 describes OSPF convergence speeds before and after BFD for OSPF is configured.

Table 2-6 OSPF convergence speeds before and after BFD for OSPF is configured
Item                              Link Fault Detection Mechanism   Convergence
BFD for OSPF is not configured.   An OSPF Dead timer expires.      Second-level
BFD for OSPF is configured.       A BFD session goes Down.         Millisecond-level

Figure 2-11 BFD for OSPF
[Figure omitted. Surviving labels: DeviceA, DeviceB, interface 1, interface 2.]

Figure 2-11 shows a typical network topology with BFD for OSPF configured. The principles
of BFD for OSPF are described as follows:

1. OSPF neighbor relationships are established between these three Routers.

2. After a neighbor relationship becomes Full, a BFD session is established.
3. The outbound interface on Device A connected to Device B is interface 1. If the link
between Device A and Device B fails, BFD detects the fault and then notifies Device A
of the fault.
4. Device A processes the event that a neighbor relationship goes Down and recalculates
routes. The new route passes through Device C and reaches Device B, with interface 2 as
the outbound interface.

2.3.4 BFD for OSPFv3

Bidirectional Forwarding Detection (BFD) is a mechanism to detect communication faults
between forwarding engines.

To be specific, BFD detects the connectivity of a data protocol along a path between two
systems. The path can be a physical link, a logical link, or a tunnel.

In BFD for OSPFv3, a BFD session is associated with OSPFv3. The BFD session quickly
detects a link fault and then notifies OSPFv3 of the fault, which speeds up OSPFv3's response
to network topology changes.

A link fault or a topology change causes routers to recalculate routes. Routing protocol
convergence must be as quick as possible to improve network availability. Link faults are
inevitable, and therefore a solution must be provided to quickly detect faults and notify
routing protocols.

BFD for Open Shortest Path First version 3 (OSPFv3) associates BFD sessions with OSPFv3.
After BFD for OSPFv3 is configured, BFD quickly detects link faults and notifies OSPFv3 of
the faults. BFD for OSPFv3 accelerates OSPFv3 response to network topology changes.

Table 2-7 describes OSPFv3 convergence speeds before and after BFD for OSPFv3 is
configured.

Table 2-7 OSPFv3 convergence speeds before and after BFD for OSPFv3 is configured
Item                                Link Fault Detection Mechanism   Convergence
BFD for OSPFv3 is not configured.   An OSPFv3 Dead timer expires.    Second-level
BFD for OSPFv3 is configured.       A BFD session goes Down.         Millisecond-level

Figure 2-12 BFD for OSPFv3
[Figure omitted. Surviving labels: DeviceA, DeviceB, interface 1, interface 2.]

Figure 2-12 shows a typical network topology with BFD for OSPFv3 configured. The
principles of BFD for OSPFv3 are described as follows:

1. OSPFv3 neighbor relationships are established between these three Routers.

2. After a neighbor relationship becomes Full, a BFD session is established.
3. The outbound interface on Device A connected to Device B is interface 1. If the link
between Device A and Device B fails, BFD detects the fault and then notifies Device A
of the fault.
4. Device A processes the event that a neighbor relationship has become Down and
recalculates routes. The new route passes through Device C and reaches Device B, with
interface 2 as the outbound interface.

2.3.5 BFD for IS-IS

In most cases, the interval at which Hello packets are sent is 10 seconds, and the IS-IS neighbor
holding time (the timeout period of a neighbor relationship) is three times the interval. If a
device does not receive a Hello packet from its neighbor within the holding time, the device
terminates the neighbor relationship.

A device can detect neighbor faults at the second level only. As a result, link faults on a high-
speed network may cause a large number of packets to be discarded.

BFD, a light-load mechanism that detects link faults at the millisecond level, is introduced
to resolve the preceding issue. With BFD, two systems periodically send BFD packets to each
other. If a system does not receive BFD packets from the other end within a specified period,
the system considers the bidirectional link between them Down.
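
To make the difference in scale concrete, the sketch below compares the two timeouts. The IS-IS figures come from the text (10-second Hello interval, holding time of three intervals). The BFD interval of 10 ms and detection multiplier of 3 are illustrative assumptions, and the detection period is modeled as the negotiated interval multiplied by the detection multiplier.

```python
def isis_holding_time_s(hello_interval_s: float = 10.0, multiplier: int = 3) -> float:
    """Holding time after which IS-IS tears down a silent neighbor."""
    return hello_interval_s * multiplier


def bfd_detection_time_ms(interval_ms: int, detect_multiplier: int) -> int:
    """Period after which BFD declares the link Down if no packet arrives."""
    return interval_ms * detect_multiplier


hello_timeout_ms = isis_holding_time_s() * 1000
bfd_timeout_ms = bfd_detection_time_ms(interval_ms=10, detect_multiplier=3)  # assumed values
print(hello_timeout_ms, bfd_timeout_ms, hello_timeout_ms / bfd_timeout_ms)
```

With these example values, the Hello mechanism needs 30 seconds to declare a failure, while BFD needs 30 milliseconds.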

BFD is classified into the following modes:

l Static BFD
In static BFD mode, BFD session parameters (including local and remote discriminators)
are set using commands, and requests must be delivered manually to establish BFD
sessions.
l Dynamic BFD
In dynamic BFD mode, the establishment of BFD sessions is triggered by routing
protocols.
BFD for IS-IS enables BFD sessions to be dynamically established. After detecting a fault,
BFD notifies IS-IS of the fault. IS-IS sets the neighbor status to Down, quickly updates link
state protocol data units (LSPs), and performs the partial route calculation (PRC). BFD for IS-
IS implements fast IS-IS route convergence.

Instead of replacing the Hello mechanism of IS-IS, BFD works with IS-IS to rapidly detect the faults
that occur on neighboring devices or links.

BFD Session Establishment and Deletion

l Conditions for establishing a BFD session
– Global BFD is enabled on each device, and BFD is enabled on a specified interface
or process.
– IS-IS is configured on each device and enabled on interfaces.
– Neighbors are Up, and a designated intermediate system (DIS) has been elected on
a broadcast network.
l Process of establishing a BFD session
– P2P network
After the conditions for establishing BFD sessions are met, IS-IS instructs the BFD
module to establish a BFD session and negotiate BFD parameters between
neighbors.
– Broadcast network
After the conditions for establishing BFD sessions are met and the DIS is elected,
IS-IS instructs BFD to establish a BFD session and negotiate BFD parameters
between the DIS and each device. No BFD sessions are established between non-DISs.

On broadcast networks, devices (including non-DIS devices) of the same level on a
network segment can establish adjacencies. In BFD for IS-IS, however, BFD sessions are
established only between the DIS and non-DISs. On P2P networks, BFD sessions are
directly established between neighbors.
If a Level-1-2 neighbor relationship is set up between the devices on both ends of a link,
the following situations occur:
– On a broadcast network, IS-IS sets up a Level-1 BFD session and a Level-2 BFD
session.
– On a P2P network, IS-IS sets up only one BFD session.
l Process of tearing down a BFD session
– P2P network
If the neighbor relationship established between P2P IS-IS interfaces is not Up, IS-
IS tears down the BFD session.
– Broadcast network
If the neighbor relationship established between broadcast IS-IS interfaces is not Up
or the DIS is reelected on the broadcast network, IS-IS tears down the BFD session.
If the configurations of dynamic BFD sessions are deleted or BFD for IS-IS is disabled
from an interface, all Up BFD sessions established between the interface and its
neighbors are deleted. If the interface is a DIS and the DIS is Up, all BFD sessions
established between the interface and its neighbors are deleted.
If BFD is disabled from an IS-IS process, BFD sessions are deleted from the process.

BFD detects only the one-hop link between IS-IS neighbors because IS-IS establishes only one-
hop neighbor relationships.
l Response to the Down event of a BFD session
When BFD detects a link failure, it generates a Down event and informs IS-IS of the
Down event through the GFD module. IS-IS then suppresses neighbor relationships and
recalculates routes. This process speeds up network convergence.
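
The session establishment rules above can be summarized in a short sketch. It assumes a single network segment and uses illustrative function and device names. It contrasts the full set of IS-IS adjacencies on a broadcast segment with the smaller set of BFD sessions, which exist only between the DIS and each non-DIS.

```python
from itertools import combinations


def isis_adjacencies(devices):
    """Same-level devices on one broadcast segment can all become adjacent."""
    return list(combinations(devices, 2))


def isis_bfd_sessions(devices, network_type, dis=None):
    """Return the device pairs between which BFD sessions are set up."""
    if network_type == "p2p":
        first, second = devices  # a P2P link has exactly two ends
        return [(first, second)]
    if network_type == "broadcast":
        if dis not in devices:
            raise ValueError("a DIS must be elected before sessions are set up")
        return [(dis, device) for device in devices if device != dis]
    raise ValueError(f"unknown network type: {network_type}")


def sessions_per_level_1_2_neighbor(network_type):
    """Level-1-2 neighbors: two sessions on broadcast, one on P2P."""
    return 2 if network_type == "broadcast" else 1


segment = ["R1", "R2", "R3", "R4"]  # hypothetical devices, R1 elected as DIS
print(len(isis_adjacencies(segment)))
print(isis_bfd_sessions(segment, "broadcast", dis="R1"))
print(isis_bfd_sessions(["R1", "R2"], "p2p"))
```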

Usage Scenario

Dynamic BFD needs to be configured based on the actual network. If the time parameters are
not configured correctly, network flapping may occur.

BFD for IS-IS speeds up route convergence through rapid link failure detection. The
following is a networking example for BFD for IS-IS.

Figure 2-13 BFD for IS-IS
[Figure omitted. Surviving labels: Device A, Switch, Device B, Device C; legend: Primary path,
Backup path.]

The configuration requirements are as follows:

l Basic IS-IS functions are configured on each device shown in Figure 2-13.
l Global BFD is enabled.
l BFD for IS-IS is enabled on Device A and Device B.
If the link between Device A and Device B fails, BFD can rapidly detect the fault and report it
to IS-IS. IS-IS sets the neighbor status to Down to trigger an IS-IS topology calculation. IS-IS
also updates LSPs so that Device C can promptly receive the updated LSPs from Device B,
which accelerates network topology convergence.

2.3.6 BFD for BGP

The Border Gateway Protocol (BGP) periodically sends Keepalive packets to a peer to
monitor the peer's status. However, BGP takes more than 1 second for fault detection. When
traffic is transmitted at gigabit rates, lengthy fault detection causes packet loss, which does
not meet carrier-class network requirements for high reliability.
Bidirectional Forwarding Detection (BFD) for BGP can quickly detect faults on the link
between BGP peers and notify BGP of the faults, which implements fast BGP route
convergence.
As shown in Figure 2-14, Device A and Device B belong to ASs 100 and 200, respectively.
The two Routers are directly connected and establish an External Border Gateway Protocol
(EBGP) peer relationship.
BFD is enabled to detect the EBGP peer relationship between Device A and Device B. If the
link between Device A and Device B fails, BFD can quickly detect the fault and notify BGP.

Figure 2-14 BFD for BGP
[Figure omitted. Surviving labels: Device A in AS 100 and Device B in AS 200, connected by an
EBGP peer relationship monitored by a BFD session.]

2.3.7 BFD for LDP LSP

Bidirectional forwarding detection (BFD) monitors Label Distribution Protocol (LDP) label
switched paths (LSPs). If an LDP LSP fails, BFD can rapidly detect the fault and trigger a
primary/backup LSP switchover, which improves network reliability.

If a node or link along an LDP LSP that is transmitting traffic fails, traffic switches to a
backup LSP. The path switchover speed depends on the detection duration and traffic
switchover duration. A delayed path switchover causes traffic loss. LDP fast reroute (FRR)
can be used to speed up the traffic switchover, but not the detection process.
As shown in Figure 2-15, a local label switching router (LSR) periodically sends Hello
messages to notify each peer LSR of the local LSR's presence and establish a Hello adjacency
with each peer LSR. The local LSR constructs a Hello hold timer to maintain the Hello
adjacency with each peer. Each time the local LSR receives a Hello message, it updates the
Hello hold timer. If the Hello hold timer expires before a Hello message arrives, the LSR
considers the Hello adjacency disconnected. The Hello mechanism cannot rapidly detect link
faults, especially when a Layer 2 device is deployed between the local LSR and its peer.

Figure 2-15 Primary and FRR LSPs
[Figure omitted. Surviving labels: Ingress, Egress, Hello, Primary LSP.]

The rapid, light-load BFD mechanism is used to quickly detect faults and trigger a primary/
backup LSP switchover, which minimizes data loss and improves service reliability.


BFD for LDP LSP is implemented by establishing a BFD session between two nodes on both
ends of an LSP and binding the session to the LSP. BFD rapidly detects LSP faults and
triggers a traffic switchover. When BFD monitors a unidirectional LDP LSP, the reverse path
of the LDP LSP can be an IP link, an LDP LSP, or a traffic engineering (TE) tunnel.

A BFD session that monitors LDP LSPs is negotiated in either static or dynamic mode:
l Static configuration: The negotiation of a BFD session is performed using the local and
remote discriminators that are manually configured for the BFD session to be
established. On a local LSR, you can bind an LSP with a specified next-hop IP address
to a BFD session with a specified peer IP address.
l Dynamic establishment: The negotiation of a BFD session is performed using the BFD
discriminator type-length-value (TLV) in an LSP ping packet. You must specify a policy
for establishing BFD sessions on a local LSR. The LSR automatically establishes BFD
sessions with its peers and binds the BFD sessions to LSPs using either of the following
policies:
– Host address-based policy: The local LSR uses all host addresses to establish BFD
sessions. You can specify a next-hop IP address and an outbound interface name of
LSPs and establish BFD sessions to monitor the specified LSPs.
– Forwarding equivalence class (FEC)-based policy: The local LSR uses host
addresses listed in a configured FEC list to automatically establish BFD sessions.

BFD uses the asynchronous mode to check LSP continuity. That is, the ingress and egress
periodically send BFD packets to each other. If one end does not receive BFD packets from
the other end within a detection period, BFD considers the LSP Down and sends an LSP
Down message to the LSP management (LSPM) module.
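
The asynchronous check described above amounts to a timeout on the most recently received BFD packet. The sketch below is a simplified model with millisecond timestamps passed in explicitly; the function name and the 30 ms detection period are assumptions for illustration.

```python
def lsp_state(last_rx_ms: int, now_ms: int, detection_period_ms: int) -> str:
    """Return "Up" while a BFD packet has arrived within the detection period."""
    elapsed = now_ms - last_rx_ms
    return "Up" if elapsed < detection_period_ms else "Down"


# Last packet from the peer arrived at t = 20 ms; assumed detection period is 30 ms.
print(lsp_state(last_rx_ms=20, now_ms=40, detection_period_ms=30))
print(lsp_state(last_rx_ms=20, now_ms=60, detection_period_ms=30))
```

In this model, the transition to "Down" is the point at which an LSP Down message would be sent to the LSPM module.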
Even if BFD for LDP is enabled on a proxy egress, a BFD session cannot be established for the
reverse path of a proxy egress LSP on the proxy egress.

BFD for LDP Tunnel

BFD for LDP LSP only detects primary LSP faults and switches traffic to an FRR bypass LSP
or existing load-balancing LSPs. If the primary and FRR bypass LSPs or the primary and
load-balancing LSPs fail simultaneously, the BFD mechanism does not take effect. LDP can
instruct its upper-layer application to perform a protection switchover (such as VPN FRR or
VPN equal-cost load balancing) only after LDP itself detects the FRR bypass LSP failure or
the load-balancing LSP failure.

To address this issue, BFD for LDP tunnel is used. LDP tunnels include the primary LSP and
FRR bypass LSP. The BFD for LDP tunnel mechanism establishes a BFD session that can
simultaneously monitor the primary and FRR bypass LSPs or the primary and load-balancing
LSPs. If both the primary and FRR bypass LSPs fail or both the primary and load-balancing
LSPs fail, BFD rapidly detects the failures and instructs the LDP upper-layer application to
perform a protection switchover, which minimizes traffic loss.
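
The difference between the two mechanisms can be expressed as two predicates. This is a sketch under the assumptions of this section only: the function names and the dictionary of LSP states are illustrative.

```python
def ldp_lsp_triggers_switchover(primary_up: bool) -> bool:
    """BFD for LDP LSP: a primary LSP failure moves traffic to the bypass
    or load-balancing LSP."""
    return not primary_up


def ldp_tunnel_notifies_upper_layer(lsp_up: dict) -> bool:
    """BFD for LDP tunnel: the upper-layer application is told to switch
    only when every monitored LSP in the tunnel has failed."""
    return not any(lsp_up.values())


print(ldp_tunnel_notifies_upper_layer({"primary": False, "frr_bypass": True}))
print(ldp_tunnel_notifies_upper_layer({"primary": False, "frr_bypass": False}))
```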

BFD for LDP tunnel uses the same mechanism as BFD for LDP LSP to monitor the
connectivity of each LSP in an LDP tunnel. Unlike BFD for LDP LSP, BFD for LDP tunnel
has the following characteristics:

l Only dynamic BFD sessions can be created for LDP tunnels.

l A BFD for LDP tunnel session is triggered using a host IP address, a FEC list, or an IP
prefix list.
l No next-hop address or outbound interface name can be specified in any BFD session
trigger policies.

Usage Scenarios
BFD for LDP LSP can be used in the following scenarios:
l Primary and bypass LDP FRR LSPs are established.
l Primary and bypass virtual private network (VPN) FRR LSPs are established.

BFD for LDP LSP provides a rapid, light-load fault detection mechanism for LDP LSPs,
which improves network reliability.

2.3.8 BFD for P2MP TE

BFD for P2MP TE applies to NG-MVPN and VPLS scenarios and rapidly detects P2MP TE
tunnel failures. This function helps reduce the response time, improve network-wide
reliability, and reduce traffic loss.

No tunnel protection is provided in the NG-MVPN over P2MP TE function or VPLS over
P2MP TE function. If a tunnel fails, traffic can be switched only through hard convergence
induced by route changes, which performs poorly. This function provides dual-root 1+1
protection for the NG-MVPN over P2MP TE function and VPLS over P2MP TE function. If a
P2MP TE tunnel fails, BFD for P2MP TE rapidly detects the fault and switches traffic, which
improves fault convergence performance and reduces traffic loss.


Figure 2-16 BFD for P2MP TE principles
[Figure omitted. Surviving labels: Root, Backup Root, P1, P2, and four Leaf nodes.]

In Figure 2-16, BFD is enabled on the root PE1 and the backup root PE2. Leaf nodes UPE1
to UPE4 are enabled to passively create BFD sessions. Both PE1 and PE2 send BFD packets
to all leaf nodes along P2MP TE tunnels. The leaf nodes receive the BFD packets transmitted
only on the primary tunnel. If a leaf node receives detection packets within a specified
interval, the link between the root node and leaf node is working properly. If a leaf node fails
to receive BFD packets within a specified interval, the link between the root node and leaf
node is considered faulty. The leaf node then rapidly switches traffic to a protection tunnel,
which reduces traffic loss.

2.3.9 BFD for TE CR-LSP

BFD for TE is an end-to-end rapid detection mechanism supported by MPLS TE. BFD for TE
rapidly detects faults in links on an MPLS TE tunnel. BFD for TE supports BFD for TE
tunnel and BFD for TE CR-LSP. This section describes BFD for TE CR-LSP only.

Traditional detection mechanisms, such as RSVP Hello and Srefresh, detect faults slowly.
BFD rapidly sends and receives packets to detect faults in a tunnel. If a fault occurs, BFD
triggers a traffic switchover to protect traffic.

Figure 2-17 BFD




On the network shown in Figure 2-17, BFD is disabled. If LSRE fails, LSRA or LSRF cannot
promptly detect the fault because a Layer 2 switch exists between them. Although the Hello
mechanism detects the fault, detection lasts for a long time.

After BFD is enabled, if LSRE fails, LSRA and LSRF detect the fault rapidly, and traffic
switches to the path LSRA -> LSRB -> LSRD -> LSRF.

BFD for TE detects faults in a CR-LSP. After detecting a fault in a CR-LSP, BFD for TE
immediately notifies the forwarding plane of the fault to rapidly trigger a traffic switchover.
BFD for TE is usually used together with a hot-standby CR-LSP.

The concepts associated with BFD are as follows:

l Static BFD session: established by manually setting the local and remote discriminators.
The local discriminator on a local node must match the remote discriminator on a remote
node. The minimum intervals at which BFD packets are sent and received are
changeable after a static BFD session is established.

l Detection period: an interval at which the system checks the BFD session status. If no
packet is received from the remote end within a detection period, the BFD session is
considered Down.

A BFD session is bound to a CR-LSP. A BFD session is set up between the ingress and
egress. A BFD packet is sent by the ingress to the egress along a CR-LSP. Upon receipt, the
egress responds to the BFD packet. The ingress can rapidly monitor the status of links through
which the CR-LSP passes based on whether a reply packet is received.

If a link fault is detected, BFD notifies the forwarding module of the fault. The forwarding
module searches for a backup CR-LSP and switches traffic to the backup CR-LSP. In
addition, the forwarding module reports the fault to the control plane. If static BFD for TE
CR-LSP is used, a BFD session is created manually to detect faults in the backup CR-LSP if

Figure 2-18 BFD sessions before and after a switchover
[Figure omitted. Surviving legend: Primary LSP, Backup LSP, BFD session.]

On the network shown in Figure 2-18, a BFD session is set up to detect faults in the link
through which the primary CR-LSP passes. If a link fault occurs, the BFD session on the
ingress immediately notifies the forwarding plane of the fault. The ingress switches traffic to
the bypass CR-LSP and sets up a new BFD session to detect faults in the bypass CR-LSP.

BFD for TE Deployment

The networking shown in Figure 2-19 applies to BFD for TE CR-LSP and BFD for hot-
standby CR-LSP.

Figure 2-19 BFD for TE
[Figure omitted. Surviving label: Primary Tunnel.]


Switchover between the primary and hot-standby CR-LSPs

On the network shown in Figure 2-19, a primary CR-LSP is established along the path LSRA
-> LSRB, and a hot-standby CR-LSP is configured. A BFD session is set up between LSRA
and LSRB to detect faults in the primary CR-LSP. If a fault occurs on the primary CR-LSP,
the BFD session rapidly notifies LSRA of the fault. After receiving the fault information,
LSRA rapidly switches traffic to the hot-standby CR-LSP to ensure traffic continuity.

2.3.10 BFD for TE Tunnel

BFD for TE supports BFD for TE tunnel and BFD for TE CR-LSP. This section describes
BFD for TE tunnel.
The BFD mechanism detects communication faults in links between forwarding engines. The
BFD mechanism monitors the connectivity of a data protocol on a bidirectional path between
systems. The path can be a physical link or a logical link, for example, a TE tunnel.
BFD detects faults in an entire TE tunnel. If a fault is detected and the primary TE tunnel is
enabled with virtual private network (VPN) FRR, a traffic switchover is rapidly triggered,
which minimizes the impact on traffic.
On a VPN FRR network, a TE tunnel is established between PEs, and the BFD mechanism is
used to detect faults in the tunnel. If the BFD mechanism detects a fault, VPN FRR switching
is performed in milliseconds.

2.3.11 BFD for RSVP

When a Layer 2 device exists on a link between two RSVP nodes, BFD for RSVP can be
configured to rapidly detect a fault in the link between the Layer 2 device and an RSVP node.
If a link fault occurs, BFD for RSVP detects the fault and sends a notification to trigger TE
FRR switching.

When a Layer 2 device is deployed on a link between two RSVP nodes, an RSVP node can
only use the Hello mechanism to detect a link fault. For example, on the network shown in
Figure 2-20, a switch exists between P1 and P2. If a fault occurs on the link between the
switch and P2, P1 keeps sending Hello packets and detects the fault after it fails to receive
replies to the Hello packets. The fault detection latency causes seconds of traffic loss. To
minimize packet loss, BFD for RSVP can be configured. BFD rapidly detects a fault and
triggers TE FRR switching, which improves network reliability.

Figure 2-20 BFD for RSVP
[Figure omitted. Surviving labels: PE1, P2, PE2, Hello; legend: Faulty point, Primary CR-LSP,
Bypass CR-LSP.]

BFD for RSVP monitors RSVP neighbor relationships.
Unlike BFD for CR-LSP and BFD for TE that support multi-hop BFD sessions, BFD for
RSVP establishes only single-hop BFD sessions between RSVP nodes to monitor the network
BFD for RSVP, BFD for OSPF, BFD for IS-IS, and BFD for BGP can share a BFD session.
When protocol-specific BFD parameters are set for a BFD session shared by RSVP and other
protocols, the smallest values take effect. The parameters include the minimum intervals at
which BFD packets are sent, minimum intervals at which BFD packets are received, and local
detection multipliers.
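
The "smallest values take effect" rule for a shared session can be sketched as follows. The parameter names and the per-protocol values are hypothetical; only the rule of taking the minimum of each parameter comes from the text.

```python
def effective_shared_params(per_protocol: dict) -> dict:
    """Pick the smallest value of each parameter across all sharing protocols."""
    keys = ("min_tx_ms", "min_rx_ms", "detect_multiplier")
    return {key: min(params[key] for params in per_protocol.values()) for key in keys}


configured = {  # hypothetical per-protocol settings for one shared session
    "rsvp": {"min_tx_ms": 100, "min_rx_ms": 100, "detect_multiplier": 4},
    "ospf": {"min_tx_ms": 50, "min_rx_ms": 200, "detect_multiplier": 3},
    "bgp": {"min_tx_ms": 300, "min_rx_ms": 150, "detect_multiplier": 5},
}
print(effective_shared_params(configured))
```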

Usage Scenario
BFD for RSVP applies to a network on which a Layer 2 device exists between the TE FRR
point of local repair (PLR) on a bypass CR-LSP and an RSVP node on the primary CR-LSP.

BFD for RSVP improves reliability on MPLS TE networks with Layer 2 devices.

2.3.12 BFD for VRRP

Devices in a VRRP backup group exchange VRRP Advertisement packets to negotiate the
master/backup status and implement backup. If the link between devices in a VRRP backup
group fails, VRRP Advertisement packets cannot be exchanged to negotiate the master/
backup status. A backup device attempts to preempt the Master state only after a period of
three times the interval at which VRRP Advertisement packets are broadcast.
During this period, user traffic is still forwarded to the master device, which results in user
traffic loss.

Bidirectional Forwarding Detection (BFD) can rapidly detect faults in links or IP routes. BFD
for VRRP enables a master/backup VRRP switchover to be completed within 1 second,
preventing user traffic loss. A BFD session is established between the master and backup
devices in a VRRP backup group and is bound to the VRRP backup group. BFD immediately
detects communication faults in the VRRP backup group and instructs the VRRP backup
group to perform a master/backup switchover, minimizing service interruptions.
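
The difference between the two detection times can be estimated with a small Python sketch. It assumes the standard timer formulas rather than anything NE40E-specific: the VRRP master-down interval from RFC 3768 (three advertisement intervals plus a priority-based skew time) and the BFD detection time as the detection multiplier times the negotiated interval. The sample values are illustrative.

```python
def vrrp_master_down_interval_s(advert_interval_s: float, priority: int) -> float:
    """Time a backup waits before taking over without BFD (RFC 3768 formula)."""
    skew_time_s = (256 - priority) / 256
    return 3 * advert_interval_s + skew_time_s


def bfd_detection_time_ms(detect_multiplier: int, interval_ms: int) -> int:
    """Time after which a BFD session is declared Down when no packets arrive."""
    return detect_multiplier * interval_ms


# Illustrative values: 1-second advertisements, backup priority 100,
# and a BFD session with multiplier 3 and a 100 ms interval.
without_bfd_s = vrrp_master_down_interval_s(advert_interval_s=1.0, priority=100)
with_bfd_ms = bfd_detection_time_ms(detect_multiplier=3, interval_ms=100)
```

With these assumed values the backup device waits more than 3.6 seconds without BFD, whereas the BFD session reports the fault after 300 ms.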

VRRP and BFD Association Modes

The following table describes VRRP and BFD association modes.

Table 2-8 VRRP and BFD association modes

Association mode: Association between a VRRP backup group and a common BFD session
  Usage scenario: A backup device monitors the status of the master device in a VRRP backup
    group. A common BFD session is used to monitor the link between the master and backup
    devices.
  Type of associated BFD session: Static BFD sessions or static BFD sessions with automatically
    negotiated discriminators
  Impact mode: If the BFD session detects a fault and goes Down, the BFD module notifies the
    VRRP backup group of the status change. After receiving the notification, the VRRP backup
    group changes VRRP priorities of devices and determines whether to perform a master/backup
    VRRP switchover.
  BFD support: VRRP devices must be enabled with BFD.

Association mode: Association between a VRRP backup group and link and peer BFD sessions
  Usage scenario: The master and backup devices monitor the link and peer BFD sessions. A peer
    BFD session is established between the master and backup devices. A link BFD session is
    established between a downstream switch and each VRRP device. BFD helps the VRRP backup
    group detect faults in the link between a VRRP device and the downstream switch.
  Type of associated BFD session: Static BFD sessions or static BFD sessions with automatically
    negotiated discriminators
  Impact mode: If the link or peer BFD session goes Down, BFD notifies the VRRP backup group of
    the fault. After receiving the notification, the VRRP backup group immediately performs a
    master/backup VRRP switchover.
  BFD support: VRRP devices and the downstream switch must be enabled with BFD.

Association Between a VRRP Backup Group and a Common BFD Session

As shown in Figure 2-21, a BFD session is established between Device A (master) and
Device B (backup) and is bound to a VRRP backup group. If BFD detects a fault on the link
between Device B and Device A, the BFD module notifies the VRRP module of the status
change. After receiving the notification, the VRRP module performs a master/backup VRRP
switchover.


Figure 2-21 Association between a VRRP backup group and a common BFD session
[Figure: Device A (master), Device B, Device C, Device D, Device E, and the network core.
Legend: BFD control packet, data flow.]

VRRP device configurations are as follows:

• Device A supports delayed preemption and its VRRP priority is 120.
• Device B supports immediate preemption and its VRRP priority retains the default value (100).
• A VRRP backup group is configured on Device B to monitor a common BFD session. If
  BFD detects a fault and the BFD session goes Down, Device B increases its VRRP
  priority by 40.
The implementation process is as follows:
1. Device A periodically sends VRRP Advertisement packets to inform Device B that it is
working properly. Device B monitors the status of Device A and the BFD session.
2. If BFD detects a fault, the BFD session goes Down. BFD notifies the VRRP module of
the status change. Device B increases its VRRP priority value to 140 (increased by 40),
higher than Device A's VRRP priority. Device B preempts the Master state and sends
gratuitous ARP packets to update address entries on Device E.
3. After the fault is rectified, the BFD session goes Up.
Device B restores a priority of 100. Device B retains the Master state and still sends
VRRP Advertisement packets to Device A.
After receiving the packets, Device A checks that the VRRP priority carried in the
packets is lower than the local VRRP priority and waits a specified period before
preempting the Master state. After restoring the Master state, Device A sends a VRRP
Advertisement packet and a gratuitous ARP packet.
After receiving the VRRP Advertisement packet that carries a priority higher than the
local priority, Device B enters the Backup state.
4. Device A in the Master state forwards user traffic, and Device B remains in the Backup
state.
The preceding process shows how BFD for VRRP differs from VRRP alone. After BFD for
VRRP is deployed and a fault occurs, a backup device immediately preempts the Master state
without waiting for a period three times the interval at which VRRP Advertisement packets are
broadcast. A master/backup VRRP switchover can be implemented in milliseconds.
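
The priority arithmetic in the preceding process can be traced with a minimal Python sketch. The class and function names are hypothetical, and the model is deliberately simplified: it compares priorities only and ignores Device A's preemption delay, Advertisement packet exchange, and tie-breaking.

```python
from dataclasses import dataclass


@dataclass
class VrrpDevice:
    name: str
    base_priority: int
    track_increase: int = 0   # added to the priority while the tracked BFD session is Down
    bfd_down: bool = False

    @property
    def priority(self) -> int:
        return self.base_priority + (self.track_increase if self.bfd_down else 0)


def elect_master(devices: list[VrrpDevice]) -> VrrpDevice:
    """Simplified election: the device with the highest current priority wins."""
    return max(devices, key=lambda d: d.priority)


device_a = VrrpDevice("Device A", base_priority=120)
device_b = VrrpDevice("Device B", base_priority=100, track_increase=40)
group = [device_a, device_b]

normal_master = elect_master(group).name      # Device A (120 > 100)

device_b.bfd_down = True                      # BFD session goes Down
fault_priority = device_b.priority            # 100 + 40 = 140
fault_master = elect_master(group).name       # Device B (140 > 120)

device_b.bfd_down = False                     # fault rectified, BFD session goes Up
recovered_master = elect_master(group).name   # Device A again, after its preemption delay
```

The increase of 40 is chosen so that Device B's raised priority (140) exceeds Device A's priority (120); a smaller increase of 20 or less would not trigger a switchover in this model.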


Association Between a VRRP Backup Group and Link and Peer BFD Sessions
As shown in Figure 2-22, the master and backup devices monitor the status of link and peer
BFD sessions to identify local or remote faults.
Device A and Device B run VRRP. A peer BFD session is established between Device A and
Device B to detect link and device failures. Link BFD sessions are established between
Device A and Device E and between Device B and Device E to detect link and device
failures. After Device B detects that the peer BFD session has gone Down while the Link2 BFD
session is Up, Device B's VRRP status changes from Backup to Master, and Device B takes over.

Figure 2-22 Association between a VRRP backup group and link and peer BFD sessions
[Figure: Device A (VRRP master), Device B (backup), Device C, Device D, Device E, and the
network core. Legend: BFD control packet, data flow.]

VRRP device configurations are as follows:

• Device A and Device B run VRRP.
• A peer BFD session is established between Device A and Device B to detect link and
  device failures.
• Link1 and Link2 BFD sessions are established between Device E and Device A and
  between Device E and Device B, respectively.
The implementation process is as follows:
1. In normal circumstances, Device A periodically sends VRRP Advertisement packets to
inform Device B that it is working properly. Device A monitors the BFD session status.
Device B monitors the status of Device A and the BFD session.
2. The BFD session goes Down if BFD detects either of the following faults:
– Link1 or Device E fails. Link1 BFD session and the peer BFD session go Down.
Link2 BFD session is Up.
Device A's VRRP status directly becomes Initialize.
Device B's VRRP status directly becomes Master.
– Device A fails. Link1 BFD session and the peer BFD session go Down. Link2 BFD
session is Up. Device B's VRRP status becomes Master.
3. After the fault is rectified, the BFD sessions go Up, and Device A and Device B restore
their VRRP status.



A Link2 fault does not affect Device A's VRRP status, and Device A continues to forward upstream
traffic. However, Device B's VRRP status becomes Master if both the peer BFD session and Link2 BFD
session go Down, and Device B detects the peer BFD session status change before detecting the Link2
BFD session status change. After Device B detects the Link2 BFD session status change, Device B's
VRRP status becomes Initialize.

Figure 2-23 shows the state machine for the association between a VRRP backup group and
link and peer BFD sessions.

Figure 2-23 State machine for the association between a VRRP backup group and link and peer
BFD sessions
[Figure: state machine with Master and Backup states. Backup changes to Master when the peer
BFD session goes Down and the link BFD session goes Up. The other transition labels refer to
the link BFD session going Up or Down and to a VRRP priority of 255.]

The preceding process shows that, after link and peer BFD for VRRP is deployed, the backup
device immediately preempts the Master state if a fault occurs. Link and peer BFD for VRRP
implements a millisecond-level master/backup VRRP switchover.
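
The fault-handling transitions described in this section can be sketched in Python as follows. The enum and function names are hypothetical. The sketch covers only the transitions stated in the text (a link BFD session going Down leads to Initialize; a backup device whose peer BFD session is Down while its link BFD session is Up becomes Master) and does not model recovery from the Initialize state.

```python
from enum import Enum


class VrrpState(Enum):
    INITIALIZE = "Initialize"
    BACKUP = "Backup"
    MASTER = "Master"


def next_state(state: VrrpState, peer_up: bool, link_up: bool) -> VrrpState:
    """Apply the link and peer BFD session status to a device's VRRP state."""
    if not link_up:
        # The device's own link BFD session is Down.
        return VrrpState.INITIALIZE
    if state is VrrpState.BACKUP and not peer_up:
        # Peer BFD session Down, local link BFD session Up: take over.
        return VrrpState.MASTER
    return state


# Link1 fails: Link1 BFD and peer BFD sessions go Down, Link2 BFD session stays Up.
device_a = next_state(VrrpState.MASTER, peer_up=False, link_up=False)
device_b = next_state(VrrpState.BACKUP, peer_up=False, link_up=True)

# Case from the note: Device B detects the peer BFD change before the Link2 BFD change.
b_first = next_state(VrrpState.BACKUP, peer_up=False, link_up=True)
b_then = next_state(b_first, peer_up=False, link_up=False)
```

Because events are applied one at a time, the sketch reproduces the transient Master state that the note describes when both sessions fail but the peer BFD status change is detected first.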

BFD for VRRP speeds up master/backup VRRP switchovers if faults occur.

2.3.13 BFD for PW

Service Overview
Bidirectional Forwarding Detection (BFD) for pseudo wire (PW) monitors PW connectivity
on a Layer 2 virtual private network (L2VPN) and informs the L2VPN of any detected faults.
Upon receiving a fault notification from BFD, the L2VPN performs a primary/secondary PW
switchover to protect services.
Static BFD for PW has two modes: time to live (TTL) and non-TTL.
The two static BFD for PW modes are described as follows:
• Static BFD for PW in TTL mode: The TTL of BFD packets is automatically calculated
  or manually configured. BFD packets are encapsulated with PW labels and transmitted
  over PWs. The PW can have the control word either enabled or disabled. The usage
  scenarios of static BFD for PW in TTL mode are as follows:
  – Static BFD for single-segment PW (SS-PW): Two BFD-enabled nodes negotiate a
    BFD session based on the configured peer address and TTL (the TTL for SS-PWs is
    1) and exchange BFD packets to monitor PW connectivity.
  – Static BFD for multi-segment PW (MS-PW): The remote peer address of the MS-PW
    to be detected must be specified. BFD packets can pass through multiple
    superstratum provider edge devices (SPEs) to reach the destination, regardless of
    whether the control word is enabled for the PW.
• Static BFD for PW in non-TTL mode: The TTL of BFD packets is fixed at 255. BFD
  packets are encapsulated with PW labels and transmitted over PWs. The PW must have the
  control word enabled, and control packets are differentiated from data packets by checking
  whether the packets carry the control word. Static BFD for PW in non-TTL mode can
  detect only end-to-end (E2E) SS-PWs.

Networking Description

Figure 2-24 Service transmission over E2E PWs
[Figure: BFD session over E2E PWs. Legend: primary PW, standby PW.]

Figure 2-24 shows an IP radio access network (RAN) that consists of the following devices:
• Cell site gateway (CSG): CSGs form the access network. On the IP RAN, CSGs function
  as user-end provider edge devices (UPEs) to provide access services for NodeBs.
• Aggregation site gateway (ASG): On the IP RAN, ASGs function as SPEs to provide
  access services for UPEs.
• Radio service gateway (RSG): ASGs and RSGs form the aggregation network. On the IP
  RAN, RSGs function as network provider edge devices (NPEs) to connect to the radio
  network controller (RNC).
The primary PW is along CSG1–ASG3–RSG5 and the secondary PW is along CSG1–CSG2–
ASG4–RSG6. If the primary PW fails, traffic switches to the secondary PW.

Feature Deployment
Configure static BFD for PW on the IP RAN as follows:


1. On CSG1, configure static BFD for the primary and secondary PWs.
2. On RSG5, configure static BFD for the primary PW.
3. On RSG6, configure static BFD for the secondary PW.
When you configure static BFD for PW, note the following points:
• When you configure static BFD for the primary PW, ensure that the local discriminator on CSG1 is
  the remote discriminator on RSG5 and that the remote discriminator on CSG1 is the local
  discriminator on RSG5.
• When you configure static BFD for the secondary PW, ensure that the local discriminator on CSG1
  is the remote discriminator on RSG6 and that the remote discriminator on CSG1 is the local
  discriminator on RSG6.
After you configure static BFD for PW on CSG1 and primary/secondary RSGs, services can
quickly switch to the secondary PW if the primary PW fails.
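
The discriminator rule in the preceding note can be checked with a short Python sketch. The class name, function name, and discriminator values are hypothetical and serve only to illustrate the cross-matching requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticBfdEnd:
    """One end of a static BFD session (hypothetical model)."""
    node: str
    local_discriminator: int
    remote_discriminator: int


def discriminators_match(a: StaticBfdEnd, b: StaticBfdEnd) -> bool:
    """Each end's local discriminator must equal the other end's remote discriminator."""
    return (a.local_discriminator == b.remote_discriminator
            and a.remote_discriminator == b.local_discriminator)


# Illustrative values for the primary PW (CSG1 to RSG5) and secondary PW (CSG1 to RSG6).
primary_ok = discriminators_match(
    StaticBfdEnd("CSG1", local_discriminator=10, remote_discriminator=20),
    StaticBfdEnd("RSG5", local_discriminator=20, remote_discriminator=10),
)
secondary_ok = discriminators_match(
    StaticBfdEnd("CSG1", local_discriminator=11, remote_discriminator=21),
    StaticBfdEnd("RSG6", local_discriminator=21, remote_discriminator=12),  # mismatch
)
```

In the second pair the remote discriminator on RSG6 does not equal the local discriminator on CSG1, so the check fails.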

2.3.14 BFD for Multicast VPLS

Service Overview
IP/MPLS backbone networks carry an increasing number of multicast services, such as IPTV,
video conferences, and massively multiplayer online role-playing games (MMORPGs), which
all require bandwidth assurance, QoS guarantee, and high network reliability. To provide
better multicast services, the IETF proposed the multicast VPLS solution. On a multicast
VPLS network, the ingress transmits multicast traffic to multiple egresses over a P2MP
MPLS tunnel. This solution eliminates the need to deploy PIM and HVPLS on the transit
nodes, simplifying network deployment.
On a multicast VPLS network, multicast traffic can be carried over either P2MP TE tunnels or
P2MP mLDP tunnels. When P2MP TE tunnels are used, P2MP TE FRR must be deployed. If
a link fault occurs, FRR allows traffic to be rapidly switched to a normal link. If a node fails,
however, traffic is not switched until the root node detects the fault and recalculates links to
set up a Source to Leaf (S2L) sub-LSP. Topology convergence takes a long time in this
situation, affecting service reliability.
To meet the reliability requirements of multicast services, configure BFD for multicast VPLS
to monitor multicast VPLS links. When a link or node fails, BFD on the leaf nodes can
rapidly detect the fault and trigger protection switching so that the leaf nodes receive traffic
from the backup multicast tunnel.


Networking Description

Figure 2-25 BFD for multicast VPLS
[Figure: a master root and a backup root connected to four leaf nodes (UPE1 to UPE4), each
serving one receiver (Receiver1 to Receiver4), with numbered link and node failure points.
Legend: link or node failure, physical link, master P2MP tunnel, backup P2MP tunnel.]

Figure 2-25 shows a dual-root 1+1 protection scenario in which PE-AGG1 is the master root
node and PE-AGG2 is the backup root node. Each root node sets up a complete MPLS
multicast tree to the UPEs (leaf nodes). The two MPLS multicast trees do not have
overlapping paths. After multicast flows reach PE-AGG1 and PE-AGG2, PE-AGG1 and
PE-AGG2 send the multicast flows along their respective P2MP tunnels to UPEs. Each UPE
receives two copies of multicast flows and selects one to send to users.
The network configurations are as follows:
1. An IGP runs between the UPEs, SPEs, and PE-AGGs to implement Layer 3 reachability.
2. Each PE-AGG sets up a P2P tunnel (a TE tunnel or LDP LSP) to each UPE. VPLS PWs
are set up using BGP-AD. In addition, BGP-AD is used to set up P2MP LSPs from PE-
AGG1 and PE-AGG2 to the UPEs. VPLS PWs are iterated to the P2MP LSPs.
3. A protection group is configured on each UPE for P2MP tunnels so that each UPE can
select one from the two copies of multicast flows it receives.
4. BFD for multicast VPLS is deployed for P2MP tunnels to implement protection
switching when BFD detects a fault. On the PE-AGGs, BFD is configured to track the
upstream AC interfaces. If the AC between NPE1 and PE-AGG1 fails, the UPEs receive
multicast flows from NPE2.
BFD for multicast VPLS sessions are set up as follows:
1. A root node triggers the establishment of a BFD session of the MultiPointHead type.
Once established, the BFD session is initially Up and requires no negotiation. BFD
triggers the root node to periodically send LSP ping packets along the P2MP tunnels and
to send BFD detection packets at a configured BFD detection interval.
2. A leaf node receives LSP ping packets and triggers the establishment of a BFD session
of the MultiPointTail type. Once established, the BFD session is initially Down. After
the leaf node receives BFD detection packets indicating that the BFD session on the root
node is Up, the leaf node changes its BFD session to the Up state and starts BFD
detection.
BFD for multicast VPLS sessions support only one-way detection. The BFD session of the
MultiPointHead type on a root node only sends packets, whereas the BFD session of the
MultiPointTail type on a leaf node only receives packets.
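
The one-way behavior of the leaf-side session can be sketched in Python as follows. The class is a hypothetical model, not the device implementation, and it assumes that the leaf declares the session Down after a detection time equal to the detection multiplier times the packet interval, as in standard BFD.

```python
class MultipointTail:
    """Leaf-side session: it only receives packets and starts in the Down state."""

    def __init__(self, interval_ms: int, detect_multiplier: int) -> None:
        self.state = "Down"
        # Assumed detection time: multiplier x interval, as in standard BFD.
        self.detection_time_ms = interval_ms * detect_multiplier
        self.last_rx_ms = 0

    def on_packet(self, now_ms: int, head_state: str) -> None:
        """Record a BFD detection packet sent by the root (MultiPointHead) session."""
        self.last_rx_ms = now_ms
        if head_state == "Up":
            self.state = "Up"

    def poll(self, now_ms: int) -> str:
        """Report Down if no packet has arrived within the detection time."""
        if self.state == "Up" and now_ms - self.last_rx_ms >= self.detection_time_ms:
            self.state = "Down"   # would trigger protection switching on the leaf
        return self.state


tail = MultipointTail(interval_ms=10, detect_multiplier=3)
states = [tail.poll(0)]            # no packet received yet
tail.on_packet(5, head_state="Up")
states.append(tail.poll(20))       # 15 ms since the last packet, below 30 ms
states.append(tail.poll(35))       # 30 ms without packets: the root stopped sending
```

The root never learns the leaf's state in this model, which mirrors the statement that the MultiPointHead session only sends packets and the MultiPointTail session only receives them.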

On the network shown in Figure 2-25, if link 1 (an AC) fails, BFD on the master root node
detects that the AC interface is Down and stops sending BFD detection packets. The leaf
nodes cannot receive BFD detection packets, and therefore report the Down event, which
triggers protection switching. The leaf nodes then receive multicast flows from the backup
multicast tunnel. Similarly, if node 2, link 3, node 4, or link 5 fails, the leaf nodes also receive
multicast flows from the backup multicast tunnel. After the fault is rectified, BFD sessions are
reestablished. The leaf nodes then receive multicast flows from the master multicast tunnel
again.

2.3.15 BFD for PIM

To minimize the impact of device faults on services and improve network availability, a
network device must be able to quickly detect communication faults with adjacent devices.
Measures can then be taken to promptly rectify the faults to ensure service continuity.
On a live network, link faults can be detected using either of the following mechanisms:
• Hardware detection: For example, the Synchronous Digital Hierarchy (SDH) alarm
  function can be used to detect link faults. Hardware fault detection mechanisms are fast,
  but they are not available in all scenarios or on all media.
• Slow Hello mechanism: It usually refers to the Hello mechanism of a routing protocol.
  The detection time of slow Hello mechanisms is measured in seconds. Detection times of
  one second or more can result in large losses if data is being transmitted at gigabit rates.
  For delay-sensitive services, such as voice, a delay of one second or more is also
  unacceptable.
• Other detection mechanisms: Different protocols or manufacturers may provide
  proprietary detection mechanisms, but it is difficult to deploy proprietary mechanisms
  when systems are interconnected for interworking.

Bidirectional Forwarding Detection (BFD) is a unified detection mechanism that can detect a
fault in milliseconds on a network. BFD is compatible with all types of transmission media
and protocols. BFD implements fault detection by establishing a BFD session between two
systems and periodically sending BFD control packets along the path between them. If one
system does not receive BFD control packets within a specified period, the system considers
that a fault has occurred on the path.

In multicast scenarios, if the DR on a shared network segment is faulty and the neighbor
relationship times out, other PIM neighbors start a new DR election. Consequently, multicast
data transmission is interrupted for a few seconds.

BFD for PIM can detect a link's status on a shared network segment within milliseconds and
respond quickly to a fault on a PIM neighbor. If the interface configured with BFD for PIM
does not receive any BFD packets from the current DR within a configured detection period,
the interface considers that a fault has occurred on the designated router (DR). The BFD
module notifies the route management (RM) module of the session status, and the RM
module notifies the PIM module. Then, the PIM module triggers a new DR election
immediately rather than waiting for the neighbor relationship to time out. This minimizes
service interruptions and improves the multicast network reliability.
Currently, BFD for PIM can be used on both IPv4 PIM-SM/Source-Specific Multicast (SSM) and IPv6
PIM-SM/SSM networks.

Figure 2-26 BFD for PIM
[Figure: multicast source, PIM-SM network, Port 2, and Port 1.]

As shown in Figure 2-26, on the shared network segment where user hosts reside, a PIM
BFD session is set up between the downstream interface Port 2 of Device B and the
downstream interface Port 1 of Device C. Both ports send BFD packets to detect the status of
the link between them.
Port 2 of Device B is elected as a DR for forwarding multicast data to the receiver. If Port 2
fails, BFD immediately notifies the RM module of the session status and the RM module then
notifies the PIM module. The PIM module triggers a new DR election. Port 1 of Device C is
then elected as a new DR to forward multicast data to the receiver.
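
The DR re-election triggered by BFD can be sketched in Python. The sketch assumes the standard PIM-SM election rule (highest DR priority, then highest IP address) and uses hypothetical addresses and class names; the BFD-to-RM-to-PIM notification chain is reduced to removing the failed interface from the candidate list.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class PimInterface:
    name: str
    address: IPv4Address
    dr_priority: int = 1


def elect_dr(candidates: list[PimInterface]) -> PimInterface:
    """Standard PIM-SM rule: highest DR priority wins; ties go to the highest address."""
    return max(candidates, key=lambda c: (c.dr_priority, c.address))


# Hypothetical addresses on the shared network segment.
port2_b = PimInterface("Device B Port 2", IPv4Address("10.1.1.2"))
port1_c = PimInterface("Device C Port 1", IPv4Address("10.1.1.1"))
segment = [port2_b, port1_c]

dr_before = elect_dr(segment)

# BFD reports the session to Port 2 Down; PIM re-elects immediately
# instead of waiting for the neighbor relationship to time out.
alive = [i for i in segment if i is not port2_b]
dr_after = elect_dr(alive)
```

The election rule itself does not change with BFD; only the trigger does, which is why the interruption shrinks from the neighbor timeout to the BFD detection period.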

3 MPLS OAM


About This Chapter

3.1 Overview of MPLS OAM

3.2 Understanding MPLS OAM
3.3 Application Scenarios for MPLS OAM
3.4 Terminology for MPLS OAM

3.1 Overview of MPLS OAM

As a key technology used on scalable next generation networks, Multiprotocol Label
Switching (MPLS) provides multiple services with quality of service (QoS) guarantee. MPLS,
however, introduces a unique network layer in which faults can occur. Therefore, MPLS
networks must have operation, administration and maintenance (OAM) capabilities.

OAM is an important means to reduce network maintenance costs. The MPLS OAM
mechanism manages operation and maintenance of MPLS networks.

For details about the MPLS OAM background, see ITU-T Recommendation Y.1710. For
details about the MPLS OAM implementation mechanism, see ITU-T Recommendation Y.1711.

The server-layer protocols, such as Synchronous Optical Network (SONET)/Synchronous
Digital Hierarchy (SDH), are below the MPLS layer; the client-layer protocols, such as IP, FR,
and ATM, are above the MPLS layer. These protocols have their own OAM mechanisms.
Failures in the MPLS network cannot be rectified completely through the OAM mechanisms
of other layers. In addition, the network technology hierarchy requires MPLS to have its
own independent OAM mechanism to reduce dependencies between layers.


The MPLS OAM mechanism can detect, identify, and locate a defect at the MPLS layer
effectively. Then, the MPLS OAM mechanism reports and handles the defect. In addition, if a
failure occurs, the MPLS OAM mechanism triggers protection switching.
MPLS offers an OAM mechanism totally independent of any upper or lower layer. The
following OAM features are enabled on the MPLS user plane:
• Monitors link connectivity.
• Evaluates network usage and performance.
• Performs a traffic switchover if a fault occurs so that services meet service level
  agreements (SLAs).

• MPLS OAM can rapidly detect link faults or monitor the connectivity of links, which
  helps measure network performance and minimizes OPEX.
• If a link fault occurs, MPLS OAM rapidly switches traffic to the standby link to restore
  services, which shortens the defect duration and improves network reliability.

Basic Detection Functions

MPLS OAM can be used to check the connectivity of an LSP.
Figure 3-1 shows connectivity monitoring for an LSP.

Figure 3-1 Connectivity monitoring for an LSP

The working process of MPLS OAM is as follows:

1. The ingress sends a connectivity verification (CV) or fast failure detection (FFD) packet
along an LSP to be monitored. The packet passes through the LSP and arrives at the
egress.
2. The egress compares the packet type, frequency, and trail termination source identifier
(TTSI) in a received packet with the locally configured values to verify the packet. In
addition, the egress counts the correct and incorrect packets received within a
detection interval.
3. If the egress detects an LSP defect, it analyzes the defect type and sends a backward
defect indication (BDI) packet carrying defect information to the ingress along a reverse
tunnel. The ingress is then notified of the defect. If a protection group is correctly
configured, the ingress switches traffic to a backup LSP.
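
Step 2 of the working process can be sketched in Python as follows. The class names, the string form of the TTSI, and the sample values are assumptions for illustration; the packet "frequency" is modeled as a sending interval in milliseconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OamPacket:
    packet_type: str   # "CV" or "FFD"
    interval_ms: int   # sending interval carried by or implied for the packet
    ttsi: str          # trail termination source identifier (hypothetical format)


@dataclass
class EgressMonitor:
    """Compares received packets with the locally configured values and counts them."""
    expected: OamPacket
    correct: int = 0
    incorrect: int = 0

    def receive(self, packet: OamPacket) -> bool:
        ok = packet == self.expected   # type, interval, and TTSI must all match
        if ok:
            self.correct += 1
        else:
            self.incorrect += 1
        return ok


# Illustrative values only.
monitor = EgressMonitor(expected=OamPacket("FFD", 50, "1.1.1.1:100"))
monitor.receive(OamPacket("FFD", 50, "1.1.1.1:100"))   # matches the local configuration
monitor.receive(OamPacket("FFD", 50, "2.2.2.2:200"))   # TTSI of a different LSP
monitor.receive(OamPacket("CV", 1000, "1.1.1.1:100"))  # wrong packet type and interval
```

The two counters are the per-interval statistics that the egress then uses to decide whether a defect has occurred.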


Reverse Tunnel
A reverse tunnel is bound to an LSP that is monitored using MPLS OAM. The reverse tunnel
can transmit BDI packets to notify the ingress of an LSP defect.

A reverse tunnel and the LSP to which the reverse tunnel is bound must have the same
endpoints.

The reverse tunnel transmitting BDI packets can be either of the following types:

• Private reverse LSP
• Shared reverse LSP

MPLS OAM Auto Protocol

ITU-T Recommendation Y.1710 has some drawbacks, for example:

• If OAM is enabled on the ingress of an LSP later than that on the egress or if OAM is
  enabled on the egress but disabled on the ingress, the egress generates a loss of
  connectivity verification defect (dLOCV) alarm.
• Before the OAM detection packet type or the interval at which detection packets are sent
  is changed, OAM must be disabled on the ingress and egress.
• OAM parameters (such as a detection packet type and an interval at which detection
  packets are sent) must be set on both the ingress and egress, which may cause parameter
  inconsistency.
The NE40E implements the OAM auto protocol to resolve these drawbacks.

The OAM auto protocol is configured on the egress. With this protocol, the egress can
automatically start OAM functions after receiving the first OAM packet. In addition, the
egress can dynamically stop running the OAM state machine after receiving an FDI packet
sent by the ingress.
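
The egress behavior under the OAM auto protocol can be sketched in Python. The class is a hypothetical model based only on the description above and on the parameter learning described in the Auto Protocol section: the first CV or FFD packet starts OAM and supplies the packet type and interval, and an FDI packet stops the state machine.

```python
from typing import Optional


class AutoProtocolEgress:
    """Hypothetical model of an egress with the OAM auto protocol enabled."""

    def __init__(self) -> None:
        self.running = False
        self.packet_type: Optional[str] = None
        self.interval_ms: Optional[int] = None

    def on_packet(self, packet_type: str, interval_ms: Optional[int] = None) -> None:
        if packet_type == "FDI":
            # The ingress is about to change parameters: stop the OAM state machine.
            self.running = False
            self.packet_type = None
            self.interval_ms = None
        elif packet_type in ("CV", "FFD") and not self.running:
            # First OAM packet: learn the parameters and start OAM automatically.
            self.packet_type = packet_type
            self.interval_ms = interval_ms
            self.running = True


egress = AutoProtocolEgress()
egress.on_packet("CV", 1000)    # first packet starts OAM with the CV parameters
learned = (egress.running, egress.packet_type, egress.interval_ms)

egress.on_packet("FDI")         # ingress signals a parameter change
stopped = egress.running

egress.on_packet("FFD", 50)     # first packet after the change restarts OAM
relearned = (egress.running, egress.packet_type, egress.interval_ms)
```

Because the egress learns its parameters from the ingress, the parameters need to be set on the ingress only, which avoids the inconsistency listed among the drawbacks.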

3.2 Understanding MPLS OAM

3.2.1 Basic Detection

The Multiprotocol Label Switching (MPLS) operation, administration and maintenance
(OAM) mechanism effectively detects and locates MPLS link faults. The MPLS OAM
mechanism also triggers a protection switchover after detecting a fault.

Related Concepts
• MPLS OAM packets
Table 3-1 describes MPLS OAM packets.

Table 3-1 MPLS OAM packets

Packet type: Continuity check, connectivity verification (CV) packet
  Description: Sent by a local maintenance association end point (MEP) to detect exceptions.
    If the local MEP detects an exception, it sends an alarm to its client-layer MEP. For
    example, if a CV-enabled device receives a packet on an incorrect LSP, the device will
    report an alarm indicating a forwarding error to the client-layer MEP.

Packet type: Continuity check, fast failure detection (FFD) packet
  Description: Sent by a MEP to rapidly detect an LSP fault. If the MEP detects a fault, it
    sends an alarm to the client layer.
    • FFD and CV packets contain the same information and provide the same function. They
      are processed in the same way, but FFD packets are processed more quickly than CV
      packets.
    • FFD and CV cannot be started simultaneously.

Packet type: Backward defect indication (BDI) packet
  Description: Sent by the egress to notify the ingress of an LSP defect.

• Channel defects
Table 3-2 describes channel defects that MPLS OAM can detect.

Table 3-2 Channel defect detection using MPLS OAM

Defect type: MPLS layer defects
  • dLOCV: a connectivity verification loss defect.
    A dLOCV defect occurs if no CV or FFD packets are received after three consecutive
    intervals at which CV or FFD packets are sent elapse.
  • dTTSI_Mismatch: a trail termination source identifier (TTSI) mismatch defect.
    A dTTSI_Mismatch defect occurs if no CV or FFD packets with correct TTSIs are received
    after three consecutive intervals at which CV or FFD packets are sent elapse.
  • dTTSI_Mismerge: a TTSI mis-merging defect.
    A dTTSI_Mismerge defect occurs if CV or FFD packets with both correct and incorrect
    TTSIs are received within three consecutive intervals at which CV or FFD packets are sent.
  • dExcess: an excessive rate at which connectivity detection packets are received.
    A dExcess defect occurs if five or more correct CV or FFD packets are received within
    three consecutive intervals at which CV or FFD packets are sent.

Defect type: Other defects
  • Oamfail: The OAM auto protocol expires.
    An Oamfail defect occurs if the first OAM packet is not received after the auto protocol
    expires.
  • Signal deterioration (SD)
    An SD defect occurs if the packet loss ratio reaches the configured SD threshold.
  • Signal failure (SF)
    An SF defect occurs if the packet loss ratio reaches the configured SF threshold.
  • dUnknown: an unknown defect on an MPLS network.
    A test packet type or interval inconsistency occurs between the source and sink nodes.

• Reverse tunnel
A reverse tunnel is bound to an LSP that is monitored using MPLS OAM. The reverse
tunnel can transmit BDI packets to notify the ingress of an LSP defect. A reverse tunnel
and the LSP to which the reverse tunnel is bound must have the same endpoints, and
they transmit traffic in opposite directions. The reverse tunnels transmitting BDI packets
include private or shared LSPs. Table 3-3 lists the two types of reverse tunnel.

Table 3-3 MPLS OAM reverse tunnel types

Reverse tunnel type: Private reverse LSP
  Description: Bound to only one LSP. The binding between the private reverse LSP and its
    forward LSP is stable but may waste LSP resources.

Reverse tunnel type: Shared reverse LSP
  Description: Bound to many LSPs. A TTSI carried in a BDI packet identifies a specific
    forward LSP bound to a reverse LSP. The binding between a shared reverse LSP and multiple
    forward LSPs minimizes the waste of LSP resources. If defects occur on multiple LSPs bound
    to the shared reverse LSP, the reverse LSP may be congested with traffic.
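
The MPLS layer defect conditions in Table 3-2 can be expressed as a small Python classifier. The function name and the order in which the conditions are tested are assumptions; the table defines each defect separately and does not state a precedence between dTTSI_Mismerge and dExcess.

```python
from typing import Optional


def classify_mpls_defect(correct: int, incorrect_ttsi: int) -> Optional[str]:
    """Classify packets counted over three consecutive CV or FFD sending intervals.

    correct: number of CV or FFD packets received with the correct TTSI
    incorrect_ttsi: number of CV or FFD packets received with an incorrect TTSI
    """
    if correct == 0 and incorrect_ttsi == 0:
        return "dLOCV"            # no CV or FFD packets at all
    if correct == 0:
        return "dTTSI_Mismatch"   # packets arrived, but none with the correct TTSI
    if incorrect_ttsi > 0:
        return "dTTSI_Mismerge"   # both correct and incorrect TTSIs were seen
    if correct >= 5:
        return "dExcess"          # five or more correct packets in three intervals
    return None                   # no MPLS layer defect


results = {
    "silence": classify_mpls_defect(0, 0),
    "wrong_lsp": classify_mpls_defect(0, 3),
    "merged": classify_mpls_defect(3, 1),
    "too_fast": classify_mpls_defect(6, 0),
    "normal": classify_mpls_defect(3, 0),
}
```

Three correct packets in three intervals is the expected rate, which is why only a count of five or more is treated as excessive.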

MPLS OAM periodically sends CV or FFD packets to monitor TE LSPs, PWs, or ring
networks.

• MPLS OAM for TE LSPs
MPLS OAM monitors TE LSPs. If MPLS OAM detects a fault in a TE LSP, it triggers a
traffic switchover to minimize traffic loss.

Figure 3-2 MPLS OAM for a TE LSP
[Figure: ingress LSR and egress LSR.]


Figure 3-2 illustrates a network on which MPLS OAM monitors TE LSP connectivity.
The process of using MPLS OAM to monitor TE LSP connectivity is as follows:
a. The ingress sends a CV or FFD packet along a TE LSP to be monitored. The packet
passes through the TE LSP and arrives at the egress.
b. The egress compares the packet type, frequency, and TTSI in the received packet
with the locally configured values to verify the packet. In addition, the egress
collects the number of correct and incorrect packets within a detection interval.
c. If the egress detects an LSP defect, the egress analyzes the defect type and sends a
BDI packet carrying defect information to the ingress along a reverse tunnel. The
ingress can then be notified of the defect. If a protection group is configured, the
ingress switches traffic to a backup LSP.
• MPLS OAM for PWs
MPLS OAM periodically sends CV or FFD packets to monitor PW connectivity. If
MPLS OAM detects a PW defect, it sends BDI packets carrying the defect type along a
reverse tunnel and instructs a client-layer application to switch traffic from the active
link to the standby link.

Figure 3-3 MPLS OAM for a PW
[Figure: MPLS network. Legend: PW signals.]

Figure 3-3 illustrates a network on which MPLS OAM monitors PW connectivity.

a. A PW is established between PE1 and PE2, OAM parameters are set on both PEs,
and both PEs are enabled to send and receive OAM packets. OAM monitors the
PW between PE1 and PE2 and obtains PW information.
b. If OAM detects a defect, PE2 sends a BDI packet to PE1 over a reverse tunnel.
c. The PEs notify the CEs of the fault so that CE1 and CE2 can use the information to
maintain networks.
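
The egress-side check in step b of the TE LSP process (compare packet type, frequency, and
TTSI with the locally configured values, and count correct and incorrect packets per
detection interval) can be sketched as follows. This is only an illustration: the class
names, the field names, and the rule used to flag a defect at the end of an interval are
assumptions and are not taken from the NE40E implementation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class OamPacket:
    packet_type: str   # "CV" or "FFD"
    interval_ms: int   # sending interval (the "frequency" in the text)
    ttsi: str          # trail termination source identifier


class EgressChecker:
    """Counts correct and incorrect packets within one detection interval."""

    def __init__(self, expected: OamPacket) -> None:
        self.expected = expected
        self.correct = 0
        self.incorrect = 0

    def receive(self, pkt: OamPacket) -> bool:
        # A packet is correct only if type, frequency, and TTSI all match
        # the locally configured values.
        ok = pkt == self.expected
        if ok:
            self.correct += 1
        else:
            self.incorrect += 1
        return ok

    def close_interval(self) -> Tuple[int, int, bool]:
        # Hypothetical rule (an assumption): flag a defect when no correct
        # packet arrived or any incorrect packet was seen in the interval.
        defect = self.correct == 0 or self.incorrect > 0
        result = (self.correct, self.incorrect, defect)
        self.correct = self.incorrect = 0
        return result
```

When close_interval() reports a defect, the egress would go on to classify the defect and
send a BDI packet along the reverse tunnel, as described in step c.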

3.2.2 Auto Protocol

The MPLS OAM auto protocol is a Huawei proprietary protocol.

On the NE40E, the OAM auto protocol can address the following problems, which occur
because of drawbacks of ITU-T Recommendations Y.1710 and Y.1711:

l A dLOCV defect occurs if the OAM function is enabled on the ingress on an LSP later
than that on the egress or if OAM is enabled on the egress and disabled on the ingress.
l The dLOCV defect also occurs when OAM is disabled. OAM must be disabled on the
ingress and egress before the OAM detection packet type or the interval at which
detection packets are sent can be changed.
l OAM parameters, including a detection packet type and an interval at which detection
packets are sent, must be set on both the ingress and egress. This is likely to cause a
parameter inconsistency.

The OAM auto protocol enabled on the egress provides the following functions:
l Triggers OAM
– If the sink node does not support OAM CC and CC parameters (including the
detection packet type and interval at which packets are sent), upon the receipt of the
first CV or FFD packet, the sink node automatically records the packet type and
interval at which the packet is sent and uses these parameters in CC detection that

– If the OAM function-enabled sink node does not receive CV or FFD packets within
a specified period of time, the sink node generates a BDI packet and notifies the
NMS of the BDI defect.
l Dynamically stops running the OAM. If the detection packet type or interval at which
detection packets are sent is to be changed on the source node, the source node sends an
FDI packet to instruct the sink node to stop the OAM state machine. If an OAM function
is to be disabled on the source node, the source node also sends an FDI packet to instruct
the sink node to stop the OAM state machine.
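
The sink-node behaviour described above (learn the packet type and interval from the first
CV or FFD packet, and stop when an FDI packet arrives) can be sketched as a small state
machine. The class and method names are hypothetical, and the sketch ignores timers and
BDI generation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AutoSink:
    packet_type: Optional[str] = None   # learned detection packet type
    interval_ms: Optional[int] = None   # learned sending interval
    running: bool = False

    def on_packet(self, kind: str, interval_ms: int = 0) -> None:
        if kind in ("CV", "FFD"):
            # The first CV or FFD packet triggers OAM and sets the parameters.
            if not self.running:
                self.packet_type = kind
                self.interval_ms = interval_ms
                self.running = True
        elif kind == "FDI":
            # The source is changing parameters or disabling OAM:
            # stop the state machine and forget the learned values.
            self.running = False
            self.packet_type = None
            self.interval_ms = None
```

After an FDI packet, the next CV or FFD packet restarts detection with whatever type and
interval the source now uses, so the two ends do not need matching manual configuration.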

3.3 Application Scenarios for MPLS OAM

3.3.1 Application of MPLS OAM in the IP RAN Layer 2 to Edge Scenario

MPLS OAM is deployed on PEs to maintain and operate MPLS networks. Working at the
MPLS client and server layers, MPLS OAM can effectively detect, identify, and locate client
layer faults and quickly switch traffic if links or nodes become faulty, reducing network
maintenance cost.

Figure 3-4 IP RAN over MPLS in the Layer 2 to edge scenario






Figure 3-4 illustrates an IP RAN in the Layer 2 to edge scenario. The MPLS OAM
implementation is as follows:
l The BTS, NodeB, BSC, and RNC can be directly connected to an MPLS network.
l A TE tunnel between PE1 and PE4 is established. PWs are established over the TE
tunnel to transmit various services.
l MPLS OAM is enabled on PE1 and PE4. OAM parameters are configured on PE1 and
PE4 on both ends of a PW. These PEs are enabled to send and receive OAM detection
packets, which allows OAM to monitor the PW between PE1 and PE4. OAM can obtain
basic PW information. If OAM detects a defect, PE4 sends a BDI packet to PE1 over a
reverse tunnel. PEs notify the user-side BTS, NodeB, RNC, and BSC of fault
information so that the user-side devices can use the information to maintain networks.
The working principles of PE2 and PE3 are the same as those of PE1.

3.3.2 Application of MPLS OAM in VPLS Networking

Service Overview
The operation and maintenance of virtual leased line (VLL) and virtual private LAN service
(VPLS) services require an operation, administration and maintenance (OAM) mechanism.
Multiprotocol Label Switching (MPLS) OAM provides a mechanism to
rapidly detect and locate faults, which facilitates network operation and maintenance and
reduces the network maintenance costs.

Networking Description
As shown in Figure 3-5, a user-end provider edge (UPE) on the access network is dual-
homed to SPE1 and SPE2 on the aggregation network. A VLL supporting access links of
various types is deployed on the access network. A VPLS is deployed on the aggregation
network to form a point-to-multipoint leased line network. Additionally, Fast Protection
Switching (FPS) is configured on the UPE; MPLS tunnel automatic protection switching
(APS) is configured on SPE1 and SPE2 to protect the links between the virtual switching
instances (VSIs) created on the two superstratum provider edges (SPEs).

Figure 3-5 UPE dual-homing networking




Node B RNC


Feature Deployment
To deploy MPLS OAM to monitor link connectivity of VLL and VPLS pseudo wires (PWs),
configure maintenance entity groups (MEGs) and maintenance entities (MEs) on the UPE,
SPE1, and SPE2 and then enable one or more of the continuity check (CC), loss measurement
(LM), and delay measurement (DM) functions. The UPE monitors link connectivity and
performance of the primary and secondary PWs.

MPLS OAM is implemented as follows:

l When SPE1 detects a link fault on the primary PW, SPE1 sends a backward defect
indication (BDI) packet to the UPE, instructing the UPE to switch traffic from the
primary PW to the secondary PW. Meanwhile, the UPE sends a MAC Withdraw packet,
in which the value of the PE-ID field is SPE1's ID, to SPE2. After receiving the MAC
Withdraw packet, SPE2 transparently forwards the packet to the NPE and the NPE
deletes the MAC address it has learned from SPE1. After that, the NPE learns a new
MAC address from the secondary PW.

l After the primary PW recovers, the UPE switches traffic from the secondary PW back to
the primary PW. Meanwhile, the UPE sends a MAC Withdraw packet, in which the
value of the PE-ID field is SPE2's ID, to SPE1. After receiving the MAC Withdraw
packet, SPE1 transparently forwards the packet to the NPE and the NPE deletes the
MAC address it has learned from SPE2. After that, the NPE learns a new MAC address
from the new primary PW.

3.4 Terminology for MPLS OAM

Item Definition

reverse A direction opposite to the direction that traffic flows along the
monitored service link.

forward A direction that traffic flows along the monitored service link.

path merge LSR An LSR that receives the traffic transmitted on the protection path
in MPLS OAM protection switching.
If the path merge LSR is not the traffic destination, it sends and
merges the traffic transmitted on the protection path onto the
working path.
If the path merge LSR is the destination of traffic, it sends the
traffic to the upper-layer protocol for handling.

path switch LSR An LSR that switches or replicates traffic between the primary
service link and the bypass service link.

user plane A set of traffic forwarding components through which traffic flow
passes. An OAM CV or FFD packet is periodically inserted to this
traffic flow to monitor the forwarding component status. In IETF
drafts, the user plane is also called the data plane.

Ingress An LSR from which the forward LSP originates and at which the
reverse LSP terminates.

Egress An LSR at which the forward LSP terminates and from which the
reverse LSP originates.

Acronyms and Abbreviations

Acronym & Abbreviation Full Name

BDI backward defect indication

CV connectivity verification

FDI forward defect indication

FFD fast failure detection

MPLS Multiprotocol Label Switching

TTSI trail termination source identifier

LM loss measurement

OAM operation, administration and maintenance

PE provider edge router

SD Signal deterioration

SF Signal failure

4 MPLS-TP OAM


About This Chapter

4.1 Overview of MPLS-TP OAM

4.2 Understanding MPLS-TP OAM
4.3 Application Scenarios for MPLS-TP OAM
4.4 Terminology for MPLS-TP OAM

4.1 Overview of MPLS-TP OAM

Multiprotocol Label Switching Transport Profile (MPLS-TP) is a transport technique
that integrates MPLS packet switching with traditional transport network features. MPLS-TP
networks are poised to replace traditional transport networks in the future. MPLS-TP
Operation, Administration, and Maintenance (MPLS-TP OAM) works on the MPLS-TP client
layer. It can effectively detect, identify, and locate faults in the client layer and quickly switch
traffic when links or nodes become defective. OAM is an important part of any plan to reduce
network maintenance expenditures.

Both networks and services are part of an ongoing process of transformation and integration.
New services like triple play services, Next Generation Network (NGN) services, carrier
Ethernet services, and Fiber-to-the-x (FTTx) services are constantly emerging from this
process. Such services demand more investment and have higher OAM costs. They require
state of the art QoS, full service access, and high levels of expansibility, reliability, and
manageability of transport networks. Traditional transport network technologies such as
Multi-Service Transfer Platform (MSTP), Synchronous Digital Hierarchy (SDH), or
Wavelength Division Multiplexing (WDM) cannot meet these requirements because they lack
a control plane. Unlike traditional technologies, MPLS-TP does meet these requirements
because it can be used on next-generation transport networks that can process data packets, as
well as on traditional transport networks.

Because traditional transport networks and optical transport networks (OTNs) have high
reliability and maintenance benchmarks, MPLS-TP must provide powerful OAM capabilities.
MPLS-TP OAM provides the following functions:
l Fault management
l Performance monitoring
l Triggering protection switching

l MPLS-TP OAM can rapidly detect link faults or monitor the connectivity of links, which
helps measure network performance and minimizes OPEX.
l If a link fault occurs, MPLS-TP OAM rapidly switches traffic to the standby link to
restore services, which shortens the defect duration and improves network reliability.

MPLS-TP OAM Components

MPLS-TP OAM functions are implemented by maintenance entities (MEs). An ME consists
of a pair of maintenance entity group end points (MEPs) located at two ends of a link and a
group of maintenance entity group intermediate points (MIPs) between them.
MPLS-TP OAM components are described as follows:
l ME
An ME maintains a relationship between two MEPs. On a bidirectional label switched
path (LSP) that has two MEs, MPLS-TP OAM detection can be performed on the MEs
without affecting each other. One ME can be nested within another ME but cannot
overlap with another ME.
ME1 and ME2 in Figure 4-1 are used as an example:
– ME1 consists of two MEPs only.
– ME2 consists of two MEPs and two MIPs.

Figure 4-1 ME deployment on a point-to-point bidirectional LSP

Ingress LER Transit LER Transit LER Egress LER




l MEG
A maintenance entity group (MEG) comprises one or more MEs that are created for a
transport link. If the transport link is a point-to-point bidirectional path, such as a
bidirectional co-routed LSP or pseudo wire (PW), a MEG comprises only one ME.

l MEP
A MEP is the source or sink node in a MEG. Figure 4-2 shows ME node deployment.

Figure 4-2 ME node deployment

– For a bidirectional LSP, only the ingress label edge router (LER) and egress LER
can function as MEPs, as shown in Figure 4-2.
– For a PW, only user-end provider edges (UPEs) can function as MEPs.
MEPs trigger and control MPLS-TP OAM operations. OAM packets can be generated or
terminated on MEPs.

Fault Management
Table 4-1 lists the MPLS-TP OAM fault management functions supported by the NE40E.

Table 4-1 MPLS-TP OAM fault management functions

Function                          Description

Continuity check (CC)             Checks link connectivity periodically.

Connectivity verification (CV)    Detects forwarding faults continuously.

Loopback (LB)                     Performs loopback.

Remote defect indication (RDI)    Notifies remote defects.

Performance Monitoring
Table 4-2 lists the MPLS-TP OAM performance monitoring functions supported by the
NE40E.

Table 4-2 MPLS-TP OAM performance monitoring functions

Function                  Description

Loss measurement (LM)     Collects statistics about lost frames. LM includes the
                          following functions:
                          l Single-ended frame loss measurement
                          l Dual-ended frame loss measurement

Delay measurement (DM)    Collects statistics about delays and delay variations
                          (jitter). DM includes the following functions:
                          l One-way frame delay measurement
                          l Two-way frame delay measurement

4.2 Understanding MPLS-TP OAM

4.2.1 Basic Concepts

An MPLS-TP network consists of the section, LSP, and PW layers in bottom-up order. A
lower layer is a server layer, and an upper layer is a client layer. For example, the section
layer is the LSP layer's server layer, and the LSP layer is the section layer's client layer.
On the MPLS-TP network shown in Figure 4-3, MPLS-TP OAM detects and locates faults in
the section, LSP, and PW layers. Table 4-3 describes MPLS-TP OAM components.

Figure 4-3 MPLS-TP OAM application


Node B

PW layer

MEG End Point
MEG Intermediate Point

Table 4-3 MPLS-TP OAM components

Name: Maintenance entity (ME)
Description: All MPLS-TP OAM functions are performed on MEs. Each ME consists of two
maintenance entity group end points (MEPs) and the maintenance entity group intermediate
points (MIPs) on the link between the two MEPs.
Example:
l Section layer: Each pair of adjacent LSRs forms an ME.
l LSP layer: LSRs A, B, C, and D form an ME. LSRs D and E form an ME. LSRs E, F, and G
form an ME.
l PW layer: LSRs A, D, E, and G form an ME.

Name: Maintenance entity group (MEG)
Description: A MEG is comprised of one or more MEs that are created for a transport link.
MEGs for various services contain different MEs:
l A MEG for a P2P unidirectional path contains only one ME.
l A MEG for a P2P bidirectional path contains two MEs. A MEG for a P2P bidirectional
co-routed path contains a single ME.
l A MEG for a P2MP unidirectional path contains MEs destined for leaf nodes.
Example:
l Section layer: Each ME forms a MEG.
l LSP layer: Each ME forms a MEG.
l PW layer: Each ME forms a MEG.
If two tunnels in opposite directions between LSR A and LSR D are established, a single
MEG consisting of two MEs is established.

Name: MEG end point (MEP)
Description: A MEP is the source or sink node in a MEG.
Example:
l Section layer: Each LSR can function as a MEP. Each LSR functions as a MEP.
l LSP layer: Only an LER can function as a MEP. LSRs A, D, E, and G are LERs
functioning as MEPs.
l PW layer: Only PW terminating provider edge (T-PE) LSRs can function as MEPs. LSRs A
and G are T-PEs functioning as MEPs.

Name: MEG intermediate point (MIP)
Description: Intermediate nodes between two MEPs on both ends of a MEG. MIPs only
respond to OAM packets sent by MEPs and do not take the initiative in OAM packet
exchanges.
Example:
l Section layer: No MIPs.
l LSP layer: LSRs B, C, and F function as MIPs.
l PW layer: LSRs D and E function as MIPs.

Usage Scenario
MPLS-TP OAM monitors the following types of links:
l Static bidirectional co-routed CR-LSPs
l Static VLL PWs and VPLS PWs

4.2.2 Continuity Check and Connectivity Verification

Continuity check (CC) and connectivity verification (CV) are both MPLS-TP functions. CC is
used to check for loss of continuity defects (dLOC) between two MEPs in a MEG. CV monitors
connectivity between two MEPs within one MEG or in different MEGs.

CC is a proactive OAM operation. It detects LOC faults between any two MEPs in a MEG. A
MEP sends CC messages (CCMs) to a remote MEP (RMEP) at specified intervals. If the RMEP
does not receive a CCM for a period 3.5 times the specified interval, it considers
the connection between the two MEPs faulty. This causes the RMEP to report an alarm and
enter the Down state, and the RMEP triggers automatic protection switching (APS) on both
MEPs. After receiving a CCM from the MEP, the RMEP will clear the alarm and exit the
Down state.
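
The CC timeout rule (no CCM for 3.5 times the sending interval) can be sketched as a
simple monitor on the receiving MEP. The class, its fields, and the explicit poll() call
are assumptions made for illustration; alarm reporting and APS triggering are reduced to
a single Down flag.

```python
from dataclasses import dataclass

LOC_MULTIPLIER = 3.5  # CCMs must be missing for 3.5 intervals


@dataclass
class RmepMonitor:
    interval_s: float       # configured CCM sending interval, in seconds
    last_ccm_s: float       # time at which the last CCM was received
    down: bool = False

    def on_ccm(self, now_s: float) -> None:
        # Receiving a CCM clears the alarm and leaves the Down state.
        self.last_ccm_s = now_s
        self.down = False

    def poll(self, now_s: float) -> bool:
        # Declare loss of continuity once the silence exceeds 3.5 intervals.
        if now_s - self.last_ccm_s > LOC_MULTIPLIER * self.interval_s:
            self.down = True
        return self.down
```

With a 1-second interval, a monitor that last saw a CCM at t = 0 is still up at t = 3.0
and goes Down at any poll after t = 3.5.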

CV is also a proactive OAM operation. It enables a MEP to report alarms when unexpected or
error packets are received. For example, if a CV-enabled MEP receives a packet from an LSP
and finds that this packet has been transmitted in error along an LSP, the MEP will report an
alarm indicating a forwarding error.

4.2.3 Packet Loss Measurement

Packet loss measurement (LM), a performance monitoring function provided by MPLS-TP, is
implemented on the two ends of a PW, LSP, or MPLS section to collect statistics about
dropped packets. Packet loss measurement results contain near- and far-end packet loss
values:
l Near-end packet loss value: the number of dropped packets expected to arrive at the local
MEP.
l Far-end packet loss value: the number of dropped packets the local MEP has sent.
To collect packet loss statistics for both incoming and outgoing packets, each MEP must have
both of the following counters enabled:
l TxFCl: records the number of packets sent to the RMEP.
l RxFCl: records the number of packets received by the local MEP.
Packet loss measurement can be performed in either single- or dual-ended mode. Table 4-4
describes the single- and dual-ended packet loss measurement.

Table 4-4 Packet loss measurement functions

Function: Dual-ended packet loss measurement
Description: Collects packet loss statistics to assess the quality of the link between two
MEPs that have connectivity fault management (CFM) continuity check (CC) enabled.
Usage Scenario: Dual-ended packet loss measurement provides more accurate results than
the single-ended method. The interval between dual-ended packet loss measurements varies
with the interval between CCM transmissions. The CCM transmission interval is shorter
than the interval between LMM transmissions. Therefore, the dual-ended method allows for
a shorter measurement interval than the single-ended method.

Function: Single-ended packet loss measurement
Description: Collects packet loss statistics to assess the quality of the link between two
MEPs. This method is independent of CC.
Usage Scenario: Sending CCMs imposes a heavier burden on the network than sending LMMs
and LMRs. To minimize the burden, single-ended packet loss measurement can be used.

Dual-ended Packet Loss Measurement

Figure 4-4 illustrates proactive dual-ended packet loss measurement. Dual-ended packet loss
measurement can only be performed in proactive mode. Two MEPs on both ends of a link
periodically exchange CCMs carrying the following information:

l TxFCf: the local TxFCl value recorded when the local MEP sent a CCM.
l RxFCb: the local RxFCl value recorded when the local MEP received a CCM.
l TxFCb: the TxFCf value carried in the most recently received CCM, that is, the peer's
TxFCl value recorded when the peer sent that CCM.

Figure 4-4 Proactive dual-ended packet loss measurement



Dual-end LM CCM TxFCf RxFCb TxFCb


After receiving CCMs carrying packet count information, both MEPs use the following
formulas to measure near- and far-end packet loss values:

Near-end packet loss value = |TxFCf[tc] - TxFCf[tp]| - |RxFCl[tc] - RxFCl[tp]|

Far-end packet loss value = |TxFCb[tc] - TxFCb[tp]| - |RxFCb[tc] - RxFCb[tp]|

l TxFCf[tc], RxFCb[tc], and TxFCb[tc] are the TxFCf, RxFCb, and TxFCb values,
respectively, which are carried in the most recently received CCM. RxFCl[tc] is the local
RxFCl value recorded when the local MEP received the CCM.
l TxFCf[tp], RxFCb[tp], and TxFCb[tp] are the TxFCf, RxFCb, and TxFCb values,
respectively, which are carried in the previously received CCM. RxFCl[tp] is the local
RxFCl value recorded when the local MEP received the previous CCM.
l tc is the time a current CCM was received.
l tp is the time the previous CCM was received.
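
As a worked illustration of the dual-ended calculation, the sketch below takes the counter
values sampled from two consecutive CCMs and returns the near- and far-end loss. The
near-end difference is taken between TxFCf[tc] and TxFCf[tp], the two values defined in
the list above. The type and function names are hypothetical, and counter wrap-around is
not handled.

```python
from typing import NamedTuple, Tuple


class CcmSample(NamedTuple):
    tx_fcf: int   # TxFCf carried in the received CCM
    rx_fcb: int   # RxFCb carried in the received CCM
    tx_fcb: int   # TxFCb carried in the received CCM
    rx_fcl: int   # local RxFCl sampled when the CCM was received


def dual_ended_loss(prev: CcmSample, cur: CcmSample) -> Tuple[int, int]:
    """Return (near_end_loss, far_end_loss) between times tp and tc."""
    near = abs(cur.tx_fcf - prev.tx_fcf) - abs(cur.rx_fcl - prev.rx_fcl)
    far = abs(cur.tx_fcb - prev.tx_fcb) - abs(cur.rx_fcb - prev.rx_fcb)
    return near, far


# The peer sent 50 packets and 48 arrived locally (near-end loss 2);
# the local MEP sent 60 packets and the peer received 50 (far-end loss 10).
previous = CcmSample(tx_fcf=100, rx_fcb=90, tx_fcb=200, rx_fcl=95)
current = CcmSample(tx_fcf=150, rx_fcb=140, tx_fcb=260, rx_fcl=143)
print(dual_ended_loss(previous, current))  # (2, 10)
```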

Single-ended Packet Loss Measurement

Single-ended packet loss measurement is performed in either proactive or on-demand mode.
In proactive mode, a local MEP periodically sends loss measurement messages (LMMs) to an
RMEP carrying the following information:
l TxFCf: the local TxFCl value recorded when the LMM was sent.

After receiving an LMM, the RMEP responds to the local MEP with loss measurement replies
(LMRs) carrying the following information:
l TxFCf: equal to the TxFCf value carried in the LMM.
l RxFCf: the local RxFCl value recorded when the LMM was received.
l TxFCb: the local TxFCl value recorded when the LMR was sent.

Figure 4-5 illustrates proactive single-ended packet loss measurement.

Figure 4-5 Proactive single-ended packet loss measurement



Single-end LM


After receiving an LMR, the local MEP uses the following formulas to calculate near- and far-
end packet loss values:

Near-end packet loss value = |TxFCb[tc] - TxFCb[tp]| - |RxFCl[tc] - RxFCl[tp]|

Far-end packet loss value = |TxFCf[tc] - TxFCf[tp]| - |RxFCf[tc] - RxFCf[tp]|

l TxFCf[tc], RxFCf[tc], and TxFCb[tc] are the TxFCf, RxFCf, and TxFCb values,
respectively, which are carried in the most recently received LMR. RxFCl[tc] is the local
RxFCl value recorded when the most recent LMR arrives at the local MEP.
l TxFCf[tp], RxFCf[tp], and TxFCb[tp] are the TxFCf, RxFCf, and TxFCb values,
respectively, which are carried in the previously received LMR. RxFCl[tp] is the local
RxFCl value recorded when the previous LMR arrived at the local MEP.
l tc is the time a current LMR was received.
l tp is the time the previous LMR was received.
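
The single-ended formulas can be applied to a series of LMRs to obtain one loss figure per
measurement interval. The following sketch assumes that each LMR has already been parsed
into a tuple of counters; the names are hypothetical and counter wrap-around is ignored.

```python
from typing import List, NamedTuple, Tuple


class LmrSample(NamedTuple):
    tx_fcf: int   # TxFCf copied from the LMM into the LMR
    rx_fcf: int   # RxFCf: the RMEP's RxFCl when it received the LMM
    tx_fcb: int   # TxFCb: the RMEP's TxFCl when it sent the LMR
    rx_fcl: int   # local RxFCl sampled when the LMR arrived


def loss_series(samples: List[LmrSample]) -> List[Tuple[int, int]]:
    """Return (near_end_loss, far_end_loss) for each pair of consecutive LMRs."""
    results = []
    for prev, cur in zip(samples, samples[1:]):
        near = abs(cur.tx_fcb - prev.tx_fcb) - abs(cur.rx_fcl - prev.rx_fcl)
        far = abs(cur.tx_fcf - prev.tx_fcf) - abs(cur.rx_fcf - prev.rx_fcf)
        results.append((near, far))
    return results
```

For example, the samples (0, 0, 0, 0), (100, 98, 80, 80), and (200, 195, 160, 157) give a
near-end loss of 0 and then 3, and a far-end loss of 2 and then 3.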

4.2.4 Frame Delay Measurement

Frame delay measurement (DM), a performance monitoring function provided by MPLS-TP,
calculates the delay time on links. Frame delay measurement is performed in either proactive
or on-demand mode. The on-demand mode is used by default. Delay information can be used
to calculate the delay variation.

The link delay time can be measured using either one- or two-way frame delay measurement.
Table 4-5 describes these frame delay measurement functions.

Table 4-5 Frame delay measurement functions

Function: One-way frame delay measurement
Description: Measures the network delay time on a unidirectional link between MEPs.
Usage Scenario: One-way frame delay measurement can be used only on a unidirectional
link. A MEP and its RMEP on both ends of the link must have synchronous time.

Function: Two-way frame delay measurement
Description: Measures the network delay time on a bidirectional link between MEPs.
Usage Scenario: Two-way frame delay measurement can be used on a bidirectional link
between a local MEP and its RMEP. The local MEP does not need to synchronize its time
with its RMEP.

One-Way Frame Delay Measurement

Figure 4-6 illustrates one-way frame delay measurement. A local MEP periodically sends its
RMEP one-way delay measurement (1DM) messages carrying TxTimeStampf (the time when
a 1DM was sent).

Figure 4-6 One-way frame delay measurement



One-way DM 1DM TxTimeStampf


After the RMEP receives a 1DM, it subtracts the TxTimeStampf value from the RxTimef
value to calculate the delay time:

Frame delay time = RxTimef - TxTimeStampf

The frame delay value can be used to measure the delay variation that is the absolute
difference between two delay time values.

One-way frame delay measurement can only be performed when the two MEPs on both ends
of a link have synchronous time. If these MEPs have asynchronous time, they can only
measure the delay variation.
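
The following sketch shows why unsynchronized MEPs can still measure delay variation: a
constant clock offset distorts each one-way delay value but cancels out in the difference
between two of them. The function names and the microsecond timestamps are illustrative
assumptions.

```python
def one_way_delay(tx_timestamp_f: int, rx_time_f: int) -> int:
    """Frame delay time = RxTimef - TxTimeStampf (same unit as the inputs)."""
    return rx_time_f - tx_timestamp_f


def delay_variation(delay_a: int, delay_b: int) -> int:
    """Absolute difference between two delay time values."""
    return abs(delay_a - delay_b)


# Two 1DM frames with true delays of 250 us and 300 us, timestamped by an
# RMEP whose clock runs 5000 us ahead of the sender's clock.
offset = 5000
d1 = one_way_delay(1000, 1000 + 250 + offset)   # 5250: wrong absolute delay
d2 = one_way_delay(2000, 2000 + 300 + offset)   # 5300: wrong absolute delay
print(delay_variation(d1, d2))                  # 50: the offset cancels out
```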

Two-Way Frame Delay Measurement

Two-way frame delay measurement is performed by end-to-end (E2E) MEPs. A MEP periodically
sends a delay measurement message (DMM) carrying TxTimeStampf (the time when the DMM was
sent). After receiving the DMM, the RMEP responds with a delay measurement reply (DMR).
This message carries RxTimeStampf (the time when the DMM was received) and TxTimeStampb
(the time when the DMR was sent). The value in every field of the DMM is copied exactly to
the DMR, with the exception that the source and destination MAC addresses are interchanged.

Figure 4-7 Two-way frame delay measurement



DMM TxTimeStampf
Two-way DM
DMR TxTimeStampb


Upon receipt of the DMR, the local MEP calculates the two-way frame delay time using the
following formula:

Frame delay = RxTimeb (the time the DMR was received) - TxTimeStampf

To obtain a more accurate result, RxTimeStampf and TxTimeStampb are used.

RxTimeStampf indicates the time a DMM is received, and TxTimeStampb indicates the time
a DMR is sent. After the local MEP receives the DMR, it calculates the frame delay time
using the following formula:

Frame delay = (RxTimeb - TxTimeStampf) - (TxTimeStampb - RxTimeStampf)

Two-way frame delay measurement supports both delay and delay variation measurement
even if these MEPs do not have synchronous time. The frame delay time is the round-trip
delay time. If both MEPs have synchronous time, the round-trip delay time can be calculated
by combining the two delay values using the following formulas:
l MEP-to-RMEP delay time = RxTimeStampf - TxTimeStampf
l RMEP-to-MEP delay time = RxTimeb - TxTimeStampb
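
A short numeric sketch of the two-way formula follows. It subtracts the RMEP's processing
time from the raw round-trip time, so the result does not depend on any clock offset
between the two MEPs. The function name and the sample timestamps are assumptions used
only for illustration.

```python
def two_way_delay(tx_timestamp_f: int, rx_timestamp_f: int,
                  tx_timestamp_b: int, rx_time_b: int) -> int:
    """(RxTimeb - TxTimeStampf) - (TxTimeStampb - RxTimeStampf)."""
    return (rx_time_b - tx_timestamp_f) - (tx_timestamp_b - rx_timestamp_f)


# 300 us in each direction, 100 us processing time on the RMEP, and an RMEP
# clock that runs 5000 us ahead of the local MEP's clock.
tx_f = 1000                 # local clock: DMM sent
rx_f = 1000 + 300 + 5000    # RMEP clock: DMM received
tx_b = rx_f + 100           # RMEP clock: DMR sent
rx_b = 1000 + 300 + 100 + 300   # local clock: DMR received

print(rx_b - tx_f)                            # 700: raw round trip
print(two_way_delay(tx_f, rx_f, tx_b, rx_b))  # 600: processing time removed
```

With synchronized clocks (offset 0), the per-direction values RxTimeStampf - TxTimeStampf
and RxTimeb - TxTimeStampb are each 300 in this example and add up to the same 600.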

4.2.5 Remote Defect Indication

Remote defect indication (RDI) enables a maintenance entity group end point (MEP) to send
continuity check messages (CCMs), each carrying an RDI flag, to notify a remote MEP
(RMEP) of faults.

The RDI implementation is as follows:

l After a local MEP detects a link fault using the continuity check (CC) function, the local
MEP sets the RDI flag to 1 in CCMs and sends the CCMs along a reverse path to notify
its RMEP of the fault.
l After the fault is rectified, the local MEP sets the RDI flag to 0 in CCMs and sends them
to inform the RMEP that the fault is rectified.

l The RDI function is associated with the proactive continuity check function and takes effect only after the
continuity check function is enabled.
l The RDI function applies only to bidirectional links. In the case of a unidirectional LSP, before RDI can
be used, a reverse path must be bound to the LSP.
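
The RDI behaviour can be sketched as a flag that a MEP derives from its own CC state each
time it builds a CCM. The class, its fields, and the dictionary used to stand in for a CCM
are hypothetical; real CCMs carry the flag in a packet header field.

```python
from dataclasses import dataclass


@dataclass
class Mep:
    cc_enabled: bool = True      # proactive continuity check is enabled
    loc_detected: bool = False   # set by CC when a link fault is detected

    def build_ccm(self) -> dict:
        # RDI takes effect only when the continuity check function is enabled.
        rdi = 1 if (self.cc_enabled and self.loc_detected) else 0
        return {"type": "CCM", "rdi": rdi}
```

A MEP that detects a fault sends CCMs with the flag set to 1 and returns to 0 once the
fault is rectified, which is how the RMEP learns of both events.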

4.2.6 Loopback

On a Multiprotocol Label Switching Transport Profile (MPLS-TP) network, a virtual circuit
may traverse multiple exchanging devices (nodes), including maintenance entity group end
points (MEPs) and maintenance entity group intermediate points (MIPs). Any node or
link fault in a virtual circuit may lead to the unavailability of the entire virtual circuit.
Moreover, the fault cannot be located. Loopback (LB) can be configured on a source device
(MEP) to detect or locate faults in links between the MEP and a MIP or between MEPs.

Related Concepts
LB and continuity check (CC) are both connectivity monitoring tools on an MPLS-TP
network. Table 4-6 describes differences between CC and LB.

Table 4-6 Differences between CC and LB

Function: CC
Description: CC is a proactive OAM operation. It detects LOC faults between any two MEPs
in a MEG.
Usage Scenario: To only monitor the connectivity of a link between two MEPs or associate
APS, choose CC.

Function: LB
Description: LB is an on-demand OAM operation. It monitors the connectivity of
bidirectional links between a MEP and a MIP and between MEPs.
Usage Scenario: To monitor the bidirectional connectivity of a link between a MEP and a
MIP or a link between two MEPs and not to associate APS, choose LB.

The loopback function monitors the connectivity of bidirectional links between a MEP and a
MIP and between MEPs.

The loopback test process is as follows:

1. The source MEP sends a loopback message (LBM) to a destination. If a MIP is used as
the destination, the TTL in the LBM must be equal to the number of hops from the
source to the destination, and the MIP checks whether the target MIP ID carried in the
LBM is the same as its own MIP ID. If a MEP is used as the destination, the TTL must be
greater than or equal to the number of hops to the destination. The TTL setting prevents
the LBM from being discarded before reaching the destination.

2. After the destination receives the LBM, it checks whether the target MIP ID or MEP ID
matches the local MIP ID or MEP ID. If they do not match, the destination discards the
LBM. If they match, the destination responds with a loopback reply (LBR).
3. If the source MEP receives the LBR within a specified period of time, it considers the
destination reachable and the loopback test successful. If the source MEP does not
receive the LBR after the specified period of time elapses, it records a loopback test
timeout and log information that is used to analyze the connectivity failure.

Figure 4-8 Loopback test




Figure 4-8 illustrates a loopback test. LSRA initiates a loopback test to LSRC on an LSP. The
loopback test process is as follows:

1. LSRA sends LSRC an LBM carrying a specified TTL and a MIP ID. LSRB
transparently transmits the LBM to LSRC.
2. Upon receipt, LSRC determines that the TTL carried in the LBM has expired and checks
whether the target MIP ID carried in the LBM matches the local MIP ID. If they do not
match, LSRC discards the LBM. If they match, LSRC responds with an LBR.
3. If LSRA receives the LBR within a specified period of time, it considers LSRC
reachable. If LSRA fails to receive the LBR after a specified period of time elapses,
LSRA considers LSRC unreachable and records log information that is used to analyze
the connectivity failure.
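
The TTL rule and the ID check used in the loopback test can be sketched as two small
functions. The names are hypothetical and the sketch leaves out timers and packet
encoding; it only captures the decisions described in the steps above.

```python
from typing import Optional


def ttl_valid(target_kind: str, ttl: int, hops: int) -> bool:
    """Check the TTL chosen by the source MEP for an LBM."""
    if target_kind == "MIP":
        # The TTL must expire exactly at the target MIP.
        return ttl == hops
    if target_kind == "MEP":
        # Any TTL that lets the LBM reach the MEP is acceptable.
        return ttl >= hops
    raise ValueError("target_kind must be 'MIP' or 'MEP'")


def handle_lbm(local_id: str, target_id: str) -> Optional[str]:
    """Return 'LBR' if the LBM targets this node; None means discard."""
    return "LBR" if local_id == target_id else None
```

In the Figure 4-8 example, LSRC is two hops from LSRA, so an LBM targeting LSRC as a MIP
needs a TTL of 2, and LSRC replies only if the target ID in the LBM is its own.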

4.3 Application Scenarios for MPLS-TP OAM

4.3.1 Application of MPLS-TP OAM in the IP RAN Layer 2 to Edge Scenario
MPLS-TP OAM is deployed on PEs to maintain and operate MPLS networks. Working at the
MPLS client and server layers, MPLS-TP OAM can effectively detect, identify, and locate
client layer faults and quickly switch traffic if links or nodes become faulty, reducing network
maintenance cost.

Figure 4-9 IP RAN over MPLS-TP in the Layer 2 to edge scenario

In the Layer 2 to edge scenario on an IP RAN shown in Figure 4-9, mature PWE3 techniques are
used to carry services. Services are transmitted between a BTS/NodeB and an RNC/BSC as
follows:
• The BTS, NodeB, BSC, and RNC can be directly connected to an MPLS-TP network.
• A TE tunnel is established between PE1 and PE4. PWs are established over the TE
  tunnel to transmit various services.
• MPLS-TP OAM is enabled on PE1 and PE4, and OAM parameters are configured on these
  PEs, which are the two ends of a PW. The PEs are enabled to send and receive OAM
  detection packets, which allows OAM to monitor the PW between PE1 and PE4 and to
  obtain basic PW information. If OAM detects a defect, PE4 sends an RDI packet to
  PE1 over a reverse tunnel. The PEs notify the user-side BTS, NodeB, RNC, and BSC of fault
  information so that the user-side devices can use the information to maintain networks.

4.3.2 Application of MPLS-TP OAM in VPLS Networking

Service Overview
The operation and maintenance of virtual leased line (VLL) and virtual private LAN service
(VPLS) services require an operation, administration and maintenance (OAM) mechanism.
Multiprotocol Label Switching Transport Profile (MPLS-TP) OAM provides a mechanism to
rapidly detect and locate faults, which facilitates network operation and maintenance and
reduces the network maintenance costs.

Networking Description
As shown in Figure 4-10, a user-end provider edge (UPE) on the access network is dual-
homed to SPE1 and SPE2 on the aggregation network. A VLL supporting access links of
various types is deployed on the access network. A VPLS is deployed on the aggregation
network to form a point-to-multipoint leased line network. Additionally, Fast Protection
Switching (FPS) is configured on the UPE; MPLS tunnel automatic protection switching
(APS) is configured on SPE1 and SPE2 to protect the links between the virtual switching
instances (VSIs) created on the two superstratum provider edges (SPEs).


Figure 4-10 UPE dual-homing networking

Feature Deployment
To deploy MPLS-TP OAM to monitor link connectivity of VLL and VPLS pseudo wires
(PWs), configure maintenance entity groups (MEGs) and maintenance entities (MEs) on the
UPE, SPE1, and SPE2, and then enable the continuity check (CC) function, the loopback (LB)
function, or both. The UPE monitors link connectivity and performance of the primary
and secondary PWs.

MPLS-TP OAM is implemented as follows:

• When SPE1 detects a link fault on the primary PW, SPE1 sends a Remote Defect
  Indication (RDI) packet to the UPE, instructing the UPE to switch traffic from the
  primary PW to the secondary PW. Meanwhile, the UPE sends a MAC Withdraw packet,
  in which the value of the PE-ID field is SPE1's ID, to SPE2. After receiving the MAC
  Withdraw packet, SPE2 transparently forwards the packet to the NPE and the NPE
  deletes the MAC address it has learned from SPE1. After that, the NPE learns a new
  MAC address from the secondary PW.
• After the primary PW recovers, the UPE switches traffic from the secondary PW back to
  the primary PW. Meanwhile, the UPE sends a MAC Withdraw packet, in which the
  value of the PE-ID field is SPE2's ID, to SPE1. After receiving the MAC Withdraw
  packet, SPE1 transparently forwards the packet to the NPE and the NPE deletes the
  MAC address it has learned from SPE2. After that, the NPE learns a new MAC address
  from the new primary PW.

4.4 Terminology for MPLS-TP OAM


Abbreviation    Full Name
AIS             Alarm Indication Signal
CC              Continuity Check
CSF             Client Signal Failure
CV              Connectivity Verification
DM              Delay Measurement
LB              Loopback
LCK             Locked Signal
LM              Loss Measurement
LSP             Label Switched Path
LSR             Label Switching Router
LT              Linktrace
MEP             Maintenance association End Point
MIP             Maintenance association Intermediate Point
MPLS-TP         Multiprotocol Label Switching Transport Profile
OAM             Operation, Administration and Maintenance
PE              Provider Edge Router
PW              Pseudo Wire
RDI             Remote Defect Indication
SPE             Superstratum PE
TST             Test
UPE             Underlayer PE

5 VRRP


About This Chapter

5.1 Overview of VRRP

5.2 Understanding VRRP
5.3 Application Scenarios for VRRP
5.4 Terminology for VRRP

5.1 Overview of VRRP

The Virtual Router Redundancy Protocol (VRRP) is a fault-tolerant protocol that groups
several routers into a virtual router. If the next hop of a host fails, VRRP switches traffic to
another router, which ensures communication continuity and reliability.


In this document, if a VRRP function supports both IPv4 and IPv6, the implementation of this VRRP
function is the same for IPv4 and IPv6 unless otherwise specified.

VRRP is a fault-tolerant protocol defined in relevant standards. VRRP allows logical devices
to work separately from physical devices and implements route selection among multiple
egress gateways.

On the network shown in Figure 5-1, VRRP is enabled on two Routers. One is the master and
the other is the backup. The two Routers form a virtual router, which is assigned a virtual
IP address and a virtual MAC address. Hosts are aware only of the virtual router and
communicate with devices on other network segments through it.

A virtual router consists of a master Router and one or more backup Routers. Only the master
Router forwards packets. If the master Router fails, a backup Router is elected as the master
Router and takes over.


Figure 5-1 Schematic diagram for a VRRP backup group


On a multicast or broadcast LAN (for example, an Ethernet), VRRP uses a logical VRRP
gateway to ensure reliability for key links. VRRP prevents service interruptions if a physical
VRRP gateway fails, providing high reliability. VRRP is simple to configure and takes effect
without requiring changes to other configurations, such as routing protocol configurations.

As networks rapidly develop and applications become diversified, various value-added
services, such as Internet Protocol television (IPTV) and video conferencing, have become
widespread. Demands for network infrastructure reliability are increasing, especially in
nonstop network transmission.
Generally, hosts use one default gateway to communicate with external networks. If the
default gateway fails, communication between the hosts and external networks is interrupted.
System reliability can be improved using dynamic routing protocols (such as RIP and OSPF)
or ICMP Router Discovery Protocol (IRDP). However, this method requires complex
configurations and each host must support dynamic routing protocols.
VRRP resolves this issue by enabling several routers to be grouped into a virtual router, also
called a VRRP backup group. In normal circumstances, the master router in the VRRP backup
group functions as a default gateway and provides access services for users. If the master
router fails, VRRP elects a backup router from the VRRP backup group to provide access
services for users.
Hosts on a local area network (LAN) are usually connected to an external network through a
default gateway. When the hosts send packets destined for addresses out of the local network
segment, these packets follow a default route to an egress gateway. A provider edge (PE)
functions as an egress gateway on the network shown in Figure 5-2. The PE forwards packets
to the external network so that the hosts can communicate with the external network.


Figure 5-2 Default gateway on a LAN



If the PE fails, the hosts connected to it cannot communicate with the external network. The
communication failure persists even if another Router is added to the LAN. This is because
most hosts on a LAN can be configured with only a single default gateway, which forwards all
data packets destined for devices that are not on the local network segment. Hosts send
packets only through the default gateway even though they are connected to multiple Routers.

Configuring multiple egress gateways is a common method to prevent communication
interruptions. This method is available only if one of the routes to these egress gateways can
be selected. Another method is to use dynamic routing protocols, such as the Routing
Information Protocol (RIP) and Open Shortest Path First (OSPF), or the ICMP Router
Discovery Protocol (IRDP). This method is available only if every host runs a dynamic routing
protocol and there is no problem in management, security, or operating systems' support for
protocols.

VRRP prevents communication failures in a better way than the preceding two methods.
VRRP is configured only on Routers to implement gateway backup, without any networking
changes or burden on hosts.

VRRP offers the following benefits to carriers:

• Reliable transmission: A logical VRRP gateway on a multicast or broadcast local area
  network (LAN), such as an Ethernet network, ensures reliable transmission over key
  links. VRRP helps prevent service interruptions if a link to a physical VRRP gateway
  fails.
• Flexible applications: A VRRP header is encapsulated into an IP packet. This
  implementation allows the association between VRRP and various upper-layer protocols.
• Low network overheads: VRRP uses only VRRP Advertisement packets.

VRRP offers the following benefits to users:

• Simplified configurations: Users only need to specify a gateway address without
  configuring routing protocols on their hosts.
• Improved user experience: Users are not aware of a single point of failure.


Basic VRRP Functions

VRRP supports two modes: master/backup mode and load balancing mode.

Figure 5-3 shows the master/backup mode.

Figure 5-3 Master/Backup mode

For the master/backup mode:

• A single VRRP backup group is configured and consists of a master device and several
  backup devices.
• The Router with the highest priority functions as the master device and transmits service
  traffic.
• Other Routers function as backup devices and monitor the master Router's status. If the
  master Router fails, the backup Router with the highest priority switches to the Master
  state.

Figure 5-4 shows the load balancing mode.

Figure 5-4 Load balancing mode

Multiple VRRP backup groups can be configured to implement load balancing. A single
Router can be a member of multiple backup groups. On the network shown in Figure 5-4, the
VRRP backup groups work in load balancing mode.
• PE1 is the master device in VRRP backup group 1 and the backup device in VRRP
  backup group 2.
• PE2 is the master device in VRRP backup group 2 and the backup device in VRRP
  backup group 1.
• In normal circumstances, different Routers process different user groups' traffic to
  implement load balancing.

VRRP load balancing is classified as multi-gateway or single-gateway load balancing. For details about
VRRP load balancing, see the chapter "VRRP" in HUAWEI NE40E-M2 Series Universal Service Router
Feature Description - Network Reliability.

5.2 Understanding VRRP

5.2.1 Basic VRRP Concepts

As shown in Figure 5-5, two gateways are grouped to form a virtual gateway, and the user
host uses the virtual gateway's IP address as the default gateway IP address to communicate
with the external network. If the default gateway fails, VRRP elects a new gateway to provide
access services for the user.

Figure 5-5 VRRP networking


Basic VRRP concepts are described as follows:

• Virtual router: also called a VRRP backup group. It consists of a master router and one or
  more backup routers. A virtual router is a default gateway used by hosts within a shared
  local area network (LAN). A virtual router ID and one or more virtual IP addresses
  together identify a virtual router.
  – Virtual router ID (VRID): ID of a virtual router. Routers with the same VRID form
    a virtual router.
  – Virtual IP address: IP address of a virtual router. A virtual router can have one or
    more virtual IP addresses, which are manually assigned.
  – Virtual MAC address: MAC address generated by a virtual router based on a VRID.
    A virtual router has one virtual MAC address, in the format of
    00-00-5E-00-01-{VRID} (VRRP for IPv4) or 00-00-5E-00-02-{VRID} (VRRP for IPv6).
    After a virtual router receives an ARP (VRRP for IPv4) or NS (VRRP for IPv6) request,
    it responds to the request with the virtual MAC address rather than the actual MAC
    address.
• IP address owner: VRRP router that uses the virtual IP address as its interface IP address.
  If an IP address owner is available, it functions as the master router.
• Primary IP address: IP address selected from actual interface IP addresses, which is
  usually the first IP address that is configured. The primary IP address is used as the
  source IP address in a VRRP Advertisement packet.
• VRRP router: device running VRRP. A VRRP router can join one or more VRRP backup
  groups. A VRRP backup group consists of the following VRRP routers:
  – Master router: forwards packets and responds to ARP requests.
  – Backup router: does not forward packets when the master router is working
    properly, but can be elected as the new master router if the master router fails.
• Priority: priority of a router in a VRRP backup group. A VRRP backup group elects the
  master and backup routers based on router priorities.
• VRRP working modes:
  – Preemption mode: A backup router with a higher priority than the master router
    preempts the Master state.
  – Non-preemption mode: When the master router is working properly, a backup
    router does not preempt the Master state even if it has a priority higher than the
    master router.
• VRRP timers:
  – Adver_Interval timer: The master router sends a VRRP Advertisement packet each
    time the Adver_Interval timer expires. The default timer value is 1 second.
  – Master_Down timer: A backup router preempts the Master state after the
    Master_Down timer expires. The Master_Down timer value (in seconds) is
    calculated using the following equations:
    Master_Down timer value = (3 x Adver_Interval timer value) + Skew_Time
    Skew_Time = (256 - Backup router's priority)/256
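
As a worked illustration of the virtual MAC address format and the timer equations above, the following Python sketch computes both. The function names are hypothetical, and the VRID range check (1 to 255) is an assumption based on the VRID occupying a single byte of the MAC address.

```python
def virtual_mac(vrid: int, ipv6: bool = False) -> str:
    """Build the virtual MAC address 00-00-5E-00-01-{VRID} (IPv4) or 00-00-5E-00-02-{VRID} (IPv6)."""
    if not 1 <= vrid <= 255:  # assumed range: the VRID fills one byte
        raise ValueError("VRID must be in the range 1-255")
    family = 0x02 if ipv6 else 0x01
    return f"00-00-5E-00-{family:02X}-{vrid:02X}"


def skew_time(priority: int) -> float:
    """Skew_Time = (256 - Backup router's priority)/256, in seconds."""
    return (256 - priority) / 256


def master_down_timer(adver_interval: float, priority: int) -> float:
    """Master_Down timer value = (3 x Adver_Interval timer value) + Skew_Time."""
    return 3 * adver_interval + skew_time(priority)


print(virtual_mac(1))                # 00-00-5E-00-01-01
print(virtual_mac(10, ipv6=True))    # 00-00-5E-00-02-0A
print(master_down_timer(1.0, 100))   # 3.609375
print(master_down_timer(1.0, 120))   # 3.53125
```

With the default Adver_Interval of 1 second, a backup router with priority 120 times out slightly earlier than one with priority 100, so the higher-priority backup router takes over first.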

5.2.2 VRRP Packets

VRRP packets are used to advertise the priority and status of the master router to all backup
routers in the same VRRP backup group as the master. A VRRP packet is a multicast packet
that can be forwarded only within a single broadcast domain, such as a virtual local area
network (VLAN) or a virtual switching instance (VSI).
VRRP versions include VRRPv2 and VRRPv3. VRRPv2 applies only to IPv4 networks, and
VRRPv3 applies to both IPv4 and IPv6 networks. VRRP is classified as VRRP for IPv4
(VRRP4) or VRRP for IPv6 (VRRP6) by network type. VRRP4 supports both VRRPv2 and
VRRPv3, and VRRP6 supports only VRRPv3.


For an IPv4 network, VRRP packets are encapsulated into IPv4 packets and sent to the IPv4
multicast address assigned to VRRP4 backup groups. In an IPv4 packet header:
• The source address is the primary IPv4 address of the interface that sends the packet.
• The destination address is 224.0.0.18.
• The time to live (TTL) value is 255.
• The protocol number is 112.

For an IPv6 network, VRRP packets are encapsulated into IPv6 packets and sent to the IPv6
multicast address assigned to VRRP6 backup groups. In an IPv6 packet header:
• The source address is the link-local address of the interface that sends the packet.
• The destination address is FF02::12.
• The hop limit is 255.
• The next header value (protocol number) is 112.
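
The header values listed above lend themselves to a simple sanity check on received packets. The sketch below is a minimal illustration, not a description of how the NE40E validates packets. The function name is hypothetical, and the IPv4 group address 224.0.0.18 is taken from the VRRP standards (RFC 3768 and RFC 5798).

```python
import ipaddress

VRRP_PROTOCOL = 112   # IPv4 protocol number / IPv6 next header value
VRRP_TTL = 255        # IPv4 TTL / IPv6 hop limit
VRRP_GROUP_V4 = ipaddress.ip_address("224.0.0.18")  # per RFC 3768 / RFC 5798
VRRP_GROUP_V6 = ipaddress.ip_address("ff02::12")


def looks_like_vrrp(dst: str, ttl_or_hop_limit: int, protocol: int) -> bool:
    """Check the IP header fields that every VRRP Advertisement packet carries."""
    addr = ipaddress.ip_address(dst)
    group = VRRP_GROUP_V4 if addr.version == 4 else VRRP_GROUP_V6
    return (
        addr == group
        and ttl_or_hop_limit == VRRP_TTL
        and protocol == VRRP_PROTOCOL
    )


print(looks_like_vrrp("224.0.0.18", 255, 112))  # True
print(looks_like_vrrp("FF02::12", 255, 112))    # True
print(looks_like_vrrp("224.0.0.18", 254, 112))  # False: the packet crossed a router
```

A TTL or hop limit lower than 255 means the packet has been forwarded by at least one router, which is why a fixed value of 255 keeps VRRP packets confined to the local link.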

NE40E allows you to manually switch a VRRP version. VRRP packets refer to VRRPv2 packets, unless
otherwise specified in this document.

VRRP Packet Structure

Figure 5-6 and Figure 5-7 show the VRRPv2 and VRRPv3 packet structures, respectively.

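Figure 5-6 VRRPv2 packet structure

 0      3 4      7 8             15 16            23 24              31
+--------+--------+----------------+----------------+------------------+
| Version|  Type  | Virtual Rtr ID |    Priority    | Count IPv4 Addrs |
+--------+--------+----------------+----------------+------------------+
|    Auth Type    |   Adver Int    |             Checksum              |
+-----------------+----------------+-----------------------------------+
|                           IPv4 Address (1)                           |
+----------------------------------------------------------------------+
|                                 ...                                  |
+----------------------------------------------------------------------+
|                           IPv4 Address (n)                           |
+----------------------------------------------------------------------+
|                       Authentication Data (1)                        |
+----------------------------------------------------------------------+
|                       Authentication Data (2)                        |
+----------------------------------------------------------------------+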

Table 5-1 describes the fields in a VRRPv2 packet.

Table 5-1 Fields in a VRRPv2 packet

Field                 Description
Version               Version number of the VRRP protocol. The value is 2.
Type                  Type of the VRRPv2 packet. The value is 1, indicating that the packet is
                      an advertisement packet.
Virtual Rtr ID        Virtual router identifier.
Priority              Priority of the master router in a VRRP backup group.
Count IPv4 Addrs      Number of virtual IPv4 addresses configured for a VRRP backup group.
Auth Type             VRRPv2 packet authentication type. VRRPv2 defines the following
                      authentication types:
                      • 0: Non Authentication, indicating that authentication is not
                        performed.
                      • 1: Simple Text Password, indicating that simple authentication is
                        performed.
                      • 2: IP Authentication Header, indicating that MD5 authentication is
                        performed.
Adver Int             Interval at which VRRPv2 packets are sent, in seconds.
Checksum              16-bit checksum, used to check the data integrity of the VRRPv2 packet.
IPv4 Address          Virtual IPv4 address configured for a VRRP backup group.
Authentication Data   Authentication key in the VRRPv2 packet. This field applies only when
                      simple or MD5 authentication is used. For other authentication types,
                      this field is fixed to 0.
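
To make the field layout concrete, the following Python sketch decodes a VRRPv2 packet according to Figure 5-6 and Table 5-1. It is a minimal illustration under stated assumptions: the function name and the returned dictionary keys are hypothetical, the sample bytes are made up, and the checksum is extracted but not verified.

```python
import ipaddress
import struct


def parse_vrrpv2(data: bytes) -> dict:
    """Decode the VRRPv2 fields described in Table 5-1 (checksum not verified)."""
    if len(data) < 8:
        raise ValueError("packet shorter than the fixed VRRPv2 header")
    # Network byte order: six 1-byte fields followed by the 16-bit checksum.
    ver_type, vrid, priority, count, auth_type, adver_int, checksum = struct.unpack(
        "!BBBBBBH", data[:8]
    )
    version, pkt_type = ver_type >> 4, ver_type & 0x0F
    if version != 2 or pkt_type != 1:
        raise ValueError("not a VRRPv2 advertisement packet")
    end = 8 + 4 * count  # the virtual IPv4 addresses follow the fixed header
    if len(data) < end + 8:
        raise ValueError("packet truncated")
    addresses = [str(ipaddress.IPv4Address(data[i:i + 4])) for i in range(8, end, 4)]
    return {
        "vrid": vrid,
        "priority": priority,
        "auth_type": auth_type,
        "adver_int_seconds": adver_int,
        "checksum": checksum,
        "addresses": addresses,
        "auth_data": data[end:end + 8],  # Authentication Data (1) and (2)
    }


# Made-up sample: VRID 1, priority 120, one virtual address, no authentication.
sample = bytes([0x21, 1, 120, 1, 0, 1]) + b"\x00\x00" + bytes([10, 1, 1, 100]) + bytes(8)
info = parse_vrrpv2(sample)
print(info["vrid"], info["priority"], info["addresses"])  # 1 120 ['10.1.1.100']
```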

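Figure 5-7 VRRPv3 packet structure

 0      3 4      7 8             15 16            23 24              31
+--------+--------+----------------+----------------+------------------+
| Version|  Type  | Virtual Rtr ID |    Priority    | Count IPvX Addrs |
+--------+--------+----------------+----------------+------------------+
|  rsvd  |        Adver Int        |             Checksum              |
+--------+-------------------------+-----------------------------------+
|                           IPvX Address (1)                           |
+----------------------------------------------------------------------+
|                                 ...                                  |
+----------------------------------------------------------------------+
|                           IPvX Address (n)                           |
+----------------------------------------------------------------------+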

Table 5-2 describes the fields in a VRRPv3 packet.


Table 5-2 Fields in a VRRPv3 packet

Field                 Description
Version               Version number of the VRRP protocol. The value is 3.
Type                  Type of the VRRPv3 packet. The value is 1, indicating that the packet is
                      an advertisement packet.
Virtual Rtr ID        Virtual router identifier.
Priority              Priority of the master router in a VRRP backup group.
Count IPvX Addrs      Number of virtual IPvX addresses configured for a VRRP backup group.
rsvd                  Field reserved for the VRRPv3 packet. The value must be set to 0.
Adver Int             Interval at which VRRPv3 packets are sent, in centiseconds.
Checksum              16-bit checksum, used to check the data integrity of the VRRPv3 packet.
IPvX Address          Virtual IPvX address configured for a VRRP backup group.

As shown in Figure 5-6 and Figure 5-7, the main differences between VRRPv2 and VRRPv3
are as follows:
• VRRPv2 supports authentication, whereas VRRPv3 does not.
• VRRPv2 supports a second-level interval between sending VRRP Advertisement
  packets, whereas VRRPv3 supports a centisecond-level interval.
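
The difference in interval granularity can be illustrated by decoding the second 32-bit word of a VRRPv3 packet. In this sketch the function name is hypothetical, and the split of that word into a 4-bit rsvd field and a 12-bit Adver Int field follows RFC 5798, which is an assumption about the layout shown in Figure 5-7.

```python
import struct


def v3_adver_interval_seconds(second_word: bytes) -> float:
    """Return the VRRPv3 advertisement interval in seconds.

    Assumed layout (RFC 5798): rsvd (4 bits) | Adver Int (12 bits) | Checksum (16 bits).
    """
    rsvd_and_interval, _checksum = struct.unpack("!HH", second_word)
    rsvd = rsvd_and_interval >> 12
    centiseconds = rsvd_and_interval & 0x0FFF
    if rsvd != 0:
        raise ValueError("rsvd must be set to 0")
    return centiseconds / 100


print(v3_adver_interval_seconds(struct.pack("!HH", 100, 0)))  # 1.0
print(v3_adver_interval_seconds(struct.pack("!HH", 25, 0)))   # 0.25
```

A value of 25 centiseconds has no VRRPv2 equivalent, because the VRRPv2 Adver Int field counts whole seconds.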

5.2.3 VRRP Operating Principles

VRRP State Machine
VRRP defines three states: Initialize, Master, and Backup. Only a router in the Master state is
allowed to forward packets sent to a virtual IP address.
Figure 5-8 shows the transition process of the VRRP states.


Figure 5-8 Transition process of the VRRP states

State transitions labeled in the figure:
• Initialize -> Master: A Startup event is received and the priority is 255.
• Initialize -> Backup: A Startup event is received and the priority is lower than 255.
• Master -> Backup: The priority carried in the received packet is higher than the local
  priority, or the priority carried in the received packet is equal to the local priority but
  the IP address in the packet is greater than the local IP address.
• Backup -> Master: The Master_Down timer expires.

Table 5-3 describes the VRRP states.

Table 5-3 VRRP states

State: Initialize
  Description:
    A VRRP router is unavailable and does not process VRRP Advertisement packets.
    A router enters the Initialize state when it starts or detects a fault.
  Transition:
    After a router receives a Startup event, it changes its status as follows:
    • Changes from Initialize to Master if the router is an IP address owner with a
      priority of 255.
    • Changes from Initialize to Backup if the router has a priority less than 255.

State: Master
  Description:
    A router in the Master state provides the following functions:
    • Sends a VRRP Advertisement packet each time the Adver_Interval timer expires.
    • Responds to an ARP request with an ARP reply carrying the virtual MAC address.
    • Forwards IP packets sent to the virtual MAC address.
    • Allows ping to a virtual IP address by default.
  Transition:
    The master router changes its status as follows:
    • Changes from Master to Backup if the VRRP priority in a received VRRP
      Advertisement packet is higher than the local VRRP priority.
    • Remains in the Master state if the VRRP priority in a received VRRP
      Advertisement packet is the same as the local VRRP priority.
    • Changes from Master to Initialize after it receives a Shutdown event, indicating
      that the VRRP-enabled interface has been shut down.
    If devices in a VRRP backup group are in the Master state and a device receives a
    VRRP Advertisement packet with the same priority as the local VRRP priority, the
    device compares the IP address in the packet with the local IP address. If the IP
    address in the packet is greater than the local IP address, the device switches to the
    Backup state. If the IP address in the packet is less than or equal to the local IP
    address, the device remains in the Master state.

State: Backup
  Description:
    A router in the Backup state provides the following functions:
    • Receives VRRP Advertisement packets from the master router and checks whether
      the master router is working properly based on information in the packets.
    • Does not respond to an ARP request carrying a virtual IP address.
    • Discards IP packets sent to the virtual MAC address.
    • Discards IP packets sent to virtual IP addresses.
    • If, in preemption mode, it receives a VRRP Advertisement packet carrying a VRRP
      priority lower than the local VRRP priority, it preempts the Master state after a
      specified preemption delay.
    • If, in non-preemption mode, it receives a VRRP Advertisement packet carrying a
      VRRP priority lower than the local VRRP priority, it remains in the Backup state.
    • Resets the Master_Down timer but does not compare IP addresses if it receives a
      VRRP Advertisement packet carrying a VRRP priority higher than or equal to the
      local VRRP priority.
  Transition:
    A backup router changes its status as follows:
    • Changes from Backup to Master after it receives a Master_Down timer timeout
      event.
    • Changes from Backup to Initialize after it receives a Shutdown event, indicating
      that the VRRP-enabled interface has been shut down.
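
The transitions in Table 5-3 can be summarized as a small state machine. The following Python sketch is illustrative only: the function names are hypothetical, the preemption delay and the timers are not modeled, and the handling of Advertisement packets with a priority of 0 is left out.

```python
import ipaddress
from enum import Enum


class State(Enum):
    INITIALIZE = "Initialize"
    MASTER = "Master"
    BACKUP = "Backup"


def on_startup(priority: int) -> State:
    """Initialize -> Master for an IP address owner (priority 255), else -> Backup."""
    return State.MASTER if priority == 255 else State.BACKUP


def master_on_advert(local_prio: int, local_ip: str, adv_prio: int, adv_ip: str) -> State:
    """Next state of a master router that receives a VRRP Advertisement packet."""
    if adv_prio > local_prio:
        return State.BACKUP
    # Equal priorities: the router with the greater IP address keeps the Master state.
    if adv_prio == local_prio and ipaddress.ip_address(adv_ip) > ipaddress.ip_address(local_ip):
        return State.BACKUP
    return State.MASTER


def backup_on_advert(local_prio: int, adv_prio: int, preempt: bool) -> State:
    """Next state of a backup router; the preemption delay is not modeled here."""
    if adv_prio < local_prio and preempt:
        return State.MASTER
    return State.BACKUP  # the Master_Down timer would be reset in this branch


def on_master_down_timeout() -> State:
    return State.MASTER


def on_shutdown() -> State:
    return State.INITIALIZE


print(on_startup(255))                                      # State.MASTER
print(master_on_advert(100, "10.1.1.1", 100, "10.1.1.2"))   # State.BACKUP
print(backup_on_advert(120, 110, preempt=False))            # State.BACKUP
```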

VRRP Implementation Process

The VRRP implementation process is as follows:
1. VRRP elects the master router from a VRRP backup group based on router priorities.
Once elected, the master router sends a gratuitous ARP packet carrying the virtual MAC
address to its connected device or host to start forwarding traffic.
2. The master router periodically sends VRRP Advertisement packets to all backup routers
in the VRRP backup group to advertise its configurations (such as the priority) and
operating status.
3. If the master router fails, VRRP elects a new master router from the VRRP backup group
based on router priorities.
4. The new master router immediately sends a gratuitous ARP packet carrying the virtual
MAC address and virtual IP address to update MAC entries on its connected device or
host. After the update is complete, user traffic is switched to the new master router. The
switching process is transparent to users.
5. If the original master router recovers and its priority is 255, it immediately switches to
the Master state. If the original master router recovers and its priority is lower than 255,
it switches to the Backup state and restores the previously configured priority.
6. If a backup router's priority is higher than the master router's priority, VRRP determines
whether to reelect a new master router, depending on the backup router's working mode
(preemption or non-preemption).

To ensure that the master and backup routers work properly, VRRP must implement the
following functions:
• Master router election
  VRRP determines the master or backup role of each router in a VRRP backup group
  based on router priorities. VRRP selects the router with the highest priority as the master
  router.
  If routers in the Initialize state receive a Startup event and their priorities are lower than
  255, they switch to the Backup state. The router whose Master_Down timer first expires
  switches to the Master state. The router then sends a VRRP Advertisement packet to
  other routers in the VRRP backup group to obtain their priorities.
  – If a router finds that the VRRP Advertisement packet carries a priority higher than
    or equal to its priority, this router remains in the Backup state.
  – If a router finds that the VRRP Advertisement packet carries a priority lower than
    its priority, the router may switch to the Master state or remain in the Backup state,
    depending on its working mode. If the router is working in preemption mode, it
    switches to the Master state; if the router is working in non-preemption mode, it
    remains in the Backup state.

  • If multiple VRRP routers enter the Master state at the same time, they exchange VRRP
    Advertisement packets to determine the master or backup role. The VRRP router with the
    highest priority remains in the Master state, and VRRP routers with lower priorities switch
    to the Backup state. If these routers have the same priority, the router whose VRRP-enabled
    interface has the largest primary IP address becomes the master router.
  • If a VRRP router is the IP address owner, it immediately switches to the Master state after
    receiving a Startup event.
• Master router status advertisement
  The master router periodically sends VRRP Advertisement packets to all backup routers
  in the VRRP backup group to advertise its configurations (such as the priority) and
  operating status. The backup routers determine whether the master router is operating
  properly based on received VRRP Advertisement packets.
  – If the master router gives up the master role (for example, the master router leaves
    the VRRP backup group), it sends VRRP Advertisement packets carrying a priority
    of 0 to the backup routers. Rather than waiting for the Master_Down timer to
    expire, the backup router with the highest priority switches to the Master state after
    a specified switching time. This switching time is called Skew_Time, in seconds.
    The Skew_Time is calculated using the following equation:
    Skew_Time = (256 - Backup router's priority)/256
  – If the master router fails and cannot send VRRP Advertisement packets, the backup
    routers cannot immediately detect the master router's operating status. In this
    situation, the backup router with the highest priority switches to the Master state
    after the Master_Down timer expires. The Master_Down timer value (in seconds) is
    calculated using the following equation:
    Master_Down timer value = (3 x Adver_Interval timer value) + Skew_Time

If network congestion occurs, a backup router may not receive VRRP Advertisement packets from the
master router. If this situation occurs, the backup router proactively switches to the Master state. If the
new master router receives a VRRP Advertisement packet from the original master router, the new
master router will switch back to the Backup state. As a result, the routers in the VRRP backup group
frequently switch between Master and Backup. You can configure a preemption delay to resolve this
issue. After the configuration is complete, the backup router with the highest priority switches to the
Master state only when all of the following conditions are met:
• The Master_Down timer expires.
• The configured preemption delay elapses.
• The backup router does not receive VRRP Advertisement packets.
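
The election rule and the three takeover conditions described above can be sketched as follows. This is a simplified illustration with hypothetical names (Router, elect_master, may_take_over) and made-up addresses. It models only the comparison logic, not the exchange of VRRP Advertisement packets.

```python
import ipaddress
from typing import List, NamedTuple


class Router(NamedTuple):
    name: str
    priority: int
    primary_ip: str  # primary IP address of the VRRP-enabled interface


def elect_master(routers: List[Router]) -> Router:
    """Highest priority wins; equal priorities are resolved by the largest primary IP address."""
    return max(routers, key=lambda r: (r.priority, ipaddress.ip_address(r.primary_ip)))


def may_take_over(master_down_expired: bool, delay_elapsed: bool, advert_received: bool) -> bool:
    """With a preemption delay configured, all three conditions must hold."""
    return master_down_expired and delay_elapsed and not advert_received


group = [
    Router("DeviceA", 100, "10.1.1.1"),
    Router("DeviceB", 100, "10.1.1.2"),
    Router("DeviceC", 90, "10.1.1.3"),
]
print(elect_master(group).name)           # DeviceB
print(may_take_over(True, True, False))   # True
print(may_take_over(True, True, True))    # False
```

In the sample group, Device A and Device B share the highest priority, so the larger primary IP address decides the election. The last call shows that a late Advertisement packet from the original master router cancels the takeover, which is how the preemption delay avoids frequent switching during congestion.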

VRRP Authentication
VRRP supports different authentication modes and keys in VRRP Advertisement packets to
meet various network security requirements.
• On secure networks, you can use the non-authentication mode. In this mode, a device
  does not add authentication information to VRRP Advertisement packets before sending
  them. A peer device that receives VRRP Advertisement packets does not authenticate
  them and considers them authentic and valid.
• On insecure networks, you can use the simple or message digest algorithm 5 (MD5)
  authentication mode.
  – Simple authentication: Before a device sends a VRRP Advertisement packet, it adds
    an authentication mode and key to the packet. After a peer device receives the
    packet, the peer device checks whether the authentication mode and key carried in
    the packet are the same as the locally configured ones. If they are the same, the peer
    device considers the packet valid. If they are different, the peer device considers the
    packet invalid and discards it.
  – MD5 authentication: A device uses the MD5 algorithm to encrypt the locally
    configured authentication key and saves the encrypted authentication key in the
    Authentication Data field. After receiving a VRRP Advertisement packet, the
    device uses the MD5 algorithm to encrypt the authentication key carried in the
    packet and checks packet validity by comparing the encrypted authentication key
    saved in the Authentication Data field with the encrypted authentication key carried
    in the VRRP Advertisement packet.

• Only VRRPv2 supports authentication.
• MD5 authentication is more secure than simple authentication.
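
The receive-side check for the non-authentication and simple authentication modes can be sketched as follows. This is a hedged illustration, not the device implementation. The function name is hypothetical, the key values are made up, and the assumption that a simple key is zero-padded to the 8-byte Authentication Data field comes from the VRRPv2 packet layout rather than from this document. MD5 authentication is deliberately left out.

```python
import hmac

AUTH_NONE = 0    # Auth Type 0: Non Authentication
AUTH_SIMPLE = 1  # Auth Type 1: Simple Text Password


def accept_advert(local_type: int, local_key: bytes, pkt_type: int, pkt_auth_data: bytes) -> bool:
    """Decide whether a received VRRPv2 Advertisement packet passes authentication."""
    if pkt_type != local_type:
        return False  # authentication modes differ: the packet is discarded
    if local_type == AUTH_NONE:
        return True   # packets are considered authentic and valid
    if local_type == AUTH_SIMPLE:
        # Assumption: the key is zero-padded to the 8-byte Authentication Data field.
        expected = local_key.ljust(8, b"\x00")[:8]
        return hmac.compare_digest(expected, pkt_auth_data)
    raise NotImplementedError("MD5 authentication is not modeled in this sketch")


print(accept_advert(AUTH_SIMPLE, b"key123", AUTH_SIMPLE, b"key123\x00\x00"))  # True
print(accept_advert(AUTH_SIMPLE, b"key123", AUTH_SIMPLE, b"other\x00\x00\x00"))  # False
print(accept_advert(AUTH_SIMPLE, b"key123", AUTH_NONE, bytes(8)))  # False
```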

5.2.4 Basic VRRP Functions

VRRP works in either master/backup mode or load balancing mode.

Master/Backup Mode
A VRRP backup group comprises a master router and one or more backup routers. As shown
in Figure 5-9, Device A is the master router and forwards packets, and Device B and Device
C are backup routers and monitor Device A's status. If Device A fails, Device B or Device C
is elected as a new master router and takes over services from Device A.

Figure 5-9 Master/Backup mode
[Figure not reproduced. Legend: data flow, VRRP packet, ARP packet.]

VRRP device configurations in master/backup mode are as follows:

l Device A is the master. It supports delayed preemption and its VRRP priority is set to
120.
l Device B is a backup. It supports immediate preemption and its VRRP priority is set to
a value higher than that of Device C and lower than that of Device A.
l Device C is a backup. It supports immediate preemption and its VRRP priority is the
default value 100.

VRRP in master/backup mode is implemented as follows:

1. When Device A functions properly, user traffic travels along the path Device E ->
Device A -> Device D. Device A periodically sends VRRP Advertisement packets to
notify Device B and Device C of its status.
2. If Device A fails, its VRRP functions are unavailable. Because Device B has a higher
priority than Device C, Device B switches to the Master state and Device C remains in
the Backup state. User traffic switches to the new path Device E -> Device B -> Device D.
3. After Device A recovers, it enters the Backup state (its priority remains 120). After
receiving a VRRP Advertisement packet from Device B, the current master, Device A
finds that its priority is higher than that of Device B. Therefore, Device A preempts the
Master state after the preemption delay elapses, and sends VRRP Advertisement packets
and gratuitous ARP packets.
After receiving a VRRP Advertisement packet from Device A, Device B finds that its
priority is lower than that of Device A and changes from the Master state to the Backup
state. User traffic then switches to the original path Device E -> Device A -> Device D.
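
The election outcome in the three steps above can be sketched as follows. This is a simplified illustration, not the protocol state machine: it ignores the preemption delay and tie-breaking, the Router class is hypothetical, and the priority used for Device B (110) is an assumed value that only needs to lie between 100 and 120.

```python
from dataclasses import dataclass


@dataclass
class Router:
    name: str
    priority: int
    alive: bool = True


def elect_master(routers):
    """Return the name of the live router with the highest VRRP priority."""
    candidates = [r for r in routers if r.alive]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.priority).name


a = Router("DeviceA", 120)
b = Router("DeviceB", 110)   # assumed value, see the lead-in
c = Router("DeviceC", 100)   # default priority
group = [a, b, c]

normal = elect_master(group)          # step 1: Device A forwards traffic
a.alive = False
after_failure = elect_master(group)   # step 2: Device B takes over
a.alive = True
after_recovery = elect_master(group)  # step 3: Device A preempts the Master state
```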

Load Balancing Mode

VRRP backup groups work together to load-balance traffic. The implementation principles
and packet negotiation mechanism of the load balancing mode are the same as those of the
master/backup mode. The difference between the two modes is that in load balancing mode,
two or more VRRP backup groups are established, and each VRRP backup group can contain
a different master router. A VRRP device can join multiple VRRP backup groups and have a
different priority in each group.
VRRP load balancing is classified into the following types:
l Multi-gateway load balancing: Multiple VRRP backup groups with virtual IP addresses
are created and specified as gateways for different users to implement load balancing.
Figure 5-10 illustrates multi-gateway load balancing.

Figure 5-10 Multi-gateway load balancing
[Figure not reproduced. Device A is the master in VRID 1 and the backup in VRID 2. Device B is the master in VRID 2 and the backup in VRID 1. Legend: data flow 1, data flow 2.]

As shown in Figure 5-10, VRRP backup groups 1 and 2 are deployed on the network.
– VRRP backup group 1: Device A is the master router, and Device B is the backup
router.
– VRRP backup group 2: Device B is the master router, and Device A is the backup
router.
VRRP backup groups 1 and 2 back up each other and serve as gateways for different
users, thereby load-balancing service traffic.
l Single-gateway load balancing: A load-balance redundancy group (LBRG) with a virtual
IP address is created, and VRRP backup groups without virtual IP addresses are added to
the LBRG. The LBRG is specified as a gateway to implement load balancing for all users.
Single-gateway load balancing, an enhancement to multi-gateway load balancing,
simplifies user-side configurations and facilitates network maintenance.
Figure 5-11 shows single-gateway load balancing.

Figure 5-11 Single-gateway load balancing
[Figure not reproduced. Device A is the master in the load-balance VRRP group (VRID 1) and the backup in the member VRRP group (VRID 2). Device B is the master in VRID 2 and the backup in VRID 1. Legend: data flow 1, data flow 2.]

As shown in Figure 5-11, VRRP backup groups 1 and 2 are deployed on the network.
– VRRP backup group 1: an LBRG. Device A is the master router, and Device B is
the backup router.
– VRRP backup group 2: an LBRG member group. Device B is the master router, and
Device A is the backup router.
VRRP backup group 1 serves as a gateway for all users. After receiving an ARP request
packet from a user, VRRP backup group 1 returns an ARP response packet and
encapsulates its virtual MAC address or VRRP backup group 2's virtual MAC address in
the response.
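
The ARP behavior of the LBRG can be sketched as follows. The virtual MAC address format 00-00-5E-00-01-{VRID} is the standard IPv4 VRRP format. The Lbrg class is hypothetical, and the round-robin choice between virtual MAC addresses is an assumption made for illustration, because this section does not state how the address is selected.

```python
from itertools import cycle


def virtual_mac(vrid: int) -> str:
    """Return the IPv4 VRRP virtual MAC address for a virtual router ID."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1-255")
    return f"00-00-5e-00-01-{vrid:02x}"


class Lbrg:
    """LBRG that answers ARP requests on behalf of itself and its member groups."""

    def __init__(self, lbrg_vrid: int, member_vrids: list[int]):
        # Round-robin selection is an assumption made for this sketch.
        self._macs = cycle([virtual_mac(v) for v in [lbrg_vrid, *member_vrids]])

    def arp_reply(self) -> str:
        """Return the virtual MAC address placed in the next ARP response."""
        return next(self._macs)


lbrg = Lbrg(1, [2])                        # backup group 1 is the LBRG, group 2 a member
replies = [lbrg.arp_reply() for _ in range(4)]
```

With this selection rule, successive users learn the virtual MAC addresses of backup groups 1 and 2 alternately, so their traffic is forwarded by different master devices even though all users are configured with one gateway address.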

5.2.5 mVRRP

A switch is dual-homed to two Routers at the aggregation layer on a metropolitan area
network (MAN). Multiple VRRP backup groups can be configured on the two Routers to
transmit various types of services. Because each VRRP backup group must maintain its own
state machine, a large number of VRRP Advertisement packets are transmitted between the
two Routers.

To help reduce bandwidth and CPU resource consumption during VRRP packet transmission,
a VRRP backup group can be configured as a management Virtual Router Redundancy
Protocol (mVRRP) backup group. Other VRRP backup groups are bound to the mVRRP
backup group and become service VRRP backup groups. Only the mVRRP backup group
sends VRRP packets to negotiate the master/backup status. The mVRRP backup group
determines the master/backup status of service VRRP backup groups.


As shown in Figure 5-12, an mVRRP backup group can be deployed on the same side as
service VRRP backup groups or on the interfaces that directly connect Device A and Device B.

Figure 5-12 Typical mVRRP networking
[Figure not reproduced.]

Related Concepts
mVRRP backup group: has all functions of a common VRRP backup group. Unlike a
common VRRP backup group, an mVRRP backup group can be tracked by service VRRP
backup groups and determine their statuses. An mVRRP backup group provides the following
functions:
l When the mVRRP backup group functions as a gateway, it determines the master/backup
status of devices and transmits services. In this situation, a common VRRP backup group
with the same ID as the mVRRP backup group must be created and assigned a virtual IP
address. The mVRRP backup group's virtual IP address is a gateway IP address set by
users.
l When the mVRRP backup group does not function as a gateway, it determines the
master/backup status of devices but does not transmit services. In this situation, the
mVRRP backup group does not require a virtual IP address. You can create an mVRRP
backup group directly on interfaces to simplify maintenance.
Service VRRP backup group: After common VRRP backup groups are bound to an mVRRP
backup group, they become service VRRP backup groups. Service VRRP backup groups do
not need to send VRRP packets to determine their states. The mVRRP backup group sends
VRRP packets to determine its state and the states of all its bound service VRRP backup
groups. A service VRRP backup group can be bound to an mVRRP backup group in either of
the following modes:

l Flowdown: The flowdown mode applies to networks on which both upstream and
downstream packets are transmitted over the same path. If the master device in an
mVRRP backup group enters the Backup or Initialize state, the VRRP module instructs
all service VRRP backup groups that are bound to the mVRRP backup group in
flowdown mode to enter the Initialize state.
l Unflowdown: The unflowdown mode applies to networks on which upstream and
downstream packets can be transmitted over different paths. If the mVRRP backup
group enters the Backup or Initialize state, the VRRP module instructs all service VRRP
backup groups that are bound to the mVRRP backup group in unflowdown mode to
enter the same state.

Multiple service VRRP backup groups can be bound to an mVRRP backup group. However, an
mVRRP backup group cannot function as a service VRRP backup group and be bound to another
mVRRP backup group.
If a physical interface on which a service VRRP backup group is configured goes Down, the status of
the service VRRP backup group becomes Initialize, irrespective of the status of the mVRRP backup
group.
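
The two binding modes and the interface rule above can be sketched as follows. The function name and the string values are hypothetical, and the sketch assumes that a service VRRP backup group is Master whenever its mVRRP backup group is Master.

```python
def service_group_state(mvrrp_state: str, mode: str, interface_up: bool = True) -> str:
    """Derive a service VRRP backup group's state from its mVRRP backup group."""
    if not interface_up:
        return "Initialize"     # a local interface fault overrides the mVRRP state
    if mvrrp_state == "Master":
        return "Master"         # assumption: service groups follow a Master mVRRP group
    if mode == "flowdown":
        return "Initialize"     # mVRRP group is Backup or Initialize
    if mode == "unflowdown":
        return mvrrp_state      # enter the same state as the mVRRP group
    raise ValueError(f"unknown binding mode: {mode}")
```

For example, service_group_state("Backup", "flowdown") returns "Initialize", whereas service_group_state("Backup", "unflowdown") returns "Backup".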

mVRRP offers the following benefits:

l Simplified management. An mVRRP backup group determines the master/backup status
of service VRRP backup groups.
l Reduced CPU and bandwidth resource consumption. Service VRRP backup groups do
not need to send VRRP packets.

5.2.6 Association Between VRRP and a VRRP-disabled Interface

Virtual Router Redundancy Protocol (VRRP) can monitor status changes only on the
VRRP-enabled interface of the master device. If a VRRP-disabled interface on the master
device or the uplink connecting the interface to a network fails, VRRP cannot detect the fault,
which causes traffic interruptions.

To resolve this issue, configure VRRP to monitor the VRRP-disabled interface status. If a
VRRP-disabled interface on the master device or the uplink connecting the interface to a
network fails, VRRP instructs the master device to reduce its priority to trigger a master/
backup VRRP switchover.

Related Concepts
If a VRRP-disabled interface of a VRRP device goes Down, the VRRP device changes its
VRRP priority in either of the following modes:
l Increased mode: The VRRP device increases its VRRP priority by a specified value.
l Reduced mode: The VRRP device reduces its VRRP priority by a specified value.

As shown in Figure 5-13, a VRRP backup group is configured on Device A and Device B.
Device A is the master device, and Device B is the backup device.

Device A is configured to monitor interface 1. If interface 1 fails, Device A reduces its VRRP
priority and sends a VRRP Advertisement packet carrying the reduced priority. After Device B
receives the packet, it determines that its VRRP priority is higher than the received priority and
preempts the Master state.

After interface 1 goes Up, Device A restores the VRRP priority. After Device A receives a
VRRP Advertisement packet carrying Device B's priority in preemption mode, Device A
determines that its VRRP priority is higher than the received priority and preempts the Master
state.

Figure 5-13 Association between VRRP and a VRRP-disabled interface
[Figure not reproduced. Two panels: Device A as the master while Interface 1 is Up, and Device A as the backup after Interface 1 goes Down. Legend: data flow, interface in the Up state, interface in the Down state.]

The association between VRRP and a VRRP-disabled interface helps trigger a master/backup
VRRP switchover if the VRRP-disabled interface fails or the uplink connecting the interface
to a network fails.
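
The priority adjustment can be sketched as follows. The function, its default change value, and the priorities in the example are hypothetical. The clamp to the range 1 to 254 is an assumption based on the priority range that can be configured in VRRP.

```python
def effective_priority(configured: int, interface_up: bool,
                       mode: str = "reduced", value: int = 10) -> int:
    """Return the VRRP priority after interface tracking is applied."""
    if interface_up:
        return configured              # tracked interface is Up: no change
    if mode == "reduced":
        changed = configured - value
    elif mode == "increased":
        changed = configured + value
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Keep the result inside the configurable range (assumed clamp).
    return max(1, min(254, changed))


# Hypothetical values: Device A is configured with 120, Device B with 100.
a_after_fault = effective_priority(120, interface_up=False, mode="reduced", value=30)
b_preempts = 100 > a_after_fault
```

With these values, Device A advertises a lower priority than Device B after interface 1 fails, so Device B preempts the Master state. When interface 1 goes Up again, the function returns the configured priority and Device A can take the Master state back.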

5.2.7 VRRP Tracking an Interface Monitoring Group

To prevent failures on a VRRP-disabled interface from causing service interruptions,
configure a VRRP backup group to track the VRRP-disabled interface. However, a VRRP
backup group can track only one VRRP-disabled interface at a time. As networks grow and
the number of interfaces increases, a VRRP backup group needs to track more
VRRP-disabled interfaces, and configuring the tracking interface by interface creates a
heavy workload.

Issue 01 (2018-12-05) Copyright © Huawei Technologies Co., Ltd. 92

HUAWEI NE40E-M2 Series Universal Service Router
Feature Description - Network Reliability 5 VRRP

To reduce the configuration workload, you can add multiple VRRP-disabled interfaces to an
interface monitoring group and enable a VRRP backup group to track the interface monitoring
group. When the link failure ratio of the interface monitoring group reaches a specified
threshold, the VRRP backup group performs a master/backup switchover to ensure reliable
service transmission.

Related Concepts
A VRRP backup group can track three interface monitoring groups at the same time.
l A VRRP backup group can track two interface monitoring groups on the access side in
normal mode (link is not specified). When the link failure ratio on the access side
reaches a specified threshold, the VRRP backup group reduces the priority of the local
device to trigger the remote device to preempt the Master state.
l A VRRP backup group can track one interface monitoring group on the network side in
link mode. When the link failure ratio on the network side reaches a specified threshold,
the local device in the VRRP backup group changes to the Initialize state and sends a
VRRP Advertisement packet carrying a priority of 0 to the remote device to trigger the
remote device to preempt the Master state.

Each interface in an interface monitoring group has a Down weight. If an interface goes
Down, the fault weight of the interface monitoring group to which the interface belongs
increases; if an interface goes Up, the fault weight of the interface monitoring group to which
the interface belongs decreases. The fault weight of an interface monitoring group reflects
link quality. VRRP can be configured to track an interface monitoring group. If the fault
weight of the interface monitoring group changes, the system notifies the VRRP module of
the change. The VRRP module calculates the VRRP priority or status based on the fault rate
of the interface monitoring group, configured monitoring mode, and priority change value.
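
The calculation can be sketched as follows. The function names, weights, and threshold are hypothetical, and comparing the summed fault weight directly with a threshold is an assumption that stands in for the link failure ratio described above.

```python
def fault_weight(interfaces: dict[str, tuple[bool, int]]) -> int:
    """Sum the Down weights of all interfaces that are currently Down."""
    return sum(weight for is_up, weight in interfaces.values() if not is_up)


def vrrp_reaction(weight: int, threshold: int, link_mode: bool,
                  priority: int, reduce_by: int):
    """Return (state_override, priority) for a tracked interface monitoring group."""
    if weight < threshold:
        return None, priority              # below the threshold: nothing changes
    if link_mode:
        return "Initialize", priority      # network side: enter the Initialize state
    return None, priority - reduce_by      # access side: lower the local priority


# Hypothetical group: interface name -> (is Up, Down weight).
group = {"if1": (False, 10), "if2": (True, 10), "if3": (False, 20)}
weight = fault_weight(group)
access_side = vrrp_reaction(weight, 30, link_mode=False, priority=120, reduce_by=40)
network_side = vrrp_reaction(weight, 30, link_mode=True, priority=120, reduce_by=40)
```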

Issue 01 (2018-12-05) Copyright © Huawei Technologies Co., Ltd. 93

HUAWEI NE40E-M2 Series Universal Service Router
Feature Description - Network Reliability 5 VRRP

Figure 5-14 VRRP tracking an interface monitoring group
[Figure not reproduced. Two panels showing the user network, the interface monitoring group, and the IP/MPLS core before and after a link fault. Legend: service traffic, link fault.]

Configuring VRRP to track an interface monitoring group on a device where a VRRP backup
group is configured helps to reduce the workload for configuring the VRRP backup group to
track VRRP-disabled interfaces.

5.2.8 BFD for VRRP

Devices in a VRRP backup group exchange VRRP Advertisement packets to negotiate the
master/backup status and implement backup. If the link between devices in a VRRP backup
group fails, VRRP Advertisement packets cannot be exchanged to negotiate the master/
backup status. A backup device attempts to preempt the Master state only after a period three
times the interval at which VRRP Advertisement packets are sent.
During this period, user traffic is still forwarded to the master device, which results in user
traffic loss.
Bidirectional Forwarding Detection (BFD) can rapidly detect faults in links or IP routes. BFD
for VRRP enables a master/backup VRRP switchover to be completed within 1 second,
preventing user traffic loss. A BFD session is established between the master and backup
devices in a VRRP backup group and is bound to the VRRP backup group. BFD immediately
detects communication faults in the VRRP backup group and instructs the VRRP backup
group to perform a master/backup switchover, minimizing service interruptions.
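
The difference in detection time can be illustrated with a short calculation. The VRRP figure uses the three-interval rule stated above and ignores the small priority-dependent skew time that the VRRP standard adds. The BFD interval of 10 ms and the detection multiplier of 3 are assumed example values, not values taken from this document.

```python
def vrrp_detection_time(advert_interval_s: float) -> float:
    """Seconds a backup device waits after advertisements stop arriving."""
    return 3 * advert_interval_s


def bfd_detection_time(interval_ms: float, multiplier: int) -> float:
    """BFD detection time in seconds: packet interval times detection multiplier."""
    return interval_ms * multiplier / 1000


vrrp_s = vrrp_detection_time(1.0)   # 1-second advertisement interval
bfd_s = bfd_detection_time(10, 3)   # assumed 10 ms interval and multiplier 3
```

Under these assumptions, VRRP alone needs about 3 seconds to notice the fault, whereas BFD reports it after about 30 milliseconds.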

VRRP and BFD Association Modes

The following table describes VRRP and BFD association modes.

Table 5-4 VRRP and BFD association modes

Association mode: Association between a VRRP backup group and a common BFD session
  Usage scenario: A backup device monitors the status of the master device in a VRRP
    backup group. A common BFD session is used to monitor the link between the master
    and backup devices.
  Type of associated BFD session: Static BFD sessions or static BFD sessions with
    automatically negotiated discriminators
  Impact mode: If the BFD session detects a fault and goes Down, the BFD module notifies
    the VRRP backup group of the status change. After receiving the notification, the
    VRRP backup group changes VRRP priorities of devices and determines whether to
    perform a master/backup VRRP switchover.
  BFD support: VRRP devices must be enabled with BFD.

Association mode: Association between a VRRP backup group and link and peer BFD sessions
  Usage scenario: The master and backup devices monitor the link and peer BFD sessions.
    A peer BFD session is established between the master and backup devices. A link BFD
    session is established between a downstream switch and each VRRP device. BFD helps
    the VRRP backup group detect faults in the link between a VRRP device and the
    downstream switch.
  Type of associated BFD session: Static BFD sessions or static BFD sessions with
    automatically negotiated discriminators
  Impact mode: If the link or peer BFD session goes Down, BFD notifies the VRRP backup
    group of the fault. After receiving the notification, the VRRP backup group
    immediately performs a master/backup VRRP switchover.
  BFD support: VRRP devices and the downstream switch must be enabled with BFD.

Association Between a VRRP Backup Group and a Common BFD Session

As shown in Figure 5-15, a BFD session is established between Device A (master) and
Device B (backup) and is bound to a VRRP backup group. If BFD detects a fault on the link
between Device B and Device A, the BFD module notifies the VRRP module of the status
change. After receiving the notification, the VRRP module performs a master/backup VRRP

Issue 01 (2018-12-05) Copyright © Huawei Technologies Co., Ltd. 96

HUAWEI NE40E-M2 Series Universal Service Router
Feature Description - Network Reliability 5 VRRP

Figure 5-15 Association between a VRRP backup group and a common BFD session
[Figure not reproduced. Device A is the master. Legend: BFD control packet, data flow.]

VRRP device configurations are as follows:

l Device A supports delayed preemption and its VRRP priority is 120.
l Device B supports immediate preemption and its VRRP priority retains the default value
100.
l A VRRP backup group is configured on Device B to monitor a common BFD session. If
BFD detects a fault and the BFD session goes Down, Device B increases its VRRP
priority by 40.
The implementation process is as follows:
1. Device A periodically sends VRRP Advertisement packets to inform Device B that it is
working properly. Device B monitors the status of Device A and the BFD session.
2. If BFD detects a fault, the BFD session goes Down. BFD notifies the VRRP module of
the status change. Device B increases its VRRP priority value to 140 (increased by 40),
higher than Device A's VRRP priority. Device B preempts the Master state and sends
gratuitous ARP packets to update address entries on Device E.
3. After the fault is rectified, the BFD session goes Up.
Device B restores a priority of 100. Device B retains the Master state and still sends
VRRP Advertisement packets to Device A.
After receiving the packets, Device A checks that the VRRP priority carried in the
packets is lower than the local VRRP priority and waits a specified period before
preempting the Master state. After restoring the Master state, Device A sends a VRRP
Advertisement packet and a gratuitous ARP packet.
After receiving the VRRP Advertisement packet that carries a priority higher than the
local priority, Device B enters the Backup state.
4. Device A in the Master state forwards user traffic, and Device B remains in the Backup
state.
The preceding process shows that BFD for VRRP switches over faster than VRRP alone. After
BFD for VRRP is deployed and a fault occurs, a backup device immediately preempts the
Master state without waiting for a period three times the interval at which VRRP
Advertisement packets are sent. A master/backup VRRP switchover can be implemented
in milliseconds.
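
The priority changes in the process above can be replayed with a short sketch. It uses the values given in this example (120 for Device A, 100 for Device B, and an increase of 40) and reports the device that holds the Master state once each step has settled. The preemption delay on Device A is not modeled, and the function name is hypothetical.

```python
def timeline():
    """Return (event, Device B priority, master) for each stage of the example."""
    a_priority, b_base, increase = 120, 100, 40
    steps = []
    for event, bfd_up in [("normal", True), ("fault", False), ("recovered", True)]:
        # Device B raises its priority only while the BFD session is Down.
        b_priority = b_base if bfd_up else b_base + increase
        master = "DeviceA" if a_priority > b_priority else "DeviceB"
        steps.append((event, b_priority, master))
    return steps


steps = timeline()
```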

Issue 01 (2018-12-05) Copyright © Huawei Technologies Co., Ltd. 97

HUAWEI NE40E-M2 Series Universal Service Router
Feature Description - Network Reliability 5 VRRP

Association Between a VRRP Backup Group and Link and Peer BFD Sessions
As shown in Figure 5-16, the master and backup devices monitor the status of link and peer
BFD sessions to identify local or remote faults.
Device A and Device B run VRRP. A peer BFD session is established between Device A and
Device B to detect link and device failures. Link BFD sessions are established between
Device A and Device E and between Device B and Device E to detect link and device
failures. After Device B detects that the peer BFD session is Down and the Link2 BFD session
is Up, Device B's VRRP status changes from Backup to Master, and Device B takes over.

Figure 5-16 Association between a VRRP backup group and link and peer BFD sessions
[Figure not reproduced. Device A is the master and Device B is the backup. Legend: BFD control packet, data flow.]

VRRP device configurations are as follows:

l Device A and Device B run VRRP.

l A peer BFD session is established between Device A and Device B to detect link and
device failures.
l Link1 and Link2 BFD sessions are established between Device E and Device A and
between Device E and Device B, respectively.
The implementation process is as follows:
1. In normal circumstances, Device A periodically sends VRRP Advertisement packets to
inform Device B that it is working properly. Device A monitors the BFD session status.
Device B monitors the status of Device A and the BFD session.
2. The BFD session goes Down if BFD detects either of the following faults:
– Link1 or Device E fails. Link1 BFD session and the peer BFD session go Down.
Link2 BFD session is Up.
Device A's VRRP status directly becomes Initialize.
Device B's VRRP status directly becomes Master.
– Device A fails. Link1 BFD session and the peer BFD session go Down. Link2 BFD
session is Up. Device B's VRRP status becomes Master.
3. After the fault is rectified, the BFD sessions go Up, and Device A and Device B restore
their VRRP status.

Issue 01 (2018-12-05) Copyright © Huawei Technologies Co., Ltd. 98

HUAWEI NE40E-M2 Series Universal Service Router
Feature Description - Network Reliability 5 VRRP


A Link2 fault does not affect Device A's VRRP status, and Device A continues to forward upstream
traffic. However, Device B's VRRP status becomes Master if both the peer BFD session and Link2 BFD
session go Down, and Device B detects the peer BFD session status change before detecting the Link2
BFD session status change. After Device B detects the Link2 BFD session status change, Device B's
VRRP status becomes Initialize.

Figure 5-17 shows the state machine for the association between a VRRP backup group and
link and peer BFD sessions.

Figure 5-17 State machine for the association between a VRRP backup group and link and
peer BFD sessions
[Figure not reproduced. The legible part shows the Master and Backup states and the transition condition "The peer BFD session goes Down and the link BFD session goes Up."]

The preceding process shows that, after link and peer BFD for VRRP is deployed, the backup
device immediately preempts the Master state if a fault occurs. Link and peer BFD for VRRP
implements a millisecond-level master/backup VRRP switchover.
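
The effect of the two tracked sessions on one device can be sketched as follows. The function is hypothetical and covers only the cases described in this section. Treating a link BFD session in the Down state as always leading to the Initialize state, even while the peer BFD session is Up, is an assumption.

```python
def tracked_state(current: str, peer_bfd_up: bool, link_bfd_up: bool) -> str:
    """Return a device's VRRP state from its link and peer BFD session statuses."""
    if not link_bfd_up:
        return "Initialize"   # the local link to the downstream switch is faulty
    if not peer_bfd_up:
        return "Master"       # the peer is unreachable but the local link works
    return current            # both sessions are Up: VRRP negotiation decides
```

For a Link1 fault, tracked_state("Master", False, False) gives "Initialize" for Device A, and tracked_state("Backup", False, True) gives "Master" for Device B.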

BFD for VRRP speeds up master/backup VRRP switchovers if faults occur.

5.2.9 VRRP Tracking EFM

Metro Ethernet solutions use Virtual Router Redundancy Protocol (VRRP) tracking
Bidirectional Forwarding Detection (BFD) to detect link faults and protect links between the
master and backup network provider edges (NPEs) and between NPEs and user-end provider
edges (UPEs). If UPEs do not support BFD, Metro Ethernet solutions cannot use VRRP
tracking BFD. If UPEs support 802.3ah, Metro Ethernet solutions can use 802.3ah as a
substitute for BFD to detect link faults and protect links between NPEs and UPEs. Ethernet
operation, administration and maintenance (OAM) technologies, such as Ethernet in the First
Mile (EFM) OAM defined in IEEE 802.3ah, provide functions, such as link connectivity
detection, link failure monitoring, remote failure notification, and remote loopback for links
between directly connected devices.

Issue 01 (2018-12-05) Copyright © Huawei Technologies Co., Ltd. 99

HUAWEI NE40E-M2 Series Universal Service Router
Feature Description - Network Reliability 5 VRRP


EFM can detect only local link failures. If the link between the UPE and NPE1 fails, NPE2 cannot detect
the failure. NPE2 has to wait three VRRP Advertisement packet transmission intervals before it switches
to the Master state. During this period, upstream service traffic is interrupted. To speed up master/
backup VRRP switchovers and minimize the service interruption time, configure VRRP also to track the
peer BFD session.

Figure 5-18 shows a network on which VRRP tracking EFM is configured. NPE1 and NPE2
are configured to belong to a VRRP backup group. A peer BFD session is configured to
detect the faults on the two NPEs and on the link between the two NPEs. An EFM session is
configured between the UPE and NPE1 and between the UPE and NPE2 to detect the faults
on the UPE and NPEs and on the links between the UPE and NPEs. The VRRP backup group
determines the VRRP status of NPEs based on the link status reported by EFM and the peer
BFD session.

Figure 5-18 VRRP tracking EFM
[Figure not reproduced. The only legible label is the peer BFD session.]

In Figure 5-18, the following example describes how EFM and a peer BFD session affect the
VRRP status when a fault occurs and is rectified.
l NPE1 and NPE2 run VRRP.
l A peer BFD session is established between the NPEs to detect link and device failures on the
link between the NPEs.
l An EFM session is established between NPE1 and the UPE and between NPE2 and the UPE
to detect link and node faults on the links between the NPEs and the UPE.
The implementation is as follows:
1. In normal circumstances, NPE1 periodically sends VRRP Advertisement packets to
inform NPE2 that NPE1 works properly. NPE1 and NPE2 both track the EFM and peer
BFD session status.
2. If NPE1 or the link between the UPE and NPE1 fails, the status of the EFM session
between the UPE and NPE1 changes to Discovery, the status of the peer BFD session
changes to Down, and the status of the EFM session between the UPE and NPE2
changes to Detect. NPE1's VRRP status directly changes from Master to Initialize, and
NPE2's VRRP status directly changes from Backup to Master.
3. After NPE1 or the link between the UPE and NPE1 recovers, the status of the peer BFD
session changes to Up, and the status of the EFM session between the UPE and NPE1
changes to Detect. If the preemption function is configured on NPE1, NPE1 changes
back to the Master state after VRRP negotiation, and NPE2 changes back to the Backup
state.
In normal circumstances, if the link between the UPE and NPE2 fails, NPE1 remains in the Master
state and continues to forward upstream traffic. However, NPE2's VRRP status changes to Master
if NPE2 detects the Down state of the peer BFD session before it detects the Discovery state of the
link between itself and the UPE. After NPE2 detects the Discovery state of the link between itself
and the UPE, NPE2's VRRP status changes from Master to Initialize.
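
The transient Master state described in the preceding note depends on the order in which NPE2 learns of the two status changes. The following sketch replays such an event sequence. The function names and the event representation are hypothetical, and the rule that an EFM session in the Discovery state always leads to the Initialize state is an assumption drawn from the cases described in this section.

```python
def npe_state(current: str, peer_bfd: str, efm: str) -> str:
    """Return an NPE's VRRP state from the tracked session statuses."""
    if efm == "Discovery":
        return "Initialize"    # the link to the UPE is faulty
    if peer_bfd == "Down":
        return "Master"        # EFM is Detect and the peer is unreachable
    return current             # both sessions are healthy: keep the negotiated state


def replay(initial: str, events):
    """Apply (session, status) events in arrival order and return the visited states."""
    status = {"peer_bfd": "Up", "efm": "Detect"}
    state, visited = initial, [initial]
    for session, new_status in events:
        status[session] = new_status
        state = npe_state(state, status["peer_bfd"], status["efm"])
        visited.append(state)
    return visited


# NPE2 learns of the peer BFD failure before the EFM failure.
bfd_first = replay("Backup", [("peer_bfd", "Down"), ("efm", "Discovery")])
# NPE2 learns of the EFM failure first.
efm_first = replay("Backup", [("efm", "Discovery"), ("peer_bfd", "Down")])
```

In the first ordering NPE2 passes through the Master state before it settles in the Initialize state. In the second ordering it never becomes the master.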

Figure 5-19 shows the state machine for VRRP tracking EFM.

Figure 5-19 State machine for VRRP tracking EFM
[Figure not reproduced. The legible part shows the Master and Backup states and the transition condition "The status of the peer BFD session is Down, and the status of the EFM session is Detect."]

VRRP tracking EFM facilitates master/backup VRRP switchovers on a network on which
UPEs do not support BFD but support 802.3ah.

5.2.10 VRRP Tracking CFM

Virtual Router Redundancy Protocol (VRRP) tracking Ethernet in the First Mile (EFM)
effectively facilitates link fault detection on a network on which UPEs do not support
Bidirectional Forwarding Detection (BFD). However, EFM can detect faults only on single-
hop links. As shown in Figure 5-20, EFM cannot detect faults on the link between UPE2 and
NPE1 or between UPE2 and NPE2.

Issue 01 (2018-12-05) Copyright © Huawei Technologies Co., Ltd. 101

HUAWEI NE40E-M2 Series Universal Service Router
Feature Description - Network Reliability 5 VRRP

Figure 5-20 VRRP networking diagram
[Figure not reproduced.]
Connectivity fault management (CFM) defined in 802.1ag provides functions, such as point-
to-point connectivity fault detection, fault notification, fault verification, and fault locating.
CFM can monitor the connectivity of an entire network and locate connectivity faults. CFM
can also be used together with switchover techniques to improve network reliability. VRRP
tracking CFM enables a VRRP backup group to rapidly perform a master/backup VRRP
switchover when CFM detects a link fault. This implementation minimizes the service
interruption time.


CFM can detect only local link failures. If the link between UPE2 and NPE1 fails, NPE2 cannot detect
the failure. NPE2 has to wait three VRRP Advertisement packet transmission intervals before it switches
to the Master state. During this period, upstream service traffic is interrupted. To speed up master/
backup VRRP switchovers and minimize the service interruption time, configure VRRP also to track the
peer BFD session.

Figure 5-21 shows a network on which VRRP tracks CFM and the peer BFD session.

Issue 01 (2018-12-05) Copyright © Huawei Technologies Co., Ltd. 102

HUAWEI NE40E-M2 Series Universal Service Router
Feature Description - Network Reliability 5 VRRP

Figure 5-21 VRRP tracking CFM
[Figure not reproduced. The only legible label is the peer BFD session.]
l NPE1 and NPE2 are configured to belong to a VRRP backup group.

l A peer BFD session is configured to detect the faults on the two NPEs and on the link
between the two NPEs.
l A CFM session is configured between UPE2 and NPE1 and between UPE2 and NPE2 to
detect the faults on UPE2 and the NPEs and on links between UPE2 and the NPEs.
The implementation is as follows:
1. In normal circumstances, NPE1 periodically sends VRRP Advertisement packets to
inform NPE2 that NPE1 works properly. NPE1 and NPE2 both track the CFM and peer
BFD session status.
2. If NPE1 or the link between UPE2 and NPE1 fails, the status of the CFM session
between UPE2 and NPE1 changes to Down, the status of the peer BFD session changes
to Down, and the status of the CFM session between UPE2 and NPE2 changes to Up.
NPE1's VRRP status directly changes from Master to Initialize, and NPE2's VRRP status
directly changes from Backup to Master.
3. After NPE1 or the link between UPE2 and NPE1 recovers, the status of the peer BFD
session changes to Up, and the status of the CFM session between UPE2 and NPE1
changes to Up. If the preemption function is configured on NPE1, NPE1 changes back to
the Master state after VRRP negotiation, and NPE2 changes back to the Backup state.
In normal circumstances, if the link between UPE2 and NPE2 fails, NPE1 remains in the Master
state and continues to forward upstream traffic. However, NPE2's VRRP status changes to Master
if NPE2 detects the Down state of the peer BFD session before it detects the Down state of the link
between itself and UPE2. After NPE2 detects the Down state of the link between itself and UPE2,
NPE2's VRRP status changes from Master to Initialize.
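
The tracking behavior described in the steps and note above can be summarized as a small decision function. This is a minimal Python sketch, not the device implementation. The function name and the rule that a device leaving the Initialize state rejoins as Backup before normal VRRP negotiation are assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    INITIALIZE = "Initialize"
    BACKUP = "Backup"
    MASTER = "Master"


def next_state(current: State, cfm_up: bool, peer_bfd_up: bool) -> State:
    """Return the VRRP state of one NPE after a tracked-session change."""
    # The CFM session toward the UPE is Down: this NPE cannot serve users.
    if not cfm_up:
        return State.INITIALIZE
    # CFM is Up but the peer BFD session is Down: take over immediately.
    if not peer_bfd_up:
        return State.MASTER
    # Both sessions are Up. Assumption: a recovering device rejoins as Backup
    # and normal VRRP negotiation (priority, preemption) decides the rest.
    if current is State.INITIALIZE:
        return State.BACKUP
    return current


# NPE1 or the link between UPE2 and NPE1 fails (step 2 above).
print(next_state(State.MASTER, cfm_up=False, peer_bfd_up=False).value)  # Initialize
print(next_state(State.BACKUP, cfm_up=True, peer_bfd_up=False).value)   # Master
```

The same function reproduces the case in the note: NPE2 first sees the peer BFD session go Down (Master) and then its own CFM session go Down (Initialize).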

Figure 5-22 shows the state machine for VRRP tracking CFM.


Figure 5-22 State machine for VRRP tracking CFM
[Figure: state diagram that includes the Master and Backup states. Legible transition label: "The status
of the peer BFD session goes Down, and the status of the CFM session goes Up."]

VRRP tracking CFM prevents service interruptions caused by dual master devices in a VRRP
backup group and facilitates master/backup VRRP switchovers.

5.2.11 VRRP Association with NQA

To improve network reliability, VRRP can be configured on a device to track the following objects:
l Interface
l EFM session
l BFD session
Failure of a tracked object can trigger a rapid master/backup VRRP switchover to ensure
service continuity.
In Figure 5-23, however, if Interface 2 on Device C goes Down and its IP address
becomes unreachable, VRRP is unable to detect the fault. As a result, user traffic is dropped.


Figure 5-23 VRRP networking
[Figure labels: VRRP backup group, Host A, DeviceA, DeviceB, DeviceC, DeviceD, Interface 1, Interface 2]

To resolve the preceding issue, you can associate VRRP with network quality analysis
(NQA). Using test instances, NQA sends probe packets to check the reachability of
destination IP addresses. After VRRP is associated with an NQA test instance, VRRP tracks
the NQA test instance to implement rapid master/backup VRRP switchovers. For the example
shown in the preceding figure, you can configure an NQA test instance on Device A to check
whether the IP address of Interface 2 on Device C is reachable.


VRRP association with an NQA test instance is required on only the local device (Device A).

You can configure VRRP association with an NQA test instance to track a gateway Router's
uplink, which is a cross-device link. If the uplink fails, NQA instructs VRRP to reduce the
gateway Router's priority by a specified value. Reducing the priority enables another gateway
Router in the VRRP backup group to take over services and become the master, thereby
ensuring communication continuity between hosts on the LAN served by the gateway and the
external network. After the uplink recovers, NQA instructs VRRP to restore the gateway
Router's priority.
Figure 5-24 illustrates VRRP association with an NQA test instance.

Figure 5-24 VRRP association with an NQA test instance
[Figure labels: VRRP backup group, Host A, DeviceA, DeviceB, DeviceC, DeviceD, Interface 1, Interface 2,
NQA test instance]


As shown in Figure 5-24:

l Device A and Device B run VRRP.
l An NQA test instance is created on Device A to detect the reachability of the destination
IP address.
l VRRP is configured on Device A to track the NQA test instance. If the status of the
NQA test instance is Failed, Device A reduces its priority to trigger a master/backup
VRRP switchover. A VRRP backup group can track a maximum of eight NQA test
instances.

The implementation is as follows:

1. Device A tracks the NQA test instance periodically and sends VRRP Advertisement
packets to notify Device B of its status.
2. When the uplink fails, the status of the NQA test instance changes to Failed. NQA
notifies VRRP of the link detection failure, and Device A reduces its priority by a
specified value. Because Device B has a higher priority than Device A, Device B
preempts the Master state and takes over services.
3. When the uplink recovers, the status of the NQA test instance changes to Success. NQA
notifies VRRP of the link detection success, and Device A restores the original priority.
If preemption is enabled on Device A, Device A preempts the Master state and takes
over services after VRRP negotiation.

VRRP association with NQA implements a rapid master/backup VRRP switchover if a cross-
device uplink fails.
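
The priority adjustment driven by NQA results can be sketched as follows. This Python model is an illustration under stated assumptions, not device code: the class and method names are hypothetical, the priority values (120 and 40) are example values, reductions from several failed test instances are assumed to add up, and the effective priority is assumed never to drop below 1. Only the limit of eight tracked NQA test instances comes from the description above.

```python
from dataclasses import dataclass, field

MAX_TRACKED_NQA = 8  # limit stated in the description above


@dataclass
class VrrpBackupGroup:
    configured_priority: int
    # test instance name -> [priority reduction, last result was Success?]
    tracked: dict = field(default_factory=dict)

    def track_nqa(self, name: str, reduce_by: int) -> None:
        if name not in self.tracked and len(self.tracked) >= MAX_TRACKED_NQA:
            raise ValueError("a VRRP backup group can track at most eight NQA test instances")
        self.tracked[name] = [reduce_by, True]

    def on_nqa_result(self, name: str, success: bool) -> None:
        # NQA notifies VRRP of the latest probe result.
        self.tracked[name][1] = success

    @property
    def priority(self) -> int:
        # Assumption: reductions of all failed instances add up, floor of 1.
        reduction = sum(r for r, ok in self.tracked.values() if not ok)
        return max(1, self.configured_priority - reduction)


device_a = VrrpBackupGroup(configured_priority=120)
device_a.track_nqa("uplink-probe", reduce_by=40)
device_a.on_nqa_result("uplink-probe", success=False)
print(device_a.priority)  # 80: a peer with priority 100 can now preempt
device_a.on_nqa_result("uplink-probe", success=True)
print(device_a.priority)  # 120: original priority restored
```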

5.2.12 Association Between a VRRP Backup Group and a Route

To improve device reliability, two user gateways working in master/backup mode are
connected to a network, and VRRP is enabled on these gateways to determine their master/
backup status. If a VRRP backup group has been configured and an uplink route to a network
becomes unreachable, access-side users still use the VRRP backup group to forward traffic
along the uplink route, which causes user traffic loss.

Association between a VRRP backup group and a route can prevent user traffic loss. A VRRP
backup group can be configured to track the uplink route to a network. If the route is
withdrawn or becomes inactive, the route management (RM) module notifies the VRRP
backup group of the change. After receiving the notification, the VRRP backup group changes
its master device's VRRP priority and performs a master/backup switchover. This process
ensures that user traffic can be forwarded along a properly functioning link.

A VRRP backup group can be associated with an uplink route to a network to determine
whether the route is reachable. If the uplink route is withdrawn or becomes inactive after the
uplink goes Down or the network topology changes, hosts on a local area network (LAN) fail
to access the external network through gateways. The RM module notifies the VRRP backup
group of the route status change. The VRRP priority of the master device decreases by a
specified value. The backup device with the highest priority then preempts the Master state
and takes over traffic. This process ensures communication continuity between these hosts
and the external network. After the uplink recovers, the RM module instructs the VRRP
backup group to restore the master device's VRRP priority.
As shown in Figure 5-25, a VRRP backup group is configured on Device A (master) and
Device B (backup), with Device A forwarding user traffic. The VRRP backup group on
Device A is associated with the uplink route.
When the uplink from Device A to Device C goes Down, the route becomes
unreachable and Device A's VRRP priority decreases. Because Device A's reduced VRRP
priority is lower than Device B's VRRP priority, Device B preempts the Master state and takes
over traffic, which prevents user traffic loss.

Figure 5-25 Association between a VRRP backup group and a route
[Figure: two panels showing the data flow. Labels: VRRP, Master, Backup, DeviceA, DeviceB, DeviceC,
DeviceD, network core, Data flow]

VRRP device configurations are as follows:

l Device A's VRRP priority is 120.
l Device B supports immediate preemption and its VRRP priority retains the default value
(100).
l The VRRP backup group on Device A is associated with the uplink route. If the
RM module informs Device A that the route is unreachable, Device A's
VRRP priority decreases by 40.
The implementation process is as follows:


1. In normal circumstances, Device A periodically sends VRRP Advertisement packets to
inform Device B that it is working properly.
2. When the uplink between Device A and Device C goes Down, the route
becomes unreachable. The RM module notifies the VRRP backup group on Device A of
the route status change. After Device A receives the notification, its VRRP priority
decreases to 80 (120 - 40). Because the VRRP priority of Device B is higher than that of
Device A, Device B preempts the Master state and sends gratuitous ARP packets to
update address entries on Device E.
3. When the faulty link recovers, the route becomes reachable again. Device A restores its
VRRP priority to 120 (80 + 40), preempts the Master state, and sends VRRP
Advertisement packets and gratuitous ARP packets. After Device B receives the
Advertisement packets and determines that its priority is lower than that of Device A,
Device B returns to the Backup state.
4. Device A in the Master state forwards user traffic, and Device B remains in the Backup
state.
The preceding process shows that the VRRP backup group performs a master/backup
switchover if the uplink route is unreachable.
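
The priority arithmetic in the preceding steps can be checked with a few lines of Python. The sketch uses the values from this example (120 for Device A, a reduction of 40) and assumes that Device B keeps the VRRP default priority of 100. The function names are hypothetical, and the election is simplified to "highest priority wins with preemption enabled".

```python
CONFIGURED = {"DeviceA": 120, "DeviceB": 100}  # DeviceB: assumed default priority
REDUCTION_ON_ROUTE_LOSS = 40                   # applied to DeviceA only


def priorities(route_reachable: bool) -> dict:
    """Effective priorities, given the state of the route tracked by DeviceA."""
    reduction = 0 if route_reachable else REDUCTION_ON_ROUTE_LOSS
    return {
        "DeviceA": CONFIGURED["DeviceA"] - reduction,
        "DeviceB": CONFIGURED["DeviceB"],
    }


def master(route_reachable: bool) -> str:
    """Simplified election: the device with the highest priority is master."""
    current = priorities(route_reachable)
    return max(current, key=current.get)


for reachable in (True, False, True):  # normal, uplink Down, uplink recovered
    print(reachable, priorities(reachable), master(reachable))
```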

Association between a VRRP backup group and a route helps implement a master/backup
VRRP switchover when an uplink route to a network is unreachable. The association also
ensures that the VRRP backup group performs a traffic switchback and minimizes traffic
loss.
Compared with the association between a VRRP backup group and the interface status, this
association detects both direct uplink interface faults and faults of links and devices when
uplink traffic passes through multiple devices.

5.2.13 Association Between Direct Routes and a VRRP Backup Group

A VRRP backup group is configured on Device1 and Device2 on the network shown in
Figure 5-26. Device1 is a master device, whereas Device2 is a backup device. The VRRP
backup group serves as a gateway for users. User-to-network traffic travels through Device1.
However, network-to-user traffic may travel through Device1, Device2, or both of them over
a path determined by a dynamic routing protocol. Therefore, user-to-network traffic and
network-to-user traffic may travel along different paths, which interrupts services if firewalls
are attached to devices in the VRRP backup group, complicates traffic monitoring or statistics
collection, and increases costs.
To address the preceding problems, the routing protocol is expected to select a route passing
through the master device so that the user-to-network and network-to-user traffic travels along
the same path. Association between direct routes and a VRRP backup group meets this
expectation by allowing the dynamic routing protocol to select a route based on the VRRP
status.


Figure 5-26 Association between direct routes and a VRRP backup group
[Figure: two panels. Labels: Master, NPE3, User network, IP/MPLS core, Direct route tracking VRRP,
User-to-network traffic, Network-to-user traffic]

Related Concepts
Direct route: a 32-bit host route or a network segment route that is generated after a device
interface is assigned an IP address and its protocol status is Up. A device automatically
generates direct routes without using a routing algorithm.

Association between direct routes and a VRRP backup group allows VRRP interfaces to
adjust the costs of direct network segment routes based on the VRRP status. The direct route
with the master device as the next hop has the lowest cost. A dynamic routing protocol
imports the direct routes and selects the direct route with the lowest cost. For example, VRRP
interfaces on Device1 and Device2 on the network shown in Figure 5-26 are configured with
association between direct routes and the VRRP backup group. The implementation is as
follows:
l Device1 in the Master state sets the cost of its route to the directly connected virtual IP
network segment to 0 (default value).
l Device2 in the Backup state increases the cost of its route to the directly connected
virtual IP network segment.


A dynamic routing protocol selects the route with Device1 as the next hop because this route
costs less than the other route. Therefore, both user-to-network and network-to-user traffic
travels through Device1.
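
The cost-based selection described above can be illustrated with a short Python sketch. The master cost of 0 is the default value stated above. The increased backup cost (100) and the function names are hypothetical and chosen only for illustration.

```python
MASTER_COST = 0    # default cost of the direct route on the master device
BACKUP_COST = 100  # hypothetical increased cost configured for the Backup state


def direct_route_cost(vrrp_state: str) -> int:
    """Cost advertised for the direct route to the virtual IP network segment."""
    return MASTER_COST if vrrp_state == "Master" else BACKUP_COST


def preferred_next_hop(vrrp_states: dict) -> str:
    """A dynamic routing protocol selects the imported route with the lowest cost."""
    costs = {device: direct_route_cost(state) for device, state in vrrp_states.items()}
    return min(costs, key=costs.get)


print(preferred_next_hop({"Device1": "Master", "Device2": "Backup"}))  # Device1
print(preferred_next_hop({"Device1": "Backup", "Device2": "Master"}))  # Device2
```

Because the cost follows the VRRP state, network-to-user traffic moves to the new master after a master/backup switchover without any routing reconfiguration.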

Usage Scenario
When a data center is used, firewalls are attached to devices in a VRRP backup group to
improve network security. Network-to-user traffic cannot pass through a firewall if it travels
over a path different from the one used by user-to-network traffic.
When an IP radio access network (RAN) is configured, VRRP is configured to set the master/
backup status of aggregation site gateways (ASGs) and radio service gateways (RSGs).
Network-to-user and user-to-network traffic may pass through different paths, complicating
network operation and management.
Association between direct routes and a VRRP backup group can address the preceding
problems by ensuring the user-to-network and network-to-user traffic travels along the same
path.
5.2.14 Traffic Forwarding by a Backup Device

As shown in Figure 5-27, the base station attached to the cell site gateway (CSG) on a mobile
bearer network accesses aggregation nodes PE1 and PE2 over primary and secondary pseudo
wires (PWs) and accesses PE3 and PE4 over primary and secondary links. PE3 and PE4 are
configured to belong to a Virtual Router Redundancy Protocol (VRRP) backup group. If PE1
fails, traffic switches from the primary link to the secondary link. Before a master/backup
VRRP switchover is complete, service traffic is temporarily interrupted.


Figure 5-27 Traffic forwarding by a backup device
[Figure: two panels. Labels: eNodeB, PE1, Master, RNC1, Primary PW, Secondary PW, Primary link,
Secondary link, Upstream traffic]

To meet carrier-class reliability requirements, configure devices in the VRRP backup group to
forward traffic even when they are in the Backup state. This configuration can prevent traffic
interruptions in the preceding scenario.

As shown in Figure 5-27, upstream traffic travels along the path CSG -> PE1 -> PE3 ->
RNC1/RNC2 in normal circumstances. PE3 is in the Master state, and PE4 is in the Backup
state.

If PE1 fails, traffic switches from the primary link between PE1 and PE3 to the secondary link
between PE2 and PE4. Because a primary/secondary link switchover completes faster
than a master/backup VRRP switchover:
l If PE4 cannot forward traffic, service traffic is temporarily interrupted before the master/
backup VRRP switchover is complete.
l If PE4 can forward traffic, PE4 takes over service traffic forwarding even if the master/
backup VRRP switchover is not complete.
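
The two cases above differ only in whether traffic must wait for the VRRP switchover. The following Python sketch models that extra gap. The function name and the timing values are hypothetical, and the model ignores the interruption caused by the link switchover itself.

```python
def vrrp_gap_ms(link_switchover_ms: float, vrrp_switchover_ms: float,
                backup_forwards: bool) -> float:
    """Extra interruption caused by waiting for the master/backup switchover."""
    if backup_forwards:
        # PE4 forwards traffic as soon as it arrives on the secondary link.
        return 0.0
    # Traffic reaches PE4 after the link switchover but is dropped until PE4
    # becomes master.
    return max(0.0, vrrp_switchover_ms - link_switchover_ms)


# Hypothetical timings: link switchover 50 ms, VRRP switchover 3000 ms.
print(vrrp_gap_ms(50, 3000, backup_forwards=False))  # 2950.0
print(vrrp_gap_ms(50, 3000, backup_forwards=True))   # 0.0
```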

Traffic forwarding by a backup device improves master/backup VRRP switchover
performance and reduces the service interruption time.


5.2.15 Rapid VRRP Switchback

On the network shown in Figure 5-28, VRRP-enabled NPEs are connected to user-side PEs
through active and standby links. User traffic travels over the active link to the master NPE1,
and NPE1 forwards user traffic to the Internet. If NPE1 is working properly, user traffic
travels over the path UPE -> PE1 -> NPE1. If the active link or NPE1's interface 1 tracked by
the VRRP backup group fails, an active/standby link switchover and a master/backup VRRP
switchover are implemented. After the switchovers, user traffic switches to the path UPE ->
PE1 -> PE2 -> NPE2. After the fault is rectified, an active/standby link switchback and a
master/backup VRRP switchback are implemented. If the active link becomes active before
the original master device restores the Master state, user traffic is interrupted.
To prevent user traffic interruptions, the rapid VRRP switchback function is used to allow the
original master device to switch from the Backup state to the Master state immediately after
the fault is rectified.

Figure 5-28 Rapid VRRP switchback
[Figure: two panels. Labels: Internet, Master, Backup, Interface 1, Service VRRP, Active link,
Standby link, User network, Data flow]

Related Concept
A VRRP switchback is a process during which the original master device switches its status
from Backup to Master after a fault is rectified.


Rapid VRRP switchback allows the original master device to switch its status from Backup to
Master without using VRRP Advertisement packets to negotiate the status. For example, on
the network shown in Figure 5-28, device configurations are as follows:
l A common VRRP backup group is configured on NPE1 and NPE2 that run VRRP. An
mVRRP backup group is configured on directly connected interfaces of NPE1 and
NPE2. The common VRRP backup group is bound to the mVRRP backup group and
becomes a service VRRP backup group. The mVRRP backup group determines the
master/backup status of the service VRRP backup group.
l NPE1 has a VRRP priority of 120 and works in the Master state in the mVRRP backup
group.
l NPE2 has a VRRP priority of 100 and works in the Backup state in the mVRRP backup
group.
l NPE1 tracks interface 1 and reduces its priority by 40 if interface 1 goes Down.
The rapid VRRP switchback process is as follows:
1. If NPE1 is working properly, NPE1 periodically sends VRRP Advertisement packets to
notify NPE2 of the Master state. NPE1 tracks interface 1 connected to the active link.
2. If the active link or interface 1 fails, interface 1 goes Down. The service VRRP backup
group on NPE1 is in the Initialize state. NPE1 reduces its mVRRP priority to 80 (120 -
40). As a result, the mVRRP priority of NPE2 is higher than that of NPE1, and NPE2
immediately preempts the Master state. NPE2 then sends a VRRP Advertisement packet
carrying a higher priority than that of NPE1. After receiving the packet, the mVRRP
backup group on NPE1 stops sending VRRP Advertisement packets and enters the
Backup state. The status of the service VRRP backup group is the same as that of the
mVRRP backup group on NPE2. User traffic switches to the path UPE -> PE1 -> PE2 ->
NPE2.
3. After the fault is rectified, interface 1 goes Up and NPE1 increases its VRRP priority to
120 (80 + 40). NPE1 immediately preempts the Master state and sends VRRP
Advertisement packets to NPE2. User traffic switches back to the path UPE -> PE1 ->
NPE1.
If rapid VRRP switchback is not configured and NPE1 restores its priority to 120, NPE1 has to
wait until it receives VRRP Advertisement packets carrying a lower priority than its own priority
from NPE2 before preempting the Master state.
4. NPE1 then sends VRRP Advertisement packets carrying a higher priority than NPE2's
priority. After receiving the VRRP Advertisement packets, NPE2 enters the Backup
state. Both NPE1 and NPE2 restore their previous status.
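
The difference between a switchback with and without the rapid VRRP switchback function can be sketched as a waiting-time calculation. This Python example is a simplified model under stated assumptions: the function name is hypothetical, the current master (NPE2) is assumed to send VRRP Advertisement packets at a fixed interval, and packet transmission and processing delays are ignored.

```python
def switchback_wait_s(rapid_switchback: bool, adver_interval_s: float,
                      since_last_adv_s: float) -> float:
    """Time NPE1 waits after restoring its priority before it becomes master."""
    if rapid_switchback:
        # NPE1 preempts the Master state immediately, without negotiation.
        return 0.0
    # Otherwise NPE1 waits for the next Advertisement packet from NPE2 that
    # carries a priority lower than its own.
    return max(0.0, adver_interval_s - since_last_adv_s)


# Hypothetical timing: 1 s interval, last Advertisement received 0.25 s ago.
print(switchback_wait_s(True, 1.0, 0.25))   # 0.0
print(switchback_wait_s(False, 1.0, 0.25))  # 0.75
```

If the active link is restored during this wait, user traffic reaches NPE1 while it is still in the Backup state, which is the interruption that rapid VRRP switchback avoids.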

Usage Scenario
Rapid VRRP switchback applies to a specific network with all of the following
characteristics:
l The master device in an mVRRP backup group tracks a VRRP-disabled interface or
feature and reduces its VRRP priority if the interface or feature status becomes Down.
l Devices in a VRRP backup group are connected to user-side devices over the active and
standby links.
l An active/standby link switchback is implemented more quickly than a master/backup VRRP
switchback.

Rapid VRRP switchback speeds up a VRRP switchback after a fault is rectified.

5.2.16 Unicast VRRP

Common VRRP is multicast VRRP and only allows VRRP Advertisement packets to be
multicast. Multicast VRRP Advertisement packets, however, can be forwarded within only
one broadcast domain (for example, one VLAN or VSI). Therefore, common VRRP backup
groups apply only to Layer 2 networks. This limitation means that common VRRP does not
apply to devices on a Layer 3 network that need to negotiate their master/backup status using
VRRP.
To address this issue, Huawei has developed unicast VRRP based on VRRPv2, which allows
VRRP Advertisement packets to pass through a Layer 3 network. After a unicast VRRP
backup group is configured on two devices on a Layer 3 network, the master device in this
group sends unicast VRRP Advertisement packets to the backup device through the Layer 3
network, implementing the master/backup status negotiation between the two devices.
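
To make the difference between multicast and unicast advertisement concrete, the following Python sketch builds a standard VRRPv2 Advertisement message as defined in RFC 3768 and selects its destination address. This document does not specify the unicast packet layout, so reusing the VRRPv2 message body for unicast VRRP is an assumption. The function names, the VRID, and all IP addresses are hypothetical.

```python
import ipaddress
import struct

VRRP_MULTICAST_GROUP = "224.0.0.18"  # destination of standard (multicast) VRRP


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement checksum used by VRRPv2."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_vrrpv2_advertisement(vrid: int, priority: int, virtual_ips: list,
                               adver_int_s: int = 1) -> bytes:
    """Build a VRRPv2 Advertisement (version 2, type 1, no authentication)."""
    ips = b"".join(ipaddress.IPv4Address(ip).packed for ip in virtual_ips)
    auth_data = bytes(8)  # authentication type 0: field is all zeros

    def pack(checksum: int) -> bytes:
        header = struct.pack("!BBBBBBH", (2 << 4) | 1, vrid, priority,
                             len(virtual_ips), 0, adver_int_s, checksum)
        return header + ips + auth_data

    return pack(internet_checksum(pack(0)))


def advertisement_destination(unicast_peer: str = None) -> str:
    """Multicast VRRP uses the group address; unicast VRRP uses the peer address."""
    return unicast_peer or VRRP_MULTICAST_GROUP


packet = build_vrrpv2_advertisement(vrid=1, priority=120, virtual_ips=["10.1.1.100"])
print(len(packet), advertisement_destination(), advertisement_destination("10.2.2.2"))
```

A packet sent to the multicast group address stays within one broadcast domain, whereas a packet addressed to the peer's unicast address can be routed across a Layer 3 network.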

The implementation of unicast VRRP is similar to that of common VRRP.
A unicast VRRP backup group cannot function as a user gateway. In addition to implementing
master/backup status negotiation between devices, unicast VRRP provides the following
extended functions:
l Security authentication: MD5 or HMAC-SHA256 authentication can be configured for a
unicast VRRP backup group to improve network security.
l Delayed preemption: This function prevents the master/backup status of devices in a
unicast VRRP backup group from changing frequently, thereby ensuring network
stability.
l Association with a VRRP-disabled interface and BFD: If the master device in a unicast
VRRP backup group fails, the backup device immediately takes over, thereby ensuring
network reliability.
l Association with an interface monitoring group: When the link failure ratio on the access
or network side reaches a specified threshold, the unicast VRRP backup group performs
a master/backup switchover to ensure network reliability.
As an extension to association between a unicast VRRP backup group and a VRRP-disabled
interface, association between a unicast VRRP backup group and an interface monitoring group
reduces the configuration workload and implements uplink and downlink monitoring.

Usage Scenario
Unicast VRRP applies when two devices on a Layer 3 network need to use VRRP to negotiate
their master/backup status.

Unicast VRRP allows two devices on a Layer 3 network to use VRRP to negotiate their
master/backup status. Unicast VRRP can be associated with a VRRP-disabled interface, BFD,
and an interface monitoring group.
If the master device in a unicast VRRP backup group fails, the backup device rapidly detects
the fault and becomes the new master device.

5.3 Application Scenarios for VRRP

5.3.1 IPRAN Gateway Protection Solution

Service Overview
NodeBs and radio network controllers (RNCs) on an IP radio access network (IPRAN) do not
have dynamic routing capabilities. Static routes must be configured to allow NodeBs to
communicate with access aggregation gateways (AGGs) and allow RNCs to communicate
with radio service gateways (RSGs) at the aggregation level. To ensure that various value-
added services, such as voice, video, and cloud computing, are not interrupted on mobile
bearer networks, a VRRP backup group can be deployed to implement gateway redundancy.
When the master device in a VRRP backup group goes Down, a backup device takes over,
ensuring normal service transmission and enhancing device reliability at the aggregation
level.

Networking Description
Figure 5-29 shows the network for the IPRAN gateway protection solution. A NodeB is
connected to AGGs over an access ring or is dual-homed to two AGGs. The cell site gateways
(CSGs) and AGGs are connected using the pseudo wire emulation edge-to-edge (PWE3)
technology, which ensures connection reliability. Two VRRP backup groups can be
configured on the AGGs and RSGs to implement gateway backup for the NodeB and RNC,
respectively.

Figure 5-29 IPRAN gateway protection solution
[Figure labels: Access ring/dual-homing network, IP/MPLS backbone network]
Feature Deployment
Table 5-5 describes VRRP-based gateway protection applications on an IPRAN.


Table 5-5 VRRP-based gateway protection on an IPRAN

Network Layer: Deploy VRRP backup groups on AGGs to implement gateway backup for the NodeB.

  Feature Deployment: Associate an mVRRP backup group with a service VRRP backup group.
  Usage Scenario: To meet various service demands, different VRRP backup groups can be
  configured on AGGs to provide gateway functions for different user groups. Each VRRP backup
  group maintains its own state machine, leading to transmission of multiple VRRP packets on the
  AGGs. These packets use a significant amount of bandwidth when traversing the access network.
  To simplify VRRP operations and reduce bandwidth consumption, an mVRRP backup group can be
  associated with service VRRP backup groups on AGGs. During this process, service VRRP backup
  groups function as gateways for the NodeB and are associated with the mVRRP backup group. The
  mVRRP backup group processes VRRP Advertisement packets and determines the master/backup
  status of the associated service VRRP backup group.

  Feature Deployment: Associate an mVRRP backup group with a BFD session.
  Usage Scenario: By default, when a VRRP backup group detects that the master device goes Down,
  the backup device attempts to preempt the Master state after 3 seconds (three times the interval
  at which VRRP Advertisement packets are sent). During this period, no master device forwards
  user traffic, which leads to traffic forwarding interruptions.
  BFD can detect link faults in milliseconds. After an mVRRP backup group is associated with a
  BFD session and BFD detects a fault, a master/backup VRRP switchover is implemented,
  preventing user traffic loss. When the master device goes Down, the BFD module instructs the
  backup device in the mVRRP backup group to preempt the Master state and take over traffic. The
  status of the service VRRP backup group associated with the mVRRP backup group changes
  accordingly. This implementation reduces service interruptions.

  Feature Deployment: Associate direct network segment routes with a service VRRP backup group.
  Usage Scenario: During the traffic transmission between the NodeB and RNC, user-to-network and
  network-to-user traffic may travel through different paths, causing network operation,
  maintenance, and management difficulties. For example, the NodeB sends traffic destined for the
  RNC through the master AGG. The RNC sends traffic destined for the NodeB through the backup
  AGG. This implementation increases traffic monitoring costs. Association between direct network
  segment routes and a service VRRP backup group can be deployed to ensure that user-to-network
  and network-to-user traffic travels through the same path.

Network Layer: Deploy VRRP backup groups on RSGs to implement gateway backup for the RNC.

  Feature Deployment: Deploy basic VRRP functions.
  Usage Scenario: RSGs provide gateway functions for the RNC. Basic VRRP functions can be
  configured on the RSGs to implement gateway backup. In normal circumstances, the master device
  forwards user traffic. When the master device goes Down, the backup device takes over.

  Feature Deployment: Associate a VRRP backup group with a BFD session.
  Usage Scenario: A VRRP backup group can be associated with a BFD session to implement a rapid
  master/backup VRRP switchover when BFD detects a fault. When the master device goes Down, the
  BFD module instructs the backup device in the VRRP backup group to preempt the Master state
  and take over traffic. This implementation reduces service interruptions.

  Feature Deployment: Associate direct network segment routes with a VRRP backup group.
  Usage Scenario: Direct network segment routes can be associated with a VRRP backup group to
  ensure the same path for both user-to-network and network-to-user traffic between the NodeB
  and RNC.

Protection Switching Process

AGG1 and RSG1 are deployed as master devices. The following describes user traffic path
changes when AGG1 goes Down and after AGG1 recovers.
As shown in Figure 5-30, in normal circumstances, the NodeB sends traffic through the CSGs
to AGG1 over the primary pseudo wire (PW). AGG1 forwards the traffic to RSG1 through
the P device. Then, RSG1 forwards the traffic to the RNC. The path for user-to-network
traffic is CSG -> AGG1 -> P -> RSG1 -> RNC, and the path for network-to-user traffic is
RNC -> RSG1 -> P -> AGG1 -> CSG.
When AGG1 goes Down, a primary/secondary PW switchover is performed. Traffic sent from
the NodeB goes through the CSGs to AGG2 through the new primary PW. AGG2 forwards
the traffic to RSG1 through the P device and RSG2. Then, RSG1 sends the traffic to the RNC.
The path for user-to-network traffic is CSG -> AGG2 -> P -> RSG2 -> RSG1 -> RNC, and
the path for network-to-user traffic is RNC -> RSG1 -> RSG2 -> P -> AGG2 -> CSG.


Figure 5-30 Traffic path after AGG1 goes Down
[Figure labels: CSG, Master -> Backup/Initialize, Backup -> Master, Master, Backup. Legend: User
traffic before AGG1 goes Down, User traffic after AGG1 goes Down]

As shown in Figure 5-31, when AGG1 recovers, a primary/secondary PW switchover is
performed, but a master/backup switchover is not performed in the mVRRP backup group.
Therefore, traffic sent from the NodeB goes through the CSGs and AGG1 to AGG2 over the
previous primary PW. AGG2 forwards the traffic to RSG1 through the P device and RSG2.
RSG1 then forwards the traffic to the RNC. The path for user-to-network traffic is CSG ->
AGG1 -> AGG2 -> P -> RSG2 -> RSG1 -> RNC, and the path for network-to-user traffic is
RNC -> RSG1 -> RSG2 -> P -> AGG2 -> AGG1 -> CSG.

Figure 5-31 Traffic path after AGG1 recovers
[Figure labels: CSG, Backup, Master]

When AGG1 recovers, it becomes the master device after a specified preemption delay
elapses. AGG2 then becomes the backup device. Traffic sent from the NodeB goes through
the CSGs to AGG1 over the previous primary PW. AGG1 sends the traffic to RSG1 through
the P device. RSG1 then sends the traffic to the RNC. The path for user-to-network traffic is
CSG -> AGG1 -> P -> RSG1 -> RNC, and the path for network-to-user traffic is RNC ->
RSG1 -> P -> AGG1 -> CSG.
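
The user-to-network paths in the phases above follow from two facts: which AGG terminates the active PW and which AGG is the mVRRP master. The following Python sketch reproduces the paths listed in this section. It is illustrative only: the function name is hypothetical, and the core-side segments are copied from the paths above rather than computed from a topology.

```python
# Core-side segments behind each AGG, taken from the paths listed above.
CORE_PATH = {
    "AGG1": ["P", "RSG1", "RNC"],
    "AGG2": ["P", "RSG2", "RSG1", "RNC"],
}


def user_to_network_path(pw_endpoint: str, vrrp_master: str) -> str:
    """Path of NodeB traffic for a given active PW endpoint and mVRRP master."""
    hops = ["CSG", pw_endpoint]
    if pw_endpoint != vrrp_master:
        # The PW terminates on the backup AGG, which relays traffic to the master.
        hops.append(vrrp_master)
    hops += CORE_PATH[vrrp_master]
    return " -> ".join(hops)


print(user_to_network_path("AGG1", "AGG1"))  # normal, and after preemption
print(user_to_network_path("AGG2", "AGG2"))  # AGG1 is Down
print(user_to_network_path("AGG1", "AGG2"))  # AGG1 recovered, before preemption
```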

5.4 Terminology for VRRP


Acronym Full Name


ARP Address Resolution Protocol

BFD Bidirectional Forwarding Detection

L2VPN Layer 2 virtual private network

L3VPN Layer 3 virtual private network

PW pseudo wire

VSI virtual switching instance

mVRRP management Virtual Router Redundancy Protocol

VRRP Virtual Router Redundancy Protocol


6 Ethernet OAM

About This Chapter

6.1 Overview of Ethernet OAM

6.2 Understanding EFM
6.3 Understanding CFM
6.4 Understanding Y.1731
6.5 Ethernet OAM Fault Advertisement
6.6 Application Scenarios for Ethernet OAM
6.7 Our Advantages

6.1 Overview of Ethernet OAM

Easy-to-use Ethernet techniques support good bandwidth expansibility on low-cost hardware.
With these advantages, Ethernet services and structures are the first choice for many
enterprise networks, metropolitan area networks (MANs), and wide area networks (WANs).
The increasing popularity of Ethernet applications encourages carriers to use improved
Ethernet OAM functions to maintain and operate Ethernet networks.

OAM mechanisms for server-layer services such as synchronous digital hierarchy (SDH) and
for client-layer services such as IP cannot be used on Ethernet networks. Ethernet OAM
differs from client- or server-layer OAM and has been developed to support the following
functions:
l Monitors Ethernet link connectivity.
l Pinpoints faults on Ethernet networks.
l Evaluates network usage and performance.
These functions help carriers provide services based on service level agreements (SLAs).

Ethernet operation, administration and maintenance (OAM) is used for Ethernet networks.

Ethernet OAM provides the following functions:


l Fault management
– Ethernet OAM sends detection packets on demand or periodically to monitor
network connectivity.
– Ethernet OAM uses methods similar to Packet Internet Groper (PING) and
traceroute used on IP networks to locate and diagnose faults on Ethernet networks.
– Ethernet OAM is used together with a protection switching protocol to trigger a
device or link switchover if a connectivity fault is detected. Switchovers help
networks achieve carrier-class reliability, by ensuring that network interruptions are
less than or equal to 50 milliseconds.
l Performance management
Ethernet OAM measures network transmission parameters including packet loss ratio,
delay, and jitter and collects traffic statistics including the numbers of sent and received
bytes and the number of frame errors. Performance management is implemented on
access devices. Carriers use this function to monitor network operation and dynamically
adjust parameters in real time based on statistical data. This process reduces maintenance
costs.

Ethernet OAM Network

Table 6-1 shows the hierarchical Ethernet OAM network structure.

Table 6-1 Ethernet OAM network

Layer: Link-level Ethernet OAM
Description: Monitors physical Ethernet links directly connecting carrier networks to user
networks. For example, Institute of Electrical and Electronics Engineers (IEEE) 802.3ah,
also known as Ethernet in the First Mile (EFM), supports Ethernet OAM for the last-mile
links and also monitors direct physical Ethernet links.
Feature: EFM supports link continuity check, fault detection, fault advertisement, and
loopback for P2P Ethernet link maintenance. Unlike CFM, which is used for a specific type
of service, EFM is used on links transmitting various services.
Usage Scenario: EFM is used on links between customer edges (CEs) and user-end provider
edges (UPEs) on a metropolitan area network (MAN) shown in Figure 6-1. It helps maintain
the reliability and stability of connections between a user network and a provider
network. EFM monitors and detects faults in P2P Ethernet physical links or simulated links.

Layer: Network-level Ethernet OAM
Description: Checks network connectivity, pinpoints connectivity faults, and monitors E2E
network performance at the access and aggregation layers. For example, IEEE 802.1ag (CFM)
and Y.1731.
Feature (CFM): IEEE 802.1ag, also known as connectivity fault management (CFM), defines
OAM functions, such as continuity check (CC), loopback (LB), and linktrace (LT), for
Ethernet bearer networks. CFM applies to large-scale E2E Ethernet networks.
Usage Scenario (CFM): CFM is used at the access and aggregation layers of the MAN shown in
Figure 6-1. For example, CFM monitors the link between a user-end provider edge (UPE) and
a PE. It monitors network-wide connectivity and detects connectivity faults. CFM is used
together with protection switchover mechanisms to maintain network reliability.
Feature (Y.1731): Y.1731 is an OAM protocol defined by the Telecommunication
Standardization Sector of the International Telecommunication Union (ITU-T). It covers
items defined in IEEE 802.1ag and provides additional OAM messages for fault management
and performance monitoring. Fault management includes alarm indication signal (AIS),
remote defect indication (RDI), locked signal (LCK), test signal, maintenance
communication channel (MCC), experimental (EXP) OAM, and vendor specific (VSP) OAM.
Performance monitoring includes frame loss measurement (LM) and delay measurement (DM).
Usage Scenario (Y.1731): Y.1731 is a CFM enhancement that applies to access and
aggregation networks. Y.1731 supports performance monitoring functions, such as LM and DM,
in addition to the fault management that CFM supports.
Issue 01 (2018-12-05) Copyright © Huawei Technologies Co., Ltd. 122

HUAWEI NE40E-M2 Series Universal Service Router
Feature Description - Network Reliability 6 Ethernet OAM

Figure 6-1 Typical MAN networking
(Figure labels: Business CE; EFM OAM (802.3ah); CFM (802.1ag)/Y.1731; MAN access to
aggregation layer; IP/MPLS network)

P2P EFM, E2E CFM, E2E Y.1731, and their combinations are used to provide a complete
Ethernet OAM solution, which brings the following benefits:
l Ethernet is deployed near user premises using remote terminals and roadside cabinets at
remote central offices or in unattended areas. Ethernet OAM allows engineers to run
detection, diagnosis, and monitoring protocols and techniques from remote locations to
maintain Ethernet networks. Remote maintenance saves the trouble of onsite maintenance
and helps reduce maintenance and operation expenditures.
l Ethernet OAM supports various performance monitoring tools that are used to monitor
network operation and assess service quality based on SLAs. If a device using the tools
detects faults, the device sends traps to a network management system (NMS). Carriers
use statistics and trap information on NMSs to adjust services. The tools help ensure
proper transmission of voice and data services.


6.2 Understanding EFM

6.2.1 Basic Concepts

EFM works at the data link layer and uses protocol packets called OAM protocol data units
(OAMPDUs). EFM devices periodically exchange OAMPDUs to report link status, helping
network administrators effectively manage networks. Figure 6-2 shows the format and
common types of OAMPDUs. Table 6-2 lists and describes fields in an OAMPDU.

Figure 6-2 OAMPDU format

Field           Dest addr  Source addr  Type  Subtype  Flags  Code  Data/Pad  CRC
Length (bytes)  6          6            2     1        2      1     42-1496   4

OAMPDU                       Code  Data
Information OAMPDU           0x00  Local info TLV, Remote info TLV, ...
Event Notification OAMPDU    0x01  Seq, Link event TLV, ...
Loopback Control OAMPDU      0x04  Loopback command

Table 6-2 Fields and descriptions in an OAMPDU

Field        Description

Dest addr    Destination MAC address, which is a slow-protocol multicast address
             0x0180-C200-0002. Network bridges cannot forward slow-protocol
             packets. EFM OAMPDUs cannot be forwarded over multiple devices,
             even if OAM is supported or enabled on the devices.

Source addr  Source address, which is a unicast MAC address of a port on the
             transmit end. If no port MAC address is specified on the transmit end,
             the bridge MAC address of the transmit end is used.

Type         Slow protocol type, which has a fixed value of 0x8809.

Subtype      Subtype of a slow protocol. The value is 0x03, which means that the
             slow sub-protocol is EFM.

Flags        Status of an EFM entity:
             l Remote Stable
             l Remote Evaluating
             l Local Stable
             l Local Evaluating
             l Critical Event
             l Link Fault

Code         OAMPDU type:
             l 0x00: Information OAMPDU
             l 0x01: Event Notification OAMPDU
             l 0x04: Loopback Control OAMPDU
Table 6-3 describes common types of OAMPDUs.

Table 6-3 OAMPDU types

OAMPDU Type          Description

Information          Used to discover the remote EFM entity and, after the connection is
OAMPDU               established, exchanged periodically to monitor link connectivity.

Event Notification   Used to monitor links. If an errored frame event, errored symbol
OAMPDU               period event, or errored frame seconds summary event occurs on
                     an interface, the interface sends an Event Notification OAMPDU
                     to notify the remote interface of the event.

Loopback Control     Used to enable or disable the remote loopback function.
OAMPDU
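
The field layout in Figure 6-2 and the fixed values in Table 6-2 can be illustrated with a short decoder. The following Python sketch is only an illustration under stated assumptions: the function name parse_oampdu_header and the returned dictionary keys are hypothetical, and the code reads only the 18 bytes that precede the Data/Pad field.

```python
import struct

SLOW_PROTOCOL_TYPE = 0x8809  # Type value listed in Table 6-2
EFM_SUBTYPE = 0x03           # Subtype value listed in Table 6-2
OAMPDU_CODES = {
    0x00: "Information",
    0x01: "Event Notification",
    0x04: "Loopback Control",
}


def parse_oampdu_header(frame: bytes) -> dict:
    """Decode the fixed fields that precede Data/Pad (hypothetical helper)."""
    if len(frame) < 18:
        raise ValueError("frame too short for an OAMPDU header")
    # 6-byte dest, 6-byte source, 2-byte type, 1-byte subtype,
    # 2-byte flags, 1-byte code, all in network byte order.
    dest, src, eth_type, subtype, flags, code = struct.unpack(
        "!6s6sHBHB", frame[:18]
    )
    if eth_type != SLOW_PROTOCOL_TYPE or subtype != EFM_SUBTYPE:
        raise ValueError("not an EFM OAMPDU")
    return {
        "dest": "-".join(f"{b:02x}" for b in dest),
        "source": "-".join(f"{b:02x}" for b in src),
        "flags": flags,
        "code": code,
        "code_name": OAMPDU_CODES.get(code, "unknown"),
    }
```

A caller would pass the raw frame starting at the destination MAC address. Codes other than the three listed in Table 6-2 are reported as "unknown" rather than rejected.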


Connection Modes
EFM supports two connection modes: active and passive. Table 6-4 describes capabilities of
processing OAMPDUs in the two modes.

Table 6-4 Capabilities of processing OAMPDUs in active and passive modes

Capability                                Active Mode                  Passive Mode

Initiate a connection request by          Supported                    Not supported
sending an Information OAMPDU
during the discovery process.

Respond to a connection request           Supported                    Supported
during the discovery process.

Send Information OAMPDUs.                 Supported                    Supported

Send Event Notification OAMPDUs.          Supported                    Supported

Send Loopback Control OAMPDUs.            Supported                    Not supported

Respond to Loopback Control               Supported (The remote        Supported
OAMPDUs.                                  EFM entity must work in
                                          active mode.)


l An EFM connection can be initiated only by an OAM entity working in active mode. An OAM
entity working in passive mode waits to receive a connection request from its peer entity. Two
OAM entities that both work in passive mode cannot establish an EFM connection between them.
l An OAM entity that is to initiate a loopback request must work in active mode.
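
The capability rules in Table 6-4 and the two notes above can be condensed into a small lookup. This is a minimal Python sketch, not device behavior: the names CAPABILITIES, can_establish_connection, and can_initiate_loopback are hypothetical and chosen only for illustration.

```python
ACTIVE = "active"
PASSIVE = "passive"

# Capability matrix transcribed from Table 6-4 (True = Supported).
CAPABILITIES = {
    "initiate_discovery": {ACTIVE: True, PASSIVE: False},
    "respond_to_discovery": {ACTIVE: True, PASSIVE: True},
    "send_information": {ACTIVE: True, PASSIVE: True},
    "send_event_notification": {ACTIVE: True, PASSIVE: True},
    "send_loopback_control": {ACTIVE: True, PASSIVE: False},
    "respond_to_loopback_control": {ACTIVE: True, PASSIVE: True},
}


def can_establish_connection(local_mode: str, remote_mode: str) -> bool:
    """At least one end must be able to initiate the discovery request."""
    return (
        CAPABILITIES["initiate_discovery"][local_mode]
        or CAPABILITIES["initiate_discovery"][remote_mode]
    )


def can_initiate_loopback(mode: str) -> bool:
    """Only an entity that may send Loopback Control OAMPDUs can start loopback."""
    return CAPABILITIES["send_loopback_control"][mode]
```

With this table, two passive entities fail can_establish_connection, which matches the first note, and a passive entity fails can_initiate_loopback, which matches the second.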

6.2.2 Background
As telecommunication technologies develop quickly and the demand for service diversity
increases, various user-oriented teleservices are being provided over digital and intelligent
media through broadband paths. Backbone network technologies, such as synchronous digital
hierarchy (SDH), asynchronous transfer mode (ATM), passive optical network (PON), and
dense wavelength division multiplexing (DWDM), have grown mature and popular. These
technologies allow voice, data, and video services to be transmitted over a single path to
every home. Telecommunication experts and carriers focus on using existing network
resources to support new types of services and improve the service quality. The key point is to
provide a solution to the last-mile link to a user network.
A "last mile" reliability solution also needs to be provided. High-end clients, such as banks
and financial companies, demand high reliability. They expect carriers to monitor both carrier
networks and last-mile links that connect users to those carrier networks. EFM can be used to
satisfy these demands.

Figure 6-3 EFM network
(Figure labels: Services, Access, Metro; Infrastructure: EFM; Service: CFM/Y.1731;
Subscriber: CFM/Y.1731)


On the network shown in Figure 6-3, EFM is an OAM mechanism that applies to the last-
mile Ethernet access links to users. Carriers use EFM to monitor link status in real time,
rapidly locate failed links, and identify fault types if faults occur. OAM entities exchange
various OAMPDUs to monitor link connectivity and locate link faults.

6.2.3 Basic Functions

EFM supports OAM discovery, link monitoring, fault notification, and remote loopback. The
following example illustrates EFM implementation on the network shown in Figure 6-4. The
customer edge (CE) is a device in a customer equipment room, and provider edge 1 (PE1) is a
carrier device. EFM is used to monitor the link connecting the CE to PE1, allowing the carrier
to remotely monitor link connectivity and quality.

Figure 6-4 Typical EFM network
(Figure labels: User Side, Network Side, Port 1, Port 2, PE2)

OAM Discovery
During the discovery phase, a local EFM entity discovers and establishes a stable EFM
connection with a remote EFM entity. Figure 6-5 shows the discovery process.


Figure 6-5 OAM discovery

Initial status of both ends: Discovery. The CE works in active (proactive) mode.
1. The CE sends an Information OAMPDU carrying the local EFM settings.
2. After receiving the OAMPDU, PE1 compares the received EFM information with its local
   settings.
3. PE1 sends an Information OAMPDU carrying the local and remote EFM settings and a flag
   indicating whether the settings are satisfied.
4. The CE checks whether the received EFM settings match its local settings. If they
   match, the EFM session enters the Detect state. If they do not match, the CE repeats
   step 1 to initiate negotiation, or stops negotiation if EFM is disabled locally.
5. The CE periodically sends Information OAMPDUs to maintain the connection.
6. PE1 enters the Detect state, establishes a connection, and exchanges Information
   OAMPDUs with the CE to maintain the connection.

EFM entities at both ends of an EFM connection periodically exchange Information
OAMPDUs to monitor link connectivity. The interval at which Information OAMPDUs are
sent is also known as an interval between handshakes. If an EFM entity does not receive
Information OAMPDUs from the remote EFM entity within the connection timeout period,
the EFM entity considers the connection interrupted and sends a trap to the network
management system (NMS). Establishing an EFM connection is a way to monitor physical
link connectivity automatically.
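
The handshake timeout described above can be sketched as a small state holder. The following Python example is a minimal sketch under stated assumptions: the class name EfmSession, its method names, and the timeout value used by a caller are hypothetical, and timestamps are passed in explicitly so that the logic stays independent of any clock source.

```python
from dataclasses import dataclass


@dataclass
class EfmSession:
    """Tracks when the last Information OAMPDU arrived (hypothetical model)."""

    timeout: float        # connection timeout period in seconds, chosen by the caller
    last_rx: float = 0.0  # time at which the last Information OAMPDU was received

    def on_information_oampdu(self, now: float) -> None:
        # Every received Information OAMPDU refreshes the handshake timer.
        self.last_rx = now

    def is_interrupted(self, now: float) -> bool:
        # The connection is considered interrupted once no Information OAMPDU
        # has arrived within the timeout period.
        return now - self.last_rx > self.timeout
```

In this model, the code that polls is_interrupted would be the place to raise the trap toward the NMS.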

Link Monitoring
Monitoring Ethernet links is difficult if network performance deteriorates while traffic is
being transmitted over physical links. To resolve this issue, the EFM link monitoring function
can be used. This function can detect data link layer faults in various environments. EFM
entities that are enabled with link monitoring exchange Event Notification OAMPDUs to
monitor links.


If an EFM entity receives a link event listed in Table 6-5, it sends an Event Notification
OAMPDU to notify the remote EFM entity of the event and also sends a trap to an NMS.
After receiving the trap on the NMS, an administrator can determine the network status and
take remedial measures as needed.

Table 6-5 Common link events and their descriptions

Common Link Event: Errored symbol period event
Description: If the number of symbol errors that occur on a device's interface during a
specified period of time reaches a specified upper limit, the device generates an errored
symbol period event, advertises the event to the remote device, and sends a trap to the
NMS.
Usage Scenario: This event helps the device detect code errors during data transmission
at the physical layer.

Common Link Event: Errored frame event
Description: If the number of frame errors that occur on a device's interface during a
specified period of time reaches a specified upper limit, the device generates an errored
frame event, advertises the event to the remote device, and sends a trap to the NMS.
Usage Scenario: This event helps the device detect frame errors that occur during data
transmission at the MAC sublayer.

Common Link Event: Errored frame seconds summary event
Description: An errored frame second is a one-second interval wherein at least one frame
error is detected. If the number of errored frame seconds that occur during a specified
period of time reaches a specified upper limit on a device's interface, the device
generates an errored frame seconds summary event, advertises the event to the remote
device, and sends a trap to the NMS.
Usage Scenario: This event helps the device detect errored frame seconds that occur
during data transmission at the MAC sublayer.
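
The difference between an errored frame event and an errored frame seconds summary event is easiest to see with numbers. The Python sketch below assumes that per-second frame error counts for the monitoring period are already available as a list. The function names and thresholds are hypothetical, and "reaches a specified upper limit" is read as greater than or equal to the threshold.

```python
from typing import Sequence


def errored_frame_seconds(errors_per_second: Sequence[int]) -> int:
    """Count the one-second intervals that contain at least one frame error."""
    return sum(1 for count in errors_per_second if count > 0)


def errored_frame_event(errors_per_second: Sequence[int], threshold: int) -> bool:
    """True if the total number of frame errors in the period reaches the threshold."""
    return sum(errors_per_second) >= threshold


def errored_frame_seconds_summary_event(
    errors_per_second: Sequence[int], threshold: int
) -> bool:
    """True if the number of errored frame seconds in the period reaches the threshold."""
    return errored_frame_seconds(errors_per_second) >= threshold
```

For a five-second period with error counts 0, 3, 0, 1, 0, the period holds four frame errors but only two errored frame seconds, so the two events can trigger independently depending on the configured limits.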

Fault Notification
After the OAM discovery phase finishes, two EFM entities at both ends of an EFM
connection exchange Information OAMPDUs to monitor link connectivity. If traffic is
interrupted due to a remote device failure, the remote EFM entity sends an Information
OAMPDU carrying an event listed in Table 6-6 to the local EFM entity. After receiving the
notification, the local EFM entity sends a trap to the NMS. An administrator can view the trap
on the NMS to determine link status and take measures to rectify the fault.


Table 6-6 Critical link events

Critical Link Event  Description

Link fault           If a loss of signal (LoS) error occurs because the interval at
                     which OAMPDUs are sent elapses or a physical link fails, the
                     local device sends a trap to the NMS.

Critical event       If an unidentified critical event occurs because a fault is detected
                     using association between the remote EFM entity and a specific
                     feature, the local device sends a trap to the NMS. Remote EFM
                     entities can be associated with protocols, including Bidirectional
                     Forwarding Detection (BFD), connectivity fault management
                     (CFM), and Multiprotocol Label Switching (MPLS) OAM.

Remote Loopback
Figure 6-6 demonstrates the principles of remote loopback. When a local interface sends non-
OAMPDUs to a remote interface, the remote interface loops the non-OAMPDUs back to the
local interface, not to the destination addresses of the non-OAMPDUs. This process is called
remote loopback. An EFM connection must be established to implement remote loopback.

Figure 6-6 Principles of EFM remote loopback
(Figure labels: Port 1 (Active mode); Port 2 (Passive mode); Data flow: all packets
except EFM OAMPDUs)

A device enabled with remote loopback discards all data frames except OAMPDUs, causing a
service interruption. To prevent impact on services, use remote loopback to check link
connectivity and quality before a new network is used or after a link fault is rectified.
The local device calculates communication quality parameters such as the packet loss ratio on
the current link based on the numbers of sent and received packets. Figure 6-7 shows the
remote loopback process.


Figure 6-7 Remote loopback process

1. The local device, working in active (proactive) mode, sends a Loopback Control OAMPDU
   carrying a remote loopback request.
2. After receiving the OAMPDU, PE1 determines whether to enter the loopback state:
   - If not, PE1 discards the Loopback Control OAMPDU and forwards data frames as usual.
   - If yes, PE1 stops forwarding data frames and goes to step 3.
3. PE1 sends an Information OAMPDU indicating that the request is accepted.
4. The local device enters the loopback state.
5. The local device sends a loopback test packet.
6. PE1 loops the test packet back to the initiator.
7. The local device compares the number of sent packets with the number of received
   packets and checks the link status based on the result.

If the local device attempts to stop remote loopback, it sends a message to instruct the remote
device to disable remote loopback. After receiving the message, the remote device disables
remote loopback.
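
The quality calculation that the local device performs during remote loopback reduces to a comparison of two counters. The following Python sketch shows the arithmetic for the packet loss ratio only; the function name loopback_loss_ratio is hypothetical.

```python
def loopback_loss_ratio(sent: int, received: int) -> float:
    """Return the fraction of test packets that were not looped back."""
    if sent <= 0:
        raise ValueError("at least one test packet must be sent")
    if not 0 <= received <= sent:
        raise ValueError("received count must be between 0 and sent")
    return (sent - received) / sent
```

For example, if 1000 test packets are sent and 990 are looped back, the ratio is (1000 - 990) / 1000, that is, 1% of the packets were lost on the link.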

If remote loopback is left enabled, the remote device keeps looping back service data, causing
a service interruption. To prevent this issue, a capability can be configured to disable remote
loopback automatically after a specified timeout period. After the timeout period expires, the
local device automatically sends a message to instruct the remote device to disable remote
loopback.

6.3 Understanding CFM

6.3.1 Basic Concepts

Maintenance Domain
MDs are discrete areas within which connectivity fault detection is enabled. The boundary of
an MD is determined by MEPs configured on interfaces. An MD is identified by an MD
name.


To help locate faults, MDs are divided into levels 0 through 7. A larger value indicates a
higher level, and a higher level MD covers a larger area. One MD can be tangential to
another MD. Tangential MDs share a single device and this device has one interface in each
of the MDs. A lower level MD can be nested in a higher level MD. A nested MD must be fully
contained in the other MD, and the two MDs cannot partially overlap. A higher level MD
cannot be nested in a lower level MD.
Classifying MDs based on levels facilitates fault diagnosis. MD2 is nested in MD1 on the
network shown in Figure 6-8. If a fault occurs in MD1, PE2 through PE6 and all the links
between the PEs are checked. If no fault is detected in MD2, PE2, PE3, and PE4 are working
properly. This means that the fault is on PE5, PE6, or PE7 or on a link between these PEs.
In actual network scenarios, a nested MD can monitor the connectivity of the higher level MD
in which it is nested. Level settings allow 802.1ag packets to transparently travel through a
nested MD. For example, on the network shown in Figure 6-8, MD2 with the level set to 3 is
nested in MD1 with the level set to 6. 802.1ag packets must transparently pass through MD2
to monitor the connectivity of MD1. The level setting allows 802.1ag packets to pass through
MD2 to monitor the connectivity of MD1 but prevents 802.1ag packets that monitor MD2
connectivity from passing through MD1. Setting levels for MDs helps locate faults.
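
The level rule in the MD1/MD2 example can be written as a single comparison. The Python sketch below only restates that example: a packet passes transparently through an MD whose level is lower than the level carried in the packet. The function name passes_through is hypothetical, and the sketch does not model how a MEP handles packets at its own level.

```python
def passes_through(packet_level: int, md_level: int) -> bool:
    """True if an 802.1ag packet at packet_level transparently crosses an MD at md_level."""
    for level in (packet_level, md_level):
        if not 0 <= level <= 7:
            raise ValueError("MD levels range from 0 to 7")
    return packet_level > md_level


MD1_LEVEL = 6  # outer MD in Figure 6-8
MD2_LEVEL = 3  # nested MD in Figure 6-8
```

With these values, passes_through(MD1_LEVEL, MD2_LEVEL) is true, so packets that monitor MD1 cross MD2, while passes_through(MD2_LEVEL, MD1_LEVEL) is false, so packets that monitor MD2 stay inside it.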

Figure 6-8 MDs
(Figure labels: MD1 (Level=6), MD2 (Level=3), PE5)

802.1ag packets are exchanged and CFM functions are implemented based on MDs. Properly
planned MDs help a network administrator locate faults.

Default MD
According to IEEE Std 802.1ag-2007, a single default MD with the highest priority can be
configured for each device.


Figure 6-9 Default MDs
(Figure labels: MD1 (Level=6), MD2 (Level=3), default MD)

On the network shown in Figure 6-9, if default MDs with the same level as the higher level
MDs are configured on devices in lower level MDs, MIPs are generated based on the default
MDs to reply to requests sent by devices in higher level MDs. CFM detects topology changes
and monitors the connectivity of both higher and lower level MDs.

The default MD must have a higher level than all MDs to which MEPs configured on the
local device belong. The default MD must also be of the same level as a higher level MD. The
default MD transmits high level request messages and generates MIPs to send responses.

IEEE Std 802.1ag-2007 states that one default MD can be configured on each device and
associated with multiple virtual local area networks (VLANs). VLAN interfaces can
automatically generate MIPs based on the default MDs and a creation rule.

Maintenance Association
Multiple MAs can be configured in an MD as needed. Each MA contains MEPs. An MA is
uniquely identified by an MD name and an MA name.

An MA serves a specific service such as a VLAN. A MEP in an MA sends packets carrying
tags of the specific service and receives packets sent by other MEPs in the MA.

Maintenance Association End Point

MEPs are located at the edge of an MD and MA. The service type and level of packets sent by
a MEP are determined by the MD and MA to which the MEP belongs. A MEP processes
packets at specific levels based on its own level. A MEP sends packets carrying its own level.
If a MEP receives a packet carrying a level higher than its own, the MEP does not process the
packet and loops it along the reverse path. If a MEP receives a packet carrying a level lower
than or equal to its own, the MEP processes the packet.

A MEP is configured on an interface. The MEP level is equal to the MD level.

A MEP configured on an Ethernet CFM-enabled device is called a local MEP. MEPs
configured on other devices in the same MA are called remote maintenance association end
points (RMEPs).

MEPs are classified into the following types:

l Inward-facing MEP: sends packets to other interfaces on the same device.
l Outward-facing MEP: sends packets out of the interface on which the MEP is
configured.
Figure 6-10 shows inward- and outward-facing MEPs.

Figure 6-10 Inward- and outward-facing MEPs

Inward-facing MEP Outward-facing MEP


Maintenance Association Intermediate Point

MIPs are located on a link between two MEPs within an MD, facilitating management. More
MIPs result in easier network management and control. Carriers set up more MIPs for
important services than for common services.
MIP creation modes
MIPs can be automatically generated based on rules or manually created on interfaces. Table
6-7 describes MIP creation modes.

Table 6-7 MIP creation modes

Creation Mode   Description

Manual          Only IEEE Std 802.1ag-2007 supports manual MIP configuration. The
configuration   MIP level must be set. Manually configured MIPs are preferable to
                automatically generated MIPs. Although configuring MIPs manually
                is easy, managing many manually configured MIPs is difficult and
                errors may occur.

Automatic       A device automatically generates MIPs based on configured creation
creation        rules. Configuring creation rules is complex, but properly configured
                rules ensure correct MIP settings.

The following part describes automatic MIP creation principles.

Automatic MIP creation principles

A device automatically generates MIPs based on creation rules, which are configurable.
Creation rules are classified as explicit, default, or none rules, as listed in Table 6-8.


Table 6-8 MIP creation rules

Version                 Manually Configured   Creation   MEPs Are Configured   MIPs Are
                        MIPs Exist on an      Rule       for Low-Level MDs     Created
                        Interface

IEEE Std 802.1ag-2007   Yes                   -          -                     No
                        No                    Default    No                    Yes
                        No                    Explicit   Yes                   Yes
                        No                    None       -                     -


The procedure for identifying a lower level MD is as follows:

1. Identify a service instance associated with the MD.
2. Query all interfaces in the service instance and check whether MEPs are configured on these
interfaces.
3. Query levels of all MEPs and locate the MEP with the highest level.

MIPs are separately calculated in each service instance such as a VLAN. In a single service
instance, MAs in MDs with different levels have the same VLAN ID but different levels.
For each service instance of each interface, the device attempts to calculate a MIP from the
lowest level MEP based on the rules listed in Table 6-8 and the following conditions:
l Each MD on a single interface has a specific level and is associated with multiple
creation rules. The creation rule with the highest priority applies. An explicit rule has a
higher priority than a default rule, and a default rule takes precedence over a none rule.
l The level of a MIP must be higher than that of any MEP on the same interface.
l An explicit rule applies to an interface only when MEPs are configured on the interface.
l A single MIP can be generated on a single interface. If multiple rules for generating
MIPs with different levels can be used, a MIP with the lowest level is generated.
MIP creation rules help detect and locate faults by level.
For example, CCMs are sent to detect a fault in a level 7 MD on the network shown in Figure
6-11. Loopback or linktrace is used to locate the fault in the link between MIPs that are in a
level 5 MD. This process is repeated until the faulty link or device is located.


Figure 6-11 Hierarchical MIPs in MDs
(Figure labels: Level 7: explicit rule; Level 5: explicit rule; Level 3: default rule)

The following example illustrates how to create a MIP based on a default rule defined in
IEEE Std 802.1ag-2007.
On the network shown in Figure 6-12, MD1 through MD5 are nested in MD7, and MD2
through MD5 are nested in MD1. MD7 has a higher level than MD1 through MD5, and MD1
has a higher level than MD2 through MD5. Multiple MEPs are configured on Device A in
MD1, and the MEPs belong to MDs with different levels.


Figure 6-12 MIP creation based on IEEE Std 802.1ag-2007
(Figure labels: MD2 (Level=5), MD3 (Level=4), ...; MEP2 of MA2, level 5; MEP3 of MA3,
level 4; MEP4 of MA4, level 3; MEP5 of MA5, level 2)

A default rule is configured on Device A to create a MIP in MD1. The procedure for creating
the MIP is as follows:

1. Device A compares MEP levels and finds the MEP at level 5, the highest level. The
MEP level is determined by the level of the MD to which the MEP belongs.
2. Device A selects the MD at level 6, which is higher than the MEP of level 5.
3. Device A generates a MIP at level 6.

If MDs at level 6 or higher do not exist, no MIP is generated.

If MIPs at level 1 already exist on Device A, MIPs at level 6 cannot be generated.
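
The three-step procedure and the two conditions that follow it can be sketched as one function. This Python example is a simplified illustration under stated assumptions: the function name default_rule_mip_level and its parameters are hypothetical, it covers only the default rule, and it treats any existing MIP on the interface as blocking a new one because a single MIP can be generated on a single interface.

```python
from typing import Iterable, Optional


def default_rule_mip_level(
    mep_levels: Iterable[int],
    md_levels: Iterable[int],
    existing_mip_level: Optional[int] = None,
) -> Optional[int]:
    """Return the level of the MIP to generate, or None if no MIP is generated."""
    if existing_mip_level is not None:
        # Only one MIP can exist on an interface.
        return None
    # Step 1: find the highest MEP level (-1 stands for "no MEP configured").
    highest_mep = max(mep_levels, default=-1)
    # Step 2: keep the MDs whose level is higher than the highest MEP level.
    candidates = [level for level in md_levels if level > highest_mep]
    # Step 3: generate the MIP at the lowest qualifying level, if any.
    return min(candidates) if candidates else None
```

For Device A, the MEP levels are 5, 4, 3, and 2. If MDs at levels 6 and 7 are available, the function selects level 6; if no MD at level 6 or higher exists, or a MIP already exists on the interface, it returns None.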

Hierarchical MP Maintenance
MEPs and MIPs are maintenance points (MPs). MPs are configured on interfaces and belong
to specific MAs shown in Figure 6-13.


Figure 6-13 MPs


Inward MEP
Outward MEP

The scope of maintenance performed and the types of maintenance services depend on the
need of the organizations that use carrier-class Ethernet services. These organizations include
leased line users, service providers, and network carriers. Users purchase Ethernet services
from service providers, and service providers use their networks or carrier networks to
provide E2E Ethernet services. Carriers provide transport services.
Figure 6-14 shows locations of MEPs and MIPs and maintenance domains for users, service
providers, and carriers.

Figure 6-14 Hierarchical MPs
(Figure labels: Operator 1: level 3; Operator 2: level 4; Service provider: level 5;
Customer: level 6; legend: Inward MEP, Outward MEP)

Operator 1, operator 2, the service provider, and the customer use MDs with levels 3, 4, 5, and
6, respectively. A higher MD level indicates a larger MD.


CFM Packet Format

CFM sends tagged protocol packets to detect link faults. Figure 6-15 shows the CFM packet
format.
Figure 6-15 CFM packet format

Field lengths in bits, in figure order: 3, 5, 8, 8, 8, 32, 8
Fields, in figure order: MD Level, Version, OpCode, Flags, First TLV offset,
Varies with value of OpCode, Data/Pad, End TLV

Packet  OpCode  OpCode-specific fields
CCM     0x01    Sequence number, MEP ID, MA ID, Defined by, Optional
LBR     0x02    transaction ID, Optional LBR TLVs
LBM     0x03    transaction ID, Optional LBM TLVs
LTR     0x04    transaction ID, Reply TTL, Relay action, Additional LTM TLVs
LTM     0x05    LTM transaction ID, LTM TTL, Original MAC, Target MAC, Additional LTM TLVs

Table 6-9 describes the fields in a CFM packet.

Table 6-9 Fields in a CFM packet and their meanings

Field                  Description

MD Level               Level of an MD. The value ranges from 0 to 7. A larger value
                       indicates a higher level.

Version                Number of the CFM version. The current version is 0.

OpCode                 Message code value, specifying a specific type of CFM packet.
                       Table 6-10 describes the types of CFM packets.

Varies with value of   Fields that vary with the message code.
OpCode

Table 6-10 Types of CFM packets

OpCode  Packet Type                     Function
0x01    Continuity check message (CCM)  Used for monitoring E2E link connectivity.
0x02    Loopback reply (LBR) message    Reply to a Loopback message (LBM). LBRs are sent by
                                        local nodes enabled with loopback.
0x03    Loopback message (LBM)          Sent by an interface that initiates loopback.
0x04    Linktrace reply (LTR) message   Reply to a Linktrace message (LTM). LTRs are sent by
                                        local nodes enabled with linktrace.
0x05    Linktrace message (LTM)         Sent by an interface to initiate a linktrace test.
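
The common header fields and OpCode values described above can be decoded with a few lines of Python. The following is a minimal sketch, not a device implementation. The function name parse_cfm_common_header and the sample octets are hypothetical, and the sketch assumes that MD Level occupies the three high-order bits of the first octet and Version the remaining five bits.

```python
import struct

# OpCode values as listed in Table 6-10.
OPCODES = {0x01: "CCM", 0x02: "LBR", 0x03: "LBM", 0x04: "LTR", 0x05: "LTM"}


def parse_cfm_common_header(pdu: bytes) -> dict:
    """Decode the first four octets of a CFM PDU (hypothetical helper)."""
    if len(pdu) < 4:
        raise ValueError("the CFM common header needs at least 4 octets")
    first, opcode, flags, first_tlv_offset = struct.unpack("!BBBB", pdu[:4])
    return {
        "md_level": first >> 5,    # assumed: upper 3 bits of the first octet
        "version": first & 0x1F,   # assumed: lower 5 bits of the first octet
        "opcode": OPCODES.get(opcode, f"unknown(0x{opcode:02x})"),
        "flags": flags,
        "first_tlv_offset": first_tlv_offset,
    }


# Sample octets chosen for illustration: MD level 6, version 0, OpCode 0x01.
header = parse_cfm_common_header(bytes([0xC0, 0x01, 0x04, 70]))
print(header)
```

The sample input yields an MD level of 6 and the CCM packet type, which matches the customer-level MD used in the examples in this chapter.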

6.3.2 Background
IP-layer mechanisms, such as Simple Network Management Protocol (SNMP), IP ping, and
IP traceroute, are used to manage network-wide services, detect faults, and monitor
performance on traditional Ethernet networks. These mechanisms are unsuitable for client-
layer E2E Ethernet operation and management.

Figure 6-16 Typical CFM network

Customer Service Provider Customer


CFM supports service management, fault detection, and performance monitoring on the E2E
Ethernet network. In Figure 6-16:
• A network is logically divided into maintenance domains (MDs). For example, network
devices that a single Internet service provider (ISP) manages are in a single MD to
distinguish between ISP and user networks.
• Two maintenance association end points (MEPs) are configured on both ends of a
management network segment to be maintained to determine the boundary of an MD.
• Maintenance association intermediate points (MIPs) can be configured as needed. A
MEP initiates a test request, and the remote MEP (RMEP) or MIP responds to the
request. This process provides information about the management network segment to
help detect faults.
CFM supports level-specific MD management. An MD at a given level can manage MDs at
lower levels but cannot manage an MD at a higher level than its own. Level-specific MD
management is used to maintain a service flow based on level-specific MDs and different
types of service flows in an MD.

6.3.3 Basic Functions

CFM supports continuity check (CC), loopback (LB), and linktrace (LT) functions.

Continuity Check
CC monitors the connectivity of links between MEPs. A MEP periodically sends multicast
continuity check messages (CCMs) to an RMEP in the same MA. If an RMEP does not
receive a CCM within a period 3.5 times the interval at which CCMs are sent, the RMEP
considers the path between itself and the MEP faulty.

Figure 6-17 CC





MEP CCMs sent by MEP1

CCMs sent by MEP2
CCMs sent by MEP3

The CC process is as follows:

1. CCM generation
A MEP generates and sends CCMs. MEP1, MEP2, and MEP3 are in the same MA on
the network shown in Figure 6-17 and are enabled to send CCMs to one another at the
same interval.
Each CCM carries a level equal to the MEP level.
2. MEP database establishment
Every Ethernet CFM-enabled device has a MEP database. A MEP database records
information about the local MEP and RMEPs in the same MA. The local MEP and
RMEPs are manually configured, and their information is automatically recorded in the
MEP database.
3. Fault identification
If a MEP does not receive CCMs from its RMEP within a period 3.5 times the interval at
which CCMs are sent, the MEP considers the path between itself and the RMEP faulty.
A log is generated to provide information for fault diagnosis. A user can implement
loopback or linktrace to locate the fault. MEPs in an MA exchange CCMs to monitor
links, implementing multipoint to multipoint (MP2MP) detection.
4. CCM processing
If a MEP receives a CCM carrying a level higher than the local level, it forwards this
CCM. If a MEP receives a CCM carrying a level lower than the local level, it does not
forward this CCM. This process prevents a lower level CCM from being sent to a higher
level MD.
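
The fault identification and CCM processing rules above can be sketched in Python as follows. This is an illustrative sketch only. The function names are hypothetical, times are in seconds, and the "process" result for a CCM whose level equals the local level is an assumption based on the statement that each CCM carries a level equal to the MEP level.

```python
def loc_timeout(ccm_interval_s: float) -> float:
    """Return the detection period: 3.5 times the CCM transmission interval."""
    return 3.5 * ccm_interval_s


def is_path_faulty(last_ccm_rx_s: float, now_s: float, ccm_interval_s: float) -> bool:
    """Report a fault if no CCM has arrived from the RMEP within the period."""
    return (now_s - last_ccm_rx_s) > loc_timeout(ccm_interval_s)


def handle_ccm(ccm_level: int, local_level: int) -> str:
    """Decide what a MEP does with a received CCM, based on MD levels."""
    if ccm_level > local_level:
        return "forward"      # higher-level CCM: forwarded
    if ccm_level < local_level:
        return "drop"         # lower-level CCM: not forwarded
    return "process"          # assumed: same level, handled by the local MEP


print(is_path_faulty(last_ccm_rx_s=0.0, now_s=3.6, ccm_interval_s=1.0))
print(handle_ccm(ccm_level=6, local_level=3))
```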

Loopback
Loopback is also called 802.1ag MAC ping. Similar to IP ping, loopback monitors the
connectivity of a path between a local MEP and an RMEP.
A MEP initiates an 802.1ag MAC ping test to monitor the reachability of an RMEP or MIP
destination address. The MEP, MIP, and RMEP have the same level and they can share an
MA or be in different MAs. The MEP sends Loopback messages (LBMs) to the RMEP or
MIP. After receiving the messages, the RMEP or MIP replies with loopback replies (LBRs).
Loopback helps locate a faulty node because a faulty node cannot send an LBR in response to
an LBM. LBMs and LBRs are unicast packets.
The following example illustrates the implementation of loopback on the network shown in
Figure 6-18.

Figure 6-18 Loopback


MEP1:6 MIP1:6 MIP2:6 MEP2:6

LBM data flow
LBR data flow

CFM is configured to monitor a path between PE1 (MEP1) and PE4 (MEP2). The MD level
of these MEPs is 6. A MIP with a level of 6 is configured on PE2 and PE3. If a fault is
detected in a link between PE1 and PE4, loopback can be used to locate the fault. Figure 6-19
illustrates the loopback process.

Figure 6-19 Loopback process


1. MEP1 sends an LBM carrying the MAC address of MIP1.
2. MIP1 replies with an LBR.
3. MEP1 receives the LBR and considers the link reachable.
4. MEP1 sends an LBM carrying the MAC address of MIP2.
5. MIP2 replies with an LBR.
6. MEP1 receives the LBR and considers the link reachable.
7. MEP1 sends an LBM with the MAC address or MEP ID of MEP2.
8. MEP1 does not receive an LBR within the timeout period and considers the link faulty.

MEP1 can measure the network delay based on 802.1ag MAC ping results or the frame loss
ratio based on the difference between the number of LBMs and the number of LBRs.
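
The frame loss ratio and the hop-by-hop fault location described above can be sketched as follows. The function names and the sample reply map are hypothetical. The sketch assumes that targets are tested in path order and that a missing LBR means the target is unreachable.

```python
def frame_loss_ratio(lbms_sent: int, lbrs_received: int) -> float:
    """Frame loss ratio from the difference between LBMs sent and LBRs received."""
    if lbms_sent <= 0:
        raise ValueError("at least one LBM must be sent")
    return (lbms_sent - lbrs_received) / lbms_sent


def first_unreachable(targets, replied):
    """Return the first target (in path order) that sent no LBR, or None."""
    for target in targets:
        if not replied.get(target, False):
            return target
    return None


# Hypothetical results matching the loopback process in Figure 6-19.
replies = {"MIP1": True, "MIP2": True, "MEP2": False}
print(frame_loss_ratio(lbms_sent=10, lbrs_received=8))
print(first_unreachable(["MIP1", "MIP2", "MEP2"], replies))
```

Because MIP2 replies and MEP2 does not, the fault lies between MIP2 and MEP2 in this example.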

Linktrace
Linktrace is also called 802.1ag MAC trace. Similar to IP traceroute, linktrace identifies a
path between two MEPs.
A MEP initiates an 802.1ag MAC trace test to monitor a path to an RMEP or MIP destination
address. The MEP, MIP, and RMEP have the same level and they can share an MA or be in
different MAs. A source MEP constructs and sends a Linktrace message (LTM) to a
destination MEP. After receiving this message, each MIP forwards it and replies with a
linktrace reply (LTR). Upon receipt, the destination MEP replies with an LTR and does not
forward the LTM. The source MEP obtains topology information about each hop on the path
based on the LTRs. LTMs are multicast packets and LTRs are unicast packets.

Figure 6-20 Linktrace




MEP LTM data flow

MIP LTR data flow

The following example illustrates the implementation of linktrace on the network shown in
Figure 6-20.
1. MEP1 sends MEP2 an LTM carrying a time to live (TTL) value and the MAC address of
the destination MEP2.
2. After the LTM arrives at MIP1, MIP1 reduces the TTL value in the LTM by 1 and
forwards the LTM if the TTL is not zero. MIP1 then replies with an LTR to MEP1. The
LTR carries forwarding information and the TTL value carried by the LTM when MIP1
received it.
3. After the LTM reaches MIP2 and MEP2, the process described above for MIP1 is
repeated for MIP2 and MEP2. In addition, MEP2 determines that its MAC address is the
destination address carried in the LTM and therefore does not forward the LTM.
4. The LTRs from MIP1, MIP2, and MEP2 provide MEP1 with information about the
forwarding path between MEP1 and MEP2.
If a fault occurs on the path between MEP1 and MEP2, MEP2 or a MIP cannot receive
the LTM or reply with an LTR. MEP1 can locate the faulty node based on such a
response failure. For example, if the link between MEP1 and MIP2 works properly but
the link between MIP2 and MEP2 fails, MEP1 can receive LTRs from MIP1 and MIP2
but fails to receive a reply from MEP2. MEP1 then considers the path between MIP2 and
MEP2 faulty.
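
The linktrace steps above can be modeled with a short simulation. This is a sketch under stated assumptions, not the device algorithm. The function linktrace and its failed_link_after parameter are hypothetical, and each LTR is modeled as reporting the TTL value that the LTM carried when the node received it, as described in step 2.

```python
def linktrace(path, target, ttl, failed_link_after=None):
    """Simulate an LTM traveling along path and return the LTRs received.

    path: node names in order after the source MEP.
    target: destination MEP, which replies but does not forward the LTM.
    failed_link_after: hypothetical fault marker; the link after this node is down.
    """
    if ttl < 1:
        raise ValueError("the initial TTL must be at least 1")
    ltrs = []
    for node in path:
        ltrs.append((node, ttl))   # LTR carries the TTL as received
        ttl -= 1                   # the node reduces the TTL by 1
        if node == target or ttl == 0 or node == failed_link_after:
            break                  # the LTM is not forwarded any further
    return ltrs


path = ["MIP1", "MIP2", "MEP2"]
print(linktrace(path, "MEP2", ttl=64))
print(linktrace(path, "MEP2", ttl=64, failed_link_after="MIP2"))
```

In the second call, MEP1 receives LTRs from MIP1 and MIP2 only, which points to the link between MIP2 and MEP2.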

6.3.4 CFM Alarms

Alarm Types
If CFM detects a fault in an E2E link, it triggers an alarm and sends the alarm to the network
management system (NMS). A network administrator uses the information to troubleshoot.
Table 6-11 describes alarms supported by CFM.

Table 6-11 Alarms supported by CFM

Alarm Name                               Description
hwDot1agCfmUnexpectedMEGLevel            A MEP receives a CCM frame with an incorrect MEG level.
hwDot1agCfmUnexpectedMEGLevelCleared     During an interval equal to 3.5 times the CCM transmission
                                         period, a MEP does not receive CCM frames with an incorrect
                                         MEG level.
hwDot1agCfmMismerge                      A MEP receives a CCM frame with a correct MEG level but an
                                         incorrect MEG ID.
hwDot1agCfmMismergeCleared               During an interval equal to 3.5 times the CCM transmission
                                         period, a MEP does not receive CCM frames with an incorrect
                                         MEG ID.
hwDot1agCfmUnexpectedMEP                 A MEP receives a CCM frame with a correct MEG level and MEG
                                         ID but an unexpected MEP ID.
hwDot1agCfmUnexpectedMEPCleared          During an interval equal to 3.5 times the CCM transmission
                                         period, a MEP does not receive CCM frames with an unexpected
                                         MEP ID.
hwDot1agCfmUnexpectedPeriod              A MEP receives a CCM frame with a correct MEG level, MEG ID,
                                         and MEP ID but a Period field value different from its own
                                         CCM transmission period.
hwDot1agCfmUnexpectedPeriodCleared       During an interval equal to 3.5 times the CCM transmission
                                         period, a MEP does not receive CCM frames with an incorrect
                                         Period field value.
hwDot1agCfmUnexpectedMAC                 A MEP receives a CCM carrying a source MAC address different
                                         from the locally specified RMEP's MAC address.
hwDot1agCfmUnexpected-                   The alarm about RMEP MAC inconsistency is cleared.
hwDot1agCfmLOC                           During an interval equal to 3.5 times the CCM transmission
                                         period, a MEP does not receive CCM frames from a peer MEP.
hwDot1agCfmLOCCleared                    During an interval equal to 3.5 times the CCM transmission
                                         period, a MEP receives n CCM frames from a peer MEP.
hwDot1agCfmExceptionalMACStatus          Status type-length-value (TLV) information carried in a CCM
                                         sent by an RMEP indicates that the interface connecting the
                                         RMEP to the MEP does not work properly.
hwDot1agCfmExceptionalMACStatusCleared   Status TLV information carried in a CCM sent by an RMEP
                                         indicates that the interface connecting the RMEP to the MEP
                                         is restored.
hwDot1agCfmRDI                           A MEP receives a CCM frame with the RDI field set.
hwDot1agCfmRDICleared                    A MEP receives a CCM frame with the RDI field cleared.


Alarm Anti-jitter
Multiple alarms and clear alarms may be generated on an unstable network enabled with CC.
These alarms consume system resources and deteriorate system performance. An RMEP
activation time can be set to prevent false alarms, and an alarm anti-jitter time can be set to
limit the number of alarms generated.

Table 6-12 Alarm anti-jitter

Function                     Description
RMEP activation              Prevents false alarms. A local MEP with the ability to receive CCMs
                             can accept CCMs only after the RMEP activation time elapses.
Alarm anti-jitter time       If a MEP detects a connectivity fault:
                             • It sends an alarm to the NMS after the anti-jitter time elapses.
                             • It does not send an alarm if the fault is rectified before the
                               anti-jitter time elapses.
Alarm clearing anti-jitter   If a MEP detects a link fault and sends an alarm:
                             • It sends a clear alarm if the fault is rectified within a specified
                               alarm clearing anti-jitter time.
                             • It does not send a clear alarm if the fault is not rectified within
                               a specified alarm clearing anti-jitter time.
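
The RMEP activation and alarm anti-jitter rows can be read as simple timing checks. The following sketch illustrates that reading only. The function names are hypothetical, times are in seconds, and treating the exact moment at which a timer elapses as "elapsed" is an assumption.

```python
def accepts_ccm(rmep_configured_at_s: float, now_s: float, activation_s: float) -> bool:
    """A local MEP accepts CCMs only after the RMEP activation time elapses."""
    return (now_s - rmep_configured_at_s) >= activation_s


def alarm_due(fault_start_s: float, now_s: float, anti_jitter_s: float,
              fault_still_present: bool) -> bool:
    """Send an alarm only if the fault outlasts the alarm anti-jitter time."""
    return fault_still_present and (now_s - fault_start_s) >= anti_jitter_s


# A fault that is rectified after 2 s does not raise an alarm with a 5 s anti-jitter time.
print(alarm_due(fault_start_s=0.0, now_s=5.0, anti_jitter_s=5.0, fault_still_present=False))
# A fault that is still present after 5 s raises an alarm.
print(alarm_due(fault_start_s=0.0, now_s=5.0, anti_jitter_s=5.0, fault_still_present=True))
```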

Alarm Suppression
If different types of faults trigger more than one alarm, CFM alarm suppression allows the
alarm with the highest level to be sent to the NMS. If alarms persist after the alarm with the
highest level is cleared, the alarm with the second highest level is sent to the NMS. The
process repeats until all alarms are cleared.

The principles of CFM alarm suppression are as follows:

• Alarms with high levels require immediate troubleshooting.
• A single fault may trigger alarms with different levels. After the alarm with the highest
level is cleared, alarms with lower levels may also be cleared.
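
Alarm suppression amounts to reporting only the highest-level active alarm and repeating the selection whenever an alarm is cleared. The sketch below illustrates this. The numeric levels assigned to the alarm names are hypothetical placeholders because this section does not list the actual alarm levels, and the function name alarm_to_report is also an assumption.

```python
def alarm_to_report(active_alarms: dict):
    """Return the active alarm with the highest level, or None if none is active."""
    if not active_alarms:
        return None
    return max(active_alarms, key=active_alarms.get)


# Hypothetical levels, for illustration only (a larger number means a higher level).
active = {"hwDot1agCfmLOC": 3, "hwDot1agCfmRDI": 1}

reported = []
while active:
    name = alarm_to_report(active)
    reported.append(name)      # this alarm is sent to the NMS
    del active[name]           # the alarm is cleared, so the next level is evaluated

print(reported)
```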

6.4 Understanding Y.1731

6.4.1 Background
EFM and CFM are used to detect link faults. Y.1731 is an enhancement of CFM and is used to
monitor service performance.

Figure 6-21 Typical Y.1731 networking

Figure 6-21 shows typical Y.1731 networking. Y.1731 performance monitoring tools can be
used to assess the quality of the purchased Ethernet tunnel services or help a carrier conduct
regular service level agreement (SLA) monitoring.

6.4.2 Basic Functions

Function Overview
Y.1731 can manage fault information and monitor performance.
• Fault management functions include continuity check (CC), loopback (LB), and linktrace
(LT). The principles of Y.1731 fault management are the same as those of CFM fault
management.
• Performance monitoring functions include single- and dual-ended frame loss
measurement, one- and two-way frame delay measurement, alarm indication signal
(AIS), Ethernet test function (ETH-Test), single-ended synthetic loss measurement
(SLM), Ethernet lock signal function (ETH-LCK), and ETH-BN on virtual private LAN
service (VPLS) networks, virtual leased line (VLL) networks, and virtual local area
networks (VLANs). Kompella VPLS and VLL scenarios support AIS only.

Table 6-13 Y.1731 functions

Function: Single-ended Frame Loss Measurement
Description: Collects frame loss statistics to assess the quality of links between MEPs,
independent of continuity check (CC).

Function: Dual-ended Frame Loss Measurement
Description: Collects frame loss statistics to assess link quality on CFM CC-enabled devices.

Usage Scenario (single- and dual-ended frame loss measurement): To collect frame loss
statistics, select either single- or dual-ended frame loss measurement:
• Dual-ended frame loss measurement provides more accurate results than single-ended
frame loss measurement. The interval between dual-ended frame loss measurements varies
with the interval between CCM transmissions. Because the CCM transmission interval is
shorter than the interval between single-ended frame loss measurements, dual-ended frame
loss measurement allows for a short measurement interval.
• Single-ended frame loss measurement can be used to minimize the impact of many CCMs
on the network.

Function: One-way Frame Delay Measurement
Description: Measures the network delay on a unidirectional link between MEPs.

Function: Two-way Frame Delay Measurement
Description: Measures the network delay on a bidirectional link between MEPs.

Usage Scenario (one- and two-way frame delay measurement): To measure the link delay,
select either one- or two-way frame delay measurement:
• One-way frame delay measurement can be used to measure the delay on a unidirectional
link between a MEP and its RMEP. The MEP must synchronize its time with its RMEP.
• Two-way frame delay measurement can be used to measure the delay on a bidirectional
link between a MEP and its RMEP. The MEP does not need to synchronize its time with its
RMEP.

Function: AIS
Description: Detects server-layer faults and suppresses alarms, minimizing the impact on
network management systems (NMSs).
Usage Scenario: AIS is used to suppress local alarms when faults must be rapidly detected.

Function: ETH-Test
Description: Verifies bandwidth throughput and bit errors.
Usage Scenario:
• ETH-Test is used for a carrier to verify the throughput and bit errors for a newly
established link.
• ETH-Test is used for a user to verify the throughput and bit errors for a leased link.

Function: ETH-LCK
Description: Informs the server-layer (sub-layer) MEP of administrative locking and the
interruption of traffic destined for the MEP in the inner maintenance domain.
Usage Scenario: The ETH-LCK function must work with the ETH-Test function.

Function: Single-ended Synthetic Loss Measurement (SLM)
Description: Collects frame loss statistics on point-to-multipoint or E-Trunk links to
monitor link quality.
Usage Scenario: Single-ended synthetic frame LM is used to collect accurate frame loss
statistics on point-to-multipoint links.

Function: ETH-BN
Description: Enables server-layer MEPs to notify client-layer MEPs of the server layer's
connection bandwidth when routing devices connect to microwave devices. The server-layer
devices are microwave devices, and the client-layer devices are routing devices. Routing
devices can only function as ETH-BN packets' receive ends and must work with microwave
devices to implement this function.
Usage Scenario: When routing devices connect to microwave devices, enable the ETH-BN
receiving function on the routing devices to associate bandwidth with the microwave
devices.

Ethernet frame loss measurement (ETH-LM) enables a local MEP and its RMEP to exchange
ETH-LM frames to collect frame loss statistics on E2E links. ETH-LM modes are classified
as near- or far-end ETH-LM.

Near-end ETH-LM applies to an inbound interface, and far-end ETH-LM applies to an
outbound interface on a MEP. ETH-LM counts the number of errored frame seconds to
determine the duration during which a link is unavailable.

ETH-LM supports the following methods:

• Single-ended frame loss measurement
This method measures frame loss proactively or on demand.
– On-demand measurement collects single-ended frame loss statistics once or a
specified number of times for diagnosis.
– Proactive measurement collects single-ended frame loss statistics periodically.
A local MEP sends a loss measurement message (LMM) carrying an ETH-LM request to
its RMEP. After receiving the request, the RMEP responds with a loss measurement
reply (LMR) carrying an ETH-LM response. Figure 6-22 illustrates the process for
single-ended frame loss measurement.

Figure 6-22 Single-ended frame loss measurement





After single-ended frame loss measurement is enabled, a MEP on PE1 sends an RMEP
on PE2 an ETH-LMM carrying an ETH-LM request. The MEP then receives an ETH-
LMR message carrying an ETH-LM response from the RMEP on PE2. The ETH-LMM
carries TxFCf, which is the value of the local transmit counter TxFCl at the time when
the message was sent by the local MEP. After receiving the ETH-LMM, PE2 replies with
an ETH-LMR message, which carries the following information:
– TxFCf: copied from the ETH-LMM
– RxFCf: value of the local counter RxFCl at the time of ETH-LMM reception
– TxFCb: value of the local counter TxFCl at the time of ETH-LMR transmission
After receiving the ETH-LMR message, PE1 measures near- and far-end frame loss
based on the following values:
– TxFCf, RxFCf, and TxFCb values in the received ETH-LMR message and the value of
the local counter RxFCl at the time when this ETH-LMR message was received. These
values are represented as TxFCf[tc], RxFCf[tc], TxFCb[tc], and RxFCl[tc].
tc is the time when this ETH-LMR message was received.
– TxFCf, RxFCf, and TxFCb values in the previously received ETH-LMR message and
the value of the local counter RxFCl at the time when that ETH-LMR message was
received. These values are represented as TxFCf[tp], RxFCf[tp], TxFCb[tp], and
RxFCl[tp].
tp is the time when the previous ETH-LMR message was received.
Far-end frame loss = |TxFCf[tc] - TxFCf[tp]| - |RxFCf[tc] - RxFCf[tp]|
Near-end frame loss = |TxFCb[tc] - TxFCb[tp]| - |RxFCl[tc] - RxFCl[tp]|
Service packets are prioritized based on 802.1p priorities and are transmitted using
different policies. Traffic passing through the P device on the network shown in Figure
6-23 carries 802.1p priorities of 1 and 2.
Single-ended frame loss measurement is enabled on PE1 to send traffic with a priority of
1 to measure frame loss on a link between PE1 and PE2. Traffic with a priority of 2 is
also sent. After receiving traffic with priorities of 1 and 2, the P device forwards the
traffic with the higher priority first, delaying the arrival of traffic with a priority of 1
at PE2. As a result, the frame loss ratio is inaccurate.
802.1p priority-based single-ended frame loss measurement can be enabled to obtain
accurate results.

Figure 6-23 802.1p priority-based single-ended frame loss measurement


User User
Network Network

Priority 1
Priority 2

• Dual-ended frame loss measurement

This method measures frame loss periodically, implementing error management. Each
MEP sends its RMEP a dual-ended ETH-LM message. After receiving an ETH-LM
message, a MEP collects near- and far-end frame loss statistics but does not forward the
ETH-LM message. Figure 6-24 illustrates the process for dual-ended frame loss
measurement.

Figure 6-24 Dual-ended frame loss measurement





After dual-ended frame loss measurement is configured, each MEP periodically sends a
CCM carrying a request to its RMEP. After receiving the CCM, the RMEP collects near-
and far-end frame loss statistics but does not forward the message. The CCM carries the
following information:
– TxFCf: value of the local counter TxFCl at the time of CCM transmission
– RxFCb: value of the local counter RxFCl at the time of the reception of the last
CCM
– TxFCb: value of TxFCf in the last received CCM
PE1 uses received information to measure near- and far-end frame loss based on the
following values:
– TxFCf, RxFCb, and TxFCb values in the received CCM and the value of the local
counter RxFCl at the time when this CCM was received. These values are represented as
TxFCf[tc], RxFCb[tc], TxFCb[tc], and RxFCl[tc].
tc is the time when this CCM was received.
– TxFCf, RxFCb, and TxFCb values in the previously received CCM and the value of the
local counter RxFCl at the time when that CCM was received. These values are
represented as TxFCf[tp], RxFCb[tp], TxFCb[tp], and RxFCl[tp].
tp is the time when the previous CCM was received.
Far-end frame loss = |TxFCb[tc] - TxFCb[tp]| - |RxFCb[tc] - RxFCb[tp]|
Near-end frame loss = |TxFCf[tc] - TxFCf[tp]| - |RxFCl[tc] - RxFCl[tp]|
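
The single- and dual-ended calculations can be expressed in Python as follows. This is a minimal sketch with hypothetical class and function names and made-up counter values. Each sample holds the counters seen at one reception time (tp or tc), and the dual-ended near-end loss is computed as |TxFCf[tc] - TxFCf[tp]| - |RxFCl[tc] - RxFCl[tp]|, the form used in ITU-T Y.1731. Counter wraparound is not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LmrSample:
    """Counters taken from one ETH-LMR plus the local RxFCl at reception."""
    tx_fcf: int
    rx_fcf: int
    tx_fcb: int
    rx_fcl: int


@dataclass(frozen=True)
class CcmSample:
    """Counters taken from one CCM plus the local RxFCl at reception."""
    tx_fcf: int
    rx_fcb: int
    tx_fcb: int
    rx_fcl: int


def single_ended_loss(prev: LmrSample, curr: LmrSample):
    """Return (far_end_loss, near_end_loss) between two ETH-LMRs."""
    far = abs(curr.tx_fcf - prev.tx_fcf) - abs(curr.rx_fcf - prev.rx_fcf)
    near = abs(curr.tx_fcb - prev.tx_fcb) - abs(curr.rx_fcl - prev.rx_fcl)
    return far, near


def dual_ended_loss(prev: CcmSample, curr: CcmSample):
    """Return (far_end_loss, near_end_loss) between two CCMs."""
    far = abs(curr.tx_fcb - prev.tx_fcb) - abs(curr.rx_fcb - prev.rx_fcb)
    near = abs(curr.tx_fcf - prev.tx_fcf) - abs(curr.rx_fcl - prev.rx_fcl)
    return far, near


# Made-up counter values for illustration.
single = single_ended_loss(LmrSample(100, 100, 200, 200), LmrSample(200, 198, 300, 297))
dual = dual_ended_loss(CcmSample(500, 400, 400, 500), CcmSample(600, 499, 500, 596))
print(single, dual)
```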

Delay measurement (DM) measures the delay and its variation. A MEP sends its RMEP a
message carrying ETH-DM information and receives a response message carrying ETH-DM
information from its RMEP.
ETH-DM supports the following modes:
• One-way frame delay measurement
A MEP sends its RMEP a 1DM message carrying one-way ETH-DM information. After
receiving this message, the RMEP measures the one-way frame delay and its variation.
The one-way frame delay can be measured only if the MEP synchronizes its time with its
RMEP. The delay variation can be measured regardless of whether the time is
synchronized. If the time is not synchronized, only the one-way delay variation can be
measured.
One-way frame delay measurement can be implemented in either of the following
modes:
– On-demand measurement: calculates the one-way frame delay once or a
specified number of times for diagnosis.
– Proactive measurement: calculates the one-way frame delay periodically.
Figure 6-25 illustrates the process for one-way frame delay measurement.

Figure 6-25 One-way frame delay measurement





One-way frame delay measurement is implemented on an E2E link between a local MEP
and its RMEP. The local MEP sends 1DMs to the RMEP, and the RMEP calculates the
delay. After one-way frame delay measurement is configured, a MEP periodically
sends 1DMs carrying TxTimeStampf (the time when the 1DM was sent). After receiving
the 1DM, the RMEP parses TxTimeStampf and compares this value with RxTimef (the
time when the 1DM was received). The RMEP calculates the one-way frame delay
based on these values using the following equation:
Frame delay = RxTimef - TxTimeStampf
The frame delay can be used to measure the delay variation.
A delay variation is an absolute difference between two delays.
802.1p priorities carried in service packets are used to prioritize services. Traffic passing
through the P device on the network shown in Figure 6-26 carries 802.1p priorities of 1
and 2.
One-way frame delay measurement is enabled on PE1 to send traffic with a priority of 1
to measure the frame delay on a link between PE1 and PE2. Traffic with a priority of 2 is
also sent. After receiving traffic with priorities of 1 and 2, the P device forwards the
traffic with the higher priority first, delaying the arrival of traffic with a priority of 1
at PE2. As a result, the frame delay calculated on PE2 is inaccurate.
802.1p priority-based one-way frame delay measurement can be enabled to obtain
accurate results.

Figure 6-26 802.1p priority-based one-way frame delay measurement



User User
Network Network


Priority 1
Priority 2

• Two-way frame delay measurement

A MEP sends its RMEP a delay measurement message (DMM) carrying an ETH-DM
request. After receiving the DMM, the RMEP sends the MEP a delay measurement reply
(DMR) carrying an ETH-DM response.
Two-way frame delay measurement can be implemented in either of the following
modes:
– On-demand measurement: calculates the two-way frame delay once for
diagnosis.
– Proactive measurement: calculates the two-way frame delay periodically.
Figure 6-27 illustrates the process for two-way frame delay measurement.

Figure 6-27 Two-way frame delay measurement






In two-way frame delay measurement, a local MEP sends a delay measurement message
(DMM) to its RMEP and then receives a DMR from the RMEP.
After two-way frame delay measurement is configured, a MEP periodically sends
DMMs carrying TxTimeStampf (the time when the DMM was sent). After receiving the
DMM, the RMEP replies with a DMR message. This message carries RxTimeStampf
(the time when the DMM was received) and TxTimeStampb (the time when the DMR
was sent). The value in every field of the DMM is copied to the DMR, with the
exception that the source and destination MAC addresses are interchanged. Upon
receipt of the DMR message, the MEP calculates the two-way frame delay using the
following equation, in which RxTimeb is the time when the DMR was received:
Frame delay = (RxTimeb - TxTimeStampf) - (TxTimeStampb - RxTimeStampf)
The frame delay can be used to measure the delay variation.
A delay variation is an absolute difference between two delays.
802.1p priorities carried in service packets are used to prioritize services. Traffic passing
through the P device on the network shown in Figure 6-28 carries 802.1p priorities of 1
and 2.
Two-way frame delay measurement is enabled on PE1 to send traffic with a priority of 1
to measure the frame delay on a link between PE1 and PE2. Traffic with a priority of 2 is
also sent. After receiving traffic with priorities of 1 and 2, the P device forwards the
traffic with the higher priority first, delaying the arrival of traffic with a priority of 1
at PE2. As a result, the calculated frame delay is inaccurate.
802.1p priority-based two-way frame delay measurement can be enabled to obtain
accurate results.

Figure 6-28 802.1p priority-based two-way frame delay measurement




User User
Network Network

Priority 1
Priority 2
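
The one-way and two-way delay equations and the delay variation can be sketched in Python as follows. The function names and the sample timestamps are hypothetical, and timestamps are plain integers in microseconds for simplicity. In the two-way case, rx_time_b stands for the time when the MEP received the DMR.

```python
def one_way_delay(tx_timestamp_f: int, rx_time_f: int) -> int:
    """Frame delay = RxTimef - TxTimeStampf (requires synchronized clocks)."""
    return rx_time_f - tx_timestamp_f


def two_way_delay(tx_timestamp_f: int, rx_timestamp_f: int,
                  tx_timestamp_b: int, rx_time_b: int) -> int:
    """Frame delay = (RxTimeb - TxTimeStampf) - (TxTimeStampb - RxTimeStampf)."""
    return (rx_time_b - tx_timestamp_f) - (tx_timestamp_b - rx_timestamp_f)


def delay_variation(delay_a: int, delay_b: int) -> int:
    """A delay variation is the absolute difference between two delays."""
    return abs(delay_b - delay_a)


# Made-up timestamps in microseconds.
d1 = two_way_delay(tx_timestamp_f=1000, rx_timestamp_f=1400,
                   tx_timestamp_b=1450, rx_time_b=1900)
d2 = two_way_delay(tx_timestamp_f=2000, rx_timestamp_f=2420,
                   tx_timestamp_b=2460, rx_time_b=2920)
print(d1, d2, delay_variation(d1, d2))
```

Subtracting the RMEP's processing time (TxTimeStampb - RxTimeStampf) is what removes the need for clock synchronization in the two-way case, because each difference is taken on a single device's clock.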

AIS is a protocol used to transmit fault information.

On the user network shown in Figure 6-29, a MEP in MD1 with a level of 6 is configured on
the access interface of each of CE1 and CE2. On the carrier network, a MEP in MD2 with a
level of 3 is configured on the access interface of each of PE1 and PE2.
• If CFM detects a fault in the link between AIS-enabled PEs, CFM sends AIS packet data
units (PDUs) to CEs. After receiving the AIS PDUs, the CEs suppress alarms,
minimizing the impact of a large number of alarms on a network management system
(NMS).
• After the link between the PEs recovers, the PEs stop sending AIS PDUs. If the CEs do
not receive AIS PDUs during a period of 3.5 times the interval at which AIS PDUs are
sent, the CEs cancel the alarm suppression function.

Figure 6-29 AIS principles

CE1 AIS packets PE1 PE2 AIS packets CE2


MD2 Level 3

MD1 Level 6

ETH-Test is used to perform one-way on-demand in-service or out-of-service diagnostic tests
on the throughput, frame loss, and bit errors.

The implementation of these tests is as follows:

• Verifying throughput and frame loss: Throughput means the maximum bandwidth of a
link without packet loss. When you use ETH-Test to verify the throughput, a MEP sends
frames with ETH-Test information at a preconfigured traffic rate and collects frame loss
statistics for a specified period. If the statistical results show that the number of sent
frames is greater than the number of received frames, frame loss occurs. The MEP sends
frames at a lower rate until no frame loss occurs. The traffic rate measured at the time
when no packet loss occurs is the throughput of this link.
• Verifying bit errors: ETH-Test is implemented by verifying the cyclic redundancy code
(CRC) of the Test TLV field carried in ETH-Test frames. For the ETH-Test
implementation, four types of test patterns can be specified in the test TLV field: Null
signal without CRC-32, Null signal with CRC-32, PRBS 2^31-1 without CRC-32, and
PRBS 2^31-1 with CRC-32. Null signal indicates an all-0s signal. PRBS, pseudo random
binary sequence, is used to simulate white noise. A MEP sends ETH-Test frames
carrying the calculated CRC value to the RMEP. After receiving the ETH-Test frames,
the RMEP recalculates the CRC value. If the recalculated CRC value is different from
the CRC value carried in the sent ETH-Test frames, bit errors occur.
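
The two tests above can be sketched in Python as follows. This is an illustration only. The callback send_and_count and the helper fake_link are hypothetical stand-ins for a real test run, and zlib.crc32 is used as a stand-in CRC-32 whose coverage may differ from the CRC carried in the Test TLV.

```python
import zlib


def find_throughput(start_rate, step, send_and_count):
    """Lower the traffic rate until one test period shows no frame loss.

    send_and_count(rate) is a hypothetical callback that returns
    (frames_sent, frames_received) for one test period at that rate.
    """
    rate = start_rate
    while rate > 0:
        sent, received = send_and_count(rate)
        if received >= sent:       # no frame loss at this rate
            return rate
        rate -= step               # frame loss seen: retry at a lower rate
    return 0


def has_bit_errors(test_data: bytes, carried_crc: int) -> bool:
    """Recalculate the CRC and compare it with the value carried in the frame."""
    return zlib.crc32(test_data) != carried_crc


def fake_link(rate):
    """Hypothetical link that delivers at most 70 rate units without loss."""
    sent = rate * 10
    return sent, min(rate, 70) * 10


throughput = find_throughput(start_rate=100, step=10, send_and_count=fake_link)

null_signal = bytes(64)                        # all-0s test pattern
crc = zlib.crc32(null_signal)                  # CRC calculated by the sender
corrupted = bytes([0x01]) + null_signal[1:]    # one bit changed in transit
print(throughput, has_bit_errors(null_signal, crc), has_bit_errors(corrupted, crc))
```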

ETH-Test provides two types of test modes: out-of-service ETH-Test and in-service ETH-
Test.
• Out-of-service ETH-Test mode: Client data traffic is interrupted in the diagnosed entity.
To resolve this issue, the out-of-service ETH-Test function must be used together with
the ETH-LCK function.
• In-service ETH-Test mode: Client data traffic is not interrupted, and the frames with the
ETH-Test information are transmitted using part of the bandwidth.

ETH-LCK is used for administrative locking on the MEP in the outer MD with a higher level
than the inner MD, that is, preventing CC alarms from being generated in the outer MD.
When implementing ETH-LCK, a MEP in the inner MD sends frames with the ETH-LCK
information to the MEP in the outer MD. After receiving the frames with the ETH-LCK
information, the MEP in the outer MD can differentiate the alarm suppression caused by
administrative locking from the alarm suppression caused by a fault in the inner MD (the AIS
To prevent CC alarms from being generated in the outer MD, ETH-LCK is implemented
with out-of-service ETH-Test. A MEP in the inner MD with a lower level initiates ETH-Test
by sending an ETH-LCK frame to a MEP in the outer MD. Upon receipt of the ETH-LCK
frame, the MEP in the outer MD suppresses all CC alarms immediately and reports an
ETH-LCK alarm indicating administrative locking. Until out-of-service ETH-Test is
complete, the MEP in the inner MD keeps sending ETH-LCK frames to the MEP in the outer
MD. After out-of-service ETH-Test is complete, the MEP in the inner MD stops sending
ETH-LCK frames. If the MEP in the outer MD does not receive ETH-LCK frames for a period
3.5 times as long as the specified interval, it releases the alarm suppression and reports
a clear ETH-LCK alarm.
As shown in Figure 6-30, MD2 with the level of 3 is configured on PE1 and PE2; MD1 with
the level of 6 is configured on CE1 and CE2. When PE1's MEP1 sends out-of-service ETH-
Test frames to PE2's MEP2, MEP1 also sends ETH-LCK frames to CE1's MEP11 and CE2's
MEP22 separately to prevent MEP11 and MEP22 from generating CC alarms. When MEP1
stops sending out-of-service ETH-Test frames, it also stops sending ETH-LCK frames. If
MEP11 and MEP22 do not receive ETH-LCK frames for a period 3.5 times as long as
the specified interval, they release the alarm suppression.
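The release rule (no ETH-LCK frame for 3.5 times the interval) can be sketched as a small timer. The class and method names are hypothetical, and timestamps are passed in explicitly so that the behavior is easy to follow; a real MEP would use its own clock.

```python
class LckSuppression:
    """Tracks whether an outer-MD MEP should keep its CC alarms suppressed."""

    TIMEOUT_FACTOR = 3.5  # suppression is released after 3.5 x the ETH-LCK interval

    def __init__(self, interval_s: float):
        self.interval_s = interval_s
        self.last_lck_at = None  # no ETH-LCK frame received yet

    def on_lck_frame(self, now: float) -> None:
        """Record the arrival time of an ETH-LCK frame."""
        self.last_lck_at = now

    def cc_alarms_suppressed(self, now: float) -> bool:
        """Return True while ETH-LCK frames are still arriving in time."""
        if self.last_lck_at is None:
            return False
        return (now - self.last_lck_at) < self.TIMEOUT_FACTOR * self.interval_s

mep11 = LckSuppression(interval_s=1.0)
mep11.on_lck_frame(now=0.0)
print(mep11.cc_alarms_suppressed(now=3.0))  # True: still within 3.5 s
print(mep11.cc_alarms_suppressed(now=4.0))  # False: suppression released
```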

Figure 6-30 ETH-LCK

MD1 (Level=6)

CE1 PE1 MD2 (Level=3) PE2 CE2


Single-ended ETH-SLM
SLM measures frame loss using synthetic frames instead of data traffic. When implementing
SLM, the local MEP exchanges frames containing ETH-SLM information with one or more
RMEPs.
Figure 6-31 demonstrates the process of single-ended SLM:
1. The local MEP sends ETH-SLM request frames to the RMEPs.
2. After receiving the ETH-SLM request frames, the RMEPs send ETH-SLM reply frames
to the local MEP.
A frame with the single-ended ETH-SLM request information is called an SLM, and a frame
with the single-ended ETH-SLM reply information is called an SLR. SLM frames carry SLM
protocol data units (PDUs), and SLR frames carry SLR PDUs.
Single-ended SLM and single-ended frame LM are differentiated as follows: On the point-to-
multipoint network shown in Figure 6-31, inward MEPs are configured on PE1's and PE3's
interfaces, and single-ended frame LM is performed on the PE1-PE3 link. Traffic coming
through PE1's interface is destined for both PE2 and PE3, and single-ended frame LM will
collect frame loss statistics for all traffic, including the PE1-to-PE2 traffic. As a result, the
collected statistics are not accurate. Unlike single-ended frame LM, single-ended SLM
collects frame loss statistics only for the PE1-to-PE3 traffic, which is more accurate.

Figure 6-31 Single-ended SLM


When implementing single-ended SLM, PE1 sends SLM frames to PE3 and receives SLR
frames from PE3. SLM frames contain TxFCf, the value of TxFCl (frame transmission
counter), indicating the frame count at the transmit time. SLR frames contain the following
values:
l TxFCf: value of TxFCl (frame transmission counter) indicating the frame count on PE1
upon the SLM transmission
l TxFCb: value of RxFCl (frame receive counter) indicating the frame count on PE3 upon
the SLR transmission
After receiving the last SLR frame during a measurement period, a MEP on PE1 measures the
near-end and far-end frame loss based on the following values:
l Last received SLR's TxFCf and TxFCb, and value of RxFCl (frame receive counter)
indicating the frame count on PE1 upon the SLR reception. These values are represented
as TxFCf[tc], TxFCb[tc], and RxFCl[tc].
tc indicates the time when the last SLR frame was received during the measurement
period.
l Previously received SLR's TxFCf and TxFCb, and value of RxFCl (frame receive
counter) indicating the frame count on PE1 upon the SLR reception. These values are
represented as TxFCf[tp], TxFCb[tp], and RxFCl[tp].
tp indicates the time when the last SLR frame was received during the previous
measurement period.
Far-end frame loss = |TxFCf[tc] – TxFCf[tp]| – |TxFCb[tc] – TxFCb[tp]|
Near-end frame loss = |TxFCb[tc] – TxFCb[tp]| – |RxFCl[tc] – RxFCl[tp]|
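As a worked example of the two formulas, the following sketch computes far-end and near-end frame loss from the counter values of two SLR frames. The SlrSample type and the sample numbers are made up for illustration; only the arithmetic comes from the formulas above.

```python
from dataclasses import dataclass

@dataclass
class SlrSample:
    """Counter values captured when an SLR frame is received on PE1."""
    tx_fcf: int  # TxFCf: frames sent by PE1 when the SLM was transmitted
    tx_fcb: int  # TxFCb: frames counted on PE3 when the SLR was transmitted
    rx_fcl: int  # RxFCl: frames counted on PE1 when the SLR was received

def frame_loss(prev: SlrSample, curr: SlrSample) -> tuple[int, int]:
    """Return (far-end loss, near-end loss) between times tp and tc."""
    far_end = abs(curr.tx_fcf - prev.tx_fcf) - abs(curr.tx_fcb - prev.tx_fcb)
    near_end = abs(curr.tx_fcb - prev.tx_fcb) - abs(curr.rx_fcl - prev.rx_fcl)
    return far_end, near_end

tp = SlrSample(tx_fcf=1000, tx_fcb=990, rx_fcl=985)    # previous measurement period
tc = SlrSample(tx_fcf=1100, tx_fcb=1088, rx_fcl=1082)  # current measurement period

far_end, near_end = frame_loss(tp, tc)
print(far_end)   # 2: PE1 sent 100 frames, PE3 counted 98
print(near_end)  # 1: PE3 counted 98 frames, PE1 counted 97
```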
On a network, each packet carries the IEEE 802.1p field, indicating its priority. According to
packet priority, different QoS policies will be applied. On the network shown in Figure 6-32,
the PE1-to-PE3 traffic has two priorities: 1 and 2, as indicated by the IEEE 802.1p field.
When implementing single-ended SLM for traffic over the PE1-PE3 link, PE1 sends SLM
frames with varied priorities and checks the frame loss. Based on the check result, the
network administrator can adjust the QoS policy for the link.

Figure 6-32 Single-ended SLM based on different 802.1p priorities

User Network
Network Y.1731 CE3

Priority 1
Priority 2

Ethernet bandwidth notification (ETH-BN) enables server-layer MEPs to notify client-layer
MEPs of the server layer's connection bandwidth when routing devices connect to microwave
devices. The server-layer devices are microwave devices, which dynamically adjust the
bandwidth according to the prevailing atmospheric conditions. The client-layer devices are
routing devices. Routing devices can only function as ETH-BN packets' receive ends and
must work with microwave devices to implement this function.
As shown in Figure 6-33, server-layer MEPs are configured on the server-layer devices, and
the ETH-BN sending function is enabled. The levels of client-layer MEPs must be specified
for the server-layer MEPs when the ETH-BN sending function is enabled. Client-layer MEPs
are configured on the client-layer devices, and the ETH-BN receiving function is enabled. The
levels of the client-layer MEPs are the same as those specified for the server-layer MEPs.
l If the ETH-BN function has been enabled on the server-layer devices Device2 and
Device3 and the bandwidth of the server-layer devices' microwave links decreases, the
server-layer devices send ETH-BN packets to the client-layer devices (Device1 and
Device4). After receiving the ETH-BN packets, the client-layer MEPs can use
bandwidth information in the packets to adjust service policies, for example, to reduce
the rate of traffic sent to the degraded links.

l When the server-layer devices' microwave links work properly, whether to send ETH-
BN packets is determined by the configuration of the server-layer devices. When the
server-layer microwave devices stop sending ETH-BN packets, the client-layer devices
do not receive any ETH-BN packets. The ETH-BN data on the client-layer devices is
aged after 3.5 times the interval at which ETH-BN packets are sent.
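The receive-side behavior can be sketched as follows. This is an assumption-laden illustration: the class name, the use of Mbit/s, and the decision to fall back to the nominal bandwidth once the ETH-BN data has aged are all hypothetical; the text above only states that the data is aged after 3.5 times the sending interval.

```python
class BandwidthNotificationState:
    """Client-layer view of the bandwidth reported by the server layer."""

    AGING_FACTOR = 3.5  # ETH-BN data ages after 3.5 x the sending interval

    def __init__(self, nominal_bw_mbps: float, period_s: float):
        self.nominal_bw_mbps = nominal_bw_mbps
        self.period_s = period_s
        self._reported_bw = None
        self._received_at = None

    def on_eth_bn(self, current_bw_mbps: float, now: float) -> None:
        """Store the bandwidth carried in a received ETH-BN packet."""
        self._reported_bw = current_bw_mbps
        self._received_at = now

    def effective_bandwidth(self, now: float) -> float:
        """Bandwidth a service policy could use as its rate limit."""
        if self._reported_bw is None:
            return self.nominal_bw_mbps
        if now - self._received_at >= self.AGING_FACTOR * self.period_s:
            return self.nominal_bw_mbps  # assumed fallback once the data has aged
        return min(self._reported_bw, self.nominal_bw_mbps)

device1 = BandwidthNotificationState(nominal_bw_mbps=400.0, period_s=1.0)
device1.on_eth_bn(current_bw_mbps=150.0, now=10.0)
print(device1.effective_bandwidth(now=11.0))  # 150.0: degraded link reported
print(device1.effective_bandwidth(now=14.0))  # 400.0: no ETH-BN for 4 s, data aged
```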
When planning ETH-BN, you must check that the service burst traffic is consistent with a device's
buffer capability.

Figure 6-33 Basic principles of ETH-BN

[Figure labels: Device1 and Device4 (Client); Device2 and Device3 (Server); Bandwidth=B1;
microwave modulation steps 256Q, 128Q, 64Q, 32Q, 16Q, QP]

Usage Scenario
Y.1731 supports performance statistics collection on both end-to-end and end-to-multi-end
links.

End-to-end performance statistics collection

On the network shown in Figure 6-34, Y.1731 collects statistics about the end-to-end link
performance between the CE and PE1, between PE1 and PE2, or between the CE and PE3.

End-to-multi-end performance statistics collection

On the network shown in Figure 6-35, user-to-network traffic from different users traverses
CE1 and CE2 and is converged on CE3. CE3 forwards the converged traffic to the UPE.
Network-to-user traffic traverses CE3, and CE3 forwards the traffic to CE1 and CE2.

When Y.1731 is used to collect statistics about the link performance between the CE and the
UPE, end-to-end performance statistics collection cannot be implemented. This is because
only one inbound interface (on the UPE) sends packets but two outbound interfaces (on CE1
and CE2) receive the packets. In this case, statistics on the outbound interfaces fail to be
collected. To resolve this issue, end-to-multi-end performance statistics collection can be
used.

The packets carry the MAC address of CE1 or CE2. The UPE identifies the outbound
interface based on the destination MAC address carried in the packets and collects end-to-end
performance statistics.
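The idea of selecting the outbound interface by destination MAC address can be sketched as a set of per-destination counters. The class name and the MAC addresses are hypothetical; the sketch only shows how frames destined for CE1 and CE2 end up in separate statistics.

```python
from collections import defaultdict

class PerDestinationCounters:
    """Keep a separate transmit counter for each remote CE, keyed by MAC."""

    def __init__(self, remote_macs):
        # remote_macs maps a destination MAC address to a peer name.
        self._peers = {mac.lower(): name for mac, name in remote_macs.items()}
        self.tx_frames = defaultdict(int)

    def count(self, dst_mac):
        """Count one frame against the peer that owns dst_mac, if any."""
        peer = self._peers.get(dst_mac.lower())
        if peer is not None:
            self.tx_frames[peer] += 1
        return peer

upe = PerDestinationCounters({
    "00:e0:fc:00:00:01": "CE1",  # example addresses, not taken from the figure
    "00:e0:fc:00:00:02": "CE2",
})
for mac in ["00:E0:FC:00:00:01", "00:e0:fc:00:00:02", "00:e0:fc:00:00:01"]:
    upe.count(mac)

print(dict(upe.tx_frames))  # {'CE1': 2, 'CE2': 1}
```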

Figure 6-34 End-to-end performance statistics collection

Services Access Metro



Y.1731 Y.1731


Figure 6-35 End-to-multi-end performance statistics collection







Both end-to-multi-end and end-to-end performance statistics collection apply to VLL,
VPLS, and VLAN scenarios and follow the same statistics collection principles.

6.5 Ethernet OAM Fault Advertisement

6.5.1 Background
Link detection protocols are used to monitor the connectivity of links between devices and
detect faults. A single fault detection protocol cannot detect all faults in all links on a complex
network. A combination of protocols and techniques must be used to detect link faults.
Ethernet OAM detects faults in Ethernet links and advertises fault information to interfaces or
other protocol modules. Ethernet OAM fault advertisement is implemented by an OAM
manager (OAMMGR) module, application modules, and detection modules. An OAMMGR
module associates one module with another. A detection module monitors link status and
network performance. If a detection module detects a fault, it instructs the OAMMGR module
to notify an application module or another detection module of the fault. After receiving the
notification, the application or detection module takes measures to prevent a communication
interruption or service quality deterioration.
The OAMMGR module helps an Ethernet OAM module to advertise fault information to a
detection or application module. If an Ethernet OAM module detects a fault, it instructs the
OAMMGR module to send alarms to the network management system (NMS). A network
administrator takes measures based on information displayed on the NMS. Ethernet OAM
fault advertisement includes fault information advertisement between EFM and other
modules and between CFM and other modules.
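The role of the OAMMGR module can be pictured as a small dispatcher that links a detection module to the modules associated with it and also raises an NMS alarm. The following is a conceptual sketch only; the class, its methods, and the string identifiers are hypothetical and do not represent the device software.

```python
from collections import defaultdict

class OamMgr:
    """Relays fault notifications between associated modules."""

    def __init__(self):
        self._associations = defaultdict(list)  # source module -> handlers
        self.nms_alarms = []                    # alarms reported to the NMS

    def associate(self, source, handler):
        """Associate a detection module with another module's fault handler."""
        self._associations[source].append(handler)

    def report_fault(self, source, detail):
        """Called by a detection module when it detects a fault."""
        self.nms_alarms.append((source, detail))
        for handler in self._associations[source]:
            handler(source, detail)

notified = []
mgr = OamMgr()
# Associate EFM with CFM: faults detected by EFM are passed on to CFM.
mgr.associate("EFM", lambda src, detail: notified.append(("CFM", src, detail)))
mgr.report_fault("EFM", "link CE1-PE2 down")

print(notified)        # [('CFM', 'EFM', 'link CE1-PE2 down')]
print(mgr.nms_alarms)  # [('EFM', 'link CE1-PE2 down')]
```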

6.5.2 Fault Information Advertisement Between EFM and Other Modules

Fault Information Advertisement Between EFM and Detection Modules
The OAMMGR module associates EFM with detection modules, such as EFM, CFM, and
BFD modules. Fault information advertisement between EFM and detection modules enables
a device to delete MAC address entries once a fault is detected. Figure 6-36 shows the
network on which fault information is advertised between EFM and detection modules.

Figure 6-36 Fault information advertisement between EFM and detection modules



User Port1

The following example illustrates fault information advertisement between EFM and
detection modules over a path CE5 -> CE4 -> CE1 -> PE2 -> PE4 on the network shown in
Figure 6-36. Table 6-14 describes the details.

Table 6-14 Fault information advertisement between EFM and detection modules

Function: EFM is used to monitor the direct link between CE1 and PE2, and CFM is used to
monitor the link between PE2 and PE6.
Issue to Be Resolved:
l Although EFM detects a fault, EFM cannot notify PE6 of the fault. As a result, PE6 still
forwards network traffic to PE2, causing a traffic interruption.
l Although CFM detects a fault, CFM cannot notify CE1 of the fault. As a result, CE1 still
forwards user traffic to PE2, causing a traffic interruption.
Solution: The EFM module can be associated with the CFM module.
l If the EFM module detects a fault, it instructs the OAMMGR module to notify the CFM
module of the fault.
l If the CFM module detects a fault, it instructs the OAMMGR module to notify the EFM
module of the fault.
The association allows a module to notify another associated module of a fault and to send
an alarm to a network management system (NMS). A network administrator analyzes alarm
information and takes measures to rectify the fault.

Function: EFM is used to monitor the direct link between CE1 and PE2, and BFD is used to
monitor the link between PE2 and PE6.
Issue to Be Resolved:
l Although EFM detects a fault, EFM cannot notify PE6 of the fault. As a result, PE6 still
forwards network traffic to PE2, causing a traffic interruption.
l Although BFD detects a fault, EFM cannot notify CE1 of the fault. As a result, CE1 still
forwards user traffic to PE2, causing a traffic interruption.
Solution: The EFM module can be associated with the BFD module.
l If the EFM module detects a fault, it instructs the OAMMGR module to notify the BFD
module of the fault.
l If the BFD module detects a fault, it instructs the OAMMGR module to notify the EFM
module of the fault.
l If EFM on CE1 detects a fault or receives fault information sent by PE2, the association
between EFM and BFD works and deletes the MAC entry, which switches traffic to a
backup link.
The association allows a module to notify another associated module of a fault and to send
an alarm to an NMS. A network administrator analyzes alarm information and takes
measures to rectify the fault.

Fault Information Advertisement Between EFM and Application Modules

The OAMMGR module associates an EFM module with application modules, such as a
Virtual Router Redundancy Protocol (VRRP) module. Figure 6-37 shows the network on
which a user-side device is dual-homed to network-side devices, improving telecom service

Figure 6-37 Fault information advertisement between EFM and application modules
IP Core

Switch2 NPE2

Table 6-15 describes fault information advertisement between EFM and VRRP modules.

Table 6-15 Fault information advertisement between EFM and VRRP modules

Function:
l A VRRP backup group is configured to determine the master/backup status of provider
edge-aggregation devices (PE-AGGs).
l EFM is used to monitor links between the UPE and PE-AGGs.
Issue to Be Resolved: If links connected to a VRRP backup group fail, VRRP packets cannot
be sent to negotiate the master/backup status. A backup VRRP device preempts the Master
state after a period of three times the interval at which VRRP packets are sent. As a result,
data loss occurs.
Solution: To help prevent data loss, the VRRP module can be associated with the EFM
module. If a fault occurs, the EFM module notifies the VRRP module of the fault. After
receiving the notification, the VRRP module triggers a master/backup VRRP switchover.

6.5.3 Fault Information Advertisement Between CFM and Other Modules

Fault Information Advertisement Between CFM and Detection Modules
An OAMMGR module associates CFM with detection modules. A detection module can be
EFM, CFM, or BFD. Fault information advertisement between CFM and detection modules
enables a device to delete ARP or MAC address entries once a fault is detected. Figure 6-38
shows the network on which fault information is advertised between CFM and detection
modules.

Figure 6-38 Networking for fault information advertisement between CFM and detection
modules



IP Core

The following example illustrates fault information advertisement between CFM and
detection modules over a path UPE1 -> PE2 -> PE4 -> PE6 -> PE8 on the network shown in
Figure 6-38. Table 6-16 describes the details.

Table 6-16 Fault information advertisement between CFM and detection modules

Function: CFM is used to monitor the link between UPE1 and PE4.
Issue to Be Resolved:
l Although CFM detects a fault in the link between UPE1 and PE4, CFM cannot notify PE6
of the fault. As a result, PE6 still forwards network traffic to PE4, causing a traffic
interruption.
l Although port 1 on PE4 goes Down, port 1 cannot notify CE1 of the fault. As a result,
CE1 still forwards user traffic to PE4, causing a traffic interruption.
Solution: CFM can be associated with port 1.
l If CFM detects a fault, it instructs the OAMMGR module to disconnect port 1
intermittently. This operation allows other modules to detect the fault.
l If port 1 goes Down, it instructs the OAMMGR module to notify CFM of the fault. After
receiving the notification, CFM notifies PE6 of the fault.
The association between CFM and a port is used to detect faults in an active link of a link
aggregation group or in the link aggregation group in 1:1 active/standby mode. If a fault is
detected, a protection switchover is triggered.

Function: EFM is used to monitor the direct link between CE1 and UPE1, and CFM is used to
monitor the link between UPE1 and PE4.
Issue to Be Resolved: Although CFM detects a fault, CFM cannot notify CE1 of the fault. As
a result, CE1 still forwards user traffic to PE4, causing a traffic interruption.
Solution: The EFM module can be associated with the CFM module.
l If the EFM module detects a fault, it instructs the OAMMGR module to notify the CFM
module of the fault.
l If the CFM module detects a fault, it instructs the OAMMGR module to notify the EFM
module of the fault.
The association allows a module to notify another associated module of a fault and to send
an alarm to an NMS. A network administrator analyzes alarm information and takes
measures to rectify the fault.

Function: CFM is configured to monitor the links between UPE1 and PE4 and between PE4
and PE8.
Issue to Be Resolved:
l Although CFM detects a fault in the link between PE4 and PE8, it cannot notify UPE1 of
the fault. As a result, UPE1 still forwards user traffic to PE4 through PE2, causing a traffic
interruption.
l Although CFM detects a fault in the link between UPE1 and PE4, it cannot notify PE8 of
the fault. As a result, PE8 still forwards network traffic to PE4 through PE6, causing a
traffic interruption.
Solution:
l Two CFM modules can be associated with each other. If a CFM module detects a fault, it
instructs the OAMMGR module to notify the other CFM module of the fault and sends an
alarm to an NMS. A network administrator analyzes alarm information and takes measures
to rectify the fault.
l CFM can be associated with MAC or ARP entry clearing. If CFM detects a fault, it
instructs an interface to clear MAC or ARP entries, triggering traffic to be switched to a
backup link.

Function:
l CFM is used to monitor the link between UPE1 and PE4.
l BFD can be used to monitor the non-Ethernet link between PE4 and PE8. The non-Ethernet
link can be a packet over synchronous digital hierarchy (SDH)/synchronous optical network
(SONET) link.
Issue to Be Resolved:
l Although CFM detects a fault in the link between UPE1 and PE4, it cannot notify PE8 of
the fault. As a result, PE8 still forwards network traffic to PE4 through PE6, causing a
traffic interruption.
l Although BFD detects a fault, BFD cannot notify UPE1 of the fault. As a result, UPE1 still
forwards user traffic to PE4 through PE2, causing a traffic interruption.
Solution: The CFM module can be associated with the BFD module.
l If the CFM module detects a fault, it instructs the OAMMGR module to notify the BFD
module of the fault.
l If the BFD module detects a fault, it instructs the OAMMGR module to notify the CFM
module of the fault.
The association allows a module to notify another associated module of a fault and to send
an alarm to an NMS. A network administrator analyzes alarm information and takes
measures to rectify the fault.

Fault Information Advertisement Between CFM and Application Modules

The OAMMGR module associates a CFM module with application modules, such as a Virtual
Router Redundancy Protocol (VRRP) module.
Figure 6-39 shows the network on which a CFM module advertises fault information to a
VRRP module. Figure 6-40 shows the network on which a VRRP module advertises fault
information to a CFM module.

Figure 6-39 Fault information advertisement by a CFM module to a VRRP module

[Figure labels: Core, NPE2, Master, Backup, CE1; legend: No-neighbor Primary Edge Port,
No-neighbor Secondary Edge Port, Block Port, SEP associated with Ethernet CFM]

Figure 6-40 Fault information advertisement by a VRRP module to a CFM module


Master Backup






Table 6-17 describes fault information advertisement between CFM and VRRP modules.

Table 6-17 Fault information advertisement between CFM and VRRP modules

Function Deployment:
l A VRRP backup group is configured to determine the master/backup status of network
provider edges (NPEs).
l CFM is used to monitor links between NPEs and PE-AGGs.
Issue to Be Resolved: If a fault occurs on the link between NPE1 (the master) and PE-AGG1,
NPE2 cannot receive VRRP packets within a period of three times the interval at which
VRRP packets are sent. NPE2 then preempts the Master state. As a result, two master devices
coexist in a VRRP backup group, and the UPE receives double copies of network traffic.
Solution: CFM can be associated with the VRRP module on NPEs. If CFM detects a fault in
the link between PE-AGG1 and NPE1, it instructs the OAMMGR module to notify the VRRP
module of the fault. After receiving the notification, the VRRP module triggers a master/
backup VRRP switchover. NPE1 then changes its VRRP status to Initialize. NPE2 changes its
VRRP status from Backup to Master after a period of three times the interval at which VRRP
packets are sent. This process prevents two master devices from coexisting in the VRRP
backup group.

Function Deployment:
l A VRRP backup group is configured to determine the master/backup status of NPEs.
l CFM is used to monitor links between NPEs and PE-AGGs.
l PW redundancy is configured to determine the active/standby status of PWs.
Issue to Be Resolved: If a fault occurs on the backbone network, it triggers a master/backup
VRRP switchover but cannot trigger an active/standby PW switchover. As a result, the CE
still transmits user traffic to the previous master NPE, causing a traffic interruption.
Solution:
l When VRRP status changes on NPEs, the VRRP module notifies PE-AGGs' CFM modules
of VRRP status changes.
l The CFM module on each PE-AGG notifies the PW module of the status change and
triggers an active/standby PW switchover.
l Each PE-AGG notifies its associated UPE of the PW status change.
l After the UPE receives the notification, it determines the primary/backup status of PWs.

6.6 Application Scenarios for Ethernet OAM

6.6.1 Ethernet OAM Applications on a MAN

EFM, CFM, and Y.1731 can be combined to provide E2E Ethernet OAM solutions,
implementing E2E Ethernet service management.

Figure 6-41 Ethernet OAM applications on a MAN

[Figure labels: IP/MPLS core; Access; DSLAM1; VoIP, PC, and STB terminals; high-grade
residential district; enterprises]

Figure 6-41 shows a typical MAN network. The following example illustrates Ethernet OAM
applications on a MAN.
l EFM is used to monitor P2P direct links between a digital subscriber line access
multiplexer (DSLAM) and a user-end provider edge (UPE) or between a LAN switch
(LSW) and a UPE. If EFM detects errored frames, codes, or frame seconds, it sends
alarms to the network management system (NMS) to provide information for a network
administrator. EFM uses the loopback function to assess link quality.

l CFM is used to monitor E2E links between a UPE and an NPE or between a UPE and a
provider edge-aggregation (PE-AGG). A network planning engineer groups the devices
of each Internet service provider (ISP) into an MD and maps a type of service to an MA.
A network maintenance engineer enables maintenance points to exchange CCMs to
monitor network connectivity. After receiving an alarm on the NMS, a network
administrator can enable loopback to locate faults or enable linktrace to discover paths.
l Y.1731 is used to measure packet loss and the delay on E2E links between a UPE and an
NPE or between a UPE and a PE-AGG at the aggregation layer.

6.6.2 Ethernet OAM Applications on an IPRAN

Figure 6-42 Ethernet OAM applications on an IPRAN


eNodeB CSG1



A mobile backhaul network shown in Figure 6-42 consists of a transport network between a
cell site gateway (CSG) and remote service gateways (RSGs) and a wireless network between
NodeBs/eNodeBs and the CSG. Carriers operate the transport and wireless networks
separately. Therefore, traffic transmitted on the transport network of one carrier is invisible to
devices on the wireless network of another carrier.
Ethernet OAM can be used on the transport and wireless networks to identify and locate
faults:
l EFM monitors Layer 2 links between a NodeB/eNodeB and CSG1.
– EFM is used to monitor the connectivity of links between a NodeB/eNodeB and
CSG1 or between RNCs and RSGs.
– EFM detects errored codes, frames, and frame seconds on links between a NodeB/
eNodeB and CSG1 and between RNCs and RSGs. If the number of errored codes,
frames, or frame seconds exceeds a configured threshold, an alarm is sent to the
NMS. A network administrator is notified of link quality deterioration and can
assess the risk of adverse impact on voice traffic.
– Loopback is used to monitor the quality of voice links between a NodeB/eNodeB
and CSG1 or between RNCs and RSGs.
l CFM is used to locate faulty links over which E2E services are transmitted.
– CFM periodically monitors links between CSG1 and the RSGs. If CFM detects a fault, it
sends an alarm to the NMS. A network administrator analyzes alarm information and takes
measures to rectify the fault.

– Loopback and linktrace are enabled on links between CSG1 and the RSGs to help
diagnose link faults.
l Y.1731 is used together with CFM to monitor link performance and voice and data traffic

6.7 Our Advantages

In addition to EFM, CFM, and Y.1731, Huawei devices provide the following enhancements:
l EFM enhancements
l EFM and CFM can advertise information about many types of faults, facilitating
network-wide fault detection.
l Huawei devices can collect 802.1p priority-based Y.1731 statistics about packets
transmitted over pseudo wires (PWs).

6.7.1 EFM Enhancements

EFM enhancements are EFM extended functions, including an association between EFM and
an EFM interface, an active/standby extension, and single-fiber fault detection.

Association Between EFM and EFM Interfaces

On the network shown in Figure 6-43, customer edge 1 (CE1) is dual-homed to CE2 and
CE4. The dual-homing networking provides device redundancy, making the network more
robust and services more reliable. If the active link between CE1 and CE4 fails, traffic
switches to the standby link between CE1 and CE2, minimizing the service interruption time.
Association between EFM and EFM interfaces that connect CE2 and CE4 to CE1 allows
traffic to switch from the active link to the standby link if EFM detects a link fault or link
quality deterioration. On the network shown in Figure 6-43, when EFM detects a fault in the
link between CE1 and CE4, association between EFM and EFM interfaces can be used to
trigger an active/standby link switchover, improving transmission quality and reliability.

Figure 6-43 Association between EFM and EFM interfaces




Single-Fiber Fault Detection

Optical interfaces work in full-duplex mode and therefore consider themselves Up as long
as they receive packets. This causes the working status of the interfaces to be inconsistent
with the physical interface status.

As shown in Figure 6-44, optical interface A is directly connected to optical interface B. If
line 2 fails, interface B cannot receive packets and sets its physical status to Down. Interface
A can receive packets from interface B over line 1 and therefore considers its physical status
Up. If interface A sends packets to interface B, a service interruption occurs because interface
B cannot receive the packets.

Figure 6-44 Principles of EFM single-fiber fault detection

Optical Optical
module A module B

EFM single-fiber detection can be used to prevent the preceding issue.

If EFM detects a fault on an interface that is associated with EFM, the association function
enables the interface to go Down. The modules for Layer 2 and Layer 3 services can detect
the interface status change and trigger a service switchover. The working status and physical
status of the interface remain consistent, preventing a service interruption. After the fault is
rectified and EFM negotiation succeeds, the interface goes Up and services switch back.
Single-fiber fault detection prevents inconsistency between the working and physical interface
statuses and allows the service modules to detect interface status changes.
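The effect of associating EFM with an interface can be sketched as a simple status decision. The function and parameter names are hypothetical; the sketch assumes that the EFM session on interface A fails once interface B can no longer receive frames on the broken fiber.

```python
def working_status(physical_up: bool, efm_session_ok: bool, efm_associated: bool) -> str:
    """Return the interface status that Layer 2 and Layer 3 service modules see."""
    if not physical_up:
        return "Down"
    if efm_associated and not efm_session_ok:
        # The association forces the interface Down when EFM detects a fault.
        return "Down"
    return "Up"

# Interface A in Figure 6-44: it still receives light over line 1, but its EFM
# session has failed because interface B receives nothing over line 2.
print(working_status(physical_up=True, efm_session_ok=False, efm_associated=False))  # Up
print(working_status(physical_up=True, efm_session_ok=False, efm_associated=True))   # Down
```

Without the association, interface A stays Up and keeps sending traffic that interface B cannot receive; with it, the interface goes Down and the service modules can switch traffic.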

7 Ethernet LPT

About This Chapter

7.1 Overview of LPT

7.2 Understanding LPT
7.3 Application Scenarios for LPT

7.1 Overview of LPT

Link-state pass through (LPT) transparently transmits the local link status to the opposite end
so that the opposite end can perform operations accordingly.

Ethernet LPT can detect and report a link fault on the Ethernet user side or a fault on an
intermediate point-to-point network.

After detecting a fault on the local link, the local user equipment automatically enables a
backup link and uses the backup link to communicate with the opposite user equipment. The
opposite user equipment, however, cannot obtain information about the local link fault.
Therefore, it still uses the original link to communicate with the local user equipment. As a
result, services are interrupted.

If Ethernet LPT is enabled, the local user equipment can send information about the local link
fault to the opposite network edge equipment using Ethernet LPT packets. The opposite
network edge equipment disables the UNI-side port so that the opposite user equipment starts
to use the backup link. In this manner, services are transmitted over the backup link between
the user equipment at both ends.


7.2 Understanding LPT

7.2.1 Basic Principles

This section describes the implementation principle of Ethernet LPT in a scenario with a user
side link fault and a scenario with a point-to-point network fault. Figure 7-1 shows the
scenario where a user side link fault occurs.

Figure 7-1 Scenario where a user side link fault occurs

[Figure labels: Backup link; User side; Network side; Point-to-point network; Link 1; Link 2. Legend: Protection link, Working link]

PE1 and PE2 are enabled with Ethernet LPT and transmit packets to each other. When a fault
occurs on link 1:
1. CE1 detects that link 1 is malfunctioning and enables the backup link to communicate
with CE2.
PE1 periodically transmits Ethernet LPT packets to PE2. After detecting that link 1 is
malfunctioning, PE1 sends Ethernet LPT packets containing a message to PE2,
indicating that link 1 is malfunctioning.
2. After receiving and interpreting the Ethernet LPT packets, PE2 acknowledges that the
user side link of PE1 is malfunctioning and disables its user side port.
After detecting that the user side port of PE2 is disabled, CE2 enables the backup link to
communicate with CE1.
After the fault on the user side link of PE1 is rectified, services on the backup link can be
switched back to the working link according to the following steps.
1. After detecting that the fault on link 1 is rectified, CE1 switches services on the backup
link to the working link and tries to communicate with CE2 using the working link.
After detecting that the fault on link 1 is rectified, PE1 sends Ethernet LPT packets
containing a message to PE2, indicating that the fault on its user side link is rectified.


2. After receiving and interpreting the Ethernet LPT packets, PE2 acknowledges that the
fault on the user side link of PE1 is rectified and enables its user side port.
After detecting that the user side port is enabled, CE2 switches services on the backup
link back to the working link and communicates with CE1 using the working link.
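
The fault and recovery sequences for the user side link fault can be sketched as a small simulation. The following Python sketch is illustrative only. The class `PE`, its methods, and the function `ce_selected_link` are hypothetical names, and a direct method call stands in for an Ethernet LPT packet, whose format is not described in this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PE:
    """Hypothetical model of a PE with one user side port."""
    name: str
    user_link_up: bool = True        # state of the attached user side link
    user_port_enabled: bool = True   # port state driven by Ethernet LPT
    peer: Optional["PE"] = None

    def set_user_link(self, up: bool) -> None:
        """Record a local user side link change and report it to the peer PE."""
        self.user_link_up = up
        if self.peer is not None:
            # Stands in for an Ethernet LPT packet carrying the link state.
            self.peer.on_lpt_message(remote_user_link_up=up)

    def on_lpt_message(self, remote_user_link_up: bool) -> None:
        """Disable or enable the local user side port to mirror the remote link."""
        self.user_port_enabled = remote_user_link_up


def ce_selected_link(pe: PE) -> str:
    """Link that the CE attached to this PE would use."""
    usable = pe.user_link_up and pe.user_port_enabled
    return "working" if usable else "backup"


pe1, pe2 = PE("PE1"), PE("PE2")
pe1.peer, pe2.peer = pe2, pe1

pe1.set_user_link(False)   # fault on link 1
after_fault = (ce_selected_link(pe1), ce_selected_link(pe2))
pe1.set_user_link(True)    # fault rectified
after_recovery = (ce_selected_link(pe1), ce_selected_link(pe2))
```

In this sketch, both CEs select the backup link after the fault and return to the working link after recovery, which mirrors the two step lists above.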
Figure 7-2 shows the scenario where a point-to-point network fault occurs.

Figure 7-2 Scenario where a point-to-point network fault occurs

[Figure labels: Backup link; User side; Network side; Point-to-point network; Link 1; Link 2. Legend: Protection link, Working link]

PE1 and PE2 are enabled with Ethernet LPT and transmit packets to each other. When a
point-to-point network fault occurs:
1. PE1 receives no Ethernet LPT packets from PE2 and detects that Ethernet LPT
communication fails. Then, PE1 disables its user side port.
After detecting that the user side port of PE1 is disabled, CE1 enables the backup link to
communicate with CE2.
2. PE2 receives no Ethernet LPT packets from PE1 and detects that Ethernet LPT
communication fails. Then, PE2 disables its user side port.
After detecting that the user side port of PE2 is disabled, CE2 enables the backup link to
communicate with CE1.
After the point-to-point network fault is rectified, services on the backup link can be switched
back to the working link according to the following steps.
1. After receiving and interpreting the Ethernet LPT packets, PE1 detects that the fault is
rectified and enables its user side port.
After detecting that the user side port is enabled, CE1 switches services on the backup
link back to the working link and tries to communicate with CE2 using the working link.
2. After receiving and interpreting the Ethernet LPT packets, PE2 detects that the fault is
rectified and enables its user side port.
After detecting that the user side port is enabled, CE2 switches services on the backup
link back to the working link and communicates with CE1 using the working link.
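
The point-to-point network fault case relies on each PE noticing that Ethernet LPT packets from its peer have stopped arriving. A minimal Python sketch of that receive-side logic follows. The class `LptSession` and the 3-second timeout are assumptions made for illustration; this section does not state the actual detection interval.

```python
class LptSession:
    """Hypothetical receive-side view of an Ethernet LPT session on one PE."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s       # assumed detection timeout
        self.last_rx = 0.0               # time the last LPT packet arrived
        self.user_port_enabled = True

    def on_lpt_packet(self, now: float) -> None:
        """An LPT packet arrived: communication works, so enable the port."""
        self.last_rx = now
        self.user_port_enabled = True

    def poll(self, now: float) -> None:
        """Disable the user side port if no LPT packet arrived within the timeout."""
        if now - self.last_rx > self.timeout_s:
            self.user_port_enabled = False


session = LptSession(timeout_s=3.0)
session.on_lpt_packet(now=0.0)
session.poll(now=2.0)            # within the timeout, port stays enabled
state_ok = session.user_port_enabled
session.poll(now=3.5)            # peer silent for too long, port is disabled
state_fault = session.user_port_enabled
session.on_lpt_packet(now=4.0)   # network recovered, port is enabled again
state_recovered = session.user_port_enabled
```

Both PE1 and PE2 would run the same logic independently, so each disables its own user side port when the network between them fails.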


7.3 Application Scenarios for LPT

7.3.1 Point-to-Point Ethernet LPT

Figure 7-3 shows how point-to-point Ethernet LPT is applied.

Figure 7-3 Application scenario of a network configured with point-to-point Ethernet LPT

[Figure labels: Backup link; User side; Network side; Point-to-point network; Link 1; Link 2. Legend: Protection link, Working link]

Under common conditions, data between CE1 and CE2 traverses link 1, the point-to-point
network, and link 2. The point-to-point network can be built based on PWE3 or QinQ links. If
a fault occurs on link 1, link 2, or the point-to-point network, communication between CE1
and CE2 is interrupted.
Point-to-point Ethernet LPT can be deployed on PE1 and PE2 to protect service
transmission. When link 1 is malfunctioning, PE2 disables link 2. When the point-to-point
network is malfunctioning, PE1 disables link 1 and PE2 disables link 2. In this manner, CE1
and CE2 can communicate with each other by using the backup link.


8 Dual-Device Backup

About This Chapter

8.1 Overview of Dual-Device Backup

8.2 Dual-Device Backup Principles
8.3 Application Scenarios for Dual-Device Backup
8.4 Terminology for Dual-Device Backup

8.1 Overview of Dual-Device Backup

Dual-device backup is a feature that ensures service traffic continuity in scenarios in which a
master/backup status negotiation protocol (for example, VRRP or E-Trunk) is deployed.
Dual-device backup enables the master device to back up service control data to the backup
device in real time. When the master device or the link directly connected to the master device
fails, service traffic quickly switches to the backup device. When the master device or the link
directly connected to the master device recovers, service traffic switches back to the master
device. Therefore, dual-device backup improves service and network reliability.

Related Concepts
If VRRP is used as a master/backup status negotiation protocol, dual-device backup involves
the following concepts:
• VRRP is a fault-tolerant protocol that groups several routers into a virtual router. If the
  next hop of a host is faulty, VRRP switches traffic to another router, which ensures
  communication continuity and reliability.
  For details about VRRP, see the chapter "VRRP" in NE40E Feature Description -
  Network Reliability.
• RUI is a Huawei-specific redundancy protocol that is used to back up user information
  between devices. RUI, which is carried over the Transmission Control Protocol (TCP),
  specifies which user information can be transmitted between devices and the format and
  amount of user information to be transmitted.
• The remote backup service (RBS) is an RUI module used for inter-device backup. A
  service module uses the RBS to synchronize service control data from the master device
  to the backup device. When a master/backup VRRP switchover occurs, service traffic
  quickly switches to a new master device.
• The remote backup profile (RBP) is a configuration template that provides a unified user
  interface for dual-device backup configurations.

If E-Trunk is used as a master/backup status negotiation protocol, dual-device backup
involves the following concept:
• E-Trunk
  E-Trunk implements inter-device link aggregation, providing device-level reliability.
  E-Trunk aggregates data links of multiple devices to form a link aggregation group (LAG).
  If a link or device fails, services are automatically switched to the other available links or
  devices in the E-Trunk, improving link and device-level reliability.
  For details about E-Trunk, see "E-Trunk" in NE40E Feature Description - LAN Access
  and MAN Access.

In traditional service scenarios, all users use a single device to access a network. Once the
device or the link directly connected to the device fails, all user services are interrupted, and
the service recovery time is uncertain. To resolve this issue, deploy dual-device backup to
enable the master device to back up service control data to the backup device in real time.

• The NE40E supports only dual-device hot backup for Address Resolution Protocol
  (ARP) services, also called dual-device ARP hot backup.
  Dual-device ARP hot backup enables the master device to back up the ARP entries at the
  control and forwarding layers to the backup device in real time. When the backup device
  switches to a master device, it uses the backup ARP entries to generate host routing
  information without needing to relearn ARP entries, ensuring downlink traffic continuity.
  – Manually triggered dual-device ARP hot backup: You must manually establish a
    backup platform and backup channel for the master and backup devices. In
    addition, you must manually trigger ARP entry backup from the master device to
    the backup device. This backup mode has complex configurations.
  – Automatically enabled dual-device ARP hot backup: You need to establish only a
    backup channel between the master and backup devices, and the system
    automatically implements ARP entry backup. This backup mode has simple
    configurations.
• Dual-device IGMP snooping hot backup enables the master device to back up IGMP
  snooping entries to the backup device in a master/backup E-Trunk scenario. If the master
  device or the link between the master device and user fails, the backup device switches
  to a master device and takes over, ensuring multicast service continuity.

• Benefits to users
  – Improved user experience
• Benefits to operators
  – Improved network reliability from the perspective of service reliability

8.2 Dual-Device Backup Principles

8.2.1 Overview
The NE40E ensures high reliability of services through the following approaches:
• Status control: Several BRASs negotiate a master BRAS through VRRP. With the help
  of BFD or Ethernet OAM, the master BRAS can detect a link fault quickly and traffic
  can be switched to the standby BRAS immediately.
• Service control: Information about access users is backed up to the standby BRAS from
  the master BRAS through TCP. This ensures service consistency.
• Route control: By controlling routes in the address pool or user routes in a real-time
  manner, the BRAS ensures that downstream traffic can reach users smoothly when an
  active/standby switchover occurs.
Different services use different forwarding controls:
• IPv4 unicast forwarding control
• IPv4 multicast forwarding control
• L2TP service forwarding control

8.2.2 Status Control

VRRP is a fault-tolerant protocol defined in relevant standards. As shown in Figure 8-1, the
Routers on the LAN (Device1, Device2, and Device3) are arranged in a backup group using
VRRP. This backup group functions as a virtual router.


Figure 8-1 Schematic diagram for a virtual router

[Figure labels: Host A; Host B; Host C]


On the LAN, hosts need to obtain only the IP address of the virtual router rather than the IP
address of each router in the backup group. The hosts set the IP address of the virtual router as
the address of their default gateway. Then, the hosts can communicate with an external
network through the virtual gateway.
VRRP dynamically associates the virtual router with a physical router that transmits services.
When the physical router fails, another router is selected to take over services and user
services are not affected. The internal network and the external network can communicate
without interruption.

Virtual Access System

The virtual access solution significantly simplifies a network, facilitates service deployment
and maintenance, and reduces O&M costs. As shown in Figure 8-2, a virtual access system
consists of masters and APs. An AP can be managed by two masters. The two masters exchange
their management priority information for the AP. The roles of the two
masters are determined based on the following rules:
1. The system first checks the management priorities of the two masters. The master with a
higher priority becomes the primary master, and the master with a lower priority
becomes the secondary master.
2. If the management priorities of the two masters are the same, the master with a smaller
management IP address becomes the primary master, and the master with a larger
management IP address becomes the secondary master.
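
The two election rules above can be expressed as a single sort key. The following Python sketch is illustrative: the `Master` tuple and the `elect` function are hypothetical names, and it assumes that a "higher" priority is a numerically larger value, which this section does not state explicitly.

```python
import ipaddress
from typing import NamedTuple, Tuple


class Master(NamedTuple):
    name: str
    priority: int   # management priority for the AP (assumed: larger = higher)
    mgmt_ip: str    # management IP address


def elect(a: Master, b: Master) -> Tuple[Master, Master]:
    """Return (primary, secondary) according to the two rules."""

    def key(m: Master):
        # Rule 1: higher priority first. Rule 2: smaller management IP first.
        return (-m.priority, int(ipaddress.ip_address(m.mgmt_ip)))

    primary, secondary = sorted((a, b), key=key)
    return primary, secondary


by_priority = elect(Master("A", 100, "10.0.0.2"), Master("B", 200, "10.0.0.1"))
by_address = elect(Master("A", 100, "10.0.0.2"), Master("B", 100, "10.0.0.10"))
```

The addresses are compared numerically rather than as strings, so 10.0.0.2 is treated as smaller than 10.0.0.10.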
If dual-device hot backup is deployed in a virtual access scenario, the master device functions
as the primary master, and the standby device functions as the secondary master in the virtual
access system. If the primary master fails, the system uses Diameter to re-negotiate the
primary and secondary states of the two masters.


Figure 8-2 Schematic diagram for a virtual access system

[Figure labels: Primary Master; Secondary Master; Network Core]

Principles of the Active/Standby Switchover

During the implementation of high reliability of services, VRRP is responsible for the
negotiation of the master and standby devices; BFD or Eth OAM is responsible for fast
detection of link faults to perform a rapid active/standby switchover.

Figure 8-3 Diagram of the active/standby switchover for high reliability of services
[Figure labels: LSW-1; LSW-2; BFD; Device1; Device2]

As shown in Figure 8-3, the two Routers negotiate the master and standby states using VRRP.
The NE40E supports active/standby status selection of interfaces and sub-interfaces.
BFD is enabled between the two Routers to detect links between the two devices. BFD in this
mode is called Peer-BFD. BFD is also enabled between the Router and the LSW to detect
links between the Router and the LSW. BFD in this mode is called Link-BFD.
When a link fails, VRRP alone can renegotiate the master and standby devices, but the
negotiation takes several seconds, which does not meet the requirements of carrier-grade
services. BFD or Eth OAM can detect a faulty link within several milliseconds, which allows
the device to perform a fast active/standby switchover with the help of VRRP.
During the implementation of an active/standby switchover, VRRP has to determine device
status based on Link-BFD status and Peer-BFD status. As shown in Figure 8-3, when Link 1
fails, the Peer-BFD status and Link-BFD status of Device1 both go down and Device1
becomes the standby device. In this case, the Peer-BFD status of Device2 goes down but the
Link-BFD status of Device2 is still up. Therefore, Device2 becomes the master device.
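
The Link 1 failure example can be written as a small decision function. This Python sketch is illustrative only. The function name `vrrp_role` is hypothetical, and the behavior when Peer-BFD is still up (keeping the role that VRRP already negotiated) is an assumption, because the text describes only the case in which Peer-BFD goes down.

```python
def vrrp_role(link_bfd_up: bool, peer_bfd_up: bool, negotiated_role: str) -> str:
    """Return the role a device takes based on its Link-BFD and Peer-BFD status."""
    if not peer_bfd_up:
        # The peer is unreachable: the device whose own access link still
        # works becomes master, and the device whose link is down becomes standby.
        return "master" if link_bfd_up else "standby"
    # Assumption: with Peer-BFD up, keep the role negotiated through VRRP.
    return negotiated_role


# Link 1 fails: Device1 loses both sessions, Device2 loses only Peer-BFD.
device1_role = vrrp_role(link_bfd_up=False, peer_bfd_up=False, negotiated_role="master")
device2_role = vrrp_role(link_bfd_up=True, peer_bfd_up=False, negotiated_role="standby")
```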
In actual networking, certain LSWs may not support BFD. In this case, you have to select
another detection mechanism. Besides BFD, the NE40E also supports detection of links
connecting to LSWs through Eth OAM.
The NE40E supports monitoring of upstream links (for example, Link 3 in Figure 8-3) to
enhance reliability protection for the network side. When an upstream link fails, the NE40E
responds to the link failure quickly and performs an active/standby link switchover.


8.2.3 Service Control

Service control refers to the control of information about access users. The NE40E performs
service control by backing up information about access users on the active BRAS to the
standby BRAS in a real-time manner. To ensure the reliability of information backup, the
NE40E backs up information through TCP. Table 8-1 lists the user attributes that can be
backed up. Not all the user attributes listed in Table 8-1 have to be backed up. You can
determine the user attributes to be backed up according to the actual services of users.

Table 8-1 User attributes to be backed up

Attribute                         Description
MAC                               MAC address of a user, which identifies a user in collaboration with a Session-ID.
IP-address                        IP address of a user.
Vlan-ID                           VLAN IDs in the inner and outer VLAN tags.
Option60                          Option 60 carried in a user packet.
Option82                          Option 82 carried in a user packet.
Lease-time                        Address lease delivered by a RADIUS server.
SessionId                         Session ID of a user. The session ID of a DHCP user is always
MTU                               Maximum transmission unit (MTU) of a user packet.
Magic number                      Magic number of a user. It is used for loop detection.
Username                          User name.
QosProfile                        Name of a QoS profile delivered by the RADIUS server. It is used to meet users' requirements for QoS.
Up-Priority                       Priority of a user's upstream traffic delivered by the RADIUS server.
PrimaryDNS                        Primary DNS delivered by the RADIUS server.
SecondaryDNS                      Secondary DNS delivered by the RADIUS server.
UCL-Group                         UCL for user group policy control delivered by the RADIUS server.
Up-Pack                           Real-time number of upstream packets. It is used for traffic-based accounting.
Down-Pack                         Real-time number of downstream packets. It is used for traffic-based accounting.
Up-Byte                           Real-time number of upstream bytes. It is used for traffic-based accounting.

Down-Byte                         Real-time number of downstream bytes. It is used for traffic-based accounting.
Remanent-Volume                   Volume of the remaining traffic delivered by the RADIUS server. It is used to control the online traffic of users.
Session-Timeout                   Remaining time delivered by the RADIUS server. It is used to control the online duration of users.
Ip-Pool                           IP address pool name delivered by the RADIUS server.
AcctSession-ID                    ID for real-time accounting.
FramedRoute                       User route delivered by the RADIUS server.
FramedNetMask                     Gateway address delivered by the RADIUS server.
Up-CIR                            Upstream traffic committed information rate (CIR) delivered by the RADIUS server.
Down-CIR                          Downstream traffic CIR delivered by the RADIUS server.
Up-PIR                            Upstream traffic peak information rate (PIR) delivered by the RADIUS server.
Down-PIR                          Downstream traffic PIR delivered by the RADIUS server.
Down-Priority                     Priority of a user's downstream traffic delivered by the RADIUS server.
Lease-time52                      Lease agent delivered by the RADIUS server.
Renewal-Time                      Renewed address lease delivered by the RADIUS server.
Rebinding-Time                    Rebound address lease delivered by the RADIUS server.
Renewal-Time52                    Renewed lease agent delivered by the RADIUS server.
Rebinding-Time52                  Rebound lease agent delivered by the RADIUS server.
Web-IpAddress                     IP address of the Web authentication server. It is used to back up information about Web authentication users.
Web-VRF                           VPN instance of the Web authentication server. It is used to back up information about Web authentication users.
L2TP assigned local tunnel id     Local tunnel index assigned by L2TP.
L2TP assigned local session id    Local session index assigned by L2TP.
Radius proxy IP address           Destination IP address carried in a received RADIUS packet sent by a client when the BAS device functions as a RADIUS proxy.

Radius client IP address          Source IP address carried in a received RADIUS packet sent by a client when the BAS device functions as a RADIUS proxy.
Radius client VRF                 VPN instance to which a RADIUS client belongs.
AcctSession-ID on Radius client   Accounting session ID of a client.
Radius client NAS ID              Name of the NAS of a RADIUS client.
Called ID of Radius proxy user    Called-Station-Id attribute of a RADIUS proxy user.
Calling ID of Radius proxy user   Calling-Station-Id attribute of a RADIUS proxy user.

When backing up information about access users, you need to ensure that the configurations
of the active and standby BRASs are consistent, including the IP address, VLAN, and QoS
parameters. You need to ensure the consistency of common attributes. The special attributes
of a user are backed up through TCP. Figure 8-4 shows the process of backing up the special
attributes of a user. A TCP connection can be set up based on the uplinks connecting to the

Figure 8-4 Diagram for user information backup for high service reliability
[Figure labels: LSW-1; LSW-2; BFD; Device1; Device2]

The user information backup function supports backup of information about authentication,
accounting, and authorization of users. The NE40E controls user access according to the
master/backup status negotiated through VRRP. Only the active device can handle users'
access requests and perform authentication, real-time accounting, and authorization for users.
The standby device discards users' access requests.

After a user logs on through the active device, the active device backs up information about
the user to the standby device through TCP. The standby device generates a corresponding
service based on user information. This ensures that the standby device can smoothly take
over services from the active device when the active device fails.

When the active device fails (for example, the system restarts), services are switched to the
standby device. When the active device recovers, services need to be switched back. The
active device, however, lacks information about users. Therefore, information about users on
the standby device must be backed up to the active device in batch. At present, the maximum
rate of information backup is 1000 pieces of information per second.
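
The stated maximum backup rate gives a lower bound on how long a batch backup takes. The following Python sketch works through that arithmetic. The function name is hypothetical, and the example assumes one piece of information per user entry, which this section does not confirm.

```python
import math

MAX_BACKUP_RATE = 1000  # pieces of information per second, as stated in the text


def min_batch_backup_seconds(entries: int, rate: int = MAX_BACKUP_RATE) -> int:
    """Lower bound, in whole seconds, for backing up the given number of entries."""
    return math.ceil(entries / rate)


# Example: 64,000 entries need at least 64 seconds at the maximum rate.
seconds_for_64k = min_batch_backup_seconds(64000)
```

Because the VRRP switchback waits for the batch backup to finish, this bound is also a lower bound on how long the recovered device remains in the standby state.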

As shown in Figure 8-5, the entire service control process can be divided into the following
phases:
1. Backup phase


– The two NE40Es negotiate the active device (Device1) and standby device
(Device2) using VRRP.
– A user logs on through Device1, and information about this user is backed up to
Device2 in a real-time manner.
– The two NE40Es detect the link between them through BFD or Ethernet OAM.
2. Switchover phase
– For user-to-network traffic, if a link to Device 1 fails, VRRP, with the help of BFD
or Ethernet OAM, rapidly switches Device 1 to the backup state and Device 2 to the
master state and advertises gratuitous ARP packets to update the MAC address
table on the LSW, which allows subsequent user packets to successfully reach
Device 2.
– For network-to-user traffic, if a link to Device 1 fails, Device 2 forwards traffic
based on the backup ARP entry, preventing traffic loss.
3. Switchback phase
– The link on Device1 recovers, and VRRP renegotiates the active device and the
standby device. Then, Device1 acts as the active device; Device2 acts as the
standby device. In this case, Device2 needs to back up information about all users
to Device1 in batch and Device1 needs to back up information about users on it to
Device2. User entry synchronization between the two devices is bidirectional.
– Before the batch backup is completed, the VRRP switchover is not performed. At
this time, Device1 is still the standby device and Device2 is still the active device.
When the batch backup is completed, the VRRP switchover is performed. Device1
becomes the active device and sends a gratuitous ARP packet; Device2 becomes the
standby device and completes switchback of user services.
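
The key point of the switchback phase is that the VRRP switchover is deferred until the batch backup is complete. A minimal Python sketch of that gating follows. The class `Switchback` and its attribute names are hypothetical, and entry counts stand in for the actual user information.

```python
class Switchback:
    """Hypothetical model of the switchback phase after Device1 recovers."""

    def __init__(self, entries_to_restore: int) -> None:
        self.remaining = entries_to_restore   # entries Device2 still has to send
        self.roles = {"Device1": "standby", "Device2": "active"}
        self.gratuitous_arp_sent = False

    def batch_backup(self, entries: int) -> None:
        """Device2 backs up a batch of user entries to Device1."""
        self.remaining = max(0, self.remaining - entries)
        if self.remaining == 0 and self.roles["Device1"] == "standby":
            self._vrrp_switchover()

    def _vrrp_switchover(self) -> None:
        """Performed only after the batch backup is complete."""
        self.roles = {"Device1": "active", "Device2": "standby"}
        self.gratuitous_arp_sent = True   # Device1 announces itself to the LSW


sb = Switchback(entries_to_restore=2500)
sb.batch_backup(1000)
sb.batch_backup(1000)
roles_during_backup = dict(sb.roles)   # Device1 is still the standby device
sb.batch_backup(1000)
roles_after_backup = dict(sb.roles)    # Device1 is now the active device
```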

Figure 8-5 Flowchart for service control for high service reliability

[Figure labels: Router1; Router2; VRRP negotiation (Master, Standby); User information; Detection (BFD/Eth OAM); Switchover (Master => standby, Standby => master); User information; Standby => master, Master => standby]


The NE40E provides high reliability protection for Web authentication users. The principle of
high reliability protection for Web authentication users is similar to that for ordinary access
users. No special configuration is needed on the Web server.

8.2.4 IPv4 Unicast Forwarding Control

When a link fails, the NE40E needs to refresh the MAC forwarding table of the connected
LSW to correctly forward the traffic. In addition, routes must be controlled to ensure that the
traffic on the network side can reach users. The BRAS directs downstream traffic by
advertising a route whose next hop address is an address in the address pool. Therefore,
special processing must be done on routes for high reliability to ensure that the downstream
traffic can be correctly forwarded to users.

The NE40E controls downstream traffic in two modes.

• Traffic control using a route
• Traffic control using a tunnel

Traffic Control Through a Route

The NE40E controls downstream traffic by withdrawing or advertising a route whose next
hop address is an address in an address pool. As shown in Figure 8-6, Device1 acts as the
active device and Device2 acts as the standby device. Device1 advertises a route to the router.
Device2 withdraws the corresponding route. In this case, traffic can be forwarded from the
router to the PC through Device1.

After the active/standby switchover, Device1 acts as the standby device and Device2 acts as
the active device. Device1 withdraws the route and Device2 advertises the route. In this case,
traffic can be forwarded from the router to the PC through Device2.

You need to ensure that no fault occurs on the active device after a switchover caused by a
link failure or device failure. Route control in this mode is based on the active/standby status
of the device. You can ensure that traffic can be forwarded from the router to the PC by
controlling a route.
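
Route-based control reduces to a mapping from device status to a route action. The following Python sketch illustrates this under stated assumptions: the function name `route_actions` and the role strings are hypothetical, and the actual routing protocol interaction is not modeled.

```python
from typing import Dict


def route_actions(roles: Dict[str, str]) -> Dict[str, str]:
    """Map each device's active/standby status to the action on the address pool route."""
    return {
        device: "advertise" if role == "active" else "withdraw"
        for device, role in roles.items()
    }


before_switchover = route_actions({"Device1": "active", "Device2": "standby"})
after_switchover = route_actions({"Device1": "standby", "Device2": "active"})
```

At any time only the active device advertises the route, so the upstream router has a single path to the PC.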

Figure 8-6 Diagram for traffic control using a route




Traffic Control Through Tunneling

The NE40E controls downstream traffic through LSPs, MPLS TE, GRE, and IP redirection.
As shown in Figure 8-7, Device1 acts as the active device and Device2 acts as the standby
device. Device1 advertises a route to the router. Device2 advertises a route with a lower
priority to the router. In this case, there are two routes to the PC on the router. Traffic is
forwarded to the PC through Device1 because the priority of the route on Device1 is higher.
After the active/standby switchover, neither Device1 nor Device2 needs to handle any route.
Therefore, the traffic from the router to the PC still passes through Device1. Device1 is in the
standby state; therefore, it does not forward traffic to the PC directly but sends the traffic
through a tunnel. Device2 receives the traffic and forwards it to the PC.
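
In tunnel-based control the routes stay unchanged across a switchover and only the forwarding path changes. The following Python sketch traces the downstream path in both states. The function `downstream_path` is a hypothetical name, and the sketch assumes that a larger number means a more preferred route; the priority values are invented for the example.

```python
from typing import Dict, List


def downstream_path(route_priority: Dict[str, int], roles: Dict[str, str]) -> List[str]:
    """Return the hops that downstream traffic takes from the router to the PC."""
    # The upstream router always picks the advertised route with the higher priority.
    entry = max(route_priority, key=route_priority.get)
    if roles[entry] == "active":
        return ["router", entry, "PC"]
    # The entry device is in the standby state, so it tunnels traffic to the active device.
    active = next(dev for dev, role in roles.items() if role == "active")
    return ["router", entry, "tunnel", active, "PC"]


priorities = {"Device1": 200, "Device2": 100}   # unchanged across the switchover
path_normal = downstream_path(priorities, {"Device1": "active", "Device2": "standby"})
path_switched = downstream_path(priorities, {"Device1": "standby", "Device2": "active"})
```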

Figure 8-7 Diagram for traffic control through tunneling




[Figure panels: 1. No fault occurs; 2. A rapid active/standby switchover is performed after a link fails; 3. Routes converge after a node fault occurs]


8.2.5 IPv4 Multicast Forwarding Control

This section describes how to control IPv4 multicast service forwarded to a Dynamic Host
Configuration Protocol (DHCP) or Point-to-Point Protocol over Ethernet (PPPoE) set top box
(STB) along the SmartLink or enhanced trunk (E-Trunk) active and standby links.
Dual-device hot backup must be configured before multicast hot backup is configured.

Dual-Device Hot Backup for Multicast Traffic Sent to a DHCP STB

• Procedure for getting online and ordering multicast programs
If a user device is connected to a DHCP STB, the user device sends DHCP packets to
attempt to get online. The procedure for getting online from a Broadband Remote Access
Server (BRAS) enabled with dual-device hot backup is as follows:
a. A user device sends DHCP packets to request an IP address.
b. After receiving DHCP packets, the master BRAS attempts to authenticate user
information. If authentication is successful, the master BRAS allocates an IP
address to the user. The slave BRAS does not provide access services for the user.
c. The user gets online successfully.
d. The master BRAS sends user information to the slave BRAS along a backup
channel. The slave BRAS uses the information to locally generate control and
forwarding information for the user.

Figure 8-8 Hot backup for multicast traffic sent to a DHCP STB




On the network shown in Figure 8-8, the procedure for ordering multicast programs is as
follows:
a. A DHCP STB sends an Internet Group Management Protocol (IGMP) Report
message to an aggregation switch, and the switch forwards the message to both the
master and slave BRASs.
b. Both the master and slave BRASs receive the IGMP Report message, and pull
multicast traffic from multicast sources.
c. The master BRAS replicates multicast traffic to the STB, but the slave BRAS does
not.

Dual-Device Hot Backup for Multicast Traffic Sent to a PPPoE STB

When users attached to a PPPoE STB order multicast programs, the multicast replication
point can only be a single BRAS. PPPoE data flows are transmitted between the STB and the
BRAS in end-to-end mode. An STB MAC address and a BRAS MAC address identify a
PPPoE connection, and a session ID identifies a PPPoE session.


After a PPPoE STB sends an IGMP Report message, the master BRAS, not the slave BRAS,
can receive this message. To protect multicast services on the master BRAS, hot backup is
implemented to allow the slave BRAS to synchronize IGMP messages with the master BRAS.
After dual-device hot backup is implemented, the procedure for forwarding multicast traffic
from the master BRAS to the PPPoE STB is as follows:
1. The STB establishes a Point-to-Point Protocol (PPP) connection to the master BRAS.
The BRAS backs up the STB information to the slave BRAS. After receiving the
information, the slave BRAS locally generates control and forwarding information for
the PPP user.
2. The STB sends an IGMP Report message. After receiving the message, the master
BRAS backs the message up to the slave BRAS, sends a Join message to the RP to pull
multicast traffic, and establishes a rendezvous point tree (RPT).
3. After receiving the IGMP Report message from the backup channel, the slave BRAS
sends a Join message to the RP to pull multicast traffic and establishes an RPT.
4. The master BRAS replicates multicast traffic to the STB, but the slave BRAS does not.
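
The PPPoE procedure above can be sketched as two cooperating objects. This Python sketch is illustrative only: the class `Bras` and its methods are hypothetical names, adding a group to `joined_groups` stands in for sending a Join message to the RP, and a direct method call stands in for the backup channel.

```python
from typing import Optional, Set


class Bras:
    """Hypothetical model of a BRAS in a multicast hot backup pair."""

    def __init__(self, name: str, role: str) -> None:
        self.name = name
        self.role = role                      # "master" or "slave"
        self.joined_groups: Set[str] = set()  # groups for which traffic is pulled
        self.peer: Optional["Bras"] = None

    def on_igmp_report(self, group: str, from_backup_channel: bool = False) -> None:
        """Handle an IGMP Report received from the STB or from the backup channel."""
        if self.role == "master" and not from_backup_channel and self.peer is not None:
            # The master backs the message up to the slave BRAS.
            self.peer.on_igmp_report(group, from_backup_channel=True)
        # Both BRASs pull multicast traffic for the group.
        self.joined_groups.add(group)

    def replicates_to_stb(self, group: str) -> bool:
        """Only the master BRAS replicates multicast traffic to the STB."""
        return self.role == "master" and group in self.joined_groups


master, slave = Bras("BRAS-1", "master"), Bras("BRAS-2", "slave")
master.peer, slave.peer = slave, master

master.on_igmp_report("239.1.1.1")   # the PPPoE STB's Report reaches only the master
```

Because the slave BRAS has already pulled the multicast traffic, it can start replicating as soon as it takes over the master role.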

Modes of Controlling Active and Standby Links

• Using SmartLink to control active and standby links
SmartLink is a protocol running on switches to control active and standby links. The
active link can send and receive packets and the standby link does not send packets or
forward received packets. VLAN-based SmartLink can be used. SmartLink controls the
active and standby states for a pair of links over which VLAN-specific packets are
transmitted.
An aggregation switch running SmartLink is dual-homed to BRASs. If SmartLink
detects that the physical status of the active link is Down, SmartLink starts to use the
standby link to forward data. After the active link recovers, traffic can switch back to the
active link or remain on the standby link.
On a network shown in Figure 8-9, the master BRAS receives an IGMP Report or Leave
message from an STB, and backs up the message to the slave BRAS. After receiving the
IGMP Report or Leave message, the slave BRAS pulls multicast traffic, establishes a
multicast forwarding entry, and prunes a multicast path.

Figure 8-9 Multicast service hot backup for a DHCP STB using SmartLink to control
active and standby links


• Using E-Trunk to control active and standby links

On the network shown in Figure 8-10, E-Trunk, an extension to LACP, implements link
aggregation among BRASs. E-Trunk controls the active and standby states of bundled
member links between two BRASs and an aggregation switch. E-Trunk allows the active
link to forward data packets and does not allow the standby link to forward or receive
data packets.

VRRP is enabled on directly connected interfaces on the master and slave BRASs, and
VRRP tracks Eth-Trunk interfaces. The VRRP status is consistent with the E-Trunk
status.
The master BRAS receives an IGMP Report or Leave message from an STB, and backs
up the message to the slave BRAS. After receiving the IGMP Report or Leave message,
the slave BRAS pulls multicast traffic, establishes a multicast forwarding entry, and
prunes a multicast path.

Figure 8-10 Multicast service hot backup for a DHCP STB using E-Trunk to control
active and standby links
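
The SmartLink switchover and optional switchback described earlier in this section can be modelled as a small state machine. This is a minimal sketch under stated assumptions: the class SmartLinkGroup, the revertive flag, and the link names are hypothetical and do not correspond to device commands.

```python
class SmartLinkGroup:
    """Hypothetical model of one SmartLink group with a primary and a secondary link."""

    def __init__(self, primary: str, secondary: str, revertive: bool = False) -> None:
        self.primary = primary
        self.secondary = secondary
        self.revertive = revertive            # switch back after the primary recovers?
        self.up = {primary: True, secondary: True}
        self.forwarding = primary             # the link currently used to forward data

    def set_physical_status(self, link: str, is_up: bool) -> str:
        self.up[link] = is_up
        other = self.secondary if self.forwarding == self.primary else self.primary
        if not self.up[self.forwarding] and self.up[other]:
            # The forwarding link went Down: start using the other link.
            self.forwarding = other
        elif self.revertive and self.up[self.primary]:
            # The primary link is Up again and switchback is enabled.
            self.forwarding = self.primary
        return self.forwarding


group = SmartLinkGroup("to-master-bras", "to-slave-bras")
group.set_physical_status("to-master-bras", False)   # traffic moves to the standby link
group.set_physical_status("to-master-bras", True)    # non-revertive: traffic stays put
```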



Master/Slave Switchover in the Case of a Fault in Multicast Dual-Device Hot Backup

If an access- or network-side fault occurs on the NE40E or the NE40E fails, a master/backup
VRRP switchover is performed. After the backup NE40E switches to the Master state, it
forwards traffic. The original master NE40E switches to the Backup state and prunes traffic.

8.2.6 IPv6 Unicast Forwarding Control

This section describes the roles of NE40Es as broadband remote access servers
(BRASs), Dynamic Host Configuration Protocol version 6 (DHCPv6) servers, and DHCPv6
relay agents in Internet Protocol version 6 (IPv6) unicast forwarding control.

NE40Es Functioning as BRASs

On the network shown in Figure 8-11, NE40E-1 and NE40E-2 function as BRASs and run
redundancy user information (RUI). A Virtual Router Redundancy Protocol (VRRP) backup
group is configured for the two NE40Es, with NE40E-1 as the master and NE40E-2 as the
backup. When the link between the switch (SW) and NE40E-1 fails, the fault triggers a
master/backup VRRP switchover. Then, NE40E-2 becomes the master and starts neighbor
discovery (ND) detection, and NE40E-1 becomes the backup and stops the ND detection. If
the link-local address or MAC address on an interface of NE40E-2 is different from that of an
interface on NE40E-1, some users will go offline, or some user packets will be discarded.

Figure 8-11 Active link fault on the access side

To prevent a user from detecting the active link fault, NE40E-2 must use the same link-local
address and MAC address as those of NE40E-1.

• Link-local address generation
When an NE40E sends ND packets, the source IP address must be a link-local
address.
After RUI is enabled on the NE40Es, the master and backup BRASs generate the same
link-local address using the virtual MAC address of the VRRP backup group. The
link-local address is generated automatically, so no manual configuration is required.
• Protection tunnel forwarding
Address pool backup allows the master and backup BRASs to have the same MAC
address. Address pool backup in IPv6 unicast forwarding control is similar to that in
IPv4 unicast forwarding control. For details, see 8.2.4 IPv4 Unicast
Forwarding Control.
IPv6 unicast forwarding allows the NE40Es to control traffic through multiprotocol label
switching (MPLS) label switched paths (LSPs) and supports simplified protection tunnel
configuration, requiring only MPLS LSPs for virtual private networks (VPNs). Each
VPN swaps its forwarding labels using a Huawei-proprietary protocol, avoiding the need
to configure the Border Gateway Protocol (BGP) on the NE40Es.
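
To illustrate why deriving the link-local address from the VRRP virtual MAC address gives the same result on both BRASs, the sketch below builds a link-local address from a MAC address. The exact derivation used by the NE40E is not stated in this document, so the modified EUI-64 construction, the function names, and the use of the standard VRRP IPv6 virtual MAC format 00-00-5E-00-02-{VRID} are assumptions made for the example.

```python
import ipaddress


def vrrp_virtual_mac(vrid: int, ipv6: bool = True) -> bytes:
    """Standard VRRP virtual MAC: 00-00-5E-00-02-{VRID} (IPv6) or 00-00-5E-00-01-{VRID} (IPv4)."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1-255")
    return bytes([0x00, 0x00, 0x5E, 0x00, 0x02 if ipv6 else 0x01, vrid])


def link_local_from_mac(mac: bytes) -> ipaddress.IPv6Address:
    """Build fe80::/64 plus a modified EUI-64 interface ID (assumed derivation)."""
    if len(mac) != 6:
        raise ValueError("a MAC address has 6 octets")
    # Flip the universal/local bit and insert ff:fe in the middle of the MAC.
    interface_id = bytes([mac[0] ^ 0x02]) + mac[1:3] + b"\xff\xfe" + mac[3:]
    return ipaddress.IPv6Address(b"\xfe\x80" + bytes(6) + interface_id)


# Both BRASs use the same virtual MAC, so both derive the same address.
master_lla = link_local_from_mac(vrrp_virtual_mac(1))
backup_lla = link_local_from_mac(vrrp_virtual_mac(1))
print(master_lla)
```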

NE40Es Functioning as DHCPv6 Servers

Additionally, an NE40E can function as a DHCPv6 server or relay agent in IPv6 unicast
forwarding control.

Figure 8-12 RUI networking where NE40Es function as DHCPv6 servers




On the network shown in Figure 8-12, the NE40Es act as the master and backup DHCPv6
servers by running VRRP. The master DHCPv6 server assigns an IPv6 address to the PC. The
DHCPv6 packets that the master DHCPv6 server sends carry the DHCP unique identifier
(DUID), which uniquely identifies the DHCPv6 server. If RUI is enabled for the two
DHCPv6 servers, to ensure that the new master DHCPv6 server sends correct DHCPv6
packets to the PC after a master/backup switchover, the master and backup DHCPv6 servers
must use the same DUID.
The NE40Es automatically generate a DUID in the link-layer address (DUID-LL) mode using the
virtual MAC address of the VRRP backup group. This avoids the need to configure a
DUID in the link-layer address plus time (DUID-LLT) mode or to configure a DUID statically.
After the DUID is generated in the DUID-LL mode, the master and backup DHCPv6 servers
do not use the globally configured DUID, which removes the need to back up the DUID
between the servers.
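
The following sketch shows why a DUID-LL built from a shared virtual MAC address is identical on both servers, whereas a DUID-LLT also depends on when it was generated. The byte layouts follow the DUID formats defined for DHCPv6 (type 3 for DUID-LL, type 1 for DUID-LLT, hardware type 1 for Ethernet); the function names and the example MAC address are assumptions made for illustration.

```python
import struct

DUID_TYPE_LLT = 1      # DUID based on link-layer address plus time
DUID_TYPE_LL = 3       # DUID based on link-layer address only
HW_TYPE_ETHERNET = 1   # IANA hardware type for Ethernet


def duid_ll(mac: bytes) -> bytes:
    """DUID-LL: 2-byte type, 2-byte hardware type, then the link-layer address."""
    return struct.pack("!HH", DUID_TYPE_LL, HW_TYPE_ETHERNET) + mac


def duid_llt(mac: bytes, seconds_since_2000: int) -> bytes:
    """DUID-LLT: like DUID-LL, with a 4-byte generation time before the address."""
    return struct.pack("!HHI", DUID_TYPE_LLT, HW_TYPE_ETHERNET,
                       seconds_since_2000 & 0xFFFFFFFF) + mac


virtual_mac = bytes.fromhex("00005e000201")   # example VRRP virtual MAC (VRID 1)
master_duid = duid_ll(virtual_mac)            # computed on the master
backup_duid = duid_ll(virtual_mac)            # computed independently on the backup
print(master_duid.hex(), master_duid == backup_duid)
```

With DUID-LLT, two devices that generate the DUID at different times obtain different values unless the value is copied between them, which is the backup step that DUID-LL makes unnecessary.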

NE40Es Functioning as DHCPv6 Relay Agents

Figure 8-13 RUI networking where NE40Es function as DHCPv6 relay agents




On the network shown in Figure 8-13, the NE40Es act as the master and backup DHCPv6
relay agents. A unique DHCPv6 relay agent remote-ID identifies the master DHCPv6 relay
agent. In the RUI-enabled scenario, to enable the backup DHCPv6 relay agent to forward
DHCPv6 packets after a master/backup switchover, the master and backup DHCPv6 relay
agents must use the same DHCPv6 relay agent remote-ID. This ensures that the DHCPv6
server processes the packets correctly.
The RUI-enabled NE40Es use the DUID that identifies the master and backup DHCPv6 servers as
the DHCPv6 relay agent remote-ID to identify both the master and backup DHCPv6 relay
agents.

8.3 Application Scenarios for Dual-Device Backup

8.3.1 Dual-Device ARP Hot Backup

Networking Description
Dual-device ARP hot backup enables the master device to back up the ARP entries at the
control and forwarding layers to the backup device in real time. When the backup device
switches to a master device, it uses the backup ARP entries to generate host routing
information. After you deploy dual-device ARP hot backup, the new master device forwards
downlink traffic without needing to relearn ARP entries. Dual-device ARP hot backup
ensures downlink traffic continuity.


Dual-device ARP hot backup applies in both Virtual Router Redundancy Protocol (VRRP) and enhanced
trunk (E-Trunk) scenarios. This section describes the implementation of dual-device ARP hot backup in
VRRP scenarios.

Figure 8-14 shows a typical network topology in which a Virtual Router Redundancy
Protocol (VRRP) backup group is deployed. In the topology, Device A is a master device, and
Device B is a backup device. In normal circumstances, Device A forwards both uplink and
downlink traffic. If Device A or the link between Device A and the switch fails, a master/
backup VRRP switchover is triggered to switch Device B to the Master state. Device B needs
to advertise a network segment route to a device on the network side to direct downlink traffic
from the network side to Device B. If Device B has not learned ARP entries from a device on
the user side, the downlink traffic is interrupted. Device B forwards the downlink traffic only
after it learns ARP entries from a device on the user side.

Figure 8-14 VRRP networking





E-Trunk Active-Active Networking

In Figure 8-15, when no fault occurs, Device A and Device B load-balance traffic. Device C,
which provides access services, adds its links to Device A and Device B to an E-Trunk
interface and load-balances traffic between Device A and Device B.
In this situation, each ARP packet is sent through a single Eth-Trunk member link and reaches
only one of the two devices. Device A and Device B each receive only some of the ARP packets
sent by Device C, so the two devices learn incomplete ARP entries. In this case, Device A and
Device B need to learn ARP entries from each other and back up ARP information for each other.
If Device A fails, services can be switched to Device B, which prevents A-to-C or B-to-C traffic
interruption.

Figure 8-15 E-Trunk active-active ARP dual-device hot backup





Feature Deployment
To prevent downlink traffic from being interrupted because Device B does not learn ARP
entries from a device on the user side, deploy dual-device ARP hot backup on Device A and
Device B, as shown in Figure 8-16.

Figure 8-16 Dual-device ARP hot backup



After the deployment, Device B backs up the ARP entries on Device A in real time. If a
master/backup VRRP switchover occurs, Device B forwards downlink traffic based on the
backup ARP entries without needing to relearn ARP entries from a device on the user side.
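
The effect of real-time ARP backup can be shown with a minimal model. This is a sketch, not the device implementation: the class Gateway, its methods, and the addresses are hypothetical, and the backup channel is reduced to a direct dictionary update on the peer.

```python
from typing import Dict, Optional


class Gateway:
    """Hypothetical device that mirrors every learned ARP entry to its peer."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.arp: Dict[str, str] = {}          # IP address -> MAC address
        self.peer: Optional["Gateway"] = None

    def learn_arp(self, ip: str, mac: str) -> None:
        # Learn the entry locally, then back it up to the peer in real time.
        self.arp[ip] = mac
        if self.peer is not None:
            self.peer.arp[ip] = mac

    def can_forward_downlink(self, ip: str) -> bool:
        # Downlink forwarding needs an ARP entry for the destination host.
        return ip in self.arp


device_a, device_b = Gateway("Device A"), Gateway("Device B")
device_a.peer, device_b.peer = device_b, device_a
device_a.learn_arp("10.1.1.10", "00:11:22:33:44:55")

# After a switchover, Device B already holds the entry learned by Device A.
print(device_b.can_forward_downlink("10.1.1.10"))
```

Because each device mirrors to its peer, the same model also covers the E-Trunk active-active case, where each device learns only part of the ARP entries.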

8.3.2 Dual-Device IGMP Snooping Hot Backup

Networking Description
Dual-device IGMP snooping hot backup enables the master device and the backup device to
synchronously generate multicast entries in real time. IGMP protocol packets are
synchronized from the master device to the backup device, so that the same multicast
forwarding table entries can be generated on the backup device. After you deploy dual-device
IGMP snooping hot backup, the new master device forwards downlink traffic without needing to
relearn multicast forwarding table entries through IGMP snooping. Dual-device IGMP snooping hot
backup ensures downlink traffic continuity.
Figure 8-17 shows a typical network topology in which an Eth-Trunk group is deployed. In
the topology, Device A is a master device, and Device B is a backup device. In normal
circumstances, Device A forwards both uplink and downlink traffic. If Device A or the link
between Device A and the switch fails, a master/backup Eth-Trunk link switchover is
triggered to switch Device B to the Master state. Device B needs to advertise a network
segment route to a device on the network side to direct downlink traffic from the network side
to Device B. If Device B has not generated multicast forwarding entries directing traffic to the
user side, the downlink traffic is interrupted. Device B forwards the downlink traffic only
after it generates forwarding entries directing traffic to the user side.

Figure 8-17 Eth-Trunk Networking





Feature Deployment
To prevent downlink traffic from being interrupted because Device B does not generate
multicast forwarding entries directing traffic to the user side, deploy dual-device IGMP
snooping hot backup on Device A and Device B, as shown in Figure 8-18.

Figure 8-18 Dual-device IGMP Snooping hot backup




After the deployment, Device A and Device B generate the same multicast forwarding entries
at the same time. If a master/backup Eth-Trunk link switchover occurs, Device B forwards
downlink traffic based on the generated multicast forwarding entries without needing to
generate the entries directing traffic to the user side.

8.3.3 Single-Homing Access in a Multi-Node Backup Scenario

Dual-homing access may fail to be deployed in a multi-node backup scenario due to
insufficient resources. In this case, single-homing access can be used. On the
network shown in Figure 8-19, network traffic can be forwarded by either NE40E 1 or
NE40E 2. If common single-homing access is used, NE40E 2 discards User1's
change-of-authorization (COA) or disconnect message (DM) and web authentication response
messages upon receipt. As a result, User1's COA/DM and web authentication fail. The same
problem occurs if the link between NE40E 1 and the network fails.

Figure 8-19 Common single homing

[Figure annotations: A fault occurs on the link. User1's COA/DM and web authentication
response messages are forwarded to Device 2.]

To resolve the preceding problem, configure user data virtual backup between NE40E 1 and
NE40E 2. On the network shown in Figure 8-20, information about User1's identity is backed
up on NE40E 2. The aggregation switch S1 is single-homed to NE40E 1. VRRP is deployed
on the access side. One VRRP protection group is deployed for each pair of active and
standby links. If the VRRP group is in the Master state, the access link can be accessed by
users. If User1's COA/DM and web authentication response messages are randomly delivered
to NE40E 2, user data virtual backup allows NE40E 2 to forward the response messages to
NE40E 1. Additionally, if the link between NE40E 1 and the network fails, NE40E 2
can take over the traffic on the faulty link, preventing traffic interruption.
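
The difference between common single-homing access and single-homing access with user data virtual backup can be sketched as follows. The class AccessNode, its methods, and the returned strings are hypothetical and only illustrate the forwarding decision; they are not taken from the product.

```python
from typing import Dict, List, Optional, Set, Tuple


class AccessNode:
    """Hypothetical NE40E that may hold backed-up identities of a peer's users."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.local_users: Set[str] = set()
        self.backed_up: Dict[str, "AccessNode"] = {}   # user -> node that owns the user
        self.delivered: List[Tuple[str, str]] = []

    def user_online(self, user: str, backup: Optional["AccessNode"] = None) -> None:
        self.local_users.add(user)
        if backup is not None:
            # User data virtual backup: the peer learns which node owns the user.
            backup.backed_up[user] = self

    def receive(self, user: str, message: str) -> str:
        if user in self.local_users:
            self.delivered.append((user, message))
            return f"processed by {self.name}"
        owner = self.backed_up.get(user)
        if owner is not None:
            # Forward the response message to the node the user is attached to.
            return owner.receive(user, message)
        return f"discarded by {self.name}"


ne1, ne2 = AccessNode("NE40E 1"), AccessNode("NE40E 2")
ne1.user_online("User1", backup=ne2)
print(ne2.receive("User1", "COA"))
```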

Figure 8-20 Single-homing access in a multi-node backup scenario


Single-homing access in a multi-node backup scenario can be implemented only after user data virtual
backup is configured.

8.3.4 Dual-Homing Access in a Multi-Node Backup Scenario

Multi-system backup supports two types of access topologies: direct dual-homing access
through aggregation switches and dual-homing access through the ring network (semi-ring)
formed by aggregation switches.

Direct Dual-Homing Access Through Aggregation Switches

As shown in Figure 8-21, each aggregation switch is dual-homed to the master and slave
NE40Es. VRRP is deployed on the access side. One VRRP protection group is deployed for
each pair of active and standby links.

Figure 8-21 Dual-homing access through aggregation switches

Dual-Homing Access Through the Ring Network (Semi-ring) Formed by Aggregation Switches
In the case of ring-based access, the NE40E is not on the ring, and the access switch accesses
the NE40E through the aggregation switch. As shown in Figure 8-22, one VRRP group is
deployed across two aggregation switches. The VRRP group determines the active/standby
status of each access link. If the VRRP group is in the Master state, the access link can be
accessed by users. If the VRRP group is in the Slave state, the access link cannot be accessed
by users.

Figure 8-22 Dual-homing access through the ring network (semi-ring) formed by aggregation
switches
[Panels: U-shape access network; ring access network]

8.3.5 Load Balancing Between Equipment

User session information of multiple NE40Es is backed up on one NE40E. When a master
device is faulty, user services are switched to the slave device.
As shown in Figure 8-23, the NE40E in the middle serves as the slave device, and the
NE40Es on both sides serve as the master devices. Under normal circumstances, users go
online using the master devices. When a master device or its links fail, the slave device
takes over user services.

Figure 8-23 Deployment of equipment-level load balancing


In the topology shown in Figure 8-23, focus on the VLAN planning, and make sure that the
two NE40Es can be accessed by users simultaneously.

8.3.6 Load Balancing Between Links

As shown in Figure 8-24, when the NE40E needs to connect to multiple aggregation switches or
links, load balancing can be applied at the link level. The two NE40Es can
serve as the master and slave devices to protect each other, and both can be accessed
by users simultaneously.

Figure 8-24 Deployment of link-level load balancing


8.3.7 Load Balancing Between VLANs

As shown in Figure 8-25, if you need to enable access links to work concurrently to save link
resources, deploy load balancing at the VLAN level. Two VRRP groups need to be deployed.
One VRRP group allows some VLAN users to go online from the NE40E on the left side, and
the other VRRP group allows other VLAN users to go online from the NE40E on the right
side.

Figure 8-25 Deployment of VLAN-level load balancing


8.3.8 Load Balancing Based on Odd and Even MAC Addresses

This section describes load balancing based on the odd and even media access control (MAC)
addresses carried in user packets.

Figure 8-26 Load balancing based on odd and even MAC addresses




As shown in Figure 8-26, two Virtual Router Redundancy Protocol (VRRP) backup groups
are deployed on the access side. One VRRP backup group uses NE40E 1 as the master and
NE40E 2 as the backup, and the other uses NE40E 2 as the master and NE40E 1 as the
backup.
In multi-device backup scenarios, configure load balancing based on odd and even MAC
addresses to enable the master NE40E to forward only user packets carrying odd or even
MAC addresses.
To determine the forwarding path of uplink traffic and prevent packet disorder, the master and
backup NE40Es in the same virtual local area network (VLAN) must use different virtual
MAC addresses to establish sessions with hosts.
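
The selection rule can be illustrated with a short sketch. It assumes that "odd" and "even" refer to the least significant bit of the last octet of the user MAC address, and that NE40E 1 serves odd addresses while NE40E 2 serves even ones; both the parity definition and the assignment are assumptions for the example, as are the function names.

```python
def mac_is_odd(mac: str) -> bool:
    """Return True if the last octet of the MAC address is odd (assumed parity rule)."""
    octets = mac.replace("-", ":").split(":")
    if len(octets) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    return int(octets[-1], 16) & 1 == 1


def select_master(mac: str, odd_master: str = "NE40E 1",
                  even_master: str = "NE40E 2") -> str:
    """Pick the device that forwards packets for this user MAC address."""
    return odd_master if mac_is_odd(mac) else even_master


for user_mac in ("00:11:22:33:44:55", "00-11-22-33-44-56"):
    print(user_mac, "->", select_master(user_mac))
```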

8.3.9 Multicast Hot Backup

Hot backup is deployed between two NE40Es, and multicast hot backup is deployed on both
NE40Es at the same time. The two NE40Es serve as designated routers (DRs). The network-side interfaces of the DRs
are configured with Protocol Independent Multicast (PIM), and the user-side interfaces of the
DRs are configured to terminate the Internet Group Management Protocol (IGMP) messages
of STBs.
On this network, HSI, VoIP, and IPTV services can be protected.

Figure 8-27 Application of multicast hot backup in operator networks


As shown in Figure 8-27, the NE40Es serve as multicast replication points. Multicast hot
backup does not apply to VLAN-based or interface-based multicast replication.

8.4 Terminology for Dual-Device Backup

Term                          Definition

Dual-device backup            A feature in which one device functions as a master device and the
                              other functions as a backup device. In normal circumstances, the
                              master device provides service access and the backup device monitors
                              the running status of the master device. When the master device
                              fails, the backup device switches to a master device and provides
                              service access, ensuring service traffic continuity.

Remote Backup Profile         A configuration template that provides a unified user interface for
                              dual-system backup configurations.

Remote Backup Service         An inter-device backup channel, used to synchronize data between two
                              devices so that user services can smoothly switch from a faulty
                              device to another device during a master/backup device switchover.

Redundancy User Information   A Huawei-proprietary protocol used by devices to back up user
                              information between each other over TCP connections.

Acronyms and Abbreviations

Acronym and Abbreviation   Full Name

ARP                        Address Resolution Protocol
BFD                        Bidirectional Forwarding Detection
BRAS                       Broadband Remote Access Server
DHCP                       Dynamic Host Configuration Protocol
DR                         Designated Router
ETH OAM                    Ethernet Operations, Administration, and Maintenance
GRE                        Generic Routing Encapsulation
IGMP                       Internet Group Management Protocol
ISP                        Internet Service Provider
L2TP                       Layer 2 Tunneling Protocol
LAC                        L2TP Access Concentrator
LNS                        L2TP Network Server
LSP                        Label Switched Path
MAC                        Media Access Control
MPLS                       Multiprotocol Label Switching
PIM                        Protocol Independent Multicast
PPP                        Point-to-Point Protocol
PPPoE                      PPP over Ethernet
STB                        Set Top Box
TE                         Traffic Engineering
VLAN                       Virtual Local Area Network
VRRP                       Virtual Router Redundancy Protocol
RUI                        Redundancy User Information
RBS                        Remote Backup Service
RBP                        Remote Backup Profile

9 Bit-Error-Triggered Protection Switching

About This Chapter

9.1 Overview of Bit-Error-Triggered Protection Switching

9.2 Understanding Bit-Error-Triggered Protection Switching
9.3 Application Scenarios for Bit-Error-Triggered Protection Switching
9.4 Terminology for Bit-Error-Triggered Protection Switching

9.1 Overview of Bit-Error-Triggered Protection Switching

A bit error refers to the deviation between a bit that is sent and the bit that is received. Cyclic
redundancy checks (CRCs) are commonly used to detect bit errors. Bit errors caused by line
faults can be corrected by rectifying the associated link faults. Random bit errors caused by
optical fiber aging or optical signal jitter, however, are more difficult to correct.
Bit-error-triggered protection switching is a reliability mechanism that triggers protection
switching based on bit error events (bit error occurrence events or correction events) to
minimize bit error impact.

The demand for network bandwidth is rapidly increasing as mobile services evolve from
narrowband voice services to integrated broadband services, including voice and streaming
media. Meeting the demand for bandwidth with traditional bearer networks dramatically
raises carriers' operation costs. To tackle the challenges posed by this rapid broadband-
oriented development, carriers urgently need mobile bearer networks that are flexible, low-
cost, and highly efficient. IP-based mobile bearer networks are an ideal choice. IP radio
access networks (IPRANs), a type of IP-based mobile bearer network, are being increasingly
widely used.
Traditional bearer networks minimize bit error impact by using retransmission or selective
reception, in which one end accepts only one of the multiple copies of a packet sent by the
other end. IPRANs have higher reliability requirements than traditional bearer
networks when carrying broadband services. Traditional fault detection mechanisms cannot
trigger protection switching based on random bit errors. As a result, bit errors may degrade or
even interrupt services on an IPRAN.
To solve this problem, configure bit-error-triggered protection switching.


To prevent impacts on services, check whether protection links have sufficient bandwidth resources
before deploying bit-error-triggered protection switching.

Bit-error-triggered protection switching offers the following benefits:
• Protects traffic against random bit errors, meeting high reliability requirements and
improving service quality.
• Enables devices to record bit error events. These records help carriers locate the nodes or
lines that have bit errors and take corrective measures accordingly.

9.2 Understanding Bit-Error-Triggered Protection Switching


9.2.1 Bit Error Detection

Bit-error-triggered protection switching enables link bit errors to trigger protection switching
on network applications, minimizing the impact of bit errors on services. To implement bit-
error-triggered protection switching, establish an effective bit error detection mechanism to
ensure that network applications promptly detect bit errors.

Related Concepts
Bit error detection involves the following concepts:
• Bit error: deviation between a bit that is sent and the bit that is received.
• Bit error rate (BER): number of bit errors divided by the total number of transferred bits
during a certain period. The BER can be considered as an approximate estimate of the
probability of a bit error occurring on any particular bit.
• LSP BER: calculation result based on the BER of each node on an LSP.
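
As a worked example of the BER definition above, the following sketch divides the number of errored bits by the number of transferred bits in a measurement period. The function name and the sample counts are made up for illustration.

```python
def bit_error_rate(error_bits: int, total_bits: int) -> float:
    """BER = number of bit errors / total number of transferred bits."""
    if total_bits <= 0:
        raise ValueError("total_bits must be positive")
    if not 0 <= error_bits <= total_bits:
        raise ValueError("error_bits must be between 0 and total_bits")
    return error_bits / total_bits


# Example: 3 errored bits among 1,000,000 transferred bits in the period.
ber = bit_error_rate(3, 1_000_000)
print(f"{ber:.1e}")
```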

Interface-based Bit Error Detection

A device uses the CRC algorithm to detect bit errors on an inbound interface and calculate the
BER. If the BER exceeds the bit error alarm threshold configured on a device's interface, the
device determines that bit errors have occurred on the interface's link, and instructs an upper-
layer application to perform a service switchover. When the BER of the interface falls below
the bit error alarm clear threshold, the device determines that the bit errors have been cleared
from the interface, and instructs the upper-layer application to perform a service switchback.
To prevent line jitters from frequently triggering service switchovers and switchbacks, set the
bit error alarm clear threshold to be one order of magnitude lower than the bit error alarm
threshold.
Interfaces support the following types of bit error detection functions:
• Trigger-LSP: applies to bit-error-triggered RSVP-TE tunnel, PW, or L3VPN switching.
• Trigger-section: applies to bit-error-triggered section switching.
• Link-quality: applies to link quality adjustment. This type of detection triggers route cost
changes and in turn route reconvergence to prevent bit errors from affecting services.
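
The interplay of the bit error alarm threshold and the lower alarm clear threshold described in this section is a form of hysteresis. The sketch below models it under stated assumptions: the class BitErrorAlarm, the event names, and the sample threshold values are hypothetical and are not device defaults.

```python
from typing import Optional


class BitErrorAlarm:
    """Hypothetical per-interface alarm with separate alarm and clear thresholds."""

    def __init__(self, alarm_threshold: float, clear_threshold: float) -> None:
        if not clear_threshold < alarm_threshold:
            raise ValueError("the clear threshold must be lower than the alarm threshold")
        self.alarm_threshold = alarm_threshold
        self.clear_threshold = clear_threshold
        self.active = False

    def update(self, ber: float) -> Optional[str]:
        """Return the action for the upper-layer application, if any."""
        if not self.active and ber > self.alarm_threshold:
            self.active = True
            return "switchover"
        if self.active and ber < self.clear_threshold:
            self.active = False
            return "switchback"
        return None   # between the thresholds: keep the current state


# Clear threshold one order of magnitude below the alarm threshold.
alarm = BitErrorAlarm(alarm_threshold=1e-5, clear_threshold=1e-6)
events = [alarm.update(ber) for ber in (2e-5, 5e-6, 5e-7, 5e-6)]
print(events)
```

A BER that fluctuates between the two thresholds produces no event, which is how the gap between the thresholds suppresses repeated switchovers and switchbacks.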

Advertisement of the Bit Error Status

BFD mode
For dynamic services that use BFD to detect faults, a device uses BFD packets to advertise
the bit error status (including the BER). If the BER exceeds the bit error alarm threshold
configured on a device's interface, the device determines that bit errors have occurred on the
interface's link, and instructs an upper-layer application to perform a service switchover. The
device also notifies the BFD module of the bit error status, and then uses BFD packets to
advertise the bit error status to the peer device. If bit-error-triggered protection switching also
has been deployed on the peer device, the peer device performs protection switching.
If a transit node or the egress of a dynamic CR-LSP detects bit errors, the transit node or
egress must use BFD packets to advertise the BER. On the network shown in Figure 9-1, a
dynamic CR-LSP is deployed from PE1 to PE2. If both the transit node P and egress PE2
detect bit errors:
1. The P node obtains the local BER and sends PE2 a BFD packet carrying the BER.
2. PE2 obtains the local BER. After receiving the BER from the P node, PE2 calculates the
BER of the CR-LSP based on the BER received and the local BER.
3. PE2 sends PE1 a BFD packet carrying the BER of the CR-LSP.
4. After receiving the BER of the CR-LSP, PE1 determines the bit error status based on a
specified threshold. If the BER exceeds the threshold, PE1 performs protection
switching.
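
The steps above can be sketched numerically. This document does not state how the egress combines the received BER with its local BER, so the formula below is an assumption: it treats bit errors on the two hops as independent and computes the probability that a bit is corrupted on at least one of them. The function names and sample values are also hypothetical.

```python
def combine_ber(upstream_ber: float, local_ber: float) -> float:
    """Combined BER of two hops, assuming independent bit errors (an assumption)."""
    for ber in (upstream_ber, local_ber):
        if not 0.0 <= ber <= 1.0:
            raise ValueError("a BER must be between 0 and 1")
    return 1.0 - (1.0 - upstream_ber) * (1.0 - local_ber)


def needs_switching(lsp_ber: float, threshold: float) -> bool:
    # Step 4: the ingress compares the LSP BER with its configured threshold.
    return lsp_ber > threshold


p_ber = 1e-5                                        # step 1: BER measured on the P node
lsp_ber = combine_ber(p_ber, local_ber=2e-5)        # step 2: PE2 adds its local BER
switch = needs_switching(lsp_ber, threshold=1e-5)   # steps 3 and 4: decision on PE1
print(f"{lsp_ber:.3e}", switch)
```

For small values, the combined BER is close to the sum of the per-node BERs.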

Figure 9-1 BER advertisement using BFD packets



MPLS-TP OAM mode
For static services that use MPLS-TP OAM to detect faults, a device uses MPLS-TP OAM to
advertise the bit error status. If the BER reaches the bit error alarm threshold configured on an
interface of a device along a static CR-LSP or PW, the device determines that bit errors have
occurred on the interface's link, and notifies the MPLS-TP OAM module. The MPLS-TP
OAM module uses AIS packets to advertise the bit error status to the egress, and then APS is
used to trigger a traffic switchover.
If a transit node detects bit errors on a static CR-LSP or PW, the transit node uses AIS packets
to advertise the bit error status to the egress, triggering a traffic switchover on the static CR-
LSP or PW. On the network shown in Figure 9-2, a static CR-LSP is deployed from PE1 to
PE2. If the transit node P detects bit errors:
1. The P node uses AIS packets to notify PE2 of the bit error event.
2. After receiving the AIS packets, PE2 reports an AIS alarm to trigger local protection
switching. PE2 then sends CRC-AIS packets to PE1 and uses the APS protocol to
complete protection switching through negotiation with PE1.
3. After receiving the CRC-AIS packets, PE1 reports a CRC-AIS alarm.

Figure 9-2 Bit error status advertisement using AIS packets


9.2.2 Bit-Error-Triggered Section Switching

If bit errors occur on an interface, bit-error-triggered section switching triggers the
upper-layer application associated with the interface to perform a service switchover.

Implementation Principles
Trigger-section bit error detection must be enabled on an interface. After detecting bit errors
on an inbound interface, a device notifies the interface management module of the bit errors.
The link-layer protocol status of the interface then changes to bit-error-detection Down,
triggering an upper-layer application associated with the interface for a service switchover.
After the bit errors are cleared, the link-layer protocol status of the interface changes to Up,
triggering an upper-layer application associated with the interface for a service switchback.
The device also notifies the BFD module of the bit error status, and then uses BFD packets to
advertise the bit error status to the peer device.
● If bit-error-triggered section switching has also been deployed on the peer device, the bit
error status is advertised to the interface management module of the peer device. The
link-layer protocol status of the interface then changes to bit-error-detection Down or
Up, triggering an upper-layer application associated with the interface for a service
switchover or switchback.
● If bit-error-triggered section switching is not deployed on the peer device, the peer
device cannot detect the bit error status of the interface's link. In this case, the peer
device can only depend on an upper-layer application (for example, IGP) for link fault
detection.
For example, on the network shown in Figure 9-3, trigger-section bit error detection is
enabled on each interface, and nodes communicate through IS-IS routes. In normal cases, IS-
IS routes on PE1 and PE2 are preferentially transmitted over the primary link. Therefore,
traffic in both directions is forwarded over the primary link. If PE2 detects bit errors on the
interface to PE1:
● The link-layer protocol status of the interface changes to bit-error-detection Down,
triggering IS-IS routes to be switched to the secondary link. Traffic from PE2 to PE1 is
then forwarded over the secondary link. PE2 uses a BFD packet to notify PE1 of the bit
errors.
● After receiving the BFD packet, PE1 sets the link-layer protocol status of the
corresponding interface to bit-error-detection Down, triggering IS-IS routes to be
switched to the secondary link. Traffic from PE1 to PE2 is then forwarded over the
secondary link.
If trigger-section bit error detection is not supported or enabled on PE1's interface to PE2,
PE1 can only use IS-IS to detect that the primary link is unavailable, and then performs an IS-
IS route switchover.
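The behavior at both ends of the link can be modeled with a short Python sketch. The class `Interface`, its fields, and the function `apply_bit_error_status` are hypothetical names introduced only for illustration. The sketch assumes that a peer without the feature keeps its link-layer protocol status unchanged and depends on its IGP instead.

```python
from dataclasses import dataclass


@dataclass
class Interface:
    """One end of the primary link; all names here are illustrative."""
    device: str
    section_switching: bool            # trigger-section detection enabled
    link_protocol_state: str = "Up"


def apply_bit_error_status(local: Interface, peer: Interface, errored: bool) -> None:
    """Update both ends after `local` detects or clears bit errors."""
    if not local.section_switching:
        return  # nothing is triggered without trigger-section detection
    state = "bit-error-detection Down" if errored else "Up"
    local.link_protocol_state = state
    # The BFD advertisement changes the peer state only if the peer has the
    # feature deployed; otherwise the peer depends on its IGP to react.
    if peer.section_switching:
        peer.link_protocol_state = state


pe2 = Interface("PE2", section_switching=True)
pe1 = Interface("PE1", section_switching=True)
apply_bit_error_status(pe2, pe1, errored=True)
print(pe2.link_protocol_state, "/", pe1.link_protocol_state)
```

When `pe1` is created with `section_switching=False`, only the state on PE2 changes, which corresponds to the case in which PE1 can only use IS-IS to detect that the primary link is unavailable.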

Figure 9-3 Bit-error-triggered section switching
(Figure legend: bit errors, BFD packet, primary link, secondary link)

Usage Scenario
If LDP LSPs are used, deploy bit-error-triggered section switching to cope with link bit errors
on the LDP LSPs.


After bit-error-triggered section switching is deployed, if bit errors occur on both the primary and
secondary links on an LDP LSP, the interface status changes to bit-error-detection Down on both the
primary and secondary links. As a result, services are interrupted. Therefore, it is recommended that you
deploy bit-error-triggered IGP route switching.

9.2.3 Bit-Error-Triggered IGP Route Switching

Bit-error-triggered section switching can cope with link bit errors. If bit errors occur on both
the primary and secondary links, bit-error-triggered section switching changes the interface
status on both the primary and secondary links to bit-error-detection Down. As a result,
services are interrupted because no link is available. To resolve the preceding issue, deploy
bit-error-triggered IGP route switching. After the deployment is complete, link bit errors
trigger IGP route costs to be adjusted, preventing upper-layer applications from transmitting
service traffic to links with bit errors. Bit-error-triggered IGP route switching ensures normal
running of upper-layer applications and minimizes the impact of bit errors on services.

Implementation Principles
Link-quality bit error detection must be enabled on an interface. After detecting bit errors on
an inbound interface, a device notifies the interface management module of the bit errors. The
link quality level of the interface then changes to Low, triggering an IGP (OSPF or IS-IS) to
increase the cost of the interface's link. In this case, IGP routes do not preferentially select the
link with bit errors. After the bit errors are cleared, the link quality level of the interface
changes to Good, triggering the IGP to restore the original cost for the interface's link. In this
case, IGP routes preferentially select the link again. The device also notifies the BFD module
of the bit error status, and then uses BFD packets to advertise the bit error status to the peer
device.

● If bit-error-triggered IGP route switching has also been deployed on the peer device, the
bit error status is advertised to the interface management module of the peer device. The
link quality level of the interface then changes to Low or Good, triggering the IGP to
increase the cost of the interface's link or restore the original cost for the link. IGP routes
on the peer device then do not preferentially select the link with bit errors or
preferentially select the link again.
● If bit-error-triggered IGP route switching is not deployed on the peer device, the peer
device cannot detect the bit error status of the interface's link. Therefore, the IGP does
not adjust the cost of the link. Traffic from the peer device may still pass through the link
with bit errors. As a result, bidirectional IGP routes pass through different links. The
local device can receive traffic properly, and services are not interrupted. However, the
impact of bit errors on services cannot be eliminated.

For example, on the network shown in Figure 9-4, link-quality bit error detection is enabled
on each interface, and nodes communicate through IS-IS routes. In normal cases, IS-IS routes
on PE1 and PE2 are preferentially transmitted over the primary link. Therefore, traffic in both
directions is forwarded over the primary link. If PE2 detects bit errors on interface 1:

● PE2 adjusts the link quality level of interface 1 to Low, triggering IS-IS to increase the
cost of the interface's link to a value (for example, 40). PE2 uses a BFD packet to
advertise the bit errors to PE1.
● After receiving the BFD packet, PE1 also adjusts the link quality level of interface 1 to
Low, triggering IS-IS to increase the cost of the interface's link to a value (for example,
40).
IS-IS routes on both PE1 and PE2 preferentially select the secondary link, because the cost
(20) of the secondary link is less than the cost (40) of the primary link. Traffic in both
directions is then switched to the secondary link.
If bit-error-triggered IGP route switching is not supported or enabled on PE1, PE1 cannot
detect the bit errors. In this case, PE1 still sends traffic to PE2 through the primary link. PE2
can receive traffic properly, but services are affected by the bit errors.
If PE2 detects bit errors on both interface 1 and interface 2, PE2 adjusts the link quality levels
of the interfaces to Low, triggering the costs of the interfaces' links to be increased to 40. IS-
IS routes on PE2 still preferentially select the primary link to ensure service continuity,
because the cost (40) of the primary link is less than the cost (50) of the secondary link. To
eliminate the impact of bit errors on services, you must manually restore the link quality.
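The cost comparison in this example can be reproduced with a few lines of Python. The sketch assumes, based on the costs quoted above, that the primary path consists of the single link on interface 1 and that the secondary path consists of the link on interface 2 plus one further link of cost 10. The function names and constants are hypothetical, and 40 is only the example value used in the text.

```python
NORMAL_COST = 10      # per-link cost in the example
BIT_ERROR_COST = 40   # example cost applied when link quality is Low


def path_costs(if1_errors: bool, if2_errors: bool) -> dict:
    """Return the IS-IS path costs seen by PE2 in the example topology.

    Assumes the primary path is the link on interface 1 and the secondary
    path is the link on interface 2 plus one more link of cost 10.
    """
    if1 = BIT_ERROR_COST if if1_errors else NORMAL_COST
    if2 = BIT_ERROR_COST if if2_errors else NORMAL_COST
    return {"primary": if1, "secondary": if2 + NORMAL_COST}


def preferred(costs: dict) -> str:
    """Pick the lowest-cost path; a tie keeps the primary path."""
    return min(costs, key=lambda name: (costs[name], name != "primary"))


for errors in [(False, False), (True, False), (True, True)]:
    costs = path_costs(*errors)
    print(errors, costs, "->", preferred(costs))
```

With bit errors on interface 1 only, the costs are 40 and 20 and the secondary path is selected. With bit errors on both interfaces, the costs are 40 and 50 and the primary path is selected again, which keeps services running over a link that still has bit errors.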

Figure 9-4 Bit-error-triggered IGP route switching
(Figure labels: PE1 and PE2, each with interface 1 and interface 2; links marked Cost = 10. Legend: bit errors, BFD packet, primary link, secondary link)


Bit-error-triggered section switching and bit-error-triggered IGP route switching are mutually exclusive.

Usage Scenario
If LDP LSPs are used, deploy bit-error-triggered IGP route switching to cope with link bit
errors on the LDP LSPs. Bit-error-triggered IGP route switching ensures service continuity
even if bit errors occur on both the primary and secondary links on an LDP LSP. Therefore, it
is recommended that you deploy bit-error-triggered IGP route switching.

9.2.4 Bit-Error-Triggered Trunk Update

If a trunk interface is used to increase bandwidth, improve reliability, and implement load
balancing, deploy bit-error-triggered trunk update to cope with bit errors detected on trunk
member interfaces.

Implementation Principles
According to the types of protection switching triggered, bit-error-triggered trunk update is
classified as follows:

Trunk-bit-error-triggered section switching

On the network shown in Figure 9-5, trigger-section or trigger-LSP bit error detection must
be enabled on each trunk member interface. After detecting bit errors on a trunk interface's
member interface, a device advertises the bit errors to the trunk interface, triggering the trunk
interface to delete the member interface from the forwarding plane. The trunk interface then
does not select the member interface to forward traffic. After the bit errors are cleared from
the member interface, the trunk interface re-adds the member interface to the forwarding
plane. The trunk interface can then select the member interface to forward traffic. If bit errors
occur on all trunk member interfaces or the number of member interfaces without bit errors is
lower than the lower threshold for the trunk interface's Up links, the trunk interface goes
Down. An upper-layer application associated with the trunk interface is then triggered to
perform a service switchover. If the number of member interfaces without bit errors reaches
the lower threshold for the trunk interface's Up links, the trunk interface goes Up. An
upper-layer application associated with the trunk interface is then triggered to perform a
service switchback.

The device also notifies the BFD module of the bit error status, and then uses BFD packets to
advertise the bit error status to the peer device connected to the trunk interface.

● If trunk-bit-error-triggered section switching has also been deployed on the peer device,
the bit error status is advertised to the trunk interface of the peer device. The trunk
interface is then triggered to delete or re-add the member interface from or to the
forwarding plane. The trunk interface is also triggered to go Down or Up, implementing
switchover or switchback synchronization with the device.
● If trunk-bit-error-triggered section switching is not deployed on the peer device, the peer
device cannot detect the bit error status of the interface's link. To ensure normal running
of services, the device can receive traffic from the member interface with bit errors in the
following cases:
– The trunk interface of the device has deleted the member interface with bit errors
from the forwarding plane or has gone Down.
– The trunk interface of the peer device can still forward traffic.
However, bit errors may affect service quality.

Trunk-bit-error-triggered section switching is similar to common-interface-bit-error-triggered
section switching. If bit errors occur on the trunk interfaces on both the primary and
secondary links, trunk-bit-error-triggered section switching may interrupt services. Therefore,
trunk-bit-error-triggered IGP route switching is recommended.
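The trunk behavior described for trunk-bit-error-triggered section switching can be expressed as a small function. This is a sketch under stated assumptions: `trunk_state` and its parameters are hypothetical names, and `min_up_links` stands for the lower threshold for the trunk interface's Up links.

```python
def trunk_state(member_has_errors, min_up_links):
    """Return (forwarding_members, trunk_status) for section switching.

    member_has_errors[i] is True when member interface i has bit errors.
    min_up_links stands for the lower threshold for the trunk's Up links.
    """
    # Members with bit errors are deleted from the forwarding plane.
    forwarding = [i for i, bad in enumerate(member_has_errors) if not bad]
    # The trunk goes Down if no member is error-free or if too few are.
    status = "Up" if len(forwarding) >= max(min_up_links, 1) else "Down"
    return forwarding, status


print(trunk_state([False, True, False], min_up_links=2))
print(trunk_state([True, True, False], min_up_links=2))
```

In the first call, two error-free members remain and the trunk stays Up. In the second call, only one error-free member remains, which is below the threshold of 2, so the trunk goes Down and the associated upper-layer application performs a service switchover.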

Figure 9-5 Trunk-bit-error-triggered section switching
(Figure labels: Device A, Device B. Legend: bit errors, BFD packet)

Trunk-bit-error-triggered IGP route switching

On the network shown in Figure 9-6, link-quality bit error detection must be enabled on each
trunk member interface, and bit-error-triggered IGP route switching must also be deployed on
the trunk interface. After detecting bit errors on a trunk interface's member interface, a device
advertises the bit errors to the trunk interface, triggering the trunk interface to delete the
member interface from the forwarding plane. The trunk interface then does not select the
member interface to forward traffic. After the bit errors are cleared from the member
interface, the trunk interface re-adds the member interface to the forwarding plane. The trunk
interface can then select the member interface to forward traffic. If bit errors occur on all
trunk member interfaces or the number of member interfaces without bit errors is lower than
the lower threshold for the trunk interface's Up links, the trunk interface ignores the bit errors
on the member interfaces and remains Up. However, the link quality level of the trunk
interface becomes Low, triggering an IGP (OSPF or IS-IS) to increase the cost of the trunk
interface's link. IGP routes then do not preferentially select the link. If the number of member
interfaces without bit errors reaches the lower threshold for the trunk interface's Up links, the
link quality level of the trunk interface changes to Good, triggering the IGP to restore the
original cost for the trunk interface's link. In this case, IGP routes preferentially select the link
again.
The device also notifies the BFD module of the bit error status, and then uses BFD packets to
advertise the bit error status to the peer device connected to the trunk interface.
● If trunk-bit-error-triggered IGP route switching has also been deployed on the peer
device, the bit error status is advertised to the trunk interface of the peer device. The
trunk interface is then triggered to delete or re-add the member interface from or to the
forwarding plane. The link quality level of the trunk interface is also triggered to change
to Low or Good. In this case, the cost of IGP routes is adjusted, implementing
switchover or switchback synchronization with the device.
● If trunk-bit-error-triggered IGP route switching is not deployed on the peer device, the
peer device cannot detect the bit error status of the interface's link. If the trunk interface
of the device has deleted the member interface with bit errors from the forwarding plane,
the trunk interface of the peer device may still select the member interface to forward
traffic. Similarly, if the link quality level of the trunk interface on the device has changed
to Low, the IGP is triggered to increase the cost of the trunk interface's link. In this case,
IGP routes do not preferentially select the link. However, IGP on the peer device does
not adjust the cost of the link. Traffic from the peer device may still pass through the link
with bit errors. As a result, bidirectional IGP routes pass through different links. To
ensure normal running of services, the device can receive traffic from the member
interface with bit errors. However, bit errors may affect service quality.
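The difference from trunk-bit-error-triggered section switching is that the trunk interface remains Up and only its link quality level and IGP cost change. The following sketch illustrates this. The function name `trunk_igp_state` and the cost values 10 and 40 are assumptions chosen for illustration and are not defaults of any device.

```python
def trunk_igp_state(member_has_errors, min_up_links, base_cost=10, raised_cost=40):
    """Return the trunk view under trunk-bit-error-triggered IGP route switching.

    base_cost and raised_cost are example values, not defaults of any device.
    """
    healthy = sum(1 for bad in member_has_errors if not bad)
    enough = healthy >= max(min_up_links, 1)
    return {
        # The trunk stays Up even when too few members are error-free.
        "status": "Up",
        "link_quality": "Good" if enough else "Low",
        "igp_cost": base_cost if enough else raised_cost,
    }


print(trunk_igp_state([False, True, False], min_up_links=2))
print(trunk_igp_state([True, True, True], min_up_links=2))
```

Because the trunk interface never goes Down in this mode, a path remains available even if bit errors occur on every candidate link, and the IGP simply prefers the path with the lowest resulting cost.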

Figure 9-6 Trunk-bit-error-triggered IGP route switching
(Figure legend: trunk, bit errors, BFD packet, primary link, secondary link)


Layer 2 trunk interfaces do not support an IGP. Therefore, bit-error-triggered IGP route switching cannot
be deployed on Layer 2 trunk interfaces. If bit errors occur on all Layer 2 trunk member interfaces or the
number of member interfaces without bit errors is lower than the lower threshold for the trunk interface's
Up links, the trunk interface remains in the Up state. As a result, protection switching cannot be
triggered. To eliminate the impact of bit errors on services, you must manually restore the link quality.

Usage Scenario
If a trunk interface is deployed, deploy bit-error-triggered trunk update to cope with bit errors
detected on trunk member interfaces. Trunk-bit-error-triggered IGP route switching is
recommended.

9.2.5 Bit-Error-Triggered RSVP-TE Tunnel Switching

To cope with link bit errors along an RSVP-TE tunnel and reduce the impact of bit errors on
services, deploy bit-error-triggered RSVP-TE tunnel switching. After the deployment is
complete, service traffic is switched from the primary CR-LSP to the backup CR-LSP if bit
errors occur.

Implementation Principles
On the network shown in Figure 9-7, trigger-LSP bit error detection must be enabled on each
node's interfaces on the RSVP-TE tunnels. To implement dual-ended switching, configure the
RSVP-TE tunnels in both directions as bidirectional associated CR-LSPs. If a node on a CR-
LSP detects bit errors in a direction, the ingress of the tunnel obtains the BER of the CR-LSP
after BER calculation and advertisement. For details, see Bit Error Detection.

Figure 9-7 Bit-error-triggered RSVP-TE tunnel switching
(Figure legend: bit errors, primary CR-LSP, hot-standby CR-LSP)

The ingress then determines the bit error status of the CR-LSP based on the BER thresholds
configured for the RSVP-TE tunnel. For rules for determining the bit error status of the
CR-LSP, see Figure 9-8.
● If the BER of the CR-LSP is greater than or equal to the switchover threshold of the
RSVP-TE tunnel, the CR-LSP is always in the excessive BER state.
● If the BER of the CR-LSP falls below the switchback threshold, the CR-LSP changes to
the normalized BER state.
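The two threshold rules can be sketched as a state update function. The sketch assumes that the CR-LSP keeps its current state while the BER lies between the switchback and switchover thresholds, which the rules above do not state explicitly. The function name and the threshold values 1e-4 and 1e-5 are hypothetical.

```python
def next_ber_state(state, ber, switchover, switchback):
    """Return the bit error state of a CR-LSP after a new BER sample."""
    if ber >= switchover:
        return "excessive"
    if ber < switchback:
        return "normalized"
    # Assumption: between the two thresholds the state does not change.
    return state


SWITCHOVER = 1e-4   # example switchover threshold
SWITCHBACK = 1e-5   # example switchback threshold

state = "normalized"
history = []
for ber in [1e-6, 5e-5, 2e-4, 5e-5, 1e-6]:
    state = next_ber_state(state, ber, SWITCHOVER, SWITCHBACK)
    history.append(state)
print(history)
```

The same BER sample (5e-5 in this run) leaves the CR-LSP in whichever state it was already in, so a BER that fluctuates between the two thresholds does not cause repeated switchovers and switchbacks.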

Figure 9-8 Rules for determining the bit error status of the CR-LSP
(Figure axis: BER. Red indicates the excessive BER state. Green indicates the normalized BER state.)



After the bit error statuses of the primary and backup CR-LSPs are determined, the RSVP-TE
tunnel determines whether to perform a primary/backup CR-LSP switchover based on the
following rules:
● If the primary CR-LSP is in the excessive BER state, the RSVP-TE tunnel attempts to
switch traffic to the backup CR-LSP.
● If the primary CR-LSP changes to the normalized BER state or the backup CR-LSP is in
the excessive BER state, traffic is switched back to the primary CR-LSP.
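The two switching rules reduce to a simple decision on the bit error states of the primary and backup CR-LSPs. The function below is a hypothetical illustration that assumes a backup CR-LSP exists and that each state is either "excessive" or "normalized".

```python
def active_cr_lsp(primary_state, backup_state):
    """Return which CR-LSP carries traffic for the given bit error states."""
    # Traffic leaves the primary CR-LSP only if the backup is in a better state.
    if primary_state == "excessive" and backup_state != "excessive":
        return "backup"
    # Primary normalized, or backup also excessive: use the primary CR-LSP.
    return "primary"


for primary in ("normalized", "excessive"):
    for backup in ("normalized", "excessive"):
        print(primary, backup, "->", active_cr_lsp(primary, backup))
```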

The RSVP-TE tunnel in the opposite direction also performs the same switchover, so that
traffic in the upstream and downstream directions is not transmitted over the CR-LSP with bit
errors.

Usage Scenario
If RSVP-TE tunnels are used as public network tunnels, deploy bit-error-triggered RSVP-TE
tunnel switching to cope with link bit errors along the tunnels.

9.2.6 Bit-Error-Triggered Switching for PW

When PW redundancy is configured for L2VPN services, bit-error-triggered switching can be
configured. With this function, if bit errors occur, services can switch between the primary
and secondary PWs.

Trigger-LSP bit error detection must be enabled on each node's interfaces. PW redundancy
can be configured in either a single-segment or multi-segment scenario.

● Single-segment PW redundancy scenario

In Figure 9-9, PE1 establishes a primary PW to PE2 and a secondary PW to PE3, which
implements PW redundancy. If PE2 detects bit errors, the processing is as follows:
– PE2 switches traffic destined for PE1 to the path bypass PW -> PE3 -> secondary
PW -> PE1 and sends a BFD packet to notify PE1 of the bit errors.
– Upon receipt of the BFD packet, PE1 switches traffic destined for PE2 to the path
secondary PW -> PE3 -> bypass PW -> PE2.
Traffic between PE1 and PE2 can travel along bit-error-free links.

Figure 9-9 Bit-error-triggered switching for single-segment PW
(Figure legend: primary PW, bypass PW, secondary PW, bit errors, BFD packets)

● Multi-segment PW redundancy scenario

In Figure 9-10, multi-segment PW redundancy is configured. PE1 is dual-homed to two
SPEs. If PE2 detects bit errors, the processing is as follows:
– PE2 switches traffic destined for PE1 to the path bypass PW -> PE3 -> PW2 ->
SPE2 -> secondary PW -> PE1 and sends a BFD packet to notify SPE1 of the bit
errors.
– Upon receipt of the BFD packet, SPE1 sends an LDP Notification message to notify
PE1 of the bit errors.
– Upon receipt of the notification, PE1 switches traffic destined for PE2 to the path
secondary PW -> SPE2 -> PW2 -> PE3 -> bypass PW -> PE2.
Traffic between PE1 and PE2 can travel along bit-error-free links. If bit errors occur on a
link between PE1 and SPE1, the processing is the same as that in the single-segment PW
redundancy scenario.

Figure 9-10 Bit-error-triggered switching for multi-segment PW
(Figure legend: primary PW, bypass PW, secondary PW, bit errors, BFD packets, LDP Notification messages)

After traffic switches to the secondary PW and bit errors are removed from the primary PW,
traffic switches back to the primary PW based on a configured switchback policy.


If an RSVP-TE tunnel is established for PWs, and bit-error-triggered RSVP-TE tunnel switching is
configured, a switchover is preferentially performed between the primary and hot-standby CR-LSPs in
the RSVP-TE tunnel. A primary/secondary PW switchover can be triggered only if the
primary/hot-standby CR-LSP switchover fails to remove bit errors in either of the following situations:
● The hot standby function is not configured.
● Bit errors occur on both the primary and hot-standby CR-LSPs.
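The precedence between the CR-LSP switchover and the PW switchover can be summarized in a short decision function. The function name, parameters, and return strings are hypothetical and serve only to illustrate the order in which the two protection levels are tried.

```python
def protection_action(primary_lsp_errors, hot_standby_configured, hot_standby_errors):
    """Return the protection level that handles bit errors on the primary CR-LSP."""
    if not primary_lsp_errors:
        return "none"
    # The primary/hot-standby CR-LSP switchover is tried first.
    if hot_standby_configured and not hot_standby_errors:
        return "CR-LSP switchover"
    # No hot standby, or bit errors on both CR-LSPs: the PW level takes over.
    return "PW switchover"


print(protection_action(True, True, False))
print(protection_action(True, False, False))
print(protection_action(True, True, True))
```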

Usage Scenario
If L2VPN is used to carry user services and PW redundancy is deployed to ensure reliability,
deploy bit-error-triggered switching for PW to minimize the impact of bit errors on user
services and improve service quality.

9.2.7 Bit-Error-Triggered L3VPN Switching

On an FRR-enabled HVPN, bit-error-triggered switching can be configured for VPN routes.
With this function, if bit errors occur on the HVPN, VPN routes re-converge so that traffic
switches to a bit-error-free link.

Trigger-LSP bit error detection must be enabled on each node's interfaces. In Figure 9-11, an
HVPN is configured on an IP/MPLS backbone network. VPN FRR is configured on a UPE. If
SPE1 detects bit errors, the processing is as follows:
● SPE1 reduces the Local Preference attribute value or increases the Multi-Exit
Discriminator (MED) attribute value. Then, the preference value of a VPN route that
SPE1 advertises to an NPE is reduced. As a result, the NPE selects the VPN route to
SPE2, not the VPN route to SPE1. Traffic switches to the standby link. In addition, SPE1
sends a BFD packet to notify the UPE of bit errors.
● Upon receipt of the BFD packet, the UPE switches traffic to the standby link over the
VPN route destined for SPE2.
If the bit errors on the active link are removed, the UPE re-selects the VPN routes destined for
SPE1, and SPE1 restores the preference value of the VPN route to be advertised to the NPE.
Then the NPE also re-selects the VPN route destined for SPE1.
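The effect of the attribute change on route selection at the NPE can be illustrated with a simplified comparison. The sketch considers only Local Preference (higher is preferred) and MED (lower is preferred); actual BGP route selection involves more steps. The class `VpnRoute`, the function `best_route`, and all attribute values are hypothetical.

```python
from typing import NamedTuple


class VpnRoute(NamedTuple):
    next_hop: str
    local_pref: int
    med: int


def best_route(routes):
    """Simplified selection: highest Local_Pref first, then lowest MED.

    Real BGP route selection has more steps; ties here keep list order.
    """
    return min(routes, key=lambda r: (-r.local_pref, r.med))


# Hypothetical attribute values as seen by the NPE.
normal = [VpnRoute("SPE1", 100, 0), VpnRoute("SPE2", 100, 0)]
med_raised = [VpnRoute("SPE1", 100, 50), VpnRoute("SPE2", 100, 0)]
pref_reduced = [VpnRoute("SPE1", 50, 0), VpnRoute("SPE2", 100, 0)]

for label, routes in [("normal", normal), ("MED raised", med_raised),
                      ("Local_Pref reduced", pref_reduced)]:
    print(label, "->", best_route(routes).next_hop)
```

In this sketch, either adjustment makes the route through SPE2 win, and restoring the original values makes the route through SPE1 win again, which matches the switchback described above.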

Figure 9-11 Bit-error-triggered L3VPN switching
(Figure labels: SPE2. Legend: bit errors, BFD packets, active link, standby link)


If an RSVP-TE tunnel is established for an L3VPN, and bit-error-triggered RSVP-TE tunnel switching
is configured, a traffic switchover between the primary and hot-standby CR-LSPs in the RSVP-TE
tunnel is preferentially performed. An active/standby L3VPN route switchover can be triggered only if
the primary/hot-standby CR-LSP switchover fails to remove bit errors in either of the following
situations:
● The hot standby function is not configured.
● Bit errors occur on both the primary and hot-standby CR-LSPs.

Usage Scenario
If L3VPN is used to carry user services and VPN FRR is deployed to ensure reliability,
deploy bit-error-triggered L3VPN switching to minimize the impact of bit errors on user
services and improve service quality.

9.2.8 Bit-Error-Triggered Static CR-LSP/PW/E-PW APS

In PW/E-PW over static CR-LSP scenarios, if primary and secondary PWs are configured,
deploy bit-error-triggered protection switching. If bit errors occur, service traffic is switched
from the primary PW to the secondary PW.

Implementation Principles
The MAC-layer SD alarm function (Trigger-LSP type) must be enabled on interfaces, and
then MPLS-TP OAM must be deployed to monitor CR-LSPs/PWs. Static PWs/E-PWs are
classified as SS-PWs or MS-PWs.
In an SS-PW networking scenario (see Figure 9-12), the bit error generation and clearing
process is as follows:
Bit error generation:
● If the BER on an inbound interface of the P node reaches a specified threshold, the CRC
module detects the bit error status of the inbound interface, notifies all static CR-LSP
modules, and constructs and sends AIS packets to PE2.
● Upon receipt of the AIS packets, PE2 notifies static PWs established over the CR-LSPs
of the bit errors and instructs the TP OAM module to perform APS. APS triggers a
primary/backup CR-LSP switchover, and a PW established over the new primary CR-LSP
takes over traffic.
Bit error clearing: After bit errors are cleared, the CRC module no longer detects bit errors
on the inbound interface. The CRC module informs the TP OAM module that the bit
errors have been cleared. Upon receipt of the notification, the TP OAM module stops sending
AIS packets to PE2 functioning as the egress. If PE2 receives no AIS packets within a
specified period, it determines that the bit errors have been cleared. PE2 then generates an
AIS clear alarm and instructs the TP OAM module to perform APS. APS triggers a primary/backup
CR-LSP switchover, and services are switched back to the PW over the primary CR-LSP.
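The clearing logic on the egress, in which the absence of AIS packets for a specified period is treated as bit error clearing, can be sketched as a timeout check. The class `AisMonitor`, its methods, and the period of 3.5 seconds are assumptions for illustration only; the text does not specify the period.

```python
class AisMonitor:
    """Egress-side view of AIS reception; all names are illustrative."""

    def __init__(self, clear_after):
        self.clear_after = clear_after   # the specified period, in seconds
        self.last_ais = None             # arrival time of the last AIS packet
        self.alarm = False

    def on_ais(self, now):
        """Record an AIS packet and raise the AIS alarm."""
        self.last_ais = now
        self.alarm = True

    def poll(self, now):
        """Return True exactly when the AIS alarm is cleared by timeout."""
        if self.alarm and now - self.last_ais >= self.clear_after:
            self.alarm = False
            return True
        return False


monitor = AisMonitor(clear_after=3.5)
monitor.on_ais(0.0)
monitor.on_ais(1.0)
print(monitor.poll(2.0))   # AIS packets still recent, alarm remains
print(monitor.poll(4.5))   # no AIS packets for 3.5 s, alarm clears
```

At the moment `poll` returns True, the egress in this model would generate the AIS clear alarm and trigger APS to switch services back.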

Figure 9-12 Bit-error-triggered APS in an SS-PW networking scenario
(Figure labels: PE3. Legend: bit errors, AIS packet)

In an MS-PW networking scenario (see Figure 9-13), the bit error generation and clearing
process is as follows:
Bit error generation:
● The CRC module of an inbound interface on the SPE detects bit errors and determines whether
to send an SF or SD alarm based on a specified BER threshold. The CRC module
then notifies the TP OAM module of the bit errors. The TP OAM module advertises the bit
error status, sends RDI packets, and performs APS. The APS module instructs the peer
node to perform a traffic switchover, which triggers a primary/backup CR-LSP
switchover. The PW established over the bit-error-free CR-LSP takes over traffic.
● If the BER on an inbound interface of the SPE reaches a specified threshold, the CRC
module detects the bit error status of the inbound interface, sets all static CR-LSP
modules to the bit error status, and constructs and sends AIS packets to PE2.
● Upon receipt of the AIS packets, PE2 notifies the TP OAM module. The TP OAM
module then performs APS, which triggers a primary/backup CR-LSP switchover. The
PW established over the bit-error-free CR-LSP takes over traffic.
Bit error clearing: After bit errors are cleared, the CRC module no longer detects bit errors
on the inbound interface. The CRC module informs the TP OAM module that the bit
errors have been cleared. Upon receipt of the notification, the TP OAM module stops sending
AIS packets to PE2 functioning as the egress. If PE2 receives no AIS packets within a
specified period, it determines that the bit errors have been cleared. PE2 then generates an
AIS clear alarm and instructs the TP OAM module to perform APS. APS triggers a primary/backup
CR-LSP switchover, and services are switched back to the PW over the primary CR-LSP.

Figure 9-13 Bit-error-triggered APS in an MS-PW networking scenario
(Figure labels: PE3. Legend: bit errors, AIS packet)


If a tunnel protection group has been deployed for static CR-LSPs carrying PWs/E-PWs, bit errors
preferentially trigger static CR-LSP protection switching. Bit-error-triggered PW protection switching is
performed only when bit-error-triggered static CR-LSP protection switching fails to protect services
against bit errors (for example, bit errors occur on both the primary and backup CR-LSPs).

Usage Scenario
If static CR-LSPs/PWs/E-PWs are used to carry user services and MPLS-TP OAM is
deployed to ensure reliability, deploy bit-error-triggered APS to minimize the impact of bit
errors on user services and improve service quality.

9.2.9 Relationships Among Bit-Error-Triggered Protection Switching Features
Feature: Bit error detection
Function: A device uses the CRC algorithm to detect bit errors on an inbound interface. Bit error
detection types are classified as trigger-LSP, trigger-section, or link-quality. The device uses BFD
packets or MPLS-TP OAM to advertise the bit error status, and promptly notifies the peer device of
bit error generation and clearing events.
Dependency on Bit Error Detection: -
Relationship with Other Bit-Error-Triggered Protection Switching Features: This feature is the basis
of other bit-error-triggered protection switching features.
Deployment Constraints and Suggestions: To prevent line jitters from frequently triggering service
switchovers and switchbacks, set the bit error alarm clear threshold to be one order of magnitude
lower than the bit error alarm threshold.

Feature: Bit-error-triggered section switching
Function: If bit errors are generated or cleared on an interface, the link-layer protocol status of
the interface changes to bit-error-detection Down or Up, triggering an upper-layer application
associated with the interface for a service switchover or switchback.
Dependency on Bit Error Detection: Trigger-section bit error detection must be enabled on an
interface. The bit error status must be advertised using BFD packets.
Relationship with Other Bit-Error-Triggered Protection Switching Features:
● This feature is independently deployed.
● When deploying trunk-bit-error-triggered section switching, you can enable bit-error-triggered
section switching on trunk member interfaces.
Deployment Constraints and Suggestions:
● Enable bit-error-triggered section switching on the interfaces at both ends of a link.
● If bit errors occur on both the primary and secondary links, bit-error-triggered section
switching may interrupt services. Therefore, bit-error-triggered IGP route switching is
recommended.

Feature: Bit-error-triggered IGP route switching
Function: If bit errors are generated or cleared on an interface, the link quality level of the
interface changes to Low or Good, triggering an IGP (OSPF or IS-IS) to increase the cost of the
interface's link or restore the original cost for the link. IGP routes on the peer device then do
not preferentially select the link with bit errors or preferentially select the link again.
Dependency on Bit Error Detection: Link-quality bit error detection must be enabled on an
interface. The bit error status must be advertised using BFD packets.
Relationship with Other Bit-Error-Triggered Protection Switching Features:
● This feature is independently deployed.
● When deploying trunk-bit-error-triggered IGP route switching, you must deploy
bit-error-triggered IGP route switching on trunk interfaces.
Deployment Constraints and Suggestions:
● Enable bit-error-triggered IGP route switching on the interfaces at both ends of a link.

Feature: Bit-error-triggered trunk update
Function: If bit errors are generated or cleared on a trunk member interface, the trunk interface is triggered to delete the member interface from, or re-add it to, the forwarding plane. If bit errors occur on all trunk member interfaces or the number of member interfaces without bit errors is lower than the lower threshold for the trunk interface's Up links, bit-error-triggered protection switching involves the following modes:
• Trunk-bit-error-triggered section switching: The trunk interface goes Down, triggering an upper-layer application associated with the trunk interface to perform a service switchover.
• Trunk-bit-error-triggered IGP route switching: The trunk interface ignores the bit errors on the member interfaces and remains Up. However, the link quality level of the trunk interface becomes Low, triggering an IGP to increase the cost of the trunk interface's link. IGP routes then do not preferentially select the link.
Dependency on bit error detection: When deploying trunk-bit-error-triggered section switching, you must enable trigger-section or trigger-LSP bit error detection on trunk member interfaces. When deploying trunk-bit-error-triggered IGP route switching, you must enable link-quality bit error detection on trunk member interfaces. The bit error status must be advertised using BFD packets.
Relationship with other bit-error-triggered features:
• Trunk-bit-error-triggered section switching is independently deployed.
• When deploying trunk-bit-error-triggered IGP route switching, you must deploy bit-error-triggered IGP route switching on trunk interfaces.
Deployment constraints and suggestions:
• Enable the same bit-error-triggered protection switching function on the trunk interfaces at both ends.
• Trunk-bit-error-triggered IGP route switching is recommended.
• Layer 2 trunk interfaces do not support an IGP. Therefore, bit-error-triggered IGP route switching cannot be deployed on Layer 2 trunk interfaces.

Feature: Bit-error-triggered RSVP-TE tunnel switching
Function: The ingress of the primary and backup CR-LSPs determines the bit error statuses of the CR-LSPs based on link BERs. A service switchover or switchback is then performed based on the bit error statuses of the CR-LSPs.
Dependency on bit error detection: Trigger-LSP bit error detection must be enabled on the interface. The bit error status must be advertised using BFD packets.
Relationship with other bit-error-triggered features:
• This feature is independently deployed.
• This feature is deployed together with bit-error-triggered PW switching.
• This feature is deployed together with bit-error-triggered L3VPN route switching.
Deployment constraints and suggestions: To implement dual-ended switching, deploy bit-error-triggered protection switching on the RSVP-TE tunnels in both directions and configure the tunnels as bidirectional associated CR-LSPs.

Feature: Bit-error-triggered PW switching
Function: If bit errors occur, service traffic is switched from the primary PW to the secondary PW.
Dependency on bit error detection: Trigger-LSP bit error detection must be enabled on the interface. The bit error status must be advertised using BFD packets.
Relationship with other bit-error-triggered features: This feature is deployed together with bit-error-triggered RSVP-TE tunnel switching.
Deployment constraints and suggestions: If an RSVP-TE tunnel with bit-error-triggered protection switching enabled is used to carry a PW, bit-error-triggered RSVP-TE tunnel switching is preferentially performed. Bit-error-triggered PW switching is performed only when bit-error-triggered RSVP-TE tunnel switching fails to protect services against bit errors.

Feature: Bit-error-triggered L3VPN route switching
Function: If bit errors occur, VPN routes are triggered to reconverge. Service traffic is then switched to the link without bit errors.
Dependency on bit error detection: Trigger-LSP bit error detection must be enabled on the interface. The bit error status must be advertised using BFD packets.
Relationship with other bit-error-triggered features: This feature is deployed together with bit-error-triggered RSVP-TE tunnel switching.
Deployment constraints and suggestions: If an RSVP-TE tunnel with bit-error-triggered protection switching enabled is used to carry an L3VPN, bit-error-triggered RSVP-TE tunnel switching is preferentially performed. Bit-error-triggered L3VPN route switching is performed only when bit-error-triggered RSVP-TE tunnel switching fails to protect services against bit errors.

Feature: Bit-error-triggered static CR-LSP/PW/E-PW APS
Function: Static CR-LSPs/PWs/E-PWs are used to carry user services, and MPLS-TP OAM is deployed to ensure reliability. If a node detects bit errors, the node uses MPLS-TP OAM to advertise the bit error status to the egress. APS is then used to trigger a traffic switchover.
Dependency on bit error detection: The MAC-layer SD alarm function (Trigger-LSP type) must be enabled on interfaces. The bit error status must be advertised using MPLS-TP OAM.
Relationship with other bit-error-triggered features: This feature is independently deployed.
Deployment constraints and suggestions:
• If a tunnel protection group has been deployed for static CR-LSPs carrying PWs/E-PWs, bit errors preferentially trigger static CR-LSP protection switching. Bit-error-triggered PW protection switching is performed only when bit-error-triggered static CR-LSP protection switching fails to protect services against bit errors.
• Eth-Trunk interfaces do not support the ... of the bit error status by ...

9.3 Application Scenarios for Bit-Error-Triggered Protection Switching

9.3.1 Application of Bit-Error-Triggered Protection Switching in a Scenario in Which TE Tunnels Carry an IP RAN

Networking Description
Figure 9-14 shows typical L2VPN+L3VPN networking in an IP RAN application. A VPWS
based on an RSVP-TE tunnel is deployed at the access layer, an L3VPN based on an RSVP-
TE tunnel is deployed at the aggregation layer, and L2VPN access to L3VPN is configured on
the AGGs. To ensure reliability, deploy PW redundancy for the VPWS, configure VPN FRR
protection for the L3VPN, and configure hot-standby protection for the RSVP-TE tunnels.

Figure 9-14 IP RAN carried over TE tunnels
[Figure: Access and Aggregation layers; legend: Traffic path]

Feature Deployment
To protect services against bit errors, deploy bit-error-triggered RSVP-TE tunnel switching, bit-error-triggered PW switching, and bit-error-triggered L3VPN route switching in the scenario shown in Figure 9-14. The deployment process is as follows:

• Enable trigger-LSP bit error detection on each interface.
• Bit-error-triggered RSVP-TE tunnel switching: Enable bit-error-triggered protection switching on the RSVP-TE tunnel interfaces of the CSG and AGG1, and configure thresholds for bit-error-triggered RSVP-TE tunnel switching.
• Bit-error-triggered PW switching: Enable bit-error-triggered PW switching on the interfaces that connect the CSG and AGG1 and the interfaces that connect the CSG and AGG2.
• Bit-error-triggered L3VPN route switching: Configure bit-error-triggered L3VPN route switching in the VPNv4 view of AGG1.

Bit-Error-Triggered Protection Switching Scenarios

Scenario 1

On the network shown in Figure 9-15, if bit errors occur at location 1, the RSVP-TE tunnel between the CSG and AGG1 detects the bit errors, triggering dual-ended switching. Both upstream and downstream traffic is switched to the hot-standby path, so traffic no longer passes through the link with bit errors.

Figure 9-15 Application of bit-error-triggered RSVP-TE tunnel switching
[Figure: Access and Aggregation layers; legend: Bit errors, Traffic path]

Scenario 2

On the network shown in Figure 9-16, if bit errors occur at both locations 1 and 2, both the primary and secondary links of the RSVP-TE tunnel between the CSG and AGG1 detect the bit errors. In this case, bit-error-triggered RSVP-TE tunnel switching cannot protect services against bit errors. The bit errors further trigger PW and L3VPN route switching.

• After detecting the bit errors, the CSG performs a primary/secondary PW switchover and switches upstream traffic to AGG2.
• After detecting the bit errors, AGG1 reduces the priority of VPNv4 routes advertised to RSG1, so that RSG1 preferentially selects VPNv4 routes advertised by AGG2. Downstream traffic is then switched to AGG2.
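
The escalation order described in Scenario 1 and Scenario 2 can be summarized in a short Python sketch. It is illustrative only: the class, function, and action names are hypothetical and do not correspond to device commands or to the device's internal implementation. The sketch encodes only the rule stated above: tunnel switching is tried first, and PW and L3VPN route switching take over when both the primary and hot-standby paths have bit errors.

```python
from dataclasses import dataclass


@dataclass
class TunnelState:
    """Bit error status of the CSG-AGG1 RSVP-TE tunnel (hypothetical model)."""
    primary_has_bit_errors: bool
    backup_has_bit_errors: bool


def select_protection_action(tunnel: TunnelState) -> str:
    """Return an illustrative action label for the current tunnel state."""
    if not tunnel.primary_has_bit_errors:
        # No bit errors on the primary path: nothing to switch.
        return "stay-on-primary"
    if not tunnel.backup_has_bit_errors:
        # Scenario 1: only the primary path is affected, so dual-ended
        # switching to the hot-standby path is sufficient.
        return "switch-to-hot-standby"
    # Scenario 2: both paths have bit errors, so tunnel switching cannot
    # protect services; PW switching (upstream) and L3VPN route switching
    # (downstream) move traffic to AGG2 instead.
    return "pw-and-l3vpn-route-switching"
```

For example, `select_protection_action(TunnelState(True, True))` corresponds to Scenario 2 and selects PW and L3VPN route switching.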

Figure 9-16 Application of bit-error-triggered PW and L3VPN route switching
[Figure: Access and Aggregation layers; legend: Bit errors, Traffic path]

9.3.2 Application of Bit-Error-Triggered Protection Switching in a Scenario in Which LDP LSPs Carry an IP RAN
Networking Description
Figure 9-17 shows typical L2VPN+L3VPN networking in an IP RAN application. A VPWS
based on an LDP LSP is deployed at the access layer, an L3VPN based on an LDP LSP is
deployed at the aggregation layer, and L2VPN access to L3VPN is configured on the AGGs.
To ensure reliability, deploy LDP and IGP synchronization for the LDP LSPs, and configure
Eth-Trunk interfaces on key links.

Figure 9-17 IP RAN carried over LDP LSPs
[Figure: Access and Aggregation layers; legend: Traffic path]

Feature Deployment
To protect services against bit errors, deploy bit-error-triggered IGP route switching in the scenario shown in Figure 9-17, and deploy trunk-bit-error-triggered IGP route switching on the Eth-Trunk interfaces. The deployment process is as follows:
• Enable link-quality bit error detection on each physical interface and Eth-Trunk member interface.
• Enable bit-error-triggered IGP route switching on each physical interface and Eth-Trunk interface.

Bit-Error-Triggered Protection Switching Scenarios

Scenario 1
On the network shown in Figure 9-18, if bit errors occur at location 1 (a physical interface), the CSG detects the bit errors and changes the quality level of the interface's link to Low, triggering an IGP to increase the cost of the link. As a result, IGP routes do not preferentially select the link. The CSG also uses a BFD packet to advertise the bit errors to the peer device, so that the peer device performs the same processing. Both upstream and downstream traffic is then switched to the paths without bit errors.
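
The effect of the cost increase on route selection can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the four-node topology, the link costs, and the penalty value are hypothetical, and the amount by which a device actually raises the cost is not stated in this document. The sketch only shows that once the cost of the link with bit errors is raised at both ends, a shortest-path calculation no longer prefers that link.

```python
import heapq

# Hypothetical penalty added to a link whose quality level is Low.
BIT_ERROR_COST_PENALTY = 1000


def shortest_path(links, src, dst):
    """Dijkstra over undirected links given as {(node_a, node_b): cost}."""
    adjacency = {}
    for (a, b), cost in links.items():
        adjacency.setdefault(a, []).append((b, cost))
        adjacency.setdefault(b, []).append((a, cost))
    heap = [(0, src, [src])]
    visited = set()
    while heap:
        dist, node, path = heapq.heappop(heap)
        if node == dst:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, cost in adjacency.get(node, []):
            if neighbor not in visited:
                heapq.heappush(heap, (dist + cost, neighbor, path + [neighbor]))
    return None


def apply_link_quality(links, link, quality):
    """Return a copy of links with the cost of one link raised if quality is Low."""
    updated = dict(links)
    if quality == "Low":
        updated[link] = links[link] + BIT_ERROR_COST_PENALTY
    return updated


# Hypothetical topology and costs, not taken from Figure 9-18.
links = {
    ("CSG", "AGG1"): 10,
    ("CSG", "AGG2"): 10,
    ("AGG1", "AGG2"): 10,
    ("AGG1", "RSG"): 10,
    ("AGG2", "RSG"): 15,
}

before = shortest_path(links, "CSG", "RSG")
after = shortest_path(apply_link_quality(links, ("CSG", "AGG1"), "Low"), "CSG", "RSG")
```

With these assumed costs, `before` uses the CSG-AGG1 link and `after` avoids it. Restoring the link quality to Good corresponds to calling `shortest_path` on the original `links` again.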

Figure 9-18 Application of physical-interface-bit-error-triggered IGP route switching
[Figure: Access and Aggregation layers; legend: Bit errors, Traffic path]

Scenario 2
On the network shown in Figure 9-19, if bit errors occur at location 2 (an Eth-Trunk member interface), AGG1 detects the bit errors.
• If the number of member interfaces without bit errors is still higher than the lower threshold for the Eth-Trunk interface's Up links, the Eth-Trunk interface deletes the Eth-Trunk member interface from the forwarding plane. In this case, service traffic is still forwarded over the normal path.
• If the number of member interfaces without bit errors is lower than the lower threshold for the Eth-Trunk interface's Up links, the Eth-Trunk interface ignores the bit errors on the Eth-Trunk member interface and remains Up. However, the link quality level of the Eth-Trunk interface becomes Low, triggering an IGP (OSPF or IS-IS) to increase the cost of the Eth-Trunk interface's link. IGP routes then do not preferentially select the link. AGG1 also uses a BFD packet to advertise the bit errors to the peer device, so that the peer device performs the same processing. Both upstream and downstream traffic is then switched to the paths without bit errors.
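
The two cases above reduce to a comparison against the lower threshold for Up links, which the following Python sketch makes explicit. The function name, parameter names, and action labels are hypothetical. The text above does not say what happens when the number of healthy member interfaces exactly equals the threshold; the sketch assumes that case is handled like the first bullet, which is an assumption rather than documented behavior.

```python
def trunk_reaction(total_members: int, errored_members: int, least_up_links: int) -> str:
    """Illustrative reaction of an Eth-Trunk in IGP route switching mode."""
    if not 0 <= errored_members <= total_members:
        raise ValueError("errored_members must be between 0 and total_members")
    if errored_members == 0:
        return "no-action"
    healthy_members = total_members - errored_members
    # Assumption: "equal to the threshold" is treated as still sufficient.
    if healthy_members >= least_up_links:
        # Enough healthy members remain: delete the errored member
        # interfaces from the forwarding plane and keep forwarding.
        return "remove-errored-members"
    # Too few healthy members: the trunk stays Up, its link quality
    # becomes Low, and the IGP raises the cost of the trunk link.
    return "keep-trunk-up-and-raise-igp-cost"
```

For a four-member trunk with a lower threshold of 2, one errored member leads to member removal, whereas three errored members lead to the IGP cost increase.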

Figure 9-19 Application of Eth-Trunk-interface-bit-error-triggered IGP route switching
[Figure: Access and Aggregation layers; legend: Bit errors, Traffic path]

9.3.3 Application of Bit-Error-Triggered Protection Switching in a Scenario in Which a Static CR-LSP/PW Carries L2VPN Services
Networking Description
Figure 9-20 shows a typical IP RAN. L2VPN services are carried on static CR-LSPs. CR-
LSP APS is configured to provide tunnel-level protection. Additionally, PW APS/E-PW APS
is configured for L2VPN services to provide service-level protection.

Figure 9-20 IP RAN using static CR-LSPs to carry L2VPN services
[Figure: NodeB; Access and Aggregation layers; PW1 (Primary), PW2 (Secondary)]


Feature Deployment
To meet the high reliability requirements of the IP RAN and protect services against bit errors, configure bit-error-triggered protection switching for the CR-LSPs/PWs. To do so, enable bit error detection on the interfaces along the CR-LSPs/PWs, set the switching type to trigger-LSP, and configure the bit error alarm generation and clearing thresholds. If the BER reaches the bit error alarm threshold configured on an interface of a device along a static CR-LSP or PW, the device determines that a bit error event has occurred and notifies the MPLS-TP OAM module of the event. The MPLS-TP OAM module uses AIS packets to advertise the bit error status to the egress, and APS then triggers a traffic switchover.
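
The use of separate alarm generation and clearing thresholds can be sketched in Python as a simple hysteresis check. This is a minimal sketch under stated assumptions: the class name and the threshold values are hypothetical, and the clearing rule (the alarm clears when the BER falls below the clearing threshold) is an assumption, because the paragraph above states only the generation condition. The sketch does not model MPLS-TP OAM, AIS packets, or APS.

```python
class BitErrorAlarm:
    """Tracks a bit error alarm with separate generation and clearing thresholds."""

    def __init__(self, alarm_threshold: float, clear_threshold: float) -> None:
        if clear_threshold > alarm_threshold:
            raise ValueError("clearing threshold must not exceed the alarm threshold")
        self.alarm_threshold = alarm_threshold
        self.clear_threshold = clear_threshold
        self.active = False

    def update(self, ber: float) -> bool:
        """Feed one BER sample and return whether the alarm is active."""
        if not self.active and ber >= self.alarm_threshold:
            # The BER reached the alarm threshold: a bit error event occurs.
            self.active = True
        elif self.active and ber < self.clear_threshold:
            # Assumed clearing rule: the BER dropped below the clearing threshold.
            self.active = False
        return self.active


# Hypothetical thresholds chosen only for illustration.
alarm = BitErrorAlarm(alarm_threshold=1e-4, clear_threshold=1e-5)
states = [alarm.update(ber) for ber in (1e-6, 2e-4, 5e-5, 5e-6)]
```

Keeping the clearing threshold below the generation threshold prevents the alarm state, and therefore the protection switching it drives, from flapping when the BER hovers near a single threshold.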

9.4 Terminology for Bit-Error-Triggered Protection Switching

Term: Bit error
Definition: The deviation between a bit that is sent and the bit that is received. Cyclic redundancy checks (CRCs) are commonly used to detect bit errors.

Term: BER (bit error rate)
Definition: A bit error rate (BER) indicates the probability that incorrect packets are received and packets are discarded.

Acronyms and Abbreviations

Acronym and Abbreviation    Full Name
CRC                         cyclic redundancy check
PW                          pseudo wire
APS                         Automatic Protection Switching
AIS                         Alarm Indication Signal
