SSR System Architecture Guide

System Architecture Description

1/155 53-CRA 119 1364/1-V1 Uen PC3


Copyright

© Ericsson AB 2011–2013. All rights reserved. No part of this document may be
reproduced in any form without the written permission of the copyright owner.

Disclaimer

The contents of this document are subject to revision without notice due to
continued progress in methodology, design and manufacturing. Ericsson shall
have no liability for any error or damage of any kind resulting from the use
of this document.

Trademark List

NetOp is a trademark of Telefonaktiebolaget LM Ericsson.



Contents

1 Overview
1.1 Scope
1.2 Audience

2 SSR Functional Architecture
2.1 Hardware Architecture
2.2 Software Architecture
2.3 System Redundancy

3 Architectural Support For Features
3.1 Layer 2 Cross-Connection and VPWS on SSR
3.2 Circuits
3.3 Link Aggregation Groups
3.4 Port and Circuit Mirroring
3.5 Routing
3.6 MPLS
3.7 Forwarding
3.8 BNG Management
3.9 Advanced Services (QoS and ACLs)
3.10 Inter-Chassis Redundancy
3.11 Ethernet CFM and Single-Session BFD Home Slot Management
3.12 Event Tracking Interface
3.13 Failure Event Notification Processes

4 Administration
4.1 Accessing the SSR System Components
4.2 Configuration Management

5 Monitoring and Troubleshooting Data
5.1 HealthD
5.2 Logging
5.3 Show Commands
5.4 Collecting the Output of Logs and Show Commands
5.5 Debugging
5.6 Core Dump Files
5.7 Statistics
5.8 Managing Unsupported Transceivers
5.9 SNMP Monitoring and Notification
5.10 Troubleshooting Using ISM
5.11 Hardware Diagnostics

Glossary



1 Overview

This document describes the SSR 8000 platform architecture.

For an overview of the SSR platform with use cases, see the SSR System
Description.

1.1 Scope
This description covers the hardware, software, and functional aspects of
the product. It describes the internal functionality of the SSR and provides
a background for internal technical training. It includes a description of the
process modules and their interaction.

1.2 Audience
This document is intended for Ericsson employees in Research and
Development and Technical Support.

2 SSR Functional Architecture

2.1 Hardware Architecture


The SSR product includes the SSR 8000 hardware and Ericsson IP Operating
System. The operating system integrates multiple Layer 2 (L2) and Layer 3 (L3)
features into the common SSR platform. Each capability is well developed and
highly tuned for its specific functional role within the Triple Play network. The
SSR 8000 family is Ericsson’s common IP networking platform that integrates
IP routing capabilities with applications such as quality of service (QoS) and
complex Multiprotocol Label Switching (MPLS)-based topologies.

The SSR platform consists of:

• Chassis

• Switch cards—RPSW, ALSW, SW (SSR 8020 only)

• I/O cards—40-port Gigabit Ethernet (GE), 10-port 10GE


• Smart Services Card (SSC)

• Power modules

• Fan trays

• Various other hardware components (chassis kit, interface pluggables)

The SSR is a carrier-class, highly scalable L2 and L3 routing platform. It
consists of a redundant control plane, based on two controller cards, and a
redundant switch fabric that supports 100 Gbps to each of up to 20 I/O cards. It also
has increased power and thermal capacity to house future-generation line
cards. As a result, the capacity of the system is up to 8 Tbps, depending on the
number of line cards installed. The I/O cards use the Network Processing Unit
(NPU) for packet forwarding, access control list (ACL) and QoS processing,
and other L2 and L3 services. The software function performed by the NPU is
called the packet-forwarding engine (PFE) in this document.

The SSR 8000 family delivers advanced revenue-bearing services, such as
voice and video. Ensuring high availability has been a guiding design principle.
All core architectural design decisions—from the process-based architecture
to complete separation of the forwarding and control planes—trace their roots
back to this design goal.

The SSR contains the newest generation chassis controller card designed to
improve the performance and scalability of the control plane functions. The
platform takes advantage of the latest generation of Intel x86 processor, along
with higher memory densities, to dramatically improve performance. The SSR 8000
uses a central switch fabric, and all cards in the chassis are connected to it.
Some key advantages of this design are a simpler backplane,
distributed intelligence, and scalability. Broadcom’s FE600 device (a 96-port
switch fabric chip) is used for the central switch fabric. Every SSR system uses
RPSW and ALSW switch cards. The 8020 system also uses SW cards.

2.1.1 Chassis Models

2.1.1.1 SSR 8020 Chassis

The SSR 8020 chassis has the following characteristics:

• Eight switch fabric cards, including two RPSW cards that package a route
processor complex with the switch fabric and two ALSW cards that package
an internal switch (for control traffic) and alarm hardware.

• Possible throughput of up to 8 Tbps (full duplex). Depending on the number
of cards installed, the switch fabric scales to twice the capacity of the ports
with load-shared redundancy.

• Twenty line card slots for I/O or service cards.


• Backplane with 28 vertical slots divided into two stacked card cages, each
holding 10 full-height line and service cards and 4 half-height switch cards.

• Eight power entry modules (PEMs) in dedicated slots at the bottom of the
chassis. The PEMs blind mate into a custom vertical power backplane
to which the customer’s DC terminal lugs attach via bus bars. After
conditioning, the output power exits the power backplane via bus bars to
the backplane. Each supply requires an A (primary) 60-amp 54 V direct
current (DC) feed. An identical redundant B feed to the supply is Or'd inside
the PEM to provide a single load zone for N+1 redundancy. Total available
power is 14.7 kW, based on seven active 2.1 kW power supplies.

Note: Or'd power refers to two signals (DC power in this case), which are
logically combined such that the output is true if either one is true
(X= A+B). This is accomplished by connecting the two sources
with low-voltage diodes, with the higher DC voltage becoming the
output level. In contrast, with and'd power both signals must be
true for the output to be true (X=A*B).
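
To make the Or'd versus and'd distinction concrete, here is a minimal Python sketch (the function names are invented for illustration and are not Ericsson code). It models the diode-OR behavior, where the higher of the two DC feed voltages appears at the output, alongside the AND'd case, where both inputs must be good:

def ored_feed_voltage(feed_a_volts, feed_b_volts):
    """Diode-OR: the output follows whichever DC feed is higher (X = A + B)."""
    return max(feed_a_volts, feed_b_volts)

def anded_power_good(feed_a_ok, feed_b_ok):
    """AND'd power: the output is good only if both inputs are good (X = A * B)."""
    return feed_a_ok and feed_b_ok

# With a healthy A feed and a failed B feed, the PEM still sees a 54 V supply:
print(ored_feed_voltage(54.0, 0.0))   # 54.0
print(anded_power_good(True, False))  # False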

• Push/pull airflow provided by two identical fan trays horizontally mounted
above and below the card cages. Each fan tray contains six horizontally
mounted 172 mm fans and a custom fan controller.

• Chassis is 38 rack units (RU) (57.75 in) high, including cable management,
and fits in a 600 mm BYB501 cabinet.

• Switch fabric for load-shared redundancy.

Figure 1 SSR 8020 Chassis

2.1.1.2 SSR 8010 Chassis

The SSR 8010 chassis is a 10-slot version of the 20-slot SSR 8020 router. It
uses the same line cards, service cards, and switch cards as the SSR 8020.


It also shares the same PEMs, fan tray, cable management, and air filter.
Basically, it is an SSR 8020 without an upper card cage.

The SSR 8010 chassis has the following characteristics:

• Backplane with 14 vertical slots divided into a card cage of 10 full-height
line cards and 4 half-height switch boards: two switch/route processors
providing 1+1 redundancy, and two switch/alarm cards providing 1+1
redundancy.

• Six PEMs in dedicated slots at the bottom of the chassis. The two rows
of three horizontally mounted PEMs blind mate into horizontal power
backplanes to which the customer’s DC terminal lugs attach via bus bars.
Each supply requires an A (primary) 60-amp 54 V DC feed. An identical
redundant B feed to the supply is Or’ed internal to each supply to provide a
single load zone for N+1 redundancy. After conditioning, the output power
exits the power backplane via bus bars to the backplane. Total available
power is 10.5 kW, based on five active 2.1 kW power supplies.

• Push/pull airflow provided by two fan trays horizontally mounted above
and below the card cages, providing bottom-front to top-rear airflow. Each
fan tray contains six horizontally mounted 172 mm fans and a custom fan
controller.

• Chassis (including cable management) fits in a 600 mm deep BYB501
cabinet. Chassis height is 21 RU or 36.75 in (933.5 mm).

• Switch fabric provides load-shared redundancy.

Figure 2 SSR 8010 Chassis

2.1.2 SSR 8000 Line and Service Cards

The SSR platform supports:

• A 40-port GE line card, which allows up to 40 SFP plug-ins.

• A 10-port 10GE line card, which allows up to 10 10GE XFP plug-ins.


• A 4-port 10GE or 20-port GE and 2-port 10GE line card, which allows up to
4 10GE XFP plug-ins or 20 SFP plug-ins and 2 10GE XFP plug-ins.

• A 1-port 100 GE or 2-port 40GE line card, which allows up to one 100GE or
two 40GE C form-factor pluggable (CFP) plug-ins; required for Broadband
Network Gateway (BNG) services.

• A Smart Services Card (SSC), which provides advanced services that are
beyond the scope of the terminating and forwarding capabilities provided
by the line cards. The SSC is targeted at both control- and user-plane
applications.

Unlike a line card, an SSC does not have I/O interfaces that are used for
traffic processing. It receives all its traffic from other line cards via the
switch fabric. The SSC supports a single application per card and offers
complete installation flexibility. It occupies a single slot in the chassis and
can be plugged into any usable line card slot.

2.1.3 SSR Switch Fabric


The SSR uses three card types to implement the switching fabric: Switch Route
Processors (RPSW), Alarm Switch (ALSW) cards, and Switch (SW) cards (also
known as ALSW Lite). The FE600 device switches user plane traffic. The three
switch card types differ in the circuitry they provide in addition to the FE600 switch circuitry.

From a design perspective, only two switch card variants exist, because the
SW card is a depopulated version of the ALSW card. All SSR systems require
both the RPSW and ALSW cards. The SSR 8020 system also requires SW
cards for expanded fabric capacity. The switch cards natively support 100
Gbps line card slots.

All line cards are connected to all fabric switches, as illustrated for the
SSR 8020 in Figure 3.


Figure 3 SSR 8020 Switch Fabric

Table 1 provides an overview of each switch fabric card.

Table 1 SSR Switch Fabric Cards


Switch Card | Description | Chassis Usage | Number of Cards per Chassis
RPSW | Contains x86 control processor, 1–2 FE600 fabric devices, and chassis management logic | 8010, 8020 | 2
ALSW | Contains GE control plane switch, 1–2 FE600 fabric devices, and central timing distribution | 8010, 8020 | 2
SW | Contains 1 FE600 fabric device | 8020 | 4

On the SSR line cards, the fabric access processor (FAP) with Interlaken
interfaces connects to the FE600s on each switch card using a Broadcom
proprietary protocol. There are multiple serializer/deserializer (SerDes) links
(Differential High-Speed Serial Interfaces) from each line card to each FE600
in the system. Table 2 describes the configuration and throughput of the data
plane on the two chassis.

Note: These bandwidths do not include proprietary header overhead.

Table 2 Line Card Component Specifications


Item | SSR 8010 | SSR 8020
Number of switch cards | 4 | 8
Redundancy | 3+1 | 6+2
Number of FE600s on switch cards | 2 | 2
Line card links per FE600 | As many as there are line cards | As many as there are line cards
Effective SerDes link speed (Gbps) | 4.735 | 4.735
Full per line card capacity (Gbps) | 151.52 | 151.52
Single-failure line card capacity (Gbps) | 113.64 | 132.58
Full switch fabric capacity (Gbps) | 1,515.20 | 3,030.40
Single-failure switch fabric capacity (Gbps) | 1,136.40 | 2,651.60
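
The figures in Table 2 can be reproduced from the SerDes link counts. The Python sketch below is an illustrative back-calculation; it assumes 32 SerDes links per line card spread evenly across the FE600 devices, and that a single failure removes one switch card carrying two FE600 devices. These link-count assumptions are inferred from the table values and are not stated elsewhere in this document.

LINK_SPEED_GBPS = 4.735          # effective SerDes link speed from Table 2
LINKS_PER_LINE_CARD = 32         # assumption: total SerDes links per line card

def line_card_capacity(links):
    return links * LINK_SPEED_GBPS

def fabric_capacity(links_per_card, line_card_slots):
    return line_card_capacity(links_per_card) * line_card_slots

# SSR 8010: 8 FE600s -> 4 links per FE600; losing one 2-FE600 switch card drops 8 links.
print(round(line_card_capacity(LINKS_PER_LINE_CARD), 2))        # 151.52 Gbps full per line card
print(round(line_card_capacity(LINKS_PER_LINE_CARD - 8), 2))    # 113.64 Gbps single failure
print(round(fabric_capacity(LINKS_PER_LINE_CARD, 10), 2))       # 1515.2 Gbps full fabric

# SSR 8020: 16 FE600s -> 2 links per FE600; losing one 2-FE600 switch card drops 4 links.
print(round(line_card_capacity(LINKS_PER_LINE_CARD - 4), 2))    # 132.58 Gbps single failure
print(round(fabric_capacity(LINKS_PER_LINE_CARD, 20), 2))       # 3030.4 Gbps full fabric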

2.1.4 Control Plane Interfaces


The system has various control plane interfaces to provide services:

• Low-level card management bus (CMB) used as a low-latency register
interface by the active RPSW card to all cards in the system.

• Peripheral Component Interconnect (PCI) Express control interface
(SerDes-based) used by the active RPSW card to control all switch card
resources.

• High-speed, packet-based GE control plane used by the line cards to
communicate packet-based messages with the RPSW cards.

• Timing control bus (TCB) used by the ALSW cards to provide reference
clock and epoch clock support for the system.

• Selection control bus (SCB) used by the ALSW cards to arbitrate the active
route processor in the system and to communicate that information to all
switch cards.

• Common equipment, such as fan trays and PEMs, is controlled by the
active RPSW card over I2C (two-wire standard interface) buses.


2.1.4.1 Low-Level Card Management Bus Interface

The CMB is one of three similar buses used to both control and distribute
information to the other elements of the chassis. The CMB provides redundant,
low-level communication and control from the RPSW cards to the line cards,
ALSW cards, and SW cards (see Figure 5). Figure 4 displays the CMB
interconnections. The master and standby RPSW card declaration defines
the master and standby CMBs. There is no hardware switchover. If a CMB
failure is detected, the RPSW card must decide if it is still master-capable. If
not, the ALSW card declares a new master RPSW card, and the new master
CMB follows.

The bus consists of an 8 MHz clock and a bidirectional synchronous data line.
The CMB also has an active-low interrupt line sourced by the bus recipient and
received by the RPSW cards, as well as a detect/reset signal. Not all CMB
interfaces need to support all CMB functions. The CMBs between the two
RPSW cards are of a very reduced nature.

The ALSW and SW cards use the Shiba field programmable gate array (FPGA).
For more information, see the Shiba Functional Specification with EngDoc ID
HW-HLD-0032 (http://cdmweb.ericsson.se/WEBLINK/ViewDocs?DocumentNa
me=62%2F15941-FCP1217270&Latest=true).

The RPSW cards support a variation of the CMB called the route processor
management bus (RPMB). The Phalanx Complex Programmable Logic Device
(CPLD) supports the RPMB with the following features on the RPSW card.
The CPLD allows a specialized feature set without compromising the original
SMB specification.

• Phalanx supports all I2C buses.

• All devices (hot-swap controller, inlet temperature sensor, and
manufacturing EEPROM) outlined in the specification are available on
I2C buses 0, 1, and 2.

• Logging is supported on the NVRAM or real-time clock (RTC) module,
which is completely memory mapped on the Phalanx FPGA. Both RPSW
cards can read the NVRAM, but only the local processor can write to the
NVRAM.

Figure 4 represents the overall SSR CMB topology.


Figure 4 CMB Interface


Figure 5 CMB Communications

2.1.4.2 PCI Express Control Interface

The RPSW card connects to the other switch cards (ALSW and SW) through PCI
Express Gen 1 x1 lane ports, as shown in Figure 6. The PLX 8615 PCI Express
ports are enumerated as port 0 for the x4 lane Gen 2 interface to the Jasper
Forest processing chip and ports 1 through 8 for the x1 lane Gen 1 interfaces to
the switch card FPGAs on all switch cards, including the onboard FPGA. As
Figure 6 shows, all transactions to all FPGAs on all switch cards emanate from
the processor complex through the PLX 8615, which allows a homogeneous
interface for software. Port 0 is the UP port to the host and is configured as a
PCIe Gen 2 (5.0 GT/s) interface. Ports 1 through 8 are configured as PCIe
Gen 1 (2.5 GT/s) interfaces.


Figure 6 PCI Bus Control Interface

2.1.4.3 Gigabit Ethernet Control Plane Interface

Figure 7 illustrates the control plane GE interconnection system used for line
card and RPSW card control communication. This is separate from the fabric
and forwarding plane connections. This interface is the backplane Ethernet,
with a central Ethernet switch on each ALSW card. The line cards support
1GE links, and the RPSW cards support a 10GE link rate from the Ethernet
switch on the ALSW cards.

The Intel i82599 Dual 10GE Network Interface Controller provides a x4 lane
Gen 2 (5.0 GT/s) interface from the Ethernet switch to the Jasper Forest
processor complex. A direct control plane connects the RPSW cards through
redundant 1GE links. The Intel i82580 Quad GE Network Controller provides
a x4 lane Gen 1 (2.5 GT/s) interface from the Ethernet switch to the Jasper
Forest processor complex. The Jasper Forest processor complex is capable of
handling 3–6 Gbps data bandwidth from the Ethernet control plane. The 10GE
uplinks provide future expansion capability. One Ethernet 10/100/1000Base-T
external system management port is also provided on each RPSW card from
the i82580 Quad Ethernet LAN controller.

Figure 7 Gigabit Ethernet Control Plane Network

2.1.4.4 Timing Control Bus Interface

The TCB is a redundant bus sourced by the ALSW cards. Its primary purpose
is as a conduit for timing information between the ALSW card and the line cards
(see Figure 8). The bus commands provide support for SyncE clock distribution.
For more information, see Section 2.1.4.5.

Redundant buses are mastered from each ALSW card in the system. Each bus
is wired as a clock and data pair, wired in a dual star pattern to each line card in
the system (see Figure 9). The epoch clocks are two synchronized counters
operating synchronously to the TCB. They are used throughout the system for
coordinated event logging. The TCB provides the capability to distribute a
synchronized epoch clock to all line card processing elements in the system.
The SCB clocks from each ALSW card are run from the local 100 MHz oscillator
used for the system FPGA clock. The data is non-return-to-zero (NRZ).

Figure 8 TCB Interface

Figure 9 TCB Communications

2.1.4.5 Synchronous Ethernet

The SSR uses Synchronous Ethernet (SyncE) to keep transmission
synchronized in a network using a unified clock. SyncE communicates and
synchronizes timing and source traceability information from node to node in a
plesiochronous network using the physical layer interface. All network elements
along the synchronization path must be SyncE-enabled.

The goal is to keep transmission synchronized by a unified clock in order to
keep clock fluctuations and offsets under control, because they are a main
cause of errors and poor quality of service.

The core process on the SSR (the timing control module daemon, TCMd) is located
on the active RPSW card. It performs the following functions:

• Provisions and monitors synchronization reference candidates.

• Selects the best available reference candidate(s) as an active
synchronization source for Equipment Clock and BITS ports (a sketch of this
selection follows the list).

• Receives and transmits SSM over BITS ports.
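
The Python sketch below illustrates, under assumptions, how quality-level-based reference selection can work: each candidate carries a received quality level (QL) and a fault state, and the best non-faulted candidate is chosen. This is an illustrative model only; TCMd's actual selection rules (priorities, wait-to-restore timers, and so on) are not spelled out in this document, and the QL ranking shown is a common SSM convention rather than an authoritative list.

# Illustrative QL ranking (lower rank = better).
QL_RANK = {"QL-PRC": 0, "QL-SSU-A": 1, "QL-SSU-B": 2, "QL-SEC": 3, "QL-DNU": 9}

def select_reference(candidates):
    """candidates: list of dicts with 'name', 'ql', and 'faulted' keys."""
    usable = [c for c in candidates if not c["faulted"] and c["ql"] != "QL-DNU"]
    if not usable:
        return None  # fall back to holdover or the local oscillator
    return min(usable, key=lambda c: QL_RANK.get(c["ql"], 9))["name"]

refs = [
    {"name": "BITS-A",   "ql": "QL-SSU-A", "faulted": False},
    {"name": "port 1/1", "ql": "QL-PRC",   "faulted": False},
    {"name": "port 2/1", "ql": "QL-PRC",   "faulted": True},
]
print(select_reference(refs))  # "port 1/1"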

SyncE produces alarms when port, line card, chassis or ALSW faults are
generated. For more information about the alarms, see Alarms and Probable
Causes.

SyncE on the SSR has the following restrictions and limitations:

• Up to four ports can be configured as input sources: Building Integrated
Timing Supply / Standalone Synchronization Equipment (BITS/SASE, or
simply, BITS) A, BITS B, and two line card Ethernet ports (one per slot).

• Up to four ports can have continuous monitoring for quality level.

• Up to four ports can be continuously monitored to receive synchronization
status messages (SSMs); other ports can be queried as needed.

• Ports on the 4-port 10 Gigabit Ethernet, or 20-port Gigabit Ethernet and
2-port 10 Gigabit Ethernet line card do not currently support the Ethernet
synchronization messaging channel (ESMC) protocol data units (PDUs),
although the CLI does not prevent entering the synchronous-mode
command for these ports.

• BITS output ports have no dedicated input source selector and are always
timed from the equipment clock.

• The SSR SyncE implementation does not currently support SNMP. The
following counters and statistics are, however, available via CLI show
commands:

- Number of SSM packets received and sent on a per-port basis.

- Clock quality level being sent and received through SSM.

- SyncE-capable ports operating in synchronous and non-synchronous mode.

- Input source used to drive a port’s transmit timing, and the state of that
source (freerun, holdover, locked).


- Number of times the transmitted/received quality level has changed per port.

- Number of SSMs received by a SyncE-capable port that is operating in
non-synchronous mode.

Figure 10 illustrates the SyncE functional architecture.

Figure 10 SyncE Functional Architecture

As shown in Figure 10, SyncE is supported by the following hardware and
software components:

On the active RPSW card:

• SSR CLI and front end components that implement the new configuration
and monitoring functions. The configurations are in three configuration
areas: Equipment Clock, SyncE port, and BITS.


• RCM – manages and stores the configuration. The configuration splits into
two paths in RCM. SyncE port configuration follows the DSL configuration
path, through CTL_MGR, CSM, PAD, and down to the line card. The
configuration intended for TCMd, which includes Clock Selector and BITS,
flows through TCM_MGR and then directly to TCMd.

• TCM_MGR – a new component. CTL_MGR and TCM_MGR facilitate
monitoring (show) commands as well.

• TCMd – the main SyncE process. It controls all the timing features.

• CSM – facilitates SyncE port configuration and monitoring.

• PAD – implements several new SLAPIs to configure and monitor the SyncE
port.

• ALARM MGR – implements new timing alarms.

• CMS – provides card state notifications to TCMd, including line card OIR,
ALSW OIR, and redundancy changes. CSM is not expected to change and is
included here for clarity.

On the standby RPSW card:

• TCMd – the main timing process, running in standby mode. The standby
process performs no control functions and does not access the timing
hardware. The active TCMd syncs some information to the standby TCMd via IPC.

On the ALSW cards (ALSW/ALSWT, active and standby):

• The TCM hardware is physically located on the ALSW card. Both the active
and standby TCM are controlled by the active RPSW. The active TCM provides
the timing function to the system. The standby TCM is in warm standby: its
Equipment Clock PLL is synchronized to the active output, and its BITS inputs
and outputs are disabled.

On the line cards:

• CAD – implements the remote portion of the Distributed Service Layer.

• Static attribute library – provides information on the SyncE capability of
the card.

• LP – selects the timing mode of Ethernet ports (synchronous or
non-synchronous). It also controls transmission of ESMC PDUs and reports
changes in received ESMC PDUs to TCMd.

• ALd – the adaptation layer between platform-independent (facing RPSW)
and platform-specific (facing card hardware) functions. New functionality
is added to this process to control and monitor the SyncE hardware on the
LIM portion of the card.

• LIM Drivers are updated to configure SyncE card/port timing circuits.

• PKTIO punts and injects ESMC packets.


• TCM Agent – a new component that provides synchronization services
to the card. It configures timing circuits, monitors SyncE port faults,
validates the transceiver capability to support synchronous mode, and
raises faults to TCMd. It also controls the sending and receiving of
ESMC PDUs and collects ESMC statistics.

• NPU driver – provides the data path for ESMC PDU punting and insertion.

On the line card NPUs:

• Ingress NPU performs ESMC PDU packet punting to the LP.

• Egress NPU performs ESMC PDU insertion.

2.1.4.6 SyncE Redundancy Support

2.1.4.6.1 ALSW Stratum 3/3E Module Redundancy

Each ALSW card (active and standby) has an equipment clock Stratum-3/3E
module. The module on the active ALSW card performs monitoring of
synchronization input sources and provides the equipment clock to the
chassis. The module on the standby ALSW card waits in warm standby mode,
synchronized to the active clock.

When an ALSW switchover occurs, the Stratum-3/3E module on the newly active
ALSW card is provisioned by the equipment clock control process (TCMd) and
begins operation in active mode. The SyncE ports switch their transmit timing
source to the new active ALSW card.

2.1.4.6.2 BITS Redundancy

On the standby ALSW card, BITS inputs and outputs are disabled. On the
active ALSW card:

• BITS inputs are enabled, monitored for faults, and if so configured, are
available as synchronization input sources.

• BITS outputs (if configured) are active and transmitting.

Two Y-cables, one each for BITS A and BITS B, are required for BITS input
and output redundancy.

2.1.4.6.3 Headless Operation

When there is no active RPSW or no active ALSW card, the system is said to
be in headless operation. Line card software detects the absence of an active
RPSW or active ALSW and takes the following compensating measures:

• If no ALSW card is present in the system, the line card switches to the
local oscillator.


• If no ALSW card or no active RPSW card is present in the system, the
SyncE clock quality can degrade. In headless operation mode, all SyncE
ports configured in synchronous mode transmit quality level QL_DNU (Do
Not Use for synchronization). If no active RPSW card is present in the
system, the BITS output signal is squelched using the configured squelch
method. If no squelch is configured, the BITS outputs are shut down. See
the squelch command for more information about configuring a squelch
method. A sketch of these compensating measures follows this list.
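
A minimal Python sketch of the compensating measures described above, assuming hypothetical state flags (none of these names come from the SSR software; this is an illustration, not the implementation):

def headless_compensation(active_alsw, active_rpsw, squelch_method=None):
    """Return the compensating actions taken in headless operation (illustrative)."""
    actions = []
    if not active_alsw:
        actions.append("switch line card Tx timing to the local oscillator")
    if not active_alsw or not active_rpsw:
        actions.append("transmit QL_DNU on all synchronous-mode SyncE ports")
    if not active_rpsw:
        if squelch_method:
            actions.append(f"squelch BITS output using '{squelch_method}'")
        else:
            actions.append("shut down BITS outputs")
    return actions

print(headless_compensation(active_alsw=False, active_rpsw=False))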

Note: The SyncE function is in this degraded state during the two to three
minutes required to complete a software upgrade. During that short
period, the system operates without an active RPSW card, though
it continues to forward traffic.

2.1.4.7 Selection Control Bus Interface

The SCB is a redundant bus sourced by the ALSW cards. Its primary purpose
is to act as a conduit for RPSW mastership selection between ALSW cards.
Redundant buses are mastered from each ALSW card in the system. Each bus
is wired as a clock and data pair, wired in a dual star pattern to each line card in
the system (see Figure 11). The SCB clocks from each ALSW card are run from
the local 100 MHz oscillator used for the system FPGA clock. The data is NRZ.

Figure 11 SCB Communications

The signals from each ALSW card to its mate run at a higher frequency
to ensure that the messages from each internal logic state machine are
synchronized at all times and to ensure a smaller window of uncertainty during
the selection process. The cross-coupled links run at 100 MHz and use
low-voltage differential signaling (LVDS) for both data and clock.

The SCB links to the RPSW card are slightly different because the RPSW
cards need both the IEEE 1588 clock synchronization updates and the epoch
clock updates. The ALSW cards overlay the TCB functionality onto the SCB
bus so that the RPSW FPGA must check only one interface to get all required
status and information.

2.1.4.8 I2C Interface

The primary purpose of the common equipment I2C interfaces is to
provide control of the eight PEMs, fan trays, and chassis EEPROMs (see
Figure 12). All I2C links are controlled by an I2C multiplexer that allows both
RPSW cards controlled access to the system resources.

Figure 12 Common Equipment I2C Interface Communications

2.1.5 Timing and Synchronization


System timing functionality is implemented on both the ALSW and line cards in
the SSR system. All potential system reference clock sources are aggregated
on each ALSW card where the selection of a single system reference clock is
made. A backplane distribution network forwards the clock references from
each ALSW card to all line cards in the system. The selection of a line card
transmit (Tx) clock is performed on each line card. The line card Tx clock
selection determines which ALSW card is the timing master for the system.
Operating together, both ALSW cards implement a redundant system-timing
function. The active RPSW card selects the timing references through the PCI
Express interface to the ALSW FPGA.


2.1.5.1 Switch Alarm Card Timing Support

Network propagated timing and building integrated timing supply (BITS) timing
are supported on the ALSW cards, which contain aggregation, selection,
and distribution logic for the system reference clock. The implementation is
compliant with SONET/SDH (ITU-T G.823/824) and SyncE (ITU-T G.8261)
standards.

Figure 13 presents a representative block diagram of the timing logic. Three
reference input clock types are supported on each ALSW card: a dual T1/E1
BITS input clock via the front panel RJ48-C connector, up to 20 recovered
line card clocks, and a locally generated Stratum 3E–compliant clock. The
input reference clocks to the ALSW cards are 8 KHz. The DS3100 reference
clock outputs are 19.44 MHz and 38.88 MHz. A DS3100 oven-controlled
crystal oscillator (OCXO) provides a single Stratum 3E system reference clock
(Stratum 3E tracks input signals within 7.1 Hz of 1.544 MHz from a Stratum 3
or better source) with holdover support; the final output from clock circuitry
is 8KHz. The capability to configure the ALSW card’s reference clocks in
a master/slave relationship is also provided so that the slave frequency and
phase are locked to the master.

Figure 13 ALSW Card Timing Support

2.1.5.2 Line Card Timing Support

The line card clocking architecture supports line-side clock recovery and
forwarding to the central timing logic and line-side transmit clocking support.
Figure 14 shows the basic block diagram for clocking support on the 40-port
GE and 10-port 10GE line cards. Figure 15 shows the diagram for clocking
support for the 1-port 100 GE or 2-port 40GE line cards.

Logic to support line-side clock recovery, multiplexing, division, and forwarding
is line card dependent. Each line card generates two 8 KHz clocks that are
forwarded to the ALSW cards for use as reference inputs to the system clock
generation logic. Each ALSW card receives a single, forwarded line card clock.

SyncE is supported on a per-NPU basis, with a single Rx (receive) and Tx
(transmit) clock selection across all receive ports per NPU.


Logic to support line-side Tx clocking selects its reference from the following
input sources: a free-running oscillator, a BITS clock, or any looped-back
line-side receive clock. The clock multiplexing function resides in a line
card–specific FPGA. The SI53xx family of voltage-controlled crystal oscillators
(VCXOs) is used to provide clock smoothing and glitchless Tx clock switchover.

Figure 14 40-Port GE and 10-Port 10GE Line Card Timing Support


Figure 15 1-port 100 GE or 2-port 40GE Line Card Timing Support

2.1.6 RPSW Controller Card


Architecturally, the RP design looks like a server board with multiple PCI or PCI
Express–connected blocks providing resources. Figure 16 provides a block
diagram of the components.


Figure 16 RPSW Card Architecture

The components and major blocks have the following functionality.

• CPU

- Intel Jasper Forest 60 W Advanced Telecom Computing Architecture
(ATCA) processor with a Network Equipment Building Standards (NEBS)-friendly
temperature profile and integrated northbridge (a memory controller
hub). Intel's latest server processor provides the following features:

- Native quad core (four CPUs in a single device), with 32 KB L1 and
256 KB L2 cache per core

- 8 MB of shared L3 cache

- Frequency of 2.13 GHz

- Three local double data rate, version 3 (DDR3) memory channels,
each with 24 GB of memory


- One internal Intel QuickPath bus operating at 5.8 GT/s (each
transfer is 2 bytes)

- 60 W thermal design power (TDP)

- Up to four PCI Express bridges

- Sixteen PCI Express Gen 1 and Gen 2 SerDes lanes, allocated across
the bridges

- Enterprise Southbridge Interface (ESI) buses to the southbridge

• Southbridge I/O controller hub

- Intel Ibex Peak southbridge

- ESI from the northbridge

- PCI Express, PCI, SMBus, Serial Advanced Technology Attachment
(SATA), USB, Serial Peripheral Interface (SPI), and general purpose
I/O (GPIO) resources

• BIOS

- Numonyx M25P128 SPI flash

- Connects to the southbridge SPI bus

• USB

- Three USB ports

- One faceplate USB port for a thumb drive

- Two USB ports for 32 GB eUSB solid-state drives (SSDs)

• PCI Express switch

- Eight PCI Express Gen 1 x1 lane controllers, one to each switch card

- Up to 12 controllers, but 4 are disabled in the mode being used

- One PCI Express Gen 2 x4 lane

- Direct memory access (DMA) capabilities for control plane offload

• Ethernet quad Media Access Controllers (MACs)

- Intel i82580 PCI Express controller for 1G Ethernet connection

- PCI Express Gen2 connection running at 2.5 GT/s

- Quad Ethernet MAC addresses

- Each MAC can use copper or SerDes as the physical interface


- Three MAC addresses for the following functionality:

• Front panel Ethernet management port

• Mate RP Ethernet with dual links to the mate RP and mate RP switch

• Ethernet dual 10GE MACs

- Intel i82599 PCI Express controller for 10GE connection

- Dual Ethernet MACs

- Each MAC uses 10GBase-KR

- Two MACs used for 10GE uplink from the GE switch

• Programmable Phalanx FPGA

- Power and reset sequencing, watchdogs, non-maskable interrupt
(NMI) handling, NVRAM, RPMB interface to the mate RP, Universal
Asynchronous Receiver/Transmitters (UARTs), and so on

- Connects to the southbridge through a PCI interface

- Lattice XP2 17E in a 484-pin FBGA


• Programmable Spanky FPGA

- PCI Express to management functionality

- Interfaces to the CMB, SCB, epoch clock, Dune fabric CPU interface, and I2C
interfaces to common equipment

- Dual PCI Express Gen 1 x1 interface (one hard core and one soft core)

- Altera Arria II GX in an FC780-pin package

• Dune fabric application-specific integrated circuit (ASIC)

- Fabric crossbar to line cards

- Single-fabric ASIC provides 100 Gbps throughput

2.1.7 ALSW Card



The ALSW card implements the user plane fabric (FE600 device), plus the
system’s control plane Ethernet switch, timing circuits, and alarm indicators.
Figure 17 provides a diagram of the ALSW card components.

For information about the ALSW card role in SyncE, see Section 2.1.4.6.


Figure 17 ALSW Card Architecture

The following describes the Shiba FPGA and major block functionality.

• Terminates buses from active and standby RPSW cards

• Provides RPSW card access to all devices on the ALSW card, including:

- Dune Networks FE600 fabric switch (user plane)

- Marvell DX263 Ethernet switch (control plane)

- Dallas 3100 BITS framer

- Voltage and temperature monitors


- SCB and TCB mastership

- CMB slave

2.1.8 Switch Card


The SW card is a functional and physical subset of the ALSW card. The PCB is
common, and the SW contains fewer components. The SW contains the Shiba
FPGA and other FE600-related components. It does not contain an Ethernet
switch, BITS circuits, or alarm-related circuits.

2.1.9 Line Cards


The SSR supports the 40-port GE, 10-port 10GE, 1-Port 100GE or 2-Port
40GE, and 4-Port 10GE, or 20-Port GE and 2-Port 10GE line cards. Their
components perform the following roles:

• Network Processing Unit—Each bidirectional NPU directly forwards traffic
to and from the ports on the line card and forwards traffic, through the
FAP, to and from service cards and other line cards. The Forwarding
Information Base (FIB) resides on the NPU, but the FIB driver is part of the
local processing. The NPU is also known as the packet forwarding engine
(PFE). The PFE performs packet parsing, encapsulation, FIB, LFIB, and
MFIB lookups, and traffic steering to the SSCs through TSFT lookups.
It also performs ACL filtering and classification, QoS propagation, rate
limiting and marking, weighted random early detection (WRED) and priority
weighted fair queuing (PWFQ) scheduling, traffic shaping, and forward
policy instantiation.

• Fabric access processor—The FAP interfaces with the switch cards,
load-shares traffic over them, and performs fragmentation and reassembly.
It receives traffic from the switch fabric and forwards traffic across it. In the
ingress path, the FAP fragments packets from the local NPUs and forwards
these fragments to all available switch cards through which the destination FAP
is reachable. In the egress path, the FAP reassembles packets from
fragments received from the available switch cards and forwards the
reassembled packets to the local NPUs for egress processing. The FAP
also queues traffic in virtual output queues (VOQs) and schedules traffic
according to the priority from the packet descriptor QoS priority marking. A
sketch of the ingress and egress handling follows this list.

• Local processor—Single-core PowerPC processor running at 1.0 GHz.
The local processor receives and transmits high-level commands from the
route processor and translates them into low-level commands for the NPU
and other local devices. The local processor implements the forwarding
abstraction layer (FABL), which abstracts the card-specific details coming
from the RP.

• Physical interface adapter—Performs line coding and clock recovery, and
reports the health of the physical interface.
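
As referenced in the fabric access processor item above, the following Python sketch illustrates the general idea of ingress fragmentation across the available switch cards and egress reassembly. It is a simplified illustration with an invented fragment size and field names; the real FAP cell format, sequencing, and VOQ scheduling are not described at this level of detail in this document.

from itertools import cycle

FRAGMENT_SIZE = 256  # illustrative fixed fragment payload size in bytes

def fragment(packet: bytes, available_switch_cards):
    """Split a packet and spray the fragments round-robin over the switch cards."""
    cards = cycle(available_switch_cards)
    frags = []
    for seq, offset in enumerate(range(0, len(packet), FRAGMENT_SIZE)):
        frags.append({
            "seq": seq,
            "last": offset + FRAGMENT_SIZE >= len(packet),
            "via": next(cards),
            "data": packet[offset:offset + FRAGMENT_SIZE],
        })
    return frags

def reassemble(fragments):
    """Reorder fragments by sequence number and rebuild the original packet."""
    ordered = sorted(fragments, key=lambda f: f["seq"])
    assert ordered[-1]["last"], "incomplete packet"
    return b"".join(f["data"] for f in ordered)

pkt = bytes(1000)
frags = fragment(pkt, available_switch_cards=["SW1", "SW2", "RPSW1", "ALSW1"])
assert reassemble(frags) == pkt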


2.1.9.1 10-port 10GE Line Card

Figure 18 shows the layout and components of the 10-port 10GE line card,
which is based on two NP4 NPUs. Each NPU supports five ports of 10GE and
accesses the other line cards and services cards through the FAP. The card
also has a local processor that translates high-level messages to and from the
route processor into low-level device commands on the card. This card also
has two 10GE physical interface adapters that connect with the ports.

Figure 18 10-Port 10GE Card Block Diagram

2.1.9.2 40-port GE Line Card

Figure 19 illustrates the 40-port GE line card. It is based on a single NP4 NPU
that supports all 40 ports of GE and accesses the other line and services cards
in the system through the FAP. The card also contains a local processor and a
GE physical interface adapter connecting the ports.


Figure 19 40-Port GE Line Card Block Diagram

2.1.9.3 1-Port 100 GE or 2-Port 40 GE Card

Figure 20 illustrates the 1-port 100 GE or 2-port 40 GE line card. Typically,
this card is used to provide high-capacity uplink ports to handle increases
in network traffic.

This high capacity card is based on two NP4 NPUs that process GE traffic,
running in simplex rather than duplex. The card also contains a local processor
and a GE physical interface adapter connecting the ports. The card accesses
the other line and services cards in the system through the FAP.

Only supported CFPs are allowed to power up. Also, hardware memory DIMMs
are now ‘keyed’ so that only approved DIMMs are allowed on the card. The
card does not boot with unapproved DIMMs.

This card can be configured to run in 40Gb or 100Gb mode using the card
mode command. You must reload the router to switch from one mode to the
other.



Figure 20 1-Port 100 GE or 2-Port 40 GE Line Card Block Diagram

2.1.9.4 4-Port 10GE, or 20-Port GE and 2-Port 10GE Card

Figure 21 illustrates the 4-port 10GE, or 20-port GE and 2-Port 10GE line card,
which supports BNG application services on the SSR.

The line interface uses pluggable SFPs for GE bandwidth or pluggable XFPs
for 10GE bandwidth.


This card has two forwarding complexes, iPPA3LP (ingress) and ePPA3LP
(egress), which provide 40G forwarding bandwidth. The control path is based
on the same LP (Freescale™ PowerPC™ MPC8536) as the NP4 line cards,
but it runs at a higher frequency (1.5GHz) on this card, and is equipped with
4MB of RAM. The PPA3 clock rate is also set at the maximum of 750MHz to
improve data path forwarding performance. Flow control is supported by the
(Vitesse) Ethernet MAC. The FANG FPGAs associated with the IPPA3 and
EPPA3 NPUs provide interfaces with the Fabric through the FAP, carrying
20G of data throughput.

Figure 21 4-Port 10GE or 20-Port GE and 2-Port 10GE Line Card

2.1.9.4.1 PPA3LP Card Supporting Software Architecture

The PPA3LP is a multi-Execution Unit (EU) processor, in which two of its
dedicated EUs are used for protocol handling. To communicate with FABL/ALd,
the PPA3LP card adopts an IPC-proxy approach in which the new Proxy Layer
Daemon (PLd) and Hardware Abstraction Daemon (HAd) modules send IPC
messages between Ericsson IP OS modules and PPA3LP endpoints either
directly or proxied:

• Directly—Modules that are PFE aware (such as ISM) communicate directly
with the PPA endpoints. Each PPA is assigned an IP address in the
same domain as the LP and packets are routed in the LP space to the
corresponding PPAs based on their IP addresses.


• Proxied—Modules that are not PFE aware do not talk directly with PPA, but
communicate through the PLd, which proxies messages in both upstream
and downstream directions.

Figure 22 illustrates the major software components on the PPA3LP card.

Figure 22 PPA3LP Card Software Components

The following software modules support this communication:

• PLd—PPAs have a number of “features” that can be configured. When an
IP OS application receives a registration message from a feature, which has
its own thread and corresponding endpoint, it can respond with messages
to that endpoint.

The PLd therefore has a proxy thread and associated endpoint for every
feature that it proxies. Since there are two PFEs, the registration messages
are proxied and multiplexed into a single registration by PLd.

Configuration messages for a given feature are sent to the PLd thread
proxying that feature. The PLd determines the target PPA based on the
PFEid in the IPC header (some messages are sent to both PFEs). A sketch
of this proxy pattern follows this list.

• HAd—The HAd, the driver powerhouse on the card, is equivalent to the
ALd on the NP4 cards. Since forwarding configuration is proxied from PLd
to PPA endpoints directly, there is no forwarding provisioning that involves
HAd (in contrast to ALd). It primarily manages hardware configuration and
monitoring.

• PAKIO—BNG applications exchange a high volume of protocol control
packets with IP OS daemons such as PPPoE. In order for IP OS modules
to specify a particular PFE on the PPA3LP card, the socket layer for PAKIO
enables the modules to specify the target PFE for the PAKIO message.
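
As noted in the PLd item above, the sketch below illustrates the proxy pattern in Python: per-feature registrations from the two PFEs are multiplexed into one upstream registration, and downstream configuration messages are dispatched according to the PFE ID carried in the message header. All names and the sentinel value are invented for illustration; this is not the PLd implementation.

BOTH_PFES = 0xFF  # illustrative sentinel meaning "send to both PFEs"

class FeatureProxy:
    """One proxy thread/endpoint per feature (simplified, single-threaded model)."""

    def __init__(self, feature, pfe_endpoints):
        self.feature = feature
        self.pfe_endpoints = pfe_endpoints   # {pfe_id: send-callable}
        self.registered = set()

    def on_registration(self, pfe_id):
        """Multiplex per-PFE registrations into a single upstream registration."""
        self.registered.add(pfe_id)
        if self.registered == set(self.pfe_endpoints):
            print(f"register feature '{self.feature}' upstream once")

    def on_config(self, msg):
        """Dispatch a config message to one PFE, or to both, based on the header."""
        if msg["pfe_id"] == BOTH_PFES:
            targets = self.pfe_endpoints
        else:
            targets = {msg["pfe_id"]: self.pfe_endpoints[msg["pfe_id"]]}
        for pfe_id, send in targets.items():
            send(pfe_id, msg["payload"])

proxy = FeatureProxy("qos", {0: lambda p, m: print("to PFE", p, m),
                             1: lambda p, m: print("to PFE", p, m)})
proxy.on_registration(0)
proxy.on_registration(1)
proxy.on_config({"pfe_id": 1, "payload": "policy update"})
proxy.on_config({"pfe_id": BOTH_PFES, "payload": "global setting"})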

2.1.9.4.2 Adjacency and Circuit Creation

When circuit information comes from the IP OS to the PPA3LP (or PFE), the
method to indicate that the circuit should be added or deleted is as follows:

Each PFE records its PFE_mask at startup.

• Circuit Creation—When an adjacency is allocated, a card-specific
callback through the IP OS lib sets up the PFE_mask and the adjacency
automatically. When a circuit arrives having RBOB_adj->PFE_mask set,
and the circuit does not exist, this signals creation.

• Circuit Deletion—If the RBOS_adj->PFE_mask becomes 0 (unset) and the
circuit exists, this signals circuit deletion. A sketch of this decision
logic follows.
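
A compact Python sketch of this add/delete decision, assuming a simple local circuit table keyed by a circuit handle. The field and function names are illustrative only and do not reflect the actual IP OS data structures:

circuits = {}  # circuit handle -> adjacency info (illustrative local table)

def on_circuit_update(handle, adj_pfe_mask, my_pfe_mask):
    """Apply the PFE_mask rules: a set mask creates, a cleared mask deletes."""
    relevant = bool(adj_pfe_mask & my_pfe_mask)
    if relevant and handle not in circuits:
        circuits[handle] = {"pfe_mask": adj_pfe_mask}      # circuit creation
        return "created"
    if adj_pfe_mask == 0 and handle in circuits:
        del circuits[handle]                               # circuit deletion
        return "deleted"
    return "no-op"

print(on_circuit_update("ckt-1", adj_pfe_mask=0b01, my_pfe_mask=0b01))  # created
print(on_circuit_update("ckt-1", adj_pfe_mask=0, my_pfe_mask=0b01))     # deleted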

2.1.10 Smart Services Card


The Smart Services Card (SSC) is a standard Intel two-way server design that
has been adapted for use in an SSR line card slot. The card provides the
following hardware resources:

• Two CPUs, each with:

- 8 cores

- 20 MB L3 cache

- 2 MB L2 cache

• Eight memory channels with eight dual in-line memory module (DIMM)
sockets

- 64 GB of SDRAM, very low profile (VLP)

- 128 GB capable (future upgrade)

- DDR3L-1333 MHz

• 4x 10GE ports to the FAP for user plane access

• 50GB SATA SSD boot device

• Southbridge I/O controller hub


- Southbridge function (SATA, USB, boot flash, and so on)

- Crypto and compression acceleration

• Advanced Mezzanine Card (AMC)

- AMC.0 base specification

- AMC.1 PCI Express

- AMC.3 storage

- Supports a minimum of 320 GB of 2.5-inch SATA SSD

Note: An AMC is a modular add-on that extends the functionality of a
carrier board. AMC modules lie parallel to and are integrated onto
the SSC card by plugging into an AMC connector.

• FAP

- Six DDR3 channels of 256 MB

- Quad Data Rate, second version (QDR II) 72 MB SRAM

Figure 23 illustrates the SSC card design.

Figure 23 SSC Architecture


2.1.11 Backplane
The SSR 8000 backplane links together the different components of the SSR
chassis infrastructure. The backplane supports the major components in the
chassis, including the line cards, switch cards, PEMs, fan trays, EEPROM
card, and power backplane.

The SSR system uses the double-star architecture in which all line cards
communicate with all switch cards. The chassis supports each line card
interfacing with up to eight switch cards (see Figure 3), and the communication
between them relies on the backplane traces. Any line card slot can host either
a line card or a service card. Another important function of the backplane is
to distribute power from the PEMs through the power backplane to the entire
system.

The cards are vertically aligned in the front of the backplane. PEMs plug
into the power backplane located in the bottom part of the chassis. Bus bars
transfer power from the power backplane to the backplane, which distributes
the power to all cards and fan trays in the system.

2.1.12 Power Modules

The SSR 8020 has eight PEMs in dedicated slots at the bottom of the chassis,
and the SSR 8010 has 6. The PEMs blind mate into a custom vertical power
backplane to which the customer’s DC terminal lugs attach via bus bars. After
conditioning, the output power exits the power backplane via bus bars to the
backplane. Each supply requires an A (primary) 60 amp 54 V DC feed. An
identical redundant B feed to the supply is Or'd inside the PEM to provide a
single load zone for N+1 redundancy. Total available power is 14.7 kW, based on
seven active 2.1 kW power supplies.

Each PEM is equipped with the status LEDs on the front surface to the right
of the inject/eject lever. See SSR 8000 Power Entry Modules for definitions of
the LED states.

2.1.13 Fan Trays


Each fan tray carries six 4-wire, 54 V compatible, pulse-width modulation
(PWM) controlled fans.

The SSR fan tray is under command of the system’s 1+1 redundant RPSW
controller cards. Each RPSW card interfaces to the fan tray with a dedicated I2C
bus, each augmented with reset, interrupt request, and insertion status signals.

The SSR fan tray incorporates a controller board. The functions of the
controller board are:

• Input power conditioning, including in-rush control and filtering


• Conversion of –54 V input power to local power supplies for the
microcontroller, I2C bus interfaces, memories, and other logic

• Host control interface based on I2C, and supporting access by two
redundant route processors

• Fan PWM interface

The Thermal Manager varies the speed of the fans in response to thermal
events reported by the service layer. The thermal events are based on
temperatures reported by the cards installed in the chassis. There are four card
thermal states: Normal, Warm, Hot, and Extreme. There are two fan speeds:
High (full speed) and Low (40% of full speed). The fans speed up to full speed
when the temperature changes from Normal to any of the other three states.

For example, the fans speed up from Low to High if the temperature changes
from Normal to Warm, Hot, or Extreme. If the temperature goes to Normal
from Extreme, Hot, or Warm, the fan state changes to the Hysteresis state,
which responds to past and current events, and a 10-minute hysteresis timer
starts. At the end of that time period, the fans slow to Low speed, unless the
temperature goes up during that period. If the temperature goes up, the timer is
cleared, and the fans stay at High speed.
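
The fan speed behavior described above can be summarized as a small state machine. The Python sketch below is an illustrative model of that logic (the class and constant names are invented); it is not the Thermal Manager implementation.

HYSTERESIS_SECONDS = 10 * 60  # 10-minute hysteresis timer

class FanSpeedControl:
    def __init__(self):
        self.speed = "Low"          # Low = 40% of full speed, High = full speed
        self.hysteresis_until = None

    def on_thermal_event(self, card_state, now):
        """card_state is one of: Normal, Warm, Hot, Extreme."""
        if card_state in ("Warm", "Hot", "Extreme"):
            self.speed = "High"
            self.hysteresis_until = None          # cancel any pending slow-down
        elif card_state == "Normal" and self.speed == "High":
            if self.hysteresis_until is None:     # enter the Hysteresis state
                self.hysteresis_until = now + HYSTERESIS_SECONDS

    def tick(self, now):
        """Periodic check: slow down only after the hysteresis timer expires."""
        if self.hysteresis_until is not None and now >= self.hysteresis_until:
            self.speed = "Low"
            self.hysteresis_until = None

ctrl = FanSpeedControl()
ctrl.on_thermal_event("Hot", now=0)       # fans go to High
ctrl.on_thermal_event("Normal", now=60)   # Hysteresis state, timer starts
ctrl.tick(now=60 + HYSTERESIS_SECONDS)    # fans return to Low
print(ctrl.speed)                         # Low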

Fan failure detection triggers when a fan's speed deviates by more than 15
percentage points from the commanded set point. The failure modes are as follows:

• If a fan fails, the rest of the fans in the same fan tray run at full speed (fault
speed).

• If I2C communication with the host is lost, the watchdog timer expires,
and all fans run at full speed.

• In the case of a controller failure (hardware or software), all fans run at
full speed.

Two panel-mounted or right-angle, board-mounted LEDs are driven by the
controller card to visually indicate the state of the fan tray, as described in
SSR 8000 Fan Trays.

2.2 Software Architecture

2.2.1 Underlying Software

The SSR runs the Ericsson IP Operating System, which is built around the
Linux OS. It uses the Linux kernel to implement basic services, such as
process management, scheduling, device management, and basic networking
functionality, as well as to provide some of the functionality of the system (such as
ping and traceroute). Although the operating system routing stack depends on
many Linux services, this dependence is not visible to the operator.


The operating system provides general interfaces for configuring and interacting
with the system that are OS-independent, such as the command-line interface
(CLI), Simple Network Management Protocol (SNMP), and console logs. Even
OS-specific information, like lists of processes and counters, is displayed
through the CLI in an OS-independent way. As a result, the operator does
not have to interact with the Linux OS directly or even be aware that the OS
used is Linux. However, it is possible to get access to the Linux shell and
directly perform Linux operations. This is intended to be done only by support
personnel, because it provides superuser access, and doing something wrong
can bring the system down. Also, there is not much additional information that
can be extracted through the Linux shell when compared to the information
provided in the CLI. Such sessions are typically used for internal debugging in
our labs.

2.2.2 Independent System Processes

Implementing the major software components as independent processes allows
a particular process to be stopped, restarted, and upgraded without reloading
the entire system or individual traffic cards. In addition, if one component fails
or is disrupted, the system continues to operate.

Figure 24 diagrams the software architecture, including the major modules.


Figure 24 SSR Software Architecture

The OS components run as separate processes, with many interdependencies.
See the following sections for definitions.

2.2.2.1 AAA

The AAA daemon (AAAd) process handles Authentication, Authorization, and
Accounting (AAA) of subscribers, tunnels, and circuits. AAA has complete control
of, and maintains, the central repository of all subscriber-session-specific
information within its database, and it is the central resource manager and
controller of subscriber session management.

AAA is not directly involved with PFE resource management but instead
communicates with the PPA through IPC. For example, AAA provisions service
traffic volume limits directly to the PPA and receives volume-limit-exceeded
events from the PPA. AAAd will be one of the users of the new AFE layer, and
its IPC communication to the PFEs will go through AFE. For provisioning to the
feature daemons, the existing path is used: to RCM, to the feature manager,
to the feature daemon, and then to the PPA. For example, as today, AAAd sends
a provisioning message to the RCM QoS manager, which passes it down to qosd,
and qosd propagates it to the PPA. IPC messages are used along the way to
pass down the information. To avoid race conditions, AAAd will add the
slot/PFE mask to the provisioning messages for all features, including ACL,
QoS, Forwarding, NAT, and LI. When a circuit migration occurs, AAAd will send
out reprovisioning messages for all the features, as today, with the new
slot/PFE mask provided in the reprovisioning messages.
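
To illustrate the slot/PFE mask handling described above, here is a small Python sketch of building per-feature provisioning messages that carry the mask. The message layout, field names, and helper names are invented for illustration and do not reflect the actual IPC formats used by AAAd.

FEATURES = ("ACL", "QoS", "Forwarding", "NAT", "LI")

def build_provisioning_messages(session_id, slot, pfe_mask, attrs):
    """One message per feature, each tagged with the subscriber's slot/PFE mask."""
    return [
        {
            "feature": feature,
            "session_id": session_id,
            "slot": slot,
            "pfe_mask": pfe_mask,     # lets downstream daemons target the right PFE
            "attributes": attrs.get(feature, {}),
        }
        for feature in FEATURES
    ]

# On circuit migration, the same messages are rebuilt with the new slot/PFE mask.
msgs = build_provisioning_messages("sub-42", slot=3, pfe_mask=0b10,
                                   attrs={"QoS": {"rate-limit-kbps": 10000}})
print(msgs[1])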

Figure 25 illustrates the AAA daemon interactions with other modules:

Figure 25 AAAd Interactions With Other Modules

Logically, AAA can be partitioned into three major functional subsystems:
the configuration and command processing subsystem, the RADIUS service
management subsystem, and the subscriber session management subsystem (which
includes the authentication, authorization, and policy enforcer components).
AAA holds all the subscriber-session-centric attributes and works with other
modules and subsystems to provision, manage, and keep the subscriber session
alive across various module and subsystem failures. To effect session
management, AAA talks to control plane processes and the forwarding plane.
Like other processes in the Ericsson IP Operating System, AAA restores the
session attributes after restarts by maintaining them in its shared memory.


2.2.2.2 ARPd

The Address Resolution Protocol daemon (ARPd) manages IPv4 IP-to-MAC address
resolution for ARP, as described by RFC 826. IP ARP and cross-connect
(XC) ARP are supported. XC ARP is used for interworking cross-connects to
manage MAC information for the Layer 2 portions of a bypass connection.

ARP entries are maintained in a database residing on the control plane. They
are maintained in the form of adjacencies that associate an identifier (the
adjacency ID) with a resolved ARP entry (associating a context-specific IPv4
address with a given MAC address).

ARP misses are triggered by platform-dependent code in the forwarding plane
and sent by inter-process communication (IPC) to ARPd. The ARP packets
sent and received by ARPd go over the PAKIO infrastructure rather than IPC.
ARPd is responsible for managing the lifetime of each ARP entry, refreshing
entries automatically before they expire, and handling throttling to prevent
duplicate requests.
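
The following minimal Python sketch illustrates the kind of lifetime management described above (expiry, refresh before expiry, and request throttling); the timer values and structure are assumptions for illustration, not ARPd's actual implementation.

# Minimal sketch (not ARPd's code) of ARP entry lifetime management:
# refresh entries shortly before expiry and throttle duplicate requests.

import time

REFRESH_MARGIN = 30      # refresh this many seconds before expiry (assumed)
THROTTLE_INTERVAL = 1    # minimum seconds between requests for one address

class ArpCache:
    def __init__(self, lifetime=300):
        self.lifetime = lifetime
        self.entries = {}        # ip -> (mac, expires_at)
        self.last_request = {}   # ip -> time of last ARP request sent

    def learn(self, ip, mac):
        self.entries[ip] = (mac, time.time() + self.lifetime)

    def request(self, ip, send_arp_request):
        """Send an ARP request unless one was sent very recently."""
        now = time.time()
        if now - self.last_request.get(ip, 0) >= THROTTLE_INTERVAL:
            self.last_request[ip] = now
            send_arp_request(ip)

    def refresh_tick(self, send_arp_request):
        """Periodic job: drop expired entries, refresh those about to expire."""
        now = time.time()
        for ip, (mac, expires_at) in list(self.entries.items()):
            if expires_at <= now:
                del self.entries[ip]          # expired; a miss will re-learn it
            elif expires_at - now <= REFRESH_MARGIN:
                self.request(ip, send_arp_request)

cache = ArpCache(lifetime=60)
cache.learn("192.0.2.1", "00:11:22:33:44:55")
cache.request("192.0.2.9", lambda ip: print("ARP who-has", ip))  # unresolved address
cache.refresh_tick(lambda ip: print("refresh ARP for", ip))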

In addition to discovered ARP entries, other components within the system can also add entries. The Virtual Router Redundancy Protocol daemon (VRRPd) adds MAC entries for its virtual interfaces. DHCPd adds the IP address-to-MAC address mapping into the MAC table for successful DHCP negotiations. ARPd can also initiate an ARP request without an ARP miss, on the premise that the resolution will be needed soon. This is the case for neighbors in Interior Gateway Protocols (IGPs) such as Open Shortest Path First (OSPF) and Intermediate System-to-Intermediate System (IS-IS), as well as Border Gateway Protocol (BGP) neighbors.

You can explicitly configure a static ARP entry using the ip host command.
Multiple entries can be defined per port.

ARPd interacts with multiple modules within the system.

When two SSRs are configured in ICR pairs, ARPd on both peers can
communicate and synchronize their ARP caches. To enable this, enter the ip
arp sync icr command on an ICR interface. The feature works with BGP-based
ICR, Multi-Chassis LAG (MC-LAG), and VRRP ICR if ARP synchronization is
enabled. When it is enabled, the ARP daemon becomes a client of ICRlib and
uses it to communicate with ARPd on the ICR peer chassis. ARPd on the
active and standby peers sends application messages over ICRlib with ARP
entries to be added or deleted.

2.2.2.3 BGP

The Ericsson IP Operating System BGP implementation supports BGP-4 as specified in RFC 4271 and supports IP Version 6 (IPv6) as specified in RFC 2545. The BGP Management Information Base (MIB) is not supported, although BGP does send a few of the trap notifications defined in RFC 4273. BGP/MPLS IP VPNs are supported as specified in RFC 4364.


The BGP module is based on a strict user thread (pthread) model, with the
exception that the keepalive thread runs at a higher priority than the rest. This
is in contrast with other daemons in which all threads run at the same priority.

BGP has the following interactions with other modules.

• The BGP daemon requests and releases policies from the Routing Policy
Manager (RPM) daemon using the Routing Policy Library (RPL) API. BGP
uses every policy supported by the RPM, and much of the RPM function
is specific to BGP. For example, community lists, extended community
lists, autonomous system (AS) paths, and much of the route map function
are used only by BGP.

• BGP installs both IPv4 and IPv6 routes in the Routing Information Base
(RIB). For the case of RFC 4364 or 6VPE VPNs, the nonconnected next
hop can be a label-switched path (LSP). BGP monitors next hops for RIB
resolution and supports Bidirectional Forwarding Detection (BFD) for peer
failure detection.

• BGP installs Multicast Distribution Tree (MDT) routes into Protocol Independent Multicast (PIM). PIM sends BGP the MDT address family (AF) routes that it wants the router to advertise. PIM can also request BGP to flush all its routes or to receive all routes from other BGP peers.

• MPLS labels allocated by BGP are downloaded to the Label Manager (LM).
BGP allocates labels for both RFC 4364 VPNs and 6VPE VPNs.

• BGP registers with the Interface and Circuit State Manager (ISM) for all
interfaces in a context in which a BGP instance is configured. It also
registers for port events associated with that interface.

When the SSR node is used as a BGP route reflector, you can conserve memory and CPU usage on the controller card and line cards by filtering the routes that BGP downloads to the RIB and FIB while still reflecting them to its iBGP clients. This is useful if BGP routes do not need to go to the line cards (for example, when the router is not in the forwarding path toward the BGP route destinations). Filtering which routes are downloaded from BGP to the RIB and FIB, before they are advertised to peer routers, reduces the size of the RIB and FIB tables.

Note: To avoid the risk of dropped traffic, design networks so that the routes
that are advertised by the router with this feature enabled do not
attract traffic. This option is not well suited for cases in which the
route-reflector is also used as a PE or ASBR node.

2.2.2.4 CFMd

Ethernet Connectivity Fault Management daemon (CFMd) implements the main functions of the IEEE 802.1ag standard:

• Fault Detection by the use of Continuity Check Messages (CCM)


• Fault verification and isolation by using Loopback Message and Reply (LBM, LBR)

• Path Discovery via Link trace Message and Reply (LTM and LTR)

• Fault Notification via Alarm Indications

• Fault Recovery by sending indications to other protocols

The SSR CFM functionality consists of two components: an RCM manager component and the backend daemon (CFMd). The RCM component manages all CFM-related configuration within the RDB, which is then downloaded to the backend daemon.

2.2.2.5 CLI

2.2.2.5.1 SSR Execution CLI

The SSR Execution CLI (EXEC-CLI) is the primary user interface with the
system. This is a multi-instance process that runs one instance for each CLI
connection to the system.

The CLI is a data-driven state machine that uses a parse tree (or parse chain) to define the various system modes. A parse tree is a collection of parse nodes linked together to form the tree. Each node in the parse tree defines the keyword or token type that can be accepted for a specified command-line input. The CLI parser has several parse trees; each parse tree is defined as a mode. The two main modes are exec mode and config mode. Exec mode is used for examining the state of the system and performing operations on the node. Config mode is used for changing the configuration of the node. Each mode has several nested submodes.

A parser control structure contains the state of the parser, information for the
current mode, user input, and arguments that have been parsed.

The parser starts parsing a command by starting at the first node, which is
the root of the parse tree for the current mode stored in the parser control
structure. The root is pushed onto the parser stack (LIFO), and the parser
loops until the parser stack is empty. The loop pops the node on top of the
stack and calls the token-dependent parsing function. If that token type has
an alternate transition, the token function first pushes the alternate transition
onto the parser stack without consuming any of the input. The parser tries all
the alternates, attempting all tokens at a given level in the tree. The token
function then attempts to parse the token. If the token is parsed successfully,
the token function consumes the input and pushes the accept transition onto
the parser stack.

The leaves in the tree can be one of three types: an EOL token or the special nodes CLI_NONE and CLI_NOALT. The EOL token signifies the acceptance of the command. After reaching EOL and parsing successfully, the parser saves the parser control structure. If parsing has finished successfully, the parser takes the saved parser control structure and calls the function pointed to by the EOL macro to execute the command.

The NONE corresponds to CLI_NONE and marks the end of the alternate
nodes of the current branch, but not the end of the alternate nodes for the
current level. The NOALT corresponds to CLI_NOALT and marks the end of all
alternate nodes, both for the current branch and the current level. The parser
distinguishes between the two to detect ambiguous commands.
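
The following simplified Python sketch models the stack-driven walk described above (the alternate is pushed first without consuming input, the accept transition is pushed on a successful match, and an EOL node accepts the command); the node layout and the tiny example tree are illustrative, not the actual CLI parser code.

# Simplified sketch of the parse-tree walk: nodes are pushed onto a LIFO
# stack, alternates are tried at the same level, and reaching an EOL node
# with all input consumed accepts the command.

class Node:
    def __init__(self, token=None, accept=None, alternate=None, eol=False):
        self.token = token          # keyword this node matches, None for EOL
        self.accept = accept        # next node after a successful match
        self.alternate = alternate  # sibling node tried at the same level
        self.eol = eol

def parse(root, words):
    stack = [(root, 0)]             # (node, index of next input word)
    while stack:
        node, i = stack.pop()
        if node is None:
            continue
        if node.eol:
            if i == len(words):
                return True         # EOL reached with all input consumed
            continue
        # Push the alternate first, without consuming any input.
        stack.append((node.alternate, i))
        if i < len(words) and words[i] == node.token:
            stack.append((node.accept, i + 1))   # consume input, push accept
    return False

# Tiny tree for "show version" and "show ism":
eol = Node(eol=True)
ism = Node("ism", accept=eol)
version = Node("version", accept=eol, alternate=ism)
show = Node("show", accept=version)
print(parse(show, ["show", "ism"]))       # True
print(parse(show, ["show", "bogus"]))     # False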

After parsing a command, the parser calls the action routine, which executes
the command. The CLI control structure contains information about the parsing
of the command. This lets the action routine know if the default or no option
was added at the beginning of the command. Also, certain parser macros store
values in the control structure. For example, numbers, strings, or addresses,
based on the tokens that were matched for this command, are stored in the
arguments inside the control structure. The action routine uses these values to
interact with the system and execute the specified action.

The CLI has a limited set of interactions with the system, because all access is controlled through a data communication layer (DCL). Most DCL calls interact directly with the RCM, except in situations where a direct connection is needed.

2.2.2.5.2 Ericsson CLI

Throughout Ericsson, COM is planned to communicate with various types of middleware (MW) running on various operating systems. For the Ericsson SSR 8000 platform, the MW is the Ericsson IP Operating System running on Linux. COM communicates southbound to the MW through a Service Provider Interface (SPI). The different COM SPIs are used for communication toward different MWs by enabling different support agents (SAs). The COM interface supports human-to-machine and machine-to-machine commands. The human-to-machine commands are available through the Ericsson CLI; the machine-to-machine commands are available through the NETCONF protocol.

The Ericsson CLI helps Ericsson platforms running the Ericsson IP Operating System provide common information models and operation, administration, and maintenance (OAM) components across all network elements (NEs). The Ericsson Common Information Model (ECIM), which is common to all Ericsson NEs, includes logical models for OAM functions such as fault management and equipment management. The operating system uses the ECIM and Common Operations and Management (COM) to supply an OAM solution to all platforms running the OS. For example, the MPLS-TP provisioning on the SSR supported by the OAM solution is the same MPLS-TP provisioning it supports on other NEs.

The interface that supports the Ericsson CLI provides:

• OAM support (NETCONF and CLI) used for platform applications, such as
for Enhanced Packet Gateway (EPG).


• The NETCONF protocol, driven by MOMs, is used to access application-specific MOMs and the ECIM. This infrastructure allows application support without coupling the management interfaces to the platform development. COM provides the OAM interface from the network element (the SSR in this case) to external management systems, such as EPG. The COM Configuration Management (CM) function routes NETCONF operations. The MO configuration data is stored by the underlying system. For more information, see RFC 4741.

• The Ericsson CLI (also driven by MOMs) provides external access to COM and is a northbound interface (NBI) for the operating system. On the SSR 8000 platform, the Ericsson CLI is accessed through the SSR operating system.

2.2.2.6 CLIPS

The Clientless IP Service Selection (CLIPS) module implements circuit management for CLIPS subscribers, with an interface to other system modules similar to the PPP model. However, CLIPS allows a subscriber to be created and bound to a service without requiring a protocol like PPP. Because no protocol is required, there is no client running on the subscriber side; other subscriber information, most commonly the MAC address, is used to identify subscribers. CLIPS can work statically, where subscribers are configured over specific PVCs (802.1Q PVCs), or dynamically, where subscribers are assigned addresses through DHCP. For details about the interaction of the CLIPS module with other modules in the creation, modification, and deletion of subscriber sessions, see Section 3.8.2.1.3 on page 157 (static) or Section 3.8.2.2.2 on page 166 and Section 3.8.2.2.3 on page 169 (dynamic).

2.2.2.7 CLS

Classifier (CLS) is the module that handles ACL and access group processing.
Initially, CLS receives access groups from the RCM. The access groups
determine which ACLs are retrieved from the RPM. The RPM then pushes this
information to CLS, where it is processed into CLS data structures. Although
the ACLs are not in their original format, the ACLs are considered to be
processed into CLS raw format. This format is not suitable for download to the
line card, but is used within CLS to enable easier analysis of the ACLs for
building. Another level of processing is required to transform the ACLs into a
format for transfer to the line card. Platform-dependent libraries and capabilities
process the raw format ACLs into a platform-specific format that is easily
transferred to the line card. If no processing libraries exist, a default processor
creates ACL rule sets as a basic array of a rules data structure. The rules data
structure is a globally defined structure that is understood by all platforms.

CLS interacts with QoSMgr and forwarding as follows:

• QoSMgr informs CLS whenever a policy ACL is referenced or dereferenced in a QoS metering, policing, or forward policy.


• QoSMgr informs CLS when a circuit binds to a QoS policy that uses an ACL.

• CLS downloads associated ACLs to the appropriate forwarding modules based on the circuit.

• In the case of forward policies that are configured on a link group, QoSMgr
informs CLS of grouping and ungrouping. It also provides a slot mask
for link aggregation group (LAG) pseudocircuits so that appropriate ACL
provisioning can occur.

2.2.2.8 Configuration Database

The configuration database, also known as the redundant database (RDB), is a transactional database that maintains multiple transactions and avoids conflicts by using a strict, two-phase locking protocol. By combining transactions and locks, the RDB maintains the ACID transactional properties: atomicity, consistency, isolation, and durability. Each property must be maintained by a database to ensure that data does not get corrupted. Ensuring that every operation within the database occurs as one atomic operation (atomicity) allows multiple users to interact with the system (isolation) and makes the database recoverable (durability). The database must also provide facilities that allow a user to easily ensure the accuracy of data within the database (consistency).

Every operation that occurs in the database must either completely finish or
appear as if it was not started at all. An operation cannot be left half done;
otherwise, it corrupts the database. This means that every operation in RDB
must be atomic.

As changes occur to data in the database, all the user's operations are saved
into a transaction log instead of being performed directly to the database.
When users have completed modifications, they can issue either an abort or
a commit of the transaction. The abort operation removes the transaction log
and all related locks, leaving the database in its prior consistent state. When a
commit is issued, it must be performed to the database in one atomic operation.
Because a transaction log can contain numerous different modifications, the
transaction log must remain persistent to ensure that it is always completely
performed.

When the database is first initialized, it requests two shared memory regions from the operating system. One shared memory segment contains all transactional (short-lived) memory; the other region contains all permanent (long-lived) memory. When the database process restarts, it reattaches to the same shared memory segments that it had allocated previously and returns to the previous state of the database.

The database has no knowledge of the type of data that it contains within its
memory. All records are represented as a combination of a key and a data
buffer. When a user modifies a record within a transaction, a lock is created for
that record to prevent other users from modifying it at the same time. If another
user accesses a locked record, the transaction is blocked until that record is


released or until the initial user rolls back access to the record. This locking
guarantees a one-to-many relationship between transactions and records. A
record can belong to at most one transaction, and no two transactions can
be modifying the same record at one time. This preserves atomicity during a
commit operation, because each committed transaction is guaranteed to modify
only the records that are in its control.

When a transaction is to be committed, the transaction log is replayed, this time actually performing the operations on the database. Each operation is guaranteed to succeed because adequate checks ensure that an operation is only added to the transaction log if it is valid. No other transaction can access the modified records, because locks are in place to prevent this from occurring. If a transaction is only half-committed before the database process exits, the shared memory is reattached the next time the process is started. The first operation of the database is to recover old transactions from the transactional memory and finish the commit on those transactions. RDB implicitly aborts all transactions that are not committing.

When a transaction completes, all locks that were created for it are removed,
and the transactional memory is reclaimed. From the point of view of any other
user, the transaction was committed in one operation, because locks prevented
access to every record in the log until the transaction completed.
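
The following conceptual Python sketch models the behavior described above: operations are appended to a transaction log under per-record locks, commit replays the log, and abort discards it. It is an illustration of the idea only, not the RDB implementation (no shared memory, replication, or persistence is shown).

# Conceptual sketch of a transaction log with per-record locks.

class Transaction:
    def __init__(self, db):
        self.db, self.log, self.locks = db, [], set()

    def put(self, key, value):
        self.db.lock(key, self)           # raises if another txn holds the lock
        self.locks.add(key)
        self.log.append(("put", key, value))

    def delete(self, key):
        self.db.lock(key, self)
        self.locks.add(key)
        self.log.append(("delete", key))

    def commit(self):
        for op in self.log:               # replay the log against the database
            if op[0] == "put":
                self.db.records[op[1]] = op[2]
            else:
                self.db.records.pop(op[1], None)
        self._release()

    def abort(self):                      # drop the log, leave the db untouched
        self._release()

    def _release(self):
        for key in self.locks:
            self.db.locks.pop(key, None)
        self.log, self.locks = [], set()

class Database:
    def __init__(self):
        self.records, self.locks = {}, {}

    def begin(self):
        return Transaction(self)

    def lock(self, key, txn):
        owner = self.locks.setdefault(key, txn)
        if owner is not txn:
            raise RuntimeError("record %r locked by another transaction" % key)

db = Database()
t = db.begin()
t.put("context/local/interface/lo0", "10.0.0.1/32")
t.commit()
print(db.records)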

The committed database is organized as a self-balancing binary search tree (AVL tree). The number of trees is configured at database initialization and maps to the number of RCM component managers in the system. The first bytes of the key specify which tree the operation is performed on. Tree operations cannot be performed atomically, so a linked list is used to atomically insert or remove nodes from the tree. If the database does not complete the entire insert or remove operation successfully, the tree can be regenerated from the linked list.

The database is kept redundant across controllers using the same two-phase
commit procedure for applying the transaction logs to the persistent storage. If
a redundant controller is present, the transaction log is first replicated to the
standby and committed, before it is committed on the active. This ensures that
information is not distributed until it is guaranteed to be redundant.

The database is a library, but requires many threads to perform the tasks
needed.

2.2.2.9 CSM

The Card State Module (CSM) is a back-end process responsible for card and port management. It relays card and port events to other back-end processes, such as ISM. CSM consists of both platform-dependent and platform-independent code. The platform-independent code interacts with RCM, ISM, PM, and the kernel. The SSR platform-dependent code interacts with line cards and the Linux kernel, as well as with the Platform Admin daemon (PAd). The platform-independent and platform-dependent code are separated by the Chassis Management Abstraction (CMA) API. New targets (such as PCREF) do not need a platform-dependent portion; instead, they use the hooks in the CMA API to interface directly with the target-specific processes (such as PAd).

CMA abstracts the details of the chassis so that the other software that is
involved in chassis management (mostly CSM) can be generic and portable to
other chassis architectures and types. This is not a separate process but rather
a library that is linked with the process that needs the abstraction.

2.2.2.10 DHCP and DHCPv6

Dynamic Host Configuration Protocol (DHCP) and DHCPv6 provide a framework for passing IP configuration information to hosts on a TCP/IP network. DHCP is based on the Bootstrap Protocol (BOOTP), adding the capability of automatic allocation of reusable network addresses and additional configuration options. The SSR DHCP implementation supports the following modes of operation:

• Relay: DHCP Relay

• Server: DHCP Server

• Proxy: DHCP Proxy Server

It is also used for Clientless IP Service Selection (CLIPS), interacting with the
CLIPS daemon to appropriately assign IP addresses.

DHCP consists of two components: an RCM manager component and the backend daemon (DHCPd). The DHCP RCM component manages all DHCP-related configuration within the RDB, which is then downloaded to the backend daemon.

DHCPv6 enables DHCP servers to pass configuration parameters, such as IPv6 network addresses, to IPv6 nodes. It offers the capability of automatic allocation of reusable network addresses and additional configuration flexibility. Like DHCPv4, DHCPv6 consists of two components: an RCM manager component and the backend daemon (DHCPv6d). The DHCPv6 RCM component manages all DHCPv6-related configuration within the RDB, which is then downloaded to the backend daemon.

For details about the session connection/termination processes using DHCP, see Section 3.8.2.1.2 on page 155.

2.2.2.11 DOT1Qd

The DOT1Q daemon (DOT1Qd) manages 802.1Q permanent virtual circuits (PVCs), including single-tag and dual-tag circuits. These include explicitly configured circuits and circuit ranges, as well as circuits created on demand.

The following figure illustrates the DOT1Qd interactions with other modules:


Figure 26 DOT1Qd Interaction With Other Modules

For details about the session connection/termination processes for subscribers on 802.1Q circuits, see Section 3.8.2.1.1 on page 152 and Section 3.8.2.1.2 on page 155.

2.2.2.12 ESMC

The Ethernet synchronization messaging channel (ESMC) module implements the ESMC protocol in the SSR line card NP4 NPUs. SyncE uses ESMC to enable the best clock source traceability, to correctly define the timing source, and to help prevent timing loops. ESMC interacts with the following other modules:

TCMd controls the sending and receiving of ESMC PDUs and configures the following ESMC attributes: equipment clock QL, reference candidate nominations (slot/port), active reference candidate nominations (slot/port), Rx SSM monitoring, and port indexes. ESMC reports back to TCMd the Rx SSM changes and port faults of the reference candidate.

The ESMC PDU is composed of the standard Ethernet header for a slow
protocol, an ITU-T G.8264 specific header, a flag field, and a type length value
(TLV) structure.

The NPU driver provides the data path for ESMC PDU punting and insertion.

2.2.2.13 Fabric Manager

Fabric Manager (running on the RPSW card) configures the switch fabric and monitors its performance. Fabric Manager performs the initial configuration of the fabric when the system starts, and it changes the fabric configuration when a fabric card has a fault or an RPSW switchover occurs.

2.2.2.14 FLOWd

The FLOW daemon (FLOWd) provides an infrastructure for enabling and controlling the classification of packets into flows. FLOWd manages network flows (circuit-specific unidirectional sequences of IP packets passing a node during a certain time interval).

FLOWd implements the RPSW configuration and control portion of packet classification into flows, where packet classification is performed on individual circuits. FLOWd interacts with the line cards to classify packets into particular flows. To ensure that flow classification does not degrade forwarding performance, it is disabled by default, but you can enable flows on any line card.

An external IPFIX collector is required to view network-wide IP flows. However, you do not need to configure an external collector if you only want to see system-specific flow information. If you do not configure an external collector, flow data is stored in the RFlow caches for the lifetime of the flow, but that data is never exported from the router; it is available only until the flow expires. You can use this flow data if you want to monitor only local flows.

FLOWd controls which circuits have packet flow classification enabled and what their classification attributes are. The IPFIX daemon (IPFIXd) uses profiles to control its operations, and some of these profiles need to be delivered to the appropriate PFEs. FLOWd provides the infrastructure for delivering these profiles and related messages to the PFEs, as well as for automatically applying default attributes to circuits that are configured for IPFIX but have not been explicitly configured for flow classification.

For more information, see RFlow.

2.2.2.15 Forwarding, FABL, and FIB

The Ericsson IP operating system control plane modules see the forwarding
plane as a set of line cards with a unique slot number per card. Each card
is split into ingress and egress functionality. The service layer (basic and
advanced) configures the forwarding plane in terms of logical functional blocks
(LFBs), with an API set for each forwarding block or LFB. An LFB is a logical
representation of a forwarding function, such as:

• Longest prefix match (or FIB) lookup on a packet destination IP address

• Classification stage in the packet path

• Packet encapsulation function before a packet is transmitted on the wire

The forwarding abstraction layer (FABL), which is platform independent, and the adaptation layer daemon (ALd), which is platform dependent, typically run on the line card. They are designed to enable fast adaptation of the control plane LFB APIs to the forwarding engine and the various processing units available on a given card. For example, a line card might contain a line card processor, one or more NPUs, CAMs, and encryption ASICs.


See the line card architecture diagrams in Section 2.1.9 on page 27. The FABL
and ALd configure these card resources for fast path forwarding (data packets
processing through the forwarding engine) and might also help in slow path
functionality, that is, functionality that requires special handling of packets,
such as ICMP, VRRP, or BFD.

The following is a brief description of each LFB in FABL.

Ingress

1 Port—Packets are received on physical ports. This module represents ports and configurations, such as port encapsulations, port administrative status, and so on.

2 Circuit—After a packet is received, attributes from the packet header are used to determine the circuit that the packet belongs to. For example, the VLAN tag can be used to determine which VLAN circuit it corresponds to. Statistics, such as received bytes and received packets, are maintained on a per-circuit basis.

3 Validation—Packets are submitted to header validations and further checking.

4 Services such as ACL, QoS policing, and policy-based routing (PBR) are
applied on circuits. If any of these services are applied, the packet is
submitted for special processing. Otherwise, the packet is forwarded to
the next stage.

5 MFIB—For multicast packets, the Multicast Forwarding Information Base is used to look up the multicast destinations.

6 FIB—Contains the best routes that RIB downloads to the forwarding plane. Each line card maintains its own FIB to make routing decisions. For IP packets, the FIB is used for longest prefix match routing (a small lookup sketch follows the Egress list below). The circuit determines in which context (or which FIB instance) the packet lookup is done.

7 XC—When cross-connects are configured on the circuit, the packet is stitched to an egress circuit.

8 LFIB—The Label FIB is used to look up MPLS-labeled packets for further MPLS actions (SWAP, POP, PUSH, PHP).

9 Neighbor—Next hops determine the peer next hop or the immediate next-hop router that the packet should be forwarded to before it reaches its final destination. Next hops can be connected, recursive (such as nonconnected BGP next hops), or equal-cost multipath (ECMP).

SSC

1 Session—Traffic comes into the SSC and is de-multiplexed into a session via a service endpoint.


2 Service—A single service endpoint identifies an application service instance and handles multiple sessions. The packet is processed by the application service instance before being sent out according to the FIB to an egress location.

Note: The SSC supports the following next-hop types:

• Connected next hops, with three subtypes: adjacency; LAG hash for packet hashing; LAG SPG for circuit hashing

• ECMP next hops

• IPv4 tunnel next hops, including IPv4 GRE and IPv4 in IPv4 next
hops

• Traffic steering (TS) next hops (for information about traffic steering, see Section 3.7.3 on page 140)

• IPsec tunnel next hops, a variation of TS next hops

• Hosted next hops

• Indirect next hops

• Drop next hops

Egress

1 Adjacency—A next hop (connected, recursive, or multipath) eventually resolves to a connected adjacency. This tells the ingress side of processing which egress card the packet is forwarded to (crossing the fabric). After the packet is received on egress, the corresponding adjacency entry is located based on some indication from the ingress (for example, the metadata adjacency ID). The adjacency determines the egress-side functionality that needs to be applied to this packet.

2 Services—Egress services are applied, such as ACL or QoS.

3 Circuit encapsulation—After the adjacency is determined, the packet egress encapsulation is determined. The packet is then formatted and prepared to be transmitted on the wire.

4 QoS—Any QoS queuing, shaping, and scheduling is applied.

5 Port—On the egress side, the packet is transmitted on the wire.
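
As referenced in the FIB item above, the following Python sketch shows a longest-prefix-match lookup in its simplest form. A real FIB uses trie or hardware lookup structures, so this linear scan is only an illustration of the semantics; the routes and next hops are example values.

# Minimal longest-prefix-match lookup of the kind the FIB stage performs.

import ipaddress

fib = {
    "0.0.0.0/0":   "next-hop 192.0.2.1",   # default route
    "10.0.0.0/8":  "next-hop 192.0.2.2",
    "10.1.0.0/16": "next-hop 192.0.2.3",
}

def lookup(dst):
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, nh in fib.items():
        net = ipaddress.ip_network(prefix)
        # Keep the matching prefix with the longest prefix length.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, nh)
    return best[1] if best else "drop"

print(lookup("10.1.2.3"))    # matches 10.1.0.0/16
print(lookup("172.16.0.1"))  # falls back to the default route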

2.2.2.16 Healthd

The Health Monitoring daemon (Healthd) monitors the health of the system and is based on the Unit Test Framework (UTF). It inherits several benefits from the UTF, including a fully scriptable Python interface with a potential SWIG C/C++ interface. Healthd is composed of four major functional components:


• The framework/core, which is UTF based

• An event scheduler, which is driven by a delta timer list. The event scheduler is a simple event queue in which each event is tagged with a desired elapsed time and an associated action to execute. When the event time elapses, the action is executed. If the event is tagged as periodic, the event is reinserted into the event queue. An object can be periodic for n times (that is, it expires after n occurrences). A minimal sketch of such a scheduler follows this list.

• An object tracker component, in which each object is a logical gate (AND, OR, NAND, and so on). An object has multiple inputs and a single output that can trigger multiple actions. The logical gates can be cascaded and combined hierarchically to achieve complex logical operations. The inputs to the gates are called triggers, which act as permanent switches that can be either ON or OFF.

• A set of troubleshooting scripts that can be executed on demand or programmed as actions in the timer event scheduler.
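
The following minimal Python sketch models the delta-timer event scheduler described in the list above, including reinsertion of periodic events; the data layout and values are illustrative assumptions, not the Healthd code.

# Minimal sketch of a delta-timer list: each queued event stores only the
# delay relative to the previous event, and periodic events are reinserted
# (optionally only n times).

class DeltaTimerList:
    def __init__(self):
        self.events = []   # list of [delta, action, period, remaining]

    def schedule(self, delay, action, period=None, times=None):
        i = 0
        while i < len(self.events) and delay >= self.events[i][0]:
            delay -= self.events[i][0]
            i += 1
        if i < len(self.events):
            self.events[i][0] -= delay   # following event becomes relative to this one
        self.events.insert(i, [delay, action, period, times])

    def tick(self, elapsed):
        """Advance time by 'elapsed' and run every event whose delta expires."""
        while self.events and self.events[0][0] <= elapsed:
            delta, action, period, times = self.events.pop(0)
            elapsed -= delta
            action()
            if period is not None and (times is None or times > 1):
                self.schedule(period, action, period,
                              None if times is None else times - 1)
        if self.events:
            self.events[0][0] -= elapsed

sched = DeltaTimerList()
sched.schedule(5, lambda: print("check disk"), period=5, times=3)
sched.schedule(2, lambda: print("check memory"))
sched.tick(10)   # runs "check memory" once and "check disk" twice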

For more information about the Healthd feature, see Section 5.1 on page 228.

2.2.2.17 HR

The HTTP Redirect (HR) process manages configuration, enforcement, and tracking of redirected subscriber session details.

HR interacts with the following modules:

• HR receives subscriber attributes from the default subscriber configuration or RADIUS.

• AAA receives the subscriber provisioning details and sends them to RCM.

• RCM sends the details to HR, and a configured forwarding policy is implemented in the affected session circuit, which provides the egress web port.

HR listens to the socket bound to the local port 80, waiting for packets.

• When a subscriber attempts to send HTTP traffic, the ingress PFE forwards
the traffic to the local port 80.

• HR uses the packet's circuit information (URL, message, and timeout value)
to construct an HTTP REDIRECT message to return to the subscriber.

When the redirect is successful, HR informs AAA to remove the policy from
the circuit, or add an ACL that allows access.
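
As an illustration of the response HR returns, the following Python sketch builds a plain HTTP 302 redirect whose Location header carries the portal URL. The URL is a placeholder and the socket handling and circuit lookup are omitted; this is a sketch of the HTTP mechanics, not the actual HR code.

# Build a minimal HTTP redirect response pointing the subscriber at a portal URL.

def build_redirect(url="http://portal.example.com/login"):
    body = '<html><body>Redirecting to <a href="%s">%s</a></body></html>' % (url, url)
    return (
        "HTTP/1.1 302 Found\r\n"
        "Location: %s\r\n"
        "Content-Type: text/html\r\n"
        "Content-Length: %d\r\n"
        "Connection: close\r\n"
        "\r\n"
        "%s" % (url, len(body), body)
    )

print(build_redirect())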

2.2.2.18 IGMP Daemon

The Internet Group Management Protocol daemon (IGMPd) implements the IGMPv3 protocol as described in RFC 3376 and IGMPv2 as described in RFC 2236. IGMPv1 (RFC 1112) is supported with IGMPv2 in compatibility mode.

IGMP runs on Ericsson IP Operating System interfaces and determines which IP multicast groups and, for IGMPv3, which sources have listeners on the network attached to the interface. A cache is maintained, and the multicast routers determine which multicast router actively queries the group. The cache entries are provided to Protocol Independent Multicast (PIM) so that they can be advertised to other multicast routers.

2.2.2.19 ISM

Interface State Manager (ISM) monitors and disseminates the state of all interfaces, ports, and circuits in the system. ISM is the common hub for Ericsson IP Operating System event messages. ISM records contain valuable troubleshooting information that you can display using the show ism command. For information about interpreting the various forms of the command, see Section 5.10 on page 264.

ISM receives events from the Card State Manager (CSM), the Interface
Manager (IFmgr) in RCM, or from media back ends (MBEs). Each component
creates events and sends them to ISM. For a component to listen to the events
that ISM receives, it must register as an ISM client.

The CSM and IFmgr components talk to a special ISM endpoint that takes a configuration-type message and converts it to an ISM event for processing. CSM announces all port events, while IFmgr announces all interface events and static circuit creation and initial configuration.

MBEs in the system talk to ISM through the MBE endpoint. Before an event
can be sent to ISM, an MBE must register using a unique and static MBE
identifier. After an MBE has registered with ISM, it can send any type of event
to announce changes to the system. ISM takes all events received from all
MBEs and propagates these events to interested clients. Clients must register
with ISM using the client endpoint. This registration also includes the scope of
which circuit and interface events a client is interested in. Registration reduces
the overhead that ISM has of sending every event to every registered client. A
client can be registered with as many different scopes as needed.

Events are placed in an event-in queue and are processed in order. If an event cannot be processed when it is received, it is requeued. Requeuing allows ISM to handle out-of-order events. For example, if ISM receives a circuit configuration for a circuit that does not exist, it requeues the circuit configuration until it receives the circuit create event for that circuit. After a certain number of requeues, an event is dropped to prevent it from staying in the event-in queue forever.
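
The following simplified Python sketch models this requeue-and-drop behavior; the requeue limit and event format are assumptions for illustration, not ISM internals.

# Events that cannot be processed yet (for example, configuration for a
# circuit that has not been created) are put back on the queue, and dropped
# after too many attempts.

from collections import deque

MAX_REQUEUES = 5   # assumed limit; the real value is internal to ISM

def process_events(queue, can_process, handle):
    while queue:
        event, attempts = queue.popleft()
        if can_process(event):
            handle(event)
        elif attempts < MAX_REQUEUES:
            queue.append((event, attempts + 1))   # try again later
        else:
            print("dropping event after %d requeues: %r" % (attempts, event))

known_circuits = set()

def can_process(event):
    kind, ckt = event
    return kind == "circuit-create" or ckt in known_circuits

def handle(event):
    kind, ckt = event
    if kind == "circuit-create":
        known_circuits.add(ckt)
    print("processed", event)

q = deque([(("circuit-config", "ckt-2"), 0), (("circuit-create", "ckt-2"), 0)])
process_events(q, can_process, handle)   # config is requeued until create arrives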

When ISM receives an event, it marks the event as received and passes the event to interested clients. ISM tries not to send duplicate events to a client, but if it does, the client must handle the duplication. ISM sends events in a specific order, starting with circuit events and followed by interface events in circuit/interface order. All circuit delete events are sent before any other circuit events, and all interface delete events are sent before any other interface events. This order ensures that deleted nodes are removed from the system as quickly as possible, because they might interfere with other nodes trying to take their place.

For examples of the role of ISM in system processes, see the BNG session
diagrams in Section 3.8.2 on page 151.

2.2.2.20 IS-IS

The Ericsson IP Operating System IS-IS implementation supports IPv4 routing, as described in RFC 1195 and RFC 5120, and IPv6 routing, as described in RFC 5308. Additionally, most of the IS-IS MIB is supported, as described in RFC 4444.

Like the OSPF modules, the IS-IS module does most of its work in a global
worker thread running the dispatcher. Additionally, the dispatcher thread
receives IPC messages from other daemons and processes in the dispatcher
thread using the IPC task dispatcher capability.

The IS-IS MO thread handles configuration, clear, and show messages and the associated runtime data structures. A mutex, tied into the dispatcher library, is used to avoid data structure contention problems.

IS-IS non-stop routing (NSR) is not enabled by default; you can enable it with the nonstop-routing command in IS-IS configuration mode. To verify that IS-IS information is being synchronized to the standby RPSW card, you can use the show isis database, show isis adjacency, and show isis interface commands on the active and standby RPSW cards. To support NSR, pre-switchover/restart adjacencies need to be maintained; the data necessary to maintain an adjacency is synchronized from the active to the standby RPSW card. When IS-IS NSR is enabled, each neighbor's MAC address is synchronized so that IIH packets containing the neighbor's MAC address can be sent out. To support this:

• IPC communicates between the active IS-IS process and the standby IS-IS
process and between the standby ISM and the standby IS-IS process.

• The IS-IS process on the standby RPSW controller card is always started
when there is IS-IS configuration, and the IS-IS endpoints are open. As a
result, the standby IS-IS process registers with all open endpoints on the
standby RPSW card.

• The standby IS-IS process also receives and processes information from
the standby RCM process.

2.2.2.21 L2TP

The Layer 2 Tunneling Protocol (L2TP) module implements L2TP, interacting with other modules to support the following modes of operation: L2TP Access Concentrator (LAC), L2TP Tunnel Switch (LTS), and L2TP Network Server (LNS). The L2TP module consists of two components: an RCM manager component and the backend daemon (L2TPd). The L2TP RCM component manages all L2TP-related configuration within the configuration database and sends the details to L2TPd, which communicates them to the other modules.

For details about LAC and LNS session connection/termination processes, see Section 3.8.2.2.4 on page 171.

The following figure illustrates L2TP interactions with other modules:

Figure 27 L2TP Daemon Interaction With Other Modules

2.2.2.22 LDP

Label Distribution Protocol (LDP) enables dynamic label allocation and distribution of MPLS labels using downstream unsolicited mode (with ordered control and liberal label retention). A label-switched router (LSR) with LDP enabled can establish label-switched paths (LSPs) to other LSRs in the network.

LDP creates label bindings by assigning labels to connected routers and by advertising the bindings to neighbors. LDP also assigns labels to label bindings learned from neighbors and readvertises the bindings to other neighbors. When an LSR advertises a label binding for a route, the LSR is advertising the availability of an LSP to the destination of that route. LDP can learn several LSPs from different neighbors for the same route. In this case, LDP activates only the paths selected by the underlying Interior Gateway Protocol (IGP), either IS-IS or OSPF.

To discover LDP peers, an LSR periodically transmits LDP Hello messages using UDP. After two LDP peers discover each other, LDP establishes a TCP connection between them. When the TCP connection is complete, an LDP session is established. In the Ericsson IP Operating System, the LDP router ID is used as the transport address.

During the LDP session, LSRs send LDP label mapping and withdrawal messages. LSRs allocate labels to directly connected interfaces and learn about labels from neighbors. If a directly connected interface is shut down, an LSR withdraws the label and stops advertising it to its neighbors. If a neighbor stops advertising a label to an LSR, the label is withdrawn from the LSR's Label Forwarding Information Base (LFIB). Teardown of LDP adjacencies or sessions results if Hello or keepalive messages are not received within the timeout interval.

The Ericsson IP Operating System supports a maximum of 1,200 targeted LDP sessions, or a combination of up to 50 non-targeted sessions and 1,150 targeted sessions.

The Ericsson IP Operating System implementation of LDP supports RFC 5036 (the base LDP specification), RFC 4447 (FEC 128 only), the Muley draft for pseudowire status signaling, and RFC 4762 (VPLS using LDP).

The LDP daemon interacts with the following modules:

• OSPF or IS-IS—LDP and the IGPs implement a synchronization mechanism. LDP sends synchronization messages to OSPF and IS-IS.

• ISM—LDP gets the interface events information from ISM.

• LM—LDP installs LDP LSPs in LM, and LM communicates with LDP for L2VPN, VPLS, and port-PW pseudowire bring-up.

• RIB—LDP registers with RIB for route redistribution and prefix registrations.

• RPM—Routing policies are communicated from and to RPM.

• RCM—LDP configuration is received through RCM.

2.2.2.23 LGd

Link Group Daemon (LGd) is responsible for link aggregation and running the
Link Aggregation Control Protocol (LACP).

The following components interact to manage LAG constituent-PFE updates (see the flow diagrams for details):

• FABL and ALd—The control plane sends and receives control packets to or from line cards via the kernel. At the line card, FABL receives outgoing packets from the kernel and sends incoming packets to the kernel. FABL APIs are defined to pass packets between FABL and the ALd. NP4 driver APIs are defined to pass packets between the ALd and the NPU.

• LG CLI—Provides the commands for configuring link groups and applying the complete set of features on them. It also sends the CLI configurations to the LG Mgr in RCM.

• LGMgr—The RCM counterpart of LG. It accepts the configuration events coming from the CLI and checks their validity. When the check passes, LGMgr adds entries to its database and notifies the other managers of changes in link-group configuration using the link_group_change_callback mechanism. This callback informs the other modules of the following types of link-group changes: LAG aggregate circuit creation and deletion, addition and removal of ports to or from a LAG, and any 802.1Q PVCs under them. LGMgr assigns a valid LGID and circuit handle with a pseudo slot and pseudo port for each newly created circuit, and then sends the information to LGd.

• LGd—The LG daemon (LGd) listens for messages from LGMgr, receives configuration change details, and queues the configuration details to be sent to ISM.

• ISM—The message center of the SSR; it receives messages from various modules, including LGd, performs the desired operations, and forwards the information to clients. Each client can select and register to receive specific sets of events. Clients learn circuit attributes, such as the slot mask, SPG ID, and packet-based or circuit-based hashing, from ISM circuit configuration messages.

• Managers—In the flow diagrams in this section, "Managers" refers to other modules that participate in LG functions:

• DOT1Q CLI

• DOT1QMgr

• DOT1Qd

• Clients—Components that are registered with ISM for link group information, including the line card PFEs. For example, the label manager (LM) registers with ISM to receive LAG group messages, level-1 pseudocircuits, and level-2 pseudocircuits (802.1Q PVCs).

For details about the LG information flow, see Section 3.3.6 on page 99.

2.2.2.24 LM and LFIB

The Label Manager (LM) is the Ericsson IP Operating System daemon that manages label requests and reservations from the various MPLS protocols, such as LDP and RSVP, and configures LSPs and PWs in the system. It installs LSPs and Layer 2 routes in the RIB next-hop label forwarding entry (NHLFE). It provisions the LFIB entries on the ingress side and the MPLS adjacencies on the egress side in the forwarding plane. It also handles MPLS-related operator configurations and MPLS functionality, such as MPLS ping and MPLS traceroute. L2VPN functionality is handled in LM, including configuration and setup of PWs. Virtual private wire services (VPWSs), or virtual leased lines (VLLs), use a common framework for PW establishment.

The SSR supports a single, platform-wide label space that is partitioned per
application (for example, LDP and RSVP). An LM library that facilitates label
allocation is linked per application. Applications install the allocated labels in
LM using the LM API.
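
The following Python sketch illustrates a single label space partitioned per application, as described above; the label ranges and application names are assumed values for illustration, not the actual SSR partitioning.

# Sketch of a platform-wide label space partitioned per application.

RANGES = {           # assumed example partitioning, not the real values
    "ldp":  (100000, 199999),
    "rsvp": (200000, 299999),
    "bgp":  (300000, 399999),
}

class LabelManager:
    def __init__(self):
        self.next_free = {app: lo for app, (lo, _hi) in RANGES.items()}
        self.allocated = {}   # label -> application

    def allocate(self, app):
        lo, hi = RANGES[app]
        label = self.next_free[app]
        if label > hi:
            raise RuntimeError("label range exhausted for " + app)
        self.next_free[app] = label + 1
        self.allocated[label] = app
        return label

lm = LabelManager()
print(lm.allocate("ldp"), lm.allocate("rsvp"), lm.allocate("ldp"))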


The LM also handles PWs. The SSR supports PWs in L2VPNs (also known
as VPWS and VLLs) that provide Layer 2–emulated services such as Ethernet
VLANs over IP/MPLS packet-switched networks.

The LM interacts with the following modules:

• RCM—Mainly for MPLS-related configurations.

• LM clients—MPLS protocols, such as LDP, RSVP, MPLS static, and BGP, install LSPs through the LM.

• ISM—The LM is a client of ISM. It learns about MPLS-enabled interfaces and L2VPN ACs from ISM. LM is also an MBE of ISM because it configures MPLS-specific circuit attributes, such as the MPLS slot mask, and creates VPLS and AAL2 PW pseudocircuits.

• RIB—The LM installs LSP routes and L2 routes through RIB. It also queries
RIB for next hops.

• Kernel—Mainly for receiving MPLS ping requests and replying to them (packet I/O).

• LM stores the adjacency IDs for label next hops (ingress label map (ILM)
entries) and LSP next hops (FTN entries) in the shared memory so that
LM can retrieve them after an LM process is restarted or when an RPSW
switchover occurs.

2.2.2.25 MCastMgr and MFIB

Multicast Manager (McastMgr) manages all communications between multicast routing protocols in the control plane and the forwarding plane. It is the central component for multicast routing protocols (like RIB for unicast routing protocols) and is responsible for downloading the multicast routes to the MFIB. Multicast Manager improves multicast scalability in terms of the number of supported multicast route entries and outgoing interfaces (OIFs), zap time, latency, and throughput.

Multicast Manager interacts with the following modules:

• RCM—Mainly for handling the Multicast Manager configuration and show commands.

• RIB—Primarily for receiving next-hop information for port-PW circuits. The port-PW next hops are needed to program multicast route OIFs in the forwarding plane.

• PIM—Downloads multicast routes, MDT configuration information, and subscriber circuit attributes to Multicast Manager. Multicast Manager is responsible for programming the multicast routes received from PIM in the forwarding plane, enabling MDT encapsulation/decapsulation on MDT circuits, and applying subscriber circuit attributes on subscriber circuits.

• ISM—All circuit and interface events are learned from ISM.


• FMM—Multicast Manager interacts with Fabric Multicast Manager (FMM) to learn the fmg_id for each multicast route in the system. The fmg_id is allocated by FMM for a given set of OIFs; multicast routes having the same set of OIFs have the same fmg_id (see the sketch after this list). On the SSR, the fmg_id is used to efficiently replicate traffic from ingress to egress slots over the fabric.

• Forwarding—Multicast Manager acts as the server for the forwarding plane. Multicast Manager is responsible for programming multicast routes in their respective slots and enabling multicast features on a given circuit in a given slot within the forwarding plane.
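
As mentioned in the FMM item above, the following Python sketch illustrates allocating one fmg_id per distinct OIF set, so that routes sharing the same OIF set share an fmg_id; the interface names and ID scheme are illustrative only, not the FMM implementation.

# Allocate one fabric multicast group ID per distinct set of outgoing interfaces.

class FmgAllocator:
    def __init__(self):
        self.by_oif_set = {}
        self.next_id = 1

    def fmg_id_for(self, oifs):
        key = frozenset(oifs)                 # order of OIFs does not matter
        if key not in self.by_oif_set:
            self.by_oif_set[key] = self.next_id
            self.next_id += 1
        return self.by_oif_set[key]

fmm = FmgAllocator()
print(fmm.fmg_id_for(["ge-1/1", "ge-2/3"]))   # 1
print(fmm.fmg_id_for(["ge-2/3", "ge-1/1"]))   # same OIF set, same fmg_id -> 1
print(fmm.fmg_id_for(["ge-4/1"]))             # new set -> 2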

2.2.2.26 MPLS and MPLS Static

The MPLS module, most of which is managed by the label manager (LM)
module, is responsible for programming the forwarding plane with label
information as well as managing the label allocation. It accepts requests for
labels from various protocols (LDP, RSVP, BGP, and MPLS static), allocates
the labels, synchronizes the allocated labels with the standby RPSW controller
card so that they can be recovered in the event of a switchover, and then
programs the labels to the forwarding plane or returns the allocated labels
to the protocols that requested them so that they can use them. The MPLS
module also accepts configurations from the CLI, mostly for enabling MPLS
functionality on interfaces and protocols.

MPLS static is the daemon that is responsible for configuring static LSPs. To
configure a static LSP on the ingress label edge router (iLER), an operator
specifies the LSP's next hop, the egress label, and the egress peer of the LSP.
The configuration commands are sent to the MPLS static daemon, which uses
the LM API to configure the LSP in LM. Static label swap entries and static label
pop entries (also called ILM entries) can also be configured through the MPLS
static daemon on the LSR and egress LER, respectively.

The MPLS static daemon interacts with the following modules:

• RCM—MPLS static receives static LSP configurations from RCM.

• LM—MPLS static configures static LSPs on ingress LERs, LSRs, and egress LERs through the label manager.

• RIB—MPLS static periodically polls RIB for next-hop information. MPLS static relies on RIB information to properly configure static LSPs.

• ISM—MPLS static is a client of ISM. It receives circuit and interface events from ISM.

2.2.2.27 NATd


The NAT daemon (NATd) manages Network Address Translation (NAT) functionality on the SSR.


NATd interacts with the following components to maintain the NAT address
translations in the line card PFEs:

• When NAT is configured, NATMgr sends the details to NATd.

NATMgr also maintains the NAT information in the configuration database and the real-time database (RTDB).

• NATd organizes and distributes the details to the PFEs (through ISM and
ALd), and sends updates when configuration changes occur.

ISM informs clients (including RIB) of the NAT address changes.

• RIB uses NAT data in calculating routes and downloads them to FIB.

• RPM manages the configuration and distribution of NAT policies.

• If a line card is reloaded, NATd resends the card-specific NAT information to the PFEs.

• AAAd uses the NAT data in performing authentications.

2.2.2.28 NDd

Neighbor Discovery daemon (NDd) provides five main functions:

• Address Resolution—NDd participates in the implementation of the IPv6 Neighbor Discovery Protocol, as described in RFC 4861. ND entries are maintained in a database residing on the control plane. The database associates a context-specific IPv6 address (along with a prefix in the case of link-local addresses) with a MAC address. ND cache misses are triggered by platform-dependent code in the forwarding plane and sent by IPC to NDd. The ND packets sent and received by NDd go over the PAKIO infrastructure rather than IPC.

NDd is responsible for managing the lifetime of each ND entry, refreshing entries automatically before they expire, and handling throttling to prevent duplicate requests. In addition to discovered ND entries, other components within the system can add entries as well. For example, RIB can request address resolution or removal of an ND cache entry.

• Stateless Address Autoconfiguration (SLAAC)—NDd enables IPv6 subscriber hosts to autoconfigure their global IPv6 address by advertising an IPv6 prefix in a Router Advertisement message. Stateless address autoconfiguration is described in RFC 4862.

• Duplicate Address Detection (DAD)—NDd ensures that all auto-configured and manually configured addresses are uniquely assigned by both hosts and routers.

• Neighbor Unreachability Detection (NUD)—When a neighbor entry has been learned and cached by NDd, NDd uses NS/NA messages to monitor and maintain reachability with each address. This allows NDd to detect when a neighbor leaves, so that it can be removed from RIB immediately.

• Multibind IPv6 and Dual-Stack Subscriber Support

NDd supports several additional features:

• High availability support for neighbor cache entries—Reachable entries are saved in shared memory. On an NDd restart or switchover, these entries are re-added to RIB.

• On-demand process—NDd is started when the first IPv6 interface, the router nd command, or an ND profile is configured.

• Per-subscriber configuration granularity—All ND configurable parameters can be set via an ND profile assigned to subscriber records.

• Support for multiple link types, including Ethernet, L2TP LNS tunnels (in
the future), and LAG.

• Support for multiple encapsulations, including PPPoE and 802.1Q (single-tag and dual-tag circuits).

• Multibind IPv6 and dual-stack subscriber support—NDd provides address autoconfiguration for up to 32K PPP subscribers on a single chassis.

2.2.2.29 OSPF

This module implements the OSPFv2 (RFC 2328) and OSPFv3 (RFC 5340) protocols. It also supports the OSPF MIB, as described in RFC 4750.

OSPF interacts with RIB both to install OSPF routes and to receive redistributed
routes from other routing instances that might also be OSPF. OSPF installs
connected routes as well as LSP shortcut routes.

OSPF has the following interactions with other modules:

• ISM—OSPF registers for events on interfaces on which OSPF is running and for link-group events, irrespective of whether OSPF is running on any link groups. If MPLS label-switched path (LSP) shortcuts are enabled, LSP circuits are also requested. Finally, if OSPF is configured over an IPsec tunnel, it also receives circuit events for the tunnel and any lower-level circuits (there can be multiple circuits bound to the IPsec tunnel, since it is a multibind interface).

OSPF requests policies from RPM for three purposes:

• Key chains are used for packet authentication.

• Prefix lists are used for prioritized RIB download of selected IPv4 prefixes.


• Route maps may be used to periodically poll RIB to determine whether or not a default route should be originated.

The RPL library provides APIs for all interaction with policy objects.

• SNMP—OSPF supports SNMP queries via the IPC thread with IPC Request
and Reply. Since no objects or state machines are modified, this can be
done without worrying about contention as long as the operating system
run-to-completion user thread model is maintained. SNMP notifications are
sent directly from the dispatcher thread to the SNMP Module.

• CSPF—If OSPF Traffic Engineering (TE) and Constrained Shortest Path First (CSPF) are configured, traffic engineering Link State Advertisements (LSAs) are also maintained in the CSPF daemon.

2.2.2.30 PAd

The Platform Admin daemon (PAd) is a process that runs on both the active and standby RPSW controller cards. It provides support for configuring line cards and ports, monitors the line card hardware (card and port status), and implements the switchover functionality. The PAd process contains the operating system drivers that are used for communicating with the line card hardware. Through these drivers, PAd can configure the line card hardware and ports, monitor the state of the cards and ports, and detect conditions such as port down, card crashes, card pull, and so on.

PAd interacts with the following other modules:

• PAd receives configuration information for cards and ports from the CSM.
PAd communicates the status information for cards and ports to the
CSM process, which in turn propagates the information to the rest of the
operating system.

• The RPSW PM process is responsible for starting the PAd process prior
to any other RP applications and waits for PAd to determine and report
the active/standby state using the redundancy library before launching
any other applications.

2.2.2.31 PEM

The Protocol Encapsulation Manager (PEM) is the module that creates, configures, and maintains the port circuits in the system. For each port, a circuit is created internally for use when features are configured on the port. PEM is a separate process that interacts with ISM to create and configure the port circuits. PEM does not receive any direct configuration from the CLI; it creates the circuits indirectly when it learns from ISM about the creation of new ports.

2.2.2.32 PIM

The Protocol Independent Multicast (PIM) daemon implements the PIM
protocol, as described in RFC 2362. PIM downloads multicast routes to
Multicast Manager (which downloads them to the Multicast Forwarding
Information Base (MFIB) on the line cards) and sends messages to IGMP,
such as requests to enable or disable multicast routing for a context and
to download the cached routes for the context.

The PIM daemon is based on a strict user thread (pthread) model with many
specialized threads. Multicast cache entries are maintained per group and
interface in a hierarchical database with (*,G) entries and (S,G) entries for
each active group. The cache entries for each (S,G) include all the outgoing
interfaces.
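
The following minimal C sketch illustrates the kind of hierarchical (*,G)/(S,G)
cache described above, with per-(S,G) outgoing interface lists. The structure
names, linked-list layout, and helper functions are illustrative assumptions
only, not the actual PIM daemon data structures.

  #include <stdio.h>
  #include <stdlib.h>

  /* Hypothetical outgoing-interface list entry. */
  struct oif {
      int ifindex;
      struct oif *next;
  };

  /* Hypothetical (S,G) entry; each one keeps its own outgoing interfaces. */
  struct sg_entry {
      unsigned int source;            /* S (IPv4 address, host order) */
      struct oif *oifs;
      struct sg_entry *next;
  };

  /* Hypothetical (*,G) entry anchoring all (S,G) entries for the group. */
  struct star_g_entry {
      unsigned int group;             /* G */
      struct sg_entry *sources;
      struct star_g_entry *next;
  };

  static struct star_g_entry *cache;  /* head of the per-context cache */

  static struct star_g_entry *find_or_add_group(unsigned int group)
  {
      struct star_g_entry *g;
      for (g = cache; g != NULL; g = g->next)
          if (g->group == group)
              return g;
      g = calloc(1, sizeof(*g));
      g->group = group;
      g->next = cache;
      cache = g;
      return g;
  }

  static struct sg_entry *add_source(unsigned int group, unsigned int source)
  {
      struct star_g_entry *g = find_or_add_group(group);
      struct sg_entry *s = calloc(1, sizeof(*s));
      s->source = source;
      s->next = g->sources;
      g->sources = s;
      return s;
  }

  int main(void)
  {
      /* (*,232.1.1.1) with one (10.0.0.1, 232.1.1.1) entry and one OIF. */
      struct sg_entry *s = add_source(0xE8010101u, 0x0A000001u);
      struct oif *o = calloc(1, sizeof(*o));
      o->ifindex = 7;
      s->oifs = o;
      printf("cached group entries: %s\n", cache ? "at least one" : "none");
      return 0;
  }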

2.2.2.33 Ping

Performs Ping operations on the SSR.

2.2.2.34 PM

The process manager (PM) monitors the health of every other process in the
system. The PM is the first Ericsson IP Operating System process started when
the system boots. It starts all the other processes in the system. The list of
processes to be started is described in a text file that is packaged with the SSR
software distribution. The PM also monitors the liveness of the processes and, if
any process dies or appears to be stuck, it starts a new instance of the process.
In the SSR, the PM subsystem is distributed, with a master PM process running
on the active RPSW card and PM processes running on the standby RPSW card,
SSC cards, and line cards. The system cannot recover from failures of the PM
processes. If the PM master process crashes, a switchover is initiated.
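
The following minimal C sketch illustrates the general supervision pattern
described above (start a process, wait for it to exit, and start a new
instance). It assumes plain POSIX fork/exec and waitpid and a single
hard-coded child process; the real PM reads its process list from a packaged
text file and is far more elaborate.

  #include <stdio.h>
  #include <unistd.h>
  #include <sys/types.h>
  #include <sys/wait.h>

  /* Illustrative managed process; the real PM reads its list of processes
   * from a text file packaged with the software distribution. */
  static const char *child_path = "/bin/sleep";

  static pid_t start_process(void)
  {
      pid_t pid = fork();
      if (pid == 0) {
          /* Child: run the managed process (here it just sleeps 1 second). */
          execl(child_path, child_path, "1", (char *)NULL);
          _exit(127);                  /* exec failed */
      }
      return pid;
  }

  int main(void)
  {
      pid_t pid = start_process();

      /* Supervision loop: when the process exits, start a new instance.
       * Bounded here so the example terminates. */
      for (int restarts = 0; restarts < 3; restarts++) {
          int status;
          if (waitpid(pid, &status, 0) == pid) {
              fprintf(stderr, "managed process exited, restarting\n");
              pid = start_process();
          }
      }
      waitpid(pid, NULL, 0);           /* reap the last instance */
      return 0;
  }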

2.2.2.35 PPP

The PPP module consists of two components, an RCM manager component


and the backend daemon (PPPd). PPPd listens, waiting for an event or a
job to process. When an event occurs (such as configuration with PPP or
PPPoE encapsulation), PPP processes the event and sends the details to
other modules.

Figure 28 illustrates PPP daemon interactions with other modules:


Figure 28 PPP Daemon Interaction With Other Modules

2.2.2.36 PPPoE

The PPP over Ethernet (PPPoE) module, which manages PPPoE configuration
and subscriber session setup and tear down, consists of two components,
an RCM manager component and the backend daemon (PPPoEd). The
PPPoE RCM component manages all PPPoE related configuration within the
configuration database, which then gets downloaded to PPPoEd.

For details about PPPoE session connection/termination processes, see


Section 3.8.2.2.1 on page 160.

The following figure illustrates PPPoE daemon interactions with other modules:

Figure 29 PPPoE Daemon Interaction With Other Modules


2.2.2.37 QoS

On the SSR, the quality of service (QoS) module implements the resource
reservation control mechanism and configures the forwarding that implements
the services to guarantee quality of service.

QoS services are configured on the following types of circuits:

• Ethernet ports

• 802.1Q PVCs (single tag and dual tag circuits)

• Link groups

For more information about applying QoS policies, class maps, and scheduling
to circuits, see Configuring Circuits for QoS.

In the Ericsson IP Operating System, QoS implementation consists of two


major functional subsystems: QoS RCM Manager (QoSMgr) and QoS Daemon
(QoSd). These modules interact with the forwarding QoS module that
implements the functionality. The main functionality of the operating system
QoS components is to manage and provision QoS features on various types of
circuits.

QoSMgr plays a major role in the operating system QoS implementation. It


consists of the logic to provision policies on circuits as well as to perform
configuration validation, admission control, policy inheritance, and forwarding
resource allocation and tracking. QoSMgr stores the CLI-based configuration
information for static circuits in the configuration database and stores runtime
provisioning information in RSDB. RSDB is a shared memory database that is
used by other processes to exchange state. RSDB is primarily used for tracking
resources in the QoS and ACL subsystems so that it is possible to perform
admission control when the operator enters a configuration that requires
reservation of QoS or ACL state.
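
The following minimal C sketch illustrates admission control against a tracked
resource pool in the spirit described above. The pool layout, field names, and
limits are hypothetical and do not reflect the actual RSDB contents.

  #include <stdio.h>
  #include <stdbool.h>

  /* Hypothetical per-slot resource pool, analogous to the state RSDB tracks. */
  struct resource_pool {
      unsigned int capacity;   /* total hardware resources on the slot    */
      unsigned int reserved;   /* resources already committed to circuits */
  };

  /* Admission control: accept the configuration only if it still fits. */
  static bool admit(struct resource_pool *pool, unsigned int needed)
  {
      if (pool->reserved + needed > pool->capacity)
          return false;            /* reject: configuration cannot be honored */
      pool->reserved += needed;    /* commit the reservation                  */
      return true;
  }

  int main(void)
  {
      struct resource_pool queues = { .capacity = 8, .reserved = 6 };

      printf("reserve 2 queues: %s\n", admit(&queues, 2) ? "admitted" : "rejected");
      printf("reserve 1 queue:  %s\n", admit(&queues, 1) ? "admitted" : "rejected");
      return 0;
  }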

QoSMgr also learns properties of circuits (L2 bindings) from the respective
MBE Managers as needed by the forwarding modules to enforce/implement
certain functionalities.

QoSMgr interacts with the dot1q module in the following ways:

• Dot1q informs QoS of hierarchical relationships between ports and PVCs.

• QoS creates (initially inactive) h-nodes for L2 and L3 circuits.

• Dot1qd informs QoS-MGR of activated PVCs and supplies configured QoS


attributes. The QoS configuration is stored in the configuration database
records.

• The messages from Dot1qd are sent to DOT1Q-MGR via the


be_to_rcm_callback mechanism. Dot1q-MGR then invokes
qos_mgr_ext_attr_bind_policy(), which provisions the required/configured
policies on the circuit.


• DOT1Q-MGR also invokes qos_mgr_request_sync_ccod_cct() to request


QoSMgr to send the queue IDs and h-node IDs to Dot1qd so that it can be
synced to the standby. QoSMgr responds to the send request via the API
qos_mgr_send_request_sync_ccod_cct.

QosMgr and CLS interact in the following ways:

• QosMgr informs CLS-MGR whenever a policy ACL is referenced or


de-referenced in a QoS metering, policing, or forwarding policy.

• QosMgr informs CLS-MGR when a circuit binds to a QoS policy which


uses ACL.

• CLS then downloads the associated ACL to the appropriate Forwarding


modules based on the circuit.

• In the case of forward-policies that are configured on a link group, QoSMgr


informs CLS-MGR of grouping and ungrouping. It also provides slot-mask
for LAG pseudo circuits, so that appropriate ACL provisioning can occur.

QoSd also interacts with ISM, Forwarding, and AAAd.

2.2.2.38 RADIUS

The Remote Authentication Dial In User Service (RADIUS) process manages


configuration of RADIUS attributes, and SSR interactions with RADIUS servers.

2.2.2.39 RCM

The Router Configuration Module (RCM) controls all system configurations


using the configuration database.

The RCM engine is responsible for initializing all component managers and
for maintaining the list of all backend processes for communication. The set
of managers and backend processes is set at compile time. The registration
of manager to backend daemons occurs during RCM initialization, and each
manager is responsible for notifying the RCM engine with which backend
processes it communicates.

The RCM engine provides a session thread for processing any connection
requests from the interface layer. When a new interface layer component (CLI,
NetOpd, and so on) wants to communicate through the DCL to RCM, it starts a
new session with the RCM engine. Each session has a separate thread in RCM
for processing DCL messages. Because the RCM managers are stateless, the
threads only have mutual exclusion sections within the configuration database.
Each session modifies the database through a transaction. These transactions
provide all thread consistency for the RCM component managers.

The RCM has many other threads. These threads are either dynamically
spawned to perform a specific action or they live for the entire life of the RCM
process.


2.2.2.40 RIB and FIB

Routing Information Base (RIB) is the operating system daemon that collects
all routes from all routing protocols or clients (such as BGP, OSPF, IS-IS, and
static routes) and determines the best path for each route based on the routing
distance. A route is composed of a prefix (for example, 20.20.20.0/24) and a
path through a next-hop (for example, 10.10.10.10) residing on a circuit (for
example, circuit_1/interface_1) and is always associated with a distance. The
distances are set by the standards for the route sources. For example, for a
connected adjacency, the distance is 0, for OSPF it is 110, and for IS-IS it is
115. The set of routes with best paths (the ones with the lowest distances)
constitute the Forwarding Information Base (FIB), which RIB downloads to
the forwarding plane.
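
The following minimal C sketch illustrates best-path selection by routing
distance as described above (connected = 0, OSPF = 110, IS-IS = 115). The
route structure and table are illustrative assumptions, not the RIB
implementation.

  #include <stdio.h>
  #include <string.h>

  /* Hypothetical route entry: one path toward a prefix, tagged with the
   * administrative distance of its source (connected = 0, OSPF = 110,
   * IS-IS = 115, and so on). */
  struct route {
      const char *prefix;
      const char *nexthop;
      unsigned int distance;
  };

  /* Pick the best path for a prefix: the candidate with the lowest distance. */
  static const struct route *best_path(const struct route *candidates, int n,
                                       const char *prefix)
  {
      const struct route *best = NULL;
      for (int i = 0; i < n; i++) {
          if (strcmp(candidates[i].prefix, prefix) != 0)
              continue;
          if (best == NULL || candidates[i].distance < best->distance)
              best = &candidates[i];
      }
      return best;   /* this is the path that would go into the FIB */
  }

  int main(void)
  {
      const struct route rib[] = {
          { "20.20.20.0/24", "10.10.10.10", 110 },  /* OSPF  */
          { "20.20.20.0/24", "10.10.10.20", 115 },  /* IS-IS */
      };
      const struct route *b = best_path(rib, 2, "20.20.20.0/24");
      if (b)
          printf("FIB entry: %s via %s (distance %u)\n",
                 b->prefix, b->nexthop, b->distance);
      return 0;
  }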

RIB is also responsible for route resolution and redistribution. Route resolution
consists of recursively finding the best connected next hop for a non-connected
remote peer address. Typically, routes from a non-connected iBGP remote
peer are resolved on a connected next-hop derived from IGP routes. Route
redistribution consists of relaying a set of routes from one source domain (such
as OSPF as an IGP) to another destination domain (for example, BGP as an
EGP), filtered by a specified routing policy (such as an ACL-based policy).
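
The following minimal C sketch illustrates recursive route resolution as
described above: a non-connected next hop is looked up repeatedly until a
connected next hop is found. The table layout and the exact-match lookup are
simplifying assumptions; the real RIB performs longest-prefix matching.

  #include <stdio.h>
  #include <stddef.h>
  #include <string.h>

  /* Hypothetical route table: destination -> next hop, plus a flag saying
   * whether that next hop is on a directly connected subnet. */
  struct rt {
      const char *dest;
      const char *nexthop;
      int connected;
  };

  static const struct rt table[] = {
      { "203.0.113.7", "192.0.2.1", 0 },  /* iBGP route via a remote peer  */
      { "192.0.2.1",   "10.1.1.2",  1 },  /* IGP route; next hop connected */
  };

  static const struct rt *lookup(const char *dest)
  {
      for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
          if (strcmp(table[i].dest, dest) == 0)
              return &table[i];
      return NULL;
  }

  /* Recursively resolve until a connected next hop (or nothing) is found. */
  static const char *resolve(const char *dest, int depth)
  {
      const struct rt *r = lookup(dest);
      if (r == NULL || depth > 8)      /* unresolved or suspiciously deep */
          return NULL;
      if (r->connected)
          return r->nexthop;           /* best connected next hop found   */
      return resolve(r->nexthop, depth + 1);
  }

  int main(void)
  {
      const char *nh = resolve("203.0.113.7", 0);
      printf("203.0.113.7 resolves to connected next hop %s\n",
             nh ? nh : "(none)");
      return 0;
  }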

RIB also handles the Bidirectional Forwarding Detection (BFD) configurations


and BFD event propagations to its clients.

RIB is one of the fundamental daemons in the operating system. It has a major
impact on the transient period from boot to steady state. On the active RPSW
card, the RIB startup and booting sequence directly impacts how packets flow
in and out of the box as it configures the routing tables in the forwarding plane
and the connectivity to the management interface (RIB installs the management
subnet routes in the kernel). The speed at which RIB collects the routes from its
clients, selects the best path, and downloads these routes to all the line cards is
a major factor in the time for the SSR to reach steady state on loading.

RIB interacts with the following modules:

• RCM—Primarily for RIB-specific configurations.

• Routing protocols or RIB clients—Primarily to download to RIB the


protocol-specific paths toward a certain prefix. The RIB clients include
Static, ARP, OSPF, OSPF3, IS-IS, BGP, RIP, RIPng, and Tunnel.

• ISM—RIB is a client of ISM, which is how RIB learns about circuits and
interfaces and thereby configures subnet routes and subscriber routes. RIB
is also an ISM MBE, primarily for setting BFD flags on some circuit types.

• ARP—Adjacency routes are added and deleted by ARP through a MAC


ADD or MAC DEL message.

• SNMP—Queries concerning routes are handled by RIB.

• RPM—Routing policies are handled by the RPM daemon.


• PIM—RIB downloads a full copy of the FIB to PIM.

• Forwarding—FABL-FIB (running on the line card local processor) registers


with RIB as an FIB client. Once registered, RIB downloads a full FIB copy
for ingress and egress.

• Kernel—RIB downloads some routes (management interface specific) to


the kernel.

• Libso—RIB handles libso queries pertaining to source IP addresses.

For details about the RIB boot process, see Section 3.5.1.1 on page 113; for
the role of RIB in subscriber session management, see Figure 62.

For the role of RIB in LAG constituent to PFE mapping, see Section 3.3.6 on
page 99.

2.2.2.41 RIP and RIPng

The RIP module implements RIPv2, as documented in RFC 1388. It also


implements RIPng, as described in RFC 2080. RIP supports both IPv4 and
IPv6 traffic.

The RIP module interacts with the following other modules:

• RIB for route downloads and registrations (RIB distributes RIP routes)

• ISM for registrations and events

• RPM for policy information and release requests

• RCM to handle configuration, show, and other exec mode commands

2.2.2.42 RSVP

Resource Reservation Protocol (RSVP) is one of the label allocation protocols


used to assign labels in MPLS-enabled networks based on information from
existing routing protocols. RSVP LSPs allow for Next-Hop Fast Reroute
(NHFRR), which performs a sub-50 millisecond repair of the label switched
path. NHFRR supports the protection of an RSVP LSP with a bypass LSP. A
bypass LSP is pre-established to protect an LSP that traverses either a specific
link (link bypass LSP) or node (node bypass LSP). This feature enables very
fast path protection of an LSP if a failure occurs in its original path. The Ericsson
IP Operating System RSVP supports the ability to choose the best preferred
bypass LSP when multiple candidate bypass LSPs protect the same address.

The Ericsson IP Operating System RSVP supports graceful restart, which


enables the router and its neighbors to continue forwarding packets without
disrupting network traffic when a neighbor is down. When RSVP graceful
restart is enabled, the router preserves the LSP state when a neighbor is
down. During graceful restart, all RSVP LSPs that were previously successfully
established between the router and the restarting neighbor are maintained.


The router uses RSVP Hello messages to determine if a neighbor is down.


The hello interval and hello keep-multiplier commands in RSVP
interface configuration mode enable and configure RSVP Hello messages.

RSVP traffic engineering (TE) is an extension to RSVP for establishing LSP


paths in MPLS networks. RSVP-TE works with routing protocols to reserve
resources across the network based on network constraint parameters, such
as available bandwidth and the number of explicit hops. It allows for resource
allocation along the path. For information about RSVP-TE, see RFC 3209,
RSVP-TE: Extensions to RSVP for LSP Tunnels.

The following RSVP RFCs are supported: RFC 3031, RFC 3032, RFC 3209,
and RFC 4090 (facility protection only).

The RSVP daemon interacts with the following modules:

• LM—Downloads RSVP LSPs into the label manager. LM also queries


RSVP for MPLS-ping requests.

• ISM—Informs RSVP about MPLS interfaces.

• RPM—Handles RSVP configured routing policies.

• RIB—RSVP queries RIB for the outgoing interface and next-hop for a given
prefix. RSVP also registers for BFD sessions through RIB.

• CSPF—Computes the CSPF path on behalf of RSVP.

• RCM—Handles RSVP-related configuration messages.

2.2.2.43 SNMPd

The Simple Network Management Protocol (SNMP) daemon monitors and


manages network devices using SNMP, communicates trap and informational
notifications, and manages SNMP requests according to the Management
Information Base (MIB).

The MIB is a virtual database of defined objects used to manage the network
device. MIB objects are organized hierarchically, each with a unique object
identifier (OID).

The SNMP engine code is a third-party component developed by SNMP


Research. It provides a library mechanism for implementing MIBs,
communicating with SNMP agents, and handling the SNMP requests that are
received. The majority of the code for SNMP, outside of the SNMP Research
code, is infrastructure that interfaces to the operating system.

The information received by SNMP comes from various sources: ISM, RIB,
and any other client that generates SNMP notifications. ISM and RIB have
dedicated threads for communication, whereas all other clients communicate
with a notification endpoint. This endpoint is used for generating trap requests
from the system.


The SNMP Research component supports v1, v2, and v3 of the SNMP protocol.
Versions 1 and 2 have been made obsolete by the IETF but are still supported for customers
using these older versions. The SNMP Research package has only a few
customizations, and they relate to making the component context aware. The
context has been added to the protocol community string and is parsed by
the package. With these changes, multiple instances and their contexts are
supported.

2.2.2.44 Staticd

The Static daemon (Staticd) supports both interface (connected) and gateway
(non-connected) IP and IPv6 static routes that can be configured either through
the CLI or the NetOp Element Management System (EMS). Additionally,
configured gateway routes may be verified using the proprietary Dynamically
Verified Static Route (DVSR) protocol that periodically pings the specified
gateway.

For details about static route resolution, see Section 3.5.5 on page 120.

2.2.2.45 STATd

The main task of the statistics daemon (STATd) is to maintain counters from the
line cards. It collects the counters from the forwarding plane, aggregates and
processes counters, and allows various applications in the system (including
CLI) to access these counters. STATd provides limited counter resiliency for
some restart cases. STATd maintains the following types of counters: context,
port, circuit, pseudo circuit, and adjacency. STATd maintains counters in a tree
of Counter Information Base (CIB) entries. These entries hold the counters for
contexts, port, circuits, and adjacencies. Each CIB entry contains configuration
information and counters. CIBs are placed into various aggregation general
trees to allow walking circuit hierarchies. Each CIB can contain optional counter
values. Recent optimizations allocate memory only when certain counters are
needed so that the memory footprint of STATd is reduced. For each type of
counter value, the following versions are kept (the sketch after this list
shows one way they can be combined):

• Cumulative—Data read from the forwarding plane.

• Cleared—Counter value at the last clear. Clearing of counters is handled in


STATd because forwarding plane counters are never cleared.

• History—Previous counter values before a line card crash, line card


restart, line card modular upgrade, switchover, ISSU, STATd restart,
pseudo-circuit slot delete, child circuit delete, link group ungroup, and
circuit group ungroup. Each time a line card restarts, its counter information
is lost. Some history counter archives are kept in separate child CIBs in
aggregation trees.
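
The following minimal C sketch shows one plausible way the three counter
versions could be combined into the value presented to the operator (history
plus cumulative minus the snapshot taken at the last clear). The record layout
and the arithmetic are assumptions for illustration, not the actual STATd
behavior.

  #include <stdio.h>

  /* Hypothetical per-counter record holding the three versions kept by STATd. */
  struct cib_counter {
      unsigned long long cumulative;  /* latest value read from the forwarding plane */
      unsigned long long cleared;     /* cumulative value captured at the last clear */
      unsigned long long history;     /* value preserved from before a card restart  */
  };

  /* One plausible way to present the counter to show commands: subtract the
   * snapshot taken at the last clear and add back preserved history. */
  static unsigned long long displayed(const struct cib_counter *c)
  {
      return c->history + (c->cumulative - c->cleared);
  }

  int main(void)
  {
      struct cib_counter rx = { .cumulative = 5000, .cleared = 1200, .history = 300 };
      printf("rx packets shown to the operator: %llu\n", displayed(&rx));
      return 0;
  }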

STATd collects counters in two modes:


1 Query—STATd queries the forwarding plane for counters. Querying is


done with asynchronous IPC messages sent to each forwarding engine
(such as PPA).

2 Push—The line card updates STATd with the latest counters. The
forwarding plane sends messages to STATd either when triggered by
certain events or periodically. A reliability mechanism is implemented
for only the data that is deemed too important to be lost when STATd
restarts. STATd acknowledges this data when received, which requires the
forwarding plane to perform additional processes when STATd restarts. It
must resend all the data that was not acknowledged by STATd because it
may have been lost.

Certain important counters (such as rx/tx totals) are pushed periodically to


STATd. All forwarding engines in all slots deliver the periodic counters to
STATd, with a default period of 60 seconds. This ensures the accuracy of the
reported counters in cases where polled counters are used, such as cached
counter query and history events that must retrieve counters from STATd cache.

Bulk statistics is a mechanism that reports system information (including


counters) in bulk to a monitoring station using File Transfer Protocol (FTP) or
Secure Copy Protocol (SCP).

A bulk statistics schema specifies the type of data to be reported, the reporting
frequency, and other details. The information is collected in a local file and then
is transferred to a remote management station. STATd manages the creation,
deletion, and configuration of bulk statistics schemas. The schema selects the
information to be reported and determines its format. You can associate the
system, contexts, ports, or 802.1Q PVCs with a schema, which will include
the associated data in the information reported by the schema. When this
association happens in CLI, STATd is notified and adds a work item in the
schema definition so that the related data can be collected when the schema is
processed periodically. This functionality builds on the counters maintained by
STATd and does not introduce any new dependencies on the forwarding plane.

STATd interacts with the following modules:

• RCM—Handles STATd specific configurations such as setting the poll


interval and dealing with bulk statistics. Also handles show counters
commands.

• CLI—Directly queries STATd for counter-related show commands.

• SNMP—STATd handles SNMP counter queries (which may result in


alarms as a result of STATd data). SNMP is notified if the bulk statistics
file transfer failed.

• ISM—STATd is a client of ISM, which determines the state of circuits, their


relationships, and link group events.

• LG—Queries STATd for circuit counters.

• CSM—Queries STATd for port and circuit counters.


• Kernel—Interacts for bulk statistics but not for management port counters.

• Forwarding plane—STATd communicates with the STATd/FABL


component on the line cards, which is responsible for aggregating statistics
from the NPU and communicating with the STATd process running in
the RPSW card. STATd/FABL sends statistics information to the STATd
process running in the RPSW, which also polls STATd/FABL for counters
using asynchronous ipcWrite calls from a single thread.

2.2.2.46 Sysmon and Logger

The system manager module (Sysmon) and Logger module manage system
event messages. Together, they produce the system log (Syslog), used to
monitor and troubleshoot the SSR system. The SSR should be configured to
automatically store the Syslogs on an external Syslog server. For information
about how to configure, access, and collect logs, see Logging and Basic
Troubleshooting Techniques.

2.2.2.47 TCMA

Time Control Module Agent (TCMA) provides synchronization services to line


cards. It configures timing circuits, monitors SyncE port faults, validates the
transceiver capability to support synchronous mode and raises faults to the
TCMd. It also controls sending and receiving of ESMC PDUs and collects
ESMC statistics. For a diagram of the SyncE components, see Figure 10.

2.2.2.48 TCMd

Time Control Module daemon (TCMd) is the main Synchronized Ethernet


process. It controls all the timing features, except for SyncE port configuration
and monitoring (handled by a process that already exists on the RPSW card).
and Subscribe (pubsub) interface to receive ALSW and line card state change
notifications. It performs initialization of the timing hardware on the ALSW card.
TCMd contains platform-dependent and independent code.

TCMd interacts with the following hardware and software components:

• Timing control manager (TCM) (hardware) in the ALSW card

• TCM Agent (TCMA), running on the line card in the Application Layer
daemon (ALd) context

• Communication protocol, ESMC

• Card State Module (CSM), Inter-process communication (IPC) , Managed


Object (MO) module, Chassis Management System (CMS), and Router
Configuration Module (RCM) on the RPSW card

• Card Admin daemon (CAD), Load Index Map (LIM) drivers, Packet
Input/Output (PKTIO) hardware module, and the NPU driver on the line
cards.


For a diagram of the SyncE components, see Figure 10.

2.2.2.49 TSM


Figure 30 illustrates the control information flow for SSC traffic slice
management (TSM).

Figure 30 SSC Traffic Slice Control Information Flow

The process is summarized in the following steps:

• Traffic Slice Management—The SSR creates a Traffic Slice Forwarding


Table (TSFT) in each line card to steer packets from the line card to the
SSCs. The SSCs return traffic to line cards using FABL-FIB and to other
SSCs using TSFT.

• IP Route Management—When the application configures an IP route-based


traffic slice, the request is forwarded to RIB via TSM on the RPSW card.
RIB then downloads and installs the information in the SSC and line card
FIB tables.

When SSR nodes are configured in an ICR pair (in the BGP-based model),
TSM packet steering changes to a more complex model. In this case, packets
are steered to specific SSCs by using service maps as well as multiple,
dynamically created, TSFT tables. For a diagram of the steering flow with this
configuration enabled, see Figure 87.


2.2.2.50 Tunneld

The Tunnel Manager (Tunnel) process implements soft tunnels on the SSR,
adding only an encapsulation without a tunnel entry endpoint in the forwarding
plane. It handles tunnels according to the next-hop types in the Forwarding
Information Base (FIB), including:

• GRE tunnels

• IP-in-IP tunnels

• IPsec or site-to-site tunnels (used with the SSC)

• Manual and auto IPv6 tunnels

2.2.2.51 VRRP

The SSR supports Virtual Router Redundancy Protocol (VRRP), as described


in RFC 3768. VRRP increases the availability of the default gateway
servicing hosts on the same subnet by advertising a virtual router (an abstract
representation of owner and backup routers acting as a group) as a default
gateway on the host(s) instead of one physical router. Two or more physical
routers are then configured to stand for the virtual router, with only one doing
the actual routing at any given time. VRRP protocol is implemented with
VRRPd (VRRP Daemon), which maintains the state machine for the various
virtual routers.

For an overview of VRRP and configuration and operations information, see
Configuring VRRP.

Release 12.2 introduces the VRRP Hot Standby (HS) feature, with which the
system is designed to achieve hitless switchover and process restart. Both the controller card
and line card store state information for VRRP service. When a controller card
switchover occurs, the newly active controller card recovers states by retrieving
them from the line card. No synchronization occurs between the active and
standby RPSW processes. Running the reload switch-over or process
restart vrrp commands on a VRRP router in owner state does not cause it
to lose its owner state.

When the VRRPd is running on the active node, the standby daemon is also
running. It receives all ISM and RCM messages but not RIB or line card
messages. During switchover, the standby daemon takes over and sends all
the sessions to the line cards. When the line cards receive the sessions, they
compare them with a local copy to determine which ones to send back to the
RPSW card. When that process is complete, the line card receives an EOF
message from the RPSW card and cleans up the stale sessions.

VRRP assigns a virtual router identifier (VRID). Using the VRID, a virtual MAC
address is derived for the virtual router using the notation of 00-00-5E-00-01-XX,
where XX is the VRID. This MAC address is installed in the ARP table and is
used in packet forwarding by the owner router. VRRP implementation consists
of three components: RCM manager, backend daemon, and forwarding. The


RCM manager component manages all VRRP-related configuration within


RDB, which then gets downloaded to the backend daemon. The forwarding
component of the VRRP is implemented in the line card (in FABL) and assists
the VRRP in running the master advertisement for a given VRID when it is the
owner and detects owner timeouts when the VRID is in backup state.
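
The following small C example works through the virtual MAC derivation
described above, filling in 00-00-5E-00-01-XX with the VRID. The helper
function name is illustrative, not part of the VRRP daemon.

  #include <stdio.h>

  /* Build the VRRP virtual MAC 00-00-5E-00-01-XX from a VRID (RFC 3768). */
  static void vrrp_virtual_mac(unsigned char vrid, unsigned char mac[6])
  {
      mac[0] = 0x00;
      mac[1] = 0x00;
      mac[2] = 0x5E;
      mac[3] = 0x00;
      mac[4] = 0x01;
      mac[5] = vrid;   /* XX is the VRID */
  }

  int main(void)
  {
      unsigned char mac[6];
      vrrp_virtual_mac(10, mac);
      printf("VRID 10 -> %02X-%02X-%02X-%02X-%02X-%02X\n",
             mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
      return 0;
  }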

2.2.2.52 XCd

The L2 cross-connect feature in the operating system implements L2 switching


between two service instances or a service instance and a PW. The
cross-connect daemon (XCd), which implements it, consists of two components:
the daemon and an RCM backend component running on the RPSW card.
XCd manages cross-connects, the L2 switching feature on the SSR, which is
also referred to as a bypass. Based on statically provided configuration, XCd
configures the forwarding plane so that a packet received on a service instance
is switched to a service instance or pseudowire for egress. This configuration
also defines the actions taken upon the L2 header, which can include stripping
or preserving the original L2 header, and possibly adding new L2 headers
according to the egress path for the packet.

2.3 System Redundancy


In hardware, the SSR supports the following redundant components:

• Dual RPSW controller cards (1 + 1)

• Dual ALSW cards (1 + 1)

• Switch fabric to which all line cards connect (on the SSR 8020, 7 + 1 at
5.57 GHz and on the SSR 8010, 3 + 1 at 6.25 GHz). When a fully loaded
SSR 8020 chassis incurs three switch card failures, the system continues
to switch traffic at line rate. If a fourth card fails, switching falls below line
rate. On the SSR 8020, each line card is connected to all eight switch
fabric cards (four on the SSR 8010). Each line card has 32 links of 6.25
Gbps, which are distributed on switch fabric cards for connectivity (four
links per switch fabric card).

• N + 1 power modules (for SSR 8020, 7 + 1; for SSR 8010, 5 + 1)

• Redundant fan trays with 6 fans each (each fan tray 5 + 1)

You can also install multiple SSC cards for high availability. If one of them fails,
the line cards steer packets to the SSC cards that remain in service.

The redundancy model on an SSR system uses a two-tier process to select its
active and standby system components. The selection process at the lowest
level is controlled by hardware that resides on the ALSW cards. This hardware
is responsible for the selection of the primary ALSW and master (active) RPSW
cards and their associated busses (CMB, PCIe, SCB, TCB). For definitions
of ALSW primary/secondary state and RPSW master/standby state, see
HW-HLD-0031. Once the primary ALSW and master RPSW cards are selected,


the selection of active and standby ALSW cards at the next tier is controlled by
software operating on the master RPSW card for components such as the GigE
control plane switch, timing distribution, and alarm logic.

The RPSW redundancy software design is based on the SmartEdge


redundancy design with the following major modifications:

• The new design takes advantage of the co-location of the PAd process with
all other RPSW processes in a single Linux instance and accounts for the
removal of the SCL links between the RPSW cards.

• The hardware interrupt signals formerly used to communicate between


the NetBSD applications and the VxWorks processor are replaced with a
redundancy library using shared memory and a notification mechanism
(operating system event library).

• The hardware signals and shared memory used for implementing the M2M
and Red Link are replaced with direct messaging between M2M and Red
Link components using raw sockets and TCP over Ethernet.

• The SCB arbitration logic on the ALSW cards is used for master (active)
RPSW card selection. The PAd process monitors the overall health of both
RPSW cards and coordinates with the hardware for RPSW failovers.

The ALSW redundancy software design is based on the RPSW redundancy


design, with several simplifications. The ALSW hardware handles primary and
secondary ALSW determination and failover. PAd on the RPSW card handles
active/standby ALSW card selection, monitors the overall health of both ALSW
cards, and coordinates active/standby ALSW failovers.

Because the controller cards are not involved in ingress to egress traffic
forwarding, and because each line card maintains its own FIB to make routing
decisions, when a controller card is temporarily unavailable (such as during
switchover), traffic continues to be forwarded.

2.3.1 Active and Standby RPSW and ALSW Card Selection During Startup

RPSW cards contain internal file systems that store the operating system image,
its associated files, and the configuration database. A synchronization process
ensures that the standby card is always ready to become the active card.

• When either the software release or the firmware on the active controller
card is upgraded, the standby controller card automatically synchronizes its
software or firmware version to that of the active controller.

• When a user modifies the contents of files in memory (for example, by


saving a configuration to a file, copying a file, or deleting a file), the change
is propagated to the file system of the standby controller.

• The configuration databases of the active and standby cards are always
synchronized.


Selection of the active and standby RPSW and the primary and secondary
ALSW cards occurs in the following scenarios at chassis startup.

The active RPSW card is selected in the following sequence (a sketch
illustrating steps 6 through 10 appears after the sequence):

1 Linux init starts PM.

2 PM starts the CMBd process.

3 CMBd provides access for querying the chassis inventory to determine


the chassis type.

4 PM starts the PAd process, which in turn instantiates the Controller Selector
and the M2M and Red Link threads.

5 PAd waits for a callback from ipcInitialize3() to notify it that the active/standby
determination is made, at which point it will call ipcProcessIsReady().

6 The Controller Selector evaluates its mastership capability based on its


own health situation, where a POD pass indicates master capable and a
POD fail indicates master incapable.

7 The Controller Selector writes its mastership capability to the ALSW card by
calling slShelfCtrlSetHwMasterCapable() at a regular interval of 3 seconds
as long as it is master capable. The periodic call to this function prevents
ALSW selector HW watchdog timeout.

8 The mastership write triggers ALSW primary selection.

9 The primary ALSW card selects a master RPSW card.

10 The ALSW hardware notifies the Controller Selector whether it is the active
or standby candidate. If the RPSW card is going active, the Controller
Selector calls the slShelfCtrlGoActive SLAPI. If the RPSW card is going
standby, the Controller Selector calls the slShelfCtrlGoStandby SLAPI.

11 Once the PAd process has established the active/standby state, it updates
the state information stored in the redundancy library, which is published
to other applications.

12 PM starts DLM and allows DLM to check and synchronize software


releases. If a software upgrade is required, the new software image is
downloaded and the card reloads.

13 PM starts the remaining processes based on the contents of the PM.conf


file.

14 PAd gets the ipcInitialize3() callback and completes the remainder of its
initialization in parallel with other processes.

15 The active RPSW card's configuration database is synched to the standby


RPSW card. The active RPSW card locks the database while it is synched
across to the standby RPSW card.


16 The PAd on the active RPSW card invokes the registered callbacks to
synchronize admin and realtime layer state information to the standby PAd
process.

17 The remaining applications synch their state information from active to the
standby using IPC and update their IPC checkpoint information as each
process completes.

18 PM on active RP broadcasts its RP.ACTIVE namespace to peer PM on


line cards.

Note: The term active candidate refers to the RPSW card that has been
selected by the ALSW card HW selector to go active but has not yet
gone active.
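
The following C sketch illustrates steps 6 through 10 of this sequence: the
card keeps asserting mastership capability at a 3-second interval and then
goes active or standby according to the hardware decision. The SLAPI names
are taken from the steps above, but they are declared here as local stubs;
their real signatures, and all of the helper functions, are assumptions made
for illustration only.

  #include <stdio.h>
  #include <stdbool.h>
  #include <unistd.h>

  /* Stub stand-ins for the SLAPI calls named in the text; the real signatures
   * are not shown in this document, so these are illustrative placeholders. */
  static void slShelfCtrlSetHwMasterCapable(void) { puts("assert master capable"); }
  static void slShelfCtrlGoActive(void)           { puts("going active");          }
  static void slShelfCtrlGoStandby(void)          { puts("going standby");         }

  /* Hypothetical helpers standing in for POD results and ALSW hardware state. */
  static bool pod_passed(void)             { return true; }
  static bool hw_selection_available(void) { static int n; return ++n >= 3; }
  static bool hw_selected_active(void)     { return true; }

  int main(void)
  {
      /* Step 6: mastership capability follows the card's own health (POD). */
      bool master_capable = pod_passed();

      /* Steps 7-8: refresh capability every 3 seconds (keeping the ALSW
       * selector hardware watchdog alive) until the hardware has decided. */
      while (master_capable && !hw_selection_available()) {
          slShelfCtrlSetHwMasterCapable();
          sleep(3);
      }

      /* Step 10: act on the ALSW hardware's active/standby decision. */
      if (hw_selected_active())
          slShelfCtrlGoActive();
      else
          slShelfCtrlGoStandby();
      return 0;
  }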

The active ALSW card is selected in the following sequence:

1 Both ALSW cards wait for the first PCIe write from an RPSW card to start the
primary ALSW selection algorithm in ALSW FPGA.

2 The primary ALSW card selects the desired master (active) RP and notifies
the RP of its selection.

3 Once PAd and CMBd have finished initializing, the ALSW Selector
determines which ALSW card is active based on the health of each ALSW card.
If both ALSW cards are equally healthy, one of them is chosen as the active
ALSW card.

4 The ALSW Selector calls slAlSwGoActive for the active ALSW card and
calls slAlSwGoStandby for the standby ALSW card.

2.3.2 Standby RPSW Card Synchronization During Runtime


The standby RPSW card is synchronized during runtime with the following
sequence:

1 The active and standby cards synchronize with the configuration database
and flash, and collect required state information.

2 The active RPSW configuration database is actively synched to the standby


during normal operation through IPC.

3 The active RPSW file system is monitored for changes by DLM. Whenever
the file system is modified, DLM synchs the changes across to the standby
RPSW card.

4 State information in PAd is actively synched to the mate PAd process via
the PAd Redundancy Module over the Red Link.

5 Events from line cards received at the SL Upper on the active PAd process
are synched across to the standby PAd process via the Redundancy
Module during normal operation.


6 All other processes synchronize state as required using IPC and DDL.

To guard against system inconsistency, the synchronization process is


protected during system load and controller card reload. When synchronization
is in progress, switchover from the active to the standby controller card is not
allowed. If the active card fails during synchronization, the standby controller
card does not become active. If the user attempts to force a switchover during
this synchronization period, the system warns the user that the standby is not
ready. However, during the normal running state, a controller card switchover
can occur at any time, and the standby controller card has the data required
to take the active role.

The synchronization process is not affected by traffic card installation and


removal. The active controller card continues to forward control traffic and
detect and notify the administrator of any faults that occur (the FAIL LED is
blinking) while the standby controller card is being synchronized.

After synchronization is complete, the standby controller is ready to become the


active controller card if the active card fails.

For more information about the SSR file system, see Managing Files.

2.3.3 RPSW Switchover Scenarios


The active RPSW card can be triggered to switch over with the standby card in
one of the following scenarios:

• The active RPSW card is removed from the chassis. The standby RPSW
card detects that the active card has been removed and the CMBd module
on the standby RPSW card reports the event to the Card Detection
subsystem. The Card Detection subsystem forwards the event to the
Controller Selector, which updates the mate status information.

• The active RPSW card's ejector switch is opened. The active RPSW
card CMBd receives the ejector switch open event and forwards it to the
Controller Selector state machine.

• The standby RPSW card detects that the active RPSW card has a
hardware or software fault and takes over control of the system. The active
and standby RPSW cards exchange fault information with each other
through the exchange of Fault Condition notifications over the Red Link.
As with the SE800, each RPSW card keeps track of its own local faults
and its mate’s faults and uses these as the inputs to the failover trigger
algorithm. Whenever a new fault is detected, both RPSW cards receive
the notification and the Fault Handler forwards the fault to the Controller
Selector. Typically, RPSW software failures do not result in RPSW card
failovers. If software failures cause an RPSW failover, the fault is reported
using the Fault Condition event and the fault is treated in the same way
as a hardware failure.


• The active ALSW card detects the loss of RPSW-initiated heartbeats and
initiates a hardware failover. The standby RPSW card is interrupted and
notified of the RP mastership change from hardware. The Controller
Selector running on the standby RPSW card starts a software failover.

• The operator enters the reload switch-over command. When the


manual switchover request is sent from CLI, if no other higher priority
switch requests are active, the Controller Protection Resource Manager
forwards the switchover request to the Controller Selector. The Controller
Selector performs a similar check and, if no higher priority request is in
progress, initiates a manual switchover.

• The active RPSW card's PM or nameserver process crashes. The active


RPSW card’s init process monitors the PM process. When it detects it has
crashed, the init process initiates a graceful card reboot. The active RPSW
card’s reboot routine writes to the ALSW SCB register to bring the RPSW
card to mastership selection offline state, thus initiating a fast hardware
failover. The standby RPSW card is interrupted and notified of the RPSW
mastership change from hardware. The Controller Selector that runs on
the standby RPSW card starts software failover.

• The active RPSW card's PAd process crashes. The PM process actively
monitors all processes. If PAd crashes, PM exits, which triggers RPSW
switchover in the same way as in the previous case.

2.3.4 ALSW Switchover Scenarios


The Active ALSW card can be triggered to switch over with the standby card in
one of the following scenarios:

• The user enters the reload switch-over alsw command. The PAd
ALSW selector calls slAlSwGoStandby on the active ALSW card, which
internally checks the primary/secondary status of the active ALSW card.
Because the ALSW card is primary, the driver demotes the primary ALSW
card and then promotes the secondary ALSW card.

• The user enters the reload standby alsw command. The PAd ALSW
selector processes the reload request without regard for the primary or
secondary status of the active ALSW card. The PAd ALSW selector calls
slAlSwGoStandby to reload the active ALSW card, and the driver checks
the primary/secondary status of the active ALSW card. Because the ALSW
card is primary, the driver demotes the primary ALSW card and then
promotes the secondary ALSW card.

• The user opens the ejector switch on the primary ALSW card. An interrupt
arrives at the driver software, which notifies PAd that the ejector was
opened. The PAd ALSW Selector evaluates the request and determines
that it is the highest priority request. The PAd ALSW Selector calls
slAlSwGoStandby to perform the switchover. When the slAlSwGoStandby
call is made, the driver also checks the primary/secondary status of the


active ALSW card. The driver demotes the primary ALSW card and then
promotes the secondary ALSW card.

• The primary ALSW card is removed from the chassis. The secondary
ALSW card generates a software interrupt and starts the B5 timer. The
driver handles the software interrupt and checks the state of the pulled
ALSW card. Because the ALSW card has been pulled, the driver does
nothing. The B5 timeout occurs, and the secondary ALSW card promotes
itself to primary.

• The primary ALSW card fails. The secondary ALSW card does not detect
the Inform signal and generates a software interrupt and starts the B5
timer. The driver handles the software interrupt and checks the state of the
failed ALSW card. Because the ALSW card has failed, the driver does
nothing. The B5 timeout occurs, and the secondary ALSW card promotes
itself to primary.

3 Architectural Support For Features

3.1 Layer 2 Cross-Connection and VPWS on SSR


For CPI-level configuration information about L2 services, see Configuring
Layer 2 Service Instances, Configuring Local Cross-Connections, and
Configuring VPWS (L2VPN).

Figure 31 Layer 2 Cross-Connection

To provide flexibility of circuit definition, VLAN manipulation, and L2 services,


the SSR supports the L2 cross-connect and Virtual Private Wire Service
(VPWS) features. To enable these L2 services, two types of new circuit-related
configuration concepts have been introduced, as shown in Figure 31:


• Service instance—An attachment circuit (AC) between the CE and PE routers
using Ethernet, 802.1Q, or 802.1ad encapsulation, configured for L2
forwarding, manipulation, and transportation of various types of packet
encapsulations.

• Pseudowire instance—A PW that emulates a point-to-point connection over
an MPLS network, allowing the interconnection of two nodes with any
L2 technology.

Local cross-connections connect two Layer 2 Ethernet service instances


to each other. The configuration of the cross-connected service instances
determines how Ethernet traffic is forwarded from the service instances on one
port to the service instances on another port of the same SSR.

L2VPN supports circuit-to-PW switching, where traffic reaching the PE is


tunneled over a PW and, conversely, traffic arriving over the PW is sent out over
the corresponding AC. In this case, both ends of a PW are cross-connected to
L2 ACs.

You can configure local cross-connections between two service instances
(as in Figure 32) or, for VPWS, cross-connect a service instance to a PW
instance (as in Figure 33).

Figure 32 Layer 2 Cross-connection

VPWS is a point-to-point link between two CE routers through an MPLS-based


PW network. A VPWS configuration connects a local CE router to a remote
CE device through an existing MPLS backbone network, as shown in Figure
33. The SSR cross connects the local service instance (SI) circuit between the
local CE and PE to a PW instance that crosses the MPLS backbone network to
the remote PE router.

Figure 33 VPWS Topology


3.1.1 Service Instances


Traffic over service instances is managed by configuring VLAN tag matching or
VLAN tag manipulation.

3.1.1.1 VLAN Tag Matching

Under Ethernet port configuration, you define service instances with match
options, which designate different Layer 2 service instances for carrying specific
types of traffic, similar to ACLs. Any traffic that is not matched is dropped.

VLAN tag matching traffic types:

• dot1q—802.1Q traffic (single-tagged) received by the port.

• dot1ad—802.1AD traffic (single- and double-tagged) received by the port.

• untagged—Untagged Ethernet traffic received by the port or any traffic


double-tagged with an ethertype different from the one configured with
dot1q tunnel ethertype.

• priority-tagged—Priority-tagged traffic received by the port (with vlan-id=


0 + priority bits).

• fallback-c-tag—Single-tagged traffic with ethertype 0x8100 received by the


port. This match option specifies a transport VLAN for forwarding 8100-type
tagged traffic that does not match any other transport VLAN.

• default—Default match option for the port. This match option specifies a
default circuit that captures packets that do not match the criteria for any
other service instance.

VLAN tag matching restrictions:

• A service instance can handle packets with up to four stacked VLANs, in


which the matching criteria only match on the two outer VLANs.

• An individual (non-range) service instance can have up to four match


options.

• A range service instance can have only one match option.

VLAN tag matching traffic types have the following hierarchy:

1 dot1ad SVLAN (single-tagged) and S:CVLAN (double-tagged) packets


with encapsulation.

2 dot1ad SVLAN (single-tagged) and S:CVLAN (double-tagged) packets


without encapsulation.

3 Dot1Q CVLAN packets with encapsulation.

4 Dot1Q CVLAN packets without encapsulation.


5 Priority-tagged packets with encapsulation.

6 Priority-tagged packets without encapsulation.

7 Untagged packets with encapsulation.

8 Untagged packets without encapsulation.

9 Priority-tagged and fallback-C-tagged packets.

10 All packets that match the default option.

11 Unmatched packets.

Hierarchy is not based on SI numbering but on the most specific tagging; the
sketch that follows illustrates this precedence-based matching.
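
The following minimal C sketch illustrates precedence-based classification in
the spirit of the hierarchy above, using a simplified view of the frame and of
the match options. The enum, the frame fields, and the ordering details are
illustrative assumptions, not the forwarding-plane implementation
(encapsulation-specific entries are collapsed here).

  #include <stdio.h>

  /* Simplified view of a received frame: how many outer VLAN tags it carries
   * and whether the outer tag uses the dot1ad (port) ethertype or 0x8100. */
  struct frame {
      int tags;          /* 0, 1, or 2 outer tags examined              */
      int outer_dot1ad;  /* 1 if the outer tag uses the port ethertype  */
      int priority_tag;  /* 1 if the single tag carries VLAN ID 0       */
  };

  /* Match options a port may carry, ordered from most to least specific,
   * loosely following the hierarchy described above. */
  enum match { MATCH_DOT1AD_DOUBLE, MATCH_DOT1AD, MATCH_DOT1Q,
               MATCH_PRIORITY, MATCH_UNTAGGED, MATCH_DEFAULT, MATCH_NONE };

  static enum match classify(const struct frame *f, const int *enabled)
  {
      /* Walk the options in precedence order; the first enabled match wins. */
      if (enabled[MATCH_DOT1AD_DOUBLE] && f->tags == 2 && f->outer_dot1ad)
          return MATCH_DOT1AD_DOUBLE;
      if (enabled[MATCH_DOT1AD] && f->tags >= 1 && f->outer_dot1ad)
          return MATCH_DOT1AD;
      if (enabled[MATCH_DOT1Q] && f->tags == 1 && !f->outer_dot1ad && !f->priority_tag)
          return MATCH_DOT1Q;
      if (enabled[MATCH_PRIORITY] && f->priority_tag)
          return MATCH_PRIORITY;
      if (enabled[MATCH_UNTAGGED] && f->tags == 0)
          return MATCH_UNTAGGED;
      if (enabled[MATCH_DEFAULT])
          return MATCH_DEFAULT;
      return MATCH_NONE;   /* unmatched traffic is dropped */
  }

  int main(void)
  {
      int enabled[] = { 1, 1, 1, 0, 0, 1, 0 };   /* which SIs exist on the port */
      struct frame f = { .tags = 1, .outer_dot1ad = 0, .priority_tag = 0 };
      printf("frame classified to match option %d\n", classify(&f, enabled));
      return 0;
  }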

3.1.1.2 VLAN Tag Manipulation

You can enable automatic VLAN tag modification for packets between SIs.
Under a service instance VLAN rewrite configuration, use the ingress and
egress commands to modify the Layer 2 tags of an incoming packet. Possible
tag operations are push, pop, and swap (a sketch of these operations follows
the guidelines below). VLAN tag manipulation guidelines:

• VLAN tag rewrites can be performed for the two outer tags only.

• An individual circuit can be configured with a maximum of two ingress


rewrites and two egress rewrites at any time.

• The following constructs are valid for push and swap operations only:

  - dot1q vlan-id

  - dot1ad tag

  - priority-tagged

• Tags swapped and pushed by the router must match the egress side
match options.
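
The following minimal C sketch models the push, pop, and swap operations on a
small VLAN tag stack. The structure and functions are illustrative only; the
real rewrite rules are programmed into the forwarding plane.

  #include <stdio.h>

  #define MAX_TAGS 4   /* a service instance can see up to four stacked VLANs */

  /* A very small model of the outer VLAN tag stack on a packet. */
  struct tag_stack {
      int depth;
      unsigned short vid[MAX_TAGS];   /* vid[0] is the outermost tag */
  };

  /* Push a new outermost tag. */
  static void push(struct tag_stack *s, unsigned short vid)
  {
      if (s->depth >= MAX_TAGS)
          return;
      for (int i = s->depth; i > 0; i--)
          s->vid[i] = s->vid[i - 1];
      s->vid[0] = vid;
      s->depth++;
  }

  /* Pop the outermost tag. */
  static void pop(struct tag_stack *s)
  {
      for (int i = 1; i < s->depth; i++)
          s->vid[i - 1] = s->vid[i];
      if (s->depth > 0)
          s->depth--;
  }

  /* Swap (rewrite) the outermost tag in place. */
  static void swap(struct tag_stack *s, unsigned short vid)
  {
      s->vid[0] = vid;
  }

  int main(void)
  {
      struct tag_stack s = { .depth = 1, .vid = { 100 } };
      push(&s, 200);   /* ingress rewrite: add an outer S-VLAN             */
      swap(&s, 300);   /* egress rewrite: replace the outer tag            */
      pop(&s);         /* egress rewrite: strip it again before forwarding */
      printf("depth %d, outer VID %u\n", s.depth, s.vid[0]);
      return 0;
  }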

3.1.1.3 VPWS on PPA3LP Cards

The SSR supports NGL2 over PPA3LP cards, including a subset of the
cross-connection types (untagged packets are not supported):

• Match dot1q x which accepts tag type 8100, plus a single tag X

• Match dot1q * which accepts tag type 8100, plus any single tag in the
range from 1 to 4095

• Match dot1ad x which accepts a port ether type tag, plus a single tag X

• Match dot1ad * which accepts a port ether type tag, plus a single tag
in the range from 1 to 4095


• Match dot1ad x:y which accepts an outer tag X with port ethertype and
inner tag Y with type 8100.

Packets having more than two tags are also accepted in this case, but only
two tags (X:Y and X:Y:*) are checked. Note: This is equivalent to transport
pvc X:Y in SmartEdge.

• A specific combination of dot1q X and dot1ad X:*, to emulate the SmartEdge
transport pvc X configuration.

• Match dot1ad x:y1-y2 which accepts an outer tag X with port


ethertype and an inner tag y1 to y2 with type 8100.

Packets having more than two tags are also accepted, but only two tags
are checked.

The following egress rewrite swap rules are supported on PPA3LP-based
cross-connections:

• The SSR NGL2 supports the egress rewrite swap rule only for the top-most tag.

• The egress rewrite swap rule for the innermost tag must support the
configuration Match dot1ad x:*, which accepts an outer tag X with port
ethertype and an inner tag in the range from 1 to 4095 with type 8100. Packets
having more than two tags are also accepted, but only two tags are checked.

Ingress rewrite rules are also supported on PPA3LP.

QoS MDRR queueing is supported for NGL2 over PPA3LP based cards, as well
as the following QoS services that are supported on the other SSR line cards:

• Policing, inherited policing, and hierarchical policing on ingress

• Metering, inherited metering, and hierarchical metering on egress

• Priority weighted fair queuing (PWFQ) for bandwidth management on


egress

• Overhead profiles on egress

• QoS priority on ingress

• Ingress and egress PD QoS priority propagation

3.1.1.4 Supporting Architecture

The following SSR components support NGL2 VPWS:

The LP Proxy Layer daemon (PLd) transparently passes messages without


change to the (platform layer) PFEs.


The LM, RIB, and XC modules interact with PPA3LP to maintain the VLL to
PFE mappings. When changes occur, ISM informs LM, RIB, and XC; then:

• LM calculates new slot masks from the ISM message details and a
download is triggered. LM uses slot-based downloads for both ingress and
egress entries. Learned PFE masks are downloaded for LAG adjacencies
(FIB nexthop and LM adjacency) in nexthop/adjacency messages.

• RIB calculates new slot masks and stores them. Learned PFE masks are
sent in egress PFE download objects (FIB nexthop and RIB adjacency) in
nexthop/adjacency messages.

• For physical circuits, the PFE determines whether the message is applicable
to it. If the message belongs to it, the PFE processes it (add, delete, or
modify); if it does not, the PFE discards it.

• For pseudo circuits, XCd provides the PFE mask, and the PPA does a
lookup on it to decide whether to process (add, delete, or modify) or drop
the message.

3.1.2 XCd and Ethernet VLAN—Functional Description

When you configure a cross-connection (XC), the XC daemon (XCd) stores


configuration information in the RCM database in the controller card. From the
RCM database:

• The ISM module picks up all the circuit-related information and sends it
down to the Iface process in the line card using IPC.

• The XCd process on the RPSW card picks up all the XCd
configuration-related information (that needs to be applied to a
circuit/interface, for example binding a bypass) and sends it to the XCd
module on the line card using IPC.

The FABL software running on the line card processes the messages received
from the RP and calls the appropriate APIs for configuring hardware. For XCd,
the Iface module receives the circuit information from the ISM process running
on the RPSW card. The Iface process communicates with the Eth-VLAN
module if the circuit requires handling for VLAN encapsulation. Similarly, the
Iface module communicates with the XCd module running on the line card
when it determines that the given circuit requires XCd-specific handling. Figure
34 shows the steps and modules involved in the creation of XCd and Ethernet
VLANs and illustrates the potential points of failure (PoF) during the creation.


Figure 34 Potential Points of Failure in XCd and Ethernet VLAN Creation

To verify the PoFs, use the commands in Table 3.

Table 3 Points of Failure in Circuit Creation

PoF-1
  Task: Verify that the user configuration is correctly stored on the RPSW
    RCM database.
  Commands: show config dot1q; debug dot1q rcm

  Task: Display where service instances are cross-connected.
  Command: show dot1q service-instance
  Notes: Shows the forwarding agent (Forwarder) that this circuit is bound
    to (usually XC). If so, it is successfully applied in DOT1Q mgr.

  Command: show dot1q support-info
  Notes: Indicates log messages that were sent, which can be:
    88—Send the config info to ISM.
    415—Add match option and send to ISM.
    416—Add rewrite option and send to ISM.
    431—Delete match option and send to ISM.
    432—Delete rewrite option and send to ISM.
    443—x-connect is applied.

  Task: Verify that DOT1Qd has sent all the necessary configuration to ISM.
    You can also double-check using the ISM logs.
  Command: show ism mbe dot1q log cct handle cct detail
  Notes: Look for the ISSU_OBJID_ISM2_MATCH_OPTION_T object, which indicates
    that the match has occurred. hdr_flags 0x03 means that it is an ADD
    activity (hdr_flags 0x09 means that it is a DELETE activity). Note: This
    also applies to VLAN rewrite operations. Use these steps to help identify
    and narrow down the failures from the CLI to DOT1Q to ISM.

PoF-2
  Task: Verify that the RPSW card has sent correct messages to the line card
    and that the messages are received by the line card.
  Command: show ism circuit
  Notes: When the circuit handles are known, you can also verify circuit
    details using the command with the circuit handle.

  Task: Display the IPC messages that ISM has sent down to FABL-IFACE
    (display the list of ISM clients) using the command.
  Command: show ism client
  Notes: When the client names are known, you can display the messages sent
    to the client using the show ism client "IFACE SLOT 01/0" log command.

  Task: Verify that the IPC messages sent by ISM were received by the line
    card.
  Command: show card slot fabl iface log ipc

PoF-3
  Task: Verify that messages are received by the XCd and Eth-VLAN modules.
  Command: show card slot fabl eth-vlan counters iface

  Task: Check the match rules created by the FABL Eth-VLAN module.
  Command: show card slot fabl eth-vlan match-vlans

  Task: Check the rewrite rules created by the FABL Eth-VLAN module.
  Command: show card slot fabl eth-vlan rewrite-rules

PoF-4
  Task: Verify that after FABL completes processing for the
    platform-independent part, FABL sends out the platform-dependent API for
    appropriate hardware configuration.
  Command: show card slot fabl api log control module ether-vlan detail

  Task: Display the API functions sent out by the FABL XCd module.
  Command: show card slot fabl api log control module xcd

  Task: Display the cross-connections created.
  Command: show card slot fabl xcd hidden

PoF-5
  Task: On the line card, verify that the configuration has been installed.
  Commands: show circuit counters; show port counters
  Notes: Enter these commands repeatedly, a few minutes apart, to verify that
    traffic is flowing as expected.

  Command: show card x pfe num circuit
  Notes: You can also use keywords, such as adjacency, acl-image, fib, lfib,
    mfib, next-hop, or qos instead of circuit, to view the contents of the
    PFE tables. Note to Competence: Understanding the output of this command
    requires some knowledge of the NP4 (PFE) tables.

3.2 Circuits
The Ericsson IP operating system on the SSR uses the concept of circuit to
identify logical endpoints for traffic in the forwarding plane. Think of circuits as
light-weight interfaces where services like QoS and ACLs can be applied. The
router uses both physical circuits that correspond to a real endpoint in the
forwarding plane (such as a port or VLAN) and pseudo circuits that correspond to
logical endpoints (such as a link group).

Each circuit has a 64-bit ID that is unique across the SSR chassis. IDs are
assigned dynamically by the control plane and are used by the software to
identify circuits. They appear in the various show commands. The IDs are not
flat 64-bit numbers but have internal structures that are used when displaying
them in show commands; for example, 1/1:511:63:31/1/1/3. The first two
numbers are the slot and the port where the circuit exists, the next number
is a channel that is applicable to only certain types of circuits, and the other
numbers are an internal ID and type information. Physical circuits are always
associated with a real slot and port, whereas pseudo circuits have a slot value
of 255 and the port value provides additional type information for the circuit.

Note: The SSR also supports subscriber circuits for PPP/PPPoE, CLIPS,
and L2TP subscriber sessions. For information about the creation and
termination of these circuits, see Section 3.8.2 on page 151.

The SSR supports the following types of physical circuits:


• Card

• Ethernet port

• 802.1Q VLAN circuit

• 802.1Q Q-in-Q circuit

Table 4 shows the types of logical circuits supported by SSR.

Table 4 SSR Logical Circuit Types


Circuit Type                   Type Number   Internal Name
Null0 interface                1             CCT_PSEUDO_PORT_NULL0
Loopback circuit               2             CCT_PSEUDO_PORT_LOOPBACK
MPLS LSP ip->mpls              3             CCT_PSEUDO_PORT_LSP
GRE tunnel                     4             CCT_PSEUDO_PORT_GRETUN
L2VPN cross connect            12            CCT_PSEUDO_PORT_L2VPN_XC
MPLS LBL mapping               14            CCT_PSEUDO_PORT_MPLS_LBL
Intercontext circuit           19            CCT_PSEUDO_PORT_INTERCONTEXT
IP in IP tunnel                21            CCT_PSEUDO_PORT_TUN_IPIP
Pseudo port for pseudowires    25            CCT_PSEUDO_PORT_PW
IPSec Tunnel                   28            CCT_PSEUDO_PORT_TUN_IPSEC
Dynamic IPsec Tunnel           32            CCT_PSEUDO_PORT_DYN_IPSEC
Pseudo port for link groups    36            CCT_PSEUDO_PORT_LAG


Note: When creating a port and binding an interface to it, the interface is
translated into a unique integer in the box, called the interface grid. An
interface grid looks like 0x1…01 and it keeps increasing. Interfaces are
configured in a context and stored in an interface tree. Circuits also
have grids, but in this case they are called circuit handles, which are
64 bits long and have the following structure:

slot/port/channel/subchannel/[unused]/owner/level/running number

The slot number identifies the line card, and the port number identifies the
port on that line card. The channel number applies only to channelized
media; on the SSR, which supports only Ethernet, it is always 1023
because Ethernet is not channelized. The subchannel number for Ethernet
is 63 or FF. The owner is the process that created the circuit and depends
on the type of the circuit. The level is 0 for a port and 1 for a circuit under
a port (for example, an Ethernet VLAN). The running number distinguishes
circuits at the same level; for example, if there are two VLANs under a port,
only this field differs between their handles. An example circuit handle for
physical port 4 on line card 2 is: 2/4/1023/63/../1/0/1.

To see the entire circuit tree (all the circuit handles), use the show
ism circuit summary command. Circuits are stored in a radix tree
based on the circuit handle. The structure of circuit handles speeds up
searches. For example, if a port down message comes in, you can
walk the circuit structure based only on the first few bits (slot/port) and
view every circuit that is under that specific port. The circuit structure
stores circuit information, which can be displayed by using the show
ip route circuit cct_handle command (for the whole tree or a
specific circuit). Circuits are unique and global on the router. They are
not connected to any context or routing table, but if you type a specific
circuit handle, the operating system displays the context in which the
circuit is used.
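
As an illustration of the structure described above, the following Python sketch parses the textual form of a circuit handle, as shown in show output, into its named fields (slot/port/channel/subchannel/[unused]/owner/level/running number). It is a hypothetical helper for reading show output, not SSR code.

# Illustrative only: parses the textual circuit-handle form used in show output,
# e.g. "2/4/1023/63/../1/0/1".
FIELDS = ("slot", "port", "channel", "subchannel", "unused", "owner", "level", "running")

def parse_cct_handle(text):
    parts = text.split("/")
    if len(parts) != len(FIELDS):
        raise ValueError("unexpected circuit-handle format: %s" % text)
    handle = {}
    for name, value in zip(FIELDS, parts):
        # The unused field is shown as ".." and is kept as-is.
        handle[name] = value if name == "unused" else int(value)
    return handle

if __name__ == "__main__":
    h = parse_cct_handle("2/4/1023/63/../1/0/1")
    # Physical port 4 on line card 2; level 0 indicates a port-level circuit.
    print(h["slot"], h["port"], h["level"])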

For a port up event, ISM sends circuit up messages to every client.

To verify that circuits have been configured on line cards as expected, use any
of the following (hidden) commands:

• show card X fabl qos *

• show card X fabl acl *

• show card x pfe *

3.3 Link Aggregation Groups


Link groups provide increased bandwidth and availability. When a number of
ports are bundled in a link group, the failure or replacement of one link in the
group does not cause the link group to be brought down. Other links accept the
traffic of the link that is out of service. Load balancing and load distribution over
the ports in the link group result in increased bandwidth.


The SSR implementation of LAG is simpler than LAG on the SmartEdge. It
supports both packet-based hashing (as in a trunk link group) and
circuit-based hashing (as in an access-facing link group), and is referred to
internally as “Unified LAG”. It is based on the SmartEdge access LAG internals
in the control plane, but supports a full feature set that can be configured for
either core-facing or access-facing applications.

Link groups support fast failover, as well as the quality of service (QoS) policing,
metering, and queueing features. Although you set the QoS configuration for
link groups at the link-group level in link group configuration mode, policing,
metering, and queueing are performed internally per constituent port.

Typical link group applications include:

• Single-port link group—Migrating services from one slot to another without
impacting services, by first adding the new constituent port to the link group
and then removing the old port from the link group.

• Link redundancy—Using a link group with two ports on the same line card
to provide link redundancy.

• Additional link capacity—Using a link group with multiple ports to carry
traffic. Ports can be on the same line card or spread across line cards.

• Line card redundancy—Spreading a link group with multiple ports among
at least two different line cards to guard against traffic delays resulting
from line card or link failures.

• Node redundancy—Spreading a link group with N ports across two different
chassis to guard against traffic delays resulting from the failure of a node
(MC-LAG).

For configuration information, see Configuring Multichassis LAG .

3.3.1 Circuit Hashing: SPG Assignments


Circuit hashing is implemented with a subprotection group ID (SPG-ID) table.
SPG-IDs are identifiers that assign a circuit or service instance to a specific
link group constituent port. This mapping is used to direct egress traffic for
circuit-hashed circuits and to derive a home slot for data and control traffic when
needed. The SPG-ID table ensures fast failover by minimizing the number
of updates needed when a link fails.

SPGs are allocated per-LAG using the formula (N * (N-1) + 1), where N is the
configured maximum-links parameter for the LAG. Within each LAG, one of
the SPG-IDs (the “+ 1” in the formula) is reserved for mapping some special
circuits, and the (N * (N-1)) is designed such that if one LAG member port fails,
there will be (N-1) entries in the SPG table referring to that failed port, so those
entries can easily be redistributed among the remaining (N-1) available ports.
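
A minimal sketch of this allocation, under assumed table ordering: for maximum-links N, the LAG gets N * (N - 1) + 1 SPG-IDs, one of them reserved for special circuits, and on a port failure the (N - 1) entries pointing at the failed port are spread over the remaining ports. The assignment order shown is hypothetical.

# Illustrative sketch of SPG-ID table sizing and failover redistribution (not SSR code).
def build_spg_table(ports):
    n = len(ports)                      # configured maximum-links
    size = n * (n - 1) + 1              # one extra SPG-ID reserved for special circuits
    table = {}
    for spg_id in range(size - 1):      # data SPG-IDs: each port gets (N - 1) entries
        table[spg_id] = ports[spg_id % n]
    table[size - 1] = ports[0]          # reserved SPG-ID (placement is an assumption)
    return table

def redistribute_on_failure(table, failed_port, remaining_ports):
    # Only the (N - 1) entries referring to the failed port are re-pointed,
    # spread round-robin over the surviving ports.
    i = 0
    for spg_id, port in table.items():
        if port == failed_port:
            table[spg_id] = remaining_ports[i % len(remaining_ports)]
            i += 1
    return table

if __name__ == "__main__":
    ports = ["1/1", "1/2", "2/1"]       # N = 3 -> 3 * 2 + 1 = 7 SPG-IDs
    t = build_spg_table(ports)
    redistribute_on_failure(t, "1/2", ["1/1", "2/1"])
    print(t)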


All traffic in the same subgroup goes through the same port; it is not distributed
by source and destination address as in packet-based load balancing. ISM assigns an SPG-ID
to every LAG circuit, regardless of whether the circuit is packet-hashed or
circuit-hashed. ISM also maintains the mapping from SPG-ID to LAG port
in the SPG-ID table. Assignment of circuits to SPG-IDs is not as simple as
round-robin – a new circuit is assigned the SPG-ID that would lead to the best
balancing. Fair balancing is only considered for circuit-hashed circuits, where
forwarded traffic uses the SPG-IDs. For packet-hashed circuits, the SPG-ID is
only used for injected control traffic, so those circuits are not included in the
balancing calculations. It is possible for all packet-hashed circuits on a given
LAG to be assigned the same SPG-ID.

3.3.2 Load Balancing


Egress traffic to a circuit-hashed LAG circuit is assigned a LAG constituent port
based only on the circuit’s SPG-ID, and thus the control plane is responsible
for the load balancing by balancing the circuit-to-SPG and SPG-to-port
assignments. Balancing is “perfect” when the same number of circuits is
assigned to each constituent, and if each circuit is carrying the same amount of
traffic. However, because perfect conditions are not those found in networks,
the more realistic goal of load balancing is not to distribute an even split of
traffic but rather to ensure that distribution is adequate when hundreds of flows
are egressing a link group.

For packet-hashed traffic, load balancing is a function of the hash key computed
for the packet from various packet header fields and the distribution of LAG
ports in the hash table.

Balancing can be “perfect” when:

• The number of available ports can be divided equally among all rows in the
hash table. For example, a 3-port LAG may not be perfectly distributed in a
table, whereas a 4-port LAG is likely to be, depending on the implementation.

• All ports in the LAG are up.

• The set of input flows (IP headers, either 2-tuple or 5-tuple) hashes equally
to different hash table rows. Generally this is done by building a sequence
of flows with IP addresses varying by 1.

• Packets are the same size.

Load balancing hash keys are computed depending on ingress circuit type, and
node configuration. A key is computed by running an algorithm on a set of
inputs, which are fields copied from the packet header. For example when the
input circuit is configured for L3 forwarding, the key is built with fields from the
L3 (IP) and sometimes Layer 4 (L4) (TCP/UDP) headers. When the input circuit
is configured for L2 forwarding, the key can be built from the L2 header, or we
can configure load balancing hashing to look deeper in the packet. You can
configure more header data to be considered by using the global configuration
mode service load-balance ip command.


Note: When 5-tuple hashing is configured, any non-TCP/UDP packets, or
TCP/UDP fragments, fall back to a basic 3-tuple algorithm.
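
The following Python sketch illustrates the key-construction idea, not the actual NP4 or PPA3LP hardware hash: a 5-tuple key is used when 5-tuple hashing is configured and the packet is TCP or UDP and not a fragment, with fallback to a basic 3-tuple (assumed here to be source address, destination address, and protocol). The hash function, field names, and table layout are assumptions for illustration only.

# Illustrative only: shows the 5-tuple/3-tuple fallback logic described above.
import zlib

def lag_hash_key(pkt, five_tuple_enabled):
    # pkt is a dict with keys: src, dst, proto, sport, dport, is_fragment
    is_tcp_udp = pkt["proto"] in (6, 17) and not pkt["is_fragment"]
    if five_tuple_enabled and is_tcp_udp:
        fields = (pkt["src"], pkt["dst"], pkt["proto"], pkt["sport"], pkt["dport"])
    else:
        # Fallback to a basic 3-tuple (assumed: source, destination, protocol).
        fields = (pkt["src"], pkt["dst"], pkt["proto"])
    return zlib.crc32("|".join(str(f) for f in fields).encode()) & 0xFFFFFFFF

def select_constituent(hash_key, hash_table):
    # hash_table maps a few bits of the hash result to a LAG constituent port.
    row = hash_key & (len(hash_table) - 1)   # table size assumed to be a power of two
    return hash_table[row]

if __name__ == "__main__":
    table = ["A", "B", "C", "A", "B", "C", "A", "B"]   # 3-bit table as in Table 5
    pkt = {"src": "10.1.1.1", "dst": "10.2.2.2", "proto": 17,
           "sport": 4000, "dport": 53, "is_fragment": False}
    print(select_constituent(lag_hash_key(pkt, True), table))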

More detailed knowledge of the expected traffic distribution is helpful when
a user wants to change the default 3-tuple or 5-tuple values. For example,
hashing pseudowire (PW) traffic using customer inner IP header fields will result
in slower performance than hashing based only on labels, so it should only
be used if a finer granularity of distribution is required than label hashing can
provide. This could happen if the number of PWs is small in an L2 topology.
The specific algorithm performed on the selected inputs varies depending on
line card type (for example, NP4 or PPA3LP).

For general load balancing information, see Section 3.7.7 on page 144.

There is a LAG hash table for every LAG that maps hash results to LAG
constituents. LAG topology and configuration influence the efficiency of load
balancing.

To study LAG configuration to improve load balancing, it can be helpful to
view which ports are active in a LAG; use the show link-group detail
command.

Note: When links are added or removed from a LAG via configuration, tables
are also reshuffled to achieve optimal balancing.

Table 5 Packet Hashing Before and After a Link Failure


3-bit Hash Result   Constituent (before failure)   Constituent (after B fails)
0                   A                              A
1                   B                              B -> A
2                   C                              C
3                   A                              A
4                   B                              B -> C
5                   C                              C
6                   A                              A
7                   B                              B -> A

Table 5 assumes that each constituent port is the same speed (GE, 10GE).

Hash keys are built only once for each packet during packet parsing, but some
packets need to use them for more than one path selection. For example, an
IP route may use ECMP, and some or all of the ECMP paths might be over
LAGs. ECMP is very similar to LAG in that some of the bits from the hash result
are used to select a path, but in the ECMP case it is an L3 path as opposed to the
L2 path in LAG. Imagine the case of a 2-path ECMP where the first path leads
to a 2-port LAG, shown in Figure 35.


Figure 35 2-Path ECMP Where The First Path Leads to a 2-port LAG


Assuming a similar table is used for ECMP path selection as for LAG port
selection, if both algorithms use the same hash result, then both algorithms
select the same path. Thus all traffic that hits this LAG for this ECMP route will
use port X, resulting in poor load balancing unless there are many different
routes pointing to this LAG. The solution is to use a different hash result for
different path selections. We do not want to compute a new result using new
fields from the packet, but because the original result is generally 32 bits, we can
build multiple different results from that result. If each result needs to be 8
bits, we can extract four different results from the original result. If the hashing
algorithm is good, its results are well distributed in 32 bits, and thus any N
bits from the 32-bit result should be well distributed.
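
A small sketch of that idea, assuming 8-bit sub-results: different byte slices of one 32-bit hash are used for independent path selections (for example, one slice for ECMP and another for LAG), so the two selections are decorrelated without computing a second hash. The slice width and selection logic are illustrative assumptions.

# Illustrative only: slicing one 32-bit hash result into independent sub-results.
def sub_result(hash32, index, width=8):
    # index 0..3 selects one of four non-overlapping 8-bit slices.
    return (hash32 >> (index * width)) & ((1 << width) - 1)

def pick(path_list, selector):
    return path_list[selector % len(path_list)]

if __name__ == "__main__":
    hash32 = 0xA5C3F017
    ecmp_paths = ["path-1", "path-2"]
    lag_ports = ["X", "Y"]
    # Using different slices avoids always picking the same LAG port for one ECMP path.
    print(pick(ecmp_paths, sub_result(hash32, 0)), pick(lag_ports, sub_result(hash32, 1)))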

3.3.3 LAG Forwarding

3.3.3.1 Ingress Forwarding Flow

Packets are forwarded over a LAG in the following process:

• When the receiving port is in a LAG, the port SRAM contains a LAG flag
and the Constituent ID (CID) for the port.

• An LACP bit indicates whether LACP is active. If it is set for standby, then
only LACP packets are accepted.

• The VLAN demux table for the port leads to the pseudo-circuit.

• During L2/L3/L4 packet parsing, a hash key is built from various header
fields such as the source and destination IP address.

If it is an IP enabled circuit and an IP packet, a FIB lookup is done.

• If the destination interface is bound to a LAG, a LAG next hop is determined.

• The LAG next hop contains a pseudo-adjacency ID, plus either an SPG-ID
(for circuit-hashed circuits) or an LG-ID (for packet-hashed circuits).


• For circuit-hashed circuits, the Destination Cookie and CID are determined
using the SPG ID.

• For packet-hashed circuits, the Destination Cookie and CID are determined
using the LG ID plus 6 bits of the computed hash result.

• The packet is sent to the fabric with the Pseudo-adjacency ID, the
Destination Cookie, and the CID.
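
The ingress decision above can be summarized in the following sketch (illustrative Python, not forwarding-plane code): circuit-hashed traffic resolves the constituent through the SPG-ID table, while packet-hashed traffic uses the LG-ID plus 6 bits of the computed packet hash. The data-structure layout and field names are assumptions.

# Illustrative only: how a LAG constituent is chosen on ingress, per the flow above.
def resolve_lag_egress(next_hop, spg_table, lag_hash_tables, packet_hash):
    if next_hop["mode"] == "circuit-hashed":
        # SPG-ID -> (destination cookie, constituent ID)
        entry = spg_table[next_hop["spg_id"]]
    else:
        # LG-ID plus 6 bits of the computed hash result select the table row.
        table = lag_hash_tables[next_hop["lg_id"]]
        entry = table[packet_hash & 0x3F]
    return next_hop["pseudo_adjacency"], entry["cookie"], entry["cid"]

if __name__ == "__main__":
    spg_table = {2: {"cookie": 0x10, "cid": 1}}
    lag_tables = {0: [{"cookie": 0x20, "cid": i % 2} for i in range(64)]}
    nh = {"mode": "circuit-hashed", "spg_id": 2, "lg_id": 0, "pseudo_adjacency": 0xFF30002D}
    print(resolve_lag_egress(nh, spg_table, lag_tables, packet_hash=0x1A2B))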

3.3.3.2 Egress Forwarding Flow

Packets are sent on the egress path with the following process:


• The pseudo-adjacency is determined using the adjacency-ID in the
inter-card header (ICH).

• The egress circuit is determined from the adjacency.

• The ICH indicates that the egress circuit is a LAG pseudo-circuit and
provides the CID.

• The CID is used as an index to find the per-constituent stats offset, metering
token-bucket (if needed), and queueing details.

• The packet is forwarded to the selected queue.

3.3.4 RIB Route Handling For LAGs



RIB support for LAG is handled differently for trunk LAGs and Access
LAGs. For trunk LAGs, RIB produces multi-adjacency Trunk LAG next hops
(representing the constituents) in the FIBs, as in Figure 36.


Figure 36 Trunk LAG FIB Lookup

RIB supports access LAG with single pseudo-adjacency and SPG-ID based
next hops. FIB entries pointing to such a nexthop assign the data traffic to one
of the constituent ports in the LAG next hop referenced by the SPG-ID. RIB
acts on slot-mask and SPG-ID changes from ISM to update the next hops on
all NPU ingress flows and the adjacencies on the respective slots. No packet
hashing is supported by access LAGs. FIB lookup gives the access LAG next
hop and the corresponding pseudo adjacency (PW-Adj) with the SPG-ID. The
SPG-ID look up (SPG table is present on all the NPUs) gives the physical
adjacency and the packet is sent across the back plane to the egress path with
the PW-ADJ and the physical adjacency.

For access LAGs, RIB produces a single adjacency for each constituent in the
FIBs, as in Figure 37.

Figure 37 Access LAG FIB Lookup

3.3.5 MPLS Over LAG


MPLS (LDP, RSVP and BGP MPLS LSPs) is supported over LAG interfaces.
In the control plane each LSP is represented by a circuit, when the SSR is an
iLER. To improve scaling, LSP circuits are not programmed into line cards;
instead each LSP is represented in the line card by an adjacency on a physical
or pseudo circuit. Thus a circuit bound to an MPLS-enabled L3 interface might
have some static adjacencies, some resolved Layer-3 adjacencies, and some
LSP adjacencies.

Each LSP circuit has the following values in the circuit handle:

• Slot – CCT_PSEUDO_SLOT (255)


• Port – CCT_PSEUDO_PORT_LSP (3)

For MPLS over LAG, the Ethernet circuit is a pseudo circuit, and each LSP
adjacency is a pseudo adjacency capable of transmitting packets to any
constituent on that ePFE.

The pseudo adjacency is stored on all slots in the LAG slot mask.

For Ingress LSPs over LAG interfaces, LSP pseudo-adjacencies are
downloaded to all slots in the LAG slot mask. An LSP next hop is created at all
iPFEs to point to the pseudo adjacency. The next hop also indicates that either
a packet hash or SPG lookup is required to assign a LAG constituent port. The
LSP adjacency can be identified using the show mpls lsp command as a
pseudo-adjacency by 0xFF in the first byte. This indicates that the egress slot
isn’t known and a separate LAG lookup is required; see the following example:

[local]Ericsson#show mpls lsp


Codes : S - MPLS-Static, R - RSVP, L - LDP, B - BGP

Type Endpoint Direct Next-hop Out Label Adjacency


L 1.1.1.1/32 10.1.1.1 524292 0xFF30002d
L 3.3.3.3/32 12.1.1.2 3 0xFF300036

Transit LSPs over LAG interfaces are similar to the ingress case, except that
label mappings pointing to the LSP pseudo adjacencies are stored on all slots
of the ingress path and indicate that either packet hashing or circuit hashing is
needed; see the following example.

[local]Ericsson#show mpls label-mapping


Codes : S - MPLS-Static, R - RSVP, L - LDP, B - BGP

Type In Label Action Direct Next hop Out Label Adjac


L 524293 php 12.1.1.2 3 0xFF30
L 524294 swap 10.1.1.2 524292 0xFF30
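
As a small illustration of reading such output, the sketch below checks whether an adjacency ID is a pseudo-adjacency (0xFF in the first byte, meaning the egress slot is not yet known and a separate LAG lookup is required). The helper function is hypothetical, not a system API.

# Illustrative only: interpret an adjacency ID from show output.
def is_pseudo_adjacency(adj_id):
    # 0xFF in the first (most significant) byte means the egress slot is not known
    # and a separate LAG lookup is required.
    return (adj_id >> 24) & 0xFF == 0xFF

if __name__ == "__main__":
    print(is_pseudo_adjacency(0xFF30002D))   # True; from the show mpls lsp example above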

3.3.6 LAG Constituent to PFE Updates


The following flow diagrams illustrate the interaction of these components to
manage the creation, modification, and deletion of link groups.


Figure 38 Link Group Creation

As illustrated in Figure 38, the following modules interact when a link group
is created in the CLI:

1 A link group is configured with the link-group name command in global
configuration mode, and LG CLI sends an LG create message to LGMgr.

2 LGMgr assigns a valid LG-ID to the link group and creates the
pseudo-circuit handle.

3 LGMgr informs the other managers about the new link group and then
sends the message to LGd with all the LAG attributes.

4 LGd stores the information and sends an IPC message about the new link
group to ISM, which contains the pseudo-circuit handle and the default
attribute flags. It also sends the circuit ethernet config message and
the min-link and max-link configuration messages to ISM. The first time that
the default values are sent, LGd also sends the SPG egress mode flag to
ISM. If not set explicitly, the default mode of round robin is sent. Then the
MBE EOF message is sent to indicate the end of link-group information.

5 ISM stores the information in the configuration database and passes a
subset of the information to the clients (modules) that have registered for
LG information, and the pseudo-circuit handle of the LAG circuit to all of
the clients.

6 ISM also sends the UP/DOWN state of the LAG circuit, the circuit
ethernet config message received from LGd, and SPG group
create message, and the LG config message with the link group flags
to all the clients.

Figure 39 Link Group Deletion

When a link group is deleted by entering the no link-group name command,
the top-level link group is deleted along with the L1 pseudo-circuit and the
L2 circuits, the interfaces are unbound, and any member links and active
constituents are ungrouped.

As illustrated in Figure 39, the following modules interact in the process:

1 When the link group is deleted, LG CLI sends a link-group delete message
to LGMgr.

2 LGMgr informs the other managers about the LAG deletion (with an LG
callback).

3 LGMgr deletes the circuit handle for this LAG from its database and frees
the allocated LGID for reuse.


4 LGMgr informs LGd of the deletion.

5 LGd informs ISM of the LG Ethernet deletion, with the L1 pseudo-circuit
handle. This message unsets LAG attribute flags and tells ISM to delete
the pseudo-circuit handle and the children of the LAG. LGd also sends the
request to unbind all interfaces that are bound directly to the LAG or under
any of the 802.1Q PVCs in the LAG.

6 ISM sends the delete message first for the level 2 circuits under the link
group. ISM then sends the state of all the level 2 interfaces whose unbind
messages were sent by LGd. ISM also sends the currently active link group
configurations to all the clients, then sends an ungroup message for each
constituent on the LAG circuit.

7 ISM sends the delete message first for the level 1 circuits under the link
group. ISM then sends the state of all the level 1 interfaces whose unbind
messages were sent by LGd. ISM also sends the currently active link group
configurations to all the clients, then sends an ungroup message for each
constituent on the LAG circuit.

8 This resets the LAG-related flags for the physical circuits.

When a constituent link is added to a LAG by entering the link-group name
command in port configuration mode, it is aggregated into the link group.
Figure 40 illustrates the interaction of modules that updates the other modules
and clients.


Figure 40 Adding a Constituent Link to a LAG

1 When a new constituent link has been added to a LAG, LG CLI sends
the grouping information to LGMgr.

2 LGMgr informs the other managers (modules) about the addition.

3 LGMgr also sends the information to LGd.

4 LGd sends the link group event with the level 1 pseudo-circuit handle and
the physical link's handle to ISM. Along with this, LGd also sends the LG
Ethernet and circuit configuration messages with Ethernet flags.

5 ISM forwards the LG group message to all the clients. Then ISM forwards
the Ethernet configuration for the member circuit and the state of the
member circuit and port circuit to them. ISM then sends the number of links
that can pass traffic to all the clients.

Figure 41 illustrates the message flow when a constituent link in a LAG is
removed.


Figure 41 Remove a Constituent Link

1 When a constituent link is removed by entering the no link-group name
command in port configuration mode, LG CLI sends an ungroup message
to LGMgr.

2 LGMgr informs the other managers about the deletion, and then sends the
LG ungroup event to ISM with the level 1 pseudo-circuit handle and the
physical link's handle. LGMgr also sends the circuit configuration message
about the deletion to ISM.

3 ISM forwards the LG ungroup message to all the clients and then forwards
the Ethernet configuration for the member circuit to them.

4 ISM also forwards the Ethernet configuration for the parent circuit (without
the deleted link) to all the clients.

When an 802.1Q PVC is created in a link group using the dot1q pvc option
command in link-group configuration mode, a level 2 pseudo circuit under
the link-group is created with the default set of attributes. The addition is
propagated throughout the system with the process illustrated in Figure 42.


Figure 42 An 802.1Q PVC is Added to a Link Group

1 When an 802.1Q PVC is created with dot1q encapsulation in a LAG,
DOT1Qd CLI sends the new configuration details to DOT1QMgr.

2 DOT1QMgr sends the level 2 pseudo-circuit dot1q config message to
DOT1Qd.

3 DOT1Qd sends the level 2 pseudo-circuit configuration details to ISM.

4 ISM sends the level 2 circuit details to all the clients.

5 ISM forwards the dot1q configuration for this circuit to all clients, along
with the state of the circuit.

When an 802.1Q PVC is deleted, the deletion event is propagated throughout
the system with the process illustrated in Figure 43.

Figure 43 An 802.1Q PVC is Deleted


1 When an 802.1Q PVC in a LAG is deleted by entering the no dot1q pvc
command in link-group configuration mode, DOT1Qd CLI sends the
deletion details to DOT1QMgr.

2 DOT1QMgr sends the deletion details to DOT1Qd.

3 DOT1Qd sends the level 2 circuit deletion details to ISM.

4 ISM sends the deletion details to all the clients.

3.3.7 Economical LAG


Note: This feature is only supported on LAG constituents that are hosted
on PPA3LP cards.

The SSR (unified) LAG model consists of two basic modes for load balancing:
packet hashing and circuit hashing (“subscriber protection”). Circuit hashing
has a faster selection mechanism and better failover properties (if replicated),
favoring its use for subscriber-facing routing. However, replicating all L2
circuits for all ports uses too many resources in the forwarding plane if the
number of subscriber circuits is high. To overcome the high resource load
in the forwarding plane, economical mode was introduced for circuit-hashed
circuits (at the price of slower failover handling). The use case for this feature
is networks where scaling is more important than fast failover, for example,
subscriber-facing LAGs.

For configuration information, see Configuring Link Aggregation Groups.

Economical mode achieves lower resource usage by loading L2 circuit data
only onto its home card or PFE for egress forwarding. For ingress flow, a simple
loopback data structure is enough to redirect incoming packets to the home
card or PFE for processing.

Economical LAG has the following limitations:

• Circuit-grouping is not supported

• Virtual ports are not supported, only physical ports

• PWFQ and MDRR queuing policies are not supported on 10GE ports

• All ports in a particular LAG must belong to the same card type (for
example, no mixing 10GE and 1GE ports)

• MPLS is not supported

• BFD is not supported

• IGMP multicast routing is supported, but no other multicast routing
protocols are supported over economical LAG


NGL2 circuits (service instances) under economical LAG are circuit hashed
by default.

QoS services policing, metering, queuing (PWFQ), QoS propagation, overhead
profiles, QoS priority, and class-based rate-limiting are supported on all circuit
types under economical LAG hosted by PPA3LP-based ports. These include
physical ports, static 802.1Q PVCs, PPPoE, CLIPS, and DHCP subscriber
circuits, 802.1Q CCOD circuits, and NGL2 circuits.

Note: For multicast traffic, you must configure “replicate and load-balance”
on the parent circuit to transmit multicast packets on the parent circuit,
with Remote Multicast Replication (RMR); otherwise, the multicast
traffic is transmitted per child circuit.

To diagnose packet distribution for circuit-hashed circuits, a user can determine
the egress port by finding the SPG-ID from the circuit, and then looking up that
SPG-ID in the SPG table at ISM or at any line card, as in the following example:


[local]Ericsson#show circuit lg lg1 vlan-id 10 detail


Circuit: lg id 25 vlan-id 10, internal id: 1/2/5, state: Down
----------------------------------------------------------
. . .
lg_id : 25 spg_id : 2
load-balance : circuit
aps_id : 0 ccct state : INACTIVE
parent_child_cct : Cct invalid
internal handle : 255/22:1:26/1/2/5
constituent ports : 1/10

[local]Ericsson#show card 1 link-group spg-table 2


Slot 1 Ingress:
Size of Sub-protection group table : 8
No entries in sub-protection group table : 2
Sub-protection group table is at : 0x4392a00

Sub-protection group: 2 (Entry: Valid, Active in this slot)


LG id : 0 Flags : 0x05

Active : 1/10 Phy adj : 0x00980004

Slot 1 Egress:
Size of Sub-protection group table : 8
No entries in sub-protection group table : 2
Sub-protection group table is at : 0x43edac0

Sub-protection group: 2 (Entry: Valid, Active in this slot, Phy adj p


LG id : 0 Flags : 0x0d
Active : 1/10 Phy adj : 0x7b1e3260

[local]Ericsson#show card 4 link-group spg-table 2


Slot 4 Ingress:
Size of Sub-protection group table : 8
No entries in sub-protection group table : 2
Sub-protection group table is at : 0xfcc0

Sub-protection group: 2 (Entry: Valid)


LG id : 0 Flags : 0x01
Active : 1/10 Phy adj : 0x00980004

Slot 4 Egress:
Size of Sub-protection group table : 8
No entries in sub-protection group table : 2
Sub-protection group table is at : 0x43ed940

Sub-protection group: 2 (Entry: Valid)


LG id : 0 Flags : 0x01
Active : 1/10 Phy adj : 0x00000000


3.3.7.1 Supporting Architecture

The economical LAG core functionality is performed by the following modules:

Interaction between LG and DOT1Q.

• LG sends the options indicating an economical LAG to DOT1Q.

• DOT1Q sets the LAG options for replicating (to allow circuits to be
replicated on all the constituent ports) and packet-hashing load-balancing.

• DOT1Q sends the configuration parameters to ISM.

• ISM sends a SLOT/PFE mask change event to all modules.

Interactions between DOT1Q and PPA3LP

• DOT1Q processes the L2 PFE mask coming from ISM.

• If there are changes, DOT1Q sends them to PPA3LP to change the slot
masks.

• DOT1Q asks ISM to get the PFE mask from the parent circuit handle.

Role of ISM

• When a LAG in economical mode is configured, LG sends the LAG details
to ISM (economical mode, circuit-hashed, home-slotted).

• ISM downloads all the L1 circuit events to the constituent PFEs registered
with ISM. All L2 circuits created under the LAG are marked by default as
economical and circuit-hashed. All L2 circuit events are sent to the Home
Slot; non-Home Slots only receive L2 circuit configuration events.

• If an operator removes the Home Slot restriction on an L2 802.1Q circuit
under a LAG using the replicate keyword, ISM receives the event and
propagates it to all the constituent PFEs, since each PFE registers with
ISM for circuit updates.

Role of CLS

• CLSmgr receives notice when an interface is bound to a LAG. It determines
the circuit type from the LG information, including economical mode.

• CLSmgr obtains the PFE mask from ISM.

• CLSmgr sends the “bind” message to CLSd.


• When the circuit is unconfigured, CLSmgr removes it from the configuration
database and sends the “unbind” message to CLSd.

• Initially, when CLSd receives the configuration from RCM, CLSd obtains
the PFE mask using the ISM API which enables retrieving the PFE mask
for any type of circuit.

• CLSd sends a message to ISM to “subscribe” to the events for this circuit.
After that, every time the PFE mask for this ULAG changes, CLSd
receives the cct-config event from ISM, containing the new PFE mask for
the circuit.

• When the circuit is unconfigured, CLS “unsubscribes” from the PFE mask
update events from ISM for this circuit. The only significant interface
change required is to introduce the PFE mask into the cct-config event
data (currently it carries only the slot mask).

• ISM sends PPPoE notice that a circuit has been created when a LAG is
created.

• PPPoE receives a circuit up message when a port has been configured for
LAG, and PPPoE marks the circuit as ready.

• When a PADI is received, PPPoE extracts the real physical circuit handle
from the CMSG and uses it to check the PFE session limit by calling
the ISM API.

• When a PADR packet is received, PPPoE sends CCT-create and CCT-Cfg
messages to ISM, and ISM forwards them to all clients, including PPPd.

• PPPd receives the CCT-create and CCT-Cfg messages from ISM, creates a
circuit, and notifies PPPoE that the circuit is ready to negotiate LCP.

• PPPoE receives the PPP message and replies to the client with a PADS.

• PPP receives LCP, CHAP/PAP, IPCP, and IPv6CP packets and negotiates
on the ULAG circuit. To set the output slot in the CMSG, PPP calls an ISM
API with the ULAG handle; ISM returns the active physical slot, which is
filled in the CMSG.

• When a port belonging to the ULAG link group goes down, ISM sends a
circuit down event on the ULAG circuit, and PPP pulls down the PPP link.

• For PPPoE, the PFE complex mask is queried from ISM by passing the real
received circuit handle. PPP uses the PAL layer to add the PFE complex
mask for a pseudo circuit handle.


3.4 Port and Circuit Mirroring


Port mirroring allows an operator to create a copy of all ingress or egress traffic
for a specified physical port to troubleshoot network problems. Because the
ability to mirror traffic to another location represents a potential security risk,
port mirroring can only be configured using the highest CLI security privilege
level (level 15). Port mirroring is supported on all SSR NP4-based line cards.

A stream of mirrored traffic can only be forwarded to one mirror destination.
However, a single mirror destination can receive multiple streams of mirrored
traffic. Traffic can be mirrored to a local or a remote destination.

Port and circuit mirroring are not supported on the 4-port 10GE or the
20-port GE and 2-port 10GE line cards with the PPA3LP NPU.

To prevent mirroring loops, for physical forward output types, you cannot
enable mirroring for a port if it is already in use as a mirror output.

To support local mirroring, this feature includes support for both physical ports
and next-generation Layer 2 (NGL2) service instances as forward output. NGL2
service instance forward outputs allow VLAN tag manipulation on the frame.

This feature also includes support for L2 PW as a forward output. The
pseudowire contains the entire L2 Ethernet frame being mirrored.

Port or circuit mirroring is enabled after the following configuration steps are
completed:

• Configure a port, L2 service instance, or pseudowire instance as a mirror
destination.

• Create a port or circuit mirror policy.

3.4.1 Mirror Destinations


Any of the following can be designated as a mirror destination:

• Ethernet port

• Layer 2 service instance

• Pseudowire instance

• Ethernet port and circuit (circuit mirroring)

The following cannot be designated a mirror destination:

• Mirror source ports (a port cannot be both a source of mirror traffic and
a mirror destination)

• Link groups


• Layer 2 service instances configured within a link group

3.4.2 NPU Processing Flow


Ingress port mirroring replicates all traffic, including control traffic, that is
received from the line. Egress port mirroring replicates all traffic including
control traffic as it was sent to Traffic Manager. Frames may be dropped by
TM decisions in the source stream. Processing is done in the NPU packet
processor, as follows:

• Ingress mirroring is performed as the packet is received from the wire and
includes all packets received from the port before any modifications are
made to the packet. Error packets and exceeding the ICFD queue limit
packets are not mirrored. However, any packet that the NP4 pipeline has
accepted for processing is mirrored, including multicast packets.

• Dropped packet handling discards a number of packet types in TOP Parse
due to checks such as IP header errors or circuit states. However, these
frames are port mirrored. They are not actually discarded in TOP Parse,
but sent to TOP Resolve for a mirroring decision, and then discarded there
if necessary.

• Egress port mirroring is performed as the packet is being transmitted on the
wire. Because TOP Modify runs before the Traffic Management decision
in the NP4 architecture, there may be frames in the original stream that
get dropped in TM that end up in the mirror flow. This is as designed
and deemed acceptable. Egress port mirroring includes packets with all
outgoing modifications and works for all packet types including internally
generated packets, control packets, multicast packets, and fragments.

The Egress Port Mirror Key contains the outgoing port number copied from
the PSID in the Egress Circuit search result. For rate-limiting, the L2 packet
length is also included in the key. The rate-limit is applied to the packet as
it is received from the fabric.

• Mirroring decisions are supported by the NP4 ALd mirror policy cache
(entries are added when the policy is created, updated, or deleted). The
table enables later communication between FABL and ALd, which use the
policy IDs. The policy table includes such data as policy state and ID, the
next hop ID, the forwarding-mirror class, and other policy attributes. Each
PFE has a table and policies are shared by ingress and egress mirroring.

Policy bindings are also maintained in ALd in another table, which includes
such data as the circuit handle and mirror policy handle.

3.5 Routing
On the SSR, IPv4 and IPv6 route information is collected from the different
routing protocols in the Routing Information Base (RIB) on the RPSW controller
card, which calculates the best routes and downloads them to the Forwarding
Information Base (FIB) stored on the line cards.

3.5.1 RIB Functional Architecture

3.5.1.1 RIB Boot

RIB boots occur for a variety of reasons, such as restart of the RIB or ISM
processes or the switchover during the upgrade from one SSR OS release to
another.

The following diagram shows the RIB boot sequence of events:

Figure 44 RIB Boot Sequence

1 RIB picks up stored routes from shared memory.

2 RCM provides RIB configuration information.

3 RIB registers with ISM for messages.

4 ISM sends RIB new circuit and interface data.

5 RIB registers with ARP for routes and messages.

6 ARP starts resolving routes.


7 RIB builds subnets and next hops

8 RIB starts downloading routes to FIB.

3.5.1.2 Route, Prefix, Next-Hop, and Adjacencies

From the RIB perspective, a route is a prefix pointing to one or more next
hops, as seen in Figure 45. Each next hop is a path to the route endpoint or
to another router.

Note: The SSR supports both IPv4 and IPv6 prefixes.

Figure 45 A Route in RIB

As seen in Figure 45, each next-hop has:

• A next-hop key, which consists of the triplet (IP address, circuit handle,
and interface grid).

• A next-hop grid, which is a unique identifier that begins with 0x3, followed
by the next-hop-type, followed by a unique index.

Next-hop types are listed in the grid.h file. There are hundreds of next-hop
types. The two basic types of next-hop are:

• Connected next-hop (0x311)

• Non-connected next-hop (0x312)

• A next-hop adjacency ID.

For each connected next-hop, RIB also creates an adjacency ID. The
next-hop grid is used on ingress by the packet forwarding engine (PFE) and
the adjacency ID is used on egress for indexing next-hops. RIB downloads
the same structure for both ingress and egress. There is a one-to-one
mapping between the next-hop key, the next-hop grid, and the adjacency
ID (if present).


Encapsulated packets (L2VPN, L3VPN, and PW) can have double
adjacencies, where each of the two Ethernet headers has its own
adjacency. There is no adjacency for non-connected next-hops, because
there is no Layer 2 connectivity with them.
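
To make the relationships concrete, the following sketch models a next hop as described above: the key is the (IP address, circuit handle, interface grid) triplet, the grid is a unique identifier carrying the next-hop type, and connected next hops also carry an adjacency ID. The encoding and bit positions shown are illustrative assumptions, not the actual grid layout.

# Illustrative model of the next-hop data described above (field widths are assumptions).
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class NextHopKey:
    ip_address: str          # e.g. "10.10.10.2", or "0.0.0.0" for a locally connected next hop
    circuit_handle: str      # e.g. "2/4/1023/63/../1/0/1"
    interface_grid: int      # e.g. 0x10000001

@dataclass
class NextHop:
    key: NextHopKey
    grid: int                            # next-hop type followed by a unique index
    adjacency_id: Optional[int] = None   # present only for connected next hops

CONNECTED = 0x311
NON_CONNECTED = 0x312

if __name__ == "__main__":
    key = NextHopKey("10.10.10.2", "2/4/1023/63/../1/0/1", 0x10000001)
    nh = NextHop(key=key, grid=0x31100001, adjacency_id=0x80000001)
    print(nh.grid >> 20 == CONNECTED)    # type extraction here is purely illustrative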

3.5.1.3 The Process of Adding a New Route

Adding a new route to RIB always begins with the rib_rt_add() thread, the
main entry point, which is triggered by routing protocols (such as BGP, OSPF,
IS-IS, or RIP) adding routes to the RIB by sending messages to it. RIB has a
route thread that performs a basic IPC message check and finds the client that
is attempting to add the route. If the client is registered, the message is assigned
to that client's queue. Each client has its own queue on top of a scheduler. The
scheduler depletes these queues, taking two IPC messages from each client's
queue in round-robin order. A single IPC message might pack a few hundred
routes.
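
A minimal sketch of that scheduling behavior, using simple Python queues: the route thread enqueues each incoming IPC message on its client's queue, and a scheduler drains the queues round-robin, two messages per client per pass, where each message may carry a batch of routes. The class and method names are hypothetical.

# Illustrative only: per-client queues drained round-robin, 2 IPC messages per turn.
from collections import deque

class RibRouteScheduler:
    def __init__(self, clients):
        self.queues = {c: deque() for c in clients}   # one queue per registered client

    def enqueue(self, client, ipc_message):
        if client not in self.queues:
            raise KeyError("client %s is not registered" % client)
        self.queues[client].append(ipc_message)

    def run_once(self, process_route):
        # One scheduler pass: take up to 2 messages from each client's queue.
        for client, queue in self.queues.items():
            for _ in range(2):
                if not queue:
                    break
                message = queue.popleft()
                for route in message:        # a single IPC message can pack many routes
                    process_route(client, route)

if __name__ == "__main__":
    sched = RibRouteScheduler(["bgp", "ospf", "staticd"])
    sched.enqueue("bgp", [("10.0.0.0/8", "1.1.1.1"), ("10.1.0.0/16", "1.1.1.1")])
    sched.enqueue("ospf", [("192.168.1.0/24", "2.2.2.2")])
    sched.run_once(lambda client, route: print(client, route))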

3.5.1.4 Route Distance

RIB can have multiple paths (next hops) to the same prefix. This occurs when
next hops with the same prefix are added by different routing protocols. The
router saves each path provided by each protocol. For example, OSPF adds
10.10.10.0/24 → 2.2.2.2 and staticd adds 10.10.10.0/24 → 1.1.1.1; paths
to the prefix 10.10.10.0/24 added by different protocols are stored in
different linked lists. Theoretically, the prefix 10.10.10.0/24 can be reached
by an access circuit (AC) path (regular non-LSP or IGP routes), an adjacency
path (ARP entries from directly connected networks in the same subnet), or an
LSP path (MPLS entries). If the prefixes added by different clients collide, the
router stores them in the appropriate linked lists. When the router forwards a
packet and several paths (next hops) to the same prefix are available in the
RIB, the router must select which one to use for that packet.

Which path does the router use? Each path has a distance, that is, a route
priority, and the path with the lowest distance value is selected as the preferred route.
The adjacency-supplied next hop has the lowest distance (0), because packets
need only to be passed through a port on the current router to reach their
destination. The distance parameter sometimes indicates which client created
the next hop, because each protocol has a unique distance default value. The
linked lists are sorted by distance. The link with the shortest distance becomes
the active path. RIB downloads only the active path to the line cards, thus the
collection of prefixes and active paths forms the FIB stored in the line cards.
Every time there is a change in the active path (for example, a new protocol
with lower distance has added the same route), it must be downloaded to
update the FIBs.
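
A simplified sketch of the selection described above: paths for a prefix are kept sorted by distance, the lowest-distance FIB-eligible path becomes active, and the active path is what RIB downloads to the line-card FIBs (the FIB-ineligible exception is covered in the next section). Tie-breaking and ECMP are omitted; the data layout is an assumption.

# Illustrative only: choosing the active path for a prefix by distance.
def add_path(paths, protocol, next_hop, distance, fib_eligible=True):
    paths.append({"protocol": protocol, "next_hop": next_hop,
                  "distance": distance, "fib_eligible": fib_eligible})
    paths.sort(key=lambda p: p["distance"])     # linked list kept sorted by distance

def active_path(paths):
    # Lowest distance wins, but a FIB-ineligible path can never become active.
    for p in paths:
        if p["fib_eligible"]:
            return p
    return None

if __name__ == "__main__":
    paths = []
    add_path(paths, "ospf", "2.2.2.2", 110)
    add_path(paths, "static", "1.1.1.1", 1)
    print(active_path(paths))     # the static path, distance 1, becomes the active path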

3.5.2 FIB Ineligible Routes


Some routes are FIB ineligible, which means that they cannot be active even if
they have the lowest distance. For example, in Figure 46, a route learned from
eBGP is downloaded to FIB. The route learned from LDP is also downloaded
even though it was learned from an LSP path. Although routes from LSP
paths by default are FIB ineligible, all LSP paths must be downloaded to FIB,
because these paths must be available on the line card for use by L2VPN
and L3VPN routes. L2VPN and L3VPN routes are selected by MPLS criteria
and not their distance values.

Figure 46 RIB Prefix Tree

Note: If you have configured the router for load-balancing or load-distribution,
the selection of the next hop is dependent on that configuration and not
solely dependent on the distance parameter.

3.5.3 Interface Subnet Routes and ARP

Interface subnet routes are internal routes that are added automatically by RIB
every time an IP interface is configured and is up. These routes are added
automatically for each IP interface, even if the interface is not used for sending
or receiving traffic yet.

When an interface is bound to a port:

1. If both the port and the interface circuit are up, the interface comes up.

2. If the interface has an IP address, RIB creates a subnet route called the
locally connected route, triggered by the corresponding ISM event, as
shown in Figure 47. At this point, the router doesn’t know whether a router,
a host, or an entire LAN is on the far end of the link.

The ARP resolution process occurs when traffic is being sent or received
in the interface. If there is traffic sent out of the interface, ARP is involved
and tries to resolve the destination IP address. When it is resolved, an
adjacency route is installed. When a routing protocol installs a route that
has this interface as a next hop, the process described in Section 3.5.1.4
on page 115 is used.

3. RIB checks the encapsulation on the connected circuit, which could be
Ethernet, PPPoE, dot1q, and so on.

4. Based on the processes described in Section 3.5.1.1 on page 113 through
Section 3.5.2 on page 115, the RIB creates routes. For example, for
Ethernet and an IP address of 10.10.10.1/24, RIB creates the routes shown
in Table 6.

Figure 47 Resolving Connected Next Hops

Note: The hidden routes (routes that are not immediately visible through
show commands) in Table 6 are created automatically to punt IP
packets sent to specified IP interfaces onward to the RPSW card for
further processing. The next hops of hidden routes have a special
grid (0x314). Their circuit and interface attributes are zeros. For more
information on punted packets, see Section 3.7.4 on page 141.

Table 6 Routes for IP Address 10.10.10.1/24


Route Type     Prefix              Next Hop
(connected)    10.10.10.0/24       Locally connected next hop (unresolved next hop)
(hidden)       10.10.10.1/32       Go up to RPSW (local address)
(hidden)       10.10.10.0/32       Go up to RPSW (localized broadcast)
(hidden)       10.10.10.255/32     Go up to RPSW (localized broadcast)
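
The sketch below generates, for an interface address such as 10.10.10.1/24, the same set of routes shown in Table 6: the connected subnet route plus the hidden host and broadcast routes that punt traffic to the RPSW card. It is purely illustrative; the helper name and output format are assumptions.

# Illustrative only: routes RIB creates when an interface with addr/prefix comes up.
import ipaddress

def interface_subnet_routes(cidr):
    iface = ipaddress.ip_interface(cidr)          # e.g. "10.10.10.1/24"
    net = iface.network
    routes = [(str(net), "connected", "locally connected next hop (unresolved)")]
    routes.append(("%s/32" % iface.ip, "hidden", "go up to RPSW (local address)"))
    routes.append(("%s/32" % net.network_address, "hidden", "go up to RPSW (localized broadcast)"))
    routes.append(("%s/32" % net.broadcast_address, "hidden", "go up to RPSW (localized broadcast)"))
    return routes

if __name__ == "__main__":
    for route in interface_subnet_routes("10.10.10.1/24"):
        print(route)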

The unresolved connected next hop at 10.10.10.0/24 in the first row of Table 6
and illustrated in Figure 47 is resolved by the following process:

Because the locally connected next-hop is connected, the router knows its
circuit and the interface, and RIB creates an adjacency ID for it. However,
its IP address is zero, which means that when a packet destined to a host in
this subnet (for example, 10.10.10.2) is received and the longest prefix match
points to this next hop, the packet forwarding engine (PFE) knows where to
send it but can’t send it yet, because the destination MAC address is unknown
and therefore it cannot create its Layer 2 encapsulation. The PFE sends an
ARP cache miss message to ARPd with the packet's destination address.
ARPd queries RIB for the source address. RIB finds the subnet route entry with
longest prefix match from the interface address, which is sent back to ARP
to use as the source address toward the given destination. ARP creates the
packet and sends it to the Networking Processing Unit (NPU), which forwards it
on the appropriate link. If the destination is alive, it replies with its MAC address,
which goes up to ARPd. Since there is no ARP table on the line card NPU,
ARPd inserts the following adjacency route into RIB (MAC ADD message):

(adjacency route) 10.10.10.2/32 ---> 10.10.10.2

This is called a connected adjacency next hop (resolved next hop), which has
exactly the same circuit handle and interface grid as the locally connected next
hop except that it has a valid MAC address and IP address.

Note: When a connected route is created, RIB downloads both the prefix and
next-hops and the egress adjacencies to the PFEs.

3.5.4 Checking Routes, RIB, and FIB


To verify a packet’s route, you can perform a flood ping with a large number of
packets and huge packet sizes. Use the command show port counters
slot/port before and after the ping to see if the port counters are changed
as expected (sent/received). You can also use the show port counters
packet size num command to verify that the packets are from the SSR and
not from somewhere else. There are buckets for different packet sizes. You
can easily select one that is not quickly changing, and use that size for your
ping packets. If the counters are increasing as expected but you cannot see the
ping replies, it means that the replies are received but being dropped by the
line card, the kernel, or ping. To determine where packets are being dropped,
enter the Linux tcpdump command with the following options: tcpdump
-lexmi lc0 | grep ICMP. This command listens for ICMP packets on
device driver lc0. Netstat provides the device drivers, and each line card
has a number. In the command, lc0 is for slot 1, lce0 is for NPU 0
on slot 1, and so on.

Another technique is to Telnet from both directions (from and to the SSR)
and observe the Telnet packets with the tcpdump command. This technique
works if you are the only person Telneting to the SSR, which is likely. If you
see your packets in tcpdump, RIB is not processing them; if not, you need to
double-check that your routes are installed properly on the line card.

You can also use the hidden show ism client RIB log command to see if
ISM sent the interface up, circuit up, and port up messages to RIB, whether all
three messages were sent, and when they were sent. To filter these messages,
provide the circuit handle in the command: show ism client RIB log
cct handle circuit_handle detail. The output of this command
displays the IP address. If you only know the IP address, the circuit handle can
be derived from the show ip route ip-addr detail command.

The first step in debugging a route is to enter the show ip route prefix
detail command and verify whether or not RIB has the route prefix. If not,
enter the command show {static | ospf | bgp} route prefix to
verify whether or not the routing client has the route.


Note: When debugging routes, using the show ospf route command
queries OSPF for its routes. Using the show ip route ospf
command queries RIB for its OSPF routes.

If RIB has the route, it displays the adjacency. The first byte of the adjacency is
the card number (for example, 0x8 is card 9). If the first byte is 0xff, it means a
pseudocircuit, which can be located on any card. ISM decides on which card to
put the circuit and tells RIB in the slot mask field of the circuit. If the traffic is
congested on that card, ISM relocates it to another card; the slot mask indicates
the actual position. A zero adjacency means it is blackholed. Packets routed to
a blackhole adjacency are discarded.

Verify that a card is installed using the show chassis command. Does it
match the slot/port fields of the circuit handle?

If the cards are installed in the chassis as expected, verify the next hop using
the show ip route next-hop nh-hex-id detail. The command
output displays the next-hop IP address, the resolved MAC address for that
address, the interface grid, the FIB card bits, and the reference counters, which
indicate how many routes and next hops are pointing to this next hop. The FIB
card bits indicate on which cards the next hop has been downloaded. Using
the binary representation of these bits, the upper bits indicate the egress slots,
while the lower bits indicate which ingress cards have it. A 1 in a binary position
means that the card with that number has the next hop.

To verify that the MAC address matches the one that ARPd has for that specific
next hop IP address, enter the show arp cache command. If the destination
host is down, there is no resolved MAC address. If you have a resolved MAC
address but RIB does not have the adjacency, there is a miscommunication
between ARP and RIB.

The interface grid identifies the interface that is bound to a specific circuit. To
see what interface status RIB has, use the show ip route interface
iface_grid command. The show ism interface iface_grid command
provides the same information from the ISM perspective. In these commands
the iface_grid argument is the internal interface ID in hexadecimal numbering.

To verify what information the FIB has regarding the destination IP address,
use the show card x fabl fib * command. The output provides detail
on what RIB has downloaded, including the /32 entry for the adjacency. If that
is not present, verify the next-hop grid that was downloaded using the show
card x fabl fib nexthop {all | NH-ID} command. The output
displays the valid IP address, circuit handle, and interface for all next hops in
FIB. To verify that it has an adjacency ID and a MAC address, use the show
card x fabl fib adjacency {all | adj-id} command. If the route
does not have an adjacency or MAC address, then the problem may be ARPd.

Note: The adjacency index follows the card number in the adjacency ID. For
example, if the adjacency is 0x80000001 (card 9), the adjacency index
is 0x0000001.


If adjacencies or next hops are missing from the FIB, check the logs to see
what has been downloaded by RIB using the show card x fabl fib log
[rib] [ingress | egress] command. Note the message timestamps, which
may indicate when FIB received a route entry from RIB.

3.5.5 Static Routing Sequences


Instead of dynamically selecting the best route to a destination, you can
configure one or more static routes to the destination. Once configured, a static
route stays in the routing table indefinitely. When multiple static routes are
configured for a single destination and the outbound interface of the current
static route goes down, a backup static route is activated, improving network
reliability.

You can configure up to eight static routes for a single destination. Among
multiple routes with the same destination, the preferred routes are selected in
the following order:

1 The route with the shortest distance value is preferred.

2 If two or more routes have the same distance and cost values, they are
installed together as an equal-cost multipath (ECMP) route.

When static routes are redistributed through dynamic routing protocols, the
protocols ignore the cost value assigned to those routes, and only the active
static route to a destination is advertised.
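
The selection order above can be pictured with a small Python sketch. This is not the RIB implementation; it only illustrates the effect of the stated rules: the lowest distance wins, and routes that tie on distance and cost form an ECMP set, while higher-distance routes remain as backups.

    from collections import namedtuple

    StaticRoute = namedtuple("StaticRoute", "next_hop distance cost")

    def select_active_routes(candidates):
        """Return the active (possibly ECMP) set of static routes to one destination."""
        if not candidates:
            return []
        best_distance = min(r.distance for r in candidates)
        shortlist = [r for r in candidates if r.distance == best_distance]
        best_cost = min(r.cost for r in shortlist)
        return [r for r in shortlist if r.cost == best_cost]

    routes = [
        StaticRoute("10.1.1.1", distance=1, cost=0),
        StaticRoute("10.1.2.1", distance=1, cost=0),   # ties: shares load (ECMP)
        StaticRoute("10.1.3.1", distance=5, cost=0),   # backup, activated on failure
    ]
    print(select_active_routes(routes))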

The SSR resolves next hops for the routes, adds them to RIB, and downloads
them to FIB. For the example in Figure 48, an interface with IP address
10.10.10.1/24 is bound, and the 10.10.10.0/24 route to the locally connected
next hop (unresolved next hop) is downloaded by RIB to line cards.

Figure 48 Next-Hop Resolution

Next hops are resolved with the following process (a minimal sketch of the
egress-side resolution logic follows the list):

1 An ingress packet destined for 10.10.10.2 arrives.

2 The ingress FIB lookup results in an unresolved next hop. The packet is
forwarded to egress.
3 Because the 10.10.10.2 address is not resolved, a message is sent from
egress forwarding to ARPd to resolve 10.10.10.2. The packet is buffered at
egress.

4 ARPd sends a query to RIBd for the interface source address.


5 RIBd responds to the query with the source IP address.

6 ARPd sends a broadcast ARP request through the packet I/O.

7 ARPd forwards the broadcast ARP request through egress to the physical
output.

8 The ARP reply is received on physical input.


9 ARP reply is sent through packet I/O to ARPd.

10 ARPd sends a message to RIBd with the MAC address resolved for
10.10.10.2.

11 RIBd downloads the resolved MAC address for 10.10.10.2 to egress.
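
The following Python sketch models the egress-side behavior in the list above: packets for an unresolved next hop are buffered, a single ARP request is triggered, and the buffered packets are released once the resolved MAC address is downloaded to egress. All names and structures are hypothetical simplifications of that buffering-and-resolve pattern.

    class EgressNextHop:
        """Toy model of buffer-until-resolved forwarding (names are hypothetical)."""

        def __init__(self, ip):
            self.ip = ip
            self.mac = None          # unresolved until ARP completes
            self.pending = []        # packets buffered at egress

        def forward(self, packet, send_arp_request):
            if self.mac is None:
                self.pending.append(packet)      # buffer at egress
                if len(self.pending) == 1:       # first miss triggers resolution
                    send_arp_request(self.ip)
                return []
            return [(packet, self.mac)]

        def arp_resolved(self, mac):
            """Called when the resolved MAC is downloaded to egress."""
            self.mac = mac
            ready, self.pending = self.pending, []
            return [(p, mac) for p in ready]

    nh = EgressNextHop("10.10.10.2")
    nh.forward("pkt1", send_arp_request=lambda ip: print("ARP who-has", ip))
    print(nh.arp_resolved("00:11:22:33:44:55"))   # flushes the buffered packet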

Figure 49 illustrates the steps to add the static route to RIB and download
it to FIB.

Figure 49 Adding a Static Route to RIB

The static route (with the same prerequisites as the static routing sequences) is
added with the following steps:

1 STATICd queries and registers for next-hop 10.10.10.2 for static route
30.1.1.1.

2 If the next hop exists, RIBd sends it in response to STATICd. Otherwise,
RIBd keeps STATICd registered to receive information about the next hop
if it is added at a later point.

3 When the next hop has been added, STATICd sends a message to RIBd to
add the static route.

4 RIBd sends a message to Forwarding to download the route to the FIB.

If the next hop is unresolved when a packet arrives matching the static route,
the next-hop IP address is resolved using ARP, as shown in Figure 48.

To verify that static routes have been added to RIB and downloaded to FIB,
use the show ip route command.

3.5.6 IPv6
To enable the SSR to serve as an IPv6-aware node, the SSR platform
supports configuring IPv6 interfaces toward the neighboring nodes and
enabling IPv6 address-based routing and forwarding. The router supports
IPv6 address-based routing protocols (OSPFv3, IS-IS, RIPng, and BGP), IPv6
neighbor discovery (ND), IPv6 link-local address configuration and forwarding
support, and 6PE and 6VPE support. Support for tunneling mechanisms like
6PE, 6VPE, and IPv6 over IPv4 enables customers to route IPv6 packets
through intermediate IPv4 address-based networks.

IPv6 configuration is supported for all physical and link group circuit types. Dual
stack, with simultaneous IPv4 and IPv6 configuration, is also supported for these
circuit types. IPv6 is also supported on GRE circuits for tunneling IPv6 packets
in IPv4 GRE tunnels, as well as on IP in IP tunnel circuits for 6in4 tunneling.

For an overview of the supported features and restrictions, see Ericsson IP
Operating System Release 13B.

IPv6 and IPv4 functionality share a common architecture based on the
ISM interface and circuit support (see Section 2.2.2.19 on page 53), RIB
infrastructure (see Section 2.2.2.40 on page 67), and FABL and forwarding (see
Section 2.2.2.15 on page 49) support on the line cards. The NPU forwarding
process has been significantly extended with IPv6-specific optimizations that
maintain packet forwarding rates comparable to IPv4 despite the larger address
size of IPv6 packets.

IPv6 and RIB—All IPv6 routing protocols and IPv6 static routes install routes
with RIB. As with IPv4 routes, RIB qualifies the best possible routes and
installs them into the routing table. The responsibilities of the IPv6 RIB include:

• Installing the next hop and the route, in that order

• Adding, updating, and deleting routes and next hops

• Qualifying and installing the ECMP

• Installing the next hops and chains, if required

• Re-downloading routes and next hops at FABL/LC restart (ALd restarts
are handled by FABL). Because the SSR RIB implementation does not
support selective IPv6 FIBs for line cards, the IPv6 FIB is the same for
all the line cards.

3.5.6.1 IPv6 Link-Local Addressing

Link-local addresses have link scope, meaning they are relevant over a single
link. Packets with source or destination link-local addresses are not forwarded
by routers to another node. IPv6 link-local support allows customers to
configure link-local addresses per interface or let the system assign a link-local
address per interface automatically.

Link-local addresses enable a node to start up statelessly, without any
preconfiguration, and automatically assign itself an address. Unlike IPv4, IPv6
requires a link-local address to be assigned to every network interface on which
the IPv6 protocol is enabled, even when one or more routable addresses are also
assigned. The IPv6 address block fe80::/10 is reserved for link-local addressing.
Different techniques can be used to prepare a link-local address per interface
by combining fe80::/10 with an identifier, for example, the MAC address in
modified EUI-64 format.
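
As an illustration of the modified EUI-64 technique mentioned above, the following Python sketch derives a link-local address from a MAC address by flipping the universal/local bit and inserting FF:FE in the middle of the identifier. It is a simplified standalone example, not router code.

    def mac_to_link_local(mac):
        """Derive an fe80::/10 link-local address from a MAC (modified EUI-64)."""
        octets = [int(b, 16) for b in mac.split(":")]
        octets[0] ^= 0x02                          # flip the universal/local bit
        eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]
        groups = ["%x" % ((eui64[i] << 8) | eui64[i + 1]) for i in range(0, 8, 2)]
        return "fe80::" + ":".join(groups)

    print(mac_to_link_local("00:11:22:33:44:55"))  # fe80::211:22ff:fe33:4455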

RPSW applications, such as OSPFv3, ND, or PING, need to be able to send
traffic using IPv6 link-local addressing. Applications sending a packet destined
to an IPv6 link-local address must attach the circuit handle to make sure the
packet is sent to the correct line card and link. The correct slot and link are
determined from the circuit handle for both physical and pseudo circuits.

A remote link-local address is uniquely identified in the system by pairing it
with the circuit used to reach the remote link-local address. Since link-local
address configurations are also allowed on multi-bind interfaces, the system
internally uses a local interface ID paired with a link-local address to identify
local link-local addresses.

RIB sends all link-local next hops, both local and remote, to all line cards. RIB
re-downloads them to all the next hop’s Home Slots whenever next-hop circuit
slot masks change. FABL downloads all remote link-local addresses and only
line card-specific local link-local next hops to forwarding.

FABL keeps a link-local database to resolve link-local addresses into next-hop
IDs with the following information:

• Link-local address and circuit handle to next hop, for remote destinations

• Link-local address and interface ID to next hop, for local destinations

FABL stores both remote and local link-local table entries, whereas fast-path
forwarding stores only the local link-local table. The fast-path forwarding
link-local table of local destinations is used to perform source link-local address
validation for incoming packets, to ensure that only correctly addressed link-local
packets are sent to the RPSW.

For control-plane packets with unresolved link-local destination addresses,
FABL does a lookup using the zero IPv6 address and passes the packet to
forwarding. The packet hits the interface adjacency and triggers neighbor
discovery for the remote link-local IP address.

The FABL link-local database is kept in shared memory. Insertion and deletion
of entries of the database is done only by the FABL-FIB process. Other
processes have read-only permission for lookup.
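
The lookup behavior described above can be pictured with a small Python sketch: a remote table keyed by link-local address and circuit handle, a local table keyed by link-local address and interface ID, and a fallback that returns the interface next hop so that neighbor discovery can be triggered. The dictionaries, keys, and identifiers are illustrative only, not the shared-memory layout.

    # Illustrative keys only; the real database lives in shared memory and is
    # maintained by the FABL-FIB process.
    remote_table = {("fe80::300:50", 0x40080001): "remote-nexthop-id"}
    local_table = {("fe80::1", 7): "local-nexthop-id"}
    interface_nexthop = "interface-nexthop-id"        # zero-address fallback

    def resolve_link_local(addr, circuit_handle, interface_id):
        nh = remote_table.get((addr, circuit_handle))   # remote: addr + circuit
        if nh is None:
            nh = local_table.get((addr, interface_id))  # local: addr + interface
        if nh is None:
            nh = interface_nexthop   # unresolved: hits the interface adjacency,
                                     # which triggers neighbor discovery
        return nh

    print(resolve_link_local("fe80::300:50", 0x40080001, 7))   # remote hit
    print(resolve_link_local("fe80::abcd", 0x40080001, 7))     # falls back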

The SSR supports ping to both remote and local link-local destinations. In the
ping ipv6 command, the circuit handle must be included with the link-local
address to identify a remote link-local destination. For more information, see
Command List.

Figure 50 illustrates the process for pinging a remote link-local address whose
link-layer (MAC) address is not yet resolved:

Figure 50 IPv6 Ping: Resolving Link-Local MAC Address

For example:

1 A user sends an IPv6 ping packet to FABL PAKIO with link-local destination
address fe80::300:50 and circuit handle 0x40080001.

2 FABL PAKIO performs a series of link-local specific lookups:

a Link-local lookup with the destination address and context ID as the key.
Because the MAC address is not resolved, the lookup fails.

b Redoes the lookup with the destination address and interface ID to check
for a local link-local address. If the lookup fails, then:

c Redoes the lookup with the zero IPv6 address and context ID as the key.
This returns the interface next-hop ID.

3 FABL PAKIO forwards the packet with the link-local next hop to the NPU.

4 The NPU forwards the packet, with its destination unresolved, to FABL ND
to request resolution.

5 FABL ND stores and throttles the packet for resolution and sends an ND
Resolve Request to NDd for the destination address.

6 NDd sends an ND Neighbor Solicitation (NS) message to FABL PAKIO with
destination address ff02::1:ff00:3.

7 FABL PAKIO forwards the NS packet to NPU.

8 The NPU sends the NS packet to the next hop, which returns an ND Neighbor
Advertisement (NA) packet.

9 The NPU forwards the ND NA packet back to FABL PAKIO, which sends it
to NDd.

10 The link-local destination is resolved into the MAC next hop for the
destination IPv6 address. RIB installs it to FABL and FIB.

11 FABL ND receives the MAC-resolved message from FABL-FIB. It sends
the stored ICMP ping packet to the NPU.

12 The NPU carries out an ICMP ping exchange with the next hop to verify it,
and forwards the ICMP Reply message to FABL-PAKIO, which forwards
it back to the originator.

3.5.6.2 Dual Stack Support

Physical and link group pseudo circuits can be configured to carry both IPv4
and IPv6 traffic at the same time. Such circuits are called dual stack circuits. A
single circuit handle will represent state and counters required to support both
IPv4 and IPv6 traffic. This support is provided when a single interface has both
IPv4 and IPv6 addresses configured. Both single bind and multi-bind interfaces
with such configurations are supported. Each of the addresses (IPv4 and IPv6)
can be added or removed without affecting traffic for the other address type.
For example, a circuit can move between single stack and dual stack without
affecting traffic flow for the address family that is not modified. Dual stack
configuration is supported over unified LAG circuits in both packet hashed and
circuit pinned mode. Separate ingress and egress counters are maintained for
IPv6 and IPv4 traffic per circuit when a circuit is configured as dual stack.

3.6 MPLS
IP routing suffers from four delays: propagation, queueing, processing, and
transmission. These delays make IP-routed traffic irregular, which makes plain
IP routing poorly suited to multimedia applications, especially voice calls. The
largest delay is processing, because lookups in the tree that stores the routing
table take time. MPLS and the label concept were invented to address this
problem.

3.6.1 MPLS Packet Header, LSPs, and Packet Forwarding


As shown in Figure 51, the MPLS header with a label is inserted before the IP
header, between L2 and L3; it is sometimes called a Layer 2.5 protocol. The
label tells the router where to forward the packet in the MPLS network. The
label database is small, and looking up the label in an array is extremely fast.

In MPLS, the complete analysis of the packet header is performed only once,
when it enters an MPLS-enabled network. At each incoming (ingress) point of
the MPLS-enabled network, packets are assigned a label stack by an edge
label-switched router (LSR). Packets are forwarded along a label-switched
path (LSP), where each LSR makes forwarding decisions based on the label
stack information. LSPs are unidirectional. When an LSP is set up, each LSR
is programmed with the label stack operations (for example, pop-and-lookup,
swap, or push) that it is to perform for that label.

Once an MPLS router receives a packet with a label, the SSR performs a
lookup on the label and decides what to do. If it is an intermediate router along
the LSP path, it swaps the label with a different one and forwards the packet
unless the next hop is the edge of the LSP path. In the latter case, the router
pops off the label and forwards the packet to the edge, which does not have to
do label lookup but only pure IP routing. This is called penultimate hop popping
(PHP). Logically the MPLS tunnel terminates at the edge, even though the
eLER (edge LER) does no label handling.

At the egress point, an edge LSR removes the label and forwards (for example,
through IP routing longest prefix match lookup) the packet to its destination.
MPLS uses RSVP, LDP, or BGP to communicate labels and their meanings
among LSRs.
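
The per-label decision at an LSR can be sketched in a few lines of Python. The label values and the LFIB layout below are invented for illustration; they only show the push/swap/pop style of operations described above, not the actual data plane structures.

    # Hypothetical LFIB: incoming label -> (operation, outgoing label, next hop).
    lfib = {
        100: ("swap", 200, "routerB"),   # intermediate LSR along the LSP
        200: ("swap", 300, "routerC"),
        300: ("pop", None, "routerD"),   # penultimate hop popping (PHP)
    }

    def lsr_forward(label_stack, lfib):
        """Apply the LFIB operation for the packet's outer label."""
        op, out_label, next_hop = lfib[label_stack[0]]
        if op == "swap":
            return [out_label] + label_stack[1:], next_hop
        if op == "pop":
            return label_stack[1:], next_hop   # empty stack -> plain IP routing
        raise ValueError("unknown operation")

    stack, nh = lsr_forward([100], lfib)
    print(stack, nh)        # [200] routerB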

Figure 51 MPLS LSP Tunnel and Label Switching

3.6.2 MPLS TTL


The default behavior ensures that the time-to-live (TTL) value is properly
decremented by performing the following operations (sketched after the list):

• At the ingress LSR, the IP TTL field is decremented and propagated to the
MPLS TTL field located in the label header.

• The MPLS TTL field is decremented at each hop in the LSP.

• At the egress LSR, the MPLS TTL field replaces the IP TTL field, and the
label is popped.
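
The TTL handling above amounts to a simple copy-and-decrement sequence, sketched below in Python for a single packet crossing one LSP. This is an illustration of the default behavior only.

    def ingress_ler(ip_ttl):
        """Decrement the IP TTL and propagate it into the MPLS TTL field."""
        ip_ttl -= 1
        return ip_ttl, ip_ttl          # (IP TTL, MPLS TTL in the label header)

    def transit_lsr(mpls_ttl):
        return mpls_ttl - 1            # MPLS TTL decremented at each hop

    def egress_ler(ip_ttl, mpls_ttl):
        return mpls_ttl                # MPLS TTL replaces the IP TTL; label popped

    ip_ttl, mpls_ttl = ingress_ler(64)
    mpls_ttl = transit_lsr(mpls_ttl)
    mpls_ttl = transit_lsr(mpls_ttl)
    print(egress_ler(ip_ttl, mpls_ttl))   # 61: one ingress hop plus two transit hops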

3.6.3 LSP Management


LSPs can be dynamically established and managed with Resource Reservation
Protocol (RSVP) or Label Distribution Protocol (LDP) or statically configured
and managed.

To dynamically establish LSP paths, use LDP in downstream unsolicited
mode. LDP exchanges labels between MPLS routers automatically. LDP
always uses the shortest path (that is, the shortest IGP path created by IS-IS
or OSPF). However, the shortest route is not always the best solution. For
example, if the MPLS network has a huge traffic flow, routers along the shortest
path would be overloaded, while other routers in the network would be idle.
Load balancing in this case would need to have LSPs that utilize these other
routers.

3.6.4 MPLS-TE
Another situation where shortest path LSP is insufficient is when QoS is
imposed on the network and the routers involved in traffic need to reserve
resources for the flows. This process is called MPLS traffic engineering
(MPLS-TE). RSVP is used for MPLS-TE scenarios. RSVP also can handle
failures by installing backup paths that handle traffic when a failure in the
primary path is detected. The backup LSP feature of RSVP is called fast reroute
(FRR). Backup paths can be set to bypass a node, a link, or even a full LSP. To
detect failures, network administrators can use MPLS ping or configure virtual
circuit connectivity verification (VCCV) or Bidirectional Forwarding Detection
(BFD). Of these methods, BFD provides the fastest automatic detection of
MPLS connection failures.

3.6.5 Next-Hop Fast Reroute


Next-hop fast reroute (FRR) is a feature that allows quick re-routing of MPLS
traffic in the event of a link or node failure by creating a bypass RSVP LSP for
link or node protection.

In Figure 52, suppose that an LSP (ABCD) is configured between nodes A and
D. If a link or node fails along the LSP, traffic could be dropped until an operator
reconfigures (which is a slow mechanism) the LSP around the problem (for
example, AEFD if the fault is between B and C).

Figure 52 Network Configured for Next-Hop Fast Reroute

For a link-protection bypass LSP, an operator configures, in advance, a bypass
LSP (AEB) in case a link failure occurs (between A and B). The bypass LSP
stays inactive (not used) until the link goes down (between A and B). As soon
as the link is detected down, the bypass LSP is immediately used (that is,
traffic flows through the AEBCD path). The link-protection bypass LSP can be
configured on an LER (AEB) or LSR (BEC).

For a node-protection bypass LSP, an operator configures, in advance, a
bypass LSP (AEC) in case a node failure occurs (for example, to B). The
bypass LSP stays inactive (not used) until the node goes down. As soon
as the node is detected down, the bypass LSP is immediately used (that is,
traffic flows through the AECD path). The node-protection bypass LSP can be
configured on an LER (AEC, in case B fails) or LSR (BEFD, in case C fails).

A bypass LSP functions exactly like any other RSVP LSP except that it does not
carry traffic under normal conditions. When a link or node failure is detected,
traffic is quickly rerouted onto a bypass RSVP to circumvent the failure.

Note: The time needed by node-protection to switch traffic to a next-next-hop
bypass LSP can be significantly longer than the time needed by
link-protection to switch traffic to a next-hop bypass LSP. Link
protection relies on a hardware mechanism to detect a link failure,
allowing it to quickly switch traffic to a next-hop bypass LSP. Node
protection relies on the receipt of Hello messages (such as BFD
messages) from a neighboring router to determine whether it is still
functioning. The time it takes node protection to divert traffic depends
on how often the node router sends Hello messages and how long it
takes the node-protected router to react to not having received a Hello
message. However, once the failure is detected, traffic can be quickly
diverted to the next-next-hop bypass LSP.

Another fast reroute mechanism is provided for end-to-end backup of an LSP,
called backup LSP. A backup LSP is usually configured between end nodes
as a backup (such as AEFD) to the primary LSP (in this case, ABCD) in case
the primary LSP fails.

3.6.6 VPN Services, PWs, and L2VPN


LSPs are commonly used to provide Virtual Private Networking (VPN) services.
VPN services transparently connect remote networks that belong to the same
customer, keeping those networks private from other customers. To allow IP
address overlapping in different VPNs, another MPLS label must be pushed
after the first one, which creates a tunnel inside the LSP tunnel. This tunnel is
called a pseudowire (PW). The first MPLS header is the outer label, and the
other header is the inner label. The inner label is popped by the remote edge
router, and the packet is forwarded to the appropriate VPN as is, without any
IP routing. In Layer 2 VPN (L2VPN) configurations like the one shown in Figure
53, an attachment circuit forwards packets through an interface to the inner
tunnel (the L2 “wire”).

The inner label can be configured statically, or configured by LDP in targeted
mode. If LDP configures the inner label used by the PW, a TCP session is
established between the edge routers, and they exchange an inner label for
every attachment circuit.

Figure 53 Layer 2 VPN in an MPLS Network

A Virtual Private Network (VPN) is a network in which customer connectivity
among multiple remote sites is deployed across a shared central infrastructure
that provides the same access or security as a private network. A PW is a
mechanism that emulates the attributes and functions of a technology such
as Ethernet connectivity over a WAN. L2VPNs allow circuit-to-PW switching,
where traffic reaching the PE node is tunneled over a PW and, conversely,
traffic arriving over the PW is sent out over the associated circuit. In Figure 53,
both ends of a PW are connected to circuits. Traffic received by the circuit is
tunneled over the PW or, using local switching (also known as circuit-to-circuit
cross-connect), the circuit switches packets or frames to another circuit
attached to the same PE node.

For additional information on L2VPN, see Section 3.1 on page 81.

3.6.7 L3VPN

The L2VPN concept is easily extended to L3 by introducing IP routing at the
L2VPN edge. In L3VPN, BGP is configured to exchange one label per route.
In L2VPN, whatever comes in on the left of the MPLS cloud goes out on the
right, whereas in L3VPN each prefix selects the PW.

The SSR supports L3VPNs in which PE routers maintain a separate VPN
context for each VPN connection. Each customer connection, such as virtual
LAN (VLAN), is mapped to a specific VPN context. Multiple ports on a PE
router can be associated with a single VPN context. PE routers advertise VPN
routes learned from CE routers using internal Border Gateway Protocol (iBGP).
MPLS is used to forward VPN data traffic across the provider’s backbone. The
ingress PE router functions as the iLER, and the egress PE router functions as
the egress LER.

3.6.8 MPLS High-Level Design

Figure 54 shows the high-level design of the MPLS functionality in the Ericsson
IP Operating System.

3.6.8.1 MPLS High Level Design Overview

The design is centered on the Label Manager (LM) multi-threaded process.
The LM process is primarily responsible for handling MPLS configurations
from CLI and label requests/reservations from the MPLS protocols such as
LDP and RSVP. It also configures the MPLS related data structures in the
dataplane such as MPLS adjacencies and Label Map entries. Figure 54 also
shows the processes involved in the control plane configuration and the various
exported and imported APIs.

The MPLS functionality handles MPLS configurations such as:

• Creating an MPLS router (instance)

• Enabling LDP

• Configuring static/dynamic LSPs

• Configuring L2VPNs

• Configuring RSVP-TE tunnels

These MPLS configurations are mainly stored in the LM, RSVP, and LDP
processes and are used to configure the various line cards. Thus, multiple
processes cooperate to configure MPLS LSPs and the dataplane for MPLS
forwarding. For example, RIB installs the FEC entry (route pointing to an MPLS
next hop), whereas LM installs the NHLFEs. LM is also the control plane entity
handling the MPLS-ping and MPLS-traceroute. LDP and RSVP are processes
that handle label distribution. BGP can also interact with LM to configure
L3VPNs. The major LM data structures and their relationships are used to store
the information passed from RCM or communicated from other processes such
as ISM (for example, circuit state), LDP (for example, PW inner label), and RIB
(for example, next-hop related information).

Figure 54 illustrates the MPLS information flow.

Figure 54 MPLS Label Management

3.6.8.2 Control Plane and Forwarding Plane High-Level Design

The router's MPLS functionality is implemented in both the Ericsson IP
Operating System control plane and the line card forwarding plane. The MPLS
packet forwarding is done in the forwarding plane, which is configured through
the control plane. The MPLS functionality interacts with RIB to set up LSPs and
with ISM to set up and configure MPLS circuits.

MPLS implementation is managed in both the control plane and the data plane:

• The control plane runs the protocols such as LDP, BGP, and RSVP-TE and
configures the data plane through LM.

• The data plane handles switching labeled packets and tagging/un-tagging
packets.

3.6.8.3 LSP Management High-Level Design

LSPs are typically set up dynamically through LDP or RSVP, but they can also
be set up through static configuration. The SSR supports a single platform-wide
label space that is partitioned per application (for example, LDP and RSVP). An
LM library is linked per application, which facilitates the label allocation. The
applications later install the allocated label to LM via the LM API. Although
LSPs or MPLS tunnels may utilize label stacks that are more than one label
deep (for example, PWs consisting of an outer tunnel label and an inner PW ID
label), this section describes setting up only the outer-label tunnel LSP.

The SSR supports the following ways to configure such tunnel LSPs:

• Static—An operator manually configures the label values and label stack
operations (push, pop, swap) to use on a specific interface, along with all
the needed information such as the egress IP of the LSP, the next-hop IP
address, and so on. The configuration is forwarded from CLI to RCM to
the MPLS-static daemon, which uses the LM API to configure the LSP
through the LM daemon.

• Dynamic through LDP—For Ethernet interfaces, LDP runs in
Downstream Unsolicited (DU) mode with independent label distribution
and liberal label retention. The SEOS LDP implementation does not
support split horizon for learned FEC-label maps. What is learned from
an LDP speaker is resent back to it. The implementation always gives
preference to the local label over the learned one in order to prevent loops.
Once LDP learns a specific prefix-label entry, it makes sure that the prefix
is installed as an IGP route (by periodically querying RIB) before creating
the dynamic LSP for that prefix through the LM API (LM daemon).

• Dynamic through RSVP-TE—RSVP Traffic Engineering trunks or LSPs
are partially configured by the operator, who specifies the desired bandwidth
and egress IP address of the TE trunk. RSVP then negotiates the
bandwidth reservations and the labels throughout the path of the LSP and
configures the LSP on the local node through the LM API (LM daemon).
For an ingress LER, RSVP LSPs always result in a circuit creation through
ISM (one per LSP) because RSVP LSPs are TE trunks that require
circuit-like statistics, whereas LDP LSPs do not result in circuit creation.
For an LSR, LSPs never result in circuit creation. In addition, the number
of RSVP LSPs is typically much smaller than the number of LDP LSPs, so
for scalability SEOS does not create circuits for LDP LSPs. RSVP also
has the capability to configure bypass LSPs (for local node or link repair)
and backup LSPs (for end-to-end LSP bypass). The backup and bypass
RSVP LSPs can be used to provide MPLS fast reroute protection with
assist from the data plane. For more information on MPLS fast reroute,
see Section 3.6.8.6 on page 135.

3.6.8.4 Load Balancing High-Level Design

An LSP can be set up over a Link Aggregation Group (LAG) where different
circuits over different line cards are participating in the LAG. For that topology,
an ingress next-hop structure (FTN case) or an ILM entry can point to an
array of adjacency IDs belonging to the same or different egress line cards.
The label to be used is the same across the LAG (the same label is used for
all adjacencies) and the data plane load-balances flows across the various
adjacencies/circuits in the LAG.

ECMP can also be used between two endpoints when several LSPs are set
up with equal cost between these two endpoints. RIB configures an array of
next hops for the same prefix. Each next hop points to an MPLS adjacency
holding the label for that LSP.

3.6.8.5 MPLS-TE High-Level Design

From the dataplane perspective, there is no difference whether the LSP is set
up statically or dynamically. Packets are received on a specific port and a
specific circuit (for example, tagged or untagged VLAN).

Figure 55 MPLS Topology

Typical packet forwarding cases:

• Ingress Label Edge Router (iLER, for example, router A in Figure
55)—Non-labeled (for example, IPv4) packets arrive on a specific circuit
where they are submitted to a FIB lookup based on the destination IP
address, which results in a next-hop structure pointing to an MPLS
adjacency. The FIB table holds the longest prefix match mappings to next
hop entries. On the egress side, PFE identifies the adjacency, based on the
adjacency ID passed along with the packet as metadata from the ingress
PFE, holding the label to be pushed on that packet.

• Label Switching Router (LSR, for example, router B in Figure 55)—
MPLS-labeled packets arrive on an MPLS-enabled circuit where they are
submitted to a Label Forwarding Information Base (LFIB) lookup based
on the packet’s outer label, which results in an adjacency ID identifying
the egress adjacency. The LFIB table holds the label map along with the
operation (pop, swap, and so on) to be performed for each matched label
entry. On the egress side, the PFE identifies the adjacency passed along
the packet from the ingress PFE, which holds the label to be swapped
for that packet.

• Egress Label Edge Router (e-LER, for example, router D in Figure
55)—This case is similar to the LSR case, where packets arrive labeled
and an LFIB lookup is performed. However, the difference for e-LER is
that a pop operation is performed on the packets to remove the MPLS
label and forward the encapsulated L3 packet. In a pop-and-forward,
the LFIB entry also holds the adjacency ID for which this packet is to be
forwarded. However, the egress adjacency is not an MPLS adjacency but
a regular IPv4 adjacency. In a pop-and-lookup, the label is popped and a
route lookup in the local context is performed on the encapsulated packet
header to determine the next hop.

3.6.8.6 Next-Hop FRR High-Level Design

PAd monitors only ports and not individual circuits.

The forwarding abstraction layer (FABL) constantly monitors the state of each
port (updated by the drivers in PAd) referenced by the adjacency associated
with a next-hop structure (FTN case) or an ILM entry. When the egress circuit
of the main LSP goes down, FABL converts a port failure to a failure of the
circuits that are created on the failed port. If an alternative adjacency is
configured (for either bypass or backup LSPs since FABL does not differentiate
between the two), FABL immediately swaps the main LSP adjacency with the
configured backup adjacency.
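
A minimal Python sketch of the swap described above is shown below: a forwarding entry tracks the port behind its main adjacency and, when that port goes down, replaces the adjacency with the pre-installed backup (if one is configured). All structures and names are hypothetical simplifications.

    class NhlfeEntry:
        """Toy FTN/ILM entry with an optional pre-installed backup adjacency."""

        def __init__(self, main_adj, backup_adj=None):
            self.active = main_adj
            self.backup = backup_adj

        def port_state_change(self, port, is_up, adj_port):
            # A port failure is treated as a failure of the adjacency created
            # on that port; swap in the backup adjacency if one exists.
            if not is_up and adj_port.get(self.active) == port and self.backup:
                self.active, self.backup = self.backup, self.active

    adj_port = {"adj-main": "eth1/1", "adj-bypass": "eth2/3"}
    entry = NhlfeEntry("adj-main", backup_adj="adj-bypass")
    entry.port_state_change("eth1/1", is_up=False, adj_port=adj_port)
    print(entry.active)     # adj-bypass: traffic is rerouted immediately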

In the control plane, the configuration of a bypass LSP is different from that of a
backup LSP. Even though a bypass LSP is created by RSVP through LM, when
a bypass LSP is configured in the data plane, LM notifies RSVP about that LSP.
Upon receiving the notice, RSVP rechecks its database for all eligible LSPs
that can benefit from the created bypass LSP. RSVP reconfigures all eligible
LSPs with NFRR with backup adjacencies (all eligible LSPs are automatically
reconfigured, not reinstalled). For a backup LSP, RSVP instructs LM to append
a backup LSP to an existing LSP. When RSVP realizes that the primary LSP
has failed, the backup LSP is reinstalled to LM as the primary LSP. The
configuration supports two backup LSPs for the same LSP. However, RSVP
downloads only one backup per primary LSP depending on availability (for
example, if the first backup fails, RSVP replaces it with the second backup
LSP). Finally, if both a backup LSP and a bypass LSP are configured for the
same primary LSP, RSVP configures only the backup LSP.

3.7 Forwarding
The SSR 8000 introduces a forwarding abstraction layer (FABL) to scale the
function across different line cards. The FABL is network processor-independent
and communicates with the control plane processes on the route processor
(RP). It also communicates with an adaptation layer daemon (ALd) on the line
card LP that contains and isolates network processor-dependent functions.
The combination of FABL and ALd maintains the data structures used by the
NPU to control forwarding. The NPU is responsible for forwarding packets to
other line cards, receiving control packets sent to the SSR, and sending control
packets originated by the SSR. The SSR supports unicast, multicast, punted,
SSC-punted, and host-sourced packet flows.

SSR line cards and SSC cards support the following forwarding types:

• Unicast flow—The forwarding process on each line card performs packet
processing functions such as FIB lookup for the longest prefix match
with the destination IP address, and QoS classification for both fast data
traffic and slower control traffic, such as ICMP, VRRP, or BFD messages.
Packets are forwarded using several components, including the packet
forwarding engine (PFE) on the NPU, LP on the line cards, and RP on the
controller card.

• Multicast flow—An application multicast group associates each multicast
packet to a set of outgoing circuits. Depending on the distribution of the
outgoing circuits, the multicast packets are replicated either in the fabric
element (FE), egress PFE, or both.

• Packet steering flow—Traffic destined to an SSC card is forwarded from
the ingress line card to the SSC based on its hosted next hop or traffic slice
next hop information. The SSC can forward its traffic either to another SSC
using a hosted next hop or to an egress line card using FIB forwarding.

Figure 56 illustrates the SSR forwarding adaptation layer architecture. This
adaptation layer is uniform across both I/O line cards (40x1GE and 10x10GE)
and the SSC.

Figure 56 SSR Forwarding Architecture

The SSR line cards and the SSC card also support the following control
information flow types:

• Punt flow—Management and control path packets destined to the control
processors are punted from the forwarding path to the LP. Punt packets are
punted from the PFE to LP memory using the direct memory access (DMA)
engine in the PFE over a Peripheral Component Interconnect Express
(PCI-E) interface. The packets are received by the platform adaptation layer
daemon (ALd) and passed to processes in the FABL. These processes
may in turn send the packets to the RP using the internal GE network.

• Host-sourced flow—Packets sourced from either the LP or the RP (via the
LP) can enter the forwarding plane in the ingress or egress path. Packets
transmitted to the PFE follow a similar but reversed path. FABL processes
originate or relay packets to the ALd, which directs the packets toward the
PFE over the PCI-E interface using DMA.

• SSC Traffic Slice Management—To support traffic steering over SSC, SSR
introduces traffic slice management (TSM) to install either target next hop
or traffic slice next hop information to the forwarding plane in addition to
using FIB lookup information.

Figure 57 illustrates the control information flow on SSR line cards.

Figure 57 SSR Control Packet Path

3.7.1 Unicast Packet Forwarding

In unicast forwarding, packets are forwarded from one line card to another
across the switch fabric. Figure 58 illustrates the packet flow path of an incoming
unicast packet from an ingress interface to an egress interface. IPv4 and IPv6
unicast IP addresses are supported.

Figure 58 Unicast Packet Forwarding

The forwarding path of unicast packets is summarized in the following steps:

1. Input packets are terminated on the ingress interface. Packets entering
the ingress classification unit (ICU) of the PFE are identified as a forward
packet and stored in the ingress PFE internal frame memory.

2. Task optimized processor (TOP) stages of the ingress PFE process packets
with microcode processing, which includes forwarding functions such as
FIB lookup and ingress services such as rate limiting, QoS propagation,
and ACL marking. At the end of TOP processing, the egress port of the
packet is determined.

3. The PFE encapsulates the packets with an incoming traffic management
header (ITHM) and enqueues the packets to one of the virtual output
queues (VOQs) in the ingress fabric access processor (FAP).

4. Packets are scheduled from the VOQ of the ingress FAP (iFAP) to the
egress FAP (eFAP). iFAP converts the ITHM to the fabric traffic management
(TM) header (FTMH), and the FE uses the FTMH to route the packets to
the destination eFAP.

5. The eFAP transfers the packets to the PFE. The packet then enters the
PFE following the processing flow in Step 1.

6. After microcode processing in the egress TOP stages, packets are
enqueued to the egress PFE TM, which buffers the packets in its external
DRAM memory and schedules the packet to the egress MAC interface. The
egress MAC interface adds a CRC header to the packet before sending it
to the wire.

Note: Each PFE can simultaneously process ingress and egress traffic.
Packets flowing through the PFE use the same TOP stages but
undergo different processing flows. Each FAP also simultaneously
supports ingress and egress traffic.

Note: The controller card is not involved in transit traffic forwarding.


Each line card maintains its own FIB to make routing decisions.
Therefore, during a controller card switchover, traffic continues to
be forwarded.

3.7.2 Multicast Packet Forwarding


In multicast forwarding, packets are forwarded from one line card to two or
more other line cards across the switch fabric. For multicast, SSR supports only
IPv4 in release 12.2.

Figure 59 illustrates the multicast packet flow.

Figure 59 Multicast Packet Flow

The forwarding path of multicast packets is summarized in the following steps:

1. As ingress packets enter the PFE, they are classified as multicast packets.

2. Packets go through a similar microcode processing as in Step 2 for
forwarding unicast packets. However, forwarding lookup is based on a
multicast-specific forwarding cache (also known as MFIB). The multicast
forwarding lookup on a multicast destination address returns a Multicast
Adjacency ID and a fabric multicast group ID (FMGID) for the destination
eFAP group.

3. At the end of TOP processing, PFE places the Multicast Adjacency ID in the
ITHM header. The packet is then enqueued in one of the fabric multicast
queues (FMQ) in iFAP. iFAP includes an FMGID and indicates that the
packet is of type multicast in the FTMH header.

4. The FE in the switch card receives the packets and transmits them to all
eFAPs specified in the FMGID.

5. The eFAP transfers the packets to the egress PFE. The egress PFE
identifies the packets as multicast from the forwarding header and performs
a lookup on the Multicast Adjacency ID. The results include the number of
port replications and a list of adjacency IDs.

6. The original ingress packet enters ingress classification frame descriptor
(ICFD) block multicast processing. The main difference is that in this
multicast mode, multiple frame descriptors are created for the same copy
of the packet in frame memory.

7. Replicated packets re-enter TOP stages for microcode processing similar
to Step 6 of the unicast forward flow. At the end of processing, the multicast
packets are sent to the requested interfaces.

3.7.3 Traffic Slice Packet Forwarding


Figure 60 illustrates the packet steering flow path.

Figure 60 Packet Steering Flow

The processing path of forwarded packets is summarized in the following
processes:

• Ingress packet slice processing—Follows the same steps as the unicast
packet forwarding path. The exception occurs during FIB lookup, where
packet lookup would yield either hosted next hops or traffic slice next hops
with the Traffic Slice ID (TS ID) to determine the target SSC information.
The packet is then encapsulated with a fabric header and forwarded to the
target SSC over the fabric.

• SSC packet receiving—Each SSC card has a fabric connection using
its on-board FAP. When a packet arrives at the SSC, it is queued in the
MAC device driver and forwarded to the application located in one of the
two possible CPUs.

• Inter-SSC packet steering—Depending on the application, some packets
are forwarded from one SSC to another SSC for further processing. The
SSR creates a Traffic Slice Forwarding Table (TSFT) in each line card to
steer packets from the line card to the SSCs. The SSCs return traffic to a line
card using FABL-FIB or to another SSC using the TSFT.

• Egress packet steering—At the last SSC, packets are enqueued to the
egress queue toward line cards. The forwarding module then forwards the
packet to an egress line card based on regular FIB lookup.

When SSR nodes are configured in an ICR pair (in the BGP-based model),
TSM packet steering changes to a more complex model. In this case, packets
are steered to specific SSCs by using service maps as well as multiple TSFT
tables (dynamically created). For a diagram of the steering flow, see Figure 87.

3.7.4 Punted Packet Processing


The ingress punt path is shown in Figure 57 and summarized in the following
steps:

1. Packets are identified as control packets. Ingress circuit information is
used to determine if the packets are to be accepted.

2. After microcode processing in the egress TOP stages, packets are
prepended with a Host Receive Header (HRH). This header contains
information about the incoming port and circuit, the type of punted packet,
and so on. The packets then receive an additional Network Processor TM
header (NTMH) and are sent to the PFE TM.

3. The packets are delivered to the LP across the PCI-E interface.

In the egress punt path, the punt destination is based on egress adjacency
lookup. The processing flow is the same as that of the ingress punt path.

The following types of packets can be punted to the LP from the ingress path:

• Network control packets (for example, BGP, OSPF, BPDU, ARP, ICMP,
and IGMP)

• Slow path forwarding packets (for example, tunnel fragments, BFD)

• Packets destined for the SSR (for example, Telnet, FTP, SNMP)

• FIB or ACL lookups that indicate a punt

Packets with an adjacency that has an unresolved ARP can be punted from
the egress path.

FABL processes on the LP either handle the packets or send them to the RP.

3.7.5 Host-Sourced Packet Processing

Figure 57 illustrates the processing path for packets sourced from either the
LP or the RP. The corresponding processing paths are summarized in the
following steps.

On the ingress path:

1. The LP prepends the packet with a Host Transmit Header (HTH) and then
transmits the packet toward the PFE. Source packets are buffered in the
PFE TM and then delivered to the TOP stages for further processing.

2. Unlike forwarding packets, sourced packets entering the TOP stages are
not subjected to ACL or rate limiting. Sourced packets primarily undergo
FIB lookup in the ingress microcode processing path.

3. At the end of TOP stages, the packet is queued to the VOQ of IFAP in the
same way as Step 3 for the unicast forwarding flow.

Note: The priority in the Host Transmit Header is preserved while queuing
to VOQ.

On the egress path:

1. See Step 1 for the ingress path, with the exception that FIB lookup is not
required, and the prepended HTH also includes forwarding information
such as the next-hop adjacency and priority.

2. No ACL or rate limiting is applied to sourced packets in the egress
microcode processing path. This includes sourced packets coming directly
from the host or indirectly from the fabric.

Sourced packets can also be directly enqueued from LP to the egress
TM queue.

3. See Step 6 for unicast forwarding flow, with the exception that the egress
PFE TM sends the packet to the egress MAC interface, which in turn sends
the packets to the interface.

3.7.6 Forwarding Redundancy

The SSR 8000 family architecture separates the control and forwarding planes.

The separation of the route processing and control functions (performed by the
operating system software running on the controller card) from the forwarding
function (performed on the individual traffic cards) provides the following
benefits:

• Dedicated route processing functions are not affected by heavy traffic,
and dedicated packet forwarding is not affected by routing instability in
the network.

• The architecture enables line-rate forwarding on all traffic cards. New
features can be added to the control software on the controller without
affecting forwarding performance.

• The architecture provides non-stop forwarding during system upgrades or
reloads, and the traffic cards continue to forward packets.

In software, the router also supports the following types of redundant routes:

• Link resiliency in Layer 2 topologies using link aggregation (LAG), MC-LAG,
or VPWS PW redundancy. For more information, see Configuring Link
Aggregation Groups, Configuring Multichassis LAG, and Configuring
VPWS (L2VPN).

• MPLS route redundancy using LDP. For more information, see Configuring
MPLS.

• Multipaths in networks using BGP route advertisement. For more
information, see Configuring BGP.

3.7.7 Load Balancing

The SSR supports traffic load balancing in the following redundant topologies:

• MPLS networks with label edge router (LER) and label-switched path (LSP)
transit node (P router) configurations

• Networks using BGP route advertisement

• Link aggregation group (LAG) configurations (for more information, see
Configuring Link Aggregation Groups)

• Multichassis LAG configurations (for more information, see Configuring
Multichassis LAG)

Load balancing in these topologies is described in the following sections.

Note: Traffic in these topologies is load-balanced on up to 16 equal-cost links.
If you configure more than 16 links, the links in excess of 16 are not
included in the load-balancing algorithms.

3.7.7.1 Load Balancing in LER and Transit Router Configurations

Table 7 provides details of load balancing for traffic in LER configurations (see
RFC 6178).

Table 8 provides details of load balancing for traffic in P router configurations
using Label Distribution Protocol (LDP) (see RFC 4364).

3.7.7.2 Load-Balancing Configuration Options

By default, load-balancing algorithms (also known as hashing) are based on
Layer 3 data, which includes only the source and destination IP addresses and
the protocol type. This is also known as the 3-tuple hashing option.

You can change the load balance hashing. For example:

• To use Layer 4 data (the 5-tuple hashing option, which includes the source
and destination IP addresses, the IP protocol, and the source and
destination ports for User Datagram Protocol (UDP) or Transmission
Control Protocol (TCP) streams) in the algorithm, use the service
load-balance ip command with the layer-4 keyword.

Note: The router automatically reverts to L3 hashing for the following
packet types:

• Non-UDP or TCP packets

• Packets with IP options

• Fragmented packets

• To return hashing to the default 3-tuple hashing, use the service
load-balance ip command with the layer-3 keyword.

For load balancing configuration instructions, see Load Balancing.

To enable equal cost multipath (ECMP) on P routers, in which LDP is used
for label distribution, use the ecmp-transit configuration command in LDP
router configuration mode.
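
The difference between the default 3-tuple hashing and the layer-4 (5-tuple) option, including the automatic fallback for fragmented and non-UDP/TCP packets, can be sketched as follows. The packet representation and link selection are simplified illustrations, not the hardware algorithm.

    def hash_key(pkt, layer4=False):
        """Build the load-balancing key: 3-tuple by default, 5-tuple if layer4."""
        key = (pkt["src_ip"], pkt["dst_ip"], pkt["proto"])
        use_l4 = (layer4 and pkt["proto"] in ("tcp", "udp")
                  and not pkt.get("fragment") and not pkt.get("ip_options"))
        if use_l4:
            key += (pkt["src_port"], pkt["dst_port"])
        return key

    def pick_link(pkt, num_links, layer4=False):
        return hash(hash_key(pkt, layer4)) % num_links

    pkt = {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "proto": "tcp",
           "src_port": 1024, "dst_port": 80}
    print(pick_link(pkt, num_links=4, layer4=True))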

Table 7 describes the hashing for LER load balancing for different applications.

Table 7 LER Load Balancing

Applications              Traffic Type    Hashing
Plain IP                  IP              IP source and destination and the protocol type. (1)
                          IP Multicast    IP source and destination and the protocol type.
GRE with IP over IP       IP              Inner IP source and destination and the protocol type. (1)
IPsec tunnel              IP              Inner IP source and destination and the protocol type. (1)
                                          If the IPsec tunnel is configured through link groups,
                                          the hashing is based on IP source and destination
                                          and the protocol type.
IP-based MPLS (L3VPN)     IP              IP source and destination and the protocol type. (1)
VPWS pseudowire (L2VPN)   L2 Ethernet     Source and destination MAC address in the Layer 2
                                          Ethernet header.

(1) If the 5-tuple hashing option is configured, load balancing is based on IP source, IP destination,
    protocol type, UDP or TCP source port, and UDP or TCP destination port.

Load balancing hashing for egress LERs (eLERs) depends on the topology of
the network into which packets are being forwarded. For example, if traffic is
being forwarded:

• As IP ECMP traffic, hashing is based on L3 information.

• Into LSPs in another MPLS cloud (for example, through an L3VPN),
hashing is based on the same rules as in Table 7.

• Into a LAG path with multiple links, LAG hashing is used. For more
information, see Configuring Link Aggregation Groups.

Table 8 describes P router load balancing for label-switched traffic.

Table 8 Transit (P) Router Load Balancing: Label-Switched Traffic

Applications    Traffic Type     Hashing
All             IP and non-IP    Up to 4 labels in the label stack

Note: On P routers, non-labeled IP traffic is either load balanced by the
default IP hashing (IP source and destination and the protocol type)
or an algorithm configured using the service load-balance ip
command.

3.7.7.3 Load Balancing in Networks Using BGP

You can configure load balancing in networks using BGP route advertisement
by using the BGP multipath capabilities. By default, BGP multipath is disabled,
which means that BGP installs a single path in the RIB for each destination.
If that path fails and no other protocol has installed a route for that prefix, traffic
destined for that prefix is lost until a path is available again.

When BGP multipath is enabled with the multi-paths command, BGP installs
multiple best equal-cost paths in the routing table for load-balancing traffic to
BGP destinations. With multipath, the paths can be:

• All iBGP (configured with the multi-paths internal path-num
command)

• All eBGP (configured with the multi-paths external path-num
command)

• In a VPN context, a combination of iBGP and eBGP, where only one eBGP
path is allowed and the number of allowed iBGP equal-cost paths is equal to
the maximum number of paths allowed (configured with the multi-paths
eibgp path-num command) minus one. For example, if you configure
eibgp 7, six iBGP paths and one eBGP path are installed in the RIB.

When BGP multipath capabilities are enabled, even though multiple paths
are installed in the RIB, BGP advertises only one path (the BGP best path)
to its peers.

3.7.7.4 Testing Load Balancing

The SSR supports standard industry hashing algorithms (XOR and folding of
the fields that are inputs to the hashing). The algorithm works well under the
following two conditions, which should apply to normal network traffic:

• Random input fields from different traffic streams. If multiple fields are used,
changes in those fields between different streams should not be correlated.

• Large enough streams to achieve statistical average.

In testing, hashing might not work well if there is correlation between the
multiple fields used in the hashing. For example, using field1 (one byte) and
field2 (one byte) to generate the hash key (= field1 XOR field2) for testing
purposes, traffic streams can be generated in the following ways:

1 Field 2 stays the same and field 1 is incremented by 1 for each stream
(a typical testing method). This achieves a good hashing result. It is not
exactly random, but it is simple and good for one-field hashing.

2 Both field 1 and field 2 are incremented by 1 for each stream. This may
look random, but it is not, because 0 XOR 0 = 1 XOR 1 = 2 XOR 2 = 0,
so all the hashing keys are the same. Starting field 1 and field 2 with
different values helps, but not much. Incrementing field 1 by 1 and field 2
by 2 helps somewhat, but the fields are still correlated, which is not the
best method.

3 Use a random number generator for both field 1 and field 2. This is the
best way to verify the hashing effect.

The recommendation is to avoid case 2 in testing (which is the most unlikely
in real network traffic).
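
A short Python experiment makes the point about case 2: when both fields are incremented in lockstep, an XOR-based key collapses to a single value, whereas independent random fields spread streams across the links. The two-field XOR key below is a simplification of the XOR-and-folding algorithm, used only for illustration.

    import random

    def xor_key(field1, field2, num_links=4):
        return (field1 ^ field2) % num_links

    # Case 2: both fields incremented together -> every stream hashes the same.
    correlated = {xor_key(i, i) for i in range(256)}

    # Case 3: independent random fields -> streams spread across the links.
    independent = {xor_key(random.randrange(256), random.randrange(256))
                   for _ in range(256)}

    print("correlated streams use links:", sorted(correlated))    # [0]
    print("random streams use links:", sorted(independent))       # typically [0, 1, 2, 3]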

For LSR label stack hashing, if the test equipment generates a label stack of
(1, 1) (2, 2), the load balancing effect will not be optimum. However, adding
a few outer label values with many more inner label values should work.
For LER hashing, if both the source IP address and destination address are
incremented by 1 for each stream, the load balancing will not work well. For L4
hashing, if both the source port and destination port are incremented by 1 for
each stream, the load balancing effect again will not work well. In general, when
multiple input fields are used, testing should be done using separate random
generators for each field.

3.8 BNG Management


This section describes the communication flow between the BNG modules,
which provide 802.1Q statically configured, PPP/PPPoE, DHCP/CLIPS, and
L2TP subscriber sessions.

The PPA3LP NPU subsystem receives and processes packets coming from
various interfaces, such as access-side, trunk-side, and control interfaces.
Based on the processing results, the NPU either drops the packets or
forwards them onto different interfaces. The NPU subsystem also provides
platform-dependent APIs that FABL uses to send configurations that
populate the databases required for forwarding traffic. The NPU also
provides APIs to punt and receive control packets to and from the
FABL PAKIO module on the line card processors.

To support the BNG application, the NPU subsystem provides support for
forwarding IPv4 and IPv6 packets to and from the following types of subscribers
that are brought up either statically by configuration or dynamically by protocols.
The BNG application supports the following subscriber configurations.

Dynamic PPPoE subscriber configuration includes:

• PPPoE over Ethernet port, dot1q and Q-in-Q

• PPPoE over L2TP (using LAC and LNS protocols)

• PPPoE over dot1q CCoD for dot1q and Q-in-Q

• PPPoE over link-group for the previous three

Dynamic CLIPS subscriber configuration includes:

• CLIPS over Ethernet port, dot1q and Q-in-Q

• CLIPS over dot1q CCoD for dot1q and Q-in-Q

• CLIPS over link-group for the previous two.

Static subscriber configurations include:

• Static subscriber over Ethernet port, dot1q and Q-in-Q

• Static subscriber over link-group for the previous one

• CLIPS static over Ethernet port, dot1q and Q-in-Q

• CLIPS static over link-group for the previous three

For information about configuring BNG subscriber services, see the
configuration documents in the Operations and Maintenance\Configuration
Management\Subscriber Management folder in this library.

3.8.1 BNG Modules

The BNG application suite is made up of many processes in the Ericsson IP
Operating System that service subscribers. The following diagram gives a
system view of the BNG application suite:

Figure 61 BNG Functional Architecture

As illustrated in Figure 61, the BNG application suite is logically divided into
three groups: BNG protocols, BNG control, and BNG services.

Table 9 BNG Application Suite


BNG Protocols    Includes PPPoE, PPP, L2TP, DHCP, DHCPv6, and CLIPS.
BNG Control      Includes infrastructure modules such as RIB, STATd, ISM,
                 and so on.
BNG Services     Includes QoS, ACL, NAT, MC, HR, ND, and IPFIX.

For an overview of each module, see Section 2.2.2 on page 37.

The router control plane communicates with the ALd processes on the line card
processor; these processes in turn communicate with the line card NPU.

Figure 62 shows subscriber session connection through ALd, using PPPoE
as an example.

Figure 62 Subscriber Session Connection Using ALd

Session connection information flow, using ALd as a liaison with the
platform-dependent hardware:

• PPPoE reports to ISM that PPPoE encapsulation has been configured
on a circuit.

• ISM forwards the circuit details to IFace. IFace sends a request to ALd to
enable the PPPoE encapsulation.

• The client sends a PPPoE Active Discovery Initiation (PADI) request.

• ALd forwards it to PPPoE via the kernel. PPPoE returns a PPPoE Active
Discovery Offer (PADO) message to ALd.

• ALd forwards it to the client.


• The client returns a PPPoE Active Discovery Request (PADR) message to ALd.

• ALd forwards it to PPPoE via the kernel.

• PPPoE creates the circuit with PPP attributes and forwards the details
to ISM.

• ISM forwards the circuit details to IFace. IFace forwards the details to ALd.

• PPPoE sends a PPPoE Active Discovery Session (PADS) message to ALd.

• ALd forwards it to the client.

• The client and PPPoE negotiate the session details via the kernel, ALd,
and the NPU.

• If the configured interface binding is for both IPv4 and IPv6, ISM informs
IFace.

• IFace sends the circuit and interface binding details to FIB.

• FIB sends the circuit association to ALd.

• ND messages are exchanged between the client and NDd via the kernel,
ALd, and the NPU (all modules are informed).

• DHCPv6 and the client negotiate session details via the kernel, ALd, and
the NPU (all modules are informed).

• ISM sends the IP host to RIB.

• RIB creates the adjacency in FIB.

• FIB sends the adjacency to ALd to be used for packet forwarding.

• RIB adds the routes and next hop to FIB.

• FIB sends the routes and next hop to ALd to be used for packet forwarding.

Note: The role of ALd in subscriber session negotiations is assumed, but not
shown in the rest of the session diagrams.
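
The discovery exchange in the flow above follows a fixed request/response order. The following minimal Python sketch is illustrative only (it is not product code, and the class and function names are hypothetical); it models the PADI/PADO/PADR/PADS sequence and the session ID returned in the PADS, which is the value PPPoE then hands to ISM when the circuit is created.

# Illustrative sketch, not product code: the PPPoE discovery handshake reduced
# to a message sequence. All names here are hypothetical.
from dataclasses import dataclass, field
from itertools import count

@dataclass
class PppoeServer:
    """Stands in for the PPPoE process reached through the kernel and ALd."""
    name: str = "pppoe-proc"
    _session_ids: count = field(default_factory=lambda: count(1))

    def on_padi(self, client_mac):
        # PADO is returned unicast to the requesting client.
        return {"type": "PADO", "server": self.name, "dst": client_mac}

    def on_padr(self, client_mac):
        # PADS confirms the session; the session ID is what the circuit is
        # created with (reported to ISM in the real flow).
        return {"type": "PADS", "dst": client_mac,
                "session_id": next(self._session_ids)}

def discovery(client_mac, server):
    padi = {"type": "PADI", "src": client_mac, "dst": "ff:ff:ff:ff:ff:ff"}
    pado = server.on_padi(padi["src"])
    padr = {"type": "PADR", "src": client_mac, "dst": pado["server"]}
    pads = server.on_padr(padr["src"])
    return pads["session_id"]

if __name__ == "__main__":
    print("session id:", discovery("00:11:22:33:44:55", PppoeServer()))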

3.8.2 BNG Sessions


The BNG application suite supports static and dynamic sessions. BNG
sessions support multiple types of subscriber encapsulations, including static,
PPP, PPPoE, L2TP (LAC and LNS), and CLIPS. The following sections
provide the overall system message flows for bringing up and terminating each
type of static and dynamic subscriber.


3.8.2.1 Static Sessions

In static sessions, a fixed number of circuits are created manually. When the
sessions are set up, the subscribers are bound to the circuits by the ISM. The
Ericsson IP Operating System supports the following circuit encapsulation
types for static sessions:

• 802.1Q-based circuits

• Pure Ethernet-based circuits (CLIPS)

3.8.2.1.1 802.1Q Circuit Sessions

Session Bring-Up

To enable static subscriber sessions, a circuit must be configured manually
using the “PVC create” command.

The following figure illustrates how a static subscriber session is created:

Figure 63 Static Session Bring Up

1. When the circuit is created, the CLI module transfers the configuration to
the router configuration manager (RCM) module.


2. RCM requests ISM to create the circuit and activate the session. RCM
sends the circuit configuration details to ISM along with the request.

3. ISM creates the circuit and propagates these updates to the other modules.

4. DOT1Q communicates with AAA to authenticate the subscriber and bring
the session up:

a DOT1Q sends an authentication request to AAA.

b AAA validates the authenticity of the subscriber and responds by
sending an “authen resp” message to DOT1Q.

c DOT1Q sends a “session up” message to AAA.

5. AAA sends a request to ISM for binding the interface with the configured
circuit.

6. ISM propagates these updates to the other modules.

7. The line card sends the subscriber status to the statistics daemon (STATd),
which further propagates the status to AAA.

8. When the client sends an ARP request message, ARP responds with an
“ARP Response” and adds the MAC address of the client to the Routing
Information Base (RIB).

9. RIB downloads the route, next-hop, and adjacency information to the line
card.

Session Termination

The following figure illustrates how a static subscriber session is terminated:


Figure 64 Static Session Termination

1. CSM sends a “port down” message to ISM.

2. ISM propagates the update to the other modules.

3. DOT1Q sends a “session down” message to AAA to bring down the
currently active session.

4. AAA sends a request to ISM to unbind the interface from the circuit.

5. ISM unbinds the interface and propagates these updates to the other
modules.

6. The line card sends the subscriber status to the statistics daemon (STATd),
which further propagates the status to AAA.

7. RIB downloads the route, next-hop, and adjacency removal information
to the line card.

8. AAA updates the circuit state to ISM.

9. ISM propagates these updates to the other modules.


3.8.2.1.2 802.1Q-Based Circuit Session With DHCP IP Assignment

Session Bring-Up

To set up a static subscriber session using DHCP, you must configure circuits
and bind the subscribers to the circuits. The subscribers should also be
configured with DHCP_Max_Leases (RBAK VSA #3).

The following figure illustrates how a static subscriber session with DHCP is
created:

Figure 65 802.1Q-Based Static Session Bring Up

1. When you configure a DOT1Q circuit, DOT1Q sends a “session up”
message to AAA.

2. AAA sends a request to ISM for binding the interface with the circuit. AAA
sends the circuit configuration information along with the request.

3. ISM propagates these updates to the other modules.

4. When the client sends a “DHCPDISCOVER” message, the DHCP daemon
running on the router receives the message and forwards it to the DHCP
server.


5. The DHCP server responds with a “DHCPOFFER” message to offer the
client an IP address lease.

6. The DHCP daemon requests AAA to add the IP host address.

7. AAA requests the ISM to update the circuit configuration with the IP
address.

8. ISM propagates these updates to the other modules.

9. ARP provides RIB with the MAC address of the host client.

10. RIB downloads the route and next-hop information to the line card.

11. The DHCP daemon forwards the “DHCPOFFER” message to the client.

12. The client sends a “DHCPREQUEST” to the DHCP daemon, which further
forwards this request to the DHCP server.

13. The DHCP server responds with the “DHCPACK” message, which contains
all the configuration information requested by the client.
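
The DORA exchange above (DISCOVER, OFFER, REQUEST, ACK) and the point at which the IP host is added through AAA can be condensed into a short sketch. This is illustrative Python only, not product code; the toy DHCP server and the AAA log are hypothetical stand-ins.

# Illustrative sketch, not product code: the relayed DHCP exchange from the
# steps above. The server stub and AAA log are hypothetical.
def relay_dora(client_mac, dhcp_server, aaa_log):
    offer = dhcp_server("DHCPDISCOVER", client_mac)
    # Before the OFFER reaches the client, the DHCP daemon asks AAA to add
    # the IP host, and AAA updates the circuit through ISM (steps 6-8 above).
    aaa_log.append(("ip host add", client_mac, offer["ip"]))
    ack = dhcp_server("DHCPREQUEST", client_mac)     # REQUEST is relayed on
    return offer["ip"], ack["type"]

def toy_dhcp_server(message, client_mac,
                    _pool=iter("10.0.0.%d" % i for i in range(10, 250))):
    if message == "DHCPDISCOVER":
        return {"type": "DHCPOFFER", "ip": next(_pool)}
    return {"type": "DHCPACK"}      # carries the configuration for the client

if __name__ == "__main__":
    log = []
    ip, reply = relay_dora("00:11:22:33:44:55", toy_dhcp_server, log)
    print(ip, reply, log)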

Static Session with DHCP IP Release

Figure 66 802.1Q Based Static Session Termination

1. The client ends the IP address lease by sending a “DHCPRELEASE”
message to the DHCP server.

2. The DHCP daemon running on the router receives the message and
requests AAA to remove the IP address of the client from AAA database.

3. AAA sends an “ip host del” message to the ISM along with the circuit
configuration of the circuit to be removed.


4. ISM deletes the circuit and propagates these updates to the other modules.

5. RIB downloads the route, next-hop, and adjacency removal information
to the line card.

6. The DHCP daemon forwards the “DHCPRELEASE” message to the line
card, which further forwards the information to the DHCP server.

3.8.2.1.3 Ethernet-based Static CLIPS Sessions

Session Bring-Up

Static CLIPS sessions are static circuits that stay up as long as the port is up.
The CLIPS session is brought down only when the port goes down or the
CLIPS PVC is unconfigured.

To enable a static CLIPS session, a circuit must be configured manually
using the “PVC create” command.

The following figure illustrates the static CLIPS session connection process:

Figure 67 Static CLIPS Session Bring Up

1. After the circuit is created, the CLI transfers the configuration to RCM.


2. RCM sends a request to CLIPS for creating a PVC.

3. CLIPS requests ISM to create the circuit and activate the session, sending
the circuit configuration details to ISM along with request.

4. ISM creates the circuit and propagates these updates to the other modules.

5. CLIPS communicates with AAA to authenticate the subscriber and bring
the session up:

a CLIPS sends an authentication request to AAA.

b AAA validates the authenticity of the subscriber and responds by
sending an “authen resp” message to CLIPS.

c CLIPS sends a “session up” message to AAA.

6. AAA sends a request to ISM for binding the interface with the configured
circuit.

7. ISM propagates these updates to the other modules.

8. When the client sends an ARP request message, ARP responds with an
“ARP Response” and adds the MAC address of the client to the Routing
Information Base (RIB).

9. RIB downloads the route, next-hop, and adjacency information to the line
card.

Session Termination

The following figure illustrates the static CLIPS session termination process:


Figure 68 Static CLIPS Session Termination

1. CSM sends a “port down” message to ISM.

2. ISM propagates these updates to the other modules.

3. CLIPS sends a “session down” message to AAA to bring down the currently
active session.

4. AAA sends a request to ISM to unbind the interface from the circuit.

5. ISM unbinds the interface and propagates these updates to the other
modules.

6. The line card sends the subscriber status to STATd, which further
propagates the status to AAA.

7. RIB downloads the route, next-hop, and adjacency removal information
to the line card.

8. AAA updates the circuit state to ISM.

9. ISM propagates these updates to the other modules.

3.8.2.2 Dynamic Sessions

In dynamic sessions, circuits are created dynamically whenever a subscriber
session is requested.


3.8.2.2.1 PPPoE Sessions

Point-to-Point Protocol (PPP) is available on the router over untagged Ethernet
and 802.1Q PVCs. PPP over Ethernet (PPPoE) is a client-server connection
technology for subscribers to access the Internet and IP services over an
Ethernet connection.

Dual Stack Session Connection

The following figure illustrates part 1 of the session connection process.

Figure 69 PPPoE Dual Stack IPv6 Connection, Part 1

Assuming that the router is serving as a PPPoE server, the Ericsson IP
Operating System brings up the IPv6 stack as follows:

1. To begin connecting to the Internet, a client sends a PPP Active Discovery
Initiation (PADI) broadcast message in the local broadcast domain. Multiple
servers may receive it.

2. When the router receives it, the PPPoE process returns a unicast PPP
Active Discovery Offering (PADO) message to the client, containing its
MAC address, server name, and the services it offers.


3. Assuming that the client selects the router's offer, it returns a PPP Active
Discovery Request (PADR) for connection.

4. PPPoE sends a request to ISM to create the circuit with the PPPoE session
ID.

ISM creates it and passes the circuit information to the other BNG modules.

5. The router returns a PPP Active Discovery Session (PADS) confirmation.

6. The client is authenticated in a series of LCP messages between the
client, PPP, and AAA modules. In the diagram, PAP is illustrated as an
example.

7. ISM communicates the subscriber circuit details (such as the magic number
or the maximum receive unit (MRU)) to the other modules.

The following figure illustrates the rest of the IPv6 connection process.

Figure 70 PPPoE Dual Stack IPv6 Connection, Part 2

1. After the session/circuit has been established, the client and PPP negotiate
the IPv4 address using an IPCP message exchange.

2. PPP sends a session up message to AAA.

3. AAA sends the request to bind the subscriber circuit to the interface to ISM.

4. ISM communicates these updates to the other modules.

5. RIB passes the route, next-hop, and adjacency to FIB.


6. The client and ND exchange IPv6 stateless address autoconfiguration
(SLAAC) messages.

7. The client and DHCPv6 exchange messages to assign the IPv6 PD prefix.

8. DHCPv6 communicates the assigned PD prefix route to ISM.

9. ISM communicates this information to the other modules.

10. RIB downloads the route and next-hop to FIB.

The following figure illustrates the IPv4 PPP connection communication flow.

Figure 71 PPPoE Dual Stack Connection, IPv4

The router brings up the IPv4 stack as follows:

1. After the session/circuit has been established, the client and PPP negotiate
the IPv4 address and exchange their interface IDs using an IPCP message
exchange:

a The client sends an IPCP configuration request to PPP.

b PPP sends a request for an IP address to AAA.

c AAA returns the IP address.

d PPP sends the IP address in a configuration acknowledgement to the
client.

2. AAA sends the interface binding, and session configuration with IP address
to ISM.

3. ISM communicates the updates to the other modules.


4. RIB downloads the route and next-hop to FIB.

PPPoE Dual Stack Down

The following figure illustrates the IPv4 session termination process.

Figure 72 PPPoE Dual Stack IPv4 Down

The PPPoE session IPv4 stack comes down as follows:

1. The client sends an IPCP termination request (IPCP term req) to PPP.

2. PPP sends a stack down (IPv4) message to AAA.

3. AAA communicates the configuration change to ISM.

4. ISM communicates the updates to the other modules.

5. PPP sends a termination acknowledgement (IPCP term ack) message
to the client.

6. RIB downloads the route, next-hop, and adjacency removal to FIB.

The following figure illustrates the IPv6 session down process.


Figure 73 PPPoE Dual Stack IPv6 Down

The router brings down the session's IPv6 stack as follows:

1. The client sends an IPv6CP termination request (IPv6CP term req) to PPP.

2. PPP sends a request to reset the interface ID to ISM.

3. ISM communicates the updates to the other modules.

4. PPP sends a stack down message to AAA, which passes the configuration
change to ISM.

5. ISM communicates the updates to the other modules.

6. PPP sends a termination acknowledgement (IPv6CP term ack) message
to the client.

7. DHCPv6 communicates the prefix removal to ISM, which communicates it
to the other modules.

8. RIB downloads the route, next-hop, and adjacency removal to FIB.

PPPoE Session Termination

The following figure illustrates the session termination process.


Figure 74 PPPoE Session Termination Process

To bring down the PPPoE session, the router performs the following process:

1. The client sends a PPP Active Discovery Termination (PADT) message to
the PPPoE process.

2. PPPoE sends ISM a request to clear the circuit/session.

3. ISM communicates the change in configuration to the other modules.

4. PPP sends an LCP termination message to the client and reports that
the session is down to AAA.

5. AAA reports that the interface binding has been removed to ISM.

6. ISM communicates the change in configuration to the other modules.

7. The forwarding process reports subscriber statistics to the STATd and AAA
modules.

8. RIB downloads the removal of the route and next-hop to FIB.

9. AAA reports the change in configuration to ISM and ISM passes the circuit
deletion to the other modules.


3.8.2.2.2 DHCP-Based CLIPS Subscriber Sessions

Session Bring-Up (DHCPv4)

Clientless IP Service Selection (CLIPS) is a method of creating a point-to-point
subscriber session based on IP or DHCP. CLIPS uses the MAC address of the
subscriber for identification, and interacts with RADIUS to make authentication
decisions based on the MAC address.

The following figure illustrates the Dynamic DHCP CLIPS session connection
process:

Figure 75 Dynamic DHCP CLIPS Session Bring Up

1. To connect to the Internet, a client sends a "DHCPDISCOVER" broadcast
message to the local broadcast domain.

2. The DHCP daemon running on the router receives the message and
initiates a request with CLIPS to create a session.


3. CLIPS requests the ISM to create the circuit and activate the session.
CLIPS sends the circuit configuration details to the ISM along with request.

4. ISM creates the circuit and propagates these updates to the other modules.

5. CLIPS sends an authentication request to AAA.

6. AAA validates the authenticity of the subscriber and responds by sending
an “authen resp” message to CLIPS.

7. CLIPS sends a “session up” message to AAA.

8. AAA sends a request to ISM for binding the interface with the configured
circuit.

9. ISM propagates these updates to the other modules.

10. The DHCP daemon forwards the “DHCPDISCOVER” message to the
DHCP server.

11. The DHCP server responds with a “DHCPOFFER” message to offer an IP
address lease to the client.

12. The DHCP daemon requests AAA to add the IP host address.

13. AAA requests the ISM to update the circuit configuration with the IP
address.

14. ISM propagates these updates to the other modules.

15. ARP provides the RIB with the MAC address of the host client.

16. RIB downloads the route, next-hop and adjacency information to the line
card.

17. The DHCP daemon forwards the “DHCPOFFER” message to the client.

18. The client sends a “DHCPREQUEST” to the DHCP daemon, which further
forwards this request to the DHCP server.

19. The DHCP server responds with the “DHCPACK” message, which contains
all the configuration information requested by the client.

Session Termination

The following figure illustrates the Dynamic DHCP CLIPS session termination
process:


Figure 76 Dynamic DHCP CLIPS Session Termination

1. The client ends the IP address lease by sending a “DHCPRELEASE”
message to the DHCP server.

2. The DHCP daemon running on the router receives the message and sends
a request to CLIPS to delete the session.

3. CLIPS sends a “session down” message to AAA.

4. AAA sends a request to ISM to unbind the interface from the circuit.

5. ISM propagates these updates to other modules.

6. The line card sends the subscriber status to STATd, which further
propagates the status to AAA.

7. RIB downloads the route, next-hop, and adjacency removal information
to the line card.

8. AAA updates the circuit state to ISM.

9. ISM propagates these updates to other modules.

10. The DHCP daemon forwards the “DHCPRELEASE” message to the DHCP
server.


3.8.2.2.3 Auto-Detect CLIPS Subscriber Sessions

Session Bring-Up

The following figure illustrates how an auto-detect CLIPS subscriber session is
created:

Figure 77 Auto-detect CLIPS Session Bring Up

1. When the line card receives an IP packet from the client, it initiates a
request with CLIPS to create a session for the client.

2. CLIPS requests ISM to create the circuit and activate the session (and
sends the circuit configuration details to ISM along with request).

3. ISM creates the circuit and propagates these updates to the other modules.

4. CLIPS communicates with AAA to authenticate the subscriber and bring
the session up.

5. CLIPS sends an authentication request to AAA.

6. AAA validates the authenticity of the subscriber and responds by sending
an “authen resp” message to CLIPS.

7. CLIPS sends a “session up” message to AAA.


8. AAA sends a request to ISM for binding the interface with the configured
circuit.

9. ISM propagates these updates to the other modules.

10. When the client sends an ARP request message, ARP sends an “ARP
Response” with the MAC address of the client to RIB.

11. RIB downloads the route, next-hop, and adjacency information to the line
card.

Session Termination

Auto-detect session termination is based on an idle timeout; no protocol
signals session termination, so the idle timeout is the normal trigger for
bringing the session down:

Figure 78 Auto-detect CLIPS Session Termination

1. If IP packets are not received within the configured time, the line card sends
an "idle-timeout" message to AAA.

2. AAA updates the circuit status to ISM.

3. ISM propagates these updates to the other modules.

4. CLIPS sends a “session down” message to AAA to bring down the currently
active session.


5. AAA sends a request to ISM to unbind the interface from the circuit.

6. ISM unbinds the interface and propagates these updates to the other
modules.

7. The line card sends the subscriber status to STATd, which further
propagates the status to AAA.

8. RIB downloads the route, next-hop, and adjacency removal information
to the line card.

9. AAA updates the circuit state to ISM.

10. ISM propagates these updates to the other modules.

3.8.2.2.4 L2TP Subscriber Sessions

Layer 2 Tunneling Protocol (L2TP) is a tunneling protocol used to support
VPNs. The two endpoints of an L2TP tunnel are called the LAC (L2TP Access
Concentrator) and the LNS (L2TP Network Server).

This section describes the connection flows for LAC and LNS sessions.


LAC Session Bring-Up

Figure 79 LAC Session Bring Up, Part 1

1. To begin connecting to the Internet, a client sends a PPP Active Discovery
Initiation (PADI) broadcast message in the local broadcast domain. Multiple
servers may receive it.


2. When the router receives it, the PPPoE process returns a unicast PPP
Active Discovery Offering (PADO) message to the client, containing its
MAC address, server name, and the services it offers.

3. Assuming that the client selects the router's offer, it returns a PPP Active
Discovery Request (PADR) for connection.

4. PPP sends a unit message to ISM and ISM sends updates to the other
modules, as well as a circuit up message to PPP.

5. PPP informs PPPoE that the unit is ready, and a synchronization between
them occurs.

6. PPPoE sends a PPP Active Discovery Session (PADS) message to the
client.

7. The client and PPP carry out an LCP message exchange before beginning
authentication.

8. PPP sends a CHAP authentication challenge and the client responds.

9. PPP sends the authentication request to AAA, and AAA responds.

10. If successful, PPP sends a CHAP success message to the client and
sends the session details to ISM; ISM then sends updates to the rest of
the modules.

11. PPP sends a LAC start message to L2TP.

12. L2TP, the forwarding plane (FWD), and the LNS negotiate the tunnel setup.

13. The tunnel is added to FWD.

14. L2TP and the LNS exchange session setup messages.

15. L2TP sends ISM details about the L2TP session.

16. ISM sends the updates to all the modules.

17. PPP sends a session up message to AAA.

18. PPPoE sends the session account information to AAA, which sends it to
ISM.

19. ISM sends the updates to all the modules.

20. Packets to and from the client are now tunneled to the LNS.

LAC Session Termination

Figure 80 illustrates the LAC session termination process.


Figure 80 LAC Session Termination

1. The LNS sends a CDN message to L2TP.

2. L2TP sends a message to the PPP to stop the LAC session.

3. PPP sends a message to AAA to bring the session down.

4. PPP sends the LCP termination request to the line card, which further
propagates this message to the client.

5. PPP sends a “session stopping” message to the PPPoE daemon.

6. AAA requests ISM for the subscriber status.

7. ISM propagates these updates to all the other modules.

8. The line card sends the subscriber status to the PPPoE daemon, which
further forwards the message to AAA.

9. AAA requests ISM to bring the session down and delete the circuit.

10. ISM deletes the circuit and propagates these updates to all the other
modules.

LNS Session Bring-Up


Figure 81 LNS Session Bring Up

1. To initiate a tunnel between the LAC and LNS, the LAC sends a “Start
Control Connection Request (SCCRQ)” to L2TP.

2. The L2TP process gets the route from RIB and responds with a “Start
Control Connection Reply (SCCRP)”.


3. The LAC sends a “Start Control Connection Connected (SCCCN)” message
to L2TP.

4. L2TP provides tunnel accounting information to AAA to report the creation
of the new L2TP tunnel.

5. The LAC sends an Incoming Call Request (ICRQ) to L2TP.

6. The L2TP process requests ISM to create and activate the circuit.

a L2TP sends circuit configuration information along with the tunnel ID
and requests ISM to create a circuit for the session.

b ISM creates the circuits and propagates these updates to all the other
modules.

c PPP sends the circuit configuration message to ISM.

d ISM propagates these updates to all the other modules.

e ISM sends a circuit up message to the PPP.

f PPP sends a “Unit Ready” message to the L2TP process.

7. L2TP sends an “Incoming Call Reply (ICRP)” message to the LAC.

8. LAC responds to L2TP with an “Incoming Call Connected (ICCN)” message.

9. L2TP sends the session start account data to PPP.

10. The LAC and PPP carry out an LCP message negotiation.

11. PPP sends the LAC a CHAP challenge and the LAC sends a CHAP
response.

12. PPP sends an authentication request to AAA and AAA responds.

13. If the subscriber was authenticated, PPP sends the LAC a CHAP success
message.

14. If the subscriber is configured for dual-stack, LAC and PPP carry out an
IPv6CP message exchange.

15. L2TP sends an LNS session up message to PPP.

16. PPP sends circuit updates to ISM and ISM propagates the updates to all
the other modules.

17. AAA sends circuit updates to ISM and ISM propagates the updates to all
the other modules.

18. RIB downloads the subscriber route and next hop to FWD/FIB.
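
The two control handshakes in the bring-up above, tunnel establishment (SCCRQ, SCCRP, SCCCN) followed by session establishment (ICRQ, ICRP, ICCN), can be checked with a small ordered-match sketch. This is illustrative Python only, not product code; the helper names are hypothetical.

# Illustrative sketch, not product code: the ordered L2TP control exchanges
# from the LNS bring-up above. Helper names are hypothetical.
TUNNEL_SETUP = ["SCCRQ", "SCCRP", "SCCCN"]    # LAC->LNS, LNS->LAC, LAC->LNS
SESSION_SETUP = ["ICRQ", "ICRP", "ICCN"]      # LAC->LNS, LNS->LAC, LAC->LNS

def completes_in_order(expected, received):
    """True if every expected message appears in the received list, in order."""
    it = iter(received)
    return all(msg in it for msg in expected)   # 'in' consumes the iterator

if __name__ == "__main__":
    wire = ["SCCRQ", "SCCRP", "SCCCN", "ICRQ", "ICRP", "ICCN"]
    tunnel_up = completes_in_order(TUNNEL_SETUP, wire)
    session_up = tunnel_up and completes_in_order(SESSION_SETUP, wire)
    # With the session up, PPP (LCP, CHAP, IPCP/IPv6CP) runs over the tunnel
    # and RIB downloads the subscriber route to FWD/FIB, as in steps 9-18 above.
    print("tunnel:", tunnel_up, "session:", session_up)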

LNS Session Termination


Figure 82 LNS Session Termination

1. LAC sends an “LCP Terminate” request to the PPP daemon.

2. PPP sends a “session down” message to AAA.

3. AAA requests ISM to unbind the interface.

4. ISM unbinds the circuit and propagates the updates to all the other modules.

5. The line card sends a subscriber status message to STATd, which further
forwards the message to AAA.

6. RIB downloads the route, next-hop, and adjacency removal information
to the line card.

7. AAA requests ISM to bring the session down and mark the state.

8. ISM propagates these updates to the other modules.

9. L2TP sends a CDN message to the LAC and a circuit delete message
to ISM.

10. ISM deletes the circuit and propagates these updates to all the other
modules.


3.9 Advanced Services (QoS and ACLs)

3.9.1 QoS

For CPI-level configuration information, see Configuring Circuits for QoS,
Configuring Rate-Limiting and Class-Limiting, and Configuring Queuing and
Scheduling.

Quality of Service (QoS) manages traffic flow through the SSR. The SSR
implementation is similar to QoS on the SmartEdge at the customer level. The
primary differences are internal, based on the SSR hardware architecture.
Customer-level features differ only in that there are fewer supported features
on the SSR and minor differences, such as supported ranges and valid values
for some commands.

3.9.1.1 Hardware Support

Figure 83 illustrates the line card components used by QoS to process ingress
and egress traffic.

Figure 83 Ingress to Egress QoS Processing Components

Ingress traffic is processed by a line card and then forwarded across the
switch fabric to another line card where egress processing takes place. QoS
processing of ingress network traffic is performed in the PFE on the NPU
before it is transferred to the fabric access processor (FAP), which forwards
traffic across the switch fabric. Some internal QoS functions are performed
in the FAP to maintain end-to-end QoS performance. For example, the FAP
schedules traffic according to the PD-QoS priority and buffers traffic in virtual
output queues (VOQs) as it is sent across the fabric.

The NPU Task optimized processor (TOP) packet processor performs some
egress QoS functions and sends traffic through the NPU Traffic Manager (TM)
for scheduling out through the line card ports.

The line card LP performs local management, configuration, and statistics
collection of the forwarding processors.

A FAP simultaneously performs different functions for ingress and egress
traffic. In the ingress role, it is referred to as the ingress FAP, even though that
FAP can also function for egress traffic. Likewise, an egress FAP refers to a
FAP functioning in the egress role.

An NPU TOP supports the following configurable QoS features:

• QoS propagation, including customized propagation via class maps

• QoS classification through class definitions and policy access control lists
(ACLs)

• Policing, metering, and marking, including hierarchical policing and metering

• Encapsulation overhead adjustment support (also known as Inter-Packet Gap (IPG) emulation)

• Egress queue mapping

• Congestion avoidance profile indexing

An NPU TOP internal process performs the following nonconfigurable QoS
tasks:

• FAP incoming traffic management header (ITMH) encapsulation, which
supplies:

0 The FAP system port ID used for fabric VOQ enqueuing

0 The FAP traffic class

0 The FAP drop precedence

• Packet descriptor QoS (PD-QoS) extraction from the Forwarding Header,
which obtains:

0 The PD-QoS priority (also known as the traffic class)

0 The PD-QoS drop precedence

• NPU Traffic Management (TM) header encapsulation, which supplies:

0 The NPU TM flow ID (FID)

0 The NPU TM weighted random early detection (WRED) profile indexes
for congestion avoidance

0 The NPU TM WRED color for congestion avoidance

0 The IPG values for encapsulation overhead adjustment

While performing in the egress role, the FAP controls traffic scheduling across
the fabric and to the NPU.

For ingress traffic, the FAP provides the following functions:


• Ingress fabric queuing through VOQs

• Ingress fabric scheduling

For internal system QoS functionality, the FAP performs the following tasks:

• Queues traffic destined to cross the switch fabric

• Schedules traffic across the switch fabric

• Provides congestion avoidance to protect against depletion of FAP resources

• Provides drop statistics collected by fabric management software

For egress traffic, the FAP provides the following internal system QoS functions:

• Fabric unicast traffic shaping and scheduling

• Fabric multicast traffic shaping and scheduling

• Fabric unspecified-port traffic shaping and scheduling (for control and
data traffic)

• Reception of flow control for egress traffic to the NPU

3.9.1.2 Software Support

The QoSMgr and QoSd modules provide the following QoS services on the
line cards:

• QoS priority assignment, propagation, and class maps

• Rate limiting, classification, and marking

• Metering and policing, including QoS-related classification through class
definitions and policy ACLs

• Scheduling, including hierarchical TM and priority weighted fair queuing
(PWFQ), queue maps, and overhead profiles

• Congestion control (configuration of congestion avoidance maps for
controlling queue depth and RED behavior)

• Policy-based forwarding (redirect through forward policies)

• Policy-based mirroring (through mirror policies)

• Protocol-specific rate limiting

• Advanced provisioning support, including QoS configuration of link groups
and service instances


3.9.1.2.1 QoS Module Interaction

The modules shown in Figure 84 manage QoS provisioning and traffic
processing on the SSR.

Figure 84 QoS Information Flow

The following components in the SSR provide QoS functionality:

• Route Processor (RP) QoS, which comprises:

0 Platform-independent RP QoS (PI-RP-QoS)

0 QoS CLI and Data Collection Layer (DCL)—Includes the CLI parse
chain definitions for all QoS CLI show and configuration commands
and their command handler functions, which forward the resulting
events to the Router Configuration Module (RCM) by way of the DCL.

0 RCM QoS Feature Manager (QoSMgr)—A component of RCM that
handles all QoS provisioning events received from the CLI and from
authentication, authorization, and accounting (AAA) and forwards
committed successful operations to QoSd to apply to the forwarding
plane. In general, all QoS-related admission control, verification and
validation, and forwarding plane resource management takes place in
QoSMgr with one exception: the QoS implementation for economical
access LAG involves resource allocation in the QoS Daemon (QoSd).


0 QoSd—The QoS component that is directly responsible for provisioning
of the forwarding plane. QoSd handles communication with forwarding
plane IPC endpoints that correspond to forwarding abstraction layer
(FABL) instances.

0 QoS shared library (QoSLib)—This shared library contains logic that
can be shared between QoSMgr and QoSd (for resource allocation, for
instance) or between QoS and partner modules (primarily AAA).

0 Platform-dependent route processor QoS module (PD-RP-QoS)—This
is a software component supplied by the platform developer that runs
on the RP and conforms to a published API. Its purpose is to carry
out any platform-dependent tasks required to provision QoS services.
The primary intended examples are platform-specific verification (not
covered by capability-driven platform independent code) and forwarding
plane resource allocation. An example of platform-dependent RP QoS
would be to allocate hardware resources on the PFE to perform the
rate limiting function of a Policing policy.

• QoS Forwarding Plane (FWD QoS)—In general, the design and operation
of the RBOS forwarding plane software is beyond the scope of this
document (as is FABL). However, it is expected that the provisioning
events signaled over the QoSd -> FABL and FABL -> FALd interfaces will
ultimately be driven by, closely mirror, and reference elements derived from
the QoS PI -> PD interface on the RP.

FWD QoS includes:

0 FABL—This is the processor-independent portion of the Ericsson IP
Operating System forwarding plane to which QoSd sends provisioning
events. This design assumes that there will be a QoS FABL instance
and corresponding IPC endpoint for each card slot ingress and egress
facing capabilities.

0 Interface—The interface or API between QoSd and FABL is based on
an extended and enhanced version of the current interface between
QoSd and the PPAs.

0 Forwarding Adaptation Layer Daemon (FALd)—This is the
platform-dependent software layer below FABL that implements the QoS
provisioning operations on the target platform. This layer is the
ultimate consumer of resource identifiers allocated on the RP using
the PD-RP-QoS API.

3.9.1.2.2 Functional Overview

The following steps occur during QoS provisioning:

1 Provisioning initiated—A CLI configuration command is issued, or a
subscriber provisioning event is received by AAA.

2 Forwarded to RCM—The provisioning event is relayed to QoSMgr.


3 Admittance—The PI QoSMgr code determines if the event is OK (including
checking PD-specific card attributes of the targeted slot's card type).

4 PI Scope Determination—The QoSMgr determines if the event has any
forwarding plane impacts and, if so, which PFEs in the system are affected.

5 PD verification and resource allocation—QoSMgr calls a PD-supplied API
function for each PFE on which the affected circuit is instantiated to validate
the change and reserve any necessary resources on the line card.

6 RCM record update—If all PD-specific implementation checks return as
OK, QoS creates, updates, or deletes the relevant configuration records in
the configuration database. These records include an opaque cookie (that
is, the value returned by the lower (PD) layer, whose meaning and internal
structure is not known to the upper (PI) layer. The PI layer stores it and
then passes it back to the PD layer when certain operations are performed).

7 Event queued to QoSd—The QoSMgr queues a record with all the relevant
change information, including cookies, for delivery to QoSd if and when
the configuration transaction is committed. The QoSMgr also registers a
callback to back out any required RTDB and PD-specific changes in the
event that the transaction is aborted.

8 Event delivered to QoSd—When the transaction is committed, QoSd
receives the configuration events, updates its own policy and circuit
configuration records, including the PD-specific cookie, and queues the
events for an update or download to the appropriate line cards.

9 Event delivered to FABL—During its next update cycle, QoSd downloads
all pending configuration changes to the affected line cards. This update
message also includes the PD-specific cookie and details about any
resources allocated or deallocated for this configuration event by the
PD-specific QoSMgr implementation function.

10 FALd API invoked—FABL code on the line card must take the configuration
event messages received from QoSd and invoke the appropriate PD
forwarding APIs to implement the changes, supplying the relevant
PD_COOKIE objects that contain all the allocated resource IDs and other
platform-specific information needed to enact the configuration change.
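
The opaque-cookie handoff in steps 5 through 10 can be sketched as follows. This is an illustrative Python model only, not product code and not the actual PD-RP-QoS API; all class, function, and parameter names are hypothetical. The PI layer validates and allocates through a PD callback, stores the returned cookie without interpreting it, and hands it back unchanged on later operations.

# Illustrative sketch, not product code: the PI layer treats the PD cookie as
# an opaque value. All names here are hypothetical.
from dataclasses import dataclass
from typing import Any, Dict

class PdQosApi:
    """Stand-in for the platform-dependent RP QoS layer (PD-RP-QoS)."""
    def validate_and_allocate(self, pfe_id: int, policy: Dict[str, Any]) -> bytes:
        # The PD layer may reject the request or reserve line-card resources.
        if policy.get("rate_kbps", 0) > 10_000_000:
            raise ValueError("rate not supported on this PFE")
        # Whatever it returns is opaque to the PI layer.
        return f"pfe{pfe_id}:meter#{policy['name']}".encode()

@dataclass
class PiBindingRecord:
    pfe_id: int
    policy_name: str
    pd_cookie: bytes          # stored verbatim, never decoded by PI code

def provision_binding(pd: PdQosApi, pfe_id: int, policy: Dict[str, Any]) -> PiBindingRecord:
    cookie = pd.validate_and_allocate(pfe_id, policy)     # steps 5-6
    record = PiBindingRecord(pfe_id, policy["name"], cookie)
    # Steps 7-10: the record (with its cookie) is queued to QoSd, downloaded
    # to FABL, and ultimately passed to the FALd API unchanged.
    return record

if __name__ == "__main__":
    rec = provision_binding(PdQosApi(), pfe_id=3,
                            policy={"name": "gold-policing", "rate_kbps": 50_000})
    print(rec)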

Each circuit has a single PFE ID. A PFE ID represents and abstracts all
the necessary hardware devices that might need to apply QoS services in
a particular direction (ingress or egress) for a particular physical device (for
example, non-aggregate). The PFE ID cannot change for the lifetime of the
circuit. If, instead, there are multiple network processor units (NPUs) handling,
for instance, the egress packet forwarding path for a single VLAN, the PD
domain presents the NPUs to the PI control plane as a single device.

Policies (queuing policies, metering and policing policies, forwarding policies,
and protocol rate-limit policies) are the basic reusable blocks of QoS
configuration that define a common set of parameters that can be independently
applied to multiple circuits. These policies reference secondary objects
contributing to QoS configuration (queue maps, congestion avoidance maps,
overhead profiles, class-definitions, and policy ACLs).

Other QoS-related objects:

• Class maps—Managed by RP QoS but referenced by qos propagate
commands that are in general managed by other modules and
communicated to FABL by ISM rather than QoSd.

• Policy bindings—An association created by applying a policy (queuing,
metering, policing, forwarding, mirroring, or protocol rate-limiting) to a circuit
for purposes of QoS enforcement.

• Circuit-level attributes:

0 Hierarchical Scheduling Nodes

0 Default priorities for circuits

0 Propagated settings

3.9.1.2.3 RCM PFE Communication Flow

In the Ericsson IP Operating System, the RCM tracks which PFEs are
associated with each circuit and tracks when QoS policies and secondary
configuration objects must be referenced on a particular PFE.

The first time that a QoS object is bound to a circuit hosted on a particular PFE,
the RCM invokes RP-QoS-PD APIs to validate the object’s parameter for the
specific PFE and to allocate resources needed to instantiate the object on
the PFE. If the instantiation operation fails due to parameter incompatibility or
insufficient resource, the circuit binding operation fails and the appropriate error
is signaled back to the provisioning entity (usually AAA or the CLI). Sources of
PFE scope information include:

• QoS control plane—Receives initial PFE scope information for a circuit
through the same mechanisms and communication paths used for scoping
information in the SmartEdge OS.

• Physical circuits—Must be associated with a single PFE ID per ingress
and egress. A non-blocking API must be available to RCM components to
return PFE IDs for a physical circuit, based on its circuit handle. The PFE
IDs must not change for the lifetime of the circuit.

• Pseudo-circuits—QoS receives the PFE IDs associated with pseudo-circuits
in the same manner as slot masks.

• Hitless access link group circuits—QoSMgr receives PFE information and
modifications for each constituent port from LGd.

• Economical access link group circuits—QoSd receives PFE information
and changes for each constituent port from ISM.


QoS control plane software relays PFE ID information to the following
components:

• RP-QoS-PD—PI QoS passes the PFE ID as an argument to RP-QoS-PD in
provisioning verification and resource allocation APIs.

• FABL—QoSd supplies the relevant PFE ID and any relevant per-PFE
cookies in every provisioning message sent to FABL.

• QoSMgr—Passes the PFE ID to the CLSMgr interface for provisioning
policy ACLs.

• Operation, Administration, and Maintenance (OAM) modules—When
needed, QoSd passes relevant PFE IDs to other components that interact
with QoS services; for example, to STATd for forwarding plane counters.

The following commands can be used to verify that the QoS and ACL
configurations have been downloaded to the line cards:

• show qos policy *

• show card X fabl qos *

• show card X fabl acl *

• show card X pfe *

• show card X fabl api

• show card X fabl qos policy *

For more troubleshooting information, see the SSR QoS Troubleshooting
Guide.

3.9.1.2.4 QoS Policy Directions

In general, QoS services are inherently directional without functional overlap
between ingress and egress. For example, policing policies apply only
to ingress traffic, while metering and queuing apply only to egress traffic.
However, the following QoS provisioning objects can be applicable to both
ingress and egress PFEs:

• Class maps

• Class definitions

• Policy ACLs

• Forward policies

• Mirror policies

RP QoS creates only one concurrent instantiation of a QoS object per unique
slot/PFE-id instance when:


• A single PFE handles both ingress and egress traffic, and

• The same PFE ID is associated with a physical circuit instance for both
egress and ingress purposes.

QoS signals the object creation to PD-RP-QoS API for validation and resource
allocation purposes only one time for both ingress and egress, and similarly
sends only one object creation message to FABL (or CLSMgr for ACL).

3.9.1.2.5 Provisioning Order and Dependencies

The QoS control plane is responsible for fulfilling all dependencies and
guaranteeing that the provisioning messages are delivered to the FABL for
each PFE in the correct order.

• Creation—The required provisioning order for creating or updating a circuit
policy binding is:

0 Instantiate any secondary QoS objects on the circuit’s PFEs that
are referenced by the circuit (for example, queue map, congestion
avoidance profile, overhead profile, and class definition).

0 Instantiate the QoS policy on the PFE(s) that are referenced by the
circuit.

0 Provision the circuit binding on the PFE.

0 Provision the bindings that inherit from the root binding above.

• Update—The required provisioning order for modifying a referenced QoS
policy or secondary object is:

0 Signal the policy or secondary QoS object’s modification to all PFEs
that reference the object.

0 If required by any resulting cookie changes, update the relevant PFE’s
binding for each circuit with root binding to the object.

0 If required by any resulting cookie, update the binding for each circuit
that inherits the root binding.

• Deletion—The required provisioning order for removing QoS objects is:

0 De-provision one or more circuit bindings on a PFE.

0 When all inherited circuit bindings that reference a root (non-inherited)
binding have been removed from a PFE, the root binding can be
removed from the PFE.

0 When all circuit bindings that reference a QoS policy have been
removed from a PFE, the QoS policy can be removed from the PFE.


0 When all policies that reference a QoS secondary object have been
removed from a PFE, the secondary object can be removed from the
PFE.

• Hierarchical nodes (h-nodes) and queue sets—In general, a hierarchical
scheduling tree is built downwards, starting with the port or link group’s
h-node at the top, followed by intermediate h-nodes, and finally queuing
points:

0 Creation—An h-node or queue set must always be created relative to
its parent h-node and cannot be created unless its h-node parent and
the entire scheduling tree above it has been instantiated at the port
or link level.

0 Deletion—An h-node cannot be deprovisioned until all h-nodes and
queuing points that reference it as a parent have first been removed.
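
A minimal sketch of these ordering rules, assuming hypothetical object and function names (this is illustrative Python, not product code): secondary objects are instantiated before the policies that reference them, policies before bindings, and teardown runs in reverse, driven by reference counting per PFE.

# Illustrative sketch, not product code: reference-counted, dependency-ordered
# provisioning per PFE, mirroring the creation/deletion rules above.
from collections import defaultdict

class PfeProgrammer:
    def __init__(self, pfe_id):
        self.pfe_id = pfe_id
        self.refcount = defaultdict(int)   # object name -> number of users

    def _ensure(self, obj):
        if self.refcount[obj] == 0:
            print(f"PFE {self.pfe_id}: instantiate {obj}")
        self.refcount[obj] += 1

    def _release(self, obj):
        self.refcount[obj] -= 1
        if self.refcount[obj] == 0:
            print(f"PFE {self.pfe_id}: remove {obj}")

    def bind(self, circuit, policy, secondaries):
        for s in secondaries:          # 1. secondary objects first
            self._ensure(s)
        self._ensure(policy)           # 2. then the policy
        print(f"PFE {self.pfe_id}: bind {policy} to {circuit}")   # 3. then the binding

    def unbind(self, circuit, policy, secondaries):
        print(f"PFE {self.pfe_id}: unbind {policy} from {circuit}")
        self._release(policy)          # removed when no bindings reference it
        for s in secondaries:          # removed when no policies reference them
            self._release(s)

if __name__ == "__main__":
    pfe = PfeProgrammer(pfe_id=1)
    pfe.bind("1/1 vlan 10", "gold-pwfq", ["queue-map-8", "cong-map-a"])
    pfe.bind("1/1 vlan 11", "gold-pwfq", ["queue-map-8", "cong-map-a"])
    pfe.unbind("1/1 vlan 10", "gold-pwfq", ["queue-map-8", "cong-map-a"])
    pfe.unbind("1/1 vlan 11", "gold-pwfq", ["queue-map-8", "cong-map-a"])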

3.9.1.2.6 QoSd to FABL Communication Flow

To facilitate the use of common logic in QoSd, the provisioning message format
and semantics between QoSd and FABL, and between QoSd and the PPAs,
are largely common and based on the SmartEdge QoSd-to-PPA messaging,
but extended:

• Almost all current QoSd -> PPA/FABL messages have been extended to
include the relevant PFE ID and QoS PD_COOKIE object of the appropriate
type.

• New FABL-specific messages—Ericsson IP Operating System FWD has
the following additional provisioning message requirements above and
beyond those required by the SmartEdge OS PPAs. The requirements
pertain to policy update handling.

0 Update bindings—Whenever a policy is updated, it is not sufficient
to merely send an update on the policy object itself; the forwarding
adaptation layer daemon (FALd) requires an update event to be
generated for each circuit binding that references the policy. If no
resource allocation changes are required for the bindings, the update
events can be handled internally by FABL. However, if resource
allocation changes are required, all the cookie changes for the modified
bindings must be explicitly communicated from QoSd -> FABL.

0 Signal policy update complete—After a policy object update has been
communicated to FABL, there may be an interim period where QoSd
is updating one at a time the bindings that reference the modified
policy object. During this period, the non-updated bindings continue
to reference and depend on the state and resources of the policy as
it existed before modification. QoSd explicitly signals to FALd when
all such circuit updates are complete so that the forwarding plane can
finish cleaning up and deallocate any resources of the old policy.

0 Signal root binding update complete—After a policy object update has
been communicated to FABL, there may be an interim period where
QoSd is updating one at a time the inherited bindings that reference a
particular root binding. During this period, the non-updated inherited
bindings continue to reference and depend on the state and resources
of the root binding as it existed before modification. QoSd explicitly
signals to FALd when all such circuit updates are complete so that the
forwarding plane can finish cleaning up and deallocate any resources
of the old binding.
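
The update sequence above can be sketched as follows (illustrative Python only, not product code; the message names and FABL stub are hypothetical): the policy update is sent first, each referencing binding is updated in turn, and only the final completion message lets the forwarding plane free the old policy's resources.

# Illustrative sketch, not product code: policy update followed by per-binding
# updates and an explicit completion signal, as described above.
def push_policy_update(fabl, policy, bindings):
    fabl.send({"msg": "policy_update", "policy": policy})
    for cct in bindings:
        # Bindings are walked one at a time; until the walk finishes, untouched
        # bindings still depend on the old version of the policy.
        fabl.send({"msg": "binding_update", "policy": policy, "circuit": cct})
    # Only now may the forwarding plane clean up resources of the old policy.
    fabl.send({"msg": "policy_update_complete", "policy": policy})

class FablStub:
    def __init__(self):
        self.log = []
    def send(self, message):
        self.log.append(message)

if __name__ == "__main__":
    fabl = FablStub()
    push_policy_update(fabl, "gold-policing", ["1/1 vlan 10", "1/1 vlan 11"])
    for m in fabl.log:
        print(m)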

3.9.1.2.7 Binding Creation

The most fundamental QoS provisioning operation is binding a policy object
to a circuit, using the following steps:

1 Identify the affected PFEs.

2 Instantiate each secondary object referenced by the policy on the PFE if
the PFE has not yet referenced it.

3 Instantiate the policy on the PFE if the PFE has not yet referenced it.

4 Instantiate the combined binding of the policy plus all relevant secondary
objects to the circuit on the PFE.

5 Signal the circuit binding record to QoSd with any PD_COOKIES associated
with the above instantiation and allocation operations.

6 Send an individual creation message to FABL for the PFE for each new
object instantiation and the binding message itself, each including any
PD_COOKIE.

Another QoS provisioning operation is modifying the bindings when a
referenced policy has been changed. To update the object, QoSMgr must
identify all PFEs on which the modified object is currently instantiated and, for
each one, invoke the PD-RP-QoS API to notify the PD domain of the proposed
provisioning operating on the target PFE. The PD code can:

• Reject the policy change as unsupported by the PFE.

• Allocate and deallocate resources as necessary to support the policy
change and return a modified cookie.

To check existing bindings, the relevant PD-specific policy binding update
function is invoked by the QoSMgr PI for each circuit that references the policy.
The PD code has the opportunity to:

• Reject the policy change as unsupported by the current circuit or as
requiring resources that are unavailable.

• Allocate and deallocate resources as necessary to support the policy
change and return a modified cookie.

• Indicate that no forwarding-side handling is needed other than an update
notification for the specific circuit that the policy has changed.


• Indicate that no circuit-specific forwarding-side handling is needed at all.

The policy change is not finalized until all required PD-RP-QoS API
update functions have been invoked, and it must be backed out if any PFE
returns an error indicating a problem applying the change to the policy or
an affected binding.

When a CLI change is aborted rather than committed, the relevant PD-specific
functions are invoked to back out the change on the PD side. The PI code
supplies both the current cookie that reflects the aborted change and the
original cookie that reflects the desired state to return to. QoSMgr also cleans
up any of its own memory or RTDB records or state associated with the
aborted changes. For static circuits, cookie information is stored in an extended
qos_media_conf_t binding record in the RDB.

3.9.1.2.8 PD QoS Priority and Drop Precedence Markings

The SSR associates a 6-bit internal PD (Packet Descriptor) QoS value with
each packet. The value is initialized when a packet is received on ingress and
remains associated with the packet as it transits the forwarding plane. The
upper 3 bits hold the priority and the lower 3 bits hold the drop precedence.

• Priority—Priority 0–7, where value 0 receives the highest priority treatment
by default and value 7 the lowest. (As noted below, the PD QoS priority
convention is inconsistent; it depends on whether the whole PD-QoS value
is referenced or only the priority field.)

• Drop-precedence—Drop precedence 0–7, where value 0 receives the
lowest likelihood of being dropped by default and value 7 the highest.

Note: When PD priority is represented in the CLI and documentation as a
standalone value (mark priority, qos priority, or queue-map
configuration), it follows the convention that priority 0 is highest and 7 is
lowest. However, when PD QoS is represented as a concatenated 6-bit
value (class-map, class-definition, and congestion-map
configuration), it is displayed in the Differentiated Services Code Point
(DSCP) format, where the upper 3 bits are encoded such that value
7 has the highest priority.

The external marking-to-PD QoS priority defaults for circuits are:

• On Layer 3 (IP-routed) circuits—The upper 6 bits of IP type of service
(ToS) are copied to PD QoS (inverting the upper 3 bits if the internal
representation is zero-highest).

• On MPLS circuits—The EXP value from the relevant MPLS label is copied
to the upper 3 bits of PD QoS (inverting the upper 3 bits if the internal
representation is zero-highest) and the lower 3 bits are set to zero.

• In other non-IP-routed circuits—The upper three bits of PD QoS are set to
the lowest priority value (7 if the internal representation is zero-highest)
and the lower 3 bits are set to zero.
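
The bit layout and the default ingress mappings above can be expressed compactly. The following sketch is illustrative Python only, not product code, and it assumes the zero-highest internal priority convention described in the note; it shows the 6-bit encoding and the default IP and MPLS propagation into it.

# Illustrative sketch, not product code: the 6-bit PD QoS value (upper 3 bits
# priority, lower 3 bits drop precedence) and the default ingress mappings
# described above, assuming the zero-highest internal priority convention.
def pd_qos(priority, drop_precedence):
    assert 0 <= priority <= 7 and 0 <= drop_precedence <= 7
    return (priority << 3) | drop_precedence

def split_pd_qos(value):
    return (value >> 3) & 0x7, value & 0x7

def from_ip_tos(tos):
    # Layer 3 circuits: copy the upper 6 bits of ToS (the DSCP), inverting the
    # upper 3 bits because that field treats 7 as highest while the internal
    # standalone priority treats 0 as highest.
    dscp = (tos >> 2) & 0x3F
    return pd_qos(7 - ((dscp >> 3) & 0x7), dscp & 0x7)

def from_mpls_exp(exp):
    # MPLS circuits: EXP maps to the priority bits (inverted); drop precedence 0.
    return pd_qos(7 - (exp & 0x7), 0)

if __name__ == "__main__":
    print(split_pd_qos(from_ip_tos(0xB8)))   # DSCP 46 (EF) -> priority 2, drop precedence 6
    print(split_pd_qos(from_mpls_exp(5)))    # EXP 5 -> priority 2, drop precedence 0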


Note: The ingress QoS propagation default behavior depends on the
type of packet received, the circuit type, and the corresponding
propagation configurations on the ingress circuit. Typically, only one
ingress propagation type is applied when a packet belongs to multiple
propagation types. Exceptions exist for certain tunneling modes,
where the tunnel has propagation followed by the encapsulated packet
propagation. Some propagation types only take effect if there is an
explicit propagate setting configured. Some are in effect by default
but can be deactivated via configuration. Others, such as MPLS,
are always enabled for their respective circuit binding types. Priority
precedence ordering evaluates the propagation bindings for the ingress
propagation types. Basically, QoS is propagated from the relevant
outer-most header. On ingress, if propagate-qos-from-ethernet
and propagate-qos-from-mpls are configured for the same
circuit, propagate-qos-from-ethernet is performed for dot1q
and Q-in-Q frames and propagate-qos-from-mpls is performed
for untagged frames. The default propagation for IP is to copy the full
DSCP to PD-QoS. For other propagation types, the default uses the
8P0D mapping schema.

Table 10 defines the QoS propagation types for supported ingress circuits
and forwarding types. As noted above, only one ingress propagation type
is applied according to precedence rules, type of packet received, circuit
type, and configuration. For example, a label-switched (MPLS) packet could
propagate from an IP packet or Ethernet frame by configuring the use-ip or
use-ethernet command within the MPLS class map.

Table 10 QoS Ingress Propagation Types


QoS Propagation Type
Forwarding MPLS MPLS MPLS
Circuit Type Type Ethernet Transit L2VPN Standard IP
Ethernet
(1)
(untagged) Cross-connected
(1)
L2VPN/VLL x x x
(1)
Label-Switched x x
IP Routed x
802.1Q access Cross-connected x
via use-et
L2VPN/VLL hernet x x
via use-et
(1)
Label-Switched hernet x
IP Routed x x
802.1Q Q-in-Q
access Cross-connected x

via use-et
L2VPN/VLL hernet x x
via use-et
(1)
Label Switched hernet x
IP Routed x
802.1Q
(1)
Transport Cross-connected x
via use-et
L2VPN/VLL hernet x x x
via use-et
(1)
Label Switched hernet x x
IP Routed
802.1Q Q-in-Q
Transport Cross-connected x
via use-et
(1)
L2VPN/VLL hernet x x x
via use-et
(1)
Label Switched hernet x x
IP Routed
(1) Uses IP propagation type if the use-ip command is configured in the class map.

Table 11 describes the QoS propagation types for egress circuit and forwarding
types.

Table 11   QoS Egress Propagation Types

                                             QoS Propagation Type
Circuit Type              Forwarding Type    (Ethernet | MPLS Transit | MPLS L2VPN | MPLS Standard | IP)

Ethernet (untagged)       Cross-connected
                          L2VPN/VLL          x  x  x
                          Label switched     x  x
                          IP routed          x

802.1Q                    Cross-connected    x
                          L2VPN/VLL          x  x  x
                          Label switched     x  x
                          IP routed          x  x

802.1Q Q-in-Q             Cross-connected    x
                          L2VPN/VLL          x  x  x
                          Label switched     x  x
                          IP routed          x  x

802.1Q Transport          Cross-connected    x
                          L2VPN/VLL          x  x  x  x
                          Label switched     x  x  x
                          IP routed

802.1Q Q-in-Q Transport   Cross-connected    x
                          L2VPN/VLL          x  x  x  x
                          Label switched     x  x  x
                          IP routed

In the Ericsson IP Operating System, you can control the mapping between
packet header external priority and drop-precedence markings (IP TOS/DSCP,
Ethernet 802.1q, 802.1p, and MPLS EXP) and their internal representations
in the PD QoS value.

Configurations that set or modify PD QoS priority:

• qos priority command

• qos propagate from ip/ethernet/mpls commands and ingress class maps

• mark priority command and the conform/exceed/violate variants

• mark dscp and mark precedence commands and the conform/exceed/violate variants (which can also modify IP TOS)

Configurations that reference PD QoS priority:

• qos propagate to ip/ethernet/mpls commands and egress qos class-map commands

• Class definitions (all 6 bits)

• Congestion avoidance maps

• Queue maps (priority bits only)

• Forwarding plane elements that enqueue and prioritize packets to the backplane switch fabric (priority bits only)

On the SSR, ingress propagation of QoS markings can also be enabled per
port or per service instance. For incoming packets, this feature enables the
use of 802.1p priority bits in the 802.1q Ethernet header or MPLS EXP bits in
the outermost MPLS label to set the PD QoS priority value that determines
ingress priority treatment for each packet. Each NPU has four ingress queues
for incoming traffic. The PD QoS priority value assigned to a packet determines
which of the four queues the packet is admitted to. Each PD QoS priority value
has a fixed mapping to a queue, as shown in Table 12.

Table 12   NPU Queue Default Ingress PD-QoS Priority Values

CLI Priority Value    PD QoS Priority Value    Ingress Queue
0 or 1                7 or 6                   0 (highest priority)
2                     5                        1
3, 4, or 5            4, 3, or 2               2
6 or 7                1 or 0                   3 (lowest priority)

The queues are serviced in strict priority order on each NPU. Each higher-priority queue is fully serviced before the next lower-priority queue is serviced. Under congestion, the highest-priority packets are most likely to be forwarded, and the lowest-priority packets are most likely to be discarded due to queue overflow. If the port-propagate commands are not configured, all packets received on the port are treated as the lowest-priority traffic for ingress oversubscription purposes (they are assigned PD-QoS priority value 0 and ingress queue 3).
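
The fixed mapping in Table 12 can be expressed compactly. The following Python sketch is illustrative only (the helper names are not part of the product); it reproduces the CLI priority to PD QoS priority inversion and the PD QoS priority to ingress queue assignment.

def pd_qos_priority(cli_priority):
    """CLI priority 0..7 maps to PD QoS priority 7..0 (zero-highest internal form)."""
    return 7 - cli_priority


def ingress_queue(pd_priority):
    """Map a PD QoS priority value to one of the four NPU ingress queues."""
    if pd_priority >= 6:        # CLI 0 or 1
        return 0                # highest-priority queue
    if pd_priority == 5:        # CLI 2
        return 1
    if pd_priority >= 2:        # CLI 3, 4, or 5
        return 2
    return 3                    # CLI 6 or 7, and ports without port-propagate


for cli in range(8):
    pd = pd_qos_priority(cli)
    print(f"CLI {cli} -> PD {pd} -> queue {ingress_queue(pd)}")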

Depending on the protocols supported on an egress circuit, multiple egress QoS propagation marking types can co-exist and must be serviced based on the egress packet and frame types. The egress QoS propagation performed depends on the type of packet, circuit type, and the corresponding propagation configurations on the egress circuit. If a packet belongs to multiple propagation types, multiple egress propagation types can be applied. Most egress propagation types only take effect if there is an explicit binding configured. According to the configuration, it is possible that packets can:

• Have no egress propagation performed and the original markings are preserved as received

• Have priority markings generated according to forwarding decisions

• Be marked using other features


The exception is MPLS transit LSR, where the default MPLS propagation is
always applied. The default propagation for IP is to copy the full PD-QoS
to DSCP. For other propagation types, the default uses the 8P0D mapping
schema with eight priorities and zero drop precedence (DP) values. When
translating from a 3-bit value to a 6-bit value, the 3-bit value is copied to the
priority field and the DP is cleared to zero. Other mapping schema types reduce
the number of priorities represented to enable some DP values in the encoding.
For more information, see mapping-schema.
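
The following Python sketch illustrates the default 8P0D translation described above. It assumes the 6-bit PD QoS value carries the priority in its upper 3 bits and the drop precedence in its lower 3 bits, and it omits the optional zero-highest inversion.

def to_pd_qos_8p0d(external_3bit):
    """Ingress: copy the 3-bit marking into the priority field; DP = 0."""
    return (external_3bit & 0x7) << 3


def from_pd_qos_8p0d(pd_qos):
    """Egress: emit only the priority bits; no drop precedence is encoded."""
    return (pd_qos >> 3) & 0x7


assert from_pd_qos_8p0d(to_pd_qos_8p0d(5)) == 5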

3.9.1.2.9 Class Definitions

A class definition is an alternative to using an ACL to classify traffic for policing and metering policies. As with ACLs, a class definition enables a policy to perform class-based rate limiting in addition to circuit-based rate limiting. Class definitions are particularly useful for Layer 2 circuits because they do not have access to the IP packet headers.

The SSR supports up to 15 class definitions, and up to 8 classes can be defined per class definition based on PD QoS values. Each class definition classifies all traffic that does not match a defined class to class-default. A class definition can assign a PD QoS value to a single class; multiple PD QoS values can belong to the same class. Each class definition can be applied on ingress and egress, and each circuit can have a single class definition for each direction. The PD QoS value of a packet is used as an index into the class definition table. The selected table element specifies the Class-ID, which is used by the policing policy to perform actions assigned to the class.
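
A minimal model of this lookup is sketched below in Python. The class and method names are hypothetical; the point is only that the PD QoS value indexes the class definition and unmatched values fall into class-default.

CLASS_DEFAULT = "class-default"


class ClassDefinition:
    def __init__(self):
        self._table = {}                      # PD QoS value -> Class-ID

    def assign(self, pd_qos, class_id):
        # A PD QoS value can be assigned to a single class, but several
        # PD QoS values can share the same class.
        self._table[pd_qos] = class_id

    def classify(self, pd_qos):
        return self._table.get(pd_qos, CLASS_DEFAULT)


cd = ClassDefinition()
cd.assign(7, "voice")
cd.assign(6, "voice")
cd.assign(3, "video")
print(cd.classify(6), cd.classify(0))         # voice class-default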

Figure 85 illustrates the assignment of class map definitions.

Figure 85 Ingress/Egress QoS Class Definition


Class maps provide a mechanism to define a configurable schema for customized QoS propagation. Class maps support the following QoS marking types:

• Ethernet 802.1p marking

• IP Differentiated Services Code Point (DSCP) marking

• MPLS EXP marking

The SSR supports up to 128 class maps per line card for the marking types
listed above.

3.9.1.2.10 Congestion Avoidance Map

Used in conjunction with priority weighted fair queuing (PWFQ) policies, congestion avoidance maps configure queue depth and WRED drop profiles to control packet queue admittance and discard decisions. Defining the drop behavior for up to 8 queues, the structure of a congestion avoidance map contains the following attributes per queue:

• Queue depth—Sets the tail drop threshold.

• Instantaneous queue length—The number of packets in the queue at the time an admittance test is performed.

  NP4-based cards do not support exponential weight.

• Floyd and Jacobson RED algorithm—Defines drop probability as a function of average queue occupancy. The profile is selected according to the PD-QoS value.

NP4-based cards use the instantaneous queue length for performing a queue admittance test for a WRED drop decision at the instant a packet is to be enqueued. Whenever a packet is submitted for enqueuing, a queue admittance test is performed to determine whether to enqueue the packet or drop it. The instantaneous queue length is compared against the configured WRED curve queue occupancy thresholds to determine which drop probability value to use in the drop decision.

In contrast, the RED algorithm devised by Floyd and Jacobson uses a moving weighted average queue length, in which the queue length is sampled and averaged over time (giving higher weight to the most recent value). For more information, see "Calculating the average queue length" in http://www.icir.org/floyd/papers/red/red.html. Other NPUs in the SSR might support an exponential weight, where the weight value influences the responsiveness of the admittance test to changes in queue occupancy. The Floyd and Jacobson RED algorithm applies a filter to reduce the effects of short-term traffic bursts; this filter can be tuned by adjusting the exponential weight. NP4 does not perform this aspect of the algorithm and may be more susceptible to dropping bursty traffic. To mitigate traffic loss, it is recommended that you configure NP4 WRED curves with more lenient drop probabilities and thresholds than you would select with an exponential weight.

The RED algorithm uses a weighted moving average of queue occupancy to measure congestion. The exponential weight controls the sensitivity of the average and thereby the burst allowance.

A drop profile sets the probability of dropping a packet as a function of the average queue occupancy. When the average queue occupancy is below the minimum threshold, no packet dropping occurs. Beyond the minimum threshold, as the average queue occupancy approaches the maximum threshold, the drop probability increases linearly toward the configured drop probability. If the average queue occupancy is greater than the maximum threshold, the drop probability is 100%.
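
The drop profile behavior and the difference between the NP4 admittance test and the Floyd and Jacobson average can be sketched as follows. This Python code is illustrative only; thresholds and weights are stand-ins for configured values.

import random


def wred_admit(queue_len, min_th, max_th, max_drop_prob):
    """Return True if the packet is admitted to the queue, False if dropped.

    Below min_th nothing is dropped; between min_th and max_th the drop
    probability rises linearly toward max_drop_prob; above max_th all
    packets are dropped.
    """
    if queue_len < min_th:
        return True
    if queue_len >= max_th:
        return False
    drop_prob = max_drop_prob * (queue_len - min_th) / (max_th - min_th)
    return random.random() >= drop_prob


def averaged_queue_len(prev_avg, current_len, exp_weight):
    """Floyd & Jacobson moving average (not used by NP4, which feeds the
    instantaneous length into the admittance test). A larger exp_weight
    makes the average react more slowly to short bursts."""
    w = 1.0 / (1 << exp_weight)
    return (1.0 - w) * prev_avg + w * current_len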

The W in WRED refers to weighted drop precedence, which is not to be confused with the moving average exponential weight (not supported on NP4 cards). WRED is an extension to random early detection (RED) and allows a single queue to have several different occupancy thresholds that are selected based on QoS markings in the frame or packet header. For example, a frame or packet marked for higher drop precedence (or drop eligibility) could have lower queue occupancy drop thresholds. Queue congestion causes higher drop precedence packets to be dropped before packets of lower drop precedence in the same queue. This enables prioritization of conforming traffic over non-conforming traffic during periods of congestion.

Each queue can have up to three drop profiles, where the drop profile for a given
packet is determined by the PD-QoS value, as configured in the congestion
avoidance map. Per queue, a PD-QoS value can be assigned to only one drop
profile. However, multiple PD-QoS values can select the same drop profile.
Because a congestion avoidance map accommodates up to 8 queues, the map
can specify up to 8 queue depths, 8 exponential weights, and 24 drop profiles.

3.9.1.2.11 NPU WRED Drop Profile Limitations

The number of NPU WRED Drop Profiles is limited to the following:

• Drop profile templates (which define the shape of the RED curve):

  - Global template array
  - 16 template arrays per TM hierarchical level (L0 through L4)
  - 8 templates per array, selected by TM header frame color

• Scaling factors (which combine with the templates to determine the absolute threshold values):

  - 1 global factor
  - 32 factors per level at L0 and L1
  - 256 scaling factors per level at L2 through L4


The limits for NP4 are 256 scaling factors and 128 (that is, 16 × 8) templates.

3.9.1.2.12 Overhead Profiles

Downstream traffic shaping is controlled by QoS overhead profiles. Overhead profiles enable the router to take the encapsulation overhead of the access line (or circuit) into consideration so that the rate of traffic does not exceed the permitted traffic rate on the line.

The overhead profile works in conjunction with a priority weighted fair queuing (PWFQ) policy's configured rate maximum value. The PWFQ policy defines the rate of traffic flow. The overhead profile defines the encapsulation overhead and the available bandwidth on the access line. These attributes are associated with an overhead profile (a worked sketch follows the list):

• Layer 1 overhead = reserved bytes per packet—If the reserved-bytes-per-packet attribute is set to 12, 64-byte packets are shaped as if they were 76 bytes.

• Layer 2 overhead = encapsulation type (Ethernet)—The Layer 2 overhead (encapsulation type for a specific access line type within the overhead profile) is the number of L2 overhead bytes and the method used for calculating the overall packet size. For Ethernet encapsulation, the number of L2 overhead bytes is 18.
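
The shaping arithmetic can be illustrated with a small Python sketch. It assumes the shaper simply adds the reserved Layer 1 bytes and the Layer 2 encapsulation bytes to each packet; the helper name is hypothetical.

def shaped_packet_size(payload_bytes, l1_reserved=12, l2_overhead=18):
    """Packet size as seen by the downstream shaper (assumed here to be the
    sum of the payload, the reserved Layer 1 bytes, and the Layer 2 overhead)."""
    return payload_bytes + l1_reserved + l2_overhead


# The Layer 1 example from the list: 12 reserved bytes make a 64-byte packet
# shape as 76 bytes (Layer 2 overhead excluded to match that example).
print(shaped_packet_size(64, l1_reserved=12, l2_overhead=0))   # 76

# With Ethernet encapsulation, 18 bytes of Layer 2 overhead are also counted.
print(shaped_packet_size(64))                                  # 94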

3.9.1.3 QoS Services over Service Instances

As of SSR Release 12.2, the router supports binding QoS services to service instances.

Layer 2 service instances are configured under Ethernet ports or link groups.
For a link group, and for any L2 service instances under that link group, QoS
bindings are applied per constituent.

This feature is enabled whether or not the link-pinning keyword is configured on the LAG. When a circuit is configured with link-pinning, all egress traffic for that circuit flows through its pinned constituent link.

Next-generation (NG) L2 circuits can be LAG-based, and multiple constituents of that LAG can reside on the same or different packet forwarding engines (PFEs). This feature supports overhead profiles and ingress/egress propagation for each NGL2 circuit, with up to five bindings (policing, hierarchical policing, metering, hierarchical metering, PWFQ). QoS policies on NGL2 circuits and 802.1Q permanent virtual circuits (PVCs) behave similarly.

The following QoS services are now extended to service instances on the SSR (on 40-port Gigabit Ethernet (GE) and 10-port 10GE cards):

• Policing, inherited policing, and hierarchical policing (hierarchical child) QoS policy bindings on ingress


• Metering, inherited metering, and hierarchical metering (as hierarchical child) QoS policy bindings on egress

• PWFQ queuing and inherited queuing on egress

• Overhead profile and its inheritance on egress

• QoS priority on ingress

• Propagation of QoS to/from Ethernet using a dot1q profile on ingress and egress

The following limitations apply to QoS on NGL2 circuits:

• Layer 2 service instances support the attachment of 802.1Q profiles, but support is limited to the propagate commands within the profile. No other parameters configured in the profile are supported by service instances.

• QoS attributes included in this feature can only be inherited by a service instance from a port or link group, not from a PVC or tunnel.

• Only the two outermost layers of 802.1Q tags are read or updated for 802.1p priority bits for ingress and egress propagation.

This functionality is extended using the existing architecture for Layer 3 circuits
(dot1q PVCs) with the following changes:

• Creating a service instance circuit triggers the creation of an h-node for the
circuit. In addition, newly created circuits inherit hierarchical, inherited,
policing, and metering bindings from the parent circuit (port or LAG), if the
parent circuit has such bindings. Unlike 802.1Q PVCs, service instance
circuits do not inherit from other 802.1Q PVCs.

• When you attach a dot1q profile to a service instance, the profile is sent as
an optional attribute of the service instance through FABL to the line card.
For 802.1Q PVCs, the profile is sent as part of the 802.1Q object to FABL.

• The profile can include information to propagate the 802.1p priority to and
from the PD-QoS priority marking. Unlike PVCs, the propagation from
the Ethernet priority can happen from the outer or inner tag. Similarly,
propagation to Ethernet can be done for outer, inner, or both Ethernet tags.

There are no architectural modifications to FABL or ALd, and no microcode changes, as part of this feature.

3.9.2 ACLs

3.9.2.1 Overview

For CPI-level configuration information, see Configuring ACLs.


Access control lists (ACLs) provide advanced traffic handling for incoming or outgoing traffic. They contain rules that match subsets of the traffic to actions that determine what is to be done with the matching traffic. The ACLs can be:

• IP filtering ACLs, which permit or deny traffic to flow.

• Policy ACLs, which can be applied to:

  - Forwarding policies, which can redirect or drop traffic.

  - QoS policies, which are the interface to all QoS support in the system.

  - Reverse path forwarding (RPF), which verifies the source IP address on ingress packets.

  - NAT policies, where the specified action is applied to all packets traveling across the interface or subscriber circuit or, if an ACL is referenced, to packets that do not belong to the classes specified by the ACL and by the NAT policy.

An ACL consists of a number of statements that match a subset of traffic with corresponding actions. An ACL can be applied to:

• Static circuits—The ACL is applied to an interface and then to the circuit the interface is bound to. ACLs can be applied to inbound or outbound traffic.

• Dynamic circuits (subscribers)—The ACL is applied directly to a subscriber. ACLs can be applied to inbound or outbound traffic.

• Ethernet management port—ACLs can be applied to inbound or outbound traffic.

• Administrative traffic received on the control plane—These ACLs are applied to a context and apply to inbound traffic only.

IP filtering ACLs contain permit and deny statements. Their order is significant,
because the rules are matched in the order that they are specified. They have
an implicit deny all statement at the end so that all traffic that is not explicitly
permitted is dropped. If a non-existent ACL is applied to an interface, all traffic
passes through.
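
The first-match-with-implicit-deny behavior can be sketched in Python as follows. The rule representation is hypothetical; only the evaluation order and the implicit deny at the end reflect the documented behavior.

def evaluate_ip_filter_acl(rules, packet):
    """Return "permit" or "deny" for the packet; rules are (action, match) pairs
    evaluated in configuration order."""
    for action, matches in rules:
        if matches(packet):
            return action
    return "deny"                 # implicit deny all at the end of the ACL


acl = [
    ("permit", lambda p: p.get("dst_port") == 22),
    ("deny",   lambda p: p.get("proto") == "tcp"),
]
print(evaluate_ip_filter_acl(acl, {"proto": "tcp", "dst_port": 80}))   # deny
print(evaluate_ip_filter_acl(acl, {"proto": "tcp", "dst_port": 22}))   # permit
print(evaluate_ip_filter_acl(acl, {"proto": "udp", "dst_port": 53}))   # deny (implicit)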

Policy ACLs contain rules that map subsets of the traffic to classes. Their
order is significant, because the rules are matched in the order that they are
specified. Traffic that does not match any of the statements is assigned to
the default class.

Both types of ACLs are context-specific. They can be configured to be activated and deactivated based on the time of day, in an absolute or periodic fashion, using the condition mechanism. Instead of having a permit or deny rule or a class for each rule, a condition is referenced. When it is enabled, this condition can determine the action type or class, based on time-dependent conditions. ACL counters for both IP and policy ACLs can be enabled.


The following restrictions and limitations affect the application of the ip access-group command to Layer 2 circuits:

• There is no support for packet logging when IP ACL filters are applied to Layer 2 circuits.

• There is no support for dynamic ACLs.

• There is no support for 802.1Q PVCs with raw encapsulation.

• There is no support for inheritance of IP ACL filters. An IP ACL applied to an outer VLAN filters all traffic on the VLAN except for the traffic that is going to an inner VLAN.

You can apply IP ACL filters to the following Layer 2 configurations:

• Ethernet ports

• Cross-connected individual VLAN-based circuits or a range of VLAN-based circuits

• Cross-connected VLAN-based circuits or a range of VLAN-based circuits inside an 802.1Q tunnel

• Cross-connected VLAN-based aggregated circuits that are part of an 802.1AX link group

• L2VPN ports

• VLAN-based circuits or a range of circuits attached to an L2VPN

• VLAN-based circuits or a range of circuits inside an 802.1Q tunnel that is attached to an L2VPN

The ip access-group (circuits) command does not support the following Layer 2 circuits:

• Raw encapsulation VLAN-based circuits

• Transport-enabled circuits

3.9.2.2 ACL Communication Flows

Figure 86 shows the communication paths between the various ACL components. The CLS module is divided into three parts: CLS, CLS Route Processor (RP) Platform Independent (PI), and ACL forwarding abstraction layer (FABL) PI. Messages are sent between the ACL FABL and CLS RP PI. The messages sent are ACL rule sets and control information. The line card portion (ACL FABL) manages the transactional nature of the ACL rule sets.


Figure 86 ACL Communication Flow

The CLS module filters, classifies, and counts packets that are processed
using ACLs. CLS receives multiple ACLs from multiple components
(QOS/NAT/FORWARD/FILTER/PKT_DBG/RPF) and these ACLs are applied
to one or more circuits. Each packet is processed by a sequential lookup in
each of the configured ACLs.

The CLS PI code has two parts:

• CLS RPSW PI—Manages the processing of ACLs and access groups into
ACL rule sets. An access group is a container that is bound to one or more
circuits. ACLs on the same circuit are separated into different access
groups by their service category. An ACL rule set is a set of ACLs applied
to the PFE as a single lookup list. It can contain part of an access group
(one or more ACLs within the access group), the whole access group, or
multiple access groups. The ACL rule set does not have to be on the same
circuit or even within the same service category.

ACL rule sets are built based on PD capabilities. The PD code provides
capabilities, validation, resource allocation, and utility functions to process
multiple ACLs into one or more ACL rule sets. A cookie for the ACL rule set
is returned by the PD library.

Note: The cookie is for platform-dependent use only. The CLS RP PD may allocate resources on the line card to implement an ACL. The cookie contains the representation of the resource and is communicated to the forwarding function on the line card.

CLS RP PI also manages the distribution of the ACL rule sets to the
appropriate line cards. The ACL rule set can be reused by multiple circuits
within the same service on the same PFE. The distribution of the ACL rule


sets to the ACL FABL code is transactional in nature. This means that
ACL rule sets can be rolled back if not completed correctly (for example,
if CLS RP restarts).
• ACL FABL PI processes the following messages:

  - Sent by CLS, the summary_msg provides details on the ACL rule set, including the format of the ACL rule set, the number of entries in or the size of the message, and some platform-dependent data. The platform-dependent portion provides any additional information required to decompress the ACL rule set as it is received from CLS RP.

  - The download_acl_rule_set_msg contains the body of the ACL rule set. This message can be received multiple times if the rule set does not fit in one buffer. The ACL rule set is not added to the PFE until all download messages are received.

    As the ACL rule set is received (using the download_msg), and based on the information within the summary message, the ACL rule set is decompressed and added to the database. If there is a FABL or CLS restart during this phase, the database entry is removed and a re-download of all ACL rule sets is requested.

  - CLS sends the done_msg when all download messages have been sent, signaling to the FABL that the ACL rule set has been completely downloaded. At this point, FALd has added the ACL rule set to the hardware.

  - To activate the ACL rule sets on the circuits, CLS sends a bind_msg to the PFE (a sketch of this exchange follows the list).
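
The following Python sketch models the transactional download handled by ACL FABL PI. The message names follow the text; the state machine itself is an illustration, not the actual FABL implementation.

class AclRuleSetDownload:
    def __init__(self, rule_set_id):
        self.rule_set_id = rule_set_id
        self.expected_chunks = 0
        self.chunks = []
        self.state = "IDLE"            # IDLE -> DOWNLOADING -> COMPLETE/FAILED

    def on_summary_msg(self, num_chunks):
        # Summary describes the format and size of the rule set, plus any
        # platform-dependent data needed to decompress it on arrival.
        self.expected_chunks = num_chunks
        self.state = "DOWNLOADING"

    def on_download_acl_rule_set_msg(self, chunk):
        if self.state != "DOWNLOADING":
            raise RuntimeError("download received before summary")
        self.chunks.append(chunk)      # the rule set may span several buffers

    def on_done_msg(self):
        # Only now is the rule set handed to FALd and added to the hardware.
        if len(self.chunks) == self.expected_chunks:
            self.state = "COMPLETE"
        else:
            self.state = "FAILED"      # cleaned up when the RP requests deletion

    def on_bind_msg(self, circuit_id):
        # Binds for a failed or incomplete rule set are discarded.
        return self.state == "COMPLETE"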

The ALd layer provides APIs and functionality that are platform-specific. For
CLS/ACLs, this layer interacts with drivers to configure the PFEs to perform
packet classification. It receives the ACLs in a well-known format. If necessary,
the ALd converts the ACLs into a hardware-dependent version of the ACLs.
The ALd also provides functions to extract information such as logs and
counters from the PFE.

PD can be considered part of the ALd that exists on the controller card and line
card. The PD Libraries provide resource management and reference IDs for
the ACL rule set. PD also provides special build functions to build ACL rule
sets into PD objects.

ACL rule sets are managed with the following transactions:

• Adding ACL rule sets to a PFE—ACL rule sets are added to a PFE in several steps: through the summary, download, and done messages. After validation of the ACL rule set, it is added to the PFE. It is also marked as completed within the ACL database. If a failure occurs within FABL or ALd, the ACL rule set is marked as failed. The database entry is only cleaned up when the RP requests a deletion of the ACL rule set. If a restart occurs at this point, the database entry is not cleared. Instead, during re-download of the ACL, FABL re-invokes the ALd APIs to add the ACL rule set to the PFE. The ALd is expected to reply to additions after it completes this task. In this case, the ACL rule set cannot be used until the reply has been received. All requests to use the ACL rule set should be blocked until the reply has been received.

• Applying ACLs to circuits—The CLS and ACL FABL process the bind_msg
message received from the RP to apply an ACL rule set to a particular
circuit. Under certain circumstances, the RP denies all packets (stopping
all traffic). Otherwise, it activates the ACL rule set in the PD code.

• ACL rule set download failure—If a failure is detected during the downloading of an ACL rule set, the ACL database entry is marked failed
and a failure message is sent to the RP. If the failure occurred before the
ACL database entry was created, a temporary entry is created and marked
failed. Because the failure message is asynchronous, bind messages could
have been received for the ACL rule set. Any bind messages are discarded.
Eventually a delete message is received from the RP to remove the ACL
rule set. If the failure occurred while the ACL rule set was being downloaded
to the database, the ACL cannot be removed from the ALd and therefore
no message is sent to the ALd. Instead, the ACL is removed from the ACL
database, and a reply message is sent to the RP to free up its resources.

• Removing an unused ACL rule set—When an ACL rule set is no longer in use by the PFE, the rule set is deleted. A delete message is received for this processing. The ACL rule set is first found in the ACL database. If the ACL is found, a check is performed to ensure that the rule set is not used in the PFE. If the ACL is in use, an unbind message is sent to the ALd before a message is sent to the ALd to remove the ACL rule set from the PFE. After the ALd finishes deleting the rule set from the PFE, an asynchronous message is sent through FABL to the RP to perform any resource cleanup on the RP. If the ACL rule set is not found in the ACL database, the asynchronous response is sent to the RP immediately.

• After CLS or RP restart—After a CLS or RP restart, the CLS re-downloads all ACL rule sets and circuit-binding messages. FABL receives a CLS restarted message from the RP and marks each ACL rule set in the ACL database as stale. The CLS then re-downloads all ACL rule sets. If the FABL finds the ACL rule set in the ACL database, it clears the mark on the ACL rule set. New ACLs are processed in the same way. After the new ACL rule set has been received, bind messages are received. Because FABL does not store ACL-to-circuit mappings, it does not know whether this is a new binding or an existing binding. FABL therefore sends the bindings to ALd.

• During a CLS restart—For ACL rule sets that are removed during a CLS
restart, the FABL receives a delete message. Because CLS does not
know about the circuit-to-ACL mappings, the FABL unbinds the ACL rule
set. After it is unbound, the ACL rule set is removed from the ALd. When
each ACL rule set is unbound, the ALd sends a reply to the RP to free
the resources on it.


• During a line card or FABL restart, a registration message is sent to the CLS. If the restart was a line card restart, the CLS first requests that traffic be stopped. A line card or FABL restart is handled in the same way as a CLS restart, except that the FABL requests a re-download of all ACL rule sets and bindings from the RP. The FABL verifies that the ACL rule set is contained in the ACL database. Based on this information, it pushes the data to the ALd. During the FABL restart, messages may have been sent to the FABL and may not have been processed by the FABL and ALd. Because the ACL additions and bindings are processed normally, only the deletions and unbindings need to be protected. Unbind messages are not resent by the RP; only delete messages are resent. In these cases, the delete message must ensure that there are no circuit references to it. The delete message handler must first request an unbind message from the ALd before processing the delete message.

• During an ALd or PFE restart, the FABL does not know what was processed
by the ALd or what information was lost. Therefore, the FABL reapplies all
configurations to the ALd and PFE for each corresponding PFE.

3.9.2.3 ACL Database

The ACL database stores the following information about ACL rule sets in
persistent memory:

• Full ACL rule set

• PD cookie

• CLS ID for the rule set

• Status of the rule set

• Linked list of circuits that are using a particular ACL

The ACL database is transactional and can roll back ACL rule sets before they
are completed. The ACL rule set-to-circuit mappings are kept in the PI data
structure associated with each circuit. This information is called a feature block.
During CLS restarts, because an ACL could have been removed without the
RP sending an unbind message, the CLS generates an unbind message that
is sent to the ALd, based on the linked list of circuits in the ACL database.

For each circuit, the feature block stores the list of ACL rule sets required for
the circuit. The list is created when bind requests are received from the RP and
contains enough information to re-create the bind request to ALd.

3.10 Inter-Chassis Redundancy


The SSR supports several types of inter-chassis redundancy (ICR): MC-LAG,
VRRP ICR failover support for PPPoE, DHCP, and CLIPS subscriber sessions,
and BGP-based ICR.


For configuration and verification information about MC-LAG, see Configuring Multichassis LAG; for information about MC-LAG architecture and fast failover, see Section 3.10.2 on page 209.

For configuration and verification information about VRRP ICR, see Configuring
Inter-chassis Redundancy.

3.10.1 BGP-Based ICR


BGP-based ICR (in SSR 13.1, used only by EPG) provides geographical redundancy between the two nodes of an ICR pair. The two nodes are seen externally as one. Both routers have the same hosted IP address on each ICR-enabled interface. One is treated as active and one as standby; if the active router fails, the ICRd module manages switching traffic to the standby router.

For more configuration and operations information, see Configuring BGP for
Inter-Chassis Redundancy.

Note: The BGP-based ICR infrastructure on the SSR is generic and can be
used with multiple applications on the SSR. As of SSR 13B, EPG is the
only product using the infrastructure. The architecture is designed to
accommodate the addition of new applications in future releases.

3.10.1.1 Supporting Architecture

The functional architecture is as follows:

BGP-based ICR module interaction:

• The ICRd now interacts with BGP through the BGP Lib.

• ICRd influences the attributes with which BGP may now advertise and withdraw prefixes.

• ICRd can instruct BGP to advertise routes or virtual IP addresses (VIPs) with a certain preference.

• Based on the preference level configured on the node, and the first VIP that is added (on both nodes), ICRd makes a state determination with regard to which chassis to bring up as active and which as standby.

• ICR transport information may be configured; this information is propagated back to all clients of ICR, including the newly added TSM client.

TSM as an ICR client:

• TSM is now a new client of the ICR infrastructure.

• TSM will include the ICR client library and will add and install VIPs from
the EPG application to ICRd.


• TSM APIs have been enhanced to support the installation of routes with
ICR prefix attributes (ICR or ICR tracked VIPs).

• Both IPv4 and IPv6 VIPs are supported

• TSM will now propagate ICR state and transport information for every
update received from ICRd back to the application, by means of callbacks.

EPG-TSM interactions:

• EPG will now install ICR-specific routes with the ICR or tracking attributes.

• These routes represent VIP addresses within a context that the application may choose to track.

Transport between RPSW cards on ICR peers:

• Application components on the RP can choose to sync information with their peers on the standby node, so that session information is preserved during ICR switchover.

• A new RP-RP transport communication path is supported for EPG on the chassis as well as on the SSR-SIM.

Transport between SSC peers:

• EPG will now use the ICR transport library to send and receive information between its peer nodes.

• A new SSC-SSC transport path is supported for application peers on the active and the standby node.

• The ICR library is now available on the SSC and provides a reliable transport interface.

• The application can now forward and receive traffic through the reliable transport library by means of tagged sockets.

• EPG/application components on the SSC may send out packets that are traffic steered to the SSC on the standby node, with the active node being the initiator of the traffic.

• A new ICR steering table has been added; the application can manage this steering table to retrieve the provisioned traffic slice forwarding table.

• The application can forward packets through the reliable transport by configuring the sender and receiver endpoint. The receiver endpoint may carry an index that is used as the TSID.

• A new PAP ID (the ICR PAPID) has been introduced. This is functionally like a regular kernel PAPID.


• Changes have been introduced to steer ICR kernel PAPIDs by converting the SSC header to the RBN header along with the tag value. This represents the tagged socket endpoint of the application.

• The line card NPU now supports steering of ICR packets by looking into the ICR header and extracting the application ID and application type.

3.10.1.1.1 ARP-Table Synchronization

When two SSRs are part of a high availability cluster like BGP-based ICR,
MC-LAG, or VRRP ICR (and ARP synchronization is enabled), the ARP
daemon becomes a client of ICRlib and uses it to communicate with ARPd on
the ICR peer chassis. ARPd on the active and standby peers sends application
messages over ICRlib with ARP entries to be added or deleted.

3.10.1.1.2 Single IP Addresses for Node Pairs

To enable the BGP-based ICR active and standby nodes in an ICR pair to be seen externally as one, they are configured with the same IP address. ICRd manages the active or standby node states, and all traffic flows to the active node. Sessions are synchronized on both nodes, provided by ICR message transmission and flow control.

For configuration and operation information about this ICR model, see
Configuring BGP for Inter-Chassis Redundancy.

When an ICR loopback interface is created, it includes the local and remote
IP addresses and local and remote UDP ports. ICR transport packets include
the destination card type (RPSW or SSC) to enable packets to be punted to
the RPSW card or steered to an SSC, based on traffic slice forwarding tables
(TSFTs).

When SSR nodes are configured in an ICR pair (in this model), TSM forwarding changes to a more complex forwarding structure:

• Forwards packets for application processing (such as EPG or IPsec) to different SSCs based on multiple TSFT tables (dynamically created)

• Forwards L4 control packets based on their port numbers (TCP, UDP, or SCTP)

• Forwards all fragments to a single SSC for reassembly

Figure 87 illustrates the traffic steering flow at the line card NPU.


Figure 87 Distributed Traffic Flow to SSCs

To direct these forwarding streams, a service map has been added to the
TSFTs at each hosted next hop.

For unicast forwarding, TSMd interacts with RIB to install, remove, and
redistribute routes for components such as EPG in FIB.

TSMd also communicates with FABL to install routes in FIB.

To manage the virtual hosted IP addresses and mobile subscriber subnet routes, ICR interacts with the following modules:

• When changes occur, TSMd sends the prefix and tracking attribute to ICRd using the ICR library API.

• ICRd hands the prefix to BGP.

• BGP advertises the prefix and attributes.

  If peer detection is configured, BGP monitors the prefix state; if the active node is not detected before the timer expires, BGP uses the callback function to inform ICRd.

• The standby node becomes the active node.


3.10.2 MC-LAG Fast Failover


SSR multichassis LAG provides sub-second MC-LAG switchover by using CFM on the constituent ports of the MC-LAG. If you configure ICR for an MC-LAG, the standby peer functions in hot-standby mode, maintained by fast detection of constituent link failures using CFM (802.1ag) and by synchronized ARP caches on the active and standby peers (enabling the forwarding tables on the standby peer to be in a ready state for faster failover). For more information about ARP caching, see Section 2.2.2.2 on page 40. MC-LAG is integrated with ETI for faster propagation of error messages (for more information, see Section 3.10.2.2 on page 212).

For more information, see Configuring Multichassis LAG and Event Tracking.

Double-barrel static routes also support MC-LAG topologies; see Section 3.10.2.1 on page 210.

The MC-LAG fast failover architecture is based on the following functions:

• CFM or a single BFD session over an ICR transport link

• The event tracking infrastructure (ETI), which publishes BFD or CFM events to processes subscribed to the service, for fast peer node-down detection

• All communicating interfaces between the MC-LAG node pair being enabled for ICR

Of the two levels of peer detection logic, the primary method is BFD and the secondary is ICR.
This system has the following limitations:

• Some line card failures might not be covered by ETI.

• A "split brain" situation (where both the Home Slot and Backup Home Slot
could be active) could occur if an ICR transport link-down event causes
a BFD-down event.


Figure 88 MC-LAG Fast Failover

When you configure single-session BFD on the ICR transport link (between
SSR1 and SSR2 in Figure 88), the link state events are propagated to the
MC-LAG via the ETI. When the link group process receives a link down (false)
event, it takes action to renegotiate link availability, effectively influencing
a switchover. When a BFD-down event is received by the standby node,
the standby node takes over. If BFD is disabled, LGd uses the ICR state to
determine whether switchover is needed. BFD event configurations are sent to
ALd via RCM -> RIB ->FABL-BFD. If a BFD state changes in the forwarding
engine, ALd fetches the stored ETI object from its database and publishes it
to all subscribing processes.

If you enable sub-second chassis failover, the following process occurs when
the active links (1, 2, and 3 in Figure 88) fail; for example, if the power is
switched off in the active chassis. The system detects the active chassis failure
(single-session BFD over LAG detects ICR link down) and quickly reacts (BFD
propagates the event to LAG through the ETI). Finally, the system switches
over to the standby MC-LAG (LAG signals the remote switch to enable the link
toward SSR 2, and at the same time LAG enables the pathway from SSR 2
toward the trunk).

3.10.2.1 Double-Barrel Static Routes

Double-barrel static routes increase the resiliency of static routing through path redundancy for MC-LAG. You can use the ip route or ipv6 route command to configure both a primary and backup next-hop to a destination. If both next-hops are reachable, IP packets for the configured route are forwarded to the primary next-hop. If only one next-hop is reachable, IP packets for the configured route are forwarded to the reachable next-hop.

For double-barrel static routes with both next-hops reachable, if the router
detects a failure of the link to the primary next-hop, packet forwarding is
switched to the backup next-hop.

The router supports detecting the following failures in the primary path:


• Failure in a router circuit or port

• Failure in a router card

• Failure in a path monitored with Bidirectional Forwarding Detection (BFD)

• Failure in a tracked-object monitored with event tracking infrastructure (ETI)

Switching to a backup path is non-revertible. If a backup path fails, the router does not switch back to the primary path.

Note: Double-barrel static routes traversing a multichassis link aggregation group (MC-LAG) are an exception; there the selection of primary and backup paths depends on LAG status. For example, if you configure a route with the LAG and peer router as next-hops, the chassis hosting the active LAG installs the LAG as primary and the peer as backup; the chassis hosting the standby LAG installs the peer as primary and the LAG as backup. When an MC-LAG switchover occurs, these routes are reinstalled.

You can configure double-barrel static routes among multiple routes to a single
destination.

BFD supports FRR on double-barrel static routes. Double-barrel static routes, which are static routes that have primary and backup next-hops configured for a destination, increase the resiliency of static routing through path redundancy. You can configure BFD to detect a failure on a double-barrel static route, causing the router to quickly switch from the primary to the backup next-hop for forwarding packets to the route's destination.

For configuration information, see Configuring Basic IP Routing and Configuring BFD.

3.10.2.1.1 Supporting Architecture

The following line cards do not currently support double-barrel static routes:

• 4-port 10 Gigabit Ethernet, or 20-port Gigabit Ethernet and 2-port 10 Gigabit Ethernet line card

• 1-port 100 Gigabit Ethernet, or 2-port 40 Gigabit Ethernet line card

The core process running double-barrel static routes, if configured, is Staticd. The purpose of double-barrel static routes is to enable standby circuits. The standby circuit state is propagated to all ISM clients, including RIB. RIB treats it as equivalent to an Up state, except when a RIB client does next-hop registration and the next-hop resolution is over a circuit in Standby state; in that case, RIB communicates this to its clients. A next-hop where the circuit is in standby state is referred to in this document as a standby next-hop, or next-hop with the standby flag.

A double-barrel next-hop consists of four parts:


• A primary next-hop that is normally used for forwarding

• A backup next-hop to be used when the primary fails

• An optional ETI object to be monitored

• A negation-flag for the ETI object. By default, line cards monitor the ETI object and switch to using the backup when the ETI object is FALSE. If the (non-configurable) negation-flag is set, line cards switch to using the backup when the ETI object is TRUE.

Staticd interacts with the following modules:

• RIB adds the routes to FIB on line cards. If a client adds multiple routes
to the same destination with the same cost and distance, RIB treats it
as an ECMP route. RIB allows clients to add ECMP of double-barrels,
or mixed ECMPs where some paths are double-barrel and some are
single-barrel. RIB clients register for next hops and prefixes to be able to
return double-barrel next-hops where applicable.

• FFN process—When a double-barrel next-hop is configured, ALd registers for FFN notifications related to the primary path. FFN events could be BFD Down, CCT Down, Port Down, or Card Down. When ALd receives an FFN event that affects the primary path, it switches to the backup path. ALd does not register for FFN events for the backup path because switching is non-revertible.

• ETI—If a double-barrel next-hop is configured with an ETI object to be tracked, ALd subscribes with the ETI infrastructure to receive the initial state and any state changes of the ETI object. An ETI object can have the states UNKNOWN (before ALd has learned the initial state), TRUE, FALSE, or OTHER (other than True or False). How ALd reacts when it learns the state of the object depends on whether the negation-flag has been set: if the negation-flag is FALSE, ALd starts using the backup path whenever it sees that the ETI object has a FALSE state; if the negation-flag is TRUE, ALd starts using the backup path whenever it sees that the ETI object has a TRUE state. When ALd switches from the primary to the backup path, it stops subscribing to the ETI object (see the sketch after this list).
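
The switchover logic for a double-barrel next-hop can be summarized in the following Python sketch. Class and event names are illustrative; only the non-revertible switch on an FFN event or on the (possibly negated) ETI state reflects the documented behavior.

class DoubleBarrelNextHop:
    def __init__(self, primary, backup, negation_flag=False):
        self.primary = primary
        self.backup = backup
        self.negation_flag = negation_flag
        self.using_backup = False              # switching is non-revertible

    def on_ffn_event(self, event):
        # FFN events affecting the primary path trigger the switch.
        if event in ("BFD_DOWN", "CCT_DOWN", "PORT_DOWN", "CARD_DOWN"):
            self.using_backup = True

    def on_eti_state(self, state):
        # state is one of "UNKNOWN", "TRUE", "FALSE", "OTHER".
        trigger = "TRUE" if self.negation_flag else "FALSE"
        if state == trigger:
            self.using_backup = True           # ALd also unsubscribes from the object

    def next_hop(self):
        return self.backup if self.using_backup else self.primary


nh = DoubleBarrelNextHop("peer-A", "peer-B")
nh.on_eti_state("FALSE")
print(nh.next_hop())                           # peer-B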

3.10.2.2 CFM and ETI for MC-LAG

For more information about enabling object tracking, configuring trigger or reaction entities, and enabling ETI logging, see Event Tracking. For information about the Event Tracking Interface architecture, see Section 3.12 on page 215, and about FEN, see Section 3.13.1 on page 219.

To support Ethernet CFM over MC-LAG, two new communication paths have
been added to the router:

• LG errors reported by FEN through ETI

• CFM-LG errors reported through ETI


Figure 89

Figure 89 illustrates the flow of messages between SSR modules in the active
and standby peers, after LGd gets an update from CFM when a remote MEP
goes to the DOWN state.

LGd publishes an ETI event notifying that the LAG is going active. FABL LACP and FFN subscribe to the event and take action accordingly. LGd does not include CFM's UP message in calculating whether the LAG meets the configured min-link requirement.

The following components support this function:

• If the CFM CCM watchdog timer or keepalive timer configured on a link circuit fails, the NP4 triggers a CFM event to ALd. ALd in turn sends a state change message to CFM-FABL.

  CFM-FABL then decides whether to publish the state change to LGd, based on information stored in its database.

• CFM-FABL uses ETI to publish the state change to LGd.

• Upon receiving the state change, LGd performs a min-link calculation. If the number of active links becomes less than the configured min-links, and the MC-LAG is in the active state, it publishes the MC-LAG standby event to the modules subscribed for the event through ETI or ISM.

• If the number of active links becomes more than the configured min-links and the MC-LAG is supposed to go to the active state, LGd moves the MC-LAG from standby state to active state and sends the respective event to the modules through ETI or ISM (as sketched below).
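
The min-link evaluation can be sketched in Python as follows. The function and event names are descriptive stand-ins, not the product's identifiers.

def evaluate_min_link(active_links, min_links, lag_is_active):
    """Return the event LGd publishes after a constituent state change."""
    if lag_is_active and active_links < min_links:
        return "MC_LAG_STANDBY"     # hand over; subscribers learn of it via ETI/ISM
    if not lag_is_active and active_links > min_links:
        return "MC_LAG_ACTIVE"      # resume the active role
    return "NO_CHANGE"


print(evaluate_min_link(active_links=1, min_links=2, lag_is_active=True))   # MC_LAG_STANDBY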

3.11 Ethernet CFM and Single-Session BFD Home Slot Management

The SSR supports mechanisms for Ethernet CFM and single-session BFD over LAG that are based on defining and assigning Home Slots (and backup Home Slots for BFD sessions). Home Slot management is supported by fast-failure notification (FFN); see Section 3.13.2 on page 220.

3.11.1 Ethernet CFM


IEEE 802.1ag Connectivity Fault Management (CFM) provides service
management. It allows service providers to manage each customer service
instance individually. A customer service instance, or Ethernet Virtual
Connection (EVC), is a service that is sold to a customer and is designated
by the Service-VLAN tag or entire port. Therefore, 802.1ag operates on a
per-Service-VLAN (or per-EVC) basis. It enables the service provider to know
if a service instance or EVC has failed and, if so, provides the tools to rapidly
isolate the failure.

Ethernet CFM depends on each EVC being assigned a Home Slot by CFMd.
During operation, CFMd regularly synchronizes the local MEP information from
the remote MEP.


For an overview and configuration and operations information, see Configuring Ethernet CFM.

3.11.2 Single-Session BFD over LAG Fast Failover


Single-hop BFD enables the SSR to detect whether the L3 neighbor is alive
without consideration of the underlying L2 interface. Single-hop sessions are
BFD sessions for adjacent routers separated by a single IP hop.

Single-session BFD over LAG is based on Home Slot and Backup Home Slot
definition and selection. With SSR release 12.2 and higher, changes to FABL
enable syncing up the BFD state from the Home Slot to the Backup Home Slot
(and also to non-Home slots). The goal is for BFD clients to not detect a link
failure within the LAG.

For each BFD session, the control plane picks one line card as the Home Slot,
and the PFE on that line card handles the transmission and reception of BFD
messages. A packet that arrives on a non-Home Slot for a session is redirected
to the Home Slot. It is forwarded using loopback adjacency processing (based
on a loopback adjacency stored in an internal packet header added to the


packet on ingress), which indicates that the Home Slot is to perform the
processing for the packet.

If the active Home Slot fails, the backup Home Slot becomes the new active
one and takes over BFD sessions quickly.

For an overview of BFD and configuration and operations information, see Configuring BFD.

3.11.2.1 Supporting Architecture

ALd maintains a global table and provides allocations of the Watchdog counters
and OAM real-time counters for all the keepalive protocols such as VRRP,
BFD, 802.1ag, CFM, and Y.1731. ALd maintains the permitted allocations for
each timer level (3.3ms, 10ms, 100ms, 1sec). When all available OAM timer
resources and pre-allocated watchdog counters are allocated, it returns an
appropriate error code for further allocation requests.

ALd provides APIs for the following processes:

• Adding, deleting, and updating BFD sessions and their attributes in the
BFD table on NPU. The update BFD session API updates any change
from Home Slot to backup Home Slot. The update BFD session API also
communicates state information to non-Home slots to track BFD over LAG
sessions.

• Read/get one BFD table entry for a BFD session or read/get all BFD entries
from the BFD table in NPU.

• Processing BFD status update messages from NPU, triggering FFN events
(on session timeout or on session going down), and updating FABL-BFD
about status changes or remote-peer parameter changes.

• Creating loopback adjacencies at line card initialization for redirecting the BFD packets to the Home Slot.

BFD uses a homing table in the NPU (per ingress PFE) to fast switch from Home
Slot to backup when the Home Slot goes down. When the Home Slot of a BFD
session goes down, the BFD FFN client handler code in ALd PD (in all other
slots) toggles the control bit to make the backup Home Slot become active in all
20 Home Slot table entries corresponding to the original slot that went down.

3.12 Event Tracking Interface


The SSR event tracking interface (ETI) facilitates tracking time-critical events in the system. You can configure which objects are to be tracked. When the status of a tracked object changes, the event is published, speeding fault failover.

In this document, we use the following ETI terms:


Tracked-object             An object tracked by ETI, which generates an event when the object state changes.

Tracked-object-action      The action to be taken when a tracked-object-event is generated.

Tracked-object-event       The event generated when the state of a tracked object changes.

Tracked-object-publisher   The entity that publishes the tracked-object-event. For SSR release 12.2, BFD is the tracked-object-publisher.

Tracked-object-subscriber  The entity that acts when it receives a tracked-object-event. For SSR release 12.2, LAG is the tracked-object-subscriber.

For example, ETI can enable high-speed reactions to outages in MC-LAG, PW, or LAG links (see Figure 90):

1 Power fails in the active chassis, SSR 1, resulting in links 1, 2, and 3 being
down. SSR 2 must detect this in less than a second and switch over with
minimal data-path traffic disruption.

2 The BFD session over the link group between SSR 1 and SSR 2 detects
the inter-chassis links being down.

3 ETI event propagation publishes the event to LAGd.

4 LAGd signals the remote Ethernet switch to begin sending traffic to SSR 2.
At the same time, LAGd enables the pathway from SSR 2 to the switch.

Figure 90 MC-LAG, BFD, and LAG ETI Use Case Topology


3.12.1 Supporting Architecture


Figure 91 illustrates the ETI architecture within the router.

Figure 91 SSR ETI Publisher/Subscriber Functionality

You configure a tracked-object, identified by the tracked-object name. The name is internally mapped to an ID number by RCM (and mapped to the Transparent Inter Process Communication (TIPC) port-name), which is kept in sync with the standby RPSW in case of a switchover.

In SSR release 12.2, the maximum number of tracked-object IDs supported is 8000.


When a tracked-object state changes, the tracked-object-event is published to enable the subscriber to quickly take the appropriate action.

The states can be up, down, or some other state published by the publisher. A compile-time list of states is supported in the system.

ETI interacts with the RPSW modules (RCM, event publishing, and event subscribing) in the following process (a simplified sketch follows the numbered list):

1 RCM sends the tracked-object configuration to ETI.

  ETI prepares to maintain statistics for each of the tracked-objects; the ETI process receives a copy of all ETI events.

2 For BFD, the tracked-object publishing configuration is sent to RIB, which is the BFD RPSW management daemon.


3 RIB propagates this information to ALd PI BFD on the associated line card,
so that it can publish the events as they happen. The ALd PI BFD publishes
the current status of the tracked-object immediately after this.

4 Tracked-object-action configuration is pushed to LAGd in RPSW.

5 This information is propagated to the actual subscribing (action-handler) component in LAGd.

6 LAGd registers the action handlers for the event with a subscription tag
(a quick reference for the event subscriber). For example, this could
correspond to the subscription-side LAG instance.

7 The ETI process registers with the ETI library as a subscriber for the event
so that it can maintain statistics and some generic optional actions, such as
logging and snmp-traps, for the event.

8 When the event publisher is ready to publish an event, it calls the ETI
library publishing API.

9 The send-side library distributes the event through the ETI transport. The ETI transport delivers the event to LAGd and the ETI process through TIPC SOCK_RDM reliable multicast.

  The ETI library calls the event subscriber's action handler, with the subscription tag.

10 A copy of the event is also sent to the ETI process on the RPSW card.

This is not a time critical or urgent message, and is delivered with regular
priority reliable messaging.

11 The ETI process handler is called, which maintains statistics, logs, traps,
and so forth.
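
A toy publish/subscribe model of this flow is sketched below in Python. It is illustrative only: the real ETI library delivers events over TIPC SOCK_RDM reliable multicast, and the class and method names here are not the library's API.

from collections import defaultdict


class EtiBus:
    """Toy publish/subscribe bus standing in for the ETI library."""

    def __init__(self):
        self._subs = defaultdict(list)     # tracked-object name -> [(tag, handler)]

    def subscribe(self, tracked_object, tag, handler):
        self._subs[tracked_object].append((tag, handler))

    def publish(self, tracked_object, state):
        # Each subscriber's action handler is called with its subscription tag.
        for tag, handler in self._subs[tracked_object]:
            handler(tag, tracked_object, state)


bus = EtiBus()
bus.subscribe("icr-transport-link", "lag-1",
              lambda tag, obj, state: print(f"{tag}: {obj} is {state}"))
bus.publish("icr-transport-link", "down")      # lag-1: icr-transport-link is down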

ETI uses Transparent Inter-Process Communication (TIPC) as the transport mechanism. A TIPC abstraction library has been added to abstract TIPC for the SSR. The ETI library is the only user of the TIPC abstraction library. Processes involved in publishing and subscribing to ETI events use the ETI library. The SOCK_RDM TIPC socket, which provides a reliable-datagram service, is used for this purpose.

To support the proper version of TIPC, the RPSW, line card, and SSC kernels
have been upgraded to Linux 3.0. The TIPC module is loaded when each
card comes up.


3.13 Failure Event Notification Processes

3.13.1 FEN Process


With IP OS release 13B, Fast Event Notification (FEN) has been added to
support SSC resiliency. This technology provides fast notification when
resource state changes occur in the system for critical services. It consists
of an event detection mechanism that uses ETI to propagate the detected
events to all subscribers. Recovery action taken by subscribers achieves fast
recovery from failure. A new daemon APPMON (used to monitor the failures
in registered applications) supports FEN, with one instance on RP and one
on each SSC card. An application instance registers with APPMON (FEN
registration) to indicate that a FEN event should be generated if it crashes.

Figure 92 illustrates the FEN Architecture.

Figure 92 FEN Architecture

APPMON generates FEN messages if a control or user plane application


running on an SSC or RPSW crashes or is killed by PM due to
unresponsiveness. This triggers APPMON to publish the “Application Status”
event with status as “Down”. In both cases, a core dump of the application is
created.

For an SSC, if the Platform Admin daemon (PAD) is notified about card level
critical faults (such as thermal or voltage) from CMS and other modules, it
sends a Card Down event before starting the card deactivation processing,
causing ETI to publish a Card Down message. The SSC Card Down condition
can also be triggered by a CLI configuration change, a kernel panic, an SSC
PM crash, or by pulling the SSC out of the chassis.


Note: Line cards are covered by FFN, which is not based on ETI. For
more information, see Section 3.13.2 on page 220.

FEN depends on the following SSR components:

• FEN Library—A new FEN library introduced to support applications that


publish and subscribe events using ETI and TIPC transport.

• Modules such as PAD, CMS, CMS-Proxy and PM publish CARD_STATUS


FEN events.

• The SSC kernel supports detection of kernel panics and sends CMB
messages to CMS.

• The ASPd module publishes ASP_STATUS FEN events.

• ETI supports System Event objects as well as configured event objects.

• The SSC network processor subscribes to SSC FEN event notifications.


When it receives notification of a FEN event (SSC Card, ASP, or
APPMON), it switches the affected SSC traffic over to secondary SSCs.
This switchover is enabled by the new EPG Single IP feature added in IP
OS 13B, in which the TSM-FABL module gets port IDs from the packet
access point resolver daemon (PAPRd), which it passes to ALd, which in
turn installs them in the PAPT tables on the SSC cards.

• To support LAG on SSC for egress (SSC -> LC) traffic, SSC ALd subscribes
to receive line card FFN events. When it receives such events, it recovers
by sending packets to LAG constituents that are still reported to be up.

3.13.2 FFN Process


Fast-failure notification (FFN) enables very fast propagation of events that have
a significant impact on how traffic is forwarded. The most important events
are various types of failures (link failures, card failures, or even higher level
protocol failures like BFD timeouts). When fast failure reaction is required, the
control plane has pre-provisioned backup links to be used when a failure is
detected. FFN ensures that failures are propagated quickly across the system.
Because link failures are typically detected by and reacted to by a line card, the
current (12.2) phase of FFN covers fast communication between line cards. It
is based on an efficient Ethernet multicast protocol over the Gigabit Ethernet
control network.

Figure 93 illustrates the FFN event detection and messaging process.


Figure 93 FFN Event Detection and Messaging

The FFN mechanism is not currently based on the ETI infrastructure; see
Section 3.12 on page 215. In the future, it may be, because the ETI infrastructure
is expected to provide a much more flexible and generic interface
than the current FFN functionality, able to cover multiple event types
and easily extended to new events.

4 Administration

4.1 Accessing the SSR System Components


The CLI can be accessed in the following ways:

• Ethernet management port connection to a local or remote management


workstation

Telnet or SSH client (Requires a PC-type workstation running DOS,


Windows, or Linux) with the following Ethernet cables:

- For a local workstation, a shielded Ethernet crossover cable.

- For a remote workstation, a shielded Ethernet straight cable (shipped
with the system) or a router or bridge.

• CONSOLE port connection to a local or remote console terminal:

- ASCII/VT100 console terminal or equivalent that runs at 9600 bps,
8 data bits, no parity, 1 stop bit.


- PC-type workstation running DOS, Windows, or Linux with a terminal
emulator, in the same configuration as the ASCII/VT100 terminal.


- Ethernet cables required:

• For local access to the console, a terminal server and a console


cable (shipped with the system).

• For remote access to the console, a terminal server cable.

It is recommended that you have two access methods available, such as a


remote workstation connected to the Ethernet management port and a remote
console terminal with connection to a terminal server. Many administrative
tasks should be carried out from the CLI when connected through a terminal
server, because some processes, such as reloading or upgrading the software,
may sever an Ethernet management port connection.

You may also need to access the Linux kernel to perform the following actions:

• Set environmental variables.

• Resolve the boot order in BIOS.

• Recover lost passwords.

The SSR system components are accessed through the primary and secondary
RPSW controller cards, the OK mode, and the Linux shell.

You can also access line cards and SSC cards to collect data and troubleshoot.

4.1.1 Logging On to the Active Controller Card


To log on to the Linux shell on the active RPSW card from the legacy IP OS
CLI, enter the following command in exec mode:

[local]Ericsson#start shell
sh-3.2#

The # prompt indicates that you are at the Linux shell level.

4.1.2 Logging On to the Standby Controller Card

To collect information or perform recovery tasks on the standby controller card,


log on to it from the active controller card and use the same commands that
you would on the active one. The following example shows how to log on to the
standby controller card from the active one. The standby prompt indicates
that you are now working on the standby controller.

To log on to the standby RPSW card from the CLI, enter the following command:


[local]Ericsson#telnet mate

Ericsson login: <the same administrator name as on the active controller card>

Password: <the same password as on the active controller card>
[local]standby#

To log on to the standby RPSW card from the Linux shell on the active RPSW
card, enter the following command:

sh-3.2$telnet rpsw2

This example assumes that RPSW1 is active (if RPSW2 is active, enter
the telnet rpsw1 command). To verify the active card, enter the show
chassis card command.

4.1.3 Logging On to a Line Card or SSC

To log on to a line card or SSC from the RPSW CLI (with the root password)
and open the line card ALd CLI, enter the following command:

[local]Ericsson#start shell
sh-3.2$ ssh root@lc-cli-slot-num
root@lc-1[1]:/root> /usr/lib/siara/bin/ald_cli

In this example, the cli-slot-num variable is the slot number minus 1; for
example, if the slot number is 1, use 0.
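As a further illustration, to open the ALd CLI on the line card in slot 3 (a hypothetical slot choice), cli-slot-num is 2; the prompt shown after the ssh command is illustrative:

[local]Ericsson#start shell
sh-3.2$ ssh root@lc-2
root@lc-2[2]:/root> /usr/lib/siara/bin/ald_cli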

4.1.4 Accessing the Open Firmware Shell


To access the Linux open firmware (OFW) shell through the console port on the
front of each controller card:

1. Enter the reload command in exec mode from the console port.

2. Watch the reload progress messages carefully. When the following


message appears, type ssr* within five seconds:

Auto-boot in 5 seconds - press se* to abort, ENTER to


boot:

3. If you typed ssr* within 5 seconds, the ok prompt appears. The system
sets the autoboot time limit to 5 seconds; however, during some operations,
such as a release upgrade, the system sets the time limit to 1 second to
speed up the process and then returns it to 5 seconds when the system
reboots. If you missed the time limit, the reload continues; start again with
Step 1).


4.1.5 Accessing the Linux Prompt

You can access the Linux prompt and perform root-level troubleshooting or
recovery if you:

• Log on through the console as root.

• Log on through the management port as an operating system administrator
with sufficient privilege to execute the start shell command, and do so
(the resulting Linux shell runs with the user ID root). This is a known security hole
and should not be communicated to any customers.

When the SSR is reloaded and you type ssr* within 5 seconds, a Linux bash
shell (not the same shell as the one started by the start shell command) is
entered. The regular Linux prompt displays. From this shell, you can run
various utilities and applications, if you have the correct permissions, as in a
standard Linux system.

A non-root user can perform many actions, execute most utilities, and run many
applications; however, only a root user can perform certain operations that
are privileged and affect system functionality (for example, shut down the
system; stop, start, or restart processes; or use SSH).

To access the Linux prompt, perform these steps:

1. Console logon—The user root can log on from the serial console after the
login: prompt. The default root user password is root. After a successful
root username and password logon, the exec_cli is executed and the Ericsson IP
CLI is available for SSR configuration. At this point, administrator configuration
is allowed.

After successful administrator configuration, any administrator can log on


through the console and the Ericsson IP CLI will be started. If you keep the
configuration, even after reboot, you can still use the administrator's name and
password to log on. To gain root access to the Linux shell from the Ericsson IP
CLI, enter the start shell command. In summary, if Ericsson IP Operating
System administrators are configured, then at the console login prompt, you
can choose to enter root/root or admin_name/admin_password to gain
exec_cli access through the serial console connection.

2. SSH login—As per the SSH configuration, the user root is not permitted to
log on through the management interface via SSH. If the user attempts this,
the user is prompted for the password three times, and each attempt displays a
permission-denied error. All system administrators can log on via SSH if, and
only if, they provide the correct username and password combination. Once
logged on, the Ericsson IP CLI is started for the system administrator. To
access the Linux shell from the Ericsson IP CLI, execute the start shell
command. At the Linux CLI, the user does not have root privileges.

3. Telnet login—All system administrators can log on through the management


interface via Telnet if, and only if, they provide the correct username and
password combination. Once logged on, the Ericsson IP CLI is started for


the system administrator. To access the Linux shell from the Ericsson IP
CLI, execute the start shell command. At the Linux CLI, the system
administrator is root and has root privileges. If you have a Linux user
configured in the system, you can connect to the chassis through SSH using
that user's information.

4. Internal only - Workaround to permit root SSH logon. Problem/feature:
Root is not allowed to SSH/SCP from any RP, which is by design and a security
requirement. Workaround: Log on to the system either via serial console as
root or via Telnet on the management interface as an Ericsson IP Operating
System administrator. Enter the start shell command and then edit the
/etc/ssh/sshd_config file, specifically changing the PermitRootLogin setting
from no to yes. Enter this command: /etc/init.d/sshd restart. After a
successful SSH daemon restart, root logon via SSH is permitted.
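The workaround corresponds to a shell session along the following lines; this is a minimal sketch, and the choice of vi as the editor is an assumption:

[local]Ericsson#start shell
sh-3.2# vi /etc/ssh/sshd_config        (change the PermitRootLogin setting from no to yes)
sh-3.2# /etc/init.d/sshd restart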

4.2 Configuration Management


For more information on memory management, the file system, and managing
files, see Managing Files and Managing Configuration Files in the SSR CPI
library.

4.2.1 Software Storage Organization


Each SSR can contain one or two RPSW cards. If there are two RPSW cards,
one is active and the other is standby. Each RPSW card has internal storage
media to store the operating system, configuration files, and other system files.

The SSR has two 16-GB internal disks. Storage is divided into four
independent partitions: p01, p02, and /flash on the first disk, and /md on the second
disk:

• The p01 and p02 partitions are system boot partitions that store operating
system image files. One is the active partition and one is the alternate
partition.

The active partition always stores the current operating system image
files. The alternate partition is either empty or stores the operating system
image files from another release. Only one alternate configuration can
be stored at a time.

The RPSW cards in the SSR ship with the current operating system release
installed in the active partition, either p01 or p02. The system loads the
software release when the system is powered up.

• The /flash and /md partitions are internal storage partitions used for
managing configuration files, core dump files, and other operating system
files. The /flash partition is 8 GB in size and is primarily used for storing and
managing configuration files. The /md partition is 16 GB in size and stores
all kernel and application core files and log files.


You can also mount a USB flash drive in the external slot of an RPSW card
for transferring software images, logs, configuration files, and other operating
system files. The USB flash drive is not intended for continuous storage.

Note: The USB flash drive cannot be accessed when the system is at the firmware
ok> prompt.

Each line card has one 2 GB internal storage disk that is partitioned in four
parts: /p01, /p02, /flash, and /var (/var/md).

To see the file system organization, use the show disk command. In the
following example, a USB drive has been inserted and mounted.
[local]Ericsson#show disk
Filesystem 1k-blocks Used Available Use% Mounted on
/dev/sda1 3970536 1082032 2688396 29% /p01
rootfs 3970556 1429364 2341080 38% /
/dev/sdb1 15603420 1506468 13310588 10% /var
/dev/sda3 7661920 149600 7126176 2% /flash
/dev/sdc1 1957600 1551776 405824 79% /media/flash

4.2.2 Configuration Files

A configuration file is a script of configuration commands that can be loaded into


the system. A configuration file allows you to store sequences of commands
that might be required from time to time. Configuration files might contain
only partial configurations. Multiple configuration files can be loaded into the
system. When multiple files are loaded, their configuration commands are
loaded sequentially. This results in a running configuration with commands
merged from multiple files.

When both legacy CLI and Ericsson CLI configurations are created on the SSR,
they are combined in a single configuration file.

A configuration file can have a text version and a binary version. The system
generates both versions when you enter the save configuration command
in exec mode.

By default, the system loads the binary version of the system configuration file,
ericsson.bin, from the local file system during system power on or reload. If
the binary version does not exist, or if it does not match the ericsson.cfg file,
the system loads the ericsson.cfg file. The ericsson.cfg file is loaded
on the system at the factory and should exist on initial power-up. However, if
the ericsson.cfg file has been removed, the system generates a minimal
configuration, which you can then modify.

You can modify the active system configuration in the following ways:

• Change the system configuration interactively.

• Create and modify configuration files offline.


With an interactive configuration, you begin a CLI session and access global
configuration mode by entering the configure command in exec mode.
In global configuration mode, you can enter any number of configuration
commands.

An offline configuration allows you to enter configuration commands using a


text editor and then save the file to load at a later time.

The operating system supports comment lines within configuration files. To add
a comment to your configuration file, begin the line using an exclamation point
(!). Any line that begins with an ! is not processed as a command.
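For example, a small offline configuration file could look like the following; this is a minimal sketch, the file name is hypothetical, and the single command shown is taken from Section 5.2.2.3:

! my-logging.cfg - partial configuration file (hypothetical name)
! Lines beginning with ! are comments and are not processed as commands.
logging timestamp millisecond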

4.2.3 Storage for System Images and Configuration Files


You can store system images and configuration files locally in the /flash
partition on the internal storage media or in the /media/flash partition on the
USB flash drive. You can also store the files on a remote server and access
them using File Transfer Protocol (FTP), Secure Copy Protocol (SCP), or
Trivial FTP (TFTP).

You can use the Management Information Base (MIB), RBN-CONFIG-FILE-MIB,


to save and load configuration files to and from a TFTP or FTP server. The
server must be reachable through one of the system ports.

Note: For operations that request the use of a transfer protocol, such as FTP,
SCP, or TFTP, it is assumed that a system is configured and reachable
by the SSR.
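For example, to copy the text configuration file to an FTP server, a command along the following lines can be used; the source path, user name, and host name are placeholders, and the syntax follows the copy examples shown later in this document:

[local]Ericsson#copy /flash/ericsson.cfg ftp://username@hostname/ericsson.cfg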

5 Monitoring and Troubleshooting Data

Note: Some hidden commands or parameters used in the tasks described


in this document provide useful information, but we recommend you
use them only as described. They can change without notice, are not
fully tested, and are not officially supported. To avoid unexpected
side-effects, use them only when required, with the right parameters,
and correct syntax.

The Ericsson IP Operating System includes many facilities for verifying,


monitoring, and troubleshooting the router.

• Logging—System event logs preserve a record of errors or significant
events. You can configure how much detail and what type of information is displayed.


• CLI show commands—Numerous commands allow you to display the


feature and function status.

• OFW shell commands—OFW shell enables you to display and configure


system boot parameters.

• Debug commands—Enables the generation of detailed debugging


messages for troubleshooting.

• Core dump files—Preserve system status when failures occur.

• SNMP monitoring and notification—You can receive monitoring and event


notification.

• Statistics—Bulkstats enable detailed statistics collection with less


performance impact than SNMP.

• Hardware diagnostics—Power-on diagnostics (POD) and out-of-service


diagnostics (OSD) enable hardware monitoring and verification.

For the customer instructions for which data to collect when submitting a
customer service request (CSR), see Data Collection Guideline.

5.1 HealthD
Health Monitoring daemon (Healthd) is a new feature that enhances the
debuggability of the SSR. Its two main purposes are monitoring the system
health and supporting operators in troubleshooting system issues. The
high-level functions of Healthd include:

• Monitoring and logging the system’s health over time

• Detecting system problems before they occur (when possible) or as they happen,
and taking preconfigured actions

• Preempting fatal system crashes, when possible, by notifying operators


and executing programmable actions

• Reacting to specified events, such as link down, by executing


programmable actions (generic object tracking)

• Simplifying system troubleshooting

Note: For security reasons, Healthd refuses any connections to the Healthd
TCP port with a destination IP address other than the loopback
address. Therefore, you must connect to Healthd from the local node
with proper authentication. For further protection of the RPSW card,
you can also independently configure filters or ACLs on the line cards
to deny connections from the outside to the Healthd port. It is assumed
that only Ericsson-trained personnel can use Healthd. Healthd’s user
interface and functionality must not be exposed to operators.


Healthd provides basic support for the following functions:

• User Interface—A programmable/scriptable user interface based on Python.

• Event Scheduler—Allows the user to schedule scripts and actions to run at


specified times in the future. For example, you may need to collect some
error counters at 03:00 AM, when system usage is at its minimum, and inform
the operator if any action is needed. This function is similar to the cron
function in UNIX-based systems, but is not based on the cron daemon. It is
developed in Healthd to eliminate any dependencies on the system and
enable full flexibility. It is mainly a poll function.

• Troubleshooter—A generic execution context that can execute


troubleshooting scripts in the background. The main focus is to script
the ability to troubleshoot and pinpoint the location of a fault down to a
subsystem and a specific error reason of a particular function in SSR. For
example, you can issue the troubleshoot ('ping 10.10.10.10') command,
which schedules the execution of the 'ping 10.10.10.10' script in the
background so that the user interface is not locked while the script is
running; alternatively, you can execute the ping 10.10.10.10 command in the
foreground. The 'ping 10.10.10.10' script then queries the various modules
(RIB, ISM, and so on) and components (SSR fabric, line cards, and so
on), looking for the reason why the ping toward 10.10.10.10 is not succeeding.

Healthd is initialized by the Process Manager (PM) when the system is coming
up, with the following process:

1 When booting SSR, PM boots UTF with the hltd-default.py Healthd


initialization script. This script converts UTF to Healthd, which includes
loading the event scheduler plug-in and the troubleshooter plug-in. These
are the base Healthd plug-in threads.

2 After Healthd is initialized, various actions are loaded (generic Python


methods that can wrap C/C++ library calls) that are scripted with a specific
functionality. Both the event scheduler and the object tracker can use
these generic actions. When an event happens, the actions taken are
distinguished from the generic support methods provided. Actions include
methods that:

- Support the load script command

- Log and trace

- List all methods available and a short help description

- List the history of the most recently issued commands

3 After the actions are loaded, the generic configuration script runs and loads
the Healthd functionality, which applies to all Release 12.2 SSR nodes.
The script configures Healthd events that run on a periodic basis. The
active RPSW and standby RPSW scripts can be different; you set a global
variable to determine which script to load.


4 After the generic base configuration is loaded, the per-node configuration


script runs. You can modify this per-node script based on the required
configuration for a specific node. For example, you can configure an
additional script to run on a node that is having maintenance issues. New
per-node configurations should be separated into different script files. Only
Ericsson personnel can modify this script.

5 When the generic and per-node scripts have loaded and run, you can log in
to the Healthd console and add new configurations and operations or run
troubleshooting scripts. A history function helps you save the configuration
in a file.

Healthd is built on top of the UTF core library to allow reuse of code developed
by various teams to configure and query their own daemons. This architecture
is summarized in the following figure.

Figure 94 Healthd Functional Architecture

Sample Healthd use cases:

• Monitoring the CPU utilization of a process—Run Python scripts to find the


processes with CPU utilization beyond a defined threshold (an argument to
the script). You can also use the script to find processes with high CPU
usage and collect debug outputs related to that event. This script can be
scheduled by the event scheduler to be run at a predefined time.

• Troubleshooting ping—Run a Python script that debugs ping failure on the


router. The script systematically checks all the modules and components
involved in setting up the route entry and the data path for the ping, and
reports errors if any inconsistencies are found. This script can be scheduled
by the event scheduler to be run at a particular time and report if the route
to any destination is removed (according to your requirements).


5.2 Logging
The operating system contains two log buffers: main and debug. By default,
messages are stored in the main log. If the system restarts, for example, as a
result of a logging daemon or system error, and the logger daemon shuts down
and restarts cleanly, the main log buffer is saved in the /md/loggd_dlog.bin
file, and the debug log buffer is saved in the /md/loggd_ddbg.bin file. You
can view the contents of the main log file using the show log command in
exec mode.

Persistent and startup log files are stored by default in /md/log. In the
following example, the dir command shows the log files stored in that location:

[local]Ericsson#dir /md/log*
Contents of /md/log*
-rw-r--r-- 1 root root 16 Jun 05 05:26 /md/loggd_ddbg.bin
-rw-r--r-- 1 root root 394312 Jun 05 05:26 /md/loggd_dlog.bin
-rw-r--r-- 1 root root 2156739 Jun 05 23:09 /md/loggd_persistent.lo
-rw-r--r-- 1 root root 9751732 Jun 01 02:34 /md/loggd_persistent.lo
-rw-r--r-- 1 root root 261145 Jun 05 23:09 /md/loggd_startup.log
-rw-r--r-- 1 root root 346712 Jun 05 05:26
/md/loggd_startup.log.1

Collect system logs from both the active and standby route processor/switch
(RPSW) cards to attach to a CSR. The files are named messages.x.gz; they
can be found in the /var/log directory through the Linux shell mode. The log file
must include the time of the failure. Time stamps before and after the event
occurred must also be included in the CSR. It is important to verify exactly in
which file the actual failure is, because the active message log file is eventually
overwritten. For example, the file can be in /var/log/messages.2.gz instead of
current message log. Verify the logging configuration on the router by collecting
the output of the show configuration log command.

Note: You cannot use the show log command to display the contents of
the debug buffer, unless you enable the logging debug command
in global configuration mode. But enabling the logging debug
command can quickly fill up the log buffer with debug and non-debug
messages. To prevent the main buffer from filling up with debug
messages and overwriting more significant messages, disable the
logging debug command in context configuration mode.

By default, log messages for local contexts are displayed in real time on the
console, but non-local contexts are not. To display non-local messages in real
time, use the logging console command in context configuration mode.
However, log messages can be displayed in real time from any Telnet session
using the terminal monitor command in exec mode (for more information,
see the command in the Command List).
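For example, to follow log messages in real time from a Telnet session and then review the stored main log, the following exec mode commands (both described above) can be used:

[local]Ericsson#terminal monitor
[local]Ericsson#show log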

In large installations, it is convenient to have all systems log to a remote


machine for centralized management and to save space on the device. The
operating system uses the UNIX syslog facility for this purpose. It can send log


messages to multiple machines concurrently, and logging can be constrained


to events occurring on a specific circuit.

All log messages contain a numeric value indicating the severity of the event
or condition that caused the message to be logged. Many log messages are
normal and do not indicate a system problem.

Table 13 lists event severity levels in log messages and their respective
descriptions.

Table 13 Event Severity Levels in Log Messages


Value Severity Level Description
0 emergency Panic condition—the system is unusable.
1 alert Immediate administrator intervention is
required.
2 critical Critical conditions have been detected.
3 error An error condition has occurred.
4 warning A potential problem exists.
5 notification Normal, but significant, events or
conditions exist.
6 informational Informational messages only; no problem
exists.
7 debugging Output from an enabled system debugging
function.

5.2.1 Controlling the Volume of Informational Log Messages


To reduce the number of informational messages displayed on the console,
changes were introduced in Release 12.1 to suppress the default display of
INFO messages on the console. By default, these messages are no longer
displayed, but they are still stored in the system log buffer. If you want to display
INFO messages (for example, for script purposes), you can enable them by
entering the logging display-info command for the RPSW CLI logs or the
logging card slot display-info command for logs on the line cards.

Note: Use of these commands is discouraged. They can result in a large


number of undocumented messages displaying on the console.

To disable the display of INFO messages on the console, use the no form
of the commands.
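A minimal sketch of enabling and later disabling the display of INFO messages for the RPSW CLI logs, assuming the command is entered in global configuration mode:

[local]Ericsson#configure
[local]Ericsson(config)#logging display-info
[local]Ericsson(config)#commit
[local]Ericsson(config)#no logging display-info
[local]Ericsson(config)#commit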


5.2.2 Collecting System Logs


System logs contain event information about a variety of system components
and are one of the primary troubleshooting tools. Customers must collect
system logs from both the active and standby RPSW cards to attach to a CSR.

To prepare for troubleshooting, collect system logs from both the active and
standby RPSW cards. The files are named messages.x.gz. They are
located in the /var/log directory through the Linux shell mode (see Section
4.1.1 on page 222). The log file includes the time of the failure. Timestamps
before and after the event occurred must also be included in the CSR. It
is important to verify exactly in which file the actual failure is, because the
active message log file is eventually overwritten. For example, the file can
be in /var/log/messages.2.gz instead of the current message log. Verify
the logging configuration on the router by collecting the output of the show
configuration log command.

For information about collecting logs and show commands for troubleshooting,
see Section 5.4 on page 256.
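To locate the archive file that covers the failure time, you can list the rotated files and inspect them from the Linux shell; this minimal sketch uses only commands shown elsewhere in this document, and the file name messages.2.gz is illustrative:

[local]Ericsson#start shell
sh-3.2# cd /var/log
sh-3.2# ls -l messages*
sh-3.2# gzip -cd messages.2.gz | less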

5.2.2.1 Enable Event Logging During Reload or RPSW Switchover

Be sure that you have logging enabled on the console during a reload or
switchover. To turn on event logging, configure logging in global configuration
mode.

Note: Enabling event logging with these hidden commands can be very useful
in troubleshooting, but the volume of data produced can impact SSR
performance. Disable these commands after troubleshooting.

1. Enter global configuration mode and enable event logging

logging events start

Remember to disable it again after your debugging session, with the


following command:

logging events stop

2. Set the logging timestamp facility to record by milliseconds.

logging timestamp millisecond

3. Enable the logging events command.

The since, until, and level keywords are only available after specifying
the active keyword or the file filename construct.

The show log active all command prints all current active logs in the
buffers. When the buffer is full, the log is wrapped out of the buffer and written
into a series of archive files named messages.x.gz. These files are located


in the /var/log directory through the Linux shell mode, as shown in the
following example.

[local]Ericsson#start shell
#cd /var/log
#ls -l
total 56
-rwxr-xr-x 1 11244 44 0 Jun 3 21:37 authlog
-rw-r--r-- 1 root 44 12210 Aug 12 18:46 cli_commands
-rw-r--r-- 1 root 44 415 Aug 12 18:47 commands
-rwxr-xr-x 1 11244 10000 1178 Sep 6 17:58 messages

To view the messages file:

#less messages
Sep 6 07:43:36 127.0.2.6 Sep 6 07:39:52.327: %LOG-6-SEC_STANDBY:
Sep 6 07:39:52.214: %SYSLOG-6-INFO: ftpd[83]:
Data traffic: 0 bytes in 0 files
Sep 6 07:44:51 127.0.2.6 Sep 6 07:39:52.328: %LOG-6-SEC_STANDBY:
Sep 6 07:39:52.326: %SYSLOG-6-INFO: ftpd[83]:
Total traffic: 1047 bytes in 1 transfer

To view the messages.x.gz files:

#gzip -cd messages.0.gz | less


Sep 6 00:03:21 127.0.2.6 Sep 5 23:53:46.600: %LOG-6-SEC_STANDBY:
Sep 5 23:53:46.600: %CSM-6-CARD: slot 12, ALARM_MAJOR: Circuit
pack backplane failure
Sep 6 00:04:36 127.0.2.6 Sep 5 23:53:46.601: %LOG-6-SEC_STANDBY:
Sep 5 23:53:46.601: %CSM-6-CARD: slot 14, ALARM_MAJOR: Circuit
pack backplane failure
Sep 6 00:05:51 127.0.2.6 Sep 5 23:53:46.603: %LOG-6-SEC_STANDBY:
Sep 5 23:53:46.602: %CSM-6-CARD: slot 1, ALARM_MAJOR: Circuit
pack backplane failure
Sep 6 00:07:06 127.0.2.6 Sep 5 23:53:46.604: %LOG-6-SEC_STANDBY:
Sep 5 23:53:46.603: %CSM-6-CARD: slot 2, ALARM_MAJOR: Circuit
pack backplane failure

5.2.2.2 Preserving Logs across System Reload

When you enter the reload command from the CLI, or the reboot command
from the boot ROM, the system copies its log and debug buffers into the
following files:

/md/loggd_dlog.bin

/md/loggd_ddbg.bin

As an aid to debugging, you can display these files using the show log
command:


show log file /md/loggd_dlog.bin

show log file /md/loggd_ddbg.bin

5.2.2.3 Set Logs to Second and Millisecond Timestamps

By default, the timestamps in all logs and debug output are accurate to the
second. You can configure accuracy to the millisecond by entering the following
commands.

[local]Ericsson#configure

[local]Ericsson(config)#logging timestamp millisecond

[local]Ericsson(config)#commit

5.2.2.4 Access Boot Logs from a Console

When the controller cards cannot boot up, you cannot access them remotely.
To determine the cause, collect the information shown in the console while
the controller card is booting.

1. Connect the console cable between your PC and the RPSW controller
card console port.

2. Set connection parameters such as baud rate, data bits, parity, and stop bits
correctly (for XCRP4, typically set them as 9600, 8, N, and 1, respectively).

3. Enable the capture function in your terminal emulation software.

4. Start the controller card by powering on the power supply or inserting the
card back into the slot.

The boot output is collected in a predefined capture file.
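If the workstation runs Linux, the console session can also be captured with a terminal utility; this is a minimal sketch, and both the utility (screen) and the serial device path are assumptions that depend on your setup:

$ screen -L /dev/ttyS0 9600

With screen, the -L option writes the session to a local screenlog file, which then serves as the capture file.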

5.2.3 ISP Logging


The in-service performance (ISP) log is stored in the flash memory of the
router. It collects information about predefined system events that can have a
potential impact on applications. It enables support representatives to perform
root cause analysis and troubleshooting on the router. It also logs events for
third-party applications, such as EPG.

You can view the ISP log in the CLI using the show isp-log command, or you
can extract the ISP log from /flash/isp.log using the copy command with
the scp keyword. The ISP log is persistent across switchovers and reboots.

When the ISP log file reaches the size limit you set with the isp-log size
command, the system stops writing log entries in the file, logs an entry in the ISP
file stating that the file is full, and displays the following system error messages.


• %SYSLOG-3-ERROR: ISP logging disabled due to file size


limit being reached

• %ISP-3-ERR: ISP Log is Full, current /flash/isp.log
file size <file size> exceeds max file size <max file size>,
where <file size> is the file size of the log file, and <max file size> is the
maximum file size set by the isp-log size command.

To resume logging entries in the ISP log file, extract the ISP log file using the
following command:
copy /flash/isp.log scp://user@hostname/isp.log clear

Note: This command clears the isp.log file after it is successfully copied to
another location, enabling ISP logging to resume. If you disable the ISP
log or change the size limit, the system removes the existing ISP log
file. Also, if you change the ISP log file size limit to a lower setting than
the current file size, the system deletes all entries from the ISP log file.

You can also use the copy command with the tftp keyword for extracting
the file.

You can use the information in the ISP log to manually compute system
downtime and other statistics or, in the event of a problem, you can send the
extracted file to your support representative for analysis.

The ISP log tracks and displays the following information.

• Event type. See Logging for the specific event types in the ISP log file.

• Application name. Events that are not application-specific use the


application name system.

• Event timestamp. The time that the event occurred in Universal Time
Coordinated (UTC) format.

• Event information. Additional details of the source of the event.

• Trigger method. If a user performed the action, the ISP log records the
trigger method as manual. If the system performed the action, the ISP log
records the trigger method as auto.

• System uptime. Time since the system last rebooted, in seconds.

• Comment. Displayed if a user added a comment using the isp-log add


comment command.

By default, the maximum log size is 3 MB. You can increase it up to 10 MB


(10240 kbytes). To change the ISP log size, enter the isp-log size command
in global configuration mode.
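For example, to raise the limit to the 10-MB maximum; this is a minimal sketch, and the assumption that the size argument is given in kilobytes is based on the value quoted above:

[local]Ericsson#configure
[local]Ericsson(config)#isp-log size 10240
[local]Ericsson(config)#commit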


5.3 Show Commands


The Ericsson IP Operating System CLI includes many show commands to
display feature and function status. The output of show commands can be
voluminous and tricky to capture. For example, the show tech-support
command output can be too large to save to /flash, depending on its contents.

5.3.1 Capturing show tech-support Output

The show tech-support command is the foundation for troubleshooting
data collection: a system-oriented macro, plus a set of smaller macros that produce output
focused on specific OS processes or the SSC1 card (see Section 5.3.2 on page 240).

Note: The commands included in the tech-support macro change from


release to release. To check which commands are included in the
current release, enter the hidden command show macro hidden command.
Other task-oriented built-in macros are also included, which
you can view by running the hidden commands show macro exec hidden or
show macro hidden.

To collect basic data for submitting or escalating a TR, perform the following
steps.

1. Collect the output of the show tech-support command with no


keywords.

The basic macro runs the commands listed in Table 14, grouped by focus.

2. For specific processes or the SSC1 card, run the command a second time
with an appropriate keyword. For example, for AAA problems, enter the
show tech-support aaa command.

Optionally, you can collect other data relevant to the problem. See Table 15.

Note: Whenever possible, run these commands when the problem is still
present. If the issue is related to traffic, counters, or other changing
elements, run the command again after a short interval (3–5 minutes).

Table 14 Commands Included in the Basic show tech-support Macro


Focus Commands Included
Startup and software revision terminal length 0
context local
show clock
show version
show release

System hardware show chassis
show chassis power
show hardware detail
show backplane-status
show diag pod detail
show diag out-of-service
show disk external detail
show disk internal detail
show disk card
show port [detail | management | slot/port]
show port counters detail
Configuration details show context all
show configuration
show history global
Core system statistics show redundancy
show system redundancy
show service [filter]
show system status detail
show system alarm
Process and memory status show memory
and crashes
show sharedmemory
show crashfiles
show process crash-info
show process
show process hidden_all
show process detail
show process hidden_all detail
show process stats

Core system processes (RCM show rcm states
and ISM)
show rcm session
show rcm memory
show rcm locks
show rcm be2rcm
show rcm daemons
show rcm counter callbacks
show ism general
show ism mbe detail
show ism client detail
show ism interface detail
show ip interface brief all-context
show process ism diagnose
IP routes show ip brief all-context
show ip route all
show ip route summary
show ip route summary all-context
System logs show logging
show log
Subscribers (basic) show subscriber summary all
show circuit summary

Shared memory routing show system status process shm_ribd
(SHM_RIBD)
show smr counters
show smr counters debug
show smr counters detail
show smr internals
show smr log
show process shm_ribd diagnose detail
DHCPv6 information show system status process dhcpv6d
show process dhcpv6 chunk-statistics
show process dhcpv6 thread-info
show process dhcpv6 detail
show process dhcpv6 diagnose detail
show dhcpv6 server host summary
show dhcpv6 statistics
show dhcpv6 debug
show dhcpv6 detail
show dhcpv6 internals
show dhcpv6 log

5.3.2 Using Tech Support Commands for Specific Problems


The show tech-support command includes optional keywords to collect
troubleshooting data about many OS modules or the SSC card. To collect data
for specific problems, use the command in exec mode with an appropriate
keyword.

The command has the following syntax:

show tech-support [aaa | ase | bfd | bgp | card slot |


dhcp | dot1q | flowd | gre | igmp | ipv6 | l2tp | ldp |
mobile-ip | ospf | ospf3 | pim | ppp | pppoe | qos | rdb
| snmp]


Note: ASE, ATM, Flowd, L2TP, Mobile IP, PPP, and PPPoE are not supported on
the router.

For information on the SSR functions covered by the basic command (without
any keywords), see Table 14.

For the procedures to capture the output of the command, see Section 5.3.1
on page 237.

Table 15 describes the commands included in the command with the keywords.
It also includes the macro names run with each keyword.

Note: The show tech-support ase command is the same as the
ase-tech macro.

Table 15 show tech-support Macros for Collecting Troubleshooting Data


show tech-support Command Keyword    Sub-macro Included    Focus    Commands Included
aaa aaa-debug Authentication, terminal length 0
authorization,
and accounting show aaa debug-stat
configuration and show aaa debug-radius-stat
events
show radius counters
show radius control
show radius statistics
show context all
show udp statistics
show radius sockets
show system status process aaad
show process aaa detail
show process aaa diagnose
detail
show log | grep "AAA-"

ase ase-tech ASE information terminal length 0
Not supported; show version
use the card
keyword instead. show chassis
show dpmon all
show disk
show disk internal
show memory
show sharedmemory detail
show process crash-info
show process
show process hidden_all
show process detail
show process hidden_all detail
show ipc process
show log
bfd bfd-debug Bidirectional terminal length 0
Forwarding
Detection (BFD) show bfd session
information show bfd session detail
show card all bfd
show card all bfd detail
show ip route summary all

bgp bgp-debug Border Gateway terminal length 0
Protocol (BGP)
information show process bgp detail
show log | grep "BGP-"
show bgp route
show bgp neighbor detail
show bgp next-hop-labels
show configuration bgp
show bgp summary
show bgp reset-log
show bgp notification
show ip route registered
next-hop
show ip route summary all
card slot card-tech-support Line card, SSC, RPSW, or ALSW card
Runs a macro with variants of the following:
show card slot fabl api *
show card slot fabl iface *
show ism client "iface-fabl
SLOT slot" log detail
show card slot fabl dot1q table
show card slot fabl fib *
show card slot fabl process *
show card slot pfe *

dhcp dhcp-debug Dynamic Host terminal length 0
Configuration
Protocol (DHCP) show system status process
server and relay dhcpd
information show dhcp relay stats debug
show dhcp relay server detail
show dhcp relay pending detail
show dhcp relay summary
show dhcp server stats debug
show dhcp log
show udp statistics
show process dhcp detail
show process dhcp diagnose
detail
show log | grep "DHCP-"
show log | grep "CLIPS-"
show log | grep "AAA-"
show clips summary
show clips counters detail
show clips counters debug
show process clips detail
show process clips diagnose

dot1q dot1q-debug 802.1Q permanent virtual circuit (PVC) information:
terminal length 0
show system status process dot1q
show dot1q operational-statistics
show process dot1q diagnose
show process dot1q chunk-statistics
show process dot1q ipc-pack-statistics
show dot1q state cct
show process dot1q detail

flowd flow-debug Flow process terminal length 0
information for
flow admission show system status process
control flowd
show flow admission-control
profile all
show flow circuit all detail
show flow counters
show flow counters debug
show flow counters detail
show flow internals
show flow log
show flow ppa state
show process flowd detail
show process flowd chunk-statistics
show process flowd thread-info
show process flowd diagnose
detail
gre gre-debug Generic Routing terminal length 0
Encapsulation
(GRE) tunnels show gre detail
and tunnel circuit show gre debug
information
show gre counters
show gre peer
show gre counters detail
show gre tunnel debug
show card all gre detail

igmp igmp-debug Internet Group show config igmp
Management
Protocol (IGMP) show igmp interface
information show igmp group
show igmp traffic
show igmp circuit all
ipv6 ipv6-debug IPv6 subscriber services information:
terminal length 0
show ipv6 route
show ipv6 route summary
show ipv6 interface brief
show ipv6 statistics
show ipv6 pool summary
show nd neighbor detail
show nd interface detail
show nd circuit detail
show nd summary
show nd statistics
show nd static-neighbor all
show nd profile
show nd prefix all
show nd prefix interface detail

isis isis-debug Intermediate System-to-Intermediate System (IS-IS) routing information:
terminal length 0
show isis adjacency detail
show isis adj-log
show isis database
show isis interfaces all detail
show isis interfaces intercontext
show isis interfaces intercontext all detail
show configuration isis
show isis topology
show isis dynamic-hostname
show isis spf-log extensive
show isis summary-address
show isis debug-setting
show isis protocol-summary
show process isis detail
show log | grep "ISIS-"

l2tp l2tp-debug Layer 2 Tunneling terminal length 0
Protocol (L2TP)
peer and group show l2tp global ipc
information show l2tp global counters
show l2tp group
show l2tp peer
show l2tp summary
show log events process l2tp
general-events
show system status process
l2tpd
show process l2tp detail
show log | grep "L2TP-"
show card all l2tp demux
show card all l2tp log
show card all l2tp tunnel
ldp ldp-debug Label Distribution terminal length 0
Protocol (LDP)
signaling show process ldp detail
information show ldp neighbor detail
show ldp binding
show ip route registered prefix
show configuration ldp

mobile-ip mobile-ip-debug Mobile IP information:
terminal length 0
show mobile-ip detail
show mobile-ip all detail
show mobile-ip binding detail
show mobile-ip ism
show mobile-ip tunnel
all-contexts detail
show mobile-ip home-agent-peer
detail
show mobile-ip foreign-agent-peer
detail
show mobile-ip interface detail
show mobile-ip care-of-address
detail
ospf ospf-debug Open Shortest terminal length 0
Path First (OSPF)
information show ospf
show ospf area
show ospf interface
show ospf neighbor
show ospf interface detail
show ospf neighbor detail
show ospf route detail
show ospf database detail debug
show ospf spf scheduling
show ospf statistics
show ospf statistics neighbor
show process ospf detail
show log | grep "OSPF-"

ospf3 ospf3-debug OSPF Version 3 (OSPFv3) information:
terminal length 0
show ospf3
show ospf3 area
show ospf3 interface
show ospf3 neighbor
show ospf3 interface detail
show ospf3 neighbor detail
show ospf3 route detail
show ospf3 database detail
debug
show ospf3 spf scheduling
show ospf3 statistics
show ospf3 statistics neighbor
show process ospf3 detail
show log | grep "OSPF3-"

pim pim-debug Protocol Independent Multicast (PIM) information:
terminal length 0
show ip mroute
show ip mroute detail
show pim rpf-cache route
show pim rpf-cache next-hop
show pim ppa
show card all
show card all mfib mcache
show card all mfib summary
show card all mfib circuit
show card all fib summary
show card all adjacency summary
show process pim detail

ppp ppp-debug Point-to-Point terminal length 0
Protocol (PPP)
information show system status process pppd
show process ppp diagnose
show process ppp chunk-statistics
show process ppp ipc-pack-statistics
show process ppp throttle-statistics
show process ppp termination-cause
show ppp counters detail
show ppp counters debug
show ppp counters all-contexts
show ppp summary all
show ppp multilink summary
show ppp global
show log | grep "PPP-"
show process ppp detail

pppoe pppoe-debug PPP over Ethernet (PPPoE) information:
terminal length 0
show system status process pppoed
show process pppoe diagnose
show process pppoe chunk-statistics
show process pppoe ipc-pack-statistics
show process pppoe throttle-statistics
show pppoe counters detail
show pppoe counters debug
show pppoe summary all
show pppoe global
show log | grep "PPPOE"
show process pppoe detail
qos qos-debug Quality of service terminal length 0
(QoS) information
show qos policy
show process qos chunk-statistics
show process qos
show qos circuit
show forward policy
show qos h-node
show qos client

rdb rdb-debug SSR configuration terminal length 0
database
information show database globals
show database redundancy
show database external_op
show database threads
show database history verbose
show database transaction
show database transaction log
more /tmp/rdb_show.out
delete /tmp/rdb_show.out
-noconfirm
show database locks verbose
show database directory records
summary
show database directory records
brief
more /tmp/rdb_show.out
delete /tmp/rdb_show.out
-noconfirm
show database run-time summary
more /tmp/rtdb_show.out
delete /tmp/rtdb_show.out
-noconfirm
show database memory verbose
snmp snmp-summary Simple Network Management Protocol (SNMP) information:
terminal length 0
show snmp server
show snmp communities
show snmp accesses
show snmp targets
show snmp views


5.4 Collecting the Output of Logs and Show Commands


You can collect the output of show commands, logs, and macros in several
ways.

• To save the output of a show command to /flash or /md before


copying it to a remote location, add the | save filename or |
save /md/filename keywords to the end of the command. For
example, to save the output of the show redundancy command in
/flash/show-redun.txt, use the show redundancy | save
show-redun.txt command.

• To save your CLI session to a file on your computer, use the capture or
logging function in your terminal emulation software.

• Use the UNIX script command on the terminal server before logging on
to the router and running the show command to save the output to a file in
your working directory.

To save the output of the show tech-support command to /md or to
an external drive:

1. Enter the show tech-support | save /md/filename command.

For example, to save the output to the showtech.txt file on an external


USB drive:

show tech-support | save /media/flash/showtech.txt

2. To copy the output file to a remote location, use the copy


/md/showtech.txt ftp://username@hostname/showtech.txt
command.

To use the script command to save the output to a file in your working
directory:

1. Access the router from a UNIX environment (for example, from a terminal
server), and enter the script filename command.

2. Use Telnet to the router and log on.

3. Enter the show tech-support command.

Your session is saved to a file in your working directory. For example, to


save the output of the command on the router isp-224 with the IP address
10.10.10.2 to the show_tech.log file in your working directory:


working-directory> script show_tech.log
Script started, file is show_tech.log
working-directory> telnet 10.10.10.2
Trying 10.10.10.2...
Connected to isp-224.
Escape character is '^]'.

isp-224
login: admin
Password:
[local]isp-224#
[local]isp-224#term len 0
[local]isp-224#show tech-support

4. When the command has completed and the CLI prompt appears,
enter the exit command twice to exit the router and then the script.
The script completes with the message, Script done, file is
show_tech.log.

For information about collecting data for troubleshooting, see Data Collection
Guideline.

5.5 Debugging
The Ericsson IP Operating System includes many debugging messages to
troubleshoot system processes. By default debugging is not enabled because
of performance impact, but you can enable it when needed for troubleshooting.
Debugging in the router can be a context-specific task or a context-independent
(global) task.

Debugging messages are sent to the syslog, console, or log files, depending on
what is configured. For more information, see Logging.

To enable collaboration on serious and complex issues, the debug function is
separate for each administrator logged on to the same router. Each support
engineer can focus on their own debug actions according to their individual
expertise, and then share their results with other engineers who may be
debugging from different approaches.

Note: Use debugging cautiously in a production environment, and supervise
      customers using it. Do not forget to turn debugging off after
      investigating an issue. Debugging is discontinued when the Telnet
      or SSH session to the CLI ends.

5.5.1 How the Active Context Affects Debug Output


The SSR supports multiple contexts. Each context is an instance of a virtual
router that runs on the same physical device. A context operates as a separate
routing-and-administrative domain with separate routing protocol instances,


addressing, authentication, authorization, and accounting. A context does not
share this information with other contexts.

There are two types of contexts: local (a system-wide context) and
administrator-defined (a nonlocal context). The active context (the context
that you are in) affects your debug output.

Context-specific debugging means navigating to a specific context, running
debug commands from it, and filtering out all debug output not related to
that context. The context-specific output is identified by a context ID in
brackets, which can be displayed using context-specific debugging or
system-wide debugging.

5.5.1.1 Debugging from the Local Context

To debug all contexts on your router, use the system-wide local context. You
see debug output related to this context and all contexts running on the router.
For example, to see all OSPF instances on the router, issue the debug ospf
lsdb command in the local context.

[local] Ericsson# debug ospf lsdb

When you debug from the local context, the software displays debug output
for all contexts. When a debug function is context specific, the debug output
generated by the local context includes a context ID that you can use to
determine the source of the event (the context in which the event has its origin).
You can then navigate to the context that contains the event and collect
additional information to troubleshoot it.

The following example displays debug output from a local context. The debug
output generated using the show debug command includes the context ID
0005, which is highlighted in bold. To find the source of the debug event (the
context name) for context ID 0005, issue the show context all command.
In the Context ID column, look for the context ID with the last four digits
0005—in this case, 0x40080005, which indicates that the source of the debug
event is context Re-1.

Note: After a system reboot, context numbers might change.

      Debug functions and the show context all command display context IDs
      in two different formats: decimal and hexadecimal, respectively. For
      example, the debug output displays a context ID in decimal format as
      0262; the show context all command displays the same ID in hexadecimal
      format as 0x40080106 (the last four hexadecimal digits, 0106, equal
      decimal 262).


[local]Ericsson#show debug
OSPF:
lsdb debugging is turned on
[local]Ericsson#
Apr 18 12:21:04: %LOG-6-SEC_STANDBY: Apr 18 12:21:04: %CSM-6-PORT:
ethernet 3/7 link state UP, admin is UP
Apr 18 12:21:04: %LOG-6-SEC_STANDBY: Apr 18 12:21:04: %CSM-6-PORT:
ethernet 3/8 link state UP, admin is UP
Apr 18 12:21:05: %CSM-6-PORT: ethernet 3/7 link state UP, admin is UP
Apr 18 12:21:05: %CSM-6-PORT: ethernet 3/8 link state UP, admin is UP
Apr 18 12:21:05: [0002]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update
Router LSA 200.1.1.1/200.1.1.1/80000013 cksum 26f1 len 72
Apr 18 12:21:05: [0003]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.2
Update Router LSA 200.1.2.1/200.1.2.1/80000009 cksum ce79 len 36
Apr 18 12:21:05: [0004]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.3
Update Sum-Net LSA 0.0.0.0/200.1.3.1/80000001 cksum bb74 len 28
Apr 18 12:21:05: [0004]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.3
Update Router LSA 200.1.3.1/200.1.3.1/8000000a cksum 142 len 36
Apr 18 12:21:05: [0004]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update
Router LSA 200.1.1.1/200.1.1.1/80000013 cksum 26f1 len 72
Apr 18 12:21:05: [0003]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update
Router LSA 200.1.1.1/200.1.1.1/80000013 cksum 26f1 len 72
Apr 18 12:21:06 [0005]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update
//Associated with Context ID 0x40080005. This is context specific output,
in this case, context Re-1.
----------------------------------------------------------------
[local]Ericsson# show context all
Context Name Context ID VPN-RD Description
-----------------------------------------------------------------
local 0x40080001
Rb-1 0x40080002
Rb-2 0x40080003
Rb-3 0x40080004
Re-1 0x40080005 // The source of the debug event for
Re-2 0x40080006 // Context ID 0005 is context Re-1.
Re-3 0x40080007
[local]Ericsson#

5.5.1.2 Debugging from a Specific Context

The current context affects the output of some debug commands. For example,
the debug ospf lsdb command can be context specific because multiple
contexts can exist, each running its own protocols. In this example, you see
only the OSPF debug output from context MyService. If you run the same
command from the local context, you see output from all contexts that have
OSPF enabled. The context ID in the debug message logs shows all the
contexts for which this debug event is applicable. To debug a specific context
for OSPF, navigate to that context—in this example, MyService.


[local]Ericsson#context MyService
[MyService] Ericsson#terminal monitor
[MyService] Ericsson#debug ospf lsdb
OSPF:
lsdb debugging is turned on
[MyService]Ericsson#
Feb 27 15:11:24: [0001]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update
Router LSA 1.1.1.1/1.1.1.1/8000000c cksum ba60 len 36
Feb 27 15:11:24: [0001]: %OSPF-7-LSDB: OSPF-1: Delete
Net:192.1.1.1[1.1.1.1] Area: 0.0.0.0
Feb 27 15:11:24: [0001]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update
Router LSA 1.1.1.1/1.1.1.1/8000000d cksum b861 len 36
Feb 27 15:12:09: [0001]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update Net
LSA 192.1.1.1/1.1.1.1/80000002 cksum 1b4a len 32
Feb 27 15:12:09: [0001]: %OSPF-7-LSDB: OSPF-1: Delete
Net:192.1.1.1[1.1.1.1] Area: 0.0.0.0
Feb 27 15:12:09: [0001]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update
Router LSA 2.2.2.2/2.2.2.2/80000005 cksum 6ec8 len 36
Feb 27 15:12:09: [0001]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update
Router LSA 1.1.1.1/1.1.1.1/80000010 cksum 4f30 len 48
Feb 27 15:12:09: [0001]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update Net
LSA 192.1.1.1/1.1.1.1/80000003 cksum 194b len 32
Feb 27 15:12:14: [0001]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update
Router LSA 2.2.2.2/2.2.2.2/80000006 cksum 237a len 48

5.5.2 Identifying Context-Specific Debug Functions


Debug functions are either context specific or system-wide. For example, the
debug aaa authen command is system-wide because negotiation takes place at
the port or circuit level and is not associated with a context. When you
debug from the local context, you see debug output from all contexts. Use the
context ID to determine the source of the debug event (the context that the
event is coming from). When you debug from a nonlocal context, you see output
only from that context. You can perform context-specific debugging from the
local context or from one of the contexts that you have configured.

The following examples show how to recognize whether a debug function is
context specific or applies to the local context. In the first example,
context NiceService contains context identifier 0002, which indicates that
the debug aaa author function is context specific.

The internal circuit handle 13/1:1:63/1/2/11 consists of the following
components:

• slot/port—13/1

• channel:subchannel—Identifies an individual circuit on a TDM port.
  13/1:1:63 is an ATM circuit.

• Authority (the application that made the circuit, in this case, ATM) is 1,
  the level of circuit (in this example, a traffic-bearing Layer 2 circuit)
  is 2, and the internal ID (a sequentially assigned unique number) is 11.

In the second example, the debug aaa authen function in the local context is
system-wide because no context identifiers are displayed in the output. In
the third example, the local context displays context identifiers 0002, 0003,
and 0004, which indicates that the sources of the LSA updates are context
specific. When you issue the show context all command, these contexts are
displayed as 0x40080002, 0x40080003, and 0x40080004.

Figure 95

5.5.3 Displaying Debug Output through the Craft Port

Use the logging console command in context configuration mode to view event
log messages on the console. By default, this is enabled in the local
context.
[local]Ericsson#config
Enter configuration commands, one per line, 'end' to exit
[local]Ericsson(config)#context local
[local]Ericsson(config-ctx)#logging console

5.5.4 Displaying Debug Output through Telnet or SSH


Use the terminal monitor command in exec mode to view event log
messages on your terminal when you are connected through Telnet or SSH. To
pause debug output, type Ctrl+S. To continue, type Ctrl+C.

[local]Ericsson#terminal monitor

5.6 Core Dump Files

Core dump files must be attached to all customer-support issues; they are an
important component of the troubleshooting information that support uses to
determine the cause of a failure. A core dump file is a snapshot of the RAM
that was allocated to a process when the crash occurred, written to a more
permanent medium such as a hard disk. It is a disk copy of the address space
of the process that provides information such as the task name, task owner,
priority, and instruction queue that were active at the time the core file
was created.

Collect crash files from both the active and standby RPSW cards.

If a system crash occurred and core dump files were generated, you can
display the files using the following commands.

• show crashfiles

• show process crash-info

Note: The show process crash-info command does not record information about
      manual process restarts, and it does not retain information across a
      system reboot.

On SSRs, core dump files are placed in the /md directory in the /flash partition
(a directory under root FS mounted on the internal CF card), or in the
/md directory on a mass-storage device (the external CF card located in the
front of RPSW), if it is installed in the system.

Note: Crash dump files are saved in gzip format.

5.6.1 Enabling Automatic Upload for Core Dumps


Because core dump files are large, we recommend configuring the OS to send
them to a preconfigured external FTP server automatically, minimizing storage
use on the local CF card. Enable this feature by entering the
service upload-coredump ftp:url command in global configuration mode. In the
following example, 192.168.1.3 is a remote FTP server with write privileges
configured.
[local]Ericsson(config)#service upload-coredump ftp://root:admin@192.168.1.3/ftp-root

The FTP URL has the following format:

ftp://username[:passwd]@{ip-addr | hostname}[:port][//directory]

Use double slashes (//) if the pathname to the directory on the remote server
is an absolute pathname. Use a single slash (/) if it is a relative pathname
(for example, under the hierarchy of the username account home directory).

You can add the optional context ctx-name construct to name an IP operating
system context for reachability.
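
For illustration only, the following sketch shows a relative pathname (single
slash) and an absolute pathname (double slashes); the server address,
credentials, and directory names here are placeholders, not values defined in
this guide:

[local]Ericsson(config)#service upload-coredump ftp://dump:secret@192.0.2.10/coredumps
[local]Ericsson(config)#service upload-coredump ftp://dump:secret@192.0.2.10//var/spool/coredumps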

5.6.2 Collecting Existing Core Dump Files

To display information about existing core dump files, use the show
crashfiles command.

Note: This command does not display information about crash files that have
been transferred to a bulkstats receiver, which is a remote file server.

262 1/155 53-CRA 119 1364/1-V1 Uen PC3 | 2013-02-06


Monitoring and Troubleshooting Data

The following example displays the crash files in /md.

[local]Ericsson#show crashfiles

199974 Jan 20 2010 /md/l2tpd_850.core
401489 Jan 20 2010 /md/pm_748_l2tp_850.core

Assuming you do not have the service upload-coredump command configured, you
might need to copy the crash files to a remote location where you can access
them to send to your customer support representative. To copy the file
l2tpd_850.core:

copy /md/l2tpd_850.core ftp://username@hostname/md/l2tpd_850.core

5.6.3 Forcing a Manual Core Dump


If a process is suspected to be in an abnormal state, the support organization
might ask you to produce core dump files proactively and send them for further
analysis. In this case, you can produce a manual core dump.

Note: You initiate a manual core dump by forcing a crash on any SSR
process or card. However, doing so can destroy other troubleshooting
evidence. Before generating core dumps, collect already existing crash
files and send them to your customer support representative.

To force a core dump for a process without restarting the process, enter the
following command:

[local]Ericsson#process coredump process-name no-restart
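
For example, to force a core dump of the ISM process without restarting it
(assuming ism is the process name on your release, as used by the
process restart ism command later in this chapter):

[local]Ericsson#process coredump ism no-restart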

5.7 Statistics
You can configure the router to produce bulk statistics to monitor your system.
For more information, see Configuring Bulkstats.

5.8 Managing Unsupported Transceivers


Transceiver modules that have not been tested and qualified by Ericsson can
cause power and overheating problems on SSR line cards.

The SSR detects and disables unsupported transceivers, which can potentially
cause power and thermal problems in the router, when they are installed. This
feature does not affect existing operation and service on supported
transceivers and does not require any configuration. Typically, users see it
in the following use cases:

• When an unsupported transceiver is inserted into a configured port on a
  line card, the transceiver is disabled and the port remains in the down
  state. An alarm is raised to alert the user to the situation. When the
  user replaces the transceiver with an Ericsson-approved one, the alarm is
  cleared and port operations are restored.

• When the user configures a port on a line card already inserted in the
  chassis with an unsupported transceiver, the transceiver is disabled and
  the port remains in the down state. An alarm is raised to alert the user
  to the situation. When the user replaces the transceiver with an
  Ericsson-approved one, the alarm is cleared and port operations are
  restored.

• When an unsupported transceiver is detected in a configured port in the
  chassis, the transceiver is disabled, the port remains in the down state,
  and an alarm is raised to alert the user to the situation. To temporarily
  use the unsupported transceiver for testing purposes, the user can apply
  the hidden CLI enable-unsup-xcvr command to re-enable it. The port may
  come up; unsupported transceivers are enabled on a best-effort basis. The
  raised alarm remains to highlight the situation. The user is advised to
  replace the unsupported transceiver as soon as possible.

This topic is also covered in the SSR Line Card Troubleshooting Guide.

5.9 SNMP Monitoring and Notification

5.10 Troubleshooting Using ISM

5.10.1 Using ISM to Troubleshoot Router Issues


ISM is the common hub for OS event messages. ISM records can provide
valuable troubleshooting information.

To troubleshoot system instability problems using ISM, perform the steps in
Table 16.

Table 16 Using ISM to Troubleshoot SSR Problems

Turn On Event Logging:
    (hidden) debug ism client client-name
    show log events circuit cct-handle detail

After a switchover, verify that the newly active RPSW card contains complete
information:
    (hidden) show ism client "SB-ISM" log
    (hidden) show ism client "SB-ISM" log cct-handle detail

Interpreting ISM Messages:

    View a summary of ISM status:
        show ism global

    If information is not coming into a client, investigate the client or
    MBE that should have been sending the messages, or view the order of
    messages after a restart or switchover. View a list of clients or
    information about a specific client, and a list of MBEs or information
    about a specific MBE:
        (hidden) show ism client
        (hidden) show ism client client-name detail
        (hidden) show ism mbe
        (hidden) show ism mbe mbe-name detail

    Examine message logs about circuits, subscribers, or interfaces coming
    into ISM. Look up a cct-handle using the show subscriber active
    sub-name@domain command. Look up an interface grid using the
    show ism interface int-name detail command:
        (hidden) show ism mbe mbe-name log cct handle cct-handle detail
        (hidden) show ism mbe mbe-name log interface int-grid detail
        show ism global complete log cct handle cct-handle detail
        show ism global complete log interface int-grid detail

    Examine the internal state of a circuit or interface in ISM:
        show ism circuit cct-handle detail
        show ism interface int-grid detail
    The output of these commands contains the internal circuit handle for
    each circuit in the hierarchy. The string is in the format
    slot/port:channel/subchannel/subsubchannel/owner/cct-level/running-number.

    Examine the messages going out of ISM:
        (hidden) show ism client client-name log cct handle cct-handle detail
        (hidden) show ism client client-name log interface int-grid detail

    Determine the circuit hierarchy so that you can investigate the parents
    of a particular circuit:
        show ism circuit cct-handle tree

    Examine the global event logs:
        (hidden) debug ism event-in
        (hidden) show ism global event-in

    Examine the changes in the log sizes:
        (hidden) show ism log size {mbe | client} [name] size

    Return all logs to defaults:
        (hidden) ism log size all

    Examine the circuit logs:
        show ism circuit circuit_name log

    Examine the reason statistics:
        (hidden) show ism reason

Restart ISM:
    process restart ism

5.10.1.1 Turn On Event Logging

ISM logging is useful, but it can impact performance on the router.

Be sure that you have logging enabled on the console during a restart or
switchover. To turn on event logging, configure logging in global configuration
mode.


Note: Enabling event logging with these hidden commands can be useful
in troubleshooting, but the volume of data produced can impact
performance. Disable these commands after troubleshooting.

1. Enter global configuration mode and enable event logging with retention.

log event circuit keep-after up

log event circuit keep-after down

2. Set the logging timestamp facility to record by milliseconds.

logging timestamp millisecond

3. Enable the logging events command for ISM.
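
A minimal sketch of this configuration, assuming the hidden commands are
accepted in global configuration mode exactly as listed above (the syntax can
vary by release):

[local]Ericsson#config
Enter configuration commands, one per line, 'end' to exit
[local]Ericsson(config)#log event circuit keep-after up
[local]Ericsson(config)#log event circuit keep-after down
[local]Ericsson(config)#logging timestamp millisecond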

5.10.1.2 After a Switchover, Does the Newly Active RPSW Controller Card Contain
Complete Information?

In the ISM commands in this section, clients are SSR processes that receive
information from ISM, and media back ends (MBEs) send information to ISM.

After a switchover, to examine the ISM events that were on the standby RPSW
card (and just became active), use the show ism global command.

For example, if ND is not receiving messages from AAA:

• Check the messages received. In the output, look for the ISM EOF message
  for each MBE, indicating that the MBE is up.

• The standby RPSW card is synchronized with the current state of the active
RPSW in real time. To view the messages updating the standby RPSW,
enter the show ism client SB-ISM log or the show ism client
SB-ISM log cct-handle detail command. SB-ISM is the ISM
running on the standby RPSW. You can also use the show redundancy
command to examine the RPSW card status.

• To view client and MBEs registering and EOF received and sent files, enter
the (hidden) show sys status process ism-name command.

• To view the process crash information, enter the show proc crash-info
command.

• To view the ism core dump of the system, enter the show crashfiles
command.

• If the process was killed because of a watchdog timeout, the output shows
  that PM killed it with signal 7.

• If the process was killed for consuming too much memory, the output shows
  that the kernel killed it with signal 9, with no core dump.


• If there are any link group problems, get the MBE and client circuit
  logging of aggregate and constituent circuits. Enter the following
  commands to view the link-group problems:

  - show ism linkgroups

  - show ism subprot

  - show link-group detail

• To view chunk memory statistics, enter the show proc ism chunk-statistics
  command.

Note: The information displayed by the show ism linkgroups detail command
      differs from that displayed by the show link-group detail command.

5.10.1.3 Interpreting ISM Messages

To get information about clients:

1. To view a summary of the client download state, number of events by type,
   number of circuits, and circuit performance, use the show ism global
   command.

2. To view the events that were received by ISM, use the show ism global
event-in log detail or the show ism mbe log detail command.

3. To show when events were processed by ISM, use the show ism
global complete log detail command. You can filter it by a circuit
or interface.

4. If information is not arriving at a client, investigate the client or MBE
   that should have been sending the messages, or view the order of messages
   after a restart or switchover using one or more of the following commands.

• To view information about all clients, use the show ism client
command.

• To view detailed information about a specific client, use the show ism
client client-name detail command.

• To view details about clients that are blocking, use the show ism client
  client-name detail command, as in the following example:


[local]Ericsson#show ism client ppp detail


Total clients: 28
Table version : 479,
Client: ppp, ipc name: PPP-ISM-EP-NAME, state: OK,
Internals:
IPC EP: 0x7f000206 0xf4760007
IPC is not blocked, total blocked 1
IPC is not waiting, total waits 0
IPC total msecs blocked 1, max msecs blocked 1
IPC total msecs waiting 0, max msecs waiting 0

In this example:

• IPC is not blocked, total blocked 1 indicates the client interprocess
  communication (IPC) queue is full.

• IPC is waiting, total waits 0 indicates the queue needs to be cleared
  before reaching the lower priority client.

• If blocked or waiting is high, this indicates that the client is consuming
  the messages from ISM and blocking the lower priority clients from being
  updated.

Note:

• Ensure that the table version for each client is similar; otherwise, the
  higher priority client blocks the lower priority clients from receiving
  subsequent updates.

• View the IPC Q/sent/err/drop values. The Q and sent values should be the
  same if the system is running steadily. They might differ for a short
  duration, but the sent value should equal the Q value.

To get information about MBEs:

1. To view information about all MBEs, use the show ism mbe command.

2. To view information about a specific MBE, use the show ism mbe
mbe-name detail command.

3. If messages seem to be coming in the wrong order—for example, if a delete
   message comes before a Down message—examine messages coming into ISM with
   one or more of the following commands.

• show ism mbe mbe-name log cct cct-handle detail

• show ism mbe mbe-name log interface int-grid detail

• show ism global complete log cct cct-handle detail

• show ism global complete log interface int-grid


4. Use the show ism interface int-name detail command to determine the grid
   for an interface, as in the following example.

[local]snow#show ism interface sub detail
Interface: sub, state: Up, version: 138
------------------------------------------------------
Primary IP : 10.1.1.1/16
Primary IPV6 : 2001:a:b::1/48
IPV6 Link Local : fe80::230:88ff:fe00:ba6
Grid : 0x10000006 Ref IF grid : 0x0
Context id : 0x40080001 IPV6 Ref IF grid : 0x0
Node Flags : 0x40 IP flags : 0x1
IPV6 flags : 0x10000
IP calc mtu : 0 IP cfg mtu : 0
IPV6 calc mtu : 0 IPV6 cfg mtu : 0
DHCP relay sz : 0 DHCP server IP : 0.0.0.0
DHCPV6 server IP : 2001:a:b::1
DHCP svr grp : 0x0
# of sec IP : 0 # of bound ccts : 1
# cct change q cnt: 0

5. To examine the internal state of an interface or circuit in ISM, use the
   show ism circuit cct-handle detail or show ism interface int-grid detail
   command.
[local]Ericsson#show ism circuit 13/2:1023:63/6/2/1 detail
Circuit: 13/2:1023:63/6/2/1, Len 64 (Circuit), state: Up, addr: 0x4199b9fc
----------------------------------------------------------
interface bound : sub@local
subscriber bound : user1@local
bind type : chap pap
admin state : 1 hardware address : 00:30:88:12:b7:21
media type : ethernet encap type : ethernet-pppoe-ppp
mode type : 0x2 port type : ethernet
mtu size : 1492 cfg mtu size : 1500
ipv6 mtu size : 1492 ipv6 cfg mtu size : 0
cct speed : 1000000 cct rx speed : 0
cct flags (attr) : 0x8007 cct flags2 (attr) : 0x1
L3 proto flags : 0x3 L3 proto valid : YES
L3 v4 proto : ENABLED L3 v6 proto : ENABLED
L3 v4 proto : UP L3 v6 proto : UP
ppa cct clear : FALSE
if flags : 0x800 aaa index : 0x10000002
profile id : 0 version : 137
nd profile : 1 h node id : 0

The status of L3 v4 proto and L3v6 proto (ENABLED and UP in this case)
indicates whether the dual stack is up.

6. To examine the messages going out of ISM, use the show ism client
client-name log cct handle cct-handle detail or the show ism
client client-name log interface int-grid detail command.

7. To list the circuits on a router, use the show ism circuit command; this can
enable you to look up a circuit handle, as in the following example.


[local]Redback#show ism circuit


Circuit handle Type Hardware address State Intf Bound
2/255:1023:63/1/0/1 Card 00:00:00:00:00:00 Up
2/1:1023:63/1/0/25 Port 00:30:88:14:0a:44 Up
2/1:1023:63/1/1/26 Circuit 00:30:88:14:0a:44 Up
2/1:1023:63/1/2/27 Circuit 00:30:88:14:0a:44 Up to-core@adsl
2/1:1023:63/1/2/28 Circuit 00:30:88:14:0a:44 Up lns@local
2/1:1023:63/1/2/29 Circuit 00:30:88:14:0a:44 Up l2tp-tunnel@lns1
2/1:1023:63/1/2/30 Circuit 00:30:88:14:0a:44 Up l2tp-tunnel@lns2
2/1:1023:63/1/2/31 Circuit 00:30:88:14:0a:44 Up l2tp-tunnel@lns3
2/1:1023:63/1/2/32 Circuit 00:30:88:14:0a:44 Up l2tp-tunnel@lns11
2/1:1023:63/1/2/33 Circuit 00:30:88:14:0a:44 Up l2tp-tunnel@lns12
3/255:1023:63/1/0/1 Card 00:00:00:00:00:00 Down
3/1:1023:63/1/0/34 Port 00:00:00:00:00:00 Down
3/1:1023:63/1/1/35 Circuit 00:00:00:00:00:00 Down
6/255:1023:63/1/0/1 Card 00:00:00:00:00:00 Down
7/255:1023:63/1/0/1 Card 00:00:00:00:00:00 Up
7/1:1023:63/1/0/36 Port 00:30:88:22:52:43 Up
7/1:1023:63/1/1/37 Circuit 00:30:88:22:52:43 Up mgmt@local

8. To determine the circuit hierarchy to investigate the parents of a
   specific circuit, use the show ism circuit cct-handle tree command, as in
   the following example:

[local]Ericsson#show ism circuit 13/2:1023:63/6/2/1 tree

Circuit handle Type Hardware address State Intf Bound


13/2:1023:63/6/2/1 Circuit 00:30:88:12:b7:21 Up sub@local
13/2:1023:63/1/1/6 Circuit 00:30:88:12:b7:21 Up
13/2:1023:63/1/0/4 Port 00:30:88:12:b7:21 Up
13/255:1023:63/1/0/1 Card 00:00:00:00:00:00 Up

9. If a client's state in ISM is unexpected, ISM might have the wrong
   information because it is out of sync or previously received the wrong
   information. To investigate the situation for a circuit, enter the
   following commands.

   • show ism global event-in log cct 1/1:1:63/3/2/10 detail

   • show ism mbe log cct 1/1:1:63/3/2/10 detail

   • show ism global complete log cct 1/1:1:63/3/2/10 detail

   • show ism global dropped log cct 1/1:1:63/3/2/10 detail

   • show ism global error log cct 1/1:1:63/3/2/10 detail


5.10.1.4 Restart ISM

Caution!
Risk of system instability. Because restarting ISM has a major impact on
all modules and the process of resynching all modules after restart is time
consuming, only restart ISM on production systems that are already down or
during a maintenance window.

Only restart the ISM process as a last resort.

If you do need to restart ISM (if, for example, a disconnect exists in the
ISM messages to and from the MBEs), you can restart the ISM process to
synchronize them. After the restart, all MBEs resend information to ISM. All
clients are then populated with this information. To restart the ISM process, use
the process restart ism command.

5.11 Hardware Diagnostics


You can use power-on diagnostics (POD) and out-of-service diagnostics (OSD)
to determine the status of hardware. For more information, see the hardware
guides and General Troubleshooting Guide.

5.11.1 Power-On Diagnostics


POD tests run on startup and provide alerts about hardware problems. In
general, if you determine that a hardware component has failed, send it to the
appropriate organization responsible for the hardware.

POD tests verify the correct operation of the controller cards, backplane,
fan trays, power modules, and each installed line card during a power-on or
reload sequence. These tests also run whenever a controller card or line card
is installed in a running system. The POD for each component consists of a
series of tests, each of which can indicate a component failure.

During each test, the POD displays results and status. If an error occurs,
the test lights the FAIL LED on the failing card but does not stop loading
the SSR software. A backplane or fan tray that fails lights the FAN LED on
the fan tray.

The maximum test time is 130 seconds: 60 seconds for a controller card, 10
seconds for the backplane and fan tray, and 5 seconds for each installed line
card. If the system has two controller cards, the controller tests run in parallel.

To display results from a POD, enter one of the following commands in any
mode:


show diag pod component

show diag pod component detail

Table 17 lists the values for the component argument.

Table 17 Components Tested by POD

Component            Component Argument Values
ALSW card            alsw
Power module         pmx, where x is a value from 1 to 8
Fan tray             ft1, ft2
RPSW card            active for the active card, standby for the standby
Line card            card n, where n is a number from 1 to 20 on the
                     SSR 8020, and 1 to 10 on the 8010
Switch fabric card   sw

The detail keyword displays which test the component failed.
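
For example, to display detailed POD results for the line card in slot 3 (the
slot number is arbitrary and used only for illustration):

[local]Ericsson#show diag pod card 3 detail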

In general, if a component fails to pass its POD tests, you might need to replace
it. Contact your local technical support representative for more information
about the results of a failed test.

POD tests are enabled by default in the SSR software. If they have been
disabled, you can enable them with the diag pod command in global
configuration mode. You can also set the level of POD to run on startup or
installation using the command with the level level construct. For more
information, see

5.11.2 Out-of-Service Diagnostics


This section shows how to determine if problems are caused by hardware
malfunctions using out-of-service diagnostics (OSD). For more information
about OSD, see the hardware guides.

5.11.2.1 Overview of Out-of-Service Diagnostics

You can use OSD to verify hardware status or isolate a fault in a field
replaceable unit (FRU).

If a component fails to pass POD or OSD tests, you might need to replace it.
Contact your local technical support representative for more information about
the results of a failed test.


Five levels of tests are supported, but not all cards support all levels of tests.
Table 18 lists the levels and types of tests performed and the components for
which the tests are supported on the routers.

The OSD tests verify the correct operation of the standby RPSW card, the
standby ALSW card, line cards, and switch fabric cards in the chassis.

Before testing chassis components, put each installed card in the OOS or OSD
state using one of the following commands (see the example after this list).

• Line cards—out-of-service-diag command in card configuration mode.

• SW cards—reload card swslot out-of-service-diag command in exec mode. The
  range of values for slot is 1 to 4.

• Standby cards—reload standby [alsw | rpsw] out-of-service-diag command in
  exec mode.
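
As a sketch, the exec-mode variants might look as follows, assuming switch
fabric card slot 1 and the standby RPSW card (the exact slot keyword, for
example sw1 or SW1, can vary by release; the card-configuration-mode variant
for line cards is not shown because it depends on the configured card type):

[local]Ericsson#reload card sw1 out-of-service-diag
[local]Ericsson#reload standby rpsw out-of-service-diag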

You cannot test the active RPSW or ALSW card, but you can view the results
using the out-of-service-diag command. To execute OSD on active
RPSW or ALSW cards, you must change the state from active to standby using
the switchover command.

To stop a running OSD test, use the no diag out-of-service command. The
following output shows the available options.

[local]Ericsson#no diag out-of-service ?


card The diagnostic is to be run on a card
standby The diagnostic is to be run on the standby card

[local]Ericsson#no diag out-of-service card ?


1..20 Traffic card slot number
SW1..SW4 Switch card slot number

[local]Ericsson#no diag out-of-service standby ?


alsw The diagnostic is to be run on the standby ALSW
rpsw The diagnostic is to be run on the standby RPSW

Note: The correspondence between the card name that appears in the CLI
and the line card type is found in the "Card Types" section of the
Configuring Cards document.

Table 18 OSD Test Levels

Level 1 - Basic tests
  Components: All
  Tests: Duplicates the POD tests.

Level 2 - Extended diagnostics
  Components: Standby controller card, line cards only
  Tests: Includes level 1 tests. Tests all onboard active units and local
  processors.

Level 3 - Internal loopbacks
  Components: Line cards, I/O carrier card, media interface cards (MICs),
  and standby XCRP4 controller cards
  Tests: Includes level 2 tests. Tests and verifies the data paths for the
  entire card with internal loopbacks.

Level 4 - External loopbacks
  Components: Line cards, I/O carrier card, MICs, and standby XCRP4
  controller cards
  Tests: Includes level 3 tests. Tests the entire card using external
  loopbacks. Must be run onsite with external loopback cables installed.

Note: If the level you select is not supported for the unit you want to test, the
tests run at the highest level for that unit.

Table 19 lists the available parameters for an OSD session of the diag
out-of-service command.

Table 19 Parameters for OSD Sessions


Parameter Description
card card-type slot Specifies the line card in the slot to be tested.
standby Tests the standby RPSW card. Can be run only from the active card.
level level Specifies the level at which to initiate the test.
loop loop-num Specifies the number of times to repeat the diagnostic test.

5.11.2.1.1 Viewing and Recording OSD Results

A session log stores the most recent results for each card in main memory and
also on the internal file system for low-level software. In addition, a history file
on the internal file system stores the results for the previous 10 sessions.

You can display partial test results while the tests are in progress. A notification
message displays when the session is complete. To view test results, enter the
show diag out-of-service command in any mode at any time. You can
display the latest results for a traffic or standby controller card from the log or
the results for one or more sessions from the history file.

Note: If you are connected to the system using the Ethernet management
port, you must enter the terminal monitor command in exec
mode before you start the test session so that the system displays
the completion message. For more information about the terminal
monitor command, see Basic Troubleshooting Techniques.

To display the results from OSD sessions, use one of the following commands.
You can enter the commands in any mode.

• Display results for all components from the last initiated session using the
show diag out-of-service command.


• Display results for line cards and switch fabric cards using the show diag
out-of-service card slot command.

• Display results for the standby RPSW or standby ALSW card using the
show diag out-of-service standby command.

• Display results for the active RPSW or active ALSW card using the show
diag out-of-service active command.

• Display results for the last n sessions with the show diag out-of-service
  history n command (see the example below). The latest session is displayed
  first. You can list up to 10 sessions.
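
For example, to display the results of the three most recent OSD sessions
(the session count is arbitrary):

[local]Ericsson#show diag out-of-service history 3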

5.11.2.1.2 OSD Test Result Definitions

In general, if a unit fails a test, you should replace it. Contact your local technical
support representative for more information about the results of a failed test.

Table 20 lists the states of the LEDs when an OSD session runs on a line card
or standby RPSW card in an SSR chassis.

Table 20 Card LED States during and after an OSD Session


Traffic or Standby RPSW or ALSW
Card State State of LEDs
Out of service (shutdown command) FAIL, ACTIVE, and STDBY LEDs are off.
OSD (out-of-service-diagnost FAIL, ACTIVE, and STDBY LEDs are off.
ic command)
Session is in progress FAIL, ACTIVE, and STDBY LEDs blink.
End of session with one or more FAIL LED is on. ACTIVE and STDBY LEDs are turned
failures off until the card is returned to the in-service state.
End of terminated session FAIL LED is off. ACTIVE and STDBY LEDs are turned
off until the card is returned to the in-service state.
End of successful session FAIL LED, ACTIVE, and STDBY LEDs are turned off
until the card is returned to the in-service state.

Table 21 lists the possible status for an OSD session.

Table 21 Status Descriptions for an OSD Session


Session Status   Description
Aborted          Session was terminated by the user or when the standby
                 controller card was removed.
Incomplete       At least one of the requested tests could not be run.
In-Progress      Session is currently in progress.
n Failures       Session was completed with a number of test failures.
Passed           All tests passed.

Table 22 lists the displayed descriptions for the test status.

Table 22 Status Descriptions for a Test


Test Status   Description
Aborted       Test was started but terminated when the standby controller
              card was removed.
Failed        Test ran and failed.
Not Run       Test has not yet run (initial state).
Passed        Test ran successfully.
Running       Test is currently in progress.
Skipped       Test could not be run; for example, the part revision is
              earlier than the required minimum version, or no file was
              found.

5.11.2.1.3 Clearing Results from OSD Sessions

To clear or display the results from OSD sessions, perform the tasks
described in Table 23. Enter the clear diag out-of-service and
diag out-of-service commands in exec mode. Enter the show diag
out-of-service command, which can display results for up to 20 sessions
from the history log, in any mode.

Table 23 Administer OSD and POD Results


• Clear the results from the last initiated session:
  clear diag out-of-service

• Clear the latest results for the active RPSW or ALSW card:
  clear diag out-of-service active

• Clear the latest results for a line card:
  clear diag out-of-service card slot

• Clear the latest results for the standby RPSW or ALSW card:
  clear diag out-of-service standby

• Clear all diagnostic history:
  diag out-of-service history


5.11.2.2 Running OSD for Chassis Components

For instructions to run OSD for chassis components, such as an RPSW card,
ALSW card, SW card, line card, or service card, and return them to the
in-service state, see General Troubleshooting Guide.



Glossary

AAA  Authentication, Authorization, and Accounting
AC  attachment circuit
ACLs  Access control lists
ALd  adaptation layer daemon
ALSW  Alarm Switch
ALSW  Alarm Switch card
AMC  Advanced Mezzanine Card
ARPd  Address Resolution Protocol daemon
ATCA  Advanced Telecom Computing Architecture
BFD  Bidirectional Forwarding Detection
BGP  Border Gateway Protocol
BNG  Broadband network gateway
CIB  Counter Information Base
CLS  Classifier
CM  Configuration Management
CMA  Chassis Management Abstraction
CMB  Card Management Bus
COM  Common Operations and Management
CPLD  Complex Programmable Logic Device
CSM  Card State Manager
CSM  Card State Module
CSPF  Constrained Shortest Path First
DCL  Data Communication Layer
DHCP  Dynamic Host Configuration Protocol
DMA  Direct memory access
DP  drop precedence
DSCP  Differentiated Services Code Point
DU  Downstream Unsolicited
eFAP  egress FAP
eLER  egress LER
EPG  Enhanced Packet Gateway
ESI  Enterprise Southbridge Interface
EXP  MPLS experimental priority bits
FABL  forwarding abstraction layer
FALd  Forwarding Adaptation Layer
FAP  fabric access processor
FIB  Forwarding Information Base
FM  Fault Management
FMM  Fabric Multicast Manager
FRR  Fast Reroute
FTMH  fabric traffic management (TM) header
FTP  File Transfer Protocol
GE  Gigabit Ethernet
GPIO  General Purpose Input/Output
GPRS  general packet radio service
GRE  Generic Routing Encapsulation
GTP  GPRS tunneling protocol
HRH  Host Receive Header
HTH  Host Transmit Header
iBGP  internal Border Gateway Protocol
iFAP  ingress FAP
IFmgr  Interface Manager
IGMP  Internet Group Management Protocol
IGMPd  Internet Group Management Protocol daemon
IGP  Interior Gateway Protocol
iLER  ingress Label Edge Router
ILM  ingress label map
IPC  Inter-process communication
IPG  Inter-Packet Gap
IS-IS  Intermediate System-to-Intermediate System
ISM  Interface State Manager
ITHM  incoming traffic management header
L2VPNs  Layer 2 Virtual Private Networks
L2  Layer 2
L3  Layer 3
L3VPN  Layer 3 VPN
LACP  Link Aggregation Control Protocol
LAG  Link Aggregation Group
LDP  Label Distribution Protocol
LER  Label Edge Router
LFBs  Logical Functional Blocks
LFIB  Label Forwarding Information Base
LGd  Link Group Daemon
LM  label manager
LP  Local Processor
LSAs  Link State Advertisements
LSP  Label-switched path
LSR  Label switching router
MACs  Media Access Controllers
MBEs  Media Back Ends
McastMgr  Multicast Manager
MFIB  Multicast Forwarding Information Base
MIB  Management Information Base
MO  Managed Object
MPLS  Multiprotocol Label Switching
MPLS-TE  MPLS traffic engineering
MRU  maximum receive unit
MW  middleware
NBI  Northbound Interface
NEBS  Network Equipment Building Standards
NHFRR  Next-Hop Fast Reroute
NHLFE  Next-Hop Label Forwarding Entry
NPU  Networking Processing Unit
NTMH  Network Processor TM header
OAM  Operation, Administration, and Maintenance
OCXO  Oven Controlled Crystal Oscillator
OFW  Open Firmware
OIFs  Outgoing Interfaces
OSD  out-of-service diagnostics
OSPF  Open Shortest Path First
PAd  Platform Admin daemon
PCI  Peripheral Component Interface
PCI-E  PCI Express
PD-QoS  Packet descriptor QoS
PEM  Protocol Encapsulation Manager
PEMs  Power Entry Modules
PFE  packet forwarding engine on the NPU
PHY  Physical Interface Adapter
PD  platform dependent
PI  platform independent
PI-RP-QoS  Platform-independent RP QoS
PICMG  PCI Industrial Computer Manufacturers Group
PIM  Protocol Independent Multicast
PM  Process Management
POD  Power-on diagnostics
PWFQ  priority weighted fair queuing
PWM  Pulse Width Modulation
QoS  Quality of service
QoSd  QoS Daemon
QoSLib  QoS shared library
QoSMgr  QoS RCM Manager
QPI  Quick Path Interface (bus connecting CPU chips on the SSC card)
RCM  Router Configuration Module
RDB  Redundant (Configuration) Database
RED  random early detection
RIB  Routing Information Base
RP  Route Processor
RPF  Reverse path forwarding
RPL  Routing Policy Library
RPM  Routing Policy Manager
RPMB  route processor management bus
RPSW  Route Processor Switch Card
RSVP  Resource Reservation Protocol
RTC  real time clock
SAs  support agents
SATA  Serial Advanced Technology Attachment
SCB  Selection control bus
SCP  Secure Copy Protocol
SerDes  Serializer/deserializer
SFTP  Shell FTP
SI  service instance
SNMP  Simple Network Management Protocol
SPI  Service Provider Interface
SPI  system packet interface
SSC  Smart Services Card
SSDs  solid-state drives
SVLAN  Service VLAN
SW  Switch Fabric Card
TCB  Timing control bus
TCP  Transmission Control Protocol
TDP  Thermal Design Power
TE  Traffic Engineering
TFTP  Trivial FTP
TM  (fabric) traffic management
TM  Traffic Management
TOP  Task optimized processor
TS  Traffic steering
TSM  traffic slice management
TX  Transaction
UARTs  Universal Asynchronous Receiver/Transmitters
UDP  User Datagram Protocol
UTC  Universal Time Coordinated
VLLs  virtual leased lines
VLP  Very Low Profile
VPN  Virtual Private Network
VPN  Virtual Private Networking
VPWS  Virtual Private Wire Service
VPWSs  Virtual private wire services
VRRPd  Virtual Router Redundancy Protocol daemon