8 Private Network-to-Network Interface (PNNI)
The Passport Private Network-to-Network Interface (PNNI) routing system simplifies the
configuration of large networks for ATM switched connections. The protocol implements a powerful
dynamic routing mechanism that eliminates manual configuration and assures automatic
updates of routing tables in the case of topology changes and network resource shortages. It is
based on shortest path algorithms that are distributed across the network for greater
reliability. PNNI provides standardized signaling/routing messages and interworking functions,
establishes and maintains clear point-to-point and point-to-multipoint connections, and provides
a variety of route selection criteria and performance constraints especially valuable in large
WANs. In addition, the Passport PNNI implementation supports the edge-based re-routing
capabilities necessary for advanced restoration mechanisms, also known as route
optimization and route recovery. This is complemented by value-added route balancing
algorithms, route caching and configurable multi-path variance that maximize the overall
performance of the network.
This chapter assumes that the reader is familiar with PNNI routing and signaling; while it is not
intended as a tutorial, the basic concepts and terms of PNNI are discussed below.
8.1 Overview
The PNNI protocol is standardized by the ATM Forum to provide reliable, highly scalable and
dynamic routing to ATM networks. PNNI supports different ATM switched services such as
SVCs, SVPs, SPVCs and SPVPs. The value of PNNI is most typified by its ability to scale to
thousands of nodes with minimal operational overhead while still maintaining loop-free routing, QoS
guarantees and other powerful routing capabilities. Although the major benefits of PNNI are
realized in large networks, the protocol is now mature enough to be well suited to smaller
networks as well. In the past, IISP static routing or proprietary protocols were used to route in
small networks. However, the problems inherent to static routing protocols (manual
configuration, possible routing loops, no dynamic topology updates or changes) are reason
enough for most networks to introduce or migrate to PNNI.
In addition to the benefits inherent to the protocol, Passport offers many value-added optional
features and a wide selection of proprietary route selection, route optimization, load balancing
and performance enhancing functionality.
PNNI is a source routing protocol. This implies that a single node calculates the “hierarchically
complete” route, thereby eliminating the possibility of routing loops. A common characteristic of
source routing protocols is that every node has an identical view of the topology. This
requirement enforces consistent routing in the network, and requires each node to learn
the topology and characteristics of the other nodes and links in the network.
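For illustration, the shared link-state view can be sketched with a standard shortest-path
computation over an identical topology database held by every node. This is only a sketch of the
principle: the topology dictionary, node names and link costs below are invented, not taken from
any Passport implementation.

```python
import heapq

def shortest_path(topology, source, dest):
    """Dijkstra over a shared link-state view. Every node holds the same
    `topology` dict, so each computes the same loop-free route.
    topology: {node: [(neighbour, cost), ...]}"""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == dest:
            break
        for nbr, cost in topology.get(node, []):
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if dest not in dist:
        return None  # destination unreachable
    path, node = [], dest
    while node != source:
        path.append(node)
        node = prev[node]
    path.append(source)
    return list(reversed(path))

# Invented topology: three nodes in one peer group plus an external node C.
topo = {
    "A.1": [("A.2", 1), ("A.3", 1)],
    "A.2": [("A.1", 1), ("A.3", 1)],
    "A.3": [("A.1", 1), ("A.2", 1), ("C", 2)],
    "C":   [("A.3", 2)],
}
print(shortest_path(topo, "A.1", "C"))  # ['A.1', 'A.3', 'C']
```

Because every node runs the same computation over the same database, the routes they would
compute for each other agree, which is what rules out loops.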
The PNNI routing protocol consists of procedures and sub-protocols that control the distribution
of topology information to each PNNI node. Because each node maintains its own database
reflecting the network topology, the database grows as the topology grows. Ultimately, the
switch resources required to support such a database will become exhausted, limiting the size
and growth of the network. PNNI addresses this by defining smaller sub-networks in which each
node has only to learn and maintain a database of the peers in its sub-network. These
sub-networks are called “peer groups” and can be groups of up to 300 contiguous
Passport nodes.
To maintain the connectivity of nodes in different peer groups, PNNI defines special roles for
switches that effectively form a logical hierarchy. The function of the hierarchy is to achieve
network scalability and decrease call setup times. To achieve this, mechanisms are put in place to
distribute a minimal amount of routing information between nodes while still maintaining
network integrity. The hierarchy is constructed such that nodes closer to the apex represent entire
peer groups below them in a parent/child relationship. Groups of parent nodes form higher level
peer groups. The position of each node in the hierarchy is defined by its peer group id and level.
The peer group id defines the peer group borders while the level specifies where in the hierarchy
this peer group resides.
Reachability information is exchanged between nodes in the same peer group. Starting at the lowest
level of the hierarchy, this information is summarized and passed upwards to the parent node.
The parent node in turn summarizes its peer group information and passes it upward until the
top level of the hierarchy is reached. This information is exchanged among the parent nodes and can
then be passed down the hierarchy to other child peer groups. It is through this mechanism that each
child peer group learns about other child peer groups, and which parent peer group claims
reachability to them. Each node in the lowest level peer group now has the information needed to
complete routing to every other node in the hierarchy. The local nodes no longer require the link
and topology information of the remote peer groups; each node reclaims memory in its own
topology database by purging the link state information of remote peer groups, which is how
PNNI achieves scalability.
Because topology information is localized on a peer group basis, in a hierarchical network the
entry border node of each peer group is responsible for completing the routing for connections
transiting that peer group. This implies that a single connection is routed as many times as it
enters a new lowest level peer group.
To facilitate the construction of the hierarchy, PNNI defines four special switch roles:
• Peer Group Leader (PGL)
• Logical Group Node (LGN)
• Border Node (BN)
• Interior Node (IN)
PGLs are elected in each peer group, and are configured with policies that will represent this
node in the higher levels of the hierarchy.
LGNs are virtual nodes created on the same system as the PGL, and represent the child peer
group by broadcasting the policies configured on the corresponding PGL to other LGNs.
Border nodes are nodes that have a PNNI connection to a node outside their own peer group. Border
nodes have the special job of logically connecting the lower layers of the hierarchy to the higher
layers with virtual links (called uplinks) that can be routed on. Border nodes are also responsible
for routing connections through their peer group.
INs are any other nodes in the hierarchy: they are not a PGL or LGN, and have links only to
nodes inside their own peer group.
Passport release PCR1.3 supports all four types of PNNI switch roles and is fully capable of
supporting a PNNI hierarchy at any level.
Introducing hierarchical PNNI means that the source node cannot generate the complete route for
inter-PG connections because it does not have full information about the destination PG and
tandem PG(s). Instead, the source node uses its knowledge of its own PG and all its ancestor PGs
to calculate a “hierarchically complete” source route.
Section 8.2.1 Example: flat vs hierarchical PNNI shows two examples. The first is a simple
network illustrating the differences in routing between flat and hierarchical networks with
identical physical topologies. The second is a more realistic example of a network configuration
in which a number of PG selections are compared to illustrate the trade-offs between flat and
hierarchical topologies.
[Figure: a) Flat PNNI topology (same as the physical network): nodes A.1, A.2, A.3, B.1, B.2,
B.3 and C at level 80. b) Hierarchical PNNI topology: logical group nodes A, B and C in the
level 20 PG, with the child peer groups at level 80. c) Peer Group A’s view of the network:
A.1, A.2 and A.3 plus LGNs B and C at level 20.
Horizontal link: there is database exchange across horizontal links.
Outside link: there is no database exchange across outside links.]
Diagram B shows logical group nodes (LGN) A, B and C in the level 20 PG. Diagram A
shows the network as flat PNNI, where every node has the entire view of the network. In
order to make a connection from A.1 to C, source node A.1 has up to six routes to
reach C:
A.1->A.3->C A.1->A.2->B.1->B.2->B.3->C
A.1->A.2->A.3->C A.1->A.3->A.2->B.1->B.3->C
A.1->A.2->B.1->B.3->C A.1->A.3->A.2->B.1->B.2->B.3->C
Routes with insufficient bandwidth are pruned, and the best route is chosen
by A.1 according to Aw (or CTD/CDV). In hierarchical PNNI, A.1 has three choices to
reach C; they are hierarchically complete source routes, since A.1 only has a view of
its own PG and ancestor PGs, as in Diagram C:
A.1->A.3->C A.1->A.2->A.3->C
A.1->A.2->B->C
A.1 does not know whether B or C are physical nodes or LGNs. Passport uses simple node
representation, which summarizes an entire PG to a single node connected to
other nodes via the outside links to the PG. Note that the internal structure of PG B is not
visible to A.1, so if the route A.1->A.2->B->C is chosen as the best route, then the
outside link A.2->B.1 will be used, and node B.1 will have to route across PG B in order
to get to C. B.1 is the entry border node and must calculate whether B.1->B.3->C or
B.1->B.2->B.3->C is indeed the best route. This serves to illustrate how flat PNNI is entirely
source routed, while hierarchical PNNI routing is shared between the source node and
all entry border nodes en route to the destination.
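The prune-then-select step described above can be sketched as follows. The link attribute names
(`avcr`, `aw`), the values and the candidate route lists are all invented for illustration; real
PNNI generic call admission control is considerably more involved.

```python
def select_route(candidates, links, required_bw):
    """Prune candidate routes whose bottleneck available cell rate (AvCr)
    is below required_bw, then pick the route with the lowest cumulative
    administrative weight (Aw).
    links: {(a, b): {"avcr": ..., "aw": ...}} (illustrative attributes)."""
    def link(a, b):
        return links.get((a, b)) or links[(b, a)]
    best, best_aw = None, float("inf")
    for route in candidates:
        hops = list(zip(route, route[1:]))
        if any(link(a, b)["avcr"] < required_bw for a, b in hops):
            continue  # pruned: insufficient bandwidth somewhere on the route
        aw = sum(link(a, b)["aw"] for a, b in hops)
        if aw < best_aw:
            best, best_aw = route, aw
    return best

# Invented link attributes and candidate routes:
links = {
    ("A.1", "A.3"): {"avcr": 100, "aw": 10},
    ("A.3", "C"):   {"avcr": 40,  "aw": 10},
    ("A.1", "A.2"): {"avcr": 100, "aw": 5},
    ("A.2", "B"):   {"avcr": 100, "aw": 5},
    ("B", "C"):     {"avcr": 100, "aw": 5},
}
routes = [["A.1", "A.3", "C"], ["A.1", "A.2", "B", "C"]]
print(select_route(routes, links, required_bw=50))  # ['A.1', 'A.2', 'B', 'C']
```

Here the shorter route is pruned because its A.3->C link cannot carry the requested bandwidth,
so the route through B wins on cumulative Aw among the surviving candidates.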
Complex node representation is a symmetric star topology with a uniform radius. The
center of the star is the interior reference point of the logical node, and is referred to as
the nucleus. The logical connectivity between the nucleus and a port of the logical node is
referred to as a spoke. The concatenation of two spokes represents traversal of a
symmetric peer group.
Complex node representation attempts to provide more information to the LGN so that a
better choice of outside links can be made for PG traversal. PGs in “real world” networks
are typically much more complex than a simple hub-and-spoke topology, so other
representations, such as a minimum spanning tree, may be superior.
Without complex node representation, it is important to scale each peer group to
relatively the same diameter and size as each of the other peer groups. Failure to do this
may cause sub-optimal routing, because the zero cost of each LGN will not be a true
reflection of the cost of each peer group. If each peer group has the same diameter and
roughly the same QoS metrics, then the equal cost of each peer group negates any large
implications of taking a particular peer group when compared to another.
Also, equally sized peer groups allow the EBR feature to choose an optimal route
throughout the PNNI domain.
Developing an addressing plan is a fundamental step to offering signaled connections (e.g. ATM
SVCs or SPVCs) in a network. The addressing plan is used to uniquely identify ATM endpoints
where connections are terminated. Network nodes, other networks or Customer Premises
Equipment (CPE) are examples of connection endpoints. An effective addressing plan will
identify each endpoint, and provide summarization of groups of endpoints to reduce the size of
network routing tables (summarization is discussed in section 8.3.3 Address Summarization) and
allow for flexibility and portability of each of the addresses. Furthermore, an addressing plan
must also consider the future growth of the network and prepare for possible PNNI peer group
boundaries where they are currently not required. Section 8.3.1 Network Growth and Peer Group
Boundaries discusses the relationship between peer groups and NSAP addresses.
It is recommended that peer group boundaries are planned before the actual need for them.
Please refer to section 8.5 Peer Group Boundaries for a discussion on how peer group
boundaries can be formed.
In this example, the level can be arbitrarily set to any value between 1 and 80. When
migrating a flat PNNI network to a hierarchy, a “top down” approach is recommended
(see section 8.9 Migrating to a PNNI Hierarchical Network on migrating PNNI
networks). If the top down approach is applied, the actual value of the level is
inconsequential. The only requirement is to have more common bits in geographic areas
following the 80th bit. These bits will be used to uniquely identify child peer groups
similarly to how the single peer group was represented.
Address A (hexadecimal): 47 23 45 67 89 11 ……
Address B (hexadecimal): 47 23 45 67 89 FF ……
(first 80 bits / 10 octets)
Although the PNNI protocol supports up to 104 levels (13-octet PGID × 8 bits), typically
only a two-level hierarchy is required and recommended for most networks, if a hierarchy
is required at all. Whenever possible, a single lowest level peer group is recommended.
The major benefits of hierarchy are only realized once the size of the single
peer group has exhausted the switch resources needed to support routing efficiently.
Efficiency is based on network stability, call setup performance and route convergence
times, in combination with the specific services offered in the network. For more
information on deciding how many levels are appropriate, please refer to section 8.7
PNNI Multi-level Hierarchical Configurations.
Note: If a single lowest level peer group is chosen, PNNI level 1 can always be used to
create the peer group id, regardless of the addressing plan and format.
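The relationship between an address prefix and a peer group id at a given level can be sketched
as a bit-masking operation: the first `level` bits of the 13-octet prefix identify the peer
group, and the remaining bits are zeroed. The prefixes below are invented, and this sketch omits
the leading level octet the real encoding carries.

```python
def peer_group_id(prefix_hex, level):
    """Form a peer group id from the first `level` bits of a 13-octet
    address prefix, zeroing the remaining bits (sketch of the PNNI
    encoding; the level octet is returned separately)."""
    total_bits = len(prefix_hex) * 4
    mask = ((1 << level) - 1) << (total_bits - level)  # keep the top `level` bits
    masked = int(prefix_hex, 16) & mask
    return level, format(masked, f"0{len(prefix_hex)}x")

# Two invented 13-octet prefixes that share their first 40 bits:
a = "47234567891100000000000000"
b = "4723456789ff00000000000000"
print(peer_group_id(a, 40) == peer_group_id(b, 40))  # True: same peer group
print(peer_group_id(a, 48) == peer_group_id(b, 48))  # False: distinct child PGs
```

Raising the level exposes more prefix bits, which is how a single peer group is later split into
child peer groups without renumbering the nodes.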
A two level hierarchy is typically defined by one parent level peer group comprised of
logical group nodes and at least two lower level child peer groups. The child (lowest
level) peer groups do not have to be at the exact same level, but must define the same
parent level peer group and themselves have a lower level than the parent group. (If
address scoping with a value other than zero is used, there could be reachability
implications.) Few network topologies require 3 levels of hierarchy (e.g. grandparent,
parent and child peer groups). Justification for such a hierarchy would include
international networks with tens of thousands of nodes, or special migration scenarios.
In most instances networks today only require a single lowest level peer group. The
operator is encouraged to change the node NSAP addresses as little as possible because
doing so on a Passport switch requires a reboot. In order to offer switched services now
and avoid unnecessary reboots, a single addressing plan should be flexible enough to
identify the endpoints in a single peer group, but also be flexible and smart enough to be
reused in a hierarchical environment. Despite using a single peer group, the network
addressing plan must be aligned with future growth patterns of the network in order to
effectively partition future peer groups.
[Figure: a two-level hierarchy. The parent Peer Group 4 at level 32 (PGID 12345678)
contains LGN 1, LGN 2 and LGN 3, representing child peer groups PG1, PG2 and PG3
(PGIDs 1234567890, 1234567822 and 123456783) at levels 72 and 80.]
At the nodal level, Passport supports this concept by offering “summary” addresses.
Summary addresses are used to summarize UNI endpoints on a node by advertising
only the common bits of the end addresses. The simplest form of this is done via Interim
Local Management Interface (ILMI) registration of UNI addresses. This process appends
the CPE’s 6-octet IEEE MAC address to the 13-octet node prefix, forming an NSAP
address. Because all the ILMI-registered addresses share the same unique 13-octet
prefix, only that prefix needs to be advertised in place of all the addresses.
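The ILMI-style address formation described above amounts to simple concatenation. The prefix
and MAC values below are invented examples.

```python
def ilmi_register(node_prefix_13, mac_6, sel=0x00):
    """Sketch of ILMI-style address formation: a full 20-octet NSAP is the
    13-octet node prefix + 6-octet ESI (the CPE's MAC) + 1-octet SEL."""
    assert len(node_prefix_13) == 13 and len(mac_6) == 6
    return node_prefix_13 + mac_6 + bytes([sel])

prefix = bytes.fromhex("47234567891123456789111111")  # invented 13-octet prefix
mac = bytes.fromhex("00a0c9123456")                   # invented 6-octet IEEE MAC
nsap = ilmi_register(prefix, mac)
print(len(nsap), nsap.hex())  # a 20-octet NSAP address
```

Only the 13-octet prefix needs to appear in routing advertisements; the full 20-octet address
exists only at the UNI where the CPE registered it.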
Address summarization also applies to a PNNI hierarchical network. It is the role of each
Peer Group Leader (PGL) to summarize all the reachable addresses in its peer group,
such that nodes in other peer groups can reach them. The summarized list of addresses is
effectively broadcast to the rest of the network.
To facilitate effective summarization, the addressing plan must place as much general
information about the address in the most significant bits. Usually, the generalizations are
based on geographical regions. Using geographical regions is not a requirement, but for
simplicity they are used in examples and discussions that follow.
Using a geographical example, more general areas would imply larger, more expansive
geographical regions. For instance, the state of Texas is more general than the city of
Dallas. Both imply specific places, but Dallas provides a finer granularity than Texas, as
Dallas is a city in Texas. By advertising Texas, the city of Dallas (as well as every other
city in Texas) is implied.
Consider Figure 8-4 to help further illustrate. Here there are 3 major fields: city, block,
and house. Using this format, the multiple blocks and houses in a city can be effectively
summarized by advertising the single city address. In a network routing scenario, the city
address would be provisioned on a node with reachability to this particular city. Any calls
terminating in this city should route to the node advertising the city address. Advertising
the city address implies the individual city blocks and houses.
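Summarization of this kind can be sketched as deriving a common prefix from a set of endpoint
addresses and advertising only that prefix. The hex addresses below are invented (imagine city,
block and house fields packed left to right).

```python
def summarize(addresses):
    """Derive the longest common prefix of a set of addresses. Advertising
    that single prefix implies reachability to every address beneath it."""
    prefix = addresses[0]
    for addr in addresses[1:]:
        while not addr.startswith(prefix):
            prefix = prefix[:-1]  # shrink until it matches this address too
    return prefix

# Invented endpoints: shared "city" digits, differing "block"/"house" digits.
endpoints = ["47112200", "47112201", "47112302"]
summary = summarize(endpoints)
print(summary)  # '47112'
print("47112302".startswith(summary))  # True: the summary implies this endpoint
```

Note the trade-off visible even in this toy: the summary also matches addresses such as 47112999
that may not exist, which is the routing ambiguity that summarization introduces.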
[Figure 8-4: an address format with CITY, BLOCK and HOUSE fields.]
When overly general summaries are advertised for routing, this leads to more calls set up to
destinations that don’t exist, unnecessarily wasting network resources.
ICD NSAPs are designed for organizations that are international in scope. As such these
organizations do not wish to be tied to any country in the hierarchically structured
address scheme and require globally unique ATM addresses for network interworking.
NSAP E.164 addresses are intended for organizations that own blocks of E.164 numbers
and are willing to administer their assignment according to the ITU-T recommendations.
Choosing among the previous three is essentially a non-technical issue. However, the
following advantages can be gained from using NSAP addresses over E.164:
• ease of interworking between public and private ATM networks, since the NSAP
address format is recommended in both environments by the ATMF
• the ILMI protocol, which allows ATM private switches to automatically register the
terminals’ (hosts’) IEEE MAC addresses with the high-order part of the ATM NSAP
node addresses in order to form a complete and unique 20-byte NSAP address
• ISO NSAPs are longer than E.164 addresses, which allows more hierarchy and the
capability to address any ATM UNI even in a very large ATM network without a
need for sub-addressing (the larger NSAP addresses allow considerably greater
flexibility and scalability)
• the PNNI 1.0 dynamic routing protocol makes use of NSAP addresses in its topology
database
NSAP addresses start with a one-octet Authority and Format Identifier (AFI) followed by a two-octet
Initial Domain Identifier (IDI). The AFI determines the format of the remainder of the
address. The hexadecimal values 0x39, 0x47 and 0x45 indicate that the NSAP addressing
structure conforms to the ISO DCC, ICD and E.164 addressing schemes respectively.
The IDI identifies the address authority (the authority responsible for allocating the values
and the syntax of the HO-DSP field). The IDI can be a DCC code, ICD code or E.164
number.
The remainder of the address consists of three fields - the high-order Domain Specific
Part (HO-DSP), the end-system identifier (ESI), and a selector (SEL) field. The authority
identified by the IDI field decides the structure of the HO-DSP. For example, a national
ISO member body formats the HO-DSP of DCC addresses, and the HO-DSP of
ICD addresses is decided by the international organization.
The ESI identifies an end-system and must be unique within a particular value of AFI +
IDI + HO-DSP.
The ESI is usually an IEEE MAC address. The SEL field is not used to deliver calls to
end-systems, but could be used within an end-system to differentiate between
applications or processes. It is usually set to 00 hexadecimal value. For purposes of route
determination, end systems are identified by the 19 most significant octets of the ATM
NSAP address.
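The field layout described above can be sketched as fixed-offset slicing of a 20-octet address.
The example address is invented, and the 2-octet IDI shown here matches the DCC/ICD formats
only (E.164 uses a longer IDI, which this sketch ignores).

```python
def parse_nsap(nsap_hex):
    """Split a 20-octet NSAP (as a 40-digit hex string) into its fields,
    using the DCC/ICD layout: AFI(1) + IDI(2) + HO-DSP(10) + ESI(6) + SEL(1)."""
    assert len(nsap_hex) == 40
    return {
        "afi": nsap_hex[0:2],      # 1 octet: address format
        "idi": nsap_hex[2:6],      # 2 octets: address authority
        "ho_dsp": nsap_hex[6:26],  # 10 octets: authority-defined structure
        "esi": nsap_hex[26:38],    # 6 octets: often an IEEE MAC address
        "sel": nsap_hex[38:40],    # 1 octet: not used for routing
    }

# Invented example address: AFI 47 (ICD format).
fields = parse_nsap("47" + "0091" + "23456789a1b2c3d4e5f6" + "00a0c9123456" + "00")
print(fields["afi"], fields["esi"])  # 47 00a0c9123456
```

Only the first 19 octets (everything before SEL) matter for route determination, consistent with
the text above.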
[Figure: 20-octet NSAP format: AFI (47) | ICD | HO-DSP | ESI | SEL (00)]
AFI = Authority and format identifier; HO-DSP = High-order domain specific part;
DCC = Data country code; ESI = End-system identifier (for example, an IEEE MAC address);
ICD = International code designator; SEL = Selector
Many factors can influence how the hierarchical nature of the address is formed. The
remainder of this section, 8.4 Addressing Formats, discusses these factors.
With respect to addressing, all switched services are broken into two categories: type 1
connections, which terminate inside the ATMSP network (e.g. SPVCs), and type 2 connections,
which terminate outside it (e.g. SVCs).
The biggest reason for the distinction is the use of the ESI portion of an address. Type 2
connections should not include the 6-octet End System Identifier (ESI) portion of the
address in any addressing plan. The ESI fields are typically reserved by private networks
that ILMI-register their end customer addresses. The ESI portion is not important for
routing the call in the ATMSP network, as that level of granularity only matters in the
private network. The public network must route to the highest (most general) address
uniquely identifying the private network. If the private network ESI addresses were
advertised in the ATMSP network, the ATMSP network would be flooded with addresses
identifying the private network endpoints.
[Figure: a private cloud connected via IISP to the ATMSP cloud. The private network uses
addresses of the form “13octetPrivPrefix:6octetESI”, while only the summary address
“13octetPrivPrefix” is advertised in the ATMSP network.]
An SVC addressing plan (type 2) cannot use the ESI portion of the address, and as such
any subscriber information should be kept in the first 13 octets of the address. In this
instance, the subscriber information would share the same space as the node id, region
identifiers and other geographic information that has already been encoded. If only a
single addressing plan is in place, then the addressing plan considers only the first 13
octets for both SVCs and SPVCs. This means that connections terminating inside the
ATMSP network (e.g. SPVCs) as well as connections terminating outside the network are
sharing the 13-octet space. The consequence here is that the subscriber information
space reserved in the 13 octets is shared amongst SVC addresses and SPVC
addresses. Under normal circumstances this would suffice if the geographic
regionalization fields do not require an excessive number of bits. However, it may not be
possible to satisfy both the geographic regionalization and the expected number of SVCs
and/or SPVCs in the first 13 octets. In this case, two separate addressing plans can be
used for SVCs and SPVCs respectively.
An SVC (type 2) addressing plan would mirror the one described above but would not
have to share its subscriber space with the SPVC connections. The SPVCs (type 1) would
have a separate plan that could use the ESI portion of the NSAP. For SPVCs, the call
would terminate in the ATMSP, and the ESI is a well known and distinguished routable
entity.
Any SPVC terminating outside the ATMSP network would follow the SVC addressing
plan because it will enter the private network as an SVC and would be routed as one.
Furthermore, the default addresses are NOT portable. That is, they cannot be deleted or
moved between ATM interfaces. This may cause problems if an operator needs to move
an address from one card or port to another. In this case, all the connections terminating
or transiting on that interface would have to be reconfigured to the new destination or
transit address. The relationship from source to destination is many-to-one, so the changes
described here may be significant. The alternative, and the recommended addressing plan,
uses a user-supplied ESI field. The ESI field can be provisioned to identify a particular
customer, service, reference number etc., at the discretion of the ATMSP network
planners. Provisioning such an address provides portability and allows the flexibility of any
values inside the ESI. Note that if the card and port numbers are included in the ESI, then
the portability is reduced to the card/port level; however, the limit of three hex digits to
represent card/port is lifted, as 6 digits (not including the SEL field) can now be used.
Private networks realize certain advantages to using their own addressing plan, such as
ATMSP portability. If a new ATMSP network is required for the private network, such a
change will not impact their own addressing plan. If they used the ATMSP plan,
changing the ATMSP network may imply changing the addressing plan. Currently, there
are no rules governing which plan to use.
If the private network wishes to adopt the addressing plan of the ATMSP network, this
implies that the ATMSP numbering plan will number each node in the private network.
The goal of this is to offer them a plan that can be easily summarized in the ATMSP
network, but flexible enough to satisfy the private network requirements.
Typically, a private network id can be allocated that would be used to uniquely identify
the private network. In addition, some fields can be reserved for use within the private
network, which the private network can allocate at its discretion. Typically, these bits are
used to identify the individual nodes in the private network.
[Figure 8-7: address format (13 octets).]
There are two special requirements that need to be considered in this scenario: portability
and flexibility. Portability refers to the ability of the private network to change its
point of access to the ATMSP network. The point of access can be a port, a node or a
general geographical location. The level of portability must be negotiated between both
parties but, generally, it is not wise to bind a private network to a single port for
access.
Considering the example in Figure 8-7, if the private network id is appended to the node
id, then the private id is bound to that node identifier. This means that the private id
cannot be ported to another node without changing the numbering scheme of the private
network nodes. All private network node ids have the ATMSP node id as part of their
numbering scheme. For this reason, the node id (or anything more specific, such as a card
or port id) should not be used to address SVCs terminating in a private network. Instead,
the amount of portability required for the private network should first be determined, and
then the private network identifier placed appropriately afterwards.
In Figure 8-8, a region and sub-region have been defined in the addressing plan. A
private network has connected to the sub-region, and wishes to adopt the ATMSP
networking plan. The nodes inside the ATMSP network use their node ids to terminate
calls. Calls transiting into the private network have addresses provisioned on the ATMSP
access nodes that do not specify the node id. The private network is now portable to any
node in sub-region 1. In this case a unique value is used to distinguish private node
ids from ATMSP node ids.
[Figure 8-8: ATMSP nodes Region+subregion1+nodeid1 and Region+subregion1+nodeid2 in
Subregion1, with the static address Region+subregion1+privNet1 provisioned for the private
network; the private nodes are addressed Region+subregion1+privNet1+privNode1 and
Region+subregion1+privNet1+privNode2.]
To accomplish this, the addressing plan must distinguish a node id from a private
network id. This can be done by using distinguished values for private network ids, or by
setting unique bit patterns to signify either a node or a network. An example of the
value-based method is to limit all the node ids to a low-to-high number range (i.e. 0-99
is reserved for node ids, and 100-999 is reserved for private networks). Alternatively, the
plan can designate that no node ids begin with 0, so any field beginning with 0
represents a private network; the first bit with a value of 0 then distinguishes a
private network. It may also happen that a private network customer requests
multiple networks within a sub-region. In this case, the addressing plan may have a
requirement to identify one unique customer, but two distinct networks. A
recommendation here would be to divide the private network portion of the address into a
customer id field and a local network field. This reduces the customer id space, but
allows sub-networks to be uniquely identified.
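The value-based method can be sketched as a simple range check; the 0-99/100-999 split below
follows the example ranges given above and is purely illustrative.

```python
def classify(field):
    """Value-based classification of an address field, using the example
    allocation from the text: 0-99 is reserved for ATMSP node ids and
    100-999 for private network ids (an assumed, illustrative split)."""
    n = int(field)
    if 0 <= n <= 99:
        return "node"
    if 100 <= n <= 999:
        return "private network"
    raise ValueError("field out of the allocated range")

print(classify("042"))  # node
print(classify("517"))  # private network
```

The bit-pattern alternative works the same way, except the check inspects the leading bit of the
field instead of its numeric range.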
[Figure: ATMSP nodes Region+subregion1+nodeid1 and Region+subregion1+nodeid2, with static
addresses Region+subregion1+privNet1Area1 and Region+subregion1+privNet1Area2 provisioned;
the private nodes are addressed Region+subregion1+privNet1Area1+privNode1 and
Region+subregion1+privNet1Area2+privNode1.]
Figure 8-9 Two private networks owned by a single organization attaching to the ATMSP Network
For optimal network performance, the number of peer groups in a network should be minimized.
The more peer groups exist, the more routing ambiguity exists, due to the address
summarization and topology hiding inherent to PNNI peer groups. As previously discussed, the
major benefits of multiple peer groups are only realized once a single peer group is no longer
suitable for the network in question.
A typical topology for such networks is a full mesh core of relatively few nodes (typically
four to eight) with a larger number of edge nodes which are dually homed to the core nodes.
[Figure legend (Figures 8-10 to 8-14): core node, edge node, logical group, core link, edge link.]
Figure 8-10 Network topology with no link aggregation and a flat PNNI configuration
This reflects a good compromise of low link cost and full redundancy, with a maximum of
three hops per connection. A worldwide carrier network might consist of a number of
these clusters connected by transoceanic links, similar to the topology of the first
example, except that every peer group is a cluster, as in Figure 8-10.
Figure 8-11 Network topology with two core nodes per peer group
Whereas Figure 8-10 depicts a flat PNNI configuration, Figure 8-11 and Figure 8-12
illustrate a two level hierarchy configuration.
Figure 8-12 Network topology with one core node per peer group
Figure 8-13, on the other hand, shows the core network in a separate peer group (PG), illustrating
a degenerate case where peer groups have been partitioned. Such a topology implies many
crankbacks, which prevent intra-PG routing; as a result, intra-PG connections are not reachable
unless the edge nodes in each PG are connected to each other. In order to avoid using edge
nodes for tandem traffic, a full mesh in each edge PG is required.
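The cost of that required full mesh grows quadratically with the number of edge nodes, which is
worth quantifying; a one-line sketch:

```python
def full_mesh_links(n):
    """Number of links needed to fully mesh n nodes: n(n-1)/2."""
    return n * (n - 1) // 2

# Link counts for a few illustrative edge-PG sizes:
for n in (4, 8, 16):
    print(n, full_mesh_links(n))  # 4 -> 6, 8 -> 28, 16 -> 120
```

This quadratic growth is one reason placing the core in a separate PG scales poorly: each edge
PG must pay the full-mesh link cost that the shared core would otherwise have absorbed.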
Figure 8-13 Network topology with core network in a separate peer group
Parallel clusters can be interconnected with a relatively small number of core links, as in
Figure 8-14. Note that the core nodes of the clusters are connected to each other.
Figure 8-14 Network topology with parallel clusters (one PG per cluster) and core nodes connected
Configuration: Flat PNNI
  Eligible peer group leaders (PGLs): not applicable
  Routing benefits: optimal routing
  Routing disadvantages: scalability limited to 300 nodes

Configuration: 2 core nodes per PG
  Eligible PGLs: two; robust, since if one core node fails the PG is not partitioned*
  Routing benefits: near optimal routing in the network core if the Aw of the core links >
  Aw of the access links***
  Routing disadvantages: relatively few PGs will somewhat limit scalability; routing through
  access node links if a core link inside the PG fails; some connections will require an extra
  hop (see the description of this behaviour in the next section)

Configuration: 1 core node per PG
  Eligible PGLs: one; single point of failure, with all nodes in the PG isolated if the core
  node fails**
  Routing benefits: optimal routing in the network core
  Routing disadvantages: many outside links in each PG; many partitions if the core node
  fails; some connections will require an extra hop (see the description of this behaviour in
  the next section)

Configuration: Core network in a separate PG
  Eligible PGLs: any node in the core PG can be a PGL; other PGs depend on how they are
  connected
  Routing benefits: optimal in the core
  Routing disadvantages: many PGs are required unless the edge nodes are highly
  interconnected; too many PGs means excessive control traffic; extra hops required if edge
  PGs are not fully meshed

Configuration: Parallel clusters
  Eligible PGLs: there is only one PG per cluster, so any core node can be a PGL
  Routing benefits: optimal in each cluster, plus 1 hop to reach the other cluster if necessary
  Routing disadvantages: scalability limited to 300 times the number of parallel clusters
  (e.g. 600 nodes for two clusters)

* if they are not interconnected within the PG, then this design is not feasible
** node isolation can be avoided if edge nodes are all eligible PGLs, but this is not recommended
due to impacts to the higher level and prohibitive operational complexity
*** calls have a 50% chance of 2 core hops should the destination (edge) node be connected to
only one core node
8.5.4 Recommendations
3) Two core nodes per peer group (Figure 8-11): Implement this
configuration for reliability, scalability and near optimal routing in the
core.
4) One core node per peer group (Figure 8-12): Do not implement unless all
core nodes are absolutely bullet-proof. Otherwise partitioned PGs and
isolation of all nodes in the PG will result.
In all the above cases, the Aw of core links should be similar in order to avoid
extra hops in the core. Refer to section 8.5.4.4 for more details.
8.5.4.3 Metrics
Metrics must always favour outside core links over access core links if routing
through the core is required.
In that case, setting the entire PG to “restrict transit” is not feasible, and the
recommendation is to use the Aw attribute (setting the Aw of edge links > Aw
of core links). This causes the edge nodes to be used as transit nodes only
as a last resort (e.g. when the links to the core nodes do not have sufficient
AvCR).
The point to emphasize is that “restrict transit” has only a limited context (this
only applies to nodes that the source node can “see”).
8.5.4.5 Partitioning
There should be at least two eligible PGLs that are connected to all other
nodes in the PG to avoid partitioning a PG. For more details on partitioned
PGs, refer to section 8.6.
In some cases, partitions may have no PGL; for example, a small partition might not contain any
nodes capable of acting as PGL. In this case the partition operates as a “leaderless peer group”
and is effectively isolated from the rest of the PNNI routing domain.
Given that each partition of a peer group is treated as a separate peer group, routing within the
parent peer group naturally routes calls around and across the various partitions. Similarly, calls
originated within the partitioned peer group to destinations outside of the peer group are
correctly routed.
A number of problems can arise when routing calls to destinations within the partitioned peer
group:
• Addressing implications — If more than one PGL is elected from the same partitioned
PG, then they must be uniquely identifiable in the parent peer group with a unique node
ID. This can be achieved by including the peer group leader’s 48-bit ESI into the “flat
ID” part of the node ID, although it is not recommended. There is the potential that the
ESIs will not be unique, especially in higher-level peer groups that span countries.
• Lost calls — It is likely that some nodes will be isolated from the rest of the PNNI
domain. Note that at the time of PGL failure, many existing calls may be severed, and
they will not be re-established even for nodes that are still connected to a core node,
if that core node is not in the same PG.
• Crankbacks — These will occur until the PGL is restored because all PG partitions will
be advertising the same summary address, but only a subset of the destination addresses
are actually reachable in each partition. In this case, the call will be cranked back until a
path is selected to the correct PG partition, which could result in many crankbacks
depending on the topology.
In a PG with two core nodes both acting as eligible PGLs, both can be elected if they are
isolated from each other. Fortunately, such partitioned PGs are unlikely because the core link
and all access links between the two core nodes in the same PG would have to be simultaneously
severed for the PG to become non-contiguous. The bottom line is that the PGL, and the core
links between eligible PGLs, must be very reliable.
(Figure: example topology with peer group PG(A) containing nodes A.1 to A.5 with attached CPE1
and CPE2, peer group PG(B) containing B.1 and B.2, and peer group PG(C) containing C.1 and C.2.)
It is recommended that a single peer group or hierarchically flat configuration be used wherever
possible. This configuration offers the most precise routing available, as the status of all nodes
and links in the network is known, providing the routing system with every possible
routing combination from source to destination. Where multiple peer groups are used, the
scalability of the network is increased at the expense of routing accuracy. Each peer group is
represented only by the addresses reachable in it; its link constraints, weights and QoS metrics
are not known by nodes outside of the peer group. Therefore, routing to a peer group using
address summarization forces the calling node to “trust” that the destination address does indeed
exist, and that the QoS requirements of the connection can indeed be met by a remote peer
group.
If a hierarchy is indeed required (e.g. more than 300 nodes in the network), many decisions must
be made on how to partition the hierarchy and how many levels to build.
In Figure 8-16 and Figure 8-17, network XYZ is illustrated in both a single peer group
configuration and a possible hierarchical configuration. In the single peer group
configuration, each node is aware of every other node and every other link in this peer
group (by definition of a peer group). Routing in this configuration is optimal because a
definitive decision can be made from source to destination, as all route combinations can
be examined.
In the hierarchical configuration illustrated in Figure 8-17, routing ambiguities are caused
by address summarization and hidden topology inside the remote peer groups. Peer group
A has no knowledge of the topology, link or node states of peer group B. Instead, it has
only the understanding that some set of addresses is reachable in peer group B, via the
logical group node B.
Assuming that every link in Figure 8-17 has an identical cost, the shortest path for a
connection between A.1 and B.1 is the one with the least number of hops. In the flat peer
group shown in Figure 8-16, the shortest path is easily calculated because optimal routing
is possible. In the hierarchical model in Figure 8-17, the route is calculated to LGN B via
uplink 1 because it represents the shortest path to B. In reality, however, the shortest global
path from A.1 to B.1 is to take uplink 2 (and outside link 2). As a result of the hidden topology
of peer group B, uplink 1 is taken and a less optimal route to B.1 is delivered.
This is further compounded if the entry border node in peer group B (B.2) cannot find a
suitable route from B.2 to B.1 that satisfies the requested connection QoS requirements. In
this case the call must be released and re-routed to use uplink 2 (and outside link 2). If the
QoS can be satisfied on the alternate route, the connection will be successfully
established, but with the following consequences:
• bandwidth is unnecessarily reserved on the initial setup attempt on all links up to B.2
• the connection experiences more setup delay because the connection is routed twice
• the source node (DTL originator) must support alternate routing for the connection to
be completed
Such routing ambiguity is amplified with every new level of hierarchy (using address
summarization) introduced in the network. A comparison of a single, two-level and three-
level hierarchy is shown in Table 8-2.
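The ambiguity can be sketched with a toy calculation (the link costs below are assumed for illustration and do not come from the figures):

```python
# Toy model of the hierarchical ambiguity described above. From inside
# peer group A, only the cost to reach LGN B is visible; the hops hidden
# inside peer group B are unknown to the DTL originator.
visible_from_A = {"uplink1": 1, "uplink2": 2}   # hops from A.1 to LGN B
hidden_inside_B = {"uplink1": 3, "uplink2": 1}  # hops from entry node to B.1

# The source picks the cheapest visible route...
chosen = min(visible_from_A, key=visible_from_A.get)
# ...but the true end-to-end optimum accounts for the hidden topology.
global_best = min(visible_from_A,
                  key=lambda u: visible_from_A[u] + hidden_inside_B[u])
```

Here `chosen` is uplink 1 while `global_best` is uplink 2, mirroring the suboptimal selection described for Figure 8-17.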
The cost of more precise routing information in the topology databases is higher CPU usage. In
order to maintain an accurate depiction of the network, more PNNI routing control messages
must be sent and processed by every node in the peer group, increasing the CPU required
to process them. Abuse of these parameters could negatively affect call setup rates and other
switch functions.
the product of the available cell rate of the bandwidth pool from which the
connection bandwidth was allocated and the avcrpm percentage.
8.8.1.3 Recommendations
In most networks, the defaults of 3% avcrmt and 50% avcrpm are suitable for
normal operating conditions. In the case of a predominantly real-time (or non-
real-time) traffic profile, a network operating at near capacity or a network
offering extremely high bandwidth services, these values will not suffice.
Specific values are difficult to determine without examining the entire
network profile on an individual basis. Table 8-3 summarizes the benefits and
consequences of changing these values, in addition to which circumstances
would justify such changes.
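As a rough illustration of how these thresholds interact, the sketch below treats avcrpm as a proportional threshold on the last advertised AvCR and avcrmt as a minimum threshold on the maximum cell rate; the exact Passport semantics may differ, so the function name and parameter roles are assumptions:

```python
def avcr_change_is_significant(last_advertised, current, max_cr,
                               avcrpm=50.0, avcrmt=3.0):
    """Decide whether an AvCR change is worth re-advertising.

    Illustrative only: avcrpm is applied to the last advertised AvCR,
    avcrmt to the maximum cell rate; both are percentages.
    """
    threshold = max(last_advertised * avcrpm / 100.0,
                    max_cr * avcrmt / 100.0)
    return abs(current - last_advertised) >= threshold
```

With the defaults, small AvCR fluctuations are suppressed and PTSE re-floods occur only on substantial changes.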
ATM Forum PNNI 1.0 specifies the default RCC service category to be non-real-time
VBR, with PCR=906, SCR=453 and MBS=171. As a result of nrt-VBR's low emission
priority, Passport PNNI increases the default priority of the RCC to real-time VBR (the
recommended cell rates are still used). However, any service category and cell rate can be
manually provisioned as desired by the operator.
In lowest level peer groups, it is possible that PNNI peers implement different RCC
default values or even different default service categories. In addition to cell rates and
service categories, the RCC channel can also be subject to GCRA (UPC) policing and
traffic shaping. Also, in multi-vendor PNNI networks, the RCC default parameters might
not align perfectly with each other. In all situations, the operator is encouraged to engineer
the RCC channel with respect to the following considerations:
Traffic Shaping
Traffic shaping on the RCC should be disabled. Shaping provides more uniform traffic
flows intended to conform to network policers. However, for bursty traffic types (i.e.
VBR), traffic shaping increases the delay of the traffic, prolonging the transfer of critical
routing information. Across large networks, this increased delay may become significant.
GCRA Policing
GCRA policing on the RCC should be disabled. It is difficult to characterize traffic bursts
on the RCC channel in different routing situations. Also, shaping is not recommended on
the RCC for the reasons stated above. Policing the control traffic may result in dropping
critical routing information that is non-conforming only for very brief periods of time.
Reserved Bandwidth
With policing disabled, the exact bandwidth reserved for the RCC channel within a link
becomes less important. However, the RCC should reserve enough bandwidth such that
under high link utilization, control traffic does not greatly affect user data sharing the
link. Typically, the default settings specified in ATM Forum PNNI 1.0 should be used.
RCC Parameters
If the default RCC parameters between two switches are not identical, the following
guidelines are given:
• The preferred service category for the RCC on Passport is rt-VBR. The increased
emission priority of rt-VBR compared to nrt-VBR decreases the chance of control
traffic being delayed by competition with higher-priority traffic. The service category of
the RCC channel must be consistent for both directions of the PNNI link. If desired,
to prevent starvation of any service category on Passport, the ATM IP FP minimum
bandwidth guarantee (MBG) mechanisms can be employed. Please refer to the ATM
traffic management section for more details on MBG.
• To maintain uniform bandwidth reservation on the link, the same RCC cell rates
should be provisioned for both sides of the PNNI link. Also, the operator should
ensure that the resulting bandwidth reserved for the RCC is equal in both directions of
the link. Any differences in the amount of bandwidth reserved can be compensated
using the Passport overbooking parameters. If the bandwidth reservation of
connections on this link is not symmetrical, ACAC may prematurely reject
connections due to the forward or reverse direction of the link bandwidth being
saturated.
As mentioned earlier, in hierarchical networks the RCC is automatically set up between
adjacent LGNs. In the case where Passport cannot establish an rt-VBR RCC, it
automatically attempts a CBR RCC SVC. If CBR cannot be established, an nrt-VBR RCC
is attempted. If nrt-VBR is not available, then a UBR SVC RCC is attempted. A
service category is considered to be not available either if no route can be found with that
service category, or if the call setup with that service category fails due to problems with
the service category, traffic parameters, or QoS parameters. Passport does not change the
cell rates of the RCC SVC before attempting the next lower service category.
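The fallback order above can be summarized in a short sketch; `try_setup` is a hypothetical callback standing in for the actual RCC SVC establishment attempt:

```python
# Service-category fallback for the SVCC RCC, as described above.
RCC_FALLBACK_ORDER = ["rt-VBR", "CBR", "nrt-VBR", "UBR"]

def establish_rcc(try_setup):
    for category in RCC_FALLBACK_ORDER:
        # Cell rates are left unchanged between attempts.
        if try_setup(category):
            return category
    return None  # no service category was available
```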
These migration procedures assume that the network is already configured as a flat PNNI
network; a flat PNNI network is a PNNI domain that has only one Peer Group (PG). Other
migration scenarios, such as migration from IISP to flat or hierarchical PNNI, should be
considered separately. Where possible, interworking information from other ATM vendors has
been included. This is not a tutorial on hierarchical PNNI; the reader is assumed to already be
familiar with the PNNI standard, ATM addressing and hierarchical concepts such as LGNs and
outside links.
Network Growth
The number of nodes/links exceeds the capacity of network nodes. Passport supports 300
nodes in a single PG. PTSE volume varies as O(N·L·R), where N = number of nodes, L =
number of links, and R = rate of flooding of PTSEs; note also that R = f(N, L). For
simplicity, the number of nodes is treated as the only constraint, since there are practical
limitations to the "meshedness" of the network, especially as the size of the network
increases. Other vendor equipment scalability may vary, and the PG size is bounded by
the capacity of the smallest node.
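The O(N·L·R) relationship can be illustrated with a toy calculation (node and link counts below are assumed, and R is held constant even though in practice R = f(N, L)):

```python
# Relative PTSE volume under the O(N*L*R) scaling noted above.
def ptse_volume(nodes, links, flood_rate):
    return nodes * links * flood_rate

flat = ptse_volume(300, 600, 1.0)        # one flat 300-node PG
split = 2 * ptse_volume(150, 300, 1.0)   # two 150-node PGs
# Halving the PG size halves the total volume in this toy model,
# and each individual PG carries only a quarter of the flat load.
```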
Firewall
The PNNI standard describes how PGs can be used to “hide” topology information of
parts of the network. This approach uses PNNI as a firewall so that nodes do not have a
full view of the network. Nortel Networks does not recommend using hierarchical PNNI
for this purpose.
Overall, actual or expected network growth is the primary reason for migration. The
decision to migrate should not be taken lightly, as there are trade-offs to consider, which
will be explained in further detail. The following are the prevailing factors to consider,
some of which may even preclude the migration:
• sub-optimal routing which may increase cell delay, and increase link utilisation
• compatibility with the ATM address plan; e.g. having enough bits for partitioning
address bits by level and maintaining efficient address summarization
• change in routing overhead from source node only routing to distributed routing
(hierarchical PNNI routing is done by source node plus all intermediate entry border
nodes); which has crankback and re-routing implications
• additional operational complexity
• additional configuration of PG levels, PGL, link aggregation
• failure modes and PG partitioning when a PGL node fails
• topological constraints; some topologies are not well-suited to hierarchical PNNI as is
demonstrated in subsequent sections
Careful network design will mitigate these effects and provide the benefits of hierarchical
PNNI. The following sections will explore the above issues by considering all the steps
of migration, including assessing if hierarchical PNNI is suitable at all.
Note: Call setup performance is the major consequence of large PNNI peer groups on
Passport. Passport can easily support 300 nodes in a peer group when only
memory and CPU are considered. However, due to the large number of routing
combinations that exist in large networks, call setup performance will degrade as
the network size increases. The level of degradation is dependent upon the size of
the network and the exact topological architecture (e.g. ring topology, full mesh,
partial mesh, etc.). By reducing the number of links and route combinations when
forming peer groups, call setup times can be reduced.
3. Peer Groups
Decide on the number of PGs and the PG boundaries, and the impact of simple versus
complex representation.
4. Peer Group Leaders and Peer Group Partitioning
How to choose eligible PGLs and the leadership level, and how to avoid isolating nodes
when a PGL or link fails.
1. A single peer group can be split into multiple peer groups by creating a new higher level
peer group for the LGNs. This is Bottom Up migration.
2. Replace existing nodes with LGNs and add new nodes to create new peer groups at the
lower levels. This is Top Down migration.
2. Move remaining lowest level nodes to the new peer group as required.
Figure 8-21 Selecting a node at the highest level in the resulting network
2. Using the following criteria, select a node that has yet to be migrated:
Note: Moving A.5 first and making it the PGL is necessary to avoid isolating nodes
during the migration. This is only necessary during Bottom Up migration.
4. Repeat step 2 until all the nodes are moved (A.4 will be selected next, then A.3,
finishing the migration), and move the PGL if necessary.
8.9.3.3 Comparison
The Top Down approach is better, especially if there are many physical nodes
at the higher level. It is also advantageous since downward migration can stop
when Peer Groups are sized appropriately.
In Figure 8-23, the network is running IISP between all nodes. The migration is to
convert the IISP links to PNNI. After the migration has completed, the resulting network
is a single lowest level PNNI peer group.
1. Configure PNNI routing on all switches (PNNI level, PNNI peer group ids etc). It
is assumed that the NSAP addresses have already been established by the IISP network.
Assuming a top down migration, the value of the PNNI level should be hierarchically
high enough (closer to 1) to allow for the proper formation of future child peer groups.
This step can be performed without any service interruption to existing or new
connections.
2. Block new connections from establishing. This step is necessary to ensure no calls
are attempted before the network can guarantee loop free routing.
3. Introduce PNNI interfaces one at a time until the network is completely migrated,
or until a stable IISP/PNNI routing environment (i.e. no routing loops) is achieved.
(Figure: before migration, nodes A.2, A.1, B.1 and B.2 are connected by IISP links; after
migration, PNNI PG 1 and PNNI PG 2, each at level N where N>0, are connected by PNNI links.)
In the second migration scenario shown in Figure 8-24, a two-level hierarchy can be
achieved by provisioning a separate hierarchy in each peer group, before changing the
IISP interface between the peer groups to PNNI. This migration technique adds a new
higher level peer group. The basic procedure for this migration is:
1. Choose a node to represent each peer group as its respective Peer Group Leader
(PGL).
2. Configure the LGNs of each PGL to co-exist in the same parent level peer group.
3. After the PGL election has completed in both peer groups, convert the IISP link(s)
to PNNI.
If multiple IISP links connect the two peer groups, it is still possible that the IISP links
would be favoured over the fully functional PNNI link. For instance, if the upnode
advertising the destination address is using a summarized destination address whereas the
IISP link does not perform any summarization, the IISP link will advertise a longer prefix
match and will always attract subsequent connections. In fact, even if the addresses
are identical in length (same number of significant bits advertised), the IISP link may still
be used. If a software release earlier than PCR1.3 is used, then calls will alternate
between the PNNI link and the IISP link(s). If PCR1.3 (or later) is used, then calls will route
to whichever node represents the shortest AW, CTD or CDV cost from the source node.
The shortest route may still include the IISP link.
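The longest-prefix preference described above can be sketched as follows; the advertisement table and prefix strings are hypothetical:

```python
# Route selection by longest prefix match: the link advertising the
# more specific (longer) prefix attracts the call.
def select_route(advertisements, destination):
    matches = {link: len(prefix)
               for link, prefix in advertisements.items()
               if destination.startswith(prefix)}
    return max(matches, key=matches.get) if matches else None

# An unsummarized IISP advertisement beats a summarized PNNI one:
route = select_route({"pnni": "47.0091", "iisp": "47.009181"},
                     "47.00918155")
```

`route` is the IISP link here, matching the behaviour described above.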
Maintaining the IISP links until the hierarchy has fully established ensures that new
connections can reach the other peer group even though the PNNI hierarchy is not yet
fully established. Once the hierarchy has correctly established, the existing IISP links can
be converted to PNNI without disruption to any connections.
After the PNNI link is established, the PNNI protocol will automatically establish the
higher level peer group, including the instantiation of the uplinks and the SVCC RCC
between each logical group node. The conversion of the outside link from IISP to PNNI
should not affect any existing connections transiting that link.
New connections spanning both peer groups will be blocked until the establishment of
the SVCC RCC between the LGNs.
PNNI trunk attribute  Description                  CBR          rt-VBR       nrt-VBR      UBR
CLR0                  Cell loss ratio for CLP=0    Cbr Clr      rt-VBR Clr   nrt-VBR Clr  n/a
CLR01                 Cell loss ratio for CLP=0+1  Cbr Clr      rt-VBR Clr   nrt-VBR Clr  n/a
AvCR                  Available cell rate          poolAvaiBwl  poolAvaiBwl  poolAvaiBwl  n/a
MaxCR                 Maximum cell rate            n/a          n/a          n/a          ubrMaxConnection
CLR0 and CLR01 are the maximum cell loss ratio (CLR) objectives for CLP=0 traffic
and CLP=0+1 traffic, respectively. CLR is defined as the ratio of the number of cells that
do not make it across the link to the number of cells transmitted across the link. For any
given Passport ATM service category, the CLR0 and CLR01 are advertised with the
same value because they are derived from the existing single CLR attribute provisionable
under each ATM interface. In other words, the meaning of the CLR attribute on Passport
is associated with the compliance definition of the connection. For constant bit rate
(CBR.1) and variable bit rate (VBR.1), the CLR applies for the CLP=0+1 aggregate flow.
For VBR.2 and VBR.3, the CLR applies to the CLP=0 cell flow.
The AvCR is a measure of the effective available capacity for the entire link or for each
specific service category. It is expressed in units of cells per second. The value of this
attribute is derived from the respective bandwidth pool(s). Port capacity can be divided
into the different pools (for example, CBR, rt-VBR, and nrt-VBR). Each service category
is mapped to a given pool through provisioning. By default, all traffic is assigned to a
common pool (pool 1). Each pool is assigned a percentage of the link capacity that may
vary between 0% and 2000% (large percentages are used for oversubscription). The CQC
Passport FPs support 3 bandwidth pools per interface; the PQC supports 5 pools.
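A simplified sketch of deriving AvCR from a bandwidth pool follows; the OC-3c cell rate is standard, but treating AvCR as pool capacity minus reserved bandwidth is an assumption for illustration:

```python
LINK_CELL_RATE = 353208  # OC-3c payload rate in cells/s

def pool_avcr(pool_percent, reserved_cells_per_s):
    # Pool capacity is a provisioned percentage of the link rate
    # (values above 100% model oversubscription).
    capacity = LINK_CELL_RATE * pool_percent / 100.0
    return max(capacity - reserved_cells_per_s, 0.0)
```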
If a link does not satisfy the requirements of the connection, then that link is ineligible to
route the connection. Once all the unsuitable links are pruned from the calculation, it is
very likely that more than one path will exist from source to destination. Some paths may
require fewer hops, offer a shorter expected cell transfer delay or smaller expected delay
variations, or provide more available bandwidth. Of the many possible routes from source
to destination, it is up to the source node to determine which policy it will apply to find
the most optimal route. Optimality is determined by an additive function over the link
metrics: from source to destination, the most optimal route is the one with the smallest
sum of the link metrics along that path.
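The two-step selection above — prune ineligible links, then minimize the sum of an additive metric — is essentially a constrained shortest-path computation. A minimal sketch (the link-tuple format and the AvCR-based pruning rule are illustrative):

```python
import heapq

def best_path(links, src, dst, required_bw):
    """links: (node_a, node_b, additive_metric, avcr) tuples."""
    graph = {}
    for a, b, metric, avcr in links:
        if avcr >= required_bw:  # prune links that cannot carry the call
            graph.setdefault(a, []).append((b, metric))
            graph.setdefault(b, []).append((a, metric))
    heap, seen = [(0, src, [src])], set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, path  # smallest sum of link metrics
        if node in seen:
            continue
        seen.add(node)
        for nxt, metric in graph.get(node, []):
            if nxt not in seen:
                heapq.heappush(heap, (cost + metric, nxt, path + [nxt]))
    return None
```

A direct link that lacks sufficient AvCR is pruned before the metric comparison, so a multi-hop path with a smaller metric sum can win.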
Cell Delay Variation is the maximum expected delay from the egress queuing buffers.
This value represents the worst case scenario, that is the difference in microseconds from
a cell entering the queue with no cells ahead of it, to the case where the queue is full.
CDV only applies to service categories with real time requirements: CBR and rt-VBR. It
does not apply to nrt-VBR, ABR or UBR service categories. The default value for CDV
is dependent on the FP and buffer sizes being used.
Cell Transfer Delay is used to reflect the maximum expected cell delay for using this
trunk. The value should encompass the expected CDV and the propagation time for a cell
to traverse this trunk. Therefore, the CTD should never be less than the CDV value.
Similarly to CDV, CTD only applies to service categories with real time requirements
(CBR and rt-VBR). CTD does not apply to nrt-VBR, ABR or UBR service categories.
The default CTD value is dependent on the FP being used.
PNNI trunk attribute  Description                  CBR         rt-VBR         nrt-VBR         UBR
AW                    Administered weight          CBR weight  rt-VBR weight  nrt-VBR weight  UBR weight
MaxCTD                Maximum cell transfer delay  CBR MaxCTD  rt-VBR maxCtd  n/a             n/a
CDV                   Cell delay variation         Cbr Cdv     rt-VBR cdv     n/a             n/a
Note that changes in the AW metrics are non-service-affecting. However, any
modifications to the CDV and MaxCTD metrics will reset the PNNI links, and
connections will be rerouted (SPVCs and SPVPs) or cleared (SVCs and SVPs).
Passport allows the operator to optimize routes based on service category and supported
metric. Therefore, CBR service category can be optimized on a different metric than rt-
VBR.
There are two basic methods for engineering the Passport PNNI metrics:
Method A
• Optimize CBR traffic based on CDV where the CDV = minimal queueing delay (e.g.
10 cells) since peak rate reservation is provided by Passport ACAC
• Optimize rt-VBR traffic based on MaxCTD where the MaxCTD = propagation delay
Method B
• Route all service categories based on AW
• Cost = hierarchical structure based on tariffs, distance, and/or link speed
For instance, in a high-speed trunking environment (e.g. OC-3c), the MaxCTD can be
approximated as equal to the propagation delay because it is assumed to be the
dominant delay component across the network. CDV can also be assumed to be less than
10 cell slots of the link scheduler (less than 10 cells waiting in the common queue).
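For Method A, the propagation-delay approximation is simple arithmetic; the per-kilometre figure below assumes light travels through fibre at roughly two-thirds of c (about 5 microseconds per km):

```python
US_PER_KM = 5.0  # assumed fibre propagation delay, microseconds/km

def approx_max_ctd_us(distance_km, cdv_us=0.0):
    # MaxCTD should encompass the CDV plus the propagation time,
    # so it is never less than the CDV.
    return distance_km * US_PER_KM + cdv_us
```

For a 1000 km trunk this gives roughly 5000 microseconds, dominated by propagation as assumed in Method A.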
Under PNNI 1.0 signaling, two new optional QoS IEs are introduced in the Setup and
Connect messages to supplement the QoS parameter IE that has already been defined in
UNI 3.1.
indicates the calling user's highest acceptable (least desired) CLR value. The CLR is
expressed as an order of magnitude n, where the CLR takes the value 10^-n.
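Since the CLR is carried as an order of magnitude n (i.e. CLR = 10^-n), decoding it is a one-liner:

```python
def clr_from_order(n):
    # The IE carries only the exponent; the CLR itself is 10 ** -n.
    return 10 ** (-n)
```

For example, n = 6 corresponds to a CLR objective of 1e-06.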
The QoS parameters values included in the extended QoS parameters IE, together with
those included in the end-to-end transit delay IE (if present), specify a QoS capability at a
UNI 4.0 interface.
Regarding provisionable QoS parameters for SPVCs/SPVPs: if the next hop is PNNI,
the call is routed according to the acceptable values; if the next hop is IISP/UNI,
UNI 3.1 signalling is used without the new IEs.
If the call originates from an interface that does not support the extended QoS IEs, then
by specification of ATMF PNNI 1.0, they must be inserted into the signalling stream
before the next PNNI interface is encountered. On Passport, these IEs are always inserted
without specifying any hard timing, delay or cell loss ratio requirements. Therefore, with
respect to timing, delay or cell loss ratio, such a connection has no requirements for the
network to meet.
In order to improve the rerouting procedures in the ATMF PNNI protocol, Nortel
Networks has recommended an edge-based rerouting mechanism for point-to-point
connections. In the proposal, an "edge-to-edge" protocol would allow the source node
(DTL originator) and the destination node (DTL terminator) to participate in the control
of rerouting operations within a PNNI 1.0 network.
(Figures: signalling flows — SETUP, CONNECT and RELEASE — between source and destination
across UNI, IISP and AINI interfaces.)
The EBR route recovery procedures are required when networks are responsible for
quickly restoring connections impacted by outages. EBR allows PNNI network-initiated
fault recovery of point-to-point SVC/SVP connections without the intervention of the
connection owner. It also minimizes the network resources required to re-instate
active connections by confining recovery to a single PNNI domain, without the
assistance of multiple PNNI network providers.
The route recovery mechanism uses the cumulative QoS information of the original
connection (also known as the incumbent connection) as the criteria to determine whether
to reroute the call. As long as a viable alternate path provides as good or better
cumulative CDV and MaxCTD parameters, the PNNI network should be able to
reroute. This also means that the current route recovery procedure will not reroute on
"longer" paths. This is a generic mechanism used by both the route optimization and route
recovery procedures. The cumulative QoS parameters (CDV and MaxCTD) are
calculated between the two PNNI edge nodes implementing the EBR protocol.
In general, EBR route recovery capabilities are not required for SPVCs/SPVPs
originating and terminating within the same PNNI domain since it is already provided by
the base PNNI 1.0 protocol.
An important characteristic of route optimization is to guarantee that the process uses on-
demand route calculation with minimally disruptive procedures. This is accomplished in
the following ways:
• existing connections are never dropped, as the process first attempts to establish a new
optimal path before swapping the traffic
• cell ordering is guaranteed and no cells are duplicated
• pacing of a single route optimization attempt per ATM signaling interface ensures that
network resources are never over-utilized during optimization cycles
• route optimization attempts can be interrupted by the establishment of new calls and
rerouting procedures (i.e. next priority)
(Figure: a PNNI domain between the DTL originator and DTL terminator, showing the incumbent
connection and the rerouting connection.)
During route optimization, there is a transition state where two connections coexist. The
original connection that was established for the call is referred to as the incumbent
connection, while the new "optimal" connection that will be established by the route
optimization mechanisms is referred to as the re-routing connection.
Note that successful route optimization of a connection will incur cell loss for the
duration of the swap interval. The period of cell loss can vary and is characterized by the
time it takes to clear the incumbent connection. The DTL originator initiates the tearing
down of the incumbent connection. It is performed through call control procedures that
involve the propagation and processing of a Release PDU across all of the PNNI nodes
along the incumbent connection towards the DTL terminator. In comparison, the service
interruption caused by the current EBR route optimization is much less (on the order of
300%) than the loss of data triggered by a hard re-route.
In the first phase, route optimization is initiated by the network operator through the
optimize command issued to an “edge” (or ingress) ATM signaling interface which
maintains point-to-point connections (typically the UNI or IISP interfaces). The optimize
command initiates route optimization procedures for all of the applicable connections
associated with the ATM signaling interface. Only ATM point-to-point connections
traversing PNNI nodes fulfilling the role of DTL originators are optimized.
ATM point-to-point connections traversing PNNI tandem nodes, or nodes performing the
role of DTL terminators, are unaffected by route optimization commands. It is important
to understand that the optimize command will only move connections if there is a new route
that offers an improvement over the existing route. If Passport finds an eligible
path in the topology database with a lower sum of the link metrics (by as little as 1 metric
unit), then the connection is considered a candidate for route optimization. Otherwise, the
connection is not considered for route optimization procedures and the cycle continues
with the remaining EBR-capable connections on the ATM signaling interface. Note that
the route optimization procedure will not consider shorter paths if the incumbent
connection is "within" the load balancing variance.
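The candidate test can be sketched as follows; the function and the percentage form of the variance are assumptions for illustration:

```python
def is_optimization_candidate(incumbent_cost, best_new_cost,
                              variance_pct=0.0):
    """A connection is optimized only if a strictly cheaper path exists
    and the incumbent falls outside the load-balancing variance."""
    if best_new_cost >= incumbent_cost:
        return False  # no improvement at all
    # Incumbent paths within the variance of the best path are left alone.
    return incumbent_cost > best_new_cost * (1 + variance_pct / 100.0)
```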
The first phase of the EBR implementation is characterized with the following
limitations:
• route optimization is service affecting
• route recovery is limited to paths providing as good or better end-to-end QoS
guarantees (e.g. cumulative CTD) than the incumbent connection
• EBR route recovery and route optimization mechanisms don’t support preemption
(calls are rerouted and optimized in random order)
• EBR route optimization does not prevent “temporary” double booking of network
resources (ACAC) when the rerouting connection overlaps the same intermediate
PNNI nodes that are already part of the incumbent connection
• automatic real-time triggering of the EBR route optimization procedures, initiated by
released bandwidth and shorter path availability, is deferred to future releases; this
automatic process would periodically evaluate the current path characteristics against
the updated topology database
• EBR capabilities over PNNI logical paths (VP associated signaling) are not supported
• EBR route optimization will not move connections to an equivalent path in order to
balance the bandwidth reservation among multiple equal best paths and multiple
PNNI links in a link group (links that are used to interconnect two neighbor nodes)
When connections are routed across multiple PNNI networks interconnected with UNI,
IISP or AINI links, each intermediate PNNI network independently coordinates its two
edge switches along the path in order to synchronize the rerouting and optimization
procedures. As it stands, it is the responsibility of the PNNI networks to provision the
ingress UNI interfaces if route recovery is desired. Path optimization is always initiated
independently by each PNNI organization. Adding EBR to a UNI/IISP signaling
interface enables by default both route recovery and route optimization for all switched
connections originating from this interface.
As a result, a specific indication is provided to the source node not to trigger the edge-
based rerouting operation when the failure is external to the PNNI network.
For a connection to be eligible for EBR capabilities, the DTL originator and DTL
terminator must support EBR procedures. Supporting EBR procedures means that the
nodes both implement route recovery and route optimization procedures. The
intermediate nodes within the PNNI network transparently transport these information
elements across the network. This is accomplished by using the “Pass along/No pass
along request” described in the existing PNNI 1.0 specification. This procedure is part of
the PNNI 1.0 minimum function and all PNNI vendors must be compliant with it.
Passport supports different subscription options that allow for increased flexibility during
feature deployment and allow for the differentiation of ATM service based on connection
recovery capabilities. Addition of the EBR component only impacts new call
establishments. New connections are subscribed to the specified EBR options.
Furthermore, changes to the provisioned attributes are not critical to the already
established connections. Existing connections retain the old subscription options and new
connections would be given the new options.
To provide route optimization and route recovery capabilities, PNNI connections must be
registered for EBR capabilities when the call is established. This means that during initial
deployment of EBR in an existing PNNI network, connections must be cleared after the
software upgrade in order to initiate the EBR capabilities during call establishment. Once
the call has been established, the connection maintains EBR capabilities until the
connection owner clears the call or the network no longer has the route diversity to
perform route recoveries for failed connections.
To further address this issue, the outside links (and consequently the uplinks and higher
level horizontal links) can have their routing costs increased to reflect routing across a
peer group rather than across a single node, as it might otherwise appear.
MPV solves this problem by allowing an operator to define exactly to what degree paths
will be determined optimal, acceptable or sub-optimal. Up to three optimal or
acceptable paths will be eligible for load balancing (up to a maximum of four alternate
paths, as defined by MaxAlternateRoutes).
MPV adds benefits to the Passport routing system by more efficiently using the network
bandwidth and providing more connection reliability. If load balancing is done only on
paths of optimal and equal cost, then connections would always be routed on the same
small number of paths. This small set of paths would continue to be used until a
significant change on one of the intermediate links of such paths occurred (i.e. link down
or bandwidth saturation). Until then, other non-optimal but acceptable paths would be
underutilized. Furthermore, due to the large number of connections routed on only a few
paths, a failure of one of these paths causes massive rerouting as it affects the service of
many connections. MPV alleviates all these problems by spreading connections over
optimal and acceptable paths in the network.
The value of the minimum variance should represent the average cost of a
single link in this routing domain.
The second part of the equation represents the constant amount of variance
that must be considered. As the optimal cost of a path in the network grows
smaller, the variance contribution from the variance factor (which is always
a fraction of the optimal cost) also grows smaller. It is foreseeable in
smaller one-hop networks that the variance factor would produce a result too
small to include any other routes or links to load balance on. In these
situations, the minimum variance can be used to define the smallest amount
of variance that is always applied.
A path is considered to be “diverse” (and therefore acceptable to load balance on) if the
number of common links in the two paths is strictly less than a configured percentage of
the total links in the best cost path. By default, the diversity percentage is 50%.
[Figure 8-27: example network with nodes 1, 2, 3, 6, 7 and 8 interconnected by links
L1, L2, L3.1, L3.2, L4, L5, L6 and L7]
As an example of route diversity, consider Figure 8-27. In this network, if all links have
an equal cost, routing could choose two paths from node 1 to node 7:
(L1, L2, L3.1, L4, L6, L7) and (L1, L2, L3.2, L5, L6, L7)
both of which are optimal. The problem here is that for the most part, this is almost the
same path. The paths are identical for 4/6 of the hops involved. Other sub-optimal paths
may exist that might not use any of these links and would provide better load spreading
than choosing strictly on the optimal paths. For instance, consider a new link that
directly joins node 1 to node 7. Such a link may have a higher routing cost than either
(L1, L2, L3.1, L4, L6, L7) or (L1, L2, L3.2, L5, L6, L7) but would be desirable to use for
diversity purposes.
To better quantify the diversity of two paths, we define the diversity degree
D(path1,path2), the diversity degree of path1 relative to path2, as follows:
D(path1,path2) = (number of links in path2 not shared with path1) / (number of links in path2)
Note that since we always normalize to the number of links in path2, the diversity degree
is not a symmetric relationship between path1 and path2, which means that
D(path1,path2) != D(path2,path1)
In the on-demand PNNI routing algorithm, the diversity of all the paths is calculated
relative to the best path (the optimal path). Among the paths computed in all the three
steps of the algorithm, only the paths that have the diversity degree relative to the best
path greater than 0.5 are considered as alternate paths. Note that no sorting is done based
on the diversity degree. Specifically, if the algorithm computes more than lbMaxPaths
paths, it is possible that paths with a higher diversity degree are not considered as
alternate paths, and paths with a smaller diversity degree (but still greater than 50%) are
considered.
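The diversity computation and alternate-path filtering described above can be sketched as follows. This is an illustrative sketch, not Passport's actual code; `lb_max_paths` stands in for the lbMaxPaths attribute:

```python
def diversity_degree(path1, path2):
    """D(path1, path2): fraction of path2's links NOT shared with path1.

    Normalized to the number of links in path2, so the relation is not
    symmetric: D(path1, path2) != D(path2, path1) in general.
    """
    common = len(set(path1) & set(path2))
    return 1.0 - common / len(path2)

def alternate_paths(best, candidates, lb_max_paths=4):
    """Keep candidate paths whose diversity relative to the best path
    exceeds 0.5, in the order they were computed (no sorting is done
    based on the diversity degree)."""
    result = []
    for p in candidates:
        if diversity_degree(p, best) > 0.5 and len(result) < lb_max_paths:
            result.append(p)
    return result
```

Applying this to the Figure 8-27 example, the two "optimal" paths share 4 of 6 links, so their mutual diversity degree is only 1/3 and neither would qualify as a diverse alternate of the other.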
Best cost routes can always change in dynamic networks due to bandwidth
changes, or link failures. The MPV formula accounts for changes by basing
the variance on any best cost path, by deriving a percentage of the optimal
cost. Therefore, it is recommended that the significant part of the MPV
equation defining the majority of the acceptable variance be the Variance
Factor. Figure 8-28 illustrates how the acceptable variance increases as the
cost of the optimal path increases.
Acceptable Variance: ∆ = C + V · OptMetric(OP)
where C is the Minimum Variance and V is the Variance Factor.
[Figure 8-28: the acceptable variance starts at the constant C when the
optimization metric of the optimal path is 0, and increases linearly with
the optimization metric of the optimal path]
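The acceptable-variance rule can be sketched in a few lines of Python. This is a minimal illustration under the formula ∆ = C + V · OptMetric(OP); the function names are assumptions, not Passport's actual code:

```python
def acceptable_variance(opt_metric, minimum_variance, variance_factor):
    """Acceptable variance: delta = C + V * OptMetric(OP).

    C (minimum variance) is a constant floor that dominates for short,
    cheap optimal paths; the V * OptMetric term grows with the cost of
    the optimal path.
    """
    return minimum_variance + variance_factor * opt_metric

def within_variance(path_cost, opt_cost, minimum_variance, variance_factor):
    """A path is acceptable if its cost does not exceed the optimal
    cost plus the acceptable variance."""
    delta = acceptable_variance(opt_cost, minimum_variance, variance_factor)
    return path_cost <= opt_cost + delta
```

For example, with C = 5, V = 0.2 and an optimal cost of 100, any path costing up to 125 is acceptable.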
The Variance factor should be set large enough to include some number of
acceptable paths (as desired by the operator). The value of the variance factor
will be influenced by the cost of the expected optimal paths and the variance
between the optimal path and suitable alternative paths. As a guideline, the
worst case (largest expected cost) optimal path and variance deviation should
be used to determine the value of the variance factor. It is important to
engineer the variance factor to always find suitable paths in the worst case
situation. Doing so guarantees variance in the entire network. In cases where
the optimal cost is lower, or the suitable paths are “closer” to the optimal path,
MPV may be inclined to include too many routes, since the variance factor is
set to the higher worst-case value. However, as the optimal cost decreases, so
does the acceptable variance. Furthermore, MPV is limited to including a
maximum of 4 alternate paths that all must satisfy the diversity criteria. In
other words, despite aggressive values of the variance factor, MPV limits
itself to returning only a small (provisionable) number of routes.
Once a suitable value for the variance factor is found, the minimum
variance can be defined. Again, referencing the worst-case scenario
previously discussed, the worst-case acceptable variance will have already
been estimated. A guideline for choosing the minimum variance is the
average cost of a single link (or a couple of links). If the
source-destination pair is very close in proximity, then the variance
factor becomes incidental. The minimum variance should represent the cost
of one or two links to facilitate load spreading over short distances.
In hierarchical networks, routes may follow widely different paths,
including many different peer groups. In some cases, due to simplex node
representation, the paths may be too diverse, and represent very
sub-optimal paths through misrepresented peer groups. Without complex node
representation, the true size of each peer group is not known, and is only
represented by the cost of the horizontal link joining the higher level
nodes.
To minimize this effect, the cost of the upper level horizontal links connecting
two logical nodes should be priced higher than normal inside links to account
for the extra (and unknown) peer group costs. Doing so distinguishes the case
where the horizontal link cost reflects the cost to the next node, from a higher
level horizontal link reflecting the cost of a peer group. As a result of
increasing the outside link costs, the network routing systems will not change
hierarchical routes including different peer groups due to small topology
changes in the local peer group. If the cost system for both inside and
outside links is equal, then small changes in the local peer group topology
may have larger and adverse effects on the global hierarchical route.
Differentiating the inside and outside link costs minimizes this effect.
Also, the variance factor should not be set high enough to include many
horizontal link combinations from source to destination. If possible, try to
limit the number of suitable routes at higher levels of the hierarchy. For more
information on how to cost links in a Passport network, refer to section 8.11
Passport Hierarchical Routing.
In order to achieve load balancing in a network, there are two requirements that have to
be met. First, the routing algorithm must determine not one, but multiple diverse
acceptable routing paths. Passport utilizes the Multi Path Variance feature to accomplish
this. Second, out of these acceptable paths, the load balancing scheme has to select the
routing path used by the connection, ensuring that a balanced utilization of the network
bandwidth is achieved.
Passport supports four load balancing techniques that load spread connections based on a
random selection, the available cell rate of a path and/or the optimization cost of the
different acceptable paths. The different services offered in the network coupled with the
specific network topology will define which load balancing technique is most suitable.
If the costs of each acceptable route are roughly equivalent, then randomly selecting one
is an efficient method of choosing. Without complications, it also offers the benefit
of distributing all the connections evenly over all the paths. If uniform load balancing is
used in networks with large variations in best cost and available cell rate, sub-optimal
network utilization could occur. It is possible for some paths to become over-utilized, and
for CBR and rt-VBR traffic to take unnecessary extra hops. The only exception to these
guidelines is UBR traffic which, because it does not require network timing or reserve
any trunk capacity, is well suited for uniform balancing.
If n denotes the number of acceptable routing paths, then the probability of choosing a
path pi as the routing path is:
Pr[pi] = 1/n
Widest load balancing is best suited for networks with equal sized trunks with equal
utilization. The widest technique may be well suited to load balance connections
effectively over backbone links between core nodes. For instance, in a peer group
consisting entirely of core backbone nodes, the widest technique could be used
effectively. The probability of choosing a path pi based on its available cell rate is:
Pr[pi | AvCr] = AvCr(pi) / Σ(k=1..n) AvCr(pk)
Similarly, when the selection is weighted by the optimization metric, the probability of
choosing a path pi is proportional to the inverse of its cost:
Pr[pi | OptMetric] = OptMetric(pi)^-1 / Σ(k=1..n) OptMetric(pk)^-1
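The three selection rules above can be sketched with Python's random module. This is a simplified illustration of the probability distributions only; Passport's internal selection logic is not published in this form:

```python
import random

def pick_uniform(paths):
    """Uniform: every acceptable path is equally likely, Pr[pi] = 1/n."""
    return random.choice(paths)

def pick_widest(paths, avcr):
    """Widest: probability proportional to available cell rate,
    Pr[pi | AvCr] = AvCr(pi) / sum_k AvCr(pk)."""
    weights = [avcr[p] for p in paths]
    return random.choices(paths, weights=weights)[0]

def pick_cost_biased(paths, opt_metric):
    """Cost-biased: probability proportional to the inverse optimization
    metric, Pr[pi | OptMetric] = OptMetric(pi)^-1 / sum_k OptMetric(pk)^-1."""
    weights = [1.0 / opt_metric[p] for p in paths]
    return random.choices(paths, weights=weights)[0]
```

A path with zero available cell rate is never chosen by the widest rule, and a path with a very large optimization metric is almost never chosen by the cost-biased rule.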
To compensate for the limitations of simplex node representation, the cost of the outside links
(and consequently the uplinks and upper level horizontal links) can be engineered to reflect a
more accurate picture of the topology. Increasing the cost between two logical group nodes has
the following benefits:
• Small topology changes (i.e. adding or deleting links, or introducing /removing nodes) in
the DTL originators peer group will not affect the hierarchically complete routing.
• Features like Edge Based Routing (EBR) path optimization and multi-path variance
(MPV) can determine more accurate optimal paths. The increased higher level link costs
will better represent the cost of traversing many peer groups.
However, changing the value of the outside links has a great effect on the end-to-end
routing of connections, and care must be taken not to adversely affect call flows.
Due to such complexities, it is difficult to give general guidelines that can be simply applied to
every network. Instead of attempting such a daunting task, this section outlines the various
cost considerations that must be taken into account when designing Passport networks. For more
specific case scenarios or assistance in engineering a specific network topology, please consult
your Data Network Engineering representative.
The cost of an outside link should represent multiple factors: the cost of actually using
the outside link, the cost of traversing the connecting peer group and possibly a cost
considering its position in the hierarchy. The cost of the outside link should be in line
with how the remainder of the network links are engineered (i.e. based on available
bandwidth, delay etc). The cost of traversing the connecting peer group should be based
on the expected number of hops to exit that peer group. This number does not represent
the total width of the peer group, but rather the number of expected hops. Take for
example a peer group with 2 backbone nodes inside of it. Any connections transiting
through this network should be engineered to use the backbone links to enter and exit
the peer group. In this case, the typical transit is one or two hops (one for each backbone node).
Regardless of how many other nodes exist in this peer group, only the backbone nodes
are used to transit the peer group.
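The costing guideline above can be expressed as a simple additive sketch. The names and the additive model itself are illustrative assumptions, not a Passport formula:

```python
def outside_link_cost(link_cost, expected_transit_hops, avg_inside_link_cost,
                      hierarchy_premium=0):
    """Engineer the Aw of an outside link as the sum of: the cost of
    actually using the link, the expected cost of transiting the
    connecting peer group (expected exit hops, not the group's full
    width), and an optional premium for its position in the hierarchy.
    """
    return (link_cost
            + expected_transit_hops * avg_inside_link_cost
            + hierarchy_premium)
```

For the two-backbone-node peer group described above, expected_transit_hops would be 1 or 2 regardless of how many other nodes the peer group contains.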
If link costs are increased to reflect the amplified summarization, then the guidelines
summarized in sections 8.11.1.1, 8.11.1.2 and 8.11.1.3 should be followed.
As the cost of the optimal route grows larger, the determining factor in the
equation is the product of the variance factor and the cost of the optimal route.
If the higher level link costs are very large compared to the cost of the lowest
level links, then it is possible that even the most conservative values for
variance factor might include too many routes in the lowest level peer group.
Conversely, setting the link costs at values too low might cause features such
as EBR to use unnecessary peer group hops to re-optimize connections. If the
upper level links are roughly equal in cost to the lowest level links, then the
cost of a single hop anywhere in the network is also equal. This scenario does
not accurately reflect the true cost of a LGN hop compared to a lowest level
node hop. In this situation, EBR may incorrectly optimize a connection to a
sub optimal path traversing extra peer groups.
Modifying the Aw of a trunk does not affect any existing connections on that
trunk. However, modifying CTD or CDV causes connections on the trunk to
be released.
If the higher level peer group is partially meshed, such that a source node at
the lowest level can see both the logical group node and the lowest level nodes
in the higher peer group, then sub optimal routing is possible. Consider a
connection routing to an address advertised by the logical group node. The
upper level horizontal link connecting to the logical node has an inflated cost
used to represent the traversal of the peer group. However, in this instance, the
call is terminating in the peer group represented by this logical node. The
inflated cost of the outside link may cause the path to route around the inflated
link (which is probably also the optimal path) and use the other lowest level
nodes in the higher peer group. In this case, a sub optimal route has been
chosen.
The operator has two options to alleviate this problem: leave all the link costs
(inside and out) as the defaults and ensure to the best of their ability that the
peer groups are relatively equal in size (diameter), or change the lowest level
nodes' link costs in the higher level peer group to an increased value. The latter
quickly becomes unsuitable in large networks with high connectivity,
especially considering the costs would have to be re-examined every time a
node migrated to a new peer group. Despite the inaccurate routing for EBR
and MPV, the former is the recommended approach.
[Figure: hierarchical network example with peer groups A (nodes A.1, A.2, A.3),
B (nodes B.1, B.2, B.3, B.4) and C (node C.2), interconnected by edge links and
core links]
This is the compromise of hierarchical PNNI and hierarchical routing in general; should
the initial choice of outside link not be the “correct” one, then extra hops will be
encountered, thereby increasing delay and bandwidth utilization. Note that in this
particular example, as long as all core link Aw’s are set to the same value (which would
be a lower value compared to the edge links) then the following observations hold true:
1. Without a hierarchy, the connections would take two or three hops: two hops if
there is a core node connected to both source and destination, and three hops otherwise.
2. Using a hierarchy, there will be at most one extra hop in the core. When the
source node calculates the hierarchically complete source route, including the LGN of the
destination PG, the favoured outside links are always the core links, and the exit border
node and entry border node will always be core nodes, and so the connection will
always use one core link and therefore require three hops in total. In migrating from flat
to hierarchical PNNI, the extra bandwidth on each core link required is the bandwidth
required by all the connections that would have made optimal two hop routes (e.g. the
aggregate bandwidth of all the connections between two access nodes who are in
different PGs and that both have a connection to the same core node).
3. Since the higher level peer group includes all the core nodes, there is always
optimal routing within the core.
4. If there are N edge nodes in the destination PG that are also connected to the
source PG, and the core link between them fails, then the probability of using access
nodes as a tandem is (N-1)/N, and the chance of using the optimal route is 1/N. So, the
criticality of the core failing is related to the number of destination access nodes that are
also connected to the source PG core node. Additional redundancy of the core link in this
case is recommended if N is large (e.g. over 3-5) in order to avoid at all costs the
possibility of using access nodes as tandem.
Note that complex node representation only helps in routing through PGs; it makes no
difference if there are no tandem PGs. When the network has no PGs that could be
traversed, complex node representation is irrelevant.
If the semantics of the network Aw link values include link rate, then usually the
behaviour described above will be consistent in the network. The link rate on core links is
typically larger than edge links, driving the Aw cost down and making the core links
more attractive.
The cache can define a maximum size by limiting the number of cached routes it stores.
The default is 10000 routes.
1. The forward cell rate of the cache is greater than the forward cell rate required by the
connection. However, the cached forward cell rate should not be significantly
greater than the connection cell rate. Such a match would limit the utilization of other
lower cell rate cached routes. Passport uses a matching scheme that promotes matches
where the advertised and required rates are “close”. The variance allowed between
the two rates is based on the actual cell rate of the cached route, and increases
proportionally as the cached route's cell rate increases.
2. The reverse cell rate of the cache is greater than the reverse cell rate required by the
connection.
3. The cached route satisfies the connection's acceptable CTD and CDV.
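The three matching criteria can be sketched as a single predicate. This is an illustrative sketch only: the fixed-fraction cap on excess forward cell rate is a simplification (the text says the allowed variance scales with the cached route's cell rate), and all field names are assumptions:

```python
def cache_match(cached, conn, cr_variance=0.25):
    """Return True if a cached route can serve a new connection.

    All three criteria must hold; cr_variance caps how far the cached
    forward cell rate may exceed the requested rate, so a small
    connection does not consume a high-capacity cached route.
    """
    if cached["fwd_cr"] < conn["fwd_cr"]:
        return False  # criterion 1: enough forward cell rate...
    if cached["fwd_cr"] > conn["fwd_cr"] * (1 + cr_variance):
        return False  # ...but not excessively more than required
    if cached["rev_cr"] < conn["rev_cr"]:
        return False  # criterion 2: enough reverse cell rate
    # criterion 3: delay and jitter within the connection's acceptable bounds
    return cached["ctd"] <= conn["max_ctd"] and cached["cdv"] <= conn["max_cdv"]
```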
PTSE Updates
Network changes are reflected in the topology database by way of PTSE updates. Any
change that updates the database is also applied to the route cache. A cached route is
purged from the cache if the characteristics of a link no longer satisfy its QoS profile.
If multi path variance is enabled and some acceptable paths are cached for a particular
destination, then a PTSE update that disqualifies an acceptable path only removes that
individual path. The remaining acceptable routes can be used for load balancing.
Crankback Updates
If MPV is employed, then any crankback messages are used to prune links and routes
from the set of acceptable paths. Crankback messages identify the link or node that
has failed or is determined to be unacceptable. Any of the remaining acceptable paths
that does not include the rogue link or node can be used to reroute the connection,
avoiding an on-demand calculation.
Complete Purge
Passport has the ability to purge all the information in the database by virtue of an
operator command.
Previous support for multi-homed addresses included a round-robin approach where
connections were routed serially to each multi-homed address in turn. Routing was done
without regard to any other routing constraints.
This feature eliminates the Round Robin scheme of the pre PCR1.3 Passport release, and ensures
that the PNNI routing scheme calculates the optimal route to the destination address. If a PNNI
address is advertised by two or more nodes in the PNNI domain, then with the old software, each
node would be routed to in a round robin fashion for each subsequent connection specifying that
address. The problem with this method is that no regard was given to any metrics when
the call was routed: one node advertising the address may be only one hop away, yet the
connection could be routed to a node across the network.
The enhanced multi-homed routing feature now allows metrics to be considered, providing the
best cost path from source to destination node. Network resources are better utilized, as
connections will reserve less network bandwidth, and possibly see decreased cell transfer delays.
Once completed, any load balancing scheme can be used to distribute connections on the
different paths to any of the destination nodes. If MPV is not set wide enough to include
other nodes advertising the destination address, then load balancing is accomplished only
between the source node and the single destination node. If MPV is disabled, then only
the best cost path to a node advertising the destination address is routed to.
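The metric-aware selection that replaces the round-robin scheme can be illustrated minimally. The names here are hypothetical; the point is simply that the advertising node with the lowest path cost wins:

```python
def route_multi_homed(advertisers, best_cost_to):
    """Pick the node advertising a multi-homed address that is
    reachable at the lowest path cost, replacing the old round-robin
    scheme which ignored metrics entirely.

    advertisers  -- nodes advertising the destination address
    best_cost_to -- callable returning the best path cost to a node
    """
    return min(advertisers, key=best_cost_to)
```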
8.13.2 Crankback
Passport does not crankback/re-route on unequal addresses. Therefore, in crankback
scenarios Passport will not complete alternate routing to nodes advertising any other
prefix than the original call destination address. It is not possible to configure more
general addresses on a “backup” node for Passport to use in case of failure on the original
route.
With the older PNNI software, connections were always routed in a round-robin fashion.
In crankback scenarios, the second address was always used if the first one became
unreachable.
With the multi-homed routing feature, it is possible to (optionally) ignore the second
route completely until the first route becomes unavailable. If the primary route becomes
unreachable, then regardless of how the MPV was engineered the secondary route will
now be used.