Nokia Advanced Troubleshooting SG v3.0.2
This course is part of the Nokia Service Routing Certification (SRC) Program. For more information on the SRC program,
see www.Nokia.com/src.
To locate additional information relating to the topics presented in this manual, refer to the following:
1. Hardware Stability - How redundant is your system? Do you have redundant control processors that support a full
chassis with negligible packet loss and non-stop services?
2. Network Resiliency – Does your network deploy fast reroute, standby secondary paths, or SONET protection? Is
the physical transport geographically dispersed? Are links identified as shared risk in your routing protocol? Is your
redundant site on the same power source? Have you deployed UPS?
3. Process and Support - Do you have an escalation process defined? Do you have a support contract for your
network equipment? How easily can you detect performance degradation and isolate or resolve network outages?
What sort of Network Management tools do you have available? How quickly can you recover from failures in your
network?
The Dilemma:
Network availability became a contentious topic because there was no separation among the network, hardware, and
application. For example, a network failure could exist while the hardware and application were running; were they really
available? How availability should be counted became debatable. If a router stayed up for a full year, its uptime was reported as
100%. However, what if an interface was down? Approaching the formula on a per-port basis meant missing the
5 9s target for a fair number of customers while the majority experienced 100% availability. Would the average across
all the ports be a more accurate estimate of someone's experience of the network?
Even though there is disagreement on the validity of the numbers, network operators always strive for the 5 9s target
based on their own internal rules.
As a consequence of an outage, both the customer and the provider are in danger of losing revenue.
Most SLAs will include credits for outage time which can be very costly.
• It is possible to lose a large portion of the margins gained during a single significant outage.
While an outage is taking place, both the customer and provider require an increase in their support
activities.
• On-call, overtime hours paid, and time off in lieu.
Word-of-mouth brand degradation is a likely consequence when an outage is prolonged or mishandled.
• With today's Internet, major outages have a way of going viral on social media and news channels
and may affect future sales and renewals.
• Having proper escalation channels, including the time spent managing these channels to ensure
that the best service possible is provided, comes at a cost.
It is common practice to exclude non-service-affecting issues and scheduled maintenance windows from the
calculations.
1. Hardware Stability - How redundant is your system? Do you have redundant control processors that support a full
chassis with negligible packet loss, non-stop services, and other HA features?
2. Network Resiliency – Does your network deploy fast reroute, standby secondary paths, or SONET protection? Is
the physical transport geographically dispersed? Are links identified as shared risk in your routing protocol? Is your
redundant site on the same power source? Have you deployed UPS?
MTTR includes the total outage time regardless of when the outage was detected. The pressure is on the network provider to be
proactive in finding and repairing issues in the network.
MTTR is a KPI in many support groups and included in most SLAs with customers.
3. Process and Support - Do you have an escalation process defined? Do you have a support contract for your
network equipment? How easily can you detect performance degradation and isolate or resolve network outages?
What sort of Network Management tools do you have available? How quickly can you recover from failures in your
network?
In the event of DoS (Denial of Service) attacks, Management Access Filters, CPM Queues, and CPM Filters maintain the
integrity of services using the Nokia SR portfolio.
For example, an SAA test can be performed to measure the latency and jitter across a network before turning a service
over to a customer.
As another example, a network operator may turn on cflowd to baseline the current internet peering traffic distribution
before adjusting policies. They would also use cflowd to monitor the impacts of changes made.
For example, while isolating an incident, an operator may use an OAM command to locate a specific MAC address in a
VPLS service. Once located, the operator may monitor the rate of specific traffic on a port, and use show commands to
view the queue statistics.
Debug commands are often used to isolate control plane issues. They are very useful for session establishment issues.
Cflowd may be consulted to detect any traffic pattern changes that may isolate an upstream network failure.
The Nokia SR platform supports this approach with various tools. For example, a network manager can use “cron”
to schedule an SAA probe to perform an OAM test every 5 minutes and generate an “SNMP trap” if a threshold is
crossed.
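The idea above might be sketched as follows. This is a hedged illustration only: the test name, target address, and thresholds are placeholders, and the cron wiring (a schedule driving a CLI script that issues saa "edge-latency" start) is omitted because its exact mechanics vary by SR OS release.

```
# Hypothetical sketch: an SAA ping test with a rising latency threshold
configure saa
    test "edge-latency" owner "TiMOS CLI"
        type
            icmp-ping 192.0.2.1 count 5 interval 1
        exit
        latency-event rising-threshold 100 falling-threshold 50
        no shutdown
    exit
exit
# A crossed threshold raises an event that the log framework can
# forward as an SNMP trap; scheduling every 5 minutes is done via cron.
```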
SAM can be used as a collector to correlate multiple events and notify network managers of incidents.
Logs are the first place to look when investigating an issue. The outputs, for the most part, are comprehensible in
English and thus provide a strong indicator of where to start your isolation attempts.
Logs also aid in the documentation process. They are time-stamped to allow for accurate MTTR and MTBF recording.
When deploying logs throughout a network, timing synchronization becomes very important to the correlation of
events.
The NTP algorithm is much more complicated than the SNTP algorithm. NTP normally uses multiple time servers to
verify the time and then controls the slew rate of the local clock. The algorithm determines whether the values are accurate using
several methods, including fudge factors and identifying time servers that do not agree with the other time servers. It
then speeds up or slows down the local clock's drift rate so that (1) the clock's time is always correct and (2) there are no
subsequent time jumps after the initial correction. Unlike NTP, SNTP usually uses just one time server to
calculate the time and then it "jumps" the system time to the calculated value. It can, however, have backup
time servers in case one is unavailable. During each interval, SNTP determines whether the time is off enough to make a
correction and, if it is, applies the correction.
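On the SR OS side, pointing a node at more than one NTP server follows this general shape. A sketch only; the server addresses are placeholders and exact syntax varies by release.

```
# Hypothetical sketch: two NTP servers so the node can cross-check time
configure system time
    ntp
        server 192.0.2.123
        server 192.0.2.124
        no shutdown
    exit
exit
```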
Log 98 is created by SAM for SNMP events; as a best practice, it is typically left reserved even if a network has not deployed
SAM.
Operators can create 97 additional custom logs from various sources and have them sent to various destinations. It is
important to create custom logs before you need the events you are trying to collect.
In addition to event logs, the Nokia 7x50 also has special filter logs that can be used to send a copy of frames matching
a specific entry in an ip-filter or a mac-filter.
Before an event can be associated with a log-id, the from command, identifying the source of the event, must be configured. Only
one destination can be specified for a log-id. The destination of an event stream can be a memory buffer, the console, a session, an
snmp-trap-group, a syslog server, or a file. Use the event-control command to suppress the generation of events, alarms, and traps for all log
destinations.
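A minimal sketch of these commands follows. The log-id number, buffer size, and the event-control arguments are hypothetical placeholders; consult the release documentation for exact syntax.

```
configure log
    log-id 15
        from main security     # source streams feeding this log
        to memory 500          # single destination: a 500-entry buffer
        no shutdown
    exit
    # Hypothetical: suppress one specific application event everywhere
    event-control "bgp" 2005 suppress
exit
```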
Console
Sending events to a console destination means the message will be sent to all active console sessions. If there are no
active console sessions, the event log entries are dropped. The console device can be used as an event log destination.
Session
A session destination is a temporary log destination which directs entries to the active console session for the duration
of that session. When the session is terminated, the event log is removed. Event logs with a session destination are not
stored in the configuration file. Event logs can direct log entries to the session destination.
Log 99 records all events across all severities from the main (M) source. Note that it has logged a much larger number
of events since inception. It is important to note that even though log-id 99 has logged 17,056 events, only the last
500 are viewable.
Log-id 100 only logs events with a severity of major or higher.
default-action drop
description "Collect events for Serious Errors Log"
entry 10
    action forward
    description "Collect only events of major severity or higher"
    match
        no application
        no number
        severity gte major
        no router
        no subject
    exit
exit
show log log-collector - Displays log collector statistics for the main, security, change, and debug log collectors.
<application>: application_assurance|aps|atm|bgp|cflowd|chassis|
debug|dhcp|dhcps|efm_oam|elmi|ering|eth_cfm|etun|
filter|gsmp|igh|igmp|igmp_snooping|ip|ipsec|isis|
l2tp|lag|ldp|li|lldp|logger|mcpath|mc_redundancy|
mirror|mld|mld_snooping|mpls|msdp|nat|ntp|oam|ospf|
pim|pim_snooping|port|ppp|pppoe|rip|route_policy|
rsvp|security|snmp|stp|svcmgr|system|user|video|vrrp|
vrtr
Logs can consume a large amount of flash space; it is therefore advised never to log to compact flash 3 (cf3:).
In this example,
log0601 represents log-id 06, file-id 01.
20110331 represents the calendar date of file creation on the node.
124450 represents the time of creation (12:44:50).
In the example above, the administrator creates a log-id for debug traffic. The output of the debug is posted to the
session. Upon creation, the administrator then activates the desired debug; in this case, OSPF packets. Once
activated, the output is sent to the session.
To turn off the debug, use the “no debug” command. When sending debug traffic to the session, the prompt may not
be visible. That is “OK”, since the router will still accept your typing. If lots of information is being sent to the session,
you may not see what you are typing. That too is “OK”, since the router will keep track of the input. When the debug is
turned off, the router will send the following message:
A:R1# no debug
Trace disabled for all existing and future clients
Debug output can also be turned off by shutting down the log that was created for the debug. The debug statement will
remain and if the log is “no shutdown”, the debug output will resume.
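The sequence described above might look like the following sketch. The log-id number is arbitrary and the syntax is hedged; check the release documentation before relying on it.

```
A:R1# configure log log-id 33 from debug-trace to session
A:R1# debug router ospf packet
  ... OSPF packet debug output is written to this session ...
A:R1# no debug
Trace disabled for all existing and future clients
```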
*A:R6>config>log>filter>entry# match
- match
- no match
Note:
If no action is set within a log filter entry, the entry will follow the action of the default-action.
Consult the user guides specific to the version you are running for a complete list.
Each entry within the filter is shown in detail. This command is very useful for verifying the accurate configuration of an
ACL.
The “IP Filter” statement identifies the filter number and the entry number within that filter that qualify the captured
information entered into the log file.
For a fully redundant system, a network operator would deploy either a Multiple Destinations architecture or a
Distributed Relay architecture. In both cases, in the event of a backhaul network failure, syslog messages have a greater
probability of being tracked.
Syslog uses eight severity levels whereas the 7750 SR-series uses six. This results in a mapping of SR severity levels to
syslog severities.
configure log syslog syslog-id address ip-address description description-string facility syslog-facility level syslog-level
log-prefix log-prefix-string port port
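A concrete instance of the syntax above might look like the following sketch. The collector address, ids, facility, and prefix are placeholders.

```
configure log
    syslog 5
        address 192.0.2.10        # remote syslog collector
        facility local7
        level warning
        log-prefix "SR7750-R1"
    exit
    log-id 20
        from main
        to syslog 5               # direct main events to the syslog target
        no shutdown
    exit
exit
```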
Note:
This feature applies only to port-based Epipe SAPs because 802.3ah runs at the port level, not the VLAN level. Hence, such
ports must be configured as null-encapsulated SAPs.
Note:
• 802.3ah must be enabled on both ends of a link for the physical ports to go operationally up.
• When turning on 802.3ah, the port may show a status of (Link Up) until the discovery process is completed.
…
Peer Mac Address : 8e:4c:01:01:00:02
Peer Vendor OUI : 00:16:4d
Peer Vendor Info : 00:00:00:00
Peer Mode : active
Peer Pdu Size : 1518
Peer Cfg Revision : 0
Peer Support : LB
===============================================================================
Ethernet Oam Statistics
===============================================================================
Input Output
-------------------------------------------------------------------------------------------------------------------
Information 1297335 1297388
Loopback Control 0 0
Unsupported Codes 0 0
Frames Lost 0
===============================================================================
When a router receives a packet for which it currently does not have a flow entry, a flow structure is initialized to
maintain state information regarding that flow. Subsequent packets matching the same parameters of the flow (IP
addresses, port numbers, and so on) contribute to the byte and packet count of the flow until it is
terminated and exported by the router to a cflowd collector.
Cflowd is also useful for web host tracking, accounting, network planning and analysis, network monitoring, developing
user profiles, data warehousing and mining, as well as security-related investigations. Collected information can be
viewed in several ways such as in port, AS, network matrices, and/or pure flow structures. The amount of data stored
depends on the cflowd configurations.
The following data is maintained for each individual flow in the raw flow cache:
• Source IP address
• Destination IP address
• Source port
• Destination port
• Input interface
• Output interface
• IP protocol
• TCP flags
• First timestamp (of the first packet in the flow)
• Last timestamp (timestamp of last packet in the flow prior to expiry of the flow)
• Source AS number for peer and origin (taken from BGP)
• Destination AS number for peer and origin (taken from BGP)
• IP next hop
• BGP next hop
• ICMP type and code
• IP version
• Source prefix (from routing)
• Destination prefix (from routing)
• MPLS label stack from label 1 to 6
Within the raw flow cache, the following characteristics are used to identify an individual flow:
• Ingress interface
• Source IP address
• Destination IP address
• Source transport port number
• Destination transport port number
• IP protocol type
• IP TOS byte
• Virtual router ID
• ICMP type and code
• MPLS labels
A network operator may store locally, within a region, certain information that would otherwise generate significant data
traffic. Other statistical data that is useful for real-time monitoring in the network management center may be sent
directly to a centralized location.
In some smaller installations, only a single central server may be deployed. A final alternative is to store
information locally at each site.
When a flow is exported from the cache, the collected data is sent to an external collector which maintains an
accumulation of historical data flows that network operators can use to analyze traffic patterns.
As the entries within the aggregate matrices are aged out, they are accumulated to be sent to the external flow
collector in Version 8 format. The sample rate and cache size are configurable values. The cache size default is 64K
flow entries.
By enabling cflowd at the interface level, all IP packets forwarded by the interface are subject to cflowd analysis. By
setting cflowd as an action in a filter, only packets matching the specified filter are subject to cflowd analysis. This
provides the network operator greater flexibility in the types of flows that are captured.
active-timeout – Specifies, in minutes, how long cflowd samples an active flow before terminating the flow. The range is 1 to 600.
The default is 30.
cache-size – Specifies the maximum number of active flows in the flow cache table. The range is 1000 to 131072. The default is
65536.
inactive-timeout – Specifies how long, in seconds, cflowd waits for a matching flow packet before it considers the flow inactive and
terminates it. The range is 10 to 600. The default is 15.
overflow – Specifies the percentage of flow cache entries cflowd removes when the number of cache entries exceeds the cache-size
value.
rate - Specifies the rate at which cflowd samples packets. Cflowd samples one packet out of every N packets where N is the parameter
value. For example, a value of 100 specifies that one of every 100 packets is to be sampled whereas a value of 1 means that all
packets are to be sampled. The range is 1 to 10 000. The default is 1000.
template-retransmit - Specifies how often, in seconds, cflowd sends cflowd template definitions to collectors. The parameter is
configurable when the Version parameter, which is set when configuring the collectors, is set to version-9. The range is 10 to 600.
The default is 600.
ip-address - Specifies the cflowd collector host address, an IPv4 address in dotted-decimal format.
port - Specifies the cflowd collector UDP port. The range is 1 to 65 535. The default is 2055.
description - Specifies a description for the created object. The range is 0 to 80 characters.
version - Specifies the cflowd version of the collector. The options are 5, 8 or 9.
aggregation - Specifies the type of aggregation scheme to export. The parameter is configurable when the version parameter is set
to 8. The following are types of options:
raw - Flows are not aggregated but sent to the collector in a V5 record.
destination-prefix - Flows are aggregated based on the destination prefix and mask, destination AS, and egress
interface.
protocol-port - Flows are aggregated based on the IP protocol type, source port number, and destination port
number.
source-destination-prefix - Flows are aggregated based on the source prefix and mask, destination prefix and mask,
source and destination ASs, ingress interface, and egress interface.
source-prefix - Flows are aggregated based on the source prefix and mask, source AS, and ingress interface.
as-matrix - Flows are aggregated based on the source and destination AS and ingress and egress interfaces.
autonomous-system-type - Specifies whether the AS information that cflowd sends for analysis is from the originating AS or peer AS.
The parameter is configurable when the Version (version) parameter is set to 5 or 8. The following options are:
origin (default)
peer
template-set - Specifies which set of templates cflowd sends to the collector. The parameter is configurable when the Version
(version) parameter is set to 9. The options are:
basic
mpls-ip
Please consult the 7750 SR OS Router Configuration Guide section “Configuring Cflowd Collectors” for a list of
attributes included in each template type.
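Putting the parameters above together, a cflowd configuration might be sketched as follows. The collector address is a placeholder, and details such as whether a create keyword is required on the collector vary by release.

```
configure cflowd
    rate 1000              # sample 1 in every 1000 packets
    active-timeout 30      # minutes before an active flow is exported
    inactive-timeout 15    # seconds of silence before a flow is closed
    cache-size 65536
    collector 192.0.2.50:2055 version 9
        template-set basic
        no shutdown
    exit
    no shutdown
exit
```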
To enable filter traffic sampling, the following requirements must be met:
1. Cflowd must be enabled globally.
2. At least one cflowd collector must be configured and enabled.
3. On the IP interface being used, the interface>cflowd acl option must be selected. (See Interface Configuration) For
configuration information, refer to the IP Router Configuration Overview section of the 7750 SR OS Router
Configuration Guide.
4. On the IP filter being used, the entry>filter-sample option must be explicitly enabled for the entries matching the
traffic that should be sampled. The default is no filter-sample. (See Filter Configuration for more information).
5. The filter must be applied to a service or network interface. The service or port must be enabled and operational.
When a filter policy is applied to a service or network interface, sampling can be configured so that traffic matching the
associated IP filter entry is sampled when the IP interface is set to cflowd ACL mode and the filter-sample command
is enabled. If cflowd is neither enabled (no filter-sample) nor set to the cflowd interface mode, then sampling does
not occur.
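The five requirements above might be sketched as the following one-line commands. Filter id, interface name, and collector address are placeholders, and create keywords or context nesting may differ by release.

```
configure cflowd no shutdown                                  # 1. cflowd enabled globally
configure cflowd collector 192.0.2.50:2055 version 5 create   # 2. at least one collector
configure router interface "to-core" cflowd acl               # 3. interface in cflowd ACL mode
configure filter ip-filter 10 create entry 10 create filter-sample   # 4. sampling on the entry
configure filter ip-filter 10 entry 10 action forward
configure router interface "to-core" ingress filter ip 10     # 5. filter applied to the interface
```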
Since a filter can be applied to more than one interface (when configured with a scope template), the interface-disable-
sample option is intended to enable or disable traffic sampling on an interface-by-interface basis. The command can
be enabled or disabled as needed instead of creating numerous filter versions.
When the interface-disable-sample command is enabled, traffic matching the associated IP filter entry is
not sampled if the IP interface is set to cflowd ACL mode.
The following table displays the expected results when specific features are enabled and disabled:
Cflowd Admin Status - The desired administrative state for the cflowd remote collector host.
Cflowd Oper Status - The current operational status of the cflowd remote collector host.
Active Timeout - The maximum amount of time, in minutes, before an active flow is exported. If an individual flow is
active for this amount of time, the flow is exported and a new flow is created.
Inactive Timeout - Inactive timeout in seconds.
Template Retransmit - The time in seconds before template definitions are sent.
Cache Size - The maximum number of active flows to be maintained in the flow cache table.
Overflow - The percentage number of flows to be flushed when the flow cache size has been exceeded.
Sample Rate - The rate at which traffic is sampled and forwarded for cflowd analysis.
One (1) — All packets are analyzed.
1000 (default) — Every 1000th packet is analyzed.
Active Flows - The current number of active flows being collected.
Total Pkts Rcvd - The total number of packets received for cflowd analysis.
Total Pkts Dropped - The total number of packets dropped.
Aggregation Info:
Type - The type of data to be aggregated and sent to the collector.
Status
enabled — Specifies that the aggregation type is enabled.
disabled — Specifies that the aggregation type is disabled.
Host Address - The IP address of a remote cflowd collector host to receive the exported cflowd data.
Port - The UDP port number on the remote cflowd collector host to receive the exported cflowd data.
AS Type - The style of AS reporting used in the exported flow data.
origin — Reflects the endpoints of the AS path which the flow follows.
peer — Reflects the AS of the previous and next hops for the flow.
Version - Specifies the configured version for the associated collector.
Admin - The desired administrative state for the cflowd remote collector host.
Oper - The current operational status of the cflowd remote collector host.
Recs Sent - The number of cflowd records that have been transmitted to the remote collector host.
Collectors - The total number of collectors using the IP address.
Address - The IP address of a remote cflowd collector host to receive the exported cflowd data.
Port - The UDP port number on the remote cflowd collector host to receive the exported cflowd data.
Description - A user-provided descriptive string for the cflowd remote collector host.
Version - The version of the flow data sent to the collector.
AS Type - The style of AS reporting used in the exported flow data.
origin — Reflects the endpoints of the AS path which the flow follows.
peer — Reflects the AS of the previous and next hops for the flow.
Admin State - The desired administrative state for the cflowd remote collector host.
Oper State - The current operational status of the cflowd remote collector host.
Records Sent - The number of cflowd records that have been transmitted to the remote collector host.
Last Changed - The time when the row entry was last changed.
Last Pkt Sent - The time when the last cflowd packet was sent to the remote collector host.
Aggregation Type - The bit mask which specifies the aggregation scheme(s) used to aggregate multiple individual flows
into an aggregated flow for export to the remote host collector.
none — No data is exported for the remote collector host.
raw — Flow data is exported without aggregation in Version 5 format.
All other aggregation types use Version 8 format to export the flow data to the remote host collector.
Collectors - The total number of collectors using the IP address.
Service Accounting Statistics - Collected on every queue on every SAP that is linked to an accounting policy. Service
accounting statistics provide queue throughput and drop information and can be used for billing and SLA purposes.
Network Accounting Statistics - Collected on every queue on every SDP or on network ports that are linked to an
accounting policy. Network accounting statistics measure forwarding class queue usage. This information is used to
monitor link utilization and identify network traffic patterns and trends for capacity planning and traffic engineering.
Subscriber Accounting Statistics – Collected on a subscriber profile for residential subscriber instances. Subscriber
accounting statistics are used for billing and SLA purposes.
Application Assurance, or AA, Accounting Statistics - Collected on applications, application groups, and protocols.
Note: SDP statistics collection is supported only on devices that are in chassis mode B, C, or D.
A network operator may store locally, within a region, certain information that would otherwise generate significant data
traffic. Other statistical data that is useful for real-time monitoring in the network management center may be sent
directly to a centralized location.
In some smaller installations, only a single central server may be deployed. A final alternative is to store
information locally at each site.
An accounting policy specifies an accounting statistics record type, a collection interval, an administrative state, and a
file.
A NE collects accounting statistics based on a specified collection interval and writes the statistics data in XML format to
a file on the NE. After the rollover period, the NE closes and compresses the file. The NE then notifies the 5620 SAM
that a new file is ready for processing.
Rollover Interval – Defined in minutes and determines how long a file will be used before it is closed and a new log file is
created.
Retention Interval – Determines how long a file will be stored on the CF before it is deleted.
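These intervals are set on the file-id that an accounting policy writes to. A sketch, with placeholder ids, record name, and intervals:

```
configure log
    file-id 10
        location cf1:
        rollover 15 retention 4   # close the file every 15 min; keep closed files 4 hours
    exit
    accounting-policy 5
        record service-ingress-octets
        collection-interval 5     # minutes between statistics collections
        to file 10
        no shutdown
    exit
exit
```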
The 7750 SR-Series creates two directories on the compact flash to store the files. The following output displays a
directory named act-collect that holds accounting files that are open and actively collect statistics. The directory
named act stores the files that have been closed and are awaiting retrieval.
Accounting files always have the prefix act followed by the accounting policy ID, log ID, and timestamp.
Predefined Record-Names:
aa-app-group | aa-application | aa-protocol | aa-subscriber-application | aa-subscriber-protocol |
combined-ldp-lsp-egress | combined-mpls-lsp-egress | combined-mpls-lsp-ingress | combined-network-ing-egr-octets |
combined-queue-group | combined-sdp-ingress-egress | combined-service-ing-egr-octets | combined-service-ingress |
compact-service-ingress-octets | complete-sdp-ingress-egress | complete-service-ingress-egress |
complete-subscriber-ingress-egress | custom-record-aa-sub | custom-record-service | custom-record-subscriber |
network-egress-octets | network-egress-packets | network-ingress-octets | network-ingress-packets |
queue-group-octets | queue-group-packets | saa | service-egress-octets | service-egress-packets |
service-ingress-octets | service-ingress-packets | video
Each accounting record name consists of one or more sub-records, each of which in turn consists of multiple fields.
Below is a description of one accounting record:
Sub-record: Service-ingress-Octets (sio)

  Field   Description
  -----   ---------------------------
  svc     SvcID
  sap     SapID
  qid     QueueID
  hoo     OfferedHiPrioOctets
  hod     DroppedHiPrioOctets
  loo     LowOctetsOffered
  lod     LowOctetsDropped
  uco     UncoloredOctetsOffered
  iof     InProfileOctetsForwarded
  oof     OutOfProfileOctetsForwarded
When the no collect-stats command is issued, the statistics are still accumulated by the IOM cards. However, the CPU
will not obtain the results and will not write them to the billing file. If a subsequent collect-stats command is issued, the
counters written to the billing file will include all the traffic while the no collect-stats command was in effect.
This command assigns the accounting policy to a SAP, SDP, or pseudowire template. An accounting policy must be
defined before it can be associated. If the policy-id does not exist, an error message is generated.
A maximum of one accounting policy can be associated at one time. Accounting policies are configured in the
config>log context.
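Associating an accounting policy with a SAP might look like the following sketch. The service type, service id, and SAP id are placeholders; collect-stats enables writing the accumulated counters to the billing file.

```
configure service epipe 100
    sap 1/1/3:50
        accounting-policy 5    # policy defined under config>log
        collect-stats          # write the IOM counters to the billing file
    exit
exit
```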
Pseudowire Template
The pw-template is defined under the top level service command (config>service# pw-template) and specifies whether
to use an automatically generated SDP or manually configured SDP. It also provides the set of parameters required for
establishing the pseudowire (SDP binding).
Context: config>port>ethernet>access>egr>qgrp
config>port>ethernet>access>ing>qgrp
config>port>ethernet>network>egr>qgrp
config>port>ethernet>network
config>port>sonet-sdh>path>network
config>port>tdm>ds1>network
config>port>tdm>ds3>network
config>port>tdm>e1>network
config>port>tdm>e3>network
Note:
The results shown above are absolute values.
For a remote mirror configuration, the slice-size parameter is set on the source router. This conserves
mirroring resources by limiting the size of the stream of packets through the 7750 SR-Series and the core network. For
example, if a value of 256 bytes is defined, up to the first 256 bytes of the frame are transmitted to the mirror
destination. The original frame is not affected by the truncation.
The transmission of a sliced or non-sliced frame is also dependent on the mirror destination SDP path MTU and/or the
mirror destination SAP physical MTU. Packets that require a larger MTU than the mirroring destination supports are
discarded if the defined slice size does not truncate the packet to an acceptable size.
Parameters bytes — The number of bytes to which mirrored frames will be truncated, expressed as a decimal integer.
Values 128 — 9216
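A remote mirror source with truncation might be sketched as follows. The mirror-dest id, SDP binding, and mirrored port are placeholders, and the mirror-source syntax in particular varies by release.

```
configure mirror
    mirror-dest 100 create
        slice-size 256            # forward at most the first 256 bytes of each frame
        spoke-sdp 10:100 create   # remote destination reached via an SDP binding
        exit
        no shutdown
    exit
exit
# Hypothetical source definition: mirror both directions of a port
debug mirror-source 100 port 1/1/2 egress ingress
debug mirror-source 100 no shutdown
```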
For in-band testing, the OAM packets closely resemble customer packets to effectively test the customer's forwarding
path. However, they are distinguishable from customer packets and are thus kept within the service provider's network
and not forwarded to the customer.
The suite of OAM diagnostics supplements the basic IP ping and traceroute operations with diagnostics specialized for
the different levels of the service delivery model. There are diagnostics for MPLS LSPs, SDPs, services, and VPLS MACs
within a service.
jitter-event - Specifies that the calculated jitter value, at the termination of an SAA test probe, is evaluated against the
configured rising and falling jitter thresholds.
latency-event - Specifies that the calculated latency event value, at the termination of an SAA test probe, is evaluated
against the configured rising and falling latency event thresholds.
loss-event - Specifies that the calculated loss event value, at the termination of an SAA test run, is evaluated against
the configured rising and falling loss event thresholds.
trap-gen - This command enables the context to configure trap generation for the SAA test.
probe-fail-enable - This command enables the generation of an SNMP trap when probe-fail-threshold consecutive
probes fail during the execution of an SAA ping test.
test-completion-enable - This command enables the generation of a trap when an SAA test completes.
test-fail-enable - This command enables the generation of a trap when a test fails. In the case of a ping test, the test is
considered a failure (for the purpose of trap generation) if the number of failed probes is at least the value of the test-
fail-threshold parameter.
type - This command creates the context to provide a test type for the named test. Only a single test type can be
configured.
continuous - Specifies whether the SAA test is continuous.
When you configure a test, use the config>saa>test>continuous command to make the test run continuously. Use the
no continuous command to disable continuous testing and shutdown to disable the test completely. Once you have
configured a test as continuous, you cannot start or stop it by using the saa test-name [owner test-owner] {start | stop}
[no-accounting] command.
Values:
If a test-owner value is not specified, tests created by the CLI will have a default owner “TiMOS CLI”.
start — This keyword starts the test. A test cannot be started if the same test is still running. A test cannot be started if
it is in a shut-down or continuous state. An error message and log event will be generated to indicate a failed attempt to
start an SAA test run.
stop — This keyword stops a test in progress. A test cannot be stopped if it is not in progress or is in a continuous
state. A log message will be generated to indicate that an SAA test run has been aborted.
no-accounting — This parameter disables the recording of results in the accounting policy. When no-accounting is
specified, the MIB record produced at the end of the test is not added to the accounting file. It does, however, still
consume one of the three MIB rows available to the accounting module.
Use this command to display information about the SAA test. If no specific test is specified, a summary of all configured
tests is displayed. If a specific test is specified, detailed results are displayed for the last three executions of that
test, or since the counters were last reset by a system reboot or clear command.
For each defined SAA test, the router keeps the latest three results, with a maximum of 50 result entries.
If a test-owner value is not specified, tests created by the CLI will have a default owner “TiMOS CLI”.
Network Management Systems (NMSs) are responsible for monitoring the health of the elements in a node and the
services that run on top of them in addition to gathering logs and notifying operators.
Control Plane Managers (CPAMs) are responsible for monitoring the control plane processes on an element, allowing
the operator to assess the status of individual tunnels, routing protocols, and so on.
As an EMS, SAM offers the ability to configure nodal elements and services from an end-to-end network perspective.
As an NMS, SAM provides topology maps, event correlation, OAM test facilities, and other monitoring and reporting
capabilities.
The advantage of SAM is in its awareness of the entire discovered network topology as opposed to the CLI, which
provides a very detailed view of the local router.
Misuse Detection
Threats are detected against individual /32 hosts and cover the most common attack types. Misuse anomalies are
generated for ICMP, TCP NULL, TCP SYN, TCP RST, IP NULL, IP Fragment, IP Private Address Space, UDP, DNS and Total
Traffic flood attacks.
Profiled Detection
Threats are detected against Managed Objects configured within the Peakflow SP system. Profiled Anomalies are
generated when traffic for a given Managed Object deviates from normal traffic levels. Peakflow SP builds ‘baselines’
for the Managed Objects configured within the SP system.
Fingerprint Detection
Threats are detected when traffic exceeds user-defined bps/pps thresholds for either a user-defined detection
‘signature’ based on Layer 3/4 traffic parameters or an Arbor Active Threat Feed (ATF) policy.
ATF - Provided to all Peakflow SP deployments (will periodically download signature updates from an Arbor server over
SSL) and contains Layer 3 / 4 signature information for identifying both malicious and non-malicious network behaviors
which may be of interest to network operators (for example, host connections to known Botnet CnC servers, traffic to
Dark IP addresses, worm infection, and so on).
Policy routing is a popular tool used to direct traffic in Layer 3 networks. As Layer 2 VPNs become more popular,
especially in network aggregation, policy forwarding is required. Many providers are using methods such as DPI servers,
transparent firewalls, or Intrusion Detection/Prevention Systems (IDS/IPS) for policy routing. Since these devices are
limited by bandwidth, providers want to limit the traffic that is forwarded through them. To accomplish this, a
mechanism is required to direct some of the traffic coming from a SAP to the DPI without learning and to direct other
traffic coming from the same SAP directly to the gateway uplink based on learning.
VPLS policy-based forwarding allows the provider to create a filter that will forward packets to a specific SAP or SDP.
The packets are then forwarded to the destination SAP regardless of a learned destination or lack thereof. The SAP can
either terminate a Layer 2 firewall or Deep Packet Inspection (DPI) directly or be configured to be part of a cross-
connect bridge into another service. This is useful when running the DPI remotely using a VPWS.
If an SDP is used, the provider can terminate it in a remote VPLS or VPWS service where the firewall is connected. The
filter can be configured under a SAP or SDP in a VPLS service. All packets (unicast, multicast, broadcast, and unknown)
can be delivered to the destination SAP or SDP. The filter may be associated with SAPs or SDPs belonging to a VPLS
service only if all actions in the policy forward to SAPs or SDPs within the context of that VPLS.
Note: Disabling the learning function on SAP 1/1/5:100 and SAP 1/1/4:100 is an important step in this example. SAP
1/1/5:100 and SAP 1/1/4:100 see traffic with the source MACs of the hosts at IP addresses 192.168.1.1 and 192.168.1.2.
Disabling the learning function prevents incorrect information from being populated in the FDB.
Applications
• Operator configured for maximum flexibility.
• Specific match criteria such as signature, traffic direction, server subnet, TCP/UDP ports, and
string matches.
• For example: Gmail, Database, WebTraining, and so on.
Application Groups
• Multiple applications can be grouped together.
Resource Starvation attacks attempt to consume bandwidth or resources such as session-handling capabilities.
Whatever the specific DoS attack used, it manifests as a flood of packets such as SYN, SYN-FIN, ICMP, TCP, or UDP traffic.
Amplifier sites will typically consist of zombie computers on xDSL, cable, or dialup connections (both residential and
business systems).
DoS will affect customers and the network in several ways such as:
• The inability to gain access to the network.
• Response times that produce a less than satisfactory user experience.
Arbor
• Set baseline service-level characteristics
• Identify potential attacks and send alarms
• Fine tune mitigation tools
• View what is being blocked
The 7750 SR OS implementation exits the filter when the first match is found and executes the specified action. For
this reason, entries must be sequenced correctly from most to least explicit.
An entry may not have any match criteria defined (in which case, everything matches) but must have at least the
keyword action to be considered complete. Entries without the action keyword are considered incomplete and will be
rendered inactive.
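The first-match evaluation described above can be sketched as follows (a simplified model in plain Python, not actual SR OS code). Entries without an action are treated as inactive, and evaluation stops at the first matching entry:

```python
def evaluate_filter(entries, packet, default_action="forward"):
    """Evaluate filter entries in ascending id order, stopping at the
    first match.

    `entries` is a list of dicts with an `id`, a `match` predicate, and
    optionally an `action`. Entries without an action are incomplete and
    skipped (rendered inactive), mirroring the behavior described above.
    """
    for entry in sorted(entries, key=lambda e: e["id"]):
        if "action" not in entry:
            continue  # incomplete entry: rendered inactive
        if entry["match"](packet):
            return entry["action"]  # exit the filter on first match
    return default_action
```

This also illustrates why sequencing matters: a generic entry placed before a specific one would shadow it, since later entries are never reached once a match occurs.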
CPM filters can be used in conjunction with queues to mitigate the effects of Denial-of-Service threats. You must
determine the CIR (Committed Information Rate), PIR (Peak Information Rate), CBS (Committed Burst Size), and MBS
(Maximum Burst Size) before the CPM-queue can be configured.
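The roles of these four parameters can be illustrated with a toy two-bucket policer sketch (an assumption-laden simplification, not SR OS code): the committed bucket is bounded by CBS and refilled at CIR, the peak bucket is bounded by MBS and refilled at PIR.

```python
def police(packet_size, committed_tokens, peak_tokens):
    """Toy two-bucket policer illustrating the CIR/PIR roles above.

    committed_tokens is bounded by CBS and refilled at CIR; peak_tokens
    is bounded by MBS and refilled at PIR. Token refill over time is
    omitted for brevity; this shows only the classification decision.
    """
    if committed_tokens >= packet_size:
        return "in-profile"       # within the committed rate
    if peak_tokens >= packet_size:
        return "out-of-profile"   # above CIR but within PIR
    return "drop"                 # exceeds even the peak rate
```

In a DoS-mitigation context, out-of-profile control-plane traffic would typically be marked or queued at lower priority rather than forwarded unconditionally, and traffic beyond PIR dropped.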
NOTE: This command should always be executed prior to any additional intrusive troubleshooting, such as shutting
down an interface, resetting a routing protocol, or making any other change that could alter the status of a router that
is not operating correctly.
NOTE: This command should only be used with authorized direction from the Nokia Technical Assistance Center (TAC).
Document resolution
Create a detailed report describing what the problem was, when your team was notified, and what actions were taken
and when. This will become valuable in the next step.
When an IOM is installed in a slot and enabled, the system verifies that the installed IOM type matches the allowed IOM
type. The IOM will remain offline if the parameters do not match. To see the IOM configuration at system initialization
use the “show boot-messages” command. To display the current IOM configuration use the “show card” command.
The following is an example of the output for this command.
The “show mda” command displays which MDA is located in which slot of the chassis. This also allows the
administrator to verify operability.
The “show system information” command allows the administrator to verify basic operational status and uptime.
The “show chassis environment” command will display the current status of the router fans indicating any error
conditions. The status field should say “up” and the speed should be “half speed”.
The “show chassis power-supply” command displays general power supply information.
The “show mda detail” command provides detailed information about each MDA configured on the router.
System Up Time is calculated as the total time from restart or power on. If a CPM switch-over occurs, or it is forced, the
node is considered up, so System Up Time is not reset.
The 7750 SR-1 chassis supports redundant fans. There is a single fan tray containing six fans. Normal system
operation continues if one of the fans fails. (Not field replaceable)
The 7750 SR-7 chassis supports dual redundant fans. One fan is required for normal operation. Normal operations will
be maintained if a single fan unit is removed or fails.
The 7750 SR-12 chassis supports three redundant fans. Two fans of the three fans are required for normal operation.
Normal operations will be maintained if a single fan unit is removed or fails.
The 7750 SR continuously monitors the temperature of the system and adjusts the fan speed accordingly. In the event
of a cooling fan failure, or if the chassis/line card temperature rises above a threshold, the appropriate chassis alarms,
SNMP traps, and events are triggered.
NOTE: The fan tray increases speed as the egress air temperature rises and is forced to high speed if or when any of the
internally monitored temperatures on any of the SF/CPMs, IOMs, or MDAs exceeds 154°F (68°C).
NOTE: There are temperature sensors on the IOM, SF/CPM, and each MDA. An alarm (trap) is generated when
temperatures exceed 167°F (75°C). If any slot exceeds 230°F (110°C), the slot shuts down to avoid damage.
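The Fahrenheit figures above are straight conversions of the Celsius thresholds (the first one rounded). A quick arithmetic check in plain Python:

```python
def c_to_f(celsius):
    """Convert Celsius to Fahrenheit, to sanity-check the thresholds above."""
    return celsius * 9 / 5 + 32

# 68 C forces the fans to high speed, 75 C raises an alarm (trap),
# and 110 C shuts the slot down.
print(round(c_to_f(68)), c_to_f(75), c_to_f(110))
```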
CPU Time – The time each process or protocol has used in the specified sample period.
CPU Usage – The total CPU usage of the process or protocol.
Capacity Usage - Displays the level at which the specified process or protocol is being utilized. If this number hits 100%,
this part of the system is busied out. There may be spare CPU cycles overall, but this process or protocol is running at
capacity.
If the connection is slow there could be a duplex mismatch on the interface. Check the interface with the “show
interface” command and the default logs for errors being detected on the interface.
If excessive FCS errors are detected ensure that the MTU of the end device is configured correctly. Also, ensure that
the cable connecting the two devices together is not near any devices that might be injecting electrical interference.
Late collisions occur after the standard collision detection time (5.12 µs for Fast Ethernet). The smallest valid Ethernet
frame is 64 bytes; anything smaller is considered a runt and discarded. The largest standard Ethernet frame is
1518 bytes; anything larger is considered a giant and discarded. All of these issues can be the result of faulty NICs,
poor cabling, or non-compliant network configurations using excessive numbers of hubs.
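The size limits above can be sketched as a simple classifier (an illustration in plain Python, not router code):

```python
def classify_frame(length_bytes):
    """Classify an Ethernet frame by length, per the limits above.

    64 bytes is the minimum and 1518 bytes the maximum standard frame
    size (both including the 4-byte FCS); frames outside this range are
    discarded as runts or giants.
    """
    if length_bytes < 64:
        return "runt"       # too small: discarded
    if length_bytes > 1518:
        return "giant"      # too large: discarded
    return "valid"
```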
Ways to identify problems at the data-link layer vary. Below are some examples of how to detect problems:
• MTU failures, encapsulation failures, and framing errors are all good indicators of problems at the link layer.
• Late collisions are commonly seen when a duplex mismatch exists. Runts are frames that are too small for the
protocol, and giants are frames that are too large.
• Router log files (log files 99 and 100) should be checked to ensure that there are no critical errors.
• MAC database corruption will cause frames to be forwarded out ports that should not receive them. This can be
catastrophic and contribute to an STP loop.
• Excessive broadcast traffic can saturate a network. Keep in mind that a switch forwards broadcast traffic out all
ports, except the one the traffic is received on, as long as they belong to the same VLAN.
• STP failures and loops can bring a network to its knees. Methodical design and meticulous configuration will avoid this
issue.
• MAC filters that are not documented and configured correctly will cause frames to be blocked from being sent out
interfaces without an administrator's knowledge.
Troubleshooting note: Ports by default are administratively down. If a port is correctly configured but not up, most
likely the port is administratively down.
The highlighted parts outline just some of the information that can be used in troubleshooting a physical layer problem.
Note that the specific port and operational speed are defined. In addition, the administrative and operational status,
along with the duplex settings, are stated.
Several lines have been removed to be able to show the “Transceiver Type” entry of the “show port x/x/x” command. In
this case the transceiver in use is an “SFP” type.
Alignment Errors - The total number of packets received that had a length (excluding framing bits, but including FCS
octets) of between 64 and 1518 octets, inclusive, but had either a bad Frame Check Sequence (FCS) with an integral
number of octets (FCS Error) or a bad FCS with a non-integral number of octets.
FCS Errors - The number of frames received on a particular interface that are an integral number of octets in length but
do not pass the FCS check.
SQE Test Errors - The number of times that the SQE TEST ERROR is received on a particular interface.
CSE - The number of times that the carrier sense condition was lost or never asserted when attempting to transmit a
frame on a particular interface.
Too Long Frames - The number of frames received on a particular interface that exceed the maximum permitted frame
size.
Symbol Errors - For an interface operating at 100 Mb/s, the number of times there was an invalid data symbol when a
valid carrier was present.
Sngl Collisions - The number of frames that are involved in a single collision, and are subsequently transmitted
successfully.
Mult Collisions - The number of frames that are involved in more than one collision and are subsequently transmitted
successfully.
Late Collisions - The number of times that a collision is detected on a particular interface later than one slotTime into
the transmission of a packet.
Excess Collisions - The number of frames for which transmission on a particular interface fails due to excessive
collisions.
Int MAC Tx Errors - The number of frames for which transmission on a particular interface fails due to an internal MAC
sublayer transmit error.
Int MAC Rx Errors - The number of frames for which reception on a particular interface fails due to an internal MAC
sublayer receive error.
The administrator tries to ping the server that most people are upset about and notes that the ping fails. This step
verifies the issue.
Upon executing the ping the administrator notes that only a few of the pings are getting across the link.
Note that the operational and administrative state of the interface is “up”, meaning operational. In addition, the
configured and operational duplex settings are set to “full” on router P1.
Also note that there are no detected errors on the port from this side of the link between P1 and P4.
Note: The Errors counter is a running total. Either clear the port statistics before troubleshooting, or run the command
multiple times and check for an incrementing value.
NOTE: When calculating the MTU for a port, do not include the FCS, although some other vendors do. With the FCS, an
Ethernet frame is 1518 bytes in length; the FCS is the last 4 bytes. Since the FCS is not included in the MTU
calculation, a standard frame is 1514 bytes.
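The arithmetic in the note can be spelled out as follows (plain Python, values from the standard Ethernet frame layout):

```python
# Sketch of the MTU arithmetic described in the note above.
ETH_HEADER = 14      # destination MAC (6) + source MAC (6) + EtherType (2)
FCS = 4              # frame check sequence, excluded from the MTU here
MAX_PAYLOAD = 1500

port_mtu = ETH_HEADER + MAX_PAYLOAD   # 1514: header + payload, no FCS
wire_frame = port_mtu + FCS           # 1518: the full frame on the wire
print(port_mtu, wire_frame)
```

This is why an MTU mismatch of exactly 4 bytes between vendors often traces back to whether the FCS is counted.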
The force-reference command affects both the central clock and the BITS output. The 7750 SR-c4 has two BITS input
ports on the CFM. The force-reference command on this system allows the selection of the specific port.
NOTE:
The debug sync-if-timing force-reference command should only be used to test and debug problems. Network
synchronization problems may appear if network elements are left with this manual override setting. Once the system
timing reference input has been forced, it may be cleared using the no force-reference command.
The persistence feature allows information learned through DHCP snooping across reboots to be kept. This information
can include data such as the IP address, MAC binding information, lease length information, and ingress sap information
(required for VPLS snooping to identify the ingress interface). This information is referred to as the DHCP lease-state
information.
Ethernet ports:
You can NOT loop Ethernet ports using CLI commands.
line — Places the associated port or channel into a line loopback mode. A line loopback loops frames received on the
corresponding port or channel back to the remote router.
internal — Places the associated port or channel into an internal loopback mode. An internal loopback loops the frames
from the local router back at the framer.
fdl-ansi — requests FDL line loopback according to ANSI T1.403.
fdl-bellcore — requests FDL line loopback according to Bellcore TR-TSY-000312.
payload-ansi — requests payload loopback using ANSI signaling.
inband-ansi — requests inband line loopback according to ANSI T1.403.
inband-bellcore — requests inband line loopback according to Bellcore signaling.
BACKBONE
The OSPF backbone area, area 0.0.0.0, must be contiguous and all other areas must be connected to the backbone
area. The backbone distributes routing information between areas. If it is not practical to connect an area to the
backbone then the ABRs must be connected via a virtual link.
STUB AREA
A stub area is a designated area that does not allow external route advertisements. Routers in a stub area do not
maintain external routes. A single default route to an ABR replaces all external routes. This OSPF implementation
supports the optional summary route (type-3) advertisement suppression from other areas into a stub area. This
feature further reduces topological database sizes and OSPF protocol traffic, memory usage, and CPU route calculation
time.
NSSA
Another OSPF area type is called a Not-So-Stubby area (NSSA). NSSAs are similar to stub areas in that no external
routes are imported into the area from other OSPF areas. External routes learned by OSPF routers in the NSSA area are
advertised as type-7 LSAs within the NSSA area and are translated by ABRs into type-5 external route advertisements
for distribution into other areas of the OSPF domain. An NSSA area cannot be designated as the transit area of a virtual
link.
1. The router on the left sends a hello packet with the standard header. In the hello information, the router inserts its
RID and leaves the neighbor field blank because it does not know of any other router on the Ethernet segment.
The router now is in the initializing state.
2. The right-side router responds with its own hello packet. However, this router’s hello contains not only its RID, but
also the RID of the left-side router.
3. The left-side router responds with another hello packet, containing the RID of the right-side router. Now that each
router sees that the other router acknowledges its existence, the state changes from initializing to 2-way.
1. The neighboring routers establish a master/slave relationship. During this step, the initial DBD sequence number is
determined for the exchange state. The router with the highest RID becomes the master, and its initial sequence
number is used.
3. The slave (left-side) router sends its DBD packet, describing its link-state database. The sequence number
negotiated in step 1 is used.
4. The master (right-side) router increments the sequence number and sends its DBD packet, describing its link-state
database.
1. Each router is responsible for maintaining a degree of reliability. Each responds to the DBD with an ACK packet. This
ensures that each router knows the other has received the information without error.
2. In the example, the right-side router asks for explicit information with the use of an LSR. Both routers would
actually be sending LSRs. When the LSR is sent, the state changes from Exchange to Loading.
3. Each router responds to the LSR with one or more LSU packets. These packets contain explicit details about the
requested networks.
2. After all LSUs have been received and ACKs sent, each router now has an identical link-state database. The state
changes from loading to full. This means that each router is fully converged with the other’s database.
3. To maintain the adjacency, the routers send periodic hellos to each other. The default interval is 10 seconds. If
something changes, only that change in the database is sent to the neighbor.
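The walkthrough above follows the standard OSPF neighbor state progression. A minimal sketch of the happy path (state names per RFC 2328; error paths and timers omitted):

```python
# Happy-path OSPF neighbor state progression, as walked through above.
STATES = ["Down", "Init", "2-Way", "ExStart", "Exchange", "Loading", "Full"]

def next_state(current):
    """Return the next state on the happy path, or None once Full."""
    i = STATES.index(current)
    return STATES[i + 1] if i + 1 < len(STATES) else None
```

When troubleshooting, a neighbor stuck in a given state points at the step that failed: stuck in ExStart/Exchange often suggests an MTU mismatch, stuck in Init a one-way hello problem.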
• OSPF status
• OSPF adjacencies
• OSPF interface status
• OSPF Configuration (Areas, ABR, ASBR, etc.)
• OSPF Database (LSAs)
• Management Access or CPM Filters
This command displays all neighbor information. To reduce the amount of output, the user may opt to select the
neighbors on a given interface by address or name. The detail option produces a large amount of data for each
neighbor.
When troubleshooting, this command is an excellent way to quickly check the status of all OSPF neighbors on a router.
This command displays the details of the OSPF interfaces. An interface can be identified by ip-address or ip interface
name. When neither is specified, all in-service interfaces are displayed.
The output above shows that interface “to-R5” is Up and configured as Point-to-Point which, in this example, is correct.
Interface “to-R1” on R5 is also in the same status.
When troubleshooting adjacency issues, be sure to run this command on both interfaces.
In the above log, R1 is stating that it is receiving a hello interval from 10.1.5.5 that conflicts with what is configured on
interface “to-R5”.
The Hello packet is used to establish adjacencies with other routers that speak OSPF. It is also sent periodically to maintain
neighbor connectivity. Viewing hello packets can help in recognizing the following adjacency issues:
- Mismatched subnet mask or IP address
- Router ID not unique
- Authentication problems
- OSPF timers do not match
- Area IDs do not match
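The checks in the list above can be sketched as a simple comparison of hello parameters (an illustration with hypothetical field names, not actual packet parsing):

```python
def hello_mismatches(local, remote):
    """Return the hello fields that differ between two neighbors.

    Each argument is a dict of hello parameters; the field names are
    illustrative, covering the adjacency checks listed above.
    """
    must_match = ["subnet_mask", "hello_interval", "dead_interval",
                  "area_id", "authentication"]
    return [f for f in must_match if local.get(f) != remote.get(f)]
```

An empty result means the hellos are compatible; any returned field name is the parameter to fix on one side.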
Important Notes:
1) Before enabling “debug”, the user must ensure a log is created to view the debug result. The following is a log created to view
debug results. Note that if the log destination is session, when the session is closed, the log (log-id) will not be saved.
B:R1>config>log>log-id 2
B:R1>config>log>log-id$ from debug-trace
B:R1>config>log>log-id$ to session
2) Use either of the following commands to stop the debug at different levels:
debug router ospf no packet - Disable debugging for OSPF packets
debug router no ospf - Disable debugging for all OSPF messages
no debug - Disable debugging for all applications
Syntax:
database [type {router | network | summary | asbr-summary | external | nssa | all}] [area area-id] [adv-router router-id]
[link-state-id] [detail]
Checking the age of an LSA is a good way of isolating the location of the instability. A router that is experiencing link
flapping has to generate a new LSA every time the link flaps. This can be noticed as an LSA, or multiple LSAs, in the
database with a continually low age each time the command is executed.
Note:
In multi-area environments, you must check the source of the summary LSAs. Once the source of the summary LSAs
has been determined, run the OSPF commands on the router that is originating them.
• ISIS status
• ISIS adjacencies
• ISIS interface status
• ISIS Configuration (Levels, Authentication, etc.)
• ISIS Database
• Management Access or CPM Filters
When troubleshooting, this command is an excellent way to quickly check the status of all ISIS adjacencies on a router.
This command displays the details of the ISIS interfaces. An interface can be identified by ip-address or ip interface
name. When neither is specified, all in-service interfaces are displayed.
The output above shows that all the correct interfaces are configured for ISIS level-capability 1 and 2 and are
operationally up.
An L1/L2 router sets the ATT bit when it has routes to another area. The L1 routers in an area create a default route
to the nearest router with the ATT bit set.
The Level 2 database above should be showing Level 2 capable interfaces in all areas. Because R2 is the gateway for
area 49.0001, this is why the client cannot route from area 49.0001 to any other area.
In the above log, R2 is stating that it has L2 authentication failures on interfaces to-R1, to-R3 and to-R6.
The output above shows an authentication password mismatch. R2 is showing a password of 4 bytes (which converts to
“lala”), whereas all its neighbours are showing an authentication password of 7 bytes (which converts to “alcatel”). This
information confirms the authentication mismatch issue.
Checking the “Lifetime” of an entry is a good way of isolating the location of the instability. A router that is
experiencing link flapping has to generate a new entry every time the link flaps. A newly created entry has a lifetime of
1200 seconds, which decreases to 0.
There should not be continual new events when the output is refreshed in a stable network.
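The lifetime countdown described above is simple arithmetic; as a sketch (1200 seconds being the common default maximum age):

```python
def remaining_lifetime(seconds_since_origination, max_age=1200):
    """Sketch of the IS-IS Remaining Lifetime countdown described above:
    a fresh entry starts at 1200 seconds and decreases toward 0."""
    return max(0, max_age - seconds_since_origination)
```

A stable network shows entries with lifetimes scattered well below 1200; entries that repeatedly appear near 1200 point at a router that keeps re-originating them, e.g., because of a flapping link.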
Neighbor relationships in BGP are somewhat different from what is normal in the IGP context. Traditionally, a neighbor is
always a directly connected router. With BGP, this is not the case. Neighbors may be directly connected, but this is not
required. Therefore, BGP relies on an IGP to route between peers that are not directly connected.
BGP uses unicast TCP/IP for neighbor establishment. It is possible for neighbor relationships to be established with any
device that is IP-reachable. There is no guarantee that the neighbor relationship will succeed, because factors such as
firewalls or access control lists may prevent certain types of traffic from passing, but the relationship is possible and
likely to occur.
At the application layer, BGP functions similarly to TCP/IP applications such as Telnet, FTP, and HTTP. BGP is viewed as
an application because it uses well-known TCP port 179.
Generic TCP/IP applications use a 3-way handshake for session establishment. After the session is established, the
applications exchange or negotiate a set of parameters for the session. In Telnet, for example, parameters such as
terminal types and passwords are typically negotiated. If application-level parameters are also acceptable, a session is
established at the application layer and data is exchanged. Periodic user data keeps the session alive and, when the
session is to be terminated, either user input or an inactivity timeout will cause the application session to be torn down.
TCP/IP initiates the 4-way session teardown.
Note: A route that is marked BEST in BGP is not necessarily the one actually used for IP forwarding.
Note: BGP forwards routes only when they are ‘best’ and ‘used’. This is different for VPRN routes.
•BGP status
•BGP neighbours
•BGP interfaces
•BGP Configuration (AS number, Authentication, etc.)
•BGP Policies
•Management Access or CPM Filters
When troubleshooting, this command is an excellent way to quickly check information such as the BGP router ID,
autonomous system, BGP operational state and the status of all BGP neighbours on a router.
The output above shows the neighbour 10.10.10.2 is currently in the “Active” state, which confirms what the client is
seeing.
In the above output, peer 10.10.10.2 is currently in the “Active” state with an error of “Bad BGP Identifier”.
In the above log, R6 is showing an error of “INCORRECT_BGPID” when trying to associate with peer 10.10.10.2.
Four message types are used by BGP to negotiate parameters, exchange routing information and indicate errors. They are:
1. Open Message — After a transport protocol connection is established, the first message sent by each side is an Open message. If
the Open message is acceptable, a Keepalive message confirming the Open is sent back. Once the Open is confirmed, Update,
Keepalive, and Notification messages can be exchanged.
2. Update Message — Update messages are used to transfer routing information between BGP peers. The information contained in
the packet can be used to construct a graph describing the relationships of the various autonomous systems. By applying rules,
routing information loops and some other anomalies can be detected and removed from the inter-AS routing.
3. Keepalive Message — Keepalive messages, consisting of only a 19-octet message header, are exchanged between peers frequently
enough that hold timers do not expire. Keepalive messages are also used to detect whether the link has become unavailable.
4. Notification — A Notification message is sent when an error condition is detected. The peering session is terminated and the BGP
connection (TCP connection) is closed immediately after sending it.
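The 19-octet Keepalive mentioned above follows directly from the fixed BGP message header defined in RFC 4271: a 16-octet marker of all ones, a 2-octet length, and a 1-octet type, with no body. A minimal sketch of building one:

```python
import struct

# Fixed BGP message header (RFC 4271): 16-octet marker of all ones,
# 2-octet length, 1-octet type. A Keepalive carries no body, so the
# whole message is exactly this 19-octet header.
KEEPALIVE = 4

def build_keepalive():
    marker = b"\xff" * 16
    length = 19                       # header only, no body
    return marker + struct.pack("!HB", length, KEEPALIVE)
```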
The output above shows router 10.10.10.6 is sending a BGP ID of 10.10.10.2, which is incorrect.
Flags:
used: route is in use in the RTM
suppressed: route is suppressed by route-flap-damping
history: route has been withdrawn, but logged
decayed: route damping entries that are not valid, but not suppressed
valid: no inconsistency in the BGP attributes
I/e/?: BGP attribute ORIGIN
best: the route has been selected as “best” by the BGP selection process
Network : 10.184.135.0/24
Nexthop : 10.51.0.68
Route Dist. : 12430:700002 VPN Label : 131060
From : 10.51.0.4
Res. Nexthop : 10.49.128.1
Local Pref. : 100 Interface Name : port-1/1/3
Aggregator AS : none Aggregator : none
Atomic Aggr. : Not Atomic MED : none
Community : 12430:2 target:12430:700000
Cluster : 1.1.1.1
Originator Id : 10.51.0.68 Peer Router Id : 10.51.0.4
Flags : Used Valid Best IGP
AS-Path : No As-Path
Modified Attributes
Network : 10.184.135.0/24
Nexthop : 10.51.0.68
Route Dist. : 12430:700002 VPN Label : 131060
From : 10.51.0.4
Res. Nexthop : 10.49.128.1
Local Pref. : 100 Interface Name : port-1/1/3
Aggregator AS : none Aggregator : none
Atomic Aggr. : Not Atomic MED : none
Community : 12430:2 target:12430:700000
Cluster : 1.1.1.1
Originator Id : 10.51.0.68 Peer Router Id : 10.51.0.4
Flags : Used Valid Best IGP
AS-Path : No As-Path
Network : 10.185.70.0/25
Nexthop : 10.51.0.129
Route Dist. : 12430:600001 VPN Label : 131070
To : 10.51.0.5
Res. Nexthop : n/a
Local Pref. : 100 Interface Name : NotAvailable
Aggregator AS : none Aggregator : none
Atomic Aggr. : Not Atomic MED : none
Community : target:12430:600000
Cluster : No Cluster Members
Originator Id : None Peer Router Id : 10.51.0.5
Origin : IGP
AS-Path : No As-Path
Network : 10.185.70.0/25
Nexthop : 10.51.0.129
Route Dist. : 12430:600001 VPN Label : 131070
To : 10.51.0.4
Res. Nexthop : n/a
Local Pref. : 100 Interface Name : NotAvailable
Aggregator AS : none Aggregator : none
Atomic Aggr. : Not Atomic MED : none
Community : target:12430:600000
Cluster : No Cluster Members
Originator Id : None Peer Router Id : 10.51.0.4
Origin : IGP
AS-Path : No As-Path
<slot-number> : [1..10]
<ip-prefix/prefix-*> : ipv4-prefix a.b.c.d (host bits must be 0)
ipv4-prefix-length [0..32]
ipv6-prefix x:x:x:x:x:x:x:x (eight 16-bit pieces)
x:x:x:x:x:x:d.d.d.d
x [0..FFFF]H
d [0..255]D
ipv6-prefix-length [0..128]
<longer> : keyword
<family> : ipv4|ipv6
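The “host bits must be 0” rule in the syntax above is easy to check programmatically; as an illustration using Python's standard ipaddress module (not an SR OS tool):

```python
import ipaddress

def valid_route_prefix(text):
    """Check a prefix string against the 'host bits must be 0' rule
    stated in the syntax above. Works for both IPv4 and IPv6."""
    try:
        ipaddress.ip_network(text, strict=True)  # strict rejects set host bits
        return True
    except ValueError:
        return False
```

For example, 10.0.0.0/8 is a valid prefix, while 10.0.0.1/8 is rejected because its host bits are not zero.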
There are no default route policies. Each policy must be created explicitly and applied to a policy, a routing protocol, or
to the forwarding table. Policy parameters are modifiable.
A filter policy compares the match criteria specified within a filter entry to packets coming through the system, in the
order the entries are numbered in the policy. When a packet matches all the parameters specified in the entry, the
system takes the specified action to either drop or forward the packet. If a packet does not match the entry
parameters, the packet continues through the filter process and is compared to the next filter entry, and so on.
To use the filter log for troubleshooting policies, perform the following:
1. Create a filter log
2. Add the log to the entries you wish to view
3. View the log events
If traffic is running and the log shows no events or no new events, then the packets did not reach that entry. However, if
packets did reach the entry, they can be viewed in the log contents.
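The first-match evaluation and per-entry log counters described above can be sketched as follows. This is a conceptual Python model, not SR OS behavior verbatim; the entry numbers, match fields, and the default action when nothing matches are illustrative assumptions.

```python
# Sketch of first-match IP filter evaluation with per-entry log counters.
# Entry numbers, match fields, and actions are illustrative, not SR OS syntax.

def apply_filter(entries, packet, log_counts):
    """Compare a packet against filter entries in ascending entry order.

    entries: list of (entry_id, match_dict, action); action is 'drop' or
    'forward'. A packet matches an entry only if every field in match_dict
    equals the packet's value. The first matching entry wins and its log
    counter is incremented, so a counter stuck at zero tells you traffic
    never reached that entry.
    """
    for entry_id, match, action in sorted(entries, key=lambda e: e[0]):
        if all(packet.get(k) == v for k, v in match.items()):
            log_counts[entry_id] = log_counts.get(entry_id, 0) + 1
            return action
    return "forward"  # default action when no entry matches (assumption)

# A misordered policy like the example: the generic drop at entry 5
# shadows the more specific forward entry numbered after it.
entries = [
    (5, {"protocol": "icmp"}, "drop"),
    (10, {"protocol": "icmp", "src": "10.10.10.4"}, "forward"),
]
log_counts = {}
action = apply_filter(entries, {"protocol": "icmp", "src": "10.10.10.4"},
                      log_counts)
# Entry 5 matched first, so entry 10's counter never increments.
```

Because evaluation stops at the first match, reordering entry 10 ahead of entry 5 is what would let the specific traffic through.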
Example Above:
ip-filter 1 is applied on the ingress of all network interfaces on R6. It is configured incorrectly because the generic drop entry is
placed before the more specific forward entries.
•In step 1, a filter log is created.
•In step 2, the filter log is applied to an entry to see whether the traffic reaches that entry. If the traffic is processed by a certain
entry, the output is logged in the log file. If the traffic does not reach the entry, the number of entries logged will equal 0. The
above output shows that entry 5 is dropping the ICMP traffic from 10.10.10.4.
•In step 3, the filter log is confirming that entry 5 in the policy is dropping ICMP traffic from 10.10.10.4.
Provider router
Provider (P) routers are located in the provider core network. The P router supports the service provider’s bandwidth
and switching requirements over a geographically dispersed area. The P router does not connect directly to the
customer equipment.
In the above slide, traffic flows from left-to-right. The flow of MPLS labeled packets in the other direction, that is, right-
to-left, would be represented by another LSP pointing in the reverse direction. In this case, the roles of the iLER and
eLER routers in the figure would be swapped.
The encapsulation and forwarding of packets using labels is also referred to as tunneling; thus, LSPs are often called
tunnels. Tunnels must be established prior to the arrival of data packets. Label negotiation and distribution protocols
are used to build the tunnels with negotiated label values. The details of these control processes and exact mechanisms
of MPLS protocols will be covered in the upcoming modules.
Label Distribution Protocol (LDP) is a protocol used to distribute labels in non-traffic-engineered applications. LDP
allows routers to establish label switched paths (LSPs) through a network by
mapping network-layer routing information directly to data link layer-switched paths.
The table above provides a high level comparison of LDP and RSVP-TE as label signaling protocols.
Running this command on all other routers within the network reveals LDP is up and all required sessions and peers are
operationally up as well.
For this example, all LDP parameters throughout the network are correct.
Another option is to use the ‘show router ldp discovery’ command as it displays the status of the Hello adjacency
discovery and whether any neighbors have been discovered on those interfaces.
R5 shows three link sessions to neighboring routers and one targeted session to the far-end of SDP ID 8, which is correct.
All other routers in the network have the correct LDP sessions in the Established state.
The output above shows there is no entry for 10.10.10.8/32 in the LIB, which is why the SDP is operationally down.
Follow the best path to R8 to see where the label is not forwarded.
The output above shows there is no entry for 10.10.10.8/32, which means R1 is aware of R8 but is not forwarding its
label to any of the neighbors.
SR12>config>log>log-id 3
SR12>config>log>log-id$ from debug-trace
SR12>config>log>log-id$ to session
SR12>config>log>log-id$ no shutdown
2) To stop the “debug”, use either of the following commands to stop the debug at different levels:
debug router bgp no packet - Disable debugging for BGP packets
debug router no bgp - Disable debugging for all BGP messages
no debug - Disable debugging for all applications
RSVP requests resources for simplex flows, that is, in one direction only. Therefore, RSVP treats a sender as logically
distinct from a receiver, although the same application process may act as both a sender and a receiver at the same
time. Duplex flows require two LSPs, one to carry traffic in each direction.
RSVP is not a routing protocol. It operates with unicast and multicast routing protocols that determine where packets
are forwarded. RSVP consults local routing tables to relay its messages.
Provide access to detailed and customized path information to signal LSPs, which can be completely different from
IGP best path decisions.
Facilitate the use of additional administratively defined attributes for links that enable more complex dynamic path
calculations, to increase resiliency and resource efficiency. This is an improvement to IGP shortest path calculations,
which are restricted by their use of a single parameter or metric, called the link cost, for Link-State Protocols.
Provide access to preferred features, such as secondary path or Fast Reroute protection, which help to improve the
convergence times offered by standard routing protocols.
Take resource reservation information into account during the LSP establishment process, which ensures that LSPs
only traverse routers that have sufficient resources available. This is called Connection Admission Control (CAC).
Using CAC allows operators to prevent resource overbooking.
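The CAC behavior described above can be sketched as a simple reserve-or-reject decision per link along the path. This is a conceptual Python model; the link names, capacities, and all-or-nothing reservation policy are illustrative assumptions.

```python
# Minimal sketch of Connection Admission Control (CAC): an LSP is only
# admitted if every link on its path has enough unreserved bandwidth
# left for the request. Link names and numbers are illustrative.

def admit_lsp(links, path, request_mbps):
    """Reserve request_mbps on every link of the path, or on none at all."""
    if any(links[l]["reserved"] + request_mbps > links[l]["capacity"]
           for l in path):
        return False  # insufficient resources somewhere along the path
    for l in path:
        links[l]["reserved"] += request_mbps
    return True

links = {"R1-R2": {"capacity": 100, "reserved": 80},
         "R2-R3": {"capacity": 100, "reserved": 10}}
ok = admit_lsp(links, ["R1-R2", "R2-R3"], 30)   # rejected: R1-R2 would exceed 100
ok2 = admit_lsp(links, ["R1-R2", "R2-R3"], 20)  # admitted: fits on both links
```

Rejecting the path up front, rather than overbooking and dropping traffic later, is exactly what prevents resource overbooking.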
This animated slide shows the Resv message being dropped and therefore never reaching the tunnel head. After a
hold-down timer (90 seconds), the router tears down the primary path and uses a secondary one if one is available. If
no secondary path is available, the primary path is torn down and re-signaling is attempted. If that is not possible, the
path remains down.
refresh-time
Syntax refresh-time seconds
Context config>router>rsvp
Description The refresh-time controls the interval, in seconds, between successive Path and Resv refresh
messages. RSVP declares the session down after it misses keep-multiplier consecutive refresh messages.
Default 30 seconds
Parameters seconds — The refresh time in seconds.
Values 1 – 65535
keep-multiplier
Syntax keep-multiplier number
Context config>router>rsvp
Description The keep-multiplier number is an integer used by RSVP to declare that a reservation or the
neighbor is down.
Default 3
Parameters number — The keep-multiplier value.
Values 1 - 255
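Taken together, the two parameters above determine how long RSVP waits before declaring a session down. The simplified course-level arithmetic can be sketched as follows (the actual RFC 2205 state-lifetime formula adds jitter margins, which are omitted here).

```python
# Per the descriptions above, RSVP declares a session down after missing
# keep-multiplier consecutive refresh messages, so the worst-case
# detection time is roughly refresh_time * keep_multiplier. This is the
# simplified view; RFC 2205's lifetime formula adds jitter margins.

def rsvp_detect_seconds(refresh_time=30, keep_multiplier=3):
    return refresh_time * keep_multiplier

# With the defaults shown above (30 s refresh, multiplier 3), a dead
# neighbor is detected after about 90 seconds -- consistent with the
# 90-second hold-down mentioned for tearing down a primary path.
```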
Refer to the LAB guide for suggested actions based on the error Codes displayed in the above command for LSPs that
are in the down state.
Note:
When reading the Failure Code of an LSP, wait for the retry period to elapse: the Failure Code may change on the
next retry, because the router attempts to compute a path again in every retry period if the last attempt to set up
the LSP failed.
lsp-ping
In-band LSP ping utility to verify LSP connectivity.
lsp-trace
In-band LSP traceroute command to determine the hop-by-hop path for an LSP.
The Flag value is very important for any troubleshooting of SAPs. Refer to Appendix B of the Lab guide for possible
problems and suggested actions based on the Flag Values.
In the output above, the Flag is set to “PortMTUTooSmall”. The previous slide showed the service MTU set to 1514, and
this slide shows the port MTU at 1514 as well. Because dot1q encapsulation is being used, the port MTU must be
greater than the service MTU by 4 bytes.
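The check behind the "PortMTUTooSmall" flag can be sketched numerically. The per-encapsulation overheads below are the commonly cited values for Ethernet SAPs (4 bytes per 802.1Q tag); treat them as illustrative rather than an exhaustive SR OS table.

```python
# Sketch of the port-MTU check behind "PortMTUTooSmall" for an Ethernet
# SAP: the port MTU must exceed the service MTU by the encapsulation
# overhead (4 bytes for a single dot1q tag, 8 for qinq -- assumed values).

ENCAP_OVERHEAD = {"null": 0, "dot1q": 4, "qinq": 8}

def sap_mtu_ok(service_mtu, port_mtu, encap="dot1q"):
    return port_mtu >= service_mtu + ENCAP_OVERHEAD[encap]

# The failing case from the slide: service MTU 1514, port MTU 1514.
sap_mtu_ok(1514, 1514)   # False -> PortMTUTooSmall
sap_mtu_ok(1514, 1518)   # True  -> the flag clears
```

Raising the port MTU to 1518 (or lowering the service MTU to 1510) would therefore resolve the flag in this example.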
The Flag value is very important for any troubleshooting of sdp bindings. Refer to Appendix B of the Lab guide for
possible problems and suggested actions based on the Flag Values.
Note:
It is important to use a non-existent source address so the customer ARP cache is not polluted with invalid ARP entries.
In the output above, sdp:443:4000 to R7 (10.10.10.7) is operationally Down. This is why the CPE ping command in the
previous slide fails.
To fix the problem, set the administrative state of interface to-R3 to UP. Once this is completed, view the service to see
if the mesh-sdp binding is operationally UP.
The show service id <id> fdb command displays FDB information for a particular service. The output above shows the table size
set to 5, while there are currently 4 entries in the FDB. The high watermark is set to 80%, which means an alarm should be
present. Alarms can be viewed in log-id 99.
===============================================================================
Event Log 99
===============================================================================
Description : Default System Log
Memory Log contents [size=500 next event=516 (wrapped)]
Use the command show service fdb-info to display global FDB info.
Use the command show service fdb-mac to display global FDB entries or an FDB entry for a particular MAC.
To clear the FDB, use the clear service id <id> fdb command.
For a round-trip test, SDP Ping uses a local egress SDP ID and an expected remote SDP ID. Since SDPs are unidirectional
tunnels, the remote SDP ID must be specified and must exist as a configured SDP ID on the far-end 7750 SR. SDP round
trip testing is an extension of SDP connectivity testing with the additional ability to test:
Remote SDP ID encapsulation
Potential service round trip time
Round trip path MTU
Round trip forwarding class mapping
Service Ping operates at a higher level than the SDP diagnostics in that it verifies an individual service and not the
collection of services carried within an SDP.
Service Ping is initiated from a 7750 SR router to verify round-trip connectivity and delay to the far-end of the service.
Nokia's implementation functions for both GRE and MPLS tunnels and tests the following from edge-to-edge:
Tunnel connectivity
Service (VC) label mapping verification
Service existence
Service provisioned parameter verification
Round trip path verification
Service dynamic configuration verification
SDP Path Used determines whether the Originating 7750 SR used the originating SDP-ID to send the svc-ping request.
If a valid originating SDP-ID is found operational and has a valid egress service label, the originating 7750 SR should use
the SDP-ID as the requesting path. If the originating 7750 SR uses the originating SDP-ID as the request path, “Yes” is
displayed. If the originating 7750 SR does not use the originating SDP-ID as the request path, “No” is displayed. If the
originating SDP-ID is non-existent, N/A is displayed.
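The Yes/No/N/A decision described above reduces to a small amount of logic, sketched here in Python. The field names are illustrative; the real inputs come from the SDP's operational state.

```python
# Sketch of the "SDP Path Used" decision described above. Field names
# are illustrative assumptions; the real data is the SDP's oper state
# and egress service label.

def sdp_path_used(sdp):
    if sdp is None:
        return "N/A"     # originating SDP-ID does not exist
    if sdp["oper_up"] and sdp["egress_label"] is not None:
        return "Yes"     # SDP-ID is used as the requesting path
    return "No"          # SDP exists but cannot carry the request

sdp_path_used({"oper_up": True, "egress_label": 131065})  # "Yes"
sdp_path_used({"oper_up": False, "egress_label": None})   # "No"
sdp_path_used(None)                                       # "N/A"
```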
In the control plane, a MAC ping is forwarded along the flooding domain if no MAC address bindings exist. If MAC address
bindings exist, then the packet is forwarded along those paths (if they are active). Finally, a response is generated only
when there is an egress SAP binding to that MAC address. A control plane request is responded to via a control reply
only.
In the data plane, a MAC ping is sent with a VC label TTL of 255. This packet traverses each hop using forwarding plane
information for next hop, VC label, etc. The VC label is swapped at each service-aware hop, and the VC TTL is
decremented. If the VC TTL is decremented to 0, the packet is passed on to the management plane for processing. If
the packet reaches an egress node, and would be forwarded out a customer facing port, it is identified by the OAM label
below the VC label and passed to the management plane.
MAC pings are flooded when they are unknown at an intermediate node. They are responded to only by the egress
nodes that have mappings for that MAC address.
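The data-plane TTL walk described above can be sketched as a loop over service-aware hops. The hop representation is a deliberate simplification: only the two punt conditions (VC TTL expiry and an egress SAP binding for the MAC) are modeled.

```python
# Sketch of the data-plane MAC ping behavior described above: the VC TTL
# is decremented at each service-aware hop; a hop that decrements it to
# 0, or the egress node with a SAP binding for the target MAC, hands the
# packet up to the management plane. Hop modeling is illustrative.

def mac_ping_walk(path_hops, vc_ttl, target_mac):
    """Return the index of the hop where the packet is punted, or None."""
    for i, hop in enumerate(path_hops):
        vc_ttl -= 1
        if vc_ttl == 0:
            return i  # TTL expired: punted mid-path for processing
        if target_mac in hop.get("egress_macs", set()):
            return i  # egress node: identified via the OAM label
    return None

hops = [{"egress_macs": set()}, {"egress_macs": {"00:aa:bb:cc:dd:ee"}}]
mac_ping_walk(hops, 255, "00:aa:bb:cc:dd:ee")  # punted at the egress hop
mac_ping_walk(hops, 1, "00:aa:bb:cc:dd:ee")    # punted at the first hop
```

Sending with a low initial VC TTL is what turns this into a hop-by-hop trace rather than an end-to-end ping.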
For MAC trace requests sent by the control plane, the destination IP address is determined from the control plane
mapping for the destination MAC. If the destination MAC is known to be at a specific remote site, then the far-end IP
address of that SDP is used. If the destination MAC is not known, then the packet is sent unicast to all SDPs in the
service, with the appropriate squelching.
A control plane MAC traceroute request is sent via UDP/IP. The destination UDP port is the LSP ping port. The source
UDP port is whatever the system gives (note that this source UDP port is really the demultiplexor that identifies the
particular instance that sent the request, when correlating the reply). The source IP address is the system IP of the
sender.
When a traceroute request is sent via the data plane, the data plane format is used. The reply can be via the data plane
or the control plane. A data plane MAC traceroute request includes the tunnel encapsulation, the VC label, and the
OAM, followed by an Ethernet DLC, a UDP and IP header. If the mapping for the MAC address is known at the sender,
then the data plane request is sent down to the known SDP with the appropriate tunnel encapsulation and VC label. If it
is not known, then it is sent down every SDP (with the appropriate tunnel encapsulation per SDP and appropriate egress
VC label per SDP binding).
The tunnel encapsulation TTL is set to 255. The VC label TTL is initially set to the min-ttl (default is 1). The OAM label
TTL is set to 2. The destination IP address is the all-routers multicast address. The source IP address is the system IP of
the sender. The destination UDP port is the LSP ping port. The source UDP port is whatever the system gives (note that
this source UDP port is really the demultiplexor that identifies the particular instance that sent the request, when
correlating the reply). The Reply Mode is either 3 (i.e., reply via the control plane) or 4 (i.e., reply via the data plane),
depending on the reply-control option. By default, the data plane request is sent with Reply Mode 3 (control plane
reply).
The Ethernet DLC header source MAC address is set to either the system MAC address (if no source MAC is specified) or
to the specified source MAC. The destination MAC address is set to the specified destination MAC. The Ethertype is set
to IP.
The MAC populate request is sent with a VC TTL of 1, which means that it is received at the forwarding plane at the first
hop and passed directly up to the management plane. The packet is then responded to by populating the MAC address
in the forwarding plane, like a conventional learn, although the MAC will be an OAM-type MAC in the FIB to distinguish it
from customer MAC addresses.
This packet is then taken by the control plane and flooded out the flooding domain (squelching appropriately, the
sender and other paths that would be squelched in a typical flood).
This controlled population of the FIB is very important to manage the expected results of an OAM test.
The same functions are available by sending the OAM packet as a UDP/IP OAM packet. It is then forwarded to each hop
and the management plane has to do the flooding.
Options for MAC Populate are:
1. Force the MAC in the table to type OAM (in case it already existed as dynamic, static, or an OAM-induced learning
with some other binding).
2. Prevent new dynamic learning from overwriting the existing OAM MAC entry.
3. Allow customer packets with this MAC to either ingress or egress the network, while still using the OAM MAC entry.
Finally, an option to flood the MAC Populate request causes each upstream node to learn the MAC (i.e., populate the
local FIB with an OAM MAC entry), and to flood the request along the data plane using the flooding domain.
An age can be provided to age a particular OAM MAC after a different interval than other MACs in a FIB.
The oam mac-purge command is used to clear the FIBs of any learned information for a particular MAC address. This
allows one to do a controlled OAM test without learning induced by customer packets. In addition to clearing the FIB of a
particular MAC address, the purge can also direct the control plane not to allow further learning from customer packets.
This allows the FIB to be clean, and be populated only via a MAC Populate.
Virtual Private Routed Networks allow multiple customer sites to communicate securely at the IP
level over a provider-managed MPLS network.
A VPRN service distributes its customer’s routing information using MP-BGP and forwards their data
packets using MPLS (or GRE) tunnels.
As shown in the slide, a single provider-managed IP/MPLS infrastructure permits the deployment of
multiple, distinct customer routed networks that are fully isolated from each other.
Each PE can maintain multiple separate VRFs based on the number of customer sites it connects to.
Note: there is one MP-BGP session between PEs, and the same session is used to forward routing updates for different
VRFs.
The receiving PE will receive a VPN-IPv4 route from a remote PE and install the IPv4 route into the corresponding VRF.
The receiving PE must have criteria on which to base this decision, since it is receiving routes from many different PEs,
each serving many different customers.
The route distinguisher is defined solely to create unique VPN-IPv4 addressing, which allows overlapping addresses
from different customers to be transported uniquely across the provider core. It is never intended to define VPRN
membership and, therefore, a new identifier, separate from the route distinguisher, is required to associate a route with
a VRF and define the VPRN membership of the route.
A route target is defined to address this issue. A route target (RT) is the closest approximation to a VPRN membership
identifier in the VPRN architecture, and identifies the VRF table that a prefix is associated with to the receiving PE. One
or more route targets can be associated to any route.
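The RT-based import behavior described above can be sketched as a set-intersection test per VRF. This is a conceptual Python model; the VRF names, RT values, and the prefix reuse the examples from this module's outputs purely for illustration.

```python
# Sketch of RT-based import: a received VPN-IPv4 route is installed into
# every VRF whose configured import targets intersect the route's RT
# list; with no match anywhere, the route is silently dropped. VRF names
# and RT values are illustrative.

def import_route(vrfs, route):
    installed = []
    for name, vrf in vrfs.items():
        if vrf["import_rts"] & set(route["rts"]):
            vrf["table"].append(route["prefix"])
            installed.append(name)
    return installed  # empty list => route silently dropped

vrfs = {
    "VPRN100": {"import_rts": {"target:12430:700000"}, "table": []},
    "VPRN200": {"import_rts": {"target:12430:600000"}, "table": []},
}
route = {"prefix": "10.184.135.0/24", "rts": ["target:12430:700000"]}
installed = import_route(vrfs, route)  # installed only into VPRN100
```

Note that the route distinguisher plays no part in this decision; it only made the VPN-IPv4 address unique in transit.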
In simple VPRN cases and for provisioning consistency, the route target value chosen is often the same as the route
distinguisher value; however, they should not be interpreted as meaning the same thing.
A properly designed and implemented core of route reflectors, or a combination of full mesh and route reflectors, is
considered to be the equivalent of a full mesh.
Transport tunnels must be created between the PEs. The transport tunnel is either a MPLS LSP or a GRE point-to-point
tunnel between PEs.
These tunnels serve as the label switched paths the customer packets will take as they cross the provider core network.
Each PE involved in a given VPRN service must be configured with a tunnel to every other PE participating in the same
VPRN service in order to transport a customer’s VPRN traffic from one site to another.
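Because these transport tunnels are unidirectional, a quick way to size a full mesh is the n*(n-1) formula, sketched below (this counts one tunnel per direction per PE pair, under the assumption that every PE participates in the service).

```python
# Since transport tunnels (SDPs/LSPs) are unidirectional, a full mesh of
# n PEs participating in the same VPRN needs n*(n-1) tunnels: two per PE
# pair, one in each direction.

def full_mesh_tunnels(num_pes):
    return num_pes * (num_pes - 1)

full_mesh_tunnels(4)   # a 4-PE mesh like R5/R6/R7/R8 needs 12 tunnels
```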
show router bgp summary - This command displays a summary of BGP neighbor information. It is an excellent
command to use when checking which BGP neighbors are operational due to the smaller output as compared to the
show router bgp neighbor command.
Executing these commands on routers R5, R6, R7 and R8 show that all required BGP neighbors are operationally up and
VPN-IPv4 (MP-BGP) is running.
Verify that the expected routes from remote PEs are in the global BGP routing table (BGP RIB-IN). If there are no VPRNs
with an import target on the PE that matches the target in a received route, the PE router will silently drop the
advertised route.
Use the show router bgp routes <prefix> hunt command to verify the networks and communities within both the RIB-
IN and RIB-OUT.
The output from R5 above shows a missing RIB-IN entry for 192.168.40.1/24. The RIB-OUT is advertising entries to
all 3 other routers within the VPRN.
Use the following steps to check if the transport tunnel to the next-hop of the VPRN route is operational.
From the output of show service id 100 base, auto-bind LDP can be seen as the type of transport being used.
Verify that R5 has an active binding to R8 and R8 has an active binding to R5. From the output above R5 and R8 have an
active binding to each other.
show service <id> base will show if there are any import or export policies configured for the service.
From the output above, the VPRN 100 on both R5 and R8 have an import and export policy configured.
Use the show router policy <policy-name> command to view the configuration of each policy. There seems to be an error in the
configuration of the policy “export-VPRN100” on R8. When adding a community, the name of the community must
be specified, not the route target identifier. R8 above is showing a route target identifier of “target:65535:100” as
the community. This needs to be changed to the correct community name of “exportVPRN100”.
Queues used for different types of traffic can have properties tailored to the type of traffic for which they are used, for example:
• Length/size of queue, based on delay requirements and burstiness of traffic.
• Discard policies, based on application’s sensitivity to packet loss.
Scheduling packets out of their queues is an extremely complex and variable process. There are a few types of
scheduling available. Strict Priority and Round Robin are examples of simple schedulers, which service queues in order,
regardless of the amount of traffic transmitted from them. Fair Queuing is an adaptive scheduler, which dynamically
adapts the allocation of servicing time to each queue, based on the amount of traffic (for example, the number of bits)
recently transmitted from that queue.
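The adaptive behavior described above can be sketched with a deficit round-robin pass, a common fair-queuing variant used here purely as a conceptual stand-in (the 7750's actual schedulers are described in the following paragraphs; the queue names and quantum are illustrative).

```python
from collections import deque

# Sketch of an adaptive scheduler in the spirit of the Fair Queuing
# description above, using deficit round-robin: each queue earns a byte
# quantum per round, so a heavy queue cannot starve a light one. Queue
# names, packet sizes, and the quantum are illustrative.

def drr_round(queues, quantum=1500):
    """queues: dict name -> deque of packet sizes. Returns the packets
    sent in one scheduling round as (queue, size) tuples."""
    sent = []
    for name, q in queues.items():
        credit = quantum
        while q and q[0] <= credit:   # send while the head packet fits
            size = q.popleft()
            credit -= size
            sent.append((name, size))
    return sent

queues = {"ef": deque([500, 500, 500, 500]), "be": deque([1500, 1500])}
sent = drr_round(queues)  # ef sends three 500-byte packets, be sends one
```

The byte-based credit is what makes the division fair in bits rather than in packets, matching the "amount of traffic recently transmitted" criterion above.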
The first possibility is the Default Scheduler. This scheduler, enabled by default, gives priority first to the Expedited
queues within CIR, then to the Best Effort Queues within CIR and then, finally, to all queues above CIR. The default
scheduler is also called a single-tier hardware scheduler because all the queues are linked directly into the FFPC
scheduler.
The second possibility (Release 4 onwards) is the Hierarchical Scheduler, which overrides the default scheduler and can
only be implemented on access ports (ingress and/or egress). This type of scheduling distinguishes eight strict priority
levels among queues serviced within CIR, and eight levels above CIR. Each level then divides the bandwidth, using
Weighted Fair Queuing.
The third type of scheduler (Release 6 onwards) is the Egress Port Scheduler, which can only be implemented on egress
ports (access and/or network). It overrides the Default Scheduler and can be used in combination with hierarchical
scheduling. This type of scheduling can be seen as an advanced technique, if compared to the hierarchical scheduling
method. The two approaches are very similar, but the Egress Port Scheduler takes the real speed of the line and the
lower OSI-levels’ overhead into consideration, and delivers a much more granular division of bandwidth to the network
egress port (previously only default scheduling).
The Hierarchical and Egress Port schedulers are multi-tiered virtual schedulers. Queues are linked in a family-like
structure so that “child” queues or schedulers can only receive bandwidth from their virtual “parent” schedulers, which
exist between the queues and the master hardware scheduler of the FFPC. Through hierarchical virtual schedulers, the
Nokia 7750 allows service providers to design customized and sophisticated scheduling. By associating queues with a
hierarchical virtual scheduler, the PIR and CIR of multiple queues can be dynamically modified. Ultimately, all queues and
schedulers are linked to the FFPC hardware scheduler.
Because these values are running totals, it is a good idea to clear the statistics before running these commands.
Comparing this to R7, we see FC ef also assigned to Queue 3; however, the CIR and PIR are set much higher, at 2000.
From this output, it appears the value of 200 for CIR and PIR on R5 is incorrect and too low.
The source sends a single copy of a packet that is addressed to a group of receivers. This group is a logical entity to
which any device can choose to listen (or not) at any time. Efficient, low-overhead transport layer services are provided
by UDP.
Unlike unicast traffic, only one copy of any packet is required regardless of the number of receivers, as many receivers
can join the same group and therefore all can receive a copy of the same packet.
Unlike broadcast traffic, multicast will span across routers if configured to do so. Devices that do not want the packet
may not receive the data at all, and if they do receive the frame, they will discard it at Layer 2.
The lines representing the data flow show only a sample of how multicast data would be distributed in this network if
the routers were multicast-enabled and Receivers 1, 3 and 6 had joined the multicast group.
It is important to note that the source is sending a single copy of a packet, but it is being replicated at several points in
the network. Router B receives one packet from the source and sends out two. Similarly, Router D receives a single
packet and sends out two.
This packet replication by the routers is something not typically seen in a unicast environment.
RPF Check
Multicast packets are handled in a different fashion than unicast packets when they are received by a router. Once the frame has been
received and the packet is accepted by the router, a validation is always performed on the source IP address of each received packet.
This check is called the Reverse Path Forwarding (RPF) check. The packets that fail are silently discarded. This is in contrast to the
unicast IP world, where a router commonly will generate an Internet Control Message Protocol (ICMP) message when it discards a
packet.
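The RPF check described above can be sketched as a comparison between the arrival interface and the unicast routing table's view of the source. The exact-match route lookup below is a simplification (a real router does a longest-prefix match); routing-table contents are illustrative.

```python
# Sketch of the Reverse Path Forwarding check described above: a
# multicast packet is accepted only if it arrived on the interface the
# unicast routing table would use to reach the packet's source.
# Exact-match lookup is a simplification of longest-prefix matching.

def rpf_check(unicast_routes, source_ip, arrival_interface):
    """unicast_routes: dict mapping a source to its best interface.
    Packets that fail are silently discarded (no ICMP is generated)."""
    expected = unicast_routes.get(source_ip)
    return expected is not None and expected == arrival_interface

routes = {"10.10.10.66": "to-R1"}
rpf_check(routes, "10.10.10.66", "to-R1")  # accepted
rpf_check(routes, "10.10.10.66", "to-R3")  # failed -> silently dropped
```

This dependence on the unicast table is why a broken or asymmetric unicast route so often shows up as "missing" multicast traffic.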
Querier Election
Host Membership Query messages are sent by routers to the local network control block address of 224.0.0.1 for all multicast-enabled
devices. In order to discover which multicast groups have receivers present, the router issues a query on each of its attached
interfaces on which IGMP is enabled. The queries are sent every 125 seconds by default. The router elected as the Querier is the one
that controls the query messages.
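The election rule itself is not spelled out above; under IGMPv2 the router with the numerically lowest IP address on the subnet becomes the Querier. That assumed rule can be sketched as:

```python
import ipaddress

# The slide notes that one router is elected Querier; under IGMPv2 the
# rule is "lowest IP address on the subnet wins" -- a detail assumed
# here, since it is not stated in the text above.

def elect_querier(router_ips):
    return min(router_ips, key=lambda ip: ipaddress.IPv4Address(ip))

elect_querier(["10.1.1.2", "10.1.1.1", "10.1.1.9"])  # -> "10.1.1.1"
```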
The output above shows we have a group; however, it is showing as (*,G), which means it does not know about the
source. It should be showing (S,G) instead.
The output above shows a Curr Fwding Rate = 0.0 kbps, which means there is no traffic flowing. This command can also
be used to verify if traffic is flowing, once the issue is fixed.
The above output shows that PIM is enabled on all appropriate interfaces for R7.
The output above shows R7 having two neighbors that are operational.
The output above shows R7 has an RP configured with an address of 10.10.10.1 and group address of 239.0.0.0/8
which is correct.
At this point, all configuration on R7 looks correct so the troubleshooting efforts will turn to the RP, which is R1
(10.10.10.1).
Flag Values:
tunnel - a multicast path exists between the source (10.10.10.66 in the example above) and the named host via
encapsulated IP. Lines with no “tunnel” are native multicast links to other multicast routers within the network.
pim - router is running the PIM protocol.
querier - router is an IGMP querier.
leaf - router is a leaf router (directly attached to the receivers).