VPC Nexus 7k
This article provides guidance for troubleshooting issues that may appear when using the Cisco Nexus 7000 Series. It introduces
tools and methodologies to recognize a problem, determine its cause, and find possible solutions. However, this
documentation helps only with basic troubleshooting.
We encourage users to review the Cisco Live presentation for detailed Nexus 7000 troubleshooting.
Sections of this presentation cover both platform-independent and platform-specific step-by-step troubleshooting for the most
common issues. Access to this presentation is free. Follow the instructions below to access the presentation.
1. Visit https://www.ciscolivevirtual.com/
4. Click the "Sessions" tab at the top, and select "2011 Sessions Catalog".
6. Select the session. You can either view the session or download the PDF.
Contents
• 1 Audience and Generating a PDF of This Guide
• 2 Organization
Organization
This article is organized into the following sections:
Troubleshooting Overview
Troubleshooting Licensing
Cisco Nexus 7000 Series NX-OS Troubleshooting Guide
Troubleshooting VDCs
Troubleshooting CFS
Troubleshooting Ports
Troubleshooting vPCs
Troubleshooting VLANs
Troubleshooting STP
Troubleshooting Routing
Troubleshooting WCCP
Troubleshooting Memory
This article introduces the basic concepts, methodology, and general troubleshooting guidelines for problems that may occur when
configuring and using Cisco NX-OS.
Guide Contents
Troubleshooting Overview (this section)
Troubleshooting Installs, Upgrades, and Reboots
Troubleshooting Licensing
Troubleshooting VDCs
Troubleshooting CFS
Troubleshooting Ports
Troubleshooting vPCs
Troubleshooting VLANs
Troubleshooting STP
Troubleshooting Routing
Troubleshooting Unicast Traffic
Troubleshooting WCCP
Troubleshooting Memory
Troubleshooting Packet Flow Issues
Troubleshooting FCoE
Before Contacting Technical Support
Troubleshooting Tools and Methodology
Contents
• 1 Overview of the Troubleshooting Process
♦ 1.1 Gathering Information
♦ 1.2 Verifying Ports
♦ 1.3 Verifying Layer 2 Connectivity
♦ 1.4 Verifying Layer 3 Connectivity
• 2 Overview of Symptoms
• 3 System Messages
♦ 3.1 System Message Text
♦ 3.2 syslog Server Implementation
• 4 Troubleshooting with Logs
• 5 Troubleshooting Modules
• 6 Viewing NVRAM logs
• 7 Contacting Customer Support
• 8 See Also
• 9 Further Reading
• 10 External Links
Note: View the Cisco Nexus 7000 instructional videos for an overview of Cisco NX-OS.
Gathering Information
This section describes the tools that are commonly used to troubleshoot problems within your network. Specific troubleshooting
articles may include additional tools and commands specific to the symptoms and possible problems covered in that article.
Note: You should have an accurate topology of your network to isolate problem areas. Contact your network architect for this
information.
Use the following commands to gather general information on your device:
• show module
• show version
• show running-config
• show logging log
• show interface brief
• show vlan
• show spanning-tree
• show {ip | ipv6} routing
• show processes | include ER
• show accounting log
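The commands above can be collected in a single pass. The following is a minimal sketch and is not part of the original guide. It assumes that you supply your own `run_command` callable (a hypothetical name, for example a wrapper around an SSH session) that accepts a command string and returns the output as text. The `{ip | ipv6}` choice is expanded into two commands.

```python
from typing import Callable, Dict, List

# Commands listed in this section; {ip | ipv6} is expanded into two entries.
GENERAL_INFO_COMMANDS: List[str] = [
    "show module",
    "show version",
    "show running-config",
    "show logging log",
    "show interface brief",
    "show vlan",
    "show spanning-tree",
    "show ip routing",
    "show ipv6 routing",
    "show processes | include ER",
    "show accounting log",
]


def gather_general_info(run_command: Callable[[str], str]) -> Dict[str, str]:
    """Run each command once and return a mapping of command -> output.

    run_command is a hypothetical, caller-supplied function. A failure on
    one command is recorded and does not stop the remaining commands.
    """
    results: Dict[str, str] = {}
    for command in GENERAL_INFO_COMMANDS:
        try:
            results[command] = run_command(command)
        except Exception as exc:  # keep going so one failure does not lose the rest
            results[command] = f"ERROR: {exc}"
    return results
```

Saving the returned mapping to a file before you change anything gives you a baseline to compare against later.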
Verifying Ports
Answer the following questions to verify that your ports are connected correctly and are operational:
• Are you using the correct media (copper, optical, fiber type)?
• Is the media broken or damaged?
• Is the port LED green on the module?
• Is the interface in the correct VDC?
Use the show vdc membership command to check which VDC the interface is a member of. You must log
into the device with the network-admin role to use this command.
Use the show interface brief command. The status should be up.
Verifying Layer 2 Connectivity
• Use the show vlan all-ports command to verify that all the necessary interfaces are in the same VLAN. The status should
be active for the VLAN.
• Use the show port-channel compatibility-parameters command to verify that all the ports in a port channel are
configured the same for the speed, the duplex, and the trunk mode.
• Use the show running-config spanning-tree command to verify that the Spanning Tree Protocol (STP) is configured the
same on all devices in the network.
• Use the show processes | include ER command to verify that no essential Layer 2 processes are in the error state.
• Use the show spanning-tree blockedports command to display the ports that are blocked by STP.
• Use the show mac address-table dynamic vlan command to determine if learning or aging is occurring at each node.
See Troubleshooting VLANs and Troubleshooting STP for more information on troubleshooting Layer 2 issues.
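The port-channel check above amounts to confirming that every member port has the same speed, duplex, and trunk mode. The sketch below illustrates that comparison only. It assumes the settings have already been collected into a dictionary; the field names `speed`, `duplex`, and `trunk_mode` and the function name are hypothetical, and the sketch does not parse real command output.

```python
from typing import Dict, List

# Settings the text says must match across all ports in a port channel.
COMPARED_SETTINGS = ("speed", "duplex", "trunk_mode")


def find_mismatches(ports: Dict[str, Dict[str, str]]) -> Dict[str, List[str]]:
    """Return {setting: [ports that differ from the first port]}.

    ports maps an interface name to its settings. The first port in the
    mapping is used as the reference for the comparison.
    """
    mismatches: Dict[str, List[str]] = {}
    if not ports:
        return mismatches
    names = list(ports)
    reference = ports[names[0]]
    for setting in COMPARED_SETTINGS:
        differing = [
            name for name in names[1:]
            if ports[name].get(setting) != reference.get(setting)
        ]
        if differing:
            mismatches[setting] = differing
    return mismatches
```

An empty result means the compared settings agree on every port; any key in the result names a setting to correct.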
Verifying Layer 3 Connectivity
• Have you configured the same dynamic routing protocol parameters throughout your routing domain or configured static
routes?
• Are any IP access lists, filters, or route maps blocking route updates?
• show arp
• show ip routing
• show platform forwarding
See Ping and Traceroute to verify Layer 3 connectivity. See Troubleshooting Routing for more information on troubleshooting
Layer 3 issues.
Overview of Symptoms
This article uses a symptom-based troubleshooting approach that allows you to diagnose and resolve your Cisco NX-OS problems
by comparing the symptoms that you observed in your network with the symptoms listed in each chapter.
By comparing the symptoms in this publication to the symptoms that you observe in your own network, you should be able to
diagnose and correct software configuration issues and inoperable hardware components so that the problems are resolved with
minimal disruption to the network. Those problems and corrective actions include the following:
System Messages
The system software sends syslog (system) messages to the console (and, optionally, to a logging server on another device). Not
all messages indicate a problem with your device. Some messages are purely informational, while others might help diagnose
problems with links, internal hardware, or the device software.
Use this string to find the matching system message in the NX-OS System Messages Reference or in the Error Message Decoder.
Each system message is followed by an explanation and recommended action. The action may be as simple as "No action is
required." It may involve a fix or a recommendation to contact technical support as shown in the following example:
Recommended Action Enter the show interface transceiver CLI command or similar DCNM command to
determine the transceiver being used. Please contact your customer support representative for a list of authorized
transceiver vendors.
This example shows how to configure a Cisco NX-OS device to use the syslog facility on a Solaris platform. Although a Solaris
host is being used, the syslog configuration on all UNIX and Linux systems is very similar.
syslog uses the facility and the message severity to determine how to handle a message on the syslog server (the Solaris system in
this example). The syslog server handles different message severities differently; for example, it can log them to different files
or e-mail them to a particular user. Specifying a severity level on the syslog server means that all messages of that level and
greater severity (lower number) are acted upon as you configure the syslog server.
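The "that level and greater severity (lower number)" rule can be shown in a few lines. This is a minimal sketch that uses the standard syslog severity keywords and numbers; the function name is hypothetical, and the sketch does not model any other selector syntax.

```python
# Standard syslog severity keywords and their numeric levels (0 is most severe).
SEVERITY = {
    "emerg": 0, "alert": 1, "crit": 2, "err": 3,
    "warning": 4, "notice": 5, "info": 6, "debug": 7,
}


def is_acted_upon(message_level: int, configured: str) -> bool:
    """True if a message at message_level matches a selector such as
    local1.notice, meaning the configured level or anything more severe."""
    return message_level <= SEVERITY[configured]
```

With a selector at the notice level, a severity 5 or severity 2 message is handled, while a severity 6 (informational) message is not.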
Note: You should configure the syslog server so that the Cisco NX-OS messages are logged to a different file from the
standard syslog file so that they cannot be confused with other non-Cisco syslog messages. Do not locate the logfile on
the / file system. You do not want log messages to fill up the / file system. This example uses the following values:
Use the show logging server command to verify the syslog configuration.
1. Modify /etc/syslog.conf to handle local1 messages. For Solaris, you must allow at least one tab between the facility.severity and
the action (/var/adm/nxos_logs).
local1.notice /var/adm/nxos_logs
touch /var/adm/nxos_logs
/etc/init.d/syslog stop
/etc/init.d/syslog start
Test the syslog server by creating an event in Cisco NX-OS. In this case, port e1/2 was shut down and reenabled and the following
was listed on the syslog server. The IP address of the switch is listed in brackets.
tail -f /var/adm/nxos_logs
Sep 17 11:07:41 [172.22.36.142.2.2] : 2004 Sep 17 11:17:29 pacific: PORT-5-IF_DOWN_INITIALIZING: %$VLAN 1%$ Interf
Sep 17 11:07:49 [172.22.36.142.2.2] : 2004 Sep 17 11:17:36 pacific: %PORT-5-IF_UP: %$VLAN 1%$ Interface e 1/2 is u
Sep 17 11:07:51 [172.22.36.142.2.2] : 2004 Sep 17 11:17:39 pacific: %VSHD-5-VSHD_SYSLOG_CONFIG_I: Configuring cons
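Each of these lines carries a FACILITY-SEVERITY-MNEMONIC tag that you can use to look up the message. The following is a minimal sketch of extracting that tag, assuming the tag always has the form shown in the sample output; the regular expression and the names `SystemMessage` and `parse_system_message` are illustrative assumptions, not part of Cisco NX-OS.

```python
import re
from typing import NamedTuple, Optional

# Matches a FACILITY-SEVERITY-MNEMONIC tag such as %PORT-5-IF_UP:
# Optional whitespace after % tolerates tags printed as "% LOG_PORT-5-...".
TAG = re.compile(r"%\s*([A-Z0-9_]+)-([0-7])-([A-Z0-9_]+):\s*(.*)")


class SystemMessage(NamedTuple):
    facility: str
    severity: int
    mnemonic: str
    text: str


def parse_system_message(line: str) -> Optional[SystemMessage]:
    """Return the parsed tag and message text, or None if no tag is found."""
    match = TAG.search(line)
    if match is None:
        return None
    facility, severity, mnemonic, text = match.groups()
    return SystemMessage(facility, int(severity), mnemonic, text)
```

Applied to the second sample line above, the function returns the facility PORT, the severity 5, and the mnemonic IF_UP.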
Use the following commands to access and view logs in Cisco NX-OS:
This example shows the output of the show logging server command:
Troubleshooting Modules
You can directly connect to a module console port to troubleshoot module bootup issues. Use the attach console module
command to connect to the module console port.
For more information on steps to take before calling Technical Support, see Before Contacting Technical Support.
See Also
Before Contacting Technical Support
Further Reading
The following links contain further information on this topic from Cisco.com:
External Links
External links contain content developed by external authors. Cisco does not review this content for accuracy.
This article describes how to identify and resolve problems that might occur when upgrading or restarting.
Contents
• 1 Information About Upgrades and Reboots
• 2 Upgrades and Reboot Checklist
• 3 Verifying Software Upgrades
• 4 Verifying a Nondisruptive Upgrade
♦ 4.1 Using ROM Monitor Mode
• 5 Troubleshooting Software Upgrades and Downgrades
♦ 5.1 Software Upgrade Ends with Error
♦ 5.2 Upgrading Cisco NX-OS Software
• 6 Troubleshooting Software System Reboots
♦ 6.1 Power-On or Switch Reboot Hangs
♦ 6.2 Corrupted Bootflash Recovery
♦ 6.3 Recovery from the loader> Prompt on Supervisor Modules
♦ 6.4 Recovery from the loader> Prompt
♦ 6.5 Recovery from the switch(boot)# Prompt
♦ 6.6 Recovery for Systems with Dual Supervisor Modules
◊ 6.6.1 Recovering One Supervisor Module With Corrupted Bootflash
◊ 6.6.2 Recovering Both Supervisor Modules with Corrupted Bootflash
♦ 6.7 System or Process Resets
♦ 6.8 Recoverable System Restarts
♦ 6.9 Unrecoverable System Restarts
♦ 6.10 Standby Supervisor Fails to Boot
♦ 6.11 Recovering the Administrator Password
• 7 See Also
• 8 Further Reading
• 9 External Links
Upgrades and reboots are ongoing network maintenance activities. You should try to minimize the risk of disrupting the network
when performing these operations in production environments, and you should know how to recover quickly when something does go
wrong.
Note: This publication uses the term upgrade to refer to both Cisco NX-OS upgrades and downgrades.
Checklist (check off each item):
• Read the Release Notes for the release that you are upgrading or downgrading to.
• Ensure that an FTP or TFTP server is available to download the software images.
• Copy the new image onto your supervisor modules in bootflash: or slot0:.
• Use the show install all impact command to verify that the new image is healthy and to check the impact that the new image
will have on your hardware with regard to compatibility.
• Copy the startup-config file to a snapshot configuration in NVRAM. This step creates a backup copy of the
startup-config file (see the Rollback chapter in the Cisco NX-OS System Management Configuration Guide).
• Save your running configuration to the startup configuration.
• Back up a copy of your configuration to a remote TFTP server.
• Schedule your upgrade during an appropriate maintenance window for your network.
After you have completed the checklist, you are ready to upgrade the systems in your network.
Note: It is normal for the active supervisor to become the standby supervisor during an upgrade.
Note: Log messages are not saved across system reboots. However, a maximum of 100 log messages with a severity level of
critical and below (levels 0, 1, and 2) are saved in NVRAM. You can view this log at any time by entering the show
logging nvram command.
Verifying Software Upgrades
You can use the show install all status command from a console, SSH, or Telnet session to watch the progress of an ongoing install
all command or to view the log of the last install all command. This command shows the
install all output on both the active and standby supervisor modules even if you are not connected to the console terminal.
If a failure occurs for whatever reason (such as a save runtime state failure or module upgrade failure) after the upgrade is in
progress, then the device reboots disruptively because the changes cannot be rolled back. In such cases, the upgrade has failed.
If you need further assistance to determine why an upgrade is unsuccessful, you should collect the details from the show
tech-support command output and the console output from the installation, if available, before you contact your technical support
representative.
On most systems, you can enter ROM monitor mode by entering the reload EXEC command and then pressing the Break key on
your keyboard or by using the Break key-combination (the default Break key combination is Ctrl-C) during the first 60 seconds of
startup.
1. Log into the system through the console, Telnet, or SSH port of the active supervisor.
2. Create a backup of your existing configuration file, if required.
3. Perform the upgrade by entering the install all command.
4. Exit the system console and open a new terminal session to view the upgraded supervisor module by using the show
module command.
Tip: Always carefully read the output of the install all compatibility check command. This compatibility check tells you exactly
what needs to be upgraded (such as the BIOS, loader, or firmware) and what modules will experience a disruptive upgrade. If
there are any questions or concerns about the results of the output, type n to stop the installation and contact the next level of
support.
The following example shows an upgrade using the install all command with the source images located on an SCP server.
If the configuration meets all guidelines when the install all command is used, all modules (supervisor and switching) are
upgraded.
If the images on your system are corrupted and you cannot proceed (error state), you can interrupt the system boot sequence and
recover the image by entering the BIOS configuration utility described in the following section. Access this utility only when
needed to recover a corrupted internal disk.
Caution: The BIOS changes explained in this section are required only to recover a corrupted bootflash.
Recovery procedures require the regular sequence to be interrupted. The internal sequence goes through four phases between the
time that you turn on the system and the time that the system prompt appears on your terminal--BIOS, boot loader, kickstart, and
system.
Recovery Interruption
Phase: BIOS
Normal prompt (appears at the end of each phase): loader>
Recovery prompt (appears when the system cannot progress to the next phase): No bootable device
Description: The BIOS begins the power-on self test, memory test, and other operating system applications. While the test is in
progress, press Ctrl-C to enter the BIOS configuration utility and use the netboot option.
Note: If you boot over TFTP from the loader> prompt, you must supply the full path to the image on the remote server.
Note: The TFTP boot method is available only as a backup for diagnostics and for repairing bootflash corruption. The TFTP
boot method is not intended to bring up the system to a fully operational state. Reloading the system is mandatory after
all diagnostics and repairs have been completed.
Use the help command at the loader> prompt to display a list of commands available at this prompt or to obtain more information
about a specific command in that list.
To recover a corrupted kickstart image (system error state) for a system with a single supervisor module, follow these steps:
1. Enter the local IP address and subnet mask for the system at the loader> prompt, and press Enter.
In this example, 172.16.10.100 is the IP address of the TFTP server, and n7000-s1-kickstart-4.0.bin is the name of the kickstart
image file that exists on that server.
The switch(boot)# prompt indicates that you have a usable Kickstart image.
Caution: Be sure that you have made a backup of the configuration files before you enter this command.
5. Follow the procedure specified in the Recovery from the switch(boot)# Prompt procedure.
To recover a corrupted kickstart image (system error state) for a system with a single supervisor module, follow these steps:
1. Specify the local IP address and the subnet mask for the system.
Starting kernel...
INIT: version 2.85 booting
Checking all filesystems..r.r.r.. done.
Setting kernel variables: sysctlnet.ipv4.ip_forward = 0
net.ipv4.ip_default_ttl = 64
net.ipv4.ip_no_pmtu_disc = 1
.
Setting the System Clock using the Hardware Clock as reference...System Clock set. Local time: Wed Oct 1
11:20:11 PST 2008
WARNING: image sync is going to be disabled after a loader netboot
Loading system software
No system image Unexporting directories for NFS kernel daemon...done.
INIT: Sending processes the KILL signal
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
Copyright (c) 2002-2008, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained in this software are
owned by other third parties and used and distributed under
license. Certain components of this software are licensed under
the GNU General Public License (GPL) version 2.0 or the GNU
Lesser General Public License (LGPL) Version 2.1. A copy of each
such license is available at
http://www.opensource.org/licenses/gpl-2.0.php and
http://www.opensource.org/licenses/lgpl-2.1.php
switch(boot)#
The switch(boot)# prompt indicates that you have a usable kickstart image.
Caution: Be sure that you have made a backup of the configuration files before you enter this command.
5. Follow the procedure specified in the Recovery from the switch(boot)# Prompt.
1. Change to configuration mode and configure the IP address of the mgmt0 interface.
switch(boot)# config t
switch(boot)(config)# interface mgmt0
2. Follow this step if you entered an init system command. Otherwise, skip to Step 3.
a. Enter the ip address command to configure the local IP address and the subnet mask for the system.
b. Enter the ip default-gateway command to configure the IP address of the default gateway.
3. Enter the no shutdown command to enable the mgmt0 interface on the system.
switch(boot)(config-mgmt0)# no shutdown
switch(boot)(config-mgmt0)# end
5. If you believe there are file system problems, enter the init system check-filesystem command. This command checks all
internal file systems and fixes any errors that are encountered. This command takes a few minutes to complete.
8. Verify that the system and kickstart image files are copied to your bootflash: file system.
Would you like to enter the initial configuration mode? (yes/no): yes
Note: If you enter no, you will return to the switch# login prompt, and you must manually configure the system.
Recovery for Systems with Dual Supervisor Modules
This section describes how to recover when one or both supervisor modules in a dual supervisor system have corrupted bootflash.
If one supervisor module has a functioning bootflash and the other has a corrupted bootflash, follow these steps:
The supervisor module with the corrupted bootflash performs a netboot and checks the bootflash for corruption. When the bootup
scripts discover that the bootflash is corrupted, they generate an init system command, which fixes the corrupt bootflash. The
supervisor boots as the HA Standby.
Caution: If your system has an active supervisor module currently running, you must enter the system standby manual-boot
command in EXEC mode on the active supervisor module before entering the init system command on the standby
supervisor module to avoid corrupting the internal bootflash:. After the init system command completes on the
standby supervisor module, enter the system no standby manual-boot command in EXEC mode on the active
supervisor module.
Recovering Both Supervisor Modules with Corrupted Bootflash
1. Boot the system and press the Esc key after the BIOS memory test to interrupt the boot loader.
Note: Press Esc immediately after you see the following message:
00000589K Low Memory Passed
00000000K Ext Memory Passed
Hit ^C if you want to run SETUP....
Wait.....
If you wait too long, you will skip the boot loader phase and enter the kickstart phase.
You see the loader> prompt.
Caution: The loader> prompt is different from the regular switch# or switch(boot)# prompt. The CLI command completion
feature does not work at the loader> prompt and may result in undesired errors. You must type the command exactly
as you want the command to appear.
Tip: Use the help command at the loader> prompt to display a list of commands available at this prompt or to obtain more
information about a specific command in that list.
2. Specify the local IP address and the subnet mask for the system.
Starting kernel...
INIT: version 2.85 booting
Checking all filesystems..r.r.r.. done.
Setting kernel variables: sysctlnet.ipv4.ip_forward = 0
net.ipv4.ip_default_ttl = 64
net.ipv4.ip_no_pmtu_disc = 1
.
Setting the System Clock using the Hardware Clock as reference...System Clock set. Local time: Wed Oct 1
11:20:11 PST 2008
WARNING: image sync is going to be disabled after a loader netboot
Loading system software
No system image Unexporting directories for NFS kernel daemon...done.
INIT: Sending processes the KILL signal
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
Copyright (c) 2002-2008, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained in this software are
owned by other third parties and used and distributed under
license. Certain components of this software are licensed under
the GNU General Public License (GPL) version 2.0 or the GNU
Lesser General Public License (LGPL) Version 2.1. A copy of each
such license is available at
http://www.opensource.org/licenses/gpl-2.0.php and
http://www.opensource.org/licenses/lgpl-2.1.php
switch(boot)#
The switch(boot)# prompt indicates that you have a usable kickstart image.
Note: If you boot over TFTP from the loader> prompt, you must supply the full path to the image on the remote server.
5. Enter the init system command to repartition and format the bootflash.
6. Perform the steps in the Recovery from the switch(boot)# Prompt procedure.
7. Perform the steps in the Recovering One Supervisor Module With Corrupted Bootflash procedure to recover the other
supervisor module.
Note: If you do not enter the reload module command when a boot failure has occurred, the active supervisor module
automatically reloads the standby supervisor module within 3 to 6 minutes after the failure.
System or Process Resets
When a recoverable or nonrecoverable error occurs, the system or a process on the system may reset. See Table 2-4 for possible
causes and solutions.
1. Check the syslog file to see which process restarted and why it restarted.
For information about the meaning of each message, see the Cisco NX-OS System Messages Reference. The system output looks
like the following example:
Sep 10 23:31:31 dot-6 % LOG_SYSMGR-3-SERVICE_TERMINATED: Service "sensor" (PID 704) has finished with error
code SYSMGR_EXITCODE_SY.
switch# show logging logfile | include fail
Jan 27 04:08:42 88 %LOG_DAEMON-3-SYSTEM_MSG: bind() fd 4, family 2, port 123, addr 0.0.0.0, in_classd=0 flags=1 fails: Address already in use
Jan 27 04:08:42 88 %LOG_DAEMON-3-SYSTEM_MSG: bind() fd 4, family 2, port 123, addr 127.0.0.1, in_classd=0 flags=0 fails: Address already in use
Jan 27 04:08:42 88 %LOG_DAEMON-3-SYSTEM_MSG: bind() fd 4, family 2, port 123, addr 127.1.1.1, in_classd=0 flags=1 fails: Address already in use
Jan 27 04:08:42 88 %LOG_DAEMON-3-SYSTEM_MSG: bind() fd 4, family 2, port 123, addr 172.22.93.88, in_classd=0 flags=1 fails: Address already in use
Jan 27 23:18:59 88 % LOG_PORT-5-IF_DOWN: Interface fc1/13 is down (Link failure or not-connected)
Jan 27 23:18:59 88 % LOG_PORT-5-IF_DOWN: Interface fc1/14 is down (Link failure or not-connected)
Jan 28 00:55:12 88 % LOG_PORT-5-IF_DOWN: Interface fc1/1 is down (Link failure or not-connected)
Jan 28 00:58:06 88 % LOG_ZONE-2-ZS_MERGE_FAILED: Zone merge failure, Isolating port fc1/1 (VSAN 100)
Jan 28 00:58:44 88 % LOG_ZONE-2-ZS_MERGE_FAILED: Zone merge failure, Isolating port fc1/1 (VSAN 100)
Jan 28 03:26:38 88 % LOG_ZONE-2-ZS_MERGE_FAILED: Zone merge failure, Isolating p
2. Identify the processes that are running and the status of each process.
The following codes are used in the system output for the state (process state):
Note: ER usually is the state that a process enters if it has been restarted too many times and has been detected as faulty by the
system and disabled.
The system output looks like the following example. (This output has been abbreviated to be more concise.)
3. Show the processes that have had abnormal exits and determine whether there is a stack trace or core dump.
To determine if the restart is repetitive or a one-time occurrence, compare the length of time that the system has been up with the
time stamp of each restart.
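The comparison described above can be sketched as follows. The sketch assumes that you have already read the system uptime and the restart time stamps from the device and converted them to Python values. The function names are hypothetical, and treating more than one restart since boot as repetitive is an assumed rule of thumb, not a threshold from this guide.

```python
from datetime import datetime, timedelta
from typing import List


def restarts_since_boot(now: datetime, uptime: timedelta,
                        restart_times: List[datetime]) -> List[datetime]:
    """Return the restart time stamps that fall within the current uptime."""
    boot_time = now - uptime
    return sorted(t for t in restart_times if boot_time <= t <= now)


def is_repetitive(now: datetime, uptime: timedelta,
                  restart_times: List[datetime]) -> bool:
    """Assumed rule of thumb: more than one restart since boot is repetitive."""
    return len(restarts_since_boot(now, uptime, restart_times)) > 1
```

Restarts that predate the last boot are ignored, so only restarts from the current uptime period count toward the result.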
The output shows all cores that are presently available for upload from the active supervisor. The module-num column shows the
slot number on which the core was generated. In the previous example, an FSPF core was generated on the active supervisor
module in slot 5. An FCC core was generated on the standby supervisor module in slot 6. Core dumps generated on the module
in slot 8 include ACLTCAM and FIB.
Copy the FSPF core dump to a TFTP server with the IP address 1.1.1.1, as follows:
Virtual Memory:
Register Set:
{vrf-name | default | management} | slot0:[path]} command to configure the system to use TFTP to send the core dump to a TFTP
server.
This command causes the system to enable the automatic copy of core files to a TFTP server. For example, the following
command sends the core files to the TFTP server with the IP address 10.1.1.1:
• The core files are copied every 4 minutes. This time interval is not configurable.
• The copy of a specific core file to a TFTP server can be manually triggered, by using the command copy
core://module#/pid# tftp://tftp_ip_address/file_name.
• The maximum number of times that a process can be restarted is part of the high-availability (HA) policy for any process.
(This parameter is not configurable.) If the process restarts more than the maximum number of times, the older core files
are overwritten.
• The maximum number of core files that can be saved for any process is part of the HA policy for any process. (This
parameter is not configurable, and it is set to three.)
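The manual copy command in the list above follows a fixed template, so it can be assembled from its parts. This is a minimal sketch: the function name is hypothetical, and the module number, process ID, and file name used with it are placeholders that you replace with values from your own device.

```python
import ipaddress


def copy_core_command(module: int, pid: int, tftp_ip: str, file_name: str) -> str:
    """Fill in the template copy core://module#/pid# tftp://tftp_ip_address/file_name."""
    if module < 1 or pid < 1:
        raise ValueError("module and pid must be positive integers")
    ipaddress.ip_address(tftp_ip)  # raises ValueError for a malformed address
    return f"copy core://{module}/{pid} tftp://{tftp_ip}/{file_name}"
```

For example, `copy_core_command(5, 1234, "1.1.1.1", "core_file")` produces `copy core://5/1234 tftp://1.1.1.1/core_file`, where 1234 and core_file are placeholder values.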
Note:
When using the system cores command with the scp or sftp option, you also need to set up passwordless SSH by using the
following commands:
username admin keypair generate rsa force
username admin keypair export bootflash:key_rsa rsa force
copy bootflash:key_rsa.pub scp://<userid>@<server>/<dir>/ vrf management
8. Determine the cause and resolution for the restart condition by contacting your technical support representative and asking the
representative to review your core dump.
See the Cisco NX-OS High Availability and Redundancy Guide for more information on high-availability policies.
The effect of a process reset is determined by the policy configured for each process. An unrecoverable reset may cause
To respond to an unrecoverable reset, see the Troubleshooting Cisco NX-OS Software System Reboots procedure.
• The last four reset-reason codes for the supervisor modules are displayed. If either supervisor module is absent, the
reset-reason codes for that supervisor module are not displayed.
• The show system reset-reason module number command displays the last four reset-reason codes for a specific module
in a given slot. If a module is absent, then the reset-reason codes for that module are not displayed.
• The overall history of when and why expected and unexpected reloads occur
• The time stamp of when the reset or reload occurred
• The reason for the reset or reload of a module
• The service that caused the reset or reload (not always available)
• The software version that was running at the time of the reset or reload
Explanation This message is printed if the standby supervisor doesn't complete its boot procedure (i.e. it doesn't reach the
login prompt on the local console) 3 to 6 minutes after the loader has been loaded by the BIOS. This message is usually
caused by boot variables not properly set for the standby supervisor. This message can also be caused by a user
intentionally interrupting the boot procedure at the loader prompt (by means of pressing ESC).
Recommended Action Connect to the local console of the standby supervisor. If the supervisor is at the loader prompt,
try to use the boot command to continue the boot procedure. Otherwise, issue a reload command for the standby
supervisor from a vsh session on the active supervisor, specifying the force-dnld option. Once the standby is online, fix
the problem by setting the boot variables appropriately.
Problem: Standby supervisor does not boot.
Possible cause: Active supervisor kickstart image booted from TFTP.
Solution: Reload the active supervisor from bootflash:.
Recovering the Administrator Password
You can access the system if you forget the administrator password.
Problem: You forgot the administrator password for accessing.
Solution: Use the Password Recovery procedure to recover the password using a local console connection.
See Also
Cisco NX-OS/IOS Configuration Fundamentals Comparison
Further Reading
The following links contain further information on this topic from Cisco.com:
External Links
External links contain content developed by external authors. Cisco does not review this content for accuracy.
Contents
• 1 Information About Troubleshooting Licensing Issues
♦ 1.1 Chassis Serial Numbers
♦ 1.2 Swapping out a Chassis
♦ 1.3 Grace Period
• 2 Licensing Guidelines
• 3 Initial Troubleshooting Checklist
• 4 Displaying License Information Using the CLI
♦ 4.1 Example: Displays Information About Current License Usage
♦ 4.2 Example: Displays the List of Features in a Specified Package
♦ 4.3 Example: Displays the Host ID for the License
♦ 4.4 Example: Displays All Installed License Key Files and Contents
• 5 Licensing Installation Issues
♦ 5.1 Serial Number Issues
♦ 5.2 RMA Chassis Errors or License Transfers Between Systems
♦ 5.3 Receiving Grace Period Warnings After License Installation
♦ 5.4 Grace Period Alerts
♦ 5.5 License Listed as Missing
• 6 See Also
• 7 Further Reading
• 8 External Links
Note: You can enable a feature without installing the license. Cisco NX-OS provides a grace period that allows you to try out
the feature before purchasing the license.
Grace Period
If you use a feature that requires a license but you have not installed a license for that feature, you are given a 120-day grace
period to evaluate the feature. You must purchase and install the number of licenses required for that feature before the grace
period ends, or Cisco NX-OS will disable the feature at the end of the grace period.
License packages can contain several features. If you disable a feature during the grace period and there are other features in that
license package that are still enabled, the clock does not stop for that license package. To suspend the grace period countdown for
a licensed feature, you must disable every feature in that license package. Use the show license usage command to determine
which features are enabled for a license package.
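The countdown rule described above can be illustrated with a minimal Python sketch. It is a simplified model, not how Cisco NX-OS tracks the grace period internally: the one-day granularity and the feature names are assumptions made for illustration.

```python
GRACE_PERIOD_DAYS = 120  # grace period length stated in the text


def grace_days_remaining(daily_feature_states, total_days=GRACE_PERIOD_DAYS):
    """Count down one day for each day on which any feature in the package is enabled.

    daily_feature_states: list of dicts, one per day, mapping a feature name
    (hypothetical) to True when the feature is enabled on that day.
    """
    remaining = total_days
    for features in daily_feature_states:
        # The clock keeps running while at least one feature in the package is enabled.
        if any(features.values()) and remaining > 0:
            remaining -= 1
    return remaining


# Hypothetical license package with two features, observed over three days.
days = [
    {"feature_a": True, "feature_b": True},    # both enabled: countdown runs
    {"feature_a": False, "feature_b": True},   # one still enabled: countdown runs
    {"feature_a": False, "feature_b": False},  # all disabled: countdown is suspended
]
print(grace_days_remaining(days))
```

Only the third day, on which every feature in the package is disabled, leaves the remaining grace period unchanged.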
Licensing Guidelines
Follow these guidelines when dealing with licenses for Cisco NX-OS:
• Do not ignore the grace period expiration warnings. Start the purchase at least 60 days before the grace period expires to
allow time for ordering, shipping, and installing a new license.
• Carefully determine the license(s) that you require based on the features that require a license.
• Order your license accurately, as follows:
♦ Enter the Product Authorization Key that appears in the Proof of Purchase document that comes with your
system.
♦ Enter the correct chassis serial number when ordering the license. The serial number must be for the same chassis
that you plan to install the license on. Use the show license host-id command to obtain your chassis serial
number.
♦ Enter serial numbers accurately. Do not use the letter "O" instead of a zero in the serial number.
♦ Order the license that is specific to your chassis.
• Back up the license file to a remote, secure place. Archiving your license files ensures that you will not lose the licenses in
the case of a failure on your system.
• Install the correct licenses on each system, using the licenses that were ordered using that system's serial number. Licenses
are serial-number specific and platform specific.
• Use the show license usage command to verify the license installation.
• Never modify a license file or attempt to use it on a system that it was not ordered for. If you return a chassis, contact your
customer support representative to order a replacement license for the new chassis.
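The guideline about the letter "O" and the digit zero can be checked before you submit an order. The following Python sketch compares the serial number you typed with the one reported by the show license host-id command and lists the positions where the two differ only by an O/0 swap. The serial numbers shown are made-up examples.

```python
def o_zero_confusions(entered_serial, chassis_serial):
    """Return the positions where one serial has the letter O and the other has a zero."""
    entered = entered_serial.strip().upper()
    actual = chassis_serial.strip().upper()
    if len(entered) != len(actual):
        raise ValueError("serial numbers differ in length")
    return [
        index
        for index, (typed, real) in enumerate(zip(entered, actual))
        if {typed, real} == {"O", "0"}
    ]


# Hypothetical values: the fourth character was typed as the letter O instead of a zero.
print(o_zero_confusions("ABCO123", "ABC0123"))
```

An empty list means that no O/0 swap was found; it does not prove that the serial numbers match in every other position.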
Note: Use the entire ID that appears after the colon (:). The VDH value is the vendor host ID.
If you use a license meant for another chassis, you may see the following system message:
Error Message: LICMGR-3-LOG_LIC_INVALID_HOSTID: Invalid license hostid VDH=[chars] for feature [chars].
Explanation: The feature has a license with an invalid license Host ID. This can happen if a supervisor module with licensed
features for one system is installed on another system.
Recommended Action: Reinstall the correct license for the chassis where the supervisor module is installed.
When entering the chassis serial number during the license ordering process, do not use the letter "O" instead of any zeros in the
serial number.
The grace period stops if you disable a feature that you are evaluating. If you enable that feature again without a valid license, the
grace period countdown continues where it left off.
The grace period operates across all features in a license package. License packages can contain several features. If you disable a
feature during the grace period and there are other features in that license package that are still enabled, the countdown does not
stop for that license package. To suspend the grace period countdown for a license package, you must disable every feature in that
license package.
The Cisco NX-OS license counter keeps track of all licenses on a system. If you are evaluating a feature and the grace period has
started, you will receive console messages, SNMP traps, system messages, and daily Call Home messages.
Beyond that, the frequency of these messages becomes hourly during the last seven days of the grace period. The following
example uses the VDC feature. On January 30th, you enabled the VDC feature, using the 120-day grace period. You will receive
grace period ending messages as follows:
On May 31st, the grace period ends, and the VDC feature is automatically disabled. You will not be allowed to use multiple VDCs
until you purchase a valid license.
Note: You cannot modify the frequency of the grace period messages.
Caution: After the final seven days of the grace period, the feature is turned off and your network traffic may be disrupted.
Any future upgrade will enforce license requirements and the 120-day grace period.
If you try to use an unlicensed feature, you may see one of the following system messages:
Explanation: The unlicensed feature has exceeded its grace time period. Applications using this license will be shut down
immediately.
Recommended Action: Install the license file to continue using the feature.
Error Message: LICMGR-3-LOG_LICAPP_NO_LIC: Application [chars] running without [chars] license, shutdown in [dec]
days.
Explanation: The Application [chars1] has not been licensed. The application will work for a grace period of [dec] days after
which it will be shut down unless a license file for the feature is installed.
Explanation: The feature has exceeded its evaluation time period. The feature will be shut down after a grace period.
Error Message: LICMGR-3-LOG_LIC_NO_LIC: No license(s) present for feature [chars]. Application(s) shutdown in [dec]
days.
Explanation: The feature has not been licensed. The feature will work for a grace period, after which the application(s) using the
feature will be shut down.
Error Message: LICMGR-6-LOG_LICAPP_EXPIRY_WARNING: Application [chars] evaluation license [chars] expiry in [dec]
days.
Explanation: The application will exceed its evaluation time period in the listed number of days and will be shut down unless a
permanent license for the feature is installed.
Recommended Action: Install the license file to continue using the feature.
Use the show license usage command to display grace period information for a system.
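When you review collected logs, it can help to extract the remaining days from the grace period messages listed above. The following Python sketch matches the message formats shown in this section. The sample lines are constructed from those formats, and EXAMPLE_PKG and example_app are placeholder names rather than real output.

```python
import re

# Matches the LICMGR grace period messages that end in "... in [dec] days".
GRACE_MESSAGE = re.compile(
    r"(LICMGR-\d-[A-Z_]+):.*?(?:shutdown|expiry) in (\d+)\s+days"
)


def days_left(log_line):
    """Return (message name, days remaining), or None if the line is not a grace period message."""
    match = GRACE_MESSAGE.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# Sample lines built from the message formats above; the names are placeholders.
no_license = (
    "LICMGR-3-LOG_LIC_NO_LIC: No license(s) present for feature EXAMPLE_PKG. "
    "Application(s) shutdown in 45 days."
)
expiry = (
    "LICMGR-6-LOG_LICAPP_EXPIRY_WARNING: Application example_app evaluation "
    "license EXAMPLE_PKG expiry in 7 days."
)
print(days_left(no_license))
print(days_left(expiry))
```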
See Also
Before Contacting Technical Support
Contents
• 1 Information About Troubleshooting VDCs
• 2 Initial Troubleshooting Checklist
• 3 VDC Issues
♦ 3.1 You Cannot Create a VDC
♦ 3.2 You Cannot Log into a Device
♦ 3.3 You Cannot Switch to a VDC
♦ 3.4 You Cannot Delete a VDC
♦ 3.5 You Cannot Allocate an Interface to a VDC
◊ 3.5.1 Table: Port Numbers for Cisco Nexus 7000 Series 32-port
10-Gbps Ethernet module
♦ 3.6 The VDC Does Not Reflect a Resource Template Change
♦ 3.7 The VDC Remains in a Failed State
♦ 3.8 You Cannot Copy the Running-Config File to the Startup-Config File in
a VDC
• 4 See Also
• 5 Further Reading
• 6 External Links
VDC issues may not be directly related to VDC management. See the troubleshooting chapter that reflects your symptoms to find
other issues related to VDCs. For instance, if you configure a VDC template that limits the number of port channels in that VDC,
you may experience problems if you try to create more port channels than the VDC template allows.
• Port channels
• SPAN sessions
• IPv4 route map memory
• VLANs
• Virtual routing and forwarding instances (VRFs)
The minimum resource value configures the guaranteed limit for that feature. The maximum resource value represents
oversubscription for the feature and is available on a first-come, first-served basis.
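The difference between the minimum and maximum resource values can be pictured with a small Python model. This is a simplified sketch of the behavior described above, not the Cisco NX-OS allocation algorithm: the shared pool counter and the one-unit-at-a-time request are assumptions made for illustration.

```python
def request_one(current, minimum, maximum, shared_free):
    """Decide whether a VDC can take one more unit of a resource.

    current:     units the VDC already uses
    minimum:     guaranteed units reserved for the VDC
    maximum:     upper limit for the VDC (oversubscription ceiling)
    shared_free: units left in the pool that all VDCs compete for (hypothetical)
    Returns (granted, shared_free_after).
    """
    if current >= maximum:
        return False, shared_free      # the VDC has reached its maximum
    if current < minimum:
        return True, shared_free       # still within the guaranteed portion
    if shared_free > 0:
        return True, shared_free - 1   # above the minimum: first come, first served
    return False, shared_free          # nothing left in the shared pool


print(request_one(current=1, minimum=2, maximum=8, shared_free=0))
print(request_one(current=2, minimum=2, maximum=8, shared_free=0))
```

The first request succeeds even with an empty shared pool because it falls within the guaranteed minimum; the second fails because it depends on oversubscribed capacity that is no longer available.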
Note: When you allocate an interface to a VDC, Cisco NX-OS removes all configuration for that interface.
See the Cisco NX-OS Virtual Device Context Configuration Guide for more information on VDCs or for details on any VDC
configuration changes recommended in this article.
Verify that you are logged into the device as network-admin if you are creating or modifying VDCs.
Verify that you are in the correct VDC. You must be in the default VDC to configure VDCs.
Verify that you have installed the Advanced Services license to configure VDCs.
Verify that you are not attempting to create more than three VDCs.
• show vdc membership - Displays information about which interfaces are assigned to a VDC.
• show vdc resource - Displays information about the assigned resources (this command is available only in the default VDC).
• show vdc current-vdc - Displays the VDC you are currently in.
VDC Issues
Problems with VDCs usually result from logging in to the incorrect VDC or misallocating resources for a VDC.
Explanation: You cannot create a VDC because not enough resources are available based on the template configuration. If no
template is used, a default template is applied.
Recommended Action: Verify that you have sufficient resources available to create this VDC by using the show vdc resources
[detail] or show vdc resource template command. Modify the template that you are using to create the VDC or create a new
template with resource limits that are currently available.
Explanation: Some services crashed or failed to come up because of insufficient system resources other than what can be
reserved using the resource templates. These dynamic resources are based on system utilization and may not be available to
support a new VDC.
Recommended Action: Use the show system internal sysmgr service running command to determine what caused the failure.
Possible Cause: There are not enough resources.
Solution: Use the show vdc resources [detail] or show vdc resource template command to determine your available resources.
Modify your template or create a VDC with fewer resources by using the limit-resource command in VDC configuration mode.
Error Message: VDC_MGR-2-VDC_UNGRACEFUL: vdc_mgr: Ungraceful cleanup request received for vdc [dec], restart
count for this vdc is [dec]
Symptom: You cannot delete a VDC.
Possible Cause: You attempted to delete the default VDC.
Solution: You cannot delete the default VDC.
Possible Cause: Unknown errors occurred when deleting a VDC.
Solution: Use the show tech-support VDC command to gather more information.
Error Message: VDC_MGR-2-VDC_BAD: vdc_mgr: There has been a failure at gim (port_affected_list).
Recommended Action: Use the show vdc membership status or show interface brief command to gather more information.
The table "Port Numbers for Cisco Nexus 7000 Series 32-port 10-Gbps Ethernet module" shows the port allocation requirements for
the Cisco Nexus 7000 Series 32-port 10-Gbps Ethernet module (N7K-M132XP-12).
Table: Port Numbers for Cisco Nexus 7000 Series 32-port 10-Gbps Ethernet module
You Cannot Copy the Running-Config File to the Startup-Config File in a VDC
You may have a problem when trying to save the configuration in a VDC.
Further Reading
The following links contain further information on this topic from Cisco.com:
Cisco Nexus 7000 Series NX-OS Virtual Device Context Configuration Guide
This article describes how to troubleshoot Cisco Fabric Services (CFS) problems on a Cisco NX-OS device.
Contents
• 1 Information About
Troubleshooting CFS
• 2 Initial Troubleshooting Checklist
♦ 2.1 Verifying CFS Using
the CLI
Some of the applications that can be synchronized using CFS are as follows:
• Call Home
• RADIUS
• TACACS+
• User roles
Note: Do not enable CFS for an application that you manage using Cisco DCNM.
You can use CFS regions to limit the CFS configuration distribution to a subset of devices on the network.
Checklist (check off each item):
• Verify that CFS is enabled for the same applications on all affected devices.
• Verify that CFS distribution is enabled for the same applications on all affected devices.
• If you are using CFS regions, verify that the application is in the same region on all the affected devices.
• Verify that there are no pending changes for an application and that a CFS commit was issued for any configuration
changes in a CFS-enabled application.
• Verify that there are no unexpected CFS locked sessions. Clear any unexpected locked sessions.
1. Verify that CFS is globally enabled on all devices in the network or CFS region.
2. Verify that CFS is enabled for the application on all devices in the network or CFS region.
The Physical-fc-ip scope means that CFS uses IP to apply the configuration for that application to all devices in the network or
region. The Physical-eth scope means that CFS uses Ethernet to apply the configuration for that application to all devices in the
network or region.
3. Verify that CFS distribution is enabled for the application on all devices in the network or CFS region.
4. If you configure CFS regions, verify that the application is in the same region on all applicable devices.
5. Verify the set of devices that are registered with CFS for that application.
6. Compare the output of the show cfs merge status name application-name command and the show cfs peers name
application-name command to verify that the network is not partitioned.
If the list of switch WWNs in the show cfs merge status name command output is shorter than the list of switch WWNs in
the show cfs peers name command output, the network is partitioned into multiple CFS fabrics and the merge status may show
that the merge has failed, is pending, or is waiting.
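The comparison in step 6 can be sketched in Python once you have copied the switch WWNs from both command outputs into two lists. The sketch does not parse the command output itself, and the WWN values are placeholders.

```python
def partition_report(merge_status_wwns, peer_wwns):
    """Compare the WWNs from 'show cfs merge status name' with those from 'show cfs peers name'."""
    merged = set(merge_status_wwns)
    peers = set(peer_wwns)
    return {
        # A shorter merge-status list indicates multiple CFS fabrics.
        "partitioned": len(merged) < len(peers),
        # Peers that do not appear in the merge status are the ones to investigate.
        "not_merged": sorted(peers - merged),
    }


# Placeholder WWNs copied by hand from the two command outputs.
merge_list = ["20:00:00:00:00:00:00:01", "20:00:00:00:00:00:00:02"]
peer_list = merge_list + ["20:00:00:00:00:00:00:03"]
print(partition_report(merge_list, peer_list))
```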
7. Verify that a distribution is not in progress in the network for the application.
If the application does not show in the output, the distribution has completed.
8. Verify that there are no CFS sessions in progress for the application.
If you add a new device to the network and the merge status for any application shows "In Progress" for a prolonged period of
time, then there may be an active session for that application in some other device. Use the show cfs lock command to check the
lock status for that application on all the devices. The merge will not proceed if there are any locks present for that application on
any device in the network or CFS region. Use the application-name commit command to commit the changes or use the clear
application-name session command to clear the session lock so that the merge can proceed.
To recover from a merge failure using the CLI, follow these steps:
2. Commit the application configuration to restore all peers in the fabric to the same configuration database.
When another application peer acquires a lock, you cannot commit new configuration changes. This is a normal operation and you
should postpone any changes to an application until the application peer releases the lock.
• When locks are not held on all of the devices in the network or CFS region.
• When locks are held on all devices in the network or region, but a CFS session does not exist on the device that holds the
lock.
Note: Use the troubleshooting steps in this section only if you believe the lock has not been properly released.
To troubleshoot a lock failure, follow these steps:
1. Determine all the devices that participate in the CFS distribution for this application.
2. Check for a lock for this application on all CFS peer devices to determine the name of the administrator who owns the lock for
the application.
You should check with that administrator before clearing the lock.
4. Release the CFS lock on the device that owns the lock.
5. If the device does not release the lock, clear the CFS session on the device that owns the lock.
• When using CFS regions, an application on a given device can only belong to one region at a time.
• An application in a CFS region ignores all CFS distributions in any other region (including the default region).
• All applications that you do not assign to a CFS region exist in the default region.
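The three region rules above can be expressed as a short Python model, which may help when you reason about why a distribution did not reach a device. The region identifiers, the "default" label, and the application name are assumptions made for illustration.

```python
DEFAULT_REGION = "default"  # label used here for the default region (assumption)


def region_of(application, region_assignments):
    """An application belongs to exactly one region; unassigned applications are in the default region."""
    return region_assignments.get(application, DEFAULT_REGION)


def accepts_distribution(application, sender_assignments, receiver_assignments):
    """A device ignores distributions for an application that come from any other region."""
    return region_of(application, sender_assignments) == region_of(
        application, receiver_assignments
    )


# Hypothetical devices: the sender has the application in region 1, the receiver never assigned it.
sender = {"callhome": 1}
receiver = {}
print(accepts_distribution("callhome", sender, receiver))
```

The call returns False because the receiver still has the application in the default region, which is treated like any other region.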
To resolve a configuration distribution failure to all devices in a CFS region, follow these steps:
2. Verify that the application distribution is enabled and is in the same region on all devices in the region.
Note: You must reassign an application to a region whenever you disable that application. CFS assigns new applications to the
default region.
Note: When an application is moved from one region to another (including the default region), the application loses all CFS
history.
See Also
Before Contacting Technical Support
Further Reading
The following links contain further information on this topic from Cisco.com:
Configuring CFS
This article describes how to identify and resolve problems that can occur with ports in Cisco NX-OS.
Contents
• 1 Information About Troubleshooting Ports
• 2 Port Guidelines
• 3 License Requirements
• 4 Initial Troubleshooting Checklist
♦ 4.1 Viewing Port Information
• 5 Troubleshooting Port States from the CLI
♦ 5.1 Example: show interface Command Output
• 6 Port-Interface Issues
♦ 6.1 You Cannot See The Interface
♦ 6.2 The Interface Configuration Has
Disappeared
♦ 6.3 You Cannot Enable an Interface
♦ 6.4 You Cannot Configure a Dedicated Port
♦ 6.5 A Port Remains in a Link Failure or Not
Connected State
♦ 6.6 An Unexpected Link Flapping Occurs
♦ 6.7 A Port Is in the ErrDisabled State
◊ 6.7.1 Verifying the ErrDisable State
Using the CLI
• 7 See Also
• 8 Further Reading
• 9 External Links
Each interface has an associated administrative configuration and operational status as follows:
• The administrative configuration does not change unless you modify it. This configuration has various attributes that you
can configure in administrative mode.
• The operational status represents the current status of a specified attribute like the interface speed. This status cannot be
changed and is read-only. Some values may not be valid when the interface is down (such as the operation speed).
For a complete description of port modes, administrative states, and operational states, see the Cisco NX-OS Interfaces
Configuration Guide.
Port Guidelines
Follow these guidelines when you configure a port interface:
• Before you begin configuring a switch, make sure that the modules in the chassis are functioning as designed. Use the
show module command to verify that a module is OK or active before continuing the configuration.
• When configuring dedicated ports in a port group, follow these port mode guidelines:
♦ You can configure only one port in each four-port group in dedicated mode. The other three ports are not
usable and remain shut down.
♦ If any of the other three ports are enabled, you cannot configure the remaining port in dedicated mode. The other
three ports continue to remain enabled.
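The dedicated-mode guidelines above reduce to one check per port group. The following Python sketch models that check. The port names are placeholders, and the actual port group membership must be taken from the module documentation.

```python
def can_dedicate(port, group_enabled):
    """Return True if 'port' can be set to dedicated mode.

    group_enabled maps each of the four ports in one port group (placeholder names)
    to True when that port is administratively enabled.
    """
    if len(group_enabled) != 4 or port not in group_enabled:
        raise ValueError("expected the four ports of one port group, including the candidate")
    # Dedicated mode is refused while any of the other three ports is enabled.
    return not any(enabled for name, enabled in group_enabled.items() if name != port)


group = {"port1": True, "port2": False, "port3": True, "port4": False}
print(can_dedicate("port1", group))  # port3 is still enabled
group["port3"] = False
print(can_dedicate("port1", group))
```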
License Requirements
There are no licensing requirements for port configuration in Cisco NX-OS.
Checklist (check off each item):
• Check the physical media to ensure that there are no damaged parts.
• Verify that the SFP (small form-factor pluggable) devices in use are those authorized by Cisco and that they are not
faulty.
• Verify that you have enabled the port by using the no shutdown command.
• Use the show interface command to verify the state of the interface. See the Cisco NX-OS Interfaces Configuration
Guide for reasons why a port may be in a down operational state.
• If you have configured a port as dedicated, make sure that you have not connected to the other three ports in
the port group.
Use one of the following commands to clear all port counters or the counters for specified interfaces:
The counters can identify synchronization problems by displaying a significant disparity between received and transmitted frames.
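As a rough illustration of the counter comparison, the following Python sketch computes the relative difference between received and transmitted frame counts taken from the show interface output. The 0.5 threshold is an arbitrary assumption; what counts as a significant disparity depends on the traffic that the port normally carries.

```python
def frame_disparity(rx_frames, tx_frames):
    """Relative difference between received and transmitted frame counts (0.0 to 1.0)."""
    larger = max(rx_frames, tx_frames)
    if larger == 0:
        return 0.0  # no traffic in either direction
    return abs(rx_frames - tx_frames) / larger


def significant_disparity(rx_frames, tx_frames, threshold=0.5):
    """Flag a port whose counters differ by at least 'threshold' (assumed value)."""
    return frame_disparity(rx_frames, tx_frames) >= threshold


# Hypothetical counter values read after clearing the counters.
print(significant_disparity(rx_frames=1000, tx_frames=100))
print(significant_disparity(rx_frames=1000, tx_frames=900))
```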
• Speed
• Trunk VLAN status
• Number of frames sent and received
• Transmission errors, including discards, errors, and invalid frames
Example: show interface Command Output displays the show interface command output.
Port-Interface Issues
This section includes symptoms and solutions for troubleshooting ports.
Verify that the media is not broken or damaged. Is the LED on the switch
green?
When you are troubleshooting an unexpected link flapping, you should know the following information:
Symptom: A port is in the ErrDisabled state.
Possible Cause: The device detected a high amount of bad frames (CRC errors), which might indicate a problem with the media.
Solution: Use the Verify the ErrDisable State Using the CLI procedure to verify the SFP, cable, and connections.
To verify the ErrDisable state using the CLI, follow these steps:
1. Use the show interface command to verify that the switch detected a problem and disabled the port.
In this example, port ethernet 1/7 entered the ErrDisabled state because of a capability mismatch, or "CAP MISMATCH."
4. Display the switch log file and view a list of port state changes.
In this example, an error was recorded when someone attempted to add port e1/7 to port channel 7. The port was not configured
identically to port channel 7, so the attempt failed.
See Also
Cisco NX-OS/IOS Interface Comparison
This article describes how to perform basic troubleshooting of virtual port channel (vPC) problems on a Cisco Nexus 7000 NX-OS
device.
Contents
• 1 Information About Troubleshooting vPCs
• 2 Initial Troubleshooting Checklist
• 3 Verifying vPCs Using the CLI
• 4 Received Type 1 Configuration Element Mismatch
♦ 4.1 Example: show vpc consistency-parameters
Command Output
• 5 Cannot Enable the vPC Feature
♦ 5.1 Example: show module Command Output
• 6 vPC in Blocking State
• 7 VLANs on a vPC moved to suspend state
• 8 Hosts with an HSRP Gateway Cannot Access Beyond Their
VLAN
• 9 Traffic Disrupted when the Primary vPC Device Goes Down
• 10 See Also
• 11 Further Reading
• 12 External Links
See the Configuring vPC chapter in the Cisco Nexus 7000 Series NX-OS Interfaces Configuration Guide for more information on
vPCs.
Checklist (check off each item):
• Verify that all vPC interfaces in a vPC domain are configured in the same virtual device context (VDC).
• Verify that you have a separate vPC peer-link and peer-keepalive link infrastructure for each VDC deployed.
• Verify whether the vPC peer-keepalive link is mapped to a separate VRF. If it is not, it is mapped to the management VRF
by default. In that case, verify that a management switch connects to the management ports on both vPC peer devices.
• Verify that the vPC peer-link is configured on an N7K-M132XP-12 module. We recommend at least two
N7K-M132XP-12 modules for redundancy.
• Verify that both the source and destination IP addresses used for the peer-keepalive messages are reachable from the
VRF associated with the vPC peer-keepalive link.
• Verify that the peer-keepalive link is up; otherwise, the vPC peer-link will not come up.
• Verify that the vPC peer-link is configured as a Layer 2 port channel trunk that allows only vPC VLANs.
• Verify that the vPC number that you assigned to the port channel that connects to the downstream device from the vPC
peer device is identical on both vPC peer devices.
• If you manually configured the system priority, verify that you assigned the same priority value on both vPC peer
devices.
• Check the show vpc consistency-parameters command output to verify that both vPC peer devices have identical type-1
parameters.
• Verify that the primary vPC device is the primary STP root and the secondary vPC device is the secondary STP root.
Verifying vPCs Using the CLI
To verify vPCs using the CLI, follow these steps:
1. Use the show running-config vpc command to verify the vPC configuration.
3. Use the show vpc peer-keepalive command to check the status of the vPC peer-keepalive link.
4. Use the show vpc consistency-parameters command to verify that both the vPC peers have the identical type-1 parameters.
5. Use the show port-channel summary command to verify that the members in the port channel are mapped to the vPC.
6. Use the show cfs status command to verify that distribution over Ethernet is enabled.
7. If you enable STP, use the show spanning-tree command on both sides of the vPC peer link to verify that the following STP
parameters are identical:
• BPDU Filter
• BPDU Guard
• Cost
• Link type
• Priority
• VLANs (PVRST+)
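Once you have noted the values from the show spanning-tree output on each peer, the comparison in step 7 can be sketched in Python as follows. The sketch does not parse command output, and the parameter values are made up.

```python
# The STP parameters that step 7 requires to be identical on both sides of the vPC peer link.
STP_PARAMETERS = (
    "BPDU Filter",
    "BPDU Guard",
    "Cost",
    "Link type",
    "Priority",
    "VLANs (PVRST+)",
)


def stp_mismatches(peer_a, peer_b):
    """Return the names of the listed parameters whose values differ between the two peers."""
    return [name for name in STP_PARAMETERS if peer_a.get(name) != peer_b.get(name)]


# Hypothetical values noted from each vPC peer.
peer_1 = {"BPDU Filter": "disabled", "BPDU Guard": "enabled", "Cost": 1,
          "Link type": "point-to-point", "Priority": 128, "VLANs (PVRST+)": "10,20"}
peer_2 = dict(peer_1, Cost=2)
print(stp_mismatches(peer_1, peer_2))
```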
This example shows how to display the vPC consistency parameters on a port channel:
Mod Sw Hw
--- -------------- ------
2 4.1(5) 1.2
3 4.1(5) 1.0 >>> Must be 1.3 or later.
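The hardware version check in the show module output above can be automated with a short Python sketch. The excerpt does not state which slots the 1.3 requirement applies to, so the caller supplies them; that list is an assumption in this example.

```python
def parse_versions(table_text):
    """Map module number -> hardware version from the Mod/Sw/Hw table of 'show module'."""
    versions = {}
    for line in table_text.splitlines():
        fields = line.split()
        # Data rows start with the module number; header and separator rows do not.
        if len(fields) >= 3 and fields[0].isdigit():
            versions[int(fields[0])] = fields[2]
    return versions


def version_tuple(version):
    """Turn a dotted version such as '1.3' into a tuple of integers for comparison."""
    return tuple(int(part) for part in version.split("."))


def slots_below(table_text, slots, minimum="1.3"):
    """Return the requested slots whose hardware version is older than 'minimum'."""
    versions = parse_versions(table_text)
    return [
        slot for slot in sorted(slots)
        if slot in versions and version_tuple(versions[slot]) < version_tuple(minimum)
    ]


table = """Mod Sw Hw
--- -------------- ------
2 4.1(5) 1.2
3 4.1(5) 1.0 >>> Must be 1.3 or later."""
print(slots_below(table, slots=[3]))  # slot list chosen by the caller (assumption)
```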
Contents
• 1 Information About Troubleshooting
VLANs
• 2 Initial Troubleshooting Checklist
• 3 VLAN Issues
♦ 3.1 You Cannot Create a
VLAN
♦ 3.2 You Cannot Create a
PVLAN
♦ 3.3 The VLAN Interface is
Down
• 4 See Also
• 5 Further Reading
• 6 External Links
• a through z or A through Z
• 0 through 9
• - (hyphen) or _ (underscore)
• Keep user traffic off the management VLAN; keep the management VLAN separate from user data.
• You can apply different Quality of Service (QoS) configurations to primary, isolated, and community VLANs.
• To apply output VACLs to all outgoing private VLAN traffic, map the secondary VLANs on the Layer 3 VLAN interface
of the primary VLAN and then configure the VACLs on the SVI of the primary VLAN.
• VACLs that apply to the Layer 3 VLAN interface of a primary VLAN automatically apply to the associated isolated and
community VLANs.
• If you do not map the secondary VLAN to the Layer 3 VLAN interface of the primary VLAN, you can have different
VACLs for primary and secondary VLANs.
• Because traffic in private VLANs flows in different directions, you can have different VACLs for ingress traffic and
different VACLs for egress traffic.
Note: We recommend that you keep the same VACLs for the primary VLAN and all secondary VLANs in the private VLAN.
• You can enable DHCP snooping on private VLANs. When you enable DHCP snooping on the primary VLAN, it is
propagated to the secondary VLANs. If you configure DHCP on a secondary VLAN, the configuration does not take
effect if the primary VLAN is already configured.
• You can configure IEEE 802.1X port-based authentication on a private VLAN port, but do not configure 802.1X with port
security or per-user ACL on private VLAN ports.
• 802.1X works with private VLANs, but the 802.1X dynamic VLAN assignment or the guest VLAN assignment does not
work with private VLANs.
• IGMP runs only on the primary VLAN and uses the configuration of the primary VLAN for all secondary VLANs.
• Any IGMP join request in the secondary VLAN is treated as if it is received in the primary VLAN.
• Private VLANs support these Switched Port Analyzer (SPAN) features:
VLAN Issues
This section includes symptoms and solutions for VLAN issues.
to this VDC.
Possible Cause: You are using a reserved VLAN ID.
Solution: VLANs 3968 to 4047 and 4094 are reserved for internal use in each VDC; you cannot change or use these reserved VLANs.
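A quick way to check a planned VLAN ID against the reserved ranges listed above is a small Python helper such as the following sketch. The function name is illustrative only.

```python
# VLANs 3968 to 4047 and 4094 are reserved for internal use in each VDC.
RESERVED_VLANS = set(range(3968, 4048)) | {4094}


def is_reserved_vlan(vlan_id):
    """Return True if the VLAN ID falls in the reserved ranges stated in this section."""
    return vlan_id in RESERVED_VLANS


for candidate in (100, 3968, 4047, 4048, 4094):
    print(candidate, is_reserved_vlan(candidate))
```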
This article describes how to identify and resolve problems that might occur when implementing the Spanning Tree Protocol
(STP).
Contents
• 1 Information About Troubleshooting STP
• 2 Initial Troubleshooting Checklist
• 3 Troubleshooting STP Data Loops
• 4 Troubleshooting Excessive Packet
Flooding
• 5 Troubleshooting Convergence Time
Issues
• 6 Securing the Network Against Forwarding
Loops
• 7 See Also
• 8 Further Reading
• 9 External Links
• If you are running private VLANs with multiple STP (MST), verify that all secondary VLANs belong to the same MST
instance as that of the primary VLANs.
• Disabling spanning tree on the native VLAN of an 802.1Q trunk when you are working in Rapid PVST+ spanning tree
mode can cause a spanning tree loop on that VLAN. We recommend that you leave spanning tree enabled on the native
VLAN of the 802.1Q trunks. Make sure that your network has no physical loops before you disable spanning tree.
• When you connect two Cisco switches through 802.1Q trunks, the switches exchange spanning tree bridge protocol data
units (BPDUs) on each VLAN allowed on the trunks. The BPDUs on the native VLAN of the trunk are sent untagged to
the reserved IEEE 802.1D spanning tree multicast MAC address (01-80-C2-00-00-00). The BPDUs on all other VLANs
on the trunk are sent tagged to the reserved Cisco Shared Spanning Tree (SSTP) multicast MAC address
(01-00-0c-cc-cc-cd).
• In STP, the port-channel bundle is considered as a single port. The port cost is the aggregation of all the configured port
costs that are assigned to that channel.
• When a secondary VLAN is associated with the primary VLAN, the STP parameters of the primary VLAN, such as
bridge priorities, are propagated to the secondary VLAN. However, STP parameters do not necessarily propagate to other
devices. You should manually check the STP configuration to ensure that the spanning tree topologies for the primary,
isolated, or community VLANs match exactly so that the VLANs can share the same forwarding database.
Note: In some cases, the configuration is accepted with no error messages, but the commands have no effect.
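When reading a packet capture taken on an 802.1Q trunk, the two destination MAC addresses described above tell you which kind of BPDU you are looking at. The following is a minimal sketch that classifies a frame by destination MAC only; the function names and the returned labels are hypothetical.

```python
# Destination MAC addresses named in this section.
IEEE_STP_MAC = "01-80-C2-00-00-00"    # IEEE 802.1D spanning tree multicast
CISCO_SSTP_MAC = "01-00-0c-cc-cc-cd"  # Cisco Shared Spanning Tree (SSTP) multicast


def normalize_mac(mac: str) -> str:
    """Drop separators and lowercase so that any common notation compares equal."""
    return "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")


def classify_bpdu(dest_mac: str) -> str:
    """Label a frame by its destination MAC: 'ieee-stp', 'cisco-sstp', or 'other'."""
    mac = normalize_mac(dest_mac)
    if mac == normalize_mac(IEEE_STP_MAC):
        return "ieee-stp"
    if mac == normalize_mac(CISCO_SSTP_MAC):
        return "cisco-sstp"
    return "other"


if __name__ == "__main__":
    for mac in ("0180.c200.0000", "01:00:0c:cc:cc:cd", "ffff.ffff.ffff"):
        print(mac, "->", classify_bpdu(mac))
```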
Checklist    Check off
Verify the type of spanning tree configured on your device.
Verify the network topology including all interconnected ports and switches. Identify all redundant paths on the
network and verify that the redundant paths are blocking.
Use the show spanning-tree summary totals command to verify that the total number of logical interfaces in the
Active state is less than the maximum allowed. See the Cisco NX-OS Layer 2 Switching Configuration Guide for
information on these limits.
Verify the primary and secondary root bridge and any configured Cisco extensions.
Use the following commands to view STP configuration and operational details:
Use the show spanning-tree blockedports command to display the ports that are blocked by STP.
Use the show mac address-table dynamic vlan command to determine if learning or aging occurs at each node.
1. Identify the ports involved in the loop by looking at the interfaces with high link utilization.
3. Locate every switch in the redundant paths using your network topology diagram.
4. Verify that the switch lists the same STP root bridge as the other nonaffected switches.
VLAN0009
Spanning tree enabled protocol rstp
Root ID Priority 32777
Address 0018.bad7.db15
Cost 4
...
5. Verify that the root port is correctly identified as the port with the lowest cost to the root bridge.
VLAN0009
Spanning tree enabled protocol rstp
Root ID Priority 32777
Address 0018.bad7.db15
Cost 4
Port 385 (Ethernet3/1)
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
6. Verify that the root port and alternate ports are regularly receiving BPDUs.
7. If the received BPDU counter is not incremented, check if the BPDUs are received by the internal packet manager.
Ethernet3/1, ordinal: 36
SUP-traffic statistics: (sent/received)
Packets: 120210 / 15812
Bytes: 8166401 / 1083056
Instant packet rate: 5 pps / 5 pps
8. If the BPDUs are not received by the packet manager, check the hardware packet statistic (error drop) counters.
--------------------------------------------------------------------------------
Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards
--------------------------------------------------------------------------------
mgmt0 -- -- -- -- -- --
Eth1/1 0 0 0 0 0 0
Eth1/2 0 0 0 0 0 0
Eth1/3 0 0 0 0 0 0
Eth1/4 0 0 0 0 0 0
Eth1/5 0 0 0 0 0 0
Eth1/6 0 0 0 0 0 0
Eth1/7 0 0 0 0 0 0
Eth1/8 0 0 0 0 0 0
10. If the BPDU send counter is incrementing, check if BPDUs are transmitted by the packet manager.
Ethernet3/1, ordinal: 36
SUP-traffic statistics: (sent/received)
Packets: 120210 / 15812
Bytes: 8166401 / 1083056
Instant packet rate: 5 pps / 5 pps
Average packet rates(1min/5min/15min/EWMA):
Packet statistics:
Tx: Unicast 0, Multicast 120210
Broadcast 0
Rx: Unicast 0, Multicast 15812
11. If the packet manager BPDU sent counter is incrementing, check the hardware packet statistic counters for a possible BPDU
error drop.
--------------------------------------------------------------------------------
Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards
--------------------------------------------------------------------------------
mgmt0 -- -- -- -- -- --
Eth1/1 0 0 0 0 0 0
Eth1/2 0 0 0 0 0 0
Eth1/3 0 0 0 0 0 0
Eth1/4 0 0 0 0 0 0
Eth1/5 0 0 0 0 0 0
Eth1/6 0 0 0 0 0 0
Eth1/7 0 0 0 0 0 0
Eth1/8 0 0 0 0 0 0
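Two of the checks in the steps above lend themselves to scripting when you have saved command output from several switches: comparing the Root ID that each switch reports, and scanning the error counter table for nonzero values. The following is a minimal sketch that assumes the text layouts shown in the sample outputs above; the function names are hypothetical and the parsing is not guaranteed to match every NX-OS release.

```python
import re


def root_bridge(show_spanning_tree_output: str):
    """Return (priority, address) from the Root ID block, or None if not found."""
    priority = re.search(r"Root ID\s+Priority\s+(\d+)", show_spanning_tree_output)
    if priority is None:
        return None
    # The first Address after the Root ID priority belongs to the root bridge.
    rest = show_spanning_tree_output[priority.end():]
    address = re.search(r"Address\s+([0-9a-fA-F]{4}(?:\.[0-9a-fA-F]{4}){2})", rest)
    if address is None:
        return None
    return int(priority.group(1)), address.group(1).lower()


def same_root(outputs) -> bool:
    """True if every saved output names the same root bridge."""
    return len({root_bridge(text) for text in outputs}) == 1


def ports_with_errors(counter_table: str) -> dict:
    """Return {port: [counters]} for rows where any of the six counters is nonzero."""
    flagged = {}
    for line in counter_table.splitlines():
        fields = line.split()
        # Data rows look like "Eth1/1 0 0 0 0 0 0"; headers, rules and "--" rows are skipped.
        if len(fields) != 7 or not all(f.isdigit() for f in fields[1:]):
            continue
        counters = [int(f) for f in fields[1:]]
        if any(counters):
            flagged[fields[0]] = counters
    return flagged
```

An empty dictionary from ports_with_errors means that no listed port shows error drops in that snapshot; comparing two snapshots taken some time apart shows whether a counter is still incrementing.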
In a stable topology, a topology change should not trigger excessive flooding. Link flaps can cause a topology change, so
continuous link flaps can cause repetitive topology changes and flooding. Flooding slows the network performance and can cause
packet drops on an interface.
3. Repeat step 2 on devices connected to the interface until you can isolate the device that originated the topology change.
Note: The recommended scalability limits are system wide and not per VDC.
Troubleshooting STP helps to isolate and find the cause for a particular failure, while the implementation of these enhancements is
the only way to secure the network against forwarding loops.
1. Enable the Cisco-proprietary Unidirectional Link Detection (UDLD) protocol on all the switch-to-switch links. See the UDLD
2. Set up the Bridge Assurance feature by configuring all the switch-to-switch links as the spanning tree network port type.
Note: You should enable the Bridge Assurance feature on both sides of the links or Cisco NX-OS will put the port in the
blocked state because of a Bridge Assurance inconsistency.
You must set up the STP edge port to limit the amount of topology change (TC) notices and subsequent flooding that can
affect the performance of the network. Use this command only with ports that connect to end stations. Otherwise, an
accidental topology loop can cause a data-packet loop and disrupt the device and network operation.
4. Enable the Link Aggregation Control Protocol (LACP) for port channels to avoid any port-channel misconfiguration issues. See
the LACP section in the Cisco NX-OS Interfaces Configuration Guide.
Do not disable autonegotiation on the switch-to-switch links. Autonegotiation mechanisms can convey remote fault
information, which is the quickest way to detect failures at the remote side. If failures are detected at the remote side, the
local side brings down the link even if the link is still receiving pulses.
Caution! Be careful when you change STP timers. STP timers are dependent on each other and changes can impact the
entire network.
5. (Optional) To prevent denial-of-service attacks, use the spanning-tree guard root command on the interfaces at the network STP
perimeter to secure the perimeter with Root Guard. Root Guard and BPDU Guard allow you to secure STP against influence from the outside.
6. Use the spanning-tree bpduguard enable command to enable BPDU Guard on STP edge ports to prevent STP from being
affected by unauthorized network devices (such as hubs, switches, and bridging routers) that are connected to the ports.
Root Guard protects STP from outside influences. BPDU Guard shuts down the ports that are receiving any BPDUs (not
only superior BPDUs).
Note: Short-lived loops are not prevented by Root Guard or BPDU Guard if two STP edge ports are connected directly
or through a hub.
7. Use the vlan command to configure separate VLANs and avoid user traffic on the management VLAN. The management
VLAN is confined to a building block, not the entire network.
8. Use the spanning-tree vlan vlan-range root primary command to configure a predictable STP root.
9. Use the spanning-tree vlan vlan-range root secondary command to configure a predictable backup STP root placement.
You must configure the STP root and backup STP root so that convergence occurs in a predictable way and builds optimal
topology in every scenario. Do not leave the STP priority at the default value.
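When you check whether a switch is still at the default priority, remember that with per-VLAN spanning tree the displayed priority includes the VLAN ID (the extended system ID) in its low 12 bits. That is why the sample output for VLAN0009 earlier in this article shows 32777, which is the default 32768 plus 9. The following is a minimal sketch of that arithmetic; the function names are hypothetical.

```python
DEFAULT_BRIDGE_PRIORITY = 32768


def split_priority(displayed: int):
    """Split a displayed priority into (configured priority, extended system ID)."""
    if not 0 <= displayed <= 0xFFFF:
        raise ValueError("priority must fit in 16 bits")
    # High 4 bits: configured priority (a multiple of 4096). Low 12 bits: VLAN ID.
    return displayed & 0xF000, displayed & 0x0FFF


def uses_default_priority(displayed: int) -> bool:
    """True if the configured part of the priority is still the default value."""
    configured, _vlan = split_priority(displayed)
    return configured == DEFAULT_BRIDGE_PRIORITY


if __name__ == "__main__":
    print(split_priority(32777))         # configured priority and VLAN ID for VLAN0009
    print(uses_default_priority(32777))  # root election here is left to the MAC address
```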
See Also
Cisco NX-OS/IOS STP Comparison
Further Reading
The following links contain further information on this topic from Cisco.com:
External Links
External links contain content developed by external authors. Cisco does not review this content for accuracy.
Contents
• 1 Information about Troubleshooting Routing Issues
• 2 Initial Troubleshooting Checklist
• 3 Troubleshooting Routing
• 4 See Also
• 5 Further Reading
• 6 External Links
Cisco NX-OS uses virtual device contexts (VDCs) to provide separate management domains and software fault
isolation. Each VDC supports multiple Virtual Routing and Forwarding instances (VRFs) and multiple routing information bases
(RIBs) to support multiple address domains. Each VRF is associated with a RIB, and the Forwarding Information Base (FIB)
collects this information.
See the Cisco NX-OS Unicast Routing Configuration Guide and the Cisco NX-OS Multicast Routing Configuration Guide for
more information on routing.
• show ip arp
• show ip traffic
• show tcp statistics udp4
• show ip client
• show tcp client
• show ip fib
• show ip process
• show ip route
• show pktmgr interface
• show frame traffic
• show platform fib
• show platform forwarding
• show platform ip
• show vrf
• show vrf interface
Troubleshooting Routing
To troubleshoot basic routing issues, follow these steps:
If the feature is not enabled, Cisco NX-OS reports that the command is invalid. Use the feature command to enable the
routing protocol.
igmp 1 N/A
mrib 2 /procket/shm/mrib-mfdm
m6rib 3 /procket/shm/m6rib-mfdm
See Also
Cisco NX-OS/IOS BGP (Basic) Comparison
Further Reading
The following links contain further information on this topic from Cisco.com:
External Links
External links contain content developed by external authors. Cisco does not review this content for accuracy.
This article provides only basic information on how to troubleshoot unicast packet flow issues for the M1 Series
modules.
Troubleshooting L2/L3 unicast is covered in detail in a Cisco Live presentation. Sections of this presentation cover both
platform-independent and platform-specific step-by-step troubleshooting for unicast, among other things. Access to this
presentation is free. Follow the instructions below to access the presentation:
1. Visit https://www.ciscolivevirtual.com/
4. Click the "Sessions" tab at the top, and select "2011 Sessions Catalog".
6. Select the session. You can either view the session or download the PDF.
Contents
• 1 Packet is Received into Interface from Wire
• 2 Linksec Decryption Occurs, 1st stage Port QoS
• 3 Second Stage Port QoS Occurs
• 4 Layer 2 Source/Destination MAC Processing
• 5 Layer 3 Engine Processing
♦ 5.1 Layer 3 Engine Processes Layer 3 Features
♦ 5.2 Layer 3 forwarding for Routed Traffic
• 6 SFabric Processing Occurs (optional)
• 7 Layer 2 Engine Performs Source/Destination MAC Processing
• 8 Egress Port QoS is Performed
• 9 Linksec Encryption Occurs
• 10 Packet is Transmitted
Ethernet1/1 is up
Hardware: 10000 Ethernet, address: 0024.986c.00b0 (bia 0024.986c.00b0)
Description: N7K-vdc-1 connecting to core 6506
MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA
Port mode is trunk
full-duplex, 10 Gb/s, media type is 10g
Beacon is turned off
Auto-Negotiation is turned off
Input flow-control is off, output flow-control is off
Rate mode is shared
Switchport monitor is off
Last link flapped 7week(s) 4day(s)
Last clearing of "show interface" counters never
1 minute input rate 13056 bits/sec, 9 packets/sec
1 minute output rate 4608 bits/sec, 0 packets/sec
Rx
341190251 input packets 276211313 unicast packets 52112947 multicast packets
12865991 broadcast packets 0 jumbo packets 0 storm suppression packets
94295027129 bytes
Tx
462437316 output packets 85121 multicast packets
188251 broadcast packets 0 jumbo packets
648159081064 bytes
0 input error 0 short frame 0 watchdog
0 no buffer 0 runt 0 CRC 0 ecc
0 overrun 0 underrun 0 ignored 0 bad etype drop
0 bad proto drop 0 if down drop 0 input with dribble
0 input discard
0 output error 0 collision 0 deferred
0 late collision 0 lost carrier 0 no carrier
0 babble
0 Rx pause 0 Tx pause
1 interface resets
Ethernet1/1
sfp is present
name is CISCO-AVAGO <<< If this says type is (unknown), it is not supported.
part number is SFBR-7700SDZ
revision is B4
serial number is AGD12434116
nominal bitrate is 10300 MBits/sec
Link length supported for 50/125um fiber is 82 m(s)
Link length supported for 62.5/125um fiber is 26 m(s)
cisco id is --
cisco extended id number is 4
It is important to step back and evaluate the difference between first-stage and second-stage QoS. On the 10G modules, some
ports can be configured in shared mode and others in dedicated mode. This means that 10G of bandwidth can either be dedicated
to one port or shared among a group of ports (4 ports on the M132 module).
When running in shared mode, ports can contend for access to the 10G of bandwidth through the 4:1 mux. To alleviate
this, some QoS intelligence was passed down to the 4:1 mux, which aggregates the ports.
In dedicated mode, no QoS is applied at the mux; instead, all traffic is processed in second-stage QoS. To summarize, in shared
mode, first-stage QoS ensures fair access to the 10G of port bandwidth. In both shared and dedicated mode, second-stage QoS
provides ingress queuing to the system.
For ingress QoS, we are concerned with the receive-side QoS parameters in the output of the show queuing command.
Use the show policy-map command to see per queue dropped packets.
Configured WRR
WRR bandwidth ratios: 25[1p7q4t-out-q-default] 15[1p7q4t-out-q2] 12[1p7q4t-out-q3]
12[1p7q4t-out-q4] 12[1p7q4t-out-q5] 12[1p7q4t-out-q6] 12[1p7q4t-out-q7]
WRR configuration read from HW
WRR bandwidth ratios: 25[1p7q4t-out-q-default] 15[1p7q4t-out-q2] 11[1p7q4t-out-q3]
11[1p7q4t-out-q4] 11[1p7q4t-out-q5] 11[1p7q4t-out-q6] 11[1p7q4t-out-q7]
Thresholds:
COS Queue Threshold Type Min Max
__________________________________________________________________
0 1p7q4t-out-q-default DT 100 100
1 1p7q4t-out-q-default DT 100 100
2 1p7q4t-out-q-default DT 100 100
3 1p7q4t-out-q-default DT 100 100
4 1p7q4t-out-q-default DT 100 100
5 1p7q4t-out-pq1 DT 100 100
6 1p7q4t-out-pq1 DT 100 100
7 1p7q4t-out-pq1 DT 100 100
Configured WRR
WRR bandwidth ratios: 20[8q2t-in-q-default] 0[8q2t-in-q2] 0[8q2t-in-q3]
0[8q2t-in-q4] 0[8q2t-in-q5] 0[8q2t-in-q6] 0[8q2t-in-q7] 80[8q2t-in-q1]
WRR configuration read from HW
WRR bandwidth ratios: 20[8q2t-in-q-default] 0[8q2t-in-q2] 0[8q2t-in-q3]
0[8q2t-in-q4] 0[8q2t-in-q5] 0[8q2t-in-q6] 0[8q2t-in-q7] 80[8q2t-in-q1]
Thresholds:
COS Queue Threshold Type Min Max
__________________________________________________________________
0 8q2t-in-q-default DT 100 100
1 8q2t-in-q-default DT 100 100
2 8q2t-in-q-default DT 100 100
3 8q2t-in-q-default DT 100 100
4 8q2t-in-q-default DT 100 100
5 8q2t-in-q1 DT 100 100
6 8q2t-in-q1 DT 100 100
7 8q2t-in-q1 DT 100 100
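The queuing output lists the WRR bandwidth ratios twice: once as configured and once as read from hardware. In the egress sample above the two differ slightly (12 versus 11 for several queues). The following is a minimal sketch that extracts the ratios from both text fragments and reports the queues whose values differ, assuming the value[queue-name] layout shown in the sample; the function names are hypothetical, and a difference by itself does not necessarily indicate a fault.

```python
import re

# Matches entries such as "25[1p7q4t-out-q-default]".
RATIO_ENTRY = re.compile(r"(\d+)\[([^\]]+)\]")


def parse_wrr(text: str) -> dict:
    """Map queue name -> bandwidth ratio from 'WRR bandwidth ratios' text."""
    return {queue: int(value) for value, queue in RATIO_ENTRY.findall(text)}


def wrr_differences(configured: str, hardware: str) -> dict:
    """Return {queue: (configured, hardware)} for queues whose ratios differ."""
    cfg, hw = parse_wrr(configured), parse_wrr(hardware)
    return {q: (cfg[q], hw.get(q)) for q in cfg if cfg[q] != hw.get(q)}


if __name__ == "__main__":
    configured = "25[1p7q4t-out-q-default] 15[1p7q4t-out-q2] 12[1p7q4t-out-q3]"
    hardware = "25[1p7q4t-out-q-default] 15[1p7q4t-out-q2] 11[1p7q4t-out-q3]"
    for queue, (cfg, hw) in wrr_differences(configured, hardware).items():
        print(f"{queue}: configured {cfg}, hardware {hw}")
```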
To validate forwarding by the Layer 2 engine, we should first look at the centralized MAC address table aggregated on the
supervisor to confirm that the MAC addresses are learned as we expect and assigned to the ports where we expect them to
reside.
Based on this, we can then check the ingress linecard to confirm that the MAC address table is
properly programmed into the hardware-based Layer 2 engine on the linecard.
We first look at the MAC address table, and then confirm that programming is occurring properly in the hardware table.
To drill down on a specific MAC address, we can use grep with these commands to confirm that the MAC address is associated
with a particular port and that the hardware programming reflects that.
Note: When evaluating the hardware MAC table, if the Index is set to 0x00400 or the GM bit is set to 1, that traffic will be
routed. For example, you will see the index set to 0x00400 and the GM bit set to 1 for traffic destined to the MAC address
local to the device.
PHX2-N7K-1# show mac address-table
Legend:
PHX2-N7K-1# show hardware mac address-table 1 int eth 1/1 | grep 000c.294b.c5ca
1 1 2 000c.294b.c5ca 0x00422 0 3 0 67 1 0
0 0 0 0 0 0 0 0
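The rule in the note above (index 0x00400, or GM bit set to 1, means the traffic will be routed) can be written as a one-line predicate. The following is a minimal sketch; the function name is hypothetical, and because the excerpt does not label which column of the hardware table holds the GM bit, the caller supplies it as a separate argument.

```python
ROUTED_INDEX = 0x00400  # index value that the note associates with routed traffic


def will_be_routed(index: int, gm_bit: int) -> bool:
    """Apply the note's rule: routed if the index is 0x00400 or the GM bit is 1."""
    return index == ROUTED_INDEX or gm_bit == 1


if __name__ == "__main__":
    # 0x00422 is the index shown in the sample row for 000c.294b.c5ca.
    print(will_be_routed(0x00422, 0))
    print(will_be_routed(0x00400, 1))
```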
The layer 3 features which are applied to all packets include the below features :
Following the evaluation of the features, we will evaluate the layer 3 forwarding troubleshooting.
The first feature we will look at is ACL. To troubleshoot ACL, we want to evaluate the configuration, and any relevant hit
counters. We then can identify if the hardware on the linecard is programming the ACL.
Note that if you want to see per-entry ACL counters, you must enable statistics per-entry in the ACL.
VLAN 86 :
=========
No ingress policies
No Netflow profiles in ingress direction
VDC-1 Ethernet1/1 :
====================
Policies in ingress direction:
Policy type Policy Id Policy name
------------------------------------------------------------
QoS 1
...
VDC-1 CoPP :
====================
Policies in ingress direction:
Policy type Policy Id Policy name
------------------------------------------------------------
QoS 3
No egress policies
No Netflow profiles in egress direction
The next feature we will look at is QoS. Note that QoS can be applied on both ingress and egress, so we should examine
both directions.
NetFlow processing also has portions that occur in hardware. For NetFlow, statistics are collected in hardware on the
linecards and can then be exported via software.
The commands to troubleshoot Netflow are
Interface Vlan86:
Monitor: sample-86
Direction: Input
Monitor: sample-86
Direction: Output
Cisco NX-OS supports a hardware-based intrusion detection system (IDS) that performs IP packet verification checks. These
checks catch well-known, unusable traffic types that can be seen during malicious activity, such as a source address that is a
broadcast address or a destination address of 0.0.0.0. You can validate whether any of these checks are dropping packets.
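To make the two example checks concrete, the following is a minimal software sketch of the same tests. It is an illustration only: the real checks run in hardware, the function name is hypothetical, and "broadcast" is narrowed here to the limited broadcast address 255.255.255.255 because a directed broadcast cannot be recognized without knowing the subnet.

```python
import ipaddress

LIMITED_BROADCAST = ipaddress.ip_address("255.255.255.255")
UNSPECIFIED = ipaddress.ip_address("0.0.0.0")


def verification_failures(src: str, dst: str) -> list:
    """Return the names of the example checks that this address pair would fail."""
    failures = []
    if ipaddress.ip_address(src) == LIMITED_BROADCAST:
        failures.append("source is broadcast")
    if ipaddress.ip_address(dst) == UNSPECIFIED:
        failures.append("destination is 0.0.0.0")
    return failures


if __name__ == "__main__":
    print(verification_failures("255.255.255.255", "10.0.0.1"))
    print(verification_failures("10.0.0.1", "0.0.0.0"))
    print(verification_failures("10.0.0.1", "10.0.0.2"))
```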
Note: Field experience has shown that it is frequently advantageous to disable IP fragment verification. This is done with the
command below.
To troubleshoot the routed traffic, we need to perform the following tasks:
1. Ensure that the control plane routing is correct.
2. Ensure that the hardware forwarding entries on the ingress module have the corresponding information.
Note: All routing of traffic is performed on the forwarding engine of the ingress module.
For our example, we will troubleshoot a route 86.86.87.0/24, which is set to a next hop of 86.86.86.1, and set to route out of
VLAN 86 (an SVI).
We first will look at the route, ensure it is set to the correct next hop (86.86.86.1), and set to route out of VLAN 86. We will then
want to ensure that we have a corresponding ARP entry associated with this next hop, and validate that the adjacency is in the
adjacency table.
As we can see below, 86.86.87.0/24 is set to route to 86.86.86.1, out VLAN 86. This next hop is associated with MAC address
0011.aabb.ccdd. We will use this information to investigate the hardware, next.
IP ARP Table
Total number of entries: 1
Address Age MAC Address Interface
86.86.86.1 - 0011.aabb.ccdd Vlan86
The above example shows the control plane. Now that we know how things are supposed to work, we can interrogate the
hardware to ensure the hardware entries have propagated properly to the Layer 3 hardware engine. We can see that the IP FIB has
properly associated 86.86.87.0/24 to the next hop of 86.86.86.1. We can also see, in the hardware entry, that this is routed out
VLAN 86, that the RPF is valid (if we have enabled RPF checking), and that the route entry is correctly associated with the MAC
address of 0011.aabb.ccdd.
This demonstrates that the routing in the forwarding plane is programmed correctly and that the forwarding will follow the
information contained in the routing protocols.
------------------+------------------+---------------------
Prefix            | Next-hop         | Interface
------------------+------------------+---------------------
86.86.87.0/24       86.86.86.1         Vlan86
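The comparison walked through above (route, next hop, outgoing interface, and ARP entry in the control plane versus the hardware FIB entry) can be expressed as a small consistency check once the values have been copied out of the command output. The following is a minimal sketch using the example values from this section; the Route type and check_route function are hypothetical helpers, not part of NX-OS.

```python
import ipaddress
from typing import NamedTuple


class Route(NamedTuple):
    prefix: str
    next_hop: str
    interface: str


def check_route(rib: Route, fib: Route, arp_table: dict) -> list:
    """Compare a control-plane route with its hardware entry; return any problems."""
    problems = []
    if ipaddress.ip_network(rib.prefix) != ipaddress.ip_network(fib.prefix):
        problems.append("prefix mismatch")
    if ipaddress.ip_address(rib.next_hop) != ipaddress.ip_address(fib.next_hop):
        problems.append("next hop mismatch")
    if rib.interface.lower() != fib.interface.lower():
        problems.append("interface mismatch")
    if rib.next_hop not in arp_table:
        problems.append("no ARP entry for next hop")
    return problems


if __name__ == "__main__":
    rib = Route("86.86.87.0/24", "86.86.86.1", "Vlan86")
    fib = Route("86.86.87.0/24", "86.86.86.1", "Vlan86")
    arp = {"86.86.86.1": "0011.aabb.ccdd"}
    print(check_route(rib, fib, arp) or "control plane and hardware agree")
```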
In this step, we need to check whether the fabrics are functioning properly and whether their utilization is at an acceptable
level. We can view the fabric status and utilization using the following commands:
-----------------------------
Slot Direction Utilization
-----------------------------
2 ingress 3%
2 egress 3%
6 ingress 1%
6 egress 1%
Xbar Sw Hw
--- -------------- ------
1 NA 1.0
2 NA 1.0
3 NA 1.0
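If you collect the fabric utilization output regularly, a short script can flag slots that exceed a level you consider acceptable. The following is a minimal sketch that assumes the three-column Slot / Direction / Utilization layout shown above; the function names are hypothetical, and the threshold is a value you choose, not a Cisco recommendation.

```python
def parse_fabric_utilization(output: str) -> list:
    """Return (slot, direction, percent) tuples from the utilization table."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows look like "2 ingress 3%"; headers, rules and other tables are skipped.
        if len(fields) == 3 and fields[0].isdigit() and fields[2].endswith("%"):
            rows.append((int(fields[0]), fields[1], float(fields[2].rstrip("%"))))
    return rows


def over_threshold(rows: list, threshold_percent: float) -> list:
    """Return the rows whose utilization is above the chosen threshold."""
    return [row for row in rows if row[2] > threshold_percent]


if __name__ == "__main__":
    sample = "Slot Direction Utilization\n2 ingress 3%\n2 egress 3%\n6 ingress 1%\n6 egress 1%\n"
    for slot, direction, percent in over_threshold(parse_fabric_utilization(sample), 2):
        print(f"slot {slot} {direction}: {percent}%")
```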
Based on this, we can then validate the hardware programming on the egress module to validate that our MAC address table is
properly programmed into the hardware based Layer 2 engine on the module.
The output from these commands is documented in steps 3-4 above.
Configured WRR
WRR bandwidth ratios: 25[1p7q4t-out-q-default] 15[1p7q4t-out-q2] 12[1p7q4t-out-q3] 12[1p7q4t-out-q4] 12[1p7
WRR configuration read from HW
WRR bandwidth ratios: 25[1p7q4t-out-q-default] 15[1p7q4t-out-q2] 11[1p7q4t-out-q3] 11[1p7q4t-out-q4] 11[1p7
Thresholds:
COS Queue Threshold Type Min Max
__________________________________________________________________
0 1p7q4t-out-q-default DT 100 100
1 1p7q4t-out-q-default DT 100 100
2 1p7q4t-out-q-default DT 100 100
3 1p7q4t-out-q-default DT 100 100
4 1p7q4t-out-q-default DT 100 100
5 1p7q4t-out-pq1 DT 100 100
6 1p7q4t-out-pq1 DT 100 100
7 1p7q4t-out-pq1 DT 100 100
...
Packet is Transmitted
The final step in the process is the transmission of the frame out of the physical egress port. Troubleshooting the physical port
is the same as in step 1 and includes the following commands:
This article describes how to troubleshoot Web Cache Communication Protocol version 2 (WCCPv2) on Cisco NX-OS.
Contents
• 1 Information About Troubleshooting WCCP
• 2 Problem Scenarios
♦ 2.1 Reasons For Service Group Startup Failure
♦ 2.2 Client Loss
♦ 2.3 Packet Redirect Counts Not Incrementing
♦ 2.4 Potential problems
See the Configuring WCCPv2 chapter in the Cisco Nexus 7000 Series NX-OS Unicast Routing Configuration Guide for more
information on WCCPv2.
Problem Scenarios
Reasons For Service Group Startup Failure
• WCCP Client fails to see ISY ("I See You") messages and is stuck in the NOT Usable state.
♦ Confirm by enabling the WCCP event debugging messages and looking for bad Receive ID messages.
♦ The reason for the failure would in general be a connectivity problem and may be mismatched speed or duplex
settings.
• The WCCP Client is requesting a capability which is not supported by the router probably because of platform limitations.
♦ This can be confirmed by enabling WCCP event debugging and looking for Capability Mismatch messages.
• The WCCP Client is requesting a capability which is not supported by the service group.
♦ This will occur if a service group is already formed and the WCCP Client configuration does not match the
existing service.
♦ This is a misconfiguration of the WCCP Client and can be confirmed by enabling WCCP event debugging and
looking for Capability Mismatch messages.
• HIA event messages may indicate other reasons why the router rejected an incoming "Here I Am" message.
• Some WCCP clients don't adhere to the configured forward/return methods and prefer to always default to "GRE"
forward/return. The Cisco Nexus 7000 requires L2 forward/return methods. ACNS, WAAS, etc. might need to be
configured with the assign-method-strict option. This type of failure can be seen with packet traces. The client does not
respond to the Cisco Nexus 7000 with a sent RXID but will keep sending HIA with a receive ID of 0.
Client Loss
A WCCP Client is removed from a service group when the router loses contact with the WCCP Client.
Potential problems
• Direct communication between WCCP client and host
♦ The Cisco Nexus 7000 requires that the host running the browser and the WCCP clients be attached to different L3
interfaces (the hosts and the WCCP clients cannot be present in the same subnet).
• Service definition mismatch
♦ There is no mechanism on a Cisco WCCP Client to mark two services as complementary. This also appears to be
true for third-party vendors. As a consequence, the two services can drift apart over time. On service
startup there is usually no problem. However, as WCCP clients leave and rejoin either of the services, the
assignments change independently, meaning that an outgoing connection and the corresponding incoming
connection may not be redirected to the same WCCP Client. If that happens, the configuration is broken.
♦ Check for this condition by comparing the assignments shown with the show ip wccp [web-cache | service
number] detail command. The only way currently to correct the condition is to restart both services.
• Asymmetric routing
♦ As long as the incoming connection returns to any router in the complementary service group, that will happen
automatically. Note that the connection does not have to go to the exact same router as the outgoing connection,
just the same service group. In any given network the routes to a particular destination may be numerous, which
raises the possibility that traffic returning from an origin server may take a different route from the outgoing traffic
and fail to hit routers in the complementary service group. In that case the connection will not be redirected and
the configuration will be broken. There is no way around this other than to ensure that no asymmetric
routing is taking place.
This article describes how to troubleshoot memory issues that may occur when configuring and using Cisco NX-OS.
Contents
• 1 Overview
• 2 General/High Level Assessment of Platform Memory Utilization
• 3 Detailed Assessment of Platform Memory Utilization
♦ 3.1 Page Cache
♦ 3.2 Kernel
♦ 3.3 User Processes
◊ 3.3.1 Figuring Out Which Process is Using a Lot of Memory
◊ 3.3.2 Figuring Out How a Specific Process is Using Memory
• 4 Built-in Platform Memory Monitoring
♦ 4.1 Memory Thresholds
♦ 4.2 Memory Alerts
Overview
Dynamic random access memory (DRAM) is a limited resource on all platforms and must be controlled/monitored to ensure
utilization is kept in check.
Page cache
When you access files from persistent storage (CompactFlash), the kernel reads the data into the page cache, which means
that when you access the data in the future, you can avoid the slow access times that are associated with disk storage.
Cached pages can be released by the kernel if the memory is needed by other processes.
Some file systems (tmpfs) exist purely in the page cache (for example, /dev/shm, /var/sysmgr, /var/tmp), which means that
there is no persistent storage of this data and that when the data is removed from the page cache, it cannot be recovered.
tmpfs-cached files release page-cached pages only when they are deleted.
Kernel
The kernel needs memory to store its own text, data, and Kernel Loadable Modules (KLMs). KLMs are pieces of code
that are loaded into the kernel (as opposed to being a separate user process). An example of kernel memory usage is when
an inband port driver allocates memory to receive packets.
User processes
This memory is used by Cisco NX-OS/Linux processes that are not integrated in the kernel (such as text, stack, heap, and
so on).
When you are troubleshooting high memory utilization, you must first determine what type of utilization is high (process, page
cache, or kernel). Once you have identified the type of utilization, you can use additional troubleshooting commands to help you
figure out which component is causing this behavior.
Note: From these command outputs, you might be able to tell that platform utilization is higher than normal/expected, but you
will not be able to tell what type of memory usage is high.
The show system resources command displays platform memory statistics (not per VDC).
N7K# show system resources
Load average: 1 minute: 0.43 5 minutes: 0.30 15 minutes: 0.28
Processes : 884 total, 1 running
CPU states : 2.0% user, 1.5% kernel, 96.5% idle
Memory usage: 4135780K total, 3423272K used, 712508K free
0K buffers, 1739356K cache
Note: This output is derived from the Linux memory statistics in /proc/meminfo.
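The Memory usage lines can be reduced to a utilization percentage, and to a rough figure for memory in use outside the page cache, with a few lines of script. The following is a minimal sketch that assumes the layout shown in the sample above; the function names are hypothetical. Keep in mind that part of the cache figure is tmpfs data that cannot be released, so the second number is only an approximation.

```python
import re


def parse_memory_usage(output: str) -> dict:
    """Return the kilobyte counters (total, used, free, buffers, cache) as a dict."""
    stats = {}
    for value, name in re.findall(r"(\d+)K (total|used|free|buffers|cache)", output):
        stats[name] = int(value)
    return stats


def summarize(stats: dict):
    """Return (percent of total in use, used kilobytes excluding the page cache)."""
    used_percent = round(100.0 * stats["used"] / stats["total"], 1)
    used_excluding_cache = stats["used"] - stats["cache"]
    return used_percent, used_excluding_cache


if __name__ == "__main__":
    sample = (
        "Memory usage: 4135780K total, 3423272K used, 712508K free\n"
        "0K buffers, 1739356K cache\n"
    )
    print(summarize(parse_memory_usage(sample)))
```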
The show processes memory command displays the memory allocation per process for the current VDC (the output also
contains non-VDC global processes).
N7K# show processes memory
PID MemAlloc MemLimit MemUsed StackBase/Ptr Process
----- -------- ---------- ---------- ----------------- ----------------
4662 52756480 562929945 150167552 bfffdf00/bfffd970 netstack
While this output is more detailed, it is only useful for verifying process-level memory allocation within a specific VDC.
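To find the largest consumers in a long show processes memory listing, the rows can be parsed and sorted by the MemUsed column. The following is a minimal sketch that assumes the six-column layout shown above; the function names are hypothetical.

```python
def parse_process_memory(output: str) -> list:
    """Return one dict per process row of 'show processes memory' output."""
    processes = []
    for line in output.splitlines():
        fields = line.split()
        # Columns: PID MemAlloc MemLimit MemUsed StackBase/Ptr Process
        if len(fields) == 6 and fields[0].isdigit():
            processes.append({
                "pid": int(fields[0]),
                "mem_alloc": int(fields[1]),
                "mem_limit": int(fields[2]),
                "mem_used": int(fields[3]),
                "process": fields[5],
            })
    return processes


def top_by_mem_used(processes: list, count: int = 5) -> list:
    """Return the 'count' processes with the highest MemUsed value."""
    return sorted(processes, key=lambda p: p["mem_used"], reverse=True)[:count]


if __name__ == "__main__":
    sample = (
        "PID MemAlloc MemLimit MemUsed StackBase/Ptr Process\n"
        "----- -------- ---------- ---------- ----------------- ----------------\n"
        "4662 52756480 562929945 150167552 bfffdf00/bfffd970 netstack\n"
    )
    for proc in top_by_mem_used(parse_process_memory(sample)):
        print(proc["process"], proc["mem_used"])
```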
MemTotal (kB)- Total amount of memory in the system (4 GB in the Cisco Nexus 7000 Series Sup1)
Cached (kB) - Amount of memory used by the page cache (includes files in tmpfs mounts and data cached from persistent
storage /bootflash)
RamCached (kB) - Amount of memory used by the page cache that cannot be released (data not backed by persistent
storage)
Available (Pages) - Amount of free memory in pages (includes the space that could be made available in the page cache
and free lists)
Mapped (Pages) - Memory mapped into page tables (data being used by nonkernel processes)
Slab (Pages) - Rough indication of kernel memory consumption
Page Cache
Kernel
User Processes
Page Cache
If Cached or RamCached is high, you should check the file system utilization and determine what kind of files are filling the page
cache.
The show system internal flash command displays the file system utilization (the output is similar to df -hT included in the
memory alerts log).
N7K# show system internal flash
Mount-on 1K-blocks Used Available Use% Filesystem
/ 409600 43008 367616 11 /dev/root
/proc 0 0 0 0 proc
/sys 0 0 0 0 none
/isan 409600 269312 140288 66 none
/var/tmp 307200 876 306324 1 none
/var/sysmgr 1048576 999424 49152 96 none
/var/sysmgr/ftp 307200 24576 282624 8 none
/dev/shm 1048576 412672 635904 40 none
/volatile 204800 0 204800 0 none
/debug 2048 16 2032 1 none
/dev/mqueue 0 0 0 0 none
/mnt/cfg/0 76099 5674 66496 8 /dev/hda5
/mnt/cfg/1 75605 5674 66027 8 /dev/hda6
/bootflash 1796768 629784 1075712 37 /dev/hda3
/var/sysmgr/startup-cfg 409600 27536 382064 7 none
/mnt/plog 56192 3064 53128 6 /dev/mtdblock2
/dev/pts 0 0 0 0 devpts
/mnt/pss 38554 6682 29882 19 /dev/hda4
/slot0 2026608 4 2026604 1 /dev/hdc1
/logflash 7997912 219408 7372232 3 /dev/hde1
/bootflash_sup-remote 1767480 1121784 555912 67 127.1.1.6:/mnt/bootflash/
/logflash_sup-remote 7953616 554976 6994608 8 127.1.1.6:/mnt/logflash/
Note: When reviewing this output, the value of none in the Filesystem column means that it is a tmpfs type.
In this example, utilization is high because /var/sysmgr (or one of its subdirectories) is using a lot of space. /var/sysmgr is a
tmpfs mount, which means that the files exist in RAM only. Deleting the files will reduce utilization, but you should first
determine what type of files are taking up the space (cores, debugs, and so on) and what process left them in tmpfs.
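The check described above can be sketched in Python for saved output. This is only an illustration: the function name busy_tmpfs and the 80 percent threshold are assumptions, and the parser expects the six-column layout shown in the show system internal flash sample.

```python
def busy_tmpfs(output: str, threshold: int = 80) -> list:
    """Return (mount, used_kb, use_pct) for tmpfs mounts at or above threshold."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a six-column data row.
        if len(fields) != 6 or not fields[1].isdigit():
            continue
        mount, _blocks, used_kb, _avail, use_pct, filesystem = fields
        # A Filesystem value of "none" marks a tmpfs mount (RAM only).
        if filesystem == "none" and int(use_pct) >= threshold:
            rows.append((mount, int(used_kb), int(use_pct)))
    return sorted(rows, key=lambda r: r[1], reverse=True)

# A few rows copied from the sample output above.
sample = """Mount-on 1K-blocks Used Available Use% Filesystem
/isan 409600 269312 140288 66 none
/var/sysmgr 1048576 999424 49152 96 none
/dev/shm 1048576 412672 635904 40 none
/bootflash 1796768 629784 1075712 37 /dev/hda3"""

for mount, used_kb, use_pct in busy_tmpfs(sample):
    print(mount, used_kb, use_pct)
```

Only tmpfs mounts are reported because they consume RAM directly; persistent file systems such as /bootflash affect memory only through the data cached from them.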
In Cisco NX-OS release 4.2(4) and later releases, use the following commands to display and delete the problem files from the
CLI:
The show system internal dir <full directory path> command lists all the files and sizes for the specified path (hidden
command).
The filesys delete <full file path> command deletes a specific file (hidden command).
Note: Use caution when using this command. You cannot recover a deleted file.
Note: If you are running a Cisco NX-OS release prior to Cisco NX-OS release 4.2(4), you should contact your customer
support representative.
You can also use the show hardware internal proc-info pcacheinfo command to determine how much space each file system is
using in the page cache (Cached). The command output may help you determine which persistent file systems are using the page
cache and how much memory they are using.
Kernel
Kernel issues are less common, but you can determine the problem by reviewing the slab utilization in the show system internal
meminfo command output. Generally, kernel troubleshooting requires Cisco customer support assistance to isolate why the
utilization is increasing.
If slab memory usage grows over time, use the following commands to gather more information:
The show system internal kernel malloc-stats command displays all the currently loaded KLMs with their malloc and free counts.
N7K# show system internal kernel malloc-stats
Kernel Module Memory Tracking
-------------------------------------------------------------
Module kmalloc kcalloc kfree diff
klm_usd 00318846 00000000 00318825 00000021
klm_eobcmon 08366981 00000000 08366981 00000000
klm_utaker 00001306 00000000 00001306 00000000
klm_sysmgr-hb 00000054 00000000 00000049 00000005
klm_idehs 00000001 00000000 00000000 00000001
klm_sup_ctrl_mc 00209580 00000000 00209580 00000000
klm_sup_config 00000003 00000000 00000000 00000003
klm_mts 03357731 00000000 03344979 00012752
klm_kadb 00000368 00000000 00000099 00000269
klm_aipc 00850300 00000000 00850272 00000028
klm_pss 04091048 00000000 04041260 00049788
klm_rwsem 00000001 00000000 00000000 00000001
klm_vdc 00000126 00000000 00000000 00000126
klm_modlock 00000016 00000000 00000016 00000000
klm_e1000 00000024 00000000 00000006 00000018
klm_dc_sprom 00000123 00000000 00000123 00000000
klm_sdwrap 00000024 00000000 00000000 00000024
klm_obfl 00000050 00000000 00000047 00000003
By comparing several iterations of this command, you can determine if some KLMs are allocating a lot of memory but are not
freeing/returning the memory back (the differential value will be very large compared to normal).
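The comparison between iterations can be sketched in Python as follows. The function names and the min_growth value are hypothetical. The first snapshot reuses rows from the output above, while the second snapshot is invented data used only to show the comparison.

```python
def parse_malloc_stats(output: str) -> dict:
    """Map each KLM name to its 'diff' column (allocations not yet freed)."""
    diffs = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[0].startswith("klm_"):
            # Columns: module, kmalloc, kcalloc, kfree, diff (zero-padded decimals).
            diffs[fields[0]] = int(fields[4])
    return diffs

def growing_modules(before: str, after: str, min_growth: int = 1000) -> dict:
    """Return modules whose diff grew by at least min_growth between snapshots."""
    old, new = parse_malloc_stats(before), parse_malloc_stats(after)
    return {
        name: new[name] - old[name]
        for name in new
        if name in old and new[name] - old[name] >= min_growth
    }

# First iteration: rows copied from the output above.
first = """klm_usd 00318846 00000000 00318825 00000021
klm_mts 03357731 00000000 03344979 00012752
klm_pss 04091048 00000000 04041260 00049788"""

# Second iteration: hypothetical values, not taken from a real switch.
second = """klm_usd 00318900 00000000 00318879 00000021
klm_mts 03500000 00000000 03445000 00055000
klm_pss 04100000 00000000 04050200 00049800"""

print(growing_modules(first, second))
```

A module with a large but stable diff (klm_pss here) is not flagged; only growth between iterations is reported, which matches the leak pattern described above.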
The show system internal kernel skb-stats command displays the consumption of SKBs (buffers used by KLMs to send and
receive packets).
N7K# show system internal kernel skb-stats
Kernel Module skbuff Tracking
-------------------------------------------------------------
Module alloc free diff
klm_shreth 00028632 00028625 00000007
klm_eobcmon 02798915 02798829 00000086
klm_mts 00420053 00420047 00000006
klm_aipc 00373467 00373450 00000017
klm_e1000 16055660 16051210 00004450
Compare the output of several iterations of this command to see if the differential value is growing or very high.
The show hardware internal proc-info slabinfo command dumps all of the slab information (memory structure used for
kernel management). The output can be large.
User Processes
If page cache and kernel issues have been ruled out, utilization might be high as a result of some user processes taking up too
much memory or a high number of running processes (due to the number of VDCs/features enabled).
Note: Cisco NX-OS defines memory limits for most processes (rlimit). If this rlimit is exceeded, sysmgr will crash the process
and a core file is usually generated. Processes close to their rlimit may not have a large impact on platform utilization
but could still become an issue if a crash occurs.
Figuring Out Which Process is Using a Lot of Memory
The following commands can help you identify if a specific process is using a lot of memory:
The show processes memory command displays the memory allocation per process for the current VDC (the output also
contains non-VDC global processes).
N7K# show processes memory
PID MemAlloc MemLimit MemUsed StackBase/Ptr Process
----- -------- ---------- ---------- ----------------- ----------------
4662 52756480 562929945 150167552 bfffdf00/bfffd970 netstack
Note: The output of the show processes memory command might not provide a completely accurate picture of the current
utilization (allocated does not mean in use). This command is useful for determining if a process is approaching its
rlimit.
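A minimal Python sketch of the rlimit check follows. The source does not state which column is counted against MemLimit, so the sketch reports both MemAlloc and MemUsed as fractions of MemLimit; that choice, the function name limit_usage, and the dictionary keys are assumptions.

```python
def limit_usage(row: str) -> dict:
    """Express MemAlloc and MemUsed as fractions of MemLimit for one output row."""
    pid, mem_alloc, mem_limit, mem_used, _stack, name = row.split()
    limit = int(mem_limit)
    return {
        "pid": int(pid),
        "process": name,
        # Assumption: both columns are shown because the source does not say
        # which one the rlimit is enforced against.
        "alloc_fraction": int(mem_alloc) / limit,
        "used_fraction": int(mem_used) / limit,
    }

# Row copied from the output above.
row = "4662 52756480 562929945 150167552 bfffdf00/bfffd970 netstack"
usage = limit_usage(row)
print("%s: alloc %.1f%%, used %.1f%% of limit" % (
    usage["process"], 100 * usage["alloc_fraction"], 100 * usage["used_fraction"]))
```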
To determine how much memory the processes are really using, you should check the Resident Set Size (RSS). This value will
give you a rough indication of the amount of memory (in KB) that is being consumed by the processes. You can gather this
information by using the following command:
The show system internal processes memory command displays the process information in the memory alerts log (if the
event occurred).
N7K# show system internal processes memory
PID TTY STAT TIME MAJFLT TRS RSS VSZ %MEM COMMAND
4727 ? Ss 00:00:00 0 1549 123248 132832 2.9 /isan/bin/pixm
4728 ? Ssl 00:00:00 0 408 78388 143104 1.8 /isan/bin/routing-sw/mrib -m 4
6662 ? Ssl 00:00:05 0 2762 64024 144396 1.5 /isan/bin/routing-sw/netstack /isan/etc/routing-sw/pm.cfg
4538 ? Ssl 00:00:00 0 2762 60448 211664 1.4 /isan/bin/routing-sw/netstack /isan/etc/routing-sw/pm.cfg
5865 ? Ssl 00:00:01 0 2762 60416 113320 1.4 /isan/bin/routing-sw/netstack /isan/etc/routing-sw/pm.cfg
6395 ? Ssl 00:00:00 0 2762 52008 105552 1.2 /isan/bin/routing-sw/netstack /isan/etc/routing-sw/pm.cfg
4271 ? Ssl 00:00:00 0 609 49812 61420 1.2 /isan/bin/routing-sw/urib
7879 ? Ssl 00:00:00 0 1909 44800 90508 1.0 /isan/bin/routing-sw/bgp -t 64000
5696 ? Ssl 00:00:17 0 337 44696 55252 1.0 /isan/bin/routing-sw/clis -cli /isan/etc/routing-sw/cli
5333 ? Ssl 00:00:14 0 337 44652 55208 1.0 /isan/bin/routing-sw/clis -cli /isan/etc/routing-sw/cli
4182 ? Ssl 00:00:15 0 337 44648 55204 1.0 /isan/bin/routing-sw/clis -cli /isan/etc/routing-sw/cli
6076 ? Ssl 00:00:14 0 337 44624 55284 1.0 /isan/bin/routing-sw/clis -cli /isan/etc/routing-sw/cli
6825 ? Ssl 00:00:00 0 1402 44576 84020 1.0 /isan/bin/routing-sw/pim -t
4268 ? Ssl 00:00:00 0 363 27132 38896 0.6 /isan/bin/routing-sw/u6rib
4732 ? Ssl 00:00:00 0 404 25220 65360 0.6 /isan/bin/routing-sw/m6rib
4726 ? S<s 00:00:00 0 144 25208 30188 0.6 /isan/bin/pixmc
remaining output omitted
If you see an increase in the utilization for a specific process over time, you should gather additional information about the process
utilization.
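Because several instances of the same binary can run (for example, one per VDC), it can help to total RSS per binary before comparing snapshots. The following Python sketch does this for saved output; the function name rss_by_binary is hypothetical and the parser assumes the ten-column layout shown above.

```python
import posixpath
from collections import defaultdict

def rss_by_binary(output: str) -> dict:
    """Sum the RSS column (KB) per executable name."""
    totals = defaultdict(int)
    for line in output.splitlines():
        # Columns: PID TTY STAT TIME MAJFLT TRS RSS VSZ %MEM COMMAND...
        fields = line.split(None, 9)
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        rss_kb = int(fields[6])
        binary = posixpath.basename(fields[9].split()[0])
        totals[binary] += rss_kb
    return dict(totals)

# Rows copied from the output above.
sample = """PID TTY STAT TIME MAJFLT TRS RSS VSZ %MEM COMMAND
4727 ? Ss 00:00:00 0 1549 123248 132832 2.9 /isan/bin/pixm
6662 ? Ssl 00:00:05 0 2762 64024 144396 1.5 /isan/bin/routing-sw/netstack /isan/etc/routing-sw/pm.cfg
4538 ? Ssl 00:00:00 0 2762 60448 211664 1.4 /isan/bin/routing-sw/netstack /isan/etc/routing-sw/pm.cfg
5865 ? Ssl 00:00:01 0 2762 60416 113320 1.4 /isan/bin/routing-sw/netstack /isan/etc/routing-sw/pm.cfg
6395 ? Ssl 00:00:00 0 2762 52008 105552 1.2 /isan/bin/routing-sw/netstack /isan/etc/routing-sw/pm.cfg"""

totals = rss_by_binary(sample)
for name, rss_kb in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(name, rss_kb)
```

RSS includes pages shared between processes, so the per-binary total is only a rough indication, in the same sense as the per-process value.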
The show system internal sysmgr service pid <PID in decimal> command dumps the information for the service that is running
with the specified PID.
N7K# show system internal sysmgr service pid 4727
Service "pixm" ("pixm", 109):
UUID = 0x133, PID = 4727, SAP = 176
State: SRV_STATE_HANDSHAKED (entered at time Fri Nov 12 01:42:01 2010).
Restart count: 1
Time of last restart: Fri Nov 12 01:41:11 2010.
The service never crashed since the last reboot.
Tag = N/A
Plugin ID: 1
Convert the UUID from the above output to decimal and use it in the next command.
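The conversion is plain base-16 arithmetic and can be done on any workstation. As a minimal Python sketch (the function name uuid_to_decimal is hypothetical), the UUID 0x133 from the sysmgr output converts as follows:

```python
def uuid_to_decimal(uuid_hex: str) -> int:
    """Convert a hexadecimal UUID string such as '0x133' to its decimal value."""
    return int(uuid_hex, 16)

print(uuid_to_decimal("0x133"))  # prints 307
```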
Note: If you are troubleshooting in a lab, you can perform the hex/decimal conversion in NX-OS by using the following hidden commands:
The show system internal kernel memory uuid <UUID in decimal> command displays the detailed process memory usage
including its libraries for a specific UUID in the system (convert UUID from the sysmgr service output).
N7K# show system internal kernel memory uuid 307
Note: output values in KiloBytes
Name rss shrd drt map heap ro dat bss stk misc
---- --- ---- --- --- ---- -- --- --- --- ----
/isan/bin/pixm 7816 5052 2764 1 0 0 0 0 52 0
/isan/plugin/1/isan/bin/pixm 115472 0 115472 0 109176 752 28 6268 0 24
/lib/ld-2.3.3.so 84 76 8 2 0 76 0 0 0 8
/usr/lib/libz.so.1.2.1.1 16 12 4 1 0 12 4 0 0 0
/usr/lib/libstdc++.so.6.0.3 296 272 24 1 0 272 20 4 0 0
/lib/libgcc_s.so.1 1824 12 1812 1 1808 12 4 0 0 0
/isan/plugin/1/isan/lib/libtmifdb.so.0 12 8 4 1 0 8 4 0 0 0
/isan/plugin/0/isan/lib/libtmifdb_stub 12 8 4 1 0 8 4 0 0 0
/dev/mts0 0 0 0 1 0 0 0 0 0 0
/isan/plugin/1/isan/lib/libpcm_sdb.so. 16 12 4 1 0 12 4 0 0 0
/isan/plugin/1/isan/lib/libethpm.so.0. 76 60 16 1 0 60 16 0 0 0
/isan/plugin/1/isan/lib/libsviifdb.so. 20 4 16 1 12 4 4 0 0 0
/usr/lib/libcrypto.so.0.9.7 272 192 80 1 0 192 76 4 0 0
/isan/plugin/0/isan/lib/libeureka_hash 8 4 4 1 0 4 4 0 0 0
remaining output omitted
This output helps you to determine if a process is holding memory in a specific library and can assist with memory leak
identification.
The show system internal <service> mem-stats detail command displays the detailed memory utilization including the
libraries for a specific service.
N7K# show system internal pixm mem-stats detail
Private Mem stats for UUID : Malloc track Library(103) Max types: 5
--------------------------------------------------------------------------------
TYPE NAME ALLOCS BYTES
CURR MAX CURR MAX
2 MT_MEM_mtrack_hdl 32 33 16448 16596
3 MT_MEM_mtrack_info 424 531 6784 8496
4 MT_MEM_mtrack_lib_name 636 743 30054 35112
--------------------------------------------------------------------------------
Total bytes: 53286 (52k)
--------------------------------------------------------------------------------
Private Mem stats for UUID : Non mtrack users(0) Max types: 105
--------------------------------------------------------------------------------
TYPE NAME ALLOCS BYTES
CURR MAX CURR MAX
4 [r-xp]/isan/plugin/0/isan/lib/libacfg.s 0 4 0 51337
9 [r-xp]/isan/plugin/0/isan/lib/libavl.so 79 81 1568 1608
25 [r-xp]/isan/plugin/0/isan/lib/libfsrv.s 6 6 34 34
32 [r-xp]/isan/plugin/0/isan/lib/libindxob 6 6 456 456
46 [r-xp]/isan/plugin/0/isan/lib/libmpmts. 0 2 0 100
48 [r-xp]/isan/plugin/0/isan/lib/libmts.so 7 10 816 972
51 [r-xp]/isan/plugin/0/isan/lib/libpfm_in 0 1 0 3490
53 [r-xp]/isan/plugin/0/isan/lib/libpss.so 169 196 27316 114880
57 [r-xp]/isan/plugin/0/isan/lib/libsdb.so 140 140 5632 5632
62 [r-xp]/isan/plugin/0/isan/lib/libsrg.so 0 1 0 3480
68 [r-xp]/isan/plugin/0/isan/lib/libsysmgr 3 3 2094 2094
79 [r-xp]/isan/plugin/0/isan/lib/libutils. 61 69 512 55389
84 [r-xp]/isan/plugin/1/isan/bin/pixm 238 240 532920 533440
88 [r-xp]/isan/plugin/1/isan/lib/libpixm.s 0 1 0 48
92 [r-xp]/lib/ld-2.3.3.so 21 26 3483 4233
94 [r-xp]/lib/tls/libc-2.3.3.so 286 287 8163 8490
100 [r-xp]/usr/lib/libglib-2.0.so.0.600.1 12 19 6328 6800
--------------------------------------------------------------------------------
Total bytes: 589322 (575k)
remaining output omitted
These outputs are usually requested by the Cisco customer support representative when investigating a potential memory leak in a
process or its libraries.
Note: While Cisco NX-OS implements VDCs, it is important to remember that a specific VDC's memory utilization is not
limited. Platform memory issues will impact all configured VDCs.
Memory Thresholds
Prior to Release 4.2(4), the default memory alert thresholds were as follows:
• 70% MINOR
• 80% SEVERE
• 90% CRITICAL
In Release 4.2(4) and later releases, the default memory alert thresholds are as follows:
• 85% MINOR
• 90% SEVERE
• 95% CRITICAL
This change was introduced in part due to baseline memory requirements when many features/VDCs are deployed.
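To see how the two sets of thresholds treat the same utilization, the following Python sketch maps a utilization percentage to an alert level. The names THRESHOLDS and alert_level are hypothetical, and the sketch assumes that an alert is raised when utilization is greater than or equal to the threshold; the source does not state whether the comparison is inclusive.

```python
# Thresholds as listed above, ordered from most to least severe.
THRESHOLDS = {
    "pre-4.2(4)": (("CRITICAL", 90), ("SEVERE", 80), ("MINOR", 70)),
    "4.2(4)+": (("CRITICAL", 95), ("SEVERE", 90), ("MINOR", 85)),
}

def alert_level(used_pct: float, release: str) -> str:
    """Return the alert level for a platform memory utilization percentage."""
    for level, limit in THRESHOLDS[release]:
        # Assumption: the comparison is inclusive.
        if used_pct >= limit:
            return level
    return "OK"

for release in THRESHOLDS:
    print(release, alert_level(82.8, release))
```

A platform at 82.8 percent utilization would be SEVERE under the older thresholds but OK under the newer ones, which illustrates why the change reduces alerts on systems with a high baseline.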
The show system internal memory-status command allows you to check the current memory alert status.
N7K# show system internal memory-status
MemStatus: OK
Memory Alerts
If a memory threshold has been passed (OK -> MINOR, MINOR -> SEVERE, SEVERE -> CRITICAL), the Cisco NX-OS
platform manager captures a snapshot of memory utilization and logs an alert to SYSLOG (as of Release 4.2(4), default VDC
only). The log is generated in the Linux root path (/) and a copy is moved to OBFL (/mnt/plog) if possible. This snapshot is
useful for determining whether memory utilization is high because of the page cache, the kernel, or Cisco NX-OS user processes.
The show system internal memory-alerts-log command displays the memory alerts log.
Command                   Description
cat /proc/memory_events   Provides a log of timestamps when memory alerts occurred.
cat /proc/meminfo         Shows the overall memory statistics including the total RAM, memory consumed by the page cache, slabs (kernel heap), mapped memory, available free memory, and so on.
cat /proc/memtrack        Displays the allocation/deallocation counts of the KLMs (Cisco NX-OS processes running in kernel memory).
df -hT                    Displays file system utilization information (with type).
du --si -La /tmp          Displays file information for everything located in /tmp (symbolic link to /var/tmp).
cat /proc/memory_events   Dumped a second time to help determine if utilization changed during data gathering.
cat /proc/meminfo         Dumped a second time to help determine if utilization changed during data gathering.
This article describes how to troubleshoot packet flow issues for Cisco NX-OS.
Contents
• 1 Packet Flow Issues
♦ 1.1 Packets Dropped Because of Rate Limits
♦ 1.2 Packets Dropped Because of a QoS Policy
♦ 1.3 Packets Dropped in Hardware
♦ 1.4 show hardware internal statistics rates
◊ 1.4.1 show hardware internal statistics pktflow all
• Software switched packets could be received from the interface, but dropped by the supervisor because of rate limits.
• Packets could be dropped because of a QoS policy.
• Hardware switched packets could be dropped by the hardware because of a bandwidth limitation.
module 2 :
conformed 0 bytes; action: transmit
violated 0 bytes; action: drop
module 3 :
conformed 0 bytes; action: transmit
violated 0 bytes; action: drop
module 4 :
conformed 0 bytes; action: transmit
violated 0 bytes; action: drop
module 10 :
conformed 11614462878 bytes; action: transmit
violated 3097405384908 bytes; action: drop
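To quantify how much traffic a policer is discarding on each module, the conformed and violated byte counters can be compared. The following Python sketch parses output in the layout shown above; the function name violated_share is hypothetical.

```python
import re

def violated_share(output: str) -> dict:
    """Return, per module, the fraction of policed bytes that were dropped."""
    counters = {}
    module = None
    for line in output.splitlines():
        header = re.match(r"\s*module\s+(\d+)\s*:", line)
        if header:
            module = int(header.group(1))
            counters[module] = {"conformed": 0, "violated": 0}
            continue
        row = re.match(r"\s*(conformed|violated)\s+(\d+)\s+bytes", line)
        if row and module is not None:
            counters[module][row.group(1)] = int(row.group(2))
    shares = {}
    for mod, c in counters.items():
        total = c["conformed"] + c["violated"]
        # Modules with no policed traffic report a share of 0.0.
        shares[mod] = c["violated"] / total if total else 0.0
    return shares

# Rows copied from the output above.
sample = """module 2 :
conformed 0 bytes; action: transmit
violated 0 bytes; action: drop
module 10 :
conformed 11614462878 bytes; action: transmit
violated 3097405384908 bytes; action: drop"""

for mod, share in violated_share(sample).items():
    print("module %d: %.2f%% of bytes dropped" % (mod, 100 * share))
```

In the sample, almost all of the bytes seen by the policer on module 10 were dropped, while the other modules saw no matching traffic.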
This command displays per-ASIC statistics, including the packets into and out of each ASIC, which helps to identify where
packet loss is occurring.
|------------------------------------------------------------------------|
| Device:Metropolis Role:REWR |
| Packets
|------------------------------------------------------------------------|
Instance: 0 Ports:-
|----------|-------------------|------------------|
| | IN | OUT |
|----------|-------------------|------------------|
|Ingress | 00000000014a40c0 | 0000000001498ccc |
|----------|-------------------|------------------|
|Egress | 000000000007e9dc | 000000000007e9dc |
|----------|-------------------|------------------|
|------------------------------------------------------------------------|
| Device:Octopus Role:QUE |
| Packets
|------------------------------------------------------------------------|
Instance: 0 Ports:-
|----------|-------------------|------------------|
| | IN | OUT |
|----------|-------------------|------------------|
|Ingress | 0000000001498ccc | 0000000001498cc6 |
|----------|-------------------|------------------|
|Egress | 000000000007e9c5 | 000000000007e9dc |
|----------|-------------------|------------------|
*** Counters above represent packets combined into a larger one ***
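The IN and OUT counters are hexadecimal, so comparing them by eye is error-prone. The following Python sketch converts the rows of one device block to decimal and computes IN minus OUT; the function name in_out_delta is hypothetical and the regular expression assumes the row layout shown above.

```python
import re

# Matches rows such as "|Ingress | 00000000014a40c0 | 0000000001498ccc |".
ROW = re.compile(r"\|\s*(Ingress|Egress)\s*\|\s*([0-9a-fA-F]+)\s*\|\s*([0-9a-fA-F]+)\s*\|")

def in_out_delta(output: str) -> list:
    """Return (direction, packets_in, packets_out, in_minus_out) per counter row."""
    deltas = []
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match:
            pkts_in = int(match.group(2), 16)
            pkts_out = int(match.group(3), 16)
            deltas.append((match.group(1), pkts_in, pkts_out, pkts_in - pkts_out))
    return deltas

# Rows copied from the Metropolis block above.
sample = """|Ingress | 00000000014a40c0 | 0000000001498ccc |
|----------|-------------------|------------------|
|Egress | 000000000007e9dc | 000000000007e9dc |"""

for direction, pkts_in, pkts_out, delta in in_out_delta(sample):
    print(direction, pkts_in, pkts_out, delta)
```

A nonzero difference marks an ASIC worth a closer look, but it is not proof of loss by itself: the counters are read at slightly different instants, and the footer in the output notes that some counters represent packets combined into a larger one.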