Brocade SAN
Troubleshooting
Student Guide
Brocade, the Brocade B-weave logo, Fabric OS, File Lifecycle Manager, MyView, Secure Fabric
OS, SilkWorm, and StorageX are registered trademarks and the Brocade B-wing symbol and
Tapestry are trademarks of Brocade Communications Systems, Inc., in the United States and/or
in other countries. FICON is a registered trademark of IBM Corporation in the U.S. and other
countries. All other brands, products, or service names are or may be trademarks or service
marks of, and are used to identify, products or services of their respective owners.
Notice: This document is for informational purposes only and does not set forth any warranty,
expressed or implied, concerning any equipment, equipment feature, or service offered or to be
offered by Brocade. Brocade reserves the right to make changes to this document at any time,
without notice, and assumes no responsibility for its use. This informational document describes
features that may not be currently available. Contact a Brocade sales office for information on
feature and product availability. Export of technical data contained in this document may require
an export license from the United States government.
Revision 0213 1 – 10
SAN-TS 300 Course Introduction
Footnote 1: Brocade University releases nutshell guides for each certification exam. The
guides are named after the exam, e.g. BCFP in a Nutshell, and are available from the
Brocade University certification page:
http://www.brocade.com/education/certification-accreditation.
SAN-TS 300 Troubleshooting Overview
Here is a partial list of helpful commands for identifying these problems; all
problem determination steps include switchshow and errshow:
• Timeout/sluggishness: urouteshow, topologyshow, porterrshow,
portshow, portstatsshow, portcfgshow, portbuffershow, and
aptpolicy (check routing configuration)
• Segmented fabric: configshow, fabricshow, fabstatsshow, portshow,
portcfgshow, zone-related commands, and license configuration
• Port/node configuration: portcfgshow, configshow, portlogdump,
portshow, fabricshow, trunkshow, portcfglongdistance, and
licenseshow
• Missing device: Check physical connectivity using switchshow, portshow, and
fcping. Check fabric connectivity with nsallshow, nsshow, nscamshow,
zoning commands (zoneshow, etc.), and port configuration commands (portcfgshow,
portshow). Optionally use a diagnostic test such as porttest or D_Port
diagnostics, because these test the port and link components.
For marginal links, use D_Port tests or the porttest command to troubleshoot link
issues.
Footnote 1: For example, if there is a performance issue with a server, are other servers
also having problems? If so, which servers? Knowing this will help in resolving the problem.
Footnote 1: If the problem is a device that cannot log into the fabric, capturing a
supportsave from the switch and the HBA (if it is a Brocade HBA), plus the server
syslog, will be enough. If the problem is that the server cannot ‘see’ storage, capturing a
supportsave from each switch in the path is required. If the issue is performance,
capturing a supportsave from each switch in the fabric is required.
Taking the supportsave after you have already started to troubleshoot the problem
can make resolution determination harder and may introduce false positives into the
supportsave data.
Brocade Network Advisor can be used to easily collect and store supportsave data from
multiple switches simultaneously.
During the supportsave process in Fabric OS, the *.dump files are moved to
*.old.dump, and any existing old file is overwritten.
Footnote 1: The 80 means end of list, so there are no other devices that the server
currently has access to. If this were 00 instead of 80, that would mean there are
additional devices that the host has access to. Remember that for a 24-bit address to be
included in this Name Server query, the device must be currently logged in and the
server must have access (zoned).
See the appendix portlogdump module for more information on this output:
• 8002 is an accept to the CT request
• 010a00 is the address of the server (of course, the host will have access to itself)
• 80 means end of the list
• 020b00 is the address of the storage device connected to port 11
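The end-of-list flag described above can be illustrated with a short Python sketch. It assumes each entry in the Name Server response is four bytes, one control byte (80 for the last entry, 00 otherwise) followed by the 24-bit address. The function name and the sample payload are hypothetical and built only from the values quoted in this note, so treat it as an aid for reading port log output, not as a protocol decoder.

```python
def parse_port_id_list(payload: bytes) -> list[str]:
    """Return the 24-bit addresses from a run of 4-byte entries.

    Assumed layout per entry: 1 control byte + 3 address bytes.
    Control byte 0x80 marks the last entry; 0x00 means more entries follow.
    """
    addresses = []
    for offset in range(0, len(payload) - 3, 4):
        control = payload[offset]
        addresses.append(payload[offset + 1:offset + 4].hex())
        if control & 0x80:
            break  # end of list reached
    return addresses


# Hypothetical payload: the server itself (010a00), then the storage device (020b00).
sample = bytes.fromhex("00010a00" "80020b00")
print(parse_port_id_list(sample))
```

If the first control byte were 80 instead of 00, the loop would stop after the first address, which matches the "no other devices" reading of the flag.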
When working with the port counters it is important to remember that the numbers
displayed have been accumulating since the switch was last rebooted or the stats last
cleared. Because of this it is necessary to either clear the stats and wait or take a
baseline and note any increases.
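The baseline approach can be sketched in Python as a comparison of two counter snapshots. The counter names and values below are illustrative only; in practice they would be copied by hand, or by a script of your own, from two runs of a command such as porterrshow.

```python
def counter_increases(baseline: dict[str, int], current: dict[str, int]) -> dict[str, int]:
    """Return only the counters that grew since the baseline was taken."""
    increases = {}
    for name, value in current.items():
        delta = value - baseline.get(name, 0)
        if delta > 0:
            increases[name] = delta
    return increases


# Illustrative values for one port, taken some time apart.
baseline = {"crc_err": 12, "enc_out": 340, "link_fail": 2}
later = {"crc_err": 12, "enc_out": 395, "link_fail": 3}
print(counter_increases(baseline, later))  # {'enc_out': 55, 'link_fail': 1}
```

A counter that is lower in the second snapshot (for example because the stats were cleared in between) is ignored by this sketch; in that case take a fresh baseline.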
The fabricshow command can be found in the SSHOW_FABRIC.txt file from the
supportsave capture.
Use this command to display information about switches in the fabric.
If the switch is initializing or is disabled, the message "no fabric" is displayed.
The islshow command can be found in the SSHOW_FABRIC.txt file from the
supportsave capture.
Use the islshow command to display the current connections and status of the
interswitch link (ISL) for each port on a switch. The command output includes the
following information:
• Node world wide name (WWN)
• Domain ID
• Switch name
• ISL connection speed, if applicable
• Bandwidth
• Trunking enabled, if applicable
• QoS enabled, if applicable
• Encryption enabled, if applicable
• Compression enabled, if applicable
The trunkshow command can be found in the SSHOW_FABRIC.txt file from the
supportsave capture.
Use this command to display trunking information of both E_Ports and EX_Ports.
Port to port connections: Displays the port-to-port trunking connections.
WWN: Displays the world wide name of the connected port.
Domain: Displays the domain IDs of the switches directly connected to the physical
ports. In case of an FC Router backbone fabric interlinking several edge fabrics, the
domain ID displayed for an E_Port trunk refers to a domain of a switch within the
backbone fabric, whereas the domain ID displayed for an EX_Port trunk refers to the
domain ID of a switch in the edge fabric. Because they are independent fabrics, it is
possible that both the backbone and the edge fabric may have the same domain ID
assigned to switches. If this is the case, run switchshow to obtain information on the
port types of the local switch and the WWNs of the remote switches. Refer to the
Example section for an illustration.
Deskew: The difference between the time it takes for traffic to travel over each ISL
compared to the time it takes through the shortest ISL in the group plus the minimum
deskew value. The value is expressed in nanoseconds divided by 10. The firmware
automatically sets the minimum deskew value for the shortest ISL, which is 15.
Master: Displays whether this trunking port connection is the master port connection for
the trunking group.
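As a worked example of the deskew arithmetic, the Python sketch below converts a displayed deskew value to nanoseconds. It assumes, following the description above, that the displayed value already includes the minimum of 15 and is expressed in units of 10 ns. The function names are hypothetical.

```python
MIN_DESKEW = 15  # minimum value the firmware assigns to the shortest ISL


def deskew_to_ns(deskew: int) -> int:
    """Convert a displayed deskew value (nanoseconds / 10) to nanoseconds."""
    return deskew * 10


def extra_delay_ns(deskew: int) -> int:
    """Delay relative to the shortest ISL, with the minimum value removed."""
    return (deskew - MIN_DESKEW) * 10


print(deskew_to_ns(15))    # 150
print(extra_delay_ns(17))  # 20
```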
Divide and Conquer is a troubleshooting methodology that involves taking a system and
breaking it up into smaller testable components. By moving through the system in a
systematic fashion you can, by thorough testing, identify and isolate parts of the system
that could potentially cause a problem.
The most important part is knowledge of the system you are trying to troubleshoot.
Knowing the technologies involved and how they interconnect and interact is essential
to know where to divide the system and how to eliminate potential problems.
A Brocade fabric can be separated into a number of individual components. The list
below is an example but is not all inclusive:
• Storage devices
• Hosts
• Fabric switches
• Cables / Patch panels
If a host does not see a particular storage device, check the following using the CLI,
Web Tools, or Brocade Network Advisor:
• Is the device physically connected? If either device does not appear as an F_Port,
FL_Port, or L_Port, it may not have a good physical connection. Look for a
marginal link or other initialization-related problem.
• If the device has a good physical connection, is it logically connected? (Is it
present in the Name Server? Use CLI commands such as nsshow, nscamshow, and
nsallshow, or GUIs such as Web Tools or Brocade Network Advisor, to determine
whether the fabric can see each device.)
• If one device cannot see another, you may also have to examine the zoning
configuration and link error counters to make sure the end devices are in the
same zone and that neither of them is bouncing (marginal); a bouncing device
would clearly show up in the port log.
This goes back to the Divide and Conquer process: where did the breakdown occur? At
the link level or at the logical level?
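The order of these checks can be sketched as a small Python decision function. This is only an illustration of the divide-and-conquer sequence: the inputs are values an administrator would read from switchshow, the Name Server commands, and the zoning configuration, and the names Layer and first_failing_layer are hypothetical.

```python
from enum import Enum


class Layer(Enum):
    PHYSICAL = "link level: check cabling, SFP, and port state"
    LOGICAL = "logical level: device is not in the Name Server"
    ZONING = "zoning: host and storage do not share a zone"
    BEYOND_FABRIC = "fabric checks pass: look at the host or storage configuration"


# Port types that indicate a device has completed link initialization.
DEVICE_PORT_TYPES = {"F_Port", "FL_Port", "L_Port"}


def first_failing_layer(port_type: str, in_name_server: bool, zoned_together: bool) -> Layer:
    """Walk the checks in order and return the first layer that fails."""
    if port_type not in DEVICE_PORT_TYPES:
        return Layer.PHYSICAL
    if not in_name_server:
        return Layer.LOGICAL
    if not zoned_together:
        return Layer.ZONING
    return Layer.BEYOND_FABRIC


print(first_failing_layer("G_Port", False, False).value)
print(first_failing_layer("F_Port", True, False).value)
```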
The fabric in this example has five switches with devices attached. A deterministic path
exists and can be used to isolate this problem. The problem as described is that the
host on Switch3 cannot see one of the paths to the storage on Switch2. A
path (in green) can be drawn that shows the connection the host and storage are
attempting to use. The other devices and switches in the fabric should be ignored at
this point, until they become relevant to the investigation.
[Figure: five-switch fabric (Switch1 to Switch5) with the Storage and Host attached; port numbers 14, 12, 8, 5, 7, and 3 label the connections.]
The G_Port being online indicates a problem. The device connected to that port has a
good link (it shows Online) but did not successfully get far enough into the process to
become either an E_Port or an F_Port (the port did not receive a FLOGI or ELS frame). If
the device did not come up as a G_Port and was still not physically connected, it would
come up with one of the following port states: No_Light (not receiving), No_Sync (not
synchronizing), In_Sync (receiving light and in synchronization but unable to go further in
initialization process), Laser_Flt, Port_Flt, Diag_Flt (diagnostics failed during bring up),
or Testing (which would explain why you do not see the device). You want to see Online.
switchshow port state Information:
• No_Card — no interface card present
• No_Module — no module (GBIC or other) present
• No_Light — module not receiving light
• No_Sync — module receiving light but out of synchronization
• In_Sync — module receiving light and in synchronization
• Laser_Flt — module signaling a laser fault
• Port_Flt — port marked faulty
• Diag_Flt — port failed diagnostics
• Lock_Ref — locking to the reference signal
• Testing — running diagnostics
• Online — port is up and running
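When scanning saved switchshow output, the state list above can be kept as a lookup table. The Python sketch below does that and flags anything other than Online for follow-up. The function name and the "(investigate)" marker are illustrative choices, not part of Fabric OS.

```python
PORT_STATES = {
    "No_Card": "no interface card present",
    "No_Module": "no module (GBIC or other) present",
    "No_Light": "module not receiving light",
    "No_Sync": "module receiving light but out of synchronization",
    "In_Sync": "module receiving light and in synchronization",
    "Laser_Flt": "module signaling a laser fault",
    "Port_Flt": "port marked faulty",
    "Diag_Flt": "port failed diagnostics",
    "Lock_Ref": "locking to the reference signal",
    "Testing": "running diagnostics",
    "Online": "port is up and running",
}


def describe_port_state(state: str) -> str:
    """Return the meaning of a port state, flagging anything but Online."""
    meaning = PORT_STATES.get(state)
    if meaning is None:
        return f"{state}: not in the state list"
    flag = "" if state == "Online" else " (investigate)"
    return f"{state}: {meaning}{flag}"


print(describe_port_state("No_Sync"))
print(describe_port_state("Online"))
```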
Footnote 1: If the storage device logs in after you move the cable to another port, check
the original port configuration and try the SFP in the working port. If the device still will
not log in, check the cable and the storage device. Also check the switch port for errors,
such as CRC errors (which generally indicate a physical problem). If there is a patch
panel involved, check the connections on the patch panel as well.
Establishing a link is the first step in connecting to a fabric. To establish a link, the
device and switch ports start transmitting a signal. This signal is used to negotiate speed
and synchronize character and word boundaries in the transmission.
In the next few slides we will continue our overview of the LLFD concept. LLFD will be
discussed in much greater detail later in this course.
Footnote 1: If security is enabled there will also be an additional security policy check after the
FLOGI. The switch will check the Device Connection Control Policy (DCC) Access Control Lists
(ACL) to verify that the device requesting a login is permitted to attach to the fabric. This will
generate one of two responses:
• Accept – Assign fabric unique 24-bit address
• Deny – No response, do not assign fabric address
Footnote 2: Once logged into the Name Server, there is an implied login to all well-known
addresses:
• FFFFFF – Broadcast Server
• FFFFFE – Fabric Login
• FFFFFD – Fabric Controller
• FFFFFC – Directory/Name Server
• FFFFFB – Time Server
• FFFFFA – Management Server
• FFFFF8 – Alias Server
• FFFCxx – Embedded Port (Domain Controller)
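When reading source and destination IDs in a port log, it can help to translate these addresses back to their names. The Python sketch below uses only the list above. The function name is hypothetical, and the last byte of an FFFCxx address is read as the domain ID in hexadecimal.

```python
WELL_KNOWN = {
    0xFFFFFF: "Broadcast Server",
    0xFFFFFE: "Fabric Login",
    0xFFFFFD: "Fabric Controller",
    0xFFFFFC: "Directory/Name Server",
    0xFFFFFB: "Time Server",
    0xFFFFFA: "Management Server",
    0xFFFFF8: "Alias Server",
}


def describe_address(fcid: str) -> str:
    """Name a 24-bit address given as six hex digits, if it is in the list."""
    value = int(fcid, 16)
    if value in WELL_KNOWN:
        return WELL_KNOWN[value]
    if value >> 8 == 0xFFFC:
        # FFFCxx: xx is the domain ID of the switch.
        return f"Embedded Port (Domain Controller) for domain {value & 0xFF}"
    return "not a well-known address in this list"


print(describe_address("fffffc"))  # Directory/Name Server
print(describe_address("FFFC02"))  # Embedded Port (Domain Controller) for domain 2
```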
Footnote 3: Initiators should make a State Change Registration (SCR) prior to initiating a PLOGI
to a target. By issuing the SCR, they will ensure they are notified of any changes within their
zoning configuration prior to initiating communications with any targets. They may issue the SCR
after logging into a target, but the possibility exists that something may happen to the target
after they login and before they register to be notified of changes by the Name Server. For this
reason, the SCR usually occurs immediately after the PLOGI into the fabric.
To communicate with other end devices, the device must register with and query the
Name Server. Many Host Bus Adapters (drivers) and storage devices will send standard
SCSI Inquiry data to the switch for registration. This data can be very useful for
identifying a particular device. Depending on the vendor you may also get additional
data such as firmware and driver versions. Name Server registration takes place after
the device performs a FLOGI to the Fabric Login server (FFFFFE) and then a PLOGI to
the Name Server (FFFFFC).
Footnote 1: This is not limited to initiators. Some target devices will also query the Name
Server to see which devices have access to them, and will reject login requests from
devices that do not have access.
Footnote 2: There are several different query commands to get information about the
devices that an initiator has access to. Which query commands the server sends is
dependent on the driver for that device. Different initiators can send different query
commands.
Footnote 3: This is based upon the type of device that has registered. Type 8 is SCSI –
FCP (Fibre Channel Protocol). Type 5 is IP/FCIP.
Footnote 4: Brocade Fabric OS switches log into each device in the fabric and probe for
additional information to populate into the Name Server. Device probing is on by default
but can be disabled using the configure command. Some initiators will reject this
probing which is OK. Target devices generally allow the probing. The SID from the switch
for this probe will be FFFCxx (where xx is the domain ID in hex of the switch).
If there are problems with end devices communicating with each other, start troubleshooting
from the switch and work toward one of the affected end devices.
Common mistakes with LUN Masking include:
• Initiator Node World Wide Name (NWWN) defined when Port World Wide Name (PWWN) is
required (or both are required)
• Wrong or no LUNs enabled for that particular initiator
• Note: LUN Masking will sometimes be referred to using vendor specific terms such as
"LUN Security" or "LUN Mapping"
Common mistakes with persistent binding:
• New device presented from storage, but not added to persistent binding list on host may
prevent device from being seen by the OS
• Replaced device may need modification within persistent binding file
• Note: Persistent binding could be done by HBA utility or within OS specific file
While these issues are beyond the scope of this course, verifying switch-related
connectivity and availability will help isolate the problem to the host OS driver, array LUN
masking, or the persistent binding configuration file.
Use host logs and utilities to verify whether device connectivity exists:
• Can you gather inquiry data of a device from the host?
• Can you access the device from the host?
Brocade Connect is the technical Web portal and online community for the Brocade
installed base. It empowers customers with self-service technical info, tools, and
community features that let them find answers to their questions, optimize their SAN
investment, and increase their productivity. Gain your customers' mind share, loyalty,
and appreciation by driving them to Brocade Connect on a daily basis. Best part — it's
free!
SAN-TS 300 Data Gathering
Additional information about what supportsave captures is shared later in this section.
Footnote 1: Example of a B8510 switch: SW8510-S4cp-201205021421.SSHOW_EX.txt.gz
For director class switches you will see files for both CPs (S4 and S5).
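When sorting a directory of supportsave files, the parts of such a file name can be split out with a short Python sketch. The pattern is inferred from the single example above (switch name, CP slot, a 12-digit timestamp read as YYYYMMDDhhmm, then the module name), so it is an assumption and not a documented naming rule. The function name is hypothetical.

```python
import re
from datetime import datetime

# Pattern inferred from the example file name only.
PATTERN = re.compile(
    r"^(?P<switch>.+)-S(?P<slot>\d+)cp-(?P<stamp>\d{12})\.(?P<module>[A-Za-z0-9_]+)\."
)


def parse_supportsave_name(filename: str) -> dict:
    """Split a supportsave file name into switch, slot, timestamp, and module."""
    match = PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized file name: {filename}")
    return {
        "switch": match["switch"],
        "slot": int(match["slot"]),
        "captured": datetime.strptime(match["stamp"], "%Y%m%d%H%M"),
        "module": match["module"],
    }


info = parse_supportsave_name("SW8510-S4cp-201205021421.SSHOW_EX.txt.gz")
print(info["switch"], info["slot"], info["captured"], info["module"])
```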
Footnote 2: This tool is not available to the general public.
If Virtual Fabrics are enabled, commands are checked for context and switch type as follows:
• Virtual Fabric context (VF) = Command applies to the current logical switch only, or to a
specified logical switch
• Virtual Fabric commands are further constrained by one of the following switch types:
– All Switches (All) = Command can be run in any switch context
– Base Switch (BS) = Command can be run only on the base switch
– Default Switch (DS) = Command can be run only in the default switch
– N/A = Switch type is not applicable to the command
• Chassis context (CH) = Command applies to the chassis on which it is executed
• Switch and Chassis context (VF/CH) = Command applies to the switch and the
chassis
• Disallowed = Command cannot be executed when Virtual Fabrics are enabled
supportftp Usage:
-S
-s [-h hostname or IP] [-u username] [-p password] [-d remotedirectory] [-l protocol]
-R | -t hours | -e | -d
SW1:FID128:admin> supportsave -c
This command collects RASLOG, TRACE, supportShow, core file, FFDC data and
then transfer them to a FTP/SCP/SFTP server or a USB device. This
operation can take several minutes.
NOTE: supportSave will transfer existing trace dump file first, then
automatically generate and transfer latest one. There will be two trace
dump files transferred after this command.
OK to proceed? (yes, y, no, n): [no] y
Saving support information for switch:SW6510, module:RAS...
...................................
Saving support information for switch:SW6510, module:CTRACE_OLD...
Saving support information for switch:SW6510, module:CTRACE_NEW...
Saving support information for switch:SW6510, module:FABRIC….....
Saving support information for switch:SW6510, module:DIAG….....
Saving support information for switch:SW6510, module:RTE...
Saving support information for switch:SW6510, module:IF_TREE...
Saving support information for switch:SW6510, module:ISCSID_DBG...
Saving support information for switch:SW6510, module:AGDUMP...
Saving support information for switch:SW6510, module:AGWWNS...
Saving support information for switch:SW6510, module:AGWWN_CFG...
Saving support information for switch:SW6510, module:VPWWN_CFG….......
Saving support information for switch:SW6510, module:SSHOW_PLOG…....
SupportSave completed.
Supportshow operands:
• slot: On bladed systems only, specifies a slot number, followed by a slash (/).
• port1[-port2]: Specifies a port or a range of ports for which to display
supportShow information. This operand is optional; if omitted, the command
displays information for all ports.
• lines: Specifies the number of lines for the portLogDump output. This
parameter is valid only with the slot/port parameters.
Output generated by this command may vary by switch configuration, platform and
Fabric OS level.
Some of the more common logs are listed below. (Note: this does not cover every command
in every log, just the more common commands; many of these commands can also be found
in multiple files.) The commands in bold are most commonly used for troubleshooting.
SSHOW_EX (exception): Contains errdump, pdshow
SSHOW_OS: Contains Linux OS level commands
SSHOW_PLOG: Contains the portlogdump
SSHOW_FABRIC: fabricshow, islshow, lfcfg --showall –cfg; lfcfg
--showall -lisl –v, lfmlog –dump, trunkshow, fabriclog –show,
fabstatsshow, topologyshow, cfgshow, portzoneshow,
portcamshow, cfgsize, cfgshow, defzone –-show, zone –-show,
porttrunkarea –-show all
SSHOW_NET: Contains network commands: ifconfig, route
Footnote 1: You can easily use this event code to search the Fabric OS Message
Reference Manual for more information.
Date and Time Stamp: The system time (UTC) when the message was generated on the
switch. The RASLog subsystem supports an internationalized time stamp format based
on the “LOCAL” setting.
Message Module and Message Number: The message module and number. These
values uniquely identify each message in the Fabric OS and reference the cause and
actions recommended in this manual. Note that not all message numbers are used;
there can be gaps in the numeric message sequence.
Sequence Number: The error message position in the log. When a new message is
added to the log, this number is incremented by 1. When this message reaches the last
position in the error log and becomes the oldest message in the log, it is deleted when a
new message is added. The message sequence number starts at 1 after a
firmwaredownload and will increase up to a value of 2,147,483,647 (0x7fffffff). The
sequence number will continue to increase beyond the storage limit of 1024 messages.
The sequence number can be reset to 1 using the errClear command. The sequence
number is persistent across power cycles and switch reboots.
Severity Level: The severity of the error:
1 = Critical
2 = Error
3 = Warning
4 = Info
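The fields described above can be separated from a log line with a short Python sketch. The comma-separated layout is taken from the TRCE-1001 message quoted in the trace dump notes of this guide (time stamp, message ID, sequence number, flag field, severity, switch name, message text). Other Fabric OS releases may format lines differently, so the field order is an assumption, and the function name is hypothetical.

```python
from datetime import datetime


def parse_raslog_line(line: str) -> dict:
    """Split one errdump-style line into its fields (assumed 7-field layout)."""
    stamp, msg_id, seq, flags, severity, switch, text = line.split(", ", 6)
    module, number = msg_id.strip("[]").split("-")
    return {
        "time": datetime.strptime(stamp, "%Y/%m/%d-%H:%M:%S"),
        "module": module,
        "number": int(number),
        "sequence": int(seq),
        "flags": flags,
        "severity": severity,
        "switch": switch,
        "text": text,
    }


line = ("2012/05/24-13:27:36, [TRCE-1001], 208, CHASSIS, WARNING, SW1, "
        "Trace dump available ! (reason: MANUAL)")
entry = parse_raslog_line(line)
print(entry["module"], entry["number"], entry["sequence"], entry["severity"])
```

The module and number together (here TRCE-1001) are the event code to look up in the Fabric OS Message Reference.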
Footnote 1: Commands cannot be chained. For example, the following does not work:
SW1:admin> auditcfg --class 1,3,5 --enable
Once audit logging is enabled, classes can be changed without first disabling logging.
Footnote 2:
SW1:admin> auditcfg --show
Audit filter is enabled.
1-ZONE
2-SECURITY
3-CONFIGURATION
4-FIRMWARE
5-FABRIC
6-FW
7-LS
Severity level: INFO
Note: See next slide for information on the Severity levels and how to change them.
There are four severity levels: INFO, WARNING, ERROR, and CRITICAL. To change the
severity level (the default is INFO, which means all four levels are included in the log),
run the command: auditcfg --severity
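The severity setting acts as a threshold: the chosen level and every level above it are logged. A minimal Python sketch of that rule follows; the function name is hypothetical.

```python
# Lowest to highest severity, as listed above.
SEVERITY_ORDER = ["INFO", "WARNING", "ERROR", "CRITICAL"]


def levels_logged(threshold: str) -> list[str]:
    """Return the severity levels that are logged for a given threshold."""
    start = SEVERITY_ORDER.index(threshold.upper())
    return SEVERITY_ORDER[start:]


print(levels_logged("INFO"))     # ['INFO', 'WARNING', 'ERROR', 'CRITICAL']
print(levels_logged("warning"))  # ['WARNING', 'ERROR', 'CRITICAL']
```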
Example: To change the severity from info to warning (which would include error and
critical) run command:
Note: Audit messages are also logged to the syslog server if configured.
Director considerations
Audit messages are generated independently by both the Active and Standby CPs.
Both CPs need an external management port connection. Both CPs need network
connectivity. A crossover cable attached to one CP card will prevent system logging
from the other CP card.
Syslog messages will always be delivered to the host syslog server from the Active CP.
The Audit configuration is propagated to the Standby CP during a CP card failover.
Syslog Server Considerations
To successfully deliver Audit messages to a syslog server, verify that:
• External syslogd server is functional and the syslog facility is operational
• IP network is functional
Syslog limits the frequency of events that can stream off the switch. If the switch
generates too many events, syslog becomes a bottleneck and the software drops audit
events to prevent any issues with the switch.
The Audit infrastructure relies on the event-generating applications to provide the
audit-specific information. If an application cannot determine the username, IP address,
or interface through which an event arrived (for example, for events not generated by a
user), the Audit infrastructure cannot transport that data and the user will not see it.
Audit messages are viewed from the console and, if syslog functionality is configured,
from the syslog server. Messages will continue to stream into the server. Methods to
sort, store, and clear these messages need to be configured on the server. There is
no limit to the number of messages that a switch will send.
Result: Audit messages are streamed chronologically to the configured syslog servers.
Each event that triggers an FFDC capture may result in more than one FFDC file being
created. The FFDC files are stored on the switch and transferred by supportsave;
once transferred they are automatically deleted from the switch.
Footnote 1: The specific events that trigger an FFDC capture are pre-selected by
Brocade engineering and cannot be changed by the user.
Footnote 2: When an FFDC capture occurs, the RAS Log error message includes FFDC in
the AUDIT flag field. Please check the latest revision of the Fabric OS Message
Reference manual or release notes for the latest details on which messages generate
an FFDC message.
When an FFDC-defined event triggers a core dump, FFDC data is captured along
with panic data. The FFDC data is in a readable format; the panic data is not.
Panic dumps and core files remain on the switch after the supportsave command is run.
• Panic Dumps are caused by a reboot reason = panic. These occur when Linux
Kernel panics cause the Fabric OS to panic.
• Core Files are Linux standard core files.
Footnote 1: It may take up to 60 seconds to detect the daemon failure. The interval
between daemon restart attempts is short – seconds. If the daemon is successfully
restarted but fails again 10 minutes later, then 3 more daemon restart attempts will be
made. There is no permanent death; the 3 restart attempts every 15 minutes will
continue indefinitely.
The trace dump file is meant to be like an airplane black-box recorder, tracking a brief
window of current values. This information can be an important aid to debugging system
crashes by providing a historical record of switch activity and behavior.
Only one trace dump file is retained on a switch at any time. If another trace dump is
triggered, the previous trace dump file is deleted.
Footnote 1: Looking at the errdump output shows the creation of the dump file:
2012/05/24-13:27:36, [TRCE-1001], 208, CHASSIS, WARNING, SW1,
Trace dump available ! (reason: MANUAL)
You will also see one of the following two messages, depending on the auto FTP setting
(enabled or disabled):
Use the -n option and include the -s (slot) option on director switches to generate a
trace dump for a specific slot in the chassis.
See Brocade Fabric OS Command Reference manual for more information on the
tracedump command.
Footnote 1: The parameters set by the supportftp command are used by both the
supportsave and tracedump commands.
For more information on supportftp parameters see next page notes slide.
Use the supportftp command to set, clear, or display support FTP parameters. This
command has the following optional arguments:
-s: Set the FTP parameters. The following operands can be optionally specified on
the command line. If the -s option is specified without further operands, the
command interactively prompts for these parameters.
-h <host>: Specifies the FTP host. Provide an IP address or a server name. IPv4
and IPv6 addresses are supported. To specify the host by name, a DNS
entry must exist for the server.
-u <username>: Specifies the FTP account user name. The user name must be
less than 48 characters long.
-p <password>: Specifies the FTP account password. The password must be
less than 48 characters long. When using anonymous FTP, a password is not
required.
-d <remotedirectory>: Specifies the remote directory where the trace dump
files are stored. The directory name must be less than 48 characters long.
Specifying the root directory as the remote directory (/) is not allowed.
-l <protocol>: Specifies the transfer protocol. Valid values are file transfer protocol
(FTP), secure copy protocol (SCP), or secure FTP (SFTP).
-t <hours>: Specifies the time interval, in units of hours, at which the FTP server
connectivity is checked.
-e: Enables auto file transfer. Trace dump files are automatically transferred to a
designated FTP server. The server parameters must be set before you can
enable auto file transfer.
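The length and value limits listed above can be checked before the parameters are typed into the switch. The Python sketch below is a hypothetical pre-check that runs on a workstation. It does not call the switch, and the function and parameter names are illustrative.

```python
VALID_PROTOCOLS = {"ftp", "scp", "sftp"}
MAX_LEN = 48  # values must be shorter than this, per the operand descriptions


def check_supportftp_params(username: str, remote_dir: str, protocol: str,
                            password: str = "") -> list[str]:
    """Return a list of problems; an empty list means the values look acceptable."""
    problems = []
    fields = (("username", username), ("password", password),
              ("remote directory", remote_dir))
    for label, value in fields:
        if len(value) >= MAX_LEN:
            problems.append(f"{label} must be less than {MAX_LEN} characters")
    if remote_dir == "/":
        problems.append("remote directory cannot be the root directory")
    if protocol.lower() not in VALID_PROTOCOLS:
        problems.append(f"unknown protocol: {protocol}")
    return problems


print(check_supportftp_params("admin", "/", "tftp"))
print(check_supportftp_params("admin", "/dumps/sw1", "SCP"))  # []
```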
In Fabric OS, you can administer limited parts of the trace dump functionality through
the Trace tab in the Switch Admin dialog in Web Tools.
To access the Web Tools view on this slide click Switch Admin and then Show
Advanced Mode:
For more information on Brocade Network Advisor, see the WBT BNA 200 Brocade Network
Advisor Training course.
Footnote 1: The switch and the host (containing the Brocade HBA/Fabric Adapter) must
be discovered by Brocade Network Advisor.
Footnote 1: The fabric and the hosts must be discovered by Brocade Network Advisor. To
get to the Technical SupportSave window, click Monitor > Technical Support >
Product/Host SupportSave.
Footnote 2: In would be the name of the fabric, in this example the name of the fabric is
Fabric.
Footnote 1: To generate reports, select SAN and click Reports in the menu. Select
Event Custom Reports, Generate, or View:
Fabric Summary Report: Lists information for discovered fabrics. Creates a separate
report for each fabric (see the example on the next slide). Includes a summary of:
• Fabric information
• Switches
• Device information
• ISLs and trunks
Fabric Ports Report: Lists discovered ports, including used and unused ports. Port data
for each fabric is divided into two parts (see the example two slides ahead):
• Director and switch utilization
• Individual port details
Can display Rx and Tx Utilization or Mbps as well as the following error counters:
• CRC Errors
• Signal Losses
• Sync Losses
• Link Failures
• Sequence Errors
• Invalid Transmissions
• Rx Link Resets
• Tx Link Resets
Footnote 1: The freeze option freezes the log from “on the fly” updates. New events will
still be stored in the database, but the display will not be updated until the freeze
option is unchecked.
Footnote 2: Event messages can be user defined. For example, the user can define pseudo
events (more on this later in this presentation) and assign a severity level to them, so
a user can assign an Emergency level to a pseudo event. This could be useful for
troubleshooting. To create a pseudo event, select Monitor > Event Processing > Pseudo
Events.
Download the SAN Health Diagnostics Capture, and save to your hard drive.
SAN Health Diagnostics Capture minimum system requirements:
• Intel Pentium processor 133 MHz or higher
• Microsoft Windows 95 or higher
• 64 MB RAM / 10 MB available hard disk space
The last screen of the process gives you an option to send the diagnostic data to the
report generation queue via HTTPS or via email attachment to SHUpload@brocade.com.
Either way, you will get an email confirmation letting you know that the report was
received and a second email when the report is ready.
Values that merit attention are highlighted in red, orange, and blue. If a value is
highlighted in one of these colors, it is recommended that action be taken to assess the
impact to your SAN.
Footnote 1: HCM also offers a supportsave under Tools > SupportSave; however, this is for
the HCM application only and does not capture information about the HBA. This
supportsave is useful for troubleshooting issues with the HCM application and its
management of an HBA, but it is not useful when troubleshooting issues with the HBA itself.
Footnote 1: Some utilities require you to configure the utility for capturing prior to
opening up a session. Check with your utility vendor for instructions.
Best practice: Consider connecting a terminal server with network AND modem
capability for serial console access to switch. If you lose network access, you can
still dial in assuming that the terminal server has this capability. The serial console
is used to access a switch to configure network parameters, monitor switch console
messages, and sometimes to perform password recovery procedures. Not all
password recovery procedures require serial access.
Console messages that “pop up” during CLI login sessions are not displayed in
errshow/errdump (error log) outputs unless they contain a severity level.
Console messages are messages that go to the serial port. In Linux, messages
directed to “standard error” are mirrored on the console. Console messages that
contain severity levels are displayed in the error log. Examples of console
messages that do not include severity levels include CP sync messages. These
CP sync messages let the console know about events that occur in the CP failover
process. Console messages can be configured to go to syslog servers.
Events can also be filtered by using the Reports > Event Custom Reports utility.
Revision 0213 3 – 84
SAN-TS 300 Device Connectivity
Troubleshooting is never an exact methodology. The path you take depends upon the
results of the command you typed in. It may depend on visual indicators within the
switch, the host, or the target.
No two people troubleshoot the same way, and this is only a summary of commands
available and symptoms to be aware of.
Think of a switchshow as a binary action – you may be able to eliminate the systems
side of the picture if something looks wrong with the storage port. With the output of
your switchshow command, you may eliminate half of the configuration as suspect.
Try not to make it too complicated by keying in on one specific component until some
data points toward that component. Don’t assume the information you have been given
is correct; always validate it.
Footnote 1: You can also use the portlogshow command, which filters the portlogdump
output for one specific port.
Verify you are receiving light from the end device. Does the switch see light from the
device?
A disconnected or bad cable may be the problem. The HBA in the host may have failed.
OS configuration file parameters, driver parameters, and HBA firmware parameters
could also be a reason that the switch is not receiving light from the end device.
Start with the switchshow command to get an overall view of the ports.
For port state, the following would be related to Light:
• No_Card - no interface card present
• No_Module - no module (SFP or other) present
• Mod_Val - module validation in process
• Mod_Inv - invalid module
• No_Light - the module is not receiving light
Use portflagsshow to verify whether Light had previously been seen.
Your SFP within that port on the switch could be faulty. Use the sfpshow <port>
command to verify that the SFP is functioning properly.
Footnote 1: D_Port is an advanced diagnostic used to diagnose issues with SFPs,
cables, Condor3 ASICs, and connections. It requires the switch port ASIC to be
Condor3. The D_Port test is covered in more detail in the switch-to-switch connectivity module.
Identifier: 3 SFP
Encoding: 1 8B10B
Vendor Rev: A
BR Max: 0
BR Min: 0
DD Type: 0x68
Enh Options: 0xfa
Status/Ctrl: 0xb0
Alarm flags[0,1] = 0x0, 0x0
Alarm Warn
[Slide callout: alarm thresholds and current sensor readings]
Footnote 1: Remember to check both ends of the link for light/signal. One end may be showing no sync because it is
receiving light but not transmitting light.
crc_err – This counter counts frames with CRC errors. If it increases, inspect the
physical path: check the cables to and from the switch, patch panel, and other devices,
and check the SFP by swapping it with a known good SFP. If you see this issue on an
8 Gbps blade, use the portcfgfillword command to reduce EMI. Suggested actions are to
replace the cable or SFP, move the cable to another port, or run porttest or
portdporttest.
crc_g_eof – This counter counts frames with CRC errors and a good EOF. The first port
detecting a CRC error marks the frame with a bad EOF and passes the frame on to its
destination. Subsequent ports in the path also detect the CRC error, and the crc_err
counter increments on these ports. However, since the first port marked the frame with
a bad EOF, the good EOF counter on the subsequent ports does not increment. The link
attached to the port with an increasing crc_g_eof counter is therefore the marginal
link and the source of the errors.
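The crc_g_eof rule can be applied mechanically to counter increases taken from two snapshots. This is a minimal sketch in Python; the port labels and numbers are hypothetical and only illustrate the rule that the port whose good-EOF CRC counter increases sits on the marginal link.

```python
def ports_on_marginal_links(deltas):
    """deltas maps a port label to (crc_err increase, crc_g_eof increase)."""
    return [port for port, (crc_err, crc_g_eof) in deltas.items() if crc_g_eof > 0]


# Hypothetical increases between two snapshots along one frame path.
path = {
    "SwitchA port 3": (40, 40),  # first port to see the bad frames
    "SwitchB port 9": (40, 0),   # downstream: CRC errors, EOF already marked bad
    "SwitchC port 1": (40, 0),
}
print(ports_on_marginal_links(path))
```

Ports that show only crc_err increases are downstream of the problem and are not reported.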
too_short – The too_short counter is incremented whenever a frame, bounded by an
SOF and EOF, is received and the number of words between the SOF and EOF is less
than 7 words (6-word header plus 1-word CRC). This would be 36 bytes including the
SOF and EOF. This could be caused by the transmitter or an unreliable link.
too_long – Fibre Channel frames are 2148 bytes maximum. If an EOF is corrupted or
data generation is incorrect, a too_long error is generated.
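The frame size limits behind too_short and too_long follow from simple arithmetic. The sketch below is in Python and assumes 4-byte transmission words and a 2112-byte maximum data field; neither value is stated in the notes above, and the function name is hypothetical.

```python
WORD_BYTES = 4                 # assumed size of one transmission word
SOF = EOF = CRC = WORD_BYTES   # one word each
HEADER = 6 * WORD_BYTES        # 6-word frame header
MAX_DATA_FIELD = 2112          # assumed maximum data field, in bytes

MIN_FRAME = SOF + HEADER + CRC + EOF                   # 4 + 24 + 4 + 4
MAX_FRAME = SOF + HEADER + MAX_DATA_FIELD + CRC + EOF  # 4 + 24 + 2112 + 4 + 4


def classify_frame_length(total_bytes):
    """total_bytes counts everything from SOF through EOF."""
    if total_bytes < MIN_FRAME:
        return "too_short"
    if total_bytes > MAX_FRAME:
        return "too_long"
    return "ok"


print(MIN_FRAME, MAX_FRAME)
print(classify_frame_length(32), classify_frame_length(2148), classify_frame_length(2152))
```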
bad_eof – After a loss of synchronization error continuous mode alignment allows the
receiver to reestablish word alignment at any point in the incoming bit stream while the
receiver is operational. Such realignment is likely (but not guaranteed) to result in code
violations and subsequent loss of synchronization. Under certain conditions, it may be
possible to realign an incoming bit stream without loss of synchronization. If such a
realignment occurs within a received frame, detection of the resulting error condition is
dependent upon higher-level function (e.g., invalid CRC, missing EOF Delimiter).
enc_out – 8bit/10bit encoding errors occurred in words (ordered sets) outside the
Fibre Channel frame, usually indicating a bad primitive. Words outside of frames
are also encoded; if this encoding is corrupted or an error is detected, an enc_out
error is generated. This is a sign of a hardware problem. Take snapshots of the port
errors by using the porterrshow command at intervals of 5 to 10 minutes. If you notice
the crc_err counter go up as well, you have a bad or damaged cable, or a bad or damaged
device in the path. Suggested actions are to replace the cable or SFP, move the cable
to another port, or run porttest or portdporttest to verify. NOTE: ICLs will see
enc_out errors when ports on the other side of the link are disabled; this is normal.
Disc c3 – Discard class 3 errors could be generated by a switch when devices send
frames without performing a FLOGI first or send frames to an invalid destination. It also
is an indication of a possible performance problem, when a switch port can’t send a
frame due to congestion and must discard the frame when the hold time expires. More
information on this in the performance module of this course.
Link fail – If a port remains in the LR Receive State for a period of time greater than a
timeout period (R_T_TOV), a Link Reset Protocol Timeout shall be detected which
results in a Link Failure condition (enter the NOS Transmit State). The link failure also
indicates that loss of signal or loss of sync lasting longer than the R_T_TOV value was
detected while not in the Offline state.
Loss sync – Synchronization failures on either bit or transmission word boundaries are
not separately identifiable and cause loss-of-synchronization errors.
Loss sig – Occurs when a signal is transmitted but none is being received on the same
port.
Frjt – If the fabric cannot process a Class 2 frame a F_RJT is returned. The F_RJT
response to a frame indicates that delivery of that frame is being rejected. Rejection
indicates that the frame contents are intact (i.e. no transmission errors) but the frame
cannot be received for some protocol-related reasons, such as non-support of a service
or inconsistent frame header fields.
Fbsy – If the fabric cannot deliver a Class 2 frame within E_D_TOV, the frame is
discarded and an F_BSY is returned. The F_BSY indicates that the frame can’t be
delivered, because either the fabric or the destination N_Port is temporarily busy. On
receipt of an F_BSY in response to a frame transmitted, the source N_Port is expected
to attempt Frame retransmission, up to some number of retries. Recovery after retry is
exhausted is dependent on the FC-4 ULP and the Exchange Error Policy.
For 8 Gbps switches: use the porttest command along with porterrshow to verify
physical near-end components.
Switch1:admin> porttest -ports 1 -iteration 100
For 16 Gbps switches with 16 Gbps SFPs: use portdporttest to verify physical
near-end components. Note: If you have a 16 Gbps switch with 8 Gbps SFPs, you must use
porttest.
     frames      enc  crc  crc    too   too   bad  enc  disc  link  loss  loss  frjt  fbsy
     tx    rx    in   err  g_eof  shrt  long  eof  out  c3    fail  sync  sig
==========================================================================================
<truncated output>
 5:  3.3g  3.8g  0    0    0      0     0     0    45   1     0     15    30    0     0
 7:  0     0     0    0    0      0     0     0    0    0     0     0     12    0     0
<truncated output>
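When many ports must be compared, rows like the ones above can be parsed into named counters. This is a minimal sketch in Python, not a Brocade tool. The column names follow the usual porterrshow column order, and the k/m/g suffixes are assumed to mean thousands, millions, and billions; treat both as assumptions to verify against your Fabric OS version.

```python
COLUMNS = ["frames_tx", "frames_rx", "enc_in", "crc_err", "crc_g_eof",
           "too_short", "too_long", "bad_eof", "enc_out", "disc_c3",
           "link_fail", "loss_sync", "loss_sig", "frjt", "fbsy"]
SUFFIX = {"k": 10**3, "m": 10**6, "g": 10**9}  # assumed decimal multipliers


def to_number(text):
    """Convert a displayed value such as '45' or '3.8g' to an approximate count."""
    text = text.strip().lower()
    if text[-1] in SUFFIX:
        return int(round(float(text[:-1]) * SUFFIX[text[-1]]))
    return int(text)


def parse_row(line):
    """Split 'port: v1 v2 ...' into (port number, {column name: value})."""
    port, _, rest = line.partition(":")
    values = [to_number(v) for v in rest.split()]
    return int(port), dict(zip(COLUMNS, values))


def error_counters(counters):
    """Keep only the non-zero error columns (the frame counters are skipped)."""
    return {name: value for name, value in counters.items()
            if not name.startswith("frames_") and value > 0}


port, counters = parse_row("5: 3.3g 3.8g 0 0 0 0 0 0 45 1 0 15 30 0 0")
print(port, error_counters(counters))
```

Running the same parser on two snapshots taken a few minutes apart and subtracting the values shows which counters are still increasing.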
Once you identify suspect ports with porterrshow, use portshow <port> or
portstatsshow <port> to look at the actual port counters. The counter fields in the
portstatsshow output are wider than those in porterrshow. Look at the enc_out errors.
The difference between 3.8g and 3.9g is far larger than the difference between 38 and
39. For 3.8g to increment to 3.9g, 100,000,000 more errors must occur. The exact values
can be seen with portstatsshow <port> or portshow <port>. Alternatively,
you could clear the counters for the port with portstatsclear <port>, and then
continue to monitor.
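The loss of resolution in abbreviated counters can be worked out directly. The sketch below is in Python and assumes that k, m, and g stand for thousands, millions, and billions and that one decimal place is displayed; the function name is hypothetical.

```python
SUFFIX = {"k": 10**3, "m": 10**6, "g": 10**9}  # assumed decimal multipliers


def display_step(suffix, decimals=1):
    """Counts represented by the smallest visible change, e.g. 3.8g to 3.9g."""
    return SUFFIX[suffix] // 10**decimals


for suffix in ("k", "m", "g"):
    print(suffix, display_step(suffix))
```

Under these assumptions, a counter shown with a g suffix hides up to 100,000,000 events between two displayed values, which is why the exact counters should be checked.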
Fiber cable needs to be matched to the SFP in use and cables on both sides of a patch
need to be the same type.
Footnote 1: If the marginal link was caused by Switch3 port 5, the CRC and ENC errors
would only be seen on Switch4 port 7 and Switch2 port 12.
porttest: Use this command to isolate problems in a single replaceable element and
to trace problems to near-end terminal equipment, far-end terminal equipment, or the
transmission line. This command verifies the functional operation of the switch by
sending frames from a port's transmitter, and looping the frames back through an
external fiber cable into the port's receiver. The test exercises all switch components
from the main board, to the fibre cable, to the media (of the devices and the switch),
and back to the main board.
See Fabric OS command reference manual for more information on these tests.
Footnote 1: Port speeds are configured using the portcfgspeed command. Syntax
is:
Usage: portCfgSpeed PortNumber Speed_Level
Speed_Level: 0  - Auto Negotiate (Hardware)
             1  - 1Gbps
             2  - 2Gbps
             4  - 4Gbps
             8  - 8Gbps
             10 - 10Gbps
             16 - 16Gbps
             ax - Auto Negotiate (Hardware) + retries
             s  - Auto Negotiate (Software)
Both the sender and receiver attempt to clock bits as they receive them. When they
agree on the frequency of the bits, speed has been negotiated and established. At this
point they can start bit synchronization. If they cannot achieve this synchronization, the
port remains in a No_Sync state. This is part of the Port State Machine in the T-11 FC-FS
Standard.
The output from portshow or portflagsshow can be used to get a high level
overview of the login process for a port. In addition to login information other port level
information is sometimes shown in the port flags. The flags output for both commands
is the same.
The flags are read from right to left. The possible flags that can be displayed in Fabric
OS are:
• PRESENT – Port present (card plugged in)
• ACTIVE – Port is in the active state
• VIRTUAL – This is a virtual port
• E_PORT – Port type is an E_Port (ISL port)
• T_FPORT – F_Port is a trunk port
• T_FMASTER – F_Port is a trunk master
• T_PORT – Port is a trunk port
• T_MASTER – Port is a trunk master
• F_PORT – Port type is an edge port connecting to fabric capable devices
• G_PORT – Port type is a Generic port – Acts as a transition for non-loop fabric
capable devices
• L_/FL_PORT – Port type is a Fabric Loop port
• U_PORT – Port type can be unidentified port
[Flowchart: port type determination]
U_Port: Is something plugged in?
  No: the port remains a U_Port
  Yes: Is it a loop device?
    Yes: L_/FL_Port
    No: G_Port. Is it a fabric point-to-point device?
      Yes: F_Port
      No: E_Port
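The decision path in the flowchart can be written as a small function. This is a minimal sketch in Python of the flowchart only; the function and parameter names are hypothetical, and real port initialization involves more states than are shown here.

```python
def resulting_port_type(something_plugged_in, loop_device=False,
                        fabric_point_to_point=False):
    """Follow the flowchart from U_Port to the final port type."""
    if not something_plugged_in:
        return "U_Port"
    if loop_device:
        return "L_/FL_Port"
    # Non-loop ports pass through G_Port before the final type is decided.
    return "F_Port" if fabric_point_to_point else "E_Port"


print(resulting_port_type(False))
print(resulting_port_type(True, loop_device=True))
print(resulting_port_type(True, fabric_point_to_point=True))
print(resulting_port_type(True))
```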
The example on this slide shows an instance where both the device and switch ports are
hard set to different speeds. Since auto-negotiation does not occur, the switch and
attached device are unable to complete the speed negotiation process.
Here we see that the port has been locked to 4 Gbit/sec with the command:
Switch1:admin> portcfgspeed 1 4
This can be confirmed with a portcfgshow or portshow <port>, but
switchshow has already shown this above.
Footnote 1:
Link Fail - If a port remains in the Link Reset (LR) Receive State for a period of time
greater than a timeout period (R_T_TOV), a Link Reset Protocol Timeout shall be
detected, which results in a Link Failure condition (enter the NOS Transmit State). A
link failure also indicates a loss of sync or loss of signal lasting longer than the
Receiver Transmitter Timeout Value (R_T_TOV) while the port was not in the Offline
State; both will cause only the Link Failure counter to increase. For a loss of sync
lasting shorter than R_T_TOV, the port will remain in the active state and the Loss of
Sync counter will increase.
Per Fibre Channel standards, the default R_T_TOV value is 100 milliseconds, but it can
be set as low as 100 microseconds.
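The relationship between outage duration and the counter that increments can be sketched as follows. This Python example is illustrative only: the function name is hypothetical, and the behavior for an outage lasting exactly R_T_TOV is not specified in the footnote, so it is treated here as the shorter case.

```python
R_T_TOV_MS = 100.0  # default value quoted in the footnote, in milliseconds


def counter_for_outage(duration_ms, r_t_tov_ms=R_T_TOV_MS):
    """Return the counter expected to increase for a loss of sync of this length."""
    if duration_ms > r_t_tov_ms:
        return "link_fail"   # only the Link Failure counter increases
    return "loss_sync"       # port stays active; Loss of Sync increases


print(counter_for_outage(250.0))
print(counter_for_outage(20.0))
print(counter_for_outage(0.5, r_t_tov_ms=0.1))
```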
Fabric OS switches also support a port fencing option with Fabric Watch which will
disable a port when a threshold is reached.
Switch1:admin> errdump
2009/03/17-22:21:07, [FW-1170], 10,, WARNING, Switch1, , Port#1,Loss of
Signal, is above high boundary (High=1, Low=0). Current value is 3
Error(s)/second.
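Saved errdump output can be filtered by message ID or severity once each entry is split into fields. The following is a minimal sketch in Python that was written only against the single sample entry above; the field names and the regular expression are assumptions, and other message formats may need adjustments.

```python
import re

# Pattern derived from the one sample entry shown above (an assumption).
RASLOG = re.compile(
    r"^(?P<timestamp>\S+), \[(?P<msg_id>[A-Z0-9]+-\d+)\], (?P<seq>\d+),[^,]*, "
    r"(?P<severity>[A-Z]+), (?P<switch>[^,]+), (?P<text>.*)$"
)


def parse_raslog(line):
    """Return the fields of one errdump entry, or None if it does not match."""
    match = RASLOG.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["text"] = fields["text"].lstrip(", ")
    return fields


# The entry is wrapped over three lines in the notes; join it before parsing.
wrapped = [
    "2009/03/17-22:21:07, [FW-1170], 10,, WARNING, Switch1, , Port#1,Loss of",
    "Signal, is above high boundary (High=1, Low=0). Current value is 3",
    "Error(s)/second.",
]
record = parse_raslog(" ".join(wrapped))
print(record["msg_id"], record["severity"], record["text"])
```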
ISL R_RDY Mode – Displays ON when ISL R_RDY mode is enabled on the port. Displays
(..) or OFF when ISL R_RDY mode is disabled. This value is set by the
portcfgislmode command.
RSCN Suppression – Displays ON when RSCN suppression is enabled on the port.
Displays (..) or OFF when RSCN suppression is disabled. This value is set by the
portcfg rscnsupr command.
Persistent Disable – Displays ON when the port is persistently disabled; otherwise
displays (..) or OFF. This value is set by the portcfgpersistentdisable command.
NPIV capability – Displays ON when N_Port ID Virtualization (NPIV) is enabled on the
port (default). Displays (..) or OFF when NPIV capability is disabled. This value is set by
the portcfgnpivport command.
QOS E_Port – Displays ON when Quality of Service (QoS) is enabled on the port. Displays
(..) or OFF when QoS is disabled. By default, QoS is enabled by best effort based on
availability of buffers. This value is set by the portcfgqos command.
EX_Port – Displays ON when the port is configured as an EX_Port. Otherwise displays (..)
or OFF. This value is set by the portcfgexport command.
Mirror Port – Displays ON when Mirror Port is enabled on the port. Displays (..) or OFF
when Mirror Port is disabled. This value is set by the portcfg mirrorport
command.
FC Fastwrite – Displays ON when FC Fastwrite is enabled on the port or (..) or OFF when
disabled. Fastwrite is disabled by default. This value is set by the portcfg
fastwrite command.
Rate Limit – Displays ON when ingress rate limit is set on the port or (..) or OFF when
the ingress rate limiting feature is disabled. This value is set by the portcfgqos --
setratelimit command. The default value is OFF.
Credit Recovery – Displays ON when Credit Recovery is enabled on the port or (..) or
OFF when disabled. This value is set by the portcfgcreditrecovery command.
The credit recovery feature is enabled by default, but only ports configured as long
distance ports can utilize this feature.
Port Auto Disable – This is the Port Fencing feature. Displays ON when the Auto
Disable feature is enabled on a port or (..) when disabled. This feature causes ports to
become disabled when they encounter an event that would cause them to reinitialize.
This feature is enabled by the portcfgautodisable command. The feature is
disabled by default.
Switch1:admin> portcfglport 1
Usage: portcfglport PortNumber <0 | 1> [0 | 1] [0 | 1 | 2]
Switch1:admin> portcfglport 1 1
Port 1 is already locked as a G-Port
Once a link is established a device must login with the fabric and request a 24-bit Fibre
Channel address. During this time the device will register the number of buffer-to-buffer
credits it has available, its max receive frame size, and the Class of Service (CoS)
supported.
Once the fabric login is completed the next step is registering with the name server. The
device registers its attributes with the name server. In addition to the device registration
the name server also probes the device to attempt to gather additional information. To
see information about the device probing run the fcpprobeshow command.
After the FLOGI has completed, the HBA will send a PLOGI request to FFFFFC asking for
permission to log into the Name Server.
If a device does not perform a PLOGI or the device does not receive an ACCEPT from the
switch, the device does not complete login into the Name Server. This can be verified
through nsshow (no entry in Name Server). On occasion a device may only be partially
registered with the Name Server. In a case like this it is necessary to first know how a
device is supposed to appear in the Name Server in order to spot the differences.
State Change Register (SCR) – Nx_Port request to receive notification when something
in the fabric changes. FC devices that choose to receive RSCNs must register for this
service.
• Devices send a SCR to FFFFFD
• Registration indicates that the device wants to be notified of changes
• Devices register after PLOGI to Name Server
Registered State Change Notification (RSCN) – Issued by the Fabric Controller Service or
an Nx_Port to devices that registered
• Only sent to devices within an affected zone
Initiators should register for RSCNs using SCR. This is commonly a function within the
driver and may not be changeable through configuration files. Targets do not register
for RSCNs.
SCR 0 – No SCR registration
SCR 1 – Fabric detected registration
• Device registered to receive all RSCNs issued by Fabric Controller for events detected by
fabric
SCR 2 – Nx_Port detected registration
• Device registered to receive all RSCN requests issued for events detected by that affected
Nx_Port
The Fabric Controller Service (FFFFFD) alerts devices that changes have occurred in
the fabric by sending a Registered State Change Notification (RSCN) if:
• Device registered to receive RSCN using an SCR
• A new device has been added (within the same zone)
• An existing device has been removed (within the same zone)
• A zone has been changed
• A switch name or IP address changed
• The fabric reconfigured
Registration is optional
• SCSI initiators normally register
• SCSI targets may not register
The Fabric Controller (FFFFFD) is responsible for routing changes, topology changes
and the SCR/SCN/RSCN processes.
Fabric Controller (FFFFFD) service is a required logical entity within a fabric that
controls the general operation of the fabric. It is the fabric owner as well as the traffic
controller. Functions include fabric initialization, frame routing management,
generation of link responses, and setup and tear down of dedicated connections. Since
Fabric Controller is such an important service, Fibre Channel deploys a fully distributed
environment for this service. The Fabric Controller exists in every single switch in a
fabric, therefore, there is no single point of failure.
Major fabric management responsibilities:
• Execution of the fabric initialization procedure
• Advertise RSCN (Registered State Change Notification)
Major traffic management responsibilities:
• F_Ports are interconnected by a routing function that is managed by the Fabric
Controller and allows frames to flow from one F_Port to another N_Port connected
to an F_Port in the fabric.
• Setup and tear down of dedicated connections
• Performing general frame routing
• Parsing and routing of frames directed to well-known addresses
• Generation of Class 2 F_BSY (Fabric Busy) and F_RJT (Fabric Reject) link
responses
There are several commands used to identify and locate devices within the fabric:
switchshow – displays devices and whether they are logged into the local switch
nsshow – displays devices logged in the local Name Server
nscamshow – displays devices logged in a remote Name Server (other switch within the
fabric)
nsallshow – Lists 24-bit PID addresses of all devices logged into the fabric
nodefind – specify an alias, WWN, or PID to locate Name Server information (local
or remote) within the fabric.
Use nsshow to verify that a device has logged into the Name Server. We can further
verify information about that device; below we see the following information in the
NodeSymb field:
Vendor: Emulex
Model: LP1150-F4
Firmware Version: V2.10A7
Driver Version: V5.20A9
Switch1:admin> nsshow
{
Type Pid COS PortName NodeName TTL(sec)
N 0a0000; 2,3;10:00:00:00:c9:51:35:96;20:00:00:00:c9:51:35:96; na
FC4s: FCP
NodeSymb: [52] "Emulex LP1150-F4 FV2.10A7 DV5-5.20A9 RSL1-ST15-W2K-1"
Fabric Port Name: 20:00:00:05:1e:02:0c:77
Permanent Port Name: 10:00:00:00:c9:51:35:96
Port Index: 0
Share Area: No
Device Shared in Other AD: No
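The first line of a Name Server entry packs several fields into one semicolon-separated record. The sketch below, in Python, splits that line and breaks the 24-bit PID into its three bytes. The function name is hypothetical, and the byte labels (domain, area, AL_PA) follow the common Fibre Channel address layout, which is an assumption not stated on this slide.

```python
def parse_ns_entry(line):
    """Split 'Type Pid; COS; PortName; NodeName; TTL' into named fields."""
    fields = [field.strip() for field in line.split(";")]
    port_type, pid = fields[0].split()
    pid_value = int(pid, 16)
    return {
        "type": port_type,
        "pid": pid,
        "domain": pid_value >> 16,         # first byte of the PID
        "area": (pid_value >> 8) & 0xFF,   # second byte
        "al_pa": pid_value & 0xFF,         # third byte
        "cos": fields[1].split(","),
        "port_name": fields[2],
        "node_name": fields[3],
    }


entry = parse_ns_entry(
    " N 0a0000; 2,3;10:00:00:00:c9:51:35:96;20:00:00:00:c9:51:35:96; na"
)
print(entry["type"], entry["domain"], entry["cos"], entry["port_name"])
```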
Initiators perform the PLOGI and PRLI handshake from initiator Nx_Port to target
(storage) Nx_Port. After this occurs, the initiator issues SCSI commands such as Report
LUNs, Test Unit Ready, Start, and Read/Write to targets. This is all a function of the HBA,
firmware, system drivers, and configuration files.
fcping is a command that can verify connectivity and verify zoning. fcping performs
two operations:
• Sends five Fibre Channel Extended Link Service (ELS) Echo requests to a pair of
ports or to a single destination, or executes a SuperPing.
• Does a zone database check to make sure devices are zoned together.
See the Fabric OS command reference manual for more information and usage
options.
Devices that do not support the ELS Echo will time out, but that does not mean there is
no physical connectivity. Verify connectivity through the previously described methods
of switchshow, nsshow, and nsallshow.
Note: This verifies that devices are zoned together within the Effective Configuration. It
does not display whether devices are logged in and online to the switch.
You can use the nszonemember -u command to look for unzoned devices in the fabric.
Switch1:admin> nszonemember -u
Pid: 0x020400; Aliases:
Pid: 0x010400; Aliases:
Pid: 0x030000; Aliases:
Total of 3 unzoned device(s) in the fabric.
State Change Notifications (SCN) are used for internal state change notifications. This is
the switch logging that the port is online or is an Fx_port. SCNs are not sent from the
switch to the Nx_ports and should not be confused with RSCNs.
Note: Devices can send RSCNs to the fabric if they change their Name Server attributes.
Note: The “*Removing all nodes from port” entry is listed when a port goes
offline, and also after a port comes online and can no longer be an E_Port. In the
latter case the port has come online as an F_Port.
HCM can be used to see the devices the server has access to. There are also counters for
the HBA ports as well as detailed information on discovered targets and LUNs.
To clear the statistics for a port use the bcu fabric --statsclr <portID>
command.
Today, a customer does not have to bring down the SAN or interrupt production traffic to
install an analyzer and collect data to aid in troubleshooting Fibre Channel end-to-end
link communication. The port mirroring feature allows a customer to configure a switch
port as an analyzer port that mirrors the traffic passing through the switch between a
specific source port and destination port. The port mirroring feature will not completely
replace inline analyzers due to some minor limitations. Port mirroring is only supported on
Condor, Condor2, and GoldenEye2 based platforms. Port mirroring cannot be
implemented on any 1 or 2 Gbit/sec based platforms or on GoldenEye based platforms
(Brocade 200E or embedded products).
The port mirroring feature will mirror the traffic in both directions between the source
identifier and the destination identifier to a single mirror port. It will create and delete
mirror connections between two identifiers. All traffic between the two identifiers will be
mirrored to the specified mirror port. The user should connect an FC analyzer to this
mirror port to capture all the mirrored traffic. The analyzer needs only one connection
to the mirror port; this one connection captures traffic in both
directions.
In the ingress direction, traffic originating from the source identifier and destined to the
destination identifier is mirrored to the mirror port. In the egress direction, traffic
originating from the destination identifier and destined to the source identifier is
mirrored to the mirror port.
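The matching rule described above can be sketched in a few lines. This is only an illustration of the SID/DID pairing, not the ASIC's actual filter logic; the function name is hypothetical, and the addresses are the ones used in the portmirror examples in this module.

```python
def is_mirrored(frame_sid: int, frame_did: int, sid: int, did: int) -> bool:
    """Return True if a frame belongs to the mirror connection (sid, did)."""
    # Ingress direction: source identifier -> destination identifier.
    ingress = frame_sid == sid and frame_did == did
    # Egress direction: destination identifier -> source identifier.
    egress = frame_sid == did and frame_did == sid
    return ingress or egress

SID, DID = 0x0A0500, 0x0A0800

print(is_mirrored(0x0A0500, 0x0A0800, SID, DID))  # request direction
print(is_mirrored(0x0A0800, 0x0A0500, SID, DID))  # response direction
print(is_mirrored(0x0A0500, 0xFFFFFC, SID, DID))  # frame to the Name Server
```

The third call shows why traffic to well-known addresses is not captured: it does not match the configured pair in either direction.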
The idea of port mirroring is to capture traffic between two devices. We chose not to
mirror all the traffic received and transmitted by one device because it is not required:
if there is an issue between two devices, mirroring that SID/DID pair is enough. A
complete port mirror would require two mirror ports to provide enough bandwidth to
support full line rate traffic. In addition, two ports would be consumed by the mirror
connection to support each direction of traffic. A user would then need to connect an FC
analyzer to each mirror port.
Examples of communication between an end device and a switch include Fabric Logins
(FLOGIs), FLOGI ACC, Name Server Fibre Channel Common Transport (FC_CT) Requests
and Responses, State Change Registrations (SCRs), and Registered State Change
Notifications (RSCNs).
Port Mirroring Is …
• Capable of mirroring end-to-end traffic
• Mirror port can be any non-shared port located on the same switch as the
source identifier (SID)
• Can be used to detect missing frames (zoning issues/hold timeout)
• Can be used to capture protocol errors
• Can be used to capture ULP traffic (SCSI/FICON)
• Supported on Condor, Condor2, and GoldenEye2 platforms
Port Mirroring Can Not …
• Debug frames with invalid SID or DID
• Debug link issues
• Debug embedded switch traffic
• Debug frames to well-known addresses
• Be implemented on any 1 or 2 Gbit/sec based platforms or GoldenEye ASIC based
platforms (Brocade 200E or embedded products)
• Mirror E_Ports
Footnote 1: Disabling the mirror port connection will cause frames to be received out of
order. If IOSET is enabled, a frame will be dropped during this step. All other steps are
nondisruptive to I/O.
If the mirror port was not online you will get the following message:
ST01-B48:AD255:admin> portmirror --add 1/0 0x010c00 0x0163e8
Port Mirror: mirror port is offline.
Configure a mirror port using the CLI command portcfg mirrorport.
Switch> portcfg mirrorport 1/0 --enable
Connect an FC analyzer to the configured mirror port.
Set up a port mirror connection between the two F_Port devices using the CLI
command portmirror --add.
Switch> portmirror --add 1/0 0x0a0500 0x0a0800
Start FC Analyzer capture, reproduce problem, stop FC analyzer capture, review FC
Analyzer trace.
Remove port mirror connection using the CLI command portmirror --delete.
Switch> portmirror --delete 1/0 0x0a0500 0x0a0800
Remove the mirror port using the CLI command portcfg mirrorport.
Switch> portcfg mirrorport 1/0 --disable
Footnote 1: ZONE-1058
Message <timestamp>, [ZONE-1058], <sequence-number>,, WARNING, <system-name>, Domain
<Domain ID of the switch that becomes unreachable> present in TI zone <TI zone name> became
unreachable due to failover disabled mode.
Probable Cause Indicates that the domain present in the Traffic Isolation (TI) zone path is unreachable.
This occurs if the TI zone paths are unavailable or the TI zone is set up incorrectly.
Recommended Action: Verify that the paths defined by TI zones are online or remove the domain from the
TI zone.
Severity WARNING
ZONE-1059
Message <timestamp>, [ZONE-1059], <sequence-number>,, WARNING, <system-name>, Unexpected TI
routing behavior or a potentially un-routable TI configuration has been detected on local domain <Domain
ID of the local Logical Switch where the error
was detected>.
Probable Cause Indicates that the current fabric topology and TI Zone configuration may result in an
unroutable condition or unexpected routing behavior.
Recommended Action: Execute the zone --showTIerrors command on the specified switch to report the
conflicting configuration details.
Severity WARNING
ZONE-1060
Message <timestamp>, [ZONE-1060], <sequence-number>,, WARNING, <system-name>, Non-TI and TI
failover-enabled traffic restricted to domain <Domain ID> due to TI failover-disabled zoning.
Probable Cause Indicates that only TI failover-disabled paths remain to reach the given domain causing
non-TI and TI failover traffic disruption.
Recommended Action: Add or restore the non-TI ISLs and/or TI failover-enabled ISLs to the specified
domain.
Severity WARNING
zone --showTIerrors
Analyzes real and potential routing problems with the activated TI zoning set and prints
a report. This command must be executed in the local domain and analyzes only that
domain.
Error Types: Error and Warning:
Error type records indicate that a problem is present within the fabric given the current
set of online devices and activated TI zone configuration. In this case, if traffic between
the involved devices has already been started, frames are likely to be dropped within the
fabric.
Warning type records indicate conditions that are not currently a problem given the
current set of online devices and ports and reachable domains. Traffic may not be getting
dropped in the fabric at the moment. However, given the activated TI zone configuration,
parallel exclusive paths between a shared device and a remote domain have been detected,
which might cause an issue for devices that join the fabric later and attempt to start
communicating.
Looking at the switchshow output: the device is present; however, it has been removed
from the Name Server.
Sw1:admin> switchshow
<truncated output>
===================================================
<truncated output>
2 1 2 010200 id N8 Online FC F-Port 10:00:00:05:1e:db:69:d9
<truncated output>
Footnote 1: Switch will use a GE_PT (Get Entries by Port Type) command to verify device.
Each of the node devices typically determines the properties of the other node devices
with which it communicates. Upon connecting to the network, the node devices send a
request addressed to the name server, which is then received by the resident name
server on the entry switch. Typically, where such request forms are supported, the
request takes the form of GE_PT (get entries of a given Port Type) or GE_FT (get entries
of a given FC-4 Type). Where such forms are not supported, the request may take the
form of GID_PT (get identifiers for ports of a given Port Type) or GID_FT (get identifiers
for ports of a given FC-4 Type). Once the identifiers have been obtained, a series of
GE_ID (get entry for a given identifier) requests may be used to obtain the corresponding
entries. In either case, the effect is to cause the entry switch to request each of the
other switches to send all name server database entries that satisfy the given criteria to
the entry switch, which then forwards the entries to the requesting device. As the
number of entries is generally proportional to the number of node devices, and each
device typically generates such a request, the amount of traffic increases as the square
of the number of node devices.
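The quadratic growth can be made concrete with a little arithmetic. The sketch below is a deliberately simplified model and not a measurement: it assumes every device issues one query and that each query returns one entry per device in the fabric. The function name is hypothetical.

```python
def entries_forwarded(devices: int) -> int:
    """Total Name Server entries forwarded when every device issues one query.

    Simplifying assumption: each query returns one entry per device.
    """
    return devices * devices

for n in (10, 100, 1000):
    print(f"{n:>5} devices -> {entries_forwarded(n):>9} entries forwarded")
```

Under this model, doubling the number of devices quadruples the number of entries that cross the fabric.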
Revision 0213 4 – 96
SAN-TS 300 Device Connectivity
Revision 0213 4 – 97
SAN-TS 300 Device Connectivity
Revision 0213 4 – 98
SAN-TS 300 Device Connectivity
Revision 0213 4 – 99
SAN-TS 300 Device Connectivity
Why not retain just one of the duplicate devices instead of removing them all?
In order to retain a single device that has duplicates, a decision has to be made
regarding which device is the “right” device. It is often suggested that the most current
login or SCN represents the correct device, and that the older device does not require
verification prior to removal. The weakness of this assumption is that we cannot ensure
that all SCNs are delivered and processed in an identical order for every switch. With
remote duplicates, making the decision based solely on the arrival order of an SCN
would lead to an inconsistent Name Server database across all switches in the fabric.
Other issues:
Fabric Build Scenario A - In the case where a switch is joining a fabric, and the fabric has
existing duplicates, the login order cannot be deduced by the joining switch. The joining
switch would simply have to “guess” which device is the correct one. Similar issues exist
when considering HA failover.
Fabric Build Scenario B - Another case where order cannot be determined is where a
switch is joining a fabric and this introduces a duplicate condition. It is not possible to
determine either the login order or correctness of the “joining” device versus the existing
device. Again, FOS would simply have to “guess” which device is the correct one. Similar
issues exist when considering HA failover.
With any of the above scenarios the method for choosing the correct device would be
imperfect. If we choose to favor local devices over remote devices (or vice versa) it would
lead to an inconsistent Name Server database across the fabric. If we use a numeric
selection process and, for example, have all switches favor the device with the highest
(numerically) PID we would ensure a consistent Name Server database, but could offer
no guarantee that the selection would be the best choice.
The only case where order can be used to determine which device should be retained is
the one where all duplicates occur on the same (local) switch. The local switch has
control over all logins and can determine the login order for every device. As such, our
approach does retain a single device, rather than removing all devices, in this particular
case. Beyond this case, our approach removes all conflicting devices from the Name
Server database uniformly across the fabric. The Name Server is left in a predictable
state and the user is notified of the condition. The decision regarding which device
should exist in the fabric is left to the fabric administrator, an approach that is aligned
with customer feedback on this issue.
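The decision rule described above can be summarized as a short sketch. It is not Fabric OS code: the Login record and function name are hypothetical, and keeping the most recent login in the all-local case is an assumption, because the text only states that a single device is retained.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Login:
    wwn: str     # port WWN that appears more than once
    domain: int  # domain ID of the switch where the login occurred
    order: int   # login sequence number, meaningful only within one switch

def resolve_duplicates(logins: list[Login], local_domain: int) -> list[Login]:
    """Return the Name Server entries to keep for one duplicated WWN."""
    if all(entry.domain == local_domain for entry in logins):
        # All duplicates are local, so login order is known.
        # Assumption: the most recent login is the one retained.
        return [max(logins, key=lambda entry: entry.order)]
    # At least one duplicate is remote: order cannot be trusted, remove all.
    return []

wwn = "10:00:00:05:1e:db:69:d9"
local_only = [Login(wwn, 1, 5), Login(wwn, 1, 9)]
mixed = [Login(wwn, 1, 5), Login(wwn, 2, 3)]

print(resolve_duplicates(local_only, local_domain=1))
print(resolve_duplicates(mixed, local_domain=1))
```

Because every switch applies the same rule to remote duplicates, the Name Server database stays consistent across the fabric.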
Footnote 1: The portlog captures information for the entire physical switch even if
virtual switches are created.
Footnote 2: The portlog captures portions of Fibre Channel frames used by devices
and switches to communicate. It also captures other events that do not use FC
frames. The caution here is not to assume every line displayed as part of a portlog
dump is a FC frame. This module will point out the differences.
Footnote 1: Portlogs are circular files. They display the activity for all switch ports in
a running log using a First in First out (FIFO) format.
Footnote 2: Though increasing the portlog size is safe on a switch functioning
normally, it could cause problems on a switch that is low on memory. For this reason
it is recommended to increase the portlog size only when directed to do so by Brocade
Support/Engineering.
The supported portlog size for switches depends on the firmware version running.
Use the portlogconfigshow command to get the current value and
portlogresize command with no value set to verify the supported range.
Footnote 1: Power on Switch, Power On Self Test (POST), Light Signal, and
Character/Word sync will not be covered.
Register & Query/ ACC – Initiator will register & probe for SCSI devices
FLOGI ACC
Revision 0213 5–9
SAN-TS 300 Portlog Analysis
Frame delimiters:
SOF identifies the start of a frame and conditions the receiver to begin frame reception.
EOF identifies the end of a frame and deconditions the frame reception logic.
Primitive signals:
IDLE or ARB are transmitted on a link whenever a port is operational and has no other specific
information to send. The transmitter side of a port is always sending words to maintain
synchronization with the receiver at the other end of the link.
Receiver Ready (R_RDY) indicates that the receiver has emptied a receive buffer and is ready to
receive another frame.
Virtual Channel Ready (VC_RDY) is used for buffer-to-buffer flow control on ISLs that support
Virtual Channels.
Primitive Sequences:
Not Operational (NOS) is transmitted by a port to indicate that the transmitting port has detected a
link failure or is in an offline condition, waiting for the OLS sequence to be received.
Offline (OLS) is transmitted by a port to indicate the port is beginning the link initialization
protocol, has received and recognized the NOS sequence, or is entering the Offline state.
Link Reset (LR) is used to initiate a link reset.
Link Reset Response (LRR) is transmitted by a port to indicate that it has recognized an LR
sequence and has performed the appropriate link reset actions.
The FC standards do define the error recovery process to use when sync is lost
which can be seen in the portlog and is covered in the notes under Not Operational
(NOS) Link Initialization Protocol later in the module.
Footnote 1: Each device starts speed negotiation at its highest supported speed and
works down until a common supported speed is found.
There are four possible commands used by the speed negotiation process:
• WS (Wait for Signal) - wait until a signal is detected.
• NM (Negotiate Master) - Tx starts at maximum speed and progressively and
cyclically reduces speed. It dwells at each speed for t_txcycl to allow the device to
follow. Meanwhile, Rx tunes to incoming speeds, changing Rx speed from maximum
downwards at t_rxcycl periods.
• NF (Negotiate Follow) - tests the stability of the Rx speed.
• NC (Negotiate Complete) – indicates a negotiated speed has been reached
successfully.
Argument 1 – WS possible values:
• 00 - Start speed negotiation
• 01 - Wait for signal
• f0 - Loss of Rx_Sync
• ee - Signal (light) received
• e0 - Lost light
• ff - Sync gained
• f0 - Sync lost
Argument 1 – NC possible values:
• 01 = 1 Gbps
• 02 = 2 Gbps
• 04 = 4 Gbps
• 08 = 8 Gbps
• 10 = 16 Gbps
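The NC argument is simply the negotiated speed in Gbps written in hexadecimal, which is why 16 Gbps appears as 10. A minimal sketch of decoding it follows; the function name is hypothetical and the accepted values are limited to those listed above.

```python
VALID_SPEEDS_GBPS = (1, 2, 4, 8, 16)

def negotiated_speed_gbps(nc_arg1: str) -> int:
    """Convert the hex argument of an NC portlog entry to a speed in Gbps."""
    speed = int(nc_arg1, 16)
    if speed not in VALID_SPEEDS_GBPS:
        raise ValueError(f"unexpected NC argument: {nc_arg1!r}")
    return speed

for arg in ("04", "08", "10"):
    print(f"NC {arg} -> {negotiated_speed_gbps(arg)} Gbps")
```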
• NOS is sent when a device that was previously in the active state goes offline (OL3)
or if a Link Failure is detected (LF2).
• OLS is sent when a device comes online for the first time, Link Initialize, or when
receiving NOS.
• LR is sent when a port in the active state performs a Link Reset (for example,
buffer credit recovery) or when receiving OLS. A port in the Active state that
issues a successful Link Reset doesn't need to log in to the fabric (FLOGI) if it had
previously done so.
Extended Link Services will be the most common type of frame to become familiar
with and decode in the portlog. FC-4 Data frames use the Common Transport
protocol and are used for Name Server registrations and queries.
Footnote 1:
The three most-common Well-Known Addresses are:
FFFFFE is the address for Fabric F_Port Service.
FFFFFD is the address for Fabric Controller Service.
FFFFFC is the address for Name Server Service.
Less common Well-Known Addresses are:
FFFFFF is the address for Broadcast.
FFFFFA is the address for Management Server.
FFFFFB is the address for Time Server.
FFFFF8 is the address for Alias Server.
FFFCxx is the address for Domain Controller (embedded port / switch ID). The xx will
be the Domain ID of the switch.
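When reading a portlog it helps to translate these addresses on sight. The lookup below is a small sketch built only from the addresses listed in this footnote; the function name and return strings are hypothetical.

```python
WELL_KNOWN = {
    0xFFFFFE: "Fabric F_Port Service",
    0xFFFFFD: "Fabric Controller Service",
    0xFFFFFC: "Name Server Service",
    0xFFFFFF: "Broadcast",
    0xFFFFFA: "Management Server",
    0xFFFFFB: "Time Server",
    0xFFFFF8: "Alias Server",
}

def describe_address(addr: int) -> str:
    """Name a well-known address, or report that it is not one."""
    if addr in WELL_KNOWN:
        return WELL_KNOWN[addr]
    if (addr >> 8) == 0xFFFC:
        # FFFCxx: the low byte is the Domain ID of the switch.
        return f"Domain Controller (domain {addr & 0xFF})"
    return "not a well-known address"

for addr in (0xFFFFFE, 0xFFFFFC, 0xFFFC02, 0x010200):
    print(f"{addr:06X}: {describe_address(addr)}")
```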
Some ELS commands, such as RSCN, include Page Length and Payload Length,
but not all do. This slide illustrates an ELS Fabric Login (FLOGI), which doesn’t use
Page Length or Payload Length.
ELS command code is in word 6 of the FC frame (first word of the frame payload).
Part of the FLOGI request includes common service parameters and class of
service parameters for each class of service 1, 2 and 3. These parameters must
match what the switch supports in order to successfully login to the Fabric.
Common Service Parameters: These parameters apply to all classes of service and
include the FC_PH version supported, BB Credit, max receive frame size and
timeout values. This field represents the basic capabilities of the N_Port.
Part of the ELS Accept used to respond to the FLOGI includes common service
parameters and class of service parameters for each class of service 1, 2 and 3.
Common Service Parameters: These parameters apply to all classes of service and
include the FC_PH version supported, BB Credit, max receive frame size and
timeout values. This field represents the basic capabilities of the F_Port.
RSL8-ST01-B51:admin> nsshow -r
Type Pid COS PortName NodeName SCR
N 020400; 3;20:02:00:11:0d:e7:50:00;20:02:00:11:0d:e7:50:00; 0x01000003
FC4s: FCP
PortSymb: [36] "Brocade University Virtual FC Target"
Fabric Port Name: 20:04:00:05:1e:0c:ad:e5
Permanent Port Name: 20:02:00:11:0d:e7:50:00
Port Index: 4
Share Area: No
Device Shared in Other AD: No
Redirect: No
Partial: No
The Local Name Server has 1 entry }
For a frame, the portlog only captures words 0, 1, 4 and 6. For an ELS frame we
learned the first word of the payload (word 6) is the command code. But for an FC-4
Data frame the command code is in word 8. Another entry in the portlog will identify
the command code from this frame.
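The four captured words can be unpacked using the standard Fibre Channel frame header layout: word 0 holds R_CTL and the D_ID, word 1 holds the S_ID in its low three bytes, word 4 holds OX_ID and RX_ID, and for an ELS frame the top byte of word 6 is the command code. The sketch below is an illustration only; the function name is hypothetical, the code table is a small subset, and the sample words are invented to resemble a FLOGI rather than taken from a real trace.

```python
# Subset of ELS command codes (top byte of word 6).
ELS_CODES = {
    0x01: "LS_RJT", 0x02: "LS_ACC", 0x03: "PLOGI", 0x04: "FLOGI",
    0x05: "LOGO", 0x20: "PRLI", 0x61: "RSCN", 0x62: "SCR",
}

def decode_els_words(w0: int, w1: int, w4: int, w6: int) -> dict:
    """Unpack portlog words 0, 1, 4 and 6 of an ELS frame."""
    return {
        "r_ctl": w0 >> 24,        # routing control
        "d_id": w0 & 0xFFFFFF,    # destination address
        "s_id": w1 & 0xFFFFFF,    # source address
        "ox_id": w4 >> 16,        # originator exchange ID
        "rx_id": w4 & 0xFFFF,     # responder exchange ID
        "els": ELS_CODES.get(w6 >> 24, "unknown"),
    }

# Hypothetical FLOGI: sent to FFFFFE from an address not yet assigned.
frame = decode_els_words(0x22FFFFFE, 0x00000000, 0x0001FFFF, 0x04000000)
print(f"{frame['els']} from {frame['s_id']:06X} to {frame['d_id']:06X}")
```

This decoding does not apply to FC-4 Data frames, whose command code sits in word 8 and is reported by a separate portlog entry.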
Footnote 1: The ctin will display zero, one or two words of additional information,
depending on the request CT command code. The additional information is the
number of words displayed after the command code.
It uses the bit map as follows:
Hex 0000 = Binary 0000000000000000 = 0 words
Hex 0001 = Binary 0000000000000001 = 1 word
Hex 0003 = Binary 0000000000000011 = 2 words
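In the three values above, the number of additional words equals the number of bits set in the bit map. A minimal sketch of that reading follows; treating the field as a simple bit count for any other value is an assumption, and the function name is hypothetical.

```python
def extra_words(bitmap_hex: str) -> int:
    """Count the additional words indicated by the ctin/ctout bit map."""
    bitmap = int(bitmap_hex, 16)
    # Assumption: each set bit stands for one displayed word.
    return bin(bitmap).count("1")

for value in ("0000", "0001", "0003"):
    print(f"Hex {value} -> {extra_words(value)} word(s)")
```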
Footnote 1: The ctout will display zero, one or two words of additional
information, depending on the requested CT command code. The additional
information is the number of words displayed after the command code.
It uses the bit map as follows:
Hex 0000 = Binary 0000000000000000 = 0 words
Hex 0001 = Binary 0000000000000001 = 1 word
Hex 0003 = Binary 0000000000000011 = 2 words
If you see 8001 in the Reply Command code this means the registration was rejected.
A Brocade switch is enabled by default to probe devices for type information. This
probing can be disabled using the configure command on a disabled switch, then
changing the Fabric parameter FCP probe disable to a 1 (default is 0 which means
enabled).
A storage device will accept a PLOGI from the switch. The switch will then do a
Process Login (PRLI). The reason for this is to query the storage device about
its FCP information (type of disk, for example Seagate, driver version, etc.). The Name
Server will store this information in its database for other devices (hosts) to
query and build device tables.
The switch is done with its probing and logs out of the device. The device accepts
the log out. Note: There is NO fabric logout, just N_Port log outs.
Footnote 1: The Condor ASIC (4 Gbps) and Condor2 ASIC (8 Gbps) support
FL_Ports. The Condor3 ASIC (16 Gbps) does not.
In the example above, port 0 starts loop initialization (LIP 8002), the LIP times out
(TMO), the port retries (LIP 801e), times out again, and retries a third time, after
which the port drops back down to speed negotiation (not shown in truncated output). This
was most likely caused by the NL_Port not being ready to perform loop init and is normal
behavior until both ends of the link are ready to start loop init. After speed negotiation
(SN) is completed, port 0 again starts loop init (LIP 8002), followed by the L_Port
acquiring an AL_PA (LIP F7,F7) and the switch port becoming loop master (LIM). Not shown
is the ELS loop init process covered shortly.
Arbitrated Loop Physical Address (AL_PA): A unique one-byte valid value assigned
during Loop Initialization to each NL_Port or FL_Port on a Loop.
Arbitrated Loop Destination Address (AL_PD): The Arbitrated Loop Physical Address of
the L_Port on the Loop that should receive the Primitive Signal.
Arbitrated Loop Source Address (AL_PS): The Arbitrated Loop Physical Address of the
L_Port on the Loop that sent the Primitive Signal.
L_Port: Either an FL_Port or an NL_Port as defined in ANSI X3.230, FC-PH, 3.1.
Without the qualifier "Public" or "Private," an NL_Port is assumed to be a Public
NL_Port.
Public Loop: A Loop that includes a participating FL_Port and may contain both Public
and Private NL_Ports.
Public NL_Port: An NL_Port that does a Fabric Login (FLOGI).
TS300 Switch to Switch Connectivity
Revision 0512 1
You can typically see the reason for the segmentation in three places: switchshow,
fabstatsshow, errshow (errdump). In the slides that follow, we will review each of
these conditions and associated outputs.
In the first error message on this slide, the other switch rejected this switch's Exchange
Link Parameters (ELP) request because the fabric.ops parameters do not match.
Parameters exchanged include: Port_Name and Switch_Name, Class F service
parameters, R_A_TOV and E_D_TOV (part of the fabric.ops parameters), Virtual Channel
(VC) information, flow control parameters, and a subset of Class-n parameters. If the
parameters are incompatible, the E_Port link will segment. The Fabric OS Error Message
Guides list the same error message. The Probable Cause indicates that the specified
switch port is isolated because of a segmentation due to mismatched configuration
parameters. The Recommended Action is, based on the segmentation reason displayed
within the message, to look for a possible mismatch of relevant configuration parameters
in the switches at both ends of the link. Run the configure command to modify the
appropriate switch parameters on both the local and remote switch.
When switches connect they go through the following initialization process:
• Negotiate link speed, if supported
• Determine the switch port operating mode
• If an F_Port or FL_Port, wait for node to initiate login
• If an E_Port, exchange link parameters (ELP) and switch capabilities with neighbor
• Select a principal switch during an Exchange Fabric Parameters (EFP) process
• Request/assign Domain IDs
The fabstatsshow command is brief and concise but be cautious - the counters/"<"s
are not cleared when the segmentation is fixed.
The information displayed is as follows:
• Number of times a switch domain ID has been forcibly changed
• Number of E_Port offline transitions
• Number of fabric reconfigurations
• Number of fabric segmentations due to:
• Loopback - Number of times this switch segmented port due to port being placed
into mcastloopback mode used to prevent IP/FC broadcast problems.
• Incompatibility - Fabric.ops parameters are different.
• Overlap - there are duplicate domains in attaching fabrics and fabrics are being hot
plugged
• Zoning - cfg mismatch (only one enabled cfg allowed), type mismatch (fabric A has
a zone called “eng27”, fabric B has an alias called “eng27”), or content mismatch
(fabric A has an “eng27” zone with “2,4; 2,6; 4,6”, fabric B has an “eng27” zone
with “2,4; 4,6”). When a switch is taken out of a fabric for maintenance, persistently
disable the ports (Fabric OS v3.1/4.1 and higher), clean out the zoning (cfgdisable,
cfgclear, cfgsave), re-attach the switch to the fabric, establish the E_Port
connection, allow the existing fabric's zoning to propagate to the new switch, and
then persistently enable or enable the end device ports.
• Licensing - Some licenses are required to activate specific features. In some cases
a license may be required on every switch in the fabric.
• Disabled E_Port - a segmentation occurs because a port's E_Port capability was
disabled using the portcfgeport command.
• Incompatible management server Platform DBs - Management server platform
DBs, when enabled, need to be compatible. If you have an incompatible platform DB
segmentation, use the msplcleardb command on the merging fabric to clear the
database, then incorporate the merging fabric members into the existing fabric’s
access list. Management server platform databases are enabled (msplmgmtactivate)
and configured (msconfigure) when applications that use the management server
are desired in a fabric. The management server enables a management
application to access and configure switches in the fabric. It is located at the Fibre
Channel address, FFFFFA. If the access control list (ACL) is empty (default value),
the management server is available to all systems connected in-band to the fabric.
To restrict access, specify the World Wide Name (WWN) for one or more
management applications using the msconfigure command; access is then
restricted to those WWNs. A maximum of 16 WWNs is supported in the ACL. The
ACL is implemented on a per-switch basis and should be configured on the switch
to which the management application station is directly connected.
• Security incompatibility - A security incompatibility could occur for the following
reasons: Unknown incompatibility, Security parameters incompatibility, Exchange
FCS failed, Data incompatibility, MS Platform config incompatibility
• Security violations - A security violation could occur if the port used to connect
switches has a DCC policy that does not allow the switch into the fabric, or if the
fabric has an SCC policy that does not allow the switch into the fabric.
• ECP Error - Exchange Credit Parameters error.
Note: The Fabric, Zoning, and Extended Fabric licenses are required on each switch. If a
license is not present on a switch it will segment when it tries to join the fabric.
Configuration mismatches: In the example, the cfg4 zoning configuration has different
definitions in the two fabrics. When you attempt to merge the two fabrics, there will be a
fabric segmentation.
Type mismatches: In the example, the Device1 object is defined as a zone alias in Fabric
A, and as a zone in Fabric B. Because this object is a different type of object in the two
fabrics, an attempt to merge the fabrics will result in a fabric segmentation.
Content mismatches: In the example, the Green_Zone object is a zone in both fabrics;
however, since zoning definitions depend on both the zone members and the order in
which the members are defined, the definitions are thus different between the two
fabrics. As a result, an attempted fabric merge will result in a fabric segmentation.
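The type and content checks described on the last two slides can be sketched as a comparison of two zone databases. This is an illustration, not the Fabric OS merge algorithm: the data layout (name mapped to a type and an ordered tuple of members) and the function name are hypothetical, and members are held in tuples so that a different member order counts as a content mismatch, as stated above.

```python
ZoneDb = dict[str, tuple[str, tuple[str, ...]]]

def merge_conflicts(db_a: ZoneDb, db_b: ZoneDb) -> list[tuple[str, str]]:
    """List objects defined in both fabrics that would block a merge."""
    conflicts = []
    for name in sorted(db_a.keys() & db_b.keys()):
        type_a, members_a = db_a[name]
        type_b, members_b = db_b[name]
        if type_a != type_b:
            conflicts.append((name, "type mismatch"))
        elif members_a != members_b:
            # Tuples compare element by element, so order matters.
            conflicts.append((name, "content mismatch"))
    return conflicts

fabric_a = {
    "eng27": ("zone", ("2,4", "2,6", "4,6")),
    "Device1": ("alias", ("1,1",)),
}
fabric_b = {
    "eng27": ("zone", ("2,4", "4,6")),
    "Device1": ("zone", ("1,1",)),
}

for name, reason in merge_conflicts(fabric_a, fabric_b):
    print(f"{name}: {reason}")
```

An empty result means the sketch found no object that would cause a segmentation for these two reasons; it does not check for different enabled configurations.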
Footnote 1: The defzone command has two settings: All Access and No Access. If No
Access is enabled, when you do a cfgdisable, no devices will be able to access any
other devices in that SAN.
Always begin by running the switchshow command. It identifies the ISL connection
that reports the fabric segmentation, and thus the remote switch or fabric that is
conflicting with the local switch or fabric. You may want to capture the command
output in a text file to help compare the two fabrics.
PR205 SAN Feature Updates
Network Advisor allows you to easily compare two zone databases using a unique
interface. To access the Compare/Merge Zone DBs utility click Zone DB Operation →
Compare. The left side of the screen is the reference zone DB and the right side is the
editable zone DB. Changes can be merged into the editable zone DB from the reference
zone DB.
Use the Tree Level pull-down to view all entries, only zones, or only zone
configurations. Use the Add, Merge, and Merge All buttons to make changes.
The Sync Scroll checkbox can be used to lock the two zone views together so that when
one side scrolls the other side scrolls at the same time. Unchecking the Sync Scroll box
will allow you to move through the two views independently.
The configshow method discussed above does not require the switch to be disabled,
so the review is non-disruptive. All of the fabric.ops parameters must be identical on
both switches/fabrics. If you have a segmentation due to incompatibility, the output of
switchshow may, on newer versions of firmware, tell you exactly which parameter is
mismatched.
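Comparing the fabric.ops parameters of two switches by eye is error-prone, so a small script can help. The sketch below assumes you have saved the configshow output of each switch as text and that each parameter appears on its own line as key:value; the sample values and the function names are invented for illustration.

```python
def fabric_ops(configshow_text):
    """Collect fabric.ops.* parameters from captured configshow text."""
    params = {}
    for line in configshow_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key.startswith("fabric.ops."):
            params[key] = value.strip()
    return params


def mismatched(local_text, remote_text):
    """Return {parameter: (local_value, remote_value)} where the values differ."""
    local, remote = fabric_ops(local_text), fabric_ops(remote_text)
    return {
        key: (local.get(key), remote.get(key))
        for key in sorted(set(local) | set(remote))
        if local.get(key) != remote.get(key)
    }


# Invented sample captures; a real configshow capture contains many more lines.
switch1 = "fabric.ops.BBCredit:16\nfabric.ops.E_D_TOV:2000\nfabric.ops.R_A_TOV:10000\n"
switch2 = "fabric.ops.BBCredit:16\nfabric.ops.E_D_TOV:5000\nfabric.ops.R_A_TOV:10000\n"
print(mismatched(switch1, switch2))
```

A parameter that is present on only one switch is reported with None for the other side.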
You could also change these values by uploading the switch configuration file
(configupload), editing the file manually, then downloading the edited file
(switchdisable; configdownload).
You will see the reason for the segmentation in the error log and in the output of
two commands: fabstatsshow and switchshow.
In a two-switch fabric, a Brocade 5100 was configured with the following long distance
parameters: portcfglongdistance 8 LS 1 40. The attached Brocade 300E was
configured with the following parameters: portcfglongdistance 8 LD 1 40.
The Brocade 5100 displayed the following truncated outputs (it was the first to send an ELP):
switchshow: 8 8 id N4 Online LS E-Port segmented,(incompatible)
Switch1:admin> errshow -r
Switch1:admin> portcfgshow
Ports of Slot 0    0  1  2  3    4  5  6  7    8  9 10 11   12 13 14 15
-----------------+--+--+--+--+----+--+--+--+----+--+--+--+----+--+--+--
Speed             AN AN AN AN   AN AN AN AN   AN AN AN AN   AN AN AN AN
Fill Word          0  0  0  0    0  0  0  0    0  0  0  0    0  0  0  0
AL_PA Offset 13   .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
Trunk Port        ON ON ON ON   ON ON ON ON   ON ON ON ON   ON ON ON ON
Long Distance     .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
VC Link Init      .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
Locked L_Port     .. .. .. ..   .. .. ON ..   .. .. .. ..   .. .. .. ..
Locked G_Port     .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
Disabled E_Port   .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
ISL R_RDY Mode    .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
RSCN Suppressed   .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
Persistent Disable.. ON ON ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
NPIV capability   ON ON ON ON   ON ON ON ON   ON ON ON ON   ON ON ON ON
QOS E_Port        ON ON ON ON   ON ON ON ON   ON ON ON ON   ON ON ON ON
EX Port           .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
Mirror Port       .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
Rate Limit        .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
Credit Recovery   ON ON ON ON   ON ON ON ON   ON ON ON ON   ON ON ON ON
Fport Buffers     .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
In a three-switch fabric, here is the fabricshow command output from one of the switches. The ">"
indicates which switch is the principal switch in the fabric.
sw5:admin> fabricshow
Switch ID Worldwide Name Enet IP Addr FC IP Addr Name
--------------------------------------------------------------------------
34: fffc22 10:00:00:60:69:00:06:56 192.168.64.59 192.168.65.59 >"sw5"
38: fffc26 10:00:00:60:69:00:02:0b 192.168.64.180 192.168.65.180 "sw180"
Because only two of the three switches are shown, there could be a domain ID conflict.
Note: It is recommended that you set a unique domain ID for every switch in the fabric. On newer
switches you may not get a fabric segmentation error if two switches have the same domain ID; instead,
one of the switches will take the next available domain ID in the fabric. However, for planning and
documentation accuracy, you may want to set the domain IDs.
Switch1:admin> switchshow
switchName: Switch1
Area Port Media Speed State
==============================
0 0 id N4 Online F-Port 10:00:00:00:c9:53:c6:c5
1 1 id N2 Online E-Port segmented, (domain overlap) Trunk
master)
Insistent Domain ID mode – Required to be enabled for FICON environments, recommended for HP-UX
and AIX environments. Consider enabling this as a best practice when defining <domain, port>, <domain,
area> or <domain, port index> zoning. This mode enables a flag for the domain ID, so that the current
domain setting for the switch is insistent: that is, remains the same over switch reboots, power cycles, CP
failovers, firmware downloads, and fabric reconfigurations. If a switch does not get the selected insistent
domain ID during a fabric reconfiguration, it segments itself out of the fabric.
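The check described above (fewer switches listed than the fabric should contain) can be automated against a saved fabricshow capture. This is a minimal sketch: the regular expression assumes the column layout shown in the sample output, and EXPECTED_SWITCHES is a value you supply from your own fabric documentation.

```python
import re

# One row of fabricshow output: domain, switch ID, WWN, two IP addresses, name.
ROW = re.compile(
    r'^\s*(\d+):\s+([0-9a-fA-F]{6})\s+((?:[0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2})'
    r'\s+\S+\s+\S+\s+(>?)"([^"]*)"'
)


def parse_fabricshow(text):
    """Return one dict per switch row found in captured fabricshow text."""
    switches = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            domain, _switch_id, wwn, marker, name = match.groups()
            switches.append({"domain": int(domain), "wwn": wwn,
                             "name": name, "principal": marker == ">"})
    return switches


capture = '''
 34: fffc22 10:00:00:60:69:00:06:56 192.168.64.59 192.168.65.59 >"sw5"
 38: fffc26 10:00:00:60:69:00:02:0b 192.168.64.180 192.168.65.180 "sw180"
'''
EXPECTED_SWITCHES = 3  # known from the fabric design, not from the output

seen = parse_fabricshow(capture)
if len(seen) < EXPECTED_SWITCHES:
    missing = EXPECTED_SWITCHES - len(seen)
    print(f"{missing} switch(es) missing; check switchshow for a segmented E_Port")
```

A missing switch only suggests a problem; the switchshow output on the affected switch still has to confirm the segmentation reason.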
See the Fabric OS Command Reference Guide for additional information about
commands that will help resolve the segmentation issue:
• fddcfg
• secpolicyabort
• secpolicyadd
• secpolicycreate
• secpolicydelete
• secpolicydump
• secpolicyremove
• secpolicysave
• secpolicyshow
There are RASLog error messages (errshow) that indicate ACL conflicts.
Footnote 1: The B5100 shows no light because the B300 has detected the security
violation and disabled its transmitter.
Now each switch's SCC policy contains the other switch's WWN; the Brocade 300
additionally contains its own WWN:
Switch1:admin> secpolicyshow
____________________________________________________
SCC_POLICY
--------------------------------------------------
10:00:00:05:1e:0a:85:05 2 Switch1
10:00:00:05:1e:0c:dc:72 98 Switch2
____________________________________________________
SCC_POLICY
--------------------------------------------------
10:00:00:05:1e:0a:85:05 2 Switch1
10:00:00:05:1e:0c:dc:72 98 Switch2
<Truncated Output>
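When an SCC policy blocks a switch, the quickest check is whether the peer WWN appears in the policy. The sketch below scans a saved secpolicyshow capture for that WWN. The parsing rules (a heading line containing "_POLICY", followed by WWN, domain ID, and switch name columns) are assumptions based on the sample output above, and the function name is hypothetical.

```python
import re

# A policy member row: WWN, domain ID, switch name.
WWN_ROW = re.compile(r"^((?:[0-9a-f]{2}:){7}[0-9a-f]{2})\s+(\d+)\s+(\S+)", re.IGNORECASE)


def scc_members(secpolicyshow_text):
    """Return the set of WWNs listed under any SCC_POLICY heading."""
    members, in_scc = set(), False
    for line in secpolicyshow_text.splitlines():
        stripped = line.strip()
        if "_POLICY" in stripped:
            # Track whether the rows that follow belong to the SCC policy.
            in_scc = stripped == "SCC_POLICY"
            continue
        match = WWN_ROW.match(stripped)
        if in_scc and match:
            members.add(match.group(1).lower())
    return members


capture = """
SCC_POLICY
--------------------------------------------------
10:00:00:05:1e:0a:85:05 2 Switch1
10:00:00:05:1e:0c:dc:72 98 Switch2
"""
peer_wwn = "10:00:00:05:1e:0c:dc:72"
print(peer_wwn in scc_members(capture))
```

Run the same check on both switches, since each side's policy must list the other switch.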
The Fabric Data Distribution (FDD) ACL policy is viewed and configured with the fddcfg
command. Use the fddcfg --showall command to determine fabric-wide
consistency policy. Use the fddcfg --fabwideset and the secpolicycreate
and secpolicyadd commands to resolve the conflict.
Switch1:admin> fddcfg --showall
Local Switch Configuration for all Databases:-
DATABASE - Accept/Reject
---------------------------------
SCC - accept
DCC - accept
PWD - accept
FCS - accept
AUTH - accept
IPFILTER - accept
Fabric Wide Consistency Policy:- ""
When the fabric-wide consistency policy is set to Tolerant and the ACL SCC or DCC
databases are different, an error log entry is made and ACL distribution must be done
manually. This scenario still requires the SCC policies to contain the connecting switch
WWNs, domain IDs, and/or switch names.
Use the secpolicyshow command in connecting fabrics to check the contents of SCC
Switch1:admin> trunkdebug
trunkdebug: area_number1 area_number2
If the port speed is locked to 1 Gbit/sec, the trunk will not come online:
Switch1:admin> trunkdebug 8 9
port 8 and port 9 speed is not 2G, 4G or 8G
Footnote 1:
r10-st16-b51:admin> portbuffershow
<truncated output>
38 E LE 46 40 40 10km
39 E LE 46 40 40 10km 1616
CFA 200 Basic Troubleshooting
<truncated output>
Index Port Address Media Speed State Proto
==============================================
<truncated output>
Status: PASSED
Footnote 1: Not supported on 8 Gbps SFPs. If a port with a 16 Gbps SFP is hard set to 8
Gbps the test will run at 8 Gbps speed.
D_Ports have no High Availability (HA) support; after an HA failover, the D_Port test may
be restarted manually. Use bcu on the HBA to configure a D_Port.
D_Port Information:
===================
Port: 1
Remote WWNN: 10:00:00:05:33:13:2f:b4
Remote port: 2
Mode: Automatic
Start time: Fri Mar 11 01:41:55 2011
End time: Fri Mar 11 01:43:21 2011
Status: PASS
========================================================================
Test Start time Result EST(secs) Comments
========================================================================
Electrical loopback 01:42:11 PASS -- --------
Optical loopback 01:42:14 RESPONDER -- See remote
Link traffic test 01:43:10 RESPONDER -- See remote
========================================================================
Roundtrip link latency: 1108 nano-seconds
Estimated cable distance: 20 meters
SwitchA:admin> portcfgshow
Ports of Slot 0 24 25 26
----------------------+---+---+---
Octet Speed Combo 1 1 1
Speed AN AN AN
AL_PA Offset 13 .. .. ..
Trunk Port ON ON ON
Long Distance .. .. ..
.....
.....
Mirror Port .. .. ..
Rate Limit .. .. ..
Credit Recovery ON ON ON
Fport Buffers .. .. ..
Port Auto Disable .. .. ..
CSCTL mode .. .. ..
D_Port mode ON OFF ON
<truncated output>
If the port being tested is an ISL that is currently functional, Network Advisor performs
the following:
• Disable the ports
• Put the ports into D_Port mode
• Enable the ports
• Run the tests
• Disable the ports
• De-configure the ports as D_Ports
• Enable the ports, which should restore the ISL connection
Network Advisor does not display the cable distance or round trip latency.
The master log is located at the bottom of the main Network Advisor screen.
Footnote 1: These features only allow one FCIP tunnel per GbE port
Footnote 1: The default route is the default gateway used by the FCIP tunnel to get
outside of the local IP subnet.
Footnote 1: Use the portcmd --ipperf command to determine the link bandwidth
available for the tunnel.
On the local side:
• portcmd --ipperf <slot/GBPort> -s <local IP> -d <remote IP> -R
On the remote side:
• portcmd --ipperf <slot/GBPort> -s <local IP> -d <remote IP> -S
The command must be run from both endpoints simultaneously.
Allow ipperf to run for at least three minutes.
The last 30 seconds of data from the -S (remote) side indicate good recommended
commit rates.
Repeat the test in the opposite direction to get throughput.
In the ipperf command used above, the -s option represents the source IP and the -d
option represents the destination IP of the remote switch.
The -S option specifies source mode for the ipperf connection. The source end-
point generates a traffic stream and reports the end-to-end bandwidth from this end-
Footnote 1: For a complete list of interop mode requirements, refer to the Fabric OS 6.2
Admin Guide and Release Note.
Footnote 1: If only i10K M-EOS switches are used, domain IDs 1-239 can be used.
TS300 Performance
Note: When dealing with FCIP congestion issues, the concept is the same: too much data
is being sent over the link. On these IP links, look for retransmissions and slow starts.
Reduce the amount of data being put on the IP pipe until the retransmissions and slow
starts go away, then adjust and monitor the IP port speeds to find maximum performance.
Other things to check are window size, MTU size, and compression.
Footnote 1: ISL VC_RDY flow control mode is not used when M-EOS interoperability is enabled on a Fabric OS switch.
M-EOS switches do not support VC_RDY flow control mode. VC_RDY flow control is also not used for some long
distance links depending on long distance settings.
Footnote 2: The number of VCs used and the amount of credit assigned change depending on long distance
and QoS settings.
The default ISL setting is 8 VCs sharing 26 buffer credits: VC 0, used for class F traffic, receives 4 credits;
VCs 2-5, used for data frames, receive 5 credits each; and VCs 6-7, used for broadcast/multicast, receive
1 credit each. For 8 Gbit/sec switches using QoS, 16 VCs are used: VC 0, used for class F traffic, receives
3 credits; VCs 2-5, medium priority data traffic, receive 2 each; VCs 8-14, low/high priority traffic,
receive 2 each; and VCs 6-7, broadcast/multicast traffic, receive 1 each.
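The default allocation in Footnote 2 can be checked with simple arithmetic. The sketch below only encodes the per-VC figures stated in the footnote; VCs the footnote does not mention are left out. The QoS total is computed from those figures and is not a number stated in the source.

```python
# Credits per VC for the default (non-QoS) ISL setting, as stated in Footnote 2.
default_isl = {0: 4, 2: 5, 3: 5, 4: 5, 5: 5, 6: 1, 7: 1}

# Credits per VC for the QoS case on 8 Gbit/sec switches, as stated in Footnote 2.
qos_isl = {0: 3, 6: 1, 7: 1}
qos_isl.update({vc: 2 for vc in range(2, 6)})    # medium priority data
qos_isl.update({vc: 2 for vc in range(8, 15)})   # low/high priority data

default_total = sum(default_isl.values())
qos_total = sum(qos_isl.values())   # derived here, not stated in the source
print(default_total, qos_total)
```

The default total matches the 26 shared buffer credits given in the footnote.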
Footnote 1: Each switch port always uses a set PID based on the switch port number
unless portswap has been used.
Footnote 2: When QoS is enabled the VC selected is based on the traffic priority. See
notes on prior slide.
Footnote 3: The graphic below illustrates traffic distributed across all VCs to remove the
VC congestion.
Footnote 1: In this example the hosts are pushing their frames to an ISL at a greater
bandwidth than the ISL can handle. Thus the ISL VC credit pool is continuously backed
up from the central credit pool that is feeding it. Eventually the frames are transmitted
through the ISL, but the link is congested nevertheless. Because the link is oversubscribed
and credits are not available to receive frames from behind it, some of the hosts behind it
hold frames beyond 500 ms and the frames are dropped. This is known as “high I/O credit
starvation”.
Fabric Watch: Use to monitor I/O (per port)
Speed    Credits/km   Credits/50km   Credits/100km
1 Gbps   0.5          25             50
2 Gbps   1            50             100
Note: The above table is an approximation and assumes 2k payload size. For smaller
payload size, increase the number accordingly.
Footnote 3: See appendix for more information on this feature
Footnote 1: Buffer-to-Buffer Flow Control is flow control between adjacent ports in the
I/O path. A separate, independent pool of credits is used to manage Buffer-to-Buffer
Flow Control. It works by a sending port using its available credit supply and waiting
to have the credits replenished by the port on the opposite end of the link. These
Buffer-to-Buffer credits (BB credits) are used by Class 2 and Class 3 service and rely on
the Fibre Channel Receiver-Ready (R_RDY) control word being sent by the receiving link
port to the sender. An end node attached to a switch establishes its BB credit during
login to the fabric. A communicating partner attached elsewhere on the switch
establishes its own, most likely different, BB credit value with the director during its
login process. Hence, BB credit has no end-to-end component.
The sender consumes one BB credit for each frame it transmits and regains one for each
R_RDY received. The initial value of BB credit must be non-zero. The rate of frame
transmission is regulated by the receiving port based on the availability of buffers to
hold received frames.
At first glance, it is readily apparent that a BB credit of one may leave something to
be desired in terms of overall performance and efficiency. This is due to the time required
for frames to travel from the sending port to the receiving port and for responses to return
from the receiving port back to the sending port. Now, consider that it takes light
approximately 5 nsec to propagate through 1 meter of optical fiber, or 50
microseconds to travel 10 km. This behavior becomes even less efficient and more of a
performance drag on faster links, longer-distance links, or when traveling through
complex topologies that contribute significant delivery latencies. So, to achieve the
higher performance while preventing the overrun of receive buffers, we need to use BB
credit values greater than one. If a sending port is allowed to send more than one
frame without having to wait for a response to each, performance can be improved.
This is referred to as frame streaming. As more credits (beyond one) are made
available, link utilization (and performance) will increase until link utilization reaches
100 percent. When the link is thus fully utilized, frames can be sent as rapidly as
allowed but additional credits will not help matters.
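The effect of frame streaming can be estimated with a rough model. The sketch below is an illustration under stated assumptions, not a Brocade formula: it assumes a full-size 2148-byte frame, 8b/10b encoding at the 1 Gbps line rate of 1.0625 Gbaud, and a receiver that returns R_RDY immediately. The 5 ns per meter figure comes from the text above; the function name is hypothetical.

```python
NS_PER_METER = 5.0  # propagation delay in optical fiber, from the text above


def utilization(credits, distance_km, frame_bytes=2148, baud=1.0625e9):
    """Estimated fraction of link bandwidth usable with a given BB credit count."""
    frame_ns = frame_bytes * 10 / baud * 1e9          # 8b/10b: 10 line bits per byte
    round_trip_ns = 2 * distance_km * 1000 * NS_PER_METER
    # Each credit allows one frame per (frame time + round trip) cycle.
    return min(1.0, credits * frame_ns / (frame_ns + round_trip_ns))


for credits in (1, 2, 4, 6, 8):
    print(credits, round(utilization(credits, distance_km=10), 2))
```

In this model utilization grows linearly with the number of credits until it reaches 100 percent, after which extra credits make no difference, which is the behavior the paragraph describes.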
In this example, insufficient buffer credits were assigned for the long distance link. The
same type of behavior would also occur if credits were lost because R_RDYs or
VC_RDYs were lost.
Footnote 1: Brocade Fabric OS switches can detect the link distance and assign the correct
number of buffer credits (LD long distance mode), or a specific distance can be selected,
which assigns the buffer credits needed for that distance.
If you double the speed or double the distance, you need to double the credits
available on the port.
If the speed doubles and the credits stay the same, the maximum distance is cut in half.
Speed    Credits/km   Credits/50km   Credits/100km
1 Gbps   0.5          25             50
2 Gbps   1            50             100
Note: The above table is an approximation and assumes 2k payload size. For smaller
payload size, increase the number accordingly.
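The table follows a simple rule of thumb: about 0.5 credit per kilometer per Gbps for full-size frames. A minimal sketch of that rule is shown below. Applying it to speeds other than the 1 and 2 Gbps rows is an extrapolation based on the "double the speed, double the credits" statement, and the function name is hypothetical.

```python
import math


def credits_needed(speed_gbps, distance_km):
    """Approximate BB credits for full-size (2k payload) frames.

    Rule of thumb from the table: 0.5 credit per km per Gbps, rounded up.
    """
    return math.ceil(0.5 * speed_gbps * distance_km)


for speed, km in ((1, 50), (1, 100), (2, 50), (2, 100)):
    print(f"{speed} Gbps over {km} km: about {credits_needed(speed, km)} credits")
```

As the note says, smaller payloads need more credits than this estimate, so treat the result as a lower bound.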
Footnote 2: Requires Condor3 ASICs at both ends of the link running at 10 or 16 Gbps.
Footnote 1: If you are using QoS or ISL trunking these will be VC_RDY.
Footnote 2: This is generally caused by a misbehaving or mis-configured device.
Footnote 3: In the graphic example above, if storage 1 port 0 was not returning R_RDYs
or was returning them slower than it should, frames would back up on port 0, causing
frames to back up on VC 2, which in turn would slow all traffic using that VC, in this case
Hosts 1, 5 and 6. Also, expect I/O to be lower than expected on storage 2 port 4, but with
no frame drop or buffer credit starvation, because frames are not backed up but rather
never reach this port.
If Storage 1 was returning R_RDYs too slowly on all four ports, then frame backup would
occur on all VCs, slowing traffic for all 7 hosts in this example.
Footnote 1. Even though slow performance may be seen on devices using the same VC
as the problem device, only the end device causing the problem will have class 3 frame
drop counter increasing. In this example the storage on the bottom is seeing low
utilization but not credit starvation.
Footnote 1: As stated earlier, encoding errors outside the frame can also cause R_RDYs
to be lost, causing buffer credit starvation.
Footnote 2: Forward Error Correction can automatically fix some of these corrupted
frames. More information appears later in this module and in the appendix of this module.
CFP 300 Adaptive Networking: Fabric Profiling
Bottlenecks are also reported through RASlog alerts and SNMP traps. These two alerting
mechanisms are intertwined and cannot be independently turned on and off. You can
use the bottleneckmon command to specify alerting parameters for the following:
• Whether alerts are to be sent when a bottleneck condition is detected
• The size of the time window to look at when determining whether to alert
• How many affected seconds are needed to generate the alert.
• How long to stay quiet after an alert
Changing alerting parameters affects RASlog alerting as well as SNMP traps
For more information: Fabric OS Administrator’s Guide 53-1002148-01
Setting a threshold of 0.1 and a time window of 30 seconds specifies that an alert
should be sent when 10% of the one-second samples over any period of 30 seconds
were affected by bottleneck conditions. The -qtime option can be used to throttle alerts
by specifying the minimum number of seconds between consecutive alerts.
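The threshold, window, and quiet-time behavior can be illustrated with a small simulation. This is a sketch of the idea only: it assumes a sliding window evaluated once per second and an alert when the affected fraction is at or above the threshold. How Fabric OS actually averages samples is not described here, and the function and parameter names are hypothetical.

```python
def alert_times(affected, thresh=0.1, window=30, qtime=0):
    """Return the sample indexes at which an alert would be raised.

    affected: list of booleans, one per one-second sample (True = bottlenecked).
    """
    alerts, last_alert = [], None
    for end in range(window, len(affected) + 1):
        fraction = sum(affected[end - window:end]) / window
        quiet = last_alert is not None and (end - last_alert) < qtime
        if fraction >= thresh and not quiet:
            alerts.append(end)
            last_alert = end
    return alerts


# Invented sample: 3 affected seconds out of the first 30, then a clean period.
samples = [False] * 27 + [True] * 3 + [False] * 30
print(alert_times(samples, thresh=0.1, window=30, qtime=300))
```

With a quiet time of 0, the same three affected seconds would keep triggering alerts for as long as they stay inside the window, which is why the -qtime option is useful for throttling.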
Syntax:
bottleneckmon --enable [-alert][-thresh threshold]
[-time window] [-qtime quiet_time]
[slot/]port_list [[slot/]port_list] ...
bottleneckmon --disable [slot/]port_list
[[slot/]port_list] ...
bottleneckmon --show [-interval interval_size]
[-span span_size] [slot/]port
bottleneckmon --status
bottleneckmon --help
Enter the bottleneckmon --status command to display the details of bottleneck detection
configuration for the switch, which includes the following:
• Whether the feature is enabled
• Switch-wide parameters
• Per-port overrides, if any
• Excluded ports
Example
switch:admin> bottleneckmon --status
Bottleneck detection - Enabled
==============================
Switch-wide sub-second latency bottleneck criterion:
====================================================
Time threshold - 0.800
Severity threshold - 50.000
Switch-wide alerting parameters:
============================
Alerts - Yes
Latency threshold for alert - 0.100
Congestion threshold for alert - 0.800
Averaging time for alert - 300 seconds
Quiet time for alert - 300 seconds
Per-port overrides for sub-second latency bottleneck criterion:
==========================================================
Slot Port TimeThresh SevThresh
=========================================
0 3 0.500 100.000
0 4 0.600 50.000
0 5 0.700 20.000
Per-port overrides for alert parameters:
========================================
Slot Port Alerts? LatencyThresh CongestionThresh Time (s) QTime (s)
==========================================================
0 1 Y 0.990 0.900 3000 600
0 2 Y 0.990 0.900 4000 600
0 3 Y 0.990 0.900 4000 600
Excluded ports:
===============
Slot Port
============
0 2
0 3
0 4
Enter the following command to set the alerting and sub-second latency criterion
parameters
switch:admin> bottleneckmon --config
Enter the following command to remove any port-specific alerting and sub-second
latency criterion parameters and revert to the switch-wide parameters
switch:admin> bottleneckmon --configclear
When you enable bottleneck detection, you can configure switch-wide alerting and sub-
second latency criterion parameters that apply to every port on the switch. After you
enable bottleneck detection, you can change the alerting parameters on the entire
switch or on individual ports. You can change the sub-second latency criterion
parameters on individual ports only. You can also
change the parameters on ports that have been excluded from bottleneck detection.
The alerting parameters indicate whether alerts are sent, and the threshold, time, and
quiet time options.
For a trunk, you can change the parameters only on the master port.
1. Connect to the switch and log in as admin.
2. Enter the bottleneckmon --config command to set the alerting and sub-
second latency criterion parameters.
Enter the bottleneckmon --configclear command to remove any port-specific
alerting and sub-second latency criterion parameters and revert to the switch-wide
parameters.
The following example changes alerting parameters for the entire logical switch.
switch:admin> bottleneckmon --config -alert -lthresh .97 -cthresh .8 -time 5000
==============================
====================================================
PR208 Support Enhancements
Provides a mechanism for configuring different thresholds for different types of
frames, and reports an event whenever the count for a particular frame type crosses its
respective threshold.
Footnote 1: The VC specific counters are not shown for Condor and GoldenEye ASICs.
This feature requires Condor2 and GoldenEye2 or later ASICs.
Footnote 1: In the example above, the buffer-credit-to-zero counters are incrementing on
the ISL while the performance problem is occurring. This, along with the high ISL
utilization on the prior slide, helps confirm that the ISLs are congested in both directions.
To further confirm this, the next step is to check for class 3 frame discards on the end
device ports.
Also note that the Tx and Rx counters can be used to check for one-way congestion. If Tx is
increasing at a significantly higher rate than Rx, this may be the direction of the
congestion.
Footnote 2: For VC congestion, use the tim_txcrd_z_vc counter to help identify which VCs
may be congested. In the example above, all credit-to-zero instances are on VC 0, indicating
this port is probably long distance using R_RDY mode rather than VC_RDY.
The er_other_discard counter is the number of other discards (platform and port
specific) on a receiving port. This counter is used to track class 3 frames dropped that
are not due to hold time or destination unreachable. For example a frame that is
dropped due to a zoning violation.
Buffer Credit counters:
The tim_txcrd_z counter is the number of times that the port was unable to transmit
frames because the transmit BB credit was zero. The purpose of this statistic is to
detect congestion or a slow drain device. This parameter is sampled at intervals of
2.5 µs (microseconds), and the counter is incremented if the condition is true.
The tim_txcrd_z_vc counter is the number of times that the port was unable to
transmit frames because the transmit BB credit was zero for each of the port's 16
Virtual Channels (VC 0-15). The purpose of this statistic is to detect congestion or a
slow drain device. This parameter is sampled at intervals of 2.5 µs (microseconds),
and the counter is incremented if the condition is true. Each sample represents 2.5 µs
of time with zero Tx BB credit. An increment of this counter means that frames
could not be sent to the attached device for 2.5 µs, indicating degraded performance
(platform- and port-specific).
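Because each increment represents 2.5 microseconds at zero credit, the difference between two counter readings can be converted into time. The sketch below shows that arithmetic; the function names are hypothetical, and the readings are assumed to come from two portstatsshow captures taken a known number of seconds apart.

```python
SAMPLE_US = 2.5  # sampling interval stated above, in microseconds


def zero_credit_seconds(count_start, count_end):
    """Seconds spent at zero Tx BB credit between two counter readings."""
    return (count_end - count_start) * SAMPLE_US / 1_000_000


def zero_credit_fraction(count_start, count_end, elapsed_seconds):
    """Fraction of the elapsed interval spent at zero Tx BB credit."""
    return zero_credit_seconds(count_start, count_end) / elapsed_seconds


# Hypothetical readings taken 60 seconds apart.
print(zero_credit_seconds(0, 7_806_470))
print(zero_credit_fraction(0, 7_806_470, elapsed_seconds=60))
```

Always work with the difference between two readings, since the absolute counter value depends on when the statistics were last cleared.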
[Figure: VC congestion, ISL congestion, and credit starvation (low utilization, high I/O, frame drop) between hosts and multiple-connection storage on 8 Gbit/sec links]
r10-st16-b51:admin> portstatsshow 38
tim_txcrd_z_vc 0- 3: 7806470 0 0 0
tim_txcrd_z_vc 4- 7: 0 0 0 0
tim_txcrd_z_vc 8-11: 0 0 0 0
tim_txcrd_z_vc 12-15: 0 0 0 0
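When many ports need to be checked, the per-VC lines can be parsed rather than read by eye. The sketch below is an illustration in Python, not a Brocade tool: it assumes the output has already been captured as text in exactly the layout shown above, and the names parse_vc_counters and SAMPLE_OUTPUT are hypothetical.

```python
import re

# Captured text in the same layout as the portstatsshow excerpt above.
SAMPLE_OUTPUT = """\
tim_txcrd_z_vc 0- 3: 7806470 0 0 0
tim_txcrd_z_vc 4- 7: 0 0 0 0
tim_txcrd_z_vc 8-11: 0 0 0 0
tim_txcrd_z_vc 12-15: 0 0 0 0
"""

# Matches "tim_txcrd_z_vc <first>-<last>: <values...>".
LINE_RE = re.compile(r"tim_txcrd_z_vc\s+(\d+)-\s*(\d+):\s+(.*)")


def parse_vc_counters(text):
    """Map each VC number to its tim_txcrd_z_vc count."""
    counters = {}
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue
        first_vc = int(match.group(1))
        last_vc = int(match.group(2))
        values = [int(value) for value in match.group(3).split()]
        for vc, value in zip(range(first_vc, last_vc + 1), values):
            counters[vc] = value
    return counters


counters = parse_vc_counters(SAMPLE_OUTPUT)
busy_vcs = [vc for vc, count in sorted(counters.items()) if count > 0]
print(busy_vcs)
```

In this output only VC 0 has a non-zero count. At 2.5 µs per sample, 7806470 samples correspond to roughly 19.5 seconds at zero credit since the counters were last cleared.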
portbuffershow [[slotnumber/]portnumber]
DESCRIPTION
Use this command to display the current long distance buffer information for the
ports in a port group. The port group can be specified by giving any port number in
that group. If no port is specified, then the long distance buffer information for all of
the port groups of the switch is displayed.
This command displays the following fields of the long distance buffer information:
Lx Mode
Long distance mode:
L0 - link is not in long distance mode
LE - link is up to 10 Km
LD - distance determined dynamically
LS - distance determined statically via user input
Max/Resv Buffers
The maximum or reserved number of buffers that are allocated to the port based on
the estimated distance (as defined by the desired_distance operand of the
portCfgLongDistance command). If the port is not configured in long distance mode,
certain systems may reserve buffers for the port. This field then shows the number
of buffers reserved for the port.
Buffer Usage
The actual number of buffers allocated to the port. In LD mode, the number is
determined by two factors: the actual distance and the user specified desired
distance (as defined by the desired_distance operand of the portCfgLongDistance
command).
Needed Buffers
The number of buffers that are needed to utilize the port at full bandwidth
(depending on port speed configuration). If Buffer Usage is less than
Needed Buffers, the port is operating in buffer-limited mode.
Link Distance
For L0 (not in long distance mode), the command displays the fixed distance
based on port speed, for instance: 10 Km (1 Gbps), 5 Km (2 Gbps), 2 Km (4 Gbps),
or 1 Km (8 Gbps). For static long distance mode (LE), the fixed distance displays as
10 Km. For LD mode, the distance in kilometers displays as measured by timing the
return trip of a MARK primitive that is sent and then echoed back to the switch. LD
mode supports distances up to 500 Km. Distance measurement on a link longer
than 500 Km might not be accurate. If the connecting port does not support LD
mode, the field shows "N/A".
Remaining Buffers
The remaining (unallocated and unreserved) buffers in a port group.
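Two of the field descriptions above reduce to simple checks: a port is buffer limited when Buffer Usage is below Needed Buffers, and the L0 link distance is a fixed value per port speed. The following Python sketch restates those two rules. The function names and the buffer counts in the example are hypothetical; only the speed-to-distance values come from the text.

```python
# Fixed L0 link distances by port speed (Gbps -> Km), as listed in the text.
L0_DISTANCE_KM = {1: 10, 2: 5, 4: 2, 8: 1}


def is_buffer_limited(buffer_usage, needed_buffers):
    """True when the port has fewer buffers than it needs for full bandwidth."""
    return buffer_usage < needed_buffers


def l0_link_distance_km(speed_gbps):
    """Return the fixed distance displayed for an L0 port at the given speed."""
    try:
        return L0_DISTANCE_KM[speed_gbps]
    except KeyError:
        raise ValueError(f"no L0 distance listed for {speed_gbps} Gbps") from None


# Hypothetical values read from the Buffer Usage and Needed Buffers columns.
print(is_buffer_limited(buffer_usage=26, needed_buffers=206))  # True
print(l0_link_distance_km(8))  # 1
```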
Footnote 1: To export the performance reports, run a SAN Export and include the
performance reports as part of the exported data.
Footnote 1: An analysis of the network performance taken at the time of the problem,
such as portperfshow output, would be very helpful, but this information is not
included in a supportsave.
All device and ISL port traffic is low to normal. If this were ISL or device port
oversubscription, we would expect ISL or device port utilization to peak at or near
maximum (remember that portperfshow displays the total of Tx and Rx for a port).
Hint: ISL traffic is low to normal, yet the second ISL in each trunk group is still being
used. Trunking uses only the first link in the trunk until it is saturated; the second,
third, and fourth ISLs are used only as they are needed. What could be causing this, and
why?
Note: B5100 ports 0-30 have hosts attached and B300 ports 0-7 have storage
attached.
We also see class 3 frame drops on all four B300 ISLs but none on the B5100.
Again, we see tim_txcrd_z_vc counters increasing on all four B5100 ISL ports but
not on the B300. Remember that these counters are Tx counters, not Rx. All ISLs on the
B5100 show a high number of instances of VC 4 Tx buffer credits at zero for longer than
2.5 µs while waiting to receive a VC_RDY.
This confirms that class 3 frames going from the B5100 to the B300 are getting dropped
while frames from the B300 to the B5100 are not.
As only VC 4 is congested, we only need to look at the storage ports using VC 4 and the
hosts using those storage ports. Remember that the VC assignment is based on the
destination PID.
VC    Hex   Hex   Binary
VC3   1     9     0001/1001
VC4   2     A     0010/1010
VC5   3     B     0011/1011
VC2   4     C     0100/1100
VC3   5     D     0101/1101
VC4   6     E     0110/1110
VC5   7     F     0111/1111
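The rows above are consistent with the two low-order bits of the hex digit selecting one of VC 2 through VC 5. The Python sketch below encodes that pattern as an inference from the table only, not as a statement of ASIC behavior. The surviving text does not say which digit of the destination PID is used, so the digit is passed in as a parameter, and the function name is hypothetical.

```python
def data_vc_for_hex_digit(digit):
    """Return the VC number implied by the table for one hex digit (0x0-0xF)."""
    if not 0 <= digit <= 0xF:
        raise ValueError("digit must be a single hex digit")
    # The low two bits pick one of four VCs, starting at VC 2.
    return 2 + (digit & 0b11)


# Digits that land on VC 4, the congested VC in this example.
vc4_digits = [format(d, "X") for d in range(16) if data_vc_for_hex_digit(d) == 4]
print(vc4_digits)  # ['2', '6', 'A', 'E']
```

A filter like this narrows the storage ports to examine to those whose destination PID maps to the congested VC.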
SAN-TS 300 Device Connectivity
Revision 0213 7 – 78
Footnote 1:
ABTS: Abort command
BA_ACC: Abort Accept command
BA_RJT: Abort Reject command
Frame monitor configuration is saved and is persistent across reboots.
The trunk master monitors the data traffic for the entire trunk. For F_Port and E_Port
trunks, the monitor is set only on the master port of the trunk. If the master changes,
the monitor automatically moves to the new master port. If a monitor is installed on a
port that later becomes a slave port when a trunk comes up, the monitor automatically
moves to the master port of the trunk.
In this bit pattern example, 40,0xFF,0x08,0x28, the values 0x08 and 0x28 are SCSI read
commands.
Footnote 1: When bit patterns are chained together like this, the “;” acts like an “and”. All
three bit patterns must match to get a match.
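The matching rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the switch's implementation: each group is taken to be "offset,mask,value[,value...]", the offset is treated as a byte index into a captured frame buffer, multiple values in one group are treated as alternatives, and groups separated by ";" must all match. The function names and the test frame are hypothetical; the pattern string is the SCSI_READ pattern listed among the pre-defined frame types in this guide.

```python
def parse_pattern(pattern):
    """Split 'offset,mask,value[,value...]' groups separated by ';'."""
    groups = []
    for group in pattern.split(";"):
        fields = [int(field.strip(), 0) for field in group.split(",")]
        if len(fields) < 3:
            raise ValueError(f"incomplete group: {group!r}")
        offset, mask, *values = fields
        groups.append((offset, mask, values))
    return groups


def frame_matches(frame, pattern):
    """True only if every group matches: ';' acts as AND, values within a group as OR."""
    for offset, mask, values in parse_pattern(pattern):
        if offset >= len(frame):
            return False
        masked = frame[offset] & mask
        if not any(masked == (value & mask) for value in values):
            return False
    return True


SCSI_READ = "12,0xFF,0x08;4,0xFF,0x06;40,0xFF,0x08,0x28"

frame = bytearray(48)  # hypothetical frame buffer, all zeros
frame[12] = 0x08
frame[4] = 0x06
frame[40] = 0x28       # one of the two read values
print(frame_matches(frame, SCSI_READ))  # True

frame[40] = 0x2A       # a write value, so the third group no longer matches
print(frame_matches(frame, SCSI_READ))  # False
```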
Footnote 1: If you convert the bitmask 0x0F to binary (00001111), only the last 4 bits
are checked. A bitmask of 0xF1 is 11110001 in binary, so only bits 0 and 4-7 are
checked for a match.
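The conversion in the footnote can be reproduced with a short Python helper that lists which bit positions a one-byte mask selects. The helper name is hypothetical; bit 0 is taken to be the least significant bit, which matches the footnote's example.

```python
def checked_bits(mask):
    """Return the bit positions (0 = least significant) that an 8-bit mask checks."""
    if not 0 <= mask <= 0xFF:
        raise ValueError("mask must fit in one byte")
    return [bit for bit in range(8) if mask & (1 << bit)]


print(format(0x0F, "08b"), checked_bits(0x0F))  # 00001111 [0, 1, 2, 3]
print(format(0xF1, "08b"), checked_bits(0xF1))  # 11110001 [0, 4, 5, 6, 7]
```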
In SOFx frames, the offset is specified as 0x0 and the value is specified as one of the
following. For example, the value 0x6 matches frames of type SOFi3:
0 SOFf
1 SOFc1
2 SOFi1
3 SOFn1
4 SOFi2
5 SOFn2
6 SOFi3
7 SOFn3
Refer to the Fabric Watch Administrator’s Guide and the Fabric OS Command Reference
Manual for more information on these commands.
Pre-defined frame types include the following:
ABTS: Specifies a frame of type ABTS (Abort Sequence Basic Link Service
command) with a bit pattern of "4,0xff,0x81;12,0xff,0x00;12".
BA_ACC: Specifies a frame of type BA_ACC (Abort Accept) with a bit pattern of
“0xff,0x84;12,0xff,0x00".
BA_RJT: Specifies a frame of type BA_RJT (Abort Reject) with a bit pattern of “()”.
IP: Specifies a frame of type IP with a bit pattern of “12,0xFF,0x05”.
SCSI: Specifies a frame of type SCSI with a bit pattern of “12,0xFF,0x08”.
SCSI_READ: Specifies a frame of type SCSI Read with a bit pattern of
“12,0xFF,0x08;4,0xFF,0x06;40,0xFF,0x08,0x28”.
SCSI_WRITE: Specifies a frame of type SCSI Write with a bit pattern of
“12,0xFF,0x08;4,0xFF,0x06;40,0xFF,0x0A,0x2A”.
SCSI_RD_WR: Specifies a frame of type SCSI Read or Write with a bit pattern of
“12,0xFF,0x08;4,0xFF,0x06;40,0xFF, 0x08,0x28,0x0A,0x2A”.
SCSI2_RESERVE: Specifies a frame of type SCSI-2 Reserve with a bit pattern of
“12,0xFF,0x08;4,0xFF,0x06;40,0xFF,0x16,0x56”.
SCSI3_RESERVE: Specifies a frame of type SCSI-3 Reserve with a bit pattern of
“12,0xFF,0x08;4,0xFF,0x06;40,0xFF,0x5F;41,0xFF,0x01”.
Footnote 1: The physical port number to index number mapping can be found in the
switchshow output.
Refer to the Fabric Watch Administrator’s Guide for more information.
When you set a time base to a value other than none, there are two main points to
remember when configuring events:
• Fabric Watch triggers an event only if the difference in the data value exceeds the
preset threshold boundary limit.
• Even if the current data value exceeds the threshold, Fabric Watch does not trigger
an event if the rate of change is below the threshold limit.
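The two points above can be summarized as a comparison on the change in the value rather than on the value itself. The Python sketch below is only a model of that rule, not Fabric Watch code: the function name and the numbers are hypothetical, and the behavior shown for a time base of none (comparing the current value directly against the threshold) is an assumption.

```python
def should_trigger(previous_value, current_value, threshold, time_base_set=True):
    """Model of the threshold check with and without a time base."""
    if time_base_set:
        # With a time base, only the change over the period is compared.
        return (current_value - previous_value) > threshold
    # Assumed behavior for a time base of none: compare the value itself.
    return current_value > threshold


# The counter is far above the threshold of 10, but it grew by only 5.
print(should_trigger(1000, 1005, threshold=10))  # False
# The counter grew by 20 during the time base, so an event is triggered.
print(should_trigger(1000, 1020, threshold=10))  # True
```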
PR208 Support Enhancements
Revision 0511 7 – 93
Not shown but at the bottom of this screen is an OK button, once you assign
Revision 0511 7 – 95
PR208 Support Enhancements
If the monitor is not there, click on switch view to create the monitor.