Module 14 - Troubleshooting and Collecting Diagnostics


Module 14

Collecting Diagnostics Data


By the end of the module, you should be able to:

• Describe, compare and/or demonstrate the differences between running diagnostics from within the Group Manager GUI and the CLI

• Identify the 15 different diagnostics sections that are collected when Diagnostics are run on an array

• Describe what is meant by ‘abbreviated’ diagnostics and when they are useful

• Identify the different diagnostic parameters and switches that you may need to employ when running diagnostics

• Identify the output files that the diagnostics script generates

• Identify the different methods to extract the diagnostic output from a member or group of members

• Determine how to verify a hard disk failure using Group Manager

• Determine how to verify a power supply failure using Group Manager

Dell - Internal Use - Confidential
Collecting Diagnostics Data

Collecting PS Series Array Diagnostic Data

Dell PS Series Array Diagnostics

• SAN Assist is the preferred method for customers to collect Group/Array Diagnostics

• Running Dell EqualLogic Array Diagnostics


• GUI
• Easiest method to execute for customer
• Can run across multiple members at the same time
• Cannot pass parameters to the ‘under-the-hood’ diagnostic script
• CLI
• Must connect directly to the member (not the Group IP)
• Must run manually on every member
• Can run any subset of the diagnostics collectors (sometimes required by customers)
• Many parameters available to CLI, including abbreviated output

Dell PS Series Array Diagnostics

• Preferred method is to run the CLI on each affected member and on the group lead


Collecting Diagnostics with SupportAssist

• Customer-oriented, for automated e-support

• Runs automatically when the customer installs SupportAssist

• Manual option available

• Accessible from SAN Headquarters via the SupportAssist menu in the toolbar
• Customer can ‘decrypt’ output to allay any security concerns
• If calling support, the array diagnostic script will be used via the Group Manager GUI or CLI

Array Diagnostics Data Collected

• When diagnostics are run, multiple encrypted files (seg_#.dgo) are generated and stored on the array (grpadmin’s / ftp directory)
• Full Diagnostics collects 15 groups (or sections) of information:

Manually Running Diagnostics

• Passing parameters to the diag script is only possible from the CLI

• Required if a customer disallows SNMP collection (diag -x14)
• Encrypted seg_#.dgo files are generated and are accessible via the integrated FTP server
• Output can be captured through the terminal console if necessary
• Email of the results can be allowed or disallowed, and abbreviated diags can be run
• Use -w when possible, as snmpwalk is quicker than snmpget
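
The parameter handling above can be sketched as a small helper that composes the command line to type at the member CLI. This is a minimal sketch under stated assumptions: the only flags it knows are the ones named in this module, the descriptions paraphrase the slide text, and whether the flags can be combined is not confirmed by the slides. The helper name `build_diag_command` is hypothetical; `diag -h` on the array remains the authority for real syntax.

```python
# Flags named in this module only. Descriptions paraphrase the slides;
# combining flags is an assumption, so check `diag -h` on the array.
KNOWN_FLAGS = {
    "-h": "show command syntax and help",
    "-w": "use snmpwalk, which the slides describe as quicker than snmpget",
    "-x14": "form shown for customers who disallow SNMP collection",
    "-ad": "abbreviated diag shown on the hard disk slide",
    "-ap": "abbreviated diag shown on the power supply slide",
}


def build_diag_command(flags=()):
    """Return the CLI string for the given flags, rejecting unknown ones."""
    unknown = [flag for flag in flags if flag not in KNOWN_FLAGS]
    if unknown:
        raise ValueError(f"flags not named in this module: {unknown}")
    return " ".join(["diag", *flags])


if __name__ == "__main__":
    print(build_diag_command(["-w"]))  # prints "diag -w"
```

Rejecting unknown flags is a deliberate choice here: it keeps a typo from being pasted into a production member.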

Manually Running Array Diagnostics - CLI

• Log in directly to the member as grpadmin (or any Group Administrator account)

• Connect to the management port if enabled, otherwise to any eth port or via the active CM’s serial port
• ALWAYS run diagnostics on the Group Lead in addition to problem/suspect arrays
• If the group is small (e.g. 4 arrays), run diags on all members in the group

• Determine if any parameters need to be passed to the script

• Execute the script with any necessary parameters

• Enter diag -h for command syntax and help
• <CTRL>+C can break out of the script
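
The member-selection rule above (always the Group Lead, every suspect array, and every member when the group is small) can be sketched as follows. The function name `members_to_diagnose` and the `small_group_max` threshold are hypothetical; the slides only give "4 arrays" as an example of a small group, so the default of 4 is an assumption. The sketch also assumes the Group Lead appears in `all_members`.

```python
def members_to_diagnose(group_lead, suspects, all_members, small_group_max=4):
    """Return the members to run diagnostics on, in group order.

    small_group_max is an assumed cut-off; the slides only say "e.g. 4 arrays".
    """
    if len(all_members) <= small_group_max:
        # Small group: collect from every member.
        return list(all_members)
    # Larger group: the Group Lead plus every problem/suspect array.
    wanted = {group_lead, *suspects}
    return [member for member in all_members if member in wanted]


if __name__ == "__main__":
    group = ["m1", "m2", "m3", "m4", "m5", "m6"]
    print(members_to_diagnose("m1", ["m4"], group))  # prints ['m1', 'm4']
```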

How to Manually Run Array Diagnostics - GUI

• Log in as a Group Administrator

• Open the Tools panel and click Diagnostic reports
• Complete the Diagnostics Wizard, selecting the members to run against:
• Always the Group Lead
• All suspect arrays
• All arrays if the group is small
• Check the status of the diags through the Operations Panel

Retrieving Diagnostic Files From The Array(s)

• Requirements
• Host that can communicate over the management network to the array
• Over the iSCSI SAN if the management port is not configured due to mixed traffic
• FTP Client
• FileZilla, WinSCP (FTP protocol)
• Command prompt

• Windows Explorer
• ftp://grpadmin@member_ip_address
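
For the command-prompt route, the retrieval can also be scripted. The following is a minimal sketch using Python's standard `ftplib`, assuming the seg_#.dgo files sit in the directory that grpadmin lands in after login and that the server returns bare file names. The function names are hypothetical, and the password must be supplied by the caller.

```python
from fnmatch import fnmatch
from ftplib import FTP
from pathlib import Path


def is_diag_segment(name):
    """True for names matching the seg_#.dgo pattern described in this module."""
    return fnmatch(name.lower(), "seg_*.dgo")


def fetch_diag_segments(ftp, dest_dir):
    """Download every seg_*.dgo file from the current FTP directory.

    `ftp` is any object offering ftplib.FTP's nlst() and retrbinary() methods.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for name in sorted(n for n in ftp.nlst() if is_diag_segment(n)):
        with open(dest / name, "wb") as handle:
            ftp.retrbinary(f"RETR {name}", handle.write)
        saved.append(name)
    return saved


def fetch_from_member(member_ip, password, dest_dir):
    """Connect to one member (not the Group IP) as grpadmin and pull its files."""
    with FTP(member_ip) as ftp:
        ftp.login("grpadmin", password)
        return fetch_diag_segments(ftp, dest_dir)
```

Binary transfer is used because the .dgo files are encrypted; an ASCII-mode transfer could corrupt them. Keeping one destination directory per member avoids overwriting segments that share a name across arrays.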

.DGO files – What Do I Do With Encrypted .DGO Files?

• .DGO files must be uploaded to the Dell EqualLogic Diagnostic Server (a.k.a. “Diag Server”)

• MUST be placed in the correct folder to be useful and retain historical reference
• When not properly uploaded, a customer may experience an issue and support personnel may not see or find it
• Diag Server is a Linux host that is the central repository for diagnostic data that is used by support for analysis
• File structure is specific
• Many scripts are used to assist in analyzing “unpacked” .DGO files

Collecting Diagnostics Data

Diagnosing Hard Drive and Power Failure Faults

Diagnosing Hard Disk and Power Supply Failures

• Hard Disk Failures


• Verify/validate using Group Event Log through SAN Headquarters or Group Manager
• Check disk firmware revisions
• Do NOT reseat disks; replace them

• Power Supply Failures


• Validate environment/cables
• Verify/validate fault using Group Event Log through SAN Headquarters or Group Manager

When a Disk Fails or Has Tripped SMART

• Tripped SMART on array running firmware 6.01 or later


• Drive will begin copy-to-spare (pre-emptive mirroring) prior to fault
• If drive fails during process, will shift to normal rebuild process

• Fails with a spare drive


• Spare replaces failed disk and begins RAID rebuild
• RAID-6 and RAID-5 have 1 spare by default
• RAID-10 and RAID-50 members have 2 spares by default

• Fails with no spare drive available


• Data continues to be available, but the set is degraded.

• Multiple Disk Failure (MDF)


• In a degraded RAID 10, RAID 5, or RAID 50 set, the member is set offline; any volumes and snapshots residing on the member are set offline
• In a degraded RAID 6 set, the member continues in a degraded state with further performance degradation. After both drives are reconstructed, performance returns to normal.

• When a failed drive is replaced:


• If a spare drive was used: Data has already been reconstructed on the spare drive, so the new drive becomes a spare. EqualLogic does not have a specific slot reserved for a spare.
• If a set is degraded: Data is reconstructed on the new drive and after reconstruction, performance returns to normal. (NOTE: CLI commands exist that allow a customer to configure a RAID policy with no spares)
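
The failure cases on this slide can be restated as a small decision table. This is a simplified sketch of the slide's bullets only, not a model of actual array firmware behavior: the names are hypothetical, `already_degraded` is assumed to mean exactly one disk is already missing from the set, and cases the slide does not cover (for example a third failure in RAID 6) are out of scope.

```python
# Default spare counts as stated on the slide.
DEFAULT_SPARES = {"RAID-5": 1, "RAID-6": 1, "RAID-10": 2, "RAID-50": 2}


def after_disk_failure(raid_policy, spares_available, already_degraded):
    """Return the slide's described outcome for one more disk failure.

    already_degraded: assumed to mean one disk is already missing from the set.
    """
    if raid_policy not in DEFAULT_SPARES:
        raise ValueError(f"RAID policy not covered by the slide: {raid_policy}")
    if spares_available > 0:
        return "spare replaces failed disk and RAID rebuild begins"
    if not already_degraded:
        return "data remains available but the set is degraded"
    if raid_policy == "RAID-6":
        return "continues degraded with further performance degradation"
    return "member set offline; its volumes and snapshots go offline"


if __name__ == "__main__":
    print(after_disk_failure("RAID-50", spares_available=0, already_degraded=True))
```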

Determine Status of a Hard Disk

• Check the Event Log (SAN Headquarters or Group Manager)

• Use Group Manager to review Member information

• Abbreviated diag (diag -ad) is available, but does not replace Diagnostics

Diagnosing Power Supply Faults

• Cooling Fans are usually part of the Power Supply

• Check the Event Log (SAN Headquarters or Group Manager)

• Use Group Manager to review Member information

• CLI: select member show enclosure
• GUI: Select Member -> Status Tab (Rear view) and -> Enclosure Tab
• Abbreviated diag (diag -ap) is available, but does not replace Diags

Collecting Diagnostics Data

Module Review and Lab

End of Module Review

• In this module we have reviewed and discussed the following:


• The differences between running diagnostics from within the Group Manager GUI and the CLI
• The 15 different diagnostics sections that are collected when Diagnostics are run on an array
• What is meant by ‘abbreviated’ diagnostics and when they are useful
• Why you might need to run only partial diagnostics
• The different diagnostic parameters and switches that you may need to employ when running diagnostics
• The output files the diagnostics script generates
• The different methods to extract the diagnostic output from a member or group of members
• How to verify a hard disk failure using Group Manager
• How to verify a power supply failure using Group Manager

Collecting Diagnostics Data Lab

• Please refer to the lab manual for instructions.
