Sun Fire Midrange Server Maintenance
SM-340
edited 02/08 by LG
Copyright 2004 Sun Microsystems, Inc. 4150 Network Circle, Santa Clara, California 95054, U.S.A. All rights reserved. This product or document is protected by copyright and distributed under licenses restricting its use, copying, distribution, and decompilation. No part of this product or document may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any. Third-party software, including font technology, is copyrighted and licensed from Sun suppliers. Sun, Sun Microsystems, the Sun logo, Java, Netra, OpenBoot, Solaris, Sun Enterprise, Sun Fire, Sun HPC Cluster Tools, Sun Java, and Sun StorEdge are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. UNIX is a registered trademark in the U.S. and other countries, exclusively licensed through X/Open Company, Ltd. U.S. Government approval might be required when exporting the product. RESTRICTED RIGHTS: Use, duplication, or disclosure by the U.S. Government is subject to restrictions of FAR 52.227-14(g)(2)(6/87) and FAR 52.227-19(6/87), or DFAR 252.227-7015 (b)(6/95) and DFAR 227.7202-3(a). DOCUMENTATION IS PROVIDED AS IS AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS, AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID. THIS MANUAL IS DESIGNED TO SUPPORT AN INSTRUCTOR-LED TRAINING (ILT) COURSE AND IS INTENDED TO BE USED FOR REFERENCE PURPOSES IN CONJUNCTION WITH THE ILT COURSE. THE MANUAL IS NOT A STANDALONE TRAINING TOOL. USE OF THE MANUAL FOR SELF-STUDY WITHOUT CLASS ATTENDANCE IS NOT RECOMMENDED.
Copyright 2004 Sun Microsystems, Inc. 4150 Network Circle, Santa Clara, California 95054, United States. All rights reserved. This product or document is protected by copyright and distributed under licenses restricting its use, copying, distribution, and decompilation. No part of this product or document may be reproduced in any form, by any means, without the prior written authorization of Sun and its licensors, if any. Third-party software, including font technology, is protected by copyright and licensed from Sun suppliers. Sun, Sun Microsystems, the Sun logo, Java, Netra, OpenBoot, Solaris, Sun Enterprise, Sun Fire, Sun HPC Cluster Tools, Sun Java, and Sun StorEdge are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. UNIX is a registered trademark in the United States and other countries, exclusively licensed through X/Open Company, Ltd. U.S. Government approval is required before exporting the product. THE DOCUMENTATION IS PROVIDED AS IS AND ALL OTHER EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS, AND WARRANTIES ARE FORMALLY EXCLUDED, TO THE EXTENT PERMITTED BY APPLICABLE LAW, INCLUDING IN PARTICULAR ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. This reference manual is to be used as part of an instructor-led training (ILT) course. It is not a standalone training tool. We advise against using it for self-study.
Course Contents
About This Course
Course Goals
Course Map
Topics Not Covered
How Prepared Are You?
Introductions
Icons
Typographical Conventions
Additional Conventions
Sun Services
Multipathed I/O
Dynamic Reconfiguration (DR)
Platform Startup and Shutdown
Sun Fire 4800/E4900, 4810, and 6800/E6900 Server Eight-Slot PCI I/O Assembly
Sun Fire V1280/E2900 Server I/O Assembly Location
Sun Fire 3800 Server I/O Assembly Locations
Sun Fire 4800/E4900 Server I/O Assembly Locations
Sun Fire 4810 Server I/O Assembly Locations
Sun Fire 6800/E6900 Server I/O Assembly Locations
Eight-Slot PCI I/O Assembly Slot Locations and LEDs
Eight-Slot Assembly Electrical Characteristics
Six-Slot cPCI I/O Assembly Slot Locations and LEDs
Six-Slot cPCI I/O Slot Electrical Characteristics
Four-Slot cPCI I/O Assembly Slot Locations and LEDs
Four-Slot cPCI I/O Slot Electrical Characteristics
PCI and cPCI I/O Adapters
Sun Fire V1280/E2900 Server Sun Fireplane Switchboard
Sun Fire 4800/E4900, 4810, and 6800/E6900 Server Sun Fireplane Switchboard
Sun Fireplane Switchboard Physical Locations
Sun Fireplane Switchboard LEDs
Sun Fire V1280/E2900 Server Baseplane
System Configuration Card Reader (SCCR)
Sun Fire 6800/E6900 Server Centerplane and ID Board (Front View)
Sun Fire 6800/E6900 Server Centerplane and ID Board (Rear View)
ID Board
Replacing a Centerplane or ID Board
ID Board MAC Addresses
AC Power Distribution
Sun Fire 4810 and 4800/E4900 Server AC Component Locations
Sun Fire 6800/E6900 Server AC Component Locations
RTU and RTS
Redundant Transfer Unit Panel
Redundant Transfer Unit LED Functions
AC Input Box
Copyright 2004 Sun Microsystems, Inc. All Rights Reserved. Sun Services, Revision C
DC Power Distribution
Sun Fire V1280/E2900 Server DC Power Distribution
Sun Fire 3800 Server DC Power Distribution
Sun Fire 4800/E4900 Server DC Power Distribution
Sun Fire 4810 Server DC Power Distribution
Sun Fire 6800/E6900 Server DC Power Distribution
Power Grid Slot Assignments
DC Power Supplies
DC Power Supply Locations
Sun Fire Midrange Server Fan Trays and Blower Assemblies
Fan Tray Locations
Fan Tray Assembly LEDs
FrameManager Cable Diagram
FrameManager Cap Front Panel
Sun StorEdge D240 Media Tray
Sun StorEdge D240 Media Tray Rear Panel
Full-Bus Configuration SCSI ID Assignments
Full SCSI Bus Configuration Options
Split SCSI Bus
Typical Split SCSI Bus SCSI ID Assignments
Sun StorEdge D240 Media Tray Status LEDs
Media Tray Status LED Descriptions
Media Tray Power Supply LEDs
Media Tray Power Supply LED States
Installing the Administration Console
System Controller Patch Panel
Accessing the Platform Shell
Sun Fire Midrange Server Installation
Rackmounting an Additional Sun Fire 3800 and 4800/E4900 Server
Platform Assessment and Management
Objectives
Relevance
Sun Fire V1280/E2900 Server Platform Assessment and Management
Lights-Out Management (LOM)
LOM Shell
LOM Shell Commands
The help Command
Connecting to the LOM Shell
The shownetwork Command
The setupnetwork Command
The logout Command
Navigating Between Shell Environments on the Sun Fire V1280/E2900 Server
The showescape Command
The password Command
The showsc Command
The setupsc Command
Managing the LOM Time-of-Day (TOD)
The bootmode Command
LOM Platform Monitoring Functions
The showboards Command
The showcomponent Command
The inventory Command
The showenvironment Command
The history Command
The showlogs Command
The showlocator Command
Sun Fire V1280/E2900 Server Power Operations
LOM poweron Command
LOM shutdown Command
LOM poweroff Command
Power-Cycling the Sun Fire V1280/E2900 Server Using the Power Rocker Switch
Power-Cycle Operations
Sun Fire 3800, 4800/E4900, 4810, and 6800/E6900 Server Platform Assessment and Management
Platform Shell Commands
The help Command
Connecting to the System Controller Shells
Initiate a Remote Connection With SSH
Initiate a Remote Connection With Telnet
Navigating Between Shells on the Sun Fire 3800, 4800/E4900, 4810, and 6800/E6900 Servers
Managing Shell Passwords
Console Command Considerations
Platform and System Controller Configuration
Configuring the Platform TOD
Viewing System Controller Details
Viewing the SC Message Logs
Viewing System Controller Connections
Viewing System Controller Command History
System Controller Management
System Controller Configuration
Rebooting the System Controller
System Controller Failover
System Controller Failover Prerequisites
Failover
Controlling System Controller Failover Behavior
Determining the System Controller Failover State
Platform Assessment
Assessing the Platform Configuration
The showplatform Command
Viewing Platform Component Status
The showboards Command
Viewing Platform Component Details
The showcomponent Command
Powering On and Off System Components
Updating the Platform Firmware
Using the Root or a User Account to Flash Update the System
Introducing Segments and Domains
Segments
Domains
Sun Fire 6800/E6900 Servers Configured With Four Domains
Server Configuration Domain IDs
Segment and Domain Configurations
Domain Access Control List (ACL)
Configuring ACLs
Viewing ACLs
Starting, Stopping, and Power-Cycling Domains
Introducing Device Configuration
OpenBoot PROM Capabilities
Device Tree
Sun Fire V1280/E2900 Server Device Tree Components
Sun Fire 3800, 4800/E4900, 4810, and 6800/E6900 Server Device Tree Components
Mapping Node Devices
CPU and Memory AID Assignments
Mapping I/O Devices
Decoding IOC AID
IOC AID Assignments
IOC PCI Bus Offset
Device Number
Sun Fire V1280/E2900 Server Six-Slot PCI Chassis
Sun Fire 4800/E4900, 4810, and 6800/E6900 Server Eight-Slot PCI Chassis
Sun Fire 3800 Server Six-Slot cPCI Chassis
Sun Fire 4800/E4900, 4810, and 6800/E6900 Server Four-Slot cPCI Chassis
Fault Analysis Method
Eight Steps of Fault Analysis and Diagnosis
Sun Fire Midrange Server Fault Analysis Methodology
Rule of Three Fault Isolation
Gathering Background Information
Error Repositories and Commands
Sun Explorer Software Data Collector
Running Sun Explorer Software on the Sun Fire Midrange Server
Viewing a Sun Explorer Software Capture
Interpreting Sun Fire Midrange Server LEDs
LED Status Code Summary
Testing the Platform
OpenBoot PROM Commands
POST on the Sun Fire V1280/E2900 Server
Controlling System Controller POST Behavior
Controlling OpenBoot PROM POST Behavior
POST on the Sun Fire 3800, 4800/E4900, 4810, and 6800/E6900 Servers
DIMM Memory Errors
Controlling System Controller POST Behavior
System Controller testboard Command
Sun Fire Interconnect Link Errors
Identifying Suspect FRUs From Error Messages
Data Parity Coverage From CPU to CPU Through Memory
Parity Detection in the Address Network
Parity Protection for Address Interconnects
Error Correcting Code Errors
ECC Error Types
ECC Error Persistence
Console Port Errors
Environmental Errors
Enhanced Availability Features Implemented in Firmware Update 5.15.3
Diagnosis Engines
Auto-Diagnosis (AD) Engine
Fault Event and Error Reporting
AD Engine Logs and Records
Decoding AD Engine Diagnosis Messages
Viewing AD Engine Diagnosis Messages
Automatic Restoration of Stopped Domains
Identifying Disabled Components
Sun Fire Midrange Server Blacklisting
Blacklisting Components
Managing the Blacklist on Sun Fire V1280/E2900 Server
The setls Command
Domain Shell Operating Messages
Recovering From a Hung Domain
Verifying the Recovery
Collecting Data
Obtaining a Solaris OS Core File
Obtaining Registers
Sun Services
Preface
About This Course
Sun Services
Course Goals
Upon completion of this course, you should be able to:
  Locate online resources for the Sun Fire midrange server product line, which includes the following servers:
    Sun Fire V1280 server and Sun Fire E2900 server
    Sun Fire 3800 server
    Sun Fire 4800 server and Sun Fire E4900 server
    Sun Fire 4810 server
    Sun Fire 6800 server and Sun Fire E6900 server
  Describe the server configuration and key features of each model in the Sun Fire midrange server line
  Configure the Sun Fire midrange server platforms
  Perform system maintenance-related activities on the Sun Fire midrange servers
Sun Services
Course Map
Sun Fire Midrange Introduction
Introducing the Sun Fire Midrange Servers Field-Replaceable Units
Sun Services
Sun Services
Sun Services
Introductions
Name Company affiliation Title, function, and job responsibility Experience related to topics presented in this course Reasons for enrolling in this course Expectations for this course
Sun Services
Icons
Additional resources
!
?
Sun Services
Typographical Conventions
Courier is used for the names of commands, files, directories, programming code, programming constructs, and on-screen computer output. Courier bold is used for characters and numbers that you type, and for each line of programming code that is referenced in a textual description. Courier italic is used for variables and command-line placeholders that are replaced with a real name or value.
Sun Services
Typographical Conventions
Courier italic bold is used to represent variables whose values are to be entered by the student as part of an activity. Palatino italic is used for book titles, new words or terms, or words that are emphasized.
Sun Services
Additional Conventions
Java programming language examples use the following additional conventions: Courier is used for the class names, methods, and keywords. Methods are not followed by parentheses unless a formal or actual parameter list is shown. Line breaks occur where there are separations, conjunctions, or white space in the code. If a command on the Solaris OS is different from the Microsoft Windows platform, both commands are shown.
Sun Services
Module 1
Introducing the Sun Fire Midrange Servers
Sun Services
Objectives
  List the functional goals of the Sun Fire midrange server product line
  Locate Sun Microsystems web sites containing important Sun Fire midrange server information
  List the server models that make up the Sun Fire midrange server product line
  Identify the input and output (I/O) components that the Sun Fire midrange servers support
  Describe the key features of each Sun Fire midrange server model
  Power on and off each Sun Fire midrange server
Relevance
Which Sun Fire midrange server models are available? How is each Sun Fire midrange server model used? What are the key features of each Sun Fire midrange server model?
Sun Fire 4810 server Sun Fire 4800/E4900 server Sun Fire V1280/E2900 server
System Controllers
Sets up the system and coordinates the boot process Generates system clocks Monitors the environmental sensors Analyzes errors and takes corrective action Sets up the system partitions and domains Provides the system console capabilities
[Figure: CPU/memory boards and I/O assemblies grouped into Domain A and Domain B (Segment 0) and Domain C and Domain D (Segment 1)]
Multipathed I/O
Sun StorEdge Traffic Manager software: Provides a high level of disk availability and performance using multipath access to I/O devices. This software was formerly known as Multiplexed I/O (MPxIO).
Internet Protocol Multipathing: Provides a high level of network availability and performance using automatic failover and load balancing on existing Internet Protocol-based networking products.
Module 2
Field-Replaceable Units
Sun Services
Objectives
  Describe the various administrative and service layers on the Sun Fire midrange server products
  Describe the Sun Fire midrange server FRU strategy
  Locate and describe the function of the Sun Fire midrange server system controller boards
  Locate and describe the function of the Sun Fire midrange server system boards
  Locate and describe the function of the Sun Fire midrange server I/O boards
  Locate and describe the function of the Sun Fire midrange server Sun Fireplane switchboards
  Locate and describe the function of the Sun Fire midrange server baseplane and centerplanes
  Locate and describe the function of the Sun Fire midrange server AC and DC power distribution FRUs
  Locate and describe the function of the Sun Fire midrange server fan tray assemblies
  Locate and describe the function of the Sun Fire midrange server FrameManager
  Locate and describe the function of the Sun StorEdge D240 media tray
  Install and administer the console
  Install the server in a rack configuration
Sun Services
Relevance
What URLs are needed to install and configure the Sun Fire midrange server models? What is the difference between hot-plug and hot-swap? Which components make up the Sun Fire midrange server models? Where are the components located? Which status indicators are associated with each component?
Sun Services
Sun Services
Platform Hardware
FRU Administration
Non-hot-pluggable FRUs: PCI cards Centerplane Sun Fireplane switchboard
Sun Services
FRU Administration
Hot-pluggable FRUs: System boards I/O boards System controller (only with failover enabled)
Sun Services
FRU Administration
Hot-swappable FRUs: DC power supplies Fan trays cPCI cards
Sun Services
PCI chassis
PCI riser board IB_SSC riser board -I/O Controller (I/O) -System Controller (SC)
Sun Services
Sun Services
Sun Services
IB_SSC FRU
Sun Services
SC1
SC0
Sun Services
SC1
SC0
Sun Services
SC0
Sun Services
SC1 SC0
Sun Services
Sun Services
Status LED
Reset button
Sun Services
LED                   On                                                                      Off
Activated             The board is activated. Do not remove the board when this LED is on.    The board is not activated. You can remove the board when this LED is off.
Fault (amber)         An internal fault occurred.                                             No internal fault occurred.
Removal OK (amber)                                                                            Do not remove the component under hot-pluggable conditions.
Sun Services
Sun Services
Sun Services
Sun Fire 3800, 4800/E4900, 4810, and 6800/E6900 Server System Boards
The Sun Fire midrange server system board houses either two or four UltraSPARC III or UltraSPARC IV processors. Each processor supports two physical banks of memory.
Sun Services
Sun Services
CPU 1 (P1)
CPU 0 (P0)
CPU 3 (P3)
CPU 2 (P2)
Sun Services
DIMM 3, bank 0 DIMM 3, bank 1 DIMM 2, bank 0 DIMM 2, bank 1 DIMM 1, bank 0 DIMM 1, bank 1 DIMM 0, bank 0 DIMM 0, bank 1
Sun Services
SB0 SB2
SB4
Sun Services
SB2
SB0
PCI chassis
PCI riser board IB_SSC riser board -I/O Controller (I/O) -System Controller (SC)
Sun Services
Sun Fire 4800/E4900, 4810, and 6800/E6900 Server Four-Slot cPCI I/O Assembly
Sun Services
Sun Services
Sun Fire 4800/E4900, 4810, and 6800/E6900 Server Eight-Slot PCI I/O Assembly
Sun Services
[Figure: PCI slot labels. PCI 1 through PCI 4: 33 MHz, 32-bit and 64-bit. PCI 5: 66 MHz, 32-bit and 64-bit.]
Sun Services
IB6
IB8
Sun Services
IB8
IB6
Sun Services
IB8
Sun Services
IB9
IB8
IB7
IB6
Sun Services
7 6
5 4
3 2
1 0
Slots
Sun Services
Frequency
33 MHz 33 MHz 33 MHz 33 MHz 33 MHz 33 MHz
Bit-Size
64-bit 64-bit 64-bit 64-bit 64-bit 64-bit
Voltage
5VDC 5VDC 5VDC 3.3VDC 5VDC 5VDC 5VDC 3.3VDC
Sun Services
5 4 3 2 1 Slots 0
Sun Services
Frequency
66 MHz/33 MHz 66 MHz/33 MHz 33 MHz 33 MHz 33 MHz 33 MHz
Bit-Size
64-bit 64-bit 64-bit 64-bit 64-bit 64-bit
Voltage
3.3VDC 3.3VDC 5VDC 5VDC 5VDC 5VDC
Sun Services
3 2 1 0 Slots
Sun Services
Frequency
Bit-size Voltage
3.3VDC 3.3VDC 5VDC 5VDC
66 MHz/33 MHz 64-bit 66 MHz/33 MHz 64-bit 33 MHz 33 MHz 64-bit 64-bit
Sun Services
Ejector handle
Sun Services
Sun Services
DX0
DX1 Echip
Air Vent
Air Vent
Lever
Sun Services
Sun Fire 4800/E4900, 4810, and 6800/E6900 Server Sun Fireplane Switchboard
Sun Services
RP0 RP2
Sun Services
Sun Services
Sun Services
530-3073 1B6_SSC1
J9001
System Board 0
1B6_SSC1
System Board 2
Sun Services
Sun Services
System Controller 1
System Board 2
System Board 0
System Board 4
System Board 1
System Board 3
System Board 2
System Controller 0
System Board 0
System Board 4
System Board 1
System Board 3
System Board 5
System Board 5
Sun Services
I/O Board 9
I/O Board 9
Bus Bar 340-4796 to Power Centerplane I2C Cable 530-2546 to Power Centerplane
I/O Board 8
I/O Board 8
Sun Services
ID Board
Sun Services
ID Board
The ID board contains a serial electrically erasable programmable read-only memory (SEEPROM) application-specific integrated circuit (ASIC) with the following information:
  The server chassis ID
  The server serial number/host ID
  Six media access control (MAC) addresses for the Sun Fire 6800/E6900 server and four MAC addresses for the Sun Fire 3800, 4800/E4900, and 4810 servers: one per possible domain and one for each system controller
  The server and component power-on hours
Sun Services
Sun Services
MAC address             Sun Fire 3800, 4800/E4900, 4810    Sun Fire 6800/E6900
Base MAC address        Domain A                           Domain A
Base MAC address plus 1 Domain B                           Domain B
Base MAC address plus 2 SC0                                Domain C
Base MAC address plus 3 SC1                                Domain D
Base MAC address plus 4 N/A                                SC0
Base MAC address plus 5 N/A                                SC1
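The offset scheme in this table can be sketched in Python. This is an illustration only, not a Sun utility: the function name `mac_assignments`, the `six_mac` flag, and the example base address are hypothetical, and only the offsets come from the table.

```python
def mac_assignments(base_mac: str, six_mac: bool) -> dict:
    """Map each owner to its MAC address using the ID board offsets.

    six_mac=True models the six-address layout (Sun Fire 6800/E6900);
    six_mac=False models the four-address layout (3800, 4800/E4900, 4810).
    """
    if six_mac:
        owners = ["Domain A", "Domain B", "Domain C", "Domain D", "SC0", "SC1"]
    else:
        owners = ["Domain A", "Domain B", "SC0", "SC1"]

    base = int(base_mac.replace(":", ""), 16)
    result = {}
    for offset, owner in enumerate(owners):
        raw = f"{base + offset:012x}"  # 12 hex digits, zero padded
        result[owner] = ":".join(raw[i:i + 2] for i in range(0, 12, 2))
    return result


# Hypothetical base address, chosen only for the example.
for owner, mac in mac_assignments("08:00:20:a1:b2:00", six_mac=True).items():
    print(owner, mac)
```

With the four-address layout, the system controllers take offsets 2 and 3; with the six-address layout, they move to offsets 4 and 5.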
Sun Services
AC Power Distribution
All Sun Fire midrange servers installed in a data center cabinet are equipped with the following AC components: The redundant transfer unit (RTU) The redundant transfer switch (RTS) An AC input box
Sun Services
AC Power Distribution
Rack Fan Trays (2) 220VAC RTS RTU RTS 220VAC To AC input box for all servers except the Sun Fire 3800 servers. To power supplies for the Sun Fire 3800 servers only.
Sun Services
AC input box
Sun Services
AC input boxes
RTU
RTS Front
Rear
Module 2, slide 70 of 112
Sun Services
Sun Services
LEDs
J12 J8 J10
Switched
J7 J9
Sun Services
LED       Color    State       Meaning
Left      Green    On          The source is present and is okay.
                   Off         The source is not present or is lower than the specification.
                   Flashing    The source is out of the specification.
Middle    Green    On          The relay is energized and connected to the outlets.
                   Off         The relay is de-energized and docked.
                   Flashing    The relay is de-energized and undocked.
Right     Amber    On          The module has a fault.
                   Off         The module does not have a fault.
Sun Services
AC Input Box
For the Sun Fire 6800/E6900 systems, the AC input box receives power from the RTU through two power cables, each with a corresponding power switch. For the Sun Fire 4810 and 4800/E4900 systems, the AC input box receives power from the RTU through three power cables, each with a corresponding power switch. Sun Fire 3800 systems do not use AC input boxes.
Sun Services
DC Power Distribution
Sun Fire midrange DC power distribution systems comprise the following major components:
  System centerplane
  Power centerplane
  Fan centerplane
  DC power supplies
Sun Services
Main 48VDC
Auxiliary 3V3
Auxiliary 48VDC
48VDC_IL
Auxiliary Primary 200-240VAC Secondary 200-240VAC RTS RTU RTS Feed A (2) AC Input Box Main 48VDC Standby 12VDC
Sun Services
Rack Fan Trays (2) Primary 200-240VAC Secondary 200-240VAC 220VAC RTS RTU RTS
DC-DC Board Converters (system, I/O, and Sun Fireplane switch boards) Main 56VDC
DC-DC Board Converters (system controller and ID boards) Auxiliary 56VDC To fans (4) Main 56VDC
Sun Services
DC-DC Board Converters (system, I/O, and Sun Fireplane switch boards) Main 56VDC
DC-DC Board Converters (system controller and ID boards) Auxiliary 56VDC To fans (3) Main 56VDC
Sun Services
Rack Fan Trays (2) Primary 200-240VAC Secondary 200-240VAC 220VAC RTS
Sun Services
Fan centerplane
Sun Services
Sun Services
DC Power Supplies
Power Supply Specifications
Server                  Slot Number           Voltage
Sun Fire V1280/E2900    PS0, PS1, PS2, PS3    48VDC
Sun Fire 3800           PS0, PS1, PS2         56VDC
Sun Fire 4800/E4900     PS0, PS1, PS2         56VDC
Sun Fire 4810           PS0, PS1, PS2         56VDC
Sun Fire 6800/E6900     PS0-PS5               56VDC
Sun Services
Sun Services
PS2
PS1
PS0
Sun Services
Sun Services
PS1
PS0
PS2
Sun Services
PS3
PS4 PS5
Grid 0
Grid 1
PS0
PS1 PS2
Sun Services
Sun Services
Fan tray
Sun Services
FT0
FT1
FT2
FT3
Sun Services
FT0
FT2
Sun Services
FT0
FT2 FT1
Sun Services
FT1 FT3
FT0 FT2
Front
Rear
Module 2, slide 93 of 112
Sun Services
Sun Services
Sun Services
Keyswitch
Sun Services
Sun Services
To domain
Sun Services
Disk (ID1)
Disk (ID0)
Sun Services
Hard Drives
Two Four
Tape Drives
Two None One None
DVD-ROM Drives
None None One Two
Mixed hard drives, tape drive, and Two DVD-ROM drive Two
Sun Services
To domain
To domain
Sun Services
Disk (ID0)
Disk (ID0)
Sun Services
Sun Services
Status
The power supply is inserted and cabled on, normal.
Both LEDs are off The power supply is absent, or the power cords are not connected. System fault is amber The power supplies have failed, the fan has failed, or the system is running from a single power supply.
Sun Services
Sun Services
Normal Fault Fault Indication (All Power (Good Power (Bad Power Supplies) Supply) Supply)
Green Amber Blue Green On Off On On On Off Off On Off On On Off
Sun Services
Sun Services
Module 3
Platform Assessment and Management
Sun Services
Objectives
  Describe the Sun Fire V1280/E2900 server platform administrative functions
  Perform user maintenance and administrative functions using the LOM shell
  Display and change system controller parameters by using LOM shell commands
  Monitor the Sun Fire V1280/E2900 server platform by using LOM shell commands
  Power cycle the Sun Fire V1280/E2900 server with LOM shell commands and with the front panel power rocker switch
  Identify the administrative and service tasks that you can perform with the platform shell
  Identify the platform shell commands you can use to display system information, set up system parameters, and test system hardware
  Describe three methods you can use to connect to the system controller shells
  Describe how to navigate between shells on the Sun Fire 3800, 4800/E4900, 4810, and 6800/E6900 servers
  Configure the platform and system controller by using the platform shell
  Display and change system controller parameters by using the platform shell
  Describe how to manage the system controller for reboot and failover operations
  Describe how to use system controller commands to monitor platforms and domains
  Describe how to power on and off the system components
  Describe how to update the platform firmware
  Describe the capability and effects of splitting the Sun Fire 3800, 4800/E4900, 4810, and 6800/E6900 servers into segments and domains
  Describe how to start, stop, and power cycle domains
  Describe how to perform device configuration with OpenBoot programmable read-only memory (PROM) to add or remove devices from a Sun Fire midrange server
  Describe OpenBoot PROM capabilities
  Describe how to use OpenBoot PROM to represent interconnected buses and their devices in a device tree
  Describe how to map Sun Fire midrange server physical devices
Sun Services
Relevance
Which functions does the system controller perform? How do the system controller maintenance buses communicate with the platform? Which role does the platform shell play in configuring the Sun Fire midrange servers? Which commands are available in the platform shell? How is each platform shell command used to configure the Sun Fire midrange server platform?
Sun Services
Sun Services
Sun Services
LOM Shell
The LOM shell that runs on the system controller has been significantly modified from versions on earlier Netra server platforms. Commands that were originally developed for the Sun Fire 3800, 4800, 4810, and 6800 server domain shell have been adopted.
8M 8M 8M 8M
Sun Services
Sun Services
12 12 12 12 12
Sun Services
POST      Description
----      -----------
untest    empty
untest    empty
pass      UltraSPARC-III,
pass      UltraSPARC-III,
untest    empty
untest    empty
untest    empty
untest    empty
untest    empty
untest    empty
untest    empty
untest    empty
pass      512M DRAM
pass      512M DRAM
Power-Cycling the Sun Fire V1280/E2900 Server Using the Power Rocker Switch
Sun Services
Power-Cycling the Sun Fire V1280/E2900 Server Using the Power Rocker Switch
The switch is operational only if it has not been disabled with the setupsc LOM command. To determine whether the rocker switch is enabled or disabled, run the setupsc command.
lom> setupsc
System Controller Configuration
-------------------------------
SC POST diag Level [off]:
Host Watchdog [enabled]:
Rocker Switch [enabled]:
Secure Mode [off]:
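If you capture the setupsc prompts in a console log, a short script can report the current defaults. The following Python sketch is a hedged illustration: the function names `parse_setupsc` and `rocker_switch_enabled` are hypothetical, and the script assumes each prompt has the form "Name [value]:" as in the transcript shown here.

```python
import re

# Sample text copied from the setupsc session in this module.
SAMPLE = """\
lom> setupsc
System Controller Configuration
-------------------------------
SC POST diag Level [off]:
Host Watchdog [enabled]:
Rocker Switch [enabled]:
Secure Mode [off]:
"""


def parse_setupsc(text: str) -> dict:
    """Return {setting name: current default} for each 'Name [value]:' prompt."""
    settings = {}
    for line in text.splitlines():
        match = re.match(r"^(.+?)\s*\[(.*?)\]:\s*$", line)
        if match:
            settings[match.group(1).strip()] = match.group(2)
    return settings


def rocker_switch_enabled(text: str) -> bool:
    """True when the captured prompts show the rocker switch as enabled."""
    return parse_setupsc(text).get("Rocker Switch") == "enabled"


print(parse_setupsc(SAMPLE))
print("Rocker switch enabled:", rocker_switch_enabled(SAMPLE))
```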
Sun Services
Power-Cycle Operations
The following power-cycle operations are available when you enable the system indicator board rocker switch:
  If the system is in standby mode, pressing the switch powers on the system. This action is equivalent to executing the LOM poweron command.
  If the system is running the Solaris OS, pressing the switch for less than four seconds executes an orderly shutdown. This action is equivalent to executing the LOM shutdown command.
  If the system is powered on, pressing the switch for more than four seconds powers the system down to standby mode. This action is equivalent to executing the LOM poweroff command.
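The rocker switch behavior described here can be modeled as a small decision function. This Python sketch is an assumption-laden illustration, not firmware logic: the state names ("standby", "solaris", "on"), the function name `rocker_switch_action`, and the treatment of a press of four seconds or less as a short press are all hypothetical choices made for the example.

```python
def rocker_switch_action(state: str, hold_seconds: float, enabled: bool = True) -> str:
    """Return the equivalent LOM command for a rocker switch press.

    state: "standby", "solaris" (powered on, Solaris OS running),
           or "on" (powered on, Solaris OS not running).
    Returns "poweron", "shutdown", "poweroff", or "none".
    """
    if not enabled:
        return "none"          # switch disabled through setupsc
    if state == "standby":
        return "poweron"       # any press powers on from standby
    if hold_seconds > 4:
        return "poweroff"      # long press: power down to standby
    if state == "solaris":
        return "shutdown"      # short press: orderly Solaris shutdown
    return "none"


print(rocker_switch_action("standby", 1))   # poweron
print(rocker_switch_action("solaris", 2))   # shutdown
print(rocker_switch_action("solaris", 6))   # poweroff
```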
Sun Services
Sun Fire 3800, 4800/E4900, 4810, and 6800/E6900 Server Platform Assessment and Management
Using the platform shell, you can: Configure the system controller network parameters Configure platform-wide parameters Configure segments and domains Monitor platform environments Display hardware configuration information Power on and power off the system and system components
Module 3, slide 46 of 132
Navigating Between Shells on the Sun Fire 3800, 4800/E4900, 4810, and 6800/E6900 Servers
Solaris Operating System
  Telnet connection: Press CTRL ], then at the telnet> prompt type: send break
  SSH connection: #.
  Tip connection: ~.
OpenBoot PROM
  Telnet connection: Press CTRL ], then at the telnet> prompt type: send break
  SSH connection: #.
  Tip connection: ~.
Type: break
Platform Shell
Failover
The system controller failover event is logged in the platform message log file, which you can view on the console of the new main system controller or with the showlogs command on the system controller.
Platform Shell - Spare System Controller
sp4-sc0:sc>
Nov 12 01:15:42 sp4-sc0 Platform.SC: SC Failover: enabled and active.
Nov 12 01:16:42 sp4-sc0 Platform.SC: SC Failover: no heartbeat detected from the Main SC
Nov 12 01:16:42 sp4-sc0 Platform.SC: SC Failover: becoming main SC
Nov 12 01:16:49 sp4-sc0 Platform.SC: Chassis is in single partition mode.
Nov 12 01:17:04 sp4-sc0 Platform.SC: Main System Controller
Nov 12 01:17:04 sp4-sc0 Platform.SC: SC Failover: disabled
sp4-sc1:SC>
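When you review a saved platform log, it can help to pull out only the failover messages. The following Python sketch shows one way to do that. It assumes the line layout seen in the sample above (timestamp, host name, facility, message); the regular expression and the function name `failover_events` are hypothetical and are not part of any Sun tool.

```python
import re

# Log lines copied from the failover example in this module.
LOG = """\
Nov 12 01:15:42 sp4-sc0 Platform.SC: SC Failover: enabled and active.
Nov 12 01:16:42 sp4-sc0 Platform.SC: SC Failover: no heartbeat detected from the Main SC
Nov 12 01:16:42 sp4-sc0 Platform.SC: SC Failover: becoming main SC
Nov 12 01:16:49 sp4-sc0 Platform.SC: Chassis is in single partition mode.
Nov 12 01:17:04 sp4-sc0 Platform.SC: Main System Controller
Nov 12 01:17:04 sp4-sc0 Platform.SC: SC Failover: disabled
"""

# timestamp, host, facility, message
LINE = re.compile(r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}) (\S+) (\S+): (.*)$")
PREFIX = "SC Failover:"


def failover_events(text: str) -> list:
    """Return (timestamp, host, detail) for each 'SC Failover:' message."""
    events = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match and match.group(4).startswith(PREFIX):
            detail = match.group(4)[len(PREFIX):].strip()
            events.append((match.group(1), match.group(2), detail))
    return events


for timestamp, host, detail in failover_events(LOG):
    print(timestamp, host, detail)
```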
You can force system controller failover by using the setfailover force command:
schostname:SC> setfailover force
SC: SSC0
Spare System Controller
SC Failover: enabled and active.
Clock failover enabled.
Platform Assessment
You can use system controller commands to monitor the platform and domains. These commands include:
  showplatform
  showboards
  showcomponent
  showsc
  showenvironment
  history
  connections
  showlogs
  showfru
System Serial Number: 105H25AA
Loghosts
--------
Loghost for Platform: 10.6.5.120
Log Facility for Platform: local0
Sun Services
Sun Services
SC
--
SC POST diag Level: min
SC Failover: disabled
Logical Hostname:
Security Options
----------------
Telnet servers: Enabled
Idle connection timeout : No timeout
8M 8M 8M 8M
Sun Services
Sun Services
12 12 12 12 12
Sun Services
Sun Services
POST      Description
----      -----------
untest    empty
untest    empty
pass      UltraSPARC-III,
pass      UltraSPARC-III,
untest    empty
untest    empty
untest    empty
untest    empty
untest    empty
untest    empty
untest    empty
untest    empty
pass      512M DRAM
pass      512M DRAM
Platform Hardware
Sun Services
Segments
A segment refers to all or part of the Sun Fireplane interconnect. Dual-partition (segment) mode splits the Sun Fireplane interconnect into two independent snoopy coherent systems. The Sun Fireplane switch boards are divided between the two segments. All connections between segments are disabled. To enable dual-segment mode, run:
schostname:SC> setupplatform -p partition
Configure chassis for single or dual-partition mode? [single]: dual
Sun Services
Domains
A domain is a logical division of a segment. Each domain has an independent instance of the Solaris OS. Each segment can have a maximum of two domains. Domains are useful for testing new applications or operating system updates. Temporary resources can be borrowed from existing domains. Upon completion, resources can be returned. System reboot is not required.
Sun Services
[Figure: Dual-segment configuration. Segment 0 (RP0/RP1): SB0, SB2, SB4, IB6, IB8, Domain A and Domain B. Segment 1 (RP2/RP3): SB1, SB3, SB5, IB7, IB9, Domain C and Domain D.]
Sun Services
Configuration                 Domain IDs
One segment, one domain       A
One segment, two domains      A, B
Two segments, two domains     A, C
Sun Services
Configuration                   Domain IDs
One segment, one domain         A
One segment, two domains        A, B
Two segments, two domains       A, C or A, D or B, C or B, D
Two segments, three domains     A, B, C or A, B, D or A, C, D or B, C, D
Two segments, four domains      A, B, C, D
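The two-segment rows of this table follow a simple rule: every valid set of domains uses at least one domain from each segment. The Python sketch below reproduces those rows under the assumption, taken from the segment figure in this module, that segment 0 hosts domains A and B and segment 1 hosts domains C and D. The names `SEGMENTS` and `two_segment_domain_sets` are hypothetical.

```python
from itertools import combinations

# Assumption: segment 0 hosts domains A and B; segment 1 hosts C and D.
SEGMENTS = {0: ("A", "B"), 1: ("C", "D")}


def two_segment_domain_sets(domain_count: int) -> list:
    """List the domain ID sets that use both segments for a given domain count."""
    all_domains = SEGMENTS[0] + SEGMENTS[1]
    valid = []
    for combo in combinations(all_domains, domain_count):
        uses_segment0 = any(d in SEGMENTS[0] for d in combo)
        uses_segment1 = any(d in SEGMENTS[1] for d in combo)
        if uses_segment0 and uses_segment1:
            valid.append(", ".join(combo))
    return valid


for count in (2, 3, 4):
    print(count, "domains:", " or ".join(two_segment_domain_sets(count)))
```

The sketch covers only dual-partition mode. The single-segment rows are taken directly from the table and are not derived here.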
Sun Services
1 1 2 2
* You need to understand performance and availability trade-offs before choosing this configuration.
Sun Services
1 1 2 2
* You need to understand performance and availability trade-offs before choosing this configuration.
Sun Services
Sun Services
Configuring ACLs
Configure ACLs by using the setupplatform -p acls command. Type:
schostname:SC> setupplatform -p acls
ACL for domain A [SB0 SB1 SB2 SB3 SB4 SB5 IB6 IB7 IB8 IB9]: -r SB1 SB3 SB5 IB7 IB9
ACL for domain B [SB0 SB1 SB2 SB3 SB4 SB5 IB6 IB7 IB8 IB9]:
ACL for domain C [SB0 SB1 SB2 SB3 SB4 SB5 IB6 IB7 IB8 IB9]: -r SB0 SB2 SB4 SB5 IB6 IB8 IB9
ACL for domain D [SB0 SB1 SB2 SB3 SB4 SB5 IB6 IB7 IB8 IB9]: -r SB0 SB1 SB2 SB4 IB6 IB7 IB8
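The effect of each -r entry can be checked with a short calculation. The Python sketch below assumes that -r removes the listed boards from the domain's current ACL and that an empty entry leaves the ACL unchanged; that reading is inferred from the session above, and the function name `apply_acl_removal` is hypothetical.

```python
# Default ACL shown in brackets at each prompt.
ALL_BOARDS = ["SB0", "SB1", "SB2", "SB3", "SB4", "SB5", "IB6", "IB7", "IB8", "IB9"]


def apply_acl_removal(current: list, entry: str) -> list:
    """Return the ACL after an '-r board ...' entry (assumed to mean remove)."""
    tokens = entry.split()
    if not tokens:
        return list(current)   # empty reply: ACL unchanged (assumption)
    if tokens[0] != "-r":
        raise ValueError("only the -r form is modeled in this sketch")
    to_remove = set(tokens[1:])
    return [board for board in current if board not in to_remove]


entries = {
    "A": "-r SB1 SB3 SB5 IB7 IB9",
    "B": "",
    "C": "-r SB0 SB2 SB4 SB5 IB6 IB8 IB9",
    "D": "-r SB0 SB1 SB2 SB4 IB6 IB7 IB8",
}
for domain, entry in entries.items():
    print(domain + ":", " ".join(apply_acl_removal(ALL_BOARDS, entry)))
```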
Sun Services
Viewing ACLs
Display the current ACLs by using the showplatform -p acls command. Type:
schostname:SC> showplatform -p acls
A: SB0 SB2 SB4 IB6 IB8
B:
C: SB1 SB3 IB7
D: SB3 SB5 IB9
Domain Shell
Platform Shell
Sun Services
Device Tree
Each device node can have the following components:
  Properties: Data structures describing the node and its associated device.
  Methods: The software procedures used to access the device.
  Data: The initial values of the private data used by the methods.
  Children: Other device nodes attached to a given node and that lie directly below it in the device tree.
  Parent: The node that lies directly above a given node in the device tree.
Sun Services
PCI Slot 0
Gigabit Ethernet 0
PCI Slot 1
PCI Slot 2
PCI Slot 3
PCI Slot 5
DVD Drive
Gigabit Ethernet 1
PCI Slot 4
Sun Services
24
Memory-controller
24
Device
Device
Location    P0 AID       P1 AID       P2 AID       P3 AID
SB0         0 (0x0)      1 (0x1)      2 (0x2)      3 (0x3)
SB1         4 (0x4)      5 (0x5)      6 (0x6)      7 (0x7)
SB2         8 (0x8)      9 (0x9)      10 (0xa)     11 (0xb)
SB3         12 (0xc)     13 (0xd)     14 (0xe)     15 (0xf)
SB4         16 (0x10)    17 (0x11)    18 (0x12)    19 (0x13)
SB5         20 (0x14)    21 (0x15)    22 (0x16)    23 (0x17)
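The values in this table follow a regular pattern: the agent ID (AID) equals four times the system board number plus the processor number. The Python sketch below encodes that pattern. The formula is inferred from the table values rather than stated in the source, and the function names `cpu_agent_id` and `locate_cpu` are hypothetical.

```python
def cpu_agent_id(board: int, processor: int) -> int:
    """Return the CPU agent ID for system board SB<board>, processor P<processor>."""
    if not 0 <= board <= 5:
        raise ValueError("system board number must be 0 through 5")
    if not 0 <= processor <= 3:
        raise ValueError("processor number must be 0 through 3")
    return board * 4 + processor


def locate_cpu(aid: int) -> tuple:
    """Return (board name, processor name) for a CPU agent ID of 0 through 23."""
    if not 0 <= aid <= 23:
        raise ValueError("CPU agent ID must be 0 through 23")
    board, processor = divmod(aid, 4)
    return f"SB{board}", f"P{processor}"


aid = cpu_agent_id(3, 1)
print(aid, hex(aid))        # SB3, P1
print(locate_cpu(0x16))     # agent ID 22
```

The reverse lookup is useful when an error message reports only an agent ID and you need to find the board and processor it refers to.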
Sun Services
Sun Services
/ssm@0,0/pci@19,700000/pci@3/SUNW,isptwo@4/sd@5,0
ssm@0,0          Node ID
pci@19           IOC AID
,700000          Bus offset
pci@3            Device #
SUNW,isptwo@4    PCI controller
sd@5,0           Device instance
Sun Services
You can calculate the IOC AID by performing the following steps: 1. Convert the IOC AID from hexadecimal to decimal. For example: 19 (hexadecimal) = 25 (decimal)
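The hexadecimal-to-decimal conversion in step 1 can be combined with a simple split of the device path. The Python sketch below is a minimal illustration that assumes a five-component path laid out like the example in this module (node, PCI bridge with IOC AID and bus offset, device number, controller, device instance). The function name `parse_device_path` and the dictionary keys are hypothetical.

```python
def parse_device_path(path: str) -> dict:
    """Split a five-component device path into its labeled fields."""
    parts = path.strip("/").split("/")
    if len(parts) != 5:
        raise ValueError("this sketch expects exactly five path components")
    node, ioc, device, controller, instance = parts

    # "pci@19,700000" -> IOC AID "19" (hexadecimal) and bus offset "700000"
    ioc_aid_hex, bus_offset = ioc.split("@")[1].split(",")

    return {
        "node_id": node.split("@")[1],
        "ioc_aid_hex": ioc_aid_hex,
        "ioc_aid": int(ioc_aid_hex, 16),              # step 1: hex to decimal
        "bus_offset": bus_offset,
        "device_number": int(device.split("@")[1], 16),
        "pci_controller": controller,
        "device_instance": instance,
    }


info = parse_device_path("/ssm@0,0/pci@19,700000/pci@3/SUNW,isptwo@4/sd@5,0")
print(info["ioc_aid_hex"], "(hexadecimal) =", info["ioc_aid"], "(decimal)")
print("Device #:", info["device_number"])
```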
Sun Services
Sun Services
Location
IB6 IB7 IB8 IB9
Sun Services
Sun Services
Device Number
The PCI controller slots, located in the PCI (cPCI) chassis, are referenced by the device number. Device number:
/ssm@0,0/pci@19,700000/pci@3.......
Device #
Sun Services
network@1
scsi@2
network@2
ide@3
Hard Drive Target 0 disk@0,0 Hard Drive Target 1 disk@1,0 Tape Drive Target 5 st@5,0
DVD-ROM sd0,0
Sun Services
Sun Fire 4800/E4900, 4810, and 6800/E6900 Server Eight-Slot PCI Chassis
IOC 0 pci@18,600000 B A pci@18,700000 IOC 1 pci@19,600000 B pci@19,700000 A
pci@1 (Slot 0) pci@2 (Slot 1) pci@3 (Slot 2) pci@1 (Slot 3) pci@1 (Slot 4) pci@2 (Slot 5) pci@3 (Slot 6) pci@1 (Slot 7)
Sun Services
pci@1 (Slot 0) pci@1 (Slot 1) pci@1 (Slot 2) pci@2 (Slot 3) pci@1 (Slot 4) pci@2 (Slot 5)
Sun Services
Sun Fire 4800/E4900, 4810, and 6800/E6900 Server Four-Slot cPCI Chassis
IOC 0 pci@18,700000 A B pci@18,600000 IOC 1 pci@19,700000 A pci@19,600000 B
Sun Services
Module 4
Troubleshooting the Sun Fire Midrange Servers
Sun Services
Objectives
  Describe the basic architecture of the Sun Fire server system
  Describe the two levels of Sun Fireplane interconnect switches
  Describe how the system boards provide CPU and memory resources to the operating system in Sun Fire midrange servers
  Describe how Sun Fire midrange servers use PCI and cPCI I/O assemblies
  Describe how the Sun Fireplane interconnect plane is the main system bus of the Sun Fire family of servers
  Describe the different integrated service processors supported by the Sun Fire midrange server architecture
  Describe the troubleshooting methodology for fault analysis and diagnosis of failed components
  Describe the system tools available for gathering background information on Sun Fire midrange server problems
  Describe the testing tools available for isolating faults in the Sun Fire midrange servers
  Describe how Sun Fire midrange servers use parity to detect system interconnect errors
  Describe how Sun Fire midrange server subsystems can use error correcting code (ECC) to recover from errors
  Describe how console port error messages are reported to help isolate faulty components in the console bus hub (CBH)
  Describe how Sun Fire midrange server environmental faults are reported
  Describe the enhanced availability features implemented in the new firmware update 5.15.3
  Describe how blacklisting is used to reconfigure Sun Fire midrange server hardware to avoid parts with errors
  Describe how domain shell operating messages provide an aid to troubleshooting a system error in Sun Fire midrange servers
  Recover from a hung domain
Sun Services
Relevance
Which diagnostic tools are available to test the Sun Fire servers? How do you free a hung domain? How do you create an action plan to replace failed FRUs?
Sun Services
Additional Resources
Sun Microsystems, Inc. Sun Fire 6800/4810/4800/3800 Systems Platform Administration Manual, part number 817-0999.
Sun Microsystems, Inc. Sun Fire 6800/4810/4800/3800 System Controller Command Reference Manual, part number 817-1000.
Sun Microsystems, Inc. Sun Fire V1280/Netra 1280 System Administration Guide, part number 817-0509.
Sun Microsystems, Inc. Sun Fire V1280/Netra 1280 Systems Service Manual, part number 817-0510.
Sun Microsystems, Inc. Sun Fire V1280/Netra 1280 System Controller Command Reference Manual, part number 817-0511.
Sun Microsystems, Inc. Sun Fire Midrange Systems Hardware Reference Manual, part number 805-7363.
Sun Microsystems, Inc. Sun Fire 6800/4800/4810/3800 Systems Site Planning Guide, part number 805-7365.
URL Resources (public web sites):
  http://sunsolve.sun.com/handbook_pub/
  http://www.sun.com/software/solaris/sunmanagementcenter/hwds/
  http://docs.sun.com
  http://www.sun.com/blueprints/0803/817-3342.pdf
Sun Services
Operational Overview
Sun Fire midrange server architecture includes: New system bus architecture based on the Sun Fireplane interconnect High-performance Sun Fireplane interconnect switch technology An enhanced CPU architecture starting at speeds of 750 MHz and greater An industry standard I/O incorporating PCI and cPCI technology
[Figure: block diagram showing IOCs, CPUs, DCDS switches, and memory]
Functionality
The Sun Fireplane interconnect provides a 288-bit data path between the UltraSPARC III processors and the PCI I/O bridge (IOC) with a high clock frequency of 150 MHz. The connection between the Sun Fireplane interconnect devices (UltraSPARC III processors and PCI and enhanced PCI [EPCI] bridges) and the data path uses point-to-point connections. The UltraSPARC III processors are interfaced to the data path using the dual CPU data switch (DCDS).
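As a rough worked example of what these figures imply, the sketch below computes a peak transfer rate as payload width times clock frequency. It assumes that 256 of the 288 bits carry data (with the remainder used for ECC) and that one transfer completes per clock cycle; neither assumption is stated in the text above, and the function name is illustrative only.

```python
def peak_bandwidth_gb_per_s(payload_bits: int, clock_mhz: float) -> float:
    """Peak rate in GB/s (10**9 bytes/s), assuming one transfer per clock."""
    bytes_per_transfer = payload_bits / 8
    return bytes_per_transfer * clock_mhz * 1e6 / 1e9


# Assumption: 256 data bits out of the 288-bit path; the rest is ECC.
fireplane_peak = peak_bandwidth_gb_per_s(payload_bits=256, clock_mhz=150)
print(f"{fireplane_peak:.1f} GB/s")
```

Under those assumptions the result is a theoretical ceiling, not a measured throughput.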
[Figure: address (AR) and data (DX) paths connecting the system boards and the I/O boards, including the PCI IOC]
[Figure: board positions, showing six system boards (SB) and I/O assemblies IB6, IB7, IB8, and IB9]
[Figure: UltraSPARC IV processor; labels include E-cache tags, Core 1 (US-III Cu), MCU, SDRAM memory, a data path of 128 data bits plus 9 ECC bits and 7 MTag bits at 150 MHz, and the Safari bus at 150 MHz]
[Figure: system board block diagram; labels include CPU 0 through CPU 3, DCDS, DX, SDC, SBBC0 and SBBC1, SRAM, FPROM, DIMMs, and the SC0 and SC1 connections]
[Figure: I/O assembly block diagram; labels include maintenance bus devices, address repeater, data controller, data switch, IOC 0 and IOC 1 (buses A and B), PCI slots, dual-channel SCSI controller with internal and external SCSI connections, SBBC, SRAM, FPROM, and the DVD connection]
[Figure: I/O assembly diagrams; labels include IOC 0 and IOC 1 (buses A and B), SDC, DX, SBBC, SRAM, FPROM, and the SC0 and SC1 connections. Legend: data, address, data route, bootbus, console bus, control signals, PCI bus]
[Figure: Level 1 (board) address path, showing address repeaters, processors, and memory]
[Figure: Level 1 (board) data path, showing data switches, processors, memory, PCI controllers, and PCI cards; bandwidth labels in the figure are 4.8 GB/s, 2.4 GB/s, 1.2 GB/s, 0.4 GB/s, and 0.2 GB/s]
[Figure: system controller block diagram; labels include SBBC, SEEPROM, SRAM, miscellaneous registers, TOD NVRAM, ScApp FPROM, boot FPROM, clocks, RIO, DRAM, MicroSPARC IIep, and 10/100BASE-T Ethernet on the rear panel. Note: Not all console and I2C buses are used.]
Sun Fire 3800, 4800/E4900, 4810, and 6800/E6900 Server System Controller Block Diagram
[Figure: labels include centerplane reset and error lines, 14 console buses, serial link to the other SC, clocks, global and local I2C buses, I2C multiplexers, SBBC, SRAM, miscellaneous registers, TOD NVRAM, ScApp FPROM, two 16552 serial controllers (SC serial and TTY), clock control, temperature and voltage sensors, SEEPROM, RIO, Ebus, the SC processor (MicroSPARC IIep), DRAM, 10/100BASE-T Ethernet, and TTYA and TTYB]
[Figure: SBBC connections; labels include PCI bus to the PCI controller, console bus, SRAM, PROM bus to the FPROM, I2C buses, sensors, LEDs, JTAG, and links to the datapath controller and the processors]
[Figure: labels include AR, SBBC0, SBBC1 (SB only), and 150 MHz]
Loghost
showlogs (domain shell)
The platform or domain shell messages can be diverted to an external loghost by entering the loghost IP address or host name when using the setupplatform or setupdomain commands, respectively.
show-post-results (OpenBoot PROM)
[Figure: LED status indicators, including the FrameManager (used on the Sun Fire 3800, 4810, and 6800 server chassis), SC1, and the cPCI slots]
lom> setupsc
System Controller Configuration
------------------------------
SC POST diag Level [max]:
Host Watchdog [enabled]: enabled
Rocker Switch [disabled]: disabled
Secure Mode [off]: on
POST on the Sun Fire 3800, 4800/E4900, 4810, and 6800/E6900 Servers
POST executes to check the operational capability of the hardware.
Two types:
SCPOST: System controller POST
SPOST: System POST
    LPOST: Local POST
    IOPOST: I/O POST
Example of an error output during LPOST:
r24-13a:A> setkeyswitch on
.
.
Oct 05 05:22:38 r24-13a Chassis-Port.SC: Chassis.pass2ICT: Slot 2 Dx 3 Stuck at testing failed
r24-13a:A> showlogs
.error] Interconnect test: Board 5 address repeater connection to RP0 failed
Oct 17 14:37:47 r24-13a Domain-A.SC: [ID 788592 local0.error] Bit in error: L2_ADDR[29]
Oct 17 14:37:47 r24-13a Domain-A.SC: [ID 668033 local0.error] Bit in error: L2_ADDR[28]
Oct 17 14:37:47 r24-13a Domain-A.SC: [ID 547474 local0.error] Bit in error: L2_ADDR[27]
Oct 17 14:37:47 r24-13a Domain-A.SC: [ID 306356 local0.error] Bit in error: L2_ADDR[25]
Oct 17 14:37:47 r24-13a Domain-A.SC: [ID 185797 local0.error] Bit in error: L2_ADDR[24]
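When showlogs output has been captured to a file, a small script can pull out the failing signal names for an action plan. The following is a minimal sketch: the field layout is inferred only from the sample lines above and is not an official format, and the names LogEntry, parse_line, and bits_in_error are hypothetical.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Layout inferred from the sample output only; treat it as an assumption.
LINE_RE = re.compile(
    r"^(?P<timestamp>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<source>\S+): "
    r"(?:\[ID (?P<msg_id>\d+) (?P<facility>\w+)\.(?P<level>\w+)\] )?"
    r"(?P<message>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    host: str
    source: str
    msg_id: Optional[str]
    level: Optional[str]
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Return a LogEntry, or None for lines that do not fit the layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return LogEntry(match["timestamp"], match["host"], match["source"],
                    match["msg_id"], match["level"], match["message"])


def bits_in_error(lines: Iterable[str]) -> List[str]:
    """Collect the signal names reported in 'Bit in error:' messages."""
    prefix = "Bit in error: "
    entries = (parse_line(line) for line in lines)
    return [e.message[len(prefix):] for e in entries
            if e is not None and e.message.startswith(prefix)]


SAMPLE = [
    ".error] Interconnect test: Board 5 address repeater connection to RP0 failed",
    "Oct 17 14:37:47 r24-13a Domain-A.SC: [ID 788592 local0.error] Bit in error: L2_ADDR[29]",
    "Oct 17 14:37:47 r24-13a Domain-A.SC: [ID 668033 local0.error] Bit in error: L2_ADDR[28]",
]
print(bits_in_error(SAMPLE))
```

Lines that are truncated, such as the first sample line, are skipped rather than guessed at.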
Description
----------------------
UltraSparc III, 750MHz, 4M ECache
UltraSparc III, 750MHz, 4M ECache
UltraSparc III, 750MHz, 4M ECache
UltraSparc III, 750MHz, 4M ECache
256M DRAM
256M DRAM
empty
empty
256M DRAM
256M DRAM
empty
empty
[Figure: error detection on the data path between the system boards and the centerplane; labels include CPU, Ecache, memory, DCDS, L1DX, PCI IOC, parity, data, ES, and ED]
Environmental Errors
Environmental errors are usually caused by faulty (or blocked) fan trays or power supplies.
Sun-Fire-sc0:SC> showlogs
Dec 12 08:31:00 Sun-Fire-sc0 Chassis-Port.SC: Domain A has a SYSTEM ERROR
Dec 12 08:31:07 Sun-Fire-sc0 Chassis-Port.SC: This domain is still running because error pause is not enabled for this domain
Dec 12 08:31:18 Sun-Fire-sc0 Chassis-Port.SC: Device temperature problem: /N0/SB5 auto power off may occur due to device: Cheetah 3 Temp. 0 Value: 127 Degrees C
Dec 12 08:31:19 Sun-Fire-sc0 Chassis-Port.SC: Device temperature problem: Shutting down /N0/SB5 due to temperature of device: Cheetah 3 Temp. 0 Value: 127 Degrees C
Dec 12 08:31:19 Sun-Fire-sc0 Chassis-Port.SC: /N0/SB5, sensor status, over limit (7,1,0x201050603030000)
Dec 12 08:32:08 Sun-Fire-sc0 Chassis-Port.SC: ...board successfully powered off.
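To see quickly which board overheated in a long capture, the temperature messages can be summarized per board. This is a sketch under the assumption that every such message follows the wording of the two sample lines above; the pattern and the function names are hypothetical and would need adjusting for other message variants.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Pattern derived from the two sample messages only (an assumption).
TEMP_RE = re.compile(
    r"Device temperature problem: .*?(?P<board>/N\d+/\w+) "
    r".*device: (?P<sensor>.+?) Value: (?P<celsius>\d+) Degrees C"
)


def temperature_events(lines: Iterable[str]) -> List[Tuple[str, str, int]]:
    """Return (board, sensor, degrees C) for each temperature message."""
    events = []
    for line in lines:
        match = TEMP_RE.search(line)
        if match:
            events.append((match["board"], match["sensor"], int(match["celsius"])))
    return events


def max_temperature_by_board(lines: Iterable[str]) -> Dict[str, int]:
    """Highest reported temperature for each board path."""
    hottest: Dict[str, int] = {}
    for board, _sensor, celsius in temperature_events(lines):
        hottest[board] = max(celsius, hottest.get(board, celsius))
    return hottest


LOG = [
    "Dec 12 08:31:00 Sun-Fire-sc0 Chassis-Port.SC: Domain A has a SYSTEM ERROR",
    "Dec 12 08:31:18 Sun-Fire-sc0 Chassis-Port.SC: Device temperature problem: "
    "/N0/SB5 auto power off may occur due to device: Cheetah 3 Temp. 0 Value: 127 Degrees C",
    "Dec 12 08:31:19 Sun-Fire-sc0 Chassis-Port.SC: Device temperature problem: "
    "Shutting down /N0/SB5 due to temperature of device: Cheetah 3 Temp. 0 Value: 127 Degrees C",
]
print(max_temperature_by_board(LOG))
```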
Diagnosis Engines
The following automatic diagnosis engines (DEs) identify and diagnose hardware errors that affect the availability of the system and its domains:
SMS DE
Solaris OS DE
POST DE
Blacklisting Components
Component                                    Component Name
System boards                                SB0, SB1, SB2, SB3, SB4, and SB5
Ports on the system board or I/O assembly    P0, P1, P2, and P3
Memory banks on system boards                B0 and B1
I/O assemblies                               IB6, IB7, IB8, and IB9
Ports on the I/O assembly                    P0 (C0, C1, C2, and C3); P1 (C4, C5, C6, and C7)
I/O cards in the I/O assemblies              C0, C1, C2, C3, C4, C5, C6, and C7
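The component names in the table can be captured in a small lookup, which is handy for checking a name before planning a blacklist change. The sketch below reads the table as meaning that I/O assembly port P0 serves cards C0 through C3 and port P1 serves cards C4 through C7; that reading, and all of the names in the code, are assumptions for illustration and not part of the system controller software.

```python
from typing import Dict, List

# Names taken from the component table.
SYSTEM_BOARDS: List[str] = [f"SB{n}" for n in range(6)]       # SB0..SB5
IO_ASSEMBLIES: List[str] = [f"IB{n}" for n in range(6, 10)]   # IB6..IB9
MEMORY_BANKS: List[str] = ["B0", "B1"]

# Assumed reading of the table: each I/O port fronts four card slots.
IO_PORT_CARDS: Dict[str, List[str]] = {
    "P0": ["C0", "C1", "C2", "C3"],
    "P1": ["C4", "C5", "C6", "C7"],
}


def is_known_board(name: str) -> bool:
    """True if the name is a system board or I/O assembly in the table."""
    return name in SYSTEM_BOARDS or name in IO_ASSEMBLIES


def cards_behind_io_port(port: str) -> List[str]:
    """I/O cards that share the given I/O assembly port."""
    if port not in IO_PORT_CARDS:
        raise ValueError(f"unknown I/O assembly port: {port}")
    return list(IO_PORT_CARDS[port])


print(is_known_board("IB7"), cards_behind_io_port("P1"))
```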
Collecting Data
Apart from a Solaris OS core file, various other pieces of information might provide insight into the failure. Follow these steps to collect this data:
1. Ensure that you have a record of what you did during the previous verification step.
2. If you do not have a hardware error, assess current conditions on the domain:
   Is there any output being printed on the domain console?
   Does the domain console echo characters you type?
   Does the domain respond to the ping command?
   Does the domain respond to the rup command?
3. Record the results of all the preceding tests.
4. Run the following commands from the system controller domain shell, and collect the output in a file:
   showlogs
   showenvironment
   showdomain
5. Run the following commands from the system controller platform shell, and collect the output in a file:
   showsc
   showlogs
   showplatform
   history
6. Run the Sun Explorer software utility to collect the system configuration information.
If the domain is paused because of an error, nothing more can be done once all of the preceding information has been collected. Reboot the domain with the following command in the domain shell:
setkey off ; setkey on
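The command lists in the data collection steps above can be kept in one place so that every capture has the same layout. The following is a minimal sketch: it does not connect to a system controller, and the run callable (whatever serial, telnet, or other session wrapper is in use at a site) is a hypothetical stand-in supplied by the caller.

```python
from typing import Callable, List

# Command lists copied from the data collection steps.
DOMAIN_SHELL_COMMANDS: List[str] = ["showlogs", "showenvironment", "showdomain"]
PLATFORM_SHELL_COMMANDS: List[str] = ["showsc", "showlogs", "showplatform", "history"]

# Hypothetical signature: run(shell_name, command) -> captured output text.
Runner = Callable[[str, str], str]


def collect(shell: str, commands: List[str], run: Runner) -> str:
    """Run each command and label its output with a header line."""
    sections = []
    for command in commands:
        output = run(shell, command)
        sections.append(f"### {shell} shell: {command}\n{output.rstrip()}\n")
    return "\n".join(sections)


def collect_all(run: Runner) -> str:
    """Gather domain shell output first, then platform shell output."""
    return (collect("domain", DOMAIN_SHELL_COMMANDS, run)
            + "\n"
            + collect("platform", PLATFORM_SHELL_COMMANDS, run))


def fake_run(shell: str, command: str) -> str:
    """Stand-in runner for demonstration; returns placeholder text."""
    return f"(output of {command} from the {shell} shell)"


report = collect_all(fake_run)
print(report)
```

Writing the report to a file is left to the caller, because the file location and naming convention are site-specific.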
Obtaining Registers
If you cannot get a core file out of the Solaris OS, there should still be CPU register information to collect. The reset command causes all the CPUs in the target domain to save their registers in a save area in CPU static random access memory (SRAM).
showresetstate -v