
BOSTON

DISTRICT
T. O. I
Hand Book
07/24/06

Disclaimer
This Document is for reference only.
The purpose of this document is to give the SSE a quick
reference to a broad amount of material. It is not intended
to replace the original product manuals, and should not be
used in place of these manuals or substituted for training on
these products.
This can best be used as a tool to get you in the right frame of
mind (product-wise) when preparing to go on a call.
Comments, suggestions, and requests for updated copies should
be sent to:
toi.handbook@east.sun.com

Copies also available at:


http://webhome.east/boston/toi.html

Table of contents
Desktop configurations
Firmware revision number
OBP Escape hatches
nvalias, NVRAMRC
reset Host ID
Boot sequence
Run Levels
Restore Boot Block
E1000/2000 info
E series info
OBP commands
OBP device path breakdown
Device tree listing - desktop
E-450 information
E-10000 information
Blacklist
System Bd power procedure
E10k component numbering
SCSI Array Model 100
Model 200 Array
ssaadm commands
Replace WWN on SSA
A1000 Array
D1000 Array
RSM Disk Tray
A3000/3500 Array
A5000 Array
luxadm commands
Disk replacement in Veritas
A5000 min configuration
A5000 addressing
A5000 Target assignments
RDAC
Raid Overview
Raid Levels
Boot process
Diagnostic commands
Diagnostic Files
Watchdog resets
What to look for on a watchdog reset
Dump Analysis
adb commands
crash commands
kadb
Sunsolve
SunVTS
STORtools
Explorer Scripts
Performance Analysis tools
Backup
ufsdump
ufsrestore
tar
cpio
dd


How to get a core dump on a 2.x server
Dump device bad when saving core on encapsulated root
Uncompressing Files
T300 (purple)
ACT (A Coredump Tool)
Advantages of Splitting a Drive into Multiple File Systems
How to configure a system to run on a network
SEVM - How to recover a primary boot disk
Disable DMP
Memory Scrubber
Display remote App GUI locally
Cluster 2.x
Encapsulating root after using Environmental CD to load O/S
Adding a second network interface
Adding a default gateway
Volume Manager (general info)
FTPing to and from sunsolve
Serengeti 3800, 4800, 6800
mounting CDROM without vold
mailx: send files/messages
StarCat 15k notes
local-mac-address
SDS - How to mirror root
IPMP
T3B or T3+ Firmware Rev 2.1 New Functions
Hitachi StorEdge 99X0 Arrays
SunFire forgotten password
StorEdge Network FC Switch
Hitachi 9900v notes
Minnow 3300 Array
Tuning ecache scrubber scan rate
VxWorks (serengeti)
LVD adapter information
Replacing a nordica bd in a 15K SC
Serengeti/15k DR boards
Clean up non-root disk controller numbers
Starcat Portid cheat sheet
Starcat SC clean slate
Starcat redx info
StorADE
Get FRU info from serengeti
Swap
Maserati Notes - StorEdge 6320 and 6120
Flash Archive interactive install
UltraSPARC III CPU Diagnostic Monitor (CDM)
SunFire Service Mode Password Generator
V440 ALOM, raidctl
Finding Solaris release and distribution loaded
Find local NIS servers
Network troubleshooting command, files, daemons
How to find your way around a B1600
Cluster 3.x
SMSupgrade 1.4.1 info
Solaris 9 SVM (sds) disk replacement
SC rebuild after total disk failure
15K DR / hpost examples


smsbackup: manually check a backup file
3310/3510 Disk replacement
How to mount a CD image file (.iso) as a filesystem
Removing the top cover on a V20z
Explorer -w scextended with cron
Useful COD commands
ALOM4v Ontario/Erie (Niagara)
Forgotten password (ALOM4v)
Solaris to Linux cross reference
SSH information
Galaxy ILOM info
SSH with SMS 1.5


Desktop Configurations
model     processor       memory                        sbus slots  onboard hosts  network
SS4       70, 85, 110     1sim/bank, 16,32                          scsi II        10bt/AUI
SS5       70,85,110,170   1sim/bank, 8/32                           scsi II        10bt
SS10      20,30,40,50     1sim/bank, 16/64                          scsi II        10bt
SS20      50-150mhz       1sim/bank, 16/32/64                       scsi II        10bt/AUI
Ultra 2   167,200,300     2sims/bank, 16/32/64/128                  fast/wide      10/100
Ultra 5   270             2sims/bank, can't use 256mb   PCI         N/A            10/100
Ultra 10  d/b             2sims/bank                    PCI         N/A            10/100

1000      .....  4/group  ......  ......
1000e     .....  ......  ......  ......
2000      51,61,81  ......  .....
2000e     51,61,81  4/group  8/32 meg  4/group  ......  ......

Commands to find firmware revision number:


# /usr/platform/`uname -i`/sbin/prtdiag -v (gives you a listing of all boards)
# /usr/sbin/prtconf -V (gives you the master board's version)
ok .version
ok banner
OBP escape hatches
L1-a (stop-a) (Ctrl Break)* Stops a process in OBP, or brings a system down in Solaris (not recommended).
L1-f (stop-f) Enters command mode on ttya before probing H/W. Use 'fexit' to continue with the
     initialization sequence.
L1-d (stop-d) Sets the diag-switch? parameter to true. Enables verbose output during POST.
L1-n (stop-n) Resets NVRAM contents to defaults (not recommended; see 'nvrecover').
L1   (stop)   Runs POST in INIT mode (does not depend on security mode).

* laptop keystrokes


Make a new alias... OBP, printenv, nvramrc
1 ok show-disks
2 select a disk controller a,b,c
3 ok nvalias (alias name) (ctl-y) ..... control-y is the yank command, and will give you the path you
selected in the show-disks command.
You have to type sd@n,n for Sbus or disk@n for PCI at the end.

To recover... NVRAMRC, printenv, veritas
ok nvrecover
ok nvstore
(ctl-c)

To remove an alias... nvramrc, printenv, devalias
ok nvunalias (alias name)

To reprogram your MAC address and host ID
ok 17 0 mkp (return)
ok 8 0 20 xx yy zz 080020xxyyzz mkpl (return)
(cursor disappears)
(ctl-d)
(ctl-r)
ok banner

Boot Sequence
1 Beep (keyboard)
2 LEDs blink, screen goes blank (POST)
3 Banner
4 Testing memory (selftest#mem)
5 Boot (auto-boot?)
6 diag-switch?
7 prom loads boot block (UFS reader)

Run levels
Run level  Description                  O/S command  rc.script
1          single user                  init 1       /etc/rc1
2          multi-user but no sharing    init 2       /etc/rc2
3          multi-user with sharing      init 3       /etc/rc3
4          N/A user configurable
5          shutdown and shuts off pwr   init 5       /etc/rc5
6          stop and reboot              init 6       /etc/rc6
0          goto firmware                init 0       /etc/rc0

Restore boot block


# installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t0d0s0
(the `uname -i` part names the platform directory, ex: sun4u)
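
The path substitution in the installboot line can be sketched in Python. This is only an illustration of how the command string is assembled; the function names are hypothetical, and the platform name is passed in by hand rather than read from `uname -i`.

```python
def bootblk_path(platform: str) -> str:
    """Return the UFS boot block path for a platform directory name."""
    return f"/usr/platform/{platform}/lib/fs/ufs/bootblk"


def installboot_command(platform: str, raw_slice: str) -> str:
    """Assemble the installboot command line shown above (it is not executed)."""
    return f"installboot {bootblk_path(platform)} {raw_slice}"


if __name__ == "__main__":
    # Platform name and root slice taken from the example in the text.
    print(installboot_command("sun4u", "/dev/rdsk/c0t0d0s0"))
```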


Deskside server
Key switch
-standby  no power
-on       normal
-diag     verbose post, on board, master bd (1000,2000)
-secure   prevents a (stop-a) and disables reset switch

1000/2000 server Info


1000 40mhz control card
1000e 50 mhz control card
2000 40 mhz control card
2000e 50mhz control card
Master Board requirements*:
CPU
Memory
Latest firmware rev.
* auto master - if you replace any CPU/Mem cards, put the new card in slot 0

To determine which board is master:


1 ok print-nvram-stat
2 switch cables to board you want to be master
<2> ok 0 switch-cpu
3 make board 0 a master bd
<0> ok set-master-nvram
<0> ok print-nvram-stat
4 Get rid of unwanted master
<0> ok 2 switch-cpu
<2> ok clear-master-nvram
<2> ok print-nvram-stat
(move rs232 cable to master board)

command to show all sbus cards:
<ok> show-devs

NVRAM contents 1000/2000


If you need to change a CPU board, you do not need to do anything with the NVRAM. There is a copy
on the control board and it will be automatically transferred. If you need to change a control board, you
must use the procedure in the FE handbook (pg. cpu81) to invalidate the contents of NVRAM on the
new control board.

Ultra Enterprise 3000 Information


2 power supplies
6 cpus
I/O board w/sbus, internal scsi adapter
clock board, clock, voltage monitor, reset, console (keep firmware)

CPU boards
CPU/mem bd  Speeds
501-2976    83mhz
501-4312    more sram
501-4882    83-90-100mhz

Processors: 167mhz, 250mhz, 333mhz, 400mhz, 600mhz
memory: 8 @ 8, 32 @ 8, 128 @ 8

I/O boards
I/O type  Speed             sbus      o/b fiber  on board host  network
1         83mhz             3         soc        fas            10/100
2         83mhz             2 (upa)   soc        fas            10/100
3         83 and 83/90/100  0 (2pci)  n/a        ultra wide     10/100
4         83 and 83/90/100  3         soc+       fas            10/100
5         83/90/100         2upa      soc+       f/w scsi       10/100

Clock boards
Clock board numbers  Speed
501-2975             83mhz
501-4286             83mhz
501-4946             83-90-100 (x500 servers)
501-5365             83-90-100 (x500 servers, shipped with the E6500)


OBP commands:
(OBP reference guide) get this... http://docs.sun.com
banner             a brief description of the system: MAC address, firmware level, host ID
boot -v            will verbose boot the system from defaults set in printenv list and devalias file.
boot -a            will boot without the use of /etc/system file (interactive boot)
boot -s            will boot in single user mode
boot (alias)       will boot the server from the specified alias in the devalias file
cd /               will put you in a directory hierarchy for listing hardware paths. 'device-end' gets
                   you out of this mode
devalias           shows you a listing of your device aliases
limit-ecache-size  will allow you to boot a 400mhz 8meg cache processor on an OS 2.5.1 or 2.6 CD.
                   Solaris 7 works fine. Jumbo patch 105181-14 for 2.6 or 103640-27 for 2.5.1
nvalias            is used to create an alias
nvunalias          is used to remove an alias. see previous example.
nvrecover          is used to recover a deleted alias
nvstore            is used with nvrecover
.properties        when you are in device hierarchy mode (cd /) on 3.x systems you can use the
                   .properties command to see info about the device path you are on. use
                   .attributes for the same function on 2.x systems.
probe-scsi         list only internal disks
probe-scsi-all     list all scsi devices
probe-fcal         list all photon drives
printenv           used to give you a listing of the environment settings
prom-copy          will copy the flash prom from one board to another. Boards must be the same type.
                   prom-copy (src dest) ex: ok prom-copy 0 2 will copy flash prom from
                   board 0 to board 2
reset              will reset the system
setenv (variable)  used to set an environment setting (variable). use printenv to get setting syntax.
set-default        will set a line in the environment to default. ex: ok set-default output-device
show-post-results  show results of the last POST
show-disks         will give you a disk controller listing and is used when creating an nvalias.
show-devs          will give you a listing of all device paths on the system. Use the 'cd /' command
                   to go down the path.
socal-diag-all     when you are in device hierarchy mode (cd /), you can go down a socal path
                   (ex: cd /sbus@3,0/SUNW,socal@0,0) and run OBP diags on that path.
show-wwn           when you are in device hierarchy mode (cd /), you can go down a socal path
                   (ex: cd /sbus@3,0/SUNW,socal@0,0) and show the world wide number and loop id.
selftest           when you are in device hierarchy mode (cd /), you can go down a socal path
                   (ex: cd /sbus@3,0/SUNW,socal@0,0) and run the socal selftest.
sifting            will search for the command specified. ex: ok sifting probe-scsi
update-proms       will update the proms (do not use to copy to cpu board 0, use the prom-copy # #
                   command)
watch-net          watch packets and clock tics
watch-net-all      watch packets and clock tics
words              will list all the Forth commands for the current screen
.xir-state-all     externally initiated reset command, used to gather info on a hung machine

Command to reset a line in the environment to defaults:

set-default
ex: ok set-default output-device

Move an Sbus card from one slot to another:

1. Remove controller (Sbus card)
2. boot -r, remove path_to_inst
3. boot -ra
You might also be able to switch the sbus-probe-list order to change the C# in c#t#d#s#.

OBP path breakdown for Enterprise machines


/sbus@7,0/SUNW,fas@3,8800000/sd@1,0
      |            |            | |
      |            |            | lun#
      |            |            target#
      |            sbus slot
      convert to decimal, divide by 2, round down; result is bd #
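
The breakdown above can be sketched in Python. This is a minimal illustration, not a tool from the handbook: the function name is hypothetical, and it assumes every unit address in the path is hexadecimal (the text only says so for the sbus number).

```python
import re

# Matches paths shaped like /sbus@N,x/<driver>@S,offset/sd@T,L
PATH_RE = re.compile(
    r"^/sbus@([0-9a-fA-F]+),[0-9a-fA-F]+"
    r"/[^/@]+@([0-9a-fA-F]+),[0-9a-fA-F]+"
    r"/sd@([0-9a-fA-F]+),([0-9a-fA-F]+)$"
)


def decode_enterprise_path(path: str) -> dict:
    """Split an Enterprise OBP disk path into board, sbus slot, target and lun."""
    match = PATH_RE.match(path)
    if match is None:
        raise ValueError(f"unrecognized path: {path}")
    sbus, slot, target, lun = (int(field, 16) for field in match.groups())
    return {
        "board": sbus // 2,   # convert to decimal, divide by 2, round down
        "sbus_slot": slot,
        "target": target,
        "lun": lun,
    }


if __name__ == "__main__":
    print(decode_enterprise_path("/sbus@7,0/SUNW,fas@3,8800000/sd@1,0"))
```

For the path in the diagram this gives board 3, sbus slot 3, target 1, lun 0.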
Device tree listings for desktop machines
4m: ss4, ss5, ss10, ss20
/iommu/sbus/cgsix               path to monitor card
/ledma/le                       path to on-board network adapter
/espdma/esp                     path to on-board scsi devices

4u
ultra 1 - 140, 170
/sbus@1f,0/ledma@e/le           path to on-board network adapter
/espdma/esp                     path to onboard scsi devices

ultra 1 - 140e, 170e, 200e
/sbus@1f,0/hme                  path to on board network
/fas                            path to on-board scsi devices

ultra 2
/upa/sbus/hme                   path to on-board network
/fas@e                          path to on-board scsi devices

ultra 5,10
/upa/pci@1f/apb/pci@1,0         path to pci slots 1-3
/upa/pci@1f/apb/pci@1,1/ide@3   path to cdrom and disk
/network@1,1                    path to on-board network
/m64b                           path to on-board graphics adapter
/ebus@1                         path to system devices

ultra 30
/upa/pci@1f,2000                path to pci slots 1(33/66mhz) - 4 (33mhz)
/upa/pci@1f,4000/scsi@3         path to on-board scsi devices
/network@1,1                    path to on-board network (hme)
/ebus@1                         path to system devices

ultra 60
/upa/pci@1f,2000                path to pci slots 1(33/66mhz) - 4 (33mhz)
/upa/pci@1f,4000/scsi@3         path to internal scsi devices
/scsi@3,1                       path to external scsi devices
/network@1,1                    path to network (hme)
/ebus@1                         path to system devices

acronyms for above listings


esp    scsi2 50 pin
fas    fast and wide scsi 68 pin
hme    100mb ethernet
isp    Intelligent SCSI Processor
le0    10mb ethernet
qe     Quad Ethernet
qfe    Quad fast Ethernet
soc    Serial Optical Controller
socal  Serial Optical Controller +

Ultra 450 and Ultra Enterprise 450
ok setenv disk_led_assoc
add a pci adapter to printenv list to get entries into prtconf so you
can do the following procedure:

1. To find a drive path on an ultra 450, get the path '/pci@6,40001# - - - - - - - - - - /sd@0,0
from the format command.
2. Change the 'sd' to 'disk' and '0,0' to 0
3. #prtconf -vp | grep 'c#t#d#. . . . . . . . . . . . . /disk@#
4. results will be the slot# and the disk# will tell you the drive.
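
Step 2 is a plain text rewrite of the last path component, sketched here in Python. The function name and the sample path are hypothetical (the handbook elides the middle of the real path), and the sketch assumes the first number after 'sd@' is the one to keep.

```python
import re


def format_path_to_prtconf(path: str) -> str:
    """Rewrite a trailing /sd@T,L component as /disk@T (step 2 above)."""
    return re.sub(r"/sd@([0-9a-fA-F]+),[0-9a-fA-F]+$", r"/disk@\1", path)


if __name__ == "__main__":
    # Made-up path for illustration only; use the one reported by format.
    print(format_path_to_prtconf("/pci@6,4000/scsi@3/sd@0,0"))
```

The rewritten string is what you would then search for in the prtconf -vp output.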
Device tree listing ----- ----- ------ ---- ---- FE Handbook 1 cpu-126 and cpu-128

mfg-options is an NVRAM variable holding a decimal value that sets up the system as a workstation or a server.
The UE 450 is currently not offered as a workstation.
ok setenv mfg-options 0 (workstation default) Ultra 450
ok setenv mfg-options 49 (server default) Ultra Enterprise 450

upa-port-skip-list is an NVRAM variable used to skip probing of upa ports. The following upa ports are used:
Processors    upa ports 0,1,3
framebuffers  upa ports 1d and 1e
psycho        upa ports 4,6,1f
ex: ok setenv upa-port-skip-list 3,1d (skips CPU3 and FFB1)


obdiag is a command you can run for PROM-based diagnostics.

pci0-probe-list is an NVRAM variable used to control the probe order for onboard PCI devices (/pci@1f,4000)
pci-slot-skip-list is an NVRAM variable used to skip probing of PCI devices plugged into the backpanel slots
memory-interleave is an NVRAM variable that controls how OBP sets memory interleaving
env-monitor is an NVRAM variable that determines how OBP responds to environmental monitoring via the I2C
serial bus.
.post command displays the results of POST
.asr command displays the system devices and settings
asr-enable, asr-disable commands enable and disable system devices.

/associations The associations tree node contains entries representing categories of associations or connections
between system components that are dispersed in the device tree.
ex:

ok cd /associations/slot2dev
ok .properties
ok cd /associations/slot2led
ok .properties
ok cd /associations/slot2disk
ok .properties

E10000
SSP basic commands
hostinfo        will give you a status of different parts of the E10k
  -F            fan status, on/off, speed
  -S            signature blocks (board ID)
  -h            processor status
  -p            power status (boards and centerplane)
  -t            temperature status

domain_create   requirements: system boards must be present and not in use
                Sufficient memory and at least one proc
                At least 1 network interface
                Connection to a disk for OS
                Unique hostname
                Entry in host database
                template eeprom.image file
syntax:
Create a new domain:
ssp:domain% domain_create -d domain -b 0 3 4 -o 2.5.1 -p platform
Recreate a domain that previously existed (domain_history file):
ssp:domain% domain_create -d domain

domain_remove   Domain must be halted
                syntax: ssp:domain% domain_remove -d domain
domain_rename   syntax: ssp:domain% domain_rename -d old_name -n new_name
domain_status   will tell you which boards are in each domain
domain_switch   will change the domain your ssp window is connected to.
domain_history  Displays the contents of domain_history file (contains removed domain info)

power
no argument                              will tell you the voltages at each board
-on
-off
-all = everything except AC sequencers   ex: power -on -all
-ps = power supply                       ex: power -on -ps # (#=0-7)
-p = AC sequencer                        ex: power -on -p # (#=0-4)
-cb = control board                      ex: power -on -cb # (#=0-1)
-sb = system board                       ex: power -on -sb # (#=0-15)
-csb = center plane sprt bd              ex: power -on -csb # (#=0-1)
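
The unit ranges listed for the power command can be captured in a small Python helper that refuses out-of-range numbers before you type the command on the SSP. This is a sketch only: the function and table names are hypothetical, and it just builds the command string from the options listed above without running anything.

```python
# Valid unit numbers per option, as listed in the power command summary.
UNIT_RANGES = {
    "ps": range(0, 8),    # power supply 0-7
    "p": range(0, 5),     # AC sequencer 0-4
    "cb": range(0, 2),    # control board 0-1
    "sb": range(0, 16),   # system board 0-15
    "csb": range(0, 2),   # centerplane support board 0-1
}


def power_command(state: str, unit: str, number: int) -> str:
    """Build a 'power -on|-off -<unit> <number>' string, checking the range."""
    if state not in ("on", "off"):
        raise ValueError("state must be 'on' or 'off'")
    if unit not in UNIT_RANGES:
        raise ValueError(f"unknown unit type: {unit}")
    if number not in UNIT_RANGES[unit]:
        raise ValueError(f"{unit} number {number} is out of range")
    return f"power -{state} -{unit} {number}"


if __name__ == "__main__":
    print(power_command("off", "sb", 3))
```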

fan
no arguments          same as hostinfo -F (fan status)
-t = tray             ex: # fan -t x -p off (x = 0-15)
-1 = group of trays   ex: # fan -1 x -p off (x=front,rear)
-p on                 all fans on

autoconfig
Must be run when adding a new revision of a board to the system
May also be required when moving a board to a new slot
Not required if all boards are the same revision level
(Do not run on a system board that is running the OS, or on the
centerplane when any domain is running the OS)

board_id
will read the serial number eeprom on specified board
(has no effect on running domain)

thermcal_config
thermcal_config must be run when installing a new board
or moving a board to a new slot, or else temperature sensing
for that board will be incorrect.
Target board must be off for 30 minutes before running
Updates SSP file with conversion factors from serial eeproms
ssp:domain% edd_cmd -x stop
ssp:domain% thermcal_config
ssp:domain% edd_cmd -x start

bringup
boot the domain
ex: # bringup -A off -l32 will bring system to the <ok> prompt
and run hpost at level 32 (7-128)
ex: # bringup will bring up system (autoboot)

netcon
start network console session


Blacklist
- Edit via hostview or manually (vi)
- Explicit removal of components for isolation of intermittent faults or benchmarking
- processors
- IO controllers
- ASICs
- Memory banks
- Boards
- Busses
- Default location of blacklist file
/var/opt/SUNWssp/etc/platform_name/blacklist
- After editing the blacklist file, halt the domain and re-run bringup to make changes take
effect. (reboot does not cause hpost to reread the blacklist file)
Hostview

To remove a device from the blacklist file:

- Edit
- Blacklist
(change view if required)
- MIDDLE click on blacklisted device (should change from black to white)
- run bringup to make changes take effect.

Redlist:
$SSPVAR/etc/platform_name/redlist is an ASCII file that enables the system administrator
or root to restrict, from the SSP, the configuration of the host system. It lists components
that POST cannot touch, and whose state POST cannot change. Redlisted components are
also considered effectively blacklisted. Never use redlisting if blacklisting will do.
System Board Power off Procedure
1. Have the customer bring down all jobs on the domain in question.
Next, they need to use either the shutdown command or the init 0 command
to bring the system to the <ok> prompt.
2. After this has been done, go to the ssp login window. Log in as ssp and (ssp password)
3. At the SUNW_HOSTNAME prompt, enter either the platform name or the name of the
existing domain
4. Issue the 'domain_status' command; this will list all the domains and system boards
associated with each domain.
5. Issue the 'domain_switch (domain name)' command to get to the proper domain.
6. Use the 'power -off -sb #' (#= system board #) command to power off the system board to
be removed. MAKE SURE THE YELLOW LEDS ARE OFF BEFORE REMOVING BOARD.
7. After completing the work on the system board and the board has been reinstalled, use the
'power -on -sb #' (#=system board#) command to return the power to the system board.
8. Next use the 'bringup' command to autoboot or 'bringup -A off' to stop at the <ok>
prompt.


Component Numbering
Processors
Component:  System Board 0 - 15, proc. mod. 0 - 3
Solaris:    /SUNW,ultraSparc@0,0
            The number after @ is the proc. in hex (0 - 3f)
Hostview:   SB 0 - 15, proc. 00 - 63
POST:       sysbd 0 - 15, proc0.0 - proc15.3
            Format is sysbd#.proc#
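
The three processor numbering schemes above can be related with a little arithmetic. The following is a minimal sketch, assuming processors are numbered consecutively with four modules per system board (which is what the 0 - 3f, 00 - 63 and proc0.0 - proc15.3 ranges imply); the function name is illustrative only.

```python
def decode_proc(solaris_hex):
    """Convert a Solaris processor number in hex (0 - 3f) to the
    Hostview decimal number and the POST sysbd#.proc# name.

    Assumption: four processor modules per system board, numbered
    consecutively across boards 0 - 15.
    """
    number = int(solaris_hex, 16)
    if not 0 <= number <= 0x3F:
        raise ValueError("processor number must be in the range 0 - 3f")
    board, module = divmod(number, 4)
    return number, "proc%d.%d" % (board, module)


if __name__ == "__main__":
    for value in ("0", "1d", "3f"):
        print(value, decode_proc(value))
```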

I/O (SBus)
Component:    I/O port 0 - 3
Solaris:      /sbus@40
              Subtract 40 (hex) from the number after @
              change to decimal
              divide by 4
              answer is board #
              remainder is SBus #
Cable Label:  SB0.0.0
              Format is sysbd#.SBus#.Slot#
POST:         scard 0.0.0
              Format is sysbd#.SBus#.Slot#

I/O (PCI)
Component:    I/O port 0 - 3
Solaris:      /PCI@40
              Subtract 40 (hex) from the number after @
              change to decimal
              divide by 4
              answer is board #
              remainder is PCI #
Cable Label:  PCI0.0.0
              Format is sysbd#.PCI#.0
POST:         scard 0.0.0
              Format is sysbd#.PCI#.0
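
The subtract / convert / divide recipe used for both the SBus and PCI device paths can be sketched as follows. This is only an illustration of the arithmetic described above; it assumes the 40 in the recipe is hexadecimal (as in the device path), and the function name is hypothetical.

```python
def decode_io_address(unit_hex):
    """Decode the number after '@' in /sbus@NN or /PCI@NN.

    Follows the recipe in the text: subtract 40 (hex), convert to
    decimal, divide by 4. The quotient is the system board number,
    the remainder is the SBus (or PCI) number on that board.
    """
    offset = int(unit_hex, 16) - 0x40
    if offset < 0:
        raise ValueError("address is below 40 hex")
    board, bus = divmod(offset, 4)
    return board, bus


if __name__ == "__main__":
    for unit in ("40", "45", "7c"):
        board, bus = decode_io_address(unit)
        print("@%s -> sysbd %d, bus %d" % (unit, board, bus))
```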

Memory
Component:  System board memory
POST:       mem x.0
            Format is sysbd#.bank#

SSP: (notes)
/etc/netmasks should be: 10.0.0.0 255.255.255.0
(for private net or cb1 will not come up)
Share cdrom to load VTS:
# share -F nfs -o ro,anon=0 /cdrom/cdrom0/s0
3.4 commands:
showfailover         Shows you the failover status
showdatasync         Shows you the datasync status (from main to spare)
setfailover on       Enables failover
setfailover force    Forces a failover to spare
setfailover off      Disables failover to spare
setdatasync backup   Backs up files to spare
ssp_backup           Creates a ssp_backup.cpio file
                     ex: # ssp_backup /var/tmp
ssp_restore          Restores a ssp_backup.cpio file
                     ex: # ssp_restore /var/tmp/ssp_backup.cpio
ssp_config float     Lets you change the floating hostname (name should be in the hosts
                     files of both SSPs and also in /etc/ssphostname on the domains)

SCSI Array
MODEL 100
Front Panel LCD indications:
POST           Located in the top left corner (circle with line at 12:00). Indicates POST is running.
Service        Under the POST icon (wrench). Service is needed; always displayed with another icon.
Controller     Located to the right of the service icon (looks like a SE SCSI icon). Indicates a
               controller problem.
Alphanumerics  POST - test codes and status value of failing test are flashed continuously.
               Normal operation - four LSDs of the world wide number.
               Controller errors - panic code is flashed continuously, and controller icon is on.
Fan            Fan failure or heat problem.
Battery        Fast write cache: low NVRAM battery voltage, battery should be replaced.
Drive          A small solid rectangle represents an available drive.
Fibre          Fiber optic link state. Two link icons, A and B. Switched on when link is
               established.

POST codes
01   LCD failure          Replace fan tray?
08   Fan failure          Replace fan tray
09   P/S failure          Replace power supply
30   Battery failure      Replace battery module
xx   Controller failure   Replace controller

100/110mhz
|
Model
11/2
|
size of drives
Layout:
 __ ____________POWER SUPPLY____________
|  | d0        | d0        | d0        |
| F| d1        | d1        | d1        |
| A| d2        | d2        | d2        |
| N| d3   T0   | d3   T2   | d3   T4   |
|  | d4        | d4        | d4        |
| T|___________|___________|___________|
| R| d0        | d0        | d0        |
| A| d1        | d1        | d1        |
| Y| d2   T1   | d2   T3   | d2   T5   |
|  | d3        | d3        | d3        |
|  | d4        | d4        | d4        |
|__|___________|___________|___________|
      Tray 1      Tray 2      Tray 3

Example: T5, d0 is c0t5d0s?

STRIPE trays... MIRROR arrays.
* Use channel B first on controller. Fiber to copper adapter, 1 port for each host.
** Run
# ssaadm display cn (where 'n' = controller number)
This will give you array info on this controller.

Solstice Disk Suite: "md" devices... Can change /etc/vfstab and /etc/system to bypass and use raw device.
Use command 'metastat -s' to tie the "md" device name in vfstab to the physical partition name.
# solstice & (will run the GUI)
MODEL 200
The SPARCstorage Array model 200 is a rack mount disk array controller. Up to six differential SCSI disk
trays can be connected to it. Each tray can hold up to six drives. Ports are numbered 0-5 right to left, top
to bottom.
Device name example: c2t2d0
c2   controller #
t2   port # on controller of array (or tray #, determined by port on array controller)
d0   drive in tray
Connectors and switches:
Fiber optic connector   Connects F/O cables from host to array.
Scan connector          Used to test SSA controller in factory.
NVRAM LED               Gives info on the SSA NVRAM. Press the NVRAM button when the SSA
                        is off; if the NVRAM LED comes on, then there is data pending on the
                        NVRAM that must be flushed to disk using the fastwrite software
                        command.
NVRAM button            Used to determine if there is any data pending on the SSA NVRAM.
DIAG switch             Used to set the diag level of the SSA. DIAG position for normal
                        diagnostics. DIAG EXT for extended diagnostics.
Reset switch            Resets array... Do not press while array is in use.
SYS OK LED              Gives info on SSA status. Blinking is running normally (freq = activity).
                        Off is no power or hung. Solid on is power but hung.
100/200 Array Commands:

# ssaadm release /dev/rdsk/c#t#d#s#
    To release a specific disk
# ssaadm release c#
    To release all drives on a specific controller
# ssaadm stop /dev/rdsk/c#t#d#s#
    To stop a specific disk
# ssaadm stop -t2 c#
    To stop a specific tray on a specific controller
# ssaadm stop c#
    To stop all drives on a specific controller
# ssaadm display c#
    To display status on all drives on a controller
# ssaadm -v download -w ####WWN##### c#
    To download old WWN to new SSA controller (2.5 and >)

Procedure to replace WWN on a SSA


1. Boot from CDROM
ok boot cdrom -sw
2. Locate the new array controller
# ls -l /dev/dsk/c*t0d0s2 | grep NWWN (NWWN = new controller's WWN, from the SSA display)
3. Mount servers '/' filesystem on /a
# mount -o ro /dev/dsk/c0t0d0s0 /a
4. Download the old address to the new controller
# /a/usr/sbin/ssaadm -v download -w ####WWN##### c#
(old WWN) (c# from step2)
5. # halt
6. Press reset on the back of the SSA
(If you don't know the original WWN, mount the root filesystem on /a and do an ls -l on
/dev/dsk/c#t0d0s2)
A1000 Disk array
- Same disk tray as the D1000
- Hardware RAID controller
- Ultra Differential Fast/Wide host connection
- 8-16 meg processor memory (2 SIMM slots)
- 16-64 meg data cache (2 SIMM slots)
- Battery backup for data cache
- SCSI ID switch on controller
- Two models: 8 HH or 12 low profile chassis
D1000 Disk tray
The D1000 disk tray is used in the StorEdge A3500 RAID array. 5, 8, or 15 D1000s can be used,
depending on the configuration. It uses the same disk tray as the A1000, but a different controller. It
has 2 sets of SCSI connectors: you can run 2 SCSI busses into it and divide the drives, or jumper the
busses together and have the array on one bus.
- Does not have a hardware RAID controller
- 16-bit Ultra Differential Fast/Wide SCSI bus
- Two models: 8 HH (9.1gb or 18.2gb) or 12 low profile (4.2gb or 9.1gb)
- hot plug disk drives
- hot plug power and cooling units
- dual power cables to separate sequencers
SCSI ID and Array ID are set on the rear DIP switch (D1000 can be configured for 1 or 2 busses)
sw1: Disk Array 1 Id:       Up: drive IDs 8-11 or 8-13,    Down: drive IDs 0-3 or 0-5
sw2: Disk Array 2 Id:       Up: drive IDs 8-11 or 8-13,    Down: drive IDs 0-3 or 0-5
sw3: Drives Remote Start:   Up: wait for SCSI command,     Down: check sw4
sw4: Drives Delay Start:    Up: start with delay (id*12),  Down: start at power-on
sw5: Reserved

Module ID switch (rear): Wheel switch used to ID unit (1-5) when used in an A3500
configuration.

Disk Layout: D1000

                 Array2        |       Array1
sw2: down | 0 1  2  3  4  5    | 0 1  2  3  4  5  | sw1: down
sw2: up   | 8 9 10 11 12 13    | 8 9 10 11 12 13  | sw1: up
                        Front view

Leds on back:               Color                                                  Location
Power supply status led:    Normal (green), failure and other p/s is ok (amber)    P/S
Cooling status leds (4):    Normal (green), blower failure (amber)                 Fan housing
Temp fault:                 Normal (off), fault (amber)                            Control bd
Controller power:           Normal (green), no power (off)                         Control bd

RSM Disk Tray


RSM trays are used in the StorEdge A3000 RAID array. Each A3000 contains 5 RSM disk trays.
- Internally, drives operate on a 16-bit Single-Ended Fast/Wide SCSI bus
- Externally, the tray interface is a 16-bit Differential Fast/Wide SCSI bus
- 3 to 7 4.2gb or 9.1gb HH disk drives
- hot plug disk drives
- hot plug, redundant power and cooling units
- dual power cables to separate sequencers
*** SCSI ID for the tray is set on the I/O board: setting of 0-6 or 8-14. 8-14 is required for the
RDAC module.
* SCSI ID for the SEN card is a wheel selection and should be set to 15 (F).
RSM
     ___________front view___________
***  | 0 | 1 |  2 |  3 |  4 |  5 |  6 |
     _______________or_______________
     | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
                 target IDs
Leds/switches
Disk leds:    Red - fault, Green - I/O activity
Panel leds:   Power on/off switch
              Power indicator (green)
              Power module A and B fault (red)
              Fan module warning (amber)
              Fan module failure (red)
              Over temp (red)
              Reset Alarm (pbs)

A3000/A3500
A3000
- 56 inch rack.
- contains 5 RSM disk trays
- 1 RDAC Module
- each RDAC module has dual hot plug RAID controllers
A3500
- 72 inch rack
- contains 5, 7, 15 D1000 disk trays
- 1, 2, or 3 RDAC modules
- each RDAC module has dual hot plug RAID controllers
# raidutil -c c#t#d# -B    battery age info for that controller (A3x00)
                     -R    to reset battery age after replacement (A3x00)
Break ,(esc), Q40, ld</Debug, arrayPrintSummary,cfgUnitList,vdShow,dstDevs,
rdacMgrSetModeActivePassive, rdacMgrSetModeDualActive,rdacMgrAltCtlFail,rdacMgrAltCtlResetRelease,
moduleList,sysReboot
A5000 (photon)
- The A5000 or Photon is a Fibre Channel array
- Up to 14 HH or 22 low profile hot pluggable, dual ported FC-AL disk drives
Model #'s
A5000 - 14 7200 rpm drives of 9.1GB each
A5100 - 14 7200 rpm drives of 18.2GB each
A5200 - 22 10000 rpm drives of 9.1GB each
RAID Manager
Commands:
# /usr/lib/osa/bin/rm6
    To run RAID Manager
# /usr/lib/osa/lad
    Will give ctd#s, controller serial #s and lun configurations
# fwutil /usr/lib/osa/fw/aaaaaaaaa.apd cxtxdxs0
    Downloads appware to a controller (halt all I/O)
# fwutil /usr/lib/osa/fw/bbbbbbbb.bwd cxtxdxs0
    Downloads bootware to a controller (halt all I/O)
# raidutil -c c#t#d# -b    battery age info for that controller (A3x00)
                     -r    to reset battery age after replacement (A3x00)
RAID Manager Device Naming Conventions
C# T# D# S#
C#   Host Controller #
T#   Target ID of RAID controller
D#   Lun # (created when setting up array)
S#   slice
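
As a quick illustration of the naming convention above, the sketch below splits a c#t#d#s# name into its four fields. The function and key names are hypothetical and exist only for this example.

```python
import re

# c<host controller> t<RAID controller target ID> d<LUN> s<slice>
_CTDS = re.compile(r"^c(\d+)t(\d+)d(\d+)s(\d+)$")


def parse_ctds(name):
    """Split a RAID Manager device name such as c1t5d0s2 into fields."""
    match = _CTDS.match(name)
    if match is None:
        raise ValueError("not a c#t#d#s# device name: %r" % name)
    controller, target, lun, slice_no = (int(g) for g in match.groups())
    return {
        "host_controller": controller,
        "raid_controller_target": target,
        "lun": lun,
        "slice": slice_no,
    }


if __name__ == "__main__":
    print(parse_ctds("c1t5d0s2"))
```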

luxadm commands for the A5000

luxadm probe -p
    Display information about all attached A5000s. This will give you the
    enclosure names.
luxadm display
    Use the display subcommand to display enclosure or device specific info.
    enclosure info  ex: # luxadm display mars-0
    device info     ex: # luxadm display mars-0,f3 (f3 = front disk slot #3)
luxadm inq
    Use the inquiry subcommand to display inquiry info for the enclosure or
    a specific disk.
    enclosure info  ex: # luxadm inq mars-0
    device info     ex: # luxadm inq mars-0,f4 (f4 = front disk slot #4)
luxadm led_blink
    Use the led_blink subcommand to start flashing the yellow LED
    associated with a specific disk.
    ex: # luxadm led_blink mars-0,f2 (f2 = front disk slot #2)
luxadm led_off
    Use the led_off subcommand to turn off the yellow LED
    associated with a specific disk.
    ex: # luxadm led_off mars-0,r3 (r3 = rear disk slot #3)
luxadm power_off
    Use the power_off subcommand to set an enclosure or disk to
    power save mode.
    enclosure  ex: # luxadm power_off mars-0
    disk       ex: # luxadm power_off mars-0,f5 (f5 = front disk slot #5)
luxadm power_on
    Use the power_on subcommand to set a drive or enclosure to
    its normal power on state.
    enclosure  ex: # luxadm power_on mars-0
    disk       ex: # luxadm power_on mars-0,f1 (f1 = front disk slot #1)
luxadm remove_device
    Use this subcommand to 'hot remove' a device or enclosure when
    removing failed disk units for replacement. Verbose output will
    walk you through the procedure.
    enclosure  ex: # luxadm remove_device mars-0
    disk only  ex: # luxadm remove_device mars-0,f6
luxadm insert_device
    Use the insert_device subcommand for 'hot' insertion of a new disk or
    enclosure. Use after the remove_device command to replace a failed
    drive with a new one. Verbose output will walk you through the procedure.
    ex: # luxadm insert_device mars-0,f5
luxadm reserve
    Use the reserve subcommand to reserve the specified disk(s) for exclusive
    use by the host from which the subcommand was issued.
    ex: # luxadm reserve mars-0,f6
luxadm release
    The release subcommand releases the drive from the reserve state.
    ex: # luxadm release mars-0,f6
luxadm enclosure_name
    Use the enclosure_name subcommand to change the enclosure name of
    one or more A5000s.
    ex: # luxadm enclosure_name mars1 pluto2
    (change from pluto2 to mars1)
luxadm download
    Use the download subcommand to download a prom image to the
    FEPROMs on an A5000 interface board. Stop all activity on this
    connection before downloading firmware; the array will recycle
    automatically after the download.
    ex: # luxadm download -s mars-0 (will download firmware from
    default file /usr/lib/locale/C/LC_MESSAGES/ibfirmware)
    ex: # luxadm download -s -f /special/upgrade/ibfirmware.latest mars-0
    -f lets you specify the file name instead of using the default
luxadm fcal_s_download
    Use the fcal_s_download subcommand to download new fcode into ALL
    the FC100-HA SBus cards, or to display the current versions of the fcode
    in each FC100-HA SBus card.
    display:   ex: # luxadm fcal_s_download
    download:  ex: # luxadm fcal_s_download -f /usr/lib/firmware/fc_s/fcal_s_fcode

Disk failure and replacement (Veritas)

Remove:
1. # vxdiskadm
2. item 4 (Remove disk for replacement), enter disk name, Remove another disk? n
3. item 11 (Disable (offline) a disk device), offline the same disk so it can be removed, q
4. # vxdctl enable (this will reconfigure DMP)
5. # luxadm remove_device mars-0,f0 (mars-0,f0 is enclosure name, disk slot #), return
(physically remove disk drive), return
Replacement:
6. # luxadm insert_device mars-0,f0 (mars-0,f0 is enclosure name, disk slot #), return
(physically insert new disk), return
7. # vxdctl enable (this will reconfigure DMP)
8. # vxdiskadm
9. item 5 (Replace a failed or removed disk), enter disk name, enter c#t#d#, continue y,
replace another? n, quit q
10. From here you have a choice of 2 ways to complete this (most of the time this is up to
the customer to do). Read both before choosing.
1. Make the new disk the spare and the spare disk part of the RAID:
# /usr/sbin/vxedit -g rootdg set spare=on disk01
# /usr/sbin/vxedit -g rootdg set spare=off disk05
OR
2. Take the data from the rebuilt spare and put it back on the new drive.
Evacuate the spare, disk05, back to disk01 to recover the original configuration:
# /etc/vx/bin/vxevac disk05 disk01
Minimum Configuration A5000
These are minimum disk configurations to ensure adequate signal retransmission.
14 disk array: The minimum configuration system has drives in slots 3 and 6 in front and drives in
slots 0, 3, and 6 in the rear. No other configuration is authorized. As disks are added they
should be spaced to minimize gaps between disks.
22 disk array: The minimum configuration system has drives in slots 0 and 5 in front and drives in
slots 0, 3, 6, and 10 in the rear. No other configuration is authorized. As disks are added they
should be spaced to minimize gaps between disks.


A5000 Addressing
"sf" = Host Adapter (socal), has 2 ports: sf@0,0 and sf@1,0
"ses" = Interface Boards (IB) in the A5000, 2 IBs/array, 2 ports/IB
        ses 0 and 1 = IB-A
        ses 2 and 3 = IB-B
"ssd" = disk drives

Example device path:
sbus@1f,0/SUNW,socal@1,0/sf@1,0/ssd@w2100002037007fa1,0:a

sbus@1f     Convert to decimal, divide by 2, round down. Result is I/O bd #.
socal@1     SBus slot (d = on bd soc+)
sf@1        Loop connection, port on the HBA: 0 = port A, 1 = port B
ssd@w21...  WWN#. The leading digits give the data path through IB to disk:
            21 = node A, 22 = node B
,0          lun (always 0)
:a          slice (a = 0)
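
The decoding rules for an A5000 device path can be sketched in a few lines. This is only an illustration under stated assumptions: the regular expression is written for paths shaped like the example in the text, the result keys are hypothetical names, and mapping slice letters beyond a = 0 (b = 1, and so on) is an assumption that goes past what the text states.

```python
import re

# Matches paths shaped like the handbook example:
# sbus@1f,0/SUNW,socal@1,0/sf@1,0/ssd@w2100002037007fa1,0:a
_PATH = re.compile(
    r"sbus@(?P<sbus>[0-9a-f]+),[0-9a-f]+/"
    r"SUNW,socal@(?P<slot>[0-9a-f]+),[0-9a-f]+/"
    r"sf@(?P<port>[01]),[0-9a-f]+/"
    r"ssd@w(?P<wwn>[0-9a-f]{16}),(?P<lun>[0-9a-f]+):(?P<slice>[a-h])"
)


def decode_a5000_path(path):
    """Apply the A5000 addressing rules to one device path."""
    m = _PATH.search(path)
    if m is None:
        raise ValueError("path does not look like an A5000 ssd path")
    wwn = m.group("wwn")
    slot = m.group("slot")
    return {
        # sbus@NN: hex to decimal, divide by 2, round down
        "io_board": int(m.group("sbus"), 16) // 2,
        # socal@d means the on-board soc+
        "sbus_slot": "on-board soc+" if slot == "d" else slot,
        # sf@0 = port A, sf@1 = port B
        "hba_port": "AB"[int(m.group("port"))],
        # WWN prefix: 21 = node A, 22 = node B
        "node": {"21": "A", "22": "B"}.get(wwn[:2], "unknown"),
        "wwn": wwn,
        "lun": int(m.group("lun"), 16),
        # assumption: slice letters count up from a = 0
        "slice": ord(m.group("slice")) - ord("a"),
    }


if __name__ == "__main__":
    example = "sbus@1f,0/SUNW,socal@1,0/sf@1,0/ssd@w2100002037007fa1,0:a"
    for key, value in decode_a5000_path(example).items():
        print(key, "=", value)
```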

A5000 Target ID assignments

(Box ID x 32) + (Backplane # x 16) + (Disk slot #) = Target ID

Box ID:       0, 1, 2, 3
Backplane #:  0 front, 1 rear
Disk slot #:  0-11 left to right

ex: a rear disk in slot 5 of an A5000 with box ID of 3 would be (3 x 32) + (1 x 16) + 5 = t117
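
The target ID formula above is easy to script when checking a whole enclosure. A minimal sketch follows; the function name and argument names are illustrative, and the range checks simply mirror the ranges given in the text.

```python
def a5000_target_id(box_id, rear, slot):
    """Target ID = (Box ID x 32) + (Backplane # x 16) + disk slot #.

    box_id: 0 - 3
    rear:   False for the front backplane (0), True for the rear (1)
    slot:   disk slot number, 0 - 11 as given in the text
    """
    if box_id not in range(4):
        raise ValueError("box ID must be 0 - 3")
    if slot not in range(12):
        raise ValueError("disk slot must be 0 - 11")
    backplane = 1 if rear else 0
    return box_id * 32 + backplane * 16 + slot


if __name__ == "__main__":
    # Worked example from the text: rear slot 5, box ID 3
    print("t%d" % a5000_target_id(3, True, 5))
```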
RDAC Module
- used in the A3000 and A3500 arrays
- dual hot plug RAID controllers
- Hot plug power and cooling units
- Battery backed up data cache
- Scsi out must be terminated (UDWIS)
- Controller Status leds Pattern will give you error information.
- SCSI ID jumpers for both RAID controllers, Default is 5 for top controller and 4 for the lower one
RAID Overview
RAID Manager Device Naming Conventions
C# T# D# S#
C#   Host Controller #
T#   Target ID of RAID controller
D#   Lun # (created when setting up array)
S#   slice

RAID LEVELS
RAID 0
RAID 0 is actually an AID (Array of Interconnected Disks); the R (redundant) part just isn't
there. RAID 0 puts multiple physical disks together to make them appear as
one large virtual disk. There are no parity drives or parity stripes.
RAID 1
RAID 1 is an array that is mirrored. That means there are 2 sets of disks, and every disk has a
counterpart that is an exact copy. If one fails the other will take its place.
RAID 3
RAID 3 has striped data across multiple volumes and a dedicated parity drive. If one of the
drives should fail, its data can be reconstructed from the parity drive.
RAID 5
RAID 5 has striped data across multiple volumes like RAID 3, but also has its parity striped
across multiple volumes. RAID 5 is also able to recover from a failed disk.
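
To see why a single failed drive can be rebuilt from parity, here is a small sketch. It assumes the usual XOR parity scheme; the text above does not say how these particular arrays compute parity, and the block contents are made-up sample data.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    # Three data blocks of one stripe (sample data only)
    data = [b"SUN1", b"SUN2", b"SUN3"]

    # Parity is the XOR of all data blocks in the stripe
    parity = xor_blocks(data)

    # Pretend the second drive failed: rebuild its block from the
    # surviving data blocks plus the parity block
    rebuilt = xor_blocks([data[0], data[2], parity])
    print(rebuilt == data[1])
```

In RAID 3 the parity blocks all live on one dedicated drive; in RAID 5 they are spread across the drives, but the rebuild arithmetic is the same.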

Boot process
1. VTOC (volume table of contents)
       Sector 0 of boot disk.
2. Boot Block
       Sectors 1-15. UFS reader; can be rebuilt with the installboot command.
3. ufsboot
       /platform/`uname -m`/ufsboot. Loads the standalone kernel. You can tell it is
       loaded by the first instance of the spinning wheel (after the memory size POST
       spinning wheel).
4. genunix
       /kernel/genunix. Generic unix kernel for the operating system; specific only to
       the O/S release.
5. unix
       /platform/`uname -m`/kernel/unix. Specific to O/S and architecture type. (You can
       tell it is loaded by the second instance of the spinning wheel, at the SunOS
       Release 5.7 message.)
6. /etc/system
       Has the variables to custom load kernel parameters.
       boot -a will not use the /etc/system file on boot.
7. /etc/inittab
       sysinit:      as we are trying to grab the console.
       respawn:      respawn proc if it dies
       initdefault:  default run level
       wait:         wait for job to complete
       powerfail:    on PWR signal run appropriate command.
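
The action keywords listed for /etc/inittab are the third field of each entry, which has the form id:rstate:action:process. A minimal sketch of reading one entry follows; the function name is hypothetical and the sample lines are illustrative, not copied from any particular system.

```python
def parse_inittab_line(line):
    """Split one /etc/inittab entry into its four colon-separated fields.

    Returns None for blank lines and comment lines.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    # Split on the first three colons only; the process field may
    # itself contain colons.
    fields = line.split(":", 3)
    if len(fields) != 4:
        raise ValueError("expected id:rstate:action:process")
    entry_id, rstate, action, process = fields
    return {"id": entry_id, "rstate": rstate, "action": action, "process": process}


if __name__ == "__main__":
    samples = [
        "is:3:initdefault:",
        "s2:23:wait:/sbin/rc2 >/dev/console 2>&1 </dev/console",
    ]
    for sample in samples:
        print(parse_inittab_line(sample))
```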

Diagnostic commands:
arp                  Displays Address Resolution Protocol tables.
catman -w            Creates the /usr/share/man/windex database for use with the index function
                     available through the apropos command. Creates a windex file that includes
                     every Solaris command and a brief description.
compare              Will tell you the difference between two files. ex: compare /kernel /usr/kernel
crash                Used to analyse crash dumps
devlinks             Creates symbolic links in /dev using info in /devices
df -k                Displays disk space usage in Kbytes, including free space
dfmounts             Displays remote filesystem mount info
dfshares             Displays shared filesystem info
diff                 Compares file contents
disks                Creates symbolic links in /dev/dsk and /dev/rdsk; used after the drvconfig command
drvconfig            Configures the /devices directory and the device information tree
eeprom               Analyse and change PROM settings
file                 Determine a file's type
find                 Search for specific files
format               Analyse or modify partition information
fsck                 Check UFS filesystems for inconsistencies
fstyp -v             Display extensive file system parameters for a specified file system
grep                 Analyse file contents, and search for specific patterns
groups               Display group definitions for a given user
ifconfig -a          Add, display, and analyse the status of network interfaces
iostat               Analyse I/O performance issues
isainfo -v           Will tell you if you are running 32 or 64 bit applications
last                 Display history of system login information
ls                   Analyse file properties
mpstat               Reports processor stats on a per processor basis
ndd                  Get and set named device driver parameters
netstat (-i, -r, -k) Analyse network tuning information, including active routes.
                     -i interface info/collisions
                     -r router info
                     -k kernel info (pipe to more and look for the interface; verbose version of -i)
newfs                Create and examine file system parameters
nfsstat              Analyse NFS performance information
od                   Octal dump of a file. ex: od -c /etc/nsswitch.conf will display all characters in the file
pagesize             Print the size of a memory page in bytes
patchdiag            (SunSolve CD) Listing of recommended patches
patchadd -p          Displays patches loaded on your system
patchinstall         (SunSolve CD) Is used to install patches
                     (ex: # cd /cdrom/cdrom0)
                     (    # ./patchinstall)
backoutpatch         (SunSolve CD) Will remove a patch after you cd to that directory
                     (ex: # cd /var/sadm/patch/102044-01)
                     (    # ./backoutpatch .)
perfmeter            Provide graphic display of performance metrics
ping (-s)            Contact network hosts by sending Internet Control Message Protocol (ICMP)
                     request and reply datagrams
pkgchk               Check file integrity and accuracy of installation
pkginfo -l           Will give you a description of all the packages (w/o pkg name) or one package
                     (w/ pkg name)
prtdiag              Display system configuration and diagnostic information
                     (/usr/platform/`uname -m`/sbin)
prtconf -v           Get system device information from POST probe
prtconf -vp          Device tree info and PROM version (OBP)

Diagnostic commands continued:
prtvtoc              List the VTOC (disk label) of a disk drive. ex: prtvtoc /dev/rdsk/c0t0d0s0
psrinfo -v           Will give you processor information
psradm -f (-n)       -f will allow you to offline a processor; -n will online a specified processor
/usr/ucb/ps -aux     Lists processes in descending order of CPU utilization
pwck                 Checks the password file for inconsistencies
sar                  Analyse system performance information (must be initialized in /etc/init.d/perf)
showrev -p           List currently installed patches; patchadd -p in Solaris 2.6 and above
snoop (-s)           Display and analyse network traffic
strings              Search object and binary files for ASCII strings
sysdef               Analyse device and software configuration information
swap                 Add, delete and monitor system swap areas
sum                  Calculate and print a checksum value for a named file
sys-unconfig         Enables you to change information entered during the sysidtool phase of installation
tail -f              Leave file open for reading and display what is there
tic                  Terminfo compiler; translates a terminfo file from source to compiled format
timex                List runtime and system activity information during command execution
traceroute           Show the route followed by packets transferred in a subnet environment
truss                Trace system calls issued and used by a program or command
tunefs               Modify file system parameters that affect layout policies
uname                Print platform, architecture, operating system, and system node information
vmstat               Analyse memory performance statistics
who am i             Display the effective current user name, terminal line and login time
xhost hostname       Allows graphical access to your host from the host specified in hostname

Diagnostic files
/etc/defaultdomain   Name of the current domain; read and set at each boot by script /etc/init.d/inetinit
/etc/default/cron    Determine logging activity for the cron daemon through specification of the
                     cronlog variable
/etc/default/login   Control root logins at the console through specification of the console
                     variable and other defaults
/etc/default/su      Determine logging activity for the su command through specification of
                     the sulog variable
/etc/dfs/dfstab      List what distributed file systems will be shared at boot time
/etc/dfs/sharetab    List currently shared NFS file systems
/etc/hosts           Host file linked to /etc/inet/hosts
/etc/hostname.le0
/etc/hostname.hme0   Assign a system name and, through cross-referencing the /etc/hosts file, add an
                     IP address to a particular network interface
/etc/inetd.conf      List information for network services that can be invoked by the inetd daemon
/etc/inittab         Read by init daemon at startup to determine which rc script to execute; also
                     contains default run level
/etc/minor_perm      Specifies permissions to be assigned to device files
/etc/mnttab          Display a list of currently mounted file systems
/etc/name_to_major   Display a list of configured major device numbers
/etc/netconfig       Display the network configuration database read during network initialization
                     and use
/etc/nsswitch.conf   List the database configuration file for the name service switch engine
/etc/path_to_inst    List the contents of the system device tree using the format of physical device
                     names and instance numbers
/etc/protocols       List known protocols used in conjunction with internet
/etc/release         O/S release and date
/etc/rmtab           List the current remotely mounted file systems

Diagnostic files continued:
/etc/rpc             List available RPC programs
/etc/services        List the well-known networking services and associated port numbers; maintained
                     by NIC
/etc/system          Tunable kernel parameters. boot -a will boot w/o an /etc/system file
/etc/vfstab          List local and remote filesystems mounted at boot time
/var/adm/messages    Lists recent console window and boot messages
/var/adm/sulog       Display a record for each invocation of the su command
/var/adm/utmpx       List user and accounting information for the who and login commands
/var/adm/wtmpx       Maintain history of user information for the accounting package and report
                     facility
/var/crash/hostname  Crash files: unix is the symbol lookup file, vmcore is the core dump, bounds is
                     the incremental value for the next core set
/var/lp/log          List print services activity
/var/sadm/install/contents
                     List installed software packages
/var/sadm/install_data/install_log
                     A listing of the way the install was completed
/var/sadm/pkg        Patch and package information (new O/Ss)
/var/sadm/patch      Patch and package information (old O/Ss)
/var/sadm/system/admin/INST_RELEASE
                     List of clusters installed on the system
/var/saf/_log        List activity of the Service Access Facility (SAF)
/var/spool/locks/lck Clean up to clear a bad tip session (will get error: all ports busy)

Watchdog Resets
CPU Watchdog Reset is initiated on a single processor machine when a trap condition occurs while traps
are disabled and register bit to enable traps is not set. The system tries to come down in a
deterministic state and traps to a reserved physical address
System Watchdog Reset is when a fatal error is detected on a multi-processor machine.
obpsym
The obpsym module should be loaded to maximize the amount of symbolic information available in the
PROM (obp) environment. Without this module, information is displayed without textual
information.

To check if obpsym is loaded:
# modinfo | grep obpsym
To load the module from command line:
# modload -p misc/obpsym
To load module with each boot, enter the following in /etc/system:
forceload: misc/obpsym
obp register commands - sun4u (used with watchdog reset analysis)
.locals          Displays the local CPU registers
.registers       Dumps the registers of the current window, those in use at the time of the crash
ctrace           Displays a stack trace, listing routines that were active when the system went down
                 (obpsym module should be loaded, see above)
.pstate          Formatted display of the process state register
.ver             Formatted display of the version register
.ccr             Formatted display of the ccr or cache control register
.trap-registers  Display of trap related registers

obp register commands - sun4m (used with watchdog reset analysis)
.locals          Displays the local CPU registers
.registers       Dumps the registers of the current window, those in use at the time of the crash
ctrace           Displays a stack trace, listing routines that were active when the system went down
                 (obpsym module should be loaded, see above)
.psr             Formatted display of the process status register
.fregisters      Display of the floating point registers

What to look for at the OK prompt of a watchdog reset:

Note the number next to the OK prompt, which is the number of the CPU that hit the watchdog
reset (multi-processor only).
Note the information in the following fields from the OK prompt:
.registers       Valid addresses associated with the window registers on display
.locals          Valid addresses associated with the registers on this display
ctrace           pc addresses and routine names
.ver             The implementation (IMPL) and manufacturer (MANUF) numbers
.trap-registers  The trap type (TT), the (TSTATE), and the processor state (PSTATE)
.pstate          The RED value, which is similar to the ET (enable trap) bit on
                 SPARC Version 8
Solaris commands and files that can be used in watchdog reset analysis:
showrev -p
prtconf -v
pkginfo
/usr/ccs/bin/nm /dev/ksyms > symbol_file
/usr/platform/sun4u/sbin/prtdiag -v > prtdiag_file
/etc/system
/var/adm/messages
Related document numbers in the SunSolve database include
1360 - Trouble Shooting Watchdog Resets
14133- Is the system crash due to hardware or software
14230- System crashes and how to prepare for analysis by Sun Service


Dump analysis
****Cores sent in from the customer are located in:
/net/eastcores/corefiles/SO# (SO#= SO opened by customer)
/net/cores.central/cores/gesd/fidelity/open/SO#
***(STOP A) sync ... on a hung system will cause a core dump.
Three debuggers:
adb:

Assembly debugger. It is an interactive, general purpose utility that can be used to
examine files, and it provides a controlled environment for executing programs. By default
it does not supply a prompt.
(to run adb on a dump file)
# cd /var/crash/host_name
# adb -k unix.n vmcore.n
(to run on a live system)
# adb -kw /dev/ksyms /dev/mem

What to look for in a core dump with the adb debugger:

$<msgbuf

This will give you the:
    Name of the failing process
    Register pointer (rp=)
    PID (pid=)
    Program counter (pc=)
    Stack pointer (sp=)
    Thread of failing process (g7)

If no info, do this:
To find executing instruction:
1. Do a stack trace: $c
(this will give you a listing to use in step 2)
2. Get the register pointer:
64 bit system: 2nd value from 'die'   ex: die (0x9, 0xf05246f4, 0x30, 0x326,...
32 bit system: 2nd value from 'trap'  ex: trap (0xf028a1d8, 0xf05246f4, ...
(use this value in step 3)
3. Get the value in register 'pc':
0xf05246f4$<regs
(use the value under the pc heading for step 4)
ex: pc
fc479dbc
4. Use the value in 'pc' to see the executing instruction:
fc479dbc/ai
(it will tell you something like 'ram_write')

To find thread involved with panic:

1. panic_thread/x (32 bit systems)
panic_thread/k (64 bit systems)
(this will print out something like... panic_thread: f5c66480)
2. Use the thread value to find the 'procp':
f5c66480$<thread
(look at the structure and retrieve the 'procp' value)
ex: procp
f5c0fcc8
3. Take a look at the process structure to get the process name and arguments (psargs):
f5c0fcc8$<proc2u
(you should see something in text for process name)
4. You can also use the 'procp' value found in step 2 to get the 'pidp' address:
f5c0fcc8$<proc
pidp
f74ccf93
5. Use the 'pidp' address from step 4 with the 'pid.print' macro:
f74ccf93$<pid.print
adb commands
cpu$<cpus         Display cpu0, which contains the address of the currently running thread.
cpun$<cpu         Display the cpu identified by n.
$<msgbuf          Display the msgbuf structure, which contains the console messages
                  leading up to the panic.
$c                Display the stack trace.
$C                Show the call trace and stack trace leading up to a panic, from the bottom
                  up.
$r                Display the SPARC window registers, including the program counter and
                  the stack pointer.
<sp$<stacktrace   Use the sp (stack pointer) address to locate and display a detailed
                  stacktrace.
$q                Quit adb.
$>file            Redirect output to file.

crash

Similar to adb, but the command interface is different. Crash is used to examine memory of a
running or crashed system.
(to run crash on a dump file)
# cd /var/crash/host_name
# crash vmcore.n unix.n
(to run on a live system)
# crash (without any arguments)

crash commands:
u or user    Will give info on the process that was running when the crash occurred
stat         Will give you the following information:
                 system name
                 version information
                 time of crash
                 age of system
                 type of panic
proc         Will give you a listing of the process table
defproc      Will give you the current process slot number (used with proc command)
defthread    Will give you the current thread address

kadb

Is similar to adb. It must be loaded prior to the standalone program it is to debug. To run the
kernel under kadb, type 'boot kadb' at the ok prompt.

iscda

Initial System Crash Dump Analysis. The script is included on the SunSolve CD under
the top level directory ISCDA. The following is an example of usage:
# cd /var/crash/machine_name
# iscda unix.0 vmcore.0 > /tmp/iscda.output
This will run the iscda script on the core dump in /var/crash/machine_name. The output
will go to /tmp/iscda.output. The output will consist of the results from a sequence of adb
and crash commands. If needed, you can send this file to the Sun solution center via email.
SunSolve
The SunSolve CD is a valuable tool in diagnosing problems. The following are home page
selections:
Power search      Provides a menu-driven database selection for searching bug reports,
                  FAQs, patch descriptions, tech bulletins, info docs, and Symptoms and
                  Resolutions
Patch Diag Tool   Determines the patch level of your system compared to Sun's recommended
                  patch lists. Can be run from the command line:
                  # patchdiag
Page 27

Crash Dump Analysis   Displays how to load and run the ISCDA script (Initial System
                      Crash Dump Analysis)
Sun Courier           Submits a service request to the Sun solution center (sendmail must be
                      running)

Installing a patch with Sunsolve CD:


# cd /cdrom/cdrom0
# ./patchinstall (patch#)
Removing a patch with the Sunsolve CD:
# showrev -p (list all patches installed on your system, get name and rev)
# find / -name 102044-01 -print (find installed location of patch)
# cd /var/sadm/patch/102044-01 (change to patch directory)
# ./backoutpatch .
# reboot
SUN VTS
Sun VTS is a validation test suite. VTS is run at the Solaris level, but should not be run
while the customer's applications are up. VTS comes with the Solaris package; there
are different revisions for each Solaris release: rev 2.12 for Solaris 2.6, 3.0 for Solaris 7,
and 3.4 for Solaris 8. It is recommended to use the version of VTS that corresponds to the O/S
you are running. Also check SunSolve for related patches.
Installation: (loads to the /opt/SUNWvts directory)
(share -F nfs -o ro,anon=0 /cdrom/cdrom0/s0 if ssp)
# cd /cdrom/cdrom0/Product
# pkgadd -d . SUNWvts SUNWvtsx SUNWodu SUNWvtsmn
or
# /cdrom/cdrom0/installer or run thru file manager window
To run: (programs reside in the /opt/SUNWvts/bin directory)
# sunvts                 Default graphical interface (CDE) on local machine
# sunvts -l              Runs Openlook graphical interface on local machine
# sunvts -t              Runs in tty mode*
# sunvts -h host-name    Runs graphical interface on local machine
                         while connecting and testing a remote machine (Sun VTS
                         must be loaded on both machines)
* # TERM=vt100; export TERM (use this command when running in tty mode from notebook)
*** set_options / Thresholds to 00 (to log errors and continue)
sunvts -t Navigation: (the <ctl> keys are good if you forgot to set the TERM)
<tab>      move between windows
<ctl> w    move between windows
<arrow>    move within window
<ctl> r    move within window on same line
<ctl> u    move within window up/down lines
<ctl> f    move within window forward
<ctl> b    move within window backwards
<ctl> l    refresh screen
<esc>      close pop-up menu
<space>    select / deselect test
Page 28
<enter>    select function

STORtools
The STORtools Toolkit simplifies the monitoring and troubleshooting of Sun
StorEdge A5000, A5100, A5200 disk array installations. The tool provides
an easy to use, menu-driven front end program with task explanations and
help information. Command line utilities are provided for advanced customized
use. The utilities have standard man pages for online documentation.
STORtools provides tools for performing the following tasks:
- Revision Checking
- Configuration Management
- Monitoring and Notification
- Troubleshooting and Fault Isolation
To install from CDrom:
# pkgadd -d . STORtools
To install after down load from web site:
# uncompress STORtools.tar.Z
# tar -xvf STORtools.tar
# pkgadd -d . STORtools
To run STORtools
# /opt/STORtools/bin/stormenu

page 29

Explorer Scripts:
New Version:
The new version of explorer can be found on Sunsolve under "navigation - diagnostic tools"
It is now a software package (SUNWexplo) and can be installed and run (initially) with the
pkgadd -d command.
To expand: # zcat SUNWexplo.tar.z | tar xvf -
To install: # pkgadd -d . SUNWexplo
Once the package is installed explorer can be run from /opt/SUNWexplo/bin/explorer.
Old Version:
The following is documentation sent out with the explorer script. It contains information
on how to expand, run and mail the output from the explorer.
1. #su root
2. Save the explorer.tar.Z file in directory where root has write permission
3. for encoded files:
#uudecode filename
#zcat explorer.tar.Z | tar xvf -
4. #./explorer
- While executing this script, you will be prompted to enter information about your site.
- If you have internet access, we ask that you enter "y" to the question "Would you like
to e-mail results [y/n]" so that we get the output automatically.
- If you choose not to e-mail the explorer file automatically, please send the resulting file
(*.uu) as an attachment to your PTAS account manager.
Explorer in CRON (for this example, explorer will reside in /usr/tmp)
**** Do steps 1-3 above
1. Copy the file 'explorer.template' to another file (i.e. file_name)
2. # chmod 755 file_name
3. Edit file_name and fill in the appropriate lines.
4. Edit the root crontab file using the 'crontab -e' command and make an entry
similar to the following (all on one line):
00 23 1 * * cd /usr/tmp; /usr/bin/zcat explorer.tar.Z | /usr/bin/tar xvf - ; /usr/tmp/explorer -file /usr/tmp/file_name -mail
5. If you choose not to email the explorer file automatically (-mail option)
please send the resulting file (*.uu) as an attachment to your PTAS Account
manager.
Note: if crontab -e does not work correctly, try setting the following variable
'setenv EDITOR vi'
To view the explorer output file:
run uudecode on the *.uu file (this will create a host_id.tar.z file)
run gunzip on the tar.z file (this will create a host_id.tar file)
run tar -xvf on the .tar file (this will expand the file to the explorer output
structure)
page 30

Performance Analysis
Tools: (commands)
timex reports system activity for the execution of a single command
-o reports I/O transfers
-s reports sar activity during command
-h reports 'hog factor'
ex: # timex ps -ef (will tell you the amount of time the ps command took to
execute)
top
display and update information about the top cpu processes
ex: # top 20 (will give you stats on the top 20 processes default is 10)
vmstat     reports virtual memory statistics
ex: # vmstat 15 2 (will collect and report virtual memory stats for 2 intervals of
15 seconds)

iostat     reports I/O statistics
ex: # iostat 60 3 (will collect and report I/O statistics for 3 intervals of 60 seconds)
disk thruput test: ( from infodoc 21931)
for write performance: (this will write over data. do not use if data is needed on this disk)
# dd if=/dev/zero of=/dev/rdsk/cxtxdxs2 bs=1024k
for read performance:
# dd if=/dev/rdsk/cxtxdxs2 of=/dev/null bs=1024k
# iostat -pxn 5

mpstat     reports processor statistics per processor
ex: # mpstat 30 2 (will collect and report proc stats for 2 intervals of 30 seconds)

sar        reports overall system activity
-u CPU usage data
-q average length of run queue
-r unused memory pages and swap disk blocks
ex: sar -u 60 30 (will collect cpu data for 30 intervals of 60 seconds each)
sar -q 60 30 (will collect run queue data for 30 intervals of 60 seconds each)
sar -r 60 30 (will collect free memory and swap data for 30 intervals of 60 seconds each)

reports on current system activity per user

page 31

Backups
ufsdump    backs up all files specified by files_to_dump (normally either a whole file
           system or files within a file system changed after a certain date) to
           magnetic tape, diskette, or disk file. Filesystems to be backed up
           must be inactive (unmounted or single user mode)
0-9        dump level; 0 is a full dump. Each level is relative to what has already been
           backed up: it saves files modified since the most recent dump at a lower level.
           If a level 2 was done and then a level 4 backup was done the next day, a level 5
           on the following day would back up all files modified since the level 4.
           If instead you did a level 3 backup, all files modified
           since the level 2 would be backed up.
c          cartridge. Sets the defaults for cartridge instead of the standard
           half-inch reel.
f          Dump file. Use dump_file as the file to dump to, instead of
           /dev/rmt/0. If dump_file is specified as -, dump to standard output.
u          update the dump record. Add an entry to the file /etc/dumpdates.
v          verify. After each tape or diskette is written, verify the contents
           of the media against the source file system.
ex: # ufsdump 0cfu /dev/rmt/0 /dev/rdsk/c0t3d0s0 (full dump of a root file
system on c0t3d0 on cartridge tape unit 0)
# ufsdump 0uf /dev/rmt/0 /usr (dump the /usr filesystem to tape)
# ufsdump 5fuv /dev/rmt/1 /dev/rdsk/c0t3d0s6 (make and verify an
incremental dump at level 5 of the /usr partition of c0t3d0,
on tape unit 1)
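The dump-level rule described above can be modeled with a few lines of shell. This is only a sketch: base_level is a hypothetical helper (not a ufsdump option), and it assumes a POSIX shell such as ksh. It takes the new level followed by the earlier dump levels, oldest first, and prints the level the new dump would be relative to.

```shell
# Sketch only: models which earlier dump a new ufsdump level is relative to.
# base_level is a hypothetical helper, not part of ufsdump.
# Usage: base_level NEW_LEVEL PREVIOUS_LEVELS... (oldest first)
base_level() {
    new=$1
    shift
    base="none"              # "none" means no lower-level dump exists yet
    for lvl in "$@"; do
        # Remember the most recent earlier dump whose level is lower.
        if [ "$lvl" -lt "$new" ]; then
            base=$lvl
        fi
    done
    echo "$base"
}

base_level 5 2 4    # level 2, then level 4, then a level 5 today
base_level 3 2 4    # level 2, then level 4, then a level 3 today
```

The first call corresponds to the "level 5 after a level 4" case in the text, the second to the "level 3 instead" case.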
ufsrestore ufsrestore utility restores files from backup media created with the
           ufsdump command.
i          Interactive. After reading in the directory information from the
           media, ufsrestore invokes an interactive interface that allows
           you to browse through the dump file's directory hierarchy and
           select individual files to be extracted. Valid commands
           are ls, cd, add, verbose, delete, extract, quit
r          Recursive. Restore the entire contents of the media into the
           current directory (which should be the top-level of the file system).
           To completely restore a file system, use this function letter to
           restore the level 0 dump, and again for each incremental dump.
t          Table of contents. List each filename that appears on the media.
           If no filename argument is given, the root directory is listed.
x          Extract the named files from the media. If a named file matches
           a directory whose contents were written onto the media, and the h
           modifier is not in effect, the directory is recursively extracted.
f          Use dump_file instead of /dev/rmt/0 as the file to restore from.
           Typically dump_file specifies a tape or diskette drive.
ex: # ufsrestore tvf /dev/rmt/0 (list tape contents of /dev/rmt/0)
# ufsrestore rvf /dev/rmt/0 (restore contents of tape /dev/rmt/0 to
the current directory you are in)
# ufsrestore ivf /dev/rmt/0 (interactive restore of tape /rmt/0)
page 32

tar        Copies and archives files
-c create (backup)
-v verbose (details)
-f device
-t table of contents (list)
-x extract
-p restore to original mode
-h follow symbolic link
-d access special files
ex: tar -cvf /dev/rmt/0 /usr (backup /usr to tape /rmt/0)
tar -xvf /dev/rmt/0 /usr (restores /usr from tape /rmt/0)
tar -tvf /dev/rmt/0 (lists the contents of tape /rmt/0)
zcat file_name.tar.Z | tar xvf - (expand a tar.Z file)

cpio       copies and archives files
-o output
-v verbose
-i input
-t list
-d create directories
-m retain modification time
ex: # find /usr -print | cpio -ov > /dev/rmt/0 (copies /usr to /dev/rmt/0; cpio -o reads
the list of files from standard input)
cpio -itv < /dev/rmt/0 (list the contents of /dev/rmt/0)
cpio -idmv < /dev/rmt/0 (restores /dev/rmt/0)

dd         Device to device copy
ex: # dd if=ascii_file of=ebcdic_file conv=ebcdic (converts an ascii file to ebcdic)
# dd if=/dev/rmt/0 of=/dev/rmt/1 (copies from rmt/0 to rmt/1)
# dd if=/dev/rdsk/c0t0d0s2 of=/dev/rdsk/c1t0d0s2 bs=512000
(for a quick copy of c0t0d0 on c1t0d0)

page 33

How is a Coredump Generated?


When a system crashes, it writes a copy of its memory to a temporary location on a disk, usually to the
primary swap partition. Savecore is a program which runs at boot time to retrieve the memory copy
from the temporary location and to save it to a place where it can be accessed. Savecore must be run
during the bootup process, or very shortly thereafter, before it would be overwritten by a
running operating system which uses the primary swap partition for other purposes.

How to Get a Coredump from a Solaris 2.x system


Getting a coredump is not enabled by default, because corefiles can be
quite large. Enabling a coredump requires the following to be done:
1) Verify that savecore exists.
Do the following command:
ls -l /usr/bin/savecore
savecore is located in the SUNWtoo package (Programming tools) in 2.X, and is not part of the core install.
If savecore does not exist on a 2.X system, do a pkgadd on SUNWtoo.
a) Put the correct OS version installation CDrom in the CDrom drive.
b) Wait until the access lamp goes out in the CDrom drive.
c) # pkgadd -d /cdrom/sol*/s0/Sol* SUNWtoo
d) Answer the questions.
2) Determine how much memory you have on your system. This can be done by:
a) examining your system banner if your system is down by typing "banner" at the "OK" prompt.
b) doing a "wsinfo" on a 2.x system running openwindows, and checking the "physical memory" column.
c) looking at the /var/adm/messages file, or output of the dmesg command, and searching for the line
which starts with "mem =". The number which follows will be in bytes. Divide by 1048576 to
get megabytes.
3) Find any locally mounted partition, other than /tmp, which has enough room to hold the coredump. A
coredump usually takes about 35% of the size of main RAM memory.
4) Verify that your dump area is at least 35% of the size of main RAM memory. A regular disk is preferred
to a meta-filesystem running under Veritas or DiskSuite control. The dump area is usually the primary
swap file.
Execute a "swap -l" command and observe the first line with values in it. Take the number in the
"blocks" column and divide by 2048. This is the number of megabytes in the primary swap file.
Compare this to the size of main RAM memory found in step (2) above.

Page 34

5) Enable savecore as follows: (Savecore is enabled by default in Solaris 2.7.)


a) Edit /etc/init.d/sysetup, and search for the word
"savecore". You will find something similar to
##
## Default is to not do a savecore
##
#if [ ! -d /var/crash/`uname -n` ]
#then mkdir -p /var/crash/`uname -n`
#fi
#echo 'checking for crash dump...\c '
#savecore /var/crash/`uname -n`
#echo ''
b) Remove the left "#" signs from the bottom 6
statements in (a) above.
c) (optional, if you don't want the core copied to /var or if /var
isn't large enough)
Substitute the name of the partition found in (3)
above for "/var" wherever it shows in the statements in (a) above.
Incidentally, if you know that savecore is enabled but do not know where
the corefiles are put, checking the "savecore" statement listed above
will tell you.
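The arithmetic in steps 2 to 4 can be sketched in shell as follows. The helper names are hypothetical, the sample numbers are made up for illustration, and the 35% figure is only the rule of thumb quoted above. A POSIX shell such as ksh is assumed, since the older Solaris /bin/sh does not support $(( )) arithmetic.

```shell
# Sketch only: helper names are hypothetical; 35% is the handbook's rule of thumb.

# "mem =" value in bytes -> megabytes (divide by 1048576)
bytes_to_mb() {
    echo $(( $1 / 1048576 ))
}

# "blocks" column of "swap -l" (512-byte blocks) -> megabytes (divide by 2048)
swap_blocks_to_mb() {
    echo $(( $1 / 2048 ))
}

# Prints "ok" if the dump area holds at least 35% of RAM, else "too small".
# Usage: dump_area_check RAM_MB SWAP_MB
dump_area_check() {
    need=$(( $1 * 35 / 100 ))
    if [ "$2" -ge "$need" ]; then
        echo "ok"
    else
        echo "too small"
    fi
}

ram_mb=$(bytes_to_mb 268435456)         # example value: mem = 268435456
swap_mb=$(swap_blocks_to_mb 611040)     # example value: 611040 blocks
echo "RAM ${ram_mb}MB, swap ${swap_mb}MB: $(dump_area_check "$ram_mb" "$swap_mb")"
```

On a real system the two input numbers would come from the "mem =" line and the first data line of "swap -l"; they are hard-coded here so the sketch runs anywhere.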

Page 35

Dump device bad when saving core on encapsulated root


Problem:
Systems with VxVM encapsulated boot disks will not be able to do system dumps if the swap
slice is not tagged as swap. With the root drive encapsulated, if the system tries to do a
system dump in the event of a panic, it may present messages similar to the following:
panic: <some OS kernel panic message>
syncing file systems... done
2084 static and sysmap kernel pages
380 dynamic kernel data pages
385 kernel-pageable pages
0 segkmap kernel pages
0 segvn kernel pages
253 current user process pages
3102 total pages (3102 chunks)
dumping to vp fc2f9204, offset 171232
0 total pages, dump device bad      <=- The problem!
rebooting...
Problem Solution:
If the swap slice was not tagged as swap in format when the root
drive was encapsulated, the encapsulation process will zero out
the swap slice when it makes the swap volume:
Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       0 -  134      100.20MB    (135/0/0)   205200
  1 unassigned    wm       0               0         (0/0/0)          0
  2     backup    wm       0 - 2732        1.98GB    (2733/0/0) 4154160
  3        usr    wu     825 - 1229      300.59MB    (405/0/0)   615600
  4        usr    wm    1230 - 1667      325.08MB    (438/0/0)   665760
  5 unassigned    wm       0               0         (0/0/0)          0
  6          -    wu       0 - 2732        1.98GB    (2733/0/0) 4154160
  7          -    wu     135 -  135        0.74MB    (1/0/0)       1520

In this example, slice 1 is the swap slice.


When the system dumps, it needs to use the physical device and not the swap volume. The dump
fails because slice 1 shows a zero size in format.
To solve the dump dev problem, you need to go into format and edit
slice 1, change the tag to swap, and give it the start and end
cylinders.

Page 36

To get the size of the slice (which determines the end cylinder), you need to look in /etc/vx/reconfig.d/disk.d/c?t?d?/vtoc:


# cd /etc/vx/reconfig.d/disk.d/c0t0d0
# more vtoc
#THE PARTITIONING OF /dev/rdsk/c0t0d0s2 IS AS FOLLOWS :
#SLICE     TAG      FLAGS    START      SIZE
 0         0x2      0x200    0          103360
 1         0x0      0x201    103360     611040
 2         0x5      0x200    0          4154160
 3         0x4      0x200    718960     615600
 4         0x7      0x200    1334560    205200
 5         0x0      0x200    1539760    410400
 6         0x0      0x000    0          0
 7         0x0      0x000    0          0

In this example, slice 1 starts at block 103360 and its size is 611040 blocks (611040b). Enter
that size in format and it will work out the ending cylinder for slice 1.


In format, select the root drive and edit slice 1:
partition> p
Current partition table (unnamed):
Total disk cylinders available: 2733 + 2 (reserved cylinders)
Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       0 -   67       50.47MB    (68/0/0)    103360
  1 unassigned    wm       0               0         (0/0/0)          0
  2     backup    wm       0 - 2732        1.98GB    (2733/0/0) 4154160
  3        usr    wm     473 -  877      300.59MB    (405/0/0)   615600
  4        var    wm     878 - 1012      100.20MB    (135/0/0)   205200
  5 unassigned    wm       0               0         (0/0/0)          0
  6          -    wu       0 - 2732        1.98GB    (2733/0/0) 4154160
  7          -    wu    2732 - 2732        0.74MB    (1/0/0)       1520

partition> 1
Part      Tag    Flag     Cylinders        Size            Blocks
  1 unassigned    wm       0               0         (0/0/0)

Enter partition id tag[unassigned]: swap
Enter partition permission flags[wm]:
Enter new starting cyl[0]: 68
Enter partition size[0b, 0c, 0.00mb]: 611040b <==from vtoc file
partition> l
Ready to label disk, continue? y
partition> p
Current partition table (unnamed):
Total disk cylinders available: 2733 + 2 (reserved cylinders)
page 37

Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       0 -   67       50.47MB    (68/0/0)    103360
  1       swap    wm      68 -  469      298.36MB    (402/0/0)   611040
  2     backup    wm       0 - 2732        1.98GB    (2733/0/0) 4154160
  3        usr    wm     473 -  877      300.59MB    (405/0/0)   615600
  4        var    wm     878 - 1012      100.20MB    (135/0/0)   205200
  5 unassigned    wm       0               0         (0/0/0)          0
  6          -    wu       0 - 2732        1.98GB    (2733/0/0) 4154160
  7          -    wu    2732 - 2732        0.74MB    (1/0/0)       1520
partition> q
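As a cross-check on the numbers in this example, the start and end cylinders can be worked out from the vtoc START and SIZE values. This is a sketch only: the helper names are hypothetical, a POSIX shell such as ksh is assumed, and the 1520 blocks per cylinder figure is read off the one-cylinder slice 7 in the tables above.

```shell
# Sketch only: hypothetical helpers. Assumes START and SIZE are whole
# numbers of cylinders, as they are in the example above.

# Usage: slice_start_cyl START_BLOCK BLOCKS_PER_CYL
slice_start_cyl() {
    echo $(( $1 / $2 ))
}

# Usage: slice_end_cyl START_CYL SIZE_BLOCKS BLOCKS_PER_CYL
slice_end_cyl() {
    cyls=$(( $2 / $3 ))            # cylinders occupied by the slice
    echo $(( $1 + cyls - 1 ))      # last cylinder, counting from START_CYL
}

start=$(slice_start_cyl 103360 1520)      # START of slice 1 in the vtoc file
end=$(slice_end_cyl "$start" 611040 1520) # SIZE of slice 1 in the vtoc file
echo "swap slice: cylinders $start - $end"
```

The result should agree with the "68 - 469" range that format reports for slice 1 after labeling.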

page 38

Uncompressing Files:

What to use to uncompress files:


Use the 'file (file_name)' command to determine what type of compression was used.
Ex: # file 2.6_x86_Recommended.tar.gz
2.6_x86_Recommended.tar.gz: gzip compressed data - deflate method , original file name
*.tar.Z files use the 'zcat (file_name.tar.Z) | tar xvf -' command
Ex: # zcat explorer.v.3.1.0.tar.Z | tar xvf -
*.tar.gz files use the 'gzcat (file_name.tar.gz) | tar xvf -' command
Ex: # gzcat 2.6_x86_Recommended.tar.gz | tar xvf -
you can also use the 'gunzip' command but that will result in a *.tar file and
you will have to use the 'tar -xvf (file_name.tar)' command to expand it
*.tar.z files copy to *.tar.Z and use zcat (see above)
*.zip files use the 'unzip (file_name.zip)' command
Ex: # unzip stuff.zip
*.tar files use the 'tar -xvf (file_name.tar)' command
Ex: # tar -xvf 2.6_x86_Recommended.tar

*****zcat can be found on most versions of Solaris in /usr/bin******


gzcat can be found on the web and Sunsolve CD
gunzip or gzunzip can be found in /usr/dist/exe on the corporate network
tar can be found on most versions of Solaris in /usr/bin
unzip can be found in /usr/dist/local/exe on the corporate network
****NOTE: It is a good idea (due to the locations of these commands) to have them on a floppy
or CD that you can bring on-site. *****
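The suffix-to-command rules above can be summarized as a small lookup. This is a sketch only: expand_cmd is a hypothetical helper that prints the suggested command without running it, and it assumes a POSIX shell such as ksh.

```shell
# Sketch only: expand_cmd is a hypothetical helper. It prints the command
# suggested above for a given archive name; it does not run anything.
expand_cmd() {
    case "$1" in
        *.tar.Z)  echo "zcat $1 | tar xvf -" ;;
        *.tar.gz) echo "gzcat $1 | tar xvf -" ;;
        # *.tar.z: copy to *.tar.Z first, then treat it like a .tar.Z file
        *.tar.z)  echo "cp $1 ${1%.z}.Z; zcat ${1%.z}.Z | tar xvf -" ;;
        *.zip)    echo "unzip $1" ;;
        *.tar)    echo "tar -xvf $1" ;;
        *)        echo "file $1" ;;    # unknown suffix: identify it first
    esac
}

expand_cmd explorer.v.3.1.0.tar.Z
expand_cmd 2.6_x86_Recommended.tar.gz
expand_cmd stuff.zip
```

When the suffix is missing or misleading, the 'file' command remains the more reliable way to tell what compression was used.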

page 39

T300 (purple): Also see page 67


Description:
The T300 array is a hardware RAID FCAL device. As such, please make sure all firmware
and patches are up to date. You can use STORtools* to exercise and troubleshoot the product.
The T300 also has a com (rs232) port so you can tip into it, and an ethernet port so you can
use telnet, ftp, tftp boot, or administer it through Component Manager.
The T300 has an EP (extended Prom) boot that runs post and has its own set of commands.
It also runs a limited function unix O/S called PSOS (accessed thru tip or telnet). PSOS
can be run from the reserved area on the array drives, or tftp can be used to load it from
the server.
*STORtools will only test to the MIA on the T300 product line.
Partner group
Two T310s cabled together through the UICs. The cables coming from the 2 dot (OUT ..)
ports on the UIC designate the primary array. The other array (uic 1 dot IN) becomes the
secondary array. Only 2 T300s can be in a partner group at this time. In a partner group
with 2 fiber paths, the server will access the LUNs thru both paths: top array LUNs
thru the top array controller and bottom array LUNs thru the bottom array controller. If something
happens to one of the controllers, the LUNs will fail over to the remaining controller.
Tray ID #s (fru stat, fru list)
u# = unit; right now valid numbers are u1 and u2
u1d3 = unit 1 disk 3
u2pcu1 = unit 2 power cooling unit 1
u1l1 = unit 1 loop 1 (uic1)
u2ctr1 = unit 2 raid controller 1
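The naming pattern in the list above can be expanded mechanically. This is a sketch only: fru_name is a hypothetical helper (not an array command), it handles just the forms shown in the list, and it assumes a POSIX shell such as ksh.

```shell
# Sketch only: fru_name is a hypothetical helper that spells out the FRU
# identifiers listed above. Only the forms shown in the list are handled.
fru_name() {
    # Unit number is the digits after the leading "u"; the rest names the FRU.
    unit=$(echo "$1" | sed -n 's/^u\([0-9][0-9]*\).*/\1/p')
    rest=$(echo "$1" | sed 's/^u[0-9][0-9]*//')
    if [ -z "$unit" ]; then
        echo "unknown"
        return 1
    fi
    case "$rest" in
        "")         echo "unit $unit" ;;
        d[0-9]*)    echo "unit $unit disk ${rest#d}" ;;
        pcu[0-9]*)  echo "unit $unit power cooling unit ${rest#pcu}" ;;
        l[0-9]*)    echo "unit $unit loop ${rest#l}" ;;
        ctr[0-9]*)  echo "unit $unit raid controller ${rest#ctr}" ;;
        *)          echo "unknown"; return 1 ;;
    esac
}

fru_name u1d3
fru_name u2pcu1
fru_name u1l1
fru_name u2ctr1
```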
Default array login: :/:> root (return) no password
Default Configuration: 1 LUN RAID 5
Chassis Model number history:
p1.7   Darker gray, 2 fiber data ports on raid controller bd.
p1.8   Single fiber data port and HH (1.6") Seagate drives
p1.9   Single fiber data port and LP (1.0") drives
p2.0   Redesigned chassis called "barney" (have not seen yet 3/12/00)

Hot pluggable FRUs:


PCU    (Power Cooling Unit) The battery is good for only 2 years; messages appear in syslog
       45 days prior to expiration. Once a PCU is unplugged you have 30 min to change it before
       the array starts a shutdown sequence. The array requires 3 fans to stay below critical temp.
UIC    (Unit Interconnect Controller) Verify status thru fru stat. Once a UIC is removed you
       have 30 min to change it before the array starts a shutdown sequence.

Raid Controller is only redundant in a partner group. Also needs to have some type of DMP
running (veritas) to fail over and have the server be able to access the disks on the
failed array.
page 40

T300 (continued) Also see page 67


Disk(s) Numbered 1 - 9 left to right while facing front of array. Pull disk out ( use
spring loaded latch handle) one inch, wait 30 seconds then remove from array.
Once Disk drive is removed you have 30 min to change before array starts
a shutdown sequence.
MIA    Media Interface Adapter (fiber to copper connection) is only redundant in a partner group.
       Also needs to have some type of DMP running (veritas) to fail over and have the server
       be able to access the disks on the failed array.

LEDs: ( in general, for specific info see pg 6-9 & 6-15 install and admin manual)
           Solid                      |  Blinking
Green:     normal status              |  system activity
Amber:     Fru is being initialized   |  Fru failure (controller, uic, pcu, disk)

Path:
Sbus controller # (hba)
|
C#T#D#S#
| | |_ Slice
| |____ T300 volume number (LUN) (use 'port listmap' command)
|_______ Target ID of array ( use 'port list' 'port set' commands)
****Use format, scsi, inquiry, mode bytes, 10 = primary path 30 = secondary path ****
****You will cause a LUN failover if you try to access the secondary path LUNSs through *****
****low level commands like format and dd in a partner group*****
sbus@1f,0/SUNW,socal@1,0/sf@1,0/ssd@w50020f2300000a06,1:a
  sbus@1f                  convert to decimal, divide by 2, round down; result is I/O bd #
  socal@1                  sbus slot (d = on board soc+)
  sf@1                     Loop connection, port on the HBA: 0 = port A, 1 = port B
  ssd@w50020f2300000a06    WWN#; last 6 digits are from mac address (set command)
  ,1                       Volume on array, LUN (port listmap)
  :a                       slice, a = 0
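The "convert to decimal, divide by 2, round down" rule for the sbus@ value can be sketched in shell. The helper names are hypothetical, the sed patterns assume a path shaped like the example above, and a POSIX shell such as ksh is assumed.

```shell
# Sketch only: hypothetical helpers applying the rule above to a device path.

# Hex value after "sbus@" -> I/O board number
io_board() {
    dec=$(printf '%d' "0x$1")    # convert hex to decimal
    echo $(( dec / 2 ))          # divide by 2, rounding down
}

# Pull the hex value that follows "sbus@" out of a full path
sbus_id() {
    echo "$1" | sed 's/^.*sbus@\([0-9a-f]*\),.*$/\1/'
}

# Pull the LUN (the value between the last comma and the slice letter)
path_lun() {
    echo "$1" | sed 's/^.*,\([0-9a-f]*\):[a-h]$/\1/'
}

path='sbus@1f,0/SUNW,socal@1,0/sf@1,0/ssd@w50020f2300000a06,1:a'
echo "I/O board $(io_board "$(sbus_id "$path")"), LUN $(path_lun "$path")"
```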
T300 Boot:
- Eprom: T300 EP boot (1st stage)
  - POST
  - U1d1 (will try to get PSOS from U1d1-d9, or TFTP if set bootmode tftp)
- PSOS boot (T300 Release x.x) (2nd stage)
  - POST
  - Mount filesystems
  - Load daemons
  - Login prompt
page 41

T300 (continued) Also see page 67


TFTP BOOT:

(if the chassis is swapped, enter the new mac address into the /etc/ethers file of the tftp server)

On Server:
1. Modify /etc/hosts file on server with ip and name of array
2. Modify (create) /etc/ethers file on server with mac and array name
3. Create /tftpboot directory and copy nbxxx.bin (psos) to it
4. Un comment '#tftp' in /etc/inetd.conf
5. kill -HUP inetd PID#
6. ps -ef | grep in.rarpd (should be running... restart if tftp doesn't work)
On Array:
7. Modify Bootmode to tftp
(:/:set bootmode tftp)
8. Modify tftphost to server's IP
(:/: set tftphost xxx.xxx.xxx.xxx)
9. Modify tftpfile to nbxxx.bin (step#3)
(:/: set tftpfile nbxxx.bin)
10. Modify IP to ip assigned to your array ** (:/: set ip xxx.xxx.xxx.xxx)
11. Reset array
** If rarp is working, the array should get its IP from the server. If the IP is assigned thru the "set"
command, then the array will go to the 'who is tftphost' phase of tftpboot.
Add a volume (lun) to an array: (:/: sys blocksize (n)k should be set to correct value before 'vol add')
vol add vol_name data u#d#-# raid # standby* u#d9
vol init vol_name data rate(1-16)
vol mount vol_name
vol stat*
vol list*
vol mode*
(Note: if t3b and volslice is enabled, you must create a slice to see lun in format- pg. 76)
*optional
T300 useful commands:

(use the 'help' command to get specific switches)

File management:
mkdir, rmdir, cd, pwd, touch, cat, more*, tail ,rm, mv, telnet, ftp**
*more command use q=quit, f= forward, b= backward
** ftp requires a password on the root account
vol commands:
vol list, vol add, vol remove, vol init, vol mount, vol unmount, vol mode,
vol verify, vol stat.
boot       Boot system (-i, -s,)
disable    Disable controller (u1,u2) or loop cards (ux lx)
disk       Disk administration (version)
date       set date and time (200003071607 = 03/07/2000 16:07)
enable     Enable controller (u1,u2) or loop cards (ux lx)
ep         Program the flash prom
fru        Display FRU information (-s , -st, list, stat,)
help       Display reference Manual pages
id         Display fru identification summary
lpc        Get interconnect card property (ledtest)
page 42

T300 (continued) Also see page 67


passwd     change or display array password
port       configure the interface port number (list, listmap, set)
proc       Display status of outstanding vol processes (list, kill)
refresh    Start/stop battery refreshing or display its status
reset      Reset system
set        Display or modify the set information
shutdown   Shutdown disk tray or partner group
sys        Display or modify the system information (list) (*mp_support to rw for dmp)
tzset      set the time zone
ver        Display the software version
vol        Display or modify volume information
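The argument to the array's 'date' command in the list above reads as year, month, day, hour, minute packed together. A small sketch that splits such a string back into a readable form follows. show_date is a hypothetical helper, and the YYYYMMDDhhmm layout is inferred from the single example given, not taken from the product manual.

```shell
# Sketch only: show_date is a hypothetical helper. The field layout
# (YYYYMMDDhhmm) is inferred from the handbook's one example.
show_date() {
    yyyy=$(echo "$1" | cut -c1-4)
    mm=$(echo "$1" | cut -c5-6)
    dd=$(echo "$1" | cut -c7-8)
    hh=$(echo "$1" | cut -c9-10)
    min=$(echo "$1" | cut -c11-12)
    echo "$mm/$dd/$yyyy $hh:$min"
}

show_date 200003071607
```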

Firmware upgrading: (strongly recommended to have array "out of use" before upgrading
firmware. This includes disable polling from Component manager)
FTP firmware files to / on the array. At this moment the files can be found at
http://icode.ebay but in the future they will be available on sunsolve Patch 109115.xx.
Raid controller firmware upgrade:
:/:> boot -i nb###.bin
:/:> reset -y (Warning: if base firmware was below 1.17a, use serial port to reset)
EEprom upgrade:
:/:> ep download ep2_09.bin
UIC upgrade:
:/:> lpc download u#l# lpc_04.11 (3minutes/card, will take card off line)
Disk upgrade: (unmount volumes, 20min for 9 disks, led goes amber during download)
:/:> disk download u1d1-9 D44a.lod
Useful Array files:
/syslog            Array error log file, 1Meg in size. Then gets copied to .old
/syslog.old        backup to syslog
/etc/syslog.conf   Configures where to send error messages

Comm port wiring for notebook: ( it works I verified it)


RJ11         to      DB9         or      DB25
1 grd --------------- 5 grd--------------7 grd
5 RXD -------------- 3 TXD-------------2 TXD
6 TXD -------------- 2 RXD-------------3 RXD
(RJ11 pins: 123456)

Useful web sites:


http://icode.ebay      Firmware
http://ISI.com         PSOS o/s information
http://thedance.ebay/hardware/arrays/purple/hardware.html
                       White papers and documentation


page 43

ACT ( A Crashdump Tool)


ACT is a tool that can be run against a core dump or live system. It generates a report that gives you
server state information based on the core. ACT should be run on the server that panicked, or at least
on a server that has the same O/S version as the core that is being analysed. The
engineers that maintain ACT recommend you give it to your customers and have them install it on
their servers. When a core dump is produced they can run ACT on the core and forward the output
to the solution center. Because the output is much smaller than the core, it will save time in transmission.
ACT is supposed to become the standard output that all centers will accept.
Available at Http://cte-www.uk It is in *.gz format. To expand it:
# gunzip CTEact.tar.gz
(this will create a CTEact.tar file)
# tar -xvf CTEact.tar
(this will explode the CTEact directory)
# pkgadd -d . CTEact
(will install the package into /opt/CTEact)
(answer install questions, I selected 'n' for mailout option)
(executable is /opt/CTEact/bin/act)
Examples:
# ./act -l (output on live server to screen)
# ./act -l -s /tmp/dir/ (output from live server to separate files)
# ./act -d /var/crash/hostname/vmcore.0 -s /tmp/dir/ (output core
file to separate files in /tmp/dir)
# ./act -d /var/crash/hostname/vmcore.0 > /tmp/act_out (output core
file to file /tmp/act_out)
****** Info from our website ******
ACT is a tool developed over several years to aid in the process of
analysing kernel dumps. It attempts to perform a good first pass on a
kernel dump.
ACT prints detailed and accurate information about:
- Where the kernel panicked
- A complete list of threads on the system.
- The contents of the /etc/system file which was read when the failed
system booted
- A list of kernel modules that were loaded at the time of the panic.
- The output of the kernel message buffer
- Full deadlock detection relating to threads blocked on mutexes or
readers/writer locks.
- Threads blocked in either getblk() or biowait().
ACT was conceived and developed by Steve Cumming, while working for what was
SunService and then while working for SMCC European CTE. After a short
illness Steve died on July 12th 1998.
ACT is under continuous development by members of Computer Systems European
CTE group based in Bagshot, UK.
page 44

Installation
ACT now resides in package format for both x86 and sparc, so pkgadd should be
used for installation. To check on the current version click Here.
By installing one of the packages below ACT will be installed for the
appropriate architecture and version of Solaris you are running and a new
RC script will be installed which will configure savecore and run ACT
against the newly generated crash dump upon system reboot.
CTEactx.tar.gz. ACT for X86
CTEact.tar.gz. ACT for SPARC
Or alternatively if you have KENV installed then you can tar the following
over kenv in order to update Kenv with the latest version ACT.
KENVact.tar.gz. ACT for KENV.
Instructions
ACT takes the following options, options may appear in any order :
-d corefile
ACT assumes that the file corefile contains the kernel core image.
This file could be /dev/mem if you want ACT to analyze the running
system.
-l
Should be used when running act on a live system.
-n namelist
ACT assumes that the file namelist contains a valid kernel
namelist. This file could be /dev/ksyms if you want ACT to
analyze the running system.
-s directory
Tells act to split its output into several files, writing the data
into the directory specified, to aid readability. The files created
are (the names speak for themselves):
biowait  getblk  modules  msgbuf  mutex  rwlock  summary  sunsolve  system  threads

-u
Displays stack information in an alternate form
-z
This informs ACT to display timezone information in localtime
rather than GMT

page 45

Advantages of Splitting a Drive into Multiple File Systems (info doc 14622)
Rather than using an entire disk drive for one file system, which may lead to inefficiencies and
other problems, you can split a single drive into sections. The sections are called slices, as
each is a slice of the disk's capacity. Once a slice has been allocated, it becomes a logical
disk drive. A disk can be split into eight subdisks. The splitting of the disk is often called partitioning
or labeling of the disk drive. Below is an example:
Current partition table (original):
Total disk cylinders available: 2036 + 2 (reserved cylinders)
Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       0 - 1872      921.87MB    (1873/0/0) 1887984
  1 unassigned    wm       0               0         (0/0/0)          0
  2     backup    wm       0 - 2035     1002.09MB    (2036/0/0) 2052288
  3 unassigned    wm    1873 - 2035       80.23MB    (163/0/0)   164304
  4 unassigned    wm       0               0         (0/0/0)          0
  5 unassigned    wm       0               0         (0/0/0)          0
  6 unassigned    wm       0               0         (0/0/0)          0
  7 unassigned    wm       0               0         (0/0/0)          0
partition>
Here are some of the reasons for multiple filesystems on one hard drive.
1. Damage Control: If the system were to crash due to software error, hardware failure,
or power problems, some of the disk blocks might still be in the file system cache and not
have been written to disk yet. This can damage the filesystem structure. While the
methods used try to reduce this damage, and the fsck utility can repair most of it,
spreading the files across multiple filesystems minimizes the possibility of damage, especially
to those files that are needed during boot-up. When the files are split up across the disk
slices, critical files end up on slices that rarely change, or that are mounted read-only and never
change. The chances of them being damaged and preventing you from recovering the remainder
of the system are greatly reduced.
2. Access Control: Only complete slices can be marked as read-only or read-write.
If you want to mount the shared Operating System sections as read-only to prevent changes, they
have to be on their own slice.
3. Space Management: Files are allocated from a reserve of free space on a per-filesystem basis.
If, for example, a user has allocated a large amount of space, depleting the free space, and the
entire system disk were a single filesystem, there would be no free space left for critical system
files. The entire system would freeze when it ran out of space.
Using separate filesystems, especially for user files, means that only a single user, or group of
users, is inconvenienced when a filesystem becomes full. The system will continue to operate,
allowing the System Administrator to handle the problem. The exception to the above scenario is
the root filesystem.
4. Performance:
The larger the filesystem, the larger the tables that must be managed.
As the disk fragments and free space becomes scarce, the fragments of a file may be placed
further apart on the disk. Using multiple (smaller) partitions reduces the absolute distance
and keeps the sizes of the tables manageable. Although the UFS filesystem does not suffer
from table size and fragmentation problems as much as System V file systems, this is still a
concern.
page 46
5. Backups:
Many of the back-up utilities, such as "ufsdump", work on a complete filesystem basis.
If a filesystem is large, it could take longer to back up than you want to allocate. Most importantly,
multiple smaller backups are easier to handle and recover from.
Below is a listing of slices: the required ones (root and swap) and the recommended additional
slices, such as usr, var, opt, home and tmp.
1. The root slice: The root slice is mounted at the top of the filesystem hierarchy. It is mounted automatically
as the system boots, and cannot be unmounted. All other file systems are mounted below the root.
The root filesystem needs to be large enough to hold the following:
* The boot information and the bootable kernel (kernel/genunix), and a backup
of the kernel just in case the main one gets damaged.
* Any local system configuration files, which typically reside in the /etc directory.
* Any stand-alone programs, such as diagnostics, that may be run instead of the OS.
The root partition typically takes between 15 and 30MB. It is usually placed on the first slice of
the disk, more commonly known as slice 0 or a.
2. The swap slice: The default rule is to have twice as much swap space as there is RAM
installed on the system. For example, if you have 16MB of RAM, the swap space would need
to be 32MB. This is only a preliminary guide to how much swap to use;
there are other factors to consider, for example whether a user's system is running large
applications that use large amounts of data, such as a CAD application. You can monitor the
amount of swap space used via the pstat or swap commands. If you did not allow enough swap
space during the initial install, you can add additional swap with either the swapon or swap
commands.
3. The usr slice: The usr slice holds the remainder of the operating system utilities. It needs to be
large enough to hold all the packages you chose to install when installing the OS. If you are going to
install local applications or third-party applications in this slice, it needs to be large enough to hold
them. It is generally better if the usr slice contains the operating system and only symbolic
links to the applications. The filesystem is often mounted read-only to prevent changes.
4. The var slice: The var slice holds the spool directories used to queue printer files and mail, as well
as log files that may be unique to the system. It also holds the /var/tmp directory, which is used for
larger temporary files. It is the read-write counterpart to the usr slice. Every system, even a diskless
client, needs its own var filesystem. It is not a filesystem that can be shared with any other
system(s).
5. The opt slice: In the newer UNIX systems based on System V release 4 (Solaris 2.x), many sections
are now optional and no longer need to be loaded on the /usr filesystem. They are now installed
onto the /opt filesystem. Additional add-on packages are also installed in this filesystem.
6. The home or export home (remote users) slice: The home slice is where the users' login directories
are placed. Making home its own slice prevents users from hurting anything else if they run this
filesystem out of space. A good starting point for the size of this slice is 1MB per application user plus
5MB per power user and 10MB per developer you intend to support.
These are rough estimates and are to be used only as a guideline; your configuration may need
more or less space per user. Usually this is /export/home. Don't put things into /home,
as this is a reserved mount point for automounted NFS filesystems. It's fine to use when the
automounter is turned off, but it is on by default.
Page 47
7. The tmp slice: Large temporary files are placed in /var/tmp, but smaller temporary files are
placed in /tmp. The files in the /tmp directory are very short-lived and are cleared out during a reboot of
the system. If users run mostly application based programs, 5 to 10MB should be sufficient for this
slice. If developers are the primary users of the system, 10 to 20MB may be needed. Once again, these
numbers are only a guideline; your needs may be different.
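The two sizing rules of thumb in the list above (swap at twice RAM, and the per-user allowance for the home slice) can be written as a quick calculation. This is only a sketch of those starting points; the function and parameter names are made up for illustration, and real sizing depends on the workload.

```python
def swap_mb(ram_mb):
    """Default rule from the text: twice as much swap as installed RAM."""
    return 2 * ram_mb


def home_slice_mb(app_users=0, power_users=0, developers=0):
    """Starting point from the text: 1MB per application user,
    5MB per power user and 10MB per developer."""
    return 1 * app_users + 5 * power_users + 10 * developers


print(swap_mb(16))                 # the 16MB RAM example from the text
print(home_slice_mb(20, 4, 2))     # 20 application users, 4 power users, 2 developers
```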
How to configure a system to run on a network (info doc 14981) (also see pg 56 Adding a 2nd network interface)
1. /etc/hosts
This file is used to resolve host names into IP addresses. This file must be updated if no naming
service is being used. This file should contain the IP and host name of each system on the
local network, including any gateways or routers.
Example:
127.0.0.1 localhost
129.145.71.109 kishori loghost #this is the IP and host name for the local machine
129.145.71.110 sage #this is the IP and host name for a host on the network
2. # ifconfig -a
Be sure that both the loopback and network interface are up and running.
Example:
lo0: flags=849<UP,LOOPBACK,RUNNING,MULTICAST> mtu 8232
inet 127.0.0.1 netmask ff000000
le0: flags=863<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST> mtu 1500
inet 129.145.71.109 netmask ffffff00 broadcast 129.145.71.255
If the interface to the network is not up and running do the following:
# ifconfig le0 plumb
NOTE: The default may be hme0 (for most Ultra machines)
3. /etc/netmasks
This file should contain the netmasks. If you are using the default netmask and it appears in
ifconfig -a, this file is not necessary.
Example:
# The netmasks file associates Internet Protocol (IP) address
# masks with IP network numbers.
#
#       network-number  netmask
#
# Both the network-number and the netmasks are specified in
# "decimal dot" notation, e.g:
#
#       128.32.0.0 255.255.255.0
#
129.145.0.0 255.255.255.0
page 48

4. /etc/defaultrouter
If you want to define a default router, include the router name in this file.
5. /etc/hostname.le0 or /etc/hostname.hme0 (depending on your interface type)
This file should contain the name of the local host.
6. /etc/resolv.conf
If you are using DNS, this file should contain the name of the domain and the IP address of the nameserver.
It is acceptable to list more than one nameserver (up to 3). The nameservers will be consulted in the
order listed. Be careful: this file is very sensitive to extra spaces and tabs.
Example:
domain support.Corp.Sun.Com
nameserver 129.150.254.2
7. /etc/nsswitch.conf
Check this file for the appropriate entries. If a naming service is being used this file should reflect that.
8. It is a good idea to reboot the system at this point. Check to see if the network is working by pinging other
machines both inside and outside of your network.
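When checking the ifconfig -a output in step 2 against /etc/netmasks in step 3, it helps to know how the dotted netmask maps to the hexadecimal form ifconfig prints, and what broadcast address follows from it. The sketch below uses Python's standard ipaddress module; the function name is illustrative only.

```python
import ipaddress


def ifconfig_view(ip, netmask):
    """Return (hex netmask, broadcast address) the way ifconfig -a displays them."""
    iface = ipaddress.IPv4Interface(f"{ip}/{netmask}")
    hex_mask = format(int(iface.netmask), "08x")
    return hex_mask, str(iface.network.broadcast_address)


# The le0 and lo0 entries from the ifconfig -a example
print(ifconfig_view("129.145.71.109", "255.255.255.0"))
print(ifconfig_view("127.0.0.1", "255.0.0.0"))
```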
SEVM - How to recover a primary boot disk. (info doc 14820)
NOTE: This document was written for VxVM 2.x. New functionality in VxVM 3.x renders many
of the "extra steps" in replacing a primary root disk obsolete. See the comments interspersed
below regarding steps when using VxVM 3.x.
If Volume Manager (VxVM) is running on a system with the root disk encapsulated and mirrored, and
the root disk fails, the system stays up and running because it is mirrored. But how can you
recover the original root disk?
First, some terminology:
The 'primary' root disk is the system disk on which the OS was originally installed. This
disk was "encapsulated" into VxVM and then mirrored. Since this disk is encapsulated, there is a
direct mapping of partitions onto volumes for /, swap, /usr, and /var.
The 'secondary' root disk is a disk which was first initialized into VxVM and then used to form a mirror
for the primary root disk.
VxVM 2.x: Since it was initialized, rather than encapsulated, there is no mapping of partitions onto the
volumes /, swap, /usr, and /var. VxVM 3.x: When the mirror of the root disk is created, the mapping
of partitions onto the volumes /, swap, /usr, and /var is maintained.

Page 49

RECOVERING THE 'SECONDARY' BOOT DISK:
If the 'secondary' system disk fails, the replacement of the disk is straightforward. It is handled in
the same manner that any other failed drive needs to be replaced.
The easiest way to do this is to run 'vxdiskadm' and choose option #4 (Remove a disk for replacement).
Then, shut down the system (if necessary) to physically replace the disk, and reboot.
Run 'vxdiskadm' again, this time choosing option #5 (Replace a failed or removed disk). When asked
to 'encapsulate' the disk, reply "no", and then reply "yes" when asked if you wish to initialize it.
This will begin recovery of the disk and the mirrors will resync automatically.
RECOVERING THE 'PRIMARY' BOOT DISK:
NOTE: If you are running Volume Manager version 3.x.x or above, it is not necessary to follow the
steps below. Instead, the process for replacing the 'primary' boot disk is EXACTLY the same as that
for the 'secondary' boot disk, which is shown above. This is because Volume
Manager 3.x automatically creates the underlying "hard" partitions for /usr and /var on the replacement
disk, whereas older versions did not.
If you are using Volume Manager 2.x, continue on:
The recovery of the 'primary' boot disk contains a few additional steps because the procedure must
reestablish the direct mapping between the partitions on the disk and the system volumes. This is
necessary so that the system can be changed back to use underlying devices, should this be
necessary (for example, to perform a system upgrade or boot from cdrom to fsck one of these filesystems).
1. Run 'vxdiskadm' and choose option #4 (Remove a disk for replacement). Then, shut down the system
(if necessary) to physically replace the disk, and reboot.
2. Run 'vxdiskadm' and choose option #5 (Replace a failed or removed disk). When asked to 'encapsulate' the
disk, reply "no", and then reply "yes" when asked if you wish to initialize it.
3. This step will change depending on the number of partitions on the boot disk. The 'vxdiskadm'
command will put back partition 0 (for /) automatically, and may also do this for swap. However,
if you have any additional volumes on that disk (e.g., /usr or /var), you will have to run a command
to put the partition on the new disk in the correct location.
Examine the partitions on the replaced disk by running 'format' or 'prtvtoc' on it. At the very least, you
will see a partition for root, and one each for the public and private regions for VxVM. Determine
if any partitions are missing. If so, these "missing" partitions can be recreated easily using the steps below.
The command to use is 'vxmksdpart'. You give this command the name of a particular subdisk, and it creates a
partition on the disk in the correct location. The syntax is:
/etc/vx/bin/vxmksdpart <subdisk> <partition> <tag> <flags>

Page 50

For example, if you have a subdisk named "disk01-02" and wanted to create partition 7 on the disk to map
this subdisk, you can run
/etc/vx/bin/vxmksdpart disk01-02 7 0x00 0x00
3a. SWAP. To create a partition for the swap volume, run:
/etc/vx/bin/vxmksdpart -g rootdg <subdisk> <partition> 0x03 0x01
where <subdisk> is the name of the subdisk used in the swapvol volume on the primary boot disk
(for example, "rootdisk-01"), and <partition> is the unused partition to use for swap (for
example, "1"). The "0x03" tag specifies this partition is for 'swap'.
3b. USR. To create a partition for /usr (if this disk contains /usr), run:
/etc/vx/bin/vxmksdpart -g rootdg <subdisk> <partition> 0x04 0x00
3c. VAR. To create a partition for /var (if this disk contains /var), run:
/etc/vx/bin/vxmksdpart -g rootdg <subdisk> <partition> 0x07 0x00
There is no reason to create any other partitions on the boot disk.
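The tag and flag pairs in steps 3a to 3c are easy to mix up. The sketch below collects them in one table and assembles the vxmksdpart command line for a given volume. It only builds the command as a list of arguments and does not run anything; the helper name and the lookup table are illustrative, and the subdisk and partition values must still come from your own vxprint and prtvtoc output.

```python
VXMKSDPART = "/etc/vx/bin/vxmksdpart"

# (tag, flags) pairs exactly as given in steps 3a, 3b and 3c
TAG_FLAGS = {
    "swap": ("0x03", "0x01"),
    "usr": ("0x04", "0x00"),
    "var": ("0x07", "0x00"),
}


def vxmksdpart_cmd(volume, subdisk, partition, diskgroup="rootdg"):
    """Build the vxmksdpart argument list for swap, usr or var."""
    tag, flags = TAG_FLAGS[volume]
    return [VXMKSDPART, "-g", diskgroup, subdisk, str(partition), tag, flags]


# The swap example from step 3a: subdisk rootdisk-01 onto unused partition 1
print(" ".join(vxmksdpart_cmd("swap", "rootdisk-01", 1)))
```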

Disable DMP
Note: Be sure to do these steps first: 1. umount all file systems created on Volume
Manager volumes 2. Stop the Volume Manager (vxdctl stop).
1. remove the "vxdmp" driver from the "/kernel/drv" directory
rm /kernel/drv/vxdmp
2. edit /etc/system, and remove the line:
forceload: drv/vxdmp
3. Remove the Volume Manager DMP files:
rm -rf /dev/vx/dmp /dev/vx/rdmp
4. symbolically link /dev/vx/dmp to /dev/dsk
ln -s /dev/dsk /dev/vx/dmp
5. symbolically link /dev/vx/rdmp to /dev/rdsk
ln -s /dev/rdsk /dev/vx/rdmp
6. shut down the system to disable the DMP functionality
7. reboot
Patch 105181-20 not loading... Check for 106125, 106292, 106361-08

page 51

Memory Scrubbing
On Ultra Enterprise (sun4u) platforms, ECC is generated and checked by the UPA devices
(CPU, SYSIO and PSYCHO), not by the memory controller (Address Controller or AC).
Thus, ECC covers the entire data path between devices and memory.
***This means that an ECC error can be reported against a memory module (DIMM/SIMM) that might not be bad***
A few ECC errors do not necessarily call for DIMM/SIMM replacement. However, when
the errors are exactly 12 hours apart, the DIMM/SIMM must be replaced. The memory scrubber runs every
12 hours after the system is booted. It scans physical memory, reading each memory
location to determine whether the data and ECC are correct. If the data does not match the ECC, ECC is
rerun and the memory content is corrected. If it fails exactly 12 hours apart, the error
appeared again despite the correction; it will be corrected again, but the DIMM/SIMM must be replaced.
To check whether memory scrubbing is enabled:
# echo "disable_memscrub/X" | adb -k
physmem 3b7b
disable_memscrub:
disable_memscrub:

if it is "0" it is enabled
if it is "1" it is disabled
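The "exactly 12 hours apart" rule can be checked mechanically once the ECC error times for one DIMM/SIMM have been pulled from the messages file. The sketch below assumes the timestamps have already been parsed into datetime objects, and it allows a one-minute tolerance, which is an assumption and not a figure from the scrubber itself.

```python
from datetime import datetime, timedelta

SCRUB_INTERVAL = timedelta(hours=12)
TOLERANCE = timedelta(minutes=1)  # assumption: slack allowed around 12 hours


def scrubber_pattern(timestamps):
    """True if any two consecutive errors on the same module are 12 hours apart."""
    ordered = sorted(timestamps)
    return any(
        abs((later - earlier) - SCRUB_INTERVAL) <= TOLERANCE
        for earlier, later in zip(ordered, ordered[1:])
    )


first = datetime(2006, 7, 24, 3, 0)
print(scrubber_pattern([first, first + timedelta(hours=12)]))  # replace the module
print(scrubber_pattern([first, first + timedelta(hours=5)]))   # no scrubber pattern
```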

Display a remote application GUI on your local server


When using telnet to connect to a remote server, you can have an application that has a GUI
interface (like VTS) display on your local server by doing the following:
1. # /usr/openwin/bin/xhost +
(run this on your local server. 'xhost -' removes permissions)
2. Connect to remote server and:
If using csh, use this syntax:
# setenv DISPLAY <hostname>:0.0
example:
# setenv DISPLAY persia:0.0
If using sh or ksh, use this syntax:
# DISPLAY=<hostname>:0.0
# export DISPLAY
3. Run application and the GUI should display on the local server

page 52

Cluster 2.x

http://suncluster.eng
http://neato.east/suncluster/scinstall.html (good install doc)
General:
Up to 4 nodes in cluster
Only Sun Storage is supported (can get waiver, but seldom granted)
HA or PDB (Parallel Data Base)
HA - 1 server runs at up to 100%, or 2 at up to 50% each, so the other node can take over in case of
failure
PDB - Both servers access the database simultaneously, no logical hosts or shared ccd
Supports Solaris 2.6, 7, 8
Supports QFE, SCI, fast ethernet, gigabit ethernet on the private net
Supports different types of server nodes in the cluster
Terminal concentrator is special model, it does not send a break on power on
DMP and Fast Write Cache not supported
(touch /kernel/drv/ap before vxvm install to not load DMP)
Cluster install (chapter 8 sun cluster 2.2 book)
Admin w/s
Only requires end user distribution
2.2 release 7/00 has all the cluster related o/s patches
install order: o/s, cluster patches, cluster software
important files:
/etc/clusters - logical hostname and nodes
/etc/serialports - node name and concentrator port
Server install
Requires full distribution, 10k requires full+oem
installer must be root
Avoid 'scinstall' "change" option if possible. Use the 'scconf' command
Software components:
CMM -Cluster Membership Monitor
CCD - Cluster Configuration Database
SMA - Private Network Management
SSVM/CVM - Volume manager
PNM - Public Network Management
Logical Hosts
DLM - Distributed Lock Manager
Data Services
Topologies:
Clustered Pair
N+1 (hot standby node)
Ring or cascade
N to N scalable (cascading failover)
Shared Nothing ( used for Informix parallel server)
OPS (Oracle Parallel Server):
No logical hosts
Syncing between the instances of Oracle goes over the private network
No shared CCD
Must select CVM on install even with Volume Manager 3.0.4, to get OPS pick at end.
Must install UDLM (Oracle CD)
Create shared disk group while only one node in cluster.
Page 53

Hardware Notes:
Must change the initiator id on one node if using SCSI arrays between 2 nodes
(see procedure 5-17)
If Quorum device is replaced it needs to be reconfigured.
# scconf -q
A5000 - full loop only
must be mirrored
DMP, FW cache not supported
Direct or Hub attached (pg 5-23 5-27)
Wiring Diagrams (pg 5-30)
SCI - scrubber jumpers need to be 'on' on one node, 'off' on all the other nodes
/opt/SUNWsma/bin (has the SCI sm_config template files you need to
modify and run sm_config)
switch1.sc (4 nodes, 8 cards, 2 switches)
switch2.sc (2 nodes, 4 cards, 2 switches)
link1.sc (2 nodes, 4 cards, 0 switches)
# /opt/SUNWsma/bin/sm_config -f template_file
Terminal Concentrator - port 1 is used for setup (numbered 1-8 not 0-7) (pg 5-56)
Enable setup mode - Power On < 30sec (test button) 15 more sec (test button)
should get
monitor::
:: erase EEPROM (to set password to default, default is IP address of box)
Remove the password from port 8 in a 3 node N to N cluster for 'port locking'
Cluster Commands:
abort partition
Same as scadmin stopnode. Use the scadmin stopnode command instead.
ccdadm
# ccdadm <clustname> -p ccd.database.ssa (creates a ccd.database.pure file for recovery use)
-r ccd.database.pure (restores to ccd.database file)
-v (verifies consistency of the dynamic copy of ccd.database)
-x (converts the candidate file to a CCD database, or verifies the CCD file)
ccp
Command used to run the cluster control panel software on the admin workstation
# ccp clustername &
cconsole
Command used to start up the cluster console on the admin W/S
# cconsole
get_node_status
Command used to get the status of a node (also can use hastat and
scconf clustername -p commands)
# get_node_status
haswitch
Switch logical host to another node (will start the reconfiguration)
# haswitch nodename
hastat
Will give you the status of the cluster; will lie if the private network is
down. You can run it in the common window to get all views
# hastat (-m 0 skip messages)
hareg
Registers a data service with HA and associates it with the given logical host.
# hareg -s -r dataservice -h logicalhost
# hareg -y dataservicename (to turn on a data service)
# hareg (to verify a service is turned on)
# hareg -n dataservicename (to stop a data service)
# hareg -u dataservicename (will shut down the data service on all logical hosts)

Page 54

Cluster Commands: (cont)
pnmset
Command to create PNM NAFO groups (on each node) for the public
network interfaces to be used for the NFS data service.
# /opt/SUNWpnm/bin/pnmset (follow interactive install)
pnmstat -l
Command lists the /etc/pnmconfig file (to set up NAFO groups)
scadmin startcluster
The first node into the cluster must enter with the 'startcluster' switch.
# scadmin startcluster nodename clustername
scadmin startnode
All remaining nodes can join the cluster with the startnode switch
# scadmin startnode
scadmin stopnode
To remove your node from the cluster use the stopnode switch. (do
this before init or shutdown commands)
# scadmin stopnode
scadmin switch
Switch logical host to another node (will start the reconfiguration),
same as haswitch command
# scadmin switch nodename
scconf
Command used to configure cluster parameters (many, use MAN)
# scconf -F (creates admin filesystem, each node)
# scconf -L (for logical hosts) (one node, diskset)
# scconf -q (for quorum device)
# scconf -N (to change a node ethernet address)
scdidadm
Command to initialize the Disk ID pseudo driver (SDS install only);
builds a file with paths from each node to disks
# scdidadm -r (on node 0 to initialize)
# scdidadm -l (L) (verify DID configuration)
scinstall
Installation command for Sun Cluster from CD
scmgr
Command to start Sun Cluster manager (cluster monitor) (set DISPLAY)
# /opt/SUNWcluster/bin/scmgr nodename &
xhost
Command on admin W/S to allow all xhost connections from
cluster nodes (graphics)
# /usr/openwin/bin/xhost +

Cluster Files:
/etc/opt/SUNWcluster/conf/clustername.cdb
Contains Install info, flat file use more command to view.
/etc/opt/SUNWcluster/conf/ccd.database
Contains cluster database, viewed by scconf, scadmin commands. If you have to restore
this file to a 'bad' node, you must reboot (file info is kept in memory)
/etc/opt/SUNWcluster/conf/hanfs/vfstab.logicalhostname
Logical hosts vfstab file
/etc/opt/SUNWcluster/conf/hanfs/dfstab.logicalhostname
Logical hosts dfstab file (shared filesystems)
/etc/clusters
Admin W/S file, contains cluster names and node names
/etc/serialports
Admin W/S file, contains node names and port assignments on the concentrator
/etc/pnmconfig
Public network file. pnmset command creates, pnmstat -l command will list.
/etc/hosts
You must enter logical host name and IP.
Page 55

/etc/name_to_major
vxio must have the same number on both nodes to switch nfs logical host
(unencapsulate first, change number)
/opt/SUNWcluster/bin
Most SC2.2 commands are located in this directory
/var/opt/SUNWcluster
Cluster error messages are located in this directory and in /var/adm/messages
Encapsulating root after using Environmental CD to load O/S:
The newer PCI-based servers come with an Operating Environment Installation CD to use with
Solaris 2.5 and 2.6. This CD will create a mini-root partition and allows you to install and boot the server
from the older versions of Solaris.
The mini-root is currently Solaris 7 and starts at cylinder 0 on the boot disk. Once the intended version
of Solaris is loaded, the environmental CD makes mini-root (not mini-me) swap (slice 1), leaving it starting
at cylinder 0. This is alright if you are not encapsulating root.
When you then encapsulate root, swap (slice 1) remains starting at cylinder 0, and Veritas will not allow that
space to be used for a core dump. It assumes it is reserved for the VTOC.
One way we have used to get around this is to boot from the Operating Environment Installation CD,
load mini-root onto one disk and the intended O/S on another, through the custom install option. Then
boot from the other disk and encapsulate it.
Adding a second network interface:
(also see pg 48 - 49 How to configure a system to run on a network)
This procedure can also be used to add the first network interface and may work without rebooting the machine.
- add hostname and ip address to /etc/hosts file (hostname is usually hostname_interface ex: sunnie_qfe0)
- create a /etc/hostname.interface file
# touch /etc/hostname.qfe0
- vi /etc/hostname.interface file, add entry at top (no spaces) hostname_interface
- ifconfig interface (hme0, qfe0, etc.) plumb
- ifconfig interface inet IP_address
# ifconfig qfe0 inet 129.145.121.123
- ifconfig interface netmask 255.255.255.0
# ifconfig qfe0 netmask 255.255.255.0
- ifconfig interface broadcast IP_address.255
# ifconfig qfe0 broadcast 129.145.121.255
- ifconfig interface up
# ifconfig qfe0 up
- ifconfig -a (if ready to use, should look like this:)
qfe0: flags=863<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST> mtu 1500
inet 129.145.121.123 netmask ffffff00 broadcast 129.145.121.255
ether 8:0:20:88:xx:xx
*** Warning: touch the file /etc/notrouter so the server will not route between the two ethernet interfaces***
Adding a default gateway:
# route add default gateway_IP_address
then vi /etc/defaultrouter and enter gateway_IP_address (to keep router thru reboots)
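The ifconfig steps above follow a fixed order, and the broadcast address is determined by the IP address and netmask. As a rough sketch, the Python below derives the broadcast address with the standard ipaddress module and lists the commands in that order. It only produces the command strings for review and runs nothing on the system; the function name is illustrative.

```python
import ipaddress


def second_interface_cmds(interface, ip, netmask):
    """Return the ifconfig commands, in order, for bringing up an interface."""
    network = ipaddress.IPv4Interface(f"{ip}/{netmask}").network
    return [
        f"ifconfig {interface} plumb",
        f"ifconfig {interface} inet {ip}",
        f"ifconfig {interface} netmask {netmask}",
        f"ifconfig {interface} broadcast {network.broadcast_address}",
        f"ifconfig {interface} up",
    ]


# The qfe0 example used in this section
for cmd in second_interface_cmds("qfe0", "129.145.121.123", "255.255.255.0"):
    print("#", cmd)
```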
page 56

Veritas Volume Manager :


Volume Manager takes physical disks and allows you to create logical volumes across these disks.
A group of physical disks is called a 'disk group'
All or portions of these physical disks can be combined to create logical 'volumes'
You then can create filesystems on these logical volumes that span multiple physical disks.
Disks under Volume Manager control have 2 partitions on them, a public and a private region.
The public region is the size of the whole physical disk
The private region is 1024 sectors long. The configuration database is located in this region.
There is enough room in the private region to define 128 disks.
The private region is usually located at the beginning of a disk
and is usually slice # 3.
If you run # prtvtoc /dev/rdsk/c#t#d#s2 on a disk initialized under VM (# vxdisksetup -i c#t#d#)
a '15' in the Tag column output indicates the private region
a '14' in the Tag column output indicates the public region
Rules:
- There must be a rootdg for vxvm to come up at boot. This is usually made when you
install Volume Manager (vxinstall) and encapsulate your boot disk, although you do not
have to encapsulate the boot disk; rootdg can be made up of any disk.
- You must have 2 unassigned slices to encapsulate a disk (public and private regions).
- vxunroot will unencapsulate a volume only if /, swap, /usr, /var, and /opt are the only
filesystems on the encapsulated disk.

In general, the flow of building logical volumes, creating a filesystem and mounting it is as follows:
1. assign physical disks to free disk pool (to use with volume manager)
# vxdisksetup -i c#t#d# c#t#d# (etc.)
2. create a disk group (uses disks in the free disk pool. You assign names. nconfig is the number of private db
copies, default is 4, and nlog the number of kernel logs; both switches are optional)
# vxdg init diskgrp_name disk_name=cxtxdx nconfig=# nlog=#
3. add disks from the free disk pool to the diskgroup
# vxdg -g diskgrp_name adddisk disk_name=cxtxdx disk_name=cxtxdx (etc.)
4. Create a logical volume in your disk group (size ex: 100m; layout is one of mirror, stripe, or raid5 {nolog})
# vxassist -g diskgrp_name -U fsgen make vol_name size layout=stripe nstripe=# disk_name disk_name (etc.)
5. mirror a striped or concat logical volume (optional)
# vxassist -g diskgrp_name mirror vol_name disk_name disk_name disk_name (etc.)
6. start the volume
# vxvol start vol_name
7. Make the filesystem that sits on the logical volume
# newfs /dev/vx/rdsk/diskgrp_name/vol_name
Page 57

8. create a mount point (you decide dir_name)
# mkdir /dir_name
9. Mount the filesystem on the mount point
# mount /dev/vx/dsk/diskgrp_name/vol_name /dir_name
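The nine steps above can be laid out for a concrete disk group before anything is typed at the console. The sketch below assembles the command strings in order for a striped volume, skipping the optional mirror step. It runs nothing, and the helper name, disk group, volume, and disk names are placeholders chosen for illustration. It uses vxdg with -g for the adddisk step and passes -g to vxvol, matching the forms shown in the Volume Manager command list in this handbook.

```python
def build_volume_cmds(dg, vol, size, disks, mount_point):
    """disks maps a VM disk name (your choice) to its c#t#d# device."""
    names = list(disks)
    first, rest = names[0], names[1:]
    disk_list = " ".join(names)

    cmds = [f"vxdisksetup -i {disks[name]}" for name in names]      # step 1
    cmds.append(f"vxdg init {dg} {first}={disks[first]}")           # step 2
    if rest:                                                        # step 3
        pairs = " ".join(f"{name}={disks[name]}" for name in rest)
        cmds.append(f"vxdg -g {dg} adddisk {pairs}")
    cmds.append(                                                    # step 4
        f"vxassist -g {dg} -U fsgen make {vol} {size} "
        f"layout=stripe nstripe={len(names)} {disk_list}"
    )
    cmds.append(f"vxvol -g {dg} start {vol}")                       # step 6
    cmds.append(f"newfs /dev/vx/rdsk/{dg}/{vol}")                   # step 7
    cmds.append(f"mkdir {mount_point}")                             # step 8
    cmds.append(f"mount /dev/vx/dsk/{dg}/{vol} {mount_point}")      # step 9
    return cmds


example = build_volume_cmds(
    "datadg", "vol01", "100m", {"disk01": "c1t0d0", "disk02": "c1t1d0"}, "/data"
)
for cmd in example:
    print("#", cmd)
```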
Break a mirror and unencapsulate:
# vxprint -htg rootdg (get the names of mirror plexes)
# vxplex -g rootdg -o rm dis rootvol-02 swapvol-02 (use pl names from vxprint)
# vxunroot (this will ask for a re-boot when completed)
(you can use vxdiskadm to re-encapsulate)
Break a mirror and take the plex to make another volume:
# vxprint -htg dg_name (find plex name of mirror volume you want to use)
# vxplex -g dg_name dis plex_name (dissociate plex from volume)
# vxmake -g dg_name -U fsgen vol vol_name plex=plex_name (make the volume)
# mkdir /mp_name (create a mount point)
# vxvol -g dg_name start vol_name (start the newly created volume)
# mount /dev/vx/dsk/dg_name/vol_name /mp_name
To boot without Volume manager:
rem out 'vxio' lines in /etc/system (usually 2 lines at the end of vm section)
copy /etc/vfstab to /etc/vfstab.vm
copy /etc/vfstab.prevm to /etc/vfstab
touch /etc/vx/reconfig.d/state.d/install-db
reboot
(to reverse)
uncomment 'vxio' lines in the /etc/system file (on both disks if root was mirrored)
copy /etc/vfstab.vm to /etc/vfstab (on both disks if root was mirrored)
rm /etc/vx/reconfig.d/state.d/install-db
reboot
Deport and Import a disk group:
# vxdg list (get a list of disk groups)
# vxdg deport dg_name
# vxdg import dg_name (optional switches: -n name, -s for shared, or -t for temporary)
Remove a volume from Volume Manager:
# umount /vol_name (or the filesystem that sits on the volume)
# vxvol -g dg_name stop vol_name (stop the volume)
# vxedit -g dg_name -r rm vol_name (recursively removes volume, plex, and sub-disk from VM)
Page 58

Volume Manager commands:
vxdg free
how much free space in a diskgroup: vxdg -g dg_name free
vxdg list
list all imported disk groups (exported use: vxdisk -s list | grep dgname)
vxdg init
creates a disk group: vxdg init dg_name disk_name=c#t#d#
vxdg adddisk
add disk to dg: vxdg -g dg_name adddisk disk_name=cxtxdx
vxdg rmdisk
remove disk from dg: vxdg -g dg_name rmdisk disk_name
vxdg upgrade
upgrade dg after VM upgrade: vxdg upgrade dg_name
vxdg deport
deport a dg: vxdg deport dg_name
vxdg import
import a dg: vxdg import dg_name
vxassist make
makes a logical volume (layout is one of mirror, stripe, or raid5):
vxassist -g diskgrp_name -U fsgen make vol_name size layout=stripe nstripe=# disk_name disk_name (etc.)
vxassist maxsize
what is the max size raid you can make in a disk group (layout is one of mirror, stripe, or raid5):
vxassist -g dg_name maxsize layout=stripe nstripe=#
vxassist mirror
mirror a stripe or concat vol: vxassist -g dg_name mirror vol_name disk_name(s) &
vxassist remove mirror
used to remove a mirror permanently (do not use to break mirror):
vxassist -g dg_name remove mirror vol_name
vxplex
used to attach and dissociate plex(es) with volumes:
vxplex att vol_name plex_name or
vxplex -o rm dis plex_name
vxdisk -s list | grep dgname
gives you a listing of all disk groups
vxdisksetup -i
used to add a disk to the volume manager free disk pool: vxdisksetup -i c#t#d#
vxdiskunsetup -C
used to remove a disk from the free disk pool: vxdiskunsetup -C c#t#d#
vxdiskadd
will do both the vxdisksetup and vxdg adddisk: vxdiskadd c#t#d#
vxvol start
start a volume after it was made with vxassist or vxmake: vxvol start vol_name
vxvol stop
used to stop a volume after a umount: vxvol stop vol_name
vxedit -r rm
allows you to recursively remove a volume, plex or subdisk: vxedit -r rm vol_name plex_name
vxmake sd
manually make a sub-disk: vxmake sd sd_name offset=# len=size disk=disk_name
vxmake plex
manually make a plex from a sub disk: vxmake plex plex_name sd=sd_name
vxmake vol
manually make a volume from a plex: vxmake -U fsgen vol vol_name plex=plex_name
vxunroot
unencapsulates a disk: vxunroot disk_name
vxdiskadm
menu driven disk administration
vxiod set
set the number of vxio daemons (default is 10. 2/cpu is recommended): vxiod set #
permanently set daemons in the s85vxvm-startup2 file.

Volume Manager files:
Set PATH to: /etc/vx/bin:/opt/VRTSvmsa/bin:.
/etc/vx/reconfig.d/state.d/install-db
Touch this file to prevent Volume Manager from starting
/etc/vx/reconfig.d/disk.d/cxtxdx
/var/opt/vmsa/logs/commands
All GUI commands are located here
/etc/vfstab.prevm
Copy of the vfstab before VM was installed
/opt/VRTS/bin/vea
GUI for version 3.5
Page 59

FTPing to and from sunsolve:


You can use this to temporarily store files that you may want to access at a customer's site, or to
send files from a customer site that you can retrieve on swan.
Anything sent to sunsolve will be deleted after two days.
Internal to sunsolve:
(change to directory where the file you want to send resides)
# rftp sunsolve.sun.com
Name : anonymous or suncore
Password: (enter your e-mail address, or the suncore passwd, which changes weekly; check url:)
https://livelink.central.sun.com/livelink/livelink?func=ll&objId=5537115&objAction=browse&sort=name
ftp> cd cores
ftp> mkdir dir_name (as of 5/01 you cannot create directories. Skip to bin command)
ftp>cd dir_name
ftp>pwd
257 "/cores/dir_name" is current directory.
ftp> bin
ftp> put file_name_to_be_sent
ftp> quit
#
External from sunsolve:
# ftp sunsolve.sun.com
(192.9.9.24)
login: anonymous
password: your_email_address
ftp> cd cores/dir_name/ (as of 5/01 you cannot create directories. Skip to bin command)
ftp> bin
ftp> get file_name_to_be_retrieved
ftp> quit
#
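
The upload session above can also be prepared as a command script. A minimal sketch, assuming a command-line ftp client that accepts `-n` (no auto-login) and reads commands from standard input; the helper name `sunsolve_put_script` and the example file name are hypothetical.

```shell
#!/bin/sh
# Print the ftp command sequence for uploading one file to the cores directory.
# Hypothetical helper: $1 = your e-mail address (anonymous password), $2 = file.
sunsolve_put_script() {
    printf 'user anonymous %s\n' "$1"
    printf 'cd cores\n'
    printf 'bin\n'
    printf 'put %s\n' "$2"
    printf 'quit\n'
}

# Review the generated commands first:
sunsolve_put_script you@example.com explorer.tar.Z
# Then, assuming the client supports -n and stdin commands:
#   sunsolve_put_script you@example.com explorer.tar.Z | ftp -n sunsolve.sun.com
```

The script skips the mkdir step, in line with the note that directories can no longer be created.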

Page 60

Serengeti: 3800 - 6800


General:

(first supported O/S on serengeti Solaris 8 4/01)

Serengeti 8 (3800):
Support for 2 to 8 Ultrasparc III processors (2 system bds max)
Up to 64 Gbytes of Memory (8 banks of 4 dimms each, 2 banks/CPU. A CPU may be installed
without a populated bank, but a populated bank must have its corresponding CPU installed)
12 hot-swappable compact pci (cPCI) slots
Up to 2 domains
Power Server: up to 3 power supplies nema 6-15P (connect internal to rack )
Rack mount: up to 2 NEMA L6-30P
Serengeti 12 (4800):
Support for 2 to 12 Ultrasparc III processors (3 system bds max)
Up to 96 Gbytes of Memory (8 banks of 4 dimms each, 2 banks/CPU. A CPU may be installed
without a populated bank, but a populated bank must have its corresponding CPU installed)
16 PCI slots or * 8 hot swappable cPCI slots or *combination of 8 PCI and 4 cPCI
Up to 2 domains
Power Server: up to 3 power supplies nema 6-15P
Rack mount: up to 2 NEMA L6-30P
Serengeti 12i (4810): (100% front access for specialized environments.)
Support for 2 to 12 Ultrasparc III processors (3 system bds max)
Up to 96 Gbytes of Memory (8 banks of 4 dimms each, 2 banks/CPU. A CPU may be installed
without a populated bank, but a populated bank must have its corresponding CPU installed)
16 PCI slots or * 8 hot swappable cPCI slots or *combination of 8 PCI and 4 cPCI
Up to 2 domains
Power Server: up to 3 power supplies nema 6-15P (connect internal to rack )
Rack mount: up to 2 NEMA L6-30P
Serengeti 24 (6800):
Support for 2 to 24 Ultrasparc III processors (6 system bds max)
Up to 192 Gbytes of Memory (8 banks of 4 dimms each, 2 banks/CPU. A CPU may be installed
without a populated bank, but a populated bank must have its corresponding CPU installed)
32 PCI slots or * 16 hot swappable cPCI slots or *combination of PCI and cPCI
Up to 4 domains (2 domains / partition)
Power Rack mount: up to 4 NEMA L6-30P
Hardware:
SC Board (SSC):
System Controller. You can tip or telnet to the SC card to configure/maintain the server.
There are 3 shells you can access and configure from the SC: Platform shell, Domain shell
and O/S shell on a specific domain. The SC bd is part of the platform; it is not
configured into a domain. A second (slave) SC board is installed if the redundancy
kit is ordered. The SC runs its own O/S and is upgraded and backed up across the
ethernet connection.

Repeater Bds (RP):
The repeater boards establish and maintain the connections between the system boards
and the IO boats. The 3800 and 4800 have 2 repeater boards, although the circuitry
for the repeaters on the 3800 is on the centerplane. The 6800 has 4 repeater bds.
* When available

Page 61

Serengeti: 3800 - 6800: (cont.)


System Boards (SB):
The system board is common across all 3 servers. It can have 2 or 4 CPUs
installed on it (they are not field replaceable). The system board has sockets
for 8 banks of 4 dimms. Each CPU has 2 corresponding dimm banks. It is possible
that a CPU might not have any dimms installed in its corresponding banks.
However, a populated dimm bank must have a corresponding CPU installed.

I/O boat (IB):
The I/O boat types: PCI or cPCI; there is no sbus I/O boat. The PCI and compact PCI
adapters are installed in the I/O boats. Currently cPCI is only available on the 3800.

ID Board:
The ID board is a pre-programmed daughter board that is on the centerplane.
The 3800's ID board is incorporated into the centerplane. The ID board has
the system chassis ID #, system serial #/host ID, and the MAC addresses: (6) for the 6800
and (4) for the 3800, 4800.

LEDS:
Activate (green):
  on:  Bd is activated. You must NOT remove the board when this LED is on.
  off: Bd is not activated: you can remove the board when this LED is off.
Fault (amber):
  on:  an internal fault occurred
  off: no internal fault occurred
Removal ok (amber):
  on:  you can safely remove the component under hot-pluggable conditions.
  off: you must not remove the component under hot-pluggable conditions.


Partitioning:
You can configure the server in single or dual partition mode. If you select dual partition
mode, each partition will be electrically separated from the other. The 3800 (on-bd repeaters)
and the 48x0 have dual repeaters; one will be configured for each partition. The 6800 has
4 repeater bds; 2 will be configured for each partition. Dual partition mode is recommended for
keeping domains electrically separated.
Domains:

On the serengeti, you configure the resources you want allocated to each domain. The domain
(like on an E10K) then becomes an independent server. At a minimum each domain must have
a system bd, I/O boat with ethernet/scsi PCI card, and a boot disk.

Domain/Partition configurations:
3800/4800/6800:
configuration               Domain IDs
1 partition 1 domain        A
1 partition 2 domains       A,B
2 partitions 2 domains      A,C (1 per partition, 6800 see comment below)

6800: Domains A,B even bd #s grid0, C,D odd bd#s grid1 (best practices)
2 partitions 3 domains      ABC, ABD, ACD, or BCD
2 partitions 4 domains      A,B,C,D

Page 62

Serengeti: 3800 - 6800: (cont.)


To connect to the SCC:
# tip hardwire
(from admin workstation, or notebook pc to SCC0 console port)
# telnet ip_address_of_SC
(ip address of sc must be configured > setupplatform)
Power on hardware:
Connect to SCC
enter 0 (platform shell)
> poweron all
to verify: > showboards -v
Switch to domain: (from platform shell)
> console -d [a, b, c or d]

Power on domain: (from domain shell)
> setkeyswitch on
(will start POST)
to verify > showkeyswitch
(once POST is complete should get OK prompt and be able
to run standard OBP commands and boot)

Power off domain: (from domain shell) (after init 0, shutdown, etc.)
> setkeyswitch off (wait, takes a while)

Power off platform: (from platform shell) (after domain(s) are set off)
> poweroff all

Update SC firmware: (from platform shell on the SC)
> flashupdate -y -f ftp://root:password@host_ip/path_to_new_firmware all rtos
Run this command from the platform shell. Keep in mind this command will not
update the slave SC. To update it you must make it the primary or run the command
from the slave SC.
> flashupdate -c <source board> <replacement board> (to copy firmware btwn like bds)

Save SC configuration: (from platform shell on the SC)
> dumpconfig -f ftp://root:password@host_ip/path_to_dumpdir

Restore SC Configuration: (from platform shell on the SC)
> restoreconfig -f ftp://root:password@host_ip/path_to_dumpdir

To create/modify Platform: (from platform shell on the SC)
> setupplatform (enter information and modify ACLs. For each domain use
deleteboard and addboard -d commands)

To create/modify Domain: (from Domain shell on the SC)
> setupdomain (enter information. Defaults in [ ]s)

Page 63

Serengeti: 3800 - 6800: (cont.)


Navigating between shells:
When you first connect:
enter 0 (platform) 1 (domain A) 2 (domain B) etc...
Platform -> Domain
> console -d [A,B,C,D] (will go to OBP, O/S or shell)
Domain -> Platform
> disconnect
Domain -> OBP
> break (after 'setkeyswitch on ' had been run)
OBP -> Domain
ctrl ] or ~# or (telnet) ctl ] send break or (ssh) #.
Domain -> O/S
> resume (after O/S was brought up via 'boot' command)
O/S -> Domain
ctrl ] or ~# or (telnet) ctl ] send break or (ssh)#.
Platform Shell commands:
(command -h will give you a listing... ** command available on slave SC)
addboard                 assign a board to a domain -d,
connections **           show connections to the system controller or a domain
console                  connect to a domain shell/console -d
deleteboard              delete a board from a domain
disablecomponent         add a component to the blacklist
disconnect **            disconnect this connection or a specified connection
dumpconfig **            save the system controller configuration to a server
enablecomponent          delete a component from the blacklist
flashupdate **           update flash prom images -y, -f,
help                     show help for a command or list commands
history **               show shell command history
password                 change platform or domain password
poweroff                 turn components off
poweron                  turn components on
reboot **                reboot the system controller
reset **                 reset the other system controller
restoreconfig **         restore the system controller configuration from a server
service                  service mode (see page 94)
setchs (service cmd)     setchs -s ok, suspect, faulty -r "reason for status" -c /N0/SB2/p2
setdate                  set the date and time for the platform
setdefaults              set default configuration values
setescape (5.16.00)      change escape characters (default #.)
setfailover (5.13.00)    changes the state of SC failover on, off, force,
setkeyswitch             set the keyswitch position for a domain on
set-keygen (5.16.00)     Generates/lists ssh host keys/fingerprint -l -r
setupplatform **         configure the platform -p acls, -p partition,
showboards               show board information -d,-e, -p, -v,
showchs (service cmd)    shows chs status (use with setchs)
showcomponent            show state of a component -v,
showdate                 show the current date and time for the platform
showenvironment          show environment sensors -u, -w, -l, -p, -v,
showescape (5.16.00)     lists escape character
showframe                show frame information -v,
showfailover (5.13.00)   displays SC and clock failover status
showfru (5.16.00)        list frus in system -r manr
showkeyswitch            show the keyswitch positions
showlogs                 show the logs -d, -v,
showplatform **          show the status of domain and platform configuration -d, -p, -v,
showsc **                show system controller uptime, version, and configuration -v,
sshrestart (5.16.00)     restarts ssh server to put new host keys into effect
testboard                test a board
testinterconnect         run interconnect test (available in service mode only)
Page 64

Serengeti: 3800 - 6800: (cont.)


Domain Shell commands:

(command -h will give you a listing...)

addboard                 assign a board to a domain -d
break                    send break to the domain console
connections              show connections to the domain
deleteboard              delete a board from a domain
disablecomponent         add a component to the blacklist
disconnect               disconnect this connection
enablecomponent          delete a component from the blacklist
help                     show help for a command or list commands
history                  show shell command history
password                 change domain password
poweroff                 turn components off
poweron                  turn components on
reset (-x)               reset the domain (XIR, will dump a hung domain)
resume                   return to domain console
setdate                  set the date and time for the domain
setdefaults              set default configuration values
setkeyswitch             set the keyswitch position on, off
setupdomain              configure the domain -v,
showboards               show board information -v,
showcomponent            show state of a component -v,
showdate                 show the current date and time for the domain
showdomain               show domain configuration -v
showenvironment          show environment sensors -v,
showkeyswitch            show the keyswitch position
showlogs                 show the logs -v
testboard                test a board

Setup remote logging:


In setupplatform:
Syslog loghost [ ] : ip_of_adminStation
Log Facility [ ]: local0 (can be 0-7)
In setupdomain: (for each domain)
Syslog loghost [ ] : ip_of_adminStation
Log Facility [ ]: local1 (can be 0-7)
In syslog.conf on admin station:
local0.notice    /var/adm/messages.platform
local1.notice    /var/adm/messages.domainA
(etc...)
Admin station:
create the files: # touch /var/adm/messages.nnnnnnn
restart syslog:   # kill -HUP `cat /etc/syslog.pid` or ( /etc/init.d/syslog stop) ( /etc/init.d/syslog start)
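
The loghost entries above pair one local facility with one log file per shell. A small sketch that prints such lines with a tab between selector and file (Solaris syslogd expects a tab there); the helper name `loghost_line` is hypothetical, and the output is meant to be reviewed and pasted into syslog.conf by hand.

```shell
#!/bin/sh
# Print one syslog.conf line: localN.notice <TAB> /var/adm/messages.<suffix>
# $1 = local facility number (0-7), $2 = log file suffix (hypothetical convention)
loghost_line() {
    printf 'local%s.notice\t/var/adm/messages.%s\n' "$1" "$2"
}

loghost_line 0 platform
loghost_line 1 domainA
```

Each suffix used here needs a matching `touch /var/adm/messages.<suffix>` on the admin station before syslogd is restarted.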

(continued on next page)


Page 65

Setup remote logging: (cont.)


/usr/lib/newsyslog file: (so logs do not grow forever. On line 2 enter all message file names you created.)
--- Change ---
LOG=messages
cd /var/adm
test -f $LOG.2 && mv $LOG.2 $LOG.3
test -f $LOG.1 && mv $LOG.1 $LOG.2
test -f $LOG.0 && mv $LOG.0 $LOG.1
mv $LOG $LOG.0
cp /dev/null $LOG
chmod 644 $LOG

--- To ---
#LOG=messages
for LOG in messages messages.platform messages.domainA (etc..)
do
cd /var/adm
test -f $LOG.2 && mv $LOG.2 $LOG.3
test -f $LOG.1 && mv $LOG.1 $LOG.2
test -f $LOG.0 && mv $LOG.0 $LOG.1
mv $LOG $LOG.0
cp /dev/null $LOG
chmod 644 $LOG
done

to test logging:
- # logger -p local0.notice "test message for platform log file" (check contents of log files to make sure
logging is working; if not, check permissions on log file)
- setfailover off/on and check log file on log host (if nothing is logged, snoop the interface to make sure the log entry is reaching the loghost;
also make sure syslogd is not running with the -t switch)
Notes:
- Use 'connections' command to see if ghost sessions are keeping you from connecting to a domain.
(reset the SC , from slave sc or reset button, to remove those sessions.)
- Use the dash (-) to remove an entry when running setupplatform
Firmware: http://pts-americas.west/esg/msg/techinfo/platform/sun_fire/firmware-matrix/
Patch #     SC Firmware   CPU (MHz)                       Domain Firmware             Other features
----------------------------------------------------------------------------------------------------
112127-xx   5.12.5        750/900 (Masks 2.1/2.2 only)    5.12.x
            5.12.6        750/900 (Masks 2.1/2.2 only)    5.12.x                      DR
            5.12.7        750/All 900                     5.12.x                      DR/900 2.3
112494-xx   5.13.0        750/All 900                     5.12.x or 5.13.x            DR/ SC auto failover
            5.13.1        750/All 900                     5.12.x or 5.13.x
            5.13.2        750/All 900/1050                5.12.x or 5.13.x            DR/1050/failover
            5.13.3        750/All 900/1050                5.12.x or 5.13.x
                          750/All 900/1050                5.12.x or 5.13.x
            5.13.5        750/All 900/1050                5.12.x or 5.13.x            /L2 timing
112883-xx   5.14.0        750/All 900/1050                5.12.x, 5.13.x or 5.14.0    DR/Failover/COD
            5.14.4        750/All 900/1050/1200           5.12.x, 5.13.x or 5.14.x    /L2 timing
112884-xx   5.15.0        750/All 900/1050/1200                                       ASR
114523-01   5.16.0        750/All 900/1050/1200                                       SSH


Freshchoice (scsi2/ethernet) adapter firmware has problem booting CDROM. Bug 4397457
workaround: To patch get-mail of ISP fcode to give longer timeout period:
(set nvram parameter fcode-debug? to true.)
ok cd /ssm@0,0/pci@b,2000/pci@2/SUNW,isptwo@4
ok patch 100 64 get-mail
ok
Page 66

Mounting and unmounting CD without vold:


to stop vold : (automount daemon for cdrom and floppy)
# /etc/init.d/volmgt stop
to mount cdrom:
# mount -F hsfs -o ro /dev/dsk/c0t6d0s0 /cdrom
to unmount cdrom:
# umount /cdrom
to start vold :
# /etc/init.d/volmgt start

Send a file using mailx Command line:


# mailx -r return_email_address -s subject_no_spaces sendto_email_address < filename
This will dump the file into the body of the e-mail. Use for text documents, POST output, etc.
Send a message using mailx Command line:
# mailx -r return_email_address -s subject_no_spaces sendto_email_address
Cc: (enter cc: e-mail address if any)
Type text of e-mail (control d) when finished
EOT
#
More T3 info:
Forgotten password:
reset the T3
press (return) within 3 seconds of reset (on the console session you have open)
type set passwd (this will display the current password)
T3 Logging: (you will need to modify the T3s host file and syslog.conf file by ftping them to a unix
server, edit them, send the files back to the T3 and reset the T3)
You should already have the T3 connected to the network and be able to telnet to the T3
type 'set' to make sure you have an ip, netmask, gateway, and hostname on the T3
:/: set logto *
modify T3s host file (add ip and hostname of loghost)
modify T3s syslog.conf (add line '*.info @ip_address_of_loghost')
modify loghost syslog.conf file (add line ' local7.info [tab] /var/adm/messages.t3') must use local7
touch /var/adm/messages.t3 on loghost
kill -HUP the syslogd pid, or stop and restart syslogd, on the loghost
ftp modified host and syslog.conf to the T3
reset the T3 to have changes take effect

Page 67

StarCat 15K:
General:
StarCat 15K:
Has 18 available slots for system board sets. In each of the 18 available slots, you can configure (1)
System bd slot 0 bd and (1) hsPCI, MaxCPU or SunFire Link bd slot 1 bd.
Supports up to 18 system bds (72 CPUs)
and a combination of, not to exceed 18 total slot1 bds:
up to 17 MaxCPU bds (34 CPUs),
up to 18 hsPCI boards (2 3.3v and 2 5v PCI adapters slots per board),
up to 18 SunFire link boards. (includes: 1 3.3v and 1 5v PCI adapter slots per board)
Domains: up to 18 Domains
Power requirements: (12) NEMA L6-30P (2 separate power grids)
System Board set: (up to 18)
System board set is made up of a system board (slot 0 bd) and a slot 1 type board. A slot 1
type board is usually a I/O (hsPCI) board, but can be a SunFire link or MaxCPU bd. The
slot0 and slot1 boards are physically mounted on a 'carrier plate and expander board'.
The expander bd/carrier plate is then inserted into one of the 18 available slots of the StarCat.
Control Board Set: (2) See Fin I0771-1 (keep old id bd if replacing CP1500 bd on the SC)
Also see I0761-1 (upgrade CP1500 post & OBP )
Control Board set is made up of a 'System Controller Bd', 'System Controller Peripheral Bd'
and a 'CenterPlane Support Bd'. The system controller runs solaris and the SMS packages.
The System controller peripheral board has 2 SDS mirrored boot disks, DVD-rom and a 4mm DAT
that are used by the System Controller board.
The System Controller bd and SC peripheral bd are mounted on the CenterPlane Support bd. The
Centerplane support bd is then inserted into one of the 2 control bd slots on the StarCat.
The Control Bd set provides system clock, I2C monitoring bus, console bus to all domains,
serial port and 2 net ports to outside world, serial port internal to other SC and internal net connection
to each domain and other SC.
The SCs come with a O/S installed in a 'sys-unconfig' state. When you run smsconfig -m to
configure your SCs, it is easiest if the SCs are on the network and able to reach their default gateway.
IPMP contacts the gateway to determine if the physical interface is up.
Floating = community hostname and IP address. This address will follow the main SC
failover = virtual IP and hostname that will float between hme0 and eri1 on each SC
hme0, eri1= regular IP and hostnames for the interfaces
SC console port pinout: (plus null modem info for connection to 25pin/9pin serial ports)
o o o
td (2/3)<--- rd 5 > o
o o < 3 -- td ----> (3/2) rd
dtr (20/4)<--- cts -- 2 > o | o < 1 dtr ----> (6,8/6,1) dsr, dcd
|
4 gnd (7/5)
Page 68

StarCat 15K: (cont...)


Example of IPMP configuration on Sun Fire 15K system controllers C (Community) Network:
System Controller Floating IP:              .150
                                            SC0        SC1
IPMP logical IP failover address:           .151       .152
IPMP test IP address, hme0:                 .100       .200
IPMP test IP address, eri1:                 .101       .201
Private internal net interfaces:
scman0: SC's internal ethernet interface to each domain ( I1 network )
scman1: SC's internal interface to other SC (I2 network)
dman0: Domain's internal ethernet interface to each SC and domain ( I1 network)
15k O/S install:
System controller: The SC's are fully functional servers with 2 SDS mirrored 18gb disks,
DVD-rom and a 4mm DAT. They will come already loaded from the factory with Solaris
and SMS. At this time, there is no way to create the ' idprom.image' files in the field (so make sure
they are backed up). The default login and password is sms-svc, sms-svc.
Domain install: If the domain has a D240 attached the install (after creating the domain:
setupplatform, deleteboard, addboard, setobpparams, setkeyswitch) can be done from
the D240s DVD-rom. If you do not have a DVD-rom attached to the domain you are loading
you will most likely have to boot net.
To boot from an install server:
(on install server)
- add the ethernet address and node_name in the /etc/ethers file
- add the node_name in the /etc/hosts file
# add_install_client node_name sun4u (solaris CD in the Tools directory, keep CD mounted)
- check the /tftpboot directory for created files (file name is the hex representation of the node's IP address)
- check the /etc/bootparams file for node_name
(on the domain)
- check out your network interfaces ok> watch-net-all
- check out network interface alias ok> devalias
- change if desired interface is not listed, nvunalias, show-nets, nvalias, nvstore
- boot net_alias -install
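
When checking /tftpboot on the install server, it helps to know which file name to expect for a client. A minimal sketch that converts a dotted-quad IP address to its eight-digit upper-case hex form; the helper name `ip_to_hex` is hypothetical, and the address used is the sunsolve address quoted earlier in this handbook, purely as sample input.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to the hex string used for /tftpboot names.
ip_to_hex() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}

ip_to_hex 192.9.9.24
```

For example, `ls /tftpboot | grep "$(ip_to_hex client_ip)"` would show whether add_install_client created an entry for that client.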
Blacklist:
is populated/unpopulated by hand or with the 'enablecomponent'/'disablecomponent'
commands. The path is /etc/opt/SUNWSMS/config/platform or A-R/ blacklist.
Use the 'hpost -? blacklist' command to list possible entries

.postrc:
The path is /etc/opt/SUNWSMS/config/platform or A-R/ .postrc.
Use the 'hpost -? .postrc' command to list possible entries
Page 69

StarCat 15K: (cont...)


Send BREAK to domain: (be careful, will stop solaris):
from the console connection: ~# (goes right to OK prompt, NOT domain shell like Serengeti)
Decoding CPU locations: 15k
/SUNW,UltraSPARC III @1c 2,0
  1c = change to decimal, divide by 2, result is EX slot
       1C (hex) = 28, 28/2 = 14, EX slot = 14
  2  = CPU ID: 0-3 system bd; 8,9 MaxCPU bd
Decoding Memory locations: 15k
/SUNW,memory-controller @12 2,400000
  12 = change to decimal, divide by 2, result is EX slot
       12 (hex) = 18, 18/2 = 9, EX slot = 9
  2  = CPU ID: 0-3 system bd; 8,9 MaxCPU bd
  400000 = memory offset: 4 = bank0, 6 = bank1
Decoding I/O card locations: 15k
/pci@17c,700000/pci@1/SUNW,isptwo@4/disk@0,0
  17 = change to decimal, divide by 2, result is EX slot
       17 (hex) = 23, 23/2 = 11 r1, EX slot = 11
  c  = IOC0 (slot 0 or 1); d = IOC1 (slot 2 or 3)
  700000: 6 = I/O slot 0 or 2; 7 = I/O slot 1 or 3
  pci@1 = always 1
  SUNW,isptwo = board type
  disk@0,0 = device identifier
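
The EX slot rule used in the three decodes above (take the first two hex digits after the @, convert to decimal, divide by 2 and drop any remainder) can be sketched as a small helper. The function name `ex_slot` and its argument convention are assumptions for illustration only.

```shell
#!/bin/sh
# Derive the expander (EX) slot from the two hex digits after "@" in a 15K
# device path, following the handbook rule: hex -> decimal, then divide by 2.
ex_slot() {
    # $1: two hex digits, e.g. 1c (hypothetical argument convention)
    dec=$(printf '%d' "0x$1")
    echo $(( dec / 2 ))     # integer division discards the remainder
}

ex_slot 1c   # CPU example
ex_slot 12   # memory-controller example
ex_slot 17   # I/O example
```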
SMS (System Management Services)
Default login:
sms-svc
Default password: sms-svc
SMS daemons:
dca  - domain configuration agent. One for every POST. Talks to dcs on domain (only on active SC.)
dsmd - domain status monitoring daemon (only on active SC.)
dxs  - domain X server. One for each domain. (only on active SC.)
efe  - event front-end daemon. Part of SMC; acts as intermediary btwn SMC agent and SMS (only on active SC)
Page 70

StarCat 15K: (cont...)


SMS daemons: (cont)
esmd - environmental status monitoring daemon (only on active SC)
fomd - failover monitoring daemon
frad - field replaceable unit access daemon
hwad - hardware access daemon
kmd  - key management daemon (only on active SC)
mand - management network daemon
mld  - messages logging daemon
osd  - OpenBoot Server daemon (only on active SC.)
pcd  - platform configuration database daemon (only on active SC)
ssd  - SMS startup daemon
tmd  - task management daemon (only on active SC)

SMS Files:
/export/home/sms-svc/.sms_env - SMS user environment
/export/home/sms-svc/.cshrc - SMS user .cshrc
/export/home/sms-svc/.login - SMS user .login

/etc/opt/SUNWSMS/.sms_groups - sms groups file

/etc/opt/SUNWSMS/config/dsmd_tuning.txt - Domain status and monitoring daemon tuning info
/etc/opt/SUNWSMS/config/esmd_tuning.txt - Environmental status and monitoring daemon tuning info
/etc/opt/SUNWSMS/config/fomd.cf - Failover monitoring daemon config file
/etc/opt/SUNWSMS/config/fomd_sys_datasync.cf - Failover monitoring daemon datasync file

/etc/opt/SUNWSMS/config/platform/.postrc - Platform specific .postrc file
/etc/opt/SUNWSMS/config/platform/blacklist - Platform specific blacklist file

/etc/opt/SUNWSMS/config/A/.postrc - Domain specific (A-R) .postrc file
/etc/opt/SUNWSMS/config/A/blacklist - Domain specific (A-R) blacklist

/etc/opt/SUNWSMS/startup/ssd_start - Start script for the ssd daemons
/etc/opt/SUNWSMS/startup/sms_env.sh -

/var/opt/SUNWSMS/.pcd/domain_info - Platform configuration database daemon domain info
/var/opt/SUNWSMS/.pcd/platform_info - Platform configuration database daemon platform info
/var/opt/SUNWSMS/.pcd/sysboard_info - Platform configuration database daemon sysboard info

/var/opt/SUNWSMS/adm/.logger - Message logging daemon specifics
/var/opt/SUNWSMS/data/osdTimeDeltas - OpenBoot Prom server daemon info file

/var/opt/SUNWSMS/data/A/nvramdata - Domain specific (A-R) nvram information
/var/opt/SUNWSMS/data/A/idprom.image - Domain specific (A-R) idprom information
/var/opt/SUNWSMS/data/A/bootparamdata - Domain specific (A-R) boot parameters

SMS commands: (/opt/SUNWSMS/bin)


addboard
- assigns, attaches and configures a board to the domain (domain_id|domain_tag.)
addtag
- adds the specified domain tag name (new_tag) to a domain
cancelcmdsync - The command synchronization commands work together to control the recovery of
user-defined scripts interrupted by a system controller (SC) failover
Page 71

SMS commands :(cont)


console

- creates a remote connection to the domain's virtual console driver, making the window in which
the command is executed a "console window" for the specified domain
deleteboard - removes a board from the domain it is currently assigned to
deletetag
- remove the domain tag name associated with the domain
disablecomponent - adds a component to the domain or platform blacklist
enablecomponent - removes a component from the platform, domain or ASR blacklist
flashupdate - updates the Flash PROM in the system controller (SC), and the Flash PROMs in
a domain's CPU and MaxCPU boards, given the board location.(/opt/SUNWSMS/firmware)
ex: flashupdate -f /opt/SUNWSMS/hostobjs/sgcpu.flash SB1 (leave Name blank to do all SBs)
fruupdate
(command in 'help' listing, but no description or man page)
help
- displays a list of valid SMS commands along with their correct syntax
initcmdsync - The command synchronization commands work together to control the recovery of user-defined
scripts interrupted by a system controller (SC) failover
marginclock
[-f (65|75|83.333) | -s synth-freq | -m [+/-] margin-percent][-y]
marginvoltage [-p1.5] [-p2.5] [-p3.3] [-p5.0] [-pcore] [-m(0|+|-)] [-d domain_id|domain_tag]
[-d domain_id|domain_tag...] [-b location] [-b location...] [-y]
moveboard - first attempts to unassign location from the domain it is currently assigned to and possibly active
in, then proceeds to assign, connect, and configure location to the domain
poweroff
- powers off the specified dual 48V power supply, fan tray, or board
poweron
- powers on the specified dual 48V power supply, fan tray, or board
reset
- allows you to reset one or more domains in one of two ways: reset the hardware to a clean state
or send an externally initiated reset (XIR) signal
resetsc
- resets the other SC
runcmdsync - command prepares the specified script for automatic synchronization (recovery) after a failover.
savecmdsync - The command synchronization commands work together to control the recovery of user-defined
scripts interrupted by a system controller (SC) failover
setbus
- perform dynamic bus reconfiguration on active expanders in a domain
setchs
- SMS1.4 set component health status. SMS can auto fail components. Setchs lets you change the status
setcsn
- SMS1.4 set chassis serial number. Allows you to set csn once. (showplatform) # setcsn -c serial#
setdatasync - schedule filename enables you to specify a user-created file to be added to or removed from the
data propagation list.
setdate
- allows the SC platform administrator to set the SC or optionally a domain date and time values.
Allows domain administrators to set the date and time values for their domains.
setdefaults
- removes all SMS instances of a previously active domain. A domain instance includes all
pcd entries except network information; all message, console, and syslog log files; and, optionally,
all NVRAM and boot parameters. pcd entries and NVRAM and boot parameters are returned to
system default settings
setfailover
- provides the ability to modify the state of failover for the SC failover mechanisms
setkeyswitch - changes the position of the virtual keyswitch to the specified value
setobpparams - allows a domain administrator to set the virtual NVRAM and REBOOT variables passed to
OpenBoot PROM by setkeyswitch
setupplatform - sets up the available component list for domains.
showboards - displays board assignments
showbus
- display the bus configuration of expanders in active domains
showchs
- SMS1.4 displays component health status. EX: showchs -r sb15
showcmdsync - displays the command synchronization list to be used by the spare system controller (SC) to
determine which commands or scripts need to be restarted after an SC failover.
showcomponent - displays whether the specified component is listed in the platform, domain, or ASR blacklist file.
showdatasync - provides the current status of files propagated (copied) from the main SC to its spare
showdate
- display the date and time for the system controller (SC) or a domain
showdevices - displays the configured physical devices on system boards and the resources made available by
these devices.
Page 72

StarCat 15K: (cont...)


showenvironment - displays the environmental data
showfailover - provides the ability to monitor the state of the SC failover mechanism.
showkeyswitch - displays the position of the virtual keyswitch of the specified domain
showlogs
- displays platform or domain log files. The default is the platform message log.
showobpparams - allows a domain administrator to display the virtual NVRAM and REBOOT parameters
passed to OpenBoot PROM by setkeyswitch
showplatform - Show the available component list and domain state for domains.
showxirstate - displays CPU dump information after sending a reset pulse to the processors
smsbackup
- creates a cpio(1) archive of files that maintain the operational environment of SMS
smsconfig -m - configures and modifies the host name and IP address settings used by the MAN daemon,
mand (must have SCs on the network and able to contact the default router for IPMP to work.)
smsrestore
- restores the operational environment of the SMS from a backup file created by smsbackup
smsversion
- Displays the active version and exits when only one version
of SMS is installed.
sysid
{-d domain_id|-f filename} [-m YYYYMMDDhhmm] [-M machineType (defaults to 0x82)]
[-e etherAddr] [-s serial#|-H host_id] sysid -F textIDPROMfile -f newBinaryfile
thermcal
- Use command if replacing a csb bd.
testemail - SMS1.4 allows you to generate a test email to verify SMS logging and recipients
xir
[-d domain_id|domain_tag [-d domain_id|domain_tag]...] [-q] [-y]
local-mac-address :
The "local-mac-address?" eeprom parameter is used to enable the MAC addresses that are burnt in on
network cards.
false - do not use the card's burnt-in addresses; use the nvram default address for all interfaces
(shown on obp banner)
true - use the on-board MAC address (if there is any). This setting is necessary to get a
unique MAC address per interface.
The default setting of local-mac-address? is "false". On non-clustered servers the installation
engineer must remember to set local-mac-address? to true to avoid having the same MAC address appear several
times on the network, which causes network problems.
SDS - How to mirror the root disk
Use this procedure to mirror the system disk partitions using Solstice DiskSuite:
- first format the second disk exactly like the original root disk: (typically s7 is reserved for metadatabase)
# prtvtoc /dev/rdsk/c0t0d0s2 > /tmp/firstdisk
# fmthard -s /tmp/firstdisk /dev/rdsk/c1t0d0s2
- create at least 3 state database replicas on unused (10mb) slices.
# metadb -a -f -c 3 c0t0d0s7 c1t0d0s7 (-a and -f options create the initial state database replicas. -c 3
puts three state database replicas on each specified slice)
- for each slice, you must create 3 new metadevices: one for the existing slice, one for the slice on the
mirrored disk, and one for the mirror. To do this, make the appropriate entries in the md.tab file.
slice 0, create the following entries in (/etc/lvm/md.tab)
d10 1 1 /dev/dsk/c0t0d0s0
d20 1 1 /dev/dsk/c1t0d0s0
d0 -m d10
slice 1, create the following entries in (/etc/lvm/md.tab)
d11 1 1 /dev/dsk/c0t0d0s1
d21 1 1 /dev/dsk/c1t0d0s1
d1 -m d11
Follow this example, creating groups of 3 entries for each data slice on the root disk.
- run the metainit command to create all the metadevices you have just defined in the md.tab file.
If you use the -a option, all the metadevices defined in the md.tab will be created.
# metainit -a -f (-f is required because the slices on the root disk are currently mounted)
- make a backup copy of the vfstab file:

# cp /etc/vfstab /etc/vfstab.pre_sds

- run the metaroot command for the metadevice you designated for the root mirror. In the example
above, we created d0 to be the mirror device for the root partition, so we would run:
# metaroot d0
- edit the /etc/vfstab file to change each slice to the appropriate metadevice. The 'metaroot' command has already
done this for you for the root slice. For example, change the swap entry from
/dev/dsk/c0t0d0s1 - - swap - no -
to
/dev/md/dsk/d1 - - swap - no -
Make sure that you change the slice to the main mirror, d1 not to the simple submirror, d11.
- reboot the system. Do not proceed without rebooting your system, or data corruption will occur.
- After the system has rebooted, you can verify that root and other slices are under DiskSuite's control:
# df -k
# swap -l
The outputs of these commands should reflect the metadevice names, not the slice names.
- Last, attach the second submirror to the metamirror device.
# metattach d0 d20 (must be done for each partition on the disk, and will start the syncing of data)
- to follow the progress of this syncing for this mirror, enter the command
# metastat d0
Although you can run all the metattach commands one right after another, it is a good idea to run the next
metattach command only after the first syncing has completed. Once you have attached all the submirrors
to the metamirrors, and all the syncing has completed, your root disk is mirrored.
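The md.tab entries in the procedure above follow a fixed three-line pattern per slice, so they can be generated rather than typed by hand. The following is only a sketch in Python, not a DiskSuite tool; the function name mdtab_entries is hypothetical, and the d1N / d2N / dN numbering and the default disk names simply extend the slice 0 and slice 1 examples.

```python
def mdtab_entries(slices, primary="c0t0d0", secondary="c1t0d0"):
    """Return md.tab lines for each slice number in `slices`.

    Naming follows the handbook example: d1N and d2N are the submirrors
    on the primary and secondary disk, dN is the one-way mirror that is
    later completed with metattach.
    """
    lines = []
    for n in slices:
        lines.append(f"d1{n} 1 1 /dev/dsk/{primary}s{n}")
        lines.append(f"d2{n} 1 1 /dev/dsk/{secondary}s{n}")
        lines.append(f"d{n} -m d1{n}")
    return lines


if __name__ == "__main__":
    # Slices 0 and 1 reproduce the two groups of entries in the procedure.
    print("\n".join(mdtab_entries([0, 1])))
```

Review the generated lines before adding them to /etc/lvm/md.tab; slices used for the state database replicas (typically s7) should not be included.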

Page 74

IPMP: (Solaris 8 Update 2 10/01)


General Description:
IPMP allows you to create a logical IP address that can be swapped on-the-fly to another
physical network interface.
IPMP Test IP Address: an address assigned to each physical interface (hme0, qfeX, ge). IPMP uses this
address to determine the status of the physical interface. It is not for use by applications.
IPMP Logical IP Address: the IP address used by applications for data transfers to and from
the server. This IP address will fail over between the configured interfaces.
                .151   ======> IPMP Logical IP failover Address
                  |
         _________|_________
        |    IPMP LEVEL     |
        |___________________|
           /             \
        hme0             qfe0
        .100             .101  ===> IPMP Test IP Address

Setup ipv4 IPMP: (IPMP group w/ 1 standby interface) see IP Multipathing Admin Guide
ok> setenv local-mac-address? true
# ifconfig hme0 plumb 172.20.66.100 netmask + broadcast +
# ifconfig qfe0 plumb
# ifconfig hme0 group test-group
# ifconfig qfe0 group test-group
# ifconfig hme0 addif 172.20.66.151 netmask + broadcast + -failover deprecated up
# ifconfig qfe0 plumb 172.20.66.101 netmask + broadcast + deprecated -failover standby up
# ifconfig -a
/etc/hostname.hme0 :
172.20.66.100 netmask + broadcast + group test-group up \
addif 172.20.66.151 deprecated -failover netmask + broadcast + up
/etc/hostname.qfe0 :
172.20.66.101 netmask + broadcast + deprecated group test-group -failover standby up

Page 75

T3B or T3+ Firmware Rev 2.1 New Functions:


Volume slicing:
- Create max 16 slices within a T3, either WG or PP.
- Layered on top of volumes. If volume is unmounted all slices go away.
- Volume slices cannot be seen until the volume is initialized and mounted.
- Minimum size is 1GB, increments of 1GB, starts on GB boundaries.
- Maximum size is size of volume.
- Once enabled cannot be disabled.
EX: (simple example of slicing a volume on a t3+)
Enabled by new system variable enable_volslice.
sys enable_volslice (Note: if volslice is enabled, you must create a slice to see lun in format)
vol add vol_name data u#d#-# raid # standby* u#d9
vol init vol_name data rate(1-16) optional
volslice create slice_name -z size vol_name
volslice list
lun perm list (should be rw, else `lun default all_lun rw')
vol mount vol_name
Lun mapping and masking:
- Enabled with volume slicing.
- Each slice must be mapped to a lun.
- Slices can be renumbered to unused lun.
- Luns range from number 0 to 15.
- Lun masking controls access to lun
- Lun permissions can be none, ro (read only), rw (read write).
- Lun permissions set for all or by WWN of hba.
- Default lun permissions is rw when slice is created from existing volume.
- Lun permissions are none when slices are made of volume created after volume slicing is enabled.
New Mapping / Masking command: lun
Mapping:
lun map add lun <lun#> slice <slice#>
lun map rm lun <lun#> [slice <slice#>]
lun map rm all
lun map list [lun <lun#> | slice <slice#>]
Masking:
lun perm
lun perm list
lun default
lun wwn list
lun wwn rm all
lun wwn rm wwn <wwn#>
WWN Groups: Allows groups of wwns to share security features, saves lazy typists.
New command: hwwn
hwwn add <grp_name> wwn <wwn#>, rm <grp_name> wwn <wwn#>
hwwn list <grp_name>
hwwn rmgrp <grp_name>
hwwn listgrp
Fabric Support: Enabled thru new sys variable fc_topology, three possible settings:
- auto:
chooses between loop and fabric_p2p, depending on capability of attached device.
- loop:
establishes an arbitrated loop connection thru a translated loop (TL) port
- fabric_p2p: establishes a fabric connection thru an F port
NTP can also run in the array to sync time with external server.
Page 76

Hitachi StorEdge 99X0 Arrays:


SE9910- Single cabinet, logic boards in front, disk drives in rear.
Max 16GB cache, 24 host ports, 48 disk drives.
Up to 4096 logical devices can be configured and presented.

SE9960- One DKC logic cabinet, one to six DKU disk cabinets, arranged on right and left (R1-3, L1-3).
R1 is added first, add on alternate sides for best performance.
Max 32GB cache, 32 host ports, 512 disk drives.
Up to 4096 logical devices can be configured and presented.

SE9970V- Single cabinet, logic boards in front, disk drives in rear.
Max 32GB cache, 48 host ports, 128 disk drives.
Up to 8192 logical devices can be configured and presented.

SE9980V- One DKC logic cabinet, one to four DKU cabinets. Added same as 9960.
Max 64GB cache, 64 host ports, 1024 disk drives.
Up to 8192 logical devices can be configured and presented.

All use the concept of "storage clusters": redundant combinations of cache boards, host adapter boards (CHA)
and disk adapter boards (DKA). All array transactions run through the cache.
Drives are set up in either RAID 5 or RAID 1 (1+0).
Basic building block is called the B4, which is 4 trays of disks (HDUs). In 9910 and 9970 B4 is all 4 HDUs of
disks, in 9960 and 9980 a B4 is 4 (of 8) HDUs in a cabinet (bottom 4 or top 4). HDUs will be numbered in N
shape. The same 4 drives in a B4 are a parity group, which is where the RAID level is set. A parity group will
always be 4 drives. In 9970 and 9980 parity groups can span 2 B4's.
B4's are numbered 1 through 12; 1 and 2 are in cabinet R1, 3 and 4 are in L1, 5 and 6 are in R2 etc. Disk drives
in each 9910 and 9960 HDU are numbered 0 through B (11), thus 12 drives. Disk drives in each 9970 and 9980
HDU are numbered 00 thru 0F and 10 thru 1F. Accessing drives 10 thru 1F requires an additional card in the
HDU.
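The B4-to-cabinet numbering above can be worked out mechanically. The sketch below is illustrative only: the function name b4_cabinet is hypothetical, and it assumes the "etc." means the pattern keeps alternating right and left out to R3 and L3 (B4's 9 through 12).

```python
def b4_cabinet(b4):
    """Return the DKU cabinet (R1, L1, R2, ...) that holds a given B4 number.

    Assumes two B4's per cabinet and cabinets filled in the order
    R1, L1, R2, L2, R3, L3, as described for B4's 1 through 6.
    """
    if not 1 <= b4 <= 12:
        raise ValueError("B4 numbers run from 1 through 12")
    pair = (b4 - 1) // 2                  # 0 for B4 1-2, 1 for B4 3-4, ...
    side = "R" if pair % 2 == 0 else "L"  # right side is populated first
    return f"{side}{pair // 2 + 1}"


if __name__ == "__main__":
    for n in range(1, 13):
        print(n, b4_cabinet(n))
```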
Each parity group is set to an emulation mode, the system then divides that parity group into the appropriate
number of LDEV's based on the emulation mode sizing. LDEV's can be presented on the host ports as LUN's
as is or combined to create larger LUNs.
In 9910 and 9960 drive B (top last drive on left) in each HDU in the L1 and R1 DKU's is used as a universal
spare, the bottom B4 drive B will always be a spare if installed, the top B4 drive B may be designated as spares
or may be a normal parity group. In a 9910 any drives installed in slot B will be spares. In 9970 and 9980, drive
0F will be the spare (top left drive next to center cards). Same rules apply for slot 0F as B in 9910 and 9960.
In 9970 the HDU can be "split" using special cards to create two B4's.
Service Processor (SVP):
Windows PC mounted in array. 9970 and 9980 have optional second SVP mounted in cold standby.
Two modes of operation, View and Modify, View will come on when the Remote Console is connected.
Disconnect Remote Console or reboot SVP to go back to Modify mode.

Switches on the SVP Main Panel:
Information- allows review of messages (SIMs)
Maintenance- Select a component for replacement
Diagnosis-
FD Copy- Create a configuration floppy disk
Install- Initial Setup, microcode upgrades, etc.
Default remote console login: USER USER
Passwords:
raid-initialsetup
raid-install
raid-online
horc-forcibly
SVP FUNCTION tabs:
LDEV: format initialize drives/parity groups
HORC or Open TruCopy: Copy between subsystems
LUN Manager: map LUNs to ports
DCR: Dynamic Cache Residency aka Flash Access LUN is mapped into cache
Shadow Image: copy data within subsystem
CVS/Virtual LUN: (small volumes) smaller than emulation mode size... use wasted space,
make small volumes for DCR
On Demand/Just In Time: add additional space
LDEV Security SANtinel: LUN Masking

MAINTENANCE:

Lots of jumpers on boards, must be carefully checked. All changes must be made thru
modify mode on the SVP, carefully following the procedures. Repair procedures have
a pre-change section, a change section and a post-change section; follow all steps.
USE THE MANUALS (on the CD that comes with the firmware) !!

SunFire forgotten password: (SRDB 26846) This procedure works with firmware version 5.11.3 and higher.
If the platform administrator's password is lost, the following procedure can be used to
clear the password.
1. Reboot the System Controller (SC). You won't be able to do this by logging into the platform shell.
You'll need to hit the reset button on the SC to do this.
2. The normal sequence of a System Controller rebooting is for SCPOST to run, then ScApp. You'll need
to wait for ScApp to start loading, then hit Control-A to spawn a vxWorks shell. SCPOST is done running
when you see the message 'POST Complete'. At this point, ScApp will begin to load. When you see
the copyright message 'Copyright 2001 Sun Microsystems, Inc. All rights reserved.', Hit CONTROL-A.
You should see the following:
Task not found
spawning new shell.
->
This last line is the vxWorks prompt. Keep in mind, that ScApp will still continue to load all the way
to the point of giving you the menu to enter the platform/domain shells. To make it less confusing,
wait for the ScApp menu to display on your screen, then hit return. You should see the
vxWorks prompt -> again.
3. Make a note of the current boot flags settings. This will be used to restore the boot flags to the original value.
-> getBootFlags()
value = 48 = 0xC = '0' (Save the 0x number for # 8 below.)
4. Change the boot flags to disable autoboot.
-> setBootFlags (0x10)
5. Reboot the System Controller (CONTROL-X or reboot ). Once reset, it will stop at the -> prompt.
6. If you are running firmware 5.17.x or above, enter the following commands, otherwise, go to step 7:
-> ld 1,0,"/sc/flash/vxAddOn.o"
If you are running firmware 5.17.x or 5.18.x, enter the following command at the prompt
-> uncompressJVM("/sc/flash/JVM.zip", "/sc/flash/JVM");
If you are running firmware 5.19.x or later, enter the following command at the prompt
-> uncompressFile("/sc/flash/JVM.zip", "/sc/flash/JVM");
7. Enter the following commands at the -> prompt.
-> kernelTimeSlice 5
-> javaConfig
-> javaClassPathSet "/sc/flash/lib/scapp.jar:/sc/flash/lib/jdmkrt.jar"
-> javaLoadLibraryPathSet "/sc/flash"
-> java "-Djava.compiler=NONE -Dline.separator=\r\n sun.serengeti.cli.Password"
Wait for the following System Controller messages to display. Your prompt will come back right away,
but it'll take about 10 seconds for these messages to show up:
Clearing SC Platform password...
Done. Reboot System Controller.
8. After the above messages are displayed, restore the bootflags to the original value using the
setBootFlags() command.
-> setBootFlags (0xC) (Use the value returned from #3 above. )
9. Reboot the System Controller using CONTROL-X or the reboot command. Once rebooted,
the platform administrator's password will be cleared.
Default Storage switch passwords:

(telnet to the switches in the san.)

Sun 1GB switch: user: root passwd: ma31_glw


Sun 2GB switch: user: admin passwd: password
Brocade Switch: user: admin passwd: silkworm
Page 79

StorEdge Network FC Switch:


The StorEdge Network FC Switches are replacing the fibre hubs. When you receive them they
are configured similarly to a hub (all ports in one zone). The switch will initially get its IP address
by RARPing (though it has a default IP of 10.0.0.1). You cannot telnet to the switch; you must use
the GUI to configure it (may change with future firmware).
Remember: each array in a zone must have a unique tag address or box id...
Setup: (on server)
- load San Foundation Kit (SUNWsan packages) http://storage.east/san
load and patch SanSurfer GUI (pkgadd -d SUNWsmgr) EIS CD /sun/patch/SAN/8/
add ethernet address and switch_name to /etc/ethers
add Ip address and switch_name to /etc/hosts
check in.rarpd is up: ps -eaf | grep in.rarpd (start if not up /usr/sbin/in.rarpd -a &)
turn on FC switch
ping Ip address of switch
bring up GUI SanSurfer ( java -jar /usr/opt/SUNWsmgr/bin/Sun.jar) or
( /usr/opt/SUNWsmgr/bin/esm_smgr)
login (default login: su, password: su) (can't login? add patch 110696)
Click on IP Address and enter switch IP
Configure the switch as needed. (rate field >20 scan rate for app to get stats)
To set up zoning: (from Fabric Screen)
- click on IP address of switch / zoom / zoning / add zone / click on port / apply
To edit network config: (from Fabric Screen)
- double click on `Fabric Name' of switch
To view zone config: (from Fabric Screen)
- click on IP address of switch / zoom / zoning / zone index 1,2,3 etc...
To clear all zones: (from Fabric Screen)
- click on IP address of switch / zoom / zoning / clear all zones
Useful SAN commands:
luxadm fcode -p (lists SUN/QLOGIC HBAs and firmware on each).
luxadm -e port (Here you would be looking for a connected status for your device in question.)
luxadm -e dump_map /devices/pci@1f,4000/SUNW,qlc@4/fp@0,0:devctl (path is from above command)
luxadm probe
luxadm display <path> (path from above or WWN)
ls -l /dev/cfg (This will show you paths to controller mapping.)
cfgadm -al (View what fabric devices are seen and configured and their condition)
cfgadm -c configure c# (to configure a device ex: cfgadm -c configure c5::50020f2300000cab)
cfgadm -o show_FCP_dev -al (list luns under each device. very handy when troubleshooting lun issues).
ls -l /dev/fc (give you fp to path mappings)
prtconf -vp|grep -i wwn (will give you the wwn of all configured HBAs on the system, this is a snap shot of
what the prom saw at boot).
Page 80

Hitachi Lightning 9900V notes:


also see: http://storage.east/hitachi
DKC - Disk (subsystem) Control Unit
DKU - Disk only frame: up to 4 DKUs : Left 1 (L1), Right 1 (R1) , Left 2 (L2), Right 2 (R2) 9980v
SVP - Service Processor: 1/DKC standard, optional: 2nd SVP/DKC (NOTHING EXTRA loaded on SVP!!)
ACP / DKA - Array Control Processor / Disk adapter : same thing connects to FSWs
CHA- Channel Adapter: contains fiber ports to connect to server
SM - Shared Memory: located on Cache bds, contains subsystem metadata
MDL - Maintenance Documentation Library
PDL - Product Documentation Library ( includes User Guide -theory)
SIM - System Information Message (message led blinking means it cannot talk to SVP) reference numbers
can be looked up in SIMRC.PDF manual (on m/c CD) Action code points to a work ID (USE
MANUALS!!!)
SSID - SubSystem ID: assigned number associated with: mainframes, 'Trucopy', 'Shadow Image'
HDD - Hard Disk Drive
HDU - Hard Disk Unit: up to 32 HDDs in a HDU: slot 0f is spare
B4 - Group of 4 HDUs (N shaped numbering, 0,2 on bottom: 1, 3 on top) 9970 has (1) B4 unless it has
FSW 'c' cards then 2. 9980 B4 numbering: (R1) 1,2 (L1) 3,4 (R2) 5,6 (L2) 7,8 (lowest # on bottom)
FSW - Fiber Channel Interface Switch: PCB in HDU. Connects to DKA. (3) types A, B, C (switches)
SC - Single Cabinet (9970)
MC - Multi Cabinet (9980)
Cluster - set of boards in a subsystem. 2 clusters: CL1, CL2. Mirror config across clusters
Emulations - Lun Specifications (what type of disk drive do you want the lun to appear to be?)
Cannot Hot SWAP: Backplane, FSW 'B' boards
Available Raid Types: Raid 5
Raid 10
CU - Control Unit - an addressable list of Ldevs in shared memory. Rule of thumb: use same
type of Ldev in a CU. If using another type of Ldev in system put them in another CU.
Max 32 CUs 256 Ldev/CU
LUSE - Lun Size Expansion: Make large Lun from Ldevs (concatenate)
CVS/VLL - Make smaller Luns from free space 35gb and lower (must be smaller than emulation size selected)
Parity Group (aka: Array Group): 4 disks only. Select physical disks, select emulation (this will give
you a number of Ldevs depending on emulation), assign Ldevs to CU
Lun Mapping: Map a Ldev to ports on the CHAs. Done thru Storage Navigator.
Host mode 0 is standard, host mode 9 for Solaris, host mode C for windows
Host Groups: When Lun Security is on, up to 128 host groups/port. Can config host mode and have lun0 per
group. Need to know WWN of HBA
High Speed Mode: All the processors on a CHA will be working 1 port: 1 port 4 procs (other 3 ports
disabled)
Standard speed mode: 1 processor per port on a 4 port CHA, 1 proc/2 ports on an 8 port CHA
Offline SVP: Software (m/c CD) to load on your PC. Use to configure without SVP. Requires config floppy
DCI - Define Configure Install: DCI operation destroys customer data use for new install only.
Use 'Change Configuration' on existing subsystems. (Shift ctl i raid-initialsetup)
How to figure needed disk capacity: (but don't forget spares)
Customer wants (10) 500gb luns. How many HDDs do you need?
1, (1) 500gb lun = (14) 36gb open-L Ldevs
500/36= 13 r32 (round up to 14)
2, (10) luns = 140 Ldevs 14x10=140
3, parity groups = 24 6 Ldevs/parity group 140/6= 23 r2 (round up to 24)
4, 96 HDDs required 24 parity groups x 4 disks/group = 96 disks
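The four steps above are round-up divisions, so they are easy to repeat for other LUN sizes. The sketch below reproduces the worked example in Python; the function name hdds_required is hypothetical, and the defaults (36gb open-L Ldevs, 6 Ldevs per parity group, 4 disks per group) are the figures used in the example, not values for every emulation. Spares are not counted.

```python
def ceil_div(a, b):
    """Integer division rounded up, e.g. 500 / 36 -> 14."""
    return -(-a // b)


def hdds_required(luns, lun_gb, ldev_gb=36, ldevs_per_group=6, disks_per_group=4):
    """Return the intermediate and final counts for a capacity estimate."""
    ldevs_per_lun = ceil_div(lun_gb, ldev_gb)                 # step 1
    total_ldevs = luns * ldevs_per_lun                        # step 2
    parity_groups = ceil_div(total_ldevs, ldevs_per_group)    # step 3
    hdds = parity_groups * disks_per_group                    # step 4
    return {
        "ldevs_per_lun": ldevs_per_lun,
        "total_ldevs": total_ldevs,
        "parity_groups": parity_groups,
        "hdds": hdds,
    }


if __name__ == "__main__":
    # Customer wants (10) 500gb luns.
    print(hdds_required(10, 500))
```

Add the spare drives on top of the result.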
Spare Disk Drives:
Are available to any array group
Spares install in slot 0f of each HDU
Mandatory: B4-1, B4-3
Optional: B4-2, B4-4
Adding Frames: Watch HDU Jumper locations when adding frames.
Microcode CD:
- Read ECN (engineering Change Notice) comes with m/c CD
- Includes Manuals (use them)
- Includes Offline SVP software
M/C Upgrade Sequence:
- SVP
- Everything but DKU
- DKU
If message led is on, check subsystem status: (if blinking communication problem with the SVP)
- Maintenance button on SVP
Special Key strokes:
shift-ctl i 'raid-initialsetup'    used for DCI
alt-shift >                        update config diskette
shift-ctl m 'mode'                 puts you in 'mode' mode for m/c upgrades
raid-install                       used in disk replacement

Storage Navigator - Allows you to do Lun mapping, LUSE, CVS, DCR, True Copy, Shadow Image
from a client thru the lan to the SVP. Make sure the SVP is not in 'modify' mode so
you can get write access. Default login: root pwd: root
http://ipaddress-main-SVP//cgi-bin/utility/sjc0000.cgi
DCR/Flashaccess - Dynamic Cache Residency: Will keep a Ldev resident in cache, save on transfer time.
If purchased set it up on install, will save downtime later

HDLM - Hitachi Dynamic Link Manager: Loaded on the server, similar to DMP.
/opt/dynamiclinkmanager/log /bin
Defaults:            Sun     Windows   Setting
Path Health Check    off     off       15 - 1444 min
auto failback        none    off
HDLM commands:
# dlnkmgr view (-path), (-sys),
offline (-path)
online (-path)
set -ellv log-level, -elfs log-size, -systflv trace-level, -pchk, -s
clear
help
True Copy: Remote copy to another disk subsystem (9900 to 9900). Mainly used for disaster recovery.
You configure it on each subsystem using Storage Navigator. One will be the Master (MCU)
and the other Remote (RCU).
2 transfer methods:
SYNC: Data that is transferred to the MCU is in turn sent to the RCU thru a dedicated port.
When the data is acknowledged at the RCU the MCU sends an acknowledgement back to the HBA
ASYNC: Data sent to the MCU is acknowledged to the HBA before the MCU receives
acknowledgement from the RCU
The dedicated port has to be configured as 'initiator' on the MCU and 'RCU target' on the RCU.
This port is a point to point connection between the disk subsystems.
The PVOL is the primary volume (Ldev) the data is sent to it from the server.
The SVOL is the secondary volume (Ldev) on the RCU that True Copy copies to.
True Copy Volume States:
SMPL - simplex volume prior to any pair operation or result of 'pairsplit -s' command
COPY - (initial copy in progress) a result of a 'paircreate' command
PAIR - initial copy complete and doing updates as data changes on pvol
PSUS - pair operations suspended as a result of a 'pairsplit' command
PSUE - pair operations suspended as a result of a failure
To setup True Copy:
Decide on PVOL and SVOL
SSID (need to know, get from customer)
Serial number of each disk subsystem
setup Path between the subsystems (ports, cables, etc...)
Define MCU to RCU path
ASYNC only: Define Consistency Groups (order in which you want data sent to svol)
True Copy (create pairs)
Shadow Image: A local copy within a disk subsystem. Configured using Storage Navigator.
The PVOL is the primary volume (Ldev) the data is sent to from the server. The SVOL is the secondary
volume (Ldev) that Shadow Image copies to. You can have a max of 9 copies (svols); this includes
(3) level1 SVOLs and (6) level2 SVOLs (cascade)
         Level1    Level2
                  ________S
          _____S /
         |       \________S
         |        ________S
 P_______|_____S /
         |       \________S
         |        ________S
         |_____S /
                 \________S
Shadow Image Commands:
paircreate: starts an initial copy and results in a PVOL SVOL pair
pairsplit: splits the pair. quick or steady options. Level1 must be split before level2.
A split will synchronize data between the PVOL and SVOL before the split.
pairresync: will resynchronize a suspended pair.
Quick Functions:
quicksplit : makes it possible to read and write SVOLs immediately after split
quickresync: reduces the resync time considerably
quickrestore: reduces restore time considerably

Minnow StorEdge 3300 Series array: (also see page 110 for disk replacement)
OEM'd from Dot Hill. Small cheap array. Scsi hardware raid and jbod. Fiber array soon.
Raid levels 0, 1, 3, 5, 1+0, and 0+1 supported.
Up to 12 drives per box, 2 redundant RAID controllers.
Model 3310 Ultra 160 LVD SCSI (will work Single Ended as well).
Use new LVD card and SUNWqus driver.
Logical Disk or Group- the raid setup from the disks.
Logical Volume- a raid of logical disks (how they do 1+0 and 0+1).
Partitioning- may map a chunk of LD or LV.
Local spares assigned to particular LD or LV
Global spares assigned within array.
Luns are created and owned by one controller, other is failover for it. Controllers can be active/active or
active/passive. All interface to array is done thru the master controller.
Parts are raid controllers (2), event monitor units (emus) (2), power supplies (2), terminator board (1), io
board (1), disks (12). All hot swappable. Replacing the terminator or io board will interrupt io.

Cabling can be complex, refer to manual. 4 channels within box: 2 are for hosts, 2 for drives.
Single bus- all drives same channel.
Dual bus- split drives between two channels (split drives 1-6 & 7-12, channels 0 & 2).
Any box combination, maximum of 16 drives per any one channel.
IO Board
Channels 0 and 2 are drive channels
Channel 1 and 3 are host channel ports
SB and DB ports are jumper ports: Single bus jumper cable from channel 0 to SB port.
Dual bus jumper cable from channel 2 to DB port.
Expansion unit (JBOD) has no controllers, has 4 port IO Board (A, Aterm, B, Bterm).
Aterm and Bterm are self terminating ports, need to be at end of chain.
Single bus in expander jumper cable from B to Aterm.
Dual bus in expander no jumper cable installed.
If adding an expander to a controller box run the cable to the non term ports.
Box Management thru serial port or GUI (GUI doesn't work well yet).
If using network connect both controllers to same subnet, only master controller has ip address. IP
assigned by DHCP or static thru serial port connection.
Standard RS232 null modem (9 pin female) serial cable to either controller. Settings are 38400 baud, 8N1.
control-l      refreshes screen (if just connected to running array hit control-l, choose VT100 mode)
control-w      switches between the controllers.
control-acbd   reset to factory defaults, password oemmaint
Config tool is a text based menu, common to all arrays, main selections are: (use Return and ESC to navigate)
view & edit logical drives       (create, expand, delete, raid configs, partition, set spares)
view & edit logical volumes      (create, delete logical volumes)
view & edit host luns            (assign lun id's and map host channels)
view & edit scsi drives          (view drive status, flash drive leds, set global spares, clone drives)
view & edit scsi channels        (status, properties, set controller target id)
view & edit config parameters    (controller settings, set baud and ip address)
view & edit peripheral devices   (set expansion box, secondary controller, array status (emu))
view system information          (cache size, firmware revision, etc...)
system functions                 (reset, shutdown, fw upgrade)
event logs

Create LUNs: (in general, example does not use logical volumes so no + raid levels)
setup global spares- v/e scsi drives-select disk-add global spare-yes
setup logical drive- v/e logical drives-select LG-create logical drive-yes-raid-select disks-capacity-ESC
partition logical drive- v/e logical drives-select logical drive-partition-select partition(arrow)-size-yes
map luns to host- v/e host luns-select controller-select lun#-select logical drive-select partition-map(y)
Modify /kernel/drv/sd.conf: (must do for all lun #'s other than 0)
create the following 2 line entry for each lun:
name="sd" class="scsi"
target=# lun=#; (change target and lun)
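When several LUNs are mapped, the sd.conf entries differ only in the target and lun numbers. The following sketch generates them in the two-line form shown above; the function name sd_conf_entries is hypothetical, and the target and LUN numbers in the usage lines are made-up examples.

```python
def sd_conf_entries(target, luns):
    """Return one two-line sd.conf entry per LUN for a given SCSI target."""
    entries = []
    for lun in luns:
        entries.append(f'name="sd" class="scsi"\n    target={target} lun={lun};')
    return entries


if __name__ == "__main__":
    # Example only: target 0, LUNs 1 through 3 (lun 0 needs no entry).
    print("\n".join(sd_conf_entries(0, range(1, 4))))
```

Append the output to /kernel/drv/sd.conf only after checking it against the LUN mapping actually configured on the array.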
Page 85

Tuning ecache scrubber scan rate:


See FIN I0755-1.
The following procedure can help on UltraSparc II servers that experience ecache failures. Best used on
servers where mirrored ecache is not an option.
The procedure increases the scan rate from 100 times a second to 1000 times a second.
It will increase the system utilization by about 1%.
To adjust ecache_scan_rate:
1. As root, run the following command to adjust ecache_scan_rate.
# echo 'ecache_scan_rate/W 0t1000' | adb -kw
NOTE: This does not require downtime. Be very careful, though, as mis-typing the command could
result in downtime.
2. To make the change permanent, add the parameter setting to /etc/system. It is best to insert all
3 parameters together into /etc/system if the settings are not already there:
set ecache_scrub_enable=1
set ecache_scan_rate=1000
set ecache_calls_a_sec=100
To check a system's current setting use the following command.
This does not modify the setting in any way:
# echo 'ecache_scan_rate/D' | adb -k
VxWorks (serengeti SC): Use when you cannot get into scapp or to recover a failed SC flashupdate
- Reset the SC using the reset button on the front of the SC.
- when "Copyright 2001-2002 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms." appears, hit CTRL A
->setBootFlags(0x0) then CTRL X (will reboot and stop booting at the -> prompt)
->setBootFlags(0xd) (then reboot, to change the boot settings back so the SC automatically boots ScApp)
- to configure a netmask: -> ifMaskSet("eri0", 0xffffff00) (example will set it to 255.255.255.0)
- to configure an IP address: -> ifAddrSet("eri0", "129.146.232.222")
- to enable the network interface: -> ifFlagSet("eri0", 0x8063)
- to configure a default router: -> routeAdd("0.0.0.0", "129.146.232.10")
- to Register the name/address of a server: -> hostAdd("myhost", "129.146.240.105")
- to test the network interface : -> ping "myhost",1 or ping 129.146.240.105,1
- to Update the ScApp flashprom in Vxworks:
updateBootFlashURL("ftp://login:password@myhost/path_to/sgrtos.flash")
updateScAppFlashURL("ftp://login:password@myhost/path_to/sgsc.flash")
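ifMaskSet takes the netmask as a hex number, which is easy to get wrong by hand for anything other than 255.255.255.0. The sketch below does the conversion on a workstation before you type the command at the -> prompt; it is not something that runs on the SC, and the helper names netmask_to_hex and if_mask_set_command are hypothetical.

```python
import ipaddress


def netmask_to_hex(mask):
    """Convert a dotted-quad netmask such as 255.255.255.0 to 0xffffff00."""
    return f"0x{int(ipaddress.IPv4Address(mask)):08x}"


def if_mask_set_command(interface, mask):
    """Build the ifMaskSet line in the form used in the list above."""
    return f'ifMaskSet("{interface}", {netmask_to_hex(mask)})'


if __name__ == "__main__":
    print(if_mask_set_command("eri0", "255.255.255.0"))
```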
Page 86

LVD PCI Adapter: (ultra scsi-3 375-3057)


Code named jasper, it is a Low Voltage Differential card. Mainly supports the S1, D2 and Minnow
(SE 3310) arrays.
The LVD drivers are not on any Solaris CD yet, 8 02/02 or 9. You will have to either make a temp boot disk
and patch it or boot net from a patched image to see the disks on a LVD adapter, until a bootable CD is
released that has driver support for the LVD.
do the following to see disks on a LVD adapter:(drivers and patches available on EIS cd sun/progs, sun/patch)
- add_install_server (create the solaris 8 image)
- download "QUS" drivers from www.sun.com(all four) SUNWqus, SUNWqusu, SUNWqusx and SUNWqusux.
- download patch 112697-02 from sunsolve
- pkgadd -R /boot_dir/Solaris_8/Tools/Boot -d . (add drivers to boot image)
- patchadd -R /boot_dir/Solaris_8/Tools/Boot 112697-02 (patch boot image)
- add_install_client (enable client to boot net from server)
- boot net (client)
Once loaded you can install Solaris on the LVD disks. But you have to select 'manual
boot' so you can then patch the install image before reboot as follows:
- cd /net/ipaddress_of_install_server/shared_dir_where_pkgs_located/
- pkgadd -R /a -d . (add all four pkgs 32 and 64 bit)
- patchadd -R /a 112697-02
- reboot
see doc 816-2156-11.pdf StorEdge PCI Dual Ultra3 SCSI Host Adapter Install Guide.
CP1500 - (15k SC) replacement: (see fin I0761-1)
The Nordica bd is used both in the netra line and the SC of a 15k. When replacing the Nordica
bd (501-5473) in a 15k you have to upgrade the OBP so you will have all the SC functionality.
The info doc says you should do the procedure on rev -12 and below. We had to do the procedure on
a -13 board to get it to work(without it we could not see the 'man' network interfaces).
In general you have to: (see fin and download readme for specifics)
- download the CP1500 OBP image
- run the downloaded script (updates CP1500 OBP to 3.14.6)
- flash the SC (flashupdate)
- reset OBP parameters

You can find The current Nordica OBP firmware image available for download at :
http://pts-americas.west/esg/hsg/starcat/patches.html
Serengeti /15k Dynamic Reconfiguration: Min Requires Solaris 8 (02/02 u7) SC 5.12.6
(also see 15k dr examples page 109)
(Solaris commands)
To get a list of component NAMES: # cfgadm -al
To remove a bd from a domain:
# cfgadm -o unassign,nopoweroff -c disconnect NAME (ex: N0.SB1)
To add a bd into a domain:
# cfgadm -v -c configure NAME (ex: N0.SB1)
To see if board has perm mem:
# cfgadm -val | grep permanent
Page 87

To clean up non-root disk controller numbers: (see info docs 15019, 27756)
# mv /etc/path_to_inst /etc/path_to_inst.orig
# rm /etc/path_to_inst.old
# cd /dev/dsk
# rm c1* c2* c3* c4* (do not remove your boot device)
# cd /dev/rdsk
# rm c1* c2* c3* c4* (do not remove your boot device)
# rm -rf /dev/cfg/* (new on solaris 8)
If boot disk is under Sun StorEdge Volume Manager, search for "rootdev:" in /etc/system.
ex: rootdev: /pseudo/vxio@0:0 (Write down this device name exactly, you will use it on boot.)
# init 0
ok boot -ar

(take the default through all prompts except: Do you want to rebuild this file [n]? y )
(and if you have the boot disk under StorEdge Volume Manager, when asked for)
( the physical root device, enter the device name you found above)

Set network parameters at boot:


ok> boot net:speed=100,duplex=full (no spaces)
Starcat Portid cheat sheet:
Decimal:
-----------------------------------------------------------------------------
| Exp | cpu0 | cpu1 | cpu2 | cpu3 | max0 | max1 | pci0 | pci1 | axq0 | axq1 |
-----------------------------------------------------------------------------
|   0 |    0 |    1 |    2 |    3 |    8 |    9 |   28 |   29 |   30 |   31 |
|   1 |   32 |   33 |   34 |   35 |   40 |   41 |   60 |   61 |   62 |   63 |
|   2 |   64 |   65 |   66 |   67 |   72 |   73 |   92 |   93 |   94 |   95 |
|   3 |   96 |   97 |   98 |   99 |  104 |  105 |  124 |  125 |  126 |  127 |
|   4 |  128 |  129 |  130 |  131 |  136 |  137 |  156 |  157 |  158 |  159 |
|   5 |  160 |  161 |  162 |  163 |  168 |  169 |  188 |  189 |  190 |  191 |
|   6 |  192 |  193 |  194 |  195 |  200 |  201 |  220 |  221 |  222 |  223 |
|   7 |  224 |  225 |  226 |  227 |  232 |  233 |  252 |  253 |  254 |  255 |
|   8 |  256 |  257 |  258 |  259 |  264 |  265 |  284 |  285 |  286 |  287 |
|   9 |  288 |  289 |  290 |  291 |  296 |  297 |  316 |  317 |  318 |  319 |
|  10 |  320 |  321 |  322 |  323 |  328 |  329 |  348 |  349 |  350 |  351 |
|  11 |  352 |  353 |  354 |  355 |  360 |  361 |  380 |  381 |  382 |  383 |
|  12 |  384 |  385 |  386 |  387 |  392 |  393 |  412 |  413 |  414 |  415 |
|  13 |  416 |  417 |  418 |  419 |  424 |  425 |  444 |  445 |  446 |  447 |
|  14 |  448 |  449 |  450 |  451 |  456 |  457 |  476 |  477 |  478 |  479 |
|  15 |  480 |  481 |  482 |  483 |  488 |  489 |  508 |  509 |  510 |  511 |
|  16 |  512 |  513 |  514 |  515 |  520 |  521 |  540 |  541 |  542 |  543 |
|  17 |  544 |  545 |  546 |  547 |  552 |  553 |  572 |  573 |  574 |  575 |
-----------------------------------------------------------------------------

In Hex:
-----------------------------------------------------------------------------
| Exp | cpu0 | cpu1 | cpu2 | cpu3 | max0 | max1 | pci0 | pci1 | axq0 | axq1 |
-----------------------------------------------------------------------------
|   0 |    0 |    1 |    2 |    3 |    8 |    9 |   1c |   1d |   1e |   1f |
|   1 |   20 |   21 |   22 |   23 |   28 |   29 |   3c |   3d |   3e |   3f |
|   2 |   40 |   41 |   42 |   43 |   48 |   49 |   5c |   5d |   5e |   5f |
|   3 |   60 |   61 |   62 |   63 |   68 |   69 |   7c |   7d |   7e |   7f |
|   4 |   80 |   81 |   82 |   83 |   88 |   89 |   9c |   9d |   9e |   9f |
|   5 |   a0 |   a1 |   a2 |   a3 |   a8 |   a9 |   bc |   bd |   be |   bf |
|   6 |   c0 |   c1 |   c2 |   c3 |   c8 |   c9 |   dc |   dd |   de |   df |
|   7 |   e0 |   e1 |   e2 |   e3 |   e8 |   e9 |   fc |   fd |   fe |   ff |
|   8 |  100 |  101 |  102 |  103 |  108 |  109 |  11c |  11d |  11e |  11f |
|   9 |  120 |  121 |  122 |  123 |  128 |  129 |  13c |  13d |  13e |  13f |
|  10 |  140 |  141 |  142 |  143 |  148 |  149 |  15c |  15d |  15e |  15f |
|  11 |  160 |  161 |  162 |  163 |  168 |  169 |  17c |  17d |  17e |  17f |
|  12 |  180 |  181 |  182 |  183 |  188 |  189 |  19c |  19d |  19e |  19f |
|  13 |  1a0 |  1a1 |  1a2 |  1a3 |  1a8 |  1a9 |  1bc |  1bd |  1be |  1bf |
|  14 |  1c0 |  1c1 |  1c2 |  1c3 |  1c8 |  1c9 |  1dc |  1dd |  1de |  1df |
|  15 |  1e0 |  1e1 |  1e2 |  1e3 |  1e8 |  1e9 |  1fc |  1fd |  1fe |  1ff |
|  16 |  200 |  201 |  202 |  203 |  208 |  209 |  21c |  21d |  21e |  21f |
|  17 |  220 |  221 |  222 |  223 |  228 |  229 |  23c |  23d |  23e |  23f |
-----------------------------------------------------------------------------
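
Both cheat sheets follow one pattern that can be read off the rows: port ID = 32 * expander number + a fixed per-device offset (cpu0-3 = 0-3, max0/1 = 8/9, pci0/1 = 28/29, axq0/1 = 30/31). The sketch below only does that arithmetic. It is not a Sun tool; the function name starcat_portid is made up for illustration, and it assumes a POSIX shell such as ksh or bash (the old Bourne /bin/sh has no $(( )) arithmetic).

```shell
# Hypothetical helper: compute a Starcat port ID from expander number and device.
# Pattern taken from the cheat sheet tables: portid = 32 * expander + device offset.
starcat_portid() {
    exp=$1
    dev=$2
    case "$dev" in
        cpu0) off=0 ;;  cpu1) off=1 ;;  cpu2) off=2 ;;  cpu3) off=3 ;;
        max0) off=8 ;;  max1) off=9 ;;
        pci0) off=28 ;; pci1) off=29 ;;
        axq0) off=30 ;; axq1) off=31 ;;
        *) echo "unknown device: $dev" >&2; return 1 ;;
    esac
    id=$((exp * 32 + off))
    # Print the decimal and hex forms, matching the two tables.
    printf '%d 0x%x\n' "$id" "$id"
}
```

Usage: starcat_portid 5 max0 prints the decimal and hex port ID for max0 on expander 5. The function does not check that the expander number is in the 0-17 range shown in the tables.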

Page 88

Starcat SC: clean the slate: (bring down domains)
- Clean dump and post files in /var/opt/SUNWSMS/adm/A-R
- Remove all boards from domains: ex: # deleteboard SB0 SB1 IO0 IO1 etc...
- Stop SMS on both SCs:
# /etc/init.d/sms stop
# mv /etc/opt/SUNWSMS/SMS1.3/config/MAN.cf /etc/opt/SUNWSMS/SMS1.3/config/MAN.old
# sys-unconfig
Without the MAN.cf file it is as though smsconfig -m has never been run.
Starcat redx info: check out : http://pts-americas.west/esg/hsg/starcat/tools/xcredx.html
#redx -l
(will put you in local mode to look at dumps. redxl.csh for non SC analysis)
redxl>dumpf load dump-file-name (will load dump and give you a brief summary)
redxl>dumpf types (will list the domain board configuration)
redxl> wfail (will give you failure info 1E= 1st error 1E+= accumulated errors)
SB (slot 0) redx commands:
redxl> shproc 0 0 3  (show PROC. 0 0 3 = exb 0 slot0 cpu 3. shproc connects to DCDS, SDC, AR, SBBC)
redxl> shdcds 0 0 1  (show DCDS. 0 0 1 = exb 0 slot0 dcds 1. shdcds connects to PROC, DX)
redxl> shdx 0 0 3    (show DX. 0 0 3 = exb 0 slot0 dx 3. shdx connects to SDI(exb), DCDS)
redxl> shar 0 0      (show AR. 0 0 = exb 0 slot0. shar connects to AXQ(exb), SDI 0(exb), PROCs)
redxl> shbbc 0 0 1   (show SBBC. 0 0 1 = exb 0 slot0 sbbc 1. shbbc connects to SDC, PROCs)
redxl> shsdc 0 0     (show SDC. 0 0 = exb 0 slot0. shsdc connects to SBBC, PROCs)
I/O (slot1) redx commands:
redxl> shioc 0 1 1   (show IOC. 0 1 1 = exb 0 slot1 ioc 1. shioc connects to SDC, DXs, AR)
redxl> shar 0 1      (show AR. 0 1 = exb 0 slot1. shar connects to AXQ(exb), SDI 0(exb), IOCs)
redxl> shdx 0 1 1    (show DX. 0 1 1 = exb 0 slot1 dx 1. shdx connects to SDI(exb), IOCs)
redxl> shsdc 0 1     (show SDC. 0 1 = exb 0 slot1. shsdc connects to SBBC, IOCs)
redxl> shbbc 0 1     (show SBBC. 0 1 = exb 0 slot1. shbbc connects to SDC, IOCs)
Expander (exb) redx commands:
redxl> shaxq 0       (show AXQ. 0 = exb 0. shaxq connects to AMXs(cp), ARs, SDCs, SDI 0)
redxl> shcbr axq 0   (show CBR AXQ. 0 = exb 0)
redxl> shsdi 0 0     (show SDI. 0 0 = exb 0 sdi 0. shsdi connects to DARBs(cp), DMXs(cp), ARs, SDCs, SDIs(exb), AXQ(exb). 6 SDIs/exb)
redxl> shcbr exb 0   (show CBR EXB. 0 = exb 0)
CenterPlane (cp) redx commands:
redxl> shamx 0 1     (show AMX. 0 1 = cp 0 amx 1. shamx connects to AXQs(exbs))
redxl> shrmx 1       (show RMX. 1 = cp 1. shrmx connects to AXQs(exbs))
redxl> shdmx 0       (show DMX. 0 = cp 0. shdmx connects to SDIs(exbs) port 0-3, 1-2, 2-1, 3-0, 4-4, 5-5)
redxl> shdarb 1      (show DARB. 1 = cp 1. shdarb connects to SDI 0(exbs), shows domain configs)
Terms:
AR     Address Repeater              (1 per SB, IO, max CPU)
AMX    Address MultipleXer           (2 per centerplane bus C0, C1)
AXQ    Address controller            (1 per expander board)
DARB   Data ARBiter                  (1 per centerplane bus C0, C1)
DCDS   Dual CPU Data Switch          (2 per SB, 1 per Max CPU. 1/DCDS for 2 PROCs)
DMX    Data MultipleXer              (6 per centerplane bus C0, C1, connects to SDI exbs)
DX     Data Switch                   (4 per slot0, 2 per slot1 bd)
RMX    Response MultipleXer          (1 per centerplane bus C0, C1)
SBBC   System Boot Bus Controller    (2 per slot0, 1 per slot1 bd)
SDC    System Data path Controller   (1 per SB, IO, max CPU)
SDI    System Data Interface         (6 per EXB, 0 is master, connects to DMXs)
Page 89

StorADE:
Includes diagnostics that are supposed to replace Storetools.
A lot of the new arrays and fibre channel backplanes are supported.
You can bring up the GUI by typing (in a browser window, any server):
http://hostname:7654 (default login: ras password: agent)
(I found cli diags to be more useful than the GUI)
New cli storage diagnostics are located in /opt/SUNWstade/Diags/bin, listing below.
6120test -tests the functionality of disks in a 6120 array (minnow)
a5ktest - tests the functionality of disks in the Sun StorEdge A5000 and A5200 array
a5ksestest - tests Sun StorEdge A5000 and A5200 arrays
a3500fctest - verifies functionality of Sun StorEdge A3500FC disk tray
brocadetest - diagnose Brocade Silkworm Fibre Channel switches
d2disktest - tests the functionality of the Internal Sun StorEdge D2 Array disk
daksestest - tests Sun Fire V880 FC-AL disk backplanes
daktest - tests the Sun Fire V880 FC-AL disk
dex - Device Exerciser for Sun StorEdge arrays
discman - discovery manager
disk_inquiry - disk-only version of the inquiry program
disktest - No manual entry
enc_inquiry - No manual entry
fcdisktest - tests the functionality of internal fibre channel disk
fctapetest - tests the functionality of Fibre Channel tape drives
ifptest - tests functionality of the PCI FC-100 Fibre Channel-Arbitrated loops (FC-AL) card
lbf - A loop back frame diagnostic utility program that tests Fibre Channel-Arbitrated loops (FC-AL)
linktest - diagnose Sun StorEdge network passive Fibre Channel components
linktest2 - No manual entry .
ofdg - No manual entry
ondg - No manual entry
qlc_hba - displays stats on qlc hba
qlctest - tests the functions of the 1gb and 2 gb PCI and cPCI Fibre Channel Network Adapter boards.
socaltest - tests the SOC+ host adapter card
stresstest - Checks for possible SAN errors.
switchtest - diagnose Sun StorEdge Network Fibre Channel switch-8 and switch-16 switches
t3test - tests the functionality of the Sun StorEdge T3 and T3+ array LUNs
vediag - Runs virtualization engine diagnostics through SLICD
veluntest - tests the functionality of the virtualization engine by accessing the VLUNs.
volverify - No manual entry
Get FRU info from a Serengeti: (prtfru does not work on Serengeti, explorer must be loaded)
# cd /opt/SUNWexplo/bin
# LD_LIBRARY_PATH=/opt/SUNWexplo/lib
# export LD_LIBRARY_PATH
# CLASSPATH=/opt/SUNWexplo/java/fruid-scappclient.jar:/opt/SUNWexplo/java/libfru.jar
# export CLASSPATH
# ./rprtfru.sparc -b sc_ip_address:password > /tmp/fruid
(must use password. Output goes to file /tmp/fruid)
Page 90

SWAP
What is the recommended (2003) swap size for servers with gigabytes of physical memory?
(http://docs.sun.com/db/doc/817-0798/6mgisnqfi?a=view)
System Type                                          Swap Space Size   Dedicated Dump Device Size
Workstation, 4 GB of physical memory                 1 Gbyte           1 Gbyte
Mid-range server, 8 GB of physical memory            2 Gbytes          2 Gbytes
High-end server, 16 to 128 GB of physical memory     4 Gbytes          4 Gbytes


Performance considerations:
How much and how often?
# swap -s (command to monitor swap resources)
# swap -l (command to determine if your system needs more swap space)
How do you get an estimate of needed swap/app?
# pmap -r pid#
(sol 8, 9) (shows heap used/process. Add up heap to get an idea)
# pmap -Sa pid#
(sol 9) (will show all reservations by each process)
How to tell how much swapping? (if too much should consider adding more physical memory)
# vmstat 5 5 (look at the sr column, and also note po, the page out column. Non-zero numbers mean:
sr - the page scanner is looking for pages to mark as free, po - we're sending stuff out.)
# iostat -npxc 5 5 (check for kw/s on the swap partition. If it is non-zero, the page outs from
vmstat really are writes to the swap partition(s).)
(http://docs.sun.com/db/doc/816-4553/6maop1hik?a=view)
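
The vmstat check above can be scripted when you have to go through a long capture. The following is a minimal sketch, assuming the usual Solaris vmstat layout in which the header line starts with "r" and carries "po" and "sr" labels. The function name vmstat_scan is made up for illustration. It finds the columns from the header rather than hard-coding positions, and reports the highest po and sr values seen.

```shell
# Hypothetical helper: read vmstat output on stdin, print the peak sr and po values.
vmstat_scan() {
    awk '
        # Header line: remember which fields are "sr" and "po".
        $1 == "r" {
            for (i = 1; i <= NF; i++) {
                if ($i == "sr") sr = i
                if ($i == "po") po = i
            }
            next
        }
        # Data lines start with a number; only use them once the header was seen.
        sr && $1 ~ /^[0-9]+$/ {
            if ($sr + 0 > maxsr) maxsr = $sr + 0
            if ($po + 0 > maxpo) maxpo = $po + 0
        }
        END { printf "max_sr=%d max_po=%d\n", maxsr, maxpo }
    '
}
```

Usage: vmstat 5 5 | vmstat_scan. Keep in mind that the first data line vmstat prints is an average since boot, so a peak that comes only from that line says little about current paging.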
Dump considerations:
How much memory do you want dumped? all, kernel, kernel + active process
# dumpadm
Dump content: kernel pages
Dump device: /dev/dsk/c0t3d0s1 (swap)
Savecore directory: /var/crash/pluto ***(large enough to hold core)
Savecore enabled: yes
# dumpadm -c all -d /dev/dsk/c0t1d0s1 -m 10%
Dump content: all pages
Dump device: /dev/dsk/c0t1d0s1 (dedicated)
Savecore directory: /var/crash/pluto (minfree = 77071KB)
Savecore enabled: yes
savecore -L (live core dump, WATCH OUT, do not do a savecore -L to a dumpslot under volume
manager control)
DR considerations:
How much physical memory on most populated System board?
Nonpermanent Memory (currently 32gb physical mem/max/bd): Before you can delete
a board, the environment must vacate the memory on that board. Vacating a board means
flushing its nonpermanent memory to swap space.
http://education.central/AliasArchive/Archives/ILT/ses_systemadmin-ext/msg08612.html
http://education.central/AliasArchive/Archives/ILT/ses_systemadmin-ext/msg05509.html
Page 91

from /net/cores.central/cores/dir5/
(REAL DATA: looked at explorer for ram size and explaned core to check size)
RAM      Core size   type
24gb     1.7gb       k
20gb     984mb       k
18gb     1gb         k
16gb     900mb       k
16gb     884mb       k
16gb     2.4gb       k
10gb     800mb       k
8gb      1.2gb       k
6gb      518mb       k
4gb      300mb       k
4gb      594mb       k
4gb      305mb       k
4gb      435mb       k
4gb      2mb         a
2gb      155mb       k
2gb      243mb       k
2gb      234mb       k
2gb      374mb       k
2gb      263mb       k
1.5gb    220mb       k
1gb      997mb       k
1gb      138mb       k
Solaris (as listed): 8, 8, 8, 8, 8, 8, 8, 8, 2.6, 8, 8, 2.6, 7, 8, 2.6, 8, 8, 7, 8, 7, 8


Maserati Notes- StorEdge 6320 and 6120:


Two models: 6120- standalone, desk side or rack, like T3 WG or PP. 6320- rack solution like the 3900 (Indy),
includes service processor, management net. Next generation T3, just don't call it the T4. Very much like the T3.
Drives in front, two power supplies in back on top, one controller, two loop cards. Components are similar to the
T3 but are physically enclosed differently, not swappable between T3 and T4. Units are 3U high. On back,
controller in middle, loop cards on each side. Loop cables are different (use RJ-45 type connector). All fiber
connections use the LC style connector.
Arrays are 2Gb capable on the front end using the Qlogic 2300 chipset. Internally they run at 1Gb using the Qlogic
2200 chipset.
Model marketing designation is a 'YxZ' config: where Y=# of controllers, Z= # of trays.
Each controller can have 1 to 3 disk trays associated with it. One tray will have the controller in it, the other 2
will have no controller. Trays are joined via the loop cards. Min config is 1x1 (1 controller, 1 box), max config
is 2x6 (2 controllers, 6 boxes). Controller redundancy is done thru a partner pair type config, just like the T3
except with the expansion trays factored in.
Up to 14 drives per tray, 7 is minimum supported number (though only 4 will work). Drive slot 14 is the hot
spare location. Like T3 don't have to have a spare, but if you do it must be slot 14. Drive sizes are 36GB, 73GB
and 146GB drives.
All commands are the same as a T3 with 2.1 and above firmware. Max luns per array is 64, max luns per volume
is 32. Each tray is still limited to two volumes, using contiguous disks. If you have min config (7 disks) and
build two volumes, you will need to remove/create a volume to add more disks.
Note- internally, brick terminology is the same as T3 (volslice, volume), although Maserati manuals refer to
them as pools (volumes on T3) and volumes (volslices on T3).
Page 92

Maserati Notes- StorEdge 6320 and 6120: cont.


6120 LED indicators:
Green- Normal
Yellow- Service action required
Blue- Safe to remove (hot swap)
White- Locator beacon
6320 -rack has a V100 service processor, an integrated patch panel and a SPAT (service processor accessory
tray). V100 has cdrom, optional usb flash memory card to save config. Patch panel consolidates connections for
service components and fiber connections. SPAT has a 4 port terminal concentrator (NTC) with a built in
modem, a firewall/router, an ethernet hub and future usb power management sequencer. (Customers are
encouraged to use remote services by Sun thru the provided modem. During initial release of the
product 5/03 thru 9/03 install is free.)
FC switches may be mounted in the rack but are no longer monitored or controlled via the SP.

6320 has 3 LANs set up:
- internal- for components only
- SP LAN- remote services net (behind the firewall)
- User LAN- one customer net port.
6320 default logins, passwords and roles:
Component            Login/password      Role
Service Processor    root/!root
Firewall             root/sun1           user firewall access
NTC                  rss/sun1rss         NTC user
NTC                  su/sun1rss          NTC admin
6120 array           root/!root          array admin
GUI passwords (config service):
                     admin/!admin        full access
                     storage/!storage    storage set up only
                     guest/!guest        observe only
Login to sp from external system using ssh:
ssh -l root <ip>
(sp does not have menu to make changes to config, like 3900 Indy)
Use web based GUI:
https://<ip>:9443/se6000ui/login.do (GUI is similar to storade)
Use sscs CLI: (from external system with packages installed)
Commands located in /opt/se6x20/cli/bin
sscs login, sscs list, sscs add, sscs create, sscs modify, sscs delete
Flash Archive interactive install: (saves time on multi domain installs)(see info doc 40131)
Create a flash image from a patched server: (load patches and packages before creating image)
# cd /
# flarcreate -S -n image_name /path_to/image_file (~2.2gb - can use -c compress, 2x longer, only 1/5 smaller)
# share -F nfs -o ro,anon=0 /path_to/image_file (share image file) (/etc/init.d/nfs.server start)
Boot new server and load from image: (if booting from CD, best to use same release as flash, ex: sol9 04/04)
(note: you need network connectivity between image server and new server to download image)
- On server to be loaded: boot cdrom or boot net (if you have created an install server or 12/15K)
- Answer all install questions until you get to F2 Standard F4 Flash, then select F4
- Select NFS
- NFS Location: ip_address:/path_to/image_file
(ex: 192.148.220.113:/var/tmp/flash )
- Continue answering install questions as you would on a regular interactive install
- Server will load Solaris from the image you specified/created
Page 93

UltraSPARC III CPU Diagnostic Monitor (CDM): ( see Sun Alert ID: 55081 )
CDM is supported only on UltraSPARC III processor-based platforms running Solaris 8 or Solaris 9.
CDM contains 3 packages with a total size of less than 1MB.
To download packages: http://diagnostics.sfbay/cdm/
(EIS-CD 29JUL03 will also have the packages on it)
Download consists of three Sun packages:
Package      Install order
SUNWcdiax    1
SUNWcdiar    2
SUNWcdiam    3
To start CDM, add packages and boot server. Will run at `default' settings without modifications
to /etc/cpudiagd.conf. To change settings modify /etc/cpudiagd.conf. See cpudiagd man pages for log files
and config info.
To remove CDM :
# /etc/init.d/cpudiag stop
# pkgrm SUNWcdiam SUNWcdiar SUNWcdiax
(note: log files in /var/cpudiag/log/ remain after CDM is removed)
SunFire Service Mode Password Generator: (for info see http://acts.ebay/bulletins/index.cgi?bulletin=159)
Generator url: https://sfservicepass.sfbay/
(Generator will ask for hostid of main SC, ScApp version, RTOS version. If you type 'service' (return, return)
in the platform shell the SC will list the needed info)
To enter service mode type 'service' and enter password in the platform shell.
To exit service mode type 'service'
ex: setchs -s ok, suspect, faulty -r "reason for status" -c /N0/SB2/p2
V440: (Chalupa) Solaris 8 7/03 beta
ALOM: ('#.' to enter, default login admin admin1)
poweron          power on server, fru. Turns off ok-2-remove led
poweroff         power off server
removefru        will move a FRU into a state whereby it is ready to be removed
reset            resets the managed system
break            causes the SC to send a break to the managed system OS
bootmode         provides control over the OBP firmware behavior during system initialization
console          connect this user session to the managed system's OS console stream (#. to return)
consolehistory   displays the contents of the selected OS console output buffer
showlogs         displays the contents of the managed system eventlog
setlocator       cause SC to turn the managed system locator indicator on or off
showlocator      display the managed system locator indicator current state
showenvironment  displays the environmental status to the SC for the managed system
showfru          prints out the FRUID data stored in the FRU PROM
showplatform     displays the hardware configuration of the platform
showsc           displays the details of the SC software configuration and firmware version information
Page 94

shownetwork      displays the current SC network configuration parameters
setsc            allows the user to individually configure SC parameters
setupsc          interactively configures the SC parameters
showdate         displays the current SC date and time
setdate          allows the user to set the current SC date and time
resetsc          resets the SC
flashupdate      download a new firmware image to the active SC
setdefaults      set all the user settable SC configuration parameters to their default value
useradd          add a new user to the SC's user database
userdel          remove an existing user from the SC's user database
usershow         displays the configuration details for a user account, or all accounts (w/o argument)
userpassword     allows an administrator to set/change a user's password
userperm         sets the permissions for the specified user
password         allows a user to change their own login password
showusers        display a list of users currently logged into the SC
logout           logs the current user out from his alom session
help [command]   provides assistance to the user of the CLI by listing the commands
raidctl: Solaris command (V440 hardware raid command, mirror within controller only)
raidctl -h        Help text, no man pages
raidctl -c        Create mirror (note: raid volume will use original disk's ctd#)
raidctl -d        Delete mirror
raidctl [-f] -F   Update controller firmware
raidctl -l        List raid controller status
ex: raidctl -c c1t1d0 c1t2d0
ex: raidctl -d c1t1d0
ex: raidctl -F image 1
ex: raidctl -l 1


Adding Locales to Solaris: (S8 see infodoc 44626, S7 infodoc 44505)
There are 3 ways to add locales to a server:
Initial install   select locales while installing
Upgrade           select locales while upgrading
pkgadd            pkgadd from Solaris Media kit Languages CD (about 100meg/locale)
                  (/cdrom/Sol_8_1001_lang_sparc/components/<product>)

Finding Solaris release and distribution loaded:


# more /etc/release (to find the Solaris version loaded)
# more /var/sadm/system/admin/CLUSTER (to find the distribution loaded)
SUNWCXall - Full Distribution + OEM Support
SUNWCall - Full Distribution
SUNWCprog - Developer
SUNWCuser - End User
SUNWCreq - Core
Find local NIS servers

(see infodoc:4736)

% rpcinfo -b ypserv 2
(systems that respond are running ypserv, and thus NIS servers)
Are they serving your NIS domain?
% yppoll -h responding_server passwd.byname
Page 95

Network troubleshooting:
Commands:
arp -a                           display entries in the arp table
dmesg                            check status of interface at boot time
ifconfig                         allows you to add/modify/delete interface parameters (see page 48,75)
kstat -n interface               kernel stats for interface (good info)
kstat -p                         kstat -p | grep interface gives speed and duplex information
ndd -set /dev/eri instance 0     sets view to eri0
ndd /dev/eri \?                  shows what eri parameters are modifiable
ndd -get /dev/tcp tcp_status     displays tcp parameter value 'tcp_status' (also: ndd -get /dev/eri link_status)
netstat -i                       gives you interface details: # of packets, collisions, errors etc...
netstat -Pn protocol             protocol info, no name resolution
netstat -rnv                     routing info, no name resolution, local view
netstat -k interface             same info as kstat -p but not well formatted
ping 192.168.47.2                contacts and reports status of 192.168.47.2
rup 192.168.47.2                 contacts and reports up time for 192.168.47.2
route (add, get, flush, delete)  allows you to add, get, delete, flush entries in the routing table
snoop                            monitors network traffic; use -v, -d, interface, ipaddress to filter view
spray 192.168.47.2               will send packets to 192.168.47.2, report on transfer rate and number received
traceroute 192.168.47.2          maps and times route from your server to 192.168.47.2
Files:
/etc/defaultdomain   - server's domain name
/etc/dhcp.interface  - touch file for dhcp boot ex: /etc/dhcp.hme0 (hme0 will boot dhcp)
/etc/hosts           - list of hosts (local file), is linked to /etc/inet/hosts
/etc/hosts.equiv     - trusted remote hosts and users
/etc/hostname.xxx    - contains interface name and/or config at boot time
/etc/protocols       - contains protocol names configured and pseudo number
/etc/services        - contains services configured and default port number
/etc/notrouter       - touch file if server has multiple interfaces and should NOT route
/etc/defaultrouter   - contains ip address of server's router (needed to reach other subnets)
/etc/gateways        - file contains static route entries
/etc/ftpusers        - contains a list of users that can NOT ftp login (Solaris 8 and 9)
/etc/ftpd/ftpusers   - contains a list of users that can NOT ftp login (Solaris 9)
/etc/netconfig       - network config file
/etc/nsswitch.conf   - contains config of named services on server
/etc/netmasks        - contains a list of base addresses and netmasks
.rhosts              - trusted remote hosts and users


Daemons:
dhcpagent    - implements client half of the DHCP
in.dhcpd     - dhcp daemon, run with the -d -v switch for diagnostic output
in.ftpd      - in.ftpd is the Internet FTP server process.
in.mpathd    - IPMP process. Started by the 'group' option of ifconfig command
in.routed    - the routing daemon (only present on router servers) -s -q
in.rdisc     - implements the ICMP router discovery protocol
in.telnetd   - in.telnetd is a server that supports TELNET virtual terminal protocol
xntpd        - ntp daemon

Page 96


How to find your way around a B1600... (min O/S Sol8 12/02, Sol9 04/03)
Default login: sc: admin (no password)   sw: admin/admin
SC commands:
console                 console connection to switch or blade (use showplatform name. #. to return)
help                    lists available commands
showplatform -v         platform and blade config and status information
setupsc                 initial sc setup...
showsc                  lists config data provided to setupsc command
poweroff s#             power off blade number s# (console to blade & shutdown first)
poweron s#              power on blade number s#
SW commands:
help                    lists available commands
?                       command ? will list available syntax
show vlan               listing and ports assigned to vlans
show running-config     current switch configuration
show startup-config     config used at boot time
show mac-address-table  mac addresses learned by ports
show system             platform wide config information
show interface          shows status/config of selected interface
show spanning-tree      displays spanning-tree info
Sun Blade Management GUI:
http:// switch_IP_address:80 (ipaddress # from 'show running-config' command 'show system' for port address)
switch ports:
NETPn ports are external uplink switch ports. There is no correlation of NETPn port to blade number.
SNPn ports are internal downlink switch ports that are connected to the blades ce interfaces.
There is a 1 to 1 correlation of SNPn port to blade number ( ce0 to ssc0/swt, ce1 to ssc1/swt)
Setting up Vlans:
Vlans are assigned to ports and can be designated as tagged or untagged. A tagged vlan is
one that uses tagged communication to a vlan aware interface. An untagged vlan passes
all untagged traffic. Ports that have the same vlan assigned to them can communicate with each other.
The formula for determining a Solaris interface number for a tagged vlan (VID) is:
1000 * VID + device PPA = Vlan logical PPA
vlan 15 on ce0: 1000 * 15 + 0 (for ce0) = ce15000
vlan 15 on ce1: 1000 * 15 + 1 (for ce1) = ce15001
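
The formula above is easy to script when several vlans have to be plumbed. This is a minimal sketch of the arithmetic only; the function name vlan_ifname is made up for illustration, and it assumes a POSIX shell such as ksh or bash (the old Bourne /bin/sh has no $(( )) arithmetic).

```shell
# Hypothetical helper: build the Solaris vlan interface name.
# Arguments: driver name (ex: ce), vlan ID (VID), device PPA (instance number).
vlan_ifname() {
    drv=$1
    vid=$2
    ppa=$3
    # Vlan logical PPA = 1000 * VID + device PPA
    echo "${drv}$((1000 * vid + ppa))"
}
```

Usage: ifconfig `vlan_ifname ce 15 0` plumb would plumb the vlan 15 interface on ce0.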
Ex: to assign blade s0 and blade s1 interface ce0 to vlan 15 you would do the
following:
on S0 and S1:
# ifconfig ce15000 plumb
# ifconfig ce15000 inet ip_address netmask + broadcast + up
create/add hostname to /etc/hostname.ce15000
add ip_address(es) and hostnames to /etc/hosts
on switch:
Console# config
Console(config)#vlan database
Console(config-vlan)#vlan 15 name VLAN15 media ethernet
Console(config-vlan)#end
Console#config
Console(config)#interface ethernet SNP0 (s0 ce0 is connected to SNP0 port)
(continued on next page)
Page 97

b1600 cont...
Console(config-if)#switchport allowed vlan add 15 tagged
Console(config-if)#end
Console#
Console#config
Console(config)#interface ethernet SNP1(s1 ce0 is connected to SNP1 port)
Console(config-if)#switchport allowed vlan add 15 tagged
Console(config-if)#end
Console#
(you would follow the same procedures if creating untagged vlans only the interface would remain
ce0 and the switch command would not have 'tagged' at the end. ALSO: if you want the vlan to
be seen outside the chassis you must allow it on a external port NETPn)
Trunking: (ports grouped together to act as one)
to create a static trunk (external ports NETP2 and NETP3 are put into trunk 2): ports must be connected to a
static trunk on another switch.
Console#config
Console(config)#interface port-channel 2
Console(config-if)#exit
Console(config)#interface ethernet netp2
Console(config-if)#channel-group 2
Console(config-if)#exit
Console(config)#interface ethernet netp3
Console(config-if)#channel-group 2
Console(config-if)#end
Console#show interface status port-channel 2
to create an LACP (Link Aggregation Control Protocol) trunk: ports must be connected to LACP-enabled
trunk ports on another switch
Console(config)#interface ethernet netp4
Console(config-if)#lacp
Console(config-if)#exit
Console(config)#interface ethernet netp5
Console(config-if)#lacp
Console(config-if)#exit
(The trunk is automatically activated if LACP is enabled on the connected port of the
target switch. A trunk formed with another switch using LACP is automatically assigned the
next available trunk ID)
Spanning tree:
Where two bridges are used to connect the same two computer network segments, a spanning
tree configuration occurs. Because spanning trees have multiple paths to the same destination,
a condition called 'bridge loop' is created. 'Spanning tree protocol' is communications between
bridges designed to eliminate the loop path. Caution should be used if you are configuring the
switch for spanning tree protocol, because it will affect switches in the customer's network.

Page 98

b1600 cont...
Full list of commands:
sc commands:
bootmode reset_nvram|diag|skip_diag|normal|bootscript=string sn {sn}
                          Allows you to specify a boot mode for a blade. You need to use it to boot Linux blades for the first time
break -y s#               Causes blade to drop from Solaris into either kadb or OBP
console -f -r             Access console of a switch or blade (ssc#/swt, s#). Type #. to return to the sc> prompt
consolehistory -b -e -g   Displays the contents of the switch or blade console buffer (boot|run ssc#/swt|s#)
flashupdate -s IPaddress -f path -v ssc# s#
                          Enables you to upgrade firmware on a System Controller or a blade
help [command]            Provides help text for specified command
logout
password                  Allows a user to change his or her own password
poweroff -f -y -s -r      Powers off components (ch, ssc#, s#)
poweron -f -y -s -r       Powers on components (ch, ssc#, s#)
removefru -f -y           Powers down components (ch, ssc#, s#)
reset -y -x               Resets components (s#, ssc#/swt, ssc#/sc, ssc#)
resetsc -y                Resets the active System Controller
setdate                   Sets the time of day on the System Controller, switches, and server blades
setdefaults -y            Returns the active System Controller (but not its switch) to the factory default settings
setfailover               Tells you which System Controller is the active and standby System Controller
setlocator on|off         Turns blade locator on/off
setupsc                   Enables you to configure the active System Controller interactively
showdate                  Displays the current date and time
showenvironment -v        Displays environmental sensor status in components of the chassis (ssc#, psn, s#)
showfru                   Displays the contents of component(s) FRUID database (ssc#, s#, ch, psn)
showlocator               Tells you whether the locator LED is on or off
showlogs -b -e -g -v      Displays the events (s#, ssc#)
showplatform -v -p        Displays the status of each component (ssc#, ssc#/swt, psn, s#, ch)
showsc [-v]               Displays a summary of the configuration of the active System Controller
showusers                 Shows the users currently logged into the System Controller
standbyfru -f -y          Powers down components (ch, ssc#, s#)
u                         Gives user administration privileges
useradd username          Adds a named user to the list of permitted System Controller users
userdel username          Deletes a user from the list of permitted System Controller users
userpassword username     Allows a user with a-level permissions to alter another user's password
userperm username aucr    Specifies the named user's permission levels
usershow username         Shows details of the specified user's login account
switch commands: (use ? and help commands for assistance)
switch Exec commands:
clear counters            Clears statistics on an interface
  logging                 Clears messages from the logging buffer
  mac-address-table dynamic  Removes any learned entries from the forwarding database
config                    Activates global configuration mode
copy                      Copies a code image or a switch configuration to or from Flash memory or a TFTP server
  file                    Copy from file system
  running-config          Copy from current system configuration
  startup-config          Copy from startup configuration
  tftp                    Copy from tftp server
Page 99

b1600 cont...
debug
Debugging functions
delete
Deletes a file or code image
dir
Displays a list of files in Flash memory
disable
Returns to normal mode from privileged mode
exit
Returns to the previous configuration mode, or exits the CLI
flowcontrol
Enables flow control on a given interface
garp timer
Sets the GARP timer for the selected function
help
Description of the interactive help system
?
Shows options for command completion (context sensitive)
hostname
Specifies or modifies the host name for the device
ip dhcp restart Submits a BOOTP or DHCP client request
login
Enables password checking at login
password
Specifies a password on a line
password-thresh
Sets the password intrusion threshold, which limits the number of failed logon attempts
ping
Sends ICMP echo request packets to another node on the network
port monitor Configures a mirror session
security
Configures a secure port IC
quit
Exits a CLI session
reload
Restarts the system
show bridge-ext
Shows bridge extension configuration
bridge multicast Shows the IGMP snooping MAC multicast list
gvrp configuration
Displays GVRP configuration for selected interface
garp timer
Shows the GARP timer for the selected function
interfaces status Displays status for the specified interface
port-channel Shows information about a particular aggregated link.
vlan
Displays status for the specified VLAN interface
counters
Displays statistics for the specified interface
switchport
Displays the administrative and operational status of an interface
ip interface
Displays the IP settings for this device
redirects
Displays the default gateway configured for this device
filter
Displays filter rules or captured packets
igmp snooping Shows the IGMP snooping configuration
mrouter
Shows multicast router ports
line
Displays a terminal line's parameters
logging
Displays the state of logging
mac-address-table Displays entries in the bridge-forwarding database
aging-time Shows the aging time for the address table
map ip precedence
Shows the IP precedence map
dscp
Shows the IP DSCP map
port monitor
Shows the configuration for a mirror port
queue bandwidth Shows round-robin weights assigned to the priority queues
cos-map
Shows the class-of-service map
radius-server
Shows the current RADIUS settings
running-config
Displays the configuration data currently in use
snmp
Displays the status of SNMP communications
spanning-tree
Shows the spanning tree configuration
startup-config
Displays the contents of the start up configuration
system
Displays system information
tacacs-server
Shows the current TACACS settings
users
Shows all active console and Telnet sessions
version
Displays version information for the system
Page 100

vlan
Shows VLAN information
shutdown
Disables an interface
silent-time
Sets the time the management console is inaccessible after the number of unsuccessful logon attempts is exceeded
spanning-tree protocol-migration Re-checks the appropriate BPDU format
whichboot
Displays the files booted
switch Configure commands:
authentication login Defines logon authentication method and precedence
boot system
Specifies the file or image used to start up the system
bridge-ext gvrp
Enables GVRP globally for the switch
capabilities
Advertises the capabilities of a given interface for use in auto-negotiation
channel-group
Adds a port to an aggregated link
description
Adds a description to an interface configuration
enable [level] Use this command to activate Privileged Exec mode.
password
Sets a password to control access to the Privileged Exec level
end
Returns to Privileged Exec mode
exec-timeout Sets the interval that the command interpreter waits until user input is detected
exit
Exit from global configure mode
help
Description of the interactive help system
hostname
Specifies or modifies the host name for the device
interface
Configures an interface type and enters interface configuration mode
ethernet
Ethernet IEEE 802.3
portchannel
Configures an aggregated link and interface configuration mode for the aggregated link
vlan
Enters interface configuration mode for a specified VLAN
ip filter
Blocks specified IP packets from entering the internal management port (NETMGT)
http port
Specifies the port to be used by the Web browser interface
server
Allows the switch to be monitored or configured from a browser
address
Command to set the IP address for this device
dhcp restart Submits a BOOTP or DHCP client request
client-identifier Specifies the DHCP client identifier for the switch
default-gateway
Defines the default gateway
igmp snooping
Enables IGMP snooping
vlan static
Adds an interface as a member of a multicast group
version
Configures the IGMP version for snooping
querier
Allows this device to act as the querier for IGMP snooping
query-count
Configures the query count
query-max-responsetime Configures the report delay
router-port-expiretime
Configures the query timeout
vlan mrouter
Adds a multicast router port
jumbo-frame Enables support for jumbo frames
lacp
Configures LACP for the current interface
line
Identifies a specific line for configuration and starts the line configuration mode
logging on
Controls logging of error messages
history
Limits syslog messages saved to switch memory based on severity
mac-address-table aging-time
Sets the aging time of the address table
static
Maps a static address to a port in a VLAN
map ip precedence
Enables IP precedence class-of-service mapping
map ip precedence
Maps IP precedence value to a class of service
map ip dscp
Enables IP DSCP class-of-service mapping
map ip dscp
Maps IP DSCP value to a class of service
Page 101

negotiation
Enables auto-negotiation of a given interface
no
Negate a command or set its defaults
queue bandwidth
Assigns round-robin weights to the priority queues
queue cos map
Assigns class-of-service values to the priority queues
radius-server host
Specifies the RADIUS server
port
Sets the RADIUS server network port
key
Sets the RADIUS encryption key
retransmit
Sets the number of retries
timeout
Sets the interval between sending authentication requests
snmp-server contact Sets the system contact string
location
Sets the system location string
host
Specifies the recipient of an SNMP notification operation
enable traps
Enables the device to send SNMP traps (SNMP notifications)
spanning-tree Enables the spanning tree protocol
spanning-tree mode Configures STP or RSTP mode
forward-time Configures the spanning tree bridge forward time
hello-time
Configures the spanning tree bridge hello time
maxage Configures the spanning tree bridge maximum age
priority Configures the spanning tree bridge priority
pathcost method
Configures the path cost method for RSTP
transmission-limit
Configures the transmission limit for RSTP
cost
Configures the spanning tree path cost of an interface
portpriority
Configures the spanning tree priority of an interface
edgeport
Enables fast forwarding for edge ports
linktype Configures the link type for RSTP
speed-duplex Configures the speed and duplex operation of a given interface
switchport
broadcast packetrate Configures the broadcast storm control threshold
mode
Configures VLAN membership mode for an interface
acceptable-frame-types Configures frame types to be accepted by an interface
ingress-filtering
Enables ingress filtering on an interface
native vlan
Configures the PVID (native VLAN) of an interface
allowed vlan
Configures the VLANs associated with an interface
gvrp
Enables GVRP for an interface
forbidden vlan Configures forbidden VLANs for an interface
priority default Sets a port priority for incoming untagged frames
tacacs-server host
Specifies the TACACS server
port
Sets the TACACS server network port
key
Sets the TACACS encryption key
username
Establish User Name Authentication
vlan database Enters VLAN database mode to add, change, and delete VLANs
vlan
Configures a VLAN, including VID, name and state

Page 102

Cluster 3.x:

http://suncluster.eng

http://cluster.central (Installation Information)

Introduction: Sun Cluster 3 is the first integrated release of Sun's next generation
Full Moon clustering technology. Sun Cluster 3 extends Solaris with the
Full Moon cluster framework, enabling the use of core Solaris services such
as file systems, devices, and networks seamlessly across a tightly coupled
cluster and maintaining full Solaris compatibility for existing applications.
Key Benefits: Higher / Near continuous availability of existing applications based on
Solaris services such as highly available file system and network services.
Integrates/extends the benefits of Solaris scalability to dotCOM application
architectures by providing scalable and available file and network services for
horizontal applications.
Ease of management of the cluster platform by presenting a simple unified management
view of shared system resources.
General:
The configuration guide is located at suncluster.eng. It contains too much information to reproduce here.
Below are some highlights.
Up to 8 nodes in a cluster including single node clusters.
Sun and EMC storage supported with others starting in May 04.
Failover, Scalable and OPS/RAC Services
Supports Solaris 8 and 9
PNM is supported for 3.x and IPMP for 3.1 for Public net.
Supports QFE, Gigabit, Wildcat and SCI for Private net.
Supports different types of server nodes in the cluster.
DMM not supported. Have to use STMS or Powerpath, which overrides it.
Terminal concentrator isn't mandatory. Can use RSC or system controllers.
Admin w/s:
Admin Workstation not mandatory. Management GUI is now web based.
Good to install Sun Console software on Sun machine to have access to double window GUI.
Server
Requires end user distribution. However Server Storage and some Software
may require more. Best to at least install Full distribution.
Topologies

Clustered Pair
N+1
Pair + N
N to N scalable
Diskless Cluster
Single-node Cluster

Hardware Notes:
Must change the initiator id on one node if using SCSI arrays between 2 nodes
See info Doc 20704 for scsi initiator change procedure.
When a disk is replaced, the cluster needs to be made aware through the
scdidadm command.

Page 103



Wiring Diagrams - See the configuration guide on internal site: suncluster.eng.
Commands:
boot -x
Bring server up w/o cluster

ccp
Used to run the cluster control panel software
#ccp clustername

scstat

Used to get a status of the whole or part of the cluster.


-D
Shows status for all disk device groups.
-g
Shows status for all resource groups.
-i
Shows status for all IP Network Multipathing groups.
-n
Shows status for all nodes.
-p
Shows status for all components in the cluster. Use with -v[v] to display more verbose
output.
-q
Shows status for all device quorums and node quorums.
-v[v] Shows verbose output.
-W Shows status for cluster transport path.

scrgadm

manage registration and unregistration of resource types, resource groups, and resources
Show Current Configuration:
-pv [v] -t resource_type_name -g resource_group_name -j resource_name
Resource Type Commands: (add, change, remove)
-a -t resource_type_name -h RT_installed_node_list -f registration_file_path
-c -t resource_type_name -h RT_installed_node_list
-r -t resource_type_name
Resource Group Commands: (add, change, remove)
-a -g RG_name -h nodelist -y property
-c -g RG_name -h nodelist -y property -y property
-r -g RG_name
Resource Commands: (add, change, remove)
-a -j resource_name -t resource_type_name -g RG_name -y property -x extension_property
-c -j resource_name -y property -x extension_property
-r -j resource_name
Logical Host Name Resource Commands: (add)
-a -L -g RG_name -j resource_name -l hostnamelist -n netiflist -y property
Shared Address Resource Commands: (add)
-a -S -g RG_name -l hostnamelist -j resource_name -n netiflist -X auxnodelist -y property

scconf

Updates the cluster software configuration. Running scsetup is recommended: it prints the scconf command
it uses, so you can note and reuse the commands you run repeatedly.
-pv[v] Prints out the configuration.

scinstall

Install Sun Cluster software and initialize new cluster nodes.


-pv[v] Print out packages and versions installed.

scsetup

Interactive cluster configuration tool similar to vxdiskadm in Veritas.

scdidadm

The scdidadm utility administers the device identifier (DID) pseudo device driver did
-C Removes references to nonexistent devices on the cluster nodes.
-l Lists the local devices in the DID configuration file.
-L Lists all the paths, including those on remote hosts, of the devices in the DID config file.
-r Reconfigures the database.
-R Performs a repair procedure on a particular device instance.

Page 104



scshutdown

Shut down a cluster

scvxinstall

The scvxinstall utility provides automatic VxVM installation and optional root-disk encapsulation
for Sun Cluster nodes.

scgdevs

Global devices namespace administration script

scswitch

Perform ownership and state change of resource groups and disk device groups in Sun Cluster
configurations. Below are some examples:

Misc Procedures:
Device Groups:
Register a new disk group:
scconf -a -D type=vxvm,name=new_disk_group,nodelist=nodex:nodex
Sync device group info after adding a volume:
scconf -c -D name=diskgroup,sync
Getting registered device group information:
scstat -D
Switch a device group off a node (-h names the node that takes it over):
scswitch -z -D device_group -h node
Switch a device group offline (must be quiescent and unmounted)
scswitch -F -D device_group
Switch a device group into maintenance state (must be quiescent and unmounted)
scswitch -m -D device_group
Switch a device group online:
scswitch -z -D device_group -h node
Resource Groups:
Get current resource group status:
scstat -g
Switch a resource group to another node:
scswitch -z -g resource_group -h node
Switch all resource and device groups off a node:
scswitch -S -h node
Take a resource group offline on all nodes:
scswitch -F -g resource_group
Bring a resource group online on all nodes:
scswitch -Z -g resource_group
View configured resource groups:
scrgadm -p[v][v]
Removing a resource group: Before a resource group may be removed, all resources within the group
must be removed. The steps required are:
1) take the resource group offline
scswitch -F -g resource_group
2) disable the resources within the group
scswitch -n -j name_of_resource
3) remove the resources within the group
scrgadm -r -j name_of_resource
4) remove the resource group
scrgadm -r -g resource_group
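The ordering of the four removal steps above matters, so it can help to generate the command list rather than type it from memory. The following is a minimal Python sketch that only builds the command strings from the handbook. The function name and the example group and resource names are hypothetical, and nothing is executed against a cluster.

```python
def remove_resource_group_commands(resource_group, resources):
    """Return, in order, the commands from the handbook that remove a resource group."""
    # 1) take the resource group offline
    commands = [f"scswitch -F -g {resource_group}"]
    # 2) disable every resource within the group
    commands += [f"scswitch -n -j {name}" for name in resources]
    # 3) remove every resource within the group
    commands += [f"scrgadm -r -j {name}" for name in resources]
    # 4) remove the (now empty) resource group
    commands.append(f"scrgadm -r -g {resource_group}")
    return commands


if __name__ == "__main__":
    # "nfs-rg", "nfs-res" and "nfs-lh" are made-up names for illustration only.
    for line in remove_resource_group_commands("nfs-rg", ["nfs-res", "nfs-lh"]):
        print(line)
```

Review the printed list against scrgadm -pv output before running any of it on a live cluster.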
Page 105

SMS upgrade 1.4.1:

(see SMS 1.4.1 install guide http://www.sun.com/servers/highend/sms.html)

Download your SMS packages: http://www.sun.com/servers/highend/sms.html (make sure to run cksum and compare)
(also on EIS CD3 starting Apr-27-04)
- unzip file and note location
Prepare for Upgrade:
- switch user to sms-svc
- Make sure SCs are stable, no data syncs, DR, hw changes in progress
- Turn off failover on main SC (SC0) sc0:sms-svc:>setfailover off
- Stop SMS on the spare SC (SC1)
sc1:#/etc/init.d/sms stop
- Backup SMS on spare (optional)
sc1:#smsbackup (can add UFS dest dir. default: /var/tmp)
Upgrade Solaris Operating environment (optional)
sms 1.4.1 will work with sol8 and sol9. There is a different SMS package for each O/S version.
Sol8 02/02
Sol9 04/04
(if you upgrade O/S add all patches and reboot. stop sms again if rebooted)
Upgrade SMS software packages using smsupgrade: (spare sc first SC1)
- cd to download directory sc1:# cd /download_dir/sms_1_4_1_sparc_System_Management_Services_1.4.1/Tools
- smsupgrade sc1:# ./smsupgrade /download_dir/sms_1_4_1_sparc_System_Management_Services_1.4.1/Product
Switch control to spare SC (SC1)
- stop SMS on main SC (SC0) sc0:# /etc/init.d/sms stop
- bringdown spare (SC1)
sc1:#init 0
- boot spare (SC1) to activate pkgs and become main OK> boot -rv
Update the SC and CPU flash PROMs on the new main SC (SC1)
- switch user to sms-svc
- flash SC: sc1:sms-svc:> flashupdate -f /opt/SUNWSMS/firmware/SCOBPimg.di sc1/fp0
sc1:sms-svc:> flashupdate -f /opt/SUNWSMS/firmware/nSSCPOST.di sc1/fp1 CP1500 only
sc1:sms-svc:> flashupdate -f /opt/SUNWSMS/firmware/oSSCPOST.di sc1/fp1 SCV2(cp2140) only
- flash SBs: sc1:sms-svc:> flashupdate -f /opt/SUNWSMS/hostobjs/sgcpu.flash sb0 sb1 sb2 etc...
(must specify location for sms 1.4.1)
- bring down sc1
sc1:# init 0
- boot sc1
OK> boot -rv
Upgrade the former main SC (SC0)
- Download your SMS packages: www.sun.com/servers/sw (make sure to run cksum and compare)
- unzip file and note location
- stop SMS on the former main (SC0) sc0:# /etc/init.d/sms stop
- Backup SMS on former main (SC0) (optional) sc0:# smsbackup (can add UFS dest dir. default: /var/tmp)
Upgrade Solaris (optional)
sms 1.4.1 will work with sol8 and sol9. There is a different SMS package for each O/S version.
Sol8 02/02
Sol9 04/04
(if you upgrade O/S add all patches and reboot. stop sms again if rebooted)

Page 106



Upgrade SMS on former main (SC0)
- cd to download directory sc0:# cd /download_dir/sms_1_4_1_sparc_System_Management_Services_1.4.1/Tools
- smsupgrade
sc0:# ./smsupgrade /download_dir/sms_1_4_1_sparc_System_Management_Services_1.4.1/Product
Reboot the former main SC (SC0)
- bringdown former main (SC0)
sc0:#init 0
- boot (SC0) to activate pkgs and become main OK> boot -rv
Update the SC PROMs on the former main SC (SC0)
- switch user to sms-svc
- flash SC: sc0:sms-svc:> flashupdate -f /opt/SUNWSMS/firmware/SCOBPimg.di sc0/fp0
sc0:sms-svc:> flashupdate -f /opt/SUNWSMS/firmware/nSSCPOST.di sc0/fp1 CP1500 only
sc0:sms-svc:> flashupdate -f /opt/SUNWSMS/firmware/oSSCPOST.di sc0/fp1 SCV2(cp2140) only
- bring down sc0
sc0:# init 0
- boot sc0
OK> boot -rv
Verify chassis serial number on main SC (SC1)
- switch user to sms-svc
- check chassis serial # sc1:sms-svc:>showplatform -p csn
- record serial #
sc1:sms-svc:>setcsn -c serial_numb
Enable failover on main SC (SC1) sc1:sms-svc:>setfailover on
Solaris 9 SVM (sds) disk replacement: (also see infodoc ID73132 )
Beginning with Solaris 9, SVM uses a new feature called Device-ID
which identifies each disk not only by its c#t#d# name, but also by
a unique ID generated from the disk's WWN or serial number.
Mirrored disk replacement: (use when submirror State: Needs maintenance in metastat cmd)
On failing disk: (If you can access the disk, if not start at the cfgadm -c unconfigure step)
# umount filesystem
(unmount any non-svm open filesystems on failed disk)
# metadb -d c1t0d0s7
(if replicas on this disk, remove them)
# metadb | grep c1t0d0
(verify there are no existing replicas left on the disk)
# cfgadm -c unconfigure c1::dsk/c1t0d0 (might not complete command if busy, remove failed disk)
Insert a new disk :
# cfgadm -c configure c1::dsk/c1t0d0 (configure new disk)
# prtvtoc /dev/rdsk/c0t0d0s2 > /tmp/firstdisk (get format for new disk)
# fmthard -s /tmp/firstdisk /dev/rdsk/c1t0d0s2 (format disk same as mirror)
# metadevadm -u c1t0d0 (will update the New DevID)
# metadb -a c1t0d0s7 (if necessary, recreate any replicas)
# metareplace -e d0 c1t0d0s0 (do this for each submirror on the disk)
# metastat -i (will change unavailable state of devices to Okay)
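The "insert a new disk" half of the mirrored-disk procedure above follows a fixed pattern once the disk names are known. As a rough sketch, the Python below rebuilds that command list for a given replacement disk. The function name and parameters are hypothetical, the controller is assumed to be the part of the c#t#d# name before the "t", and the commands are only returned as text, never run.

```python
def mirror_disk_replacement(new_disk, good_disk, submirrors, replica_slice="s7"):
    """Build the post-insertion command list from the handbook's mirrored-disk procedure.

    new_disk      -- replaced disk, e.g. "c1t0d0"
    good_disk     -- surviving mirror disk whose VTOC is copied, e.g. "c0t0d0"
    submirrors    -- list of (mirror metadevice, slice) pairs on the new disk, e.g. [("d0", "s0")]
    replica_slice -- slice that held state database replicas, or None if there were none
    """
    controller = new_disk.split("t")[0]  # assumption: "c1t0d0" -> "c1"
    commands = [
        f"cfgadm -c configure {controller}::dsk/{new_disk}",    # configure new disk
        f"prtvtoc /dev/rdsk/{good_disk}s2 > /tmp/firstdisk",    # get format for new disk
        f"fmthard -s /tmp/firstdisk /dev/rdsk/{new_disk}s2",    # format disk same as mirror
        f"metadevadm -u {new_disk}",                            # update the new DevID
    ]
    if replica_slice:
        commands.append(f"metadb -a {new_disk}{replica_slice}")  # recreate replicas
    for mirror, slice_name in submirrors:
        commands.append(f"metareplace -e {mirror} {new_disk}{slice_name}")  # once per submirror
    commands.append("metastat -i")                               # check device states
    return commands


if __name__ == "__main__":
    for line in mirror_disk_replacement("c1t0d0", "c0t0d0", [("d0", "s0")]):
        print(line)
```

With the handbook's own disk names this reproduces the seven commands listed above. For a disk carrying several submirrors, pass one pair per submirror.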

Raid-5 disk replacement: (use when raid unit State: Needs maintenance in metastat cmd)
On failing disk:(If you can access the disk, if not start at the cfgadm -c unconfigure step)
# umount filesystem
(unmount any open non-svm filesystems on this disk)
# metadb -d c1t0d0s7
(any replicas on this disk, remove them)
# metadb | grep c1t0d0
(verify there are no existing replicas left on the disk)
# cfgadm -c unconfigure c1::dsk/c1t0d0
(might not complete command if busy, remove the failed disk)
Page 107



Insert a new disk :
# cfgadm -c configure c1::dsk/c1t0d0
Run 'format', or 'prtvtoc' and 'fmthard', to put the desired partition table on the new disk
# metadevadm -u c1t0d0
(will update the New DevID)
# metadb -a c1t0d0s7
(if necessary, recreate any replicas)
# metareplace -e <raid5-md> c1t0d0s0
(do this for each raid on the disk)
# metastat -i
(will change unavailable state of devices to Okay)
SC rebuild after total disk failure: (Sun Fire 12k/15k)
Use this procedure after disk replacement to rebuild an SC that experienced a total disk failure.
This is a modified version of the `Fresh Installed SCs' portion of the 12k/15k & 20k/25k EIS checklist.
http://sunweb.germany/EIS/Web/inst-support/checkl.html.
Note: A smsbackup from the other SC on the platform should not be restored on the failed SC. The smsbackup file
must come from the same SC that failed.
Items needed:
Solaris OE CDs (operating system install)
SMS Software (http://www.sun.com/servers/highend/sms.html)
EIS CDs
smsbackup file (from failed SC or ID-PROMs from service call)
explorer output file (from failed SC http://proactive.central)
On Main SC as user sms-svc: setfailover off
On Failed SC at ok prompt:
check OBP settings: setenv auto-boot? false
setenv diag-level pmax-epvmax
setenv input-device ttya
setenv output-device ttya
setenv local-mac-address? true
setenv diag-switch? true
setenv post-on-sir? true
setenv diag-device <same as boot-device>
- Initial SC bootup: boot cdrom.
- Get Solaris install info from explorer output (/etc/nodename, /etc/hosts, /etc/nsswitch.conf, /disks/prtvtoc etc...)
you can also reference install docs and customer supplied info.
- Install SC as per EIS "Install Spec". Entire Distribution is required.
- Install Solaris & select manual reboot.
- Fix the "No SOF Interrupt" problem. Append to /a/etc/system: exclude: drv/ohci (Makes booting much faster)
- Touch /a/etc/notrouter
Disable routing.
- Reboot SC.
- Log in as user root.
- Insert EIS-CD-ONE Copy the EIS-CD to the system disc: cd /cdrom/...sun/install; sh copy-cd2sun.sh
- Insert EIS-CD-TWO. Copy the EIS-CD to the system disc: cd /cdrom/...sun2/install; sh add-cd2sun.sh
- Edit /etc/dfs/dfstab
Share directory /sun
- Run setup-standard as user root: cd /sun/install; sh setup-standard.sh
(Do NOT select option to install SAN Foundation Suite.)
(PTS recommends activation of alternate break sequence on SCs)
- Log out & back in to set environment. Or enter: . $HOME/.profile
- Ensure that NIS is not configured. (If NIS/NIS+ used "files" must be first in /etc/nsswitch.conf.)
- Install Solaris patches: Recommended Cluster and Additional Solaris Patches (/sun/patch/<SolarisVn>)
- Solaris 8: Verify entry in /etc/system set TS:ts_sleep_promote=1 (EIS-ALERT#22)
- Fix sendmail messages "My unqualified host name unknown" (/etc/hosts append <hostname>.somewhere.com)
- Reboot
Page 108



- Install SDS/SVM software.
- Patch the SDS software (Solaris 8 only).
/sun/patch/sds/<Vn>
- Install the SMS software on failed SC. (web release: http://www.sun.com/servers/highend/sms.html)
- Patch SMS software
/sun/patch/SMS/<Vn>
- As root run smsrestore on failed SC. Use file from smsbackup or install the IDPROM files obtained
via the service call.
- Reboot SC.
- Mirror the system disk. See scripts on EIS-CD in /sun/tools/SF15K
(SDS Infodoc 28196)
- Set boot-device & diag-device to both sides of the mirror. (SDS: sds-disk, sds-mirror) (See Infodoc 11854)
- If NVRAM editor (nvedit) was used ensure to setenv use-nvramrc? true
- Set up UFS-ACLs for user sms-svc on SC. As root run script sms-svc-setup.sh (EIS-CD: sun/tools/SF15K)
- As user sms-svc: touch $HOME/.hushlogin
- Append "share cdrom* -o ro,anon=0" to /etc/rmmount.conf
- Share /export/install if not already. (/etc/dfs/dfstab)
- Set up /etc/defaultrouter according to customer requirements.
- Verify connectivity to defaultrouter (eg via ping).
- Execute smsconfig -m on failed SC. Use data from explorer output from failed SC, Cu supplied info reference.
(if you restored the smsbackup for the failed SC, select 'Edit Network Settings'. All the IP hostnames
will be populated and you will only have to supply the IP addresses and save the settings. smsconfig will
populate your host, netmasks and hostname files.)
(if you did not have the smsbackup file, and restored the IDPROM files, you will have to Set platform name
and change base ip addresses if necessary. Use explorer output from failed SC, Customer supplied info for
reference. Also see infodoc ID71490)
- The smsconfig -m command modifies the hosts file. Check it to be sure things are as they should be.
- Verify auto-boot?=true, watchdog-reboot?=false (eeprom auto-boot?, eeprom watchdog-reboot?)
- Shutdown newly loaded SC and do hard reset. (Press reset button on SC).
On MAIN SC as user sms-svc: setfailover on Wait 5 minutes.....
On MAIN SC as user sms-svc: Verify setfailover (showfailover -v) and showdatasync are "ACTIVE" to propagate
changes to spare SC.
- Run explorer and SunCheckup on both SCs, compare outputs and correct any errors.
- When datasync is completed: On Main and spare SC, make a backup copy of sms files (smsbackup)
15K DR examples: (also see Serengeti/15k dr commands page 87, infodoc 76795 How to DR a Single PCI Card)
(cfgadm commands run from domain)
# cfgadm -val (get name app ID of board to use with cfgadm -c 'disconnect' or configure command)
# cfgadm -val | grep permanent (see what SB has perm memory)
# cfgadm -c disconnect SB0 (removes SB0)
# cfgadm -c configure SB0
(adds SB0 back into domain)
# cfgadm -c disconnect IO1 ( removes IO1 and all pci adapters on it)
# cfgadm -c configure IO1
(configures IO1 back into domain)
# cfgadm -c disconnect pcisch5:e01b1slot0 (removes pci card in IO1 slot 0)
# cfgadm -c disconnect pci_pci0:e00b1slot1 (removes pci card in IO0 slot1)
# cfgadm -c configure pcisch5:e01b1slot0 (configures pci card in IO1 slot 0 into domain)
IO PCI slot #s:
|3|1|
|2|0|
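The PCI attachment point IDs in the examples above encode the IO board and slot. The sketch below, in Python, pulls those two values out of an ID. It is based only on the two IDs shown in this handbook: reading the two digits after "e" as the IO board number is an inference from those examples, and the function name and regular expression are hypothetical.

```python
import re

# Pattern inferred from "pcisch5:e01b1slot0" and "pci_pci0:e00b1slot1" only.
AP_ID_PATTERN = re.compile(r"^(?P<driver>\w+):e(?P<io>\d{2})b(?P<board>\d)slot(?P<slot>\d+)$")


def parse_pci_ap_id(ap_id):
    """Split a PCI attachment point ID into driver instance, IO board and slot number."""
    match = AP_ID_PATTERN.match(ap_id)
    if match is None:
        raise ValueError(f"unrecognised attachment point ID: {ap_id!r}")
    return {
        "driver": match["driver"],            # e.g. "pcisch5"
        "io_board": f"IO{int(match['io'])}",  # "e01" -> "IO1" (assumption, see lead-in)
        "slot": int(match["slot"]),           # "slot0" -> 0
    }


if __name__ == "__main__":
    print(parse_pci_ap_id("pcisch5:e01b1slot0"))
    print(parse_pci_ap_id("pci_pci0:e00b1slot1"))
```

Always confirm the result against cfgadm -val output on the domain before disconnecting a card.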
15/25K hpost:
sms-svc> hpost -d r -l127 (run hpost on domain R level 127)
.postrc
(etc/opt/SUNWSMS/adm/config/platform or A-R)
level 64
(run level 64)
dash_H_level 127
(run level 127 when DRing a board into domain)
no_ioadapt_ok
(test SB only. Good when you create a test domain w/o IO)
no_obp_handoff
(when testing SB only don't attempt to load obp)
Page 109

SMSbackup: (how to manually expand backup file) also see infodoc 77357
- Copy backup file to /tmp
- sms-svc> cpio -icvdum < /tmp/sms_backup.1.4.1.cpio.0
3310/3510 Disk replacement: (also see infodoc 78432 and page 84)
- save nvram info: system functions, Controller maintenance, Save NVRAM to disks, yes
- Identify bad disk: view and edit scsi device, look for BAD or FAILED status, note Chl, Id and LG_DRV #s,
select bad drive, Identify scsi drive, flash all But Selected drive, Flash Drive Time, yes (go find the disk)
disk ID #s(single bus 3310)
Chl 0
0 3 8 11
1 4 9 12
2 5 10 13

disk ID #s (dual bus 3310)
Chl 2   Chl 0
0 3     0 3
1 4     1 4
2 5     2 5

disk ID#s (3510)


Ch 0 / Ch2
0 3 6 9
1 4 7 10
2 5 8 11

- Physically unseat bad disk, let spin down 20 sec, then remove
- Install replacement disk
- view and edit scsi device, look for NEW_DRV or USED_DRV status.
If not seen: select a disk, Scan scsi drive, select Chl (use noted #), select Id# (of replacement), yes
- Is replacement to be new local or global spare? If not skip to copy and replace step
if so: view and edit scsi device, select replacement disk, add Global spare drive or add Local spare drive, yes
- If replaced disk cannot be spare. view and edit logical drives, select logical drive, select PREVIOUS spare
disk, copy and replace drive, yes (when copy is completed assign PREVIOUS spare back in step above)
How to mount a CD image file (.iso) as a filesystem: (see SRDB 50566)
# lofiadm -a /export/install/sol-10-b72-sparc-v1.iso (must use absolute path to iso file)
/dev/lofi/1
# mkdir /cd1 (create a mount point)
# mount -F hsfs -o ro /dev/lofi/1 /cd1 (mount /dev/lofi/# on the mount point)
# df -k /cd1
Filesystem  kbytes used avail capacity Mounted on
/dev/lofi/1 239904 239904 0 100% /cd1
To mount a slice of an .iso image (like s1 when doing a 'setup_install_server')
# mkdir /s1 (create the mountpoint)
# dd if=sol-10-b72-sparc-v1.iso of=vtoc bs=512 count=1 (make a copy of the vtoc)
# od -D -j 452 -N 8 < vtoc
(starting cyl and block length for s1 is 452 bytes into vtoc and is 8 bytes long)
0000000 0000000750 0000857600 (slice1 starts at cyl 750 and is 857600 blks long)
0000010
# echo 750*640 | bc
(Starting cyl750 *blks/cyl always 640 = s1 starting blk is 480000)
480000
# dd if=sol-10-b72-sparc-v1.iso of=sol-10-b72-sparc-v1-s1.iso bs=512 skip=480000 count=857600
# lofiadm -a /export/install/sol-10-b72-sparc-v1-s1.iso
/dev/lofi/2
# mount -F ufs -o ro /dev/lofi/2 /s1
# df -k /cd1 /s1
Filesystem  kbytes used avail capacity Mounted on
/dev/lofi/1 239904 239904 0 100% /cd1
/dev/lofi/2 402086 397100 0 100% /s1
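The od and bc steps above boil down to reading two 32-bit numbers at byte 452 of the first block and multiplying the starting cylinder by 640. The Python sketch below does the same arithmetic and prints the matching dd command. It assumes the label stores both values big-endian (as od -D shows them on SPARC) and uses the handbook's figure of 640 blocks per cylinder. The function names and the output file name are hypothetical.

```python
import struct

BLOCK_SIZE = 512        # bytes per block, as used by the dd commands above
BLOCKS_PER_CYL = 640    # the handbook states this is always 640 for these images
S1_OFFSET = 452         # byte offset of the slice 1 entry, per the od command above


def slice_extent(label, offset=S1_OFFSET):
    """Return (starting block, length in blocks) for the slice entry at `offset`."""
    # Two unsigned 32-bit big-endian integers: starting cylinder, block count.
    start_cyl, nblocks = struct.unpack_from(">II", label, offset)
    return start_cyl * BLOCKS_PER_CYL, nblocks


def dd_command(iso_path, out_path, label):
    """Build the dd command that copies slice 1 out of the ISO image."""
    skip, count = slice_extent(label)
    return f"dd if={iso_path} of={out_path} bs={BLOCK_SIZE} skip={skip} count={count}"


if __name__ == "__main__":
    # Synthetic first block carrying the handbook's values (cyl 750, 857600 blocks).
    label = bytearray(BLOCK_SIZE)
    struct.pack_into(">II", label, S1_OFFSET, 750, 857600)
    print(dd_command("sol-10-b72-sparc-v1.iso", "sol-10-b72-sparc-v1-s1.iso", bytes(label)))
```

For a real image, read the first 512 bytes of the .iso file and pass them as `label` instead of the synthetic block.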
Page 110

Removing the top cover on a V20z: (very tricky :-)


Keep top button down, pull cover forward until click, slide to the rear.
Explorer -w scextended from cron:
- Add IP address and password (if used) of SC to the /etc/opt/SUNWexplo/scinput.txt file.
- run crontab -e and add -w default,scextended to the explorer entry
ex: 0 0 * * 1 /opt/SUNWexplo/bin/explorer -q -e -w default,scextended
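When several hosts need the same crontab change, the edit can be scripted. The following Python sketch appends the -w default,scextended option to an existing explorer crontab line and leaves the line alone if a -w option is already present. The function name is hypothetical and the sketch only transforms text; it does not touch the crontab itself.

```python
EXPLORER_OPTION = "-w default,scextended"


def add_scextended(cron_line):
    """Return the crontab line with '-w default,scextended' appended, unless -w is already set."""
    if "-w" in cron_line.split():
        return cron_line  # an explicit -w list already exists; do not override it
    return f"{cron_line.rstrip()} {EXPLORER_OPTION}"


if __name__ == "__main__":
    print(add_scextended("0 0 * * 1 /opt/SUNWexplo/bin/explorer -q -e"))
```

The five leading fields in the example entry (0 0 * * 1) schedule the run for 00:00 every Monday.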
Useful COD commands: ( to obtain a license www.sun.com/licensing) 5.14.00 and up
showcodlicense (-r)
addcodlicense
deletecodlicense
enablecodboard <sb#>
showcodusage
showplatform -p cod
setupplatform -p cod
showboards

see Info doc 81531

sc> addcodlicense 01:80d8a855:000000000:0201010100:c:00000000:BLqg5Ko


Used to replace a COD sb (need service passwd on Sun Fires)

(addcodlicense will populate this area)

ALOM4v: Niagara (Ontario, Erie) (initial login/password admin/admin1) also see ALOM commands on page 94
Removed in ALOM4v: Reduced managed system interface:
Solaris 'scadm' , Solaris 'locator', 'prtfru' cannot access DFRUID PROMs, 'prtdiag'/'prtpicl' no environmentals.
ALOM Alerts not forwarded to host syslog
ALOM 'setupsc' questions related to managed system interface removed.
Removed ALOM environment variables:
sys_eventlevel, sys_hostname, ALOM cannot detect hung OS
Removed ALOM variables:
sys_autorestart, sys_xirtimeout, sys_wdttimeout No CPU Signature (OBP and OS Status) support!
ALOM 'showplatform' cannot display Booting/OS Running state, stops at running
sys_bootrestart, sys_bootfailrecovery, sys_maxbootfail, sys_boottimeout
New in ALOM4v:
Password recovery (procedure on page 113)
If the admin password is lost/forgotten, can reset the NVRAM to factory defaults, including clearing all users.
Requires physical access to the machine to unplug power cords and connect to ALOM serial port.
Flashupdate protection
ALOM flash is in two segments with a persistent switch.
'flashupdate' always operates on the non-running segment. Segments are only switched after flashupdate
completes and image is CRC verified. A jumper can also switch the segments.
Ex: sc> flashupdate -s 129.148.173.99 -f /tmp/122430-01/System_Firmware-6_1_2-Sun_Fire_T2000.bin-latest
Supports new LED States:
White locator LED flashes at 4Hz when activated.
Green LED states:
Standby blink: 0.1sec on, 2.9sec off. When system is on standby power
Slow blink: 0.5 sec on, 0.5sec off: When system is in transition (running POST, powering down, etc)
Steady ON: system is running
Amber LED states:
Off: No faults.
On: Service required.
Amber slow blink to indicate unacknowledged faults not supported.
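The blink timings above are easier to recognise at the rack when expressed as a period or a blink rate. This small Python sketch does that arithmetic from the on/off times listed for the green LED and the 4 Hz figure for the locator LED. The names are hypothetical and the values are copied from the list above.

```python
# On/off times in milliseconds, taken from the green LED states above.
GREEN_LED_PATTERNS = {
    "standby blink": (100, 2900),  # system is on standby power
    "slow blink": (500, 500),      # system is in transition (POST, powering down, etc)
}

LOCATOR_PERIOD_MS = 1000 // 4      # white locator LED flashes at 4 Hz


def period_ms(pattern):
    """Length of one full on/off cycle in milliseconds."""
    on_ms, off_ms = GREEN_LED_PATTERNS[pattern]
    return on_ms + off_ms


def blinks_per_minute(pattern):
    """How many flashes you would count in one minute."""
    return 60_000 // period_ms(pattern)


if __name__ == "__main__":
    for name in GREEN_LED_PATTERNS:
        print(name, period_ms(name), "ms per cycle,", blinks_per_minute(name), "blinks/min")
    print("locator", LOCATOR_PERIOD_MS, "ms per cycle")
```

In other words, a standby blink is one short flash every three seconds, while a slow blink is one flash per second.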
Page 111

New in ALOM4v: ALOM handles the fault by:
Lighting the Fault LED(s)
Logging the fault to DFRUID of the indicted FRU(s)
Alerting the user using ALOM alerting mechanisms:
To logged-in ALOM users
To an email address (if configured)
New ALOM commands:
showfaults

Prints any faults: environmental faults, faulty FRUs, POST-detected faults (which result in ASR-disable),
and FMA-detected faults. Also prints the time and status of the last POST run.
clearfault
<UUID> to manually clear an FMA-diagnosed fault. (get UUID from showfaults output)
ASR commands:
showcomponent, enablecomponent, disablecomponent, clearasrdb
View and manage the list of blacklisted (ASR-disabled) devices.
Disabled state is stored on the actual FRU, such as the DIMM itself.
A FRU disabled on one system will remain disabled when inserted in another system.
setkeyswitch
normal: System can be used normally.
stby: Powers off the system and prevents 'poweron' command or button from operating.
diag: Forces the system to run servicemode diagnostics at next reset.
locked: Prevents 'flashupdate' and 'break' commands, system can power on/off and reset normally.
showkeyswitch
showfru
command prints both static and dynamic sections
setfru
command to set Customer_DataR in all FRUs
showhost version command to print the software versions contained in the Host flash prom.
obpupdate
command to update the Host flash prom (POST, OBP, etc). 'obpupdate' and 'flashupdate' will be merged
into a single command which will update both ALOM and the Host flash from a single master image
flash host prom
Servicemode commands: Be sure to set sc_servicemode to false when done!
setsc sc_servicemode true Warning: misuse of this mode may invalidate your warranty.
showplatform -v will print CPU #Cores and version information.
ping <ipaddress> - test network connectivity
clearnvramlog - erases persistent 'showlogs -v'
frucapture - offload a FRU's DFRUID image via FTP
fruupdate - update (overwrite) a DFRUID image via FTP
setcsn - set the chassis serial number, required when replacing the PDB board.
Can only be executed one time and only with a blank (new) PDB
fmagentconfupdate - field update FMA agent via FTP
showfmfaults - show current FMA faults stored on the DOC (Disk-on-chip)
showfmerptlog1 - show the first 40 ereports on DOC
showfmerptlog2 - show the last 40 ereports on DOC
clearereports - clear the ereport logs from DOC
docftpput - FTP a DOC file off of ALOM.
Note: the above command names may change by product ship!
The spdiag suite (servicemode) consists of the following commands:
i2ctest - run a single pass of the i2c test
envtest - run a single pass of the environmental test
sptest - run a single pass of the SP diag tests
setdiagopt - set diag test options used by 'rundiag'
rundiag - start diagnostics in the background
stopdiag - stop any running background diagnostic tests
showdiagstatus - show the status of background tests
resetdiagstatus - reset the diagnostic status
Page112

ALOM4v (cont...)
diagnostics run environment variables:
diag_trigger: when POST runs. Valid triggers: none, power-on-reset, user-reset, error-reset, all-resets
diag_verbosity: verbosity level of POST, one of: none, min, normal, max, or debug
diag_level: level of testing performed, one of: none, min, or max.
diag_mode: POST mode, one of: off, normal, service, or menu
sys_autorunonerror: Controls if the system should continue boot if POST finds an error. Set to true or false.
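The allowed values listed above can be checked before a setting is typed at the sc> prompt. The sketch below is a minimal, hypothetical helper: the names DIAG_VALUES, check_diag_setting and setsc_line are illustrative and not part of ALOM. It assumes the diag variables are set with 'setsc' in the same way as sc_servicemode, and it only checks a single value per variable.

```python
# Allowed values, copied from the variable list above.
DIAG_VALUES = {
    "diag_trigger": {"none", "power-on-reset", "user-reset", "error-reset", "all-resets"},
    "diag_verbosity": {"none", "min", "normal", "max", "debug"},
    "diag_level": {"none", "min", "max"},
    "diag_mode": {"off", "normal", "service", "menu"},
    "sys_autorunonerror": {"true", "false"},
}


def check_diag_setting(name, value):
    """Return True if value is listed above as valid for the variable name."""
    if name not in DIAG_VALUES:
        raise KeyError(f"unknown diagnostic variable: {name}")
    return value in DIAG_VALUES[name]


def setsc_line(name, value):
    """Format a 'setsc' line (assumed syntax) after checking the value."""
    if not check_diag_setting(name, value):
        allowed = ", ".join(sorted(DIAG_VALUES[name]))
        raise ValueError(f"{value!r} is not valid for {name}; use one of: {allowed}")
    return f"setsc {name} {value}"


if __name__ == "__main__":
    print(setsc_line("diag_mode", "service"))
    print(check_diag_setting("diag_level", "normal"))
```

Keeping the table as data means a new variable or value can be added in one place if a later firmware release changes the list.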
Forgotten password, ALOM4v (Niagara: Ontario, Erie):
1. Connect to the ALOM serial port.
2. Power cycle the server by unplugging both PSU cords and re-plugging them.
3. Hit "esc", the Escape key, during ALOM boot at the point: Return to Boot Monitor for Handshake
4. After hitting "esc", the ALOM boot escape menu will be printed:
ALOM <ESC> Menu
e - Erase ALOM NVRAM.
m - Run POST Menu.
R - Reset ALOM.
r - Return to bootmon.
Your selection:
5. Enter "e" to erase the ALOM NVRAM and then "r" to resume ALOM boot. ALOM will now boot with all NVRAM
settings reset to the factory defaults. You will automatically be logged on as user 'admin' with no
password and no permissions.
Solaris to Linux cross-reference: (http://www.unixporting.com/quickguide.html and Linux overview for Solaris users,
817-3341-10)
Solaris                       Linux                         Description
System Administration Tools
/usr/bin/admintool            /bin/linuxconf                system administration tasks
/usr/sbin/useradd             /usr/sbin/useradd             adds a new user
Kernel Configuration
/etc/system                   /usr/src/linux
Processes
/usr/bin/ps -ef               /bin/ps -ef                   active processes
/bin/truss                    /usr/bin/strace               trace of the system
/usr/ucb/users                /usr/bin/users                users currently on the system
/usr/ucb/ps -aux              /bin/ps -aux                  active processes sorted by %cpu
/usr/bin/prstat               /usr/bin/top                  active processes, reports statistics
Physical Memory
/usr/sbin/dmesg | grep mem    grep MemTotal /proc/meminfo   memory size
Hardware Status/Information
/usr/bin/dmesg                /bin/dmesg                    system buffer diagnostic messages
/usr/bin/arch -k              /bin/uname -m                 application architecture of host system
Host ID
/usr/bin/hostid               /usr/bin/hostid               lists host id
Hostname
/usr/bin/hostname             /bin/hostname                 lists hostname
/usr/bin/uname -a             /bin/uname -a                 lists hostname
Swap
/usr/sbin/swap -a             /sbin/swapon -a               add swap space
/usr/sbin/swap -l             /usr/bin/free                 lists swap info
vmstat                        vmstat                        virtual memory statistics
System Files
/etc/vfstab                   /etc/fstab                    filesystem default info
/etc/inet/hosts               /etc/hosts                    network hosts file
Page 113

Solaris to Linux cross-reference: (cont...)


Solaris                       Linux                         Description
The X Window System
/usr/openwin/bin/xterm        /usr/X11R6/bin/xterm          terminal emulator for x windows
/usr/openwin/bin/xhost        /usr/X11R6/bin/xhost          allowed connections to the X server
Networking
/usr/sbin/showmount           /sbin/showmount               clients that remotely mounted a filesystem
/etc/dfs/dfstab               /etc/exports                  sharing resources
/usr/sbin/route               /sbin/route                   manipulate the routing tables
/usr/bin/netstat              /bin/netstat                  show network status
/usr/sbin/ifconfig            /sbin/ifconfig                configure network interface parameters
/usr/sbin/snoop               /usr/sbin/tcpdump             displays network packets and their contents
Copies
/usr/bin/cpio                 /bin/cpio                     copy files
/usr/sbin/tar                 /sbin/tar                     copy files
Software
/usr/sbin/pkgadd              /bin/rpm -i[U]vh              add software pkg
/usr/sbin/pkginfo             /bin/rpm -qa                  displays software pkg info
/usr/sbin/pkgrm               /bin/rpm -e                   removes software pkg
Disk Formatting
/usr/sbin/format              /sbin/mke2fs                  creates partition
Disk Partitioning/info
/usr/sbin/format              /sbin/fdisk                   creates partition
/usr/sbin/format              /sbin/fdisk -l                lists partition info
Disk Space and Information
/usr/sbin/df                  /bin/df                       displays mounted file systems
/usr/sbin/df -k               /bin/df -k                    displays disk space of file systems
/usr/sbin/mount               /bin/mount                    mounts a file system
/usr/bin/du                   /usr/bin/du                   displays disk usage
Log Files
/var/adm/messages             /var/log/messages             system Log file
Miscellaneous
/usr/ucb/whoami               /usr/bin/whoami               displays current user name
/usr/bin/fdformat             /usr/bin/fdformat             floppy disk format
/usr/bin/tip                  /usr/bin/minicom              terminal connect thru serial port
/usr/bin/find                 /usr/bin/locate               find a file
/usr/bin/who -r               /sbin/runlevel                displays current run level
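
A cross-reference like the one above is easy to keep as a lookup table on a laptop. The sketch below is only an illustration: it holds a handful of the pairs from the table (the ones where the Linux name differs), and the names SOLARIS_TO_LINUX and linux_equivalent are hypothetical.

```python
# A few Solaris -> Linux pairs copied from the cross-reference table above.
SOLARIS_TO_LINUX = {
    "/usr/bin/prstat": "/usr/bin/top",
    "/bin/truss": "/usr/bin/strace",
    "/usr/sbin/snoop": "/usr/sbin/tcpdump",
    "/usr/sbin/pkginfo": "/bin/rpm -qa",
    "/usr/sbin/swap -l": "/usr/bin/free",
    "/usr/bin/who -r": "/sbin/runlevel",
    "/usr/bin/tip": "/usr/bin/minicom",
}


def linux_equivalent(solaris_cmd):
    """Return the Linux entry for a Solaris command, or None if not listed.

    The command may be given with its full path ("/usr/bin/who -r")
    or by program name only ("who -r").
    """
    key = " ".join(solaris_cmd.split())  # normalise runs of whitespace
    if key in SOLARIS_TO_LINUX:
        return SOLARIS_TO_LINUX[key]
    for solaris, linux in SOLARIS_TO_LINUX.items():
        words = solaris.split()
        program = words[0].rsplit("/", 1)[-1]  # strip the directory part
        if " ".join([program] + words[1:]) == key:
            return linux
    return None


if __name__ == "__main__":
    for cmd in ("prstat", "who -r", "/usr/sbin/snoop", "format"):
        print(cmd, "->", linux_equivalent(cmd))
```

Entries that are not in the dictionary return None rather than a guess, so a missing pair is obvious.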

SSH - Secure Shell :


SSH (Secure Shell, sometimes called Secure Socket Shell) is a secure Unix command interface and protocol that gives
the user remote access to a device located on a network. SSH is built of three utilities: slogin, ssh, and scp. These are
secure versions of the existing Unix utilities rlogin, rsh and rcp. All SSH commands and sessions are encrypted to enhance
security during a remote session. In most cases, if you have to connect via ssh to a server, ICMP (ping) will be disabled;
in other words, you will not be able to ping the server.
Commands for ssh users:
ssh hostname                          connect to hostname using ssh (ex: # ssh -l root 129.148.173.230)
slogin hostname                       you can use ssh and slogin interchangeably
ssh hostname command                  run command remotely on hostname
ssh -v hostname                       connect in verbose mode for debugging
ssh -V                                determine version number for your copy of ssh
Page114

Commands for ssh users: (cont.)


ssh-keygen                            generate a new public/private key pair
ssh-keygen -c myuserid-ssh2@pha       generate new key pair with identifying comment
sftp hostname                         copy files interactively between hosts (requires SSH2). Commands for an
                                      sftp session are similar to standard ftp.
scp filename hostB:filename           copy file from current computer to hostB
scp1 filename hostB:filename          copy file from current computer to hostB (use if hostB only supports SSH1)
scp hostA:filename hostB:filename     copy file between two computers
scp -r hostA:dirname1 hostB:dirname2  copy directory (and its contents) between two computers
scp hostA:fn1 hostB:fn2               copy and rename file between two computers
scp fn1 fn2 fn3 hostB:directoryname   copy multiple files into hostB's directory
ssh-agent command                     run command (usually a shell) under control of ssh-agent
ssh-add                               add local identity to list maintained in memory by ssh-agent
ssh-add filename                      add identity whose private key is stored in filename to list in memory
ssh-add -l                            list keys stored in memory
ssh-add -D                            delete all keys stored in memory
Commands for ssh maintainers
ssh-keygen -P /etc/ssh2/hostkey       generate & store a new host key
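When scp copies are scripted rather than typed, it helps to build the argument list in one place. The sketch below is a minimal illustration of the scp forms listed above; the function name scp_command is hypothetical, and only the -r flag from the list is handled.

```python
import shlex


def scp_command(sources, destination, recursive=False):
    """Build the argument list for one of the scp forms listed above.

    sources     -- one or more "file" or "host:file" strings
    destination -- a "file", "host:file" or "host:directory" string
    recursive   -- add -r to copy a directory and its contents
    """
    if not sources:
        raise ValueError("at least one source is required")
    argv = ["scp"]
    if recursive:
        argv.append("-r")
    argv.extend(sources)
    argv.append(destination)
    return argv


if __name__ == "__main__":
    # Show the command lines; pass the list to subprocess.run() to execute one.
    print(shlex.join(scp_command(["fn1", "fn2", "fn3"], "hostB:directoryname")))
    print(shlex.join(scp_command(["hostA:dirname1"], "hostB:dirname2", recursive=True)))
```

Returning a list instead of a single string avoids quoting problems if a file name contains spaces.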
SSH with SMS 1.5
The smsinstall command will automatically harden your SC; smsupgrade will not. (Bug ID: 5079760)
To undo hardening: (pg 50, SMS 1.5 Installation Manual)
1. Log in to the SC as superuser.
2. Type at the sc:# prompt: /opt/SUNWjass/bin/jass-execute -u
3. The system will prompt you with an `undo' menu.
4. Select the `run' number you want to undo.
5. Type q to exit.
6. Reboot the system.
To manually harden a SC with SMS 1.5: (note: telnet, rlogin, ftp and vold will not work afterwards, so make sure you
have serial console access before you harden it) infodoc 83763
# /opt/SUNWjass/bin/jass-execute -q -d sunfire_15k_sc-secure.driver
Galaxy ILOM:

(default login/password root/changeme)

ILOM (Integrated Lights Out Manager) (Motorola MPC8248 Service Processor):


Provides RKVMS functionality (Remote Keyboard, Video, Mouse and Storage. Default is not enabled for LAN.)
Provides ability to boot from virtual devices.
CLI through serial connection or SSH.
Environmental monitoring (voltage, fan speeds, temperatures, etc. and will send alert messages.)
Allows for LOM.
Embedded Web Server w/ SSL encryption. (connect to web GUI by: https://ipaddress)
Flash memory for built-in Linux OS.
Connects to all components via JTAG connection.
IPMI v2.0 command interface
SNMP v1, v2c and v3 interface.
CLI, Web GUI or ILOM Remote Console to manage.
To Power on:
To turn on main power mode (all components powered on), press and release the small Power button on the server
front panel. When main power is applied to the full server, the Power/OK LED next to the Power button lights and
remains lit. or
Page115

Galaxy ILOM: cont...


(Connect a serial cable from the RJ-45 Serial Mgt port on your ILOM SP to laptop)
-> start /SYS
To Power off: press and release the small Power button on the server front panel
or -> stop /SYS
Configuring the SP: (Serial Port default: 9600/8/1/none )
cd /SP/network
set /SP/network pendingipaddress=192.168.0.1
set /SP/network pendingipnetmask=255.255.255.0
set /SP/network pendingipgateway=192.168.0.10
set commitpending=true
show /SP/network
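Before typing the pending settings it is worth checking that the gateway sits on the same subnet as the SP address. The sketch below is a minimal illustration using Python's standard ipaddress module; the function name ilom_network_commands is hypothetical, and the absolute-path form of the commitpending line is an assumption based on the other set lines above.

```python
import ipaddress


def ilom_network_commands(ip, netmask, gateway):
    """Return the ILOM CLI lines for a static SP address.

    Raises ValueError if any value is not a valid IPv4 setting or if the
    gateway is not on the same subnet as the SP address.
    """
    iface = ipaddress.ip_interface(f"{ip}/{netmask}")
    gw = ipaddress.ip_address(gateway)
    if gw not in iface.network:
        raise ValueError(f"gateway {gw} is not on subnet {iface.network}")
    return [
        f"set /SP/network pendingipaddress={iface.ip}",
        f"set /SP/network pendingipnetmask={iface.netmask}",
        f"set /SP/network pendingipgateway={gw}",
        "set /SP/network commitpending=true",  # assumed absolute-path form
        "show /SP/network",
    ]


if __name__ == "__main__":
    # Values taken from the example above.
    for line in ilom_network_commands("192.168.0.1", "255.255.255.0", "192.168.0.10"):
        print("->", line)
```

The function only prints the lines to type; it does not connect to the SP.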
To start the serial console: (Connect a serial cable from the RJ-45 Serial Mgt port on your ILOM SP to laptop)
-> cd /SP/console
-> start
Type Esc ( to return to the SP.
The eeprom default is screen and keyboard. Use the Solaris eeprom command to
get a serial console in Solaris (ssh to the host or see Remote Console below):
eeprom input-device=ttya
eeprom output-device=ttya
BIOS: You need to change the BIOS setting to have serial port control
after POST (this will not override the eeprom setting in Solaris).
To change the setting:
F2 (ctl-E) on reset, Advanced, Remote Access Configuration,
Redirect after POST [always]
(Some OSs may not work if set to always)
CLI
<verb><options><target><properties>
See Sun Fire X4100 and X4200 Servers System Management Guide for guidance on CLI commands.
VERBS:
cd           Navigate the object namespace.
create       Set up an object in the namespace.
delete       Remove an object from the namespace.
exit         Terminate a session to the CLI.
help         Displays help information about commands and targets.
load         Transfers a file from an indicated source to an indicated target.
reset        Resets the state of the target.
set          Sets target properties to the specified value.
show         Displays information about targets and properties.
start        Starts the target.
stop         Stops the target.
version      Displays the version of service processor firmware running.
Options:     short-cuts
-default       n/a    Causes the verb to perform only its default functions.
-destination   n/a    Specifies the location of a destination for data.
-display       -d     Shows the data the user wants to display.
-examine       -x     Examines the command but does not execute it.
-force         -f     Causes an immediate shutdown, instead of an orderly shutdown.
-help          -h     Displays help information.
-level         -l     Executes the command for the current target and all targets contained through the level specified.
-output        -o     Specifies the content and form of command output.
-resetstate    n/a    Resets the state of the target to its default.
-script        n/a    Skips warnings or prompts normally associated with the command.
-source        n/a    Indicates the location of a source image.
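The <verb><options><target><properties> layout can be illustrated by splitting a command line into those four parts. The sketch below is a simplified, hypothetical parser (parse_ilom_command is not an ILOM tool): it treats any word starting with "-" as an option, any name=value word as a property, and the remaining word as the target. Options that take an argument (for example -level) are not handled.

```python
# Verbs copied from the list above.
VERBS = {"cd", "create", "delete", "exit", "help", "load",
         "reset", "set", "show", "start", "stop", "version"}


def parse_ilom_command(line):
    """Split an ILOM CLI line into verb, options, target and properties."""
    words = line.split()
    if words and words[0] == "->":  # drop a pasted prompt
        words = words[1:]
    if not words or words[0] not in VERBS:
        raise ValueError(f"not a known verb: {line!r}")
    parsed = {"verb": words[0], "options": [], "target": None, "properties": {}}
    for word in words[1:]:
        if word.startswith("-"):
            parsed["options"].append(word)
        elif "=" in word:
            name, value = word.split("=", 1)
            parsed["properties"][name] = value
        elif parsed["target"] is None:
            parsed["target"] = word
        else:
            raise ValueError(f"unexpected word: {word!r}")
    return parsed


if __name__ == "__main__":
    print(parse_ilom_command("set /SP/network pendingipaddress=192.168.0.1"))
    print(parse_ilom_command("-> stop -force /SYS"))
```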
Page 116

Galaxy ILOM: cont...


Contents of /SYS and /SP
-> cd SYS
/SYS
-> show
/SYS
Targets:
FIOBD   FT0   FT1   MB   PDB   PS0   PS1   SASBP
Properties:
ACT = standby_blink
FAN_FAULT = off
LOCATE = off
POWERSTATE = off
PSU_FAULT = off
SERVICE = off
TEMP_FAULT = off
Commands:
cd   reset   show   start   stop
-> cd ../SP
-> show
/SP
Targets:
alert   cli   clients   clock   console   logs   network   serial   services   sessions   users
Properties:
Commands:
cd   reset   show   version


Web Gui allows you to:


(To log on, use https://ipaddress)
redirect graphical console to remote host.
connect a virtual floppy or CD-ROM drive.
monitor and manage fans remotely.
monitor BIOS messages, OS messages and system status remotely.
interrogate NICs for MAC remotely.
Power on, off and reset remotely
Remote Console (RKVMS): (requires Java 5.0 or higher)
You can use Remote Console to get remote console, keyboard, mouse access to the server and to install s/w from local
CD drive. Open a browser
https://SP_ipaddress
From Remote Console, choose Redirection->Start Redirection->Devices->Mouse/Keyboard/CD-ROM

USERs:
Can't delete the following accounts: root, anonymous, ldapproxy
Can create an additional 7 accounts.
Send break: When logged into the SP using ssh with a console session running: ESC + Shift-b

Page 117

Revision History:
First release 01/17/00
Corrections:
02/14/00  Page 30      punzip to gunzip
06/21/00  Page 19      d = on bd soc+ (was in wrong place)
Additions:
02/28/00  Page 39      Uncompressing files
03/14/00  Page 40-43   T300
03/27/00  Page 28      * #TERM=vt100; export TERM
03/27/00  Page 44-45   ACT
05/18/00  Page 46-48   Advantages of Splitting a Drive into Multiple File Systems
05/19/00  Page 48-49   How to configure a system to run on a network
05/19/00  Page 49-51   SEVM - How to recover a primary boot disk.
05/23/00  Page 51      Disable DMP
06/16/00  Page 52      Memory Scrubbing
07/20/00  Page 13      metastat command added to Disk Suite sec.
07/20/00  Page 16      raidutil commands
07/21/00  Page 52      Display remote App GUI locally
08/26/00  Page 53      Cluster 2.x
10/13/00  Page 41      T300 Pgroup secondary disk addressing failover note
10/16/00  Page 31      mpstat command added
10/17/00  Page 40      T300 Pgroup, 2 fiber path data transfer usage note
11/09/00  Page 21      isainfo -v command added
11/09/00  Page 42      T300 tftp boot (examples added)
11/09/00  Page 56      Encapsulating root after using Environmental CD to load O/S:
11/20/00  Page 42      Warning added (:/: sys blocksize (n)k should be set to correct value before 'vol add')
11/20/00  Page 56      Adding a second network interface (without boot)
11/20/00  Page 56      Adding a default gateway
11/27/00  Page 53      OPS general description
12/12/00  Page 57      Volume Manager
12/28/00  Page 60      FTPing to and from sunsolve
02/06/01  Page 56      /etc/name_to_major (cluster warning added)
02/06/01  Page 56      /etc/defaultrouter added
04/24/01  Page 56      /etc/notrouter (warning added to 2nd interface)
04/24/01  Page 61      Serengeti added
04/24/01  Page 30      info on new explorer added
04/24/01  Page 67      Mounting CD without vold
05/20/01  Page 66      Notes added
06/20/01  Page 16      Update A3500 info and rm6 commands
06/20/01  Page 42      modify Enable/Disable command descriptions
06/20/01  Page 43      modify disk and lpc download descriptions
06/20/01  Page 62      modified repeater bd info (removed 3800 4800 warning on dual partitions)
06/20/01  Page 67      mailx: send messages/files
06/20/01  Page 59      take -g out of vxdg import and export example
06/20/01  Page 60      no longer able to create directories on ftp sunsolve
06/20/01  Page 43      Warning added on controller firmware upgrade
06/20/01  Page 61      * when available (added)
07/23/01  Page 28      VTS description change (removing "on-line")
11/06/01  Page 67      T3 forgotten password
11/06/01  Page 67      T3 logging
11/06/01  Page 21      -k added to netstat command

11/06/01  Page 33      dd, added a disk to disk quick copy example
11/08/01  Page 1       note added 'or disk@n for PCI '
12/03/01  Pages 68-73  StarCat 15k notes
12/12/01  Page 73      local-mac-address
12/12/01  Page 73      SDS- How to mirror root
01/28/02  Page 16      raidutil command switches fixed ( -B and -R)
02/21/02  Page 68      added fin I0771-1 information
02/21/02  Page 68      added SC console port pinout
05/08/02  Page 55      # scconf -N (to change a node ethernet address )
05/08/02  Page 8       (7-127) added (E10K hpost levels)
05/08/02  Page 73      smsconfig -m (added IPMP info)
05/08/02  Page 68,69   smsconfig -m info added
05/08/02  Page 75      IPMP
05/20/02  Page 11      Added SSP3.4 information
06/06/02  Page 76      T3B or T3+ Firmware Rev 2.1 New Functions:
06/10/02  Page 77      Hitachi StorEdge 99X0 Arrays:
06/13/02  Page 78      SunFire forgotten password
06/17/02  Page 64      Sun Fire setfailover, showfailover cmds added
07/16/02  Page 60      Updated 'ftp to sunsolve' with rftp
07/17/02  Page 80      StorEdge Network FC Switch
07/25/02  Page 80      Added to FC switch info
07/29/02  Page I       added: http://webhome.east/boston/ to disclaimer
08/09/02  Page 31      added 'top' command
10/01/02  Page 81      9900v notes added
10/07/02  Page 84      Minnow info added
10/25/02  Page 72      flashupdate -f opt/SUNWSMS/hostobjs/sgcpu.flash
10/28/02  Page 86      Tuning ecache scrubber scan rate
10/30/02  Page 86      VxWorks commands (Serengeti)
11/04/02  Page 11      Add syntax for share cdrom for VTS
11/08/02  Page 87      LVD adapter information (ultra scsi-3 375-3057)
11/12/02  Page 54      ccdadm command added for ccd.database recovery
11/12/02  Page 87      changed step sequence for booting image
11/20/02  Page 65      added (-x) to domain reset command
11/20/02  Page 10      redlist definition added
11/27/02  Page 87      Replacing a nordica bd in a 15K SC
12/04/02  Page 66      remove firmware bugs add firmware matrix
12/04/02  Page 87      Add Serengeti DR commands
12/06/02  Page 85      Added to Minnow info
12/11/02  Page 66      add to firmware matrix
01/14/03  Page 66      modified logging information
01/21/03  Page 64      added `service' and `testinterconnect' commands
02/03/03  Page 88      Clean up non-root disk controller numbers
02/10/03  Page 88      Set network parameters at boot:
03/04/03  Page 80      Useful SAN commands
03/04/03  Page 79      Default Storage switch passwords
03/05/03  Page 66      Modified firmware matrix (5.14.4)
03/17/03  Page 59      added /opt/VRTS/bin/vea
03/17/03  Page 88      Starcat Portid cheat sheet
03/17/03  Page 62      6800 partition info added
03/21/03  Page 90      StorADE info added
04/11/03  Page 89      Starcat SC: clean the slate
04/11/03  Page 89      Starcat redx info
04/11/03  Page 75      rm + after deprecated under /etc/hostname.qfe0 :
04/18/03  Page 90      get FRU info from Serengeti
05/15/03  Page 91      SWAP
06/02/03  Page 92      Maserati Notes- StorEdge 6320 and 6120

06/19/03  Page 11      removed 'slot' from sbus numbering formula
07/08/03  Page 93      Flash Archive interactive install
07/09/03  Page 94      UltraSPARC III CPU Diagnostic Monitor (CDM)
07/10/03  Page 89      add lines to Starcat SC: clean the slate.
07/10/03  Page 94      SunFire Service Mode Password Generator
07/14/03  Page 94      added : To remove CDM
07/21/03  Page 94      V440 ALOM, raidctl
09/03/03  Page 60      update ftp info
10/24/03  Page 94      added setchs -s command to service mode
11/06/03  Page 28      added navigation keys to sunvts
11/06/03  Page 95      Finding Solaris release and distribution loaded
01/20/04  Page 96      Network troubleshooting command, files, daemons
01/26/04  Page 42, 76  volslice note added
01/28/04  Page 72      Added SMS1.4 commands
02/10/04  Page 97      How to find your way around a B1600...
02/12/04  Page 97      added default login and console info
03/01/04  Page 87      added # cfgadm -val | grep permanent
03/09/04  Page 64      updated platform commands 5.16.0
03/09/04  Page 66      updated firmware matrix
04/06/04  Page 72      add SB1 to flashupdate command
04/06/04  Page 87      add 15k to dr command
04/27/04  Page 103     Cluster 3.x
05/31/04  Page 96      added to fileinfo /etc/dhcp.interface
06/11/04  Page 106     added smsupgrade 1.4.1 info
06/28/04  Page 93      flasharch info add (use same release ex: sol9 40/04)
07/07/04  Page 60      suncore password change
07/27/04  Page 106     Solaris 9 SVM (sds) disk replacement
08/19/04  Page 108     SC rebuild after total disk failure
08/27/04  Page 73      simplified sds mirror procedure
08/27/04  Page 106     Made SVM replacement more universal
09/16/04  Page 60      added password url to ftp sunsolve info
09/23/04  Page 109     15K DR / hpost examples
10/11/04  Page 110     smsbackup: manually check a backup file
11/11/04  Page 110     3310/3510 Disk replacement:
12/08/04  Page 64      3800-6800 navigation (ssh) #. added
02/03/05  Page 64      added setchs showchs cmds
02/03/05  Page 110     How to mount a CD image file (.iso) as a filesystem
02/03/05  Page 31      Added iostat (disk thruput test)
04/19/05  Page 111     Removing the top cover on a v20z
04/26/05  Page 111     Explorer -w scextended with cron
08/03/05  Page 111     Useful COD commands:
08/23/05  Page 111     ALOM4v Ontario/Erie (Niagara)
08/23/05  Page 113     Forgotten password (ALOM4v)
09/13/05  Page 68      add details to 15k serial pinout
09/26/05  Page 113     Solaris to Linux cross reference
10/18/05  Page 114     SSH information
10/18/05  Page 115     Galaxy ILOM info
10/28/05  Page 96      kstat -p, netstat -k added
10/28/05  Page 95      Find local NIS servers
12/08/05  Page 109     Made 15k dr clearer (cfgadm -val)
01/09/06  Page 93      added -S to flarcreate example for faster archive
01/17/06  Page 65      updated remote logging
03/22/06  Page 79      updated Serengeti password reset
03/28/06  Page 111     added Niagara flashupdate example
04/04/06  Page 116     added x4100 console information
07/24/06  Page 115     SSH with SMS 1.5
