
Troubleshooting and Upgrades

Jason Nash
www.jasonnash.com
@TheJasonNash
Troubleshooting

• Not a lot to troubleshoot with XtremIO
  • Cluster communications
  • Host connectivity issues
• A lot of information can be viewed using the XMS CLI
• You can also install the XMS CLI on a Linux box
  • Useful in case access to the XMS is restricted
• Worst case, you can re-install the XMS
  • Just attach it to the cluster again

Lab
Examine the XMS CLI

Startup - XMS

• Power up the Virtual XMS
  • Just use the power-on command in your management console
• Power up the Physical XMS
  • Use the power button on the left of the physical server
• Wait a few minutes, then ping the appropriate IP address
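
The "wait a few minutes and ping" step can be scripted. The following is a minimal sketch, assuming a Linux-style `ping` that accepts `-c` (count) and `-W` (timeout in seconds). The function names, the 10-second interval, and the 5-minute timeout are illustrative choices, not values from XtremIO documentation.

```python
import subprocess
import time


def ping_once(host):
    """Return True if one ICMP echo succeeds (assumes Linux ping flags -c and -W)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_until_reachable(host, probe=ping_once, timeout=300, interval=10,
                         sleep=time.sleep):
    """Call probe(host) every `interval` seconds until it succeeds or `timeout` elapses."""
    waited = 0
    while True:
        if probe(host):
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

Calling `wait_until_reachable("xms.example.com")` (a placeholder host name) returns `True` as soon as the XMS answers, or `False` after roughly five minutes. The `probe` and `sleep` parameters exist only so the loop can be exercised without a real network.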

Startup - BBUs

• BBUs power on automatically at 70% charge
• There are two types of BBUs
  • Power-on is basically the same for both
• Confirm they are not already powered on
• Press and hold the power button on the front

Startup - Controllers

• Usually done via the CLI
  • Makes use of the IPMI connection
• Commands are
  • show-storage-controllers
  • power-on sc-id="<Storage Controller name or index from previous command>"
• Wait a few minutes, then ping the appropriate IP address
• If this doesn't work, there is a physical button on the front left of each controller
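
When several controllers need powering on, the command lines can be generated rather than typed one by one. This is a minimal sketch that only builds the strings to paste into the XMS CLI. The helper name is hypothetical, and the controller names in the usage note are placeholders: take the real names or indexes from the output of show-storage-controllers.

```python
def power_on_commands(controller_ids):
    """Build one power-on command line per Storage Controller name or index."""
    # List the controllers first, as on the slide, so the IDs can be checked.
    commands = ["show-storage-controllers"]
    for sc_id in controller_ids:
        commands.append(f'power-on sc-id="{sc_id}"')
    return commands
```

For example, `power_on_commands(["X1-SC1", "X1-SC2"])` yields the listing command followed by one `power-on` line for each of the two placeholder names.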

Starting the Cluster

• Once you verify all of the components are powered on, you can start the cluster
  • start-cluster
• Wait, then check the status
  • show-clusters
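
The precondition on this slide (every component powered on before start-cluster) can be written down as a simple check. This is a minimal sketch in which the component names and the dictionary layout are assumptions made for illustration; the power state of each component still has to be confirmed by hand as described on the earlier startup slides.

```python
def ready_to_start(components):
    """Return (ok, missing): ok is True only when every component is powered on."""
    missing = sorted(name for name, powered_on in components.items() if not powered_on)
    return (not missing, missing)
```

With `{"XMS": True, "BBUs": True, "Storage Controllers": False}` the function reports that the cluster is not ready and names the Storage Controllers as the missing piece, so start-cluster should wait.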

Before You Shut Down

• Confirm that nothing is using the array
  • show-clusters
  • show-clusters-performance
• You want to see no I/O activity
  • Any activity means a host and application are still accessing the storage
• Strongly recommended: consult the documentation for your software version
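
The "no I/O activity" check can be expressed as a small helper. This is a minimal sketch: the exact output format of show-clusters-performance is not reproduced here, so the code assumes the read and write IOPS figures have already been copied into dictionaries with the hypothetical keys `name`, `read_iops`, and `write_iops`.

```python
def active_io(rows):
    """Return the names of clusters whose read or write IOPS are non-zero."""
    return [
        row["name"]
        for row in rows
        if row.get("read_iops", 0) > 0 or row.get("write_iops", 0) > 0
    ]
```

An empty result means nothing is using the array. Any name in the result means a host is still active and the shutdown should wait.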

Shutdown - Software

• Make sure the XMS is talking to the clusters
  • show-clusters
  • The state should be active
  • If the state is neither stopped nor active, call EMC support
• Issue the command to shut down the cluster
  • stop-cluster-unorderly
• If you want, you can also shut down the XMS
  • shutdown-xms shutdown-type=service
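
The state check above is a three-way decision, sketched below. The function name is hypothetical, and the exact spelling and capitalization of the state in show-clusters output is an assumption, so the input is normalized before comparison.

```python
def shutdown_action(state):
    """Map a cluster state from show-clusters to the next step on the slide."""
    normalized = state.strip().lower()
    if normalized == "active":
        return "stop-cluster-unorderly"
    if normalized == "stopped":
        return None  # cluster is already stopped, nothing to issue
    # Any other state is outside the procedure: escalate instead of guessing.
    raise ValueError(f"unexpected cluster state {state!r}: call EMC support")
```

Raising an error for unknown states mirrors the slide's advice: anything other than active or stopped is a reason to stop and call support.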

Shutdown - Hardware

• Run show-clusters again to confirm the cluster services are down
  • Then issue shutdown cluster-id=1
  • This shuts down all controllers in the cluster
• To power off the Virtual XMS, use your virtual environment management console
• For physical servers, issue shutdown-xms shutdown-type=machine

Emergency Shutdown

• From the XMS, issue
  • stop-cluster-unorderly
  • power-off sc-id=<Storage Controller number>
• Then power off the XMS
  • Use the method described earlier
• If you need to physically power down the environment quickly
  • Pull out the DAE power cables of the first X-Brick
  • Wait for the blue LEDs to start blinking on all controllers
  • Power down the rack

Upgrade Options

Disruptive Upgrade

Requires stopping all I/O to and from the cluster

Non-Disruptive Upgrade (NDU)

Each controller is updated and rebooted individually, so I/O is not interrupted

Software Upgrade

Performed after adding a brick to the cluster

Disruptive Upgrade – Software Prep

1. Download the latest XtremIO software from the EMC support site

2. SCP the software to /var/lib/xms/images/ on the XMS

3. SSH into the XMS and log in as xmsadmin, then as tech

4. Run show-sw-images and confirm that the XMS sees the new software
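
The copy in step 2 can be prepared programmatically. This is a minimal sketch that builds the `scp` argument list and the path the package should end up at. The host name, the local file name, and the choice of account used for the copy are placeholders and assumptions; the slides do not say which account the SCP transfer should use.

```python
import posixpath

XMS_IMAGE_DIR = "/var/lib/xms/images/"  # destination directory named on the slide


def scp_command(local_package, xms_host, user):
    """Return the argv list for copying an upgrade package to the XMS image directory."""
    return ["scp", local_package, f"{user}@{xms_host}:{XMS_IMAGE_DIR}"]


def remote_path(local_package):
    """Path the package should have on the XMS once the copy completes."""
    return posixpath.join(XMS_IMAGE_DIR, posixpath.basename(local_package))
```

The list returned by `scp_command` can be passed to `subprocess.run`, and `remote_path` gives the location of the file that show-sw-images should then report.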

Disruptive Upgrade – Upgrading the Cluster

1. Disable ESRS monitoring, if enabled
   1. modify-syr-notifier disable

2. Confirm that there are no issues that may impact the upgrade
   1. test-cluster-upgradability package="file-name-of-upgrade.tgz"

3. Perform the upgrade
   1. stop-cluster-unorderly
   2. upgrade-cluster-unorderly package="file-name-of-upgrade.tgz"

Disruptive Upgrade - Completion

1. Wait for the XMS to come back up

2. Log in again as xmsadmin, then as tech

3. Restart the cluster
   1. start-cluster

4. Re-enable ESRS monitoring
   1. modify-syr-notifier enable
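
The disruptive upgrade slides can be collected into one ordered command list. This is a minimal sketch that only assembles the commands named on the slides, in slide order; it does not run them, and the function name and the `esrs_enabled` flag are hypothetical.

```python
def disruptive_upgrade_plan(package, esrs_enabled=True):
    """Return the XMS CLI commands for a disruptive upgrade, in slide order."""
    plan = ["show-sw-images"]  # confirm the XMS sees the new package
    if esrs_enabled:
        plan.append("modify-syr-notifier disable")
    plan.append(f'test-cluster-upgradability package="{package}"')
    plan.append("stop-cluster-unorderly")
    plan.append(f'upgrade-cluster-unorderly package="{package}"')
    # The XMS restarts at this point: log in again before continuing.
    plan.append("start-cluster")
    if esrs_enabled:
        plan.append("modify-syr-notifier enable")
    return plan
```

Printing the result of `disruptive_upgrade_plan("file-name-of-upgrade.tgz")` gives a checklist to work through by hand, which makes it harder to skip the upgradability test or forget to re-enable ESRS monitoring.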

Non-Disruptive Upgrade - Process

• This process is very much like the disruptive process
  • Almost all steps are the same
• The difference is that you don't stop or start the cluster manually
  • No stop-cluster-unorderly
  • No start-cluster
• You will get kicked out of the XMS and need to log in again
  • You may see a lot of alerts and errors as the process completes
  • Try not to be alarmed!
• Documentation says it takes 15 minutes
  • It takes a good bit longer in real life
  • It will cause a lot of alarms in the XMS!
Lab
Perform a Non-Disruptive Upgrade

Software Upgrade

• This is done if you are adding a brick to an existing cluster
  • Currently it is very disruptive
  • "Very disruptive" means that you will lose all data on the brick
  • This applies to current v2.x code releases
• This will be resolved in later releases, which will allow non-disruptive expansion and redistribution of data
• The process is just like building a new cluster from scratch

Summary

• Troubleshooting using the XMS CLI
• Power on and off procedures
• Performing software upgrades
