NX Series Hardware
Administration Guide
July 24, 2020
Contents
Visio Stencils  xi
Configuring the Remote Console IP Address (IPMI Web Interface)  47
Configuring the Remote Console IP Address (Command Line)  48
Configuring the Remote Console IP Address (BIOS)  49
9. Memory Configurations  71
Supported Memory Configurations (Ivy Bridge and Sandy Bridge)  71
Supported Memory Configurations (G4 and G5 Platforms)  77
Supported Memory Configurations (G6 Platforms)  83
Supported Memory Configurations (G7 Platforms)  86
DIMM and CPU Performance  88
Copyright  89
License  89
Conventions  89
Version  89
SYSTEM SPECIFICATIONS AND HARDWARE INFORMATION
For system specifications, wiring diagrams, and other platform-specific hardware information,
see the system specifications. Go to the Nutanix Support portal and select Documentation >
Hardware Replacement Documentation. Click the AOS Version drop-down box and select ANY.
FIREWALL PORT REQUIREMENTS FOR IPMI
Port numbers for the IPMI interface. Make sure that these ports are open on the firewall.
Interface    Port number
HTTP         80 (TCP)
SMASH        22 (TCP)
CAUTION: Do not configure a cluster that violates any of the following rules.
Compatibility
The Nutanix Support portal includes a compatibility matrix available from the Compatibility
Matrix link. You can filter and display compatibility by Nutanix NX model, AOS release,
hypervisor, and feature (platform/cluster intermixing).
Nutanix recommends that you consult the matrix before installing or upgrading software on
your cluster.
Hardware Restrictions
• You can mix nodes that use different CPU families in the same cluster, but not in the same
block. For example, a cluster can contain nodes from any NX generation, but a G6 block
must contain only G6 nodes, a G7 block must contain only G7 nodes, and so on.
• Nutanix does not support mixing Nutanix NX nodes in the same cluster with nodes from any
other hardware vendor. However, you can manage separate clusters using the Prism web
console regardless of the hardware type.
• Nutanix does not support mixing nodes with RDMA enabled and nodes without RDMA
enabled in the same cluster.
• All Controller VMs in a cluster must use the same version of AOS.
CPU Restrictions
• All CPUs in a block must be identical. When adding a node to a multinode block, make sure
the new node and the existing nodes use identical CPUs.
NIC Restrictions
Storage Restrictions
Encryption Restrictions
• Encrypted drives (SED) can be mixed with unencrypted (non-SED) drives in the same node, if encryption has never been enabled.
• Encrypted nodes can be mixed with unencrypted nodes in the same cluster if encryption was never enabled and remains disabled.
• NVMe drives do not support self-encryption (SED).
DIMM Restrictions
DIMM types
For all platforms: within a node, all DIMMs must be of the same type. For example, you
cannot mix RDIMMs and LRDIMMs in the same node.
DIMM capacity
For all platforms: within a node, all DIMMs must have the same memory capacity. For
example, you cannot mix 16 GB and 32 GB DIMMs in the same node.
DIMM manufacturers
For platforms earlier than G5, you cannot mix DIMMs from different manufacturers in the
same node.
For G5, G6, and G7 platforms, Nutanix supports mixing DIMMs from different
manufacturers within the same node, but not within the same channel:
• On G5 platforms, channels contain either two DIMM slots (one blue and one black) or three DIMM slots (one blue and two black).
• On G6 and G7 platforms, all channels contain two DIMM slots (one blue and one black).
• Within a channel, all DIMMs must be from the same manufacturer.
• When replacing a failed DIMM, ensure that you are replacing the original DIMM like-for-
like.
Note: You do not need to balance numbers of DIMMs from different manufacturers within
a node, so long as you never mix them in the same channel.
DIMM speed
For platforms earlier than G5, you cannot mix DIMMs that run at different speeds in the
same node.
For G5 and later platforms, Nutanix supports higher-speed replacement DIMMs, under
these conditions:
• You can mix DIMMs that use different speeds in the same node but not in the same
channel. Within a channel, all DIMMs must run at the same speed.
• You can only use higher-speed replacement DIMMs from one NX generation later than
your platform.
• G5 platforms shipped with 2400MHz DIMMs. You can use G6 2666MHz DIMMs in a
G5 platform, but not G7 2933MHz DIMMs.
• G6 platforms shipped with 2666MHz DIMMs. You can use G7 2933MHz DIMMs in a
G6 platform, but no higher.
• G7 platforms ship with 2933MHz DIMMs. Currently 2933MHz is the highest DIMM
speed that Nutanix supports.
• All installed DIMMs run only at the supported speed of your platform configuration.
Hypervisor Restrictions
• All nodes in a cluster must use the same hypervisor type and version. This restriction does
not apply if the cluster contains the following nodes:
• Storage-only nodes always run AHV, but you can add them to clusters that run on other
hypervisors.
Note: With Nutanix Foundation version 4.0 and later, any node can act as a storage-only
node. With Foundation versions earlier than 4.0, only dedicated storage platforms such as
the NX-6035C can act as storage-only nodes.
• A cluster consisting of only NX-1065 series nodes or only Lenovo HX1310 nodes. These
nodes can form a mixed-hypervisor cluster running ESXi and AHV, with the AHV node
used for storage only. For information about creating a multi-hypervisor cluster, see the
Field Installation Guide: Imaging Bare Metal Nodes.
• If you expand a cluster by adding a node with older generation hardware to a cluster
that was initially created with later generation hardware, power cycle (do not reboot)
vSphere Restrictions
• Nutanix supports mixing nodes with different processor architectures in the same cluster.
However, vSphere only supports enhanced/live vMotion of VMs from one type of node
to another when you have enhanced vMotion compatibility (EVC) enabled. For more
information about EVC, see the vSphere 5 documentation and the following VMware
knowledge base articles:
Procedure
1. In Prism, go to the Home page and make sure Data Resiliency Status displays a green OK.
2. In Prism, go to the Health page and select Actions > Run NCC Checks.
3. In the dialog box that appears, select All Checks and click Run.
Alternatively, issue the following command from the CVM:
nutanix@cvm$ ncc health_checks run_all
4. If any checks fail, see the related KB article provided in the output and the Nutanix Cluster
Check Guide: NCC Reference for information on resolving the issue.
5. If you have any unresolvable failed checks, contact Nutanix Support before shutting down
the node.
7. Save the output of the show_hardware_info command so that you can compare details when
verifying the component replacement later.
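For example, you can capture the output to a file on the CVM for later comparison (the file name here is illustrative):
nutanix@cvm$ ncc hardware_info show_hardware_info > hardware_info_before.txt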
Procedure
1. Log on to the CVM and make a note of the BIOS, BMC, and SATA DOM versions.
3. Find the IPMI IP address of the node (necessary to access the IPMI web UI).
nutanix@cvm$ ncc ipmi_info
Procedure
3. Right-click the host and select Maintenance Mode > Enter Maintenance Mode.
5. Log on to the Controller VM with SSH and shut down the Controller VM.
nutanix@cvm$ cvm_shutdown -P now
Note: Do not reset or shut down the Controller VM in any way other than with the cvm_shutdown command. This ensures that the cluster is aware that the Controller VM is unavailable.
6. After the Controller VM shuts down, wait for the host to go into maintenance mode.
Procedure
Replace Hypervisor-address with the value of Hypervisor address for the node. The value of Hypervisor address is either the IP address of the AHV host or the host name.
Specify wait=true to wait for the host evacuation attempt to finish.
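For example (the host IP address is illustrative):
nutanix@cvm$ acli host.enter_maintenance_mode 10.1.64.20 wait=true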
Procedure
1. In Prism, go to the Home page and make sure Data Resiliency Status displays a green OK.
2. In Prism, go to the Health page and select Actions > Run NCC Checks.
3. In the dialog box that appears, select All Checks and click Run.
Alternatively, issue the following command from the CVM:
nutanix@cvm$ ncc health_checks run_all
4. If any checks fail, see the related KB article provided in the output and the Nutanix Cluster
Check Guide: NCC Reference for information on resolving the issue.
5. If you have any unresolvable failed checks, contact Nutanix Support before shutting down
the node.
7. Save the output of the show_hardware_info command so that you can compare details when
verifying the component replacement later.
Procedure
1. Log on to the CVM of each node and make a note of the BIOS, BMC, and SATA DOM
versions.
3. Find the IPMI IP address of each node (necessary to access the IPMI web UI).
nutanix@cvm$ ncc ipmi_info
CAUTION: Verify the data resiliency status of your cluster. If the cluster has only replication factor 2 (RF2), you can shut down only one node per cluster. If more than one node in an RF2 cluster must be shut down, shut down the entire cluster instead.
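If you prefer to check fault tolerance from the command line, one common way is shown below; this supplements, but does not replace, the Prism resiliency check:
nutanix@cvm$ ncli cluster get-domain-fault-tolerance-status type=node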
Procedure
2. If DRS is not enabled, manually migrate all the VMs except the Controller VM to another host
in the cluster or shut down any VMs other than the Controller VM that you do not want to
migrate to another host.
If DRS is enabled on the cluster, you can skip this step.
3. Right-click the host and select Maintenance Mode > Enter Maintenance Mode.
Note: If DRS is not enabled, manually migrate or shut down all the VMs excluding the Controller VM. Even when DRS is enabled, some VMs might not be migrated automatically because of a configuration option in the VM that is not present on the target host.
Note: Do not reset or shut down the Controller VM in any way other than with the cvm_shutdown command. This ensures that the cluster is aware that the Controller VM is unavailable.
6. After the Controller VM shuts down, wait for the host to go into maintenance mode.
CAUTION: Verify the data resiliency status of your cluster. If the cluster has only replication factor 2 (RF2), you can shut down only one node per cluster. If more than one node in an RF2 cluster must be shut down, shut down the entire cluster instead.
You can put the ESXi host into maintenance mode and shut it down from the command line or
by using the vSphere web client.
Procedure
1. Log on to the Controller VM with SSH and shut down the Controller VM.
nutanix@cvm$ cvm_shutdown -P now
If successful, this command returns no output. If it fails with a message like the following,
VMs are probably still running on the host.
CRITICAL esx-enter-maintenance-mode:42 Command vim-cmd hostsvc/maintenance_mode_enter failed
with ret=-1
Ensure that all VMs are shut down or moved to another host and try again before
proceeding.
nutanix@cvm$ ~/serviceability/bin/esx-shutdown -s cvm_ip_addr
Replace cvm_ip_addr with the IP address of the Controller VM on the ESXi host.
Alternatively, you can put the ESXi host into maintenance mode and shut it down using the
vSphere Web Client.
If the host shuts down, a message like the following is displayed.
INFO esx-shutdown:67 Please verify if ESX was successfully shut down using
ping hypervisor_ip_addr
CAUTION: Verify the data resiliency status of your cluster. If the cluster has only replication factor 2 (RF2), you can shut down only one node per cluster. If more than one node in an RF2 cluster must be shut down, shut down the entire cluster instead.
To shut down a node, you must shut down its Controller VM. Before you shut down the Controller VM, put the node in maintenance mode.
When a host is in maintenance mode, VMs that can be migrated are moved from that host
to other hosts in the cluster. After exiting maintenance mode, those VMs are returned to the
original host, eliminating the need to manually move them.
If a host is put in maintenance mode, the following VMs are not migrated:
• VMs with GPUs, CPU passthrough, PCI passthrough, and host affinity policies are
not migrated to other hosts in the cluster. You can shut down such VMs by setting
the non_migratable_vm_action parameter to acpi_shutdown. If you do not want
to shut down these VMs for the duration of maintenance mode, you can set the
non_migratable_vm_action parameter to block, or manually move these VMs to another
host in the cluster.
• Agent VMs are always shut down if you put a node in maintenance mode and are powered
on again after exiting maintenance mode.
Perform the following procedure to shut down a node.
Note the value of Hypervisor address for the node you want to shut down.
c. Put the node into maintenance mode.
nutanix@cvm$ acli host.enter_maintenance_mode Hypervisor address [wait="{ true | false }"] [non_migratable_vm_action="{ acpi_shutdown | block }"]
Replace Hypervisor address with either the IP address or host name of the AHV host you
want to shut down.
Set wait=true to wait for the host evacuation attempt to finish.
Set non_migratable_vm_action=acpi_shutdown if you want to shut down VMs such
as VMs with GPUs, CPU passthrough, PCI passthrough, and host affinity policies for the
duration of the maintenance mode.
If you do not want to shut down these VMs for the duration of the maintenance mode,
you can set the non_migratable_vm_action parameter to block, or manually move these
VMs to another host in the cluster.
If you set the non_migratable_vm_action parameter to block and the operation to
put the host into the maintenance mode fails, exit the maintenance mode and then
either manually migrate the VMs to another host or shut down the VMs by setting the
non_migratable_vm_action parameter to acpi_shutdown.
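For example, to evacuate a host and shut down its non-migratable VMs (the host IP address is illustrative):
nutanix@cvm$ acli host.enter_maintenance_mode 10.1.64.20 wait=true non_migratable_vm_action=acpi_shutdown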
d. Shut down the Controller VM.
nutanix@cvm$ cvm_shutdown -P now
CAUTION: Verify the data resiliency status of your cluster. If the cluster has only replication factor 2 (RF2), you can shut down only one node per cluster. If more than one node in an RF2 cluster must be shut down, shut down the entire cluster instead.
1. Log on to the Controller VM with SSH and shut down the Controller VM.
nutanix@cvm$ cvm_shutdown -P now
Note:
Always use the cvm_shutdown command to reset or shut down the Controller VM. The cvm_shutdown command notifies the cluster that the Controller VM is unavailable.
2. Log on to the Hyper-V host with Remote Desktop Connection and start PowerShell.
> shutdown /s /t 0
Procedure
1. Contact Nutanix Support to obtain the SATA DOM firmware update ISO.
» If the SATA DOM is part of a single-node cluster, follow the procedures in Shutting
Down a Single-node Cluster on page 12.
» If the SATA DOM is part of a multinode cluster, follow the procedures in Shutting Down
a Multinode Cluster on page 14.
3. From the system where you downloaded the firmware ISO, enter the IPMI IP address of the
node in a web browser to reach the IPMI web UI for the node.
6. From the console menu, select Virtual Media > Virtual Storage.
8. Click OK.
10. After the node starts up from the mounted ISO, log on as root (you do not need a password).
11. Use the lsscsi command to verify that the SATA DOM is visible and present.
[root@centos6_8_satadom_S670330N ~]# lsscsi
The command returns output similar to the following:
Note: In this example, the SATA DOM-SL 3IE3 is visible, and the device name is /dev/sda.
12. Use the ls command to find the name of the firmware update file in /usr/local. Then untar
the file.
~]# cd /usr/local
~]# ls
~]# tar xvf filename.tar
13. Change to the firmware-image-file directory created when you untarred the file, and confirm which /dev/xxx is the SATA DOM device.
~]# lsscsi | grep SATADOM
14. Apply the firmware. Replace /dev/xxx with the name of the SATA DOM device:
~]# ./mp_64 -d /dev/xxx -c 1 -u -k -r -v 0
Note: The character 1 is the numeral one, and the final character is a zero.
The command typically takes about a minute to complete. Output resembles the following:
**************************************************
* Innodisk MPTool V2.6.4 2018/04/20 *
**************************************************
Model Name : SATADOM-SL 3IE3 V2
Serial Num : BCA11602030230365
FW Version : S560301N
1. TH58TFT0DDLBA8H
Flash: TH58TFT0DDLBA8H
Note: The -r parameter causes kernel output messages that report ATA activity. The output indicates that the device has restarted successfully.
15. After you see a successful upgrade, turn off the node by selecting Power Control > Set
Power Off; then turn the node on again by selecting Power Control > Set Power On.
Note: Once you have updated the firmware, you cannot review the SATA DOM version with
smartctl or other commands until after the node power-cycles.
17. Log on as root (you do not need a password.) Confirm that the firmware version is the
same as shown in the name of the ISO file you downloaded from Nutanix in step 1, where /
dev/xxx is the SATA DOM device:
~]# smartctl -a /dev/xxx | grep -i "Firmware Version"
18. Check the current wear level of the media, where /dev/xxx is the SATA DOM device:
~]# smartctl -a /dev/xxx | grep -i "Media_Wearout_Indicator"
233 Media Wearout Indicator 0x0000 100 000 000 Old_age Offline -
100
~]#
Note: Make a note of the wear level, and review it with Nutanix Support in order to
determine whether the SATA DOM media needs replacing.
19. In the IPMI remote console, open the Virtual Media menu and select Plug Out to unmount
the ISO from the node.
Note: If you are upgrading both the BMC and the BIOS, upgrade the BMC first.
Procedure
1. Update the BMC by following the procedures described in the Nutanix BMC Manual Upgrade
Guide.
2. Update the BIOS by following the procedures described in the Nutanix BIOS Manual
Upgrade Guide.
Procedure
1. Contact Nutanix Support to obtain a link to the HBA firmware update ISO and download
the ISO to your system.
» If the HBA card is in a single-node cluster, follow the procedures in Shutting Down a
Single-node Cluster on page 12.
» If the HBA card is in a multinode cluster, follow the procedures in Shutting Down a
Multinode Cluster on page 14.
3. From the system where you downloaded the update ISO, enter the IPMI IP address of the
node in a web browser to reach the IPMI web UI for the node.
4. In the IPMI web UI, launch the remote console by selecting Remote Control > Console
Redirection.
6. From the console menu, select Virtual Media > Virtual Storage.
8. Click OK.
When the host restarts into the ISO, the sas3flash utility performs the update
automatically.
10. In the Virtual Storage dialog box, click Plug Out to unmount the ISO.
11. Disconnect from the IPMI web UI and restart the host normally.
» If the HBA card is in a single-node cluster, follow the procedures in Starting a Single-
node Cluster on page 39.
» If the HBA card is in a multinode cluster, follow the procedures in Starting a Multinode
Cluster on page 41.
When the update completes, the utility displays controller information similar to the following:
Controller Number : 0
Controller : SAS3008(C0)
PCI Address : 00:00:05:00
SAS Address : 5003048-0-1977-6201
NVDATA Version (Default) : 0e.00.30.28
NVDATA Version (Persistent) : 0e.00.30.28
Firmware Product ID : 0x2221 (IT)
Firmware Version : 14.00.00.00
NVDATA Vendor : LSI
NVDATA Product ID : LSI3008-IT
BIOS Version : 08.31.03.00
UEFI BSD Version : 12.00.00.00
FCODE Version : N/A
Board Name : LSI3008-IT
Board Assembly : N/A
Board Tracer Number : N/A
Note: Make sure that the CVM is not currently being updated (such as with WinSCP or FileZilla)
before you stage the disk firmware binaries.
2. From the CVM prompt, identify an active 10G interface on the host. Make a note of the
interface for use in a later step.
» AHV:
nutanix@cvm$ manage_ovs show_interfaces
» ESXi:
nutanix@cvm$ ssh root@host-IP-address esxcli network nic list
3. Find an unused IP address that is in the same subnet as the CVM. Make a note of it so you
can assign this IP address to the Phoenix ISO in a later step.
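To determine the CVM subnet, you can inspect the CVM network configuration; for example:
nutanix@cvm$ ip addr show eth0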
4. Check to see if the CVM has a VLAN configured. If it does, make a note of the VLAN ID.
b. Make sure that all hosts are part of the metadata ring.
nutanix@cvm$ nodetool -h 0 ring
6. Put the host into maintenance mode and shut down the CVM and the node.
» If the drive is part of a single-node cluster, follow the procedures in Shutting Down a
Single-node Cluster on page 12.
» If the drive is part of a multinode cluster, follow the procedures in Shutting Down a
Multinode Cluster on page 14.
7. From the system where you downloaded the Phoenix ISO, enter the IPMI IP address of the
node in a web browser to reach the IPMI web UI for the node.
10. From the console menu, select Virtual Media > Virtual Storage.
13. From the console menu, restart the node by selecting Power Control > Set Power Reset.
When the node boots into Phoenix, a prompt like the following appears:
phoenix ~#
a. Change your working directory to the directory that contains the do_rescue_shell.sh
script.
For versions of Phoenix earlier than 4.3:
phoenix ~# cd /
For Phoenix 4.3 and later:
phoenix ~# cd /root
17. Open a new SSH session to the CVM where the binaries are staged, and log on as root with
the password nutanix/4u.
18. Copy the binaries from the CVM to the node that you have restarted into Phoenix.
scp firmware_binary root@phoenix_ip_address:/root/
For phoenix_ip_address, use the IP address of the node that you have restarted into Phoenix.
20. Enter the lsscsi command to find the device name of the drive you want to update.
phoenix ~# lsscsi
[0:0:0:0] disk ATA SAMSUNG MZ7KM1T9 104Q /dev/sda
[0:0:1:0] disk ATA SAMSUNG MZ7KM1T9 104Q /dev/sdb
[0:0:2:0] disk ATA ST6000NM0115-1YZ SN04 /dev/sdc
[0:0:3:0] disk ATA ST6000NM0115-1YZ SN04 /dev/sdd
[1:0:0:0] cd/dvd ATEN Virtual CDROM YS0J /dev/sr0
[10:0:0:0] disk ATA SATADOM-SL 3IE3 301N /dev/sde
21. Run the smartctl command (where /dev/sdx is the device name of the drive you want to update). Look in the Information section of the output to find out whether your drive uses a SAS or SATA interface. At the same time, make a note of the current firmware version (so you can verify the version after the update).
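A minimal sketch of the check, assuming /dev/sdc is the target drive:
phoenix ~# smartctl -a /dev/sdc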
22. Download the firmware update binary for your drive from the list in KB 6937.
23. Load the firmware update binary to the Phoenix IP address you assigned in step 16, using a
retrieval command such as wget or scp .
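For example, you can push the binary from the CVM where you staged it, mirroring the earlier scp step (paths are illustrative):
nutanix@cvm$ scp binary-file-name root@phoenix_ip_address:/root/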
» For a SATA drive: At the Phoenix prompt, enter the following command, where binary-
file-name is the firmware update binary and /dev/sdx is the device name of the drive:
Note: The command does not affect any data on your drive. It only updates the
firmware. The --please-destroy-my-drive flag does not actually destroy the drive. It is
safe to proceed with the firmware update.
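A minimal sketch of such a command, assuming the hdparm firmware-download mode that the --please-destroy-my-drive flag belongs to; confirm the exact invocation with Nutanix Support before running it:
phoenix ~# hdparm --fwdownload binary-file-name --yes-i-know-what-i-am-doing --please-destroy-my-drive /dev/sdx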
» For a SAS drive: At the Phoenix prompt, enter the following command, where binary-
file-name is the firmware update binary and /dev/sdx is the device name of the drive:
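A minimal sketch, assuming the sg_write_buffer utility from sg3_utils (mode 7 downloads the microcode with offsets and saves it); confirm the exact invocation with Nutanix Support before running it:
phoenix ~# sg_write_buffer --mode=7 --in=binary-file-name /dev/sdx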
25. Run the smartctl command (where /dev/sdx is the device name of the updated drive) and
look in the Information section of the output to verify the firmware update.
26. In the Virtual Storage dialog box, click Plug Out to unmount the ISO.
27. Disconnect from the IPMI web UI and restart the host normally.
» If the drive is part of a single-node cluster, follow the procedures in Starting a Single-
node Cluster on page 39.
» If the drive is part of a multinode cluster, follow the procedures in Starting a Multinode
Cluster on page 41.
a. Make sure that all hosts are part of the metadata ring.
nutanix@cvm$ nodetool -h 0 ring
CAUTION: To protect your device during the update, follow these restrictions:
• Do not connect over SSH through the NIC while you are updating it.
• Do not turn off your system or disconnect power during the update.
• Do not remove the NIC before the update is complete.
• Do not interrupt the update.
Procedure
# ./mlxfwmanager --query
Querying Mellanox devices firmware ...
Device #1:
----------
Device Type: ConnectX4LX
Part Number: MCX4121A-ACA_Ax
Description: ConnectX-4 Lx EN network interface card; 25GbE dual-port SFP28; PCIe3.0 x8; ROHS
R6
PSID: MT_2420110034
PCI Device Name: mt4117_pciconf0
Base MAC: 0000248a078d61de
Versions: Current Available
FW 14.20.1010 N/A
PXE 3.5.0210 N/A
Status: No matching image found
8. Query the firmware again to make sure that the update succeeded.
# ./mlxfwmanager --query
Procedure
# cd tmp
# reboot
# /opt/mellanox/bin/mst start
Module mst is already loaded
# /opt/mellanox/bin/mst status
MST devices:
# /opt/mellanox/bin/mst status -v
PCI devices:
------------
DEVICE_TYPE MST PCI RDMA NET NUMA
ConnectX4LX(rev:0) mt4117_pciconf0 03:00.0 net-vmnic0
ConnectX4LX(rev:0) mt4117_pciconf0.1 03:00.1 net-vmnic1
ConnectX4LX(rev:0) mt4117_pciconf1 82:00.0 net-vmnic6
ConnectX4LX(rev:0) mt4117_pciconf1.1 82:00.1 net-vmnic7
ConnectX4LX(rev:0) mt4117_pciconf2 83:00.0 net-vmnic2
ConnectX4LX(rev:0) mt4117_pciconf2.1 83:00.1 net-vmnic3
# /opt/mellanox/bin/mlxfwmanager --query
Querying Mellanox devices firmware ...
Device #1:
----------
Device Type: ConnectX4LX
Part Number: MCX4121A-ACA_Ax
Description: ConnectX-4 Lx EN network interface card; 25GbE dual-port SFP28; PCIe3.0
x8; ROHS R6
PSID: MT_2420110034
PCI Device Name: mt4117_pciconf0
Base MAC: 0000248a078d61de
Versions: Current Available
FW 14.20.1010 N/A
PXE 3.5.0210 N/A
Status: No matching image found
# reboot
12. Query the firmware again to make sure that the update succeeded.
# /opt/mellanox/bin/mlxfwmanager --query
1. Go to the Mellanox web site and download the latest version of WinMFT.
C:\Users\Administrator> WinMFTversion.exe
Procedure
2. Log on to vCenter.
5. Right-click the ESXi host in the vSphere client and select Rescan for Datastores. Confirm
that all Nutanix datastores are available.
Procedure
Replace cvm_name with the name of the Controller VM that you found from the preceding
command.
5. If the node is in maintenance mode, log on to the Controller VM and take the node out of
maintenance mode.
nutanix@cvm$ acli
<acropolis> host.exit_maintenance_mode AHV-hypervisor-IP-address
Procedure
1. In Prism, go to the Health page and select Actions > Run NCC Checks.
2. In the dialog box that appears, select All Checks and click Run.
Alternatively, issue the following command from the CVM:
nutanix@cvm$ ncc health_checks run_all
Procedure
1. If the node is off, turn it on by pressing the power button on the front. Otherwise, proceed to
the next step.
2. Log on to vCenter (or to the node if vCenter is not running) with the vSphere client.
7. Right-click the ESXi host in the vSphere client and select Rescan for Datastores. Confirm
that all Nutanix datastores are available.
If the cluster is running properly, output similar to the following is displayed for each node in
the cluster:
CVM: 10.1.64.60 Up
Zeus UP [5362, 5391, 5392, 10848, 10977, 10992]
Scavenger UP [6174, 6215, 6216, 6217]
SSLTerminator UP [7705, 7742, 7743, 7744]
SecureFileSync UP [7710, 7761, 7762, 7763]
Medusa UP [8029, 8073, 8074, 8176, 8221]
DynamicRingChanger UP [8324, 8366, 8367, 8426]
Pithos UP [8328, 8399, 8400, 8418]
Hera UP [8347, 8408, 8409, 8410]
Stargate UP [8742, 8771, 8772, 9037, 9045]
InsightsDB UP [8774, 8805, 8806, 8939]
Procedure
If successful, this command produces no output. If it fails, wait 5 minutes and try again.
nutanix@cvm$ ~/serviceability/bin/esx-start-cvm -s cvm_ip_addr
After starting, the Controller VM restarts once. Wait three to four minutes before you ping
the Controller VM.
Alternatively, you can take the ESXi host out of maintenance mode and start the Controller
VM using the vSphere Web Client.
If the cluster is running properly, output similar to the following is displayed for each node in
the cluster:
CVM: 10.1.64.60 Up
Zeus UP [5362, 5391, 5392, 10848, 10977, 10992]
4. Verify storage.
Procedure
Replace cvm_name with the name of the Controller VM that you found from the preceding
command.
5. If the node is in maintenance mode, log on to the Controller VM and take the node out of
maintenance mode.
nutanix@cvm$ acli
<acropolis> host.exit_maintenance_mode AHV-hypervisor-IP-address
If the cluster is running properly, output similar to the following is displayed for each node in
the cluster:
CVM: 10.1.64.60 Up
Zeus UP [5362, 5391, 5392, 10848, 10977, 10992]
Scavenger UP [6174, 6215, 6216, 6217]
SSLTerminator UP [7705, 7742, 7743, 7744]
SecureFileSync UP [7710, 7761, 7762, 7763]
Medusa UP [8029, 8073, 8074, 8176, 8221]
DynamicRingChanger UP [8324, 8366, 8367, 8426]
Pithos UP [8328, 8399, 8400, 8418]
Hera UP [8347, 8408, 8409, 8410]
Stargate UP [8742, 8771, 8772, 9037, 9045]
InsightsDB UP [8774, 8805, 8806, 8939]
InsightsDataTransfer UP [8785, 8840, 8841, 8886, 8888, 8889, 8890]
Ergon UP [8814, 8862, 8863, 8864]
Cerebro UP [8850, 8914, 8915, 9288]
Chronos UP [8870, 8975, 8976, 9031]
Curator UP [8885, 8931, 8932, 9243]
Prism UP [3545, 3572, 3573, 3627, 4004, 4076]
CIM UP [8990, 9042, 9043, 9084]
AlertManager UP [9017, 9081, 9082, 9324]
Arithmos UP [9055, 9217, 9218, 9353]
Catalog UP [9110, 9178, 9179, 9180]
Acropolis UP [9201, 9321, 9322, 9323]
Atlas UP [9221, 9316, 9317, 9318]
Uhura UP [9390, 9447, 9448, 9449]
Snmp UP [9418, 9513, 9514, 9516]
SysStatCollector UP [9451, 9510, 9511, 9518]
Procedure
1. If the node is off, turn it on by pressing the power button on the front. Otherwise, proceed to
the next step.
2. Log on to the Hyper-V host with Remote Desktop Connection and start PowerShell.
If the cluster is running properly, output similar to the following is displayed for each node in
the cluster:
CVM: 10.1.64.60 Up
Zeus UP [5362, 5391, 5392, 10848, 10977, 10992]
Scavenger UP [6174, 6215, 6216, 6217]
SSLTerminator UP [7705, 7742, 7743, 7744]
SecureFileSync UP [7710, 7761, 7762, 7763]
Medusa UP [8029, 8073, 8074, 8176, 8221]
DynamicRingChanger UP [8324, 8366, 8367, 8426]
Pithos UP [8328, 8399, 8400, 8418]
Procedure
1. In Prism, go to the Health page and select Actions > Run NCC Checks.
2. In the dialog box that appears, select All Checks and click Run.
Alternatively, issue the following command from the CVM:
nutanix@cvm$ ncc health_checks run_all
3. If any checks fail, see the related KB article provided in the output and the Nutanix Cluster
Check Guide: NCC Reference for information on resolving the issue.
Note: Restart Genesis after changing the IPMI configuration. Otherwise, the cluster does not have
access to the IPMI interface.
Procedure
1. Configure the IPMI IP addresses by using either the IPMI web interface or the hypervisor host
command-line interface.
Note: If you are reconfiguring the IPMI address on a node because you have replaced the motherboard, restart Genesis on the Controller VM only for that node.
Procedure
4. Click Save.
The new IPv4 configuration takes effect. This change terminates your connection to the web
interface. To start a new connection, go to the new IP address of the IPMI interface.
Procedure
1. Log on to the hypervisor host with SSH (vSphere or AHV) or remote desktop connection
(Hyper-V).
» vSphere
root@esx# /ipmitool -U ADMIN -P ADMIN lan set 1 ipsrc static
root@esx# /ipmitool -U ADMIN -P ADMIN lan set 1 ipaddr mgmt_interface_ip_addr
root@esx# /ipmitool -U ADMIN -P ADMIN lan set 1 netmask mgmt_interface_subnet_addr
root@esx# /ipmitool -U ADMIN -P ADMIN lan set 1 defgw ipaddr mgmt_interface_gateway
» Hyper-V
> ipmiutil lan -e -I mgmt_interface_ip_addr -G mgmt_interface_gateway
-S mgmt_interface_subnet_addr -U ADMIN -P ADMIN
» AHV
root@ahv# ipmitool -U ADMIN -P ADMIN lan set 1 ipsrc static
root@ahv# ipmitool -U ADMIN -P ADMIN lan set 1 ipaddr mgmt_interface_ip_addr
root@ahv# ipmitool -U ADMIN -P ADMIN lan set 1 netmask mgmt_interface_subnet_addr
root@ahv# ipmitool -U ADMIN -P ADMIN lan set 1 defgw ipaddr mgmt_interface_gateway
• Replace mgmt_interface_ip_addr with the new IP address for the remote console.
• Replace mgmt_interface_gateway with the gateway IP address.
• Replace mgmt_interface_subnet_addr with the subnet mask for the new IP address.
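For example, on AHV with illustrative values:
root@ahv# ipmitool -U ADMIN -P ADMIN lan set 1 ipaddr 10.1.64.40
root@ahv# ipmitool -U ADMIN -P ADMIN lan set 1 netmask 255.255.255.0
root@ahv# ipmitool -U ADMIN -P ADMIN lan set 1 defgw ipaddr 10.1.64.1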
» vSphere
root@esx# /ipmitool -v -U ADMIN -P ADMIN lan print 1
» Hyper-V
> ipmiutil lan -r -U ADMIN -P ADMIN
» AHV
root@ahv# ipmitool -v -U ADMIN -P ADMIN lan print 1
Note: If you are reconfiguring the IPMI address on a node because you have replaced the motherboard, restart Genesis on the Controller VM only for that node.
Procedure
2. Restart the node and press Delete to enter the BIOS setup utility.
There is a limited amount of time to enter BIOS before the host completes the restart
process.
4. Press the down arrow key until BMC network configuration is highlighted and then press
Enter.
5. Press the down arrow key until Update IPMI LAN Configuration is highlighted and press Enter to select Yes.
9. Review the BIOS settings and press F4 to save the configuration changes and exit the BIOS
setup utility.
The node restarts.
Note: If you are reconfiguring the IPMI address on a node because you have replaced the motherboard, restart Genesis on the Controller VM only for that node.
Note: This procedure helps prevent the BMC password from being retrievable on port 49152.
Tip: Although Nutanix does not require the administrator to have the same password on all
hosts, doing so makes cluster management much easier. If you do select a different password for
one or more hosts, make sure to note the password for each host.
Note: The maximum allowed length of the IPMI password is 19 characters, except on ESXi hosts,
where the maximum length is 15 characters.
Note: Do not use the following special characters in the IPMI password: & ; ` ' \ " | * ? ~ < > ^ ( ) [ ]
{ } $ \n \r
Procedure
Change the administrative user password of all IPMI hosts.
Perform these steps on every IPMI host in the cluster.
Note: On ESXi hosts, the maximum allowed length of the IPMI password is 15 characters.
Procedure
In the sample output, the ID of the administrator for which you want to change the password
is 2.
Procedure
3. Determine the user ID of the administrator for which you want to change the password.
:\> ipmicfg-win.exe -user list
In the sample output, the user ID of the administrator for which you want to change the
password is 2.
Procedure
In the sample output, the ID of the administrator for which you want to change the password
is 2.
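The password-change command itself depends on the tool for your hypervisor; as one hedged example using standard ipmitool syntax with user ID 2 (the new password is a placeholder):
root@ahv# ipmitool user set password 2 new_password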
Procedure
The command sets the BMC to factory default and sets the IPMI IP address to DHCP.
3. To reset the IPMI IP address through the BIOS, use the following steps.
4. To reset the IPMI IP address with ipmitool, use the following commands.
Note: IPMI restarts automatically, so you do not need to issue a restart command.
(where xx is the ID shown in the alert) to verify the location before replacing
the drive.
Rescue Shell
Log on to a rescue shell to diagnose issues with the boot device.
To log on to a rescue shell, you must first create the svmrescue.iso image on another node.
Procedure
CAUTION: This procedure is for the boot drive replacement in slot 1 of the node. Do not install
the Controller VM on a metadata drive in slot 2 of the node.
Procedure
2. If the Controller VM is running, right-click the Controller VM and select Power > Power Off.
a. Right-click the Controller VM and select Edit Settings > Hardware > CD/DVD Drive > Device Type > Client Device.
b. Select Options > Advanced > Boot Options > Force BIOS Setup > OK.
c. Click the Console tab of the Controller VM.
d. Right-click the Controller VM and select Power > Power On.
e. When the Controller VM restarts into BIOS, select the Connect/Disconnect the CD/DVD
devices of the virtual machine icon at the top of the console and select CD/DVD Drive >
Connect to ISO image on local disk.
f. Select the svmrescue.iso file on your local system and click OK.
g. In the console, press Esc and select Exit Discarding Changes to exit the BIOS.
4. Choose Rescue Shell from the boot menu and press Enter.
Note: This option reimages the Controller VM, but keeps the metadata (oplog/extent
store) and cold data.
CAUTION: Exercise caution before selecting this option as this option formats and
reimages all the disks.
CAUTION: Only use the print command within parted. Using other commands
could destroy the drive.
lsscsi
This command lists the SCSI devices presented to the Controller VM. The boot
drive is listed as /dev/sda.
As an example, here is what running the lsscsi command looks like on an
NX-3450:
# lsscsi
[2:0:0:0] disk ATA INTEL SSDSC2BA80 0250 /dev/sda
[2:0:1:0] disk ATA INTEL SSDSC2BA80 0250 /dev/sdb
[2:0:2:0] disk ATA ST91000640NS SN03 /dev/sdc
[2:0:3:0] disk ATA ST91000640NS SN03 /dev/sdd
[2:0:4:0] disk ATA ST91000640NS SN03 /dev/sde
[2:0:5:0] disk ATA ST91000640NS SN03 /dev/sdf
• sda3: /home/nutanix
5. Click Browse and locate the ServiceVM ISO file on the local datastore (for example: [NTNX-local-ds-nfs-1-4] ServiceVM-1.25_Centos/ServiceVM-1.25_Centos.iso). Do not select the file or folder that starts with .# (period and pound symbols).
8. Right-click the Controller VM and select Power > Power On. It might take a few minutes for
the logon prompt to appear.
Node
Applicable Platforms
All platforms
Failure Indications
All platforms:
• When trying to start the node, there is no green light on the power button. A
green light indicates power to the node. There may or may not be network/
BMC link lights.
• ESXi experiences a PSOD that indicates a CPU or other hardware error.
• One of the on-board NIC ports is not working.
• A diagnosed memory failure turns out to be a memory slot failure.
• Multiple HDDs have gone offline in a single node but the drives do not report
any errors.
Next Steps:
All platforms:
• Multi-node platforms: If the node does not start, reseat the node. If that does
not resolve the issue, replace the node.
• If the node is on but one or more of the other symptoms are present, replace
the node.
• To troubleshoot NIC issues, see KB article 1088.
Fan
Applicable Platforms
All platforms
Failure Indications
All platforms:
• An alert on the Nutanix UI that a fan has stopped or is running too fast or too slow.
• Running /ipmitool sel list from the ESXi host shows fan errors on fan 1 or
fan 2.
• Running /ipmitool sensor list from the ESXi host shows 0 RPM for either
fan 1 or fan 2 OR shows significantly fewer RPM than another fan, AND there
are temperature alerts.
• All platforms except NX-6000 and NX-9040: ignore failure reports from fan 3
or fan 4. Each node only sees two fans, labeled on all nodes as fan 1 and fan 2.
Memory
Applicable Platforms
All platforms
Failure Indications
All platforms:
• IPMI event log (/ipmitool sel list from the ESXi shell) for a node shows an
uncorrectable ECC memory error for a particular DIMM.
• ESXi:
root@esx# smbiosDump | grep -A 12 -B1 'Bank: ' | egrep \
'Bank|No Memory Installed|Location'
• AHV:
root@ahv# virsh sysinfo | egrep "size|'locator"
The specifications tables for each platform describe the supported DIMM
configurations.
Power Supply
Applicable Platforms
All platforms
Failure Indications
All platforms:
Chassis
Applicable Platforms
All platforms
Failure Indications
All platforms:
• Errors or failures on multiple drives that are not resolved by replacing the drives or the node.
• Errors or a failure on a single drive that are not resolved by replacing the drive.
• A PDB sensor failure: diagnosis of a reported PSU or fan failure indicates that a sensor has failed, not the actual component.
• Physical damage to the chassis.
Next Steps:
All platforms:
Replace the chassis.
5. ADDING A DRIVE
Add a drive to a platform.
Note: The process of adding a drive is the same for all platforms (both Nutanix and third-party
platforms), assuming the platform is running Nutanix AOS.
What types of drives you can add depends on your platform configuration. For supported drive
configurations, see the system specifications for your platform.
• Hybrid: a mixture of SSDs and HDDs. Hybrid configurations fill all available drive slots, so the
only case where you would add a drive is if there is a drive missing.
• All-flash: All-flash nodes have both fully populated and partially populated configurations.
You can add new drives to the empty slots.
Procedure
If the drive is red and shows a label of Unmounted Disk, select the drive and click Repartition
and Add under the diagram.
This message and the button appear only if the replacement drive contains data. Their
purpose is to protect you from unintentionally using a drive with data on it.
CAUTION: This action removes all data on the drive. Do not repartition the drive until you
have confirmed that the drive contains no essential data.
If the cluster has only one storage pool, the disk is automatically added to the storage pool.
a. In the web console, select Storage from the pull-down main menu (upper left of screen)
and then select the Table and Storage Pool tabs.
Procedure
2. Determine which firmware slot to use to download the image, based on the NVMe 1.1b
specification at http://www.nvmexpress.org/.
In this example, the frmw field returned 0x2, which means that the drive supports one
firmware slot and slot 1 is read/write. So, in the next step, specify the slot as 1.
Note: The results vary by firmware version. For example, if instead of 0x2 the frmw field
returned 0x07, then slot 1 is read-only and the drive supports three firmware slots. In that case
you would save the firmware image to either slot 2 or slot 3.
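As an illustration of how the slot choice feeds into the update, a hedged sketch using nvme-cli (the image file name is illustrative; action 1 commits the image to the slot and activates it on the next reset):
[root@localhost p3600_fw]# nvme fw-download /dev/nvme0 --fw=firmware_image.bin
[root@localhost p3600_fw]# nvme fw-commit /dev/nvme0 --slot=1 --action=1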
6. Use the id-ctrl command to verify that the new firmware is active.
[root@localhost p3600_fw]# nvme id-ctrl /dev/nvme0 | grep fr
fr : 8DV101F0
frmw : 0x2
G5 platforms NX-9030-G5
G6 platforms
• NX-3060-G6
• NX-3155G-G6
• NX-3170-G6
• NX-8035-G6
• NX-8155-G6
G7 platforms
• NX-3060-G7
• NX-3155G-G7
• NX-8035-G7
• NX-8150-G7
• NX-8155-G7
• Each node in the cluster must contain two RDMA-enabled Mellanox CX-4 network cards.
Note: Mellanox CX NICs are dual-port cards. One card uses a single port dedicated to RDMA
traffic. The other card uses its ports for CVM and guest VM traffic.
Note: All network cards in a node must be the same type. Nutanix does not support mixing
network cards from different manufacturers, or of different capacities.
• RDMA-enabled cards must be installed at the factory. You cannot add them to a node in the
field.
RDMA is supported on the following hypervisors:
• AHV
• ESXi (starting with AOS 5.11.2)
6 × 32 GB = 192 GB: Fill all blue slots 1A, 1B, 1C, 1D. Also fill 2A, 2B.
6 × 16 GB = 96 GB: Fill all blue slots 1A, 1B, 1C, 1D. Also fill 2A, 2B.
NX-1050, NX-3050, NX-3060, NX-6000, and NX-9040 Series DIMM Connector IDs
This diagram shows the connector IDs for all NX-1050, NX-3050, NX-6000, and NX-9040 series
platforms.
Figure 18: DIMM Connector IDs for NX-1050, NX-3050, NX-3060, NX-6000, and NX-9040
series
NX-1050
Each NX-1050 node supports
NX-7000 Series
Each NX-7000 chassis supports:
NX-8000 Series
Each NX-8000 chassis supports:
• 8 × 16 GB = 128 GB R DIMM (Fill all slot 1: 1A, 1B, 1C, 1D, 1E, 1F, 1G, 1H)
• 16 × 16 GB = 256 GB R DIMM (Fill slots 1 and 2: 1A, 2A, 1B, 2B, 1C, 2C, 1D, 2D, 1E, 2E, 1F, 2F, 1G,
2G, 1H, 2H)
• 24 × 16 GB = 384 GB R DIMM (Fill all slots)
• 16 × 32 GB = 512 GB LR DIMM (Fill slots 1 and 2: 1A, 2A, 1B, 2B, 1C, 2C, 1D, 2D, 1E, 2E, 1F, 2F, 1G,
2G, 1H, 2H)
NX-9040 Series
Each NX-9040 node supports
Note: DIMM slots on the motherboard are most commonly labeled as A1, A2, and so on. However,
some software tools may report DIMM slot labels in a different format, such as 1A, 2A, or CPU1,
CPU2, or DIMM1, DIMM2.
2 CPUs, 16 DIMMs: 1A, 2A, 1B, 2B, 1C, 2C, 1D, 2D, 1E, 2E, 1F, 2F, 1G, 2G, 1H, 2H (For most platforms, fill all slots. For NX-8000, fill all #1 and #2 slots.)
• EXAMPLE: You have an NX-3060-G5 node that has eight 32 GB DIMMs for a total
of 256 GB. You decide to upgrade to sixteen 32 GB DIMMs for a total of 512 GB.
When you remove the node from the chassis and look at the motherboard, you
see that each CPU has four DIMMs. The DIMMs fill all blue DIMM slots, with all black
DIMM slots empty. Remove all DIMMs from one CPU and place them in the empty
DIMM slots for the other CPU. Then place all the new DIMMs in the DIMM slots for
the first CPU, filling all slots. This way you can ensure that the original DIMMs and
the new DIMMs do not share channels.
Note: You do not need to balance numbers of DIMMs from different manufacturers within
a node, so long as you never mix them in the same channel.
DIMM speed
For G5 platforms, Nutanix supports higher-speed replacement DIMMs, under these
conditions:
• You can mix DIMMs that use different speeds in the same node but not in the same
channel. Within a channel, all DIMMs must run at the same speed.
• All installed DIMMs run only at the supported speed of your platform configuration.
If you install a higher-speed replacement DIMM from a later NX generation, arrange the
DIMMs so that the original DIMMs and the higher-speed DIMMs do not share the same
memory channel.
DIMM Performance
Memory performance is most efficient with a configuration where every memory channel
contains the same number of DIMMs. Nutanix supports other configurations, but be aware that
these configurations result in lower performance.
SX-1065-G5
NX-6155-G5 NX-8150-G4/-G5
Three DIMMs per channel (3DPC) memory channels have one blue slot and two black slots
each, as shown in the following figure.
Note: DIMM slots on the motherboard are most commonly labeled as A1, A2, and so on. However,
some software tools report DIMM slot labels in a different format, such as 1A, 2A, or CPU1, CPU2,
or DIMM1, DIMM2.
2 CPUs, 12 DIMMs (unbalanced): A1, B1, C1, D1, E1, F1, G1, H1 (blue slots); A2, B2, E2, F2 (black slots)
2 CPUs, 20 DIMMs (unbalanced): A1, B1, C1, D1, E1, F1, G1, H1 (blue slots); A2, B2, C2, D2, E2, F2, G2, H2 (black slots); A3, B3, E3, F3 (black slots)
DIMM Restrictions
Note: If you install a higher-speed replacement DIMM from a later NX generation, arrange
the DIMMs so that the original DIMMs and the higher-speed DIMMs do not share the same
memory channel.
• You can mix DIMMs from different manufacturers in the same G6 node but not the same
channel. Within a channel, all DIMMs must be from the same manufacturer.
DIMM Performance
Memory performance is most efficient with a configuration where every memory channel
contains the same number of DIMMs. Nutanix supports other configurations, but be aware that
these configurations result in lower performance.
Note: DIMM slots on the motherboard are most commonly labeled as A1, A2, and so on. However,
some software tools report DIMM slot labels in a different format, such as 1A, 2A, or CPU1, CPU2,
or DIMM1, DIMM2.
DIMM Restrictions
Each G7 node must contain only DIMMs of the same type, speed, and capacity.
DIMMs from different manufacturers can be mixed in the same node, but not in the same
channel:
• DIMM slots are arranged on the motherboard in groups called channels. On G7 platforms, all
channels contain two DIMM slots (one blue and one black). Within a channel, all DIMMs must
be from the same manufacturer.
• When replacing a failed DIMM, ensure that you are replacing the old DIMM like-for-like.
• When adding new DIMMs to a node, if the new DIMMs and the original DIMMs are from
different manufacturers, arrange the DIMMs so that the original DIMMs and the new DIMMs
are not mixed in the same channel.
• EXAMPLE: You have an NX-3060-G7 node that has twelve 32 GB DIMMs for a total of 384 GB. You decide to upgrade to twenty-four 32 GB DIMMs for a total of 768 GB. When you remove the node from the chassis and look at the motherboard, you see that each CPU has six DIMMs, filling all blue DIMM slots, with all black DIMM slots empty. Remove all DIMMs from one CPU and place them in the empty DIMM slots for the other CPU. Then place all the new DIMMs in the DIMM slots for the first CPU, filling all slots. This way you can ensure that the original DIMMs and the new DIMMs do not share channels.
Note: You do not need to balance numbers of DIMMs from different manufacturers within a
node, so long as they are never mixed in the same channel.
Note: DIMM slots on the motherboard are most commonly labeled as A1, A2, and so on. However,
some software tools report DIMM slot labels in a different format, such as 1A, 2A, or CPU1, CPU2,
or DIMM1, DIMM2.
Table 9: CPU and DIMM Performance for Intel Haswell CPUs with BIOS 1.0c (G4U-1.0)
(Columns: DIMM information; DIMM speed with low-speed CPUs (E5-2620v3); DIMM speed with high-speed CPUs (E5-2660v3, E5-2680v3, E5-2667v3, E5-2697v3, E5-2699v3). Each speed column lists configurations of 1, 2, and 3 DIMMs per channel.)
License
The provision of this software to you does not grant any licenses or other rights under any
Microsoft patents with respect to anything other than the file server implementation portion of
the binaries for this software, including no licenses or any other rights in any hardware or any
devices or software that are used to communicate with or in connection with this software.
Conventions
Convention            Description
root@host# command    The commands are executed as the root user in the vSphere or Acropolis host shell.
> command             The commands are executed in the Hyper-V host shell.
Version
Last modified: July 24, 2020 (2020-07-24T11:51:09-07:00)