
VMware vSphere Networking
Nutanix Best Practices

Version 2.0 • September 2018 • BP-2074



Copyright
Copyright 2018 Nutanix, Inc.
Nutanix, Inc.
1740 Technology Drive, Suite 150
San Jose, CA 95110
All rights reserved. This product is protected by U.S. and international copyright and intellectual
property laws.
Nutanix is a trademark of Nutanix, Inc. in the United States and/or other jurisdictions. All other
marks and names mentioned herein may be trademarks of their respective companies.


Contents

1. Executive Summary................................................................................ 5

2. Nutanix Enterprise Cloud Overview...................................................... 6


2.1. Nutanix Acropolis Architecture...................................................................................... 7

3. Introduction to VMware vSphere Networking.......................................8


3.1. vSphere Standard Switch.............................................................................................. 8
3.2. vSphere Distributed Switch........................................................................................... 8
3.3. VMware NSX............................................................................................................... 10

4. Discovery Protocols..............................................................................11

5. Network I/O Control (NIOC).................................................................. 13


5.1. NIOC Version 2........................................................................................................... 13
5.2. vSphere 6.x: NIOC Version 3......................................................................................16
5.3. NIC Teaming and Failover.......................................................................................... 17
5.4. Security........................................................................................................................ 20
5.5. Virtual Networking Configuration Options....................................................................20
5.6. Jumbo Frames.............................................................................................................24

6. Multicast Filtering in vSphere 6.0........................................................28

7. Port Binding with vSphere Distributed Switches............................... 29

8. Network Configuration Best Practices................................................30


8.1. Physical Network Layout............................................................................................. 30
8.2. Upstream Physical Switch Specifications....................................................................30
8.3. Link Aggregation..........................................................................................................31
8.4. Switch and Host VLANs.............................................................................................. 31
8.5. Guest VM VLANs........................................................................................................ 32
8.6. CVM Network Configuration........................................................................................ 32


8.7. Jumbo Frame Configuration........................................................................................ 32


8.8. IP Address Management............................................................................................. 32
8.9. IPMI Ports.................................................................................................................... 32

9. Conclusion............................................................................................. 33

Appendix......................................................................................................................... 34
References.......................................................................................................................... 34
About the Authors............................................................................................................... 34
About Nutanix......................................................................................................................34

List of Figures................................................................................................................36

List of Tables................................................................................................................. 37


1. Executive Summary
With the VMware vSphere hypervisor, you can dynamically configure, balance, or share logical
networking components across various traffic types. Therefore, it’s imperative that we take virtual
networking into consideration when architecting a network solution for Nutanix hyperconverged
appliances.
This best practices guide addresses the fundamental features of VMware’s vSphere switching
technology and details the configuration elements you need to run vSphere on the Nutanix
Enterprise Cloud OS. The guidance in this document can help you quickly and easily develop
an appropriate design strategy for deploying the Nutanix hyperconverged platform with VMware
vSphere 6.x as the hypervisor.

Table 1: Document Version History

Version Number | Published | Notes
1.0 | March 2017 | Original publication.
1.1 | May 2017 | Updated platform overview, virtual networking option diagrams, and Nutanix VLAN IDs.
1.2 | February 2018 | Updated platform overview and added network configuration best practices.
1.3 | March 2018 | Updated Jumbo Frames section.
1.4 | July 2018 | Updated Nutanix overview and vSphere 6.x: NIOC Version 3 section.
2.0 | September 2018 | Added Port Binding with vSphere Distributed Switches section and updated Jumbo Frames section.


2. Nutanix Enterprise Cloud Overview


Nutanix delivers a web-scale, hyperconverged infrastructure solution purpose-built for
virtualization and cloud environments. This solution brings the scale, resilience, and economic
benefits of web-scale architecture to the enterprise through the Nutanix Enterprise Cloud
Platform, which combines three product families—Nutanix Acropolis, Nutanix Prism, and Nutanix
Calm.
Attributes of this Enterprise Cloud OS include:
• Optimized for storage and compute resources.
• Machine learning to plan for and adapt to changing conditions automatically.
• Self-healing to tolerate and adjust to component failures.
• API-based automation and rich analytics.
• Simplified one-click upgrade.
• Native file services for user and application data.
• Native backup and disaster recovery solutions.
• Powerful and feature-rich virtualization.
• Flexible software-defined networking for visualization, automation, and security.
• Cloud automation and life cycle management.
Nutanix Acropolis provides data services and can be broken down into three foundational
components: the Distributed Storage Fabric (DSF), the App Mobility Fabric (AMF), and AHV.
Prism furnishes one-click infrastructure management for virtual environments running on
Acropolis. Acropolis is hypervisor agnostic, supporting three third-party hypervisors—ESXi,
Hyper-V, and XenServer—in addition to the native Nutanix hypervisor, AHV.

Figure 1: Nutanix Enterprise Cloud


2.1. Nutanix Acropolis Architecture


Acropolis does not rely on traditional SAN or NAS storage or expensive storage network
interconnects. It combines highly dense storage and server compute (CPU and RAM) into a
single platform building block. Each building block delivers a unified, scale-out, shared-nothing
architecture with no single points of failure.
The Nutanix solution requires no SAN constructs, such as LUNs, RAID groups, or expensive
storage switches. All storage management is VM-centric, and I/O is optimized at the VM virtual
disk level. The software solution runs on nodes from a variety of manufacturers that are either
all-flash for optimal performance, or a hybrid combination of SSD and HDD that provides a
combination of performance and additional capacity. The DSF automatically tiers data across the
cluster to different classes of storage devices using intelligent data placement algorithms. For
best performance, algorithms make sure the most frequently used data is available in memory or
in flash on the node local to the VM.
To learn more about the Nutanix Enterprise Cloud, please visit the Nutanix Bible and
Nutanix.com.


3. Introduction to VMware vSphere Networking


VMware vSphere supports two main types of virtual switches: the vSphere Standard Switch
(vSwitch) and the vSphere Distributed Switch (vDS). Both switches provide basic functionality,
including the ability to forward layer 2 frames, apply 802.1Q VLAN tagging, shape traffic, and
use more than one uplink (NIC teaming). The main differences between the two virtual switch
types lie in how they are created and managed and in the more advanced functionality the vDS
provides.

3.1. vSphere Standard Switch


The vSwitch is available in all versions of VMware vSphere and is the default method for
connecting VMs, as well as management and storage traffic, to the external (or physical)
network. The vSwitch is part of the standard vSphere licensing model and comes with the
following advantages and disadvantages.
Advantages:
• Simple and easy to configure.
• No reliance on VirtualCenter (vCenter) availability.
Disadvantages:
• No centralized management.
• No support for Network I/O Control (NIOC).
• No automated network load balancing.
• No backup, restore, health check, or network visibility capabilities.

3.2. vSphere Distributed Switch


To address many of the vSwitch’s disadvantages, the vDS separates the networking data
plane from the control plane, enabling advanced networking features such as load balancing,
traffic prioritization, backup and restore capabilities, and so on. These features can certainly be
compelling to an organization, but it’s important to note the following disadvantages of the vDS:
• Requires Enterprise Plus licensing from VMware.
• Because VMware vCenter stores the control plane in this configuration, vCenter Server must be
available to configure and manage the vDS.
• More complex configuration and management.


The vDS has matured over the several version iterations of the vSphere product, providing new
functionality as well as changes to existing components and behaviors.
vSphere 5.x provided the following capabilities and enhancements:
• Inter-VM traffic visibility via the NetFlow protocol.
• LLDP (link layer discovery protocol) support, to provide upstream switch visibility.
• Enhanced link aggregation support, including a variety of hashing algorithms.
• Single-root I/O virtualization (SR-IOV) support.
• Support for 40 Gb network interface cards (NICs).
• Network I/O Control (NIOC) version 2.
In vSphere 6.x, VMware has introduced improvements as well as new features and functionality,
including:
• NIOC version 3.
• Support for IGMP snooping for IPv4 packets and MLD snooping for IPv6 packets.
• Multiple TCP/IP stacks for vMotion.
The following table concisely compares the vSwitch and the vDS.

Table 2: Summary Comparison Between vSwitch and vDS

Switch Type | License | Management Method | NIOC | Private VLANs | Teaming | LLDP | LACP
vSwitch | Standard | Host or vCenter | No | No | Originating port ID; IP hash; Source MAC hash; Explicit failover order | No | No
vDS | Enterprise Plus | vCenter | Yes | Yes | Originating port ID; IP hash; Source MAC hash; Explicit failover order; Route based on physical NIC load | Yes | Yes

3.3. VMware NSX


Running VMware NSX for vSphere on Nutanix improves control and increases the agility of the
compute, storage, virtualization, and network layers. Nutanix has evaluated and tested VMware
NSX for vSphere and developed a technical guide for customers who want to move toward SDN
architecture with Nutanix.


4. Discovery Protocols
From vSphere 5.0 onward, the vDS has supported two discovery protocols: LLDP and CDP
(Cisco discovery protocol). Discovery protocols give vSphere administrators visibility into
connectivity from the virtual network to the physical network, which simplifies troubleshooting in
the event of cabling issues, MTU (maximum transmission unit) or packet fragmentation, or other
problems. The only disadvantage to using discovery protocols is a potential security issue, as
they make both the topology of the network and switch-specific information visible.
VMware vSphere offers three configuration options or attributes for LLDP, as presented in the
table below. Each option changes the information that the selected discovery protocol sends and
receives. CDP is either enabled or not—it does not have additional options.

Table 3: LLDP Discovery Options

Attribute | Description
Listen | In this mode, ESXi detects and displays information from the upstream switch, but it does not advertise the vDS configuration to the upstream network device.
Advertise | In this mode, ESXi advertises information about the vDS to the physical switch, making it available to the switch administrator, but it does not detect or display information about the physical switch.
Both | ESXi both listens and advertises, exchanging information between the vDS and the upstream switch.

Note: Nutanix recommends the Both option, which ensures that ESXi collects and
displays information from the physical switch and sends information about the vDS to
the physical switch.
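
If you manage the vDS programmatically, the following is a minimal sketch of applying that recommendation with pyVmomi. The vCenter address, credentials, and switch name (dvSwitch01) are hypothetical placeholders, and exact data-object names can vary between pyVmomi releases, so treat this as an illustration rather than a definitive implementation.

```python
# Sketch: set the vDS discovery protocol to LLDP with the "both" operation.
# Hypothetical values: vcenter.example.com, the credentials, and dvSwitch01.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab use only; verify certificates in production
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="secret", sslContext=ctx)
content = si.RetrieveContent()

# Locate the distributed switch by name.
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.DistributedVirtualSwitch], True)
dvs = next(d for d in view.view if d.name == "dvSwitch01")

# Build a reconfigure spec that changes only the discovery protocol settings.
spec = vim.dvs.VmwareDistributedVirtualSwitch.ConfigSpec()
spec.configVersion = dvs.config.configVersion
spec.linkDiscoveryProtocolConfig = vim.host.LinkDiscoveryProtocolConfig(
    protocol="lldp", operation="both")
dvs.ReconfigureDvs_Task(spec)

Disconnect(si)
```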

The following information is visible in the vSphere client when you enable discovery protocol
data:
• The physical switch interface connected to the dvUplink.
• MTU size (in other words, whether jumbo frames are enabled).
• The switch management IP address.
• The switch name, description, software version, and capabilities.


The following table presents some general guidelines, but Nutanix recommends that all
customers carefully consider the advantages and disadvantages of discovery protocols for their
specific security and discovery requirements.

Table 4: Recommendations for Discovery Protocol Configuration

Type | Dependent on your switching infrastructure. Use CDP for Cisco-based environments and LLDP for non-Cisco environments.
Operation | Both CDP and LLDP are generally acceptable in most environments and provide maximum operational benefits. Configuring an attribute (listen, advertise, or both) in LLDP allows vSphere and the physical network to openly exchange information. You can also choose not to configure an attribute in LLDP if you don't want this exchange to occur.


5. Network I/O Control (NIOC)


NIOC is a vDS feature that has been available since the introduction of vSphere 4.1. The feature
initially used network resource pools to determine the bandwidth that different network traffic
types could receive in the event of contention. With the introduction of vSphere 6.0, we must
now also take shares, reservations, and limits into account. The following sections detail our
recommendations.

5.1. NIOC Version 2


NIOC version 2 was introduced in vSphere 5.1 and became generally available in vSphere 5.5.
Enabling NIOC divides vDS traffic into the following predefined network resource pools:
• Fault tolerance
• iSCSI
• vMotion
• Management
• vSphere Replication (VR)
• NFS
• VM
The physical adapter shares assigned to a network resource pool determine what portion of the
total available bandwidth is guaranteed to the traffic associated with that network resource pool
in the event of contention. It is important to understand that NIOC only impacts network traffic
if there is contention. Thus, when the network is less than 100 percent utilized, NIOC has no
advantage or disadvantage.
Comparing a given network resource pool’s shares to the allotments for the other network
resource pools determines its available bandwidth. You can apply limits to selected traffic types,
but Nutanix recommends against it, as limits may inadvertently constrain performance for given
workloads when there is available bandwidth. Using NIOC shares ensures that burst workloads
such as vMotion (migrating a VM to a different ESXi host) can complete as quickly as possible
when bandwidth is available, while also protecting other workloads from significant impact if
bandwidth becomes limited.
The following table shows the Nutanix-recommended network resource pool share values.

Note: None of the network resource pools configure artificial limits or reservations,
so leave the host limit set to unlimited.


Table 5: Recommended Network Resource Pool Share Values

Network Resource Pool | Share Value | Physical Adapter Shares
Management traffic | 25 | Low
vMotion traffic | 50 | Normal
Fault tolerance traffic | 50 | Normal
VM traffic | 100 | High
NFS traffic | 100 | High
Nutanix CVM traffic (1) | 100 | High
iSCSI traffic | 100 | High
vSphere Replication traffic | 50 | Normal
vSphere Storage Area Network traffic (2) | 50 | Normal
Virtual SAN traffic (3) | 50 | Normal

(1) This is a custom network resource pool created manually.


(2) This is exposed in vSphere 5.1, but as it has no function and no impact on NIOC, we can
ignore it.
(3) This is exposed in vSphere 5.5, but as Nutanix environments do not use it, it has no impact
on NIOC here.

The sections below describe our rationale for setting the share values in this table.

Note: The calculations of minimum bandwidth per traffic type presented here
assume that the system is not using iSCSI, vSphere Replication, vSphere Storage
Area Network, or Virtual SAN traffic. We also assume that NFS traffic remains
local to the host, so the default "NFS traffic" resource pool does not consume
uplink bandwidth.
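
The subsections that follow quote approximate minimum bandwidths of roughly 1.5, 3, and 6 Gbps. The short Python sketch below reproduces that proportional-share arithmetic under the assumptions in the note above (two 10 Gb uplinks treated as a 20 Gbps aggregate, and only the traffic types that cross the physical NICs under normal conditions); it is illustrative only.

```python
# Proportional-share arithmetic behind the minimum-bandwidth figures quoted
# in the subsections below. Assumes 2 x 10 Gb uplinks and only the pools that
# place traffic on the physical NICs in a default Nutanix deployment.
shares = {
    "Management": 25,
    "vMotion": 50,
    "Fault tolerance": 50,
    "VM": 100,
    "Nutanix CVM": 100,
}

total_bandwidth_gbps = 2 * 10
total_shares = sum(shares.values())

for pool, share in shares.items():
    guaranteed = total_bandwidth_gbps * share / total_shares
    print(f"{pool:15s} {share:3d} shares -> ~{guaranteed:.1f} Gbps minimum under contention")
```

Running the sketch yields approximately 1.5 Gbps for management, 3.1 Gbps for vMotion and fault tolerance, and 6.2 Gbps for VM and CVM traffic, matching the rounded figures used below.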

Management Traffic
Management traffic requires minimal bandwidth. A share value of 25 (low) and two 10 Gb
interfaces ensure a minimum of approximately 1.5 Gbps for management traffic, which exceeds
the minimum bandwidth requirement of 1 Gbps.


vMotion Traffic
vMotion is a burst-type workload that uses no bandwidth until the distributed resource scheduler
(DRS) (a vCenter service that actively rebalances VMs across hosts) invokes it or a vSphere
administrator starts a vMotion or puts a host into maintenance mode. As such, it is unlikely to
have any significant ongoing impact on the network traffic. A share value of 50 (normal) and two
10 Gb interfaces provide a minimum of approximately 3 Gbps, which is sufficient to complete
vMotion in a timely manner without degrading VM or storage performance and is well above the
minimum bandwidth requirement of 1 Gbps.

Fault Tolerance Traffic


Fault tolerance (FT) traffic depends on the number of fault-tolerant VMs you have per host
(current maximum is four per host). FT is generally a consistent workload (as opposed to a burst-
type workload like vMotion), as it needs to keep the primary and secondary VMs’ CPU resources
synchronized. We typically use FT for critical VMs, so to protect FT traffic (which is also sensitive
to latency) from impact during periods of contention, we use a share value of 50 (normal) and
two 10 Gb interfaces. This setting provides a minimum of 3 Gbps, which is well above VMware's
recommended minimum of 1 Gbps.

VM Traffic
VM traffic is the reason we have datacenters in the first place, so this traffic is always important,
if not critical. As slow VM network connectivity can quickly impact end users and reduce
productivity, we must ensure that this traffic has a significant share of the available bandwidth
during periods of contention. With a share value of 100 (high) and two 10 Gb interfaces, VM
traffic receives a minimum of approximately 6 Gbps. This bandwidth is more than what is
required in most environments and ensures a good amount of headroom in case of unexpected
burst activity.

NFS Traffic
NFS traffic is always critical, as it’s essential to the Acropolis DSF and to CVM and VM
performance, so it must receive a significant share of the available bandwidth during periods
of contention. However, as NFS traffic is serviced locally, it does not impact the physical
network card unless the Nutanix CVM fails or is offline for maintenance. Thus, under normal
circumstances, no NFS traffic crosses the physical NICs, so the NIOC share value has no impact
on other traffic. As such, we have excluded it from our calculations.
In case of network contention, CVM maintenance, or failure, we can assign NFS traffic a share
value of 100 (high) as a safety measure, ensuring a minimum of 6 Gbps bandwidth.


Nutanix CVM Traffic


For the Acropolis DSF to function, it must have connectivity to the other CVMs in the cluster for
tasks such as synchronously writing I/O across the cluster and Nutanix cluster management. It
is important to note that, under normal circumstances, minimal or no read I/O traffic crosses the
physical NICs; however, write I/O always utilizes the physical network.
To ensure optimal DSF performance in the unlikely circumstance wherein both 10 Gb adapters
are saturated, we assign a share value of 100 (high), guaranteeing each CVM a minimum of
6 Gbps bandwidth. This bandwidth is more than what is required in most environments and
ensures a good amount of headroom in case of unexpected burst activity.

iSCSI Traffic
Like NFS traffic, iSCSI traffic is not used by default; however, if you are using it, it may be critical
to your environment. As such, we have given this traffic type a share value of 100.

Note: NIOC does not cover in-guest iSCSI. If you are using in-guest iSCSI, we
recommend creating a dvPortGroup for in-guest iSCSI traffic and assigning it to a
custom network resource pool called "In-Guest iSCSI." Assign this pool a share value
of 100 (high).

vSphere Replication Traffic


In a non-Nutanix environment, vSphere Replication (VR) traffic may be critical to your
environment if you choose to use VR (with or without VMware Site Recovery Manager (SRM)).
However, if you are using SRM, we strongly recommend using the Nutanix Storage Replication
Adapter (SRA) instead of VR, as it is more efficient. If you’re using VR without SRM, the default
share value of 50 (normal) should be suitable for most environments, ensuring approximately 3
Gbps of network bandwidth.

vSphere Storage Area Network and Virtual SAN Traffic


vSphere storage area network (SAN) traffic is visible as a parameter in vSphere 5.1, but it is not
used. Therefore, simply leave the default share value, as vSphere SAN traffic has no impact on
NIOC.
Virtual SAN traffic is visible as a parameter from vSphere 5.5 onward, but Nutanix environments
do not use it. Therefore, simply leave the default share value as it is, as virtual SAN traffic also
has no impact on the environment.

5.2. vSphere 6.x: NIOC Version 3


VMware vSphere versions 6.x introduced a newer iteration of NIOC: version 3. Depending on the
version of vSphere 6, NIOC v3 allows an administrator to set not only shares, but also artificial


limits and reservations on system traffic such as vMotion, management, NFS, VMs, and so
on. Administrators can apply limits and reservations to a VM’s network adapter via the same
constructs they used when allocating CPU and memory resources to the VM. Admission control
is now part of the vDS, which integrates with VMware HA and DRS to actively balance network
resources across hosts.
Although setting artificial limits and reservations on various traffic types guarantees quality of
service (QoS) and SLAs, it prevents other workloads from using the total available bandwidth.
Reserving a certain amount of bandwidth strictly for one workload or purpose constrains the
network’s ability to sustain an unreserved workload’s short bursts of traffic.

vSphere 6.0.x: NIOC Version 3


For customers who have implemented vSphere 6.0.x and are considering NIOC v3, it’s important
to note that this combination of versions only works when you use artificial limits and reservations
to prioritize traffic flows.
In the course of testing and evaluating the functionality of NIOC v3 with vSphere 6.0.x, Nutanix
has encountered various abnormalities in its usability and behavior as well as generally higher
CPU overhead. Because of these results, we recommend that customers implement NIOC v2
instead of NIOC v3.
To implement NIOC v2, provision a vDS with 5.5.0 selected as the distributed switch version,
following the proportional shares we described above for NIOC v2. For guidance on creating a
new vDS on your cluster and configuring its settings, please refer to the VMware vSphere 5.5
documentation on enabling NIOC.

vSphere 6.5 Update 1 and Newer: NIOC Version 3


For customers who have implemented vSphere 6.5 update 1 and later, Nutanix fully supports the
vDS operating at version 6.5 in combination with NIOC v3. Nutanix can fully support this version
combination because VMware reverted to using the proportional share algorithm, which means
that you use shares to prioritize traffic types only in the event of contention, rather than limits and
hard reservations. Refer to the table Recommended Network Resource Pool Share Values for
the share values we recommend.
Testing by Nutanix shows only a minimal performance deviation between using vDS 5.5 with
NIOC v2 and using vDS 6.5 with NIOC v3.

5.3. NIC Teaming and Failover


vSphere provides several NIC teaming and failover options. Each of these settings has different
goals and requirements, and we need to understand and carefully consider them before using
them in our vSphere deployments.


The available options are:


1. Route based on originating virtual port.
2. Route based on physical NIC load (also called load-based teaming or LBT).
a. Only available with the vDS.
3. Route based on IP hash.
4. Route based on source MAC hash.
5. Use explicit failover order.

Recommendation for vSwitch: Route Based on Originating Virtual Port


The Route based on originating virtual port option is the default load balancing policy and
does not require an advanced switching configuration such as LACP, Cisco EtherChannel, or HP
teaming, making it simple to implement, maintain, and troubleshoot. It only requires 802.1q VLAN
tagging for secure separation of traffic types. This option’s main disadvantage is that it does not
support load balancing based on network load, so the system always sends traffic from a single
VM to the same physical NIC unless a NIC or upstream link failure causes a failover event.

Recommendation for vDS: Route Based on Physical NIC Load


The Route based on physical NIC load option (LBT) also does not require an advanced
switching configuration such as LACP, Cisco EtherChannel, or HP teaming. It offers fully
automated load balancing, which takes effect when one or more NICs reach and sustain 75
percent utilization for a period of at least 30 seconds, based on egress and ingress traffic. LBT’s
only requirement is 802.1q VLAN tagging for secure separation of traffic types, so it is also
simple to implement, maintain, and troubleshoot.

Note: The vDS requires VMware vSphere Enterprise Plus licensing.

LBT is a simple and effective solution that works very well in Nutanix deployments.

Network Failover Detection


ESXi uses one of two network failover detection methods: beacon probing or link status.
Beacon probing sends out and listens for beacon probes, which are made up of broadcast
frames. Because beacon probing must have three network connections to function, we do not
recommend it for Nutanix solutions, which currently have two NICs of each speed (1 Gb and 10
Gb).
Link status depends on the state that the physical NIC reports for the link. Link status can detect
failures such as a cable disconnection or a physical switch power failure, but it cannot detect
configuration errors or upstream failures that may also result in connectivity issues for VMs.
To avoid these link status limitations related to upstream failures, enable Link state tracking
(if the physical switches support it) or an equivalent, which then enables the switch to pass


upstream link state information back to ESXi, so the link status triggers a link down on ESXi
where appropriate.

Notify Switches
The purpose of the notify switches policy setting is to enable or disable communication by ESXi
with the physical switch in the event of a failover. Choosing Yes for this policy setting means that
when a failover event routes a virtual Ethernet adapter’s traffic over a different physical Ethernet
adapter in the team, ESXi sends a notification to the physical switch to update the lookup tables.
This configuration ensures that failover occurs in a timely manner and with minimal interruption to
network connectivity.

Failback
For customers not using Enterprise Plus or the vDS with load-based teaming, failback can help
rebalance network traffic across the original NIC, which may improve network performance.
The only significant disadvantage of setting failback to Yes is that, in the unlikely event of
network instability (or “flapping”), having network traffic fail back to the original NIC may result in
intermittent or degraded network connectivity.
Nutanix recommends setting failback to Yes when using vSwitch and No when using vDS.

Failover Order
Using failover order allows the vSphere administrator to specify the order in which NICs fail
over by assigning a pNIC to one of three groups: active adapters, standby adapters, or unused
adapters. For example, if all active adapters lose connectivity, the system uses the highest
priority standby adapter, and so on. This feature is only necessary in a Nutanix environment
when performing multi-NIC vMotion.
When configuring multi-NIC vMotion, set the first dvPortGroup used for vMotion to have one
dvUplink active and the other standby. Configure the reverse for the second dvPortGroup used
for vMotion.
For more information, see Multiple-NIC vMotion in vSphere 5 (KB 2007467).
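
As an illustration of that mirrored layout, the sketch below models two hypothetical vMotion dvPortGroups (VMOT-A and VMOT-B) and checks that each dvUplink is active in exactly one of them; the names are placeholders and the check is a planning aid only.

```python
# Mirrored active/standby failover order for multi-NIC vMotion.
# Portgroup and uplink names are hypothetical.
vmotion_portgroups = {
    "VMOT-A": {"active": ["dvUplink1"], "standby": ["dvUplink2"]},
    "VMOT-B": {"active": ["dvUplink2"], "standby": ["dvUplink1"]},
}

# Each uplink should be active in exactly one vMotion portgroup.
active_counts = {}
for pg, order in vmotion_portgroups.items():
    for uplink in order["active"]:
        active_counts[uplink] = active_counts.get(uplink, 0) + 1

for uplink, count in active_counts.items():
    assert count == 1, f"{uplink} is active in {count} vMotion portgroups; expected exactly 1"

print("Failover order is mirrored correctly for multi-NIC vMotion.")
```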

Summary of Recommendations for NIC Teaming and Failover

Table 6: Recommendations for NIC Teaming and Failover

vSphere Distributed Switch (vDS)
Load Balancing | Route based on physical NIC load (LBT)
Network Failover Detection | Link status only (1)
Notify Switches | Yes
Failback | No

vSphere Standard Switch (vSwitch)
Load Balancing | Route based on originating virtual port
Network Failover Detection | Link status only (1)
Notify Switches | Yes
Failback | Yes

(1) Enable link state tracking or the equivalent on physical switches.

5.4. Security
When configuring a vSwitch or vDS, there are three configurable options under security that can
be set to Accept or Reject: promiscuous mode, MAC address changes, and forged transmits.
In general, the most secure and appropriate setting for each of these options is Reject unless
you have a specific requirement for Accept. Two examples of situations in which you might
consider configuring Accept on forged transmits and MAC address changes are:
1. Microsoft load balancing in Unicast mode.
2. iSCSI deployments on select storage types.
The following table offers general guidelines, but Nutanix recommends that all customers
carefully consider their requirements for specific applications.

Table 7: Recommendations for VMware Virtual Networking Security

Promiscuous Mode | Reject
MAC Address Changes | Reject
Forged Transmits | Reject
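
Because exceptions to these defaults are sometimes justified (for example, Microsoft NLB in unicast mode), it can be useful to audit portgroup security policies against the table above. A minimal sketch, assuming a hypothetical inventory dictionary; the portgroup names are illustrative, and in practice you would populate the dictionary from vCenter.

```python
# Flag portgroup security policies that deviate from the recommended defaults.
RECOMMENDED = {
    "promiscuous_mode": "Reject",
    "mac_address_changes": "Reject",
    "forged_transmits": "Reject",
}

# Hypothetical export of portgroup name -> security policy.
portgroup_policies = {
    "VM_40": {"promiscuous_mode": "Reject", "mac_address_changes": "Reject", "forged_transmits": "Reject"},
    "MSNLB_60": {"promiscuous_mode": "Reject", "mac_address_changes": "Accept", "forged_transmits": "Accept"},
}

for pg, policy in portgroup_policies.items():
    deviations = {k: v for k, v in policy.items() if v != RECOMMENDED[k]}
    if deviations:
        # Deviations may be intentional, but they should be documented and reviewed.
        print(f"{pg}: review non-default settings {deviations}")
```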

5.5. Virtual Networking Configuration Options


The following two virtual networking options cover the Nutanix recommended configurations for
both vSwitch and vDS solutions. For each option, we discuss the advantages, disadvantages,
and example use cases. In both of the options below, Nutanix recommends setting all vNICs as
active on the portgroup and dvPortGroup unless otherwise specified.


The following table defines our naming convention, portgroups, and corresponding VLANs as
used in the examples below for various traffic types.

Table 8: Portgroups and Corresponding VLANs

Portgroup | VLAN | Description
MGMT_10 | 10 | VMkernel interface for host management traffic
VMOT_20 | 20 | VMkernel interface for vMotion traffic
FT_30 | 30 | Fault tolerance traffic
VM_40 | 40 | VM traffic
VM_50 | 50 | VM traffic
NTNX_10 | 10 | Nutanix CVM to CVM cluster communication traffic (public interface)
svm-iscsi-pg | N/A | Nutanix CVM to internal host traffic
vmk-svm-iscsi-pg | N/A | VMkernel port for CVM to hypervisor communication (internal)

All Nutanix deployments use an internal-only vSwitch for the NFS communication between the
ESXi host and the Nutanix CVM. This vSwitch remains unmodified regardless of the virtual
networking configuration for ESXi management, VM traffic, vMotion, and so on.

Tip: Do not modify the internal-only vSwitch (vSwitch-Nutanix). vSwitch-Nutanix
facilitates internal communication between the CVM and the hypervisor.

The configurations for vSwitch and vDS solutions look very similar—the main difference is the
location of the control plane.
• With the default vSwitch, there is a control plane on each host, and each operates
independently. This independence requires an administrator to configure, maintain, and
operate each individual vSwitch separately.
• With the vDS, the control plane is located within vCenter. This centrality requires the system to
keep vCenter online to ensure that configuration and maintenance activities propagate to the
ESXi hosts that are part of the vDS instance.
Both options offer the advantage of reduced cabling and switching requirements (because they
do not require 1 Gb ports). Moreover, they represent a simple solution that only requires 802.1q


configured on the physical network. These options are suitable for environments where logical
separation of management and VM traffic is acceptable.

Option 1: vSphere Standard Switch with Nutanix vSwitch Deployment


Option 1 is the default configuration for a Nutanix deployment and suits most use cases. This
option is a good fit for customers who do not have VMware vSphere Enterprise Plus licensing or
those who prefer not to use the vDS.
Customers may wish to incorporate the 1 Gb ports into the vSwitch as a means of providing
additional redundancy; however, this configuration requires additional cabling and switch ports. If
you connect the 1 Gb ports, set them to Standby and configure failback as noted above.
Option 1 offers several benefits; the most important benefit is that the control plane is located
locally on every ESXi host, so there is no dependency on vCenter.
A major disadvantage of the vSwitch lies in its load balancing capabilities. Although both physical
uplinks are active, traffic across these pNICs does not actively rebalance unless the VM has
gone through a power cycle or a failure has occurred in the networking stack. Furthermore, a
standard vSwitch cannot manage contention or prioritize network traffic across various traffic
classes (for example, storage, vMotion, VM, and so on).

Figure 2: Virtual Networking Option 1: One vSwitch with vSwitch-Nutanix

Recommended use cases:


1. When vSphere licensing is not Enterprise Plus.
2. When physical and virtual server deployments are not large enough to warrant a vDS.


Option 2: vSphere Distributed Switch (vDS) with Nutanix vSwitch Deployment


Option 2 is a good fit for customers using VMware vSphere Enterprise Plus licensing and offers
several benefits:
• Advanced networking features, such as:
# Network I/O Control (NIOC).
# Load-based teaming (route based on physical NIC load).
# Actively rebalancing flows across physical NICs.
• Ability for all traffic types to “burst” where required, up to 10 Gbps.
• Effective with both egress and ingress traffic.
• Lower overhead than IP hash load balancing.
• Reduced need for NIOC to intervene during traffic congestion.
Because the control plane is located within vCenter, vDS administration requires vCenter to
be highly available and protected, as its corresponding database stores all configuration data.
Particularly during disaster recovery operations, the requirement for vCenter availability can pose
additional challenges; many administrators see this constraint as the primary disadvantage to
using the vDS.
Recommended use cases:
1. When vSphere licensing is Enterprise Plus.
2. Where there is a requirement for corresponding VMware products or services.
3. Large physical and virtual server deployments.


Figure 3: Virtual Networking Option 2: vDS with vSwitch-Nutanix

5.6. Jumbo Frames


Ethernet frames with a payload larger than 1,500 bytes are called jumbo frames and can reduce
network overhead and increase throughput by reducing the number of frames on the network,
each carrying a larger payload. The conventional jumbo frame size is 9,000 bytes of Ethernet
payload.
The Nutanix CVM uses the standard Ethernet MTU (maximum transmission unit) of 1,500 bytes for all
internal and external network traffic by default. Internal traffic travels between the ESXi VMK port
(192.168.5.1/24) and the CVM’s internal interface (eth1) on the 192.168.5.2/24 network. External
traffic travels between CVMs (via the eth0 interface) to maintain the desired replication factor,
cluster state, and required components, as well as applications or servers using the Nutanix
Volumes feature.

Note: To learn more about cluster components, please visit The Nutanix Bible.

Jumbo frames are an optional configuration parameter for external CVM traffic only.
Because the standard 1,500-byte MTU delivers excellent performance, Nutanix recommends
that you retain this default configuration; however, increasing the MTU can be beneficial in
some scenarios. For example, you can configure jumbo frames in your network when using


Nutanix Volumes to connect high-performance external servers to Nutanix volume groups over
iSCSI. This configuration increases efficiency for applications such as databases connecting to
Volumes.
In situations where you encounter high CPU contention or very high peak traffic loads on the
NICs, using jumbo frames can also help eliminate performance bottlenecks.
If you’re considering implementing jumbo frames (in applications using Nutanix Volumes
features), keep in mind that jumbo frame traffic that crosses between VLANs and subnets can
seriously impact the core network, depending on its configuration and hardware.
We recommend seeking guidance from your network team before enabling jumbo frames.

Tip: Carefully examine the traffic paths of Nutanix Volumes iSCSI traffic when
crossing layer-3 domains.

Note: Nutanix currently does not recommend jumbo frames unless specifically
required by high-performance Nutanix Volumes iSCSI workloads or guest VMs.
When switching from 1,500-byte frames to 9,000-byte frames, performance
improvements are generally not significant in most cases.

Be sure to plan carefully before enabling jumbo frames. The entire network infrastructure,
Nutanix cluster, and components that connect to Nutanix must all be configured properly to avoid
packet fragmentation, poor performance, or disrupted connectivity. For configuration details on
increasing the network MTU to enable jumbo frames on ESXi hosts, please refer to VMware KB
1038827 and VMware KB 1007654. To modify MTU size on the CVM, see Nutanix KB 1807.
Work with Nutanix support before enabling jumbo frames.

Note: You can choose to enable jumbo frames for only specific traffic types;
however, ensure that jumbo frames are configured end to end.

VMware vSphere and VMware NSX


VMware NSX for vSphere requires jumbo frames on hosts transmitting VXLAN traffic. In this
case, increase the hypervisor and physical switch MTU to at least 1,600 bytes to accommodate
the larger VXLAN frame size, and keep this overhead in mind: when the physical network MTU
increases to 9,000 bytes, the largest VM MTU is 8,950 bytes because of VXLAN's 50-byte
overhead.

Note: Increasing the MTU of the Nutanix CVM is not required when using NSX
for vSphere. To understand the requirements when installing NSX in further detail,
consult the solution note VMware NSX for vSphere.
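
A small sketch of the headroom arithmetic above, treating the VXLAN encapsulation overhead as a flat 50 bytes as in the example:

```python
# Largest inner (guest VM) MTU that avoids fragmentation over VXLAN.
VXLAN_OVERHEAD_BYTES = 50

def max_guest_mtu(physical_mtu: int) -> int:
    """Return the largest guest MTU that fits inside the physical MTU once
    VXLAN encapsulation is added."""
    return physical_mtu - VXLAN_OVERHEAD_BYTES

print(max_guest_mtu(1600))  # 1550 -> a standard 1,500-byte guest MTU fits
print(max_guest_mtu(9000))  # 8950 -> matches the example above
```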


Example End-to-End Configuration with Jumbo Frames


The following diagram shows a 9,216-byte MTU configured on the physical network, with two
ESXi hosts configured with jumbo frames where appropriate. The figure below illustrates how
jumbo frames and standard frames can coexist and communicate in a Nutanix deployment.

Figure 4: End-to-End Configuration with Jumbo Frames

Note: Nutanix supports using a standard (not jumbo) 1,500-byte MTU. Performance
using this MTU is still excellent.

The following table shows the various traffic types in a Nutanix environment with VMware
vSphere, as well as the Nutanix-recommended MTU.

Table 9: Traffic Types

Traffic Types | Jumbo Frame Recommendation
Nutanix CVM internal traffic; vMotion / multi-NIC vMotion; fault tolerance; iSCSI; NFS; VXLAN (2) | Suited to jumbo frames (1) (MTU 9,000)
ESXi management; VM traffic (3) | Not suited to jumbo frames (MTU 1,500)

(1) Assumes that traffic types are not routed.


(2) Minimum MTU supported is 1,524 bytes; Nutanix recommends >=1,600 bytes.
(3) Can be beneficial in selected use cases, but generally not advantageous.

In summary, jumbo frames offer the following benefits:


• Reduced number of interrupts for the switches to process.


• Lower CPU utilization on the switches and on Nutanix CVMs.
• Increased performance for some workloads, such as the CVM, vMotion, and fault tolerance.
• Performance with jumbo frames is either the same or better than without them.

Implications of Jumbo Frames


The converged network must be configured for jumbo frames, and administrators must validate
the configuration to ensure that jumbo frames are enabled properly end to end, so that no packet
fragmentation occurs. Fragmentation typically results in degraded performance, due to the higher
overhead it brings to the network.
Nutanix advises all customers to carefully consider their specific requirements, but when jumbo
frames are supported on all devices in your network, we suggest the following guidelines (a
validation sketch follows the list):
• Physical switches and interfaces: configure an MTU of 9,216 bytes.
• VMK for NFS or iSCSI: configure an MTU of 9,000 bytes (if applicable).
• VMK for vMotion or fault tolerance: configure an MTU of 9,000 bytes (if applicable).
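
The sketch below illustrates the kind of end-to-end check those guidelines imply, using a hypothetical inventory of the devices in a single traffic path; on a live system you would confirm the same thing with a don't-fragment ping at the jumbo payload size.

```python
# Verify that every hop in a path supports the intended frame size so jumbo
# frames never hit a smaller-MTU link and fragment or drop.
# The device names and values are a hypothetical inventory.
path_mtus = {
    "vmk1 (vMotion)": 9000,
    "vDS / dvUplinks": 9000,
    "Top-of-rack switch A": 9216,
    "Top-of-rack switch B": 9216,
}

payload_mtu = 9000  # MTU configured on the sending VMkernel interface

for device, mtu in path_mtus.items():
    status = "OK" if mtu >= payload_mtu else "TOO SMALL - frames will fragment or drop"
    print(f"{device:22s} MTU {mtu:5d}  {status}")
```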


6. Multicast Filtering in vSphere 6.0


When using vSphere 6.0 or later, administrators have the option to configure advanced multicast
filtering within the vDS. Prior to vSphere 6.0, multicast forwarding on a vSwitch was imprecise:
VMs that had not subscribed to a multicast group could still receive that group's traffic.
To overcome these traffic concerns, vSphere 6.0 introduced two configurable filtering modes:
• Basic mode: Applies to the vSwitch and the vDS. This mode forwards multicast traffic for VMs
according to the destination MAC address of the multicast group.
• Multicast snooping: Applies only to the vDS and uses IGMP snooping to route multicast traffic
only to VMs that have subscribed to its source. This mode listens for IGMP network traffic
between VMs and hosts and maintains a map of the group’s destination IP address and the
VM’s preferred source IP address (IGMPv3).
# When a VM does not renew its membership to a group within a certain period of time, the
vDS removes the group’s entry from the lookup records.

Note: Enabling multicast snooping on a vDS with a Nutanix CVM attached affects its
ability to discover and add new nodes in the cluster (via the Cluster Expand option in
Prism and the Nutanix CLI).

If you need to implement multicast snooping, your standard operational procedures should
include manual steps for adding additional nodes to a cluster. For more information, refer to
Nutanix KB 3493 and VMware’s vSphere 6.0 documentation on multicast filtering modes.


7. Port Binding with vSphere Distributed Switches


When connecting VMs to a vSphere Distributed Switch (vDS), administrators have three options
for defining how a virtual port on a vDS is assigned to a VM.
• Static binding
Static binding is the default setting, whereby once a VM connects to the vDS, a port is
immediately assigned and reserved for it. Because the port is only disconnected when you
remove the VM from the port group, static binding guarantees virtual network connectivity.
However, static binding also requires that you perform all connect and disconnect operations
through vCenter Server.

Note: Because dynamic binding was deprecated in ESXi 5.0, customers on


subsequent versions should now use static binding.

• Dynamic binding
When using dynamic binding, a port is only assigned to a VM when the VM is powered on
and its NIC is in a connected state. Unlike static binding, when the VM is powered off or the
VM's NIC is disconnected, the vDS port also disconnects. As with static binding, you must
use vCenter to perform power operations on VMs connected to a port group configured with
dynamic binding.
• Ephemeral binding
When you configure virtual ports with ephemeral binding, the host only creates a port
and assigns it to a VM when the VM is powered on and its NIC is in a connected state.
Subsequently, when the VM powers off or the VM’s NIC is disconnected, the port is deleted.
Ephemeral binding does not rely on vCenter availability—either ESXi or vCenter can assign
or remove ports—so administrators have increased flexibility in managing their environment,
particularly in a disaster recovery situation. To preserve performance and scalability, only use
ephemeral ports for disaster recovery, not for production workloads.

Note: Do not configure ephemeral binding for the Nutanix CVM’s management port
group. This configuration causes the CVM’s upgrade process to hang or fail. The
Nutanix Cluster Check (NCC) service notifies administrators when ephemeral binding
is set up for the CVM management port group.
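
A small sketch of that kind of check, assuming a hypothetical export of portgroup binding types; the value names shown are the ones the vSphere API reports for static, dynamic, and ephemeral binding.

```python
# Flag dvPortGroups whose binding type could break the Nutanix CVM upgrade
# workflow, mirroring the NCC check described above.
# Hypothetical export of portgroup name -> binding type, where the API reports
# "earlyBinding" (static), "lateBinding" (dynamic, deprecated), or "ephemeral".
portgroup_bindings = {
    "NTNX_10": "ephemeral",       # CVM management portgroup: should be static
    "VM_40": "earlyBinding",
    "DR-Recovery": "ephemeral",   # acceptable: reserved for disaster recovery
}

CVM_PORTGROUPS = {"NTNX_10"}

for pg, binding in portgroup_bindings.items():
    if pg in CVM_PORTGROUPS and binding != "earlyBinding":
        print(f"WARNING: {pg} uses {binding} binding; use static (earlyBinding) for CVM portgroups")
```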

For more information regarding port binding in vSphere, refer to VMware KB 1022312
(https://kb.vmware.com/s/article/1022312).


8. Network Configuration Best Practices


When designing the logical networking attributes of any vSphere deployment, pay close attention
to the physical networking requirements and to the impact of configuration decisions.

Note: The following recommendations and considerations apply to all switch


vendors; however, be sure to consult your switch vendor’s implementation guides for
specifics on how to enable these features and functionalities.

8.1. Physical Network Layout


• Use redundant top-of-rack switches in a leaf-spine architecture. This simple, flat network
design is well suited for a highly distributed, shared-nothing compute and storage architecture.
# If you need more east-west traffic capacity, add spine switches.
• Add all the nodes that belong to a given cluster to the same layer 2 network segment.
• Use redundant 40 Gbps (or faster) connections to ensure adequate bandwidth between
upstream switches.
• When implementing a three-tier networking topology, pay close attention to the
oversubscription ratios for uplinks while also taking into consideration that the spanning tree
blocks ports to prevent network loops.

8.2. Upstream Physical Switch Specifications


• Connect the 10 Gb or faster uplink ports on the ESXi node to nonblocking, datacenter-class
switch ports capable of providing line-rate traffic throughput.
• Use an Ethernet switch that has a low-latency, cut-through design, and that provides
predictable, consistent traffic latency regardless of packet size, traffic pattern, or the
features enabled on the 10 Gb interfaces. Port-to-port latency should be no higher than two
microseconds.
• Use fast-convergence technologies (such as Cisco PortFast) on switch ports that are
connected to the ESXi host.
• To prevent packet loss from oversubscription, avoid switches that use a shared port-buffer
architecture.
• Enable LLDP or CDP to assist with troubleshooting and operational verification.


• Implement BPDU guard to ensure that devices connected to the STP boundary do not
influence the topology or cause BPDU-related attacks.
# Note: You should also enable the BPDU filter on the ESXi hosts that are connecting to the
physical switch.
# Do not use this feature in deployments where you may wish to run software-based bridging
functions within VMs across multiple vNICs. For more information, see VMware KB
2047822.

8.3. Link Aggregation


There are two possible ways to implement link aggregation: LAG (link aggregation group) and
LACP (link aggregation control protocol). When aggregating links, you bundle two or more
physical links between a server and a switch or between two switches to increase the overall
bandwidth and provide link redundancy. LAG is a static configuration, while LACP is a control
protocol for the automatic negotiation of link aggregation. Both of these methods are functions of
the vDS and are supported in vSphere 5.1 and later.
Important notes regarding link aggregation:
• An ESXi host can support up to 32 LAGs. However, in real-world environments, the number of
supported LAGs per host depends on the number of ports that can be members of an LACP
port channel in the physical switch.
• When configuring LACP on the physical switch, the hashing algorithm must match what is
configured within ESXi’s vDS.
• All physical NICs connected to the LACP port channel must be configured with the same
speed and duplex settings.
• When using the Nutanix Foundation process to image hosts, temporarily disable LACP. LACP
is not supported in the current Foundation release.
• Consult VMware’s KB 2051307 for a complete understanding of LACP and its compatibility
with vSphere versions.
• Set the mode for all members that are part of the LACP group to Active. This setting
ensures that both parties (ESXi host and switch) can actively send LACP data units and thus
successfully negotiate the link aggregation.

8.4. Switch and Host VLANs


• Keep the CVM and ESXi host in the same VLAN. By default, the CVM and the hypervisor are
assigned to VLAN 0, which effectively places them on the native untagged VLAN configured
on the upstream physical switch.


• Configure switch ports connected to the ESXi host as VLAN trunk ports.
• On switch ports facing ESXi hosts, configure a dedicated native (untagged) VLAN other than
VLAN 1, and use it to carry only CVM and ESXi management traffic.

8.5. Guest VM VLANs


• Configure guest VM networks on their own VLANs.
• Use VLANs other than the dedicated CVM and ESXi management VLAN.
• Use VM NIC VLAN trunking (virtual guest tagging) only in cases where guest VMs require
multiple VLANs on the same NIC. In all other cases, add a new VM NIC with a single VLAN in
access mode to bring new VLANs to guest VMs.

8.6. CVM Network Configuration


• Do not remove the CVM from the internal vSwitch-Nutanix.
• If required for security, use the network segmentation feature to add a dedicated CVM
backplane VLAN with a nonroutable subnet. This separates CVM storage backplane traffic
from CVM management traffic.

8.7. Jumbo Frame Configuration


• See KB 1807 and contact Nutanix support for jumbo frame configuration.
• If using Oracle RAC or VMware’s NSX solution, be sure to configure jumbo frames end to end.

8.8. IP Address Management


• Coordinate the configuration of IP address pools to avoid address overlap with existing
network DHCP pools.

8.9. IPMI Ports


• Do not allow multiple VLANs on switch ports that connect to the IPMI interface. For
management simplicity, only configure the IPMI switch ports as access ports in a single VLAN.


9. Conclusion
Virtual networking within a vSphere 6.x environment plays an important role in infrastructure
availability, scalability, performance, management, and security. With hyperconverged
technology, customers need to evaluate their networking requirements carefully against the
various configuration options, implications, and features within the VMware vSphere 6.x
hypervisor.
In most deployments, Nutanix recommends the vSphere Distributed Switch (vDS) coupled with
Network I/O Control (version 2) and load-based teaming, as this combination provides simplicity,
ensures traffic prioritization in the event of contention, and reduces operational management
overhead.
With a strategically networked vSphere deployment on Nutanix, customers can be confident that
their networking configuration conforms to field-tested best practices.
For feedback or questions, please contact us using the Nutanix NEXT Community forums.


Appendix

References
1. VMware vSphere 6.0 Documentation
2. Nutanix Jumbo Frames
3. Performance Evaluation of NIOC in VMware vSphere 6.0
4. Securing Traffic Through Network Segmentation

About the Authors


Richard Arsenian is a Senior Staff Solution Architect within the Solutions and Performance R&D
engineering team at Nutanix. As a Nutanix Platform Expert (NPX #09) and VMware Certified
Expert (VCDX #126), Richard brings a wealth of experience within the datacenter across the
various elements of the systems development life cycle (presales, consulting, and support),
due to his experience working at key technology vendors including VMware and Cisco. While
maintaining his passion within the datacenter, Richard also integrates his R&D efforts with
autonomous aerial and motor vehicles; he most recently released the concept of the Nutanix
Acropolis 1 (patent pending), a platform for delivering applications autonomously, via his
Community Edition with Intel NUC project. Follow Richard on Twitter @richardarsenian.
Josh Odgers has a proven track record in virtualization and cloud computing based on
over 13 years of IT industry experience, including 7 years focused on virtualization. His
experience includes leading SMB and enterprise customers through their cloud computing and
hyperconvergence journeys. Josh is a virtualization evangelist and industry thought leader
who enjoys giving back to the community by speaking at industry events and via blogging
and tweeting. Josh has been awarded the titles of vExpert in 2013 and vChampion (2011–
2014). In his role as Principal Solutions Architect at Nutanix, Josh is responsible for developing
reference architectures for solutions, mainly focused on virtualizing business-critical applications.
Josh works with customers at all levels of business, including C-level executives and senior
management, to ensure that the chosen solution delivers technical benefits and excellent return
on investment (ROI). Follow Josh on Twitter @josh_odgers.

About Nutanix
Nutanix makes infrastructure invisible, elevating IT to focus on the applications and services that
power their business. The Nutanix Enterprise Cloud OS leverages web-scale engineering and
consumer-grade design to natively converge compute, virtualization, and storage into a resilient,


software-defined solution with rich machine intelligence. The result is predictable performance,
cloud-like infrastructure consumption, robust security, and seamless application mobility for a
broad range of enterprise applications. Learn more at www.nutanix.com or follow us on Twitter
@nutanix.


List of Figures
Figure 1: Nutanix Enterprise Cloud................................................................................... 6

Figure 2: Virtual Networking Option 1: One vSwitch with vSwitch-Nutanix..................... 22

Figure 3: Virtual Networking Option 2: vDS with vSwitch-Nutanix...................................24

Figure 4: End-to-End Configuration with Jumbo Frames................................................ 26


List of Tables
Table 1: Document Version History.................................................................................. 5

Table 2: Summary Comparison Between vSwitch and vDS............................................. 9

Table 3: LLDP Discovery Options................................................................................... 11

Table 4: Recommendations for Discovery Protocol Configuration.................................. 12

Table 5: Recommended Network Resource Pool Share Values.....................................14

Table 6: Recommendations for NIC Teaming and Failover............................................ 19

Table 7: Recommendations for VMware Virtual Networking Security............................. 20

Table 8: Portgroups and Corresponding VLANs............................................................. 21

Table 9: Traffic Types......................................................................................................26

