Professional Documents
Culture Documents
RoCE vs. IWARP Final
RoCE vs. IWARP Final
iWARP
A Great Storage Debate
Live Webcast
August 22, 2018
10:00 am PT
Today’s Presenters
The material contained in this presentation is copyrighted by the SNIA unless otherwise
noted.
Member companies and individual members may use this material in presentations and
literature under the following conditions:
Any slide or slides used must be reproduced in their entirety without modification
The SNIA must be acknowledged as the source of any material used in the body of any document containing material
from these presentations.
This presentation is a project of the SNIA.
Neither the author nor the presenter is an attorney and nothing in this presentation is intended
to be, or should be construed as legal advice or an opinion of counsel. If you need legal
advice or a legal opinion please contact your attorney.
The information presented herein represents the author's personal opinion and current
understanding of the relevant issues involved. The author, the presenter, and the SNIA do not
assume any responsibility or liability for damages arising out of any reliance on or use of this
information.
NO WARRANTIES, EXPRESS OR IMPLIED. USE AT YOUR OWN RISK.
© 2018 Storage Networking Industry Association. All Rights Reserved. 4
Agenda
SCSI
“iSCSI” usually means SCSI on TCP/IP over Ethernet
Target SW
© 2018 Storage Networking Industry Association. All Rights Reserved. *Almost always over Ethernet 7
RoCE – Tim Lustig, Mellanox
Designed for enterprise, virtualized, cloud, web 2.0 and storage platforms
RoCE v1
Needs custom settings on the switch
Priority queues to guarantee lossless L2 delivery
Takes advantage of PFC (Priority Flow Control) in DCB Ethernet
VMware
Microsoft SMB 3.0 (Storage Space Direct) and Azure
Oracle
IBM Spectrum Scale (formerly known as IBM GPFS)
Gluster, Lustre, Apache Spark, Hadoop and Ceph
Software-Defined Storage (SDS) and hyperconverged vendors
Nearly all NVMe-oF demonstrations, designs, and customer
deployments are using RoCE
iWARP
iWARP is NOT an acronym
iWARP can be used in different network environments:
LAN, storage network, Data center, or even WAN
http://www.rdmaconsortium.org/home/FAQs_Apr25.htm
© 2018 Storage Networking Industry Association. All Rights Reserved. 17
What is iWARP
Offload Offloads the TCP/IP process from the Eliminates CPU overhead for
TCP/IP CPU to the RDMA-enabled NIC (RNIC) network stack processing
Zero Copy iWARP enables the application to place Significantly relieves CPU load and
the data directly into the destination frees memory bandwidth
application’s memory buffer, without
unnecessary buffer copies
Less Application Context iWARP can bypass the OS and work in Can dramatically reduce application
Switching user space to post the command context switching and latency
directly to the RNIC without the need for
expensive system calls into the OS
NVMe* over Fabrics Initiator/Target Storage – Network block storage Kernel Linux
NVMe over Fabrics Initiator/Target for SPDK Storage – Network block storage User Linux
171
FastLinQ QL41xxx iWARP
155
161 Reduces Live Migra9on Time by 58%
TCP/IP SMBDirect - iWARP
Highly Predictable Migra9ons
Time to Migrate (in Seconds)
98
92 93
51 53 55
69 68 69
Benefits
44 44 43 Shorter Maintenance Windows
25 25 24 Adap9ve Load Balancing – SLAs
2 2 2 4 4 4 6 6 6
Less Flight 9me = Less Risk
Number of Concurrent VM Migrations
UDP TCP
IB network layer IB network layer
IP IP
IB link layer Ethernet link layer Ethernet link layer Ethernet link layer
RoCE portion adopted from “Supplement to InfiniBand Architecture Specification Volume 1 Release 1.2.1, Annex A17: RoCEv2,” September 2, 2014
© 2018 Storage Networking Industry Association. All Rights Reserved. 24
Key Differences
RoCEv2 iWARP
Underlying Network UDP TCP
Adapter Offload
Full DMA
Option Full DMA
and TCP/IP*
Routability Yes Yes
© 2018 Storage Networking Industry Association. All Rights Reserved. * Depending on vendor implementations
25
Key Differences - RoCE
Based on ECN/DCB (RoCEv1) standards that provide a lossless network and the
ability to optimally allocate bandwidth to each protocol on the network
Scalable to thousands of nodes based on Ethernet technologies that are widely used
and well understood by network managers
Big Data
Accelerates Hadoop MapReduce, SPARK Shuffling
Alluxio acceleration
Virtualization
Windows Server Hyper-V
Windows VM live migration acceleration
RoCE iWARP
Transport UPD/IP TCP/IP
Network Lossless Standard
Adapter rNIC (soft-RoCE) NIC
Offload Hardware Hardware
Switch DCB (resilient RoCE) Standard