Enterprise Flash Storage Annual Update
Or how the data center is replacing spinning rust with solid state
Howard Marks
Chief Scientist
Santa Clara, CA
August 2017
Your Not-So-Humble Speaker
Flash Has Won
2010
• Still a point solution
• Limited data services
• Go fast for special cases

2012
• Becoming cost effective
• 100K+ IOPS
• Consistent sub-millisecond latency
• Data reduction

2016
• Flash is mainstream
• Full data services & data reduction
• Cost effective for most applications
• New solutions for new applications
The All Flash Data Center?
SSD Advances
§ Field-configurable SSDs
• The SSD has x GB of flash; the user chooses the balance between usable capacity and endurance (see the sketch after this list)
• For sophisticated users, like webscalers
§ Host-managed SSDs
• Give the host system control of garbage collection
• The host can avoid writing to a drive that is busy with garbage collection
• More consistent latency
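As a rough illustration of that capacity-vs-endurance knob, here is a minimal sketch in Python. The write-amplification formula is a common rule-of-thumb approximation for steady-state random writes with greedy garbage collection, and every drive figure is hypothetical rather than taken from any shipping SSD:

```python
# Sketch: shrinking usable capacity (more over-provisioning) buys endurance.
# Uses the rule-of-thumb WA ~= (1 + OP) / (2 * OP) for steady-state random
# writes with greedy garbage collection; real firmware behaves differently.
# All figures below are hypothetical.

RAW_GB = 1024      # total flash on the drive (the "x GB" on the slide)
PE_CYCLES = 3000   # rated program/erase cycles per cell

for usable_gb in (960, 896, 768, 512):
    op = (RAW_GB - usable_gb) / usable_gb    # over-provisioning ratio
    wa = (1 + op) / (2 * op)                 # approximate write amplification
    tbw = RAW_GB * PE_CYCLES / wa / 1024     # host TB written before wear-out
    print(f"{usable_gb:4d} GB usable: OP={op:6.1%}  WA~{wa:4.1f}  ~{tbw:,.0f} TBW")
```

In this model, configuring the drive down to half its raw capacity pushes write amplification toward 1, roughly an 8x endurance gain over a capacity-optimized ~7% over-provisioning configuration.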
(Last year) The Future is PCIe
§ PCIe offers:
• Low latency, high bandwidth, RDMA
§ PCIe switch chips
• PLX and PMC – 96-lane
§ Use for:
• Controller-to-controller link
• U.2 SSDs in a storage system
• Rack-scale switched system (DSSD)
• External PCIe standards exist
The Near Future is NVMe
§ Gen 1 and 2 PCIe SSDs
• AHCI (SATA command set)
• Proprietary (Fusion-io, Virident) with heavy software
§ Enter NVM Express
• A new software protocol for non-volatile memory access
§ Lower compute overhead than SCSI
§ 64K queues of 64K entries vs SCSI's one queue of 32 entries (see the sketch after this list)
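To put those queue counts in perspective, here is a minimal back-of-the-envelope sketch in Python. The queue limits come from the AHCI and NVMe specifications; the per-core layout at the end is the common NVMe driver pattern, with a hypothetical queue depth:

```python
# Command parallelism: legacy AHCI/SATA vs NVMe.

AHCI_QUEUES, AHCI_DEPTH = 1, 32          # one command queue, 32 slots
NVME_QUEUES, NVME_DEPTH = 65535, 65536   # up to 64K I/O queues of 64K entries

print(f"AHCI/SATA max outstanding commands: {AHCI_QUEUES * AHCI_DEPTH:,}")
print(f"NVMe max outstanding commands:      {NVME_QUEUES * NVME_DEPTH:,}")

# In practice a driver creates one submission/completion queue pair per CPU
# core, so cores submit I/O without contending on a shared lock:
cores = 32                # hypothetical host
per_core_depth = 1024     # hypothetical per-queue depth chosen by the driver
print(f"{cores} cores x {per_core_depth} entries = "
      f"{cores * per_core_depth:,} in-flight I/Os, no lock contention")
```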
NVMe = Lower Overhead & Latency
A New Tier 0
§ We've redefined tier 1
• 100,000s of IOPS
• Sub-millisecond latency
• 100s of TB usable capacity
• Rich data services
• Back to $/GB
§ A new tier 0 emerges
• Application resiliency
§ NVMe over network

[Diagram: storage tier pyramid – Tier 0: flash (SSD); Tier 1: 15K RPM disk; Tier 2: 7200 RPM disk]
New Tier 0 (2016)
§ EMC DSSD
• Rack-scale (48 hosts) switched PCIe
• Very custom hardware
• Block, key-value, and direct-memory APIs
§ Several NVMe over network startups
• Apeiron – 40Gbps Ethernet switch in a JBOF
• E8 – Dual-controller array – basic services
• Mangstor – x86 NVMEoF target
• Excelero – Low-CPU SDS, RDMA
NVMe Over Fabrics (NVMEoF)
§ Extends/encapsulates NVMe semantics over:
• Ethernet with RDMA
– RoCE – RDMA over Converged Ethernet
– iWARP – RDMA over TCP
• Fibre Channel
• InfiniBand (no products yet announced)
§ Adds namespaces and discovery
§ 10-50 µs of protocol and network overhead (see the latency sketch after this list)
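To show what that 10-50 µs means end to end, here is a minimal sketch in Python. The fabric overhead range is the slide's figure; the local-device read latency is an assumed typical value for an NVMe flash SSD, not a measured one:

```python
# Back-of-the-envelope NVMEoF latency budget (all values in microseconds).

LOCAL_NVME_READ_US = 90.0          # assumed local NVMe SSD read latency
FABRIC_OVERHEAD_US = (10.0, 50.0)  # protocol + network overhead per the slide

for overhead in FABRIC_OVERHEAD_US:
    total = LOCAL_NVME_READ_US + overhead
    pct = 100.0 * overhead / LOCAL_NVME_READ_US
    print(f"fabric adds {overhead:4.0f} us -> {total:5.0f} us total "
          f"({pct:.0f}% over local)")
```

Even at the top of the range, remote NVMe stays within the same order of magnitude as local flash, which is what makes a shared tier 0 practical.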
The New Tier 0 (2017)
§ NVMe over Fabrics standard announced at FMS
• Drivers now in all major OSes/hypervisors
• Intel SPDK high-performance requestor and target
§ DellEMC cancels DSSD
• Promises the tech will live on in other products
Pure FlashArray//X
§ Replaces the //m's SAS SSDs with NVMe flash modules
§ Expansion via SAS or NVMEoF JBOF
§ NVMEoF target on 40Gbps Ethernet
§ Full services
Kaminario K2 Composable
§ NVMEoF
• Controller to JBOF
• Host to array (optional)
§ Dynamically assign controllers and flash to virtual arrays (see the toy sketch below)

[Diagram: controllers and flash shelves composed into Arrays A–D, with unassigned controllers held in reserve]
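The composability idea as a toy model in Python, purely illustrative; none of these names or functions come from Kaminario's actual management interface:

```python
# Toy model of a composable array: controllers and flash shelves live in
# shared pools and are bound to virtual arrays on demand. Hypothetical.

from dataclasses import dataclass, field

@dataclass
class VirtualArray:
    name: str
    controllers: list = field(default_factory=list)
    shelves: list = field(default_factory=list)

controller_pool = [f"ctrl-{i}" for i in range(8)]   # unassigned controllers
shelf_pool = [f"shelf-{i}" for i in range(4)]       # NVMEoF-attached JBOFs

def compose(array: VirtualArray, n_controllers: int, n_shelves: int) -> None:
    """Move resources from the free pools into a virtual array."""
    for _ in range(n_controllers):
        array.controllers.append(controller_pool.pop())
    for _ in range(n_shelves):
        array.shelves.append(shelf_pool.pop())

array_a = VirtualArray("Array A")
compose(array_a, n_controllers=2, n_shelves=1)  # resize as the workload changes
print(array_a)
print("still unassigned:", controller_pool)
```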
NVMe JBOFs Emerge
§ Today's JBOFs are x86 servers
• Dual servers needed for HA
• High flexibility
• High cost
§ NVMEoF ASICs
• Vastly reduce costs
• Sampling from:
– SolarFlare Xilinx
– Kazan Networks
Or The Future is Persistent Memory