What is Computational Storage? A ScaleFlux Guide (2024)


A proven approach, now applied to storage

Key benefits
- Offload tasks from the CPU to optimize resources
- Analyze more data by improving performance
- Improve efficiency by multiplying capacity

Computational storage is not a new idea; its methodology, in particular, is not new.

You might be familiar with the idea of graphics processing units (GPUs) from NVIDIA, AMD, and others that can perform some very specialized mathematics to relieve the CPU from the pressure of performing those functions.

These specialized processors can perform those functions even better than the general-purpose CPU, and pushing such tasks to a specialized processor lets users get more out of their CPUs. The industry then had the idea of SmartNICs, or Data Processing Units (DPUs): network interface cards with compute resources on board that can handle compression, decompression, encryption, and many other operations at wire speed.

That brings us to computational storage. Simply put, it is the idea of a storage device with an element of compute that performs a specialized function. These devices come in many different shapes and sizes. We now have SSDs with compute functions, devices on the memory bus, and add-in cards with specialized storage processors. They may use FPGAs or ASICs, but their primary role remains the same: handle data-centric tasks for the CPU. We put the right type of compute resource as close to the data as possible to relieve pressure on the CPU and get more work done.



Let's look at the ScaleFlux approach

We take our SSD, designed to be as performant and feature-rich as any major vendor's SSD. Then we add our custom silicon to give it additional functions that enable it to achieve better performance, better capacity utilization, and more endurance.

There are a couple of ways you could do this: with an FPGA, or with an ASIC or SoC (System-on-Chip). FPGAs are a bit easier to design, but they consume host resources and can be more complex in other ways. They're also not as fast or as power-efficient as an ASIC. While delivering a high-performance ASIC with multiple functions is an extraordinarily tricky task (and very expensive if you get it wrong!), ScaleFlux has done exactly that with our first-ever silicon design. This SoC sits at the heart of the CSD 3000, allowing us to transparently compress data before writing it to the NAND and decompress it when reading it back.

What does this deliver?

Very simply, if you write fewer bits to the NAND, you see an increase in performance from an IOPS and bandwidth perspective at the host. You can also run the device at a far higher capacity utilization before you see any abnormal performance side effects. Looking at the design, you can see that it has a PCIe Gen 4 x4 interface, with our own customized ASIC on board backed by industry-leading NAND to persist and store the data.

It allows you to store more data, either by increasing utilization rates to 85-90% or by expanding the namespace beyond the physical capacity.

When we bring this all together, even with a modest amount of data compression, you get industry-leading performance in IOPS and bandwidth. You may see double or even triple the usable capacity on the device because of the built-in data compression.

Finally, because we're writing less data to the NAND, you can go from a 5-drive-writes-per-day device to a 9-plus-drive-writes-per-day device. If you're writing less data, your endurance and your reliability significantly increase.
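To see why fewer NAND writes translate into those endurance and capacity figures, a rough back-of-the-envelope model helps. The sketch below is illustrative only; the 2:1 compression ratio, the 0.9 overhead factor, and the 7.68 TB capacity are assumptions for the example, not ScaleFlux specifications.

```python
# Back-of-the-envelope model of how inline compression stretches endurance
# and usable capacity. All figures are illustrative assumptions.

RATED_DWPD = 5.0              # endurance rating in drive writes per day
PHYSICAL_CAPACITY_TB = 7.68   # hypothetical raw namespace size
COMPRESSION_RATIO = 2.0       # assumed 2:1 compressibility of the workload
OVERHEAD_FACTOR = 0.9         # rough allowance for metadata/write amplification

# Fewer bits reach the NAND, so host-visible endurance scales up.
effective_dwpd = RATED_DWPD * COMPRESSION_RATIO * OVERHEAD_FACTOR

# Usable capacity can grow the same way when the namespace is expanded
# beyond the physical NAND (the "Expanded Namespace" idea).
effective_capacity_tb = PHYSICAL_CAPACITY_TB * COMPRESSION_RATIO

print(f"Effective endurance: ~{effective_dwpd:.1f} DWPD")
print(f"Effective capacity:  ~{effective_capacity_tb:.1f} TB")
```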

Deep dive into the ScaleFlux approach



When designing a computational storage device, we first need to build an SSD that is physically equal to any other enterprise-class SSD. So, in the design, we of course have some NAND. We use 16 channels in our ASIC to access that NAND, which is a common design practice for enterprise-class SSDs. The drive is a U.2 form factor, one of the most common form factors for SSDs. As we move forward into Gen 5, we'll do other form factors such as E1.S and E3.S. We also need some DRAM that acts as a buffer area to land data before we commit it to the NAND channels; the DRAM also lets us manage our flash translation layer and handle a few other functions.

The heart of our system, and something that makes us a little different, is our ASIC, also known as a System-on-a-Chip (SoC). The SoC is custom designed by ScaleFlux and built on several Arm cores and hardware accelerator engines. The first function we implemented in computational storage was an inline, transparent compression and decompression function decoupled from the host CPU.

[Figure: High-level SSD Architecture]

To get data into the device, we use a PCIe Gen 4 interface. It will work on older Gen 3 servers, but you won't quite see the performance that you would from a system with a Gen 4 bus. As we write data, it comes from the host into the DRAM, flows through our ASIC where it gets compressed, and then gets written to the NAND, where it is persisted. The read path is very similar: data comes off the NAND channels, gets rehydrated (decompressed) as it leaves the device, and then goes to the host. Compression is always on, and you can use it in different ways, either with a regular-sized namespace or with something called Expanded Namespace.
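As a mental model of that write and read path, the sketch below mimics the flow in ordinary Python: zlib stands in for the hardware compression engine in the ASIC, and a dictionary stands in for the NAND. This is a host-side analogy only; on the CSD 3000 the compression happens inside the drive, transparently to the application.

```python
# Conceptual model of the inline, transparent compression path described
# above. zlib stands in for the drive's hardware compression engine and a
# dict stands in for the NAND; the real work happens inside the device.
import zlib

nand = {}  # pretend NAND: logical block address -> compressed payload

def device_write(lba: int, data: bytes) -> None:
    """Write path: host data lands in DRAM, is compressed by the ASIC,
    and only the compressed bits are persisted to NAND."""
    nand[lba] = zlib.compress(data)

def device_read(lba: int) -> bytes:
    """Read path: compressed bits come off the NAND channels and are
    decompressed ('rehydrated') before returning to the host."""
    return zlib.decompress(nand[lba])

# The host sees its original data either way; only the NAND footprint shrinks.
block = b"highly compressible log line\n" * 128
device_write(0, block)
assert device_read(0) == block
print(f"host bytes: {len(block)}, bytes on NAND: {len(nand[0])}")
```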



Where can we use this CSD 3000?

We've seen customers use it on large datasets, mainly focused on machine learning and artificial intelligence. We also have customers using it today in traditional structured and unstructured databases, such as NoSQL.

We also see customers using it for streaming data. These streams might be data from almost anything; for example, market data from financial institutions interested in high-speed network packets that are manipulated and stored to calculate risk profiles as quickly as possible. In essence, the idea is simple: compressing data on the storage device, instead of on the CPU or only on the network, allows you to gain additional performance from your application.

2nd Order Effects

Another possible outcome is increased license efficiency. For example, if you're licensing an application such as SQL Server or Oracle, you could get more work done per license. Or you could be particularly sensitive to rack power, thermal, or other considerations from a sustainability perspective. Using computational storage, especially with compression built into the storage device itself, allows you to be more efficient on many levels.

It helps to look at where your application performance is bound. Are you facing system resource constraints in DRAM or CPU, or is it simply that your storage device or array can't keep up? That's the first indication that this could be a great fit.

Is your dataset compressible?

Most people don't know whether their data is compressible, and some believe it is entirely uncompressible. Uncompressible data is a possibility, but it's extremely rare in practice. ScaleFlux can provide a tool that assesses the compressibility of your dataset in the applications you care about and quickly indicates the degree of benefit you could see by deploying the CSD 3000 in your architecture.
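If you just want a rough first read on compressibility before running any vendor tooling, a generic software pass over a sample of your data can give a hint. The sketch below uses Python's zlib as a stand-in; it is not the ScaleFlux assessment tool, and the drive's hardware engine will achieve different ratios.

```python
# Rough, generic compressibility check: compress a sample of each file with
# zlib and report the ratio. This only approximates what a drive-level
# compression engine might achieve; it is not the ScaleFlux tool.
import sys
import zlib

SAMPLE_BYTES = 64 * 1024 * 1024  # sample the first 64 MiB to keep it quick

def estimate_ratio(path: str) -> float:
    with open(path, "rb") as f:
        sample = f.read(SAMPLE_BYTES)
    if not sample:
        raise ValueError(f"{path} is empty")
    compressed = zlib.compress(sample, level=1)  # fast level, like inline engines
    return len(sample) / len(compressed)

if __name__ == "__main__":
    for path in sys.argv[1:]:
        print(f"{path}: ~{estimate_ratio(path):.2f}:1 compressible")
```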
In addition, our device is a U.2 form factor device, meaning you need a server with a U.2 backplane; ideally, it should be PCIe Gen 4. The drives do work in earlier Gen 3 slots, but you will limit performance, and the whole point of using a computational storage device is to increase performance, capacity, and endurance.

If you want to know more about how to realize these performance gains, how to push capacity utilization to a very high level, or how to extend your namespace capacity by two or three times, please get in touch with us.

About ScaleFlux
ScaleFlux helps customers harness data growth as a competitive advantage by building
products that reduce complexity and accelerate the creation of value from data. In our first
phase of rethinking the data pipeline for the modern data center, ScaleFlux has built a better
SSD by embedding computational storage technology into flash drives. Now, customers can
gain an edge in optimizing their data center infrastructure by deploying storage intelligence
for workloads like databases, analytics, IoT, and 5G.

For more information, visit www.scaleflux.com or follow us on LinkedIn.
