
SCYLLADB WHITEPAPER

Beyond Legacy
NoSQL: 7 Design
Principles Behind
ScyllaDB
CONTENTS

INTRODUCTION

DESIGN DECISION #1: C++ INSTEAD OF JAVA

DESIGN DECISION #2: CASSANDRA COMPATIBILITY

DESIGN DECISION #3: ALL THINGS ASYNC

DESIGN DECISION #4: SHARD PER CORE

DESIGN DECISION #5: UNIFIED CACHE

DESIGN DECISION #6: I/O SCHEDULER

DESIGN DECISION #7: AUTONOMOUS CAPABILITIES

CONCLUSION

NEXT STEPS
ScyllaDB is a NoSQL database designed for modern multiprocessor servers, providing high availability,
high scalability and high performance. It was designed as a direct alternative to legacy NoSQL databases
based upon seven core design principles that guided its architecture and implementation.

In particular, it was designed as an alternative for Apache Cassandra, an early entrant in the NoSQL arena,
which became a popular database for non-relational data projects. Cassandra provides IT organizations
with an easy way to scale data across multiple nodes and datacenters with fault tolerance and data
redundancy. Early deployments were successful, but over time limitations emerged. In practice, Cassandra
has proven to be expensive and problematic, due to several key limitations of the underlying architecture.

ScyllaDB was designed specifically to address the key issues that hindered Cassandra: node sprawl,
inconsistent performance and arduous administration requirements. In part, this is because database
software architecture has not kept pace with innovations in hardware over the last decade. While
hardware has grown exponentially faster and more sophisticated, neither traditional SQL nor newer
NoSQL databases are able to take full advantage of the additional power available to them. The results are
wasteful overprovisioning and node sprawl, with high administrative and maintenance costs.

The ScyllaDB team has deep roots in low-level kernel programming, having devised and developed the
KVM hypervisor, which now powers most public clouds, including Google Cloud, AWS, OpenStack-based
clouds and many others. KVM was a late-entry, re-engineered technology that displaced mature
technologies such as the open source Xen hypervisor and VMware. While trying to extract performance
from a Cassandra cluster for a different project,
the ScyllaDB team was surprised to discover that neither Apache Cassandra nor any other databases on
the market were able to translate the full power of the underlying hardware into user-visible performance—
in particular modern, multi-core CPUs and fast I/O devices. Soon thereafter, the team chose to apply its
expertise to another reengineering effort, this time an API-compatible replacement for Apache Cassandra.
Their goal was a NoSQL database with scale-up performance of 1,000,000 IOPS per node, scale-out to
hundreds of nodes, and 99th-percentile latency of less than 1 millisecond—without sacrificing any of the
rich functionality, tooling and ecosystem support of Cassandra.

Early in ScyllaDB's development, its architects realized that the core database engine could also be
applied to other database designs and use cases. After having achieved feature parity with Apache
Cassandra, the team implemented an API compatible with Amazon DynamoDB, a popular (but costly)
key-value database-as-a-service (DBaaS). The same core principles that guided ScyllaDB to replicate the
best of Apache Cassandra and yet also improve upon it substantially will now allow ScyllaDB to extend its
capabilities and compatibilities well into the foreseeable future.

In this paper, we will explain the key design decisions that paved the way to a database that is ideally suited
to low-latency data-intensive application use cases that require performance at scale.

INTRODUCTION

The NoSQL movement was created in response to the requirements of cloud-based, internet and
web-scale companies. The SQL databases available at the time were not suited to the emerging use
cases, which called for managing different types of data structures, as well as high performance and
high availability across globally distributed data centers.

Drawing on key concepts behind Amazon Dynamo, a team at Facebook created Cassandra. The designers
originally described it as "a distributed storage system for managing structured data that is designed
to scale to a very large size across many commodity servers, with no single point of failure." Released
as an open source project on Google Code in July 2008, Cassandra was promoted to a top-level project
by the non-profit Apache Software Foundation in 2010.

Apache Cassandra has experienced widespread adoption. Its popularity was primarily based on its
excellent cluster architecture, which provides the following capabilities:

• Masterless Replication: Cassandra clusters have no masters, slaves, or elected leaders. All nodes
are symmetric and can serve all reads and writes equally, with no single point of failure.

• Global Distribution: Data sets can be replicated across multiple data centers, multiple geographical
regions, and across public and private clouds.

• Linear Scale: Performance scales along with the number of nodes. Users can increase performance by
adding new nodes, without downtime or disruption to running applications.

• Tunable Data Consistency: The consistency levels for reads and writes are tunable, as is the number
of replicas, to maximize price/performance.

• Data Model: Cassandra provides a simple data model that supports dynamic control over data layout
and format.

Cassandra's excellent clustering architecture and data model, however, are undermined by fundamental
problems in its node architecture. As a result of these limitations, users of Cassandra have struggled
with the following issues:

• Team Intensive: Operating Cassandra at scale requires dedicated full-time experts with an
increasingly scarce and expensive skill set.

• JVM Challenges: Memory management in the Java virtual machine (JVM) produces unpredictable and
unbounded latency.

• Inefficient Utilization: Cassandra cannot efficiently exploit modern computing resources, in
particular multi-core CPUs and high-density storage servers.

• Manual Tuning: Clusters require operators to perform intricate yet unpredictable tuning procedures,
while combating compaction and garbage collection storms.

• Dev Tweaking: Application developers require knowledge of database internals to successfully scale
applications.

As a result of these issues, architects and developers using Cassandra are forced to choose between
availability, simplicity and performance. Some deploy a Redis cache in front of Cassandra and lose
simplicity and consistency. Some do not use the full Cassandra feature set due to complexity and
stability concerns. Others give up and triple their TCO by moving to cloud-managed solutions like
DynamoDB.

The primary objective behind ScyllaDB is to eliminate this compromise. As an open source alternative
to Apache Cassandra, ScyllaDB preserves everything that the Apache community loves about Cassandra,
while also delivering higher throughput, consistently low latencies, operational simplicity, and lower
TCO. In this paper, we share the seven fundamental design decisions we made when architecting a
database that's built from the ground up to provide performance at scale for low-latency
data-intensive applications, and the results those decisions have had on the ScyllaDB database.

DESIGN DECISION #1: C++ INSTEAD OF JAVA
One of the early and fundamental design decisions behind ScyllaDB is the use of C++ in place of the
Java programming language. The Java platform isn't well-suited for I/O- and compute-intensive
workloads because it deprives developers of control. A modern database requires the ability to use
large amounts of memory and to have precise control over what the server is doing at any time. Java
isn't well suited to either of these requirements; C++ serves both purposes well. It provides very
precise control over everything a database does, along with abstractions that enable database
developers to create code that's both complex and manageable.

Further, a database written in Java, like Cassandra, is unable to fully optimize low-level operations
against the available hardware. Cassandra's reliance on the Java Virtual Machine (JVM) makes it
susceptible to performance and latency issues caused by garbage collection. Cassandra users can
side-step garbage collection by using off-heap data structures, but that fragments memory and
ultimately defeats the purpose of managed memory entirely.

In contrast, C++ can be considered an infrastructure programming language. It runs as native
executable machine code and gives developers complete control over low-level operations. ScyllaDB is
built on an advanced, open source C++ framework for high-performance server applications on modern
hardware. One example of the way ScyllaDB improves upon Cassandra by using C++ is its kernel-level
API enhancement. Such an enhancement would be difficult at best, and definitely non-idiomatic, in Java.

ScyllaDB developers regularly check the generated assembly code to verify efficiency metrics and to
uncover potential optimizations down to the microarchitectural level, as you can see in our published
instruction-per-cycle research.

By deciding to build in C++, we eliminated the problems associated with garbage collection altogether.

“So why did we end up moving from Cassandra to ScyllaDB? Peace
of mind, both operationally and expense-wise, was key. We no longer
have to worry about ‘stop-the-world’ GC pauses. Also, we were able
to store more data per node, and achieve more throughput per node,
thereby saving significant dollars for the company.”
Dilip Kolasani, Expedia.

Learn more about his experience in this tech talk.

DESIGN DECISION #2: CASSANDRA COMPATIBILITY
With more than a decade of technology investment, the Cassandra user community isn't eager to learn a
new database or data model. Yet they would benefit from an API-compatible alternative that overcomes
the performance and latency issues that have often put their projects at risk. This takes advantage
of the tremendous value in the accumulated body of institutional knowledge and expertise generated by
the Cassandra community. For these reasons, we chose to leverage the existing work in drivers, query
language, and the ecosystem surrounding Cassandra, rather than building a new suite of drivers or
inventing yet another query language.

Similarly, our decision to implement an Amazon DynamoDB-compatible API now enables DynamoDB users to
easily migrate to a more flexible and price-performant database.

ScyllaDB's Cassandra compatibility encompasses:

• Wire Protocol: ScyllaDB supports Thrift, CQL, and the full polyglot of client/driver languages. We
strive to be fully compliant with CQL and support all functionality, from lists, maps and counters to
UDTs, secondary indexes and lightweight transactions (LWT).

• Monitoring: ScyllaDB supports the Java Management Extensions (JMX) protocol. We add a JVM-proxy
daemon that transparently translates JMX into our RESTful API. Common Cassandra dashboards can
retrieve the same information from ScyllaDB. ScyllaDB provides additional monitoring in the form of
the Prometheus API and direct support for REST.

• Underlying File Format and Algorithms: ScyllaDB adopted Cassandra's Sorted Strings Table (SSTable)
format and supports all Cassandra compaction strategies. Auxiliary tools for SSTable loading, backup
and restore, and repair are also supported.

• Configuration File: ScyllaDB is able to directly consume Cassandra's configuration file,
cassandra.yaml, allowing for a seamless migration path. ScyllaDB ignores JVM-related options, as well
as various other limitations that derive from Cassandra (such as parallel compactors configuration).
ScyllaDB always uses maximum parallelism.

• Command Line Interface: ScyllaDB uses the same command line tool (nodetool). Its processes—from
backup to repair—are identical to Cassandra's.

As a result of this fundamental decision, organizations using Cassandra or DynamoDB today can
seamlessly migrate to ScyllaDB without rewriting existing applications. They can also continue to
leverage their existing ecosystems.

“We did not change our application one bit—the same drivers, the
same commands we were issuing before with Cassandra worked
with ScyllaDB with absolutely no changes. So for a very low technical
cost we saved money on infrastructure and got much improved read
performance, especially when under load.”
David Blythe, Principal Software Engineer, SAS Institute.

Learn more in this tech talk.

DESIGN DECISION #3: ALL THINGS ASYNC
Equipped with dozens of multi-core processors, modern hardware servers are capable of performing
millions of I/O operations per second (IOPS). To take advantage of this speed, software needs to be
asynchronous, driving both I/O and CPU processing in a way that scales linearly with the number of
CPU cores.

In any database, many different operations execute simultaneously. Multiple queries executing on
different cores may be waiting for data from other cores, or even from the network or storage media.
It is already common practice not to wait for slow devices such as HDDs, through the use of thread
pools and similar architectures. Yet as storage technology improves, the cost of dispatching one I/O
operation approaches the cost of a thread context switch. As the common core count increases, context
switches can also appear as a result of locking. In order to scale with the core count and extract
maximum performance from newer, faster storage technology such as Non-Volatile Memory Express (NVMe),
we decided not to synchronously wait for either I/O completion or neighboring CPUs, even for
nanoseconds.

To accommodate this reality, we made the fundamental decision to adopt a completely asynchronous
architecture. While building an asynchronous framework meant more upfront work for our team, once it
was complete, all of the traditional problems associated with concurrency were eliminated. By taking
this approach, we implemented a system in which the number of concurrent queries is constrained only
by system resources, not by the framework itself. In other words, the framework drives both I/O and
CPU processing, enabling it to scale linearly with the number of cores available.

ScyllaDB relies on a dedicated asynchronous engine (known as Seastar). You can read more here about
how ScyllaDB accesses the disk with asynchronous direct memory access (DMA) using the Linux API.

Figure 1: Scale Linearly: ScyllaDB – Operations per Second vs. Number of Cores
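The "all things async" model above can be illustrated with a toy continuation queue. This is a deliberately simplified sketch, not Seastar code: `TaskQueue` and `run_read_pipeline` are names invented for this example, and the "disk read" is simulated rather than a real I/O completion.

```cpp
#include <functional>
#include <queue>
#include <vector>

// A toy single-threaded task queue illustrating continuation-style
// scheduling: instead of blocking on I/O, each step enqueues the next
// step as a cheap callable ("continuation") and yields immediately.
class TaskQueue {
public:
    void submit(std::function<void()> task) { tasks_.push(std::move(task)); }

    // Run until no tasks remain; returns how many continuations executed.
    int run() {
        int executed = 0;
        while (!tasks_.empty()) {
            auto task = std::move(tasks_.front());
            tasks_.pop();
            task();  // may enqueue further continuations
            ++executed;
        }
        return executed;
    }

private:
    std::queue<std::function<void()>> tasks_;
};

// Simulate a read pipeline: "issue I/O", then process the result in a
// follow-up continuation rather than blocking the thread while waiting.
inline int run_read_pipeline(TaskQueue& q, std::vector<int>& results) {
    q.submit([&q, &results] {
        int data = 42;               // stand-in for a completed disk read
        q.submit([&results, data] {  // continuation: process the result
            results.push_back(data * 2);
        });
    });
    return q.run();
}
```

Because no task ever blocks, the number of in-flight operations is limited only by memory, not by a thread pool size, which is the property the text above attributes to a fully asynchronous design.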

DESIGN DECISION #4: SHARD PER CORE
Moore's law went hand-in-hand with Dennard scaling for decades, doubling single-threaded CPU
performance every 18 months. As CPU frequency limits were reached, chip manufacturers began
experimenting with multicore CPUs in the late 1990s, and eventually shifted to multicore architectures
in the late 2000s with the breakdown of Dennard scaling. Since the typical programming model involves
many threads, increasing cores per CPU virtually guarantees scalability problems.

A threaded programming model lets the application focus on the problem at hand, relying on locks to
ensure exclusive access to the data. Other threads, however, are prevented from accessing the data
and, as a result, they block. Even a cheap locking technique, such as busy waiting, must lock the CPU
bus and invalidate caches, adding overhead even when the lock is not contended.

Application-level locks are even more expensive, since they cause the contended thread to sleep and
issue context switches. As the core count grows, the chances of hitting the contended case grow as
well, thereby limiting the scalability of threaded applications.

The complexity of the traditional threaded model goes beyond locking. Where more than a single socket
is involved, memory access may spread across multiple sockets. Access to a memory bank on a remote
socket imposes twice the cost of access to a local socket. Threaded applications are usually agnostic
about memory location, and threads can be moved around to different CPUs in different sockets. This
doubles response times for applications that are not Non-Uniform Memory Access (NUMA) friendly.

Finally, there are the I/O devices. Modern NICs and storage devices have multi-queue modes where every
CPU can drive I/O. The standard model typically lacks enough handlers or, being threaded by nature,
requires context switches to service I/O.

ScyllaDB addresses these scalability issues using a shared-nothing architecture. There are two levels
of sharding in ScyllaDB. At the first level, identical to Cassandra, the entire dataset of the cluster
is sharded across individual nodes. The second level, transparent to users, is within each node.
Figure 2: ScyllaDB Network Comparison. The traditional stack (Cassandra atop kernel threads, the
kernel task scheduler and kernel TCP/IP) suffers lock contention, cache contention and NUMA-unfriendly
placement; Seastar's shared stack gives each core its own userspace task scheduler, memory, TCP/IP and
NIC queue, with no contention, NUMA-friendly placement and linear scaling.

The node's token range is automatically sharded across available CPU cores. (More accurately, datasets
are sharded across hyperthreads.) Each shard-per-core process is an independent unit, fully
responsible for its own dataset. A shard has a thread of execution, a single OS-level thread that runs
on that core, and a subset of RAM. The thread and the RAM are pinned in a NUMA-friendly way to a
specific CPU core and its matching local socket memory. Since a shard is the only entity that accesses
its data structures, no locks are required and the entire execution path is lock-free. In the journey
towards pure shared-nothing and lock-free operation, our engineering team developed its own memory
allocation (malloc) library so that each thread has its own pool with no hidden OS-level locking. Even
exception handling, derived from the GCC compiler, was optimized, since the GCC library's acquisition
of a spin-lock created contention on exceptional paths.

Each shard issues its own I/O, either to the disk or to the NIC directly. Administrative tasks such as
compaction, repair and streaming are also managed independently by each shard.

In ScyllaDB, shards communicate using shared memory queues. Requests that need to retrieve data from
several shards are first parsed by the receiving shard and then distributed to the target shards in a
scatter/gather fashion. Each shard performs its own computation, with no locking and therefore no
contention.

Each ScyllaDB shard runs a single OS-level thread, leveraging an internal task scheduler to allow the
shards to perform a range of different tasks, such as network exchange, disk I/O and compaction, as
well as foreground tasks such as reads and writes. ScyllaDB's task scheduler selects from low-overhead
lambda functions, which we refer to as continuations. By taking this approach, both the overhead of
switching tasks and the memory footprint are reduced, enabling each CPU core to execute a million
continuation tasks per second.
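One simple way to picture static, coordination-free shard ownership is a multiply-and-shift mapping from a partition token to a core. The sketch below is illustrative only; `shard_of` and its scheme are assumptions for this example, not ScyllaDB's actual sharding function.

```cpp
#include <cstdint>

// Hypothetical shard assignment: treat a 64-bit partition token as a
// fraction of the full token range and scale it to [0, shard_count).
// Equivalent to floor(token / 2^64 * shard_count). Because the mapping is
// a pure function of the token, any shard can compute ownership locally,
// with no coordination and no locks.
inline unsigned shard_of(uint64_t token, unsigned shard_count) {
    // 128-bit intermediate avoids overflow (GCC/Clang extension).
    return static_cast<unsigned>(
        (static_cast<unsigned __int128>(token) * shard_count) >> 64);
}
```

Every request for a given partition is routed to the one shard that owns it, which is what makes the lock-free execution path described above possible.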

DESIGN DECISION #5: UNIFIED CACHE


The page cache, sometimes also called disk cache, improves operating system performance by storing
page-size chunks of files in memory, saving expensive disk seeks. In Linux, by default the kernel
treats files as 4KB chunks. This speeds performance, unless data is smaller than 4KB, as is the case
with many common database operations. In those cases, a 4KB minimum leads to high read amplification.

Having very poor spatial locality, that extra data is rarely useful for subsequent queries. It's just
wasted bandwidth. So while the Linux page cache is a general-purpose data structure, we recognized
that a special-purpose cache, intrinsic to the database, would deliver better performance.

The Linux page cache also performs synchronous blocking operations under the hood, decreasing the
performance and predictability of the system. Since Cassandra is unaware that a requested object does
not reside in the Linux page cache, accesses to non-resident pages will cause Linux to issue a page
fault and context switch to read from disk. Then it will context switch again to run another thread.
The original thread is paused and its locks are held. Eventually, when the disk data is ready (yet
another interrupt and context switch), the kernel will schedule in the original thread.

To attempt to alleviate read amplification, Cassandra also employs a key cache and a row cache, which
directly store frequently-used objects. However, the added caches increase overall complexity. The
operator allocates memory to each cache; different ratios produce varying performance
characteristics, and each workload will benefit from a different setting.

The operator also has to decide how much memory to allocate to the JVM's heap and its off-heap
structures. Since the allocations are performed at boot time, it's practically impossible to get them
right, especially for dynamic workloads.

The diagram below displays the architecture of Cassandra's caches, with layered key, row, and
underlying Linux page caches.

In light of this complexity, it was a straightforward decision to implement our own unified row-level
cache. This cache bypasses the Linux page cache, so it does not suffer the same read amplification.
As well, a unified cache can dynamically tune itself to the current workload, obviating the need to
manually tune multiple different caches. ScyllaDB caches objects itself, and thus always controls
their eviction and memory footprint. Moreover, ScyllaDB can dynamically balance the different types
of caches stored. ScyllaDB has controllers, such as a memtable controller, a compaction controller,
and a cache controller, and can dynamically adjust their sizes.

When data isn't cached in memory, ScyllaDB generates a continuation task to read the data
asynchronously from disk using direct memory access (DMA), which allows hardware subsystems to access
main system memory independently of the CPU. Seastar will execute it within a microsecond (1 million
tasks/core/sec) and rush to run the next task. There's no blocking, heavyweight context switch,
waste, or tuning.

For ScyllaDB users, this design decision means higher ratios of disk to RAM, yet also better
utilization of available RAM. Each node can serve more data, enabling the use of smaller clusters
with larger disks. The unified cache simplifies operations, since it eliminates multiple competing
caches and is capable of dynamically tuning itself at runtime to accommodate varying workloads.

Figure 3: A diagrammatic view of the Linux page cache used by Apache Cassandra. On-heap/off-heap key
and row caches sit above the kernel's page cache; a page fault suspends the application thread,
initiates I/O against the SSTables on SSD, and resumes the thread only after an interrupt and two
context switches.
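The read amplification caused by a 4KB page minimum, described above, is easy to quantify. The helper below is a back-of-envelope sketch; the function name is ours for illustration, not a ScyllaDB API.

```cpp
#include <cstddef>

// Back-of-envelope read amplification: fetching a small object through a
// page cache pulls in whole pages, so the bandwidth-waste factor is
// (pages touched * page size) / object size.
inline double read_amplification(std::size_t object_bytes,
                                 std::size_t page_bytes = 4096) {
    // Number of whole pages the object spans, rounded up.
    std::size_t pages = (object_bytes + page_bytes - 1) / page_bytes;
    return static_cast<double>(pages * page_bytes) / object_bytes;
}
```

A 256-byte row read through 4KB pages wastes 16x the bandwidth actually needed, which is why a row-level cache that stores objects at their natural size avoids this overhead entirely.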

DESIGN DECISION #6: I/O SCHEDULER
Database users want their data to be committed to disk as quickly as possible. For distributed
databases, this requirement is nontrivial. I/O producers have to compete for bandwidth. If too much
data is submitted at once, the data will be queued in the underlying device. The filesystem and the
disk are ignorant of the content and purpose of the data, and they cannot judge whether the blocks
originated from latency-sensitive, real-time workloads or from batch background tasks.

It's difficult to reach a compromise between low latencies for sensitive tasks while maximizing
throughput and preserving SLAs. Cassandra controls I/O submission by capping background operations,
such as compaction, streaming, and repair. Cassandra users are required to carefully tune these caps
and obtain a detailed knowledge of database internals. With spiky, unpredictable workloads, getting
it right is a daunting challenge. A cap that is too high may cause commensurately spiky latency,
starving foreground operations of compute and I/O resources. Set the cap too low and streaming
terabytes of data between nodes might take days—rendering autoscaling a nightmare.

The goal behind ScyllaDB is to provide a database that finds this balance automatically and
autonomously, without operator intervention; ScyllaDB should maximize I/O but not overload the
drives. This way, ScyllaDB has full control of how to prioritize foreground operations over
background operations. Thus no tuning is required: query latency is kept to a minimum, while ScyllaDB
maximizes disk throughput whenever idle time exists, streaming gigabytes of data per second without
any limit.

When ScyllaDB is installed, a tool called scylla_io_setup runs under the hood. scylla_io_setup is a
benchmark that automatically determines the maximum useful disk concurrency, defined as the point at
which maximum throughput is achieved while latency remains good and no data is queued by the disk or
the file system.

Figure 4: ScyllaDB's io setup benchmark plots 4KB read IOPS against concurrency; the maximum useful
disk concurrency is the point beyond which I/O begins to queue in the filesystem or device, ensuring
the optimal balance between foreground and background operations.
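What a disk benchmark like scylla_io_setup measures can be sketched in a few lines: sweep concurrency levels and stop where extra concurrency no longer buys throughput. The function below is a hypothetical toy operating on pre-measured numbers, not the real tool.

```cpp
#include <cstddef>
#include <vector>

// Toy "knee finder": throughput[i] holds measured IOPS at concurrency
// level i+1. Return the lowest concurrency level after which throughput
// stops improving by more than `epsilon` — a stand-in for the point where
// I/O begins to queue in the filesystem or device.
inline std::size_t max_useful_concurrency(const std::vector<double>& throughput,
                                          double epsilon) {
    for (std::size_t i = 0; i + 1 < throughput.size(); ++i) {
        if (throughput[i + 1] - throughput[i] <= epsilon) {
            return i + 1;  // concurrency level of the last useful gain
        }
    }
    return throughput.size();  // still improving at the highest level tested
}
```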

Figure 5: ScyllaDB's I/O Scheduler enforces priority across user-facing workloads and background
tasks. Userspace query, commitlog, and compaction queues feed the I/O Scheduler, which dispatches to
the disk.
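The queue-per-class design in Figure 5 can be sketched as a small fair scheduler. This is an illustrative model, not ScyllaDB's implementation: each class accumulates "consumed work," and the class with the lowest consumed-per-share ratio is dispatched next, so a high-share foreground class is served ahead of background compaction without starving it.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

// One I/O class per request origin (query, commitlog, compaction, ...).
struct IoClass {
    std::string name;
    unsigned shares;        // relative priority weight
    uint64_t consumed = 0;  // work dispatched so far
    std::deque<int> queue;  // pending request costs
};

// Dispatch one request from the class whose consumed/shares ratio is
// lowest; returns the class index, or -1 if every queue is empty.
inline int pick_next(std::vector<IoClass>& classes) {
    int best = -1;
    double best_ratio = 0;
    for (std::size_t i = 0; i < classes.size(); ++i) {
        if (classes[i].queue.empty()) continue;
        double ratio =
            static_cast<double>(classes[i].consumed) / classes[i].shares;
        if (best == -1 || ratio < best_ratio) {
            best = static_cast<int>(i);
            best_ratio = ratio;
        }
    }
    if (best >= 0) {
        classes[best].consumed += classes[best].queue.front();
        classes[best].queue.pop_front();
    }
    return best;
}
```

With 1000 shares for queries and 100 for compaction, queries receive roughly ten times the dispatch bandwidth, yet compaction still makes steady progress rather than being capped or starved.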

To manage data on disk, both Cassandra and ScyllaDB use an algorithm called the Log-Structured Merge
tree (LSM tree). LSM trees create immutable files during data insertion with sequential I/O (the
"Log-Structured" part of LSM), yielding great initial throughput. But that penalizes future reads,
which may now have to gather the data for a query from multiple SSTable files. To alleviate that, a
background operation called compaction goes through the existing files and merges them (the "Merge"
part of LSM) into fewer files. Issues arise when these background operations compete with user
queries, reducing throughput in the process, which is, in general, an undesirable scenario.

To address this, we saw the need to control all of the I/O going through the system. Our approach
relies on a scheduler that allows requests, both foreground and background, to be tagged according to
the origin of the I/O operation. Once tagged, these requests can be metered, and the scheduler can
decide which priority should be applied to each class of operations.

The I/O scheduler guarantees that the disk will always be in its sweet spot, so latency remains low
while bandwidth is maximized.

This design decision helps ScyllaDB users meet SLAs while maintaining system stability. Operators
spend less time tuning and can be confident that background operations will complete as quickly as
possible without impacting performance.

Another example of a background operation is the commissioning of new nodes and decommissioning of
old ones. To commission or decommission nodes, an operator can simply instruct ScyllaDB to perform
the operation, without having to take the speed of the operation into consideration. The operation
will run, mediated by the I/O Scheduler, at the fastest speed that won't impact system throughput.
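The "Merge" step of an LSM tree can be sketched as a k-way merge of sorted runs. This is a minimal illustration, with integer keys standing in for SSTable entries, not ScyllaDB's compaction code.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Merge k sorted runs (SSTable-like files) into one sorted run using a
// min-heap — the core of what compaction does so that reads consult fewer
// files afterwards.
inline std::vector<int> merge_runs(const std::vector<std::vector<int>>& runs) {
    // Entry: (value, (run index, position within run))
    using Entry = std::pair<int, std::pair<std::size_t, std::size_t>>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    for (std::size_t r = 0; r < runs.size(); ++r) {
        if (!runs[r].empty()) heap.push({runs[r][0], {r, 0}});
    }

    std::vector<int> merged;
    while (!heap.empty()) {
        auto [value, pos] = heap.top();
        heap.pop();
        merged.push_back(value);
        auto [r, i] = pos;
        if (i + 1 < runs[r].size()) heap.push({runs[r][i + 1], {r, i + 1}});
    }
    return merged;
}
```

Each input run is read sequentially and the output is written sequentially, which is why compaction is I/O-heavy background work that must be metered against foreground queries.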

“We’ve reduced our P99, three-9 and four-9 latencies by 95%.”
Phil Zimich, Senior Director of Engineering, Comcast.

Learn more in this tech talk.

DESIGN DECISION #7: AUTONOMOUS CAPABILITIES


Developing a database with self-optimizing capabilities is an overarching design decision that
informs many of the design principles we have described to this point. Cassandra users have told us
time and again that they waste significant time and energy not only tuning the database, but also
dealing with the fallout of complex and tricky tuning mechanisms. Before ScyllaDB, the only solutions
available to this problem were to become an expert in Cassandra internals or to hire expensive
consultants. To alleviate this burden and to help IT groups focus on more pressing issues, we
committed to building a database that continually monitors all operations and dynamically adjusts
internal processes to optimize performance.

ScyllaDB takes advantage of control theory to achieve autonomous operations. Control theory is
routinely employed in other fields, such as industrial plants and automotive cruise control systems.
It works by setting a particular level for user-visible properties that the system has to maintain,
leaving the tuning of the component parts that will yield the desired behavior up to the control
algorithm. This approach is used in many parts of ScyllaDB. It is present in background operations
like compaction (where we make sure that the uncompacted backlog never grows out of control), in the
caches (so we automatically move memory to where it is needed), and in many other parts of the
system.

This self-optimizing approach made much more sense to us than saddling users with elaborate guides
and explanations of how to tune and retune for changing workloads and corner cases. Not only does an
autonomous database greatly reduce administrative burden, it also means that operators can achieve
100% resource utilization while maintaining SLAs and optimizing infrastructure budgets, all at the
same time.
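A backlog controller of the kind described above can be sketched as a simple proportional loop. This is a textbook illustration, not ScyllaDB's actual controller: `BacklogController`, its target and its gain are hypothetical values chosen for the example.

```cpp
#include <algorithm>

// Proportional controller: nudge the share of resources given to
// compaction so the uncompacted backlog converges toward a target level
// instead of growing without bound.
struct BacklogController {
    double target;        // desired backlog level
    double gain;          // proportional gain
    double shares = 0.1;  // fraction of resources given to compaction [0,1]

    // Observe the current backlog and adjust compaction shares.
    double update(double backlog) {
        double error = backlog - target;  // positive -> falling behind
        shares = std::clamp(shares + gain * error, 0.0, 1.0);
        return shares;
    }
};
```

When the backlog rises above target, compaction automatically gets more resources; when it falls back, those resources return to foreground queries, with no operator-set cap anywhere.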

Figure 6: A closer view of ScyllaDB's scheduler architecture. A memory monitor and a compaction
backlog monitor adjust the priority of commitlog, memtable, compaction, query and repair tasks within
the Seastar scheduler, in front of the CPU, SSD and WAN, enabling self-optimizing operations in user
space.

“We needed an aggregation store that is high-throughput, low-latency,
low-overhead, low-cost, and low maintenance. That is where ScyllaDB
comes in. Low maintenance is extremely important. I have a baby at
home. I don’t want another baby at work!”
Aravind Srinivasan, Lead Engineer at Grab.

Read more about his experience in this case study.

CONCLUSION

IT organizations of all sizes have embraced NoSQL in general, and Apache Cassandra in particular,
based on the promise of greater flexibility and cost-effective scale. As a first-generation NoSQL
solution, Apache Cassandra delivered on that early promise of NoSQL.

Over time, however, it has become clear that limitations in the fundamental architecture of Cassandra
render it unable to fully leverage the computing resources in the modern datacenter. Organizations
that adopted Cassandra now struggle with costs, maintenance and administrative overhead. This is due
to a number of reasons: node sprawl due to poor hardware utilization, throughput bottlenecks,
inconsistent and high latencies, complex performance tuning, and inefficient memory management.
Ultimately, Cassandra has proven to be too expensive and problematic for many projects.

On the other hand, public cloud users who moved to a NoSQL database-as-a-service (DBaaS) such as
Amazon DynamoDB avoided maintenance and administrative tasks, but found their ease of use came with a
price: operational costs increased dramatically. Plus, once committed to DynamoDB, they were locked
into a single cloud vendor's solution, limiting their operational flexibility.

Users clearly want a system that provides a best-of-all-worlds solution: high performance, high
availability and reliability, ease of administration, operational flexibility, as well as
significantly lower total cost of ownership (TCO).

ScyllaDB delivers on the original vision of NoSQL—without the architectural downsides associated with
Apache Cassandra or the costs of DynamoDB. To achieve this goal, the ScyllaDB team had to rethink
many of the foundational architectural choices behind Cassandra. The team behind ScyllaDB applied
their experience with the hypervisor infrastructure underpinning many production cloud platforms.
They relied on this practical experience with modern distributed systems to reinvent Apache Cassandra
for the modern datacenter, finally releasing a NoSQL database capable of 1 million operations per
second on a single node.

Today, ScyllaDB is battle-hardened in production datacenters. Customers use it as an API-compatible
replacement for Cassandra or DynamoDB, as a replacement for SQL and NoSQL databases such as MongoDB,
and as a replacement for the Redis cache. Production deployments have shown that ScyllaDB
significantly shrinks the datacenter hardware footprint compared to Apache Cassandra clusters, which
often run on many small nodes. ScyllaDB was designed from the ground up to maximize available
computing resources and to take advantage of modern multi-core 'commodity' hardware. In this way,
ScyllaDB delivers scale-out capabilities along with vastly lower management and administrative
overhead compared to other distributed databases on the market today.

NEXT STEPS

Get started with ScyllaDB
Learn more at ScyllaDB University
Explore papers, videos, benchmarks & more

ABOUT SCYLLADB
ScyllaDB is the database for data-intensive apps that
require high performance and low latency. It enables teams
to harness the ever-increasing computing power of modern
infrastructures – eliminating barriers to scale as data
grows. Unlike any other database, ScyllaDB is built with
deep architectural advancements that enable exceptional
end-user experiences at radically lower costs. Over 400
game-changing companies like Disney+ Hotstar, Expedia,
FireEye, Discord, Crypto.com, Zillow, Starbucks, Comcast,
and Samsung use ScyllaDB for their toughest database
challenges. ScyllaDB is available as free open source
software, a fully-supported enterprise product, and a fully
managed service on multiple cloud providers. For more
information: ScyllaDB.com

SCYLLADB.COM

United States Headquarters
2445 Faber Place, Suite 200
Palo Alto, CA 94303 U.S.A.

Israel Headquarters
11 Galgalei Haplada
Herzelia, Israel

Email: info@scylladb.com

Copyright © 2022 ScyllaDB Inc. All rights reserved. All trademarks or registered trademarks used
herein are property of their respective owners.
