
DATA STORAGE IN AN OPEN SOURCE WORLD

Sponsored by
AUTHORED BY: Joe McKendrick, Lead Analyst, Unisphere Research, a Division of Information Today, Inc.

TABLE OF CONTENTS

Executive Summary

Examining the Open Source Database Landscape

Staying Agile with Ever-Expanding Data

Achieving a Modern Storage Environment

Optimizing Your Open Source Database Environment with Pure


August 2020

1. Executive Summary
Open source databases have been on the scene for a number of years as rapidly-deployable databases at the peripheries of enterprises, serving as testing environments and website back-ends. Lately, however, they have been moving into mission-critical production environments in a big way, from Software-as-a-Service (SaaS) providers using the technology to stay on the cutting edge, to supporting interactions and analytics at more traditional enterprises within industries like finance, healthcare and education. Today, an open source database is just as likely to be found behind a bank’s customer relationship management system as it is under the hood of the intensive research center of a university.

However, maintaining open source databases such as MySQL and MongoDB can also lead to growing pains in enterprises struggling to keep up with exploding data volumes and performance requirements. Storage, often an afterthought in smaller-footprint, open-source projects, needs to be addressed on an enterprise level. Performance tuning is a must-have, but this requires special expertise, and its benefits tend to be limited. Increases in processing power may also help boost performance, but constantly upgrading processors and hardware stacks can be expensive. Another option is to adopt clustering solutions to support larger workloads and improve high availability. Ultimately, however, the key to achieving maximum performance and scalability in an open-source database environment is a modern storage environment, designed to efficiently deliver the data that is leveraged by today’s proliferation of advanced applications.

A modern storage environment that is simple, scalable, adaptable and resilient is essential for going forward in today’s open source world, especially as open source databases take on large enterprise workloads. The growing size and complexity of today’s database environments creates challenges not only in maintaining the performance and availability of mission-critical applications and systems, but also in the ongoing day-to-day management of these environments by time-strapped database teams. The ability to automate and simplify routine database maintenance tasks via cloud and rich data services is becoming increasingly important. A modern storage environment should play a central role in simplifying the management of database environments and enabling greater data mobility.

2. Welcome to the Open-Source World
The days when enterprises maintained data environments that were tied to a single vendor or platform are gone. Even among enterprise shops that were initially built around Oracle relational database management systems, open-source databases are proliferating, as shown in a survey conducted among members of the Independent Oracle Users Group by Unisphere Research, a division of Information Today. Respondents report adopting MySQL (44%), PostgreSQL (22%),
MongoDB (18%), MariaDB (11%) and CouchDB (3%).[1]

Database Adoption by Brand at Oracle Enterprise Sites
Oracle (all versions): 81%
MySQL: 44%
Microsoft SQL Server: 64%
PostgreSQL: 22%
IBM DB2: 21%
MongoDB: 18%
SAP HANA: 12%
MariaDB: 11%
Amazon DynamoDB: 10%
CouchDB: 3%
Source: Unisphere Research/Information Today, Inc.

Additional surveys confirm that today’s enterprise data shops can only be characterized by “polyglot persistence,” or the use of different databases to handle different needs based on the strengths of each particular database. The average company now leverages more than three database types for applications, a survey from DZone confirms, with some reporting up to nine different database types used.[2] For example, new applications can be more readily stood up with open-source databases, which are more lightweight and quicker to deploy. Many open-source databases are NoSQL databases that more readily support unstructured data, which is a huge part of the data now flowing through enterprises and sought for advanced analytics and artificial intelligence. Licensing for per-user or per-processor instances is also less expensive and more open than that of commercial databases.

The reasons for adopting open source databases vary, but the two main factors driving adoption are achieving cost savings and avoiding vendor lock-in. A survey published by Percona finds 77% see cost savings as a benefit, followed by 56% citing avoidance of vendor lock-in. Users are also attracted by the community support they can receive with open-source environments.[3] At the same time, the expansion of open source databases into mission-critical environments, where they support large enterprise workloads, can highlight limitations in your database and traditional storage environment in areas such as
performance, availability, capacity and time to market.

3. Where Does All That Data Come From, and Where Does It Go?
The key challenge for today’s enterprises is supporting and enhancing the mobility of data that comes from an ever-expanding array of sources, including machines and sensors, business users, external and social media sites, and transaction systems. This data ends up in data warehouses, data lakes and databases across the enterprise.

With most applications considered mission-critical in today’s enterprises, there’s almost unanimous agreement that availability and performance are the most important services data shops can deliver. Businesses are online and depend upon many elements to keep delivering. The user base is no longer limited to employees who will just sit and patiently wait until things are restored. It now includes customers and partners who depend upon and expect information to be delivered, and transactions to be completed and updated, as soon as they happen.

In today’s data centers, everything that affects the business matters. Close to two-thirds of data managers, 63%, responding to a survey from Unisphere Research say much of what they manage (defined as more than 25% of total database installations) is mission-critical to their businesses. This survey also shows the pace of data growth accelerating. At the time of the survey, about 15% of respondents were in the fastest-growing segment, with data growing at a pace of 50% a year. However, within the next three years, 31% expect to start seeing this pace of growth.[4]

Data is pouring in from all corners of the enterprise: from ERP, financial systems, production systems, customer relationship management systems, human capital management systems, and everywhere else inside and close to the enterprise. Data is also streaming in from devices, sensors and systems across the Internet of Things. This requires 24x7 data monitoring and management as well. Add to this the data coming out of transactions, social media or other customer engagements, which has value in advanced analytics, artificial intelligence and machine learning.

This data growth, of course, is translating into very large databases and data sites that need to be managed. While a multi-terabyte database was seen as exclusive to large enterprises just a few years ago, close to half of the data managers in the survey report having databases exceeding 10 TB right now, according to the Unisphere Research study. Close to one-third manage databases exceeding 50 TB in size. The presence of such massive amounts of data means greater care needs to be taken in keeping information highly available, without the loss of current data in the event of incidents or performance slowdowns.

In fact, availability is the data issue most likely to keep respondents up at night, the survey shows. Close to two-thirds, 63%, say the availability of applications is an “extremely critical” concern for them, according to Unisphere Research. About half also see database and application performance as an extremely critical concern. Overall, data managers and
administrators are extremely active in their efforts to assure constant, uninterrupted delivery of data to their users and customers. The majority of companies in the survey are taking steps to boost the performance of their databases or even upgrade to new versions.

The pervasiveness of data, combined with users executing queries from so many different domains, calls for a high-performing and resilient storage solution similar to that leveraged by relational database management systems. With this growth of database activity, and the proliferation of open source databases, the challenge for data managers is assuring the performance and availability of a wide range of solutions from varying providers. Data managers (database administrators, developers and analysts) are not experts in these multiple environments. Oracle databases, for example, have different protocols than MariaDB databases. This complicates important elements such as overall performance, availability, backup and recovery, and disaster recovery.

4. Recommendations for Achieving a Modern Storage Environment
The increasing complexity of managing open-source databases up front increases the urgency of maintaining a modern storage environment that serves all platforms across the enterprise. Such a storage platform can enhance performance and simplify operations to the point at which more expensive up-front methods, such as processor upgrades or performance tuning, can be avoided. Traditional data storage systems such as direct attached storage (DAS) lack the ability to meet these modern data demands, as they are prone to disruptions, complexity and lack of scalability, all of which lead to administrative overhead. The following are recommendations to achieve a modern storage environment that incorporates open source databases:

• Consider a “software defined storage” strategy. Software defined storage, or SDS, abstracts storage configuration away from underlying physical hardware and database-dependent features into a standardized and accessible service layer.
• Build in an intelligent storage layer. Such an architecture, structured as a multicloud data plane that eliminates the complexity of operating siloed private and public cloud environments, provides rich data services and data mobility.
• Transform storage infrastructure incrementally. Many organizations have networks of existing storage area networks and network-attached storage arrays. While it is cost-prohibitive and disruptive to migrate everything to newer, more intelligent storage layers, such capabilities can be built into new applications and configurations.
• Look to more rapid and efficient storage technologies. The traditional magnetic disk paradigm has proven to be too slow for today’s always-on, real-time application requirements, as data needs to make round trips between storage and random access memory in systems. Flash storage arrays and in-memory computing promise data access the instant it is required by applications and users. Make sure your vendor supports NVMe (Non-Volatile Memory
Express) and NVMe-oF (Non-Volatile Memory Express over Fabrics) storage interconnect technologies, which are based primarily on PCIe, a fast-evolving standard that leverages flash storage.
• Look to the cloud. Today’s cloud providers, particularly Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Database as a Service (DBaaS) vendors, offer almost unlimited capacity, available on a subscription or usage basis. Even data warehouses and data lakes can be efficiently managed in the cloud.

Today’s environment offers many choices for applying the right database for the right application. However, with so many choices comes more responsibility for the proper and efficient storage of data. A modern storage environment provides the foundation for a high-performing data-driven enterprise.

Modern Storage Platform Checklist
When evaluating a modern storage platform for your open source database management system, look for these key features and capabilities:
• Rapid response time and low latency to database queries
• Speedy database cloning of production databases
• Non-stop availability of database with high storage uptime to help meet SLA targets
• Good data reduction that can provide effective use of capacity and lower TCO
• Consistent experience and data mobility across on-premises and public cloud
• Open and efficient APIs
• Space-efficient snapshots for database recovery and cloning
• Easy setup and maintenance which reduces administrative overhead
• Non-disruptive upgrades
• Pay-as-you-go billing flexibility that allows buying of storage based on actual consumption
• End-to-end encryption and ransomware protection

5. Optimize Your Open Source Database Environment with Pure Storage
Whether your workload is transactional, data warehousing or conducting analytics, Pure Storage® can optimize your open source database deployment and improve your application performance. Pure delivers a modern data experience that empowers organizations to run their operations as a true, automated, storage as-a-service model seamlessly across multiple clouds. Pure helps companies use more of their data, while reducing the complexity and expense of managing the infrastructure behind it. Pure’s flash storage solutions are purpose-built to
support the modern data experience and deliver simplicity, flexibility and reliability. FlashArray™ is the industry’s first all-flash 100 percent NVMe shared accelerated storage designed for mainstream enterprise deployments. FlashBlade™ is the industry’s first unified fast file and object (UFFO) storage platform for modern data and applications. Pure as-a-Service™ provides storage as a service for on-premises and public cloud that unifies hybrid clouds with a single subscription. Pure1® enables self-driving storage with full-stack, AI-powered data-storage management and monitoring.

• Faster applications with rapid response times can build customer value and deliver an amazing user experience. Hence fast data is agile data. Pure can help speed databases with low latency to support business applications. Reduce the time and cost of database activities (including copy, clone, and refresh) and provide quick provisioning via APIs, with data copies for dev/test so that your teams are always working off the latest copies of data. Pure offers granular data reduction necessary for virtually any application: pattern removal, deduplication, compression, deep reduction, and copy reduction. This embedded data-reduction capability minimizes both capacity needs and capacity costs.
• Pure solutions are easy to set up and smart enough to manage themselves. This minimizes administrative overhead and eliminates risk. Pure’s cloud-based management tool, Pure1, makes it quick and easy to monitor your storage, wherever you are. Pure1 Meta®, which is an AI-driven workload planner, can provide an optimal outlook by right-sizing capacity allocation.
• Pure solutions provide scalability and uninterrupted uptime, which is critical to meeting customer SLAs. Pure FlashArray provides six-nines availability (99.9999% uptime), inclusive of upgrades and maintenance, across both hardware and software. Additionally, Pure’s Evergreen™ storage program allows for non-disruptive upgrades while supporting long-term compatibility, IT agility, and peace of mind. With Pure as-a-Service, you can purchase storage in a cloud-like fashion to adapt to fluctuating capacity requirements. When it comes to data protection, options range from instant snapshot copies to synchronous replication with ActiveCluster™ and replication resiliency for rapid-restore backup and recovery. Pure’s asynchronous replication with ActiveDR™ provides a “near-zero” recovery point objective. Combined, these offerings can help limit downtime, data loss, and risk.
• Pure’s cloud data services, together with on-premises cloud data infrastructure, enable hybrid applications that run seamlessly across clouds. Take advantage of the agility and innovation of multiple clouds, building applications once, and then running them seamlessly on-premises and in the public cloud. Pure1 enables cloud-based, smart fleet management; advanced capacity planning; and workload simulation to deliver resources faster, control costs and forecast IT need. Pure as-a-Service delivers pay-as-you-go billing with scale up and down flexibility, competitive on-demand rates and unified subscriptions for both on-premises and cloud. The Modern
Data Experience from Pure Storage leverages
hybrid mobility alongside consistent storage
services, resiliency, and APIs across your hybrid
environment to give you the most flexibility
possible with your database deployments.
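
To put two of the figures quoted in this paper in concrete terms, the short Python sketch below converts an availability percentage into the downtime it allows per year, and projects database size under a fixed annual growth rate. It is a back-of-the-envelope illustration only: the function names, the 365-day year and the 10 TB starting size are assumptions made for the example and do not come from the surveys or from any vendor tooling.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def annual_downtime_seconds(availability_pct: float) -> float:
    """Maximum downtime per year implied by an availability percentage."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return (1.0 - availability_pct / 100.0) * SECONDS_PER_YEAR


def projected_size_tb(current_tb: float, annual_growth: float, years: int) -> float:
    """Compound growth: size after `years` at a fixed annual growth rate."""
    return current_tb * (1.0 + annual_growth) ** years


if __name__ == "__main__":
    # Compare common availability tiers, ending with "six nines".
    for pct in (99.9, 99.99, 99.999, 99.9999):
        print(f"{pct}% uptime -> {annual_downtime_seconds(pct):,.1f} s/year")
    # Hypothetical 10 TB database growing 50% a year for three years.
    print(f"{projected_size_tb(10, 0.50, 3):.2f} TB")
```

Under these assumptions, six-nines availability leaves roughly 31.5 seconds of downtime per year, and a 10 TB database growing 50% a year reaches 33.75 TB after three years.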

For more information, visit Pure Storage for Open Source Databases today.

Additional Resources:
• FlashArray
• Cloud Block Store
• Pure1
• Pure as-a-Service
• Pure Solutions for MySQL, MongoDB, PostgreSQL and Cassandra

[1] 2019 IOUG Databases in the Cloud Survey, prepared for Unisphere Research/Information Today, Inc. in cooperation with Amazon Web Services, January 2019. http://www.ioug.org/d/do/8551

[2] 2019 Open Source Database Report, Kristi Anderson, DZone, January 17, 2020. https://dzone.com/articles/2019-open-source-database-report-top-databases-pub

[3] 2019 Open Source Data Management Software Survey, Percona, 2019. https://learn.percona.com/hubfs/Percona_Open_Source_DataManagement_Software_Survey_2019.pdf

[4] Achieving Your 2018 Database Goals Through Replication: Real-World Market Insights and Best Practices, Unisphere Research, a division of Information Today, Inc., March 2018. https://www.dbta.com/DBTA-Downloads/ResearchReports/Achieving-Your-Database-Goals-Through-Replication-Real-World-Market-Insights-and-Best-Practices-8555.aspx

About Pure:
Get more from your data. Pure Storage empowers innovators to build a better world with data by delivering a simple, evergreen platform that enables organizations of all kinds to turn data into intelligence and advantage.
