MySQL Cluster: Scaling Web Databases Webinar, Aug 26



Scaling Web Databases: MySQL Cluster


Auto-Partitioning, SQL & NoSQL Interfaces, Schema Flexibility
Ronen Baram, MySQL Senior Sales Consultant

The Reality of Being Successful on the Web


Scale fast
- writes as well as reads, linearly
- on commodity hardware
- without downtime

Choose the right tool for the job


- integrating multiple lightweight services
- multiple data access & data integrity requirements
- SQL & NoSQL, when and where

Rapidly iterate
- evolve the app and the database

Stay up, stay available

Session Agenda
Best Practices in Scaling Web Services with MySQL Cluster
- Scaling Reads & Writes with Auto-Sharding
- On-Line Scaling with Commodity Hardware
- Choosing the Right Interface(s): NoSQL and SQL
- Rapidly Evolving Web Services
- On-Line Scale-Out & Schema Updates
- Staying Up, Staying On-Line
- Case Studies
- Resources to Get Started

The presentation is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.

MySQL Cluster Overview

ACID-Compliant Relational Database


SQL & NoSQL interfaces

Write-Scalable & Real-Time


Distributed, auto-partitioning (sharding), multi-master

99.999% Availability
Shared-nothing, integrated clustering & sub-second recovery, local & geographic replication, on-line operations

Low TCO
Open-source, management & monitoring tools, scale-out on commodity hardware

MySQL Cluster Architecture

[Diagram: Application Nodes (REST, LDAP, and other interfaces) connect to the Data Nodes; the Data Nodes are arranged in Node Group 1 (Node 1 and Node 2, holding fragments F1/F3) and Node Group 2 (Node 3 and Node 4, holding fragments F2/F4), with Cluster Mgmt nodes overseeing the cluster]
MySQL Cluster - Extreme Resilience

[Diagram: the same Application Node / Data Node topology, illustrating resilience to the loss of individual nodes]

MySQL Cluster Users & Applications

HA, Transactional Services: Web & Telecoms

Web
- User profile management
- Session stores
- eCommerce
- On-Line Gaming
- Application Servers

Telecoms
- Subscriber Databases (HLR/HSS)
- Service Delivery Platforms
- VoIP, IPTV & VoD
- Mobile Content Delivery
- On-Line app stores and portals
- IP Management
- Payment Gateways

http://www.mysql.com/customers/cluster/

Auto-Sharding

Scale-Out Reads & Writes on Commodity Hardware

NDB API Performance: 4.33 M Queries per second!
- 8 Intel servers, dual 6-core CPUs @ 2.93 GHz, 24GB RAM
- 2 Data Nodes per server
- flexAsync benchmark
- 16 parallel threads, each issuing 256 simultaneous transactions
- Read / Write of a 100 byte attribute

Interim results from 2 days of testing - watch this space: mikaelronstrom.blogspot.com

Out of the Box Scalability: Auto-Sharding

- Partitioning happens automatically and transparently to the application
- A little knowledge of how it works, though, can massively increase application performance
- Transparency is maintained during failover, upgrades and scale-out
- No need for application-layer sharding logic
- No need to limit the application to single-shard transactions (though that can help efficiency)

http://www.mysql.com/why-mysql/white-papers/mysql_wp_cluster_perfomance.php

Automatic Data Partitioning

4 Partitions * 2 Replicas = 8 Fragments

[Animation: Table T1 is split into partitions P1-P4; each partition Px is stored twice, as a primary fragment and a secondary fragment (fragment replica), spread across Data Nodes 1-4: Node 1 holds F1/F3, Node 2 holds F3/F1, Node 3 holds F2/F4, Node 4 holds F4/F2]

- A fragment is a copy of a partition (aka fragment replica)
- Number of fragments = # of partitions * # of replicas
- Node groups are created automatically; # of groups = # of data nodes / # of replicas
- As long as one data node in each node group is running, we have a complete copy of the data
- With no complete copy of the data, the cluster shuts down automatically
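The placement scheme above can be sketched in a few lines. This is an illustrative model only - the hash function and round-robin assignment are assumptions for the sketch, not NDB's actual internal algorithm - but it reproduces the layout in the diagram (Node 1: F1/F3, Node 3: F2/F4, etc.).

```python
import hashlib

# Illustrative sketch of MySQL Cluster-style auto-sharding. Each row
# hashes to one of 4 partitions; each partition has a primary and a
# secondary fragment, and the two fragments of a partition live on the
# two data nodes of one node group.

NUM_PARTITIONS = 4
NUM_REPLICAS = 2
DATA_NODES = [1, 2, 3, 4]
# Node groups are created automatically: # of groups = # of data nodes / # of replicas
NUM_NODE_GROUPS = len(DATA_NODES) // NUM_REPLICAS  # 4 / 2 = 2

def partition_for(key):
    """Hash the partition key to a partition number (1..4)."""
    h = int(hashlib.md5(str(key).encode()).hexdigest(), 16)
    return h % NUM_PARTITIONS + 1

def fragments_for(partition):
    """Place the primary and secondary fragment of one partition.

    Partitions are assigned round-robin to node groups; within a group
    the primary lives on one node and the replica on the other.
    """
    group = (partition - 1) % NUM_NODE_GROUPS
    node_a = DATA_NODES[group * NUM_REPLICAS]      # first node of the group
    node_b = DATA_NODES[group * NUM_REPLICAS + 1]  # second node of the group
    if ((partition - 1) // NUM_NODE_GROUPS) % 2 == 0:
        return {"primary": node_a, "secondary": node_b}
    return {"primary": node_b, "secondary": node_a}

# 4 partitions * 2 replicas = 8 fragments in total
all_fragments = [fragments_for(p) for p in range(1, NUM_PARTITIONS + 1)]
assert len(all_fragments) * NUM_REPLICAS == 8
assert partition_for("Boston") in range(1, NUM_PARTITIONS + 1)
```

Note how losing one node of a group is survivable (its partner still holds both fragments), which is exactly the "one data node per node group" availability rule above.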

General Design Considerations

MySQL Cluster is designed for
- Short transactions
- Many parallel transactions

Therefore:
- Use simple access patterns to fetch data
- Use efficient scans and batching interfaces
- Analyze what your most typical use cases are, and optimize for those

Overall design goal: minimize network roundtrips for your most important requests!
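A toy cost model makes the roundtrip goal concrete. The latency numbers here are assumptions for illustration, not measured MySQL Cluster figures: when each network message carries a fixed cost, fetching rows one at a time is dominated by roundtrips, while a batched read pays that cost once per batch.

```python
# Hypothetical costs: a fixed per-message network latency plus a small
# per-row processing cost. Values are illustrative assumptions.
ROUND_TRIP_MS = 0.5
ROW_COST_MS = 0.01

def one_by_one(num_rows):
    """Fetch each row with its own network roundtrip."""
    return num_rows * (ROUND_TRIP_MS + ROW_COST_MS)

def batched(num_rows, batch_size=100):
    """Fetch rows in batches: one roundtrip per batch."""
    batches = -(-num_rows // batch_size)  # ceiling division
    return batches * ROUND_TRIP_MS + num_rows * ROW_COST_MS

# Batching 1000 rows turns 1000 roundtrips into 10.
assert batched(1000) < one_by_one(1000)
```

Under these assumed numbers the batched read is roughly 34x cheaper (15 ms vs 510 ms), which is why the batching interfaces matter for the hot paths.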

Best Practice: Primary Keys

To avoid problems with
- Cluster-to-Cluster replication
- Recovery
- Application behavior (KEY NOT FOUND, etc.)

ALWAYS DEFINE A PRIMARY KEY ON THE TABLE!

A hidden PRIMARY KEY is added if no PK is specified, BUT this is NOT recommended: the hidden primary key is, for example, not replicated between Clusters! There are known problems in this area, so avoid them.

So always have at least

id BIGINT AUTO_INCREMENT PRIMARY KEY

even if you don't need it for your application.

Best Practice: Distribution Aware Apps

SELECT SUM(population) FROM towns WHERE country='UK';

- The partition is selected using a hash on the Partition Key
- The Partition Key is the Primary Key by default; the user can override this in the table definition
- The MySQL Server (or NDB API) will attempt to send the transaction to the correct data node
- If all data for the transaction is in the same partition, there is less messaging, hence faster execution

towns (Primary Key: town, country):
town        country  population
Maidenhead  UK       78000
Paris       France   2193031
Boston      UK       58124
Boston      USA      617594

SELECT SUM(population) FROM towns WHERE town='Boston';

- With town as the Partition Key, both Boston rows live in the same partition
- Aim to have all rows for high-running queries in the same partition
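The effect of choosing the partition key can be sketched directly (the hash function here is an illustrative stand-in, not NDB's): with town as the partition key, every row matching WHERE town='Boston' hashes to the same partition, so the query is a single-partition read.

```python
import hashlib

NUM_PARTITIONS = 4
# The towns table from the slide.
ROWS = [
    ("Maidenhead", "UK", 78000),
    ("Paris", "France", 2193031),
    ("Boston", "UK", 58124),
    ("Boston", "USA", 617594),
]

def part(key):
    """Illustrative partition hash (not NDB's real function)."""
    return int(hashlib.md5(str(key).encode()).hexdigest(), 16) % NUM_PARTITIONS

# Partition by town: both Boston rows land in the same partition, so
# SELECT SUM(population) FROM towns WHERE town='Boston' touches one partition.
boston_partitions = {part(town) for town, _, _ in ROWS if town == "Boston"}
assert len(boston_partitions) == 1
```

The same idea is why a query that does not fix the whole partition key (WHERE country='UK' when partitioning by town) may fan out to every partition.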

Best Practice: Distribution Awareness Across Multiple Tables

- Extend partition awareness over multiple tables
- Same rule: aim to have all data for an instance of a high-running transaction in the same partition

ALTER TABLE service_ids PARTITION BY KEY(sub_id);

subscribers (Primary Key & Partition Key: sub_id):
sub_id  age  gender
19724   25   male
84539   43   female
19724   16   female
74574   21   female

service_ids (Partition Key: sub_id):
service   sub_id  svc_id
twitter   19724   76325732
twitter   84539   67324782
facebook  19724   83753984
facebook  73642   87324793

Scaling Distributed Joins: Adaptive Query Localization (7.2 DM)

- Complex joins were traditionally slower in MySQL Cluster (complex = many levels and interim results in the JOIN)
- The JOIN was implemented in the MySQL Server (mysqld) as a nested-loop join: when data is needed, it must be fetched over the network from the Data Nodes, row by row - this causes latency and consumes resources
- AQL can now push the join execution down into the data nodes, greatly reducing the network trips
- 25x-40x performance gain in a customer PoC!

The existence, content and timing of future releases described here is included for information only and may be changed at Oracle's discretion. May 26, 2011

http://www.mysql.com/news-and-events/on-demand-webinars/display-od-583.html
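A simple roundtrip count shows where the AQL gain comes from. This is a toy model, not the real optimizer: without pushdown the MySQL Server performs the nested-loop join itself, fetching inner-table rows from the data nodes once per outer row; with pushdown the whole join ships to the data nodes in a single request.

```python
def roundtrips_server_side_join(outer_rows):
    """Nested-loop join in mysqld: 1 trip for the outer scan,
    plus 1 trip per outer row for the inner-table lookup."""
    return 1 + outer_rows

def roundtrips_pushed_join(outer_rows):
    """Adaptive Query Localization: the join executes inside the
    data nodes, so the server makes a single request."""
    return 1

# For a 1000-row outer table the pushed join saves ~1000 network trips.
assert roundtrips_server_side_join(1000) == 1001
assert roundtrips_pushed_join(1000) == 1
```

The deeper the join nesting, the more per-row fetches the server-side plan incurs, which is consistent with the slide's note that "complex" joins suffered most.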


Early Adopter Speaks!

"Testing of Adaptive Query Localization has yielded over 20x higher performance on complex queries within our application, enabling Docudesk to expand our use of MySQL Cluster into a broader range of highly dynamic web services."
Casey Brown, Manager, Development & DBA Services, Docudesk

Scale Out

Need more throughput? Oops, need to increase capacity as well!

[Animation: the cluster starts with Data Node 1 and Data Node 2; to add both throughput and capacity, Data Node 3 and Data Node 4 are added]

Scaling Across Data Centers

Geographic Replication with Multi-Master Replication
- Synchronous replication within a Cluster node group for HA
- Bi-directional asynchronous replication to a remote Cluster for geographic redundancy
- Master-slave or multi-master
- Automated conflict detection and resolution
- Asynchronous replication to non-Cluster databases (e.g. InnoDB) for specialised activities such as report generation
- Mix and match replication types

[Diagram: Cluster 1 and Cluster 2 linked by asynchronous replication, with synchronous replication inside each cluster and asynchronous feeds to InnoDB replicas]

SQL & NoSQL Interfaces

Performance | Flexibility | Simplification

SQL and NoSQL access methods to the same tables
- SQL: complex queries, rich ecosystem of apps & expertise
- Simple key/value interfaces bypassing the SQL layer for blazing fast reads & writes
- Real-time interfaces for micro-second latency
- Developers free to work in their preferred environment

MySQL Cluster: SQL & NoSQL Combined
- Mix & match! The same data accessed simultaneously through SQL & NoSQL interfaces
- NoSQL: multiple ways to bypass SQL and maximize performance:
  - NDB API: C++ for highest performance, lowest latency
  - ClusterJ for optimized access in Java
  - NEW! Memcached: use all your existing clients/applications

Which to Choose?

NoSQL with the NDB API
- Best possible performance, but requires more development skill
- The application embeds the NDB API C++ interface library
- The NDB API makes intelligent decisions (where possible) about which data node to send queries to
- With a little planning in the schema design, achieve linear scalability
- Used by all of the other application nodes (MySQL, LDAP, ClusterJ, ...)
- Favourite API for real-time network applications
- Foundation for all interfaces

[Diagram: clients connect to applications with the embedded NDB API library, which talk directly to the MySQL Cluster Data Nodes]

NoSQL with Memcached (7.2 DM)

- Memcached is a distributed, memory-based, hash-key/value store with no persistence to disk
- NoSQL, simple API, popular with developers
- MySQL Cluster already provides scalable, in-memory performance with NoSQL (hashed) access as well as persistence
- The Memcached protocol is provided but mapped to NDB API calls
- Writes-in-place, so no need to invalidate the cache
- Simplifies the architecture, as caching & database are integrated into one tier
- Access data from existing relational tables

Pre-GA version available from labs.mysql.com

Flexible:
- Deployment options
- Multiple Clusters
- Simultaneous SQL access
- Can still cache in the Memcached server
- Flat key-value store, or map to multiple tables/columns

Simple:
set maidenhead 0 0 3
SL6
STORED
get maidenhead
VALUE maidenhead 0 3
SL6
END
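The set/get exchange above can be mimicked with a minimal in-memory mock of the memcached text-protocol semantics (key, flags, byte length). This is a sketch for illustration only; a real deployment would point an existing memcached client at the Cluster's memcached server rather than use anything like this class.

```python
class MiniMemcache:
    """Toy stand-in for a memcached server's set/get semantics."""

    def __init__(self):
        self.store = {}

    def set(self, key, flags, exptime, value):
        # Store the value with its flags; reply as the protocol does.
        self.store[key] = (flags, value)
        return "STORED"

    def get(self, key):
        # A miss returns just the END terminator; a hit returns the
        # VALUE line (key, flags, byte length), the data, then END.
        if key not in self.store:
            return "END"
        flags, value = self.store[key]
        return f"VALUE {key} {flags} {len(value)}\r\n{value}\r\nEND"

# Reproduce the slide's exchange: set maidenhead -> STORED, get -> SL6.
mc = MiniMemcache()
assert mc.set("maidenhead", 0, 0, "SL6") == "STORED"
assert "SL6" in mc.get("maidenhead")
```

In the Cluster case the store behind set/get is not a cache but the cluster's own tables, which is why writes-in-place need no cache invalidation.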

Supporting Rapidly Evolving Services

On-Line Operations
- Scale the cluster (add data nodes)
- Repartition tables
- Recover failed nodes
- Upgrade / patch servers & OS
- Upgrade / patch MySQL Cluster
- Back-up
- Evolve the schema on-line

Online Add Node (1) - add node group

[Diagram: the application works against the existing Node Group, which holds the full table; an empty New Node Group is added]

authid (PK)  fname      lname      country
1            Albert     Camus      France
2            Ernest     Hemingway  USA
3            Johann     Goethe     Germany
4            Junichiro  Tanizaki   Japan

http://www.mysql.com/why-mysql/white-papers/mysql_wp_cluster7_architecture.php
Copyright 2011 Oracle Corporation

Online Add Node (2) - copy data

[Diagram: the table is copied to the New Node Group while the application continues working against the existing Node Group]

authid (PK)  fname      lname      country
1            Albert     Camus      France
2            Ernest     Hemingway  USA
3            Johann     Goethe     Germany
4            Junichiro  Tanizaki   Japan

No extra space needed on existing nodes!

Online Add Node (3) - switch distribution

[Diagram: the New Node Group takes over serving its share of the rows]

Existing Node Group:
authid (PK)  fname      lname      country
1            Albert     Camus      France
2            Ernest     Hemingway  USA
3            Johann     Goethe     Germany
4            Junichiro  Tanizaki   Japan

New Node Group:
authid (PK)  fname      lname      country
2            Ernest     Hemingway  USA
4            Junichiro  Tanizaki   Japan

Online Add Node (4) - delete rows

Dynamic scaling of a running Cluster - no interruption to service

Node Group 1:
authid (PK)  fname   lname   country
1            Albert  Camus   France
3            Johann  Goethe  Germany

Node Group 2:
authid (PK)  fname      lname      country
2            Ernest     Hemingway  USA
4            Junichiro  Tanizaki   Japan

On-Line Schema Changes

- Fully online - transaction response times unchanged
- Add and remove indexes, add new columns and tables
- No temporary table creation
- No recreation or deletion of data required
- Faster and better performing table maintenance operations
- Lower memory and disk requirements
CREATE OFFLINE INDEX b ON t1(b);
Query OK, 1356 rows affected (2.20 sec)
DROP OFFLINE INDEX b ON t1;
Query OK, 1356 rows affected (2.03 sec)
CREATE ONLINE INDEX b ON t1(b);
Query OK, 0 rows affected (0.58 sec)
DROP ONLINE INDEX b ON t1;
Query OK, 0 rows affected (0.46 sec)
ALTER ONLINE TABLE t1 ADD COLUMN d INT;
Query OK, 0 rows affected (0.36 sec)

Case Studies

Shopatron: eCommerce Platform

Applications
- eCommerce back-end: user authentication, order data & fulfilment, payment data & inventory tracking
- Supports several thousand queries per second

Key business benefits
- Scale quickly and at low cost to meet demand
- Self-healing architecture, reducing TCO

Why MySQL?
- Low cost scalability
- High read and write throughput
- Extreme availability

"Since deploying MySQL Cluster as our eCommerce database, we have had continuous uptime with linear scalability, enabling us to exceed our most stringent SLAs."
Sean Collier, CIO & COO, Shopatron Inc

http://www.mysql.com/why-mysql/case-studies/mysql_cs_shopatron.php

COMPANY OVERVIEW
- UK-based retail and wholesale ISP & hosting services
- 2010 awards for best home broadband and customer service
- Acquired by BT in 2007

CUSTOMER PERSPECTIVE
"Since deploying our latest AAA platform, the MySQL environment has delivered continuous uptime, enabling us to exceed our most stringent SLAs."
-- Geoff Mitchell, Network Engineer

CHALLENGES / OPPORTUNITIES
- Enter the market for wholesale services, demanding more stringent SLAs
- Re-architect AAA systems for data integrity & continuous availability to support billing systems
- Consolidate data for ease of reporting and operating efficiency
- Fast time to market

SOLUTIONS
- MySQL Cluster
- MySQL Server with InnoDB

RESULTS
- Continuous system availability, exceeding wholesale SLAs
- 2x faster time to market for new services
- Agility and scale by separating database from applications
- Improved management & infrastructure efficiency through database consolidation

Summary: MySQL Cluster

Web-Scale Performance with Carrier-Grade Availability
- SQL & NoSQL access methods

No Compromise
- Scale-out, real-time performance, 99.999% uptime

Proven
- Deployed across telecoms networks
- Powering mission-critical web and internet services

Getting Started
- Learn more (GA release): Architecture & New Features Guide - www.mysql.com/cluster/
- Evaluate MySQL Cluster 7.2 - download today:
  http://dev.mysql.com/downloads/cluster/
  http://labs.mysql.com (memcached)
- Quick Start Guides (Linux, Solaris, Windows):
  http://tinyurl.com/5wkl4dy
