Mysql Cluster Scaling Web Databases Webinar Aug 26
Rapidly iterate: evolve the app and the database together
Session Agenda
Best Practices in Scaling Web Services with MySQL Cluster
Scaling Reads & Writes with Auto-Sharding
On-Line Scaling with Commodity Hardware
Choosing the Right Interface(s): NoSQL and SQL
The presentation is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
99.999% Availability
Shared-nothing, integrated clustering & sub-second recovery, local & geographic replication, on-line operations
Low TCO
Open-source, management & monitoring tools, scale-out on commodity hardware
[Architecture diagram: REST and LDAP application nodes accessing the cluster's data nodes]
Web
- User profile management
- Session stores
- eCommerce
- On-Line Gaming
- Application Servers

Telecoms
- Subscriber Databases (HLR/HSS)
- Service Delivery Platforms
- VoIP, IPTV & VoD
- Mobile Content Delivery
- On-Line app stores and portals
- IP Management
- Payment Gateways
http://www.mysql.com/customers/cluster/
Auto-Sharding
Transparency is maintained during failover, upgrades and scale-out. No need for application-layer sharding logic. No need to limit the application to single-shard transactions (though doing so can improve efficiency).
http://www.mysql.com/why-mysql/white-papers/mysql_wp_cluster_perfomance.php
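The idea behind auto-sharding can be sketched as follows. This is an illustrative model only, not MySQL Cluster's actual internal algorithm (the server hashes the partition key, the primary key by default, to pick a partition); the function and constant names here are mine:

```python
import hashlib

# Illustrative sketch of hash-based auto-sharding: each row is mapped
# to a partition by hashing its partition-key value, so no
# application-layer sharding logic is needed.
NUM_PARTITIONS = 4

def partition_for(key: str) -> int:
    """Map a partition-key value to one of NUM_PARTITIONS partitions."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS

# Rows with the same key always land on the same partition...
assert partition_for("sub_id=19724") == partition_for("sub_id=19724")

# ...while many different keys spread across all partitions.
buckets = {partition_for(str(i)) for i in range(1000)}
assert buckets == set(range(NUM_PARTITIONS))
```

Because the mapping is deterministic, any node (or any API client) can compute where a row lives without consulting a central directory, which is what makes the sharding transparent to the application.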
Table T1 is horizontally partitioned into four partitions, P1-P4. A fragment is a copy of a partition (also known as a fragment replica), so the number of fragments = # of partitions * # of replicas. With two replicas, each partition has a primary fragment and a secondary fragment (fragment replica), stored on different data nodes:

- Data Node 1: F1 (primary), F3 (secondary)
- Data Node 2: F3 (primary), F1 (secondary)
- Data Node 3: F2 (primary), F4 (secondary)
- Data Node 4: F4 (primary), F2 (secondary)

Node groups are created automatically: # of node groups = # of data nodes / # of replicas. Here, Data Nodes 1 & 2 form Node Group 1 and Data Nodes 3 & 4 form Node Group 2.

As long as one data node in each node group is running, the cluster has a complete copy of the data. If every data node in any node group is lost, there is no complete copy of the data, and the cluster shuts down automatically.
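The fragment and node-group counts follow directly from the two formulas on the slides; a quick sanity check for the four-partition, two-replica, four-node example (variable names are mine):

```python
# Sanity-check of the slide formulas for the worked example:
# 4 partitions, 2 replicas, 4 data nodes.
num_partitions = 4
num_replicas = 2
num_data_nodes = 4

# Number of fragments = # of partitions * # of replicas
num_fragments = num_partitions * num_replicas

# Number of node groups = # of data nodes / # of replicas
num_node_groups = num_data_nodes // num_replicas

assert num_fragments == 8    # F1..F4, each stored twice
assert num_node_groups == 2  # {node 1, node 2} and {node 3, node 4}
```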
Overall design goal: minimize network round trips for your most important requests!
Primary Key: town

town        country  population
Maidenhead  UK       78000
Paris       France   2193031
Boston      UK       58124
Boston      USA      617594

The MySQL Server (or NDB API) will attempt to send each transaction to the correct data node. If all of the data for the transaction is in the same partition, there is less messaging, so the transaction runs faster.

Primary Key: sub_id

sub_id  age  gender
19724   25   male
84539   43   female
19724   16   female
74574   21   female
Extend partition awareness over multiple tables. The same rule applies: aim to have all of the data for an instance of your most frequently run transactions in the same partition.

ALTER TABLE service_ids PARTITION BY KEY(sub_id);

Partition Key: sub_id

service   sub_id  service_id
twitter   84539   67324782
facebook  19724   83753984
facebook  73642   87324793
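A schema sketch of this co-location pattern, hedged: the exact table definitions are not on the slides, so the column types and the composite key below are assumptions; only the table names, `sub_id` column and the `PARTITION BY KEY(sub_id)` clause come from the deck.

```sql
-- Sketch (column types and keys assumed): partition both tables by
-- sub_id so that all rows for one subscriber live in the same
-- partition, keeping the high-running transactions single-shard.
CREATE TABLE subscribers (
  sub_id  INT UNSIGNED NOT NULL,
  age     TINYINT UNSIGNED,
  gender  VARCHAR(6),
  PRIMARY KEY (sub_id)
) ENGINE=NDBCLUSTER;

CREATE TABLE service_ids (
  sub_id     INT UNSIGNED NOT NULL,
  service    VARCHAR(20)  NOT NULL,
  service_id BIGINT UNSIGNED,
  PRIMARY KEY (sub_id, service)
) ENGINE=NDBCLUSTER
PARTITION BY KEY (sub_id);
```

With both tables keyed on `sub_id`, a transaction that touches one subscriber's profile and all of their service IDs can usually be satisfied by a single data node.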
7.2 DM (Development Milestone Release)
Nested-loop join: when data is needed, mysqld must fetch it over the network from the data nodes, row by row. This adds latency and consumes resources.

Join execution can now be pushed down into the data nodes, greatly reducing the network trips - a 25x-40x performance gain in a customer PoC!
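A hedged illustration of what a pushed-down join looks like, using tables shaped like the subscriber and service_ids tables from the earlier slides (the query itself is hypothetical, not from the deck):

```sql
-- Hypothetical query: with MySQL Cluster 7.2's pushed-down (linked)
-- joins, the whole nested-loop join can execute inside the data
-- nodes instead of pulling rows into mysqld one at a time.
EXPLAIN
SELECT s.sub_id, s.age, i.service
FROM   subscribers s
JOIN   service_ids i ON i.sub_id = s.sub_id
WHERE  s.gender = 'female';
-- When the join is pushed, EXPLAIN's Extra column reports it with
-- text such as "Child of 's' in pushed join@1" (wording varies by
-- version); if a construct prevents pushing, that note is absent.
```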
The existence, content and timing of future releases described here is included for information only and may be changed at Oracle's discretion. May 26, 2011
http://www.mysql.com/news-and-events/on-demand-webinars/display-od-583.html
Scale Out
Need more throughput?
Data Node 1
Data Node 2
Scale Out
Data Node 1
Data Node 2
Data Node 3
Data Node 4
[Diagram: synchronous replication within each cluster; asynchronous replication between Cluster 1, Cluster 2 and InnoDB-based MySQL servers]
Mix & match! The same data can be accessed simultaneously through the SQL & NoSQL interfaces.
NoSQL: multiple ways to bypass SQL and maximize performance:
- NDB API: C++, for the highest performance and lowest latency
- ClusterJ: optimized access from Java
- NEW! Memcached: use all of your existing clients/applications
Which to Choose?
The application embeds the NDB API C++ interface library. The NDB API makes intelligent decisions (where possible) about which data node to send queries to.

With a little planning in the schema design, you can achieve linear scalability.

Used by all of the other application nodes (MySQL, LDAP, ClusterJ, ...). Best possible performance, but requires greater development skill. The favourite API for real-time network applications, and the foundation for all of the other interfaces.
MySQL Cluster Data Nodes
7.2 DM

Memcached protocol

Memcached is a distributed, memory-based hash-key/value store with no persistence to disk: NoSQL, a simple API, and popular with developers. MySQL Cluster already provides scalable, in-memory performance with NoSQL (hashed) access as well as persistence.

The idea: provide the Memcached API, but map it to NDB API calls.

Writes are applied in place, so there is no need to invalidate a cache. This simplifies the architecture, as caching & database are integrated into one tier, and the data can be accessed from existing relational tables.
7.2 DM

Flexible:
- Deployment options
- Multiple Clusters
- Simultaneous SQL access
- Can still cache in the Memcached server
- Flat key-value store, or map to multiple tables/columns

Simple:

set maidenhead 0 0 3
SL6
STORED
get maidenhead
VALUE maidenhead 0 3
SL6
END
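The set/get exchange on the slide follows the standard memcached text protocol: a header line, then the data block, each terminated by CRLF. A minimal client-side framing sketch (function names are mine; a real deployment would send these bytes over a socket to the memcached server):

```python
# Minimal sketch of memcached text-protocol framing, matching the
# slide's exchange: set maidenhead 0 0 3 / SL6 -> STORED, then
# get maidenhead -> VALUE maidenhead 0 3 / SL6 / END.

def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Frame a 'set' command: header line, data block, CRLF terminators."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode()
    return header + value + b"\r\n"

def build_get(key: str) -> bytes:
    """Frame a 'get' command for a single key."""
    return f"get {key}\r\n".encode()

def parse_get_reply(reply: bytes) -> dict:
    """Parse a single-key 'get' reply: VALUE line, data block, END."""
    lines = reply.split(b"\r\n")
    if lines[0] == b"END":
        return {}  # cache miss
    _, key, _flags, nbytes = lines[0].split(b" ")
    return {key.decode(): lines[1][: int(nbytes)]}

assert build_set("maidenhead", b"SL6") == b"set maidenhead 0 0 3\r\nSL6\r\n"
assert parse_get_reply(b"VALUE maidenhead 0 3\r\nSL6\r\nEND\r\n") == {"maidenhead": b"SL6"}
```

Because the Cluster memcached server speaks this same wire protocol, existing memcached clients work unchanged while the data actually lives in NDB tables.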
On-Line Operations
- Scale the cluster (add data nodes)
- Repartition tables
- Recover failed nodes
- Upgrade / patch servers & OS
- Upgrade / patch MySQL Cluster
- Back up
- Evolve the schema on-line
The application keeps running while two new data nodes are added to the cluster as a second node group.

Before: a single node group holds the whole table (authid (PK) 1, 2, 3 & 4):

authid (PK)  fname      lname      country
1            Albert     Camus      France
2            Ernest     Hemingway  USA
3            Johann     ...        Germany
4            Junichiro  Tanizaki   Japan

The table is then repartitioned on-line: the rows with authid 2 & 4 (Ernest Hemingway, USA; Junichiro Tanizaki, Japan) are copied to the new node group. Afterwards, Node Group 1 holds authid 1 & 3, and Node Group 2 holds authid 2 & 4.

http://www.mysql.com/why-mysql/white-papers/mysql_wp_cluster7_architecture.php

Copyright 2011 Oracle Corporation
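The steps above can be sketched as admin commands. These are hedged: the node IDs and the `clusterdb.authors` table name are illustrative, and the sketch assumes the new data nodes are already defined in config.ini and have been started with --initial.

```shell
# 1. Form a new node group from the two new data nodes:
ndb_mgm -e "CREATE NODEGROUP 3,4"

# 2. Repartition an existing table across the enlarged cluster,
#    on-line, while the application keeps running:
mysql -e "ALTER ONLINE TABLE clusterdb.authors REORGANIZE PARTITION;"

# 3. Reclaim the space freed on the original node group:
mysql -e "OPTIMIZE TABLE clusterdb.authors;"
```

Step 2 is run per table, so repartitioning can be rolled out gradually rather than in one big maintenance window.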
Case Studies
Why MySQL?
- Low-cost scalability
- High read and write throughput
- Extreme availability
"Since deploying MySQL Cluster as our eCommerce database, we have had continuous uptime with linear scalability, enabling us to exceed our most stringent SLAs."
- Sean Collier, CIO & COO, Shopatron Inc.
http://www.mysql.com/why-mysql/case-studies/mysql_cs_shopatron.php
COMPANY OVERVIEW
- UK-based retail and wholesale ISP & hosting services
- 2010 awards for best home broadband and customer service
- Acquired by BT in 2007

CUSTOMER PERSPECTIVE
"Since deploying our latest AAA platform, the MySQL environment has delivered continuous uptime, enabling us to exceed our most stringent SLAs."
-- Geoff Mitchell, Network Engineer

CHALLENGES / OPPORTUNITIES
- Enter the market for wholesale services, demanding more stringent SLAs
- Re-architect AAA systems for data integrity & continuous availability to support billing systems
- Consolidate data for ease of reporting and operating efficiency
- Fast time to market

RESULTS
- Continuous system availability, exceeding wholesale SLAs
- 2x faster time to market for new services
- Agility and scale by separating the database from the applications
- Improved management & infrastructure efficiency through database consolidation
No Compromise
Scale-Out, Real Time Performance, 99.999% Uptime
Proven
- Deployed across telecoms networks
- Powering mission-critical web and internet services
Getting Started
Learn More GA Release
Architecture & New Features Guide
www.mysql.com/cluster/