and Scalability
with MySQL
Ask Bjørn Hansen
Develooper LLC
1
Real World
Web Scalability
MySQL Edition
Ask Bjørn Hansen
Develooper LLC
2
Hello.
• I’m Ask Bjørn Hansen
4
Questions ...
• How many saw my talk last year?
5
• Lesson number 1
• Think Horizontal!
• Everything in your architecture, not just the front
end web servers
6
Benchmarking techniques
• Scalability isn't the same as processing time
7
Vertical scaling
• “Get a bigger server”
8
Horizontal scaling
• “Just add another box” (or another thousand or ...)
9
Scalable
Application
Servers
Don’t paint yourself into a corner from the
start
10
Run Many of Them
11
Stateless vs Stateful
• “Shared Nothing”
12
Caching
How to not do all that work again and again and again...
13
Generate Static Pages
14
Cache full pages
(or responses if it’s an API)
15
Cache full pages 2
• Front end cache (mod_cache, squid, ...) stores
generated content
16
Cache partial pages
• Pre-generate static page “snippets”
(this is what my.yahoo.com does or used to do...)
17
Cache data
18
Cache hit-ratios
19
Caching Tools
Where to put the cache data ...
20
A couple of bad ideas
Don’t do this!
21
MySQL cache table
• Write into one or more cache tables
22
MySQL Cache Fails
23
MySQL Cache Scales
• Partitioning
24
memcached
• LiveJournal’s distributed caching system
(also used at slashdot, wikipedia, etc etc)
• memory based
• No “master”
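A minimal sketch of what "no master" can look like from the client side, assuming simple modulo hashing: every web server hashes the key and picks the cache server itself. The server addresses and `pick_server` are hypothetical and are not the memcached client API.

```python
import hashlib

# Hypothetical cache servers; every client configured with the same list
# picks the same server for a given key, so no master is needed.
SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]

def pick_server(key, servers=SERVERS):
    """Map a cache key to one server by hashing the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]
```

With plain modulo hashing, adding or removing a server remaps most keys; consistent hashing is a common way to reduce that, and is not shown here.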
Vertical: ~$3,500,000
Horizontal: ~$2,000 ( = 1750 for $3.5M!)
26
Be Simple
• Use MySQL
27
Replication
More data more places!
Share the love load
28
Basic Replication
• Write to one master webservers
writes
reads
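A rough sketch of the routing this implies in the application, assuming reads are rotated over replication slaves while every write goes to the one master. The class name and host names are made up for illustration.

```python
import itertools

class ReplicatedPool:
    """Route writes to the single master and rotate reads over the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)

    def for_write(self):
        # All writes go to one place so replication has a single source.
        return self.master

    def for_read(self):
        # Round-robin over the slaves to spread the read load.
        return next(self._slaves)

pool = ReplicatedPool("db-master", ["db-slave1", "db-slave2"])
```

Because replication lags, a read that must see a write made a moment ago may still need to go to the master.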
29
Relay slave
[Diagram labels: webservers, data loading script, master, reads, loadbalancer]
30
Replication Scaling – Reads
• Reading scales well with replication
31
Replication Scaling – Writes
(aka when replication sucks)
32
Partition the data
Divide and Conquer!
or
Web 2.0 Buzzword Compliant!
Now free with purchase of milk!!
33
Partition your data
Cat cluster Dog cluster
34
Cluster data with a master server
• Can’t divide data up in “dogs” and “cats”?
[Diagram labels: cluster 3, webservers]
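One way to read "cluster data with a master server", sketched under the assumption that the master only stores which cluster holds each user and the clusters hold the data itself. `GlobalDirectory` and the cluster names are hypothetical.

```python
class GlobalDirectory:
    """Stand-in for the master server: remembers which cluster holds each user."""

    def __init__(self, clusters):
        self.clusters = list(clusters)
        self.user_cluster = {}

    def cluster_for(self, user_id):
        # New users go to the cluster with the fewest users so far.
        if user_id not in self.user_cluster:
            counts = {c: 0 for c in self.clusters}
            for c in self.user_cluster.values():
                counts[c] += 1
            self.user_cluster[user_id] = min(self.clusters, key=lambda c: counts[c])
        return self.user_cluster[user_id]

    def move(self, user_id, cluster):
        # Rebalancing is a directory update (after copying the user's data).
        self.user_cluster[user_id] = cluster
```

The lookup itself is small and hot, so it is a natural candidate for caching.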
35
How this helps “Web 2.0”
• Don’t have replication slaves!
36
Hacks!
Don’t be afraid of the data-duplication monster
37
Summary tables!
• Find queries that do things with COUNT(*) and
GROUP BY and create tables with the results!
• or hourly/daily/... updates
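A small sketch of a summary table, using Python's built-in `sqlite3` as a stand-in for MySQL so it runs anywhere. The `page_views` and `page_view_counts` tables are invented for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE page_views (page TEXT, viewed_at TEXT)")
db.execute("CREATE TABLE page_view_counts (page TEXT PRIMARY KEY, views INTEGER)")
db.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [("/home", "2006-04-24"), ("/home", "2006-04-24"), ("/about", "2006-04-25")],
)

def rebuild_summary(conn):
    """Run the expensive COUNT(*) / GROUP BY once and store the result."""
    with conn:  # commit the rebuild when the block ends
        conn.execute("DELETE FROM page_view_counts")
        conn.execute(
            "INSERT INTO page_view_counts (page, views) "
            "SELECT page, COUNT(*) FROM page_views GROUP BY page"
        )

rebuild_summary(db)
# Readers now hit the small summary table instead of scanning page_views.
counts = dict(db.execute("SELECT page, views FROM page_view_counts"))
```

The rebuild can be run from a scheduled job (hourly, daily, ...) and is safe to run again if it fails halfway.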
38
Summary databases!
• Don’t just create summary tables
39
“Manual” replication
• Save data to multiple “partitions”
40
a brief diversion ...
• Move read operations to MySQL!
• Replicate from Oracle to MySQL
[Diagram labels: writes, replication, Oracle, program, loadbalancer]
41
Make everything repeatable
• Script failed in the middle of the nightly processing job?
(it will sooner or later, no matter what)
42
More MySQL
Faster, faster, faster ....
43
Table Choice
• Short version:
Use InnoDB, it’s harder to make it fall over
• Long version:
Use InnoDB except for
44
Multiple MySQL instances
• Run different MySQL instances for different workloads
• Simpler replication
45
Asynchronous data loading
• Updating counts? Loading logs?
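One possible reading, sketched here as an assumption: collect count updates in memory and write them to the database in batches rather than once per request. `sqlite3` stands in for MySQL, and `CountBuffer` and the `counts` table are hypothetical.

```python
import sqlite3
from collections import Counter

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE counts (key TEXT PRIMARY KEY, n INTEGER NOT NULL)")

class CountBuffer:
    """Collect increments in memory and write them in one batch."""

    def __init__(self, conn, flush_every=100):
        self.conn = conn
        self.flush_every = flush_every
        self.pending = Counter()
        self.seen = 0

    def hit(self, key):
        self.pending[key] += 1
        self.seen += 1
        if self.seen >= self.flush_every:
            self.flush()

    def flush(self):
        with self.conn:  # one transaction per batch
            for key, n in self.pending.items():
                self.conn.execute(
                    "INSERT OR IGNORE INTO counts (key, n) VALUES (?, 0)", (key,)
                )
                self.conn.execute(
                    "UPDATE counts SET n = n + ? WHERE key = ?", (n, key)
                )
        self.pending.clear()
        self.seen = 0
```

The trade-off is that counts in the database lag slightly, and increments still in memory are lost if the process dies before a flush.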
46
Preload, -dump and -process
• Let the servers do as much as possible without touching
the database directly
47
Stored Procedures
Dangerous
• Not horizontal
48
Reconsider Persistent DB
Connections
• DB connection = thread = memory
49
InnoDB configuration
• innodb_file_per_table
Splits your innodb data into a file per table instead of one
big annoying file
• innodb_buffer_pool_size=($MEM*0.80)
• innodb_flush_log_at_trx_commit setting
• innodb_log_file_size
• transaction-isolation = READ-COMMITTED
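The `$MEM*0.80` rule can be worked out with a few lines of arithmetic. This sketch assumes a machine dedicated to MySQL; the function names are made up, and only the settings the slide gives a value for are emitted.

```python
def innodb_buffer_pool_size(total_mem_bytes, fraction=0.80):
    """Return the buffer pool size in whole megabytes, formatted for my.cnf."""
    megabytes = int(total_mem_bytes * fraction) // (1024 * 1024)
    return f"{megabytes}M"

def my_cnf_fragment(total_mem_bytes):
    """Build a my.cnf fragment from the values listed on the slide."""
    return "\n".join([
        "[mysqld]",
        "innodb_file_per_table",
        f"innodb_buffer_pool_size = {innodb_buffer_pool_size(total_mem_bytes)}",
        "transaction-isolation = READ-COMMITTED",
    ])

# Example: a box with 8 GiB of RAM.
fragment = my_cnf_fragment(8 * 1024**3)
```

On a box that also runs web or application processes, a smaller fraction is the safer starting point.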
50
Store Large Binary Objects
(aka how to store images)
51
Random Application Notes
• Everything is Unicode, please!
52
Don’t overwork the DB
• Databases don’t easily scale
53
Sessions
“The key to be stateless”
or
“What goes where”
54
Evil Session
Cookie: session_id=12345
Web/application server with local session store
... with this?
12345 => {
    ... => { username => 'joe',
             email => 'joe@example.com',
             id => 987,
           },
    shopping_cart => { ... },
    last_viewed_items => { ... },
    background_color => 'blue',
},
12346 => { ... },
....
55
Evil Session
Cookie: session_id=12345
56
Good Session!
Cookie: sid=seh568fzkj5k09z; user=987-65abc; bg_color=blue; cart=...;
Web/application server
• Stateless web server!
• Important data in a database
• Individual expiration on session objects
• Small data items in cookies
Database(s)
    Users
        987 => { username => 'joe',
                 email => 'joe@example.com',
               },
        ...
    Shopping Carts
        ...
memcached cache
    seh568fzkj5k09z => { last_viewed_items => {...},
                         ... other "junk" ...
                       },
    ....
57
Safe cookies
• Worried about manipulated cookies?
• cookie=1/user::987/cart::943/ts::1123.../EFGH9876
• cookie=$cookie_format_version
/$key::$value[/$key::$value]
/ts::$timestamp
/$md5
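A sketch of the cookie format above. Note the swap: where the slide says `$md5`, this example uses HMAC-SHA256 with a server-side secret from Python's standard library; that choice, the `SECRET` value, the function names, and the `max_age` check are assumptions, not something the slides specify. Keys and values are assumed not to contain `/` or `::`.

```python
import hashlib
import hmac
import time

SECRET = b"change-me"   # hypothetical server-side secret, never sent to clients
FORMAT_VERSION = "1"

def _sign(payload):
    return hmac.new(SECRET, payload.encode("utf-8"), hashlib.sha256).hexdigest()

def make_cookie(values, now=None):
    """Build version/key::value/.../ts::timestamp/signature."""
    ts = int(time.time() if now is None else now)
    parts = [FORMAT_VERSION]
    parts += [f"{k}::{v}" for k, v in values.items()]
    parts.append(f"ts::{ts}")
    payload = "/".join(parts)
    return f"{payload}/{_sign(payload)}"

def read_cookie(cookie, max_age=3600, now=None):
    """Return the values if signature, version and age check out, else None."""
    payload, _, signature = cookie.rpartition("/")
    if not hmac.compare_digest(signature, _sign(payload)):
        return None  # manipulated or truncated cookie
    version, *pairs = payload.split("/")
    if version != FORMAT_VERSION:
        return None
    values = dict(pair.split("::", 1) for pair in pairs)
    ts = int(values.pop("ts"))
    current = time.time() if now is None else now
    if current - ts > max_age:
        return None  # too old
    return values
```

The signature only detects tampering; the values are still readable by the user, so nothing secret belongs in the cookie.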
58
Use your
resources wisely
don’t implode when things run warm
59
Resource management
60
Do the work in parallel
• Split the work into smaller (but reasonable) pieces
and run them on different boxes
61
Use light processes
for light tasks
• perlbal
– new & improved, now with vhost support!
62
Proxy illustration
perlbal or mod_proxy
low memory/resource usage
Users
backends
lots of memory
db connections etc
63
Light processes
• Save memory and database connections
• Load balancing
64
Light processes
• Apache 2 makes it Really Easy
• ProxyPreserveHost On
<VirtualHost *>
ServerName combust.c2.askask.com
ServerAlias *.c2.askask.com
RewriteEngine on
RewriteRule (.*) http://localhost:8230$1 [P]
</VirtualHost>
65
Job queues
66
Job Queues
• Database “queue”
• Other ways...
• gearman
• Spread
[Diagram: multiple workers]
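The queue-and-workers shape can be sketched in a single process with Python's standard `queue` and `threading` modules. This is only a stand-in for the idea: it does not use gearman, Spread, or a database queue, and `run_jobs` is a made-up name.

```python
import queue
import threading

def run_jobs(jobs, handler, workers=3):
    """Run handler(job) for every job on a small pool of worker threads."""
    todo = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            job = todo.get()
            if job is None:          # sentinel: no more work for this worker
                break
            outcome = handler(job)
            with lock:
                results.append((job, outcome))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for job in jobs:
        todo.put(job)
    for _ in threads:
        todo.put(None)               # one sentinel per worker
    for t in threads:
        t.join()
    return results                   # completion order, not submission order
```

In a real deployment the queue lives outside the web process, so requests can return as soon as the job is enqueued and workers can run on other boxes.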
67
Log http requests!
• Log slow http transactions to a database
time, response_time, uri, remote_ip, user_agent, request_args, user,
svn_branch_revision, log_reason (a “SET” column), ...
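A minimal sketch of slow-request logging with a subset of the columns listed above, using `sqlite3` as a stand-in for the real database. The one-second threshold, the table name, and the function names are assumptions for illustration.

```python
import sqlite3

SLOW_SECONDS = 1.0  # hypothetical threshold for "slow"

def open_log(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS slow_requests ("
        "time REAL, response_time REAL, uri TEXT, remote_ip TEXT, user_agent TEXT)"
    )
    return conn

def log_if_slow(conn, started, finished, uri, remote_ip, user_agent):
    """Insert one row for requests slower than the threshold; skip the rest."""
    response_time = finished - started
    if response_time < SLOW_SECONDS:
        return False
    with conn:
        conn.execute(
            "INSERT INTO slow_requests VALUES (?, ?, ?, ?, ?)",
            (started, response_time, uri, remote_ip, user_agent),
        )
    return True
```

Logging only the slow requests keeps the table small enough to query by hand when something goes wrong.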
68
Get good deals on servers
• Silicon Mechanics
http://www.siliconmechanics.com/
69
remember
Think Horizontal!
70
Hiring!
• Contractors and dedicated moonlighters!
• Help me with $client_project ($$)
• Help me with $super_secret_startup (fun!)
• Perl / MySQL
• Javascript/AJAX
• ask@develooper.com
(resume in text or pdf, code samples)
71
Thanks!
• Direct and indirect help from ...
• Cal Henderson, Flickr
• Brad Fitzpatrick, LiveJournal
• Kevin Scaldeferri, Overture Yahoo!
• Perrin Harkins, Plus Three
• Tim Bunce
• David Wheeler, Tom Metro
72
– The End –
Questions?
Thank you!
73