
“My Life with HBase”

Lars George, CTO of WorldLingo


Apache Hadoop HBase Committer
www.worldlingo.com www.larsgeorge.com
WorldLingo
 Co-founded 1999
 Machine Translation Services
 Professional Human Translations
 Offices in US and UK
 Microsoft Office Provider since 2001
 Web-based services
 Customer Projects
 Multilingual Archive
Multilingual Archive
 SOAP API
 Simple calls
◦ putDocument()
◦ getDocument()
◦ search()
◦ command()
◦ putTransformation()
◦ getTransformation()
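
As a rough sketch only, the Java-side shape of such an interface might look like the following; the operation names come from the list above, but every signature and type is invented for illustration, not WorldLingo's actual SOAP contract.

// Illustrative sketch only: operation names are from the list above,
// all parameter and return types are invented for illustration.
public interface MultilingualArchive {
    String putDocument(byte[] document, String language); // returns a document id
    byte[] getDocument(String documentId, String language);
    String[] search(String query);
    String command(String name, String... args);
    String putTransformation(byte[] transformation);      // returns a transformation id
    byte[] getTransformation(String transformationId);
}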
Multilingual Archive (cont.)
 Already planned, then implemented as a customer project
 Scale:
◦ 500 million documents
◦ Random Access
◦ “100%” Uptime
 Technologies?
◦ Database
◦ Zip-Archives on file system, or Hadoop
RDBMS Woes
 Scaling MySQL hard, Oracle expensive (and
hard)
 Machine cost goes up faster than speed does
 Turn off all relational features to scale
 Turn off secondary indexes too
 Tables can be a problem at sizes as low as
500GB
 Hard to read data quickly at these sizes
 Write speed degrades with table size
 Future growth uncertain
MySQL Limitations
 Master becomes a problem
 What if your write speed is greater than a single machine can handle?
 All slaves must have the same write capacity as the master (can't cheap out on slaves)
 Single point of failure, no easy failover
 Can (sort of) solve this with sharding
Sharding
Sharding Problems
 Requires either a hashing function or
mapping table to determine shard
 Data access code becomes complex
 What if shard sizes become too large?
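
A minimal sketch of the hash-based routing described above (all names are illustrative, not WorldLingo code):

import java.util.List;

// Pick a MySQL shard by hashing the row key.
class ShardRouter {
    private final List<String> shardUrls; // one JDBC URL per shard

    ShardRouter(List<String> shardUrls) {
        this.shardUrls = shardUrls;
    }

    String shardFor(String key) {
        // Mask the sign bit rather than Math.abs (abs overflows for MIN_VALUE)
        int bucket = (key.hashCode() & 0x7fffffff) % shardUrls.size();
        return shardUrls.get(bucket);
    }
}

Note that adding a shard changes shardUrls.size() and therefore remaps almost every key, which is exactly why resharding hurts.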
Resharding
Schema Changes
 What about schema changes or
migrations?
 MySQL not your friend here
 Only gets harder with more data
HBase to the Rescue
 Clustered, commodity(-ish) hardware
 Mostly schema-less
 Dynamic distribution
 Spreads writes out over the cluster
HBase
 Distributed database modeled on Bigtable
◦ Bigtable: A Distributed Storage System for
Structured Data by Chang et al.
 Runs on top of Hadoop Core
 Layers on HDFS for storage
 Native connections to MapReduce
 Distributed, High Availability, High
Performance, Strong Consistency
HBase
 Column-oriented store
◦ Wide table costs only the data stored
◦ NULLs in row are 'free'
◦ Good compression: columns of similar type
◦ Column name is arbitrary
 Rows stored in sorted order
 Can random read and write
 Goal of billions of rows × millions of cells
◦ Petabytes of data across thousands of servers
Tables
 Table is split into roughly equal-sized "regions"
 Each region is a contiguous range of keys, [start, end)
 Regions split as they grow, thus
dynamically adjusting to your data set
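
Conceptually, finding the region for a row key is a floor lookup over sorted start keys; a toy sketch (server names and keys are made up):

import java.util.TreeMap;

public class RegionLookup {
    public static void main(String[] args) {
        // start key of each region -> server hosting it; the region covering
        // a row key is the one with the greatest start key <= that key
        TreeMap<String, String> regions = new TreeMap<>();
        regions.put("", "server1");  // first region: ["", "m")
        regions.put("m", "server2"); // last region: ["m", <end of table>)

        System.out.println(regions.floorEntry("kiwi").getValue());  // server1
        System.out.println(regions.floorEntry("zebra").getValue()); // server2
    }
}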
Tables (cont.)
 Tables are sorted by Row
 Table schema defines column families
◦ Families consist of any number of columns
◦ Columns consist of any number of versions
◦ Everything except table name is byte[]

(Table, Row, Family:Column, Timestamp) → Value


Tables (cont.)
 As a data structure
SortedMap(
  RowKey, List(
    SortedMap(
      Column, List(
        Value, Timestamp
      )
    )
  )
)
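
In Java terms the same structure can be modeled with nested sorted maps; a toy sketch of the semantics, not the actual HBase internals:

import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

public class ToyTable {
    // row key -> (column -> (timestamp, newest first -> value))
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, byte[]>>> rows =
            new TreeMap<>();

    public void put(String row, String column, long ts, byte[] value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(column, c -> new TreeMap<>(Comparator.reverseOrder()))
            .put(ts, value);
    }

    public byte[] getLatest(String row, String column) {
        NavigableMap<String, NavigableMap<Long, byte[]>> columns = rows.get(row);
        if (columns == null || !columns.containsKey(column)) return null;
        // the reversed comparator puts the newest timestamp first
        return columns.get(column).firstEntry().getValue();
    }
}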
Server Architecture
 Similar to HDFS
◦ Master ≈ Namenode
◦ Regionserver ≈ Datanode
 Often run these alongside each other!
 Difference: HBase stores state in HDFS
 HDFS provides robust data storage
across machines, insulating against failure
 Master and Regionserver fairly stateless
and machine independent
Region Assignment
 Each region from every table is assigned
to a Regionserver
 Master Duties:
◦ Responsible for assignment and handling
regionserver problems (if any!)
◦ When machines fail, move regions
◦ When regions split, move regions to balance
◦ Could move regions to respond to load
◦ Can run multiple backup masters
Master
 The master does NOT
◦ Handle any write requests (not a DB master!)
◦ Handle location finding requests
◦ Not involved in the read/write path
◦ Generally does very little most of the time
Distributed Coordination
 Zookeeper is used to manage master
election and server availability
 Set up as a cluster, provides distributed
coordination primitives
 An excellent tool for building cluster
management systems
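
The core primitive is the ephemeral node; a minimal election sketch (the znode path, quorum address, and timeout are illustrative, not HBase's actual values):

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class MasterElection {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zkhost:2181", 30000, event -> {});
        try {
            // First process to create this ephemeral node is the master;
            // the node disappears when its session dies, so a backup
            // master can watch it and take over automatically.
            zk.create("/demo-master", "host1".getBytes(),
                      ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
            System.out.println("became master");
        } catch (KeeperException.NodeExistsException e) {
            System.out.println("another master is active");
        }
    }
}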
HBase Storage Architecture
HBase Public Timeline
 November 2006
◦ Google releases paper on Bigtable
 February 2007
◦ Initial HBase prototype created as Hadoop contrib
 October 2007
◦ First "usable" HBase (shipped with Hadoop 0.15.0)
 December 2007
◦ First HBase User Group
 January 2008
◦ Hadoop becomes TLP, HBase becomes subproject
 October 2008
◦ HBase 0.18.1 released
 January 2009
◦ HBase 0.19.0 released
 September 2009
◦ HBase 0.20.0 released
HBase WorldLingo Timeline
HBase - Example
 Store web crawl data
◦ Table crawl with family content
◦ Row is URL with columns
 content:data stores raw crawled data
 content:language stores http language header
 content:type stores http content-type header
◦ If processing raw data for hyperlinks and
images, add families links and images
 links:<url> column for each hyperlink
 images:<url> column for each image
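
A sketch of writing one crawled page with the 0.20-era Java client; the table layout follows the example above, while the URL and values are made up:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class CrawlWrite {
    public static void main(String[] args) throws Exception {
        HTable table = new HTable(new HBaseConfiguration(), "crawl");
        // Row key is the URL; everything is a plain byte[]
        Put put = new Put(Bytes.toBytes("http://example.com/"));
        put.add(Bytes.toBytes("content"), Bytes.toBytes("data"),
                Bytes.toBytes("<html>...</html>"));
        put.add(Bytes.toBytes("content"), Bytes.toBytes("language"),
                Bytes.toBytes("en"));
        put.add(Bytes.toBytes("content"), Bytes.toBytes("type"),
                Bytes.toBytes("text/html"));
        table.put(put); // autoflush is on by default, so this writes immediately
    }
}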
HBase - Clients
 Native Java client/API
◦ get(Get get)
◦ put(Put put)
 Non-Java clients
◦ Thrift server (Ruby, C++, Erlang, etc.)
◦ REST server (Stargate)
 TableInput/TableOutputFormat for
MapReduce
 HBase shell (jruby)
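
The matching read, again a sketch against the hypothetical crawl table:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class CrawlRead {
    public static void main(String[] args) throws Exception {
        HTable table = new HTable(new HBaseConfiguration(), "crawl");
        Get get = new Get(Bytes.toBytes("http://example.com/"));
        get.addColumn(Bytes.toBytes("content"), Bytes.toBytes("type"));
        Result result = table.get(get);
        byte[] type = result.getValue(Bytes.toBytes("content"), Bytes.toBytes("type"));
        System.out.println(Bytes.toString(type)); // e.g. "text/html"
    }
}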
Scaling HBase
 Add more machines to scale
◦ Automatic rebalancing
 Base model (BigTable) scales past 1000TB
◦ No inherent reason why HBase couldn't
What to store in HBase
 Maybe not your raw log data...
 ... but the results of processing it with
Hadoop!
 By storing the refined version in HBase,
can keep up with huge data demands and
serve to your website
!HBase
 “NoSQL” Database!
◦ No joins
◦ No sophisticated query engine
◦ No transactions (sort of)
◦ No column typing
◦ No SQL, no ODBC/JDBC, etc. (but there is
HBql now!)
 Not a replacement for your RDBMS...
 Matching Impedance!
Why HBase?
 Datasets are reaching Petabytes
 Traditional databases are expensive to
scale and difficult to distribute
 Commodity hardware is cheap and
powerful (but HBase can make use of
powerful machines too!)
 Need for random access and batch
processing (which Hadoop does not
offer)
Numbers
 Single reads are 1-10ms depending on
disk seeks and caching
 Scans can return hundreds of rows in
dozens of ms
 Serial read speeds
Multilingual Archive (cont.)
 44 Dell PESC1435, 12GB RAM, 2 x 1TB
SATA drives
 Java 6
 Tomcat 5.5
 88 Xen domUs
◦ Apache
◦ Hadoop/HBase
◦ Tomcat application servers
 Currently split into two clusters
Lucene Search Server
 43 fields indexed
 166GB size
 Automated merging/warm-up/swap
 Looking into scalable solution
◦ Katta
◦ Hyper Estraier
◦ DLucene
◦ …
 Sorting?
Multilingual Archive (cont.)
 5 Tables
 Up to 5 column families
 XML Schemas
 Automated table schema updates
 Standard options tweaked over time
◦ Garbage Collection!
 MemCached(b) layer
Layers
 Network: Firewall
 LWS: Director 1 … Director n
 Web: Apache 1 … Apache n
 App: Tomcat 1 … Tomcat n
 Cache: MemCached 1 … MemCached n
 Data: HBase
Map/Reduce
 Backup/Restore
 Index building
 Cache filling
 Mapping
 Updates
 Translation
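
Such jobs read from HBase via TableInputFormat; a minimal row-counting sketch over the crawl table from earlier (the job wiring is an assumption about the 0.20-era mapreduce API):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class CrawlRowCount {
    // TableInputFormat hands the mapper one Result per row
    static class CountMapper extends TableMapper<ImmutableBytesWritable, Result> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context context) {
            context.getCounter("crawl", "rows").increment(1);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = new Job(new HBaseConfiguration(), "crawl-row-count");
        job.setJarByClass(CrawlRowCount.class);
        TableMapReduceUtil.initTableMapperJob("crawl", new Scan(),
                CountMapper.class, ImmutableBytesWritable.class, Result.class, job);
        job.setOutputFormatClass(NullOutputFormat.class);
        job.setNumReduceTasks(0); // map-only; the count ends up in a job counter
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}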
HBase - Problems
 Early versions (before HBase 0.19.0!)
◦ Data loss
◦ Migration nightmares
◦ Slow performance

 Current version
◦ Read HBase Wiki!!!
 Single point of failure (name node only!)
HBase - Notes
 RTFM(ine)

 HBase Wiki, IRC Channel


 Personal Experience:
◦ Max. file handles (32k+)
◦ Hadoop xceiver limits (NIO?)
◦ Redundant meta data (on name node)
◦ RAM (4GB+)
◦ Deployment strategy
◦ Garbage collection (use CMS, G1?)
◦ Maybe not mix batch and interactive?
Graphing
 Use supplied Ganglia context or JMX
bridge to enable Nagios and Cacti
 JMXToolkit: swiss army knife for JMX
enabled servers:
http://github.com/larsgeorge/jmxtoolkit
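
For reference, polling a metric over plain JMX could look like this; the service URL, MBean, and attribute names are assumptions that vary by version and configuration:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class JmxProbe {
    public static void main(String[] args) throws Exception {
        // Host, port, and names below are placeholders; check your setup
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://regionserver1:10102/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url, null);
        try {
            MBeanServerConnection mbsc = jmxc.getMBeanServerConnection();
            ObjectName name = new ObjectName(
                    "hadoop:service=RegionServer,name=RegionServerStatistics");
            System.out.println(mbsc.getAttribute(name, "requests"));
        } finally {
            jmxc.close();
        }
    }
}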
HBase - Roadmap
 HBase 0.20.x “Performance”
◦ New Key Format – KeyValue
◦ New File Format – HFile
◦ New Block Cache – Concurrent LRU
◦ New Query and Result API
◦ New Scanners
◦ Zookeeper Integration – No SPOF in HBase
◦ New REST Interface
◦ Contrib
 Transactional Tables
 Secondary Indexes
 Stargate
HBase - Roadmap (cont.)
 HBase 0.21.x “Advanced Concepts”
◦ Master Rewrite – More Zookeeper
◦ New RPC Protocol (Avro)
◦ Multi-DC Replication
◦ Intra Row Scanning
◦ Further optimizations on algorithms and data
structures
◦ Discretionary Access Control
◦ Coprocessors
Questions?
 Email: lars@worldlingo.com
larsgeorge@apache.org
lars@larsgeorge.com
 Blog: www.larsgeorge.com
 Twitter: larsgeorge
