Thinking in Columns, Not Rows: Eugene Meidinger
Eugene Meidinger
DATABASE DEVELOPER
@sqlgene www.sqlgene.com
Two Uses for Databases
OLTP: Applications designed for day-to-day use and frequent updates
OLAP: Applications designed for daily reporting and heavy reads
Optimizing for OLAP
Account     Offset from 85000000
85006001    6001
85006002    6002
85006003    6003
85006004    6004
85006005    6005
85006006    6006
85006007    6007
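The account numbers above can be read as one shared base plus a small offset per row. The following is a minimal Python sketch of that idea, assuming the base 85000000 shown on the slide. The function names offset_encode and offset_decode are hypothetical and are not taken from any particular database engine.

```python
def offset_encode(values, base):
    """Store one shared base and a small offset for each row."""
    return [v - base for v in values]


def offset_decode(offsets, base):
    """Rebuild the original values from the base and the offsets."""
    return [base + o for o in offsets]


accounts = [85006001, 85006002, 85006003, 85006004,
            85006005, 85006006, 85006007]
base = 85000000  # assumption: the base value shown on the slide

offsets = offset_encode(accounts, base)
print(offsets)  # [6001, 6002, 6003, 6004, 6005, 6006, 6007]

# Smaller numbers need fewer bits per row.
print(max(accounts).bit_length())  # 27
print(max(offsets).bit_length())   # 13
```

The saving comes from the narrower range: each row needs only enough bits for its offset, and the base is stored once for the whole column.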
Dictionary Encoding
Color     Color
Blue      Blue, 1
Green     Green, 2
Green
Red       Red, 4
Red
Red
Red
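The Color column can be sketched in Python in two steps: replacing each distinct value with a small integer ID (dictionary encoding), and collapsing consecutive repeats into value and count pairs (run-length encoding, which matches the "Blue, 1 / Green, 2 / Red, 4" counts on the slide). This is an illustrative sketch only. The helper names are hypothetical and the code does not reflect how any specific engine stores its data.

```python
from itertools import groupby


def dictionary_encode(column):
    """Map each distinct value to an integer ID, in order of first appearance."""
    dictionary = {}
    ids = []
    for value in column:
        ids.append(dictionary.setdefault(value, len(dictionary)))
    return dictionary, ids


def run_length_encode(column):
    """Collapse consecutive repeats into (value, run length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(column)]


colors = ["Blue", "Green", "Green", "Red", "Red", "Red", "Red"]

dictionary, ids = dictionary_encode(colors)
print(dictionary)  # {'Blue': 0, 'Green': 1, 'Red': 2}
print(ids)         # [0, 1, 1, 2, 2, 2, 2]

runs = run_length_encode(colors)
print(runs)        # [('Blue', 1), ('Green', 2), ('Red', 4)]
```

Seven stored values shrink to three pairs, and the longer the runs, the better the ratio.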
Columnar storage turns repeated values from a waste of space into an asset.
Demo
Visualizing run-length encoding
Comparing compression rates
Related
1  1
1  1
1  1
1  1
2  2
2  2
2  2
2  2
Left column: Four 1s, Four 2s
Right column: Four 1s, Four 2s
Unrelated
1  1
1  1
1  2
1  2
2  1
2  1
2  2
2  2
Left column: Four 1s, Four 2s
Right column: Two 1s, Two 2s, Two 1s, Two 2s
Unique
a
b
c
d
e
f
g
h
Related: Four 1s, Four 2s | Four 1s, Four 2s
Unrelated: Four 1s, Four 2s | Two 1s, Two 2s, Two 1s, Two 2s
Unique: a, b, c, d, e, f, g, h
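The three cases can be compared by counting how many runs run-length encoding would have to store for each column. The sketch below assumes eight-row columns matching the slides. The count_runs helper is hypothetical, and a run count is only a rough proxy for compressed size.

```python
from itertools import groupby


def count_runs(column):
    """Number of (value, count) pairs run-length encoding would store."""
    return sum(1 for _ in groupby(column))


sorted_column = [1, 1, 1, 1, 2, 2, 2, 2]  # the column the rows are ordered by
related = [1, 1, 1, 1, 2, 2, 2, 2]        # changes together with sorted_column
unrelated = [1, 1, 2, 2, 1, 1, 2, 2]      # changes independently of it
unique = list("abcdefgh")                 # every value is distinct

for name, column in [("sorted", sorted_column), ("related", related),
                     ("unrelated", unrelated), ("unique", unique)]:
    print(name, count_runs(column))
# sorted 2
# related 2
# unrelated 4
# unique 8
```

A column that follows the sort order keeps its long runs, an unrelated column breaks into shorter ones, and a unique column gains nothing from run-length encoding because every run has length one.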
Summary
OLAP is designed for reporting
Columnar databases are optimized for OLAP
Columnar storage plus encoding allows for great compression