
Sharone Zehavi

Jordan Jordanov

Agenda
Concepts of Column Store
    Structure Compared to Row Store
    Performance Issues Compared to Row Store
    Go Through Examples to Make the Points

In-Depth View of Column Store
    Architecture
    Delta Store
    Consistent View
    Data Compression
    Accessing Data
    Join Operation
2011 SAP AG. All rights reserved.

Performance bottleneck


Orders of Magnitude
presented by Jeff Dean (Google)

Activity                                    Time in ns
L1 cache reference                                 0.5
Branch mis-prediction                                5
L2 cache reference                                   7
Mutex lock/unlock                                   25
Main memory reference                              100
Compress 1K bytes with Zippy                     3,000
Send 2K bytes over 1 Gbps network               20,000
Read 1 MB sequentially from memory             250,000
Round trip within same datacenter              500,000
Disk seek                                   10,000,000
Read 1 MB sequentially from disk            20,000,000
Send packet CA->Netherlands->CA            150,000,000
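To make the gap between memory and disk concrete, the ratios between a few of these figures can be computed directly. This is only an illustrative calculation; the name `latency_ns` and the choice of rows are assumptions made for the sketch, while the numbers are the ones listed in the table.

```python
# Latencies in nanoseconds, copied from the table in this slide.
latency_ns = {
    "main_memory_reference": 100,
    "read_1mb_memory": 250_000,
    "disk_seek": 10_000_000,
    "read_1mb_disk": 20_000_000,
}

# How much slower is disk than main memory?
seq_read_ratio = latency_ns["read_1mb_disk"] / latency_ns["read_1mb_memory"]
random_access_ratio = latency_ns["disk_seek"] / latency_ns["main_memory_reference"]

print(f"Sequential 1 MB read: disk is {seq_read_ratio:.0f}x slower than memory")
print(f"Random access: one disk seek costs {random_access_ratio:,.0f} memory references")
```

The random-access gap is far larger than the sequential one, which is the bottleneck an in-memory store is meant to remove.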

http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf

HANA Table Types


Column
Row
History Column (Temporal)
Global Temporary
Local Temporary

In this presentation we will focus on Column tables only, and will mention Row tables briefly for the sake of comparison.


Logical Structure Of a Table


         Column 1         Column 2         Column 3         Column 4         Column 5
Row 1    Data Page1       Data Page2       Data Page3       Data Page4       Data Page5
Row 2    Data Page6       Data Page7       Data Page8       Data Page9       Data Page10
. . .    . . .            . . .            . . .            . . .            . . .
Row N    Data Page(5n-4)  Data Page(5n-3)  Data Page(5n-2)  Data Page(5n-1)  Data Page(5n)


Row Store - Physical Structure


         Column 1         Column 2         Column 3         Column 4         Column 5
Row 1    Data Page1       Data Page2       Data Page3       Data Page4       Data Page5
Row 2    Data Page6       Data Page7       Data Page8       Data Page9       Data Page10
. . .    . . .            . . .            . . .            . . .            . . .
Row N    Data Page(5n-4)  Data Page(5n-3)  Data Page(5n-2)  Data Page(5n-1)  Data Page(5n)

Starting from the address of Row 1, the address of Row 2 can be calculated:
Row n = (the size of all columns) * n
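The address calculation can be sketched in a few lines. This is a minimal illustration that assumes fixed-width columns and counts rows from 0, so that the offset of row n is exactly (the size of all columns) * n. The column widths and the names `COLUMN_SIZES`, `row_offset` and `field_offset` are hypothetical.

```python
# Hypothetical fixed column widths in bytes (not taken from the slides).
COLUMN_SIZES = [4, 16, 8, 8, 4]
ROW_SIZE = sum(COLUMN_SIZES)  # the size of all columns


def row_offset(n: int) -> int:
    """Byte offset of row n from the start of the table, counting rows from 0."""
    return ROW_SIZE * n


def field_offset(n: int, column: int) -> int:
    """Byte offset of one field: start of the row plus the widths of the columns before it."""
    return row_offset(n) + sum(COLUMN_SIZES[:column])


print(row_offset(3))       # start of the fourth row
print(field_offset(3, 2))  # third column of the fourth row
```

No search is needed to locate a row, which is why reading or appending whole rows is cheap in this layout.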


Column Store - Physical Structure (Simplified)


         Column 1         Column 2         Column 3         Column 4         Column 5
Row 1    Data Page1       Data Page2       Data Page3       Data Page4       Data Page5
Row 2    Data Page6       Data Page7       Data Page8       Data Page9       Data Page10
. . .    . . .            . . .            . . .            . . .            . . .
Row N    Data Page(5n-4)  Data Page(5n-3)  Data Page(5n-2)  Data Page(5n-1)  Data Page(5n)


Example Logical Structure

Table
Country  Product  Sales
US       Alpha    3000
US       Beta     1250
JP       Alpha     700
UK       Alpha     450

Row Store
Row1: US Alpha 3000
Row2: US Beta 1250
Row3: JP Alpha 700
Row4: UK Alpha 450

Column Store
Country: US US JP UK
Product: Alpha Beta Alpha Alpha
Sales:   3000 1250 700 450


Example (cont.)
For Column Store: How is the logical structure preserved? Row ID

Column Store
Country: US (Row ID 1), US, JP, UK
Product: Alpha (Row ID 1), Beta, Alpha, Alpha
Sales:   3000 (Row ID 1), 1250, 700, 450
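The two layouts of the example table, and the role of the Row ID, can be sketched as follows. This is an illustration using plain Python lists, not a description of how HANA lays out memory; the names `row_store`, `column_store` and `row_from_column_store` are assumptions.

```python
columns = ("Country", "Product", "Sales")
table = [
    ("US", "Alpha", 3000),
    ("US", "Beta", 1250),
    ("JP", "Alpha", 700),
    ("UK", "Alpha", 450),
]

# Row store: the fields of one row are adjacent.
row_store = [field for row in table for field in row]

# Column store: the values of one column are adjacent.
column_store = {name: [row[i] for row in table] for i, name in enumerate(columns)}


def row_from_column_store(row_id: int) -> tuple:
    """Rebuild a logical row; the Row ID (starting at 1) is the position in every column."""
    return tuple(column_store[name][row_id - 1] for name in columns)


# An aggregate touches a single contiguous list in the column store.
total_sales = sum(column_store["Sales"])

print(row_store)
print(row_from_column_store(3))
print(total_sales)
```

The same position in every column belongs to the same logical row, so no extra pointer between the columns is required.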

Data Dictionary


Column Store performing select where CITY = New York


Row vs. Column Store


3 Topics to Consider
Read (Select)
Write (Update)
Write (Insert)


Row vs. Column Store Reading Data

We will understand the pros and cons of each method by following an example. Let's look at the following School table:

Row ID  Family  Father   Mother    3rd Grade  4th Grade  5th Grade  6th Grade
1       Smith   Richard  Miranda   null       Donna      null       Kevin
2       Galway  Stephen  Giselle   Alex       null       Jeffrey    null
3       Bush    John     Barbara   Roland     null       null       Alexis
4       Brown   Jack     Abby      null       Donald     null       Susan
5       Taylor  John     Ginny     David      null       Brian      Larry
6       Moore   Peter    Nancy     null       Ruth       Karen      Laura
7       Harris  Clark    Ruth      Jerry      Frank      Janet      null
8       Taylor  James    Michelle  null       Melissa    null       Brenda

The table also has 1st Grade and 2nd Grade columns, with the values: Jim Gary null Eric Timothy null Sandra Jessica Ronald Dennis Shirley Jason null Angela Heather Cynthia


Row vs. Column Store Reading Data (cont.)


So what if we just wanted to read the entire table?

select * from School

Recall the physical structure discussed earlier. Which storage method will give us a faster read? Why?
Hints:
We are going to fully scan the table in any case.
We need to read entire rows one after the other, so which physical structure will give us a smooth read?
What actions are required when performing the query with Column Store?
What actions are required when performing the query with Row Store?


Row vs. Column Store Reading Data (cont.)


Now, what if we want to get a list of all 1st grade pupils?

select 1st_grade from School where 1st_grade is not null

Again, recall the physical structure discussed earlier. Which storage method will give us a faster read? Why?
Hints:
Are we going to fully scan the table?
We are going to scan all rows in any case, but only one column, so which physical structure will give us a smooth read?
What actions are required when performing the query with Column Store?
What actions are required when performing the query with Row Store?


Row vs. Column Store Reading Data (cont.)


Now, what if we want to get a list of all families who have children in 1st grade, 3rd grade and 6th grade?

select Family
from   School
where  1st_grade is not null
and    3rd_grade is not null
and    6th_grade is not null

Again, recall the physical structure discussed earlier and try to answer: which storage method will give us a faster read? Can we have a definite answer here? What are the pros and cons?


Row vs. Column Store Writing Data Update

Now, a mistake was found in the table's data: David Taylor from 3rd grade is actually in 4th grade. So we need to update the table accordingly:

update School set 3rd_grade = null, 4th_grade = 'David' where Family = 'Taylor' and Father = 'John' and Mother = 'Ginny'

Again, recall the physical structure discussed earlier and try to answer: which storage method will give us a faster update?


Row vs. Column Store Writing Data Update (cont.)


For Column Store, we first need to search for the conditions:

Family   Row ID      Father   Row ID      Mother    Row ID
Brown    4           Clark    7           Abby      4
Bush     3           Jack     4           Barbara   3
Galway   2           James    8           Ginny     5
Harris   7           John     3           Giselle   2
Moore    6           John     5           Michelle  8
Smith    1           Peter    6           Miranda   1
Taylor   5           Richard  1           Nancy     6
Taylor   8           Stephen  2           Ruth      7


Row vs. Column Store Writing Data Update (cont.)


We found out that the Row ID to change is 5, so now is the time for the update:

3rd_grade  Row ID      4th_grade  Row ID
Alex       2           Donald     4
David      5           Donna      1
Jerry      7           Frank      7
Roland     3           Melissa    8
null       1           Ruth       6
null       4           null       2
null       6           null       3
null       8           null       5

Row vs. Column Store Writing Data Update (cont.)


So we have to update the values as requested, but we also have to re-sort the columns to reflect the new order, based on the new values:

3rd_grade  Row ID      4th_grade  Row ID
Alex       2           David      5
Jerry      7           Donald     4
Roland     3           Donna      1
null       5           Frank      7
null       1           Melissa    8
null       4           Ruth       6
null       6           null       2
null       8           null       3
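The update walkthrough can be sketched as follows, under the simplified model used in these slides: each column is a list of (value, Row ID) pairs kept sorted by value. The helper names `build_column`, `row_ids_where` and `update` are hypothetical, and the column values are taken from the School table; this is not how HANA applies updates (see the Delta Storage slides).

```python
def sort_key(pair):
    """Sort by value, nulls (None) last; ties are ordered by Row ID."""
    value, row_id = pair
    return (value is None, value or "", row_id)


def build_column(values):
    """values[i] belongs to Row ID i + 1."""
    return sorted(((v, i + 1) for i, v in enumerate(values)), key=sort_key)


def row_ids_where(column, value):
    return {row_id for v, row_id in column if v == value}


def update(column, row_id, new_value):
    """Replace the value of one row, then restore the sort order."""
    pairs = [(new_value if r == row_id else v, r) for v, r in column]
    return sorted(pairs, key=sort_key)


family = build_column(["Smith", "Galway", "Bush", "Brown", "Taylor", "Moore", "Harris", "Taylor"])
father = build_column(["Richard", "Stephen", "John", "Jack", "John", "Peter", "Clark", "James"])
mother = build_column(["Miranda", "Giselle", "Barbara", "Abby", "Ginny", "Nancy", "Ruth", "Michelle"])
third = build_column([None, "Alex", "Roland", None, "David", None, "Jerry", None])
fourth = build_column(["Donna", None, None, "Donald", None, "Ruth", "Frank", "Melissa"])

# where Family = 'Taylor' and Father = 'John' and Mother = 'Ginny'
matches = (row_ids_where(family, "Taylor")
           & row_ids_where(father, "John")
           & row_ids_where(mother, "Ginny"))

for row_id in matches:
    third = update(third, row_id, None)
    fourth = update(fourth, row_id, "David")

print(matches)
print(fourth)
```

Each condition is evaluated on its own column and the resulting Row ID sets are intersected; the cost of the update itself comes from re-sorting every column that changed.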

Row vs. Column Store Writing Data Update (cont.)


For Row Store, assuming no indexes are present, we simply scan the table row by row, updating every row that matches the conditions. The scan is a full one, meaning the table is read until the end; on the other hand, it is read only once. So where did we get better performance for the update? Can we have a definite answer here?


Row vs. Column Store Writing Data Insert

A new family has moved into town and registered their kids at the school. We want to reflect this with an insert command:

insert into School values ('Donovan', 'Harry', 'Pamela', null, 'Martha', null, 'Brenda', 'Albert', 'Justin')

How would we implement this action in both methods?


Row vs. Column Store Writing Data Insert (cont.)


For Column Store, after allocating a new Row ID, we will need to do the following for each column:
1. Add the new value
2. Re-sort the column, and maybe reorder, assuming we want the values to be contiguous.

For Row Store, we simply allocate new data pages at the end of the table and pour the data in there. It should take O(1) time. So we can see the straightforward advantage of Row Store when inserting new data is involved.
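The difference in insert cost can be sketched with Python lists. This is a simplified model in which a column is a list of (value, Row ID) pairs sorted by value, as in the update example; the function names are hypothetical.

```python
import bisect


def insert_sorted_column(column, value, row_id):
    """Column store (simplified model): keep (value, row_id) pairs sorted by value.
    Finding the slot is O(log n), but shifting the tail of the list is O(n)."""
    bisect.insort(column, (value, row_id))


def insert_row_store(rows, row):
    """Row store: append the whole row at the end, amortized O(1)."""
    rows.append(row)


family_column = [("Brown", 4), ("Bush", 3), ("Galway", 2), ("Harris", 7),
                 ("Moore", 6), ("Smith", 1), ("Taylor", 5), ("Taylor", 8)]
insert_sorted_column(family_column, "Donovan", 9)

rows = []
insert_row_store(rows, ("Donovan", "Harry", "Pamela", None, "Martha",
                        None, "Brenda", "Albert", "Justin"))

print(family_column)
```

In the column store the sorted insert has to be repeated for all nine columns, while the row store performs a single append.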


Advantages of Column Store

So when does Column Store have a clear-cut advantage over Row Store?
Calculations are typically executed on a single column or a few columns only
The table is searched based on values of a few columns
The table has a large number of columns
The table has a large number of rows and columnar operations are required (aggregate, scan, etc.)
High compression rates can be achieved because the majority of the columns contain only a few distinct values (compared to the number of rows)
Elimination of indexes
Parallelization


Advantages of Row Store

Row Store tables are better when:
The application needs to process only one single record at a time (many selects and/or updates of single records)
The application typically needs to access the complete record
The columns contain mainly distinct values, so the compression rate would be low
Neither aggregations nor fast searching are required
The table has a small number of rows (for example configuration tables)


Column Store Conceptual Architecture


Column Store Delta Storage

So we saw that inserting a new row (and sometimes updating one) is a very expensive operation for Column Store. What do we do to ease the pain?
Every write operation (Insert or Update) in Column Store does not directly modify compressed data, but rather goes into a separate area called the Delta Storage.
The changes are taken over from the delta storage asynchronously at some later point in time. This action is called Delta Merge. The Delta Merge operation integrates committed changes collected in delta storage into main storage.
The following steps are taken when a write operation occurs:


Write operations in a Columnar Store


Column Store Delta Storage Cont.


1. If the current transaction is not already a write transaction, the transaction manager is told to make it a write transaction and to provide an updated transaction token.
2. For updates and deletes, a write lock is requested from the transaction manager for the record (identified by its key). The operation is blocked until the lock is available. The lock is held until the transaction is committed or rolled back.
3. For inserts and updates, the operation inserts a new row into the delta storage with the updated data.
4. The write operation tells the consistent view manager about the change. The consistent view manager stores transaction-related information that is needed to create the consistent view for a specific read operation. This includes the information about which rows in delta storage were inserted by some transaction and which other rows were invalidated. In case of a deletion, the consistent view manager just stores the information that the previously valid row now becomes invalid.
5. Unless it is a temporary table, the write operation writes an entry into the delta log.
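The write steps can be sketched with a toy model. It is an illustration only: the class `ColumnTable` and all of its attributes are hypothetical, a lock conflict raises an error instead of blocking, and reads ignore transaction visibility, which is the job of the consistent view manager.

```python
class ColumnTable:
    """Toy model: rows live either in read-only main storage or in delta storage."""

    def __init__(self, main_rows, temporary=False):
        self.main = dict(main_rows)   # key -> row, never modified by writes
        self.delta = []               # appended row versions
        self.invalid_main = set()     # keys whose main-storage row is no longer valid
        self.locks = {}               # key -> id of the transaction holding the write lock
        self.write_txns = set()       # transactions already marked as write transactions
        self.delta_log = []           # log entries; skipped for temporary tables
        self.temporary = temporary

    def _lock(self, txn, key):
        holder = self.locks.setdefault(key, txn)
        if holder != txn:
            raise RuntimeError(f"row {key!r} is locked by transaction {holder}")

    def _invalidate(self, key):
        self.invalid_main.add(key)
        for version in self.delta:
            if version["key"] == key:
                version["valid"] = False

    def write(self, txn, op, key, row=None):
        self.write_txns.add(txn)                     # step 1
        if op in ("update", "delete"):
            self._lock(txn, key)                     # step 2
            self._invalidate(key)                    # step 4 (invalidated rows)
        if op in ("insert", "update"):
            self.delta.append({"key": key, "row": row, "valid": True})  # step 3
        if not self.temporary:
            self.delta_log.append((txn, op, key, row))                  # step 5

    def commit(self, txn):
        self.locks = {k: t for k, t in self.locks.items() if t != txn}
        self.write_txns.discard(txn)

    def read(self, key):
        for version in self.delta:
            if version["key"] == key and version["valid"]:
                return version["row"]
        if key in self.main and key not in self.invalid_main:
            return self.main[key]
        return None


table = ColumnTable({5: {"3rd_grade": "David", "4th_grade": None}})
table.write(txn=1, op="update", key=5, row={"3rd_grade": None, "4th_grade": "David"})
table.commit(txn=1)
print(table.read(5))
```

Main storage is never touched by the write: the old row is only marked invalid and the new version is appended to the delta.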


Consistent View of Current Data


With the delta concept, updates in the Column Store do not physically change existing rows. Updates are always done by inserting a new entry into the delta storage.
Therefore a mechanism is required to ensure that each transaction reads the data it is supposed to read, be it from the Main Store or from the Delta Store.
The Consistent View Manager takes care of exactly this.
To understand Consistent View, we first need to understand Isolation Levels:


Consistent View of Current Data Isolation Levels


Read Committed
Corresponds to Statement Level Read Consistency
With statement level snapshot isolation, different statements in a transaction may see different snapshots of the system.
Each statement in a transaction sees a consistent snapshot of the system.
Each statement sees the changes that were committed when the execution of the statement started.


Consistent View of Current Data Isolation Levels


Repeatable Read / Serializable
Corresponds to Transaction Level Snapshot Isolation
All statements of a transaction see the same snapshot of the database.
This snapshot contains all changes that were committed at the time the transaction started.
This snapshot contains, in addition, the changes made by the transaction itself.

Now, back to Consistent View, let's follow an example:


Consistent View of Current Data
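The difference between the two isolation levels can be sketched with commit counters standing in for snapshots. This is a minimal illustration; the classes `Store` and `Transaction` are hypothetical, and a transaction's own uncommitted changes are left out for brevity.

```python
class Store:
    def __init__(self):
        self.commit_counter = 0
        self.versions = []  # (commit_id, key, value), appended in commit order

    def commit(self, key, value):
        self.commit_counter += 1
        self.versions.append((self.commit_counter, key, value))

    def read(self, key, snapshot):
        """Latest value of key among the versions committed at or before the snapshot."""
        result = None
        for commit_id, k, value in self.versions:
            if k == key and commit_id <= snapshot:
                result = value
        return result


class Transaction:
    def __init__(self, store, level):
        self.store = store
        self.level = level
        self.start_snapshot = store.commit_counter  # taken once, at transaction start

    def select(self, key):
        if self.level == "READ COMMITTED":
            snapshot = self.store.commit_counter    # fresh snapshot for every statement
        else:                                       # REPEATABLE READ / SERIALIZABLE
            snapshot = self.start_snapshot
        return self.store.read(key, snapshot)


store = Store()
store.commit("sales", 3000)
rc = Transaction(store, "READ COMMITTED")
rr = Transaction(store, "REPEATABLE READ")
store.commit("sales", 3500)  # another transaction commits in between

print(rc.select("sales"))
print(rr.select("sales"))
```

The read-committed transaction sees the newly committed value in its next statement, while the repeatable-read transaction keeps seeing the value from its start.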


Delta Merge
Executed on Table Level when:
Number of lines in delta storage for this table exceeds specified number
Memory consumption of delta storage exceeds specified limit
Merge is triggered explicitly by a client using SQL
The delta log for a columnar table exceeds the defined limit. As the delta log is truncated only during merge operation, a merge operation needs to be performed in this case.
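The four trigger conditions can be expressed as a simple check. This is a sketch only: the limits, the `DeltaStats` fields and the function `merge_reasons` are hypothetical and do not correspond to actual HANA parameters.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DeltaStats:
    row_count: int      # number of lines in delta storage
    memory_bytes: int   # memory consumption of delta storage
    log_bytes: int      # size of the delta log


# Hypothetical limits chosen for the example only.
MAX_DELTA_ROWS = 100_000
MAX_DELTA_MEMORY = 64 * 1024 * 1024
MAX_DELTA_LOG = 256 * 1024 * 1024


def merge_reasons(stats: DeltaStats, explicit_request: bool = False) -> List[str]:
    """Return the reasons why a delta merge is due for this table (empty list: no merge)."""
    reasons = []
    if stats.row_count > MAX_DELTA_ROWS:
        reasons.append("row count")
    if stats.memory_bytes > MAX_DELTA_MEMORY:
        reasons.append("memory consumption")
    if explicit_request:
        reasons.append("explicit request")
    if stats.log_bytes > MAX_DELTA_LOG:
        reasons.append("delta log size")
    return reasons


print(merge_reasons(DeltaStats(row_count=200_000, memory_bytes=10, log_bytes=10)))
```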


Delta Merge


Data Compression


Data Compression Additional Compression


Prefix Coding
If the column starts with a long sequence of the same value V, the sequence is replaced by storing the value once, together with the number of occurrences. This makes sense if there is one predominant value in the column and the remaining values are mostly unique or have low redundancy.
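Prefix coding as described here can be sketched in a few lines. The function names and the returned tuple layout are assumptions made for the illustration.

```python
def prefix_encode(values):
    """Replace the leading run of one value by (value, count); keep the rest as-is."""
    if not values:
        return None, 0, []
    first = values[0]
    count = 0
    while count < len(values) and values[count] == first:
        count += 1
    return first, count, values[count:]


def prefix_decode(value, count, rest):
    return [value] * count + list(rest)


column = [4, 4, 4, 4, 1, 2, 4]
encoded = prefix_encode(column)
print(encoded)
```

Only the leading run is collapsed; a later occurrence of the same value (the last 4 above) stays in the remainder.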


Data Compression Additional Compression


Run Length Encoding
Run length encoding replaces sequences of the same value with a single instance of the value and its start position. This variant of run length encoding was chosen, as it speeds up access compared to storing the number of occurrences with each value.
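The start-position variant can be sketched as below. The function names are hypothetical; the point of the sketch is that a position can be located by binary search over the start positions, without summing run lengths.

```python
import bisect


def rle_encode(values):
    """Return (run_values, run_starts): one entry per run of equal values."""
    run_values, run_starts = [], []
    for position, value in enumerate(values):
        if not run_values or run_values[-1] != value:
            run_values.append(value)
            run_starts.append(position)
    return run_values, run_starts


def rle_get(run_values, run_starts, position):
    """Random access: binary search for the run that contains the position."""
    run = bisect.bisect_right(run_starts, position) - 1
    return run_values[run]


column = [7, 7, 7, 2, 2, 9]
run_values, run_starts = rle_encode(column)
print(run_values, run_starts)
print(rle_get(run_values, run_starts, 4))
```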


Data Compression Additional Compression


Cluster Encoding
Cluster encoding partitions the sequence into N blocks of fixed size (1024 elements). If a cluster contains only occurrences of a single value, the cluster is replaced by a single occurrence of that value. A bit vector of length N indicates which clusters were replaced by a single value.
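A minimal sketch of cluster encoding follows. The function names are hypothetical, the bit vector is modelled as a list of 0/1 integers, and the example uses a block size of 4 instead of 1024 to stay readable.

```python
def cluster_encode(values, block_size=1024):
    """Return (bits, payload): bits[i] is 1 if block i was replaced by a single value."""
    bits, payload = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        if len(set(block)) == 1:
            bits.append(1)
            payload.append(block[0])
        else:
            bits.append(0)
            payload.extend(block)
    return bits, payload


def cluster_decode(bits, payload, length, block_size=1024):
    """Rebuild the sequence; length is needed because the last block may be short."""
    values, cursor = [], 0
    for block_index, compressed in enumerate(bits):
        size = min(block_size, length - block_index * block_size)
        if compressed:
            values.extend([payload[cursor]] * size)
            cursor += 1
        else:
            values.extend(payload[cursor:cursor + size])
            cursor += size
    return values


column = [5, 5, 5, 5, 1, 2, 3, 4, 9, 9]
bits, payload = cluster_encode(column, block_size=4)
print(bits, payload)
```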


Data Compression Additional Compression


Sparse Encoding
Sparse encoding removes the value V that appears most often. A bit vector indicates at which positions V was removed from the original sequence.
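Sparse encoding can be sketched as follows, assuming a non-empty column. The function names are hypothetical and the bit vector is modelled as a list of 0/1 integers.

```python
from collections import Counter


def sparse_encode(values):
    """Remove the most frequent value; removed[i] is 1 where it was taken out."""
    most_common, _ = Counter(values).most_common(1)[0]
    removed = [1 if v == most_common else 0 for v in values]
    remaining = [v for v in values if v != most_common]
    return most_common, removed, remaining


def sparse_decode(most_common, removed, remaining):
    rest = iter(remaining)
    return [most_common if bit else next(rest) for bit in removed]


column = [0, 0, 3, 0, 7, 0]
most_common, removed, remaining = sparse_encode(column)
print(most_common, removed, remaining)
```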


Data Compression Additional Compression


Indirect Encoding
Indirect encoding is also based on partitioning into blocks of 1024 elements. If a block contains only a few distinct values, an additional dictionary is used to encode the values in that block.
Here is the concept with a block size of 8 elements. The first and the third block consist of not more than 4 distinct values, so a dictionary with 4 entries and an encoding of values with 2 bits is possible. For the second block this kind of compression makes no sense: with 8 distinct values, the dictionary alone would need the same space as the uncompressed sequence.
The implementation also needs to store the information about which blocks are encoded with an additional dictionary, and the links to the additional dictionaries.
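The block-size-8 concept can be sketched as below. The function names and the `max_distinct` threshold are assumptions; a block is stored as a (dictionary, codes) pair, with `None` as the dictionary when the block is kept uncompressed.

```python
def indirect_encode(value_ids, block_size=1024, max_distinct=4):
    blocks = []
    for start in range(0, len(value_ids), block_size):
        block = value_ids[start:start + block_size]
        distinct = sorted(set(block))
        if len(distinct) <= max_distinct:
            index = {v: i for i, v in enumerate(distinct)}
            # Codes are positions in the block dictionary (2 bits suffice for 4 entries).
            blocks.append((distinct, [index[v] for v in block]))
        else:
            blocks.append((None, list(block)))  # stored as-is, no block dictionary
    return blocks


def indirect_decode(blocks):
    values = []
    for dictionary, codes in blocks:
        values.extend(codes if dictionary is None else [dictionary[c] for c in codes])
    return values


column = [10, 10, 20, 30, 10, 40, 20, 10] + [1, 2, 3, 4, 5, 6, 7, 8]
blocks = indirect_encode(column, block_size=8)
print(blocks)
```

The first block gets its own 4-entry dictionary, while the second block with 8 distinct values is left untouched.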

Data Compression Additional Compression

Indirect Encoding


Data Compression Additional Compression


String Delta Compression
The dictionary is stored as a sequence of blocks that contain 16 string values each, compressed using delta compression.
For each string value the following information is stored:
The length of the prefix which this value has in common with its predecessor
The number of remaining characters after the common prefix
The remaining characters after the common prefix
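The three stored items per string can be sketched as follows. The function names and the sample strings are hypothetical; the sketch assumes the dictionary is sorted and that the first string of every block is stored in full (common prefix length 0).

```python
def common_prefix_length(a, b):
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n


def delta_encode(sorted_strings, block_size=16):
    """Each entry is (prefix length, number of remaining characters, remaining characters)."""
    blocks = []
    for start in range(0, len(sorted_strings), block_size):
        previous, encoded = "", []
        for s in sorted_strings[start:start + block_size]:
            prefix_len = common_prefix_length(previous, s)
            encoded.append((prefix_len, len(s) - prefix_len, s[prefix_len:]))
            previous = s
        blocks.append(encoded)
    return blocks


def delta_decode(blocks):
    strings = []
    for encoded in blocks:
        previous = ""
        for prefix_len, _remaining_len, suffix in encoded:
            previous = previous[:prefix_len] + suffix
            strings.append(previous)
    return strings


dictionary = ["Walldorf", "Wallis", "Wallisellen", "Walnut"]
blocks = delta_encode(dictionary)
print(blocks)
```

Restarting the prefix chain at every block boundary means a lookup only has to decode one block of 16 strings.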


Data Compression Additional Compression


String Delta Compression


Accessing Data in Column Store


Search by Attribute Value
Search all rows with a given attribute value (select * from Table where attribute = value), so a reverse lookup is needed.
A binary search is performed on the Dictionary.
If the value exists in the Dictionary, the result of the reverse lookup is the value ID of the specified value.
The value ID sequence is searched for all occurrences of the found value ID.
Let's look at an example:


Accessing Data in Column Store


Search by Attribute Value (no index)
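The search without an index can be sketched as a binary search on a sorted dictionary followed by a scan of the value ID sequence. The city values and function names are hypothetical, and row positions are counted from 0 here.

```python
import bisect


def dictionary_encode(values):
    """Sorted dictionary of distinct values; a value ID is the position in it."""
    dictionary = sorted(set(values))
    value_ids = [bisect.bisect_left(dictionary, v) for v in values]
    return dictionary, value_ids


def search(dictionary, value_ids, value):
    """Row positions (0-based) whose attribute equals the given value."""
    value_id = bisect.bisect_left(dictionary, value)  # binary search (reverse lookup)
    if value_id == len(dictionary) or dictionary[value_id] != value:
        return []                                     # value is not in the dictionary
    return [row for row, vid in enumerate(value_ids) if vid == value_id]


cities = ["Berlin", "New York", "Paris", "New York", "Berlin", "Tokyo"]
dictionary, value_ids = dictionary_encode(cities)

print(dictionary, value_ids)
print(search(dictionary, value_ids, "New York"))
```

The scan compares small integers rather than strings, which is one reason a full column scan can still be fast.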


Accessing Data in Column Store


Search by Attribute Value With Index
Normally, even full column scans can be executed with high performance. However, in cases where the performance of column scans is not sufficient, an index can be defined on the column. It contains references to the rows that contain the value.
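One way to picture such an index is an inverted list from value ID to row positions. This is an assumption made for illustration, not a description of the actual index structure; `build_index` and the sample value IDs are hypothetical.

```python
from collections import defaultdict


def build_index(value_ids):
    """Inverted index: value ID -> list of row positions (0-based) holding that value ID."""
    index = defaultdict(list)
    for row, value_id in enumerate(value_ids):
        index[value_id].append(row)
    return dict(index)


value_ids = [0, 1, 2, 1, 0, 3]
index = build_index(value_ids)

# With the index, the matching rows are read directly instead of scanning the column.
print(index.get(1, []))
```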


Accessing Data in Column Store


Access by Row ID
After a Row ID was determined:
The value ID is read from the value ID sequence, by simply accessing the corresponding row ID.
Then the value ID is used to look up the corresponding value in the Dictionary.
Let's look at an example:


Accessing Data in Column Store


Access by Row ID
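Access by Row ID can be sketched with the Country/Product/Sales example from earlier in the deck, dictionary-encoded by hand. The structure of `columns` and the function names are assumptions; Row IDs start at 1.

```python
columns = {
    "Country": {"dictionary": ["JP", "UK", "US"], "value_ids": [2, 2, 0, 1]},
    "Product": {"dictionary": ["Alpha", "Beta"], "value_ids": [0, 1, 0, 0]},
    "Sales": {"dictionary": [450, 700, 1250, 3000], "value_ids": [3, 2, 1, 0]},
}


def value_at(column, row_id):
    """Read the value ID at the row position, then look it up in the dictionary."""
    value_id = column["value_ids"][row_id - 1]
    return column["dictionary"][value_id]


def materialize_row(row_id):
    """Rebuild the complete logical row by repeating the lookup for every column."""
    return {name: value_at(column, row_id) for name, column in columns.items()}


print(materialize_row(2))
```

Both steps are direct array accesses, but they are repeated once per column, which is why reading complete records favors the row store.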


Column Store Join Operation

Can calculate Inner joins, Right Outer joins, Left Outer joins, and Full Outer joins.
Limited to Equi-Joins only.
Following is a Join example (using Value ID):


Column Store Join Operation
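An inner equi-join on value IDs can be sketched as follows. This is one plausible approach, shown as an assumption rather than the engine's actual algorithm: the value IDs of one dictionary are translated into the other dictionary once, and the rows are then matched on integers. All names and sample data are hypothetical; row positions are 0-based.

```python
def translate(dict_a, dict_b):
    """Map each value ID of dictionary A to the value ID in dictionary B (None if absent)."""
    position_in_b = {value: vid for vid, value in enumerate(dict_b)}
    return [position_in_b.get(value) for value in dict_a]


def inner_join(ids_a, dict_a, ids_b, dict_b):
    """Return pairs (row in A, row in B) whose join attribute values are equal."""
    a_to_b = translate(dict_a, dict_b)
    rows_by_id_b = {}
    for row_b, vid_b in enumerate(ids_b):
        rows_by_id_b.setdefault(vid_b, []).append(row_b)
    result = []
    for row_a, vid_a in enumerate(ids_a):
        for row_b in rows_by_id_b.get(a_to_b[vid_a], []):
            result.append((row_a, row_b))
    return result


# Table A: country per sales row. Table B: one row per known country.
dict_a, ids_a = ["JP", "UK", "US"], [2, 2, 0, 1]  # US, US, JP, UK
dict_b, ids_b = ["DE", "JP", "US"], [1, 0, 2]     # JP, DE, US

print(inner_join(ids_a, dict_a, ids_b, dict_b))
```

String comparison happens only once per distinct value, during the dictionary translation; the row matching itself works purely on value IDs.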


Thank You!
