OLAP Implementation Techniques: High Performance Data Warehouse Design and Construction
OLAP Implementation Techniques
Objectives
Provide a robust framework for OLAP techniques in decision support.
Characterize tradeoffs in performance, scalability, flexibility, and complexity associated with various OLAP implementation techniques.
Examine tradeoffs in aggregate construction.
Topics
OLAP framework for decision support.
Physical implementation techniques: MOLAP, ROLAP, HOLAP, and DOLAP.
Star schema design.
Where Does OLAP Fit In?
See www.olapreport.com/DataExplosion.htm
Virtual Cubes
Virtual cubes are used when there is a need to join information from two dissimilar cubes that share one or more common dimensions.
Similar to a relational view; two (or more) cubes are linked along common dimension(s).
Often used to save space by eliminating redundant storage of information.
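The relational-view analogy above can be sketched with SQLite from Python. This is only an illustration under stated assumptions: the table names (sales_cube, weather_cube), the view name (sales_weather_vcube), the columns, and the sample rows are all hypothetical, and a real OLAP server would define virtual cubes through its own tooling rather than through SQL views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two hypothetical cubes that share the (store_id, day_dt) dimensions.
    CREATE TABLE sales_cube   (store_id INTEGER, day_dt TEXT, sales_amt REAL);
    CREATE TABLE weather_cube (store_id INTEGER, day_dt TEXT, rainfall_mm REAL);

    INSERT INTO sales_cube VALUES
        (1, '2000-11-23', 100.0),
        (1, '2000-11-24', 250.0),
        (2, '2000-11-23',  80.0);
    INSERT INTO weather_cube VALUES
        (1, '2000-11-23',  0.0),
        (1, '2000-11-24', 12.5),
        (2, '2000-11-23',  3.0);

    -- The "virtual cube": no data is copied; the two cubes are linked
    -- along their common dimensions when the view is queried.
    CREATE VIEW sales_weather_vcube AS
    SELECT s.store_id, s.day_dt, s.sales_amt, w.rainfall_mm
    FROM sales_cube s
    JOIN weather_cube w
      ON w.store_id = s.store_id
     AND w.day_dt   = s.day_dt;
""")

# Query the virtual cube as if it were a single cube: sales on rainy days.
rainy_day_sales = conn.execute(
    "SELECT SUM(sales_amt) FROM sales_weather_vcube WHERE rainfall_mm > 0"
).fetchone()[0]
print(rainy_day_sales)
```

Because the view stores no rows of its own, the weather measures are never duplicated into the sales cube, which is the space saving the slide refers to.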
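Star schema diagram (many-to-one joins from the fact table to each dimension table):
Fact table: ITEM#, RECEIPT#, STORE#, DATE, ..., $
Dimension table joined on ITEM#: ITEM#, CATEGORY, DEPT, MFCTR, ...
Dimension table joined on STORE# and DATE: STORE#, DATE, WEATHER
A vastly simplified model; it may even summarize out RECEIPT#.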
Simplified Star Schema
Assume:
4 billion rows in the fact table.
20 different kinds (size, color, style) of raincoats (a product category) out of 50,000 UPCs in the store.
8 stores out of 400 are in the Boston SMSA.
2 years of POS history in the DBMS.
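To see why these numbers matter, the assumptions above can be turned into rough row counts. This is a back-of-the-envelope sketch that assumes sales are spread uniformly across UPCs and stores (real POS data will be skewed) and that the query of interest is raincoat sales in Boston stores; the variable names are illustrative.

```python
FACT_ROWS = 4_000_000_000
RAINCOAT_UPCS, TOTAL_UPCS = 20, 50_000
BOSTON_STORES, TOTAL_STORES = 8, 400

# Assumption: fact rows are distributed uniformly over UPCs and stores.
# Integer arithmetic keeps the estimates exact.
rows_item_only = FACT_ROWS * RAINCOAT_UPCS // TOTAL_UPCS
rows_store_only = FACT_ROWS * BOSTON_STORES // TOTAL_STORES
rows_both = FACT_ROWS * RAINCOAT_UPCS * BOSTON_STORES // (TOTAL_UPCS * TOTAL_STORES)

print(f"raincoat rows, any store : {rows_item_only:,}")
print(f"Boston rows, any item    : {rows_store_only:,}")
print(f"raincoat rows in Boston  : {rows_both:,}")
```

Under these assumptions, filtering on either dimension alone still leaves millions of fact rows to visit, while applying both filters together leaves about 32,000. That gap is the motivation for combining dimension restrictions before touching the fact table.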
Star Schema for High Performance
Advanced (better performance) approach to query execution:
Forcing a Cartesian Product Join
Add an additional “join_value” column to each dimensional table.
Set join_value to the same value in all rows of the dimensional tables.
Add additional where clause predicates joining on this column between the dimensional tables.
Sample code:
select sum(d_sales_detail.sales_amt)
from d_sales_detail
,store
,item
,period
-- fact-to-dimension joins
where d_sales_detail.store_id = store.store_id
and d_sales_detail.item_id = item.item_id
and d_sales_detail.day_dt = period.day_dt
-- dimension restrictions
and period.day_dt between '23-NOV-2000' and '24-DEC-2000'
and item.trade_style_cd = 'BARBIE'
and store.state_cd = 'CA'
-- join_value predicates linking the dimensional tables to one another
and store.join_value = period.join_value
and store.join_value = item.join_value
and period.join_value = item.join_value
;
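The effect of the join_value predicates can be seen by running only the dimension part of the query. The sketch below uses SQLite from Python with a handful of made-up rows; the sample data and the 'OTHER' trade style are hypothetical, and dates are stored as ISO strings because SQLite has no native date type. It shows which combinations the Cartesian product yields, not how any particular optimizer chooses its plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Every dimension row carries the same join_value (here: 1).
    CREATE TABLE store  (store_id INTEGER, state_cd TEXT, join_value INTEGER);
    CREATE TABLE item   (item_id INTEGER, trade_style_cd TEXT, join_value INTEGER);
    CREATE TABLE period (day_dt TEXT, join_value INTEGER);

    INSERT INTO store  VALUES (1, 'CA', 1), (2, 'CA', 1), (3, 'NY', 1);
    INSERT INTO item   VALUES (10, 'BARBIE', 1), (11, 'BARBIE', 1),
                              (12, 'BARBIE', 1), (13, 'OTHER', 1);
    INSERT INTO period VALUES ('2000-11-23', 1), ('2000-11-24', 1),
                              ('2001-01-05', 1);
""")

# Dimension-only part of the query: the join_value predicates are always
# true, so the result is the Cartesian product of the qualifying rows.
combos = conn.execute("""
    SELECT store.store_id, item.item_id, period.day_dt
    FROM store, item, period
    WHERE store.state_cd = 'CA'
      AND item.trade_style_cd = 'BARBIE'
      AND period.day_dt BETWEEN '2000-11-23' AND '2000-12-24'
      AND store.join_value = period.join_value
      AND store.join_value = item.join_value
""").fetchall()

# 2 CA stores x 3 BARBIE items x 2 qualifying days
print(len(combos))
```

Each resulting (store_id, item_id, day_dt) combination is a candidate key into the fact table, so the product stays small whenever the dimension restrictions are selective.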
Star Schema for High Performance
Bottom Line:
It is not at all unusual to obtain an order-of-magnitude (or greater) performance advantage using a star schema with advanced indexing versus a more traditional relational database implementation.
Despite what vendors may tell you, star schemas cannot be effectively implemented for all DSS business applications and/or data models.
ROLAP
Summary tables in a naive implementation require all combinations of the dimensions at each aggregation level...

            Store   Zone   District   Region   All Stores
All Days      13     19       24        28         30
Year           9     15       22        27         29
Quarter        6     11       18        23         26
Period         4      8       14        21         25
Week           2      5       10        17         20
Day            1      3        7        12         16
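The count of 30 in the grid is just the product of the number of levels in each dimension. A short sketch makes the growth explicit; the time and store levels are taken from the grid, while the item hierarchy added at the end is a hypothetical third dimension used only to show how quickly the count multiplies.

```python
from itertools import product

time_levels = ["Day", "Week", "Period", "Quarter", "Year", "All Days"]
store_levels = ["Store", "Zone", "District", "Region", "All Stores"]

# One summary table per (time level, store level) combination.
summary_tables = list(product(time_levels, store_levels))
print(len(summary_tables))

# Hypothetical third dimension (not part of the grid above).
item_levels = ["Item", "Category", "Dept", "All Items"]
with_items = list(product(time_levels, store_levels, item_levels))
print(len(with_items))
```

Each added dimension multiplies the number of tables by its number of levels, which is why a naive build-everything approach becomes impractical quickly.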
Warning: Do not assume that dimensions are always simple hierarchies.
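One common illustration of this warning (an assumption here, not necessarily the case the slide had in mind) is the calendar: weeks do not nest cleanly inside months, so a week-level aggregate cannot always be rolled up into a month-level one. The helper below, months_in_week, is a hypothetical name and treats weeks as running Monday to Sunday.

```python
from datetime import date, timedelta

def months_in_week(day: date) -> set:
    """Return the (year, month) pairs touched by the Monday-to-Sunday week containing `day`."""
    monday = day - timedelta(days=day.weekday())
    week_days = [monday + timedelta(days=i) for i in range(7)]
    return {(d.year, d.month) for d in week_days}

# The week containing 30 Nov 2000 runs from Monday 27 Nov to Sunday 3 Dec.
months = months_in_week(date(2000, 11, 30))
print(sorted(months))
```

When a week maps to more than one month, Week and Month are parallel rollup paths from Day rather than successive levels of one hierarchy, and summary tables have to be planned accordingly.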
Many ROLAP products have devised ways to reduce the number of summary tables:
Intelligent Aggregation Selection