1.DWH Concepts - 23.24
, HCMC, VN
Tel: +84 8 37221223, Fax: +84 8 38960640
DATA WAREHOUSE CONCEPTS (DAWH430784)
DAWH430784 1/16/2024 1
OUTLINE
HCMC UNIVERSITY OF TECHNOLOGY AND EDUCATION
❑Multidimensional Model
❑OLAP Operations
Motivation for DWH
Data Warehouse
DWH
Figure (labels from top to bottom): Pattern Evaluation, Data Mining, Task-relevant Data, Data Cleaning, Data Integration, Databases
DWH in Business Intelligence
Figure: layers with increasing potential to support business decisions, including Data Exploration (Statistical Summary, Querying, and Reporting) and, at the top, Decision Making (End User)
OLTP vs. OLAP
Transaction processing (OLTP)
• Primary data from transactions
• Daily operations and short-term decisions
Business intelligence processing (OLAP)
• Transformed secondary data
• Medium and long-term decisions
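The contrast can be sketched with Python's built-in sqlite3 module. The table name `sales`, its columns, and the rows are hypothetical and not taken from the slides. The first part records individual transactions (OLTP style), and the second part runs a summarizing query over them (OLAP style).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, used only for illustration.
conn.execute(
    "CREATE TABLE sales (city TEXT, quarter TEXT, product TEXT, quantity INTEGER)"
)

# OLTP style: many small writes, one row per business transaction.
transactions = [
    ("Paris", "Q1", "books", 3),
    ("Paris", "Q1", "books", 2),
    ("Nice", "Q1", "books", 4),
    ("Paris", "Q2", "CDs", 1),
]
with conn:  # commits the inserts as one unit
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", transactions)

# OLAP style: a read-only query that aggregates over many rows.
summary = conn.execute(
    "SELECT city, quarter, SUM(quantity) FROM sales "
    "GROUP BY city, quarter ORDER BY city, quarter"
).fetchall()
print(summary)
```

In practice the analytical query would usually run against a separate, transformed copy of the data rather than against the operational tables themselves.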
Multidimensional Model
Figure: a three-dimensional cube with dimensions Store (City), Time (Quarter), and Product (Category); each cell holds a measure value.
Front face of the cube (Store = Paris):
        books  games  CDs  DVDs
Q1         21     10   18    35
Q2         27     14   11    30
Q3         26     12   35    32
Q4         14     20   47    31
Top face of the cube (Time = Q1):
        books  games  CDs  DVDs
Milan      24     18   28    14
Rome       33     25   23    25
Nice       12     20   24    33
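A cube like the one in the figure can be sketched as a mapping from dimension coordinates to measure values. This is only an illustration: the name `cube` and the choice of a Python dictionary are assumptions, the column order (books, games, CDs, DVDs) is read from the figure's staggered axis labels, and only the Q1 cells legible in the figure are included.

```python
# Each cell is addressed by one member per dimension:
# (Store city, Time quarter, Product category) -> measure value.
PRODUCTS = ["books", "games", "CDs", "DVDs"]

# Q1 values per city, as read from the figure.
Q1_BY_CITY = {
    "Paris": [21, 10, 18, 35],
    "Nice": [12, 20, 24, 33],
    "Rome": [33, 25, 23, 25],
    "Milan": [24, 18, 28, 14],
}

cube = {
    (city, "Q1", product): value
    for city, row in Q1_BY_CITY.items()
    for product, value in zip(PRODUCTS, row)
}

print(cube[("Paris", "Q1", "books")])  # 21
print(len(cube))  # 16 cells: 4 cities x 1 quarter x 4 products
```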
Hierarchies
❑ Example
▪ Hierarchies of the Product, Time, and Customer dimensions
Measure Aggregation and Summarizability
❑ Measures are aggregated when hierarchies are used to visualize data at different abstraction levels
❑ Summarizability refers to the correct aggregation of cube measures along dimension hierarchies
❑ Summarizability conditions:
▪ Disjointness of instances: grouping the instances of a level by their parent in the next level must result in disjoint sets
▪ Completeness: all instances are included in the hierarchy and each instance is related to one parent in the next level
▪ Correct use of aggregation functions (“measure type” condition): the type of a measure determines the kind of aggregation functions that can be applied
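The first two conditions can be checked mechanically on the child-to-parent links of a hierarchy. The sketch below is a minimal illustration under stated assumptions: the function name `check_summarizability` and the list-of-pairs representation are hypothetical, and the third condition (measure type) is not covered.

```python
def check_summarizability(children, parent_links):
    """Check disjointness and completeness for one hierarchy step.

    children:     all instances of the lower level.
    parent_links: (child, parent) pairs linking to the next level.
    """
    parents = {}
    for child, parent in parent_links:
        parents.setdefault(child, set()).add(parent)
    # Disjointness: no child may be grouped under two different parents.
    disjoint = all(len(p) <= 1 for p in parents.values())
    # Completeness: every child must be linked to a parent.
    complete = all(child in parents for child in children)
    return disjoint, complete


cities = ["Paris", "Nice", "Rome", "Milan"]
links = [
    ("Paris", "France"),
    ("Nice", "France"),
    ("Rome", "Italy"),
    ("Milan", "Italy"),
]
print(check_summarizability(cities, links))  # (True, True)

# Rome linked to two countries breaks disjointness;
# Milan with no country breaks completeness.
bad_links = links[:3] + [("Rome", "France")]
print(check_summarizability(cities, bad_links))  # (False, False)
```

When either check fails, summing a measure along that hierarchy would count some values twice or leave some out.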
Measure Classification
OLAP Operations: Roll-up
Figure: roll-up of the cube from the City level to the Country level of the Store dimension (Italy = Rome + Milan, France = Paris + Nice).
Q1 values by city (original cube) and by country (rolled-up cube):
        books  games  CDs  DVDs
Milan      24     18   28    14
Rome       33     25   23    25
Nice       12     20   24    33
Paris      21     10   18    35
Italy      57     43   51    39
France     33     30   42    68
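The roll-up from City to Country amounts to summing the rows of all cities that share a parent. The sketch below uses the Q1 values legible in the figure. The function name `roll_up`, the mapping `COUNTRY_OF`, and the list-per-row layout are assumptions made for illustration.

```python
from collections import defaultdict

PRODUCTS = ["books", "games", "CDs", "DVDs"]
COUNTRY_OF = {"Paris": "France", "Nice": "France", "Rome": "Italy", "Milan": "Italy"}

# Q1 values per city, as read from the figure.
Q1_BY_CITY = {
    "Paris": [21, 10, 18, 35],
    "Nice": [12, 20, 24, 33],
    "Rome": [33, 25, 23, 25],
    "Milan": [24, 18, 28, 14],
}


def roll_up(by_child, parent_of):
    """Sum child-level rows into their parent-level rows."""
    by_parent = defaultdict(lambda: [0] * len(PRODUCTS))
    for child, row in by_child.items():
        parent = parent_of[child]
        by_parent[parent] = [a + b for a, b in zip(by_parent[parent], row)]
    return dict(by_parent)


q1_by_country = roll_up(Q1_BY_CITY, COUNTRY_OF)
print(q1_by_country["Italy"])   # [57, 43, 51, 39]
print(q1_by_country["France"])  # [33, 30, 42, 68]
```

The sums agree with the Italy and France rows of the rolled-up cube in the figure.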
OLAP Operations: Drill down
Figure: drill-down of the cube from the Quarter level to the Month level of the Time dimension.
Front face after the drill-down (Store = Paris):
        books  games  CDs  DVDs
Jan         7      2    6    13
Feb         8      4    8    12
Mar         6      4    4    10
...
Dec         4      4   16     7
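Drill-down is the inverse of roll-up: it replaces each quarter row with the month rows that make it up. The sketch below uses the Paris values legible in the figure for January to March. The names `drill_down` and `QUARTER_OF` are hypothetical, and only the first quarter is included.

```python
# Month-level rows for the Paris store (columns: books, games, CDs, DVDs).
PARIS_BY_MONTH = {
    "Jan": [7, 2, 6, 13],
    "Feb": [8, 4, 8, 12],
    "Mar": [6, 4, 4, 10],
}
QUARTER_OF = {"Jan": "Q1", "Feb": "Q1", "Mar": "Q1"}
PARIS_BY_QUARTER = {"Q1": [21, 10, 18, 35]}


def drill_down(quarter):
    """Return the month-level rows that belong to one quarter."""
    return {
        month: row
        for month, row in PARIS_BY_MONTH.items()
        if QUARTER_OF[month] == quarter
    }


months = drill_down("Q1")
# Consistency check: the detailed rows must sum back to the quarter row.
totals = [sum(column) for column in zip(*months.values())]
print(totals == PARIS_BY_QUARTER["Q1"])  # True
```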
OLAP Operations: Pivot or Rotate
Figure: pivot of the cube, rotating the axes so that Time (Quarter) runs horizontally, Store (City) vertically, and Product (Category) in depth.
Front face after the pivot (Product = books):
        Q1  Q2  Q3  Q4
Paris   21  27  26  14
Nice    12  14  11  13
Rome    33  28  35  32
Milan   24  23  25  18
Top face after the pivot (Store = Paris):
        Q1  Q2  Q3  Q4
DVDs    35  30  32  31
CDs     18  11  35  47
games   10  14  12  20
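A pivot changes which dimension is shown on which axis without changing any cell value. The sketch below shows the two-dimensional case on the Paris face of the cube, swapping the Time and Product axes. The function name `pivot` and the table layout are assumptions for illustration.

```python
PRODUCTS = ["books", "games", "CDs", "DVDs"]

# Paris face of the cube: one row per quarter, one column per product.
PARIS = {
    "Q1": [21, 10, 18, 35],
    "Q2": [27, 14, 11, 30],
    "Q3": [26, 12, 35, 32],
    "Q4": [14, 20, 47, 31],
}


def pivot(table, columns):
    """Swap the row and column axes of a {row: [values]} table."""
    rows = list(table)
    return {
        column: [table[row][i] for row in rows]
        for i, column in enumerate(columns)
    }


by_product = pivot(PARIS, PRODUCTS)
print(by_product["DVDs"])   # [35, 30, 32, 31]
print(by_product["books"])  # [21, 27, 26, 14]
```

The DVDs row matches the corresponding row on the top face of the pivoted cube in the figure.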
OLAP Operations: Slice
Figure: slice of the cube; the result is a two-dimensional Time (Quarter) by Product (Category) table whose values match the Paris face of the cube (Q1: 21 10 18 35; Q2: 27 14 11 30; Q3: 26 12 35 32; Q4: 14 20 47 31).
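A slice fixes one dimension to a single member and drops that dimension from the result. The sketch below assumes the slice is on Store.City = 'Paris'; the condition is not legible in the figure, but the result there matches the Paris face. The function name `slice_cube` and the dictionary layout are hypothetical, and only Q1 cells are included.

```python
PRODUCTS = ["books", "games", "CDs", "DVDs"]
Q1_BY_CITY = {
    "Paris": [21, 10, 18, 35],
    "Nice": [12, 20, 24, 33],
    "Rome": [33, 25, 23, 25],
    "Milan": [24, 18, 28, 14],
}
# Cells keyed by (city, quarter, product).
cube = {
    (city, "Q1", product): value
    for city, row in Q1_BY_CITY.items()
    for product, value in zip(PRODUCTS, row)
}


def slice_cube(cube, axis, member):
    """Fix one dimension to a single member and drop that dimension."""
    return {
        key[:axis] + key[axis + 1:]: value
        for key, value in cube.items()
        if key[axis] == member
    }


paris = slice_cube(cube, 0, "Paris")  # axis 0 is the Store dimension
print(paris[("Q1", "books")])  # 21
print(len(paris))  # 4 cells remain, keyed by (quarter, product)
```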
OLAP Operations: Dice
Figure: dice on Store.Country = 'France' and (Time.Quarter = 'Q1' or Time.Quarter = 'Q2'); the resulting sub-cube keeps the cities Paris and Nice and the quarters Q1 and Q2, with all four product categories.
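Unlike a slice, a dice keeps every dimension and only restricts the members along them. The sketch below applies the condition from the figure to the cells that are legible there, so the cube is incomplete (for example, Nice has no Q2 row). The function name `dice` and the mapping `COUNTRY_OF` are assumptions for illustration.

```python
PRODUCTS = ["books", "games", "CDs", "DVDs"]
COUNTRY_OF = {"Paris": "France", "Nice": "France", "Rome": "Italy", "Milan": "Italy"}

# Rows legible in the figure, keyed by (city, quarter).
ROWS = {
    ("Paris", "Q1"): [21, 10, 18, 35],
    ("Paris", "Q2"): [27, 14, 11, 30],
    ("Paris", "Q3"): [26, 12, 35, 32],
    ("Paris", "Q4"): [14, 20, 47, 31],
    ("Nice", "Q1"): [12, 20, 24, 33],
    ("Rome", "Q1"): [33, 25, 23, 25],
    ("Milan", "Q1"): [24, 18, 28, 14],
}
cube = {
    (city, quarter, product): value
    for (city, quarter), row in ROWS.items()
    for product, value in zip(PRODUCTS, row)
}


def dice(cube, predicate):
    """Keep the cells whose coordinates satisfy the predicate."""
    return {key: value for key, value in cube.items() if predicate(*key)}


sub_cube = dice(
    cube,
    lambda city, quarter, product: COUNTRY_OF[city] == "France"
    and quarter in ("Q1", "Q2"),
)
print(sorted({(city, quarter) for city, quarter, _ in sub_cube}))
# [('Nice', 'Q1'), ('Paris', 'Q1'), ('Paris', 'Q2')]
```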
OLAP Operations – Summary
Top Down
• Enterprise data warehouse
• Higher integration levels
• Logically centralized
• Larger project scope
Bottom Up
• Independent data marts
• Lower integration levels
• Logically decentralized
• Smaller project scope
Bottom-up Architecture
Figure: bottom-up architecture, showing operational databases and an external data source, a transformation process, and a data mart tier whose data marts serve user departments.
Top-Down Architecture
Figure: top-down architecture, showing operational databases and an external data source, an extraction process into a staging area, a transformation process, a data warehouse holding detailed and summarized data (with the EDM), and data marts.
General Architecture
✓ Data sources
▪ Operational databases
▪ Other internal or external sources of information (e.g. files)
✓ Back-end tier
▪ Extraction-Transformation-Loading (ETL) tools for manipulating data from sources
▪ Data staging area: Intermediate database where manipulation is done
✓ OLAP tier
▪ OLAP Server: Supports multidimensional data and operations
✓ Front-end tier: Deals with data analysis and visualization
▪ Composed of OLAP tools, reporting tools, statistical tools, data-mining tools, …
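The back-end tier can be illustrated with a very small ETL pipeline. Everything in this sketch is hypothetical: the functions `extract`, `transform`, and `load`, the source rows, and the `sales` table are made up, with an in-memory SQLite database standing in for the warehouse and a Python list standing in for the staging area.

```python
import sqlite3


def extract():
    """Stand-in for reading raw rows from an operational source."""
    return [
        {"city": " paris ", "quarter": "Q1", "quantity": "3"},
        {"city": "Nice", "quarter": "Q1", "quantity": "4"},
        {"city": "Paris", "quarter": "Q1", "quantity": "2"},
    ]


def transform(rows):
    """Clean staged rows: normalize city names, convert quantities to int."""
    return [
        (row["city"].strip().title(), row["quarter"], int(row["quantity"]))
        for row in rows
    ]


def load(conn, rows):
    """Write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(city TEXT, quarter TEXT, quantity INTEGER)"
    )
    with conn:
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract()))

# A front-end style query over the loaded data.
totals = warehouse.execute(
    "SELECT city, SUM(quantity) FROM sales GROUP BY city ORDER BY city"
).fetchall()
print(totals)  # [('Nice', 4), ('Paris', 5)]
```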
Extraction-Transformation-Loading