DWI - Lecture - 4 - Dimensional Basics
Data Warehouses
• Goal:
Multidimensional model
Logical model – facts, measures and dimensions
What is a data cube?
Drilling into data using dimensions
• “Through measurement
comes knowledge.”
Heike Kamerlingh Onnes
Information systems
• OLAP users
Manager, analyst, management, etc.
The management of the organization is typically interested
in aggregated data – specific numbers, like:
• Business processes
operational activities performed by the organization
such as taking an order, processing an insurance claim, enrolling
students for class, or creating a snapshot of each account each
month
managers are interested in capturing performance metrics
events generated
• Managers
can tell you what
measurement units are
important for them
think of the business in
• Business dimensions
describe the business-
[Figure: plot of Order Amount by ID]
Multidimensional data
[Figure: plot of Order Amount by ID, followed by further plots whose axis labels were not preserved]
Multidimensional data
• Dimensions
Perspectives on the phenomena
Group correlated attributes into the same dimension to ease analysis
• In general
if an attribute is commonly aggregated or summarized, it is a fact
if an attribute is used to drive aggregations or summarizations, it
is a dimension.
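The rule of thumb above can be illustrated with a minimal Python sketch. The order rows, the attribute names (region, product, amount) and the helper summarize are invented for illustration: amount is the attribute being summed (a fact), while region or product drives the grouping (a dimension).

```python
from collections import defaultdict

# Hypothetical order rows; names and values are invented for illustration.
orders = [
    {"region": "North", "product": "CD", "amount": 10.0},
    {"region": "North", "product": "DVD", "amount": 15.0},
    {"region": "South", "product": "CD", "amount": 5.0},
]

def summarize(rows, dimension, fact):
    """Sum the `fact` attribute for each value of the `dimension` attribute."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[fact]
    return dict(totals)

# The same fact, summarized along two different dimensions.
amount_by_region = summarize(orders, "region", "amount")
amount_by_product = summarize(orders, "product", "amount")
print(amount_by_region)
print(amount_by_product)
```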
Multi-dimensional Model
• Facts
describe business processes, certain business events
E.g. a single game at a slot machine
• Measures
quantitative context
• Dimensions
provide the “who, what, where, when, why, and how” context
surrounding the subject – business process event
contain the descriptive attributes
used by BI applications for filtering and grouping of the facts
• Examples:
Customer, Product, Time, Store, ...
• Dimension:
A dimension is the main analytical
object in a BI space.
can be a list of products or customers,
[Figure: OLTP transactions tran1 to tran4 and the events recorded in the DW]
• Query
Show me sales, profit and average call volume per day
for my 10 most profitable salespeople
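A query of this kind could be sketched in Python roughly as follows. The row layout, the sample values and the function top_salespeople_report are assumptions made for illustration, and "average call volume per day" is read here as the mean number of calls per working day of each salesperson.

```python
from collections import defaultdict

# Hypothetical activity rows: (salesperson, day, sales, profit, calls).
rows = [
    ("Ann", "2024-01-01", 100.0, 30.0, 12),
    ("Ann", "2024-01-02", 200.0, 50.0, 8),
    ("Bob", "2024-01-01", 300.0, 20.0, 20),
    ("Bob", "2024-01-02", 100.0, 10.0, 10),
    ("Cid", "2024-01-01", 50.0, 5.0, 4),
]

def top_salespeople_report(rows, n):
    """Total sales, total profit and average daily calls for the n most profitable salespeople."""
    sales = defaultdict(float)
    profit = defaultdict(float)
    calls_per_day = defaultdict(lambda: defaultdict(int))
    for person, day, s, p, c in rows:
        sales[person] += s
        profit[person] += p
        calls_per_day[person][day] += c
    # Rank by total profit, highest first, and keep the first n.
    best = sorted(profit, key=profit.get, reverse=True)[:n]
    report = []
    for person in best:
        days = calls_per_day[person]
        avg_calls = sum(days.values()) / len(days)
        report.append((person, sales[person], profit[person], avg_calls))
    return report

# The slide asks for the top 10; n=2 keeps the toy data readable.
report = top_salespeople_report(rows, n=2)
for line in report:
    print(line)
```

Note that the analyst thinks in terms of measures (sales, profit, calls) and dimensions (salesperson, day), not in terms of individual transactions.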
• Components
Facts
a focus of interest for decision-making - subject
• Logical
data as a multidimensional
cube
• Perspective
It’s not the bottom-up perspective of rows
The dataset is treated as a whole
We do not really care about the individual rows.
Dataset perspective
• Multidimensional data
viewed as cubes:
e.g. sales cube
e.g. Olympic standings cube
• Multi-dimensional data
Each measure’s value is
accessed
Using coordinate system
composed of a set of dimensions
Particular attributes of the
Location
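The coordinate idea can be sketched with a plain Python dictionary whose keys are tuples of dimension members. The dimension order (product, location, month), the sample values and the helper cell are assumptions for illustration only.

```python
# A tiny sales cube: each cell is addressed by one member per dimension.
# Dimension order is (product, location, month); all values are invented.
sales_cube = {
    ("CD", "Warsaw", "2024-01"): 120,
    ("CD", "Krakow", "2024-01"): 80,
    ("DVD", "Warsaw", "2024-01"): 60,
    ("DVD", "Warsaw", "2024-02"): 90,
}

def cell(cube, product, location, month):
    """Return the measure stored at the given coordinates, or None if the cell is empty."""
    return cube.get((product, location, month))

warsaw_cd_jan = cell(sales_cube, "CD", "Warsaw", "2024-01")
empty = cell(sales_cube, "CD", "Krakow", "2024-02")
print(warsaw_cd_jan, empty)
```

Storing only the non-empty cells is one possible design choice; a dense multi-dimensional array would be another.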
Data as a Cube
• Multidimensional
many dimensions, many
axes
Product
• Multi-dimensional data
Regional manager perspective
Dimension
Region
Product
• Multi-dimensional data
Product manager perspective
Dimension
Region
Data as a cube
• Multi-dimensional data
Multiple measures
A point in a multi-dimensional space can represent several values
Add virtual dimensions representing measures
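One way to picture the virtual measure dimension is the following sketch. The cube contents, the measure names quantity and revenue, and the helper add_measure_dimension are hypothetical.

```python
# Cells holding several measures at once; coordinates are (product, region).
# All values are invented.
cube = {
    ("CD", "North"): {"quantity": 3, "revenue": 30.0},
    ("DVD", "North"): {"quantity": 1, "revenue": 15.0},
}

def add_measure_dimension(cube):
    """Turn each named measure into a member of an extra 'measure' dimension."""
    flat = {}
    for coords, measures in cube.items():
        for name, value in measures.items():
            # The measure name becomes the last coordinate of the cell.
            flat[coords + (name,)] = value
    return flat

flat_cube = add_measure_dimension(cube)
print(flat_cube[("CD", "North", "revenue")])
```

After the transformation every point of the space again holds exactly one value.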
• Aggregations
OLAP queries also require
the ability to aggregate data
over different dimensions
• Aggregations
Many dimensions = many
aggregates
We need to calculate many
• Left to right
How fine-grained data turns
into a cube.
• Right to left
How the cube containing all
• Different aggregations in
each slice are possible
average, median, minimum,
maximum, etc.
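As a small sketch of applying different aggregations to the same fine-grained data, assume each cell keeps its detail values in a list. The cell contents and the helper aggregate are invented for illustration; mean and median come from the Python standard library.

```python
from statistics import mean, median

# Hypothetical fine-grained sales amounts keyed by (product, month).
cells = {
    ("CD", "Jan"): [10, 20, 30],
    ("CD", "Feb"): [5, 15],
    ("DVD", "Jan"): [40, 60],
}

def aggregate(cells, func):
    """Apply one aggregation function to the detail values of every cell."""
    return {coords: func(values) for coords, values in cells.items()}

averages = aggregate(cells, mean)
medians = aggregate(cells, median)
minima = aggregate(cells, min)
maxima = aggregate(cells, max)
print(averages)
```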
Multidimensional Data
• Dimension hierarchies
are organized in classification levels
e.g., Day, Month, …
dependencies between the classification levels
are described by the classification schema through functional dependencies
• Example:
classification hierarchy path from product dimension
fully-ordered set of classification levels is called a Path
• Dimension hierarchies
A concept hierarchy defines a sequence of mappings from a set
of low-level concepts to higher-level, more general concepts.
• Dimension hierarchies
may also be defined by discretizing or grouping values for a given
dimension or attribute, resulting in a set-grouping hierarchy
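Both kinds of hierarchy can be sketched in a few lines of Python. The city, state and country names, the price thresholds and the two helper functions are assumptions chosen for illustration. A dictionary models the functional dependency between levels, because each key maps to exactly one parent.

```python
# Schema hierarchy for a hypothetical location dimension: city -> state -> country.
city_to_state = {"Chicago": "Illinois", "Urbana": "Illinois", "Toronto": "Ontario"}
state_to_country = {"Illinois": "USA", "Ontario": "Canada"}

def roll_up_city(city):
    """Follow the classification path city -> state -> country -> ALL."""
    state = city_to_state[city]
    return [city, state, state_to_country[state], "ALL"]

def price_range(price):
    """Set-grouping hierarchy: discretize a numeric price into a named range."""
    if price < 10:
        return "cheap"
    if price < 100:
        return "moderate"
    return "expensive"

path = roll_up_city("Urbana")
ranges = [price_range(p) for p in (4.99, 10, 250)]
print(path)
print(ranges)
```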
• Aggregates
• Data cube
Lattice of cuboids
• M-D cuboid
M dimensions, with N-M
summaries
E.g. 3-D cuboid for time, item,
and location - summarized for all
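The lattice can be enumerated with a short sketch: ignoring hierarchies, N dimensions give 2^N cuboids, one per subset of dimensions to group by. The function name cuboids is an assumption; the dimension names follow the time, item, location example.

```python
from itertools import combinations

def cuboids(dimensions):
    """List every subset of the dimensions, from the apex (no dimensions) to the base cuboid."""
    result = []
    for k in range(len(dimensions) + 1):
        # All k-D cuboids: choose k of the N dimensions to keep.
        result.extend(combinations(dimensions, k))
    return result

lattice = cuboids(["time", "item", "location"])
for group_by in lattice:
    print(f"{len(group_by)}-D cuboid:", group_by)
```

The empty tuple is the apex cuboid (everything summarized), and the full tuple is the base cuboid holding the finest-grained data.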
• 2 dimensions
location
country
state
city
topic
• Navigation paths
• GROUP BY
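[Figure: GROUP BY lattice with the nodes <>, <Name>, <Brand>, <Category> and <Name, Category>, shown next to a hierarchy with the levels ALL, Category (CD, DVD) and Brand (TOS, BOS, TOS)]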
• Facts
A fact is an element of the multi-dimensional space
Associates a set of dimension elements with measures
“cube cell”
Granularity relates to dimensions
• Measures
Quantifying information
Usually numeric
In simple terms, measures are normally the values you want to
add up in reporting
• Dimensional Model
Focus is on how managers view the
• Multidimensional data
Aggregates
together with measure values
we store summarizing
information
• JAVA • C like
Product
Customer
Customer
Data as a star
Customer
Cubes and Stars
• Data in a cube
• Starnet query model
What can we do with the cube?
[Figure: starnet diagram with the hierarchy levels all, continent, country, branch and customer, group, all]
OLAP Operations
• OLAP operations
Drill down (roll-down)
Category → Product
Region → City
Quarter → Month
• OLAP operations
Roll up (drill-up)
Product → Category
City → Region
Month → Quarter
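Roll-up and drill-down along the Month → Quarter path can be sketched as follows. The sales figures, the mapping and the function names roll_up and drill_down are invented for illustration.

```python
from collections import defaultdict

# Hypothetical monthly sales and the Month -> Quarter level mapping.
monthly_sales = {"Jan": 10, "Feb": 20, "Mar": 30, "Apr": 40, "May": 50}
month_to_quarter = {"Jan": "Q1", "Feb": "Q1", "Mar": "Q1", "Apr": "Q2", "May": "Q2"}

def roll_up(detail, mapping):
    """Aggregate from the finer level to the coarser one (Month -> Quarter)."""
    totals = defaultdict(int)
    for member, value in detail.items():
        totals[mapping[member]] += value
    return dict(totals)

def drill_down(detail, mapping, parent):
    """Return the finer-level cells behind one coarser member (Quarter -> Month)."""
    return {m: v for m, v in detail.items() if mapping[m] == parent}

quarterly = roll_up(monthly_sales, month_to_quarter)
q2_months = drill_down(monthly_sales, month_to_quarter, "Q2")
print(quarterly)
print(q2_months)
```

Drill-down needs the detailed data to be available; it cannot be reconstructed from the rolled-up totals alone.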
• OLAP operations
Slice
Selection on one dimension of
the given cube
Result is a subcube
• OLAP operations
Dice
Selection on two or more
dimensions
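Slice and dice can both be sketched as filters over the cell coordinates. The cube contents, the dimension order in DIMS and the function names are assumptions (slice_ has a trailing underscore only to avoid shadowing the Python built-in slice). In this sketch the sliced dimension is kept in the coordinates; some tools drop it from the result instead.

```python
# Hypothetical cube with dimensions (product, region, quarter); values are invented.
cube = {
    ("CD", "North", "Q1"): 10,
    ("CD", "South", "Q1"): 20,
    ("DVD", "North", "Q1"): 30,
    ("DVD", "North", "Q2"): 40,
    ("CD", "North", "Q2"): 50,
}
DIMS = ("product", "region", "quarter")

def dice(cube, **allowed):
    """Keep cells whose members fall in the allowed set for each named dimension."""
    def keep(coords):
        return all(coords[DIMS.index(d)] in members for d, members in allowed.items())
    return {c: v for c, v in cube.items() if keep(c)}

def slice_(cube, dimension, member):
    """A slice is a selection on a single dimension; the result is a subcube."""
    return dice(cube, **{dimension: {member}})

q1_slice = slice_(cube, "quarter", "Q1")
north_cd = dice(cube, product={"CD"}, region={"North"})
print(q1_slice)
print(north_cd)
```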
• OLAP operations
Pivot
also known as rotation
rotates data axes
to provide an alternative
presentation of data
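For a two-dimensional view, pivoting amounts to swapping the row and column axes. The sketch below assumes a nested-dictionary view with invented products, regions and values; the function name pivot is hypothetical.

```python
# Hypothetical 2-D view: rows are products, columns are regions.
view = {
    "CD": {"North": 10, "South": 20},
    "DVD": {"North": 30, "South": 40},
}

def pivot(table):
    """Swap the row and column axes without changing any cell value."""
    rotated = {}
    for row_key, row in table.items():
        for col_key, value in row.items():
            rotated.setdefault(col_key, {})[row_key] = value
    return rotated

by_region = pivot(view)
print(by_region)
```

Pivoting twice returns the original view, which shows that only the presentation changes, not the data.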