TVDM Digital 5 - Dimensional Modeling
Data-Driven Decision-making
Dimensional Modeling
Diagram: Sales fact table with its dimension tables
Sales fact table: time_key, item_key, branch_key, location_key; Measures: units_sold, dollars_sold, avg_sales
time: time_key, day, day_of_the_week, month, quarter, year
item: item_key, item_name, brand, type, supplier_type
branch: branch_key, branch_name, branch_type
location: location_key, street, city, province_or_state, country
3 GEST S492 The Digital Firm
Diagram: the Sales fact table with supplier and city split into their own tables
Sales Fact Table: time_key, item_key, branch_key, location_key; Measures: units_sold, dollars_sold, avg_sales
branch: branch_key, branch_name, branch_type
item: item_key, item_name, brand, type, supplier_key
supplier: supplier_key, supplier_type
location: location_key, street, city_key
city: city_key, city, province_or_state, country
Diagram: two fact tables sharing dimension tables
Sales Fact Table: time_key, item_key, branch_key, location_key; Measures: units_sold, dollars_sold, avg_sales
Second fact table (shipping): time_key, item_key, shipper_key, from_location, to_location; Measures: dollars_cost, units_shipped
item: item_key, item_name, brand, type, supplier_type
branch: branch_key, branch_name, branch_type
location: location_key, street, city, province_or_state, country
shipper: shipper_key, shipper_name, location_key, shipper_type
Star Schema
Fact tables contain factual or quantitative data
1:N relationship between dimension tables and fact tables
An example
Fact table provides statistics for sales broken down by product, period and store dimensions
Modeling dates
Step 1
Joining necessary dimensions & Fact
Using connected dimensions to constrain
T.Time_Key IN (3,4)
P.Product_Key IN (3,4)
L.Location_Key IN (1,2)
Step 2
Step 3
SQL on Facts
Slice 1
SELECT SUM(F.Vehicles_Sold)
FROM Location L, Product P, Time T, Fact F
-- Joining necessary dimensions & Fact
WHERE L.Location_Key = F.Location_Key
AND P.Product_Key = F.Product_Key
AND T.Time_Key = F.Time_Key
-- Using connected dimensions to constrain
AND L.Continent = 'Europe'
AND P.Vehicle_Kind = 'Truck'
AND T.Year = 1997
Drill down to countries: add L.Continent, L.Country to the SELECT list and append
GROUP BY L.Continent, L.Country
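The slice and the drill-down can be tried on a toy database. The following is a minimal sketch using Python's built-in sqlite3 module; the table and column names follow the slide, while the rows (countries, years, vehicle counts) are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Location (Location_Key INTEGER PRIMARY KEY, Continent TEXT, Country TEXT);
CREATE TABLE Product  (Product_Key INTEGER PRIMARY KEY, Vehicle_Kind TEXT);
CREATE TABLE Time     (Time_Key INTEGER PRIMARY KEY, Year INTEGER);
CREATE TABLE Fact (Location_Key INTEGER, Product_Key INTEGER,
                   Time_Key INTEGER, Vehicles_Sold INTEGER);
-- Illustrative rows only (not from the slides)
INSERT INTO Location VALUES (1,'Europe','Belgium'), (2,'Europe','France'), (3,'Asia','Japan');
INSERT INTO Product  VALUES (1,'Car'), (2,'Truck');
INSERT INTO Time     VALUES (1,1996), (2,1997);
INSERT INTO Fact VALUES (1,2,2,5), (2,2,2,7), (3,2,2,9), (1,1,2,4), (1,2,1,3);
""")

# Joins between the fact table and its dimensions, then constraints
# expressed on dimension attributes (the "slice").
JOINS_AND_CONSTRAINTS = """
FROM Location L, Product P, Time T, Fact F
WHERE L.Location_Key = F.Location_Key
  AND P.Product_Key = F.Product_Key
  AND T.Time_Key = F.Time_Key
  AND L.Continent = 'Europe'
  AND P.Vehicle_Kind = 'Truck'
  AND T.Year = 1997
"""

# Slice: one total for European trucks sold in 1997.
total = conn.execute(
    "SELECT SUM(F.Vehicles_Sold)" + JOINS_AND_CONSTRAINTS
).fetchone()[0]

# Drill-down: same constraints, broken down by country.
by_country = conn.execute(
    "SELECT L.Continent, L.Country, SUM(F.Vehicles_Sold)"
    + JOINS_AND_CONSTRAINTS
    + "GROUP BY L.Continent, L.Country ORDER BY L.Country"
).fetchall()

print(total)
print(by_country)
```

The country totals add up to the slice total, which is what makes moving between the two levels consistent.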
An example
Star for Grocery stores
TVDM BI 6 Star Grocery.mdb
OLAP in Pivot Table Excel
TVDM BI 6 Star Grocery Pivot.xlsx
Dimensional Models
A denormalized relational model
Made up of tables with attributes
Relationships defined by keys and foreign keys
Organized for understandability and ease of reporting rather than update
Queried and maintained by SQL or special purpose management tools
Because the semantic structure is known, and the surrogate key mechanism is standard, tools can be optimized for query and indexing
ERD
CUSTOMER: customer_ID (PK), customer_name, purchase_profile, credit_profile, address
STORE: store_ID (PK), store_name, address, district, floor_type
CLERK: clerk_id (PK), clerk_name, clerk_grade
PRODUCT: SKU (PK), description, brand, category
PROMOTION: promotion_NUM (PK), promotion_name, price_type, ad_type
ORDER: order_num (PK), customer_ID (FK), store_ID (FK), clerk_ID (FK), date
ORDER-LINE: order_num (PK) (FK), SKU (PK) (FK), promotion_key (FK), dollars_sold, units_sold, dollars_cost
Relationships (1:n): CUSTOMER to ORDER, STORE to ORDER, CLERK to ORDER, ORDER to ORDER-LINE, PRODUCT to ORDER-LINE, PROMOTION to ORDER-LINE
ERD (same diagram, annotated with Dimensions and Measures)
Dimensions: CUSTOMER, STORE, CLERK, PRODUCT, PROMOTION, and the date of the ORDER
Measures: dollars_sold, units_sold, dollars_cost (ORDER-LINE)
DIMENSIONAL MODEL
FACT: time_key (FK), store_key (FK), clerk_key (FK), product_key (FK), customer_key (FK), promotion_key (FK), dollars_sold, units_sold, dollars_cost
TIME: time_key (PK), SQL_date, day_of_week, month
STORE: store_key (PK), store_ID, store_name, address, district, floor_type
CLERK: clerk_key (PK), clerk_id, clerk_name, clerk_grade
PRODUCT: product_key (PK), SKU, description, brand, category
CUSTOMER: customer_key (PK), customer_name, purchase_profile, credit_profile, address
PROMOTION: promotion_key (PK), promotion_name, price_type, ad_type
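Moving from the ERD to the dimensional model means giving each dimension row its own surrogate key (product_key, store_key) while keeping the operational identifier (SKU, store_ID) as an ordinary attribute. The sketch below shows one possible way to do that lookup while loading fact rows. It is reduced to two dimensions for brevity, and the helper `surrogate_key` and the sample order lines are hypothetical, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Reduced version of the dimensional model: two dimensions and the fact table
CREATE TABLE PRODUCT (product_key INTEGER PRIMARY KEY, SKU TEXT UNIQUE);
CREATE TABLE STORE   (store_key   INTEGER PRIMARY KEY, store_ID TEXT UNIQUE);
CREATE TABLE FACT (
    product_key  INTEGER REFERENCES PRODUCT(product_key),
    store_key    INTEGER REFERENCES STORE(store_key),
    dollars_sold REAL, units_sold INTEGER, dollars_cost REAL
);
""")

def surrogate_key(table, key_col, id_col, natural_id):
    """Hypothetical helper: return the surrogate key for an operational ID,
    creating the dimension row the first time the ID is seen."""
    conn.execute(f"INSERT OR IGNORE INTO {table} ({id_col}) VALUES (?)", (natural_id,))
    row = conn.execute(
        f"SELECT {key_col} FROM {table} WHERE {id_col} = ?", (natural_id,)
    ).fetchone()
    return row[0]

# Invented order lines: (SKU, store_ID, dollars_sold, units_sold, dollars_cost)
order_lines = [
    ("SKU-1", "S-10", 20.0, 2, 12.0),
    ("SKU-2", "S-10", 5.0, 1, 3.0),
    ("SKU-1", "S-20", 10.0, 1, 6.0),
]

for sku, store_id, dollars_sold, units_sold, dollars_cost in order_lines:
    conn.execute(
        "INSERT INTO FACT VALUES (?, ?, ?, ?, ?)",
        (
            surrogate_key("PRODUCT", "product_key", "SKU", sku),
            surrogate_key("STORE", "store_key", "store_ID", store_id),
            dollars_sold, units_sold, dollars_cost,
        ),
    )

# Reporting query: the fact joins to the dimension through the surrogate key.
units_by_store = conn.execute("""
    SELECT S.store_ID, SUM(F.units_sold)
    FROM FACT F JOIN STORE S ON S.store_key = F.store_key
    GROUP BY S.store_ID ORDER BY S.store_ID
""").fetchall()
print(units_by_store)
```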
Dimensions
A table (or hierarchy of tables) connected to the fact table by keys and foreign keys
Dimension tables contain text or numeric attributes that give the context of the measures available in a Fact Table
Time, client, status, shop, product, employee, etc.
Preferably in a 1-M relationship with Facts
1 context can have many measures: 1 client having many sales
If not, a bridge table must be created
Dimension attributes will be the source of query constraints
SELECT SUM(Sales) FROM ... WHERE Client.ZIP = 1150
Customer (region, status); Product (Type, Price_range); Time (Mondays, weekends, 2010)
Fact Tables
Represent a process or reporting environment interesting for some business users
Contains many foreign keys connecting to Dimensions
It is important to determine the identity of the fact table and specify exactly what it represents
Sales in shops, direct-mail campaigns, shipment, production of item x (family)
Measures
Measurements associated with fact records at the fact table granularity
Normally numeric and additive
Sales, number of contacts, etc.
Semi-additive (stock or level measures)
Measures that are not additive over time but are additive along other dimensions: inventory, balance
Non-additive (value-per-unit measures)
Measures that are not additive at all: exchange rate
Attributes in dimension tables are constants. Fact attributes vary with the granularity of the fact table: when aggregations are done, they should aggregate smoothly as well.
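The difference between additive and semi-additive measures shows up as soon as values are aggregated over time. A small sketch in plain Python, with made-up figures for two stores over two days: units sold can be summed along every dimension, whereas a stock level can be summed across stores for a given day but needs a different rule over time (here the closing value or an average).

```python
# (day, store) -> value; all figures are invented for illustration
units_sold = {(1, "A"): 3, (1, "B"): 2, (2, "A"): 4, (2, "B"): 1}
stock_level = {(1, "A"): 10, (1, "B"): 20, (2, "A"): 7, (2, "B"): 19}

# Additive measure: summing over days and stores is meaningful.
total_units = sum(units_sold.values())

# Semi-additive measure: summing across stores within one day is meaningful...
stock_by_day = {}
for (day, store), level in stock_level.items():
    stock_by_day[day] = stock_by_day.get(day, 0) + level

# ...but over time a sum would double-count the same goods, so take the
# closing level of the period or an average instead.
last_day = max(stock_by_day)
closing_stock = stock_by_day[last_day]
average_stock = sum(stock_by_day.values()) / len(stock_by_day)

print(total_units, stock_by_day, closing_stock, average_stock)
```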
Launch_date: 1/01/2009, 15/01/2009, 17/01/2009, 1/02/2009, 19/02/2009, 30/02/2009, 1/03/2009, 12/03/2009
Group_Campaign
GRPCpg_key  Cpg_key  Main
1           1        1
1           3        0
1           5        0
2           1        0
2           3        1
2           7        0
2           8        0
n (one Cpg can be n times in the Group_Cpg table)
Diagram: Client, Day and Group_Campaign (Weight, Main) connected to the Credit FACT (Amount, Rate)
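One common use of a bridge table such as Group_Campaign is to spread a fact measure over the several campaigns of a group using Weight, or to attribute it entirely to the campaign flagged Main. The sketch below illustrates both readings under explicit assumptions: the table name `Credit_Fact`, the weights and the amounts are invented, and the weights are chosen to sum to 1 within each group so that totals are preserved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Group_Campaign (GRPCpg_key INTEGER, Cpg_key INTEGER, Weight REAL, Main INTEGER);
CREATE TABLE Credit_Fact (GRPCpg_key INTEGER, Amount REAL);
-- Illustrative weights (assumed to sum to 1 per group) and amounts
INSERT INTO Group_Campaign VALUES
    (1, 1, 0.50, 1), (1, 3, 0.25, 0), (1, 5, 0.25, 0),
    (2, 1, 0.25, 0), (2, 3, 0.25, 1), (2, 7, 0.25, 0), (2, 8, 0.25, 0);
INSERT INTO Credit_Fact VALUES (1, 100.0), (2, 200.0);
""")

# Weighted allocation: each campaign receives its share of the group's amount.
by_weight = conn.execute("""
    SELECT g.Cpg_key, SUM(f.Amount * g.Weight)
    FROM Credit_Fact f JOIN Group_Campaign g ON g.GRPCpg_key = f.GRPCpg_key
    GROUP BY g.Cpg_key ORDER BY g.Cpg_key
""").fetchall()

# Main-campaign attribution: the whole amount goes to the campaign flagged Main.
by_main = conn.execute("""
    SELECT g.Cpg_key, SUM(f.Amount)
    FROM Credit_Fact f JOIN Group_Campaign g ON g.GRPCpg_key = f.GRPCpg_key
    WHERE g.Main = 1
    GROUP BY g.Cpg_key ORDER BY g.Cpg_key
""").fetchall()

print(by_weight)
print(by_main)
```

Joining through the bridge without Weight or Main would count each amount once per campaign in the group, which is why one of the two attributes is needed to keep totals correct.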
Hierarchies
Hierarchies are the key relationship for roll-up and drill-down in an OLAP environment
Sales per Day -> Week -> Month -> Quarter
Sales for cheese -> milk_product -> food
Hierarchies can be represented in 2 ways in dimensions
Explicit attribute sets:
Dim.Time: date; DayofWeek; Week; Month; Quarter;
Dim.product: Product (Cheese); Family (Milk_product); Category (Food)
Snowflake:
Create a table per level and link tables by 1:N relationships
Understandability: the user must join tables to get the right level of aggregation
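Snowflake (one table per level, linked 1:N)
Product.level3: Food
Product.level2: milk
Product.level1: cheese, Yogurt, milk
Explicit attribute set (one table, one row per product)
Product.level1  Product.level2  Product.level3
cheese          milk            Food
Yogurt          milk            Food
milk            milk            Food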
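The two representations answer the same roll-up question; they differ in how many joins the user has to write. A minimal sqlite3 sketch, using the product levels from the slide and invented sales amounts (the table names `product_flat`, `level1`, `level2`, `level3` and `sales` are assumptions for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Explicit attribute set: every level is a column of one dimension table
CREATE TABLE product_flat (product_key INTEGER PRIMARY KEY,
                           level1 TEXT, level2 TEXT, level3 TEXT);
INSERT INTO product_flat VALUES
    (1,'cheese','milk','Food'), (2,'Yogurt','milk','Food'), (3,'milk','milk','Food');

-- Snowflake: one table per level, linked by 1:N relationships
CREATE TABLE level3 (level3_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE level2 (level2_key INTEGER PRIMARY KEY, name TEXT, level3_key INTEGER);
CREATE TABLE level1 (product_key INTEGER PRIMARY KEY, name TEXT, level2_key INTEGER);
INSERT INTO level3 VALUES (1,'Food');
INSERT INTO level2 VALUES (1,'milk',1);
INSERT INTO level1 VALUES (1,'cheese',1), (2,'Yogurt',1), (3,'milk',1);

-- Invented sales amounts per product
CREATE TABLE sales (product_key INTEGER, amount INTEGER);
INSERT INTO sales VALUES (1,5), (2,3), (3,2);
""")

# Roll-up to level3 with the flat dimension: a single join.
flat = conn.execute("""
    SELECT p.level3, SUM(s.amount)
    FROM sales s JOIN product_flat p ON p.product_key = s.product_key
    GROUP BY p.level3
""").fetchall()

# Same roll-up with the snowflake: one join per level of the hierarchy.
snowflake = conn.execute("""
    SELECT l3.name, SUM(s.amount)
    FROM sales s
    JOIN level1 l1 ON l1.product_key = s.product_key
    JOIN level2 l2 ON l2.level2_key = l1.level2_key
    JOIN level3 l3 ON l3.level3_key = l2.level3_key
    GROUP BY l3.name
""").fetchall()

print(flat, snowflake)
```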
Type 1
Client before the change (Key, Name, zip, Date_in, Date_close): (1, Kasparov, 1150, 1/01/2010, -)
Client after the change (Key, Name, zip, Date_in, Date_close): (1, Kasparov, 1950, 1/01/2010, -)
week (Key, Week, Month, Year): (1, 1, 1, 2010), (2, 2, 1, 2010), (3, 3, 1, 2010)
Sales before the change (client_key, week_key, Sale): (1, 1, 10)
Sales after the change (client_key, week_key, Sale): (1, 1, 10), (1, 2, 8)
Result (Year, Zip, Sales): (2010, 1950, 18)
Type 2
Client before the change (Key, Name, zip, Date_in, Date_close): (1, Kasparov, 1150, 1/01/2010, -)
Client after the change (Key, Name, zip, Date_in, Date_close): (1, Kasparov, 1150, 1/01/2010, 7/01/2010), (2, Kasparov, 1950, 8/01/2010, -)
week (Key, Week, Month, Year): (1, 1, 1, 2010), (2, 2, 1, 2010), (3, 3, 1, 2010)
Sales before the change (client_key, week_key, Sale): (1, 1, 10)
Sales after the change (client_key, week_key, Sale): (1, 1, 10), (2, 2, 8)
Result (Year, Zip, Sales): (2010, 1150, 10), (2010, 1950, 8)
Type 3
Client before the change (Key, Name, zip, Date_in): (1, Kasparov, 1150, 1/01/2010)
Client after the change (Key, Name, zip, previous zip, Date_in): (1, Kasparov, 1950, 1150, 1/01/2010)
week (Key, Week, Month, Year): (1, 1, 1, 2010), (2, 2, 1, 2010), (3, 3, 1, 2010)
Sales before the change (client_key, week_key, Sale): (1, 1, 10)
Sales after the change (client_key, week_key, Sale): (1, 1, 10), (1, 2, 8)
Result (Year, Zip, Sales): (2010, 1950, 18)
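The Type 1 and Type 2 slides describe two ways of handling a changed dimension attribute (a pattern usually called slowly changing dimensions): overwriting the value in place, or adding a new dimension row with its own key and validity dates. The sketch below replays the Kasparov example in plain Python and reproduces the two result tables. The function `sales_by_zip` and the dictionary layout are illustrative assumptions, not part of the slides.

```python
def sales_by_zip(clients, sales):
    """Sum sales per zip code by looking up each fact row's client key."""
    totals = {}
    for client_key, week_key, sale in sales:
        zip_code = clients[client_key]["zip"]
        totals[zip_code] = totals.get(zip_code, 0) + sale
    return totals

# Type 1: overwrite the attribute; history is lost.
clients_t1 = {1: {"name": "Kasparov", "zip": 1150}}
sales_t1 = [(1, 1, 10)]                # (client_key, week_key, Sale)
clients_t1[1]["zip"] = 1950            # the client moves: value overwritten
sales_t1.append((1, 2, 8))             # later sale uses the same key
type1_result = sales_by_zip(clients_t1, sales_t1)

# Type 2: close the old row and add a new row with a new surrogate key.
clients_t2 = {1: {"name": "Kasparov", "zip": 1150,
                  "date_in": "1/01/2010", "date_close": None}}
sales_t2 = [(1, 1, 10)]
clients_t2[1]["date_close"] = "7/01/2010"
clients_t2[2] = {"name": "Kasparov", "zip": 1950,
                 "date_in": "8/01/2010", "date_close": None}
sales_t2.append((2, 2, 8))             # later sale points to the new key
type2_result = sales_by_zip(clients_t2, sales_t2)

print(type1_result)
print(type2_result)
```

With Type 1 all 18 units are reported under the new zip code; with Type 2 the 10 units sold before the move stay attached to zip 1150.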
Sales with a single Client dimension
Client: Key, Name, Status, street, n, ZIP, Region, Country, Type_suburb
Sales with separate Client and Geo dimensions
Client: Key, Name, Status
Geo: Key, street, n, ZIP, Region, Country, Type_suburb
Campaign Datamart
Non additive