DWDM CC


Data Warehousing and Data Mining
#02

Subhas Barman
Assistant Professor
Computer Science and Engineering
Jalpaiguri Government Engineering College

Data Warehouse
Multi-Dimensional Data, Lattice of Cuboids
Data Warehouse Models, Schema
OLAP Operations

**Ref. Data Mining: Concepts and Techniques, Jiawei Han, Micheline Kamber, and Jian Pei

From Tables and Spreadsheets to Data Cubes

A data warehouse is based on a multidimensional data model, which views data in the form of a data cube.
A data cube, such as sales, allows data to be modeled and viewed in multiple dimensions.
  Dimension tables, such as item (item_name, brand, type) or time (day, week, month, quarter, year)
  Fact table contains measures (such as dollars_sold) and keys to each of the related dimension tables
In data warehousing literature, an n-D base cube is called a base cuboid. The topmost 0-D cuboid, which holds the highest level of summarization, is called the apex cuboid. The lattice of cuboids forms a data cube.

Cube: A Lattice of Cuboids

0-D (apex) cuboid: all
1-D cuboids: time; item; location; supplier
2-D cuboids: (time, item); (time, location); (time, supplier); (item, location); (item, supplier); (location, supplier)
3-D cuboids: (time, item, location); (time, item, supplier); (time, location, supplier); (item, location, supplier)
4-D (base) cuboid: (time, item, location, supplier)
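The lattice can be enumerated mechanically: every subset of the dimensions is one cuboid, so n dimensions give 2^n cuboids when concept hierarchies are ignored. The following is a minimal sketch of that idea; the function name lattice_of_cuboids is hypothetical and chosen only for illustration.

```python
from itertools import combinations


def lattice_of_cuboids(dimensions):
    """Group every subset of the dimensions by its size k (the k-D cuboids)."""
    return {
        k: list(combinations(dimensions, k))
        for k in range(len(dimensions) + 1)
    }


dims = ["time", "item", "location", "supplier"]
lattice = lattice_of_cuboids(dims)

for k, cuboids in lattice.items():
    # k = 0 is the apex cuboid (empty group-by); k = len(dims) is the base cuboid
    print(f"{k}-D cuboids:", cuboids)

total = sum(len(cuboids) for cuboids in lattice.values())
print("total cuboids:", total)
```

For the four dimensions above, the sketch lists 1 apex cuboid, 4 one-dimensional, 6 two-dimensional, 4 three-dimensional cuboids and 1 base cuboid.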
Conceptual Modeling of Data Warehouses

Modeling data warehouses: dimensions & measures
  Star schema: a fact table in the middle connected to a set of dimension tables
  Snowflake schema: a refinement of the star schema in which some dimensional hierarchy is normalized into a set of smaller dimension tables, forming a shape similar to a snowflake
  Fact constellation: multiple fact tables share dimension tables; viewed as a collection of stars, it is therefore called a galaxy schema or fact constellation

Example of Star Schema

It contains multiple dimension tables and one fact table. The fact table contains dimension keys and facts or measures.

Sales Fact Table: time_key, item_key, branch_key, location_key; measures: units_sold, dollars_sold, avg_sales
time: time_key, day, day_of_the_week, month, quarter, year
item: item_key, item_name, brand, type, supplier_type
branch: branch_key, branch_name, branch_type
location: location_key, street, city, state_or_province, country
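A star schema can be tried out with an in-memory SQLite database. The sketch below is illustrative only: it keeps just two of the dimensions, the table names time_dim, item_dim and sales_fact are assumptions, and the rows are made-up sample data. It shows the typical query shape, in which the fact table is joined to its dimension tables and a measure is aggregated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables (reduced; names are assumptions for this sketch)
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
CREATE TABLE item_dim (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT, type TEXT);

-- Fact table: one key per dimension plus the measures
CREATE TABLE sales_fact (
    time_key INTEGER REFERENCES time_dim(time_key),
    item_key INTEGER REFERENCES item_dim(item_key),
    units_sold INTEGER,
    dollars_sold REAL
);

-- Made-up sample rows
INSERT INTO time_dim VALUES (1, 'Q1', 2024), (2, 'Q2', 2024);
INSERT INTO item_dim VALUES (10, 'laptop', 'Acme', 'computer'), (11, 'phone', 'Acme', 'mobile');
INSERT INTO sales_fact VALUES (1, 10, 2, 2000.0), (1, 11, 3, 900.0), (2, 10, 1, 1000.0);
""")

# Join the fact table to its dimensions and aggregate a measure
rows = conn.execute("""
    SELECT t.quarter, i.type, SUM(f.dollars_sold)
    FROM sales_fact AS f
    JOIN time_dim AS t ON f.time_key = t.time_key
    JOIN item_dim AS i ON f.item_key = i.item_key
    GROUP BY t.quarter, i.type
    ORDER BY t.quarter, i.type
""").fetchall()

for row in rows:
    print(row)
```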

Example of Snowflake Schema

Some dimensional hierarchy is normalized into a set of smaller dimension tables.

Sales Fact Table: time_key, item_key, branch_key, location_key; measures: units_sold, dollars_sold, avg_sales
time: time_key, day, day_of_the_week, month, quarter, year
item: item_key, item_name, brand, type, supplier_key
supplier: supplier_key, supplier_type
branch: branch_key, branch_name, branch_type
location: location_key, street, city_key
city: city_key, city, state_or_province, country

Example of Fact Constellation

Multiple fact tables share dimension tables.

Sales Fact Table: time_key, item_key, branch_key, location_key; measures: units_sold, dollars_sold, avg_sales
Shipping Fact Table: time_key, item_key, shipper_key, from_location, to_location; measures: dollars_cost, units_shipped
time: time_key, day, day_of_the_week, month, quarter, year
item: item_key, item_name, brand, type, supplier_type
branch: branch_key, branch_name, branch_type
location: location_key, street, city, province_or_state, country
shipper: shipper_key, shipper_name, location_key, shipper_type
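The difference between the star and snowflake forms of the location dimension can be sketched in a few lines. The function below is a hypothetical helper with made-up sample rows: it takes star-schema location rows and splits them into a location table and a separate city table, so that the city attributes are stored once per city.

```python
def snowflake_location(star_rows):
    """Split a star-schema location dimension into location and city tables."""
    city_keys = {}       # (city, state_or_province, country) -> surrogate city_key
    city_table = []
    location_table = []
    for row in star_rows:
        city_attrs = (row["city"], row["state_or_province"], row["country"])
        if city_attrs not in city_keys:
            # First time this city is seen: assign a new key and store it once
            city_keys[city_attrs] = len(city_keys) + 1
            city_table.append({
                "city_key": city_keys[city_attrs],
                "city": row["city"],
                "state_or_province": row["state_or_province"],
                "country": row["country"],
            })
        location_table.append({
            "location_key": row["location_key"],
            "street": row["street"],
            "city_key": city_keys[city_attrs],
        })
    return location_table, city_table


# Made-up sample rows in star-schema form
star_location = [
    {"location_key": 1, "street": "1 Main St", "city": "Vancouver",
     "state_or_province": "BC", "country": "Canada"},
    {"location_key": 2, "street": "9 Oak Ave", "city": "Vancouver",
     "state_or_province": "BC", "country": "Canada"},
    {"location_key": 3, "street": "5 Lake Rd", "city": "Chicago",
     "state_or_province": "IL", "country": "USA"},
]

location_table, city_table = snowflake_location(star_location)
print(location_table)
print(city_table)
```

The normalized form removes the repeated city attributes at the cost of an extra join when querying by city, state or country.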
Data Cube Measures: Three Categories

Distributive: if the result derived by applying the function to n aggregate values is the same as that derived by applying the function on all the data without partitioning
Algebraic: if it can be computed by an algebraic function with M arguments (where M is a bounded integer), each of which is obtained by applying a distributive aggregate function
Holistic: if there is no constant bound on the storage size needed to describe a subaggregate
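The three categories can be checked on a small partitioned data set. The sketch below uses made-up numbers and takes sum as the distributive function, average as the algebraic one (computed from the two distributive values sum and count) and median as the holistic one.

```python
from statistics import median

# Made-up data, split into three partitions
partitions = [[4, 8, 15], [16, 23], [42]]
all_values = [v for part in partitions for v in part]

# Distributive: combining the partial sums gives the sum over all the data
partial_sums = [sum(part) for part in partitions]
total = sum(partial_sums)

# Algebraic: average is derived from M = 2 distributive values (sum, count)
partial_counts = [len(part) for part in partitions]
average = sum(partial_sums) / sum(partial_counts)

# Holistic: the median cannot, in general, be rebuilt from per-partition medians
median_of_medians = median([median(part) for part in partitions])
true_median = median(all_values)

print(total, average, median_of_medians, true_median)
```

Here the partial sums and counts are enough to recover the total and the average, while the median of the partition medians differs from the median of the full data.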

[Figure: sample data cube; recoverable axis labels: Product, Country]
Cuboids Corresponding to the Cube

0-D (apex) cuboid: all
1-D cuboids: product; date; country
2-D cuboids: (product, date); (product, country); (date, country)
3-D (base) cuboid: (product, date, country)

Typical OLAP Operations

Roll up (drill-up): summarize data by climbing up a hierarchy or by dimension reduction
Drill down (roll down): reverse of roll-up; go from a higher-level summary to a lower-level summary or detailed data, or introduce new dimensions
Slice and dice: project and select
Pivot (rotate): reorient the cube; visualization; 3-D to a series of 2-D planes
Other operations
  Drill across: involving (across) more than one fact table
  Drill through: through the bottom level of the cube to its back-end relational tables (using SQL)
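Roll-up, slice, dice and pivot can be sketched on a tiny cube held in a dictionary. Everything below is an illustrative assumption: the cell values are made up, the cube is keyed by (product, quarter, city), and city_to_country is a hypothetical concept hierarchy used for the roll-up.

```python
from collections import defaultdict

# Hypothetical base cuboid: (product, quarter, city) -> dollars_sold
cube = {
    ("phone", "Q1", "Kolkata"): 100,
    ("phone", "Q1", "Delhi"): 150,
    ("phone", "Q2", "Toronto"): 200,
    ("laptop", "Q1", "Toronto"): 300,
    ("laptop", "Q2", "Kolkata"): 250,
}
city_to_country = {"Kolkata": "India", "Delhi": "India", "Toronto": "Canada"}


def roll_up_city_to_country(cells):
    """Roll up: climb the location hierarchy from city to country."""
    out = defaultdict(int)
    for (product, quarter, city), value in cells.items():
        out[(product, quarter, city_to_country[city])] += value
    return dict(out)


def slice_quarter(cells, quarter):
    """Slice: fix the quarter to one value, leaving a 2-D (product, city) view."""
    return {(p, c): v for (p, q, c), v in cells.items() if q == quarter}


def dice(cells, products, quarters, cities):
    """Dice: select a sub-cube by restricting two or more dimensions."""
    return {
        key: value for key, value in cells.items()
        if key[0] in products and key[1] in quarters and key[2] in cities
    }


def pivot(cells_2d):
    """Pivot: swap the two axes of a 2-D view."""
    return {(b, a): v for (a, b), v in cells_2d.items()}


rolled = roll_up_city_to_country(cube)
q1 = slice_quarter(cube, "Q1")
diced = dice(cube, {"phone"}, {"Q1", "Q2"}, {"Kolkata", "Toronto"})
rotated = pivot(q1)
print(rolled, q1, diced, rotated, sep="\n")
```

Drill-down is the reverse of the roll-up shown here: it needs the finer-grained cells, so in this sketch it amounts to going back to the original cube.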

OLAP Operations

[Figure: OLAP operations on a data cube. Panel labels: Dicing 2x2x2; Roll up with cities to countries; Slicing w.r.t. Qtr1; Drill down towards month from qtr]

Data Warehouse Related Questions

What is data warehousing? Explain the characteristics of a DW. What are the advantages and disadvantages of a data warehouse?
Explain the 3-tier architecture of data warehousing. Explain the functions of each tier of the data warehouse architecture.
What is metadata? Describe the structure of metadata in the DW.
What is a data mart? Compare a data mart with a data warehouse. Explain the different types of data marts.
Describe the special operations of an OLAP server that are applied to the multi-dimensional data of a data warehouse. Or: explain the OLAP operations with examples.
What are the differences between OLTP and OLAP with respect to user, size, and purpose?
What is a schema? Name the schemas used for data warehouse design. Give an example of each data warehouse schema. Compare star schema, snowflake schema and fact constellation.
Write short notes on concept hierarchy and lattice of cuboids.

S. Barman, Asstt. Prof., CSE, JGEC
