Experiment2 E059 DWM PDF
PART A
(PART A : TO BE REFERRED BY STUDENTS)
Experiment No.02
Aim: Design a star schema, snowflake schema and fact constellation schema for any subject of
your choice.
Prerequisite:
Fundamental Knowledge of Database Management
Fundamental Knowledge of SQL
Learning Outcomes:
Learning of Star, Snowflake & Fact Constellation (Galaxy) schemas
Theory:
Dimensional modeling:
Dimensional modeling is a logical design technique often used for data warehouses. It always
uses the concepts of facts, measures, and dimensions. Facts are typically (but not always)
numeric values that can be aggregated; dimensions are groups of hierarchies and descriptors
that define the facts. For example, sales amount is a fact; timestamp, product, register#,
store#, etc. are elements of dimensions. Dimensional models are built by business process
area, e.g. store sales, inventory, claims, etc.
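The distinction between a fact and the dimension elements that describe it can be sketched in a few lines. This is only an illustration: the rows, the field names and the helper total_by are hypothetical and are not part of the experiment.

```python
from collections import defaultdict

# Hypothetical sales events: each row holds dimension elements
# (timestamp, product, store) plus one numeric fact (sales_amount).
sales = [
    {"timestamp": "2024-01-05", "product": "pen", "store": 1, "sales_amount": 12.0},
    {"timestamp": "2024-01-05", "product": "ink", "store": 1, "sales_amount": 30.0},
    {"timestamp": "2024-02-10", "product": "pen", "store": 2, "sales_amount": 8.0},
]

def total_by(rows, dimension, fact="sales_amount"):
    """Sum a numeric fact over the distinct values of one dimension element."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[fact]
    return dict(totals)

by_product = total_by(sales, "product")
by_store = total_by(sales, "store")
```

The same fact (sales_amount) is aggregated along two different dimension elements, which is the kind of question a dimensional model is meant to answer.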
Fact table:
The fact table is not a typical relational database table, as it is de-normalized on purpose
to enhance query response times. The fact table typically contains records that are ready to
explore, usually with ad hoc queries. Records in the fact table are often referred to as events,
due to the time-variant nature of a data warehouse environment.
The primary key of the fact table is a composite of all its columns except the numeric values
or scores (like QUANTITY, TURNOVER, exact invoice date and time). Typical fact tables in a
global enterprise data warehouse are (usually there may be additional company- or
business-specific fact tables):
Dimension table:
Nearly all of the information in a typical fact table is also present in one or more dimension
tables. The main purpose of maintaining dimension tables is to allow the categories to be
browsed quickly and easily.
The primary keys of the dimension tables are linked together to form the composite primary
key of the fact table. In a star schema design, there is only one de-normalized table for
a given dimension.
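A minimal sketch of a star schema, assuming hypothetical tables dim_time, dim_product and fact_sales (none of these names come from the experiment). It uses Python's built-in sqlite3 module so that it can be run without a database server. The fact table's primary key is the composite of the dimension keys, and the measures stay outside the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One de-normalized table per dimension.
CREATE TABLE dim_time    (time_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, brand TEXT);
-- Fact table: composite primary key built from the dimension keys.
-- REFERENCES is declarative here; SQLite enforces it only if
-- PRAGMA foreign_keys is switched on.
CREATE TABLE fact_sales (
    time_id    INTEGER REFERENCES dim_time(time_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    quantity   INTEGER,
    turnover   REAL,
    PRIMARY KEY (time_id, product_id)
);
INSERT INTO dim_time VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO dim_product VALUES (10, 'pen', 'Acme'), (11, 'ink', 'Acme');
INSERT INTO fact_sales VALUES (1, 10, 5, 12.0), (1, 11, 2, 30.0), (2, 10, 3, 8.0);
""")

# Turnover per month: a single join from the fact table to one dimension.
star_rows = conn.execute("""
    SELECT t.month, SUM(f.turnover)
    FROM fact_sales f JOIN dim_time t ON t.time_id = f.time_id
    GROUP BY t.month ORDER BY t.month
""").fetchall()
```

Because each dimension is a single table, any dimension attribute is reachable from the fact table with one join.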
Snowflake schemas are generally used when a dimension table becomes very big and when a
star schema can't represent the complexity of a data structure. For example, if a PRODUCT
dimension table contains millions of rows, the use of a snowflake schema should significantly
improve performance by moving some data out to another table (BRANDS, for instance). The
problem is that the more normalized the dimension table is, the more complicated the SQL
joins needed to query it become, because many tables must be joined and aggregates
generated before a query can be answered.
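The PRODUCT and BRANDS example can be sketched as follows, again with sqlite3. The column names and sample rows are assumptions made for illustration. Note that a question about brands now needs two joins instead of one.

```python
import sqlite3

snow = sqlite3.connect(":memory:")
snow.executescript("""
-- Brand data is moved out of the product dimension into its own table.
CREATE TABLE brands  (brand_id INTEGER PRIMARY KEY, brand_name TEXT);
CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT,
                      brand_id INTEGER REFERENCES brands(brand_id));
CREATE TABLE fact_sales (product_id INTEGER REFERENCES product(product_id),
                         turnover REAL);
INSERT INTO brands VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO product VALUES (10, 'pen', 1), (11, 'ink', 1), (12, 'pad', 2);
INSERT INTO fact_sales VALUES (10, 12.0), (11, 30.0), (12, 5.0);
""")

# Turnover per brand: fact -> product -> brands, one join per
# level of normalization in the dimension.
brand_totals = snow.execute("""
    SELECT b.brand_name, SUM(f.turnover)
    FROM fact_sales f
    JOIN product p ON p.product_id = f.product_id
    JOIN brands  b ON b.brand_id  = p.brand_id
    GROUP BY b.brand_name ORDER BY b.brand_name
""").fetchall()
```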
In a fact constellation schema, each fact table is explicitly assigned to the dimensions
that are relevant to its facts. This may be useful when some facts are associated with a
given dimension level and other facts with a deeper dimension level. Use of this model is
reasonable when, for example, there is a sales fact table (with details down to the exact
date and invoice header id) and a sales forecast fact table that is calculated by month,
client id and product id.
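The sales and forecast example can be sketched as two fact tables that share the client and product dimensions but are kept at different levels of detail. All table names, column names and figures are hypothetical.

```python
import sqlite3

galaxy = sqlite3.connect(":memory:")
galaxy.executescript("""
CREATE TABLE dim_client  (client_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
-- Detailed fact: one row per invoice line, down to the exact date.
CREATE TABLE fact_sales (
    sale_date TEXT, invoice_header_id INTEGER,
    client_id  INTEGER REFERENCES dim_client(client_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    turnover REAL
);
-- Coarser fact: one row per month, client and product.
CREATE TABLE fact_forecast (
    month TEXT,
    client_id  INTEGER REFERENCES dim_client(client_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    forecast_turnover REAL
);
INSERT INTO dim_client VALUES (1, 'Client A');
INSERT INTO dim_product VALUES (10, 'pen');
INSERT INTO fact_sales VALUES ('2024-01-05', 500, 1, 10, 12.0),
                              ('2024-01-20', 501, 1, 10, 8.0);
INSERT INTO fact_forecast VALUES ('2024-01', 1, 10, 25.0);
""")

# Roll the detailed sales up to month level, then compare with the forecast.
actual_vs_forecast = galaxy.execute("""
    SELECT fc.month, fc.forecast_turnover, SUM(s.turnover)
    FROM fact_forecast fc
    JOIN fact_sales s
      ON s.client_id = fc.client_id
     AND s.product_id = fc.product_id
     AND substr(s.sale_date, 1, 7) = fc.month
    GROUP BY fc.month, fc.client_id, fc.product_id, fc.forecast_turnover
""").fetchall()
```

The two fact tables meet only through their shared dimension keys, so the detailed one has to be aggregated to the coarser level before the comparison.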
PART B
(PART B : TO BE COMPLETED BY STUDENTS)
(Students must submit the soft copy as per the following segments within two hours of the
practical slot. The soft copy must be uploaded on Blackboard, or emailed to the concerned lab
in-charge faculty at the end of the practical in case there is no Blackboard access available.)
Star
Time: Year, Quarter, Month, Week, Day, Time_ID
Customers: Customer_ID, Name, Age, Ticket Issued, Contact Details, Category
Fact Table: Time_Id, Customer_ID, Airlines_ID, Seat_Number, Transaction_ID
Airlines: Airlines_ID, Name, Age, Ticket Issued, Contact Details, Category, Membership
Seat: Pop_Destn
Snowflake
Time: Year, Quarter, Month, Week, Day, Time_ID
Customers: Customer_ID, Name, Age, Ticket Issued, Contact Details, Category
Fact Table: Time_Id, Customer_ID, Airlines_ID, Seat_Number, Transaction_ID
Airlines: Airlines_ID, Name, Age, Ticket Issued, Contact Details, Category, Membership
Seat_number: Pop_Destn, Reserved
Constellation
Time: Year, Quarter, Month, Week, Day, Time_ID
Customers: Customer_ID, Name, Age, Ticket Issued, Contact Details
Fact Table: Time_Id, Customer_ID, Airlines_ID, Seat_Number, Transaction_ID
Airlines: Airlines_ID, Name, Age, Ticket Issued, Contact Details, Category, Membership
Seat_number: Pop_Destn, Pop_time
Slice
The slice operation fixes a single value on one dimension of a given cube and provides a new sub-cube.
Consider the following diagram that shows how slice works.
Here, slice is performed on the dimension "time" using the criterion time = "Q1".
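A slice with the criterion time = "Q1" can be sketched on a small cube stored as a Python dictionary. The cube contents and the helper slice_cube are hypothetical; only the criterion comes from the text.

```python
# Hypothetical cube: (time, item, location) -> units sold.
cube = {
    ("Q1", "Mobile", "Delhi"): 10,
    ("Q1", "Modem", "Delhi"): 20,
    ("Q2", "Mobile", "Delhi"): 30,
    ("Q1", "Mobile", "Mumbai"): 40,
}

def slice_cube(cube, axis, value):
    """Keep the cells where one axis equals `value`, then drop that axis."""
    return {
        key[:axis] + key[axis + 1:]: cell
        for key, cell in cube.items()
        if key[axis] == value
    }

# Axis 0 is time, so this is the slice time = "Q1".
q1 = slice_cube(cube, 0, "Q1")
```

The result has one dimension fewer than the cube: its keys are (item, location) pairs.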
Dice
Dice selects values on two or more dimensions of a given cube and provides a new sub-cube.
Consider the following diagram that shows the dice operation.
The dice operation on the cube, based on the following selection criteria, involves three dimensions.
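A dice over three dimensions can be sketched in the same dictionary style. The cube, the helper dice_cube and the criteria used here are hypothetical stand-ins chosen for illustration.

```python
# Hypothetical cube: (time, item, location) -> units sold.
cube = {
    ("Q1", "Mobile", "Delhi"): 10,
    ("Q2", "Modem", "Delhi"): 20,
    ("Q3", "Mobile", "Delhi"): 30,
    ("Q1", "Mobile", "Mumbai"): 40,
    ("Q2", "Laptop", "Delhi"): 50,
}

def dice_cube(cube, criteria):
    """Keep cells whose coordinate on every listed axis is in the allowed set."""
    return {
        key: cell
        for key, cell in cube.items()
        if all(key[axis] in allowed for axis, allowed in criteria.items())
    }

# Assumed criteria: time in {Q1, Q2}, item in {Mobile, Modem}, location = Delhi.
diced = dice_cube(cube, {0: {"Q1", "Q2"}, 1: {"Mobile", "Modem"}, 2: {"Delhi"}})
```

Unlike a slice, the sub-cube keeps all three axes; each axis is only restricted to a subset of its values.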
Pivot
The pivot operation is also known as rotation. It rotates the data axes in view to provide an
alternative presentation of the data. Consider the following diagram that shows the pivot operation.
Here, the item and location axes of a 2-D slice are rotated.
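Rotating the item and location axes of a 2-D slice can be sketched as swapping the row and column keys of a nested dictionary. The figures and the helper pivot are hypothetical.

```python
def pivot(table):
    """Swap the two axes of a nested dict {row: {column: value}}."""
    rotated = {}
    for row, columns in table.items():
        for column, value in columns.items():
            rotated.setdefault(column, {})[row] = value
    return rotated

# Hypothetical 2-D slice: items as rows, locations as columns.
by_item = {
    "Mobile": {"Delhi": 10, "Mumbai": 40},
    "Modem": {"Delhi": 20, "Mumbai": 5},
}

# After the rotation: locations as rows, items as columns.
by_location = pivot(by_item)
```

No value changes; only the presentation does.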
In a snowflake schema, a dimension table has one or more parent tables.