
Data Modeling for Data Warehousing

Star schema and snowflake schema


Dimensional modeling techniques
Fact and dimension tables
Star Schema and Snowflake Schema
Definition and Characteristics of Star Schema:
• The star schema is a fundamental data modeling technique used in
data warehousing.
• It consists of one or more fact tables referencing any number of
dimension tables.
• In a star schema, the fact table contains quantitative measures or
metrics, while dimension tables contain descriptive attributes.
Components of Star Schema:
• Fact Table: This central table in the star schema holds the quantitative
data or metrics that are the focus of analysis. It typically contains
foreign keys referencing the primary keys of related dimension tables
and the numerical measures.
• Dimension Tables: These tables surround the fact table and provide
context to the measures. Dimension tables contain descriptive
attributes that define the business entities or concepts being
analyzed. Each dimension table usually has a primary key and
attributes related to that dimension.
Illustrative Example of Star Schema:
Fact Table: Sales
• Columns:
• Sales_ID (Primary Key): Unique identifier for each sales transaction.
• Date_ID (Foreign Key): Reference to the Date dimension table.
• Product_ID (Foreign Key): Reference to the Product dimension table.
• Store_ID (Foreign Key): Reference to the Store dimension table.
• Sales_Amount: The total amount of sales for each transaction.
• Quantity_Sold: The quantity of products sold in each transaction.
Dimension Tables:
• Date Dimension
• Columns:
• Date_ID (Primary Key): Unique identifier for each date.
• Date: Date in YYYY-MM-DD format.
• Day_of_Week: Day of the week (e.g., Monday, Tuesday).
• Month: Month of the year (e.g., January, February).
• Year: Year of the date.
• Product Dimension
• Columns:
• Product_ID (Primary Key): Unique identifier for each product.
• Product_Name: Name of the product.
• Category: Category of the product (e.g., Electronics, Clothing).
• Brand: Brand of the product.
• Store Dimension
• Columns:
• Store_ID (Primary Key): Unique identifier for each store.
• Store_Name: Name of the store.
• Location: Location of the store (e.g., city, state).
• Region: Region where the store is located (e.g., North, South,
East, West).
• This star schema design enables efficient querying and
analysis of sales data, because every dimension is a single
join away from the fact table. Each dimension table provides
descriptive attributes that offer context to the
measures stored in the fact table.
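The example above can be sketched with Python's built-in sqlite3 module. This is only an illustration, not a production design: the table names Date_Dim, Product_Dim, and Store_Dim and all sample rows are assumptions made up for the sketch, while the column names follow the slides.

```python
import sqlite3

# In-memory database; column names follow the slides, sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Date_Dim (
    Date_ID INTEGER PRIMARY KEY,
    Date TEXT, Day_of_Week TEXT, Month TEXT, Year INTEGER
);
CREATE TABLE Product_Dim (
    Product_ID INTEGER PRIMARY KEY,
    Product_Name TEXT, Category TEXT, Brand TEXT
);
CREATE TABLE Store_Dim (
    Store_ID INTEGER PRIMARY KEY,
    Store_Name TEXT, Location TEXT, Region TEXT
);
CREATE TABLE Sales (
    Sales_ID INTEGER PRIMARY KEY,
    Date_ID INTEGER REFERENCES Date_Dim(Date_ID),
    Product_ID INTEGER REFERENCES Product_Dim(Product_ID),
    Store_ID INTEGER REFERENCES Store_Dim(Store_ID),
    Sales_Amount REAL,
    Quantity_Sold INTEGER
);
""")

conn.executemany("INSERT INTO Date_Dim VALUES (?, ?, ?, ?, ?)", [
    (1, "2024-01-01", "Monday", "January", 2024),
    (2, "2024-01-02", "Tuesday", "January", 2024),
])
conn.executemany("INSERT INTO Product_Dim VALUES (?, ?, ?, ?)", [
    (1, "Laptop", "Electronics", "BrandA"),
    (2, "T-Shirt", "Clothing", "BrandB"),
])
conn.executemany("INSERT INTO Store_Dim VALUES (?, ?, ?, ?)", [
    (1, "Store One", "Austin, TX", "South"),
    (2, "Store Two", "Boston, MA", "East"),
])
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 1, 1, 1, 1200.0, 1),
    (2, 1, 2, 1, 40.0, 2),
    (3, 2, 1, 2, 2400.0, 2),
    (4, 2, 2, 2, 20.0, 1),
])

# Each dimension is reached with exactly one join from the fact table.
rows = conn.execute("""
    SELECT p.Category, s.Region,
           SUM(f.Sales_Amount), SUM(f.Quantity_Sold)
    FROM Sales AS f
    JOIN Product_Dim AS p ON f.Product_ID = p.Product_ID
    JOIN Store_Dim   AS s ON f.Store_ID   = s.Store_ID
    GROUP BY p.Category, s.Region
    ORDER BY p.Category, s.Region
""").fetchall()

for row in rows:
    print(row)
```

The query groups the measures (Sales_Amount, Quantity_Sold) by descriptive attributes taken from two dimensions, which is the typical access pattern a star schema is meant to serve.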
Introduction to Snowflake Schema:
• The snowflake schema is an extension of the star
schema.
• It organizes data in a more normalized form by further
normalizing dimension tables into sub-dimension
tables, creating a snowflake-like structure.
• The retail business example can be extended into
a snowflake schema by further normalizing the
Product dimension into sub-dimensions.
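The normalization step itself can be illustrated in plain Python. The helper name snowflake, the surrogate-key numbering, and the sample products below are assumptions for this sketch; the idea is simply that a repeated text attribute moves into its own sub-dimension table and is replaced by a key.

```python
# Denormalized (star-style) product rows: category and brand repeated as text.
product_dim = [
    {"Product_ID": 1, "Product_Name": "Laptop", "Category": "Electronics", "Brand": "BrandA"},
    {"Product_ID": 2, "Product_Name": "Phone", "Category": "Electronics", "Brand": "BrandB"},
    {"Product_ID": 3, "Product_Name": "T-Shirt", "Category": "Clothing", "Brand": "BrandB"},
]


def snowflake(rows, column, id_name):
    """Move one text column into a sub-dimension and replace it with a key."""
    lookup = {}  # distinct value -> surrogate key, assigned in first-seen order
    out = []
    for row in rows:
        key = lookup.setdefault(row[column], len(lookup) + 1)
        new_row = {k: v for k, v in row.items() if k != column}
        new_row[id_name] = key
        out.append(new_row)
    sub_dim = [{id_name: key, f"{column}_Name": value} for value, key in lookup.items()]
    return out, sub_dim


products, category_dim = snowflake(product_dim, "Category", "Category_ID")
products, brand_dim = snowflake(products, "Brand", "Brand_ID")

print(products[0])
print(category_dim)
print(brand_dim)
```

After both calls, each category and brand name is stored once in its sub-dimension, and the product rows carry only Category_ID and Brand_ID.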
Fact Table: Sales_Fact
• Columns:
• Date_ID (foreign key)
• Product_ID (foreign key)
• Store_ID (foreign key)
• Sales_Amount
• Quantity_Sold
Dimension Tables:
• Date_Dimension
• Columns:
• Date_ID (primary key)
• Date
• Day_of_Week
• Month
• Year
Product_Dimension (Normalized):
• Columns:
• Product_ID (primary key)
• Product_Name
• Category_ID (foreign key)
• Brand_ID (foreign key)
Product_Category_Dimension (Sub-Dimension of Product):
• Columns:
• Category_ID (primary key)
• Category_Name
• Department_ID (foreign key)
Product_Brand_Dimension (Sub-Dimension of Product):
• Columns:
• Brand_ID (primary key)
• Brand_Name
• Manufacturer
Store_Dimension
• Columns:
• Store_ID (primary key)
• Store_Name
• Location
• Region
• This snowflake schema stores each category and
brand once instead of repeating it on every
product row, which reduces redundancy and helps
maintain data integrity. However, queries usually
need more joins than they would in a star schema.
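The extra joins can be seen in a small sqlite3 sketch of the product branch of this schema. The sample rows are made up for illustration. The Date and Store dimensions are left out for brevity, so Date_ID and Store_ID are plain integers here. Department_ID is also kept as a plain integer, because the slides do not define a department table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product_Category_Dimension (
    Category_ID INTEGER PRIMARY KEY,
    Category_Name TEXT,
    Department_ID INTEGER  -- department table is not defined in the slides
);
CREATE TABLE Product_Brand_Dimension (
    Brand_ID INTEGER PRIMARY KEY,
    Brand_Name TEXT,
    Manufacturer TEXT
);
CREATE TABLE Product_Dimension (
    Product_ID INTEGER PRIMARY KEY,
    Product_Name TEXT,
    Category_ID INTEGER REFERENCES Product_Category_Dimension(Category_ID),
    Brand_ID INTEGER REFERENCES Product_Brand_Dimension(Brand_ID)
);
CREATE TABLE Sales_Fact (
    Date_ID INTEGER,   -- Date_Dimension omitted in this sketch
    Product_ID INTEGER REFERENCES Product_Dimension(Product_ID),
    Store_ID INTEGER,  -- Store_Dimension omitted in this sketch
    Sales_Amount REAL,
    Quantity_Sold INTEGER
);
""")

conn.executemany("INSERT INTO Product_Category_Dimension VALUES (?, ?, ?)", [
    (1, "Electronics", 10),
    (2, "Clothing", 20),
])
conn.executemany("INSERT INTO Product_Brand_Dimension VALUES (?, ?, ?)", [
    (1, "BrandA", "MakerA"),
    (2, "BrandB", "MakerB"),
])
conn.executemany("INSERT INTO Product_Dimension VALUES (?, ?, ?, ?)", [
    (1, "Laptop", 1, 1),
    (2, "T-Shirt", 2, 2),
])
conn.executemany("INSERT INTO Sales_Fact VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 1, 1200.0, 1),
    (1, 2, 1, 40.0, 2),
    (2, 1, 2, 2400.0, 2),
])

# Category and brand names are now two hops from the fact table:
# Sales_Fact -> Product_Dimension -> sub-dimension.
rows = conn.execute("""
    SELECT c.Category_Name, b.Brand_Name, SUM(f.Sales_Amount)
    FROM Sales_Fact AS f
    JOIN Product_Dimension          AS p ON f.Product_ID  = p.Product_ID
    JOIN Product_Category_Dimension AS c ON p.Category_ID = c.Category_ID
    JOIN Product_Brand_Dimension    AS b ON p.Brand_ID    = b.Brand_ID
    GROUP BY c.Category_Name, b.Brand_Name
    ORDER BY c.Category_Name
""").fetchall()

for row in rows:
    print(row)
```

With Category and Brand stored directly on the product table, the same report would need one join instead of three; that is the trade-off between the two designs.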
• ACTIVITY/QUIZ
