Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 13

An Introduction to Dimensional Data

Warehouse Design

Presented by
Joseph J. Sarna Jr.
JJS Systems, LLC
Relational Normal Form

• Most relational databases are set to 3rd normal form

 1st Normal form – Tables have unique keys and no repeating


groups or multi-value fields

 2nd Normal form – Every attribute is dependent ont the entire


key of the table

 3rd Normal form – Attributes are dependent only on the key.


No derived elements
Northwind Database Model – Relational Format
EmployeeTerritories Employees
Region Territories

PK RegionID PK TerritoryID PK EmployeeID

FK2 TerritoryID I1 LastName


RegionDescription TerritoryDescription
FK1 EmployeeID FirstName
FK1 RegionID
Title
TitleOfCourtesy
BirthDate
HireDate
Address
City
Region
I2 PostalCode
Country
HomePhone
Extension
Photo
Notes
FK1 ReportsTo
PhotoPath

Categories

PK CategoryID Orders
Order Details CustomerCustomerDemo
I1 CategoryName Products PK OrderID Customers
Description PK,FK1,I2,I1 OrderID PK,FK2 CustomerID
Picture PK ProductID FK1,I1,I2 CustomerID PK CustomerID
PK,FK2,I4,I3 ProductID PK,FK1 CustomerTypeID
FK2,I4,I3 EmployeeID
I3 ProductName I5 OrderDate I2 CompanyName
UnitPrice
FK2,I4,I5 SupplierID RequiredDate ContactName
Quantity
FK1,I2,I1 CategoryID I6 ShippedDate ContactTitle
Discount
QuantityPerUnit FK3,I7 ShipVia Address
UnitPrice Freight I1 City
UnitsInStock ShipName I4 Region
UnitsOnOrder ShipAddress I3 PostalCode
ReorderLevel ShipCity Country
Suppliers Discontinued ShipRegion Phone
I8 ShipPostalCode Fax
PK SupplierID
ShipCountry
I1 CompanyName
ContactName
CustomerDemographics
ContactTitle
Address PK CustomerTypeID
City
Region CustomerDesc
I2 PostalCode Shippers
Country
Phone PK ShipperID
Fax
HomePage CompanyName
Phone
Why Dimensional Data Warehouses?

• Business needs to analyze data so that it can:


– Understand trends

– Predict future behavior and needs

– Personalize contact with customers

– Be competitive

• All of this in a speedy manner, with the ability to do


“What if’s”
Drawbacks to Relational Data Structures

• Data is not structured for analytical usage

• Multiple Joins are resource intensive

• Missing data from external sources, context history, not


operational sources
What Is a Dimensional Data Warehouse?

“A structured repository of validated and integrated


historical information accessible to business people to
provide the basis for both tactical and strategic
business decisions.”

• Centralized extract and staging


• Separate from operational system
• Structured for analysis
• Historically contexted
Dimensional Data Warehouse Architecture

Relational Data External Data Enterprise Data

Data Distribution
Acquisition, Staging, Cleaning,
Transformation

Data Warehouse Analytical


Storage Applications
Levels of Design

• Detail Level
– Dimensional Normal form
– Value and feasibility

• Analytical Level
– Structured for the required analyses

• Summary Level
– Summaries for user requirements
– Better response time
Dimensional Normal Form

• Normalized for maintainability

• De-normalized for performance, based on rules

• 2 level structure, therefore only one level of joins


required for queries
Northwind Database Model – Dimensional Format
Employees

PK EmployeeKey

Territories EmployeeTerritories EmployeeID


Region
LastName
PK TerritoryKey PK EmployeeTerritoryKey FirstName
PK RegionKey
Title
FK1 RegionKey FK1 EmployeeKey TitleOfCourtesy
RegionID
TerritoryID FK2 TerritoryKey BirthDate
RegionDescription
TerritoryDescription TerritoryID HireDate
TerritoryDescription Address
RegionKey City
RegionID Region
RegionDescription PostalCode
Country
HomePhone
Extension
Photo
Notes
FK1 ReportsTo
PhotoPath
EmployeeTerritoryKey
TerritoryKey
Products Order Details TerritoryID
TerritoryDescription
Categories PK ProductKey PK OrderDetailKey Orders RegionKey
PK CategoryKey RegionID
ProductID FK2 OrderKey PK OrderKey RegionDescription
ProductName OrderID
CategoryID OrderID
QuantityPerUnit UnitPrice
CategoryName FK1 CustomerKey
UnitPrice Quantity
Description FK2 EmployeeID
UnitsInStock Discount
Picture FK3 ShipperKey CustomerCustomerDemo
UnitsOnOrder ExtendedPrice Customers
ReorderLevel FK1 ProductKey OrderDate
RequiredDate PK CustomerKey PK CustomerDemographicsKey
Discontinued ProductID
FK1 CategoryKey ProductName ShippedDate
Suppliers Freight CustomerID CustomerKey
CategoryID QuantityPerUnit
ShipName CompanyName FK1 CustomerTypeKey
PK SupplierKey CategoryName UnitPrice
ShipAddress ContactName CustomerDesc
Description UnitsInStock
FK2 SupplierKey UnitsOnOrder ShipCity ContactTitle
SupplierID Address
CompanyName SupplierID ReorderLevel ShipRegion
CompanyName Discontinued ShipPostalCode City
ContactName Region
ContactTitle ContactName CategoryKey ShipCountry
ContactTitle CategoryID PostalCode
Address Country CustomerDemographics
City Address CategoryName
City Description Phone
Region Fax PK CustomerTypeKey
PostalCode Region SupplierKey
PostalCode SupplierID CustomerTypeID
Country CustomerDesc CustomerTypeID
Phone Country CompanyName CustomerDesc
Phone ContactName FK1 CustomerDemographicsKey
Fax
HomePage Fax ContactTitle
HomePage Address
City
Shippers
Region
PostalCode PK ShipperKey
Country
Phone ShipperID
Fax CompanyName
HomePage Phone
Dimensional Schema

• Fact Tables
– contain related measures
– Usually the largest tables
– Usually appended to
– Can contain detail or summary data
– Measures are usually additive
• Dimension Tables
– Contain descriptors
– Utilize business terminology
– Textual and discrete data
– Attributes through which the table measures are analyzed
Northwind Database – Star Schema – Orders
dimDate

PK DateKey

DayDate
DayDate_YYYYMMDD
DayOfWeekName
DayOfWeekNameAbbrv
DayNumberInWeek dimEmployees
DayNumberInMonth
DayNumberInQuarte PK EmployeeKey
DayNumberInYear
WeekDayIndicator EmployeeID
WeekEndIndicator LastName
Week_YYYYWW FirstName
WeekNumberInYear Title
Month_YYYYMM TitleOfCourtesy
MonthName BirthDate
MonthNameAbbrv HireDate
MonthNumberInYear Address
Quarter_YYYYQ City
QuarterName Region
QuarterNameAbrv PostalCode
QuarterNumberInYear Country
dimOrderDetails Year HomePhone
Extension
PK ProductKey Photo
Notes
OrderID ReportsTo
UnitPrice PhotoPath
Quantity fctOrders
TerritoryID
Discount PK OrderKey TerritoryDescription
ExtendedPrice RegionID
ProductID FK3 ProductKey RegionDescription
ProductName FK2 EmployeeKey
QuantityPerUnit FK1 CustomerKey
UnitPrice FK4 ShipperKey
UnitsInStock FK6 OrderDateKey
UnitsOnOrder FK5 RequiredDateKey
ReorderLevel FK7 ShippedDateKey
Discontinued OrderID
CategoryID ShipVia
CategoryName Freight
Description ShipName dimCustomers
SupplierID ShipAddress
CompanyName PK CustomerKey
ShipCity
ContactName ShipRegion
ContactTitle CustomerID
ShipPostalCode
Address CompanyName
ShipCountry
City ContactName
Region ContactTitle
PostalCode Address
Country City
Phone Region
Fax PostalCode
HomePage Country
dimShippers
Phone
PK ShipperKey Fax
CustomerTypeID
ShipperID CustomerDesc
CompanyName
Phone
Resources

• Books
– The Data Warehouse Toolkit, Ralph Kimball
– The Data Warehouse Lifecycle Toolkit, Ralph Kimball, et al
– Data Warehouse Design Solutions, Adamson / Venerable

• Websites
– http://www.ralphkimball.com/
– http://www.atre.com
– http://www.microsoft.com/sql/evaluation/overview/
dataware.asp

You might also like