BI Podium June 6 v05 Dist-1

25568 Genesee Trail Rd

Golden, Colorado 80401


(303) 526-0340

• Data Vault Modeling and Approach
• DW2.0 and Unstructured Data
• Master Data Management and Metadata

Data Vault & Ensemble Modeling

BI Podium Next Generation DWH Modeling 2013


gohansgo

Hans Hultgren
© 2013 Genesee Academy, LLC


Data Vault & Ensemble Modeling
• Welcome
• Quick audience poll:
– Data Warehousing Business Intelligence
– Data Vault Modeling
– Certification Course
• Session will cover:
– Data Vault
– Ensemble
– Unified Decomposition
– Data Warehousing
– Agility
• More information



Data Vault and Ensemble Modeling
Intro
• About Data Warehousing - Characteristics 1

Each layer of the architecture has its own requirements, constraints & variables



Data Vault and Ensemble Modeling
Intro
• Why do we need it?
Each layer of the architecture has its own requirements, constraints & variables

3 layer architecture…
About Data Vault, Ensemble & the EDW
Intro 2
• Enterprise Data Warehousing

• Integrated, Non-Volatile, Time-Variant, Subject/Concept Oriented,
Central data store.
• Core Features: Enterprise-Wide, Historized, Auditable, Central
Data, Integrated across all forms of sources, internal and
external.
Why data vault…
Why do we use Data Vault
Intro 2
• Integration
• Traceability
• History
• Incremental Build
• Agility

• Gracefully Adapts to New Sources
• Full Auditability - Source to Mart
• Enterprise View of Central Data

• Data Vault is optimized for modeling the EDW


What is data vault…
Data Vault & Ensemble Modeling
Intro 2
• Data Vault is the leading data modeling approach among new options
for the flexible/agile data warehouse.
Data Modeling Approaches:

Operational        Data Warehouse    Data Mart
3rd Normal Form    Data Vault        Dimensional

• For data warehouse agility there are other techniques as well. The
broader family of techniques are all flavors of Ensemble Modeling.
• In effect Ensemble modeling = EDW modeling.
• Ensemble is based on the premise: The flexibility required by the data
warehouse needs a model that de-couples changing context from
relationships from the business keys (Unified Decomposition).
Agenda…
Agenda
• Background Topics:
– Core Business Concepts
– Agility
• Unified Decomposition
• Ensemble Modeling
• Data Vault Agility
• The Data Vault Ensemble
• Data Vault Core Constructs
• Applying Data Vault
• Core Concepts and the Backbone
• DV Pattern applied
• Bottom Line and Summary
INTEGRATION & THE CORE BUSINESS CONCEPT



The Core Business Concept
• The Core Business Concept is the basis for our Data Vault Data Warehouse. It is
similar to the Entity in 3NF or a Dimension in a Star Schema, so it
commonly includes Customer, Product, Employee, etc.
• Important to note: 1) Business Driven, and 2) Enterprise Wide.



ABOUT AGILITY



Agile Data Warehousing BI
• Agility = Measure of ability to Adapt to Change
• The EDW constantly needs to adapt to change
– New Sources
– New Attributes
– Changing Sources
– New and Changing Requirements
– New and Changing Business Rules
– New and Changing Deliveries
– Expanding Subject Areas
Data Warehousing = Adapting to Change



UNIFIED DECOMPOSITION™



Unified Decomposition™

Separate things that change from things that are not changing.
• Break things out into component parts for flexibility and to facilitate the capture
of things that are either interpreted in different ways or changing
independently of each other. Decomposition.

• These parts however need to be integrated to define the core business concept
(the Entity, the Dimension, etc.). So they must be kept together. Unified.
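The two bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Ensemble` class, the `decompose` function, and the sample field names (`customer_no`, `class_code`, `name`, `city`) are hypothetical and do not come from the slides. The sketch splits one flat source record into a key, its relationships, and its context (Decomposition), while holding the parts in one object tied to the key (Unified).

```python
from dataclasses import dataclass, field


@dataclass
class Ensemble:
    """All parts of one core business concept instance, kept together."""
    business_key: str                                   # what identifies it
    relationships: dict = field(default_factory=dict)   # how it relates to other concepts
    context: dict = field(default_factory=dict)         # what describes it (may change)


def decompose(record: dict, key_field: str, relationship_fields: set) -> Ensemble:
    """Split a flat source record into key, relationships and context."""
    ensemble = Ensemble(business_key=record[key_field])
    for name, value in record.items():
        if name == key_field:
            continue
        if name in relationship_fields:
            ensemble.relationships[name] = value
        else:
            ensemble.context[name] = value
    return ensemble


# Hypothetical flat customer record from a source system.
source_row = {"customer_no": "C-100", "class_code": "GOLD",
              "name": "Acme", "city": "Golden"}
customer = decompose(source_row, "customer_no", {"class_code"})
print(customer)
```

Because the context lives apart from the key, a new descriptive field on the source row lands in `context` without touching the key or the relationships.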



Ensemble Modeling™

• The constellation of component parts acts as a whole – an Ensemble.

All the parts of a thing taken together, so that
each part is considered only in relation to the whole.

• With Ensemble Modeling the Core Business Concepts that we define and model
are represented as a whole – an ensemble – including all of the component
parts.
• An Ensemble is based on all things defining a Core Business Concept that can be
uniquely and specifically said for one instance of that Concept.
Data Vault Agility

• The Data Vault Ensemble conforms to a single key
embodied in the Hub construct.
• The component parts for the Data Vault Ensemble include:
– Hub: The Natural Business Key
– Link: The Natural Business Relationships
– Satellite: All Context, Descriptive Data and History



The Data Vault Ensemble

• Data Vault constructs have been
broken out by type of data…

[Figure labels: Core, Customer]

Core Constructs…
Hubs

– A Hub Construct in Data Vault
• contains Business Key
• only the Business Key
• contains No Context
• is always 1:1 with EWBK

Example (H_Customer): H_Customer_SID, Business Key, Date/Time Stamp, Record source

– A Hub Table contains only
• Business Key
• Surrogate Key (Data Warehouse)
• Load Date / Time Stamp
• Record Source
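
The four hub columns listed above can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, not the presenter's DDL: the column names `Business_Key`, `Load_DTS` and `Record_Source`, the `load_hub` helper, and the sample keys and sources are assumptions made for illustration. It shows one row per business key, no matter how many sources deliver that key.

```python
import sqlite3


def create_hub(conn):
    # Only the four hub columns: surrogate key, business key, load stamp, source.
    conn.execute("""
        CREATE TABLE IF NOT EXISTS H_Customer (
            H_Customer_SID INTEGER PRIMARY KEY,
            Business_Key   TEXT NOT NULL UNIQUE,
            Load_DTS       TEXT NOT NULL,
            Record_Source  TEXT NOT NULL
        )""")


def load_hub(conn, business_keys, load_dts, record_source):
    # INSERT OR IGNORE skips keys that already exist (UNIQUE on Business_Key),
    # so the first source to deliver a key stays on record.
    conn.executemany(
        "INSERT OR IGNORE INTO H_Customer (Business_Key, Load_DTS, Record_Source) "
        "VALUES (?, ?, ?)",
        [(bk, load_dts, record_source) for bk in business_keys])


conn = sqlite3.connect(":memory:")
create_hub(conn)
load_hub(conn, ["C-100", "C-200"], "2013-06-06T09:00:00", "CRM")
load_hub(conn, ["C-200", "C-300"], "2013-06-07T09:00:00", "ERP")  # C-200 already known

hub_rows = conn.execute(
    "SELECT Business_Key, Record_Source FROM H_Customer "
    "ORDER BY H_Customer_SID").fetchall()
print(hub_rows)
```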



Links

– A Link Construct in Data Vault
• contains Relationship
• only a Relationship
• contains No Context
• is always 1:1 with Relationship

Example (L_Cust_Class): L_Cust_Class_SID, H_Sequence1_SID, H_Sequence2_SID, Date/Time Stamp, Record source

– A Link Table contains only
• 2-n FKs for the Relationship
• Surrogate Key (Data Warehouse)
• Load Date / Time Stamp
• Record Source
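
A link table with those columns might look like the following `sqlite3` sketch. The second hub `H_Class`, the `load_link` helper, and the sample keys are hypothetical, and the hub tables are trimmed to the columns the example needs. The point illustrated is that the link carries only foreign keys to hubs plus load metadata, with one row per distinct relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hubs trimmed to key columns for brevity (assumed names).
    CREATE TABLE H_Customer (
        H_Customer_SID INTEGER PRIMARY KEY,
        Business_Key   TEXT NOT NULL UNIQUE
    );
    CREATE TABLE H_Class (
        H_Class_SID  INTEGER PRIMARY KEY,
        Business_Key TEXT NOT NULL UNIQUE
    );
    -- The link: surrogate key, 2 hub FKs, load stamp, record source. No context.
    CREATE TABLE L_Cust_Class (
        L_Cust_Class_SID INTEGER PRIMARY KEY,
        H_Customer_SID   INTEGER NOT NULL REFERENCES H_Customer (H_Customer_SID),
        H_Class_SID      INTEGER NOT NULL REFERENCES H_Class (H_Class_SID),
        Load_DTS         TEXT NOT NULL,
        Record_Source    TEXT NOT NULL,
        UNIQUE (H_Customer_SID, H_Class_SID)
    );
    INSERT INTO H_Customer (Business_Key) VALUES ('C-100');
    INSERT INTO H_Class (Business_Key) VALUES ('GOLD');
""")


def load_link(conn, customer_key, class_key, load_dts, record_source):
    # Resolve both business keys to hub surrogate keys, then insert the
    # relationship once; repeats are ignored by the UNIQUE constraint.
    conn.execute(
        """INSERT OR IGNORE INTO L_Cust_Class
               (H_Customer_SID, H_Class_SID, Load_DTS, Record_Source)
           SELECT c.H_Customer_SID, k.H_Class_SID, ?, ?
           FROM H_Customer c, H_Class k
           WHERE c.Business_Key = ? AND k.Business_Key = ?""",
        (load_dts, record_source, customer_key, class_key))


load_link(conn, "C-100", "GOLD", "2013-06-06T09:00:00", "CRM")
load_link(conn, "C-100", "GOLD", "2013-06-07T09:00:00", "ERP")  # same relationship again

link_rows = conn.execute(
    "SELECT H_Customer_SID, H_Class_SID, Record_Source FROM L_Cust_Class").fetchall()
print(link_rows)
```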



Satellites

– A Satellite Construct in Data Vault
• contains Context only
• has no FKs (no relationships)
• Designed by: Rate of Change, Type of Data, System…

Example (S_Customer): H_Customer, Date/Time Stamp, Context A, Context B, Context C, Context D, Record source

– A Satellite Table contains only
• Business Key FK
• Load Date / Time Stamp
• Context Data…
• Record Source
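
As a rough illustration of the satellite columns above, the `sqlite3` sketch below keeps context rows keyed by the hub key plus the load stamp. Writing a new row only when the context differs from the latest stored row is a common loading convention assumed here, not something the slide states. The context columns `Name` and `City` and the `load_satellite` helper are likewise hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE S_Customer (
        H_Customer_SID INTEGER NOT NULL,   -- key of the parent hub
        Load_DTS       TEXT    NOT NULL,   -- load date / time stamp
        Name           TEXT,               -- context (assumed attribute)
        City           TEXT,               -- context (assumed attribute)
        Record_Source  TEXT    NOT NULL,
        PRIMARY KEY (H_Customer_SID, Load_DTS)
    )""")


def load_satellite(conn, hub_sid, load_dts, name, city, record_source):
    """Add a history row only if the context changed; return True if added."""
    latest = conn.execute(
        "SELECT Name, City FROM S_Customer WHERE H_Customer_SID = ? "
        "ORDER BY Load_DTS DESC LIMIT 1", (hub_sid,)).fetchone()
    if latest == (name, city):
        return False  # nothing changed, keep history as it is
    conn.execute("INSERT INTO S_Customer VALUES (?, ?, ?, ?, ?)",
                 (hub_sid, load_dts, name, city, record_source))
    return True


results = [
    load_satellite(conn, 1, "2013-06-01T00:00:00", "Acme", "Denver", "CRM"),
    load_satellite(conn, 1, "2013-06-02T00:00:00", "Acme", "Denver", "CRM"),
    load_satellite(conn, 1, "2013-06-03T00:00:00", "Acme", "Golden", "CRM"),
]
row_count = conn.execute("SELECT COUNT(*) FROM S_Customer").fetchone()[0]
print(results, row_count)
```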



Applying the data vault modeling pattern



Data Vault Model – How it Looks
Data Vault Model for Customer Sales with Employee and Product.



Core Concepts

Six (6) Concept Keys



Data Vault Backbone
The core foundation, the skeletal structure of the data vault model

The model as viewed:
– without the things that describe the key
– without the things that change over time

Six (6) Concept Keys



The Complete Data Vault Model
Complete model with all context and history. Easily adapting to changes.





Tracking History: Time Slice Data
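
The figures for this slide are not recoverable, so the following is only a sketch of how time-sliced context can be read back, assuming each satellite row is stamped with its load date/time. The `City` column, the sample rows, and the `city_as_of` helper are hypothetical. The query returns the context that was current at a given point in time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE S_Customer (
        H_Customer_SID INTEGER NOT NULL,
        Load_DTS       TEXT    NOT NULL,   -- ISO 8601 text sorts chronologically
        City           TEXT,
        Record_Source  TEXT    NOT NULL,
        PRIMARY KEY (H_Customer_SID, Load_DTS)
    )""")
conn.executemany(
    "INSERT INTO S_Customer VALUES (?, ?, ?, ?)",
    [(1, "2013-01-01T00:00:00", "Denver", "CRM"),
     (1, "2013-04-01T00:00:00", "Golden", "CRM"),
     (1, "2013-06-01T00:00:00", "Amsterdam", "CRM")])


def city_as_of(conn, hub_sid, as_of):
    """Return the city that was current at `as_of`, or None if not yet loaded."""
    row = conn.execute(
        """SELECT City FROM S_Customer
           WHERE H_Customer_SID = ? AND Load_DTS <= ?
           ORDER BY Load_DTS DESC LIMIT 1""", (hub_sid, as_of)).fetchone()
    return row[0] if row else None


print(city_as_of(conn, 1, "2013-05-15T00:00:00"))
```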



Impact of Change: New Attribute

[Figure label: New Attribute]
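
The slide's figure is not recoverable. One common way to absorb a new attribute in a Data Vault model is to attach a new satellite to the existing hub and leave existing tables untouched; the `sqlite3` sketch below assumes that approach. The satellite `S_Customer_Loyalty`, its `Loyalty_Tier` column, and the `table_definitions` helper are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE H_Customer (
        H_Customer_SID INTEGER PRIMARY KEY,
        Business_Key   TEXT NOT NULL UNIQUE,
        Load_DTS       TEXT NOT NULL,
        Record_Source  TEXT NOT NULL
    );
    CREATE TABLE S_Customer (
        H_Customer_SID INTEGER NOT NULL REFERENCES H_Customer (H_Customer_SID),
        Load_DTS       TEXT NOT NULL,
        Name           TEXT,
        Record_Source  TEXT NOT NULL,
        PRIMARY KEY (H_Customer_SID, Load_DTS)
    );
""")


def table_definitions(conn):
    """Map each table name to the DDL text SQLite stored for it."""
    return dict(conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table'"))


before = table_definitions(conn)

# The new attribute arrives: add a satellite instead of altering S_Customer.
conn.execute("""
    CREATE TABLE S_Customer_Loyalty (
        H_Customer_SID INTEGER NOT NULL REFERENCES H_Customer (H_Customer_SID),
        Load_DTS       TEXT NOT NULL,
        Loyalty_Tier   TEXT,
        Record_Source  TEXT NOT NULL,
        PRIMARY KEY (H_Customer_SID, Load_DTS)
    )""")

after = table_definitions(conn)
unchanged = all(after[name] == sql for name, sql in before.items())
new_tables = set(after) - set(before)
print(unchanged, new_tables)
```

Existing load routines and queries against `H_Customer` and `S_Customer` keep working as before, since their definitions are identical before and after the change.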



The Bottom Line
• The Data Warehouse needs to adapt to change easily, be based on
central business concepts, integrate data from several sources,
track history of changing context, contain trusted and auditable
information, and it needs to perform.
• Answering this call means a data warehouse program that is
designed to meet these requirements with the people, processes,
and the modeling techniques that support them.
• Data Warehouse modeling => Ensemble modeling: techniques that
are based on Unified Decomposition. There are several forms of
Ensemble methods in play today.
• Data Vault modeling is the leading form of Ensemble modeling
today.
• The Best Practice is Modeling Awareness
Data Vault Around the World

Estimated 750 Data Vault based
Data Warehouses around the world



Data Vault Certification Course
The Genesee Academy CDVDM – Data Vault Modeling Course.
The CDVDM is the data vault certification course covering all main topics of data vault modeling.
The course is delivered in a blended learning format using online video lessons (2 weeks),
classroom lectures, exercises, labs and small group modeling cases. Public courses are offered on
a regular schedule (www.GeneseeAcademy.com), and there are in-company options as well.

Data Vault Class

June 10-11
Amsterdam NL

Register Today!



About Hans Hultgren
• Hans Hultgren is an author, speaker, educator and advisor in the data
warehousing and business intelligence space. He is an expert on data
vault modeling and the author of Modeling the Agile Data Warehouse
with Data Vault where he introduced Ensemble Modeling and Unified
Decomposition.

• Hans is the President of Genesee Academy, LLC (including also
www.DataVaultAcademy.com), which provides the CDVDM data vault
certification around the globe.

• For 20 years Hans was a professor at DU, where he was the founder and
director of the Master of Science degree in business intelligence and data
warehousing (MSBI).



Links and Information

CDVDM Training & Certification
www.GeneseeAcademy.com
Hans@GeneseeAcademy.com

gohansgo
Book: DataVaultBook.blogspot.com
HansHultgren.WordPress.com
HansHultgren
DataVaultAcademy

Online video-lesson training
DataVaultAcademy.com

Data Vault Class
June 10-11
Amsterdam NL
Register Today!
