Professional Documents
Culture Documents
Web Content: With The SQL Server 2008R2 Platform
Web Content: With The SQL Server 2008R2 Platform
Web Content: With The SQL Server 2008R2 Platform
Web Content
Chapter 2 Example of Type 1 and Type 2 attribute change tracking techniques
The best way to understand the concepts of Type 1 and Type 2 change
tracking, and the substantial impact of accurately tracking changes, is with an
example. As we see in the Data Mining chapter, Adventure Works Cycles collects
demographic information from its Internet customers. These attributes, such as
gender, homeowner status, education, and commute distance, all go into the
customer dimension. Many of these attributes will change over time. If a customer
moves, her commute distance will change. If a customer bought a bike when he
went to college, his education status will change when he graduates. These
attributes all certainly qualify as slowly changing, but should you treat them as
Type 1 or Type 2 attributes?
If youve been paying attention, you know the correct answer is to ask the
business users how theyll use this information. You already know from the
requirements gathering information in Chapter 1 that Marketing intends to use
some data mining techniques to look for demographic attributes that predict
buying behaviors. In particular, they want to know if certain attributes are
predictive of higher-than-average buying. They could use those attributes to create
targeted marketing campaigns. Lets examine some example customer-level data
to see how decisions on handling attribute changes can affect the information you
provide your users.
Table 2.1 shows the row in the customer dimension for a customer named
Jane Rider as of January 1, 2011. Notice that the customer dimension includes the
business key from the transaction system along with the attributes that describe
Jane Rider. The business key allows users to tie back to the transaction system if
need be.
Table 2.1: Example Customer Dimension Table Row as of Jan 1, 2011
Customer_Ke
y
1552
Customer_
BKCustomer_ID Name
31421
Jane Rider
Commute_
Distance
3
Gender
Female
Home_
Owner_
Flag
No
Table 2.2 shows some example rows from an abridged version of the AWC
Orders fact table in the data warehouse database. We have used surrogate keys for
customer and product, but we left the order date as a date for the sake of clarity.
These rows show all of Jane Riders AWC orders.
A marketing analyst might want to look at these two tables together, across
all customers, to see if any of the customer attributes predict behavior. For
example, one might hypothesize that people with short commute distances are
more likely to buy nice bikes and accessories than people with long commute
distances. They might ride their bikes to work rather than drive.
Table 2.2: Example Order Fact Table Rows for Jane Rider as of Feb 22, 2011
Date
1/7/2009
3/2/2009
5/7/2010
8/21/201
0
2/21/201
1
Customer_Ke
y
1552
1552
1552
1552
Product_Ke
y
95
37
87
33
Item_Coun
t
1
1
2
2
1552
42
Dollar_Amoun
t
1,798.00
27.95
320.26
129.99
19.95
BKCusto
mer_Id
31421
31421
Customer_
Name
Jane Rider
Jane Rider
Commute_
Distance
3
31
Gender
Female
Female
Home_
Owner_
Flag
No
Yes
Eff_Date
1/7/2009
1/2/2011
End_Date
1/1/2011
12/31/9999
The customer dimension table has been augmented to help manage the
Type 2 process. Two columns have been added to indicate the effective date and
end date of each row. This is how you can tell exactly which row was in effect at
any given time.
As Table 2.4 shows, the fact table has to change as well because it has a
row that occurred after the Type 2 change occurred. The last row of the fact table
needs to join to the row in the dimension that was in effect when the order
occurred on February 21, 2011. This is the new dimension row with
Customer_Key = 2387.
Table 2.4: Updated Order Fact Table Rows for Jane Rider as of Feb 22, 2011
Date
1/7/2009
3/2/2009
5/7/2010
Customer_Ke
y
1552
1552
1552
Product_Ke
y
95
37
87
Item_COUN
T
1
1
2
Dollar_Amoun
t
1,798.00
27.95
320.26
8/21/201
0
2/21/201
1
1552
33
129.99
2387
42
19.95