DW Lec7


Inside a Dimension Table

 Dimension Table Key


 Table is Wide.
 Drilling Down, Rolling Up.
 Fewer Records.
 Multiple Hierarchies.


Dimension Table Key (Surrogate key)


The primary key of the dimension table uniquely identifies each
row in the table.
Surrogate key

A surrogate key is an anonymous integer primary key that is not derived from
application data.
Features of Surrogate key
◼ Numeric.
◼ Sequential.
◼ No Business Meaning.

Benefits of Surrogate keys


◼ Constant Behavior.
◼ Multi source integration.
◼ Faster Query Performance.
◼ Future Records.
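
A minimal sketch of how surrogate keys might be assigned while loading a dimension, using Python's built-in sqlite3 module. The table customer_dim, the source systems "crm" and "billing", and the sample rows are hypothetical and chosen only for illustration. It shows two sources reusing the same application key while each row still receives its own sequential, meaningless integer key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer_dim (
        customer_key  INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        source_system TEXT NOT NULL,  -- hypothetical source name
        natural_key   TEXT NOT NULL,  -- key used by the source application
        customer_name TEXT NOT NULL,
        UNIQUE (source_system, natural_key)
    )
""")

# Two hypothetical sources that happen to reuse the natural key "1001".
rows = [
    ("crm", "1001", "Alice"),
    ("billing", "1001", "Bob"),
    ("crm", "1002", "Carol"),
]
conn.executemany(
    "INSERT INTO customer_dim (source_system, natural_key, customer_name) "
    "VALUES (?, ?, ?)",
    rows,
)

# The database hands out the surrogate keys; the load never supplies them.
surrogate_keys = [r[0] for r in conn.execute(
    "SELECT customer_key FROM customer_dim ORDER BY customer_key")]
print(surrogate_keys)
```

Because the key carries no business meaning, a change to a source system's own key format would not force a change to the fact tables that reference customer_key.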
Inside a Dimension Table

❑ Table is Wide.
❑ Dimension tables are wide because attributes can be added to them
at any point in the DW cycle. The DW architect asks the ETL team
to add each new attribute to the schema.
❑ In real-world scenarios, dimension tables with 50 or more
attributes are common.

Inside a Dimension Table



❑ Not Normalized.
❑ For efficient query performance, it is best if a query picks up an attribute
from the dimension table and goes directly to the fact table, not through
other intermediary tables.
❑ Normalizing the dimension table creates such intermediary tables, and
every query then has to pay for the extra joins.
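
The difference can be sketched with Python's built-in sqlite3 module. All table and column names below (product_dim, product_norm, category, sales_fact) and the sample rows are hypothetical. Both queries return the same totals, but the normalized layout needs one more join to reach the category name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Denormalized dimension: the category name sits on the product row.
    CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY,
                              product_name TEXT, category_name TEXT);
    -- Normalized alternative: the category moves to an intermediary table.
    CREATE TABLE category (category_id INTEGER PRIMARY KEY, category_name TEXT);
    CREATE TABLE product_norm (product_key INTEGER PRIMARY KEY,
                               product_name TEXT, category_id INTEGER);
    CREATE TABLE sales_fact (product_key INTEGER, sales_amount REAL);

    INSERT INTO product_dim VALUES (1, 'Pen', 'Stationery'), (2, 'Desk', 'Furniture');
    INSERT INTO category VALUES (10, 'Stationery'), (20, 'Furniture');
    INSERT INTO product_norm VALUES (1, 'Pen', 10), (2, 'Desk', 20);
    INSERT INTO sales_fact VALUES (1, 5.0), (1, 7.0), (2, 120.0);
""")

# One join: dimension attribute straight to the fact table.
direct = conn.execute("""
    SELECT d.category_name, SUM(f.sales_amount)
    FROM sales_fact f
    JOIN product_dim d ON d.product_key = f.product_key
    GROUP BY d.category_name ORDER BY d.category_name
""").fetchall()

# Two joins: the attribute is reached through the intermediary table.
via_intermediary = conn.execute("""
    SELECT c.category_name, SUM(f.sales_amount)
    FROM sales_fact f
    JOIN product_norm p ON p.product_key = f.product_key
    JOIN category c ON c.category_id = p.category_id
    GROUP BY c.category_name ORDER BY c.category_name
""").fetchall()

print(direct)
print(via_intermediary)
```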
Inside a Dimension Table

❑ Drilling Down, Rolling Up.
❑ The attributes in a dimension table make it possible to move from higher
levels of aggregation down to lower levels of detail, and back up again.
❑ For example, the three attributes zip code, city, and state form a hierarchy.
You may get the total sales by state, then drill down to total sales by city,
and then by zip code. Going the other way, you may first get the totals by
zip code, and then roll up to totals by city and state.
❑ Fewer Records.
❑ Dimension tables have far fewer records (typically hundreds) than fact
tables (typically millions). Although they are much smaller, they supply the
attributes through which the fact tables are queried.
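
The zip code, city, and state hierarchy above can be sketched as a single query whose grouping level changes. This is an illustrative example using Python's sqlite3 module; the tables customer_dim and sales_fact, the helper total_sales_by, and the sample rows are all assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_dim (customer_key INTEGER PRIMARY KEY,
                               zip_code TEXT, city TEXT, state TEXT);
    CREATE TABLE sales_fact (customer_key INTEGER, sales_amount REAL);
    INSERT INTO customer_dim VALUES
        (1, '33101', 'Miami', 'FL'),
        (2, '33109', 'Miami', 'FL'),
        (3, '32801', 'Orlando', 'FL');
    INSERT INTO sales_fact VALUES (1, 10.0), (2, 20.0), (3, 5.0), (1, 15.0);
""")

def total_sales_by(*levels):
    """Total sales grouped by the given hierarchy levels.

    The level names are fixed column names of customer_dim, not user input.
    """
    cols = ", ".join(levels)
    sql = f"""
        SELECT {cols}, SUM(f.sales_amount)
        FROM sales_fact f
        JOIN customer_dim d ON d.customer_key = f.customer_key
        GROUP BY {cols} ORDER BY {cols}
    """
    return conn.execute(sql).fetchall()

by_state = total_sales_by("state")                    # highest level
by_city = total_sales_by("state", "city")             # drill down
by_zip = total_sales_by("state", "city", "zip_code")  # drill down again

print(by_state)
print(by_city)
print(by_zip)
```

Reading the three results from bottom to top is the roll-up direction: zip code totals combine into city totals, which combine into the state total.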

Inside a Dimension Table



❑ Multiple Hierarchies.
❑ In the example of the customer dimension, there is a single hierarchy
going up from individual customer to zip, city, and state.
❑ But dimension tables often provide for multiple hierarchies, so that
drilling down may be performed along any of the multiple
hierarchies.
Multiple Hierarchies (Example)

Inside a Fact Table



Concatenated Key.
❑ A row in the fact table relates to a combination of rows from all the dimension tables.

❑ In this example of a fact table, quantity ordered is one of the attributes. Suppose
the dimension tables are product, time, customer, and sales representative, and that
the lowest level in each dimension hierarchy is an individual product, a calendar
date, a specific customer, and a single sales representative. Then a single row in
the fact table must relate to a particular product, a specific calendar date, a
specific customer, and an individual sales representative.
❑ This means the row in the fact table must be identified by the primary keys of these
four dimension tables. Thus, the primary key of the fact table is the
concatenation of the primary keys of all the dimension tables.
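
A minimal sketch of such a concatenated key, using Python's sqlite3 module. The table order_fact, its column names, and the key values are hypothetical. The composite primary key allows one row per combination of product, date, customer, and sales representative, so a second row for the same combination is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_fact (
        product_key      INTEGER NOT NULL,
        date_key         INTEGER NOT NULL,
        customer_key     INTEGER NOT NULL,
        sales_rep_key    INTEGER NOT NULL,
        quantity_ordered INTEGER,
        -- Concatenation of the four dimension keys.
        PRIMARY KEY (product_key, date_key, customer_key, sales_rep_key)
    )
""")

conn.execute("INSERT INTO order_fact VALUES (1, 20240105, 7, 3, 12)")

# Same product, date, customer, and sales rep: violates the primary key.
try:
    conn.execute("INSERT INTO order_fact VALUES (1, 20240105, 7, 3, 99)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Changing any one key component (here the date) identifies a new row.
conn.execute("INSERT INTO order_fact VALUES (1, 20240106, 7, 3, 4)")
row_count = conn.execute("SELECT COUNT(*) FROM order_fact").fetchone()[0]
print(duplicate_rejected, row_count)
```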

Additive measure

❑ Additive facts are facts that can be summed up across all of the dimensions in the
fact table.
❑ For example, to find the total sales of a retail store over the last six months,
we can group the records of those six months and sum them into one aggregated value.
In this case, sales (or the sales amount) is an additive fact, because we can sum the
fact records along any of the related dimensions (product, customer, supplier, etc.)
present in the fact table.
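
A small sketch of what "additive" means in practice. The rows, the column order, and the helper sum_by are hypothetical. Summing the sales amount along any single dimension gives totals that always add back up to the same grand total.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, store, month, sales_amount)
sales_fact = [
    ("pen", "Miami", "2024-01", 10.0),
    ("pen", "Orlando", "2024-01", 20.0),
    ("desk", "Miami", "2024-02", 100.0),
    ("pen", "Miami", "2024-02", 5.0),
]

def sum_by(position):
    """Sum sales_amount grouped by the dimension at the given tuple position."""
    totals = defaultdict(float)
    for row in sales_fact:
        totals[row[position]] += row[3]
    return dict(totals)

by_product = sum_by(0)
by_store = sum_by(1)
by_month = sum_by(2)
grand_total = sum(row[3] for row in sales_fact)

print(by_product, by_store, by_month, grand_total)
```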
Semi-Additive measure

Semi-additive facts are facts that can be summed up across some of the
dimensions in the fact table, but not the others.
❑ Take headcount, for example: if I had ten people in my
department in January, February, and March, then my Q1 headcount
is not thirty; it is still ten.
❑ Quantity in hand, account balance, and number of patients are examples
of semi-additive facts.
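
The headcount example can be worked through in a few lines. The departments and figures below are hypothetical. Headcount sums correctly across departments within one month, but across months it has to be reported as a closing value or an average instead of a sum.

```python
# Hypothetical monthly snapshot rows: (department, month, headcount)
headcount_fact = [
    ("Sales", "Jan", 10), ("Sales", "Feb", 10), ("Sales", "Mar", 10),
    ("Support", "Jan", 4), ("Support", "Feb", 5), ("Support", "Mar", 6),
]

# Additive across the department dimension, within a single month.
company_headcount_march = sum(h for _, m, h in headcount_fact if m == "Mar")

# Not additive across the time dimension.
sales_by_month = [h for d, _, h in headcount_fact if d == "Sales"]
wrong_q1_headcount = sum(sales_by_month)           # counts the same people three times
q1_closing_headcount = sales_by_month[-1]          # value at the end of the quarter
q1_average_headcount = sum(sales_by_month) / len(sales_by_month)

print(company_headcount_march, wrong_q1_headcount,
      q1_closing_headcount, q1_average_headcount)
```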

Non-Additive measure

Certain measures are completely non-additive, such as ratios. Non-additive facts are facts that
cannot be summed up across any of the dimensions present in the fact table.
Examples of non-additive facts:
❑ Averages (average sales price, unit price)
❑ Percentages (% discount)
❑ Ratios (gross margin)
❑ Count of distinct products sold
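
A short sketch of why a ratio such as gross margin cannot be summed. The stores and figures are hypothetical. A common workaround, shown here as one possible approach, is to store the additive components (revenue and cost) and recompute the ratio after aggregating them.

```python
# Hypothetical rows: (store, revenue, cost)
rows = [("A", 100.0, 50.0), ("B", 1000.0, 900.0)]

# Per-row gross margin: 50% for store A, 10% for store B.
margins = [(revenue - cost) / revenue for _, revenue, cost in rows]

# Summing the ratios gives a number with no business meaning.
summed_margin = sum(margins)

# Aggregate the additive components first, then derive the ratio.
total_revenue = sum(revenue for _, revenue, _ in rows)
total_cost = sum(cost for _, _, cost in rows)
overall_margin = (total_revenue - total_cost) / total_revenue

print(summed_margin, overall_margin)
```

The overall margin is weighted by revenue, so it also differs from the simple average of the two per-store margins.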


Example

❑ Additive fact
❑ In the previous fact table, we can aggregate the quantity sold to find the
total number of products sold for each store, for each product, or for each
time period. Sales amount is also an additive fact.
❑ Semi-additive fact
❑ Quantity in hand gives the number of products available in a store after
every transaction. It can easily be summed across products or stores to give
a meaningful total. However, it is not additive across dates.
Example

Non-Additive measure
In the previous example, adding up the 5% values across all the Miami stores gives 25%,
which is incorrect. Nor can the percentage be aggregated across products or dates. Unit
price is another non-additive measure: it cannot be summed along any
dimension.

Table Deep, Not Wide.



Typically a fact table contains fewer attributes than a dimension table.
Usually, there are about 10 attributes or fewer. But the number of records
in a fact table is very large in comparison.
Sparse Data

❑ For a particular product, a specific calendar date, a specific customer,
and an individual sales representative, there is a corresponding row in
the fact table.
❑ What happens when the date represents a closed holiday and no
orders are received and processed? The fact table rows for such dates
would have no values for the measures, so there is no need to store them.
❑ Therefore, it is important to recognize this type of sparse data and
understand that the fact table could have gaps.
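
A small sketch of sparsity, assuming hypothetical products, dates, and customers, with 2024-12-25 treated as a closed holiday. Only combinations that actually produced an order are stored, so the fact table holds far fewer rows than the number of possible key combinations.

```python
from itertools import product

# Hypothetical dimension members.
products = ["pen", "desk"]
dates = ["2024-12-24", "2024-12-25", "2024-12-26"]  # the 25th is a closed holiday
customers = ["c1", "c2"]

# Fact rows keyed by (product, date, customer) -> quantity ordered.
order_fact = {
    ("pen", "2024-12-24", "c1"): 3,
    ("desk", "2024-12-24", "c2"): 1,
    ("pen", "2024-12-26", "c2"): 5,
}

possible_rows = len(list(product(products, dates, customers)))
stored_rows = len(order_fact)
holiday_rows = [key for key in order_fact if key[1] == "2024-12-25"]
fill_ratio = stored_rows / possible_rows

print(possible_rows, stored_rows, holiday_rows, fill_ratio)
```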

Degenerate Dimension

❑ A degenerate dimension is a dimension key in the fact table that
does not have its own dimension table.

❑ Look closely at the example of the fact table. You find the attributes
order_number and order_line. These are not measures or metrics
or facts. Then why are these attributes in the fact table?

❑ When you pick up attributes for the dimension tables and the fact
tables from operational systems, you will be left with some data
elements in the operational systems that are neither facts nor strictly
dimension attributes.

❑ Examples of such attributes are order numbers, invoice numbers, order
line numbers, and so on.
❑ These attributes are useful in some types of analysis. For example,
you may be looking for the average number of products per order. Then
you will have to relate the products to the order number to calculate
the average.
❑ Attributes such as order_number and order_line in the example are
called degenerate dimensions, and they are kept as attributes of the
fact table.
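
The average number of products per order can be sketched with Python's sqlite3 module. The table order_fact and its rows are hypothetical. The query groups by the degenerate dimension order_number directly, since there is no order dimension table to join to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_fact (
        order_number     INTEGER,  -- degenerate dimension
        order_line       INTEGER,  -- degenerate dimension
        product_key      INTEGER,
        quantity_ordered INTEGER
    );
    -- Order 5001 has three products, order 5002 has one.
    INSERT INTO order_fact VALUES
        (5001, 1, 1, 2), (5001, 2, 2, 1), (5001, 3, 3, 4),
        (5002, 1, 1, 1);
""")

avg_products_per_order = conn.execute("""
    SELECT AVG(product_count) FROM (
        SELECT COUNT(DISTINCT product_key) AS product_count
        FROM order_fact
        GROUP BY order_number
    )
""").fetchone()[0]

print(avg_products_per_order)
```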


The Factless Fact Table



❑ A factless fact table is a fact table that does not contain any measurements.
It contains only dimension keys and captures events that matter as
information but do not feed any calculation: it simply records
that an event happened at some point in a period.
❑ Factless fact tables are often used to record events or coverage information.
Common examples of factless fact tables include:

❑ Identifying product promotion events (to determine promoted products
that didn’t sell)
❑ Tracking student attendance or registration events
❑ Tracking insurance-related accident events
❑ Identifying building, facility, and equipment schedules for a hospital or
university
❑ Factless fact tables are used for tracking a process or collecting statistics.
They are so called because the fact table does not hold
aggregatable numeric values.
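
The promotion example can be sketched with Python's sqlite3 module. The tables promotion_fact and sales_fact, their columns, and the sample rows are hypothetical. The factless table holds only dimension keys, so the analysis consists of counting its rows and comparing them with the sales fact table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Factless coverage table: dimension keys only, no measures.
    CREATE TABLE promotion_fact (date_key INTEGER, product_key INTEGER,
                                 promotion_key INTEGER);
    CREATE TABLE sales_fact (date_key INTEGER, product_key INTEGER,
                             quantity_sold INTEGER);
    -- Products 1, 2, and 3 were promoted; only 1 and 3 sold.
    INSERT INTO promotion_fact VALUES
        (20240105, 1, 9), (20240105, 2, 9), (20240105, 3, 9);
    INSERT INTO sales_fact VALUES (20240105, 1, 4), (20240105, 3, 2);
""")

# With no measures to sum, the only aggregate is a count of events.
promoted_count = conn.execute(
    "SELECT COUNT(*) FROM promotion_fact").fetchone()[0]

# Promoted products with no matching sales row on the same date.
promoted_not_sold = [r[0] for r in conn.execute("""
    SELECT p.product_key
    FROM promotion_fact p
    LEFT JOIN sales_fact s
           ON s.date_key = p.date_key AND s.product_key = p.product_key
    WHERE s.product_key IS NULL
    ORDER BY p.product_key
""")]

print(promoted_count, promoted_not_sold)
```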
