
DCS2105 Business Intelligence

Decision Support Science


(Part 2)

The list of OLAP operations:
 Roll-up
 Drill-down
 Slice and dice
 Pivot (rotate)
Common OLAP Operations
1. Roll-up: Move up the hierarchy
 By climbing up a concept hierarchy or by dimension reduction.
 When roll-up is performed by dimension reduction, one or more dimensions are removed from the data cube.
 E.g., given total sales by city, we can roll up to get sales by state or by country.
ROLLUP example

In SQL, ROLLUP is an extension of the GROUP BY clause.
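The roll-up from city to state or country can be sketched in Python as below. This is a minimal illustration, not the SQL ROLLUP syntax itself: the sales figures, the `CITY_HIERARCHY` mapping, and the `roll_up` helper are hypothetical names and values chosen for the example.

```python
from collections import defaultdict

# Hypothetical hierarchy: city -> (state, country).
CITY_HIERARCHY = {
    "Mumbai": ("Maharashtra", "India"),
    "Pune": ("Maharashtra", "India"),
    "Ahmedabad": ("Gujarat", "India"),
}

# Hypothetical total sales by city (the most detailed level here).
SALES_BY_CITY = {"Mumbai": 120, "Pune": 80, "Ahmedabad": 50}


def roll_up(sales_by_city, level):
    """Aggregate city-level sales to the 'state' or 'country' level."""
    index = {"state": 0, "country": 1}[level]
    totals = defaultdict(int)
    for city, amount in sales_by_city.items():
        totals[CITY_HIERARCHY[city][index]] += amount
    return dict(totals)


sales_by_state = roll_up(SALES_BY_CITY, "state")
sales_by_country = roll_up(SALES_BY_CITY, "country")
print(sales_by_state)    # {'Maharashtra': 200, 'Gujarat': 50}
print(sales_by_country)  # {'India': 250}
```

Each step up the hierarchy produces fewer, coarser totals, which is the effect a GROUP BY at a higher level has in SQL.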

OLAP Operations

2. Drill-down: Move down the hierarchy
 By stepping down a concept hierarchy or by introducing a new dimension.
 The lowest level can be the detail records (drill-through).
 It navigates from less detailed data to more detailed data.
 E.g., given total sales by state, we can drill down to get total sales by city.
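A minimal Python sketch of drill-down, assuming a small set of hypothetical detail records: it starts from totals by state and then moves down to the city totals inside one state. The `DETAIL` data and the helper names `totals_by` and `drill_down` are illustrative only.

```python
# Hypothetical detail records: (state, city, sales).
DETAIL = [
    ("Maharashtra", "Mumbai", 120),
    ("Maharashtra", "Pune", 80),
    ("Gujarat", "Ahmedabad", 50),
]


def totals_by(records, key_index):
    """Sum the sales column, grouped by the column at key_index."""
    totals = {}
    for record in records:
        key = record[key_index]
        totals[key] = totals.get(key, 0) + record[2]
    return totals


def drill_down(records, state):
    """Move from a state total to the city-level totals beneath it."""
    in_state = [r for r in records if r[0] == state]
    return totals_by(in_state, key_index=1)


by_state = totals_by(DETAIL, key_index=0)
maharashtra_cities = drill_down(DETAIL, "Maharashtra")
print(by_state)            # {'Maharashtra': 200, 'Gujarat': 50}
print(maharashtra_cities)  # {'Mumbai': 120, 'Pune': 80}
```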
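OLAP Operations (continued)
3. Slice and dice: Select and project on one or more dimensions. The user can view the data from many angles.
 The slice operation fixes a single value on one dimension of a given cube, producing a sub-cube with one dimension fewer.
 Comparable to a WHERE clause with one condition in SQL.
 The dice operation selects values on two or more dimensions of a given cube and produces a new sub-cube.
 Comparable to a WHERE clause with conditions on several dimensions in SQL.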
[Figure: a data cube with "customers" and "store" among its axes, sliced on customer = “Smith”.]
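Slice and dice can be sketched as filters over the cells of a cube. The example below is a minimal illustration under assumed data: the `CUBE` cells, their sales values, and the helpers `slice_cube` and `dice_cube` are hypothetical.

```python
# Hypothetical cube cells: one dict per (city, quarter, category) combination.
CUBE = [
    {"city": "Paris", "country": "France", "quarter": "Q1", "category": "books", "sales": 21},
    {"city": "Paris", "country": "France", "quarter": "Q2", "category": "books", "sales": 27},
    {"city": "Paris", "country": "France", "quarter": "Q3", "category": "books", "sales": 26},
    {"city": "Nice", "country": "France", "quarter": "Q1", "category": "books", "sales": 12},
    {"city": "Rome", "country": "Italy", "quarter": "Q1", "category": "books", "sales": 33},
    {"city": "Milan", "country": "Italy", "quarter": "Q2", "category": "books", "sales": 24},
]


def slice_cube(cells, dimension, value):
    """Slice: fix one dimension to a single value (like a one-condition WHERE)."""
    return [c for c in cells if c[dimension] == value]


def dice_cube(cells, criteria):
    """Dice: keep cells whose value on every listed dimension is in the allowed set."""
    return [
        c for c in cells
        if all(c[dim] in allowed for dim, allowed in criteria.items())
    ]


paris_slice = slice_cube(CUBE, "city", "Paris")
france_h1 = dice_cube(CUBE, {"country": {"France"}, "quarter": {"Q1", "Q2"}})
print(len(paris_slice))  # 3
print([(c["city"], c["quarter"]) for c in france_h1])
# [('Paris', 'Q1'), ('Paris', 'Q2'), ('Nice', 'Q1')]
```

The slice restricts one dimension only, while the dice restricts two at once, so the dice result is a smaller sub-cube rather than a single face.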
OLAP Operations: Slice

 Performs a selection on one dimension of a cube, resulting in a subcube
[Figure: slice on Store.City = ‘Paris’. The original cube has the dimensions Store (City: Paris, Nice, Rome, Milan), Time (Quarter: Q1 to Q4), and Product (Category: books, CDs, games, DVDs); the slice keeps only the Paris face, a Time (Quarter) by Product (Category) table.]
OLAP Operations: Dice

 Defines a selection on two or more dimensions, thus again defining a subcube
[Figure: dice on Store.Country = ‘France’ and Time.Quarter = ‘Q1’ or ‘Q2’. From the cube with dimensions Store (City), Time (Quarter), and Product (Category), the dice keeps the cells for Paris and Nice in Q1 and Q2 across all product categories.]

OLAP Operations (continued)
4. Pivot (rotate):
 Changes the dimensional orientation of the view.
 It rotates the data axes in the view to provide an alternative presentation of the data.
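A minimal Python sketch of a pivot on a two-dimensional view, assuming hypothetical labels and values: the rows and columns swap places while the cell values stay the same. The `pivot` helper is an illustrative name.

```python
# Hypothetical 2-D view: rows are quarters, columns are product categories.
ROW_LABELS = ["Q1", "Q2"]
COLUMN_LABELS = ["books", "CDs", "games"]
VIEW = [
    [21, 10, 18],  # Q1
    [27, 14, 11],  # Q2
]


def pivot(row_labels, column_labels, values):
    """Swap the row and column axes; the cell values are unchanged."""
    rotated = [list(column) for column in zip(*values)]
    return column_labels, row_labels, rotated


new_rows, new_columns, rotated = pivot(ROW_LABELS, COLUMN_LABELS, VIEW)
print(new_rows)     # ['books', 'CDs', 'games']
print(new_columns)  # ['Q1', 'Q2']
print(rotated)      # [[21, 27], [10, 14], [18, 11]]
```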
What is a multidimensional model?
 A logical view of the enterprise
 Shows the main entities of the enterprise's business and the relationships between them
 Not tied to a physical database and tables
 Not an E-R diagram
Dimensional Model in Data Warehouse
 What is a Dimensional Model?
 A dimensional model is a data structure technique optimized for data warehousing tools.
 The concept of dimensional modelling was developed by Ralph Kimball and consists of "fact" and "dimension" tables.
 A dimensional model is designed to read, summarize, and analyze numeric information such as values, balances, counts, and weights in a data warehouse.
 In contrast, relational models are optimized for the addition, updating, and deletion of data in a real-time online transaction system.

Why do we need multidimensional modelling?
 An electronic gadget distributor, “ElectronicsForAll”, is based in Delhi, India.
 The company sells its products in the north, north-west, and western regions of India. It has sales units at Mumbai, Pune, Ahmedabad, Delhi, and Punjab.
 The President of the company wants the latest sales information to measure sales performance and to take corrective action if required.
 He has requested this information from his business analysts.

Sales Report of ElectronicsForAll

Multidimensional view of data
 Analyzing a performance measure (in this case, the number of units sold) by looking at it from various perspectives gives a multidimensional view of the data.
 Dimensional modeling is a logical design technique for structuring data so that it is intuitive to business users and delivers fast query performance.
 Dimensional modeling is the first step towards building a dimensional database, i.e. a data warehouse.
 It makes the database simpler and more understandable.

Elements of Dimensional Data Model
Fact
 Facts are the measurements or metrics from your business process. For a Sales business process, a measurement would be the quarterly sales number.
Dimension
 A dimension provides the context surrounding a business process event.
 In the Sales business process, for the fact quarterly sales number, the dimensions would be
 Who – Customer Names
 Where – Location
 What – Product Name

Question to ponder

 To better understand the link between fact (measurement) and dimension (context), let us take the example of booking an airline ticket.
 Determine the facts and dimensions in this situation.

Elements of Dimensional Data Model

Attributes
 Attributes are the various characteristics of a dimension.
 In the Location dimension, the attributes can be
 State
 Country
 Zipcode, etc.
 Attributes are used to search, filter, or classify facts.
 Dimension tables contain attributes.

Elements of Dimensional Data Model

Fact Table
 A fact table is the primary table in a dimensional model.
 A fact table contains
 Measurements/facts
 Foreign keys to the dimension tables
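The relationship between a fact row and its dimension rows can be sketched with plain Python dataclasses. This is an illustrative sketch only: the class names, key names, and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:          # dimension row
    location_key: int
    city: str
    state: str
    country: str


@dataclass(frozen=True)
class Product:           # dimension row
    product_key: int
    name: str


@dataclass(frozen=True)
class SalesFact:         # fact row
    location_key: int    # foreign key -> Location
    product_key: int     # foreign key -> Product
    units_sold: int      # measurement


locations = {1: Location(1, "Pune", "Maharashtra", "India")}
products = {10: Product(10, "Phone")}
facts = [SalesFact(location_key=1, product_key=10, units_sold=5)]

# Resolve the foreign keys to read a fact in its business context.
fact = facts[0]
description = (
    f"{fact.units_sold} x {products[fact.product_key].name} "
    f"sold in {locations[fact.location_key].city}"
)
print(description)  # 5 x Phone sold in Pune
```

The fact row itself carries only keys and the number; everything descriptive comes from the dimension rows it points to.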

Facts
 Data columns (usually numeric) that can be used to perform
calculations needed to answer business questions.
 Facts are stored in Fact Tables
 Facts can be aggregated at different levels:
[Figure: facts aggregated at Region level and at Country level]
Facts (continued)
 The same fact can be represented by different column names in the DW for various historical and design reasons.
 In the example below, the same fact has two different names: SALES and DOLLAR_SALES

 Facts are cross-dimensional, not limited to one dimension only. In the example above, the same fact crosses two dimensions: Geography and Time.
Facts (continued)

 Facts are used to create metrics.
 Metrics are business measurements (e.g. Dollar Sales, Units Sold, Gross Margin, etc.) used by businesses to analyze and report their performance.
 A metric is usually a fact that has a mathematical function applied to it (sum, average, max, min, etc.).
 More on metrics in a separate presentation
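As a small illustration of a metric being a fact with a mathematical function applied to it, the sketch below computes four metrics from one fact column. The `dollar_sales` values and the metric names are hypothetical.

```python
from statistics import mean

# Hypothetical values of a DOLLAR_SALES fact, one per fact-table row.
dollar_sales = [100.0, 250.0, 50.0, 200.0]

metrics = {
    "total_sales": sum(dollar_sales),
    "average_sale": mean(dollar_sales),
    "largest_sale": max(dollar_sales),
    "smallest_sale": min(dollar_sales),
}
print(metrics)
# {'total_sales': 600.0, 'average_sale': 150.0, 'largest_sale': 250.0, 'smallest_sale': 50.0}
```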
Elements of Dimensional Data Model

Dimension table
 A dimension table contains the dimensions of a fact.
 It is joined to the fact table via a foreign key.
 Dimension tables are de-normalized tables.
 The dimension attributes are the various columns in a dimension table.
 Dimensions offer descriptive characteristics of the facts with the help of their attributes.
 There is no set limit on the number of dimensions.
 A dimension can also contain one or more hierarchical relationships.

Five Steps of Dimensional Modelling

 Identify Business Process
 Identify Grain (level of detail)
 Identify Dimensions
 Identify Facts
 Build Schema

Identify the business process

 Identify the actual business process that the data warehouse should cover.
 This could be Marketing, Sales, HR, etc., as per the data analysis needs of the organization.
 To describe the business process, you can use plain text or Unified Modelling Language (UML).

Identify the granularity

 The grain describes the level of detail for the business problem/solution.
 Identifying the grain means identifying the lowest level of information for any table in your data warehouse.
 If a table contains sales data for every day, then it has daily granularity.
 If a table contains total sales data for each month, then it has monthly granularity.
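The difference between daily and monthly granularity can be sketched as follows. The dates and amounts are hypothetical. The sketch shows that a daily-grain table can always be summed up to monthly grain, whereas a table stored at monthly grain cannot be broken back down into days.

```python
from collections import defaultdict
from datetime import date

# Hypothetical fact rows at daily grain: (sale_date, total_sales).
daily_sales = [
    (date(2024, 1, 30), 100),
    (date(2024, 1, 31), 150),
    (date(2024, 2, 1), 70),
]

# Coarsen to monthly grain by summing over (year, month).
monthly_sales = defaultdict(int)
for sale_date, amount in daily_sales:
    monthly_sales[(sale_date.year, sale_date.month)] += amount

monthly_sales = dict(monthly_sales)
print(monthly_sales)  # {(2024, 1): 250, (2024, 2): 70}
```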

Identify the granularity (continued)
 During this stage, you answer questions like
 Do we need to store all the available products or just a few types of products?
 Do we store the product sale information on a monthly, weekly, daily or hourly basis?
Example of Grain:
 The CEO at an MNC wants to find the sales for specific products in different locations on a daily basis.
 So, the grain is "product sale information by location by the day."

Identify the dimensions
 Dimensions are nouns like date, store, inventory, etc. The dimensions hold the descriptive data. For example, the date dimension may contain data like year, month and weekday.
Example of Dimensions:
 The CEO at an MNC wants to find the sales for specific products in different locations on a daily basis.
 Dimensions: Product, Location and Time
 Attributes: For Product: Product key (the foreign key used in the fact table), Name, Type, Specifications
 Hierarchies: For Location: Country, State, City, Street Address, Name

Identify the Fact
 This step is closely tied to the business users of the system, because the facts are what they access in the data warehouse.
 Most fact table columns hold numerical values such as price or cost per unit.
Example of Facts:
 The CEO at an MNC wants to find the sales for specific products in different locations on a daily basis.
 The fact here is: sum of sales by product, by location, by time.
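The fact "sum of sales by product, by location, by time" can be sketched as a GROUP BY query. The example uses Python's built-in `sqlite3` module with an in-memory database. The `sales` table, its column names, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (product TEXT, location TEXT, sale_date TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Phone", "Delhi", "2024-01-01", 100),
        ("Phone", "Delhi", "2024-01-01", 50),
        ("Phone", "Pune", "2024-01-01", 70),
        ("Tablet", "Delhi", "2024-01-02", 30),
    ],
)

# The fact: sum of sales by product, by location, by day.
rows = conn.execute(
    """
    SELECT product, location, sale_date, SUM(amount)
    FROM sales
    GROUP BY product, location, sale_date
    ORDER BY product, location, sale_date
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
# ('Phone', 'Delhi', '2024-01-01', 150)
# ('Phone', 'Pune', '2024-01-01', 70)
# ('Tablet', 'Delhi', '2024-01-02', 30)
```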
Build Schema
 A schema is the database structure (arrangement of tables).
 There are two popular schemas:

Star Schema
 The star schema architecture is easy to design.
 It is called a star schema because its diagram resembles a star, with points radiating from a center.
 The center of the star consists of the fact table, and the points of the star are the dimension tables.

The “Classic” Star Schema

 In the star schema, the center of the star has one fact table, surrounded by a number of associated dimension tables.
 A single fact table, with detail and summary data
 The fact table's primary key has only one key column per dimension
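A star schema along these lines can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not a prescribed design: the table names (`fact_sales`, `dim_product`, `dim_location`, `dim_date`), the columns, and the rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY, city TEXT, country TEXT);
    CREATE TABLE dim_date     (date_key     INTEGER PRIMARY KEY, day TEXT, month TEXT);

    -- Fact table: one key column per dimension, plus the measure.
    CREATE TABLE fact_sales (
        product_key  INTEGER REFERENCES dim_product(product_key),
        location_key INTEGER REFERENCES dim_location(location_key),
        date_key     INTEGER REFERENCES dim_date(date_key),
        sales_amount INTEGER,
        PRIMARY KEY (product_key, location_key, date_key)
    );

    INSERT INTO dim_product  VALUES (1, 'Phone'), (2, 'Tablet');
    INSERT INTO dim_location VALUES (1, 'Delhi', 'India'), (2, 'Pune', 'India');
    INSERT INTO dim_date     VALUES (1, '2024-01-01', '2024-01'), (2, '2024-01-02', '2024-01');
    INSERT INTO fact_sales   VALUES (1, 1, 1, 100), (1, 2, 1, 70), (2, 1, 2, 30);
    """
)

# A typical star join: the fact table joined to a dimension, then summarized.
rows = conn.execute(
    """
    SELECT l.city, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_location AS l ON l.location_key = f.location_key
    GROUP BY l.city
    ORDER BY l.city
    """
).fetchall()
conn.close()
print(rows)  # [('Delhi', 130), ('Pune', 70)]
```

The composite primary key of `fact_sales` has exactly one key column per dimension, and every query reaches the descriptive attributes through a single join from the central fact table.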
