BIDW Roadmap

Author : Dave Goyal

BIDW Process Roadmap

Overall Process
 Program / Project Planning and Management
 Business Process Definition
 Technical Architecture Design
 Product Selection and Installation
 Dimensional Modeling

Overall Process…Contd.
 Physical Design
 ETL Design and Development
 BI Application Design
 BI Application Development
 Deployment
 Change Management and Maintenance

Program / Project
Planning and Management
 Define the Project
 Build the Business Case and Justification
 Plan the Project
 Manage the Project
 Manage the Program

Business Process
 Define Business Process
 Define Requirements using Interviews
 Define Requirements using Facilitated

Technical Architecture Design
 Back Room Architecture (Source , ETL)
 Presentation Server Architecture
(Dimensional Architecture)
 Front Room Architecture (BI)
 Additional Architecture Features
(Infrastructure, Metadata, Security)

Product Selection and Installation
 Architecture Plan (DW Architecture
Diagram and Application Architecture
 Product Selection (Hardware/OS, DBMS,
ETL, BI, Data Profiling, Data Cleansing

Dimensional Modeling Process
 Value Chain Business Process
 Choose the Business Process
 Declare the Grain
 Identify the Dimensions
 Identify the Facts
 Enterprise Bus Matrix
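
The modeling steps above can be recorded per business process and rolled up into a bus matrix. The sketch below is only an illustration: the class `ProcessModel`, the function `bus_matrix`, and the sample processes, grains, and fact names are all assumptions, not taken from the deck.

```python
from dataclasses import dataclass


@dataclass
class ProcessModel:
    """One business process with its declared grain, dimensions, and facts."""
    business_process: str
    grain: str
    dimensions: list
    facts: list


def bus_matrix(models):
    """Rows are business processes, columns are dimensions, cells are booleans."""
    all_dims = sorted({dim for m in models for dim in m.dimensions})
    return {
        m.business_process: {dim: dim in m.dimensions for dim in all_dims}
        for m in models
    }


# Hypothetical sample data for illustration only.
models = [
    ProcessModel("Policy Sales", "one row per policy sold",
                 ["Date", "Agent", "PolicyHolder"], ["premium_amount"]),
    ProcessModel("Claims", "one row per claim filed",
                 ["Date", "PolicyHolder"], ["claim_amount"]),
]
for process, row in bus_matrix(models).items():
    print(process, [dim for dim, used in row.items() if used])
```

Dimensions marked for more than one process (here Date and PolicyHolder) are the candidates for conformed dimensions.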

Physical Design
 High Level Physical Design
 Develop Standards
 Develop the Physical Data Model
 Develop Initial Indexing Plan
 Design OLAP Database
 Design Aggregations

ETL Table Naming Convention
 D_ : Dimension Table
 F_ : Fact Table
 S_ : Source Table - Contains all data copied
directly from a source file
 X_ : Extract Table – Contains changed
source data only. Changes may be from an
incremental extract or derived from a full

ETL Table Naming Convention 2
 C_ : Clean Table – contains source rows that
have been cleaned
 E_ : Error Table - contains error rows found in
source data
 M_ : Master table – maintains history of all clean
 T_ : Transform Table – contains the data
resulting from a transformation of source data

ETL Table Naming Convention 3
 I_ : Insert Table – contains new data to be
inserted in dimension table
 U_ : Update Table – contains changed data
to be inserted in dimension table
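
The prefixes in the naming convention can be captured in a small lookup, sketched here in Python. The names `ETL_PREFIXES` and `describe_table` are illustrative assumptions and are not part of the original deck.

```python
# Prefix -> role, as listed in the naming-convention slides.
ETL_PREFIXES = {
    "D_": "Dimension table",
    "F_": "Fact table",
    "S_": "Source table",
    "X_": "Extract table",
    "C_": "Clean table",
    "E_": "Error table",
    "M_": "Master table",
    "T_": "Transform table",
    "I_": "Insert table",
    "U_": "Update table",
}


def describe_table(table_name: str) -> str:
    """Return the role implied by a table's prefix, or raise ValueError."""
    prefix = table_name[:2].upper()
    if prefix not in ETL_PREFIXES:
        raise ValueError(f"{table_name!r} does not follow the naming convention")
    return ETL_PREFIXES[prefix]


print(describe_table("S_Agents"))  # Source table
print(describe_table("T_Agent"))   # Transform table
```

A check like this could run as part of a deployment script to flag tables that do not follow the convention.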

Data Quality
 Avoid NULL strings in dimension tables
 Specify a default value for NOT NULL
columns – ‘N/A’, ‘Not Known’, ‘Invalid’
 Dimension primary keys should be auto
generated surrogate keys. Allow data
quality rows with keys 0, -1 and -2
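
As a minimal sketch of the default-value rule, the helper below replaces missing text with one of the defaults named above. The function `clean_text` and the sample column names are hypothetical.

```python
DEFAULT_TEXT = "N/A"  # one of the defaults named on the slide


def clean_text(value, default=DEFAULT_TEXT):
    """Replace None, empty, or whitespace-only strings with a default."""
    if value is None or not str(value).strip():
        return default
    return str(value).strip()


# Hypothetical dimension row with missing attributes.
row = {"agent_name": "  Smith ", "region": None, "channel": ""}
cleaned = {col: clean_text(val) for col, val in row.items()}
print(cleaned)
# {'agent_name': 'Smith', 'region': 'N/A', 'channel': 'N/A'}
```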

Surrogate Keys
 Always use auto generated surrogate keys
for dimension keys
 Turn identity generation off (for example,
SET IDENTITY_INSERT <table> ON in SQL Server) to create
the 0, -1 and -2 key rows for each dimension
when it is created
 -1 : UNKNOWN
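
The seeding idea can be tried out with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the table and column names (`D_Agent`, `dim_key`, `natural_key`, `name`) are hypothetical, and only the -1 label is seeded because it is the only one the slide names. SQLite accepts explicit values for an auto-generated key column directly. In SQL Server the seed insert would need to be wrapped in `SET IDENTITY_INSERT <table> ON` / `OFF`.

```python
import sqlite3

# Only the -1 label appears on the slide; add the labels for 0 and -2
# as your project defines them.
SPECIAL_ROWS = {-1: "UNKNOWN"}


def create_dimension(conn, table, special_rows=SPECIAL_ROWS):
    """Create a dimension with an auto-generated key and seed special rows."""
    conn.execute(
        f"CREATE TABLE {table} ("
        "  dim_key INTEGER PRIMARY KEY AUTOINCREMENT,"
        "  natural_key TEXT NOT NULL,"
        "  name TEXT NOT NULL)"
    )
    # Explicit key values are supplied for the special rows only.
    conn.executemany(
        f"INSERT INTO {table} (dim_key, natural_key, name) VALUES (?, ?, ?)",
        [(key, label, label) for key, label in special_rows.items()],
    )


conn = sqlite3.connect(":memory:")
create_dimension(conn, "D_Agent")

# A regular row lets the database generate the surrogate key.
cur = conn.execute(
    "INSERT INTO D_Agent (natural_key, name) VALUES (?, ?)", ("A100", "Smith")
)
new_key = cur.lastrowid
print(conn.execute("SELECT name FROM D_Agent WHERE dim_key = -1").fetchone()[0])
```

Fact rows whose dimension lookup fails can then point at key -1 instead of carrying a NULL foreign key.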

ETL Design and Development
 Round Up the Requirements
 Extract Data from source (3 Steps)
 Clean and Conform Data (5 Steps)
 Delivering Data (13 Steps)
 Managing the ETL Environment (13 Steps)

ETL Roadmap

ETL Implementation Process
 Analyze data quality thoroughly and have
options available to resolve it
 Define Data source definitions
 Create High Level S2T Map
 Create Detail Level S2T Map
 Create Fact Worksheet

ETL Process…Extract
 Extract data to the S_ table (Full Load)
 Compare S_ to M_ table and load the differences into
X_ tables
 Clean the X_ table by removing duplicate rows
(de-duplication step)
 Move duplicate rows to the E_ table
 Move non-duplicate clean rows to the C_ table
 Compare C_ to M_, insert new rows into M_ and
update M_ with changed rows
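
The extract steps above can be sketched with plain Python structures standing in for the tables. Everything here is an assumption made for illustration: rows are dicts with a `key` field, M_ is a dict keyed by that field, and the first row seen for a key is treated as the clean one while later duplicates go to E_.

```python
def extract_changes(s_rows, m_table):
    """X_: rows from the full S_ load that are new or differ from M_."""
    return [r for r in s_rows if m_table.get(r["key"]) != r]


def deduplicate(x_rows):
    """Split X_ into C_ (first row per key) and E_ (later duplicates)."""
    seen, c_rows, e_rows = set(), [], []
    for row in x_rows:
        if row["key"] in seen:
            e_rows.append(row)
        else:
            seen.add(row["key"])
            c_rows.append(row)
    return c_rows, e_rows


def apply_to_master(c_rows, m_table):
    """Insert new rows into M_ and overwrite changed ones; return counts."""
    inserted = sum(1 for r in c_rows if r["key"] not in m_table)
    for row in c_rows:
        m_table[row["key"]] = row
    return inserted, len(c_rows) - inserted


# Hypothetical sample data.
m_table = {
    "A1": {"key": "A1", "name": "Smith"},
    "A3": {"key": "A3", "name": "Lee"},
}
s_rows = [
    {"key": "A1", "name": "Smith"},  # unchanged, not extracted
    {"key": "A2", "name": "Jones"},  # new
    {"key": "A2", "name": "Jones"},  # duplicate, goes to E_
    {"key": "A3", "name": "Li"},     # changed
]
x_rows = extract_changes(s_rows, m_table)
c_rows, e_rows = deduplicate(x_rows)
inserted, updated = apply_to_master(c_rows, m_table)
print(len(x_rows), len(c_rows), len(e_rows), inserted, updated)  # 3 2 1 1 1
```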

ETL Process…Transform
 Select and Transform from C_ to T_
 Compare T_ with D_ for new and changed
 Insert New rows in I_ and changed rows in
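
A rough sketch of the transform and compare steps follows. It assumes that changed rows are routed to the U_ table, as the naming convention describes, and that rows are matched on a natural key. The column names and the particular transformation (trimming and title-casing a name) are hypothetical.

```python
def transform(c_rows):
    """Select and reshape C_ rows into T_ rows (hypothetical column mapping)."""
    return [
        {"natural_key": r["agent_id"], "name": r["agent_name"].strip().title()}
        for r in c_rows
    ]


def split_new_and_changed(t_rows, d_rows):
    """Compare T_ with D_ by natural key; return (I_ rows, U_ rows)."""
    current = {row["natural_key"]: row for row in d_rows}
    i_rows, u_rows = [], []
    for row in t_rows:
        existing = current.get(row["natural_key"])
        if existing is None:
            i_rows.append(row)  # not in the dimension yet
        elif any(existing.get(col) != val for col, val in row.items()):
            u_rows.append(row)  # at least one attribute differs
    return i_rows, u_rows


# Hypothetical sample data.
c_rows = [
    {"agent_id": "A1", "agent_name": "smith "},
    {"agent_id": "A2", "agent_name": "jones-lee"},
    {"agent_id": "A3", "agent_name": "patel"},
]
d_rows = [
    {"dim_key": 1, "natural_key": "A1", "name": "Smith"},
    {"dim_key": 2, "natural_key": "A2", "name": "Jones"},
]
t_rows = transform(c_rows)
i_rows, u_rows = split_new_and_changed(t_rows, d_rows)
print([r["natural_key"] for r in i_rows], [r["natural_key"] for r in u_rows])
```

Only the columns present in the T_ row are compared, so the surrogate key stored in D_ does not count as a change.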

ETL Process…Load (I_)
 Insert rows directly into D_ table from I_
 Update rows from U_ to D_ when it is SCD 1
 Insert rows from U_ to D_ when it is SCD 2
 Please note: dimension (surrogate) keys will be
generated during the Load stage
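
The load rules can be sketched as below, assuming the update case is SCD Type 1 (overwrite in place) and the insert case is SCD Type 2 (add a new version). The `is_current` flag, the `dim_key` column, and the function `load_dimension` are assumptions for illustration. The slide only states that changed rows are inserted for SCD 2.

```python
from itertools import count


def load_dimension(d_rows, i_rows, u_rows, scd_type, next_key):
    """Apply I_ and U_ rows to D_; surrogate keys are assigned here."""
    for row in i_rows:
        d_rows.append({"dim_key": next(next_key), "is_current": True, **row})
    for row in u_rows:
        current = next(
            d for d in d_rows
            if d["natural_key"] == row["natural_key"] and d["is_current"]
        )
        if scd_type == 1:
            current.update(row)            # overwrite in place
        elif scd_type == 2:
            current["is_current"] = False  # expire the old version
            d_rows.append({"dim_key": next(next_key), "is_current": True, **row})
        else:
            raise ValueError("only SCD types 1 and 2 are sketched here")
    return d_rows


# Hypothetical sample data: one existing row, one new row, one changed row.
d_rows = [{"dim_key": 1, "natural_key": "A1", "name": "Smith", "is_current": True}]
load_dimension(
    d_rows,
    i_rows=[{"natural_key": "A2", "name": "Jones"}],
    u_rows=[{"natural_key": "A1", "name": "Smyth"}],
    scd_type=2,
    next_key=count(2),
)
for row in d_rows:
    print(row["dim_key"], row["natural_key"], row["name"], row["is_current"])
```

With `scd_type=1` the same call would leave a single A1 row whose name is overwritten.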

ETL Process… To remember
 S_ , X_ , M_ , C_ , E_ tables should be
named after source tables, such as S_Agents.
 T_ , I_ , U_ , and D_ tables should be named
after target tables, such as T_Agent,
T_PolicyHolder etc.
 Source table data sizes should follow the source
data formats, except that natural keys should be
varchar to accommodate data quality issues

High Level BIDW System
Architecture Model

BI Application Design
 Define the structure of the portal and its
 Define High Level Reporting requirements
(Dashboards, Scorecards)
 Define Analytical reporting requirements
(Cubes, Interactive reports, Ad hoc queries)
 Define Detailed reporting requirements
(Filter-based reports, Ad hoc queries)

BI Application Layers

BI Application Development
 Set up the development environment
 Set up the issue management system
 Develop all reports
 Test and Balance each report against the
source system
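
One simple way to balance a report is to compare a control total from the report with the same total computed on the source system. The sketch below assumes both totals are available as decimal strings. The function name `balance` and the `tolerance` parameter are hypothetical.

```python
from decimal import Decimal


def balance(report_total, source_total, tolerance=Decimal("0.00")):
    """Return (is_balanced, difference) for two control totals."""
    diff = abs(Decimal(report_total) - Decimal(source_total))
    return diff <= tolerance, diff


# Hypothetical totals for illustration.
ok, diff = balance("1250.50", "1250.50")
print(ok, diff)
bad, gap = balance("100.00", "99.50")
print(bad, gap)
```

`Decimal` is used instead of floats so that currency totals compare exactly.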

Deployment / Maintenance
 Design Version control system
 Define the change management process
 Define the documents to deploy changes
from Dev, Test, QA to Production
 Manage and maintain environments.
