
MODULE GUIDE
Flexible Learning A.Y. 2020-2021

MODULE 2: Data Warehouse Components and Gathering Business Requirements (4 hrs.)

Course Instructor: Alan S. Brillantes, CPA, MBA

Contact Details
FB Messenger: Alan Brillantes
Email Ad: a.brillantes@usls.edu.ph
Phone No./s: 0932-9543932

Consultation Hours: MWF 2:30-4:00 pm; TTH 8:00-10:00 am

Part I: TARGETED COURSE OUTCOMES

Demonstrate understanding of the basic concepts of data warehousing.

Learning Objectives
At the end of this module, you must be able to:

1. Describe the components of a data warehouse system and their interrelationships.
2. Discuss the process of gathering business requirements, including sub-processes and
relevant considerations.

Part II: ASSESSMENT/S

Learning Evidence

The following shall serve as evidence of your learning:

1. Accomplished Assignment
2. Accomplished Quiz

Rubric/Evaluation Tool

The following rubrics shall be utilized in evaluating and grading your work:

LE1: Accomplished Assignment

Completeness (Weight: 60%)
• Excellent: All required contents are present
• Above Average: 86-99% of required contents are present
• Average: 71-85% of required contents are present
• Passing: 50%-70% of required contents are present
• Failure: <50% of required contents are present

Substance (Weight: 40%)
• Excellent: Depth & elaboration are exemplary
• Above Average: Depth & elaboration are very good
• Average: Depth & elaboration are good
• Passing: Depth & elaboration are wanting in some parts
• Failure: Generally lacks depth & elaboration

LE2: Quiz

Number of Correct Answers
• Superior: 91-100% correct
• Above Average: 61-90% correct
• Average: 51-60% correct
• Below Average: 41-50% correct
• Poor: <40% correct

This document is a property of the University of St. La Salle Module 1 | Page 1


Unauthorized copying and / or editing is prohibited.
Part III: TEACHING-LEARNING ACTIVITIES (TLA)

What follows is a narrative discussion on “Data Warehouse Components and Gathering Business
Requirements.” At the end of this section is a learning task composed of guide questions for
you to answer to enhance your learning of the topic(s).

DATA WAREHOUSE COMPONENTS AND GATHERING BUSINESS REQUIREMENTS

CONTENTS:

A. Data Warehouse Components

o The Components
o Data Sourcing
o Data Extraction/Transformation
o Data Warehouse Database Management System
o Data Warehouse Administration
o Business Intelligence Tools
o Metadata

B. Gathering Business Requirements

o Defining Business Requirements
o Flow of Business Requirements Collection
o Data-Centric Interviews by IT
o Actual Business Requirements

Data Warehouse Components

The components of a data warehouse are data sourcing, data extraction/transformation, the data
warehouse database management system, data warehouse administration, business intelligence
tools, and metadata.

The core business processes of many organizations are becoming more dynamic and complex
because of globalization and evolving technology. Data warehousing is a system architecture,
not a software product or application. Building a data warehouse requires the integration of
many tasks and components and the coordination of the efforts of many people. Organizing the
discussion around these components identifies what is essential for defining and understanding
data warehouses and supports a methodical approach to presenting the data warehousing process.
Data Sourcing

• Sources of data are identified in the transactional legacy systems.
• Information from each source is extracted, translated, and merged with other sources before
being stored in the data warehouse.
• Cite examples of transactional legacy systems.
• “The information needs of the organization have to be identified. This in turn helps to
determine the data requirements that fulfill these information needs. These requirements are
used to develop a data model that provides business reasons for building a data warehouse.
• Information from the native format of the source is translated into the format and data
model used by the warehousing system.
Data Extraction/Transformation

The next step in building the warehouse is data preparation and data cleansing. It involves the
extraction of source data, transformation into new forms, and loading into the data warehouse
environment:

• Extraction - data is extracted from the operational systems by customized code or routines
• Transformation - data is reconciled by integrating differing formats, values, or codes
• Load - data is transferred from the operational source systems into the data warehouse for
analysis

Take note that Transformation takes up the bulk of the work in the Data Warehouse. It involves
heavy coding and may involve tuning to enable faster transformations of data.
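
The three steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production ETL tool: the source records, the field names (cust_id, gender, amount), and the gender code mapping are all made up for the example, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical rows from two operational systems that encode the same
# kind of information differently (assumed data, for illustration only).
SYSTEM_A = [{"cust_id": "001", "gender": "M", "amount": "150.00"}]
SYSTEM_B = [{"cust_id": "002", "gender": "female", "amount": "75.5"}]

# Assumed reconciliation table: every source code maps to one standard code.
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}


def extract():
    """Extraction: pull raw rows from each operational source."""
    return SYSTEM_A + SYSTEM_B


def transform(rows):
    """Transformation: reconcile differing formats, values, and codes."""
    cleaned = []
    for row in rows:
        cleaned.append((
            int(row["cust_id"]),                  # text key -> integer
            GENDER_CODES[row["gender"].lower()],  # many codes -> one code
            float(row["amount"]),                 # text -> number
        ))
    return cleaned


def load(rows, conn):
    """Load: write the reconciled rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(cust_id INTEGER, gender TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl():
    conn = sqlite3.connect(":memory:")  # stand-in for the warehouse
    load(transform(extract()), conn)
    return conn
```

Even in this toy version, most of the logic sits in the transform step, which is consistent with the note that transformation takes up the bulk of the work.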

Data Warehouse Database Management System

• A good data warehousing system needs a good database management system
• It needs to provide robust data management, scalability, high-performance query
processing, and integration with other servers

Warehouse servers can be categorized into two types:

• relational database management system - huge data storage capacity, portability, security
• multidimensional database - instant response, implementation ease, integration with
metadata” --Bhansali, N. (qtd in Strategic Data Warehousing: Achieving Alignment with Business)
Data Warehouse Administration

• Keeps the data warehouse environment working
• Required for enhancing performance and monitoring the data warehouse

Data warehouse administration provides:

• query management
• access control
• disaster recovery
• tool integration
• directory management
• security request control
• capacity planning
• data usage auditing
• user administration
Effective governance is considered a key to data warehouse success.
Business Intelligence Tools

Analysis Applications (e.g., SPSS, ILOG, Cognos):

• Decision support tools that allow end-users to analyze information with ease
• Remove the query building from the end user and make the data easily accessible
• The selection of the right end-user tool is important because the ease of use and range
of functions provided by the access tools determine the user’s perception of the value
and success of the data warehouse.
• These tools could be a set of query generation and reporting tools
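
To illustrate what removing the query building from the end user can mean, here is a minimal Python sketch of a reporting function that writes the SQL on the user's behalf. It is only an assumption about how such a tool might work internally, not a description of how SPSS, ILOG, or Cognos are implemented; the sales table, its columns, and the allowed menu choices are hypothetical.

```python
import sqlite3

# Choices the end user may pick from a menu (hypothetical names).
ALLOWED_GROUPS = {"region", "product"}
ALLOWED_MEASURES = {"amount", "quantity"}


def build_report_query(group_by, measure):
    """Turn two menu choices into SQL so the user never writes a query."""
    if group_by not in ALLOWED_GROUPS or measure not in ALLOWED_MEASURES:
        raise ValueError("unsupported report option")
    return (f"SELECT {group_by}, SUM({measure}) FROM sales "
            f"GROUP BY {group_by} ORDER BY {group_by}")


def run_report(conn, group_by, measure):
    """Generate the query, run it, and hand back the report rows."""
    return conn.execute(build_report_query(group_by, measure)).fetchall()


def demo_connection():
    """Small in-memory table standing in for the warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, product TEXT, "
                 "amount REAL, quantity INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", [
        ("North", "Pen", 10.0, 2),
        ("North", "Ink", 5.0, 1),
        ("South", "Pen", 20.0, 4),
    ])
    return conn
```

The end user only picks what to group by and what to measure; the tool takes care of the query itself.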
Metadata

• Data that is used to describe and locate other data in the data warehouse
• Plays an important role in the loading, organization, and utilization of data
• Metadata is data about the data and defines raw data.
• Before a data warehouse is accessed, it is necessary to understand what data is stored
in the data warehouse and where it is located.
• In addition to describing and locating data, metadata contains the definitions of the
databases and the relationships between data elements.
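
As a rough illustration, metadata can be pictured as a small catalog that describes each table and says where it is stored. The table names, columns, and locations below are invented for the example; an actual warehouse would keep this information in its own metadata repository.

```python
# A toy metadata catalog (all names and locations are hypothetical).
CATALOG = {
    "sales_fact": {
        "description": "One row per product sold per day",
        "location": "warehouse_db.sales_fact",
        "columns": {"product_key": "INTEGER", "date_key": "INTEGER",
                    "amount": "REAL"},
        # Relationships between data elements: column -> related table
        "relationships": {"product_key": "product_dim",
                          "date_key": "date_dim"},
    },
    "product_dim": {
        "description": "Descriptive attributes of each product",
        "location": "warehouse_db.product_dim",
        "columns": {"product_key": "INTEGER", "name": "TEXT",
                    "category": "TEXT"},
        "relationships": {},
    },
}


def locate(table):
    """Return where a table is stored, using only the metadata."""
    return CATALOG[table]["location"]


def tables_with_column(column):
    """Find every table that holds a given data element."""
    return sorted(name for name, meta in CATALOG.items()
                  if column in meta["columns"])
```

For example, tables_with_column("product_key") answers the question of which tables hold product keys without reading any of the actual data.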
Gathering Business Requirements

Defining Business Requirements

What requirements are needed?

• Basics of the business itself
• Challenges facing the organization
• Strategic plans to move the organization forward

Layers of Requirements Gathering

Multiple levels of requirements are needed to build a successful data warehouse environment.
These range from high-level strategic planning to detailed data analysis. Each level represents
a different type of information that is gathered. The figure above shows the different layers,
each progressively more detailed.
Strategic Requirements

• Provide insight into the vision and overall goals of the organization
• Focus on the big picture and look at the entire enterprise.
These requirements need to be stated at a high level rather than in fine detail. They are used
to help define the charter and scope of the data warehouse project. More information can be
found in the next section about strategic requirements.

Broad Business Themes

• Include business goals and challenges
• Issues and topics that the business group is working on
• There may be too many of these broad business requirements to tackle at once. Select
the most critical areas for further research and requirements gathering.
Questions asked that identify Broad Business Requirements

This is the basic bread-and-butter content that is needed as the foundation for design and
development of the data warehouse. It is still too soon to begin crawling through individual
reports, or tables and columns of data. A basic understanding of the business function itself is
needed first.

Explore with the interview group the challenges facing their part of the organization, how
success is measured, and what problems the group is facing. This discussion should also cover
what reporting and analytical needs exist.
Business Analysis and Business Data

Business Analysis
• Include analyses that are currently performed on a regular basis
• Determine why a report is useful

Business Data
• Supports reporting requirements and analyses

Business Analysis - Depending on the audience, there may also be discussion about analyses
that are currently performed. The key is to be able to understand why these are done and what
happens with the results.
It is also useful to be able to tie a specific business analysis to a broad business theme
discussed earlier. This helps to clarify the underlying purpose for performing the analysis.

Business Data - At this point of the project, it is most important to get a complete picture of all
of the data that would be useful to the business community. This is a time for the business
representatives to share the realities of the data used regularly and to be able to share a vision
about other data that would be valuable, if available.

Prioritization and Scope

• Opportunity to set new priorities and then modify the project charter and scope as
necessary
• Determine whether the results of the requirements gathering process so far are aligned
with the project charter and scope
• Before jumping into more detailed requirements gathering or design work, it is important
to take a moment to determine whether the results of the requirements gathering
process so far are aligned with the project charter and scope. If not, this is the
opportunity to set new priorities and then modify the project charter and scope as
necessary.
• Then, with confirmation of the original project scope or a revised project scope, more
detailed requirements can be gathered. These are typically done in later stages of a data
warehouse project, but are included here to show a complete picture of all the
requirements.
Specific Reports, Calculations, and Screens and Actual Data Sources

Specific reports, calculations, and screens (business analysis) - the part of the solution that
must be defined to support the business analyses specified earlier.

Actual data sources (business data) - selected to be the source for the business data that must
be included, as defined by the business analyses.

All of these requirements must be collected for the first data warehouse project. The project
charter and scope will help determine the group of people needed to provide requirements
input. The project scoping and prioritization step will further narrow the focus for this project.

Flow of Business Requirements Collection

This includes:

• Launch
• Interview Flow
• Wrap-up

It’s time to sit down face to face to collect the business requirements. The process usually flows
from an introduction through structured questioning to a final wrap-up.

Launch

• Responsibility for introducing the interview should be established prior to gathering in a
conference room
• Focus on the project and interview objectives; do not ramble on about the hardware,
software, and other technical jargon
• The designated kickoff person should script the primary points to be conveyed in the first
couple of minutes, when the tone of the interview meeting is set. The introduction should
convey a crisp, business-centric message.

Interview Flow
• IT people will try to meet you on your own business turf
• They will ask about your key performance metrics
• How business people track progress and success translates directly into the dimensional
model
The objective of an interview is to get business users to talk about what they do and why they
do it. A simple, nonthreatening place to begin is to ask about their job responsibilities and
organizational fit. This is a lob ball that interviewees can respond to easily. From there, we
typically ask about their key performance metrics.
Such questions as
• ‘How do you distinguish between products (or agents, providers, or facilities)?’
• ‘How do you naturally categorize products?’
help identify key dimension attributes and hierarchies.
Understanding the nature of these analyses and whether they are ad hoc or standardized
provides input into the data access tool requirements.
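
The link between interview answers and the dimensional model can be sketched as follows. This is a simplified assumption about how a team might record its findings: the metric, dimension, and attribute names are invented, and real dimensional modeling involves far more analysis than this mapping.

```python
from dataclasses import dataclass, field


@dataclass
class Dimension:
    name: str
    # Attributes ordered from broadest to most detailed form a hierarchy.
    hierarchy: list = field(default_factory=list)


@dataclass
class DimensionalModel:
    facts: list = field(default_factory=list)       # measurable metrics
    dimensions: list = field(default_factory=list)  # ways to slice them


def model_from_interview(metrics, categorizations):
    """Key performance metrics become facts; the ways interviewees
    distinguish and categorize things become dimensions."""
    model = DimensionalModel(facts=list(metrics))
    for name, levels in categorizations.items():
        model.dimensions.append(Dimension(name, list(levels)))
    return model


# Hypothetical interview notes.
model = model_from_interview(
    metrics=["sales_amount", "units_sold"],
    categorizations={"product": ["department", "category", "brand"]},
)
```

Here the answer to "How do you naturally categorize products?" is captured as the hierarchy of the product dimension.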
Wrap Up

• The interviewer will ask about the success criteria for the project
• Each criterion should be measurable
• Try to articulate specifics
• Take advantage of this opportunity to manage expectations.
• As the interview is coming to a conclusion, we ask each interviewee about his or her
success criteria for the project. Of course, each criterion should be measurable. “Easy to
use” and “fast” mean something different to everyone, so you should get the interviewees
to articulate specifics, such as their expectations regarding the amount of training
required to run a predefined report.
• At this point in the interview we make a broad disclaimer. The interviewees must
understand that just because we discussed a capability in the meeting doesn’t guarantee
that it’ll be included in the first phase of the project.
Data-Centric Interviews by IT

• They will intersperse sessions with the source system data gurus or subject matter
experts
• They will evaluate the feasibility of supporting the business needs
• While we’re focused on understanding the requirements of the business, it is helpful to
intersperse sessions with the source system data gurus or subject matter experts to
evaluate the feasibility of supporting the business needs. These data-focused interviews
are quite different from the ones just described.
• The goal is to confirm that the necessary core data exists before momentum builds
behind the requirements.
Ground Rules for Effective Interviewing

• Establish a peer basis with the interviewee; use the interviewee's vocabulary
• Remember your interview role; listen & absorb like a sponge
• Strive for a conversational flow
• Verify communication and try to capture business terminology
Interviewing Tips for Uncovering Business Requirements

• Be curious, but not too smart
• Be conversational
• Listen and expect to be changed

If you don't learn what the business really needs from the data warehouse, you can't
provide it.

Business acceptance is the most critical measure of DW/BI success. If the business doesn't
embrace the DW/BI deliverables to support its decision making processes, then the DW/BI
initiative is an exercise in futility.

Actual Business Requirements

This is what IT people want to uncover from Business users. It is best to get familiarized with
these requirements to maximize the potential of the interview process.

Discussion is focused on the actual business requirements of an organization. These are
categorized into 8 areas:

• Business Needs
• Data Quality
• Security
• Data Integration
• Data Latency
• Archiving and Lineage
• User Delivery
• Skills

Business Needs

• Users' information requirements
• Directly drive the choice of data sources and their subsequent transformation in the ETL
system
• Must be understood and carefully examined by the ETL team
• Gather and understand all the known requirements, realities, and constraints affecting
the ETL system. The list of requirements can be pretty overwhelming, but it's essential to
lay them on the table before launching into the development of your ETL system.
• Typical due diligence requirements for the data warehouse include the following:
o Saving archived copies of data sources and subsequent data staging
o Providing proof of the complete transaction flow that changed any data results
o Fully documenting algorithms for allocations, adjustments, and derivations
o Supplying proof of security of the data copies over time, both online and offline
• Some compliance issues will be outside the scope of the data warehouse system, but many
others will land squarely within its boundaries.
• Changing legal and reporting requirements have forced many organizations to seriously
tighten their reporting and provide proof that the reported numbers are accurate, complete,
and have not been tampered with.
Data Quality

Three powerful forces have converged to put data quality concerns near the top of the list for
executives:

• First, the long-term cultural trend that says "if only I could see the data, then I could
manage my business better" continues to grow; today every knowledge worker believes
instinctively that data is a crucial requirement for them to function in their jobs.
• Second, most organizations understand that their data sources are profoundly distributed,
typically around the world, and that effectively integrating myriad disparate data sources
is required.
• Third, the sharply increased demands for compliance mean that careless handling of data
will not be overlooked or excused.

Security

The basic rhythms of the data warehouse are at odds with the security mentality.
• The data warehouse seeks to publish data widely to decision makers
• Security interests assume that data should be restricted to those with a need to know.
• Security awareness has increased significantly in the past few years across IT, but often
remains an afterthought and an unwelcome burden to most data warehouse teams.
• During the requirements roundup, the team should seek clear guidance from senior
management as to what aspects of the data warehouse carry extra security sensitivity.
Compliance requirements are likely to overlap security requirements; it may be wise to
combine these two topics during the requirements round up.
Data Integration

• Aims to make all systems work together seamlessly
• Must take place among the organization's primary transaction systems before data arrives
at the data warehouse
• Data integration is a huge topic for IT because, ultimately, it aims to make all systems work
together seamlessly. The "360 degree view of the enterprise" is a familiar name for data
integration. In many cases, serious data integration must take place among the
organization's primary transaction systems before data arrives at the data warehouse.
• Data integration usually takes the form of conforming dimensions and conforming facts in
the data warehouse. Conforming dimensions means establishing common dimensional
attributes across business processes so that "drill across" reports can be generated using
these attributes. Conforming facts means making agreements on common business metrics
such as key performance indicators (KPIs) across separated databases so that these
numbers can be compared mathematically by calculating differences and ratios.
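
A small worked example may make conforming dimensions and facts more concrete. In this sketch, two separate business processes (sales and returns) share one conformed product dimension, so their results can be lined up by the common category attribute and compared with a ratio. All table contents and names are hypothetical.

```python
from collections import defaultdict

# Conformed dimension: both processes use the same product keys and
# the same "category" attribute (hypothetical data).
PRODUCT_DIM = {1: "Stationery", 2: "Stationery", 3: "Electronics"}

# Fact rows from two separate databases: (product_key, amount).
SALES_FACTS = [(1, 100.0), (2, 50.0), (3, 400.0)]
RETURNS_FACTS = [(1, 15.0), (3, 40.0)]


def total_by_category(facts):
    """Aggregate one process's facts by the conformed attribute."""
    totals = defaultdict(float)
    for product_key, amount in facts:
        totals[PRODUCT_DIM[product_key]] += amount
    return dict(totals)


def drill_across():
    """Line up both processes on category and compute a return ratio."""
    sales = total_by_category(SALES_FACTS)
    returns = total_by_category(RETURNS_FACTS)
    return {
        category: {
            "sales": sales[category],
            "returns": returns.get(category, 0.0),
            "return_ratio": returns.get(category, 0.0) / sales[category],
        }
        for category in sorted(sales)
    }
```

The ratio is only meaningful because both fact tables agree on what a product category is and measure amounts in the same way; without that agreement, the two totals could not be compared.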
Data Latency

• Describes how quickly source system data must be delivered to the business users via the
system
• Processing algorithms, parallelization, and potent hardware can speed up traditional batch-
oriented data flows
• Data latency obviously has a huge effect on the ETL architecture. Clever processing
algorithms, parallelization, and potent hardware can speed up traditional batch-oriented
data flows.
• But at some point, if the data latency requirement is sufficiently urgent, the ETL system's
architecture must convert from batch to streaming oriented. This switch isn't a gradual or
evolutionary change; it's a major paradigm shift in which almost every step of the data
delivery pipeline must be re-implemented.
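
The difference between the two styles can be pictured with a short sketch. It is a deliberately simplified assumption: the order records and function names are invented, and real streaming systems rely on dedicated infrastructure rather than a Python generator.

```python
def batch_load(orders):
    """Batch style: wait until the whole set of records is collected,
    then process it in one run. Results are available only afterward."""
    return sum(amount for _, amount in orders)


def streaming_load(order_stream):
    """Streaming style: update the result as each record arrives,
    so a current figure is available at any moment."""
    running_total = 0.0
    for _, amount in order_stream:
        running_total += amount
        yield running_total  # latest figure after every record


# Hypothetical (order_id, amount) records.
ORDERS = [("A1", 10.0), ("A2", 25.0), ("A3", 5.0)]
```

Both functions reach the same final total. What changes is when intermediate results become visible, and that difference in timing is why moving from batch to streaming touches almost every step of the pipeline.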
Archiving and Lineage

• Staged (archived) data is needed either for comparison with new data to generate change
capture records or for reprocessing
• Each staged data set should have accompanying metadata describing the origins and
processing steps that produced the data
• Tracking this lineage is explicitly required by certain compliance requirements, but should
be part of every archiving scenario
• Staging the data (writing it to a storage medium) is recommended after each major activity
of the ETL pipeline: after it's been extracted, cleaned and conformed, and delivered.
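
The idea of staging data after each major ETL activity, together with lineage metadata, can be sketched in Python. This is a minimal illustration under stated assumptions: a plain list stands in for durable storage, the source name and the cleaning rule are hypothetical, and the SHA-256 checksum is just one possible way to show later that an archived copy has not changed.

```python
import hashlib
import json
from datetime import datetime, timezone


def stage(data, step, source, archive):
    """Write a copy of the data to the archive after an ETL step,
    together with metadata describing its origin and processing."""
    payload = json.dumps(data, sort_keys=True)
    archive.append({
        "step": step,      # which ETL activity produced this copy
        "source": source,  # where the data came from
        "staged_at": datetime.now(timezone.utc).isoformat(),
        # A checksum lets anyone later verify the copy is unchanged.
        "sha256": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
        "payload": payload,
    })


def is_untampered(entry):
    """Recompute the checksum and compare it with the recorded one."""
    digest = hashlib.sha256(entry["payload"].encode("utf-8")).hexdigest()
    return digest == entry["sha256"]


def run_pipeline(raw_rows, source):
    archive = []  # stand-in for durable storage (assumption)
    stage(raw_rows, "extracted", source, archive)
    # Hypothetical cleaning rule: trim and standardize the name field.
    cleaned = [{**row, "name": row["name"].strip().title()}
               for row in raw_rows]
    stage(cleaned, "cleaned_and_conformed", source, archive)
    stage(cleaned, "delivered", source, archive)
    return archive
```

Because a copy is kept at every stage, an earlier step can be reprocessed or compared against new data without going back to the source system.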
User Delivery

• The final step for the ETL system is the handoff to the analytics applications.
• The ETL team, working closely with the modeling team, must take responsibility for the
content and structure of the data that make the analytics applications simple and fast
• The ETL team and data modelers need to work closely with the analytics application
developers to determine the exact requirements for the data handoff
• It's irresponsible to hand off data to the analytics application in such a way as to increase
the complexity of the application, slow down the query or report creation, or make the data
seem unnecessarily complex to the business users.
• Each analytics tool has certain sensitivities that should be avoided and certain
features that can be exploited if the physical data is in the right format
Available Skills

• Some ETL system design decisions must be made on the basis of available
resources to build and manage the system.
• It is not advisable to go in an unfamiliar direction without seriously considering the
decision's long-term implications
• You shouldn't build a system that depends on critical C++ processing modules if
those programming skills aren't in house or can't be reasonably acquired. Likewise, you
may be much more confident in building your ETL system around a major vendor's ETL tool
if you already have those skills in house and know how to manage such a project.
• You need to consider the big decision of whether to hand code your ETL system
or use a vendor's package.

LEARNING TASK

Answer the following questions in your OWN words. DO NOT copy verbatim from the material
given:

1. Enumerate the components of a data warehouse system. Briefly describe and give the
significance of each one.

2. Briefly discuss the process of gathering business requirements. Highlight key
considerations in the process.

Submit your answers to the instructor via Canvas.
