
CHAPTER 1

REVIEW QUESTIONS

1. What do we mean by strategic information? For a commercial bank, name five types of
strategic objectives.
Strategic information refers to data critical for long-term planning and decision-making. For a
commercial bank, key strategic objectives include risk management, customer retention,
technological innovation, regulatory compliance, and financial sustainability.

2. Do you agree that a typical retail store collects huge volumes of data through its operational
systems? Name three types of transaction data likely to be collected by a retail store in large
volumes during its daily operations.

Yes. A typical retail store gathers large volumes of data through its operational systems.
Three types of transaction data collected in large volumes are:

Sales Transactions: Recording items sold, prices, and quantities.

Inventory Transactions: Tracking stock movements, restocking, and depletion.

Customer Transactions: Capturing purchase history, preferences, and loyalty program activities.

3. Describe five differences between operational systems and informational systems.

a. Purpose:

Operational Systems: Primarily focused on day-to-day transactions and core business
operations.
Informational Systems: Geared towards analyzing and presenting data to support
decision-making.

b. Data Usage:

Operational Systems: Deal with real-time, transactional data for immediate processing.
Informational Systems: Handle historical, aggregated data for reporting and analysis.

c. Time Horizon:

Operational Systems: Emphasize the present and immediate future.
Informational Systems: Focus on past and present data and on projecting future trends.

d. User Focus:
Operational Systems: Used by front-line employees for routine tasks.
Informational Systems: Accessed by managers and analysts for strategic planning and insights.

e. Processing Speed:

Operational Systems: Require fast and efficient real-time processing.


Informational Systems: Can tolerate longer processing times for complex analytical tasks.
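
As a rough illustration of the contrast above, the sketch below runs both kinds of access against one small table using Python's built-in sqlite3 module. The table name, columns, and rows are invented for the example and do not come from the textbook; in practice the two workloads would normally run on separate systems.

```python
import sqlite3

# Hypothetical in-memory sales table; names and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "order_id INTEGER PRIMARY KEY, sale_date TEXT, product TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, "2023-01-15", "widget", 20.0),
        (2, "2023-01-20", "gadget", 35.0),
        (3, "2023-02-03", "widget", 20.0),
        (4, "2023-02-18", "widget", 40.0),
    ],
)

# Operational-style access: fetch one current transaction by its key.
single_order = conn.execute(
    "SELECT product, amount FROM sales WHERE order_id = ?", (3,)
).fetchone()

# Informational-style access: summarize history by month to expose a trend.
monthly_totals = conn.execute(
    "SELECT substr(sale_date, 1, 7) AS month, SUM(amount) "
    "FROM sales GROUP BY month ORDER BY month"
).fetchall()

print(single_order)
print(monthly_totals)
```

The first query touches a single row and must return immediately; the second scans the whole history and returns a summary, which is the kind of request a manager or analyst would make.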

4. Why are operational systems not suitable for providing strategic information? Give three
specific reasons and explain.

The design principles and focus of operational systems (immediate, detailed transactions) make
them less suitable for providing the broader, historical, and analytical information needed for
strategic decision-making.

1. Focus on Real-Time Processing:

Operational systems prioritize real-time processing to support immediate operational needs. In
contrast, strategic decisions often require a broader and historical perspective, which may not
align with the real-time nature of operational systems.

2. Granularity of Data:

Operational systems capture detailed transactional data at a granular level. While this
granularity is essential for operational tasks, it may be overwhelming for strategic
decision-makers who need a more aggregated and summarized view to identify long-term
trends and patterns.

3. Lack of Analytical Capabilities:

Operational systems are optimized for transaction processing and may lack advanced analytical
capabilities. Strategic decision-making often involves complex analysis, forecasting, and trend
identification, which may go beyond the capabilities of operational systems.

5. Name six characteristics of the computing environment needed to provide strategic
information.

● Scalability: Ability to handle large data volumes.

● Data Integration: Correlating information from diverse sources.

● Advanced Analytics: Support for data mining and predictive modeling.


● Security Measures: Robust protection against unauthorized access.

● Historical Data Storage: Efficient retention and retrieval of historical data.

● User-Friendly Interface: Intuitive tools for non-technical users.

6. What types of processing take place in a data warehouse? Describe.

● Extract:

- Description: Data is extracted from various sources.

● Transform:

- Description: Data is cleaned, formatted, and transformed for consistency.

● Load:

- Description: Processed data is loaded into the data warehouse.

● Query and Analysis:

- Description: Users can run queries and perform analytical tasks on the stored data.

These processes collectively ensure data integration, quality, and accessibility in a data
warehouse.
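
The four kinds of processing listed above can be sketched end to end in a few lines. This is only a minimal illustration using Python's standard csv and sqlite3 modules; the source data, the sales_fact table, and the function names are assumptions made for the example, not a prescribed design.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this would come from an operational system.
SOURCE_CSV = """customer,amount,sale_date
 alice ,10.50,2023-03-01
BOB,7.25,2023-03-01
alice,,2023-03-02
"""

def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim and normalize names, drop rows with no amount."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (row["customer"].strip().lower(), float(row["amount"]), row["sale_date"])
        )
    return cleaned

def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact "
        "(customer TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)

warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(SOURCE_CSV)))

# Query and analysis: total sales per customer.
totals = warehouse.execute(
    "SELECT customer, SUM(amount) FROM sales_fact GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

Keeping extract, transform, and load as separate functions mirrors the separation of the steps in the answer and makes each one testable on its own.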

7. A data warehouse is an environment, not a product. Discuss.


- A data warehouse is not just a standalone product but an encompassing environment
that includes hardware, software, and processes working together to facilitate centralized
storage, integration, and analysis of data for business intelligence purposes.
CHAPTER EXERCISES

1. The current trends in hardware/software technology make data warehousing feasible. Explain
with some examples how exactly technology trends do help.
- Advancements in hardware and software technology make data warehousing feasible by
improving storage capacity, processing speed, and analytical capabilities. For instance,
the use of high-performance databases, distributed computing, and cloud computing
allows for the efficient handling of large volumes of data. Additionally, innovations in data
compression and in-memory processing enhance the speed of data retrieval and
analysis. Machine learning algorithms and artificial intelligence further contribute to
automated insights, making data warehousing more powerful and accessible for
decision-makers.

2. For an airline company, how can strategic information increase the number of frequent flyers?
Discuss giving specific details.

Strategic information can help an airline increase the number of frequent flyers through:

● Segmentation: Targeting high-value customer segments.


● Personalized Marketing: Tailoring promotions based on passenger data.
● Route Optimization: Adding more flights on popular routes.
● Loyalty Program Enhancement: Improving benefits and introducing tiered rewards.
● Predictive Analytics: Anticipating travel patterns for targeted promotions.
● Feedback Analysis: Addressing customer concerns to enhance overall experience.

3. You are a senior analyst in the IT department of a company manufacturing automobile parts.
The marketing VP is complaining about the poor response by IT in providing strategic
information. Draft a proposal to him introducing the concept of business intelligence and how
data warehousing and analytics as part of business intelligence for your company would be the
optimal solution.

Here is my draft proposal:

Business Intelligence Overview:


Business Intelligence involves the collection, analysis, and presentation of business data to
facilitate informed decision-making. It transforms raw data into actionable insights, aligning IT
efforts with strategic business goals.

Key Components:

1. Data Warehousing:

Purpose: Centralized storage for integrated and historical data.


Benefits: Streamlined data access, improved accuracy, and enhanced data consistency.
2. Analytics:

Purpose: Extract meaningful patterns and trends from data.


Benefits: Informed decision-making, predictive modeling, and data-driven strategies.

Optimal Solution for Our Company:


Implementing BI, specifically data warehousing and analytics, will revolutionize how we access
and utilize information for strategic planning. Here's why it's the optimal solution:

a. Holistic View:

BI provides a consolidated and comprehensive view of business data, ensuring that
decision-makers have access to relevant information from various departments.

b. Timely Insights:

With analytics, we can move beyond raw data to gain timely insights, enabling quicker
responses to market trends and customer behaviors.

c. Strategic Planning:

BI supports strategic planning by offering predictive analytics, helping us anticipate market shifts
and optimize our operations accordingly.

d. User-Friendly Dashboards:

Intuitive dashboards make it easy for non-technical users, including yourself, to interact with and
extract valuable insights from the data.

e. Improved Collaboration:

BI fosters collaboration between departments by providing a common platform for data analysis
and reporting.
CHAPTER 2

REVIEW QUESTIONS
1. Name at least six characteristics or features of a data warehouse.

1. Subject-Oriented: Organized around specific business subjects.


2. Integrated: Consolidates data from diverse sources.
3. Time-Variant: Stores historical data for trend analysis.
4. Non-volatile: Data remains unchanged once loaded.
5. Decision Support: Supports querying and reporting for decision-making.
6. Scalable: Capable of handling large and growing data volumes.

2. Why is data integration required in a data warehouse, more so than in an operational
application?

Data integration is more critical in a data warehouse than in an operational application because
it harmonizes diverse data sources, ensures consistency for historical analysis, and provides a
unified foundation for centralized and informed decision-making.

3. Every data structure in the data warehouse contains the time element. Why?

- Every data structure in the data warehouse contains the time element to facilitate
historical analysis and track changes over different periods.
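
To see why the time element matters, consider a minimal sketch in which each stored value carries an effective date, so an analyst can ask what the value was at any point in the past. The price_history data and the price_as_of helper are hypothetical and exist only for this illustration.

```python
# Hypothetical price history for one product: (effective_date, price).
price_history = [
    ("2023-01-01", 9.99),
    ("2023-06-01", 10.99),
    ("2024-01-01", 11.49),
]

def price_as_of(history, date):
    """Return the latest price effective on or before the given ISO date.

    ISO-formatted date strings sort chronologically, so plain string
    comparison is enough here.
    """
    applicable = [price for effective, price in sorted(history) if effective <= date]
    return applicable[-1] if applicable else None

print(price_as_of(price_history, "2023-08-15"))
print(price_as_of(price_history, "2022-12-31"))
```

Without the effective date on each record, only the current price would be known and the historical question could not be answered.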

4. Explain data granularity and how it is applicable to the data warehouse.

- Data granularity refers to the level of detail or specificity in the data, and in the data
warehouse, it determines the depth of information stored, allowing for precise analysis
and reporting at various levels of detail.
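
A small sketch may make granularity concrete. Assuming the warehouse keeps sales at the daily grain, coarser views such as monthly totals can always be derived from it, while the reverse is not possible. The records and the roll_up_to_month function are invented for the example.

```python
from collections import defaultdict

# Hypothetical daily-grain sales records: (date, product, units).
daily_sales = [
    ("2023-04-01", "widget", 3),
    ("2023-04-02", "widget", 5),
    ("2023-04-02", "gadget", 2),
    ("2023-05-01", "widget", 4),
]

def roll_up_to_month(records):
    """Summarize daily records to the coarser (month, product) grain."""
    totals = defaultdict(int)
    for date, product, units in records:
        totals[(date[:7], product)] += units
    return dict(totals)

monthly_sales = roll_up_to_month(daily_sales)
print(monthly_sales)
```

The trade-off is storage and load time: the finer the grain that is kept, the more rows the warehouse must hold, but the more questions it can answer.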

5. How are the top-down and bottom-up approaches for building a data warehouse different?
List the major types of architectures and highlight the features of any two of these.

Answers:

The top-down approach first builds an enterprise-wide data warehouse from an overall
architecture and then derives data marts for specific business areas from it, while the
bottom-up approach builds individual data marts first and then integrates them into a
comprehensive data warehouse.
Data Warehouse Architectures:

Kimball Architecture:

Features: Follows a bottom-up approach, emphasizing data marts for specific business areas;
promotes rapid development and deployment.

Inmon Architecture:

Features: Top-down approach with a focus on a centralized data warehouse; ensures
consistency across the organization, supports complex queries, and provides a comprehensive
view of data.

6. What are the various data sources for the data warehouse?

- Data sources for the data warehouse include operational databases, external data from
vendors, spreadsheets, legacy systems, and other structured and unstructured data
repositories.

7. Why do you need a separate data staging component?

- A separate data staging component is needed in a data warehouse to preprocess, clean,
and integrate data from diverse sources before loading it into the warehouse, ensuring
data quality and consistency.

8. Under data transformation, list five different functions you can think of.
- Data transformation functions include filtering to extract relevant data, aggregation for
summarizing information, merging to combine datasets, cleansing to standardize and
correct data, and derivation to create new variables or calculations.
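
The five functions named above can be shown on a toy dataset. This is a sketch in plain Python under invented assumptions: the orders and customers structures and their field names are hypothetical, and a real staging area would use an ETL tool or SQL rather than in-memory lists.

```python
# Hypothetical source rows; field names are invented for illustration.
orders = [
    {"order_id": 1, "cust": " C1 ", "qty": 2, "unit_price": 5.0, "status": "shipped"},
    {"order_id": 2, "cust": "c2", "qty": 1, "unit_price": 8.0, "status": "cancelled"},
    {"order_id": 3, "cust": "C1", "qty": 4, "unit_price": 5.0, "status": "shipped"},
]
customers = {"C1": "North", "C2": "South"}  # customer id -> region

# Filtering: keep only the relevant rows.
shipped = [o for o in orders if o["status"] == "shipped"]

# Cleansing: standardize the customer id.
for o in shipped:
    o["cust"] = o["cust"].strip().upper()

# Derivation: compute a new field from existing ones.
for o in shipped:
    o["revenue"] = o["qty"] * o["unit_price"]

# Merging: attach the region from a second dataset.
for o in shipped:
    o["region"] = customers[o["cust"]]

# Aggregation: summarize revenue by region.
revenue_by_region = {}
for o in shipped:
    revenue_by_region[o["region"]] = revenue_by_region.get(o["region"], 0.0) + o["revenue"]

print(revenue_by_region)
```

Note that cleansing has to run before merging here, since the lookup on customer id only succeeds once the ids are in a standard form.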

9. Name any six different methods for information delivery.


- Six methods for information delivery include dashboards for visual representation,
reports for detailed analysis, alerts for real-time notifications, portals for centralized
access, mobile apps for on-the-go access, and data exports for external usage.

10. What are the three major types of metadata in a data warehouse? Briefly mention the
purpose of each type.

The three major types of metadata in a data warehouse are:

1. Operational Metadata:
Purpose: Describes the execution and operation of processes within the data warehouse,
providing insights into system performance and resource utilization.

2. Structural Metadata:

Purpose: Defines the structure and organization of the data, including tables, relationships, and
data types, ensuring proper integration and understanding of the data schema.

3. Business Metadata:

Purpose: Offers business context to the data, including definitions, business rules, and user
annotations, aiding in the comprehension and meaningful utilization of data by business users.

CHAPTER EXERCISES
1. A data warehouse is subject-oriented. What would be the major critical business subjects for
the following companies?
a. an international manufacturing company
b. a local community bank
c. a domestic hotel chain

2. You are the data analyst on the project team building a data warehouse for an insurance
company. List the possible data sources from which you will bring the data into your data
warehouse. State your assumptions.

3. For an airlines company, identify three operational applications that would feed into the data
warehouse. What would be the data load and refresh cycles?

4. Prepare a table showing all the potential users and information delivery methods for a data
warehouse supporting a large national grocery chain
CHAPTER 3

REVIEW QUESTIONS

1. State any three factors that indicate the continued growth in data warehousing and
business intelligence. Can you think of some examples?

Three factors indicating continued growth in data warehousing and business intelligence include
increasing data volumes from various sources, advancements in analytics technologies, and the
rising demand for data-driven decision-making. Examples include the proliferation of IoT
(Internet of Things) devices generating massive datasets, the evolution of machine learning for
predictive analytics, and businesses adopting BI (Business Intelligence) tools to gain actionable
insights from their data.

2. Why do data warehouses continue to grow in size, storing huge amounts of data? Give
any three reasons.

Data warehouses continue to grow in size due to the increasing volume of data generated by
various sources, the need to store historical data for trend analysis and compliance, and the
growing emphasis on detailed and granular data for more accurate and comprehensive
analysis.

3. Why is it important to store multiple types of data in the data warehouse? Give
examples of some non structured data likely to be found in the data warehouse of a
health management organization (HMO).

Storing multiple types of data in a data warehouse is important to provide a holistic view and
support diverse analyses. In the data warehouse of a Health Management Organization (HMO),
non-structured data like patient feedback comments, social media mentions related to
healthcare trends, and unstructured clinical notes from healthcare providers are examples that
offer valuable insights when combined with structured data.

4. What is meant by data fusion? Where does it fit in data warehousing?

Data fusion refers to the process of integrating diverse data from multiple sources to create a
unified and comprehensive dataset. In data warehousing, data fusion occurs during the data
integration phase, where various data sources are combined and harmonized to provide a
cohesive and meaningful view for analysis and reporting.
CHAPTER EXERCISES

1. Indicate if true or false:

True A. Data warehousing helps in customized marketing.


True B. It is as important to include unstructured data as structured data in a data warehouse.
False C. Dynamic charts are themselves user interfaces.
False D. MPP is a shared-memory parallel hardware configuration.
False E. ERP systems may be substituted for data warehouses.
True F. Most of a corporation’s knowledge base contains unstructured data.
False G. The traditional data transformation tools are quite adequate for a CRM-ready data
warehouse.
True H. Metadata standards facilitate deploying a combination of best-of-breed products.
False I. MDAPI is a data fusion standard.
False J. A Web-enabled data warehouse stores only the clickstream data captured at the
corporation’s Web site.

2. As the senior analyst on the data warehouse project of a large retail chain, you are
responsible for improving data visualization of the output results. Make a list of your
recommendations.

1. Implement Interactive Dashboards:

● Utilize tools that allow users to interact with and customize dashboards for personalized
insights.

2. Visualize Key Performance Indicators (KPIs):

● Focus on visual representations of crucial KPIs for quick and easy monitoring.

3. Utilize Heat Maps and Geographic Visualization:

● Implement heat maps for data density representation and geographical visualization to
analyze regional performance.

4. Incorporate Trend Analysis:

● Introduce trend lines and charts to highlight historical patterns and forecast future trends.

5. Apply Drill-Down and Drill-Up Functionality:

● Enable users to drill down into detailed data or drill up for a broader perspective based
on their analysis needs.
6. Use Data Storytelling Techniques:

● Craft narratives around data to enhance understanding and decision-making.

7. Ensure Mobile Responsiveness:

● Optimize visualizations for mobile devices to enable on-the-go access.

8. Implement Data Annotations:

● Include annotations on visualizations to provide context and explanations for key data
points.

9. Leverage Advanced Visualization Techniques:

● Explore advanced visualization techniques such as Sankey diagrams, tree maps, and
network graphs for complex relationships.

10. Prioritize User Training:

● Conduct training sessions to ensure users understand how to interpret and derive
insights from visualizations effectively.

3. Explain how and why parallel processing can improve performance for data loading and index
creation.

Parallel processing improves performance for data loading and index creation by dividing
the tasks into parallel threads or processes that can be executed simultaneously. This is
beneficial for two main reasons:

1. Speed and Efficiency:

Parallel processing allows multiple data chunks to be loaded or indexed concurrently,
significantly reducing the overall processing time. This parallel execution enhances
efficiency and expedites the completion of these resource-intensive tasks.

2. Resource Utilization:

By leveraging multiple processors or nodes concurrently, parallel processing optimizes
resource utilization. This is particularly advantageous when dealing with large datasets
or when creating complex indexes, as it maximizes the use of available computing
resources.

Parallel processing improves performance by distributing the workload across multiple
threads or processes, leading to faster data loading and index creation through
enhanced speed, efficiency, and resource utilization.
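
The idea of dividing a load into chunks that run concurrently can be sketched with Python's standard concurrent.futures module. This is an illustration only: load_chunk is a hypothetical stand-in for the real load step, and a production warehouse would normally rely on the parallel load and index utilities of its DBMS. Threads in CPython mainly help when the work is I/O-bound, as loading usually is.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(rows, chunk_size):
    """Partition the input so each worker gets an independent slice."""
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]

def load_chunk(chunk):
    """Hypothetical stand-in for loading one chunk; returns rows processed."""
    staged = [(row_id, value * 2) for row_id, value in chunk]
    return len(staged)

rows = [(i, i * 1.5) for i in range(1000)]
chunks = split_into_chunks(rows, 250)

# Each chunk is handed to a separate worker; results come back in chunk order.
with ThreadPoolExecutor(max_workers=4) as pool:
    loaded_counts = list(pool.map(load_chunk, chunks))

total_loaded = sum(loaded_counts)
print(total_loaded)
```

The approach works because the chunks are independent of one another; the same partitioning idea applies when separate index segments are built concurrently.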

4. Discuss three specific ways in which agent technology may be used to enhance the value of
the data warehouse in a large manufacturing company.

1. Automated Data Extraction:

Agents can be employed to autonomously extract data from diverse sources within the
manufacturing processes, ensuring real-time updates and minimizing manual intervention.

2. Predictive Maintenance Agents:

Utilizing agent technology for predictive maintenance, the data warehouse can integrate
information from sensors and machines to anticipate equipment failures, optimizing production
efficiency and reducing downtime.

3. Supply Chain Optimization:

Agents can analyze data across the supply chain, providing insights into inventory levels,
production schedules, and demand forecasting, enabling proactive decision-making and
enhancing overall supply chain efficiency.

5. Your company is in the business of renting DVDs and video tapes. The company has recently
entered into ebusiness and the senior management wants to make the existing data warehouse
Web-enabled. List and describe any three of the major tasks required for satisfying the
management’s directive.

1. Web-Enabled Data Access:

Description: Implement a user-friendly web interface that allows seamless access to the data
warehouse, enabling employees across the organization to retrieve and analyze information
through web browsers.

2. Security and Authentication:

Description: Enhance security measures to ensure secure access to the web-enabled data
warehouse, implementing robust authentication protocols and access controls to safeguard
sensitive information.

3. Scalability and Performance Optimization:


Description: Optimize the data warehouse infrastructure for scalability to accommodate
increased web traffic, ensuring that the system can handle concurrent user requests and
maintain performance levels for efficient data retrieval and analysis.

CHAPTER 4

REVIEW QUESTIONS

1. Name four key issues to be considered while planning for a data warehouse.

1. Data Quality:

Consideration: Ensuring the accuracy, completeness, and consistency of data to maintain the
integrity and reliability of information stored in the data warehouse.

2. Scalability:

Consideration: Planning for the ability of the data warehouse to handle growing volumes of data
and increased user demands over time without compromising performance.

3. Data Integration:

Consideration: Addressing the challenges associated with integrating data from various sources
to create a unified and coherent view within the data warehouse.

4. User Requirements:

Consideration: Understanding and incorporating the specific needs of end-users to design a
data warehouse that provides relevant and actionable insights to support decision-making.

2. What is meant by a preliminary survey of requirements? List six types of information
you will gather during a preliminary survey.

A preliminary survey of requirements involves gathering initial information to understand the
needs and expectations for a data warehouse project.

Types of Information Gathered:

1. Business Goals and Objectives:


● Identify the overarching goals and objectives the data warehouse should support.

2. User Requirements:

● Understand the specific needs and preferences of end-users who will interact with the
data warehouse.

3. Data Sources:

● Determine the sources of data that need to be integrated into the warehouse for
comprehensive analysis.

4. Data Volume and Complexity:

● Assess the volume and complexity of data to plan for scalability and integration
challenges.

5. Security and Compliance Requirements:

● Identify security measures and compliance standards that must be adhered to in
handling sensitive data.

6. Budget and Resource Constraints:

● Understand the financial and resource constraints to plan the project within specified
limits.

3. List and explain any four of the development phases in the life cycle of a data warehouse
project.

1. Requirements Gathering:

Explanation: Involves understanding and documenting the business needs, user requirements,
and data sources to establish the foundation for the data warehouse design.

2. Data Modeling:

Explanation: Focuses on designing the structure of the data warehouse, including the creation
of conceptual, logical, and physical data models to define how data will be organized and
accessed.
3. Data Extraction, Transformation, and Loading (ETL):

Explanation: Encompasses the processes of extracting data from source systems, transforming
it into a suitable format, and loading it into the data warehouse for analysis.

4. Testing and Deployment:

Explanation: Involves rigorous testing of the data warehouse to ensure accuracy and
performance, followed by the deployment of the system for end-users to access and utilize.

4. What do you consider to be a core set of team roles for a data warehouse project?
Describe the responsibilities of three roles from your set.

Core Team Roles:

1. Project Manager:

Responsibilities: Oversee the entire data warehouse project, manage resources, timelines, and
budget, and ensure alignment with business goals.

2. Data Architect:

Responsibilities: Design the data architecture, define data models, and ensure that the data
warehouse structure supports business requirements and scalability.

3. ETL Developer:

Responsibilities: Implement the Extract, Transform, Load (ETL) processes, ensuring data is
accurately extracted from source systems, transformed appropriately, and loaded into the data
warehouse.

5. Name and describe any five of the success factors in a data warehouse project.

1. **Clear Business Goals:**


- *Description:* Clearly defined business goals and objectives ensure that the data warehouse
project aligns with the organization's strategic priorities.

2. **User Involvement and Training:**


- *Description:* Active user involvement and effective training programs contribute to user
adoption and maximize the utilization of the data warehouse.

3. **Data Quality Management:**


- *Description:* Implementing robust data quality management practices ensures the accuracy
and reliability of information stored in the data warehouse.

4. **Scalability Planning:**
- *Description:* Planning for scalability from the outset accommodates future growth,
preventing performance issues as data volumes and user demands increase.

5. **Executive Support:**
- *Description:* Strong support from executive leadership ensures that the project receives
necessary resources, funding, and organizational commitment.

CHAPTER EXERCISES

1. As the recently assigned project manager, you are required to work with the executive
sponsor to write a justification without detailed ROI calculations for the first data
warehouse project in your company. Write a justification report to be included in the
planning document.

**Title: Justification for the First Data Warehouse Project**

**Executive Summary:**

The implementation of a data warehouse represents a strategic initiative crucial for enhancing
our organizational decision-making processes. By consolidating and integrating disparate data
sources, the data warehouse will provide a unified view, fostering informed decision-making,
improving operational efficiency, and aligning our business strategies with actionable insights.

**Key Justifications:**

1. **Improved Decision-Making:**
- The data warehouse will centralize data, enabling executives and managers to make
informed decisions based on a comprehensive and accurate understanding of business
operations.

2. **Enhanced Operational Efficiency:**


- Streamlining data retrieval and analysis processes will reduce the time spent on manual data
gathering, leading to increased operational efficiency and productivity.

3. **Strategic Alignment:**
- Aligning our data assets will support strategic initiatives by providing a cohesive view of
organizational performance, fostering a data-driven culture and enhancing our competitive
edge.
4. **Mitigated Risks:**
- A centralized data repository will mitigate risks associated with data inconsistencies and
inaccuracies, ensuring data integrity and compliance with industry regulations.

5. **Future Growth Readiness:**


- Planning for scalability in the data warehouse design positions us to accommodate future
growth in data volume and user demands, safeguarding our investment and ensuring long-term
viability.

**Conclusion:**

The implementation of a data warehouse is not merely a technological upgrade; it is an
essential step towards fostering a data-driven culture that will empower our organization to
thrive in a rapidly evolving business landscape. The strategic advantages gained from this
project will contribute to our long-term success and competitiveness.

2. You are the data transformation specialist for the first data warehouse project in an
airlines company. Prepare a project task list to include all the detailed tasks needed for
data extraction and transformation.

1. **Define Data Sources:**


- Identify and document the diverse sources of data, including operational systems, third-party
vendors, and external sources.

2. **Extract Data:**
- Develop extraction processes to retrieve data from various sources, considering frequency,
volume, and scheduling requirements.

3. **Data Cleansing:**
- Implement data cleansing routines to identify and rectify errors, inconsistencies, and missing
values in the extracted data.

4. **Data Transformation Rules:**


- Establish transformation rules to standardize, normalize, and enrich the data to meet the
required format and quality standards.

5. **Data Loading Strategy:**


- Develop a strategy for loading transformed data into the data warehouse, considering batch
loading, real-time loading, or a hybrid approach.

6. **Error Handling and Logging:**


- Implement mechanisms for error handling and logging to capture and manage errors
encountered during extraction and transformation processes.

7. **Metadata Management:**
- Define and document metadata for extracted and transformed data, including source
information, transformation logic, and data lineage.

8. **Data Quality Assurance:**


- Establish processes for data quality assurance, including validation checks and reconciliation
to ensure the accuracy and reliability of transformed data.

9. **Integration Testing:**
- Conduct thorough integration testing to validate the end-to-end process, ensuring that data
extraction and transformation meet project requirements.

10. **User Training and Documentation:**


- Develop training materials and conduct sessions for end-users on understanding and
utilizing the transformed data in the data warehouse.

11. **Performance Optimization:**


- Optimize data transformation processes for performance, considering factors such as
indexing, partitioning, and parallel processing.

12. **Collaboration with Stakeholders:**


- Collaborate with stakeholders, including data analysts and business users, to gather
feedback and refine data extraction and transformation processes based on user requirements.

13. **Documentation and Knowledge Transfer:**


- Document all data transformation processes comprehensively for future reference and
facilitate knowledge transfer to ensure continuity.

14. **Implementation Planning:**


- Develop a detailed implementation plan for the deployment of data extraction and
transformation processes into the production environment.

15. **Monitoring and Maintenance Procedures:**


- Establish monitoring and maintenance procedures to ensure the ongoing effectiveness of
data extraction and transformation processes in the live environment.
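
Tasks 6 and 8 in the list above (error handling and logging, and data quality assurance) can be illustrated with a short sketch. It uses Python's standard logging module; the flight numbers, field names, and the transform_rows function are hypothetical and chosen only to fit the airline setting.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl")

def transform_rows(raw_rows):
    """Convert fare strings to floats; reject and log rows that fail."""
    good, rejected = [], []
    for row in raw_rows:
        try:
            good.append({"flight": row["flight"], "fare": float(row["fare"])})
        except (KeyError, ValueError) as exc:
            log.warning("Rejected row %r: %s", row, exc)
            rejected.append(row)
    return good, rejected

# Hypothetical extracted rows: one valid, one with a bad fare, one missing the fare.
raw = [
    {"flight": "XY100", "fare": "120.50"},
    {"flight": "XY200", "fare": "n/a"},
    {"flight": "XY300"},
]
good_rows, rejected_rows = transform_rows(raw)

# Reconciliation check: every source row is either loaded or rejected.
reconciled = len(good_rows) + len(rejected_rows) == len(raw)
print(len(good_rows), len(rejected_rows), reconciled)
```

Rejected rows are kept rather than silently dropped so that they can be corrected at the source and reprocessed.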
3. Why do you think user participation is absolutely essential for success? As a member
of the recently formed data warehouse team in a banking business, your job is to write a
report on how the user departments can best participate in the development. What
specific responsibilities for the users will you include in your report?

**Importance of User Participation:**


User participation is essential for success as it ensures that the data warehouse aligns with
actual business needs, fostering user adoption and increasing the likelihood of generating
actionable insights.

**Report on User Participation:**

1. **Requirements Gathering:**
- *Responsibility:* Collaborate with the data warehouse team to clearly define and document
business requirements, ensuring that the data warehouse addresses specific needs and objectives.

2. **Data Validation:**
- *Responsibility:* Actively participate in data validation processes, verifying that the data in
the warehouse accurately represents business operations and adheres to user expectations.

3. **User Acceptance Testing (UAT):**


- *Responsibility:* Engage in UAT to evaluate the functionality and usability of the data
warehouse, providing feedback on features, interface, and overall user experience.

4. **Training Sessions:**
- *Responsibility:* Attend training sessions conducted by the data warehouse team to
enhance familiarity with the system, its capabilities, and best practices for data retrieval and
analysis.

5. **Feedback and Continuous Improvement:**


- *Responsibility:* Provide ongoing feedback on the data warehouse's performance,
suggesting improvements or additional features that align with evolving business needs.

6. **Documentation Assistance:**
- *Responsibility:* Assist in documenting business rules, definitions, and specific use cases to
enhance the overall documentation and facilitate knowledge transfer within the organization.

7. **Promotion of Data-Driven Culture:**
- *Responsibility:* Actively promote a data-driven culture within the user department,
encouraging colleagues to leverage the data warehouse for decision-making and analytical
purposes.

8. **Communication with IT Team:**
- *Responsibility:* Maintain open communication with the IT team, conveying user
requirements, challenges, and expectations to ensure a collaborative and responsive
development process.

By actively participating in these responsibilities, users contribute to the success of the data
warehouse project, ensuring its alignment with business objectives and fostering a collaborative
and user-friendly environment.

CHAPTER 5

REVIEW QUESTIONS
1. What are the essential differences between defining requirements for operational systems
and for data warehouses?
2. Explain business dimensions. Why and how can business dimensions be useful for defining
requirements for the data warehouse?
3. What data does an information package contain?
4. What are dimension hierarchies? Give three examples.
5. Explain business metrics or facts with five examples.
6. List the types of users who must be interviewed for collecting requirements. What information
can you expect to get from them?
7. In which situations can JAD methodology be successful for collecting requirements?
8. Why are reviews of existing documents important? What can you expect to get out of such
reviews?

CHAPTER EXERCISES
1. Indicate if true or false:
A. Requirements definitions for a sales processing operational system and a sales analysis data
warehouse are very similar.
B. Managers think in terms of business dimensions for analysis.
C. Unit sales and product costs are examples of business dimensions.
D. Dimension hierarchies relate to drill-down analysis.
E. Categories are attributes of business dimensions.
F. JAD is a methodology for one-on-one interviews.
G. Questionnaires provide the least interactive method for gathering requirements.
H. The departmental users provide information about the company’s overall direction.
I. Departmental managers are very good sources for information on data structures of
operational systems.
J. Information package diagrams are essential parts of the formal requirements definition
document.
2. You are the vice president of marketing for a nation-wide appliance manufacturer with three
production plants. Describe any three different ways you will tend to analyze your sales. What
are the business dimensions for your analysis?
3. BigBook, Inc. is a large book distributor with domestic and international distribution channels.
The company orders from publishers and distributes publications to all the leading booksellers.
Initially, you want to build a data warehouse to analyze shipments that are made from the
company’s many warehouses. Determine the metrics or facts and the business dimensions.
Prepare an information package diagram.
4. You are on the data warehouse project of AuctionsPlus.com, an Internet auction company
selling upscale works of art. Your responsibility is to gather requirements for sales analysis. Find
out the key metrics, business dimensions, hierarchies, and categories. Draw the information
package diagram.
5. Create a detailed outline for the formal requirements definition document for a data
warehouse to analyze product profitability of a large department store chain.

CHAPTER 6

REVIEW QUESTIONS
1. “In a data warehouse, business requirements of the users form the single and most powerful
driving force.” Do you agree? If you do, state four reasons why. If not, is there any other such
driving force?
2. How do requirements affect the choice of the metadata framework? Explain very briefly.
3. What types of user requirements dictate the granularity or the levels of detail in a data
warehouse?
4. How do you estimate the storage size? What factors determine the size?
5. How do accurate information diagrams turn into sound data models for your data marts?
Explain briefly.

CHAPTER EXERCISES
1. It is a known fact that data quality in the source systems is poor in your company. You are
assigned to be the data quality assurance specialist on the project team. Describe what details
you will include in the requirements definition document to address the data quality problem.
2. As the analyst responsible for data loads and data refreshes, describe all the details you will
look for and document during the requirements definition phase.
3. You are the query tools specialist on the project team for a manufacturing company with the
primary users based in the main office. These power users need sophisticated tools for analysis.
How will you determine what types of information delivery methods are needed? What kinds of
details are to be gathered in the requirements definition phase?
