FDW Review and Chapter Questions
Review Questions
1. What do we mean by strategic information? For a commercial bank, name five types of
strategic objectives.
Strategic information refers to data critical for long-term planning and decision-making. For a
commercial bank, key strategic objectives include risk management, customer retention,
technological innovation, regulatory compliance, and financial sustainability.
2. Do you agree that a typical retail store collects huge volumes of data through its operational
systems? Name three types of transaction data likely to be collected by a retail store in large
volumes during its daily operations.
Absolutely! A retail store indeed gathers substantial data. Three types of transaction data
collected in large volumes are:
Customer Transactions: Capturing purchase history, preferences, and loyalty program activities.
a. Purpose:
b. Data Usage:
Operational Systems: Deal with real-time, transactional data for immediate processing.
Informational Systems: Handle historical, aggregated data for reporting and analysis.
c. Time Horizon:
d. User Focus:
Operational Systems: Used by front-line employees for routine tasks.
Informational Systems: Accessed by managers and analysts for strategic planning and insights.
e. Processing Speed:
4. Why are operational systems not suitable for providing strategic information? Give three
specific reasons and explain.
The design principles and focus of operational systems (immediate, detailed transactions) make
them less suitable for providing the broader, historical, and analytical information needed for
strategic decision-making.
2. Granularity of Data:
Operational systems capture detailed transactional data at a granular level. While this
granularity is essential for operational tasks, it may be overwhelming for strategic
decision-makers who need a more aggregated and summarized view to identify long-term
trends and patterns.
Operational systems are optimized for transaction processing and may lack advanced analytical
capabilities. Strategic decision-making often involves complex analysis, forecasting, and trend
identification, which may go beyond the capabilities of operational systems.
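To make the granularity point concrete, here is a minimal Python sketch that rolls detailed sales transactions up into monthly totals, the kind of summarized view a strategic decision-maker would work from. The sample rows and the `monthly_totals` helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction-level rows, as an operational system might record them:
# (transaction date, store, sale amount)
transactions = [
    (date(2023, 1, 5), "Store A", 120.0),
    (date(2023, 1, 17), "Store A", 80.0),
    (date(2023, 2, 3), "Store A", 150.0),
    (date(2023, 2, 21), "Store B", 60.0),
]


def monthly_totals(rows):
    """Aggregate detailed transactions into one total per (year, month)."""
    totals = defaultdict(float)
    for day, _store, amount in rows:
        totals[(day.year, day.month)] += amount
    return dict(totals)


summary = monthly_totals(transactions)
for (year, month), total in sorted(summary.items()):
    print(f"{year}-{month:02d}: {total:.2f}")
```

The operational system keeps every row; the summary is what makes a month-over-month trend visible at a glance.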
● Extract:
● Transform:
● Load:
- Description: Users can run queries and perform analytical tasks on the stored data.
These processes collectively ensure data integration, quality, and accessibility in a data
warehouse.
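The extract, transform, load, and query steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source records, the `sales` table, and the function names are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical source records from an operational system (names are made up).
source_rows = [
    {"customer": " alice ", "amount": "100.50"},
    {"customer": "BOB", "amount": "20"},
    {"customer": "", "amount": "5"},  # missing customer name, rejected below
]


def extract():
    """Extract: read raw records from the source."""
    return list(source_rows)


def transform(rows):
    """Transform: standardize names, convert types, drop incomplete rows."""
    cleaned = []
    for row in rows:
        name = row["customer"].strip().title()
        if not name:
            continue
        cleaned.append((name, float(row["amount"])))
    return cleaned


def load(conn, rows):
    """Load: write the transformed rows into a warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))

# Query: users run analytical queries against the loaded data.
total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(total)
```

Keeping the three steps as separate functions mirrors the way the processes are described: each one can be tested and scheduled on its own.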
1. The current trends in hardware/software technology make data warehousing feasible. Explain
with some examples how exactly technology trends do help.
- Advancements in hardware and software technology make data warehousing feasible by
improving storage capacity, processing speed, and analytical capabilities. For instance,
the use of high-performance databases, distributed computing, and cloud computing
allows for the efficient handling of large volumes of data. Additionally, innovations in data
compression and in-memory processing enhance the speed of data retrieval and
analysis. Machine learning algorithms and artificial intelligence further contribute to
automated insights, making data warehousing more powerful and accessible for
decision-makers.
2. For an airline company, how can strategic information increase the number of frequent flyers?
Discuss giving specific details.
3. You are a senior analyst in the IT department of a company manufacturing automobile parts.
The marketing VP is complaining about the poor response by IT in providing strategic
information. Draft a proposal to him introducing the concept of business intelligence and how
data warehousing and analytics as part of business intelligence for your company would be the
optimal solution.
Key Components:
1. Data Warehousing:
a. Holistic View:
With analytics, we can move beyond raw data to gain timely insights, enabling quicker
responses to market trends and customer behaviors.
b. Strategic Planning:
BI supports strategic planning by offering predictive analytics, helping us anticipate market shifts
and optimize our operations accordingly.
c. User-Friendly Dashboards:
Intuitive dashboards make it easy for non-technical users, including yourself, to interact with and
extract valuable insights from the data.
d. Improved Collaboration:
BI fosters collaboration between departments by providing a common platform for data analysis
and reporting.
CHAPTER 2
REVIEW QUESTIONS
1. Name at least six characteristics or features of a data warehouse.
Data integration is more critical in a data warehouse than in an operational application because
it harmonizes diverse data sources, ensures consistency for historical analysis, and provides a
unified foundation for centralized and informed decision-making.
3. Every data structure in the data warehouse contains the time element. Why?
- Every data structure in the data warehouse contains the time element to facilitate
historical analysis and track changes over different periods.
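As a small illustration of why the time element matters, the Python sketch below stores an effective date with every row so that a value can be looked up as of any past date. The `price_history` rows and the `price_as_of` helper are hypothetical examples, not part of any particular warehouse product.

```python
from datetime import date

# Hypothetical warehouse rows: each carries the date it became effective.
# (product, price, effective_from)
price_history = [
    ("widget", 9.99, date(2022, 1, 1)),
    ("widget", 10.99, date(2022, 7, 1)),
    ("widget", 11.49, date(2023, 1, 1)),
]


def price_as_of(history, product, as_of):
    """Return the price that was in effect for `product` on the date `as_of`."""
    candidates = [
        (effective, price)
        for name, price, effective in history
        if name == product and effective <= as_of
    ]
    if not candidates:
        return None
    # The latest effective date on or before `as_of` wins.
    return max(candidates)[1]


print(price_as_of(price_history, "widget", date(2022, 8, 15)))
```

Without the date on each row, only the current price would be known and no comparison across periods would be possible.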
- Data granularity refers to the level of detail or specificity in the data, and in the data
warehouse, it determines the depth of information stored, allowing for precise analysis
and reporting at various levels of detail.
5. How are the top-down and bottom-up approaches for building a data warehouse different?
List the major types of architectures and highlight the features of any two of these.
Answers:
The top-down approach starts with designing the overall architecture and then focuses on
specific data marts, while the bottom-up approach builds data marts first and integrates them
into a comprehensive data warehouse.
Data Warehouse Architectures:
Kimball Architecture:
Features: Follows a bottom-up approach, emphasizing data marts for specific business areas;
promotes rapid development and deployment.
Inmon Architecture:
6. What are the various data sources for the data warehouse?
- Data sources for the data warehouse include operational databases, external data from
vendors, spreadsheets, legacy systems, and other structured and unstructured data
repositories.
8. Under data transformation, list five different functions you can think of.
- Data transformation functions include filtering to extract relevant data, aggregation for
summarizing information, merging to combine datasets, cleansing to standardize and
correct data, and derivation to create new variables or calculations.
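The five functions named above can be shown on a tiny data set. The following Python sketch is illustrative only; the order rows, the customer lookup, and all field names are assumptions made up for the example.

```python
# Hypothetical extracted rows; field names are assumptions for illustration.
orders = [
    {"order_id": 1, "cust_id": 10, "region": "north ", "qty": 2, "unit_price": 5.0},
    {"order_id": 2, "cust_id": 11, "region": "SOUTH", "qty": 1, "unit_price": 20.0},
    {"order_id": 3, "cust_id": 10, "region": "North", "qty": 0, "unit_price": 7.5},
]
customers = {10: "Acme Corp", 11: "Globex"}

# 1. Filtering: keep only rows relevant to the analysis (non-zero quantity).
filtered = [o for o in orders if o["qty"] > 0]

# 2. Cleansing: standardize inconsistent region spellings.
cleansed = [{**o, "region": o["region"].strip().lower()} for o in filtered]

# 3. Derivation: compute a new field from existing ones.
derived = [{**o, "revenue": o["qty"] * o["unit_price"]} for o in cleansed]

# 4. Merging: combine with customer data from a second source.
merged = [{**o, "customer": customers[o["cust_id"]]} for o in derived]

# 5. Aggregation: summarize revenue per region.
revenue_by_region = {}
for o in merged:
    revenue_by_region[o["region"]] = revenue_by_region.get(o["region"], 0.0) + o["revenue"]

print(revenue_by_region)
```

The order of the steps here is one reasonable choice; in practice cleansing often has to come before merging so that keys from different sources match.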
10. What are the three major types of metadata in a data warehouse? Briefly mention the
purpose of each type.
1. Operational Metadata:
Purpose: Describes the execution and operation of processes within the data warehouse,
providing insights into system performance and resource utilization.
2. Structural Metadata:
Purpose: Defines the structure and organization of the data, including tables, relationships, and
data types, ensuring proper integration and understanding of the data schema.
3. Business Metadata:
Purpose: Offers business context to the data, including definitions, business rules, and user
annotations, aiding in the comprehension and meaningful utilization of data by business users.
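One way to picture the three metadata types is as different groups of fields describing the same table. The Python sketch below is a hypothetical record layout, assuming a made-up `sales_fact` table; real metadata repositories are considerably richer.

```python
from dataclasses import dataclass, field


@dataclass
class TableMetadata:
    """Hypothetical metadata record for one warehouse table."""

    # Structural metadata: how the data is organized.
    table_name: str
    columns: dict  # column name -> data type

    # Business metadata: what the data means to business users.
    business_definition: str = ""

    # Operational metadata: how the loading process ran.
    last_load_rows: int = 0
    load_log: list = field(default_factory=list)

    def record_load(self, rows_loaded, status):
        """Capture operational details of a load run."""
        self.last_load_rows = rows_loaded
        self.load_log.append((rows_loaded, status))


sales_meta = TableMetadata(
    table_name="sales_fact",
    columns={"sale_date": "DATE", "amount": "DECIMAL(10,2)"},
    business_definition="One row per completed sale, net of returns.",
)
sales_meta.record_load(rows_loaded=1500, status="ok")
print(sales_meta.last_load_rows, sales_meta.load_log)
```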
CHAPTER EXERCISES
1. A data warehouse is subject-oriented. What would be the major critical business subjects for
the following companies?
a. an international manufacturing company
b. a local community bank
c. a domestic hotel chain
2. You are the data analyst on the project team building a data warehouse for an insurance
company. List the possible data sources from which you will bring the data into your data
warehouse. State your assumptions.
3. For an airlines company, identify three operational applications that would feed into the data
warehouse. What would be the data load and refresh cycles?
4. Prepare a table showing all the potential users and information delivery methods for a data
warehouse supporting a large national grocery chain.
CHAPTER 3
REVIEW QUESTIONS
1. State any three factors that indicate the continued growth in data warehousing and
business intelligence. Can you think of some examples?
Three factors indicating continued growth in data warehousing and business intelligence include
increasing data volumes from various sources, advancements in analytics technologies, and the
rising demand for data-driven decision-making. Examples include the proliferation of IoT
(Internet of Things) devices generating massive datasets, the evolution of machine learning for
predictive analytics, and businesses adopting BI (Business Intelligence) tools to gain actionable
insights from their data.
2. Why do data warehouses continue to grow in size, storing huge amounts of data? Give
any three reasons.
Data warehouses continue to grow in size due to the increasing volume of data generated by
various sources, the need to store historical data for trend analysis and compliance, and the
growing emphasis on detailed and granular data for more accurate and comprehensive
analysis.
3. Why is it important to store multiple types of data in the data warehouse? Give
examples of some non structured data likely to be found in the data warehouse of a
health management organization (HMO).
Storing multiple types of data in a data warehouse is important to provide a holistic view and
support diverse analyses. In the data warehouse of a Health Management Organization (HMO),
non-structured data like patient feedback comments, social media mentions related to
healthcare trends, and unstructured clinical notes from healthcare providers are examples that
offer valuable insights when combined with structured data.
Data fusion refers to the process of integrating diverse data from multiple sources to create a
unified and comprehensive dataset. In data warehousing, data fusion occurs during the data
integration phase, where various data sources are combined and harmonized to provide a
cohesive and meaningful view for analysis and reporting.
CHAPTER EXERCISES
2. As the senior analyst on the data warehouse project of a large retail chain, you are
responsible for improving data visualization of the output results. Make a list of your
recommendations.
● Utilize tools that allow users to interact with and customize dashboards for personalized
insights.
● Focus on visual representations of crucial KPIs for quick and easy monitoring.
● Implement heat maps for data density representation and geographical visualization to
analyze regional performance.
● Introduce trend lines and charts to highlight historical patterns and forecast future trends.
● Enable users to drill down into detailed data or drill up for a broader perspective based
on their analysis needs.
6. Use Data Storytelling Techniques:
● Include annotations on visualizations to provide context and explanations for key data
points.
● Explore advanced visualization techniques such as Sankey diagrams, tree maps, and
network graphs for complex relationships.
● Conduct training sessions to ensure users understand how to interpret and derive
insights from visualizations effectively.
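The drill-down and drill-up recommendation in the list above amounts to summarizing the same data at different levels of a hierarchy. A minimal Python sketch, assuming a hypothetical region, city, and store hierarchy and an illustrative `summarize` helper:

```python
from collections import defaultdict

# Hypothetical sales rows along a geography hierarchy: region -> city -> store.
sales = [
    {"region": "East", "city": "Boston", "store": "B1", "amount": 100.0},
    {"region": "East", "city": "Boston", "store": "B2", "amount": 50.0},
    {"region": "East", "city": "Albany", "store": "A1", "amount": 70.0},
    {"region": "West", "city": "Denver", "store": "D1", "amount": 90.0},
]


def summarize(rows, levels):
    """Total the amount at the given hierarchy levels, e.g. ("region", "city")."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[level] for level in levels)
        totals[key] += row["amount"]
    return dict(totals)


# Rolled-up view: one total per region.
by_region = summarize(sales, ("region",))

# Drill down into the East region to see its cities.
east_rows = [r for r in sales if r["region"] == "East"]
east_by_city = summarize(east_rows, ("region", "city"))

print(by_region)
print(east_by_city)
```

A dashboard tool would run the equivalent of `summarize` each time the user clicks into or out of a level.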
3. Explain how and why parallel processing can improve performance for data loading and index
creation.
Parallel processing improves performance for data loading and index creation by dividing
the tasks into parallel threads or processes that can be executed simultaneously. This is
beneficial for two main reasons:
2. Resource Utilization:
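As a rough illustration of dividing a load into tasks that run simultaneously, the Python sketch below partitions the input and hands each partition to its own worker. The input lines and helper names are hypothetical. Threads are used only to keep the example short; a real warehouse load would more likely rely on the database's own parallel load facilities or on separate processes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical raw input lines waiting to be loaded.
raw_lines = [f"{i},{i * 1.5}" for i in range(1, 9)]


def partition(items, n_parts):
    """Split the input into n_parts roughly equal chunks."""
    return [items[i::n_parts] for i in range(n_parts)]


def load_chunk(lines):
    """Parse one chunk; stands in for the work a single loader would do."""
    return [(int(key), float(value)) for key, value in (ln.split(",") for ln in lines)]


chunks = partition(raw_lines, 4)

# Each chunk is handled by its own worker; results come back in chunk order.
with ThreadPoolExecutor(max_workers=4) as pool:
    loaded_parts = list(pool.map(load_chunk, chunks))

loaded = sorted(row for part in loaded_parts for row in part)
print(len(loaded), loaded[0], loaded[-1])
```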
4. Discuss three specific ways in which agent technology may be used to enhance the value of
the data warehouse in a large manufacturing company.
Agents can be employed to autonomously extract data from diverse sources within the
manufacturing processes, ensuring real-time updates and minimizing manual intervention.
Utilizing agent technology for predictive maintenance, the data warehouse can integrate
information from sensors and machines to anticipate equipment failures, optimizing production
efficiency and reducing downtime.
Agents can analyze data across the supply chain, providing insights into inventory levels,
production schedules, and demand forecasting, enabling proactive decision-making and
enhancing overall supply chain efficiency.
5. Your company is in the business of renting DVDs and video tapes. The company has recently
entered into ebusiness and the senior management wants to make the existing data warehouse
Web-enabled. List and describe any three of the major tasks required for satisfying the
management’s directive.
Description: Implement a user-friendly web interface that allows seamless access to the data
warehouse, enabling employees across the organization to retrieve and analyze information
through web browsers.
Description: Enhance security measures to ensure secure access to the web-enabled data
warehouse, implementing robust authentication protocols and access controls to safeguard
sensitive information.
CHAPTER 4
REVIEW QUESTIONS
1. Name four key issues to be considered while planning for a data warehouse.
1. Data Quality:
Consideration: Ensuring the accuracy, completeness, and consistency of data to maintain the
integrity and reliability of information stored in the data warehouse.
2. Scalability:
Consideration: Planning for the ability of the data warehouse to handle growing volumes of data
and increased user demands over time without compromising performance.
3. Data Integration:
Consideration: Addressing the challenges associated with integrating data from various sources
to create a unified and coherent view within the data warehouse.
4. User Requirements:
2. User Requirements:
● Understand the specific needs and preferences of end-users who will interact with the
data warehouse.
3. Data Sources:
● Determine the sources of data that need to be integrated into the warehouse for
comprehensive analysis.
● Assess the volume and complexity of data to plan for scalability and integration
challenges.
● Understand the financial and resource constraints to plan the project within specified
limits.
3. List and explain any four of the development phases in the life cycle of a data warehouse
project.
1. Requirements Gathering:
Explanation: Involves understanding and documenting the business needs, user requirements,
and data sources to establish the foundation for the data warehouse design.
2. Data Modeling:
Explanation: Focuses on designing the structure of the data warehouse, including the creation
of conceptual, logical, and physical data models to define how data will be organized and
accessed.
3. Data Extraction, Transformation, and Loading (ETL):
Explanation: Encompasses the processes of extracting data from source systems, transforming
it into a suitable format, and loading it into the data warehouse for analysis.
Explanation: Involves rigorous testing of the data warehouse to ensure accuracy and
performance, followed by the deployment of the system for end-users to access and utilize.
4. What do you consider to be a core set of team roles for a data warehouse project?
Describe the responsibilities of three roles from your set.
1. Project Manager:
Responsibilities: Oversee the entire data warehouse project, manage resources, timelines, and
budget, and ensure alignment with business goals.
2. Data Architect:
Responsibilities: Design the data architecture, define data models, and ensure that the data
warehouse structure supports business requirements and scalability.
3. ETL Developer:
Responsibilities: Implement the Extract, Transform, Load (ETL) processes, ensuring data is
accurately extracted from source systems, transformed appropriately, and loaded into the data
warehouse.
5. Name and describe any five of the success factors in a data warehouse project.
4. Scalability Planning:
Description: Planning for scalability from the outset accommodates future growth,
preventing performance issues as data volumes and user demands increase.
5. Executive Support:
Description: Strong support from executive leadership ensures that the project receives
necessary resources, funding, and organizational commitment.
CHAPTER EXERCISES
1. As the recently assigned project manager, you are required to work with the executive
sponsor to write a justification without detailed ROI calculations for the first data warehouse
project in your company. Write a justification report to be included in the planning
document.
Executive Summary:
The implementation of a data warehouse represents a strategic initiative crucial for enhancing
our organizational decision-making processes. By consolidating and integrating disparate data
sources, the data warehouse will provide a unified view, fostering informed decision-making,
improving operational efficiency, and aligning our business strategies with actionable insights.
Key Justifications:
1. Improved Decision-Making:
- The data warehouse will centralize data, enabling executives and managers to make
informed decisions based on a comprehensive and accurate understanding of business
operations.
3. Strategic Alignment:
- Aligning our data assets will support strategic initiatives by providing a cohesive view of
organizational performance, fostering a data-driven culture and enhancing our competitive
edge.
4. Mitigated Risks:
- A centralized data repository will mitigate risks associated with data inconsistencies and
inaccuracies, ensuring data integrity and compliance with industry regulations.
Conclusion:
2. You are the data transformation specialist for the first data warehouse project in an
airlines company. Prepare a project task list to include all the detailed tasks needed for
data extraction and transformation.
2. Extract Data:
- Develop extraction processes to retrieve data from various sources, considering frequency,
volume, and scheduling requirements.
3. Data Cleansing:
- Implement data cleansing routines to identify and rectify errors, inconsistencies, and missing
values in the extracted data.
7. Metadata Management:
- Define and document metadata for extracted and transformed data, including source
information, transformation logic, and data lineage.
9. Integration Testing:
- Conduct thorough integration testing to validate the end-to-end process, ensuring that data
extraction and transformation meet project requirements.
1. Requirements Gathering:
Responsibility: Collaborate with users to clearly define and document business
requirements, ensuring that the data warehouse addresses specific needs and objectives.
2. Data Validation:
Responsibility: Actively participate in data validation processes, verifying that the data in
the warehouse accurately represents business operations and adheres to user expectations.
4. Training Sessions:
Responsibility: Attend training sessions conducted by the data warehouse team to
enhance familiarity with the system, its capabilities, and best practices for data retrieval and
analysis.
6. Documentation Assistance:
Responsibility: Assist in documenting business rules, definitions, and specific use cases to
enhance the overall documentation and facilitate knowledge transfer within the organization.
By actively participating in these responsibilities, users contribute to the success of the data
warehouse project, ensuring its alignment with business objectives and fostering a collaborative
and user-friendly environment.
CHAPTER 5
REVIEW QUESTIONS
1. What are the essential differences between defining requirements for operational systems
and for data warehouses?
2. Explain business dimensions. Why and how can business dimensions be useful for defining
requirements for the data warehouse?
3. What data does an information package contain?
4. What are dimension hierarchies? Give three examples.
5. Explain business metrics or facts with five examples.
6. List the types of users who must be interviewed for collecting requirements. What information
can you expect to get from them?
7. In which situations can JAD methodology be successful for collecting requirements?
8. Why are reviews of existing documents important? What can you expect to get out of such
reviews?
CHAPTER EXERCISES
1. Indicate if true or false:
A. Requirements definitions for a sales processing operational system and a sales analysis data
warehouse are very similar.
B. Managers think in terms of business dimensions for analysis.
C. Unit sales and product costs are examples of business dimensions.
D. Dimension hierarchies relate to drill-down analysis.
E. Categories are attributes of business dimensions.
F. JAD is a methodology for one-on-one interviews.
G. Questionnaires provide the least interactive method for gathering requirements.
H. The departmental users provide information about the company’s overall direction.
I. Departmental managers are very good sources for information on data structures of
operational systems.
J. Information package diagrams are essential parts of the formal requirements definition
document.
2. You are the vice president of marketing for a nation-wide appliance manufacturer with three
production plants. Describe any three different ways you will tend to analyze your sales. What
are the business dimensions for your analysis?
3. BigBook, Inc. is a large book distributor with domestic and international distribution channels.
The company orders from publishers and distributes publications to all the leading booksellers.
Initially, you want to build a data warehouse to analyze shipments that are made from the
company’s many warehouses. Determine the metrics or facts and the business dimensions.
Prepare an information package diagram.
4. You are on the data warehouse project of AuctionsPlus.com, an Internet auction company
selling upscale works of art. Your responsibility is to gather requirements for sales analysis. Find
out the key metrics, business dimensions, hierarchies, and categories. Draw the information
package diagram.
5. Create a detailed outline for the formal requirements definition document for a data
warehouse to analyze product profitability of a large department store chain.
CHAPTER 6
REVIEW QUESTIONS
1. “In a data warehouse, business requirements of the users form the single and most powerful
driving force.” Do you agree? If you do, state four reasons why. If not, is there any other such
driving force?
2. How do requirements affect the choice of the metadata framework? Explain very briefly.
3. What types of user requirements dictate the granularity or the levels of detail in a data
warehouse?
4. How do you estimate the storage size? What factors determine the size?
5. How do accurate information diagrams turn into sound data models for your data marts?
Explain briefly.
CHAPTER EXERCISES
1. It is a known fact that data quality in the source systems is poor in your company. You are
assigned to be the data quality assurance specialist on the project team. Describe what details
you will include in the requirements definition document to address the data quality problem.
2. As the analyst responsible for data loads and data refreshes, describe all the details you will
look for and document during the requirements definition phase.
3. You are the query tools specialist on the project team for a manufacturing company with the
primary users based in the main office. These power users need sophisticated tools for analysis.
How will you determine what types of information delivery methods are needed? What kinds of
details are to be gathered in the requirements definition phase?