Recent Developments in Data Warehousing
Hugh J. Watson
Terry College of Business
University of Georgia
hwatson@terry.uga.edu
http://www.terry.uga.edu/~hwatson/dw_tutorial.ppt
Tutorial Objectives
Provide an overview of data
warehousing
Provide materials to support the
teaching of data warehousing
Discuss recent developments in data
warehousing
The Importance of Data
Warehousing
Provide a “single version of the truth”
Improve decision making
Support key corporate initiatives such as
performance management, B2C and B2B
e-commerce, and customer relationship
management
Estimated to be a $113.5 billion market in
2002 for systems, software, services, and
in-house expenditures (Palo Alto
Management Group)
A Simple Definition
[Figure: chart with axes Annual Revenue and Length of Relationship]
Two Data Warehousing
Strategies
Enterprise-wide warehouse, top
down, the Inmon methodology
Data mart, bottom up, the Kimball
methodology
When properly executed, both result
in an enterprise-wide data
warehouse
The Data Mart Strategy
The most common approach
Begins with a single mart; architected marts
are added over time for more subject areas
Relatively inexpensive and easy to implement
Can be used as a proof of concept for data
warehousing
Can perpetuate the “silos of information” problem
Can postpone difficult decisions and activities
Requires an overall integration plan
The Enterprise-wide Strategy
A comprehensive warehouse is built initially
An initial dependent data mart is built using
a subset of the data in the warehouse
Additional data marts are built using subsets
of the data in the warehouse
Like all complex projects, it is expensive,
time-consuming, and prone to failure
When successful, it results in an integrated,
scalable warehouse
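The idea of a dependent data mart built from a subset of the warehouse can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module; the table names (warehouse_sales, east_sales_mart), columns, and rows are hypothetical and do not come from the tutorial.

```python
import sqlite3

# Hypothetical warehouse table; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales "
    "(region TEXT, product TEXT, sale_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?, ?)",
    [
        ("East", "Television", "2001-01-15", 500.0),
        ("East", "Radio", "2001-01-15", 40.0),
        ("West", "Television", "2001-01-16", 450.0),
    ],
)

# A dependent data mart: a subject-area subset drawn from the warehouse.
conn.execute(
    "CREATE TABLE east_sales_mart AS "
    "SELECT product, sale_date, amount "
    "FROM warehouse_sales WHERE region = 'East'"
)

mart_rows = conn.execute(
    "SELECT product, amount FROM east_sales_mart ORDER BY product"
).fetchall()
print(mart_rows)
```

Because the mart is derived from the warehouse rather than loaded separately from source systems, its data stay consistent with the warehouse by construction.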
Data Sources and Types
Primarily from legacy, operational
systems
Almost exclusively numerical data at the
present time
External data may be included, often
purchased from third-party sources
Technology exists for storing unstructured
data, and this is expected to become more
important over time
Extraction, Transformation,
and Loading (ETL) Processes
The “plumbing” work of data
warehousing
Data are moved from source to
target databases
A very costly, time-consuming part
of data warehousing
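A minimal sketch of the three ETL steps may help make the "plumbing" concrete. The source rows, field names, cleaning rules, and the sales target table below are all assumptions made for illustration; real ETL jobs are far larger and are usually built with dedicated tools.

```python
import sqlite3

# Hypothetical rows as they might arrive from a legacy operational system.
source_rows = [
    {"cust": " best buys ", "amt": "1,200.50", "date": "01/15/2001"},
    {"cust": "Sears", "amt": "300", "date": "01/16/2001"},
]

def transform(row):
    """Clean one source row: tidy the name, parse the amount, use an ISO date."""
    month, day, year = row["date"].split("/")
    return (
        row["cust"].strip().upper(),
        float(row["amt"].replace(",", "")),
        f"{year}-{month}-{day}",
    )

def load(conn, rows):
    """Insert the transformed rows into the target table."""
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

# The target database (in memory here; an assumption for the sketch).
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE sales (customer TEXT, amount REAL, sale_date TEXT)")

# Extract -> transform -> load.
load(target, [transform(r) for r in source_rows])

loaded = target.execute("SELECT * FROM sales ORDER BY customer").fetchall()
print(loaded)
```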
Recent Development:
More Frequent Updates
Updates can be done in bulk and
trickle modes
Business requirements, such as
trading partner access to a Web site,
require current data
For international firms, there is no
good time to load the warehouse
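The difference between bulk and trickle updates can be sketched as below. This is only an illustration under assumed names (an orders table and two hypothetical helper functions); it is not drawn from any particular product.

```python
import sqlite3

def bulk_load(conn, batch):
    """Bulk mode: load an accumulated batch in one transaction."""
    with conn:
        conn.executemany("INSERT INTO orders VALUES (?, ?)", batch)

def trickle_load(conn, event):
    """Trickle mode: load each record as soon as it arrives."""
    with conn:
        conn.execute("INSERT INTO orders VALUES (?, ?)", event)

warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")

# For example, a batch accumulated since the previous load.
bulk_load(warehouse, [(1, 10.0), (2, 20.0), (3, 30.0)])

# For example, records arriving one at a time during the day.
for event in [(4, 40.0), (5, 50.0)]:
    trickle_load(warehouse, event)

row_count = warehouse.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(row_count)
```

Trickle loading keeps the warehouse current without needing a quiet load window, at the cost of many small transactions.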
Recent Development:
Clickstream Data
Results from clicks at web sites
A dialog manager handles user
interactions. An operational data store (ODS)
helps to custom-tailor the dialog
The clickstream data is filtered and
parsed and sent to a data warehouse
where it is analyzed
Software is available to analyze the
clickstream data
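The filter-and-parse step for clickstream data might look roughly like the sketch below. The log line layout (timestamp, session id, URL, HTTP status) and the filtering rules are assumptions chosen for illustration; actual Web server logs vary.

```python
from collections import Counter

# Hypothetical raw click log lines: timestamp, session id, URL, HTTP status.
raw_clicks = [
    "2001-01-15T10:00:01 s1 /products/tv 200",
    "2001-01-15T10:00:02 s1 /images/logo.gif 200",
    "2001-01-15T10:00:05 s2 /products/tv 200",
    "2001-01-15T10:00:09 s2 /checkout 500",
    "2001-01-15T10:00:12 s1 /checkout 200",
]

def parse(line):
    """Split one log line into named fields."""
    timestamp, session, url, status = line.split()
    return {
        "timestamp": timestamp,
        "session": session,
        "url": url,
        "status": int(status),
    }

def keep(click):
    """Filter out image requests and failed requests."""
    return click["status"] == 200 and not click["url"].endswith(".gif")

# Parsed, filtered clicks are what would be sent on to the warehouse.
clean_clicks = [c for c in map(parse, raw_clicks) if keep(c)]

# A simple analysis: page views per URL.
page_views = Counter(c["url"] for c in clean_clicks)
print(page_views)
```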
Recent Development:
Further Automation of ETL Processes
[Figure: star schema for health care claims; # marks a key column]
Claim (fact table) -- #Physician ID, #Patient ID, #Service Code, #Payer ID, #Claim Number, #Line Item Number, #Claim Date, Date of Services, Amount of Charge, Unit of Services
Patient -- #Patient ID, Patient Name, Address, Age, Sex, Insurance ID
Physician -- #Physician ID, Physician Name, Specialty ID, Credential ID
Service -- #Service Code, Service Description, #Category Code
Payer -- #Payer ID, Name, Address, Phone Number, EDI Number
Time Periods -- #Claim Date, Year, Month, Quarter, Week
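The Patient, Physician, Service, Payer, Time Periods, and Claim tables shown above could be declared roughly as follows. This is a sketch using Python's sqlite3 module: the table and column names follow the figure, but the snake_case spelling, the data types, and the foreign key declarations are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables, one per box around the central Claim table.
CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY, patient_name TEXT, address TEXT,
    age INTEGER, sex TEXT, insurance_id TEXT);
CREATE TABLE physician (
    physician_id INTEGER PRIMARY KEY, physician_name TEXT,
    specialty_id TEXT, credential_id TEXT);
CREATE TABLE service (
    service_code TEXT PRIMARY KEY, service_description TEXT,
    category_code TEXT);
CREATE TABLE payer (
    payer_id INTEGER PRIMARY KEY, name TEXT, address TEXT,
    phone_number TEXT, edi_number TEXT);
CREATE TABLE time_periods (
    claim_date TEXT PRIMARY KEY, year INTEGER, month INTEGER,
    quarter INTEGER, week INTEGER);

-- Fact table: its key is made up of the columns marked # in the figure.
CREATE TABLE claim (
    physician_id INTEGER REFERENCES physician(physician_id),
    patient_id INTEGER REFERENCES patient(patient_id),
    service_code TEXT REFERENCES service(service_code),
    payer_id INTEGER REFERENCES payer(payer_id),
    claim_number INTEGER,
    line_item_number INTEGER,
    claim_date TEXT REFERENCES time_periods(claim_date),
    date_of_services TEXT,
    amount_of_charge REAL,
    unit_of_services INTEGER,
    PRIMARY KEY (physician_id, patient_id, service_code, payer_id,
                 claim_number, line_item_number, claim_date)
);
""")

tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
# Columns of the fact table that belong to its primary key.
fact_key = [row[1] for row in conn.execute("PRAGMA table_info(claim)") if row[5] > 0]
print(tables)
print(fact_key)
```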
Dimension Table Examples
Retail -- store name, zip code, product
name, product category, day of week
Telecommunications -- call origin, call
destination
Banking -- customer name, account
number, branch, account officer
Insurance -- policy type, insured party
Fact Table Examples
Retail -- number of units sold, sales
amount
Telecommunications -- length of
call in minutes, average number of
calls
Banking -- average monthly
balance
Insurance -- claims amount
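To show how facts and dimensions work together, the retail example can be sketched as a fact table of measures summarized by a dimension attribute. The table names (product_dim, sales_fact), the surrogate product_id key, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension: descriptive attributes used to slice the facts.
CREATE TABLE product_dim (
    product_id INTEGER PRIMARY KEY, product_name TEXT, product_category TEXT);
-- Fact: numeric measures recorded for each sale.
CREATE TABLE sales_fact (
    product_id INTEGER REFERENCES product_dim(product_id),
    units_sold INTEGER, sales_amount REAL);

INSERT INTO product_dim VALUES
    (1, 'Television', 'Electronics'),
    (2, 'Radio', 'Electronics'),
    (3, 'Sofa', 'Furniture');
INSERT INTO sales_fact VALUES
    (1, 2, 1000.0), (2, 5, 200.0), (3, 1, 700.0), (1, 1, 500.0);
""")

# Sum the fact measures, grouped by a dimension attribute.
totals = conn.execute("""
    SELECT d.product_category, SUM(f.units_sold), SUM(f.sales_amount)
    FROM sales_fact AS f
    JOIN product_dim AS d ON d.product_id = f.product_id
    GROUP BY d.product_category
    ORDER BY d.product_category
""").fetchall()
print(totals)
```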
The Fact Table Key Concatenates
the Dimension Keys
Assume that you want to know the
number of television sets sold
to Best Buys on January 15, 2001.
The query might be:
-- The #...# date literal is Microsoft Access syntax
SELECT CLIENT.CUSNAME, SALES.NOSOLD
FROM CLIENT, PRODUCT, TIME, SALES
WHERE CLIENT.CUSNAME = SALES.CUSNAME
  AND PRODUCT.PRODNAME = SALES.PRODNAME
  AND TIME.DATE = SALES.DATE
  AND CLIENT.CUSNAME = "BEST BUYS"
  AND PRODUCT.PRODNAME = "TELEVISION"
  AND TIME.DATE = #01/15/2001#
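A runnable variant of this query, using Python's sqlite3 module, is sketched below. Several details are assumptions made for the sketch: the TIME table and DATE column are renamed TIME_DIM and SALE_DATE for portability, dates are stored as ISO text, the joins are written explicitly, and the sample rows are invented. The composite primary key on SALES illustrates a fact table key that concatenates the dimension keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CLIENT (CUSNAME TEXT PRIMARY KEY);
CREATE TABLE PRODUCT (PRODNAME TEXT PRIMARY KEY);
CREATE TABLE TIME_DIM (SALE_DATE TEXT PRIMARY KEY);

-- The fact table key concatenates the three dimension keys.
CREATE TABLE SALES (
    CUSNAME TEXT REFERENCES CLIENT(CUSNAME),
    PRODNAME TEXT REFERENCES PRODUCT(PRODNAME),
    SALE_DATE TEXT REFERENCES TIME_DIM(SALE_DATE),
    NOSOLD INTEGER,
    PRIMARY KEY (CUSNAME, PRODNAME, SALE_DATE)
);

INSERT INTO CLIENT VALUES ('BEST BUYS'), ('SEARS');
INSERT INTO PRODUCT VALUES ('TELEVISION'), ('RADIO');
INSERT INTO TIME_DIM VALUES ('2001-01-15'), ('2001-01-16');
INSERT INTO SALES VALUES
    ('BEST BUYS', 'TELEVISION', '2001-01-15', 12),
    ('BEST BUYS', 'RADIO', '2001-01-15', 30),
    ('SEARS', 'TELEVISION', '2001-01-15', 7),
    ('BEST BUYS', 'TELEVISION', '2001-01-16', 9);
""")

query = """
    SELECT CLIENT.CUSNAME, SALES.NOSOLD
    FROM SALES
    JOIN CLIENT ON CLIENT.CUSNAME = SALES.CUSNAME
    JOIN PRODUCT ON PRODUCT.PRODNAME = SALES.PRODNAME
    JOIN TIME_DIM ON TIME_DIM.SALE_DATE = SALES.SALE_DATE
    WHERE CLIENT.CUSNAME = ?
      AND PRODUCT.PRODNAME = ?
      AND TIME_DIM.SALE_DATE = ?
"""
result = conn.execute(query, ("BEST BUYS", "TELEVISION", "2001-01-15")).fetchall()
print(result)
```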
Warehouse Users
Analysts
Managers
Executives
Operational personnel
Customers and suppliers
Warehouse Tools and
Applications
SQL queries
Managed query environments
Structured and ad hoc reports
DSS/EIS
Portals
Data mining
Packaged applications
Custom-built applications
Recent Development:
Growing Dominance of MS SQL
Server 7.0 with OLAP Services
Low cost, integration of bundled
DSS components from one vendor,
and extended SQL for OLAP
Competitors are either leaving the
market or are repositioning their
products to be complementary
Recent Development:
Enterprise Intelligence Portals
Offers users an effective way to access
information scattered across networked
enterprise systems through a simple and
personalized Web interface
Provides access to structured and
unstructured data
Potentially integrates data warehousing
and knowledge management
Owens & Minor
Owens & Minor -- data warehousing has supported
integration along the supply chain. Winner of the
1999 TDWI Leadership Award
the nation's leading distributor of name-brand
medical and surgical supplies
has transformed its business model by integrating
supply chain management, e-business, data
warehousing, and Internet technologies
as part of this initiative, WISDOM
(WebIntelligence Supporting Decisions from
Owens & Minor) has been especially valuable
[Figure: supply chain from Raw Materials to Manufacturer to Owens & Minor to Provider to Patient, with PRODUCT and INFORMATION flows. Labels: Suppliers, +1,400 manufacturers, +4,000 Acute Care Facilities]
WISDOM
a Web-based decision support system that
provides information to OM’s employees,
suppliers and customers
accesses data from a data warehouse that
maintains supplier and customer transaction
data
sold to trading partners as a value-added
product
WISDOM II provides data about the
transactions that suppliers and customers
have with all of their trading partners
Sample Applications
Supports reporting and queries for internal
personnel
Supports an EIS for senior management
Suppliers can determine their market share
in specific hospitals
Hospitals can identify which products are
being bought off contract
WISDOM II extends data warehousing to
trading partners through an outsourcing
arrangement
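Two of the applications above, supplier market share within a hospital and products bought off contract, can be sketched in a few lines. Everything here is hypothetical: the record layout, the hospital and supplier names, and the amounts are invented for illustration and are not Owens & Minor data.

```python
from collections import defaultdict

# Hypothetical purchase records:
# (hospital, supplier, product, amount, bought_on_contract)
purchases = [
    ("General Hospital", "Supplier A", "gloves", 600.0, True),
    ("General Hospital", "Supplier B", "gloves", 400.0, False),
    ("General Hospital", "Supplier A", "syringes", 1000.0, True),
    ("City Clinic", "Supplier B", "gauze", 250.0, True),
]

def market_share(records, hospital):
    """Each supplier's share of total purchase amount at one hospital."""
    totals = defaultdict(float)
    for hosp, supplier, _product, amount, _on_contract in records:
        if hosp == hospital:
            totals[supplier] += amount
    grand_total = sum(totals.values())
    return {supplier: amt / grand_total for supplier, amt in totals.items()}

def off_contract(records, hospital):
    """Products a hospital bought outside its contracts."""
    return [
        product
        for hosp, _supplier, product, _amount, on_contract in records
        if hosp == hospital and not on_contract
    ]

shares = market_share(purchases, "General Hospital")
off_contract_products = off_contract(purchases, "General Hospital")
print(shares)
print(off_contract_products)
```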
Articles
Cooper, B.L., H.J. Watson, B.H. Wixom, and D.L. Goodhue, "Data Warehousing
Supports Corporate Strategy at First American Corporation," MIS Quarterly,
(December 2000), pp. 547-567. Provides a case study of how the First American
Corporation turned their strategy and fortunes around through the use of data
warehousing.
Stoller, Wixom, and Watson, “WISDOM Provides Competitive Advantage at
Owens & Minor,” (http://terry.uga.edu/~watson/owens&minor.doc) Provides a
case study of how data warehousing can support supply chain integration.
Watson, Wixom, Buonamica, and Revak, “Sherwin-Williams' Data Mart Strategy:
Creating Intelligence Across the Supply Chain,” Communications of AIS, April
2001. Provides a textbook example of how to implement a data mart strategy.
Watson, H.J., D.A. Annino, B.H. Wixom, K.L. Avery, and M. Rutherford, “Current
Practices in Data Warehousing,” Information Systems Management, (Winter,
2001), pp. 47-55. Provides data on companies’ data warehousing experiences,
with an emphasis on the benefits being realized.
Watson, H.J. and L. Volonino, “Harrah’s High Payoff from Customer
Information,” (http://www.terry.uga.edu/~hwatson/harrahs.doc) Provides a
case study of how Harrah’s Entertainment has implemented a CRM strategy
facilitated by data warehousing.
Books
Devlin, Data Warehouse -- Architecture to Implementation, Addison-
Wesley, 1997.
Gray and Watson, Decision Support in the Data Warehouse, Prentice-Hall,
1998.
Kimball, The Data Warehouse Toolkit, Wiley, 1996.
Kimball and Merz, The Data Webhouse Toolkit, Wiley, 2000.
Inmon, Building the Operational Data Store, second edition, Wiley, 1999.
Inmon, Imhoff, and Sousa, Corporate Information Factory, Wiley, 1999.
Websites
http://www.olapreport.com
(provides detailed information about the OLAP
market, products, and applications)
http://www.firstlogic.com
(includes an interactive demo of their data cleansing
tool)
http://www.billinmon.com
(a wealth of current information from “the father of
data warehousing”)
http://www.metagenix.com
(illustrates recent advances in ETL tools)
http://www.microstrategy.com
(excellent materials from one of the leading DSS
vendors)
Questions