Customer Data Warehouse
Alain Wahl
alain.wahl@unifr.ch
University of Fribourg, Switzerland
Department of Informatics
MAY 2005
Abstract
Big companies generate a huge volume of data that needs to be converted
into information that can be used for operational and analytical purposes. Ideally
the data is stored in a data warehouse. A data warehouse is a large repository of
historical, operational and customer data. Data volume can reach the size of several
terabytes, where one terabyte is 2^40 bytes of data.
Home shopping companies, retailers and banks are users of data warehouses.
Attached to a data warehouse is a set of analytic procedures for making sense out
of the data.
Contents
1 Introduction
2 The Customer Relationship Challenge
3 Data Warehouse
3.1 Multi-dimensional databases
3.2 DBMS
4 Conclusion
Bibliography
Chapter 1
Introduction
1.1
Most companies are unable to discover valuable information hidden in their data,
which prevents them from transforming data into knowledge. Data warehouses are
well suited to collecting the data produced by business processes. They are used
to analyse that data and draw conclusions from it.
A customer data warehouse should answer the following questions for customer
relationship management [Meier] [5]:
What attributes describe an attractive customer or customer group?
What is the customer value in the past, present or future?
How loyal is a specific customer or customer group?
What changes in the customer requirements or service quality can be traced?
What are the preferred communication channels of a specific customer?
Chapter 2
The Customer Relationship Challenge
2.1
The first challenge of CRM is to acquire new customers and to win back lost customers with
attractive market and resource potential [Kotler et al. 2002] [4].
2.2
The second strategic objective is to maintain and improve customer equity by cross- and up-selling together with retention programs during the customer
lifetime [Blattberg et al. 2001] [1].
As figure 2.1 shows, the manual treatment of customer data is still common.
The use of a customer data warehouse with data mining techniques is rare [Han and
Kamber 2001] [3].
[Figure 2.1: Bar chart of customer data collection methods, N = 969, horizontal axis from 0 to 600. Categories: manual gathering of customer data; no data collection except for fulfillment; survey of customer satisfaction; registration of campaign reactions; machine-aided collection of attributes; enrollment of customer behavior; permission marketing; external market research; others.]
Chapter 3
Data Warehouse
3.1 Multi-dimensional databases
Online Analytical Processing (OLAP) refers to the analysis of data. The core of
OLAP is a multi-dimensional database. The differences between OLAP and OLTP
(Online Transaction Processing) are shown in the following table:
[Table: OLAP compared with OLTP by data, operations, time period, granularity, time of response and availability]
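To make the contrast concrete, the following minimal sketch runs both styles of access against one small table. It assumes the usual characterisation of the two workloads: OLTP issues short transactions on individual, detailed records, while OLAP reads many historical records and summarises them. The table `orders`, its columns and all values are hypothetical.

```python
import sqlite3

# Hypothetical in-memory table; names and figures are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, year INTEGER, amount REAL)"
)

# OLTP style: short transactions that write individual, detailed records.
with conn:
    conn.execute("INSERT INTO orders VALUES (1, 'alice', 2004, 40.0)")
    conn.execute("INSERT INTO orders VALUES (2, 'bob', 2004, 60.0)")
    conn.execute("INSERT INTO orders VALUES (3, 'alice', 2005, 25.0)")

# OLTP-style read: fetch a single record by its key.
single_order = conn.execute(
    "SELECT customer, amount FROM orders WHERE order_id = 2"
).fetchone()

# OLAP-style read: aggregate many historical records into a summary.
yearly_totals = conn.execute(
    "SELECT year, SUM(amount) FROM orders GROUP BY year ORDER BY year"
).fetchall()
```

The point lookup touches one row, whereas the aggregate scans the whole table. This difference in access pattern is one reason analytical queries are usually kept away from the operational system.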
3.2 DBMS
A database management system (cf. figure 3.1) is, as the name suggests, composed of two components: a database and a management system.
A multi-dimensional database management system is called a data warehouse.
3.3
[Figure 3.1: A database management system (DBMS), composed of a database (DB) and a management system (MS)]
3.4
Large companies have presences in many places, each of which may generate a large
volume of data. For instance, manufacturing-problem data and customer-complaint
data may be stored on different database systems. A data warehouse (cf. figure
3.2) is a repository of information gathered from different sources and stored under
a unified schema.
Figure 3.3 shows the core of each data warehouse: a data cube. In this example
three dimensions are chosen, i.e. time, product and area.
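As a rough illustration of such a cube, the sketch below stores one measure per combination of time, product and area and sums it over any dimensions that are left out. The cell values, the member names and the helper `aggregate` are hypothetical. A real multi-dimensional database would use a far more efficient storage layout.

```python
from collections import defaultdict

DIMENSIONS = ("time", "product", "area")

# Hypothetical sales figures, one cell per (time, product, area) combination.
cells = {
    ("2004", "shoes", "north"): 10,
    ("2004", "shoes", "south"): 5,
    ("2004", "hats", "north"): 3,
    ("2005", "shoes", "north"): 7,
    ("2005", "hats", "south"): 4,
}


def aggregate(cells, keep):
    """Sum the measure over every dimension that is not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for coordinates, value in cells.items():
        key = tuple(coordinates[i] for i in positions)
        totals[key] += value
    return dict(totals)


sales_by_time = aggregate(cells, ["time"])
sales_by_product_and_area = aggregate(cells, ["product", "area"])
grand_total = aggregate(cells, [])
```

Each call corresponds to looking at the cube along a different combination of dimensions. Keeping no dimension at all collapses the cube into a single total.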
The schema of figure 3.4 is called a star schema. It is composed of a fact table,
multiple dimension tables and foreign keys from the fact table to the dimension
tables.
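A star schema of this kind can be sketched with Python's built-in sqlite3 module. The table and column names follow figure 3.4, with hyphens replaced by underscores. The customer dimension and some columns are omitted for brevity, the sample rows are invented, and treating `price` as a unit price is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item_info (item_id INTEGER PRIMARY KEY, itemname TEXT, category TEXT);
CREATE TABLE date_info (date TEXT PRIMARY KEY, month INTEGER, quarter INTEGER, year INTEGER);
CREATE TABLE store (store_id INTEGER PRIMARY KEY, city TEXT, country TEXT);
-- Fact table: one row per sale, with foreign keys into each dimension table.
CREATE TABLE sales (
    item_id INTEGER REFERENCES item_info(item_id),
    store_id INTEGER REFERENCES store(store_id),
    date TEXT REFERENCES date_info(date),
    number INTEGER,
    price REAL
);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO item_info VALUES (?, ?, ?)",
                 [(1, "boots", "shoes"), (2, "cap", "hats")])
conn.executemany("INSERT INTO date_info VALUES (?, ?, ?, ?)",
                 [("2005-01-15", 1, 1, 2005), ("2005-04-02", 4, 2, 2005)])
conn.executemany("INSERT INTO store VALUES (?, ?, ?)",
                 [(1, "Fribourg", "Switzerland"), (2, "Bern", "Switzerland")])
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", [
    (1, 1, "2005-01-15", 2, 80.0),
    (2, 1, "2005-01-15", 1, 20.0),
    (1, 2, "2005-04-02", 1, 80.0),
])

# Typical star-schema query: join the fact table to its dimensions, then aggregate.
revenue_by_quarter_and_category = conn.execute("""
    SELECT d.quarter, i.category, SUM(s.number * s.price)
    FROM sales AS s
    JOIN date_info AS d ON d.date = s.date
    JOIN item_info AS i ON i.item_id = s.item_id
    GROUP BY d.quarter, i.category
    ORDER BY d.quarter, i.category
""").fetchall()
```

The fact table stays narrow and holds only keys and measures. Descriptive attributes such as category or quarter live in the dimension tables and are reached through the foreign keys.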
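[Figure 3.2: Data warehouse architecture. Data sources 1 to n feed data loaders, which fill the data warehouse managed by a DBMS; query and analysis tools access the data warehouse.]
[Figure 3.3: Data cube with the dimensions time, product and area]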
3.5
There are two different possibilities for when and how to gather data. In a source-driven
architecture, data sources transmit new information continually or periodically to the
data warehouse. In a destination-driven architecture, the data warehouse periodically
sends requests for new data to the sources.
[Figure 3.4: Star schema. Fact table sales (item-id, store-id, customer-id, date, number, price); dimension tables item-info (item-id, itemname, color, size, category), date-info (date, month, quarter, year), store (store-id, city, state, country) and customer (customer-id, name, street, city, state, zipcode, country).]
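The two ways of gathering data described in section 3.5 can be sketched as follows. The sketch assumes that in the destination-driven case the warehouse asks each source for its new data. The classes `Warehouse` and `DataSource`, the function `pull_from` and the sample records are hypothetical.

```python
class Warehouse:
    """Collects records from sources; stands in for the data warehouse."""

    def __init__(self):
        self.records = []

    def receive(self, batch):
        self.records.extend(batch)


class DataSource:
    """A hypothetical operational system that accumulates new records."""

    def __init__(self, name):
        self.name = name
        self.pending = []

    def record(self, item):
        self.pending.append((self.name, item))

    def drain(self):
        """Return all records not yet transferred and forget them."""
        batch, self.pending = self.pending, []
        return batch

    def push_to(self, warehouse):
        # Source-driven: the source decides when to transmit.
        warehouse.receive(self.drain())


def pull_from(warehouse, sources):
    # Destination-driven: the warehouse asks every source for new data.
    for source in sources:
        warehouse.receive(source.drain())


warehouse = Warehouse()
complaints = DataSource("complaints")
manufacturing = DataSource("manufacturing")

complaints.record("late delivery")
complaints.push_to(warehouse)  # source-driven transfer

manufacturing.record("defect on line 2")
complaints.record("wrong size")
pull_from(warehouse, [complaints, manufacturing])  # destination-driven transfer
```

In both cases the same records arrive in the warehouse. What differs is which side initiates the transfer and therefore which side controls how current the warehouse is.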
3.6
Drill down and roll up are used to move to a finer or a coarser granularity. To make these
actions possible, data need to be stored hierarchically, as in figure 3.5.
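A minimal sketch of both operations along a time hierarchy is given below. It uses only the levels date, month, quarter and year, and the daily sales values and the helper names `roll_up` and `drill_down` are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales, the finest level of the time hierarchy used here.
daily_sales = {
    date(2005, 1, 15): 100,
    date(2005, 2, 3): 50,
    date(2005, 4, 20): 70,
    date(2004, 12, 31): 30,
}

# Each level maps a date to the member it belongs to at that level.
LEVELS = {
    "month": lambda d: (d.year, d.month),
    "quarter": lambda d: (d.year, (d.month - 1) // 3 + 1),
    "year": lambda d: (d.year,),
}


def roll_up(sales, level):
    """Aggregate daily values to a coarser level of the time hierarchy."""
    to_level = LEVELS[level]
    totals = defaultdict(int)
    for day, amount in sales.items():
        totals[to_level(day)] += amount
    return dict(totals)


def drill_down(sales, level, member):
    """Return the daily values that make up one member of a coarser level."""
    to_level = LEVELS[level]
    return {day: amount for day, amount in sales.items() if to_level(day) == member}


by_year = roll_up(daily_sales, "year")
by_quarter = roll_up(daily_sales, "quarter")
first_quarter_days = drill_down(daily_sales, "quarter", (2005, 1))
```

Because every date belongs to exactly one month, quarter and year, totals at a coarser level can always be derived from the finer one. This is what the hierarchical storage makes possible.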
Slicing is the action of selecting a part of the data cube, e.g. all products in one
time period.
Dicing is the action of re-ordering the main dimensions of a data cube (other texts call this re-ordering pivoting and use dicing for selecting a sub-cube).
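Both operations can be sketched on a small cube held in a dictionary. The function `slice_cube` fixes one dimension to a single value, and `reorder` performs the re-ordering of dimensions that the text describes under dicing. The cube contents and both function names are hypothetical.

```python
# Hypothetical cube cells keyed by (time, product, area).
cube = {
    ("2004", "shoes", "north"): 10,
    ("2004", "hats", "south"): 3,
    ("2005", "shoes", "north"): 7,
    ("2005", "hats", "north"): 4,
}


def slice_cube(cube, axis, value):
    """Fix one dimension to a single value and drop it from the keys."""
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cube.items()
        if key[axis] == value
    }


def reorder(cube, order):
    """Re-order the dimensions of every key, e.g. order=(1, 2, 0)."""
    return {tuple(key[i] for i in order): measure for key, measure in cube.items()}


# All products and areas in one time period.
sales_2005 = slice_cube(cube, 0, "2005")

# The same cells, now addressed as (product, area, time).
product_first = reorder(cube, (1, 2, 0))
```

Slicing reduces the cube by one dimension, while re-ordering keeps every cell and only changes the order in which the dimensions are addressed.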
[Figure 3.5: Hierarchies on dimensions. a) Time hierarchy with the levels Year, Quarter, Month, Day of week, Date, Hour of day and DateTime. b) Location hierarchy with the levels Region, Country, State and City.]
Chapter 4
Conclusion
Customer data warehouses are decision-support systems that help to analyse and archive
online data collected by transaction processing systems, so that companies can make
business decisions. To achieve this objective, data warehouses are used to analyse
historical data, for instance to predict trends. Warehouse schemas tend to be multidimensional, involving one or a few very large fact tables and several much smaller
dimension tables.
Bibliography
[1] R. C. Blattberg, G. Getz, J. S. Thomas (2001). Customer Equity: Building and Managing Relationships as Valuable Assets. Harvard Business
School Press: Boston.
[2] Francis Buttle (2004). Customer Relationship Management: Concepts and
Tools.
[3] J. Han, M. Kamber (2001). Data Mining: Concepts and Techniques. Morgan
Kaufmann: San Francisco.
[4] P. Kotler, D. C. Jain, S. Maesincee (2002). Marketing Moves: A New
Approach to Profits, Growth, and Renewal. Harvard Business School Press:
Boston.
[5] Andreas Meier. A Data Warehouse Approach to Customer Relationship Management.
[6] Andreas Meier (2004). Relationale Datenbanken: Leitfaden für die Praxis.
Fifth edition.
[7] Andreas Meier, Stefan Hüsemann, Thomas Zurkinden (2004). Data
Warehouse Architecture.
[8] Silberschatz, Korth and Sudarshan (2002). Database System Concepts.
Fourth edition.
[9] R. T. Watson (1999). Data Management: Databases and Organisations.