Download as pdf or txt
Download as pdf or txt
You are on page 1of 14

Seminar

Prof. Dr. Andreas Meier


Nicolas Werro

Customer Data Warehouse

Alain Wahl
alain.wahl@unifr.ch
University of Fribourg, Switzerland
Department of Informatics

MAY 2005

Abstract
Big companies need to generate a huge volume of data, that needs to be converted
into information that can be used for operational and analytical purposes. Ideally
the data is stored in a data warehouse. A data warehouse is a large repository of
historical, operational and customer data. Data volume can reach the size of several
terabytes, i.e. 240 bytes of data.
Home shopping companies, retailers and banks are users of data warehouses.
Attached to a data warehouse is a set of analytic procedures for making sense out
of the data.

Contents
Contents

List of Figures

1 Introduction

1.1

Why a Customer Data Warehouse? . . . . . . . . . . . . . . . . . . .

2 The Customer Relationship challenge

4
5

2.1

The first challenge of Customer Relationship Management . . . . . .

2.2

The second challenge of Customer Relationship Management . . . . .

3 Data Warehouse

3.1

Multi-dimensional databases . . . . . . . . . . . . . . . . . . . . . . .

3.2

DBMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

3.3

Definitions of a Data Warehouse . . . . . . . . . . . . . . . . . . . . .

3.4

Data Warehouse Architecture . . . . . . . . . . . . . . . . . . . . . .

3.5

Data Warehouse Components . . . . . . . . . . . . . . . . . . . . . .

3.6

Data Warehouse Operations . . . . . . . . . . . . . . . . . . . . . . . 10

4 Conclusion

12

Bibliography

13

List of Figures
2.1

Treatment of the customer data. [Meier]5 . . . . . . . . . . . . . . . .

3.1

A database management system. . . . . . . . . . . . . . . . . . . . .

3.2

Data warehouse architecture. [Silberschatz et al.]8 . . . . . . . . . . .

3.3

Data Cube. [Meier et al.]

. . . . . . . . . . . . . . . . . . . . . . . .

3.4

Star Schema. [Silberschatz et al.]8 . . . . . . . . . . . . . . . . . . . . 10

3.5

Hierarchies on dimensions. [Silberschatz et al.]8 . . . . . . . . . . . . 11

Chapter 1
Introduction
1.1

Why a Customer Data Warehouse?

Most companies are unable to discover valuable information hidden in the data,
which prevents them from transforming data to knowledge. Data warehouses are
the ideal solution to collect all possible data from business processes. They are used
to analyse data and draw conclusions.
A customer data warehouse should answer the following questions for customer
relationship management [Meier]5 :
What attributes describe an attractive customer or customer group?
What is the customer value in the past, present or future?
How loyal is a specific customer or customer group?
What changes in the customer requirements or service quality can be traced?
What are the preferred communictaion channels of a specific customer?

Chapter 2
The Customer Relationship
challenge
2.1

The first challenge of Customer Relationship


Management

The first challenge of CRM is to acquire new customers and lost customers with
attractive market and resource potential [Kotler et al. 2002]4 .

2.2

The second challenge of Customer Relationship Management

The second strategic strategic objective is to maintain and improve customer equity by cross- and up-selling together with retention programs during the customer
lifetime [Blattberg et al. 2001]1 .
As figure 2.1 shows, the manual treatment of customer data is still common.
The use of a customer data warehouse with data mining techniques is rare [Han and
Kamber 2001]3 .

SECTION 2.2 THE SECOND CHALLENGE OF CUSTOMER RELATIONSHIP


MANAGEMENT
6

Manual gathering of
customer data
No data collection
except for fulfillment
Survey of customer
satisfaction
Registration of
campaign reactions
Machine-aided collection of attributes
Enrollment of customer behavior
Permission marketing
External market
research
Others
0

50

100 150 200 250 300 350 400 450 500 550 600

N = 969

Figure 2.1: Treatment of the customer data. [Meier]5

Chapter 3
Data Warehouse
3.1

Multi-dimensional databases

Online analytical Processing (OLAP) is the action of data analysis. The core of
OLAP is a multi-dimensional database. The differences between OLAP and OLTP
(Online Transaction Processing) are shown in the following table:
Data
Operations
Time Period
Granularity
Time of Response
Availability

3.2

Online Transactional Processing


productive databases
read, write, delete
current date
detailed
short (ms - s)
high

Online Analytical Processing


databases for decision support
read only
historical, current, (future)
different aggregation levels
long (s - min)
medium

DBMS

A database management system (cf. figure 3.1) is, like the word it suggests, composed of two components: a database and a management system.
A multi-dimensional database management system is called a data warehouse.

3.3

Definitions of a Data Warehouse

A Data Warehouse is an integrated collection of all transaction data that


exist within a company to be used to consolidate and analyze daily business
transactions. [Karchur 2000]
A Data Warehouse is a database, with tools, that stores current and historical
data of potential interest to managers throughout the company. [Laudon
1999]

SECTION 3.4

DATA WAREHOUSE ARCHITECTURE

Database

DB

MS

Management
System

Entities & Relationships


Query and Manipulation Language
Specialized Functions (Recovery, Reorganization,
Security, Data Protection)

Figure 3.1: A database management system.

A Data Warehouse is a subject oriented collection of information in support


of managements decisions. [Schinzer 1999]
Inmon (1996) describes data warehouses by the following keywords:
subject-oriented: data is organized around subjects of the business
(e.g. customers) rather than around applications (e.g. insurance)
integrated: it is consistent in the way data from several sources are
treated (e.g. coding conventions are standardized)
time-variant: data are organized by various time periods (e.g. months)
non-volatile: the database is not uploaded in real time.

3.4

Data Warehouse Architecture

Large companies have presences in many places, each of which may generate a large
volume of data. For instance, manufacturing-problem data and customer-complaint
data may be stored on different database systems. A data warehouse (cf. figure
3.2) is a repository of information gathered from different sources and stored under
a unified schema.
Figure 3.3 shows the core of each data warehouse: a data cube. In this example
three dimensions are chosen, i.e. time, product and area.
The schema of figure 3.4 is called a star schema. It is composed of a fact table,
multiple dimension tables and foreign keys from the fact table to the dimension
tables.

SECTION 3.5

DATA WAREHOUSE COMPONENTS

Data source 1
Data
loaders
Data source 2
.
.
.

DBMS
Data warehouse

Query and
analysis tools

Data source n

Figure 3.2: Data warehouse architecture. [Silberschatz et al.]8

time
Product

area

Figure 3.3: Data Cube. [Meier et al.]7

3.5

Data Warehouse Components

There are two different possibilities when and how to gather data. In a source-driven
architecture data sources transmit new information continually or periodically to the
data warhouse. In a destination-driven architecture a data-warehouse periodically

SECTION 3.6

DATA WAREHOUSE OPERATIONS

item-info
item-id
itemname
color
size
category

date-info
date
month
quarter
year

sales
item-id
store-id
customer-id
date
number
price

10

store
store-id
city
state
country

customer
customer-id
name
street
city
state
zipcode
country

Figure 3.4: Star Schema. [Silberschatz et al.]8

sends requests for new data to sources.


What schema to use? Data sources have been created independently and often have different schemas. They even may use different data models. The data
warehouse has to convert the data to the integrated schema before they are stored.
Data cleansing is the action of the data warehouse of correcting and preprocessing data. Data sources often deliver data with numerous minor inconsistencies.

3.6

Data Warehouse Operations

Drill Down and Roll Up is used to see more of less granularity. To make this
actions possible, data need to bee stored hierarchically, like in figure 3.5.
Slicing is the action to select a part of the data cube, e.g. all products in one
time period.
Dicing is called the action of re-order the main dimensions of a data cube.

SECTION 3.6

DATA WAREHOUSE OPERATIONS

Year
Region
Quarter
Country
Day of week

Month
State

Hour of day

Date
City

Date Time
a) Time Hierarchy

b) Location Hierarchy

Figure 3.5: Hierarchies on dimensions. [Silberschatz et al.]8

11

Chapter 4
Conclusion
Customer data warehouses are decision-support systems to help analyze and achieve
online data collected by transaction processing systems, to help companies to make
business decisions. To achieve this objective data warehouses are used to analyse
historical data, for instance to predict trends. Warehouse schemas tend to be multidimensional, involving one or a a few very large fact tables and several much smaller
dimension tables.

Bibliography
[1] R. C. Blattberg, G. Getz, J. S. Thomas (2001). Customer Equity Building and Managing Relationships as Valuable Assets. Harvard Business
School Press: Boston.
[2] Francis Buttle (2004). Customer Relationship Management. Concepts and
Tools.
[3] J. Han, M. Kamber (2001). Data Mining - Concept and Techniques. Morgan
Kaufmann: San Francisco.
[4] P. Kotler, C.J. Dipak, S. Maesincee (2002). Marketing Moves - A New
Approach to Profits, Grothw, and Renewal. Harward Business School Press:
Boston.
[5] Andreas Meier. A Data Warehouse Approach to Customer Relationship Management.
[6] Andreas Meier (2004). Relationale Datenbanken - Leitfaden f
ur dir Praxis.
F
unfte Auflage.
semann, Thomas Zurkinden (2004). Data
[7] Andreas Meier, Stefan Hu
Warehouse Architecture.
[8] Silberschatz, Korth and Sudarshan (2002). Database. System Concepts.
Fourth edition.
[9] R. T. Watson (1999). Data Management: Databases and Organisations.

You might also like