WP3137 A DQ Survival Guide
Perhaps you recognize or have suffered at the hands of some of these problems. We
list and discuss them briefly here to establish a common understanding of what we face.
One thing that becomes apparent is that data quality problems exacerbate each other. For
example, if you have duplicate records, some of the duplicates will not be reconciled if
crucial data elements such as addresses are incorrect or not standardized.
Data quality is crucial to knowing who your customers are and reaching them in an
effective manner. In order for marketing efforts to gain the greatest benefits, a clear
and single view of the customer is necessary. Without this view, contacts are made
with the wrong prospects and the right prospects either are missed or have multiple
touches that are confusing and costly. A goal of marketing is to cross-sell and up-sell
existing customer accounts, which is not achievable if the multiple accounts for the
same customer are not matched and consolidated into a single view. This is where
data quality plays a direct role in delivering value through marketing efforts.
An organization usually first experiences the need for data quality and
other EIM capabilities when it builds a CRM or customer data integration (CDI)
system and finds that the data loaded into the system falls far short of expectations.
Throughout this paper, we use CRM as the surrogate for marketing data repositories
in general. Somewhere, somehow, customer and prospect data must be stored
and accessed, and CRM/CDI systems, whether homegrown or vendor-supplied,
are the common repositories for this data.
Your position within the marketing organization determines your visibility into the
data and the perceptions of the quality of that data. The higher in the organization,
the more removed a manager is from the data that drives their operations. A chief
marketing officer (CMO), for example, may be the person to ask the question, “How
do I know my data is defective?” The field-marketing specialist is more likely to
say, “I know the types of problems; I need the counts.” Fortunately, no matter
who is asking, the solution to both situations and other data integrity questions
is the same: Conduct a data quality assessment. Without the findings from an
assessment, you’ll have a number of issues to deal with:
• You won’t know the scope and depth of your problems. For example, are they
systemic or superficial?
• You won’t know the cause of the defects. Without knowing the types of problems,
you can’t track back in the process to isolate the source.
• You won’t know how effective the resulting cleansing operation was.
• The cleansing operation may very well miss whole categories of problems.
• You won’t be able to conduct trend analysis over time to see how your data is
regressing or progressing.
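Such an assessment can start with simple counts of blank and invalid values per field. A minimal sketch, with hypothetical records and validity rules (the ZIP and email checks are illustrative, not rules from this paper):

```python
import re

# Hypothetical prospect records pulled from a CRM extract.
records = [
    {"name": "Acme Corp", "zip": "54601", "email": "info@acme.com"},
    {"name": "", "zip": "5460", "email": "not-an-email"},
    {"name": "Acme Corporation", "zip": "54601", "email": "info@acme.com"},
]

def profile(records):
    """Count blank and invalid values per field to gauge defect levels."""
    report = {}
    for field in records[0]:
        values = [r[field] for r in records]
        blanks = sum(1 for v in values if not v.strip())
        report[field] = {"blank": blanks, "total": len(values)}
    # Field-specific validity rules (assumed for illustration).
    report["zip"]["invalid"] = sum(
        1 for r in records if not re.fullmatch(r"\d{5}(-\d{4})?", r["zip"])
    )
    report["email"]["invalid"] = sum(
        1 for r in records if "@" not in r["email"]
    )
    return report
```

Even a profile this crude answers the CMO's question (how defective?) and the specialist's (which fields, how many?) from the same run.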
Many data quality problems are process-based. That is, the problems result from
non-standard practices, unvalidated data entry, or simply faulty
application design. The results of a data quality assessment often uncover process
issues as you work through the cause and effect. The assessment exposes the
effect, and it’s a relatively simple matter for the marketing manager to backtrack
through the data distribution chain to, for example, the account management system
and verify field edits are being used or enforced.
A data quality assessment is something marketing managers can do themselves,
especially if they have an assessment tool suitable for business users. Another
alternative is for the marketing manager to engage IT to conduct the assessment,
or even contract with a third-party information management-consulting firm.
Regardless, there is little mystery to conducting a data quality assessment. The
hardest part belongs with the business–that is, marketing–to articulate the business
rules that define good or bad. What are the rules that govern a specific field, such
as product name? For example, is it a mandatory field? Can the field contain
abbreviations? Is there a maximum length? Are special characters allowed? How many generations
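Rules like these translate directly into executable checks. A sketch for the product-name field, with limits and allowed characters chosen purely for illustration:

```python
import re

# Illustrative business rules for a product-name field; the specific
# limits and allowed characters are assumptions, not from this paper.
RULES = {
    "mandatory": True,
    "max_length": 40,
    "pattern": re.compile(r"^[A-Za-z0-9 .\-]+$"),  # letters, digits, space, . -
}

def validate_product_name(value):
    """Return a list of rule violations for one field value."""
    violations = []
    if RULES["mandatory"] and not value.strip():
        violations.append("missing")
        return violations
    if len(value) > RULES["max_length"]:
        violations.append("too long")
    if not RULES["pattern"].match(value):
        violations.append("bad characters")
    return violations
```

The hard part remains articulating the rules; once written down, encoding them is mechanical.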
The best place for direct marketers to cleanse their data is as close to the point of
creation as possible. Consider if you will an information supply chain where at the
very beginning the data is captured from the prospect, perhaps at a trade show or
from a Web site registration form. See Figure 1.
The farther upstream you start cleansing, the earlier the ROI counter starts ticking.
TRANSACTIONAL UPDATES
This opportunity fits well with organizations that take a proactive approach to data
cleansing. Organizations can identify the entry points of information into the
organization–in this case, during transactions, such as a new customer login or
order entry–and where exposure to flawed data may occur. When a transaction is
processed, organizations have an opening to validate the data before it is saved to an
operational system. Transactional updating also affords the chance to validate data
as it arrives in its information packet, rich with contextual information. Because this
contextual setting is lost as soon as the data is sent downstream, it is
important to leverage it.
By their very nature, transaction updates force organizations to handle individual
information packets as they become available, which implies real-time processing,
low volumes, and a potentially wide distribution of implementation. In other words,
the cleansing functionality must be connected to or embedded in the transactional
environment and be able to respond in milliseconds, and also be able to service
multiple transactional applications.
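A transactional validation hook of that kind might be sketched as follows; the field names and checks are illustrative, and a real implementation would be embedded in the order-entry application itself:

```python
def validate_transaction(txn):
    """Synchronous pre-save check, fast enough to sit inline in the
    order-entry path. Field names and rules are illustrative."""
    errors = []
    if not txn.get("customer_name", "").strip():
        errors.append("customer_name required")
    email = txn.get("email", "")
    if email and "@" not in email:
        errors.append("email malformed")
    return errors

def save_order(txn, store):
    """Persist the transaction only if it passes validation."""
    errors = validate_transaction(txn)
    if not errors:
        store.append(txn)
    return errors
```

Rejecting the record at this point, while the prospect is still on the line or the form is still open, is exactly the contextual opportunity that is lost once the data moves downstream.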
PURCHASED DATA
The third opportunity to cleanse is when you purchase data from a third party. Many
organizations erroneously assume data to be clean when purchased. Not so. Buying
third-party data is in many ways like buying a used car. Do you really know what the
previous owner has done to it? Of course not, that’s why you take the car to your
mechanic to have him pop the hood and put it on the hoist. You should do the same
thing with purchased data; otherwise, you are essentially abdicating your data quality
standards to those of the vendor.
In the case of a purchased list for a marketing campaign, you can ask for a random
sample from the prospect list and conduct your own data quality assessment.
Rudimentary tests for field completion and validation are simple to run. Validating
purchased data extends to matching the purchased data against your current data
set. Merging two clean data sets is the equivalent of pouring a gallon of red
paint into blue. A merge will not equate to 1 + 1 = 2; it is more like 1.5, because
of duplication between the data sets, and that duplication may not be easily
reconciled. Two records may appear the same, but one might have a crucial field
that differs. The merged data sets must be matched and consolidated as one
new, entirely different set to ensure continuity. A hidden danger with purchased data
is that it arrives as an ad hoc event, which means no regular process (a cleansing job
with business rules) exists to incorporate the data into an existing system. The lack
of regularly occurring processes raises the specter that in the rush to get the file
loaded, “expedient” shortcuts may be taken.
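The sampling and overlap checks described above can be sketched as follows; matching on a lowercased email alone is a deliberate simplification, since production matching would compare several standardized fields:

```python
import random

def sample_records(purchased, n, seed=0):
    """Draw a reproducible random sample from a purchased list for assessment."""
    rng = random.Random(seed)
    return rng.sample(purchased, min(n, len(purchased)))

def overlap_rate(purchased, existing):
    """Estimate duplication between a purchased list and current data.
    Keying on lowercased email is a simplification for illustration."""
    existing_keys = {r["email"].lower() for r in existing}
    dupes = sum(1 for r in purchased if r["email"].lower() in existing_keys)
    return dupes / len(purchased)
```

A high overlap rate on the sample is a signal to negotiate on price or to budget for a serious match-and-consolidate pass before loading.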
REGULAR MAINTENANCE
The fifth opportunity is during regular maintenance. Even if an organization starts
with perfect data today, tomorrow it will be flawed. Data ages–and ages more quickly
than most expect. For example, 17% of U.S. households move each year, and in some
years, as many as 60% of phone records change in some way. Moreover, every day
people get married, divorced, have children, have birthdays, get new jobs, get promoted,
and change titles. And if that wasn’t enough, the companies we work for start up, go
bankrupt, merge, acquire, rename, and spin off. To account for this inexorable
aging process, organizations must implement regular data cleansing and consolidation
processes, be they nightly, weekly or monthly. The longer the interval between
regular data quality activities, the lower the overall value of your data.
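Those aging figures can be turned into a rough decay estimate. Assuming a constant annual rate of change compounded over time (a back-of-the-envelope model; only the 17% annual move rate comes from the text above):

```python
def share_still_current(annual_change_rate, months):
    """Fraction of records still valid after `months`, assuming a
    constant annual rate of change compounded over fractional years."""
    return (1 - annual_change_rate) ** (months / 12)

# With 17% of households moving per year, quarterly cleansing acts while
# roughly 95% of addresses are still current; annual cleansing waits
# until only 83% are.
quarterly = share_still_current(0.17, 3)
annual = share_still_current(0.17, 12)
```

The shorter the interval, the smaller each cleansing job and the higher the average quality of the data in between, which is the trade-off behind the nightly/weekly/monthly choice.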
[Figure: cleansing opportunities shown across systems and events, including maintenance, obsolete and home-grown systems, legacy and CRM systems, call center, and migration.]
There are nine data quality functions marketers call upon to cleanse their data. As
shown below and depicted in Figure 4, in order of their occurrence in a data quality
project, those functions are:
1. Measure
2. Analyze
3. Identify (Parse)
4. Standardize
5. Correct
6. Enhance
7. Match
8. Consolidate
9. Monitor
For a record to be useful, its various components must be identified and standardized–
as in changing corporation to corp and correcting divson to division. The record must then
be matched and consolidated with the other records pulled from the source systems.
Measurement and analysis kick off the process by providing metadata on the level
and types of defects found in the source data, so subsequent cleansing operations
can be tailored for the greatest effect.
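The parse, standardize, and correct steps can be sketched as table-driven transformations; the abbreviation and correction tables here are illustrative stand-ins for rules a real project would derive from the measure and analyze phases:

```python
import re

# Illustrative rule tables; a real project derives these from profiling.
CORRECTIONS = {"divson": "division"}       # known misspellings
ABBREVIATIONS = {"corporation": "corp"}    # standard abbreviations

def parse_company(raw):
    """Identify (parse) the components of a free-form company string."""
    return re.findall(r"[a-z0-9]+", raw.lower())

def standardize_and_correct(tokens):
    """Fix known misspellings, then map tokens to standard abbreviations."""
    out = []
    for token in tokens:
        token = CORRECTIONS.get(token, token)
        token = ABBREVIATIONS.get(token, token)
        out.append(token)
    return " ".join(out)
```

Once every source system's records pass through the same tables, two spellings of the same company reduce to one normalized string, which is what makes the later match step tractable.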
The first six functions–including enhancement, where additional data such as
demographic or geographic codes is appended–improve the data to the point where it
can be matched and consolidated. Matching and consolidation deliver tremendous value
to marketing: duplicate records are eliminated, best-of-breed records
are built, and the manager gains a single view of each prospect or customer
within the context of the applied source data. Now able to build a corporate or
retail household for target marketing, the marketing manager can identify the top
20% of the customer base or form demographic groups for segmentation in the
next campaign.
Last, monitoring uses the business rules and definitions created in the measure and
analyze phases to create an automated profiling project that provides managers with
defect information (metadata) at any time, so they can make decisions as to whether
the data is good enough to use or needs to be improved for the next operation.
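A monitoring check of this kind reduces to computing a defect rate and comparing it against a threshold; the 5% default below is an arbitrary example, not a recommendation from this paper:

```python
def defect_rate(records, is_defective):
    """Share of records failing a business rule (the monitoring metadata)."""
    return sum(1 for r in records if is_defective(r)) / len(records)

def good_enough(records, is_defective, threshold=0.05):
    """Decision helper: is the data fit for the next operation?
    The 5% default threshold is purely illustrative."""
    return defect_rate(records, is_defective) <= threshold
```

Because the rule is the same one written during measure and analyze, the monitor reports apples-to-apples numbers campaign after campaign.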
Once the marketing manager has determined the nature and scope of data problems
and has determined the data quality functionality needed, there are a number of
options available for connecting the functionality to the data. In broad terms, those
options are:
• On-premise software
• On-demand software
• Service bureaus
These options range from having the greatest control and largest footprint (on-premise
software) to least control and no footprint (service bureau). From a cost basis one
might suspect that on-premise software would be the most expensive. However, the
cost of maintaining a level of data quality is not limited to the initial expenditure.
Consider that the data and its usage will extend as far into the future as the organization
remains in existence. Maintenance fees, per-record charges (otherwise known as
click charges), or subscription fees will, over time, exceed initial software license
fees. What becomes important when calculating the cost of data quality processing
is the breadth and depth of functionality needed and the volume of records processed
each month.
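The license-versus-click-charge trade-off can be made concrete with a cumulative-cost comparison; all figures below are invented for illustration:

```python
def license_cost(initial_license, annual_maintenance, years):
    """Cumulative cost of on-premise software: license plus maintenance fees."""
    return initial_license + annual_maintenance * years

def click_charge_cost(records_per_month, price_per_record, years):
    """Cumulative cost of per-record ("click") charges over the same period."""
    return records_per_month * 12 * years * price_per_record
```

With invented figures of a $100,000 license plus $20,000 per year maintenance, against $0.01 per record on 1 million records a month, the two curves cross after about a year and click charges dominate thereafter, which is why monthly volume belongs in the calculation.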
ON-PREMISE SOFTWARE
Next to service bureaus, on-premise software is the oldest data quality delivery
mechanism. Early data quality software vendors such as Postalsoft began selling
and distributing on-premise data quality software in the mid 1980s. On-premise
software is simply that: a software application—either commercial or hand-coded—
that resides in your facility and is run by IT or the marketing staff. Sometimes the
software is run by third-party consulting or contracting agencies and can be operated
locally or remotely via a virtual private network (VPN) or Internet connection. The
advantages of on-premise software are you control the application, the parameter
settings, the computing environment, processing schedule, and so on. The disadvantage
is that your organization is responsible for all of the above. You need to have the
system resources, personnel, and training to run the software. For most firms, however,
the advantages far outweigh the disadvantages. Basically, if a firm has grown to the
size where it has any sort of customer data management system and an IT staff to
match, it usually has the capabilities to host and run data quality software internally.
ON-DEMAND
On-demand software is the newer incarnation of what was previously known as an application
service provider, or ASP. With on-demand software, the marketing manager contracts
with a third–party service provider and accesses the contracted software via the
Internet. The advantage of on-demand is the service provider bears the complete
burden of installing, running, and maintaining the software at its own facilities. The
disadvantage is the user must trust the provider to safeguard any data that is
stored at the provider’s facility, and functionality offered by the provider may be
limited when compared to on-premise software. With Internet reliability constantly
improving, access to contract software is rarely a problem, and almost always the
user interface is Web-enabled and therefore accessible via an Internet browser. A
downside of on-demand for data quality processing is that the customer’s data must be
uploaded to the service and then returned after cleansing. This round-tripping of data
adds latency to processing times. However, if the marketing manager is not
interested in real-time processing, the added latency may be of little concern.
On-demand actually allows the marketing manager the option of creating a hybrid
solution. Look at address cleansing as an example. A firm may have 5 million
customer and prospect records, 80% of which have U.S. addresses, 15% have
European addresses, and the remaining 5% have Japanese addresses. At the
volumes and frequency of processing the firm averages per month, in addition to
SERVICE BUREAUS
Service bureau processing is yet one step further removed than on-demand. With a
service bureau, a complete project–including the data file, processing rules, and
delivery instructions–is sent to a third-party agency that processes the job
in batch with relatively little interaction with the customer. The advantage of a service
bureau is the marketing manager or their IT counterpart need not run any software,
either on-premise or on-demand. Once the initial effort is taken to establish the
contract and job requirements, the hard work is done. The files are delivered to the
service bureau and the customer awaits either their return or, in the case of a direct
mail/email campaign, the marketing pieces are sent to the prospects. The disadvantage
of using a service bureau is the client must trust the bureau to follow all the requirements
and perform the proper cleansing, and the work performed is largely on the service
bureau’s schedule. There are ways, of course, to validate that the bureau has complied
with all the requirements, and when negotiating the contract the customer can set
the desired delivery date of the finished product.
In the grand scheme of things, marketing managers have numerous options for ensuring
and uplifting the quality of their data. EIM provides a framework for deploying those
options together in one streamlined process flow. The EIM framework contains everything
from data integration–extract, transform, and load (ETL) or enterprise information
integration (EII)–through metadata management, data quality, and building specialized
reporting marts. Data integration applications are a primary deployment mechanism for
data quality functionality, which makes it convenient to cleanse data when it is being
moved into or out of your CRM/CDI system.
Today’s data quality vendors have built rich and deep functionality that can
remediate almost any customer data problem, and they’ve structured their deployment
mechanisms to give you the greatest flexibility in deciding when, where, and how
to cleanse data. On-premise, internal hosting, on-demand, service bureaus, or any
combination thereof are the options that can be tailored to infrastructure and
marketing needs. With the latitude of options available, there is really no reason
why suboptimal data should be used to deliver suboptimal results in your marketing
efforts, whether you’re identifying cross-sell opportunities or distributing leads to
the appropriate sales person. The question for marketing managers becomes: Why
marginalize your marketing efforts when better results lie in improving your data?
© 2008 Business Objects. All rights reserved. May 2008 WP3137-A