Customer Integration Manager

Get a 360 degree view of your customers with our premier onsite data integration solution
Technical White Paper

Table of Contents
The Need for Customer Data Integration
Data Integration Challenges
The Data Integration Process
Deployment Options
Introducing Customer Integration Manager: the Premier Business Customer Data Integration Solution from D&B
Business Benefits
How It Works
The Customer Integration Manager Matching Process
Technical Benefits
Sample Applications
Integrated Customer Relationship Management
Consolidated Sales Data for Customer Analysis
Interactive Customer Lookup During Order Entry
Implementation
Monthly Update
Support
Summary

All data is fictitious.

Customer Integration Manager Technical White Paper

The Need for Customer Data Integration

CRM. ERP. Supply Chain. Demand Management. Business Intelligence. Data Warehouse. Look inside any of today's strategic business technologies and you'll find the same requirement: to share data freely and accurately across different business functions. Data sharing is what lets customer relationship management systems ensure consistent customer treatment across different channels. It is what lets enterprise resource planning systems coordinate manufacturing plans with sales forecasts and inventory levels with delivery schedules. It is what lets quality managers track customer complaints back to specific production units.

In the abstract, data sharing looks easy enough: just connect a few boxes on a white board or assume the whole company moves to a single all-purpose software package. But reality is much more complicated. Different departments, divisions, and subsidiaries will continue to run separate systems, and even if every member of the corporate family does adopt the same integrated product, every supplier and customer will not. While technology has simplified the physical movement of data among systems, physical movement is not enough. Data must be logically integrated as well.

This means your systems must recognize that the customer who bought this product through the order entry system is the same customer who logged that complaint in the customer service system and the same customer about to receive those mail promotions from the marketing system. The trouble is, each system holds its own record for that customer in a different format, with different spellings, and possibly even under different names. You can close your eyes, click your heels three times, and hope the separate systems go away. Or you can find a way to identify the relationships among those records despite the differences.

[Figure: Customer Integration Manager links customer data across systems (Marketing Database, Sales, Web Site, Data Warehouse)]

Finding those relationships is part of a process called customer data integration. There is more to customer data integration than just matching customer records, although matching is at the heart of the process. Customer data integration also involves establishing relationships among records that cannot be matched directly, such as corporate parents and subsidiaries with different names or addresses. It identifies the most accurate version of information, such as a company telephone number or mailing address, that is different in different systems. Most important, it assigns a permanent identification number to each customer. This permits easy sharing of customer data without complicated, on-the-fly matching processes.

Accurate customer data integration is particularly important for achieving the promises of today's customer relationship management projects: lower marketing costs, higher retention rates and greater revenue per customer, among other tangible benefits. All these goals are premised on developing a complete picture of each customer's relationships with your business. This requires both combining data from different systems and linking records within the same system that refer to the same customer.

Sometimes multiple records exist by mistake and can be combined. But there is often a valid reason for the same customer to have more than one record, particularly when dealing with businesses rather than individual consumers. A business customer may have separate accounts for different departments, locations or even specific projects. While these accounts must remain distinct for operational reasons, accurate customer management and analysis must recognize that they are related. In fact, the most active customers are the most likely to have multiple accounts, making consolidation a critical function in managing your largest clients as well as the smaller ones.

Although improved customer relationships are more than sufficient reason to undertake a data integration project, there are other benefits as well. One is the increasing reliance on consolidated data for processes other than direct customer contact, such as distribution, financial planning and purchasing. In fact, consolidated data on suppliers can sometimes be as valuable as consolidated data on customers. Data integration is also the foundation of data warehouse and business intelligence systems, which are increasingly used for both strategic and tactical business decisions. Nothing destroys the credibility of a data warehouse report more quickly than an obvious understatement of sales to the company's largest accounts and, as already noted, inaccurate integration is most likely to affect precisely those customers.

At the most basic level, customer data integration saves money by avoiding redundant data entry and cleaning projects, providing more accurate information, and supporting a complete, accurate and multifaceted view of the customer.

Data Integration Challenges

The importance of customer data integration is clear, but so are the difficulties. The fundamental problem is that data is captured in many different systems, each with its own formats, standards and requirements. Data may be perfectly adequate for its original purpose yet still not suitable for integration: for example, names and addresses in free-form text fields can generate serviceable mailing labels but are difficult to analyze for matching. Similarly, operational processing is often unaffected by data entry inconsistencies, such as variations in formats and abbreviations, that make parsing and matching still more difficult. Operational users often cram extraneous information, such as customer status codes, into name and address data; this makes perfect sense to a human reader or customized computer program but confuses external integration processes. Or an operational system may lack basic information needed for any matching: for example, a help desk system might capture only the customer's first name and a telephone number to call with an answer, but no mailing address or account ID.

But data integration is difficult even when the finest operational systems are in place. Customers themselves often provide different information to different systems, either by mistake or to meet different purposes: a billing system may correctly hold a different address than a shipping system. Nor do customers update every company system when important changes take place: if they haven't called for customer service since their last move, the customer service system will have an outdated address. The system changes needed to support integration can themselves be a challenge, particularly since many corporate IT departments have little experience with customer data integration technologies.

Working with business rather than consumer data adds yet another level of complexity. Business records include not just a simple name, address and city/state/ZIP but personal and company names, titles, departments, buildings, mail stops, and other elements. They also contain industry terms, such as "DBA" for "doing business as", that must be recognized and interpreted appropriately. Few companies follow enterprise-wide formats to capture and hold these elements, and even firms with internal standards must contend with data from external sources. Business files may also hold different names for the same company, such as legal vs. trade names. Common names may themselves be represented differently: "Kentucky Fried Chicken", "KFC", "Kentucky FC" and any number of variations. Related firms may have totally unrelated names, such as Lotus Development Corporation, a subsidiary of IBM. Multiple locations for the same business must somehow be brought together.

Today's business trends make customer data integration more challenging than ever. As more systems are connected, integration techniques must become more efficient to handle the increased volume. Interactive applications impose strict performance requirements to ensure data entry and customer service processes are not delayed. Privacy and security regulations impose strict limits on how data is shared, making it more important than ever to ensure accurate consolidation, precise access control, and audit trails on changes. Privacy concerns also make it more difficult to gather data directly and thus more important to share as widely as possible whatever data your company has already acquired.

The Data Integration Process

Customer data integration may be difficult, but it is far from impossible. In fact, experience over the past several decades has provided a firm understanding of the steps in an effective data integration process:

[Figure: Data Integration Process. Inputs: Batch Input, Online Input. Stages: Parse, Standardize, Group, Match, Enhance. Supporting data: Reference Rules & Data, Validation Data (if used), Supplemental Data. Outputs: Batch Output, Online Output.]

Input. This is the initial process of gathering and presenting the data to be integrated. Traditionally, large numbers of records were extracted from source systems, loaded into files, and run through the integration process as a group. Results were then posted back to the source system or used elsewhere. This is referred to as "batch" processing. Many data warehouses and marketing databases are built this way.

Other systems use "online" processing, where an external system presents a single record to the integration system and waits for the result before continuing. This often involves interactive processes where a human user is working on the record. It also includes fully automated processes, such as processing a credit card transaction, where immediate response is needed even though there is no direct user involvement. Online processing is technically more difficult than batch processing, because the external system must be modified to present the data and use the results. Online integration must also run quickly enough to avoid a significant decline in source system performance.

Matching. This is the heart of the data integration process, where records are compared to find which are related. It usually includes three stages:

1. Parsing isolates data elements so they can later be compared with corresponding elements on other records. For example, a typical parser would split a name into title (Mr., Mrs., Ms., etc.), first name, middle initial, last name, generation (Jr., III) and suffix (Ph.D., L.L.D.). Addresses and other data would be similarly broken into components. Parsing also identifies missing or questionable data. Parsing is necessary because many systems store name and address data as text lines rather than separate elements. Often the parser must determine both the nature of the line and the sequence of the elements within it. Sophisticated parsers use reference tables that list how specific words are likely to be used ("Corporation" is probably part of a company name; "Andrew" is probably a first name) and the likely sequences of elements within lines of different types. Reference tables must be tuned for specific applications such as business or consumer processing and to adjust for different national data formats.

Input Record
    Dan Brandstreet
    3 Silven Way
    Parrippany NJ 07054

Parsed Record
    Name:         Dan Brandstreet
    Street Nbr:   3
    Street Name:  Silven Way
    City:         Parrippany
    State:        NJ
    Postal Code:  07054

2. Standardization is the stage in the process that converts data elements to standard formats to improve match accuracy. On consumer records, it might replace nicknames such as Bob, Bobby and Rob with a formal name of Robert. For businesses, it might replace different versions of a company name with a similar standard. Like parsing, standardization relies heavily on tables to make such corrections. Address standardization may simply apply standard formats or it may extend to validation and correction using actual postal tables. These could determine which street names exist in which cities and what postal code applies to each address. But postal validation and coding are generally considered separate processes from other standardizations and are often applied outside of the data integration system.

Element       Input Record      Standardized Record
Name          Dan Brandstreet   Dun & Bradstreet
Street Nbr    3                 3
Street Name   Silven Way        Sylvan Way
City          Parrippany        Parsippany
State         NJ                NJ
Postal Code   07054             07054

3. Grouping determines which records are compared to each other. This avoids the inefficiency of comparing everything to everything else. Most systems build a group key by extracting portions of data elements such as state, city and last name. Often several keys are created to bring together records that a single method might miss. For batch matching, records are typically assembled into one large file, sorted or indexed on the key, and then compared in sequence. For online matching, the keys are stored permanently on the customer database. When a record is presented for matching, the online system generates keys for that record and selects customer database records with the same keys for detailed comparison.

Linking. The process of deciding which records should be considered a match. All matching systems ultimately rely on string comparisons; that is, they assess the similarity between the text strings assigned to the different data elements in the records being compared. Some systems extract a few characters from each data element to create a "match code" and treat all records with the same match code as a match. Some calculate numeric similarity scores for each pair of elements, add the scores, and treat as a match any pair of records whose total exceeds a specified threshold. Some assign alphabetic codes to different types of element matches and specify which combinations of codes are considered a record match. Each approach has its proponents, but the code-combination technique is the most common among sophisticated matching systems. The advantage of this method is precise control over how each combination of element-level matches is treated. The major criticism is that specifying treatments for tens of thousands of different combinations is a great deal of work. But vendors of such systems provide pre-built tables to spare users most of this labor.

An ultimately more significant difference among matching systems is whether they compare input records against each other or against a separate validation file. Such validation files are compiled from external sources and contain all entities that might appear on the input files themselves. Each entity on the validation database is assigned a fixed ID number. Input records are compared with the validation file using conventional string-comparison techniques; when a match is found, the ID is copied from the validation database record to the input record. Input records that end the process with the same ID are assumed to match.
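The matching stages described above (parsing, standardization, grouping and linking against a validation file) can be sketched end to end as follows. This is only a minimal illustration under stated assumptions, not the method of any particular product: the reference tables hold a single entry each (taken from the fictitious sample record in this paper), the group key layout, the summed-score rule and the threshold of 1.6 are arbitrary choices, and all function names are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Tiny illustrative reference tables (assumptions; real tables are far larger).
NAME_STANDARDS = {"dan brandstreet": "Dun & Bradstreet"}
STREET_STANDARDS = {"silven way": "Sylvan Way"}
CITY_STANDARDS = {"parrippany": "Parsippany"}


def parse(lines):
    """Parsing: split [name, street, city/state/postal] text lines into elements."""
    name, street, locality = (line.strip() for line in lines)
    street_match = re.match(r"(\d+)\s+(.+)", street)
    locality_match = re.match(r"(.+?)\s+([A-Z]{2})\s+(\d{5})$", locality)
    return {
        "name": name,
        "street_nbr": street_match.group(1) if street_match else "",
        "street_name": street_match.group(2) if street_match else street,
        "city": locality_match.group(1) if locality_match else locality,
        "state": locality_match.group(2) if locality_match else "",
        "postal_code": locality_match.group(3) if locality_match else "",
    }


def standardize(rec):
    """Standardization: replace known variants with table-driven standard forms."""
    out = dict(rec)
    out["name"] = NAME_STANDARDS.get(rec["name"].lower(), rec["name"])
    out["street_name"] = STREET_STANDARDS.get(rec["street_name"].lower(), rec["street_name"])
    out["city"] = CITY_STANDARDS.get(rec["city"].lower(), rec["city"])
    return out


def group_key(rec):
    """Grouping: build a key from portions of state, city and name."""
    return (rec["state"], rec["city"][:3].upper(), rec["name"][:1].upper())


def similarity(a, b):
    """Element-level string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link(rec, validation, threshold=1.6):
    """Linking: return the ID of the best validation record in the same group,
    or None when no candidate's summed name + street score reaches the threshold."""
    best_id, best_score = None, threshold
    for cand in validation:
        if group_key(cand) != group_key(rec):
            continue  # only compare records that share a grouping key
        score = (similarity(rec["name"], cand["name"])
                 + similarity(rec["street_name"], cand["street_name"]))
        if score >= best_score:
            best_id, best_score = cand["id"], score
    return best_id


# Fictitious validation file with one entity and its fixed ID.
validation_file = [{"id": 1, "name": "Dun & Bradstreet", "street_nbr": "3",
                    "street_name": "Sylvan Way", "city": "Parsippany",
                    "state": "NJ", "postal_code": "07054"}]

parsed = parse(["Dan Brandstreet", "3 Silven Way", "Parrippany NJ 07054"])
standardized = standardize(parsed)
matched_id = link(standardized, validation_file)  # ID copied from the validation record
```

In this sketch, two input records that both come back with the same `matched_id` would be treated as a match, even if their original text differed, which is the point of comparing against a validation file rather than against each other.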

Validation-based matching is significantly more effective than direct comparison because the validation database can contain links between records that are not physically similar. For consumer data, this might be someone who has changed addresses or has summer and winter homes. This not only lets companies unify customer records that would otherwise remain fragmented, but also provides notice that a customer has moved even if he has not informed the company directly. Advantages for business matching are greater still, because of the many reasons (legal vs. trade names, parents vs. subsidiaries, headquarters vs. branch sites, etc.) that related records appear different.

The value of a validation database depends on its coverage and the accuracy of its linkages. The major data compilers can afford to invest in comprehensive validation files, knowing they will reuse the data many times over. Since only an ID number is transferred from the validation database to the input records, other information compiled by the database developer can remain hidden or be provided for an extra fee. Validation-based systems are available for consumer data in the U.S., United Kingdom and a few other countries. Business matching files are available for most major developed countries.

Most systems also provide an option for manual review of suspect matches, that is, record pairs that are similar, but not close enough to be accepted as matches automatically.

The final output of any linking process is a set of records with IDs that identify members of the same match group. In consumer matching, the match group often represents a single individual or household; in business matching, it may represent records at the same site or belonging to the same company. Matching systems often give each record several IDs, representing different groupings.

Enhancement. Once matches are identified, the process may attempt to create a more complete customer record by combining information from multiple sources. This could involve comparing conflicting data from different systems and picking the values that seem most likely to be correct. For example, birth date may be populated in one system and blank or obviously wrong (e.g., 99/99/99) in another. Or the system may combine information from multiple records, such as the sum of all account balances. It might also look at an external data source that has information not collected internally. Enhancement can be separate from data integration, particularly when the enhancement data resides at an external source. But it is often more efficient to do enhancement within the integration processing stream. Like other integration processes, enhancement can be performed in batch or online. With today's communication technologies, even external data can be accessed by in-house online systems at acceptable speeds.

Output. The final step in integration is making the output available for use. In some cases, particularly online systems, this means returning the data back to the source system. In other cases, such as building a data warehouse, the information will go somewhere other than the original source. In still other configurations, the result is added to a cross reference table that links source system IDs such as account numbers with a standard customer ID. This particular approach allows source systems to share customer data without modifying their internal processes or data structures. Whatever the details, the only task of the integration system is to produce an output, either a batch file or online transaction, that links the original input with a standard customer ID. Other systems can then process this data in any way that is appropriate.

Deployment Options
Many firms have installed piecemeal integration solutions to meet specific operational requirements, such as sharing data between two particular systems. But enterprise-wide customer data integration usually involves a dedicated system to gather, process and distribute data across many different sources. Such systems can reside at external service bureaus, as part of an in-house operation, or even in a hybrid configuration. External service bureaus provide specialized expertise and have economies of scale that can make them more cost-effective than in-house processing. Service bureaus are particularly appropriate for periodic batch updates, where immediate turnaround is not required. They are also often used to produce consolidated files, such as marketing databases, that will reside at the service bureau rather than the company's internal systems.

External processing also simplifies validation-based matching, since the service bureau has immediate access to updates to the validation file.

In-house operation has become increasingly common, particularly in data warehouse projects where the entire system is maintained internally. In-house operation is almost required for online applications, where the data integration processing must take place quickly and be tightly connected with other corporate systems.

Hybrid solutions let companies do most integration processing internally, but still access external systems when needed. The most common hybrid configuration reads external files for validation or enhancement data. This lets companies use up-to-the-second versions of those files without the cost and security issues of copying them onto in-house systems. High-speed communications make it possible to use hybrid approaches even for online integration.

Introducing Customer Integration Manager: the Premier Business Customer Data Integration Solution from D&B
Customer Integration Manager provides a single, comprehensive solution to your company's business customer data integration needs. Customer Integration Manager:
- Assigns consistent customer IDs to records from all of your systems, allowing easy integration of customer information throughout your company
- Combines records from your own systems and from D&B in an in-house validation directory that is easily accessible for both batch and online processing
- Regularly updates the validation database with fresh D&B data
- Remotely searches D&B's own computers for matching records that have not already been downloaded
- Includes D&B's own customer matching engine, a sophisticated product using world-class technology and specifically tuned for business matching applications
- Performs both batch and online matching
- Returns a single best match or provides interactive users with a set of possible matches to evaluate
- Can automatically request further research by D&B staff when no match is found
- Codes records with the D&B D-U-N-S Number, D&B's universal company ID, opening the door to easy enhancement with data from D&B's own files and from many third parties

Business Benefits
Customer Integration Manager provides your company with the benefits of sophisticated customer data integration without the costs of building it yourself. By installing Customer Integration Manager, your company will:
- Gain a consolidated view of customer relationships that are spread over multiple accounts, sites and trade names, thereby supporting effective customer relationship management programs
- Facilitate online access by sales and service staff to customer information, no matter where the information is stored
- Expand your understanding of each customer by adding enhancement data from D&B and other providers
- Ensure the integrity of customer data in each corporate system, by identifying and eliminating discrepancies caused by entry errors, outdated information and incomplete records
- Improve the efficiency of internal operations by helping to coordinate front- and back-office systems
- Make better business decisions by providing more accurate data for marketing studies, sales analyses and other types of research
- Save time and money by acquiring a complete solution that is easily adapted to your needs, rather than attempting to develop a comparable system internally
- Benefit from the experience of D&B consulting services, world leaders in managing business customer data

How It Works
Customer Integration Manager accepts name and address input from your company systems and returns a standard ID plus company data from D&B files.

The system is built around the Common Customer Directory. This starts with a set of D&B records representing the universe of firms your company is likely to do business with. Each record refers to a specific business site, that is, a branch or office of a single business at a single location. Companies with multiple locations will have at least one site record for each location. If a company does business under multiple trade names at the same site, there will be a separate site record for each name. This improves matching accuracy since input records carrying any of these names can still be recognized as part of the same site.

Each D&B site record carries a unique D-U-N-S Number, a universal site identifier assigned by D&B and used throughout the world as a standard business ID. Records for the same site but with different trade names will have the same D-U-N-S Number. In addition to the site's own D-U-N-S Number, the site record will carry the D-U-N-S Numbers of the site's headquarters or corporate parent, national parent and global parent sites if these exist. D&B can provide additional data about each site from its own files, including name, address, telephone number, revenue, number of employees, and key executives. This is stored on the site record or in separate tables that are linked to the site records through the D-U-N-S Numbers.

All records carry a second site number, called the Logical Site ID or LSID. Unlike D-U-N-S Numbers, LSIDs are assigned independently at each Customer Integration Manager installation. Use of LSIDs is explained next.
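As a rough illustration of the site record described above, the following sketch models one Common Customer Directory entry. The class name, field names and helper function are hypothetical and are not taken from the product; the sample values are the fictitious ones used elsewhere in this paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SiteRecord:
    """One directory entry: a single trade name at a single business site."""
    name: str
    address: str
    lsid: int                                    # Logical Site ID, assigned per installation
    duns: Optional[str] = None                   # None for records with no D&B link
    headquarters_duns: Optional[str] = None      # headquarters or corporate parent
    national_parent_duns: Optional[str] = None
    global_parent_duns: Optional[str] = None


def trade_names_for_site(records: List[SiteRecord], duns: str) -> List[str]:
    """List every name under which the site with this D-U-N-S Number appears."""
    return [r.name for r in records if r.duns == duns]


# Fictitious records: two trade names at one site, plus one unrelated site.
directory = [
    SiteRecord("Dun & Bradstreet", "3 Sylvan Way, Parsippany NJ", 1111, "123456789"),
    SiteRecord("Dun Worldbase", "3 Sylvan Way, Parsippany NJ", 1111, "123456789"),
    SiteRecord("Dans Pizza", "13 Short Ave, Parsippany NJ", 2222),
]
names = trade_names_for_site(directory, "123456789")
```

Keeping the parent D-U-N-S Numbers on the site record is what would let a report roll individual sites up to a corporate family without any further matching.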

Underlying Technology
Customer Integration Manager is built on industry-standard technologies, making it an easy fit for most corporate systems environments. Supported platforms include:

Operating System: HP-UX 11, Sun Solaris 2.9, Microsoft Windows 2000; configuration depends on data volume and throughput

Minimum hardware requirements:
- Application Server: 2 CPUs @ 750 MHz, 2 GB RAM, 100 GB Disk
- Database Server: 4 CPUs @ 440 MHz, 4 GB RAM, 1 TB Disk

Database: DB2 (Version 2), Microsoft SQL Server 2000, and Oracle 9i RDBMS

The system is written in J2EE to provide a highly scalable architecture. It was built on a Struts framework with open-source solutions, JBoss and Jetty. Batch data inputs and outputs can be sent as delimited or fixed record length files. The system can monitor a specified directory and automatically initiate batch processes when new files appear.
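The batch file handling mentioned above can be pictured with a small polling sketch. It is written in Python purely for illustration (the product itself is described as a J2EE system), and the function names, the pipe delimiter and the field widths are all assumptions rather than documented product behavior.

```python
from pathlib import Path


def find_new_files(directory, seen):
    """Return files in `directory` whose names are not yet in `seen`, then remember them."""
    new_files = sorted(p for p in Path(directory).iterdir()
                       if p.is_file() and p.name not in seen)
    seen.update(p.name for p in new_files)
    return new_files


def split_delimited(line, delimiter="|"):
    """Split one delimited record into trimmed fields."""
    return [field.strip() for field in line.rstrip("\n").split(delimiter)]


def split_fixed(line, widths):
    """Split one fixed-length record into trimmed fields using column widths."""
    fields, pos = [], 0
    for width in widths:
        fields.append(line[pos:pos + width].strip())
        pos += width
    return fields
```

A scheduler would call `find_new_files` at intervals with the same `seen` set and start a batch run for each path returned.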


[Figure: Customer Integration Manager Operation. Company Systems send Name & Address to Customer Integration Manager, which uses the Common Customer Directory and returns Name & Address, D-U-N-S Number and D&B data.]

Once the D&B records are loaded, users can match them against records from the company's own systems. (See the next section for a detailed explanation of the matching process.) If the input record matches a D&B record, the input record is coded with the D&B record's D-U-N-S Number and LSID. This will also happen if the input record matches a previously loaded company record that is itself linked to a D&B record. If the input record matches a company-provided record without a D-U-N-S Number, it is given that record's LSID. If there is no match at all, the system assigns the record a new LSID. The result is that every record has an LSID while records linked directly or indirectly to a D&B record have a D-U-N-S Number as well.

Customer Integration Manager can find additional D-U-N-S Numbers by connecting remotely to the D&B master file. This process, called Remote Resolution, lets Customer Integration Manager search for records that were not selected for the local database or have changed on the master file but have not yet been updated locally. If a remote match is found, the D&B record is copied to the Common Customer Directory and its D-U-N-S Number is placed on the input record; an LSID is created or copied from an existing record as appropriate.

If no match is found on the D&B master file, the record can be forwarded to D&B's Global Resolution Services staff. This group will look for a match that the computer missed and can contact a non-matching site to determine if it is truly a valid business. Since manual research takes time, the non-matching record is initially returned without a D-U-N-S Number. If a D&B match is found later, the D-U-N-S Number will be added and the LSID adjusted if this links the record to a different site.
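The rules for coding an input record with a D-U-N-S Number and an LSID, as described in this section, can be summarized in a short sketch. The function name, the dictionary layout and the LSID counter are hypothetical; the sketch covers only the local matching rules and leaves out Remote Resolution and manual research.

```python
import itertools


def code_record(record, match, next_lsid):
    """Return a copy of `record` coded with an LSID and, where available, a D-U-N-S Number.

    `match` is the directory record the input matched (a dict with "lsid" and
    an optional "duns"), or None when no match was found.
    """
    coded = dict(record)
    if match is None:
        # No match at all: assign a new LSID and leave the D-U-N-S Number empty.
        coded["lsid"] = next_lsid()
        coded["duns"] = None
    else:
        # Matched a D&B record or a company record: copy its LSID, and copy its
        # D-U-N-S Number if that record is linked to a D&B record.
        coded["lsid"] = match["lsid"]
        coded["duns"] = match.get("duns")
    return coded


# Fictitious example values; the counter start is arbitrary.
counter = itertools.count(2222)
dnb_site = {"name": "Dun Worldbase", "lsid": 1111, "duns": "123456789"}

linked = code_record({"name": "Dun Wordbase"}, dnb_site, lambda: next(counter))
unlinked = code_record({"name": "Dans Pizza"}, None, lambda: next(counter))
```

Under these rules every coded record ends up with an LSID, while only records linked directly or indirectly to a D&B record carry a D-U-N-S Number.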

Common Customer Directory Creation

D&B Input = Initial Common Customer Directory
Name               Address        City        State  D-U-N-S #   LSID
Dun & Bradstreet   3 Sylvan Way   Parsippany  NJ     123456789   1111
Dun Worldbase      3 Sylvan Way   Parsippany  NJ     123456789   1111

Company Input
Name               Address        City        State
Dun Wordbase       3 Silver Way   Parrippany  NJ
Dans Pizza         13 Short Ave   Parsippany  NJ

Combined Input = New Common Customer Directory
Name               Address        City        State  D-U-N-S #   LSID   Note
Dun & Bradstreet   3 Sylvan Way   Parsippany  NJ     123456789   1111
Dun Worldbase      3 Sylvan Way   Parsippany  NJ     123456789   1111
Dun Wordbase       3 Silver Way   Parrippany  NJ     123456789   1111   (1)
Dans Pizza         13 Short Ave   Parsippany  NJ                 2222   (2)

(1) Company input matches D&B record; has D-U-N-S Number and LSID
(2) Company input does not match D&B record; has LSID only



Records can be presented for matching in batch or online. Online processes can choose the best match automatically or present interactive users with a list of candidates, ranked to present the most likely matches first. Batch processes can also create lists of questionable matches for users to resolve. Users can force a record to become part of a specified site, regardless of how closely it matches records already assigned to that site. Once processing is complete, the record can be added to the Common Customer Directory, sent to a flat file, returned to an online system, or any combination of these. Adding the records to the Common Customer Directory lets a company assemble a complete list of its customers. The system's matching functions can then be used to check whether new inputs, such as responses to marketing promotions or requests for technical support, match a known customer record. Output formats are customized for each process and can include the D-U-N-S Number, LSID, and other items from the Common Customer Directory. This database includes several separate tables, linked by D-U-N-S Number, and can incorporate data from D&B's master database. Common Customer Directory records can also store a user-specified "hard key," typically a record number or account ID from the original source system. Company systems can use this key to find a record that has already been loaded to the Common Customer Directory. They can then use the D-U-N-S Number or LSID on that record to retrieve additional information or to find records from all company systems with the same D-U-N-S Number or LSID.
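The "hard key" lookup described above amounts to a cross reference between source-system keys and site identifiers. The sketch below is one possible in-memory model; the class and method names are hypothetical, LSID is used as the linking ID for simplicity, and a real installation would hold this data in database tables.

```python
from collections import defaultdict


class DirectoryIndex:
    """Hypothetical cross reference from source-system hard keys to LSIDs."""

    def __init__(self):
        self._by_hard_key = {}
        self._by_lsid = defaultdict(list)

    def add(self, source_system, hard_key, lsid, duns=None):
        """Store a record under its source system's own key."""
        record = {"source": source_system, "hard_key": hard_key,
                  "lsid": lsid, "duns": duns}
        self._by_hard_key[(source_system, hard_key)] = record
        self._by_lsid[lsid].append(record)

    def related_records(self, source_system, hard_key):
        """Find a record by hard key, then return all records sharing its LSID."""
        record = self._by_hard_key[(source_system, hard_key)]
        return list(self._by_lsid[record["lsid"]])


# Fictitious account and ticket numbers from three source systems.
index = DirectoryIndex()
index.add("billing", "A-100", lsid=1111, duns="123456789")
index.add("support", "T-7", lsid=1111, duns="123456789")
index.add("orders", "O-9", lsid=2222)
related = index.related_records("billing", "A-100")
```

Here a billing system that knows only its own account number can discover the matching support record without either system changing its internal keys.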

The Common Customer Directory also receives monthly updates from D&B itself. These updates include D&B records that have been changed and new D&B records that meet the user's selection criteria. The system can reassign LSIDs and D-U-N-S Numbers if a match is found to be inaccurate.

The Customer Integration Manager Matching Process

Customer Integration Manager is powered by the DUNS Name Matching API, an implementation of D&B's patented name matching process. The system provides parsing, normalization and linking functions, all optimized for business data. Linking uses the "code combination" technique, generally considered the most effective method available. This assigns a code for the type of match between each element pair, and then applies a table that defines how each set of code combinations is to be treated.

Match Engine Linking

Step 1: Compare Records and Generate Match Grade

Element         Input Record    Validation Record   Similarity Type   Code
Name            Dun Wordbase    Dunn Worldbase      Moderate          B
Street Nbr      3               3                   Strong            A
Street Name     Silver Way      Sylvan Way          Moderate          B
City            Parripany       Parsippany          Strong            A
State           NJ              NJ                  Strong            A
PO Box          (blank)         (blank)             Blank             Z
Telephone Nbr   (blank)         973-605-6000        Blank             Z

Match Grade = BABAAZZ

Step 2: Find Corresponding Confidence Code in Confidence Matrix

Confidence Matrix (portion)

Match Grade   Confidence Code   Match Percentage
BABAAZF       8                 90%
BABAAZZ       8                 90%
BABABAA       8                 89%

Step 3: List All Record Pairs Exceeding a Specified Confidence Code
(In this example: record 111111 matches record 222222; records 333333 and 555555 both match record 444444)

Input Record ID   Linked Record ID   Confidence Code
111111            222222             10
333333            444444             8
555555            444444             8
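The three steps in the Match Engine Linking figure can be sketched in a few lines. This is a minimal illustration, not the match engine itself: the per-element similarity codes are taken as given from the figure rather than computed, the function names are hypothetical, only the portion of the Confidence Matrix shown in the figure is included, and treating the threshold as inclusive is an assumption.

```python
# Element order follows the figure: most significant element first.
ELEMENTS = ["name", "street_nbr", "street_name", "city", "state", "po_box", "telephone"]

# Portion of the Confidence Matrix shown in the figure:
# Match Grade -> (Confidence Code, Match Percentage).
CONFIDENCE_MATRIX = {
    "BABAAZF": (8, 90),
    "BABAAZZ": (8, 90),
    "BABABAA": (8, 89),
}


def match_grade(codes):
    """Step 1: string the per-element similarity codes together in element order."""
    return "".join(codes[element] for element in ELEMENTS)


def look_up_confidence(grade, matrix=CONFIDENCE_MATRIX):
    """Step 2: look the Match Grade up in the Confidence Matrix.

    Assumption: a grade missing from this partial table is treated as no match.
    """
    return matrix.get(grade, (0, 0))


def pairs_at_or_above(pairs, min_code):
    """Step 3: keep record pairs whose Confidence Code reaches the threshold."""
    return [pair for pair in pairs if pair[2] >= min_code]


# Similarity codes from the figure's example comparison.
codes = {
    "name": "B", "street_nbr": "A", "street_name": "B", "city": "A",
    "state": "A", "po_box": "Z", "telephone": "Z",
}
grade = match_grade(codes)
confidence_code, match_pct = look_up_confidence(grade)

# (input record ID, linked record ID, Confidence Code) from the figure.
pairs = [("111111", "222222", 10), ("333333", "444444", 8), ("555555", "444444", 8)]
selected = pairs_at_or_above(pairs, 8)

print(grade, confidence_code, match_pct)  # BABAAZZ 8 90
```

Keeping the matrix as a plain lookup table mirrors the "code combination" idea: the treatment of each combination of codes is data, not logic.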



As described elsewhere in this White Paper, Customer Integration Manager matches input records against a validation table rather than against each other. This lets it identify relationships among records that do not match each other directly. Details of the process are:

Input. The match engine accepts inputs for business name, address, city, state, postal code, telephone number and country code. Country code is used to select country-specific matching rules, and postal code is used to resolve discrepancies between inputs and D&B validation records. The other elements are used in the matching process itself.

Parsing. The system uses tables of key words and identification rules to split the address into street number, street name and Post Office Box. Parsing tables and rules are built into the system and are not directly accessible by end users.

Normalization. The system normalizes the name and address elements by removing extraneous words (such as "The") and applying standard forms and contractions. Normalization tables include standard names for specific companies as well as general business terms. Address elements are normalized to improve matching. Normalized data is a condensed form of the original input designed for internal comparisons; it does not replace the original data for display or output.

Grouping. The system creates group keys by extracting information from the normalized name and address elements. The name and address keys can be used separately or in combination to select records to compare in the linking process. The system will compare all records with the same key, regardless of how many there are.

Linking. The system compares the name, street number, street name, city, state, Post Office Box and telephone number elements in each pair of records. The comparisons use different methods depending on the data type, but each produces one of four similarity codes: A (strong), B (moderate), F (no similarity) and Z (one or both is blank). These codes are strung together, with the most significant fields listed first, to form a Match Grade. A system table called a Confidence Matrix relates each Match Grade to a Confidence Code and Match Percentage. The Confidence Code is a number from 0 to 10, typically interpreted as: 0=no match; 1-4=weak match; 5-7=limited match needing user validation; 8-10=high quality match. The Confidence Matrix is provided with the system; because it is based on extensive empirical research, D&B recommends it remain unchanged, but it could be modified if necessary. Different runs of the match engine could point to different Confidence Matrices, although again this is not recommended.

Output. The output of the matching system depends on the application at hand. In Customer Integration Manager it can be one best match or a set of possible matches from the Common Customer Directory. Output can include contents of the input record, the D-U-N-S Number and/or LSID, match results like the Confidence Code or Match Grade, treatment information such as whether the record was sent for remote resolution or inserted in the Common Customer Directory, and enhanced data from a matching D&B record. The system can generate different output for matched and unmatched records.

Users control many aspects of the matching process, some within the match engine itself and others in surrounding applications. In Customer Integration Manager, these choices are expressed through Workflows. Users can determine which types of records to consider for a match: for example, they may wish to exclude records that belong to companies known to be out of business. They can also set the Confidence Code levels required to accept a match or send it for manual review, how many candidates to return for review, and which types of records are sent to D&B for remote resolution. Users willing to override D&B recommendations can change entries in the Confidence Matrix, allowing them to determine which data elements are considered in the matching process and how different combinations of element matches are treated. In many cases, different settings can be applied in different situations by creating alternate Workflows.

Technical Benefits

The approach taken by Customer Integration Manager has many advantages:

One table links site records from all sources. The Common Customer Directory provides all systems with a single source for links among customer records. This simplifies cross-system information requests, since programs must look in only one location. The same directory is shared by batch and online processes, so the company need not develop multiple, redundant matching systems.

Minimal impact on source systems. The Common Customer Directory can be created and managed without changes to the source systems. Because the Common Customer Directory holds both source system keys (the "hard key") and internal linking keys (the D-U-N-S Number and LSID), source systems can access specific Common Customer Directory records and find related records in other systems, still without storing any new data in the source systems themselves.

Company-wide ID scheme. The D-U-N-S Number and LSID provide consistent IDs to identify the same customer in all systems. This means that customer data can be assembled with a simple query rather than a complicated on-the-fly matching process. Companies willing to store the D-U-N-S Number or LSID on source system records could join customer data across multiple systems without using the Common Customer Directory as an intermediate cross-reference table.

Incorporates D&B data. D&B records in the Common Customer Directory allow verification-based matching, which is the most effective way to link business records. In addition, the corporate family data on D&B records provides connections that cannot be derived from customer records alone. Because company input and D&B records are coded with the same D-U-N-S Numbers and LSIDs, it is easy to combine data from both sources.

Accumulates company-specific information. Company-supplied records can be manually associated with a site even though they appear unrelated. When similar records are presented in the future, they will be linked to the same site automatically. This allows the Site Reference Database to build a store of information that improves match accuracy over time. It also lets the system associate D&B data with customer records that do not directly match a D&B record.

Superior matching technology. D&B uses sophisticated matching technology that is already optimized for business data. Users can deploy and operate the system with a minimum of effort, but still control critical parameters such as the confidence level required to accept a match or to flag it for manual review. Online matching functions are accessed through a Java API that lets users specify precisely how each transaction is handled.
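The confidence levels that users control can be pictured as two thresholds on the Confidence Code. The sketch below is an assumption-laden illustration, not the product's Workflow configuration: the function names and the default thresholds are hypothetical, chosen to follow the typical interpretation of Confidence Codes (0 = no match, 1-4 = weak, 5-7 = limited match needing user validation, 8-10 = high quality), and it is written in Python for brevity.

```python
def interpret(confidence_code):
    """Map a Confidence Code (0 to 10) to its typical interpretation."""
    if not 0 <= confidence_code <= 10:
        raise ValueError("Confidence Code must be between 0 and 10")
    if confidence_code == 0:
        return "no match"
    if confidence_code <= 4:
        return "weak match"
    if confidence_code <= 7:
        return "limited match needing user validation"
    return "high quality match"


def route(confidence_code, accept_at=8, review_at=5):
    """Decide how to treat a candidate match.

    accept_at and review_at are hypothetical, user-settable thresholds:
    codes at or above accept_at are accepted automatically, codes at or
    above review_at go to manual review, and the rest are treated as unmatched.
    """
    if not 0 <= confidence_code <= 10:
        raise ValueError("Confidence Code must be between 0 and 10")
    if confidence_code >= accept_at:
        return "accept"
    if confidence_code >= review_at:
        return "manual review"
    return "unmatched"


for code in (10, 6, 3):
    print(code, interpret(code), "->", route(code))
```

A stricter process could call `route(code, accept_at=9)`, which sends a Confidence Code of 8 to manual review instead of accepting it.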



Remote connections to D&B data and research services. This gives access to the most current D&B data without constant updates to local files. It also lets the system find matches against D&B records that are not stored locally, allowing companies to limit the size of the in-house database. Automated transfer of questionable records to D&B for remote resolution permits use of this service with minimal effort from company systems staff. Customer Integration Manager supports both batch and online connections:

Data Integration Batch, included with Customer Integration Manager, gives each client a password-protected home directory on the D&B server. When clients send files to the directory via FTP, D&B systems process them automatically, return the results to the directory, and generate email notifications to D&B staff and clients.

Data Integration Toolkit, a set of optional modules, allows online, worldwide integration of client systems with D&B systems for corporate family linkage, enterprise management, financial management and marketing management.

Easily extended. Custom database tables can easily be linked to the Site Reference Database using the D-U-N-S Number or LSID as a key. These tables can hold any data needed for company-specific applications.

The D&B Database

The D&B database is the world's most comprehensive repository of business information. It covers more than 80 million business sites in 214 countries, including 17 million in the United States. Locations that are part of a larger organization are linked in a corporate family tree that identifies relationships between headquarters and branches of the same corporation, and between parent corporations and their subsidiaries.

Each site is assigned a D-U-N-S Number, a unique nine-digit identifier. In addition to a site's own D-U-N-S Number, its record will carry the D-U-N-S Numbers of the headquarters or parent site, the ultimate domestic parent, and the ultimate global parent. The D-U-N-S Number is used by more than 50 global, industry and trade associations including the United States government, European Union and United Nations.

The D&B database is refreshed more than one million times each day with data from telephone calls, company Web sites, and business partners. More than 1,000 data elements are collected; those available in the Common Customer Directory include company name and address, telephone number, and indicators for whether the site is currently active, out of business, marketable, and has a valid industrial classification code available. Customer Integration Manager can also store other D&B data including revenue, number of employees, year started, and executive names and addresses.

Sample Applications

The following examples illustrate some of the ways that Customer Integration Manager can be employed.

Integrated Customer Relationship Management

The lead processing module of a company's sales force automation system calls Customer Integration Manager to check if an inquiry is from an existing customer and uses the result to determine how to respond. This company has stored the LSID on customer records throughout its CRM systems, so once Customer Integration Manager has identified the customer's LSID, the CRM systems can manage interactions among themselves without further Customer Integration Manager involvement.

1. The lead processing system receives an inquiry from an unknown person. The system reads the business name and address and sends these via Java to the Customer Integration Manager API.

2. Customer Integration Manager parses and standardizes the input. It finds a match in the Common Customer Directory and returns the LSID via the Java API.

3. The lead processing system queries the company's central marketing database, using the LSID as a key. The query returns a customer status code.

4. The lead processing system uses the customer status code as part of its business rule to determine how to respond. In this case, the rule determines this is a high value customer who should receive a telephone call. The lead processing system sends a transaction to the call center, including the LSID as part of the customer identifier. The call center calls the customer.

[Figure: Integrated Customer Relationship Management. Elements shown: Inquiry from Customer, Lead Processing, Customer Integration Manager, Common Customer Directory, Marketing Database, Call Center, Phone call to Customer; numbered steps 1 to 4.]

Consolidated Sales Data for Customer Analysis

A company uses Customer Data Integration to associate transactions from multiple systems with the proper customers. The company uses D-U-N-S Numbers to consolidate information from different branches at the headquarters level. To ensure that as many records as possible have a D-U-N-S Number, it uses D&B remote resolution services for inputs that do not match the Common Customer Directory.

1. The customer database load process creates a flat file with records from multiple source systems. It places this in a directory assigned to receive such files. The Customer Integration Manager scanning and file transport modules (SST and FTM) notice the file, transfer it to a work area, and call Customer Integration Manager's matching module.

2. Customer Integration Manager parses and standardizes each record and looks for a match in the Common Customer Directory. Records that match an existing entry are coded with the site and ultimate parent D-U-N-S Number and accumulated in a flat file.

3a. Records that do not match a Common Customer Directory entry are accumulated in a different flat file. When processing is complete, this file is sent to a designated directory. SST and FTM send it to D&B for remote resolution. If the D&B system finds a match on its master database, it outputs the original record and the D&B site record. If there is no match, it sends the record for manual research and outputs the original record with an appropriate flag. When the process is complete, the files are placed in a directory where SST and FTM send them back to Customer Integration Manager.

3b. Customer Integration Manager adds the returned records to the Common Customer Directory, codes them with LSIDs, and sends the coded records to a flat file.

4. Customer Integration Manager sends the output files with coded records to the customer database load area for additional processing. Most records contain a site and headquarters D-U-N-S Number that will be used for consolidation. Records that did not match with D&B are consolidated using the LSID.

[Figure: Consolidated Sales Data. Elements shown: Multi-source Data Extract, Customer Integration Manager, Common Customer Directory, Coded Records, D&B Remote, D&B Database, Customer Database Load; numbered steps.]

Interactive Customer Lookup During Order Entry

A company's order entry system uses Customer Integration Manager to determine whether an order is from an existing customer and if credit is available. This firm has not changed its operational systems to store the LSID, so it must use the Common Customer Directory as a cross-reference table.

1. The call center receives a telephone call from a customer wishing to place an order. The caller knows his firm has ordered before, but does not have an account number available. The order entry agent enters the company name and address into the order entry system and selects a 'customer search' option. The order entry system sends the name and address to the Customer Integration Manager API via Java. Customer Integration Manager parses and standardizes the input and looks for matches in the Common Customer Directory. It finds several that exceed the specified confidence threshold.

2. Customer Integration Manager sends the name, address, telephone number and LSID of each matched record to the order entry system. The order entry system displays the records to the agent, who reviews them with the caller. There is no match, so the agent asks for an alternate address and searches for matches on that. When the right record is located, the order entry system issues a command for Customer Integration Manager to add the original address to the Common Customer Directory and link it to the alternate address. This will allow the system to identify the company automatically if a future caller uses either address.

3. The order entry system issues a direct SQL query to the Common Customer Directory, selecting all records with the specified LSID that represent credit system accounts. There are several such records, relating to different accounts that the customer has established for different departments at the same site.

4. The order entry system extracts the account numbers from the returned records and submits them to the credit system. The credit system evaluates these accounts and returns an approval to the order processing system. The agent completes the order.
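The cross-reference query in step 3 might look like the sketch below. It is a hedged illustration only: the table name, column names and sample rows are hypothetical and do not reflect the actual Common Customer Directory schema, and an in-memory SQLite database (via Python's standard sqlite3 module) stands in for the production relational database.

```python
import sqlite3

# Hypothetical, simplified cross-reference table: one row per source-system record.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer_directory (
        lsid          TEXT NOT NULL,   -- internal linking key
        source_system TEXT NOT NULL,   -- which company system the record came from
        hard_key      TEXT NOT NULL    -- that system's own account or record number
    )
    """
)
conn.executemany(
    "INSERT INTO customer_directory (lsid, source_system, hard_key) VALUES (?, ?, ?)",
    [
        ("L-42", "credit", "CR-100"),       # two credit accounts at the same site
        ("L-42", "credit", "CR-101"),
        ("L-42", "order_entry", "OE-7"),
        ("L-43", "credit", "CR-200"),       # a different site
    ],
)


def credit_accounts(connection, lsid):
    """Return the credit-system account numbers linked to one LSID."""
    rows = connection.execute(
        "SELECT hard_key FROM customer_directory "
        "WHERE lsid = ? AND source_system = 'credit' "
        "ORDER BY hard_key",
        (lsid,),
    ).fetchall()
    return [row[0] for row in rows]


accounts = credit_accounts(conn, "L-42")
print(accounts)
```

The returned account numbers are what the order entry system would then submit to the credit system in step 4.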








[Figure: Interactive Customer Lookup. Elements shown: Conversation with Customer, Order Processing System, Customer Integration Manager, Common Customer Directory, Credit System; numbered steps 1 to 3.]

Implementation

Every Customer Integration Manager installation is tailored to the customer's requirements with help from a dedicated team of D&B consultants. Most implementations include the following stages:

Install the Customer Integration Manager software. Preliminary steps include establishing a home directory, setting database access rights, and installing a Java Virtual Machine (Java 2 SDK Standard Edition version 1.4.1). The user then runs an automated installation Wizard that prompts for the required information. The Graphical User Interface (GUI) lets the user easily set up and manage operations and administrative functions. D&B consultants will provide advice on proper settings. The GUI controls technical details like directory locations, and business decisions such as which records are sent for remote resolution, and uses visual workflows to define matching, file processing, input and output formats, remote resolution, logging and notifications.

Set up the Common Customer Directory. D&B consultants will help to build the data tables, design special views and indexes, and determine which records to load. The initial records will be selected from the D&B master database. This process also includes matching your specific records against D&B data, and conducting manual and remote resolution as needed. Customer Integration Manager uses a relational database with the following categories: matching, data append, and operational. The Common Customer Directory is a set of relational database tables with predefined data structures for matching.

Develop interfaces with specific applications. Details will depend on the situation, but tasks will generally include researching existing data sources and systems, developing extract programs for batch feeds, modifying online systems to communicate with the Customer Integration Manager APIs, and making other changes needed to use Customer Integration Manager outputs.

Implementation time can vary from a few hours for simple batch matching to several days for a process including remote connections to D&B. Projects requiring changes to the client's in-house systems may take longer.

Monthly Update
D&B provides Customer Integration Manager clients with a monthly update for D&B records in the Common Customer Directory. This can be sent electronically or by tape, CD, or other physical media. The update file includes changed records for the primary Common Customer Directory table, which contains the matching data, and complete replacements for the other D&B database tables.
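The two kinds of update content (changed records for the primary table, complete replacements for the other tables) imply two different ways of applying them. The sketch below illustrates that difference under loud assumptions: the function, the record layout, the "active" flag and the table names are all hypothetical, dictionaries stand in for database tables, and it does not describe the actual update utility.

```python
def apply_monthly_update(primary, changed_records, other_tables, replacements):
    """Return new (primary, other_tables) after one monthly update.

    primary          dict: D-U-N-S Number -> record (the matching table)
    changed_records  list of records, each carrying a 'duns' field; a record
                     replaces the existing entry with the same D-U-N-S Number
                     or is added if none exists
    other_tables     dict: table name -> list of rows
    replacements     dict: table name -> complete new list of rows
    """
    new_primary = dict(primary)
    for record in changed_records:
        new_primary[record["duns"]] = record       # merge changed or new record

    new_other = dict(other_tables)
    new_other.update(replacements)                 # replace each table wholesale
    return new_primary, new_other


# Made-up sample data. Assumption: a deactivation arrives as a changed record.
primary = {
    "000000001": {"duns": "000000001", "name": "Example Co", "active": True},
    "000000002": {"duns": "000000002", "name": "Sample Ltd", "active": True},
}
changed = [
    {"duns": "000000002", "name": "Sample Ltd", "active": False},   # deactivated
    {"duns": "000000003", "name": "New Site Inc", "active": True},  # added
]
other = {"family_tree": [("000000002", "000000001")]}
replacement = {"family_tree": [("000000003", "000000001")]}

new_primary, new_other = apply_monthly_update(primary, changed, other, replacement)
print(sorted(new_primary), new_other["family_tree"])
```

Returning new structures instead of modifying the old ones keeps the previous directory state available for comparison, which is one simple way to see exactly what an update changed.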



[Figure: Monthly Update Process. Elements shown: Old Common Customer Directory; D&B adds, changes, deactivates; Monthly Update Utility; New Common Customer Directory.]

Customer Integration Manager includes a monthly update utility with scripts to control the process. The scripts drop indexes and tables and later recreate them, record each step in a log file, and notify an operator via email when the process is complete or an error occurs. The process can restart from the point of failure if necessary. The system stores detailed information on changes to individual records, with separate files for records that have been added, updated and deleted.

Support

D&B professional services staff support Customer Integration Manager clients during and after implementation. Technical assistance is provided by email, telephone, and in person. Clients receive a prompt response during normal business hours. Training is tailored to each implementation. D&B usually creates a custom training system with the client's own data. Most training is conducted at the client's office. Clients also receive detailed written documentation.

Summary

Customer Integration Manager provides a comprehensive, flexible solution for business customer data integration. It combines sophisticated matching technology, your organization's internal customer knowledge, and validation against D&B data, to produce the most thorough relationship identification available. It supports both batch and online processes, allowing a single system to serve all customer data integration requirements. It can provide a central cross-reference file to link account numbers in different source systems, or generate a single customer ID for all systems to use internally. It combines the efficiency of in-house operation with the accuracy of up-to-the-minute external validation data. The system is built on industry-standard databases, hardware and Java APIs, making it simple to integrate with existing corporate infrastructures. Perhaps most important, it links your data with the worldwide resources of D&B, whose D-U-N-S Number opens the door to an unmatched range and depth of business customer information.



Decide with Confidence

D&B Solutions
Risk Management Solutions Sales & Marketing Solutions Supply Management Solutions E-Commerce Solutions
