Implementing or Upgrading SAP® Solutions? Don't Forget The Data
WHITE PAPER
This document contains Confidential, Proprietary and Trade Secret Information (“Confidential Information”) of
Informatica Corporation and may not be copied, distributed, duplicated, or otherwise reproduced in any manner
without the prior written consent of Informatica.
While every attempt has been made to ensure that the information in this document is accurate and complete, some
typographical errors or technical inaccuracies may exist. Informatica does not accept responsibility for any kind of
loss resulting from the use of information contained in this document. The information contained in this document is
subject to change without notice.
The incorporation of the product attributes discussed in these materials into any release or upgrade of any
Informatica software product—as well as the timing of any such release or upgrade—is at the sole discretion of
Informatica.
Protected by one or more of the following U.S. Patents: 6,032,158; 5,794,246; 6,014,670; 6,339,775; 6,044,374;
6,208,990; 6,208,990; 6,850,947; 6,895,471; or by the following pending U.S. Patents: 09/644,280;
10/966,046; 10/727,700.
Executive Summary
ERP applications like SAP have an impact on business processes; therefore, the decision to
purchase and implement them is approached with considerable due diligence. While the vendor
selection and implementation decision gets much of the spotlight, there are critical project
phases upon which the success of an SAP implementation hinges. These project phases are not
always considered with the same degree of detail as the purchase and implementation decision.
Examples of such project phases include:
• Business process reengineering. Software that touches and drives processes across the
enterprise cannot simply be installed and turned on. Current “as is” processes must be
understood and methodically mapped to the new SAP system and its “to be” capabilities.
Invariably, business process reengineering uncovers gaps that must be considered and
planned for.
• Change management and user adoption. SAP implementations cannot rely on an “if we build
it, they will come” approach. The success of a new business application is ultimately measured
by its adoption by business users. Careful consideration must be given to executive sponsorship
and to business as well as technical user training.
1
Hamerman, Paul and R “Ray” Wang. “ERP Applications—The Technology And Industry Battle Heats Up.”
Forrester Market Overview, June 9, 2005
Business logic can be embedded in the data itself. For example, consider the following order
number from a legacy application: POUNE55289. Without proper context, it is impossible to tell
whether this number is a sales order, purchase order, or purchase requisition. It is impossible to
know which system generated the order, or if it is an active or historic order. Such insight can only
be achieved by identifying and analyzing the context of the data. As shown in Figure 1, valuable
details that are critical to data migration are exposed when the context of the data is analyzed.
Figure 1: True Context of Data Revealed Only with Proper Analysis of Data
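One common first step in analyzing identifiers such as the order number above is to profile their character patterns. The following is a minimal Python sketch of that idea, not a PowerCenter feature. The function names are hypothetical, and every sample value other than POUNE55289 is invented for illustration.

```python
from collections import Counter


def value_pattern(value: str) -> str:
    """Reduce a value to its shape: letters become 'A', digits become '9'."""
    shape = []
    for ch in value:
        if ch.isalpha():
            shape.append("A")
        elif ch.isdigit():
            shape.append("9")
        else:
            shape.append(ch)  # keep separators such as '-' or '/' as-is
    return "".join(shape)


def pattern_frequencies(values):
    """Count how often each shape occurs in a column of values."""
    return Counter(value_pattern(v) for v in values)


# Hypothetical legacy order numbers; only the first comes from the text.
sample_orders = ["POUNE55289", "POUNE55290", "SOUSW00017", "77120"]
frequencies = pattern_frequencies(sample_orders)
```

Values that share a shape can then be grouped and reviewed with the legacy application owners. A shape that appears only rarely may point to a second generating system or to historic records, but confirming that still depends on business context.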
Not only does the number of sources required for extraction need to be identified, but how the data
will be extracted, and by whom, also needs to be addressed. The data migration team must have
resources to extract data from the legacy applications, and these resources need to be reliable
and trustworthy in providing high quality and timely data extracts.
Tension between the data migration and legacy application teams is common because their goals
are often at odds with one another. After all, once the SAP solution goes live, the applications that
the legacy teams have built and have supported for years may be deemed obsolete and eventually
shut off.
Aside from the politics associated with the access and ownership of legacy data, the legacy
resources with the experience and skill sets essential to providing data extracts are likely busy and
in-demand resources within their organizations. Between maintaining and enhancing the legacy
applications, they may not have the bandwidth to dedicate the proper attention to the SAP data
migration effort.
• Processed. When incorrect data enters the application, it may be propagated across multiple
systems. For example, a system of record for material master information feeds data to
downstream applications, such as supply chain management or purchasing applications.
Data quality issues are replicated and may multiply as data is fed downstream to constituent
applications.
• Stored. Storing data across multiple business applications often leads to redundant and
inconsistent data. For example, various attributes of customer master entity information are
frequently stored in multiple business applications, such as customer relationship management,
sales force automation, and sales and marketing applications. Customer information, such as
names, titles, addresses, and purchase history, may be stored in different formats or duplicated
across different systems, preventing a single view of the customer.
Data migration teams need to understand and accept that there may be “dirty” data. To address
data quality issues when migrating legacy applications into SAP, data migration teams should
consider the data’s:
• Existence. Does the required data for the SAP solution exist? Does it exist within the enterprise,
or possibly in a partner’s or outsourcing vendor’s environment? If it doesn’t exist, what is the
business rule to populate the required information in SAP?
• Validity. Do data values fall within an acceptable range or domain? For example, if the legacy
applications have 73 U.S. state codes instead of 50, is this valid?
• Consistency. Is the same data stored in multiple applications in a common format? For
example, is “John Doe” from Company XYZ the same as “Mr. Jon Doe” from the same company?
• Timeliness. Is the data that is required to support the SAP business processes available at the
optimal time?
• Accuracy. Does the data correctly describe the properties of the object it is meant to model?
• Relevance. Does the data meet and support the SAP business processes?
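Three of the questions above (existence, validity, and consistency) lend themselves to simple automated checks. The sketch below is illustrative only and is not how PowerCenter implements them. The function names, the list of titles, and the 0.9 similarity threshold are assumptions chosen for this example.

```python
import difflib
import re

# The 50 U.S. state codes; anything else fails the validity check.
US_STATE_CODES = frozenset(
    """AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD
       MA MI MN MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC
       SD TN TX UT VT VA WA WV WI WY""".split()
)

# Courtesy titles stripped before names are compared (assumed list).
_TITLE = re.compile(r"^(mr|mrs|ms|dr)\.?\s+", re.IGNORECASE)


def missing_fields(record, required):
    """Existence: return the required fields that are absent or blank."""
    missing = []
    for name in required:
        value = record.get(name)
        if value is None or str(value).strip() == "":
            missing.append(name)
    return missing


def is_valid_state(code):
    """Validity: is the code one of the 50 accepted state codes?"""
    return code.strip().upper() in US_STATE_CODES


def normalize_name(name):
    """Lower-case a name and drop a leading courtesy title."""
    return _TITLE.sub("", name.strip()).lower()


def likely_same_person(a, b, threshold=0.9):
    """Consistency: flag two spellings as a probable match for review."""
    ratio = difflib.SequenceMatcher(
        None, normalize_name(a), normalize_name(b)
    ).ratio()
    return ratio >= threshold
```

A call such as `likely_same_person("John Doe", "Mr. Jon Doe")` flags the pair from the consistency example for review. Whether the two records really describe the same customer remains a business decision.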
Data migration project teams commonly leverage custom code to support the data conversion
process required to address data quality issues. Custom code can initially offer some degree of
flexibility. However, as the number and complexity of integration touch points increase, custom
coding limitations in scale and maintenance are exposed.
In most cases, if just a portion of the data being loaded into SAP does not pass the SAP
application validation, then SAP will reject the entire record. Examples of data validation
performed at the SAP application layer include:
• Syntactical. Are the field length and data type of the material master number valid?
• Semantic. What is the context of the data? Does this number identify a customer or vendor?
• Structural. Do the purchase order header and line items meet proper parent/child
relationships or cardinality rules?
• Dependency. Is this bill of material valid even if one of the referenced material master records
has not yet been created in SAP?
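The four validation types above can be rehearsed before loading, so that records likely to be rejected are caught early. The following Python sketch is a simplified illustration and does not reproduce SAP's actual validation logic. The class names, the 18-character material number limit, and the error messages are assumptions made for this example.

```python
from dataclasses import dataclass, field

MAX_MATERIAL_LEN = 18  # assumed limit, for illustration only


@dataclass
class LineItem:
    material: str
    quantity: int


@dataclass
class PurchaseOrder:
    number: str
    partner_id: str
    items: list = field(default_factory=list)


def validate_order(order, known_vendors, known_materials):
    """Return a list of validation errors; an empty list means loadable."""
    errors = []
    # Structural: a header needs at least one line item.
    if not order.items:
        errors.append("structural: header has no line items")
    # Semantic: the partner on a purchase order must be a vendor.
    if order.partner_id not in known_vendors:
        errors.append("semantic: partner is not a known vendor")
    for pos, item in enumerate(order.items, start=1):
        # Syntactical: material number must be present and within length.
        if not item.material or len(item.material) > MAX_MATERIAL_LEN:
            errors.append(f"syntactical: item {pos} material number length")
        # Dependency: the referenced material must already exist.
        elif item.material not in known_materials:
            errors.append(f"dependency: item {pos} material not yet loaded")
    return errors
```

Because a single failure causes the entire record to be rejected, any non-empty result from `validate_order` means the whole order should be held back and corrected before the load.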
PowerCenter provides powerful capabilities to help overcome data migration challenges. These
capabilities include:
• Data profiling capabilities for identifying and analyzing source data
• Universal data access capabilities for accessing source data
• Built-in transformation and correction capabilities for addressing the quality of data in legacy
applications
• Certified connectivity to SAP to prepare and load data into SAP
• Single, unified data integration platform to support the data migration lifecycle
It is important to note that data migration teams do not have to choose between NetWeaver and
PowerCenter; it’s not an “either/or” proposition. NetWeaver and PowerCenter’s complementary
capabilities help organizations significantly reduce risks and improve productivity during the data
migration effort in any SAP implementation.
Figure 3: Proactive Analysis of Source Data Saves Both Time and Money
(Chart segments: Analysis, Build, Test)
PowerCenter’s data profiling reports help migration teams determine if the legacy data has quality
issues and how to properly address them.
PowerCenter’s data profiling capabilities provide comprehensive, accurate information about
the content, quality, and structure of data in virtually any operational system. Organizations can
automatically assess the initial and ongoing quality of data regardless of its location or type. With
its comprehensive data profiling capabilities, PowerCenter:
• Reduces data quality assessment time with easy-to-use wizards and pre-built metric-driven
reports that comprise a single interface for the entire profiling process
• Addresses ongoing data quality in legacy applications with Web-based dashboards and reports
that illustrate changes in data content, quality, structure, and values over time
• Ensures end user data confidence by automatically and accurately profiling any data accessible
to PowerCenter—virtually any and all enterprise data formats
Figure 4 shows an example of a PowerCenter data profiling report. The report shows how
PowerCenter automatically infers the primary and foreign key relationships across three tables in a
legacy application.
Figure 4: PowerCenter Profiling Report Inferring Primary Key and Foreign Key Relationships between
Multiple Legacy Application Data Sources
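The idea behind inferring key relationships from data can be sketched in a few lines of Python. This is a simplified illustration and not PowerCenter's profiling algorithm. It treats a column as a primary key candidate when its values are unique and non-null, and as a foreign key candidate when all of its values appear in a parent key. The three tables and their column names are hypothetical.

```python
def primary_key_candidates(rows):
    """Columns whose values are all present and all distinct."""
    if not rows:
        return []
    candidates = []
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        if None not in values and len(set(values)) == len(values):
            candidates.append(column)
    return candidates


def foreign_key_candidates(child_rows, parent_rows, parent_key):
    """Child columns whose non-null values all exist in the parent key."""
    if not child_rows:
        return []
    parent_values = {row[parent_key] for row in parent_rows}
    candidates = []
    for column in child_rows[0].keys():
        values = {row[column] for row in child_rows if row[column] is not None}
        if values and values <= parent_values:
            candidates.append(column)
    return candidates


# Hypothetical legacy tables.
customers = [
    {"cust_id": 1, "name": "Acme"},
    {"cust_id": 2, "name": "Acme"},
]
orders = [
    {"order_id": 10, "cust_id": 1},
    {"order_id": 11, "cust_id": 1},
    {"order_id": 12, "cust_id": 2},
]
order_items = [
    {"item_id": 100, "order_id": 10},
    {"item_id": 101, "order_id": 10},
]
```

Results inferred this way are only candidates. On small samples, unrelated columns can satisfy both tests by coincidence, so inferred relationships should be confirmed against the legacy application's documentation or its owners.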
2
Eckerson, Wayne and Colin White. “Evaluating ETL and Data Integration Platforms.” TDWI Report Series, 2003
3
Ibid
PowerCenter provides universal data access, allowing the data migration team to source virtually
any and all enterprise data formats, including:
• Mainframe data
• Structured data
• Unstructured data (e.g., Microsoft Word documents and Excel spreadsheets, email, binary files,
.pdf files, etc.)
• Semi-structured data (e.g., industry-specific formats such as HL7, ACORD, FIXML, SWIFT, etc.)
• Relational data (e.g., DB2, Oracle, Microsoft SQL Server, etc.)
• ERP (e.g., SAP, PeopleSoft, Siebel, etc.) and file data
• Message queues (e.g., Tibco, IBM MQ Series, JMS, MS MQ, etc.)
Figure 6 shows the breadth of PowerCenter’s data access capabilities.
(Figure 6 panel titles: Real-Time Data Sources; Enterprise Software Sources; Unstructured Data;
Vertical Standards, e.g., HL7, SWIFT, ACORD; Across the Firewall/WAN)
Sources of data for SAP implementations tend to be dynamic. Extracting data from a relational
database-based legacy application today does not preclude SAP data migration teams from having
to meet future sourcing requirements, such as mainframe or mid-range applications.
With PowerCenter, SAP data migration teams can source directly from a mainframe application as
if it were a relational database. PowerCenter’s data access capabilities offer SAP migration teams
the flexibility to source these “softer” forms of data, which traditionally would be left to manual
interpretation and processing or, worse, left unaccounted for in the migration process.
The flexibility to access all types of enterprise data in a single data integration platform offers
significant advantages over hand-coded data migration approaches, including:
• Increased productivity. With the ability to centralize data access and management, PowerCenter
frees data migration teams from having to maintain and be dependent on a cumbersome, time-
consuming process where programs are developed to extract and stage data for each source of
legacy data.
• Reduced risk. Sources of data for SAP implementations tend to be dynamic. Extracting data
from a client/server-based legacy application today does not insulate the team from future
requirements, such as having to migrate mainframe and mid-range applications inherited
through a corporate merger or acquisition. PowerCenter reduces the risk of both current and
future data migration efforts by providing access to a broad range of enterprise data formats.
While a custom coding approach to data migration initially offers some flexibility, this approach
has its limitations. For example, if 100 programs are required to convert data across 10 legacy
data sources (a conservative number of sources), custom coding becomes complicated,
inefficient, and a challenge to scale. Developers working on different platforms and development
tools may add to the complexity, and sharing and reusing the coding effort across the migration
team, even with the best intentions, is unrealistic.
PowerCenter helps SAP data migration teams by enabling the team to focus on the data and
not code. PowerCenter was originally developed to address data transformation and conversion
requirements associated with data warehousing. That core capability has evolved into a single,
unified, scalable enterprise data integration platform with a robust library of transformation and
data services capable of handling all data conversion on any SAP data migration project. By
leveraging PowerCenter’s codeless and wizard-driven approach for SAP data conversion, teams can
focus more on the business rules and data, and less on the code.
Note how a single PowerCenter transformation within the mapping replaces the traditionally
coding-intensive effort for preparing a well-formed and valid file ready for loading into SAP. All of
the detail shown in the DMI customer master object is entirely imported directly from the SAP
According to a 2005 TDWI report, broadly speaking, enterprise business integration can occur at
four different levels in an IT system: data, application, business process, and user interaction.4
Figure 8 shows how the four layers of an integrated enterprise are jointly supported by the
Figure 8: Complementary Data Integration between SAP NetWeaver and Informatica PowerCenter
4
White, Colin. “Data Integration: Using ETL, EAI, and EII Tools to Create an Integrated Enterprise.” TDWI Report Series,
November 2005
proven front-end platform for business intelligence and visibility across the enterprise.
(Figure labels: Reusability/Team Productivity; XML, Messaging, and Web Services; Relational and
Flat Files; Mainframe and Midrange; Access source systems/data; Iterate; Access target/data;
Execute Migration; Target Application; Synchronize; Audit/Lineage)
5
Eckerson, Wayne and Colin White. “Evaluating ETL and Data Integration Platforms.” TDWI Report Series, 2003
traced at a metadata level.
Figure 10 shows a PowerCenter data lineage diagram spanning multiple data migration mappings,
as well as each component responsible for sourcing, converting or targeting the required data
for SAP. This diagram shows the flow of migration logic. Once the suspect area of data migration
logic has been identified, users can drill into the object and make the appropriate changes to the
relevant data mapping object.
Figure 10: PowerCenter Data Lineage Diagram enables tracking and auditing of end-to-end migration from legacy
applications to SAP
PowerCenter has been awarded “Powered By SAP NetWeaver” status by porting the
PowerCenter platform and PowerCenter Web Service Hub to the SAP J2EE platform. SAP
users can access PowerCenter’s Web Services capabilities directly through SAP NetWeaver
Portal’s front-end.
PowerCenter helps data migration teams trace and prove how data has been converted and
moved. The enhanced data visibility and tracking helps organizations comply with reporting
requirements. These capabilities also help with user adoption, instilling new SAP application users
with confidence that legacy application data has in fact been converted and moved.
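To illustrate what tracing data at a metadata level involves, the sketch below walks a small lineage graph from a target field back to its legacy origins. It is a conceptual Python example and does not represent PowerCenter's metadata model or API. All field and mapping names are hypothetical.

```python
def trace_upstream(target_field, lineage):
    """Return (source, mapping, target) hops leading back from a field.

    `lineage` maps each target field to a list of (source_field, mapping)
    pairs that feed it.
    """
    hops = []
    seen = {target_field}
    stack = [target_field]
    while stack:
        current = stack.pop()
        for source, mapping in lineage.get(current, []):
            hops.append((source, mapping, current))
            if source not in seen:  # guard against cycles in the metadata
                seen.add(source)
                stack.append(source)
    return hops


# Hypothetical lineage metadata for one customer name field.
lineage = {
    "sap.customer_master.name": [
        ("stage.customer.name", "m_load_customer"),
    ],
    "stage.customer.name": [
        ("legacy_crm.cust.full_name", "m_stage_customer"),
    ],
}

hops = trace_upstream("sap.customer_master.name", lineage)
```

Each hop names the mapping responsible for moving or converting the data, which is the information an auditor or a new SAP user would need in order to confirm where a migrated value came from.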
Furthermore, PowerCenter alleviates the politics associated with data migration projects. Data
migration activities, whether related to legacy applications or the target SAP application, can
be centralized within a single, unified data integration platform. This promotes effective and
productive communication between legacy and SAP resources, and between technical and
functional resources.
“Powered by SAP NetWeaver” is a program where partners develop solutions directly on the
NetWeaver application platform.
Furthermore, PowerCenter allows data migration teams to leverage all these capabilities from
a single, unified data integration platform. This increases productivity, ensures scalability, and
reduces risk.
Now that you have a solid understanding of the challenges around data migration and how
PowerCenter can help you overcome them, what is your next step? Informatica has developed an
offering to show you how you can make your next data migration project a success. This offering is
called the Data Migration Readiness Assessment.
The Data Migration Readiness Assessment demonstrates the value of leveraging Informatica for
SAP data migrations. It also serves to jump start any SAP data migration project.
The Data Migration Readiness Assessment is designed to help any SAP customer understand the
challenges and risk in a data migration project:
1. Identify data risks early
2. Scope and plan migrations effectively
3. Deliver SAP implementation on time, on budget, and in scope
Figure 11 shows how the Data Migration Readiness Assessment works.
• Identify candidate sources
• Identify SAP entities (e.g., Item Master, Customer Master)
• Extract source data
• Analyze source data
• Identify attributes
• Identify risks in source data
• Create mappings to SAP
• Identify risks in mapping
(Diagram flow: Source Systems, Legacy Stage, SAP Stage, SAP)
Figure 11: The Data Migration Readiness Assessment Jump Starts Data Migration Projects
Informatica Offices Around The Globe: Australia • Belgium • Canada • China • France • Germany • Japan • Korea • the Netherlands • Singapore • Switzerland • United Kingdom • USA
© 2008 Informatica Corporation. All rights reserved. Printed in the U.S.A. Informatica, the Informatica logo, and PowerCenter are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions
throughout the world. All other company and product names may be trade names or trademarks of their respective owners.
6665 (09/17/2008)