CE Student Guide PC - HandsOnWorkshop
Agenda
9:00  Introduction to Informatica
9:30  Introduction to Informatica Data Integration Platform
10:00 Introduction to PowerCenter
10:30 Tutorial Lesson 1
11:30 Tutorial Lesson 2
12:30 Lunch
1:30  Tutorial Lesson 3
2:00  Tutorial Lesson 4
2:30  Using the Debugger
3:00  Putting It All Together
4:00  Tutorial Lesson 6
4:30  Review and Q/A
Workshop Objectives
By the end of the day you will:
Understand the broad set of data integration challenges facing organizations today and how the Informatica Platform can be used to address them
Access data from different data sources and targets
Profile a data set and understand how to look for basic problems that need to be solved
Integrate data from multiple sources through Extraction, Transformation and Load (ETL)
Debug data integration processes (mappings)
Expose integration logic as Web Services for use in an SOA architecture
3
Informatica
The #1 Independent Leader in Data Integration
Founded: 1993
2012 Revenue: $811.6 million
7-year Annual CAGR: 17%
Employees: 2,810+
Partners: 450+ (Major SI, ISV, OEM and On-Demand Leaders)
[Chart: annual revenue, 2005 through 2012]
Product Development
Customer Support
Professional Services
Why Informatica?
Product Leadership
Proven Technology Leadership
Data Quality
Ultra Messaging
Application ILM
In spite of the new entrants, Informatica remains the market leader in this highly demanding part of the messaging market.
Why Informatica?
A Track Record of Continuous Innovation
Q4 2010: Cloud
Q1 2011: CEP 5.2, MDM
Q2 2011: Informatica 9.1 (UM PowerCenter integration, Broad Cloud Connectivity, DQ Dashboards and Reports, MDM Counterparty Master, MDM Social Networking)
Q3 2011: Ultra Messaging, Cloud MDM
Q1 2012: Informatica 9.1 Data Quality, Proactive Monitoring Options for DQ and PC, MDM for DB2, MDM Securities Master
Q2 2012: Informatica 9.5 (Data Services 9.5, PowerCenter 9.5, PowerExchange 9.5, Data Quality 9.5, Data Explorer 9.5, DVO 9.5, ILM Dynamic Data Masking 9.5.1, Informatica Identity Resolution 9.5, Informatica MDM Registry Edition 9.5): Big Data Integration, Self-Service, Adaptive Data Services, Authoritative and Trustworthy Data
Why Informatica?
Empowering the Data-Centric Enterprise
BUSINESS IMPERATIVES
Improve Decisions; Modernize Business; Improve Efficiency and Reduce Costs; Mergers, Acquisitions and Divestitures; Acquire and Retain Customers; Outsource Non-core Functions; Governance, Risk, Compliance; Increase Partner Network Efficiency; Increase Business Agility
IT INITIATIVES
Business and Operational Intelligence; Legacy Retirement; Application ILM; Application Consolidation; Customer, Supplier and Product Hubs; BPO; SaaS; Risk Mitigation and Regulatory Reporting; B2B Integration; Zero Latency Operations
Data Warehouse
Data Migration
Data Consolidation
Data Synchronization
Ultra Messaging
Why Informatica?
The Neutral, Trusted and Preferred Partner
BI Partners, OEM Partners, Cloud Partners, Global SI Partners
Operating Systems
10
11
Data Integration
Data is Changing
[Diagram: data spanning CLOUD and ON-PREMISE, DESKTOP and MOBILE, TRANSACTIONS and INTERACTIONS]
12
13
Value of Data
=
We empower organizations to maximize return on data to drive their top business imperatives
14
15
16
17
18
Source: An IDC White Paper - sponsored by EMC. As the Economy Contracts, the Digital Universe Expands.
19
Big Data
Confluence of Big Transaction, Big Interaction and Big Data Processing
Other interaction data: Cloud (Salesforce.com, Concur, Google App Engine, Amazon), Clickstream, Image/Text, Scientific (Genomic/pharma), Medical (Medical/Device), Sensors/meters, RFID tags, CDR/mobile
20
BI Application
[Diagram: CUSTOMER, PRODUCT, ORDER and INVOICE data fragmented across multiple systems]
21
22
23
Data Warehouse
Data Migration
Data Consolidation
Data Synchronization
SWIFT
NACHA
HIPAA
Cloud Computing
Application
Database
Unstructured
Partner Data
24
25
26
Introduction to PowerCenter
Enterprise Data Integration and ETL
27
28
29
30
31
Informatica Platform
Single unified architecture
Provider
XML, Messaging, and Web Services
PowerCenter
Design Manage Workflow Manager Monitor Workflow Monitor
Consumer
Portals, Dashboards, and Reports
Client
Designer
Administrator
Packaged Applications
Services Framework
Repository Service
Packaged Applications
Repository
Relational and Flat Files
Integration Service
Web Services
32
Proven Scalability
Threaded Parallel Processing
Provider
XML, Messaging, and Web Services
PowerCenter
Partition Point
Consumer
Monitor Workflow Monitor
Consumer Thread
Portals, Dashboards, and Reports
Design
Client
Designer
Packaged Applications
Provider Thread
Administer
Services Framework
Repository Service
Packaged Applications
Integration Service
Web Services
33
Proven Scalability
Pipeline Parallel Processing
Provider
XML, Messaging, and Web Services
PowerCenter
Provider Thread
Consumer
Monitor
Consumer Thread
Portals, Dashboards, and Reports
Design Designer
Transformation Threads
Manage
Client
Workflow Manager
Workflow Monitor
Administrator
Packaged Applications
In-memory pipeline
Provider Thread
Transformation Threads
Repository Service
Consumer Thread
Services Framework
Packaged Applications
Repository
Relational and Flat Files
In-memory pipeline
Mainframe and Midrange
Integration Service
Web Services
34
35
36
Informatica Platform
Single unified architecture
Provider
XML, Messaging, and Web Services
PowerCenter
Design Manage Workflow Manager Monitor Workflow Monitor
Consumer
Portals, Dashboards, and Reports
Client
Designer
Administrator
Packaged Applications
Services Framework
Repository Service
Packaged Applications
Repository
Relational and Flat Files
Integration Service
Web Services
37
Mapping Designer - Used to create mappings to extract, transform and load data.
38
Source Analyzer
Integrated. A key component of PowerCenter Designer, the Source Analyzer offers universal data access in a single unified platform.
Consistent. A single, consistent method to access and manage any data source regardless of type or location.
Visual. A simple graphical interface for importing and creating source definitions for any of the data sources supported by PowerCenter.
39
Target Designer
Integrated. A key component of PowerCenter Designer, the Target Designer offers universal data access in a single unified platform.
Consistent. A single, consistent method to access and manage any data target regardless of type or location.
Visual. A simple graphical interface for importing target definitions for any of the data types supported by PowerCenter.
Extensible. Can create target definitions, generate executable DDL, and even create new tables in the warehouse.
40
Tutorial Lesson 1
Creating Users and Groups
Creating a Folder in the PowerCenter Repository
Creating Source Tables (prerequisite: create the demo source tables)
41
Lesson 2
42
43
Using Designer
44
Using Designer
45
Using Designer
46
Using Designer
47
Using Designer
1. Make sure you see Source Analyzer at the top left hand part of the gray work area
48
Using Designer
Import a relational source 1. From the menu bar, select Sources > Import from Database
49
Using Designer
In the Import Tables dialog, choose the ODBC connection for the data source where the source tables reside 1. Click the ODBC data source drop-down box 2. Select the data source called source
Note: Informatica uses ODBC only to import the metadata structures into PowerCenter.
50
Using Designer
1. Enter Username: pc_user
2. The Owner name will self-populate
3. Enter Password: pc_user
4. Press Connect
51
Using Designer
1. Open the directory tree under Select tables
2. Select the table CUSTOMERS
3. Press OK
52
Using Designer
1. Verify the source metadata structure for the CUSTOMERS and GOOD_CUST_STG tables
Next, we will import our flat file source structure.
53
Using Designer
1. From the menu bar, select Sources > Import from File
54
Using Designer
55
Using Designer
1. Select the Import field names from the first line check box; this tells PowerCenter to start importing data from the second line (note that Start Import at Row: has changed to 2)
2. Keep the remaining defaults (the flat file source is Delimited, not Fixed Width)
3. Press Next
The flat file wizard is now displayed which allows us to parse through our flat file source.
56
Using Designer
1. Keep the defaults (the flat file is comma delimited) 2. Press Next
Look around this page. Notice you can account for multiple delimiters, consecutive delimiters and quotes around data.
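The wizard's choices (header row, delimiter, quote character) map directly onto standard delimited-file parsing. A minimal Python sketch of the same idea, using a hypothetical comma-delimited sample rather than the actual workshop file:

```python
import csv
import io

# Hypothetical sample mirroring the TRANSACTIONS flat file layout:
# the first line holds the field names, data is comma-delimited,
# and values may be wrapped in quotes.
sample = io.StringIO(
    'CUST_ID,PRODUCT_ID,"REVENUE"\n'
    "1001,55,19.99\n"
    "1002,78,5.00\n"
)

# DictReader uses the first line for column names, so data rows
# effectively start at row 2, just like the import wizard.
reader = csv.DictReader(sample, delimiter=",", quotechar='"')
rows = list(reader)
print(rows[0]["CUST_ID"])  # -> 1001
```

The wizard's "consecutive delimiters" and quote options correspond to the `delimiter` and `quotechar` arguments here.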
57
Using Designer
Earlier we told PowerCenter to use the first line of the original flat file for the column names. Note that the columns are now named for us. Review the other options on this page.
1. Press Finish
58
Using Designer
Congratulations!
You just successfully imported one flat file and two relational source structures.
59
Using Designer
Select the Target Designer to bring in our target structures 1. Select the second icon on the shortcut line
60
Using Designer
Notice that when you select the Target Designer, the menu options change; one now says Targets. 1. Select Targets and choose the Import from Database option
61
Using Designer
62
Using Designer
1. Enter Username: target
2. The Owner name will self-populate
3. Enter Password: target
4. Press Connect
63
Using Designer
CUSTOMER_NONAME will capture all of our records that do not have an associated customer name. GOOD_CUSTOMERS will capture all clean records to be loaded into our Data Warehouse.
64
Using Designer
65
Using Designer
Select the database type 1. Click the drop-down box and choose Oracle for the database type
66
Using Designer
1. Enter CUSTOMER_DATES as the name for the target table 2. Press Create
67
Using Designer
A new table should appear in the workspace behind the pop-up dialog 1. Select Done to close the Create Target Table dialog
68
Using Designer
Edit the table - CUSTOMER_DATES
69
Using Designer
70
Using Designer
71
Using Designer
Add columns to the table 1. Press the Add icon three times to add in three new columns
72
Using Designer
73
Using Designer
74
Using Designer
1. Click in the Key Type drop-down for CUST_ID to make this field a Primary Key
75
Using Designer
1. Change the second Column Name to TRANSACTION_ID
2. Change the Datatype of TRANSACTION_ID to number
3. Change the third Column Name to Date_of_Purchase
4. Change the Datatype of the Date_of_Purchase column to date
5. Press OK
76
Using Designer
We now have a metadata target structure in the PowerCenter Metadata Repository. We will now build the table in the Oracle target instance.
77
Using Designer
Build the table in the Oracle target instance 1. Select Targets > Generate/Execute SQL
78
Using Designer
79
Using Designer
1. Press the ODBC data source dropdown menu 2. Select the target database
80
Using Designer
1. Enter the Username target 2. Enter the Password target 3. Press Connect
81
Using Designer
82
Using Designer
The table GOOD_CUST_STG is for staging good customer records prior to loading them into the data warehouse. It will be used as both a target (when we clean the data) and a source (when we load the clean data into the warehouse). We can reuse the source definition to create the target.
With the Target Designer Selected 1. Expand the Sources folder so that GOOD_CUST_STG is visible 2. Drag the GOOD_CUST_STG object from the Sources directory tree in the navigation pane to the Target Designer Canvas.
84
Using Designer
GOOD_CUST_STG is now set up to be used as both a source and a target in PowerCenter. However, while the table definition exists in PowerCenter, the table does not yet exist in our target Oracle database. Let's build it there.
85
Using Designer
Build the table in the Oracle target instance 1. Select Targets > Generate/Execute SQL
86
Using Designer
87
Using Designer
1. Press the ODBC data source dropdown menu 2. Select the target database
88
Using Designer
1. Enter the Username target 2. Enter the Password target 3. Press Connect
89
Using Designer
90
Using Designer
If we look back at the directory tree in the Navigation Pane, we will see that we now have three Sources (TRANSACTIONS, a flat file; CUSTOMERS, relational; GOOD_CUST_STG, relational) and four Targets, all relational (CUSTOMER_DATES, CUSTOMER_NONAME, GOOD_CUSTOMERS, GOOD_CUST_STG).
92
93
Creating a Pass-Through Mapping Creating Sessions and Workflows Running and Monitoring Workflows
94
What is a mapping?
What are Transformation Objects? How do we build a mapping? How do we Join sources together? How do we separate out records with missing data?
95
PowerCenter Transformations
Some examples
Source Qualifier, Expression, Filter, Router, Joiner, Lookup, Aggregator, Sort, Rank, Sequence Generator, Update Strategy, Normalizer, Stored Procedure, Transaction Control, Union, Custom Transformation, Java, XML Parser, XML Generator, Mapplet, Mapplet Input, Mapplet Output, Target Definition
Transformations highlighted here are used in this mapping. For a detailed description of these transformations and their functions, see the tables in Appendix A.
97
PowerCenter Functions
Some examples. A more complete reference can be found in Appendix B at the end of this Guide.
Character manipulation (CONCAT, LTRIM, UPPER, ...)
Datatype conversion (TO_CHAR, TO_DECIMAL, ...)
Data matching and parsing (REG_MATCH, SOUNDEX, ...)
Date manipulation (DATE_COMPARE, GET_DATE_PART, ...)
Encryption/encoding (AES_ENCRYPT, COMPRESS, MD5, ...)
Financial functions (PV, FV, PMT, RATE, ...)
Mathematical operations (LOG, POWER, SQRT, ABS, ...)
Trigonometric functions (SIN, SINH, COS, TAN, ...)
Flow control and conditionals (IIF, DECODE, ERROR, ...)
Test and validation (ISNULL, IS_DATE, IS_NUMBER, ...)
Variable updates (SETVARIABLE, SETMINVARIABLE, ...)
A library of reusable user-created functions and available lookups may also be used.
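For readers new to the expression language, a few of these functions behave much like familiar constructs in general-purpose languages. A rough Python sketch of three of them (the sample values are made up; the real functions live in PowerCenter's expression language):

```python
def initcap(s):
    # INITCAP: capitalize the first letter of each word
    return s.title()

def iif(condition, if_true, if_false):
    # IIF: inline conditional, like a ternary expression
    return if_true if condition else if_false

def isnull(value):
    # ISNULL: true when the value is missing
    return value is None

print(initcap("jane doe"))               # -> Jane Doe
print(iif(isnull(None), "bad", "good"))  # -> bad
```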
98
In this Scenario
We show how to build mappings with Designer. Mappings are logical processes that define the structure of data and how it is changed as it flows from one or more data sources to target locations. Mappings are the core of the Informatica data integration tool set. With Informatica, transformations and mappings are reusable and can be used in many different scenarios.
For our first mapping we need to combine two sets of data for our data warehouse. We also need to separate good records from bad ones that are missing the customer name.
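The logic this mapping implements (an inner join on customer ID followed by a split on missing names) can be sketched in plain Python. The records below are illustrative only, not the workshop data:

```python
# Illustrative records; field names mirror the workshop sources.
transactions = [
    {"CUST_ID": 1, "PRODUCT_ID": 10, "REVENUE": 20.0},
    {"CUST_ID": 2, "PRODUCT_ID": 11, "REVENUE": 5.0},
]
customers = [
    {"CUST_ID": 1, "CUST_NAME": "Acme"},
    {"CUST_ID": 2, "CUST_NAME": None},  # missing name -> bad record
]

# Joiner: match rows on CUST_ID (a normal/inner join).
by_id = {c["CUST_ID"]: c for c in customers}
joined = [
    {**t, "CUST_NAME": by_id[t["CUST_ID"]]["CUST_NAME"]}
    for t in transactions
    if t["CUST_ID"] in by_id
]

# Router: records with a customer name are good; the rest are
# routed to the NONAME group.
good = [r for r in joined if r["CUST_NAME"] is not None]
noname = [r for r in joined if r["CUST_NAME"] is None]
print(len(good), len(noname))  # -> 1 1
```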
99
100
101
102
103
104
Add the source TRANSACTIONS to the mapping 1. Expand the + next to source and Flatfile so TRANSACTIONS and CUSTOMERS are visible (as above) 2. Drag the TRANSACTIONS source into the work area
105
106
Add the source CUSTOMERS table to the mapping 1. Click and drag the CUSTOMERS source into the workspace
107
108
Add the target tables to the mapping. 1. Expand the Targets folder 2. While holding CTRL select the CUSTOMER_NONAME and GOOD_CUST_STG tables 3. Still holding CTRL, drag them onto the workspace
109
110
1. Collapse the Navigation Pane for now to give us more work space (Single click-left icon) 2. Collapse the Output Window at the bottom of our screen (Single Click-right icon)
111
Add a joiner transformation to join the two source files together 1. Single click on the joiner transform 2. Single click in the workspace, the Joiner transformation should appear
112
1. Highlight all of the fields in the TRANSACTION Source Qualifier transformation (holding SHIFT, click the first field then click the last) 2. Still holding SHIFT, drag the selection to the Joiner Transformation
113
To join the two sources we need to add the Customer fields to the Joiner 1. Highlight the fields in the CUSTOMERS Source Qualifier transformation 2. Drag them to the Joiner Next, we need to edit the Joiner properties
114
1. Click Rename
Remember, all of this metadata will be captured in the PowerCenter Metadata Repository. Since we have the ability to report on the PowerCenter Metadata Repository, we want the names of our transformation objects to be meaningful.
115
116
Notice that because both sources have a field named CUST_ID, Designer renamed the second instance to CUST_ID1.
117
Add a join condition 1. Click on the Condition tab 2. Click on the Add condition icon
118
A default condition is displayed. Since we have two fields with similar names, the condition will use these two field names by default.
1. Press OK
119
With the data joined we need to separate good records from those with missing customer names. 1. Click on the Router Transformation 2. Click on the workspace to add a router to the mapping
120
We want to keep all of the fields from the Joiner except CUST_ID1, which is the same as CUST_ID.
1. Hold CTRL and select all fields except CUST_ID1
2. Drag the selected fields to the Router
We need to tell the Router what conditions to check for.
3. Double-click the Router to edit it
121
Rename the Router 1. Click Rename 2. Type Transformation Name rtr_check_customer_name 3. Click OK
122
123
The Router groups data based on user defined conditions. All records that meet the Group Filter Condition are included in the output for that group.
We need to create two groups: one for records with a customer name and one for records where the name is missing. 1. Click the Add button twice.
124
Rename the Groups
1. Click on the first Group Name
2. Rename the group GOOD_CUSTOMER
3. Click on the second Group Name
4. Rename the group CUSTOMER_NONAME
Next we need to edit the Group Filter Condition 1. Click the arrow on the first condition to open editor
125
Bad records have a NULL value for the customer name; if the record is not NULL, it is good.
1. Enter the expression: NOT ISNULL(CUST_NAME)
2. Click Validate to test your expression
3. Click OK to close the message window
4. Click OK to close the Expression Editor
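In Python terms, the group filter condition amounts to a simple null check per record (hypothetical values shown):

```python
def not_isnull(value):
    # Equivalent of the PowerCenter expression NOT ISNULL(CUST_NAME):
    # returns True when the customer name is present.
    return value is not None

print(not_isnull("Jane Doe"))  # -> True
print(not_isnull(None))        # -> False
```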
126
127
128
1. Press OK
129
The Router appears. Expand the transformation and scroll down to see the two groups we created.
130
Note: When you drag and release, Designer connects the first field in the set being dragged to the field under the cursor when the mouse is released, the second with the second and so on. If your fields are not in matching order you may need to connect them one at a time.
131
132
1. Click the disk icon to Save the mapping
2. Click the Toggle Output Window icon to view save status and other messages
3. Verify the mapping is VALID; if it is not, check for Error messages
4. Finally, clean up the workspace: right-click and select Arrange All Iconic
133
Congratulations!
134
135
Lesson 3: Workflow
Using Workflow Manager and Monitor
136
137
Informatica Platform
Workflow Manager and Workflow Monitor
Provider
XML, Messaging, and Web Services
PowerCenter
Design Manage Workflow Manager Monitor Workflow Monitor
Consumer
Portals, Dashboards, and Reports
Client
Designer
Administrator
Packaged Applications
Services Framework
Repository Service
Packaged Applications
Repository
Relational and Flat Files
Integration Service
Web Services
138
Workflow Tasks
Assignment: Assigns a value to a workflow variable.
Command: Specifies a shell command to run during the workflow.
Control: Stops or aborts the workflow.
Decision: Specifies a condition to evaluate.
Email: Sends email during the workflow.
Event-Raise: Notifies the Event-Wait task that an event has occurred.
Event-Wait: Waits for an event to occur before executing the next task.
Session: Runs a mapping you create in the Designer.
Timer: Waits for a timed event to trigger.
139
In this Scenario
We will use the Workflow Manager and Workflow Monitor to build a workflow to execute the mappings we just built. We will configure our workflow and then monitor the workflow in the Workflow Monitor. Along the way, we will investigate the various options in both tool sets.
140
Step-by-step Overview
1. Open the Workflow Manager through Designer
2. Create a session task. 3. Configure the session task to run the mappings we just built. 4. Investigate the options in the Workflow Manager. 5. Monitor the execution of the session in the Workflow Monitor. 6. View the run properties and session log in the Workflow Manager.
141
Launch the Workflow Manager 1. Press the Orange W on the tool bar above to launch the Workflow Manager
142
Create worklets
Create workflows
143
144
1. Select the Session task and drop it in by clicking on the Workflow Designer workspace
145
We need to join and remove records with no customer name before we can load them into the Data Warehouse. 1. Select the mapping m_remove_missing_customers 2. Click OK
146
147
We need to configure the Session to connect to the source and target structures 1. Double-click the Session task to open and edit it
148
149
1. Scroll down under Properties until you see Source file directory
2. Enter the location C:\PowerCenter Workshop as the Source file directory
3. Enter TRANSACTIONS.dat as the Source filename
150
1. Select SQ_CUSTOMERS under Sources on the left 2. Click the drop-down to select the correct Oracle instance that houses this source table
151
1. Select the Source connection under Objects (this is the Oracle instance where the CUSTOMERS table resides) 2. Press OK
152
Configure the target structures
1. Select CUSTOMER_NONAME from the Targets folder on the left
2. Click the drop-down box under Value to open the Relational Connection Browser
153
154
1. Right click in the Value box 2. Select Apply Connection Value To all Instances to assign this connection value to all target tables
155
1. Review the information for GOOD_CUST_STG 2. Notice Target is already filled in 3. Press OK to close the Session Editor
156
1. Under Properties, scroll down and select the Truncate target table option 2. Select OK to close the Session Editor
157
158
159
The new Session is added. We need to sequence the sessions so they execute in the proper order. 1. Select Link Tasks 2. Click on the left session and drag to the session on the right, so the sessions are connected. 3. Double-click the new Session (on right) task to edit it
160
161
1. Select SQ_GOOD_CUST_STG under Sources on the left 2. Click the arrow under Value to select the correct Oracle instance for this source table
162
In the Relational Connection Browser 1. Select the Target connection under Objects (the GOOD_CUST_STG staging table was built in the target Oracle instance, so the Target connection is used to read it) 2. Press OK
163
Configure the target structures
1. Select CUSTOMER_DATES from the Targets folder on the left
2. Click the arrow under Value to open the Relational Connection Browser
164
165
1. Right click in the Value box 2. Select Apply Connection Value To all Instances to assign this connection value to all target tables
166
167
Under the Transformations folder in the left navigation pane 1. Verify the lkp_product_description Connection Value uses Source, if not click the arrow and update 2. Click OK
168
Properties allow you to specify log options, recovery strategy, commit intervals for this session in the workflow and so forth. Note in this case the workflow will continue even if the mapping fails.
169
The Config Object allows you to specify a variety of Advanced, Logging, Error Handling and Grid related options. Scroll down to view the range of options available.
170
In the Components tab, you can configure pre-session shell commands, post-session commands, email messages sent if the session succeeds or fails, and variable assignments.
171
172
1. Verify the workflow is VALID; if not, scroll up to check for errors 2. Select Workflows > Start Workflow
173
Workflow Monitor provides a variety of views for monitoring workflows and sessions. This view shows the status of running jobs.
1. Notice that the Workflow Monitor is displayed when you start a workflow 2. Let the task run to completion
174
This view allows users to view the tasks associated with a specific workflow. Note that in this case our workflow has two sessions and has been run successfully several times. Your view may vary depending on when and how many times you have run your mapping.
175
1. Select the Gantt Chart view 2. Right click on the first session in the workflow we just ran 3. Select Get Run Properties
176
177
It looks like Writer execution failed with error 8425. Let's take a look at the session log and find out what the 8425 error is.
1. Select Find...
2. Enter the error number 8425
3. Select the radio button for All fields
4. Select Find Next
179
In order to debug, let's override writing our data to the CUSTOMER_NONAME table.
181
1. Select CUSTOMER_NONAME from the Mapping tab 2. Override the Relational Writer 3. Select the drop-down box
182
183
1. Under Properties, scroll down to Header Options 2. Click the drop-down and select Output Field Names 3. Select Set File Properties
184
1. Switch the radio button to Delimited 2. Press OK 3. Press OK to exit the Session Editor
185
1. Save the changes we made 2. Verify the workflow is VALID 3. Run the workflow again
186
187
188
As we suspected, we have duplicate Customer IDs and will have to deal with that in our mapping, but we'll save that for another day!
189
1. Go to Designer and open the Target Designer 2. Right click on the GOOD_CUSTOMERS target 3. Select Preview Data. . .
190
1. Verify the ODBC data source is target
2. Enter Username: target
3. Enter Password: target
4. Press Connect
191
192
Lab 3: 30 min
10 min break
193
Lesson 4
194
How do we use a lookup to enrich records with data from another source?
What is a reusable transformation? How do we use expressions to format data? How do we use aggregate functions to generate results from a data set?
195
In this Scenario
We will use Designer to build another mapping. Where the last lab focused on joining raw data and removing bad records, this lab focuses on using transformations to convert, enrich, and reformat the data and, finally, load it into the data warehouse.
Specifically, we will be working with the good records that the first mapping loaded into the staging table.
196
PowerCenter Transformations
Some examples
Source Qualifier, Expression, Filter, Router, Joiner, Lookup, Aggregator, Sort, Rank, Sequence Generator, Update Strategy, Normalizer, Stored Procedure, Transaction Control, Union, Custom Transformation, Java, XML Parser, XML Generator, Mapplet, Mapplet Input, Mapplet Output, Target Definition
197
Step-by-step Overview
1. Create a new mapping called m_build_customer_DW 2. Get a product description from the PRODUCT table
3. Format customer names and product descriptions so the first letter is Upper Case
4. For the good data, perform a simple calculation to determine total revenue 5. Collapse any duplicates 6. Load transaction dates into a table so a reporting tool can get the date of a specific transaction
198
Starting from the Mapping Designer 1. Select Mappings > Create to build a new mapping
199
200
Add the source GOOD_CUST_STG to the mapping 1. Drag the GOOD_CUST_STG source into the work area
201
202
Add the target tables to the mapping 1. Expand the Targets folder 2. Select and drag the GOOD_CUSTOMERS and CUSTOMER_DATES tables onto the workspace
203
204
The Lookup Transformation will allow us to pull back the product description names from our PRODUCT table. This is required by the end user so they can see exactly what products were purchased by our customers.
Add a Lookup Transformation 1. Click the Lookup Transformation icon once and single click in the workspace
205
Select the Lookup Table 1. Click on the Import tab 2. Select From Relational Table
206
We have to connect to the database instance that holds our lookup table. Note that PowerCenter will NEVER override database level security.
1. Click on the ODBC data source drop-down box 2. Select the source ODBC connection
207
1. Enter the Username source 2. Enter the Password source 3. Press Connect
208
209
The Lookup Transformation appears in the workspace. We will use the Product_ID from the source to lookup the data we need
1. Highlight the PRODUCT_ID field from the Source Qualifier and drag it onto the white space at the bottom of the Lookup
210
211
212
Much like the joiner, the lookup transformation requires a condition to be true for it to pass values. In this case, we want the product ID from the TRANSACTIONS file to match the product ID in the PRODUCTS table. Once there is a match, the lookup will return the proper product description value.
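Conceptually the lookup is a keyed fetch: build an index on PRODUCT_ID once, then probe it for each input row. A Python sketch with made-up product rows (the real PRODUCT table lives in the source Oracle schema):

```python
# Made-up PRODUCT rows for illustration only.
products = [
    {"PRODUCT_ID": 10, "PRODUCT_DESC": "widget"},
    {"PRODUCT_ID": 11, "PRODUCT_DESC": "gadget"},
]

# Lookup condition: source PRODUCT_ID == lookup table PRODUCT_ID.
desc_by_id = {p["PRODUCT_ID"]: p["PRODUCT_DESC"] for p in products}

def lookup_description(product_id):
    # Returns None when there is no match, similar to a lookup
    # returning NULL for an unmatched input row.
    return desc_by_id.get(product_id)

print(lookup_description(10))  # -> widget
print(lookup_description(99))  # -> None
```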
213
We want to return the Product Description that matches the Product_ID we passed in. Designer automatically identified the correct ports to compare for the lookup. No change is required 1. Click on the Ports tab
214
215
We would like to do some formatting on our source data. We want the initial character of our customer names and product descriptions to be Upper Case
Add an Expression for formatting data 1. Select the Expression Transformation 2. Click on the workspace
216
217
218
Minimize completed transformations 1. Click on the minimize icon for each completed transformation 2. Next, double-click on the Expression Transformation to edit it
219
220
1. Select the Ports tab
2. Click the Add field button twice to add two output ports
3. Select the first field and rename it CUST_NAME_OUT
4. Select the second field and rename it PRODUCT_DESC_OUT
221
1. Change the precision to 50 for each new port
2. De-select the O (output) flag for the CUST_NAME and PRODUCT_DESC ports (they will be replaced by the new fields)
3. De-select the I (input) flag for the new fields (they originate here and have no input)
When the O flag is selected, the expression editor box on the right becomes active.
222
1. Select the Expression box area next to the first field CUST_NAME_OUT (an arrow will appear) 2. Click the arrow to open the Expression Editor
223
Edit the Expression
1. Expand the Character folder
2. Select the Initcap function
3. Double-click the function to add it to the Formula
224
This is a simple expression telling PowerCenter to capitalize the first letter of the customer first and last name.
1. Edit the Formula so that it matches the one above; remember, CUST_NAME is the input being modified and CUST_NAME_OUT is the result
2. Press OK to close the editor
225
Repeat for PRODUCT_DESC_OUT 1. Press the down arrow to open the Expression Editor
226
1. Select the Initcap function 2. Edit the Formula so it matches the one above 3. Press OK
227
1. Click the row number at left and use the black arrows to move the row up or down in the list
228
229
Next we need to format our date. In our flat file, the date is an 8-character string. We need to convert that string to a date format so that it matches the format the target database (Oracle) expects.
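The equivalent string-to-date conversion in Python, assuming for illustration only that the 8-character string is laid out MMDDYYYY (the workshop file's actual layout may differ; in PowerCenter this is done with TO_DATE and a format string):

```python
from datetime import datetime

# Assumed layout for illustration: MMDDYYYY. Adjust the format
# string to match the actual flat file.
raw = "07241993"
parsed = datetime.strptime(raw, "%m%d%Y").date()
print(parsed)  # -> 1993-07-24
```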
1. Validate the mapping; it should look like this 2. Open the Navigation Pane
230
1. Select the Transformation Developer 2. Expand the Transformations folder in the left Navigation pane 3. Drag the exp_formatted_date transformation onto the workspace 4. Double-click the transformation to edit it
231
1. Select the Ports tab 2. Open the Expression Editor for the formatted_date port
232
1. Review the expression formula 2. Press OK 3. Press OK on the next screen to close the Edit Transformations window
233
234
1. Select any Object Types that should be included in the report 2. Press OK
235
1. Review the report content. In this case there are no dependencies. 2. Close the report
236
237
1. Link the DATEOFTRANSACTION port to the DATE_IN port on the new Expression Transformation 2. Add an Aggregator to our mapping
238
We need to calculate the total revenue for each customer. The Aggregator transformation performs these types of functions on groups of data. It can also collapse records based on a grouping criterion (CUST_ID in this case), eliminating duplicate sets of results.
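The Aggregator's group-by-and-sum behavior, sketched in Python over hypothetical rows that stand in for the output of the earlier transformations:

```python
from collections import defaultdict

# Hypothetical rows; real data comes from the staging table.
rows = [
    {"CUST_ID": 1, "CUST_NAME": "Acme", "REVENUE": 20.0},
    {"CUST_ID": 1, "CUST_NAME": "Acme", "REVENUE": 5.0},
    {"CUST_ID": 2, "CUST_NAME": "Zenith", "REVENUE": 7.5},
]

# Group by CUST_ID: duplicates collapse to one output row per group,
# and TOTAL_REVENUE is the SUM over each group.
totals = defaultdict(float)
names = {}
for r in rows:
    totals[r["CUST_ID"]] += r["REVENUE"]
    names[r["CUST_ID"]] = r["CUST_NAME"]

aggregated = [
    {"CUST_ID": cid, "CUST_NAME": names[cid], "TOTAL_REVENUE": total}
    for cid, total in totals.items()
]
print(aggregated[0]["TOTAL_REVENUE"])  # -> 25.0
```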
239
240
241
Update port names and build the aggregate calculation
1. Select the Ports tab
2. Remove the _OUT suffix from the CUST_NAME and PRODUCT_DESC ports
3. Click the Add new port button once
1. Rename NEWFIELD to TOTAL_REVENUE
2. Change the Datatype to Double
3. De-select the I (input) checkbox so the Expression Editor becomes available
4. Click the arrow to open the Expression Editor
243
244
Our calculation computes the total revenue by customer. To accomplish this, data needs to be grouped by Customer ID.
1. Check the GroupBy box for the CUST_ID port
2. Press OK
245
We are ready to map the fields from the aggregator to the GOOD_CUSTOMERS table
1. Select the relevant ports from the Aggregator 2. Map the selected fields to the matching ports on the GOOD_CUSTOMERS target table
246
We want to map three of the fields in the Aggregator to our second target, CUSTOMER_DATES. The CUST_ID field will go to both targets.
1. Select the relevant ports from the Aggregator to map to CUSTOMER_DATES
2. Connect the selected fields to the matching ports on the target table
247
Almost done. Let's apply the finishing touches.
1. Save the mapping
2. Verify the mapping is VALID
3. Clean up: right-click anywhere in the workspace and select Arrange All Iconic
248
Congratulations!
You are now ready to load your data into the Data Warehouse.
249
Lab 4: 1 hr
250
251
252
In this Scenario
As a developer, you want to test the mapping you built before running data through it, to ensure that the logic in the mapping will work. For this lab we will use a pre-built mapping to review the features of the Debugger.
253
Step-by-step Overview
1. Open the Debugger lab folder
2. Run the Debugger 3. Configure the Session with the Debugger Wizard 4. Edit Breakpoints 5. Step through the mapping
6. Monitor results
254
1. Open the Mapping Designer 2. Expand the Mappings Folder 3. Drag M_DebuggerLab to the Mapping Designer workspace
256
Debugger Toolbar
Start Debugger
Stop the Debugger
Next Instance
Step to Instance
257
258
259
1. Select the Int_Workshop_Service as the Integration Service on which to run the debug session 2. Leave the defaults 3. Click Next
260
1. Choose the Source and Target database connection 2. Leave the default values of Source and Target 3. Click Next
261
262
263
Configure Session parameters 1. Check Discard target data 2. Click Finish to start the session
264
Let's adjust the toolbars so it is easier to work with the Debugger.
1. Right-click on the toolbar and deselect the Advanced Transformations toolbar
2. Repeat and select Debugger so the toolbar is visible
265
1. Select Edit Breakpoints to establish breakpoints to stop the debug session at specific transformations
266
267
1. Select the Add button to create a breakpoint at the expression transformation 2. Under Condition, click the Add box to set the breakpoint rules 3. Edit the rule so that it will stop when CUST_ID = 325 4. Click OK
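A conditional breakpoint only pauses the session when its rule is true. A minimal Python sketch of that idea, with invented row values:

```python
# Sketch of a conditional breakpoint: the debug session pauses
# only on rows where the condition (CUST_ID = 325) holds.
def hits_breakpoint(row, column="CUST_ID", value=325):
    return row.get(column) == value

stream = [{"CUST_ID": 101}, {"CUST_ID": 325}, {"CUST_ID": 200}]
paused_at = [i for i, row in enumerate(stream) if hits_breakpoint(row)]
print(paused_at)  # [1]
```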
268
269
Debugger Menu
Breakpoint
270
Next Instance
From the Debugger Toolbar
1. Click Next Instance to step into the mapping
2. Review values and outputs in the debug panes
3. Continue to step through and monitor changes
See Output
Examine values
271
1. Click Next Instance until 9 records have been processed. 2. Monitor Output below 3. Click Stop the Debugger
272
When the Debugger closes, Designer returns to the normal design view
273
274
Informatica Community
my.informatica.com
275
My.informatica.com Assets
Searchable knowledge base
Online support and service request management
Product documentation and demos
Comprehensive partner sales, support and training tools
Velocity, Informatica's implementation methodology
Educational services offerings
Mapping Templates
Link to devnet Many more
276
Developer Network
devnet.informatica.com
277
278
Welcome to
What is beINFORMed?
Informatica's Partner Home
A variety of online tools and resources to help you sell and deliver Informatica solutions
Where is beINFORMed?
URL: http://partners.informatica.com/
279
Log in and register today as a "NEW USER". The system will lead you through the application process.
280
Software
To ensure that you are successful with the deployment of the Informatica platform, we offer you internal training and demonstration software.
Resource Center
A one-stop shop for technical, marketing, and sales information around Informatica's products, solutions and programs.
Marketing Center
Review and participate in joint programs to drive pipeline
281
beINFORMed
What It Looks Like
Increase your Informatica skills
Request software
Log your opportunities
Find resources
Do joint marketing
282
beINFORMed
Submit and Manage Software Requests
Submit Requests
283
beINFORMed
Enablement Paths Your Steps to Success
Increase your selling skills
Understand solutions
284
beINFORMed
Current Pre-Sales Enablement Paths
Solution Basics
INFORMATICA
DELIVERY
beINFORMed
285
beINFORMed
Presales Accreditation 2010-2011 Next Steps
Silver Accreditation Presentation and Positioning Expert
DELIVERY
Presales Accreditation on Platform, DI, DQ, MDM and other Informatica Solution Areas
beINFORMed Solution InfoCenters Online SC Webinars eLearnings Demo Recordings/Scripts Modular Web-based consumption
beINFORMed Solution InfoCenters SC Bootcamps VMWare-based POC scenarios POC reviews and validation POC Shadowing
Success Measures (each solution area): 2010 manual review process; 2011 automated review process per solution area
286
287
beINFORMed
Comprehensive Resource Center
288
beINFORMed
Implementer Enablement Paths Data Quality
Follow the Initial Steps
289
Step 1 QuickStart
eLearning: Quickstart, MetaData Manager, New Features, Unified Security, Data Analyzer, Real-Time
Software: download for training or evaluation purposes
Documentation: Installation Guide, Getting Started Guide, User Guide
Demos: Real-Time Edition, MetaData Manager, Data Masking
290
Step 2 Education
Global Education Services PowerCenter 8.x - Level 1 Developer
4 day course (Virtual or classroom based) - More Details >>
Metadata Manager 8.6 3 day course (Virtual or classroom based) More Details >>
291
Step 3 Services
During Projects you can use the following services
Global Customer Support 24 x 7 support
Raise a service request via email or the web
Search our knowledge base via http://my.informatica.com
Phone (North America: +1 866 563 6332)
Professional Services
For initial engagements, DI experts can be contracted to complement your team
Velocity Methodology: available for partners; search for Velocity on beINFORMed for Informatica Best Practices such as:
PowerCenter Baseline Architecture
PowerExchange CDC Deployment for Mainframe Systems
Data Migration Jumpstart
292
Step 4 Certification
Global Education Services
Informatica Certified Developer
PowerCenter QuickStart eLearning PowerCenter 8.X+ Administrator course PowerCenter Developer 8.x Level I course PowerCenter Developer 8 Level II course Three Exams
293
beINFORMed
Lead Management Opportunity to Close
Register Leads Obtain Sales Support Collaborate with Alliances
294
beINFORMed
Joint Marketing Leverage Existing Programs and Content
Find Marketing Info & Opportunities
Do joint PR
295
296
In this Scenario
You are the regional manager for a series of car dealerships. Management has asked you to track the progress of your employees. Specifically, you need to capture:
Employee name
Name of the dealership they work at
What they have sold
298
Step-by-step Overview
1. Create a new target definition to use in the mapping, and create a target table based on the new target definition. 2. Create a mapping using the new target definition. You will add the following transformations to the mapping:
Lookup transformation. Finds the names of the employees, the dealerships they work at, and the products they have sold.
Aggregator transformation. Calculates the net revenue that the employee has generated.
Expression transformation. Formats all employee names and product descriptions.
3. Create a workflow to run the mapping in the Workflow Manager 4. Monitor the workflow in the Workflow Monitor
299
300
Step 2: Mapping
1. Open up the Mapping Designer
2. Create a new mapping; call it whatever you like
3. Bring in the mm_transaction source and the T_Employee_Summary target
4. Find the dealership name (hint: use the mm_data user, as all dealership names are kept in the mm_dealership table)
5. Find the product description (hint: use the mm_data user, as all product descriptions are kept in the mm_product table)
6. Find the employee name (hint: use the mm_data user, as all employee names are kept in the mm_employees table)
7. Format the employee name and make sure the name is capitalized
8. Format the product description and make sure the initial letters are capitalized
9. Calculate net revenue (hint: keep it simple, net revenue is revenue minus cost)
10. Group by Employee_ID to collapse all unique employees
11. Map to the target table
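The steps above can be sketched in Python: look up the employee name, compute net revenue as revenue minus cost, and group by Employee_ID. Column names follow the lab; the data is invented:

```python
# Sketch of the Lab 5 flow: lookup + expression + aggregation.
employees = {7: "mary jones"}  # stand-in for the mm_employees lookup

transactions = [
    {"Employee_ID": 7, "Revenue": 500.0, "Cost": 320.0},
    {"Employee_ID": 7, "Revenue": 200.0, "Cost": 120.0},
]

def summarize(transactions, employees):
    summary = {}
    for t in transactions:
        emp_id = t["Employee_ID"]
        rec = summary.setdefault(
            emp_id,
            {"Name": employees[emp_id].title(), "Net_Revenue": 0.0},
        )
        # Net revenue is revenue minus cost, accumulated per employee.
        rec["Net_Revenue"] += t["Revenue"] - t["Cost"]
    return summary

print(summarize(transactions, employees))
# {7: {'Name': 'Mary Jones', 'Net_Revenue': 260.0}}
```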
301
302
303
Thank You!
304
305
Transformation Objects
Aggregator: Performs aggregate calculations.
Application Source Qualifier: Represents the rows that the PowerCenter Server reads from an application, such as an ERP source, when it runs a session.
Custom: Calls a procedure in a shared library or DLL.
Data Masking: Replaces sensitive production data with realistic test data for non-production environments.
Expression: Calculates a value.
External Procedure: Calls a procedure in a shared library or in the COM layer of Windows.
Filter: Filters data.
HTTP: Connects to an HTTP server to read or update data.
Input: Defines mapplet input rows. Available in the Mapplet Designer.
Java: Executes user logic coded in Java. The byte code for the user logic is stored in the repository.
Joiner: Joins data from different databases or flat file systems.
Lookup: Looks up values.
Normalizer: Source qualifier for COBOL sources. Can also be used in the pipeline to normalize data from relational or flat file sources.
Output: Defines mapplet output rows. Available in the Mapplet Designer.
Rank: Limits records to a top or bottom range.
306
Transformation Objects
Router: Routes data into multiple transformations based on group conditions.
Sequence Generator: Generates primary keys.
Sorter: Sorts data based on a sort key.
Source Qualifier: Represents the rows that the PowerCenter Server reads from a relational or flat file source when it runs a session.
SQL: Executes SQL queries against a database.
Stored Procedure: Calls a stored procedure.
Transaction Control: Defines commit and rollback transactions.
Union: Merges data from different databases or flat file systems.
Unstructured Data: Transforms data in unstructured and semi-structured formats.
Update Strategy: Determines whether to insert, delete, update, or reject rows.
XML Generator: Reads data from one or more input ports and outputs XML through a single output port.
XML Parser: Reads XML from one input port and outputs data to one or more output ports.
XML Source Qualifier: Represents the rows that the Integration Service reads from an XML source when it runs a session.
307
308
Aggregate Functions
AVG: Returns the average of all values in a group.
COUNT: Returns the number of records with non-null values in a group.
FIRST: Returns the first record in a group.
LAST: Returns the last record in a group.
MAX: Returns the maximum value, or latest date, found in a group.
MEDIAN: Returns the median of all values in a selected port.
MIN: Returns the minimum value, or earliest date, found in a group.
PERCENTILE: Calculates the value that falls at a given percentile in a group of numbers.
STDDEV: Returns the standard deviation for a group.
SUM: Returns the sum of all records in a group.
VARIANCE: Returns the variance of all records in a group.
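Several of the aggregate functions above have direct Python analogues; the group of values below is invented for illustration:

```python
import statistics

# Python analogues of AVG, MEDIAN, MAX, MIN, and SUM on one group.
group = [4, 1, 7, 4]

avg = sum(group) / len(group)       # AVG
med = statistics.median(group)      # MEDIAN
hi, lo = max(group), min(group)     # MAX, MIN
total = sum(group)                  # SUM

print(avg, med, hi, lo, total)  # 4.0 4.0 7 1 16
```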
309
Character Functions
ASCII: In ASCII mode, returns the numeric ASCII value of the first character of the string passed to the function. In Unicode mode, returns the numeric Unicode value of the first character of the string passed to the function.
CHR: Returns the ASCII or Unicode character corresponding to the specified numeric value.
CHRCODE: In ASCII mode, returns the numeric ASCII value of the first character of the string passed to the function. In Unicode mode, returns the numeric Unicode value of the first character of the string passed to the function.
CONCAT: Concatenates two strings.
INITCAP: Capitalizes the first letter in each word of a string and converts all other letters to lowercase.
INSTR: Returns the position of a character set in a string, counting from left to right.
310
LENGTH: Returns the number of characters in a string, including trailing blanks.
LOWER: Converts uppercase string characters to lowercase.
LPAD: Adds a set of blanks or characters to the beginning of a string, to set a string to a specified length.
LTRIM: Removes blanks or characters from the beginning of a string.
METAPHONE: Encodes characters of the English language alphabet (A-Z). It encodes both uppercase and lowercase letters in uppercase.
REPLACECHR: Replaces characters in a string with a single character or no character.
REPLACESTR: Replaces characters in a string with a single character, multiple characters, or no character.
RPAD: Converts a string to a specified length by adding blanks or characters to the end of the string.
RTRIM: Removes blanks or characters from the end of a string.
SOUNDEX: Encodes a string value into a four-character string.
SUBSTR: Returns a portion of a string.
UPPER: Converts lowercase string characters to uppercase.
311
Conversion Functions
TO_BIGINT: Converts a string or numeric value to a bigint value.
TO_CHAR: Converts numeric values and dates to text strings.
TO_DATE: Converts a character string to a date datatype in the same format as the character string.
TO_DECIMAL: Converts a string or numeric value to a decimal value.
TO_FLOAT: Converts any value (except binary) to a double-precision floating point number (the Double datatype).
TO_INTEGER: Converts any value (except binary) to an integer by rounding the decimal portion of a value.
312
GREATEST: Returns the greatest value from a list of input values.
IN: Matches input data to a list of values.
INSTR: Returns the position of a character set in a string, counting from left to right.
IS_DATE: Returns whether a value is a valid date.
IS_NUMBER: Returns whether a string is a valid number.
IS_SPACES: Returns whether a value consists entirely of spaces.
ISNULL: Returns whether a value is NULL.
LEAST: Returns the smallest value from a list of input values.
LTRIM: Removes blanks or characters from the beginning of a string.
METAPHONE: Encodes characters of the English language alphabet (A-Z). It encodes both uppercase and lowercase letters in uppercase.
313
REG_EXTRACT: Extracts subpatterns of a regular expression within an input value.
REG_MATCH: Returns whether a value matches a regular expression pattern.
REG_REPLACE: Replaces characters in a string with another character pattern.
REPLACECHR: Replaces characters in a string with a single character or no character.
REPLACESTR: Replaces characters in a string with a single character, multiple characters, or no character.
RTRIM: Removes blanks or characters from the end of a string.
SOUNDEX: Encodes a string value into a four-character string.
SUBSTR: Returns a portion of a string.
TO_BIGINT: Converts a string or numeric value to a bigint value.
TO_CHAR: Converts numeric values and dates to text strings.
TO_DATE: Converts a character string to a date datatype in the same format as the character string.
TO_DECIMAL: Converts a string or numeric value to a decimal value.
TO_FLOAT: Converts any value (except binary) to a double-precision floating point number (the Double datatype).
TO_INTEGER: Converts any value (except binary) to an integer by rounding the decimal portion of a value.
314
Date Functions
ADD_TO_DATE: Adds a specified amount to one part of a date/time value, and returns a date in the same format as the specified date.
DATE_COMPARE: Returns a value indicating the earlier of two dates.
DATE_DIFF: Returns the length of time between two dates, measured in the specified increment (years, months, days, hours, minutes, or seconds).
GET_DATE_PART: Returns the specified part of a date as an integer value, based on the default date format of MM/DD/YYYY HH24:MI:SS.
IS_DATE: Returns whether a string value is a valid date.
LAST_DAY: Returns the date of the last day of the month for each date in a port.
MAKE_DATE_TIME: Returns the date and time based on the input values.
ROUND: Rounds one part of a date.
SET_DATE_PART: Sets one part of a date/time value to a specified value.
TO_CHAR: Passes the date values you want to convert to character strings.
TRUNC: Truncates dates to a specific year, month, day, hour, or minute.
315
Encoding Functions
AES_DECRYPT: Returns decrypted data in string format.
AES_ENCRYPT: Returns data in encrypted format, using the AES algorithm.
COMPRESS: Compresses data using the zlib compression algorithm.
CRC32: Returns a 32-bit Cyclic Redundancy Check (CRC32) value.
DEC_BASE64: Decodes a base64-encoded value.
DECOMPRESS: Decompresses data using the zlib compression algorithm.
ENC_BASE64: Encodes data by converting binary data to string data using Multipurpose Internet Mail Extensions (MIME) encoding.
MD5: Calculates the checksum of the input value. The function uses Message-Digest algorithm 5 (MD5).
316
Financial Functions
FV: Returns the future value of an investment, where you make periodic, constant payments and the investment earns a constant interest rate.
NPER: Returns the number of periods for an investment based on a constant interest rate and periodic, constant payments.
PMT: Returns the payment for a loan based on constant payments and a constant interest rate.
PV: Returns the present value of an investment.
RATE: Returns the interest rate earned per period by a security.
317
Numeric Functions
ABS: Returns the absolute value of a numeric value.
CEIL: Returns the smallest integer greater than or equal to the specified numeric value.
CONVERT_BASE: Converts a number from one base value to another base value.
CUME: Returns a running total of all numeric values.
EXP: Returns e raised to the specified power (exponent), where e=2.71828183.
FLOOR: Returns the largest integer less than or equal to the specified numeric value.
LN: Returns the natural logarithm of a numeric value.
LOG: Returns the logarithm of a numeric value.
MOD: Returns the remainder of a division calculation.
MOVINGAVG: Returns the average (record-by-record) of a specified set of records.
MOVINGSUM: Returns the sum (record-by-record) of a specified set of records.
POWER: Returns a value raised to the specified exponent.
RAND: Returns a random number between 0 and 1.
ROUND: Rounds numbers to a specified digit.
SIGN: Notes whether a numeric value is positive, negative, or 0.
SQRT: Returns the square root of a positive numeric value.
TRUNC: Truncates numbers to a specific digit.
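The record-by-record semantics of MOVINGSUM and MOVINGAVG can be sketched in Python as a sliding window over the last N values; this is a simplified illustration and ignores PowerCenter's null-handling details:

```python
# Sketch of MOVINGSUM / MOVINGAVG: a window over the last n records,
# emitting results only once the window is full.
def moving_sum(values, n):
    return [sum(values[i - n + 1 : i + 1]) for i in range(n - 1, len(values))]

def moving_avg(values, n):
    return [s / n for s in moving_sum(values, n)]

print(moving_sum([1, 2, 3, 4], 2))  # [3, 5, 7]
print(moving_avg([1, 2, 3, 4], 2))  # [1.5, 2.5, 3.5]
```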
318
Scientific Functions
COS: Returns the cosine of a numeric value (expressed in radians).
COSH: Returns the hyperbolic cosine of a numeric value (expressed in radians).
SIN: Returns the sine of a numeric value (expressed in radians).
SINH: Returns the hyperbolic sine of a numeric value (expressed in radians).
TAN: Returns the tangent of a numeric value (expressed in radians).
TANH: Returns the hyperbolic tangent of a numeric value (expressed in radians).
319
Special Functions
ABORT: Stops the session and issues a specified error message.
DECODE: Searches a port for a value you specify.
ERROR: Causes the PowerCenter Server to skip a record and issue the specified error message.
IIF: Returns one of two values you specify, based on the results of a condition.
LOOKUP: Searches for a value in a lookup source column. Informatica recommends using the Lookup transformation instead.
320
String Functions
CHOOSE: Chooses a string from a list of strings based on a given position.
INDEXOF: Finds the index of a value among a list of values.
REVERSE: Reverses the input string.
321
Test Functions
IS_DATE: Returns whether a value is a valid date.
IS_NUMBER: Returns whether a string is a valid number.
IS_SPACES: Returns whether a value consists entirely of spaces.
ISNULL: Returns whether a value is NULL.
322
Variable Functions
Function
SETCOUNTVARIABLE
Description
Counts the rows evaluated by the function and increments the current value of a mapping variable based on the count.
SETMAXVARIABLE
Sets the current value of a mapping variable to the higher of two values: the current value of the variable or the value specified. Returns the new current value.
SETMINVARIABLE
Sets the current value of a mapping variable to the lower of two values: the current value of the variable or the value specified. Returns the new current value.
SETVARIABLE
Sets the current value of a mapping variable to a value you specify. Returns the specified value.
SYSTIMESTAMP
Returns the current date and time of the node hosting the Integration Service with precision to the nanosecond.
323
324
Transformation Naming
Each object in a PowerCenter repository is identified by a unique name. This allows PowerCenter to efficiently manage and track statistics all the way down to the object level. When an object is created, PowerCenter automatically generates a unique name. These names, however, do not reflect project- or repository-specific context. As a best practice, Informatica recommends the following convention for naming PowerCenter objects:
325
Custom: CT_TransformationName
Expression: EXP_TransformationName
External Procedure: EXT_TransformationName
Filter: FIL_TransformationName
Joiner: JNR_TransformationName
Lookup: LKP_TransformationName
MQ Source Qualifier: SQ_MQ_TransformationName
Normalizer: NRM_TransformationName
Rank: RNK_TransformationName
326
Sequence Generator: SEQ_TransformationName
Sorter: SRT_TransformationName
Stored Procedure: SP_TransformationName
Source Qualifier: SQ_TransformationName
Transaction Control: TC_TransformationName
Union: UN_TransformationName
Update Strategy: UPD_TransformationName
XML Generator: XG_TransformationName
XML Parser: XP_TransformationName
XML Source Qualifier: XSQ_TransformationName
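The prefix convention above can be encoded as a small lookup helper. Only a subset of the table is shown; the AGG_ entry for Aggregator is an assumption (it is not listed in the table above) and the rest follow the table:

```python
# Naming-convention helper for PowerCenter transformation names.
PREFIXES = {
    "Expression": "EXP_",
    "Filter": "FIL_",
    "Lookup": "LKP_",
    "Source Qualifier": "SQ_",
    "Aggregator": "AGG_",  # assumption: not in the table above
}

def transformation_name(kind: str, base: str) -> str:
    """Prepend the conventional prefix for the given transformation type."""
    return PREFIXES[kind] + base

print(transformation_name("Expression", "FormattedDate"))  # EXP_FormattedDate
```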
327
328