
PowerCenter 7 Advanced: New Features

Education Services
Version PC7A-20040830

Informatica Corporation, 2004. All rights reserved.

Agenda
PowerCenter 7.1 Platforms and Connectivity
PowerCenter 7.1 Options and Upgrades
Workflow Manager: Session Editor Enhancement
Workflow Monitor Enhancements (Workflow Monitor lab)
Cross-Tool Enhancements
Designer Enhancements (Client Usability, Flat File Lookup and Union, Creating XML Definitions, and Transaction-Preserving Transformations labs)
Workflow Manager: Error Logging Enhancement (Error Logging lab)

PowerCenter 7.1 Platforms and Connectivity


PowerCenter Server

PowerConnects:
Web Services
SAS
Plus in PowerCenter 7.1.1
MSMQ
Hyperion Essbase
HTTP
Most PowerConnects on Linux

64-bit AIX
64-bit HP-UX
Windows NT X
AIX 4.3.3 X
SuSE Linux (in PowerCenter 7.1.1)

Repository Server

PowerCenter Client

Windows NT X
Windows 98 X

Added

X Discontinued

PowerCenter 7.1 Options


Data Profiling: Profile wizards, rules definitions, profile results tables, and standard reports
Data Cleansing: Name and address cleansing functionality, including directories for US and certain international countries
Server Grid: Server group management, automatic workflow distribution across multiple heterogeneous servers
Real-Time/WebServices: ZL Engine, always-on non-stop sessions, JMS connectivity, and real-time Web Services provider
Partitioning: Data smart parallelism, pipeline and data parallelism, partitioning
Team-Based Development: Version control, deployment groups, configuration management, automatic promotion
PowerCenter: Server engine, metadata repository, unlimited designers, workflow scheduler, all APIs and SDKs, unlimited XML and flat file sourcing and targeting, object export to XML file, LDAP authentication, role-based object-level security, metadata reporter, centralized monitoring

PowerCenter 7.1 Upgrades


Install Base      Version     Upgrade
PowerCenterRT     v5 or v6    v7.1
PowerCenter       v5 or v6    v7.1
PowerMart         v5 or v6    v7.1
New Customers

Purchasable options
  Data Profiling
  Data Cleansing
  Server Grid
  Real-Time/WebServices
  Partitioning
  Team-Based Development

PowerCenter

Note: PowerMart upgrades allow use of global repositories, but extra repositories cost more.

Workflow Manager: Session Editor Enhancement

v7 Session Editor
Properties and Config Object tabs have collapsible options rather than sub-tabs
New Mapping tab consolidates Sources, Targets, Transformations and Partitions into one tab with two views:
  Transformations view
  Partitions view, with graphical display

Properties Tab

Collapsible options

Config Object Tab

Collapsible options

Mapping Tab - Transformations View


Mapping Tab - Partitions View

Flag color indicates partition type

Folders for:
  Partition Points
  Non-Partition Points

Graphical display shows mapping flow, partition points, partition type and number

Workflow Monitor Enhancements


Workflow Monitor Enhancements


Improved Task view
Workflow run tree display
All workflows running on all servers at once
Status messages
Filters menu and toolbar with more options:
  Workflows that ran in a specific time frame
  Sessions that ran during the last X hours


Workflow Monitor Task View

v6

v7


Filter Toolbar

New Filter toolbar


Select type of tasks to filter
Select servers to filter
Filter tasks by specified criteria
Display recent runs

Workflow Monitor Enhancements


Standard toolbar

Print preview

Toggle Navigator window on/off


Toggle Output window on/off

Server toolbar

Resume and recover workflow



Lab NF1 Workflow Monitor


Cross-Tool Enhancements


Cross-Tool Enhancements
Cool look
Validation enhancements
Object export/import
Copying and comparing objects

Cool Look
Cool look (no borders around icons) is the default

Turn off in Tools => Customize, Toolbars tab

Many icons revised in toolbars and workspace



Validation Enhancements
Invalidation
  A parent object is invalidated when changes are made to its child object
  In v6, the parent object was marked invalid but the reason was not reported
  In v7, the reason is reported in the fetch.log
Mass Validation
  In v6, the user had to fetch and validate each parent object individually
  In v7, the user can validate all the parent objects at the same time. This is useful to identify all invalidations caused by changing a shared child object.
  Available in Repository Manager Navigator tree, List View, and (for versioned repositories) in Results View

Object Export/Import
Full export/import of repository objects to/from XML
  Workflows, worklets, sessions, mappings, transformations
  Multiple objects in a single XML file
  Automatic handling of dependent objects

Objects can span multiple folders across a repository


Copying and Comparing Objects


Copying and Comparing Objects


Designer and Repository Manager copy conflicts now invoke the Copy Wizard
Copy Wizard has several enhancements
Workflow Manager and Repository Manager allow Compare Objects for workflows and tasks

Designer and Repository Manager Copy Conflict


v6 v7

Opens Copy Wizard


Copy Wizard Enhancements


v7

Simplified name resolution
Compare conflicting objects
Scope of resolution

Copy Wizard Enhancements (cont'd)


Compare objects before resolving a name conflict

v7


Compare (Diff) Workflows and Tasks


In Workflow Manager and Repository Manager

v7


Designer Enhancements


Designer Enhancements
Port Attribute Propagation
Lookup Transformation with Flat Files
Union Transformation
Custom Transformation
XML Enhancements
Transaction-Preserving Transformations
New Functions and Datatypes

Port Attribute Propagation


Port Attribute Propagation


When you change a port name, Designer automatically propagates references to that port in expressions, conditions, and other ports within the transformation
Can also propagate changed port attributes forward and backward throughout the mapping

Port Attribute Propagation Steps 1-3


1. In Normal View, select one or more ports (use Shift or Ctrl key for multiple ports). Right-click and select Propagate Attribute.

2. Dialog box opens.

3. Select:
  Direction (forward link path, backward link path, or both)
  Attributes to propagate (name, data type, precision, scale)
  Options: implicit dependencies to include (condition and/or expression). Disabled if the Name attribute is selected.

Port Attribute Propagation Steps 4-5


4. Preview (best practice) shows links to affected ports in green, unaffected ports in red

5. Propagate updates:
  I and I/O ports in forward link path
  O and I/O ports in backward link path
  Selected attributes for all ports in the link path
  Port name in:
    Dependent expressions or conditions (if options selected)
    Associated port of a dynamic lookup
    Custom transformations

Lab NF2 Client Usability


Lookup Transformation with Flat Files


Lookup Transformation with Flat Files 1


In v7, you can use a flat file as source for a connected or unconnected Lookup transformation

You can use any flat file definition in the repository or you can import it

Lookup Transformation with Flat Files 2


When you import a flat file lookup source, the Designer invokes the Flat File Wizard

Lookup Transformation Editor Flat File 1


Lookup Transformation Editor Flat File 2


Configuring a Session for Flat File Lookup


Union Transformation


Union Transformation
Merges data from multiple pipelines into one pipeline (similar to the SQL UNION ALL statement)
Passive Transformation
Connected Mode only
Ports
  Multiple input groups
  Single output group
  Ports in all input and output groups must match
Usage
  Merging pipelines
  Does not remove duplicate rows
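
The UNION ALL behavior described on this slide can be sketched in a few lines of Python. This is only an illustration of the semantics (several input groups, one output group, matching ports, duplicates kept), not how the PowerCenter Server implements the transformation. The function name `union_all` and the sample rows are hypothetical, and the real transformation makes no promise about output row order.

```python
from itertools import chain


def union_all(*input_groups):
    """Merge rows from several input groups into one output group.

    Duplicates are kept, as with SQL UNION ALL. Every row must expose
    the same port (column) names, mirroring the rule that ports in all
    input and output groups must match.
    """
    groups = [list(group) for group in input_groups]
    ports = None
    for group in groups:
        for row in group:
            if ports is None:
                ports = tuple(row.keys())
            elif tuple(row.keys()) != ports:
                raise ValueError("ports in all input groups must match")
    return list(chain.from_iterable(groups))


# Hypothetical sample data: two pipelines with identical ports.
east = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
west = [{"id": 2, "name": "Bob"}, {"id": 3, "name": "Cy"}]

merged = union_all(east, west)
print(len(merged))  # 4: the duplicate Bob row is not removed
```

To drop duplicates as well, a separate step would be needed downstream, which matches the slide's point that the Union transformation itself does not remove them.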

Union Transformation - Example


Lab NF3 Flat-File Lookup and Union


Custom Transformation


Custom Transformation 1
New framework for developing user defined transformations
Uses compiler-independent APIs
  C for server
  C++ for client
Native transformation look and feel
Supports:
  Active or passive transformations
  Multiple input and output groups
  Port-level metadata
  Transaction control
  Update strategy
  Partitioning

Custom Transformation 2
Calls an active or passive procedure defined in a dynamic link library (DLL) or shared library
Active or Passive Transformation
Connected Mode only
Ports
  Mixed
Usage
  Perform transformation logic outside PowerCenter
  Uses Custom transformation functions
  Sorting, Aggregation

Custom Transformation 3
Custom transformation replaces the Advanced External Procedure (active) transformation
External Procedure (passive) transformation remains
  This supports Microsoft COM objects, including Java and Visual Basic, as well as C and C++

XML Enhancements


XML Enhancements
XML Definition Wizard
  Import from XML schemas (W3C XML Schema 2001 standard)
  Generate XML views (groups)
  XPath support
XML Editor
  XML workspace displays XML views and relationships graphically
  Popup windows for schema details, e.g. ComplexType hierarchies
  Data preview
Midstream XML Parser and Generator transformations
Performance options for large XML targets

Import from XML Schemas


XML schemas are much richer than DTDs:
  Written in XML
  Support multiple namespaces
    (A namespace is identified by a URI, e.g. a URL, and groups a set of related elements and attributes)
  Support many more datatypes
    (44+ simpletypes plus user-defined complextypes)
  Support substitution groups, e.g. alternative root elements
  More flexible, e.g.
    Child elements occurring in any order
    Multiple elements with the same name but different content
    Elements with no content

Generate XML Views 1


XML definitions represent the XML hierarchy as groups, called XML views
XML Source Definition

XML Views (Groups)


Generate XML Views 2


The XML Wizard can generate XML views from rules (entity relationships, hierarchy relationships) or you can create custom XML views


Generate XML Views 3


For custom views, you can reduce metadata explosion through several options

XPath Support
XPaths list the path from the root element to an element or attribute with all intermediate components separated by /
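
As a rough illustration of that definition, the following Python sketch walks a small XML document and prints the path from the root to every element and attribute, with components separated by /. The sample document and the helper name `list_xpaths` are made up for this example, and the output only imitates the kind of paths the XML definition displays.

```python
import xml.etree.ElementTree as ET


def list_xpaths(element, prefix=""):
    """Return root-to-node paths for an element, its attributes, and its descendants."""
    path = f"{prefix}/{element.tag}"
    paths = [path]
    for attribute in element.attrib:
        paths.append(f"{path}/@{attribute}")
    for child in element:
        paths.extend(list_xpaths(child, path))
    return paths


# Hypothetical sample document.
doc = ET.fromstring(
    '<store><product id="1313"><name>Regulator System</name></product></store>'
)

for xpath in list_xpaths(doc):
    print(xpath)
# /store
# /store/product
# /store/product/@id
# /store/product/name
```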

XML Source Definition



XML Editor
Open the XML Editor in any of these ways:
  Double-click the XML definition in the workspace
  Right-click => Edit XML Definition
  Sources / Targets / Transformation menus => Edit XML Definition

XML Metadata Navigator

XML Workspace

Components Pane (shows selected component)
  Properties
  Actions
  Data Values, if any

Columns window (shows selected view)


XML Workspace - XML Views


The XML Editor's workspace displays the XML views (groups) as entities connected by lines and symbols indicating the relationships (parent/child, many:many, etc.)
XML Source Definition XML Workspace

XML Views

XML Workspace - View Schema Details

XML Editor has popup windows for Edit Namespace, ComplexType Hierarchy, Data Preview, etc.


Midstream XML Parser Transformation


Reads XML from a database table or message queue

In v6, you had to use a mapplet with an XML Source Qualifier

Midstream XML Generator Transformation


Creates XML in a database table or message queue

In v6, you had to use a mapplet interface

Performance Options for Large XML Targets


On Commit option allows user-defined commits to flush XML data
On Commit "Write to new document" allows multiple XML output files
Target cache size for XML tree (on overflow, spills to disk)
"Do not output empty elements" avoids writing unnecessary elements

Lab NF4 Creating XML Definitions


Transaction-Preserving Transformations


Transaction-Preserving Transformations
In v6, Aggregator, Rank, Joiner, and Sorter processed all input rows before emitting output rows
In v7, these and the new Custom transformation can process data one transaction at a time
Benefits
  Preserves transactions
  Increased performance, lower resource usage

Transformation Scope
Transformation Scope          Transformations         Output
Row                           Most transformations    As each row is processed
All input (only v6 option)    Agg, Rnk, Jnr, Srt      When all rows processed
Transaction (added in v7)     Agg, Rnk, Jnr, Srt      When commit encountered

Note: Custom transformations have whatever scopes are implemented by the developer

Example: Rank with Scope = All Input


In v6, a Rank transformation always has scope = All Input, dropping any incoming transactions

Input
Name    Salary
A1      $80K
A2      $40K
A3      $50K
A4      $100K
COMMIT
A5      $30K
A6      $60K
A7      $90K

Rank on All Input (Transactions are dropped)

Output
Name    Salary
A4      $100K
A7      $90K
A1      $80K
A6      $60K
A3      $50K
A2      $40K
A5      $30K

Example: Rank with Scope = Transaction


In v7, a Rank transformation with scope = Transaction preserves incoming transactions

Input
Name    Salary
A1      $80K
A2      $40K
A3      $50K
A4      $100K
COMMIT
A5      $30K
A6      $60K
A7      $90K

Rank on a set of data bounded by transactions

Output
Name    Salary
A4      $100K
A1      $80K
A3      $50K
A2      $40K
COMMIT
A7      $90K
A6      $60K
A5      $30K
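
The difference between the two scopes can be reproduced with a short Python sketch that uses the salary figures from these two Rank examples. It is a simplified model only: the `COMMIT` marker, the tuple layout, and both function names are assumptions made for illustration, and the sketch simply sorts every row in descending salary order rather than modeling the real Rank transformation's properties.

```python
COMMIT = "COMMIT"  # hypothetical marker standing in for a transaction boundary

# Rows from the slide example as (name, salary in $K), with one commit in the stream.
rows = [
    ("A1", 80), ("A2", 40), ("A3", 50), ("A4", 100),
    COMMIT,
    ("A5", 30), ("A6", 60), ("A7", 90),
]


def by_salary_desc(batch):
    """Order a batch of rows from highest to lowest salary."""
    return sorted(batch, key=lambda row: row[1], reverse=True)


def rank_all_input(stream):
    """Scope = All Input: commit markers are dropped and all rows are ranked together."""
    return by_salary_desc([row for row in stream if row is not COMMIT])


def rank_per_transaction(stream):
    """Scope = Transaction: rank each batch of rows when its commit is encountered."""
    output, batch = [], []
    for row in stream:
        if row is COMMIT:
            output.extend(by_salary_desc(batch))
            output.append(COMMIT)
            batch = []
        else:
            batch.append(row)
    output.extend(by_salary_desc(batch))  # rows after the last commit
    return output


print([name for name, _ in rank_all_input(rows)])
# ['A4', 'A7', 'A1', 'A6', 'A3', 'A2', 'A5']
print([row if row is COMMIT else row[0] for row in rank_per_transaction(rows)])
# ['A4', 'A1', 'A3', 'A2', 'COMMIT', 'A7', 'A6', 'A5']
```

With scope = Transaction, output for the first four rows is available as soon as the commit arrives, which is where the reduced memory footprint comes from.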

Setting Transformation Scope

Transformation Scope


Lab NF5 Transaction-Preserving Transformations


New Functions and Datatypes


Soundex and Metaphone Functions


Used in expressions
Create index based on English pronunciations, e.g. SOUNDEX('Smith') = SOUNDEX('Smyth')
Soundex
  Encodes a string value into a four-character string (first input character plus 3 digits for unique consonants)
  Fast and standard
Metaphone
  More accurate (but needs more computational power)
  Can specify length of string
  Algorithm not standard
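
To make the four-character encoding concrete, here is a minimal Python sketch of the widely used American Soundex rules. It is an assumption that PowerCenter's SOUNDEX function follows exactly these rules; edge cases (for example, how H, W, and vowels between equal codes are handled) may differ, so treat the function below as an illustration rather than a reimplementation.

```python
# Consonant groups of the common American Soundex variant.
CODES = {}
for digit, letters in {
    "1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R",
}.items():
    for letter in letters:
        CODES[letter] = digit


def soundex(value):
    """Encode a string as its first letter plus three digits (zero-padded)."""
    letters = [c for c in value.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    previous = CODES.get(letters[0], "")
    for letter in letters[1:]:
        if letter in "HW":
            continue  # H and W do not separate consonants with the same code
        code = CODES.get(letter, "")  # vowels and Y have no code
        if code and code != previous:
            result += code
        previous = code  # a vowel resets the run, so a repeated code counts again
        if len(result) == 4:
            break
    return result.ljust(4, "0")


print(soundex("Smith"), soundex("Smyth"))  # S530 S530
print(soundex("Robert"))                   # R163
```

Because both spellings map to the same code, a Soundex column can serve as a join or lookup key for names that sound alike but are spelled differently.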


New Datatypes
To handle Oracle, DB2, and SQL Server datatypes, PowerCenter 7 supports:
blob: Large objects containing unstructured binary data
clob: Large objects containing single-byte fixed-width character data
nclob: Large objects containing single-byte or multiple-byte fixed-width character data
xmltype: Structured XML data (Oracle only)

Workflow Manager: Error Logging Enhancement


Error Types
Transformation error
  Data row has only passed partway through the mapping transformation logic
  An error occurs within a transformation
Data reject
  Data row is fully transformed according to the mapping logic
  Due to a data issue, it cannot be written to the target
  A data reject can be forced by an Update Strategy

Error Logging Off/On


Transformation errors
  Logging OFF (Default): Written to session log then discarded
  Logging ON: Appended to flat file or relational tables. Only fatal errors written to session log.
Data rejects
  Logging OFF (Default): Appended to reject file (one .bad file per target)
  Logging ON: Written to row error tables or file

Setting Error Log Options


In Session task

Error Log Type
Log Row Data
Log Source Row Data

Error Logging Off - Specifying Reject Files


In Session task

1 file per target


Error Logging Off - Transformation Errors


Details and data are written to session log
Data row is discarded
If data flows are concatenated, corresponding rows in the parallel flow are also discarded
Transformation Error


Error Logging Off - Data Rejects


Conditions causing data to be rejected include:
  Target database constraint violations, out-of-space errors, log space errors, null values not accepted
  Data-driven records, containing value 3 or DD_REJECT (the reject has been forced by an Update Strategy)
  Target table properties reject truncated/overflowed rows

Sample reject file


Row indicator (first column): 0 = INSERT, 1 = UPDATE, 2 = DELETE, 3 = REJECT

0,1313,Regulator System,Air Regulators,250.00,150.00
1,1314,Second Stage Regulator,Air Regulators,365.00,265.00
2,1390,First Stage Regulator,Air Regulators,170.00,70.00
3,2341,Depth/Pressure Gauge,Small Instruments,105.00,5.00
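
A reject file in the layout of this sample can be read with a few lines of Python. The sketch below assumes exactly the comma-separated layout of the four sample rows, where the first column is the row indicator; actual reject files may carry additional indicator columns that are not visible in this sample. The function name `read_reject_rows` is hypothetical.

```python
import csv
import io

# Row indicator values as labeled in the sample reject file.
ROW_INDICATORS = {"0": "INSERT", "1": "UPDATE", "2": "DELETE", "3": "REJECT"}

SAMPLE = """\
0,1313,Regulator System,Air Regulators,250.00,150.00
1,1314,Second Stage Regulator,Air Regulators,365.00,265.00
2,1390,First Stage Regulator,Air Regulators,170.00,70.00
3,2341,Depth/Pressure Gauge,Small Instruments,105.00,5.00
"""


def read_reject_rows(text):
    """Return (row type, column values) pairs for each line of a reject file."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        row_type = ROW_INDICATORS.get(record[0], "UNKNOWN")
        rows.append((row_type, record[1:]))
    return rows


for row_type, values in read_reject_rows(SAMPLE):
    print(row_type, values[0], values[1])
# INSERT 1313 Regulator System
# UPDATE 1314 Second Stage Regulator
# DELETE 1390 First Stage Regulator
# REJECT 2341 Depth/Pressure Gauge
```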


Log Row Data


Logs:

Session metadata
Reader, transformation, writer and user-defined errors
For errors on input, logs row data for I and I/O ports
For errors on output, logs row data for I/O and O ports

Logging Errors to a Relational Database 1

Relational Database Log Settings


Logging Errors to a Relational Database 2


PMERR_SESS: Stores metadata about the session run, such as workflow name, session name, repository name, etc.

PMERR_MSG: Error messages for a row of data are logged in this table

PMERR_TRANS: Metadata about the transformation, such as transformation group name, source name, and port names with data types, is logged in this table

PMERR_DATA: The row data of the error row as well as the source row data is logged here. The row data is in a string format such as [indicator1: data1 | indicator2: data2]

Logging Errors to a Flat File 1


Creates delimited Flat File with || as column delimiter

Flat File Log Settings (Defaults shown)


Logging Errors to a Flat File 2


Format: Session metadata followed by de-normalized error information

Sample session metadata

**********************************************************************
Repository GID: 510e6f02-8733-11d7-9db7-00e01823c14d
Repository: RowErrorLogging
Folder: ErrorLogging
Workflow: w_unitTests
Session: s_customers
Mapping: m_customers
Workflow Run ID: 6079
Worklet Run ID: 0
Session Instance ID: 806
Session Start Time: 10/19/2003 11:24:16
Session Start Time (UTC): 1066587856
**********************************************************************

Row data format


Transformation || Transformation Mapplet Name || Transformation Group || Partition Index || Transformation Row ID || Error Sequence || Error Timestamp || Error UTC Time || Error Code || Error Message || Error Type || Transformation Data || Source Mapplet Name || Source Name || Source Row ID || Source Row Type || Source Data
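
A row in this format can be split into named fields with a short Python sketch. The field names come from the row data format above; everything else is assumed for illustration, including the function name `parse_error_row` and every value in the sample row, which is a made-up placeholder rather than real PowerCenter output. The naive split would also break if a data value itself contained ||.

```python
# Field names in the order given by the row data format.
FIELDS = [
    "Transformation", "Transformation Mapplet Name", "Transformation Group",
    "Partition Index", "Transformation Row ID", "Error Sequence",
    "Error Timestamp", "Error UTC Time", "Error Code", "Error Message",
    "Error Type", "Transformation Data", "Source Mapplet Name", "Source Name",
    "Source Row ID", "Source Row Type", "Source Data",
]


def parse_error_row(line):
    """Split one ||-delimited error row into a dict keyed by field name."""
    values = [value.strip() for value in line.split("||")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


# Hypothetical placeholder values, one per field, joined with the column delimiter.
sample_row = " || ".join([
    "exp_example", "N/A", "Input", "1", "7", "1",
    "10/19/2003 11:24:17", "1066587857", "EXAMPLE_CODE", "example message",
    "3", "example transformation data", "N/A", "example_source",
    "7", "0", "example source data",
])

record = parse_error_row(sample_row)
print(record["Error Code"], "|", record["Source Name"])
# EXAMPLE_CODE | example_source
```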


Log Source Row Data 1


Separate checkbox in session task
Logs the source row associated with the error row
Logs metadata about source, e.g. Source Qualifier, source row id, and source row type

Log Source Row Data 2


Source row logging is not available downstream of an Aggregator, Rank, Joiner, or Sorter (where output rows are not uniquely correlated with input rows)
Source row logging available
Source row logging not available

Lab NF6 Error Logging
