
PowerMart/Center 6

Performance/Tuning Overview
Informatica Developer Network

Mark Haas
Senior Consultant
Professional Services

P&T022603
A Performance Tuning Methodology
 How do you optimize and tune Informatica?
 You need a combination of basic knowledge and
techniques
 Knowledge
 Know the basic Informatica architecture
 Know the other building blocks in the system
 Database features and architecture
 Operating system features and architecture
 Know the limits of what is possible
 Know your goals

 Techniques
 Know how to find the bottlenecks
 Know how to eliminate them

2
The Production Environment
[Diagram: a multi-vendor production environment: Informatica and the OS at the center, connected to a DBMS over the LAN/WAN, with many disks attached throughout]
 This is a multi-vendor, multi-system environment
 There are many components involved
 Operating systems, databases, networks and I/O
 Usually need to monitor performance in several places
 Usually need to monitor outside Informatica

 Tuning involves an iterative approach


 1) Find the biggest performance problem
 2) Eliminate or reduce it
 3) Go to step 1

3
The Production Environment

[Diagram highlight: the DBMS and OS layers of the production environment]

 Database and OS considerations
 Databases are usually not configured for optimal performance when first installed out of the box
 The amount of memory and the number of CPUs dedicated to the ETL process also play a significant role

4
The Production Environment

[Diagram highlight: the Informatica server layer of the production environment]

 Session performance will be affected by:
 Properly configuring cache attributes on mapping objects (Lookups, Aggregators, etc.)
 Partitioning wherever possible
 Using sound mapping strategies

5
Performance Tuning
 There are two general areas to optimize and tune
 Components external to Informatica (OS, memory, etc.)
 Components internal to Informatica (tasks, mappings, workflows, etc.)

 Getting data through the Informatica engine
 This involves optimizing at the task and mapping level
 It also involves optimizing the system so that Informatica itself runs well

 Getting data into and out of the Informatica engine
 This usually involves optimizing non-Informatica components
 The engine can’t run faster than the source or target

6
Measuring Performance
 Several types of bottlenecks can affect performance
 Network
 System
 Database
 Informatica Mappings and Tasks

 There are several ways to measure performance, such as the total amount of data (volume) per unit of time
 Volume can be measured as:
 Number of bytes
 Number of rows
 Time can be measured as:
 CPU or process time
 “Wall Clock” time

7
Measuring Performance
 For the purpose of identifying bottlenecks we will use:
 “Wall clock” time as a relative measure of elapsed time
 Number of rows loaded over that period of time (rows per second)

 Rows per second (rows/sec) allows us to measure a session’s performance over a period of time and across changes to our environment.
 Rows per second can vary widely depending on the size of the row (number of bytes), the type of source/target (flat file or relational), and the underlying hardware.

8
Measuring Performance
 Establishing the baseline using the Workflow Manager
 Run the workflow with the session task to be measured
 View the session properties in the Workflow Monitor at the end of the run and record the number of rows loaded and the session’s start and end times
 Subtract the start time from the end time and convert to seconds to get the total session time
 Divide the number of rows loaded by the total session time in seconds (see the sketch below)
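
A minimal sketch of the baseline arithmetic in Python; the timestamp format is an assumption (use whatever the Workflow Monitor actually displays):

from datetime import datetime

def baseline_rows_per_sec(rows_loaded, start, end):
    # Timestamps as recorded from the Workflow Monitor (format is an assumption)
    fmt = "%m/%d/%Y %H:%M:%S"
    elapsed = (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()
    return rows_loaded / elapsed

# Example: 1,200,000 rows loaded in a 10-minute session -> 2000.0 rows/sec
print(baseline_rows_per_sec(1200000, "02/26/2003 14:00:05", "02/26/2003 14:10:05"))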

9
Measuring Performance
 Things to note:
 Calculated rows per second is not the same as “Write Throughput”
 For multiple targets, use the sum of rows loaded for targets that are similar in row size
 For multiple partitions, use the sum of rows loaded across all partitions
 Complex mappings with multiple data flows may require the creation of multiple mappings
 Monitor background processes external to Informatica
 Establish a baseline and work to make improvements
 Use the MX views and Real-Time Metadata Reporter to view historical session task information

10
Server Resource Architecture
 Two session task parameters control the processing
pipeline
 The session shared memory size (DTM Buffer Size)
 The buffer block size

 These parameters are specified per session task, in the Workflow Manager

11
Server Resource Architecture
 Session Shared Memory Size controls the total amount of memory used to buffer rows internally by the reader and writer
 This sets the total number of blocks available
 The usual value is about 25 MB (25,000,000)
 If the block size is 64K, then you get about 16 blocks/MB * 25 MB = 400 blocks

 Buffer Block Size controls the size of the blocks that move through the pipeline
 Optimum size depends on the row size being processed (see the sketch below)
 64K (64,000) → 64 rows of 1K
 128K (128,000) → 128 rows of 1K
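
A minimal sketch of the block arithmetic above; the function names are illustrative, not actual Informatica settings:

def buffer_blocks(dtm_buffer_bytes, block_bytes):
    # Total number of buffer blocks available to the session
    return dtm_buffer_bytes // block_bytes

def rows_per_block(block_bytes, row_bytes):
    # How many rows fit in a single buffer block
    return block_bytes // row_bytes

print(buffer_blocks(25000000, 64000))  # 390, close to the 400-block rule of thumb
print(rows_per_block(64000, 1000))     # 64 rows of 1K per 64K block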

12
Server Resource Architecture

[Chart: rows/sec (0–900) vs. buffer block size (32K, 64K, 96K, 128K) for 1K, 2K, and 3K rows, with shared memory held constant at 25 MB]

13
Server Resource Architecture

[Chart: rows/sec (0–1000) vs. shared memory size (12–35 MB) for 1K, 2K, and 3K rows, with buffer block size held constant at 64K]

14
Identifying Source Bottlenecks
 Reading from a flat file usually does not cause a bottleneck
 Configure a session with a flat file target instead of a relational target
 You can reuse the one created for a write test

 Place a Filter set to FALSE on the output of each source qualifier in the mapping
 Execute the SQL generated by the source qualifier externally and time it (see the sketch below)
[Mapping diagrams: Original vs. Modified, where a Filter set to FALSE creates a read-throughput-only test]
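
A minimal sketch of timing the generated SQL outside Informatica, assuming a Python DB-API driver; pyodbc, the DSN, and the query text are all placeholders:

import time
import pyodbc  # assumption: any DB-API driver for your source database works similarly

conn = pyodbc.connect("DSN=source_db")  # placeholder connection string
cursor = conn.cursor()

# Paste in the SQL generated by the source qualifier (visible in the session log)
generated_sql = "SELECT ORDER_ID, ORDER_DATE, AMOUNT FROM ORDERS"  # placeholder

start = time.time()
cursor.execute(generated_sql)
rows = 0
batch = cursor.fetchmany(10000)
while batch:
    rows += len(batch)               # discard the rows; we only want raw read speed
    batch = cursor.fetchmany(10000)
elapsed = time.time() - start
print("~%.0f rows/sec read from the source" % (rows / elapsed))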

15
Identifying Source Bottlenecks
 Modified “Read Test” mapping
 Used to identify a read or mapping bottleneck
 Create a new mapping that bypasses transformations
 If throughput changes significantly, the problem could be with the transformations

[Mapping diagrams: Original vs. Modified read-test mapping with transformations bypassed]

16
Identifying Target Bottlenecks
 Writing to a flat file usually does not cause a
bottleneck
 Configure a session task to write to a flat file target
instead of a relational target
 If the target is a flat file, the problem is most likely elsewhere

17
Identifying Mapping Bottlenecks
 Generally, if the bottleneck is not in the reader or writer process, the next step is to review the mapping
 Mapping bottlenecks can be created by improperly configured aggregator, joiner, sorter, rank, and lookup caches

18
Identifying Session Task Bottlenecks
 Check the commit interval
 Check the session log for excessive transformation errors
 Decimal arithmetic enabled
 Update (else insert) enabled
 Incorrect partitioning choices (PowerCenter)
 Pre- and post-session task commands
 Tracing level

19
Mapping Optimizing
 Single-Pass Read
 Use a single SQL statement when reading multiple tables from the same database
 Data type conversions are expensive
 Watch out for hidden port-to-port data type conversions
 Overuse of string and character conversion functions
 Data type conversions are expensive
 Watch out for hidden port to port data type conversions
 Over use of string and character conversion functions

 Use filters early and often
 Filters can be applied as SQL overrides (Source Qualifiers, Lookups) and as transformations to reduce the amount of data processed

 Simplify expressions
 Factor out common logic
 Use variables to reduce the number of times a function is evaluated

20
Mapping Optimizing
 Use operators instead of functions when possible
 The concatenation operator (‘||’) is faster than the CONCAT function

 Simplify nested IIFs when possible

 Use proper cache sizing for Aggregator, Rank, Sorter, Joiner and Lookup transformations
 Incorrect cache sizing causes additional disk swapping, which can degrade performance significantly
 Use the performance counters to verify correct sizing

21
Using Performance Counters
 All transformations have basic counters that are maintained by the server
 Counters are enabled at the session level using the Collect Performance Data option
 The server creates a “session_name.perf” file for counter statistics
 Default location is the session log directory
 Collecting performance data has some impact on session performance, similar to tracing
 The important counters are the ones showing reads from and writes to disk for Aggregators, Ranks, Sorters, and Joiners, and the rows read from cache for Lookups (see the sketch below)
 Refer to the help documentation to find the cache calculations for the Aggregator, Lookup, Sorter, Joiner, and Rank
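
A minimal sketch of scanning a session’s .perf file for nonzero disk counters; the “counter = value” line layout assumed here is illustrative, so adjust the parsing to your server version’s actual format:

def disk_counters(perf_path):
    # Counters whose names indicate cache reads/writes hitting disk
    suspects = ("readfromdisk", "writetodisk", "readsfromdisk", "writestodisk")
    with open(perf_path) as f:
        for line in f:
            if "=" not in line:
                continue
            name, _, value = line.rpartition("=")
            key = name.strip().lower().replace(" ", "").replace("_", "")
            if any(s in key for s in suspects) and value.strip() not in ("", "0"):
                yield line.strip()

for hit in disk_counters("s_m_load_orders.perf"):  # hypothetical session name
    print(hit)  # any nonzero disk counter suggests an undersized cache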

22
Using Performance Counters
 Performance counters provide a variety of statistics for each transformation in a mapping
 Counters are enabled in the session properties

[Screenshot: Session Wizard with the “Collect Performance Data” option checked]

23
Session Task Optimizing
 Run partitioned sessions
 Improved performance
 Better utilization of CPU, I/O and data source/target
bandwidth

 Use “Incremental Aggregation” when possible
 Good for “rolling average” type aggregation

 Reduce transformation errors
 Put logic in place to reduce “bad data” such as nulls in a calculation

 Reduce the level of tracing

Tip: See the “Velocity Methodology” document for further information

24
Partitioned Extraction and Load
 Key Range
 Round Robin
 Hash Auto Keys
 Hash User Keys
 Pass Through

25
Partitioned Extraction and Load
 Key Range Partition
 Data is distributed between partitions according to
pre-defined range values for keys
 Available in PowerCenter 5, but only for Source
Qualifier
 Key Range partitioning can now be applied to other
transformations
 Common uses for this new functionality:
 Apply to a Target Definition to align output with the physical partitioning scheme of the target table
 Apply to a Target Definition to write all data to a single file to stream into a database bulk loader that does not support concurrent loads (e.g. Teradata, DB2)

26
Partitioned Extraction and Load
 Key Range Partition (continued)
 You can select input or input/output ports for the keys, but not variable or output-only ports
 Remember, the partition occurs BEFORE the transformation; variable and output-only ports are not allowed because they have not yet been evaluated
 You can select multiple keys to form a composite key
 Range specification is: Start Range and End Range
 You can also specify an open range
 NULL values will go to the first partition
 All unmatched rows will also go to the first partition; the user will see the following warning message once in the log file (the routing logic is sketched below):
TRANSF_1_1_2_1> TT_11083 WARNING! A row did not match any of the key ranges
specified at transformation [EXPTRANS]. This row and all subsequent unmatched rows will
be sent to the first target partition.
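
A minimal sketch of the key-range routing semantics described above; the ranges are illustrative and the end-exclusive comparison is an assumption:

def key_range_partition(key, ranges):
    # ranges: list of (start, end) per partition; None means an open range.
    # NULL keys and unmatched keys go to the first partition, mirroring the
    # server behavior described above.
    if key is None:
        return 0
    for i, (start, end) in enumerate(ranges):
        if (start is None or key >= start) and (end is None or key < end):
            return i
    return 0  # unmatched rows fall back to the first partition

ranges = [(None, 1000), (1000, 2000), (2000, None)]  # open-ended first and last
print(key_range_partition(1500, ranges))  # 1
print(key_range_partition(None, ranges))  # 0 (NULL goes to the first partition)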

27
Partitioned Extraction and Load
 Round Robin Partitioning
 The Informatica Server evenly distributes the data to each partition (sketched below)
 The user need not specify anything because key values are not interrogated
 Common use:
 Apply to a flat file source qualifier when dealing with unequal
input file sizes
– Use “user hash” when there are downstream lookups/joiners
 Trick: All but one of the input files can be empty… you no
longer have to physically partition input files
– Note: There are performance implications with doing this.
Sometimes it’s better, sometimes it’s worse.
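
A minimal sketch of round-robin distribution, for intuition only:

from itertools import cycle

def round_robin(rows, n_partitions):
    # Distribute rows evenly across partitions, ignoring key values
    partitions = [[] for _ in range(n_partitions)]
    for row, p in zip(rows, cycle(range(n_partitions))):
        partitions[p].append(row)
    return partitions

print(round_robin(["r1", "r2", "r3", "r4", "r5"], 2))  # [['r1', 'r3', 'r5'], ['r2', 'r4']]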

28
Partitioned Extraction and Load
 Hash Partitioning
 Data is distributed between partitions according to a “hash” function applied to the key values
 PowerCenter 5 applies Auto Hash Partitioning automatically for Aggregator and Rank transformations
 Goal: Evenly distribute data, while making sure that like key values are always processed by the same partition
 A “hash function” is applied to a set of ports
 The hash function returns a value between 1 and the number of partitions
 A “good” hash function provides a uniform distribution of return values
 – Not all 1s or all 2s, but an even mix
 Based on this return value, the row is routed to the corresponding partition (e.g. if 1, send to partition 1; if 2, send to partition 2, etc.), as sketched below
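
A minimal sketch of hash-based routing; zlib.crc32 merely stands in for the server’s internal hash function, which is not documented here:

import zlib

def hash_partition(key_ports, n_partitions):
    # Like key values always hash to the same partition; a good hash
    # spreads distinct keys evenly across partitions.
    key = "|".join(str(p) for p in key_ports)
    return zlib.crc32(key.encode()) % n_partitions + 1  # returns 1..n_partitions

print(hash_partition(["ACME", "2003-02-26"], 4))  # same inputs always -> same partition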

29
Partitioned Extraction and Load
 Hash Auto Key
 No need to specify the keys to hash on; automatically uses all key ports (e.g. the Group By key or Sort key) as a composite key
 Only valid for unsorted Aggregator, Rank and Sorter
 The default partition type for unsorted Aggregator and Rank
 NULL values are converted to zero for hashing

 Hash User Key
 Just like Hash Auto Key, but the user explicitly specifies the ports to be used as the key
 Only input and input/output ports are allowed
 Common uses:
 When dealing with input files of unequal sizes, hash partition (vs. round-robin) data into downstream Lookups and Joiners to improve “locality of reference” of caches (hash on the ports used in the lookup/join condition)
 “Override” the auto hash for performance reasons
 – Hashing is faster on numeric values than on strings

30
Partitioned Extraction and Load
 Pass Through Partitioning
 Data is passed through to the next stage within the current partition
 Since data is not redistributed, the user need not specify anything
 Common use:
 Create additional stages (processing threads) within the pipeline to improve performance

31
Partitioned Extraction and Load
 The Partition tab appears in the Session Task within the Workflow Manager

[Screenshot: the session task’s Partition tab; partitioning is based on the transformation, so select the appropriate partition type]
32
Partitioned Extraction and Load
 By default, session tasks have the following partition points and partition schemes:
 Relational Source, Target (Pass Through)
 File Source, Target (Pass Through)
 Unsorted Aggregator, Rank (Auto Hash)
 NOTE: You cannot delete the default partition points for sources and targets
 You cannot run a debug session with more than one partition
 Just like PowerCenter 5…

33
Partitioning Do’s
 Cache as much as possible in memory
 Spread cache files across multiple physical devices, both within and across partitions
 Unless the directory is hosted on some kind of disk array, configure disk-based caches to use as many disk devices as possible

 “Round Robin” or “Hash” partition a single input file until you determine this is a bottleneck
 “Range Partition” to align with the physical partitioning scheme of source/target tables
 “Pass Through” partition to apply more CPU resources to a pipeline (when the transformation stage is the bottleneck)

34
Partitioning Don’ts
 Don’t add partition points if the session is already source or target constrained
 Tune the source or target to eliminate the bottleneck

 Don’t add partition points if the CPUs are already maxed out (%idle < ~5%)
 Eliminate unnecessary processing and/or buy more CPUs

 Don’t add partition points if the system performance monitor shows regular “page out” activity
 Eliminate unnecessary processing and/or buy more memory

 Don’t add multiple partitions until you’ve tested and tuned a single partition
 You’ll be glad you did

35
Default Partition Points

[Diagram: default partition points along the pipeline: Reader -> Transformation -> Transformation -> Writer]

Default Partition Points

Default Partition Point       Default Partition Type   Description

Source Qualifier or           Pass-through             Controls how the Server reads data from the
Normalizer Transformation                              source and passes it into the source qualifier

Rank and unsorted             Hash auto-keys           Ensures that the Server groups rows before it
Aggregator Transformation                              sends them to the transformation

Target Instances              Pass-through             Controls how the instances distribute data to
                                                       the targets

36
Session Performance Thread Statistics
***** RUN INFO FOR TGT LOAD ORDER GROUP [1], SRC PIPELINE [1] *****

Reader stage:
MASTER> PETL_24018 Thread [READER_1_1_1] created for the read stage of partition point [SQ_order_data] has completed: Total Run Time = [79.694595] secs, Total Idle Time = [21.450840] secs, Busy Percentage = [73.083695].

DTM stage:
MASTER> PETL_24019 Thread [TRANSF_1_1_1_1] created for the transformation stage of partition point [SQ_order_data] has completed: Total Run Time = [80.135229] secs, Total Idle Time = [0.711023] secs, Busy Percentage = [99.112721].

Writer stage:
MASTER> PETL_24022 Thread [WRITER_1_1_1] created for the write stage of partition point(s) [order_data_out] has completed: Total Run Time = [80.936382] secs, Total Idle Time = [2.123060] secs, Busy Percentage = [97.376878].

MASTER> PETL_24021 ***** END RUN INFO *****

[Diagram: Reader -> Data Transformation -> Writer]
Note: Stages overlap when possible

37
Using Thread Statistics
 Look for stages that are ~100% busy; these are likely candidates for “pass-through” partition points to allow concurrent processing (a parsing sketch follows)
 If the repartition rules do not allow sub-dividing the “100% busy” thread, then consider adding another partition
 This only helps if you have available CPU capacity
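
A minimal sketch of pulling busy percentages out of a session log to spot the ~100% busy stages; the regular expression targets the PETL lines shown on the previous slide, and the log path is hypothetical:

import re

PAT = re.compile(r"Thread \[(\w+)\].*?Busy Percentage = \[([\d.]+)\]", re.S)

def busy_stages(log_text, threshold=95.0):
    # Return (thread, busy%) pairs at or above the threshold
    return [(t, float(b)) for t, b in PAT.findall(log_text) if float(b) >= threshold]

with open("session.log") as f:  # hypothetical session log path
    for thread, busy in busy_stages(f.read()):
        print("%s: %.1f%% busy (candidate for a pass-through partition point)" % (thread, busy))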

38
Informatica Resources
 Professional Services
 Contact the Regional Manager for details and pricing

 Educational Services
 Performance & Tuning Courses

 Informatica Methodology
 http://www1.informatica.com/methodology

 Informatica Developer Network downloads & forums
 http://devnet.informatica.com

39