Example of SCD1 and Update Strategy


Simple Example of SCD Type 1-UPSERT In Informatica

Many beginners struggle to get SCD working manually in Informatica. Let's walk through a
simple example, step by step.
Assumption: working as the SCOTT user in Oracle.
SRC:
create table emps_us as select empno,ename,sal from emp ;
TGT:
create table empt_us as select empno,ename,sal from scott.emp where 1=2 ;
alter table empt_us add constraint eno_pk primary key(empno) ;
Step 1: Let's get the simple pass-through working
Transfer the data from Source to target using Informatica
Note: for some reason, if this initial load is not done, the update may not work.
Step 2: Let's get the update working.
Objective: when source rows change, update only those changed rows in the target.
SRC: update emps_us set sal=sal+1000 where empno in ( 7900,7902,7934);
Power Center Designer
Drag in the source and target, and create an Update Strategy transformation.
Straight link: src - UPDTRANS - tgt
Now we need to look up the target to see which rows to update. We will do this with an
unconnected Lookup transformation, looking for rows where SAL has changed.
Create the Lookup transformation, select the target table, and add two input ports, IN_EMPNO
and IN_SAL. These values will be supplied when the lookup is called. Add the conditions
empno = IN_EMPNO and SAL != IN_SAL
In the Ports tab, enable R (return) for empno. This just signifies that if the conditions are met,
a value is returned.
Now we need to edit update strategy expression as
IIF( NOT ISNULL( :LKP.LKPTRANS(EMPNO,SAL)),DD_UPDATE,DD_REJECT )
Save and create a Workflow.
Important:
1) Workflow -> Properties -> Treat Source Rows As: change it to Data Driven.
2) The target must have a primary key for the update to work.
Now run the workflow and observe that the modified rows get updated.
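The decision the mapping makes in Step 2 can be sketched in plain Python. This is a minimal sketch, not Informatica code: the dictionary stands in for the empt_us target table and its contents are made up for illustration.

```python
# Illustrative sketch of Step 2: flag a source row for update only when a
# target row with the same EMPNO exists with a *different* SAL.
target = {7900: 950, 7902: 3000, 7934: 1300}   # empt_us rows: empno -> sal

def lkp_trans(empno, sal):
    # Mimics the unconnected lookup with conditions: empno = IN_EMPNO and SAL != IN_SAL.
    # Returns the matched empno (the R port) or None when there is no match.
    if empno in target and target[empno] != sal:
        return empno
    return None

def update_strategy(empno, sal):
    # IIF( NOT ISNULL( :LKP.LKPTRANS(EMPNO,SAL)), DD_UPDATE, DD_REJECT )
    return "DD_UPDATE" if lkp_trans(empno, sal) is not None else "DD_REJECT"

print(update_strategy(7900, 1950))  # salary changed -> DD_UPDATE
print(update_strategy(7902, 3000))  # unchanged      -> DD_REJECT
print(update_strategy(9999, 500))   # not in target  -> DD_REJECT
```

Rows that come back DD_REJECT are simply dropped by the Integration Service, which is why only the changed rows reach the target.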
Step 3: Now let's get the insert working as well.
Objective: along with updating the existing rows, we need to add new rows.

Power Center Designer


In the same mapping, drag in one more instance of the target and create a new Update Strategy
transformation.
Straight link: SRC - NEW UPDTRANS - TGT
Create a Lookup transformation for the target table. Add the port IN_EMPNO, the condition
EMPNO = IN_EMPNO, and enable R for empno. The lookup returns a value if the condition is satisfied.
Now edit the update Strategy expression as below
IIF(ISNULL(:LKP.LKPTRANS(EMPNO)) ,DD_INSERT,DD_REJECT)
Save, then refresh the mapping in the workflow.
Do the following modification to source
insert into emps_us values ( 1,'N1',100);
update emps_us set sal=sal+1000 where empno in ( 7900,7902,7934);
Start the workflow to see your SCD Type 1 working
The above example gets the basic behavior working.
You can keep optimizing, perhaps with a single UPDTRANS and a single target. Edit the strategy
expression as
IIF( ISNULL(:LKP.LKPTRANS_INSERT(EMPNO)), DD_INSERT, IIF( ISNULL(:LKP.LKPTRANS_UPDATE(EMPNO,SAL)), DD_REJECT, DD_UPDATE ) )
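The combined expression can be traced through in a short Python sketch. This is illustrative only (the dictionary plays the part of the target table and its contents are invented): a row missing from the target is inserted, an existing row is updated only when SAL differs, and everything else is rejected.

```python
# Illustrative sketch of the combined insert/update strategy expression.
target = {7900: 950, 7902: 3000}   # empno -> sal

def lkp_insert(empno):
    # LKPTRANS_INSERT: condition EMPNO = IN_EMPNO
    return empno if empno in target else None

def lkp_update(empno, sal):
    # LKPTRANS_UPDATE: condition EMPNO = IN_EMPNO and SAL != IN_SAL
    return empno if empno in target and target[empno] != sal else None

def strategy(empno, sal):
    # IIF(ISNULL(insert-lkp), DD_INSERT, IIF(ISNULL(update-lkp), DD_REJECT, DD_UPDATE))
    if lkp_insert(empno) is None:
        return "DD_INSERT"
    return "DD_REJECT" if lkp_update(empno, sal) is None else "DD_UPDATE"

print(strategy(1, 100))       # new empno        -> DD_INSERT
print(strategy(7900, 1950))   # changed salary   -> DD_UPDATE
print(strategy(7902, 3000))   # unchanged        -> DD_REJECT
```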

SCD Type-1 Implementation in Informatica using dynamic Lookup


SCD Type-1: A Type 1 change overwrites an existing dimensional attribute with
new information. In the customer name-change example, the new name
overwrites the old name, and the value for the old version is lost. A Type 1
change updates only the attribute, doesn't insert new records, and affects no
keys. It is easy to implement but does not maintain any history of prior attribute
values.
Implementation:
Source:
Create CUST source using following script.
CREATE TABLE CUST
(CUST_ID NUMBER,
CUST_NM VARCHAR2(250 BYTE),
ADDRESS VARCHAR2(250 BYTE),
CITY VARCHAR2(50 BYTE),

STATE VARCHAR2(50 BYTE),


INSERT_DT DATE,
UPDATE_DT DATE);
Target:
CREATE TABLE STANDALONE.CUST_D
(
PM_PRIMARYKEY INTEGER,
CUST_ID NUMBER,
CUST_NM VARCHAR2(250 BYTE),
ADDRESS VARCHAR2(250 BYTE),
CITY VARCHAR2(50 BYTE),
STATE VARCHAR2(50 BYTE),
INSERT_DT DATE,
UPDATE_DT DATE);
CREATE UNIQUE INDEX STANDALONE.CUST_D_PK ON
STANDALONE.CUST_D(PM_PRIMARYKEY);
ALTER TABLE CUST_D ADD (CONSTRAINT CUST_D_PK PRIMARY KEY
(PM_PRIMARYKEY));
Import the source and target into Informatica using the Source Analyzer and Target
Designer.
Create mapping m_Use_Dynamic_Cache_To_SCD_Type1 and drag the CUST source
from Sources into the Mapping Designer.

Create lookup transformation lkp_CUST_D for CUST_D target table.

Create input ports in_CUST_ID, in_CUST_NM, in_ADDRESS, in_CITY and
in_STATE in the lkp_CUST_D transformation. Connect CUST_ID, CUST_NM,
ADDRESS, CITY and STATE from the source qualifier to the lkp_CUST_D lookup
ports in_CUST_ID, in_CUST_NM, in_ADDRESS, in_CITY and in_STATE
respectively.
Create condition in lookup transformation CUST_ID=in_CUST_ID in conditions
tab.

Select dynamic cache and insert else update options in lookup transformation
properties.

Assign ports for lookup ports as shown in below screen shot.

Create an Expression transformation, drag all attributes from the lookup
transformation into it, and rename the attributes with respect to the source or target
attributes, so that it is easy to tell which fields come from the source and which
come from the target.

Create one dummy output port in the Expression transformation to pass the date to the target,
and assign SYSDATE in the expression editor.

Create router transformation and drag attributes from expression transformation


to router transformation as shown in below screen shot.

Create two groups in router transformation one for INSERT and another one for
UPDATE.
Give condition NewLookupRow=1 for insert group and NewLookupRow=2 for
update group.

Connect the INSERT group from the router to the insert pipeline of the target, and the
UPDATE group to the update pipeline of the target through Update Strategy transformations.

For the upd_INSERT Update Strategy transformation give the expression DD_INSERT,
and DD_UPDATE for the upd_UPDATE Update Strategy transformation.
Create work flow wkfl_Use_Dynamic_Cache_To_SCD_Type1 with session
s_Use_Dynamic_Cache_To_SCD_Type1 for mapping
m_Use_Dynamic_Cache_To_SCD_Type1.

With this, the coding for SCD Type 1 using a dynamic Lookup transformation is complete.
Execution:
Insert records into the source CUST table using the following insert scripts.
SET DEFINE OFF;
Insert into CUST (CUST_ID, CUST_NM, ADDRESS, CITY, STATE, INSERT_DT,
UPDATE_DT)
Values (80001, 'Marion Atkins', '100 Main St.', 'Bangalore', 'KA',
SYSDATE,SYSDATE);
Insert into CUST (CUST_ID, CUST_NM, ADDRESS, CITY, STATE, INSERT_DT,
UPDATE_DT)
Values (80002, 'Laura Jones', '510 Broadway Ave.', 'Hyderabad', 'AP',
SYSDATE,SYSDATE);
Insert into CUST (CUST_ID, CUST_NM, ADDRESS, CITY, STATE, INSERT_DT,
UPDATE_DT)
Values (80003, 'Jon Freeman', '555 6th Ave.', 'Bangalore', 'KA',
SYSDATE,SYSDATE);
COMMIT;
Data in source will look like below.

Start the workflow after inserting the records into the CUST table. After the workflow
completes, all the records will be loaded into the target, and the data will look like below.

Now update any record in the source and re-run the workflow; it will update that record in
the target. Any records in the source that are not present in the target will be inserted into
the target table.
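The behavior of the dynamic lookup cache and the router groups can be sketched in Python. This is illustrative only, not PowerCenter code: a dictionary stands in for the dynamic cache, and the customer data is made up.

```python
# Illustrative sketch of a dynamic lookup with "insert else update":
# the cache row is inserted or refreshed, and NewLookupRow reports what
# happened so the router can send the row down the right pipeline.
cache = {}   # cust_id -> (name, address, city, state): the dynamic lookup cache

def dynamic_lookup(cust_id, row):
    # NewLookupRow: 1 = row inserted into cache, 2 = row updated, 0 = no change
    if cust_id not in cache:
        cache[cust_id] = row
        return 1
    if cache[cust_id] != row:
        cache[cust_id] = row
        return 2
    return 0

def route(new_lookup_row):
    # Router groups: NewLookupRow = 1 -> INSERT group, NewLookupRow = 2 -> UPDATE group
    return {1: "INSERT", 2: "UPDATE"}.get(new_lookup_row, "DROP")

print(route(dynamic_lookup(80001, ("Marion Atkins", "100 Main St.", "Bangalore", "KA"))))  # INSERT
print(route(dynamic_lookup(80001, ("Marion Atkins", "200 Main St.", "Bangalore", "KA"))))  # UPDATE
print(route(dynamic_lookup(80001, ("Marion Atkins", "200 Main St.", "Bangalore", "KA"))))  # DROP
```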

SCD Type 1 step-by-step example:


Step 1: Get an EMP source table.
This table is available in the SCOTT user:
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
Step 2: Get a target table for the same (note this has a newly created surrogate key):
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO SK
SQL> create table t_emp as select * from scott.emp; (you might have to grant create
permission to the scott user in case this is not working)
SQL> alter table t_emp add sk number(15) primary key;
Now we have the source and the target tables created.
Let's start off with the mapping.
Step 3: In the Informatica Designer, get the source and the target tables we just created.
Step 4: Now add a Lookup transformation and select the target table to look up.
Step 5: Drag and drop all the source columns onto the Lookup transformation. (Now we have the
target table's columns at the top of the lookup and the source columns on the lower side of
the Lookup transformation.)
Step 6: Add an Expression transformation and connect all the columns in the Lookup
transformation to it. Now you need to add two more columns here:
1. Insert_flg of integer(15) type. Give the expression for this as: IIF(ISNULL(SK) OR
ISNULL(EMPNO),1,0)
2. Update_flg of integer(15) type: Give the expression for this as: IIF(NOT ISNULL(SK) and
(
( ENAME != ENAME1 ) OR
( JOB != JOB1 ) OR
( MGR != MGR1 ) OR
( HIREDATE != HIREDATE1 ) OR
( SAL != SAL1 ) OR
( COMM != COMM1 ) OR

( DEPTNO != DEPTNO1 )
) ,1,0 )
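The two flag expressions above can be rendered in Python for clarity. This is an illustrative sketch, not Informatica code; the sample rows are invented, and the column comparison is generalized over whichever columns are passed in.

```python
# Illustrative sketch of the Insert_flg / Update_flg expressions.
def insert_flg(sk, empno):
    # IIF(ISNULL(SK) OR ISNULL(EMPNO), 1, 0): lookup found no target row -> insert
    return 1 if sk is None or empno is None else 0

def update_flg(sk, src_row, tgt_row):
    # IIF(NOT ISNULL(SK) AND (any of ENAME..DEPTNO differs), 1, 0)
    return 1 if sk is not None and any(src_row[c] != tgt_row[c] for c in src_row) else 0

src = {"ENAME": "SMITH", "SAL": 900}   # made-up source row
tgt = {"ENAME": "SMITH", "SAL": 800}   # made-up matching target row
print(insert_flg(None, 7369))    # 1 -> new row, route to insert
print(update_flg(10, src, tgt))  # 1 -> SAL changed, route to update
```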
Step 7: Add a Router transformation and connect the source columns (the ones that have "1"
suffixed) to it. Make sure the Insert_flg and Update_flg columns you created are also
connected, and create two groups as follows:
Group 1. Insert_rows: group filter condition for this: Insert_flg
Group 2. Update_rows: group filter condition for this: Update_flg
Step 8: Now connect the Insert_rows group of the router to the target. Note: here you should not
connect the SK column from the router; instead, use a Sequence Generator transformation and
connect its NEXTVAL column to the SK of the target, as this sequence needs to advance to the
next value whenever a new row gets added.
Step 9: Connect the Update_rows group of the router to an Update Strategy transformation
with a strategy expression of DD_UPDATE, and then connect it to target instance 2. (Note:
target instance 2 is nothing but a copy-paste of the target table in the mapping and is not
created in the target DB.)
Step 10: Now create a workflow and the session task for it, and run the workflow.
Make sure you commit the changes made on the source side :-)
The logic is very simple:
1. First, the lookup checks the cache for the existence of a given row.
2. If the SK does not exist, then it will go ahead and insert the row into the target/dimension
table.
3. After this, the sequence generator advances to the next value with respect to the
target/dimension table.
4. If the SK exists, then the condition will point to the Update_flg and will do a DD_UPDATE of
the corresponding row in the target table.
5. The same process continues with the next row onwards.
6. Note: the SK in the target table must be a primary key, without fail.
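The six steps above can be condensed into one compact Python sketch. It is illustrative only, not Informatica code: a dictionary plays the dimension table, `itertools.count` plays the sequence generator, and only SAL is compared for brevity.

```python
# Illustrative end-to-end sketch of the SCD Type 1 flow described above.
from itertools import count

target = {}        # empno -> {"sk": ..., "sal": ...}: stands in for the dimension table
seq = count(1)     # stands in for the Sequence Generator feeding SK

def load_row(empno, sal):
    row = target.get(empno)
    if row is None:                      # SK does not exist -> insert with the next SK
        target[empno] = {"sk": next(seq), "sal": sal}
        return "DD_INSERT"
    if row["sal"] != sal:                # SK exists and a column changed -> overwrite
        row["sal"] = sal
        return "DD_UPDATE"
    return "DD_REJECT"                   # unchanged -> nothing to do

print(load_row(7369, 800))   # DD_INSERT
print(load_row(7369, 900))   # DD_UPDATE
print(load_row(7369, 900))   # DD_REJECT
```

Note how the surrogate key is assigned exactly once, at insert time; Type 1 updates overwrite attributes but never touch the key.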

Slowly Changing Dimensions (SCDs) are dimensions that have data that changes slowly, rather
than changing on a time-based, regular schedule

For example, you may have a dimension in your database that tracks the sales records of your
company's salespeople. Creating sales reports seems simple enough, until a salesperson is
transferred from one regional office to another. How do you record such a change in your sales
dimension?
You could sum or average the sales by salesperson, but if you use that to compare the
performance of salesmen, that might give misleading information. If the salesperson that was
transferred used to work in a hot market where sales were easy, and now works in a market
where sales are infrequent, her totals will look much stronger than the other salespeople in her
new region, even if they are just as good. Or you could create a second salesperson record and
treat the transferred person as a new sales person, but that creates problems also.
Dealing with these issues involves SCD management methodologies:
Type 1:
The Type 1 methodology overwrites old data with new data, and therefore does not track
historical data at all. This is most appropriate when correcting certain types of data errors, such
as the spelling of a name. (Assuming you won't ever need to know how it used to be misspelled
in the past.)
Here is an example of a database table that keeps supplier information:

Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  CA

In this example, Supplier_Code is the natural key and Supplier_Key is a surrogate key.
Technically, the surrogate key is not necessary, since the table will be unique by the natural key
(Supplier_Code). However, the joins will perform better on an integer than on a character string.
Now imagine that this supplier moves their headquarters to Illinois. The updated table would
simply overwrite this record:

Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  IL

The obvious disadvantage to this method of managing SCDs is that there is no historical record
kept in the data warehouse. You can't tell if your suppliers are tending to move to the Midwest,
for example. But an advantage to Type 1 SCDs is that they are very easy to maintain.
Explanation with an Example:
Source Table: (01-01-11)

Empno  Ename  Sal
101    A      1000
102    B      2000
103    C      3000

Target Table: (01-01-11)

Empno  Ename  Sal
101    A      1000
102    B      2000
103    C      3000

The necessity of the lookup transformation is illustrated using the above source and target table.
Source Table: (01-02-11)

Empno  Ename  Sal
101    A      1000
102    B      2500
103    C      3000
104    D      4000

Target Table: (01-02-11)

Empno  Ename  Sal
101    A      1000
102    B      2500
103    C      3000
104    D      4000

In the second month, one more employee with the Ename D has been added to the table, and the
salary of employee 102 has changed to 2500 instead of 2000.

Step 1: Import the source table and target tables.

Create a table named emp_source with three columns, as shown above, in Oracle.

Import the source from the Source Analyzer.

In the same way, create two target tables named emp_target1 and emp_target2.

Go to the Targets menu and click on Generate and Execute to confirm the creation of the
target tables.

The snapshot of the connections using the different kinds of transformations is shown
below.

Step 2: Design the mapping and apply the necessary transformation.

Here in this mapping we are about to use four kinds of transformations, namely:
Lookup transformation, Expression transformation, Filter transformation, and Update
Strategy transformation. The necessity and usage of each transformation is discussed in
detail below.

Lookup Transformation: The purpose of this transformation is to determine whether to insert,
delete, update or reject the rows going into the target table.

The first thing we are going to do is create a Lookup transformation and connect the
Empno from the source qualifier to the transformation.

The snapshot of choosing the target table is shown below.

What the Lookup transformation does in our mapping is look into the target table
(emp_target1) and compare it with the source qualifier to determine whether to insert,
update, delete or reject rows.

In the Ports tab we should add a new column and name it empno1; this is the column
we are going to connect from the source qualifier.

The Input port for the first column should be unchecked, whereas the other ports like
Output and Lookup should be checked. For the newly created column, only the Input and
Output boxes should be checked.

In the Properties tab: (i) Lookup table name -> Emp_Target.

(ii) Lookup Policy on Multiple Match -> Use First Value.

(iii) Connection Information -> Oracle.

In the Conditions tab: (i) Click on Add a new condition.

(ii) The Lookup Table Column should be Empno, the Transformation Port should be Empno1,
and the Operator should be =.
Expression Transformation: After we are done with the Lookup transformation, we use an
Expression transformation to check whether we need to insert the records or update them.
The steps to create an Expression transformation are shown below.

Drag all the columns from both the source and the Lookup transformation and drop them
all onto the Expression transformation.

Now double-click on the transformation, go to the Ports tab, and create two new
columns named insert and update. Both of these columns are going to be our output
data, so we need a check mark only in front of the Output check box.

The snapshot of the Edit Transformation window is shown below.

The conditions we want for our output ports are listed below.

insert: ISNULL(EMPNO1)
update: IIF(NOT ISNULL(EMPNO1) AND DECODE(SAL,SAL1,1,0)=0, 1, 0)

We are all done here. Click on Apply and then OK.
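The two port expressions can be traced in a short Python sketch. This is illustrative only, not Informatica code; the DECODE helper below mimics Informatica's DECODE comparison for this simple two-argument case.

```python
# Illustrative sketch of the insert/update output ports in the Expression transformation.
def decode(a, b, t, f):
    # Mimics Informatica's DECODE(a, b, t, f): t when a equals b, else f
    return t if a == b else f

def insert_port(empno1):
    # insert: ISNULL(EMPNO1) -> true when the lookup found no matching target row
    return empno1 is None

def update_port(empno1, sal, sal1):
    # update: IIF(NOT ISNULL(EMPNO1) AND DECODE(SAL, SAL1, 1, 0) = 0, 1, 0)
    return 1 if empno1 is not None and decode(sal, sal1, 1, 0) == 0 else 0

print(insert_port(None))           # True -> new employee, insert
print(update_port(102, 2500, 2000))  # 1 -> salary changed, update
print(update_port(101, 1000, 1000))  # 0 -> unchanged
```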

Filter Transformation: We are going to have two Filter transformations, one for insert and the
other for update.

Connect the insert column from the Expression transformation to the insert column in the
first Filter transformation, and in the same way connect the update column from the
Expression transformation to the update column in the second Filter.

Now connect the Empno, Ename, and Sal from the Expression transformation to both
Filter transformations.

If a record is new, Filter transformation 1 forwards it to Update Strategy transformation 1,
and the row is inserted into the target table.

If a record has changed, Filter transformation 2 forwards it to Update Strategy
transformation 2, which forwards the updated row to the target table.

Go to the Properties tab in Edit Transformation:

(i) The value for filter condition 1 is insert.

(ii) The value for filter condition 2 is update.

The Closer view of the filter Connection is shown below.

Update Strategy Transformation: Determines whether to insert, delete, update or reject the
rows.

Drag the respective Empno, Ename and Sal columns from the Filter transformations and drop
them on the respective Update Strategy transformations.

Now go to the Properties tab; the value for the update strategy expression is 0, i.e.
DD_INSERT (on the 1st Update Strategy transformation).

Now go to the Properties tab; the value for the update strategy expression is 1, i.e.
DD_UPDATE (on the 2nd Update Strategy transformation).

We are all set here. Finally, connect the outputs of the Update Strategy transformations to
the target tables.

Step 3: Create the task and Run the work flow.

Don't check the truncate table option.

Change the target load type from Bulk to Normal.

Run the work flow from task.

Step 4: Preview the Output in the target table.

Create/Design/Implement SCD Type 1 Mapping in Informatica


Q) How do you create or implement or design a slowly changing dimension (SCD) Type 1 using
the Informatica ETL tool?
The SCD Type 1 method is used when there is no need to store historical data in the Dimension
table. The SCD type 1 method overwrites the old data with the new data in the dimension table.
The process involved in the implementation of SCD Type 1 in Informatica is:

Identifying the new record and inserting it in to the dimension table.

Identifying the changed record and updating the dimension table.

We see the implementation of SCD type 1 by using the customer dimension table as an example.
The source table looks as
CREATE TABLE Customers (
  Customer_Id   Number,
  Customer_Name Varchar2(30),
  Location      Varchar2(30)
);

Now I have to load the data of the source into the customer dimension table using SCD Type 1.
The Dimension table structure is shown below.
CREATE TABLE Customers_Dim (
  Cust_Key      Number,
  Customer_Id   Number,
  Customer_Name Varchar2(30),
  Location      Varchar2(30)
);

Steps to Create SCD Type 1 Mapping


Follow the below steps to create the SCD Type 1 mapping in Informatica.

Create the source and dimension tables in the database.

Open the mapping designer tool, source analyzer and either create or import the source
definition.

Go to the Warehouse designer or Target designer and import the target definition.

Go to the mapping designer tab and create new mapping.

Drag the source into the mapping.

Go to the toolbar, Transformation and then Create.

Select the lookup Transformation, enter a name and click on create. You will get a
window as shown in the below image.

Select the customer dimension table and click on OK.

Edit the lookup transformation, go to the Ports tab, and add a new port IN_Customer_Id.
This new port needs to be connected to the Customer_Id port of the source qualifier
transformation.

Go to the condition tab of lkp transformation and enter the lookup condition as
Customer_Id = IN_Customer_Id. Then click on OK.

Connect the customer_id port of source qualifier transformation to the IN_Customer_Id


port of lkp transformation.

Create the expression transformation with input ports as Cust_Key, Name, Location,
Src_Name, Src_Location and output ports as New_Flag, Changed_Flag

For the output ports of expression transformation enter the below expressions and click
on ok

New_Flag = IIF(ISNULL(Cust_Key),1,0)
Changed_Flag = IIF(NOT ISNULL(Cust_Key)
AND (Name != Src_Name
OR Location != Src_Location),
1, 0 )
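The two flag expressions can be rendered in Python to make the routing logic explicit. This is an illustrative sketch, not Informatica code; the sample values are invented.

```python
# Illustrative sketch of the New_Flag / Changed_Flag output-port expressions.
def new_flag(cust_key):
    # New_Flag = IIF(ISNULL(Cust_Key), 1, 0): no target row found -> insert path
    return 1 if cust_key is None else 0

def changed_flag(cust_key, name, location, src_name, src_location):
    # Changed_Flag = IIF(NOT ISNULL(Cust_Key) AND
    #                    (Name != Src_Name OR Location != Src_Location), 1, 0)
    changed = name != src_name or location != src_location
    return 1 if cust_key is not None and changed else 0

print(new_flag(None))                                   # 1 -> filter on New_Flag=1, insert
print(changed_flag(5, "Ann", "Pune", "Ann", "Delhi"))   # 1 -> filter on Changed_Flag=1, update
print(changed_flag(5, "Ann", "Pune", "Ann", "Pune"))    # 0 -> unchanged, dropped
```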

Now connect the ports of the lookup transformation (Cust_Key, Name, Location) to the
Expression transformation ports (Cust_Key, Name, Location), and the ports of the source
qualifier transformation (Name, Location) to the Expression transformation ports (Src_Name,
Src_Location), respectively.

The mapping diagram so far created is shown in the below image.

Create a filter transformation and drag the ports of source qualifier transformation into it.
Also drag the New_Flag port from the expression transformation into it.

Edit the filter transformation, go to the properties tab and enter the Filter Condition as
New_Flag=1. Then click on ok.

Now create an update strategy transformation and connect all the ports of the filter
transformation (except the New_Flag port) to the update strategy. Go to the properties tab
of update strategy and enter the update strategy expression as DD_INSERT

Now drag the target definition into the mapping and connect the appropriate ports from
update strategy to the target definition.

Create a sequence generator transformation and connect the NEXTVAL port to the target
surrogate key (cust_key) port.

The part of the mapping diagram for inserting a new row is shown below:

Now create another filter transformation and drag the ports from lkp transformation
(Cust_Key), source qualifier transformation (Name, Location), expression transformation
(changed_flag) ports into the filter transformation.

Edit the filter transformation, go to the properties tab and enter the Filter Condition as
Changed_Flag=1. Then click on ok.

Now create an update strategy transformation and connect the ports of the filter
transformation (Cust_Key, Name, and Location) to the update strategy. Go to the
properties tab of the update strategy and enter the update strategy expression as DD_UPDATE.

Now drag the target definition into the mapping and connect the appropriate ports from
update strategy to the target definition.

The complete mapping diagram is shown in the below image.

Update Without Update Strategy for Better Session Performance

You might have come across an ETL scenario where you need to update a huge
table with few records and occasional inserts. The straightforward approach of
using a Lookup transformation to identify the inserts and updates, and an Update
Strategy to do the insert or update, may not be right for this particular scenario,
mainly because the Lookup transformation's performance degrades as the lookup
table size increases.
In this article, let's talk about a design that can take care of the scenario we just
described.

The Theory
When you configure an Informatica PowerCenter session, you have several options
for handling database operations such as insert, update, delete.
Specifying an Operation for All Rows
During session configuration, you can select a single database operation for all rows
using the Treat Source Rows As setting from the 'Properties' tab of the session.
1. Insert :- Treat all rows as inserts.
2. Delete :- Treat all rows as deletes.
3. Update :- Treat all rows as updates.
4. Data Driven :- The Integration Service follows instructions coded into the
Update Strategy transformation to flag rows for insert, delete, update, or reject.
Specifying Operations for Individual Target Rows
Once you determine how to treat all rows in the session, you can also set options for
individual rows, which gives additional control over how each row behaves. Define
these options in the Transformations view on the Mapping tab of the session properties.
1. Insert :- Select this option to insert a row into a target table.
2. Delete :- Select this option to delete a row from a table.
3. Update :- You have the following options in this situation:
   - Update as Update :- Update each row flagged for update if it exists in the target table.
   - Update as Insert :- Insert each row flagged for update.
   - Update else Insert :- Update the row if it exists. Otherwise, insert it.
4. Truncate Table :- Select this option to truncate the target table before
loading data.

Design and Implementation


Now that we understand the properties, we can use them for our design implementation.

We can create the mapping just like an 'INSERT'-only mapping, without Lookup or
Update Strategy transformations. During the session configuration, let's set up the
session properties such that the session will have the capability to both insert and
update.

First set Treat Source Rows As property as shown in below image.

Now let's set the properties for the target table as shown below. Choose the
properties Insert and Update else Insert.

That's all we need to set up the session for update and insert without an update
strategy.

Hope you enjoyed this article. Please leave us a comment below, if you have any
difficulties implementing this. We will be more than happy to help you.

Update Strategy transformation


The Update Strategy transformation is an Active and Connected transformation. The Update
Strategy transformation is used to update, delete or reject rows coming from the source based
on some condition. For example, if the address of a CUSTOMER changes, we can update the old
address or keep both old and new addresses, one row for the old and one for the new. This way
we maintain the historical data.
The Update Strategy is used with the Lookup transformation. In a data warehouse, we create a
lookup on the target table to determine whether a row already exists or not. Then we insert,
update, delete or reject the source record as per the business need.
In PowerCenter, we set the update strategy at two different levels:

1. Within a session
2. Within a Mapping
Update Strategy within a session:
When we configure a session, we can instruct the IS to either treat all rows in the same way or
use instructions coded into the session mapping to flag rows for different database operations.
Session Configuration:
Edit Session -> Properties -> Treat Source Rows as: (Insert, Update, Delete, and Data Driven).
Insert is default.
Specifying Operations for Individual Target Tables:
You can set the following update strategy options:
1. Insert: Select this option to insert a row into a target table.
2. Delete: Select this option to delete a row from a table.
3. Update: We have the following options in this situation:
   i. Update as Update: Update each row flagged for update if it exists in the target table.
   ii. Update as Insert: Insert each row flagged for update.
   iii. Update else Insert: Update the row if it exists. Otherwise, insert it.

4. Truncate: Select this option to truncate the target table before loading data.
Flagging Rows within a Mapping:
Within a mapping, we use the Update Strategy transformation to flag rows for insert, delete,
update, or reject.
Operation  Constant    Numeric Value
INSERT     DD_INSERT   0
UPDATE     DD_UPDATE   1
DELETE     DD_DELETE   2
REJECT     DD_REJECT   3

Update Strategy Expressions:


Frequently, the update strategy expression uses the IIF or DECODE function from the
transformation language to test each row to see if it meets a particular condition. You can write
these expressions in the Properties tab of the Update Strategy transformation.
IIF( ( ENTRY_DATE > APPLY_DATE), DD_REJECT, DD_UPDATE )
Or
IIF( ( ENTRY_DATE > APPLY_DATE), 3, 2 )
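The equivalence between the constants and their numeric values can be checked with a small Python sketch. This is illustrative only; DD_INSERT through DD_REJECT map to 0-3 in PowerCenter, and the dates below are made up.

```python
# Illustrative sketch: the update-strategy constants and the example expression.
from datetime import date

DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag_row(entry_date, apply_date):
    # IIF( (ENTRY_DATE > APPLY_DATE), DD_REJECT, DD_UPDATE )
    # ...which is the same as IIF( (ENTRY_DATE > APPLY_DATE), 3, 2 ) written
    # with numeric values (3 = REJECT, 2 = DELETE; here we use the constants).
    return DD_REJECT if entry_date > apply_date else DD_UPDATE

print(flag_row(date(2011, 3, 1), date(2011, 1, 1)))  # 3 -> DD_REJECT
print(flag_row(date(2011, 1, 1), date(2011, 3, 1)))  # 1 -> DD_UPDATE
```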
Note: We can configure the Update Strategy transformation to either pass rejected rows to the
next transformation or drop them. To do this, see the Properties tab for the option.

Understanding the Treat Source Rows property and Target Insert, Update properties
Posted by Ankur Nigam on August 17, 2011
Informatica has a plethora of options to perform IUD (Insert, Update, Delete) operations on
tables. One of the most common methods is using the Update Strategy, while the underdog is
setting Treat Source Rows to {Insert; Update; Delete} rather than Data Driven. I will be
focusing on the latter in this topic.

In simple terms, when you set the Treat Source Rows property, it indicates to Informatica that
the row has to be tagged as Insert, Update or Delete. This property, coupled with the
target-level properties allowing Insert, Update and Delete, works wonders even in the absence
of an Update Strategy. It also leads to a clear-cut mapping design. I am not opposing the use
of the Update Strategy, but this approach gives the mapping a certain openness, in that I don't
have to peek into the Strategy expression to see what action it performs, e.g.
IIF(ISNULL(PK)=1,DD_INSERT,DD_UPDATE).
Let's buckle up our belts and go on a ride to understand the use of these properties.
Assume a scenario where I have following Table Structure in Stage

Keeping things simple the target table would be something like this

As you can see the target has UserID as a surrogate key which I will populate through a
sequence. Also note that Username is unique.
Now I have a scenario where I have to update the existing records and insert the new ones as
supplied in the staging table.
Before beginning to write code, let's first understand the TSA and target properties in more
detail. Treat Source Rows accepts 4 types of settings:
1. Insert :- When I set this option, Informatica will mark all rows read from the
source as Insert, meaning the rows will only be inserted.
2. Update :- When I set this option, Informatica will mark all rows read from the
source as Update; when the rows arrive at the target, they have to be
updated in it.
3. Delete :- The rows will be marked to be deleted from the target once they
have been read from the source.
4. Data Driven :- This indicates to Informatica that we are using an Update
Strategy to decide what has to be done with the rows, so no marking is done
when rows are read from the source. Instead, what has to be done with rows
arriving at the target is decided immediately before any IUD operation on the target.

However, setting TSA alone will not let you modify rows in the target. Each target in itself
should be able to accept, or I should say allow, IUD operations. So when you have set the TSA
property, you also have to set the target-level property governing whether rows can be
inserted, updated or deleted in the target. This can be done in the following ways:

Insert and delete are self-explanatory; however, update has been categorized into 3 sections.
Please note that setting any of them will allow updates on your tables:
1. Update as Update :- This is a simple property which says that if the row arrives
at the target, it has to be updated in the target. So if you check the logs, Informatica
will generate an update template something like UPDATE
INFA_TARGET_RECORDS SET EMAIL = ? WHERE USERNAME = ?
2. Update as Insert :- This means that when a row arrives at the target and it is a
row which has to be updated, the update behaviour should be to insert this
row into the target. In this case Informatica will not generate any update template
for the target; instead, the incoming row will be inserted using the template
INSERT INTO INFA_TARGET_RECORDS(USERID,USERNAME,EMAIL) VALUES
( ?, ?, ?)
3. Update else Insert :- Means that the incoming row flagged as update should
be either updated or inserted. In a nutshell, if any key column present in the
incoming row also exists in the target, then Informatica will intelligently update
that row in the target. If the incoming key column is not present in the target,
the row will be inserted.
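The difference between the update behaviours can be sketched in Python. This is illustrative only, not Informatica code: a dictionary keyed on USERNAME stands in for INFA_TARGET_RECORDS, and the sample user data is invented.

```python
# Illustrative sketch of two of the update behaviours for a row flagged as update.
target = {"jdoe": "jdoe@old.example"}   # username -> email: stands in for the target table

def update_as_update(username, email):
    # "Update as Update": update only if the key already exists; otherwise do nothing
    if username in target:
        target[username] = email
        return "UPDATE"
    return "SKIP"

def update_else_insert(username, email):
    # "Update else Insert": update the row if it exists, else insert it (an upsert)
    action = "UPDATE" if username in target else "INSERT"
    target[username] = email
    return action

print(update_else_insert("jdoe", "jdoe@new.example"))  # key exists   -> UPDATE
print(update_else_insert("asmith", "a@x.example"))     # new key      -> INSERT
```

This also shows why the Insert property must be enabled for the last two behaviours: both can end up inserting a row.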

PS :- The last two options also require you to set the Insert property of the target, because if it is
not checked, Update as Insert and Update else Insert will not work and the session will fail stating
that the target does not allow inserts. Why? Simply because these update modes have an
insert hidden in them.
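Outside Informatica, the effect of the three update modes can be sketched in Python against a toy target table. The function name, row shape, and mode labels below are illustrative assumptions for the sketch, not any Informatica API:

```python
# Sketch: how the three target-level update modes treat a row flagged as update.
# The target is modeled as a list of (username, email) rows.

def apply_update(rows, row, mode):
    username, email = row
    # Find the position of an existing row with the same key, if any.
    idx = next((i for i, (u, _) in enumerate(rows) if u == username), None)
    if mode == "update_as_update":
        # UPDATE ... WHERE USERNAME = ?  -- row is dropped if the key is absent.
        if idx is not None:
            rows[idx] = (username, email)
    elif mode == "update_as_insert":
        # INSERT INTO ... VALUES (?, ?)  -- every update-flagged row is inserted,
        # which can create duplicates in a real table.
        rows.append((username, email))
    elif mode == "update_else_insert":
        # Upsert: update when the key exists, insert when it does not.
        if idx is not None:
            rows[idx] = (username, email)
        else:
            rows.append((username, email))
    return rows

target = [("alice", "a@old.com")]
apply_update(target, ("alice", "a@new.com"), "update_as_update")   # updated in place
apply_update(target, ("bob", "b@new.com"), "update_as_update")     # dropped: no such key
apply_update(target, ("bob", "b@new.com"), "update_else_insert")   # inserted
```

The dict-free list model is deliberate: it makes the duplicate-row risk of Update as Insert visible, which a keyed dict would silently hide.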
Ok, enough theory? Fine, let's get our hands dirty. Coming back to our scenario: we have
rows read from the source and want them to be either inserted or updated in the target,
depending on whether they are already present there. My mapping looks
something like this:

Here I have used a lookup to fetch the user ID for each username incoming from the stage. In the
router the following has been set:

The output from the router is sent to the respective instances of the target
(INFA_TARGET_RECORDS), depending on whether the user exists: INFA_TARGET_RECORDS_NEW
in case of new records and INFA_TARGET_RECORDS_UPD in case of existing records.
Once this is in place, I set the Treat Source Rows property to Update for this session. Also,
to enable Informatica to insert into the table, I have to:
1. Set the Insert & Update as Insert properties of the instance
INFA_TARGET_RECORDS_NEW.
2. Set the Update as Update property for the INFA_TARGET_RECORDS_UPD instance in
the session.

What actually happened is that all rows from the source were flagged as update.
Secondly, for the new-records instance I modified the update behaviour to Update as Insert.
Thanks to this property, the update operation actually let me insert rows into the target. When
the session runs, it updates the existing rows and inserts the new rows (actually update as insert).
Try it out and let me know if it works for you. I am not attaching a run demo because it is better
if you do it yourself and understand even more clearly what is happening behind the scenes.
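The routing decision in this mapping can be simulated in plain Python. Here the lookup is stood in for by a set of usernames already in the target; the names and row shape are illustrative, not the actual mapping ports:

```python
# Sketch of the router: rows whose username is found by the lookup go to the
# UPD instance (Update as Update); unseen usernames go to the NEW instance
# (Update as Insert, so they end up inserted).

existing_users = {"alice", "carol"}          # stand-in for the lookup on the target

source_rows = [("alice", "a@new.com"), ("bob", "b@new.com")]

new_group = [r for r in source_rows if r[0] not in existing_users]
upd_group = [r for r in source_rows if r[0] in existing_users]

print(new_group)  # rows routed to INFA_TARGET_RECORDS_NEW
print(upd_group)  # rows routed to INFA_TARGET_RECORDS_UPD
```

Note that every source row still carries the update flag; only the target-instance properties decide whether the flag results in an INSERT or an UPDATE statement.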

Informatica Mapping Insert Update Delete


There are situations where you need to keep a source and target in sync. One method to do this
is to truncate and reload. However, this method is not efficient for a table with millions of
rows of data. You really only want to:

insert rows from the source that don't exist in the target

update rows that have changed

delete rows from the target that no longer exist in the source

Can you do this efficiently in a single Informatica mapping?

Here is a picture of the Informatica mapping:

Here is the detailed Informatica mapping:


1. Insert Source and Source Qualifier from Source
2. Insert Source and Source Qualifier from Target
3. Sort Source and Target in Source Qualifiers by Key Fields
4. Insert Joiner Transformation using a Full Outer Join and select Sorted Input
option
5. Insert a Router Transformation with 3 groups

1. Insert ISNULL(Target_PK)
2. Delete ISNULL(Source_PK)
3. Default used for Update
6. Insert an Update Strategy Transformation coming from the Delete Group using
DD_DELETE
1. Connect this transformation to the Target
7. Insert a Filter Transformation coming from the Update Group
1. (
DECODE(Source_Field1, Target_Field1, 1, 0) = 0
OR
DECODE(Source_Field2, Target_Field2, 1, 0) = 0
)
2. Modify as needed to compare all non-key fields
8. Insert an Update Strategy Transformation coming from the Filter Transformation using
DD_UPDATE
1. Connect this transformation to the Target
9. Connect the Insert Group in the Router Transformation to the Target
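Outside Informatica, the same full-outer-join comparison can be sketched in Python. The column values are illustrative; rows are keyed on the primary key:

```python
# Sketch of steps 4-9: full outer join on the key, then route rows to
# insert / update / delete, comparing non-key fields for the update group.

source = {1: ("A", 1000), 2: ("B", 2500), 4: ("D", 4000)}   # key -> non-key fields
target = {1: ("A", 1000), 2: ("B", 2000), 3: ("C", 3000)}

all_keys = source.keys() | target.keys()                    # full outer join on key

inserts = {k: source[k] for k in all_keys if k not in target}         # ISNULL(Target_PK)
deletes = [k for k in all_keys if k not in source]                    # ISNULL(Source_PK)
updates = {k: source[k] for k in all_keys                             # default group,
           if k in source and k in target and source[k] != target[k]} # changed rows only

print(inserts)   # new rows to insert
print(deletes)   # keys to delete from the target
print(updates)   # changed rows to update
```

The filter in step 7 corresponds to the `source[k] != target[k]` comparison: unchanged rows in the default group are dropped rather than re-written to the target.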

Please leave a comment if you have questions on this Informatica mapping process.

SCD - Type 1
Slowly Changing Dimensions (SCDs) are dimensions whose data changes
slowly, rather than on a regular, time-based schedule.
For example, you may have a dimension in your database that tracks the
sales records of your company's salespeople. Creating sales reports seems
simple enough, until a salesperson is transferred from one regional office to
another. How do you record such a change in your sales dimension?
You could sum or average the sales by salesperson, but if you use that to
compare the performance of salespeople, it might give misleading
information. If the salesperson who was transferred used to work in a hot
market where sales were easy, and now works in a market where sales are
infrequent, her totals will look much stronger than those of the other
salespeople in her new region, even if they are just as good. Or you could
create a second salesperson record and treat the transferred person as a
new salesperson, but that creates problems as well.
Dealing with these issues involves SCD management methodologies:
Type 1:
The Type 1 methodology overwrites old data with new data, and therefore
does not track historical data at all. This is most appropriate when correcting
certain types of data errors, such as the spelling of a name. (Assuming you
won't ever need to know how it used to be misspelled in the past.)
Here is an example of a database table that keeps supplier information:
Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  CA

In this example, Supplier_Code is the natural key and Supplier_Key is
a surrogate key. Technically, the surrogate key is not necessary, since the
table will be unique by the natural key (Supplier_Code). However, joins
will perform better on an integer than on a character string.
Now imagine that this supplier moves their headquarters to Illinois. The
updated table would simply overwrite this record:
Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  IL

The obvious disadvantage to this method of managing SCDs is that there is
no historical record kept in the data warehouse. You can't tell if your
suppliers are tending to move to the Midwest, for example. But an
advantage of Type 1 SCDs is that they are very easy to maintain.
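As a minimal sketch of a Type 1 overwrite, the table can be modeled as a Python dict keyed on the natural key from the example above (an illustration of the idea, not warehouse code):

```python
# Type 1: overwrite the attribute in place; no history is kept.
suppliers = {
    "ABC": {"Supplier_Key": 123,
            "Supplier_Name": "Acme Supply Co",
            "Supplier_State": "CA"},
}

# The supplier moves its headquarters to Illinois: the old state is simply lost.
suppliers["ABC"]["Supplier_State"] = "IL"

print(suppliers["ABC"]["Supplier_State"])  # IL
```

After the overwrite there is no way to recover "CA" from this table, which is exactly the trade-off the text describes.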
Explanation with an Example:
Source Table: (01-01-11)

Empno  Ename  Sal
101    A      1000
102    B      2000
103    C      3000

Target Table: (01-01-11)

Empno  Ename  Sal
101    A      1000
102    B      2000
103    C      3000

The necessity of the lookup transformation is illustrated using the above
source and target tables.
Source Table: (01-02-11)

Empno  Ename  Sal
101    A      1000
102    B      2500
103    C      3000
104    D      4000

Target Table: (01-02-11)

Empno  Ename  Sal
101    A      1000
102    B      2500
103    C      3000
104    D      4000

In the second month one more employee (Ename D) has been added to the
table, and employee B's salary has changed from 2000 to 2500.
Step 1: Import the source table and the target tables.

Create a table named emp_source with the three columns shown
above in Oracle.

Import the source from the Source Analyzer.

In the same way as above, create two target tables with the names
emp_target1 and emp_target2.

Go to the Targets menu and click Generate and Execute to confirm
the creation of the target tables.

The snapshot of the connections using the different kinds of
transformations is shown below.

Step 2: Design the mapping and apply the necessary transformations.

In this mapping we are going to use four kinds of transformations,
namely the Lookup transformation, Expression transformation, Filter
transformation, and Update Strategy transformation. The necessity and usage
of each of them is discussed in detail below.
Lookup Transformation: The purpose of this transformation is to
determine whether to insert, delete, update, or reject the rows going into the
target table.

The first thing we are going to do is create a Lookup
transformation and connect the Empno from the Source Qualifier to the
transformation.

The snapshot of choosing the target table is shown below.

What the Lookup transformation does in our mapping is look into the
target table (emp_target), compare it with the Source Qualifier, and
determine whether to insert, update, delete, or reject rows.

In the Ports tab we should add a new column and name it empno1;
this is the column we are going to connect from the Source
Qualifier.

The Input box for the first column should be unchecked, whereas the
other boxes like Output and Lookup should be checked. For the
newly created column, only the Input and Output boxes should be checked.

In the Properties tab: (i) Lookup table name -> Emp_Target.

(ii) Lookup policy on multiple match -> Use First Value.

(iii) Connection Information -> Oracle.

In the Conditions tab: (i) Click on Add a new condition.

(ii) The Lookup Table Column should be Empno, the Transformation Port should be
Empno1, and the Operator should be =.
Expression Transformation: After we are done with the Lookup
transformation, we use an Expression transformation to check whether
we need to insert the record as a new row or update the existing
record. The steps to create an Expression transformation are shown below.

Drag all the columns from both the source and the Lookup
transformation and drop them onto the Expression transformation.

Now double-click on the transformation, go to the Ports tab, and
create two new columns named insert and update. Both of these
columns are going to be output-only, so only the Output check box
should be checked for them.

The snapshot of the Edit Transformation window is shown below.

The conditions that we want to evaluate for our output ports are listed
below.

insert: IsNull(EMPNO1)
update: iif(NOT IsNull(EMPNO1) AND Decode(SAL, SAL1, 1, 0) = 0, 1, 0)

We are all done here. Click on Apply and then OK.
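Under the assumption that EMPNO1 and SAL1 are the values returned by the lookup (None standing in for NULL when no matching target row exists), the two output ports can be mimicked in Python:

```python
# insert flag: the lookup returned nothing, so the row is new.
# update flag: the lookup found the row AND the salary differs.
def flags(empno1, sal, sal1):
    insert = 1 if empno1 is None else 0                         # IsNull(EMPNO1)
    # mirrors iif(NOT IsNull(EMPNO1) AND Decode(SAL, SAL1, 1, 0) = 0, 1, 0)
    update = 1 if (empno1 is not None and sal != sal1) else 0
    return insert, update

print(flags(None, 4000, None))   # empno 104: not in target -> insert
print(flags(102, 2500, 2000))    # empno 102: salary changed -> update
print(flags(101, 1000, 1000))    # empno 101: unchanged -> neither flag set
```

An unchanged row gets (0, 0), so it is dropped by both filters downstream and never touches the target.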

Filter Transformation: We are going to have two Filter transformations, one
for inserts and the other for updates.

Connect the insert column from the Expression transformation to the
insert column in the first Filter transformation, and in the same way
connect the update column in the Expression transformation
to the update column in the second Filter.

Now connect the Empno, Ename, and Sal from the Expression
transformation to both Filter transformations.

If an input row is new, the first Filter transformation forwards
it to the first Update Strategy transformation, and it appears as a new
row in the target table.

If an input row has changed, the second Filter transformation
forwards it to the second Update Strategy transformation,
which forwards the updated row to the target table.

Go to the Properties tab in Edit Transformation:

(i) The filter condition for the first Filter is insert.

(ii) The filter condition for the second Filter is update.

The closer view of the filter connections is shown below.

Drag the respective Empno, Ename, and Sal from the Filter
transformations and drop them on the respective Update Strategy
transformations.

Now go to the Properties tab: the value for the update strategy
expression is 0 (DD_INSERT) on the 1st Update Strategy transformation.

On the 2nd Update Strategy transformation, the value for the update
strategy expression is 1 (DD_UPDATE).

We are all set. Finally, connect the outputs of the Update Strategy
transformations to the target table.

Step 3: Create the task and run the workflow.

Don't check the Truncate Table option.

Change the target load type from Bulk to Normal.

Run the workflow from the task.

Step 4: Preview the output in the target table.
