ETL Informatica Power Centre Interview Questions
--------------------------------------------
6. How do you update a record without using an Update Strategy transformation or update override, when
keys are also not defined at the DB level?
---------------------------------------------------------------------------------------------------------------------------
Step 1: In the Designer, define a primary key on the target definition. Since no keys exist at the DB level, the Integration Service uses the keys defined in the target definition to generate the UPDATE statement.
Step 2: At the session level, on the Properties tab, set "Treat source rows as" to "Update".
Step 3: In the session, on the Mapping tab, click the target and select the "Update as Update" property.
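Conceptually, once keys exist in the target definition, an UPDATE statement can be generated from them. A minimal sketch of that idea (not actual PowerCenter internals; the table and column names are made up):

```python
# Illustration only: build an UPDATE statement whose WHERE clause comes from
# the keys defined in the target definition, since no keys exist at DB level.
def build_update_sql(table, key_cols, data_cols):
    """Return an UPDATE statement with bind variables for data and key columns."""
    set_clause = ", ".join(f"{c} = :{c}" for c in data_cols)
    where_clause = " AND ".join(f"{k} = :{k}" for k in key_cols)
    return f"UPDATE {table} SET {set_clause} WHERE {where_clause}"

sql = build_update_sql("CUSTOMER", ["CUST_ID"], ["NAME", "CITY"])
# UPDATE CUSTOMER SET NAME = :NAME, CITY = :CITY WHERE CUST_ID = :CUST_ID
```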
-----------------------------------------------------------------------------------------------------------------------------
A Lookup transformation in a mapping is used to look up data in a flat file, relational table, view,
or synonym. Connected and unconnected lookups behave differently:
Connected Lookup:
- Can return multiple columns from the same row or insert into the dynamic lookup cache.
- If there is no match for the lookup condition, the Integration Service returns the default value for all output ports. If you configure dynamic caching, the Integration Service inserts rows into the cache or leaves it unchanged.
- If there is a match for the lookup condition, the Integration Service returns the result of the lookup condition for all lookup/output ports. If you configure dynamic caching, the Integration Service either updates the row in the cache or leaves the row unchanged.

Unconnected Lookup:
- Designate one return port (R); returns one column from each row.
- If there is no match for the lookup condition, the Integration Service returns NULL.
- If there is a match for the lookup condition, the Integration Service returns the result of the lookup condition into the return port.
To make the Lookup an active transformation, enable the "Return All Values on Multiple Match" property while
creating the lookup transformation.
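The match behaviour above can be sketched with a simple cached lookup; this is illustrative only, not the Integration Service implementation, and the cache contents are made up:

```python
# Toy lookup cache keyed by the lookup-condition column.
lookup_cache = {101: {"name": "Alice", "city": "NY"}}

def connected_lookup(key, defaults):
    """Connected: returns values for all lookup/output ports;
    returns the default values when there is no match."""
    return lookup_cache.get(key, defaults)

def unconnected_lookup(key, return_port):
    """Unconnected: returns the single designated return port,
    or NULL (None) when there is no match."""
    row = lookup_cache.get(key)
    return row[return_port] if row else None

connected_lookup(999, {"name": "N/A", "city": "N/A"})  # defaults for all ports
unconnected_lookup(999, "name")                        # None (NULL)
```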
-------------------------------------------------------------------------------------
8. What is performance tuning? There is a mapping in production which was taking 15
minutes, but later started taking more than 2 hours. How will you sort it out? How do you
identify whether it is a source bottleneck, target bottleneck, or transformation
bottleneck?
-------------------------------------------------------------------------------------------------------------------------
This is an existing session, so we can identify the bottleneck by going through the session log.
Reading the session log line by line shows the busy percentage of the different stages
(reader, transformation, and writer threads).
If data fetching is taking more time, we need to check with the source team about recent changes
made to the source objects.
Sometimes the cache memory is insufficient, so the session may be waiting for another process
to release the cache files created on the same path.
There may also have been changes on the target side.
We also need to verify any recent deployments on the source, target, and ETL side,
which may throw some light on the issue.
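The busy-percentage check described above can be automated by scanning the thread statistics in the session log. The sample log lines below are hypothetical (exact wording varies between PowerCenter versions):

```python
import re

# Hypothetical excerpt of session-log thread statistics.
log = """
Thread [READER_1_1_1] ... Busy Percentage = [12.5]
Thread [TRANSF_1_1_1] ... Busy Percentage = [98.7]
Thread [WRITER_1_1_1] ... Busy Percentage = [15.0]
"""

def busiest_thread(session_log):
    """Return (thread name, busy %) for the busiest stage - the likely bottleneck."""
    stats = re.findall(r"Thread \[(\w+)\].*Busy Percentage = \[([\d.]+)\]",
                       session_log)
    return max(((t, float(p)) for t, p in stats), key=lambda x: x[1])

busiest_thread(log)  # ('TRANSF_1_1_1', 98.7) -> transformation bottleneck
```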
-----------------------------------------
Incremental load is a process of loading data incrementally: only new and
changed data is loaded to the destination, and data that did not change is not loaded again.
In Informatica we can achieve incremental loading in multiple ways, based on the requirement.
In one method, we flag each incoming record as new or changed in an Expression transformation and then load
it accordingly. This method is driven by the source primary key value.
M3: Slowly changing dimension methods such as SCD Type 1, Type 2 & Type 3.
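The Expression-based flagging method above can be sketched as follows; the table and column names are made up for illustration:

```python
# Existing target rows, keyed by the source primary key.
target = {1: {"name": "Alice"}, 2: {"name": "Bob"}}

def flag_row(row, target_rows):
    """Flag an incoming source row as INSERT, UPDATE, or SKIP by primary key."""
    key = row["id"]
    if key not in target_rows:
        return "INSERT"          # new record
    if target_rows[key] != {k: v for k, v in row.items() if k != "id"}:
        return "UPDATE"          # changed record
    return "SKIP"                # unchanged - not loaded again

source = [{"id": 1, "name": "Alice"},
          {"id": 2, "name": "Bobby"},
          {"id": 3, "name": "Carol"}]
[flag_row(r, target) for r in source]  # ['SKIP', 'UPDATE', 'INSERT']
```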
---------------------------------------------
9. Can you briefly explain all the transformations you have worked with?
-------------------------------------------------------------------------------------------------------
Aggregator: Active, Connected. Performs aggregate calculations.
Lookup: Active or Passive; Connected or Unconnected. Looks up and returns data from a flat file, relational table, view, or synonym.
10. In my source, one of the columns has a CLOB value and I need to pass it to the target. How do
you achieve this?
-----------------------------------------------------------------------------------------------------------------------------
By changing the data type of the CLOB column to nvarchar2() we can load the CLOB
column. If the length of the CLOB data exceeds 4000 bytes, then we can
split the CLOB column into multiple columns and load them into the destination.
An Oracle function such as DBMS_LOB.SUBSTR helps to cut the CLOB column value based on the
amount and offset parameters specified.
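The chunking idea can be sketched as below: split an oversized value into pieces of at most 4000 characters so each piece fits a separate column (mirroring what DBMS_LOB.SUBSTR does with its amount/offset parameters):

```python
def split_clob(value, chunk_size=4000):
    """Split a long CLOB-like string into chunks of at most chunk_size chars."""
    return [value[i:i + chunk_size] for i in range(0, len(value), chunk_size)]

parts = split_clob("x" * 9000)
# Three columns: 4000 + 4000 + 1000 characters.
```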
---------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------
No, I didn't get a chance to work on mainframe files. Informatica has the Normalizer
transformation to read mainframe (COBOL) files.
---------------------------------------------------------------------------------------------------------------------------------
14. There are multiple workflows running simultaneously and you want to set up a
dependency between them, such that
WF1 and WF2 should complete and only then should WF3 execute.
----------------------------------------------------------------------------------------------
Generate workflow-completion temp files with a Command task at the end of Workflow 1 and
Workflow 2.
In Workflow 3, use Event-Wait tasks to watch for the files created by Workflow 1 and Workflow 2
on a shared location. Until the temp files are generated by Workflow 1 and Workflow 2,
Workflow 3 keeps waiting; once the temp files are available on the shared location,
the further tasks in Workflow 3 proceed.
--------------------------------------------------------------------------------------------------
15. We have a performance issue in a particular session, how do you identify the performance
bottleneck and what way you will resolve that?
--------------------------------------------------------------------------------------------------------------------------------------
The first step in understanding a performance issue is to read the session log carefully, so that we get
some idea of where to start. Bottlenecks can occur in the following areas:
Target
Source
Mapping
Session
System
You can configure a test session to read from a flat file source or to write to a flat file target to
identify source and target bottlenecks.
You can use system monitoring tools to view the percentage of CPU use, I/O waits, and paging to
identify system bottlenecks. You can also use the Workflow Monitor to view system resource usage.
--------------------------------------------------------------------------------------------------------------------------------------
16. What are different DB's you have used as the source and target?
--------------------------------------------------------------------------------------
Target: Oracle
-----------------------------------------------------------------------------------------
--------------------------------------------------------------------
----------------------------------------------------------------------
-----------------------------------------------------------------------
-----------------------------------------------------------------------
19. In Informatica administration if the services are down which log do you check?
--------------------------------------------------------------------------------------------------------
1. node.log
2. catalina.out
3. exceptions.out
You can check whether the Informatica server is running by entering ps -ef | grep pmserver
at the command prompt. If it lists the pmserver process, the server is running. If it lists only
grep pmserver, that is the process of the ps command itself.
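The grep check above can be sketched as a scan of ps output; the sample output lines are made up:

```python
def is_process_running(name, ps_output):
    """True if any line lists the process itself, ignoring the grep line."""
    return any(name in line and "grep" not in line
               for line in ps_output.splitlines())

sample = ("infa  1234  1  0  pmserver\n"
          "user  9999  2  0  grep pmserver")
is_process_running("pmserver", sample)  # True - pmserver itself is listed
```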
---------------------------------------------------------------------------------------------------------
20. In a data migration project, what are all the steps you take to deploy the code to the production
environment?
------------------------------------------------------------------------------------------------------------------------------------
For code deployment from one environment to another, we follow the change management
process. It is the PM who provides/guides us with the details related to the CR/CMR process.
As the Dev team, we include the code to be deployed to the higher environments in the CR/CMR,
after which the PM acquires the required approvals to move the code from DEV to Testing.
If there are no errors, deployment is done to UAT and PROD based on the timelines
scheduled for the project.
If there are any errors, a defect is raised by the respective team; the respective
components are then redeveloped by the Dev team based on the inputs received from the
business/testing team/SME, and the code base is re-uploaded into the CR/CMR for
deployment.
------------------------------------------------------------------------------------------------------------------------------------
21. What is the difference between reusable and non-reusable transformations? What is a non-
reusable transformation and a mapplet?
-------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
21. What is SCD and types of SCD? How do you develop the mappings according to the SCD?
-------------------------------------------------------------------------------------------------------------------------
SCD means slowly changing dimensions, mainly used to load data incrementally from sources.
Of the many slowly changing dimension types, three are popularly used in the industry. They are:
I. SCD Type 1
II. SCD Type 2
III. SCD Type 3
SCD Type 1:
With slowly changing dimension type 1, the old attribute value in the dimension row is
overwritten with the new value.
SCD Type 1 attributes always reflect the most recent assignment, and therefore this
technique destroys history.
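A minimal SCD Type 1 sketch (the dimension rows and column names are made up):

```python
# Dimension rows keyed by surrogate key.
dim = {10: {"cust_key": 1, "city": "Delhi"}}

def scd_type1(dim_rows, nat_key, new_city):
    """Overwrite the attribute in place - the old value (history) is lost."""
    for row in dim_rows.values():
        if row["cust_key"] == nat_key:
            row["city"] = new_city   # old value overwritten, not preserved

scd_type1(dim, 1, "Mumbai")
dim  # {10: {'cust_key': 1, 'city': 'Mumbai'}}
```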
SCD Type 2:
With slowly changing dimension type 2, a change adds a new row in the dimension with the
updated attribute values.
When a new row is created for a dimension member, a new surrogate key is assigned and
used as a foreign key in all fact tables from the moment of the update until a subsequent
change creates a new dimension key and updated dimension row.
The current row in a Type 2 dimension is commonly tracked using either:
1) a flag
2) a version number
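A minimal SCD Type 2 sketch using both a current-record flag and a version number; the rows and column names are made up:

```python
# Dimension table as a list of rows; "sk" is the surrogate key.
dim = [{"sk": 100, "cust_key": 1, "city": "Delhi",
        "current": "Y", "version": 1}]

def scd_type2(dim_rows, nat_key, new_city):
    """Expire the current row and insert a new row with a fresh surrogate key."""
    old = next(r for r in dim_rows
               if r["cust_key"] == nat_key and r["current"] == "Y")
    old["current"] = "N"                                  # close the old row
    dim_rows.append({"sk": max(r["sk"] for r in dim_rows) + 1,  # new surrogate key
                     "cust_key": nat_key, "city": new_city,
                     "current": "Y", "version": old["version"] + 1})

scd_type2(dim, 1, "Mumbai")
# dim now holds two rows: version 1 (current 'N') and version 2 (current 'Y')
```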
SCD Type 3:
With slowly changing dimension type 3, a change adds a new attribute in the dimension to
preserve the old attribute value; the new value overwrites the main attribute, as in a type 1
change.
---------------------------------------------------------------------------------------------------------------------------
A dimension table is designed with one column serving as a unique primary key.
This primary key cannot be the operational system’s natural key because there will be
multiple dimension rows for that natural key when changes are tracked over time.
-------------------------------------------------------------------
--------------------------------------------
--------------------------------------------
------------------------------------------------------------------------------
The latest standard Informatica PowerCenter version used in the Kingdom is 10.1.
------------------------------------------------------------------------------