
ETL Informatica PowerCenter Interview Questions

1. Self-introduction with explanation of previous project


------------------------------------------------------------------------
You can prepare on your own.
------------------------------------------------------------------------
2. What kind of source systems have you used in development?
--------------------------------------------------------------------------
Database: Oracle and .csv files.
Add any other databases, XML, or COBOL structures you know well (you should be able to
answer follow-up questions on them).
-------------------------------------------------------------------------
3. Are you familiar with PowerExchange?
------------------------------------------------------
You can say something like: I never got a chance to work on this tool's configuration (a separate
license is required apart from Informatica PowerCenter), but it is used to extract data from
applications such as SAP, TIBCO, Siebel, etc.
It is also used for real-time data extraction from source systems.
---------------------------------------------------------
4. We have one source database; how do you generate multiple files at the target system, and
how do you give each file a different name?
---------------------------------------------------------------------------------------------------------------------------
If the number of target files is fixed, then we can use either a Filter or a Router transformation.
If the number of target files is not fixed, then we need to use the dynamic file name concept by
enabling the FileName column at the target file level.
---------------------------------------------------------------------------------------------------------------------------
5. What is XML TRANSFORMATION?
--------------------------------------------
XML transformations are used to deal with XML sources/targets. In Informatica, 3 XML
transformations are available.

Transformation          Type               Description

XML Generator           Active, Connected  Reads data from one or more input ports and outputs
                                            XML through a single output port.

XML Parser              Active, Connected  Reads XML from one input port and outputs data to
                                            one or more output ports.

XML Source Qualifier    Active, Connected  Represents the rows that the Integration Service
                                            reads from an XML source when it runs a session.

--------------------------------------------

6. How to update records without using an Update Strategy or update override, when keys are
also not defined at the DB level.
---------------------------------------------------------------------------------------------------------------------------

Step 1: Define a primary key on the target at the Informatica (metadata) level.

Step 2: At the session level, on the Properties tab, set "Treat source rows as" to "Update".

Step 3: In the session, on the Mapping tab, click the target and select the "Update as Update"
property.
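For reference, with "Treat source rows as" set to Update and the key defined only at the
Informatica metadata level, the Integration Service effectively issues an UPDATE keyed on that
column. A minimal sketch of the equivalent SQL, assuming a hypothetical target EMP_DIM keyed
on EMP_ID:

-- Hedged sketch: roughly the statement generated per row when the session is
-- configured as above (EMP_DIM, EMP_ID and the other column names are assumptions).
UPDATE EMP_DIM
SET    EMP_NAME = :NEW_EMP_NAME,
       EMP_SAL  = :NEW_EMP_SAL
WHERE  EMP_ID   = :EMP_ID;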

-----------------------------------------------------------------------------------------------------------------------------

7. Explain the Lookup transformation. Can we make it an active transformation?


------------------------------------------------------------------------------------------------------

A Lookup transformation in a mapping is used to look up data in a flat file, relational table, view,
or synonym.

Configuration-wise, the Lookup transformation is of 2 types: Connected and Unconnected.

Connected Lookup vs. Unconnected Lookup

Input: A connected Lookup receives input values directly from the pipeline. An unconnected
Lookup receives input values from the result of a :LKP expression in another transformation.

Cache: A connected Lookup can use a dynamic or static cache. An unconnected Lookup uses a
static cache only.

Cache contents: For a connected Lookup, the cache includes the lookup source columns in the
lookup condition and the lookup source columns that are output ports. For an unconnected
Lookup, the cache includes all lookup/output ports in the lookup condition and the
lookup/return port.

Return values: A connected Lookup can return multiple columns from the same row or insert
into the dynamic lookup cache. An unconnected Lookup designates one return port (R) and
returns one column from each row.

No match: If there is no match for the lookup condition, a connected Lookup returns the
default value for all output ports (with dynamic caching, the Integration Service inserts the row
into the cache or leaves the cache unchanged). An unconnected Lookup returns NULL.

Match: If there is a match for the lookup condition, a connected Lookup returns the result of
the lookup condition for all lookup/output ports (with dynamic caching, the Integration Service
either updates the row in the cache or leaves it unchanged). An unconnected Lookup returns
the result of the lookup condition into the return port.

Output: A connected Lookup passes multiple output values to other transformations by linking
lookup/output ports. An unconnected Lookup passes one output value; the
lookup/output/return port passes the value to the transformation calling the :LKP expression.

Default values: A connected Lookup supports user-defined default values; an unconnected
Lookup does not.

To make the Lookup an active transformation, enable the "Return All Values on Multiple Match"
option while creating the Lookup transformation.

-------------------------------------------------------------------------------------

8. What is performance tuning? There is a mapping in production which was taking 15
minutes and later starts taking more than 2 hours. How will you sort it out? How do you
identify whether it is a source bottleneck, a target bottleneck or a transformation
bottleneck?
-------------------------------------------------------------------------------------------------------------------------

Since this is an existing session, we can identify the bottleneck by going through the session log.

Going through the session log line by line, we come to know the busy percentage of the
different stages (reader, transformation and writer threads).
If data fetching is taking more time, then we need to check with the source team on any changes
made to the underlying source objects.

Sometimes the cache memory is not sufficient, so the session may be waiting for another
process to release the cache files created on the same path.

It is also possible that changes happened on the target side.

We also need to verify any recent deployments on the source, target and ETL side,
which may throw some light on the issue.

9. How to do the incremental load?

-----------------------------------------

Incremental load is the process of loading data incrementally, which means only new and
changed data is loaded to the destination; data that did not change is not reloaded.

In Informatica we can achieve incremental loading in multiple ways based on the requirement.

M1: Based on Source Primary Key

Source -> Source Qualifier -> Lookup (on Target) -> Expression -> Filter/Router -> Target.

In this method, we flag in the Expression whether the incoming row is new or changed and then
load it accordingly. This method is driven by the source primary key value.
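Conceptually, the new-vs-changed flagging that the Lookup, Expression and Filter/Router perform
behaves like the following hedged SQL sketch (SRC_CUSTOMER, DIM_CUSTOMER, CUST_ID and
CUST_NAME are assumed names, not from any specific project):

-- New rows: source keys that do not yet exist in the target
SELECT s.CUST_ID, s.CUST_NAME
FROM   SRC_CUSTOMER s
LEFT JOIN DIM_CUSTOMER t ON t.CUST_ID = s.CUST_ID
WHERE  t.CUST_ID IS NULL;

-- Changed rows: key exists in the target but a tracked attribute differs
SELECT s.CUST_ID, s.CUST_NAME
FROM   SRC_CUSTOMER s
JOIN   DIM_CUSTOMER t ON t.CUST_ID = s.CUST_ID
WHERE  s.CUST_NAME <> t.CUST_NAME;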

M2: Based on a Mapping Variable

Source -> Source Qualifier -> Expression -> Target.

In the Source Qualifier transformation, we add a source filter like: date/timestamp column value
> mapping variable. The mapping variable stores the maximum timestamp value that has passed
through the data, so the next run reads only the newer records.
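As an illustration, the source filter could look like the sketch below. $$LAST_LOAD_TS is an
assumed mapping variable name; it is typically advanced with SETMAXVARIABLE in an Expression
so the repository remembers the high-water mark for the next run.

-- Assumed source filter / SQL override in the Source Qualifier
SELECT *
FROM   SRC_ORDERS   -- assumed source table
WHERE  LAST_UPDATE_TS > TO_TIMESTAMP('$$LAST_LOAD_TS', 'YYYY-MM-DD HH24:MI:SS');
-- In a downstream Expression: SETMAXVARIABLE($$LAST_LOAD_TS, LAST_UPDATE_TS)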

M3: Slowly Changing Dimension methods such as SCD Type 1, Type 2 and Type 3.

---------------------------------------------

9. Can you briefly explain all the transformations you have worked with?
-------------------------------------------------------------------------------------------------------

Transformation   Type                                          Description

Aggregator       Active, Connected                             Performs aggregate calculations.

Expression       Passive, Connected                            Calculates a value.

Joiner           Active, Connected                             Joins data from different databases
                                                               or flat file systems.

Lookup           Active or Passive, Connected or Unconnected   Looks up and returns data from a
                                                               flat file, relational table, view,
                                                               or synonym.

Rank             Active, Connected                             Limits records to a top or bottom
                                                               range.

Router           Active, Connected                             Routes data into multiple
                                                               transformations based on group
                                                               conditions.

Filter           Active, Connected                             Filters the data based on a
                                                               condition.

10. In my source, one of the columns has a CLOB value and I need to parse it to the target. How do
you achieve this?
-----------------------------------------------------------------------------------------------------------------------------

By changing the data type of the CLOB column to NVARCHAR2 we can load the CLOB
column. If the length of the CLOB data exceeds 4000 bytes, then we can split the CLOB
column into multiple columns and load them into the destination.

The Oracle function below helps to cut the CLOB column value based on the parameters specified:

dbms_lob.substr(clob_column, for_how_many_bytes, from_which_byte);

NVARCHAR2 has a maximum size of 4000 bytes.

Example: dbms_lob.substr(clob_column, 4000, 1);  -- first 4000 bytes
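For example, a SQL override along these lines (a sketch, assuming a source table DOC_SRC with a
CLOB column DOC_TEXT) splits the CLOB into 4000-byte chunks that fit into separate string ports:

-- Each dbms_lob.substr call cuts one 4000-byte slice of the CLOB
SELECT DOC_ID,
       dbms_lob.substr(DOC_TEXT, 4000, 1)    AS DOC_TEXT_PART1,  -- bytes 1-4000
       dbms_lob.substr(DOC_TEXT, 4000, 4001) AS DOC_TEXT_PART2,  -- bytes 4001-8000
       dbms_lob.substr(DOC_TEXT, 4000, 8001) AS DOC_TEXT_PART3   -- bytes 8001-12000
FROM   DOC_SRC;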

---------------------------------------------------------------------------------------------------------------------------

13. Did you get a chance to work with mainframe files?

--------------------------------------------------------------------------------------------------------------------------------

No, I didn't get a chance to work on mainframe files. Informatica has the Normalizer
transformation to read mainframe (COBOL) sources.

---------------------------------------------------------------------------------------------------------------------------------

14. There are multiple workflows running simultaneously and you want to set up a
dependency between them, such that

WF1 and WF2 should complete and only then should WF3 execute.

----------------------------------------------------------------------------------------------

Generate workflow-completion temp files with the help of a Command task at the end of Workflow 1
and Workflow 2.

In Workflow 3, keep Event-Wait tasks to watch for the files created on a shared location as part of
Workflow 1 and Workflow 2 completion. Until the temp files are generated by Workflow 1 and
Workflow 2, Workflow 3 keeps waiting; once the temp files are available on the shared location,
the further tasks in Workflow 3 proceed.

--------------------------------------------------------------------------------------------------

15. We have a performance issue in a particular session; how do you identify the performance
bottleneck and in what way will you resolve it?

--------------------------------------------------------------------------------------------------------------------------------------

The first step in understanding a performance issue is to read the session log carefully so that we
get some idea of where to start.

Look for performance bottlenecks in the following order:

Target

Source

Mapping
Session

System

Use the following methods to identify performance bottlenecks:

Run test sessions: you can configure a test session to read from a flat file source or to write to a
flat file target to identify source and target bottlenecks.

Analyze performance details: analyze performance details, such as performance counters, to
determine where session performance decreases.

Analyze thread statistics: analyze thread statistics to determine the optimal number of partition
points.

Monitor system performance: you can use system monitoring tools to view the percentage of CPU
use, I/O waits, and paging to identify system bottlenecks. You can also use the Workflow Monitor to
view system resource usage.

--------------------------------------------------------------------------------------------------------------------------------------

16. What are the different DBs you have used as the source and target?

--------------------------------------------------------------------------------------

Source: Oracle & SQL Server

Target: Oracle

-----------------------------------------------------------------------------------------

17. How will you connect SQL Server to Informatica?

--------------------------------------------------------------------

By using an ODBC connection.

----------------------------------------------------------------------

18. Is your Informatica installed on Windows, Linux or UNIX?

-----------------------------------------------------------------------

It is possible to install Informatica on Windows/Linux/UNIX. (Answer based on your project experience.)

-----------------------------------------------------------------------

19. In Informatica administration, if the services are down, which logs do you check?
--------------------------------------------------------------------------------------------------------

Check the following logs under $INFA_HOME/tomcat/logs:

1. node.log

2. catalina.out

3. exceptions.out

You can also check whether the Informatica server is running by entering the command:

ps -elf | grep pmserver

If it lists the pmserver process, the server is running. If it lists only grep pmserver, that is the grep
process of the ps command itself.

---------------------------------------------------------------------------------------------------------

20. In data migration project what are all the steps you take to deploy the code to production
environment.

------------------------------------------------------------------------------------------------------------------------------------

For code deployment from one environment to another, we follow the change management
process. It is the PM who provides/guides us with the details related to the CR/CMR process.

As the Dev team, we include the code to be deployed to the higher environments in the CR/CMR,
and after that the PM acquires the required approvals to move the code from DEV to Testing.
If there are no errors, then deployment is done to UAT and PROD based on the timelines
scheduled for the project.

If there are any errors, then a defect is raised by the respective team, the affected
components are redeveloped by the DEV team based on the inputs received from the
business/testing team/SME, and the code base is re-uploaded into the CR/CMR for
deployment.

------------------------------------------------------------------------------------------------------------------------------------

21. What is the difference between a reusable and a non-reusable transformation? What is a non-
reusable transformation and a mapplet?

-------------------------------------------------------------------------------------------------------------------------------------

Reusable Transformation: A reusable transformation is one that is created in the
Transformation Developer and can be used in any number of mappings based on the
requirement.

Non-Reusable Transformation: A non-reusable transformation is created inside a single
mapping and cannot be reused in any other mapping.

Mapplet: When the logic of more than one transformation needs to be reused in multiple
mappings, we create a mapplet from those transformations in the Mapplet Designer. This
mapplet can be reused in any number of mappings based on the need.

--------------------------------------------------------------------------------------------------------------------------------------

21. What is SCD and types of SCD? How do you develop the mappings according to the SCD?

-------------------------------------------------------------------------------------------------------------------------

SCD means Slowly Changing Dimension, mainly used to load data incrementally from sources.
Out of the many SCD types, 3 are popularly used in the industry. They are:

I. SCD Type 1
II. SCD Type 2
III. SCD Type 3

SCD Type 1:

With Slowly Changing Dimension Type 1, the old attribute value in the dimension row is
overwritten with the new value.

SCD Type 1 attributes always reflect the most recent assignment, and therefore this
technique destroys history.
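In SQL terms, a Type 1 change is a simple overwrite; a minimal sketch, assuming a dimension
DIM_CUSTOMER keyed on CUST_ID (in Informatica the same effect comes from an Update Strategy
or session-level update):

-- Type 1: overwrite the attribute in place, no history is kept
UPDATE DIM_CUSTOMER
SET    CUST_CITY = :NEW_CITY
WHERE  CUST_ID   = :CUST_ID;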

SCD Type 2:

With Slowly Changing Dimension Type 2 changes, a new row is added to the dimension with the
updated attribute values.

When a new row is created for a dimension member, a new surrogate key is assigned and
used as a foreign key in all fact tables from the moment of the update until a subsequent
change creates a new dimension key and updated dimension row (see the SQL sketch after the
list below).

Different types are

1) Flag

2) Version

3) Effective Date Range
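As an illustration of the flag plus effective-date-range style, a hedged sketch (DIM_CUSTOMER,
CUST_SK, EFF_START_DT, EFF_END_DT and CURR_FLAG are assumed names; in Informatica this
logic is built with Lookup, Expression, Router and Update Strategy transformations rather than
hand-written SQL):

-- Expire the current row for the changed member
UPDATE DIM_CUSTOMER
SET    EFF_END_DT = SYSDATE,
       CURR_FLAG  = 'N'
WHERE  CUST_ID   = :CUST_ID
AND    CURR_FLAG = 'Y';

-- Insert the new version with a fresh surrogate key
INSERT INTO DIM_CUSTOMER (CUST_SK, CUST_ID, CUST_CITY, EFF_START_DT, EFF_END_DT, CURR_FLAG)
VALUES (DIM_CUSTOMER_SEQ.NEXTVAL, :CUST_ID, :NEW_CITY, SYSDATE, NULL, 'Y');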

SCD Type 3:

With Slowly Changing Dimension Type 3 changes, a new attribute is added to the dimension to
preserve the old attribute value; the new value overwrites the main attribute as in a Type 1
change.

With this method it is not possible to maintain all historical information.

---------------------------------------------------------------------------------------------------------------------------

22. What is surrogate key and why do you use this?


-------------------------------------------------------------------

A dimension table is designed with one column serving as a unique primary key.

This primary key cannot be the operational system’s natural key because there will be
multiple dimension rows for that natural key when changes are tracked over time.
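A minimal sketch of how a surrogate key can be generated on the database side with an Oracle
sequence (names are assumed; in Informatica a Sequence Generator transformation is normally
used for the same purpose):

-- CUST_SK is the surrogate primary key; CUST_ID is the natural key from the
-- operational system and can repeat across rows when history is tracked (SCD Type 2).
CREATE SEQUENCE DIM_CUSTOMER_SEQ START WITH 1 INCREMENT BY 1;

CREATE TABLE DIM_CUSTOMER (
  CUST_SK   NUMBER       PRIMARY KEY,
  CUST_ID   VARCHAR2(20) NOT NULL,
  CUST_CITY VARCHAR2(50),
  CURR_FLAG CHAR(1)
);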

-------------------------------------------------------------------

23. Have you worked with flat files?

--------------------------------------------

Yes, we worked with .csv files.

--------------------------------------------

24. What is the version of Informatica that you have worked on?

------------------------------------------------------------------------------

The latest standard Informatica PowerCenter version used in the Kingdom is 10.1.

------------------------------------------------------------------------------
