IDMC IICS 101 Labs v10
a. Go to https://www.informatica.com/trials/api-and-application-integration.html
b. Create an account with your personal email.
c. Enter the requested information to start your free Cloud trial.
d. You will receive an email with your login credentials. Click the link to confirm your new trial account.
e. Create a new password for your new Informatica Cloud account.
In order to access data and perform integration tasks, you need a Secure Agent to do the job.
1. The Informatica Hosted Secure Agent runs in the cloud and can do the work.
2. However, you need a local Secure Agent on premises (behind your firewall) to make sure your
sensitive local data remains behind your firewall.
3. For the purpose of this week's boot camp, use Informatica's Hosted Secure Agent.
4. If you feel adventurous, you can also install your own Secure Agent on your local machine
(Windows 10/11, Windows Server, or UNIX/Linux only; Apple macOS is not supported).
The Logic
• The Secure Agent setup is common for all users, so the Administrator is highly advised to
download and distribute the Secure Agent setup to all users.
• After you have correctly configured the Secure Agent, access the Informatica Cloud web interface and
check the status of the Secure Agent.
• Pre-installation summary.
• Installation process.
• Register the Secure Agent by entering your Informatica Cloud login username and password.
• Check the Secure Agent status and optionally configure the proxy server information if one exists.
1) Go to the URL sent to you by Informatica when you registered for the free trial, and log in to
Informatica Cloud (https://dm-us.informaticacloud.com/ma/home).
2) Go to the Administrator service. In the Runtime Environments tab, download the Secure Agent (top
right).
7) Configure your Secure Agent by entering your Informatica Cloud username and install token.
Follow steps a & b to get the install token:
a. Go to the Administrator page > Runtime Environments tab > click Generate Install Token.
Fill in the username and install token information, then click Register.
To perform integration tasks, we need to define the source and target data objects. In this lab we will
create all the required connections for the different data types: flat file, SQL Server, and Salesforce.
o In the Log On tab, choose This account and enter the username and
password of a user that is in the Windows Administrators group.
o Then stop and restart the service from the Services console.
You need to transfer some columns of data from a production table into your own schema in
development. Your database administrator is too busy, so you decide to do the data transfer yourself,
using Informatica Cloud and a simple pass-through mapping.
The Logic
Step-by-step solution
• Naming conventions
Informatica recommends the following conventions to name various objects.
o Target definitions: T_TargetName
o Transformations: <abbreviation>_TransformationName as in
▪ EXP_xxx, for an expression
▪ AGG_xxx, for an aggregator
▪ FIL_xxx, for a filter
o Mappings: m_MappingName
o Mapping task: mtsk_MappingTask
Note: This is a good start, but you need to use meaningful names for these objects to
help in the “self-documentation” of your work.
2. Create a mapping
a. Click the New icon, go to the Mappings window, and select Mapping.
b. Click on the Source transformation to connect to the Orders table in the database. In the
General tab, specify the name 'Source_Orders'. In the Source tab, specify the MSSQL_Source
connection and the Orders object.
4. Connect to the T_OrdersDev_x (where x is the student ID) table in the database. Specify the
MSSQL_Target connection and the T_OrdersDev_x object.
b. Go to Field Mapping in the target to map the output fields to the corresponding input fields.
5. Save and run the mapping. Choose Lab_env as the runtime environment.
7. Then download the log file of the job; in the session load summary you can identify the
number of records read from the source and the number of records inserted into the target.
8. To see the end result, go back to the mapping and preview the data of the target table. Click on
the target table -> go to Target in the Properties pane -> Preview Data. The result is as follows.
The TradeWind Company wants to improve its fulfillment systems. It is worried that too many orders
might be late getting to customers. Customer satisfaction is a top priority for the TradeWind Company. It
is their competitive advantage.
Create a mapping to solve the following:
1. How long, on average, does it take for an order to be shipped to a customer?
2. If orders are late, what is the average number of days late, per customer?
3. What is the maximum number of days an order has ever been late, per customer?
Background
NOTE: The solutions for the labs have been provided with SQL Server as the database
for the source & target. Please make the appropriate changes by selecting the
appropriate database type (ORACLE, DB2, etc.) for either Source or Target.
Note: you can also verify the target data using a SQL tool to query the table, or preview
the data from the mapping view by clicking Preview Data in the Target tab of the Target
transformation's Properties window.
b. Go to the Fields tab and delete the unwanted fields: select them and click the delete
icon. Keep only the fields shown below, since they are the only ones used in the mapping.
Hint: To input function names and port names, select them in the expression editor’s
navigator and click the add button next to the field. This reduces the chances of
typos.
b. Click Configure next to DaysToShip to enter the expression editor and enter the
expression shown below.
c. Validate the expression and create another output port named DaysLate, as shown below.
d. Enter the expression editor for the DaysLate port and add the expression shown below.
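As a rough sketch of what these two expressions might look like (assuming the Orders fields are named OrderDate, RequiredDate, and ShippedDate; adjust to your actual field names):

    -- DaysToShip: days between order placement and shipment
    DATE_DIFF(ShippedDate, OrderDate, 'DD')

    -- DaysLate: days shipped past the required date, or 0 if shipped on time
    IIF(ShippedDate > RequiredDate, DATE_DIFF(ShippedDate, RequiredDate, 'DD'), 0)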
b. Go to the Incoming Fields tab to exclude the fields that are no longer required in the mapping.
This is optional but makes your mappings cleaner. In the field rule, change the field
selection criteria to Named Fields and configure it to include the required fields as shown below.
c. Click Configure under the field rule detail and select the following fields to include.
Validation Panel
10. You can test the developed logic before running the mapping using Run Preview. For example,
click on the expression 'Exp_ComputeToShipAndLateDays' to validate the expressions created.
a. In the Preview tab, click Run Preview.
b. Choose the Lab_env runtime environment and the number of rows you want to read from the
source. Click Run Preview.
11. Run the mapping and choose 'Lab_Env' as the runtime environment. Then go to My Jobs to check
the status of the job. It should process 89 rows. Then you can preview the data in the target table from
the mapping, and the result should be as follows.
Tradewind’s marketing department wants to know about sales made to customers outside the USA. You
will build a new target table to hold a summary of sales made to these customers using a flat file
customer list as a source and joining it to the relational tables Orders and Order_Details.
The logic
1. The customers list flat file is fixed width, so you must create a new Fixed-Width File Format
component called 'CustomersFlatFile_Format'.
2. Create a mapping called ‘m_ForeignCustomersSales’. The mapping will contain the following:
a. Source transformation that will read from CUSTOMERS_FLAT.DAT file.
b. A Filter transformation, to remove customers from the US. Note that, since our flat file source
is a fixed-width file, we'll need to TRIM the customer's country before we compare it to 'USA'.
c. Source transformation with multiple objects type containing the Orders and Order_Details
tables. Your database has a primary/foreign key relationship enforced between the Orders
and Order_Details tables, meaning you won’t have to do anything special in defining the
relationship. The cloud integration service will handle everything automatically.
d. A Joiner transformation to join the flat file and relational streams, based on customer ID.
e. An Aggregator transformation, to sum up the sales amounts for each customer.
f. A target transformation to connect to ‘T_ForeignCustSales_x’ where x is your student ID.
3. Save the mapping and make sure it is valid
4. Run the mapping.
b. Name the Fixed-Width File Format 'CustomersFlatFile_Format' and specify the connection
details to import the file CUSTOMERS_FLAT.DAT. Then add the lines in the data preview
window to specify the column boundaries.
d. Click Save.
b. Go to Formatting Options to specify the file type as Fixed Width and add the file format that you
created previously.
5. Add a Filter transformation after the 'Customers_Flat' source to remove customers from the US.
a. Set the filter condition to Advanced, then use an RTRIM function to get rid of
trailing spaces before you compare the country field to the string 'USA'.
Click on the dropdown list to navigate between functions and fields to add to the expression.
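For example, assuming the country field is named Country (adjust to your file format's field name), the advanced filter condition might be:

    -- keep only customers whose country, with trailing spaces removed, is not USA
    RTRIM(Country) != 'USA'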
6. Add a Source transformation to connect to the Orders and Order_Details tables, joined on the OrderID
field.
a. Choose the MSSQL_Source connection and set the source type to Multiple Objects. Add
the Orders object by clicking the add icon and choosing Add Source Object.
To add the related object Order_Details, click the add icon again and then click Add Related Objects.
b. Delete all unwanted fields, keeping only the fields below that will be used in the mapping.
7. Create a Joiner to join the filtered flat file source to the relational sources. This join is a normal (equi)
join, based on CustomerID. The flat file source is the smaller stream based on the number of
rows and should constitute the master side of the Joiner's relationship.
a. Click on the Joiner's + icon to see the Master and Detail inputs.
8. Add the Aggregator transformation. You need to group by CustomerID and sum up order sales
based on the formula UNITPRICE * QUANTITY * (1 – DISCOUNT) (DISCOUNT is a
percentage):
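As a sketch, assuming the Order_Details fields are named UnitPrice, Quantity, and Discount, the aggregate expression might be:

    -- total sales per customer (set CustomerID as the group-by field);
    -- Discount is stored as a fraction, per the formula above
    SUM(UnitPrice * Quantity * (1 - Discount))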
c. Map the target fields to the input fields. Click on Automap -> ExactFieldName
1. Create a User-Defined Function to extract the year from a date. This way the function is
developed once and used in all your mappings.
a. Create a new User-Defined function component.
b. In the General tab, name the function 'UDF_ExtractYear' and define the return type as string.
d. In the Expression tab, define the expression that will extract the year of the InputDate
argument.
e. Click Save.
2. Go to the mapping and add an Expression transformation called 'Exp_Year' after the Joiner, then
create an output port called Year (String(4)).
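As a sketch, the UDF body and the Year port calling it might look as follows (assuming the joined stream carries an OrderDate field):

    -- UDF_ExtractYear: expression defined inside the UDF, returns the 4-character year
    TO_CHAR(InputDate, 'YYYY')

    -- Year output port in Exp_Year, invoking the user-defined function
    :UDF.UDF_ExtractYear(OrderDate)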
Tradewind wants to maintain a roster of its most productive employees, based on the number of orders
they generate.
Your job is to create a relational table listing all employees, with their names and titles, along with their
ranking.
The logic
m_TopEmployeesByOrders
In the Lookup Condition tab you will get an error about name conflicts, so click Resolve Field
Name Conflicts and prefix the incoming fields from the Rank transformation with 'in_'.
That's it. The only thing left to do is to connect the appropriate Rank and Lookup output ports to your target.
m_TopEmployeesByOrders_Supervisor
After adding the Lookup transformation, you have to check the 'Unconnected Lookup' option.
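An unconnected lookup is then invoked from an expression using the :LKP prefix; as a sketch, assuming the lookup is named lkp_Supervisor and takes an employee ID (both names are illustrative):

    -- call the unconnected lookup; it returns its designated return field
    :LKP.lkp_Supervisor(in_EmployeeID)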
Your company needs to combine account information from Salesforce with order data from SQL Server.
The data will be loaded into two different tables depending on the total price of the order. The tables are
used in the order fulfillment system.
The logic
• Create a new mapping that will join the Salesforce Account object with the Orders_Prem table.
• We need to look up the Products file to get the detailed information for each product.
• Then we have to load the data into the Accelerated Shipping target if the total price of the order is
greater than or equal to 1500; otherwise we load it into the Normal Shipping target (see the sketch after this list).
• We are going to use Salesforce, flat file, and MSSQL connections.
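As a sketch of the routing logic, assuming an upstream expression computed a total price field named TotalPrice (an illustrative name), the Router group condition for accelerated shipping might be:

    -- AcceleratedShipping group: orders worth 1500 or more
    TotalPrice >= 1500
    -- rows matching no group fall through to the default group (normal shipping)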
b. Delete the fields that you will not use in the mapping. Keep only the fields shown below.
5. Create a Joiner transformation that joins the Account object and the Orders_Prem table based on the
condition Account_External_ID_c = CustomerID.
6. Create a lookup on the flat file ‘Products_Prem.csv’ to get the Product Description and
Price_Per_Unit fields.
a. Connect to the Products_Prem.csv file.
9. Create two Target transformations, one that will create 'T_AcceleratedShipping' and the other
'T_NormalShipping' in the MSSQL target database. Connect the Router to the targets as follows.
11. Configure ‘T_NormalShipping_x’ target by repeating the same steps provided in step 9.
13. Validate the result by previewing the target tables in MSSQL server.
Errors have popped up in the daily production extracts. Duplicate orders with conflicting primary keys
have found their way into the system. Your job is to filter out duplicate orders from the extract file and
send them to a separate table for cleanup.
This process will leave the first row in a series of duplicates going to the target table, while the other
duplicate rows will be sent to the error table.
The logic
Click on the input to add the ‘OrderID’, ‘CustomerID’, ‘OrderDate’, and ‘EmployeeID’ fields.
Add an Expression transformation to catch duplicate rows before they hit the warehouse. We need to
compare the value of the 'OrderID' primary key in the current row with its value in the previous row and
output both values to the Router, which will make the dupe/no-dupe decision.
You will implement this logic using local variables. The logic depends on two properties:
1. Variables are evaluated from top to bottom, in the order they appear in the port list.
2. A variable can be set to the value of another variable located later in the order of evaluation;
this is called a forward reference.
To remember a value from a previous row, you need two variables, following each other in the port list:
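As a sketch, with illustrative port names, the Expression ports might be set up like this (evaluation runs top to bottom for each row):

    -- v_PrevOrderID reads v_CurrOrderID before it is overwritten,
    -- so it still holds the previous row's OrderID (forward reference)
    v_PrevOrderID  (variable) = v_CurrOrderID
    v_CurrOrderID  (variable) = OrderID
    -- expose the previous row's key so the Router can compare it to OrderID
    PrevOrderID    (output)   = v_PrevOrderID

The Router can then flag a row as a duplicate when OrderID = PrevOrderID.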
Select the Output transformation to add the output ports 'OrderID', 'CustomerID', 'OrderDate', and
'PrevOrderID' as seen below.
This is a tab-delimited flat file, so you have to specify the formatting options as shown below.
Connect to both targets (T_Orders and T_DupeOrders) and map the target fields.
Marketing would like to make sure the products they advertise in their marketing campaigns all have an
active status and are not discontinued. They want an aggregated semi-annual Total Sales for both active
and discontinued products. To optimize the mapping you can turn on the Aggregator’s sorted input
property.
The logic
Hints
m_SemiAnnualSales
Source transformation
Add the three sources; the relationships will be established automatically since the key fields between
the tables have the same names. To add the Products table, highlight Order_Details and then
add the related object.
To be able to use the Aggregator's sorted input property, we must sort the data by the group keys
ProductID and OrderDate. In the source properties, under Query Options, configure the sort criteria as
follows.
Aggregator transformation
Use a SUM with a conditional clause to compute both semester sales amounts.
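As a sketch, using SUM's optional filter-condition argument and assuming fields named UnitPrice, Quantity, Discount, and OrderDate, the two semester aggregates might be:

    -- first-half (Jan-Jun) sales
    SUM(UnitPrice * Quantity * (1 - Discount), GET_DATE_PART(OrderDate, 'MM') <= 6)
    -- second-half (Jul-Dec) sales
    SUM(UnitPrice * Quantity * (1 - Discount), GET_DATE_PART(OrderDate, 'MM') > 6)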
Connect to the target table ‘T_SemiAnnualSalesByProduct’ and map the target fields.
TradeWind is testing a new e-commerce system. The system takes orders on the web and generates a
daily extract of order details as a flat file. The system collects information about new customers and
allows existing customers to modify their profile. Your job is to take the daily extract and create a
customer dimension table and update it day by day. New customers get added, and modifications made
online to existing customers are reflected in the dimension.
Note: to simplify the exercise, only a few of the Customers dimension fields will appear in the daily
extract.
The logic
m_UpdateCustDimType1
Source Transformation
Define the formatting options of the source as follows.
Delimiter: Tab
Field Labels: Auto-generate
Include only the needed fields in the mapping and rename them as follows. This is done by configuring
the include rule in the Incoming Fields tab of the Aggregator.
Add a Router with two groups: one for new customers and another for updated customers (see the sketch below).
Note that NEXTVAL is the field coming from the sequence generator.
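As a sketch of the group conditions, one common approach is to look up the existing dimension and return a key field, here illustratively named ExistingCustomerKey, which is NULL when the customer is not yet in the dimension:

    -- NewCustomers group: no match found in the dimension
    ISNULL(ExistingCustomerKey)
    -- UpdatedCustomers group: customer already exists
    NOT ISNULL(ExistingCustomerKey)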
Then go to the mapping and change the source object so it can read from ‘DWOrdersExtractDay2.dat’.
During the second run, with ‘DWOrdersExtractDay2.dat’, your mapping should have inserted the following
Your job is to put together a simple Taskflow containing 2 tasks linked in series and implement error
handling.
You will have to make sure that:
a. Tasks fail at the first mapping error
b. Tasks do not run when the previous Task fails
c. Taskflow fails if anything goes wrong
The ultimate goal is to run the taskflow to completion, without any errors.
You can pick any two valid tasks in your folder.
5. Then click on each of the two task links to see the run-time details of each task.
Your data warehouse loads dimension tables daily based on specific logic.
The logic
Hints
This exercise is about taskflow logic, not actually loading data. You will be using any valid mapping task
each time you need to add a data task.
In the path that loads dimension table 3 add a decision object as follows.
Define the script path in the Command Task input field tab:
If the LoadDim3 task fails, the taskflow must not stop, since further analysis is done afterwards. So,
change the 'On Error' property in the Error Handling tab of the data task to 'Ignore'.
Add a Notification Task after the command to send an email with the command execution status.
You need to pass Order records to a staging table in increments. Each incremental load will truncate the
staging table and load 500 new orders. In addition, you will keep a history table with statistics for each
incremental load. You will use mapping in-out parameters to handle the incremental load automatically.
The logic
Hints
m_IncrementalLoad
Target definitions
The following target definitions show the fields that must be loaded when you create a new target at
runtime.
T_OrdersStaging_x, where x is your student number.
Don't forget to check the truncate target table option of T_OrdersStaging_x. Create a new runtime target
object for both targets and include only the incoming fields that you need to load into the target.
The value can be checked in the mapping task: click Edit, go to the In-Out Parameters tab, and you
will see the updated value of $$StartOrderID.
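As a sketch of how the parameter might drive the load (assuming an in-out parameter $$StartOrderID with aggregation type Max, and a numeric OrderID key):

    -- Filter transformation: pick up the next 500 orders after the last load
    OrderID > $$StartOrderID AND OrderID <= $$StartOrderID + 500

    -- Expression port: advance the parameter to the highest OrderID processed
    SETMAXVARIABLE($$StartOrderID, OrderID)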
If you select the StartOrderID in-out parameter and hit the 'Reset In-Out Parameter' button, the in-out
parameter value is removed from the repository. The next time you run, the default value set in your mapping
will be used, unless you are using a parameter file.