07 Joiner AGG Sorter


BANGALORE TECHNICAL TRAININGS Onlinetrainingsbglr@gmail.com +917411642061

Lab 7
Heterogeneous Join, Aggregator, & Sorter

Lab at a Glance
    Objectives
    Summary
    Duration
Exercises
    Exercise 1: Create the Mapping
    Exercise 2: Create and Run the Workflow
Reference

PowerCenter 9.x Level I Developer Lab Guide


Lab 7. Heterogeneous Join, Aggregator, & Sorter

Lab at a Glance
The exercises in this lab are designed to walk the student through the process of using the Joiner transformation to join data from heterogeneous sources. The student will also learn how to use the Aggregator & Sorter transformations.

Objectives
After completing the lab, the student will be able to:
Perform a heterogeneous join using the Joiner transformation.
Use the Sorter transformation (comparable to an ORDER BY clause in SQL).
Aggregate data using the Aggregator transformation (comparable to GROUP BY in SQL).
Use the Sorted Input property of the Aggregator transformation.

Summary
The purpose of this lab is to populate an ODS table by loading data from a flat file and a relational table. The flat file contains information about orders placed with a vendor for products and supplies. An example of data from the flat file follows:

The product table contains product data, such as the make and model name, vendor ID, and cost. An example of data from the product table follows:


The goal of this lab is to load an ODS table with the costs summarized by date, product and vendor. This will be the raw data used to populate the fact table. Assuming there is a large amount of data broken down into several groups, the data will flow through the mapping much faster if it is sorted before it is aggregated. Because the data does not come only from a relational source, grouping it in the Source Qualifier would serve very little purpose.

SOURCES: PRODUCT, ORDER flat file
TARGET: ODS_ORDER_AMOUNT

The completed mapping should look as follows:

Duration
This lab should take approximately 45 minutes.


Exercises
Exercise 1: Create the Mapping
Step 1. Import the sources.
Clear the Source Analyzer workspace (right-click anywhere in the workspace and select Clear All).
Continue to work in the assigned student folder and import the tab-delimited flat file, ORDER.txt.
In wizard step 1, check the Import Field Names From First Line checkbox, because the first row contains the field names.
In wizard step 2, in the Delimiters area, check Tab and uncheck Comma.
Import the relational table PRODUCT from the SDBU database schema.
The sources should look as follows:
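The two wizard settings in Step 1 (field names taken from the first line, Tab as the only delimiter) can be illustrated outside PowerCenter. The following is a minimal Python sketch, not part of the lab itself. The column names come from the lab, but the sample rows and the `read_tab_delimited` helper are invented for illustration.

```python
import csv
import io

# Hypothetical sample in the shape Step 1 describes: tab-delimited text
# with the field names on the first line. The data values are invented.
SAMPLE_ORDER_TXT = (
    "ORDER_DATE\tPRODUCT\tQUANTITY\n"
    "01-JAN-2003\tP100\t5\n"
    "01-JAN-2003\tP200\t2\n"
)

def read_tab_delimited(text):
    """Parse tab-delimited text whose first line holds the field names."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)

rows = read_tab_delimited(SAMPLE_ORDER_TXT)
for row in rows:
    print(row["ORDER_DATE"], row["PRODUCT"], row["QUANTITY"])
```

Every value is read as a string at this point, which is why a later step in the mapping converts ORDER_DATE to a date/time value.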

Save your work.

Step 2. Import the target.
Switch to the Warehouse Designer tool.
Clear the workspace (right-click anywhere in the workspace and select Clear All).
Import the relational database target, ODS_ORDER_AMOUNT, from the TDBUxx database schema.
Save your work.

Step 3. Create a mapping.
Create a mapping called m_ODS_ORDER_AMOUNT_xx.


Step 4. Add sources and target. Add the ORDER and PRODUCT source definitions with their respective Source Qualifiers to the mapping:

Remember that each source must have its own Source Qualifier. If a Source Qualifier is not created automatically for a source, create one manually.

Add the target definition, ODS_ORDER_AMOUNT:

Note that as more objects are added to the mapping, the Navigator and Output windows can be toggled off, providing more room in the workspace:

Save the mapping.


Step 5. Create a joiner transformation.


The Joiner transformation can join data from two related heterogeneous sources that reside in different locations or file systems.

Create a Joiner transformation and name it jnr_ODS_ORDER_AMOUNT.
Copy/link the following ports to jnr_ODS_ORDER_AMOUNT:
From sq_ORDER, add ORDER_DATE, PRODUCT and QUANTITY.
From sq_PRODUCT, add PRODUCT_CODE, VENDOR_ID, PRICE and COST.

One of the two sources in each Joiner must be deemed the Master and the other will become the Detail by default. Refer to the Reference section at the end of this Lab for more information about selecting Master and Detail sources.

Edit jnr_ODS_ORDER_AMOUNT.
On the Ports tab, select the ports from sq_PRODUCT as the master ports by checking the M boxes.
On the Ports tab, increase the size of the PRODUCT port to a precision of 10.
On the Condition tab, click the Add a New Condition button and add the condition PRODUCT_CODE = PRODUCT.

On the Properties tab, confirm the Join Type is Normal Join. The Joiner transformation should appear as follows:
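To make the master/detail idea concrete, here is a minimal Python sketch of a Normal (inner) join in which the master rows are cached and the detail rows are streamed past the cache. It illustrates the concept only and is not how PowerCenter is implemented. The `normal_join` function and all sample values are hypothetical; only the port names come from the lab.

```python
def normal_join(master_rows, detail_rows, master_key, detail_key):
    """Inner-join two row streams, caching only the master side."""
    # Build the cache from the master flow (PRODUCT in this lab).
    cache = {}
    for m in master_rows:
        cache.setdefault(m[master_key], []).append(m)

    # Stream the detail flow (ORDER) and emit one row per match.
    joined = []
    for d in detail_rows:
        for m in cache.get(d[detail_key], []):
            joined.append({**d, **m})
    return joined

# Invented sample data using the lab's port names.
products = [
    {"PRODUCT_CODE": "P100", "VENDOR_ID": 1, "PRICE": 12.0, "COST": 8.0},
    {"PRODUCT_CODE": "P200", "VENDOR_ID": 2, "PRICE": 30.0, "COST": 21.0},
]
orders = [
    {"ORDER_DATE": "01-JAN-2003", "PRODUCT": "P100", "QUANTITY": 5},
    {"ORDER_DATE": "02-JAN-2003", "PRODUCT": "P999", "QUANTITY": 1},
]

# Join condition from Step 5: PRODUCT_CODE = PRODUCT.
joined = normal_join(products, orders, "PRODUCT_CODE", "PRODUCT")
print(joined)
```

The order for P999 has no matching product, so a Normal join drops it. Because only the master side is held in memory, the flow chosen as master determines the cache size.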

Save the mapping.

Step 6. Create an expression transformation.
Create an Expression transformation called exp_ORDER_DATE.


Copy/link the ORDER_DATE port from jnr_ODS_ORDER_AMOUNT to exp_ORDER_DATE.
Rename the ORDER_DATE port to ORDER_DATE_in and make it input only.
Add a new port called ORDER_DATE_out with the datatype date/time. Make ORDER_DATE_out an output only port.
Create an expression for the ORDER_DATE_out port as follows:
TO_DATE(ORDER_DATE_in, 'DD-MONYYYY')
The Expression transformation should look as follows:

Step 7. Create a Sorter transformation.
Create a Sorter transformation and name it srt_ODS_ORDER_AMOUNT.
Copy/link the ORDER_DATE_out port from exp_ORDER_DATE.
Copy/link the QUANTITY, PRODUCT_CODE, VENDOR_ID, PRICE and COST ports from jnr_ODS_ORDER_AMOUNT.
Rename ORDER_DATE_out to ORDER_DATE.
Check the Key checkbox for the ORDER_DATE, PRODUCT_CODE and VENDOR_ID ports. Be certain the ports are in that order.
The mapping should look as follows:
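The order of the key ports matters because rows are compared on the first key, then the second, then the third. A minimal Python sketch of that multi-key sort follows. The sample rows are invented and ascending order is assumed for every key.

```python
from datetime import date
from operator import itemgetter

# Sort keys in the same top-to-bottom order as the Key ports in Step 7.
SORT_KEYS = ("ORDER_DATE", "PRODUCT_CODE", "VENDOR_ID")

# Invented, deliberately unsorted sample rows.
rows = [
    {"ORDER_DATE": date(2003, 1, 2), "PRODUCT_CODE": "P100", "VENDOR_ID": 1, "QUANTITY": 3},
    {"ORDER_DATE": date(2003, 1, 1), "PRODUCT_CODE": "P200", "VENDOR_ID": 2, "QUANTITY": 4},
    {"ORDER_DATE": date(2003, 1, 1), "PRODUCT_CODE": "P100", "VENDOR_ID": 1, "QUANTITY": 5},
]

# itemgetter with several names returns a tuple, so rows are compared
# on ORDER_DATE first, then PRODUCT_CODE, then VENDOR_ID.
sorted_rows = sorted(rows, key=itemgetter(*SORT_KEYS))

for row in sorted_rows:
    print(row["ORDER_DATE"], row["PRODUCT_CODE"], row["VENDOR_ID"], row["QUANTITY"])
```

After sorting, all rows that share the same date, product and vendor sit next to each other, which is the precondition for the Sorted Input option of the Aggregator.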


Click OK. Save the repository.

Step 8. Create an Aggregator transformation.
Create an Aggregator transformation and name it agg_ODS_ORDER_AMOUNT.

Copy/link all ports from srt_ODS_ORDER_AMOUNT to agg_ODS_ORDER_AMOUNT. Edit agg_ODS_ORDER_AMOUNT.


It is important to specify how to group the data when doing aggregate calculations. The group by ports can be input, output or variable ports. The order of the ports from top to bottom determines the group by order.

On the Ports tab, group the data by:


ORDER_DATE
PRODUCT_CODE
VENDOR_ID

On the Ports tab, rename the QUANTITY port to QUANTITY_in by appending _in to its name.
Position the QUANTITY_in port after the ORDER_DATE port. Make it an input only port by clearing the Output port checkbox.
On the Ports tab, create an output port (after QUANTITY_in) called QUANTITY_out with a datatype of integer and a precision of 10.
Add the following expression to the QUANTITY_out port:
SUM(QUANTITY_in)


On the Properties tab, check the Sorted Input box. Click OK. The mapping should look as follows:
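Why sorted input helps can be sketched in a few lines of Python. When rows arrive already ordered by the group-by ports, each group is contiguous, so a running total can be emitted as soon as the key changes instead of holding every group in memory until the last row. This is an illustration of the idea, not PowerCenter's implementation. The `aggregate_sorted` function and the sample rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Group-by ports in the same top-to-bottom order as in Step 8.
GROUP_KEYS = ("ORDER_DATE", "PRODUCT_CODE", "VENDOR_ID")

def aggregate_sorted(rows):
    """Compute SUM(QUANTITY_in) per group, assuming rows are sorted by GROUP_KEYS."""
    results = []
    # groupby only merges adjacent rows, so it relies on the sorted order.
    for key, group in groupby(rows, key=itemgetter(*GROUP_KEYS)):
        total = sum(row["QUANTITY_in"] for row in group)
        results.append({**dict(zip(GROUP_KEYS, key)), "QUANTITY_out": total})
    return results

# Invented sample rows, already sorted by the group keys.
sorted_rows = [
    {"ORDER_DATE": "2003-01-01", "PRODUCT_CODE": "P100", "VENDOR_ID": 1, "QUANTITY_in": 5},
    {"ORDER_DATE": "2003-01-01", "PRODUCT_CODE": "P100", "VENDOR_ID": 1, "QUANTITY_in": 3},
    {"ORDER_DATE": "2003-01-02", "PRODUCT_CODE": "P200", "VENDOR_ID": 2, "QUANTITY_in": 4},
]

totals = aggregate_sorted(sorted_rows)
for row in totals:
    print(row)
```

If the input were not sorted, the same key could appear in several separate runs and this approach would emit more than one row for it, which is why the Sorter is placed ahead of the Aggregator in this mapping.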

Save the mapping.

Step 9. Link the target definition.
Use the Autolink feature to link from agg_ODS_ORDER_AMOUNT to ODS_ORDER_AMOUNT.
In the Autolink dialog box, click the More>> button.
All of the port names are the same as the target except QUANTITY_out.
Autolink by name, using _out in the From Transformation suffix field:
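The effect of the suffix setting can be pictured as name matching after the suffix is stripped from the source port names. The Python sketch below is only a rough model of that behaviour, not the Designer's actual matching logic. The `autolink_by_name` function is hypothetical, and the list of target column names is an assumption based on the statement above that the names match the Aggregator ports.

```python
def autolink_by_name(from_ports, to_ports, from_suffix=""):
    """Return {source_port: target_port} for ports whose names match
    once the optional suffix is removed from the source port name."""
    targets = set(to_ports)
    links = {}
    for port in from_ports:
        name = port
        if from_suffix and port.endswith(from_suffix):
            name = port[: -len(from_suffix)]
        if name in targets:
            links[port] = name
    return links

# Ports of agg_ODS_ORDER_AMOUNT as built in Step 8.
agg_ports = ["ORDER_DATE", "QUANTITY_in", "QUANTITY_out",
             "PRODUCT_CODE", "VENDOR_ID", "PRICE", "COST"]
# Assumed target columns (not listed in the lab text).
target_ports = ["ORDER_DATE", "QUANTITY", "PRODUCT_CODE",
                "VENDOR_ID", "PRICE", "COST"]

links = autolink_by_name(agg_ports, target_ports, from_suffix="_out")
print(links)
```

In this model QUANTITY_out links to QUANTITY, while QUANTITY_in finds no match and stays unlinked.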


Click the OK button. The linking should look as follows:

Step 10. Save and validate the mapping. Use the Arrange All Iconic feature. The completed mapping should look as follows:

Save the mapping and check for validation information on the Save tab in the Output window:


Exercise 2: Create and Run the Workflow


Step 1. Create a workflow.
Create a new workflow and name it wf_ODS_ORDER_AMOUNT_xx.
Create a session task called s_m_ODS_ORDER_AMOUNT_xx.
Edit the session and select the Mapping tab.
In the Navigation box, select the source SQ_ORDER.

Under Properties for ORDER, select File Reader and enter ORDER.txt for Source filename. Under Connections, click on the down arrow, select native_source and click OK.

In the Navigation box, select the source SQ_PRODUCT.

In the Navigation box, select the target ODS_ORDER_AMOUNT.

Under Connections, click on the down arrow, select native_target_xx and click OK.

Under Properties, the Target load type should be defaulted to Normal. Scroll down to select the Truncate target table option. Click OK to close the Edit Tasks dialog box. Save the repository. Link Start to s_m_ODS_ORDER_AMOUNT_xx.


Save, validate and start wf_ODS_ORDER_AMOUNT_xx.
Monitor and review the results for s_m_ODS_ORDER_AMOUNT_xx in the Workflow Monitor.

Step 2. Verify results: session properties.

Step 3. Verify results: session transformation statistics.

Step 4. Verify results: preview data (in Designer).


Note that only the first few rows are shown here.


Reference
Only the sorted Joiner can link two flows arising from the same Source Qualifier.
To join more than two flows, multiple Joiners must be nested: the results from one Joiner must be passed on to the next, and so forth.
The Joiner needs a minimum of two ports, one from each input flow, to create the join condition. Those two ports must have compatible datatypes and precisions for the join condition to be valid.
In a Joiner, one input flow must be designated as the master and the other as the detail. Master input ports are cached in memory, so choose the flow with the fewest duplicate rows as the master.
Specify the master input ports by checking the M attribute (unchecked input ports will be detail).
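The nesting rule can be sketched in Python by feeding the output of one two-way join into a second one. This is a conceptual illustration only. The `join` helper, the third VENDOR flow and all sample values are hypothetical and do not appear in the lab.

```python
def join(master, detail, master_key, detail_key):
    """Two-way inner join that caches the master flow."""
    cache = {}
    for m in master:
        cache.setdefault(m[master_key], []).append(m)
    return [{**d, **m} for d in detail for m in cache.get(d[detail_key], [])]

# Invented sample flows; VENDOR is a hypothetical third source.
orders = [{"PRODUCT": "P100", "QUANTITY": 5}]
products = [{"PRODUCT_CODE": "P100", "VENDOR_ID": 1}]
vendors = [{"VENDOR_ID": 1, "VENDOR_NAME": "Acme"}]

# First Joiner: ORDER (detail) with PRODUCT (master).
first = join(products, orders, "PRODUCT_CODE", "PRODUCT")
# Second Joiner: the first result (detail) with VENDOR (master).
second = join(vendors, first, "VENDOR_ID", "VENDOR_ID")
print(second)
```

Each join takes exactly two inputs, so three flows need two joins, four flows need three, and so on.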
