Input:
Id  Name
10  aa
10  bb
10  cc
20  aa
20  bb
30  bb
40  aa
50  cc

Output:
Id  Name_A  Name_B  Name_C
10  aa      bb      cc
20  aa      bb
30          bb
40  aa
50                  cc
Source -> Sorter (ascending by Id) -> Aggregator (group by Id and create three output ports with the expressions below) -> Target (connect Id and the three output ports).
Name_A = MAX(DECODE(Name,'aa',Name))
Name_B = MAX(DECODE(Name,'bb',Name))
Name_C = MAX(DECODE(Name,'cc',Name))
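The Aggregator logic above can be sketched outside Informatica. This is a minimal Python illustration, not PowerCenter code: the function name `pivot_names` is hypothetical, and `None` is assumed to stand in for the NULL that DECODE returns when nothing matches.

```python
from collections import defaultdict

# Input rows from the example above: (Id, Name)
rows = [(10, "aa"), (10, "bb"), (10, "cc"), (20, "aa"), (20, "bb"),
        (30, "bb"), (40, "aa"), (50, "cc")]

def pivot_names(rows, names=("aa", "bb", "cc")):
    """Group by Id and emit one column per expected Name value."""
    grouped = defaultdict(dict)
    for id_, name in sorted(rows):            # Sorter: ascending by Id
        cols = grouped[id_]                   # one group per Id
        if name in names:                     # DECODE(Name, 'aa', Name): Name or NULL
            current = cols.get(name)
            # MAX over the group keeps the largest non-null value seen
            cols[name] = name if current is None else max(current, name)
    # None marks a column with no matching row in that group
    return [(id_, *(cols.get(n) for n in names)) for id_, cols in grouped.items()]

pivoted = pivot_names(rows)
for row in pivoted:
    print(row)
```

Each tuple holds Id followed by the Name_A, Name_B and Name_C values for that group.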
If the source data contains unique and duplicate records then transform the data to
load unique records in one target and duplicate records in another target:
Source -> Sorter (sort on Id, assuming unsorted data) -> Aggregator (group on Id and calculate the count of Id) -> Joiner (join the original data with the output of the Aggregator on the Id column) -> Router (one group contains the rows where Count of Id = 1, the other the rows where Count of Id > 1) -> populate these groups into two separate targets. The first target will contain the unique records and the other target will contain the duplicate records.
Joiner output:
Id  Count
1   2
1   2
2   1
3   3
3   3
3   3
4   2
4   2
5   1
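The count, join and route steps can be sketched in Python as follows. This is an illustration under assumptions, not Informatica code: the function name `split_unique_and_duplicates` is hypothetical, and the input Ids are taken from the Joiner table above.

```python
from collections import Counter

# Id values as they appear in the Joiner table above
ids = [1, 1, 2, 3, 3, 3, 4, 4, 5]

def split_unique_and_duplicates(ids):
    """Attach the per-Id count to every row, then route on that count."""
    counts = Counter(ids)                              # Aggregator: count per Id
    joined = [(i, counts[i]) for i in sorted(ids)]     # Sorter + Joiner on Id
    unique = [i for i, c in joined if c == 1]          # Router group: Count = 1
    duplicates = [i for i, c in joined if c > 1]       # Router group: Count > 1
    return joined, unique, duplicates

joined, unique, duplicates = split_unique_and_duplicates(ids)
print(unique)
print(duplicates)
```

Here `joined` reproduces the Id/Count pairs of the Joiner table, and the two lists correspond to the two targets.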
If the source data contains unique and duplicate records then transform the data to
load the first occurrence of each record in one target and the remaining duplicate
records in another target:
Expression Logic:
V_Curr_Val = Id
V_Seq = IIF(V_Curr_Val = V_Prev_Val, V_Seq + 1, 1)
V_Prev_Val = V_Curr_Val
Out_Seq = V_Seq
Source -> Sorter (sort on Id, assuming unsorted data) -> Expression (using variable ports, create a sequence that restarts for each new Id value) -> Router (one group contains the rows where the expression-generated sequence = 1, the other the rows where it is > 1) -> populate these groups into two separate targets. The first target will contain the first occurrence of each record and the other target will contain the remaining duplicate records.
Expression
V_Seq V_Prev_Val Out_Seq Id
1 1 1 1
2 1 2 1
1 2 1 2
1 3 1 3
2 3 2 3
3 3 3 3
1 4 1 4
2 4 2 4
1 5 1 5
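The variable-port logic can be mirrored in Python to check the table above. This is a rough sketch, not Informatica code: the function names are hypothetical, and starting `v_prev_val` at `None` is an assumption standing in for the variable port's initial value.

```python
# Id values as they appear in the Expression table above
ids = [1, 1, 2, 3, 3, 3, 4, 4, 5]

def number_within_id(ids):
    """Return (Id, Out_Seq) pairs, with the sequence restarting on each new Id."""
    v_prev_val = None   # assumption: no previous value before the first row
    v_seq = 0
    result = []
    for id_ in sorted(ids):                  # Sorter: sort on Id
        v_curr_val = id_
        # Same Id as the previous row: keep counting, otherwise restart at 1
        v_seq = v_seq + 1 if v_curr_val == v_prev_val else 1
        v_prev_val = v_curr_val
        result.append((id_, v_seq))          # Out_Seq = V_Seq
    return result

def split_first_and_rest(ids):
    """Router: sequence = 1 goes to the first target, sequence > 1 to the second."""
    numbered = number_within_id(ids)
    first = [i for i, seq in numbered if seq == 1]
    rest = [i for i, seq in numbered if seq > 1]
    return first, rest

first, rest = split_first_and_rest(ids)
print(first)
print(rest)
```

As in the mapping, the assignment order matters: the previous value is updated only after the comparison has been made.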
Source -> Sequence Generator -> Aggregator (calculate the total number of records by counting them without grouping on any field) -> Joiner (perform a Normal Join of the original dataset with the output of the Aggregator on the Dummy port created in both datasets) -> Filter (pass the records where Count - Seq Id <= 2, which keeps the last three records).
Joiner
Seq Id Id Dummy Count
1 3 1 7
2 6 1 7
3 2 1 7
4 7 1 7
5 9 1 7
6 1 1 7
7 6 1 7
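The filter condition can be checked against the Joiner table with a short Python sketch. It is illustrative only: the function name `last_n` and its parameter `n` are hypothetical, and the Dummy port is omitted because the total count is a single value that applies to every row.

```python
# Id values in source order, as they appear in the Joiner table above
ids = [3, 6, 2, 7, 9, 1, 6]

def last_n(records, n=3):
    """Keep the last n records by comparing each sequence number to the total count."""
    numbered = list(enumerate(records, start=1))   # Sequence Generator: Seq Id from 1
    count = len(numbered)                          # Aggregator with no group-by port
    # Filter: Count - Seq Id <= n - 1 (with n = 3 this is Count - Seq Id <= 2)
    return [rec for seq, rec in numbered if count - seq <= n - 1]

last_three = last_n(ids)
print(last_three)
```

With Count = 7, only Seq Id 5, 6 and 7 satisfy the condition, so the last three rows of the table pass the Filter.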