
Ab Initio Training

Components Contd…
> Joining Data
in0 (id, name, city, amount):
0345  Smith  Bristol    56
0212  Spade  London      8
0322  Jones  Compton    12
0492  West   London     23
0121  Forth  Bristol     7
0221  Black  New York   42

in1 (id, dt, cost):
0322  970402   1242.50
0345  970924    923.75
0121  961211  12392.00
0492  971123    234.12
0666  950616   2312.10

out (id, city, amount, dt):
0345  Bristol   56  1997/09/24
0212  London     8  1900/01/01
0322  Compton   12  1997/04/02
0492  London    23  1997/11/23
0121  Bristol    7  1996/12/11
0221  New York  42  1900/01/01

Confidential ©2012 Syntel, Inc.


> Joining Sorted Data on the ‘id’ field

in0 (sorted on id):             in1 (sorted on id):
0121  Forth  Bristol    7       0121  961211  12392.00
0212  Spade  London     8
0221  Black  New York  42
0322  Jones  Compton   12       0322  970402   1242.50
0345  Smith  Bristol   56       0345  970924    923.75
0492  West   London    23       0492  971123    234.12
                                0666  950616   2312.10

out:
0121  Bristol  7  1996/12/11
0212  London   8  1900/01/01
...
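
The alignment of sorted inputs can be modeled outside Ab Initio. The following is a minimal Python sketch, not Ab Initio code: it walks two lists that are already sorted on id and pairs records with equal keys, leaving None where one side has no record. The function name align_sorted and the tuple layout are assumptions made for illustration, and the sketch assumes each id appears at most once per input.

```python
def align_sorted(in0, in1, key=lambda rec: rec[0]):
    """Pair records from two inputs that are both sorted on the key.

    Yields (rec0, rec1); a side is None when it has no record for that key.
    Assumes keys are unique within each input (a simplification).
    """
    i, j = 0, 0
    while i < len(in0) or j < len(in1):
        k0 = key(in0[i]) if i < len(in0) else None
        k1 = key(in1[j]) if j < len(in1) else None
        if k1 is None or (k0 is not None and k0 < k1):
            yield in0[i], None          # key present only on in0
            i += 1
        elif k0 is None or k1 < k0:
            yield None, in1[j]          # key present only on in1
            j += 1
        else:
            yield in0[i], in1[j]        # keys match
            i += 1
            j += 1


in0 = [
    ("0121", "Forth", "Bristol", 7),
    ("0212", "Spade", "London", 8),
    ("0221", "Black", "New York", 42),
    ("0322", "Jones", "Compton", 12),
    ("0345", "Smith", "Bristol", 56),
    ("0492", "West", "London", 23),
]
in1 = [
    ("0121", "961211", "12392.00"),
    ("0322", "970402", "1242.50"),
    ("0345", "970924", "923.75"),
    ("0492", "971123", "234.12"),
    ("0666", "950616", "2312.10"),
]
aligned = list(align_sorted(in0, in1))
```

Because both inputs are sorted, a single forward pass is enough; neither input has to be held in memory as a whole.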



> Join

 Join combines the records on its inputs that have matching key values. By default,
the inputs to Join must be sorted on the key, and an inner join is computed.

 Types Of Join
 Inner Join (Default)
 Full outer Join
 Explicit Join



> Join Properties
 Count: Required, Integer
Integer from 1 to 20 that sets the number of each of the
following: in ports (the input arguments to the transform function),
unused ports, reject ports, error ports, record-required
parameters, dedup parameters, select parameters, and override-key
parameters. Default is 2.

 Key: Required, Key specifier
The name(s) of the field(s) in the input records that must
have matching values for Join to call the transform function.

 Transform: Required, Filename or string
Specify either the name of the file containing the
transform function, or a transform string.

 Join-Type: Required, Choice
Choose from the following:
Inner join - sets the record-required parameters for all ports
to true.
Outer join - sets the record-required parameters for all ports
to false.
Explicit - allows you to set the record-required parameter for
each port individually.

 Record-requiredn (n is the port number): Required, Boolean
Set to true to call the transform function only if the key value
on the in port designated by n matches the current key value. In
other words, a matching key value on that port is required.
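
How Join-Type maps onto the record-required flags can be illustrated with a small model. This is a Python sketch under stated assumptions, not Ab Initio code: the helper names record_required_flags and should_call_transform are hypothetical, and "present" simply marks which in ports have a record for the current key value.

```python
def record_required_flags(join_type, count=2, explicit=None):
    """Return one record-required flag per in port for a given join type."""
    if join_type == "inner":
        return [True] * count       # every port must supply a matching record
    if join_type == "outer":
        return [False] * count      # no port is required
    if join_type == "explicit":
        return list(explicit)       # flags set per port by the caller
    raise ValueError("unknown join type: " + join_type)


def should_call_transform(present, record_required):
    """present[n] is True when in port n has a record for the current key."""
    return any(present) and all(
        has_record or not required
        for has_record, required in zip(present, record_required)
    )


# Current key exists on in0 but not on in1:
inner_call = should_call_transform([True, False], record_required_flags("inner"))
outer_call = should_call_transform([True, False], record_required_flags("outer"))
# Explicit: in0 required, in1 optional
left_call = should_call_transform(
    [True, False], record_required_flags("explicit", explicit=[True, False])
)
```

With these flags, an explicit join that requires only in0 keeps every in0 record whether or not in1 has a match.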



> Building the Output Record

in0:
record
  decimal(4) id;
  string(6) name;
  string(8) city;
  decimal(3) amount;
end

in1:
record
  decimal(4) id;
  date("YYMMDD") dt;
  decimal(9.2) cost;
end

out:
record
  decimal(4) id;
  string(8) city;
  decimal(3) amount;
  date("YYYY/MM/DD") dt;
end



> What if the in1 record is missing?

in0:
record
  decimal(4) id;
  string(6) name;
  string(8) city;
  decimal(3) amount;
end

in1: ???
record
  decimal(4) id;
  date("YYMMDD") dt;
  decimal(9.2) cost;
end

out:
record
  decimal(4) id;
  string(8) city;
  decimal(3) amount;
  date("YYYY/MM/DD") dt;
end



> Prioritized Assignment

Destination   Priority   Source

out.dt        :1:        in1.dt;
out.dt        :2:        "1900/01/01";

 In DML, a missing value (say, if there is no ‘in1’ record) causes an
assignment to fail.

 If an assignment for a left-hand side fails, the next priority assignment is
tried. There must be one successful assignment for each output field.
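
The fall-through behaviour of prioritized assignment can be mimicked in a few lines. This is a Python sketch of the idea only, not DML: first_successful and AssignmentFailed are hypothetical names, a missing in1 record is modeled as None, and the date is assumed to be already in YYYY/MM/DD form.

```python
class AssignmentFailed(Exception):
    """Raised when no candidate produced a value for an output field."""


def first_successful(*candidates):
    """Try each candidate in priority order and return the first value.

    A candidate fails if evaluating it raises or if it yields None.
    """
    for candidate in candidates:
        try:
            value = candidate()
        except (TypeError, KeyError, AttributeError):
            continue                    # try the next priority
        if value is not None:
            return value
    raise AssignmentFailed("no assignment succeeded")


def out_dt(in1):
    return first_successful(
        lambda: in1["dt"],              # priority 1: take the date from in1
        lambda: "1900/01/01",           # priority 2: default value
    )


with_match = out_dt({"id": "0345", "dt": "1997/09/24"})
without_match = out_dt(None)            # no in1 record for this key
```

If every candidate fails, the sketch raises, which mirrors the rule that each output field needs one successful assignment.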



> Assigning Priorities to Business Rules



> Resulting display when out.dt is selected



> Joining (Inner Join)



> A Look Inside the Join Component*

in0 fields: a b c        in1 fields: a q r

Align inputs by key        *join-type = Inner Join

out :: join(in0, in1) =
begin
  ...
  ...
  ...
  ...
  ...
end;

out fields: a x q



> Records arrive at the inputs of the Join
in0: G 234 42        in1: G NY 4

*join-type = Inner Join

Align inputs by a

out :: join(in0, in1) =
begin
  out.a :: in0.a;
  out.x :: in0.b + in1.r;
  out.q :: in1.q;
end;



> The input records are read into the Join component
in0: G 234 42        in1: G NY 4

Align inputs by a

out :: join(in0, in1) =
begin
  out.a :: in0.a;
  out.x :: in0.b + in1.r;
  out.q :: in1.q;
end;



> The input Key fields are compared
in0: G 234 42        in1: G NY 4

Align inputs by a

out :: join(in0, in1) =
begin
  out.a :: in0.a;
  out.x :: in0.b + in1.r;
  out.q :: in1.q;
end;



> The aligned records are passed to the transformation function

Align inputs by a

in0: G 234 42        in1: G NY 4

out :: join(in0, in1) =
begin
  out.a :: in0.a;
  out.x :: in0.b + in1.r;
  out.q :: in1.q;
end;



> The transformation engine evaluates based on the inputs

Align inputs by a

in0: G 234 42        in1: G NY 4

out :: join(in0, in1) =
begin
  out.a :: in0.a;
  out.x :: in0.b + in1.r;
  out.q :: in1.q;
end;



> A result record is emitted and written out as long as all output fields have been successfully computed

Align inputs by a

out :: join(in0, in1) =
begin
  out.a :: in0.a;
  out.x :: in0.b + in1.r;
  out.q :: in1.q;
end;

out: G 238 NY
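
The walk-through can be replayed with the sample records. This is a Python sketch rather than Ab Initio code; the namedtuple types and function names are hypothetical, and it assumes r and q are read from the in1 record (laid out as a q r), which is what the emitted result G 238 NY implies.

```python
from collections import namedtuple

In0 = namedtuple("In0", "a b c")
In1 = namedtuple("In1", "a q r")
Out = namedtuple("Out", "a x q")


def join_transform(in0, in1):
    """Build the output record from one aligned pair of input records."""
    return Out(a=in0.a, x=in0.b + in1.r, q=in1.q)


def inner_join_pair(in0, in1):
    """Compare the key field a and call the transform only on a match."""
    if in0.a != in1.a:
        return None
    return join_transform(in0, in1)


result = inner_join_pair(In0("G", 234, 42), In1("G", "NY", 4))
```

Here result carries a = "G", x = 234 + 4 = 238 and q = "NY", matching the record shown on the slide.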



> Lookup Files

 DML provides a facility for looking up records in a dataset based on a key:

lookup("file-name", key-expression)

 The data is read from a file into memory.

 The GDE provides a Lookup File component as a special dataset with no ports.



> Using lookup instead of Join

Using Last-Visits as a lookup file



> Configuring a Lookup File
1. Label used as name in lookup expression
2. Browse for pathname
3. Set record format
4. Set the lookup key



> Using a lookup file in a Transform Function

Input 0 record format:
record
  decimal(4) id;
  string(6) name;
  string(8) city;
  decimal(3) amount;
end

Output record format:
record
  decimal(4) id;
  string(8) city;
  decimal(3) amount;
  date("YYYY/MM/DD") dt;
end

Transform function:
out :: lookup_info(in) =
begin
  out.id :: in.id;
  out.city :: in.city;
  out.amount :: in.amount;
  out.dt :1: lookup("Last-Visits", in.id).dt;
  out.dt :2: "1900/01/01";
end;
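
The same transform can be modeled with an in-memory dictionary standing in for the lookup file. This is a Python sketch under assumptions, not Ab Initio code: build_lookup, LOOKUPS and the sample Last-Visits rows are hypothetical, and the lookup dates are assumed to be stored already as YYYY/MM/DD.

```python
def build_lookup(records, key_field):
    """Index records by key so each lookup is a single dictionary access.

    If a key occurs more than once, the last record wins in this sketch.
    """
    return {rec[key_field]: rec for rec in records}


# Stand-in for the Last-Visits lookup file, held in memory.
LOOKUPS = {
    "Last-Visits": build_lookup(
        [
            {"id": "0345", "dt": "1997/09/24"},
            {"id": "0322", "dt": "1997/04/02"},
        ],
        "id",
    )
}


def lookup(name, key):
    """Return the matching record, or None when the key is absent."""
    return LOOKUPS[name].get(key)


def lookup_info(rec):
    hit = lookup("Last-Visits", rec["id"])
    return {
        "id": rec["id"],
        "city": rec["city"],
        "amount": rec["amount"],
        # priority 1: date from the lookup; priority 2: default
        "dt": hit["dt"] if hit is not None else "1900/01/01",
    }


found = lookup_info({"id": "0345", "name": "Smith", "city": "Bristol", "amount": 56})
missing = lookup_info({"id": "0212", "name": "Spade", "city": "London", "amount": 8})
```

Unlike the sorted Join, this approach needs no sorted input, but the whole lookup dataset has to fit in memory.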



> Sort Within Groups

 Sort Within Groups refines the sorting of records already sorted
according to one key specifier: it sorts the records within the
groups formed by the first sort according to a second key
specifier.
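
The refinement step can be sketched as follows. This is a Python model of the idea, not the component itself: sort_within_groups and the sample tuples are hypothetical, and the input is assumed to be already sorted (or at least grouped) on the first key.

```python
from itertools import groupby


def sort_within_groups(records, major_key, minor_key):
    """Sort each run of records sharing a major key by the minor key.

    The input must already be grouped on the major key; only one group
    is sorted at a time.
    """
    result = []
    for _, group in groupby(records, key=major_key):
        result.extend(sorted(group, key=minor_key))
    return result


# Already sorted on city (first key); refine by amount (second key).
records = [
    ("Bristol", 56),
    ("Bristol", 7),
    ("London", 23),
    ("London", 8),
]
refined = sort_within_groups(
    records,
    major_key=lambda rec: rec[0],
    minor_key=lambda rec: rec[1],
)
```

Because each group is sorted on its own, the work done in memory is bounded by the largest group rather than by the whole dataset.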



> SWG: Parameters

 major-key: key(s) on which groups are created.

 minor-key: key(s) on which records are to be sorted within each group.

 Max-core: maximum memory, in bytes, that the component is allowed to use.

 Allow-unsorted: set to true to allow input that is not sorted on the major-key.



SWG: Example



Sample Data

Input Output



> The GDE Debugger

 The GDE has a built-in debugger.

 To enable the Debugger, select Debugger: Enable Debugger.

 The Debugger Toolbar: Enable Debugger, Remove All Watchers, Add Watcher File, Isolate Components



> The GDE Debugger

 To add a Watcher File, select a flow and click Add Watcher File.

 To remove a Watcher File, click Remove All Watchers.

 To isolate a set of components, select the components to be isolated and click
Isolate Components. Watcher Files will automatically be placed into the graph by
the Debugger.

 Note that if the Watcher Files do not exist, the GDE builds them during the
first run only and uses the existing Watchers on successive runs.



> Diagnostic Ports: Reject, Error

 Reject: Input records that caused errors.


 Error: Error messages.



> Instrumentation Parameters: Reject-threshold

 A drop-down menu specifying the number of errors to tolerate.
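
How the reject and error ports relate to a tolerance on errors can be sketched like this. It is a Python model under assumptions, not Ab Initio behaviour: run_with_reject_threshold and its max_rejects argument are hypothetical, and the threshold is treated as a simple count of tolerated errors.

```python
def run_with_reject_threshold(records, transform, max_rejects):
    """Apply transform to each record, diverting failures to diagnostic lists.

    Returns (out, reject, error). Raises RuntimeError once the number of
    rejected records exceeds max_rejects.
    """
    out, reject, error = [], [], []
    for rec in records:
        try:
            out.append(transform(rec))
        except Exception as exc:
            reject.append(rec)          # the input record that caused the error
            error.append(str(exc))      # the corresponding error message
            if len(reject) > max_rejects:
                raise RuntimeError("reject threshold exceeded") from exc
    return out, reject, error


out, reject, error = run_with_reject_threshold(["1", "x", "3"], int, max_rejects=1)
```

With max_rejects=1 the bad record "x" is diverted and processing continues; with max_rejects=0 the same input stops the run at the first error.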



THANK YOU
