
Components

• A component is a program.

• Components may run on any computer running the Co>Operating System.

• Different components do different jobs.

• The particular work a component accomplishes depends on its parameter settings.

• Some parameters are computational metadata.

Confidential & Proprietary


Simple Components

• In these components, the record format metadata does not change from input to output.



The Filter by Expression Component

• Reads records from the input port and evaluates the select_expr parameter for each. If the expression is true (non-zero), the record is written to the out port.
• Optionally, if the expression is false (zero), the record is written to the deselect port.
• At least one port must be connected downstream.
• Both flows can be used.
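
The routing rule described above can be sketched in Python (an illustration only, not Ab Initio DML). The function name `filter_by_expression` and the sample ids are hypothetical; the only behavior taken from the slide is that a true (non-zero) result sends the record to out and a false (zero) result sends it to deselect.

```python
from typing import Callable, Iterable

Record = dict


def filter_by_expression(
    records: Iterable[Record],
    select_expr: Callable[[Record], object],
) -> tuple[list[Record], list[Record]]:
    """Route each record to the out or deselect list, like the two ports."""
    out, deselect = [], []
    for rec in records:
        # A true (non-zero) result selects the record; false (zero) deselects it.
        if select_expr(rec):
            out.append(rec)
        else:
            deselect.append(rec)
    return out, deselect


# Hypothetical sample data; the ids are made up for illustration.
records = [{"id": 121}, {"id": 215}, {"id": 345}]
out, deselect = filter_by_expression(records, lambda r: r["id"] > 215)
```

Returning both lists mirrors the option of connecting either port or both.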



Filter Data (Selection)
(figure-02)
1. Push “Run” button.

2. View monitoring information.


3. View output data.



Expression Parameter



Exercise 2: Data Filtering
(Selection)

• Using example graph figure-02.mp, change the select expression parameter of the Filter by Expression component to select records with id greater than 215.

• Run the application and examine the resulting data.



Keys
• A key identifies a field or set of fields used to organize a
dataset in some way.

• Single field: id
• Multiple field: { last_name; first_name }
• Modifiers: { id descending }

• Used for sorting, grouping, partitioning.

• (See Chapter 8 of the Data Manipulation Language Reference for more information on keys. Note: keys are also called collators.)
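
As a rough analogy in Python (not DML), the three key forms above map onto sort keys as follows. The sample records are hypothetical; only the field names id, last_name, and first_name come from the slide.

```python
from operator import itemgetter

# Hypothetical records; field names follow the key examples above.
people = [
    {"id": 345, "last_name": "Smith", "first_name": "John"},
    {"id": 121, "last_name": "Forth", "first_name": "Ann"},
    {"id": 212, "last_name": "Smith", "first_name": "Alice"},
]

# Single-field key: id
by_id = sorted(people, key=itemgetter("id"))

# Multiple-field key: { last_name; first_name }
by_name = sorted(people, key=itemgetter("last_name", "first_name"))

# Key with a modifier: { id descending }
by_id_desc = sorted(people, key=itemgetter("id"), reverse=True)
```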



The Sort Component

• Reads records from the input port, sorts them by key, and writes the result to the output port.



Sorting
(figure-03)



Sorting (figure-03)



Exercise 3: Sorting

• Using example graph figure-03.mp, change the key parameter of the Sort component to sort the data by first_name.

• Run the application and examine the resulting data.



More Complex Components

• In these components, the record format metadata typically changes (goes through a transformation) from input to output.



Data Transformation

Input record:
0345,090263John,Smith;

Input record format:
record
  decimal(",") id;
  date("MMDDYY") bday;
  string(",") first_name;
  string(";") last_name;
end

Operations shown: Drop, Reformat, Reorder, id+1000000

Output record format:
record
  decimal(7) id;
  string(8) last_name;
  date("YYYY.MM.DD") bday;
end

Output record:
1000345Smith   1963.09.02
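
The transformation on this slide can be traced with a small Python sketch (an illustration, not how the Co>Operating System parses records). The function name `reformat_record` is hypothetical, and the handling of two-digit years is an assumption noted in the code.

```python
from datetime import datetime


def reformat_record(raw: str) -> str:
    """Drop first_name, reformat bday, reorder fields, and add 1000000 to id."""
    # decimal(",") id: digits up to the first comma.
    id_text, rest = raw.split(",", 1)
    # date("MMDDYY") bday: fixed six characters.
    bday_text, rest = rest[:6], rest[6:]
    # string(",") first_name, then string(";") last_name.
    first_name, rest = rest.split(",", 1)
    last_name = rest.split(";", 1)[0]

    new_id = int(id_text) + 1000000
    # Assumption: two-digit years are taken as 19YY to match the slide
    # (63 -> 1963); Python's own %y would read 63 as 2063.
    bday = datetime.strptime(bday_text[:4] + "19" + bday_text[4:], "%m%d%Y")

    # decimal(7) id; string(8) last_name; date("YYYY.MM.DD") bday
    return f"{new_id:07d}{last_name:<8.8}{bday:%Y.%m.%d}"
```

Calling `reformat_record("0345,090263John,Smith;")` yields the id 1000345, the last name padded to eight characters, and the date 1963.09.02, concatenated in that order.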



The Reformat Component

• Reads records from the input port, reformats them according to a transform function, and writes the result records to the output (out0) port.
• Additional output ports (out1, ...) can be created by adjusting the count parameter.



Transform Function

• A transform function specifies the rules used to create the output record.

• Each field of the output record must be assigned a value. Partial output records are not allowed!

• The transform editor is used to create a transform function in a graphical manner.



Transform Editor



Text DML: Transform Function Syntax

• Functions look like:

output-variables :: name ( input-variables ) =
begin
  assignments
end;

• Assignments look like:

output-variable.field :: expression ;

(See Chapter 6 of the Data Manipulation Language Reference for more information on transform functions.)



A Look Inside the Reformat
Component

Input record fields: a b c

Output record fields: x y z



The transform function is the same at every step:

out :: trans(in) =
begin
  out.x :: in.b - 1;
  out.y :: in.a;
  out.z :: fn(in.c);
end;

1. Record arrives at input port: 9 45 QF
2. Record is read into component: 9 45 QF
3. Transform function is evaluated on 9 45 QF
4. Transform function yields a result record: 44 9 RG
5. Result record is written to output port: 44 9 RG
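
The walk-through above can be mimicked in Python as a per-record mapping. This is a sketch only: the slides do not define fn, so the version below is an assumption chosen because it reproduces the example value ("QF" becomes "RG"), and the name `reformat` is hypothetical.

```python
def fn(text: str) -> str:
    # Assumption: the slide does not define fn. Shifting each character to the
    # next code point reproduces its example ("QF" -> "RG").
    return "".join(chr(ord(ch) + 1) for ch in text)


def trans(rec: dict) -> dict:
    """Python analogue of the slide's transform function."""
    return {
        "x": rec["b"] - 1,   # out.x :: in.b - 1;
        "y": rec["a"],       # out.y :: in.a;
        "z": fn(rec["c"]),   # out.z :: fn(in.c);
    }


def reformat(records):
    # One result record is produced for every input record, in order.
    for rec in records:
        yield trans(rec)


result = list(reformat([{"a": 9, "b": 45, "c": "QF"}]))
```

Every output field (x, y, z) is assigned, matching the rule that partial output records are not allowed.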



Exercise 4: Reformat Data

• Using graph figure-04.mp, write a record format with an id from the simple dataset and a single name field of 20 characters.
• Write a transform function to produce a dataset in this format, passing through the id and concatenating first_name and last_name using string_concat.
• Run the graph and examine the results.
• Modify the transform to trim the spaces from the first name before concatenating it with the last name, so that the padding falls after the full name (“John Smith” followed by spaces) rather than between the names (“John”, spaces, “Smith”).



Data Aggregation

Input records:
0345Smith Bristol 56
0212Spade London 8
0322Jones Compton 12
0492West London 23
0121Forth Bristol 7
0221Black New York 42

Aggregated output:
Bristol 63
Compton 12
London 31
New York 42



Data Aggregation of
Sorted/Grouped Input

Sorted/grouped input        Aggregated output
0345Smith Bristol 56
0121Forth Bristol 7         Bristol 63
0322Jones Compton 12        Compton 12
0212Spade London 8
0492West London 23          London 31
0221Black New York 42       New York 42



The Rollup Component
• By default, Rollup reads sorted records from the
input port, aggregates them as indicated by key and
transform parameters, and writes the resulting
aggregated records on the out port.
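
A minimal Python sketch of aggregating sorted input, using the records from the aggregation slides. The field names (id, name, city, amount) and the helper `rollup_sorted` are assumptions for illustration; the slides show only the raw records and the per-city totals.

```python
from itertools import groupby
from operator import itemgetter

# Records from the aggregation slides; the field names are assumptions.
FIELDS = ("id", "name", "city", "amount")
rows = [
    ("0345", "Smith", "Bristol", 56),
    ("0212", "Spade", "London", 8),
    ("0322", "Jones", "Compton", 12),
    ("0492", "West", "London", 23),
    ("0121", "Forth", "Bristol", 7),
    ("0221", "Black", "New York", 42),
]
records = [dict(zip(FIELDS, row)) for row in rows]


def rollup_sorted(sorted_records, key, field):
    """Sum `field` over each run of consecutive records with the same key."""
    for key_value, group in groupby(sorted_records, key=itemgetter(key)):
        yield key_value, sum(rec[field] for rec in group)


# Rollup expects sorted input by default, so sort on the key first.
by_city = sorted(records, key=itemgetter("city"))
totals = list(rollup_sorted(by_city, "city", "amount"))
```

Because `groupby` only merges adjacent records, skipping the sort would split a city into several groups, which is why sorted input matters here.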



Built-in Functions for Rollup

• The following aggregation functions are predefined and are only available in the Rollup component:

avg max
count min
first product
last sum



Rollup Wizard

Note the use of an aggregation function in the expression



Exercise 6: Rollup Data

• Using example graph figure-05.mp, modify the transform function to count the number of records for the same city.

• Run the application and examine the results.



Joining Data
First input:
0345Smith Bristol 56
0212Spade London 8
0322Jones Compton 12
0492West London 23
0121Forth Bristol 7
0221Black New York 42

Second input:
0322970402 1242.50
0345970924 923.75
0121961211 12392.00
0492971123 234.12
0666950616 2312.10

Output:
0345Bristol 561997/09/24
0212London 81900/01/01
0322Compton 121997/04/02
0492London 231997/11/23
0121Bristol 71996/12/11
0221New York 421900/01/01



Joining Sorted Data

First input (sorted)        Second input (sorted)
0121Forth Bristol 7         0121961211 12392.00
0212Spade London 8
0221Black New York 42
0322Jones Compton 12        0322970402 1242.50
0345Smith Bristol 56        0345970924 923.75
0492West London 23          0492971123 234.12
                            0666950616 2312.10

Output:
0121Bristol 71996/12/11
0212London 81900/01/01
...



The Join Component

• Join performs a join of inputs. By default, the inputs to join must be sorted and an inner join is computed.

• Note: The following slides and the on-line example assume the join-type parameter is set to ‘Outer’, and thus compute an outer join.



Joining (figure-06)



A Look Inside the Join
Component*

*join-type = Full Outer join

Input fields: in0 = a b c, in1 = a q r

Align inputs by key

out :: fname(in0, in1) =
begin
  ...
  ...
  ...
  ...
  ...
end;

Output fields: a x q



Inputs are aligned by key a. The transform function is the same at every step:

out :: join(in0, in1) =
begin
  out.a :: in0.a;
  out.x :1: in1.r + 20;
  out.x :2: in0.b + 10;
  out.q :1: in1.q;
  out.q :2: "XX";
end;

1. Records arrive at inputs: in0 = G 234 42, in1 = G NY 4
2. Records read into component
3. Keys compared
4. Aligned records passed to function: G 234 42 and G NY 4
5. Transform evaluated
6. Result record generated: G 24 NY
7. Result record written: G 24 NY
8. Records arrive at input: in0 = H 79 23, in1 = K IL 8
9. Records read into component
10. Keys compared
11. Aligned records passed to function: H 79 23 only; K IL 8 stays at the input
12. Transform evaluated (no in1 record is available, so the priority 2 assignments apply)
13. Result record generated: H 89 XX
14. Result record written: H 89 XX
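
The prioritized assignments in this walk-through (:1: tried first, :2: as the fallback) can be approximated in Python. This is a sketch under stated assumptions: `first_available` is a hypothetical helper, a missing input is modeled as `None`, and in0 is assumed to be present, as it is in both cases shown on the slides.

```python
def first_available(*candidates):
    """Return the result of the first candidate that can be evaluated.

    Mimics prioritized assignments (:1:, :2:): a candidate that touches a
    missing input raises TypeError and the next priority is tried.
    """
    for candidate in candidates:
        try:
            return candidate()
        except TypeError:
            continue
    raise ValueError("no candidate could be evaluated")


def join_transform(in0, in1):
    """Python analogue of the slide's join transform; in1 may be None."""
    return {
        "a": in0["a"],                                   # out.a :: in0.a;
        "x": first_available(lambda: in1["r"] + 20,      # out.x :1: in1.r + 20;
                             lambda: in0["b"] + 10),     # out.x :2: in0.b + 10;
        "q": first_available(lambda: in1["q"],           # out.q :1: in1.q;
                             lambda: "XX"),              # out.q :2: "XX";
    }


# Keys match (G and G): both records reach the transform.
matched = join_transform({"a": "G", "b": 234, "c": 42}, {"a": "G", "q": "NY", "r": 4})
# Keys differ (H and K): the H record is passed without an in1 record.
unmatched = join_transform({"a": "H", "b": 79, "c": 23}, None)
```

The two calls reproduce the result records from the slides: G 24 NY for the matched pair and H 89 XX for the unmatched record.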
Exercise 7: Join Data

• Using example graph figure-06.mp, modify the transform function to join visits.dat and last-visits.dat so that no records are rejected.

• Run the application, and examine the results. The Unmatched Last Visits dataset should be empty.



Exercise 8 (if time):
Join Retaining All Fields
• Building upon the graph you created in Exercise 7, create a new output record format and transform function to join visits.dat and last-visits.dat according to the following rules:
  • Retain all fields from each dataset.
  • Supply defaults where necessary.

• Change the necessary parameters, run the application, and examine the results.



Mouse and Key Shortcuts

Action                                                On What?                      Does This
Shift-<double click>                                  Components                    Open Editor
<double click>                                        A parameter in Parameter Tab  Open Editor
<double click>                                        Port                          Open Record Format Editor
Drag input field to blank space in output field pane  Transform editor              Adds field to output record format
Component: Gather Logs

• Reads logging records from multiple flows connected to the input port and writes them to the specified file outside of the application’s transactional context. The start-text and end-text parameter values are written to the log at the beginning and end.



Component: Replicate

• Copies records from the input port to multiple flows connected to the output port.



Sample Graph



Exercise 9: Creating a
Reformatting Application

• Create a new graph that:
  • Reads data from simple.dat with record format simple.dml.
  • Reformats that data with simple-out.xfr.
  • Writes the results to simple-out.dat with record format simple-out.dml.

• Run it and verify the results.



Exercise 10:
Obtaining Log Information
• Add a Gather Logs component to the application.

• Configure the component. Don’t forget to provide a log file name.

• Connect it to the Reformat’s log port.

• Run the application.

• View the log file on the server.



Exercise 11: Creating an
Aggregation Application
• Create an application that:
  • Reads data from visits.dat with record format visits.dml.
  • Sorts it by city.
  • Aggregates it (using the Rollup component) by city with visits-to-city-rollup.xfr.
  • Writes the results to visits-to-city.dat with record format visits-to-city.dml.
  • Logs input, output, and intermediate events.



Computing without Sort

Some components do not require pre-sorted inputs.

These components work by keeping some or all of the inputs in memory.

These components usually have a sorted-input parameter, or have the word hash in their name.

There are rules of thumb about when to use “in-memory” sorting or grouping vs. sorting before the component.
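
The in-memory approach can be illustrated with a hash-based aggregation in Python. This is a conceptual sketch, not the Rollup component's implementation; the helper name `rollup_in_memory` and the field names are assumptions, and the records are the unsorted ones from the earlier aggregation slide.

```python
from collections import defaultdict


def rollup_in_memory(records, key, field):
    """Sum `field` per key without requiring sorted input.

    One running total per distinct key is kept in memory, which is the
    trade-off against sorting the data first.
    """
    totals = defaultdict(int)
    for rec in records:
        totals[rec[key]] += rec[field]
    return dict(totals)


# Unsorted records from the aggregation slide; field names are assumptions.
records = [
    {"city": "Bristol", "amount": 56},
    {"city": "London", "amount": 8},
    {"city": "Compton", "amount": 12},
    {"city": "London", "amount": 23},
    {"city": "Bristol", "amount": 7},
    {"city": "New York", "amount": 42},
]
totals = rollup_in_memory(records, "city", "amount")
```

Memory use grows with the number of distinct keys, which is one reason the choice between in-memory grouping and sorting first depends on the data.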



Exercise 12:
Rollup without Sort
• Open figure-05.

• Save As... to figure-05-nosort.

• Delete the Sort component.

• Change the sorted-input parameter of the Rollup component to “in-memory…”

• Run the application and examine the results.



Exercise 13:
Join without Sort

• Open figure-06.

• Save As... to figure-06-nosort.

• Delete both Sort components.

• Change the sorted-input parameter of the Join component to “in-memory…”

• Run the application and examine the results.



Join
• Join performs a join of inputs. By default, the inputs
to join must be sorted and an inner join is computed.

• Options:
• join-type: Inner, Outer or Explicit (other).
• dedupn: Call the transform function only once for any
matching record on input n. Defaults to false.
• record-requiredn: Call transform function for all keys,
even if there is not a matching record for input n.
Defaults to true. Only used if join-type is Explicit.



Inner Join:

An inner join produces an output record only when a given key is present on ALL inputs. If the key is duplicated on any input, each (duplicate) key is matched with the other inputs.

in0      in1    result
a,me     b,7    b,we,7
b,we     b,8    b,we,8
b,she    c,9    b,she,7
c,he            b,she,8
d,us            c,he,9
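
The inner join table above can be reproduced with a short Python sketch. The nested loop is chosen for clarity and is not a claim about how the Join component works internally; the function name `inner_join` is hypothetical.

```python
def inner_join(in0, in1):
    """Pair every in0 record with every in1 record that has the same key."""
    result = []
    for key0, value0 in in0:
        for key1, value1 in in1:
            if key0 == key1:
                result.append((key0, value0, value1))
    return result


in0 = [("a", "me"), ("b", "we"), ("b", "she"), ("c", "he"), ("d", "us")]
in1 = [("b", 7), ("b", 8), ("c", 9)]
result = inner_join(in0, in1)
```

Keys a and d appear only on in0, so they produce no output, while the duplicated key b yields every combination of its in0 and in1 records.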



Full Outer Join:

A full outer join produces an output record whether there is a match for a given key on an input or not. If the key is duplicated on any input, each (duplicate) key is matched with the other inputs. The user should provide default values.

in0      in1    result
a,hi     b,7    a,hi,999
b,lo     b,8    b,lo,7
c,bye    c,9    b,lo,8
         d,1    c,bye,9
                d,XXX,1
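
A Python sketch of the full outer join table above, with the defaults taken from the slide's result column (999 when in1 has no match, XXX when in0 has no match). The function name `full_outer_join` and its parameter names are hypothetical.

```python
def full_outer_join(in0, in1, default0="XXX", default1=999):
    """Join on key, keeping unmatched records from both inputs."""
    result = []
    # Visit every key seen on either input, in sorted order.
    for key in sorted({k for k, _ in in0} | {k for k, _ in in1}):
        # An input with no record for this key contributes its default value.
        values0 = [v for k, v in in0 if k == key] or [default0]
        values1 = [v for k, v in in1 if k == key] or [default1]
        for value0 in values0:
            for value1 in values1:
                result.append((key, value0, value1))
    return result


in0 = [("a", "hi"), ("b", "lo"), ("c", "bye")]
in1 = [("b", 7), ("b", 8), ("c", 9), ("d", 1)]
result = full_outer_join(in0, in1)
```

Supplying the defaults as parameters reflects the slide's point that the user should provide default values for the unmatched side.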



Joins can be arbitrarily
complex in Ab Initio

The Join component is capable of combining its input in many ways. It is also capable of combining more than two inputs.

See the Component Reference or the Online Help for complete information about Join.
