Ab Initio Graphical Development Environment Version 1.11 Release Notes
Installation
Operating System Requirement: Windows 95, Windows 98, Windows NT 4.0, Windows 2000, or
Windows XP.
IMPORTANT: Before installation, exit all running instances of the GDE. Ab Initio strongly
recommends exiting ALL other applications.
To install from a CD-ROM, run Setup.exe in the Ab Initio GDE directory.
NOTE: By default, older versions of the GDE cannot read graphs written by this version of the
GDE. However, you can save graphs in formats that older versions of the GDE can read. See Save in
Historical Formats below.
NOTE: For connecting to the Enterprise Meta Environment and the Ab Initio Repository, you must
install Repository version 2.11 or greater. Ab Initio allows you to upgrade Repositories without
upgrading Co>Operating Systems; see the Co>Operating System release notes.
System Requirements
On Windows 95/98:
On Windows NT/XP/2000:
Checkin puts graphs, record formats, and transforms into the Ab Initio Repository so they can be
analyzed. The repository keeps all versions of the files, so you can recover lost work. Checkout
retrieves a version of your files from the Repository. Analysis generates dependency information
so you can see relationships within your business logic. These functions are now done with
simple wizards.
The Checkin, Checkout and Analysis commands will not work with Co>Operating System
versions 2.10 and earlier. These operations are now implemented as air commands executed on
the sandbox host machine. For details about the air commands, see the Repository guide.
Also, the Repository menu has been simplified. For more details, see the online help.
Editing parameters in Sandboxes is much simpler. Formerly, changes made to parameters had to
be confirmed individually; now parameters are stored in a file named .air-project-parameters,
which you must lock to edit. That file is checked into and out of the
Repository just like graphs, record format files, and transform files.
The sandbox parameters editor now has a gray background to show cells that are read-only.
You can now check in parameters files without the GDE by using the air project import
command. You can create, edit, and delete sandbox parameters with the air sandbox parameter
command. For more details, see the Repository guide.
The Sandbox Parameters Editor has a new column called Private Value. If Private Value is
checked for a parameter, changes to the parameter's value or interpretation made in a sandbox
will never get checked into the Repository. Therefore, the parameter's value is private to the
sandbox.
You can change private values even if the sandbox parameters file (.air-project-parameters)
is not locked. When Private Value is checked, the value column shows the private
value; otherwise, it shows the shared public value. To change the public value of a private value
parameter, uncheck Private Value, edit the value, and check Private Value again.
Switch parameters and common project parameters always have private value checked.
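The Private Value rules above can be summarized in a short Python sketch. The class and function names are hypothetical and chosen only for illustration; this is not how the GDE stores parameters.

```python
from dataclasses import dataclass


@dataclass
class SandboxParameter:
    # Hypothetical model of one row in the Sandbox Parameters Editor.
    name: str
    public_value: str
    private_value: str = ""
    private: bool = False  # state of the Private Value check box


def displayed_value(param):
    """Value column: private value when Private Value is checked, else public."""
    return param.private_value if param.private else param.public_value


def can_edit(param, parameters_file_locked):
    """Private values are editable without the lock; public values need it."""
    return param.private or parameters_file_locked


db = SandboxParameter("DB", public_value="shared_db", private_value="my_db", private=True)
print(displayed_value(db))         # my_db
print(can_edit(db, False))         # True
db.private = False
print(displayed_value(db))         # shared_db
print(can_edit(db, False))         # False
```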
When the remote Co>Operating System is running on Windows, you can choose Windows
Native (DCOM) as a connection method. This uses Windows networking and security to start
Co>Operating System services. Formerly, you had to configure either an rexec or telnet service
on the remote Windows machine. For details about configuration and security issues, see the
Co>Operating System release notes.
Key Files
If you have not yet received a GDE key file for your computer:
Your Ab Initio contact will use the information to send you a GDE key file for your computer.
Designated contacts can call 1-781-301-2588 or email key@abinitio.com.
To move a key file between computers, you must remove it from the old computer first, and then
request a key file for the new computer. (Once you remove a key file from your computer, you
must request a new one to be able to run the GDE again.)
Uninstall the GDE through the "Add or Remove Programs" utility in the Windows Control
Panel.
If this is the only GDE installation on your computer that is using the key file, the uninstall
program will ask you whether or not you want to remove the file.
Click Yes to remove the key file. The uninstall program will create a file containing the
key removal information.
Send this file to a designated Ab Initio contact in your company to register the license
removal.
If another installation is using the same key file, the uninstall program will inform you and
will not generate the key removal text. You must uninstall all products that use the key file
in order to receive the key removal text.
Read-only Files
When you use the connection method called Ab Initio Server/Rexec or Ab Initio Server/Telnet,
the GDE forbids editing of write-protected files. This can save you from losing work.
Interpretation Fixups
$-substitution is no longer available as a choice for embedded transform parameters. This helps
avoid confusion between DML's internal environment variable expressions and the GDE's macro
parameter substitution.
Watchers
A watcher shows the contents of a flow after your graph has run. It is similar to a temporary file,
but adding and removing watchers does not modify a graph. You use these to debug your graph.
To add a watcher to a flow, first choose Debugger>Enable Debugger. Then select the flow and
choose Debugger>Add Watcher to Flow. Finally, re-run the graph. Watchers appear on flows as
rounded rectangles, and their appearance changes once they are filled with data. To view data on a filled
watcher, right-click it and choose View Data.
Isolation Mode
To aid debugging of graphs, you may run a selected subset of components in the graph without
other components. Select the components that you wish to test in isolation, then choose
Debugger>Isolate Selected Components. Components that will not be run in isolation mode
will be grayed out. Using Isolation Mode mimics the well known debugging technique of copying
and pasting a subset of the graph to a separate document during development.
Context-Specific Errors
If your graph fails, context-specific error messages appear in the job output window. These carry
much more information than the older plain-text error messages. You use these to navigate from
error to component and back:
An error is followed by a link, which brings up the properties of the component that failed
when clicked.
Multiple errors for the same component on different partitions are hidden by default;
clicking a link shows or hides them.
Many more details are available.
A red LED indicates a failed component. Double-clicking it brings up the associated error
and the component properties.
Hovering over a red LED shows the error text.
The F4 key cycles to the next error message and associated component.
The phases shown in the GDE now more closely match the phases used when the graph is run.
Formerly, a phase break in a subgraph was invisible from the top-level graph; now these appear
as phase ranges (for example, 5-7) next to subgraph components. In addition, phase breaks
caused by Intermediate File components will become visible on your canvas. When you open a
graph saved by a previous version of the GDE, the phases will be renumbered, but the graph's
components will continue to run in the same order as they were run by the earlier GDE.
Converting graphs saved in 1.8 and earlier to 1.10 format preserves all phase breaks.
The phase toolbar has been simplified. The "phase number" edit box has been eliminated, and
the increment and decrement phase buttons now change the phase of the selected components,
rather than change the number in the edit box.
To package a graph and all related files (record formats, transforms, database configurations and
so on) choose Run>Package for Support. This reads all files into a compressed archive file,
which you then can mail to support@abinitio.com. The package produced can be read by WinZip.
This saves you time when reporting problems to support.
Undo on large graphs should be much faster. Formerly, the size of the Undo information was proportional to
the size of your graph, so Undo slowed down as graphs became large. The new implementation
records only enough information to reverse your changes.
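The general technique of recording only what is needed to reverse a change can be sketched as follows. This is an illustration in Python with hypothetical names, not the GDE's actual implementation: each edit logs the single old value it overwrote, rather than a copy of the whole document.

```python
class GraphDocument:
    """Toy document whose undo log grows with the number of edits, not with
    the size of the document (hypothetical model, for illustration only)."""

    def __init__(self):
        self.properties = {}
        self._undo_log = []

    def set_property(self, key, value):
        # Record just the previous state of this one key.
        had_key = key in self.properties
        self._undo_log.append((key, had_key, self.properties.get(key)))
        self.properties[key] = value

    def undo(self):
        """Reverse the most recent edit; return False if nothing to undo."""
        if not self._undo_log:
            return False
        key, had_key, old_value = self._undo_log.pop()
        if had_key:
            self.properties[key] = old_value
        else:
            del self.properties[key]
        return True


doc = GraphDocument()
doc.set_property("phase", 1)
doc.set_property("phase", 2)
doc.undo()
print(doc.properties)  # {'phase': 1}
```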
The component organizer can optionally be docked, so it will not obscure your graphs. The
information in the details pane has been simplified.
You may add "Database" folders to the Component Organizer. When opened, the database
folders contain a folder for each registered user of the database. The user folders contain a table
component for each table in their schema. These tables can be dropped as table components on
graphs.
To speed execution of your programs, you may optionally deploy your graph and compile the
DML transform files. This is rarely needed, but on some machines may improve performance by
20%. Choose the directory for the compilation on the Compiled Transforms tab in the Run
Settings dialog. Then, choose Run>Deploy>with Compiled Transforms.
Speedometer
You can view your graph's performance by choosing Insert>Speedometer. A speedometer
appears. To decide which information to view, choose View>Kb/sec, View>CPUs/sec or
View>Records/sec. Kb/sec is the rate at which data flowed through your graph during the last
tracking interval. CPUs/sec is the amount of CPU time, in CPU seconds per elapsed second, dedicated to running
your graph; when the graph is using all the CPU power, this should equal the number of
processors. Records/sec is the rate at which records flowed through your graph during the last
tracking interval.
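As a rough illustration of how the three readings relate to raw counters, the following Python sketch computes them for one tracking interval. The function name, its arguments, the sample numbers, and the use of 1024 bytes per Kb are assumptions made for illustration; they are not taken from the GDE.

```python
def speedometer_readings(bytes_moved, records_moved, cpu_seconds, interval_seconds):
    """Return (kb_per_sec, cpus_per_sec, records_per_sec) for one interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    kb_per_sec = bytes_moved / 1024 / interval_seconds
    # CPU seconds consumed per elapsed second; on a fully busy machine this
    # approaches the number of processors.
    cpus_per_sec = cpu_seconds / interval_seconds
    records_per_sec = records_moved / interval_seconds
    return kb_per_sec, cpus_per_sec, records_per_sec


# Hypothetical interval: 10 MB and 50,000 records in 5 seconds, using
# 20 CPU seconds (for example, 4 processors fully busy).
readings = speedometer_readings(10_485_760, 50_000, 20.0, 5.0)
print(readings)  # (2048.0, 4.0, 10000.0)
```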
This component attaches a new key field to a record, such that each value of the new key
corresponds to one value of the existing key. This new key is called the surrogate key. You use
this component:
When you populate a data warehouse with data from external source systems, it is good
practice to generate new surrogate keys for the items in the warehouse, rather than
reusing keys from the source systems. This isolates the warehouse from changes in the
source systems.
When the key is long or is logically composed of several fields, using a short, single-field
surrogate key saves space and computation time. For example, if the key is 200 bytes of
address and post code information, the surrogate key might be a 4-byte integer.
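The idea of a surrogate key can be sketched in a few lines of Python. This is a minimal illustration of the concept, not the component's implementation; the class, function, and field names are hypothetical.

```python
from itertools import count


class SurrogateKeyAssigner:
    """Hands out one small integer per distinct value of the existing key."""

    def __init__(self, start=1):
        self._next = count(start)
        self._by_natural_key = {}

    def assign(self, natural_key):
        # The same natural key always maps to the same surrogate key.
        if natural_key not in self._by_natural_key:
            self._by_natural_key[natural_key] = next(self._next)
        return self._by_natural_key[natural_key]


def attach_surrogate_keys(records, key_fields, assigner, surrogate_field="surrogate_key"):
    """Return copies of the records with the surrogate key field attached."""
    out = []
    for record in records:
        natural_key = tuple(record[f] for f in key_fields)
        out.append({**record, surrogate_field: assigner.assign(natural_key)})
    return out


source = [
    {"address": "1 Main St", "post_code": "02421", "amount": 10},
    {"address": "9 Elm Rd", "post_code": "02139", "amount": 25},
    {"address": "1 Main St", "post_code": "02421", "amount": 40},
]
keyed = attach_surrogate_keys(source, ["address", "post_code"], SurrogateKeyAssigner())
print([r["surrogate_key"] for r in keyed])  # [1, 2, 1]
```

Here a multi-field address key is replaced by a single small integer, and repeated occurrences of the same address receive the same value.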
Both the Host Profile Dialog and the Repository>Settings dialog include SSH as a connection
method. Use this to connect to SSH servers. When you choose SSH as a connection method in
the Host Profile Dialog, the Settings button brings up the SSH Settings Dialog which includes
the following options:
Private Key File: Set this to the file containing your private key. If the private key file itself
has been encrypted, enter the passphrase in the password text box of the Host Profile
dialog.
SSH Protocol: Choose the protocol of the connection. Ab Initio's SSH1 implementation is
compatible with third-party SSH products; to use SSH2 keys, you must use the format
provided by ab_keygen (described below).
Use compression: When you select this option, the SSH client compresses data before
sending it, and the SSH server should decompress it. Use this option when operating over a
slow connection.
Allow empty password: Select this if you are using public key/private key
authentication and the GDE does not need to prompt for a passphrase to decrypt your
private key file. In that case, leave the password text box in the Host Profile dialog empty.
Response Timeout: Enter the number of seconds to wait for a response from the SSH
server.
SSH Port: Enter the port number of the SSH server. To use the standard SSH port (22 on
most machines), enter 0.
To generate SSH1 or SSH2 public key-private key pairs, click Generate Key in the SSH Settings
Dialog. The Ab Initio SSH Key Generator dialog appears.
2. On the machine you wish to connect to via the SSH protocol, create a .ssh subdirectory
in your login directory.
3. In that .ssh subdirectory, create a file (if one does not already exist) named
authorized_keys (for SSH1 keys) or authorized_keys2 (for SSH2 keys).
4. In that file, do one of the following:
o Copy the contents of the Public Key text box if the file is new
o Append the contents of the Public Key text box if the file exists
Use copy and paste for OpenSSH compatible keys, or the Save Private Key button for
private format keys.
5. Be sure the protection on the .ssh directory grants no permissions to group or
others (mode 700). Also, make sure your authorized_keys file is readable and writable
only by you (mode 600).
6. In the Key Passphrase text box, enter a passphrase if desired. (You will need to enter this
passphrase in the GDE into the Password text box of the Host Profile Dialog.) If you
leave the passphrase blank, you should check Allow Empty Password in the SSH
Settings dialog in the GDE.
7. Save the private key in a secure place on your desktop machine. (You will need to enter
the name of this file in the GDE as the Private Key file in the SSH Settings Dialog.)
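The server-side part of these steps (creating the .ssh directory, adding the public key to the authorized keys file, and setting the 700 and 600 protections) can be sketched in Python. This is a minimal sketch assuming a UNIX host; the function name is hypothetical, the key text is a placeholder, and the demonstration writes to a temporary directory rather than a real login directory.

```python
import os
import stat
import tempfile
from pathlib import Path


def install_public_key(login_dir, public_key, ssh2=False):
    """Add public_key to authorized_keys (or authorized_keys2 for SSH2 keys)."""
    ssh_dir = Path(login_dir) / ".ssh"
    ssh_dir.mkdir(exist_ok=True)
    os.chmod(ssh_dir, 0o700)  # no access for group or others
    key_file = ssh_dir / ("authorized_keys2" if ssh2 else "authorized_keys")
    # Append mode creates the file if it is new and appends if it exists.
    with open(key_file, "a", encoding="utf-8") as handle:
        handle.write(public_key.strip() + "\n")
    os.chmod(key_file, 0o600)  # owner read/write only
    return key_file


# Demonstration against a throwaway directory standing in for the login directory.
with tempfile.TemporaryDirectory() as fake_login_dir:
    path = install_public_key(fake_login_dir, "PLACEHOLDER-PUBLIC-KEY user@desktop")
    dir_mode = stat.S_IMODE(os.stat(path.parent).st_mode)
    file_mode = stat.S_IMODE(os.stat(path).st_mode)
    print(path.name, oct(dir_mode), oct(file_mode))
```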
Many bugs were fixed in the key editor. In particular, you can now drag and drop fields into key
specifiers.
Tracking Enhancements
The tracking dialogs now include the bytes/sec and records/sec columns. These give the
instantaneous dataflow rate for programs and flows.
The parameters editor is broken into two grids, a main grid and a details grid, like the record
format editor. Also, you may reorder columns.
Sandboxes
This release supports organizing Ab Initio applications into project directories called Sandboxes.
When you save your graphs (and any related .dml, .xfr, and .dbc files) into sandboxes, you get
the following benefits:
The graph's job script is saved automatically whenever you save your graph. (To turn this
off, choose Repository>Settings and uncheck Save Script When Graph
Saved To Sandbox.)
Sandbox parameters allow you to easily define variables that are shared between
multiple graphs, without having to understand UNIX shell syntax.
Sandboxes support easy migration from development, through test, to production with
Switch Parameters and Dependent Parameters.
If you develop your applications using sandboxes, you can easily check them into the Ab
Initio Repository.
To define a new sandbox, choose Repository>Create Sandbox. This will create some hidden
files (.air-project-parameters and .project.ksh) used to define sandbox parameters, and will
also create the subdirectories that organize your application files. These are:
You can reference each of these subdirectories from graphs by using sandbox parameters. The
parameter names are DML, XFR, MP, DB, and RUN. You can define additional sandbox
parameters; to do so, choose Repository>Edit Sandbox. Any graph saved in the sandbox (or
any subdirectory of the sandbox) can refer to the sandbox parameters using $-reference syntax
(see Parameter Interpretations below).
To make the parameters available when you are logged into the host directly, you can "dot" in the
file ab_project_setup.ksh, which is automatically generated in every sandbox.
To use sandboxes, currently you must use the Korn shell (the default) in your Run>Settings Host
Profile, and the Repository>Settings mode must be Source Code Control. Then, create a
sandbox with Repository>Create Sandbox, and save your graph to any subdirectory of that
sandbox. The status bar should show the current sandbox.
This release of the GDE (when used with version 2.8 of the Co>Operating System) adds support
for application source-code control and project migration using the Repository. The new facilities
are as follows:
Repository>Check In copies graphs, related files, and sandbox parameters into the
repository. These graphs are then available for other developers in your group.
Repository>Check Out copies graphs, related files, and sandbox parameters out of the
repository into your private working area.
Locking support allows multiple developers to work in the same project without conflicts.
Choose File>Lock.
Version control: projects can be checked into the Repository using a symbolic tag, and
then that version can be retrieved using that same tag.
Revision history: you can retrieve past versions of graphs and related files; differences
are available from the Web interface.
Administrators can create shared projects with Repository>Create Project and edit
checked-in sandbox parameters with Repository>Edit Project.
For more information, please see the Guide to Managing Technical Metadata.
Improved On-line Help
The DML and Component reference material in the GDE on-line help is significantly updated and
enhanced.
Automatic flow buffering nearly eliminates the possibility of deadlocked graphs. Automatic flow
buffering examines your graph as you are building it, and adds flow buffering on any flow that
may cause a deadlock. Flow buffers are shown with a blue dot. For more information on deadlock
and flow buffering, see the on-line help.
Automatic flow buffering is activated by default when you open a new graph. To toggle it for
graphs that were first built using an older version of the GDE, choose Edit>Automatic Flow
Buffering.
The Graph Parameters Editor allows you to view and modify all the parameter settings for all of
the components in your graph at once. To display the Graph Parameters Editor, choose
Edit>Parameters. The Graph Parameters Editor replaces (and augments) the old
Run>Parameters dialog.
The editor supports the following operations:
View>Columns
Shows a subset of parameter attributes, by hiding columns. You may show or hide any
column.
When you run this graph from the GDE, you are prompted for the INPUT_FILE_URL
parameter's value in the Test Parameters dialog.
When you run this graph as a deployed script, you must pass the URL as the first
argument on the command line; otherwise the script prints an error message.
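The behavior of a deployed script that requires the URL as its first argument can be sketched as follows. This Python sketch only illustrates the check; deployed scripts are Korn shell scripts, and the function name, script name, URL, and error text here are hypothetical.

```python
def input_file_url(argv):
    """Return the first command-line argument, or fail with an error message.

    argv mimics sys.argv: argv[0] is the script name.
    """
    if len(argv) < 2 or not argv[1].strip():
        raise ValueError("missing required first argument: INPUT_FILE_URL")
    return argv[1]


print(input_file_url(["my_graph.ksh", "file://host/data/in.dat"]))

try:
    input_file_url(["my_graph.ksh"])
except ValueError as error:
    print("error:", error)
```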
Visual Cues
The GDE now provides more visual cues about your graph as you are developing it:
A key icon is displayed on components that have key parameters. To edit the key
specifier, double-click the icon.
An in-memory icon is displayed on Rollup, Join, and Scan components. This icon
shows whether the component expects sorted input, or whether it operates in-memory
and therefore its input need not be sorted.
To enable or disable visual cues, choose File>Preferences and select Show In Memory Icon or
Show Component Key.
Hovers
Letting the mouse pointer hover over parts of your graph displays tool tips, as follows:
To enable or disable hovers, choose File>Preferences, and check Enable Tool Tips.
The Control Server provides remote access services for the Co>Operating System. These
services include:
File transfer
Graph execution
Monitoring and control
From the Method list box of the Host Profile Dialog, choose Ab Initio Server/Rexec to start the
Control Server with the Rexec protocol, and thereafter use its File Transfer services instead of
FTP.
Uses a single port for communication. This makes it possible to connect through a firewall
using SSH port forwarding. (Contact Ab Initio for support configuring SSH.)
Provides more reliable file transfer than FTP.
Removes the dependence on Microsoft Internet Explorer's Internet settings. For example,
the Internet settings may say "cache recently-used pages", which might cause the GDE
to retrieve graphs from a remote host that are not up-to-date.
Removes dependence on the format of the remote Operating System's FTP service. For
example, the FTP-based file transfer service must read the machine-generated directory
listing to determine file size and time. Errors in reading might incorrectly prepend the file
creation time to the remote file name.
Parameter Interpretations
1. From the Parameters tab of the Properties dialog, click More, then choose the
interpretation from the drop-down list box.
2. From the Interpretation column of the Graph Parameters Editor (Edit>Parameters),
choose the interpretation.
The GDE chooses the default interpretation from the parameter type, as follows:
Subgraph Parameters
A parameter may be added to a subgraph or graph, and then referred to using the $identifier
or ${identifier} syntax. In particular, you may define a parameter on a subgraph, and refer to
it from components within the subgraph. This greatly enhances your ability to create reusable
subgraph components.
More specifically, to resolve parameter P's reference to $VARIABLE, the GDE uses the first value
found in:
A component parameter (using P's containing component)
A subgraph parameter (using P's component's containing graph G1, then G1's containing
graph G2, and so on)
A top-level graph parameter
A project parameter
A shell variable defined in a host profile's Host Setup commands
A UNIX environment variable
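The lookup order above amounts to a first-match search through an ordered list of scopes. The following Python sketch illustrates that idea only; the function name, scope labels, and all parameter names and values are hypothetical.

```python
def resolve_reference(name, scopes):
    """Return (scope_label, value) from the first scope that defines name."""
    for label, values in scopes:
        if name in values:
            return label, values[name]
    raise KeyError(f"unresolved reference: ${name}")


# Scopes listed from innermost to outermost, in the order given above.
scopes = [
    ("component parameter", {"LIMIT": "0"}),
    ("subgraph parameter (G1)", {"ROOTNAME": "JanData"}),
    ("subgraph parameter (G2)", {"ROOTNAME": "Unused", "REGION": "east"}),
    ("top-level graph parameter", {"RUN_DATE": "20020101"}),
    ("project parameter", {"DML": "/proj/dml", "REGION": "west"}),
    ("host setup shell variable", {"TMPDIR": "/var/tmp"}),
    ("UNIX environment variable", {"HOME": "/home/dev"}),
]

print(resolve_reference("ROOTNAME", scopes))  # ('subgraph parameter (G1)', 'JanData')
print(resolve_reference("REGION", scopes))    # ('subgraph parameter (G2)', 'east')
print(resolve_reference("HOME", scopes))      # ('UNIX environment variable', '/home/dev')
```

The nearer scope wins: REGION resolves to the subgraph's value even though a project parameter of the same name exists.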
For example, let's define a reusable subgraph component with subgraph parameters. It should
read an input stream and write two output files: the selected records and the deselected records.
The two output files should have the same root names and different extensions, for example,
JanData.selected and JanData.discarded. To build this, do the following:
1. Add a parameter named ROOTNAME to the subgraph (use the Graph Parameters Editor
to do this). Make the scope be Formal (this is an input to the subgraph) and leave the
value blank.
2. Within the subgraph, define the URLs for the two output datasets to be
$ROOTNAME.selected and $ROOTNAME.discarded.
Now, one instance of the subgraph component may set ROOTNAME to JanData and another
may set ROOTNAME to FebData. Each instance's datasets will have the proper URLs: the first
with JanData.selected and JanData.discarded, and the second with
FebData.selected and FebData.discarded.
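The effect of the ROOTNAME example can be imitated with Python's string.Template, which happens to accept the same $identifier and ${identifier} forms. This is only a stand-in to show the expected results; it is not the GDE's substitution mechanism, and the function name is hypothetical.

```python
from string import Template

# URLs as defined inside the subgraph.
OUTPUT_URLS = ["$ROOTNAME.selected", "${ROOTNAME}.discarded"]


def instance_urls(rootname):
    """Expand the subgraph's dataset URLs for one instance's ROOTNAME value."""
    return [Template(url).substitute(ROOTNAME=rootname) for url in OUTPUT_URLS]


print(instance_urls("JanData"))  # ['JanData.selected', 'JanData.discarded']
print(instance_urls("FebData"))  # ['FebData.selected', 'FebData.discarded']
```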
Since the syntax $identifier may refer to a subgraph parameter, where it formerly (in 1.7 and
earlier) referred to a Shell environment variable, the GDE will use "Legacy 1.7 Code Generation"
if it finds a possible conflict. To disable "Legacy 1.7 Code Generation", choose Run>Settings,
click the Script tab, and uncheck the Legacy 1.7 Code Generation check box.
Other Changes
Host Profiles
In earlier versions of the GDE, a graph could be saved on a different host from the one where it ran; that is,
the "Save" location could differ from the "Run" location. In this version, these two locations must
be the same; a graph runs using the host profile that opened it. When you open a remote graph,
the host profile in the Run Settings dialog changes to the opener's host profile. This may change
the behavior of the graph, especially if you have specified Host Setup commands.
To keep the profile from switching, in the File>Open dialog's host list box, choose the host profile
for the machine on which you want to run the graph.
The user ID and password are now stored in the Windows registry, instead of in host profile
(*.aih) files. When you access an old host profile with 1.8, the GDE will rewrite it to remove the
user ID and password. Each developer who opens a graph with such a host profile will need to
re-enter the user ID and password, which are then stored in the registry. Host profiles now describe only
host-specific items, rather than host-specific and user-specific items.
Since host profiles are now sharable, you may store them in the Repository, or on your Windows
machine in a user-configurable directory. The default directory is
C:\Program Files\Ab Initio\Ab Initio GDE\Hosts, but you can change it by choosing File>Preferences.
NOTE: Older graphs cannot reference a host profile that is not in the Host Profile Location (or in
the Repository); such a reference generates an error message. To correct this, either copy the referenced host
profile to the Host Profile Location or change the configured Host Profile Location.
Menu Changes
New Components
As the Ab Initio Co>Operating System adds services, the GDE provides the following components
to access them:
Compress Components: GZip and GUnzip. These wrap the GNU gzip and gunzip codes
for compressing and uncompressing your data. For more details, consult the GNU
website.
Continuous Flow Components: MQ Publish and MQ Subscribe support connection to
the MQSeries data flows. Universal Subscribe allows you to connect Ab Initio graphs to
arbitrary continuous streams of data.
MVS Components: MVS Input Tape, MVS Output Tape, MVS Output File, MVS Output
DASD. These allow you to read from and write to MVS datasets on OS/390.
Repository Connector Components: Unload Graph Instance and Unload Job
Summary produce streams of data describing objects in the Repository: graphs and graph
executions.
Improved Layout Propagation
The layout propagator derives layouts of components from their neighbors. The algorithm has
been improved as follows:
Layouts can be derived across all-to-all flows.
The algorithm is stable in ambiguous conditions: the same inputs result in the same code.
Likely user errors are flagged as warnings (yellow flow pattern ovals); for example, a fan-out
or all-to-all flow from a file component is rarely correct.
ascii, ebcdic, arabic, cyrillic, east european, euc jis, greek, hebrew, jis 201, latin_3, latin_4, shift
jis, turkish, unicode, unicode big-endian, unicode little-endian, unicode utf8
This release provides full support for the new DML features introduced in the 2.7 release of the
Co>Operating System. This includes datetime (both types and builtin functions), varstring, and so
on. Please refer to the release notes included with your Co>Operating System release.
Validation
The Validate button on the tool bar checks all of your DML types and transforms and generates a
report. After validation, all errors are flagged with Yellow To-Do Cues. If anything is selected,
validation operates only on the selection. Also, the Ports tab and the Parameters tab include Validate
buttons to validate individual record formats and transforms. The validation is based on the
version of the Co>Operating System, so the GDE flags newer features that may not work
on older Co>Operating Systems.
Component Organization
New component names describe the behavior of components. Components are unified by
function. For example, the only difference between "MergeJoin" and "HashJoin" was how the
results were computed. Now, the functionality of both components is found in Join. The on-line
help contains details about the component name changes.
Viewing data allows you to see your data organized in a grid. Right-click a dataset component,
then choose View Data, and finally choose "Grid Mode" from the Display As: pull-down.
Conditional Components
Components may be excluded from your graph based on runtime values. A runtime value is a
Korn Shell expression that returns 1 or 0. To enable this feature, choose File>Preferences, and
check Conditional Components. Thereafter, to set a condition on a component, click the
Condition tab on the Component Properties dialog. If a Condition evaluates to True, the
component is included in the graph. If it evaluates to anything else, the component is either
removed completely or replaced with a flow between two user-designated ports. Components that
are excluded are displayed with gray tracking LEDs at runtime.
Data Editor
You may change values in a dataset. Double-click the component, and click Edit Data. A grid
appears; modify the values and click OK. This is very useful for editing Lookup files.
Restrictions: the dataset must be a serial file, the record format must be "simple" (no subrecords
or vectors), and it must be completely read into the grid before you can edit.
Key Specifiers for Lookup Files may be interval- or regular-expression-based. These new field
modifiers are available in the Special column of the Key editor for a Lookup File.
Linked Subgraphs
The GDE only displays options which are relevant to the currently connected Co>Operating
System. For example, later versions of the Co>Operating System may add new built-in functions
to DML; these are displayed in the expression editor only if your version supports them.
The Ramp and Limit parameters of components are now easy to use. Choose from:
You may save graphs in version 1.4, 1.5, 1.6, 1.8.14 and 1.10 formats, to allow older GDEs to
read them. For example, to write a graph that Version 1.6 of the GDE can read, choose
File>Save As, and set the Save As Type to GDE 1.6 Graphs.
Miscellaneous
If you get a "No such entry CoInitializeEx" error when you attempt to run the GDE on Windows
95, this is because you must install DCOM. To install DCOM, visit
http://www.microsoft.com/com/dcom/dcom95/download.asp. Installation will require a reboot. This
error does not appear on Windows 98, NT, 2000, or XP.
If you get a message such as "cannot find required library WS2_32.DLL" when you run the GDE, or
a message like "Setup has determined that Windows Sockets 2.0 is not installed.", you need
to download the Windows Sockets 2 package from Microsoft. The latest version of the Winsock
2.0 run-time components is available from the Microsoft FTP site:
ftp://ftp.microsoft.com/bussys/winsock/winsock2
This location contains the Winsock 2.0 SDK for Windows 95. After installing the SDK, run
ws2setup.exe (in the setup subdirectory); this installs Winsock2. Installation will require a reboot.
You can also download the Winsock 2.0 update for Windows 95 from the following Microsoft Web
site:
http://www.microsoft.com/windows95/downloads/contents/wuadmintools/s_wunetworkingtools/w95sockets2
The simplest alternative is to install the latest version of Microsoft Internet Explorer.