Pervasive ETL Fundamental Exercises
© 2010 Pervasive Software Inc. All rights reserved. Design by Pervasive. Pervasive is a registered trademark, and "Integrating the Interconnected World" is a trademark of Pervasive Software Inc. Cosmos, Integration Architect, Process Designer, Map Designer, Structured Schema Designer, Extract Schema Designer, Document Schema Designer, Content Extractor, CXL, Pervasive Integration Engine, DJIS, Data Junction Integration Suite, Data Junction Integration Engine, XML Junction, HIPAA Junction, and Integration Engineering are trademarks of Pervasive Software Inc. All names of databases, formats and corporations are trademarks or registered trademarks of their respective companies. This exercise scenario workbook is written for Pervasive's Integration Platform software, version 9.x.
Table of Contents
Foreword ... 7
The Pervasive Integration Platform ... 8
Architectural Overview of the Integration Platform ... 9
Design Tools ... 10
MetaData Tools ... 14
Production Tools ... 15
Inside a Simple Integration ... 17
Course Setup Instructions ... 19
Course Setup Instructions ... 20
Workspaces and Repositories ... 24
Workspaces and Repositories Defined ... 25
Repository Explorer ... 26
Repository Explorer - Defined ... 27
Splash Screen Licensing and Version Information ... 28
Map Designer Fundamentals of Transformation ... 30
Map Designer The Foundation ... 31
Interface Familiarization ... 32
Basic Map ... 33
Connectors and Connections Methods of Accessing Data ... 38
Factory Connections ... 39
Macro Definitions ... 41
User Defined Connections ... 44
Basic Transformation Features ... 47
Source Data Features Sort ... 48
Source Data Features Filter ... 51
Target Output Modes - Replace, Append, Clear and Append ... 54
Target Output Modes Delete ... 57
Target Output Modes Update ... 59
The Rapid Integration Flow Language (RIFL) Script Editor ... 63
RIFL Script - Functions ... 64
RIFL Script Flow Control ... 69
Transformation Map Properties ... 74
Reject Connection Info ... 75
Event Handlers & Actions ... 78
Understanding Event Handlers ... 79
Source and Target Buffers ClearMapPut Action ... 82
Event Sequence Issues ... 85
Using Action Parameters Conditional Put ... 89
Using OnDataChange Events ... 92
Trapping Processing Errors with Events ... 96
Error and Exception Handling Review ... 100
Comprehensive Review ... 102
Metadata Using the Schema Designers ... 104
Structured Schema Designer ... 105
No Metadata Available (ASCII Fixed) ... 106
External Metadata (Cobol Copybook) ... 107
Extract Schema Designer ... 110
Interface Fundamentals & CXL ... 112
Data Collection/Output Options ... 116
Extract Schema Designer: Extracting Variable Fixed Field Definitions ... 118
Process Designer for Data Integrator ... 121
Process Designer Fundamentals ... 122
Creating a Process ... 123
Parallel vs. Sequential Processing ... 128
Conditional Branching The Step Result Wizard ... 130
FileList - Batch Processing Multiple Files ... 132
Pervasive Integration Engine ... 137
Syntax: Version Information ... 138
Options and Switches ... 139
Execute a Transformation ... 141
Using a -Macro_File Option ... 142
Executing a Process ... 143
Additional Sample Exercises Integration Engine ... 144
Command Line Overrides Source Connection ... 145
Ease of Use: Options File ... 146
Checklist Integration Engine ... 147
Intermediate Mapping Techniques ... 150
Multiple Record Type Structures ... 151
Multiple Record Type 1 One-to-Many ... 152
Multiple Record Type 2 Many-to-One ... 156
User Defined Functions ... 159
Code Reuse Save/Open a RIFL script Code Modules ... 160
Code Reuse - Code Modules ... 161
Lookup Wizards ... 163
Incore Table Lookup ... 164
Relational Database Management System (RDBMS) Mapping ... 168
Select Statements SQL Passthrough ... 169
DJX in Select Statements Dynamic Row sets ... 171
Multimode Introduction ... 173
Multimode Data Normalization ... 176
Multimode Implementation with Upsert Action ... 181
Reference ... 185
Checklist Starting Your Integration Project ... 186
Upgrading from 8.x to 9.x ... 188
Cosmos.ini Settings ... 189
Windows Default Installation Locations ... 190
Design Tool User Interfaces ... 192
Setting Properties ... 194
Reading a Log File ... 195
Examples of Complex Process Layouts ... 197
Additional Documentation Resources ... 199
Glossary ... 200
Appendix ... 210
Additional Exercises ... 211
Extract Schema Designer: Extracting Fixed Field Definitions ... 212
Integration Engine: Using the -Set Variable Option ... 214
Integration Engine: Scheduling Executions ... 216
Lookup Wizard: Flat File Lookup ... 217
Lookup Wizard: Dynamic SQL Lookup ... 221
RDBMS: Integration Querybuilder ... 225
Structured Schema Designer: Binary Data and Code Pages ... 229
Structured Schema Designer: Reuse Metadata (Reusing a Structured Schema) ... 231
Structured Schema Designer: Multiple Record Type Support in Structured Schema Designer ... 233
Structured Schema Designer: Conflict Resolution ... 237
Foreword
This course is designed to be presented in a classroom environment in which each student has access to a computer with the Pervasive Integration Products and the Fundamentals courseware installed. It can also be used as a stand-alone tutorial if the student is already familiar with the interface of the Pervasive tools. The Fundamentals course is not meant to be a comprehensive tutorial of all of our products. By the end of this course, a student should have a basic understanding of Map Designer, Structured Schema Designer, Extract Schema Designer, Process Designer, and the Integration Engine, and should know how to use these tools and how to expand their own knowledge of them. Further training can be obtained from Pervasive Training Services. Any path mentioned in this document assumes a default installation of the Pervasive software and the Fundamentals courseware. If the student installs to a different location, that will have to be taken into account when doing exercises or following links. We hope that the student enjoys this class and takes away everything needed. We welcome any feedback.
This section describes the integration stack from the user's perspective.
Design Tools
Data Integrator includes 6 tools used to create maps (transformations), schemas, profiles and processes. Each of the tools is discussed below.
Map Designer
Map Designer is the heart of the integration product tool set. It transfers data among a wide variety of data types. In Map Designer, to transfer data, the user designs and runs what is called a Transformation or a Map. Each Transformation created contains all the information Map Designer needs to transform data from an existing data file or table to a new Target data file or table, including any modifications made on the data during the transformation. Map Designer solves complex Transformation problems by allowing the user to:
- transform data between applications
- combine data from external Sources
- change data types
- add, delete, rearrange, split or concatenate fields
- parse and select substrings; pad or truncate data fields
- clean address fields and execute unlimited string and numerical manipulations
- control log errors and events
- define external table lookups
Map Designer creates two files (tf.xml and map.xml) that contain all the information necessary to run a transformation. A transformation can be run from Map Designer, Process Designer or the Integration Engine. Map Designer is covered extensively in this course and is also explored in the Advanced and the EDI/HIPAA courses.
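To give a flavor of what field-level manipulation looks like, a single target field expression written in RIFL (the scripting language introduced later in this course) can reorder and combine values from source fields. This is only an illustrative sketch; it assumes a source record layout named R1 with a Name field, and uses the NamePart function that appears in the RIFL exercises:

NamePart("l", Records("R1").Fields("Name")) & ", " & NamePart("f", Records("R1").Fields("Name"))

Applied to a value such as George P Schell, an expression like this would produce Schell, George.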
Process Designer
Process Designer is a graphical data transformation management tool that can be used to arrange a complete transformation project. Listed below are some of the Steps that a user can put into a process:
- Map Designer Transformation
- SQL Command
- Decision
- RIFL Scripting
- Command Line Application
- SQL Server DTS Package
- Sub-process
- Validation
- XSLT
- Queue
- Iterator
- Aggregator
- Invoker
- Transformer
Once the user has organized these Steps in the order of execution, the entire workflow sequence can be run as one unit. This workflow is saved as an .ip.xml file which can be run from the Process Designer or from Integration Engine. Process Designer processes can also be packaged using the Repository Manager. This packaging gathers all of the files that are required by the process and puts them into a single DJAR file that can then be run from the Integration Engine. This courseware covers some basic functionality of the Process Designer. Both the Advanced and the EDI/HIPAA courses cover the more advanced functionality of this tool.
Structured Schema Designer
The Structured Schema Designer provides a visual user interface for designing schemas for structured data files. The resulting metadata is stored as Structured Schema files with an .ss.xml extension. The .ss.xml files include schema, record recognition rules and record validation rule information. The Data Parser is used to manually parse flat Binary, fixed-length ASCII, or record manager files. The Data Parser defines Source record length, Source field sizes and data types, and Source data properties. It also assigns Source field names and defines Schemas with multiple record types. Structured Schema Designer can be used to import and read schemas from outside sources such as Cobol Copybooks, XML DTDs, or Oracle DDLs. The .ss.xml files that are created by Structured Schema Designer are used as input in Map Designer as part of a source or target connection. There are courseware and exercises on the Structured Schema Designer in this document.
Extract Schema Designer
The Extract Schema Designer is a parser tool that allows the user to visually select fields and records from text files that are of an irregular format. Some examples are:
- Printouts from programs captured as disk files
- Reports of any size or dimension
- ASCII or any type of EBCDIC text files
- Spooled print files
- Fixed length sequential files
- Complex multi-line files
- Downloaded text files (e.g., news retrieval, financial, real estate...)
- HTML and other structured documents
- Internet text downloads
- E-mail header and body
- On-line textual databases
- CD-ROM textbases
- Files with tagged data fields
Extract Schema Designer creates schemas that are stored as CXL files. These files are then used as input in Map Designer as part of a source connection. There are courseware and exercises on the Extract Schema Designer in this document.
Document Schema Designer
Document Schema Designer is a Java-based tool that allows you to build templates for E-document files. You can custom-build schema subsets for specific EDI Trading Partner and TranType scenarios. In addition, the Document Schema Designer is also very useful to those working with HL7, HIPAA, SAP (IDoc), SWIFT and FIX data files. You can develop schema files for all e-documents that are compatible with Map Designer. The document schemas serve several useful purposes: File Structure Metadata Support, Parsing Capabilities, and Validation Support. In an easy-to-use GUI, the user selects desired segments from the "template" document schemas that are generated from the controlling standards documentation. The segments are saved in a schema file that can be edited. The user may also add segments from a "master" segment library, add loops/segments/composites/elements by hand, add discrimination rules for distinguishing loops/segments of the same type at the same level, and use code tables for data validation. The user can copy, paste and delete any part of the structure, including the segments, elements, composites, loops, and fields (and their subordinate loops/segments/subcomponents). The Document Schema Designer produces DS.XML document schema files that can be used as input in Map Designer as part of a source or target connection. These files can also be used in a Process as part of a Validation step. This document does not have exercises or courseware on Document Schema Designer, though there is a one-day course available from Pervasive Training Services.
Join Designer
Join Designer is an application that allows the user to join two or more single-record type data sources prior to running a Map Designer Transformation on them. These sources do not have to be of the same type. For example, an SQL database table could be joined with a simple ASCII text file. The user first uses Source View Designer to create Source View Files that hold metadata about the Sources. From these a Join View File is created, which contains the metadata needed by Map Designer to treat the Source files as if they were a single Source. The user then supplies this Join
View File to Map Designer using "Join Engine" as the connection type. The original Source files and the Source View Files must still be available in the locations specified in the Join View File. When a join is saved, a Join View File (.join.xml) is created. This can be supplied to Map Designer as a Source file or used to create further joins. While a join is limited to two Source files, you can use another join as a Source, thus building nested joins to any level of complexity. This document does not have exercises or courseware on Join Designer. There are exercises in the Advanced course available from Pervasive Training Services.
MetaData Tools
The design tools create artifacts that are XML files (except Extract Schema Designer, which creates CXL files). The Metadata tools organize these files during development, and manipulate these files for use in production.
Repository Explorer
The Repository Explorer is the central location from which the user can launch all of the Designers, including the Map Designer, Process Designer, Join Designer, Extract Schema Designer, Structured Schema Designer, Source View Designer and Document Schema Designer. The user can also open any Repository that has been created, and then open Transformations, Processes or Schema files in that Repository list. The Repository Explorer can also access the version control functionality of CVS or Microsoft Visual SourceSafe, and can check files in and out of repositories using commands in Repository Explorer. There is courseware about the Repository Explorer in this document.
Repository Manager
Repository Manager is designed to facilitate the tasks of managing large numbers of Pervasive design documents, contained in multiple repositories in multiple workspaces. Repository Manager provides a single application to directly access any number of Pervasive design documents, view their contents, make simple updates, bundle them into a package, and generate reports. The features of Repository Manager include:
- Open and work with any number of defined Workspaces.
- Browse the hierarchy of Workspaces, Repositories, Collections, and Documents.
- Search for documents based on text strings, regular expressions, date ranges, Document Types, and document-specific fields.
- Make minor updates to documents.
- Generate an impact analysis of proposed document modifications.
- Import and export Documents and Collections.
- Package Processes and related documents into a single entity (DJAR) that can be more easily managed and transported.
- View and print documents and Reports.
This document does not have exercises or courseware on Repository Manager, though there is an exercise in the Advanced course available from Pervasive Training Services.
Production Tools
These are the tools that allow the user to automate their Transformations and Processes in their production environment.
Integration Engine
Integration Engine is an embedded data Transformation engine used to deploy runtime data replication, migration and Transformation jobs on Windows or Unix-based systems. Because Integration Engine is a pure execution engine with no user interface components, it can perform automatic, runtime data transformations quickly and easily, making it ideal for environments where regular data transformations need to be scheduled and launched. Integration Engine supports the following operating systems: Windows 2000, Windows XP, Windows Server 2003, HPUX, Sun Solaris, IBM AIX, and Linux. The Integration Engine has the capability to work with multiple threads if a multi-threaded license is purchased. There is courseware about the Integration Engine in this document.
Integration Server
Integration Server is actually an SDK that is installed by default when the integration platform is installed. The core components of the Integration Server SDK are the Engine Controller, Engine Instances (Managed Pool), and the Client API that accesses the Engine Controller through a proxy. Server stability is maintained, scalability enhanced, and resources are spared through the use of a control-managed pool of EngineExe objects. This allows the Integration Engine to be called as a service. This document does not have exercises or courseware on the Integration Server, though there is a one-day course available from Pervasive Training Services that covers the Integration Server and the Integration Manager.
Integration Manager
Through a browser-based interface, Integration Manager performs deployment, scheduling, on-going monitoring, and real-time reporting on individual or groups of distributed Integration Engines. Since all management is performed from a single administration point, Integration Manager improves operational efficiency in the management of geographically distributed Integration Engines. With the ability to remotely administer any number of integration points throughout the organization, customers can build out their integration infrastructure as required, using a flexible and scalable architecture designed for easy manageability. In other words, the Integration Manager allows the user to schedule and deploy multiple packages (DJAR) amongst multiple Integration Servers across an enterprise.
This document does not have exercises or courseware on the Integration Manager, though there is a one-day course available from Pervasive Training Services that covers the Integration Server and the Integration Manager.
For more information on Windows Default Installation Locations see the Reference Section.
Licensing
A temporary license file will be provided to you by the training services manager. This temporary license will allow you to utilize all of the capabilities of the Integration Platform for at least two weeks. If you are receiving training on site with Pervasive Software, the license may appear on your desktop. The license file will have a .slc extension. You may store your license in any directory that you wish; the default location on a Windows machine is C:\Documents and Settings\All Users\Application Data\Pervasive\Cosmos9\Common\License. After you have determined where the license will reside, double-click the Repository Explorer 9 icon on your desktop to launch the software. Choose the option to browse to a valid license file on disk, and browse to the location where you have stored the license.
3. Choose the Microsoft Access Driver (*.mdb).
4. Set the Data Source name to TrainingDB, and click the Select button.
5. Browse to the folder C:\Cosmos9_Work\Fundamentals\Data and select the TrainingDB.mdb database.
Repositories
Repositories are used to store the maps, processes, and schemas that make up your integration designs. The repository is typically a folder in a workspace directory; however, the repository folder does not have to physically reside within the workspace folder to belong to the workspace. You may have many repositories within a workspace, and you are required to have at least one. A default repository is created within the default workspace. There is more information about repositories in the next section. Your default repository location is C:\Documents and Settings\username\Cosmos9_Work\Workspace1\xmldb.
Repository Explorer
Change the Current Workspace Root Directory
Select File > Manage Workspaces (Ctrl+Alt+W). Change the Workspaces Root Directory to the Cosmos9_Work folder that was created on your C drive. This will allow you to use a list of Repositories and Macro definitions specific to your current Workspace.
Modify the Default Repository in the Current Workspace
Click on the Repositories button in the bottom right-hand corner of the Workspaces dialog box. When you change the Root Directory, a default Workspace and Repository will be created. We are going to modify the default for use during training. Change the name xmldb to Fundamentals and navigate to the folder C:\Cosmos9_Work\Fundamentals by clicking the Find button. We will use this Repository to store all of the XML schema and metadata for the training exercises.
Interface Familiarization
Objectives
The Map Designer icons offer you shortcuts when you are creating, modifying, and viewing maps. The following information about the icons and their descriptions is taken from the Help file.
Basic Map
Objectives
At the end of this lesson you should understand the Source and Target tabs and be able to use the new Simple Map view to create a Transformation.
Keywords: Drag and Drop Mapping
Description
In this exercise we will follow the basic steps in the flow chart below and create a simple map.
Exercise
Define the Source:
1. Open Map Designer.
2. There are 3 tabs. The first tab is selected for defining the source.
3. Locate the textbox labeled Source Connection and click the down arrow. This will open the Select Connection dialog box pictured below. Notice there are three additional tabs.
Note: The first time you open this it will open on the Factory Connections tab. Afterwards it will default to the Most Recently Used tab. We will discuss the User Defined Connections tab in a future exercise.
4. Choose the ASCII (Delimited) connector and click OK.
5. Next to the textbox labeled Source File/URI, click the down arrow to select a file. Browse to the Accounts.txt file in the C:\Cosmos9_Work\Fundamentals\Data folder.
6. In the ASCII (Delimited) Properties box on the right side of the Source tab, find the Header property and set it to True. Then click the Apply button under the Properties list. Note: Any time you make a change in the source or target properties, you will have to click Apply to save the changes.
7. Use the toolbar icon to open the Source Data Browser. If you see data records, then you have connected to the source. Close the Browser.
Define the Target:
8. Click on the Target Connection tab.
9. In steps 3 through 6 above, we chose a source connection. Create a Target connection in the same way, this time choosing ASCII (Fixed) as the connector type.
10. In the Target File/URI drop-down, browse to the C:\Cosmos9_Work\Fundamentals\Data folder.
11. Type Accounts_Fixed.txt as the file name, and click Open. Note: This file does not exist and will be created when we run the transformation.
Map the Fields:
12. Click on the Map tab (yellow tab).
13. If you see two quadrants on this page, then you are set to the Map Fields view and you will need to follow the next steps. If not, you can skip to step 16.
14. From the menu click View > Preferences. Click the General tab. Check Always show Map All view. We will be working in the Map All view for the remainder of the course.
15. To return to the Simple Map View, simply click on the Simple Map View icon in the toolbar.
16. To map the fields, drag the asterisk from the box labeled All Fields in the source, and drop it under the Target field name header.
17. Notice that the target has been filled out with field names identical to the source, and that the Target Field Expressions are filled out as well. Validate the Transformation using the check mark icon on the toolbar.
18. If the map is valid, click OK.
19. Save the Map as m_BasicMap.map.xml in the C:\Cosmos9_Work\Fundamentals\Development folder.
20. Click the Run Map icon to run the transformation.
21. Click the Target Data Browser and note your results.
Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: C:\Cosmos9_Work\Fundamentals\Data\Accounts.txt
Source Options: Header = True
Records("R1").Fields("Account Number") Records("R1").Fields("Name") Records("R1").Fields("Company") Records("R1").Fields("Street") Records("R1").Fields("City") Records("R1").Fields("State") Records("R1").Fields("Zip") Records("R1").Fields("Email") Records("R1").Fields("Birth Date") Records("R1").Fields("Favorites")
Records("R1").Fields("Payments") Records("R1").Fields("Balance")
Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Actions: ClearMapPut Record
Event Parameters: target name = Target; record layout = R1
Factory Connections
Objectives
At the end of this lesson you will be able to find and use the appropriate data access Connector.
Keywords: Connectors List, Connection Menu, and Source Connection tab
Description
Factory Connections contains a list of all of the Connectors available to you in Map Designer. Type the first letter of a Connector name to jump to that Connector in the list (or to the first Connector in the list that begins with that letter). For instance, if you want to choose Btrieve v7, type "B" and BAF will appear. From there, you can scroll down to Btrieve v7 and select it.
The Map Designer Connector Toolbar Here are the icons and their descriptions:
New - Allows you to clear the Source tab and define a new source connection.
Open Source Connection - Allows you to open the Select Connection dialog to access the:
- Most Recently Used tab
- Factory Connections tab
- User Defined Connections tab
Save Source Connection - Allows you to save the selected connector type, and any properties that you have defined for a source, as an sc.xml file. The advantage of doing this is that you can reuse the Connection in any subsequent Map design. This saved connection will become a User Defined Connection. We will discuss User Defined Connections in the next topic.
Source Connector Properties - Opens the Source Properties dialog box. These are the same properties available via the Source Connection tab, and are dependent upon the Connector to which you are connected. This icon will be active only when you are on the Map tab.
Macro Definitions
Objectives
At the end of this lesson you will be able to define and use Macros in connection strings.
Keywords: Macros, Macro Definition File, Workspace
Description
Macros are symbolic names assigned to text strings, and are usually used to represent file paths. You should use macros as a tool to aid in the movement of integration files from one life cycle environment to the next. A macro definition file is an XML file that contains name-value pairs. This file is named macrodef.xml and resides in your Workspace directory. Each workspace will only read one macrodef file. Therefore, the scope of macros contained in a single macrodef file is across a workspace. These macro names can be used throughout a map or process to provide connection information. For example, a macro name can be substituted in the following connector options:
- Server Name or IP Address
- Database name
- UserID
- Password
- File or table connection paths
We will create a new macro that we can use to represent the Data sub-directory for our Training Repository. This will allow us to port the schema files more readily from one workstation to another or deploy to servers for execution by Integration Engine.
Exercise
1. Select the menu item Tools > Define Macros. Notice there is already a macro that is set to the default location of the current Workspace.
2. Click New.
3. Enter a Macro Name value of FUN_DATA.
4. Click the Macro Value drop-down button, navigate to our workspace, highlight the C:\Cosmos9_Work\Fundamentals\Data folder, and click OK.
5. Add a backslash \ to the end of the macro value.
6. Enter a description if you wish and click OK.
7. On the Source Connection tab, highlight the portion of the connection string you wish to replace (e.g., C:\Cosmos9_Work\Fundamentals\Data\).
8. From the menu bar, select Tools > Paste Macro String.
9. Click on the row of the Macro you want to use (e.g., FUN_DATA). Map Designer uses the syntax $(FUN_DATA) to represent the entire path to the Data folder.
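To see the effect of the macro, compare a Source File/URI before and after substitution, using the Accounts.txt file from the earlier exercises:

Before: C:\Cosmos9_Work\Fundamentals\Data\Accounts.txt
After: $(FUN_DATA)Accounts.txt

Because the path portion now lives in the macro definition, only the macro value needs to change when the files move to another workstation or server.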
Root Macro
If you will be selecting files from the same directory often, you can set the Root Macro for automatic substitution. Click Tools > Define Macros. Highlight the Macro you want to use as the root directory and click the Set as Root button.
Then set the automatic substitution switch in Map Designer: select View > Preferences > Directory Paths and choose Substitute root MACRO. This step is optional, and is a design preference.
Exercise
1. Reopen the Transformation built previously named m_BasicMap.map.xml and view the Source Connection tab.
2. Use the Macro created in the last exercise for the path to the Accounts.txt file on the Source tab.
3. Using the Connector Toolbar to the right side of the Connection field, click the Save icon.
4. Save the source connection as Accounts_Delimited.sc.xml.
5. Close the current Map and open a new map design.
6. Select the Source Connection drop-down and click the User Defined Connections tab. Click on the Connections folder and select the Accounts_Delimited.sc.xml connection.
Exercise
1. Connect to the ASCII Delimited file, Accounts.txt, as your source. Hint: Use the User Defined Connection created in the earlier exercise.
2. Click the Source Keys and Sorting icon in the toolbar.
3. On the Sort Options tab, click in the Key Expression box to see the down arrow. Click on the down arrow.
4. Choose the State field to use as a key. Note: You can choose Build if you want to build a key using an expression that parses out or concatenates parts of different fields (a sketch follows this exercise). Also, the sort will default to ascending order. If you would prefer to sort in descending order, select "Descending" from the drop-down list.
5. Create a target connection to an ASCII Delimited file called AccountsSortedbyState.txt. This file doesn't yet exist, so you'll have to type in the file name.
6. Set the Header property to True and click the Apply button.
7. Go to the Map Step by clicking on the Map All tab.
8. Validate the Map. Note: You may see a dialog box that looks like this. We will go into greater detail on the Default Event Handler and Event Handlers in general later in this courseware.
9. Click OK to accept the Default Event Handler.
10. Save this Map as m_SourceDataFeatures_Sort.map.xml in the Development folder.
11. Run the Map.
12. Notice the results in the status bar.
13. Open the Target Data Browser and notice that the records are sorted by state.
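As mentioned in step 4, the Build option accepts a RIFL expression as the sort key. A minimal sketch of a two-level key that would sort by State and then by City (field names taken from the Accounts source; the simple concatenation here is purely illustrative):

Records("R1").Fields("State") & Records("R1").Fields("City")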
Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Automatic Transformation Feature:
Sort Fields on Source Data: Fields("State") type=Text ascending=yes length=2

Define the Target:
Target Connector: ASCII (Delimited)
Target Data: $(FUN_DATA)AccountsSortedByState.txt
Target Options: Header = True
Records("R1").Fields("Account Number") Records("R1").Fields("Name") Records("R1").Fields("Company") Records("R1").Fields("Street") Records("R1").Fields("City") Records("R1").Fields("State") Records("R1").Fields("Zip") Records("R1").Fields("Email") Records("R1").Fields("Birth Date") Records("R1").Fields("Favorites")
Records("R1").Fields("Payments") Records("R1").Fields("Balance")
Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Actions: ClearMapPut Record
Event Parameters: target name = Target; record layout = R1
Exercise
1. Connect to the ASCII Delimited file, Accounts.txt, as your source.
2. Click the Source Filters icon in the toolbar. Note the radio buttons at the bottom of the window, under Define Source Sample. We can choose a range of records, or we can choose to process every Nth record from the source. (The behavior of this is that you always get the first record, then every Nth record after it: 1, N+1, 2N+1, 3N+1, and so on. For example, with N set to 3, records 1, 4, 7, 10, ... are processed.)
3. In this exercise we will filter all Account records from the state of Texas. We will use the Source Record Filtering Expressions box. This allows us to use the RIFL scripting language (see The RIFL Script Editor section) to write an expression that will evaluate to True or False. We will process the records that cause the expression to evaluate to True. The expression to use is: Records("R1").Fields("State") == "TX" (a sketch of a compound filter follows this exercise).
4. Create a target connection to an ASCII Delimited file called AccountsinTX.txt. This file does not exist, so type in the file name.
5. Set the Header property to True and click the Apply button.
6. Go to the Map Step. Drag all Source fields to the Target.
7. Validate the Map. Note: You may see the dialog box pictured below. We will go into greater detail about the Default Event Handler and Event Handlers in general later in this course.
8. Click OK to accept the Default Event Handler.
9. Save this Map as m_SourceDataFeatures_Filter.map.xml in the Development folder.
10. Run the Map, and notice the Results Status Bar.
11. Open the Target Data Browser and notice that the target data set only contains records from Texas.
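The filter is simply a RIFL expression that evaluates to True or False, so several conditions can be combined with logical operators. A minimal sketch that would pass records from either of two states (the second state value, NM, is an arbitrary example, not part of the exercise):

Records("R1").Fields("State") == "TX" Or Records("R1").Fields("State") == "NM"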
Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ASCII (Delimited)
Target Data: $(FUN_DATA)AccountsinTX.txt
Target Options: Header = True
Records("R1").Fields("Account Number") Records("R1").Fields("Name") Records("R1").Fields("Company") Records("R1").Fields("Street") Records("R1").Fields("City") Records("R1").Fields("State") Records("R1").Fields("Zip") Records("R1").Fields("Email") Records("R1").Fields("Birth Date") Records("R1").Fields("Favorites")
Records("R1").Fields("Payments") Records("R1").Fields("Balance")
Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Actions: ClearMapPut Record
Event Parameters: target name = Target; record layout = R1
Exercise
1. Connect to the ASCII Delimited file, Accounts.txt, as your source.
2. Create a target connection to the TrainingDB database that we set up previously. The table is called tblAccounts. Note that when we connect to this table, the output mode is automatically set to Append because the table already exists. Let's change the output mode to Replace.
3. Go to the Map Step. Note: In this case we already have target fields defined. This metadata (field names, field lengths, and data types) is defined by the database. Notice also that some fields are mapped and some are not. The Simple Map view does an automatic Match by Name that pulls in field names that are exact matches from source to target. We will have to do the rest by hand.
4. For the AccountNumber field, click inside the target field expression, and then click the down arrow.
5. We can then choose Account Number (note the space that is not in the target field name; that is why Match by Name failed).
6. Now do the same for each of the remaining fields. Look at the charts below for specific mapping if needed.
7. Alternatively, we could have right-clicked in the AccountNumber Target Field Expression and chosen Match by Position. In this case, we would have mapped all of our source fields into the target fields correctly. However, it is not always the case that field names will be in perfect position order between the source and target.
8. Run the map by clicking the Run button.
9. Accept the Default Event Handler if necessary.
10. Notice the results in the Target Data Browser. Note the number of records in the table.
11. Now let's go back to the Target Connection tab and set the Output Mode to Append.
12. Click the Run button.
13. Notice the results in the Target Data Browser. Note the number of records in the table.
14. Now change the Output Mode to Clear File/Table contents and Append.
15. Run the map and note the results.
16. Save this map as m_OutputModes_Clear_Append.map.xml.
Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ODBC 3.x
Target Data: Database: TrainingDB; Table: tblAccounts
Target Options: none
Fields("Payments") Fields("Balance")
Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Actions: ClearMapPut Record
Event Parameters: target name = Target; record layout = R1
Exercise
1. Connect to the ASCII Delimited file, InactiveAccounts.txt, as your source.
2. Set the Header property to True and click Apply as we have done previously.
3. Create a target connection to the TrainingDB database that we set up previously. Connect to tblAccounts. Note that when we connect to this table, because it already exists, the output mode is automatically set to Append. Let's change it to Delete.
4. Note that the Target Field AccountNumber was automatically set as the key field. Map the Source Field Account Number to the Target Field AccountNumber.
5. Validate the map, and accept the Default Event Handler.
6. Click the Run button.
7. Notice the results in the Target Data Browser. Note the number of records in the table.
8. Be aware that you will only see results the first time you run the Map. This is because we remove the matching records the first time, and they will no longer exist. You will need to load the original source records into the target table before you run the Delete mode map a second time. Assuming that you correctly ran the previous Map in Clear and Append mode, you can run it again to prime the table.
9. Save this map as m_OutputModes_Delete.map.xml.
Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)InactiveAccounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ODBC 3.x
Target Data: Database: TrainingDB; Table: tblAccounts
Target Options: none
Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Actions: ClearMapPut Record
Event Parameters: target name = Target; record layout = R1
Exercise
From within the Transformation Map Designer:
1. Connect to the ASCII Delimited file, AccountsUpdate.txt, as your source.
2. Set Header to True and click Apply as we have done previously.
3. Create a target connection to the TrainingDB database that we set up previously. The table is called tblAccounts. Note that when we connect to this table, because it already exists, the output mode is automatically set to Append. Let's set it to Update.
4. Go to the Map Step. Note: In this case the target fields are already defined. This metadata (field names, field lengths, data types) comes from the description of the table in the database.
5. Click inside the target field expression for the Account Number field, and then click the down arrow.
6. Choose Fields("Account Number"). Note the space that is not in the target field name; this is why Match by Name would fail to map this field.
7. Choose the corresponding source field names for each Target Field Expression as we have just done. Look at the charts below for specific mapping if needed.
8. Alternatively, we could have right-clicked in the AccountNumber Target Field Expression and chosen Match by Position. In this case, we would have mapped all of our source fields into the target fields correctly. However, it will not always be the case that source fields and target fields are in the same position.
9. Notice that the Target Field AccountNumber was automatically set as the key field.
10. Open the Target Keys, Indexes and Options dialog box. Note all the options that are possible using Update mode. In this case the defaults, Update all matching records and insert non-matching records and Update only mapped fields, are sufficient. (Update ALL fields would give us the same results, since we have mapped all fields.)
11. Click the Run button.
12. Accept the Default Event Handler.
13. Notice the results in the Target Data Browser. Note the number of records in the table.
14. When we run this map we will be updating the records, so unless you restore the table to its original contents before you run the map again, you won't see any change. You can run the map we created for the Clear and Append mode exercise and then run the Delete mode map before re-running this map.
15. Save this map as m_OutputModes_Update.map.xml.
Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)AccountsUpdate.txt
Source Options: Header = True

Define the Target:
Target Connector: ODBC 3.x
Target Data: Database: TrainingDB; Table: tblAccounts
Target Options: none
Fields("Account Number") Fields("Name") Fields("Company") Fields("Street") Fields("City") Fields("State") Fields("Zip") Fields("Email") Fields("Birth Date") Fields("Favorites")
Fields("Payments") Fields("Balance")
Define Events: Source R1 Event Handlers
Event Name: AfterEveryRecord
Event Actions: ClearMapPut Record
Event Parameters: target name = Target; record layout = R1
Exercise
In Map Designer:
1. Connect to the ASCII Delimited file, Accounts.txt, as the source using the user defined connection Accounts_Delimited.sc.xml.
2. Create a target connection to the TrainingDB database using the ODBC 3.x connector. The table is tblAccounts. Set the Output Mode to Clear File/Table contents and Append.
3. Go to the Map Step.
4. Map all fields correspondingly except for the Name and the Birth Date fields. The first field that we'll work with is the Birth Date field. In our source, the birth date field has string data in it that appears as: 11/12/1975. Most databases will not accept a string value into a date or datetime field. We will have to convert the date using the RIFL function DateValMask in the Target Field Expression.
5. Double-click in the Birth Date field's Target Field Expression, or select the drop-down and choose Build Expression.
6. The RIFL Script Editor for the Birth Date field will open. A list of built-in functions is shown in the lower right-hand side.
7. Find the function DateValMask and select it. In the windows below you will see a description of the function and its parameters.
8. Double-click on the function to add it to the Script Editor. The function will appear along with its parameters. Use the next steps to define the parameters.
9. In the script, highlight the parameter DateString. In the lower left pane click on Source R1. Then in the lower right pane double-click Birth Date.
10. Highlight Mask and type "mm/dd/yyyy". Masks are used in many RIFL functions. To know what values to use for masks, look in the Help files for the topic Picture Mask.
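Once both parameters have been filled in, the completed expression for the Birth Date target field should look like the single line below (the same DateValMask call is used again in the Flow Control exercise later in this course):

DateValMask(Records("R1").Fields("Birth Date"), "mm/dd/yyyy")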
11. The next field we will manipulate is the Name target field. The source data names are in the format First Middle Last. A sample from the first record is George P Schell. We would like the Name field in the target to have the format Last, First Middle Initial. Example: Schell, George P.
12. In the RIFL Script Editor toolbar, click on the Show Expression Tree icon.
14. Delete the Fields("Name") value, or any other value in the Editor pane, so that it is blank.
15. In the lower right pane select the NamePart function.
16. Double-click the NamePart function to add it to the Scripting Editor.
17. In the editor window select the Mask parameter. Type in "l" (a lowercase L in double quotes).
18. Select the Name parameter. Pull in the source field Name as we did above for the Birth Date. (See step 9.)
19. The script that we have created will return only the last name. We will have to parse the other parts of the name and use the concatenation operator to create the full name in the desired format.
20. Use the concatenation operator to add a comma and whitespace to the name format. Write the following script:

NamePart("l", Records("R1").Fields("Name")) & ", " & _
NamePart("f", Records("R1").Fields("Name")) & " " & _
NamePart("mi", Records("R1").Fields("Name"))

Note: For logic purposes this script would need to be all on one line. We use the space and underscore characters as a continuation that allows us to write the script on the next line. This makes the script easier to read.
21. Validate the script syntax by selecting the Validation icon.
Troubleshooting Tips:
Verify that the script is written as it appears above. Make sure there aren't any trailing spaces after the continuation characters (underscores).
22. Click OK in the RIFL Script Editor and save this map as m_RIFLScript_Functions.map.xml in the Development folder.
23. Run the Map and note the results.
Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ODBC 3.x
Target Data: Database: TrainingDB; Table: tblAccounts
Target Options: none
Fields("Account Number") NamePart("l", Records("R1").Fields("Name")) & ", " & _ NamePart("f", Records("R1").Fields("Name")) & " " & _ NamePart("mi", Records("R1").Fields("Name")) Fields("Company") Fields("Street") Fields("City") Fields("State") Fields("Zip") Fields("Email")
R1.BirthDate R1.Favorites
Fields("Payments") Fields("Balance")
Define Events: Source R1 Event Handlers
Event Name: AfterEveryRecord
Event Actions: ClearMapPut Record
Event Parameters: target name = Target; record layout = R1
This exercise will evaluate the dates in the source file to determine whether they are valid. If the date for a record is valid, we will write the record to the target. If the date is not valid, we will write a message to the log file and discard the record so it isn't written to the target.
Exercise
1. Create a new Map and connect to the source and target listed below.
2. On the Map tab, map all fields as before except for the Birth Date field.
3. Open the RIFL Script Editor in the Birth Date field.
4. In the lower left pane of the RIFL Script Editor, above All Functions, click Flow Control. In the lower right pane, double-click IfThenElse.
Notice that the RIFL Script Editor puts the syntax for the If Then Else statement into the editor window. We will replace "condition" with a statement that will evaluate to True or False. Statement block one will become the actions that take place if the statement is true. Statement block two will become the actions that take place if the statement is false.
5. Enter the following script, replacing what is in the editor.
Dim d
d = Records("R1").Fields("Birth Date")

If IsDate(d) Then
  DateValMask(d, "mm/dd/yyyy")
Else
  LogMessage("Warn", "Account Number " & Records("R1").Fields("Account Number") & _
    " has an invalid date: " & d)
  Discard()
End If
Line 01 declares a local variable d that will be available to us only in this script.
Line 02 sets d to the value contained in the Birth Date field in the source.
Line 04 uses the IsDate function to determine whether the string can be converted to a valid date.
Line 05 converts the date for use in the target using the DateValMask function.
Lines 07 and 08 use the LogMessage function. The first parameter of a LogMessage function is always either Info, Warn, Error, or Debug. The second parameter is the string written to the log file. In this case it is a combination of literal strings and data contained in the source record. Note the continuation character at the end of line 07.
Line 09 uses the Discard function, which causes the source record not to be written to the target.
6. Click the Validate icon. We should see "Expression contains no syntax errors" at the bottom of the RIFL Script Editor. Click OK.
7. Validate the map and save it as m_RIFLScript_FlowControl.map.xml.
8. Run the Map and note the results in the target. There are only 201 records in the target.
9. Click on the Transformation Log icon. Note the results of the LogMessage function.
Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ODBC 3.x
Target Data: Database: TrainingDB, Table: tblAccounts
Target Options: none
R1.Favorites R1.StandardPayment
R1.LastPayment R1.Balance
Fields("Payments") Fields("Balance")
Define Events: Source R1 Event Handlers
Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters: target name = Target, record layout = R1
Using the Transformation and Map Properties dialog affects many areas of the Transformation Map. These areas include the log file settings, runtime execution properties, error handling and definitions of external code-modules.
Exercise 1. Using the previous Map, change the Discard() function call to a Reject() function call. 2. Go to the Map Properties dialog and click Build Connection String from Source.
3. Change the file name in the connect string to BadDateRejects.txt. 4. Using the Target Event Handler OnReject, add a ClearMapPut Record action. 5. Change the target name parameter from Target to Reject. 6. Save the map as m_RIFL_RejectConnectInfo.map.xml. 7. Execute the map by clicking the Run icon.
8. Note the results in the Target Data Browser. 9. Navigate to the reject file BadDateRejects.txt. You should see all records that contain invalid dates.
Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ODBC 3.x
Target Data: Database: TrainingDB, Table: tblAccounts
Target Options: none
Dim d
d = Records("R1").Fields("Birth Date")
If IsDate(d) Then
   DateValMask(d, "mm/dd/yyyy")
Else
   LogMessage("Warn", "Account Number " & Records("R1").Fields("Account Number") & _
      " has an invalid date: " & d)
   Reject()
End If
R1.Favorites R1.StandardPayment R1.LastPayment R1.Balance
Define Events: Source R1 Event Handlers
Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters: target name = Target, record layout = R1

Define Events: Target R1 Event Handlers
Event Name: OnReject
Event Action: ClearMapPut Record
Event Parameters: target name = Reject, record layout = R1
A source record has been read from the source file. Transform the data and write a record to the target file. Fire this action only if Fields("Status") of the source record == "Active".
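One way to express such a condition is through the count parameter of a ClearMapPut Record action, a technique used in the conditional put exercise later in this section. The sketch below assumes the Status field from the scenario above; the last value evaluated (1 or 0) controls whether the target record is written:

' Count expression for a ClearMapPut Record action (sketch only)
' Return 1 to write the target record, 0 to suppress it
If Records("R1").Fields("Status") == "Active" Then
   1
Else
   0
End If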
Using an Event The first task is to choose an event to use. Events are grouped in a number of places. There are events that apply to the transformation as a whole (e.g., BeforeTransformation). These can be found in the Transformation and Map Properties dialog. Next, there are source record events that apply to each specific source record type (e.g., AfterEveryRecord). These can be found in the source hierarchy on the Map Tab under each record type's heading. Next, there are source record events that apply to each and every source record that is read, and these can be found under the General Event Handlers heading in the source hierarchy. Finally, there are two groups of target record events - one group that applies to target records of a specific type and one that applies to each and every target
record (no matter what type). These are found in the target hierarchy under headings like those for the source record events. The Default Event Handler A transformation must have at least one event, and that event must have at least one action. To ensure that your transformations meet this requirement, Map Designer will define an event and an action if you do not create any events. The event that it creates is the AfterEveryRecord event for the source file, and the action that it supplies is the ClearMapPut action for the target file. This event and its associated action are collectively referred to as the Default Event Handler. This Default Event Handler will automatically read every source record, clear the target buffer, execute all of the mapping expressions, and then write the target buffer contents to the target file for each source record. When Map Designer supplies this default event handler, you are informed via an on-screen message box. However, Map Designer supplies the default event handler ONLY if you do not create any event handlers. If you do, then Map Designer WILL NOT ADD the default event handler. Map Designer will, however, warn you when you are about to run a transformation that has no event action that will cause a target record to be written. Commonly Used Events Some events are very basic and are used frequently. Most of these events will be discussed and used in the exercises in this course module. You should be aware of these events and when they occur. BeforeTransformation: This is the first event that occurs in any transformation, and it is very useful for all the housekeeping and set-up tasks that you may wish to perform. AfterTransformation: This is the last event that occurs before a transformation ends, and it is very useful for accessing final totals and other values, and for performing housekeeping and clean-up tasks. Specific AfterEveryRecord: The word "specific" refers to an event that is tied to a particular source or target record type. This event occurs whenever a source record of a specific type is read, and it is the ideal place to perform the actions you want to carry out using the values from each source record. Specific AfterFirstRecord: This event only occurs when the first record of a specific type is read, and it is the ideal event in which to perform housekeeping and set-up tasks that relate to a single record type. General AfterFirstRecord: The word "General" refers to an event that is not tied to a particular source or target record type. This particular event occurs only when the first record is read from the source file and is again a great place to perform general housekeeping and set-up tasks that relate to all record types. General AfterEveryRecord: This event occurs whenever a source record is read from the source file - no matter what type it may be. It is the best place to put common tasks - those that will apply to all source records. Commonly Used Actions There are many actions that you can perform when a particular event occurs. Some actions are used very often and are common to many events. The two most common actions are:
ClearMapPut: This action is three actions in one. The first action clears the target buffer (for the record type specified in its Layout parameter). Next, it executes all the mapping expressions that you have supplied for each field in the target buffer, in effect filling the target buffer fields with the data specified by the Target Field Expressions. Finally, it writes the contents of the buffer to the target file. A visual representation of these actions is pictured in the next topic. Execute: This action executes a script created with the RIFL Script Editor. The scripts you write and execute perform the work of your transformation. For additional documentation of using Event Handlers, read the Event Management Guide: http://docs.pervasive.com/products/integration/download/events.pdf.
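As a small illustration of the Execute action, the expression below is the kind of script it might run in a BeforeTransformation event. The message text and the use of the FUN_DATA macro here are illustrative only, not taken from a solution map:

' Illustrative Execute expression for a BeforeTransformation event:
' write an informational entry to the log before any records are processed
LogMessage("Info", "Starting load of " & MacroExpand("$(FUN_DATA)") & "Accounts.txt")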
III. State of Buffers after a new Source record is read and the data is stored in the Source Buffer.
IV. The first action of the ClearMapPut causes the Target Buffer to be cleared.
V. The second action of the ClearMapPut causes the data in the Source Buffer to be mapped to the Target Buffer.
VI. The third action of the ClearMapPut causes the data to be written to the Target File or Table.
Exercise 1. Create a map based on the specifications given below. Save the Map as m_Events_SequenceTest.map.xml. 2. Run the map and observe the results. Most of our exercises make some attempt to mimic a real world situation in a simplified fashion. This exercise, however, is pure classroom.
Map Summary:
Define the Source:
Source Connector: Null
Source Options: Record count = 5

Define the Target:
Target Connector: ASCII (Delimited)
Target Data: $(FUN_DATA)EventNames.txt
Target Options: Header = True
Variables: Name = eventName, Type = Variant, Public = no
Define Events: Transformation Events
Event Name: BeforeTransformation
Event Actions: Execute (Expression:), ClearMapPut Record (target name = Target, record layout = R1)
Event Name: AfterTransformation
Event Actions: Execute (Expression:), ClearMapPut Record (target name = Target, record layout = R1)

Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Actions: Execute (Expression:), ClearMapPut Record (target name = Target, record layout = R1)

Define Events: Source General Events
Event Name: AfterEveryRecord
Event Actions: Execute (Expression:), ClearMapPut Record (target name = Target, record layout = R1)
Event Name: BeforeFirstRecord
Event Actions: Execute (Expression:), ClearMapPut Record (target name = Target, record layout = R1)
Event Name: OnEOF
Event Actions: Execute (Expression:), ClearMapPut Record (target name = Target, record layout = R1)
Note: The following target fields should be created manually through the user interface.
eventName
Exercise 1. Create a map based on the specifications given below. 2. Save the map as m_Events_ConditionalPut.map.xml. 3. Run the map and observe the results.
Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ODBC 3.x
Target Data: Database: TrainingDB, Table: tblAccounts
Target Options: none
Variables:
Name          Type      Public   Value
varBadDates   Variant   no       0
Records("R1").Fields("Account Number") Records("R1").Fields("Name") Records("R1").Fields("Company") Records("R1").Fields("Street") Records("R1").Fields("City") Records("R1").Fields("State") Records("R1").Fields("Zip") Records("R1").Fields("Email") DateValMask(Records("R1").Fields("Birth Date"),"mm/dd/yyyy") Records("R1").Fields("Favorites")
Records("R1").Fields("Payments") Records("R1").Fields("Balance")
Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters: target name = Target, record layout = R1, count =
   Dim d
   d = Records("R1").Fields("Birth Date")
   ' Use flow control to test for a valid date
   If IsDate(d) Then
      ' Enable the Put action by returning 1
      1
   Else
      ' Invalid date, log a message
      LogMessage("Error", "Account number: " & Records("R1").Fields("Account Number") & _
         " has an invalid date: " & d)
      ' Increment counter
      varBadDates = varBadDates + 1
      ' Suppress the Put action by setting to zero
      0
   End If
Exercise 1. Create a map based on the specifications given below. Save the Map as m_Events_OnDataChange.map.xml. 2. Run the map and observe the results.
Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True
Automatic Transformation Feature: Sort Fields on Source Data: Fields("State") type=Text ascending=yes length=2
Define the Target:
Target Connector: Excel 2000 or Excel XP
Target Data: File: $(FUN_DATA)AccountSummariesByState.xls, Sheet: Sheet1
Target Options: Header Record Row: 1
Target Schema
Field Name State Number_of_Accounts Type Length Description Text Text 16 16 16 48
Variables:
Name Type Public Value
no no no 0 0
Define Events: Source R1 Event Handlers
Event Name: AfterEveryRecord
Event Action: Execute
Event Parameters: Expression:
   ' Set the state value for the current record since it will change when we are ready to write the data to the target
   varState = Records("R1").Fields("State")
   ' Increment the counter
   varCounter = varCounter + 1
   ' Accumulate the balance
   varBalance = varBalance + Records("R1").Fields("Balance")
There are also two special situations that should be considered. When the very first record is placed in the buffer the value of the field being monitored will have changed since the source buffer is always filled with null values at the start of a transformation. So the OnDataChange Event will fire after the first record is read. Similarly, when the source buffer is cleared after the last record has been processed, the value of the field being monitored will change from some real value to a null value, and again the OnDataChange Event will be fired. However, these situations may or may not be useful in any given transformation. Therefore, you have the option of suppressing one or the other, or both of them. This is controlled in the Data Change Event Management Options.
Define Events: Source R1 OnDataChange Event
Monitor: Records("R1").Fields("State")
Management: Suppress first ODC event, Fire extra ODC event at EOF
Event Name: OnDataChange1
Event Actions:
   ClearMapPut Record (target name = Target, record layout = R1)
   Execute (Expression:)
      ' Reset the variables for the records belonging to the next state
      varCounter = 0
      varBalance = 0
After reviewing the target data, you may notice that the decimal precision is not formatted correctly. The precision becomes distorted because the Balance field, which is stored as text, is converted to a numeric value, the values are added, and the result is then converted back into text. To fix the precision, change the source data type for the field to Decimal and set the number of decimal places to 2.
Exercise 1. Create a map based on the specifications given below. Save the map as m_Events_OnError_Event.map.xml. 2. Run the map and observe the results.
Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ASCII (Delimited)
Target Data: File: $(FUN_DATA)PaymentsRemaining.txt
Target Options: Header = True
Target Fields:
Name Account Number Payments Balance Type Length Description Text Text Text 9 7 6 16
Int(bal/pmt) + 1 End if
Variables
Name Type Public no no Value 0
Define Events: Source General Events
Event Name: BeforeFirstRecord
Event Action: Execute
Event Parameters: Expression:
   ' Set the value of the file variable
   errorFile = MacroExpand("$(FUN_DATA)DivideByZero.txt")
   If FileExists(errorFile) Then
      FileDelete(errorFile)
   End If
Note: This example shows the functionality of the MacroExpand, FileExists, and FileDelete functions, though similar results could be obtained by using:
FileWrite(errorFile, "AcctNumber" & sep & "Payt" & sep & "Bal" & crlf) where sep = "|" and crlf = Chr(13) & Chr(10)
This would replace any existing file with a file that contains only the header. This would also make the flagFirstTime variable unnecessary.
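Put together, the alternative BeforeFirstRecord expression described in this note might look like the following sketch; sep and crlf are local names used only for readability:

' Alternative BeforeFirstRecord expression (sketch of the approach described above)
Dim sep, crlf
sep = "|"
crlf = Chr(13) & Chr(10)
errorFile = MacroExpand("$(FUN_DATA)DivideByZero.txt")
' FileWrite replaces any existing file with one containing only the header,
' so no FileExists/FileDelete check and no flagFirstTime variable are needed
FileWrite(errorFile, "AcctNumber" & sep & "Payt" & sep & "Bal" & crlf)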
Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters: target name = Target, record layout = R1
Define Events: Target General Events
Event Name: OnError
Event Actions:
   Execute (Expression:)
      Dim sep, crlf
      sep = "|"
      crlf = Chr(13) & Chr(10)
      If flagFirstTime == 0 Then
         ' Write the header for the error file
         FileAppend(errorFile, "AcctNumber" & sep & "Payt" & sep & "Bal" & crlf)
         ' Set flag to 1 so header will not be written next time
         flagFirstTime = 1
      End If
      FileAppend(errorFile, Records("R1").Fields("Account Number") & sep & _
         Records("R1").Fields("Payments") & sep & _
         Records("R1").Fields("Balance") & crlf)
   Resume (Event Parameters: none)
Do not forget the Resume Action. The Resume Action is what causes the map to continue processing the remaining records after the error is handled. 3. Observe the DividebyZero.txt file that is created in the Data folder.
Types of Errors and Log Messages Errors occur during data integration at design time and run time.
Design-Time Errors Design-time errors occur while you are using an application such as Map Designer or Process Designer. Run-Time Errors Run-time errors occur during the execution of a transformation or process. Because the designers can run transformations and processes through Integration Engine, run-time errors can also be displayed in the designer interfaces. Run-time errors are generated from the following places: the designers, RIFL scripts, the Integration Engine command line console, and SDK code.
Error Log Messages All errors originate from the Integration Engine, but the log to which they are written depends on the interface being used. For instance, if you are using Map Designer, then errors are logged to an error file. If you are using Integration Engine, then error messages are displayed in the console and written to a log file. See the following topics for more information on error logging. Map Designer Errors that occur in Map Designer are displayed in a dialog box and logged to a TransformMap.log file. Process Designer Errors that occur in Process Designer are logged to a process log file named by the designer. Integration Engine In the Integration Engine, the last error message logged while loading, changing or running a transformation or process is logged to the command line interface console. All error messages are logged to a log file. The default name for the log file depends upon which interface you are calling. For instance, if you are running a transformation on the command line, errors are logged to the TransformMap.log file; if you are running a process, errors are logged to a process log file. RIFL The Rapid Integration Flow Language (RIFL) includes functions and statements that return information about errors to the error log files. For instance, you can use the LogMessage Function to write messages to an error log file, and the On Error GoTo Statement to trap run-time errors.
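As a small illustration of these two RIFL facilities, the pattern below mirrors the error-trapping script used later in this course; the label name ErrorScript is just a convention:

' Minimal error-trapping sketch in RIFL
On Error GoTo ErrorScript
' ... statements that may raise a run-time error ...
Return

ErrorScript:
' Log the error details, then continue or terminate as appropriate
LogMessage("Error", "Err.Number = " & Err.Number & " Err.Description = " & Err.Description)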
Comprehensive Review
To test our knowledge and review the introductory module for the Cosmos Integration Essentials courses, we want to design a Map to load data from the Accounts.txt file into a target database table.

Basic Map specifications:
Source Connector: ASCII (Delimited)
Source File: Accounts.txt
Header property: True
Target Connector: ODBC 3.x
Data Source Name: TrainingDB
Table: tblIllini
Output Mode: Replace Table
Exercise 1. Map the four target fields with the appropriate data from the source. 2. Use the appropriate Event and Action that will write all source records to the target. Hint: This is also the Default Event Handler. 3. In the Target BirthDate Field, use an appropriate Date/Time function to convert the formatted date strings into a real date-time data type. 4. Test for invalid dates using the IsDate function, and reject the invalid records to an ASCII Delimited file named Reject_Accounts.txt. 5. Reject all records from the state of Illinois (IL) into the Reject_Accounts.txt file as well. Hint: You will have to use a Target Event to write the rejected record to the file. 6. Aggregate the Balances from all rejected records using a global variable. 7. Report the aggregated balance (total balance) in the log file using the LogMessage function.
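One possible shape for the Birth Date target field expression in this review is sketched below. It is only an illustration, not the solution map, and varRejectedBalance is an illustrative variable name for the aggregation in steps 6 and 7:

' Sketch of a Birth Date field expression for this review (illustrative only)
Dim d
d = Records("R1").Fields("Birth Date")
If IsDate(d) Then
   If Records("R1").Fields("State") == "IL" Then
      ' Illinois records are rejected; accumulate their balances
      varRejectedBalance = varRejectedBalance + Records("R1").Fields("Balance")
      Reject()
   Else
      DateValMask(d, "mm/dd/yyyy")
   End If
Else
   ' Invalid dates are rejected as well
   varRejectedBalance = varRejectedBalance + Records("R1").Fields("Balance")
   Reject()
End If

The total could then be reported from an AfterTransformation Execute action, for example with LogMessage("Info", "Total rejected balance: " & varRejectedBalance).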
The solution to this review is in the Solutions folder. It is named m_Comprehensive_Review.map.xml. Open it and look only if you get stuck. It should be noted that the solution map shows only one way to complete this exercise. There are several.
Exercise 1. Start a New Structured Schema design and choose the ASCII Fixed connector. 2. Click the Visual Parser toolbar button (Red Knife). 3. Navigate to the file named Payments.txt. 4. Click in the current row (blue highlight) between the fields and name the fields by overtyping in the Field Name drop down list. 5. Save the Structured Schema as s_Payments.ss.xml.
Record Layouts
Record R1
Name Type Length 9 8 10 27
Exercise 1. Start a New Structured Schema design session and choose the Binary connector. 2. Using the drop-down menu in the upper right hand of the window, choose Cobol 01. 3. Navigate to the file named Accounts_Cobol.cbk. 4. Click the Layout/Record Name(s) you want to import. 5. Click OK. 6. Review the structure in the grid view. 7. Save the Structured Schema as s_CobolCopyBook_Accounts.ss.xml.
Record Layouts
Record ACCOUNT_INFO
Name        Type                   Length
ACCTNUM     Display                9
NAME        Display                21
COMPANY     Display                31
STREET      Display                35
CITY        Display                16
STATE       Display                2
POSTCODE    Display                10
EMAIL       Display                25
BIRTHDATE   Display                10
FAVORITES   Display                11
STDPAYT     Display sign leading   6
LASTPAYT    Display sign leading   6
BALANCE     Display sign leading   6
Printouts from programs captured as disk files
Reports of any size or dimension
ASCII or any type of EBCDIC text files
Spooled print files
Fixed length sequential files
Complex multi-line files
Downloaded text files (e.g., news retrieval, financial, real estate...)
HTML and other structured documents
Internet text downloads
E-mail header and body
On-line textual databases
CD-ROM textbases
Files with tagged data fields
XML
HL7
Swift
Extract Schema Designer does NOT use the XML repository that all of our other Design Tools use. Extract Schema Designer saves extracts in two ways. The first is in a script file in Content Extractor Language with a .cxl extension. This file is only useful as part of a Source Connection in Map Designer. It cannot be imported into Extract Schema Designer to be edited. The second way that an Extract is saved is in an Access Database. The default path and filename for this database is C:\Program Files\Pervasive\Cosmos9\Common\extractor900.mdb. Extracts stored here can be reopened and edited. Content Extractor Language is very rich and expressive, and provides many advanced data manipulation and formatting capabilities. CXL can be used to create or customize complex scripts necessary for text files whose patterns and rules may be beyond the functionality of the user interface supplied with the Extract Schema Designer. More information about this language is available in the Content Extraction Language Help file under the SDK Help Files. The default path and filename for this file is C:\Program Files\Pervasive\Cosmos9\Common\Help\SDKs\cxl_sdk.pdf. Former users of Data Junction Content Extractor should be aware that the script files are no longer called DJP files. They are known as CXL files now.
There are several legacy names that may be used in place of the default connector name of Extract Schema Designer's Connector. This list includes: Cambio, Content Extractor, Extractor, and Report Reader. There are also two connectors that have a pre-designed script included with the software that parses statistical information from the log file automatically. These are Data Junction Log File and Integration Log File.
Exercise Start Extract Schema Designer. 1. From the Repository Explorer, select New Object > Extract Schema. 2. At the prompt, navigate to the file you will be working with, in this case, Purchases_Phone.txt. 3. Choose OK to accept the Source Options defaults. 4. Highlight the word Category on one of the Category lines and right-click in the highlight. 5. Select Define Line Style > New Line Style. 6. Verify that all defaults are acceptable and click Add. We've now defined a Line Style for the Category field. 7. Highlight the Category code on one of the Category lines and right-click in the highlight. 8. Select Define Data Field > New Data Field. 9. Change the field name to Category. 10. Verify that all other defaults are acceptable and click Add. We've now defined the Category Data Field. 11. Highlight a ProductNumber and the rest of the spaces on the line and right-click in the highlight. 12. Select Define Data Field > New Data Field. 13. Change the field name to ProductNumber. 14. Verify that all other defaults are acceptable and click Add.
15. Highlight a Quantity and all but one of the spaces between the actual digits of the Quantity and the colon following the literal Quantity (if any). 16. Right-click in the highlight and select Define Data Field > New Data Field. 17. Change the field name to Quantity. 18. Verify that all other defaults are acceptable and click Add. Now let's ensure that Source Options will allow parsing: 19. Select Source Options from the Menu bar. 20. On the Extract Design Choices tab, look in the Tag Separator dropdown to see if there is a character sequence that matches the sequences used in your data to separate Line Style tags from actual data. If there is, select it. If there is not, then automatic parsing is not available. Also on this tab, ensure that the Trim Leading and Trailing Spaces checkbox is selected. 21. On the Display Choices tab, ensure that the Pad Lines checkbox is selected. 22. Click OK to accept the selections. Now let's define the UnitCost Line Style and Data Field simultaneously. 23. Highlight an entire UnitCost line in the data and right-click in the highlight. 24. Select Define Data Field > Parse Tagged Data. Note: When Line Styles and Fields are defined in this way, the default name for the Field is exactly the same as that for the Line Style, so no change to the field name is usually necessary. If a change is desired, however, point your cursor to the actual field data in the display and double-click on the data. This will bring up the Field Definition dialog box and you can change the name (or other characteristics) here. Now we'll define the TotalCost and ShipmentMethodCode Line Styles and Data Fields simultaneously. 25. Highlight an entire TotalCost line and ShipmentMethodCode line in the data. 26. Right-click in the highlight and select Define Data Field > Parse Tagged Data. The next thing is to define the Line Style that determines the end of a row of data for the Extract File. 27. Locate the Line Style that contains the Field that will be the last column in each row in the eventual extract file (in this case, ShipmentMethodCode). 28. Double-click on the Line Style name to bring up the Line Style Definition dialog. 29. On the Line Action tab, choose ACCEPT Record, and accept the remaining defaults. 30. Click Update. Test the Extract to ensure that your definitions are correct. 31. Click on the Browse Data Record button. 32. Choose OK to allow assignment of all Fields to the Extract File. 33. Examine the data to ensure that your Field definitions are correct.
34. Close the browser window. 35. Use the Parse Tagged Data functionality to define the Account Number, Purchase Order Number and PODate fields. 36. Double-click on a Purchase Order Number to access the Field Definition dialog. Note: The options at this dialog determine how the Extract Schema Designer will process the data in this particular field from record to record. The use of these options makes a distinction between the data fields and the contents of those fields. When the Extract Schema Designer is collecting data fields, it collects all the fields that have been defined on lines of text whose line action is either COLLECT Fields or ACCEPT Record and assembles those fields into a data record. The options at this dialog determine how data within a data field is handled.
37. On the Data Collection/Output tab, ensure that Propagate Field Contents has been selected. 38. Double-click on a PODate to access the Field Definition dialog. 39. On the Data Collection/Output tab, select Flush Field Contents. 40. Click Update. 41. Click on the Browse Data Record button. 42. Choose OK to allow assignment of all Fields to the Extract File. 43. Examine the data to see the effect of Propagate versus Flush. 44. Close the browser window. 45. Redefine the PODate field to propagate it as well. 46. Browse the data record again to ensure the data is being propagated.
Note: In this case we do want the data to propagate, but you will need to decide which behavior you want for any situation. We can specify an order for the columns in your Extract File (if desired). 47. Choose Field Export Field Layout from the menu bar. 48. To reposition a column, left-click and drag a column name up or down in the list, dropping it on top of another column name. Note: When you drag upward, the column you are dragging will be placed before the column on which you drop it. When you drag downward, the column you are dragging will be placed after the column on which you drop it. 49. Put the six columns in the order they appear in the source file. 50. Click OK. 51. Exclude columns from the Extract File (if desired). 52. Select Record Edit Accept Record from the menu bar. 53. Clear the check boxes for the columns that you do not wish to appear in the Extract File. 54. Click Update. 55. Save the Extract Schema Definition: If the Extract Schema Definition has already been saved before, click the Save Extract button to save it again under the same name. You may also choose File Save Extract to perform the same function. If the Extract Schema Definition has not yet been saved, click the Save Extract button. In the Save Extract dialog, supply the name Phone_Purchases.cxl and verify the location where the Definition will be stored (changing it if necessary). You may also choose File Save Extract to perform the same function. If the Extract Schema Definition has been saved before, but you have modified it and want to save it as a different Definition, then choose File Save Extract As. In the Save Extract dialog, supply a name for the Definition and verify or supply the save location. 56. Close the Extract Schema Designer. 57. Open Map Designer and establish a source connection based on the information below. 58. Open the Source Data Browser and note the results. Note that this source could now be used in the same way that any other source would be in a transformation. 59. Close Map Designer without saving.
Exercise 1. From the Repository Explorer, select New Object Extract Schema. 2. At the file selection prompt, click Cancel. 3. Double-click on the Purchases_Phone.cxl script to open it. 4. Choose File Save Extract As and save the extract again as Purchases_Phone2.cxl. 5. Highlight the first slash in the ReportDate. 6. Right-click in the highlight and select Define Line Style New Line Style. 7. Change the proposed name to ReportDate. 8. Choose Add. 9. Highlight the second slash in the ReportDate. 10. Right-click in the highlight and choose Define Line Style Append Line Pattern. 11. Double-click on the ReportDate line style name to view the results. Note: This Line Style definition will be sufficient so long as there is no other line of information in the file that has slashes in positions 24 and 27 and which does not contain a Report Date. If there were, we could use the same procedure to add the spaces in front of and after the actual date. If that were still not sufficient, then we could use additional techniques that we will learn in later exercises to make the Line Style definition a unique one. 12. Highlight the Report Date. 13. Right-click in the highlight and select Define Data Field New Data Field. 14. In the Field Definition dialog, change the name of the Field to ReportDate. 15. Click Add. 16. Use the Browse Data Record button to view the results. 17. Highlight the entire Order File Creator text line at the bottom of the file. 116 Data Integrator Fundamentals Training
18. Right-click in the highlight and select Define Data Field Parse Tagged Data. 19. Double-click on the Order_File_Creator Line Style to change its name (if desired). 20. Double-click on the actual email address to open the Field Definition dialog. 21. Change the Field Name to OrderFileCreatorEmailAddress. 22. Click Update. 23. Use the Browse Data Record button to view the results. 24. Close the browser then Double-click on the Order_File_Creator Line Style name to open the Line Style Definition dialog. 25. On the Line Action tab, change the action to ACCEPT Record. 26. Click Update. 27. Choose Record Edit Accept Record from the menu bar. 28. Choose Order_File_Creator for the Current Accept Record. 29. Select the OrderFileCreatorEmailAddress checkbox. 30. Choose ShipmentMethodCode for the Current Accept Record. 31. De-select the Order_File_Creator checkbox. 32. Click Update. 33. Use the Browse Data Record button to view the results. 34. Save the Extract Schema Design as Purchases_Phone2.cxl and close the Extract Schema Designer. Note: When an Extract Schema Design like this one is used as part of the Source specification for a transformation, the transformation Map tab will look as if the input file had been defined to have multiple record types. The email address will be in the last record read by the transformation, of course. If your requirements dictate that the email address be available as actual purchase records are processed, then you will have to use other techniques in a more complex transformation.
Exercise 1. From the Repository Explorer, select New Object Extract Schema. 2. At the prompt, navigate to the file you will be working with, in this case, Purchases_Fax.txt. 3. In the Source Options dialog, choose OK to accept the defaults. 4. Highlight the literal Order Header and right-click in the highlight. 5. Select Define Line Style Auto New Line Style Action - Collect fields. 6. Highlight an Account Number and Right-click in the highlight. 7. Select Define Data Field New Data Field. 8. Change the Field Name to AccountNumber. 9. For the Start Rule, choose Floating Tag. 10. Enter the tag Account Number(. 11. Use first tag starting at position 0. 12. For the End Rule, choose Floating Tag. 13. Enter the tag ) (a single closing parenthesis). 14. Use first tag starting at position 0. 15. Choose Add. 16. Highlight a PO Number and right-click in the highlight. 17. Select Define Data Field New Data Field. 18. Change the Field Name to PONumber. 19. For the Start Rule, select the first floating tag of PO Number( starting at position 0. 20. For the End Rule, select the first floating tag of ) starting at position 0. 21. Choose Add. 118 Data Integrator Fundamentals Training
Note: When working with Floating Tags, the starting position for the End Rule is relative to the beginning of the Field being defined- not the beginning of the record. So even though the closing parenthesis for the PONumber is the second one from the beginning of the file, it is only the first one from the beginning of the PONumber. 22. Highlight a PO Date, right-click and select Define Data Field New Data Field. 23. Change the Field Name to PODate. 24. For the Start Rule, select the first floating tag of PO Date: starting at position 0. Please note that there is a space after the colon. 25. For the End Rule, choose End of Line. 26. Choose Add. 27. Highlight the literal Item and right-click in the highlight. 28. Select Define Line Style Auto New Line Style Action - Collect fields. 29. Highlight a Category and right-click in the highlight. 30. Select Define Data Field New Data Field. 31. Change the Field Name to Category. 32. Choose Add. 33. Highlight a Product Number and right-click in the highlight. 34. Select Define Data Field New Data Field. 35. Change the Field Name to ProductNumber. 36. For the Start Rule, select the first floating tag of / starting at position 0. 37. For the End Rule, select the first floating tag of (a single space) starting at position 0. 38. Choose Add. 39. Highlight a Quantity, right-click and select Define Data Field New Data Field. 40. Change the Field Name to Quantity. 41. For the Start Rule, select the third floating tag of (a single space) starting at position 0. 42. For the End Rule, select the first floating tag of / starting at position 0. 43. Choose Add. 44. Highlight a Unit Cost, right-click and select Define Data Field New Data Field. 45. Change the Field Name to UnitCost. 46. For the Start Rule, select the second floating tag of / starting at position 0. 47. For the End Rule, select the first floating tag of / starting at position 0. 48. Choose Add. 49. Highlight a Shipment Method Code, right-click and select Define Data Field New Data Field. 50. Change the Field Name to ShipmentMethodCode. 51. For the Start Rule, select the third floating tag of / starting at position 0. 119 Data Integrator Fundamentals Training
52. For the End Rule, choose End of Line. 53. Choose Add. 54. Locate the Line Style that contains the Field that will be the last column in each row in the eventual extract file (in this case, Item). 55. Double-click on the Line Style name to bring up the Line Style Definition dialog. 56. On the Line Action tab, choose ACCEPT Record, and accept the remaining defaults. 57. Click Update. 58. Click on the Browse Data Record button. 59. Choose OK to allow assignment of all Fields to the Extract File. 60. Examine the data to ensure that your Field definitions are correct. 61. Close the browser window. 62. Ensure that the Fields are in the order they appear in the input data. 63. Save the Extract Schema Design as Purchases_Fax.cxl. 64. Close the Extract Schema Designer. 65. Remember that this schema can be used as part of a source connection in Map Designer.
Process Designer is a graphical data transformation management tool you can use to arrange your complete transformation project. With Process Designer, you can organize Map Designer Transformations with logical choices, SQL queries, global variables, Microsoft's DTS packages, and any other applications necessary to complete your data transformation. Once you have organized these Steps in the order of execution, you can run the entire workflow sequence as one unit. IntegrationArchitect_ProcessDesigner.ppt
Creating a Process
Objectives At the end of this lesson you should be able to create a simple Process Design. Keywords: Process Designer, Transformation Map, and Component. Description Process Designer can be used from beginning to end to make your data transformation task simpler and more streamlined. Map Designer is one of the applications that can be called from within Process Designer. Process Designer allows you to create new Transformations, use existing Transformations, or use a copy of an original transformation file, in which case the original transformation file remains unchanged. Follow the steps below to create a simple process.
Exercise 1. Open Process Designer. 2. Add a Transformation step to the Process Design. 3. Right Click on the Transformation Map and choose Properties. 4. Click Browse and choose m_OutputModes_Clear_Append.map.xml from a previous exercise or from the solutions folder. Note: A Process Designer SQL Session is a particular method of connecting to the given SQL application's API. We can use the same session in multiple steps or create new sessions wherever needed. We must have at least one session if any connection to a relational database is made during the process.
5. A SQL Session is created based upon the maps target connection. Accept the default session for the target and click OK. 6. Name this step Load_Accounts. 7. Add another Transformation step to the Process Design. 8. Right Click on the Transformation Map and choose Properties. 9. Click New to open the Map Designer. 10. Create a new map that loads the ASCII Delimited file Category.txt into the tblCategories table in the TrainingDB Database. Use the report below for specifications.
Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Category.txt
Source Options: Header = False
Target Data:
Target Options:
None
Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters: target name = Target, record layout = R1
11. Accept the default for the Transformation Step dialog. 12. Choose Use an existing session for the target in the Sessions Dialog. 13. Name step Load_Categories. 14. Create a new map that loads ShippingMethod.txt into the tblShippingMethod table in the TrainingDB Database. Use the report below for specifications.
Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)ShippingMethod.txt
Source Options: Header = True
Target Data:
Target Options:
None
Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters: target name = Target, record layout = R1
15. Accept the default for the Transformation Step dialog. 16. Choose Use an existing session for the target in the Sessions Dialog. 17. Name step Load_ShippingMethod. 18. Establish the Step Sequence as described below (use the corresponding image if necessary). 19. Start → Load_Accounts → Load_Categories → Load_ShippingMethod → Stop 20. Validate the Process Design. 21. Save the Process as p_Load_Tables.ip.xml. 22. Run the Process Design. 23. Examine the Target Tables.
Exercise 1. Open the p_Load_Tables process from the previous exercise or from the solutions folder. 2. Run the process and check the log file for the length of time the process took to run. 3. Change the format of linking the steps in the process as pictured in the figure below. 4. Create a separate SQL Session for the target in each map. 5. Open the Process Properties Dialog and set Max Concurrent Execution Threads to 3. 6. Validate the process, and then save it as p_ParallelProcessing.ip.xml. 7. Run the process and check the log file for the length of time it took to run.
There is a limit to the number of execution threads that can be executed per license. The maximum number of execution threads allowed can be found on the Splash Screen. From the Toolbar, choose Help > About Process Designer > Licensed Features. Under the list of Features you will find the feature Max allowed threads.
Exercise 1. Open Process Designer. 2. Add a Transformation step to the Process Design. 3. Right Click on the Transformation Map and choose Properties.
4. Click Browse and choose m_Reject_Connect_Info.map.xml from a previous exercise or from the solutions folder. 5. Accept the default for the Transformation Step and the Sessions dialog. 6. Name step LoadAccounts_CheckDates. 7. Add a Decision step to the Process Design. 8. Right-click on the Decision icon and select Properties. 9. Name the step Eval_RejectRecordCount. 10. Using the Step Result Wizard, create and add the following code:
project("LoadAccounts_CheckDates").RejectRecordCount > 0
11. Click OK to close. 12. Add a Scripting Step to the Process Design. 13. Right-click on the Scripting icon and select Properties. 14. Name the step NotificationBadDates. 15. Use the Build button to build an expression that will display There are STILL invalid dates!!" in a message box with a stop icon and an OK button and the title Invalid Date Warning:
MsgBox("There are STILL invalid dates!!", 16, "Invalid Date Warning")
16. Click OK to close. 17. Link the steps as follows: 18. Start → LoadAccounts_CheckDates → Eval_RejectRecordCount (False) → Stop 19. Link the remaining steps as follows: 20. Eval_RejectRecordCount (True) → NotificationBadDates → Stop 21. Validate the Process Design. 22. Save your Process Design as p_ConditionalBranching_StepResultWizard.ip.xml. 23. Run the Process and observe the results.
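The Step Result Wizard expression above tests the RejectRecordCount property of a finished step; other step properties can be tested the same way. For example, a decision step could branch on whether a step completed successfully (the step name must match the one assigned in step 6):

' Alternative decision expression (sketch): branch on the step's return code
Project("LoadAccounts_CheckDates").ReturnCode == 0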
Variables: Name = currentFile, Type = Variant, Public = No
2.
3. Name the step LoadAccountsTable. 4. Click Browse to locate the m_OutputModes_Clear_Append.map.xml from a previous exercise or from the solutions folder. 5. Accept the defaults in the Sessions dialog to Create a New Session for the target. 6. Add a scripting step as described below. Name the step BuildFileList:
Expression:
' Set directory for incoming files. ' Consider using lookup or user input for this value.
filePath = MacroExpand("$(FUN_DATA)") & "Inbox\"
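The rest of the BuildFileList expression is not reproduced here. Conceptually it must fill a files array with the names found in filePath and set fileCounter to the index of the last entry, since the SetCurrentFile step below indexes files(fileCounter) and calls Ubound(files). The lines below are purely illustrative; FileList is assumed here to return an array of matching file names, so check the RIFL reference for the exact function and signature in your version:

' Illustrative only - populate the file array and counter used by later steps
' (FileList and its signature are assumptions, not taken from the solution)
files = FileList(filePath & "*.txt")
fileCounter = Ubound(files)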
9. Use the Line Builder to connect the steps created thus far. 10. Start → LoadAccountsTable → BuildFileList → GotFiles (False) → Notification_NoFiles → Stop 11. Add a scripting step as described below. Name the step SetCurrentFile:
Option Explicit
' Trap runtime errors (e.g., Array Index Out of Bounds)
On Error GoTo ErrorScript

' Set variable for the current file. Define Macro for use within the map.
currentFile = filePath & files(fileCounter)
DefineMacro("SOURCE_FILE", currentFile)

' Verification...
Dim f
f = Ubound(files) - fileCounter
MsgBox("Processing File: " & files(fileCounter) & ". File " & f + 1 & " of " & Ubound(files) + 1)

' Use the Return statement to exit this module
Return

' Error handler
ErrorScript:
' Get the error info and check variable values
LogMessage("ERROR", "Err.Number = " & Err.Number & " " & _
   "Err.Description = " & Err.Description & " " & "FileDirectory=" & filePath & " " & _
   "fileCounter=" & fileCounter & " " & "CurrentFile=" & files(fileCounter))
Terminate()
12. Add a Transformation step and name the step UpdateAddresses. 13. Click New and build a map based on the specifications below:
Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(SOURCE_FILE)
Source Options: Header = True

Define the Target:
Target Connector: ODBC 3.x
Target Data: Database: TrainingDB, Table: tblAccounts
Target Options: None
Source Schema
Field Name        Type   Length
Account Number    Text   9
New Street        Text   34
Total                    43
Records("R1").Fields("New Street")
Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters: target name = Target, record layout = R1
14. Save the map as m_UpdateAddresses.map.xml and close Map Designer. 15. Use the SQL session as was created in the first transformation step. 16. Add a decision step and name the Step SuccessCheck. 17. Use the Step Result Wizard to build the expression below:
Expression: Project("UpdateAddresses").ReturnCode == 0
18. Add a scripting step as described below. Name the step UpdateFileCounter:
Expression:
' Decrement the file counter variable
fileCounter = fileCounter - 1
19. Add a scripting step as described below. Name the step Notification_UpdateFailure:
Expression: MsgBox("Update Address Map Failed")
20. Connect the remaining steps as in the screen shot above the exercise instructions. 21. Validate the process. 22. Save the process as p_FileListLoop.ip.xml. 23. Run the process and observe the results.
Pervasive Integration Engine is an embedded data Transformation engine used to deploy runtime data replication, migration and Transformation jobs on Windows or UNIX-based platforms quickly and easily without costly custom programming. It fills the need for a low-cost, universal data transformation engine. The Integration Engine is a 32-bit data transformation engine written in C++, containing the core data driver modules that are the foundation for the transformation architecture. Because the Integration Engine is a pure execution engine with no user interface components, it can perform automatic, runtime data transformations quickly and easily, making it ideal for environments where regular data transformations need to be scheduled and launched on Windows or UNIX-based systems. Maps and Processes can be scheduled through the command line or invoked through an API. APIs are documented in the Integration Engine SDK.
Exercise 1. Open a command window by typing cmd in the Windows Run dialog. 2. Use a cd command to navigate to the directory where the engine is installed. 3. The default directory is: C:\Program Files\Pervasive\Cosmos9\Common. 4. To get the current engine version information, type: djengine -version
Exercise View the different options and parameters available for executing transformations and processes by using the -? switch. To see all the available options, at the command prompt type: djengine -help
Execute a Transformation
Objectives This lesson demonstrates how to execute a Transformation Map via the command line interface. Keywords: Executing a Map. Description At the command prompt type: djengine MapName.tf.xml. Note: Be sure to use the file that has the extension .tf.xml. This is the transformation file. The transformation file contains all of the connection information that the engine needs to connect to the source and target. It also contains a link to the map file. If you provide the engine with the name of the map file (the file with a .map.xml extension), you will receive errors. Tip: You can browse to the file in Windows Explorer and drag and drop the file onto the command line. Add -verbose at the end of the command to get statistics printed to the console during runtime. At the command prompt type: djengine C:\Cosmos9_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTest.tf.xml -verbose
The -Define_Macro option allows you to define individual Macros on the command line. At the command prompt type: djengine -Define_Macro FUN_DATA=C:\Cosmos9_Work\Fundamentals\Data\ C:\Cosmos9_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTestwithMacro.tf.xml -verbose
Executing a Process
Keywords: Using the Process Design Option
Command syntax is: djengine -process_execute filename (include the path). At the command prompt type (the command below wraps across lines but is entered as one command line): djengine -pe -verbose -Macro_File C:\Cosmos9_Work\Workspace1\macrodef.xml C:\Cosmos9_Work\Fundamentals\Solutions\ProcessDesigner_DataIntegrator\CreatingAProcess.ip.xml Notes We are using the -Macro_File option because some of the Maps in the process use a Macro as part of the source connection. Every process should contain the -pe switch as the first switch. It should always be used, even if you notice there are times when a process runs without it. The process name being called should always be the last item in the command. Any extra switches used should be entered after the -pe switch and before the path to the process.
Let's substitute a different source file for the original file defined in the Transformation to show how overrides can be performed at execution time. The syntax of the command is: djengine -Source_Connect_Info string (include the path). At the command prompt type: djengine -Source_Connect_Info C:\Cosmos9_Work\Fundamentals\Data\AccountsSmall.txt C:\Cosmos9_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTestwithMacro.tf.xml -verbose AccountsSmall.txt is a file that has the same format as Accounts.txt, but it only has 54 records.
Note that only 54 records were written. Note also that we did not need to define the Macro or the path to the Macro File. The Macro in the map was only used in the source connection and we defined a new source with a complete path. So the Macro was no longer relevant.
Including the DJEngine command in the batch file will allow you to use the batch file with third-party scheduling tools. Save the file Options.bas as Options.bat. Include the djengine call in Options.bat so that the entire text of the file reads: djengine -pe -verbose -Macro_File C:\Cosmos9_Work\Workspace1\macrodef.xml C:\Cosmos9_Work\Fundamentals\Solutions\ProcessDesigner_DataIntegrator\CreatingAProcess.ip.xml If you are using a Windows machine, use Windows Task Scheduler or schtasks to schedule this process. Tip: You may choose to add a pause command at the bottom of the script so that the command prompt will remain open and you can verify that the process ran.
Troubleshooting
Review the Integration Engine Command Line Interface Error Messages in the Help Files, as well as the Error Code Reference
Check the command line syntax
Verify that the .tf.xml file is being used for executing maps
Be sure the map or process file is specified last in the command
Check spelling
Verify the license has not expired (run djengine -V from the command line)
Confirm the appropriate version is installed
Does the process/map run from the Design Tools?
Are your environment variables set up correctly (i.e., PATH, connector-specific paths such as Oracle or Java)? Use the SET command to see a quick list of Windows environment variables.
Try a backup or previous copy of your file
Are you using the correct case? The following can be case sensitive:
   Macro names
   Platforms (Unix)
   Switches (i.e., -V vs. -v)
Does it run on one platform and not on another? Check your file path slashes:
   Windows - back slashes: \
   Unix - forward slashes: /
Setting an Environment Variable in Windows This setting allows the user to call the DJEngine command from any path and eliminates the need to include the full path to the command each time. 1. Right-click My Computer and choose the Properties option 2. Click on the Advanced tab 3. Click the Environment Variables button 4. Under System variables, scroll down to Path 5. Double-click Path 6. In Variable Value, put the following path, followed by a semicolon, in front of the first path:
Engine Profiler The Engine Profiler is a tool designed to fine tune your Transformations and Processes. There is an excellent document that goes into detail of the functionality and use of the Engine Profiler at C:\Program Files\Pervasive\Cosmos9\Common\Help\PDF\engine_profiler.pdf
This section explores the capabilities of Transformation Map Designer in more detail.
The resulting target file will have the following format:
Employee1 Data
Auto1 Data
Auto2 Data
Auto3 Data
Employee2 Data
Auto1 Data
Auto2 Data

In order to achieve this, it is important to know which event handlers require a ClearMapPut Action to write the target records. You can only make this decision by knowing what is contained in the source buffer or buffers. (The Source Buffer is the internal object that stores the values that have just been read in from a source record. There is one buffer for each source record type.) As a general rule, you will create at least one write action (usually a ClearMapPut) for every record type in the target.
Exercise 1. Create a map based on the specifications given below. 2. Save the map as m_One_to_Many.map.xml. 3. Run the map and observe the results.
Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Autos_Sorted.txt
Source Options: Header = True

Define the Target:
Target Connector: ASCII (Fixed)
Target Data: File: $(FUN_DATA)Autos_MultiRecType.txt
Target Options: None
Create 2 record types in the target through the Map Designer user interface. The layouts for both record types are described below:
Record Employee
Name       Type   Length
RecordID   Text   1
Initials   Text   2
Phone      Text   10
City       Text   9
State      Text   2
Total             24

Record Auto
Name       Type   Length
RecordID   Text   1
Initials   Text   2
Year       Text   4
Make       Text   10
Color      Text   5
Total             22
Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters: target name = Target, record layout = Auto
Define Events: Source R1 OnDataChange Event
Monitor: Records("R1").Fields("Initials")
Management: Fire first ODC event, Suppress extra ODC event at EOF
Event Name: OnDataChange1
Event Action: ClearMapPut Record
Event Parameters: target name = Target, record layout = Employee
Note: It would be a good idea to create recognition rules for each target record type. It would also be a good idea to save the schema that was created in the target through the Map Designer Interface. The trainer can walk you through these steps before moving on to the next exercise.
Exercise 1. Create a map based on the specifications given below. 2. Save the map as m_Many_to_One.map.xml. 3. Run the map and observe the results.
Map Summary:
Define the Source:
Source Connector: ASCII (Fixed)
Source Data: File: $(FUN_DATA)Autos_MultiRecType.txt
Source Options: none
Source Schema: s_Autos_MultiRecType.ss.xml

Define the Target:
Target Connector: ASCII (Delimited)
Target Data: $(FUN_DATA)Autos_Combined.txt
Target Options: Header = True
Define Events: Source Auto Events
Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters: target name = Target, record layout = R1
Exercise 1. Simply open any RIFL Script in the Editor Window and click the Save button on the toolbar. This saves a text file with a RIFL extension somewhere on your network. 2. To reuse the script, click the Open Folder toolbar button in another Script editor window. You will need to manually change any parameters for use in the new Script window. 3. Next, we will show you how to make the functions more flexible by abstracting them into User Defined Functions and storing them in Code Modules.
Exercise 1. Create a map based on the specifications given below. Save the map as m_CodeReuse.map.xml. 2. Run the map and observe the results.
Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: File: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ASCII (Delimited)
Target Data: $(FUN_DATA)ZipReport.txt
Target Options: Header = True
Records("R1").Fields("Zip") zipTest(Records("R1").Fields("Zip"))
Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters: target name = Target, record layout = R1
Lookup Wizards
Lookup Wizards automate the process of creating lookups for your Transformations. You select the data that needs to be looked up, browse to those files or tables to automatically build connection strings, and select the key and returned fields. After using the Lookup Wizard, a reusable code module is created in your workspace containing the functions you need for performing lookups. The Code Module files generated by these wizards can then be reused in any Map you create. There are three types of Lookup methodologies, and each has its advantages in certain situations. They are: 1. Static Flat File Lookups are fast but not very portable or dynamic. 2. Dynamic SQL Lookups are portable and dynamic but not very fast. 3. Incore Table Lookups are extremely fast and can be made more dynamic with extra RIFL code, but they use core memory to store the data.
Map Summary:
Define the Source:
Source Connector: ASCII(Delimited)
Source Data:      File: $(FUN_DATA)Accounts.txt
Source Options:   Header = True

Define the Target:
Target Connector: ODBC 3.x
Target Data:      Database: TrainingDB, Table: tblFavoriteInfo
Target Options:   none
Note: The following code module should be built through the Lookup Wizard. Follow the instructions below to use the Lookup Wizard.
Define Code Modules: Code Modules: $(FUN_DATA)Scripts\Categories.itable.rifl
2. From the Menu, click Tools > Define Lookup Functions to open the Lookup Wizard.
3. Choose the Incore Table Lookup Wizard and click Next.
4. Create a new Incore Table Definition named Categories and click Next.
5. Click Build to build a new Connection String and then click Next.
6. Connect to the data source defined below for the lookup and click Next.
Define the Connection String:
Connector:  Access 2000
File:       C:\Cosmos9_Work\Fundamentals\Data\TrainingDB.mdb
Table:      tblCategories
Properties: None
7. Choose the appropriate Key Field, and the Fields that should be returned by the lookup.
8. Click Finish.
9. The Wizard will create several Incore Table lookup functions in a code module. Use the following functions in the appropriate event handlers as described below.
Categories_Init(): Initializes the DJImport object, makes the connection to the data source as defined by the connection string, and builds the Incore table.
Categories_Category_Lookup(KeyValue, DefaultValue): Creates the SQL call needed to retrieve a value from the Category field based on a Key value.
Categories_ProductManager_Lookup(KeyValue, DefaultValue): Creates the SQL call needed to retrieve a value from the ProductManager field based on a Key value.
Categories_Clear(): Clears the Incore Table from memory.
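As a rough sketch of how these generated functions are typically wired in (the key field name CategoryCode and the default value mirror the flat file lookup example later in this workbook; the target field expression shown is illustrative rather than the exact workbook solution):

    ' BeforeTransformation (Execute): build the Incore table once
    Categories_Init()

    ' Target field expression, evaluated per record:
    Categories_ProductManager_Lookup(Records("R1").Fields("CategoryCode"), "NoManagers")

    ' AfterTransformation (Execute): release the Incore table
    Categories_Clear()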
Define Events: Transformation and Map Properties Events
Event Name             Event Actions   Event Parameters
BeforeTransformation   Execute         Expression: Categories_Init()
AfterTransformation    Execute         Expression: Categories_Clear()
Define Events: Source R1 Events
Event Parameters: target name = Target, record layout = R1,
                  count = CharCount("|", Records("R1").Fields("Favorites")) + 1,
                  counter variable = cntr

Map Expressions:
Serial()
R1.ProductManager
Exercise
1. Create a map based on the specifications given below.
2. Save the map as m_SQL_Passthrough.map.xml.
3. Run the map and observe the results.
Map Summary:
Define the Source:
Source Connector: ODBC 3.x
Source Data:      Database: TrainingDB
SQL Statement:    SELECT * FROM tblAccounts WHERE State = 'TX'
Source Options:   none

Define the Target:
Target Connector: ASCII(Delimited)
Target Data:      $(FUN_DATA)TXAccounts.txt
Target Options:   Header = True
Fields("AccountNumber") Fields("Name")
Fields("LastPayment") Fields("Balance")
Define Events: Source R1 Events
Event Name         Event Actions        Event Parameters
AfterEveryRecord   ClearMapPut Record   target name = Target, record layout = R1
Keywords: Integration Query Builder, DJX Syntax, and Dynamic Row Sets via User Interaction, InputBox

Description: DJX is used to escape into the RIFL expression language so that SQL statements can be designed dynamically. This allows you to use variables and macros in a SQL statement. This exercise selects records from the tblAccounts table that are from a particular state; that state is determined at runtime.
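A minimal sketch of the pattern used in this exercise (the variable name, prompt, and statement mirror the specification below): a BeforeTransformation Execute expression collects the value, and the DJX() call in the source SQL Statement property is evaluated at runtime, with its result substituted into the statement before it is sent to the database.

    ' BeforeTransformation (Execute) expression:
    varState = InputBox("Enter the two letter code for the State:", "State Input", "TX")

    ' Source SQL Statement property (shown here as a comment):
    ' SELECT * FROM tblAccounts WHERE State = DJX(varState)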
Exercise
1. Create a map based on the specifications given below.
2. Save the map as m_SQL_DynamicRowset.map.xml.
3. Run the map and observe the results.
Map Summary:
Define the Source:
Source Connector: ODBC 3.x
Source Data:      Database: TrainingDB
SQL Statement:    SELECT * FROM tblAccounts WHERE State = DJX(varState)

Define the Target:
Target Connector: HTML
Target Data:      $(FUN_DATA)AccountsByState.html
Target Options:   index = false; mode = table; table border = true
Variables
Name Type Public Value
varState Variant no
Fields("AccountNumber") Fields("Name") Fields("Company") Fields("Street") Fields("City") Fields("State") Fields("Zip") Fields("Email") Fields("BirthDate") Fields("Favorites")
Fields("LastPayment") Fields("Balance")
Define Events: Transformation and Map Properties Events
Event Name             Event Actions   Event Parameters
BeforeTransformation   Execute         Expression:
                                       varState = InputBox("Enter the two letter code for the State:", "State Input", "TX")
Define Events: Source R1 Events
Event Name         Event Actions        Event Parameters
AfterEveryRecord   ClearMapPut Record   target name = Target, record layout = R1
Multimode Introduction
Keywords: Multimode Functionality, Insert Action, and Count Parameter

Multimode is functionality that allows us to write to more than one table in the same database within the same Transformation. The Multimode connector gives us greater capabilities when mapping to a database. Since we now have the option to map to multiple tables within a database, there isn't a single output mode setting such as Replace, Append, Clear and Append, Update, or Delete; we may want to append records to one table but delete records from another. This functionality therefore exists as Actions that can be taken with specific record layouts and table names.

The Account Numbers in the Accounts.txt file all start with either 01 or 02. The ones that start with 01 are trading partners. We want to create a Transformation that will insert those records into the tblTradingPartners table in the TrainingDB database. The records that start with 02 are individual customers, and we will insert them into the tblIndividuals table.
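A hedged sketch of how that routing can be expressed: the count parameter of each insert action can be driven by a RIFL expression that inspects the account number prefix, so a record is written to a table only when the expression evaluates to 1 (the syntax mirrors the count expression in the Upsert exercise later in this lesson).

    ' count parameter for the tblTradingPartners insert ("01" accounts):
    If Left(Records("R1").Fields("Account Number"), 2) == "01" Then 1 Else 0 End If

    ' count parameter for the tblIndividuals insert ("02" accounts):
    If Left(Records("R1").Fields("Account Number"), 2) == "02" Then 1 Else 0 End If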
Exercise
1. Create a map based on the specifications given below.
2. Save the map as m_Multimode_Intro.map.xml.
3. Run the map and observe the result.
Map Summary:
Define the Source:
Source Connector: ASCII(Delimited)
Source Data:      $(FUN_DATA)Accounts.txt
Source Options:   Header = True

Define the Target:
Target Connector: ODBC 3.x Multimode
Target Data:      Database: TrainingDB, Tables: tblIndividuals, tblTradingPartners
Target Options:   none
In order to remove any previous data residing in these tables, we can use a SQL Statement Action to write literal SQL that deletes all records from these tables.
Define Events: Transformation Events
Event Name             Event Actions   Event Parameters
BeforeTransformation   SQL Statement   target name = Target
                                       statement = Delete from tblIndividuals; Delete from tblTradingPartners
Define Events: Source R1 Events
Event Name         Event Actions           Event Parameters
AfterEveryRecord   ClearMapInsert Record   target name = Target, record layout = tblIndividuals
                   ClearMapInsert Record   target name = Target, record layout = tblTradingPartners
tblIndividuals.Favorites
Records("R1").Fields("Payments") Records("R1").Fields("Balance")
Records("R1").Fields("Payments") Records("R1").Fields("Balance")
Exercise
1. Create a map based on the specifications given below.
2. Run the map and observe the result.
Map Summary:
Define the Source:
Source Connector: ASCII(Delimited)
Source Data:      $(FUN_DATA)Accounts.txt
Source Options:   Header = True

Define the Target:
Target Connector: ODBC 3.x Multimode
Target Data:      Database: TrainingDB, Tables: tblEntity, tblFavorites, tblPayments, tblRejects
Target Options:   none
Variables
Name          Type     Public  Value
rejectReason  Variant  no      "NoReason"
Records("R1").Fields("Account Number") Records("R1").Fields("Name") Records("R1").Fields("Company") Records("R1").Fields("Street") Records("R1").Fields("City") Records("R1").Fields("State") Records("R1").Fields("Zip") Records("R1").Fields("Email") DateValMask(Records("R1").Fields("Birth Date"), "mm/dd/yyyy")
Records("R1").Fields("Account Number") Serial(0) ' Starts at 1 each execution. Consider using a lookup to get Max Value first. Parse(cntFavorites, Records("R1").Fields("Favorites"), "|")
tblFavorites.Favorites
Records("R1").Fields("Account Number") Serial(0) 'Starts at one each execution. Consider using lookup for Max Value Records("R1").Fields("Payments") Records("R1").Fields("Balance")
tblPayments.Payments tblPayments.Balance
Records("R1").Fields("Account Number") Serial(0) 'Starts at one each execution. Consider using lookup for Max Value rejectReason Records("R1").Fields("Name") Records("R1").Fields("Company") Records("R1").Fields("Street") Records("R1").Fields("City") Records("R1").Fields("State") Records("R1").Fields("Zip") Records("R1").Fields("Email") Records("R1").Fields("Birth Date") Records("R1").Fields("Favorites") Records("R1").Fields("Standard Payment") Records("R1").Fields("Payments") Records("R1").Fields("Balance")
tblRejects.RejectReason tblRejects.Name tblRejects.Company tblRejects.Street tblRejects.City tblRejects.State tblRejects.Zip tblRejects.Email tblRejects.Birth Date tblRejects.Favorites tblRejects.Standard Payment tblRejects.Payments tblRejects.Balance
Define Events: Transformation Events
Event Name             Event Actions   Event Parameters
BeforeTransformation   DropTable       Target, tblEntity
                       DropTable       Target, tblFavorites
                       DropTable       Target, tblPayments
                       DropTable       Target, tblRejects
                       CreateTable     Target, tblEntity, tblEntity
                       CreateTable     Target, tblFavorites, tblFavorites
                       CreateTable     Target, tblPayments, tblPayments
                       CreateTable     Target, tblRejects, tblRejects
                       CreateIndex     Target, tblEntity, tblEntity, idxEntity, True
                       CreateIndex     Target, tblFavorites, tblFavorites, idxFavorites, True
                       CreateIndex     Target, tblPayments, tblPayments, idxPayments, False
                       CreateIndex     Target, tblRejects, tblRejects, idxRejects, False
Define Events: Source R1 Events
Event Name         Event Actions           Event Parameters
AfterEveryRecord   ClearMapInsert Record   target name = Target, record layout / table name = tblEntity
                   ClearMapInsert Record   count = CharCount("|", Records("R1").Fields("Favorites")) + 1
Define Events: Target Events
Event Name          Event Actions           Event Parameters
OnConstraintError   Execute                 Expression: rejectReason = "General-OnConstraintErr"
                    ClearMapInsert Record   none
OnError             Execute                 Expression: rejectReason = "General-OnError"
                    ClearMapInsert Record
                    Resume                  none
Do not forget the Resume Action. The Resume Action is what causes the map to continue processing the remaining records after the error is handled.
Exercise
1. Create a map based on the specifications given below.
2. Save the map as m_Multimode_Upsert.map.xml.
3. Run the map and observe the result.
Map Summary:
Define the Source:
Source Connector: ASCII(Delimited)
Source Data:      $(FUN_DATA)Accounts.txt
Source Options:   Header = True

Define the Target:
Target Connector: ODBC 3.x Multimode
Target Data:      Database: TrainingDB, Tables: tblIndividuals, tblTradingPartners
Target Options:   None
Variables
Name        Type     Public  Value
varChngSrc  Variant  no      0
Define Events: Transformation Events
Event Name             Event Actions   Event Parameters
BeforeTransformation   SQL Statement   target name = Target
                                       statement = Delete from tblIndividuals; Delete from tblTradingPartners
Records("R1").Fields("Account Number") Records("R1").Fields("Name") Records("R1").Fields("Street") Records("R1").Fields("City") Records("R1").Fields("State") Records("R1").Fields("Zip") Records("R1").Fields("Email") DatevalMask(Records("R1").Fields("Birth Date"), "mm/dd/yyyy") Records("R1").Fields("Favorites") Records("R1").Fields("Standard Payment") Records("R1").Fields("Payments") Records("R1").Fields("Balance")
Define Events: Source R1 Events
Event Name         Event Actions   Event Parameters
AfterEveryRecord   ClearMap        target name = Target, record layout = tblIndividuals
                   Upsert Record   count = If Left(Records("R1").Fields("Account Number"), 2) == "02" Then 1 Else 0 End If
                   ClearMap        target name = Target, record layout = tblTradingPartners
                   Upsert Record   count = If Left(Records("R1").Fields("Account Number"), 2) == "01" Then 1 Else 0 End If
Consider creating variables to use as flags for the count parameter. The variables can be set in an Execute action in the AfterEveryRecord event. This keeps the logic for writing to each table in one location, which is easier to update in the long run.
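A minimal sketch of that approach (the flag variable names are illustrative): set the flags once per record in an Execute action placed before the ClearMap/Upsert actions, then reference each flag as the count parameter of the corresponding Upsert Record action.

    ' AfterEveryRecord, Execute expression:
    If Left(Records("R1").Fields("Account Number"), 2) == "02" Then
        flgIndividual = 1
    Else
        flgIndividual = 0
    End If
    If Left(Records("R1").Fields("Account Number"), 2) == "01" Then
        flgTradingPartner = 1
    Else
        flgTradingPartner = 0
    End If
    ' Use flgIndividual as the count for the tblIndividuals Upsert Record action,
    ' and flgTradingPartner as the count for the tblTradingPartners Upsert Record action.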
Define Events: Source Events
Event Name   Event Actions   Event Parameters
OnEOF        ChangeSource    Expression:
                             If varChngSrc == 0 Then
                                 varChngSrc = 1
                                 "+File=$(FUN_DATA)AccountsUpdate.txt"
                             End If
Reference
o Gather e-mail information and confirm checkpoints that require special logging.
Identify Platform and Software Needs
o What operating system platforms do you plan to use? Is client software needed for connectivity?
o Do you need special expertise to set up and configure the software, which would require a database administrator or Pervasive professional services?
Naming Conventions for Specification Files, Variables, and Objects
Invoke Integration
o How will you call your maps and processes (batch, real-time with Integration Server)?
Integration Design
Performance (Lookups, Parallel and Multithreaded)
How To (record filtering vs. multiple record types; error handling)
Reference:
See Best Practices: http://docs.pervasive.com/products/integration/download/best_practices.pdf
Cosmos.ini Settings
The cosmos.ini file contains the startup information required to launch the integration products. This file is available in InstallDir\Common, where InstallDir is the installation directory for the integration tool set. For more information, see Windows Default Installation Locations on the next page, and see the Installation Locations topic in the release notes.
Component                                             Default Location
cosmos.ini                                            C:\Program Files\Pervasive\Cosmos9\Common
Repository Manager                                    C:\Program Files\Pervasive\Cosmos9\RepositoryManager
Component SDK                                         C:\Program Files\Pervasive\Cosmos9\Common\ComponentSDK
Target and Source Connectors                          C:\Program Files\Pervasive\Cosmos9\Common\connections
Product Documentation (PDFs and Help)                 C:\Program Files\Pervasive\Cosmos9\Common\Help
                                                      C:\Program Files\Pervasive\Cosmos9\Common\Help\PDF
SDK Documentation                                     C:\Program Files\Pervasive\Cosmos9\Common\Help\SDKs
License Files                                         C:\Documents and Settings\All Users\Application Data\Pervasive\Cosmos9\Common\License
Components (Plug Ins)                                 C:\Documents and Settings\All Users\Application Data\Pervasive\Cosmos9\Common\Plug-Ins
Samples .msi file                                     C:\Program Files\Pervasive\Cosmos9\Common\Samples
SDKs for Content eXtraction Language and Engine SDK   C:\Program Files\Pervasive\Cosmos9\Common\SDKs
Table 1-2 Windows Vista and Windows Server 2008 Default Installation Locations

Component                                             Default Location
cosmos.ini
Integration Platform Designers and Other Executables
Integration Server                                    C:\Program Files\Pervasive\Cosmos9\IntegrationServer
                                                      C:\ProgramData\Pervasive\Cosmos9\IntegrationServer
Repository Manager                                    C:\Program Files\Pervasive\Cosmos9\RepositoryManager
Component SDK                                         C:\Program Files\Pervasive\Cosmos9\Common\ComponentSDK
Target and Source Connectors                          C:\Program Files\Pervasive\Cosmos9\Common\connections
Product Documentation (PDFs and Help)                 C:\Program Files\Pervasive\Cosmos9\Common\Help\
                                                      C:\Program Files\Pervasive\Cosmos9\Common\Help\PDF
SDK Documentation                                     C:\Program Files\Pervasive\Cosmos9\Common
License Files                                         C:\ProgramData\Pervasive\Cosmos9\Common\License
Components (Plug Ins)                                 C:\ProgramData\Pervasive\Cosmos9\Common\Plug-Ins
Samples .msi file                                     C:\Program Files\Pervasive\Cosmos9\Common\Samples
SDKs for Content eXtraction Language and Engine SDK   C:\Program Files\Pervasive\Cosmos9\Common\SDKs
Process Designer
Setting Properties
Setting Map Properties
To set the Map tab to show the Navigation Tree with Events, from the menu click View > Preferences. Click the General tab. Check "Always show Map All view."
Setting RIFL Script Properties
To set the editor to show a line number for each line of the scripts, on the menu bar choose View > Editor Properties. Click the Misc tab. In the lower left, see Line numbering. In the Style dropdown, choose Decimal. Change the "Start at" value to 1. Click OK.
Note: The Log File Browser displays a maximum of 32,000 lines. If your log file is very long, you will be able to see only the last 32,000 lines of it in the browser. If you want to see the rest, open it in a text editor, such as WordPad.
3. Click Search to display the Find Text dialog box. It allows you to search the error and event log file for a particular string of text.
4. Click Clear Log to delete the log file.
To change names of log files: You can set the name of the .log file in three places.
o In Map Designer, open Transformation and Map Properties, click Error Logging, and type a new log file name.
o In Process Designer, select File > Process Properties, click the Logging tab, type a name for the log, and click OK.
o For the Engine, type the following at a command prompt: -logfile newlogfilename
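For example, combining that option with an engine run might look like the following. This is only a sketch: it reuses the sample transformation path from the engine command-line examples later in this document, and it assumes the -logfile option is placed before the map path in the same way the -se option is in those examples.

    djengine -logfile MyRunLog.log C:\Cosmos_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTestwithVar.tf.xml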
Transformation Log Codes
The transformation log file displays the following information:

Date         Time       Error Type   Internal Code   Direction Code   Source
08/25/2006   14:08:10                                                  Global

Error Type
1   Informative
2   Warning
4   General Error
8   Fatal Error
16  Debug Message

Internal Code
This code is related to 255xx codes.

Direction Code
I   Import
C   Component
E   Export
M   Message component
O   Other
U   Unknown component

Source
The source of the log message can be global, the name of a connector, the name of a component, or some other indication of the origin of the message.
Best Practices:
http://docs.pervasive.com/products/integration/download/best_practices.pdf
Product Documentation:
http://docs.pervasive.com/products/integration/di/wwhelp/wwhimpl/js/html/wwhelp.htm#href=contact/contact.html
Integration Support:
http://www.pervasiveintegration.com/support/Pages/submit_a_support_ticket.aspx
Integration Forums:
http://cs.pervasive.com/forums/16.aspx
Documentation and Downloadable Samples:
http://www.pervasiveintegration.com/support/documentation/Pages/documentation_and_samples.aspx
Event Management Guide:
http://docs.pervasive.com/products/integration/download/events.pdf
Product Downloads:
http://www.pervasiveintegration.com/support/Pages/product_downloads.aspx
Integration Manager Pages:
http://www.pervasiveintegration.com/products/Pages/integration_manager.aspx
Glossary
A
Action One of the options in Event Handlers (Map tab, upper left quadrant in Map Designer). For example, ClearMapPut Record is the default Action automatically set when you do not override the option. Some other Actions in the drop down list include: Execute, MapPut Record, Map, Put Record, Insert Record, Clear, and ClearInitialize. Array In programming, a series of objects, all of which are the same size and type. Each object in an array is called an array element. For example, you could have an array of integers or an array of characters or an array of anything that has a defined data type. The important characteristics of an array are: Each element has the same data type (although they may have different values). The entire array is stored contiguously in memory (that is, there are no gaps between elements). Arrays can have more than one dimension. A one-dimensional array is called a vector; a twodimensional array is called a matrix. Arithmetic operators The +, -, *, /, and ( ) are operators used to construct arithmetic expressions. ASCII The most common format for text files. In an ASCII file, each alphabetic, numeric, or special character is represented with a 7-bit number (a string of seven 0s or 1s), with 128 characters defined. Unix and older DOS-based operating systems use ASCII for text files. Newer Windows systems use an encoding standard called Unicode. IBM System 390 servers use a proprietary 8-bit code called EBCDIC. Transformation programs allow different operating systems to change a file from one encoding standard to another. The American National Standards Institute (ANSI) oversees ASCII Standards.
B
Binary File A computer file that contains machine-readable information that must be read by an application; the characters use all 8 bits of each byte. Boolean logic The type of an expression with two possible values, True and False. Also, a variable of Boolean type or a function with Boolean arguments or result. The most common Boolean functions are And, Or and Not. In computer operation with binary values, Boolean logic can be used to describe electromagnetically charged memory locations or circuit states that are either charged (1 or true) or not charged (0 or false). The computer can use an AND gate or an OR gate operation to obtain a result that can be used for further processing.
C
Comma-delimited A data format in which each piece of data is separated by a comma. This is a popular format for transferring data from one application to another, because most database systems are able to import and export comma-delimited data. Concatenate
To merge the records from two or more files into a single file. Also, to add a string of data to other data that already exists in a field. In Map Designer, you can concatenate fields into a single field by using an expression. Connection String A list of key = value pairs. The keywords are either names of connection information fields, or Connector property names. The key=value pairs are separated by a semi colon (;). Connector Name of the type of connection at the Source or Target tab. ASCII (Delimited), MySQL, and Oracle 9i are some examples of Connectors. In early versions of Map Designer, the term for connector was spoke. Constraint An object used to place rules on data in a relational database. Constraints are used to control the allowed data in a column, are created at the column level, and are used to enforce referential integrity (parent and child table relationships). Conversion Called a transformation in more recent versions of Map Designer, and the basic unit for all data transfer and manipulation. A conversion (transformation) is one set of source, target, and mapping specifications. When these specifications are set, the data transformation process can be run.
D
Data (1) Distinct pieces of information, usually formatted in a special way. Software is divided into two general categories: data and programs. Programs are collections of instructions for manipulating data. (2) The term data is often used to distinguish binary machine-readable information from textual human-readable information. For example, some applications make a distinction between data files (files that contain binary data) and text files (files that contain ASCII data). (3) In database management systems, data files are the files that store the database information, whereas other files, such as index files and data dictionaries, store administrative information, known as Metadata.
Data integrity Refers to the validity of data. Data integrity can be compromised in a number of ways:
o Human errors when data is entered
o Errors that occur when data is transmitted from one computer to another
o Software bugs or viruses
o Hardware malfunctions, such as disk crashes
o Natural disasters, such as fires and floods
There are many ways to minimize these threats to data integrity. These include:
o Backing up data regularly
o Controlling access to data via security mechanisms
o Designing user interfaces that prevent the input of invalid data (such as Input Boxes for user input)
o Using error detection and correction software when transmitting data (error trapping, reject tables)
Data structure
A Schema (Map tab in Map Designer). In previous versions of Map Designer, data structures were also called Record Layouts. A data structure is the arrangements of fields in a record within a particular data file, either source or target. This includes field length, record length, field data types, and other field properties such as Decimal and Scale. Data type The classification of data a field can contain. Some data types include text, numeric, datetime, float, packed decimal, Boolean, and 16-bit binary. Database An organized collection of information, stored systematically in tables or files. Default What the integration product automatically does in the absence of an overriding command. For example, if no After Every Record events are selected in Map Designer, the ClearMapPut Record Action is automatically invoked when a transformation is run. Delimited ASCII data ASCII data has fields that are separated by some character, often a comma. Field entries frequently begin and end with double quotation marks ("), and records are often separated by a carriage returnline feed (CR-LF). Records and fields are not usually a fixed length. Delimiter A character or combination of characters used to separate one item or set of data from another. For example, in comma-delimited records, a comma is used to separate each field of data. In the Map Designer ASCII Delimited connector, the source and target Property default setting is commadelimited. Design time Activities performed when designing a transformation or process. It includes specifying source and target connection information, reading and applying metadata, specifying transformation events, options, execution paths, errors, defining mapping expressions, and exception handling. Discriminator A discriminator is the data within a file that indicates record type. DJAR Data Junction Archive (DJAR) is a package that contains processes and dependents of the processes such as Maps, Functions, Executables, etc. DJImport Object An internal object designed to provide a generic interface to Map Designer Connectors. It is used to read data to be utilized as a source.
E
EBCDIC An IBM code for representing characters as numbers. Although it is common on large IBM computers, most other computers, including PCs and Macintoshes, use ASCII codes. Expression
An Expression (also called a Script) is a combination of Operator, literal values, field names, Statement, Variable, and Function. They are used to perform calculations, enter a specific value, concatenate data, or otherwise modify data in a particular field. Expression Builder Now called RIFL Script Editor, this is the functional area of Map Designer where you can write your own scripts to include with your transformations. RIFL Script Editor includes a list of all of the functions and operators available to you in RIFL (Rapid Integration Flow Language).
F
Field A labeled or unlabeled column of information in a data file or table; a field contains the same kind of information for each record in the data file or table. File format A format for encoding information in a file. Each different type of file has a different file format. The file format specifies first whether the file is binary or ASCII, and second, how the information is organized. Filter A set of criteria applied to a range of records. In the Map Designer, both the source and target filters sift through data and return a subset of records specified in the filter options. The number of records processed can also be specified in these filters. Fixed ASCII file An ASCII data file that has fixed field and record sizes, but no delimiter (except possibly a record separator). Fixed length Having a set length that never varies. In database systems, a field can have a fixed or a variable length. A variable-length field is one whose length can be different in each record, depending on what data is stored in the field. The terms fixed length and variable length can also refer to the entire record. A fixed-length record is one in which every field has a fixed length. A variable-length record has at least one variablelength field. Flow Control Management of data flow between computers, devices, or network nodes to maintain efficient use of data. Function A small section of a program designed to perform a specific task. Many functions return a value based on the results of a calculation or other operation. Some functions operate as a procedure and return no value. In Map Designer, functions can be used to map and manipulate data. A list of available functions is in the RIFL Script Editor interface. For a list of functions, see All Functions in the Help Files.
G
GUI
(Graphical User Interface) A graphics-based user interface that incorporates icons, menus, and a mouse. The interface has become the standard way users interact with a computer. In a client-server environment, the GUI resides in your client machine.
H
Header Information that appears at the beginning of a data file, but is not a part of the actual data.
I
Integration In reference to data it is the combining or movement of data from different sources to provide end users with a unified view of this data. The data movement may also involve transforming the data through computations, or modifying the data format.
K
Key In database management systems, a key is a field that you use to sort data. It can also be called a key field, sort key, index, or key word. Most database management systems allow you to have more than one key so that you can sort records in different ways. One of the keys is designated the primary key, and must hold a unique value for each record. A key field that identifies records in a different table is called a foreign key.
L
Lookup Table An array or matrix of data that contains values that can be searched.
M
Mask A pattern of tokens used to accept or reject patterns in another set of data. For example, a date mask that looks for two numbers followed by a slash followed by two more numbers, another slash and two more numbers (##/##/##) can be used to match a string of source data. When the specified pattern appears in both the mask and the data, the source data will be written to the target. Metadata Data about data. Metadata describes how and when and by whom a particular set of data was collected, and how the data is formatted. Metadata is essential for understanding information stored in data warehouses. Multimode Specific connector types that have been designated to allow writes to multiple tables. When a user selects one of these connector types, the Output mode will automatically be "Multiple Output Mode". This cannot be changed to regular output mode. SQL Script and ODBC 3.x are two of the Multimode Connectors available.
N
Null A value that indicates missing or unknown data in a field. Null characters are placeholders with a hex value 00. These values can be entered in fields for which information is unknown and can be used in expressions. Some fields, such as those identified with primary keys, cannot contain Null values.
O
Object A mechanism that binds data to methods that operate on it. In object-oriented programming, an object is a self-contained entity that consists of both data and procedures to manipulate the data. ODBC (Open Data Base Connectivity) A database programming interface introduced by Microsoft in 1992 that provides a common language for applications to access databases on a network. ODBC is made up of the function calls programmers write into their applications and the ODBC drivers themselves. For client/server database systems such as Oracle and SQL Server, the ODBC driver provides links to their database engines to access the database. For desktop database systems such as dBASE and FoxPro, the ODBC drivers actually manipulate the data. ODBC supports SQL and non-SQL databases. Although the application always uses SQL to communicate with ODBC, ODBC will communicate with non-SQL databases in its native language. Map Designer supports ODBC 2.x, ODBC 3.x, ODBC 3.5 and ODBC 3.x multimode connectivity. OLE OLE (Object Linking and Embedding) is a compound document technology and part of Microsoft ActiveX technologies. A compound document can contain visual and information objects of all kinds. Each object is an independent program entity that can interact with a user and also communicate with other objects. OLE utilizes the Component Object Model (COM) and its distributed version, (DCOM). An OLE object is also, by default, a component (or COM object). OnEOF A source schema event (upper left, Map tab in Map Designer). Executed when the end of the file (EOF) is reached. Operator A symbol that represents an operation to be performed on a value or values. For example, the + operator represents addition, and the * operator represents multiplication. Output A mode which represents the transfer of data from the source to the target (Map tab in Map Designer). Some selections include: Replace File/Table, Append to File/Table, Update File/Table and Clear/ Append. Connectors that write to multiple tables use the Multimode Output mode.
R
RDBMS
Relational Database Management System. RDBMS includes a wide variety of SQL and relational database systems, such as SQL Server and Oracle. Data is stored in multiple tables, many of which are linked by the use of primary key fields. Record (1) In database management systems, a complete set of information. Records are composed of fields, each of which contains one item of information. A set of records constitutes a file. For example, a personnel file might contain records that have three fields: a name field, an address field, and a phone number field. (2) Some programming languages allow you to define a special data structure called a record. Generally, a record is a combination of other data objects. For example, a record might contain three integers, a floating-point number, and a character string. Record layout The term for a data structure used in Map Designer. The alternative term is schema. The arrangement of fields in a record in a particular data file, either source or target. This includes field length, record length, field data types, and other field properties such as decimal and scale. Record number A unique number that identifies each record in a data file or table. Record type A set of field options within the source and target schemas (Map tab in Map Designer). These options include layout name, length, lock, schema origin, and description. Regular expression A string of characters that defines a set of rules for matching character strings found in fields. Relative path An implied path. When a command is expressed that references files, the current working directory is the implied, or relative, path if the full path is not explicitly stated. Repository A physical location on your local system and on the network. It stores maps, connections, structured schemas and join view files. RIFL Rapid Integration Flow Language (RIFL) is a custom expression language for the integration products. RIFL includes functions, statements, operators, events, scripts, and objects unique to the integration platform. Some RIFL functions are similar, but not the same as, Visual Basic. RIFL scripts can be run on both Windows and Unix systems. Use the .rifl extension for script files. Run Time The events that occur during transformation and process execution. These include connecting to data sources and targets, reading and writing data, compiling and evaluating expressions, transformation events, and exception processing.
S
Scale A Field Property Value option (Map tab in Map Designer). Designates where a decimal is positioned in a number.
Schema The term for a data structure (Map tab in Map Designer). The arrangement of fields in a record in a particular data file, either source or target. This includes field length, record length, field data types, and other field properties such as decimal and scale. You can create and modify schemas in Document Schema Designer and in Structured Schema Designer. These schemas can then be validated in Process Designer, and used as structural metadata in Map Designer. Scope In programming, the visibility of variables within a program. For example, whether or not one function can use a variable created in another function. Script A Script or Expression is a grammatically correct combination of operators, literal values, field names, variables and functions used to perform calculations, enter a specific value, concatenate data, or otherwise modify data in a particular field. Server The application that responds to the calling application or client in a DDE or OLE conversion. The server usually sends data to the client. SQL Structured Query Language (abbreviated SQL and commonly pronounced "sequel") is the standard language for storing and manipulating data in relational databases. Statement A descriptive phrase that generates one or more instructions in the computer. String An alphanumeric value or an expression consisting of alphanumeric characters. Syntax Grammar, structure, or order of elements in a language statement. Syntax Error An error caused by an incorrectly expressed statement written in the RIFL Script Editor or in a transformation event in Map Designer.
T
Table (1) In programming, a collection of adjacent fields. Also called an array. A table contains data that is either constant within the program or is called when the program runs. (2) In a relational database, the same as a file; a collection of records. A structure made up of rows (records) and columns (fields) that contain information. A table is the primary object used to store data. When data is queried and accessed for modification, it is usually found in a table. Transformation Called a conversion in previous versions of Map Designer, a transformation is the basic unit for all data transfer and manipulation. A transformation is one set of source connection, target connection, mapping, event, and property specifications. When these specifications are set, the data transformation process can be run.
Truncate To remove leading or trailing digits or characters from an item of data without regard to the accuracy of the remaining characters. Truncation occurs when data is converted into a new record with smaller field lengths than the original.
U
Unicode A character encoding scheme that uses two bytes to represent every character, regardless of whether it's an ASCII character. This scheme is capable of encoding all known characters and is used as a worldwide character-encoding standard.
V
Validation A process that ensures that the user has provided sufficient information in the design phase. In Process Designer, for example, it verifies that the Steps and links have certain fundamental requirements. Variable (Public, Global, Dim) A named storage location that can be modified during program execution. Each variable has a name that uniquely identifies it within its level of scope. A Public variable can be used throughout a project, while a Global variable can be used throughout a transformation. Dim variables are specific to a module or an expression. View A virtual table that looks like and acts like a table in a relational database. A view is defined based on the structure and data of a table. A view can be queried and sometimes updated.
W
Where Clause The part of a SQL statement that specifies which records to retrieve. In the Map Designer, the statement is an option in source properties in several SQL database applications, such as Access, Oracle, and SQL Server. Workspace A collection of Repositories. Each Workspace directory contains a macro definitions file
called "macrodef.xml".
Appendix
This section contains additional exercises and information that may be of use.
Additional Exercises
Keywords: Extract Schema Designer: Multiple Fields per Line Style (fixed)

Description: The next file that we will be parsing is Purchases_Mail.txt. We should take a look at it in a text viewer. Although it might be possible to use this report file as a direct input for a transformation, we would have to define it as a multiple-record-type file. Although there are fewer record types than with the phone purchases we dealt with earlier, there are still enough that, when combined with the extra processing logic involved, the job would become tedious. So, again, what we plan to do is use the Extract Schema Designer to create an extract specification that will transform the report file into a more familiar row/column format, and then use that formatted data as input to the transformation that adds these purchases to the database table. As before, we don't require multiple passes of the input file. We will just create the extract schema and apply it to the input on the Source tab of our eventual transformation.
Exercise
1. From the Repository Explorer, select New Object > Extract Schema.
2. At the prompt, navigate to the file you will be working with, in this case, Purchases_Mail.txt.
3. In the Source Options dialog, on the Extract Design Choices tab, set the Tag Separator to Colon:Space (: ). Also on this tab, ensure that the Trim Leading and Trailing Spaces checkbox is selected.
4. On the Display Choices tab, ensure that the Pad Lines checkbox is selected.
5. Choose OK to accept the selections.
6. Highlight the entire Account Number line in the data.
7. Right-click in the highlight and select Define Data Field > Parse Tagged Data.
8. Highlight the label Purchase Order Number.
9. Right-click in the highlight.
10. Select Define Line Style > New Line Style.
11. Change the Line Style Name to PONumber.
12. Choose Add.
13. Highlight the Purchase Order Number tag and the data following it.
14. Right-click in the highlight.
15. Select Define Data Field > Parse Tagged Data.
16. Define the PO_Date Field using the same technique.
17. Define the Category Line Style and the three Fields on it using the same technique.
18. Define the Unit Cost Line Style and the three Fields on it using the same technique.
19. Define the Line Style that determines the end of a row of data for the Extract File.
20. Locate the Line Style that contains the Field that will be the last column in each row in the eventual extract file (in this case, Unit_Cost).
21. Double-click on the Line Style name to bring up the Line Style Definition dialog.
22. On the Line Action tab, choose ACCEPT Record, and accept the remaining defaults.
23. Choose Update.
24. Click on the Browse Data Record button.
25. Choose OK to allow assignment of all Fields to the Extract File.
26. Examine the data to ensure that your Field definitions are correct.
27. Close the browser window.
28. Ensure that the Fields are in the order they appear in the input data.
29. Save the Extract Schema Design as Purchases_Mail.cxl.
30. Close the Extract Schema Designer.
31. Remember that this schema can be used as part of a source connection in Map Designer.
Note that without the Verbose command, the only command line indication that the Map ran correctly is a single line: Return Code: 0
Now let's change the value of the variable. For a string with a single word, type at the command prompt:
djengine -se myVar=\"NewValue\" C:\Cosmos_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTestwithVar.tf.xml
For a string with multiple words, type at the command prompt:
djengine -se myVar=\"New Value\" C:\Cosmos_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTestwithVar.tf.xml
Additional notes: Aside from normal command line quoting/escaping sequences for the given operating system, what is to the right of the equals sign will be used verbatim in an expression to set the variable. On Windows, the only command line quote character is the double quote, and it is escaped using a backslash.
o By using -se gblsStartDate='07-09-1976' you are causing the expression gblsStartDate = '07-09-1976' to be executed, which of course does nothing, since the single quote indicates the start of a comment.
o By using -se gblsStartDate=07-09-1976 you are causing the expression gblsStartDate = (07 - 09) - 1976 to be executed.
o If you use -se gblsStartDate="07-09-1976" you will get the same results as above (as if the quotes weren't present).
o However, if you use -se gblsStartDate=\"07-09-1976\" the expression gblsStartDate = "07-09-1976" will be executed, which is what you want.
o Note that this also means you can do something like -se gblsStartDate=now() and have gblsStartDate = now() executed.
Map Summary:
Define the Source:
Source Connector: ASCII(Delimited)
Source Data:      File: $(FUN_DATA)Accounts.txt
Source Options:   Header = True

Define the Target:
Target Connector: ODBC 3.x
Target Data:      Database: TrainingDB, Table: tblFavoriteInfo
Target Options:   none
Note: The following code module should be built through the Lookup Wizard. The steps for creating the code module are specified below.
2. From the Menu, click Tools > Define Lookup Functions to open the Lookup Wizard.
3. Choose the Flat File Lookup Wizard and click Next.
4. Create a new Flat File Definition named Categories and click Next.
5. Specify the Lookup File as C:\Cosmos9_Work\Fundamentals\Data\Category.txt.
6. Click Next.
7. Choose the appropriate Key Field, and the Fields that should be returned by the lookup.
8. Click Finish. The Wizard will create the Flat File Lookup Functions in a code module. Use the functions in the appropriate event handlers as described below.
Categories_Field2_Lookup(KeyValue, DefaultValue) Categories_Field3_Lookup(KeyValue, DefaultValue)
Define Events: Source R1 Events
Event Parameters: target name = Target, record layout = R1,
                  count = CharCount("|", Records("R1").Fields("Favorites")) + 1

Map Expressions:
Serial()
R1.ProductManager   Categories_Field3_Lookup(Targets(0).Records("R1").Fields("CategoryCode"), "NoManagers")
Map Summary:
Define the Source:
Source Connector: ASCII(Delimited)
Source Data:      File: $(FUN_DATA)Accounts.txt
Source Options:   Header = True

Define the Target:
Target Connector: ODBC 3.x
Target Data:      Database: TrainingDB, Table: tblFavoriteInfo
Target Options:   none
Note: The following code module should be built through the Lookup Wizard. The steps for creating the code module are specified below.
2. From the Menu, click Tools > Define Lookup Functions to open the Lookup Wizard.
3. Choose the Dynamic SQL Lookup and click Next.
4. Create a new Dynamic SQL Definition named Categories and click Next.
5. Create a name for the DJImport Object that will make the connection to the table or file. Choose the name category and click Next.
6. Click Build to build a new Connection String and then click Next.
7. Connect to the data source defined below for the lookup and click Next.
Define the Connection String:
Connector:  Access 2000
File:       C:\Cosmos9_Work\Fundamentals\Data\TrainingDB.mdb
Table:      tblCategories
Properties: none
8. Choose the appropriate Key Field, and the Fields that should be returned by the lookup.
9. Click Finish. The Wizard will create the following Dynamic SQL functions in a code module. Use the functions in the appropriate event handlers as described below.
Categories_Init(): Initializes the DJImport object and makes the connection to the data source as defined by the connection string.
Categories_Category_Lookup(KeyValue, DefaultValue): Creates the SQL call needed to retrieve a value from the Category field based on a Key value.
Categories_ProductManager_Lookup(KeyValue, DefaultValue): Creates the SQL call needed to retrieve a value from the ProductManager field based on a Key value.
Categories_Terminate(): Terminates the connection to the data source by destroying the DJImport Object.
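As with the Incore version, the per-record call goes in a target field expression, while Init and Terminate are wired to the transformation events shown below. A hedged sketch of the field expression only (the key field name CategoryCode and the default value follow the flat file lookup example earlier in this appendix and are illustrative):

    Categories_ProductManager_Lookup(Records("R1").Fields("CategoryCode"), "NoManagers")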
Define Events: Transformation and Map Properties Events
Event Name             Event Actions   Event Parameters
BeforeTransformation   Execute         Expression: Categories_Init()
AfterTransformation    Execute         Expression: Categories_Terminate()
Define Events: Source R1 Events
Event Parameters: target name = Target, record layout = R1,
                  count = CharCount("|", Records("R1").Fields("Favorites")) + 1

Map Expressions:
Serial()
R1.ProductManager
Exercise
Once you have connected to a data source (described below), your connection is displayed in the upper-right pane. You can set up and save as many data source connections as you need. Integration Querybuilder stores all connections you create unless you explicitly delete them.
1. Double-click the connection you want to use. The DB Browser in the lower-right pane will display the database.
2. Click the database icon to display the icons for tables, views and procedures for this database. Clicking on these will display their contents. Click on the individual tables to list their columns, or right-click and select Get Details from the shortcut menu to see the SQL representation of column values such as length, data types and whether they are used as primary or secondary keys.
3. To create a query, select New Query from the Query menu. A new query icon will be opened beneath the connection icon in the upper-right pane. You can rename this now or later by right-clicking on the icon.
4. Drag the tables and views you want to use into the upper-left pane. This is called the Relations pane. As you drag tables into this pane, you will see that SELECT... FROM statements are created in the SQL pane. If tables are already linked in the database, these links will be displayed, although these can be changed or removed for the purpose of this particular query. If you are using a table more than once, the second and further copies will be renamed. For example, if you already have a Customer table in the Relations pane and you drag across another copy, it will be automatically renamed Customer1.
The Select statement that is generated becomes part of the connection string and it is passed through to the database server. We can now map this data into any target type and format we desire. The following is information taken from reports generated by Repository Manager from the RDBMS_SelectStatements transformation in the Solutions folder:

Source (ODBC 3.x)
Database:      TrainingDB
SQL Statement: SELECT srcAccounts.[Account Number], srcAccounts.Name, srcAccounts.Company,
               srcAccounts.Street, srcAccounts.City, srcAccounts.State, srcAccounts.Zip,
               srcPurchases.PONumber, srcPurchases.Category, srcPurchases.ProductNumber,
               srcPurchases.ShipmentMethodCode
               FROM (srcAccounts RIGHT JOIN srcPurchases
                     ON srcAccounts.[Account Number] = srcPurchases.AccountNumber)
               ORDER BY srcPurchases.ShipmentMethodCode, srcAccounts.City
Target Options
header       True
Output mode  Replace
Source R1 Events
AfterEveryRecord ClearMapPut Record
Map Expressions
R1.Account Number R1.Name R1.Company R1.Street R1.City R1.State
R1.ShipmentMethodCode Fields("ShipmentMethodCode")
Exercise
1. Start a New Structured Schema Design.
2. Click the Visual Parser button (red knife).
3. Change the Code Page property to 37 US EBCDIC (click the Apply button!).
4. Navigate to the file named Accounts_Binary.bin.
5. Determine the record length by looking for patterns in the file.
6. Overtype the Length and hit the Enter key (try 180; what happens?).
7. After you have the columns lined up, parse the fields and select data types and field properties until you have defined the structure.
8. Save the Structured Schema as s_BinaryDataCodePages.ss.xml for reuse.
Record Layouts
Record R1
Name                   Type            Length
AccountNumber          Text            9
(intermediate fields)                  21, 31, 35, 16, 2, 10, 25, 4, 11
StandardPayment        Packed decimal  6
Payments               Packed decimal  7
Balance                Packed decimal  6
Total                                  183
Exercise
1. Start a New Map design session and choose the Binary connector.
2. Select the Structured Schema named s_BinaryDataCodePages.ss.xml.
3. Select the file named Accounts_Binary.bin.
4. Change the source property Code Page to 37 US EBCDIC (click the Apply button!).
5. Browse the file to confirm the structure has been applied.
6. If desired, you can complete the map based on the specifications below. The lesson, though, is intended to demonstrate that a file can be parsed in Structured Schema Designer and used as input for Map Designer. In this exercise we use it as a source connection. Structured Schemas can also be used as part of a target connection.
Map Summary:
Define the Source:
Source Connector: Binary
Source Data:      $(FUN_DATA)Accounts_Binary.bin
Source Schema:    s_BinaryDataCodePages.ss.xml
Source Options:   codepage = 0037 US (EBCDIC)

Define the Target:
Target Connector: ASCII(Delimited)
Target Data:      File: $(FUN_DATA)AccountsOut.txt
Target Options:   Header = True
Records("R1").Fields("AccountNumber") Records("R1").Fields("Name") Records("R1").Fields("Company") Records("R1").Fields("Address") Records("R1").Fields("City") Records("R1").Fields("State") Records("R1").Fields("ZipCode") Records("R1").Fields("Email") Records("R1").Fields("BirthDate") Records("R1").Fields("Favorites")
Records("R1").Fields("Payments") Records("R1").Fields("Balance")
Define Events: Source R1 Events
Event Name         Event Actions        Event Parameters
AfterEveryRecord   ClearMapPut Record   target name = Target, record layout = R1
Structured Schema Designer: Multiple Record Type Support in Structured Schema Designer
Objectives
At the end of this lesson you should be able to discuss the differences between files that have multiple record types and those that don't. You should be able to describe the tasks that will have to be performed to work with source files that have multiple record types. You should also be able to describe the actions you will have to take should you wish to create a target file with multiple record types.

Keywords: Record Types, Record Layouts, Discriminator, and Recognition Rules

Description
Files can be grouped into two main classifications relative to the records they contain. The first classification is comprised of those files all of whose records are of the same type. This means that each record will contain the same fields, in the same order and with the same properties. The second classification is comprised of those files that contain records that have different formats. One record might contain ten fields while another might contain only six or perhaps twelve. One record type might describe a Customer while another describes a payment he made on his account. Certainly these two records would be different.

The critical issue for record type files is not the definition of the records themselves. These can be defined in the Structured Schema Designer with the Visual Parser (by parsing one of them, adding another, parsing it, adding another, and so on). They can also be defined using the grid interface within the SSD (where you simply enter record type names and then enter the field lists for each). You might also be able to import the record layouts, perhaps from a COBOL copybook or some other readable file. The critical issue is how the Map Designer will be able to distinguish one record from another.

For any application to be able to work with a file of this type, there must be some way to tell the records apart. There should be one common field in each record type, the value of which identifies the record type itself. If this were not true, no software application would be able to deal with the file, Map Designer included. This field is called the discriminator field, as it enables us to discriminate between record types. Once the discriminator field has been identified, the remaining task is to define the values that it can have and associate these values with individual record types. For example, if the value of the field were CUS, we might know we have a Customer record type. Or if the value of the field were PAY, we might know we are dealing with a record that describes a payment on an account. These types of rules are called recognition rules, and we must define at least one such rule for each record type. Rules might not be so simple, but fortunately the Structured Schema Designer can work with very complex ones.

To create a structured schema for a source file that contains multiple record types, there are three possible strategies you can follow. The strategy you choose depends on what information you already have available describing the file. The three strategies are:
1. You have record layout definitions available in a file: Import the record layout definition file into the SSD. Use the ALL Record Type Rules Recognition dialog to define at least one rule for each record type.
2. You have record layout definitions available in a printed document: Select the connector type in the SSD. Use the Grid layout to define each record type and its fields. Use the ALL Record Type Rules Recognition dialog to define at least one rule for each record type.
3. You have no definitions available, only the data file: Activate the SSD Visual Parser for your file. Name and parse each record type. Find and select the discriminator field. Use the Recognition Rules button to activate the Recognition Rules dialog and define at least one rule for each record type.

The common element to these strategies is the definition of the recognition rules. These are defined in the Recognition Rules dialog, which is activated from either the ALL Record Type Rules Recognition hierarchy item or the individual R1 Rules R1 Recognition items on the grid layout in the SSD. First, you'll identify the discriminator, the field whose contents will be used to tell the record types apart. Next, you can use the Generate Rules button to automatically generate some skeleton rules for each record type. Finally, you can add the actual value that the discriminator field will contain for each record type (and adjust other properties of the rules as you wish). When you're done, the structured schema for the file can be saved.

Scenario
A source file (Payments_MultiRecType.txt) contains multiple record types, and there is not any information about the file's records or its fields. We know that the file contains payment records followed by a summary record. We also know that the payment records are supposed to contain an account number, payment date and payment amount, and that the summary records will contain a payment count and a payment total. However, we do not know where in the records each field begins and ends. We need to define a structured schema for this file by visually determining where each field for both record types starts and stops. We will use the parse data tool to accomplish the task.
Exercise
1. Begin a new Map Design.
2. Point the source to the ASCII Fixed file Payments_MultiRecType.txt.
3. Browse the source file and determine whether record types exist. Close the browser.
4. Click the Build Schema... button for the Structured Schema.
5. Click the Parse Data icon.
6. Rename the Record to Payment and parse a payment record according to the record layout given below.
Record Payment
Name                 Type  Length
(fields as parsed)         1, 9, 8, 11
Total                      29
7. Click the Add Record button and name the new record type CheckSum.
8. Scroll down until you find the next different structured record (row 30).
9. Parse this record type with its fields as described below.
Record CheckSum
Name             Type  Length
RecordIndicator  Text  1
EmptiedDate      Text  8
Action           Text  3
TotalAmount      Text  9
PaymentCount     Text  4
ClerkID          Text  4
Total                  29
10. Select the Payment record from the Record dropdown and ensure that the RecordIndicator field is displayed in the Field Name box.
11. Check the Discriminator check box.
12. Click the Recognition Rules... button.
13. Click the Generate Rules button.
14. Define PaymentRule1 to be that the discriminator field equals P.
15. Define CheckSumRule1 to be that the discriminator field must be equal to E.
16. Return to the Structured Schema Designer dialog.
17. Save the structured schema as s_Payments_MultiRecType.ss.xml.
18. Close the Structured Schema Designer.
19. Browse the source file again and note how the structured schema information has been applied to it. Look at both kinds of records and see how the browser changes.
Map Summary:
Define the Source:
Source Connector: ASCII(Fixed)
Source Data:      $(FUN_DATA)Payments_MultiRecType.txt
Source Schema:    s_Payments_MultiRecType.ss.xml
Source Options:   None

Define the Target:
Target Connector: ODBC 3.x
Target Data:      Database: TrainingDB, Table: tblPaymentsVerified
Target Options:   None
Variables
Name             Type     Public  Value
paymentCounter   Variant  no
paymentSubtotal  Variant  no
Define Events: Source Payment Events
Event Name         Event Actions        Event Parameters
AfterEveryRecord   Execute              Expression:
                   ClearMapPut Record   target name = Target, record layout = R1
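The Execute expression for the Payment record is not reproduced in the table above. A minimal sketch of what it typically contains, given that the CheckSum script below compares and then resets paymentCounter and paymentSubtotal; the payment amount field name is hypothetical, since the Payment record's field names are not listed in this workbook:

    ' AfterEveryRecord (Execute) on the Payment record:
    paymentCounter = paymentCounter + 1
    paymentSubtotal = paymentSubtotal + Records("Payment").Fields("PaymentAmount")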
Define Events: Source CheckSum Events
Event Name         Event Actions   Event Parameters
AfterEveryRecord   Execute         Expression:

'This code can be imported by the menu, File > Open Script File > ChecksumTest.rifl
' declare temp variables used for better readability
Dim crlf, realTotal, realCount
crlf = Chr(13) & Chr(10)
realTotal = Records("CheckSum").Fields("TotalAmount")
realCount = Records("CheckSum").Fields("PaymentCount")

' display current count and payment sub-total for each clerk
MsgBox("---New Checksum---" & crlf & _
  "PaymentCounter= " & paymentCounter & " : Should be = " & realCount & crlf & _
  "Paymt Amt= " & paymentSubtotal & " : Should be = " & realTotal)

' evaluate count and sub-total for inconsistencies
If paymentSubtotal <> Trim(realTotal) Then
  MsgBox("Total payment amount for this clerk does not match checksum amount!!!", 48)
End If
If paymentCounter <> Trim(realCount) Then
  MsgBox("Payment Count for this clerk does not match checksum amount!!!", 48)
End If

' reset global variables for next clerk
paymentCounter = 0
paymentSubtotal = 0