
RowGen

v3
Test Data and File
Generation

© INNOVATIVE ROUTINES INTERNATIONAL (IRI), INC.

2005 - 2013 All rights reserved. The material herein is confidential, proprietary
property, protected by, and enforced under, U.S. and international copyright,
trademark and trade secret laws. No part of this guide may be disclosed, disseminated,
or reproduced without the express written consent of IRI.

LEGAL NOTICE

RowGen Version 3.1
March 2013

CONFIDENTIAL: Use of this document is restricted by the terms of a non-disclosure
and/or license agreement binding you and your company. For your own protection, DO NOT
transfer or disclose any part of this manual, RowGen software or trial information without
the prior written consent of Innovative Routines International ("IRI"), Inc.

IRI tries to ensure that this document is correct and accurate, and IRI reserves the right to change
it without notice.

RowGen software licenses are serialized and usage is registered. Anyone wishing to expand
the use, or integrate and/or distribute all, or any part, of the software, must first execute an
appropriate license agreement with IRI.

Restricted rights: Use, duplication, or disclosure by the U.S. Government is subject to
restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and
Computer Software clause at DFARS 252.227-7013.

No warranty, expressed or implied, is made by IRI as to the accuracy of the material and
functioning of the software. Any warranty of fitness for any particular purpose is expressly
excluded and in no event will IRI be liable for any direct or consequential damages.

Trademarks: RowGen, CoSort, and sortcl are trademarks of IRI. All other brand or product
names are trademarks or registered trademarks of their respective holders/companies.

© 2005-2013 IRI. All rights reserved. No part of this document or the RowGen programs
may be used or copied without the express written permission of IRI. Please contact:

INNOVATIVE ROUTINES INTERNATIONAL (IRI), INC.
2194 Highway A1A
Suite 303, Atlantis Center
Melbourne, Florida 32937 USA
Tel (321) 777-8889, Fax (321) 777-8886
Email rowgen@iri.com
URL www.iri.com

TABLE OF CONTENTS

TABLE OF CONTENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii

INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

ROWGEN EXAMPLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

1 DATA GENERATION ONLY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13


Example 1: Using All Defaults: Simplest Form . . . . . . . . . . . . . . . . . . . . . . 13
Example 2: Using a Character SET File . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Example 3: Using a Numeric SET File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Example 4: Using MIN_SIZE and MAX_SIZE . . . . . . . . . . . . . . . . . . . . . . 16

2 ADVANCED SETS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Example 5: Using a Relational SET File . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Example 6: Using Literal SETs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Example 7: Selecting ALL Values from a Single-column SET . . . . . . . . . . 21
Example 8: Selecting ALL Values from a Multi-column SET . . . . . . . . . . . 23
Example 9: Selecting Values ONCE from a SET . . . . . . . . . . . . . . . . . . . . . 23

3 RECORD AND FIELD FORMATTING / CONVERSION . . . . . . . . . . . . . . . . . . . . . . 25


Example 10: Generating Valid Social Security Numbers . . . . . . . . . . . . . . . 25
Example 11: Record Formatting and Field Remapping . . . . . . . . . . . . . . . . 25
Example 12: File Format and Data Type Conversion . . . . . . . . . . . . . . . . . . 27
Example 13: Derived Field Values, Using /INREC . . . . . . . . . . . . . . . . . . . 28
Example 14: Report with Multiple Aggregations . . . . . . . . . . . . . . . . . . . . . 31
Example 15: Creating a Web-Ready Report . . . . . . . . . . . . . . . . . . . . . . . . . 33

4 CREATING MULTIPLE RECORD LAYOUTS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36


Example 16: Generating and Identifying Multiple Input Sources . . . . . . . . . 36

5 CREATING TABLES FOR A RELATIONAL DATABASE. . . . . . . . . . . . . . . . . . . . . 39


Example 17: Creating Transactional Table Data . . . . . . . . . . . . . . . . . . . . . . 39

6 PIPING ROWGEN DATA INTO ANOTHER PROCESS . . . . . . . . . . . . . . . . . . . . . . . 44


Example 18: RowGen Output —> SQL*Loader . . . . . . . . . . . . . . . . . . . . . . 44

ROWGEN CONTROL LANGUAGE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

1 PURPOSE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

2 CONVENTIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49


2.1 Line Continuation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49


2.2 Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.3 Optional Statement Parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.4 Environment Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.5 Naming Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.6 Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

3 EXECUTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

4 USAGE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.1 Data Flow Structure in RowGen Scripts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.2 Input Filenames and Attributes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.3 Action . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.4 Output Filenames and Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.5 /INCOLLECT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.6 /OUTCOLLECT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

5 FILES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.1 Resource Control File (rowgenrc) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.2 Job Specification Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.3 Data Definition Files. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.4 File Formats (/PROCESS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.5 Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.6 Auditing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Example 19: Creating an /AUDIT log . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

6 FIELDS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.1 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.2 Field Name . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
6.3 ROWID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Example 20: Using ROWID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.4 SET Files and Literal SETs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6.5 Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6.5.1 Distributions Using Routines. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6.5.2 Distributions Using Set Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6.6 POSITION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Example 21: Generating Fixed-Position Fields . . . . . . . . . . . . . . . . . . . . . . . 80
Example 22: Generating Variable-Position (Delimited) Fields . . . . . . . . . . . 80
6.7 SIZE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Example 23: Size with NUMERIC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Example 24: Using ASCII Substrings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.8 SEPARATOR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Example 25: Generating Multi-Character Field Separators . . . . . . . . . . . . . 86


Example 26: Generating Records with Multiple Separators . . . . . . . . . . . . . 87


6.9 FRAME. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
Example 27: Generating Framed Fields on Input . . . . . . . . . . . . . . . . . . . . . 88
Example 28: Producing CSV Fields on Output . . . . . . . . . . . . . . . . . . . . . . . 89
6.10 Alignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Example 29: Using the Alignment Option . . . . . . . . . . . . . . . . . . . . . . . . . . 90
6.11 MILL. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Example 30: Using the MILL option . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.12 FILL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
Example 31: Using the FILL Option . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.13 Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.13.1 ASCII Character Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.13.2 ASCII-Numeric Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.13.3 Binary Numeric Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.13.4 EBCDIC Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
6.13.5 Micro Focus Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.13.6 RM COBOL Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.13.7 EBCDIC-Native Micro Focus COBOL Data Types . . . . . . . . . . . . 103
6.13.8 EBCDIC-Native RM COBOL Data Types . . . . . . . . . . . . . . . . . . . 104
6.13.9 Date/Timestamp Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.13.10 Zoned Decimal. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
Example 32: Producing Zoned Decimal Fields . . . . . . . . . . . . . . . . . . . . . . 107

7 SET FILES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108


7.1 Character SET Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Example 33: Using a Character SET File . . . . . . . . . . . . . . . . . . . . . . . . . . 112
7.2 Numeric SET Files with Ranges. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
Example 34: Using a Numeric SET File . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
7.3 Date SET Files with Ranges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Example 35: Using a Date SET File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
7.4 Timestamp SET Files with Ranges. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Example 36: Using a Timestamp SET File . . . . . . . . . . . . . . . . . . . . . . . . . 119
7.5 Relation SET Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Example 37: Using a Multi-Column SET File . . . . . . . . . . . . . . . . . . . . . . 120
7.6 Literal SETs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

8 FIELD EXPRESSIONS (CROSS-CALCULATION) . . . . . . . . . . . . . . . . . . . . . . . . . . 123


Example 38: Using Mathematical Expressions . . . . . . . . . . . . . . . . . . . . . . 123

9 /INREC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
Example 39: Using /INREC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

10 /DATA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Example 40: Using /DATA Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130


11 INTERNAL VARIABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

12 CONTROL (ESCAPE) CHARACTERS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

13 CONVERSION SPECIFIERS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133


13.1 Using a Conversion Specifier within a /DATA statement . . . . . . . . . . . . . . 134
13.2 Using a Conversion Specifier within a Condition . . . . . . . . . . . . . . . . . . . . 134

14 KEYS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
14.1 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
14.2 Field Name Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
14.3 Unnamed Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
14.4 Collating Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
14.5 Direction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
14.6 ASCII Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
14.7 No Duplicates, Duplicates Only . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

15 CONDITIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
15.1 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
15.2 Unary Logical Expressions (Change Test) . . . . . . . . . . . . . . . . . . . . . . . . . . 140
15.3 Binary Logical Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
15.4 Function Compares in Conditions (iscompares). . . . . . . . . . . . . . . . . . . . . . 141
Example 41: Using iscompares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
15.5 Compound Logical Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
15.6 Evaluation Order. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
15.7 Compound Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

16 INCLUDE-OMIT (RECORD SELECTION). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146


16.1 /INCLUDE and /OMIT Syntax. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
16.2 Include-Omit Evaluation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
Example 42: Using Include and Omit on Input and on Output . . . . . . . . . . 147
Example 43: Using Named Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
Example 44: Condition Order: Include before Omit . . . . . . . . . . . . . . . . . . 149
Example 45: Condition Order: Omit before Include . . . . . . . . . . . . . . . . . . 150

17 CONDITIONAL FIELD AND DATA STATEMENTS . . . . . . . . . . . . . . . . . . . . . . . . 152

18 EXAMPLES USING CONDITIONAL FIELD AND DATA STATEMENTS . . . . . . 154


Example 46: Simple Conditional /DATA Statement . . . . . . . . . . . . . . . . . . 154
Example 47: Multi-Level Conditional /FIELD Statement . . . . . . . . . . . . . 155

19 SUMMARY FUNCTIONS (AGGREGATION) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156


19.1 Summary and Average Functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
Example 48: Summary and Average . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158


19.2 Maximum and Minimum Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159


Example 49: Maximum and Minimum . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
19.3 Counting Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
19.4 Ranking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
Example 50: Ranking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
19.5 Reports with Summaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
19.6 Running (Accumulating) Aggregates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
Example 51: Using /RUNNING . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164

20 SEQUENCER . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
Example 52: Using SEQUENCER . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166

21 OUTPUT OPTIONS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169

22 MISCELLANEOUS OPTIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174


22.1 /RC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
22.2 /EXECUTE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
22.3 /MONITOR. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
22.4 Runtime Warnings (/WARNINGSON and /WARNINGSOFF) . . . . . . . . . 177
22.5 /ROUNDING . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177

23 USING SEEDS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179


Example 53: Using a Starting Seed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179

ROWGEN TOOLS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

cob2ddf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

1 PURPOSE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

2 USAGE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

3 EXAMPLE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194

csv2ddf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195

1 PURPOSE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195

2 USAGE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195

3 EXAMPLE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196


elf2ddf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199

1 PURPOSE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199

2 USAGE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
2.1 ELF Format. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
2.2 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200

3 EXAMPLE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201

ctl2ddf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203

1 PURPOSE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203

2 USAGE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
2.1 Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203

3 EXAMPLE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203

APPENDIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205

A PERFORMANCE TUNING . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205


A.1 Tuning RowGen for Windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
A.2 Tuning RowGen for UNIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
A.3 Using Customized Resource Control Files . . . . . . . . . . . . . . . . . . . . . . . . . 209
A.4 Search Order for Resource Controls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
A.5 Resource Control Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213

B ERROR and RUNTIME MESSAGES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219


B.1 Table of Error Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
B.2 Detailed Error and Runtime Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226

C ASCII COLLATING SEQUENCE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234

D EBCDIC PRINTING CHARACTERS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235

INTRODUCTION

RowGen is a high-performance data generator and format synthesizer that builds test
databases, flat files, records and reports in the same form and format as real data.
RowGen can be used to develop and stress test applications, prototype databases and
ETL/ELT operations, and safely outsource formatted data targets when production data
are confidential or unavailable.

RowGen jobs are controlled by text-based scripts that define the layout of the test tables
and files you want to build. RowGen Control Language (RCL) scripts use the same
explicit and intuitive syntax as the Sort Control Language (sortcl) program within IRI’s
CoSort data manipulation package, so you can transform and format your test data.

For test database generations, RowGen can parse the data model information for any
JDBC-connected database, converting the SQL information about table layouts and
primary-foreign key relationships into RowGen job scripts. These, in turn, produce pre-
sorted, structurally and referentially correct test data that can be bulk loaded to target
tables.

The IRI Workbench, a graphical user interface (GUI) built on Eclipse, facilitates the
specification, execution, tuning and maintenance of RowGen job scripts through job
wizards, a dynamic job outline, and a fully syntax-aware editor. To ensure that
referential integrity is preserved in the test data, a wizard for creating database test data
based on existing table structures and their relationships is also available. This manual
documents the use of the RowGen Control Language only, and is thus a reference guide
for the programs running outside the GUI. Documentation on GUI operations is
provided in the topic help and in the context sensitive help within RowGen-specific
GUI wizards and dialogs.

RowGen job scripts describe the precise layout of the data to be randomly generated or
selected, including the size, position, and data type of each field element. RowGen can
generate one or more sets of data in a single RowGen execution, and it can produce
multiple output tables/files/reports that involve transformations, like sorting, or
aggregation on those random sets, all within the same job script and I/O pass.

RowGen allows you to order the generated data (over any number of keys), create
structured reports, or build sequenced load files by using the same formatting
capabilities available to CoSort sortcl users. The difference between RowGen and the
sortcl program is that RowGen generates and processes random data, while sortcl
recognizes and processes real data.

RowGen-generated data fields may consist of:

• Values of most widely used data types, including:

• ASCII
• EBCDIC
• numeric
• COBOL
• datestamp and timestamp
• binary
• other special types such as IP_Address.
See Data Types on page 93.

NOTE: In cases where RowGen cannot generate field data of a desired data type in the
input phase of a job script, RowGen can convert field values to your preferred
data type on output.

In addition, random data generated in the input phase can be re-typed,
reformatted, and used in arithmetic expressions or other field functions in the
output phase.

• Values drawn at random from SET files (either user-built or provided with the
RowGen package) or from inline values within a job script (literal SETs). SETs
can include character-based strings, numbers and numeric ranges, dates and date
ranges, or timestamps and timestamp ranges. You can also select from relational
SET files, where the value returned depends on one or more other values
(see SET FILES on page 108).

RowGen output can be stored in files, sent to stdout or stderr, or sent to named pipes.

RowGen can assist in building and testing databases (see page 48) and applications that
will later manipulate or otherwise act on real production data, and in benchmarking
high-performance file management or manipulation software like CoSort. RowGen
output can range from simple, single-field flat files to multi-column database load
targets. RowGen output can also be formatted as multi-level, custom HTML reports
populated with headers, footers, literals, details and summary values.

NOTE: New users should read the ROWGEN EXAMPLES chapter, which provides an
opportunity to sample the functionality of the RowGen product. Examples start out
simple and become increasingly complex. The ROWGEN CONTROL LANGUAGE chapter
contains the formal description of the RowGen Control Language (RCL).

ROWGEN EXAMPLES

RowGen data generation is controlled by the text-based 4GL known as RowGen
Control Language (RCL). The examples provided in this chapter illustrate the various
capabilities of RCL. For a formal description of RCL syntax components, see the
ROWGEN CONTROL LANGUAGE chapter on page 47.

Although specific file name extensions are not required, this chapter uses the
following conventions:

.rcl Job scripts.

.set SET files.

.out Output files.

All files referenced throughout this chapter are provided in the subdirectory
/examples/Examples_chapter of your RowGen install directory, so you can run these
examples and re-create the results. In addition, where other file types are used
(such as .set and .bat), they too are provided in /examples/Examples_chapter.

NOTE: The default behavior of RowGen is to sort records from left to right.
To prevent sorting, and therefore speed production with larger data sets, you
must include a /REPORT statement in the RowGen job script (for example,
see Using a Character SET File on page 14).

The first examples are simple, and later ones become increasingly complex. It is
therefore recommended that you perform them in the order presented:

• DATA GENERATION ONLY on page 13


• Using All Defaults: Simplest Form on page 13
• Using a Character SET File on page 14
• Using a Numeric SET File on page 15
• Using MIN_SIZE and MAX_SIZE on page 16.

• ADVANCED SETS on page 18


• Using a Relational SET File on page 18
• Using Literal SETs on page 20
• Selecting ALL Values from a Single-column SET on page 21
• Selecting ALL Values from a Multi-column SET on page 23
• Selecting Values ONCE from a SET on page 23.

• RECORD AND FIELD FORMATTING / CONVERSION on page 25
• Generating Valid Social Security Numbers on page 25
• Record Formatting and Field Remapping on page 25
• File Format and Data Type Conversion on page 27
• Derived Field Values, Using /INREC on page 28
• Report with Multiple Aggregations on page 31
• Creating a Web-Ready Report on page 33.

• CREATING MULTIPLE RECORD LAYOUTS on page 36


• Generating and Identifying Multiple Input Sources on page 36.

• CREATING TABLES FOR A RELATIONAL DATABASE on page 39


• Creating Transactional Table Data on page 39.

• PIPING ROWGEN DATA INTO ANOTHER PROCESS on page 44


• RowGen Output —> SQL*Loader on page 44.

NOTE: Many of the SET files delivered with the RowGen package are derived from
public domain sources found on the internet. As such, IRI (and thus you)
cannot vouch for the accuracy, format, safety, or completeness of their content.
They are provided for your convenience and to assist in the development of
realistic set data through random selection of their values. IRI expressly
disclaims any warranty of fitness for these SET files.

1 DATA GENERATION ONLY

The simplest form of a RowGen job script requires an /INFILE statement, followed by
one or more /FIELD statements (see Data Flow Structure in RowGen Scripts on page 55).

Example 1 Using All Defaults: Simplest Form

The following script, default_gen.rcl, illustrates the simplest way to generate random
data with RowGen, where standard defaults are applied:

/INFILE=simple.in # placeholder only; no real input is recognized
/PROCESS=RANDOM
/FIELD=(first,POSITION=1,SIZE=5) # alpha_digit field; position 1; size 5

The results of running the above script will be as follows:

• 100 records will be generated by default (see /INCOLLECT on page 59)


• the generated data in the field "first" will:
• consist of alphabetic and digit characters -- the default data type is
alpha_digit (see Data Types on page 93)
• start at byte POSITION 1 (see POSITION on page 79)
• have a field SIZE (width) of 5 (see SIZE on page 81).

• the generated records will be sorted from left to right (see KEYS on page 135)
• output will be sent to stdout, that is, the default output when no
file name is specified (see Output Filenames and Attributes on page 57)
• output records will be terminated by a linefeed (LF) or by a carriage return and
linefeed (CRLF) (see RECORD on page 63)
• output (to stdout) will be of the same format as described on input, that is, a
single alpha_digit field.

To execute this job, enter:

rowgen /spec=default_gen.rcl

This will generate 100 sorted records (the following shows the first five records):

0wl6n
2ihzj
3Nvin
5Zqdw
5rWHG
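The default behavior described above can be mimicked outside RowGen with a short Python sketch. It is an illustration only, not RowGen's implementation: the function name generate_default, the alphabet constant, and the use of Python's random module are all assumptions made for this example.

```python
import random
import string

# Assumed alpha_digit alphabet: ASCII letters and digits.
ALPHA_DIGIT = string.ascii_letters + string.digits


def generate_default(count=100, size=5, seed=None):
    """Return `count` random alpha_digit strings of width `size`, sorted."""
    rng = random.Random(seed)
    records = [
        "".join(rng.choice(ALPHA_DIGIT) for _ in range(size))
        for _ in range(count)
    ]
    # Python compares strings by code point, so digits sort before
    # uppercase letters, which sort before lowercase letters.
    return sorted(records)


if __name__ == "__main__":
    for record in generate_default(seed=1)[:5]:
        print(record)
```

The code-point ordering used here is consistent with the sample above, where 5Zqdw precedes 5rWHG.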

Example 2 Using a Character SET File

RowGen can draw field values at random from an existing SET file containing any
number of possible field values. In this way, you can ensure that fields are populated
with realistic-looking data, rather than just randomized characters.

This example uses the SET file parts_list1.set, which contains the following:

Brackets
Screws
Nails
Tacks

The following script, simple_gen.rcl, will generate 10 records consisting of two fields
each, where one field contains values drawn from the SET file, and the other contains
randomly generated values.

/INFILE=char_set.in # placeholder only; no real input is recognized
/PROCESS=RANDOM
/FIELD=(part,SET=parts_list1.set,POSITION=1,SEPARATOR='|') # selected values
/FIELD=(profit,POSITION=2,SEPARATOR='|',SIZE=5,NUMERIC) # generated values
/INCOLLECT=10 # generate 10 records
/REPORT # no sorting, records will be unordered
/OUTFILE=gen_set.out # Same layout as input data; no remapping
# Therefore, no output field layout required

To execute this job, enter:

rowgen /spec=simple_gen.rcl

This produces the output file gen_set.out:

Tacks|-3.67
Brackets|-7.17
Nails|64.12
Screws|68.64
Nails|-2.94
Brackets|-8.14
Brackets|81.57
Screws|-3.77
Brackets|6.85
Screws|57.45
Screws|52.24

Note that:

• 10 records were generated (see /INCOLLECT on page 59)


• the two generated fields are separated by a pipe (|), and the POSITION attributes
determined their layout with respect to the delimiter (see POSITION on page 79
and SEPARATOR on page 86)
• the content of the first field, "part," was selected at random from values in the
text file parts_list1.set (see SET Files and Literal SETs on page 77)
• the second field, "profit," used the NUMERIC data type with a fixed SIZE of 5,
which defaults to two-digit precision, and was randomly generated (see SIZE on
page 81 and Data Types on page 93)
• records were not ordered because of the /REPORT statement, which prevents
sorting (and thus speeds data production).

Example 3 Using a Numeric SET File

This example shows how you can produce field values randomly by drawing from
a data set of numeric values (literals) and ranges. You must declare the data type
NUMERIC when selecting from a numeric SET file.

values_low.set contains:

-15
[-9,-5]
(-2,0)

values_high.set contains:

34567
[40000,40010]
56789

Consider the following script, number_gen.rcl, which draws values at random from the
two set files, where random values can be either literal, or fall within a range:

/INFILE=numeric_set.in # placeholder only; no real input is recognized
/PROCESS=RANDOM
/INCOLLECT=10 # generate 10 records
/FIELD=(low,SET=values_low.set,POSITION=1,SIZE=5.0,NUMERIC)
/FIELD=(high,SET=values_high.set,POSITION=9,SIZE=8.2,NUMERIC)
/REPORT # no sorting, records will be unordered
/OUTFILE=numeric_set.out # Same layout as input data, i.e., no remapping
# No output /FIELD layout required

This produces numeric_set.out:

-6 40003.73
-15 56789.00
-1 40009.70
-15 34567.00
-15 56789.00
-15 40009.93
-1 40008.64
-1 40005.78
-15 40006.68
-1 40008.82

Note that:

• The field "low" contains a random mix of the literal value -15, values from the
inclusive range [-9,-5], and values from the exclusive range (-2,0)
(see Numeric SET Files with Ranges on page 113).
• The SIZE=5.0 attribute indicates that no decimal point will be used
(.0 precision) in the first column.
• The field "high" contains a random mix of the literal values 34567 and 56789,
and values from the inclusive range [40000,40010].
• The SIZE=8.2 attribute indicates that a decimal precision of 2 will be used.

To see how different SIZE precision specifications can affect output when
selecting values from a numeric SET file, see Example 35 on page 114.
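The way literals, inclusive ranges, and exclusive ranges combine with the SIZE precision can be sketched in Python. This is a hedged illustration, not RowGen's algorithm: it assumes that each SET entry is equally likely to be chosen and that a value inside a range is drawn uniformly at the requested precision. The names parse_entry and draw are hypothetical.

```python
import random


def parse_entry(text):
    """Parse one SET line: a literal (-15) or a range ([-9,-5] or (-2,0))."""
    text = text.strip()
    if text[0] in "[(":
        low, high = (float(part) for part in text[1:-1].split(","))
        return {"low": low, "high": high, "inclusive": text[0] == "["}
    return {"literal": float(text)}


def draw(entries, precision, rng):
    """Pick an entry at random, then a value at the given decimal precision."""
    entry = rng.choice(entries)  # assumption: entries are equally likely
    if "literal" in entry:
        return round(entry["literal"], precision)
    scale = 10 ** precision
    low = round(entry["low"] * scale)
    high = round(entry["high"] * scale)
    if not entry["inclusive"]:
        low, high = low + 1, high - 1  # an exclusive range drops both endpoints
    return rng.randint(low, high) / scale


if __name__ == "__main__":
    rng = random.Random(7)
    low_set = [parse_entry(s) for s in ["-15", "[-9,-5]", "(-2,0)"]]
    high_set = [parse_entry(s) for s in ["34567", "[40000,40010]", "56789"]]
    for _ in range(5):
        print(f"{draw(low_set, 0, rng):5.0f}   {draw(high_set, 2, rng):8.2f}")
```

Under these assumptions, the exclusive range (-2,0) at precision 0 can only yield -1, which agrees with the values shown in numeric_set.out.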

WARNING!
A 0-length entry in a numeric SET file can cause an infinite
loop because such a value can never be found during selection.

Example 4 Using MIN_SIZE and MAX_SIZE

You can restrict the minimum and maximum field sizes of both ASCII character and
numeric field values (see MIN_SIZE and MAX_SIZE on page 82 for complete details).

The following script, minmaxsize.rcl, demonstrates MIN_SIZE and MAX_SIZE when
generating random values for delimited fields.

/INFILE=minmaxsize.in
/PROCESS=RANDOM
/FIELD=(code,POSITION=1,MIN_SIZE=1,MAX_SIZE=4,SEPARATOR='|')
/FIELD=(value,POSITION=2,MIN_SIZE=4,MAX_SIZE=8,SEPARATOR='|',NUMERIC)
/REPORT
/OUTFILE=minmaxsize.out

You can run the above job script as follows:

rowgen /spec=minmaxsize.rcl

This will generate 100 sorted records (the following shows the first 15 records):

01AE|-.02
0G7w|47.77
0|-.73
0|113.39
1k4|211.25
1v7|598.30
1|3.17
2A|66.97
2Sf|-.19
2d4|-.89
2p|-80.17
2|1556.03
3Bt|-1.56
3M|1.41
3Nb7|85245.23

Note that:

• the "code" field, which defaults to the alpha_digit data type (see Data Types on
page 93), ranges from a width of 1 through 4
• the "value" field, specified as NUMERIC, has a total field width between
4 and 8, which includes two decimal places (by default), the decimal point, and a
minus sign if the generated number is negative (see SIZE on page 81).

For a description of other, default behaviors observed when executing this script, see
Using All Defaults: Simplest Form on page 13.
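The width rules above can be checked with a small Python sketch. It is an illustration under stated assumptions, not RowGen code: random_code and within_bounds are hypothetical helpers, and the width of each generated value is assumed to be chosen uniformly between the two bounds.

```python
import random
import string

# Assumed alpha_digit alphabet: ASCII letters and digits.
ALPHA_DIGIT = string.ascii_letters + string.digits


def random_code(min_size, max_size, rng):
    """Random alpha_digit value whose width lies between the two bounds."""
    width = rng.randint(min_size, max_size)  # assumption: uniform width
    return "".join(rng.choice(ALPHA_DIGIT) for _ in range(width))


def within_bounds(value, min_size, max_size):
    """True when the printed width (sign and decimal point included) fits."""
    return min_size <= len(value) <= max_size


# A few "value" entries copied from the sample output above.
SAMPLE_VALUES = ["-.02", "47.77", "113.39", "1556.03", "85245.23"]

if __name__ == "__main__":
    rng = random.Random(3)
    print([random_code(1, 4, rng) for _ in range(5)])
    print(all(within_bounds(v, 4, 8) for v in SAMPLE_VALUES))
```

Note that "-.02" counts as four characters because the minus sign and decimal point occupy field positions.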

2 ADVANCED SETS

As illustrated in Example 2 on page 14 and Example 3 on page 15, RowGen can select
field values at random from a pre-existing SET file. This section demonstrates some of
the more advanced SET functionality available in RowGen. See SET FILES on
page 108 for full details on all supported SET options and how to use them.

Example 5 Using a Relational SET File

This example shows how you can use a dependency table as a SET file to produce
values for one field which are dependent on the values selected for another
(see Relation SET Files on page 119). This example provides realistic first names that
conform to gender.

The following SET files are needed to perform this example:

name_list.set, which is the tab-delimited dependency table:

female Carole
female Jane
female Jill
female Rachel
male Bill
male John
male Peter
male Roger

last_names.set

Stevens
Evans
Jones
Pierce
Murray
Smith
Osbourne

The following script, relate.rcl, performs a two-key sort that uses the above SET files to
select realistic names that correspond appropriately to gender. It also produces a second
output file that applies record filter logic:

/INFILE=names.in
/PROCESS=RANDOM
/INCOLLECT=10
/FIELD=(gender,set=name_list.set,POSITION=1,SEPARATOR='|')
/FIELD=(first_name,set=ROW[2] name_list.set,POSITION=2,SEPARATOR='|')
# the dependency table is referenced to provide a first_name to
# correspond to the value of gender
/FIELD=(last_name,set=last_names.set,POSITION=3,SEPARATOR='|')
/SORT
/KEY=last_name
/KEY=first_name
/OUTFILE=names.out
/HEADREC="Name Gender\n----------------------\n"
/FIELD=(last_name,POSITION=1)
/DATA=","
/FIELD=(first_name)
/DATA="\t"
/FIELD=(gender)
/OUTFILE=females_only.out
/INCLUDE WHERE gender == "female"
/HEADREC="Name Gender\n----------------------\n"
/FIELD=(last_name,POSITION=1)
/DATA=","
/FIELD=(first_name)
/DATA="\t"
/FIELD=(gender)

This will produce names.out, which contains:

Name Gender
----------------------
Evans,Bill male
Evans,Rachel female
Jones,Bill male
Jones,Rachel female
Murray,Jane female
Murray,Peter male
Osbourne,John male
Pierce,Rachel female
Smith,Bill male
Smith,Roger male

Note that the first names correspond to the gender, as determined by the dependency
table, name_list.set. The header record was produced with the /HEADREC statement
(see /HEADREC on page 170).

The second output file, females_only.out, contains:

Name Gender
----------------------
Evans,Rachel female
Jones,Rachel female
Murray,Jane female
Pierce,Rachel female

The /INCLUDE statement ensured that only records whose gender field is "female" were
included in this output file (see INCLUDE-OMIT (RECORD SELECTION) on
page 146).
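The dependency lookup and the record filter in this example can be sketched in Python. This is a conceptual illustration only: the helper names load_relation and draw_pair are hypothetical, and the way RowGen chooses the left-hand value internally is not documented here, so a uniform choice is assumed.

```python
import random
from collections import defaultdict

# Rows of name_list.set, written with explicit tab separators.
ROWS = [
    "female\tCarole", "female\tJane", "female\tJill", "female\tRachel",
    "male\tBill", "male\tJohn", "male\tPeter", "male\tRoger",
]


def load_relation(rows):
    """Map each left-hand value to the list of right-hand values it allows."""
    table = defaultdict(list)
    for row in rows:
        key, value = row.split("\t")
        table[key].append(value)
    return table


def draw_pair(table, rng):
    """Pick a gender, then a first name that the table permits for it."""
    gender = rng.choice(sorted(table))  # assumption: uniform choice
    return gender, rng.choice(table[gender])


if __name__ == "__main__":
    rng = random.Random(5)
    table = load_relation(ROWS)
    records = [draw_pair(table, rng) for _ in range(10)]
    # Equivalent of the /INCLUDE WHERE gender == "female" filter.
    females_only = [rec for rec in records if rec[0] == "female"]
    print(records)
    print(females_only)
```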

Example 6 Using Literal SETs

This example shows how you can include SET values explicitly within /FIELD
statements when you have a small number of random elements to draw from, and do not
require a separately held SET file (see Literal SETs on page 122).

This example also uses the following SET files:

last_names.set

Stevens
Evans
Jones
Pierce
Murray
Smith
Osbourne

first_names.set

John
Susan
Mary
Stephen
Barry
Cathy

The following script, lower_man.rcl, ensures that RowGen selects only the zip codes
representing the Lower Manhattan area of New York City. The zip codes are entered
explicitly in the /FIELD statement:

/INFILE=literal_set.in
/PROCESS=RANDOM
/FIELD=(last_name,SET=last_names.set,POSITION=1,SIZE=10)
/FIELD=(first_name,SET=first_names.set,POSITION=11,SIZE=10)
/FIELD=(zip,SET={10004,10005,10006,10007,10038,10280},POSITION=21,SIZE=5)
/SORT
/KEY=last_name
/NODUPLICATES
/OUTFILE=lower_man.out
/OUTCOLLECT=7
/HEADREC="Lower Manhattan\n"
/FIELD=(last_name,POSITION=1,SIZE=10)
/FIELD=(first_name,POSITION=11,SIZE=10)
/FIELD=(zip,POSITION=21,SIZE=5)

This produces:

Lower Manhattan
Evans Mary 10005
Jones Cathy 10006
Murray Barry 10038
Osbourne Susan 10005
Pierce Mary 10006
Smith Barry 10006
Stevens Mary 10038

Note that RowGen selected zip codes only from the list given within the curly brackets
{} of the SET attribute in the zip field (see Literal SETs on page 122).
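The combined effect of the sort key, /NODUPLICATES, and /OUTCOLLECT in this script can be approximated in Python. The sketch below is an assumption-laden illustration: the function name lower_manhattan is hypothetical, 100 generated records are assumed (the default noted in Example 1), and the choice of which duplicate survives is an assumption rather than documented RowGen behavior.

```python
import random

LAST_NAMES = ["Stevens", "Evans", "Jones", "Pierce", "Murray", "Smith", "Osbourne"]
FIRST_NAMES = ["John", "Susan", "Mary", "Stephen", "Barry", "Cathy"]
ZIPS = ["10004", "10005", "10006", "10007", "10038", "10280"]  # literal SET


def lower_manhattan(count=100, limit=7, seed=None):
    """Generate records, sort on last name, drop duplicate keys, keep `limit`."""
    rng = random.Random(seed)
    records = [
        (rng.choice(LAST_NAMES), rng.choice(FIRST_NAMES), rng.choice(ZIPS))
        for _ in range(count)
    ]
    records.sort(key=lambda rec: rec[0])  # sort on the last_name key only
    unique = {}
    for rec in records:
        unique.setdefault(rec[0], rec)    # assumption: first record per key wins
    return list(unique.values())[:limit]  # equivalent of /OUTCOLLECT=7


if __name__ == "__main__":
    print("Lower Manhattan")
    for last, first, zip_code in lower_manhattan(seed=2):
        print(f"{last:<10}{first:<10}{zip_code}")
```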

Example 7 Selecting ALL Values from a Single-column SET

This example demonstrates how you can ensure that all values contained in a
single-column SET file are used to populate a given field within the generated records.

Consider the SET file parts_list2.set:

glue sticks
lighters
pliers
ratchets
buzz saws
sanders
hammers
nails
tacks
screws
screwdrivers
wrenches
drills
lightbulbs

The following script, allparts.rcl, shows how you can ensure the inclusion of all values
from the SET file:

/INFILE=parts.in
/PROCESS=RANDOM
/INCOLLECT=15
/FIELD=(part,SET=ALL parts_list2.set,POSITION=1,SIZE=13)
/FIELD=(price,set={(1,15)},POSITION=15,SIZE=5.2,NUMERIC) # literal range
/REPORT
/OUTFILE=pricelist_all.out
/HEADREC="Price list: Set of 100\n"
/FIELD=(part,POSITION=1,SIZE=13)
/FIELD=(price,POSITION=15,SIZE=5.2,NUMERIC)

pricelist_all.out contains:

Price list: Set of 100


glue sticks 9.79
lighters 9.67
pliers 11.18
ratchets 11.82
buzz saws 11.95
sanders 13.65
hammers 10.21
nails 13.17
tacks 14.23
screws 14.64
screwdrivers 13.03
wrenches 11.19
drills 14.12
lightbulbs 13.22
glue sticks 10.80

Note that:

• all 14 unique values from parts_list2.set are represented in the above list; the
15th record repeats the first value, glue sticks
• random numbers between 1 and 15 were drawn from a literal numeric range (see
Literal SETs on page 122).

Example 8 Selecting ALL Values from a Multi-column SET

This example demonstrates how you can select ALL values that satisfy a dependency
within a multi-column, relational SET file (see Example 5 on page 18).

Consider the following tab-delimited relational SET file, PANY_cities.set:

NY Albany
NY Concord
NY Hemstead
NY New York City
PA Erie
PA Philadelphia
PA Pittsburg
PA Scranton
PA State College

The following script, allny.rcl, generates all values in the right-hand column that
correspond to the given value in the left-hand column. Note that this example uses a
string to represent the desired left-hand value, but you can also use a field value to
define the value on the left, as described in Relation SET Files on page 119.

/INFILE=allstates.in
/INCOLLECT=4
/FIELD=(ny_city,POSITION=1,SIZE=15,set = ALL PANY_cities.set["NY"])
/REPORT
/OUTFILE=allny.out
/FIELD=(ny_city,POSITION=1,SIZE=15)

This produces allny.out:

Albany
Concord
Hemstead
New York City

All unique values from the right column that satisfy the relationship specified ("NY")
are produced. The /INCOLLECT=4 statement ensures that no repeated entries are
produced (see also ONCE in Example 9 on page 23 for an alternative way to return all
values once without repetition).

Example 9 Selecting Values ONCE from a SET

This example demonstrates how to select values contained in a
single-column SET file only one time to populate a given field within the generated
records.

This example uses the SET file parts_list2.set from Example 7 on page 21.

The following script, onceparts.rcl, shows how you can include all the values from the
SET file only once:

/INFILE=parts.in
/PROCESS=RANDOM
/INCOLLECT=20
/FIELD=(part,SET=ONCE parts_list2.set,POSITION=1,SIZE=13)
/FIELD=(price,SET={(1,15)},POSITION=15,SIZE=5.2,NUMERIC)
/REPORT
/OUTFILE=pricelist_once.out
/HEADREC="Price list: Set of 100\n"
/FIELD=(part,POSITION=1,SIZE=13)
/FIELD=(price,POSITION=15,SIZE=5.2)

pricelist_once.out contains:

Price list: Set of 100


glue sticks 6.81
lighters 4.63
pliers 8.24
ratchets 4.42
buzz saws 6.85
sanders 5.76
hammers 2.25
nails 10.29
tacks 6.67
screws 5.45
screwdrivers 8.27
wrenches 10.42
drills 2.66
lightbulbs 9.61
3.05
8.90
10.07
2.76
2.80
5.13

Note that:

• All 14 unique values from parts_list2.set are represented in the above list. After
the unique values are used, no further selections are made from that SET file,
resulting in empty values in the left column for the remaining six records.
• Random numbers between 1 and 15 were drawn from a literal numeric range
(see Literal SETs on page 122).
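The difference between ALL (Example 7) and ONCE (this example) can be summarized with a Python sketch. The behavior modeled here is inferred from the two sample outputs only: ALL appears to step through the file in order and start over when the values run out, while ONCE leaves the field empty after each value has been used. The function names are hypothetical.

```python
from itertools import chain, cycle, islice, repeat

PARTS = [
    "glue sticks", "lighters", "pliers", "ratchets", "buzz saws", "sanders",
    "hammers", "nails", "tacks", "screws", "screwdrivers", "wrenches",
    "drills", "lightbulbs",
]


def select_all(values, count):
    """ALL: walk the values in file order, starting over when exhausted."""
    return list(islice(cycle(values), count))


def select_once(values, count):
    """ONCE: use each value a single time, then yield empty values."""
    return list(islice(chain(values, repeat("")), count))


if __name__ == "__main__":
    print(select_all(PARTS, 15)[-2:])   # last two of 15 records
    print(select_once(PARTS, 20)[13:])  # last used value, then empties
```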

3 RECORD AND FIELD FORMATTING / CONVERSION

Several record and file formatting options are available to modify the output targets that
contain the random data generated by RowGen. In addition, you can create derived
fields in the output section that are the result of field-level conditions, cross-calculation,
and/or aggregation.

Example 10 Generating Valid Social Security Numbers

This example shows how you can add /DATA statements to the /OUTFILE section
of a RowGen script to format the output records. The following script, ssno_gen.rcl,
generates one million valid social security numbers (according to the rules of validity
described in http://en.wikipedia.org/wiki/Social_security_number#Valid_SSNs):

/INFILE=ssno.in
/PROCESS=RANDOM
/INCOLLECT=1000000 # Generates 1 million records
# after include/omit logic is applied
/FIELD=(area_number,POSITION=1,SIZE=3,SEPARATOR=',',digit)
/FIELD=(group_number,POSITION=2,SIZE=2,SEPARATOR=',',digit)
/FIELD=(serial_number,POSITION=3,SIZE=4,SEPARATOR=',',digit)
/OMIT WHERE area_number > "772" OR area_number == "666"
/INCLUDE WHERE area_number > "000" AND group_number > "00" \
AND serial_number > "0000"
/REPORT
/OUTFILE=ssnos.out
/FIELD=(area_number)
/DATA="-" # customized record formatting
/FIELD=(group_number)
/DATA="-" # customized record formatting
/FIELD=(serial_number)

In the above job script, field descriptions were specified in the input section, and field
formatting was applied in the output section. See /DATA on page 129 for complete
details on using /DATA statements.
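The filter logic in the /OMIT and /INCLUDE statements, and the /DATA formatting, can be expressed as a short Python check. This sketch mirrors only the conditions written in the script above; the function names are hypothetical, and it does not claim to cover every rule in the referenced article.

```python
def is_valid_ssn(area, group, serial):
    """Apply the same tests as the /OMIT and /INCLUDE statements in the script.

    All three arguments are fixed-width digit strings, so plain string
    comparison orders them the same way as their numeric values.
    """
    if area > "772" or area == "666":  # /OMIT condition
        return False
    return area > "000" and group > "00" and serial > "0000"  # /INCLUDE condition


def format_ssn(area, group, serial):
    """Join the three fields with hyphens, as the /DATA statements do."""
    return f"{area}-{group}-{serial}"


if __name__ == "__main__":
    candidate = ("123", "45", "6789")
    if is_valid_ssn(*candidate):
        print(format_ssn(*candidate))
```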

Example 11 Record Formatting and Field Remapping

This example shows how you can customize output record layouts, and produce a report,
using record formatting and field remapping statements.

Consider the SET file parts_list2.set from Example 7 on page 21. The following script,
remap.rcl, shows how to specify remapping and record formatting in the output section,
building on the generated data specified in the input section:

/INFILE=remap.in # placeholder only; no real input is recognized
/PROCESS=RANDOM
/FIELD=(part,SET=parts_list2.set,POSITION=1,SEPARATOR='|')
/FIELD=(price,POSITION=2,SEPARATOR='|',SIZE=5,NUMERIC) # random data
/INCLUDE WHERE price > 0 # filter logic; prevents negative values
/INCOLLECT=5 # generate 5 records that satisfy include
/SORT # sorting required
/KEY=(price, descending) # sort by price, in descending order
/OUTFILE=remap.out # attributes required below:
/HEADREC="Price Part\n-------------------------------\n"
/FIELD=(price,POSITION=1,SIZE=5.2,NUMERIC)
/DATA=" for a set of"
/FIELD=(part,POSITION=20)
/FOOTREC="-------------------------------\nAs recorded on %s",CURRENT_DATE

This produces remap.out:

Price Part
-------------------------------
96.40 for a set of hammers
65.63 for a set of hammers
65.00 for a set of glue sticks
55.04 for a set of nails
29.55 for a set of nails
-------------------------------
As recorded on 2013-02-01

Note that:

• values for the "part" field were selected at random using the SET file, and the
"price" field consists of randomly generated numbers that satisfy the /INCLUDE
condition
• the "price" values were sorted in descending order (see KEYS on page 135)
• the /HEADREC statement created the header record, i.e., a literal string with \n
(linefeed) characters (see /HEADREC on page 170 and CONTROL (ESCAPE)
CHARACTERS on page 132)
• the two fields were remapped to fixed-byte positions, with price displayed first
• the /DATA statement created intra-record text, placed between the field values
(see /DATA on page 129)
• the /FOOTREC statement created the footer record with the current datestamp
(see /FOOTREC on page 171 and INTERNAL VARIABLES on page 131).

Example 12 File Format and Data Type Conversion

RowGen can produce multiple output files with different file formats (see File Formats
(/PROCESS) on page 62), in the same job. You can also convert from one data type to
another at the field level (see Data Types on page 93), which is useful when multiple
output files have different data-type requirements. Consider the following SET file,
USAdates.set:

Feb/13/2003
Mar/06/1998
Apr/13/1996
Jul/14/2000
Jun/20/2003
Jan/03/1999

The following script, conv.rcl, produces two output files. The first will show the same
data as generated on input. The second output will illustrate a file-format conversion,
and data-type conversion at the field level:

/INFILE=conv.in # placeholder only; no real input is recognized
/PROCESS=RANDOM
/FIELD=(day,SET=USAdates.set,POSITION=1,SIZE=11,SEPARATOR='|',AMERICAN_DATE)
/FIELD=(zip,POSITION=2,SEPARATOR='|',SIZE=5,digit)
/FIELD=(code,POSITION=3,SEPARATOR='|',SIZE=5,EBCDIC_alpha)
/INCOLLECT=7 # generate 7 records
/SORT # sorting required
/KEY=day # sort the AMERICAN_DATE field in ascending order
/OUTFILE=orig.out # output file 1, same attributes as on input
/OUTFILE=conv.csv # output file 2, with file and field conversion
/PROCESS=CSV # convert file format to CSV (adds header)
/FIELD=(day,POSITION=1,SEPARATOR=',',SIZE=10,ISO_DATE) # convert DATE format
/FIELD=(zip,POSITION=2,SEPARATOR=',',digit)
/FIELD=(code,POSITION=3,SEPARATOR=',',ASCII) # convert from EBCDIC->ASCII
/OUTCOLLECT=5 # produce only 5 of the 7 generated records

The file orig.out contains all seven generated records, sorted on the "day" field:

Apr/13/1996|07271|ÿ-+ç-
Apr/13/1996|34755|+ÖS+¬
Mar/06/1998|58163|¦Ö-µÿ
Mar/06/1998|34568|¦++ö+
Mar/06/1998|70363|êtå++
Jan/03/1999|11065|+-+Ñù
Feb/13/2003|30614|ó-üÑ+

NOTE: The code field is EBCDIC, and displays as shown.

The second output file, conv.csv, which was produced simultaneously (in the same I/O
pass) with the other output file, is as follows:

day,zip,code
"1996-04-13","07271","qBCgK"
"1996-04-13","34755","CrUNz"
"1998-03-06","58163","GrBWq"
"1998-03-06","34568","GPImI"
"1998-03-06","70363","hXfHL"

Note that:

• all seven generated records were returned to the first output file, orig.out, and
only five records were produced in conv.csv, as determined by the output
file-specific /OUTCOLLECT statement (see /OUTCOLLECT on page 59)
• as a result of specifying /PROCESS=CSV, a header record was created based on
the field names in the output layout, and the individual fields are framed in
double-quotes by default (see CSV on page 66)
• the "day" field was converted from the AMERICAN_DATE format (from the
values in the SET file) into ISO_DATE format (see Date/Timestamp Data Types
on page 105)
• the randomly generated "code" field was converted from EBCDIC_alpha format
into ASCII alpha format (see ASCII Character Data Types on page 94).
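The two field-level conversions in this example can be illustrated in Python. This is a sketch, not a description of RowGen internals: the function names are hypothetical, and the EBCDIC code page used here (cp037) is an assumption, since the guide does not state which code page applies.

```python
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def american_to_iso(text):
    """Convert Mon/DD/YYYY (for example Apr/13/1996) to YYYY-MM-DD."""
    month, day, year = text.split("/")
    return f"{int(year):04d}-{MONTHS[month]:02d}-{int(day):02d}"


def ebcdic_to_ascii(raw):
    """Decode EBCDIC bytes to text; code page 037 is assumed."""
    return raw.decode("cp037")


if __name__ == "__main__":
    print(american_to_iso("Apr/13/1996"))
    print(ebcdic_to_ascii("qBCgK".encode("cp037")))
```

Sorting the converted ISO strings as plain text also yields chronological order, which is one reason the ISO layout is convenient for downstream files.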

Example 13 Derived Field Values, Using /INREC

The job script on the following page, derived.rcl, illustrates several methods for
producing derived fields, and also demonstrates how to use /INREC to create a virtual
record layout to be processed (see /INREC on page 126).

Consider the following character SET file, parts_list3.set, where each part name is
preceded by a unique three-letter code and a colon (:):

DBD:screwdrivers
CBC:hammers
ABC:glue sticks
EBD:lightbulbs
ABE:pliers
BBC:ratchets
BBD:buzz saws
EBE:switches
BBE:sanders
CBD:nails
CBE:tacks
DBC:screws
DBE:wrenches
EBC:drills
ABD:lighters

NOTE: The \ character in the following script is used for line continuation
(see Line Continuation on page 49).

/INFILE=derived.in
/PROCESS=RANDOM
/FIELD=(full,SET=parts_list3.set,POSITION=1,SIZE=16) # use SET values
/FIELD=(value1,POSITION=17,SIZE=5,NUMERIC) # random numeric value 1
/FIELD=(value2,POSITION=22,SIZE=5,NUMERIC) # random numeric value 2
/INCLUDE WHERE value1 > 0 AND value2 > 0 # exclude negative values
/INREC # create virtual input record:
/FIELD=(part=sub_string(full,5,12),POSITION=1,SIZE=12) # define substring
/FIELD=(value1,POSITION=13,SIZE=5)
/FIELD=(value2,POSITION=18,SIZE=5)
/FIELD=(T=abs(value1 - value2),POSITION=23,SIZE=5,NUMERIC) # math field
/SORT
/KEY=part # sort over the derived substring
/NODUPLICATES # exclude duplicates of substring
/OUTFILE=derived.out # summary record, same outfile:
/DATA="\n Average difference: "
/FIELD=(avg_diff,NUMERIC) # names and positions new field
/AVERAGE avg_diff from T # defines the aggregation
/OUTFILE=derived.out # layout for detail records:
/HEADREC="Part Low High Diff\n-----------------------------------------\n"
/FIELD=(part,POSITION=1,SIZE=12) # produce substring on output
/FIELD=(low,POSITION=17,SIZE=5,NUMERIC,IF value1 LE value2 THEN value1 \
ELSE value2) # conditional field
/FIELD=(high,POSITION=27,SIZE=5,NUMERIC,IF value1 LE value2 THEN value2 \
ELSE value1) # conditional field
/FIELD=(T,POSITION=37,SIZE=5,NUMERIC)

This produces derived.out:

Part Low High Diff


-----------------------------------------
buzz saws 39.00 62.94 23.94
drills 28.58 93.46 64.88
glue sticks 42.82 66.96 24.14
hammers 70.16 85.53 15.37
lightbulbs 9.84 24.41 14.57
nails 41.92 50.56 8.64
pliers 3.80 78.83 75.03
ratchets 31.13 83.62 52.49
sanders 41.70 82.86 41.16
screwdrivers 44.52 80.06 35.54
screws 59.27 97.06 37.79
switches 34.26 83.55 49.29
tacks 49.42 84.97 35.55
wrenches 30.64 76.07 45.43

Average difference: 37.42

Note that:

• the "full" field, as defined on input, specifies that values from the SET file will
be drawn at random
• the /INREC section creates a virtual layout of four fields to be processed
(see /INREC on page 126), where consecutive fields are placed right next to each
other (temporarily, to reduce processing time)
• the first /INREC field is a derived field, "part," defined as a substring of "full,"
where the offset of 5 ignores the first four prefix characters (see ASCII
Substrings on page 84)
• the "part" field, the desired substring, is then sorted in ascending order,
excluding duplicates
• the second same-name iteration of /OUTFILE provides the output layout for the
detail records, including two conditional fields and the derived field "T":
• "low" is a conditional field that ensures that the lower of the two values
(value1 or value2, as generated on input) is placed at position 17
(see CONDITIONAL FIELD AND DATA STATEMENTS on page 152)
• "high" is a conditional field that ensures that the higher of the two values is
placed next, at position 27
• "T," defined in the /INREC section, contains the absolute value of the result of
subtracting value2 from value1, where the absolute value ensures a positive
result regardless of which value was larger
• the first same-name iteration of /OUTFILE defines the summary record, which contains:
• a derived field "avg_diff," which is not given a POSITION attribute, and is
therefore placed directly after the preceding /DATA statement
(see /DATA on page 129)
• "avg_diff" is defined as an /AVERAGE of the derived field "T" from above
(see SUMMARY FUNCTIONS (AGGREGATION) on page 156).
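The derived fields in this example can be reproduced with a few lines of Python. The sketch is illustrative only: the helper names are hypothetical, Decimal arithmetic is used to keep two-digit precision exact, and the half-up rounding of the average is an assumption.

```python
from decimal import Decimal, ROUND_HALF_UP


def sub_string(text, offset, length):
    """1-based substring, mirroring sub_string(full,5,12) in the script."""
    return text[offset - 1:offset - 1 + length]


def derive(value1, value2):
    """Return the conditional low/high fields and the absolute difference T."""
    low, high = (value1, value2) if value1 <= value2 else (value2, value1)
    return {"low": low, "high": high, "T": abs(value1 - value2)}


def average(values):
    """Mean of the values, rounded to two places (rounding mode assumed)."""
    mean = sum(values) / len(values)
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print(sub_string("DBD:screwdrivers", 5, 12))
    print(derive(Decimal("62.94"), Decimal("39.00")))
```

Applying average to the 14 "Diff" values listed in derived.out gives 37.42, which matches the summary line shown there.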

Example 14 Report with Multiple Aggregations

This example, inventory.rcl, creates a report with detail records, subtotals, and
grand totals:

/INFILE=agg.in # placeholder only; no real input is recognized
/PROCESS=RANDOM
/FIELD=(prefix,SET={A,B,C},POSITION=1,SIZE=1) # generates only A-C part values
/FIELD=(part,POSITION=2,SIZE=3,alpha_digit)
/FIELD=(price,SET={[0,99]},POSITION=6,SIZE=5.2,NUMERIC) # price value range
/INCOLLECT=15
/SORT # will perform a two-key sort
/KEY=prefix # sort first on part prefix, for grouping
/KEY=part # sort next on remainder of part field
/OUTFILE=agg.out # this section defines the sub-totals
/DATA="Total "
/FIELD=(prefix)
/FIELD=(tot_price,POSITION=12,SIZE=7,CURRENCY) # place sub-total field
/FIELD=(max_price,POSITION=23,SIZE=7,CURRENCY) # place max field
/FIELD=(min_price,POSITION=34,SIZE=7,CURRENCY) # place min field
/FIELD=(avg_price,POSITION=45,SIZE=7,CURRENCY) # place average field
/FIELD=(count_tot_recs,POSITION=57,SIZE=2.0,NUMERIC) # place count field
/DATA="\n"
/SUM tot_price FROM price BREAK prefix # sum prices by group
/MAX max_price FROM price BREAK prefix # find max price per group
/MIN min_price FROM price BREAK prefix # find min price per group
/AVG avg_price FROM price BREAK prefix # find avg price per group
/COUNT count_tot_recs BREAK prefix # count records per group
/OUTFILE=agg.out # this section defines the grand totals
/HEADREC="All Groups:-----------------------------------------------\n"
/FIELD=(grand_price,POSITION=12,SIZE=7,CURRENCY) # place grand-total field
/FIELD=(max_price,POSITION=23,SIZE=7,CURRENCY) # place max field
/FIELD=(min_price,POSITION=34,SIZE=7,CURRENCY) # place min field
/FIELD=(avg_price,POSITION=45,SIZE=7,CURRENCY) # place average field
/FIELD=(count_tot_recs,POSITION=57,SIZE=2.0,NUMERIC) # place count field
/SUM grand_price FROM price # sum all prices, no groups
/MAX max_price FROM price # find max all prices
/MIN min_price FROM price # find min all prices
/AVG avg_price FROM price # find avg all prices
/COUNT count_tot_recs # count all records
/OUTFILE=agg.out # this section defines the detail records
/HEADREC="PartID Price Maximum Minimum Average Count\n----------------------------------------------------------\n"
# produces a header record
/FIELD=(prefix,POSITION=1,SIZE=1)
/FIELD=(part,POSITION=2,SIZE=3)
/FIELD=(price,POSITION=8,SIZE=11,CURRENCY)

This produces agg.out:

PartID Price Maximum Minimum Average Count


----------------------------------------------------------
Akt5 $82.45
Aq78 $83.78
Ay2w $94.38
Total A $260.61 $94.38 $82.45 $86.87 3

B1PP $53.68
B6Oz $36.52
BCAW $36.09
BPDq $42.66
BPo4 $31.82
Bt45 $23.62
Total B $224.39 $53.68 $23.62 $37.40 6

C10A $60.76
C1bS $63.55
C1h6 $51.22
C6UL $12.18
C8ZR $78.26
Cj4c $62.41
Total C $328.38 $78.26 $12.18 $54.73 6

All Groups:-----------------------------------------------
$813.38 $94.38 $12.18 $54.23 15

Note that:

• 15 records were produced (using /INCOLLECT), consisting of "PartID" and
"Price" fields
• only prices within the literal numeric range were included (see Literal SETs on
page 122)
• PartIDs beginning only with A, B, and C were generated (see Literal SETs on
page 122)
• three same-name output file sections from the script are superimposed onto
the final display: the first output section defined the sub-totals; the second, grand
totals; the third, detail records (see Reports with Summaries on page 163)
• the aggregation-derived /FIELD statements in the script were defined in the
sub-total and grand-total output sections, and the /SUM, /MAX, /MIN, /AVG, and
/COUNT statements beneath them describe how they are to be calculated
(see SUMMARY FUNCTIONS (AGGREGATION) on page 156)
• /DATA and /HEADREC statements, in their respective output sections, were used
to embellish the output (see /DATA on page 129 and /HEADREC on page 170)
• data-type conversion from NUMERIC to CURRENCY was performed for all price
values (see Data Types on page 93).
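The BREAK-style grouping behind the sub-totals and grand totals can be sketched in Python, using the detail records printed in agg.out as input. The helper names are hypothetical, and the half-up rounding of the average is an assumption rather than documented RowGen behavior.

```python
from decimal import Decimal, ROUND_HALF_UP
from itertools import groupby

# Detail records (PartID, price) copied from agg.out above.
DETAIL = [
    ("Akt5", "82.45"), ("Aq78", "83.78"), ("Ay2w", "94.38"),
    ("B1PP", "53.68"), ("B6Oz", "36.52"), ("BCAW", "36.09"),
    ("BPDq", "42.66"), ("BPo4", "31.82"), ("Bt45", "23.62"),
    ("C10A", "60.76"), ("C1bS", "63.55"), ("C1h6", "51.22"),
    ("C6UL", "12.18"), ("C8ZR", "78.26"), ("Cj4c", "62.41"),
]


def summarize(prices):
    """Compute the five aggregates used in the report for one group."""
    total = sum(prices)
    avg = (total / len(prices)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return {"sum": total, "max": max(prices), "min": min(prices),
            "avg": avg, "count": len(prices)}


def report(detail):
    """Return (sub-totals keyed by prefix, grand totals)."""
    rows = sorted(detail)  # groupby needs rows ordered by the break key
    subtotals = {
        prefix: summarize([Decimal(price) for _, price in group])
        for prefix, group in groupby(rows, key=lambda row: row[0][0])
    }
    grand = summarize([Decimal(price) for _, price in rows])
    return subtotals, grand


if __name__ == "__main__":
    subtotals, grand = report(DETAIL)
    for prefix, stats in subtotals.items():
        print("Total", prefix, stats)
    print("All Groups:", grand)
```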

Example 15 Creating a Web-Ready Report

This example produces an HTML file showing, as a table, the summary of store sales by
state in the USA.

states.set contains:

Alabama
Alaska
Arizona
Arkansas
California etc.

The script, sales_by_state.rcl, is used to create a summary report, embedding HTML
tags within /DATA, /HEADREC, and /FOOTREC statements so that output displays in a
web browser:

/INFILE=html.in
/PROCESS=RANDOM
/FIELD=(State,set=states.set,POSITION=1,SEPARATOR=',')
/FIELD=(StoreCode,POSITION=2,SEPARATOR=',',SIZE=3,upper)
/FIELD=(Sales,POSITION=3,SEPARATOR=',',SIZE=6,NUMERIC)
/INCLUDE WHERE Sales > 0
/SORT
/KEY=State
/KEY=StoreCode
/OUTFILE=sales_report.htm # summary lines
/DATA="<TR>\n<TD><B><FONT SIZE=+2>"
/FIELD=(State)
/DATA="</FONT></B></TD>\n<TD align=right><B><U><FONT SIZE=+2>"
/FIELD=(Sales,SIZE=15,CURRENCY)
/DATA="</FONT></U></B></TD>\n</TR>\n"
/SUM Sales BREAK State
/FOOTREC="</TABLE><BR>\nCreated on </B>%s.\
<HR></BODY>\n</HTML>",AMERICAN_TIMESTAMP
/OUTFILE=sales_report.htm # details lines
/HEADREC="<HTML><HEAD>\n<TITLE>HTML produced by RowGen\
</TITLE>\n</HEAD>\n<BODY><H2>Summary of Sales by\
State</H2>\nSales under \$100 are shown in \
italics.\n<TABLE CELLPADDING=4 CELLSPACING=1 \
BORDER COLS=5>\n"
/DATA="<TR>\n<TD><B>"
/FIELD=(StoreCode)
/DATA="</B></TD>\n<TD align=right>"
/DATA=(IF Sales LT 100 THEN "<em>")
/FIELD=(Sales,SIZE=15,CURRENCY)
/DATA=(IF Sales LT 100 THEN "</em>")
/DATA="</TD>\n</TR>\n"

Note that, for HTML purposes, field names are enclosed within parentheses, for
example: /FIELD=(Region).

This produces sales_report.htm, which you can open in a web browser.

RowGen allows you to use any HTML syntax, such as commands to modify text and
background color, to enhance a web-ready report. You can also include commands
specific to other markup languages, such as XML and SGML.
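The conditional /DATA statements in the detail section wrap a sales figure in italics only when it is under 100. A minimal Python sketch of that idea follows; the function name detail_row and the currency format are assumptions made for illustration, not RowGen output rules.

```python
from html import escape


def detail_row(store_code, sales):
    """Build one table row, italicizing the amount when sales are under 100."""
    amount = f"${sales:,.2f}"  # assumed currency layout
    if sales < 100:
        amount = f"<em>{amount}</em>"  # opening and closing tags must match
    return (
        f"<TR>\n<TD><B>{escape(store_code)}</B></TD>\n"
        f"<TD align=right>{amount}</TD>\n</TR>\n"
    )


if __name__ == "__main__":
    print(detail_row("ABC", 99.5))
    print(detail_row("XYZ", 1250.0))
```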

NOTE: The markup language syntax you can use depends on the version of the
browser, or other utility, you will use to open and read the output file(s).

4 CREATING MULTIPLE RECORD LAYOUTS

RowGen allows you to generate multiple data sets in the input section of a job
specification script. You can process, mix, and track these disparate sets on output
according to your specifications. This capability is useful for synthesizing indexed
tables or formatted reports where data origin is important.

Example 16 Generating and Identifying Multiple Input Sources

This example produces an intermixed data file that contains three separately generated
record types. The first input record format contains employee names and addresses with
sales; the second, store numbers with sales; and the third, department numbers with
sales. A code is used to distinguish the different record types. The first field in each
record represents its code. For the employee records, the code is a; the store records, b;
and the department records, c. To ensure that the various record types are intermixed in
the output, the records are sorted together using three characters starting in column
position 24.

This example uses the following SET files:

first_names.set:

John
Susan
Mary
Stephen
Barry
Cathy

last_names.set:

Stevens
Evans
Jones
Pierce
Murray
Smith
Osbourne


The following script, multi_in.rcl, demonstrates how you can generate multiple data
sets in RowGen:

/INFILE=employee.dat # first generated data set
/PROCESS=RANDOM
/SEED=123
/FIELD=(code,SET=acode.set,POSITION=1,SIZE=1)
/FIELD=(lname,POSITION=3,SIZE=20,SET=last_names.set)
/FIELD=(fname,POSITION=24,SIZE=10,SET=first_names.set)
/FIELD=(address,POSITION=35,SIZE=20,alpha)
/INCOLLECT=5
/INFILE=stores.dat # second generated set
/PROCESS=RANDOM
/FIELD=(code,SET=bcode.set,POSITION=1,SIZE=1)
/FIELD=(storenum,POSITION=3,SIZE=2.0,FILL='0',NUMERIC)
/FIELD=(storesales,POSITION=6,SIZE=10.2,NUMERIC)
/FIELD=(comment,POSITION=17,SIZE=10,alpha)
/INCLUDE where storenum > 0
/INCOLLECT=3
/INFILE=dept.dat # third generated set
/PROCESS=RANDOM
/FIELD=(code,SET=ccode.set,POSITION=1,SIZE=1)
/FIELD=(deptnum,POSITION=3,SIZE=4.0,FILL='0',NUMERIC)
/FIELD=(deptsales,POSITION=8,SIZE=10.2,NUMERIC)
/FIELD=(comment,POSITION=19,SIZE=10,alpha)
/INCLUDE where deptnum > 0
/INCOLLECT=5
/SORT # to combine the different types of records
/KEY=(POSITION=24,SIZE=3) # common to all data sources
/OUTFILE=mixed.out # no remapping

This produces mixed.out:

a Jones Mary UFGwrmfucHAfUgUBihxA
c 1251 6310039.20 ZwVkzQtdIX
a Murray Stephen MgUxckhXltzlDienZEZN
a Jones Stephen OJKpmHdLGFNWALwshePg
a Pierce Susan fhxXdWHNUnGkoXCRSvEu
a Smith Susan RxIAIwjSuwmpPVhcAmqw
b 63 5877758.03 htfnipZSyT
c 3503 6862527.36 xkVjKWOvuE
c 6403 6709233.06 JSeOkWPTdF
c 1348 4021324.38 CgCyxdksEn
b 60 5524711.90 NqnNpeihfp
b 31 6149415.26 XmIKgNXmyx
c 9190 8540941.90 WCqjkrAIMV

Note that:

• three distinct /INFILE sections were created, each with their own attributes
(including a different /INCOLLECT for each)


• the "code" field in each of the three data sets references a distinct literal SET
value (see Literal SETs on page 122), so it is easy to identify the source after the
three data sources are processed together
• the /KEY statement references a position and size which is common to all three
input sources, so that three data sets are arbitrarily mixed together by the sort
(see Unnamed Reference on page 136)
• the output file contains a mix of the three generated data sets, sorted in
ascending order on the three characters starting at position 24 (Mar, from Mary, is
first, then Qtd, from a department comment, etc.).

NOTE
There can be no remapping in the output when multiple input files are generated.
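The effect of a positional key that is common to several record layouts can be illustrated outside RowGen. The following Python sketch is not RowGen code. It places three records (values taken from mixed.out above) into the fixed layouts given in multi_in.rcl and sorts them on the three characters starting at column 24. The helper names place and sort_key, the record width of 54, and left alignment of every field are assumptions made for illustration.

```python
def place(width, *fields):
    """Build a fixed-width record from (1-based position, text) pairs."""
    rec = [" "] * width
    for pos, text in fields:
        rec[pos - 1:pos - 1 + len(text)] = text
    return "".join(rec)


# One record per layout, positioned as in the three /INFILE sections.
records = [
    place(54, (1, "a"), (3, "Murray"), (24, "Stephen"), (35, "MgUxckhXltzlDienZEZN")),
    place(54, (1, "b"), (3, "63"), (6, "5877758.03"), (17, "htfnipZSyT")),
    place(54, (1, "c"), (3, "1251"), (8, "6310039.20"), (19, "ZwVkzQtdIX")),
]


def sort_key(record):
    # Equivalent of /KEY=(POSITION=24,SIZE=3): columns 24 through 26.
    return record[23:26]


mixed = sorted(records, key=sort_key)
for rec in mixed:
    print(rec[0], sort_key(rec))
```

With Python's default string comparison, the keys sort as Qtd, Ste, SyT, so the records come out in the order c, a, b. This matches the relative order of the same three records in mixed.out.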


5 CREATING TABLES FOR A RELATIONAL DATABASE

RowGen allows you to create realistic Relational Database Management System
(RDBMS) tables having relational integrity using primary and foreign keys. Using a
single RowGen script, you can create several output tables that can be loaded into a
DBMS to produce a realistic transactional environment.

The input section of the RowGen script can be used to generate all the required column
data (a master set), and the output section can be used to produce the requisite tables
using custom-selected subsets from the master set. One or more common keys are shared
among all output tables, thereby allowing you to perform joins and create views that have
relational integrity.

Example 17 Creating Transactional Table Data

In this example, three multi-column tables are produced that contain: customer
information, stock transaction records, and broker information. Because the tables are
produced at the same time using the same master set of data values, they share a key that
is common to all three tables, the ID code (an ID prefix and number), which is unique to
each customer. The following relations are created with the output tables from this
example:

one to one The customer table will contain a unique name and a unique address
for each ID code.

many to one The trades table will contain many (one or more) buy/sell
transactions for each ID code.

one to many The brokers table will contain a unique brokerage company office for
many different ID codes.

The following script, common_keys.rcl, uses several SET files to provide
realistic values.


### Creates three output files sharing common ID codes ###


/INFILE=cust_trade_broker_master.in
/PROCESS=RANDOM
/INCOLLECT=10000
# Shared IDCODE key for all tables:
/FIELD=(idcode_prefix,SET=ROW[1] trading_cities.set,POSITION=1,SIZE=1,SEPARATOR='|',upper)
/FIELD=(idcode_number,POSITION=2,SIZE=3,SEPARATOR='|',digit)
# Column data for the customer table:
/FIELD=(name,SET=fullnames.set,POSITION=3,SEPARATOR='|')
/FIELD=(street_address,set=addresses.set,POSITION=4,SEPARATOR='|')
/FIELD=(state,SET=state_city.set,POSITION=5,SEPARATOR='|')
/FIELD=(city,SET=ROW[2] state_city.set,POSITION=6,SEPARATOR='|')
# Column data for the trades table:
/FIELD=(time,SET=tradetimes.set,POSITION=7,SIZE=22,SEPARATOR='|',AMERICAN_TIMESTAMP)
/FIELD=(trans_type,SET={buy,sell},POSITION=8,SEPARATOR='|')
/FIELD=(num_shares,SET={(0,1000)},POSITION=9,SEPARATOR='|',WHOLE)
/FIELD=(stock_name,SET=stocks_price.set,POSITION=10,SEPARATOR='|')
/FIELD=(stock_price,SET=ROW[2] stocks_price.set,POSITION=11,SEPARATOR='|')
# Column data for the broker table:
/FIELD=(trader,SET=ROW[2] trading_cities.set,POSITION=12,SEPARATOR='|')
/FIELD=(trader_city,SET=ROW[3] trading_cities.set,POSITION=13,SEPARATOR='|')

/SORT
/KEY=idcode_prefix
/KEY=idcode_number
/KEY=time # where ID codes are the same, order by transaction time

/OUTFILE=customers.out # creates customer table data
/INCLUDE WHERE idcode_prefix OR idcode_number # ensures unique idcodes
/FIELD=(idcode_prefix)
/FIELD=(idcode_number)
/FIELD=(name,POSITION=2,SEPARATOR='|')
/FIELD=(street_address,POSITION=3,SEPARATOR='|')
/DATA=" "
/FIELD=(city)
/DATA=", "
/FIELD=(state)

/OUTFILE=trades.out # creates trade table data
/FIELD=(idcode_prefix)
/FIELD=(idcode_number)
/FIELD=(time,POSITION=2,SIZE=22,SEPARATOR='|',AMERICAN_TIMESTAMP)
/FIELD=(trans_type,POSITION=3,SEPARATOR='|')
/FIELD=(num_shares,POSITION=4,SEPARATOR='|')
/FIELD=(stock_name,POSITION=5,SEPARATOR='|')
/FIELD=(stock_price,POSITION=6,SEPARATOR='|')
/FIELD=(num_shares * stock_price,POSITION=7,SEPARATOR='|',NUMERIC) # cross-calc
/OUTFILE=brokers.out # creates broker table data
/INCLUDE WHERE idcode_prefix OR idcode_number # ensures unique idcodes
/FIELD=(idcode_prefix)
/FIELD=(idcode_number)
/FIELD=(trader,POSITION=2,SEPARATOR='|')
/FIELD=(trader_city,POSITION=3,SEPARATOR='|')


This produces the following three output files / tables:

customers.out (excerpts)

A000|Mattis,Marjorie|461 Adam Dr. Park Forest, Illinois
A001|Darnall,Geneva|506 Whispering Ln. Lewisville, Texas
A002|Whittenburg,Son |1794 Dorsey St. Leesport, Pennsylvania
...
B006|Olsson,Normand|693 Royall Dr. Lewisville, Texas
B007|Loudermilk,Kenton|263 Arrow Head Crystal Springs, Mississippi
B008|Monahan,Judith|685 Workman St. Southport, Connecticut
...
C028|Danna,Anh |8468 Patetown Rd. Lebanon, Virginia
C029|Hamil,Felecia|6358 Washington St. Santa Fe, New Mexico
C030|Bozeman,Delores|624 Lou Dr. Fenton, Missouri
...
D232|Schendel,Martin|8170 Carver Blvd Edgewater, Florida
D233|Tse,Scott|269 Kenny Dr. Lewisville, Texas
D234|Peace,Florine|8582 Polly Watson Lebanon, Virginia
...
E117|Wulff,Precious|24 Cadel Ln. Springhill, Louisiana
E118|Darwin,Courtney|412 South Landfill Edgewater, Florida
E119|Kesterson,Leopoldo|3952 Rhonda Dr. Abbeville, Alabama
...

Note that:

• the first column contains the sorted ID code, where the prefix is either A, B, C,
D, or E, as drawn from the three-column set file trading_cities.set.
• an /INCLUDE statement with field name references (that is, without logical
expressions) was used in the output section to ensure uniqueness of ID codes, thereby
preventing duplicate ID codes for any given name and address (see /INCLUDE
and /OMIT Syntax on page 146)
• the names and street addresses were drawn from SET files (see Character SET
Files on page 111)
• the cities and states were drawn from a relational SET file (see Relation SET
Files on page 119).

trades.out (excerpts)

A000|08/28/2012 04:21:33 AM|sell|677|FTKHF|194.64|131771.28
A000|11/14/2012 07:57:20 PM|buy|185|DRX|.53|98.05
A001|08/19/2012 07:06:29 PM|sell|26|XZCDH|26.19|680.94
A002|02/22/2012 09:28:19 PM|buy|198|RBFPE|.84|166.32
A002|04/23/2012 05:39:52 AM|buy|107|TDYFB|.22|23.54
A002|10/30/2012 09:56:37 AM|sell|616|JGV|305.76|188348.16
A002|11/25/2012 02:03:49 PM|buy|247|QC||0.00
...
B000|01/05/2012 11:22:31 PM|buy|168|CIFDF|2.07|347.76
B000|03/06/2012 11:10:32 PM|buy|676|HOPNM|.83|561.08
B000|08/30/2012 12:28:02 AM|sell|865|LNFC|7.30|6314.50
B001|05/21/2012 04:11:33 PM|buy|315|DXGC|94.33|29713.95
B001|07/09/2012 01:31:30 PM|buy|111|QJGYB|91.45|10150.95
B002|05/10/2012 00:38:11 AM|sell|2|HBLUF|85.95|171.90
B003|04/18/2012 08:51:25 PM|sell|101|LBIAD|61.64|6225.64
...
C001|03/18/2012 09:18:45 PM|sell|572|BVQRH|299.07|171068.04
C002|05/25/2012 02:22:45 AM|sell|754|ZNM|45.96|34653.84
C002|10/25/2012 06:49:16 PM|buy|292|LQBC|.54|157.68
...
D001|11/14/2012 02:56:15 AM|sell|469|PBIBS|67.96|31873.24
D002|08/04/2012 11:25:47 PM|sell|152|BWBKB|1.11|168.72
D003|03/11/2012 10:17:35 AM|sell|764|QMBD|.92|702.88
D004|01/30/2012 06:48:11 PM|sell|949|BCEHW|.83|787.67
D004|04/27/2012 01:29:45 PM|buy|703|GUAH|43.38|30496.14
...
E002|01/14/2012 00:30:50 AM|sell|875|XH||0.00
E002|01/17/2012 07:29:26 AM|sell|446|YBSFE|309.52|138045.92
E002|04/04/2012 02:07:57 PM|buy|362|XXID|6.60|2389.20
...

Note that:

• The first column contains the sorted ID code. Note the repetition of ID codes that
simulate multiple transactions per customer. Repetition frequency is based on
the number of records generated and restrictions on the SET values.
• Random timestamps were drawn from a range in a timestamp SET file
(see Timestamp SET Files with Ranges on page 117), and this was used as a sort
key where ID codes were the same.
• A "buy" or "sell" transaction type was chosen from a literal SET
(see Literal SETs on page 122).
• The number of shares was controlled by a literal numeric range SET (see Literal
SETs on page 122).
• The stock symbol and stock prices were drawn from a relational SET file
(see Relation SET Files on page 119).
• The final column’s value was cross-calculated from the number of shares bought
or sold times the price of that stock.

brokers.out (excerpts)

A010||Los Angeles, CA
A011|Acme Brokerage Company|Atlanta, GA
A013|Acme Brokerage Company|Atlanta, GA
...
B004|BrandX Traders, Inc.|Biloxi, MS
B008|BrandX Traders, Inc.|Trenton, NJ
B010|BrandX Traders, Inc.|Salem, NC
...
C001|Corporate Stockhouse, Intl.|Memphis, TN
C004|Corporate Stockhouse, Intl.|Washington, D.C.
C005|Corporate Stockhouse, Intl.|Memphis, TN
...
D003|Delaney and Sons|Las Vegas, NV
D007|Delaney and Sons|Topeka, KS
D008|Delaney and Sons|Topeka, KS
...
E001|Endeavor Stocks and Bonds|Chicago, IL
E003|Endeavor Stocks and Bonds|Detroit, MI
E004|Endeavor Stocks and Bonds|Chicago, IL
...

Note that:

• the first column contains the sorted ID code
• an /INCLUDE statement with field name references (that is, without logical
expressions) was used to ensure uniqueness of ID codes, thereby preventing
duplicate ID codes in this broker table (see /INCLUDE and /OMIT Syntax on
page 146)
• the broker name was drawn from a relational SET file that matched ID code
prefixes with a corresponding broker (see Relation SET Files on page 119)
• the broker city (branch) was drawn from a relational SET file that matched a
given broker name with any of three city branches.

Conclusion

The above tables, when loaded into a database, will provide a transactional environment.
Because of the common ID code key across the tables, you can join them and create
views that have relational integrity.
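The master-set approach used in this example can be sketched in Python. This is an illustration of the idea only, not of how RowGen works internally. The function build_tables, the dictionary BROKERS, the small ID range, and the sample values are all assumptions. A master list of rows is generated, sorted on ID code and time, and then split into a customers table (one row per ID code), a trades table (every row, with a cross-calculated total), and a brokers table (one row per ID code, with the broker chosen by ID prefix).

```python
import random

# Hypothetical prefix-to-broker relation, using names seen in brokers.out.
BROKERS = {
    "A": "Acme Brokerage Company",
    "B": "BrandX Traders, Inc.",
    "C": "Corporate Stockhouse, Intl.",
    "D": "Delaney and Sons",
    "E": "Endeavor Stocks and Bonds",
}


def build_tables(n_rows, seed=123):
    rng = random.Random(seed)
    # Master set: each row carries the column data for all three tables.
    master = []
    for _ in range(n_rows):
        master.append({
            # Small number range so that ID codes repeat across rows.
            "idcode": rng.choice("ABCDE") + f"{rng.randrange(20):03d}",
            "name": rng.choice(["Mattis,Marjorie", "Darnall,Geneva", "Tse,Scott"]),
            "time": rng.randrange(1_000_000),  # stand-in for a timestamp
            "num_shares": rng.randrange(0, 1001),
            "stock_price": rng.choice([0.53, 26.19, 194.64]),
        })
    # Sort on the shared key, then on time where ID codes are equal.
    master.sort(key=lambda r: (r["idcode"], r["time"]))

    customers, trades, brokers, seen = [], [], [], set()
    for row in master:
        if row["idcode"] not in seen:  # keep only the first row per ID code
            seen.add(row["idcode"])
            customers.append((row["idcode"], row["name"]))
            brokers.append((row["idcode"], BROKERS[row["idcode"][0]]))
        total = round(row["num_shares"] * row["stock_price"], 2)  # cross-calc
        trades.append((row["idcode"], row["time"], row["num_shares"],
                       row["stock_price"], total))
    return customers, trades, brokers


customers, trades, brokers = build_tables(200)
print(len(customers), len(trades), len(brokers))
```

Because all three tables are cut from the same sorted master rows, every ID code in the trades table also appears exactly once in the customers and brokers tables, which is what makes joins on the ID code meaningful.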


6 PIPING ROWGEN DATA INTO ANOTHER PROCESS

Example 18 RowGen Output —> SQL*Loader

This example shows how you can stream rows of data directly from RowGen into
another process. By streaming data, rather than creating physical files, you do not have
to wait for the data generation to complete in order to start the next process. With this
method, I/O efficiency can scale linearly as data sets grow.

This example also shows how you can use the ctl2ddf utility to translate SQL*Loader
control file statements into RowGen-supported /FIELD layouts. The /FIELD layouts
are written to a data definition file which can be invoked from within a RowGen job
specification script (see Data Definition Files on page 62).

Consider that you have an empty Oracle table instream, which you expect to populate
on a nightly basis with new test data. The table description for instream is:

Name      Type
--------- ----------------
CODE      CHAR(5)
PRICE     NUMBER(6,2)
PART      VARCHAR2(12)

Your control file, stream_in.ctl, is used to load the new nightly test data into this table:

LOAD DATA
INFILE 'outstream.dat'
TRUNCATE
INTO TABLE instream
TRAILING NULLCOLS
(code position(0001:0005) char,
price position(0006:0012) DECIMAL EXTERNAL,
part position(0013:0024) char)

To generate the metadata used by RowGen, you can run the utility ctl2ddf against this
control file, as follows:

ctl2ddf stream_in.ctl

which produces instream.ddf:

/FILE=outstream.dat
/FIELD=(code, POSITION=1, SIZE=5)
/FIELD=(price, POSITION=6, SIZE=7.2, NUMERIC)
/FIELD=(part, POSITION=13, SIZE=12)


Note that:

• the .ddf file was named instream.ddf, as derived from the INTO TABLE entry in
the control file
• the /FILE entry uses the INFILE name from the control file
• the "price" field was given a NUMERIC data type for RowGen purposes, derived
from DECIMAL EXTERNAL; the precision defaults to 2, and the size is 7 to account
for the decimal point.
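The arithmetic behind the translation above can be shown with a short Python sketch. This is not the ctl2ddf implementation. The function name column_to_field is hypothetical, and the sketch handles only the two column types that appear in stream_in.ctl. The size is computed as end minus start plus one, and a DECIMAL EXTERNAL column is given the default precision of 2 described above.

```python
import re

# Matches e.g.: price position(0006:0012) DECIMAL EXTERNAL
_COLUMN = re.compile(r"(\w+)\s+position\((\d+):(\d+)\)\s+(.+)", re.IGNORECASE)


def column_to_field(spec):
    """Translate one control-file column clause into a /FIELD statement."""
    # Drop the list punctuation that surrounds a clause in the control file.
    cleaned = spec.strip().lstrip("(").rstrip(",)")
    match = _COLUMN.fullmatch(cleaned)
    if match is None:
        raise ValueError(f"unrecognized column clause: {spec!r}")
    name = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    dtype = match.group(4).strip().upper()
    size = end - start + 1  # positions are inclusive
    if dtype == "DECIMAL EXTERNAL":
        return f"/FIELD=({name}, POSITION={start}, SIZE={size}.2, NUMERIC)"
    return f"/FIELD=({name}, POSITION={start}, SIZE={size})"


for clause in ("(code position(0001:0005) char,",
               "price position(0006:0012) DECIMAL EXTERNAL,",
               "part position(0013:0024) char)"):
    print(column_to_field(clause))
```

Applied to the three column clauses of stream_in.ctl, the sketch prints the same three /FIELD lines that appear in instream.ddf.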

For the purposes of generating meaningful data, this example requires that you manually
change one of the field statements in instream.ddf in order to reference SET file values.
That is, change the part field to the following:

/FIELD=(part,SET=parts_list2.set,POSITION=13,SIZE=12)

If you do not make this change, the example will run, but the "part" field will consist of
random ASCII characters.

Consider the following script, stream_out.rcl. This invokes the /FIELD statements in
instream.ddf:

/SPECIFICATION=instream.ddf # invoke the /FIELD layouts
/INFILE=outstream.dat # refers to the /FILE entry in instream.ddf
/OMIT WHERE price < 0 # no negative values generated
/INCOLLECT=1000000 # generate one million rows that satisfy omit
/SORT
/KEY=code
/OUTFILE=outstream.dat # data to be streamed in to the control file

This script generates 1 million records to be loaded, and orders the rows on the
"code" field. The DIRECT=TRUE clause, given on the sqlldr command line in stream.bat,
bypasses SQL*Loader’s indexing function, which would otherwise slow the load with its
internal sort. Because the data has been pre-sorted on the desired key(s) by RowGen,
you can specify this faster direct path load option.

Finally, you can use the batch file, stream.bat, which will perform the streaming
operation:

mkfifo outstream.dat
rowgen /spec=stream_out.rcl |
sqlldr scott/tiger control=stream_in.ctl DIRECT=TRUE


NOTE
When streaming RowGen data into a process other than SQL*Loader, you may not
need to use a named pipe (as created by mkfifo). You can declare stdout for RowGen
output (/OUTFILE=stdout), and stdin can be the input source to the next process.
SQL*Loader, however, requires a .dat extension for its data source (INFILE), so the
named pipe convention was used in this example.
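The named pipe mechanism itself can be demonstrated without RowGen or SQL*Loader. The following Python sketch is POSIX only (os.mkfifo is not available on Windows) and uses hypothetical function names. A writer thread stands in for the generating process and the main thread stands in for the loader. Both sides open the same path, and rows pass from one to the other without a physical data file being completed first.

```python
import os
import tempfile
import threading


def produce(path, n):
    # Opening a FIFO for writing blocks until a reader has opened it.
    with open(path, "w") as fifo:
        for i in range(n):
            fifo.write(f"{i:05d}\n")


def consume(path):
    rows = []
    with open(path) as fifo:
        for line in fifo:  # ends when the writer closes its side
            rows.append(line.rstrip("\n"))
    return rows


def stream_rows(n):
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "outstream.dat")
        os.mkfifo(path)  # same role as: mkfifo outstream.dat
        writer = threading.Thread(target=produce, args=(path, n))
        writer.start()
        rows = consume(path)
        writer.join()
    return rows


print(stream_rows(3))
```

The reader starts receiving rows as soon as the writer produces them, which is the property the example above relies on when RowGen and SQL*Loader run at the same time.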

When you run stream.bat, the RowGen and SQL*Loader processes begin
simultaneously. To verify the results, check the contents of the newly populated table in
Oracle, which now contains 1 million rows:

SQL> select * from instream where rownum < 11;

CODE       PRICE PART
----- ---------- ------------
01Lln    4403.15 switches
03zx4    1727.90 lightbulbs
05WEC    5133.66 hammers
08zY1    3546.96 sanders
0AaGY     402.68 lighters
0ArUv    6141.93 ratchets
0BlYr    4409.89 screwdrivers
0EKcU    6696.61 buzz saws
0Eems    3822.16 pliers
0FTjd      92.36 nails

10 rows selected.

Note that:

• the "CODE" field is ordered alphabetically
• the "PRICE" field contains no negative values
• the "PART" field contains randomly selected values from the parts_list2.set
file.

NOTE
To improve performance in large generation jobs, consider licensing RowGen for
additional CPUs, if applicable, on your system.


ROWGEN CONTROL LANGUAGE

1 PURPOSE

This chapter describes the RowGen Control Language (RCL). RCL syntax uses
intuitive key words and logical expressions to generate and produce any desired volume
and format of test data.
RCL syntax is explicit; it is a 4GL designed to be easy for anyone to learn, use and
modify. RCL job scripts can be invoked from the command line or the IRI Workbench
Eclipse GUI, embedded into batch scripts, or called from programs.
With a single RCL job script and I/O pass through the RowGen program, you can
generate one or more random files (i.e., from randomly generated data, or from
randomly selected real data) in one or more formats. If you wish, you can also
simultaneously incorporate major transformation and presentation functions for future
application prototyping and report format sharing. The supported manipulations are:
• select
• sort
• aggregate
• cross-calculate
• report.

More specifically, RowGen provides the following functionality during data synthesis:
• A single input definition can be filtered and sorted, and multiple input definitions
with various formats can be mapped into multiple output files with different
formats.
• Summary results can be generated (to multiple levels of grouping at the same
time), and can include maximums and minimums, totals, counts, and averages.
• Detail records can be output to one file, with aggregations sent to another.

RowGen can accomplish its transformation mappings from input to output because it
references the layouts of named fields within your record definitions. This also allows
you to reference metadata that are centrally maintained (in data definition files),
providing the basis for a shared data environment and the creation of materialized views.

Because the source for RCL is the CoSort Sort Control Language (sortcl) program,
your RCL job scripts can be used immediately for equivalent data transformations and
reporting on real data sources if you also license CoSort.

The RowGen data flow diagram on page 54 illustrates the functionality of RowGen by
following the flow of data from input (generation), optional processing (sorting), and
output (production). Data can be mapped, sorted, and remapped into multiple output
targets and formats.


NOTE
The job script files (.rcl), SET files (.set) and result files (.out) referenced
throughout this chapter are provided in the /examples/RCL_chapter subdirectory of
your RowGen install directory, so you can run these examples and re-create the results.

If you are specifically interested in using RowGen to create test data in structurally and
referentially correct target tables of a relational database, please use the IRI Workbench
GUI, built on Eclipse, and its DB test data creation job wizards, in particular.

The GUI also has a custom test data job wizard that supports RCL functions and
specifications for bespoke table, file, and report targets needing test data.


2 CONVENTIONS

This section describes documentation conventions that are used throughout this chapter.

2.1 Line Continuation

The backslash character \ is the continuation character for RowGen specification
statements longer than one line. In a RowGen script, you can include blanks, comments
(starting with #), or tabs on the same line following the \ character. However, the next
line must begin with the continuation of the statement that was interrupted by the \.
On the command line, UNIX users can use a backslash as a line continuation character,
but spaces, tabs and comments are not supported beyond the \.

Windows users cannot use \ on the command line. Long lines should simply wrap
around to the next line.

WARNING!
You cannot use a \ line continuation character to separate a
[path]filename reference, even if it contains spaces (see
Spaces within File Names/Paths: Windows Users Only on
page 56). Also, you cannot use the line continuation character to
break up any word. You must place the \ before or after the
complete word, and complete the statement on the next line.

2.2 Comments

The # character marks the beginning of comments on a line. Comments may begin after
a statement on the same line, or may be on a line by themselves. The comment continues
until the end of the line, and is ignored during processing.
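The line continuation and comment rules described above can be modeled with a few lines of Python. This is a simplified sketch and not RowGen's parser. The function name join_continued_lines is hypothetical, and the sketch assumes that every # starts a comment, so a # inside a quoted string is not handled.

```python
def join_continued_lines(lines):
    """Join physical script lines that end in a backslash into statements."""
    statements = []
    pending = ""
    for raw in lines:
        # Assumption: '#' always begins a comment that runs to end of line.
        text = raw.split("#", 1)[0].rstrip()
        if text.endswith("\\"):
            # Blanks, tabs, or a comment may follow the backslash; the
            # statement resumes at the start of the next line.
            pending += text[:-1]
            continue
        pending += text
        if pending.strip():
            statements.append(pending)
        pending = ""
    if pending.strip():
        statements.append(pending)
    return statements


script = [
    "/FIELD=(Goods,\\   # the field name",
    "POSITION=10)",
    "# a comment on a line by itself",
    "/SORT  # order the records",
]
print(join_continued_lines(script))
```

The four physical lines above reduce to two statements, /FIELD=(Goods,POSITION=10) and /SORT. Note that the backslash is placed after a complete word, as the warning above requires.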

2.3 Optional Statement Parameters

Square brackets [] are used to describe optional parameters or values. They are also
used in numeric SET file ranges (see Numeric SET Files with Ranges on page 113).

2.4 Environment Variables

The character $ preceding a variable name directs RowGen to replace the environment
variable with its current value. You may use any of the following conventions:


• $variable
• ${variable}
• $[variable]

UNIX users defining environment variables in a Bourne or Korn shell must export
the variables for RowGen to recognize them.
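The three reference forms listed above can be recognized with a single regular expression. The following Python sketch is an illustration only. The function name expand is hypothetical, the treatment of undefined variables (left unchanged) is an assumption rather than documented RowGen behavior, and escaped dollar signs are not handled.

```python
import re

# $variable, ${variable}, or $[variable]
_VARIABLE = re.compile(r"\$(?:\{(\w+)\}|\[(\w+)\]|(\w+))")


def expand(text, env):
    """Replace variable references in text with values from env."""
    def substitute(match):
        name = match.group(1) or match.group(2) or match.group(3)
        # Assumption: an undefined variable is left exactly as written.
        return env.get(name, match.group(0))
    return _VARIABLE.sub(substitute, text)


env = {"DATA": "/home/test"}
for line in ("/INFILE=$DATA/in.dat",
             "/INFILE=${DATA}/in.dat",
             "/INFILE=$[DATA]/in.dat"):
    print(expand(line, env))
```

All three lines expand to /INFILE=/home/test/in.dat. To expand against the real process environment, os.environ can be passed as env.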

2.5 Naming Conventions

The following rules apply both to names and statements recognized by RowGen, and to
file names and field names that you define:

• may be any length
• may contain any combination of the following characters:
  • alphabetic
  • numeric
  • embedded underscore (_).

WARNING!
Hyphenated names can be interpreted as mathematical
expressions involving subtraction. The use of underscores
is therefore recommended for compound file and field names.

Statements and field names are not case-sensitive, that is, upper and lowercase letters are
interchangeable, for example:
• POSITION is the same as position.
• /FIELD=(PARTY) is the same as /field=(Party)

UNIX paths and file names, however, are case-sensitive, so the file chicago is not
the same as the file CHICAGO, Chicago, etc.

For Windows users with File Allocation Table (FAT) and NT File System (NTFS) file
systems, file names are not case-sensitive, so the file chicago is the same as the file
CHICAGO, Chicago, etc.

Refer to your operating system manual for the acceptable maximum length and naming
format of a file name.


stdin and stdout are used for standard input and standard output (pipes), respectively.
When an output file name is not specified, the default is stdout.

2.6 Abbreviations

Generally, RowGen words in statements can be truncated by deleting trailing letters.
The control word need only be long enough to distinguish it from any other control
word, for example:
/FIELD=(Goods,POSITION=10) is the same as
/FIELD=(Goods,POS=10)

and
/SPECIFICATIONS is the same as
/SPEC
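The truncation rule amounts to unique-prefix matching, which the following Python sketch illustrates. The function name resolve is hypothetical, and KEYWORDS is a small illustrative subset of words that appear in this guide, not RowGen's actual keyword table.

```python
# Illustrative subset only; RowGen's real keyword list is larger.
KEYWORDS = ["POSITION", "SEPARATOR", "SIZE", "SET", "NUMERIC"]


def resolve(word, keywords=KEYWORDS):
    """Return the keyword that a (possibly truncated) word stands for."""
    upper = word.upper()  # control words are not case-sensitive
    if upper in keywords:
        return upper      # an exact match always wins
    matches = [k for k in keywords if k.startswith(upper)]
    if len(matches) != 1:
        raise ValueError(f"{word!r} is unknown or ambiguous: {matches}")
    return matches[0]


print(resolve("pos"))  # POSITION
print(resolve("SEP"))  # SEPARATOR
```

With this subset, S and SE are ambiguous and are rejected, while SEP, SI, and POS each identify exactly one keyword.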


3 EXECUTION

RowGen is a standalone program that is executed from the command line or from
within a batch script.

To begin execution from the command line, enter the program name RowGen followed
by either actual specifications, a job specification file, or a combination of both.

To display the version and other information, enter:

rowgen /version

This will display something like:

RowGen Version 3.1 D95130121-1715 #07223.8394, 4 CPUs sun4u
Expires April 21, 2013

Although you can enter RCL statements on the command line, it is strongly
recommended that you organize these statements into a job specification (script) file
which RowGen reads and processes (see Job Specification Files on page 61). This
prevents you from encountering difficulties with shell control characters and command
line buffer limits, and you can preserve your scripts for repeated use.
The syntax for executing a script from the command line is:
rowgen /SPECIFICATION=[path]script_file

/SPECIFICATION can be abbreviated as in the following example:

rowgen /spec=/home/test/job10.rcl

The syntax required for Windows users referencing the path to a script_file
depends on whether the drive letter is used. The following is an example of the syntax
required in each case:

rowgen /spec=C:\\home\\test\\job10.rcl
rowgen /spec=/home/test/job10.rcl

That is, Windows users must use a double backslash when including the drive letter as
part of a /SPECIFICATION= command-line statement. If the drive letter is not used,
you can use either a single forward slash (for consistency with UNIX) or a single
backslash (the standard Windows convention). See Spaces within File Names/Paths:
Windows Users Only on page 56 if the name of the job script, or its path, contains a
space.


During RowGen execution, any syntax errors are reported by the line number in your
script. The script also provides a way to re-execute, share, and modify your data
definitions and/or job specifications.

NOTE
You can adjust the MONITOR_LEVEL resource control setting, which determines the
degree of verbosity for reporting the progress of a job (see Resource Control Settings
on page 213). For details on on-screen error messages, see Table of Error Values on
page 219.


4 USAGE

This section describes the structure of a RowGen script, and explains the basic syntax
requirements for generating data, ordering data (if required), and performing additional
transformations on the generated data (if required). RowGen data flow is controlled by
statements specified in the input, action, and output sections of a RowGen job
specification script, as illustrated in the diagram below. For more details, see the
following sections:

• Data Flow Structure in RowGen Scripts on page 55
• Input Filenames and Attributes on page 56
• Action on page 56
• Output Filenames and Attributes on page 57
• /INCOLLECT on page 59
• /OUTCOLLECT on page 59.

RowGen Data Flow

Job Sections

Input            /INFILE=filename
                 /FIELD=description
                 /FIELD=description
                 ...
                 other attributes

Action/Process   /SORT or /REPORT
                 key(s) attributes

Output(s)        /OUTFILE=target #1
                 /FIELD=description
                 /FIELD=description
                 ...
                 other attributes

                 /OUTFILE=target #2
                 ...

[Diagram labels: Data Generation Only; Data Generation with Field Transforms,
Summaries; Other Process such as SQL*Loader]


4.1 Data Flow Structure in RowGen Scripts

A RowGen job specification script consists of three main sections:

Input               Random data is generated based on specifications given here.

Action (Process)    This can be a /SORT (the default) over one or more key fields,
                    or a /REPORT (where the generated records are not ordered).

Output              The generated and processed random data is sent to a named
                    output target(s), where a modified record layout, and/or other
                    target-specific requirements, can be specified.

The input section of a RowGen script consists of an /INFILE name, followed by a set
of attributes such as the number of records to be generated (/INCOLLECT), the field
layout, and record filters (see Input Filenames and Attributes on page 56). In order for
RowGen to generate data, an /INFILE name must be followed by one or more /FIELD
statements. This requirement does not apply to the output section unless you have
additional output-specific requirements (see Output Filenames and Attributes on
page 57). The output section should, however, consist of one or more /OUTFILE
declarations, where you name each output file. Otherwise, the default output target is
stdout.

By default, if you do not include attributes beneath an /OUTFILE statement, the
/INFILE field layout and other input attributes you specify are not changed in the
output.

If you require customized record formatting, aggregation, or field-level transformations
such as remapping or data-type conversion, you must include one or more /FIELD
statements on output (and other statements, if applicable) to describe the new record
layout and/or other output-level transformations. The output /FIELD statements can
either reference field names that were specified in the input section, or you can introduce
new field names if you are deriving an output field value. This chapter describes the
various ways to perform RowGen transformations, starting with FIELD EXPRESSIONS
(CROSS-CALCULATION) on page 123.

NOTE
For documentation purposes, the terms generated and produced are used to
differentiate between the data that are generated internally by RowGen based
on your input attributes, and the output targets that are ultimately created.


4.2 Input Filenames and Attributes

Input file names in RowGen are required only as placeholders for when real data
become available. RowGen is designed only to generate new data files/streams and
reports based on your specifications. However, if real data sources will become
available at some time in the future, it is recommended that you specify the name of the
input file(s), if known, that you will be working with. In this way, if you purchase a
license for CoSort, the sortcl program interface will recognize, as a legitimate input
source, the input file(s) you have specified, reducing (or in some cases, eliminating) the
need to modify the RowGen script.
The syntax for the input file section of a RowGen script is as follows:
/INFILE=[path]filename
attributes

Optionally, you can create multiple /INFILE sections, each with their own attributes, to
generate disparate data sets that can be processed together, as illustrated in Example 16
on page 36.

A valid input file name is also a named pipe or an unnamed pipe (stdin). Standard input
is relevant if you intend to upgrade to CoSort and use streamed-in data as an input
source. Similarly, if you intend to upgrade to CoSort’s sortcl to perform a job that
involves multiple input sources, you may use the following syntax in RowGen to
identify multiple files of similar formats:

/INFILES=([path]filename1,[path]filename2,...)
attributes

Spaces within File Names/Paths: Windows Users Only

In some cases, you may want RowGen, for placeholder purposes, to reference a
path name or file name that contains a space. RowGen allows you to use the Windows
convention of a tilde (~) followed by a 1 to substitute for the space. For example, within
a RowGen script, you can use:

/INFILE=C:\progra~1\iri\rowgen21\chiefs

This feature also applies to /OUTFILE (see OUTPUT OPTIONS on page 169) and
/FILE statements (see Data Definition Files on page 62) within scripts.

4.3 Action

This section describes the two fundamental actions (processes) that can take place when
moving generated records (as defined in the input section) to one or more outputs (as
defined in the output section). Only one action statement can be designated in a
RowGen job; they are mutually exclusive. You can specify either of the following
action statements:

/REPORT Generated data are passed to output without ordering, that is, as
unsorted records.

/SORT Records are ordered over one or more keys that you specify. This is
the default action when no action is specified. If no /KEY statements
are included beneath a /SORT statement, the default key is a left-to-
right sort over the entire record (see KEYS on page 135).

NOTE: CoSort’s sortcl program (optional upgrade) runs the above actions on
RowGen-produced data or on real data, and also offers /JOIN, /MERGE, and
/CHECK actions.

4.4 Output Filenames and Attributes

As records are generated by RowGen, they are processed and written to one or more
outputs, each with optional target-specific field layouts, record-filters, and other
attributes. An output target can be named stdout (the default if no /OUTFILE statement
is included), where RowGen displays results on-screen unless streaming into another
process (see Example 1 on page 13 and Example 18 on page 44). In simple, generation-
only RowGen scripts, there is no need to specify any output file attributes other than the
file name because the data generated and processed by RowGen are based on the layout
(and any record-filter logic) provided in the input file section of the script.

However, if you are performing a job with target-specific requirements, you must add
one or more /FIELD statements (and other statements if required) beneath your output
file name declaration. /FIELD and other statements included on output allow you to
exploit the full range of RowGen’s capabilities in addition to random data generation,
including:

• data-type conversion
• remapping
• conditional field value / substitution
• record reformatting
• complex report formatting
• drill-down and roll-up aggregation
• other derivations (such as mathematical formulae).


The syntax for declaring an output file is therefore:

/OUTFILE=[path]filename
optional_attributes

This chapter describes all the various ways to perform the above transformations,
starting with FIELD EXPRESSIONS (CROSS-CALCULATION) on page 123.

NOTE: RowGen allows you to invoke a custom output procedure by way of the
command /OUTPROCEDURE, rather than using the standard /OUTFILE
command described in this section. You can use a custom output procedure to
redirect output records to a proprietary DBMS or ETL tool, for example. In
these cases, you can program a hook between the API of the tool (such as
ODBC or OCI) and your RowGen output procedure. Contact your RowGen
agent for details on using /OUTPROCEDURE.

If you intend to produce multiple output files from the same generated data, then you
must define file-specific attributes with each new /OUTFILE statement, for example:
/OUTFILE=apples
attributes for the apples file
/OUTFILE=peaches
attributes for the peaches file
/OUTFILE=pears
attributes for the pears file

In cases where output reports contain both summary and detail records, a single output
file such as pears will require multiple record formats, in which case you must use an
/OUTFILE statement to describe the attributes of each format, using the same filename:
/OUTFILE=pears
attributes_a
/OUTFILE=pears
attributes_b
...

See Example 14 on page 31.


4.5 /INCOLLECT

This statement determines the number of records to be generated by RowGen after any
include or omit logic is applied. It is specified last in the inputs section(s) of a script. The
syntax is:

/INCOLLECT=n
or
/INCOLLECT=PERMUTE

where n is the number of records to be generated and PERMUTE is all possible
permutations (after any include or omit logic is satisfied). With no /INCOLLECT
statement, the number of records generated is 100.

You can specify the number of records produced on output using the /OUTCOLLECT
statement (see MISCELLANEOUS OPTIONS on page 174). By default, the
/OUTCOLLECT value is the same as the /INCOLLECT value.

4.6 /OUTCOLLECT

This statement determines the number of records produced for each output target (file)
after data is generated and processed, and then after any output /INCLUDE, /OMIT, and
/OUTSKIP filters are applied. It is specified in the output section(s) of a script.
The syntax is:

/OUTCOLLECT=n

where n is the maximum number of records produced in the output file.

To determine the number of records generated on the input side (after input conditions
are satisfied), use /INCOLLECT (see /INCOLLECT on page 59). If the /OUTCOLLECT
value you specify is greater than the /INCOLLECT value, RowGen cannot produce
more records than were generated.

NOTE: In cases where you are applying record filter logic (see INCLUDE-OMIT
(RECORD SELECTION) on page 146), it is possible that the specified number
of /OUTCOLLECT records for a given output file will not be produced because
a smaller number of generated records meets the output filter condition(s) you
have specified.
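
The relationship between the generated record count, output filters, and /OUTCOLLECT can be pictured with a short Python sketch. This is an illustration of the counting rules described above, not RowGen code; the function name produce and the sample filter are hypothetical.

```python
from itertools import islice


def produce(generated, outcollect=None, keep=lambda rec: True):
    """Return the records written to one output target.

    generated  -- records already generated on input (the /INCOLLECT count)
    outcollect -- cap on records produced; None mimics no /OUTCOLLECT
    keep       -- stand-in for output /INCLUDE or /OMIT logic
    """
    passed = (rec for rec in generated if keep(rec))
    if outcollect is None:
        return list(passed)
    return list(islice(passed, outcollect))


generated = list(range(1, 101))  # 100 records, the default /INCOLLECT value

# Asking for more than was generated cannot create extra records.
print(len(produce(generated, outcollect=500)))  # 100

# A filter may leave fewer records than the requested /OUTCOLLECT value.
print(len(produce(generated, outcollect=50, keep=lambda r: r % 4 == 0)))  # 25
```

In both calls the cap is an upper bound only: the number of records that reach the target is the smaller of the cap and the number of generated records that pass the filter.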


5 FILES

This section describes the types of files that can be referenced by RowGen:

• Resource Control File (rowgenrc) below
• Job Specification Files on page 61
• Data Definition Files on page 62
• File Formats (/PROCESS) on page 62
• Statistics on page 69.

5.1 Resource Control File (rowgenrc)

This is a text file you can use to set system resources locally and/or globally that
RowGen’s embedded CoSort engine uses for high-performance sorting of your random
data. You can adjust and control several resources to improve sorting efficiency, such as
the:
• number of threads
• amount of memory
• overflow areas
• verbosity of monitor messages.

On UNIX systems, RowGen searches for a .rowgenrc or rowgenrc file within specific
directories. Normally, a default rowgenrc is built in your $ROWGEN_HOME/etc
directory by your system administrator at installation time.

On Windows systems, the resource control file is called rowgen.rc, and its values
take precedence over default registry settings only when this file is in the same directory
as the RowGen (or a separately licensed RowGen) executable. 

You may set the environment variable COSORT_TUNER to the path and name of a
resource control file. You may also create multiple resource control files so that
specific users and/or jobs have their own settings. To ensure that a particular file is
used with a specific job, you can use the following statement within a RowGen job
script:
/MEMORY-WORK="[path]filename"

The /MEMORY-WORK values have a higher priority than those in COSORT_TUNER. After
values in COSORT_TUNER are checked, any values in .rowgenrc (UNIX) or rowgen.rc
(Windows) in the search path will be read. If there are still values that have not been set,
factory default values will be used (see Search Order for Resource Controls on
page 211).
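
The precedence described above can be sketched in Python as a first-match merge. This is only an illustration of the search order; the setting names and values (THREADS, MEMORY, MONITOR) are hypothetical placeholders, not actual resource control entries.

```python
# Hypothetical factory defaults; real entry names and values are not shown here.
FACTORY_DEFAULTS = {"THREADS": 1, "MEMORY": "64MB"}


def resolve(job_file=None, tuner_file=None, rc_file=None,
            defaults=FACTORY_DEFAULTS):
    """Merge settings so that earlier sources win over later ones.

    job_file   -- values from the file named by /MEMORY-WORK in the job script
    tuner_file -- values from the file named by COSORT_TUNER
    rc_file    -- values from .rowgenrc (UNIX) or rowgen.rc (Windows)
    """
    resolved = {}
    for source in (job_file, tuner_file, rc_file, defaults):
        for key, value in (source or {}).items():
            resolved.setdefault(key, value)  # keep the first value seen
    return resolved


settings = resolve(
    job_file={"THREADS": 4},
    tuner_file={"THREADS": 2, "MEMORY": "1GB"},
    rc_file={"MONITOR": 1},
)
print(settings)  # {'THREADS': 4, 'MEMORY': '1GB', 'MONITOR': 1}
```

A value is taken from the highest-priority source that sets it, and only unset values fall through to the next source.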


5.2 Job Specification Files

These files are used to organize RowGen statements. These named files are referenced
by a /SPECIFICATION or /SPEC statement as shown below:
rowgen /SPECIFICATION=[path]filename

where the filename that is launched by RowGen is the job specification file. When a
/SPECIFICATION=filename statement occurs within a job specification file, the
contents of the referenced file (the Data Definition File, or DDF) are read into the job at
that point. For example, you might have a job specification file as follows:

/INFILE=chiefs # placeholder only; not a physical file
/SPECIFICATION=key1.ddf
/REPORT
/OUTFILE=parties

The DDF key1.ddf could be defined as follows:

/FIELD=(pres,SET=names.set,POSITION=1,SIZE=22)
/FIELD=(votes,POSITION=24,SIZE=3,digit)
/FIELD=(party,SET=types.set,POSITION=28,SIZE=3)
/INCOLLECT=500

This is the equivalent of writing the following job specification (indents applied for
readability only):

/INFILE=chiefs
/FIELD=(pres,SET=names.set,POSITION=1,SIZE=22)
/FIELD=(votes,POSITION=24,SIZE=3,digit)
/FIELD=(party,SET=types.set,POSITION=28,SIZE=3)
/INCOLLECT=500
/REPORT
/OUTFILE=parties

Note that instead of using the name of a specification file, you may use an environment
variable that references the file. In that case, the job specification file might be as
follows:

/INFILE=chiefs
/SPECIFICATION=$KEY1
/REPORT
/OUTFILE=parties
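
The way a /SPECIFICATION reference is read into a job at the point where it occurs can be modeled with a small Python sketch. It is not how RowGen is implemented; the function name expand and the in-memory files dictionary (standing in for files on disk) are assumptions made for illustration.

```python
import os


def expand(lines, files):
    """Inline every /SPECIFICATION (or /SPEC) reference, recursively.

    lines -- statements of a job specification file
    files -- dict mapping a file name to its list of statements
    """
    out = []
    for line in lines:
        statement = line.split("#", 1)[0].strip()  # drop trailing comments
        name, _, value = statement.partition("=")
        if name.upper() in ("/SPECIFICATION", "/SPEC"):
            target = os.path.expandvars(value)  # resolves $KEY1-style names
            out.extend(expand(files[target], files))
        elif statement:
            out.append(statement)
    return out


files = {
    "key1.ddf": [
        "/FIELD=(pres,SET=names.set,POSITION=1,SIZE=22)",
        "/FIELD=(votes,POSITION=24,SIZE=3,digit)",
        "/FIELD=(party,SET=types.set,POSITION=28,SIZE=3)",
        "/INCOLLECT=500",
    ]
}
job = [
    "/INFILE=chiefs # placeholder only; not a physical file",
    "/SPECIFICATION=key1.ddf",
    "/REPORT",
    "/OUTFILE=parties",
]
print("\n".join(expand(job, files)))
```

The printed result is the flattened job shown earlier in this section. Because expand calls itself on the referenced file, a DDF that contains its own /SPECIFICATION statement is inlined in the same way.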


5.3 Data Definition Files

For application independence, it is recommended that you develop and use a
specification file that contains data definitions only. This type of metadata repository is
referred to as a Data Definition File (DDF). It contains a /FILE statement followed by
further attributes for the named data file. The file described can be used as an input or an
output file. DDFs are an ideal way to create user views and share common data.

NOTE: Several RowGen tools convert third-party metadata into RowGen DDFs.
See the ROWGEN TOOLS chapter on page 191 for conversion from Micro
Focus COBOL data dictionaries, Microsoft Comma-Separated Values (CSV)
header records, Oracle SQL*Loader control files (CTL), and W3C Extended
Log (web) Format (ELF) layouts.

When an /INFILE or /OUTFILE statement occurs in a job specification file, the record
definitions can be obtained from the same-named /FILE layout within the data
definition file. The data definition file reference must appear before any /INFILE
or /OUTFILE that will reference the layouts in that data definition file.

For example, the data definition file chiefs.ddf could contain:

# chiefs.ddf
/FIELD=(president,POSITION=1,SIZE=22)
/FIELD=(votes,POSITION=24,SIZE=3)
/FIELD=(service,POSITION=28,SIZE=9)
/FIELD=(party,POSITION=40,SIZE=3)
/FIELD=(state,POSITION=45,SIZE=2)

and the job specification file could be:

/INFILE=chiefs
/SPECIFICATION=chiefs.ddf
/REPORT
/OUTFILE=out # no change to the generated records

which will produce one file of 100 records (the default when no /INCOLLECT
statement is given): out will contain randomly generated records based on the
chiefs layout.

5.4 File Formats (/PROCESS)

By default, all generated RowGen records are consistent with the file format
(process type) called RECORD (see RECORD on page 63).


However, for output purposes, you can change the default file format to any other
supported file format type by using the /PROCESS statement in the output section of a
RowGen job script. You can also produce multiple output files simultaneously, each
with a different file format.

The syntax for a /PROCESS declaration is:

/OUTFILE=filename
/PROCESS=process_type

RowGen can produce the following output file process types:

• RECORD on page 63
• MFVL_SMALL on page 64
• MFVL_LARGE on page 64
• VARIABLE_SEQUENTIAL (or VS) on page 65
• LINE_SEQUENTIAL (or LS) on page 65
• VISION on page 65
• UNIVBF on page 66
• CSV on page 66
• LDIF on page 66
• ODBC on page 67
• XML on page 67
• ELF (W3C Extended Log Format) on page 69.

RECORD

This is the default output process type (equivalent to RECORD_SEQUENTIAL in
COBOL), where every record generated is either variable- or fixed-length:

variable If no /LENGTH is specified on input, or if you specify /LENGTH=0,
then records are variable-length, and terminated with a line-feed or
carriage-return character (depending on the operating system, and the
value of OUTPUT_TERMINATOR option on page 217). In this case,
field contents, and any additional record formatting on output you
specify, can range from 0 to 65,535 bytes in length.

If you generate variable-length records, you still have the option to
specify a fixed-length for one or more output files. Otherwise, the
default output is variable length (LENGTH=0).

fixed You must provide a length statement, on input and output, if you
require that all the records generated and produced are to be the same
length, and not terminated by a linefeed/carriage-return character. If
you do not include a LENGTH statement on output, even if you have
specified one on input, the record length of the output file will default
to LENGTH=0 (variable-length, terminated records). Record length
statements are given in the form:

/LENGTH=n

where every record is length n, and the length can be a maximum of
65,535 (bytes).

NOTE: The process type RECORD_SEQUENTIAL or RS is an alias of the process type
RECORD.

MFVL_SMALL

When you specify /PROCESS=MFVL_SMALL on output, Micro Focus variable-length
records are produced. The file has a 128-byte header record, and each record is
prepended with a binary short integer of two bytes which holds the size of the record.
The maximum record length is 4,096 bytes. Each record begins at an offset address
which is evenly divisible by four.

When you are using this process type on output, you have the option to set your own
minimum and maximum record lengths, the values of which will be written to the
header record to support inter-process conversions.

The syntax for using this process on output therefore supports additional options:

/PROCESS=MFVL_SMALL[,min_length][,max_length]

where min_length and max_length are the minimum and maximum record
lengths contained in the output file. If /PROCESS=MFVL_SMALL on input, this
information is taken from the input file if you do not specify these attributes.

MFVL_LARGE

When you specify /PROCESS=MFVL_LARGE on output, Micro Focus variable-length
records are produced. The file has a 128-byte header record, and each record is
prepended with a binary int of four bytes which holds the size of the record. The
maximum record length is 268,435,455 bytes. Each record begins at an offset address
which is evenly divisible by four.


When you are using this process type on output, you have the option to set your own
minimum and maximum record lengths, the values of which will be written to the
header record to support inter-process conversions.

The syntax for using this process on output therefore supports additional options:

/PROCESS=MFVL_LARGE[,min_length][,max_length]

where min_length and max_length are the minimum and maximum record
lengths contained in the output file. If /PROCESS=MFVL_LARGE on input, this
information is taken from the input file if you do not specify these attributes.

VARIABLE_SEQUENTIAL (or VS)

When you specify /PROCESS=VS on output, RowGen produces variable-length records
that are a string of characters prepended by a short integer. The value of the short is the
length of the record, for example, ABC is (x03 x00 x41 x42 x43) on a computer that is
little-endian.

WARNING!
The format of the short integer is machine-dependent.
Therefore, VS data between dissimilar computers (for example,
between RISC and SPARC) may be incompatible due to the
difference in endianness.
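
The byte layout and the endianness caveat can be reproduced with Python's struct module. This is a minimal sketch that assumes the length prefix is an unsigned two-byte short; the function name vs_record is hypothetical.

```python
import struct


def vs_record(payload: bytes, byteorder: str = "<") -> bytes:
    """Prefix payload with its length as a 2-byte short.

    byteorder -- "<" for little-endian, ">" for big-endian
    """
    return struct.pack(byteorder + "H", len(payload)) + payload


print(vs_record(b"ABC").hex(" "))       # 03 00 41 42 43 (little-endian)
print(vs_record(b"ABC", ">").hex(" "))  # 00 03 41 42 43 (big-endian)
```

The same three-byte record yields two different prefixes, which is why a VS file written on one architecture may be misread on another.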

LINE_SEQUENTIAL (or LS)

When you specify /PROCESS=LS on output, RowGen produces variable-length records
that are strings of characters where each low value data byte (x00 through x1f) is
protected (or prepended) by a null byte (x00). The terminating linefeed character is not
protected.
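
The protection rule can be sketched in a few lines of Python. This is an illustration of the description above rather than a byte-exact reference for the format; the function name ls_record is hypothetical, and the terminator is assumed to be a single linefeed (x0a).

```python
def ls_record(data: bytes) -> bytes:
    """Encode one record, protecting low-value bytes with a null byte."""
    out = bytearray()
    for byte in data:
        if byte <= 0x1F:
            out.append(0x00)  # protect the low-value data byte
        out.append(byte)
    out.append(0x0A)          # terminating linefeed, not protected
    return bytes(out)


# A tab (x09) inside the data is preceded by x00; the final x0a is not.
print(ls_record(b"A\tB").hex(" "))  # 41 00 09 42 0a
```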

VISION

When you specify /PROCESS=VISION on output, an ACUCOBOL-GT (index) file is
produced in Vision format. You can use the utility vutil32.exe (or vutil) to
view, or display information about, the converted file.

When using /PROCESS=VISION on output, you must specify the index /KEY in the
sortcl job script in order to create an indexed file.


UNIVBF

When you specify /PROCESS=UNIVBF on output, RowGen produces variable-length
records that are prepended by a 4-byte ASCII field giving the length of the record,
including the four bytes. This format is useful for Unisys mainframe users
writing to tapes.
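
As a rough illustration, the length prefix could be built as follows in Python. The text above does not state how the four ASCII bytes are filled, so this sketch assumes zero-padded decimal digits; treat that, and the function name univbf_record, as assumptions.

```python
def univbf_record(payload: bytes) -> bytes:
    """Prefix payload with a 4-byte ASCII length that counts itself.

    Assumption: the length is written as zero-padded decimal digits.
    """
    total = len(payload) + 4  # the prefix is included in the length
    if total > 9999:
        raise ValueError("length does not fit in four ASCII digits")
    return f"{total:04d}".encode("ascii") + payload


print(univbf_record(b"ABC"))  # b'0007ABC'
```

The three data bytes plus the four prefix bytes give the recorded length of seven.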

CSV

When you specify /PROCESS=CSV on output, RowGen produces a
Microsoft Comma-Separated Values (CSV) file based on the /FIELD naming and
mapping that you provide on output. It will also produce a header record based on those
names and positions. Therefore, to produce an authentic CSV file on output, you must
be sure to specify comma-separated fields on output (see SEPARATOR on page 86). All
CSV fields are enclosed within double-quotes.
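
The general shape of such a file (a header record built from the field names, followed by fully quoted data records) can be previewed with Python's csv module. This is a sketch for orientation only; whether RowGen quotes the header record in the same way is an assumption, and the sample values are invented.

```python
import csv
import io


def to_csv(field_names, records):
    """Return CSV text with a header record and every field double-quoted."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(field_names)  # header record taken from the field names
    writer.writerows(records)
    return buffer.getvalue()


print(to_csv(["president", "party"], [["Lincoln, Abraham", "REP"]]))
# "president","party"
# "Lincoln, Abraham","REP"
```

Quoting every field keeps a comma inside a value, such as the one in the name above, from being read as a field separator.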

To facilitate the creation of CSV file metadata, you may wish to use the utility csv2ddf,
which is provided with the RowGen package in the $ROWGEN_HOME/bin directory
(see the csv2ddf sub-chapter on page 195). In Windows, csv2ddf is located in
\Rowgen21\bin. The utility examines file headers and generates a RowGen data
definition file based on the input field names found in the file header. Its syntax is:
csv2ddf datafile [data definition filename]

LDIF

You can use /PROCESS=LDIF to produce data in LDIF (LDAP Data Interchange
Format), that is, the format of data exported from an LDAP database.
When using /PROCESS=LDIF on output, the /FIELD names must be the same as the
attribute names in the LDIF format. If a field value is null, the attribute will not appear in
the output.

When using /PROCESS=LDIF, data columns must be defined as delimited fields,
for example:

/INFILE=ldif_data
/FIELD=(cn,SET=cn_vals.set,POSITION=1,SEPARATOR='|')
/FIELD=(address1,SET=add1_vals.set,POSITION=2,SEPARATOR='|')
/FIELD=(zipcode,SET=zip_vals.set,POSITION=3,SEPARATOR='|')
/REPORT
/OUTFILE=test1.ldif
/PROCESS=LDIF
/FIELD=(cn,POSITION=1,SEPARATOR='|')
/FIELD=(address1,POSITION=2,SEPARATOR='|')
/FIELD=(zipcode,POSITION=3,SEPARATOR='|')


ODBC

Use /PROCESS=ODBC in a job script to populate (on output) table columns in databases
supported by ODBC (Open Database Connectivity). For CoSort version 9.5.1, the
following database types have been tested for compliance with /PROCESS=ODBC:

• Oracle
• SQL Server
• DB2
• MySQL.

WARNING!
On some systems, the ODBC specification has been updated
such that a certain value (SQLLEN) has changed from a 32-bit
integer to a 64-bit integer when the software is compiled in 64-
bit mode. Some older drivers will expect this value to be 32-bit,
and newer drivers will expect it to be 64-bit. To address this, a
separate file format must be used,
/PROCESS=ODBC_LEGACY, which supports the older standard.
Current PostgreSQL and MySQL ODBC drivers use the 64-bit
value (/PROCESS=ODBC). Oracle 11i ODBC drivers use the
32-bit value (/PROCESS=ODBC_LEGACY). This is not an issue
on Windows or MacOSX (iODBC always defines it as 64-bit).

XML

When /PROCESS=XML on output, data records are generated within XML tags, and an
XML file is produced. To define XML data elements, you can specify XML attributes
and tag names within each /FIELD statement using an XDEF attribute. Alternatively,
you can define a single XDEF for one field, and the same nodes you specify will be
applied to all remaining fields, where the tag names will assume the /FIELD names by
default.

NOTE: If you do not define any XDEF attributes when using /PROCESS=XML, the
XML tags will default to a generic /File/Record naming convention in the
target XML file.


When defining the XDEF attribute, the syntax for /FIELD statements used for
/PROCESS=XML is:

/FIELD=(fieldname,POSITION=n,SEPARATOR='x',XDEF="/node1/node2[/node3]/tag_name")

Optionally, you can use the @ sign to specify that data is an attribute of the XML tag
preceding it, for example:

/FIELD=(fieldname,POSITION=n,SEPARATOR='x',XDEF="/node1/node2/@attribute_name")

Note that you can specify any level of nodes, depending on how nested the source XML
tag is (for input purposes), or how nested you would like the XML target tag to be.

NOTE: Generally, the first two nodes listed in an XDEF attribute will be the same for
all fields you are defining because the actual data elements do not typically
reside within the two uppermost node levels in the XML hierarchical structure.

The /FIELD statements that you define will determine the size and position of the field
elements within the records that are output by RowGen.

The following is an example of a script that will produce an XML output file. Note that
the fields defined on input will be mapped automatically to the output section because
no fields are specified beneath the /OUTFILE statement (see Data Flow Structure in
RowGen Scripts on page 55):

/INFILE=xml_data.in
/FIELD=(president,POSITION=1,SEPARATOR="^",XDEF="/chiefs/chief/president")
/FIELD=(party,POSITION=2,SEPARATOR="^",XDEF="/chiefs/chief/party")
/FIELD=(state,POSITION=3,SEPARATOR="^",XDEF="/chiefs/chief/state")
/SORT
/KEY=president
/OUTFILE=data.xml
/PROCESS=XML

In this case, because the same node structure is desired for all output fields, and because
the tag names that are desired in the target are identical to the RowGen /FIELD names,
the same XML output file could be generated by specifying only one XDEF as an
indicator of how the other fields will be written:

/INFILE=xml_data.in
/FIELD=(president,POSITION=1,SEPARATOR="^")
/FIELD=(party,POSITION=2,SEPARATOR="^")
/FIELD=(state,POSITION=3,SEPARATOR="^",XDEF="/chiefs/chief/state")
/SORT
/KEY=president
/OUTFILE=data.xml
/PROCESS=XML
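
To visualize how XDEF paths translate into nested tags and attributes, the following Python sketch builds an XML document from a list of XDEF strings and field values. It is an approximation for illustration, not RowGen's implementation: it assumes the first node is the document root and the second node wraps each record, and the function name build_xml is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_xml(xdefs, records):
    """Build XML text from XDEF paths (one per field) and field values."""
    root_name, record_name = xdefs[0].strip("/").split("/")[:2]
    root = ET.Element(root_name)
    for values in records:
        record = ET.SubElement(root, record_name)
        for xdef, value in zip(xdefs, values):
            node = record
            for part in xdef.strip("/").split("/")[2:]:
                if part.startswith("@"):
                    node.set(part[1:], value)  # attribute of the enclosing tag
                    break
                child = node.find(part)
                node = child if child is not None else ET.SubElement(node, part)
            else:
                node.text = value              # plain element content
    return ET.tostring(root, encoding="unicode")


xdefs = ["/chiefs/chief/president", "/chiefs/chief/party", "/chiefs/chief/state"]
print(build_xml(xdefs, [["Lincoln", "REP", "IL"]]))
# <chiefs><chief><president>Lincoln</president><party>REP</party><state>IL</state></chief></chiefs>
```

A path that ends in an @ name, such as /chiefs/chief/@id, would instead set an attribute on the chief tag.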

Web Logs

ELF (W3C Extended Log Format)

RowGen supports the generation of internet web transaction files in W3C Extended Log
Format (ELF). These files have a header containing lines of comments, followed by a
line naming the data fields. When you specify /PROCESS=ELF on output, RowGen
produces a header based on the field names and positions as they are specified on output.

If you wish to create a RowGen script that is designed to process ELF files when the
actual data become available, you can use the utility elf2ddf, which is provided with the
RowGen package in the $ROWGEN_HOME/bin directory. In Windows, elf2ddf is
located in \Rowgen21\bin (see the elf2ddf sub-chapter on page 199). This utility reads
ELF file headers and generates RowGen data definition files accordingly, giving you a
base for generating ELF data files. Its syntax is:
elf2ddf datafile [data-definition-filename]

5.5 Statistics

To produce runtime statistics, use the following statement:

/STATISTICS[=[path]filename]

RowGen will output to a specified filename, or, by default, to standard out.


Runtime statistics can contain the following information, where some items may not be
applicable to the specifications in your job script:

• Input File Names, and for each input file:
  • record length
  • number of records rejected
  • number of records accepted.

• Process Record (/INREC) information, including:
  • record length
  • key number, direction, position, size, and format
  • /NODUPLICATES
  • /STABLE

• Output File Names, and for each output file:
  • /HEADREC byte lengths
  • /FOOTREC byte lengths
  • record length
  • number of records rejected
  • number of records accepted.

• Job Efficiency (Tuner) information, including:
  • internal buffer (I/O block) size
  • number of records in memory
  • work area directories.

• Job Total information, which may include:
  • number of records read
  • number of records sorted
  • number of records output
  • time job finished and began
  • real, user and system times.

5.6 Auditing

RowGen can produce a self-appending log file, in XML format, that contains
comprehensive job information for the purposes of auditing. Auditing is enabled when
the AUDIT entry is present in the rowgenrc file on Unix or the Windows Registry. An
audit record is produced for every RowGen execution, and these records append to the
XML [path]filename specified by the AUDIT entry.

Example 20 on page 70 demonstrates how a RowGen audit record is generated based on
the job script contents and user environment at the time of execution. It also shows what
the audit file looks like when opened in both a text editor and an XML browser utility.

Example 20 Creating an /AUDIT log

Consider that the following entry is set in the Windows Registry (or set in the
rowgenrc file on Unix):

AUDIT=C:\RowGen\Tests\mytests\audit\rgaudit.xml


Consider that the following RowGen job script, audit.rcl, is executed:

/INFILE=audit.in
/SPECIFICATION=fields.ddf
/REPORT
/OUTFILE=$CSOUTPUT
/SPECIFICATION=fields.ddf

where fields.ddf contains the first two /FIELD statements and a nested
/SPECIFICATION entry that refers to the remaining two fields of the generated record:

/FIELD=(name,POSITION=1,SIZE=27,ASCII)
/FIELD=(year,POSITION=28,SIZE=12,ASCII)
/SPECIFICATION=fields2.ddf

and where fields2.ddf contains:

/FIELD=(party,POSITION=40,SIZE=5,ASCII)
/FIELD=(state,POSITION=45,SIZE=2,ASCII)

When the above job script is executed, an entry in the audit file, rgaudit.xml, is
created or added. When rgaudit.xml is opened in a text editor, the .ddf file names and
their components are tabbed within the <Script> element to allow for easier
readability of nested script specifications, as follows:

...
<Script>
/AUDIT=C:\RowGen\Tests\mytests\audit\rgaudit.xml
/INFILE=audit.in
/SPECIFICATION=fields.ddf
    /FIELD=(name,POSITION=1,SIZE=27,ASCII)
    /FIELD=(year,POSITION=28,SIZE=12,ASCII)
    /SPECIFICATION=fields2.ddf
        /FIELD=(party,POSITION=40,SIZE=5,ASCII)
        /FIELD=(state,POSITION=45,SIZE=2,ASCII)
/REPORT
/OUTFILE=C:\RowGen\Tests\mytests\audit\chiefs.out
/SPECIFICATION=fields.ddf
    /FIELD=(name,POSITION=1,SIZE=27,ASCII)
    /FIELD=(year,POSITION=28,SIZE=12,ASCII)
    /SPECIFICATION=fields2.ddf
        /FIELD=(party,POSITION=40,SIZE=5,ASCII)
        /FIELD=(state,POSITION=45,SIZE=2,ASCII)
</Script>
...


However, when opened in an XML-supported browser, the <Script> entry displays as
a series of space-delimited entries and there are no tabs. An XML browser will therefore
display rgaudit.xml as follows (note that the <Environment> entry near the bottom
has been truncated for readability purposes):

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
- <AuditTrail>
- <AuditRecord>
<Product>RowGen</Product>
<Version>2.1.1</Version>
<VersionTag>D90060623-0000</VersionTag>
<Serial>12345.6789</Serial>
<OperatingSystem>Windows XP</OperatingSystem>
<User>rowgen_user</User>
<ProcessId>4784</ProcessId>
<Program>C:\RowGen\Tests\mytests\audit\rowgen</Program>
<Command>/sp=audit.rcl</Command>
<StartTime>2008-02-01 21.47.15</StartTime>
<EndTime>2008-02-01 21.48.06</EndTime>
<RunTime>00:01:51</RunTime>
<ReturnCode>0</ReturnCode>
<ErrorMessage>normal return</ErrorMessage>
<RecordsProcessed>42</RecordsProcessed>
<Script>/AUDIT=C:\RowGen\Tests\mytests\audit\rgaudit.xml
/STATISTICS=C:\RowGen\Tests\mytests\audit\csstat.log /INFILE=audit.in
/SPECIFICATION=fields.ddf /FIELD=(name,POSITION=1,SIZE=27,ASCII)
/FIELD=(year,POSITION=28,SIZE=12,ASCII)
/SPECIFICATION=fields2.ddf
/FIELD=(party,POSITION=40,SIZE=5,ASCII)
/FIELD=(state,POSITION=45,SIZE=2,ASCII)
/SORT /OUTFILE=C:\RowGen\Tests\mytests\audit\chiefs.out
/SPECIFICATION=fields.ddf
/FIELD=(name,POSITION=1,SIZE=27,ASCII)
/FIELD=(year,POSITION=28,SIZE=12,ASCII)
/SPECIFICATION=fields2.ddf
/FIELD=(party,POSITION=40,SIZE=5,ASCII)
/FIELD=(state,POSITION=45,SIZE=2,ASCII)</Script>
<Environment>ALLUSERSPROFILE=C:\Documents and Settings\All Users
APPDATA=C:\Documents and Settings\rowgen_user\Application Data
CLASSPATH=.;C:\Program Files\Java\j2re1.4.2_03\lib\ext\QTJava.zip
CLIENTNAME=Console CommonProgramFiles=C:\Program Files\Common Files
...
</Environment>
</AuditRecord>
</AuditTrail>

NOTE: This audit file contains only one record. Subsequent jobs that are written to
rgaudit.xml are appended to the bottom, and always begin with a new
<AuditRecord> tag.
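
Because each job adds its own <AuditRecord> element, the audit file lends itself to simple scripted review. The following Python sketch, which is not part of RowGen, lists the user, start time, and return code of every record; the function name summarize is hypothetical, and the sample text reuses element names and values from the listing above.

```python
import xml.etree.ElementTree as ET


def summarize(audit_xml: str):
    """Return (user, start time, return code) for every audit record."""
    root = ET.fromstring(audit_xml)
    return [
        (
            rec.findtext("User"),
            rec.findtext("StartTime"),
            int(rec.findtext("ReturnCode")),
        )
        for rec in root.iter("AuditRecord")
    ]


sample = """<AuditTrail>
  <AuditRecord>
    <User>rowgen_user</User>
    <StartTime>2008-02-01 21.47.15</StartTime>
    <ReturnCode>0</ReturnCode>
  </AuditRecord>
</AuditTrail>"""

for user, started, code in summarize(sample):
    print(user, started, "ok" if code == 0 else f"failed ({code})")
```

To review an audit file on disk, the same loop can be applied to the root element returned by ET.parse(path).getroot().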


6 FIELDS

The data files/streams generated by RowGen consist of records, where records consist
of one or more fields that you define. Fields are the building blocks of RowGen files,
and the various syntax options for defining fields allow you to customize the contents to
best suit your requirements.
The location of a field within a RowGen record is either:
• fixed, where the starting byte for each field is always in the same column
• floating or delimited, where the first field is at position 1, and subsequent fields
are separated by a delimiter character(s).

In both cases, numbering begins with one (1). RowGen allows you to produce files that
contain both fixed and floating fields within the same record.
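
The difference between the two addressing schemes, and the one-based numbering, can be seen in a brief Python sketch. The helper names and the sample record are hypothetical; the fixed layout reuses the positions and sizes from the key1.ddf example in Job Specification Files.

```python
def fixed_field(record: str, position: int, size: int) -> str:
    """Return a fixed-position field; POSITION is a 1-based byte column."""
    start = position - 1
    return record[start:start + size]


def delimited_field(record: str, position: int, separator: str) -> str:
    """Return the POSITION-th delimited field, counting from 1."""
    return record.split(separator)[position - 1]


# Fixed layout: pres at 1 (size 22), votes at 24 (size 3), party at 28 (size 3).
record = "Lincoln, Abraham".ljust(23) + "180" + " " + "REP"
print(fixed_field(record, 1, 22).rstrip())  # Lincoln, Abraham
print(fixed_field(record, 24, 3))           # 180
print(fixed_field(record, 28, 3))           # REP

# Delimited layout: the same values separated by a pipe character.
print(delimited_field("Lincoln, Abraham|180|REP", 2, "|"))  # 180
```

For fixed fields POSITION is a column, whereas for delimited fields it is the ordinal number of the field within the record.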

NOTE: As described in Data Flow Structure in RowGen Scripts on page 55, one or
more /FIELD statements are required in the input section of a RowGen script in
order to generate records, but not required in the output section(s) unless you
intend to reformat or transform the generated data.

6.1 Syntax

The following is the syntax for defining a fixed-position field. When defining input
fields to be generated, the POSITION and SIZE attributes are required, in addition to
field name:
/FIELD=(fieldname[,ROWID[=values]][,SET=[access ][NOT_SORTED ]
[path]filename][,SET={literal_set}],POSITION=n,SIZE=n[.n]
[,FRAME='char'][,data type])

The following is the syntax for defining delimited fields. When defining input fields to
be generated, the POSITION and SEPARATOR are required, in addition to the field name:

/FIELD=(fieldname[,ROWID[=values]][,SET=[access ][NOT_SORTED ]
[path]filename][,SET={literal_set}],POSITION=n,SEPARATOR='char'[,SIZE=n[.n]]
[,FRAME='char'][,data type])

Note that field attributes are separated by commas. The field name must appear first, but
other attributes may appear in any order. The n shown for some of the parameter values
represents a whole number.


Note that alignment, mill, and fill options are available when defining /FIELD statements
in the output section of RowGen scripts (see Alignment on page 90, MILL on page 91,
and FILL on page 92).

6.2 Field Name

The field name you provide identifies the field. A symbolic or meaningful name is
recommended. Once you have identified a field name in the input section of your script,
it is recognized in the /OUTFILE section(s) of the same script, or in the /INREC section
(see /INREC on page 126). In some cases, you can create a newly named field on output
or in /INREC, but only when you are deriving a new field (for example, for summary
purposes or for cross-calculation).

In the output file only, it is possible to have a field definition that is only an input field
name. Each defined field will be mapped sequentially (one after the other). For example,
if you had defined/generated the fields lastname and firstname in the input file, you
could have the following field definitions in an output file:

/FIELD=(lastname)
/FIELD=(firstname)

Unless new field attributes are specified, the output fields will retain the attributes
(such as size, position, and data type) specified in the input section (or the /INREC
section, which takes precedence, if applicable).

6.3 ROWID

The ROWID field attribute enables you to assign a unique, incrementing or decrementing
row number (or ID tag) to a field within each generated record. ROWID is valid in the
/INFILE section of a job script. See SEQUENCER on page 166 for details on creating
an incremental value field in the /OUTFILE section of a job script.

Syntax

ROWID[="[INIT][,STEP][,MAX or MIN]"]

where the following options are supported:

INIT The initial value for the counter. Can be either a positive or negative
integer of up to 19 digits. (RowGen supports RowID values that
reach, but do not exceed, 22 digits.) The INIT value defaults to 1.

STEP The increment value of the ROWID counter. This can be set to a
positive step to indicate a forward count, or a negative step that
requires a minus sign (-) to indicate a backward count. Defaults to
counting forward by one.


MAX Applicable when you specify a positive STEP value (the default).
The highest value that the ROWID counter will reach. If this value is
reached, the ROWID counter recycles to the INIT value. The MAX
value can be either a positive or negative integer. By default, there is
no MAX value.

MIN Applicable when you specify a negative STEP value. The lowest
value that the ROWID counter will reach when counting backwards. If
this value is reached, the ROWID counter recycles to the INIT value.
The MIN value can be either a positive or negative integer. By default,
there is no MIN value.

For example, the following are valid ROWID attributes:

ROWID                   # The default. INIT=1. STEP=1. No MAX. No MIN.
ROWID="1,10"            # INIT=1. STEP=10. No MAX.
ROWID="1,10,100"        # INIT=1. STEP=10. MAX=100.
ROWID="1,10000"         # INIT=1. STEP=10000. No MAX.
ROWID="-50,-10,-100"    # INIT=-50. STEP=-10. MIN=-100.
ROWID="1, ,10"          # INIT=1. No STEP (default=1). MAX=10.

NOTE: If you want to omit the STEP option and rely on its default (STEP=1), you
must include an empty space between the commas to denote the absence of the
STEP, as shown in the last example above.

If the field with a ROWID attribute also includes a SIZE, the ROWID value
will be right-aligned in accordance with the field size.

Example 21 Using ROWID

This example, rowid.rcl, demonstrates how the ROWID field attribute behaves when
INIT, STEP, and MAX values are provided. It requires the SET file parts_list1.set,
which contains:

Brackets
Screws
Nails
Tacks


/INFILE=rowid.in # placeholder only; no real input is recognized
/PROCESS=RANDOM
/FIELD=(pkey,ROWID="1,10,100",POSITION=1,SEPARATOR='|') # ROWID with options
/FIELD=(part,SET=parts_list1.set,POSITION=2,SEPARATOR='|') # selected values
/FIELD=(profit,POSITION=3,SEPARATOR='|',SIZE=5,NUMERIC) # generated values
/INCOLLECT=500 # generate 500 records
/REPORT # no sorting
/OUTFILE=rowid.out
/FIELD=(pkey,POSITION=1,SEPARATOR='|')
/FIELD=(part,POSITION=2,SEPARATOR='|')
/FIELD=(profit,POSITION=3,SEPARATOR='|',SIZE=5,NUMERIC)

This produces rowid.out:

1|Nails|-3.88
11|Brackets|41.47
21|Screws|65.66
31|Brackets|-2.32
41|Brackets|-8.14
51|Nails|-1.13
61|Tacks|53.58
71|Nails|-6.35
81|Nails|-6.24
91|Brackets|-4.02
1|Screws|-.99
11|Tacks|85.45
21|Brackets|82.73
31|Screws|75.14
41|Tacks|71.60
51|Screws|-4.52
and so on....

Note that:

• the ROWID field named "pkey" begins at 1 (INIT=1) for the first record
• the "pkey" field increments by positive 10 (STEP=10) for subsequent records
• the counter never exceeds the MAX value of 100; because of the STEP size, the
last value issued is 91, after which the ROWID counter recycles to 1.


6.4 SET Files and Literal SETs

See SET FILES on page 108 for full details on using user-defined SETs to provide
realistic values from which RowGen can draw.

6.5 Distributions

You can establish parameters that control the distribution of generated values (numeric
or character-based) so that they best approximate the occurrence rate (or spread) of
certain values. The following distribution types use either routines or set files.

6.5.1 Distributions Using Routines

The following types of distribution use routines:

• Linear
• Normal (bell curve) for mean and standard deviation
• Normal (bell curve) for a range
• Weighted distribution of items

The routines used for these distributions are contained in the library file libcsnum,
provided in /lib/modules in the home directory.

Linear

The syntax for invoking a linear distribution is:

(field1=linear(min_value,max_value,precision),TYPE=ALPHA_DIGIT,POSITION=1,SEPARATOR=",")

where linear is the routine used, field1 is the name of the new field, min_value
and max_value are the lowest value and highest value in the distribution, and
precision is the number of decimal places or significant digits of the generated
numbers.

Normal for mean and standard deviation

The syntax for invoking a normal distribution for mean and standard deviation is:

(field1=normal_distribution1(mean,std_dev,precision),TYPE=ALPHA_DIGIT, POSITION=1,SEPARATOR=",")

where normal_distribution1 is the routine used, field1 is the name of the new
field, mean is the average value within all the generated values, std_dev sets the
standard deviation from the mean, and precision is the decimal precision of the
generated numbers.
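
As a rough illustration of what the mean, std_dev, and precision arguments describe, consider the following Python sketch. It uses Python's standard random.gauss function rather than the libcsnum routine, so the function name normal_values, the count and seed parameters, and the specific values produced are assumptions made for illustration only.

```python
import random


def normal_values(mean, std_dev, precision, count, seed=None):
    """Draw count values from a normal distribution, rounded to precision places."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    return [round(rng.gauss(mean, std_dev), precision) for _ in range(count)]


values = normal_values(mean=50, std_dev=5, precision=2, count=10000, seed=1)
average = sum(values) / len(values)
print(f"average of {len(values)} values: {average:.2f}")
```

With a large enough count, the average of the generated values settles close to the requested mean, and roughly two thirds of the values fall within one standard deviation of it.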


Normal for a range

The syntax for invoking a normal distribution for a range is:

(field1=normal_distribution2(min_value, max_value, precision),TYPE=ALPHA_DIGIT,POSITION=1,SEPARATOR=",")

where normal_distribution2 is the routine used, field1 is the name of the new
field, min_value and max_value are the lowest value and highest value in the range,
and precision is the number of decimal places or significant digits of the generated
numbers.

Weighted Distribution of Items

You can control the occurrence rate of certain literal values in relation to the occurrence
rate of others, for example if you want 60 percent males and 40 percent females in your
test data, irrespective of the number of total records generated. The total must equal 100
percent.

The syntax for invoking a weighted distribution of items is:

(field1=weighted_distribution(percentage,"value",percentage,"value"),TYPE=ALPHA_DIGIT,POSITION=1,SEPARATOR=",")

where weighted_distribution is the routine used, field1 is the name of the new
field, percentage is a percentage for the occurrence of the value, value is a literal
value that will be generated in accordance with the above percentage in relation to the
other percentage/value entries.

You can add multiple entries, but the percentages must equal 100.
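
The percentage/value pairing can be pictured with the short Python sketch below. It is a model of the idea only, not the weighted_distribution routine: the function name weighted_values is hypothetical, and because each value is drawn independently the resulting shares are approximate. Whether RowGen enforces exact proportions is not stated here.

```python
import random
from collections import Counter


def weighted_values(entries, count, seed=None):
    """entries is a list of (percentage, value) pairs whose percentages total 100."""
    total = sum(percentage for percentage, _ in entries)
    if total != 100:
        raise ValueError(f"percentages total {total}, expected 100")
    rng = random.Random(seed)
    weights = [percentage for percentage, _ in entries]
    values = [value for _, value in entries]
    return rng.choices(values, weights=weights, k=count)


sample = weighted_values([(60, "M"), (40, "F")], count=10000, seed=7)
print(Counter(sample))
```

The check on the total mirrors the rule above that the percentages must add up to 100.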

6.5.2 Distributions Using Set Files

Weighted Distribution of Numeric Values

Random numeric values are generated from a user-defined space, which can consist of
one or more smaller spaces. Each entry that you define describes one smaller space,
using percentage-based ranges of numeric values with flexible minimum and maximum
ranges for each.

Weighted Distribution requires the existence of a set file with one or more entries with
five options that must be tab separated. The options are percentage, beginning minimum
value, ending minimum value, beginning maximum value, and ending maximum value.

The syntax to invoke the set file with weight options is:

(field1,TYPE=ALPHA_DIGIT,POSITION=1,SEPARATOR=",",SET= WEIGHTS "C:/IRI/sets/setfile")


where field1 is the name of the new field. You must use WEIGHTS followed by a
reference to the set file, including the path.

For example, consider the following values in weights.set:

20 45 25 45 45
20 5 5 5 25
30 5 5 45 45
30 0 25 50 25

If you invoke the above set file with the WEIGHTS option, RowGen will produce a
weighted distribution of values that would look similar to the following if mapped to a
scatter-plot diagram.

As shown in the diagram above, there are four sections, two of which represent the 20
percent range entries, and two that represent the 30 percent range entries. Note that a
scatter plot similar to the above can be generated with a Preview feature in the IRI
Workbench.

6.6 POSITION

This statement describes the starting location of each field in the record. The syntax is:
POSITION=n

For a fixed-position field, n is the starting column for the first byte of the field. In a
floating (delimited) field, n is the field number when counting from left to right. If
defining delimited fields on output, you must define each field sequentially even if
the field is null.


Example 22 Generating Fixed-Position Fields

To generate three fixed-position fields, you might use the following script, fixed.rcl:

/INFILE=fixedpos.in
/PROCESS=RANDOM
/INCOLLECT=3
/FIELD=(code1,POSITION=1,SIZE=5,digit)
/FIELD=(code2,POSITION=7,SIZE=5,digit)
/FIELD=(code3,POSITION=13,SIZE=5,digit)
/REPORT
/OUTFILE=fixedpos.out

This will produce fixedpos.out:

33148 30213 25583
73110 17067 22562
67649 14106 75311

Example 23 Generating Variable-Position (Delimited) Fields

To produce delimited rows, you might use the following script, variable.rcl:

/INFILE=variable.in
/PROCESS=RANDOM
/INCOLLECT=3
/FIELD=(code1,POSITION=1,SIZE=5,SEPARATOR='|',WHOLE)
/FIELD=(code2,POSITION=2,SIZE=5,SEPARATOR='|',WHOLE)
/FIELD=(code3,POSITION=3,SIZE=5,SEPARATOR='|',WHOLE)
/REPORT
/OUTFILE=variable.out

This will produce variable.out:

95278|12121|97825
78550|20426|68242
45025|95232|51598

Note that the positions are now 1, 2, and 3, which indicate the field positions with
respect to the separator (|), rather than fixed byte positions. Note also how the fields are
given a size, which is supported for both fixed-length and delimited fields. The
ASCII-numeric data type WHOLE is used, which generates whole numbers anywhere from
size 1 to size 5, unlike the ASCII-character data type digit, which would always generate
size 5 (see ASCII Character Data Types on page 94 and ASCII-Numeric Data Types on
page 95).
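
The two meanings of POSITION can be contrasted with a small Python sketch. It is a hypothetical model, not RowGen code: fixed_record and delimited_record are illustrative names, and the values are taken from the first rows of fixedpos.out and variable.out.

```python
def fixed_record(fields, width):
    """Place (position, value) pairs at 1-based byte columns in a blank record."""
    record = [" "] * width
    for position, value in fields:
        start = position - 1  # POSITION is 1-based
        record[start:start + len(value)] = value
    return "".join(record)


def delimited_record(fields, separator):
    """Join (position, value) pairs in field-number order with the separator."""
    ordered = [value for _, value in sorted(fields)]
    return separator.join(ordered)


# Fixed: POSITION is a starting column (1, 7, 13)
print(fixed_record([(1, "33148"), (7, "30213"), (13, "25583")], width=17))
# Delimited: POSITION is a field number (1, 2, 3)
print(delimited_record([(1, "95278"), (2, "12121"), (3, "97825")], "|"))
```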


6.7 SIZE

This statement sets the width, in bytes, of a given field. If you do not provide a SIZE
attribute on input, RowGen will generate fields that range from size 1 through size 10
by default, regardless of the data type of that field. However, if you want to limit this
variance, or set a fixed length for all generated values, you can use the SIZE attribute for
both fixed-position and delimited fields (see also MIN_SIZE and MAX_SIZE on
page 82). Depending on the data type, this attribute behaves differently:

ASCII-Numeric If the field to be generated is given an ASCII-numeric data type such
as NUMERIC or WHOLE, then the size you provide is the upper limit of
how large the field will be. For example, a NUMERIC field with
SIZE=5 will randomly generate values from -9.99 to 99.99. (See
ASCII-Numeric Data Types on page 95 for details on the supported
ASCII-Numeric data types, and how you can use precision to further
customize the size.)

All Other Data Types For all data types other than ASCII-Numeric, any SIZE attribute you
provide will dictate the actual size of every value generated. For
example, if you want to generate a whole number that is always
5 characters long (such as a zip code), you would use the data type digit
and specify a size of five, because using WHOLE (an ASCII-Numeric
data type) would generate values that can be of size 1, 2, 3, 4, or 5.

NOTE: If you want to select from a range of known values that may not all be the same
size, you can provide a list of acceptable values in a SET file (see SET Files
and Literal SETs on page 77), in which case the SIZE attribute you specify
will simply create a fixed, maximum field size into which the randomly drawn
values will be placed.

The syntax is:


SIZE=width[.precision]

For example, to specify a SKU field as 13 bytes long, the statement could be:

/FIELD=(SKU,POSITION=16,SIZE=13,digit)


The .precision option applies only to fields declared as NUMERIC. By default, fields
specified as NUMERIC on input will be generated with a decimal point and two decimal
places to the right (the .precision option is not required). Numeric fields are also
right-aligned by default when fixed-position fields are generated. However, you can also
control the number of decimal places, where the .precision value you specify
indicates the number of digits to be displayed to the right of the decimal point, and size
then represents the total length of the field, including the decimal point, the decimal
places, and the negative sign (if applicable).

Example 24 Size with NUMERIC

Consider the following script, numsize.rcl:

/INFILE=numsize.in
/PROCESS=RANDOM
/INCOLLECT=3
/FIELD=(value,POSITION=1,SIZE=8.3,NUMERIC)
/REPORT
/OUTFILE=numsize.out

This will produce numsize.out, where the precision is 3 and the field size is up to eight
bytes wide:
8046.126
5858.720
-168.132

NOTE: If you want to generate a NUMERIC field without decimal places, you must
specify a SIZE with a precision of 0, such as SIZE=5.0.

If you declare an output field as NUMERIC, and assign it a size that is shorter
than that which was defined on input, then *** characters will be produced to
indicate an overflow. Similarly, if you create summary NUMERIC fields that are
not sized appropriately to accommodate a product of two numeric values, for
example, the overflow characters will appear (see SUMMARY FUNCTIONS
(AGGREGATION) on page 156).

precision may not exceed 18 bytes.
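
The relationship between width, precision, and overflow can be sketched in Python as follows. This is an illustrative model only: format_numeric is a hypothetical name, and filling the whole field with asterisks on overflow is an assumption, since the text above says only that * characters are produced.

```python
def format_numeric(value, width, precision=2):
    """Right-align value in width bytes with precision decimal places."""
    text = f"{value:.{precision}f}"
    if len(text) > width:
        # Assumption: an overflowing value is replaced by asterisks
        return "*" * width
    return text.rjust(width)


print(format_numeric(8046.126, 8, 3))   # fits exactly in 8 bytes
print(format_numeric(-168.132, 8, 3))   # the minus sign counts toward the width
print(format_numeric(9.46, 8))          # default precision of 2, padded on the left
print(format_numeric(12345.678, 8, 3))  # needs 9 bytes, so it overflows
```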

MIN_SIZE and MAX_SIZE


MIN_SIZE and MAX_SIZE are used to manage the size of generated field values. This is
particularly useful when generating delimited (variable-length) fields when you want the
size of the field values to vary according to your parameters, rather than having the same
fixed length assigned to all generated fields when the SIZE attribute is used (see SIZE
on page 81).

You can specify a minimum and maximum value size for any field with an ASCII
character or numeric data type (specifically, those described in ASCII Character Data
Types on page 94 and ASCII-Numeric Data Types on page 95).

The syntax for ASCII character fields is:

[MIN_SIZE=width1][,MAX_SIZE=width2][,SIZE=width3]

where width1 and width2 are the minimum and maximum character string sizes to be
generated. Optionally, you can also include a SIZE attribute, where width3 is the total
field width to be generated in all instances, irrespective of the string size dictated by
MIN_SIZE and MAX_SIZE. That is, fields will be padded with blank spaces when the
string generated is not as large as the SIZE you specify. Note that you can use any
combination of the above attributes.

WARNING! The SIZE attribute, if used, takes precedence over
MIN_SIZE and MAX_SIZE values, so if the SIZE value is less
than MAX_SIZE, some generated strings may be truncated.

The use of MIN_SIZE and MAX_SIZE with the SET attribute is
not recommended, because SET values that are selected by
RowGen are intended to be known quantities with respect to
their possible field widths.

The syntax for numeric fields is:

[MIN_SIZE=width1[.n]][,MAX_SIZE=width2[.n]][,SIZE=width3[.n]]

where width1 and width2 are the minimum and maximum widths of the numeric
values to be generated (including any minus sign, or decimal point and decimal digits if
applicable, as shown in Example 24 on page 82). Optionally, you can specify a decimal
precision of n (the default precision for NUMERIC fields is 2), but any precision value
you specify for MIN_SIZE must be equal to the precision you specify for MAX_SIZE.
(That is, the width of the numeric value before the decimal point can vary, but the
number of digits after the decimal point cannot.) Optionally, you can also include a
SIZE attribute, where width3 is the total field width to be generated, irrespective of the
numeric value size dictated by MIN_SIZE and MAX_SIZE. However, any precision
given for SIZE must be equal to the precision given for MIN_SIZE and MAX_SIZE.
Note that you can use any combination of the above attributes.

See Example 4 on page 16 for an example of using MIN_SIZE and MAX_SIZE attributes
to generate ASCII character and numeric values.
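
For character fields, the interaction of MIN_SIZE, MAX_SIZE, and SIZE can be modeled with the Python sketch below. The function name sized_string is hypothetical, the character pool simply mirrors the alpha_digit default, and padding on the right is an assumption; the sketch only illustrates the rules described above (random string length, blank padding up to SIZE, truncation when SIZE is smaller).

```python
import random
import string

ALPHA_DIGIT = string.ascii_letters + string.digits


def sized_string(min_size, max_size, size=None, rng=random):
    """Build a random string of min_size..max_size characters, then apply SIZE."""
    length = rng.randint(min_size, max_size)
    text = "".join(rng.choice(ALPHA_DIGIT) for _ in range(length))
    if size is None:
        return text  # variable-length value
    # SIZE takes precedence: truncate longer strings, pad shorter ones
    return text[:size].ljust(size)


rng = random.Random(3)
print(repr(sized_string(2, 6, rng=rng)))          # between 2 and 6 characters
print(repr(sized_string(2, 6, size=8, rng=rng)))  # always 8 wide, blank padded
print(repr(sized_string(4, 9, size=5, rng=rng)))  # always 5 wide, may be truncated
```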

Precision

In the output section of a RowGen script, you can use the PRECISION attribute to
assign a uniform decimal precision to numeric fields. Using PRECISION instead of the
SIZE=width[.precision] convention ensures that each numeric value is displayed
with the decimal precision you specify, without the limitation of having to specify a
SIZE that would result in the same field length for every record (see SIZE on page 81
and Example 24 on page 82). The syntax is:

PRECISION=n

where n is the numeric decimal precision applied to the output field.

ASCII Substrings

RowGen allows you to identify substrings of either an ASCII field that was generated
on input, or a string of characters that you provide. You can use substrings to re-position
and re-cast these values on output (or in /INREC).

The syntax for using substrings on output is as follows:


/FIELD=(sub_string(field_name or "string",value1,value2),POSITION=...

where:
• field_name is a field that was specified on input, and the substring is
derived from its field contents
• "string" is an ASCII string from which the substring will be taken
• value1 is the offset. It can be a positive value (the default) to indicate the
number of characters from the left of the field (or string) where you want your
substring to begin. Or, you can use a negative value to indicate the number of
characters from the right of the field (or string).
• value2 is the substring length. It is the number of bytes/characters you want to
include in the string once your starting point has been determined using value1.

NOTE: value1 and/or value2 can be field names rather than integers if the fields
you specify contain integer values.


Example 25 Using ASCII Substrings

This example incorporates some of the substring options that are available. Consider the
SET file chiefs.set:
McKinley, William
Roosevelt, Theodore
Taft, William H.
Wilson, Woodrow
Harding, Warren G.

The following script, substring.rcl, incorporates several uses of the substring option:

/INFILE=substring.in
/PROCESS=RANDOM
/INCOLLECT=3
/FIELD=(president,SET=chiefs.set,POSITION=1)
/REPORT
/OUTFILE=substring.out
/FIELD=(president,POSITION=1)
/FIELD=(sub_string(president,3,3),POSITION=25,SIZE=10)
/FIELD=(sub_string(president,-3,3),POSITION=35,SIZE=10)
/FIELD=(sub_string("tuvwxyz",-3,3),POSITION=44,SIZE=3)

This will produce substring.out:

Harding, Warren G. rdi G. xyz
Roosevelt, Theodore ose ore xyz
Wilson, Woodrow lso row xyz

Note that:

• The first substring field at position 25 begins at the third character from the left
of the president field and extends for 3 bytes.
• The second substring at position 35 uses a negative offset to show that the
substring was taken by counting in 3 from the right-most character of the
president field.
• The final substring uses the "string" option and also features a negative offset,
giving the last three characters xyz.
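
The offset and length rules used in this example can be reproduced with a few lines of Python. This is a sketch of the documented behavior, not RowGen's implementation; the Python function name sub_string is borrowed from the RowGen syntax for readability, and treating an offset of 0 as an error is an assumption.

```python
def sub_string(text, offset, length):
    """Return length characters of text, starting at a 1-based offset.

    A positive offset counts from the left; a negative offset counts
    from the right.
    """
    if offset == 0:
        raise ValueError("offset is 1-based; 0 is not handled in this sketch")
    if offset > 0:
        start = offset - 1
    else:
        start = max(len(text) + offset, 0)
    return text[start:start + length]


print(sub_string("Harding, Warren G.", 3, 3))    # rdi
print(repr(sub_string("Harding, Warren G.", -3, 3)))  # ' G.'
print(sub_string("tuvwxyz", -3, 3))              # xyz
```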


6.8 SEPARATOR

The separator character is used as the delimiting character that separates floating fields.
The syntax is:
SEPARATOR='option'

where option can consist of any of the following:

• a single ASCII character such as a comma ,
• a multi-character ASCII string such as ,*|
(see Multi-Character Separators on page 86)
• a control character such as a tab (\t)
• a hexadecimal character such as a # sign: (\X23) (see ASCII COLLATING
SEQUENCE on page 234).

For an example of generating records with a single ASCII separator, see Example 23 on
page 80.

Multi-Character Separators

You can also specify an ASCII-based multiple-character separator, as shown in the
following example.

Example 26 Generating Multi-Character Field Separators

The following script, multichar.rcl, specifies that three records will be generated, each
consisting of three alpha_digit fields (alphabetic and digit characters, the default
data type) separated by a multiple-character separator:

/INFILE=multichar.in
/PROCESS=RANDOM
/INCOLLECT=3
/FIELD=(first,POSITION=1,SIZE=3,SEPARATOR=',*|')
/FIELD=(second,POSITION=2,SIZE=3,SEPARATOR=',*|')
/FIELD=(third,POSITION=3,SIZE=3,SEPARATOR=',*|')
/REPORT
/OUTFILE=multichar.out

This will produce multichar.out:

vel,*|vtS,*|bzO
plZ,*|BP7,*|PhQ
1PG,*|Eud,*|5ds


NOTE: You can also incorporate a control character as part of a multi-character
separator, by offsetting it with a backslash. For example, you can specify

SEPARATOR=',*\t'

to describe a separator that consists of a comma, an asterisk, and a tab.

See CONTROL (ESCAPE) CHARACTERS on page 132 for a complete list
of RowGen control characters.

Different Separators Within Records

RowGen allows the use of different separators within the same record definition. The
fields delimited by one separator are totally independent of the fields delimited by
another.

Example 27 Generating Records with Multiple Separators

Consider that you want to produce a file that contains multiple separators, where two
inner sub-fields are delimited by a comma (,) and three total fields are delimited by a
pipe (|).

The following script, multiple_seps.rcl, generates three records of that description:

/INFILE=multiple_seps.in
/PROCESS=RANDOM
/INCOLLECT=3
/FIELD=(left1,POSITION=1,SIZE=3,SEPARATOR='|')
/FIELD=(mid1,POSITION=2,SIZE=2,SEPARATOR='|')
/FIELD=(mid2,POSITION=2,SIZE=2,SEPARATOR=',')
/FIELD=(right2,POSITION=3,SIZE=3,SEPARATOR='|')
/REPORT
/OUTFILE=multiple_seps.out

This will produce multiple_seps.out:

gSq|ZF,7Y|YKl
BeE|jS,Jj|7S0
FY8|zl,v7|2HO
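
One way to see why the two sets of fields are independent is to read a generated record back, counting positions separately for each separator. The Python sketch below does this for rows of multiple_seps.out and multichar.out; field_at is a hypothetical helper, and the sketch models reading such records rather than how RowGen generates them.

```python
def field_at(record, position, separator, size=None):
    """Return the field at a 1-based position relative to one separator."""
    value = record.split(separator)[position - 1]
    return value if size is None else value[:size]


record = "gSq|ZF,7Y|YKl"
print(field_at(record, 1, "|", size=3))  # left1:  gSq
print(field_at(record, 2, "|", size=2))  # mid1:   ZF
print(field_at(record, 2, ",", size=2))  # mid2:   7Y
print(field_at(record, 3, "|", size=3))  # right2: YKl

# A multi-character separator is treated as a single unit
print(field_at("vel,*|vtS,*|bzO", 2, ",*|"))  # vtS
```

Note that position 2 yields a different value for each separator, because the count restarts from the left of the record for each one.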


6.9 FRAME

Framed fields are typically found in a Relational Database Management System
(RDBMS) table or Microsoft Comma-Separated-Values (CSV) data. Framing of data
within quotes or other characters protects the elements from being processed
incorrectly. This is useful, for example, in offsetting and treating a firstname, lastname
combination as one field, rather than as separate fields. You can use RowGen’s FRAME
option to produce field contents that are framed with the character of your choice.

The RowGen frame field attribute syntax is:

FRAME='char'

You can create one or more framed fields in the input section of a script, or you can
remove, change, or add a new frame to one or more fields in the output (or INREC)
section.

NOTE: When framing a field to which you assign a SIZE, such as when defining
fixed-position fields, you must add 2 to the size of the field contents to account
for the enclosing characters. For example, a random digit field with SIZE=5
and FRAME='*' might appear as *726*.

Example 28 Generating Framed Fields on Input


This example illustrates how you can generate a framed field on input that is treated as a
single field for sorting purposes, even if it contains the same character that is used as the
field separator. This example uses the SET file full_names.set:

Smith, Wanda
Jones, Paul
Jones, Abe
Smith, Bill
Smith, Walter
Jones, Mark


The following script, gen_frame.rcl, encloses the names field within quotes (") to
protect its contents from being processed incorrectly:

/INFILE=framed.in
/PROCESS=RANDOM
/INCOLLECT=10
/FIELD=(last_first,SET=full_names.set,POSITION=1,SEPARATOR=',',FRAME='"')
/FIELD=(code,POSITION=2,SIZE=3,digit,SEPARATOR=',')
/SORT
/KEY=last_first
/NODUPLICATES
/OUTFILE=framed_protected.out

This will produce framed_protected.out:

"Jones, Paul",985
"Smith, Bill",429
"Smith, Walter",724
"Smith, Wanda",728

Note that both segments of the name field (last and first) were protected for sorting
purposes. Without the frame, the two fields in this example could not have been
generated as comma-separated, and the name field would not be protected.
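
The protection that the frame provides is easy to see when a generated record is read by another program. The Python sketch below parses a row of framed_protected.out with the standard csv module; parse_framed is a hypothetical helper name, and the sketch shows a downstream consumer, not RowGen itself.

```python
import csv
import io


def parse_framed(line, separator=",", frame='"'):
    """Split one record, treating text inside the frame character as one field."""
    reader = csv.reader(io.StringIO(line), delimiter=separator, quotechar=frame)
    return next(reader)


line = '"Jones, Paul",985'
print(parse_framed(line))  # ['Jones, Paul', '985']
print(line.split(","))     # ['"Jones', ' Paul"', '985']
```

The frame-aware parse returns two fields, while a plain split on the comma wrongly returns three.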

Example 29 Producing CSV Fields on Output

This example shows how you can create CSV files, where all fields are framed by
default. Consider the following script, cvs_out.rcl:

/INFILE=framed.in
/PROCESS=RANDOM
/INCOLLECT=3
/FIELD=(field1,POSITION=1,SEPARATOR=',',alpha)
/FIELD=(field2,POSITION=2,SIZE=3,SEPARATOR=',',digit)
/FIELD=(field3,POSITION=3,SEPARATOR=',',alpha)
/SORT
/KEY=(field2,DESCENDING)
/OUTFILE=csv.out
/PROCESS=csv

This produces csv.out:

field1,field2,field3
"EV","596","OPuEfz"
"pwX","589","mEO"
"UipnSF","047","rtSQz"


Note that "field2" was sorted in descending order (see KEYS on page 135), and enclosed
in quotes on output.

6.10 Alignment

This field attribute aligns a desired field string (not its leading or trailing fill characters)
to either the left or right of the target output or /INREC field (see /INREC on page 126).
Alignment also moves leading or trailing fill characters to the opposite side of the string.
The following alignment options are accepted:

NONE_ALIGN No change (the default, not required).

LEFT_ALIGN The string beginning with the first desired (non-fill) character is
aligned to the left of the target field. The remaining length to the right
of the target field is populated with the specified fill character.

RIGHT_ALIGN Fill characters to the right of the source string are removed. The
remaining source string is moved to the right side of the target field.
The remaining length to the left of the target field is populated with
the specified fill character.

By default, the fill character is considered to be a space, but you can specify a different
FILL character on input or on output (see FILL on page 92).
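
The three alignment options can be modeled in Python as follows. This is a sketch under stated assumptions: the function name align is hypothetical, fill characters are stripped from both ends of the string before it is repositioned, and strings longer than the target width are not handled.

```python
def align(text, width, mode="NONE_ALIGN", fill=" "):
    """Reposition text within width bytes; interior fill characters are kept."""
    if mode == "LEFT_ALIGN":
        return text.strip(fill).ljust(width, fill)
    if mode == "RIGHT_ALIGN":
        return text.strip(fill).rjust(width, fill)
    return text  # NONE_ALIGN: no change


print(repr(align("Harding, Warren G.  ", 20, "RIGHT_ALIGN")))
print(repr(align("   Taft, William H. ", 20, "LEFT_ALIGN")))
print(repr(align("00042", 5, "LEFT_ALIGN", fill="0")))
```

The spaces inside the name strings are untouched, because only leading and trailing fill characters are moved.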

Example 30 Using the Alignment Option

Consider the SET file chiefs.set (see Example 25 on page 85). The following script,
align.rcl, will generate an ASCII field called name (with values drawn from the SET
file) and an age field (a randomly generated two-digit number):

/INFILE=align.in
/PROCESS=RANDOM
/INCOLLECT=4
/FIELD=(name,SET=chiefs.set,POSITION=1,SIZE=20)
/FIELD=(age,POSITION=22,SIZE=2,digit)
/SORT
/KEY=name
/OUTFILE=align.out
/FIELD=(name,POSITION=1,SIZE=20,RIGHT_ALIGN)
/FIELD=(age,POSITION=22,SIZE=2,digit)


This produces align.out:

  Harding, Warren G. 37
 Roosevelt, Theodore 34
    Taft, William H. 70
     Wilson, Woodrow 72

Note that the name field was sorted (see KEYS on page 135), and right-aligned on
output. The spaces (fill characters) existing within the name strings have not been
affected by the alignment.

6.11 MILL

This statement can be used in an output field with a numeric data type. It causes commas
to be inserted at the appropriate place(s) in a string of digits.

MILL is implied when CURRENCY (or MONEY) is used as a data type on output.

Example 31 Using the MILL option

Consider the effect of the MILL option in this example, mill.rcl, where the generated
input field is a five-digit number:

/INFILE=mill.in
/PROCESS=RANDOM
/INCOLLECT=3
/FIELD=(Value,POSITION=1,SIZE=5,digit)
/OUTFILE=mill.out
/FIELD=(Value,POSITION=1,SIZE=8,NUMERIC)
/FIELD=(Value,POSITION=10,SIZE=9,MILL,NUMERIC)
/FIELD=(Value,POSITION=20,SIZE=10,CURRENCY)

This produces mill.out:

3634.00 3,634.00 $3,634.00
33017.00 33,017.00 $33,017.00
67348.00 67,348.00 $67,348.00

Note that the same value was used for all three output fields in each record, but the input
size of 5 had to be expanded to 8 on output to accommodate the NUMERIC data type
(which includes an additional decimal point and two decimal places). A size of 9 was
used for MILL to accommodate that numeric value and an additional comma. Finally,
the CURRENCY field, which uses MILL by default and produces a $ character, had to be
expanded to size 10.

If, on output, you do not expand sizes in this way, an overflow (***) may occur.

The MILL attribute is available only in the OUTFILE section of a RowGen script.
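
The size arithmetic in Example 31 can be checked with Python's own thousands-separator formatting. The sketch below is an illustration only; mill and currency are hypothetical function names, and the $ symbol is assumed as the monetary symbol because it is the one shown in mill.out.

```python
def mill(value, precision=2):
    """Format value with commas between thousands and a fixed precision."""
    return f"{value:,.{precision}f}"


def currency(value, precision=2, symbol="$"):
    """Prefix the MILL-formatted value with a monetary symbol."""
    return symbol + mill(value, precision)


for value in (3634, 33017, 67348):
    plain = f"{value:.2f}"
    print(plain, mill(value), currency(value))

# Widths needed for a five-digit value: 8 (NUMERIC), 9 (MILL), 10 (CURRENCY)
print(len(f"{67348:.2f}"), len(mill(67348)), len(currency(67348)))
```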

6.12 FILL

This statement is used with numeric data types (such as NUMERIC, WHOLE, or
CURRENCY) to pad the left side of field values with a specified character whenever the
value is shorter than the SIZE given. There are two forms of the FILL statement:
• FILL='char'
• FILL=n
where char is the fill character, and n is the decimal weight of a character. The default
fill character for a non-binary field is a space; for a binary field, it is a binary NULL
(see Binary Numeric Data Types on page 96).

Example 32 Using the FILL Option

Consider the following script, fill.rcl:

/INFILE=fill.in
/PROCESS=RANDOM
/INCOLLECT=10
/FIELD=(code1,POSITION=1,SIZE=5.0,NUMERIC)
/REPORT
/OUTFILE=fill.out
/FIELD=(code1,POSITION=1,FILL='0',SIZE=7.0,NUMERIC)

This produces fill.out:

0095634
0008910
0052162
0009264
0004742
0029380
0009665
0092121
0050436
0082668

Note that where the field value did not extend to the full size of the field (7), the left side
was padded with the fill character.

The FILL attribute is available only in the OUTFILE section of a RowGen script.
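
Both forms of the FILL statement can be modeled with a single Python helper. This is a sketch, not RowGen code: fill_left is a hypothetical name, and the decimal weight n is interpreted here as the character's decimal code (for example, 48 for the character 0).

```python
def fill_left(value, size, fill=" "):
    """Pad the left side of value up to size with a fill character.

    fill may be a one-character string (FILL='char') or an integer
    character code (FILL=n).
    """
    char = chr(fill) if isinstance(fill, int) else fill
    return str(value).rjust(size, char)


print(fill_left(95634, 7, "0"))  # 0095634
print(fill_left(8910, 7, 48))    # 0008910 (48 is the decimal code for '0')
print(fill_left(4742, 7, "*"))   # ***4742
```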


6.13 Data Types

RowGen allows you to:

• Generate randomized field data using several available data types, which can be
specified in the input section of a job script (see Input Filenames and Attributes
on page 56).
• Convert from one supported data type to another. This is done by supplying
/FIELD statements within one or more output file sections that reference the
originally generated field name, but with a different option for the data type
attribute (see Output Filenames and Attributes on page 57).

NOTE: When you are selecting values from a character SET file (see Character SET
Files on page 111), the values are generated in the exact form in which they
appear in the SET file, and the alpha_digit data type (the default) is assumed.
Therefore, if you want to change the data type of these values, you must
provide /FIELD statements on output, and declare a different data type for
each field that you need to convert.

alpha_digit, the default data type, is not required to be specified as a field attribute.

The following groups of data types are in RowGen:

• ASCII Character Data Types on page 94
• ASCII-Numeric Data Types on page 95
• Binary Numeric Data Types on page 96
• EBCDIC Data Types on page 98
• Micro Focus Data Types on page 99
• RM COBOL Data Types on page 101
• EBCDIC-Native Micro Focus COBOL Data Types on page 103
• EBCDIC-Native RM COBOL Data Types on page 104
• Date/Timestamp Data Types on page 105
• Zoned Decimal on page 106.


6.13.1 ASCII Character Data Types

ASCII fields are randomized using subsets of the 127 possible ASCII characters
available (see ASCII COLLATING SEQUENCE on page 234). The following RowGen
data types are provided to help you customize your ASCII field data with the most
realistic values possible, which is especially useful when real data become available:

ALPHA Contains only alphabetic characters

ALPHA_DIGIT The default. Contains digits and alphabetic characters.

ASCII Contains all ASCII characters, printable and non-printable, with the
exception of a binary 0. From hexadecimal 01 to 7F.

DIGIT Contains only digits (0 through 9).

IP_ADDRESS Refers to positive whole number sub-fields separated by a dot (.),
such as 155.46.142.205. When sorting an IP_ADDRESS field,
each sub-field is compared numerically starting with the left-most
sub-field. Subsequent sub-fields are compared only if all previous
sub-fields have been determined to be equal.

LOWER Contains only lower-case alphabetic characters.

PRINTABLE Contains all printable ASCII characters, represented by the
hexadecimal representations 20 to 7E in ASCII COLLATING
SEQUENCE on page 234.

SPACE Contains only blank or tab characters.

UPPER Contains only upper-case alphabetic characters.
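
The IP_ADDRESS comparison rule described above (numeric, sub-field by sub-field from the left) differs from a plain character sort. The Python sketch below illustrates the difference; ip_sort_key is a hypothetical helper and the sample addresses are made up for the illustration.

```python
def ip_sort_key(address):
    """Compare sub-fields numerically, left to right."""
    return tuple(int(part) for part in address.split("."))


addresses = ["155.46.142.205", "9.200.1.1", "155.46.9.17"]

print(sorted(addresses, key=ip_sort_key))
# ['9.200.1.1', '155.46.9.17', '155.46.142.205']

print(sorted(addresses))  # character-by-character comparison
# ['155.46.142.205', '155.46.9.17', '9.200.1.1']
```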

NOTE: Certain generated values may differ based on your environment’s locale setting.

When generating random data with input /FIELD statements, ASCII fields are
populated to the full SIZE that you assign, and placed at the POSITION you assign.
However, as with any data type you generate, any output SIZE and POSITION you
specify (if performing field-level transformation) will supersede its input phase
specifications. See Data Flow Structure in RowGen Scripts on page 55.
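
The sub-field comparison described for IP_ADDRESS can be illustrated outside RowGen. The following Python sketch is not RowGen code; the function name ip_sort_key is hypothetical and the addresses are made up. It shows why comparing sub-fields numerically gives a different order than comparing the same strings character by character:

```python
def ip_sort_key(address):
    """Split a dotted address into integer sub-fields so that
    comparison proceeds numerically from the left-most sub-field."""
    return tuple(int(part) for part in address.split("."))


addresses = ["155.46.142.205", "9.200.1.1", "155.46.9.7", "155.5.142.205"]

# Numeric sub-field order (the behavior described for IP_ADDRESS keys).
numeric_order = sorted(addresses, key=ip_sort_key)

# Plain character order, shown only for contrast.
text_order = sorted(addresses)

print(numeric_order)
print(text_order)
```

In the numeric ordering, 9.200.1.1 sorts first because 9 is less than 155; in the character ordering it sorts last because the character 9 collates after 1.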


6.13.2 ASCII-Numeric Data Types

RowGen supports the following ASCII-Numeric data types:

NUMERIC By default, NUMERIC displays right-justified with a precision of two.
If you want to generate a NUMERIC field without decimal places, you
must specify a SIZE with a precision of 0, such as SIZE=5.0.

CURRENCY By default, CURRENCY displays right-justified with a precision of
two. MILL is set to on, and displays the monetary symbol at the
beginning of the field (see MILL on page 91).1

WHOLE Displays right-justified and contains only whole numbers.

When generating random data with input /FIELD statements, ASCII-Numeric fields are
not always populated to the full SIZE that you assign. For example, if you declare a field
of SIZE=3 and a data type of WHOLE, RowGen will randomly generate numbers from 0
through 999. If you want to ensure that your numeric values are always a certain number
of digits, you must use the data type digit.

The following example illustrates how the ASCII-numeric data types display.
Consider a SET file that contains these entries:

150
9.46
1023.45
These entries, if converted to ASCII-Numeric types on output, would appear as follows
if given adequate sizes:
WHOLE    NUMERIC    CURRENCY     CURRENCY w/ FILL
--------------------------------------------------
150      150.00     $150.00      $***150.00
9        9.46       $9.46        $*****9.46
1023     1023.45    $1,023.45    $*1,023.45
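
The four renderings in the table above can be reproduced with ordinary string formatting. This Python sketch is only an approximation for illustration, not RowGen's conversion logic. The function names are hypothetical, the fill width of 10 is taken from the sample column, and dropping the fraction for WHOLE is an assumption that happens to match the three sample values:

```python
def as_whole(value):
    # Assumption: the fractional part is simply dropped.
    return str(int(value))


def as_numeric(value, precision=2):
    return f"{value:.{precision}f}"


def as_currency(value):
    # Leading monetary symbol plus thousands separators.
    return f"${value:,.2f}"


def as_currency_filled(value, width=10):
    # Pad with * between the symbol and the digits, as in the sample column.
    return "$" + f"{value:*>{width - 1},.2f}"


for entry in (150, 9.46, 1023.45):
    print(as_whole(entry), as_numeric(entry),
          as_currency(entry), as_currency_filled(entry))
```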

If you do not size your fields appropriately for the data type, the * overflow symbol may
appear on output, for example:

***

1. CURRENCY cannot be generated in the /INFILE section of a RowGen
script. You can declare NUMERIC on input, for example, and convert to
CURRENCY on output, resizing the field accordingly. See File Format
and Data Type Conversion on page 27 for a conversion example.


6.13.3 Binary Numeric Data Types

RowGen supports the following binary numeric data types1:


SHORT, USHORT SHORT indicates integer, short signed. USHORT indicates integer, short
unsigned. Short integers are stored as 2’s complement numbers (in
binary form), usually in two 8-bit bytes, with ranges
-32,768 ≤ signed short ≤ 32,767 and
0 ≤ unsigned short ≤ 65,535. No width assumptions are made. The
order of the byte pair (whether higher valued bits are in the first or
second byte) uses the natural ordering of the hardware.

INT, UINT INT indicates integer, natural signed. UINT indicates integer, natural
unsigned. These data types represent 2’s complement integers stored
in the width allocated by an int declaration with no short or long
adjective. Two and four bytes are common widths. Byte ordering
uses the natural order of the machine.

LONG, ULONG LONG indicates integer, long signed. ULONG indicates integer, long
unsigned. Four bytes is the typical width for these 2’s complement
data types, with ranges
-2147483648 ≤ signed ≤ 2147483647 and
0 ≤ unsigned ≤ 4294967295.

FLOAT, DOUBLE FLOAT indicates float, single precision. DOUBLE indicates float, double
precision. Single precision and double precision floats are usually
placed in four and eight bytes.

CHAR Character, Natural (EBCDIC)

SCHAR Character, Signed

UCHAR Character, Unsigned (EBCDIC)

NOTE Chars are one machine byte long, with no assumptions made about byte width
for most comparison cases. Assuming a standard 8-bit width, the ranges are -128
≤ signed character ≤ 127 and 0 ≤ unsigned character ≤ 255. Natural chars are
either signed or unsigned, depending on the interpretation of char used by
your machine.
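
The ranges quoted above follow directly from the byte widths. The Python sketch below is an illustration only and is unrelated to RowGen's internals. It uses the standard struct module with fixed (standard-size) widths, and the mapping of RowGen type names to struct codes is an assumption based on the typical widths given in the text (two bytes for shorts, four for longs, one for chars):

```python
import struct

# Assumed mapping of type names to struct codes with standard sizes.
TYPES = {"SHORT": "h", "USHORT": "H", "LONG": "i",
         "ULONG": "I", "SCHAR": "b", "UCHAR": "B"}


def value_range(code):
    """Return (lowest, highest) for a standard-size struct integer code."""
    bits = 8 * struct.calcsize("<" + code)
    if code.islower():  # lower-case struct codes are signed
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


for name, code in TYPES.items():
    low, high = value_range(code)
    # Packing both limits succeeds, so they really fit the width.
    struct.pack("<" + code, low)
    struct.pack("<" + code, high)
    print(name, low, high)

# The same short value in both byte orders: 258 is 0x0102.
little = struct.pack("<h", 258).hex()
big = struct.pack(">h", 258).hex()
print(little, big)
```

The last two lines show what "natural ordering of the hardware" means in practice: the same two-byte value is laid out as 02 01 on a little-endian machine and 01 02 on a big-endian one.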

- Character, Natural: 1 ≤ Length
- Character, Signed: 1 ≤ Length
- Character, Unsigned: 1 ≤ Length

1. Binary numeric data types cannot be generated in the /INFILE section of
a RowGen script. You can declare NUMERIC on input, for example, and
convert to the desired binary data type on output, resizing the field
accordingly.

Natural means that RowGen will use the native C library memcmp() function
to evaluate the keys. Natural comparison is faster and should be used
whenever collation of meta-characters (those with the most significant bit on)
is not a problem. If meta-characters occur, memcmp() does not necessarily
give the same results as the default char type on a machine, on signed or
unsigned chars.

On machines without a signed char declaration, performance is degraded by
forcing an additional interpretation. On most machines, characters are naturally
signed, but memcmp(a, b, 1) will return the correct result when:

0 ≤ a ≤ 126 and 129 + a ≤ b ≤ 255

128 ≤ a ≤ 255 and 0 ≤ b ≤ a - 128

Because the results of char comparisons can differ for meta-characters across
machines and libraries, it is advisable to build test cases for every
combination of meta-characters so that you will know what to expect.
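
As a starting point for such test cases, the Python sketch below enumerates every pair of single-byte values and reports where an unsigned comparison (the interpretation the C standard specifies for memcmp()) and a signed-char comparison disagree. It models the two interpretations only; it does not call any particular C library, and the function names are hypothetical:

```python
def as_signed(byte):
    """Reinterpret an 8-bit value (0..255) as a signed char (-128..127)."""
    return byte - 256 if byte > 127 else byte


def compare_unsigned(a, b):
    # Returns -1, 0, or 1, treating both bytes as unsigned.
    return (a > b) - (a < b)


def compare_signed(a, b):
    # Same comparison after reinterpreting both bytes as signed chars.
    return compare_unsigned(as_signed(a), as_signed(b))


disagreements = [(a, b)
                 for a in range(256)
                 for b in range(256)
                 if compare_unsigned(a, b) != compare_signed(a, b)]

print(len(disagreements))
print(disagreements[:3])
```

Under this model, the two interpretations disagree exactly when one byte is a meta-character (most significant bit on) and the other is not.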


6.13.4 EBCDIC Data Types

The following RowGen data types are provided to help you generate and/or customize
EBCDIC field data with the most realistic values possible, which is especially useful
when real data become available:

EBCDIC This contains all EBCDIC printing characters (see EBCDIC
PRINTING CHARACTERS on page 235).

EBCDIC_alpha_digit Contains digits and alphabetic characters in EBCDIC format.

EBCDIC_alpha Contains only alphabetic characters in EBCDIC format.

EBCDIC_digit Contains only digits in EBCDIC format (0 through 9).

EBCDIC_space Contains only blank or tab characters.

EBCDIC_upper Contains only upper-case alphabetic characters in EBCDIC format.

EBCDIC_lower Contains only lower-case alphabetic characters in EBCDIC format.

NOTE Certain generated values may differ based on your environment’s locale
setting.

When generating random data with input /FIELD statements, EBCDIC fields are
populated to the full SIZE that you assign, and placed at the POSITION you assign.
However, as with any data type you generate, any output SIZE and POSITION you
specify (if performing field-level transformation) will supersede its input specifications.
See Output Filenames and Attributes on page 57.
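
To see what EBCDIC-encoded field data looks like at the byte level, the following Python sketch encodes a short string with the cp037 codec from the Python standard library. This is an illustration only: cp037 is one of several EBCDIC code pages, and this manual section does not state which code page RowGen uses, so treat the choice as an assumption:

```python
text = "Row7"

# Encode to EBCDIC (code page 037, assumed) and back.
ebcdic = text.encode("cp037")
round_trip = ebcdic.decode("cp037")
print(ebcdic.hex(), round_trip)

# Collation differs from ASCII: in EBCDIC, lower-case letters sort
# before upper-case letters, and digits sort last.
samples = ["a", "A", "1"]
ascii_order = sorted(samples)
ebcdic_order = sorted(samples, key=lambda s: s.encode("cp037"))
print(ascii_order, ebcdic_order)
```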


6.13.5 Micro Focus Data Types

RowGen can generate the following Micro Focus data types, where the width of data
fields is generally based on a maximum of 18 characters in a PICTURE clause:

NOTE The following data types cannot be directly generated. However, you can
generate using NUMERIC in input and convert to these data types in output:

RM, ERM, MF, EMF, UMF, URM, Zoned, and Packed

MF_COMP, UMF_COMP
COMP, Signed and COMP, Unsigned. In computational
(COMPUTATIONAL-4, BINARY) field values, negative values are
stored as 2’s complement numbers with the most significant byte first.
The number of bytes of storage depends on the magnitude of the value
(9s in PICTURE) and on the storage mode of the COBOL program
which generates the data, as shown in Table 1:

Table 1: Picture 9s/Storage

   Picture 9s           Storage
Signed   Unsigned    Byte   Word
1-2      1-2         1      2
3-4      3-4         2      2
5-6      5-7         3      4
7-9      8-9         4      4
10-11    10-12       5      8
12-14    13-14       6      8
15-16    15-16       7      8
17-18    17-18       8      8

If the machine is big-endian, consider using a C, FORTRAN, or
Pascal internal type of the same width; it should be faster.
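
The Byte column of Table 1 appears to match a simple rule: the smallest number of bytes whose integer range can hold the largest value that many 9s can express. The Word column appears to round that size up to 2, 4, or 8. The Python sketch below encodes that observation and reproduces the table rows. It is an inference from the table, not a documented Micro Focus or RowGen algorithm, and the function names are hypothetical:

```python
def byte_mode_storage(digits, signed):
    """Smallest byte count whose range holds a value of `digits` nines."""
    largest = 10 ** digits - 1
    size = 1
    while True:
        if signed:
            limit = 2 ** (8 * size - 1) - 1  # 2's complement maximum
        else:
            limit = 2 ** (8 * size) - 1
        if largest <= limit:
            return size
        size += 1


def word_mode_storage(digits, signed):
    """Round the byte-mode size up to 2, 4, or 8 bytes."""
    size = 2
    while size < byte_mode_storage(digits, signed):
        size *= 2
    return size


for digits in (2, 6, 7, 9, 12, 18):
    print(digits,
          byte_mode_storage(digits, True),
          byte_mode_storage(digits, False),
          word_mode_storage(digits, True))
```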

MF_CMP3 MF COMP-3, Signed.

UMF_CMP3 MF COMP-3 Unsigned.

MF_CMP4 MF COMP-4, Signed.

UMF_CMP4 MF COMP-4, Unsigned.


NOTE Micro Focus COMPUTATIONAL-3, packed decimal, is like Ryan-McFarland
(RM/COBOL) COMPUTATIONAL-3 but with 0xC used for the positive values
sign in the half-byte instead of 0xB (see Table 2).

Table 2: Hexadecimal/Signs
Decimal   Hex   Sign
12        c     Positive
13        d     Negative
15        f     Unsigned (positive)

MF_CMP5, UMF_CMP5
MF COMP-5, Signed and MF COMP-5, Unsigned.
COMPUTATIONAL-5 is like COMPUTATIONAL-4 but the byte order
depends on the hardware.

NOTE For RowGen purposes, if you require big-endian data, you should use
COMP-4. The COMP-5 algorithm explicitly does little-endian comparisons.

For better performance, consider using a C Library-supported integer type
if the data width is suitable.

MF_CMPX COMP-X. COMPUTATIONAL-X is like COMP-4, but its width is the
number of 9s in the PICTURE clause and its allowable values are any
unsigned integer which fits the width, i.e., 41 significant decimal
digits at most.

MF_DISP DISP, Signed.

UMF_DISP DISP, Unsigned.

MF_DISPSL DISP, Sign Leading.

MF_DISPSLS DISP, Sign Leading Separate.

MF_DISPST DISP, Sign Trailing.

MF_DISPSTS DISP, Sign Trailing Separate.


NOTE Micro Focus DISPLAY values are stored in the same manner as
Liant (Ryan-McFarland), except when the sign is included. MF does not alter any
bytes when a positive sign is included; for a negative sign, it increments the
signed digit by 64, from 0 through 9 (0x30-0x39) to p through y (0x70-0x79).

6.13.6 RM COBOL Data Types

RowGen can generate the following Liant (Ryan-McFarland) data types:

NOTE The following data types cannot be directly generated. However, you can
generate using NUMERIC in input and convert to these data types in output:

RM, ERM, MF, EMF, UMF, URM, Zoned, and Packed

RM_COMP, URM_COMP
RM_COMP indicates COMP, signed. URM_COMP indicates COMP,
unsigned. Computational variables are stored one digit per byte with a
trailing byte for signed data. Each digit is stored as its binary value
(that is, 0x01 for 1). The sign byte is 0x0D for negative values and
0x0B for positive values.

RM_CMP1 COMP-1. COMPUTATIONAL-1 variables are stored as signed, two-byte
big-endian (most significant byte first) 2’s complement numbers
between -32,768 and 32,767. If your machine is big-endian and C
shorts are two bytes, it will be faster to use one of the C short integer
comparison types.

RM_CMP3, URM_CMP3
RM_CMP3 indicates COMP-3, signed. URM_CMP3 indicates COMP-3,
unsigned. A COMPUTATIONAL-3 item, also referred to as packed
decimal, comprises a string of hex digits and a sign. Decimal digits (0
through 9) are held left to right.
Each decimal digit is represented as a hex digit with two hex digits
per byte. The last hex digit holds the sign, as shown in Table 3:


Table 3: Hexadecimal/Signs
Decimal Hex Sign
11 b Positive
13 d Negative
15 f Unsigned (positive)

The sign is the last hex digit so that an odd number of decimal digits
needs to be retained. If there are an even number of digits, a 0 hex
digit is prepended to the value to make full bytes.

Table 4 shows some sample data representations:

Table 4: Sample Data Representations

Numeric Value   Hex Patterns   Bit Patterns
-123            12 3d          0001 0010 0011 1101
-1234           01 23 4d       0000 0001 0010 0011 0100 1101
+123            12 3b          0001 0010 0011 1011
+1234           01 23 4b       0000 0001 0010 0011 0100 1011
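
The packing rule behind the hex patterns in Table 4 (two hex digits per byte, sign in the last half-byte, a leading 0 when the digit count is even) can be sketched in a few lines of Python. This is an illustration of the layout only, not RowGen's conversion code. The function name pack_comp3 is hypothetical, and the default sign values are the RM ones from Table 3:

```python
def pack_comp3(value, positive_sign=0xB, negative_sign=0xD):
    """Pack an integer as COMP-3 style packed decimal bytes."""
    digits = str(abs(value))
    sign = negative_sign if value < 0 else positive_sign
    # Digits plus the sign half-byte must fill whole bytes, so an even
    # digit count gets a leading 0.
    if len(digits) % 2 == 0:
        digits = "0" + digits
    nibbles = [int(d) for d in digits] + [sign]
    return bytes((nibbles[i] << 4) | nibbles[i + 1]
                 for i in range(0, len(nibbles), 2))


for sample in (-123, -1234, 123, 1234):
    print(sample, pack_comp3(sample).hex())

# The Micro Focus variant described in section 6.13.5 uses 0xC for positive.
print(pack_comp3(123, positive_sign=0xC).hex())
```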

RM_CMP6 COMP-6. COMPUTATIONAL-6 is stored just like COMPUTATIONAL-3,
but as unsigned values without the need for a sign half-byte.

RM_DISP DISP, Signed.

URM_DISP DISP, Unsigned.

RM_DISPSL DISP, Sign Leading.

RM_DISPSLS DISP, Sign Leading Separate.

RM_DISPST DISP, Sign Trailing.

RM_DISPSTS DISP, Sign Trailing Separate.


NOTE USAGE IS DISPLAY values are stored byte-by-byte as the ASCII values for
each digit for up to 18 digits. Each digit is in the range 0x30 through 0x39.

SIGN IS SEPARATE causes a leading or trailing byte to be added to the width,
with a value of 0x2B (ASCII ’+’) or 0x2D (ASCII ’-’).

Included signs cause the leading or trailing byte to be changed. For
positive values, digits 1 through 9 are incremented by 16, to A (0x41) through
I (0x49), and 0 becomes { (0x7B). For negative values, digits 1 through 9
become J (0x4A) through R (0x52), and 0 becomes } (0x7D).
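
The included-sign mapping above can be written as two small lookup tables. The Python sketch below applies it to the trailing byte of a fixed-width value. It illustrates the byte values listed in the note only; the function name and the choice of the trailing position are assumptions made for the example:

```python
# Sign-carrying replacements for the digit characters 0 through 9.
POSITIVE = dict(zip("0123456789", "{ABCDEFGHI"))
NEGATIVE = dict(zip("0123456789", "}JKLMNOPQR"))


def display_trailing_sign(value, width):
    """Render an integer as DISPLAY digits with the sign included
    in the trailing byte."""
    digits = f"{abs(value):0{width}d}"
    table = NEGATIVE if value < 0 else POSITIVE
    return digits[:-1] + table[digits[-1]]


for sample in (123, -123, 120, -120):
    print(sample, display_trailing_sign(sample, 3))
```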

6.13.7 EBCDIC-Native Micro Focus COBOL Data Types

RowGen can generate the following EBCDIC-Native Micro Focus COBOL Data types:

NOTE The following data types cannot be directly generated. However, you can
generate using NUMERIC in input and convert to these data types in output:

RM, ERM, MF, EMF, UMF, URM, Zoned, and Packed

EMF_COMP, EMF_UCOMP
EMF_COMP indicates COMP, Signed. EMF_UCOMP indicates COMP,
unsigned.

EMF_CMP3, EMF_UCMP3
EMF_CMP3 indicates COMP-3, packed decimal. EMF_UCMP3 indicates
COMP-3, unsigned.

EMF_CMP4, EMF_UCMP4
EMF_CMP4 indicates COMP-4, signed. EMF_UCMP4 indicates COMP-4,
unsigned.

EMF_CMP5, EMF_UCMP5
EMF_CMP5 indicates COMP-5, signed. EMF_UCMP5 indicates COMP-5,
unsigned.

EMF_COMPX COMP-X.

EMF_DISP DISP, Signed.

EMF_UDISP DISP, Unsigned.

EMF_DISPSL DISP, Sign Leading.


EMF_DISPSLS DISP, Sign Leading Separate.

EMF_DISPST DISP, Sign Trailing.

EMF_DISPSTS DISP, Sign Trailing Separate.

See Micro Focus Data Types on page 99 for a description of each type.

6.13.8 EBCDIC-Native RM COBOL Data Types

RowGen can generate the following EBCDIC-Native RM COBOL Data Types:

NOTE The following data types cannot be directly generated. However, you can
generate using NUMERIC in input and convert to these data types in output:

RM, ERM, MF, EMF, UMF, URM, Zoned, and Packed

ERM_COMP, ERM_UCOMP
ERM_COMP indicates COMP, signed. ERM_UCOMP indicates
COMP, unsigned.

ERM_CMP1 COMP-1.

ERM_CMP3, ERM_UCMP3
ERM_CMP3 indicates COMP-3, signed. ERM_UCMP3 indicates
COMP-3, unsigned.

ERM_CMP6 COMP-6.

ERM_DISP DISP, Signed.

ERM_UDISP DISP, Unsigned.

ERM_DISPSL DISP, Sign Leading.

ERM_DISPSLS DISP, Sign Leading Separate.

ERM_DISPST DISP, Sign Trailing.

ERM_DISPSTS DISP, Sign Trailing Separate.

See RM COBOL Data Types on page 101 for a description of each type.


6.13.9 Date/Timestamp Data Types

RowGen supports several common datestamp and timestamp formats. Values in these
formats can be generated natively on input, or you can select from a SET file with values
formatted in any of these formats (see Date SET Files with Ranges on page 115 and
Timestamp SET Files with Ranges on page 117). RowGen can sort the values
appropriately if used as keys (see KEYS on page 135). Also, note that you can convert
from one data type to another. For example, on input you can generate, or draw from a
SET of, AMERICAN_TIMESTAMP-conforming values, and declare them as another
timestamp format on output.

AMERICAN_DATE month/day/year
where month is a name, for example:
Jul/31/2004 (using as many characters for the name as the size of the
field allows). RowGen will recognize the month as an integer when used in a SET
file, for example 12/31/2004.

AMERICAN_TIME hour[:minute][:second] xM
for example:
11:23:01 PM

AMERICAN_TIMESTAMP
month/day/year hour[:minute][:second] xM
for example:
12/31/2004 11:23:01 PM

EUROPEAN_DATE day.month.year
where month is a name or integer, for example:
31.12.2004 or 31.Dec.2004

EUROPEAN_TIME hour[.minute][.second]
for example:
23.23.01

EUROPEAN_TIMESTAMP
day.month.year hour[.minute][.second]
for example:
31.12.2004 23.23.01

JAPANESE_DATE year-month-day
where month is a name or integer, for example:
2004-12-31 or 2004-Dec-31

JAPANESE_TIME hour[:minute][:second]
for example:
23:23:01


JAPANESE_TIMESTAMP
year-month-day hour[:minute][:second]
for example:
2004-12-31 23:23:01

ISO_DATE year-month-day
where month is a name or integer, for example:
2004-12-31 or 2004-Dec-31

ISO_TIME hour[.minute][.second]
for example:
23.51.01

ISO_TIMESTAMP year-month-day hour[.minute][.second]
for example:
2004-12-31 23.23.01
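
When preparing SET file values or checking converted output, it can help to see the timestamp layouts side by side. The Python sketch below maps four of the formats above to strftime patterns and converts between them. It is an illustration only, not how RowGen parses dates. The pattern strings are assumptions derived from the examples in this section (integer months, all time parts present), and the AM/PM text depends on the process locale:

```python
from datetime import datetime

# Assumed strftime equivalents of the layouts shown in this section.
FORMATS = {
    "AMERICAN_TIMESTAMP": "%m/%d/%Y %I:%M:%S %p",
    "EUROPEAN_TIMESTAMP": "%d.%m.%Y %H.%M.%S",
    "JAPANESE_TIMESTAMP": "%Y-%m-%d %H:%M:%S",
    "ISO_TIMESTAMP": "%Y-%m-%d %H.%M.%S",
}


def convert(value, source, target):
    """Parse `value` in the source layout and re-render it in the target."""
    parsed = datetime.strptime(value, FORMATS[source])
    return parsed.strftime(FORMATS[target])


american = "12/31/2004 11:23:01 PM"
for target in ("EUROPEAN_TIMESTAMP", "JAPANESE_TIMESTAMP", "ISO_TIMESTAMP"):
    print(target, convert(american, "AMERICAN_TIMESTAMP", target))
```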

6.13.10 Zoned Decimal

Zoned decimal is not a supported data type within RowGen. However, this section
describes how RowGen enables you to produce output fields in zoned decimal format
by using the supplied zoned.set file (found in the /examples/RCL_chapter directory of
your RowGen installation directory). See Example 33 below.

NOTE The following data types cannot be directly generated. However, you can
generate using NUMERIC in input and convert to these data types in output:

RM, ERM, MF, EMF, UMF, URM, Zoned, and Packed

Zoned Decimals are alphanumeric digits. If the decimal quantity is positive, the
string ends with a digit from 0-9. If the decimal quantity is negative, the last
character is written as a lower-case character. The following table shows the format
for negative quantities:

Last digit is   String ends with
0               p
1               q
...             ...
9               y


Example 33 Producing Zoned Decimal Fields

To produce random zoned decimal fields on output using RowGen, you will need the
SET file zoned.set, which gives all of the possible values for the final character of a
zoned decimal field:

0
1
2
3
4
5
6
7
8
9
p
q
r
s
t
u
v
w
x
y

The following script, zoned.rcl, generates 10 random zoned decimal formatted values
with a size of 4:

/INFILE=zoned.in
/PROCESS=RANDOM
/INCOLLECT=10
/FIELD=(value_first,POSITION=1,SIZE=3,digit)
/FIELD=(number_last,SET=zoned.set,POSITION=4,SIZE=1)
/OUTFILE=zoned.out

This produces zoned.out:

328v
390x
397v
4176
5934
653x
7417
775q
801t
8234
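
To check generated values like those in zoned.out, the fields can be decoded back into signed integers using the mapping from the table in this section (p through y standing for a negative final digit 0 through 9). The Python sketch below is a verification aid, not part of RowGen, and the function name decode_zoned is hypothetical:

```python
# p..y carry a negative sign together with the final digit 0..9.
NEGATIVE_LAST = {chr(ord("p") + digit): digit for digit in range(10)}


def decode_zoned(field):
    """Convert a zoned decimal string to a signed integer."""
    body, last = field[:-1], field[-1]
    if last.isdigit():
        return int(body + last)  # positive: plain digits
    return -int(body + str(NEGATIVE_LAST[last]))


for field in ("328v", "390x", "4176", "775q", "801t"):
    print(field, decode_zoned(field))
```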


7 SET FILES

RowGen can draw field values at random from a pre-existing SET file. In this way, you
can ensure that one or more generated fields is populated with realistic-looking data.

Syntax

SET files are text files that consist of one or more columns of any number of entries.
When multiple columns are present (see Relation SET Files on page 119), the entries
must be tab-delimited. Any lines at the top of a SET file that begin with a # symbol
are treated as comments, and are not selected by RowGen. When the #
symbol is no longer the first character, the SET file content begins. Thereafter, if a #
symbol appears, it is assumed to be part of the data.
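
The comment rule can be made concrete with a small reader. The Python sketch below is a hypothetical helper for inspecting SET files, not RowGen's parser. It skips # lines only while it is still at the top of the file, then treats every later line, including one that starts with #, as data. Ignoring blank lines is an added assumption:

```python
def read_set_entries(lines):
    """Return SET file rows as lists of tab-separated columns."""
    entries = []
    in_header = True
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # assumption: blank lines carry no entry
        if in_header and line.startswith("#"):
            continue  # leading comment line
        in_header = False  # content has begun; # is now ordinary data
        entries.append(line.split("\t"))
    return entries


sample = [
    "# SET file for RowGen\n",
    "# second comment line\n",
    "Taft\tOhio\n",
    "#1 fan\tIowa\n",
]
rows = read_set_entries(sample)
print(rows)
```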

The RowGen /FIELD invocation of SET files is as follows:

/FIELD=(FieldName, SET=<set_descript>[<control_options>][,other field attributes])

where:

• FieldName is the name of the field.
• set_descript is either a SET file name with an optional path, or a user-defined
value list (any combination of literals, ranges, and field names) enclosed in
braces; for example {5,7,(8,10)}
• control_options can include any of the following:
• DEFAULT="string" where string represents the value to be displayed when
a search yields no result.


• ORDER=

PRE_SORTED PRE_SORTED is the default sort order of SET files. If
you are selecting values from a single-column SET
file, order is not considered and not required. However,
to find second, third, fourth, etc. column values
in a multi-column relational SET file, the SET file
must be in order (see Relation SET Files on
page 119).

The PRE_SORTED option (the default) yields better
performance than using the NOT_SORTED option
because no additional sorting is performed on each
execution of the job script.

NOT_SORTED Use NOT_SORTED if the SET file is not already in order;
RowGen will then sort the SET file internally. It is
recommended that you sort multi-column SET files prior
to use with RowGen so that the NOT_SORTED option will
never be required.

WARNING! If you access elements from an unsorted multi-column
SET file without specifying NOT_SORTED, you will receive an
error message that indicates where the first unsorted record was
found, for example:

error (111): SET name_list.set @ 2 records not in order
error (111): records not in order

where the first unsorted record is the second record in
name_list.set.

• SEARCH=
= EQ exact match to the argument
> GT value in table after the argument
>= GE exact match if found, otherwise later value
< LT value in table before the argument
<= LE exact match if found, otherwise earlier value


• SELECT=
ANY The default. Values from the specified SET file are selected at
random. Therefore, repetition of field values is possible using
this default option, as well as the omission of values. The
amount of repeated or omitted SET values depends on the
number of rows being generated and the number of entries that
exist in the SET file.
ALL If ALL is specified, RowGen selection begins at the top entry
of the SET file, and continues downward through the file, in
order, with each new row that is generated. Depending on the
value of /INCOLLECT (see /INCOLLECT on page 59), if all
SET file entries are utilized, the selection process begins
again at the top, and this process is repeated.
ONCE If ONCE is specified, RowGen selection begins at the top entry
of the SET file, and continues downward through the file, in
order, with each new row that is generated. Depending on the
value of /INCOLLECT (see /INCOLLECT on page 59), once
all SET file entries are utilized, the selection process ends,
and any remaining rows to be generated contain an empty
value for that field.
SUFFIX Selection begins at the top entry and continues downward
through the set file, in order, with each new row that is
generated. Once all set file entries are used, the selection
process begins again at the top. On the second pass through the
set file, the values are appended with the string "_2", continuing
with "_3", "_4", and so on, until the INCOLLECT limit is
reached.
ROW Only applies to set files with two or more tab delimited
columns. Works like ANY if the referenced set file has not
been used in a previous field. If the referenced set file has
already been used in a previous field, then the same row from
the set file is used to supply the value for the field with the
ROW selection type. The required index argument to the
ROW selection type specifies which column of the set file to
use to supply the field value.


PERMUTE Similar to ALL. All fields that have PERMUTE as the
selection type are related. Each field will select values from
its respective set file in a sequential order. Some values on
the second and subsequent fields will be skipped, in order to
give all possible unique combinations of all fields. Once all
possible unique combinations have been produced, the same
pattern will repeat if the /INCOLLECT (see /INCOLLECT on
page 59) amount has not been filled. When using at least one
PERMUTE field, a special value for INCOLLECT may be
specified. Using PERMUTE as the argument to the
INCOLLECT statement will result in an INCOLLECT amount
that is just large enough to encompass all possible unique
combinations of the permuted fields.
WEIGHT Applies only to weighted distributions.
See Distributions on page 77

NOTE The above access types are also applicable when using literal SETs
(see Literal SETs on page 122), where ALL and ONCE selections begin at the
left-most entry in the literal set, and continue with each value to the right.
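
The difference between the ANY, ALL, ONCE, and SUFFIX selection types is easiest to see on a three-entry set. The Python sketch below simulates the described behavior for a given number of rows. It is a model of the descriptions above, not RowGen code; the function names are hypothetical. The permute helper only shows how many unique combinations PERMUTE would need to cover; the order in which RowGen emits them is not specified here:

```python
import random
from itertools import product


def select_values(entries, mode, rows, rng=None):
    """Simulate SELECT= behavior for `rows` generated rows."""
    count = len(entries)
    if mode == "ANY":  # random draws: repeats and omissions are possible
        rng = rng or random.Random()
        return [rng.choice(entries) for _ in range(rows)]
    if mode == "ALL":  # top to bottom, then start again at the top
        return [entries[i % count] for i in range(rows)]
    if mode == "ONCE":  # top to bottom, then empty values
        return [entries[i] if i < count else "" for i in range(rows)]
    if mode == "SUFFIX":  # like ALL, with _2, _3, ... on later passes
        result = []
        for i in range(rows):
            pass_number = i // count + 1
            suffix = "" if pass_number == 1 else f"_{pass_number}"
            result.append(entries[i % count] + suffix)
        return result
    raise ValueError(f"unknown selection type: {mode}")


def permute(*sets):
    """All unique combinations across the given sets."""
    return list(product(*sets))


names = ["a", "b", "c"]
for mode in ("ALL", "ONCE", "SUFFIX"):
    print(mode, select_values(names, mode, 7))
print(len(permute(names, ["x", "y"])))
```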

• SetFile is the set file name.


• param list - Bracket characters ([ ] ) are used to contain a list of
comma-separated search arguments. An argument is a literal (quoted
string) or a field name. Look up values can be a literal or a value of a field
from either the input or inrec.
• def chars is a quoted character string to be returned if the search fails.
• other field attributes are used to complete your field definition with
other attributes, such as POSITION, SIZE, and data type.
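
The SEARCH options and the DEFAULT string describe a lookup against an ordered table. The Python sketch below models that idea with the standard bisect module on a two-column table keyed by its first column. It is an interpretation of the option descriptions for illustration, not RowGen's lookup implementation. The function name, the table contents, and the error text are made up for the example:

```python
import bisect


def lookup(table, argument, search="EQ", default=""):
    """Return the second column of the row selected by `search`."""
    keys = [row[0] for row in table]
    if keys != sorted(keys):
        raise ValueError("records not in order")
    left = bisect.bisect_left(keys, argument)    # first key >= argument
    right = bisect.bisect_right(keys, argument)  # first key > argument
    if search == "EQ":
        index = left if left < right else None
    elif search == "GE":
        index = left if left < len(keys) else None
    elif search == "GT":
        index = right if right < len(keys) else None
    elif search == "LE":
        index = right - 1 if right > 0 else None
    elif search == "LT":
        index = left - 1 if left > 0 else None
    else:
        raise ValueError(f"unknown search type: {search}")
    return table[index][1] if index is not None else default


table = [(10, "ten"), (20, "twenty"), (30, "thirty")]
for search in ("EQ", "GT", "GE", "LT", "LE"):
    print(search, lookup(table, 25, search, default="none"))
```

The order check at the top mirrors why a multi-column SET file has to be sorted (or declared NOT_SORTED) before column lookups can work.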

NOTE The previous version of applying these attributes without requiring the
attribute= convention is still supported. For example:
/FIELD=(new_id,NOT_SORTED ONCE SET=c:\sets\id_list.set,POS=1,SEP='|')

7.1 Character SET Files

RowGen can draw field values at random from pre-existing SET files that contain any
number of character strings. When you are selecting values from a character SET file,
the values are generated in the exact form in which they appear in the SET file.


An alpha_digit data type (the default) is assumed (see ASCII Character Data Types on
page 94).

If, for example, you want to produce EBCDIC-equivalent values from the strings in your
character SET file, you must provide output /FIELD statements and specify EBCDIC as
the data type.

Example 34 Using a Character SET File

Consider the SET file chiefs.set (see Example 25 on page 85).

Comments preceded by # symbol are allowed in all RowGen SET files at the
NOTE top of the set. When the # symbol is no longer the first character, the set file
content begins. Thereafter, if a # appears, it is assumed to be part of the data.

You can create a simple generation script, simple.rcl, as follows:

/INFILE=simple.in
/PROCESS=RANDOM
/INCOLLECT=3
/FIELD=(Names,SET=chiefs.set,POSITION=1,SIZE=19)
/FIELD=(zip_code,POSITION=21,SIZE=5,digit)
/REPORT
/OUTFILE=simple.out

To run this job, you would type the following on the command line:

rowgen /spec=simple.rcl

This will produce the output file simple.out, which contains:

Taft, William H.    70363
Roosevelt, Theodore 46881
Wilson, Woodrow     47217

However, without the SET=filename attribute in the above script, RowGen would
randomly generate a field of size 19, with alpha_digit characters only, which is the
default data type (see ASCII Character Data Types on page 94):

p01d33sG6yD4864qt6d 31716
E1v6IaG86PP1rq66o8R 45286
P1cO3la4Jh3hk5Kv6e6 14812


If you declare a SIZE that is too small to accommodate one or more values in
NOTE the numeric SET file referenced in its /FIELD statement, an error 21 is
returned (improper format declaration).

7.2 Numeric SET Files with Ranges

RowGen can also draw numeric values at random from pre-existing SET files that
contain any number of numeric values or numeric ranges. The SIZE attribute you
provide will determine the precision of how the value will be represented (unless a
literal value is selected).

When you are selecting values from a numeric SET file, the NUMERIC data type should
be given as a field attribute on input. If you want to produce fields on output with
another data type such as CURRENCY, you can provide output /FIELD statements and
specify CURRENCY as the data type.

The supported entries in a numeric SET file can be any combination of:

• Literal values with or without precision, such as 14 or 12.25. These values are
produced as they appear.
• [x,y]. Inclusive low to high range values, where x and y are considered for
random selection. For example, the entry [-2,2] can yield any of the following
(if decimal precision is set to 0):
• -2
• -1
• 0
• 1
• 2.

• (x,y). Exclusive low to high range values, where x and y are not considered for
random selection, but all values in between are considered. For example, the
entry (-2,2) can yield any of the following (if decimal precision is set to 0):
• -1
• 0
• 1.

• [x,y) or (x,y]. A combination pair of square and round brackets where the
above rules apply to each side of the expression. For example, the entry [-2,1)
can yield any of the following (if decimal precision is set to 0):
• -2
• -1
• 0.


• Numeric ranges using decimal precision. For example, the entry [1.52,1.55] can
yield any of the following (if decimal precision is set to 2):
• 1.52
• 1.53
• 1.54
• 1.55.
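
The bracket rules above can be checked by listing every value a range entry allows at a given precision. The Python sketch below does this with the standard decimal module. It is a model of the rules as described, not RowGen's generator; the function name expand_range is hypothetical, and it assumes the lower bound already lies on the precision grid:

```python
from decimal import Decimal


def expand_range(entry, precision):
    """List the values allowed by an entry such as [-2,2] or (1.5,1.8]."""
    low_closed = entry[0] == "["     # [ includes the low bound
    high_closed = entry[-1] == "]"   # ] includes the high bound
    low_text, high_text = entry[1:-1].split(",")
    step = Decimal(1).scaleb(-precision)  # 1, 0.1, 0.01, ...
    low, high = Decimal(low_text), Decimal(high_text)
    value = low if low_closed else low + step
    values = []
    while value < high or (high_closed and value == high):
        values.append(value.quantize(step))
        value += step
    return values


for entry, precision in (("[-2,2]", 0), ("(-2,2)", 0),
                         ("[-2,1)", 0), ("[1.52,1.55]", 2)):
    print(entry, [str(v) for v in expand_range(entry, precision)])
```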

Example 35 Using a Numeric SET File

Consider the following SET file, numbers.set:

# SET file for RowGen
# Decimal values with open and closed sets
[-7,-5]
[-4.5,-2.5)
(-1.00,1.00)
(2.000,4.000]

Comments preceded by # symbol are allowed in all RowGen SET files at the
NOTE top of the set. When the # symbol is no longer the first character, the set file
content begins. Thereafter, if a # appears, it is assumed to be part of the data.

The following script, numbergen.rcl, shows several ways to define and produce
numeric values using the above SET file:

/INFILE=numbers.in
/PROCESS=RANDOM
/INCOLLECT=10
/FIELD=(v0,SET=numbers.set,POSITION=01,SIZE=8.0,NUMERIC)
/FIELD=(v1,SET=numbers.set,POSITION=11,SIZE=8.1,NUMERIC)
/FIELD=(v2,SET=numbers.set,POSITION=21,SIZE=8.2,NUMERIC)
/FIELD=(v3,SET=numbers.set,POSITION=31,SIZE=8.3,NUMERIC)
/SORT
/KEY=v0
/OUTFILE=numbers.out

To run this job, you would type the following on the command line:

rowgen /spec=numbergen.rcl


This will produce the output file numbers.out, which contains:

-6 0.9 -5.73 0.011
-6 0.8 3.02 3.662
-5 4.0 -5.43 -3.188
-5 -5.7 3.31 3.364
-3 3.0 -5.76 -2.958
-3 0.8 -2.92 0.383
-3 -2.8 -3.30 -5.626
0 3.2 -5.64 0.880
0 0.3 3.84 -2.509
0 3.9 -5.74 3.288

Note that the SIZE attribute given for each /FIELD statement determines the total field
size and decimal precision on output for the selected SET file values. The ranges of
possible values are determined by the SET file entries themselves and the types of
brackets that enclose them.

See Example 3 on page 15 for another example of selecting values from a numeric SET
file.

7.3 Date SET Files with Ranges

RowGen can also draw dates at random from pre-existing SET files that contain
any number of date values or ranges.

When you are selecting values from a date SET file, a RowGen-supported DATE data
type must be specified as a field attribute on input (see Date/Timestamp Data Types on
page 105 for the available options), and the date(s) within the SET file must be of the
data type format specified.

If you want to produce fields on output with another supported DATE data type, such as
converting from ISO_DATE to AMERICAN_DATE, you can provide output
/FIELD statements and specify AMERICAN_DATE as the data type.

Note that DATE fields are sorted in date order when referenced as a sort /KEY (see
KEYS on page 135).

The supported entries in a date SET file can be any combination of:

• Literal date values, of a RowGen-supported DATE data type, such as
07/31/2004 or 2004-12-31. These values are produced as they appear in the SET
file.
• [x,y]. Inclusive low to high range values, where x and y are considered for
random selection. For example, the entry [2002-02-27,2002-03-02] can yield
any of the following:


• 2002-02-27
• 2002-02-28
• 2002-03-01
• 2002-03-02.

NOTE Only valid dates are produced by RowGen when date ranges are selected from
a SET file.

Literal date values in a SET file are also ignored by RowGen when they are
invalid. For example, an entry of Feb/31/2005 is ignored.

• (x,y). Exclusive low to high range values, where x and y are not considered for
random selection, but all values in between may be generated. For example, the
entry (2002-02-27,2002-03-02) can yield either of the following:
• 2002-02-28
• 2002-03-01.

• [x,y) or (x,y]. A combination pair of square and round brackets where the
above rules apply to each side of the expression. For example, the entry
[2002-02-27,2002-03-02) can yield any of the following:
• 2002-02-27
• 2002-02-28
• 2002-03-01.
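
The same bracket rules can be modeled for ISO-style date ranges. The Python sketch below lists the calendar dates an entry allows. It is an illustration of the rules as stated, not RowGen's implementation, and the function name expand_date_range is hypothetical. Because it steps through real calendar days, only valid dates can appear in the result:

```python
from datetime import date, timedelta


def expand_date_range(entry):
    """List the dates allowed by an entry such as [2002-02-27,2002-03-02)."""
    low_closed = entry[0] == "["
    high_closed = entry[-1] == "]"
    low_text, high_text = entry[1:-1].split(",")
    low = date.fromisoformat(low_text)
    high = date.fromisoformat(high_text)
    first = low if low_closed else low + timedelta(days=1)
    last = high if high_closed else high - timedelta(days=1)
    span = (last - first).days  # negative when nothing is allowed
    return [(first + timedelta(days=n)).isoformat() for n in range(span + 1)]


for entry in ("[2002-02-27,2002-03-02]",
              "(2002-02-27,2002-03-02)",
              "[2002-02-27,2002-03-02)"):
    print(entry, expand_date_range(entry))
```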

Example 36 Using a Date SET File

Consider the following SET file, dates.set:

[2009-02-27,2009-03-01]
[2010-12-31,2011-01-02)
(2011-04-02,2011-04-04)
(2011-06-01,2011-06-03]
2012-01-01


The following script, dategen.rcl, illustrates how dates are drawn from a date SET file,
and how date formats can be converted:

/INFILE=dates.in
/PROCESS=RANDOM
/INCOLLECT=10
/FIELD=(date1,SET=dates.set,POSITION=1,SIZE=10,ISO_DATE)
/SORT
/KEY=date1 # sorts in ascending date order
/NODUPLICATES # produces unique dates
/OUTFILE=datesISO.out # display original selection
/OUTFILE=datesUSA.out # for type conversion
/FIELD=(date1,POSITION=1,SIZE=11,AMERICAN_DATE) # type-convert

To run this job, you would type the following on the command line:

rowgen /spec=dategen.rcl

This will produce the file datesISO.out:

2009-02-28
2011-01-02
2011-04-03
2011-04-04
2011-06-03
2012-01-01

This will also produce the output file datesUSA.out, which contains:

Feb/28/2009
Jan/02/2011
Apr/03/2011
Apr/04/2011
Jun/03/2011
Jan/01/2012

Note that the ranges of possible values are determined by the SET file entries and the
types of brackets that enclose them. The literal entry (2012-01-01) is also
represented in the final output. The data type is converted from ISO_DATE in the SET
file to AMERICAN_DATE on output. The /NODUPLICATES statement ensures that each
date is unique (see No Duplicates, Duplicates Only on page 138). Values with
the AMERICAN_DATE data type can also be generated on input (rather than requiring a
conversion as shown in this example), provided the SET file contains AMERICAN_DATE
values, and that the input field is declared accordingly.
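
The sort, duplicate removal, and type conversion performed by dategen.rcl can be approximated in Python as follows. This is a sketch under stated assumptions: the AMERICAN_DATE layout (Mon/DD/YYYY) is inferred from the sample output above, the picked values are hypothetical draws, and the function name iso_to_american is not part of RowGen.

```python
from datetime import date

# English month abbreviations, spelled out to avoid locale-dependent formatting.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def iso_to_american(iso_text: str) -> str:
    """Convert an ISO_DATE string (YYYY-MM-DD) to the Mon/DD/YYYY layout."""
    d = date.fromisoformat(iso_text)  # raises ValueError for invalid dates
    return f"{MONTHS[d.month - 1]}/{d.day:02d}/{d.year:04d}"

# Hypothetical random draws, including one repeated value.
picked = ["2012-01-01", "2009-02-28", "2011-06-03", "2009-02-28"]

# ISO dates sort chronologically as text; set() drops the duplicate.
unique_sorted = sorted(set(picked))
print(unique_sorted)
print([iso_to_american(value) for value in unique_sorted])
```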

7.4 Timestamp SET Files with Ranges

RowGen can also draw timestamps at random from SET files that contain any number of
timestamp values or ranges.

When you are selecting values from a timestamp SET file, a RowGen-supported
timestamp data type must be specified as a field attribute on input (see Date/Timestamp
Data Types on page 105 for the available options), and the timestamp(s) within the SET
file must be of the data type format specified.

If you want to produce fields on output with another supported timestamp data type,
such as converting from ISO_TIMESTAMP to AMERICAN_TIMESTAMP, you can
provide output /FIELD statements and specify AMERICAN_TIMESTAMP as the data
type. Note that timestamp fields are sorted in date order when referenced as a sort /KEY
(see KEYS on page 135).

The supported entries in a timestamp SET file can be any combination of:

• Literal timestamp values, of a RowGen-supported timestamp data type, such as
11/01/2006 11:15:01 PM, or 2006-11-01 23.15.01. These values are produced as
they appear in the SET file.
• [x,y]. Inclusive low to high range values, where x and y are considered for
random selection. For example, the entry
[2004-11-01 23.15.01,2005-02-28 15.01.59] might yield any of the
following:

• 2004-11-22 02.28.32
• 2004-11-25 09.57.26
• 2005-01-07 20.20.24
• 2005-02-07 12.39.35.

NOTE: Only valid timestamps are produced by RowGen when timestamp ranges are
selected from a SET file.

Literal timestamp values in a SET file are also ignored by RowGen when they are
invalid. For example, an entry of 2004-02-31 27.15.01 is ignored.

• (x,y). Exclusive low to high range values, where x and y are not considered for
random selection, but all values in between can be generated.
• [x,y) or (x,y]. A combination pair of square and round brackets where the
above rules apply to each side of the expression.
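
As an illustration of an inclusive timestamp range, the following Python sketch draws random values between two ISO_TIMESTAMP bounds. It is not RowGen's algorithm: the one-second granularity, the uniform distribution, and the function name random_timestamp are assumptions made for the example.

```python
import random
from datetime import datetime, timedelta

ISO_TS = "%Y-%m-%d %H.%M.%S"  # layout of the ISO_TIMESTAMP examples in this section

def random_timestamp(low_text: str, high_text: str, rng: random.Random) -> datetime:
    """Pick a timestamp from the inclusive range [low, high], to the second."""
    low = datetime.strptime(low_text, ISO_TS)    # raises ValueError if invalid
    high = datetime.strptime(high_text, ISO_TS)
    span_seconds = int((high - low).total_seconds())
    # randint includes both end points, matching the [x,y] form.
    return low + timedelta(seconds=rng.randint(0, span_seconds))

rng = random.Random(0)  # seeded only so that repeated runs give the same values
for _ in range(4):
    value = random_timestamp("2004-11-01 23.15.01", "2005-02-28 15.01.59", rng)
    print(value.strftime(ISO_TS))
```

An exclusive bound would move the corresponding end point inward by one second before the draw.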

Example 37 Using a Timestamp SET File

Consider the following SET file, times.set:

# SET file for RowGen


[2010-11-01 23.15.01,2011-02-28 15.01.59]
[2012-11-01 23.15.01,2013-02-28 15.01.59]
2009-12-01 13.04.04

The following script, timestamp.rcl, illustrates how timestamps are drawn from a
SET file and written to output:

/INFILE=remap.in
/PROCESS=RANDOM
/INCOLLECT=10
/FIELD=(random_time,SET=times.set,POSITION=1,SIZE=20,ISO_TIMESTAMP)
/REPORT
/OUTFILE=times.out
/FIELD=(random_time,POSITION=1,SIZE=20,ISO_TIMESTAMP)

To run this job, enter the following command:

rowgen /spec=timestamp.rcl

This will produce the output file times.out, which contains:

2011-01-31 09:56:530
2013-01-25 19:15:270
2009-12-01 13.04.040
2009-12-01 13.04.040
2009-12-01 13.04.040
2009-12-01 13.04.040
2010-12-29 11:14:090
2010-12-03 01:00:160
2009-12-01 13.04.040
2009-12-01 13.04.040

Note that the ranges of possible values are determined by the SET file ranges and its
single literal value.

7.5 Relation SET Files

RowGen supports the generation of independent and dependent (or related) data values
or ranges across fields, when randomly selected from a multi-column SET file. This
functionality is similar to using a look-up table.

When the desired value or range of values of one field is dependent on the selected value
from another field, the data look more realistic.

The syntax for generating related field values is:

/FIELD=(field1,SET=setfilename.set ,POSITION=1, . . .
/FIELD=(field2,SET=ROW[2] setfilename.set ,POSITION= . . .

where setfilename.set is the name of any tab-delimited SET file that contains the
possible values for field1 (which can be given as a literal string, i.e., a known
quantity) to the left of the tab, and the possible values for field2 to the right of the tab.
The number in brackets following ROW is the column of the SET file to pull from.

Within each generated record, the first pull from the SET file selects a row at random
and takes the value from the column indicated. Subsequent pulls that indicate a column
take their values from that same row. If a later field in the record pulls from the SET
file without indicating a column, a new row is selected at random, and all pulls after
it take their values from that new row.
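
The row-sharing behavior can be pictured with a short Python sketch. It is a simplified model, not RowGen code: the in-memory rows stand in for a two-column, tab-delimited SET file, and the function name generate_record is hypothetical.

```python
import random

# Stand-in for a tab-delimited SET file: column 1 = state, column 2 = city.
SET_FILE_TEXT = (
    "Alabama\tAbbeville\n"
    "Alabama\tAlabaster\n"
    "Alaska\tAnchorage\n"
    "Arizona\tAjo\n"
)
rows = [line.split("\t") for line in SET_FILE_TEXT.splitlines()]

def generate_record(rng: random.Random) -> tuple[str, str]:
    """Model one record: a random first pull, then a ROW[2] pull from the same row."""
    row = rng.choice(rows)  # first pull selects a row at random
    state = row[0]          # field1: SET=setfilename.set (column 1)
    city = row[1]           # field2: SET=ROW[2] setfilename.set (same row, column 2)
    return state, city

rng = random.Random(7)
for _ in range(3):
    print(generate_record(rng))
```

Because both values come from one row, a generated state is always paired with a city listed for it.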

Example 38 Using a Multi-Column SET File

Consider a relational SET file, state_city.set, which contains a tab-delimited list of the
50 U.S. states, each paired with cities in that state:

Alabama Abbeville
Alabama Alabaster
Alabama Albertville
...
Alaska Anchorage
Alaska Barrow
Alaska Bethel
...
Arizona Ajo
Arizona Apache Junction
Arizona Avondale
...

This example also uses the following other SET files:

add_values.set

[1,100)
[1,100)
[1,100)
[100,999]

streetnames.set

1st St.
2nd St.
3rd St.
4th St.
5th St.
6th St.
7th St.
8th St.
9th St.
A St.
Aaron's Pl.
Abbey Pl.
Abigail St.
Abington Pl.
etc...

The following script, address_gen.rcl, generates fictional addresses, and matches states
with realistic, corresponding city names:

/INFILE=addresses.in
/PROCESS=RANDOM
/FIELD=(add_no,SET=add_values.set,POSITION=1,SIZE=3.0,SEPARATOR='\t',NUMERIC)
/FIELD=(address,SET=streetnames.set,POSITION=2,SEPARATOR='\t')
/FIELD=(State,SET=state_city.set,POSITION=3,SEPARATOR='\t')
/FIELD=(City,SET=ROW[2] state_city.set,POSITION=4,SEPARATOR='\t')
/REPORT
/OUTFILE=addresses.out
/FIELD=(add_no,POSITION=1)
/DATA=" "
/FIELD=(address)
/DATA=" "
/FIELD=(City,POSITION=25)
/DATA=", "
/FIELD=(State)

An excerpt of the results, addresses.out, is as follows:

803 Flamingo Cir. Plymouth, Wisconsin
77 Orbit Rd. Bear, Delaware
87 Weaver Rd. Broussard, Louisiana
96 Albert Dr. Roy, Utah
78 Beems St. Sterling, Virginia
80 Ballance Rd. Wilkesboro, North Carolina
48 Solara Dr. Pearisburg, Virginia
125 Cricket Ridge Waipahu, Hawaii
254 Country Point Wooster, Ohio
811 Leonard Sasser Cut Bank, Montana

Note that the city and state fields match up according to the entries in the SET file
state_city.set, which provides a more realistic set of address values.

7.6 Literal SETs

In cases where you have a small, customized static list of field values, you can include
the values directly within the /FIELD statement, rather than refer to a separately held
SET file.

There are three options available for using literal SETs:

You can include a comma-separated list of single values from which RowGen will
randomly draw, for example:

/FIELD=(first_name,SET={rob,sue,bill},POSITION=3,SEPARATOR='|')

NOTE: The conventional practice of enclosing string values in quotation marks within
a literal SET is also supported, for example, SET={"rob","sue","bill"}

Or, you can include a literal range, or comma-separated ranges, of values from which
RowGen will randomly draw, for example:

/FIELD=(salary,SET={(10000,50000),(60000,95000)},POSITION=3,SEPARATOR='|')

which will generate random numbers between 10,000 and 50,000, and between 60,000
and 95,000. Use the NUMERIC data type if you want values right-aligned, with a
precision of 2, by default (see ASCII-Numeric Data Types on page 95).

See Numeric SET Files with Ranges on page 113 for details on the rules for using
parentheses and square brackets to determine range selection criteria.

8 FIELD EXPRESSIONS (CROSS-CALCULATION)

In both /INREC and in the output files, it is possible to use a mathematical expression in
place of a field name. These expressions may reference multiple fields and use
arithmetic operators and mathematical functions. Parentheses can be used to control
operator precedence, and temporary fields may be created to hold intermediate values.
These features are particularly useful for ad hoc financial calculations or spreadsheet-
style presentations.

NOTE: It is possible to use the /ROUNDING statement to change the way in which
numeric values with several decimal places are rounded after an arithmetic
RowGen operation (see /ROUNDING on page 177).

Table 5 shows the arithmetic operators in high-to-low-precedence order:

Table 5: Arithmetic Symbols and their Precedence

Operator  Meaning
( )       parentheses
* / %     multiply, divide, whole number remainder of x/y
+ -       add, subtract, and unary operators (such as t=-5)

Example 39 Using Mathematical Expressions

This example, expr.rcl, demonstrates expression writing and its use of precedence:

/INFILE=expr.in
/PROCESS=RANDOM
/INCOLLECT=5
/FIELD=(a,POSITION=01,SIZE=1,digit)
/FIELD=(b,POSITION=04,SIZE=1,digit)
/FIELD=(c,POSITION=07,SIZE=1,digit)
/FIELD=(d,POSITION=10,SIZE=1,digit)
/OUTFILE=expr.out
/HEADREC="a b c d t=a+b*(c+d) (t-1)/4\n\n"
/FIELD=(a,POSITION=01)
/FIELD=(b,POSITION=04)
/FIELD=(c,POSITION=07)
/FIELD=(d,POSITION=10)
/FIELD=(t=a + b * (c + d),POSITION=15,SIZE=6,NUMERIC) # calculate t
/FIELD=((t - 1) / 4,SIZE=9,NUMERIC) # calculate and display

Internal arithmetic is performed in floating-point precision. This example calculates a
value for t and displays it. The arithmetic that produces t is performed in the following
order:

1) Add c plus d.
2) Multiply by b.
3) Add a.
4) Store t.

The output file produced is expr.out:

a b c d t=a+b*(c+d) (t-1)/4

2 5 3 9 62.00 15.25
2 8 7 1 66.00 16.25
3 6 5 9 87.00 21.50
6 6 4 7 72.00 17.75
9 4 6 3 45.00 11.00
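
The evaluation order can be checked against the output above with a few lines of Python, which follows the same precedence rules for these operators. The function name evaluate is only an illustrative helper.

```python
# Input values (a, b, c, d) taken from the rows of expr.out shown above.
rows = [(2, 5, 3, 9), (2, 8, 7, 1), (3, 6, 5, 9), (6, 6, 4, 7), (9, 4, 6, 3)]

def evaluate(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Return t = a + b * (c + d) and the follow-on expression (t - 1) / 4."""
    t = a + b * (c + d)  # parentheses first, then multiply, then add
    return float(t), (t - 1) / 4

for a, b, c, d in rows:
    t, quarter = evaluate(a, b, c, d)
    print(f"{a} {b} {c} {d} {t:.2f} {quarter:.2f}")
```

For the first row, c + d is 12, multiplying by b gives 60, and adding a gives 62, so (t - 1) / 4 is 15.25.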

The functions supported by RowGen are shown in Table 6.

Table 6: Mathematical Functions

Function Description
abs (x) absolute value of x, |x|. x can be a whole number or a floating point value
acos (x) arc cosine of x; range: [0, π] radians; -1≤x≤1
asin (x) arc sine of x; range: [-π/2, π/2] radians; -1≤x≤1
atan (x) arc tangent of x; range: [-π/2, π/2] radians
atan2 arc tangent of y/x; range: [-π, π]
j0 (x) Bessel function of x, first kind, order 0
j1 (x) Bessel function of x, first kind, order 1
jn (n, x) Bessel function of x, first kind, order n
y0 (x) Bessel function of x, second kind, order 0.
y1 (x) Bessel function of x, second kind, order 1.
yn (n, x) Bessel function of x, second kind, order n.
cos (x) cosine of x in radians
exp (x) e to the power of x
cosh (x) hyperbolic cosine of x
sinh (x) hyperbolic sine of x

tanh (x) hyperbolic tangent of x
floor (x) largest whole number (as a double-precision number) not greater than x
gamma (x) ln(|GAMMA(x)|)
log10 (x) logarithm base ten of x; x>0
log (x) natural logarithm of x; x>0
sqrt (x) non-negative square root of x; x≥0
mod (x,y) remainder of the division of x by y: x if y is zero or if x/y would overflow;
otherwise the number f with the same sign as x, such that x=iy+f for a
whole number i, and |f| < |y|
sin (x) sine of x in radians
ceil (x) smallest whole number not less than x
hypot sqrt(x*x+y*y)
tan (x) tangent of x in radians
pow (x,y) x to the power of y; if x<0, y must be a whole number
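
The sign rule in the mod description differs from the remainder operator of some languages, so a small Python sketch may help. It models only the documented behavior (the overflow case is not modeled), and the helper name mod is chosen for the example. It relies on math.fmod, whose result takes the sign of x.

```python
import math

def mod(x: float, y: float) -> float:
    """Remainder with the sign of x; returns x unchanged when y is zero."""
    if y == 0:
        return float(x)      # documented special case
    return math.fmod(x, y)   # x = i*y + f, with |f| < |y| and f signed like x

print(mod(7, 3))    # 1.0
print(mod(-7, 3))   # -1.0  (sign follows x)
print(mod(7, -3))   # 1.0   (sign follows x, not y)
print(mod(5, 0))    # 5.0
print(-7 % 3)       # 2     (Python's own % follows the sign of y instead)
```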

9 /INREC

The optional /INREC section in a RowGen job script lets you transform the input
(generation-phase) record before the processing phase of a job. In it, you can
define one or more derived fields as part of the layout. The derived fields
can be based on the randomly generated (or selected) field values that were defined in
the input section.

/INREC is required when:

• a derived field is to be used as a sort key (as in Example 13 on page 28)
• include/omit record-filter logic will be based on a derived field value
• an output /FIELD or /DATA statement that references a derived field will be part
of a condition (IF THEN ELSE).

An /INREC section must contain all the fields which you intend to include in your
output. That is, if one of the original fields that you define in the input section is not
included in /INREC, it will not be processed, and therefore will not appear on output.

As with the /INFILE section of a script, an /INREC section defines the mapping that
will be sent to output if no output layout is specified.

The syntax is similar to /INFILE and /OUTFILE:


/INREC
attributes

NOTE: The /PROCESS attribute is ignored in /INREC.

The following example incorporates many possible uses of /INREC.

Example 40 Using /INREC

Consider that you have three sets of grades for each pupil, and that the average of the
three grades determines the final mark they receive.

The following script, grades.rcl, uses /INREC to define a derived field that is first used
for sorting, then within outfile-specific /INCLUDE statements, and finally as part of
conditional /DATA statements within each output file:

/INFILE=grades.in
/PROCESS=RANDOM
/INCOLLECT=20
/FIELD=(student_code,POSITION=1,SIZE=3,alpha)
/FIELD=(grade1,POSITION=6,SIZE=2,digit)
/FIELD=(grade2,POSITION=9,SIZE=2,digit)
/FIELD=(grade3,POSITION=12,SIZE=2,digit)

/INCLUDE WHERE grade1 > 60 AND grade2 > 60 AND grade3 > 60

/INREC # derived field, avg_grade, to be defined here:
/FIELD=(student_code,POSITION=1,SIZE=3)
/FIELD=(grade1,POSITION=6,SIZE=2,digit)
/FIELD=(grade2,POSITION=9,SIZE=2,digit)
/FIELD=(grade3,POSITION=12,SIZE=2,digit)
/FIELD=(avg_grade=(grade1 + grade2 + grade3) / 3,POSITION=15,NUMERIC)
/SORT
/KEY=avg_grade # INREC was required to define this key field

/OUTFILE=lowgrades.out
/INCLUDE WHERE avg_grade < 80 # references the INREC-defined field
/HEADREC="Code Grade1 Grade2 Grade3 Average\n----------------------------------------\n"
/FIELD=(student_code,POSITION=1,SIZE=3)
/FIELD=(grade1,POSITION=6)
/FIELD=(grade2,POSITION=15)
/FIELD=(grade3,POSITION=24)
/FIELD=(avg_grade,POSITION=34)
/DATA=IF avg_grade < 70 THEN " very low pass" ELSE " low pass"

/OUTFILE=highgrades.out
/INCLUDE WHERE avg_grade > 80 # references the INREC-defined field
/HEADREC="Code Grade1 Grade2 Grade3 Average\n----------------------------------------\n"
/FIELD=(student_code,POSITION=1)
/FIELD=(grade1,POSITION=6)
/FIELD=(grade2,POSITION=15)
/FIELD=(grade3,POSITION=24)
/FIELD=(avg_grade,POSITION=34)
/DATA=IF avg_grade < 85 THEN " high pass" ELSE " very high pass"

This produces two output files.

lowgrades.out contains:

Code Grade1 Grade2 Grade3 Average
----------------------------------------
rAH 69 65 69 67.67 very low pass
QSD 61 83 65 69.67 very low pass
FjB 66 71 74 70.33 low pass
EZH 91 64 62 72.33 low pass
FFG 83 73 63 73.00 low pass
PBC 97 61 65 74.33 low pass
BFD 65 78 84 75.67 low pass
CNA 68 91 68 75.67 low pass
CZD 77 73 77 75.67 low pass
zBt 91 70 67 76.00 low pass
rUJ 76 86 66 76.00 low pass
GIB 63 96 70 76.33 low pass
qVV 81 84 74 79.67 low pass

highgrades.out contains:

Code Grade1 Grade2 Grade3 Average
----------------------------------------
BrC 81 92 71 81.33 high pass
sRu 69 86 95 83.33 high pass
ECp 81 76 96 84.33 high pass
MKl 85 74 99 86.00 very high pass
FTF 92 81 86 86.33 very high pass

Note that the "avg_grade" field ((grade1 + grade2 + grade3) / 3) was created in
the /INREC section because it was used for sorting purposes. It was also used in output
file-specific /INCLUDE statements (see INCLUDE-OMIT (RECORD SELECTION) on
page 146). Finally, it was used as part of output field-specific IF THEN ELSE logic to
produce the evaluations seen in the final column
(see CONDITIONAL FIELD AND DATA STATEMENTS on page 152).
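
The flow of grades.rcl (input filter, derived field, sort, then per-file filters and labels) can be mirrored in Python to make the order of operations explicit. This is a rough model only: four of the records are taken from the output above, one is a hypothetical record added to show the input filter, and the helper names are not RowGen syntax.

```python
# (student_code, grade1, grade2, grade3)
records = [
    ("rAH", 69, 65, 69),
    ("MKl", 85, 74, 99),
    ("BrC", 81, 92, 71),
    ("FjB", 66, 71, 74),
    ("xYz", 55, 90, 90),  # hypothetical record that fails the input /INCLUDE
]

# Input /INCLUDE: keep records where all three grades exceed 60.
kept = [r for r in records if r[1] > 60 and r[2] > 60 and r[3] > 60]

# /INREC: append the derived field avg_grade to every kept record.
derived = [(code, g1, g2, g3, (g1 + g2 + g3) / 3) for code, g1, g2, g3 in kept]

# /SORT with /KEY=avg_grade (ascending).
derived.sort(key=lambda r: r[4])

def low_label(avg: float) -> str:
    return "very low pass" if avg < 70 else "low pass"

def high_label(avg: float) -> str:
    return "high pass" if avg < 85 else "very high pass"

# Output-specific /INCLUDE filters plus the conditional /DATA text.
low = [(r[0], round(r[4], 2), low_label(r[4])) for r in derived if r[4] < 80]
high = [(r[0], round(r[4], 2), high_label(r[4])) for r in derived if r[4] > 80]

print(low)
print(high)
```

Because the derived value exists before the sort, the same field can serve as the sort key, the filter operand, and the condition in the final column.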

10 /DATA

/DATA statements are used to pad and format output records. /DATA statements are not
named fields, so they cannot be directly mapped. Each is positioned just after the
previous /DATA or /FIELD statement.

There are several forms of /DATA statements:


/DATA=[{n}]"literal string"
where n is the number of times to repeat the literal string you
specify, and the literal string can contain any combination of
constants and/or control (escape) characters (see Table 8 on
page 132) and conversion specifiers (see Table 9 on page 133).

The literal string can also contain syntax specific to a mark-up
language, such as HTML, provided the browser (or other utility) you
use to read the output file accepts that syntax. For an example of
producing an HTML report, see Example 15 on page 33.

/DATA=field_name
displays the value of an input field without formatting

/DATA=internal_variable
where the internal variable can be any RowGen internal variable
listed in Table 7 on page 131.

You can also use a conditional data statement (see CONDITIONAL FIELD AND DATA
STATEMENTS on page 152).

Example 41 Using /DATA Statements

Several examples of /DATA statements are shown in the following script, data.rcl.
It uses the SET file parts_list2.set (as shown in Example 11 on page 25).

/INFILE=data.in
/PROCESS=RANDOM
/INCOLLECT=15
/FIELD=(part,SET=parts_list2.set,POSITION=1,SIZE=15)
/FIELD=(price,POSITION=16,SIZE=10,NUMERIC)
/OMIT WHERE price < 0
/SORT
/KEY=part
/NODUPLICATES
/OUTFILE=data.out
/DATA=part # field name
/DATA="Price: " # literal string
/FIELD=(price,SIZE=14,CURRENCY)
/DATA="  " # adds two spaces
/DATA=CURRENT_TIMESTAMP # internal variable
/DATA={3}"*-*" # repeated constant string

This produces data.out:


drills Price: $3,841,273.75 2007-10-15 10.07.37*-**-**-*
lightbulbs Price: $5,377,739.84 2007-10-15 10.07.37*-**-**-*
nails Price: $7,267,159.77 2007-10-15 10.07.37*-**-**-*
pliers Price: $3,675,267.17 2007-10-15 10.07.37*-**-**-*
ratchets Price: $7,036,346.34 2007-10-15 10.07.37*-**-**-*
sanders Price: $4,832,167.22 2007-10-15 10.07.37*-**-**-*
screws Price: $6,521,240.49 2007-10-15 10.07.37*-**-**-*
tacks Price: $7,082,260.76 2007-10-15 10.07.37*-**-**-*
wrenches Price: $3,451,852.58 2007-10-15 10.07.37*-**-**-*

11 INTERNAL VARIABLES

RowGen maintains internal values which you can use in /DATA, /HEADREC, and
/FOOTREC statements (see /DATA on page 129, /HEADREC on page 170, and
/FOOTREC on page 171). The internal values are shown in Table 7.
Table 7: Internal Variables
Variable Output/Example
AMERICAN_DATE Sep/19/2004
EUROPEAN_DATE 19.09.2004
JAPANESE_DATE 2004-09-19
ISO_DATE 2004-09-19
CURRENT_DATE 2004-09-19

AMERICAN_TIME 09:47:15 AM
EUROPEAN_TIME 21.47.15
JAPANESE_TIME 09:47:15 PM
ISO_TIME 21.47.15
CURRENT_TIME 21.47.15

AMERICAN_TIMESTAMP Sep/19/2004 09:47:15 AM
EUROPEAN_TIMESTAMP 19.09.2004 21.47.15
JAPANESE_TIMESTAMP 2004-09-19 09:47:15 PM
ISO_TIMESTAMP 2004-09-19 21.47.15
CURRENT_TIMESTAMP 2004-09-19 21.47.15

SYSDATE current date and time in the format used by Oracle and SQLbase
CURRENT_TIMEZONE Eastern Daylight Time (Windows) or EDT (UNIX)
PAGE_NUMBER page number of the report (/HEADREC and /FOOTREC statements only)
USER the name of the user currently executing the program

For example, if you include the following statement in your script:


/HEADREC="Produced on %s",CURRENT_DATE

then the top of the output might contain:

Produced on 2004-09-19

For details on using format control characters, such as %s, see Table 14 on page 170.

12 CONTROL (ESCAPE) CHARACTERS

RowGen supports the use of a control character, such as a horizontal tab, as a field
separator (see SEPARATOR on page 86). In addition, the formatting statements /DATA,
/HEADREC, and /FOOTREC can also contain control characters (see /DATA on page 129,
/HEADREC on page 170, and /FOOTREC on page 171). Table 8 describes the
supported control characters and how to specify them.

Table 8: Control (Escape) Characters


Character Format within a String
\a alert
\\ backslash
\b backspace
\r carriage return
\" double quote
\f form-feed
\t horizontal tab
\n newline
\0 NULL character
\' single quote
\v vertical tab
\$ dollar sign

13 CONVERSION SPECIFIERS

Conversion specifiers are used within /DATA statements, and for defining conditions,
when you want to convert a known value into its equivalent in another data type
(see /DATA on page 129).

The EBCDIC conversion specifier is available when you want to use the ASCII
equivalent of an EBCDIC value within a condition, or return the EBCDIC equivalent of
an ASCII value within the records of your output. Similarly, the PACKED conversion
specifier is available when you want to use the ASCII-numeric equivalent of a PACKED
value within a condition, or return the PACKED equivalent of an ASCII value within
the records of your output.

The hexadecimal conversion specifier is used when you want to return the hexadecimal
equivalent of an ASCII value. For example, if you include the statement
/DATA=%HEX"C134", the two-byte hexadecimal equivalent, consisting of C1 and 34,
would be returned. A hex dump representation of this value would therefore appear as
C1 34 at the specified location within every output record.

The ASCII conversion specifier is not typically required because the character string
you specify for conversion is already in ASCII. It is provided only for situations where
the specifier itself is a variable (see Environment Variables on page 49).

Table 9 shows the syntax for using the conversion specifiers which are recognized by
RowGen.

Table 9: Conversion Specifiers


Data Type Form
EBCDIC %EBCDIC"ASCII_value"
PACKED %PACKED"ASCII_value"
Hexadecimal %HEX"ASCII_value"
ASCII %ASCII"string"

NOTE: For hexadecimal values 01 through 09, you can use "\n" without the %HEX
specifier, where n is any whole number from 1 to 9. For example,
/DATA="\4" returns the hexadecimal value 04.

This short-hand method can also be used within conditions.
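
The byte-level effect of these specifiers can be pictured in Python. This sketch is illustrative only: it assumes EBCDIC code page 037 (this guide does not state which EBCDIC code page RowGen applies), and it shows the resulting bytes rather than any RowGen mechanism.

```python
# %HEX"C134": the four hex digits become the two bytes C1 and 34.
hex_value = bytes.fromhex("C134")

# %EBCDIC"65": each ASCII character is mapped to its EBCDIC code point.
# Code page 037 is an assumption made for this example.
ebcdic_value = "65".encode("cp037")

# /DATA="\4": short-hand for the single byte with hexadecimal value 04.
short_hand = bytes([4])

print(hex_value.hex(" ").upper())     # C1 34
print(ebcdic_value.hex(" ").upper())  # F6 F5
print(short_hand.hex().upper())       # 04
```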

13.1 Using a Conversion Specifier within a /DATA statement

To return the EBCDIC equivalent of the ASCII value 65 in an output file, for example,
use the following:

/DATA=%EBCDIC"65"

The location of this value within your results depends on where, within the layout of
your output records, you specified the /DATA statement (see /DATA on page 129). For
/HEADREC and /FOOTREC statements, the value would appear at the top or bottom of
the output, respectively.

13.2 Using a Conversion Specifier within a Condition

The following is an example of a condition involving a conversion specifier when the
input field's data is in EBCDIC format:

/CONDITION=(Senior,TEST=(Age GT %E"65"))

If you /INCLUDE this condition, only those records in which the Age field holds the
EBCDIC equivalent of 66 or higher are included in the output (see CONDITIONS on
page 139).

14 KEYS

In RowGen jobs that contain a /SORT statement (the default process), the order of the
output records is determined by comparing one or more key fields within records until
they are not equal. The /KEY statement is used for specifying each key field.
By default, records sort from left to right if no /KEY is given. Any number of /KEY
statements can be given. Keys are compared in the order given, and each subsequent
key is compared only while the preceding keys are equal.

This section describes the components of the /KEY statement and its related options:

• Syntax
• Field Name Reference
• Unnamed Reference on page 136
• Collating Sequence on page 136
• Direction on page 136
• ASCII Options on page 137
• No Duplicates, Duplicates Only on page 138.

14.1 Syntax

Sort parameters are separated by commas. If the field name is used, it must be first,
while the other parameters may appear in any order:
/KEY=(field[,data type][,collating_sequence]
[,direction][,ASCII options])

field may be one of the following:

• an input field name
• a column position and size
• a field position and separator, and optionally, a size.

14.2 Field Name Reference

The simplest form of the /KEY statement uses a defined input field name for field and
no other parameters. When field is the only parameter and the direction is ascending,
the parentheses are not required. The position in the record, size of the field, and data
type are known from the input or /INREC section’s /FIELD description.

The following RowGen job script generates and sorts records with first and last names,
selected from SET files. The sort order is by last name, followed by first name:

/INFILE=chiefs_sep
/FIELD=(fname,SET=first_names.set,POSITION=1,SEPARATOR='|')
/FIELD=(lname,SET=last_names.set,POSITION=2,SEPARATOR='|')
/FIELD=(term,POSITION=3,SEPARATOR='|')
/SORT
/KEY=lname
/KEY=fname
/OUTFILE=chiefs.out

NOTE: Parentheses are not required for the key when using only the field name to
describe it.
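
The effect of giving two /KEY statements can be sketched in Python, where a tuple key compares its second element only when the first elements are equal. The sample names are hypothetical and the code is an analogy, not RowGen's sort engine.

```python
# (fname, lname, term): hypothetical generated records
records = [
    ("John", "Adams", "2"),
    ("George", "Washington", "1"),
    ("John", "Tyler", "10"),
    ("Lyndon", "Johnson", "36"),
    ("Andrew", "Johnson", "17"),
]

# /KEY=lname then /KEY=fname: the first name breaks ties between equal last names.
by_name = sorted(records, key=lambda r: (r[1], r[0]))

for fname, lname, term in by_name:
    print(lname, fname, term)
```

The two Johnson records are ordered by first name because their first key compares equal.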

14.3 Unnamed Reference

A key field can contain a field statement without a name, as in:


/KEY=(POSITION=1,SIZE=15)
/KEY=(POSITION=3,SEPARATOR='|')

14.4 Collating Sequence

By default, the collating sequence for the key field(s) is determined by the data type
specified in the input /FIELD statement(s). If the field was not specifically typed on
input, ASCII collation will occur.

You can, however, specify a different collating sequence for any given key. For
example, if the code field contains ASCII strings that might contain some binary
characters (for example, if using a SET file), you may want to use the EBCDIC collating
sequence, as follows:

/KEY=(code,EBCDIC)

14.5 Direction

Direction of a key's comparison is either ASCENDING or DESCENDING. If the direction
is omitted, the default is ASCENDING. An example is:

/KEY=(Price,DESCENDING)

14.6 ASCII Options

These can be used when the field is ASCII. These options do not actually reformat the
field; they only change how the comparison is made (see Alignment on page 90 for
actual field reformatting). The ASCII options are:

alignment Causes the key field to be shifted for comparison purposes. The
options are:
Left Sort as if the key field characters precede trailing space(s).
Right Sort as if the key field characters trail leading spaces.
None The default. Sort with the key field characters compared as
they currently appear in the field. Sort order may be affected
by leading or trailing spaces within the field.

For example, given the following:

/INFILE=align.dat
/FIELD=(f1,SET=align_test.set,POSITION=1,SIZE=9)
/SORT
/KEY=(f1,alignment)
/OUTFILE=stdout

and assuming align_test.set contains this value:

__Chars__1

the various alignments cause the following compares:

Table 10: Comparisons: Alignment


Alignment Effect
LEFT Chars____
RIGHT ____Chars
NONE __Chars__

case can be CASE_ON or CASE_OFF to control whether the key field
comparison is case-sensitive, as shown in Table 11 using the previous
record. The default is CASE_ON. With CASE_ON, key field characters are
compared as they appear, respecting uppercase and lowercase letters.
All uppercase letters collate ahead of lowercase letters.

1. The underscore character (_) represents a blank space.

Table 11: Comparisons: Case


Case Effect
CASE_ON __Chars__
CASE_OFF __CHARS__
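
Tables 10 and 11 can be reproduced with a small Python helper that builds the value used for comparison while leaving the stored field untouched. The function name comparison_key and its string arguments are assumptions made for this sketch.

```python
def comparison_key(value: str, alignment: str = "NONE", case: str = "CASE_ON") -> str:
    """Return the form of a key field that would be used for comparison only."""
    width = len(value)
    if alignment == "LEFT":
        value = value.strip(" ").ljust(width)   # characters first, spaces after
    elif alignment == "RIGHT":
        value = value.strip(" ").rjust(width)   # spaces first, characters after
    if case == "CASE_OFF":
        value = value.upper()                   # fold case before comparing
    return value

field = "  Chars  "  # two leading and two trailing blanks, as in align_test.set
for alignment in ("LEFT", "RIGHT", "NONE"):
    # Blanks are shown as underscores, following the convention of Table 10.
    print(alignment, comparison_key(field, alignment).replace(" ", "_"))
print("CASE_OFF", comparison_key(field, case="CASE_OFF").replace(" ", "_"))
```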

14.7 No Duplicates, Duplicates Only

Duplicates occur when two or more records have key fields that compare equally.
Only the key fields, not the records, must compare equally for a record to be considered
a duplicate.

The statement /NODUPLICATES results in only one of the duplicates being output.
If you also specify /STABLE, the earliest input duplicate will be the one retained.
Without /STABLE, the duplicate retained is arbitrary.

NOTE: The number of records that are generated (using /INCOLLECT) will not match
the number that is produced if duplicates are found and removed.

Conversely, you can specify /DUPLICATESONLY, where only records containing key
fields that compare equally are returned.
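
The following Python sketch models the two statements on a small, hypothetical record list keyed on the first element. It assumes that /DUPLICATESONLY returns every record whose key occurs more than once; the guide does not spell this out, so treat that part as an interpretation.

```python
from collections import Counter

# (part, quantity): hypothetical records; the key is the part name only.
records = [("nails", 3), ("drills", 1), ("nails", 7), ("tacks", 2), ("drills", 9)]

def key(record: tuple[str, int]) -> str:
    return record[0]

def no_duplicates_stable(recs):
    """Model /NODUPLICATES with /STABLE: keep the earliest input record per key."""
    kept, seen = [], set()
    for record in sorted(recs, key=key):  # Python's sort is stable
        if key(record) not in seen:
            seen.add(key(record))
            kept.append(record)
    return kept

def duplicates_only(recs):
    """Model /DUPLICATESONLY: keep records whose key occurs more than once."""
    counts = Counter(key(record) for record in recs)
    return sorted((r for r in recs if counts[key(r)] > 1), key=key)

print(no_duplicates_stable(records))
print(duplicates_only(records))
```

Only the key is compared, so ("nails", 3) and ("nails", 7) count as duplicates even though the records differ.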

15 CONDITIONS

A condition is a logical expression that combines field names and/or constants with
relational and/or logical operators. When the expression is evaluated, it will be either
true or false.

A condition is associated with several RowGen statements, and can be used for both
input and output. The true/false result controls how the statement works. The following
statements use a condition within their definitions:
• /INCLUDE and /OMIT
• /DATA using IF-THEN-ELSE logic
• /FIELD using IF-THEN-ELSE logic
• WHERE and BREAK in summary functions (MAX, MIN, SUM, AVG, COUNT).

15.1 Syntax

A condition can be one of two forms (see INCLUDE-OMIT (RECORD SELECTION) on
page 146):

Named (implicit) where the logical expression has a name and is defined with the
statement:

/CONDITION=(condition_name, TEST=(logical_expression))

Once you have defined the condition_name, it may be used in any
later statement that uses that condition, such as
/INCLUDE WHERE condition_name. Naming is done for
documentation purposes and to build nested expressions. It is also
more efficient than using unnamed conditions if the same logical
expression is used repeatedly. The named condition is analyzed
once, and the result is reusable.

Unnamed (explicit) where the logical expression is built into the statement using it:

/INCLUDE WHERE logical_expression


/OMIT WHERE logical_expression

The logical_expression can involve any of the following types:

• Unary Logical Expressions (Change Test) on page 140
• Binary Logical Expressions on page 140
• Function Compares in Conditions (iscompares) on page 141
• Compound Logical Expressions on page 144.

15.2 Unary Logical Expressions (Change Test)

The simplest logical expression is a field name. The condition is true when the value of
the field is different from its value in the previous record. If the value of the field has not
changed, the condition is false.

Typically in change testing, data is sorted using the named field as a key, but this is not a
requirement. The most common use of change tests is for defining BREAK points in
summary functions, as shown in Example 50 on page 159.
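
A change test can be modeled in Python as a comparison of each value with the one before it. In this sketch the first record is treated as a change, which is an assumption (the text above does not say how the first record is handled), and the function name change_flags is hypothetical.

```python
def change_flags(values: list[str]) -> list[bool]:
    """Return True wherever a value differs from the value in the previous record."""
    flags = []
    previous = object()  # sentinel: nothing equals it, so record 1 counts as a change
    for value in values:
        flags.append(value != previous)
        previous = value
    return flags

# Sorted key values, as is typical for BREAK processing.
departments = ["A", "A", "B", "B", "B", "C"]
print(change_flags(departments))
```

Each True marks the start of a new group, which is where a BREAK would fall in a summary function.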

15.3 Binary Logical Expressions

Another form of a logical expression involves two values and a relational operator.
The following is the general form of the expression:

Value1 Relational_Operator Value2

The values can be field names or literals (constants).

RowGen recognizes both the operator and symbol forms of these relational operators,
as shown in Table 12.

Table 12: Recognized Relational Operators


Operator  Symbol  Meaning
EQ        ==      Equal
NE        !=      Not equal
GT        >       Greater than
GE        >=      Greater than or equal
          !<      Not less than
LT        <       Less than
LE        <=      Less than or equal
          !>      Not greater than
CT                Contains
NC                Does not contain

Examples of logical expressions using relational operators are:

Author EQ "Publisher"
Publisher >= "Addison-Wesley"
Author CT "Hemmingway"
Price > 25.00


NOTE    Note the difference above between Price > 25.00 (a numeric compare)
        and Publisher >= "Addison-Wesley" (a character compare).
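
The difference between the two kinds of compare can be seen in a small Python sketch. This is an illustration only, not RowGen code. The helper names numeric_gt and character_ge are hypothetical, and Python compares strings by code point, which is assumed here to be close enough to a character compare for ASCII data; RowGen's actual collating sequence may differ.

```python
def numeric_gt(field_text, literal):
    """Numeric compare: convert the field text to a number first."""
    return float(field_text) > literal


def character_ge(field_text, literal):
    """Character compare: compare the two strings position by position."""
    return field_text >= literal


print(numeric_gt("9.50", 25.00))               # False: 9.5 is less than 25
print(character_ge("9.50", "25.00"))           # True: "9" collates after "2"
print(character_ge("Dell", "Addison-Wesley"))  # True: "D" collates after "A"
```

The same text ("9.50") gives opposite answers depending on whether the compare is numeric or character, so the literal on the right-hand side should be written in the form that matches the intended compare.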

15.4 Function Compares in Conditions (iscompares)

You can use C-style iscompare functions in RowGen to evaluate conditions at the field
level, and also for record-filtering using /INCLUDE and /OMIT statements. This is
useful when drawing from character SET files that require validation (see Character
SET Files on page 111).

Two categories of iscompare tests are available:

C-library iscompare tests

The following function compares in RowGen are identical to those available to C
programmers:

isalphadigit(field)  Equivalent to the C library test named isalnum. True if all
                     characters are alphanumeric. This is equivalent to
                     (isalpha(field) || isdigit(field)).

isalpha(field)       True if all characters are alphabetic in the current locale.
                     This is equivalent to (isupper(field) or islower(field)).
                     In some locales, there may be additional characters for which
                     isalpha(field) is true, that is, there may be letters which are
                     neither upper-case nor lower-case.

isascii(field)       True if each character is a 7-bit unsigned char value that fits
                     into the ASCII character set.

iscntrl(field)       True if all characters are control characters.

isdigit(field)       True if all characters are digits (0 through 9).

isgraph(field)       True if all characters are printable (except space).

islower(field)       True if all characters are lower-case.

isprint(field)       True if all characters are printable (including space).

ispunct(field)       True if all characters are printable characters other than a
                     space or an alphanumeric.

isspace(field)       True if all characters are white-space characters: space,
                     form-feed (\f), newline (\n), carriage return (\r),
                     horizontal tab (\t), and vertical tab (\v).


isupper(field)       True if all characters are upper-case.

isxdigit(field)      True if all characters are hexadecimal digits, that is, any of
                     0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f, A, B, C, D, E, F.

WARNING! The list of printable characters on Windows operating
         systems differs from printable characters on Linux and Unix.
         Refer to your operating system documentation for details.

Additional iscompare tests

The following non-C-library function compares are also available in RowGen:

isempty(field) Returns true for null fields or those that satisfy isspace(field).

isnumeric(field) Same as isdigit(field), but also recognizes period (.), plus (+),
and minus (-), and only when it satisfies isspace(field). At least
one char must be a digit.

isebcalpha(field) True if all characters are EBCDIC alphabetic.

isebcdigit(field) True if all characters are EBCDIC digits.

isholding(value1,value2)
True if value2 is contained within value1. value1 and/or value2
may be a literal value or a field name, for example
isholding(ACCOUNT," # ").

ispattern(field,"expression")
Checks the field using Perl-compatible regular expressions such as
a+bc.

ispacked(field) Checks the field to make sure each nibble, except for the last one,
contains a 0-9 value, and that the last nibble contains a hex b, c,
d, or f.
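
The descriptions of ispacked, isempty, and isholding above can be restated as a short Python sketch. This is a model of the documented rules only, not RowGen's implementation. The function names is_packed, is_empty, and is_holding are hypothetical, and the sketch assumes that a null field can be represented by None or an empty string.

```python
def is_packed(raw: bytes) -> bool:
    """Every nibble except the last must be 0-9; the last nibble must be
    hex b, c, d, or f (the sign nibble)."""
    if not raw:
        return False
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)      # high nibble
        nibbles.append(byte & 0x0F)    # low nibble
    *digits, sign = nibbles
    return all(n <= 9 for n in digits) and sign in (0xB, 0xC, 0xD, 0xF)


def is_empty(field) -> bool:
    """True for a null field (assumed: None or "") or an all-white-space field."""
    return field is None or field == "" or field.isspace()


def is_holding(value1: str, value2: str) -> bool:
    """True if value2 is contained within value1."""
    return value2 in value1


print(is_packed(bytes([0x12, 0x3C])))   # True: digits 1, 2, 3 and sign nibble c
print(is_packed(bytes([0x1A, 0x3C])))   # False: nibble a is not a digit
print(is_empty(" " * 8))                # True: eight spaces
print(is_holding("ACCT # 42", " # "))   # True
```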


Example 42 Using iscompares

Given the SET file, phone.set:

555-1111
55520098

555-4321

555-0098

Note that the two empty records above consist of 8 spaces each.

The following script, iscompare.rcl, uses iscompare functions within both record filter
logic (/OMIT) and field-level evaluation in the output:

/INFILE=iscompare.in
/PROCESS=RANDOM
/INCOLLECT=10
/FIELD=(phone,set=phone.set,POSITION=1,SIZE=8)
/OMIT WHERE isdigit(phone)
/REPORT
/OUTFILE=iscompare.out
/FIELD=(phone,POSITION=1,SIZE=8)
/FIELD=(flag,POSITION=12,IF isspace(phone) THEN "empty" ELSE "")

This produces iscompare.out:

empty
555-4321
empty
555-4321
empty
555-1111
empty
555-4321
555-4321
empty

Note that the following record was not selected because the phone number field contains
all digits (isdigit):

55520098

Notice also that the records that contain empty phone number fields (isspace) were
appended with the word empty because of the IF THEN ELSE logic in the final output field
(see CONDITIONAL FIELD AND DATA STATEMENTS on page 152).


15.5 Compound Logical Expressions

The simplest form of a compound logical expression is:


Expression 1 Logical_Operator Expression 2

There are two logical operators:


AND true when both Expression 1 and Expression 2 are true
OR true when either Expression 1 is true or Expression 2 is true.

An example of a compound logical expression is:


Publisher == "Dell" AND Price > 25.00

where Publisher and Price are previously defined fields.

15.6 Evaluation Order

Any number of conditional expressions can be connected by logical operators, as follows:

Expr 1 Log Op 1 Expr 2 Log Op 2 Expr 3 ....

Evaluation proceeds left to right. For example, if Expr 1 is TRUE and
Log Op 1 is OR, then the logical expression is TRUE. All evaluations are shown
in Table 13.

Table 13: Evaluation of Logical Expressions

Unary/Binary   Logical
Expression     Operator   Result
True           None       True
False          None       False
True           AND        Continue
False          AND        False
True           OR         True
False          OR         Continue
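
The rules in Table 13 can be traced with a short Python sketch. This is a model of the table as printed, not RowGen code. The function name evaluate and its argument layout (a list of expression results and a list of the operators between them) are hypothetical.

```python
def evaluate(results, operators):
    """Apply the Table 13 rules left to right.

    results:   truth values of Expr 1, Expr 2, ... (already evaluated)
    operators: "AND"/"OR" strings between them, one fewer than results
    """
    if not results or len(operators) != len(results) - 1:
        raise ValueError("need exactly one operator between each pair of expressions")
    for result, operator in zip(results, operators):
        if operator == "AND" and not result:
            return False          # False AND -> False
        if operator == "OR" and result:
            return True           # True OR -> True
        # True AND, or False OR -> continue with the next expression
    return results[-1]            # last expression has no operator (None row)


print(evaluate([True, False], ["OR"]))               # True
print(evaluate([True, False], ["AND"]))              # False
print(evaluate([False, True, True], ["AND", "OR"]))  # False
```

Under this reading of the table, False AND True OR True is false, because the first False followed by AND settles the result. Python's own operator precedence gives True for `False and True or True`, so expressions should not be assumed to follow the precedence rules of other languages.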


15.7 Compound Conditions

It is possible to build conditions so that the logical expression of one condition will
contain the name of a previously defined condition linked to the rest of the expression by
a logical operator. Since parentheses are not recognized for grouping logical
expressions, this is helpful in defining complex logical expressions.

The following is an example of a script that includes nested conditions, and creates two
output files with different filter logic based on the nested conditions:

/INFILE=presidents.in
/FIELD=(Name,set=names.set,POSITION=1,SIZE=27)
/FIELD=(Party,set=party.set,POSITION=40,SIZE=3)
/FIELD=(State,set=states.set,POSITION=45,SIZE=2)
/CONDITION=(C1,TEST=(Party EQ "DEM")) # C1 defined
/CONDITION=(C2,TEST=(Party EQ "REP")) # C2 defined
/CONDITION=(C3,TEST=(C1 OR C2)) # C3 includes C1 and C2
/INCOLLECT=20
/SORT
/KEY=Name

/OUTFILE=dem_rep.out
/INCLUDE WHERE C3 # output includes only DEMs and REPs

/OUTFILE=no_dem_rep.out
/OMIT WHERE C3 # output omits all DEMs and REPs
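
To see why a named condition can stand in for parentheses, the following Python sketch applies the evaluation rules of Table 13 to a hypothetical record. It is a model only, not RowGen code. The function evaluate, the variable names, and the third test on State are assumptions made for illustration; the script above tests Party only.

```python
def evaluate(terms):
    """terms alternates truth values and "AND"/"OR", e.g. [True, "OR", False].
    Rules follow Table 13: False AND -> False, True OR -> True, else continue."""
    value = terms[0]
    if len(terms) == 1:
        return value
    operator = terms[1]
    if operator == "AND" and not value:
        return False
    if operator == "OR" and value:
        return True
    return evaluate(terms[2:])


# Hypothetical record: Party is DEM, State is not NY.
is_dem, is_rep, is_ny = True, False, False

# Written in one expression: Party EQ "DEM" OR Party EQ "REP" AND State EQ "NY"
ungrouped = evaluate([is_dem, "OR", is_rep, "AND", is_ny])

# With a named condition: C3 = C1 OR C2, then C3 AND State EQ "NY"
c3 = evaluate([is_dem, "OR", is_rep])
grouped = evaluate([c3, "AND", is_ny])

print(ungrouped)  # True: the leading True OR settles the result
print(grouped)    # False: C3 is true, but the State test is false
```

Because C3 is evaluated as a unit before the AND is applied, it behaves like a parenthesized group.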


16 INCLUDE-OMIT (RECORD SELECTION)

/INCLUDE and /OMIT statements use conditions to accept or reject entire records,
respectively. They are different from the IF-THEN-ELSE conditions, which
determine field values (see CONDITIONAL FIELD AND DATA STATEMENTS on
page 152); include and omit conditions use field-value conditions to determine the
dispositions of entire records. This functionality can be applied to records for input
filtering, output filtering, or both.

NOTE    The number of records generated, as specified by the /INCOLLECT statement
        (100 is the default), reflects the evaluation of any /INCLUDE or /OMIT
        statements specified in the input (generation) section of a RowGen job script
        (see /INCOLLECT on page 59). For example, if /INCOLLECT=50, and you
        specify include or omit logic in the input section of your script, then 50 records
        are generated that satisfy the condition(s).

        Any /INCLUDE or /OMIT logic used in an output section(s) of a job script is
        evaluated post-generation and post-processing, and relates only to the records
        written to the output file(s). You can use /OUTCOLLECT to determine the
        number of records produced that satisfy outfile-specific include or omit
        conditions (see MISCELLANEOUS OPTIONS on page 174).

16.1 /INCLUDE and /OMIT Syntax

The form of /INCLUDE and /OMIT statements is as follows:

/INCLUDE WHERE condition


/OMIT WHERE condition

where condition is a logical expression (see Example 43 on page 147), a condition
name (see Example 44 on page 148), or simply a field name (when used on output).

When a field name is given, without being used in an expression, beneath an /OUTFILE
statement, such as /INCLUDE WHERE ID_NO, then only one record containing each
unique ID_NO is returned to that output file. This is similar to the use of
/NODUPLICATES, but in this case, the uniqueness feature is output file-specific,
whereas /NODUPLICATES affects all output files (see No Duplicates, Duplicates Only
on page 138). Note that /OMIT WHERE field_name is the inverse case, which
behaves similarly to /DUPLICATESONLY, but at the output file level. See Example 17 on
page 39 which contains two output files that employ this /INCLUDE usage for
uniqueness.


16.2 Include-Omit Evaluation

Care must be given to the order of /INCLUDE and /OMIT statements because records are
tested for each /INCLUDE and /OMIT condition in the order specified. If a particular
record meets a given /INCLUDE condition, it is included, regardless of any remaining
/OMIT statements that would otherwise cause that record to be omitted. Alternatively, if
a record meets a given /OMIT condition, it is omitted, regardless of any remaining
/INCLUDE statements that would otherwise cause that record to be included. That is,
once a record satisfies an include or omit condition, its inclusion/exclusion in the output
is determined (see Example 44 on page 148 and Example 45 on page 149).
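
The first-match rule described above can be sketched in Python. This is a model only, not RowGen code. The names disposition, is_a_or_b, and under_ten are hypothetical. The treatment of a record that matches no statement is an assumption inferred from the results of Examples 45 and 46: such a record appears to be kept when the last statement is an /OMIT and dropped when the last statement is an /INCLUDE.

```python
def disposition(record, statements):
    """Return True if the record is kept, False if it is dropped.

    statements is an ordered list of ("INCLUDE" | "OMIT", predicate) pairs.
    The first statement whose predicate is true decides the outcome.
    """
    for action, predicate in statements:
        if predicate(record):
            return action == "INCLUDE"
    # Assumption (inferred from Examples 45 and 46): with no match, the
    # record is kept after a trailing OMIT and dropped after a trailing INCLUDE.
    return statements[-1][0] == "OMIT" if statements else True


def is_a_or_b(record):
    return "A" <= record["code"] <= "B"


def under_ten(record):
    return record["price"] < 10


include_first = [("INCLUDE", is_a_or_b), ("OMIT", under_ten)]   # as in order1.rcl
omit_first = [("OMIT", under_ten), ("INCLUDE", is_a_or_b)]      # as in order2.rcl

cheap_b = {"code": "B", "price": 3.45}
dear_c = {"code": "C", "price": 27.62}

print(disposition(cheap_b, include_first))  # True: the include matches first
print(disposition(cheap_b, omit_first))     # False: the omit matches first
print(disposition(dear_c, include_first))   # True: nothing matches, last is OMIT
print(disposition(dear_c, omit_first))      # False: nothing matches, last is INCLUDE
```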

For better performance when there are multiple conditions, place the /INCLUDE
and /OMIT statements whose conditions are most likely to be met first. The sooner
a record's disposition is determined, the fewer conditions need to be evaluated.

You should also consider which section of the script is best for placement of
/INCLUDE and /OMIT statements:

• /INFILE
• /INREC
• /OUTFILE

Any records that are not required in the output should be filtered out in the generation
(input) or /INREC phase prior to sorting (see /INREC on page 126). This makes the sort
faster by keeping unnecessary records from being processed. Be sure such records will
not be required for deriving fields in the output.

Example 43 Using Include and Omit on Input and on Output

This example, inc.rcl, shows how /INCLUDE and /OMIT logic is applied differently,
depending on whether used in the input or output section(s) of a job script:

/INFILE=inc.in
/PROCESS=RANDOM
/INCOLLECT=10
/FIELD=(letter1,POSITION=1,SIZE=1,alpha)
/FIELD=(rest,POSITION=2,SIZE=3,alpha)
/INCLUDE WHERE letter1 GE "A" AND letter1 LE "C"
/REPORT
/OUTFILE=lettera.out
/INCLUDE WHERE letter1 == "A"
/OUTFILE=letterb.out
/INCLUDE WHERE letter1 == "B"
/OUTFILE=letterc.out
/OMIT WHERE letter1 == "A" OR letter1 == "B"


In this example, RowGen generates 10 random records, where the first field consists of a
single letter, A, B, or C. These ten records are sent to separate output files, depending on
the include or omit conditions at the outfile level. Note that the omit logic used in the
final output file is the equivalent of using

/INCLUDE WHERE letter1 == "C"

In this example, the condition criteria were specified explicitly. See Example 44 on
page 148 for an example of using named conditions.

When executed, three files are produced.

lettera.out contains:

AFUz
AKCG

letterb.out contains:

BKpz
BtBE
BbHW
BnsH
BCxW

letterc.out contains:

CcwH
CuDm
CyeB

Example 44 Using Named Conditions

This example, named.rcl, uses /INCLUDE and /OMIT statements with named
conditions:

/INFILE=named.in
/PROCESS=RANDOM
/FIELD=(Code,POSITION=1,SIZE=3,alpha)
/FIELD=(Price,POSITION=5,SIZE=5,NUMERIC)
/CONDITION=(caps_only,TEST=(Code GE "A" AND Code LE "Z"))
/CONDITION=(under_ten,TEST=(Price < 10))
/OMIT WHERE under_ten
/INCLUDE WHERE caps_only
/INCOLLECT=10
/OUTFILE=named.out


This produces named.out:

ARM 53.75
BoD 95.67
DeZ 14.23
EGS 42.12
EbD 65.52
GbH 67.39
KKD 26.41
PBH 77.16
UOK 81.70
XbJ 31.21

In this case, records are not generated when they satisfy the under_ten /OMIT
condition, and records are generated only when they satisfy the caps_only /INCLUDE
condition. The /INCOLLECT statement determines the number of records generated that
satisfy both input file conditions.

Example 45 Condition Order: Include before Omit

This example, order1.rcl, shows how the output is affected when an /INCLUDE
statement appears before an /OMIT statement:

/INFILE=order1.in
/PROCESS=RANDOM
/INCOLLECT=15
/FIELD=(code,POSITION=1,SIZE=1,alpha)
/FIELD=(rest,POSITION=2,SIZE=3,alpha)
/FIELD=(Price,POSITION=6,SIZE=5,NUMERIC)
/INCLUDE WHERE code GE "A" AND code LE "C"
/OUTFILE=order1.out
/INCLUDE WHERE code GE "A" AND code LE "B"
/OMIT WHERE Price < 10

In this example, 15 records are generated where the first character can be only A, B, or
C. The order of the include and omit statements in the output section determines how the
filter logic is applied in the resultant output file. After generation and processing
(a default left to right sort in this case), RowGen first checks the include condition. If
the code field is A or B (as the condition states), these records are written to output
regardless of the subsequent omit statement. All other records, that is, those with code C,
are subject to the evaluation criteria of the subsequent /OMIT statement.


The results, order1.out, are as follows:

AFGw 34.97
AQsE 68.50
AXxA 39.10
AZtM 81.95
Aenn -7.80
AmFc 75.58
BFjG 84.76
BbXB 3.45
Bfmn 43.41
ByXG 19.21
CFHR 27.62
CGyw 72.19
CPCV 84.15
CnIa 74.38
CwgD 26.63

Note that all records satisfying the first include statement (A and B records) are included
without exception. Only the C records were left to be evaluated by the omit condition,
and therefore, C records with a price less than 10 were omitted from the
results.

Example 46 Condition Order: Omit before Include

This example, order2.rcl, shows how the output is affected when an /OMIT statement
appears before an /INCLUDE statement:

/INFILE=order2.in
/PROCESS=RANDOM
/INCOLLECT=15
/FIELD=(code,POSITION=1,SIZE=1,alpha)
/FIELD=(rest,POSITION=2,SIZE=3,alpha)
/FIELD=(Price,POSITION=6,SIZE=5,NUMERIC)
/INCLUDE WHERE code GE "A" AND code LE "C"
/OUTFILE=order2.out
/OMIT WHERE Price < 10
/INCLUDE WHERE code GE "A" AND code LE "B"

The output is different for this script because the order of the /INCLUDE and /OMIT
statements is reversed. Generated records are first tested for the omit condition
(Price < 10), and if it is true, these records are omitted regardless of the subsequent
/INCLUDE statement. The /INCLUDE evaluation is then applied to the surviving records
(that is, those with prices of 10 or more).


The results in order2.out are as follows:

AJED 76.51
AaKh 87.32
AgCe 79.67
AoQQ 21.51
AsER 90.73
Attz 29.85
AvhO 30.26
BKLe 72.03
BRnB 67.32

Note that all records with prices less than ten were omitted from the results without
exception. Of the remaining records, the include statement is considered,
which is why only A and B records are produced.


17 CONDITIONAL FIELD AND DATA STATEMENTS

RowGen permits the value for /FIELD statements in /INREC and output file
descriptions, and the value for /DATA statements in output file descriptions, to be derived
from IF-THEN-ELSE logic. When using this logic, the syntax for these statements is:

/FIELD=(field-name[,field attributes][,IF-THEN-ELSE])
/DATA=IF-THEN-ELSE

The general form for IF-THEN-ELSE is:


IF condition THEN value1 ELSE value2

If the condition is true, the resultant value of the /FIELD or /DATA statement is
value1; if the condition is false, the resultant value is value2. A condition is a
named condition or a logical expression as discussed in CONDITIONS starting on
page 139. A value is a field name, literal (numeric value or character string), algebraic
expression, summary value, or another IF-THEN-ELSE derived value.

Below is an example with a conditional /DATA and a /FIELD statement, which are used
in the output section(s) of a script:

/CONDITION=(DT,TEST=(type == "deciduous"))
/FIELD=(tree, POSITION=2, SIZE=4, IF DT THEN oldtree ELSE "PINE")
/DATA=IF DT THEN instructions ELSE "none"

where type, oldtree, and instructions are previously defined input fields where
SET files were used, and "PINE" and "none" are string literals. Notice that string
literals must be enclosed in double quotes. Also, remember that the value of a /DATA
statement appears immediately after any preceding /FIELD or /DATA statements, while a
/FIELD statement may have its POSITION and SIZE defined.

You can get the same results as above using unnamed (explicit) conditions:

/FIELD=(tree, POSITION=2, SIZE=4, IF type == "deciduous" THEN oldtree ELSE "PINE")


/DATA=IF type == "deciduous" THEN instructions ELSE "none"

There are two variations on the general form of IF-THEN-ELSE logic:

• IF condition THEN value1


• IF condition ELSE value2

The first implies an empty value for the ELSE clause, and the second implies an empty
value for the THEN clause. Below are examples:

/FIELD=(tree, POSITION=2, SIZE=4, IF type == "deciduous" THEN oldtree)


/DATA=IF type == "deciduous" ELSE "none"


These are equivalent to:

/FIELD=(tree, POSITION=2, SIZE=4, IF type == "deciduous" THEN oldtree ELSE "")


/DATA=IF type == "deciduous" THEN "" ELSE "none"

A further (nested) level of IF-THEN-ELSE may appear in either or both of the ELSE or
THEN clauses. Any number of levels can be defined to meet the degree of complexity in
your requirements.

For example, the following is valid syntax:

IF C1 THEN IF C2 THEN V1 ELSE V2 ELSE IF C3 THEN V3 ELSE V4

but for clarity should be written:

IF C1 \
   THEN IF C2 \
           THEN V1 \
           ELSE V2 \
   ELSE IF C3 \
           THEN V3 \
           ELSE V4

The rule is that each THEN or ELSE clause is associated with the most recent IF that does
not already have a clause of that kind associated with it. The line continuation at the
end of each line of the statement is necessary for RowGen to evaluate the statement
correctly.
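
The association rule can be checked with a small Python sketch that parses the one-line form of the statement above into a nested structure. This is an illustration of the rule only, not RowGen's parser. The functions parse and evaluate are hypothetical, conditions are assumed to be single tokens (such as condition names), and a missing THEN or ELSE clause is given an empty value, as described earlier in this chapter.

```python
def parse(tokens):
    """Parse a value: a plain token, or IF cond [THEN value] [ELSE value].
    Each THEN/ELSE is consumed by the innermost IF still waiting for it."""
    token = tokens.pop(0)
    if token != "IF":
        return token
    condition = tokens.pop(0)
    then_value = else_value = ""          # an omitted clause implies an empty value
    if tokens and tokens[0] == "THEN":
        tokens.pop(0)
        then_value = parse(tokens)
    if tokens and tokens[0] == "ELSE":
        tokens.pop(0)
        else_value = parse(tokens)
    return ("IF", condition, then_value, else_value)


def evaluate(node, truth):
    """Walk the parsed tree using a dict of condition name -> True/False."""
    if not isinstance(node, tuple):
        return node
    _, condition, then_value, else_value = node
    return evaluate(then_value if truth[condition] else else_value, truth)


statement = "IF C1 THEN IF C2 THEN V1 ELSE V2 ELSE IF C3 THEN V3 ELSE V4"
tree = parse(statement.split())
print(tree)
# ('IF', 'C1', ('IF', 'C2', 'V1', 'V2'), ('IF', 'C3', 'V3', 'V4'))
print(evaluate(tree, {"C1": True, "C2": False, "C3": True}))   # V2
print(evaluate(tree, {"C1": False, "C2": True, "C3": False}))  # V4
```

The parsed tree matches the indented layout shown above: ELSE V2 belongs to IF C2, and the second ELSE belongs to IF C1.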

NOTE    When long lists are to be tested for a true condition, processing will be faster if
        the cases that are most likely to be true are specified first.


18 EXAMPLES USING CONDITIONAL FIELD AND DATA STATEMENTS

Following are examples using conditional field and conditional data statements.

The SET file parts_list3.set contains:

DBD:screwdrivers
CBC:hammers
ABC:glue sticks
EBD:lightbulbs
ABE:pliers
BBC:ratchets
BBD:buzz saws
EBE:switches
BBE:sanders
CBD:nails
CBE:tacks
DBC:screws
DBE:wrenches
EBC:drills
ABD:lighters

Example 47 Simple Conditional /DATA Statement

This example, cond_data.rcl, includes a /DATA statement that identifies those records
from parts_list3.set where the prefix code begins with the letter A. An /INREC
statement is used to isolate the first letter of the prefix code for evaluation purposes
(see /INREC on page 126):

/INFILE=cond_data.in
/PROCESS=RANDOM
/INCOLLECT=10
/FIELD=(part,SET=parts_list3.set,POSITION=1,SIZE=30)
/INREC
/FIELD=(prefix=sub_string(part,1,1),POSITION=1,SIZE=1)
/FIELD=(part,POSITION=1,SIZE=30)
/REPORT
/OUTFILE=cond_data.out
/FIELD=(part,POSITION=1,SIZE=30)
/DATA=IF prefix EQ "A" THEN "A code" ELSE "not A"

This produces cond_data.out:

EBE:switches                  not A
CBD:nails                     not A
ABE:pliers                    A code
DBC:screws                    not A
BBD:buzz saws                 not A
BBE:sanders                   not A
BBC:ratchets                  not A
ABC:glue sticks               A code
CBE:tacks                     not A
DBE:wrenches                  not A

Example 48 Multi-Level Conditional /FIELD Statement

This job script, cond_field.rcl, uses a multi-level condition for an output field where the
test is based on the value of the /INREC-derived prefix field (see /INREC on
page 126):

/INFILE=cond_field.in
/PROCESS=RANDOM
/INCOLLECT=10
/FIELD=(part,SET=parts_list3.set,POSITION=1,SIZE=30)
/INREC
/FIELD=(prefix=sub_string(part,1,1),POSITION=1,SIZE=1)
/FIELD=(part,POSITION=1,SIZE=30)
/REPORT
/OUTFILE=cond_field.out
/FIELD=(part,POSITION=1,SIZE=30)
/FIELD=(test1,POSITION=31,SIZE=10,IF prefix EQ "A" THEN "A code" \
ELSE IF prefix EQ "B" THEN "B code" \
ELSE IF prefix EQ "C" THEN "C code" \
ELSE "other")

This produces cond_field.out:

EBE:switches other
CBD:nails C code
ABE:pliers A code
DBC:screws other
BBD:buzz saws B code
BBE:sanders B code
BBC:ratchets B code
ABC:glue sticks A code
CBE:tacks C code
DBE:wrenches other


19 SUMMARY FUNCTIONS (AGGREGATION)

RowGen can produce output records containing summary fields derived from
accumulated detail records that have been generated. (Detail records are the records that
comprise the summaries.) Multiple levels of summary records can be created in the
same pass. Depending on the levels of BREAKs you specify, multiple levels of subtotals
can be provided, along with a grand total.

For an example of a multi-level report that uses all RowGen-supported aggregation
functions, see Report with Multiple Aggregations on page 31.

One or more summary fields may be derived using the following RowGen features:

• Summary and Average Functions on page 157
• Maximum and Minimum Functions on page 159
• Counting Function on page 161
• Ranking on page 161.

Summary records may be produced on a given BREAK condition at the end of
processing. Any number of intermediate BREAK levels can also be specified. This is
particularly useful in simulating data warehousing and reporting environments where
complex drill-down aggregation is performed.

There are two steps required to define a summary field within a summary or detail
record:
1) Use an output /FIELD statement to describe its position and format in the output
record.
2) Use one of the above functions to determine how the value of the field is derived.

Summary records can be formatted differently at each level. Each of these levels can be
written to a separate file, or merged into one file to produce a structured report.

You can also create RUNNING summary fields in the detail records. These running, or
accumulating, summary fields are updated at each record.

Use the /ROUNDING statement to change the way in which numeric values with several
decimal places are rounded after an arithmetic RowGen operation (see /ROUNDING on
page 177).


19.1 Summary and Average Functions

The syntax of /SUM and /AVERAGE is the same:

/SUM=FieldX [FROM field or expression] [RUNNING] \
     [WHERE condition1] [BREAK condition2]

/AVERAGE=FieldX [FROM field or expression] [RUNNING] \
     [WHERE condition1] [BREAK condition2]

If condition1 in a WHERE clause is an expression (such as price GT 11), rather than
simply a field name, and is followed by a BREAK statement, then condition1 must be
enclosed in parentheses.

A sum is accumulated until each break. With /AVERAGE, the accumulating sum is
divided by the number of records before each break. A grand total and file-wide
average is produced if there is no grouping via breaks.

The FROM portion indicates the source of values; it can be a generated field name or a
mathematical expression which references one or more previous field names. If an
expression is used, it should be enclosed with parentheses.

NOTE    When using an expression to define the FROM clause, you should enclose the
        expression in parentheses. Only simple expressions can be used in the FROM of
        a summary definition. You may not use parentheses for grouping. For
        example, you may have:

        /SUM=Exp_B FROM (A * B - 5)

        but not

        /SUM=Exp_B FROM (A * (B - 5))

WHERE condition1 determines if the sum field value of a particular record is to be
added to the sum. If condition1 is true, the value is added; otherwise it is not. If there
is no WHERE portion in the statement, then the addition is unconditional.

BREAK condition2 controls when the /SUM or /AVERAGE record is output and
the values are reset.

Again, there is a natural BREAK at the end of the job; therefore, if the BREAK portion is
not used, or condition2 never becomes true, results for the whole job are displayed.
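
The interaction of WHERE and BREAK can be followed in a Python sketch that accumulates a sum, a count, and an average per group. This is a model of the behavior described above, not RowGen code. The function names, the sample records, and the choice to report an average of 0 when no record in a group satisfies the WHERE condition are all assumptions.

```python
def make_row(group, total, count):
    # Assumption: report 0.0 as the average when no record qualified.
    average = total / count if count else 0.0
    return (group, total, count, average)


def summarize(records, where):
    """records: (publisher, price) pairs, already sorted on publisher.
    where: predicate on price that decides whether a record is accumulated."""
    rows = []
    group, total, count = None, 0.0, 0
    for publisher, price in records:
        if group is not None and publisher != group:   # BREAK: the key changed
            rows.append(make_row(group, total, count))
            total, count = 0.0, 0                       # reset the accumulators
        group = publisher
        if where(price):                                # WHERE filter
            total += price
            count += 1
    if group is not None:                               # natural BREAK at end of job
        rows.append(make_row(group, total, count))
    return rows


# Hypothetical sorted records; only prices under 10 are accumulated.
records = [("Dell", 4.0), ("Dell", 6.0), ("Dell", 12.0),
           ("Macmillan", 5.0), ("Macmillan", 15.0)]
for row in summarize(records, lambda price: price < 10):
    print(row)
# ('Dell', 10.0, 2, 5.0)
# ('Macmillan', 5.0, 1, 5.0)
```

The sketch also shows why the input must be sorted on the BREAK field: an unsorted key would change repeatedly and split one publisher into several summary records.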


Example 49 Summary and Average

Consider the SET file, publishers.set, which consists of names of book publishers:

Prentice Hall
Harper-Row
Dell
Valley Kill
Academic Press
Cambridge University Press
Macmillan

The following job script, sum_avg.rcl, generates 100 records where the publisher name
is randomly selected from the set file, and generates several book prices (up to $20 each)
for each publisher. The summary information is grouped by individual publisher, so the
job must be sorted over the publisher key.

/INFILE=sum_avg.in
/PROCESS=RANDOM
/FIELD=(publisher,SET=publishers.set,POSITION=1,SIZE=30)
/FIELD=(price,POSITION=31,SIZE=5.2,NUMERIC)
/INCLUDE WHERE price > 0 AND price < 20
/SORT
/KEY=publisher # required as a sort key when BREAK is used
/OUTFILE=sum_avg.out
/FIELD=(publisher,POSITION=1,SIZE=30)
/FIELD=(tot_cost,POSITION=32,SIZE=8.2,NUMERIC)
/FIELD=(ct_books,POSITION=41,SIZE=4)
/FIELD=(avgprice,POSITION=48,SIZE=7.2,NUMERIC)
/SUM=tot_cost FROM price WHERE (price < 10) BREAK publisher
/COUNT=ct_books WHERE (price < 10) BREAK publisher
/AVERAGE=avgprice FROM price WHERE (price < 10) BREAK publisher

This produces sum_avg.out:

Academic Press 33.45 6 5.58
Cambridge University Press 29.72 6 4.95
Dell 59.67 13 4.59
Harper-Row 23.66 5 4.73
Macmillan 32.70 6 5.45
Prentice Hall 73.37 12 6.11
Valley Kill 37.06 6 6.18


Note that:

• the first numeric column in the output contains the sum (/SUM) of all the
  sub-$10 price values for each publisher
• the second column contains a count of all sub-$10 records for that publisher
  (see Counting Function on page 161)
• the final column is the average price, calculated as the sum of the sub-$10 prices
  divided by the number of sub-$10 records, for that publisher
• WHERE clauses were applied in the summary fields to produce information for
  only those records where the price was less than $10, and therefore the number
  of records returned on output was fewer than those generated
  (/INCOLLECT=100, by default).

19.2 Maximum and Minimum Functions

/MAXIMUM and /MINIMUM are used to calculate the maximum and minimum values,
respectively, of a field.

/MAX and /MIN are used the same way as /SUM or /AVERAGE. The syntax is:

/MAX=FieldX [FROM field or expression] [RUNNING] \
     [WHERE condition1] [BREAK condition2]

/MIN=FieldX [FROM field or expression] [RUNNING] \
     [WHERE condition1] [BREAK condition2]

If two or more records share the same /MAX or /MIN field value, RowGen will output
the first sorted record that satisfies the condition.
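
The tie rule can be pictured with Python's built-in max and min, which also return the first item encountered when several items share the extreme value. This is an analogy only, not RowGen code, and the sample records are hypothetical.

```python
# Hypothetical (publisher, title, price) records in sorted order.
records = [
    ("Dell", "Title A", 30.0),
    ("Dell", "Title B", 45.0),
    ("Dell", "Title C", 45.0),   # ties with Title B for the maximum
    ("Dell", "Title D", 30.0),   # ties with Title A for the minimum
]

highest = max(records, key=lambda record: record[2])
lowest = min(records, key=lambda record: record[2])

print(highest)  # ('Dell', 'Title B', 45.0): first record with the maximum
print(lowest)   # ('Dell', 'Title A', 30.0): first record with the minimum
```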

Example 50 Maximum and Minimum

Consider the SET file, publishers.set (see Summary and Average on page 158).

In this example, max_min.rcl, RowGen produces maximum and minimum values for
prices, by publisher:


/INFILE=max_min.in
/PROCESS=RANDOM
/FIELD=(publisher,SET=publishers.set,POSITION=1,SIZE=30)
/FIELD=(price,POSITION=31,SIZE=7,NUMERIC)
/INCLUDE WHERE price > 0
/SORT
/KEY=publisher # required as a sort key when BREAK is used
/OUTFILE=max_min.out
/FIELD=(publisher,POSITION=1,SIZE=30)
/FIELD=(min_price,POSITION=32,SIZE=8.2,MILL,NUMERIC)
/FIELD=(max_price,POSITION=43,SIZE=8.2,MILL,NUMERIC)
/MIN=min_price FROM price BREAK publisher
/MAX=max_price FROM price BREAK publisher

This produces max_min.out:

Academic Press 128.53 8,475.56
Cambridge University Press 31.41 9,360.14
Dell 761.35 9,552.96
Harper-Row 1,106.95 9,651.02
Macmillan 71.23 9,941.17
Prentice Hall 564.47 9,547.70
Valley Kill 1,776.19 7,553.74

Note that the MILL option produced numeric values with a comma where expected (see
MILL on page 91).

NOTE    In this example, the aggregate records are displayed without their component
        detail records. You can add a same-name /OUTFILE section to include the
        detail records in the above report, as illustrated in Example 14 on page 31.

To also display the minimum and maximum for all publishers (that is, with no
BREAK), add the following lines to the bottom of the script:

/OUTFILE=max_min.out
/HEADREC="--------------------------------------------------\n"
/FIELD=(min_price,POSITION=32,SIZE=8.2,MILL,NUMERIC)
/FIELD=(max_price,POSITION=43,SIZE=8.2,MILL,NUMERIC)
/MIN=min_price FROM price
/MAX=max_price FROM price


This produces the following:

Academic Press 128.53 8,475.56
Cambridge University Press 31.41 9,360.14
Dell 761.35 9,552.96
Harper-Row 1,106.95 9,651.02
Macmillan 71.23 9,941.17
Prentice Hall 564.47 9,547.70
Valley Kill 1,776.19 7,553.74
--------------------------------------------------
31.41 9,941.17

In this case, the grand-total record (without a BREAK) was specified in the job script after
the sub-total records (with a BREAK), but both /OUTFILE names were the same. When
creating a script with detail records and sub-totals (and/or grand totals), the detail
records must be specified in the last same-name /OUTFILE section in the script, as
illustrated in Example 14 on page 31.

19.3 Counting Function

The syntax of a /COUNT statement is:

/COUNT=FieldX [RUNNING] [WHERE Condition1] \
       [BREAK Condition2]

FieldX will contain the count of records that satisfy Condition1. The record
containing FieldX is displayed and reset to 0 when Condition2 occurs. If no WHERE
condition is specified, the records are counted until the BREAK occurs. If no BREAK is
given, all records that satisfy Condition1 are counted and displayed at the end of the
job. See Summary and Average on page 158 for an example of using a /COUNT
statement.

19.4 Ranking

One type of statistical analysis involves assigning a sequential rank to a set of data
values. You can use RowGen to rank generated records by performing a descending-order
sort and using the /COUNT statement with the RUNNING option in the output. The
RUNNING option is described in Running (Accumulating) Aggregates on page 164.

The following RowGen syntax can be used as a template for a ranking job:

/INFILE=...
/FIELD=(data_value, ...)
...
/SORT
/KEY=(data_value,DESCENDING)
/OUTFILE=...
/FIELD=(rank, SIZE=n.0,NUMERIC)
/FIELD=(data_value, ...)
...
/COUNT=rank RUNNING WHERE data_value

The data_value is the input field on which ranking is to be performed, and this must
be the primary /KEY field for sorting (descending order for highest-value-first ranking).
The WHERE clause is required for instances when there are equal values in data_value.
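
The effect of combining a descending sort with a running count can be sketched in Python. This is an interpretation, not RowGen code. It assumes that WHERE data_value acts as the change test described in Unary Logical Expressions (Change Test), so the count advances only when the value differs from the previous record and records with equal values share a rank. The function name rank_descending and the sample values are hypothetical.

```python
def rank_descending(values):
    """Return (rank, value) pairs, highest value first.
    The rank advances only when the value changes (modelled change test)."""
    ranked = []
    rank = 0
    previous = object()   # sentinel; the first record always counts as a change
    for value in sorted(values, reverse=True):
        if value != previous:
            rank += 1
        previous = value
        ranked.append((rank, value))
    return ranked


print(rank_descending([50, 70, 70, 20]))
# [(1, 70), (1, 70), (2, 50), (3, 20)]
```

Under this assumption, omitting the WHERE clause would number every record consecutively, so two records with the same value would receive different ranks.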

Example 51 Ranking

Suppose that you want to create a report that ranks salespeople by the value of sales they
have made, with the highest sales ranking first. In the production application, the top
three ranking salespeople are to earn a bonus.

The set file, people.set, which contains names of salespeople, is used:

Jane
Sam
Frank
Melanie
Adam
Mary
Robert
John
Vanessa
Donald
Sarah
Lawrence
Laura
Roger
Keith
Paul
Jennifer
Henry
Richard

The following RowGen script, rank.rcl, generates 10 records, where names are selected
randomly from the set file, and a sales figure (in thousands of dollars) is generated for
each. The use of /COUNT with RUNNING and a descending sort key will rank the
salespeople by the value of sales they have made:


/INFILE=rank.in
/PROCESS=RANDOM
/INCOLLECT=10
/FIELD=(person,SET=people.set,POSITION=1,SIZE=11)
/FIELD=(sales,POSITION=13,SIZE=7.2,NUMERIC) # values up to 9999.99
/INCLUDE WHERE sales GE 1000 # ensures values of at least 1000
/SORT
/KEY=(sales,DESCENDING) # Highest value is ranked first
/OUTFILE=rank.out
/FIELD=(rank,POSITION=1,SIZE=2.0,NUMERIC)
/FIELD=(person,POSITION=4,SIZE=11)
/FIELD=(sales,POSITION=15,SIZE=9.2,CURRENCY)
/FIELD=(bonus,POSITION=26,IF rank < 4 THEN "Bonus" ELSE "No Bonus")
/COUNT rank RUNNING WHERE sales

Note how the generated numeric values are converted to currency, which requires an
increase in field size to accommodate the expected comma (,) and $ in the new data type
(see ASCII-Numeric Data Types on page 95).

This produces rank.out:

 1 Henry      $7,773.58  Bonus
 2 Richard    $7,344.72  Bonus
 3 Mary       $7,036.46  Bonus
 4 Henry      $6,481.81  No Bonus
 5 Sam        $6,333.45  No Bonus
 6 Jane       $5,945.12  No Bonus
 7 Roger      $5,286.56  No Bonus
 8 Jennifer   $3,167.22  No Bonus
 9 Sam        $1,776.19  No Bonus
10 Frank      $1,265.66  No Bonus

Note that the ranking was performed in descending order of sales made for each
salesperson. The top three are to receive bonuses, as specified using IF THEN ELSE
logic in the final output field (see CONDITIONAL FIELD AND DATA STATEMENTS
on page 152).
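
The logic of this ranking job can be modeled outside RowGen. The following Python sketch is only an illustration of the sort-then-count idea; it is not RowGen code, and the function name rank_records and its bonus_cutoff parameter are hypothetical. The (person, sales) pairs are copied from the first five rows of rank.out, and the handling of equal sales values (the WHERE clause) is not modeled.

```python
# Model of: descending sort on sales, a running count as the rank,
# and IF-THEN-ELSE logic for the bonus column.
# The (person, sales) pairs are copied from the first five rows of rank.out.
records = [
    ("Sam", 6333.45),
    ("Henry", 7773.58),
    ("Mary", 7036.46),
    ("Richard", 7344.72),
    ("Henry", 6481.81),
]


def rank_records(rows, bonus_cutoff=4):
    """Return (rank, person, sales, bonus) tuples, highest sales first."""
    # Corresponds to /KEY=(sales,DESCENDING)
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    ranked = []
    # enumerate() plays the role of the running count
    for rank, (person, sales) in enumerate(ordered, start=1):
        bonus = "Bonus" if rank < bonus_cutoff else "No Bonus"
        ranked.append((rank, person, sales, bonus))
    return ranked


for rank, person, sales, bonus in rank_records(records):
    print(f"{rank:>2} {person:<11}${sales:>9,.2f} {bonus}")
```

Because the rank is simply the position of each record after sorting, the bonus test can refer to it like any other field.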

19.5 Reports with Summaries

To produce a report that has detail records, any number of subtotals, and a final total in
the same output file, use the same output file name to define each type of record. For an
example of a multi-level report that uses all RowGen-supported aggregation functions,
see Report with Multiple Aggregations on page 31.

19.6 Running (Accumulating) Aggregates

For each aggregation function, you may display RUNNING aggregate values in the detail
records. To define a running summary, for example, you must add the parameter
RUNNING to the summary definition.

Example 52 Using RUNNING

Using publishers.set (see Example 49 on page 158), the following script, running.rcl,
produces a running record count, plus a running sum and average price values for each
publisher:

/INFILE=sum_avg.in
/PROCESS=RANDOM
/FIELD=(publisher,SET=publishers.set,POSITION=1,SIZE=30)
/FIELD=(price,POSITION=31,SIZE=5.2,NUMERIC)
/INCLUDE WHERE price > 0 AND price < 10
/SORT
/KEY=publisher # required as a sort key when BREAK is used
/OUTFILE=runninga.out
/FIELD=(publisher,POSITION=1,SIZE=30)
/FIELD=(ct_books,POSITION=32,SIZE=4)
/FIELD=(tot_cost,POSITION=36,SIZE=8.2,NUMERIC)
/FIELD=(avgprice,POSITION=52,SIZE=7.2,NUMERIC)
/COUNT=ct_books RUNNING BREAK publisher
/SUM=tot_cost FROM price RUNNING BREAK publisher
/AVERAGE=avgprice FROM price RUNNING BREAK publisher

This produces runninga.out:

Academic Press                    1    3.44           3.44
Academic Press                    2    5.18           2.59
Academic Press                    3   13.86           4.62
Academic Press                    4   23.50           5.88
Academic Press                    5   30.69           6.14
Academic Press                    6   33.78           5.63
Academic Press                    7   40.34           5.76
Academic Press                    8   49.88           6.24
Academic Press                    9   56.29           6.25
Academic Press                   10   62.99           6.30
Academic Press                   11   67.20           6.11
Academic Press                   12   69.47           5.79
Academic Press                   13   72.18           5.55
Academic Press                   14   76.91           5.49
Academic Press                   15   84.49           5.63
Cambridge University Press        1    7.71           7.71
Cambridge University Press        2   16.92           8.46
Cambridge University Press        3   23.34           7.78
Cambridge University Press        4   25.87           6.47
Cambridge University Press        5   28.64           5.73
Cambridge University Press        6   36.49           6.08
...

Note that all the individual records for each publisher are displayed with a running count
in the second column, followed by a running sum and a running average. Without the
RUNNING option, only one record (the record displaying the total) would be displayed
for each publisher.
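
The accumulation can be pictured with a short Python sketch. This is an illustrative model only, not RowGen code; the function name running_aggregates is hypothetical, and the prices are derived from the differences between successive running sums in the sample output above. The input is assumed to be already sorted on the BREAK field.

```python
from itertools import groupby

# Rows already sorted by publisher (the BREAK field). Prices are derived
# from the differences between successive running sums in the sample output.
rows = [
    ("Academic Press", 3.44),
    ("Academic Press", 1.74),
    ("Academic Press", 8.68),
    ("Cambridge University Press", 7.71),
    ("Cambridge University Press", 9.21),
    ("Cambridge University Press", 6.42),
]


def running_aggregates(sorted_rows):
    """Return (publisher, count, sum, average) for every detail record."""
    out = []
    # Each new publisher value acts as a BREAK and restarts the accumulators
    for publisher, group in groupby(sorted_rows, key=lambda row: row[0]):
        count = 0
        total = 0.0
        for _, price in group:
            count += 1
            total += price
            out.append((publisher, count, round(total, 2), round(total / count, 2)))
    return out


for line in running_aggregates(rows):
    print(line)
```

Every input record yields one output record here, which is the difference between a running aggregate and a plain summary that emits one record per group.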

20 SEQUENCER

SEQUENCER is an internal field that keeps a running count of records. The SEQUENCER
field can be positioned and sized, as required, and is particularly useful in database
indexing and re-loading work. It is used in the /OUTFILE section of a job script.
See ROWID on page 74 for details on assigning a unique, incrementing row number or
ID tag to a field within the /INFILE section of a job script.

When generating output with both summary BREAK and detail records, SEQUENCER can
be used in both types of records. Every time there is a BREAK record, the counter for the
detail records gets reset. SEQUENCER also displays a running count for the BREAK
records. Example 53 on page 166 demonstrates this.

The field named SEQUENCER may appear in one or more output files with the
following syntax:
/FIELD=(SEQUENCER[=[+/-]n],[field attributes])

where n is a whole number and you choose either "+" or "-" to indicate whether the
initial value is positive or negative. If no initial value is given, then the initial value is 1.
If no sign is given, then the sign is assumed to be positive.

The following are possible usages:

/FIELD=(SEQUENCER,POSITION=5,SIZE=4)
/FIELD=(SEQUENCER=0,POSITION=5,SIZE=4)
/FIELD=(SEQUENCER=-10,POSITION=5,SIZE=4)
/FIELD=(SEQUENCER=+100,POSITION=5,SIZE=4)

Example 53 Using SEQUENCER

The following example, seq.rcl, includes a SEQUENCER field for the detail records
that make up each group, and a separate group-level SEQUENCER field:

/INFILE=seq.in
/PROCESS=RANDOM
/FIELD=(code_letter1,POSITION=1,SIZE=1,alpha)
/FIELD=(code_letter2,POSITION=2,SIZE=1,alpha)
/FIELD=(code3_num,POSITION=3,SIZE=1,WHOLE)
/FIELD=(value,POSITION=6,SIZE=5.2,NUMERIC)
/OMIT WHERE code_letter1 > "C"
/OMIT WHERE code_letter2 > "C"
/INCLUDE WHERE value > 0
/INCOLLECT=15
/SORT
/KEY=(POSITION=1,SIZE=3)
/KEY=value
/OUTFILE=seq.out
/FIELD=(tot_value,POSITION=10,SIZE=6.2,NUMERIC)
/DATA=" Group "
/FIELD=(SEQUENCER=+101,SIZE=3)
/DATA="\n"
/SUM tot_value FROM value BREAK code_letter1
/OUTFILE=seq.out
/FIELD=(SEQUENCER,POSITION=1,SIZE=3)
/FIELD=(code_letter1,POSITION=5,SIZE=1)
/FIELD=(code_letter2,POSITION=6,SIZE=1)
/FIELD=(code3_num,POSITION=7,SIZE=1)
/FIELD=(value,POSITION=10,SIZE=6.2,NUMERIC)

This produces seq.out:

  1 AA7    8.08
  2 AB3   85.13
  3 AB3   93.60
  4 AB7   43.64
  5 AC4    9.83
  6 AC5   32.56
         272.84 Group 101

  1 BA2   64.73
  2 BB1    1.33
  3 BB2    2.11
  4 BB5   79.62
  5 BC3   59.21
         207.00 Group 102

  1 CA0   26.78
  2 CA5   48.63
  3 CA8   73.75
  4 CC1   67.52
         216.68 Group 103

Note that the SEQUENCER count for the 15 detail records is re-started with each new
group, and a different sequencer is applied to the groups themselves (A, B, or C). The
attribute SEQUENCER=+101 ensures that the group-level sequence begins at 101.
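
The two counters can be modeled in a few lines of Python. The sketch below is an illustration under stated assumptions, not RowGen code: the function name sequence_report and its group_start parameter are hypothetical, the column widths are approximate, and the detail rows are copied from the first two groups of seq.out.

```python
from itertools import groupby

# Detail rows (code, value) from the first two groups of seq.out, in sorted order
details = [
    ("AA7", 8.08), ("AB3", 85.13), ("AB3", 93.60),
    ("AB7", 43.64), ("AC4", 9.83), ("AC5", 32.56),
    ("BA2", 64.73), ("BB1", 1.33), ("BB2", 2.11),
    ("BB5", 79.62), ("BC3", 59.21),
]


def sequence_report(rows, group_start=101):
    """Number detail rows from 1 within each group and number groups from group_start."""
    lines = []
    group_seq = group_start
    # The first code letter is the BREAK field
    for _, group in groupby(rows, key=lambda row: row[0][0]):
        total = 0.0
        # The detail counter restarts at 1 for every group
        for detail_seq, (code, value) in enumerate(group, start=1):
            total += value
            lines.append(f"{detail_seq:>3} {code} {value:>7.2f}")
        # The group counter keeps running across BREAK records
        lines.append(f"{total:>15.2f} Group {group_seq}")
        group_seq += 1
    return lines


print("\n".join(sequence_report(details)))
```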

21 OUTPUT OPTIONS

RowGen provides several horizontal record filtering statements for managing the size,
number, and flow of records.

This section describes the statements relating to the output phase of RowGen record
management:

• /CREATE
• /APPEND
• /HEADREC on page 170
• /FOOTREC on page 171
• /RECSPERPAGE on page 172
• /OUTSKIP on page 173
• MISCELLANEOUS OPTIONS on page 174.

/OUTCOLLECT, another output option, controls the number of records that are produced
on a per-outfile basis. This is described in /OUTCOLLECT on page 59.

/CREATE

This is the default specification for an output target. It indicates that a new output file
will be created. If the file name already exists, all previous data in the file will be lost,
even if nothing is written by this job.

The syntax is:

/OUTFILE=output_filename
/CREATE

/APPEND

You can associate an /APPEND with any output to cause output data to be placed
directly after the existing data in a file. If the file does not exist or is empty, /CREATE
will be invoked.

The syntax is:


/OUTFILE=output_filename
/APPEND
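
The difference between the two modes resembles the truncate and append modes for opening a file in most programming languages. The following Python sketch is an analogy only and does not show how RowGen opens its targets; the function name write_output and the file name demo.out are hypothetical.

```python
import os
import tempfile


def write_output(path, records, append=False):
    """Write records to path, truncating (like /CREATE) or appending (like /APPEND)."""
    mode = "a" if append else "w"  # "a" also creates the file when it is missing
    with open(path, mode) as handle:
        handle.writelines(f"{record}\n" for record in records)


path = os.path.join(tempfile.mkdtemp(), "demo.out")  # hypothetical output target

write_output(path, ["first run"])                 # create
write_output(path, ["second run"], append=True)   # existing data is kept
with open(path) as handle:
    after_append = handle.read().splitlines()

write_output(path, [])                            # create again, nothing written
size_after_create = os.path.getsize(path)

print(after_append)        # both runs are present
print(size_after_create)   # previous data is gone
```

The last call shows why /CREATE deserves care: opening the target for creation discards the previous contents even when the job writes no records.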

/HEADREC

This statement creates a customized header record in the output file (report). See
/RECSPERPAGE on page 172 for details on making the header record appear on every
page of output.

The syntax is:

/HEADREC="character string with format for each embedded \
variable (format control characters) ..."[, var1, var2, ...]

The character string can be a constant which may contain any combination of
internal variables and control (escape) characters recognized by RowGen (see Table 7
on page 131 and Table 8 on page 132).

NOTE: A character string can also contain syntax specific to a mark-up language, such
as HTML, provided the browser (or other utility) you use to read the output file
accepts those statements. For an example of producing an HTML report, see
Example 15 on page 33.

Some examples of using the /HEADREC statement are:

/HEADREC="The Monthly Report\n"


/HEADREC="%s Sales Report\n",CURRENT_DATE

where \n indicates a new line, and %s is the format for the variable CURRENT_DATE.
See Table 7 on page 131 for a list of accepted variables.

NOTE: The \n is needed to cause a line-feed. Without it, the first record will display
immediately after the header on the same line. Also, the variables must be
listed in the order in which they will appear in the string.

Table 14 lists the display formats that RowGen supports for customizing output reports.

Table 14: Format Control Characters for Variables

Character   Printed Result
%c          character
%d, %i      decimal integer
%u          unsigned decimal integer
%o          unsigned octal integer
%x, %X      unsigned hexadecimal integer
%e          floating point number, such as 4.321e+00
%E          floating point number, such as 4.321E+00
%f          floating point number, such as 4.321
%g          either e-format or f-format, whichever is shorter
%G          either E-format or f-format, whichever is shorter
%s          as a string
%%          writes a single % to the output stream; no argument is
            converted
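
These characters follow the familiar printf conventions, so their effect can be previewed with any printf-style formatter. The sketch below uses Python's % operator purely as a stand-in; it is an assumption that RowGen produces the same text, and the precision syntax (.3) is a Python/C convention that this guide does not confirm for RowGen. The function name render, the date string, and the user name jdoe are hypothetical placeholders for values that RowGen would supply from its internal variables.

```python
def render(template, *variables):
    """Substitute variables into template, in order, using printf-style conversion."""
    return template % variables


examples = {
    "%c": render("%c", "A"),
    "%d": render("%d", 42),
    "%u": render("%u", 42),
    "%o": render("%o", 8),
    "%x": render("%x", 255),
    "%X": render("%X", 255),
    "%.3e": render("%.3e", 4.321),
    "%.3E": render("%.3E", 4.321),
    "%.3f": render("%.3f", 4.321),
    "%g": render("%g", 4.321),
    "%%": render("%d%%", 50),
}

# Variables are consumed in the order in which they are listed
header = render("%s Sales Report\n", "2013-03-01")
footer = render("generated by %s [%d] -------\n\f", "jdoe", 3)

for spec, text in examples.items():
    print(f"{spec:<5} -> {text}")
print(repr(header))
print(repr(footer))
```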

/FOOTREC

This statement uses the same syntax as /HEADREC but its string values appear at the
bottom of the output data as a footer (see /HEADREC on page 170). See
/RECSPERPAGE on page 172 for details on making the footer record appear on every
page of output.

The syntax is:

/FOOTREC="character string" [, var1, var2, ...]

The character string can be a constant which may contain any combination of
internal variables, control (escape) characters, and conversion-specifier characters
recognized by RowGen (see Table 7 on page 131, Table 8 on page 132, and
Table 9 on page 133).

NOTE: A character string can also contain syntax specific to a mark-up language, such
as HTML, provided the browser (or other utility) you use to read the output file
accepts that syntax. For an example of producing a custom HTML report, see
Example 15 on page 33.

An example of using the /FOOTREC statement is:

/FOOTREC="generated by %s [%d] -------\n\f",\
USER,PAGE_NUMBER

/RECSPERPAGE

This statement sets the number of records displayed on each page of a RowGen output
report. It is used in conjunction with /FOOTREC and/or /HEADREC. If /RECSPERPAGE
is specified, the /HEADREC header and/or /FOOTREC footer will output on each page.
If the /RECSPERPAGE statement is not given, the header and/or footer will only be
displayed once (at the start and end of the file, respectively).

For example, with an explicit value:

/RECSPERPAGE=10

the header and/or footer will be repeated after every 10 records have been displayed or
written, provided /HEADREC and/or /FOOTREC are defined.

NOTE: Using /RECSPERPAGE does not actually cause a page break; it designates the
number of records to be printed before printing the footer and header records
again. In order to force a page break, a "\f" should be included in the
/FOOTREC statement.

Specifying /RECSPERPAGE without a value instructs RowGen to read the user's
environment variable, LINES, and use it as the value of RECSPERPAGE. For example,
if LINES is set to 25, the header and/or footer will be output every twenty-five
records. In this case, the statement appears in the script as /RECSPERPAGE.

For example, the following report is produced from a job script where
/RECSPERPAGE=5 was specified on output, and where a total of seven records were
generated by RowGen (note that a character SET file was used):

May/29/2007 INVENTORY REPORT

Titles           Qty   Price

Map Reader       200   14.95
Murder Plots     160    5.90
People Please     75   11.50
Pressure Cook    228    9.95
Reasoning For    150   10.25

Page 1

May/29/2007 INVENTORY REPORT

Titles           Qty   Price

Sending Your     130   15.75
Still There       80   13.05

Page 2
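
The paging behavior shown in this report can be sketched as follows. This Python model is an illustration only, not RowGen code; the function name paginate and its parameters are hypothetical, and the header and footer are reduced to single lines. The titles are taken from the sample report above.

```python
def paginate(records, recs_per_page, header, footer):
    """Return report lines with the header and footer repeated every recs_per_page records."""
    lines = []
    for start in range(0, len(records), recs_per_page):
        page_number = start // recs_per_page + 1
        lines.append(header)
        lines.extend(records[start:start + recs_per_page])
        # The footer template receives the page number, like PAGE_NUMBER with %d
        lines.append(footer % page_number)
    return lines


titles = [
    "Map Reader", "Murder Plots", "People Please", "Pressure Cook",
    "Reasoning For", "Sending Your", "Still There",
]

report = paginate(titles, 5, "INVENTORY REPORT", "Page %d")
print("\n".join(report))
```

With seven records and five records per page, the header and footer each appear twice, and the second page holds the remaining two records.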

/OUTSKIP

This statement skips the first n processed records on output that satisfy any
previous /INCLUDE or /OMIT selection criteria. The syntax is:

/OUTSKIP=n

where the first n sorted/processed records will be excluded from the output
target to which the /OUTSKIP statement applies.
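
In effect, /OUTSKIP behaves like a slice that starts at record n + 1. A minimal Python sketch of that idea follows; it is not RowGen code, and the function name outskip is hypothetical.

```python
from itertools import islice


def outskip(records, n):
    """Drop the first n records and pass the rest through unchanged."""
    return list(islice(records, n, None))


print(outskip(["rec1", "rec2", "rec3", "rec4"], 2))
```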

22 MISCELLANEOUS OPTIONS

This section describes the various RowGen statements relating to neither input nor
output:

• /RC
• /EXECUTE
• /MONITOR on page 174
• Runtime Warnings (/WARNINGSON and /WARNINGSOFF) on page 177
• /ROUNDING on page 177.

22.1 /RC

This statement allows the user to turn off the generation of output from RowGen so that
the user may

• test for errors and warnings during the debugging of a specification file
• check tuner settings being used from rowgenrc files and /MEMORY-WORK
(see Using Customized Resource Control Files on page 209).

The /RC statement may be used anywhere in the specification file.

22.2 /EXECUTE

This statement causes the operating system shell to execute the specified command.
It can be placed anywhere in the job script. The syntax is:

/EXECUTE="command_statement"

An example is:

/EXECUTE "echo 'ROWGEN began `date`' >> joblog"
/INFILE=.....

When the above was executed, the following was echoed to the file joblog:

ROWGEN began Thu Feb 19 18:52:33 EST 2004

22.3 /MONITOR

This statement allows you to monitor the progress of each RowGen job by setting a
level of runtime server messages that will be reported through stderr. Important events
such as job start and stop, file opens and closes, and record throughput can each be
reported with a timestamp. The syntax of the statement is:

/MONITOR=n

where n is a whole number greater than or equal to 0 and less than 16. The statement can
appear anywhere in the job script. The MONITOR_LEVEL can also be set in the rowgenrc
file or Windows registry (see MONITOR_LEVEL level number on page 216).

Below is a chart describing the levels:

Level   Description

0       No monitoring.
1       Show job initiation and job completion. This includes the job
        itself, the sort processes, and the merge processes.
2       Includes Level 1 plus the opening and closing of output files.
3-9     Includes Level 2 plus the opening and closing of temporary
        files. Each progressive level will show the number of records
        processed with an increasing degree of frequency:
          3   every 1,000,000 records
          4   every 100,000 records
          5   every 10,000 records
          6   every 1,000 records
          7   every 100 records
          8   every 10 records
          9   every 1 record
10-14   Undefined.
15      Use the monitor value in the rowgenrc file.

WARNING! Setting the monitor verbosity level higher than 2 may
degrade job performance. The higher the level, the greater the
impact.
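
The record-count frequencies for levels 3 through 9 follow a power-of-ten pattern. The Python sketch below encodes the chart above for illustration; it is not part of RowGen, and the function name report_interval is hypothetical.

```python
def report_interval(level):
    """Return the number of records between progress messages, or None."""
    if not 0 <= level < 16:
        raise ValueError("monitor level must be between 0 and 15")
    if 3 <= level <= 9:
        # Level 3 reports every 1,000,000 records; level 9 reports every record
        return 10 ** (9 - level)
    # Levels 0-2 report no record counts, 10-14 are undefined,
    # and 15 defers to the value in the rowgenrc file
    return None


for level in (1, 3, 5, 9):
    print(level, report_interval(level))
```

Each step up in level makes record-count messages ten times more frequent, which is consistent with the warning that higher levels have a greater impact on performance.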

The following are examples of running the same RowGen job at different monitor level
settings. This job creates, and sorts, a 500MB file.

Level 1
C:\RowGen>rowgen /spec=gen.rcl /monitor=1

RowGen Version 2.1.1 D91071030-1045 Copyright 2005-2008 IRI, Inc. www.cosort.com

EDT 10:50:44 Fri Feb 01 2008. #99999.9999 2 CPU Monitor Level 1


<00:00:00.00> event (57): /spec=gen.rcl initiated
<00:00:00.00> event (66): cosort() process begins
<00:01:17.09> event (67): cosort() process ends
<00:01:17.14> event (58): /spec=gen.rcl completed
EDT 10:52:01 RowGen Serial # 99999.9999 2 CPUs Non-expiring

Level 2

C:\RowGen>rowgen /spec=gen.rcl /monitor=2

RowGen Version 2.1.1 D91071030-1045 Copyright 2005-2008 IRI, Inc. www.cosort.com

EDT 10:52:24 Fri Feb 01 2008. #99999.9999 2 CPU Monitor Level 2


<00:00:00.00> event (57): /spec=gen.rcl initiated
<00:00:00.02> event (66): cosort() process begins
<00:00:47.86> event (61): onegb outfile opened
<00:01:13.64> event (67): cosort() process ends
<00:01:13.64> event (62): onegb outfile closed
<00:01:13.66> event (58): /spec=gen.rcl completed
EDT 10:53:37 RowGen Serial # 99999.9999 2 CPUs Non-expiring

Level 3

Because this sort requires the creation of temporary files, the lines pertaining to
workfiles are displayed, which shows that the temporary files are being deleted as they
are merged together.

Below is the final output that /MONITOR will display at this level:

C:\RowGen>rowgen /spec=gen.rcl /monitor=3

RowGen Version 2.1.1 D91071030-1045 Copyright 2005-2008 IRI, Inc. www.cosort.com

EDT 10:53:54 Fri Feb 01 2008. #99999.9999 2 CPU Monitor Level 3


<00:00:00.00> event (57): /spec=gen.rcl initiated
<00:00:00.00> event (66): cosort() process begins
<00:00:00.14> event (63): .\CS00.5c0.9e0.395b28 workfile opened
<00:00:00.16> event (63): .\CS01.5c0.9e0.395b28 workfile opened
<00:00:47.11> event (61): onegb outfile opened
<00:01:18.94> event (64): .\CS00.5c0.9e0.395b28 workfile deleted
<00:01:18.98> event (64): .\CS01.5c0.9e0.395b28 workfile deleted
<00:01:19.00> event (67): cosort() process ends
<00:01:19.00> event (62): onegb outfile closed
<00:01:19.03> event (58): /spec=gen.rcl completed
EDT 10:55:13 RowGen Serial # 99999.9999 2 CPUs Non-expiring

Level 5

When generating a large file, you will see, at this level, when every 10,000 records are
processed as the job is running:

C:\RowGen>rowgen /spec=gen.rcl /monitor=5

RowGen Version 2.1.1 D91071030-1045 Copyright 2005-2008 IRI, Inc. www.cosort.com

EDT 11:04:46 Fri Feb 01 2008. #99999.9999 2 CPU Monitor Level 5


<00:00:00.00> event (57): /spec=gen.rcl initiated
<00:00:00.01> event (66): cosort() process begins
<00:00:00.14> event (63): .\CS00.b58.fec.395b28 workfile opened
<00:00:00.16> event (63): .\CS01.b58.fec.395b28 workfile opened
<00:00:02.17> event (65): µ1, 520000 processed
. . .

If there are criteria for accepting or rejecting records, you will periodically see the
accepted and rejected line as the job is running:

RowGen Version 2.1.1 D91071030-1045 Copyright 2005-2008 IRI, Inc. www.cosort.com

EDT 11:05:55 Fri Feb 01 2008. #99999.9999 2 CPU Monitor Level 5


<00:00:00.00> event (57): /spec=gen.rcl initiated
<00:00:00.00> event (66): cosort() process begins
<00:00:00.26> event (63): .\CS00.31c.ed4.395b28 workfile opened
<00:00:00.28> event (63): .\CS01.31c.ed4.395b28 workfile opened
<00:00:02.95> event (69): accepted 640259 rejected 799741
. . .

NOTE: When running a job without sorting (using /REPORT), the Accepted
display reflects the number of rows being generated / processed.

22.4 Runtime Warnings (/WARNINGSON and /WARNINGSOFF)

The command /WARNINGSON captures warning messages which do not stop execution,
but do indicate where some steps may have been omitted. /WARNINGSOFF is the default
and turns off the output of the warning messages.

By including /WARNINGSON, RowGen displays warning messages via stderr.

22.5 /ROUNDING

Different hardware systems produce different results when rounding the results of
arithmetic operations. These differences occur in the 12th or 13th decimal digit, but
can become apparent after truncation.

The /ROUNDING statement, which should be placed at or near the top of your RowGen
script, controls how rounding is performed throughout the execution of the job. One of
two options is available:

/ROUNDING=SYSTEM    The default. This produces results in the manner determined by
                    your operating system.

/ROUNDING=NEAREST   For display/output purposes (only) for the results of an
                    arithmetic operation, this option causes half a decimal digit
                    to be added to the least significant digit. For example, if an
                    actual result is .794999, the NEAREST option will cause a
                    display of .80 when precision is specified as 2.

WARNING!
The /ROUNDING=NEAREST option slows numeric output. If your
calculations do not require this level of accuracy, or if your
system rounds in your preferred manner by default, then
/ROUNDING=NEAREST should not be specified.

23 USING SEEDS

It is possible to control the sequence of random value generation with RowGen,
enabling you to control the starting point for the randomization sequence, and/or to
duplicate output results every time the same job is run.

NOTE: If you do not specify a seed, RowGen will use a random seed value each time
a job is run.

RowGen uses an optimized implementation of the Mersenne Twister
pseudo-random number generator.

The syntax for using your own seed is as follows:

/SEED=seed_value

where seed_value is an integer between 0 and 65535, such as /SEED=9876. You can
also use an environment variable such as /SEED=$value where value is equal to the
desired seed value.

Example 54 Using a Starting Seed

The following job script, seeded.rcl, uses the seed value 12345.

/INFILE=seeded.in
/PROCESS=RANDOM
/SEED=12345 # use a custom start seed
/FIELD=(string,POSITION=1,SIZE=5,alpha)
/INCOLLECT=7
/REPORT
/OUTFILE=seeded.out

This produces the file seeded.out, for example:

VzBaQ
DrGSq
zJUxW
JDISk
QgZfc
bHlYz
MYbEl

Record generation was initiated with the seed value 12345. RowGen will produce
identical results every time this job is run.
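
The same reproducibility can be observed with any seeded pseudo-random generator. The Python sketch below is an analogy only: CPython's random module is also based on the Mersenne Twister, but its output will not match RowGen's, so no expected strings are shown. The function name generate and its parameters are hypothetical.

```python
import random
import string


def generate(seed, count=7, size=5):
    """Return count random alphabetic strings of length size for a given seed."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [
        "".join(rng.choice(string.ascii_letters) for _ in range(size))
        for _ in range(count)
    ]


first = generate(12345)
second = generate(12345)

print(first)
print(first == second)  # the same seed yields the same sequence
```

Changing the seed changes the whole sequence, while rerunning with the same seed replays it exactly; this is what makes seeded test data repeatable.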

Index

Symbols /WARNINGSOFF 177


/WARNINGSON 177
!< 140 \’ 132
!= 140 \" 132
!> 140 \\ 132
/APPEND 169 \0 132
/AUDIT 218 \a 132
/AVERAGE 157 \b 132
/COUNT 161 \f 132, 172
/CREATE 169 \n 132, 171
/DATA 133 \r 132, 171
defined 129 \t 132
/DUPLICATESONLY 138 \v 132
/EXECUTE 174 # 49
/FIELD 73 %% 171
/FOOTREC 132 %c 170
defined 171 %d 170
/HEADREC 132 %E 171
defined 170 %e 171
/INCLUDE %f 171
defined 146 %G 171
/INCOLLECT 59 %g 171
/INFILE 56 %i 170
/INFILES 56 %o 171
/INREC %s 171
defined 126 %u 170
/KEY 135 %X 171
/LENGTH 64 %x 171
/MAXIMUM 159 < 140
/MEMORY-WORK 60, 174 <= 140
/MINIMUM 159 == 140
/NODUPLICATES > 140
defined 138 >= 140
/OUTCOLLECT 59 $ 49, 132
/OUTFILE 58
/OUTSKIP 173
/PROCESS 63 A
/RC 174
/RECSPERPAGE 172 abbreviations 51
/REPORT 57 abs (x) 124
/ROUNDING 177 accumulating aggregates 164
/SEED 179 acos (x) 124
/SORT 57 action statements 56
/STABLE 138 /REPORT 57
/STATISTICS 69 /SORT 57
/SUM 157 addition 123
/VERSION 52 aggregation. See summary functions.
alert 132

RowGen RowGen Control Language Index 181


Index

alignment in keys 137 command-line execution 52


alignment of fields 90 commas 86, 91
AMERICAN_DATE 131 comments 49
AMERICAN_TIME 131 common input record 126
AMERICAN_TIMESTAMP 131 compound conditions 145
appending to an output file 169 compound logical expressions 144
arithmetic symbols conditional /DATA statements 152
-, subtraction 123 IF 152
(), parentheses 123 IF-THEN-ELSE logic 152
*, multiplication 123 THEN 152
+, addition 123 conditional /FIELD statements 152
unary 123 IF 152
ASCENDING 136 IF-THEN-ELSE logic 152
ASCII 133 THEN 152
ASCII options conditions 139
alignment 137 binary 140
case 137 compound 144, 145
in keys 137 explicit (named) 139
asin (x) 124 function compares 141
atan (x) 124 implicit 139
atan2 (x,y) 124 syntax 139
auditing 70, 218 unary 140
WHERE 139, 146
with /COUNT 161
B with /MAXIMUM and /MINIMUM
159
backslash 132 with /SUM and /AVERAGE 157
backspace 132 with include-omit 146
batch script execution 52 continuation character 49
Bessel functions 124 control characters 132
binary logical expressions 140 \’, single quote 132
binary NULL 92 \", double quote 132
BREAK 139, 156 \\, backslash 132
with /COUNT 161 \0, NULL 132
with /MAXIMUM and /MINIMUM 159 \a, alert 132
with /SUM and /AVERAGE 157 \b, backspace 132
building conditions 145 \f, form feed 132
\n, newline 132
\r, carriage return 132
C
\t, horizontal tab 132
\v, vertical tab 132
carriage return 132
$, dollar sign 132
case-sensitive 50, 137
conventions 49
ceil (x) 125
conversion specifiers 133
change test 140
ASCII 133
character SET files 111
EBCDIC 133
columns. See fields, POSITION.
hexadecimal 133
command line 194

182 RowGen Control Language Index RowGen


Index

within a /DATA statement 134 E


within a condition 134
conversion. See data-type conversion. EBCDIC 133
cos (x) 124 ELF file format 69
cosh (x) 124 elf2ddf 69
COSORT_TUNER 60 ELSE 152
cosort.rc 60 environment variables 49
cosortrc 60 EQ 140
count 161 errors 174
creating an output file 169 escape characters. See control characters.
cross-calculation 123 ETL 47
CSV file format 66 EUROPEAN_DATE 131
csv2ddf 66 EUROPEAN_TIME 131
CT 140 EUROPEAN_TIMESTAMP 131
CURRENCY 91 evaluation of include-omit 147
CURRENCY, MONEY 91 evaluation order of expressions 144
CURRENT_DATE 131 examples
CURRENT_TIME 131 creating an HTML-formatted output file
CURRENT_TIMESTAMP 131 33
CURRENT_TIMEZONE 131 execution 52, 203
exp (x) 124
explicit conditions 139
D expressions
binary logical 140
data types 93 compound logical 144
CURRENCY 91 unary logical 140
CURRENCY, MONEY 91 Extended Log Format. See ELF.
dates 131 Extraction, Transformation, and Loading.
IP_ADDRESS 94 See ETL.
timestamps 131
WHOLE_NUMBER 95
date SET files and ranges 115 F
date SET files, ranges 115
dates 131 Fast Access Table (FAT) 50
decimal point 82 field expressions 123
decimal precision 84 field name references in keys 135
delimiter 86 fields 73
DESCENDING 136 alignment 90
direction in keys 136 data types 93
ASCENDING 136 FILL 92
DESCENDING 136 FRAME 88
dollar sign 132 MILL 91
double quote 132 naming 74
duplicates only 138 POSITION 79
SEPARATOR 86
SIZE
substrings 81, 84
syntax 73

RowGen RowGen Control Language Index 183


Index

file formats 63 GE 140


CSV 66 GT 140
ELF 69
LDIF 66
LINE_SEQUENTIAL 65 H
MFVL_LARGE 64
MFVL_SMALL 64 header record 170
ODBC 67 hexadecimal 133
RECORD 63 horizontal tab 132
UNIBF 66 HTML 33, 129, 170, 171
VARIABLE_SEQUENTIAL 65 HTML-format output file example 33
XML 67 hypot (x,y) 125
files 60–70
job specification 61 I
resource control 60
specification 61 IF 152
statistics 69 IF-THEN-ELSE logic 152
FILL 92 implicit conditions 139
fill characters in fields 92 include-omit 146
fixed-length records 63 evaluation 147
floating-point precision 124 internal variables 131
floor (x) 125 AMERICAN_DATE 131
footer record 171 AMERICAN_TIME 131
format control characters 170 AMERICAN_TIMESTAMP 131
%%, to write % to output 171 CURRENT_DATE 131
%c, character 170 CURRENT_TIME 131
%d, %i, decimal integer 170 CURRENT_TIMESTAMP 131
%e, %E, %f, floating point number 171 CURRENT_TIMEZONE 131
%G, E-format or F-format 171 EUROPEAN_DATE 131
%g, e-format or f-format 171 EUROPEAN_TIME 131
%o, unsigned octal integer 171 EUROPEAN_TIMESTAMP 131
%s, as a string 171 ISO_DATE 131
%u, unsigned decimal integer 170
ISO_TIME 131
%x, %X, unsigned hexadecimal integer ISO_TIMESTAMP 131
171 JAPANESE_DATE 131
form-feed 132 JAPANESE_TIME 131
FRAME 88 JAPANESE_TIMESTAMP 131
framed characters in fields 88 PAGE_NUMBER 131
framing 88 SYSDATE 131
FROM USER 131
with /MAXIMUM and /MINIMUM 159 IP_ADDRESS 94
with /SUM and /AVERAGE 157 iscompares 141
function compares in conditions 141 isalpha 141
isalphadigit 141
G isascii 141
iscntrl 141
gamma (x) 125 isdigit 141

184 RowGen Control Language Index RowGen


Index

isebcalpha 142 LINES 172


isebcdigit 142 literal SETs 122
isempty 142 log (x) 125
isgraph 141 log10 (x) 125
isholding 142 logical expressions 139
islower 141 binary 140
isnumeric 142 compound 144
ispacked 142 evaluation order 144
ispattern 142 unary 140
isprint 141 look-up tables 119
ispunct 141 lowercase 50, 137
isspace 141 LT 140
isupper 142
isxdigit 142
ISO_DATE 131 M
ISO_TIME 131
ISO_TIMESTAMP 131 mark-up language 33, 129
mathematical functions 124
abs (x) 124
J acos (x) 124
asin (x) 124
JAPANESE_DATE 131 atan (x) 124
JAPANESE_TIME 131 atan2 (x,y) 124
JAPANESE_TIMESTAMP 131 Bessel functions 124
job specification file 61 ceil (x) 125
cos (x) 124
cosh (x) 124
K exp (x) 124
floor (x) 125
keys 135 gamma (x) 125
syntax 135 hypot (x,y) 125
ASCII options log (x) 125
alignment 137 log10 (x) 125
case 137 mod (x,y) 125
direction 136 pow (x,y) 125
field name reference 135 sin (x) 125
no duplicates 138 sinh (x) 124
unnamed reference 136 sqrt (x) 125
tan (x) 125
tanh (x) 125
L
MAX_SIZE 83
maximum 159
LDIF file format 66
maximum field size 83
LE 140
MFVL_LARGE file format 64
left align 90
MFVL_SMALL file format 64
length 64
MILL 91
line continuation 49
mill option in fields 91
LINE_SEQUENTIAL file format 65
MIN_SIZE 83

RowGen RowGen Control Language Index 185


Index

minimum 159 R
minimum field size 83
miscellaneous options 174 ranking 161
mod (x,y) 125 RECORD file format 63
MONITOR_LEVEL 175 record filters 169, 174
multi-column SET files 119 RECORD_SEQUENTIAL file format 63
multiplication 123 records
common input 126
fixed-length 63
N footer 171
format control characters 170
named conditions 139 header 170
named keys 135 length 64
naming conventions 50 selection 146
naming fields 74 variable-length 63
NC 140 records per page 172
NE 140 registry 60, 175
newline 132 relational operators
no duplicates 138 CT 140
NT File System. See NTFS. EQ, == 140
NTFS 50 GE, >=, !< 140
NULL 92, 132 GT, > 140
numeric SET files and ranges 113 LE, <=, !> 140
numeric SET files, ranges 113 list of 140
LT, < 140
NC 140
O
NE, != 140
relational SET files 119
ODBC file format 67
reports with summaries 163
ODS 47
resource control file 60
OMIT
resource controls
defined 146
search order 211
Operational Data Store. See ODS.
right align 90
optional statements 49
row ID 74
optional values 49
RowGen tools
csv2ddf 66
P elf2ddf 69
ROWID 74
PAGE_NUMBER 131, 171 rows. See records.
pagebreak 172 RUNNING
parentheses in arithmetic 123 with aggregates 156, 164
pattern matching 142 running aggregates 164
PERMUTE 59 runtime statistics 69
POSITION 79 runtime warnings 177
pow (x,y) 125
precision 84
priority of resource control settings 211

186 RowGen Control Language Index RowGen


Index

S /LENGTH 64
/MAXIMUM 159
search order for resource controls 211 /MEMORY-WORK 174
seeds 179 /MINIMUM 159
SEPARATOR 86 /NODUPLICATES
separator characters in fields 86 defined 138
SEQUENCER 166 /OMIT
SEQUENCER. See also RUNNING with defined 146
/COUNT. /OUTCOLLECT 59
SET files 77 /OUTFILE 58
character SET files 111 /OUTSKIP 173
date SET files and ranges 115 /PROCESS 63
literal SETs 122 /RC 174
numeric SET files and ranges 113 /RECSPERPAGE 172
relational SET files 119 /REPORT 57
SET source 111 /ROUNDING 177
SGML 34 /SEED 179
sin (x) 125 /SORT 57
single quote 132 /STABLE 138
sinh (x) 124 /STATISTICS 69
size of fields /SUM 157
substrings 81, 84 /VERSION 52
specification file 61 /WARNINGSOFF 177
sqrt (x) 125 /WARNINGSON 177
statements statistical analysis
/APPEND 169 ranking 161
/AUDIT 218 statistics file 69
/AVERAGE 157 stderr 174
/COUNT 161 stdout 51, 177
/CREATE 169 substrings 81, 84
/DATA 133 subtraction 123
defined 129 summary functions 156
/DUPLICATESONLY 138 /AVERAGE 157
/EXECUTE 174 /COUNT 161
/FIELD 73 /MAXIMUM 159
/FOOTREC 132 /MINIMUM 159
defined 171 /SUM 157
/HEADREC 132 BREAK 139, 156
defined 170 with /COUNT 161
/INCLUDE with /MAXIMUM and /MINIMUM
defined 146 159
/INCOLLECT 59 with /SUM and /AVERAGE 157
/INFILE 56 FROM 157, 159
/INFILES 56 RUNNING 156, 164
/INREC summary reports 163
defined 126 SYSDATE 131
/KEY 135

RowGen RowGen Control Language Index 187


Index

table look-ups 119


tan (x) 125
tanh (x) 125
THEN 152
timestamp SET files, ranges 117
timestamps 131
tuning 60

unary logical expressions 140


unary operators 123
UNIBF file format 66
unnamed references in keys 136
uppercase 50, 137
USER 131, 171

VARIABLE_SEQUENTIAL file format 65


variable-length records 63
vertical tab 132

warnings 174, 177


web-ready report. See HTML.
WHERE 139, 146
with /COUNT 161
with /MAXIMUM and /MINIMUM 159
with /SUM and /AVERAGE 157
with include-omit 146
WHOLE_NUMBER 95
Windows registry 60, 175

XML 34
XML file format 67

ROWGEN TOOLS

This chapter discusses the additional tools that RowGen provides to facilitate the
creation of field layouts. It contains the following sub-chapters:

• cob2ddf on page 193
• csv2ddf on page 195
• elf2ddf on page 199
• ctl2ddf on page 203.

These command line programs parse COBOL copybooks, comma-separated
values files, extended web logs, and SQL*Loader control file layouts
(respectively) to generate RowGen data definitions for use within RowGen jobs
(see Data Definition Files on page 62).

cob2ddf

1 PURPOSE

RowGen offers a metadata translation program for users with input data from a COBOL
application who want to convert the record (copybook) layouts into RowGen data
definition files. The cob2ddf (COBOL-to-data definition file) program, located in the
directory $RowGen11/bin on UNIX (\RowGen11\bin on Windows), produces
descriptive file name and field-layout text that can be referenced by, or pasted directly
into, a RowGen job specification script (see Data Definition Files on page 62). The
RowGen language allows you to generate and produce multiple, differently formatted
output files and structured reports, while also replicating some of the data
transformation and reformatting capabilities of COBOL and legacy sort programs.

Currently, cob2ddf does not convert the entire range of COBOL data-definition
functionality, but it provides a convenient way to convert field descriptions. See the
Micro Focus COBOL Language Reference for documentation on the data description
portion of COBOL programs.

2 USAGE

The syntax is:

cob2ddf filename.cbl > filename.ddf

To execute cob2ddf and create a RowGen data definition file, enter:

cob2ddf filename.cbl > RowGen_ddf

where filename.cbl is the name of the copybook or file description file, and
RowGen_ddf is the resulting RowGen data definition file. For example, the
command:

cob2ddf cpybk.cbl > cpybk.ddf

converts the file and field descriptions in cpybk.cbl to a RowGen data definition file
called cpybk.ddf. For details on how to reference a .ddf file from within a RowGen job
script, see Data Definition Files on page 62.

3 EXAMPLE

The following is a COBOL copybook file, cob.app:

01 REG
   05 PLANT        PIC X(08).
   05 FREE         PIC 9(10).
   05 CLIENT       PIC X(09).
   05 CARRIER-18   PIC 9(12).
   05 CARRIER-23   PIC 9(12).
   05 INTEREST     PIC S9(11) sign is leading
                   separate character.
   05 CARGOES      PIC S9(12) sign is leading
                   separate character.

To create the RowGen equivalent, enter the following on the command line:

cob2ddf cob.app > cob.ddf

The resultant RowGen data definition file, cob.ddf, is:

/FIELD=(PLANT, POSITION=1, SIZE=8)
/FIELD=(FREE, POSITION=9, SIZE=10, NUMERIC)
/FIELD=(CLIENT, POSITION=19, SIZE=9)
/FIELD=(CARRIER_18, POSITION=28, SIZE=12, NUMERIC)
/FIELD=(CARRIER_23, POSITION=40, SIZE=12, NUMERIC)
/FIELD=(INTEREST, POSITION=52, SIZE=12, MF_DISPSLS)
/FIELD=(CARGOES, POSITION=64, SIZE=13, MF_DISPSLS)

This file gives the record layout (field definitions) for a file that can be used in the input
and/or output section of a RowGen job specification script. If using the data definition
file in this manner, then the following statement needs to be placed at the top of your
RowGen job specification file:

/SPECIFICATION=cob.ddf

You may also copy the field definitions into the job specification file.
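
The POSITION and SIZE values in the listing follow directly from the PIC clauses: each field starts where the previous one ends, and a separate sign character adds one byte. The Python sketch below illustrates that arithmetic only. It is not the cob2ddf implementation; the function name layout is hypothetical, and it assumes DISPLAY usage and that every signed field is declared with a separate sign character, as in the example above.

```python
import re

# Hypothetical helper, not part of RowGen: derive POSITION and SIZE from
# simple PIC X(n), PIC 9(n), and PIC S9(n) clauses (DISPLAY usage assumed).
PIC_RE = re.compile(r"PIC\s+(S?)([X9])\((\d+)\)", re.IGNORECASE)

def layout(copybook_lines):
    """Return (name, position, size) for every elementary item."""
    fields, position = [], 1
    for line in copybook_lines:
        match = PIC_RE.search(line)
        if not match:
            continue  # group items and continuation lines carry no PIC clause
        name = line.split()[1].replace("-", "_")
        signed, _kind, digits = match.groups()
        # Assumption: a signed field stores its sign as a separate character.
        size = int(digits) + (1 if signed else 0)
        fields.append((name, position, size))
        position += size
    return fields
```

Applied to cob.app, this reproduces the positions and sizes shown in the listing, for example INTEREST at position 52 with a size of 12.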

csv2ddf

1 PURPOSE

csv2ddf (comma separated values-to-data definition file) is a translation program for
converting CSV file header descriptions to RowGen data definition files. The csv2ddf
program, located in the directory $RowGen11/bin on UNIX (\RowGen11\bin on
Windows), produces descriptive file name and field-layout text that can be referenced
by, or pasted directly into, a RowGen job specification script
(see Data Definition Files on page 62). The RowGen language supports multiple,
differently formatted input and output files, and structured reports.

WARNING!
csv2ddf expects the first record of a Microsoft .csv data file
to be a header record containing the field descriptions.
Therefore, if your data file does not have a header, you cannot
use the csv2ddf program. See SEPARATOR on page 86 for
details on how to use RowGen to define input file fields for
comma-delimited records.

For details on using the /PROCESS=CSV command in a RowGen specification file
to instruct RowGen to treat a file as a .csv file, see CSV on page 66.

2 USAGE

The syntax of csv2ddf is:

csv2ddf filename.csv filename.ddf

To execute csv2ddf and create a RowGen data definition file, enter:

csv2ddf filename.csv RowGen_ddf

where filename.csv is the name of the comma-separated-values file, and
RowGen_ddf is the resulting RowGen data definition file. For example, the command:

csv2ddf spreadsheet.csv spreadsheet.ddf

converts the field layout descriptions in spreadsheet.csv to a RowGen data
definition file called spreadsheet.ddf. For details on how to reference a .ddf file from
within a RowGen job script, see Data Definition Files on page 62.

Using the resulting .ddf, you can include the field layout in the input
section of a RowGen script to generate fields of this type. It is then recommended that
you specify /PROCESS=CSV on output, and include the field layouts again so that a
header is automatically created in the output file based on the field names given
(see CSV on page 66), as shown in the following example.

3 EXAMPLE

This example shows how you can use both csv2ddf and RowGen to produce a random
data file with the same header and record layout as a pre-existing CSV file.

Using the following CSV input file, comma_sep.csv:

Element_Name,Windows_NT,Windows,Windows_CE,Win32s,Component,
Component_Version,Header_File,Import_Library,Unicode,Element_Type
ADsBuildEnumerator,4.0 or later,,,,ADSI,,adshlp.h,,,function
ADsBuildVarArrayInt,4.0 or later,,,,ADSI,,adshlp.h,,,function
ADsBuildVarArrayStr,4.0 or later,,,,ADSI,,adshlp.h,,,function
ADsEnumerateNext,4.0 or later,,,,ADSI,,adshlp.h,,,function
ADsFreeEnumerator,4.0 or later,,,,ADSI,,adshlp.h,,,function
ADsGetLastError,4.0 or later,,,,ADSI,,adshlp.h,,,function
ADsGetObject,4.0 or later,,,,ADSI,,adshlp.h,,,function
ADsOpenObject,4.0 or later,,,,ADSI,,adshlp.h,,,function
ADsSetLastError,4.0 or later,,,,ADSI,,adshlp.h,,,function
IADs::Get,4.0 or later,,,,ADSI,,iads.h,,,interface method

Execute the following command to generate the .ddf:

csv2ddf comma_sep.csv comma_sep.ddf

This produces the following file, comma_sep.ddf, which can be invoked with a
/SPECIFICATION= statement from within a RowGen job specification script (see Data
Definition Files on page 62).

/FILE=comma_sep
/PROCESS=CSV
/LENGTH=0
/FIELD=(Element_Name,POSITION=1,SEPARATOR=',')
/FIELD=(Windows_NT,POSITION=2,SEPARATOR=',')
/FIELD=(Windows,POSITION=3,SEPARATOR=',')
/FIELD=(Windows_CE,POSITION=4,SEPARATOR=',')
/FIELD=(Win32s,POSITION=5,SEPARATOR=',')
/FIELD=(Component,POSITION=6,SEPARATOR=',')
/FIELD=(Component_Version,POSITION=7,SEPARATOR=',')
/FIELD=(Header_File,POSITION=8,SEPARATOR=',')
/FIELD=(Import_Library,POSITION=9,SEPARATOR=',')
/FIELD=(Unicode,POSITION=10,SEPARATOR=',')
/FIELD=(Element_Type,POSITION=11,SEPARATOR=',')

NOTE
The /PROCESS=CSV entry that is automatically produced is useful only for
output formatting purposes (see CSV on page 66), and will be ignored when
included in the input section of a RowGen script.
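
The mapping from a CSV header to the field layout above is mechanical: each header name becomes one /FIELD entry whose POSITION is its ordinal in the record. The following Python sketch shows that mapping under stated assumptions. It is not the csv2ddf source; the function name header_to_ddf is hypothetical, and the output format is copied from the listing above.

```python
import csv
import io

def header_to_ddf(csv_text, file_name):
    """Build .ddf-style lines from the header record of CSV text (illustrative)."""
    # Only the first record (the header) is read; data records are ignored.
    names = next(csv.reader(io.StringIO(csv_text)))
    lines = [f"/FILE={file_name}", "/PROCESS=CSV", "/LENGTH=0"]
    for position, name in enumerate(names, start=1):
        lines.append(f"/FIELD=({name},POSITION={position},SEPARATOR=',')")
    return "\n".join(lines)
```

A file without a header record would yield field names taken from its first data record, which is why the header is a prerequisite.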

You can now generate random data, and produce a sample output file that conforms to
your original CSV format, including the insertion of a header record, as follows:

/SPECIFICATION=comma_sep.ddf
/INFILE=comma_sep # reference-in the .ddf created by csv2ddf
/INCOLLECT=7 # generate 7 rows of csv data
/REPORT # no ordering required
/OUTFILE=comma_sep.out
/PROCESS=CSV # generate header and frame fields in quotes
/SPECIFICATION=comma_sep.ddf # use same field layouts from the .ddf

Because the file type /PROCESS=CSV is present on output, a header record is created
from the field names, and field contents are automatically framed within double
quotes (").
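
To make the output convention concrete, the Python sketch below imitates it outside RowGen: an unquoted header record built from the field names, followed by rows whose values are each framed in double quotes. The function name random_csv, the fixed seed, and the choice of alphanumeric values of 1 to 10 characters are assumptions made for illustration; RowGen's own value generation is not described here.

```python
import random
import string

def random_csv(field_names, rows, seed=0):
    """Return CSV lines: an unquoted header, then rows of quoted random values."""
    rng = random.Random(seed)  # fixed seed keeps the sketch repeatable
    alphabet = string.ascii_letters + string.digits
    out = [",".join(field_names)]  # header record built from the field names
    for _ in range(rows):
        values = ("".join(rng.choice(alphabet) for _ in range(rng.randint(1, 10)))
                  for _ in field_names)
        # Frame every field value in double quotes, as /PROCESS=CSV does on output.
        out.append(",".join(f'"{value}"' for value in values))
    return out
```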

The output is as follows:

Element_Name,Windows_NT,Windows,Windows_CE,Win32s,Component,Comp
onent_Version,Header_File,Import_Library,Unicode,Element_Type
"g06dCAtm","aNnbY5","HFhY6IS","Tn4JQG1a","ZWwYHuSymI","JG","kpnl
ieI9C","nT2o","7xLK9","JsCOGgm","9ZBXHv"
"A4PT7TN61","A58ZN","KNDLWWNf18","vvoQ","kMdtiS","tXcWvR","7yjlu
HTp","28teziD","qUp1x","rIB","CzNuwT"
"a8RU","GHBU","q","Hf8","JIvKH","D7","RPM5ay","UCyDk","nd","8tBr
Tq0y","FL"
"d09aOah","u","YxQ8P5","OUqqxDBvv","Kd","aBvYhL5","M2AoGn","XsNp
o","w1ZJFJ","DgNvDF2r","efuAyN8p07"
"HP0VwDAe","ZrRzUY","H","5","vALJufwAl","IJ8o4pTkhO","ObfP","1y6
N99vx","O2Tly","qSeDG","4Ml"
"SJ070U3qF","bWSqlGTiI","LMi9J","X5T","ohriL","cOUv023","JEvY8fd
","Nd6x3nr","10y","4kVQBx9n","Rj"

elf2ddf

1 PURPOSE

elf2ddf (extended log format-to-data definition file) is a translation program for
converting W3C web data descriptions to RowGen data definition files. The elf2ddf
program, found in the $ROWGEN_HOME/bin directory (RowGen11\bin on
Windows), scans web transaction files in ELF to produce descriptive file name and
field-layout text from the header that can be referenced by, or pasted directly into, a
RowGen job specification script (see Data Definition Files on page 62). The RowGen
language supports multiple, differently formatted input and output files, and structured
reports.

For details on using the /PROCESS=ELF command in the /OUTFILE section of a
RowGen specification file, see ELF (W3C Extended Log Format) on page 69.

RowGen also handles web logs in CLF format. The sample RowGen data definition
files CLF_Referrer.ddf, CLF_Agent.ddf, and CLF_Access.ddf are provided in the
examples/RowGen directory.

2 USAGE

2.1 ELF Format

elf2ddf requires that your ELF file is in the W3C convention format described on
this page.

An extended log file contains a sequence of lines containing ASCII characters
terminated by either the sequence LF or CRLF. Log file generators should follow the
line termination convention for the platform on which they are executed. Analyzers
should accept either form. Each line may contain either a directive or an entry.

Entries consist of a sequence of fields relating to a single HTTP transaction. Fields are
separated by white space; the use of tab characters for this purpose is encouraged. If a
field is unused in a particular entry, a dash "-" marks the omitted field. Directives record
information about the logging process itself.

Lines beginning with the # character contain directives. The following directives are
defined:

• Version: integer.integer
The version of the extended log file format used.
• Fields: [specifier...]
Specifies the fields recorded in the log.
• Software: string
Identifies the software which generated the log.
• Start-Date: date_time
The date and time at which the log was started.
• End-Date: date_time
The date and time at which the log was finished.
• Date: date_time
The date and time at which the entry was added.
• Remark: text
Specifies comment information. Data recorded in this field should be ignored by
analysis tools.
The directives Version and Fields are required and should precede all entries in the
log. The Fields directive specifies the data recorded in the fields of each entry.
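
The format rules above can be summarized in a few lines of Python. This is a minimal sketch of a reader for the convention as described on this page, not the parser used by elf2ddf; the function name parse_elf and the use of None for omitted fields are assumptions.

```python
def parse_elf(lines):
    """Split ELF lines into a directive dictionary and a list of entries."""
    directives, entries, fields = {}, [], []
    for raw in lines:
        line = raw.rstrip("\r\n")  # accept both LF and CRLF terminators
        if not line:
            continue
        if line.startswith("#"):
            # Directive lines have the form "#Name: value".
            name, _, value = line[1:].partition(":")
            directives[name.strip()] = value.strip()
            if name.strip() == "Fields":
                fields = value.split()
        else:
            # Fields are separated by white space; "-" marks an omitted field.
            values = [None if v == "-" else v for v in line.split()]
            entries.append(dict(zip(fields, values)))
    return directives, entries
```

Because entries are matched to names through the Fields directive, that directive must appear before the first entry, which is consistent with the requirement stated above.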

2.2 Syntax

The syntax of elf2ddf is:

elf2ddf filename.elf filename.ddf

To execute elf2ddf and create a RowGen data definition file, enter:

elf2ddf filename.elf RowGen_script

where filename.elf is the name of the extended log format file, and
RowGen_script is the resulting RowGen data definition file. For example, the
command:

elf2ddf clickstream.elf clickstream.ddf

converts the field layout descriptions in the clickstream.elf file header to a RowGen
data definition file named clickstream.ddf. For details on how to reference a .ddf file
from within a RowGen job script, see Data Definition Files on page 62.

NOTE
elf2ddf does not yet support ELF2 format.

3 EXAMPLE

The following is a header from an ELF file:

#Version: 1.0
#Date: 12-Jan-2007 00:00:00
#Fields: time cs-method cs-url
00:24:23 GET /tak/far.html
12:21:16 GET /tak/far.html
12:45:52 GET /tak/far.html
12:57:34 GET /tak/far.html

The following is the RowGen .ddf generated by elf2ddf based on the
above header:

# RowGen data definition file for ELF data
# Generated by elf2ddf.exe based on ELF header
# in "data.elf".
/FILE=data.elf
/PROCESS=ELF
/LENGTH=0
/FIELD=(time, POSITION=1, SEPARATOR=' ')
/FIELD=(cs-method, POSITION=2, SEPARATOR=' ')
/FIELD=(cs-url, POSITION=3, SEPARATOR=' ')

NOTE
The /PROCESS=ELF entry that is automatically produced is useful only for
output formatting purposes (see ELF (W3C Extended Log Format) on page 69), and
will be ignored when included in the input section of a RowGen script.

If you specify /PROCESS=ELF on output in a RowGen job script, an ELF-style
header will be generated.
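
The relationship between the #Fields directive and the generated /FIELD entries can be sketched as follows. This Python fragment is illustrative only; the function name fields_to_ddf is hypothetical, and the line format is copied from the example listing rather than taken from the elf2ddf source.

```python
def fields_to_ddf(header_lines, file_name):
    """Turn the #Fields directive of an ELF header into .ddf-style lines."""
    for line in header_lines:
        if line.startswith("#Fields:"):
            names = line[len("#Fields:"):].split()
            break
    else:
        raise ValueError("no #Fields directive found")
    ddf = [f"/FILE={file_name}", "/PROCESS=ELF", "/LENGTH=0"]
    # Each field is located by its ordinal position within the record.
    ddf += [f"/FIELD=({name}, POSITION={i}, SEPARATOR=' ')"
            for i, name in enumerate(names, start=1)]
    return ddf
```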

ctl2ddf

1 PURPOSE

The program ctl2ddf converts the column layouts specified in a SQL*Loader control
file (.ctl) into a data definition file (.ddf) containing /FIELD layout descriptions.
This .ddf can then be used in RowGen job specification scripts (see Data Definition Files
on page 62) to create multiple, differently formatted load files and structured reports.

2 USAGE

2.1 Execution

To execute ctl2ddf, enter:

ctl2ddf control_file

where control_file is the SQL*Loader control file (typically .ctl) containing the
column layout specifications you want to convert. The output is sent to filename.ddf,
where filename is the name of the table, specified in the .ctl file, into which the data
will be loaded.

3 EXAMPLE

Given the following SQL*Loader control file, test1.ctl:

# SQL*Loader Control File specifications (user)
# Fri Mar 30 17:
LOAD DATA
INFILE 'out.dat'
INTO TABLE emp_sorted
TRAILING NULLCOLS
(EMPNO position(0001:0006) DECIMAL EXTERNAL NULLIF (EMPNO = BLANKS),
ENAME position(0007:0016) char,
JOB position(0017:0026) char,
MGR position(0027:0032) DECIMAL EXTERNAL NULLIF (MGR = BLANKS),
SAL position(0033:0043) DECIMAL EXTERNAL NULLIF (SAL = BLANKS),
COMM position(0044:0053) DECIMAL EXTERNAL NULLIF (COMM = BLANKS),
DEPTNO position(0054:0056) DECIMAL EXTERNAL NULLIF(DEPTNO = BLANKS))

The command to convert it to a RowGen data definition file might be:

ctl2ddf test1.ctl

emp_sorted.ddf is generated:

/FILE=out.dat
/FIELD=(EMPNO, POSITION=1, SIZE=6, DOUBLE)
/FIELD=(ENAME, POSITION=7, SIZE=10)
/FIELD=(JOB, POSITION=17, SIZE=10)
/FIELD=(MGR, POSITION=27, SIZE=6, DOUBLE)
/FIELD=(SAL, POSITION=33, SIZE=11, DOUBLE)
/FIELD=(COMM, POSITION=44, SIZE=10, DOUBLE)
/FIELD=(DEPTNO, POSITION=54, SIZE=3, DOUBLE)

Note that the INFILE source for the .ctl file, out.dat, is used as the /FILE specification
in the .ddf (see Data Definition Files on page 62).

This .ddf can now be invoked from within a RowGen job script, for example:

/SPECIFICATION=emp_sorted.ddf
/INFILE=out.dat
/SORT
/KEY=(JOB)
/KEY=(SAL, DESCENDING)
/OUTFILE=stdout.dat

In this example, stdout.dat could be a named pipe for directly feeding output data
into another SQL*Loader operation.
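
The conversion shown in this example reduces to simple arithmetic on each position(start:end) range: POSITION is the start column and SIZE is end - start + 1. The Python sketch below illustrates that rule for the column forms used in test1.ctl. It is not the ctl2ddf implementation; the function name ctl_column_to_field is hypothetical, and the mapping of DECIMAL EXTERNAL to DOUBLE is inferred only from the listing above.

```python
import re

COLUMN_RE = re.compile(r"(\w+)\s+position\((\d+):(\d+)\)\s+(.*)", re.IGNORECASE)

def ctl_column_to_field(column_spec):
    """Convert one SQL*Loader column spec into a /FIELD line (illustrative)."""
    match = COLUMN_RE.search(column_spec)
    if match is None:
        raise ValueError(f"unrecognized column spec: {column_spec!r}")
    name, start, end, rest = match.groups()
    size = int(end) - int(start) + 1  # position(a:b) is inclusive at both ends
    # Inferred from the example: DECIMAL EXTERNAL columns are listed as DOUBLE,
    # while char columns carry no data type.
    suffix = ", DOUBLE" if rest.upper().startswith("DECIMAL EXTERNAL") else ""
    return f"/FIELD=({name}, POSITION={int(start)}, SIZE={size}{suffix})"
```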

APPENDIX

A PERFORMANCE TUNING

RowGen is designed to allow the user to control, and scale, machine resources. There
are no mechanisms to assume special privilege, disable interrupts, bypass the kernel, or
restrict other programs performing I/O. This good neighbor approach assures that
multiple sorts and non-sort jobs can run well concurrently. Accordingly, RowGen will
perform with minimal impact on the system with the default resource control settings
provided.

NOTE
This section describes the tuning of parameters that can improve RowGen job
performance where sorting is involved (that is, if you are using the /SORT
option rather than /REPORT). These tuning descriptions are relevant only when
the job exceeds the upper memory limit used for internal (in-memory) sorts,
that is, when overflow to disk-based temporary files occurs.

RowGen is capable of symmetric multiprocessing. Therefore, the number of processors,
the shared memory, and the total memory will all affect sort throughput.
In large sorting operations, it may be possible to improve job performance by:
• changing the balance of memory and I/O
• controlling the size of I/O blocks
• changing the size and location of overflow files.

Your system administrator can use the advice and instructions in this section to re-tune
RowGen for faster or slower performance. Manual performance tuning will affect the
execution of RowGen.

Because RowGen is designed for high-volume, high-performance sorting in a typical
multi-job environment, your expectations for sort performance must be weighed against
the needs of other users on the machine. When RowGen runs (alone or concurrently
with other jobs), you should monitor system and user times, and include the effect of
RowGen on the total environment when evaluating performance.

Memory Usage

The amount of sorting performed in memory has the greatest impact on the length of
time a job takes to run. For a large job, keeping few records in memory and writing
many to disk generally causes slower sort performance due to the physical reading and
writing of files. It can also cause job failure if there is insufficient disk space for the
records.

Conversely, using too much memory can cause thrashing, which also degrades
performance. On most computer systems, the best timings are achieved when all the
records of the sort can easily be retained in RAM.

For small sorts on lightly loaded machines, sorting in memory works well, especially
with the defaults provided by IRI. For larger jobs, or those running on very heavily
loaded machines, it is desirable to find new tuning values that will optimize the
resources spent. For huge jobs, or large jobs that you expect to run more than once, it
may be worthwhile to experiment with various memory-disk schemes to find a good
combination.

The setup program works at installation time to find the best amount of total memory
and shared memory for sorting. The values generated by setup assume that
approximately 35-40% of the machine's resources can be given to a sort job.

On a heavily used system, it is likely that you will not get any memory without
swapping. The main concern is that requesting too much memory will thrash the system
or lead to complete memory usage. Advanced users will note that sar and vmstat on
various UNIX flavors, and mem on Windows machines can provide more detailed
memory usage information. See MEMORY_PERTHREAD_MAX in Resource Control
Settings on page 213.

Disk Usage

Overflow occurs in sorting when the input data volume exceeds available or specified
memory limits. When the memory is filled, the records in memory are sorted and written
to a temporary work file. When all the overflow data are distributed to temporary files,
these files and the internal data are merged to produce output. Typically, the required
overflow space is less than the size of the input. If the system does not allow all the work
files to be open at once, it is possible to temporarily need more than the input size.
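
The overflow mechanism described above is the classic external merge sort. The Python sketch below illustrates the idea only and does not represent RowGen's internal implementation: memory_limit is a hypothetical record count standing in for the memory limit, and records are assumed to be text lines without embedded newlines.

```python
import heapq
import os
import tempfile

def external_sort(records, memory_limit, work_dir=None):
    """Sort text lines while holding at most memory_limit of them in memory."""
    run_paths, buffer = [], []

    def spill():
        # Memory is full: sort the buffered records and write a work file.
        fd, path = tempfile.mkstemp(dir=work_dir, text=True)
        with os.fdopen(fd, "w") as handle:
            handle.writelines(line + "\n" for line in sorted(buffer))
        run_paths.append(path)
        buffer.clear()

    for record in records:
        buffer.append(record)
        if len(buffer) >= memory_limit:
            spill()

    handles = [open(path) for path in run_paths]
    try:
        runs = [(line.rstrip("\n") for line in handle) for handle in handles]
        # Merge the sorted work files with the records still held in memory.
        return list(heapq.merge(*runs, sorted(buffer)))
    finally:
        for handle in handles:
            handle.close()
        for path in run_paths:
            os.remove(path)  # work files are removed when the job completes
```

This sketch opens every work file at once for a single merge pass. When that is not possible, intermediate merge passes are needed, which is where the additional temporary space comes from.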

You can control where temporary files are placed. Where multiple drives are
available, you can achieve increased speed, as the files are written to and read back in
parallel. If your system provides striping, the I/O system will automatically
accommodate this. Otherwise you should assign different devices as overflow
directories. Furthermore, using different physical devices for work space and output will
distribute the space requirement in a multi-threaded fashion, avoiding I/O conflicts in
the merge phase.

In addition to specifying the location of sort overflow, the capacity of each work area
can be set. By restricting the amount of overflow allowed, system administrators can
more precisely control the impact and performance of large sort jobs on a busy machine;
see WORK_AREAS path on page 214.

A.1 Tuning RowGen for Windows


During First Time Setup, a set of Windows Registry values is created that
contains the default resource control values that RowGen uses when you run a job.

Additionally, a template file containing resource control values, sample_rowgen.rc, is
created in \install_dir\etc. You can edit this file, or create a separate one, to override the
registry settings at the user or job level.

See Search Order for Resource Controls on page 211 for details on how RowGen
prioritizes the recognition of resource control settings.

To access and edit the Windows Registry, you must run regedit.exe and select:
HKEY_LOCAL_MACHINE\SOFTWARE\Innovative Routines Int’l
Inc.\RowGen 2.1\Global Configuration.

You will see the following, for example:

Generally, the values you will be most concerned with are WORK_AREAS. These are the
directories where RowGen will put temporary files. For the best performance, it is
recommended that you name as many directories as possible which are on physically
separate devices.

NOTE
You can include multiple entries for any given WORK_AREAS setting, for
example c:\sortwork1,d:\sortwork2.

The number of threads used for sorting can be controlled by the value of THREAD_MAX.
Other registry values should not be changed unless recommended by IRI’s technical
support staff.

A.2 Tuning RowGen for UNIX


During First Time Setup, a resource control file is created in the
$ROWGEN_HOME/etc directory, the initial values for which are determined by
installer user prompts. This file, rowgenrc, is used to set the resources that RowGen
will use by default (see footnote 1).

It is recommended that you set ROWGEN_TUNER to the name of a resource control file.
This allows you to create and modify multiple resource control files, which can be
selected for different users, jobs, and system conditions. For details, see Using
Customized Resource Control Files on page 209.

See Search Order for Resource Controls on page 211 for details on how RowGen
prioritizes the different ways of setting resource control settings.

1. Previous versions of RowGen used the environment variable ROWGEN_TUNER for
this job, and its old syntax will still continue to work.

A.3 Using Customized Resource Control Files


The files rowgenrc (UNIX) and rowgen.rc (Windows) contain a template of tuner
values. At the user and job level, you can customize your own resource control file in
either of the following ways:

• edit the resource control template, and give it a name that is meaningful to a
specific job
• create your own resource control file and give it a name that is meaningful to a
specific job.

When creating your own resource control file, it is necessary to include only those
settings that you wish to be used other than the defaults. For example, the following is
valid resource control file contents in Windows:

THREAD_MAX 3

If you specify this resource control setting, the maximum number of processes
designated for the job will be 3 (for details on all the possible settings, see Resource
Control Settings on page 213). All other values for the job will be determined in the
order described in Search Order for Resource Controls on page 211.
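
A resource control file of this kind is a short list of name and value pairs. The Python sketch below shows one way such a file could be read into a dictionary, for example to inspect a job-specific file before using it. It is an assumption-laden illustration, not RowGen's parser: the function name parse_resource_controls is hypothetical, it accepts both the "NAME value" and "NAME=value" forms that appear in this appendix, and a repeated name simply overwrites the earlier entry.

```python
import re

SETTING_RE = re.compile(r"(\w+)\s*[=\s]\s*(.*)")

def parse_resource_controls(text):
    """Read 'NAME value' or 'NAME=value' lines into a dictionary."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = SETTING_RE.match(line)
        if match is None:
            raise ValueError(f"cannot parse setting: {line!r}")
        name, value = match.groups()
        settings[name.upper()] = value.strip()  # normalize the name's case
    return settings
```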

UNIX: If you are in the C shell, the syntax for specifying a new resource control file to
be used is:

setenv ROWGEN_TUNER [path]filename

For example:

setenv ROWGEN_TUNER /usr/mis/rowgenrc.job1

You can also use the /MEMORY-WORK statement to designate a resource control file. The
syntax is:

/MEMORY-WORK="[path]filename"

For example:

/MEMORY-WORK="/usr/mis/rowgenrc.job1"

The settings in the file rowgenrc.job1 would be read before the settings in any other
resource control file, and therefore take precedence.

Windows: To specify the use of a new resource control file from a Windows command line,
type:

set ROWGEN_TUNER=c:\progra~1\IRI\RowGen21\bin\filename

For example:

set ROWGEN_TUNER=c:\progra~1\IRI\RowGen21\bin\rowgen_job1.rc

You can also use the /MEMORY-WORK statement to designate a resource control file. The
syntax is:

/MEMORY-WORK="[path]filename"

For example:

/MEMORY-WORK="\Progra~1\IRI\RowGen21\etc\rowgen.rc2"

The settings in the file rowgen.rc2 would be read before the settings in any other
resource control file, and therefore take precedence.

A.4 Search Order for Resource Controls


This section describes the level of priority given to the different methods of setting
resource controls, for both UNIX and Windows. RowGen will first evaluate a /MEMORY-WORK
command for the old tuning values or a file referenced by
ROWGEN_TUNER (see Using Customized Resource Control Files on page 209).

UNIX: Any controls not set by these methods will be searched for, in order, in the files
named below. Note that for some directories, .rowgenrc is the file, and in others, it is
rowgenrc.

Table 1: Search Order—rowgenrc

Priority  File
1         ./.rowgenrc (current directory)
2         $HOME/.rowgenrc (user’s home directory)
3         /etc/default/rowgenrc
4         /etc/rowgenrc
5         $ROWGEN_HOME/etc/rowgenrc
6         /opt/rowgen/etc/rowgenrc
7         /usr/local/rowgen/etc/rowgenrc

RowGen will search for each of the above files in order until all the values have been
set. Once a value is set, it will not be changed by settings in any subsequent files. It is
therefore possible that different variables can be set in different files. Factory defaults
will be used for any values not already set.
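
The rule that the first file to set a value wins can be expressed compactly. The Python sketch below models it with dictionaries standing in for the parsed contents of each file, listed from highest to lowest priority; the function name resolve_settings and the sample values are hypothetical.

```python
def resolve_settings(sources, defaults):
    """Merge settings so that the highest-priority source to set a value wins.

    sources: list of dicts in priority order (highest priority first).
    defaults: dict of factory defaults used for anything still unset.
    """
    resolved = {}
    for settings in sources:
        for name, value in settings.items():
            resolved.setdefault(name, value)  # once set, a value is never replaced
    for name, value in defaults.items():
        resolved.setdefault(name, value)  # factory defaults fill the remaining gaps
    return resolved
```

For example, if ./.rowgenrc sets only THREAD_MAX and /etc/rowgenrc sets both THREAD_MAX and MEMORY_MAX, the job takes THREAD_MAX from the first file and MEMORY_MAX from the second.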

If you wish to have all the variables set in one place, you should set ROWGEN_TUNER to a
file name and make sure that all the variables are set in that file. Remember that you may
use any legal UNIX name for this file and that it will only be searched in the path
designated in the ROWGEN_TUNER declaration. If all the values are not set in this file, a
search will be done as shown in Table 1.

If you find performance unsatisfactory, you may want to run setup and choose Tune
RowGen from the menu. This is the same utility that runs during First Time
Setup. You may set different values for the tuner variables based on anticipated
conditions at the time of the sort. Any values that you do not set yourself will be given
factory default values.

You are also given the opportunity to set the path and name for the resource control file
being created. In this way, you can generate multiple resource control files, store them in
the same location, and then activate the desired one by setting ROWGEN_TUNER or
/MEMORY-WORK to the appropriate file name.

Windows: Table 2 shows the order in which RowGen prioritizes resource control settings.
Factory (registry) defaults will be used for any values not manually set.

Table 2: Resource Control Priority on Windows

Priority  File
1         files pointed to by a /MEMORY-WORK statement
2         resource control files pointed to by ROWGEN_TUNER
3         settings taken from rowgen.rc if it is located in the same directory
          from which the job is run (that is, the command-line directory)
          settings taken from install_dir\etc\rowgen.rc
4         Windows Registry

A.5 Resource Control Settings


This section provides a complete listing, with definitions and syntax, of the variables
that may be set in the default rowgenrc file (or rowgen.rc), the Windows registry, or
some other RowGen resource control file.

The names are not case-sensitive and they may be specified in any order. A sample
resource control file will be presented after the definitions.

All size and count values must be positive numbers without commas, though decimal
values are supported. Units may be designated as K or KB for kilobytes, M or MB for
megabytes, G or GB for gigabytes.

If no units are designated, the default units are bytes. Units are not case-sensitive.
All values and path/file names can contain references to environment variables.
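
The size conventions above can be illustrated with a small conversion routine. This Python sketch is not part of RowGen: the function name parse_size is hypothetical, and it assumes binary multiples (1 KB = 1024 bytes), which this section does not state explicitly.

```python
import re

# Assumption: binary multiples (1 KB = 1024 bytes).
UNIT_BYTES = {"": 1, "K": 1024, "KB": 1024, "M": 1024**2, "MB": 1024**2,
              "G": 1024**3, "GB": 1024**3}

def parse_size(text):
    """Convert a size such as '500MB', '1.5k', or '4096' to a byte count."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*", text)
    if match is None:
        raise ValueError(f"not a valid size: {text!r}")  # e.g. commas or a sign
    number, unit = match.groups()
    unit = unit.upper()  # units are not case-sensitive
    if unit not in UNIT_BYTES:
        raise ValueError(f"unknown unit in {text!r}")
    return int(float(number) * UNIT_BYTES[unit])  # no unit means bytes
```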

NOTE
All resource control settings prior to RowGen version 8 are still supported, but
their values will not be optimal. It is therefore recommended that you update
any older settings (and values) with the settings described here.

THREAD_MAX count

This is the maximum number of threads to be designated for a sort or merge. The default
is the number of CPUs licensed for the RowGen software on each machine in your
license agreement. You might want to set this number lower to minimize overhead in
smaller jobs, or to accommodate the needs of other programs running concurrently.

Typically, the maximum number of sort threads corresponds to the number of physical
CPUs on board.

MEMORY_MAX= fixed_value or %value

This is the upper memory limit used for internal (in-memory) sorts before they overflow
to disk-based temporary files. After overflow, the same value represents the size of
temporary files for sort operations.

At installation, you can set MEMORY_MAX to be a literal value, such as 500MB. You can
also express this value as a percentage of physically-detected RAM. For example, if
physical RAM is 1024MB, you can set MEMORY_MAX=50% which indicates a value of
512MB. You can modify this value in the rowgenrc file at any time.
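
The percentage form is resolved against physically-detected RAM. The Python sketch below reproduces the arithmetic of the example (50% of 1024MB is 512MB). The function name memory_max_bytes is hypothetical, the percentage is assumed to be written with a trailing % sign as in the example, and fixed values are assumed to be given in bytes already.

```python
def memory_max_bytes(setting, physical_ram_bytes):
    """Resolve a MEMORY_MAX setting to bytes (illustrative only)."""
    setting = setting.strip()
    if setting.endswith("%"):
        # Percentage of physically-detected RAM, for example "50%".
        return int(physical_ram_bytes * float(setting[:-1]) / 100)
    return int(setting)  # assumption: a fixed value already expressed in bytes
```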

Windows: The default MEMORY_MAX value on Windows at installation is based on 50% of
physically-detected RAM.

UNIX: The default MEMORY_MAX value on Unix and Linux systems at installation is based
on a percentage of the memory detected when CSMEMTEST is run at installation:

• 10% over 32GB
• 15% over 16GB
• 25% over 2GB
• 35% under 2GB
• 50% under 512MB.

On Unix systems where CSMEMTEST is not run, or where it fails, the default will be
the above percentages based on physically-detected RAM.
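
One reading of the tiers above is sketched below in Python. The function name default_memory_percent is hypothetical, and the treatment of the exact boundary values (for example, a machine with exactly 2GB) is an assumption, because the list does not say which tier applies there.

```python
def default_memory_percent(detected_bytes):
    """Return the default MEMORY_MAX percentage for a detected memory size."""
    MB, GB = 1024**2, 1024**3
    if detected_bytes > 32 * GB:
        return 10
    if detected_bytes > 16 * GB:
        return 15
    if detected_bytes > 2 * GB:
        return 25
    if detected_bytes >= 512 * MB:  # assumption: 512MB up to and including 2GB
        return 35
    return 50
```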

WORK_AREAS path

This allows you to specify directories for sort overflow (temporary) files. You may
specify as many WORK_AREAS entries as you wish, but the optimal quantity should
match your THREAD_MAX value, and their optimal locations would be on different
physical drives, each with sufficient capacity and space to hold at least 1x your largest
generated file. If you name multiple overflow directories on different physical devices,
your sort will be faster because the I/O will occur in parallel and not conflict. Try also to
ensure that your WORK_AREAS path(s) are not on the same physical drive that holds the
generated or the sorted output file(s).

path may be an environment variable or contain an environment variable within it.
The default overflow directory is ./ on Unix or .\ on Windows (current). Be sure to
follow your operating system convention for specifying a path, for example
D:\tmp\work (Windows) or /tmp/work (Unix).

Work files take the form CSprocess_number, and are removed automatically at the
completion of a successful job.

NOTE WORK_AREAS directories/files are written to in round-robin fashion in the order
specified until the job is finished. If more space is required beyond that which
is available, the job will abort with an error message.
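The round-robin order described in the note can be pictured with a short sketch. It is not RowGen's implementation; the function name and the directory names are hypothetical, and the sketch only shows how successive work files would be spread over the listed directories in the order specified.

```python
from itertools import cycle


def assign_work_files(work_areas, file_count):
    """Pair each work file index with a directory, cycling in the order given."""
    if not work_areas:
        raise ValueError("at least one work area is required")
    areas = cycle(work_areas)
    return [(index, next(areas)) for index in range(file_count)]


# Hypothetical directories on three different physical drives.
plan = assign_work_files(["/disk1/work", "/disk2/work", "/disk3/work"], 5)
for index, directory in plan:
    print(index, directory)
```

With three directories, the fourth work file returns to the first directory, which is why spreading the entries across separate physical drives lets the I/O proceed in parallel.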


WARNING!
U If your generated, temporary, and/or output files will be
read from and/or written to remote, NFS-mounted drives, then
you should add the following entry to the rowgenrc file:

AIO OFF

which disables RowGen’s use of AIO (Asynchronous
Input/Output). By default, AIO ON (enabled) is set.

ON_WORKAREAS_FULL option

This option determines the pause/resume behavior when one or more specified
WORK_AREAS (the paths where temporary work files are written in large sort jobs) have
run out of disk space during a RowGen job. The options are:

ABORT The default behavior. When a work area(s) is full, the job is aborted,
temporary work files are purged, and the job must be re-run. Before
restarting the job after an abort, you must either free up space in your
specified work area(s), or assign different WORK_AREAS entries. See
WORK_AREAS path on page 214.

RETRY_PROMPT The following prompt is displayed when a work area(s) is filled up
during a job:

out of space in work area: [a]bort or [r]esume?

Enter a to abort the job (see ABORT above).

If you want to resume, you must first free up space in the specified
WORK_AREAS (see WORK_AREAS path on page 214), and then enter
r to resume the job.

RETRY_ADD_WORKAREA_PROMPT
The following prompt is displayed when a work area(s) is filled up
during a job:

out of space in work area: [a]bort, [r]esume, [n]ew work area?

Enter a to abort the job (see ABORT above).

To resume the job, you must first free up space in the specified
WORK_AREAS (see WORK_AREAS path on page 214), and then enter
r to resume the job.

If you enter n, you are then prompted as follows:

enter path of new work area (empty line to stop):

Enter one or more paths, pressing <Enter> after each entry, to add to
the list of WORK_AREAS (see WORK_AREAS path on page 214).
After you have added one or more path names, press <Enter> again to
create an empty line. Each path entered is checked for read/write
access, and, if acceptable, added to the list of WORK_AREAS. All
previous work files are then considered to be non-full, and RowGen
attempts to resume the job. If all other work areas are still full,
RowGen will utilize only the new work areas entered here.
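The three ON_WORKAREAS_FULL options differ only in which replies they accept. The following sketch summarizes that mapping. It is an illustration, not RowGen code: the function name and the returned action labels are hypothetical, and accepting upper-case replies is an assumption of the sketch.

```python
def next_action(option: str, reply: str = "") -> str:
    """Map an ON_WORKAREAS_FULL option and an operator reply to an action.

    Returns "abort", "resume", or "add_work_area". The replies a, r and n are
    the ones shown in the prompts in this section.
    """
    allowed = {
        "ABORT": {},
        "RETRY_PROMPT": {"a": "abort", "r": "resume"},
        "RETRY_ADD_WORKAREA_PROMPT": {
            "a": "abort",
            "r": "resume",
            "n": "add_work_area",
        },
    }
    if option not in allowed:
        raise ValueError(f"unknown ON_WORKAREAS_FULL option: {option}")
    if option == "ABORT":
        return "abort"  # no prompt is shown; the job is aborted
    choices = allowed[option]
    key = reply.strip().lower()  # assumption: case is ignored
    if key not in choices:
        raise ValueError(f"reply {reply!r} is not valid for {option}")
    return choices[key]
```

Note that n is only meaningful under RETRY_ADD_WORKAREA_PROMPT; under RETRY_PROMPT the operator must free space in the existing work areas before resuming.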

BLOCKSIZE size [unit]

This defines the size of the I/O buffer. Optimal blocksize varies depending on the
amount of physical memory available in your environment. The default is set to 1200KB
on Unix and 2048KB on Windows.

MONITOR_LEVEL level number

Set the monitor level to determine the degree of on-screen reporting detail on a RowGen
job. Monitor events are reported through stderr. Setting the monitor at higher levels can
adversely impact the efficiency of the sort. The default MONITOR_LEVEL is 1.

Setting the level with a sortcl /MONITOR statement will override the global default for
that particular job. Below is a chart describing the levels:

Level   Description

0       no monitoring

1       Show job initiation and job completion of events only. This includes
        the program itself, the sort processes, and the merge (output write)
        processes.

2       Includes Level 1 plus the opening and closing of generated and
        output files

3-9     Includes Level 2 plus the opening and closing of temporary files.
        Each progressive level will show the number of records processed
        with an increasing degree of frequency.

3       every 1,000,000 records

4       every 100,000 records

5       every 10,000 records

6       every 1,000 records

7       every 100 records

8       every 10 records

9       every 1 record

WARNING! High monitor levels degrade large file processing
performance due to screen I/O overhead.
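The reporting intervals in the chart follow a power-of-ten pattern, which makes the cost of a high monitor level easy to estimate. The sketch below is only arithmetic on the chart values; the function names are hypothetical and nothing here calls RowGen.

```python
def records_per_report(level: int):
    """Return how many records pass between progress reports, or None.

    Levels 0-2 do not report record counts; levels 3-9 follow the chart:
    level 3 reports every 1,000,000 records, and each higher level reports
    ten times more often.
    """
    if not 0 <= level <= 9:
        raise ValueError("monitor level must be between 0 and 9")
    if level < 3:
        return None
    return 10 ** (9 - level)


def report_count(total_records: int, level: int) -> int:
    """Estimate how many progress lines a job of total_records would print."""
    interval = records_per_report(level)
    return 0 if interval is None else total_records // interval
```

For a job of 50,000,000 records, level 3 would produce about 50 progress lines, while level 9 would produce one line per record, which illustrates the screen I/O overhead mentioned in the warning.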

LOG [path] filename

Runtime configuration and sort timing information can be sent to a self-appending log
file. By showing the actual system values used during execution, the log file can be used
in benchmarking and performance analysis. Specifying different filenames in other
resource control files will create additional log files.

When an error prevents a job from completing, the log file is not created. Instead, you
can refer to the file .rgerrlog which is created and overwritten with each job. The
location of .rgerrlog is also determined by the [path] specified here. If you do not
specify a path, any log and .rgerrlog files are written to the current working (user)
directory.

OUTPUT_TERMINATOR option

This setting determines the record terminator used when your output is specified as
variable-length. This option is ignored when your output is specified as fixed-length.
You can include this setting when you want to do any of the following on output:

• change variable-length records from the Windows format to the Unix
  format, or vice versa
• add special characters to terminate records
• prevent double-linefeed terminators.

The following options are supported:

INFILE The default behavior. When your generated file is variable-length,
this option uses the same terminator as the generated file, that is,
either a linefeed (LF) when running on Unix/Linux, or a carriage
return-linefeed (CRLF) for Windows jobs. For fixed-length input, the
default output terminator is LF. Therefore, use the DOS option (see
below) if you prefer a CRLF.

character(s)
The character, or character string, that you specify -- possibly the null
string "" -- is appended to each output record.

NOTE Using the null string "" as the output terminator character can be useful when
linefeed-terminated input data is declared as fixed-length (which is done to
save processing time when all records are of equal length). In these cases,
using the null string as an output terminator character will prevent
double-linefeeds from appearing in the output when converting from fixed- to
variable-length.

DOS A special case of character(s) equal to \r\n, that is, a carriage
return-linefeed.

Unix A special case of character(s) equal to \n, that is, a linefeed.
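The effect of each OUTPUT_TERMINATOR option, including the double-linefeed case described in the note, can be sketched as follows. This is an illustration of the documented behavior rather than RowGen's own code; the function names are hypothetical, and the sample records are invented for the example.

```python
def resolve_terminator(option: str, input_terminator: str = "\n") -> str:
    """Translate an OUTPUT_TERMINATOR option into the string appended to records."""
    if option == "INFILE":
        return input_terminator
    if option == "DOS":
        return "\r\n"
    if option == "Unix":
        return "\n"
    return option  # literal character(s), possibly the empty string


def write_records(records, option, input_terminator="\n"):
    """Join records, appending the resolved terminator to each one."""
    terminator = resolve_terminator(option, input_terminator)
    return "".join(record + terminator for record in records)


# Fixed-length records whose last byte is already a linefeed.
fixed = ["AAAA\n", "BBBB\n"]
doubled = write_records(fixed, "Unix")  # every record ends in two linefeeds
clean = write_records(fixed, "")        # the null string avoids the doubling
```

Here `doubled` shows why a linefeed terminator added to records that already end in a linefeed produces blank lines, and `clean` shows the null-string remedy.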

AUDIT [path]filename

sortcl can produce a self-appending log file, in XML format, that contains
comprehensive job information for the purposes of auditing. Auditing is enabled only
when the AUDIT entry is included in the rowgenrc file (or Windows Registry).

An audit record is appended to the log after each sortcl invocation, and includes
statistical information regarding the job and the complete sortcl job script. Additional
entries/lines that do not appear in the original job script (that is, entries referenced via
one or more /SPEC commands) are expanded and included in the audit log.
Additionally, an environment display is provided to show the literal equivalents of all
environment variables that were contained in the job script.

The setting is as follows:

AUDIT [path]filename

where path is the directory that will contain the self-appending audit file, and
filename is the name of the audit file. It is recommended that the file name is given the
extension .xml to conform with its file type, for example:

AUDIT /home/compliance/audit/audit_trail.xml


To disable auditing, you must remove the AUDIT entry from the rowgenrc file
(or Windows Registry).

NOTE You can assign an environment variable in place of the path, for example:

AUDIT $syspath/audit_log.xml
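To see what "literal equivalent" means for a path such as the one above, the substitution can be mimicked with a few lines. This is a sketch only: RowGen performs its own expansion, the function name is hypothetical, and the value given to syspath is an invented example.

```python
from string import Template


def expand_setting_path(raw_path: str, environment: dict) -> str:
    """Replace $NAME references with values from the given environment mapping.

    Raises KeyError when a referenced variable is not defined.
    """
    return Template(raw_path).substitute(environment)


# Hypothetical value for the syspath variable used in the example above.
env = {"syspath": "/home/compliance/audit"}
print(expand_setting_path("$syspath/audit_log.xml", env))
```

An undefined variable raises an error in this sketch, which loosely parallels the Environment Variable Undefined message listed in Appendix B.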

B ERROR and RUNTIME MESSAGES

This section contains a table of RowGen error and runtime values and messages
(see Detailed Error and Runtime Messages on page 226 for elaboration on each
message).

B.1 Table of Error Values


The errors and messages in Table 3, presented in value order, can occur throughout the
RowGen suite (see B.2. Detailed Error and Runtime Messages for an alphabetical
listing of messages). When an error occurs, batch programs stop with an error message,
the system status error is set, and work files are purged. Depending on how your
operating system was generated, fatal job or operator intervention errors may not be
caught by RowGen and may go on to cause undesirable results. In these cases, contact
your RowGen agent for assistance.

Table 3: Error Values


Value Message Meaning
0 Normal Return N/A

1 Currently unused N/A

2 Insufficient Memory Required memory space could not be dynamically allocated.

3 Unknown Exception An internal error was detected due to invalid data and/or specifications.
Check whether the input data and specifications are valid.
Contact IRI Support if you cannot resolve the problem.

4 Invalid Format Definition Not currently returned.

5 Currently unused N/A

6 Currently unused N/A

7 Error in WORK_AREAS WORK_AREAS specification in your resource control file is
invalid or cannot be used.

8 Currently unused N/A

9 Currently unused N/A

10 Parameter Error Indicates that a routine was called with an illegal parameter.
Check your RowGen specification.

11 Record Length Improper A record length less than 0 was specified.

12 Invalid Number of Keys A number of keys less than 0 was specified.

13 Invalid Key Specification Key location was not specified as CS_FCHAR, CS_FANY, or
CS_FBLANK.

14 Unspecified Error Not currently returned.

15 Unspecified Error Not currently returned.

16 Invalid Direction A direction which was not CS_ASCEND or CS_DESCEND
was specified.

17 Invalid Field Position A key position less than 1 was given.

18 Invalid Record Length A fixed key position plus its length goes past the end of a
fixed-length record. A variable-length record shorter than the
highest fixed key position was read. A variable-length record
longer than 65,535 bytes was read.

19 Unknown Alignment Type A CS_ALPHA key had k_align not CS_NOJUST,
CS_JUSTLEFT, or CS_JUSTRIGHT.

20 Unknown Case Type A CS_ALPHA key did not use CS_NOCASE or
CS_CASEFOLD.

21 Improper Format Declaration A CS_NUMERIC CS_INTERNAL key was specified with
field positioning. A CS_NUMERIC CS_INTERNAL k_form
was named which does not exist. A key was not CS_ALPHA or
CS_NUMERIC.

22 Field Type vs. Length Wrong A CS_NUMERIC CS_INTERNAL key had an unreasonable
k_len for its form.

23 Output Type Unknown Output was not stdout, file, both or returned to caller.

24 Problem with User’s Output File An error occurred when writing data to the final output file.

25 Invalid Value An invalid value for the given data type/task/feature has been
specified.

26 Unspecified Error Not currently returned.

27 Currently unused N/A

28 Environment Variable Undefined Used by cosort() and RowGen.

29 Source Unknown RowGen syntax error.

30 Unexpected RowGen syntax error.

31 Wrong Combination of Items RowGen syntax error.

32 Divide By Zero Division by zero (mathematical error).

33 Terminated Program execution terminated by Ctrl-C.

34 Insufficient Disk Space for Output File Total input bytes cannot fit on disk where output file
is specified.

35 Insufficient Disk Space for Work File(s) Total input bytes exceed (maxmemory + total temp area).

36 Insufficient Disk Space for Both Output File and Work File(s) Two times total input bytes is greater
than total output area, where output area overlaps work area.

37 Maximum Number of Sort Threads Exceeded Trying to run with more sort threads than your license allows.

38 Unspecified Error Not currently returned.

39 Unspecified Error Not currently returned.

40 Unspecified Error Not currently returned.

41 Unspecified Error Not currently returned.

42 Unspecified Error Not currently returned.

43 Unspecified Error Not currently returned.

44 Unspecified Error Not currently returned.

45 Unspecified Error Not currently returned.

46 License Violation: Incorrect Node or Invalid Key Contact your RowGen agent.

47 License Violation: Expiration Date Passed or Invalid Key Contact your RowGen agent.

48 License Violation: Invalid Key Contact your RowGen agent.

49 License Error: Cannot Obtain Machine ID. License manager is unable to identify its location. Contact
your RowGen agent.

50 Currently unused N/A

51 Invalid Resource Variable in Resource Control File Missing or non-writable overflow directory.

52 ROWGEN_HOME is Not Set in the Environment Set the environment variable ROWGEN_HOME to the
RowGen home directory.

53 Invalid Work Area Specified Change or reset permission in WORK_AREAS.

54 License Violation: This Application is Not Licensed Contact your RowGen agent.

55 System Error Monitor error message.

56 csio Error Monitor error message.

57 Initiated Monitor event message.

58 Completed Monitor event message.

59 Infile Opened Monitor event message.

60 Infile Closed Monitor event message.

61 Outfile Opened Monitor event message.

62 Outfile Closed Monitor event message.

63 Workfile Opened Monitor event message.

64 Workfile Closed Monitor event message.

65 Records Processed Monitor event message.

66 Process Begins Monitor event message.

67 Process Ends Monitor event message.

68 Left Right Monitor join event message.

69 Accepted Rejected Monitor event message.

70 Too Many Errors -- Aborting RowGen script analyzer stops after 8 syntax errors.

71 Incomplete Command Missing part(s) of a command.

72 Expecting Names RowGen syntax error.

73 Parenthesis Count RowGen syntax error.

74 Missing Double Quote Mark RowGen syntax error.

75 Duplicate Name RowGen syntax error. FILLER, as a field name, is exempt
from this error.

76 Expecting RowGen syntax error.

77 Expression Syntax RowGen syntax error.

78 Source is Elsewhere RowGen syntax error.

79 Abbreviation of ASCENDING RowGen syntax error.

80 Field Length > Record Length RowGen syntax error.

81 Overlapping Field RowGen syntax error.

82 Illegal Character RowGen syntax error.

83 Applies Only to Keys RowGen syntax error.

84 Unrecognized Word RowGen syntax error.

85 Scripts Too Deeply Nested RowGen syntax error.

86 Circular Definition RowGen syntax error.

87 Not an Active File RowGen syntax error.

88 Blocking Factor Invalid RowGen syntax error.

89 No File with this Name RowGen syntax error.

90 Unrecognized Name in Expression Field referenced was not specified.

91 No /HEADREC on input file Need to define a headrec on the referenced input file.

92 Not a Valid Option Here Improper syntax.

93 Require Memory Amount Tuning parameter missing from cosort() or registry.

94 Require Directory Name Tuning parameter missing from cosort() or registry.

95 Improper Command Not a recognized command.

96 No Such Locale on System Unknown locale specified.

97 Cannot Set Locale on System The requested locale routines are not available.

98 Specific Line Too Long RowGen statement or command line limit exceeded.

99 Condition Define on a Different File Defined on one file and used on another.

100 MFVL on Output Requires MFVL on Input RowGen syntax error.

101 Ambiguous Reference RowGen syntax error.

102 Invalid Record Length for this File e.g., file size not integer multiple of fixed record length.

103 Constant in Conditional Possible Error Invalid data type for numeric operand in a condition.

104 Error Return from RowGen Coroutine reports an error to the caller.

105 Illegal Comparison Cannot compare data of these different forms.

106 Divide by 0 Division by zero (mathematical error).

107 Invalid Argument Illegal argument, such as sqrt(-1) or month 13.

108 Cannot Set Environment Variable For example, setlocale() returns an error.

109 Invalid Conversion Cannot convert this data type.

110 Invalid Macro Referenced Cannot convert this data type.

111 Records Not in Order SET file requires the NOT_SORTED option.

112 rowgenrc Report Reports the total number of RowGen errors in /DEBUG
mode.

113 Last Record Incomplete Wrong record length or missing record terminator.

114 Currently unused N/A

115 Currently unused N/A

116 Currently unused N/A

117 Currently unused N/A

118 Records are in Order Monitor event message.

119 Unable to Read File Error in reading the specified file.

120 Unable to Write File Error in writing the specified file.

121 Unable to Open File Error in opening the specified file.

122 Currently unused N/A

123 Permission Denied Insufficient privileges to access the specified file.

124 No Such File The specified file does not exist.

125 Invalid File The specified file is not a regular file.

126 Empty Input File The specified file is empty.

127 Cannot Find a Matching File The specified pattern did not match any file in the file system.

128 Not Supported The feature/syntax is not supported by RowGen. Contact your
RowGen agent.

129 Disk Full Insufficient space for input/temporary/output purposes.

130 Missing records (number read != number written) A mismatch between the number of output records and the
number of input records (also accounts for any filter logic).

131 AIO resources exceeded AIO resources are insufficient and must be increased.

132-150 Reserved for future use N/A

151 Operating System Error Indicates an operating system error that is not otherwise
covered by one of the standard error conditions.

152 Parameter Error Indicates that a routine was called with an illegal parameter.
Check your RowGen specification.

153 Too Many Files Open An attempt was made to open more files than the system
allows open at once.

154 Unsupported Action for the Current File Mode An operation was requested that the current file open mode
does not allow.

155 Record in Use by Another Process/Task The requested record is locked by another process/task.

156 Corrupted Index File The indexed file is corrupt. It should be reconstructed using
the appropriate host system utility.

157 Duplicate Key Detected Where it is Not Allowed A duplicate key was detected where duplicates are not
allowed.

158 Requested Record Was Not Found The requested record was not found. This can indicate the
end or beginning of the file.

159 File Handler has Undefined Status The current file operation cannot be completed because it
detected an undefined status for a parameter.

160 Disk Full Insufficient space for input/temporary/output files.

161 File in Use by Another Process/Task The file is locked by another process/task.

162 Mismatch in Record Size Mismatch detected in the record size specifications.

163 File Type Mismatch Trying to treat a file with a different /PROCESS type.

164 Insufficient Memory Required memory space could not be dynamically allocated.

165 No Such File The specified file does not exist.

166 Permission Denied Insufficient privileges to access the specified file.

167 Requested Operation Not Supported by this Host System The requested operation is not supported on your machine.

168 System Ran Out of Lock-Table Entries Indicates an error when your machine ran out of lock-table
entries. Try again.

169 Vision License Error Invalid license file to generate vision files. Make sure you
have the Vision license file rowgen.vlc in the directory where
you have the RowGen executable. The Vision license file
can be obtained from a Micro Focus (formerly Acucorp)
representative.

170 Unknown Exception An internal problem detected due to invalid input data or
specification. Please make sure that the input data and
specifications are valid. Contact IRI technical support if you
cannot resolve the problem.

171 Error in the Vision Transaction System Indicates that an error occurred in the Vision transaction
system.

172 Header information missing in input file The input /PROCESS type you have specified requires a
header record to exist, and it cannot be found.
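When post-processing job results in a script, a lookup keyed on the values in Table 3 can make reports easier to read. The sketch below copies only a handful of rows from the table and is not part of RowGen; the names ERROR_VALUES, MONITOR_EVENTS and describe are hypothetical, and how a calling program obtains the value (for example, from the system status) is left as an assumption outside the sketch.

```python
# A handful of entries copied from Table 3; the full table has many more.
ERROR_VALUES = {
    0: "Normal Return",
    2: "Insufficient Memory",
    7: "Error in WORK_AREAS",
    33: "Terminated",
    52: "ROWGEN_HOME is Not Set in the Environment",
    121: "Unable to Open File",
    129: "Disk Full",
}

# Values 57-69 are listed in Table 3 as monitor event messages, not failures.
MONITOR_EVENTS = range(57, 70)


def describe(value: int) -> str:
    """Return a short label for a Table 3 value."""
    if value in MONITOR_EVENTS:
        return "monitor event message"
    if 132 <= value <= 150:
        return "reserved for future use"
    return ERROR_VALUES.get(value, "see Table 3")
```

Separating the monitor event range from the error entries keeps informational messages, such as Infile Opened, from being reported as failures.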


B.2 Detailed Error and Runtime Messages

The following list contains details of the RowGen error and runtime messages,
presented alphabetically. For a list of messages by value number, see Table 3 on
page 219.

Abbreviation of ASCENDING
A RowGen syntax error.

Accepted Rejected This message is generated by MONITOR. See MONITOR_LEVEL
level number on page 216 and /MONITOR on page 174.

AIO resources exceeded


AIO resources are insufficient and must be increased. The method for
increasing AIO values is operating-system-dependent (see the
Platform Considerations section in the RowGen Install Guide).

Ambiguous Reference A RowGen syntax error.

Applies Only to Keys A RowGen syntax error.

Blocking Factor Invalid


A RowGen syntax error.

Cannot Find a Matching File


The specified pattern, such as *.dat, for an input file name did not
match any file in the file system.

Cannot Set Environment Variable


For example, setlocale() returns an error.

Cannot Set Locale on System


The requested locale routines are not available.

Circular Definition A RowGen syntax error.

Completed This message is generated by MONITOR. See MONITOR_LEVEL
level number on page 216 and /MONITOR on page 174.

Condition Define on a Different File


Defined on one file and used on another.

Constant in Conditional Possible Error


An invalid data type has been specified for the numeric operand in a
/CONDITION statement.


Corrupted Indexed File


The indexed file is corrupt. It should be reconstructed using the
appropriate host system utility.

ROWGEN_HOME Not in Environment


Use the appropriate command to assign the $ROWGEN_HOME
environment (shell) variable to your installed RowGen directory.

rowgenrc Report Reports the total number of RowGen errors in /DEBUG mode.

csio Error This message is generated by MONITOR. See MONITOR_LEVEL
level number on page 216 and /MONITOR on page 174.

Disk Full Insufficient space for input/temporary/output purposes.

Divide by Zero Mathematical error.

Duplicate Key Detected Where it is not Allowed


A duplicate key was detected where duplicates are not allowed.

Duplicate Name A RowGen syntax error. FILLER, when used as a field name, is
exempt from this error.

Empty Input File A warning message that indicates that the specified file is empty.

Environment Variable Undefined


Check the data or job specifications, or the resource control file(s) in
use for a missing or undefined $variable.

Error in the Vision Transaction System


Indicates that an error occurred in the Vision transaction system.

Error in WORK_AREAS
WORK_AREAS specification in your resource control file is invalid or
cannot be used. Check whether the WORK_AREAS directory exists and
whether you have read, write, and file-create access.
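The check suggested above can be done by hand or with a short script before a job is run. The following is a minimal sketch using standard library calls, not a RowGen utility; the function name is hypothetical.

```python
import os


def work_area_problem(path: str):
    """Return a description of why path cannot serve as a work area, or None."""
    if not os.path.isdir(path):
        return "directory does not exist"
    # On a directory, write plus execute (search) permission is what allows
    # files to be created inside it.
    needed = os.R_OK | os.W_OK | os.X_OK
    if not os.access(path, needed):
        return "missing read, write, or file-create access"
    return None
```

Running this against each WORK_AREAS entry in the resource control file identifies the directory to fix or replace.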

Error Return from RowGen


Coroutine reports an error to the caller.

Expecting A RowGen syntax error.

Expecting Names A RowGen syntax error.

Expression Syntax A RowGen syntax error.

Field Length > Record Length


A RowGen syntax error.


Field Type vs. Length Wrong


Binary data types may only be used at integer multiples of a specific
length. In this case, the length specified for the type is incorrect.

File Handler Has Undefined Status


The current file operation cannot be completed because it detected an
undefined status for a parameter.

File in Use by Another Process/Task


The file is locked by another process/task.

File Type Mismatch Trying to treat a file with a different /PROCESS type.

Header information missing in input file


The input /PROCESS type you have specified, such as CSV or ELF,
requires a header record to exist, and it cannot be found.

Illegal Character A RowGen syntax error.

Illegal Comparison Cannot compare data of these different forms.

Improper Command Not a recognized command.

Improper Format Declaration


Numeric internal fields must be on fixed boundaries; their format
must be one of the types discussed in Data Types on page 93.

Incomplete Command Missing part(s) of a command.

Infile Closed This message is generated by MONITOR. See MONITOR_LEVEL
level number on page 216 and /MONITOR on page 174.

Infile Opened This message is generated by MONITOR. See MONITOR_LEVEL
level number on page 216 and /MONITOR on page 174.

Initiated This message is generated by MONITOR. See MONITOR_LEVEL
level number on page 216 and /MONITOR on page 174.

Insufficient Disk Space for Both Output File and Work File(s)
RowGen has found that the output area and at least one of the work
areas overlap, and that the file system does not have enough free
space to accommodate both the sort output size and the temporary
disk space requirements of the sort. In the simplest case of a single
work area overlapping with the output area, the estimated
consumption of the common area is typically equal to two times the
total size of the input files.

Insufficient Disk Space for Output File

RowGen has found that the file system mounted on the output file
path does not have enough space to hold the output of the sort. The
space required of that file system is typically equal to the total
size of the input files.

Insufficient Disk Space for Work File(s)


RowGen has found that the sum of disk space available in the work
areas is less than the estimated consumption of the sort. The
estimated consumption of the work areas is typically equal to the total
size of the input files.
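The three disk space messages rest on simple estimates: roughly one times the total input size for the output area, one times for the work areas, and two times when the two overlap. The sketch below applies those estimates to numbers supplied by the caller. It is an approximation under stated assumptions, not RowGen's actual check: the function name is hypothetical, the order of the tests is an assumption, and the sketch ignores the memory allowance that Table 3 mentions for work files.

```python
def disk_space_error(total_input_bytes, output_free, work_free, shared=False):
    """Return the Table 3 value (34, 35 or 36) a preflight check would flag, or 0.

    output_free and work_free are free bytes on the output and work file
    systems. shared=True means the output area and work area overlap, in which
    case output_free is the free space of that common file system.
    """
    if shared:
        # One area must hold both the work files and the output.
        return 36 if 2 * total_input_bytes > output_free else 0
    if total_input_bytes > output_free:
        return 34
    if total_input_bytes > work_free:
        return 35
    return 0
```

For example, 100GB of input with a shared area that has 150GB free would be flagged as 36, while the same input with separate 150GB areas would pass.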

Insufficient Memory Required memory space could not be dynamically allocated.

Invalid Argument Illegal argument, such as sqrt(-1) or month 13.

Invalid Conversion Cannot convert this data type.

Invalid Direction Only 0 for ascending or 1 for descending is permitted.

Invalid Field Position The acceptable field start range is 0 ≤ Position ≤ 65,535.

Invalid File The specified file is not a regular file. This could happen when you
try to enter a directory name as the input file.

Invalid Format Declaration


Used for improper format definition by the user. Also, it is used when
data does not conform to the definition.

Invalid Key Specification


Field positions may be specified as 0 for fixed, 1 for blank, or 2 for
character delimited.

Invalid Macro Referenced


Cannot convert this data type.

Invalid Number of Keys


The acceptable range is 0 ≤ Number of keys ≤ 65,535.

Invalid Record Length


This is an Execution Phase error associated with variable-length
records and fixed and floating keys. An error situation occurs when a
fixed position key starts later than the record length. A variable-
position error occurs when the record does not contain enough of the
field separators, and the key cannot be found.

Invalid Record Length for this File


The file size is not an integer multiple of the fixed-record length.
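The condition behind this message is a remainder test, sketched below. The function name is hypothetical and the sketch is not taken from RowGen.

```python
def has_whole_records(file_size: int, record_length: int) -> bool:
    """True when file_size is an exact multiple of the fixed record length."""
    if record_length <= 0:
        raise ValueError("record length must be positive")
    return file_size % record_length == 0
```

A 1,000-byte file declared with 80-byte fixed records fails the test, because 12 whole records occupy 960 bytes and 40 bytes are left over.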


Invalid Resource Variable in Resource Control File


Your resource control file contains lines that do not specify valid
resource variables.

Invalid Value An invalid value for the given data type/task/feature has been
specified. Check your RowGen specification.

Invalid Work Area specified


You have specified a directory for a work area in your rowgenrc file
which does not exist or for which you do not have the appropriate
rights.

Last Record Incomplete


Wrong record length or missing record terminator.

Left Right This is a join event message generated by MONITOR. See
MONITOR_LEVEL level number on page 216 and /MONITOR on
page 174.

License Error: Cannot Obtain Machine ID


Contact your RowGen agent.

License Violation: Expiration Date Passed or Invalid Key


Your RowGen evaluation period has expired. Contact your RowGen
agent to obtain a permanent license.

License Violation: Incorrect Node or Invalid Key


You are attempting to run RowGen on a machine that is not licensed.

License Violation: Invalid key


Recheck the license keys you received from IRI or its agent. Contact
IRI if you still cannot run after re-trying the setup procedure.

License Violation: This Application is Not Licensed


Contact your RowGen agent.

Maximum Number of Sort Threads Exceeded


Trying to run more sort threads than your license allows. Set
THREAD_MAX in your resource control file to the value that RowGen
is licensed for. Usually, this is the same as the number of CPUs on your
machine.

MFVL on Output Requires MFVL on Input


A RowGen syntax error.

Mismatch in Record Size


A mismatch was detected in the record size specifications.


Missing Double Quote Mark


A RowGen syntax error.

Missing records (number read != number written)


There is a mismatch between the number of output/produced records
and the number of input/generated records (also accounts for any
filter logic). Contact your RowGen agent.

No File with this Name


A RowGen syntax error.

No /HEADREC on input file


Need to define a /HEADREC on the referenced input file.

No Such File The specified file does not exist in the file system.

No Such Locale on System


Unknown locale specified.

Not a Valid Option Here


Improper syntax.

Not an Active File A RowGen syntax error.

Not Supported The feature/syntax is not supported by RowGen. Contact your
RowGen agent.

Operating System Error


Indicates an operating system error that is not otherwise covered by
one of the standard error conditions.

Outfile Closed This message is generated by MONITOR. See MONITOR_LEVEL
level number on page 216 and /MONITOR on page 174.

Outfile Opened This message is generated by MONITOR. See MONITOR_LEVEL
level number on page 216 and /MONITOR on page 174.

Output Type Unknown


Only 0 for program return, 1 for file out, 2 for standard out, and 3 for
both 1 and 2 are acceptable values for output Type.

Overlapping Field A RowGen syntax error.

Parameter Error Indicates that a routine was called with an illegal parameter. Check
your RowGen specification.

Parenthesis Count A RowGen syntax error.

Permission Denied Insufficient privileges to access the specified file. Check the access
privileges of the specified file.


Problem With User’s Output File


An I/O problem occurred while you were producing the output file. This
is usually caused by running out of disk space.

Process Begins This message is generated by MONITOR. See MONITOR_LEVEL
level number on page 216 and /MONITOR on page 174.

Process Ends This message is generated by MONITOR. See MONITOR_LEVEL
level number on page 216 and /MONITOR on page 174.

Record in Use by Another Process/Task


The requested record is locked by another process/task.

Record Length Improper


The acceptable range is 0 ≤ Record Length ≤ 65,535.

Records Are in Order SET file requires the NOT_SORTED option.

Records Not in Order From a check action.

Records Processed This message is generated by MONITOR. See MONITOR_LEVEL
level number on page 216 and /MONITOR on page 174.

Requested Operation Not Supported by this Host System


The requested operation is not supported on your machine.

Requested Record was not Found


The requested record was not found. This may indicate the
end/beginning of the file.

Require Memory Amount


Tuning parameter missing from cosort() or registry.

Require Directory Name


Tuning parameter missing from cosort() or registry.

Scripts Too Deeply Nested


A RowGen syntax error.

Source is Elsewhere A RowGen syntax error.

Specific Line Too Long


A RowGen statement or command line limit exceeded.

System Error This message is generated by MONITOR. See MONITOR_LEVEL
level number on page 216 and /MONITOR on page 174.

System Ran Out of Lock-Table Entries


Your machine ran out of lock-table entries. Try the operation again.


Terminated Program execution terminated by Ctrl-C.

Too Many Errors -- Aborting


The RowGen script analyzer stops after 8 syntax errors.

Too Many Files Open An attempt was made to open more files than the system allows open
at once.

Uncorrected Error Condition


This error results from subsequent calls into cosort() after an
error has already been reported. The only valid call after a reported
error is one that initiates a new sort or merge.

Unknown Alignment Type


Only 0 for none, 1 for left, and 2 for right are acceptable values.

Unknown Case Type Only 0 for no case conversion, or 1 for case conversion, are
acceptable values.

Unknown Exception An internal error detected due to invalid data or specification.
Check whether the input data and specifications are valid. Contact
IRI technical support if you cannot resolve the problem.

Unable to Open File Error in opening the specified file. This could be due to resource
limitations of your machine during runtime.

Unable to Read File Error in reading the specified file. This could be due to resource
limitations of your machine during runtime.

Unable to Write File Error in writing the specified file. This could be due to resource
limitations of your machine during runtime.

Unrecognized Name in Expression


An expression references a field that was not specified.

Unrecognized Word A RowGen syntax error.

Unsupported Action for the Current File Mode


An operation was requested that the current file open mode does not
allow.

Vision License Error The license file for generating Vision files is invalid. Make sure
the Vision license file, named rowgen.vlc, is in the same directory as
the RowGen executable. A Vision license file can be obtained
from a Micro Focus (formerly Acucorp) representative.

Workfile Closed This message is generated by MONITOR. See MONITOR_LEVEL
level number on page 216 and /MONITOR on page 174.


Workfile Opened This message is generated by MONITOR. See MONITOR_LEVEL
level number on page 216 and /MONITOR on page 174.

C ASCII COLLATING SEQUENCE

Dec Hex Chr Dec Hex Chr Dec Hex Chr Dec Hex Chr

000 00 Nul 032 20 Bl 064 40 @ 096 60 `
001 01 Soh 033 21 ! 065 41 A 097 61 a
002 02 Stx 034 22 " 066 42 B 098 62 b
003 03 Etx 035 23 # 067 43 C 099 63 c
004 04 Eot 036 24 $ 068 44 D 100 64 d
005 05 Enq 037 25 % 069 45 E 101 65 e
006 06 Ack 038 26 & 070 46 F 102 66 f
007 07 Bel 039 27 ' 071 47 G 103 67 g
008 08 Bs 040 28 ( 072 48 H 104 68 h
009 09 Ht 041 29 ) 073 49 I 105 69 i
010 0A Lf 042 2A * 074 4A J 106 6A j
011 0B Vt 043 2B + 075 4B K 107 6B k
012 0C Ff 044 2C , 076 4C L 108 6C l
013 0D Cr 045 2D - 077 4D M 109 6D m
014 0E So 046 2E . 078 4E N 110 6E n
015 0F Si 047 2F / 079 4F O 111 6F o
016 10 Dle 048 30 0 080 50 P 112 70 p
017 11 Dc1 049 31 1 081 51 Q 113 71 q
018 12 Dc2 050 32 2 082 52 R 114 72 r
019 13 Dc3 051 33 3 083 53 S 115 73 s
020 14 Dc4 052 34 4 084 54 T 116 74 t
021 15 Nak 053 35 5 085 55 U 117 75 u
022 16 Syn 054 36 6 086 56 V 118 76 v
023 17 Etb 055 37 7 087 57 W 119 77 w
024 18 Can 056 38 8 088 58 X 120 78 x
025 19 Em 057 39 9 089 59 Y 121 79 y
026 1A Sub 058 3A : 090 5A Z 122 7A z
027 1B Esc 059 3B ; 091 5B [ 123 7B {
028 1C Fs 060 3C < 092 5C \ 124 7C |
029 1D Gs 061 3D = 093 5D ] 125 7D }
030 1E Rs 062 3E > 094 5E ^ 126 7E ~
031 1F Us 063 3F ? 095 5F _ 127 7F Del
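
As an illustration of how this sequence orders data, the short Python sketch below sorts a few made-up strings by their ASCII byte values. It is only a demonstration of the table, not RowGen syntax; the sample strings and the name ascii_key are invented for the example.

```python
# Sort sample strings by their raw ASCII byte values, following the table above.
words = ["apple", "Zebra", "42", "_tmp", " lead", "Apple"]


def ascii_key(s: str) -> bytes:
    """Return the ASCII bytes of s so comparisons use the code values."""
    return s.encode("ascii")


ordered = sorted(words, key=ascii_key)
print(ordered)
# Blank (032) sorts first, then digits (048-057), uppercase letters (065-090),
# the underscore (095), and finally lowercase letters (097-122).
```

The result places " lead" first and "apple" last, since the blank has the lowest code value among the leading characters and lowercase letters have the highest.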


D EBCDIC PRINTING CHARACTERS

Dec Hex Chr    Dec Hex Chr    Dec Hex Chr    Dec Hex Chr

064 40  Bl                    192 C0  {      240 F0  0
074 4A  ¢      129 81  a      193 C1  A      241 F1  1
075 4B  .      130 82  b      194 C2  B      242 F2  2
076 4C  <      131 83  c      195 C3  C      243 F3  3
077 4D  (      132 84  d      196 C4  D      244 F4  4
078 4E  +      133 85  e      197 C5  E      245 F5  5
079 4F  |      134 86  f      198 C6  F      246 F6  6
080 50  &      135 87  g      199 C7  G      247 F7  7
090 5A  !      136 88  h      200 C8  H      248 F8  8
091 5B  $      137 89  i      201 C9  I      249 F9  9
092 5C  *                     208 D0  }
093 5D  )      145 91  j      209 D1  J
094 5E  ;      146 92  k      210 D2  K
095 5F  ^      147 93  l      211 D3  L
096 60  -      148 94  m      212 D4  M
097 61  /      149 95  n      213 D5  N
106 6A  |      150 96  o      214 D6  O
107 6B  ,      151 97  p      215 D7  P
108 6C  %      152 98  q      216 D8  Q
109 6D  _      153 99  r      217 D9  R
110 6E  >      161 A1  ~      224 E0  \
111 6F  ?      162 A2  s      226 E2  S
121 79  `      163 A3  t      227 E3  T
122 7A  :      164 A4  u      228 E4  U
123 7B  #      165 A5  v      229 E5  V
124 7C  @      166 A6  w      230 E6  W
125 7D  '      167 A7  x      231 E7  X
126 7E  =      168 A8  y      232 E8  Y
127 7F  "      169 A9  z      233 E9  Z
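
The practical effect of these code values is that EBCDIC orders lowercase letters before uppercase letters, and letters before digits, which is the reverse of ASCII. The Python sketch below shows this with made-up sample strings. It assumes Python's built-in "cp037" codec as a stand-in for EBCDIC; this manual does not say which EBCDIC code page RowGen uses, and some punctuation differs between code pages, so the example sticks to letters, digits, and the blank. The name ebcdic_key is invented for the example.

```python
# Assumption: code page 037 ("cp037") stands in for the EBCDIC table above.
words = ["apple", "Zebra", "42", " lead", "Apple"]


def ebcdic_key(s: str) -> bytes:
    """Return the cp037 bytes of s so comparisons use EBCDIC code values."""
    return s.encode("cp037")


ebcdic_order = sorted(words, key=ebcdic_key)
print(ebcdic_order)

# Spot-check a few code values against the table (hex).
for ch in (" ", "a", "A", "0"):
    print(repr(ch), ch.encode("cp037").hex().upper())
```

Under this assumption the blank (40) sorts first, followed by lowercase letters (81-A9), uppercase letters (C1-E9), and digits (F0-F9).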

We Need Your Input

IRI, Inc. upholds a standard of documentation quality that is best
maintained through ongoing user feedback. If you have any comments, concerns,
or suggestions regarding this documentation, please communicate them to us.

Please indicate the title and page number(s) in the manual you are addressing.
Be sure to provide your name, address (or e-mail address), and telephone number
if you would like a reply from IRI.


Mailing address: RowGen Technical Publications
INNOVATIVE ROUTINES INTERNATIONAL, INC.
2194 Highway A1A, Suite 303
Melbourne, FL 32937-4932
USA

Telephone: USA +1 (321) 777-8889

Fax: USA +1 (321) 777-8886

E-Mail: rowgen@iri.com

The comments you provide may be used by IRI to improve the quality of, or to make
additions to, this document and/or the RowGen software.

Additional suggestions or submissions may be made or posted through the technical
support area at http://www.cosort.com/support.

