
Table Management Utility
Reference Guide

IBM Red Brick Warehouse
Version 6.2
August 2002
Part No.
Note:
Before using this information and the product it supports, read the information in the appendix
entitled “Notices.”

This document contains proprietary information of IBM. It is provided under a license agreement and is
protected by copyright law. The information contained in this publication does not include any product
warranties, and any statements provided in this manual should not be interpreted as such.

When you send information to IBM, you grant IBM a nonexclusive right to use or distribute the information
in any way it believes appropriate without incurring any obligation to you.

© Copyright International Business Machines Corporation 1996, 2002. All rights reserved.

US Government User Restricted Rights—Use, duplication or disclosure restricted by GSA ADP Schedule
Contract with IBM Corp.

Table of Contents

Introduction
In This Introduction . . . . . . . . . . . . . . . . . 3
About This Guide . . . . . . . . . . . . . . . . . . 3
Types of Users . . . . . . . . . . . . . . . . . . 4
Software Dependencies . . . . . . . . . . . . . . . 4
Documentation Conventions . . . . . . . . . . . . . . 5
Typographical Conventions . . . . . . . . . . . . . 5
Syntax Notation . . . . . . . . . . . . . . . . . 6
Syntax Diagrams . . . . . . . . . . . . . . . . . 7
Keywords and Punctuation . . . . . . . . . . . . . 9
Identifiers and Names . . . . . . . . . . . . . . . 10
Icon Conventions . . . . . . . . . . . . . . . . . 11
Customer Support . . . . . . . . . . . . . . . . . . 12
New Cases . . . . . . . . . . . . . . . . . . . 12
Existing Cases . . . . . . . . . . . . . . . . . . 13
Troubleshooting Tips . . . . . . . . . . . . . . . . 13
Related Documentation . . . . . . . . . . . . . . . . 14
Additional Documentation . . . . . . . . . . . . . . . 16
Online Documents . . . . . . . . . . . . . . . . 16
Printed Documents . . . . . . . . . . . . . . . . 16
Online Help . . . . . . . . . . . . . . . . . . . 16
IBM Welcomes Your Comments . . . . . . . . . . . . . 17
Chapter 1 Introduction to the Table Management Utility
In This Chapter . . . . . . . . . . . . . . . . . . . 1-3
TMU Operations and Functions . . . . . . . . . . . . . 1-4
TMU Control Files and Statements . . . . . . . . . . . . 1-8
Termination . . . . . . . . . . . . . . . . . . 1-8
Comments . . . . . . . . . . . . . . . . . . . 1-9
Locales and Multibyte Characters . . . . . . . . . . . 1-9
USER Statement . . . . . . . . . . . . . . . . . 1-9
LOAD DATA and SYNCH Statements . . . . . . . . . 1-10
UNLOAD Statements . . . . . . . . . . . . . . . 1-10
GENERATE Statements . . . . . . . . . . . . . . 1-11
REORG Statements . . . . . . . . . . . . . . . . 1-11
BACKUP Statements . . . . . . . . . . . . . . . 1-11
RESTORE Statements . . . . . . . . . . . . . . . 1-12
UPGRADE Statements . . . . . . . . . . . . . . . 1-12
SET Statements . . . . . . . . . . . . . . . . . 1-13

Chapter 2 Running the TMU and PTMU
In This Chapter . . . . . . . . . . . . . . . . . . . 2-3
User Access and Required Permission . . . . . . . . . . . 2-4
Operating System Access . . . . . . . . . . . . . . 2-4
Database Access . . . . . . . . . . . . . . . . . 2-5
Permissions on TMU Output Files . . . . . . . . . . . 2-5
Syntax for rb_tmu and rb_ptmu Programs . . . . . . . . . 2-5
Exit Status Codes . . . . . . . . . . . . . . . . . . 2-7
Setting Up the TMU . . . . . . . . . . . . . . . . . 2-8
Remote TMU Setup and Syntax . . . . . . . . . . . . . 2-12
Client-Server Compatibility . . . . . . . . . . . . . 2-12
Client Configuration . . . . . . . . . . . . . . . 2-13
Server Configuration . . . . . . . . . . . . . . . 2-14
Syntax for the rb_ctmu Program . . . . . . . . . . . 2-14
Summary of Remote TMU Operation . . . . . . . . . . 2-18
Example: Windows-to-UNIX Remote TMU Operation . . . . 2-19
USER Statement for User Name and Password . . . . . . . . 2-21
SET Statements and Parameters to Control Behavior . . . . . . 2-23
Lock Behavior . . . . . . . . . . . . . . . . . . 2-25
Buffer-Cache Size . . . . . . . . . . . . . . . . 2-27
Temporary Space Management . . . . . . . . . . . . 2-28
Format of Datetime Values . . . . . . . . . . . . . 2-33

Load Information Limit . . . . . . . . . . . . . . . 2-34
Memory-Map Limit . . . . . . . . . . . . . . . . 2-35
Setting Precomputed View Maintenance . . . . . . . . . 2-36
Precomputed View Maintenance On Error . . . . . . . . 2-36
Managing Row Messages . . . . . . . . . . . . . . 2-38
Enabling Versioning . . . . . . . . . . . . . . . . 2-39
Commit Record Interval . . . . . . . . . . . . . . . 2-40
Commit Time Interval . . . . . . . . . . . . . . . 2-42
Displaying Load Statistics . . . . . . . . . . . . . . 2-45
Backup and Restore (BAR) Unit Size . . . . . . . . . . 2-45
External Backup and Restore Operations . . . . . . . . . 2-46
REORG Tasks . . . . . . . . . . . . . . . . . . 2-47
Parallel Loading Tasks (PTMU Only) . . . . . . . . . . 2-48
Serial Mode Operation (PTMU Only) . . . . . . . . . . 2-50
Suggestions for Effective PTMU Operations . . . . . . . . . 2-52
Operations That Use Parallel Processing . . . . . . . . . 2-52
Discard Limits on Parallel Load Operations . . . . . . . . 2-53
AUTOROWGEN with the PTMU . . . . . . . . . . . 2-53
Multiple Tape Drives with the PTMU . . . . . . . . . . 2-54
3480/3490 Multiple-Tape Drive with the PTMU . . . . . . 2-54

Chapter 3 Loading Data into a Warehouse Database
In This Chapter . . . . . . . . . . . . . . . . . . . 3-5
The LOAD DATA Operation . . . . . . . . . . . . . . 3-6
Inputs and Outputs . . . . . . . . . . . . . . . . 3-6
Processing Stages for Loading Data . . . . . . . . . . . 3-8
Procedure for Loading Data . . . . . . . . . . . . . . . 3-12
Some Preliminary Decisions . . . . . . . . . . . . . . . 3-14
Determining Table Order . . . . . . . . . . . . . . 3-14
Ordering Input Data . . . . . . . . . . . . . . . . 3-15
Maintaining Referential Integrity with Automatic Row Generation 3-16
Writing a LOAD DATA Statement . . . . . . . . . . . . . 3-23
LOAD DATA Syntax . . . . . . . . . . . . . . . . . 3-24
Input Clause . . . . . . . . . . . . . . . . . . . . 3-25
Format Clause . . . . . . . . . . . . . . . . . . . 3-29
EBCDIC to ASCII Conversion . . . . . . . . . . . . . 3-35
Locale Clause . . . . . . . . . . . . . . . . . . . . 3-38
Locale Specifications for XML Input Files . . . . . . . . . 3-41
Usage Notes . . . . . . . . . . . . . . . . . . . 3-42

Discard Clause . . . . . . . . . . . . . . . . . . . 3-43
Usage. . . . . . . . . . . . . . . . . . . . . 3-54
Row Messages Clause . . . . . . . . . . . . . . . . 3-57
Optimize Clause . . . . . . . . . . . . . . . . . . 3-59
MMAP Index Clause . . . . . . . . . . . . . . . . . 3-63
Table Clause . . . . . . . . . . . . . . . . . . . . 3-65
Loading a SERIAL Column . . . . . . . . . . . . . 3-68
Selective Column Updates with RETAIN and DEFAULT . . . 3-69
Simple Fields . . . . . . . . . . . . . . . . . . 3-71
Concatenated Fields. . . . . . . . . . . . . . . . 3-81
Constant Fields . . . . . . . . . . . . . . . . . 3-84
Sequence Fields . . . . . . . . . . . . . . . . . 3-85
Increment Fields . . . . . . . . . . . . . . . . . 3-86
Segment Clause . . . . . . . . . . . . . . . . . . 3-87
Criteria Clause . . . . . . . . . . . . . . . . . . . 3-90
Comment Clause . . . . . . . . . . . . . . . . . . 3-95
Field Types . . . . . . . . . . . . . . . . . . . . 3-97
Character Field Type . . . . . . . . . . . . . . . 3-99
Numeric External Field Types . . . . . . . . . . . . 3-101
Floating-Point External Field Type . . . . . . . . . . 3-103
Packed and Zoned Decimal Field Types . . . . . . . . . 3-104
Integer Binary Field Types . . . . . . . . . . . . . 3-105
Floating-Point Binary Field Types . . . . . . . . . . . 3-106
Datetime Field Types . . . . . . . . . . . . . . . 3-107
Format Masks for Datetime Fields . . . . . . . . . . . . 3-109
Subfield Components . . . . . . . . . . . . . . . 3-110
Restricted Datetime Masks for Numeric Fields . . . . . . . . 3-116
Writing a SYNCH Statement . . . . . . . . . . . . . . 3-119
Format of Input Data . . . . . . . . . . . . . . . . . 3-122
Disk Files . . . . . . . . . . . . . . . . . . . 3-123
Tape Files on UNIX Operating Systems . . . . . . . . . 3-131
Field-Type Conversions . . . . . . . . . . . . . . . . 3-133
LOAD DATA Syntax Summary . . . . . . . . . . . . . 3-137

Chapter 4 Unloading Data from a Table
In This Chapter . . . . . . . . . . . . . . . . . . . 4-3
The UNLOAD Operation . . . . . . . . . . . . . . . . 4-4
Internal Format . . . . . . . . . . . . . . . . . . 4-5
External Format . . . . . . . . . . . . . . . . . 4-5
Data Conversion to External Format . . . . . . . . . . 4-6
UNLOAD Syntax . . . . . . . . . . . . . . . . . . 4-8
Unloading or Loading Internal-Format Data . . . . . . . . . 4-14
Unloading or Loading External-Format Data . . . . . . . . . 4-16
Converting a Table to Multiple Segments . . . . . . . . . . 4-18
Moving a Database . . . . . . . . . . . . . . . . . . 4-18
Loading External-Format Data into Third-Party Tools . . . . . . 4-19
Unloading Selected Rows . . . . . . . . . . . . . . . . 4-19
Example: External Fixed-Format Data . . . . . . . . . . 4-20
Example: External Variable-Format Data . . . . . . . . . 4-22

Chapter 5 Generating CREATE TABLE and LOAD DATA Statements
In This Chapter . . . . . . . . . . . . . . . . . . . 5-3
Generating CREATE TABLE Statements . . . . . . . . . . . 5-3
Generating LOAD DATA Statements . . . . . . . . . . . . 5-5
Example: GENERATE Statements and External-Format Data . . . 5-8

Chapter 6 Reorganizing Tables and Indexes
In This Chapter . . . . . . . . . . . . . . . . . . . 6-3
The REORG Operation . . . . . . . . . . . . . . . . 6-3
REORG Operation Options. . . . . . . . . . . . . . 6-5
Data Processing During the REORG Operation . . . . . . . . 6-7
Coordinator Stage . . . . . . . . . . . . . . . . . 6-10
Input Stage . . . . . . . . . . . . . . . . . . . 6-10
Conversion Stage . . . . . . . . . . . . . . . . . 6-10
Index-Building Stage . . . . . . . . . . . . . . . . 6-11
Cleanup Stage . . . . . . . . . . . . . . . . . . 6-11
REORG Syntax . . . . . . . . . . . . . . . . . . . 6-12
discardfile Clause . . . . . . . . . . . . . . . . . 6-19
Usage Notes . . . . . . . . . . . . . . . . . . . 6-21

Chapter 7 Moving Data with the Copy Management Utility
In This Chapter . . . . . . . . . . . . . . . . . . . 7-3
The rb_cm Utility . . . . . . . . . . . . . . . . . . 7-4
System Requirements . . . . . . . . . . . . . . . 7-5
Database Security Requirements . . . . . . . . . . . 7-6
The rb_cm Syntax . . . . . . . . . . . . . . . . . . 7-7
TMU Control Files for Use with rb_cm . . . . . . . . . . . 7-10
LOAD and UNLOAD Statements . . . . . . . . . . . 7-11
SYNCH Statement . . . . . . . . . . . . . . . . 7-12
SET Statements . . . . . . . . . . . . . . . . . 7-13
Examples of rb_cm Operations . . . . . . . . . . . . . 7-13
Example: Copying Data Between Different Computers. . . . 7-14
Example: Copying Data Between Tables on the Same Computer 7-18
Verifying the Results of rb_cm Operations . . . . . . . . . 7-20

Chapter 8 Backing Up a Database
In This Chapter . . . . . . . . . . . . . . . . . . . 8-3
Backup Levels and Modes . . . . . . . . . . . . . . . 8-4
External Full Backups . . . . . . . . . . . . . . . 8-4
Restore Rules . . . . . . . . . . . . . . . . . . 8-5
Backup Data . . . . . . . . . . . . . . . . . . 8-5
Backup Strategies . . . . . . . . . . . . . . . . 8-6
Backup Procedure . . . . . . . . . . . . . . . . 8-8
Preparing the Database for Backups . . . . . . . . . . . 8-8
ALTER DATABASE CREATE BACKUP DATA Command . . 8-9
ALTER DATABASE DROP BACKUP DATA Command . . . 8-10
Storage Requirements for the Backup Segment . . . . . . 8-10
Altering the Backup Segment . . . . . . . . . . . . 8-11
How to Run a TMU Backup . . . . . . . . . . . . . . 8-13
Scope of Backup Operations . . . . . . . . . . . . . 8-14
Configuring the Size of Backup Files . . . . . . . . . . 8-15
Backups to Tape . . . . . . . . . . . . . . . . . 8-17
Using a Storage Manager for TMU Backups . . . . . . . 8-19
Using External Tools for Full Backups . . . . . . . . . 8-20
BACKUP Syntax . . . . . . . . . . . . . . . . . 8-22
Backup Metadata . . . . . . . . . . . . . . . . . . 8-26
Media History File (rbw_media_history) . . . . . . . . 8-27
Backup Log File (action_log) . . . . . . . . . . . . . 8-29
Chapter 9 Restoring a Database
In This Chapter . . . . . . . . . . . . . . . . . . . 9-3
Full and Partial TMU Restores . . . . . . . . . . . . . . 9-4
Restore Path . . . . . . . . . . . . . . . . . . . 9-4
Restore Examples . . . . . . . . . . . . . . . . . 9-5
How to Run a TMU Restore . . . . . . . . . . . . . . . 9-10
Recommended Procedure for Foreign Restore Operations . . . 9-11
Restore of Special Segments . . . . . . . . . . . . . 9-11
Cold Restore Operations. . . . . . . . . . . . . . . 9-12
PSUs for Objects Created After a Restored Backup. . . . . . 9-12
RESTORE Syntax . . . . . . . . . . . . . . . . . 9-13
Partial Restore Procedure . . . . . . . . . . . . . . 9-19

Appendix A Example: Using the TMU in AGGREGATE Mode

Appendix B Storage Manager Configuration for XBSA Backups

Notices

Index

Introduction

In This Introduction . . . . . . . . . . . . . . . . . . 3
About This Guide . . . . . . . . . . . . . . . . . . . 3
Types of Users . . . . . . . . . . . . . . . . . . . 4
Software Dependencies . . . . . . . . . . . . . . . . 4

Documentation Conventions . . . . . . . . . . . . . . . 5
Typographical Conventions . . . . . . . . . . . . . . 5
Syntax Notation . . . . . . . . . . . . . . . . . . 6
Syntax Diagrams . . . . . . . . . . . . . . . . . . 7
Keywords and Punctuation . . . . . . . . . . . . . . 9
Identifiers and Names . . . . . . . . . . . . . . . . 10
Icon Conventions . . . . . . . . . . . . . . . . . . 11
Comment Icons . . . . . . . . . . . . . . . . . 11
Platform Icons . . . . . . . . . . . . . . . . . . 11
Customer Support . . . . . . . . . . . . . . . . . . . 12
New Cases . . . . . . . . . . . . . . . . . . . . 12
Existing Cases . . . . . . . . . . . . . . . . . . . 13
Troubleshooting Tips . . . . . . . . . . . . . . . . . 13

Related Documentation . . . . . . . . . . . . . . . . . 14
Additional Documentation . . . . . . . . . . . . . . . . 16
Online Documents . . . . . . . . . . . . . . . . . 16
Printed Documents . . . . . . . . . . . . . . . . . 16
Online Help . . . . . . . . . . . . . . . . . . . . 16

IBM Welcomes Your Comments . . . . . . . . . . . . . . 17


In This Introduction
This Introduction provides an overview of the information in this document
and describes the conventions it uses.

About This Guide
This guide provides the information you need to use both the Table
Management Utility (TMU) and its parallel version, the PTMU, to load and
maintain the tables and indexes in IBM Red Brick Warehouse databases. It
includes information necessary for the effective use of the TMU, as well as
syntax definitions and procedural descriptions. Use it in conjunction with the
Administrator’s Guide to develop and maintain an efficient data warehouse
operation.

Information in this guide applies to IBM Red Brick Warehouse on UNIX, Linux, and Windows platforms. Features and options that are not available or applicable to all platforms are so indicated.

For operating-system or platform-specific information, refer to the Release Notes, the appropriate Installation and Configuration Guide, or the documentation that accompanies the hardware and operating system.

Types of Users
This guide is written for the following users:

■ Database administrators
■ Database users who are responsible for loading and maintaining the
tables and indexes in IBM Red Brick Warehouse

This guide assumes that you have the following background:

■ A working knowledge of your computer, your operating system, and the utilities that your operating system provides
■ Some experience working with relational databases or exposure to
database concepts
■ Some experience with database server administration,
operating-system administration, or network administration

Software Dependencies
This guide assumes that you are using IBM Red Brick Warehouse,
Version 6.2, as your database server.

IBM Red Brick Warehouse includes the Aroma database, which contains
sales data about a fictitious coffee and tea company. The database tracks daily
retail sales in stores owned by the Aroma Coffee and Tea Company. The
dimensional model for this database consists of a fact table and its
dimensions.

For information about how to create and populate the demonstration database, see the Administrator's Guide. For a description of the database and its contents, see the SQL Self-Study Guide.

The scripts that you use to install the demonstration database reside in the
redbrick_dir/sample_input directory, where redbrick_dir is the IBM Red Brick
Warehouse directory on your system.

Documentation Conventions
This section describes the conventions that this document uses. These
conventions make it easier to gather information from this and other volumes
in the documentation set.

The following conventions are discussed:

■ Typographical conventions
■ Syntax notation
■ Syntax diagrams
■ Keywords and punctuation
■ Identifiers and names
■ Icon conventions

Typographical Conventions
This document uses the following conventions to introduce new terms,
illustrate screen displays, describe command syntax, and so forth.

Convention    Meaning

KEYWORD       All primary elements in a programming language statement
              (keywords) appear in uppercase letters in a serif font.

italics       Within text, new terms and emphasized words appear in italics.
              Within syntax and code examples, variable values that you are
              to specify appear in italics.

boldface      Names of program entities (such as classes, events, and tables),
              environment variables, file and pathnames, and interface
              elements (such as icons, menu items, and buttons) appear in
              boldface.

monospace     Information that the product displays and information that you
              enter appear in a monospace typeface.

KEYSTROKE     Keys that you are to press appear in uppercase letters in a sans
              serif font.

♦             This symbol indicates the end of one or more product- or
              platform-specific paragraphs.

➞             This symbol indicates a menu item. For example, "Choose
              Tools➞Options" means choose the Options item from the
              Tools menu.
Tip: When you are instructed to "enter" characters or to "execute" a command, immediately press RETURN after the entry. When you are instructed to "type" the text or to "press" other keys, no RETURN is required.

Syntax Notation
This guide uses the following conventions to describe the syntax of
operating-system commands.

Command Element     Example      Convention

Values and          table_name   Items that you replace with an appropriate
parameters                       name, value, or expression are in italic type
                                 style.

Optional items      [ ]          Optional items are enclosed by square
                                 brackets. Do not type the brackets.

Choices             ONE|TWO      Choices are separated by vertical lines; choose
                                 one if desired.

Required choices    {ONE|TWO}    Required choices are enclosed in braces;
                                 choose one. Do not type the braces.

Default values      ONE|TWO      Default values are underlined, except in
                                 graphics where they are in bold type style.

Repeating items     name, …      Items that can be repeated are followed by a
                                 comma and an ellipsis. Separate the items
                                 with commas.

Language            ( ) , ; .    Parentheses, commas, semicolons, and
elements                         periods are language elements. Use them
                                 exactly as shown.

Syntax Diagrams
This guide uses diagrams built with the following components to describe
the syntax for statements and all commands other than system-level
commands.

Component                 Meaning

(begin arrow)             Statement begins.

(continuation arrow)      Statement syntax continues on next line. Syntax
                          elements other than complete statements end with
                          this symbol.

(continuation arrow)      Statement continues from previous line. Syntax
                          elements other than complete statements begin
                          with this symbol.

(end arrow)               Statement ends.

SELECT                    Required item in statement.

DISTINCT                  Optional item.

DBA TO                    Required item with choice. One and only one item
CONNECT TO                must be present.
SELECT ON

ASC                       Optional item with choice. If a default value exists,
DESC                      it is printed in bold.

, ASC                     Optional items. Several items are allowed; a
  DESC                    comma must precede each repetition.
The preceding syntax elements are combined to form a diagram as follows:

REORG table_name [INDEX (index_name, ...)]
    [RECALCULATE RANGES] [OPTIMIZE {ON|OFF}] ;

Complex syntax diagrams such as the one for the following statement are repeated as point-of-reference aids for the detailed diagrams of their components. Point-of-reference diagrams are indicated by their shadowed corners, gray lines, and reduced size.

LOAD DATA INPUT_CLAUSE [FORMAT_CLAUSE] [DISCARD_CLAUSE]
    TABLE_CLAUSE [optimize_clause] [segment_clause]
    [criteria_clause] [comment_clause] ;

The point-of-reference diagram is then followed by an expanded diagram of the shaded portion—in this case, the INPUT_CLAUSE.

{INPUTFILE | INDDN} 'filename' | TAPE DEVICE 'device_name' ('filename', ...)
    [START RECORD start_row] [STOP RECORD stop_row]

Keywords and Punctuation
Keywords are words reserved for statements and all commands except
system-level commands. When a keyword appears in a syntax diagram, it is
shown in uppercase characters. You can write a keyword in uppercase or
lowercase characters, but you must spell the keyword exactly as it appears in
the syntax diagram.

Any punctuation that occurs in a syntax diagram must also be included in your statements and commands exactly as shown in the diagram.

Identifiers and Names
Variables serve as placeholders for identifiers and names in the syntax
diagrams and examples. You can replace a variable with an arbitrary name,
identifier, or literal, depending on the context. Variables are also used to
represent complex syntax elements that are expanded in additional syntax
diagrams. When a variable appears in a syntax diagram, an example, or text,
it is shown in lowercase italic.

The following syntax diagram uses variables to illustrate the general form of
a simple SELECT statement.

SELECT column_name FROM table_name

When you write a SELECT statement of this form, you replace the variables
column_name and table_name with the name of a specific column and table.
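For example, using the store table from the Aroma sample database (the column name here is assumed for illustration), the completed statement might read:

SELECT store_name FROM store;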

Icon Conventions
Throughout the documentation, you will find text that is identified by several
different types of icons. This section describes these icons.

Comment Icons
Comment icons identify three types of information, as the following table
describes. This information always appears in italics.

Icon        Label        Description

(icon)      Warning      Identifies paragraphs that contain vital instructions,
                         cautions, or critical information

(icon)      Important    Identifies paragraphs that contain significant
                         information about the feature or operation that is
                         being described

(icon)      Tip          Identifies paragraphs that offer additional details or
                         shortcuts for the functionality that is being described

Platform Icons
Feature, product, and platform icons identify paragraphs that contain
platform-specific information.

Icon        Description

UNIX        Identifies information that is specific to the UNIX and
            Linux operating systems

Windows     Identifies information that is specific to Windows
            platforms

These icons can apply to an entire section or to one or more paragraphs within a section. If an icon appears next to a section heading, the information that applies to the indicated feature, product, or platform ends at the next heading at the same or higher level. A ♦ symbol indicates the end of feature-, product-, or platform-specific information that appears within one or more paragraphs within a section.

Customer Support
If you have technical questions about IBM Red Brick Warehouse but cannot
find the answer in the appropriate document, contact IBM Customer Support
as follows:

Telephone: 1-800-274-8184 or 1-913-492-2086
           (7 A.M. to 7 P.M. central time, Monday through Friday)

Internet access: http://www-3.ibm.com/software/data/informix/support/

New Cases
To log a new case, you must provide the following information:

■ IBM Red Brick Warehouse version
■ Platform and operating-system version
■ Error messages returned by IBM Red Brick Warehouse or the operating system
■ Concise description of the problem, including any commands or operations performed before you received the error message
■ List of IBM Red Brick Warehouse or operating-system configuration changes made before you received the error message

For problems concerning client-server connectivity, you must provide the following additional information:

■ Name and version of the client tool that you are using
■ Version of the Red Brick ODBC Driver or Red Brick JDBC Driver that
you are using, if applicable
■ Name and version of client network or TCP/IP stack in use
■ Error messages returned by the client application
■ Server and client locale specifications

Existing Cases
The support engineer who logs your case or first contacts you will always
give you a case number. This number is used to keep track of all the activities
performed during the resolution of each problem. To inquire about the status
of an existing case, you must provide your case number.

Troubleshooting Tips
You can often reduce the time it takes to close your case by providing the
smallest possible reproducible example of your problem. The more you can
isolate the cause of the problem, the more quickly the support engineer can
help you resolve it:

■ For SQL query problems, try to remove columns or functions or to restate WHERE, ORDER BY, or GROUP BY clauses until you can isolate the part of the statement causing the problem.
■ For Table Management Utility load problems, verify the data type
mapping between the source file and the target table to ensure
compatibility. Try to load a small test set of data to determine
whether the problem concerns volume or data format.
■ For connectivity problems, issue the ping command from the client
to the host to verify that the network is up and running. If possible,
try another client tool to see if the same problem arises.
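For example, on a UNIX client you might check the connection to a hypothetical server host named bigsur:

% ping bigsur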

Related Documentation
The standard documentation set for IBM Red Brick Warehouse includes the
following documents.

Document                              Description

Administrator's Guide                 Describes warehouse architecture, supported schemas,
                                      and other concepts relevant to databases. Provides
                                      procedural information for implementing and
                                      maintaining a database. Includes a description of the
                                      system tables and the configuration file.

Client Installation and               Includes procedures for installing ODBC, Red Brick
Connectivity Guide                    JDBC Driver, RISQL Entry Tool, and RISQL Reporter on
                                      client systems. Describes how to access IBM Red Brick
                                      Warehouse using ODBC for C and C++ applications and
                                      JDBC for Java applications.

IBM Red Brick Vista User's            Describes the IBM Red Brick Vista aggregate
Guide                                 computation and management system. Illustrates how
                                      Vista improves query performance by automatically
                                      rewriting queries to use aggregates, describes how the
                                      Advisor recommends the best set of aggregates based on
                                      data collected daily, and explains how aggregate tables
                                      are maintained when their detail tables are updated.

Installation and Configuration        Provides installation and configuration information, as
Guide                                 well as platform-specific material, about IBM Red Brick
                                      Warehouse. Customized for either UNIX and Linux
                                      platforms or Windows platforms.

Messages and Codes Reference          Contains a complete listing of all informational,
Guide                                 warning, and error messages generated by IBM Red
                                      Brick Warehouse products, including probable causes
                                      and recommended responses. Also includes event log
                                      messages that are written to the log files.

Query Performance Guide               Describes the determinants of query performance and
                                      shows how to tune a database for optimal query
                                      performance. Examples show how to evaluate query
                                      performance using Red Brick tools: SET STATS, Dynamic
                                      Statistic Tables, EXPLAIN, and the Query Performance
                                      Monitor.

Release Notes                         Contains information pertinent to the current release
                                      that was unavailable when the documents were printed.

RISQL Entry Tool and RISQL            Is a complete guide to the RISQL Entry Tool, a
Reporter User's Guide                 command-line tool used to enter SQL statements, and
                                      the RISQL Reporter, an enhanced version of the RISQL
                                      Entry Tool with report-formatting capabilities.

SQL Reference Guide                   Is a complete language reference for the Red Brick SQL
                                      implementation and RISQL extensions for IBM Red
                                      Brick Warehouse databases.

SQL Self-Study Guide                  Provides an example-based review of SQL and an
                                      introduction to the RISQL extensions, the macro facility,
                                      and Aroma, the sample database.

This guide                            Describes the Table Management Utility, including
                                      activities related to loading, maintaining, and backing
                                      up data. Also includes information about data
                                      replication and the rb_cm copy management utility.
Additional references you might find helpful include:

■ An introductory-level book on SQL
■ An introductory-level book on relational databases
■ Documentation for your hardware platform and operating system

Additional Documentation
For additional information, you might want to refer to the following types of
documentation:

■ Online documents
■ Printed documents
■ Online help

Online Documents
A Documentation CD that contains IBM Red Brick Warehouse documents in
electronic format is provided with your Red Brick products. You can copy the
documentation to your computer or access it directly from the CD.

Printed Documents
To order printed documents, contact your sales representative.

Online Help
IBM provides online help with each graphical user interface (GUI) that
displays information about those interfaces and the functions that they
perform. Use the help facilities that each GUI provides to display the online
help.

IBM Welcomes Your Comments
To help us with future versions of our documents, let us know about any
corrections or clarifications that you would find useful. Include the following
information:

■ The full name and version of your document
■ Any comments that you have about the document
■ Your name, address, and phone number

Send electronic mail to us at the following address:

comments@vnet.ibm.com

This address is reserved for reporting errors and omissions in our documentation. If you prefer, you can fill out a comment form by going to:

http://www.ibm.com/software/data/rcf/

For immediate help with a technical problem, contact IBM Customer Support.

Chapter 1
Introduction to the Table Management Utility
In This Chapter . . . . . . . . . . . . . . . . . . . . 1-3
TMU Operations and Functions . . . . . . . . . . . . . . 1-4
TMU Control Files and Statements . . . . . . . . . . . . . 1-8
Termination . . . . . . . . . . . . . . . . . . . . 1-8
Comments . . . . . . . . . . . . . . . . . . . . 1-9
Locales and Multibyte Characters . . . . . . . . . . . . 1-9
USER Statement . . . . . . . . . . . . . . . . . . 1-9
LOAD DATA and SYNCH Statements . . . . . . . . . . . 1-10
UNLOAD Statements . . . . . . . . . . . . . . . . 1-10
GENERATE Statements . . . . . . . . . . . . . . . . 1-11
REORG Statements . . . . . . . . . . . . . . . . . 1-11
BACKUP Statements . . . . . . . . . . . . . . . . . 1-11
RESTORE Statements. . . . . . . . . . . . . . . . . 1-12
UPGRADE Statements . . . . . . . . . . . . . . . . 1-12
SET Statements . . . . . . . . . . . . . . . . . . . 1-13
In This Chapter
The Table Management Utility (TMU) is the IBM Red Brick Warehouse
program that you use to load data into the database and to maintain its tables,
indexes, precomputed views, and referential integrity. While the primary
function of the TMU is to load and index large amounts of data quickly, it also
performs the following tasks:

■ Unloading data from a database to move data, either an entire table or selected rows, or to edit the data for use with the TMU or with other applications.
■ Rebuilding tables and indexes after the table is substantially
modified by incremental loads or by insert, update, or delete
statements.
■ Generating DDL (CREATE TABLE) or TMU LOAD DATA statements
from an existing table for use with unloaded data.
■ Upgrading databases to provide upward compatibility with new
versions of IBM Red Brick Warehouse.
■ Performing full and incremental database backups and full and
partial restores.

This chapter contains the following sections:

■ TMU Operations and Functions
■ TMU Control Files and Statements

TMU Operations and Functions
The TMU is a program that runs independently of the database server; it is
invoked from the operating-system command line and uses the same config-
uration information as other components of IBM Red Brick Warehouse. The
TMU program can be invoked remotely, allowing DBAs to load data from an
input file on a networked client machine into a database table on the
production server machine.

Before you invoke the TMU, you must use the TMU control language to write
a control file that specifies the task to be done and provides the information
needed to perform that task. Next you run the TMU, naming the control
filename as input. The TMU reads the control file and carries out the task,
reading its input from tape, disk, or standard input, and modifying the
database or writing output files to tape or disk as directed. At the same time,
the TMU writes messages for system logging and accounting purposes. The
TMU supports a variety of tape, disk, and data formats.
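For example, if the load task is described in a control file named load.tmu (a hypothetical name), the following command runs it against the Aroma database as database user curly:

% rb_tmu -d AROMA load.tmu curly secret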

Figure 1-1 illustrates the TMU and its inputs and outputs.

Figure 1-1. TMU Input and Output Options
Inputs to the Table Management Utility (rb_tmu or rb_ptmu): a control file, unload-format files, input data files, and input data on standard input or pipes. Outputs: database system-table updates, table and index files, discard files, unload-format files, LOAD DATA control files, CREATE TABLE DDL files, accounting and log-request messages (sent to the UNIX log daemon or Windows log thread, rbwlogd), messages on standard error and in the warning-message file, and data on standard output or pipes.

UNIX Tape support is available only for UNIX. ♦


Figure 1-2 shows how the TMU can be invoked on a remote server from a
local client machine:
Figure 1-2. Remote TMU Architecture
The local client machine holds the input files, control files, and output (unload and discard) files, and runs the Client Table Management Utility (rb_ctmu). Across the network connection, the remote server machine runs the IBM Red Brick Warehouse server and databases and the Table Management Utility (rb_ptmu), which performs the load.

In this case, the TMU is invoked from the client machine (using the rb_ctmu
program) but the LOAD DATA or UNLOAD operation is performed against an
IBM Red Brick Warehouse database on the server machine. This feature
allows DBAs to maintain control files, input files, and output files on the
client, reducing the security risk on the production machine.

Some of the other key features of the TMU are as follows:

■ Parallel loads and REORG operations
  The Parallel TMU (PTMU) speeds up load operations on tables that have multiple indexes and referential integrity constraints because it performs index building, data conversion, and referential integrity checks in parallel as data records are loaded.
  Most of the information in this reference guide applies to both the TMU and the PTMU; in cases where differences in behavior or syntax exist, these differences are specified. For general guidelines about PTMU usage, refer to "Suggestions for Effective PTMU Operations" on page 2-52.


■ Copy Management utility
  The copy management utility (rb_cm) facilitates the movement and synchronization of data among multiple databases found throughout an enterprise. In addition to combining high-speed loading and unloading operations, it can also move data over a network. This utility is described in Chapter 7, "Moving Data with the Copy Management Utility."
■ Auto Aggregate mode
  The auto aggregate mode provides the capability to automatically aggregate new input data with the data already in a table. For example, if you are loading daily sales amounts for each store into the Sales table, you can automatically add the new amount for each store to the amount already in the table to keep a running total, or aggregate. This mode is described in Chapter 3, "Loading Data into a Warehouse Database," with a detailed example in Appendix A, "Example: Using the TMU in AGGREGATE Mode."
■ Precomputed view maintenance
  The precomputed view maintenance feature automatically updates aggregate tables whenever detail tables are updated, ensuring that detail and aggregate tables remain in sync. With this feature, users do not need to develop their own scripts for updating aggregate tables, and in many cases the updates run faster. For detailed information about this feature, refer to the IBM Red Brick Vista User's Guide.
■ Integrated backup and restore operations
  You can use the TMU to run full and incremental backups and full and partial restores. Backups can be performed online, while the database is available for loads and updates. The TMU tracks incremental changes at the 8K block level, making level 1 and 2 backups fast. For restores, the TMU uses backup metadata to automatically construct the optimal restore path. Backups to disk, tape, and XBSA-compliant storage management systems are supported.

TMU Control Files and Statements
A TMU control file contains one or more statements that specify the functions
to be performed and the information the TMU needs to perform those
functions. A single control file can contain multiple control statements of the
same or different types; for example:

■ A USER statement that provides a database user name and password.
■ Various types of control statements that correspond to the functions to be performed.
■ SET statements that control the TMU execution environment.

USER statement
LOAD DATA statement
SYNCH SEGMENT statement
UNLOAD statement
GENERATE statement
REORG statement
BACKUP statement
RESTORE statement
UPGRADE statement
SET statement

The following sections briefly describe each of the previous control statements.

Termination
Each control statement must end with a semicolon (;) as the detailed syntax
diagrams for each statement show in the remaining chapters. If multiple
control statements are included in a single control file, each one must end
with a semicolon.
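For example, a control file that reorganizes two tables (table names borrowed from the Aroma sample database) contains two statements, each terminated by a semicolon:

REORG sales;
REORG store;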

Comments
You can enclose comments in a control file with either C-language-style
delimiters (/*…*/), in which case they can span multiple lines, or precede
them with two dashes (--) and end them with end-of-line, in which case they
are limited to a single line.
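For example, both comment styles can appear in the same control file:

/* This C-style comment
   spans more than one line. */
-- This comment ends at the end of the line.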

Locales and Multibyte Characters
You must specify most of a TMU control statement (LOAD DATA, UNLOAD,
SYNCH, REORG, and so on) with ASCII characters, regardless of the database
locale. However, the TMU does support multibyte characters for database
object names and for some special characters used in those statements, and
you can specify a locale for a TMU input file that differs from the database
locale. Messages that the TMU returns are displayed in the language of the
database locale unless the RB_NLS_LOCALE environment variable for the
current user overrides that locale. In all other cases, TMU operations use the
locale of the database.

For a list of defined locales supported by IBM Red Brick Warehouse, see the
locales.pdf file in the relnotes directory on your installation CD.

USER Statement
A USER statement provides a database user name and password, which
allows you to invoke the TMU without entering a username and password on
the command line or interactively in response to a prompt.

Only one USER statement can occur in a control file and it must be the first
statement in the file. If a username and password are provided on the
command line, those values override a USER statement present in the control
file and a warning is issued that the USER statement was overridden.
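The following sketch shows the general form of the statement, using the user name and password from the examples in Chapter 2 (see page 2-21 for the exact syntax and quoting rules):

USER curly PASSWORD 'secret';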

For information about USER statements, refer to “USER Statement for User
Name and Password” on page 2-21.

LOAD DATA and SYNCH Statements
A LOAD DATA statement provides the control information you need to load
data into a database. This information includes the LOAD DATA keywords,
the source of data, its format, its locale, what to do with records that cannot
be loaded, and how to map the input data record fields into the database
table columns. The statement does not include the data itself.
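As an illustrative sketch only (the file, table, and clause details shown here are hypothetical; the full clause syntax appears in Chapter 3), a simple comma-separated load might look like this:

LOAD DATA INPUTFILE 'daily_sales.txt'
FORMAT SEPARATED BY ','
DISCARDFILE 'daily_sales.bad'
INSERT INTO TABLE sales;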

If data is loaded into an offline segment of a table, that segment must be synchronized with the rest of the table after the data is loaded; this synchronization is performed with a SYNCH statement. Note that the SYNCH statement is used only in conjunction with load operations into offline segments.

For information about LOAD DATA and SYNCH control files, refer to
Chapter 3, “Loading Data into a Warehouse Database.” For additional
information about load operations with the rb_cm Copy Management
utility, refer to Chapter 7, “Moving Data with the Copy Management Utility.”

UNLOAD Statements
An UNLOAD statement provides the information you need to unload data
from a database table in any of several formats to move the data or to use it
with another tool. An UNLOAD statement contains the UNLOAD keyword
and other relevant information such as the name of the table to be unloaded,
a description of the desired output format, and where to write the output
files. Data can be unloaded in the order determined by either a table scan or
an index. Either a complete table can be unloaded, or only those rows that
meet the specified criteria.
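As a rough sketch of the shape of such a statement (the keywords and file name shown here are illustrative assumptions; Chapter 4 gives the actual syntax and output-format options):

UNLOAD sales
OUTPUTFILE 'sales.unl'
FORMAT EXTERNAL;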

In cases where you are unloading data that will later be loaded into another
table, you can include instructions in the UNLOAD statement for the TMU to
automatically generate an SQL CREATE TABLE statement that corresponds to
the table to be unloaded and a TMU LOAD DATA statement that corresponds
to its data. These automatically generated statements provide templates that,
with little or no modification, allow you to create a table and load it with the
unloaded data. (This functionality is also available in the GENERATE
statement.)

For information about UNLOAD control files, refer to Chapter 4, "Unloading Data from a Table." For information about unloading data for use with the rb_cm Copy Management utility, refer to Chapter 7, "Moving Data with the Copy Management Utility."

GENERATE Statements
A GENERATE statement provides the information you need to automatically generate an SQL CREATE TABLE or TMU LOAD DATA statement based on an existing table. A GENERATE statement separates the task of generating the CREATE TABLE or LOAD DATA statements from the task of unloading the data, so you can generate these statements without actually unloading the data.
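A hypothetical sketch of the idea (the exact keywords are defined in Chapter 5; the table and file names are illustrative):

GENERATE CREATE TABLE STATEMENT FOR sales
OUTPUTFILE 'sales.ddl';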

For information about GENERATE control files, refer to Chapter 5, "Generating CREATE TABLE and LOAD DATA Statements."

REORG Statements
A REORG statement instructs the TMU to reorganize a table, which includes
enforcing referential integrity and rebuilding any specified indexes to
improve internal storage. (Referential integrity is the relational property that
each foreign-key value in a referencing table exists as a primary-key value in
the referenced table.) A REORG statement can also be used to rebuild any
aggregate tables defined on the target table of the REORG statement. A
REORG statement includes the REORG keyword, a table name, index names,
and precomputed view names and instructions for rebuilding them.
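For example, following the general diagram shown in the Introduction (the index name is hypothetical):

REORG sales INDEX (sales_pk_idx) OPTIMIZE ON;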

For information about REORG control files, refer to Chapter 6, "Reorganizing Tables and Indexes."

BACKUP Statements
A BACKUP statement performs a level 0, 1, or 2 backup of the database, in
either online or checkpoint mode. For information about BACKUP control
files, refer to Chapter 8, “Backing Up a Database.”
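A purely illustrative sketch of such a statement (the actual BACKUP syntax and its options are given in Chapter 8):

BACKUP DATABASE LEVEL 0;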

RESTORE Statements
A RESTORE statement fully or partially restores the database from one or
more TMU backups. For information about RESTORE control files, refer to
Chapter 9, “Restoring a Database.”

UPGRADE Statements
An UPGRADE statement instructs the TMU to upgrade an existing database
so that it is compatible with a newer version of IBM Red Brick Warehouse.
Not all versions require that databases be upgraded; for those that do require
an upgrade, the information needed to upgrade a database is built into the
UPGRADE command in the new IBM Red Brick Warehouse software.

UPGRADE [DDLFILE 'filename'] ;

DDLFILE 'filename'
    Indicates that the TMU is to generate a file containing DDL statements as part of the upgrade procedure. Such files are not always required and their contents are specific to each release.
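For example, to upgrade the database and write any generated DDL statements to a hypothetical file named upgrade.ddl:

UPGRADE DDLFILE 'upgrade.ddl';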

For the following information, refer to the release notes that accompany each release of IBM Red Brick Warehouse:

■ Whether a database needs to be upgraded when a new release of IBM Red Brick Warehouse is installed.
■ Specific syntax for the UPGRADE statement for that release, including details about the contents of any required DDL files.
■ Changes that an UPGRADE operation makes for that release.

For general instructions on how to install new software and plan an upgrade
for a production database, refer to the Installation and Configuration Guide for
your platform and the Release Notes.

 7DEOH0DQDJHPHQW8WLOLW\5HIHUHQFH*XLGH
6(76WDWHPHQWV

6(76WDWHPHQWV
Various options are available that allow you to customize certain aspects of
TMU behavior for a specific session. You can include SET statements for these
options in a control file to override global configuration parameters set in the
configuration file (rbw.config). For example, you can use a SET statement to
change the default location and amount of temporary space that a specific
load operation uses.
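For example, a control file might raise the commit record interval for a single load. The clause shown follows the "Commit Record Interval" option covered in Chapter 2, though the exact keyword order here is an assumption and the value is arbitrary:

SET COMMIT RECORD INTERVAL 100000;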

For information about these SET statements, refer to “SET Statements and
Parameters to Control Behavior” on page 2-23.

Chapter 2
Running the TMU and PTMU

In This Chapter . . . . . . . . . . . . . . . . . . . . 2-3
User Access and Required Permission . . . . . . . . . . . . 2-4
Operating System Access . . . . . . . . . . . . . . . 2-4
Database Access . . . . . . . . . . . . . . . . . . 2-5
Permissions on TMU Output Files . . . . . . . . . . . . 2-5

Syntax for rb_tmu and rb_ptmu Programs . . . . . . . . . . . 2-5


Exit Status Codes . . . . . . . . . . . . . . . . . . . 2-7
Setting Up the TMU . . . . . . . . . . . . . . . . . . 2-8
Remote TMU Setup and Syntax . . . . . . . . . . . . . . 2-12
Client-Server Compatibility . . . . . . . . . . . . . . 2-12
Client Configuration . . . . . . . . . . . . . . . . . 2-13
Server Configuration . . . . . . . . . . . . . . . . . 2-14
Syntax for the rb_ctmu Program . . . . . . . . . . . . . 2-14
Options . . . . . . . . . . . . . . . . . . . . 2-15
Control File for Remote TMU Operations . . . . . . . . 2-16
Username and Password . . . . . . . . . . . . . . 2-16
Usage . . . . . . . . . . . . . . . . . . . . . 2-17
Syntax Examples . . . . . . . . . . . . . . . . . 2-17
Summary of Remote TMU Operation . . . . . . . . . . . 2-18
Example: Windows-to-UNIX Remote TMU Operation . . . . . 2-19

USER Statement for User Name and Password . . . . . . . . . 2-21


SET Statements and Parameters to Control Behavior . . . . . . . 2-23
Lock Behavior . . . . . . . . . . . . . . . . . . . 2-25
Buffer-Cache Size . . . . . . . . . . . . . . . . . . 2-27
Temporary Space Management . . . . . . . . . . . . . 2-28
Format of Datetime Values . . . . . . . . . . . . . . . 2-33
Load Information Limit . . . . . . . . . . . . . . . . 2-34
Memory-Map Limit . . . . . . . . . . . . . . . . . 2-35
Setting Precomputed View Maintenance . . . . . . . . . . 2-36
Precomputed View Maintenance On Error . . . . . . . . . 2-36
Managing Row Messages . . . . . . . . . . . . . . . 2-38
Enabling Versioning . . . . . . . . . . . . . . . . . 2-39
Commit Record Interval . . . . . . . . . . . . . . . . 2-40
Commit Time Interval . . . . . . . . . . . . . . . . 2-42
Displaying Load Statistics . . . . . . . . . . . . . . . 2-45
Backup and Restore (BAR) Unit Size . . . . . . . . . . . 2-45
External Backup and Restore Operations . . . . . . . . . . 2-46
REORG Tasks . . . . . . . . . . . . . . . . . . . 2-47
Parallel Loading Tasks (PTMU Only) . . . . . . . . . . . 2-48
Serial Mode Operation (PTMU Only) . . . . . . . . . . . 2-50

Suggestions for Effective PTMU Operations . . . . . . . . . . 2-52


Operations That Use Parallel Processing . . . . . . . . . . 2-52
Discard Limits on Parallel Load Operations . . . . . . . . . 2-53
AUTOROWGEN with the PTMU. . . . . . . . . . . . . 2-53
Multiple Tape Drives with the PTMU . . . . . . . . . . . 2-54
3480/3490 Multiple-Tape Drive with the PTMU. . . . . . . . 2-54

In This Chapter
Before you can use the TMU or the Parallel TMU (PTMU), you must prepare a
control file that contains statements that define the tasks to perform. These
statements are described in subsequent chapters. When the control file is
ready (either a newly created file or an existing file that was modified) and
any required input files are ready, you can run the TMU or the PTMU, as this
chapter describes.

This chapter contains the following sections:

■ User Access and Required Permission


■ Syntax for rb_tmu and rb_ptmu Programs
■ Exit Status Codes
■ Setting Up the TMU
■ Remote TMU Setup and Syntax
■ USER Statement for User Name and Password
■ SET Statements and Parameters to Control Behavior
■ Suggestions for Effective PTMU Operations

Use of the PTMU is similar to use of the TMU, except as noted in the syntax on
page 2-5 and the specific suggestions for PTMU use on page 2-52.

User Access and Required Permission
To use the TMU or PTMU, you must have the required permissions for both
the operating system and the database.

Important: The documentation uses the redbrick directory (or redbrick_dir in examples) to indicate the directory into which the IBM Red Brick Warehouse software is installed, and the redbrick user ID indicates the database administrator user ID, which is the operating-system user ID used to install the IBM Red Brick Warehouse software. If your site installs the software in another location or with another user ID, then substitute that location or user for redbrick wherever you see references to the redbrick directory or redbrick user.

Operating System Access
If you run either the TMU or the PTMU (rb_tmu or rb_ptmu) from a user other
than redbrick, you must ensure that the redbrick user ID has read access to
the control file and input files and write access to the directories in which the
table, index, discard, and generated files are written. For example, if the
administrative user at your site has the user name redbrick and you are
running the TMU under your user name calvin, then you must make sure that
the redbrick user has the necessary permissions to read, write, and execute
the required files. If you use another name for the administrative user, you
must make sure that user name has the necessary permissions on the
required files.

Windows On Windows platforms, the user who runs the TMU or PTMU must also
belong to one of the following user groups:

■ Administrator
■ REDBRICK_DBA
■ *_REDBRICK_DBA (where * refers to any prefix; for example,
TMU_REDBRICK_DBA)

If you do not want to make the TMU user part of the standard Windows
Administrator group, you must create the group REDBRICK_DBA or
*_REDBRICK_DBA with the User Manager administrative tool (accessible
from the Start menu) and assign the user to that group. ♦

Database Access
You can specify the database to access at the command line when you invoke
the TMU. If you do not specify a database at the command line, the TMU uses
the database that the RB_PATH environment variable specifies.

The database user ID you supply to the TMU must have the necessary object
privileges and task authorizations to perform the TMU operation. You can
supply the user ID and the password at the command line, in response to a
prompt, or in a USER statement.

UNIX
Permissions on TMU Output Files
By default, all TMU output files (including discard files, generated files,
unload files, and backup files) inherit the permissions of the user who runs
the rb_tmu or rb_ptmu executable. These permissions are based on the
current umask setting for that user. If umask is set to 0, the output files are
rw for all users. To restrict the permissions to rw for the redbrick user only,
you can set umask to 077:
% umask 077

Syntax for rb_tmu and rb_ptmu Programs
The executable files for the TMU and PTMU are named rb_tmu and rb_ptmu,
respectively, and they are located in the bin subdirectory in the redbrick
directory. These programs run under the redbrick user ID and the redbrick
user owns all files that they create. The rb_ctmu is a client executable used to
start a load or unload operation on a remote server; see page 2-12 for details.

The syntax to invoke the TMU and PTMU is as follows:

rb_tmu  [options] control_file [db_username [db_password]]
rb_ptmu [options] control_file [db_username [db_password]]

Options
You can specify any or all of the following options, in any order:

-interval nrows or -i nrows
    Optional. Directs the TMU to print a progress message to the system message file for every n rows of data that are loaded. Because the TMU processes rows in small batches, it reports the number of rows processed at the end of the batch in which an interval occurs. Hence the number reported might not be exactly nrows. Frequent intervals slow down the load process. If an interval is not specified, no progress messages are printed.

-database db_name or -d db_name
    Optional. Logical database name defined in the rbw.config file. This option overrides RB_PATH. If no database is specified, the value of the RB_PATH environment variable is used.

-timestamp or -t
    Optional. Time-stamp information is appended to all information and error messages that the TMU issues. The time-stamp format is localized in accordance with the locale of the operating system, whereas the messages use the locale of the database (or the locale that the RB_NLS_LOCALE environment variable specifies).

control_file
    Pathname of the file containing the TMU control statements.

db_username, db_password
    Optional. Database user name and password (not operating-system user account and password). The user name can be either single-byte or multibyte characters; the password must be single-byte characters. If you do not supply these arguments with the command or in the control file, the TMU prompts for them before it executes the control file. If you supply a user name and password both with the command and in the control file, the values specified with the command override those in the control file.

Usage
■ To display the TMU or PTMU syntax, enter rb_tmu or rb_ptmu with no options at the system prompt. For example:

  % rb_tmu
  Usage: /redbrick_dir/bin/rb_tmu [<Options>]
         <control_file> [<username> [<password>]]
  Options:
  -i or -interval <nrows>     Display progress every <nrows> rows.
  -t or -timestamp            Append a timestamp to all TMU messages.
  -d or -database <database>  Database to use.

■ If the TMU or PTMU is interrupted, it exits immediately after closing any open tables.

Exit Status Codes
Upon exiting, the TMU and PTMU return the highest status code encountered
during processing. For example, if the TMU generates only warning
messages, it returns an exit status code of 1; if, however, it generates both
warning and fatal messages, it returns an exit status code of 3. You can use
these exit status codes to control user-implemented applications that run the
TMU or PTMU. The following table defines the meaning of each exit status
code.

Status    Meaning

0         Information or statistics messages might be issued during execution, but no warning, error, or fatal messages are issued.

1         Warning messages are issued during execution.

2         Error messages are issued during execution.

3         Fatal messages are issued during execution.

An exit status of 2 or 3 causes the TMU or PTMU to stop execution.
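
Because these codes are returned to the invoking shell, a wrapper script can branch on them. The following Bourne shell sketch shows one way to do this; the control file, user name, password, and message file names are illustrative:

#!/bin/sh
# Run the TMU, capture all messages, and branch on the exit status.
rb_tmu aroma.tmu curly secret > load_messages 2>&1
status=$?
case $status in
    0)   echo "Load completed with no warnings or errors." ;;
    1)   echo "Load completed with warnings; review load_messages." ;;
    2|3) echo "Load stopped with exit status $status; review load_messages."
         exit $status ;;
esac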


Setting Up the TMU
To set up your environment and invoke the TMU or PTMU:
1. Log in as user redbrick. (If you run the TMU as any user other than
redbrick, you must verify that user redbrick has the necessary access
to all locations used for input, output, discard, and generated files, as
described on page 2-4.)
2. Make sure the system is configured as you want it:
■ Verify that the RB_HOST environment variable is set to the
correct database daemon (UNIX) or service (Windows).
■ Verify that the RB_CONFIG environment variable is set to the
directory that contains the rbw.config file.
■ Verify that the RB_PATH environment variable is set to the
correct database. If it is not set to the database that you want to
access, you must use the -d option to provide the logical
database name when you invoke the TMU.

Windows
On Windows, the default value of these environment variables is determined
by the database service that RB_HOST selects in the Registry. ♦

3. Invoke the TMU. Use rb_tmu and specify the file containing the TMU
control statements. Your database user name and password can be
entered on the command line, with a USER statement within the
control file, or in response to a prompt from the TMU.

The following example illustrates how to set the RB_PATH environment variable and run the TMU on the Aroma database with a control file named aroma.tmu, sending progress messages with time stamps to the terminal every 10,000 rows.

UNIX At a Korn or Bourne shell prompt, enter:
$ RB_PATH=AROMA; export RB_PATH
$ rb_tmu -i 10000 -timestamp aroma.tmu curly secret

At a C shell prompt, enter:
% setenv RB_PATH AROMA
% rb_tmu -i 10000 -timestamp aroma.tmu curly secret


Windows
At a Windows shell prompt, enter:
c:\db1> set RB_PATH=AROMA
c:\db1> rb_tmu -i 10000 -timestamp aroma.tmu curly secret

The TMU first logs in user curly with password secret to the database refer-
enced by the RB_PATH variable. The TMU executes the control statements
contained in aroma.tmu located in the current directory. Progress messages
are issued approximately every 10,000 rows (based on row batch intervals, as
described on page 2-6). Time stamps are appended to all messages.

The following example illustrates how to invoke the TMU and specify a
database with the -d option. Enter the following command at a shell prompt:
rb_tmu -d AROMA aroma.tmu

The TMU checks the control file for a USER statement. If it does not find one,
it prompts for the user name and password before it executes the control
statements in the file aroma.tmu on the Aroma database located in the
directory defined in the rbw.config file.

The following example illustrates how to invoke the PTMU and specify an
interval with a timestamp. Enter the following command:
rb_ptmu -i 10000 -timestamp aroma.tmu curly secret

The following example illustrates how to capture TMU informational and error messages and write the messages to a file named load_messages.

UNIX At a Korn or Bourne shell prompt, enter:
$ rb_tmu file.tmu system secret > load_messages 2>&1

At a C shell prompt, enter:
% rb_tmu file.tmu system secret >& load_messages

If you do not want to display messages on the terminal or write them to a file,
you can redirect the system stderr output to /dev/null to prevent the creation
of very large files. IBM does not recommend this practice because you cannot
detect any problems that occur. You can also redirect messages through a
filter such as UNIX grep to filter out repetitive informational messages. ♦


Windows At a Windows shell prompt, enter:
c:\db1> rb_tmu file.tmu system secret > load_messages 2>&1

The following example illustrates how to use the output from another
program (in this case, zcat, a decompression program) as the input for the
TMU. At a shell prompt, enter:

zcat cmpressd_file | rb_tmu mydb.tmu system secret

This command first executes the decompression program to decompress the file cmpressd_file and pipes the output into the TMU. The TMU uses this output as its input when it executes the mydb.tmu control file. A separate file is not needed to hold the decompressed data, which reduces the temporary storage requirements. The control file must specify standard input (’-’) as its input file.

The following example illustrates another way to use standard input for the
input data, allowing a single control file (mydb.tmu) to process different data
input files (one of which is market.txt). At a shell prompt, enter:
rb_tmu mydb.tmu system secret < market.txt

The TMU uses the file market.txt as the input when it executes the mydb.tmu
control file. You can name another input file the next time you use the
mydb.tmu control file. The control file must specify standard input (’-’) as the
input source. For more information about file redirection and pipes, refer to
your operating-system documentation.

UNIX The following example, based on UNIX named pipes and the tee command,
illustrates how to run multiple instances of the TMU to read an input file once
and load multiple tables.

Windows Similar capabilities are available on Windows, using named pipes and
third-party software.

Assume the daily input data is in a file named Sales.txt and it is stored in a
table named Sales. This same input data is also used to generate the
aggregate sales data stored in tables named Sales_Monthly and
Sales_Quarterly.


To run multiple instances of the TMU to read an input file once and load multiple tables:

1. Modify the load script that loads the Sales table to read its input from
the standard input (stdin). The modified script is in a file named
Sales.stdin.tmu.
2. Modify the load scripts that load the aggregate tables to read their
input from named pipes pipeM and pipeQ. These modified scripts
are in files named SalesMonthly.pipeM.tmu and
SalesQuarterly.pipeQ.tmu.
3. Create two named pipes pipeM and pipeQ with the UNIX mkfifo
utility:
% mkfifo pipeM pipeQ
4. Start two instances of the TMU in the background. They use the load
scripts modified in step 2, which read input from the two pipes, as
control files.
Important: In the C shell, the greater-than (>) and ampersand (&) characters direct
the output from each rb_tmu process to a separate file instead of to the terminal. The
ampersand (&) character runs the process in the background. You must use the corre-
sponding characters for the UNIX shell that you are using.
% rb_tmu SalesMonthly.pipeM.tmu system manager >& pipeM_out &
% rb_tmu SalesQuarterly.pipeQ.tmu system manager >& pipeQ_out &

5. Read the input data with the UNIX cat command, pipe the standard
output to the two pipes by using the UNIX tee command, and pipe
the tee standard output to a third instance of the TMU, which uses the
modified Sales load script as a control file:
% cat Sales.txt | tee pipeM pipeQ | rb_tmu Sales.stdin.tmu \
system manager >& sales_out

When you use the named pipes and the tee command, you read the input file
only once, but load it into three tables with a single operation in the same
time it takes to load a single table. You can also run step 5 in the background
so you can monitor all three output files. ♦


Remote TMU Setup and Syntax
The remote TMU feature allows DBAs to start a LOAD DATA, UNLOAD, or
GENERATE operation from a client machine, using local control files and
input files. The TMU runs on the remote server machine and returns its
output files to the client.

The Client TMU executable file is named rb_ctmu. When you invoke the
rb_ctmu, it establishes a connection with the server machine, and the server-
side Driver TMU program (rb_drvtmu) starts an rb_ptmu process for the
remote operation. Users must configure the client and server machines
correctly in order for remote TMU operations to work, as discussed in the
following sections.

The rb_ctmu program is installed as part of the Client Connector Pack on Windows platforms, as part of the client product installation on UNIX and Linux platforms, and as part of the IBM Red Brick Warehouse server installation on all platforms. The rb_drvtmu program is installed as part of the server installation on all platforms.

Although the Client TMU is intended for TMU operations on remote servers,
the rb_ctmu program can also be used to run local TMU operations. The
environment setup on the machine where you start the rb_ctmu program
determines the target host and database for the TMU operation.

Client/Server Compatibility
The client and server do not need to be on the same platform or operating
system. For example, you can use a 32-bit Windows client to run a remote
load operation on a 64-bit UNIX server. All of the following configurations are
supported:

Client (rb_ctmu)      Server (rb_ptmu and rb_tmu)

Windows               UNIX or Linux
UNIX or Linux         UNIX or Linux
UNIX or Linux         Windows
Windows               Windows


These configurations are subject to the following assumptions:

■ Binary data for loading must be in the server machine’s format.
■ Binary data that is unloaded is returned to the client in the server machine’s format.
■ External format data is compatible across platforms.

Client Configuration
The RB_CONFIG environment variable must be set on the client machine to
point to the location of the client copy of the rbw.config file.

The recommended way to specify the host and database for a remote TMU
operation is to create an ODBC DSN (data source name). Then you can either
specify the DSN with the -s option on the rb_ctmu command line or set the
RB_DSN environment variable.

To create a DSN on Windows platforms, use the ODBC control panel, as described in the Client Installation and Connectivity Guide. On UNIX and Linux platforms, add the DSN to the user’s .odbc.ini file.

If you choose not to create DSNs, you can either set the RB_HOST and
RB_PATH environment variables or specify the -h (host) and -d (database)
values on the rb_ctmu command line. If you use the -h option to specify the
remote server, the client rbw.config file must contain a SERVER entry that
exactly matches the SERVER entry in the server-side rbw.config file. For
example:
RB_620 SERVER brick:6200

TUNE parameters in the client-side rbw.config file have no effect on the operation of the remote TMU. The SET commands in the TMU control file (if any) and the TUNE parameters in the server-side rbw.config file control the run-time behavior.


Server Configuration
Before using the rb_ctmu, check that the REMOTE_TMU_LISTENER configu-
ration parameter is set to ON (the default) in the server-side rbw.config file:
RBWAPI REMOTE_TMU_LISTENER ON

This parameter enables the server to respond to requests from the remote
TMU. The listening port is the server port +2. For example, if you specified
port number 6200 during the IBM Red Brick Warehouse installation, the
remote TMU port will be 6202.

To turn off the remote TMU feature, you must set REMOTE_TMU_LISTENER to
OFF, then stop and restart the rbwapid daemon. If the parameter is not
present in the rbw.config file, the remote TMU is still, by default, turned on.
You can also check the rbwlogview output when the rbwapid daemon or
service is started to see whether the server is listening for remote TMU
requests.

The client TMU always invokes the parallel TMU (rb_ptmu) on the server
machine. If you want remote TMU loads to run in serial mode, you must
include the following SET command in the control file:
set tmu serial mode on;

The interaction of the rb_drvtmu and rb_ptmu on the server machine is transparent to the user. The driver program does not require any setup or configuration.

Syntax for the rb_ctmu Program
The syntax for the client TMU is as follows:
rb_ctmu [options] control_file [db_username [db_password]]


Options
Most of the options are identical to those supported by the rb_tmu and
rb_ptmu programs; see page 2-6. The following options are specific to the
rb_ctmu program. These options can be specified in any order, as long as they
follow the rb_ctmu executable and precede the control file.

-s dsn
    Data source name (DSN), as defined in the .odbc.ini file on UNIX and Linux platforms and in the Registry on Windows platforms. A DSN can also be set with the RB_DSN environment variable. If specified, the DSN takes precedence over the -d database option and the current setting for the RB_PATH environment variable.

-h host
    The RB_HOST value, as defined in the rbw.config file (on the server machine). This value can also be set with the RB_HOST environment variable. If you use the -h option, the client rbw.config file or .odbc.ini file must contain a SERVER entry that exactly matches the SERVER entry in the server-side rbw.config file.

-w time_in_secs or -waittime time_in_secs
    Optional. This value is the timeout interval, in seconds, for connecting to the remote server. If the first connection attempt fails, the client TMU retries for the duration of the specified wait time. If no value (or 0 seconds) is specified, only one connection attempt is made; if it fails, an error is displayed.

-show
    Displays detailed connectivity information for the rb_ctmu command without executing the TMU operation itself. Based on the options specified in the command or the current environment, the -show output indicates which environment variables or DSN will be used to connect; the remote machine name, IP address, and port number; the database name; and the database user. The user’s password is not displayed.


Control File for Remote TMU Operations
The control file for the rb_ctmu must reside on the client machine and must
contain only one executable LOAD, UNLOAD, or GENERATE statement.
Multiple SET statements can be included in the file.

The following TMU operations are supported:

■ LOAD DATA
■ UNLOAD
Generated DDL and TMU files are not supported for remote UNLOAD
operations.
■ GENERATE CREATE TABLE
■ GENERATE LOAD DATA

The following TMU operations cannot be run remotely:


■ REORG
■ SYNCH
■ UPGRADE
■ BACKUP
■ RESTORE

You cannot use the remote TMU feature to load from a tape device or unload
to a tape device.

Username and Password
The database username and password entries for the rb_ctmu follow the
same rules as those for the rb_tmu. The rb_ctmu does not validate usernames
and passwords; the validation occurs on the remote machine.

You can set the environment variable RB_USER on the client machine instead
of specifying your database username on the command line.


Usage
To display the client TMU syntax, enter rb_ctmu with no options at the
system prompt. For example:
109 brick % $RB_CONFIG/bin/rb_ctmu
(C) Copyright IBM Corp. 1991-2002. All rights reserved.
Remote TMU Client Tool Version 06.20.0000(0)TST
Usage: rb_ctmu [<Options>] <Control_file> [<Username>] [<Password>]
Options:
-i or -interval <nrows> Display progress every <nrows> rows.
-t or -timestamp Append a timestamp to all TMU messages.
-d or -database <database> Database to use.
-s <DSN> Data Source Name.
-h <Host> RB Host if different from RB_HOST.
-w or -waittime <secs> Wait time for connection.
-show Show connection information only. (No
execution)
Arguments:
<Control_file> Path to control file to be used.
<Username> User name, prompted for if not given.
<Password> Password, prompted for if not given.

Syntax Examples
■ This example uses a DSN named red_brick_620:
% rb_ctmu -s red_brick_620 sales.tmu orwell george
■ The following example specifies the host rb_6200, the database
aroma, and a wait time of 10 seconds:
% rb_ctmu -h rb_6200 -d aroma -w 10 sales.tmu wolfe tom
■ The following example does not specify a DSN, host, or database. The
client TMU will use the RB_DSN environment variable, if specified, or
connect to the host specified by RB_HOST and the database specified
by RB_PATH.
% rb_ctmu sales.tmu system manager


■ The following example includes the -show option only. Default connectivity information is displayed, but the load operation is not executed.
brick% rb_ctmu -show sales.tmu system manager
(C) Copyright IBM Corp. 1991-2002. All rights reserved.
Remote TMU Client Tool Version 06.20.0000(0)TST
** INFORMATION ** (12003) Connection information for
Remote TMU execution:
RB_HOST Environment variable: RB_620
Machine: brick.usa.ibm.com
IP Address: 8.36.53.21
Port: 6200 + 2
Database: aroma
User: system

Summary of Remote TMU Operation
The following procedure summarizes the steps required to run a remote TMU
operation.

To run a remote TMU command:

1. Set the RB_CONFIG environment variable on the client machine. Optionally, set the RB_DSN, RB_HOST, RB_PATH, and RB_USER variables or create appropriate data source names (DSNs).
2. Make sure the REMOTE_TMU_LISTENER parameter is set to ON in the
server-side rbw.config file.
3. Create a TMU control file on the client machine, specifying a LOAD
DATA, UNLOAD, or GENERATE command. For a LOAD DATA
operation, make sure the input file is present on the client machine.
If the Client TMU is on Windows and the remote TMU machine is a
UNIX machine, specify the pathnames for input and output files in
UNIX format.
If you want remote TMU loads to run in serial mode, you must
include the following SET command in the control file:
set tmu serial mode on;


4. Execute the control file from the command line, using the rb_ctmu
program with the options specified on page 2-15.

Example: Windows to UNIX Remote TMU Operation
This example shows how to run the Client TMU from a Windows machine to
connect to the Aroma database on a UNIX machine. The TMU operation in this
case is a GENERATE CREATE TABLE command for the Period table. All of the
steps are performed on the Windows client.

1. Set the RB_CONFIG environment variable to point to the rbw.config file installed with the Client Connector Pack software:
C:\RedBrick\Client32>set RB_CONFIG=\redbrick\client32
2. Use the Data Sources (ODBC) control panel to create a data source name (DSN) that defines the server environment where the TMU operation will be performed.

3. Set the RB_DSN environment variable to the value of the new DSN:
C:\RedBrick\Client32>set RB_DSN=BRICK_620
4. Create the TMU control file for the GENERATE operation. Name this
file gen_pd.tmu.
generate create table from period ddlfile
’/tmufiles/period.ddl’;


5. Execute the rb_ctmu.exe program from the Windows command line, specifying the control file that you created in step 4. You do not need to specify a DSN, host, or database because you have already set the RB_DSN environment variable.
C:\rbw620\bin>rb_ctmu gen_pd.tmu system manager
(C) Copyright IBM Corp. 1991-2002. All rights reserved.
Remote TMU Client Tool Version 06.20.0000(0)TST
** INFORMATION ** (13004) TMU executing in Remote Mode.
** INFORMATION ** (1312) Generated CREATE TABLE statement
for table PERIOD.
** STATISTICS ** (500) Time = 00:00:00.01 cp time,
00:00:05.18 time, Logical IO count=5, Blk Reads=0, Blk
Writes=0
6. Check the contents of the output file period.ddl on the local machine:
CREATE TABLE PERIOD (
PERKEY INTEGER NOT NULL UNIQUE,
DATE DATE NOT NULL,
DAY CHARACTER(3) NOT NULL,
WEEK INTEGER NOT NULL,
MONTH CHARACTER(5) NOT NULL,
QTR CHARACTER(5) NOT NULL,
YEAR INTEGER NOT NULL,
PRIMARY KEY(PERKEY));


USER Statement for User Name and Password
If you prefer not to enter a database user name and password on the
command line, you can include a USER statement at the beginning of a
control file. A control file can include only one USER statement, and it must
be the first statement in the file.

USER db_username
    Database user under whose name the TMU is invoked. The user name can be either a literal value (for example, john, smith2, or elvis) or an environment variable. If the user name is an environment variable, the corresponding user name is determined from the environment. If the environment variable is not defined, a warning is issued. If a valid user name was not supplied when the TMU was invoked, an error occurs. The database user name can be composed of either single-byte or multibyte characters.

PASSWORD db_password
    Password for the database user. The password can be either a literal value or an environment variable; it must be composed of single-byte characters. If the password is an environment variable, the corresponding password is determined from the environment. If the environment variable is not defined, a warning is issued. If a valid password is not supplied when the TMU is invoked, an error occurs.

UNIX If the user name or password begins with the dollar sign ($), the value is
taken as an environment variable. For example, $DBADMIN and $SECRET. ♦

Windows If the user name or password is surrounded by percent symbols (%) or begins
with the dollar sign ($), the value is taken as an environment variable. For
example, %DBADMIN% or $DBADMIN; and %SECRET% or $SECRET. ♦

The following USER statement uses literal values:
user dbadmin password secret;


The following USER statements use environment variables.

Operating System    Statement

UNIX                user $MY_DBNAME password $MY_PSSWD;

Windows             user %MY_DBNAME% password %MY_PSSWD%;
                    user $MY_DBNAME password $MY_PSSWD;

The following USER statements use both a literal value and an environment
variable.

Operating System    Statement

UNIX                user dbadmin password $MY_PSSWD;

Windows             user dbadmin password %MY_PSSWD%;
                    user dbadmin password $MY_PSSWD;


SET Statements and Parameters to Control Behavior
Along with the functional statements in a control file, you can include SET
statements to specify certain aspects of TMU and PTMU behavior for a specific
session. For example, for a load operation, you can specify a different
directory for temporary space from the one specified in the rbw.config file.

With the exception of RBW_LOADINFO_LIMIT, the following TMU and PTMU parameters can be controlled with SET statements:

LOCK WAIT, NO WAIT
    TMU behavior when the database or target tables are locked.

TMU BUFFERS
    Size of the buffer cache the TMU uses.

INDEX TEMPSPACE
    TMU use of temporary space, both memory and disk.

DATEFORMAT
    Valid alternative date format for input data.

RBW_LOADINFO_LIMIT
    Number of rows of historical information stored in the RBW_LOADINFO system table.

TMU MMAP LIMIT
    Size of memory available for memory-mapping primary-key indexes during a LOAD operation.

PRECOMPUTED VIEW MAINTENANCE ON, OFF
    Turns automatic aggregate maintenance on or off for all precomputed aggregates.

PRECOMPUTED VIEW MAINTENANCE ON ERROR ROLLBACK, INVALIDATE
    Rolls back or invalidates aggregates that cannot be maintained.

TMU ROW MESSAGES
    Defines the warning level for messages returned during LOAD processing.

TMU VERSIONING OFF, ON, RECOVER
    Versioned LOAD, REORG, and SYNCH operations.

TMU COMMIT RECORD INTERVAL
    Number of records to load into a table between each commit operation.

TMU COMMIT TIME INTERVAL
    Amount of time to load data into a table before each commit operation.

STATS ON, INFO, OFF
    Displays information messages and reports statistics for the current TMU operation.

TMU BAR_UNIT_SIZE
    Amount of data committed per transaction for TMU backups.

The following parameters affect only the PTMU:

TMU MAX TASKS, TMU INPUT TASKS, TMU INDEX TASKS
    Number of input and index builder tasks during a REORG operation.

TMU CONVERSION TASKS, TMU INDEX TASKS
    Parallel processing of data-conversion and index-building tasks during the load procedure.

TMU SERIAL MODE OFF, ON
    Use of serial processing.
As the following descriptions of the SET statements show, some SET state-
ments override analogous global settings in the rbw.config file. A SET
statement in a control file affects TMU behavior only during the TMU session
that uses that control file. If a control file contains multiple TMU statements
(for example, a single file containing three LOAD DATA statements followed
by a REORG statement), a SET statement after the first LOAD DATA statement
applies to all subsequent statements. After that session, the option value
reverts to the value specified in the rbw.config file; if no value is specified in
the rbw.config file, it reverts to its default value.

Lock Behavior
The TMU automatically locks the database or the affected tables during its
operations. If the database or table is already locked, then whether the TMU
returns immediately or waits for the lock to be released depends on the
behavior that the SET LOCK statement selects.


Syntax
To specify the behavior to use for a specific session when locked tables are
encountered, use the following syntax to enter a SET LOCK statement in the
TMU control file.

SET LOCK {WAIT | NO WAIT};

WAIT
    Default behavior. If the database or table is already locked, the TMU waits until any existing locks are released and then completes the operation. Use this mode with versioning operations.

NO WAIT
    If the database or table is already locked, the TMU returns a message that the operation failed because the database or table was locked. (The default behavior for the database server is also WAIT.)

Important: In cases where waiting for a lock might result in a deadlock, the lock
request is refused and control is returned to the lock requestor. Deadlocks occur only
when the LOCK TABLE or LOCK DATABASE command is used. A deadlock cannot be
caused by the automatic locking operations of the TMU. For more information about
deadlocks, refer to the Administrator’s Guide.

The following example illustrates how to set the lock behavior with a SET
LOCK statement:

set lock no wait;


Buffer Cache Size
TMU performance is affected by the size of the program’s buffer cache, as well
as system load and other factors related to the database. IBM recommends
that you use the default settings for the TMU buffer cache. After careful
analysis of your hardware, software, and user environment, however, you
might determine that changes to the buffer-cache size would improve
performance.

Syntax
To specify TMU buffer-cache size for a specific session, enter a SET statement
in the TMU control file; for all sessions, edit the TUNE parameter in the
rbw.config file.

SET TMU BUFFERS num_blocks ;

TUNE TMU_BUFFERS num_blocks

num_blocks
    Number of 8-kilobyte blocks. The num_blocks variable must be an integer between 128 and 131071 on 32-bit platforms or between 128 and 8388607 on 64-bit platforms. The default buffer-cache size is 128 blocks.

Important: If you increase the value significantly, monitor performance carefully because interactions between the operating system and the database implementation sometimes cause a large buffer cache to decrease rather than increase performance.

A SET statement can increase, but can never decrease, the current buffer-
cache size during a given session. For example, if the current buffer size is
1024 blocks, set either in the rbw.config file or by a previous SET statement in
the TMU control file, the size cannot be reduced to 512 with a SET statement.

The following examples illustrate how to specify buffer-cache size.

■ SET command:
SET TMU BUFFERS 1024;
■ rbw.config file entry:
TUNE TMU_BUFFERS 1024


Usage Note
If your LOAD DATA statement includes OPTIMIZE OFF syntax, the 128-page
default buffer-cache size might be too low to adequately handle index
searches. A typical indication of inadequate buffer cache is when the logical
I/O count returned in statistics message 500 at the end of the LOAD operation
is greater than half the number of input rows for the load. To increase perfor-
mance in this kind of situation, you can increase the size of the buffer cache.

Example
The following examples illustrate how to increase buffer-cache size.

■ SET command:
SET TMU BUFFERS 5000;
■ rbw.config file entry:
TUNE TMU_BUFFERS 5000

Temporary Space Management
As data is loaded and indexed, intermediate results are stored in memory
until they reach a threshold value, at which point they are written to disk. The
following parameters control how temporary space, both memory and disk,
is used when the OPTIMIZE option is ON:

■ INDEX TEMPSPACE DIRECTORIES, which specifies temporary space directories that the TMU uses for index-building operations.
■ INDEX TEMPSPACE THRESHOLD, which specifies the size at which
index-building operations are written to disk.
■ INDEX TEMPSPACE MAXSPILLSIZE, which specifies the maximum
size to which a spill file can grow.
■ TEMPSPACE DUPLICATESPILLPERCENT, which specifies the
percentage of the INDEX TEMPSPACE MAXSPILLSIZE that the TMU
can use for recording duplicate rows.

When the OPTIMIZE option is OFF, the entire INDEX TEMPSPACE MAXSPILLSIZE can be used for recording duplicate rows.


This section describes both the SET commands and the TUNE parameters in
the rbw.config file that control temporary space allocation and management.
For more information about temporary space management, which affects not
only TMU and PTMU operations but also SQL DDL statements, refer to the
Administrator’s Guide.

Syntax
To specify INDEX TEMPSPACE parameters for a specific session, enter a SET
statement in the TMU control file. For all sessions, edit the TUNE parameters
in the rbw.config file.

SET INDEX TEMPSPACE DIRECTORIES ’dir_path’ [, ’dir_path’ …] ;
SET INDEX TEMPSPACE THRESHOLD value ;
SET INDEX TEMPSPACE MAXSPILLSIZE size ;
SET INDEX TEMPSPACE DUPLICATESPILLPERCENT percent ;
SET INDEX TEMPSPACE RESET ;

TUNE INDEX_TEMPSPACE_DIRECTORY dir_path
TUNE INDEX_TEMPSPACE_THRESHOLD value
TUNE INDEX_TEMPSPACE_MAXSPILLSIZE size
TUNE INDEX_TEMPSPACE_DUPLICATESPILLPERCENT percent


DIRECTORY ’dir_path’, DIRECTORIES ’dir_path’, …
    Specifies a directory or a set of directories that are to be used for temporary files; dir_path must be a full pathname. To define a set of directories by using entries in the rbw.config file, enter multiple lines. The order in which you specify the directories has no effect because the order in which they are used is random (determined internally) and no user control is possible. The default directory is /tmp on UNIX and c:\tmp on Windows.

THRESHOLD value
    Specifies the amount of memory used before writing intermediate results from an index-building operation to disk. For operations involving multiple indexes, this threshold value is divided equally among the indexes being built. The size must be specified as kilobytes (K) or megabytes (M) by appending K or M to the number. No space is allowed between the number and the unit identifier (K, M). For example: 1024K, 500M. You must specify the threshold value before you specify the corresponding MAXSPILLSIZE value. The threshold value must precede the MAXSPILLSIZE entry in the rbw.config file. A value of 0 (K, M) causes files to be written to disk after the first 200 rows or index entries. The default threshold for index-building operations is 10 megabytes (10M).


MAXSPILLSIZE size
    Specifies the total maximum amount of temporary space per operation. For an operation involving multiple indexes, this space is divided equally among the indexes being built. You must specify the size as kilobytes (K), megabytes (M), or gigabytes (G) by appending K, M, or G to the number. No space is allowed between the number and the unit identifier (K, M, G). For example: 1024K, 500M, 8G. The default MAXSPILLSIZE value is 1 gigabyte (1G). The maximum MAXSPILLSIZE value is 2047 gigabytes.

DUPLICATESPILLPERCENT percent
    Specifies the percentage of the INDEX TEMPSPACE MAXSPILLSIZE used for the temporary storage of discarded duplicate rows. The percent value is an integer between 0 and 100 (inclusive). The default is two percent.

RESET
    Resets the index-building TEMPSPACE parameters to the values specified in the rbw.config file.

Usage Notes
In addition, use the following guidelines when you set index-building
temporary space parameters:

■ Always set the THRESHOLD value before you set the MAXSPILLSIZE
value.
■ Remember that the INDEX TEMPSPACE parameter settings in the
rbw.config file affect not only TMU index-building operations but
also SQL index-building operations.


Example

The following examples illustrate SET commands that you can use to change
parameters for a specific session:
SET INDEX TEMPSPACE THRESHOLD 2M;
SET INDEX TEMPSPACE MAXSPILLSIZE 3G;
SET INDEX TEMPSPACE DUPLICATESPILLPERCENT 5;

UNIX
SET INDEX TEMPSPACE DIRECTORIES ’/disk1/itemp’, ’/disk2/itemp’;

Windows
SET INDEX TEMPSPACE DIRECTORIES ’d:\itemp’, ’e:\itemp’;

The following example illustrates how to reset the INDEX_TEMPSPACE parameters to the values specified in the rbw.config file:
SET INDEX TEMPSPACE RESET;

The following examples illustrate entries in the rbw.config file that apply to
all sessions:
TUNE INDEX_TEMPSPACE_THRESHOLD 20M
TUNE INDEX_TEMPSPACE_MAXSPILLSIZE 8G
TUNE INDEX_TEMPSPACE_DUPLICATESPILLPERCENT 5

UNIX
TUNE INDEX_TEMPSPACE_DIRECTORY /disk1/itemp
TUNE INDEX_TEMPSPACE_DIRECTORY /disk2/itemp
TUNE INDEX_TEMPSPACE_DIRECTORY /disk3/itemp

Windows
TUNE INDEX_TEMPSPACE_DIRECTORY d:\itemp
TUNE INDEX_TEMPSPACE_DIRECTORY e:\itemp
TUNE INDEX_TEMPSPACE_DIRECTORY f:\itemp


Format of Datetime Values
To use an alternative date format (not ANSI SQL-92 datetime format) for a
date constant specified in a TMU statement, you must use a TMU SET
DATEFORMAT statement to specify the input format of the date when the
format includes numeric month values and the default order of mdy is not
used. (You can use a date constant in a LOAD DATA statement to load a
constant into a date column, to specify a date value in an ACCEPT or REJECT
clause, or in an UNLOAD statement in a WHERE clause.) The SET
DATEFORMAT statement must precede the LOAD DATA or UNLOAD
statement that it applies to in the control file.

SET DATEFORMAT ’format’ ;

format
    Order of month, day, and year components for non-ANSI SQL-92 (alternative) date inputs, using a combination of the characters m, d, and y. The default format is mdy. This SET statement uses the same format combinations as the SQL SET statement. For more information about formats, refer to the SET DATEFORMAT statement in the SQL Reference Guide.

The following statement specifies that any constants in the TMU statement
that follows are assumed to be in ymd format (year, month, day). For example:
2000/11/30:
SET DATEFORMAT ’ymd’;

The following statement specifies that any constants in the TMU statement
that follows are assumed to be in dmy format (day, month, year). For example:
30/11/2000:
SET DATEFORMAT ’dmy’;


Load Information Limit
Information about each load operation is stored in the RBW_LOADINFO
system table, one row per operation. The RBW_LOADINFO_LIMIT configu-
ration parameter specifies the maximum number of rows that you can store
in that table, thereby allowing you to control the amount of historical load
information that the system records.

You can specify the RBW_LOADINFO_LIMIT configuration parameter only as an entry in the rbw.config file (there is no equivalent SET statement), and it applies to all TMU sessions. The following diagram shows the syntax for this entry.

DEFAULT RBW_LOADINFO_LIMIT value

value
    Number of rows of information to be stored. The value specified has no upper limit; however, it should be set to a reasonable number. The default value is 256 rows.

Usage Notes
■ If you set this parameter to a value less than the current value, the
RB_DEFAULT_LOADINFO file is truncated. However, the original file
is saved as RB_DEFAULT_LOADINFO.save.
■ The current value of RBW_LOADINFO_LIMIT is stored in the
RBW_OPTIONS system table.

Example
To check the value of RBW_LOADINFO_LIMIT, query the RBW_OPTIONS
table:
select option_name, value from rbw_options where option_name
= ’RBW_LOADINFO_LIMIT’;
OPTION_NAME VALUE
RBW_LOADINFO_LIMIT 256


Memory Map Limit
Memory-mapping the primary-key indexes of dimension tables into shared
memory could significantly accelerate referential integrity checking.

By default, the TMU will attempt to use available shared memory to map all
the indexes needed. If there is not enough available shared memory to allow
full mapping of all the dimension primary-key indexes, the TMU will map as
much of the indexes as possible into the available shared memory.

To maximize LOAD operation performance, IBM recommends configuring enough shared memory to allow all the dimension table primary-key indexes to be mapped into shared memory. Partial mapping of these indexes will likely result in a significantly longer load time. The TMU MMAP LIMIT statement or the TUNE TMU_MMAP_LIMIT parameter enables you to control the amount of shared memory available to the TMU.

To control the amount of memory allocated to memory-mapping indexes for a single session, enter a SET statement in the TMU control file. To control the amount of memory allocated to memory-mapping indexes for all sessions, edit the TUNE parameters in the rbw.config file. The syntax is as follows.

SET TMU MMAP LIMIT size ;

TUNE TMU_MMAP_LIMIT size

size
    Integer that indicates the maximum amount of memory available for mapping indexes during a LOAD or REORG operation. The size value is specified in kilobytes (K) or megabytes (M). If no size is specified, the default is the operating-system limit.

During the LOAD or REORG operation, the TMU maps up to the available
memory map limit. If the primary-key indexes of the dimension tables are
larger than the memory-map limit, the TMU maps the index up to the
memory-map limit, and then accesses the rest of the indexes through the
buffer cache.
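
The following examples illustrate how to set the memory-map limit for a single session and for all sessions; the 512-megabyte value is illustrative and should be sized to hold your dimension primary-key indexes.

■ SET command:
SET TMU MMAP LIMIT 512M;
■ rbw.config file entry:
TUNE TMU_MMAP_LIMIT 512M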


Setting Precomputed View Maintenance
Precomputed view maintenance automatically updates aggregate tables
whenever their detail tables are updated.

Syntax
To specify the behavior to use for a specific session, use the following syntax
to enter a PRECOMPUTED VIEW MAINTENANCE statement in the TMU
control file. To specify the behavior for all sessions, edit the OPTION
parameter in the rbw.config file.

SET PRECOMPUTED VIEW MAINTENANCE {ON | OFF};

OPTION PRECOMPUTED_VIEW_MAINTENANCE {ON | OFF}

You can view the current value for this parameter in the RBW_OPTIONS
system table. Within the OPTION_NAME column, locate
PRECOMPVIEW_MAINTENANCE; the setting is in the VALUE column.
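
For example, to turn off automatic aggregate maintenance for a single session, or for all sessions, you might use:

■ SET command:
SET PRECOMPUTED VIEW MAINTENANCE OFF;
■ rbw.config file entry:
OPTION PRECOMPUTED_VIEW_MAINTENANCE OFF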

Precomputed View Maintenance On Error
The PRECOMPUTED VIEW MAINTENANCE ON ERROR statement specifies the
action that the versioned database takes when it encounters an aggregate
table that cannot be maintained due to an error during maintenance. This
statement applies only to versioned databases.


Syntax
To specify the behavior to use for a specific session, use the following syntax
to enter a PRECOMPUTED VIEW MAINTENANCE ON ERROR statement in the
TMU control file. To specify the behavior for all sessions, edit the OPTION
parameter in the rbw.config file.

SET PRECOMPUTED VIEW MAINTENANCE ON ERROR {ROLLBACK | INVALIDATE};

OPTION PRECOMPUTED_VIEW_MAINTENANCE_ON_ERROR {ROLLBACK | INVALIDATE}

ROLLBACK
    The entire transaction is rolled back, thus restoring all of the tables (including aggregate tables) to their original condition.

INVALIDATE
    The offending aggregates are marked invalid.

You can view the current setting for this parameter in the RBW_OPTIONS
system table. Within the OPTION_NAME column, locate
PRECOMPVIEW_MAINTENANCE_ON_ERROR; the setting is in the VALUE
column.
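
For example, to mark unmaintainable aggregates invalid instead of rolling back the entire transaction, you might use:

■ SET command:
SET PRECOMPUTED VIEW MAINTENANCE ON ERROR INVALIDATE;
■ rbw.config file entry:
OPTION PRECOMPUTED_VIEW_MAINTENANCE_ON_ERROR INVALIDATE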


Managing Row Messages
You can manage how and when you view messages and warnings returned
during the LOAD process by either sending the messages to a specific file or
viewing them during the LOAD processing.

Syntax
To control the amount of messages and warnings viewed for a single session,
enter a SET statement in the TMU control file. To control the amount of
messages and warnings viewed for all sessions, edit the TUNE parameters in
the rbw.config file. The syntax is as follows.

SET TMU ROW MESSAGES {FULL | NONE};

TUNE TMU_ROW_MESSAGES {FULL | NONE}

FULL
    If TMU ROW MESSAGES is set to FULL and the ROW MESSAGES clause of the LOAD DATA statement specifies a filename, all row-level warning messages go to that file. If TMU ROW MESSAGES is set to FULL and no ROW MESSAGES clause is present, all row-level warning messages go to standard output.

NONE
    If TMU ROW MESSAGES is set to NONE, the ROW MESSAGES clause, if specified, is ignored, and no warning-level messages are produced.

For both the FULL and NONE settings, messages also go to the log file, depending on their error-level setting.

Multiple logging operations could slow down the loading operation significantly. For information on how to configure logging, refer to the Administrator’s Guide.
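
For example, to suppress row-level warning messages for a single session, or for all sessions, you might use:

■ SET command:
SET TMU ROW MESSAGES NONE;
■ rbw.config file entry:
TUNE TMU_ROW_MESSAGES NONE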


Enabling Versioning
You can run a TMU operation as a versioned transaction on databases where
versioning is enabled. The SET TMU VERSIONING statement and OPTION
TMU_VERSIONING rbw.config file parameter specify whether TMU opera-
tions are executed as versioned or blocking transactions. A versioned
transaction allows query operations to occur on a previously committed
version of the database while a new version is being written. A blocking
transaction locks all of the tables involved and does not allow query opera-
tions to begin until the transaction is complete. For information about setting
up a versioned database, refer to the Administrator’s Guide.

The following TMU operations can run as versioned transactions:

■ Online LOAD operations
■ REORG
■ SYNCH

Important: Each versioned PTMU transaction requires approximately 100 kilobytes of extra shared memory.

To control TMU versioning for a single session, enter a SET statement in the
TMU control file. For all sessions, edit the OPTION parameter in the
rbw.config file.

SET TMU VERSIONING {OFF | ON | RECOVER};

OPTION TMU_VERSIONING {OFF | ON | RECOVER}

When TMU VERSIONING is set to OFF, all TMU operations are run as blocking transactions. The default value is OFF. When TMU VERSIONING is set to ON, data is loaded directly into the version log, and all TMU LOAD and REORG operations are run as versioned transactions.


When TMU VERSIONING is set to RECOVER, data is loaded directly into the
version log. When the load commits, the changed blocks in the version log
are moved back to the database. If the PTMU is used, this operation is
performed in parallel across the load processes. The RECOVER option needs
exclusive access to the target table, so concurrent queries on the target table
are not allowed during the LOAD operation.

Using the version log improves LOAD operation stability and recoverability. Compared to TMU VERSIONING ON, TMU VERSIONING RECOVER reduces version-log usage and makes it easier to back up database files.

TMU VERSIONING RECOVER is supported for online LOAD operations, but not for offline LOAD or REORG operations. During the LOAD operation, the ALTER DATABASE FREEZE QUERY REVISION statement should not be issued.
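
For example, to run an online LOAD as a versioned transaction with the RECOVER behavior, you might place the following statement at the top of the control file, or set the equivalent rbw.config entry for all sessions:

■ SET command:
SET TMU VERSIONING RECOVER;
■ rbw.config file entry:
OPTION TMU_VERSIONING RECOVER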

Commit Record Interval
The TMU COMMIT RECORD INTERVAL statement specifies the number of
records to load into a table between each commit operation. This function is
available for both the TMU and the PTMU.

To control the TMU COMMIT RECORD INTERVAL statement for a single session, enter a SET statement in the TMU control file. To control the TMU COMMIT RECORD INTERVAL statement for all sessions, edit the OPTION parameter in the rbw.config file.

SET TMU COMMIT RECORD INTERVAL {OFF | record_count};

OPTION TMU_COMMIT_RECORD_INTERVAL {OFF | record_count}

record_count
    Number of records, indicated by a positive integer, after which the load commits. After each commit, the load proceeds until record_count is reached again and then another commit occurs. Setting the value to 0 disables the record interval and is the same as specifying OFF (the default).


Usage Notes
■ After each record-interval commit operation, the TMU issues an
information message indicating the number of rows that were
loaded. If the current transaction fails, the database rolls back to the
state of the last completed commit interval.
■ The TMU COMMIT RECORD INTERVAL statement is valid only with
TMU VERSIONING set to ON or RECOVER. This statement has no
effect if VERSIONING is set to OFF.
■ When performing a versioned load with OPTIMIZE ON, each commit
operation requires an index merge to occur. Depending on the values
of the INDEX TEMPSPACE parameters, this requirement might result
in more merges during the index-building phases, and therefore less-
efficient indexes than if the load operation completed as a single
transaction.
■ If this statement is used in conjunction with the TMU COMMIT TIME
INTERVAL statement, a commit is performed when either condition is
met. After the commit occurs, both counters are reset and loading
continues until the next interval (either TIME or RECORD) occurs.
■ If this statement is used in conjunction with the
PRECOMPUTED_VIEW_MAINTENANCE ON statement, all precom-
puted views are maintained automatically each time a commit is
made. Thus, precomputed views remain valid and synchronized
with the detail table data.


Commit Time Interval
The TMU COMMIT TIME INTERVAL statement specifies the amount of time, in
minutes, to load data into a table before each commit operation. This function
is available for both the TMU and the PTMU.

To control the TMU COMMIT TIME INTERVAL statement for a single session,
enter a SET statement in the TMU control file. To control the TMU COMMIT
TIME INTERVAL statement for all sessions, edit the OPTION parameter in the
rbw.config file.

SET TMU COMMIT TIME INTERVAL {OFF | minutes};

OPTION TMU_COMMIT_TIME_INTERVAL {OFF | minutes}

minutes
    A positive integer specifying the number of minutes after which a load operation commits. After each commit, the load proceeds until the number of minutes is reached again and then another commit occurs. Setting the value to 0 disables the time interval and is the same as specifying OFF (the default). The time specified in the TMU COMMIT TIME INTERVAL statement is the time between when the transaction begins and when the commit operation begins. The commit operation might take some time to complete.
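
For example, to commit approximately every 30 minutes (an illustrative value), you might use:

■ SET command:
SET TMU COMMIT TIME INTERVAL 30;
■ rbw.config file entry:
OPTION TMU_COMMIT_TIME_INTERVAL 30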

Usage Notes
■ After each time-interval commit operation, the TMU issues an infor-
mation message indicating the number of rows that were loaded. If
the current transaction fails, the database rolls back to the state of the
last completed commit interval.
■ The TMU COMMIT TIME INTERVAL statement is valid only with TMU
VERSIONING set to ON or RECOVER. This statement has no effect if
VERSIONING is set to OFF.


■ When performing a versioned load with OPTIMIZE ON, each commit operation requires an index merge to occur. Depending on the values of the INDEX TEMPSPACE parameters, this requirement might result in more merges during the index-building phases, and therefore less-efficient indexes than if the load operation completed as a single transaction.
■ If this statement is used in conjunction with the TMU COMMIT
RECORD INTERVAL statement, a commit is performed when either
condition is met. After the commit occurs, both counters are reset
and loading continues until the next interval (either TIME or
RECORD) occurs.
■ If this statement is used in conjunction with the
PRECOMPUTED_VIEW_MAINTENANCE ON statement, all precom-
puted views are maintained automatically each time a commit is
made. Thus, precomputed views remain valid and synchronized
with the detail table data.

Example
The following example shows a control file with a LOAD DATA statement for
the Sales table in a versioned Aroma database where the commit interval is
set for every 20,000 records. The following code segment is the text of the
TMU control file:

set tmu versioning on;

set tmu commit record interval 20000;

load data inputfile ’$RB_CONFIG/sample_input/aroma_sales.txt’
recordlen 86
insert
into table sales (
perkey position(2) integer external(11) nullif(1)=’%’,
classkey position(14) integer external(11) nullif(13)=’%’,
prodkey position(26) integer external(11) nullif(25)=’%’,
storekey position(38) integer external(11) nullif(37)=’%’,
promokey position(50) integer external(11) nullif(49)=’%’,
quantity position(62) integer external(11) nullif(61)=’%’,
dollars position(74) decimal external(12)
radix point ’.’ nullif(73)=’%’);


The following example shows the messages produced by the previous TMU
operation:
** STATISTICS ** (500) Time = 00:00:00.01 cp time, 00:00:00.00
time, Logical IO count=0
** STATISTICS ** (500) Time = 00:00:00.01 cp time, 00:00:00.00
time, Logical IO count=0
** INFORMATION ** (366) Loading table SALES.
** INFORMATION ** (8555) Data-loading mode is INSERT.
** INFORMATION ** (8707) Versioning is active.
** INFORMATION ** (8710) Interval commit set to 20000 records.
** INFORMATION ** (352) Row 3 of index SALES_STAR_IDX is out of
sequence. Switching to standard optimized index building. Loading
continues...
** INFORMATION ** (8708) Performing interval commit...
** INFORMATION ** (513) Starting merge phase of index building
SALES_STAR_IDX.
** INFORMATION ** (367) Rows: 20000 inserted. 0 updated. 0
discarded. 0 skipped.
** INFORMATION ** (8709) Commit complete. Loading continues...
** INFORMATION ** (8708) Performing interval commit...
** INFORMATION ** (513) Starting merge phase of index building
SALES_STAR_IDX.
** INFORMATION ** (367) Rows: 40000 inserted. 0 updated. 0
discarded. 0 skipped.
** INFORMATION ** (8709) Commit complete. Loading continues...
** INFORMATION ** (8708) Performing interval commit...
** INFORMATION ** (513) Starting merge phase of index building
SALES_STAR_IDX.
** INFORMATION ** (367) Rows: 60000 inserted. 0 updated. 0
discarded. 0 skipped.
** INFORMATION ** (8709) Commit complete. Loading continues...
** INFORMATION ** (315) Finished file
/redbrick/sample_input/aroma_sales.txt. 69941 rows read from this
file.
** INFORMATION ** (513) Starting merge phase of index building
SALES_STAR_IDX.
** INFORMATION ** (367) Rows: 69941 inserted. 0 updated. 0
discarded. 0 skipped.
** STATISTICS ** (500) Time = 00:00:29.21 cp time, 00:00:33.73
time, Logical IO count=1044

Notice that this operation performs four commit operations—one each after
inserting 20,000, 40,000, 60,000, and 69,941 records.


Displaying Load Statistics
The SET STATS statement turns on statistics reporting for the current TMU
session. The following syntax diagram shows how to construct a SET STATS
statement.

SET STATS {ON | INFO | OFF};

ON
    Returns a summary of statistical information for each load operation.

INFO
    Returns the same statistics as the ON setting, along with additional information about the load operation, such as server messages generated during precomputed view maintenance.

OFF
    Turns off statistics reporting. The default setting is OFF.
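
For example, to request the fuller report for the current session, you might place the following statement at the top of the control file:

SET STATS INFO;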

Backup and Restore (BAR) Unit Size


The BAR unit size represents the suggested maximum size of individual
backup files and XBSA objects, except in those cases where the size of the
backed-up blocks for one PSU exceeds the BAR unit size. Backup blocks for a
single PSU cannot be split across different backup files or XBSA objects, but
blocks for different PSUs can be backed up within a single backup file or XBSA
object.

SET BAR_UNIT_SIZE size K|M|G ;

TUNE BAR_UNIT_SIZE size K|M|G


size K | M | G
    Specifies the maximum size of a backup file, in kilobytes (K), megabytes (M), or gigabytes (G). This limit does not apply when the backup data for a single PSU exceeds the specified size. The default setting for this parameter is 256 megabytes. The minimum value for this parameter is 1 megabyte.

For more information about BAR unit configuration, see page 8-15.
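
For example, to suggest a 512-megabyte maximum for individual backup files (an illustrative value), you might use:

■ SET command:
SET BAR_UNIT_SIZE 512M;
■ rbw.config file entry:
TUNE BAR_UNIT_SIZE 512M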

External Backup and Restore Operations
To support a mixture of external full backups and TMU incremental backups
of the same database, execute the SET FOREIGN FULL BACKUP command
immediately before performing an external backup operation. This
command resets the backup segment and effectively states that a reliable
external backup is about to be created. In turn, TMU incremental backups can
follow, just as if a TMU full backup had been done. For more information
about external backups, see page 8-20.

An equivalent SET FOREIGN FULL RESTORE command supports the use of external full restore operations. Issue this command immediately after the external restore is performed and before any connections or changes can be made to the restored database. For more details about foreign restore operations, see page 9-11.

SET FOREIGN FULL BACKUP ;

SET FOREIGN FULL RESTORE ;

The SET FOREIGN FULL BACKUP and SET FOREIGN FULL RESTORE
commands require the BACKUP_DATABASE and RESTORE_DATABASE task
authorizations, respectively.
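
As a minimal sketch of the intended sequence (the control file name foreign.tmu is illustrative), you might run the SET command from its own control file immediately before the external backup:

% cat foreign.tmu
set foreign full backup;
% rb_tmu foreign.tmu system manager

You would then perform the external full backup with your backup tool; subsequent TMU incremental backups can follow, just as if a TMU full backup had been done.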


REORG Tasks
The following statements control the number of input-tasks and
index-builder tasks during a REORG operation:

■ SET TMU MAX TASKS, which specifies an upper bound on the total
number of tasks allocated for input tasks and index builder tasks.
The total number of tasks specified by the MAX TASKS statement
must be at least two.
■ SET TMU INPUT TASKS, which specifies the number of tasks allocated
to scan the target table. Because the number of conversion tasks is
always identical to the number of INPUT tasks, this option also
controls the number of conversion tasks.
■ SET TMU INDEX TASKS, which specifies the number of tasks allocated
for indexes.

The SET TMU INPUT TASKS statement and the SET TMU INDEX TASKS
statement should not be used in conjunction with the SET TMU MAX TASKS
statement.

Syntax
The following syntax diagram shows how to construct a SET TMU TASK
statement or a TUNE TMU_TASKS rbw.config file parameter.

SET TMU {MAX | INPUT | INDEX} TASKS num_tasks ;

TUNE {TMU_MAX_TASKS | TMU_INPUT_TASKS | TMU_INDEX_TASKS} num_tasks


num_tasks
    Integer that indicates the number of tasks. If num_tasks is set to 0, the REORG operation resets to its default behavior.

Warning: The REORG operation might not allocate all the specified INPUT or INDEX
tasks if the tasks are deemed excessive.

Examples
The following examples illustrate SET statements that you can use to change
parameters for a specific session:
SET TMU MAX TASKS 5;
SET TMU INPUT TASKS 3;
SET TMU INDEX TASKS 2;

The following examples illustrate entries in the rbw.config file that apply to
all sessions:
TUNE TMU_MAX_TASKS 5
TUNE TMU_INPUT_TASKS 3
TUNE TMU_INDEX_TASKS 2

Parallel Loading Tasks (PTMU Only)
As the PTMU loads data, it can use multiple tasks (even on a single CPU) to
perform the data conversion and index-building portions of the load
operation. You can control the amount of parallel processing for both data
conversion and for index-building, based on your site resources and
workload requirements. For more information about how parallel processing
is used to load data, refer to “Processing Stages for Loading Data” on
page 3-8.


Syntax
To set the PTMU parallel-processing parameters for a single session, enter
either or both SET statements in the PTMU control file. To set the PTMU
parallel-processing parameters for all sessions, edit the TUNE parameters in
the rbw.config file.

SET TMU CONVERSION TASKS num_tasks ;

SET TMU INDEX TASKS num_tasks ;

TUNE TMU_CONVERSION_TASKS num_tasks

TUNE TMU_INDEX_TASKS num_tasks

TMU CONVERSION TASKS    Tasks that convert input data to the platform-based internal format used to represent data. These tasks also ensure uniqueness and check referential integrity whenever such checks are performed. The number specified is the actual number of tasks used. The default value is one-half the number of processors on the computer (as determined from the hardware).

TMU INDEX TASKS    Tasks that make index entries into nonunique indexes, corresponding to the data being loaded. Each nonunique index can have at most one task associated with it. The number specified with this parameter is the maximum number of tasks that can be used to process all nonunique indexes. The actual number of tasks used is the smaller of the number of nonunique indexes and the number specified with this parameter. The default value is one task per nonunique index.

Use this parameter if you want to use fewer index tasks than provided by the default value.

The task that makes the entries into unique indexes is not affected by this parameter.

num_tasks Integer that indicates the number of tasks.


Example
The following examples illustrate how to use the SET statements and TUNE
parameter entries.

To control parallel processing for a single PTMU session within a control file:
set tmu conversion tasks 5;
set tmu index tasks 8;

To control parallel processing for all PTMU sessions by using the rbw.config
file:
TUNE tmu_conversion_tasks 5
TUNE tmu_index_tasks 8

Example
To illustrate how the TMU CONVERSION TASKS parameter works, assume you have 8 processors on the system. By default, 4 of them are used for conversion tasks. If you want to use more than 4 conversion tasks, set the TMU CONVERSION TASKS parameter to a number larger than 4.

Example
To illustrate how the TMU INDEX TASKS parameter works, assume you have
5 nonunique indexes and TMU INDEX TASKS is set to 3. In this case, 3 tasks
are used, and some tasks process multiple indexes in parallel. If you have 5
nonunique indexes and TMU INDEX TASKS is set to 6, 5 tasks are used, one
per nonunique index.

Serial Mode Operation (PTMU Only)
You can force the PTMU to run in a serial mode in which no parallel
processing is used. In this case, the PTMU is effectively running as the TMU.
This capability is useful in cases where you do not want the resource
consumption and overhead of parallel processing. You can use the TMU
instead of the PTMU, but the ability to run the PTMU in serial mode allows
you to combine operations in a single control file to be executed by the PTMU.
Within the control file, you specify those operations that are to be executed in
serial mode, with all other operations to be executed in parallel mode.


Important: The TMU SERIAL MODE parameter affects only those operations for which the PTMU uses parallel processing. It does not affect those operations for which the PTMU normally uses serial processing, as defined on page 2-52.

To control PTMU serial mode for a single session, enter a SET statement in the
PTMU control file. To control PTMU serial mode for all sessions, edit the TUNE
parameters in the rbw.config file.

SET TMU SERIAL MODE OFF|ON ;

TUNE TMU_SERIAL_MODE OFF|ON

Example
Suppose you have a PTMU control file that performs multiple operations.
Most operations are performed in parallel, but you want the following opera-
tions to be performed in serial mode:

■ Loading some small dimension tables where the overhead of parallel processing causes the operation to actually take longer in parallel mode than in serial mode.
■ Loading a large table that would, in parallel mode, preclude
reasonable performance for other users, whereas in serial mode, the
operation would complete in the required time and not affect other
users.

The SET TMU SERIAL MODE statement (included wherever needed in the control file) directs the PTMU to switch between parallel and serial mode.
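The following control-file sketch illustrates the pattern (the table and file names are hypothetical, and the field lists are elided):

set tmu serial mode on;
load data inputfile 'period.txt' insert
into table period … ;
set tmu serial mode off;
load data inputfile 'sales.txt' append
into table sales … ;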

5XQQLQJWKH708DQG3708 
6XJJHVWLRQVIRU(IIHFWLYH37082SHUDWLRQV

Suggestions for Effective PTMU Operations
Although the TMU and PTMU function similarly, the PTMU has some
exclusive functions. These are detailed in the following sections.

Operations That Use Parallel Processing
The PTMU uses parallel processing for some operations and serial processing
for others:

■ Parallel processing is used for:


❑ Online LOAD operations in INSERT, APPEND, MODIFY, UPDATE,
and REPLACE modes
❑ REORG operations
■ Serial processing is used for:
❑ Offline LOAD operations
❑ SYNCH, UNLOAD, and UPGRADE operations

When loading data in MODIFY or UPDATE mode, the PTMU does not support
using a combination of pseudo- and regular columns in an ACCEPT or REJECT
Criteria clause. Instead, the PTMU proceeds in serial mode.

When loading data in MODIFY or UPDATE modes, PTMU referential-integrity checking is performed in the output task when the LOAD DATA statement contains any or all of the following items:

■ AGGREGATE clause
■ AUTOROWGEN clause
■ RETAIN clause
■ ACCEPT or REJECT clause with one column as a regular column

If you use the PTMU to load data to take advantage of parallel processing, do
not use the MODIFY mode when you can use the APPEND or INSERT mode.
For example, to load data into an empty table, use the INSERT mode instead
of the MODIFY mode. To add new rows to an existing table without
modifying any existing rows, use the APPEND mode instead of the MODIFY
mode.


Discard Limits on Parallel Load Operations
Because the PTMU processes multiple input rows at the same time, its
behavior might be different from the TMU when a load operation reaches the
discard limit and ends early. For example, if input row 500 is discarded,
causing the maximum discard limit to be exceeded, the PTMU might already
be processing rows 501, 502, and so on. If these rows are being discarded,
messages appear for these rows and they appear in the discard file (if
specified) even though they are beyond the limit. Any rows beyond the
discard limit that are inserted into the table are removed before the load
operation ends.

The same behavior occurs if the load ends prematurely for other similar
reasons, such as exceeding the MAXROWS PER SEGMENT limit on the target
table.

AUTOROWGEN with the PTMU
The PTMU supports automatic row generation during parallel-load operations. However, automatic row generation reduces the amount of parallelism that can be applied to the operation because the referential-integrity checking and row generation must all be done by a single process. For the best performance, do not use automatic row generation for parallel loads. The decrease in performance can be significant.

To load a large amount of data that might have only a few rows that fail referential-integrity checking, load the data by specifying AUTOROWGEN OFF and naming a discard file. Then load the rows in the discard file by specifying AUTOROWGEN ON.

Important: This strategy is not appropriate if you expect a large number of discarded rows because the discard processing significantly slows the load processing.
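A two-pass sketch of this strategy follows (the file names are hypothetical and the field lists are elided):

load data inputfile 'sales.txt' append
discardfile 'sales_ri.txt'
autorowgen off
into table sales … ;

load data inputfile 'sales_ri.txt' append
autorowgen on
into table sales … ;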


Multiple Tape Drives with the PTMU
UNIX Not all UNIX systems support multiple tape drives with the PTMU. ♦

If you use the PTMU with multiple tape drives (such as 8-mm drives), you can
load data in sequence into a database without operator intervention. Include
the following clause in the control file to specify the filename and each device
name:
INPUTFILE 'filename' TAPE DEVICE 'device_name[,…]'

The following example illustrates how to describe and load multiple tape
drives for the PTMU. The following line in the control file loads data from
tape devices tx0 and tx1:
INPUTFILE 'myfile' TAPE DEVICE '/dev/rmt/tx0,/dev/rmt/tx1'

If the PTMU loads all the data from tx0 and then from tx1 and still does not
reach the end of the file, it pauses and requests that the next tape be loaded.
Load the next tape in the tape device tx0.

3480/3490 Multiple Tape Drive with the PTMU
UNIX Support for the 3480/3490 tape drive is not available on all UNIX systems. ♦

If you use the PTMU with the 3480 or 3490 multiple-tape drive, include the
following clause in the control file, specifying a device name and a range of
cartridges for each tape device:
INPUTFILE 'filename' TAPE DEVICE 'device_name[(start-end)][,…]'

The following line in the control file loads data from 3480/3490 tape devices
tf0 and tf1:
INPUTFILE 'myfile' TAPE DEVICE '/dev/rmt/tf0(1-3),/dev/rmt/tf1(1-3)'

If, after loading all the data on tf1, the PTMU does not reach the end of the file,
it pauses and requests that the next tape be loaded.

 7DEOH0DQDJHPHQW8WLOLW\5HIHUHQFH*XLGH
0XOWLSOH7DSH'ULYHZLWKWKH3708

Data is loaded in the following order.

Load Order    Tape Device    Cartridge

Tape 1        tf0            1
Tape 2        tf1            1
Tape 3        tf0            2
Tape 4        tf1            2
Tape 5        tf0            3
Tape 6        tf1            3
(Operator prompted for new cartridges)
Tape 7        tf0            1
Tape 8        tf1            1
Tape 9        tf0            2

Chapter 3

Loading Data into a Warehouse Database
In This Chapter 3-5
The LOAD DATA Operation . . . . . . . . . . . . . . 3-6
Inputs and Outputs . . . . . . . . . . . . . . . . 3-6
Processing Stages for Loading Data . . . . . . . . . . . 3-8
Input Stage . . . . . . . . . . . . . . . . . . 3-10
Conversion Stage . . . . . . . . . . . . . . . . 3-11
Main Output and Index Stages . . . . . . . . . . . 3-11
Error Handling and Cleanup Stage . . . . . . . . . 3-11
Procedure for Loading Data . . . . . . . . . . . . . . . 3-12
Some Preliminary Decisions. . . . . . . . . . . . . . . 3-14
Determining Table Order . . . . . . . . . . . . . . 3-14
Ordering Input Data . . . . . . . . . . . . . . . . 3-15
Maintaining Referential Integrity with Automatic Row Generation 3-16
Discarding Records That Violate Referential Integrity . . . 3-16
Adding Generated Rows to Referenced Tables. . . . . . 3-17
Modifying the Input Rows . . . . . . . . . . . . 3-19
Adding Rows in Mixed Mode . . . . . . . . . . . 3-21
Specifying the AUTOROWGEN Mode . . . . . . . . 3-22
Writing a LOAD DATA Statement . . . . . . . . . . . . 3-23
LOAD DATA Syntax . . . . . . . . . . . . . . . . . 3-24
Input Clause . . . . . . . . . . . . . . . . . . . . 3-25
Format Clause . . . . . . . . . . . . . . . . . . . 3-29
EBCDIC to ASCII Conversion . . . . . . . . . . . . . 3-35
IBM Syntactic Code Set (CS 640) . . . . . . . . . . 3-36
Two Approaches to Loading EBCDIC Data . . . . . . . 3-36
Examples: Format Clause . . . . . . . . . . . . . 3-37
Locale Clause . . . . . . . . . . . . . . . . . . . . 3-38
Locale Specifications for XML Input Files . . . . . . . . . 3-41
Usage Notes . . . . . . . . . . . . . . . . . . . 3-42

Discard Clause . . . . . . . . . . . . . . . . . . . 3-43


Usage . . . . . . . . . . . . . . . . . . . . . 3-54
Default Values . . . . . . . . . . . . . . . . . 3-54
Table Locks . . . . . . . . . . . . . . . . . . 3-54
Conflicts in Mixed-Mode Operation . . . . . . . . . 3-55
DEFAULT Mode and Simple Star Schemas . . . . . . . 3-56
Troubleshooting . . . . . . . . . . . . . . . . 3-57
Row Messages Clause . . . . . . . . . . . . . . . . . 3-57
Optimize Clause . . . . . . . . . . . . . . . . . . . 3-59
MMAP Index Clause . . . . . . . . . . . . . . . . . 3-63
Table Clause . . . . . . . . . . . . . . . . . . . . 3-65
Loading a SERIAL Column . . . . . . . . . . . . . . 3-68
Selective Column Updates with RETAIN and DEFAULT . . . 3-69
Simple Fields . . . . . . . . . . . . . . . . . . 3-71
xml_path Specification . . . . . . . . . . . . . . 3-75
Aggregate Operators . . . . . . . . . . . . . . 3-76
Example: Position Clause . . . . . . . . . . . . . 3-77
Example: XML Data and Corresponding Control File . . . 3-78
Example: NULLIF . . . . . . . . . . . . . . . 3-79
Example: Auto Aggregate . . . . . . . . . . . . . 3-80
Example: ROUND Function . . . . . . . . . . . . 3-80
Concatenated Fields . . . . . . . . . . . . . . . . 3-81
Constant Fields . . . . . . . . . . . . . . . . . . 3-84
Sequence Fields . . . . . . . . . . . . . . . . . . 3-85
Increment Fields . . . . . . . . . . . . . . . . . 3-86

Segment Clause . . . . . . . . . . . . . . . . . . . 3-87


Criteria Clause . . . . . . . . . . . . . . . . . . . 3-90
Usage . . . . . . . . . . . . . . . . . . . . 3-93
Comment Clause . . . . . . . . . . . . . . . . . . 3-95
Field Types. . . . . . . . . . . . . . . . . . . . . 3-97
Character Field Type . . . . . . . . . . . . . . . . 3-99
Numeric External Field Types . . . . . . . . . . . . . 3-101
Floating-Point External Field Type . . . . . . . . . . . 3-103
Packed and Zoned Decimal Field Types . . . . . . . . . . 3-104
Integer Binary Field Types . . . . . . . . . . . . . . 3-105
Floating-Point Binary Field Types . . . . . . . . . . . . 3-106
Datetime Field Types . . . . . . . . . . . . . . . . 3-107

Format Masks for Datetime Fields . . . . . . . . . . . . . 3-109


Subfield Components . . . . . . . . . . . . . . . . 3-110
Format Masks to read Input Fields . . . . . . . . . . 3-113
Restricted Datetime Masks for Numeric Fields . . . . . . . . . 3-116
Requirements for Input Data for Datetime Masks . . . . . 3-118
Writing a SYNCH Statement . . . . . . . . . . . . . . . 3-119
Format of Input Data . . . . . . . . . . . . . . . . . . 3-122
Disk Files . . . . . . . . . . . . . . . . . . . . 3-123
Fixed-Format Records . . . . . . . . . . . . . . . 3-123
Variable-Format Records . . . . . . . . . . . . . . 3-124
Separated-Format Records . . . . . . . . . . . . . 3-128
XML Format . . . . . . . . . . . . . . . . . . 3-129
Tape Files on UNIX Operating Systems . . . . . . . . . . 3-131
TAR Tapes . . . . . . . . . . . . . . . . . . . 3-131
ANSI-Standard Label Tapes . . . . . . . . . . . . . 3-132
Field-Type Conversions . . . . . . . . . . . . . . . . . 3-133
LOAD DATA Syntax Summary . . . . . . . . . . . . . . 3-137
input_clause . . . . . . . . . . . . . . . . . . 3-137
format_clause . . . . . . . . . . . . . . . . . 3-138
locale_clause . . . . . . . . . . . . . . . . . . 3-138
discard_clause . . . . . . . . . . . . . . . . . 3-139
rowmessages_clause . . . . . . . . . . . . . . . 3-139
optimize_clause . . . . . . . . . . . . . . . . . 3-140
mmap_index_clause . . . . . . . . . . . . . . . 3-140
table_clause . . . . . . . . . . . . . . . . . . 3-140
simple_field . . . . . . . . . . . . . . . . . . 3-141
xml_path . . . . . . . . . . . . . . . . . . . 3-141
concatenated_field . . . . . . . . . . . . . . . . 3-142
constant_field . . . . . . . . . . . . . . . . . 3-142
sequence_field . . . . . . . . . . . . . . . . . 3-143
increment_field . . . . . . . . . . . . . . . . . 3-143
segment_clause . . . . . . . . . . . . . . . . . 3-143

criteria_clause on non-character column . . . . . . . . 3-144
criteria_clause on character column . . . . . . . . . 3-144
comment_clause . . . . . . . . . . . . . . . . 3-144
field_type . . . . . . . . . . . . . . . . . . 3-145
field_type (continued) . . . . . . . . . . . . . . 3-146
restricted date_spec . . . . . . . . . . . . . . . 3-146

In This Chapter
You use the TMU and a control file that contains a LOAD DATA statement to
load data into a data warehouse.

This chapter provides the information you need to write the LOAD DATA
statements, the field specifications within the LOAD DATA statements, and
the SYNCH statement for offline load operations.

This chapter contains the following sections:

■ The LOAD DATA Operation


■ Procedure for Loading Data
■ Some Preliminary Decisions
■ Writing a LOAD DATA Statement:
❑ Input Clause
❑ Format Clause
❑ Locale Clause
❑ Discard Clause
❑ Row Messages Clause
❑ Optimize Clause
❑ MMAP Index Clause
❑ Table Clause
❑ Segment Clause
❑ Criteria Clause
❑ Comment Clause
■ Field Types
■ Format Masks for Datetime Fields
■ Restricted Datetime Masks for Numeric Fields


■ Writing a SYNCH Statement


■ Format of Input Data
■ Field-Type Conversions
■ LOAD DATA Syntax Summary

The LOAD DATA Operation
Before you can load data, the database and the tables to load must already
exist. You build databases with a utility program (rb_creator on UNIX and
dbcreate on Windows) and define the user tables with SQL CREATE TABLE
statements. The load process automatically builds primary-key indexes for
each table that has a primary key. It also builds existing user-defined indexes.
For information about defining tables and indexes, refer to the Administrator’s
Guide and the SQL Reference Guide.

Inputs and Outputs
Input to the TMU for a data loading operation consists of:

■ A LOAD DATA control file.


■ Tape or disk files containing the input data records.

Output from a data loading operation consists of:

■ Table and index files that are part of the database.


■ Status, error, and warning messages.
■ One or more discard files (in ASCII or EBCDIC format) containing
records or rows that could not be loaded because of format errors,
data-integrity violations, or referential-integrity violations.

The TMU also updates the IBM Red Brick Warehouse system tables that
contain the data format descriptions, table and index files, and other
information that the TMU and the database server need.

The TMU automatically creates indexes on all primary keys, based on table
definitions. It also builds and updates any additional user-created indexes
that exist at the time of the load operation.


If you are running on a 64-bit platform and have explicitly enabled your file
system for large files (files larger than 2 gigabytes), you can load and unload
input and output files larger than 2 gigabytes. You can only load input files
larger than 2 gigabytes in disk format.

If data is loaded into an offline segment, the data-loading operation must be completed with a SYNCH SEGMENT statement, which synchronizes the newly loaded segment with the table and its indexes.

The following TMU features can be useful when loading or copying data:

■ Auto Aggregate mode


The TMU Auto Aggregate mode allows you to automatically and
selectively aggregate new input data with the data already in a table.
■ Precomputed view maintenance feature
The precomputed view maintenance feature automatically updates
aggregate tables whenever detail tables are updated, thus ensuring
that detail and aggregate tables are in sync. Maintenance does not
occur when any detail table segment is offline. For more information,
refer to the IBM Red Brick Vista User’s Guide.
■ Automatic Row Generation feature
The TMU Automatic Row Generation feature is useful for rapid loading of non-standard, or dirty, data. This feature provides the following alternatives to discarding an input row that violates referential integrity:
❑ Generating and adding any new rows needed to preserve referential integrity into the referenced tables and then adding the original input row into the referencing table instead of discarding it.
❑ Loading the new row into the table being loaded after replacing values that violate referential integrity with default values that are already present in specified referenced tables.
❑ Combining these two behaviors on a table-by-table basis for each referenced table, sometimes adding new rows to referenced tables and sometimes loading modified rows into the table being loaded.


Precomputed views defined on the detail dimension tables for which you are generating rows cannot be maintained. If precomputed view maintenance is turned on, such views are marked invalid.
For more information about the Automatic Row Generation feature, see "Maintaining Referential Integrity with Automatic Row Generation" on page 3-16.
■ Copy Management utility
The copy management utility (rb_cm) allows you to move data
between different databases located throughout an enterprise. For
more information about rb_cm, refer to Chapter 7, “Moving Data
with the Copy Management Utility”.

Processing Stages for Loading Data
When data is loaded into a database from an input file, the load process
includes several processing stages. By understanding what activities occur
during each stage, you are better able to avoid bottlenecks and resource
conflicts, thereby reducing the time required to load data.

A load operation consists of the following stages:

■ Input stage
❑ Validates syntax of TMU control statement.
❑ Locks tables and segments.
❑ Reads input records, monitoring progress and status communicated by the error handling and cleanup stage.
❑ Sets up additional processes for conversion and index stages for
PTMU.
■ Conversion stage
❑ Converts input records to internal row format and validates
data.
❑ Checks referential integrity (if Automatic Row Generation is off).


■ Main output stage


❑ Checks referential integrity (if Automatic Row Generation is on).
❑ Writes row data to table.
❑ Writes entries to unique indexes and verifies uniqueness.
■ Additional index stage for nonunique indexes
❑ Writes entries to nonunique indexes.
■ Error handling and cleanup stage
❑ Handles error processing.
❑ Finishes building indexes.
❑ Communicates progress and status back to input stage.

The TMU uses a single process that controls all stages. It processes small
batches of rows, passing one batch through each stage before starting the next
batch.

The PTMU with its parallel processing capability improves performance in two ways:

■ It uses separate processes for each stage, creating a pipeline in which batches of rows are passed from one stage to the next, with multiple batches of rows being processed simultaneously.
Even on systems with a single CPU, multiple processes can take
advantage of I/O and CPU overlap, which might result in reduced
load time.
■ On systems with multiple CPUs, the PTMU further improves the
pipeline throughput by creating additional conversion and output
processes. (The user can define the number of additional processes.)


Figure 3-1 illustrates the sequence of stages in a LOAD operation and the
additional parallelism that the multiple processes of the PTMU provide. (The
TMU uses a single process for all stages.)

Figure 3-1: PTMU Multiple Processes

[Diagram: the LOAD DATA processing sequence. The input stage (control file, system tables, input records) feeds the conversion stage (PK indexes), the main output stage (data and unique index segments), the index stage (nonunique index segments), and the error handling and cleanup stage (system tables). The PTMU runs additional processes for the conversion, main output, and index stages; status and control flow are fed back to the input process.]

Input Stage
During the input stage, the PTMU checks the syntax of the LOAD DATA
statement and locks the table or segment for exclusive use. You can specify
whether you want the TMU to wait for a lock or to return immediately if the
table is in use.


Conversion Stage
During the conversion stage, the PTMU performs any necessary data conversion on each record, including conversion between code sets (for example, EBCDIC to ASCII or MS932 to EUC), conversion from the external code set to internal (binary) format, and decimal scaling. In this stage, the PTMU checks referential integrity (if Automatic Row Generation is off), and the data is validated by comparing it with the column data type and checking for truncation, underflow, and overflow. Because the PTMU uses multiple conversion processes, it improves conversion performance significantly.

Main Output and Index Stages
During the main output stage, the PTMU writes data to the table and makes
entries in all unique indexes. In addition, if Automatic Row Generation is on,
the PTMU performs referential integrity checks during this stage, and inserts
into referenced tables any automatically-generated rows. In this stage, the
PTMU uses a single process to make all entries into each unique index.

During the index stage, PTMU makes entries into any nonunique indexes. The
PTMU, by default, uses one index process per nonunique index, thereby
speeding up this part of the load operation.

Error Handling and Cleanup Stage
The error handling and cleanup stage performs the error handling, which
includes keeping track of rows loaded in case of interrupts, and cleans up
after the processing is complete. It also monitors progress through the
pipeline, providing feedback to the input stage to control the flow of records
being processed. The PTMU uses a single process for this stage.


Procedure for Loading Data
To load data into the tables of a warehouse database:

1. Determine the order in which to load the tables (page 3-14).

2. Determine whether to load ordered or unordered data, and order the data if desired (page 3-15).

3. Determine the following information about your input data. The description of the input data is supplied in the Input clause, as described on page 3-29; additional information about file and record formats is provided on page 3-122, and about data type conversions from the input data to the server data types on page 3-133.
   ■ Source of your input data (disk, TAR or standard-label tape, or standard input)
   ■ Record length (fixed or variable)
   ■ Record format (fixed, variable, separated, or XML)
   ■ Record field order and type
   ■ Mapping between input fields and table columns
   ■ Code set (ASCII, EBCDIC, or XML encoding)

4. Determine the load mode to use: APPEND, INSERT, MODIFY, REPLACE, or UPDATE. If you use MODIFY or UPDATE mode, determine whether to use the AGGREGATE mode. The load mode is specified in the Format clause, as described on page 3-29.

5. Determine whether to use Automatic Row Generation to ensure that no rows are discarded for referential-integrity violations, for some or all of the referenced tables. Also decide whether to use separate discard files for records discarded for data-integrity and referential-integrity violations. See "Maintaining Referential Integrity with Automatic Row Generation" on page 3-16; discard file choices are specified in the Discard clause, as described on page 3-43.

6. Determine whether to automatically maintain precomputed views defined on the table being loaded. For more information, see the IBM Red Brick Vista User's Guide.

7. Determine how you want to display row-level warning messages. Row message file choices are specified in the Row Messages clause, as described on page 3-57.

8. Determine whether to use Optimize mode to load the data. Optimize mode is selected in the Optimize clause, as described on page 3-59.

9. Write the LOAD DATA control statements, one per table, in a file. A single file can contain multiple control statements of different types. See "TMU Control Files and Statements" on page 1-8.

10. If you are loading data into a segment of a table, determine whether you want to use an offline load operation. If you do, take the segment offline with an ALTER SEGMENT operation.

11. Run the TMU with a control file containing the LOAD DATA statements.

12. If you loaded data into an offline segment, synchronize the segment by running the TMU with a SYNCH statement, and then bring the segment online with an ALTER SEGMENT operation. The SYNCH statement is described on page 3-119.

13. To check the status of a load operation on a specific table, query the RBW_LOADINFO table. For information on the RBW_LOADINFO table, see the Administrator's Guide.


Some Preliminary Decisions
Before you write the LOAD DATA statements to load data into the tables of
your database, make the following decisions, which affect how you write the
statements:

■ In what order to load the tables.


■ Whether the input data is or should be ordered.
■ Whether to use automatic row generation to maintain referential
integrity.

Each of these topics is discussed in the following sections.

Determining Table Order
The TMU loads tables in the order of the LOAD DATA statements in the control
file. Each LOAD DATA statement corresponds to one table. To control the
order in which tables are loaded, place the LOAD DATA statements in the file
in the order in which you want the tables to load.

You can load tables in any order as long as any table referenced by a foreign
key is loaded before the table containing that foreign key. That is, a referenced
table must be loaded before the table that references it. For example, if the
Sales table, the referencing table, contains three foreign keys, each of the
three tables referenced by the foreign keys must be loaded before the Sales
table can be loaded.
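For example, if the three tables referenced by the Sales table are Product, Store, and Period, a control file would order its LOAD DATA statements as follows (the file names are hypothetical and the field lists are elided):

load data inputfile 'product.txt' insert into table product … ;
load data inputfile 'store.txt' insert into table store … ;
load data inputfile 'period.txt' insert into table period … ;
load data inputfile 'sales.txt' insert into table sales … ;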


Ordering Input Data
You must decide whether to order the data in the input files, balancing any
improvement in load time against the amount of time available to load data,
the time spent ordering input data, and the difficulty of maintaining ordered
data.

The initial load of input data into a table is somewhat faster with ordered
data. However, for incremental loads of data into indexed tables, the
optimized load mode makes data-order issues unimportant. With more than
one STAR index, a combination of primary key and STAR indexes, or refer-
ences to multicolumn primary keys, it is usually not useful to attempt to
order data. IBM suggests that for OPTIMIZE OFF loads, the incoming data
should be sorted in the key order of the leading columns of the primary STAR
index.

To order the data for an initial load, order the data for each referenced table
by the primary-key values. If you have a single STAR index on the referencing
table, you can order the data in the key order of the STAR index definition,
which can result in a more efficient index. To order the input data based on a
single STAR index on the referencing table, order it so that the data in the
foreign-key columns named first in the CREATE STAR INDEX statement is the
slowest to change. Data in the foreign-key columns named next changes
more slowly, and so on. The order of data in each foreign-key column must
match the order of data in the corresponding primary key in the referenced
tables. If you use multiple STAR indexes, then the difficulty of choosing an
order for the input data increases and the benefits are reduced.

In a single column in the key, data can be in any arbitrary order, provided that
the order in that column is the same as the order in the corresponding
primary key in the referenced table. For example, if the input data for the
foreign-key column of the referencing table is in descending collation order,
the input data for the corresponding column in the referenced table must also
be in descending collation order.


Maintaining Referential Integrity with Automatic Row Generation
The Automatic Row Generation feature (AUTOROWGEN) allows the TMU to
add any rows needed to preserve referential integrity. If this feature is OFF,
the TMU discards records that violate referential integrity. However, this
behavior can be both time-consuming and frustrating in situations where the
data being loaded is dirty, unfamiliar, or incomplete. This feature offers the
following alternatives, in addition to discarding rows, to maintain referential
integrity:

■ Generating and adding new rows to the referenced tables (ON mode).
■ Modifying the row to add to the table being loaded (the target table) by using a column default value for one or more of the foreign keys (DEFAULT mode).
■ Combining these actions within a single load operation with a mixed-mode operation.

This flexibility allows you to choose how you want to maintain referential
integrity on a table-by-table basis within a single load operation. Tables are
locked automatically, for either read or write access as needed, at the
beginning of the load operation.

You can set the AUTOROWGEN feature for ON and OFF mode in the
rbw.config file or in the Discard clause of a LOAD DATA statement. However,
you can set it for DEFAULT or mixed-mode operation only in the Discard
clause.

Precomputed views defined on the detail dimension tables for which you are
generating rows cannot be maintained. If aggregate maintenance is turned
on, such views are marked invalid.

Discarding Records That Violate Referential Integrity
If the AUTOROWGEN feature is OFF (the default behavior), all records that
violate referential integrity are discarded or written either to the standard
discard file or to files designated for referential-integrity violations. (You can
designate separate files for violations of each referenced table.)

 7DEOH0DQDJHPHQW8WLOLW\5HIHUHQFH*XLGH
0DLQWDLQLQJ5HIHUHQWLDO,QWHJULW\ZLWK$XWRPDWLF5RZ *HQHUDWLRQ

Adding Generated Rows to Referenced Tables
If the AUTOROWGEN feature is ON, whenever an input row in the table being
loaded contains a value in a foreign-key column that is not present in the
primary-key column of the referenced table, a row is generated and added to
the referenced table before the input row is added to the target table. This
behavior cascades through any outboard tables that are in turn referenced by
the referenced table.

In this mode, the referenced tables grow as rows are inserted into them. If a
table grows beyond its MAXROWS PER SEGMENT value, a REORG operation
might be required on STAR indexes built on these foreign-key columns.

The generated rows get their values from default values defined for each
column when the table was created.

Example: AUTOROWGEN ON

Figure 3-2 illustrates the AUTOROWGEN ON feature. Assume you are the
database administrator for the following database. (Bold text indicates
primary keys. Bold italic indicates foreign keys.)
Figure 3-2: AUTOROWGEN ON Feature Adds Rows to Referenced Table

[Schema diagram. Primary keys are shown in bold, foreign keys in bold italic:
Sales (perkey, classkey, prodkey, storekey, promokey, quantity, dollars)
Period (perkey, date, day, week, month, qtr, year)
Class (classkey, class_type, class_desc)
Product (classkey, prodkey, prod_name, pkg_type)
Store (storekey, mktkey, store_type, store_name, street, city, state, zip)
Market (mktkey, hq_city, hq_state, district, region)
Promotion (promokey, promo_type, promo_desc, value, start_date, end_date)]

/RDGLQJ'DWDLQWRD:DUHKRXVH'DWDEDVH 
0DLQWDLQLQJ5HIHUHQWLDO,QWHJULW\ZLWK$XWRPDWLF5RZ *HQHUDWLRQ

The Sales table contains daily total sales for products sold in a chain of retail
stores. Because all managers have authority to order goods for their own
stores, frequently when you load the daily sales data, new products appear
for which no supporting entries exist in the Product table. Your operations
run more smoothly when you can complete the nightly load and complete
the entries for these new items the next day.

The Product table is defined as follows:


create table product (
classkey integer not null,
prodkey integer not null,
prod_name char(30) default 'new product',
pkg_type char(20),
constraint prod_pkc primary key (classkey, prodkey),
…;

Each manager has a range of Prodkey values to assign to new products in the
defined classes.

The LOAD DATA statement for the daily load operation on the Sales table sets
the Automatic Row Generation feature to ON.
load data
inputfile 'sales.txt'
recordlen 86
insert
discardfile 'sales_disc.txt'
autorowgen on

When a record containing the sales dollars for a brand new product is
encountered during the load process, the TMU inserts the record in the Sales
table and adds a row containing the new Prodkey value into the Product
table, filling in that row with any specified default values or NULL.

Assume the following record is encountered in the load process.

…:01/96/d22:7:789:78:…:236:…

In this record, 01/96/d22 is the Perkey value, 7 is the Classkey value, 789 is the new Prodkey value, 78 is the Storekey value, and 236 is the Dollars value.

 7DEOH0DQDJHPHQW8WLOLW\5HIHUHQFH*XLGH
0DLQWDLQLQJ5HIHUHQWLDO,QWHJULW\ZLWK$XWRPDWLF5RZ *HQHUDWLRQ

The value 789, which was assigned by a store manager who added a new
item at his store, does not appear in the primary key of the Product table. The
AUTOROWGEN feature allows the TMU to insert the previous information
into the Sales table after adding the following row to the Product table.

classkey    prodkey    prod_name      pkg_type

7           789        new product    NULL

The task of replacing the default values with real values remains, but now the data is loaded and analysis can proceed. To find any new products that were added, use a SELECT statement of the form:
select prodkey, prod_name from product
where prod_name = 'new product';

You now need to find the missing information and update the Product table
entry.
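For example, once the real product information is known, an UPDATE statement of the following form completes the entry (the values shown are hypothetical):

update product
set prod_name = 'Espresso Roast', pkg_type = 'Bag'
where classkey = 7 and prodkey = 789;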

Modifying the Input Rows
If the Discard clause specifies AUTOROWGEN DEFAULT mode for a list of
referenced tables, when an input row contains a value in a foreign-key
column that is not present in the primary-key column of a referenced table in
the list, the row is first modified by replacing the missing value with the
default value for the foreign-key column. The row is then added to the target
table. In this mode, referenced tables in the list do not grow. This mode is
useful for data that contains unknown values in foreign-key columns that are
not of critical importance to the application. It is also useful in cases where
you do not want a referenced table to grow to exceed the MAXROWS PER
SEGMENT value.

/RDGLQJ'DWDLQWRD:DUHKRXVH'DWDEDVH 
0DLQWDLQLQJ5HIHUHQWLDO,QWHJULW\ZLWK$XWRPDWLF5RZ *HQHUDWLRQ

Example: AUTOROWGEN DEFAULT

The following example shows the AUTOROWGEN DEFAULT mode, in which referential integrity is preserved by modifying the input rows before they are loaded. It is based on the database in the previous example on page 3-17. Assume the Sales table is defined as follows, with a default value assigned to the Prodkey column:
create table sales (
perkey integer not null,
classkey integer not null,
prodkey integer not null default 0,
storekey integer not null,
promokey integer not null,
quantity integer,
dollars dec(7,2),
constraint sales_pkc primary key (perkey, classkey,
prodkey, storekey, promokey),
… ;

Assume the load operation occurs on the Sales table with AUTOROWGEN
DEFAULT mode specified for the Product table. For all other tables that the
Sales table references, the default behavior is OFF mode.
load data
inputfile 'sales.txt'
recordlen 86
insert
discardfile 'sales_disc.txt'
autorowgen default (product)

The following records, which contain Prodkey values not present in the
Product table, are encountered in the load process.

…:01/96/d22:7:789:78:45:36:236.56:…
…:01/96/d22:7:790:78:46:42:168.72:…
…:01/96/d22:7:791:78:46:143:937.25:…

In each record, the fields are, in order: Perkey (01/96/d22), Classkey (7), the new Prodkey value (789, 790, or 791), Storekey (78), Promokey (45 or 46), Quantity, and Dollars.

 7DEOH0DQDJHPHQW8WLOLW\5HIHUHQFH*XLGH
0DLQWDLQLQJ5HIHUHQWLDO,QWHJULW\ZLWK$XWRPDWLF5RZ *HQHUDWLRQ

No changes are made to the Product table, but the first two records are added
to the Sales table as the following table shows.

perkey        classkey    prodkey    storekey    promokey    quantity    dollars

1996-01-22    7           0          78          45          36          236.56
1996-01-22    7           0          78          46          42          168.72

The third record is discarded because its primary-key value is identical to that of the previous record. Any records that violate referential integrity with respect to any referenced tables other than the Product table are discarded.

Adding Rows in Mixed Mode
The AUTOROWGEN feature also allows you to combine the ON and DEFAULT
behaviors in a mixed-mode operation. To combine behaviors, you must use
the Discard Clause of a LOAD DATA statement to specify on a table-by-table
basis whether rows should be added to the referenced or referencing table.

Important: Potential conflicts might arise in mixed-mode operation when a referenced table appears in a DEFAULT mode table list and it is also referenced by another table that appears in an ON mode table list. For an example of this behavior, refer to "Usage" on page 3-54.

Example: AUTOROWGEN Mixed Mode

The following example shows the AUTOROWGEN mixed-mode operation. It is based on the database in the previous example on page 3-17. Assume the LOAD DATA statement for the Sales table contains the following Discard clause:

discardfile 'sales_dscd' discards 100
autorowgen on (store, promotion) default (product)


As records are loaded into the Sales table, rows are also added to the Store
and Promotion tables as needed to maintain referential integrity. Rows are
also added to the Market table if necessary, because it is an outboard table
that the Store table references. However, if a record to be loaded contains a
foreign-key value not found in the Prodkey column of the Product table, the
record is added to the Sales table by using the default value (0) for the
Prodkey column of the Sales table. If a record to be loaded into the Sales
table contains a Perkey value not found in the Perkey column of the Period
table, it is discarded because the Period table is not in either table list and
hence is controlled by AUTOROWGEN OFF mode.

Specifying the AUTOROWGEN Mode
The default behavior for a data warehouse is controlled as a system default
with an entry in the rbw.config file.

OPTION AUTOROWGEN OFF|ON

The AUTOROWGEN feature also can be set for a specific load operation in the
Discard clause of a LOAD DATA statement, as described in “Discard Clause”
on page 3-43. Setting this feature in the Discard clause provides more
flexibility because you can specify the behavior (ON, OFF, DEFAULT) for each
table referenced by the table being loaded.
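For example, to make ON mode the system-wide default, add the following entry to the rbw.config file:

OPTION AUTOROWGEN ON

An AUTOROWGEN setting in the Discard clause of an individual LOAD DATA statement then applies only to that load operation.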


Writing a LOAD DATA Statement
The LOAD DATA statement for the TMU specifies, in this order:

1. The files that contain your input data, which can be tape files, disk files, or standard input.
2. The format of the input data (format specification).
3. The locale of the input data, if different from the database locale.
4. Optional discard instructions, which can include filenames and formats for discarded records and the Automatic Row Generation feature.
5. Optional row messages instructions, specifying a filename in which to view messages and warnings.
6. Optional optimization instructions that specify whether to build indexes in optimize mode and a discard file for discarded records.
7. The table into which the data is loaded and a map of data fields into table columns. Alternatively, you can specify an offline segment of a table.
8. Optional criteria that determine which input records should be loaded and which should be discarded.
9. Optional comment text that allows you to store information about a load operation or the data loaded.

Each LOAD DATA statement loads only one table. It can load data into all of
the columns in a table or into a subset of the columns. The names of the
columns to load are specified together with a description of the source data,
which is called a field specification. A field specification contains information
about the data type of the data field in the input record if the data is in an
input file, or it contains information about the automatically generated input
data if the TMU must produce the data.

A control file can contain multiple LOAD DATA statements for multiple
tables, which are processed sequentially.


LOAD DATA Syntax

LOAD DATA input_clause (p. 3-25)
    [format_clause (p. 3-30)] [locale_clause (p. 3-39)]
    [discard_clause (p. 3-45)] [row_messages_clause (p. 3-58)]
    [optimize_clause (p. 3-59)] [mmap_index_clause (p. 3-63)]
    {table_clause (p. 3-65) | segment_clause (p. 3-88)}
    [criteria_clause (p. 3-90)] [comment_clause (p. 3-95)] ;

The clauses shown in this syntax diagram are described in detail in the
following sections. For convenient reference, a syntax summary for the LOAD
DATA statement is shown at the end of this chapter.


Input Clause
The TMU accepts input from tape drives, disk drives, and system standard
input. The Input clause specifies the file or files containing the input data, the
input device (for tape drives), and record numbers (for partial loads). In a
single LOAD DATA statement, files must be all tape files, all disk files, or all
standard input.

input_clause (back to LOAD DATA, p. 3-24):

{INPUTFILE | INDDN} {'filename' | ('filename' [, 'filename'…])}
    [TAPE DEVICE 'device_name']
    [START RECORD start_row] [STOP RECORD stop_row]

INPUTFILE or INDDN 'filename'    File that contains the input data to load into the table. The filename must satisfy operating-system conventions for file specification. The name must be enclosed in single quotation marks. If you use multiple input files, you must enclose the list of filenames in parentheses and separate the names with commas.

If input is standard input, the filename reference is '-':
INPUTFILE '-'
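For example, a list of disk input files is specified as follows (the file names are hypothetical):
INPUTFILE ('sales1.txt', 'sales2.txt', 'sales3.txt')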

Important: If the LOAD DATA statement appears in a control file for the rb_cm utility, INPUTFILE must be set to standard input.


A filename can contain environment variables:

UNIX: If a dollar sign ($) character is part of the filename, it must be preceded by a slash (/) character so it is not confused with an environment variable:
INPUTFILE '$INPUT/file.1'

Windows: If a percent sign (%) character is part of the filename, it must be preceded by a backslash (\) character so it is not confused with an environment variable:
INPUTFILE '%INPUT%\file.1'

For standard-label tapes, the filename in the LOAD DATA statement can be uppercase or lowercase. However, the filename on the tape must be uppercase. Standard-label tapes support filenames of up to 17 characters. If the specified filename is longer than 17 characters, the TMU uses the first 17 characters.

For TAR tapes, filenames are case-sensitive and the case of the filename in the LOAD DATA statement must match the filename on the tape.


UNIX, Windows NT: TAPE DEVICE 'device_name'    Tape device. It must be a rewind tape device. Use this clause if the input files are on one or more tapes.

Tape format can be either TAR or ANSI-standard label tapes. The tape can be of fixed- or variable-record length, but not segmented records. The TMU checks to see if the tape conforms to the ANSI tape format. If so, it treats the tape as an ANSI-labeled tape. If not, the TMU treats the tape as a TAR tape. For more information about tape formats, refer to "Format of Input Data" on page 3-122.

Each name in the filename list can be the name of a single file on a multifile standard-label tape. However, the TMU does not support multiple TAR archive files on a tape. It reads only the first file.

You must enclose the tape device name in single quotation marks. You can specify the name as a literal or as an environment variable.

If a tape contains a file named data_file.1 and the tape is mounted on a tape device named /dev/rmt0, then the Input clause is:
INPUTFILE 'data_file.1' TAPE DEVICE '/dev/rmt0'

If the tape device is defined as an environment variable named TAPE, the Input clause is:
INPUTFILE 'data_file.1' TAPE DEVICE '$TAPE'

Important: The TAPE DEVICE parameter is not valid for a LOAD DATA statement appearing in a control file for the rb_cm utility. Load input must come from standard input when the LOAD DATA statement is in a control file that the rb_cm utility uses.


START RECORD, STOP RECORD    Specifies which records in the input file mark the beginning and end of loading. If the START RECORD keywords are specified, loading begins at the specified record. Earlier records are read and counted, but their contents are ignored. If the STOP RECORD keywords are specified, loading stops after the specified number of records.

Default values are START RECORD 1 and STOP RECORD end-of-all-files.

The START RECORD and STOP RECORD clauses are useful in the following
circumstances:

■ A load operation ends prematurely (for example, because a discard limit specified with a Discard clause was reached) and you want to resume loading after the last record loaded. Use START RECORD to specify the next record to load. The TMU issues messages that provide row numbers to use with START RECORD.
■ An input file is too large. Use START RECORD and STOP RECORD to
split the records into two load operations.
■ You are unsure about the format or content of the input data and
want to test the load script. Use STOP RECORD to stop the load
operation after a few rows.

The TMU follows these rules when counting rows:

■ It counts only the rows it sees. For example, if tape 1 is not used and
the load starts with tape 2, then the first record is the first one on
tape 2 (the first tape used).
■ The number of rows counted is not reset between files or tapes. The
number keeps incrementing until the end of the current LOAD DATA
statement.
■ If START RECORD is specified on data fields defined as a SEQUENCE
field type, the sequence value is incremented for each row skipped.
For example, if you specify START RECORD 10 and are loading a
column using SEQUENCE (2,2), then the first row loaded is input row
10 with sequence value 20.


In the following example the input file, market.txt, has fixed-format records.
The fields in market.txt start and end at the same position in each record.
They are not separated by characters. The input file is located on a disk
device, which is the default. The TMU starts counting records from the
beginning of the file, but starts loading at record 100 and stops loading at
record 200.

load data
inputfile 'market.txt'
start record 100
stop record 200
recordlen 7 replace
nls_locale 'English_Canada.MS1252@Default'
discardfile 'mktdisc.txt'
discards 1
optimize on discardfile 'mktdups.txt'
into table market(
mktkey integer external (4),
state char (2)
);

Format Clause
The Format clause is optional and specifies the format details of the input
data:

■ Record length: fixed or variable


■ Mode: append, insert, modify, replace, or update
■ Logical data format: fixed, separated, variable, XML, or unloaded
from the same or another database
■ Code set: ASCII or EBCDIC

The absence of a Format clause indicates that records are in the ASCII code set,
and that their length is determined by the field lengths specified in the field
specifications in the Table clause.

For more information about valid format combinations, refer to "Format of Input Data" on page 3-122.


format_clause (back to LOAD DATA, p. 3-24):

[RECORDLEN n | FIXEDLEN n] [INTRA RECORD SKIP n]
[APPEND | INSERT | REPLACE | MODIFY [AGGREGATE] | UPDATE [AGGREGATE]]
[FORMAT IBM | FORMAT SEPARATED BY 'c' | FORMAT IBM SEPARATED BY 'c'
    | FORMAT UNLOAD | FORMAT VARIABLE | FORMAT IBM VARIABLE
    | FORMAT XML | FORMAT XML_DISCARD]

RECORDLEN n    Number of bytes in each record. Signifies fixed-length records without any newline character (or other record separator). If files contain binary data, this value is required. If files contain only character and external numeric data (for example, ASCII characters and numbers), you can omit this value, signifying newline-character separated data records. However, if it is included, n must include the newline character.

RECORDLEN is not allowed with FORMAT UNLOAD and FORMAT XML.


FIXEDLEN n    For VARIABLE format, indicates the length of the fixed part. If you do not specify FIXEDLEN n, the TMU reads the input line by line.

FIXEDLEN is not allowed with FORMAT XML.

INTRA RECORD SKIP n    Valid only for variable format. Indicates that the read process should skip n bytes after finishing reading the previous record. This option is provided for skipping newline characters between input records.

APPEND    Load mode used to insert additional rows of data into an existing table. Each new row must have a primary-key value that does not already exist in the table. Otherwise, the record is discarded. INSERT privilege on the table is required.

INSERT    Default mode. Load mode used to load data into an empty table. If the table is not empty, the load operation ends. The table requires the INSERT privilege.

REPLACE    Load mode used to replace the entire contents of a table. The table requires the INSERT and DELETE privileges.

Tip: In REPLACE mode, the existing contents of a table are destroyed. Use this mode carefully.

MODIFY    Load mode used to insert additional rows or to update existing rows in a table. If the input row has the same primary-key value as an existing row, the new row replaces the existing row. Otherwise, it is added as a new row. You can update selected columns with the DEFAULT and RETAIN keywords, as described on page 3-69. The table requires the INSERT and UPDATE privileges.

In MODIFY mode, the primary-key columns must be present in the LOAD DATA statement.


UPDATE    Load mode used to update existing rows in an existing table. Each new row must have a primary-key value that is already present in the table. Otherwise, the record is discarded. You can update selected columns with the DEFAULT and RETAIN keywords, as described on page 3-69. The table requires the UPDATE privilege.

In UPDATE mode, the primary-key columns must be present in the LOAD DATA statement.
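A minimal sketch of an UPDATE-mode load follows (the input file name is hypothetical; note that the primary-key columns classkey and prodkey of the Product table from this chapter appear in the field list):

load data
inputfile 'pkg_changes.txt'
update
into table product (
mktkey integer external (4),
prodkey integer external (4),
pkg_type char (20)
);

(Here mktkey should read classkey for the Product table; that is:)

load data
inputfile 'pkg_changes.txt'
update
into table product (
classkey integer external (4),
prodkey integer external (4),
pkg_type char (20)
);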
AGGREGATE    Indicates that aggregate operators are used on a column-by-column basis in MODIFY or UPDATE modes. In these modes, you can use aggregate operators on simple fields (as described on page 3-71) and on increment fields (as described on page 3-86). The aggregate operators available are ADD, SUBTRACT, MIN, MAX, and INCREMENT.

In MODIFY AGGREGATE mode, if the primary key of the input row matches an existing row in the table, the existing row is updated as defined for the specified aggregate operator. If the primary key of the input row does not match an existing row in the table, the row is inserted. In this case, if the aggregate operator is ADD, SUBTRACT, MIN, or MAX, which require a value from a matching row in the table (there is no such row in this case), the input value is inserted. If the aggregate operator is INCREMENT, the increment value is inserted.

In UPDATE AGGREGATE mode, if the primary key of the input row does not match the primary key of a row already in the table, the input row is discarded. If it does match an existing row, the existing row is updated as defined for the specified aggregate operator.

For a detailed example of aggregate operation, refer to Appendix A, "Example: Using the TMU in AGGREGATE Mode."

You cannot use aggregate operators on primary-key columns.
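As a rough sketch only (the file name is hypothetical, the field types are elided, and the exact placement of the operator within each simple-field specification is given on page 3-71), a MODIFY AGGREGATE load that adds daily quantities and dollar amounts into the existing Sales table might look like this:

load data
inputfile 'sales_day.txt'
modify aggregate
into table sales (
perkey … ,
classkey … ,
prodkey … ,
storekey … ,
promokey … ,
quantity … add,
dollars … add
);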


No FORMAT keyword    Indicates that all records are fixed length, as defined with RECORDLEN n. If RECORDLEN is not defined, each record is read until a newline character is encountered. Binary data is not permitted. For more information on fixed-format records, see "Fixed-Format Records" on page 3-123.

FORMAT IBM    Specifies that data is in the EBCDIC code set. CHARACTER and EXTERNAL fields are converted from EBCDIC to ASCII, and integer fields are converted to the byte order of the computer that is running IBM Red Brick Warehouse. For details about specific EBCDIC to ASCII conversions, contact IBM Customer Support.

A tape must have an ANSI-standard label (in EBCDIC). The TMU accepts multiple-file tapes and multiple-reel files. You must specify input files in the order in which they appear on the tape.

Specification of IBM format is typically used when the input file was prepared on an IBM mainframe system.

FORMAT SEPARATED BY 'c'
                Specifies that fields in a data record are separated by the
                character c, which must be a single-character literal and
                must be different from the radix (decimal) point character.
                For example, data separated by an exclamation mark (!) is
                specified by:

                format separated by '!'

                Data separated by a tab character is specified by:

                format separated by 'tab'

                where tab represents the actual tab keystroke, which
                might appear as a blank space: ' '.

Important: The separator character (SEPARATED BY 'c') must be specified by using
the database-locale code set, but it can be either a single-byte or multibyte character.
If the character used as a separator in the input data cannot be expressed as a character
in the database locale, then the input data cannot be interpreted correctly.


FORMAT IBM SEPARATED BY 'c'
                Specifies that data is in the EBCDIC code set and that fields
                in a data record are separated by the character c, which
                must be a single-character literal. It must be different from
                the radix (decimal) point character.

FORMAT UNLOAD   Specifies that the data to load was unloaded in internal
                format from a database using a TMU UNLOAD statement.
                You cannot use this format choice to load data that was
                unloaded in external format. For more information about
                the UNLOAD statement, refer to Chapter 4, "Unloading
                Data from a Table."

                The RECORDLEN keyword and field specifications are not
                allowed when FORMAT UNLOAD is used.

FORMAT VARIABLE, FORMAT IBM VARIABLE
                Use only if at least one VARLEN data field type is present
                in subsequent simple fields. (The IBM keyword means that
                the input is in EBCDIC.)

                In VARIABLE format, the input record consists of a fixed-
                length part and a variable-length part. Every input record
                has the same length for the fixed part as other records. Use
                FIXEDLEN to indicate the length of the fixed part. The
                variable length can differ for different records.

FORMAT XML      Specifies that the input file is an Extensible Markup
                Language (XML) document. For information about how the
                TMU parses XML data (using the Xerces-C++ parser), see
                "Simple Fields" on page 3-71 and "Format of Input Data"
                on page 3-122.

FORMAT XML_DISCARD
                Specifies that the input file is the discard file for a previous
                load operation in XML format. This option allows you to
                reload the discarded rows; see page 3-43.


EBCDIC to ASCII Conversion
If you are using the FORMAT IBM option to load data, note the following
restrictions:

■ Only single-byte EBCDIC code sets are supported.


■ Conversion to ASCII is limited to EBCDIC characters within the
syntactic code set (CS 640). These characters are identified in the
following table.
■ In addition to CS 640, the TMU converts the EBCDIC exclamation
point (!) based on its definition in IBMUSCanada code set 037
(Hex 5A, Dec 90).
■ All other code points not addressed in this section have unsupported
TMU mappings to ASCII. Attempting to map such characters yields
unpredictable results.


IBM Syntactic Code Set (CS 640)

ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz
0123456789

  (space)                      (  (left parenthesis)
. (period)                     )  (right parenthesis)
" (double quotation mark)      =  (equal)
/ (slash)                      >  (greater than)
% (percent)                    *  (asterisk)
: (colon)                      ?  (question mark)
& (ampersand)                  +  (plus)
; (semicolon)                  _  (underscore)
' (apostrophe)                 ,  (comma)
< (less than)                  -  (hyphen)

Two Approaches to Loading EBCDIC Data
The two ways to load EBCDIC data with the TMU are:

■ Use the FORMAT IBM keywords in the Format clause.


■ Specify an IBM code set in the NLS_LOCALE clause.

The FORMAT IBM and NLS_LOCALE specifications are mutually exclusive.
You can use only one of these specifications in each TMU control file.


If you are certain that you are loading only characters that comply with
CS 640, you can use either the FORMAT IBM or the NLS_LOCALE specification,
but the FORMAT IBM approach yields higher performance. However, if you
are unsure whether your input data complies with CS 640, use the
NLS_LOCALE clause to select an EBCDIC code set that is fully compatible with
the specified language. Although load performance might not be optimal,
this approach ensures the integrity of both the loaded data and database
objects (such as indexes) that are built based on that data.
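As a sketch, the two approaches might be written as follows. The input file,
table, and field widths are hypothetical, and the IBM037 code-set name in the
second form is an assumption; check the locales.pdf file for the EBCDIC code
sets your installation actually supports.

load data
inputfile 'orders_ebcdic.dat'
recordlen 80
insert format ibm
into table orders (
order_no integer external (6),
order_type char (20)
);

load data
inputfile 'orders_ebcdic.dat'
recordlen 80
insert
nls_locale 'English_UnitedStates.IBM037@Binary'
into table orders (
order_no integer external (6),
order_type char (20)
);

The first form is faster but assumes the data complies with CS 640; the second
form converts through the specified locale and is the safer choice for
uncertain data.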

Examples: Format Clause
The following example shows the use of the Format clause. The Market table
contains existing data that is modified by the records in the input file,
market.txt. The keyword MODIFY specifies that if a record in market.txt has
the same primary key as a row in the Market table, the record in the file
replaces the row in the table. If a row does not yet exist in the table, the TMU
adds a new row.

Assume the market.txt file has fixed-format records. No separator character
occurs between fields and all of the records are the same length. RECORDLEN
specifies the number of bytes the TMU reads for each record. RECORDLEN is
calculated by summing the number of bytes in the Mktkey and State fields
and adding one more byte for the newline character. The TMU loads a new
record every seven bytes.

load data
inputfile 'market.txt'
recordlen 7 modify
nls_locale 'English_Canada.MS1252@Default'
discardfile 'mktdisc.txt'
discards 1
optimize on discardfile 'mktdups.txt'
into table market(
mktkey integer external (4),
state char (2)
);


The following example shows the use of a Format clause to load data in
UNLOAD format. No format specifications or field specifications apply other
than the keyword UNLOAD.

load data
inputfile 'market.txt'
insert format unload
discardfile 'mktdisc.txt'
discards 1
optimize on discardfile 'mktdups.txt'
into table market;

The following example shows the use of a Format clause to load data of
variable format.

load data
inputfile 'market.txt'
fixedlen 6 intra rec skip 1
insert format variable
(mktkey integer external (4),
state varlen external (2)
);

Locale Clause
The unique combination of a language and a location is known as a locale. A
locale specification consists of four components: language, territory, code set,
and collation order. The default locale for IBM Red Brick Warehouse is:
English_UnitedStates.US-ASCII@binary

where

■ English = language
■ UnitedStates = territory
■ US-ASCII = code set
■ binary = collation order


Examples of other locales are:


Japanese_Japan.MS932@Binary
German_Austria.Latin1@Default
French_France.Latin1@Default
Spanish_Mexico.Latin1@Spanish

Important: Any collation order value other than binary implies a linguistic sort. The
Default collation order follows the sort definition specified by CAN/CSA Z243.4.1
(Canadian Order), which covers English and several Western European languages.

For more information about locales, refer to the Administrator’s Guide and to
the locales.pdf file on your installation CD.

Although the TMU uses the database locale for most of its processing, you can
specify a different locale for a TMU input file. In this way, the TMU can
automatically convert data from one code set to another as it is loaded into a
database table. The locale of the input file, if different from the database
locale, is specified with the NLS_LOCALE keyword in the Locale clause of the
LOAD DATA statement. If the Locale clause is omitted, the input locale is
assumed to be the same as the database locale.

Important: The Locale clause refers only to the contents of the input file itself. All
information specified in TMU control files must be specified in the database locale.
Specifically, this means that separator, radix, and escape characters must be specified
with the database-locale code set. If the character used as a separator or radix point in
the input data cannot be expressed as a character in the database locale, then the input
data cannot be interpreted correctly.

locale_clause                                      Back to LOAD DATA, p. 3-24

    NLS_LOCALE 'language_territory.codeset@sort'
  | XML_ENCODING


NLS_LOCALE            Locale of the input data when the code set of the
                      input data differs from that of the database locale.
                      The locale specification also determines which
                      character is interpreted as the decimal (radix) point
                      in the input data, but it can be overridden by a
                      RADIX POINT definition, which is specified in the
                      Table clause as part of a DECIMAL field
                      description.

'language_territory.  All or part of the locale for the input file. The locale
codeset@sort'         specification must be enclosed in single quotation
                      marks. You do not need to specify all four parts of
                      a locale.

XML_ENCODING          This option applies only to loads in XML format.
                      When this keyword is specified, the native encoding
                      defined by the XML input file is used. The encoding
                      string inside the XML file must match one of the
                      strings listed on page 3-41.

Tip: You do not need to specify all the separator characters (the underscore (_), the
period (.), and the @ character) in a partial locale specification. Only the character
that immediately precedes the specified component is required, such as the underscore
character (_) preceding the territory in the previous example.

Regarding default values for unspecified locale components, the following
rules apply; the same default values and precedence rules apply to input-file
locales as to locales specified with the RB_NLS_LOCALE environment
variable. IBM strongly recommends that if the locale is not fully specified, the
language should be one of the specified components. Otherwise, the
unspecified components might default to incompatible values.


■ If you only specify the language, the omitted components are set to
  the default values for that language. For example, if you set the locale
  to Japanese, the complete locale specification is as follows:
  Japanese_Japan.JapanEUC@Binary
  For a list of default components for each language, refer to the
  locales.pdf file in the RELNOTES directory of your installation CD.
■ If you only specify the territory, the language defaults to English, the
  code set to US-ASCII, and the collation order to binary. For example,
  if you set the locale to _Japan, the complete, but impractical, locale
  specification is as follows:
  English_Japan.US-ASCII@Binary
■ Similarly, if you only specify the code set, the language defaults to
  English, the territory defaults to UnitedStates, and the collation order
  component defaults to binary.
■ Finally, if you only specify the sort component (collation order), the
  language defaults to English, the territory defaults to UnitedStates,
  and the code set defaults to US-ASCII.

Locale Specifications for XML Input Files
TMU loads in XML format are fully internationalized. The locale of an XML
input file can be specified in two different ways:

■ Locales supported by IBM Red Brick Warehouse (as listed in the
  /RELNOTES/locales.pdf file on the installation CD-ROM) are
  specified with the standard language_territory.codeset@sort string.
■ Encodings supported by the Xerces-C++ parser are specified with
the XML_ENCODING keyword. The parser supports the following
encodings:
❑ ASCII
❑ UTF-8
❑ UTF-16 (Big/Small Endian)
❑ UCS4 (Big/Small Endian)
❑ EBCDIC code pages IBM037 and IBM1140
❑ ISO-8859-1 (Latin1)
❑ Windows-1252
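For example, a load that relies on the encoding declared inside the XML
document itself might be sketched as follows. The input file, table, element
path, and field width are hypothetical, and the placement of the
XML_ENCODING keyword follows the clause ordering of the other LOAD
DATA examples in this chapter; the path syntax is described under "Simple
Fields" later in this chapter.

load data
inputfile 'product.xml'
insert format xml
xml_encoding
into table product (
prodname /products/product/name/#PCDATA char (30)
);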


Usage Notes
■ For all discard files you specify in the Discard clause, the input file
locale is used. However, for a discard file you specify in the Optimize
clause, the database locale is used.
■ For all ACCEPT and REJECT processing in the Criteria clause, the
database locale is used.
■ The locale for TMU messages is either the database locale or the locale
specified for the current user with the RB_NLS_LOCALE
environment variable.
■ Before you specify a code set for the input file that is different from
the code set for the database, make sure that conversion between
those code sets is supported. For a complete list of supported
languages and code sets, refer to the locales.pdf file in the relnotes
directory of your installation CD.
Warning: IBM Red Brick Warehouse provides no recovery mechanism when data
loss or data corruption occurs because of incompatible code sets.

Example

The following example shows the use of the Locale clause in a LOAD DATA
statement, where the language is English, the territory is Canada, the code set
is MS1252, and the collation sequence is Default, a Canadian collation
sequence definition.

load data
inputfile 'market.txt'
recordlen 7 modify
nls_locale 'English_Canada.MS1252@Default'
discardfile 'mktdisc.txt'
discards 1
optimize on discardfile 'mktdups.txt'
into table market(
mktkey integer external (4),
state char (2)
);


Discard Clause
If the TMU rejects any records because of data conversion, data content, or
referential integrity errors or because records do not meet the ACCEPT and
REJECT criteria in the Criteria clause, it places these records in one or more
discard files.

Important: In loads performed in optimize mode, as described on page 3-59,
duplicate rows are discarded to the discard file you specify in the Optimize clause
rather than the discard file you specify in the Discard clause.

The Discard clause is optional, and is used to specify the following
information about discard files:

■ Discard filenames.
■ Whether to separate records discarded for referential-integrity
violations from those discarded for data-integrity violations (data
conversion, data content, or data that does not satisfy the ACCEPT
and REJECT criteria).
■ Whether to further separate records violating referential integrity by
specifying separate discard files for the referenced tables. (If a record
contains multiple referential-integrity violations, it is written to the
file specified for each violated dimension.)

The Discard clause also specifies whether referential integrity is preserved
with the Automatic Row Generation feature rather than by discarding rows.
The default behavior for this feature is set with the AUTOROWGEN parameter
in the rbw.config file. If it is not set in the rbw.config file, the default behavior
is OFF. You can change the default behavior for all tables or for one or more
specific tables in the Discard clause of the LOAD DATA statement for that
table.

Important: If you have multiple discard files for a TMU LOAD DATA statement (for
example, DISCARDFILE and RI_DISCARDFILE), be sure that the names of the discard
files are unique. If the names are not unique, one file will overwrite the other file.


XML Discards
When data is loaded in XML format, the resulting discard files are in fixed
format, not XML format. You can reload the discarded rows by following
these steps:

1. Fix the problems that caused the rows to be discarded.

2. Edit the original control file:

   a. Change the Format clause to FORMAT XML_DISCARD. See
      page 3-29.

   b. Change the input file to the name of the discard file.


Each input record is first converted to internal row format and the data
integrity is checked. If an error occurs during this phase, the record is
discarded and written to the standard discard file, if one is specified. If no
error occurs during this phase, referential integrity checks are performed on
each referenced table.

discard_clause                                     Back to LOAD DATA, p. 3-24

    [DISCARDFILE 'filename' [IN ASCII | IN EBCDIC]
        [, 'filename' [IN ASCII | IN EBCDIC]]]
        (DISCARDDN can be used in place of DISCARDFILE.)
    [RI_DISCARDFILE 'filename'
        | RI_DISCARDFILE (table_name 'filename', ...) [OTHER 'filename']]
    [DISCARDS n]
    [AUTOROWGEN OFF
        | AUTOROWGEN ON [(table_name, ...)] [DEFAULT [(table_name, ...)]]
        | AUTOROWGEN DEFAULT [(table_name, ...)] [ON (table_name, ...)]]


DISCARDFILE     Files to which the TMU writes discarded input records.
'filename'      You can specify two files (but no more than two) here so
                that both ASCII and EBCDIC versions of the discard file
                can be written. (You can use DISCARDDN instead of
                DISCARDFILE.) All discarded records are written to these
                files unless the RI_DISCARDFILE clause is present and
                specifies separate files for records that violate
                referential integrity.

                The user redbrick must have write permission for the
                discard file. The filenames must be enclosed in single
                quotation marks.

ASCII, EBCDIC   Optional. Specifies the code set of the output file. This
                clause applies only to EBCDIC input data in an ASCII
                locale. If you specify ASCII or no value, the output file is
                written in the code set of the input locale. The output
                file can be in EBCDIC only if the input file is in EBCDIC.

RI_DISCARDFILE  Optional. File to which to discard the records that vio-
'filename'      late referential integrity. You cannot use this clause on a
                table that does not reference other tables. You cannot
                specify ASCII and EBCDIC format for these files. The
                files are written in the code set of the input locale.

                The filename must be enclosed in single quotation
                marks.


table_name      Table name and filename pair that names a table refer-
'filename'      enced by a foreign key in the table being loaded and a
                file to which to discard the records that violate referen-
                tial integrity with respect to the referenced table. The
                use of these pairs allows resolution and reprocessing of
                referential-integrity violations more easily than if all
                discards are stored in a single file.

                These name pairs provide a separate discard file for
                each named table. If a single record violates referential
                integrity with respect to multiple referenced tables, that
                record is written to the file associated with each of those
                tables.

                You can specify multiple pairs. If some but not all refer-
                enced tables are listed here, records that violate referen-
                tial integrity with respect to tables missing from the list
                are written either to the file following the OTHER
                keyword, or if that keyword is missing, then to the
                standard discard file (following the DISCARDFILE
                keyword).

                The filenames must satisfy the file-specification con-
                ventions of your operating system and must be
                enclosed in single quotation marks. Pairs must be sepa-
                rated by commas.

OTHER 'filename'
                Optional. File to which to discard any records that vio-
                late referential integrity with respect to referenced
                tables not named in the table name and filename pairs.
                If a table name and filename pair list is present and the
                OTHER clause is omitted, any records that violate refer-
                ential integrity with respect to tables missing from the
                list are written to the standard discard file (following
                the DISCARDFILE keyword).

                The filename must satisfy operating-system file-specifi-
                cation conventions and must be enclosed in single
                quotation marks.


DISCARDS n      Maximum number of records that can be discarded, for
                any reason, before the TMU ends execution of the con-
                trol file. If the control file contains multiple control
                statements, any remaining statements are not executed.

                When the maximum number of discards is reached,
                data already loaded into the current TMU batch of rows
                is rolled back, but all batches completed before the
                maximum was reached remain in the table.

                The discard limit applies to total records discarded from
                all input files in a single LOAD DATA statement. If mul-
                tiple input files are specified, the number of records dis-
                carded is not reset for each file.

                If the Discard clause is omitted or if DISCARDS n is omit-
                ted from the Discard clause, the default value is 0. If n is
                0, no maximum exists. The TMU completes execution of
                the control file regardless of how many records are
                discarded.

                In load operations performed in optimize mode, as
                described on page 3-59, the actual number of discards
                might exceed the discard limit n because duplicate rows
                are not detected until all input rows are processed and
                counted. If you specify no filenames in the Discard
                clause but DISCARDS n is present, execution of the
                control file stops after n discards, but the discarded
                records are not saved in a file.


AUTOROWGEN      Specifies the behavior of the TMU when input records
                violate referential integrity (when a foreign-key value
                in an input record does not match a primary-key value
                in the referenced table). The behavior can be specified
                on a table-by-table basis, allowing a combination of
                OFF, ON, and DEFAULT modes.

                Whenever a list of referenced tables is present, the
                AUTOROWGEN feature uses a mixed-mode operation,
                applying the specified mode to the tables in the list. All
                other directly referenced tables are treated as if
                AUTOROWGEN is set to OFF. Any records that violate
                referential integrity with respect to such a table are
                discarded.

                The AUTOROWGEN feature requires that all referenced
                tables into which generated rows are inserted have
                defined MAXROWS PER SEGMENT values.

                Precomputed views defined on the detail dimension
                tables for which you are generating rows cannot be
                maintained. If aggregate maintenance is turned on,
                such views are marked invalid.

AUTOROWGEN OFF  Records violating referential integrity are written to the
                discard file.


AUTOROWGEN ON   New rows are automatically generated (using the col-
                umn default values) and added to the referenced tables
                to fully satisfy referential integrity, and the input record
                is added to the table being loaded. If no list of tables fol-
                lows the ON keyword, this behavior applies to all
                tables. However, if a list of tables follows the keyword,
                this behavior applies only to the tables in the list and
                tables referenced directly or indirectly (outboard tables)
                by those tables.

                When AUTOROWGEN is set to ON and a potential refer-
                ential-integrity violation cascades through a series of
                referenced tables, new rows are added until referential
                integrity is satisfied. If any of the rows necessary to pre-
                serve referential integrity cannot be added, for exam-
                ple, because conflicting modes apply to a referenced
                table or because of permission or authorization
                violations, the record is discarded.

AUTOROWGEN      If a referential-integrity violation is detected, the for-
DEFAULT         eign-key value in that record is changed to the default
                value for the foreign-key column and a new row is
                inserted into the table being loaded. If no list of table
                names is present, this behavior applies to all tables.
                However, if a list of table names follows the DEFAULT
                keyword, this behavior applies only to the tables in the
                list.


table_name      Specifies a table directly referenced by the table to load.
                A table cannot appear in more than one list.

                For any table in a list following ON: if a referential-
                integrity violation occurs involving a table in this list or
                a table referenced directly or indirectly by a table in this
                list, a new row is added to the referenced table.

                For any table in a list following DEFAULT: if a referen-
                tial-integrity violation occurs involving a table in this
                list, the record is modified and added to the referencing
                table.

                If a conflict arises because a row necessary for referen-
                tial integrity cannot be added, the input record is
                discarded.

Example: rejected records stored for referential integrity

In the following example, the TMU stores records rejected for data-integrity
violations in the file prod_di.txt and it stores records rejected for
referential-integrity violations in the file prod_ri.txt.

load data
inputfile 'prod.txt'
format separated by ':'
discardfile 'prod_di.txt'
ri_discardfile 'prod_ri.txt'
discards 10
optimize on discardfile 'prd_dups.txt'
into table product(
classkey integer external (2),
prodkey integer external (2),
prodname char (30),
pkg_type char (20)
);


Example: rejected records stored in different files for different tables

In the following example, the TMU stores records that are rejected for data-
integrity violations in the file orders_di.txt. It stores the records rejected for
referential-integrity violations against the referenced tables Supplier and
Deal in the files sup_ri.txt and deal_ri.txt, respectively. Any other records
discarded for referential integrity are stored in the file misc_ri.txt.

load data
inputfile 'aroma_orders.txt'
format separated by '*'
discardfile 'orders_di.txt'
ri_discardfile (supplier 'sup_ri.txt',
deal 'deal_ri.txt') other 'misc_ri.txt'
discards 10
optimize on discardfile 'orders_dups.txt'
into table orders(
order_no integer external,
perkey integer external,
supkey integer external,
dealkey integer external,
order_type char (20),
order_desc char (40),
close_date date 'YYYY-MM-DD',
price dec external (7,2)
);


Example: AUTOROWGEN DEFAULT

Assume the following tables and AUTOROWGEN setting.

[Figure: a schema in which the Fact table references the dimension tables
Dim1, Dim2, and Dim3; an outboard table Out1 is referenced by one of the
dimensions.]

autorowgen on (dim1, dim3) default (dim2)

If a record to insert into the Fact table violates referential integrity with
respect to the Dim2 table, a new row, in which the foreign-key value that
violated referential integrity is replaced by the default value for the
foreign-key column, is added to the Fact table.


Usage
The following information applies to the use of the AUTOROWGEN feature in
the Discard clause.

Default Values
Whenever a referential-integrity violation is detected and a row is inserted
with a default value (in either the referenced table (ON mode) or in the refer-
encing table (DEFAULT mode)) the default values used are determined from
default values defined for each column when the table was created. The
default values can be literals, NULL, or system values such as
CURRENT_USER, CURRENT_DATE, CURRENT_TIME, or
CURRENT_TIMESTAMP. These default values also have specific interactions
and restrictions with the column attributes NOT NULL and UNIQUE as
defined in the SQL Reference Guide.

You can change a default value assigned to a column with the ALTER TABLE
statement.

You can determine the default values for each column in a table by selecting
from the RBW_COLUMNS system table as follows:
select name, defaultvalue from rbw_columns
where tname = 'TABLENAME';

Table Locks
Whether a referenced table is locked for read or write access during a load
operation depends on the AUTOROWGEN mode. Locks are obtained at the
beginning of the load operation and held throughout the operation. First, to
maintain referential integrity, write locks are obtained on all referenced tables
into which a write might occur. Read locks are obtained on all referenced
tables that must be read to verify referential integrity.

All required locks are obtained automatically by the TMU. You do not need to
lock any tables manually.


During a load operation, for all AUTOROWGEN modes, the table being
loaded is locked for write access. In addition, the modes, whether specified
in the rbw.config file or the LOAD DATA statement, require additional locks
on referenced tables as follows:

■ OFF mode: directly referenced tables require a read lock. Other
tables are not locked.
■ ON mode: if no list of tables follows the ON keyword, all referenced
tables and all tables referenced directly or indirectly by those tables
are locked for write access.
If a list of tables follows the ON keyword, tables in the list and all
tables referenced directly or indirectly by those tables are locked for
write access. Any directly referenced tables not present in the list are
locked only for read access.
■ DEFAULT mode: if no list of tables follows the DEFAULT keyword, all
directly referenced tables are locked for read access. Indirectly refer-
enced tables require no locks.
If a list of tables follows the DEFAULT keyword, those tables are
locked for read access.
■ Whenever lists of tables are present, any directly referenced table not
in the list is locked for read access.

Conflicts in Mixed-Mode Operation
In mixed-mode operation, with both ON- and DEFAULT-mode table lists
present, the LOAD DATA statement might specify potentially conflicting
behavior. If such conflicts occur, a warning message is issued and the record
is discarded.


Assume the database contains the following tables.

[Figure: a schema in which the Fact table references Dim1, Dim2, and Dim3,
and Dim2 in turn references Dim3; an outboard table Out1 is also shown. The
tables are annotated with the AUTOROWGEN modes ON (Dim1, Dim2) and
DEFAULT (Dim3).]

Assume also that the LOAD DATA statement for the Fact table contains the
following Discard clause:

discardfile 'fact_dscd' discards 100
autorowgen on (dim1, dim2) default (dim3)

Assume that a record to load into the Fact table requires (for referential
integrity) that a row be added to the Dim2 table, which in turn requires (for
referential integrity) that a row be added to the Dim3 table. However,
according to the AUTOROWGEN clause, referential-integrity violations in
which Dim3 is referenced directly are resolved by replacing the foreign-key
value with the default-column value before adding the row to the Fact table.
Because of this conflict, the TMU discards the record.

DEFAULT Mode and Simple Star Schemas
In a simple star schema, the primary key is composed of all the foreign-key
columns and only those columns. In DEFAULT mode, the same value, the
default value for the column, is used for each row to enter in the referencing
table. Because each row must contain a unique primary key, repeated use of
the same value might cause records to be discarded as duplicates rather than
entered into the referencing table. The example on page 3-20 shows this
behavior.


Troubleshooting
If automatic row generation is in ON or DEFAULT mode but the TMU is unable
to generate rows in a referenced table:

■ Verify that the database user ID running the TMU has INSERT
privilege on the table.
■ Verify that the referenced table does not specify both NOT NULL and
DEFAULT NULL for any column. This combination on a single
column prevents automatic row generation.
■ Verify that a MAXROWS PER SEGMENT value is set for each
referenced table.

If a load operation ends before any rows are loaded, verify that the user
redbrick has write permission on any files named in the Discard clause.

For additional information about using the Automatic Row Generation
feature, refer to "Maintaining Referential Integrity with Automatic
Row Generation" on page 3-16.

Row Messages Clause
The Row Messages clause allows you to specify a filename where row-level
warning messages are sent as the LOAD operation progresses. If no Row
Messages clause is specified, the messages are displayed as part of the
standard error output.

The Row Messages filename is ignored if the TUNE TMU_ROW_MESSAGES
configuration parameter is set to NONE. For information about this
parameter, refer to "Managing Row Messages" on page 2-38.


Regardless of the presence of the Row Messages clause in the control file,
messages also go to the log file maintained by the log daemon. A large
number of message-logging operations could slow down the LOAD
operation significantly; however, if warning messages are being sent to a file
instead of displayed, you might not realize that it is the message logging that
is causing the performance degradation. For information about configuring
the severity level for message logging, refer to the Administrator’s Guide.

You can set the Row Messages mode globally in the rbw.config file for all
load operations or in the SET statement for individual loads. If you have Row
Messages set to FULL, which is the default, the Row Messages clause desig-
nates the filename for the messages. If the ROWMESSAGES filename is
specified but the rbw.config file or a SET statement sets the mode to NONE,
the filename is ignored.

row_messages_clause                                Back to LOAD DATA, p. 3-24

    ROWMESSAGES 'filename'

'filename'      The name of the file to which row-level warning
                messages are sent. This filename should be different from
                the filename specified for discard files. The file is created
                only if row warnings exist.
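For example, a sketch that routes row-level warnings from the Market load to
a separate file. The warning filename is hypothetical, and the clause is placed
after the Discard clause, following the clause ordering of the other examples
in this chapter.

load data
inputfile 'market.txt'
recordlen 7 modify
discardfile 'mktdisc.txt'
discards 1
rowmessages 'mktwarn.txt'
into table market(
mktkey integer external (4),
state char (2)
);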


Optimize Clause
The Optimize clause applies only to those operations that use REPLACE,
INSERT, APPEND or MODIFY modes in the Format clause. You cannot use the
Optimize clause with UPDATE mode.

The Optimize clause specifies how to update the indexes during a TMU
incremental load operation.

You can set the optimize mode globally in the rbw.config file for all load
operations. If the optimize setting in the rbw.config file is what you want for
a given LOAD DATA statement, then you do not need to include the Optimize
clause in the LOAD DATA statement.

optimize_clause                                    Back to LOAD DATA, p. 3-24

    OPTIMIZE OFF
  | OPTIMIZE ON [DISCARDFILE 'filename']

OPTIMIZE OFF, ON
                Turns optimize mode on or off. The default behavior is
                OFF.

                This setting overrides the global optimize mode setting
                in the rbw.config file for this LOAD DATA statement. In a
                TMU control file containing multiple LOAD DATA state-
                ments, an Optimize clause applies only to the LOAD
                DATA statement that contains it. Any LOAD DATA state-
                ment that does not contain an Optimize clause uses the
                optimize setting in the rbw.config file.

                If OPTIMIZE OFF mode is used (non-optimize mode),
                indexes are updated when each input row is inserted into
                the data file, which provides better performance when
                the data being loaded contains many duplicate rows.


                If OPTIMIZE ON mode is used (optimize mode), the index
                entries can be inserted as a batch operation. The batch
                operation is faster because it requires fewer I/O
                operations.

                OPTIMIZE ON loads automatically switch to OPTIMIZE
                OFF when the mode is MODIFY or MODIFY AGGREGATE
                and a regular column (not a pseudocolumn) is specified
                in the Criteria clause.

DISCARDFILE 'filename'
                Optional. File into which duplicate records are dis-
                carded. A duplicate record contains a value that is the
                same as the value in an existing row for a column that is
                declared UNIQUE in the CREATE TABLE statement and is
                indexed. This definition does not apply to MODIFY and
                MODIFY AGGREGATE loads, in which case such "dupli-
                cates" are not discarded but used to update existing rows
                and re-aggregate existing rows. These records are only
                discarded if there is an error during the update or re-
                aggregation operation.

                This file contains rows in the same format as those rows
                discarded during an UNLOAD EXTERNAL operation on a
                table. The format of duplicate records discarded in opti-
                mize mode differs from the format of records discarded
                for referential-integrity violations, data conversion, or
                format errors. You can specify a separate discard file for
                these duplicate records in the LOAD DATA statement for
                each file so that records of different formats are not mixed
                together in a single discard file. If no discard file is spec-
                ified, then duplicate records are discarded to the same
                file as other discarded records. Discarded records are not
                always written to the discard file in the same order as the
                input records.

                Numeric and datetime data are formatted according to
                the ANSI SQL-92 rules for these data types. Data formats
                are not localized. However, multibyte characters in data
                and table and column names are preserved.

                If a discard file is specified and optimize mode is off, the
                discard file clause is ignored.


Usage Notes
In optimize mode, new index nodes in B-TREE and STAR indexes are built
using fill factors specified in the rbw.config file or found in the
RBW_INDEXES system table. For information about fill factors, refer to the
Administrator’s Guide.

In optimize mode, space is allocated for index entries for all records,
including duplicate records, and that space is not reclaimed for immediate
reuse when the duplicate records are discarded. If the data you are loading
contains many duplicate records, this behavior has several side effects:

■ Indexes built in optimize mode are larger than indexes built in
non-optimize mode.
■ The MAXROWS PER SEGMENT limit might be reached before that
many rows are actually loaded into a table segment because records
are counted before duplicates are eliminated.
■ The actual number of discarded rows might exceed the number n set
as the maximum number of discards in the Discard clause.
■ The actual number of rows that can be loaded in optimize mode
might be less than the number calculated to fit in the space available,
even if the discarded duplicate records are corrected and reloaded in
non-optimize mode.
■ For INSERT, APPEND, and REPLACE modes, when duplicate rows are
discarded, the first row encountered in the table is kept and subse-
quent duplicates are discarded. This row might or might not be the
first row encountered in the load process (because of the way space
is reused and the batch processing that occurs during index-building
operations).

A MODIFY mode load with OPTIMIZE ON can perform both a batch-index
update and a direct-index update. If the new input row has the same
primary-key value as an existing row, the new input row replaces the existing
row, and the indexes and data file are updated immediately. However, if the
new input row does not have the same primary-key value as an existing row,
the index entries for the row are added to the temporary index-building
space area, and inserted later as a batch operation.


Unlike an INSERT, APPEND, or REPLACE mode load with OPTIMIZE ON,
during a MODIFY mode load with OPTIMIZE ON, the duplicate-row
processing in the batch operation observes the order of the input rows. For
rows with the same primary-key value, the row encountered last in the load
process is guaranteed to be in the table. This preserves the meaning of the
MODIFY operation.

Essentially, the MODIFY mode load with OPTIMIZE ON uses batch insertion
for insert-input rows and direct-index insertion for update-input rows. If
multiple unique indexes are defined on the table, OPTIMIZE is turned OFF.
Except for the primary-key index (which could be a B-TREE or a STAR index),
a unique index can only be a B-TREE defined on a unique column.

The benefit of the batch-insertion operation is proportional to the percentage
of insert rows in the input stream. Therefore, a MODIFY mode load with
OPTIMIZE ON is suggested for an input stream that contains both insert and
update rows.

The duplicates going through the duplicate-removal phase are the rows that
have the same primary keys as the previous rows in the input file. The input
rows that have the same primary keys as the ones in the table do not count as
duplicates. Those input rows are updated directly. Therefore, when you use
the MODIFY mode load, duplicate handling is not as critical as when you use
the INSERT, APPEND, or REPLACE modes.

Checking for duplicate rows consumes both time and memory when the load
process is done in optimize mode. When the number of duplicate rows is
large, the amounts consumed can be significant. IBM recommends that you
do not use optimize mode for load processes in either of the following cases:

■ The target table has a UNIQUE index other than the primary-key
index (multiple UNIQUE indexes), and more than 5,000 records in the
input data are discarded because their key values are duplicates of
existing rows.
■ The target table has a single UNIQUE index (the primary-key index),
the input data contains more than 1 million records, and more than
10 percent of these records are discarded because their key values are
duplicates of existing rows.


In the following example, the TMU discards duplicate records and saves them
in the named file mktdups.txt. Any records discarded for other reasons
(referential-integrity violations, data-conversion errors) are saved (in a
different format) in the file mktdisc.txt.

load data
inputfile 'market.txt'
recordlen 7
discardfile 'mktdisc.txt'
discards 1
optimize on discardfile 'mktdups.txt'
into table market(
mktkey integer external (4),
state char (2)
);

MMAP Index Clause
The MMAP Index clause specifies one or more primary key indexes on tables
referenced by the table being loaded. The purpose of this specification is to
define the order in which those indexes are memory-mapped (with the
mmap system function), as a means of optimizing referential-integrity
checking. Use this clause in conjunction with the TUNE TMU_MMAP_LIMIT
parameter, which controls the amount of memory available for memory-
mapping during loads; see page 2-35.

mmap_index_clause                                  Back to LOAD DATA, p. 3-24

    MMAP INDEX (pk_index_name [SEGMENT (segment_name, ...)], ...)

pk_index_name   Specifies the primary index on a table referenced by the
                table being loaded. One or more indexes can be specified.


SEGMENT         Specifies one or more segments of a specified primary
segment_name    key index. Use this specification when you know which
                index segments the referenced table data is associated
                with. The memory mapping function is limited to the
                index segments you specify.

                The list of segment names must be enclosed by
                parentheses.

In the following example, the primary key indexes from the Period and
Product dimension tables will be memory-mapped in that order when the
Sales_Forecast table is loaded.

LOAD DATA INPUTFILE 'sales_forecast.txt'
RECORDLEN 62
INSERT
NLS_LOCALE 'English_UnitedStates.US-ASCII@Binary'
MMAP INDEX (period_pk_idx, product_pk_idx)
INTO TABLE SALES_FORECAST (
...;

The TMU attempts to memory-map the indexes in the order specified. If an
index cannot be memory-mapped for some reason, the TMU will still try to
memory-map any subsequent indexes in the list.

If the MMAP INDEX clause is omitted from the control file, primary key
indexes on referenced tables are memory-mapped in descending order of size
(from largest to smallest). If insufficient memory is available to memory-map
an entire index, some of its segments are still memory-mapped.


Table Clause
The Table clause specifies the table into which the data is loaded. It can
include:

■ Table column names


■ Field specifications, which describe either:
❑ Fields within the records of the input file.
❑ Instructions for the TMU to retain the existing value, to use a
default value, or to generate data, such as a sequence of
numbers, for some columns.

Each field or group of fields maps to a column within the specified table. The
mapping of input data types to database server data types is defined on
page 3-133. For all formats except FORMAT UNLOAD, one or more field speci-
fications is required. For UNLOAD, field specifications are not allowed.

table_clause                                       Back to LOAD DATA, p. 3-24

    INTO TABLE table_name
    ( { col_name [AS $pseudocolumn] | $pseudocolumn }
      { RETAIN
        | DEFAULT
        | simple_field       (p. 3-71)
        | concat_field       (p. 3-81)
        | constant_field     (p. 3-84)
        | sequence_field     (p. 3-85)
        | increment_field    (p. 3-86) }
      , ... )


Important: The column names, pseudocolumns, and field specifications must be
omitted if and only if the input data is in UNLOAD format, as specified in the Format
clause.

A pseudocolumn is not a real column in the table. It is used to store data
temporarily for one of the following reasons:

■ For future use with the CONCAT option.
■ To discard an input field that is not used in the table.
■ As a reference in a Criteria clause.

table_name      Table into which the data is loaded. The table must be
                previously defined with a CREATE TABLE statement.
                This name cannot be a view or synonym name.

col_name,       Column or pseudocolumn into which a field from the
$pseudocolumn   input record or TMU-generated data is loaded. Each
                pseudocolumn name must begin with a dollar sign ($).

                If a table column is omitted from the Table clause,
                existing rows retain the current value for that column
                and new rows are loaded with the default value for the
                column. In MODIFY or UPDATE mode, the primary-key
                column must be present in the Table clause.


AS $pseudocolumn
                Stores the input value in the specified column in the
                table as well as in the specified pseudocolumn for refer-
                ence in a Criteria clause. You must specify the AS
                $pseudocolumn clause if a Criteria clause is present and
                the input data is in fixed or variable format and a Posi-
                tion clause is not used.

                Under this circumstance, you must specify the input
                column with AS $pseudocolumn, and you must use
                $pseudocolumn in the Criteria clause to refer to the input
                field because no way exists to refer to that column by
                specifying its position.

                When loading fixed-format or variable-format input
                data and specifying a Position clause, you need not
                specify AS $pseudocolumn because you can refer to the
                input data by its position. You can specify both a
                pseudocolumn for use in the Criteria clause and a table
                column for the same input field.

RETAIN          Causes an existing row, when being updated, to retain
                its current value for the column corresponding to this
                field. Causes a new row to store the default value for the
                column. (Default column values are defined by CREATE
                TABLE statements.) This behavior applies to both
                AGGREGATE and non-aggregate modes.

                The RETAIN keyword is not allowed with primary-key
                columns or pseudocolumns.

DEFAULT         Causes the default value for the column corresponding
                to this field (or NULL if no default value is defined) to be
                stored in the column. To load the default value of the col-
                umn into both new and existing rows, include a field
                specification with the DEFAULT keyword for that
                column. The DEFAULT keyword is not allowed for
                pseudocolumns and table columns defined as the
                SERIAL data type.

Important: If the DEFAULT keyword is used for a column that is defined as NOT
NULL DEFAULT NULL, then the load operation ends.


Loading a SERIAL Column
When loading a SERIAL column, the TMU can either directly load input data
containing serial values, or automatically generate serial values. When
loading or unloading a serial column, the load script must specify the data
type as either a numeric-external field type or an integer-binary field type.

When the serial values are provided in the input data and loaded in INSERT,
APPEND, REPLACE, UPDATE, or MODIFY modes, all positive values for the
SERIAL column are loaded. Any rows with zero or negative values are
discarded.

When serial values are not provided in the input data, the TMU automatically
generates them. In this case, you can either leave the load-script data field
undefined, or define the data field as RETAIN. You cannot define a field as
DEFAULT, because the serial data type has no default value.

When the serial values are being automatically generated in:

■ INSERT, APPEND, or REPLACE modes, a serial value is automatically
generated.
■ UPDATE mode, no changes are made to the original serial values.
■ MODIFY mode, no changes are made to the original serial values, and
new values are generated for new rows.

When serial values are loaded with AUTOROWGEN ON:

■ If the SERIAL column is part of the primary key of the referenced
table, the serial values are generated according to the following rules:
❑ If the value is zero or negative, the row for the referencing table
is discarded, and no rows are inserted into either the referencing
or the foreign table.
❑ If the values are positive, the input row is inserted into the refer-
encing table, and the automatically generated row is inserted
into the foreign table.
■ If the SERIAL column is not part of the primary key of the referenced
table, a serial value is generated for it.

You cannot perform an offline load on a table with a SERIAL column. For
more information about SERIAL columns, refer to the SQL Reference Guide.
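As a sketch, a control file that loads explicit serial values from the input data
might look as follows. The table, file, and field widths are hypothetical; the
SERIAL column is described with a numeric-external field type, as required
above.

load data
inputfile 'new_orders.txt'
insert
format separated by '|'
into table orders (
order_no integer external (6),
order_desc char (40)
);

Rows whose order_no value is zero or negative would be discarded, as
described above.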


Selective Column Updates with RETAIN and DEFAULT
The following table defines the TMU behavior during load operations with
respect to the type of field specification you specify in the Table clause, and
the load mode you specify in the Format clause, of the LOAD DATA statement.

                            Load Modes (Format Clause)

Field             APPEND, INSERT,    UPDATE*            MODIFY*
Specification     or REPLACE

No column name    Load column        Retain current     New row: load column
or field          default value.     column value.      default value.
specification                                           Existing row: retain
                                                        current column value.

RETAIN keyword    Error (#1362)      Retain current     New row: load column
                                     column value.      default value.
                                                        Existing row: retain
                                                        current column value.

DEFAULT keyword   Load column default value.

Simple specifier, Load field input value.
CONCAT,
CONSTANT,
SEQUENCE,
INCREMENT

* In UPDATE or MODIFY mode, the primary-key columns must be present in
the Table clause of the LOAD DATA statement.

In the following example, the Market table is being loaded with new data
that reflects a geographic reorganization of the districts and regions to which
each headquarters city belongs.

The CREATE TABLE statement for the Market table is as follows:


create table market (
mktkey integer not null,
hq_city char (20),
hq_state char (5),
district char (14),
region char (10),
primary key (mktkey))
maxrows per segment 128


The basic LOAD DATA statement for the Market table is as follows:
load data
inputfile 'aroma_market.txt'
recordlen 45
replace
discardfile 'aroma_discards'
discards 1
into table market (
mktkey integer external(2),
hq_city char(20),
hq_state char(2),
district char(13),
region char(7) );

The new LOAD DATA statement retains the information in the Hq_city and
Hq_state columns but loads the new definitions in the District and Region
columns. Even if the Hq_city and Hq_state field specification lines are
omitted, the values currently in the table are retained.

load data
inputfile 'aroma_mkt_upd.txt'
recordlen 45
modify
discardfile 'mkt_upd_discards'
discards 1
into table market (
mktkey integer external(2),
hq_city retain,
hq_state retain,
district char(13),
region char(7) );


Simple Fields
A simple field specifies the data type of the field in the input record that is
loaded into the column.

simple_field                                    Back to table_clause, p. 3-65

    { [POSITION (start[:end])] | xml_path } field_type       (p. 3-97)
    [ROUND | LTRIM | RTRIM | TRIM
      | ADD | SUBTRACT | MIN | MAX
      | ADD_NONULL | SUBTRACT_NONULL | MIN_NONULL | MAX_NONULL]
    [NULLIF (start[:end]) = { 'string' | x'hex_string' }]


POSITION        Offset in bytes from the beginning of the record. This
                option is used for fixed-format or variable-format files
                only. Do not use this option with the SEPARATED BY key-
                words. The position of the first field in a record is 1. If no
                position is specified, then the position of a field is one
                greater than the last byte of the previous field.

                Position refers to the position in the record, not in a specific
                field. For the CHARACTER field type, if no length is specified
                with the POSITION keyword or in the field-type description
                (for example, char (15)), then the column width defined for
                the table is used.

                For DECIMAL (EXTERNAL, PACKED, ZONED), INTEGER
                EXTERNAL, FLOAT EXTERNAL, DATE, TIME, TIMESTAMP,
                VARLEN, VARLEN EXTERNAL, and M4DATE field types, a
                length must be specified either with the POSITION keyword
                or with the length parameter in the field-type specification.

                The POSITION keyword is ignored for data in separated
                format.

start:end       Refers to the position of the data in the record, not to the posi-
                tion in the field. Therefore, when you specify start:end,
                remember to use the position of the data in the record. In the
                following example, the field to load into the Weight column
                starts at position 30 and ends at position 32. Other fields or
                blanks precede this field in the input file:

                weight position (30:32) integer external (3)

                As in a Position clause, start:end refers to the position of the
                data in the record, not to the position in the field. In the fol-
                lowing example, the field for the City column contains 20
                bytes, starting at position 25 and ending at position 44. The
                TMU stores a null indicator in the City column if the first
                three bytes in that field match the string 'San':

                city position (25:44) char (20) nullif (25:27) = 'San'


                If :end is specified, the length of the field is end – start + 1
                bytes. This length overrides any length specified in the field-
                type specification. If :end is not specified, the length of the
                string is used.

xml_path        Describes how the XML input file should be parsed to load
                each input field.

field_type      Data type of the input field, for example, integer external. For
                information about field types, refer to page 3-97.

LTRIM, RTRIM,   Input modifiers that handle leading and trailing spaces when
TRIM            loading VARCHAR columns. Each modifies the input column
                by removing preceding blanks, trailing blanks, or both,
                respectively.

NULLIF          Provides a way to load a column with the default value. If the
                default value is not defined, the column is loaded with NULL.
                If the data in the specified position is equal to the value of the
                string, then the column value for the corresponding row is set
                to NULL.

                For VARLEN and VARLEN EXTERNAL, NULLIF is defined in
                the fixed part, not the attached part.

                NULLIF conditions cannot be specified for loads in XML
                format.

start:end       See the start:end description earlier in this table.

string          Must be specified by using the database-locale code set.

x'hex_string'   Specifies a string in hexadecimal format. The character x
                must be present.

ADD             Adds the value in the input record to the corresponding
                value in the table column.

SUBTRACT        Subtracts the value in the input record from the correspond-
                ing value in the table column.

MIN             Keeps the smaller of the values in the input record and the
                value in the corresponding table column.

MAX             Keeps the larger of the values in the input record and the
                value in the corresponding table column.

ADD_NONULL      Adds the value in the input record to the corresponding table
                column. When the value in the input record is NULL, the
                value in the table column remains unchanged. A null value
                in the table column is treated as 0.

SUBTRACT_NONULL Subtracts the value in the input record from the value in the
                corresponding table column. When the value in the input
                record is NULL, the value in the table column remains
                unchanged. A null value in the table column is treated as 0.

MIN_NONULL      Keeps the smaller of the values in the input record and the
                value in the corresponding table column. If the value in the
                input record is NULL, the value in the table column is
                retained. If the value in the table column is NULL, it is
                replaced by the value in the input record.

MAX_NONULL      Keeps the larger of the value in the input record and the value
                in the corresponding table column. If the value in the input
                record is NULL, the value in the table column is retained. If
                the value in the table column is NULL, it is replaced by the
                value in the input record.

ROUND           Converts the floating-point input data types (REAL, FLOAT,
                and DOUBLE PRECISION) to integer or decimal table data
                types (TINYINT, SMALLINT, INT, and DECIMAL) by rounding.

                Converts the floating-point value in the input field to an inte-
                ger or decimal value by rounding, based on the most signifi-
                cant digit of the portion truncated because of the scale value.
                A value less than 5 is rounded down to the nearest integral
                value. A value greater than or equal to 5 is rounded up.


xml_path Specification
The following diagram shows how to construct an XML path for an input
field:

xml_path                                        Back to simple_field, p. 3-71

    /element [/element ...] { /@attribute | /#PCDATA } field_type   (p. 3-97)

/element Identifies an element in the XML file, as specified with a


markup tag. Several consecutive elements can be specified to
represent the repetitive hierarchical structure of XML files.
Each element must begin with the “/” character. Each series
of elements must end with either “/@attribute” or
“#PCDATA.”

/@attribute Specifies the value of a particular attribute for the element. In


the XML file, this value must be a literal enclosed in quotes
and preceded by the equal sign.

#PCDATA Specifies the data content (all of the non-markup characters)


that appears between the element’s start and end tags.

In XML, a CDATA section instructs the parser to ignore char-


acters ordinarily recognized as markup characters (such as
the < and & symbols) and treat them as content. If a CDATA
section appears in an element, the TMU will load these char-
acters as well, based on the #PCDATA specification. If both
CDATA and #PCDATA content is present in the same element,
the character strings are concatenated to form the input for
the column.
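
	As a quick illustration of this concatenation behavior, consider a hypothetical element (not one of this guide's sample files) that mixes plain content with a CDATA section:

	<name>Aroma <![CDATA[<2002>]]> shirt</name>

	With a field specification ending in /name/#PCDATA, the TMU would load the string "Aroma <2002> shirt": the PCDATA content and the CDATA content are concatenated in document order, and the < and > inside the CDATA section are treated as ordinary characters.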

field_type	Most field types can be specified in an XML path; however, the VARLEN and VARLEN EXTERNAL field types are not supported. If a given field type supports the length specification, you must specify the length. For example, character and numeric external fields must have a length specification. See page 3-97 for the syntax of each supported field type.

	The length value represents the maximum number of bytes that can be loaded into the field. If a given value contains fewer bytes, it is padded with trailing spaces. If a value exceeds the maximum, the additional bytes are truncated. If the truncated bytes are not spaces, the TMU displays a warning message. If the truncated bytes are spaces, no message is displayed.

Tip: Specify the length of each field based on a careful estimation of the data in the XML input file. The field lengths determine the size of each input row; if you set the length values too high, the input rows will be larger than necessary.

Aggregate Operators
You can use the aggregate operators ADD, SUBTRACT, MIN, and MAX only with the MODIFY AGGREGATE or UPDATE AGGREGATE modes. You cannot use them with primary-key columns, true pseudocolumns (as opposed to AS $pseudocolumns), or non-numeric columns.

If a value in the specified field of the input record or a value in the specified column of the table is NULL, then the result of the aggregation operation (ADD, SUBTRACT, MIN, or MAX) is NULL. A sketch contrasting this behavior with the NONULL variants follows.
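
The following sketch shows the ADD_NONULL variant in an UPDATE AGGREGATE load; it mirrors the Auto Aggregate example later in this section, and the input file name is assumed for illustration. Unlike ADD, a NULL dollars field here leaves the existing column value unchanged instead of producing NULL.
load data
inputfile 'sales.txt'
update aggregate
discardfile 'sales_discards'
into table sales(
perkey integer external (5),
prodkey integer external (2),
mktkey integer external (2),
dollars decimal external (7,2) add_nonull
);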

Example: Position Clause
The following example shows a LOAD DATA statement that reads a fixed-format file. The Position clause specifies the starting byte of each field as an offset in bytes from the beginning of the record. The field that maps to the Perkey column starts at position 4 and ends at position 8, followed by 3 spaces. The next field maps to the Prodkey column and starts at position 12, and the field after that maps to the Custkey column and starts at position 17.
load data
inputfile ’orders.txt’
recordlen 39
modify
discardfile ’orders_discards’
discards 1
into table orders(
perkey position (4) integer external (5),
prodkey position (12) integer external (2),
custkey position (17) integer external (2),
invoice sequence (1000,1)
);

The field specification for Invoice does not contain a Position clause. The
TMU generates values to store in the Invoice column. Because the values do
not exist in the input file, a Position clause is not relevant.

The following sample data comes from the orders.txt file. The dashes (-)
represent spaces. Actual data would contain spaces.

Perkey     Prodkey  Custkey
---10045---12---56
---10046---13---57
---10047---14---58

Example: XML Data and Corresponding Control File
The following XML document can be used as the input for a TMU load in XML
format:
<?xml version="1.0"?>
<aromaproducts>
<coffee>
<product>
<ID classkey="12" prodkey="68"/>
<name>Aroma 2002 shirt </name>
<package>No_pkg </package>
</product>
</coffee>
</aromaproducts>

The following control file (product_xml.tmu), using the XML document above (product.xml) as its input file, will load one row into the Product table:
LOAD DATA INPUTFILE ’product.xml’
APPEND
FORMAT XML
NLS_LOCALE ’English_UnitedStates.US-ASCII@Binary’
INTO TABLE PRODUCT
(
classkey /aromaproducts/coffee/product/ID/@classkey integer
external(5),
prodkey /aromaproducts/coffee/product/ID/@prodkey integer
external(5),
prod_name /aromaproducts/coffee/product/name/#PCDATA char(30),
pkg_type /aromaproducts/coffee/product/package/#PCDATA char(30)
);

The following TMU and RISQL output shows the results of the load and a
subsequent query against the Product table:
157 brick % rb_tmu product_xml.tmu system manager
(C) Copyright IBM Corp. 1991-2002. All rights reserved.
Version 06.20.0000(0)TST
** STATISTICS ** (500) Time = 00:00:00.00 cp time, 00:00:00.00
time, Logical IO count=0, Blk Reads=0, Blk Writes=0
** INFORMATION ** (366) Loading table PRODUCT.
** INFORMATION ** (8555) Data-loading mode is APPEND.
** INFORMATION ** (9033) Parsing XML input file product.xml.
** INFORMATION ** (9036) XML Parsing Phase: CPU time usage =
00:00:00.00 time.
** INFORMATION ** (9018) Processed 1 rows in this LOAD DATA
operation from XML format input file(s).
** INFORMATION ** (513) Starting merge phase of index building
PRODUCT_PK_IDX.
** INFORMATION ** (513) Starting merge phase of index building
PRODUCT_FK_IDX.
** INFORMATION ** (367) Rows: 1 inserted. 0 updated. 0 discarded.
0 skipped.
** STATISTICS ** (500) Time = 00:00:00.14 cp time, 00:00:01.17
time, Logical IO count=55, Blk Reads=0, Blk Writes=1

RISQL> select * from product where prodkey = 68;

CLASSKEY   PRODKEY   PROD_NAME          PKG_TYPE
      12        68   Aroma 2002 shirt   No_pkg

For more details about loading tables from XML input files, see page 3-129.

Example: NULLIF
The following example shows a NULLIF condition on the destination column,
City. If a value starting at position 3 and ending at position 5 in market.txt is
San, the TMU stores a null indicator in the City column.

load data
inputfile ’market.txt’
discardfile ’market_discards’
into table market(
mktkey integer external (2),
city char (20) nullif (3:5) = ’San’
);
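
The same condition can be written with a hexadecimal string, which is useful when the comparison value contains nonprintable bytes. This variant is a sketch, assuming an ASCII code set, in which x'53616E' encodes the characters San:
load data
inputfile 'market.txt'
discardfile 'market_discards'
into table market(
mktkey integer external (2),
city char (20) nullif (3:5) = x'53616E'
);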

Example: Auto Aggregate
In the following example, the TMU adds the dollar values in the input record
to the existing values in the corresponding row of the Sales table. Because
UPDATE AGGREGATE is specified, new rows are not added to the table. Each
record must have a primary-key value that is already present in the Sales
table. An aggregate mode must be specified because ADD is part of the Auto
Aggregate mode.
load data
inputfile ’sales.txt’
update aggregate
discardfile ’sales_discards’
into table sales(
perkey integer external (5),
prodkey integer external (2),
mktkey integer external (2),
dollars decimal external (7,2) add
);

Example: ROUND Function
The following example shows the effect of the ROUND function on floating-point input fields.

Input Field Value   Column Type      Rounded Value Loaded into Table

1.5                 INT              2
-1.5                INT              -2
123.45612           DEC, scale = 2   123.46
123.45612           DEC, scale = 3   123.456

Concatenated Fields
A concatenated field specifies the concatenation of fields in the input record
that is loaded into the column.

concat_field (back to table_clause, p. 3-65):

CONCAT ( concat_arg_spec , concat_arg_spec [ , concat_arg_spec ... ] )

concat_arg_spec (back to concat_field, p. 3-81):

column_name
$pseudocolumn
'character_string'
LTRIM ( column_name | $pseudocolumn | 'character_string' )
RTRIM ( column_name | $pseudocolumn | 'character_string' )
TRIM ( column_name | $pseudocolumn | 'character_string' [ , BOTH | LEFT | RIGHT ] )
column_name, $pseudocolumn, character_string	Input fields to concatenate. Definitions for these input fields (except character strings) must precede the concatenated field definition. No forward references are allowed.

	If any of the fields to concatenate contains NULL for a given row, then the result of the concatenation is NULL.

	You must specify character strings by using the database-locale code set.

LTRIM, RTRIM, TRIM	Remove preceding blanks, trailing blanks, or both, respectively, before concatenating the input fields.

Example: Concatenated Fields
In the following example, two fields are concatenated and stored in a column.
All of the fields in the product.txt file are loaded into the Product table. The
values in the Aroma and Acid fields in product.txt are stored in the Aroma
and Acid columns and are also joined in the Body column with the
ampersand character (&) used as a separator. The LTRIM option removes
preceding blanks.
load data
inputfile ’product.txt’
replace
format separated by ’:’
discardfile ’product_discards’
discards 100
into table product (
prodkey integer external (2),
product char (12),
aroma char (8),
acid char (7),
body
concat (ltrim (aroma), ’&’, ltrim (acid))
);

Example: Concatenated Fields and Pseudocolumns

The following example shows the use of concatenated fields and pseudocolumns. Several fields are read and stored in the table, but the Str field is not stored as a separate column. Instead, it is saved as a pseudocolumn for use in a concatenated column. The concatenated column named Pr_fr_st_pc in the Nba_basic table holds the string formed by concatenating four fields read earlier, including the one stored as a pseudocolumn.
load data
inputfile ’nba_basic’
replace
format separated by ’|’
discardfile ’nba_basic.dsc’
discards 100
into table nba_basic(
fill_1 char,
prd_key char,
frm_key char,
$str char,
pck char,
pr_fr_st_pc
concat(prd_key, frm_key, $str, pck),
mp char,
pcbo_tot int external,
pcbo_rx int external
);

Constant Fields
A constant field specifies a constant value to load into the column. The TMU
generates the value. It does not exist in the input file.

constant_field (back to table_clause, p. 3-65):

CONSTANT NULL
CONSTANT 'character_literal'
CONSTANT float_constant
CONSTANT integer_constant
CONSTANT DATE 'date_literal'
CONSTANT TIME 'time_literal'
CONSTANT TIMESTAMP 'timestamp_literal'
CONSTANT 'alternative_datetime_value'

'character_literal', float_constant, integer_constant, 'date_literal', 'time_literal', 'timestamp_literal', 'alternative_datetime_value'	Value to insert into the specified column. The value supplied must be type-compatible with the designated output column, as defined in "Field-Type Conversions" on page 3-133. Character literals must be specified by using the database-locale code set. Decimal constants must be specified with a decimal radix.

	For information about allowable literal and constant values, refer to the SQL Reference Guide.

Both ANSI SQL-92 datetime data types and the defined alternative datetime
formats are valid. If an ANSI SQL-92 datetime keyword is present, then the
literal that follows the keyword must be ANSI SQL-92 format. If you use an
alternative datetime value with numeric months and a format other than mdy,
you must include a SET DATEFORMAT statement in the TMU control file. For
more information about the TMU SET DATEFORMAT statement, refer to
“Format of Datetime Values” on page 2-33.

Tip: You can also use a simple field and a datetime format mask to specify a non-standard datetime value, as described on page 3-109.

In the following example, the TMU generates the value 999 and stores it in the
Dollars column for each record loaded into the Orders table.
load data
inputfile ’orders.txt’
replace
discardfile ’orders_discards’
discards 10
into table orders(
invoice integer external (5),
perkey integer external (5),
prodkey integer external (2),
custkey integer external (2),
dollars constant 999
);

In the following example, the TMU generates a constant date of March 10,
1998, and constant time and timestamp values and stores them in the
corresponding columns for each record loaded into the Period table.
load data
inputfile ’period.txt’
replace
format separated by ’*’
into table period (
perkey integer external (5),
date_col constant date ’1998-03-10’,
time_col constant time ’03:15:30’,
timestamp_col constant timestamp ’1998-03-10 3:15:30’
);

Sequence Fields
A sequence field specifies a sequentially computed integer value to load into
a numeric column. The TMU generates the numbers. They do not exist in the
input file.

sequence_field (back to table_clause, p. 3-65):

SEQUENCE [ ( start [ , increment ] ) ]

start, Starting value and the value by which to increment. These val-
increment ues can be any negative or positive integer, including 0. The
increment value is applied to each new row, whether that row
is skipped, loaded, or discarded because of an error. The
default value for both start and increment is 1.

Important: Verify that the incremented values do not overflow the range of the data type of the destination column.

In the following example, the TMU automatically generates numbers starting from 1000 and loads them into the Invoice column of the Orders table. The numbers increment by 1 for each new row in the table.
load data
inputfile 'orders.txt'
append
discardfile 'orders_discards'
discards 10
into table orders(
invoice sequence (1000, 1),
perkey integer external (5),
prodkey integer external (2)
);

Increment Fields
An increment field specifies a value to add to the existing column value. The TMU generates the value to be added. It does not exist in the input file.

increment_field (back to table_clause, p. 3-65):

INCREMENT [ ( n ) ]

n	Increment (decrement) amount. It must be a positive or negative numeric constant. The default value of n is 1.

If the value in the specified column is NULL, the result of the increment operation on that row is also NULL.

You must specify either the UPDATE AGGREGATE or the MODIFY AGGREGATE format in the Format clause. The INCREMENT mode is part of the Auto Aggregate mode and must be enabled with a license key.

You cannot use the INCREMENT keyword with pseudocolumns (true pseudocolumns or AS $pseudocolumns).

The following example shows the use of the increment field and the UPDATE
AGGREGATE mode. The TMU updates the Weight column by adding the
value 15 to the values already existing in the column. Because UPDATE
AGGREGATE is specified, each record in the sales.txt file must have a
primary-key value that exists in the Sales table. Otherwise, the record is
discarded. An aggregate mode must be specified because Increment fields
use the Auto Aggregate mode.
load data
inputfile ’sales.txt’
recordlen 32
update aggregate
discardfile ’sales_discards’
discards 1
into table sales(
perkey integer external (5),
prodkey integer external (2),
mktkey integer external (2),
dollars decimal external (7,2),
weight increment (15)
);

Segment Clause
You can use the Segment clause instead of a Table clause to specify a segment
of a table into which to load data. To load data into a single segment, the
following conditions must be met:

■ The segment must be attached to a table.


■ The segment must not be the table’s only segment.
■ The segment must be offline.
■ The table must not have a local index.
■ The table must not have a SERIAL column.
■ The Format clause mode must be APPEND, INSERT, or REPLACE.

■ The AUTOROWGEN feature in the Discard clause must be OFF.

The offline load operation is always done in OPTIMIZE mode, regardless of settings in the Optimize clause or the rbw.config file.

segment_clause (back to LOAD DATA, p. 3-24):

INTO OFFLINE SEGMENT segment_name OF TABLE table_name
    WORKING_SPACE work_segment
    [ ( col_name [ AS $pseudocolumn ] | $pseudocolumn
          { simple_field | concat_field | constant_field | sequence_field | increment_field }
      [ , ... ] ) ]

Important: You must omit the column names, pseudocolumns, and field specifications if and only if the input data is in UNLOAD format, as specified in the Format clause.
INTO OFFLINE SEGMENT segment_name	Offline row data segment into which data is to load. The segment must be attached to the table specified by table_name and must be offline.

	The segment must also meet the conditions specified by the mode in the Format clause. The scope of the mode is limited to the segment being loaded. For example, INSERT fails if the segment is not empty, but does not fail if rows are present in other segments. REPLACE deletes all rows in the specified segment but not rows in any other segment.
OF TABLE table_name Table to which the segment is attached.

WORKING_SPACE work_segment	Unattached segment that contains adequate space to hold control information as the new data is loaded. This segment can be relatively small. In most cases, the following values are adequate:

	INITSIZE, EXTENDSIZE: Default values
	MAXSIZE: 50 or 100 kilobytes

	Use a larger MAXSIZE value under any of the following conditions:

	■ The table to load has many indexes.
	■ The data to load has many duplicates.
	■ The INDEX TEMPSPACE THRESHOLD value is small relative to the amount of data.

col_name, $pseudocolumn, field specifications, AS $pseudocolumn	Same as for Table clause. Defined on page 3-66 and page 3-65.

After data is loaded into the offline segment, the partial indexes built in the
work segment must be synchronized, or merged, with the existing indexes on
the table with a SYNCH OFFLINE SEGMENT operation, as described in
“Writing a SYNCH Statement” on page 3-119.

Example

The following example shows a LOAD DATA statement to load data into an
offline segment, followed by a SYNCH operation to synchronize the offline
segment with the rest of the table.
load data
inputfile ’sales_96_data’
append
discardfile ’discards_sales_96’ discards 3
into offline segment s_1q96 of table sales
working_space work01 (
perkey date (10) ’MM/Y*/d01’,

prodkey integer external (2),


mktkey integer external (2),
dollars integer external (3)
);
synch offline segment s_1q96 with table sales
discardfile ’discards_synch’;

Criteria Clause
The Criteria clause allows you to specify that a comparison be made of each
input record or each row of data. The result of the comparison, true or false,
loads or discards the record, depending on whether the Criteria clause
specifies ACCEPT or REJECT. You can use this clause to ensure that the correct
data is loaded or that rows are not aggregated more than once when a load
operation in an AGGREGATE mode is interrupted.

The Criteria clause can be used for comparisons on numeric, character, and
datetime data-type columns.

The Criteria clause uses the collating sequence and the code set from the
database locale for all processing.

The syntax for criteria_clause on a numeric or datetime column is as follows.

criteria_clause (back to LOAD DATA, p. 3-24):

{ ACCEPT | REJECT }
    { constant | column_name | $pseudocolumn }
    { = | <> | < | > | <= | >= }
    { constant | column_name | $pseudocolumn }

The syntax for criteria_clause on a character column is as follows.

criteria_clause (back to LOAD DATA, p. 3-24):

{ ACCEPT | REJECT }  { column_name | $pseudocolumn }  [ NOT ] LIKE constant  [ ESCAPE 'c' ]

ACCEPT Specifies that each row of input data that meets the com-
parison criteria (that is, it evaluates to TRUE) is loaded
into the table. All others, including those containing
NULL indicators, are discarded.

REJECT Specifies that each row of input data that meets the com-
parison criteria (that is, it evaluates to TRUE) is rejected
and discarded. All others, including those containing
NULL indicators, are loaded.

constant	Any numeric, character, or datetime literal with which each input record or row of data is compared. Character and datetime constants, by definition, must be enclosed in single quotation marks.

	The constant data type must be the same data type as, or compatible with, the data type of the column or pseudocolumn with which it is compared. For example, numeric constants cannot be compared with character or datetime columns.

	Numeric, character, and datetime constants must conform to the definition of literals, as defined in the SQL Reference Guide. (Both ANSI SQL-92 datetime data types and the defined alternative datetime formats are valid.)

	Character literals must be specified in the code set of the database locale. Decimal constants must be specified with a decimal radix.

column_name	References any numeric, character, or datetime column in the specified table. Referencing a column causes the TMU to check existing data in the table before loading a record. For example, based on the following ACCEPT clause, the TMU loads a record only if the corresponding value existing in the Sales column of the table is equal to the constant value of 1000:
accept SALES = 1000

	The column reference can be used only in UPDATE and MODIFY modes. It cannot be used in APPEND, INSERT, or REPLACE modes (because the corresponding row does not yet exist in these modes). When a column reference is used in MODIFY mode, the ACCEPT or REJECT clause is applied only when a record corresponds to an existing row, as identified by the primary key. When a record does not correspond to an existing row, no comparison is done and the record is loaded. In other words, no comparison is performed for insert operations in MODIFY mode.

$pseudocolumn	Refers to the contents of a numeric, character, or datetime field in the input record. Referencing a pseudocolumn causes the TMU to check the data in the input record, first checking for referential integrity and then making the Criteria clause comparison, before loading a record. For example, based on the following REJECT clause, the TMU rejects all records with values in the sales field that are less than 100:
reject $SALES < 100

	A pseudocolumn must refer to a numeric, character, or datetime pseudocolumn defined in the field specification. For more information about pseudocolumns, refer to page 3-66.

LIKE, NOT LIKE	Compares column or field values with a character string. The column or pseudocolumn referenced must be of CHARACTER data type.

	The percent (%) wildcard character matches any character string. The underscore (_) wildcard character matches any one character in a fixed position.

ESCAPE 'c'	The ESCAPE keyword, which can be used only with a LIKE or NOT LIKE comparison, defines a character (c) to serve as an escape character so that the wildcard characters can be treated as character literals rather than control characters. Use the ESCAPE keyword whenever the pattern to match contains a percent or underscore character.

	The escape character must be specified by using the database-locale code set and can be either a single-byte or multibyte character.

Usage
Only one ACCEPT or REJECT Criteria clause can be present in each LOAD
DATA statement.

In determining whether a row meets the comparison criteria, a three-valued logic is used: TRUE, FALSE, and UNKNOWN, with the NULL indicator evaluated as UNKNOWN. This behavior can be confusing. For example, a row containing a NULL indicator in the Dollars column is rejected for both of the following criteria:
accept dollars >= 1000
accept dollars < 1000

When the Criteria clause contains a regular column (not a pseudocolumn) and the load mode is MODIFY or MODIFY AGGREGATE, the load operation automatically switches to OPTIMIZE OFF.

Example: Valid Criteria Clauses
The following examples illustrate valid Criteria clauses that compare column
values to constants. (You can use this kind of Criteria clause only in UPDATE,
MODIFY, or AGGREGATE mode.)

accept DISTRICT = 475
reject BATCH_ID > 100
accept CITY = ’Los Angeles’
accept AUTH = ’Y’
reject SALE_DATE <= date ’1995-10-16’
-- ANSI SQL-92 datetime format
reject SALE_DATE <= ’10-16-1995’
-- Alternative datetime format
accept SALE_TIMESTAMP >= timestamp ’1995-10-16 13:13:13’
-- ANSI SQL-92 datetime format

The following examples illustrate valid Criteria clauses that compare input
field values to constants:
accept $CITY = ’Los Angeles’
accept $TIME_COL >= ’13:13:13’
reject $TIME_COL >= time ’08:35:40’
accept $TIMESTAMP_COL >= timestamp ’1995-10-16 12:13:13’

reject $CITY < 'Los Angeles' /* rejects records where input
value occurs before "Los Angeles" in alpha-sorted list */

Example: LIKE and NOT LIKE
The following examples illustrate the use of the LIKE and NOT LIKE operators in a Criteria clause:
reject zip not like '950%'
-- rejects any zip codes that do not begin with '950'

accept city like 'Hamb_rg'
-- accepts cities like 'Hamburg', 'Hamberg', and so on

reject sales_pct like '%Monthly \%' escape '\'
-- rejects any string that ends with 'Monthly %'

Comment Clause
The Comment clause contains a user-defined text string that describes the
load operation or the data being loaded. This information is then stored in the
RBW_LOADINFO system table to provide a historical record regarding the
loading of data into the specified table. You can retrieve the information by
querying the RBW_LOADINFO system table.

comment_clause (back to LOAD DATA, p. 3-24):

COMMENT 'character_string'

COMMENT 'character_string'	Specifies the comment character string to insert into the RBW_LOADINFO system table in the Comment column for the row that describes this load operation. Up to 256 bytes (plus the single quotation marks) can be included. The character string must be specified using the database-locale code set.

Example

In the following example, a comment is included that describes the source of the data being loaded. This information, together with the other information stored in the RBW_LOADINFO system table, provides a useful record of load activity on the Sales table.
load data
inputfile ’sales.txt’
update aggregate
discardfile ’sales_discards’
into table SALES(
perkey integer external (5),
prodkey integer external (2),
mktkey integer external (2),
dollars decimal external (7,2),
weight integer external (3) add
)
comment ’East coast, Q2-96,input file sales.txt’;

To check load activity on the Sales table, for example, to verify that a specific
batch of data was loaded, you can query the RBW_LOADINFO system table
as follows:
select *
from RBW_LOADINFO
where tname = ’SALES’
order by started;

The information returned in the following example is ordered by the timestamp at the start of the load operation (the STARTED column) and includes the comment entered in the LOAD DATA statement, as well as other information for load operations on the Sales table.

The RBW_LOADINFO system table contains a row for each of the last
256 LOAD DATA operations. To retrieve data in any specific order, include an
ORDER BY clause in the SELECT statement.

Field Types
A field type specifies the data type of the input data in a simple field, as
described on page 3-71. The TMU converts this data type into the data type
defined for the column in the CREATE TABLE statement. The two data types
must be compatible, as defined on page 3-133.

field_type (part 1 of 2; back to simple_field, p. 3-71):

Character fields (p. 3-99):
    CHARACTER | CHAR  [ ( length ) ]  [ SUBSTR ( start , num ) ]
    VARLEN [ EXTERNAL ]

Numeric external fields (p. 3-101):
    INTEGER EXTERNAL | INT EXTERNAL  [ ( length [ , scale ] ) ]
    DECIMAL EXTERNAL | DEC EXTERNAL  [ ( length [ , scale ] ) ]  [ RADIX POINT 'c' ]

Floating-point external fields (p. 3-103):
    FLOAT EXTERNAL  [ ( length ) ]

Packed and zoned decimal fields (p. 3-104):
    DECIMAL | DEC  [ PACKED | ZONED ]  [ ( length [ , scale ] ) ]  [ restricted_date_spec (p. 3-116) ]

Integer binary fields:
    INTEGER | INT | SMALLINT | TINYINT  [ ( scale ) ]  [ restricted_date_spec (p. 3-116) ]

Floating-point binary fields (p. 3-106):
    REAL
    DOUBLE PRECISION

field_type (part 2 of 2; back to simple_field, p. 3-71):

Date fields (p. 3-107):
    DATE  [ ( length ) ]  [ 'date_mask' (p. 3-116) ]
    CURRENT_DATE

Time fields (p. 3-107):
    TIME  [ ( length ) ]  [ 'time_mask' (p. 3-116) ]
    CURRENT_TIME

Time-stamp fields (p. 3-107):
    TIMESTAMP  [ ( length ) ]  [ 'timestamp_mask' (p. 3-116) ]
    CURRENT_TIMESTAMP

Metaphor-date fields (p. 3-107):
    M4DATE m4date_mask  [ ( length ) ]

Each field type is defined in the following sections, with examples of each
type.

Character Field Type
CHARACTER, CHAR	Identifies a character string. A CHARACTER field can contain any character in the computer code set.

	The total length (in bytes) is specified by length. If the length of a target character column exceeds the length of the source field in the input record, then the input characters are left-justified with spaces to fill the excess. If the length of a target character column is less than the length of the source field, then the source field is truncated. If the length is not specified, the column definition in the CREATE TABLE statement determines the length.

	For character field types in XML format, the length must be specified.

VARLEN, VARLEN EXTERNAL	Used for loading CHAR and VARCHAR column types, the VARLEN and VARLEN EXTERNAL field types identify the length of a character data section. VARLEN and VARLEN EXTERNAL field types are not supported for loads in XML format.

	VARLEN and VARLEN EXTERNAL are used only with variable-format records. Both VARLEN and VARLEN EXTERNAL are a fixed length, and their positions can be described by using POSITION keywords or field-length specifications. The output column for VARLEN and VARLEN EXTERNAL is a real column in either the VARCHAR or CHARACTER data type, or a pseudocolumn.

	VARLEN is the binary form of the data-section length in the variable-length part of a variable-format record. The maximum length allowed for VARLEN is 2.

	VARLEN EXTERNAL is the ASCII/EBCDIC external form of the data-section length in the variable-length part of a variable-format record. The maximum length allowed for VARLEN EXTERNAL is 8.

	After the data-section length is extracted from the variable-length part, operations on VARLEN and VARLEN EXTERNAL fields are actually performed on the character data section. The TRIM, LTRIM, RTRIM, and SUBSTR functions are all applied to the data section. When used together with a pseudocolumn, the character data fills the pseudocolumn to the exact length of the data section.

	When multiple VARLEN or VARLEN EXTERNAL fields are present in the TMU control file, the order of data sections that appear in the variable-length part should be the same as the order of the length fields. The order in which VARLEN or VARLEN EXTERNAL fields appear in the control file is not relevant. In the case of two VARLEN or VARLEN EXTERNAL fields with the same starting position, the length of the two fields should be the same. If the length is different, the TMU returns an error.

SUBSTR (start, num)	Indicates that only a subset of a character field is loaded and specifies the starting position and the number of characters, counted in characters, not bytes. (The POSITION keyword specification uses bytes.) This function is intended for use with multibyte code sets. Data loaded in the table column can be unpredictable if the substring start position is not 0, or the substring length is greater than the table-column length.

length	Number of bytes in the input field. With fixed-format and variable-format input data for the CHARACTER field type, if no length is specified with the POSITION keyword or in the field-type description, the column width defined for the table is used.

Tip: TMU performance is better if you can load substrings of fixed-format character data with the POSITION keyword rather than the SUBSTR function. With multibyte characters, however, you must use the SUBSTR function to extract a string because POSITION is byte-based whereas SUBSTR is character-based.
char (10)
character (24)
char (24) substr (1, 5)

The following example shows the use of the SUBSTR keyword in a LOAD DATA statement to load partial character strings into Col2 of the Sales table. The numbers in parentheses define the starting character position and the number of characters in the substring.
load data
...
into table sales (
col1 decimal external radix point ',',
col2 char substr(1,5)
);

For example, if the input data to Col2 is the string California, only the
substring Calif is loaded into the column.
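
Because VARLEN and VARLEN EXTERNAL apply only to variable-format records, no example appears above; the following sketch suggests how such a field might be declared. The file name, table, and the FORMAT VARIABLE clause shown here are assumptions for illustration, not a tested statement; see the Format clause documentation for the exact variable-format syntax.
load data
inputfile 'notes.var'
append
format variable
into table notes(
note_id position (1) integer external (6),
comments position (7) varlen external (4)
);
In this sketch, the fixed part of each record carries a 4-byte external length field at position 7, and the character data section that it describes is appended in the variable-length part of the record.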

Numeric External Field Types
INTEGER EXTERNAL, INT EXTERNAL	String of characters representing a number in [±]digits format. These numbers cannot exceed 38 digits. Use this field type when loading a SERIAL column.

DECIMAL EXTERNAL, DEC EXTERNAL	String of characters representing a decimal number in the following format (back to field_type, p. 3-97):

	[ + | - ]  [ digit ... ]  [ . [ digit ... ] ]

length	Number of bytes in the input field. For numeric external field types in fixed-format input, a length must be specified with either the POSITION keyword or the length parameter. For numeric external field types in XML format, the length must be specified with the length parameter.

	Leading and trailing blanks are allowed in a data input file.

scale	Number of digits to the right of the decimal (radix) point. If a scale factor is not specified, the default value is zero. The scaled input is then converted to a value of the target data type.

	If the target data type is DECIMAL or NUMERIC, the decimal point of the input data (specified or default) is aligned with the decimal (radix) point of the target column (as specified by scale in the column definition). If the precision of the integer part of the scaled input value exceeds the precision of the target column, an overflow error occurs and the record is discarded. If the fractional part of the scaled input value exceeds the precision of the target data type fraction and the excess least significant digits of the input data are non-zero, then a truncation error occurs and the record is discarded.

RADIX POINT 'c'	Any single-byte or multibyte character that is used to indicate the radix in numeric data. If the input records contain separated data, the radix character must be different from the character specified as the separator character in the Format clause. If no radix point is specified, the default radix of the input locale is used.

Warning: If you specify the radix character, you must specify it by using the database-locale code set. If the character used as a radix in the input data cannot be expressed as a character in the database, then the input data cannot be interpreted correctly.

Floating-Point External Field Type
FLOAT EXTERNAL	String of characters representing a floating-point number in the following format (back to field_type, p. 3-97):

	Mantissa (fraction):  [ + | - ]  [ digit ... ]  [ . [ digit ... ] ]
	Exponent (optional):  { E | e }  [ + | - ]  digit ...

length	Total number of bytes in the input field. With fixed-format input data for FLOAT EXTERNAL field types, a length must be specified with either the POSITION keyword or the length parameter. For FLOAT EXTERNAL field types in XML format, the length must be specified with the length parameter.

Example
int external --length specified by POSITION clause
integer external (8)
decimal external --length specified by POSITION clause
decimal external (5)
decimal external (5,2)
float external --length specified by POSITION clause
float external (8)

If the input records are in separated format, the length can be determined
implicitly.

Packed and Zoned Decimal Field Types
DECIMAL, DEC, DECIMAL PACKED	Decimal numbers in packed format. These numbers cannot exceed 38 digits.

	If present, the length value specifies the number of bytes in the input record field, which yields a numeric value with a precision of (2 × length) − 1. Each digit occupies .5 byte, with .5 byte reserved to indicate the sign (+ or −) of the value, so the maximum length is 20 bytes.

DECIMAL ZONED	Decimal numbers in an IBM-zoned decimal representation. This format is supported only with the FORMAT IBM clause. These numbers cannot exceed 38 digits.

	If present, the length value specifies the number of bytes in the input record field, which corresponds to the number of numeric digits (precision) in the input value. Each digit occupies one byte, so the maximum length is 38 bytes.

length	For packed and zoned decimal field types in fixed-format input, the length must be specified with either the POSITION keyword or the length parameter. If decimal field types are loaded in XML format, the length parameter is required.

scale	Number of digits to the right of the decimal (radix) point. If a scale factor is not specified, the default value is zero. The scaled input is then converted to a value of the target data type.

restricted_date_spec	Provides a restricted datetime format mask that allows packed or zoned decimal input data to load into datetime columns. For more information about the restricted datetime masks, refer to "Restricted Datetime Masks for Numeric Fields" on page 3-116.

Examples
decimal -- packed; length specified by POSITION clause
dec -- packed; length specified by POSITION clause
decimal packed -- length specified by POSITION clause
dec packed (5,2)
decimal zoned -- length specified by POSITION clause
decimal zoned (5,2)
decimal zoned (5)
decimal packed (8) date 'YYYYMMDD'

If the TMU LOAD DATA script references a packed decimal-input field type of DECIMAL PACKED (6,3) for conversion to a database data type of DECIMAL (5,2), the following conversions or errors occur.

Input Value    Result
473220         473.22
819077         Truncation error. Record discarded.
2323478320     Overflow error. Record discarded.

Integer Binary Field Types
For integer binary field types, the length is implied by the field type.

INTEGER, INT	Four-byte binary integer in the two's-complement representation. Use this field type when loading a SERIAL column.

SMALLINT	Two-byte binary integer in the two's-complement representation.

TINYINT	One-byte binary integer in the two's-complement representation.

scale	Optional. Number of digits to the right of the decimal (radix) point. If a scale factor is not specified, the default value is zero. The scaled input is then converted to a value of the target data type.

	If the target data type is INTEGER, TINYINT, or SMALLINT, all digits to the right of the decimal (radix) point must be zero. Otherwise, a truncation error occurs and the record is discarded.

restricted_date_spec	Provides a restricted datetime format mask that allows integer binary-input data to load into datetime columns. For more information about the restricted datetime masks, refer to "Restricted Datetime Masks for Numeric Fields" on page 3-116.

Example

If the TMU LOAD DATA script references a binary-integer input field type of INT (3) for conversion to a database data type of DECIMAL (5,2), the following conversions or errors occur.

Input Value    Result
473220         473.22
819077         Truncation error. Record discarded.
2123478320     Overflow error. Record discarded.

Floating-Point Binary Field Types
The REAL and DOUBLE PRECISION field types are not supported if the FORMAT IBM keywords are included in the Format clause.

REAL	Four-byte floating-point number.

DOUBLE PRECISION	Eight-byte floating-point number.
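
No example appears in this section, so the following is a minimal sketch of how these field types might be declared; the table, column, and file names are hypothetical:
load data
inputfile 'metrics.dat'
append
into table metrics(
sensor_id integer,
temperature real,
pressure double precision
);
Because the length of a binary field is implied by its type, no length specification is needed for REAL or DOUBLE PRECISION.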

Datetime Field Types
DATE, TIME, TIMESTAMP	Character data that is processed and stored as date, time, and time-stamp information. For datetime field types in fixed-format input, the length must be specified either with the POSITION keyword or with the length parameter.

	Unlike other field types, DATE, TIME, and TIMESTAMP fields are each composed of subfields. The date_mask, time_mask, and timestamp_mask elements represent format masks that must be defined in the field specification to specify which subfields are used and their order and length. For more information about format masks, refer to "Format Masks for Datetime Fields" on page 3-109.

	The following table defines the allowable subfields for each datetime field type. Required subfields for each field type are indicated in bold.

Field Type   Allowable Subfields

DATE         year and Julian date; or year, month, date
TIME         hour, minute, second, fractional second
TIMESTAMP    year, Julian date, hour, minute, second, fractional second; or year, month, date, hour, minute, second, fractional second

CURRENT_DATE	Value to use is the date at the time the row is actually loaded. Remember that the date value changes at midnight.

CURRENT_TIME	Value to use is the time at which the row is loaded.

CURRENT_TIMESTAMP	Value to use is the value of CURRENT_TIME and CURRENT_DATE at the time the row is loaded.

M4DATE	String of characters representing a date in the format specified by m4date_mask:
M4DATE m4date_mask
M4DATE m4date_mask (length)

	The length must be specified with the POSITION keyword or with the length parameter, which specifies the total length in bytes.

	All formats defined for the M4DATE field type can be represented by allowable format masks for datetime literals, which means input data containing dates in an M4DATE format can be loaded into a DATE column without any modification of the input data. Because the datetime scalar functions cannot be used on the integer columns loaded from M4DATE fields, IBM recommends that you not use M4DATE fields for new databases.

The TMU converts the M4DATE string into the Metaphor DIS date format and stores it as an integer when it loads a table. The format must be one of those listed in the following table.

Format             Example (April 10, 1996)

YYJJJ or YYYYJJJ   96/100 or 1996/100
YYMD or YYYYMD     960410 or 1996/4/10
MDYY or MDYYYY     4/10/96 or 04101996
DMYY or DMYYYY     10/4/96 or 10041996

D One or two digits specifying the day of the month.

JJJ Three digits specifying day of the year, in Julian format.

M One or two digits specifying the month of the year.

YY and YYYY Two and four digits respectively specifying a year. A two-
digit year nn is interpreted as 19nn.

The day, month, and year fields can be contiguous or can be separated by a
single blank, slash (/), hyphen (-), period (.), or comma (,). If the fields are
contiguous, then each day and month representation must contain two
digits.
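
As a minimal sketch (the file and table names are hypothetical), an M4DATE field using the MDYYYY format from the preceding table might be declared as follows:
load data
inputfile 'period_m4.txt'
append
into table period(
perkey integer external (5),
date_col m4date MDYYYY (8)
);
Here the 8-byte input value 04101996 would be interpreted as April 10, 1996, and stored in the Metaphor DIS integer date format.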

Format Masks for Datetime Fields
A format mask for a datetime field type is created by concatenating allowable
subfield format specifiers, using either a fixed- or variable-length subfield or
a combination of both. A DATE format mask composed of month, day, and
year subfields might look like one of those listed in the following table.

Format         Definition

'MMDDYYYY'     DATE format mask, fixed-format input
'M*/D*/Y*'     DATE format mask, variable-format input
'MM D* YYYY'   DATE format mask, both fixed and variable
'm8d16y1997'   DATE format mask, constant date (Aug 16, 1997). Date constants can also be defined with a CONSTANT field specification, as described on page 3-84.

Examples
date (8) ’MMDDYYYY’
date (8) ’DDMMYYYY’
date ’y1996m8d17’

Subfield Components
The following table defines each subfield component, its default value, and its specifier in the format mask. Examples for fixed- and variable-length subfields are provided in the sections that follow.

Subfield Component   Default Value   Mask Specifier   Subfield Range and Interpretation

Year   Required subfield. No default. Must be specified.
       Y              1 to 9999. Number of Ys specify number of digits to read.
       y?Y* or y?YY   00 to 49 imply 2000 to 2049. 50 to 99 imply 1950-1999.
       ynY* or ynYY   Century is fixed: 1 to 99, specified by n. Year expressed by 1 or 2 digits.
       yn             Year is fixed: 1 to 9999, specified by n.

Julian day (day of year)
       J              1 to 366. Number of Js specify number of digits to read. January 1 is 1.
       jn             Day is fixed: 1 to 366, specified by n. Date value stored is adjusted for leap years.

Month  Default value: 1
       M              1 to 12. Number of Ms specify number of digits to read.
       mn             Month is fixed: 1 to 12, specified by n.
       Mon            *Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec
       Month          *January, February, March, April, May, June, July, August, September, October, November, December

Day    Default value: 1
       D              1 to 31. Number of Ds specify number of digits to read. Further constrained by month and year according to rules in the Gregorian calendar.
       dn             Day is fixed: 1 to 31, specified by n.

Hour   Required subfield. No default value. Must be specified.
       H              0 to 23. Number of Hs specify number of digits to read.
       A or AM        Optional AM and PM subfield specifier (not case sensitive). A implies data contains either A or P. AM implies data contains AM or PM. 12-hour times are converted to 24-hour times.
       hn             Hour is fixed: 0 to 23, specified by n.

Minute Default value: 0
       I              0 to 59. Number of Is specify number of digits to read.
       in             Minute is fixed: 0 to 59, as specified by n.

Second Default value: 0
       S              0 to 59. Number of Ss specify number of digits to read.
       sn             Second is fixed: 0 to 59, as specified by n.

Second fraction  Default value: 0
       F              0 to 999999. Number of Fs specify number of digits to read, as well as the scaling factor of this component, which is 10^(-number of Fs). For example, F indicates tenths of a second, FF indicates hundredths of a second.
       fn             Second fraction is fixed and specified in microseconds: 0 to 999999, specified by n.

Interpretation of the Month, Mon, and AM subfields is locale-specific. Interpretation of the A subfield is not locale-specific.

In a fixed-length subfield, the last letter of the subfield mask character repeats once for each character of the input. Fixed-length subfields cannot contain blanks.

In a variable-length subfield, the subfield specifier is followed by an asterisk (*). For Mon* and Month*, enough characters are processed to complete the month name. In the case of the numeric subfields, characters are processed until the first non-digit character. Leading blanks are ignored. After the specified number of characters are processed, zero or more digits can follow.

You can use an underscore (_) to indicate the end of the subfield and that the
next character should be ignored. You can also use the underscore as a
wildcard to skip bytes. You must repeat it for each byte to skip.

Regardless of the format you specify for the subfields, the number of bytes
processed is limited by the length parameter for the datetime field.

Examples of Subfield Masks

Format for Fixed-Length Subfields

YYYY	Indicates 4 digits for year.

MMDDYYYY	Indicates 2 digits each for month and day and 4 digits for year.

_ _ _ _Mon	Indicates that the 4 bytes (represented by 4 underscores) preceding the 3-character month should be ignored.

Format for Variable-Length Subfields

D*/M*/YYYY	Indicates 1 or more digits for day and month, 4 digits for year, subfields separated by a slash (first non-digit character).

Mon d1 y?Y*	Indicates short month, day fixed at 01, 1- or 2-digit year, subfields separated by spaces.

The following mask combines fixed- and variable-length subfields; the numbers beneath each subfield correspond to the explanations that follow.

'Month* D*, YY*_HH:II:SS.FFFF*'
  (1)   (2)  (3)  (4) (5) (6) (7)

(1) Skip any blanks, read full-month name, and check for a blank.

(2) Skip any blanks, read one or more digits for date, and check for a comma.

(3) Skip any blanks and read two or more digits for year. Ignore the non-digit
character following the year.

(4) Read two digits for hours. No white space or other non-digits allowed.
Check for a colon.

(5) Read two digits for minutes. No white space or other non-digits allowed.
Check for a colon.

(6) Read two digits for seconds. No white space or other non-digits allowed.
Check for a period.

(7) Skip any blanks, read four or more digits for fractional seconds, and
ignore anything that follows.

Format Masks to Read Input Fields
The following table contains some types of input fields and suggests masks
to read them.

To read these input fields        Use these format masks

June 29                           'Month D* y1996'
May 14
April 1

June 29 49                        'Month D* y?Y*'  (years: 2049, 1996, 2001)
May 14 96                         'Month D* Y*'    (years: 0049, 0096, 0001)
April 1 1                         'Month D* y19Y*' (years: 1949, 1996, 1901)

Tue Jun 4 16:50:49 PDT 1996       For timestamp:
Tue Jun 25 16:50:49 PDT 1996      '_ _ _ _Mon D* HH:II:SS_ _ _ _ _YYYY'
                                  For date:
                                  '_ _ _ _Mon D* _ _ _ _ _ _ _ _ _ _ _ _ _YYYY'

Tue Jun 4 1996                    '_ _ _ _Mon* D* YYYY'
Sat Jun 15 1996

Tue Jun 04 16:50:49 PDT 1996      For timestamp:
                                  '_ _ _ _Mon DD HH:II:SS_ _ _ _ _YYYY'
                                  For time:
                                  '_ _ _ _ _ _ _ _ _ _ _HH:II:SS_ _ _ _ _ _ _ _ _'

Jun 29, 1996                      'Mon D*,_YYYY'
Jun 7, 1996                       or 'Mon D*,Y*'

Jun 29 1996                       'Mon D* YYYY*'
Jun 9 1996
Jun 2 1996

01/15/96 08:26 AM                 'MM/DD/y?Y* HH:II AM'
(stored as 1996-01-15 08:26:00)
11/15/01 06:15 pm
(stored as 2001-11-15 18:15:00)

1995/060 (stored as 1995-03-01)   'YYYY/J*' or 'YYYY/JJJ'
1996/060 (stored as 1996-02-29)
1991/366 (rejected)
1980/366 (stored as 1980-12-31)

[multibyte datetime input]        'YYYY_ _MM_ _DD_ _' (where the multibyte characters for year, month, and day are skipped to produce 01-06-1998, January 6, 1998)

Example: Loading Datetime Data
The following example shows how data in various formats is loaded into
DATETIME columns.

A table named Datetime is defined as follows:


create table datetime (
d1 date,
d2 date,
d3 date,
ts1 timestamp(0),
t1 time,
ts2 timestamp(0))

The data for the Datetime table is in a file named datetime_inputs with fields
separated by an asterisk (*). The first three records of input data in
datetime_inputs look like the following example.

96/12/25*December 25 96*07042359*11:59:00PM*Tue Jun 29 16:40:55 PDT 1996
06/12/25*December 25 6*07042359*11:59:00AM*Tue Jun 29 16:40:55 PDT 1996
6/12/25*December 25 6*07042359*12:59:00AM*Tue Jun 29 16:40:55 PDT 1996

(to d1)   (to d2)         (to ts1)   (to t1)      (to ts2)

A LOAD DATA statement to load this data follows. The current date is loaded
into column D3.
load data
inputfile ’datetime_inputs’
replace
format separated by ’*’
into table datetime (
d1 date ’y19Y*/M*/D*’, -- Date subfields separated by /
d2 date ’Month D* y?Y*’,-- Date subfields separated by space
d3 current_date, -- Rows loaded with date at time of load
ts1 timestamp(8) ’y1996MMDDHHII’, -- Fixed format mask
t1 time ’HH:II:SSAM’, -- Time subfields separated by :
ts2 timestamp ’_ _ _ _Mon DD HH:II:SS_ _ _ _ _Y*’
-- First 4 characters and 5 characters
-- between S and Y are ignored
);

If the data is loaded on July 1, 1996, the information stored in the Datetime
table looks like the following example.

d1          d2          d3          ts1                  t1        ts2

1996-12-25  1996-12-25  1996-07-01  1996-07-04 23:59:00  23:59:00  1996-06-29 16:40:55
1906-12-25  2006-12-25  1996-07-01  1996-07-04 23:59:00  11:59:00  1996-06-29 16:40:55
1906-12-25  2006-12-25  1996-07-01  1996-07-04 23:59:00  00:59:00  1996-06-29 16:40:55

Suppose you wanted to load a specific date (Aug. 16, 1996) instead of the date in the input record into the D2 column. The LOAD DATA statement to load this data uses a constant field with a date value as follows:
load data
inputfile ’datetime_inputs’
replace
format separated by ’|’
into table datetime (
d1 date ’y19Y*/M*/D*’, -- Date subfields separated by /
d2 date ’1996-08-16’, -- Rows loaded with 1996-08-16
d3 current_date, -- Rows loaded with date at time of load
ts1 timestamp(8) ’y1996MMDDHHII’, -- Fixed format mask
t1 time ’HH:II:SSAM’, -- Time subfields separated by :
ts2 timestamp ’_ _ _ _Mon DD HH:II:SS_ _ _ _ _Y*’
-- First 4 characters and 5 characters
-- between S and Y are ignored
);

If the data is loaded on July 1, 1996, the information stored in the Datetime
table looks like the following example.

d1          d2          d3          ts1                  t1        ts2

1996-12-25  1996-08-16  1996-07-01  1996-07-04 23:59:00  23:59:00  1996-06-29 16:40:55
1906-12-25  1996-08-16  1996-07-01  1996-07-04 23:59:00  11:59:00  1996-06-29 16:40:55
1906-12-25  1996-08-16  1996-07-01  1996-07-04 23:59:00  00:59:00  1996-06-29 16:40:55

Restricted Datetime Masks for Numeric Fields
The TMU can load binary integer or packed or zoned decimal input data into
datetime columns when the input fields are described by a restricted
datetime format mask. For example, if an input record contains a date value
for February 14, 1998, represented as 19980214 in a packed decimal field, the
TMU can extract the date from the input field and store it in a DATE column.

restricted_date_spec (back to field_type, p. 3-97):

DATE 'restricted_date_mask'
TIME 'restricted_time_mask'
TIMESTAMP 'restricted_timestamp_mask'

restricted_date_mask, restricted_time_mask, restricted_timestamp_mask	Subfields present in the input fields. These masks are defined like the masks for the DATE, TIME, and TIMESTAMP input fields on page 3-109, with the following restrictions (because the inputs are numeric and fixed-format):

■ No separators (slashes) are allowed.
■ No variable-width fields are allowed: the mask cannot contain asterisks (*).
■ No alphabetic representations are allowed: the mask cannot contain Mon, Month, or AM or PM specifiers.
■ A mask for a decimal input field must represent exactly the number of digits that can occur in decimal input.
■ A mask for an integer input field cannot represent more digits than can occur in the number of bytes represented by the field type:

Integer Field Type    Maximum Digits in Mask

INTEGER (4 bytes)     10
SMALLINT (2 bytes)    5
TINYINT (1 byte)      3

You can use the underscore (_) character in the mask to indicate that an input digit should be ignored.

Important: Scale values are ignored when restricted date masks are used. A length value, from a length argument or a Position clause, is required with packed or zoned decimal field types and must be consistent with the format mask.

Requirements for Input Data for Datetime Masks
The input data must meet the following requirements:

■ No negative numbers are allowed. If the field contains a negative number, the row is discarded.
■ For packed or zoned decimal input, the number of digits present in the input data must be the same as the number of digits indicated by the mask. If they are not the same, an error occurs and the TMU does not begin the load operation.
■ For binary integer input:
  ❑ If the input data contains more digits than indicated by the mask, the row is discarded.
  ❑ If the input data contains fewer digits than indicated by the mask, the input is considered to have leading zeroes. For example, a mask of YYYYMMDD used for input data 980101 stores the data as 00980101.

Examples

The following examples illustrate valid and invalid restricted masks.

Restricted Date Mask   Examples     Comments

YYYYMMDD               19980214     Valid: 4 digits of year, 2 digits each for month and day.

y1998MMDD              0214,        Valid: Year fixed at 1998, followed by 2 digits
                       00000214     each of month and day.

y?YYJJJ                98046        Valid: Imply century from 2-digit year, followed by 3 digits of Julian date.

y?Y*                   –            Invalid because it contains a variable-length year field.

MM/DD/YYYY             –            Invalid because it contains separators (/).

The following example shows how to use the restricted date mask in a LOAD
DATA statement. The Period table contains a DATE data-type column named
Date_Col. The following LOAD DATA statement loads input records that
contain date information stored as binary integers into the Date_Col column
in the Period table:
load data
inputfile ’aroma_period.txt’
replace
discardfile ’aroma_discards’
discards 1
into table period (
perkey integer external (4),
month char (4),
year integer external (4),
quarter integer external (1),
tri integer external (10),
date_col integer date ’YYYYMMDD’
) ;

In the preceding example, the input records contain the date information in the format 'YYYYMMDD' (for example, 19971225) stored as a binary integer. The TMU extracts the date information from the binary input and stores it as a DATE data type in the Date_Col column.

Writing a SYNCH Statement
If data is loaded into an offline segment, you must complete the load
operation by synchronizing the segment with the table and its indexes before
the segment can be brought online and made available for use.
Synchronization is necessary only for offline load operations. If the segment
into which data was loaded was online at the time of the load,
synchronization is not necessary.
Important: The SYNCH operation acquires an exclusive lock on the target table, but
this operation is much quicker than an online load of the table.

To perform this synchronization, run the TMU with a control file that contains
a SYNCH OFFLINE SEGMENT statement. You can include this statement in the
same control file as the LOAD DATA statement or you can put it in a separate
control file. At the end of the synchronization operation, the work segment
used for the offline load is detached from the table and is available for reuse.


If you decide, after loading data into the segment, that you want to remove
the newly loaded data rather than incorporate it into the table, you have two
choices:

■ Delete all the data in the segment with the ALTER SEGMENT…CLEAR
statement. This choice is appropriate if the segment was empty or if
you do not want the data that was in the segment before the load
operation.
■ Delete only the newly loaded data with the UNDO LOAD option to
the SYNCH SEGMENT statement. This choice is useful for segments
that contained data you want to preserve before the offline operation
was performed.

SYNCH OFFLINE SEGMENT segment_name WITH TABLE table_name
    [ DISCARDFILE 'discard_filename' ]
    [ UNDO LOAD ] ;

segment_name Offline segment that contains newly loaded data not yet
synchronized with the owning table and its indexes.

table_name Owning table of segment_name.


DISCARDFILE          File to which all duplicate rows are discarded. This ASCII
discard_filename     file contains rows in the same format as those rows
                     discarded during an optimized load or an UNLOAD
                     EXTERNAL operation on a table. For more information
                     about discard files, refer to page 3-60.

                     If this clause is omitted, no file is written.

UNDO LOAD            Synchronizes the segment with the table and its indexes
                     by deleting all the rows that were added to the segment by
                     the offline load. This operation is useful when you discover
                     you loaded the wrong data or when many rows are
                     discarded unexpectedly and you want to start over. It
                     removes all evidence of the previous offline load operation,
                     leaving intact the rows that were in the segment before the
                     offline load.

                     The UNDO LOAD option does not work in REPLACE mode.

Example

The following example shows a control file that contains a LOAD DATA
statement and a SYNCH SEGMENT statement to synchronize the newly
loaded offline segment with the rest of the table.
load data
inputfile ’sales_96_data’
append
discardfile ’discards_sales_96’ discards 3
into offline segment s_1q96 of table sales
working_space work01 (
perkey date (10) ’MM/Y*/d01’,
prodkey integer external (2),
mktkey integer external (2),
dollars integer external (3)
) ;
synch offline segment s_1q96 with table sales
discardfile ’discards_synch’;

Because the SYNCH operation requires an exclusive lock on the table, you
might prefer to use a separate control file for that operation so you can
perform it at a time when users are not accessing the table.
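If you instead need to back out the load, a minimal control-file sketch
(reusing the segment and table names from the preceding example) adds the
UNDO LOAD option to the SYNCH statement:

synch offline segment s_1q96 with table sales
undo load;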

After the SYNCH operation, you must use ALTER SEGMENT…ONLINE before
you can access the segment.


Format of Input Data
The TMU supports a wide variety of input data formats. However, not all
platforms support all formats. The TMU accepts both disk and tape input and
system standard input. Tape files can be ANSI standard label or TAR (Tape
ARchive) formats. The record format for disk input files can be fixed,
variable, separated, or XML. XML input files cannot be loaded from tape.

Windows: Tape input is not supported on Windows platforms. ♦

Not all combinations of data, record format, and tape formats are valid. The
following table defines the valid combinations.

Input Device                       File Format            Records
Disk files,                        Standard flat files    Fixed format
  standard input (stdin)                                  Variable format
                                                          Separated format
                                                          XML format
Tape devices:                      TAR                    Fixed format
  4 mm DAT                                                Variable format
  8 mm (Exabyte)                                          Separated format
  1/4" cartridge
  9-track reel                     ANSI standard label    Fixed and variable length
  3480/3490 18-track cartridge

The 1/4-inch cartridge input device is supported only for TAR tapes. The
3480/3490 18-track cartridge is supported only by variable-block-length
device drivers for ANSI Standard Label tapes.

The TMU also provides limited support for IBM standard label tapes. It can
read IBM standard label tapes with fixed-length records in EBCDIC FB format.
However, it cannot read variable-length (VB or VBS) tapes. The filenames on
the tape must be uppercase.

The TMU also supports an internal storage format, UNLOAD, which loads
data files written with a TMU UNLOAD control file. You can write UNLOAD-
format tapes in either TAR or standard-label format.


The TMU handles disk and tape files differently with respect to file format,
record length, format, and data type, as the following sections describe.

Disk Files
Disk files can contain fixed-format, variable-format, separated-format, or
XML input data. If no format keywords are specified in the LOAD DATA
statement, the TMU assumes that fixed-format records will be loaded.

Fixed-Format Records
For fixed-format records, all records are a fixed length and all field types are
allowed. The TMU determines the record length from the defined size of
RECORDLEN in the FORMAT clause of the LOAD DATA statement according
to the following rules:

■ If the RECORDLEN value is greater than the sum of field lengths
  specified in the field specification, the RECORDLEN value is used as
  the record length.
■ If the RECORDLEN value is less than the sum of field lengths
specified in the field specification, a warning message is issued and
the load process stops.
■ If the RECORDLEN clause is missing, each record is read until a
newline character is encountered. Binary data is not permitted.
■ If the RECORDLEN clause is missing and a record is shorter than the
sum of field lengths specified in the field specification, a warning
message is issued and the row is written to the discard file.
■ If the RECORDLEN clause is missing and a record is longer than the
sum of field lengths specified in the field specification, data beyond
that length is discarded.

To read EBCDIC in fixed-record format, include the FORMAT IBM clause in the
LOAD DATA statement. This clause forces CHARACTER and EXTERNAL fields
to convert from EBCDIC to ASCII, and INTEGER fields to convert to the
byte-ordering of the native computer.

Examples

The following examples illustrate how to read fixed-format disk files.


The RECORDLEN value equals the sum of the field lengths:

LOAD DATA
INPUTFILE 'mkt.txt'
RECORDLEN 126

The RECORDLEN value equals the sum of the field lengths, and the data is in
EBCDIC:

LOAD DATA
INPUTFILE 'mkt.txt'
RECORDLEN 126
FORMAT IBM

The data is newline-character delimited, and RECORDLEN is not present:

LOAD DATA
INPUTFILE 'mkt.txt'

Variable-Format Records
The variable-format record is a modified version of the fixed-format record.
A variable-format record consists of a fixed-length part and a variable-length
part. Every variable-format record has the same length for the fixed-length
part, but can have a different length for the variable-length part. For a
variable-length TMU column, the length of the column is the fixed-length
part, and the real data of the column is attached in the variable-length part.

The TMU reads the fixed-length part of the record first. Next, the TMU deter-
mines the length of the variable-length part, then reads the variable-length
part.

The TMU determines the fixed-length part of the record from the defined size
of FIXEDLEN in the FORMAT clause of the LOAD DATA statement according to
the following rules:

■ If the FIXEDLEN value is greater than the sum of field lengths
  specified in the field specification, the FIXEDLEN value is used as the
  fixed-length part.
■ If the FIXEDLEN value is less than the sum of field lengths specified
in the field specification, a warning message is issued and the load
process stops.


■ If the FIXEDLEN clause is missing, each record is read until a newline
  character is encountered. Binary data is not permitted.
■ If the FIXEDLEN clause is missing and a record is shorter than the
sum of the field lengths specified in the field specification, a warning
message is issued and the row is written to the discard file.

The variable-length part consists of multiple variable-length character data
sections. The length of a data section is defined by the corresponding
VARLEN or VARLEN EXTERNAL data fields in the fixed-length part of the
input record. The first data section starts at the end of the fixed-length part.
The starting offsets of subsequent data sections are defined by the total length
of the preceding data sections.

Based on the FORMAT clause in the TMU control file, the TMU reads the input
record two ways:

■ By the specified length.
  The FIXEDLEN keyword should be used to indicate the total length
for the fixed-length part of the input record. The inclusion of the
INTRA RECORD SKIP modifier in the FORMAT clause informs the
TMU to skip a certain number of bytes between input records. The
INTRA RECORD SKIP modifier skips newlines ’\n’ or ’\r\n’
between records.
For each input record, the TMU reads the fixed-length part in the
specified length. The TMU converts each VARLEN and VARLEN
EXTERNAL field into a number, adds them to determine the total
length of the variable-length part, and then reads the variable-length
part according to that length. If the INTRA RECORD SKIP modifier is
present in the FORMAT clause, the TMU bypasses the specified bytes
before moving on to the next input record.
If the length characters in VARLEN or VARLEN EXTERNAL fields are
not valid digits, the TMU returns an error. If the length characters are
valid digits, but do not reflect the real length of the data section, the
TMU attempts to detect this. However, it is possible that length errors
can cause the TMU to read the following records in the wrong
boundary.


■ By each new line.
  If the FIXEDLEN keyword is not present in the FORMAT clause, the
TMU reads the input record line by line. No binary data is allowed in
the input record, and the TMU does not parse the VARLEN fields in
the reading process.
Since every record is delimited by the new line, any length errors in
a particular record do not cascade to subsequent records. Only the
length-error record is discarded.
Use read by newline if you suspect the input data file might not be
clean and if no binary data is in the input data file.

After the input record is read, the TMU converts it into an internal row. The
TMU uses the appearance order of VARLEN and VARLEN EXTERNAL in the
fixed-length part to calculate the offset and length of each data section in the
variable-length part. The data sections are then put in the corresponding
output column of the internal row.

Usage
The variable-format record is more compact than the fixed-format record and
also preserves significant trailing spaces. However, a variable-format
data-input file is more complex than a fixed-format one. Use variable-format
records:

■ When moving data between different computers in external format.
■ To reduce the space used by the input file.
■ When loading VARCHAR data in international locales. Typically in
  fixed-format records, a combination of the CHARACTER and RTRIM
  modifiers is used to load data into a VARCHAR column. In international
  locales, the RTRIM operation is expensive and can cause noticeable
  LOAD operation performance degradation.


Example

The following CREATE TABLE statement creates the Market table:

CREATE TABLE MARKET (
MKTKEY INTEGER NOT NULL UNIQUE,
HQ_CITY VARCHAR(20) NOT NULL,
HQ_STATE CHARACTER(20) NOT NULL,
DISTRICT VARCHAR(20) NOT NULL,
REGION CHARACTER(20) NOT NULL,
PRIMARY KEY(MKTKEY));

The input file market.txt (a newline is used between records):
00000000001 000007 GA 000007 South AtlantaAtlanta
00000000002 000005 FL 000007 South MiamiAtlanta
00000000003 000011 LA 000011 South New OrleansNew Orleans
00000000004 000007 TX 000011 South HoustonNew Orleans
00000000005 000008 NY 000008 North New YorkNew York

Variable-format data loaded:

■ Read by new line:
LOAD DATA INPUTFILE ’market.txt’
INSERT
FORMAT VARIABLE
NLS_LOCALE ’English_UnitedStates.US-ASCII@Binary’
INTO TABLE MARKET (
MKTKEY POSITION(2) INTEGER EXTERNAL(11) NULLIF(1)=’%’,
HQ_CITY POSITION(14) VARLEN EXTERNAL(6) NULLIF(13)=’%’,
HQ_STATE POSITION(21) CHARACTER(20) NULLIF(20)=’%’,
DISTRICT POSITION(42) VARLEN EXTERNAL(6) NULLIF(41)=’%’,
REGION POSITION(49) CHARACTER(20) NULLIF(48)=’%’);

■ Read by specified length:
LOAD DATA INPUTFILE ’market.txt’
FIXEDLEN 68 INTRA RECORD SKIP 1
INSERT
FORMAT VARIABLE
NLS_LOCALE ’English_UnitedStates.US-ASCII@Binary’
INTO TABLE MARKET (
MKTKEY POSITION(2) INTEGER EXTERNAL(11) NULLIF(1)=’%’,
HQ_CITY POSITION(14) VARLEN EXTERNAL(6) NULLIF(13)=’%’,
HQ_STATE POSITION(21) CHARACTER(20) NULLIF(20)=’%’,
DISTRICT POSITION(42) VARLEN EXTERNAL(6) NULLIF(41)=’%’,
REGION POSITION(49) CHARACTER(20) NULLIF(48)=’%’);


Data after loading:
RISQL> select * from market;
MKTKEY HQ_CITY HQ_STATE DISTRICT REGION
1 Atlanta GA Atlanta South
2 Miami FL Atlanta South
3 New Orleans LA New Orleans South
4 Houston TX New Orleans South
5 New York NY New York North

Separated-Format Records
For separated-format records, the TMU determines the length of each field in
the record by the separator character defined in the FORMAT clause of the
LOAD DATA statement. The end of each record is indicated by the newline
character (or by the end of file for the last record in the file).

Only character and external field types are allowed. Length values and
POSITION keywords are ignored.

If a RECORDLEN clause is present with separated-format records, then the
record is read until either a newline character is read or the number of
characters specified by RECORDLEN is read, whichever comes first.

If records in separated format are longer than 8192 bytes, then a RECORDLEN
clause, which specifies the maximum length of a record, must be used.

Examples

These examples illustrate LOAD DATA statements to read separated-format
disk files or system-standard input.

The fields are separated by a comma and the record ends with a newline
character:
LOAD DATA
INPUTFILE ’mkt.txt’
FORMAT SEPARATED BY ’,’

The fields are separated by a slash (/) and the record length is 126 bytes:
LOAD DATA
INPUTFILE ’mkt.txt’
RECORDLEN 126
FORMAT SEPARATED BY ’/’


The input is read from standard input, fields are separated by a colon, and the
record ends with a newline character:
LOAD DATA
INPUTFILE ’-’
…FORMAT SEPARATED BY ’:’

XML Format
XML files consist of markup tags and data content. The markup tags define
elements that have a repetitive hierarchical structure. This structure has data
values embedded in it, but these values do not readily transform into flat
database rows and columns. In order for the TMU to locate the data content
in the XML file and construct rows, the XML file is parsed (using the Xerces-
C++ parser), according to rules specified in the TMU control file. The syntax
that defines these rules is the “xml_path Specification” on page 3-75.

The XML paths in the TMU control file must comply with the hierarchy of the
elements in the XML file. For each column to be loaded, the control file
specifies a path that points to the location of the data. The data is always
located at the end of a series of elements that comprise the path. The data is
either the value of an element’s attribute or the character data (PCDATA)
enclosed by the element’s start and end tags.

A row is constructed from sets of data values inside the XML input file that
map to corresponding sets of XML paths in the TMU control file. Only one row
can be constructed from the data inside the start and end tags of the last
common element specified in the control file. In order for multiple rows to be
loaded, the input file must contain repetitive blocks that begin and end with
the same common element, and that element must be the last common
element.

For example, the control file for loading four columns in a table might define
these four paths:
...
prod_brand /product/brand/#PCDATA char(20),
prod_name /product/brand/category/@name char(20),
prod_grind /product/brand/category/@grind char(50),
prod_weight /product/brand/category/@weight integer external (5),
...


The last common element is brand; therefore, the content that makes up a
single row must fall between the start tag <brand> and end tag </brand> in
the input file. In the following example of a partial input file, this content is
shown in bold:
<product>
<brand>Aroma
<category name=’Coffee’ grind=’Whole bean’ weight=’2’/>
</brand>
</product>

Using a control file with the paths shown above, the TMU would load the
following values:
Prod_Brand Prod_Name Prod_Grind Prod_Weight
Aroma Coffee Whole bean 2

Note that the TMU loads only the data values, not the markup tags. If only a
subset of the data values is required in the table, the TMU could load that
subset, based on an equivalent subset of XML paths in the control file. If
further TMU processing is required on the resulting rows, such as substrings
or concatenations of the XML data, that functionality can also be built into the
control file.

XML Input with a Nested Structure
If the content between the tags for the last common element produces more
than one row, the structure of the XML file is said to be nested. Whether the
structure is nested or not cannot be determined from the XML paths specified
in the TMU control file. The structure must be determined at run-time, based
on the content of the XML input file. If a nested structure is detected, the load
operation fails with an error.

Consider a case where the control file in the previous example is structured
the same but the XML input file contains repetitive sets of data values inside
the last common element (brand):
<product>
<brand> Aroma
<category name=’Coffee’ grind=’Whole bean’ weight=’2’/>
<category name=’Coffee’ grind=’Espresso’ weight=’1’/>
</brand>
</product>


Within the content defined by the common element <brand>, there are two
“rows,” and the TMU will return an error if you attempt to load them with the
original control file. However, you could load three of the columns by editing
the control file as follows:
...
prod_name /product/brand/category/@name char(20),
prod_grind /product/brand/category/@grind char(50),
prod_weight /product/brand/category/@weight integer external (5),
...

Now the load will work, but it cannot load the “Aroma” character string
defined by the <brand> element. Instead, the <category> element defines
the boundary for each row and only three columns are produced:
Prod_Name Prod_Grind Prod_Weight
Coffee Whole bean 2
Coffee Espresso 1

Tape Files on UNIX Operating Systems
Tape files can be read from TAR or ANSI standard label tapes, as described in
the following sections.

XML input files cannot be loaded from tape.

TAR Tapes
TAR tape files are handled like disk files for fixed-format, variable-format,
and separated-format records.

The TMU can read TAR tape files that span multiple tape volumes. However,
the TMU does not support multiple TAR archives on a single tape.

Example
The following example shows a LOAD DATA statement to read TAR tape files.


The fields are separated by a comma and the record ends with a newline
character:
LOAD DATA
INPUTFILE ’/disk1/mkt.txt’
TAPE DEVICE ’/tape_dev’
FORMAT SEPARATED BY ’,’

ANSI Standard Label Tapes
ANSI-standard label tapes can contain either fixed-length or variable-length
records. The TMU determines the record length from the tape label. If the
RECORDLEN or FIXEDLEN clause is present, it is ignored.

If the SEPARATED clause is present, the TMU assumes separated-format
records. It ignores the label and reads each tape record, scanning for the
field-separator character defined in the TMU LOAD DATA statement. If fewer
fields are present than specified in the command, the TMU issues an error
message and discards the row. If more fields are present, it ignores the
remaining fields in the tape record.

If the SEPARATED clause is not present, the TMU assumes fixed-format or
variable-format data. It determines the format from the label and reads both
fixed-length and variable-length records according to the format described in
the tape label. If the record is shorter than expected, it issues an error message
and discards the record. If the record is longer, it ignores the remaining
characters in the record.

Spanned format tapes are not supported, and a single tape record is not
separated into multiple table rows.

To read EBCDIC format data with fixed-format, variable-format, or
separated-format records, use the FORMAT IBM or FORMAT IBM
SEPARATED BY ’c’ clauses, respectively.

The following examples show statements to read ANSI-standard label tape
files.


The label determines the record length and the statement specifies the field
lengths.
load data
inputfile ’mkt.txt’
tape device ’tape_dev’

The label determines the record length, a comma separates the fields, and the
data is in EBCDIC:
load data
inputfile ’mkt.txt’
tape device ’tape_dev’
format ibm separated by ’,’

Field Type Conversions
The TMU performs conversions between compatible field types and data
types, converting the data in each field in the input record to the data type of
the corresponding column in the table.

CHARACTER and VARLEN fields are compatible with the CHARACTER and
VARCHAR data types. Although all numeric fields are compatible with any
numeric data type, the conversion can yield unexpected results, as the table
on page 3-136 specifies.

Datetime input data (in either datetime or binary input fields) is compatible
only with other datetime data types, as the table on page 3-136 defines.

Rows are discarded in the following cases:

■ The data in an input field is not compatible with the data type of the
output table column.
■ The value of a numeric input field exceeds the maximum possible
value of the output table column.


If a column is defined as NOT NULL DEFAULT NULL, and an empty field for
that column is encountered, the load operation ends. The following table
defines the allowable conversions and the results that occur for non-datetime
data types. Rows in this table represent input-record field types declared in
the TMU LOAD DATA statement. Columns in this table represent the data
types declared with the CREATE TABLE statement. The entry in each table cell
defines what can happen when input data of a given field type is loaded into
a table column of a given data type.

Table Data Types

Input-Record Field Types           Char  Varchar  Integer  Serial  Smallint  Tinyint  Decimal  Real   Double
-------------------------------------------------------------------------------------------------------------
Character, CONSTANT ‘str’,
  Concat                           C     C        N/A      N/A     N/A       N/A      N/A      N/A    N/A
Varlen, Varlen External            C     C        N/A      N/A     N/A       N/A      N/A      N/A    N/A
Integer External                   N/A   N/A      O        O       O         O        O        S      S
Decimal External                   N/A   N/A      O,D,S    O,D,S   O,D,S     O,D,S    O,D,S    S      S
Float External, CONSTANT f         N/A   N/A      O,S      O,S     O,S       O,S      O,S      O,S    O,S
Decimal, Packed Decimal,
  Zoned Decimal                    N/A   N/A      O,D      O,D     O,D       O,D      O,D      S      S
Integer, CONSTANT i,
  SEQUENCE i                       N/A   N/A      None     None    O         O        O        S      OK
Smallint                           N/A   N/A      OK       OK      None      O        O        O,S    OK
Tinyint                            N/A   N/A      OK       OK      OK        None     O        OK     OK
Real                               N/A   N/A      O,S      O,S     O,S       O,S      O,S      None   OK
Double                             N/A   N/A      O,S      O,S     O,S       O,S      O,S      O,S    None
CONSTANT NULL                      None  None     None     None    None      None     None     None   None
M4Date                             N/A   N/A      Date     Date    N/A       N/A      N/A      N/A    N/A

(Real and Double are the floating-point column types.)

The following abbreviations are used in the preceding table.

N/A     Not allowed. Load process is stopped.

None    No conversion required.

C       Character-to-character, left-justified, space-fill or truncation on the
        right. Truncation does not cause the record to be discarded.

OK      No overflow or loss of significance possible.

O       Overflow possible. Overflow causes the record to be discarded.
        Overflow might occur when data is loaded into a DECIMAL column
        because the precision is not set high enough to include both the
        whole number and its decimal component.

S       Loss of significance possible. Not a fatal error.

D       Decimal point alignment. Truncation of digits to the right of the
        decimal point is possible. Truncation causes the record to be
        discarded.

Date    The date represented is converted to the number of days since
        January 1, 1970. Dates before this date are represented as negative
        numbers.


The following table defines the allowable data-type conversions for datetime
data types. The input-record field types include binary numeric input data.

Table Data Types

Input-Record
Field Types    DATE          TIME          TIMESTAMP
DATE           None          N/A           Date parts: from input data
                                           Time parts: midnight
TIME           N/A           None          Date parts: 1900-01-01
                                           Time parts: from input data
TIMESTAMP      Date parts    Time parts    None

The following abbreviations are used in the preceding table.

N/A     Not allowed. Load process is stopped.

None    No conversion required.
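For example, based on the preceding table, a DATE input value of 1998-02-14
loaded into a TIMESTAMP column is stored as 1998-02-14 00:00:00, and a TIME
input value of 10:30:00 is stored as 1900-01-01 10:30:00.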


LOAD DATA Syntax Summary

The following syntax diagrams provide the complete syntax for the TMU
LOAD DATA statement.

LOAD DATA
    input_clause           (p. 3-25)
    format_clause          (p. 3-30)
    locale_clause          (p. 3-39)
    discard_clause         (p. 3-45)
    row_messages_clause    (p. 3-58)
    optimize_clause        (p. 3-59)
    mmap_index_clause      (p. 3-63)
    table_clause (p. 3-65) or segment_clause (p. 3-88)
    criteria_clause        (p. 3-90)
    comment_clause         (p. 3-95)
;

input_clause

INPUTFILE ’filename’
INDDN ’ TAPE DEVICE ’device_name’
( ’filename’ )

START RECORD start_row STOP RECORD stop_row


format_clause

RECORDLEN n APPEND

FIXEDLEN n INSERT
INTRA RECORD SKIP n REPLACE
MODIFY
AGGREGATE
UPDATE
AGGREGATE

FORMAT IBM
FORMAT SEPARATED BY ’c’
FORMAT IBM SEPARATED BY ’c’
FORMAT UNLOAD
FORMAT VARIABLE
FORMAT IBM VARIABLE
FORMAT XML
FORMAT XML_DISCARD

locale_clause

NLS_LOCALE ’ ’
language _territory . codeset @sort

XML_ENCODING


discard_clause

,
DISCARDFILE ’filename’
DISCARDDN IN ASCII
EBCDIC

RI_DISCARDFILE ’filename’
,
( table_name ’ filename’ )
OTHER ’filename’

DISCARDS n

AUTOROWGEN OFF
ON
,
( table_name )
,
DEFAULT ( table_name )
DEFAULT
,
( table_name )
,
ON ( table_name )

row_messages_clause

ROWMESSAGES ’filename’

optimize_clause

OPTIMIZE OFF
ON DISCARDFILE ’filename’

mmap_index_clause


MMAP INDEX ( pk_index_name )

SEGMENT ( segment_name )

table_clause

INTO TABLE table_name

,
( col_name RETAIN )
AS $pseudocolumn DEFAULT
$pseudocolumn simple_field
concat_field
constant_field
sequence_field
increment_field

simple_field

field_type

POSITION ( start )
: end

xml_path

ROUND ADD
LTRIM SUBTRACT

RTRIM MIN

TRIM MAX
ADD_NONULL
SUBTRACT_NONULL
MIN_NONULL
:
MAX_NONULL

NULLIF ( start ) = ’string’


: end x’hex_string’

xml_path

/element /@attribute field_type


p. 3-97
/#PCDATA


concatenated_field

CONCAT ( concat_arg_spec , concat_arg_spec )

where concat_arg_spec is:

column_name

$pseudocolumn
’character_string’

LTRIM ( column_name )
RTRIM $pseudocolumn
’character_string’

TRIM ( column_name , BOTH )


$pseudocolumn
, LEFT

’character_string’ , RIGHT

constant_field

CONSTANT NULL
’character_literal’
float_constant

integer_constant
DATE 'date_literal'
TIME 'time_literal'
TIMESTAMP ’timestamp_literal’

'alternative_datetime_value'


sequence_field

SEQUENCE

( start )
, increment

increment_field

INCREMENT

( n )

segment_clause

INTO OFFLINE SEGMENT segment_name

OF TABLE table_name WORKING_SPACE work_segment

,
( col_name simple_field )
AS $pseudocolumn concat_field
$pseudocolumn constant_field
sequence_field
increment_field


criteria_clause on a noncharacter column

ACCEPT constant = constant


REJECT column_name <> column_name
$pseudocolumn < $pseudocolumn
>
<=
>=

criteria_clause on a character column

ACCEPT column_name LIKE constant


REJECT $pseudocolumn NOT ESCAPE ’c’

comment_clause

COMMENT ‘character_string’


field_type

CHARACTER
CHAR SUBSTR ( start , num )
( length )
VARLEN

EXTERNAL
INTEGER EXTERNAL
INT EXTERNAL ( length )
, scale

DECIMAL EXTERNAL
DEC EXTERNAL ( length ) RADIX POINT ’c’
, scale

FLOAT EXTERNAL
( length )
DECIMAL
restricted_
DEC PACKED ( length ) date_spec
ZONED , scale
INTEGER
restricted_
INT ( scale ) date_spec
SMALLINT
TINYINT
REAL
DOUBLE PRECISION


field_type (continued)

DATE
‘date_mask’
( length )
CURRENT_DATE

TIME ‘time_mask’
( length )
CURRENT_TIME

TIMESTAMP ‘timestamp_mask’
( length )
CURRENT_TIMESTAMP

M4DATE m4date_mask
( length )

restricted_date_spec

DATE ’restricted_date_mask’
TIME ’restricted_time_mask’
TIMESTAMP ’restricted_timestamp_mask’

Chapter 4

Unloading Data from a Table

In This Chapter . . . . . . . . . . . . . . . . . . . . 4-3
The UNLOAD Operation. . . . . . . . . . . . . . . . . 4-4
Internal Format . . . . . . . . . . . . . . . . . . . 4-5
External Format . . . . . . . . . . . . . . . . . . 4-5
Data Conversion to External Format . . . . . . . . . . . 4-6

UNLOAD Syntax . . . . . . . . . . . . . . . . . . . 4-8


Unloading or Loading Internal-Format Data . . . . . . . . . . 4-14
Unloading or Loading External-Format Data . . . . . . . . . . 4-16
Converting a Table to Multiple Segments . . . . . . . . . . . 4-18
Moving a Database . . . . . . . . . . . . . . . . . . . 4-18
Loading External-Format Data into Third-Party Tools . . . . . . . 4-19
Unloading Selected Rows . . . . . . . . . . . . . . . . 4-19
Example: External Fixed-Format Data . . . . . . . . . . . 4-20
Example: External Variable-Format Data . . . . . . . . . . 4-22
In This Chapter
You can unload data from a warehouse table to a file, a magnetic tape, or
standard output with the TMU UNLOAD statement. The TMU can later reload
this data, or another application can use it. You might also find a TMU
selective UNLOAD operation to be faster than an SQL query in some cases.

UNLOAD operations can be performed both locally and remotely; for infor-
mation about remote TMU operations, see page 2-12.

This chapter contains the following sections:

■ The UNLOAD Operation


■ UNLOAD Syntax
■ Unloading or Loading Internal-Format Data
■ Unloading or Loading External-Format Data
■ Converting a Table to Multiple Segments
■ Moving a Database
■ Loading External-Format Data into Third-Party Tools
■ Unloading Selected Rows
■ Example: External Fixed-Format Data


The UNLOAD Operation
The TMU UNLOAD operation is a flexible operation that you can use for many
purposes, which include:

■ Moving a table or a database from one system to another.


■ Maintaining better performance on frequently updated databases by
periodically (quarterly or annually) unloading, and then reloading
the data. This operation reorganizes the database for more efficient
data storage.
■ Archiving a segment before removing its rows from a table.
■ Loading data into tools that another vendor provides.

You can unload an entire table or just the data in a specified segment. The
TMU can also perform a selective unload. By specifying constraints in a
WHERE clause in the UNLOAD statement, you can select the rows to be
unloaded. An UNLOAD operation can unload a maximum of 2,147,483,647
rows (2^31 - 1). If the table you want to unload contains more rows, break the
operation into two separate unloads and apply constraints. Alternatively, use
the SQL EXPORT command with a SELECT * query against the table.
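For example, a minimal sketch of such a split (the Perkey boundary value is
illustrative, in the style of the Aroma examples later in this chapter):

unload sales external outputfile 'sales_part1'
where perkey <= 96026;
unload sales external outputfile 'sales_part2'
where perkey > 96026;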

You can specify whether the rows of a table are unloaded in the order of the
data (by doing a relation scan of the table) or in the order of one of the table
indexes.

You can also pipe the output data to another program for additional
processing, such as compressing or filtering.

To facilitate reloading unloaded data, the TMU can automatically generate
CREATE TABLE and LOAD DATA statements corresponding to the table and
data being unloaded as part of the UNLOAD operation. This capability is also
available with a TMU GENERATE statement as described in Chapter 5,
“Generating CREATE TABLE and LOAD DATA Statements.”

The rb_cm utility uses the TMU unload capability. For more information
about this utility, refer to “The rb_cm Utility” on page 7-4.

To execute an UNLOAD statement, you must be a member of the DBA system
role, be the owner of the table, or have the SELECT privilege for the table.


You can unload data in two formats with the TMU UNLOAD statement: an
internal binary format and an external character-based format. Data you
unload in the internal format can be reloaded only into an IBM Red Brick
Warehouse database on the same platform. Data unloaded in external format
can be reloaded into a Red Brick database on the same platform or a different
platform.

Tip: You can use the SQL high-speed EXPORT command in the database server,
which unloads data in various formats, to export the results of any query to a
specified file. For more information on the EXPORT command, see the SQL Reference
Guide.

Internal Format
Unloading a table to the internal format creates a binary output file. The TMU
quickly reloads internal-format files; however, the files must be reloaded
only on a system on the same platform. For example, you cannot unload a
table to the internal format on an HP 9000 and reload it on an IBM RISC
System/6000, or vice versa. In this context, the same platform also means that
the two systems must use the same Red Brick binaries—either 32-bit or 64-
bit. You cannot unload a table from a 32-bit system and reload the binary
output file on a 64-bit system.

To reload an internal-format file, use the FORMAT UNLOAD option in your
LOAD DATA statement. Loading data in this format is quicker than loading
from an external-format unload file or another all-character flat file.

External Format
Unloading a table to the external format creates an output file that you can
reload on the same or a different platform. Because external-format files are
character based, you can also read and edit the files, if necessary. You can also
use the files with other applications. For example, you can import the data
unloaded into an external-format file into a desktop spreadsheet application.

When you unload data in external format, multibyte characters are preserved
in data and table and column names, but data is not localized. Numeric and
datetime data are formatted according to ANSI SQL-92 rules for these data
types.


When the TMU unloads data in the external format, it generates the following
character-based files:

■ A fixed-format or a variable-format file that contains the data. To
  name the file, specify the OUTPUTFILE keyword and filename.
■ An optional file that contains a generated CREATE TABLE statement,
  which you can use to create a table to hold the unloaded data. To
  produce this file, specify the DDLFILE keyword and filename. You
  can also produce this file separately by using the GENERATE CREATE
  TABLE statement.
■ An optional file that contains a generated LOAD DATA statement,
  which you can use to reload the data. To produce this file, specify the
  TMUFILE keyword and filename. You can also produce this file
  separately by using the GENERATE LOAD DATA statement.

To reload the external-format unloaded data, invoke the TMU by using the
automatically generated TMU control file (named with the TMUFILE
keyword), which contains the LOAD DATA statement.

Data Conversion to External Format
The following table defines how data from a table is mapped into an external
data file.

Data Types              Number of Bytes  Format                        Notes
Tinyint                 4                [0 | –]ddd                    –128 to 127
Smallint                6                [0 | –]ddddd                  –32768 to 32767
Varchar (only for       6                [0 ]ddddd                     0 to 32767
  variable format)
Serial                  11               [0]ddddddddddd                1 to 2147483647
Integer                 11               [0 | –]ddddddddddd            –2147483648 to 2147483647
Decimal                 1–38             [0 | –][digits].[digits]      Number of digits on either side of the
                                                                       decimal point depends on the precision
                                                                       and scale of the column; it does not
                                                                       vary from row to row.
Float                   20               [–]d.dddd...E[+ | –]dd        Might lose precision
Double                  31               [–]d.dddd...E[+ | –]dd        Might lose precision
Date                    10               YYYY-MM-DD                    Uses ANSI SQL format
Time (0)                8                HH:II:SS                      Uses ANSI SQL format
Time (non-zero)         15               HH:II:SS.FFFFFF               Uses ANSI SQL format
Timestamp (0)           20               YYYY-MM-DD HH:II:SS           Uses ANSI SQL format
Timestamp (non-zero)    26               YYYY-MM-DD HH:II:SS.FFFFFF    Uses ANSI SQL format

In the external data file, each column is preceded by a single-character
indicator, which indicates whether a NULL is present in that column for a
given row. If the indicator is blank, the value follows. If the indicator is the
percent character (%), the field is NULL and filled with blanks. A newline
character follows each record to make it easier for third-party tools to load the
file.

For an example of an external data file, refer to “Example: External Fixed-
Format Data” on page 4-20.


UNLOAD Syntax

UNLOAD table_name
    [ USING INDEX index_name | SEGMENT ( segment_name [, segment_name]... ) ]
    [ EXTERNAL [ VARIABLE ] ]
    [ DDLFILE 'filename' ] [ TMUFILE 'filename' ]
    { OUTPUTFILE | OUTPUTDDN } 'filename'
    [ TAPE DEVICE 'device_name' [ FORMAT { SL | TAR } ] ]
    [ USING QUERY REVISION ]
    [ WHERE search_condition [ { AND | OR } [ NOT ] search_condition ]... ]   (search_condition: p. 4-13)
;

 7DEOH0DQDJHPHQW8WLOLW\5HIHUHQFH*XLGH
81/2$'6\QWD[

table_name           Table to be unloaded. If any data segments are offline, all
                     segments must be unloaded by an UNLOAD statement
                     with a SEGMENT clause.

USING INDEX          Index to use for the unload operation. The index order
index_name           determines the order in which the rows of data are
                     unloaded. If no index is specified, the data is unloaded by
                     a table scan.

                     The index can be any index on the table except a TARGET
                     index. You can determine an index name from the
                     RBW_INDEXES system table. If this clause is used, all
                     segments of the index must be online.

SEGMENT              Allows unloading of specific segments. One or more
segment_name         segments can be unloaded. If data is unloaded by
                     segments, the data in each segment is unloaded in row
                     order by scanning the segment of the table. No index order
                     can be specified.

                     The specified segment can be any segment attached to the
                     table being unloaded. The segment can be either online or
                     offline for the unload operation. This clause must be used
                     for offline data segments.

EXTERNAL             Unloaded data is in plain-text format in the database-
                     locale code set. If you do not specify EXTERNAL, the data
                     is unloaded in internal (binary) format. For a description
                     of the external-data format, refer to page 4-6.

VARIABLE             Indicates that the output file is in variable format.
                     Significant trailing spaces for VARCHAR columns are not
                     preserved in VARIABLE format.

                     If VARIABLE is not specified, the output file is in fixed
                     format.

8QORDGLQJ'DWDIURPD7DEOH 
81/2$'6\QWD[

DDLFILE 'filename'   File to which the TMU automatically writes a CREATE
                     TABLE statement for the table during an unload-to-
                     external operation. The file does not include any segment
                     information. You can then use this file to create a table to
                     hold the data on any platform. If you do not specify the
                     DDLFILE keyword and filename, the file is not created.

                     filename can be a relative pathname or a full pathname
                     and can include environment variables. Enclose filename
                     in single quotation marks.

                     This file must be written to disk; it cannot be written to a
                     tape device.

                     Generated DDL files are not supported for remote TMU
                     UNLOAD operations.

TMUFILE 'filename'   File to which the TMU automatically writes a LOAD DATA
                     statement during an unload-to-external operation. After
                     unloading the data, you can use this file as the control file
                     when you invoke the TMU to reload the data.

                     Before you reload the data, you might need to modify
                     this file to correctly specify the load mode, the input file
                     name, and, if loading from a tape, the tape device name.
                     If the unloaded table data is written to a tape device, this
                     file contains a template TAPE DEVICE clause. If the
                     unloaded table data is written to a disk file, no TAPE
                     DEVICE clause is included.

                     filename can be a relative pathname or a full pathname
                     and can include environment variables. Enclose filename
                     in single quotation marks.

                     This file must be written to disk; it cannot be written to a
                     tape device.

                     Generated TMU files are not supported for remote TMU
                     UNLOAD operations.


OUTPUTFILE           File to which the table data is written during an unload
'filename'           operation.

                     This file can be written to disk, tape, standard output, or
                     piped to another program or filter. The filename can be a
                     relative pathname or a full pathname and can include
                     environment variables. Enclose filename in single
                     quotation marks.

                     If the output is standard output, the filename reference is
                     '-':
                     OUTPUTFILE '-'

Important: If the UNLOAD statement appears in a control file that the rb_cm utility
uses, OUTPUTFILE must be set to standard output.

                     If the output from a table or segment is piped to another
                     program, the filename reference is '| command', where
                     command is an operating-system program to which the
                     output should be piped. For example, the following state-
                     ment unloads the Sales table by using external format and
                     pipes the output to the compress program, which
                     compresses and writes the data to a file named
                     outdata_sales:
                     unload sales external OUTPUTFILE '| compress >
                     outdata_sales'

                     Because not all operating-system error cases are detected
                     by the TMU, you should verify that the program completes
                     successfully.

TAPE DEVICE          Specifies the tape device, if the data is to be written to a
'device_name'        tape file. Enclose device_name in single quotation marks.

                     If the table data is to be written to a disk file, do not use the
                     TAPE DEVICE clause.

UNIX: Tape support is only available on UNIX. ♦


FORMAT               Format of the tape output file: SL (ANSI Standard Label) or
                     TAR format. SL is the default. If you unload in SL format
                     and the output tape is not already an ANSI-standard label
                     tape, the TMU prompts for a volume ID (volume serial
                     number) to use to label the tape.

                     A TAR file cannot exceed 8,589,934,591 bytes, a limit
                     imposed by the IEEE-POSIX standard. If the table to be
                     unloaded exceeds this limit, use a standard-label tape. You
                     can unload multiple files (tables) to a single TAR archive by
                     writing a single control file that contains multiple
                     UNLOAD statements. You cannot, however, use a WHERE
                     clause in an UNLOAD statement (a selective unload) that
                     writes to a TAR file.

USING QUERY          This clause applies to versioned databases only. It specifies
REVISION             that the read revision for the unload operation is the revi-
                     sion set with ALTER DATABASE FREEZE QUERY REVISION.
                     If no REVISION clause is specified, the unload operation
                     uses the latest revision.

                     If the query revision is not active, the UNLOAD statement
                     always uses the latest revision regardless of the clause.

WHERE                Rows to be unloaded; only those rows that satisfy the
search_condition     search condition are unloaded. The supported search
                     conditions are a subset of those supported for SQL queries.
                     You can group search conditions with parentheses to force
                     an evaluation order.

                     You can combine a WHERE clause with a segment list to
                     limit the scope of the unload operation. If specific
                     segments are listed, the search condition applies only to
                     those segments. If no specific segments are listed, the
                     search condition applies to the entire table.

                     The constraints on character data in the WHERE clause are
                     evaluated according to the collation sequence defined by
                     the database locale.


Tip: If you know that the search condition chooses rows from only a specific set of
segments (for example, if the search condition contains a constraint on the
segmenting key), you can achieve fastest performance by listing only those segments
in a SEGMENT clause in the UNLOAD statement.
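For example, a sketch that combines a segment list with a search condition
(the segment name and key range are illustrative):

unload sales segment (s_1q96)
external outputfile 'q1_sales_data'
where perkey >= 96001 and perkey <= 96013;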

The following syntax diagram shows how to construct a WHERE clause
search_condition in an UNLOAD statement.

search_condition:

{ column_name1 | literal }  { = | <> | < | > | <= | >= }  { column_name2 | literal }

column_name IS [ NOT ] NULL

column_name [ NOT ] LIKE 'character_string' [ ESCAPE 'c' ]

column_name1,        Columns in the table to be unloaded. A column can be
column_name2         compared with another column, tested for NULL, or
                     compared with a constant value.

literal              Fixed sequence of characters, a numeric constant, or a
                     datetime constant. These character, datetime, and
                     numeric literals must correspond to the literal language
                     elements, as defined in the SQL Reference Guide. These
                     literals must be specified in the database-locale code set;
                     datetime constants must be expressed in ANSI SQL-92
                     format; and decimal constants must use a decimal radix.


'character_string'   Either a completely specified character string or a charac-
                     ter pattern that contains one or more wildcards; the char-
                     acter string must be enclosed in single quotation marks
                     (' '). For a complete definition, refer to the description of
                     the LIKE predicate in the SQL Reference Guide. The
                     character string must be specified in the database-locale
                     code set.

ESCAPE 'c'           Defines a character c to serve as an escape character so
                     that wildcard characters (% and _) can be used within the
                     preceding string constant.

                     The escape character must be specified in the data-
                     base-locale code set and can be either a single-byte or
                     multibyte character.

Unloading or Loading Internal-Format Data

To unload a table in the internal format (default):

1. Create or open a TMU control file and write an UNLOAD statement
   that writes the table or segments to disk or tape.
2. Invoke the TMU and use the control file that you created in step 1.
   The TMU creates a file that contains the unloaded data in internal
   format.

To reload internal-format data into a table:

1. If you are creating a new table, create it with the SQL CREATE TABLE
   statement, either one that you write or one that the TMU generated.

Important: The CREATE TABLE statement must create either the same table as the
table from which the data was unloaded or a table with the same number, type, and
order of columns.

2. Prepare a control file that contains a LOAD DATA statement that
   specifies FORMAT UNLOAD and the name of the file that contains the
   unloaded data.

Important: A LOAD DATA statement with FORMAT UNLOAD cannot contain field
specifications.

3. Invoke the TMU and specify the control file you created in step 2.
4. Create any needed indexes, synonyms, and views. As the table is
   loaded, the TMU automatically builds primary-key indexes and
   updates other existing indexes.

You can also use pipes to accomplish the unload and load processes without
using an intermediate tape or disk file. For more information about using
pipes, refer to your operating-system documentation.
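For instance, one possible arrangement (a sketch; the control-file names are
hypothetical) is to specify OUTPUTFILE '-' in the unload control file and
INPUTFILE '-' in the load control file, and then connect the two TMU
invocations with a pipe:

rb_tmu unload_ctl.tmu db_username db_password | rb_tmu load_ctl.tmu db_username db_password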

The following example shows how to unload a table by using internal
format. The Market table is unloaded into the market.txt file on disk. The
data is then reloaded into the same table using a LOAD DATA statement that
specifies FORMAT UNLOAD.

To unload the Market table in the internal format and place the contents into
the file market.txt:

1. Create or open a TMU control file and write an UNLOAD statement.
   In this example, the file is named unloadmkt.tmu:

   unload market
   outputfile 'market.txt' ;

2. Invoke the TMU, specifying unloadmkt.tmu as the control file:

   rb_tmu unloadmkt.tmu db_username db_password

   The contents of the Market table are now written in an internal
   format to the file market.txt.

To reload the Market table with internal-format data:

1. In a file, prepare a LOAD DATA statement that specifies FORMAT
   UNLOAD. In this example, the file is named loadmkt.tmu:

   load data
   inputfile 'market.txt'
   format unload
   into table market;

Important: The output file specified in the UNLOAD statement is used as the
INPUTFILE of the LOAD DATA statement.


2. Invoke the TMU from the command line, specifying loadmkt.tmu as
   the control file:

   rb_tmu loadmkt.tmu db_username db_password

3. Create any needed indexes, synonyms, and views.

Unloading or Loading External-Format Data

To unload a table in either external fixed or external variable format:

1. Create or open a TMU control file and write an UNLOAD statement
   with the EXTERNAL keyword that writes the table or segments to
   disk or tape.

   If you are going to reload the data into a new table, include the
   DDLFILE and TMUFILE keywords and filenames so that templates for
   CREATE TABLE and LOAD DATA statements are automatically
   generated. Alternatively, you can generate these statements at
   another time with the TMU GENERATE statement.

2. Invoke the TMU, using the control file that contains the UNLOAD
   statement that you created in step 1.

   The TMU creates a file that contains the unloaded data in external
   format. If you specified the TMUFILE and DDLFILE clauses, it also
   creates files that contain the automatically generated LOAD DATA
   and CREATE TABLE statements.

Tip: The CREATE TABLE statement does not include any segment information or a
MAX ROWS PER SEGMENT clause; you can edit the file to include this information
if necessary.

To reload external-format data into a data warehouse table:

1. If you are creating a new table, create it with the SQL CREATE TABLE
   statement, either one that you write or one that the TMU generated
   automatically with a TMU UNLOAD or GENERATE statement
   executed on the table from which the data was unloaded.

Important: The CREATE TABLE statement must create either the same table as the
table from which the data was unloaded or a table with the same number, type, and
order of columns.

2. Review the file that contains the automatically generated TMU LOAD
   DATA statement to be sure the input file name is correct. If loading
   from tape, edit the TAPE DEVICE clause to specify the correct device
   name.
3. Invoke the TMU, specifying the file that contains the LOAD DATA
   statement as the TMU control file.
4. Create any needed indexes, synonyms, and views. As the table is
   loaded, the TMU automatically builds primary-key indexes and
   updates other existing indexes.

The following example shows how to unload a table into a file by using the
external fixed format. The Sales table is unloaded to the sales.output file. The
data is then reloaded by using the automatically generated TMU file.

To unload the Sales table in the external format and place the contents into
the file sales.output:

1. Create or open a file and write an UNLOAD statement with the
   EXTERNAL keyword. In this example, the file is named
   unloadsales.tmu:

   unload sales external
   ddlfile 'sales.create'
   tmufile 'sales.load'
   outputfile 'sales.output';

2. Invoke the TMU from the command line, using the file
   unloadsales.tmu as the control file:

   rb_tmu unloadsales.tmu db_username db_password

   The TMU creates a file named sales.create, which contains a
   CREATE TABLE statement that you can use to re-create the table; a file
   named sales.load, which contains the LOAD DATA statements for
   reloading the data; and a file named sales.output, which contains the
   data in external format.

To reload the Sales table in a new database:

1. Create the table by using the DDL file sales.create. For example, if
   you are using the RISQL Entry Tool, you can execute the file as
   follows:

   RISQL> run sales.create ;

2. Modify the sales.load file, which contains the LOAD DATA statement,
   to correctly specify the input tape device.


3. Invoke the TMU from the command line, specifying sales.load as the
   control file:

   rb_tmu sales.load db_username db_password

4. Create any needed indexes, synonyms, and views.

Converting a Table to Multiple Segments

If a table resides in a single segment, you can use the unload operation to split
the data among additional new segments as follows (a sketch appears after
the steps):

1. Unload the table to disk or tape.
2. Create a new table with multiple segments.
3. Reload the data, using UNLOAD format.
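A minimal sketch of steps 1 and 3 (the new table name is illustrative; the
multi-segment CREATE TABLE statement itself is written in SQL and is not
shown here):

unload market outputfile 'market.unl';

load data
inputfile 'market.unl'
format unload
into table market_multiseg;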

Moving a Database
To move a database, create a file that contains an UNLOAD statement for each
table in the database. You can specify the internal format for fastest perfor-
mance or the external format for increased flexibility. However, if you move
a database to a system on a different platform, you must specify the external
format.

If you unload the data in the external format, either include the TMUFILE
parameter so that the TMU generates a file that contains the LOAD DATA
statements needed to reload the data or use the GENERATE LOAD DATA
statement to create the appropriate TMU file.

If you unload the data in the internal format, you must create or generate a
control file that contains the LOAD DATA statements. You can use the
GENERATE LOAD DATA statement to create the appropriate TMU file.

Be sure to write the LOAD DATA statements in the order in which the tables
must be loaded. For information about determining table order, refer to
“Determining Table Order” on page 3-14.
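For example, a minimal sketch of such a control file (the Aroma table names
are used for illustration; the generated TMU files are then run in an order that
loads the dimension tables before the Sales table):

unload period external tmufile 'period.load' outputfile 'period.unl';
unload product external tmufile 'product.load' outputfile 'product.unl';
unload market external tmufile 'market.load' outputfile 'market.unl';
unload sales external tmufile 'sales.load' outputfile 'sales.unl';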

For more information about moving a database, refer to the Administrator’s
Guide.


Loading External-Format Data into Third-Party Tools
After unloading a table in the external format, you can use the information in
the automatically generated TMU file to load the data into products that
accept portions of fixed input. The TMU file contains the LOAD DATA
statements for reloading and provides information about the positions of the
columns.

For example, you can load data into IBM DB2 using the DB2 LOAD utility or
into Microsoft Excel using the Excel parse function.

When you load data into Excel, Excel does not accept ANSI SQL date formats.
You can use the parse function to obtain date components and use a date
function to turn the components into an Excel date. You must also use the
Excel parse function to interpret null-indicator characters to extract columns
with null values. Because the data is in the external format, you can look at
the data so that you can set up Excel to handle its format.

Unloading Selected Rows
To unload only selected rows of a table, create an UNLOAD statement that
contains a WHERE clause that specifies which rows to unload. Only those
rows that satisfy the column constraints in the WHERE clause are written to
the unload file.

You can combine a WHERE clause with a segment list to limit the scope of the
unload operation. If specific segments are listed, the search condition applies
only to those segments. If no specific segments are listed, the search condition
applies to the entire table. If you know that the WHERE clause unloads rows
from only specific segments, the unload operation is faster if you list just
those segments in a SEGMENT clause of the UNLOAD statement.

To write the rows selected by the WHERE clause to a TAR file, you must first
insert them into a temporary table or unload them to a disk file: you cannot
use a WHERE clause in an unload operation to a TAR file (because each header
block in the TAR file must know the length of the file that follows it, which is
not known in the case of a selective unload operation).

Assume you want to unload the 2000 sales data from the Sales table in the
Aroma database. The following UNLOAD statement unloads the rows for
2000 from the Sales table, based on the Perkey column. The rows are written
in external format to a file named 2000_sales_data.
unload sales
external outputfile ’2000_sales_data’
where perkey >= 96001 and perkey <= 96053;

Assume you want to unload the 2000 sales data for the northern region from
the Sales table. The following UNLOAD statement unloads the rows for 2000
for the northern region from the Sales table, based on the Perkey and
Mktkey columns. The rows are written in internal format to a file named
2000_northern_sales_data.
unload sales
outputfile ’2000_northern_sales_data’
where perkey >= 96001 and perkey <= 96053
and ( mktkey = 6
or mktkey = 7
or mktkey = 8 );

Example: External Fixed-Format Data
The following example shows the external format generated by the TMU, as
well as the automatically generated CREATE TABLE and LOAD DATA
statements for the Market table in the Aroma database.

The original Market table is created by the following statement:


create table market (
mktkey integer not null,
hq_city char(20),
hq_state char(20),
district char(20),
region char(20),
constraint mkt_pkc primary key (mktkey));

This TMU UNLOAD statement produces a file named market.txt:


unload market
external
ddlfile ’market_create.risql’
tmufile ’market_load.tmu’
outputfile ’market.txt’;

The file market.txt contains unloaded data from the Market table in the
external format:
00000000001 Atlanta GA Atlanta South
00000000002 Miami FL Atlanta South
00000000003 New Orleans LA New Orleans South
00000000004 Houston TX New Orleans South
00000000005 New York NY New York North

The TMU UNLOAD statement also produces a file named market_create.risql that contains this SQL statement to recreate the Market table:
CREATE TABLE MARKET (
MKTKEY INTEGER NOT NULL UNIQUE,
HQ_CITY CHARACTER(20),
HQ_STATE CHARACTER(20),
DISTRICT CHARACTER(20),
REGION CHARACTER(20),
PRIMARY KEY(MKTKEY));

The TMU UNLOAD statement also creates a file named market_load.tmu that
contains the following LOAD DATA statement:
LOAD DATA INPUTFILE ’market.txt’
RECORDLEN 97
INSERT
INTO TABLE MARKET (
MKTKEY POSITION(2) INTEGER EXTERNAL(11) NULLIF(1)=’%’,
HQ_CITY POSITION(14) CHARACTER(20) NULLIF(13)=’%’,
HQ_STATE POSITION(35) CHARACTER(20) NULLIF(34)=’%’,
DISTRICT POSITION(56) CHARACTER(20) NULLIF(55)=’%’,
REGION POSITION(77) CHARACTER(20) NULLIF(76)=’%’);

Important: The table in the LOAD DATA statement and the CREATE TABLE statement is MARKET. If you use these statements to create a new table, you might want to edit them to change the name. Similarly, often you must edit input filenames or tape-device names to make them correspond to the actual physical locations.
The NULLIF keyword and the percent character (%) in the LOAD DATA statement
indicate whether a column in the unloaded table contained NULL. For example, if the
value in the District column for New Orleans were NULL, the unloaded data would
look like this:
00000000001 Atlanta GA Atlanta South
00000000002 Miami FL Atlanta South
00000000003 New Orleans LA % South
00000000004 Houston TX New Orleans South
00000000005 New York NY New York North

Example: External Variable-Format Data
This example shows the external-variable format generated by the TMU, as
well as the automatically generated CREATE TABLE and LOAD DATA state-
ments for the Market table in the Aroma database.

The Market table (which includes a VARCHAR column) is created with the
following statement:
create table market (
mktkey integer not null,
hq_city varchar(20) not null,
hq_state char(20) not null,
district char(20) not null,
region char(20) not null,
constraint mkt_pkc primary key (mktkey));

The following TMU UNLOAD statement produces a file named market.txt that contains unloaded data from the Market table in the external format:
unload market
external variable
ddlfile ’market_create.risql’
tmufile ’market_load.tmu’
outputfile ’market.txt’;

The unloaded data is similar to the following results:


00000000001 000007 GA Atlanta South Atlanta
00000000002 000005 FL Atlanta South Miami
00000000003 000011 LA New Orleans South New Orleans
00000000004 000007 TX New Orleans South Houston
00000000005 000008 NY New York North New York

The preceding TMU statement also produces a file named market_create.risql that contains this SQL statement to recreate the Market table:
CREATE TABLE MARKET (
MKTKEY INTEGER NOT NULL UNIQUE,
HQ_CITY VARCHAR(20) NOT NULL,
HQ_STATE CHARACTER(20) NOT NULL,
DISTRICT CHARACTER(20) NOT NULL,
REGION CHARACTER(20) NOT NULL,
PRIMARY KEY(MKTKEY));

The preceding TMU statement also creates a file named market_load.tmu that contains the following LOAD DATA statement:
LOAD DATA INPUTFILE ’market.txt’
FIXEDLEN 82 INTRA RECORD SKIP 1
INSERT
FORMAT VARIABLE
NLS_LOCALE ’English_UnitedStates.US-ASCII@Binary’
INTO TABLE MARKET (
MKTKEY POSITION(2) INTEGER EXTERNAL(11) NULLIF(1)=’%’,
HQ_CITY POSITION(14) VARLEN EXTERNAL(6) NULLIF(13)=’%’,
HQ_STATE POSITION(21) CHARACTER(20) NULLIF(20)=’%’,
DISTRICT POSITION(42) CHARACTER(20) NULLIF(41)=’%’,
REGION POSITION(63) CHARACTER(20) NULLIF(62)=’%’);

Chapter 5
Generating CREATE TABLE and LOAD DATA Statements
In This Chapter . . . . . . . . . . . . . . . . . . . . 5-3
Generating CREATE TABLE Statements . . . . . . . . . . . 5-3
Generating LOAD DATA Statements . . . . . . . . . . . . 5-5
Example: GENERATE Statements and External-Format Data . . . . 5-8
In This Chapter
The TMU can automatically generate CREATE TABLE and LOAD DATA state-
ments based on existing tables. You can use these statements as templates for
creating and loading new tables. These statements can also be generated as
part of the UNLOAD process; however, the GENERATE statements provide
more flexibility.

GENERATE operations can be performed both locally and remotely; for infor-
mation about remote TMU operations, see page 2-12.

This chapter contains the following sections:

■ Generating CREATE TABLE Statements


■ Generating LOAD DATA Statements
■ Example: GENERATE Statements and External-Format Data

Generating CREATE TABLE Statements
To write a CREATE TABLE statement for a new table to hold unloaded data or
to create a template for a new table that is similar to an existing table, you can
use the TMU GENERATE CREATE TABLE statement to generate one instead of
generating it as part of the UNLOAD operation. You can either use the
statement as generated or edit it to make any necessary changes (for example,
modifying filenames or table names or adding segment information or
MAXROWS or MAXROWS PER SEGMENT values).

To execute a GENERATE statement, you must have SELECT privileges on the table.

GENERATE CREATE TABLE FROM table_name

DDLFILE ’filename’ ;

table_name    An existing table for which a CREATE TABLE statement is to be generated.

DDLFILE 'filename'    File to which the TMU writes the generated CREATE TABLE statement. The file does not include any segment information. You can use this file to create a table on any platform.

filename can be a relative pathname or a full pathname and can include environment variables. Enclose filename in single quotation marks.

filename can also begin with a single vertical bar character (“|”), followed by a command string. This special format causes the TMU to direct the generated output data to a system pipe rather than to a file. The generated CREATE TABLE statement serves as input to the command string, which is run as a shell command.

The following example shows how to generate a CREATE TABLE statement for the existing table named Product and write the generated statement to the disk file named create_product.risql in the current directory.
generate create table from product
ddlfile ’create_product.risql’;

UNIX The following example shows how you can use system pipes and a remote
shell command (rsh) to create the table on a remote host. A CREATE TABLE
statement is generated for the existing table named Sales. Instead of writing
the generated statement to a disk file, however, the generated statement is
passed to a system pipe and executed with an rsh remote-shell command on
a UNIX host named north1.

The remote shell executes the cat UNIX command to copy the remote shell
input to a file named sales.create. The cat command is enclosed in quotation
marks because it contains the greater-than character (>) to redirect output. In
this single operation, a CREATE TABLE statement is automatically generated
and copied to a disk file on a remote host.
generate create table from sales
ddlfile ’| rsh north1 "cat > sales.create"’;

As in the previous example, generate a CREATE TABLE statement for the table
named Sales and pass the output to a remote shell on a UNIX host named
north1. At the rsh UNIX remote-shell command, pass the generated CREATE
TABLE statement directly to the RISQL Entry Tool running on the north1 host
to create the table in an existing database. In this single operation, a replica of
the Sales table is created on the remote host.
generate create table from sales
ddlfile ’| rsh north1 risql user password’;

Tip: The combination of the GENERATE CREATE TABLE statement and the remote shell capability illustrated in this example is particularly useful with the rb_cm utility. You can include a similar GENERATE statement in the rb_cm unload control file before the UNLOAD statement, causing the remote table to be created immediately before data is copied to that table. For more information about this utility, refer to Chapter 7, “Moving Data with the Copy Management Utility.”

Generating LOAD DATA Statements
To write a LOAD DATA statement to load unloaded data or to create a
template to load similar data into a new table, you can use the TMU
GENERATE LOAD DATA statement to generate one instead of generating it as
part of the UNLOAD operation. The GENERATE LOAD DATA statement allows
you to specify a name for the target table and a name for the input file. You
can either use the statement as generated or edit it to make any necessary
changes.

To execute a GENERATE statement, you must have SELECT privileges on the table.

GENERATE LOAD DATA FROM table_name
    [INTO new_table_name]
    [INPUTFILE 'new_filename' | TAPE DEVICE 'device_name']
    [EXTERNAL [VARIABLE]]
    TMUFILE 'filename' ;

table_name    Existing table for which a LOAD DATA statement is to be generated.

INTO new_table_name    Alternative table name to use in the INTO clause of the generated LOAD DATA statement. If not specified, table_name is used in the INTO clause. The table named new_table_name does not have to exist at the time the LOAD DATA statement is generated, but it must exist when the generated LOAD DATA statement is used.

INPUTFILE 'new_filename'    Filename to use in the INPUTFILE clause of the generated LOAD DATA statement. If not specified, the string “???” is used in the INPUTFILE clause, requiring the user to edit the generated file later to specify a valid input filename.

TAPE DEVICE 'device_name'    Use this clause to generate a LOAD DATA statement that loads data from a tape drive. If not specified, no TAPE DEVICE clause is included in the generated statement.

UNIX Tape support is only available on UNIX. ♦

EXTERNAL    Specifies that the generated LOAD DATA statement be written with the field specifications necessary to reload the table from an external unload-format file. If EXTERNAL is not specified, the generated LOAD DATA statement is written to conduct an internal-format load; that is, it specifies FORMAT UNLOAD and does not include field specifications.

VARIABLE    Only applies to tables with at least one VARCHAR column.

TMUFILE 'filename'    Specifies the name of the file to which the TMU writes the generated LOAD DATA statement for the table. You can use this file unchanged as input to the TMU to load or reload the table from an internal or external unload-format file. You can also use the generated file as a template and edit it so that the generated field specifications match some other input format.

filename can be a relative pathname or a full pathname and can include environment variables. Enclose filename in single quotation marks.

filename can also begin with a single vertical bar (|) character, followed by a command string. This special format causes the TMU to direct the generated output data to a system pipe rather than to a file. The generated LOAD DATA statement serves as input to the command string run as a shell command.

The following example generates a LOAD DATA statement based on an existing table named Product for a new table named Newproduct (the INTO clause). The generated LOAD DATA statement reads input data from a file named product_unload (the INPUTFILE clause), which is in EXTERNAL format. The generated statement is written to a file named load_newproduct in the current directory (the TMUFILE clause).
generate load data from product
into newproduct
inputfile ’product_unload’
external
tmufile ’load_newproduct’;

The following example generates a LOAD DATA statement based on an existing table named Market; the generated statement also loads data into a table named Market (by the absence of the INTO clause). The data for the new table comes from the system standard input (a filename in the INPUTFILE clause of ’-’) as an internal-format unload file. The generated statement is written to the file named copy_market in the current directory.
generate load data from market
inputfile ’-’
tmufile ’copy_market’;

Example: GENERATE Statements and External-Format Data
This example illustrates the CREATE TABLE and LOAD DATA statements for
the Store table in the Aroma database that are generated by the GENERATE
statement, and the external-format data produced by an UNLOAD statement.

The original Store table is created by the following statement:


create table store (
storekey integer not null,
mktkey integer not null,
store_type char(10),
store_name char(30),
street char(30),
city char(20),
state char(5),
zip char (10),
constraint store_pkc primary key (storekey),
constraint store_fkc foreign key (mktkey)
references market (mktkey))
maxrows per segment 2500;

The following TMU GENERATE statement produces a file named recreate_store:
generate create table from store ddlfile ’recreate_store’;

The file named recreate_store contains the following SQL statement to create
the Store table:
CREATE TABLE STORE (
STOREKEY INTEGER NOT NULL UNIQUE,
MKTKEY INTEGER NOT NULL,
STORE_TYPE CHARACTER(10),
STORE_NAME CHARACTER(30),
STREET CHARACTER(30),
CITY CHARACTER(20),
STATE CHARACTER(5),
ZIP CHARACTER(10),
PRIMARY KEY(STOREKEY),
CONSTRAINT STORE_FKC FOREIGN KEY(MKTKEY)
REFERENCES MARKET (MKTKEY) ON DELETE NO ACTION);

Tip: The GENERATE statement does not produce segment information or a MAXSEGMENTS or MAXROWS PER SEGMENT clause for the CREATE TABLE statement. You can edit the file to provide the necessary information.

The following TMU GENERATE statement creates a file named store_load:
generate load data from store
into new_store
inputfile ’store_data’
external tmufile ’store_load’;

The file named store_load contains the following LOAD DATA statement:
LOAD DATA INPUTFILE ’store_data’
RECORDLEN 136
INSERT
NLS_LOCALE ’English_UnitedStates.US-ASCII@Binary’
INTO TABLE NEW_STORE (
STOREKEY POSITION(2) INTEGER EXTERNAL(11) NULLIF(1)=’%’,
MKTKEY POSITION(14) INTEGER EXTERNAL(11) NULLIF(13)=’%’,
STORE_TYPE POSITION(26) CHARACTER(10) NULLIF(25)=’%’,
STORE_NAME POSITION(37) CHARACTER(30) NULLIF(36)=’%’,
STREET POSITION(68) CHARACTER(30) NULLIF(67)=’%’,
CITY POSITION(99) CHARACTER(20) NULLIF(98)=’%’,
STATE POSITION(120) CHARACTER(5) NULLIF(119)=’%’,
ZIP POSITION(126) CHARACTER(10) NULLIF(125)=’%’);

Tip: You can specify a new target table name and an input filename in the GENERATE LOAD DATA statement. The percent character (%) in the LOAD DATA statement indicates whether a column in the unloaded table contained NULL.

If you unload the Store table in external format, the output looks like this:
00000000001 00000000014 Small Roasters, Los Gatos
1234 University Ave Los Gatos CA 95032
00000000002 00000000014 Large San Jose Roasting Company
5678 Bascom Ave San Jose CA 95156
00000000003 00000000014 Medium Cupertino Coffee Supply
987 DeAnza Blvd Cupertino CA 97865
00000000004 00000000003 Medium Moulin Rouge Roasting
898 Main Street New Orleans LA 70125
00000000005 00000000010 Small Moon Pennies
98675 University Ave Detroit MI 48209

Chapter 6
Reorganizing Tables and Indexes
In This Chapter . . . . . . . . . . . . . . . . . . . . 6-3
The REORG Operation . . . . . . . . . . . . . . . . . 6-3
REORG Operation Options. . . . . . . . . . . . . . . 6-5

Data Processing During the REORG Operation . . . . . . . . . 6-7


Coordinator Stage . . . . . . . . . . . . . . . . . . 6-10
Input Stage . . . . . . . . . . . . . . . . . . . . 6-10
Conversion Stage . . . . . . . . . . . . . . . . . . 6-10
Index-Building Stage . . . . . . . . . . . . . . . . . 6-11
Cleanup Stage . . . . . . . . . . . . . . . . . . . 6-11

REORG Syntax . . . . . . . . . . . . . . . . . . . . 6-12


discardfile Clause . . . . . . . . . . . . . . . . . . 6-19
Usage Notes . . . . . . . . . . . . . . . . . . . . 6-21
Referential Integrity . . . . . . . . . . . . . . . . 6-21
Locking Behavior. . . . . . . . . . . . . . . . . 6-22
Partial-Index REORG . . . . . . . . . . . . . . . 6-23
Online and Offline Operation . . . . . . . . . . . . 6-23
Disk Space . . . . . . . . . . . . . . . . . . . 6-24
Discardfile Format . . . . . . . . . . . . . . . . 6-24
In This Chapter
The REORG command populates indexes, maintains referential integrity,
improves internal storage of indexes, and populates or rebuilds precom-
puted views. REORG operations apply to a specific target table and its related
database objects; these operations can be run in parallel to improve perfor-
mance on large tables with multiple indexes.

This chapter contains the following subsections:

■ The REORG Operation


■ Data Processing During the REORG Operation
■ REORG Syntax
■ Usage Notes

The REORG Operation
The REORG operation performs the following functions:

■ Checks referential integrity, if applicable for the target table, and either deletes rows that violate it or invalidates any affected indexes. (Referential integrity is the relational property that each foreign-key value in a table exists as a primary-key value in the referenced table.)
■ Performs an internal reorganization of one or more of the indexes for
the table (all types) to improve the internal storage of this infor-
mation and thereby the performance when the index is used to
access data. It can rebuild all indexes, selectively rebuild one or more
named indexes, or selectively rebuild one or more segments of one
or more named indexes.

■ Populates a DEFERRED index that is created with a CREATE INDEX statement. A DEFERRED index is an empty index structure that can be populated at a later date. For more information about DEFERRED indexes, refer to the SQL Reference Guide.
■ Rebuilds precomputed views. For example, if precomputed view
maintenance is set to OFF in your application, you can use the REORG
command to rebuild precomputed views only, without touching the
indexes on the target table. Alternatively, you can use the REORG
command to rebuild both indexes and views. For detailed infor-
mation about precomputed view maintenance, refer to the IBM Red
Brick Vista User’s Guide.
In addition to rebuilding aggregate table data, the REORG command
rebuilds indexes on aggregate tables.

A REORG operation is necessary in the following cases:

■ To rebuild the affected indexes if you use a database restore operation to restore individual segments of a table or index.
■ Whenever modifications to a database affect more than about
30 percent of the data, run the TMU with a REORG statement for any
tables directly modified. Periodically rebuilding such tables and
indexes with a REORG statement ensures referential integrity and
optimal performance.
■ To reorganize invalid STAR indexes. Certain operations can inval-
idate STAR indexes. For example, increasing the MAXROWS PER
SEGMENT or the MAXSEGMENTS parameter on a table, or using an
ALTER statement to expand a segment, can invalidate STAR indexes
on tables that reference the altered table. These operations always
generate a warning message that says STAR indexes based on the
altered table might be invalid, in which case the affected STAR
indexes need to be reorganized. You can either reorganize affected
indexes when the message is issued or schedule the REORG
operation for a more convenient time. However, any non-query
(INSERT, UPDATE, or DELETE) operation against a table that has an
invalid index results in an error message that says the index must be
reorganized. You must perform a REORG operation before the table
can be accessed for an INSERT, UPDATE, DELETE, or LOAD operation.
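
For example, the simplest form of the statement rebuilds all non-deferred indexes on a table and checks referential integrity by default (the table name is illustrative):

reorg sales;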

REORG is unnecessary in the following cases:

■ If no changes are made to the database except by complete loads of data.
■ If the table and indexes are segmented alike and new index data is
loaded into new index segments corresponding to new table
segments. For an example of this type of setup, refer to the
Administrator’s Guide.

REORG Operation Options
The REORG operation offers the following options:

■ Reorganizing part of an index or the whole index. In the case of a partial-index REORG operation, you can restrict the operation to a specified list of one or more segments for each index being reorganized. (Multiple segments for multiple indexes defined on a single table can be rebuilt in a single REORG operation.)
■ During a partial REORG operation, keeping segments to be reorga-
nized either online or offline. If the segments are offline, you can still
use the online portion of the index in queries while the REORG is in
progress.
■ During a partial REORG operation, scanning only a portion of the
table to rebuild an index. You can scan one or more data segments
instead of scanning the entire table. You can use this option only if
the same column is used to segment the table and the indexes being
rebuilt and if all the keys for the index segment being rebuilt can be
found in the specified data segments.
A REORG of a local index segment will scan only the corresponding
table segment when that segment is specified in the command.
■ Disabling referential-integrity checking. This feature offers
substantial performance gains when referential-integrity checking is
not necessary to build the required index.

■ Managing rows that fail referential-integrity checking or result in a duplicate index key:
❑ The affected rows can be deleted from the table and from all
indexes defined on the table.
❑ Any index affected by the error can be marked invalid without
affecting the reorganization of other indexes.
❑ The REORG operation can be stopped immediately in case of an
error in any of these indexes. All indexes being rebuilt by the
REORG are marked invalid.
■ Recording rows that fail referential-integrity checking in a separate
file.
■ Memory-mapping primary key indexes on referenced tables to
optimize referential-integrity checking.
■ Managing the reporting of discarded rows. They can be reported along with all other REORG messages, recorded in a user-specified file, or separated into multiple files depending on the type of failure (duplicate rows or referential-integrity failure). If you specify a discard file, a message notifies you when a row is discarded.
■ Rebuilding one or more precomputed views.

Data Processing During the REORG Operation
Serial and parallel REORG operations perform identical functions. However,
a parallel REORG operation uses separate tasks concurrently whereas a serial
REORG operation uses only one task that proceeds serially from one stage to
the next. The REORG operation consists of the following stages:

■ Coordinator stage
❑ Validates the REORG statement.
❑ Acquires all necessary locks and sets the state of each index
being rebuilt to prevent other users from accessing the index.
❑ Clears the indexes or segments of the index being reorganized.
Also during the coordinator stage, a parallel REORG operation:
❑ Determines how many additional tasks to use for each stage.
❑ Assigns work to each task.
❑ Determines the order of the tasks in the REORG pipeline.
❑ Creates the processes (UNIX) or threads (Windows) for each
stage in the REORG pipeline and starts their execution.
■ Input stage
❑ Reads each row from the target table.
❑ Passes the data on to a conversion task.
■ Conversion stage
❑ Checks referential integrity on all foreign keys if reference
checking is enabled. If reference checking is disabled, checks
referential integrity only on foreign keys used in any STAR index
being rebuilt.
❑ Constructs a key for each index being rebuilt.
❑ Identifies which index-builder task has work to do for the
current row. The row is skipped if the key value does not belong
to any index segment being rebuilt.
❑ Passes the data on to the first index-builder task in the pipeline.

■ Index-builder stage
❑ Inserts key values into their assigned segments of an index.
❑ Passes the data on to the next index-builder task or to the
cleanup task.
■ Cleanup stage
❑ Performs the function required for the selected ON DISCARD
option.
❑ Marks the successfully rebuilt indexes valid.
❑ Reports status of the REORG operation.

The REORG operation uses its parallel-processing capability to improve performance in two ways:

■ It uses separate processes or threads for each stage, creating a pipeline in which rows are passed from one stage to the next, with multiple rows being processed simultaneously. Even on systems with a single CPU, multiple processes or threads can take advantage of I/O and CPU overlap, which might reduce elapsed time.
■ On systems with multiple CPUs, additional input, conversion, and
index-builder tasks are created to further improve the pipeline.

The parallel-processing capability of the REORG operation improves with an increase in the number of physical-storage units (PSUs) in a table or with an increase in the number of segments in an index.

Figure 6-1 illustrates the sequence of tasks in a REORG operation.

[Figure 6-1: REORG Sequence. The figure shows the REORG task pipeline and its primary I/O: the coordinator task reads the control file and system tables and directs status and control flow; input tasks read the PSUs of the database table; conversion tasks consult the primary-key indexes of referenced tables; index-builder tasks write the index segments; and the cleanup task writes the discard files and the database table.]
Coordinator Stage
During the coordinator stage, the coordinator task receives the REORG
statement and checks the validity of the REORG parameters. It determines
how many tasks to use, assigns work for each stage and determines the order
of the tasks in the REORG pipeline. After all the stages are complete, the
coordinator ends the REORG operation.

If aggregate maintenance is required, the server handles it as part of the final coordinator task after the reorganization of the target table and indexes is complete. To make this possible, the TMU communicates directly with the server as part of the REORG transaction.

Input Stage
During the input stage, each row from the target table is read and the infor-
mation is passed on to the conversion stage. (In some cases, to perform the
REORG operation faster, a STAR index might be scanned instead of the table.
This choice is invisible to the user.) The number of input tasks cannot exceed
the number of PSUs in the target table.

Conversion Stage
During the conversion stage, referential integrity is checked (if you have
enabled this option) and a key is constructed for each index being rebuilt. The
index-building work to be done on each row is identified, and the row is then
directed to the next stage in the pipeline.

Index-Building Stage
During the index-building stage, a key value is inserted into a specified
segment of an index by an index-builder task. An index-builder task can
insert keys into multiple segments of one index, all the segments of one
index, or all the segments of a set of indexes. Multiple index-building tasks
handle different subsets of segments of the same index and each index is built
at a different stage of the pipeline. If the OPTIMIZE option is ON, sorting and
merging strategies are used to add rows to the index. Multiple index-builder
tasks can operate simultaneously. The number of index-building tasks cannot
exceed the total number of index segments being built. No index-builder
tasks are allocated for segments not being rebuilt.

Cleanup Stage
During the cleanup stage, the discarded rows are removed and recorded in
the discard files as you specify. The cleanup task, depending on the selected
ON DISCARD option, takes action as follows:

■ If the DELETE ROW option is selected, the discarded rows are recorded in the discard file as you specify. If the maximum discard count is exceeded, the cleanup task quits the REORG operation and marks all indexes as invalid. If versioning is enabled, a table and its indexes are restored to their previous state.
■ If the INVALIDATE INDEX option is selected, any index that
encounters an error while building is marked invalid.
■ If the ABORT option is selected, the REORG operation quits and
leaves all indexes in an invalid state.

The successfully rebuilt indexes are marked valid and the status of the
REORG operation is reported to the coordinator task.

REORG Syntax

REORG table_name
    [SEGMENT (segment_name [, segment_name...])]
    [INCLUDE PRECOMPUTED VIEWS [(view_name [, view_name...])]]
    [INDEX (index_name [, index_name...])
        [SEGMENT (segment_name [, segment_name...])]]
    [{EXCLUDE | INCLUDE} DEFERRED INDEXES]
    [EXCLUDE INDEXES]
    [RECALCULATE RANGES]
    [OPTIMIZE {ON | OFF}]
    [REFERENCE CHECKING {ON | OFF}]
    [MMAP INDEX (pk_index_name [SEGMENT (segment_name [, segment_name...])])]
    [ON DISCARD {INVALIDATE INDEX | ABORT | DELETE ROW [discardfile_clause, p. 6-19]}]
    [DISCARDS n]
    [ROWMESSAGES 'filename']
    ;

REORG table_name    Specifies the table to be reorganized.

SEGMENT segment_name    Specifies the table segment or segments to be scanned for a partial REORG of an index. When no segments are specified, the entire table is scanned. Rows scanned that do not belong in any index segment being rebuilt are ignored.

Table segments can be specified only if the table and all indexes specified in the INDEX clause are segmented on the same column (except in the case of local indexes). All the rows whose keys belong in the specified index segments must be present in the specified data segments.

For REORGs of local indexes, the table segments should correspond to the local index segments specified in the INDEX clause.

INCLUDE PRECOMPUTED VIEWS view_name    Rebuilds specified precomputed views when their detail table is reorganized.

If INCLUDE PRECOMPUTED VIEWS is specified but no view names are entered, all precomputed views based on the table_name specified are rebuilt.

If individual view names are entered, only the specified precomputed views are rebuilt.

The REORG command cannot rebuild a precomputed view if any segment of the precomputed view is offline or if the precomputed view is referenced in a STAR index.

INDEX index_name    Specifies the index that is to be rebuilt. If this clause is not present, all non-deferred indexes defined on the table are rebuilt. The index name for a user-created index is specified in the CREATE INDEX statement. The name for a system-generated primary-key index is the string _PK_IDX appended to the table name. For example, the primary index for the Market table is MARKET_PK_IDX.

SEGMENT segment_name    Specifies the index segments to be rebuilt by the REORG operation. Index segments must be attached to the named index. They can be either online or offline, but all segments must be in the same online or offline state. When no segments are specified, all segments attached to that index are rebuilt.

EXCLUDE DEFERRED INDEXES, INCLUDE DEFERRED INDEXES    Specifies whether to include or exclude deferred indexes (indexes that are not populated at the time of creation) from the REORG operation. The default is EXCLUDE DEFERRED INDEXES. A deferred index cannot be subjected to a partial REORG operation.

EXCLUDE INDEXES    Used only when rebuilding precomputed views; excludes indexes from the REORG operation.

Because precomputed views and their indexes are automatically rebuilt together, the INDEX clause need not include indexes on precomputed views.

RECALCULATE RANGES    Specifies that the ranges for any STAR index rebuilt with this REORG statement are to be recalculated to split index entries evenly among the segments for the index. Use this option when you change a MAXROWS PER SEGMENT or a MAXSEGMENTS value for a table that participates in the STAR index. If this option is included, at least one index in the index list must be a STAR index.

RECALCULATE RANGES cannot be specified in the REORG statement for a partial-index REORG. For information about segment ranges, refer to the SQL Reference Guide.

RECALCULATE RANGES cannot be specified when versioning is enabled.

OPTIMIZE ON, OFF    Specifies whether to rebuild the index or indexes in OPTIMIZE mode. This option overrides the OPTIMIZE mode set in the rbw.config file. If this clause is not present in the REORG statement, the rbw.config file determines the default behavior. For more information on OPTIMIZE, see “Optimize Clause” on page 3-59.

REFERENCE CHECKING ON, OFF    Specifies whether to conduct a referential-integrity check. If the option is set to ON, all foreign keys are checked. If the option is set to OFF, only those foreign keys that are part of a STAR index that is being rebuilt are checked. The default is ON.

If you have already ensured that the data has no referential-integrity violations, referential-integrity checking can be turned off while building a deferred index.

During a partial REORG operation, referential integrity is checked only for those rows actually read from the table.
MMAP INDEX pk_index_name    Specifies one or more primary key indexes on tables referenced by the table being reorganized. The purpose of this specification is to define the order in which indexes are memory-mapped (with the mmap system function), as a means of optimizing referential-integrity checking. Use this clause in conjunction with the TUNE TMU_MMAP_LIMIT parameter, which controls the amount of memory available for memory-mapping during loads; see page 2-35.

SEGMENT segment_name    Specifies one or more segments of the specified primary key index. Use this specification when you know which index segments the referenced table data is associated with. The memory-mapping function is limited to the index segments you specify.

The list of segment names must be enclosed by parentheses.

ON DISCARD    Indicates the action to take when data rows fail the referential-integrity checks or contain duplicate index key values. The three options are DELETE ROW, INVALIDATE INDEX, and ABORT. The default is DELETE ROW.

INVALIDATE INDEX    Invalidates any unique index with a duplicate key or any STAR index that cannot be created or rebuilt because of a referential-integrity violation. Also invalidates any index that cannot be created or rebuilt because the operation is running out of disk space. (The error can be corrected later by deleting a row, adding a missing primary-key value to the referenced table, or dropping a foreign key constraint.) Rebuilding of other indexes continues. If versioning is enabled, queries can access the previous version of the table and its indexes. If versioning is disabled, queries can access the table but cannot access the indexes being rebuilt.

ABORT    Stops the REORG operation if an error occurs in any of the indexes. If versioning is enabled, the indexes are rolled back to their initial state. If versioning is not active, all indexes being rebuilt are marked invalid. Indexes not being rebuilt are not affected. If the ABORT option is selected and an index runs out of disk space, the REORG operation ends. If versioning is enabled, queries can access the table and its indexes. If versioning is disabled, queries can access the table but cannot access the indexes being rebuilt.

DELETE ROW    Deletes duplicate rows and rows with referential-integrity failures from the table and all indexes. Rows with referential-integrity violations are deleted from the table immediately. Rows with duplicate key values are not deleted until the end of the REORG, after all table rows are processed. If versioning is enabled, queries can access the table and its indexes. If versioning is disabled, queries cannot access the table or its indexes.

The DELETE ROW option cannot be used during a partial-index REORG operation; in that case, either the INVALIDATE INDEX option or the ABORT option must be selected.

DISCARDS n    Specifies the maximum number of discarded rows allowed. When this number is exceeded, the REORG operation ends. If the value specified for n is 0, the number of discarded rows is unlimited.

ROWMESSAGES 'filename'    The name of the file to which row-level warning messages are sent. This filename should be different from the filename specified for discard files. The file is created only if row warnings exist. For more information about this option, see page 3-57.
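
For example, a sketch of a REORG statement that combines several of these options (the table, index, and file names are illustrative):

reorg sales
index (sales_star_idx)
reference checking on
on discard delete row
discardfile 'sales_discards'
discards 100;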

discardfile Clause
The discardfile_clause specifies where to store duplicate rows or rows that fail
the referential-integrity check.

discardfile_clause (returns to the REORG syntax, p. 6-12):

DISCARDFILE 'filename'
    [RI_DISCARDFILE 'filename'
     | RI_DISCARDFILE (table_name 'filename' [, table_name 'filename'...])
         [OTHER 'filename']]

DISCARDFILE 'filename'    Specifies the name of the file to which the TMU writes discarded rows. Discarded duplicate rows are always written to this file. Rows discarded because of referential-integrity failure can also be written to this file if no separate RI_DISCARDFILE filenames are specified. This option can be used only when the ON DISCARD DELETE ROW option is specified.

The filename must satisfy the file-specification conventions of the operating system and must be enclosed in single quotation marks.

For a partial REORG operation, you cannot specify the DISCARDFILE option.

The user redbrick must have write permission for the discard files.

RI_DISCARDFILE 'filename'    Name of the file to which to write discarded records that violate referential integrity. This clause cannot be used on a table that does not reference other tables. You can use this option only when the ON DISCARD DELETE ROW option is specified.

The filename must satisfy the file-specification conventions of the operating system and must be enclosed in single quotation marks.

The user redbrick must have write permission for the discard files.

RI_DISCARDFILE table_name 'filename'    A table name and filename pair that names a table referenced by a foreign key in the table being reorganized and a file in which to record the discarded rows that violate referential integrity with respect to the named table.

These name pairs provide a separate discard file for each named table. If a single record violates referential integrity with respect to multiple referenced tables, that record is written to the file associated with each of those tables.

Multiple pairs can be specified. If some but not all referenced tables are listed here, records that violate referential integrity with respect to tables missing from the list are written either to the file following the OTHER keyword or, if that keyword is missing, to the standard discard file (following the DISCARDFILE keyword).

The filenames must satisfy the file-specification conventions of the operating system and must be enclosed in single quotation marks; pairs must be separated by commas.

OTHER 'filename'    Specifies a file in which to record any discarded rows that violate referential integrity with respect to referenced tables not named in the table name and filename pairs. If a table name and filename pair list is present and this clause is omitted, any records that violate referential integrity with respect to tables missing from the list are written to the standard discard file (following the DISCARDFILE keyword).

The filename must satisfy the file-specification conventions of the operating system and must be enclosed in single quotation marks.

Usage Notes

Important: To use the REORG statement, you must be a member of the DBA system role or be the owner of the table.

Referential Integrity
Referential integrity is always preserved for databases, except in the
following cases:

■ When you use the OFFLINE OVERRIDE REFCHECK option in ALTER SEGMENT statements.
■ When you use the CLEAR OVERRIDE REFCHECK option in ALTER SEGMENT statements.
■ When you use the OVERRIDE REFCHECK option in SQL DELETE statements.
■ When you use the ON DISCARD DELETE ROW option in REORG statements.
■ When you use the REPLACE mode in a LOAD DATA statement on a table when rows in that table are referenced by another table.

In any of these cases, rows can be deleted from a referenced table in a way that violates referential integrity. Because delete operations that the REORG statement performs do not cascade to referencing tables, if you reorganize a referenced table, you must also reorganize each table that references the reorganized table. To restore referential integrity, perform a REORG operation with the ON DISCARD DELETE ROW and REFERENCE CHECKING ON options enabled for all tables that reference the table from which rows were deleted. The REORG operation deletes any rows that reference a deleted row, thus restoring referential integrity.

Warning: Be sure that you perform the REORG operation on the referencing table, not the table from which the rows were deleted (the referenced table).

Figure 6-2 illustrates how rows are deleted from referencing tables in a
REORG operation. Assume the Fact1 table references the Dim1 table, which
in turn references the Out1 table.
[Figure 6-2: Deletes Do Not Cascade. The figure shows Fact1 referencing Dim1, which in turn references Out1.]

After some rows are deleted from Out1, a REORG operation is performed on Dim1. Any rows in Dim1 that reference rows that were deleted from Out1 (that is, rows that violate referential integrity) are deleted by the REORG operation to preserve referential integrity. However, rows in Fact1 that reference deleted rows in Dim1 are not deleted by the REORG operation on Dim1. To delete these rows and preserve referential integrity, you must also perform a REORG operation on Fact1.
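
For example, a sketch of the statement that restores referential integrity to the referencing table in this scenario, using the options described above (the table name matches the figure):

reorg fact1
reference checking on
on discard delete row;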

Locking Behavior
The REORG operation initially places a read lock on the database (if the
RECALCULATE RANGES option is specified, the REORG operation places a
write lock on the database). The lock on the database is released after the
REORG operation locks the table that it is modifying.

If versioning is enabled, queries can access the previous version of the table
and its indexes while the table is being reorganized. For more information on
versioning, refer to the Administrator’s Guide.

Partial-Index REORG
The limitations of a partial-index REORG operation are as follows:

■ Referential integrity is checked only for specified data segments of the table. Hence, referential integrity cannot be ensured for the entire table.
■ The listed segments must all be either online or offline. If all the
segments are online, an index that is being rebuilt by the REORG
operation remains invisible to users until the rebuilding process is
completed. If those segments being reorganized are offline, the other
segments that are online remain visible throughout the REORG
operation.
■ Only an index that is valid at the outset is marked valid upon
completion of the REORG operation. An invalid index is not marked
valid; to mark the index valid, you must rebuild the entire index.
■ Rows are not deleted from the table or from any index.
■ A partial-index REORG operation cannot be performed on a
DEFERRED index.
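
For example, a sketch of a partial-index REORG that rebuilds one segment of one index and scans only the corresponding data segment (all names are hypothetical; note that ON DISCARD must specify INVALIDATE INDEX or ABORT here, because DELETE ROW is not allowed):

reorg sales
segment (sales_seg6)
index (sales_star_idx) segment (idx_seg6)
on discard invalidate index;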

Online and Offline Operation
When the segments being reorganized are offline, the indexes are not marked
invalid. Instead, an internal flag is set to indicate that the segment is being
rebuilt. When this flag is set, the segment cannot be brought online. When the
REORG operation completes successfully, the flag is reset. If the REORG is
unsuccessful, the segment remains inaccessible until it is successfully
reorganized.

During a full-index REORG operation, all index segments must be online and
remain invisible to users until the rebuilding process is complete.

Tip: A REORG operation cannot rebuild a segment marked “damaged” until the problem is corrected.

Disk Space
An index that runs out of disk space is marked invalid, but the excess rows
are not deleted from the table. If you specify the DELETE ROW option or
INVALIDATE INDEX option, only the index that runs out of space is marked
invalid and the process of building other indexes continues. If you specify the
ABORT option, the REORG operation ends immediately. If versioning is
enabled, all REORG changes are rolled back to their initial state. If versioning
is not enabled, all indexes being rebuilt are marked invalid.

Discardfile Format
Discarded rows are written in external format. You can load them into a table
by using a load script generated by an UNLOAD statement. For more
information, see “Unloading or Loading External-Format Data” on
page 4-16.

Chapter 7
Moving Data with the Copy Management Utility
In This Chapter . . . . . . . . . . . . . . . . . . . . 7-3
The rb_cm Utility . . . . . . . . . . . . . . . . . . . 7-4
System Requirements . . . . . . . . . . . . . . . . 7-5
Database Security Requirements . . . . . . . . . . . . . 7-6

The rb_cm Syntax . . . . . . . . . . . . . . . . . . . 7-7


TMU Control Files for Use with rb_cm . . . . . . . . . . . . 7-10
LOAD and UNLOAD Statements . . . . . . . . . . . . 7-11
Specifying INTERNAL Format . . . . . . . . . . . . 7-11
Specifying EXTERNAL Format . . . . . . . . . . . . 7-11
SYNCH Statement. . . . . . . . . . . . . . . . . . 7-12
SET Statements . . . . . . . . . . . . . . . . . . . 7-13

Examples of rb_cm Operations . . . . . . . . . . . . . . . 7-13


Example: Copying Data Between Different Computers . . . . . 7-14
Setting Up the UNLOAD Control File . . . . . . . . . 7-15
Setting Up the LOAD DATA Control File . . . . . . . . 7-15
Running rb_cm . . . . . . . . . . . . . . . . . 7-16
Example: Copying Data Between Tables on the Same Computer . . 7-18
Setting Up the UNLOAD Control File . . . . . . . . . 7-18
Setting Up the LOAD Control File . . . . . . . . . . . 7-19
Running rb_cm . . . . . . . . . . . . . . . . . 7-19
Verifying the Results of rb_cm Operations . . . . . . . . . . . 7-20
In This Chapter
In an enterprise environment with multiple IBM Red Brick Warehouse
databases linked by networks, you might need to move data among
databases to synchronize the data in equivalent tables. For example:

■ A database administrator at the regional level needs to periodically


consolidate data from various stores into a larger regional database.
■ A database administrator for a small department needs to periodi-
cally refresh the department database with current corporate data.
■ A retail organization might want to share sales data for specific items
with the suppliers of those items.

This chapter contains the following sections:

■ The rb_cm Utility


■ The rb_cm Syntax
■ TMU Control Files for Use with rb_cm
■ Examples of rb_cm Operations
■ Verifying the Results of rb_cm Operations

The rb_cm Utility
The TMU can perform high-performance unloads and loads to and from
physical storage. The rb_cm utility provides an interface that allows you to
combine the following tasks into a single operation:

■ Extract data from a database by using a TMU UNLOAD statement.


■ Move the data across a network (rather than to and from physical
storage).
■ Load the extracted data into a database using a TMU LOAD
statement.

Figure 7-1 illustrates the difference between an rb_cm copy operation and the
LOAD and UNLOAD statements.

[Figure 7-1: Difference Between rb_cm and LOAD and UNLOAD. An rb_cm copy operation moves data directly across the network from the warehouse on one host to the warehouse on another, whereas separate UNLOAD and LOAD operations pass the data through disk or tape.]

The rb_cm utility can copy data between any two tables (in the same
database, in different databases, in different databases in different
warehouses, or in different databases on different platforms) as long as the
column data types are compatible between the source and destination tables.

The rb_cm utility supports all of the existing TMU load and unload functions.
For example, rb_cm can unload data from the source table in internal or
external format. It can unload by column value (selective unload) or by
segment, and load in APPEND, INSERT, MODIFY, REPLACE, or UPDATE mode.
These features give you substantial flexibility in copying data between tables.

You can issue an rb_cm command from either the computer on which the
source table is located or the computer on which the destination table is
located. The following sections discuss the requirements for running the
rb_cm utility.

System Requirements
If you are performing a copy operation over a network, either from the local
computer to the remote computer or from the remote computer to the local
computer, the system requirements are as follows:

■ Both computers must be on the same network.


■ Both computers must have IBM Red Brick Warehouse installed.
■ You must be able to establish communication from the local host to
the remote host as follows:
From the local host where you issue the command, you must be able
to access the shell daemon or the Red Brick Service on the remote
host.
■ Your environment on the remote host must have the rb_tmu executable in your path.
UNIX On UNIX your .cshrc or profile file on the remote host should include
lines to set the PATH environment variable to include
redbrick_dir/bin. ♦
Windows On Windows the Red Brick Service on the remote host automatically
uses the correct path.
■ You must start the Red Brick Copy Management Service on the
remote host. The Red Brick Copy Management Service does not start
on Windows server computers that are already running the Remote
Shell Service. To start the Copy Management Service, you have to
stop the Remote Shell Service first. ♦

If you are copying data between tables on the same computer, no special
system requirements apply.

If you are issuing the rb_cm command from a system other than the source
or destination computer, you must be able to access the remote shell on the
source computer, and you must also be able to access the remote shell on the
destination computer from the source computer.

Database Security Requirements
To copy table data by using the rb_cm utility, the user running rb_cm must have the necessary authorizations on both the source and destination databases, as follows:

■ One of the following items on the source database:


❑ The DBA system role
❑ Ownership of the source table
❑ The ACCESS_ANY task authorization
❑ The SELECT privilege on the source table
■ One of the following items on the destination database:
❑ The DBA system role
❑ Ownership of the destination table
❑ The MODIFY_ANY task authorization
❑ The INSERT, DELETE, or UPDATE privileges on the destination
table. The specific privileges needed depend on the type of load
that the rb_cm utility performs. The following table summarizes
this requirement.

Load Mode at Destination    Required Permission
                            INSERT    DELETE    UPDATE
APPEND                      Yes       No        No
INSERT                      Yes       No        No
MODIFY                      Yes       No        Yes
REPLACE                     Yes       Yes       No
UPDATE                      No        No        Yes

The rb_cm Syntax
rb_cm [-s unload_host] [-c unload_config_path] [-h unload_rbhost]
      [-d unload_database] [-e unload_prog_path] [-f filter_cmd]
      unload_control_file unload_username unload_password
      [-s load_host] [-c load_config_path] [-h load_rbhost]
      [-d load_database] [-e load_prog_path] [-f filter_cmd]
      [-p] load_control_file load_username load_password

The first group of arguments (the source, or unload, parameters) describes the unload side of the copy; the second group (the destination, or load, parameters) describes the load side.

Important: The prefix “unload” refers to the source of the data; the prefix “load” refers to the destination.
-s unload_host, load_host    Optional. Hostname of the computer on which the corresponding source or destination table is located. The default is the hostname of the computer from which you issue the rb_cm command.

-c unload_config_path, load_config_path    Path to the directory that contains the rbw.config file for the corresponding source or destination warehouse host.

UNIX: If the RB_CONFIG environment variable is set, this command-line argument is optional and, if present, overrides the environment variable. If RB_CONFIG is not set, this argument is required. ♦

Windows: The value of RB_CONFIG is taken from the Registry, based on the value of RB_HOST. If desired, another path for a different configuration file can be specified. ♦

If no value is specified, the default is the value of RB_CONFIG on the corresponding source or destination computer.

-h unload_rbhost, load_rbhost    Logical name for the warehouse API daemon or thread (rbwapid) for the corresponding source or destination warehouse host.

UNIX: If the RB_HOST environment variable is set, this command-line argument is optional and, if present, overrides the environment variable. If RB_HOST is not set, this argument is required. ♦

Windows: The value of RB_HOST is determined by its value in the Registry; if desired, you can specify another value for RB_HOST. ♦

The default is the value of RB_HOST on the corresponding source or destination computer.

-d unload_database, load_database    Logical database names of the corresponding source and destination databases.

UNIX: If the RB_PATH environment variable is set, this command-line argument is optional and, if present, overrides the environment variable. If RB_PATH is not set, this argument is required. ♦

Windows: The value of RB_PATH is taken from the Registry, based on the value of RB_HOST. If desired, you can specify another logical database name. ♦

The default is the value of RB_PATH on the corresponding source or destination computer.

-e unload_prog_path, load_prog_path    (UNIX) Specifies the pathname to the directory containing the rb_tmu, rb_ptmu, and rb_cm utilities.

If either the source or destination computer is a remote computer and the default login shell on that computer is the Korn shell (ksh), you must use this argument to specify the path. The -e option is optional with the other shells. ♦

-f filter_cmd
    Optional. User-supplied filter program. The program can be an executable or a script; the only restrictions are that it accept input from standard input and send output to standard output. (See the sketch following this list.)
-p
    Specifies the parallel TMU (rb_ptmu) to execute the LOAD statement contained in load_control_file. (You gain no benefit by running the parallel TMU for an unload operation, so the standard TMU always performs it.)

unload_control_file
    Full pathname of the file that contains the UNLOAD statement. This file must be located on the computer specified by unload_host. For more information, refer to “TMU Control Files for Use with rb_cm”.

unload_username, unload_password
    Database username and password under which the unload portion of the rb_cm operation is performed.

load_control_file
    Full pathname of the file that contains the LOAD statement. This file must be located on the computer specified by load_host. For more information, refer to “TMU Control Files for Use with rb_cm” on page 7-10.

load_username, load_password
    Database username and password under which the load portion of the rb_cm operation is performed.

Important: Arguments in the rb_cm command must be in the order shown in the syntax.
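The following sketch shows a filter program that could be passed with -f. The script name and its transformation (uppercasing the data stream) are illustrative assumptions, not part of the product, and a text transformation of this kind is only meaningful for data unloaded in external format:

#!/bin/sh
# upcase_filter: reads records on standard input and writes the
# transformed records to standard output, as rb_cm requires of
# a filter program.
tr '[:lower:]' '[:upper:]'

You would then pass the full pathname of the script in the rb_cm command, for example -f /usr/local/bin/upcase_filter.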
TMU Control Files for Use with rb_cm

The rb_cm utility works by directing the output from a TMU UNLOAD statement to a TMU LOAD statement. Before you run rb_cm, therefore, you must prepare compatible TMU LOAD and UNLOAD control files.

The syntax for an UNLOAD control file for use with rb_cm is as follows:

    UNLOAD statement
    SET statement

The syntax of a LOAD DATA and/or a SYNCH OFFLINE SEGMENT control file for use with rb_cm is as follows:

    LOAD DATA statement
    SET statement

    SYNCH SEGMENT statement
    SET statement

Important: You can include only a single UNLOAD statement in an UNLOAD control file and only a single LOAD DATA statement in a LOAD DATA control file. Each statement must end with a semicolon (;).

In both LOAD and UNLOAD control files, you must separate multiple control statements with a semicolon (;). You can enclose comments either in C-style delimiters (/*…*/), in which case they can span multiple lines, or precede them with two hyphens (--) and end them with an end-of-line character, in which case they are limited to a single line.
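For illustration, a complete UNLOAD control file using both comment styles might look like the following sketch (the table name comes from the examples later in this chapter):

/* unload_sales control file: writes the SALES table
   to standard output for the load side of rb_cm */
UNLOAD SALES OUTPUTFILE '-';  -- a single statement, terminated by ;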
LOAD and UNLOAD Statements

The LOAD and UNLOAD statements are required for their respective control files. The syntax of these statements differs in one respect from the syntax of LOAD and UNLOAD statements used with the TMU directly: you cannot specify a disk file or tape device as the destination of the unload operation, and you cannot specify a disk file or a tape device as the data source of the load operation. You must specify standard output as the unload destination and standard input as the load source.

For LOAD and SYNCH statement syntax, refer to Chapter 3, “Loading Data into a Warehouse Database.” For UNLOAD statement syntax, refer to Chapter 4, “Unloading Data from a Table.”

Specifying INTERNAL Format

If you are using the rb_cm utility to copy data between tables stored on the same platform type (for example, if both source and destination platforms are Sun Solaris systems), unload the data by using the internal format. Internal format is a binary data format that requires less time to load than external-format data.

Internal format is the default unload format and does not need to be specified explicitly in the UNLOAD statement. The corresponding LOAD statement must include the FORMAT UNLOAD keywords, however, to indicate that the data to be loaded is internal-format data.
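A minimal pair of control files for a same-platform copy might look like the following sketch; the table name and keyword order mirror the examples later in this chapter, and the APPEND load mode is an illustrative choice:

-- UNLOAD control file (internal format is the default)
UNLOAD SALES OUTPUTFILE '-';

-- LOAD control file (FORMAT UNLOAD indicates internal-format input)
LOAD DATA INPUTFILE '-'
APPEND
FORMAT UNLOAD
INTO TABLE SALES;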
Specifying EXTERNAL Format

If you are using the rb_cm utility to copy data between tables stored on platforms of different types (for example, if the source platform is Compaq TRU-64 and the destination platform is Sun Solaris), you must unload the data into an external character format. External format produces plain-text data in the database-locale code set, data that is compatible across different platform types.
To unload data in the external format, include the EXTERNAL (VARIABLE) keyword in the UNLOAD statement. To load external-format data, you must define all the fields in the input data records, which increases the complexity of the LOAD statement. However, you can have the TMU automatically generate a LOAD statement for the specific data and table you are unloading, either as part of the UNLOAD process or with a separate TMU GENERATE statement.

For example, if you create a file with the following GENERATE statement and run the TMU for this file, the TMU produces a file named load_control_file:

GENERATE LOAD FROM unload_table
INPUTFILE '-'
EXTERNAL
TMUFILE 'load_control_file';

The load_control_file contains the necessary LOAD statement with all the column data types defined. You can edit this file to include any other TMU directives that might be required, such as a SYNCH statement or SET option.
SYNCH Statement

If data is copied into an offline segment of a table, that segment must be synchronized with the rest of the table. To perform this synchronization, write a control statement that contains a SYNCH statement, including the segment and table names. Use this statement only in conjunction with load operations into offline segments.

For information about the SYNCH statement, refer to “Writing a SYNCH Statement” on page 3-119.
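A minimal sketch of such a control file follows. The segment and table names are illustrative, and the exact keywords should be checked against “Writing a SYNCH Statement” on page 3-119:

-- synchronize the offline segment after the rb_cm load completes
SYNCH SEGMENT promo_seg OF TABLE sales;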
SET Statements

The SET statements provide controls for:

■ TMU behavior when the database or target tables are locked (SET LOCK option). If you are using rb_cm for unattended copies in multiuser environments, have the SET LOCK option set to WAIT.
■ Size of the buffer cache that the TMU uses (SET BUFFERCOUNT option).
■ Temporary space that the TMU uses for index-building operations (SET INDEX_TEMPSPACE options).

You can include all of these SET statements in a LOAD control file and all except the INDEX_TEMPSPACE options in an UNLOAD control file.

A SET statement affects load or unload behavior only during the rb_cm operation that uses the control file containing that SET statement. After that session, the option value reverts to the value specified in the rbw.config file; if no value is specified in the rbw.config file, the option value reverts to its default.

For more information about these SET statements, refer to “SET Statements and Parameters to Control Behavior” on page 2-23.
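For instance, a LOAD control file for an unattended nightly copy might begin with SET statements like the following sketch (the BUFFERCOUNT value is an illustrative assumption):

SET LOCK WAIT;        -- wait for table locks instead of failing
SET BUFFERCOUNT 64;   -- illustrative buffer cache size
-- the single LOAD DATA statement follows here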
Examples of rb_cm Operations

This section presents two scenarios in which data needs to be copied between tables and gives examples of the required rb_cm commands. These scenarios illustrate:

■ Copying data between different computers.
■ Copying data between tables in the same data warehouse on the same computer.
Example 1: Copying Data Between Different Computers
Suppose that the Aroma database is located in a data warehouse for sales and
marketing. A regional marketing team maintains a version of the Aroma
database named Southregion, which contains only the data relevant to their
region. The regional team can make changes to the Southregion database
and run long queries against it without affecting users outside the team.

Periodically, the regional team wants to copy any new rows relevant to their
region from the Sales table in the Aroma database to the Sales table in the
Southregion database. They can do this by using rb_cm with a selective
unload operation. The following figure summarizes this scenario.
[Figure: New rows are copied from the Sales table in the Aroma database (the corporate Red Brick database; host platform Compaq TRU-64; hostname main) to the Sales table in the Southregion database (the regional Red Brick database; host platform Sun Solaris; hostname south1).]
To perform the operation that the preceding figure describes, the administrator needs to carry out the following tasks:

1. Set up the UNLOAD control file.
2. Set up the LOAD control file.
3. Issue the rb_cm command.

All of these steps are described in the following section. The first two steps only need to be performed once, before the initial copy operation. Subsequent copies can reuse the same control files.
Setting Up the UNLOAD Control File

The administrator sets up an UNLOAD control file named unload_new_sales on the host computer. The UNLOAD statement that this file contains must unload to external format because the source and destination computers in this example have different architectures. The UNLOAD statement must also perform a selective unload of the relevant rows. The following UNLOAD statement fulfills both of these requirements:

UNLOAD SALES
EXTERNAL
OUTPUTFILE '-'
WHERE PERKEY = 94050
AND (MKTKEY = 1
OR MKTKEY = 2
OR MKTKEY = 3
OR MKTKEY = 4);

This UNLOAD statement performs a selective unload of those rows that are relevant to the south region (where Mktkey is equal to 1, 2, 3, or 4) and are new (where Perkey is equal to the most recent value).
Setting Up the LOAD DATA Control File

The administrator sets up a LOAD control file named load_new_sales on the destination computer. You can use the GENERATE statement to obtain a LOAD DATA statement as follows:

GENERATE LOAD DATA FROM SALES INPUTFILE '-' EXTERNAL
TMUFILE 'load_new_sales';

The administrator enters this statement in a TMU control file and runs the TMU. The TMU creates a file named load_new_sales that contains the following LOAD statement:

LOAD DATA INPUTFILE '-'
RECORDLEN 62
INSERT
INTO TABLE SALES (
PERKEY POSITION(2) INTEGER EXTERNAL(11) NULLIF(1)='%',
PRODKEY POSITION(14) INTEGER EXTERNAL(11) NULLIF(13)='%',
MKTKEY POSITION(26) INTEGER EXTERNAL(11) NULLIF(25)='%',
DOLLARS POSITION(38) DECIMAL EXTERNAL(12) NULLIF(37)='%',
WEIGHT POSITION(51) INTEGER EXTERNAL(11) NULLIF(50)='%');
With the present example, the administrator needs to modify the load_new_sales file by changing the INSERT keyword to APPEND. This directs the TMU to append the rows to the destination table. The administrator might also add an OPTIMIZE ON clause to improve performance. The LOAD statement now looks as follows:

LOAD DATA INPUTFILE '-'
RECORDLEN 62
APPEND
OPTIMIZE ON
INTO TABLE SALES (
PERKEY POSITION(2) INTEGER EXTERNAL(11) NULLIF(1)='%',
PRODKEY POSITION(14) INTEGER EXTERNAL(11) NULLIF(13)='%',
MKTKEY POSITION(26) INTEGER EXTERNAL(11) NULLIF(25)='%',
DOLLARS POSITION(38) DECIMAL EXTERNAL(12) NULLIF(37)='%',
WEIGHT POSITION(51) INTEGER EXTERNAL(11) NULLIF(50)='%');

Important: The TMU adds a NULLIF clause at the end of each field specification and uses it when it loads the data. An extra position before each field is reserved for a null indicator.
Running rb_cm

After the control files are written, a user with the necessary privileges can issue an rb_cm command by using those files. The rb_cm command can be issued from either the source host computer (main) or the destination host computer (south1).

If the rb_cm command is issued from main, the source host computer, the command for this operation (formatted for readability) might resemble the following. The first two continuation lines hold the source (unload) parameters; the last two hold the destination (load) parameters:

% rb_cm \
    -s main -c redbrick_dir -h RB_HOST -d Aroma \
    $RB_CONFIG/util/unload_new_sales maindba secret \
    -s south1 -c /south1_redbrick_dir -h RB_HOST -d Southregion \
    '$RB_CONFIG'/util/load_new_sales southdba cryptic
The rb_cm command requires the following considerations:

■ The -c, -h, and -d options are present for both source and destination and override the corresponding environment variables. These options are not required if the corresponding variables are set.
■ Windows: You cannot use the RB_CONFIG environment variable as part of an explicit pathname with the rb_cm utility for a source or destination that is a Windows system. ♦
■ If you use the RB_CONFIG environment variable to specify a control file on the remote computer, you must use appropriate escape characters so that it is passed to the remote computer and not interpreted on the local computer.

If you issue an equivalent command from the destination host computer, the command (formatted for readability) might resemble the following:

% rb_cm \
    -s main -c redbrick_dir -h RB_HOST -d Aroma \
    '$RB_CONFIG'/util/unload_new_sales maindba secret \
    -s south1 -c /south1_redbrick_dir -h RB_HOST -d Southregion \
    $RB_CONFIG/util/load_new_sales SouthDBA cryptic

Important: The preceding examples are broken into multiple lines for clarity; when you enter the rb_cm command, enter it as a single line.
Example 2: Copying Data Between Tables on the Same Computer
Suppose that the same regional marketing team from the previous example
keeps a copy of their Southregion database named Testdb, to which they can
make numerous updates and simulate various scenarios. Periodically they
need to replace the modified data in Testdb with the actual data stored in the
Southregion database. The following figure illustrates this scenario for a
single table.
[Figure: All rows are copied from the Sales table in the Southregion database to the Sales table in the Testdb database. Both are regional databases on the same host (host platform Sun Solaris; hostname south1).]
To perform this operation the administrator must set up the LOAD and
UNLOAD control files and then run an appropriate rb_cm command.
Setting Up the UNLOAD Control File

The administrator sets up an UNLOAD control file named unload_south_sales. This file contains the following UNLOAD statement:

UNLOAD
SALES
OUTPUTFILE '-';

This UNLOAD statement unloads data in internal (binary) format. This is possible because both UNLOAD and LOAD operations take place on the same platform.
Setting Up the LOAD Control File

The administrator sets up a LOAD control file named load_south_sales. This file contains the following LOAD statement:

LOAD
INPUTFILE '-'
REPLACE
FORMAT UNLOAD
OPTIMIZE ON
INTO TABLE SALES;

This LOAD statement replaces all the rows in the Sales table with the loaded data.
Running rb_cm

After the control files are set up, a user with the required privileges can issue an rb_cm command to copy the data. The command might look like the following example:

% rb_cm \
    -d Southregion $RB_CONFIG/util/unload_south_sales SouthDBA cryptic \
    -d Testdb /mktg/local/test/util/load_south_sales TestDBA cryp007tic

The rb_cm command requires the following considerations:

■ The -s option is not present, so the default source and destination is the computer on which the rb_cm command is issued.
■ The -c and -h options are not present; both databases are in the same warehouse and use the warehouse API and configuration file defined by the RB_HOST and RB_CONFIG environment variables.
■ Windows: You cannot use the RB_CONFIG environment variable as part of an explicit pathname with the rb_cm utility for a source or destination that is a Windows system. ♦
Verifying the Results of rb_cm Operations
To verify that all of the rows are successfully copied by the rb_cm utility,
query the RBW_LOADINFO table in the destination database. This system
table holds information on each load operation performed against the
database, including loads that are issued as part of an rb_cm operation. This
information includes the times at which the load started and completed, the
number of rows inserted into the table, and the status of the load. For more
information on the RBW_LOADINFO system table, refer to the Administrator’s
Guide.
The following example illustrates how to query the RBW_LOADINFO system table to determine load information:

RISQL> select substr(tname, 1, 12) as table_name,
> substr(username, 1, 10) as user_name,
> substr(string(started), 1, 19) as load_start,
> substr(string(finished), 1, 19) as load_finish,
> substr(status, 1, 6) as status
> from rbw_loadinfo
> where tname = 'SALES';
TABLE_NAME    USER_NAME   LOAD_START           LOAD_FINISH          STATUS
SALES         SYSTEM      2000-04-05 00:33:24  2000-04-05 00:33:26  NULL
SALES         SYSTEM      2000-04-06 00:35:38  2000-04-06 00:35:41  NULL
RISQL>
Chapter 8

Backing Up a Database
In This Chapter . . . . . . . . . . . . . . . . . . . . 8-3
Backup Levels and Modes . . . . . . . . . . . . . . . . 8-4
External Full Backups . . . . . . . . . . . . . . . . 8-4
Restore Rules . . . . . . . . . . . . . . . . . . . 8-5
Backup Data . . . . . . . . . . . . . . . . . . . . 8-5
Backup Strategies . . . . . . . . . . . . . . . . . . 8-6
How Many Backups? . . . . . . . . . . . . . . . 8-6
Which Level? . . . . . . . . . . . . . . . . . . 8-6
Online or Checkpoint? . . . . . . . . . . . . . . . 8-7
How Important is Data Recovery? . . . . . . . . . . . 8-7
General Recommendations . . . . . . . . . . . . . 8-8
Backup Procedure . . . . . . . . . . . . . . . . . . 8-8
Preparing the Database for Backups . . . . . . . . . . . . . 8-8
ALTER DATABASE CREATE BACKUP DATA Command . . . . 8-9
ALTER DATABASE DROP BACKUP DATA Command . . . . . 8-10
Storage Requirements for the Backup Segment . . . . . . . . 8-10
Altering the Backup Segment . . . . . . . . . . . . . . 8-11
Valid Operations . . . . . . . . . . . . . . . . . 8-12
ADD STORAGE Example . . . . . . . . . . . . . . 8-12
Invalid Operations . . . . . . . . . . . . . . . . 8-12
How to Run a TMU Backup . . . . . . . . . . . . . . . . 8-13
Scope of Backup Operations . . . . . . . . . . . . . . 8-14
Database Locale . . . . . . . . . . . . . . . . . 8-14
Versioned Databases . . . . . . . . . . . . . . . 8-14
Configuring the Size of Backup Files . . . . . . . . . . . 8-15
Backups to Tape . . . . . . . . . . . . . . . . . . 8-17
Standard Label Format . . . . . . . . . . . . . . . 8-18
Tape Device Configuration . . . . . . . . . . . . . 8-18
Tape Capacity . . . . . . . . . . . . . . . . . . 8-18
Using a Storage Manager for TMU Backups . . . . . . . . . 8-19
Using External Tools for Full Backups . . . . . . . . . . . 8-20
Recommended Procedure for Foreign Backup Operations . . . 8-21
BACKUP Syntax . . . . . . . . . . . . . . . . . . 8-22
Examples of Backup Operations . . . . . . . . . . . . 8-24
Messages Displayed During Backups . . . . . . . . . . 8-25
Backup Metadata . . . . . . . . . . . . . . . . . . . 8-26
Media History File (rbw_media_history) . . . . . . . . . . 8-27
Editing the Media History File . . . . . . . . . . . . 8-28
PSU Offset Example . . . . . . . . . . . . . . . . 8-29
Backup Log File (action_log) . . . . . . . . . . . . . . 8-29
In This Chapter

Most databases are routinely modified by table loads and server-based DDL and DML operations. Such databases should be backed up on a regular schedule in case of system or software failure. In the event of a failure, the presence of a recent backup makes it possible to fully recover the database. The larger the database, the more important it is to back it up.

Before deciding when and how often to back up a database, you must understand the types of Table Management Utility (TMU) backups you can perform and anticipate the time and effort involved in each case. As your database evolves, you must consider the amount of data at risk and be prepared to restore the database if necessary.

This chapter explains how to back up an IBM Red Brick Warehouse database with the TMU. The chapter contains the following main sections:

■ Backup Levels and Modes
■ Preparing the Database for Backups
■ How to Run a TMU Backup
■ Backup Metadata

For information about restore operations, see Chapter 9, “Restoring a Database.”
Backup Levels and Modes

The TMU supports full backups (level 0) and incremental backups (levels 1 and 2).

■ A level 0 backup is a full backup of the database. A copy of every database object is stored in the specified media. The latest level 0 backup represents the starting point for any restore operations.
■ A level 1 backup is a backup of all of the data that has changed since the last level 0 backup.
■ A level 2 backup is a backup of all of the data that has changed since the last backup of any kind (0, 1, or 2).

Regardless of its level, a TMU backup can be performed in either online mode or checkpoint mode:

■ Online backups take place while the database is “live”; both read operations (queries) and write operations (updates and loads) are allowed.
■ Checkpoint backups take place with the database in read-only mode, available for read operations but not for write operations. You cannot modify the database while a checkpoint backup is in progress.

Important: You should not restore a database to the level of an online backup; online backups do not guarantee database recovery to a consistent state. The restore process is not complete until you have returned the database to its state at the time of a checkpoint backup.
External Full Backups

TMU incremental backups can be seamlessly integrated with full backups performed with third-party tools and operating-system utility programs. This combination of backups is a good solution for very large data warehouses and for customers who have a system-wide full backup solution already in place. For more information about this approach, see page 8-20.
Restore Rules

The following table lists the supported combinations of backup levels and modes and indicates the associated rules for database restores:

Level   Mode         Comments on Restore Process

0       Checkpoint   Can be restored.
1       Checkpoint   Can only be restored after level 0 (online, checkpoint, or external).
2       Checkpoint   Can only be restored after level 0 (online, checkpoint, or external) or level 1 (online or checkpoint).
0       Online       Cannot be restored; requires subsequent checkpoint.
1       Online       Can only be restored after level 0; also requires subsequent checkpoint.
2       Online       Can only be restored after level 0 or 1; also requires subsequent checkpoint.

For detailed information about the restore process, refer to Chapter 9, “Restoring a Database.”
Backup Data

IBM Red Brick Warehouse tracks changes to the database by maintaining a bitmap in a special data segment declared as the backup segment. Every 8K block that has changed since the last backup is recorded in this bitmap. Therefore, when databases are backed up incrementally, only the minimum amount of data has to be copied. The larger the database, the more advantageous this approach becomes.

For detailed information about the backup segment, refer to “Preparing the Database for Backups” on page 8-8.
Backup Strategies

A certain amount of planning and scheduling is required to establish a sound backup strategy for your database. The strategy you choose is a trade-off between the performance requirements and time constraints of your routine, scheduled backup operations and the degree of difficulty, reliability, and time constraints of restore operations that you might have to perform in the future. Remember that you can schedule when the next backup should be done, but you cannot predict when a catastrophic failure might mandate a restore operation.

The real purpose of backing up a database is to provide a means of restoring the database should such a failure occur: for example, the loss of a disk or a corrupted file system. With this goal in mind, you need to consider how often to back up the database, what level of backup operation to choose, and whether to run it in online mode or checkpoint mode. Another consideration is how much data you are willing to put at risk; in other words, how many updates can you afford to reload or lose rather than recover? All of these factors will influence your TMU backup implementation.
How Many Backups?
The frequency of the backups you perform should be based on the extent of
the changes that routinely occur to the database. If your data warehouse is
static during the week while users are running queries, you need not
schedule backups during the week. If, on the other hand, the database is
refreshed with new data every night, a daily incremental backup is a wise
choice. In other words, you need to know how much data is at risk at any
given time. Any data that has yet to be copied to a safe backup file or tape is
at risk.
Which Level?
The level of backup you choose depends on the extent of the changes and the
size of the database. These factors determine how long a backup might take.
A full backup takes a long time to complete, and the larger the database, the
longer it takes. Incremental backups are faster, and are very effective for
picking up relatively small changes to the database, such as a new index or
some inserts into a dimension table.

The only disadvantage to frequent incremental backups is that they can cause
the restore process to be more difficult. If there is too long an interval between
a series of level 2 backups and the last full backup, you will have to restore
all of these backups (as well as the last full backup) in order to bring the
database back to a consistent state. Level 1 backups are slower, but because
they pick up all the changes since the last level 0, they reduce the number of
incremental backups from which the database needs to be restored.
Online or Checkpoint?

The third factor to consider is whether the database needs to be available for modifications during the backup operation. If there is not enough time to run a full backup while the database is unavailable for loads and updates, you might choose to run time-consuming backups in online mode, closely followed by quick checkpoint backups that pick up any changes that the online backup missed. You can schedule the online backups to run anytime but schedule the checkpoints during database “downtime.” The checkpoints are essential; without them, the database cannot be restored to a consistent state.
How Important is Data Recovery?

Finally, consider how much data loss your application can afford to sustain. You might forego regular incremental backups in the knowledge that minor daily updates to the database are easier to reload than to back up and restore. On the other hand, if millions of rows are added or updated every night, you must maintain a daily checkpoint backup of those changes.

If your database is relatively small, it might be more efficient to periodically unload the database than to back it up. You can then use a reload procedure instead of a restore operation to recover the database. If you choose this strategy, it is critical that you preserve all of the input files and TMU control files that you use to load the database.

Conversely, if your database is very large, it might be more efficient to use an external tool for full backups and only use the TMU for incremental backups, as discussed on page 8-20.
General Recommendations

As part of a reliable and efficient TMU backup program, IBM recommends that you:

■ Provide consistent recovery points at regular intervals by closely following online backups with checkpoint backups
■ Reduce the number of incremental backups that might need to be restored by taking regular level 0 and/or level 1 backups

The examples on page 9-5 illustrate how different combinations of backups affect the restore operations that might be required.
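To make these recommendations concrete, one hypothetical weekly cycle (an assumption to adapt to your environment, not a prescription) might pair the levels and modes as follows; the directory names are illustrative:

backup to directory '/backups/sunday' checkpoint level 0;
-- Sunday: full baseline
backup to directory '/backups/nightly' online level 2;
-- Monday through Thursday: fast nightly incrementals
backup to directory '/backups/friday' checkpoint level 1;
-- Friday: consolidates the week and provides a consistent recovery point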
Backup Procedure
The general procedure for backing up a Red Brick database is as follows:

1. Prepare the database for backups by creating the backup segment. See page 8-8.
2. Run an initial full backup (online, checkpoint, or external) to provide the baseline data for future restore operations. See page 8-13.
3. Periodically run incremental backups, making sure that checkpoint backups are done frequently enough to provide a means of consistent database recovery. See page 8-13.
Preparing the Database for Backups

Incremental backups rely on bitmap information that indicates which blocks in each physical storage unit (PSU) have changed because of table load operations, data manipulation commands, or the creation of new database objects. These bitmaps allow incremental backups to be performed efficiently on very large databases because the changed blocks of each PSU can be distinguished from those that have not changed since the last backup.

The bitmap information is stored in a single segment known as the backup segment. The database cannot be backed up with the TMU if this segment does not exist. (The backup segment must also be present when a partial restore is performed.)
To create the backup segment:

1. Create a regular segment with a CREATE SEGMENT statement.
   For conceptual information about creating segments, refer to the Administrator’s Guide. For the full syntax of the CREATE SEGMENT command, refer to the SQL Reference Guide. For information about storage requirements for the backup segment, refer to page 8-10.
2. Declare that the new segment is the backup segment by issuing an ALTER DATABASE CREATE BACKUP DATA statement, as explained in the following section.

Alternatively, you can use the Manage Segments and Manage System wizards in the IBM Red Brick Warehouse Administrator tool to create the backup segment.
ALTER DATABASE CREATE BACKUP DATA Command

The ALTER DATABASE CREATE BACKUP DATA command names an existing but unused segment as the backup segment for the database. When the command is issued, the named segment is marked as the backup segment in the DST_DATABASES table. Only one segment per database can be defined as the backup segment. If no backup segment is defined, TMU backup and restore operations cannot be performed.

The following SQL statements show how to create a segment, then define it as the backup segment.

1. Create the segment:

RISQL> create segment backup_seg
> storage '/test/bar1' maxsize 2048000,
> storage '/test/bar2' maxsize 2048000,
> storage '/test/bar3' maxsize 2048000,
> storage '/test/bar4' maxsize 2048000;

2. Define the segment as the backup segment:

RISQL> alter database create backup data in backup_seg;
ALTER DATABASE DROP BACKUP DATA Command

The ALTER DATABASE DROP BACKUP DATA command removes the backup data from the database and changes the backup segment to a regular segment. The segment itself is not dropped. After this command has been issued, TMU backup operations can no longer be performed.

For more information about the ALTER DATABASE command, refer to the SQL Reference Guide.
Storage Requirements for the Backup Segment

You allocate space for the backup segment by specifying the size of one or more PSUs when you issue the CREATE SEGMENT statement. You can also add space to the segment with an ALTER SEGMENT ADD STORAGE command. In general, the best practice is to anticipate the amount of space the backup segment will need and allocate that amount of space when you first create the segment.

Because operations that access the backup segment are lock-intensive, the backup segment should be stored on local, rather than NFS-mounted, filesystems. The amount of local disk space the segment will require depends on the size of the database and how much it is expected to grow. You can use the following formula to calculate the maximum space (in kilobytes) required for a backup segment:

(Total Segments × 96) + (Total PSUs × 4) = Maximum Space

For example, the Aroma database contains 41 segments and 43 PSUs (the 39 default segments consist of one PSU, and the 2 user-defined segments consist of 2 PSUs each):

(41 × 96) + (43 × 4) = 4,108

Assuming that no new segments are created for the database, the maximum amount of space that its backup segment will ever need is 4,108 kilobytes (about 4 megabytes). However, if this database doubles in size, the space allocated to the backup segment needs to be closer to 8 megabytes.
If the backup segment runs out of space when you attempt to create a new database object, an error message is displayed and you have to resubmit the command that failed. To avoid subsequent out-of-space errors, immediately add more space to the backup segment by using an ALTER SEGMENT...ADD STORAGE statement. It is also recommended that you run a full backup after adding storage.

Important: The backup segment itself is backed up only when a TMU checkpoint backup is run. Online TMU backups do not back up the backup segment.
Damage to the Backup Segment

If the backup segment is damaged, backup data for the database is not maintained and no backup and restore operations can be performed until the segment is repaired. For a procedure on repairing damaged segments, refer to the Administrator’s Guide.

After the damaged PSUs have been repaired, IBM recommends that you issue an ALTER SEGMENT VERIFY statement to check that the PSUs are intact, then run a full backup of the database.
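Assuming the backup_seg segment from the earlier examples, that repair sequence might look like the following sketch; check the exact VERIFY syntax in the SQL Reference Guide, and the backup directory name here is illustrative:

RISQL> alter segment backup_seg verify;

followed by a full TMU backup, for example:

backup to directory '/backups/full' checkpoint level 0;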
Do not use the ALTER SEGMENT FORCE INTACT command to mark a repaired
backup segment as intact unless you are sure that the database was not modified
while the backup segment was damaged. If the database was modified during this
time, the next backup operation would fail to back up all the modified blocks
and the database might be left in an inconsistent state. In this case, the only
way to restore the consistency of the database would be to run a full backup.

For more information about the VERIFY and FORCE INTACT options, refer to
the Administrator’s Guide and the SQL Reference Guide.
Altering the Backup Segment
For detailed information about the ALTER SEGMENT command, refer to the
SQL Reference Guide. The following sections identify which ALTER SEGMENT
operations can and cannot be performed on the backup segment.

Valid Operations

The following ALTER SEGMENT operations can be performed on the backup segment:

■ ADD STORAGE
■ CHANGE EXTENDSIZE
■ CHANGE MAXSIZE
■ CHANGE PATH
■ COMMENT
■ FORCE INTACT—Use this option with caution; see “Damage to the Backup Segment” on page 8-11.
■ MIGRATE TO
■ RENAME
■ VERIFY
■ OPTICAL ON/OFF
ADD STORAGE Example

The following ALTER SEGMENT command adds a PSU to the backup segment:

RISQL> alter segment backup_seg add storage '/test/bar5'
> maxsize 2048000;
Invalid Operations

The following ALTER SEGMENT operations cannot be performed on the backup segment:

■ ATTACH
■ CLEAR
■ DETACH
■ DROP LAST STORAGE
■ ONLINE and OFFLINE
  These commands do not apply to the backup segment, which is brought online as soon as it is created and always remains online. The RBW_SEGMENTS system table always indicates that the backup segment is online.
■ RANGE
■ RELEASE STORAGE
■ SEGMENT BY
How to Run a TMU Backup
To run a TMU backup, follow these steps:

1. Decide what your backup strategy is going to be (based on the discussion on page 8-6). At a minimum, you need to decide the level and type of backup operation and where the backup will be stored: on disk, tape, or a storage management system.
2. Make sure that the database is ready for backups. Be sure to allocate sufficient space to the backup segment, based on the size of the database and the number and frequency of the backups you plan to perform. For instructions on creating and sizing the backup segment, see page 8-9.
3. Create a TMU control file that contains a complete BACKUP statement. The BACKUP syntax is defined on page 8-22. For general information about control files, refer to page 1-8.
4. Use the rb_tmu (or rb_ptmu) program to execute the control file and perform the backup. For information about the full rb_tmu syntax, see page 2-5. (If you use rb_ptmu, the backup operation still runs in serial mode; parallel backups are not supported.)

A checkpoint backup operation locks the database against any write operations (DML, DDL, or TMU). Read-only operations continue while the backup is processing, but write operations are locked out until the backup is complete. You can set the TMU SET LOCK WAIT option to cause the backup operation to wait until current locks on the tables are released.

TMU BACKUP operations cannot be initiated with the Client TMU (rb_ctmu) and executed on a remote server machine.
Task Authorization

The BACKUP_DATABASE authorization must be granted to a database user before that user can run a TMU backup operation. For example:
RISQL> grant backup_database to evelyn;
This authorization is inherited by the DBA role. BACKUP_DATABASE authorization does not include authority to verify backups and run restore operations.
Scope of Backup Operations
The scope of a TMU backup operation is always the entire database, including
the system catalog. You cannot back up a single object or set of objects. The
only objects (or changed blocks) that are never backed up are as follows:
■ Damaged segments (You should try to repair and verify damaged segments before running a backup.)
■ The version log segment (if the database is versioned)
■ Related directories and files outside the Red Brick database directory, such as the rbw.config file and the $RB_CONFIG/bar_metadata directory (see page 8-26).

The backup segment is backed up as part of checkpoint backups, but not online backups.
Database Locale
TMU backup and restore operations are fully localized. When the database is
backed up, the database locale is stored with the data. When the database is
restored, the locale of the backup must match the locale of the database;
otherwise, the restore operation fails and an error message is displayed. If
you have to re-create an empty version of a corrupted database in order to
restore it, you must create the new database with the same locale that was
saved in the backups.

Versioned Databases
TMU online backups can be performed on versioned databases, whether or
not the version log is empty when the backup starts. However, TMU check-
point backups can only begin when the version log is empty. If the version
log is not empty, the backup will wait until the vacuum cleaner daemon has
cleaned the version log. Before running a checkpoint backup, issue the
following ALTER DATABASE command:
RISQL> alter database clean version log;

If the version log contains any damaged segments, the clean operation will
not complete. In this case, try to repair the damaged segments and use ALTER
SEGMENT VERIFY commands to make sure they are intact. Do not use the
REMOVE DAMAGED SEGMENTS clause in the ALTER DATABASE command; if
you do, you might remove data blocks that should have been fixed and
included in the backup.

Before doing a restore, you should also attempt to fix any damaged segments.
However, when you clean the version log before a restore, you can include
the REMOVE DAMAGED SEGMENTS clause in the ALTER DATABASE
command:
RISQL> alter database clean version log remove damaged segments;

Then you can restore the damaged segments, using either full or partial
restore operations. Because this command only clears the damaged segment
blocks from the version log, the segments still exist in the system tables and
partial restores will work. For more details about partial restores, see
page 9-19.
Configuring the Size of Backup Files
Whether your TMU backups are written to disk, tape, or a storage
management system, the amount of backup data committed per transaction
is defined by the size of the backup and restore unit or BAR unit. The BAR unit
size represents the maximum size of individual backup files and XBSA
objects, except in those cases where the size of the backed-up blocks for a
given PSU exceeds the BAR unit size. Backup blocks for a single PSU cannot
be split across different backup files, but blocks for different PSUs can be
backed up within a single backup file.
The following diagram illustrates a case where the BAR unit size is set to 300
megabytes. Because the backup data for PSU6 is approximately 320
megabytes, it occupies a single backup file that exceeds the configured size.
The other two files adhere to the size limit and contain the backed-up blocks
for multiple PSUs.

[Figure: The backed-up blocks for PSU1, PSU2, and PSU3 are stored together in bar_unit1 (290MB); those for PSU4 and PSU5 in bar_unit2 (280MB); PSU6 alone fills bar_unit3 (320MB), exceeding the configured 300MB BAR unit size.]
The ability to store the blocks for multiple PSUs in a single backup file reduces
the number of data commits required during each backup operation. In turn,
this approach optimizes the performance of both the backup operation and
any restore operations from that backup. However, if the BAR unit size is set
too high, more data is potentially at risk while a backup operation is in
progress. If a failure occurs, the amount of data not yet committed could be
much greater and the entire current unit has to be backed up again.
The default BAR unit size is 256 megabytes. To change this setting, enter a value for the BAR_UNIT_SIZE parameter in the rbw.config file. For example:

TUNE BAR_UNIT_SIZE 512M

Alternatively, enter the equivalent SET command in a TMU control file:

SET BAR_UNIT_SIZE 512M;

The TMU returns an error if you enter a value less than the minimum setting of 1 megabyte (1M).
On disk, the maximum setting for this parameter is 2 gigabytes (2G), and the
TMU returns an error if the setting exceeds 2G. Nonetheless, the physical
maximum for a single-PSU backup file on disk is 2G minus 8K. On tape and
XBSA devices, the maximum size could be as much as 2 terabytes, depending
on the device type and configuration.
For more information about TMU SET commands and TUNE parameters, see
page 2-23.
Recommendations
IBM recommends using the default BAR_UNIT_SIZE value for your initial
TMU backups. You can track the number of BAR units that each TMU backup
produces by checking the contents of the action_log file; see page 8-29.

Before changing the parameter, you need to consider several factors in your
application environment, including the size of the database, I/O performance
for the operating system, the average number of dirty blocks per PSU, the
configuration of the storage manager (if you are using XBSA backups), and so
on.

Remember that the BAR unit size represents the recommended size of each
backup file, not an absolute size for all backup files. At run-time, a very large
PSU with a large number of dirty blocks will cause that backup file’s size to
be rounded up to some higher value (but a value as close as possible to the
specified setting).
UNIX

Backups to Tape
The TMU can perform backups to a wide range of non-rewind tape devices
that support UNIX open/read/write/close interfaces, using 4mm, 8mm, and
DLT tapes.
Backups directly to tape are not supported on Windows platforms; however, you can configure an XBSA storage manager that stores backups on tape. For information about configuring a storage manager, see page 8-19.
Warning: You cannot reuse the same tape for a subsequent TMU backup; the second backup will overwrite the first backup. Make sure you mount a new tape on the device before proceeding with a new backup operation.
Standard Label Format
The backup files on tapes are ANSI standard-label tape files (ANSI STL
format). Only one backup is allowed on a single tape; however, one backup
can span several tapes. Before backing up the database to tape files, make
sure the tape device is configured as variable-length. (The same requirement
applies to TMU LOAD and UNLOAD operations.)
Tape Device Configuration
When you run a TMU backup to tape, you must specify a logical device name
in the command, as shown on page 8-22. This logical name must point to a
physical device specified in the rbw.config file. For example:
BARTAPE dev1 /dev/rmt0

When the backup starts, the logical name (in this case, dev1) is found in the
rbw.config file and the mapped physical device /dev/rmt0 is used for the
backup. In this way, DBAs can restore a backup using a different tape device
from the one that was originally used for the backup operation itself. The
BARTAPE entries can be edited at any time to update the mapping of logical-
to-physical device names.

The three parts of each BARTAPE entry must be separated by spaces. The
logical name can contain any combination of alphanumeric characters. The
physical name must be the exact name of the device.
Tape Capacity
When you use the TMU to back up a database to tape, you must specify a
CAPACITY value. This value represents the maximum amount of data that
can be backed up to each tape. The TMU counts every byte of uncompressed
data it writes to the tape toward this limit. Standard label and file marks are
excluded from the calculation. (Although these marks are device-dependent,
they should not exceed 10 megabytes, which is the minimum capacity you
can specify.)

If the tape device uses compression, the actual amount of space used for the
backup might be less than the capacity you specified. To take advantage of
the “extra” space, factor the compression ratio into the capacity setting.
For backups to devices that do not support compression, specify 90 to 95% of the tape’s uncompressed maximum storage size, as marked on the tape itself. If you specify a capacity value that is larger than the actual capacity of the tape, the TMU fails with an error when it reaches the end of the tape and aborts the backup operation. The error message displays the total number of bytes the TMU has written to the tape, which can be used as a guide for correcting the capacity value. ♦
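As a worked example, suppose a tape is marked as 20 gigabytes of uncompressed capacity and the device does not compress; 90 to 95% of that is 18 to 19 gigabytes, so the backup statement might be written as follows (the device name and tape size are illustrative assumptions):

backup to tape device dev1 capacity 18G checkpoint level 0;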
Using a Storage Manager for TMU Backups
The X/Open Backup Services API (XBSA) is implemented by several storage
management products, including IBM Tivoli Storage Manager. The TMU
supports backups via the XBSA interface, whereby control of the location and
contents of the backup is managed entirely by the configured storage
manager.

To configure your storage manager to support XBSA backups:

1. Install the IBM Red Brick Warehouse server, if it is not already installed, or upgrade your databases to the current version. The installation process (new or update) automatically creates the $RB_CONFIG/bar_metadata directory for TMU backups.
2. Install the storage manager that you intend to use and make a note of your username for that product.
3. Make sure the associated XBSA library for your operating system is installed, as provided by the storage manager vendor. This library must be accessible to the TMU at run-time. Make sure you have installed the correct library for your Red Brick server, platform, and addressability (32-bit or 64-bit).
4. Specify the path of the XBSA library and your storage manager username in the rbw.config file:

   OPTION BAR_XBSA_LIB pathname
   OPTION BAR_SM_USER sm_username

   For specific examples, see Appendix B, “Storage Manager Configuration for XBSA Backups.”
5. Optionally, run the barxbsa utility (on Windows, barxbsa.exe) to verify connectivity to the storage manager.
6. Run TMU backups, specifying XBSA as the backup media. (See page 8-22 for details about syntax.)

Important: Only one storage manager can be used per backup operation, and the same storage manager must be used to restore from that backup.
To change storage managers, you must change the BAR_XBSA_LIB and BAR_SM_USER entries in the rbw.config file. Then run the barxbsa (or barxbsa.exe) utility to verify connectivity before running TMU backups.

For configuration information specific to your storage management system, refer to Appendix B, “Storage Manager Configuration for XBSA Backups,” and the documentation that accompanies the storage management software.
Using External Tools for Full Backups
You might have a very large database that takes a long time to complete a full
TMU backup, or you might already have an efficient process in place for
taking system backups that include all of the Red Brick files. Therefore, you
might elect to use an external tool for performing full backups but still take
advantage of the TMU functionality for optimized incremental backups that
external tools cannot emulate. The TMU can make use of any external full
backup as the baseline for subsequent incremental backups. If a reliable
means of taking full backups already exists, you do not have to re-create
those backups with the TMU.

For example, you could use a system-wide backup utility such as the UNIX
dump and restore commands or a third-party tool that can back up data in
parallel to multiple tape drives. As well as providing a faster full backup,
such external tools guarantee that files outside the database but related to it
(such as the rbw.config file and initialization files) are kept in synch with the
actual database files (PSUs).
To support a mixture of external full backups and TMU incremental backups of the same database, execute the following TMU SET command immediately before performing the external backup operation:

set foreign full backup;
This command resets the backup segment and effectively states that a reliable
external backup is about to be created. In turn, TMU incremental backups can
follow, just as if a TMU full backup had been done.
An equivalent SET command supports the use of external full restore operations:

set foreign full restore;

Issue this command immediately after the external restore is performed and before any connections or changes can be made to the restored database. For more details about foreign restore operations, see page 9-11.
The SET FOREIGN FULL BACKUP and SET FOREIGN FULL RESTORE
commands require the BACKUP_DATABASE and RESTORE_DATABASE task
authorizations, respectively.
Recommended Procedure for Foreign Backup Operations
Do not allow any write operations against the database during a foreign backup. Follow these steps:

1. Identify the database as fully backed up with a foreign utility. Execute a TMU control file that contains the following command:

   set foreign full backup;

2. Put a read lock on the database to protect the database from change during the foreign backup operation. Keep this RISQL session open.

   RISQL> lock database read;

3. Perform the foreign backup operation.
4. Unlock the database, using the same session as in step 2:

   RISQL> unlock database;
Tip: If you already have a full backup of a database that was taken with an external program, you can still issue the SET FOREIGN FULL BACKUP command and use the TMU for incremental backups. However, the recommended procedure is to use the SET FOREIGN FULL BACKUP command before running the external backup.
BACKUP Syntax

The following diagram shows how to construct a TMU BACKUP statement:

BACKUP TO { DIRECTORY 'directory_name'
          | TAPE DEVICE logical_device_name CAPACITY value [K|M|G]
          | XBSA }
  { ONLINE | CHECKPOINT } LEVEL { 0 | 1 | 2 } ;
DIRECTORY 'directory_name'
    Specifies the name of an existing directory to which the TMU writes the backup data. The path to the directory should be fully specified; environment variables can be used. The directory name must satisfy the naming conventions of the operating system and be enclosed in single quotation marks. The database user running the TMU operation must have write permission for the directory. The same directory can be used for multiple backup operations.

    Directory backups are stored in one or more files with dynamically generated names. The naming convention for these files is as follows:

    rb_bar_dbname.yyyymmdd.hhmmss.pid.nnnn

    dbname: the logical name of the database. The dbname is always represented by 14 characters. If the actual logical name is longer, the last 14 characters are used. If the name is shorter, it is padded with trailing underscores.
    yyyymmdd.hhmmss: date and timestamp that identifies exactly when the file was created.

    pid: the ID of the process that created the file.

    nnnn: the backup file sequence number, which identifies the order in which the files were created: filename.0001, filename.0002, and so on. A new file is created either when the value of BAR_UNIT_SIZE is reached or when all of the dirty blocks for a single PSU have been backed up.

TAPE DEVICE logical_device_name
    Specifies the logical name of the UNIX or Linux tape device to be used for a backup to tape. See “Tape Device Configuration” on page 8-18.

CAPACITY value [K|M|G]
    Specifies the maximum capacity to which you want to fill each tape used for the backup. When the tape is filled to this capacity, the tape is considered full and the TMU prompts the user to mount another tape to continue. The CAPACITY parameter is required for tape backups but not allowed for backups to other media.

    If K (kilobytes), M (megabytes), or G (gigabytes) is not specified, the default is K. The value must be an unsigned integer. The minimum capacity is 10 megabytes, and the maximum is 2 terabytes. Values out of this range cause the backup to fail with an error. For more information about the CAPACITY parameter, see “Tape Capacity” on page 8-18.

XBSA
    Specifies that a third-party storage manager will manage the backup files. For information about the XBSA interface and storage manager configuration, refer to “Using a Storage Manager for TMU Backups” on page 8-19.

ONLINE or CHECKPOINT
    Specifies the type of backup operation. During an online backup, the database is available for read and write operations. During a checkpoint backup, the database is in read-only mode.

LEVEL 0 | 1 | 2
    Specifies the backup level. Level 0 is a full backup; levels 1 and 2 are incremental. A level 1 backup records all the changes since the last level 0 backup. A level 2 backup records all the changes since the last backup of any kind.
Examples of Backup Operations
The following examples illustrate the syntax for various TMU backup
statements:
UNIX/Linux:

backup to directory '/disk1/db_bup/012202' online level 0;
# full backup on 1/22/02

backup to directory '/disk1/db_bup/012302' checkpoint level 2;
# incremental checkpoint on 1/23/02

backup to xbsa checkpoint level 0;
# full backup to Tivoli on 1/30/02

backup to tape device dev1 capacity 100M checkpoint level 0;
# full tape checkpoint on 2/01/02
Windows:

backup to directory 'e:\db_bup\012202' online level 0;
# full backup on 1/22/02

backup to directory 'e:\rbds611\devel\bar\013002' checkpoint level 2;
# incremental checkpoint on 1/30/02

backup to xbsa checkpoint level 0;
# full backup to Legato on 1/30/02
Messages Displayed During Backups
In the following example, the TMU control file tb_db_level0.tmu backs up
the Aroma database to a directory named tb_db_backup. The messages
contain information about the backup process, the type and level of backup,
and the media used to store the backup files.
113 brick % rb_tmu -d AROMA tb_db_level0.tmu system manager
(C) Copyright IBM Corp. 1991-2002. All rights reserved.
Version 06.20.0000(0)TST
** INFORMATION ** (523) Backup of database ’AROMA’ with backup level
0, backup type CHECKPOINT, and backup media DIRECTORY started.
** INFORMATION ** (7051) Backup to
’/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.162542
.00032049.0001’ started on Monday, November 19, 2001 4:25:42 PM.
** INFORMATION ** (7061) Backup to
’/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.162542
.00032049.0001’ completed on Monday, November 19, 2001 4:25:42 PM.
** INFORMATION ** (7087) Backup of the database AROMA completed
successfully on Monday, November 19, 2001 4:25:42 PM.
** STATISTICS ** (500) Time = 00:00:00.14 cp time, 00:00:00.14 time,
Logical IO count=750, Blk Reads=0, Blk Writes=778
The following output is for a subsequent level 1 backup of the same database:
124 brick% rb_tmu -d AROMA tb_db_level1.tmu system manager
(C) Copyright IBM Corp. 1991-2002. All rights reserved.
Version 06.20.0000(0)TST
** INFORMATION ** (523) Backup of database ’AROMA’ with backup level
1, backup type CHECKPOINT, and backup media DIRECTORY started.
** INFORMATION ** (7051) Backup to
’/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.163859
.00032173.0001’ started on Monday, November 19, 2001 4:38:59 PM.
** INFORMATION ** (7061) Backup to
’/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.163859
.00032173.0001’ completed on Monday, November 19, 2001 4:38:59 PM.
** INFORMATION ** (7087) Backup of the database AROMA completed
successfully on Monday, November 19, 2001 4:38:59 PM.
** STATISTICS ** (500) Time = 00:00:00.07 cp time, 00:00:00.10 time,
Logical IO count=320, Blk Reads=0, Blk Writes=307

Backup Metadata
In order to automate the process of restoring a database to a consistent state,
the TMU relies on metadata files that maintain a history of all the backups that
have been performed since the database was created. The metadata history is
specific to each IBM Red Brick Warehouse database that exists within a single
installation of the warehouse server. The backup metadata makes it possible
to restore a database with one TMU operation, without requiring the DBA to
specify which particular backups need to be restored (or the sequence in
which they need to be restored).

When the warehouse server is installed, the following metadata directory is created:
$RB_CONFIG/bar_metadata

The bar_metadata directory contains a database_name subdirectory for each database, where database_name is the logical name registered in the rbw.config file. Each database_name directory contains four files:

■ .dbinfo, which stores the following information:
  ❑ The database locale
  ❑ The logical name and full path of the database
  ❑ The current action sequence number, used to track the sequence of backup operations and incremented by 1 for each new operation.
  Do not remove this file from the database_name directory.
■ .backup_dirty_psu, used for error recovery purposes during backups.
  Do not remove this file from the database_name directory.
■ rbw_media_history, described on page 8-27.
■ action_log, described on page 8-29.

The database_name directory is not created for a database until its first TMU
backup is performed.
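For example, after the first backup of a database registered as AROMA, a listing of its metadata directory might look like the following sketch (the database name is illustrative; the four files are those described above):

$ ls -a $RB_CONFIG/bar_metadata/AROMA
.  ..  .backup_dirty_psu  .dbinfo  action_log  rbw_media_history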

Media History File (rbw_media_history)
The rbw_media_history file is a text file that contains detailed PSU-level
information about every backup operation against the database. This backup
history is used to discover the sequence of backups that must be restored in
order to provide consistent database recovery to a specified point in time.

After each successful TMU or external backup operation, new backup records
are appended to the history file: one record per PSU for TMU backups and one
record for the whole database for external backups.

Important: The rbw_media_history file itself is not backed up by the TMU; you should back it up regularly with an external program.

Each backup record contains the following information:

Component                Description

Action sequence number   Number assigned to the backup session; the same
                         number is shared by all the PSUs that are backed up
                         in a single operation.

Version                  Version number for the IBM Red Brick Warehouse
                         server (for example, 6.20).

Backup level             0, 1, or 2.

Backup type              Online (O), checkpoint (C), or external (E).

Media type               Tape (T), directory (D), or XBSA (X).

Time stamp               When the backup of the first dirty block in the PSU
                         was started. The time-stamp format is
                         yyyy-mm-dd.hh:mm:ss, as defined on page 9-14.

PSU type                 System table (S) or user table (U).

Full PSU changed         Whether the complete PSU was backed up or just
                         some of its blocks: True (T) or False (F).

Media ID                 File name (for directory backups); copy ID (for
                         XBSA backups); logical device name, tape file
                         name, and volume name (for tape backups).

Size of backup blocks    Total number of dirty blocks backed up for this PSU.

Offset                   Starting position (block number) in the backup
                         media where the backup data for this PSU is stored
                         (given that multiple PSUs can be stored in a single
                         backup file or object).

PSU size                 Total number of 8K blocks in the PSU (at the time of
                         the backup).

'psu_name'               The name of the backed-up PSU, in single quotes.

Segment name             The name of the segment to which the PSU belongs.

For an external backup, only one record is appended to the file and the PSU-
level components of the record are left blank.

(GLWLQJWKH0HGLD+LVWRU\)LOH
The redbrick user can edit the rbw_media_history file if necessary, but
changes must be made with great care. Make sure that records required for a
valid database restore operation are not removed; otherwise, the TMU could
construct an incorrect restore sequence.

Records that belong to backup operations that are no longer needed can be
safely removed. Each backup operation has a unique backup sequence
number. First determine the sequence number for the backup operation you
want to remove, then be sure to remove all of the records that have that
number. For example, say the rbw_media_history file contains records for a
series of seven backups:

1. Level 0 online backup
2. Level 2 online backup
3. Level 1 online backup
4. Level 2 online backup
5. Level 1 online backup
6. Level 2 online backup
7. Level 2 checkpoint backup

Records with backup sequence numbers 1, 5, 6, and 7 must not be removed. All records with sequence numbers 2, 3, and 4 can be removed.
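If you script this kind of cleanup, a cautious sketch is to filter the file into a new copy for review before replacing the original. This assumes the action sequence number is the first whitespace-separated field of each record, as the component list above suggests; verify the layout of your own file first:

# Keep only records for the backups that are still needed
# (sequence numbers 1, 5, 6, and 7 in the example above).
awk '$1 == 1 || $1 == 5 || $1 == 6 || $1 == 7' \
    $RB_CONFIG/bar_metadata/AROMA/rbw_media_history > rbw_media_history.new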


PSU Offset Example
The “offset” defines the starting position for the backed-up blocks of a given
PSU. This information is important because a single backup file can contain
multiple PSUs. For example, assume that PSU1 is 80K (10 blocks), and 2 of its
blocks (16K) are backed up to /tmp/file1, starting from block 20 of that
backup file. In the rbw_media_history file, the offset value for PSU1 will be
20. The first 19 blocks of /tmp/file1 are occupied by dirty blocks that belong
to other PSUs.

Backup Log File (action_log)
Event-based records generated by TMU backup and restore operations
against a specific database are logged in the action_log file inside the
database_name directory. Each new record is appended to the end of the file.

Unlike the rbw_media_history file, the action_log file does not contain PSU-level details and is not used by the TMU. Also, the log file contains information about backups and restore operations, whereas the rbw_media_history file contains information about backups only.

The redbrick user can edit or delete the file if necessary. If the file does not
exist, the TMU creates a new one. If you want to activate a new, empty version
of the file, simply rename the current file. When the next backup operation
starts, a new action_log file will be created.

There is no limit imposed on the maximum size of the log file, other than the
limit imposed by the operating system.

The log file contains a record for each backup or restore operation that is
started and another record for each operation that is completed. The records
contain the following information:

■ The full path for the database
■ The value of RB_HOST (the logical name of the warehouse daemon)
■ The database user who ran the operation
■ The date and time when the operation was started or completed
■ The complete backup or restore command, as defined in the TMU control file

%DFNLQJ8SD'DWDEDVH 
%DFNXS/RJ)LOH DFWLRQBORJ

An odd number of records (such as two “backup started” entries but only one
“backup completed” entry) implies that a certain operation failed, but the
action_log file does not contain entries that indicate what caused the failure.
The cause of the failure should be apparent from the detailed messages in the
server log files (rbwlog.*).

Example Log File Entries

The following backup log entries describe successful level 0 and level 2
backups of the Aroma database.
TMU 06.20.0000(0)TST [BAR] Backup started.
DB: /qa/local/bobr/toucaroma, HOST and USER: TOUCAN SYSTEM, DATE and
TIME: Thursday, December 20, 2001 1:29:53 PM, COMMAND: BACKUP TO
DIRECTORY /qa/local/bobr/bar0 CHECKPOINT LEVEL 0, BAR_UNIT_SIZE:
262144.
Backup to
’/qa/local/bobr/bar0/rb_bar_bobr_toucaroma_.20011220.132953.00001820.0
001’ started on Thursday, December 20, 2001 1:29:53 PM.
Backup to
’/qa/local/bobr/bar0/rb_bar_bobr_toucaroma_.20011220.132953.00001820.0
001’ completed on Thursday, December 20, 2001 1:29:54 PM.

TMU 06.20.0000(0)TST [BAR] Backup completed.


DB: /qa/local/bobr/toucaroma, HOST and USER: TOUCAN SYSTEM, DATE and
TIME: Thursday, December 20, 2001 1:29:54 PM, COMMAND: BACKUP TO
DIRECTORY /qa/local/bobr/bar0 CHECKPOINT LEVEL 0, BAR_UNIT_SIZE:
262144, Number of BAR Units Backed Up: 1, Number of Segments Backed
Up: 32, Number of PSUs Backed Up: 39, Number of Blocks Backed Up: 666

TMU 06.20.0000(0)TST [BAR] Backup started.


DB: /qa/local/bobr/toucaroma, HOST and USER: TOUCAN SYSTEM, DATE and
TIME: Thursday, December 20, 2001 1:36:20 PM, COMMAND: BACKUP TO
DIRECTORY /qa/local/bobr/bar2 CHECKPOINT LEVEL 2, BAR_UNIT_SIZE:
262144.
Backup to
’/qa/local/bobr/bar2/rb_bar_bobr_toucaroma_.20011220.133620.00002002.0
001’ started on Thursday, December 20, 2001 1:36:20 PM.
Backup to
’/qa/local/bobr/bar2/rb_bar_bobr_toucaroma_.20011220.133620.00002002.0
001’ completed on Thursday, December 20, 2001 1:36:20 PM.

TMU 06.20.0000(0)TST [BAR] Backup completed. DB:


/qa/local/bobr/toucaroma, HOST and USER: TOUCAN SYSTEM, DATE and TIME:
Thursday, December 20, 2001 1:36:20 PM, COMMAND: BACKUP TO DIRECTORY
/qa/local/bobr/bar2 CHECKPOINT LEVEL 2, BAR_UNIT_SIZE: 262144, Number
of BAR Units Backed Up: 1, Number of Segments Backed Up: 2, Number of
PSUs Backed Up: 7, Number of Blocks Backed Up: 210

Chapter 9

Restoring a Database

In This Chapter . . . . . . . . . . . . . . . . . . . . 9-3
Full and Partial TMU Restores . . . . . . . . . . . . . . . 9-4
Restore Path . . . . . . . . . . . . . . . . . . . . 9-4
Restore Examples . . . . . . . . . . . . . . . . . . 9-5
Example 1: Daily Level 2 Checkpoints . . . . . . . . . 9-6
Example 2: Daily Level 1 Checkpoints . . . . . . . . . 9-7
Example 3: Combined Level 1 and Level 2 Backups . . . . . 9-8
Example 4: Negative Case . . . . . . . . . . . . . . 9-9
How to Run a TMU Restore . . . . . . . . . . . . . . . . 9-10
Recommended Procedure for Foreign Restore Operations . . . . 9-11
Restore of Special Segments . . . . . . . . . . . . . . 9-11
Cold Restore Operations . . . . . . . . . . . . . . . 9-12
PSUs for Objects Created After a Restored Backup . . . . . . 9-12
RESTORE Syntax . . . . . . . . . . . . . . . . . . 9-13
Syntax Examples . . . . . . . . . . . . . . . . . 9-15
Example RESTORE Operation with Message Output . . . . 9-16
Example Output for RESTORE SHOW Operation . . . . . 9-17
Partial Restore Procedure . . . . . . . . . . . . . . . 9-19
FORCE Option . . . . . . . . . . . . . . . . . 9-20
Database Consistency After Partial Restores . . . . . . . 9-20
Partial Availability . . . . . . . . . . . . . . . . 9-21
In This Chapter
This chapter describes how to restore a database in the event of a system or
software failure. The chapter contains the following main sections:

■ “Full and Partial TMU Restores”
■ “How to Run a TMU Restore”

For information about TMU backups, refer to Chapter 8, “Backing Up a Database.”

Full and Partial TMU Restores
When you run a TMU backup, you always back up data across the entire
database, backing up either every object in its entirety or the changed
portions of every object. However, when you restore from a backup, you can
do either a full restore (the entire database) or a partial restore. A partial restore
recovers one specific segment or one specific physical storage unit (PSU).

Assuming that your backup strategy involves a combination of different levels and types of backups, restoring the database to its state at a fixed point in time often requires multiple backups to be restored in a specific sequence. The TMU does not require the DBA to determine this sequence; the restore path is constructed transparently and automatically, based on the metadata history for the database.

The assumption is that you need to restore the database to the latest backup
before the failure; in practice, your restore requirements might be less
stringent. For example, you might want to restore to an earlier backup, then
reload the last set of changes to the database. In most cases your backup
strategy should make it possible for you to restore to a fixed point in time
without having to reload any data, but the TMU allows you to choose any
target date for the restore process. If you do not select a target date, the
default behavior is to restore to the date of the last checkpoint.

Restore Path
The restore path that the TMU constructs is basically the same whether you
are restoring to a specific timestamp or restoring “blind” (without an explicit
target date in the command). For all restores, the path must include at least
one checkpoint backup and at least one full backup (TMU level 0 or external).
If these critical backups are missing from the metadata history, the restore
operation will fail.


A complete restore path consists of the following subset of backups. In this context, target date means either the date specified in the RESTORE command or the date when the last checkpoint was taken:

a. The last level 0 backup that was taken before the target date
b. The last level 1 backup (if any) that was taken after (a) but before the target date
c. All level 2 backups that were taken after (b) but before the target date; if (b) does not exist, all level 2 backups that were taken after (a) but before the target date
d. The last checkpoint backup that was taken before the target date

The restore process always stops at (d); any online backups taken after the
last checkpoint cannot be restored. The backups referred to in steps (a), (b),
and (c) could be online or checkpoint. The level 0 backup in (a) could also be
an external full backup.

Data is automatically restored from the same backup media that was
specified in the original BACKUP command. To verify the contents and scope
of the backup from which you plan to restore the database, run the
RESTORE...SHOW command before starting the restore operation. If you have
moved any of the backup data that needs to be restored, the RESTORE
operation will fail. The only exception to this rule is the ability to switch tape
devices between backups and restores, as explained on page 8-18.

Restore Examples
The following examples illustrate a combination of different types and levels
of backups and the resulting restore paths. These examples demonstrate how
the backup strategy determines the restore path that the TMU constructs in
the event of a failure.

Example 1: Daily Level 2 Checkpoints
In this case, each daily incremental backup operation is relatively short and
provides a consistent recovery point, but the restore process potentially
consists of several operations, depending on when the failure occurs.

1. The DBA runs a level 0 checkpoint backup over the weekend to have a fresh full backup to start each week.
2. On weekday evenings, the DBA runs level 2 checkpoint backups. Each checkpoint backup stores only the changes that have occurred since the previous day's backup, and each backup provides a consistent recovery point:

[Figure: weekly timeline showing a level 0 checkpoint on Sunday, level 2 checkpoints on the other days of the week, and a failure on Thursday.]
The lines with arrowheads in the figure indicate that the changes picked up by each level 2 backup are not cumulative.
3. Assume that a system failure occurs on Thursday and the database needs to be restored to its state at close of business on Wednesday. The TMU executes the following restore path:
   a. Restore from last Sunday's level 0 backup.
   b. Restore from Monday's level 2 backup.
   c. Restore from Tuesday's level 2 backup.
   d. Restore from Wednesday's level 2 backup.

The database is now fully restored to its state on Wednesday night. No changes to the database through Wednesday were missed because all of the backups were checkpoints. However, if modifications were made to the database after the Wednesday backup was completed and before the failure on Thursday, they would not be present in the restored database.
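Because Wednesday's backup is the most recent checkpoint, a blind restore suffices in this example; the TMU derives the four restore steps from the metadata history. A minimal control file for this recovery would contain a single statement:

restore;
# defaults to the last checkpoint (Wednesday night)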

Example 2: Daily Level 1 Checkpoints
In this case, each daily incremental backup provides a consistent recovery
point but picks up all the changes since the last level 0 backup. This approach
reduces the number of restore operations but the daily backups become more
time-consuming as the week progresses.

1. The DBA runs a level 0 checkpoint backup over the weekend to have a fresh full backup to start each week.
2. On weekday evenings, the DBA runs level 1 checkpoint backups. Each level 1 backup is cumulative; it stores all of the changes that have occurred since last Sunday's level 0 backup:

[Figure: weekly timeline showing a level 0 checkpoint on Sunday, level 1 checkpoints on the other days of the week, and a failure on Thursday.]

3. Assume that a system failure occurs on Thursday and the database needs to be restored to its state at close of business on Wednesday. The TMU executes the following restore path:
   a. Restore from last Sunday's level 0 backup.
   b. Restore from Wednesday's level 1 backup.

The database is now fully restored to its state on Wednesday night. No changes to the database through Wednesday were missed because all of the backups were checkpoints. However, if modifications were made to the database after the Wednesday backup was completed and before the failure on Thursday, they would not be present in the restored database.

Example 3: Combined Level 1 and Level 2 Backups
In this case, the DBA runs a full backup over the weekend and two backups
on each weekday, one online and one checkpoint. This approach ensures that
database modifications are allowed during business days, that a checkpoint
for database recovery exists at the end of each business day, and that a
smaller number of restore operations is required to complete a recovery.

1. The DBA runs a level 0 checkpoint backup every Sunday.
2. During the week, the DBA runs two backups per day: an online level 1 followed by a checkpoint level 2.

[Figure: timeline showing a level 0 checkpoint on Sunday, then for each weekday a level 1 online backup during the day and a level 2 checkpoint at night (Monday through Wednesday shown), with a failure on Thursday.]

The lines with arrowheads in the figure indicate that the changes picked up by each level 1 backup are cumulative, while the level 2 backups are non-cumulative.
3. Assume that a system failure occurs on Thursday morning and the database needs to be restored to its state at close of business on Wednesday. The restore process consists of three steps:
   a. Restore from last weekend's level 0 backup.
   b. Restore from Wednesday's level 1 backup.
   c. Restore from Wednesday's level 2 backup.

The database is fully restored to its state on Wednesday night.

Example 4: Negative Case
This is a negative case that demonstrates the need for regular checkpoint
backups. If the DBA uses online backups every day and waits until the
weekend to run a checkpoint, it is impossible to restore the database to any
date during the week.

1. The DBA runs a level 0 checkpoint backup every Sunday.
2. The DBA runs level 2 online backups Monday through Saturday.

[Figure: timeline labeled "Lost Changes" showing a level 0 checkpoint on Sunday, level 2 online backups Monday through Saturday, a failure on Thursday, and the next level 0 checkpoint on the following Sunday.]

3. If a system failure occurs on Thursday morning, the database cannot be restored to its state before Wednesday's backup; the level 2 backups are effectively useless. The only option is to restore the database to its state on the previous Sunday, losing all of the changes that were backed up during the week.

How to Run a TMU Restore
Follow these steps to run a TMU restore operation:

1. Determine whether a partial or full restore is required. It is sometimes possible to recover the database to a consistent state by restoring a specific PSU or segment. For detailed information about partial restores, see page 9-19.
2. Optionally, check the contents of the media from which the database will be restored by using the RESTORE...SHOW command.
3. Create a TMU control file for the RESTORE operation, using the syntax defined on page 9-13. For general information about control files, see page 1-8.
4. Use the rb_tmu (or rb_ptmu) program to execute the RESTORE operation, as shown in the example after these steps. For information about the full rb_tmu syntax, see Chapter 2, “Running the TMU and PTMU.” (If you use rb_ptmu, the restore operation still runs in serial mode; parallel restores are not supported from any backup media, including XBSA storage managers.)
   TMU RESTORE operations cannot be initiated with the Client TMU (rb_ctmu) and executed on a remote server machine.

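For example, the following hypothetical UNIX session restores the Aroma database to a point in time; the control file name and timestamp are illustrative, and the invocation follows the rb_tmu pattern used elsewhere in this guide:

% cat restore_aroma.tmu
restore as of '2001-12-31.11:59:59';

% rb_tmu -d AROMA restore_aroma.tmu system manager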
Task Authorization

The RESTORE_DATABASE authorization must be granted to a database user before that user can run a TMU RESTORE operation. For example:

grant connect, restore_database to evelyn with ybsmur;

This authorization is inherited by the DBA role.

Recommended Procedure for Foreign Restore Operations
The assumption is that the database does not exist prior to a full restore and
that there is no database-related activity in progress. Nonetheless, the DBA
should always make sure that no database activity is possible during the
restore operation.

1. Stop all activities against the specified database:
   RISQL> alter database terminate database logical_db_name;
2. Make sure that users cannot connect to the database. It might be necessary to stop the API daemon or service or remove the database entry from the rbw.config file.
3. Restore the database with the foreign utility.
4. Restart the API daemon or service (if stopped in step 2) or replace the database entry in the rbw.config file, as required.
5. Identify the database as having been fully restored from a foreign backup. Execute a TMU control file that contains the following command:
   set foreign full restore;
6. Restore TMU incremental backups if necessary (see the sketch after these steps).
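As a sketch of steps 5 and 6, the following hypothetical session first marks the foreign full restore and then applies TMU incremental backups on top of it. Whether a blind restore picks up the intended incremental sequence depends on your metadata history; run restore show; first to confirm:

% cat mark_foreign.tmu
set foreign full restore;
% rb_tmu -d AROMA mark_foreign.tmu system manager

% cat apply_incrementals.tmu
restore;
% rb_tmu -d AROMA apply_incrementals.tmu system manager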

Restore of Special Segments
The system segment, which contains the system tables, cannot be restored by
itself; it is restored as part of a full restore operation.

The backup segment is backed up only as part of a checkpoint backup but is not required for a full database restore to succeed. If the backup segment is damaged or does not exist, it is re-created as part of the full restore operation.

For versioned databases, the version log segment, which is never backed up
by the TMU, is automatically re-created, based on its definition in the restored
system catalog.

Cold Restore Operations
A cold restore is a full restore operation for a database that cannot be brought
online. If a cold restore is necessary, you have to re-create an empty version
of the same database in order to restore it, using the same environment
settings, location, and locale. You also have to re-create a database user with
the authority to do the restore (RESTORE_DATABASE authorization).

Restore both the rbw.config file and the contents of the following directory
before starting the cold restore of the database:
$RB_CONFIG/bar_metadata/database_name

These files must be backed up separately from the database; TMU backups do
not include them.
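For example, an external backup of these files on UNIX might be as simple as the following sketch (the archive name is illustrative, and it assumes rbw.config resides in $RB_CONFIG):

tar cvf /backup/aroma_bar_meta.tar \
    $RB_CONFIG/rbw.config \
    $RB_CONFIG/bar_metadata/AROMA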

The backup segment and the version log segment (if required) are re-created
automatically during TMU restore operations.

PSUs for Objects Created After a Restored Backup
If you restore from a backup that was taken before an object was created in
the database, the PSUs for that object will not be present in the restored
database but they will exist in the filesystem. In the case of PSUs with default
names, this scenario can cause problems when the server attempts to reuse
the default names. A similar situation could arise in the case of user-defined
PSUs. In either case, after restoring from a backup, you should query the
restored RBW_STORAGE table and compare the list of segments and PSU
locations with the physical files in the system. You can then use system-level
commands to remove any unused PSUs.
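For example, a hypothetical check might list the catalog's view of storage and compare it with the files on disk (the database directory path is illustrative):

RISQL> select * from rbw_storage;

% ls /qa/local/bobr/toucaroma

Any PSU file present in the directory but absent from the RBW_STORAGE output is a candidate for removal.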

RESTORE Syntax
The following synopsis shows how to construct a TMU RESTORE statement; square brackets enclose optional elements, and the vertical bar separates alternatives:

RESTORE [ SEGMENT segment_name [FORCE] | PSU 'pathname' [FORCE] ]
        [ AS OF 'timestamp' ] [ SHOW ] ;

SEGMENT segment_name   Specifies a single segment as the target of the RESTORE operation. The named segment must be present in the RBW_SEGMENTS table. The definition of the segment and its owning table must not have changed since the backup.

PSU 'pathname'   Specifies a single PSU (file) as the target of the RESTORE operation. The file name must be enclosed in single quotes. The path to the file does not have to be fully qualified; relative pathnames are accepted. The named PSU must be present in the RBW_STORAGE table. The definition of the PSU and its owning table must not have changed since the backup.

If neither a segment nor a PSU is specified, the operation defaults to a full restore of the whole database.


FORCE   Specifies that a partial RESTORE operation should proceed regardless of changes to the specified PSU or segment since the backup on which the restore is based. This option does not apply to full database restores. For details about when to use the FORCE option, see page 9-20.
AS OF 'timestamp'   Specifies a point in time to which the database will be restored. The timestamp must be enclosed in single quotes and follow this format:

'yyyy-mm-dd.hh:mm:ss'

yyyy: year; must be 4 digits
mm: month, from 1 to 12
dd: day, from 1 to 31
hh: hour, from 0 to 23
mm: minutes, from 0 to 59
ss: seconds, from 0 to 59

For example: '2001-12-31.11:59:59'

The complete date specification is required (yyyy-mm-dd), but the time values are optional. Do not include the period if you are not specifying a time. You do not have to enter all three parts of the time string; hh, hh:mm, and hh:mm:ss are all acceptable.

When the AS OF specification is used for a restore operation, the rbw_media_history file is copied to a file named rbw_media_history.timestamp.sv, where timestamp is the time the restore operation is performed (not the AS OF timestamp). Backup records timestamped later than the AS OF timestamp are then removed from the current version of the rbw_media_history file. For more information about this file, see page 8-27.


SHOW   Displays the sequence and contents of the backups required by the restore process. No actual RESTORE operation takes place when you use this option.

The output of the SHOW command is a series of messages, containing details about each PSU that was backed up, as well as general information about each backup in the restore path, such as the storage location or device (see page 9-17). The SHOW option can be used in combination with all other RESTORE command options.

IBM recommends that you run the RESTORE SHOW command before running the recovery operation itself.

Syntax Examples
The following examples are all valid database restore statements:

■ restore;
  Restore the database to the last checkpoint.
■ restore as of '2001-12-31.11:59:59';
  Restore the database to the specified date and time.
■ restore segment sales_seg1;
  Restore the specified segment to the last checkpoint.
■ restore segment sales_seg1 as of '2001-12-31';
  Restore the specified segment to the specified date.
■ restore psu '/rb/test/sales_psu1' force;
  Restore the specified PSU to the last checkpoint, regardless of changes made to the PSU since the checkpoint backup was taken.
■ restore show;
  Show the restore path for the database.


■ restore as of '2001-12-31.11:59:59' show;
  Show the restore path for the database, based on the specified date and time.
■ restore segment sales_seg1 as of '2001-12-31.11:59:59' show;
  Show the restore path for the specified segment, based on the specified date and time.

Example RESTORE Operation with Message Output
The following messages are displayed when a database is restored successfully from a level 0 backup and a level 2 backup:
(C) Copyright IBM Corp. 1991-2002. All rights reserved.
Version 06.20.0000(0)TST
** INFORMATION ** (7054) Starting database restore of database
/qa/local/bobr/toucaroma.
** INFORMATION ** (7055) Starting restore from
/qa/local/bobr/bar0/rb_bar_bobr_toucaroma_.20011220.132953.00001820.0001 on
Friday, December 21, 2001 11:11:26 AM.
** INFORMATION ** (7044) Completed restore from
/qa/local/bobr/bar0/rb_bar_bobr_toucaroma_.20011220.132953.00001820.0001 on
Friday, December 21, 2001 11:11:27 AM.
** INFORMATION ** (7055) Starting restore from
/qa/local/bobr/bar2/rb_bar_bobr_toucaroma_.20011220.133620.00002002.0001 on
Friday, December 21, 2001 11:11:27 AM.
** INFORMATION ** (7044) Completed restore from
/qa/local/bobr/bar2/rb_bar_bobr_toucaroma_.20011220.133620.00002002.0001 on
Friday, December 21, 2001 11:11:27 AM.
** INFORMATION ** (560) Restore process will re-start the database
/qa/local/bobr/toucaroma now.
** INFORMATION ** (7088) Restore of the database /qa/local/bobr/toucaroma
completed successfully on Friday, December 21, 2001 11:11:27 AM.
** STATISTICS ** (500) Time = 00:00:00.14 cp time, 00:00:00.86 time, Logical IO
count=881, Blk Reads=933, Blk Writes=705

Example Output for RESTORE SHOW Operation
The output of the RESTORE...SHOW command starts with three pieces of
general backup information:

■ Level (0, 1, 2)
■ Media type (XBSA, tape, directory)
■ Type (online, checkpoint, or external)

This information is repeated for each individual backup that would feature
in the restore path. For example, the output might contain all of the detailed
information for a level 0 backup, followed by the equivalent information for
a level 2 backup.

Underneath the main entries, rows of PSU-level information are displayed:

first_PSU_name media_ID full_backup backup_size timestamp

second_PSU_name media_ID full_backup backup_size timestamp

... ... ... ... ...

where:

■ media_ID: the location of the backup file
■ full_backup: 1 or 0, indicating whether the entire PSU was backed up (1) or just some of its blocks (0)
■ backup_size: the number of backed-up 8K blocks for that PSU
■ timestamp: when the PSU was backed up


For example, here is part of the output for a RESTORE SHOW command. In
this case, the restore path consists of a level 0 checkpoint backup and a level
1 checkpoint backup:
brick % rb_tmu -d AROMA tb_db_show.tmu system manager
(C) Copyright IBM Corp. 1991-2002. All rights reserved.
Version 06.20.0000(0)TST
** INFORMATION ** (7074) The following messages contain information about the
list of PSUs to be restored:
BACKUP_LEVEL: Level_0 MEDIA_TYPE: DIRECTORY BACKUP_TYPE: CHECKPOINT
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_IDX"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.162542.
00032049.0001" FULL_BACKUP:1 BACKUP_SIZE:41 TIMESTAMP:2001-11-19.16:25:42
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_LOCKS"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.162542.
00032049.0001" FULL_BACKUP:1 BACKUP_SIZE:8 TIMESTAMP:2001-11-19.16:25:42
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_TABLES"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.162542.
00032049.0001" FULL_BACKUP:1 BACKUP_SIZE:12 TIMESTAMP:2001-11-19.16:25:42
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_INDEXES"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.162542.
00032049.0001" FULL_BACKUP:1 BACKUP_SIZE:19 TIMESTAMP:2001-11-19.16:25:42
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_SEGMENTS"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.162542.
00032049.0001" FULL_BACKUP:1 BACKUP_SIZE:32 TIMESTAMP:2001-11-19.16:25:42
...
BACKUP_LEVEL: Level_1 MEDIA_TYPE: DIRECTORY BACKUP_TYPE: CHECKPOINT
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_IDX"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.163859.
00032173.0001" FULL_BACKUP:1 BACKUP_SIZE:42 TIMESTAMP:2001-11-19.16:38:59
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_LOCKS"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.163859.
00032173.0001" FULL_BACKUP:1 BACKUP_SIZE:8 TIMESTAMP:2001-11-19.16:38:59
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_TABLES"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.163859.
00032173.0001" FULL_BACKUP:1 BACKUP_SIZE:13 TIMESTAMP:2001-11-19.16:38:59
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_INDEXES"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.163859.
00032173.0001" FULL_BACKUP:1 BACKUP_SIZE:20 TIMESTAMP:2001-11-19.16:38:59
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_SEGMENTS"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.163859.
00032173.0001" FULL_BACKUP:1 BACKUP_SIZE:34 TIMESTAMP:2001-11-19.16:38:59
...
** STATISTICS ** (500) Time = 00:00:00.03 cp time, 00:00:00.03 time, Logical IO
count=3, Blk Reads=0, Blk Writes=2

Partial Restore Procedure
You can sometimes restore a database by using a partial restore operation.
Partial restores apply to base table and index segments and PSUs only; system
table data cannot be partially restored. (The system tables are restored as part
of a full restore operation.) Partial restore operations recover the specified
segment or PSU to its state as of the last checkpoint backup. As with full
restore operations, any subsequent changes captured by an online backup
cannot be restored.

Important: Where possible, partial restores should be avoided. If you have to use them, it is better to use them in isolation than in combination with full restores. Try to avoid several consecutive partial restores of the same object. There is no guarantee that the database will be in a consistent state after any partial restore operation, whether or not the FORCE option is required (see page 9-20).

You cannot restore a single segment or PSU if it does not exist in the database,
as recorded in the RBW_SEGMENTS or RBW_STORAGE table. For example,
you might have inadvertently dropped a table and its segment. In this case,
you must either:

■ Perform a full restore, which is the recommended solution.
■ Re-create the missing segment and then restore it. You must define the segment exactly as it was when it was backed up. The restore operation does not detect any differences in definition of filenames or size parameters. If differences exist, you can damage the database to the extent that a full restore is required.

IBM recommends that you do not attempt to restore a single segment or PSU
in the following cases:

■ If the table description for the table containing the segment, or the
segment that contains the PSU, has changed—for example, because
columns were added or dropped.
■ If the number of rows in the segment, or the segment that contains
the PSU, has changed. If you have inserted or deleted rows, the
number of rows has probably changed.

If none of the above conditions applies, a partial restore is possible. If you have an up-to-date backup, you should be able to restore the segment or PSU with no loss of data.


Important: If you do not have an up-to-date backup, you might be able to restore a segment or a PSU by using the FORCE option. However, after the forced restore, you will have to do some additional work to bring the database to a consistent, usable state. It is highly recommended that you consult Customer Support before using the FORCE option.

FORCE Option
If the segment or PSU specified in a partial restore has changed since you ran
the backup from which you are trying to restore, the changes will be lost
when the restore operation restores the segment or PSU to its backed-up state.
The lost changes could be modified segment ranges or rows that were added,
updated, or deleted.

To prevent an inadvertent loss of data, the TMU issues a warning message stating that you cannot complete the operation unless you use the FORCE keyword in the RESTORE command. The FORCE option causes the RESTORE command to override some built-in consistency checks and might leave the database in an inconsistent state.

Partial restore operations reset the backup data for the specified segment or
PSU; therefore, after restoring an object with the FORCE option, you will not
see the warning message again when you perform a subsequent restore of
that object.

Database Consistency After Partial Restores
After running any partial restore operation (with or without the FORCE
option), IBM recommends that you check the integrity of related database
objects:

1. Run the CHECK TABLE command against the applicable segment:
   RISQL> check table table_name segment segment_name directory
   > 'directory_name';
   This step is optional, but checking the consistency of the physical structures in the segment is highly recommended. If any discrepancies are reported, do not fix them until directed to do so by Customer Support.
2. Rebuild indexes (and precomputed views, if any) by running the TMU REORG command against the table that contains the applicable segment or PSU. It is recommended that you rebuild all of the indexes and views on the table at the same time. For example:
   reorg table_name include precomputed views;
3. Repeat the TMU REORG command against any referencing tables:
   reorg referencing_table_name include precomputed views;
   Repeat the process for each level of reference outward from the table whose segment or PSU was restored. For example, if the restored segment was from an outboard table, first REORG the dimension table that references the outboard table, then REORG the fact table that references the dimension table.
4. Run the CHECK TABLE and CHECK INDEX commands against all of the objects affected by the partial restore operation. This step is optional, but highly recommended.

Partial Availability
While you are restoring a damaged segment, you can give users partial access
to the affected table or index:

1. Issue an ALTER SEGMENT command to take the damaged segment offline before you perform the restore operation:
   RISQL> alter segment segment_name offline;
2. Set the PARTIAL_AVAILABILITY parameter for tables and indexes for all users in the rbw.config file. Alternatively, use the SET PARTIAL AVAILABILITY command for specific sessions. For example:
   RISQL> set partial availability info;
   For more details about partial availability, refer to the Administrator's Guide and the SQL Reference Guide.
3. After restoring the segment, bring it back online before performing any required REORG operations:
   RISQL> alter segment segment_name online;

Appendix A

Example: Using the TMU in AGGREGATE Mode
The example in this appendix shows one way in which you can
use the TMU Auto Aggregate feature to generate and maintain
quarterly and yearly aggregates from daily input data.

This appendix contains the following sections:

■ Background
■ Strategy
■ Load Procedure: Refresh Loads
■ Load Procedure: Daily Loads
■ Results

Background
Daily input data is extracted from another system used for day-to-day operations. This data uses the following units of time:
■ Monthly totals for all months prior to the current month
■ Month-to-date totals for the current month
■ Daily totals for the current month

In addition to these units of time, analysis made with the warehouse database requires quarter-to-date and year-to-date aggregates. The TMU can generate these figures automatically and update them as needed as part of the data loading procedures.
The warehouse database is updated daily with information that includes the
new daily total and month-to-date total, as well as possible restated amounts
for previous daily totals. Twice a month the entire warehouse database is
completely reloaded from the operational system.

The example assumes the current date is May 4, 2000.

Strategy
The first two tasks are to decide how to capture restated daily values and to
devise a period-key strategy to handle the desired time-aggregate levels.

The fact that the daily inputs might contain restated values for previous days
requires extra care in computing the quarter-to-date and year-to-date aggre-
gates. Adding new daily totals to these aggregates does not capture any
restated daily totals for days already included in the aggregate totals. The
solution to this problem is to use the daily month-to-date totals from the
operational data, and each day subtract the month-to-date totals for the
previous day from the two aggregates and add the new month-to-date totals,
thereby capturing any restated daily totals. (Restatements for other than the
current month are captured by the semi-monthly refreshes.)

This plan requires that input data for the daily updates be split into two files,
one that contains daily totals and one that contains month-to-date totals so
that you can use only the month-to-date totals to compute the aggregates.
The semi-monthly refreshes (and the initial load) are made with three files:
the daily and month-to-date files used for the daily updates, and a file that
contains monthly data for previous months. The following figure shows
these files and their use.

[Figure: the three input files. daily.dat contains daily totals by product and group; mtd.dat contains month-to-date totals by product and group; month.dat contains monthly totals for all prior months by product and group. Daily loads use daily.dat and mtd.dat; refresh and initial loads use all three files.]


After a refresh, the quarter-to-date and year-to-date aggregates are computed by using the Auto Aggregate feature. Splitting daily data into two files and refresh data into three files as described keeps the month-to-date totals separate for the aggregations made daily and after the refreshes. (Remember that you need the month-to-date records from both the current day and the previous day.)

Use a period key of eight characters, two characters each for year, quarter,
month, and date: YYQQMMDD. With this format, the various levels of
aggregation are represented as the following table shows.

Aggregation       Format        Explanation

Daily             YYQQMMDD      All 8 positions filled.
Monthly totals    YYQQMM00      All 8 positions filled.
Month-to-date     YYQQMM00      All 8 positions filled.
Quarter-to-date   YYQQ_ _ _ _   4 characters followed by 4 spaces.
Year-to-date      YY_ _ _ _ _ _ 2 characters followed by 6 spaces.

The Dimension Tables
The dimension tables in this example, Period, Product, and Market, are
created and loaded with data as Figure A-1 through Figure A-3 show.
Figure A-1: Period Table

This data (period.dat):

96010100
96010200
96010300
96020400
96020500
96020501
96020502
96020503
96020504
9601
9602
96

is loaded into this table:

create table period (
perkey char (8) not null,
primary key (perkey))
maxrows per segment 48;

with this LOAD DATA statement:

load data
inputfile 'period.dat'
modify
into table period (
perkey char(8));

Figure A-2: Product Table

This data (product.dat):

022
055
314
319

is loaded into this table:

create table product(
prodkey char(3) not null,
primary key (prodkey))
maxrows per segment 8;

with this LOAD DATA statement:

load data
inputfile 'product.dat'
modify
into table product(
prodkey char(3));

Figure A-3: Market Table

This data (market.dat):

478
523

is loaded into this table:

create table market (
mktkey char (3) not null,
primary key (mktkey))
maxrows per segment 8;

with this LOAD DATA statement:

load data
inputfile 'market.dat'
modify
into table market (
mktkey char(3));

The Sales Table
Create the Sales table as the following example shows:

create table sales(
perkey char (8) not null,
prodkey char(3) not null,
mktkey char(3) not null,
dollars integer,
primary key (perkey, prodkey, mktkey),
foreign key (perkey) references period (perkey),
foreign key (prodkey) references product(prodkey),
foreign key (mktkey) references market (mktkey));


The data for the Sales table comes from three input files.

File        Data

month.dat   Contains monthly total sales dollars for January to April, 2000,
            for each month for each product in each market.

mtd.dat     Contains total sales dollars sold month-to-date (May 3, 2000)
            for each product in each market.

daily.dat   Contains total sales dollars sold for May 1 to 3, 2000, for each
            day for each product in each market.


[Figure: sample contents of the three input files. Each 17-character record consists of perkey (positions 1-8), prodkey (positions 9-11), mktkey (positions 12-14), and sales dollars (positions 15-17). month.dat holds monthly data for January to April (records such as 96010100022478132); mtd.dat holds month-to-date data for May (records such as 96020500022478055); daily.dat holds daily data for May 1 to 3 (records such as 96020501022478044). The figure annotates the products and markets that appear in the files.]
Load Procedure: Refresh Loads
To load the Sales table for each semi-monthly refresh and the initial load, follow these steps. (Steps 1 to 3 load the detail-level records.)

1. Load the month.dat file containing monthly totals for previous months.
2. Load the mtd.dat file with the month-to-date totals for each product and market.
3. Load the daily.dat file with the daily totals for the current month.

(Steps 4 to 5 produce the quarterly and quarter-to-date records.)

4. Load data for prior months (January to April) and compute aggregates for prior quarters (Q1: January to March) by using the month.dat file.
5. Load to-date data (May) for the current month and compute the quarter-to-date (April and May to date) aggregate by using the mtd.dat file.

(Steps 6 to 7 produce the year-to-date aggregates.)

6. Load data from prior months (January to April) and compute the year-to-date aggregate for prior months using the month.dat file.
7. Load data from the current month (May) and finish the year-to-date aggregate by including the current month and using the mtd.dat file.

For Refresh Load Steps 1 to 3
Use the following LOAD DATA statements to load data into the Sales table
initially and for each semi-monthly refresh thereafter:
/*initial load */
/*loading month records */
load data
inputfile 'month.dat'
modify
into table sales(
perkey char(8),
prodkey char(3),
mktkey char(3),
dollars integer external (3));

/*loading month-to-date records */
load data
inputfile 'mtd.dat'
append
into table sales(
perkey char(8),
prodkey char(3),
mktkey char(3),
dollars integer external (3));

/*loading daily records */
load data
inputfile 'daily.dat'
append
into table sales(
perkey char(8),
prodkey char(3),
mktkey char(3),
dollars integer external (3));

For Refresh Load Steps 4 to 5
Use the following LOAD DATA statements to produce the quarterly and
quarter-to-date aggregates initially and for each semi-monthly refresh
thereafter:
/* load monthly data and compute aggregates for full quarters*/
load data
inputfile 'month.dat'
modify aggregate
into table sales(
perkey position(1:4) char (8),
prodkey position(9:11) char(3),
mktkey position(12:14) char(3),
dollars position(15) integer external(3) add);

/* load month-to-date data; compute aggregate for qtr-to-date*/
load data
inputfile 'mtd.dat'
modify aggregate
into table sales(
perkey position(1:4) char (8),
prodkey position(9:11) char(3),
mktkey position(12:14) char(3),
dollars position(15) integer external(3) add);

For Refresh Load Steps 6 to 7
Use the following LOAD DATA statements to produce the yearly and
year-to-date aggregates initially and for each semi-monthly refresh
thereafter:
/*load monthly data and compute year-to-date for prior months*/
load data
inputfile 'month.dat'
modify aggregate
into table sales(
perkey position(1:2) char (8),
prodkey position(9:11) char(3),
mktkey position(12:14) char(3),
dollars position(15) integer external(3) add);

/* load current month and complete year-to-date aggregate*/
load data
inputfile 'mtd.dat'
modify aggregate
into table sales(
perkey position(1:2) char (8),
prodkey position(9:11) char(3),
mktkey position(12:14) char(3),
dollars position(15) integer external(3) add);

Load Procedure: Daily Loads
Use the following procedure to load the Sales table each day with the daily
updates.

(Steps 1 and 2 load new and modified, or restated, detail data.)

1. Load the daily.dat file containing daily totals for the current month, using MODIFY mode (to capture any restatements).
2. Load the mtd.dat.new file with the month-to-date totals as of the current date, using MODIFY mode.

(Steps 3 and 4 adjust the quarterly and quarter-to-date figures for any restated totals.)

3. Subtract the month-to-date total for yesterday from the quarterly and quarter-to-date figures by loading the mtd.dat.old file using MODIFY AGGREGATE mode and subtracting the dollars column.
4. Add the new month-to-date total for today to the quarterly and quarter-to-date figures by loading the mtd.dat.new file, using MODIFY AGGREGATE mode and adding the dollars column.

(Steps 5 and 6 adjust the yearly and year-to-date figures for any restated totals.)

5. Subtract the month-to-date total for yesterday from the yearly and year-to-date figures by loading the mtd.dat.old file using MODIFY AGGREGATE mode and subtracting the dollars column.
6. Add the new month-to-date total for today to the yearly and year-to-date figures by loading the mtd.dat.new file, using MODIFY AGGREGATE mode and adding the dollars column.

For Daily Load Steps 1 and 2
Use the following LOAD DATA statements to load the daily detail data:

/*loading month-to-date records*/
load data
inputfile 'mtd.dat.new'
modify
into table sales(
perkey char(8),
prodkey char(3),
mktkey char(3),
dollars integer external (3));

/*loading daily records*/
load data
inputfile 'daily.dat'
modify
into table sales(
perkey char(8),
prodkey char(3),
mktkey char(3),
dollars integer external (3));

For Daily Load Steps 3 and 4
Use the following LOAD DATA statements to calculate the quarterly and quarter-to-date aggregates, including adjustments for any restated totals:

/*after initial loads, daily updates*/
/*using yesterday's mtd.dat file */
/*(you will always have to keep the prior day's data)*/
/* monthly totals and aggregates for previous quarters */
load data
inputfile 'mtd.dat.old' /*yesterday's file*/
modify aggregate
into table sales(
perkey position(1:4) char (8),
prodkey position(9:11) char(3),
mktkey position(12:14) char(3),
dollars position(15) integer external(3) subtract);

/* current month for current quarter-to-date data */
load data
inputfile 'mtd.dat.new' /*today's file*/
modify aggregate
into table sales(
perkey position(1:4) char (8),
prodkey position(9:11) char(3),
mktkey position(12:14) char(3),
dollars position(15) integer external(3) add);

For Daily Load Steps 5 and 6
Use the following LOAD DATA statements to calculate the yearly and year-to-date aggregates, including adjustments for any restated totals:

/*aggregates of year-to-date data - previous months*/
load data
inputfile 'mtd.dat.old' /*yesterday's file*/
modify aggregate
into table sales(
perkey position(1:2) char (8),
prodkey position(9:11) char(3),
mktkey position(12:14) char(3),
dollars position(15) integer external(3) subtract);

/*aggregates of year-to-date data - current month*/
load data
inputfile 'mtd.dat.new' /*today's file*/
modify aggregate
into table sales(
perkey position(1:2) char (8),
prodkey position(9:11) char(3),
mktkey position(12:14) char(3),
dollars position(15) integer external(3) add);

Results
You update the database with a new daily file (daily.dat) and a new month-to-date file (mtd.dat) for May 4, with some restated daily totals. The following figure shows the new data. (Only the first row for May 1, the restated totals (three items on May 1), and the data for May 4 from the daily.dat file are shown.)

Month-to-date data for May 4 (mtd.dat):

96020500022478100
96020500022523142
96020500022931093
96020500055478182
96020500055523233
96020500055931303

Daily data for May 4 (daily.dat):

96020501022478044
...
96020501314931265   <- restated
96020501319478154   <- totals for
96020501319523060   <- May 1
...
96020504022478045
96020504022523036
96020504022931027
96020504055478058


The quarter-to-date and year-to-date figures change as shown, reflecting not only the new sales for May 4, but also the restated sales for previous days.

May 3 aggregates:

RISQL> select * from sales
where perkey = '9602    ';
PERKEY    PROD  MKTK  DOLLARS
9602      022   478        91
9602      022   523       997
9602      022   931       104
9602      055   478       183
9602      055   523       925
9602      055   931       860
9602      314   478       213
9602      314   523       181
9602      314   931       158
9602      319   478       605
9602      319   523       620
9602      319   931       184
RISQL> select * from sales
where perkey = '96      ';
PERKEY    PROD  MKTK  DOLLARS
96        022   478       466
96        022   523      1392
96        022   931       607
96        055   478      1038
96        055   523      1431
96        055   931      1406
96        314   478      1586
96        314   523      1962
96        314   931      2349
96        319   478      1424
96        319   523      1488
96        319   931      1172

May 4 aggregates:

RISQL> select * from sales
where perkey = '9602    ';
PERKEY    PROD  MKTK  DOLLARS
9602      022   478       136
9602      022   523      1033
9602      022   931       131
9602      055   478       241
9602      055   523       994
9602      055   931       951
9602      314   478       295
9602      314   523       254
9602      314   931       412
9602      319   478       768
9602      319   523       734
9602      319   931       264
RISQL> select * from sales
where perkey = '96      ';
PERKEY    PROD  MKTK  DOLLARS
96        022   478       511
96        022   523      1428
96        022   931       634
96        055   478      1096
96        055   523      1500
96        055   931      1497
96        314   478      1668
96        314   523      2035
96        314   931      2603
96        319   478      1587
96        319   523      1602
96        319   931      1252
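As a spot check of the adjustment arithmetic, consider product 022 in market 478. Its May 3 month-to-date record was 55 and its May 4 month-to-date record is 100, so the daily load subtracts 55 and adds 100 from each running aggregate:

Quarter-to-date:  91 - 55 + 100 = 136
Year-to-date:    466 - 55 + 100 = 511

These results match the May 4 figures in the query output above.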

Appendix B

Storage Manager Configuration for XBSA Backups
This appendix points out specific configuration requirements for
XBSA-compliant storage management systems. For details about
running TMU backups via the XBSA interface, see page 8-19.

This appendix contains the following sections:

■ General Guidelines
■ Informix Storage Manager (ISM)
■ Legato Networker (NSR)
■ Tivoli Storage Manager (TSM)

General Guidelines
To back up a Red Brick database to an XBSA-compliant storage management system, you must first install and configure the appropriate storage manager server and any required connectivity software. Typically, apart from the “server” package itself, a “client” package and an “Informix” or “XBSA” add-on module are required. Refer to your storage manager documentation for details.
BAR_SM_USER Parameter
To make a successful connection to the storage manager, you must set the
BAR_SM_USER parameter correctly in the rbw.config file. This parameter
corresponds to the bsaObjectOwner variable in the XBSA specification;
however, the value of the parameter depends on the storage manager being
used. Check your storage manager’s client or API reference documents for
information about valid parameter values.

Some examples are shown in the following sections.

BAR_XBSA_LIB Parameter
The XBSA library provided by the storage manager vendor must be specified
with the BAR_XBSA_LIB entry in the rbw.config file. The name, suffix, and
location of the library sometimes depend on the operating system, as shown
in the following sections.
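
For reference, the two entries take the following general form in the
rbw.config file. The values here are placeholders, not defaults; substitute
the owner name and library path that your storage manager requires, as
described in the sections that follow:

OPTION BAR_SM_USER owner_name
OPTION BAR_XBSA_LIB /path/to/xbsa_library.so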

Informix Storage Manager (ISM)
Make sure the rb_tmu user (the user running backups) has ISM administrator
(-admin) privileges.

The XBSA bsaObjectOwner variable is hard-coded in ISM as INFORMIX;
therefore, the BAR_SM_USER option in the rbw.config file must be set to
INFORMIX. The libbsa.so (or libbsa.sl or libbsa.so.1) file is usually the XBSA
library on UNIX platforms. Set the BAR_XBSA_LIB option to the appropriate
filename.

For example:
OPTION BAR_SM_USER INFORMIX
OPTION BAR_XBSA_LIB /usr/informix/lib/libbsa.so

Legato Networker (NSR)
Check the Legato installation guide for details about the installation process.
The following packages must be installed:

■ Server software: Legato Networker Server
■ Client software (on the machine where the Red Brick server is installed):
❑ Legato Networker Client
❑ Informix Networker module (XBSA interface)

Make sure the rb_tmu user (the user running backups) has administrator
privileges.

The XBSA bsaObjectOwner variable is hard-coded in NSR as INFORMIX;
therefore, the BAR_SM_USER option in the rbw.config file must be set to
INFORMIX. The libxnmi.so (or libxnmi.sl or libxnmi.so.1) file from the
Informix Networker module is usually the XBSA library on UNIX platforms.
On Windows NT, the file is %NSRDIR%\bin\libbsa.dll.

For example:
OPTION BAR_SM_USER INFORMIX
OPTION BAR_XBSA_LIB /usr/lib/libxnmi.so
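
On Windows NT, the library entry would instead point at the DLL noted
above:

OPTION BAR_SM_USER INFORMIX
OPTION BAR_XBSA_LIB %NSRDIR%\bin\libbsa.dll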

Tivoli Storage Manager (TSM)
Check the Tivoli installation guide for details about the installation process.
The following packages must be installed:

■ Server software:
❑ Tivoli Storage Manager Server (“server” package)
❑ Tivoli Storage Manager Device Support (“devices” package)
■ Client software (on the machine where the Red Brick server is installed):
❑ TSM Backup-Archive Client (“admin”, “api”, and “ba”
packages)
❑ Tivoli Data Protection for Informix (TDPI). See the TDPI
documentation for details.

The TSM client “node” name is used as the XBSA bsaObjectOwner variable
and the value of BAR_SM_USER. The example below uses the default CLIENT
node. On 32-bit platforms, the Informix XBSA library is under the following
directory:
tivoli_client_dir/informix/bin

On 64-bit platforms, the directory is as follows:
tivoli_client_dir/informix/bin64

The TMU and the Informix XBSA library must both be either 32-bit or 64-bit.
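
On most UNIX systems, you can confirm whether a given library is 32-bit or
64-bit by inspecting it with the operating system's file command, which
reports the word size on ELF platforms (the path below is illustrative):

file /opt/tivoli/tsm/client/informix/bin/libTDPinf.so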

The example BAR_XBSA_LIB setting below is for 32-bit Solaris. TSM does not
provide a Windows NT XBSA library; however, you can use the Solaris
rb_tmu and XBSA library to run backups to a Windows NT Tivoli server.
OPTION BAR_SM_USER CLIENT
OPTION BAR_XBSA_LIB /opt/tivoli/tsm/client/informix/bin/libTDPinf.so

On AIX platforms, the XBSA library file name is bsashr10.o.
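
An AIX entry might therefore look like the following; the directory shown is
an assumption, so use the location where your TDP for Informix package
actually installed the library:

OPTION BAR_XBSA_LIB /usr/tivoli/tsm/client/informix/bin/bsashr10.o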

Notices

IBM may not offer the products, services, or features discussed
in this document in all countries. Consult your local IBM
representative for information on the products and services currently
available in your area. Any reference to an IBM product,
program, or service is not intended to state or imply that only
that IBM product, program, or service may be used. Any
functionally equivalent product, program, or service that does
not infringe any IBM intellectual property right may be used
instead. However, it is the user’s responsibility to evaluate and
verify the operation of any non-IBM product, program, or
service.

IBM may have patents or pending patent applications covering
subject matter described in this document. The furnishing of this
document does not give you any license to these patents. You can
send license inquiries, in writing, to:

IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.

For license inquiries regarding double-byte (DBCS) information,
contact the IBM Intellectual Property Department in your
country or send inquiries, in writing, to:

IBM World Trade Asia Corporation
Licensing
2-31 Roppongi 3-chome, Minato-ku
Tokyo 106-0032, Japan

The following paragraph does not apply to the United Kingdom or any
other country where such provisions are inconsistent with local law:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS
PUBLICATION “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A
PARTICULAR PURPOSE. Some states do not allow disclaimer of express or
implied warranties in certain transactions; therefore, this statement may not
apply to you.

This information could include technical inaccuracies or typographical
errors. Changes are periodically made to the information herein; these
changes will be incorporated in new editions of the publication. IBM may
make improvements and/or changes in the product(s) and/or the
program(s) described in this publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for
convenience only and do not in any manner serve as an endorsement of those
Web sites. The materials at those Web sites are not part of the materials for
this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it
believes appropriate without incurring any obligation to you.

Licensees of this program who wish to have information about it for the
purpose of enabling: (i) the exchange of information between independently
created programs and other programs (including this one) and (ii) the mutual
use of the information which has been exchanged, should contact:

IBM Corporation
J46A/G4
555 Bailey Avenue
San Jose, CA 95141-1003
U.S.A.

Such information may be available, subject to appropriate terms and
conditions, including in some cases, payment of a fee.

The licensed program described in this information and all licensed material
available for it are provided by IBM under terms of the IBM Customer
Agreement, IBM International Program License Agreement, or any equiv-
alent agreement between us.

Any performance data contained herein was determined in a controlled
environment. Therefore, the results obtained in other operating environ-
ments may vary significantly. Some measurements may have been made on
development-level systems and there is no guarantee that these measure-
ments will be the same on generally available systems. Furthermore, some
measurements may have been estimated through extrapolation. Actual
results may vary. Users of this document should verify the applicable data for
their specific environment.

Information concerning non-IBM products was obtained from the suppliers
of those products, their published announcements or other publicly available
sources. IBM has not tested those products and cannot confirm the accuracy
of performance, compatibility or any other claims related to non-IBM
products. Questions on the capabilities of non-IBM products should be
addressed to the suppliers of those products.

Trademarks
AIX; DB2; DB2 Universal Database; Distributed Relational Database
Architecture; NUMA-Q; OS/2, OS/390, and OS/400; IBM Informix;
C-ISAM; Foundation.2000™; IBM Informix 4GL; IBM Informix
DataBlade Module; Client SDK™; Cloudscape™; Cloudsync™;
IBM Informix Connect; IBM Informix Driver for JDBC; Dynamic
Connect™; IBM Informix Dynamic Scalable Architecture™ (DSA);
IBM Informix Dynamic Server™; IBM Informix Enterprise Gateway
Manager (Enterprise Gateway Manager); IBM Informix Extended Parallel
Server™; i.Financial Services™; J/Foundation™; MaxConnect™; Object
Translator™; Red Brick™; IBM Informix SE; IBM Informix SQL;
InformiXML™; RedBack; SystemBuilder™; U2™; UniData; UniVerse;
wintegrate are trademarks or registered trademarks of International
Business Machines Corporation.

Java and all Java-based trademarks and logos are trademarks or registered
trademarks of Sun Microsystems, Inc. in the United States and other
countries.

Windows, Windows NT, and Excel are either registered trademarks or trade-
marks of Microsoft Corporation in the United States and/or other countries.

UNIX is a registered trademark in the United States and other countries
licensed exclusively through X/Open Company Limited.

Other company, product, and service names used in this publication may be
trademarks or service marks of others.


Index

A B
ABORT keyword, TMU Backup log file 8-29
REORG 6-17 Backup operations
ACCEPT keyword, TMU 2-6, 3-91 event logging 8-29
ACCESS_ANY task examples 8-24
authorization 7-6 external tools for 2-46, 8-20
action_log file 8-26 general procedure 8-8, 8-13
ADD aggregate operator, locks held during 8-13
TMU 3-73 metadata 8-26
Administrator tool, using to create objects not backed up 8-14
backup segment 8-9 preparing the database 8-8
Aggregate maintenance storage managers 8-19
described 1-7 strategy 8-6
setting 2-36 syntax diagram 8-22
AGGREGATE mode, loading syntax examples 8-24
data 3-32 tape devices 8-17
AGGREGATE operators, task authorization 8-13
TMU 3-76 versioned databases 8-14
ALTER DATABASE commands XBSA interface 8-19, B-1
CLEAN VERSION LOG 8-14, Backup segment 8-8 to 8-13
8-15 Administrator tool for
CREATE BACKUP DATA 8-9 creating 8-9
DROP BACKUP DATA 8-11 altering 8-12
ALTER SEGMENT operations, on automatically restored 9-11
backup segment 8-12 bitmap information 8-8
APPEND mode, loading data 3-31 creating 8-9
AS $pseudocolumn, TMU 3-67 damaged 8-11
Auto aggregate feature, TMU sizing 8-10
described 1-7 storage requirements 8-10
example A-1 BACKUP_DATABASE
usage 3-76 authorization 2-46, 8-13
Automatic Row Generation, TMU barxbsa utility 8-20
described 3-7, 3-16 to 3-22 bar_metadata directory 8-26
syntax and usage 3-43 BAR_SM_USER parameter 8-19,
AUTOROWGEN parameter B-2
See Automatic Row Generation. BAR_UNIT_SIZE parameter 2-45

BAR_XBSA_LIB parameter 8-19, Constant dates, loading 3-84, 3-109, unload formats 4-5
B-2 3-115 unloading 1-10, 4-3 to 4-20
Binary datetime CONSTANT fields, LOAD DATA Data processing 6-7 to 6-11
inputs 3-116 to 3-119 statement 3-84 Data source names, for remote
Bitmap information, in backup Contact information Intro-17 TMU configuration 2-13
segment 8-8 Control files, TMU 1-8 Database access 2-5
Blanks in input data 3-134 Conventions Database locale 8-14
Boldface type Intro-5 syntax diagrams Intro-7 database option (-d), TMU 2-6
Buffer cache, TMU 2-27 syntax notation Intro-6 Databases
Conversion stage backing up 8-3
in REORG operation 6-10 loading data 3-5 to 3-146
& loading data 3-8 locking by TMU 2-25
PTMU 3-11 moving 4-18
Cache size, buffer (TMU) 2-27
Coordinator stage, REORG restoring 9-3
syntax 2-27
operation 6-10 upgrading to new release 1-12
usage 2-28
Copy management. See rb_cm Datatype conversions, during load
Capacity parameter, for tape
utility. process 3-133 to 3-136
backups 8-18, 8-23
CREATE SEGMENT command 8-9 Dates, loading constant dates 3-84,
Cases tracked by Technical
CREATE TABLE control file 3-109, 3-115
Support Intro-13
from UNLOAD statement 4-10 Datetime inputs
Cautions
Criteria clause, LOAD DATA binary and packed/zoned
escape character and locale 3-93
statement decimal 3-116 to 3-119
UNDO LOAD and REPLACE
comparisons, three-valued Datetime
mode 3-121
logic 3-93 fieldtypes 3-107 to 3-116
CDATA section, in XML files 3-75
locale use 3-42 format masks 3-109
CHARACTER fieldtype, TMU 3-99
syntax and usage 3-90 restricted format masks 3-116
CHECK TABLE and CHECK
CURRENT_DATE keyword, DECIMAL fieldtype, TMU 3-101,
INDEX commands 9-21
TMU 3-107 3-104
Checkpoint backups 8-4
CURRENT_TIME keyword, DEFERRED INDEXES keyword,
Cleanup stage 3-11
TMU 3-107 TMU REORG 6-14
in REORG operation 6-11
CURRENT_TIMESTAMP DELETE ROW keyword, TMU
Client TMU 2-12
keyword, TMU 3-107 REORG 6-18
configuration 2-13
Customer Support Intro-12 Demonstration database, script to
syntax 2-14
install Intro-4
Cold restore operations 9-12
Directories, backups to 8-22
Columns, determining default
values 3-54
' Discard clause
loading data 3-43 to 3-57
Comment clause, LOAD DATA Damaged segments
reloading XML discards 3-44
statement 3-95 to 3-96 not backed up 8-14
Discard files, TMU
Comments, TMU control file 1-9 restoring 9-19
all discards 3-46
Commit record interval, Data
locale 3-42
setting 2-40 backing up 8-3
multiple 3-43
Commit time interval, setting 2-42 conversion to EXTERNAL
optimized load discards 3-60
Comparisons, TMU LOAD DATA format 4-6
referential integrity
statement 3-90 loading 1-10, 3-5 to 3-146
discards 3-46, 6-21
Compressed files loading into third-party
types and use 3-43
input to TMU 2-10 tools 4-19
Discarded rows during load 3-133
output from TMU 4-11 restoring 9-3
Discardfile clause, REORG 6-19


DISCARDFILE keyword External full backups 8-4 Format clause, LOAD DATA
REORG 6-19 EXTERNAL keyword, TMU statement 3-29
TMU 3-46, 3-60 UNLOAD statement 4-9 FORMAT keyword, TMU
DISCARDS keyword External-variable data format, TMU IBM SEPARATED by ’c’
REORG 6-18 example 4-22 format 3-34
TMU 3-48 SEPARATED by ’c’ format 3-33
Disk file formats, input data 3-123 UNLOAD 3-34
Disk spill files, INDEX ) XML 3-34
TEMPSPACE parameters 2-28 Format masks, datetime
Fieldtypes, TMU input
Documentation datetime fieldtypes 3-109
records 3-97 to 3-116
list for IBM Red Brick numeric fieldtypes 3-116
CHARACTER 3-99
Warehouse Intro-14 Full backups
conversions during
online Intro-16 defined 8-4
load 3-133 to 3-136
DOUBLE PRECISION fieldtype, external 8-4, 8-20
CURRENT_DATE 3-107
TMU 3-106 foreign 8-4, 8-20
CURRENT_TIME 3-107
Driver TMU 2-12 syntax 8-22
CURRENT_TIMESTAMP 3-107
DST_DATABASES table 8-9 Full restores
DECIMAL 3-101, 3-104
dump and restore commands, defined 9-4
DOUBLE PRECISION 3-106
UNIX 8-20 foreign 9-11
FLOAT EXTERNAL 3-103
Duplicate records syntax 9-13
INTEGER 3-101, 3-105
discarding 3-60
M4DATE 3-108
optimize mode 3-61
REAL 3-106
scale of field 3-104
*
SMALLINT 3-105 GENERATE statements (TMU)
( TIME 3-107 CREATE TABLE syntax and
Empty input fields 3-134 TIMESTAMP 3-107 usage 5-3 to 5-5
Environment variables Intro-5 TINYINT 3-105 example 5-8
general use with TMU 1-9 File formats, input data example with rb_cm 7-15
RB_CONFIG with TMU 2-8 disk files 3-123 LOAD DATA syntax and
RB_PATH with TMU 2-6 fixed record 3-123 usage 5-5 to 5-8
remote TMU configuration 2-13 separated record 3-128
USER statements 2-22 tape files 3-131
Error codes 2-7 variable record 3-124 ,
Error-handling stage XML 3-129
IBM standard label tapes 3-122
loading data 3-9 File redirection, TMU 2-9
INCREMENT fields 3-87
PTMU 3-11 Filesize, TAR limit 4-12
Incremental backups
ESCAPE keyword Fixed-format records, input
defined 8-4
Criteria clause 3-93 data 3-123
syntax 8-22
UNLOAD statement 4-14 FIXEDLEN keyword, LOAD DATA
Index name clause, TMU
Event logging, for backup statement 3-124
REORG 6-14
operations 8-29 FLOAT EXTERNAL fieldtype,
Index-building stage
Exit status codes 2-7 TMU 3-103
in REORG operation 6-11
External backups 2-46, 8-20 FORCE INTACT command, ALTER
loading data 3-9
External data format, TMU SEGMENT 8-11
Indexes
conversion rules 4-6 FORCE option, for restore
default names 6-14
example 4-20, 5-8 operations 9-20
DEFERRED 6-4
for unloaded data 4-5 Foreign backups 2-46, 8-20
rebuilding with REORG 6-3
with rb_cm utility 7-11 Foreign restores 2-46, 8-21, 9-11


INDEX_TEMPSPACE parameters LOAD DATA control file trim options 3-73, 3-82
DIRECTORY 2-30 from UNLOAD statement 4-10 UPDATE mode 3-32
MAXSPILLSIZE 2-31 with rb_cm utility 7-12 Loading data 3-5 to 3-146
THRESHOLD 2-30 LOAD DATA statement Auto aggregate
TMU control 2-28 See also Loading data. example A-1 to A-16
INDEX_TEMPSPACE_DUPLICAT ACCEPT criteria 2-6, 3-91 conversion stage 3-8
E SPILLPERCENT parameter ADD aggregate operator 3-73 datatype
syntax 2-29 AGGREGATE mode 3-32 conversions 3-133 to 3-136
usage 2-31 APPEND mode 3-31 discard files 3-46, 3-60, 6-21
Input clause, LOAD DATA AUTOROWGEN keyword 3-49 discarded rows 3-6, 3-60, 3-133
statement 3-25 to 3-29 clauses, main error handling and cleanup
Input data Comment 3-95 to 3-96 stage 3-9
CONSTANT fields 3-84 Criteria 3-90 failure or interruption 3-28
fieldtypes 3-97 to 3-116 Discard 3-43 to 3-57 index stage 3-9
file formats 3-122 Format 3-29, 3-38 input stage 3-8
INCREMENT fields 3-87 Input 3-25 to 3-29 inputs and outputs 3-6
ordered 3-15 Locale 3-38 into segments 3-87
record formats 3-122 MMAP Index 3-63 load information 3-95
SEQUENCE fields 3-85 to 3-86 Optimize 3-59 main output stage 3-9
unused fields 3-66 Row Messages 3-58 memory-mapping indexes 3-63
Input files, LOAD DATA Segment 3-87 to 3-90 offline load 3-87
statement 3-25 Table 3-65 to 3-87 overview 1-10
Input locale, defined 3-39 CONSTANT fields 3-84 procedure 3-12 to 3-13
Input stage creating with GENERATE 7-12 processing flow 3-8
in REORG operation 6-10 fieldtypes 3-97 to 3-116 RBW_LOADINFO system
loading data 3-8 FIXEDLEN keyword 3-124 table 3-95
PTMU 3-10 INCREMENT fields, input SERIAL column 3-68
INSERT mode, loading data 3-31 data 3-87 terminating with NOT NULL
INTEGER fieldtype, TMU 3-101, input files 3-25 DEFAULT NULL 3-67
3-105 INSERT mode 3-31 unused input fields 3-66
Internal data format, TMU MAX aggregate operator 3-74 Locale clause
for unloaded data 4-5 MIN aggregate operator 3-73 LOAD DATA statement 3-38
with rb_cm utility 7-11 MODIFY mode 3-33 XML encodings 3-40, 3-41
Interrupted load 3-28 NLS_LOCALE keyword 3-38 Locales
interval option (-i), TMU 2-6 NULLIF keyword 3-73 backed-up databases 8-14
INTO OFFLINE SEGMENT POSITION keyword 3-72 default 3-41
keyword, TMU 3-87 pseudocolumns 3-66, 3-83, 3-92 TMU input files 3-38
Invalid STAR indexes, cause of 6-4 rb_cm control files 7-10 UNLOAD operations 4-5
INVALIDATE INDEX keyword, rb_cm example 7-15, 7-19 use by TMU 1-9
TMU REORG 6-17 RECORDLEN keyword 3-30, Locking
3-123 behavior during REORG 6-22
REJECT criteria 3-91 by TMU 2-25
/ REPLACE mode 3-31 during backups 8-13
RETAIN keyword 3-67 SET LOCK command 2-25
Large input and output files 3-7
SEQUENCE fields 3-85 to 3-86 wait behavior for TMU 2-26, 2-38
Level 0, 1, and 2 backups 8-4
SUBSTR keyword 3-100 Logging, for backup and restore
LIKE, NOT LIKE, TMU
SUBTRACT aggregate operations 8-29
wildcards 3-93
operator 3-73 LTRIM keyword, TMU 3-73, 3-82
syntax summary 3-137 to 3-146


PRECOMPUTED_VIEW_MAINTE
0 2 NANCE_ON_ERROR
M4DATE keyword, TMU 3-108 Offline-load operations option 2-37
Main output stage overview 1-10 Primary key indexes, memory-
loading data 3-9 syntax 3-88 mapping 3-63
PTMU 3-11 See also Segments. Pseudocolumn, TMU
MAX aggregate operator, ON DISCARD keyword, TMU field specification 3-66
TMU 3-74 REORG 6-17 with ACCEPT or REJECT 3-92
MAXROWS PER SEGMENT Online backups 8-4 with concatenated fields 3-83
parameter Online manuals Intro-16 PSUs, restoring single 9-19
and duplicate records 3-61 Operating system access 2-4 PTMU
effect on STAR indexes 6-4 Optimize clause 3480/3490 multiple-tape
MAXSPILLSIZE value 2-31 LOAD DATA statement 3-59 drive 2-54
Memory-map limit OPTIMIZE keyword, TMU automatic row generation 2-53
MMAP INDEX clause 3-63, 6-16 REORG 6-15 conversion stage 3-11
SET command 2-35 Order discard limits 2-53
Messages input data, TMU 3-15 effective use 2-52 to 2-55
backup log file 8-30 table order for loads 3-14 error-handling stage 3-11
displayed during restores 9-16 unloaded data 4-9 exit status codes 2-7
locale of 3-42 OTHER keyword, TMU features, described 1-6
Metadata, for TMU backups 8-26 REORG 6-21 input stage 3-10
MIN aggregate operator, TMU 3-73 LOAD operation stages 3-10
MMAP INDEX clause main output and index
LOAD DATA statement 3-63 3 stages 3-11
REORG statement 6-16 multiple tape drives 2-54
Packed decimal datetime parallel-processing
MODIFY mode
inputs 3-116 to 3-119 parameters 2-48 to 2-50
ACCEPT or REJECT clause 3-92
Partial availability, for tables with performance capabilities 3-9
loading data 3-33
damaged segments 9-21 syntax for rb_ptmu 2-5
MODIFY_ANY task
Partial restore operations 9-4, 9-19 TMU SERIAL MODE
authorization 7-6
PARTIAL_AVAILABILITY parameter 2-50
Moving a database 4-18
parameter 9-21
Multiple discard files 3-43
Password, TMU command line 2-6
Pipes 5
as TMU input 2-10
1 for TMU outputs 4-11 Radix point, TMU
NLS_LOCALE keyword, multiple TMU inputs 2-10 overriding locale 3-40
TMU 3-38, 3-40 TMU GENERATE statement 5-4 specifying 3-102
NO WAIT on locks, TMU 2-26, 2-38 POSITION keyword, TMU 3-72 rbw.config file
NULL values Precomputed views backing up externally 8-20
for input data, example 7-16 maintaining 3-16, 3-49 not backed up 8-14
in external-format data 4-21, 5-9 maintaining with TMU RBW_LOADINFO table
in numeric columns 3-134 REORG 6-4 rb_cm results 7-20
NULLIF keyword, TMU rebuilding 6-4 retrieving data 3-96
GENERATE example 5-9 setting 2-36 RBW_LOADINFO_LIMIT
LOAD DATA statement 3-73 PRECOMPUTED_VIEW_MAINTE parameter, example 2-34
rb_cm example 7-16 NANCE option 2-36 rbw_media_history file 8-26
UNLOAD example 4-21 RBW_SEGMENTS system
table 8-12


rb_cm utility 7-3 to 7-20 Remote TMU 2-12 restoring a segment 9-19
examples 7-13 to 7-19 See also Client TMU. SHOW option 9-17
LOAD control file example 7-15, configuration 2-13 syntax diagram 9-13
7-19 example 2-19 syntax examples 9-15
necessary authorizations 7-6 REMOTE_TMU_LISTENER task authorization 9-10
overview 7-4 to 7-20 parameter 2-14 RESTORE...SHOW command,
requirements for running 7-5 summary of operation 2-18 TMU 9-17
results, verifying 7-20 syntax 2-14 RESTORE_DATABASE
TMU control files for use REMOTE_TMU_LISTENER authorization 2-46, 9-10
with 7-10 parameter 2-14 Restricted datetime masks 3-117
UNLOAD control file REORG RETAIN keyword, TMU 3-67
example 7-15, 7-18 after partial restore 9-21 RI_DISCARDFILE keyword,
RB_CONFIG environment aggregate maintenance 6-4 TMU 3-46
variable 2-8, 2-13 cleanup stage 6-11 Row Messages clause, LOAD
rb_ctmu 2-12 conversion stage 6-10 DATA statement 3-57 to 3-58
See also Client TMU. coordinator stage 6-10 Row messages, managing 2-38
rb_drvtmu 2-12 discardfile format 6-24 RTRIM keyword, TMU 3-73, 3-82
RB_HOST environment disk space 6-24
variable 2-8, 2-13 index-building stage 6-11
RB_NLS_LOCALE environment input stage 6-10 6
variable 1-9, 3-40 locking behavior 6-22
Scale, for fieldtype 3-104
See also NLS_LOCALE keyword. memory-mapping indexes 6-16
Search condition
RB_PATH environment online and offline operation 6-23
WHERE clause in UNLOAD
variable 2-6, 2-8, 2-13 parallel 6-7
statement 4-13
rb_ptmu file, location 2-5 partial index
wildcard characters 4-14
See also PTMU. limitations of 6-23
Segment clause, LOAD DATA
rb_tmu file, location 2-5 options 6-5
statement 3-87 to 3-90
See also TMU. precomputed views 6-4
Segment name clause, TMU
Read-only operations, during sequence of tasks 6-9
REORG 6-13
backups 8-13 serial 6-7
Segments
REAL keyword, TMU 3-106 SET command syntax 2-47
altering backup 8-11
RECALCULATE RANGES syntax 6-12, 6-13
converting table to multiple 4-18
keyword, TMU REORG 6-15 usage 6-4
creating backup 8-9
RECORDLEN keyword 3-30, 3-123 REPLACE mode, loading data 3-31
loading data into 3-87
Records to load between each RESET, TEMPSPACE
restoring single 9-19
COMMIT 2-40 parameters 2-31
unloading specific 4-9
redbrick directory, defined 2-5 Restated daily totals, with Auto
Selective column
redbrick user ID, defined 2-4 Aggregate A-2
updates 3-69 to 3-70
REFERENCE CHECKING option, Restore operations
Selective unload, wildcard
REORG 6-16 cold 9-12
character 4-14
Referential integrity damaged segments 9-19
Separated-format records 3-33,
maintaining with TMU description 9-4
3-128
REORG 6-4 event logging 8-29
SEQUENCE fields, LOAD DATA
overriding of 6-21 examples 9-5
statement 3-85 to 3-86
with AUTOROWGEN FORCE option 9-20
SERIAL column
described 3-16 to 3-22 foreign 9-11
datatype, use with 3-101, 3-105
syntax 3-45 general procedure 9-10
loading 3-68
Registry, Windows 2-8, 7-7 to 7-8 locks during 8-13
REJECT keyword, TMU 3-91 partial 9-19


Segment clause, using with 3-87 Standard error, redirecting from Tapes
unloading 3-68 TMU 2-9 backups to 8-23
Serial loader. See TMU. 2-50 Standard input, to TMU 2-10 input data file
SET commands, SQL Standard label tapes formats 3-122 to 3-133
PARTIAL AVAILABILITY 9-21 ANSI 3-122, 3-132 input data record
SET commands, TMU backups to 8-18 formats 3-122 to 3-133
BAR_UNIT_SIZE 2-45 IBM 3-122 tape devices 3-122
DATEFORMAT 2-33 Standard output, from TMU 4-11 TAR file, POSIX limit 4-12
FOREIGN FULL BACKUP 2-46, START/STOP RECORD keywords, TAR tapes, with input data 3-122,
8-20 TMU 3-28 3-131
FOREIGN FULL RESTORE 2-46, storage managers Technical support Intro-12
8-20, 9-11 configuration B-1 Templates, GENERATE CREATE
INDEX TEMPSPACE 2-28 TMU backups 8-19 TABLE statement, TMU 5-3
list of 2-23 to 2-25 SUBSTR keyword, LOAD DATA TEMPSPACE_DUPLICATESPILLP
LOCK 2-25 statement 3-100 ERCENT parameters, TMU
LOCK WAIT 8-13 SUBTRACT aggregate operator, control 2-28
PRECOMPUTED VIEW TMU 3-73 Third-party tools, loading with
MAINTENANCE 2-36 SYNCH statement warehouse data 4-19
PRECOMPUTED VIEW in rb_cm copy operation 7-12 TIME fieldtype, TMU 3-107
MAINTENANCE ON usage 3-119 Time interval to load data before
ERROR 2-37 Syntax COMMIT 2-42
STATS 2-45 LOAD DATA 3-24, 3-137 TIMESTAMP fieldtype, TMU 3-107
TEMPSPACE rb_ctmu 2-14 timestamp option (-t), TMU 2-6
DUPLICATESPILLPERCENT rb_ptmu 2-5 TINYINT keyword, TMU 3-105
2-28 rb_tmu 2-5 TMU
TMU BUFFERS 2-27 REORG statement 6-12, 6-13 aggregate maintenance,
TMU COMMIT RECORD Syntax diagrams described 1-7
INTERVAL 2-40 conventions for Intro-7 Auto aggregate example A-1
TMU COMMIT TIME keywords in Intro-9 backups 8-3
INTERVAL 2-42 System requirements Intro-4 buffer cache size 2-27
TMU CONVERSION System segment 9-11 comments in control file 1-9
TASKS 2-48 control files 1-8
TMU INDEX TASKS 2-47, 2-48 database option (-d) 2-6
TMU INPUT TASKS 2-47 7 exit status codes 2-7
TMU MAX TASKS 2-47 file redirection 2-9
Table clause, LOAD DATA
TMU MMAP LIMIT 2-35 generating control files
statement 3-65 to 3-87
TMU ROW MESSAGES 2-38 with GENERATE 5-3 to 5-10
Table name clause, TMU
TMU SERIAL MODE 2-50 with UNLOAD 4-10
REORG 6-13, 6-20
TMU VERSIONING 2-39 input data formats 3-122
Tables
SET options in copy operation 7-13 input data, decompressing 2-10
locking by TMU 2-25
SHOW command, TMU input file locale 3-38
multiple segments, converting
RESTORE 9-17 interval option (-i) 2-6
to 4-18
Simple fields LOAD DATA
order for load operations 3-14
syntax 3-71 statement 3-23 to 3-146
unloading data 4-3 to 4-20
XML path 3-73 loading data 3-5 to 3-146
Tab-separated data 3-33
SMALLINT fieldtype, TMU 3-105 locking operations 2-26, 2-38
Software dependencies Intro-4 logging in 1-9
Space, working space for offline memory-mapping indexes 3-63
loads 3-89 pipes 2-9, 4-11


REORG operations 6-3 UNLOAD statement (TMU) Version log segment


report interval (-i option) 2-6 rb_cm example 7-15, 7-18 automatically restored 9-11
restore operations 9-3 rb_cm, use with 7-10 not backed up 8-14
SYNCH statement 3-119 variable format 4-9 Versioned databases
syntax for rb_tmu 2-5 WHERE clause 4-13 backing up 8-14
timestamp option (-t) 2-6 Unloading data 4-3 to 4-20 loading 2-39
unloading data 4-3 to 4-20 data conversion for
upgrading databases 1-12 EXTERNAL 4-6
USER statement 1-9 external format :
wait and no-wait control 2-26, described 4-5
WAIT and NO WAIT on locks 2-26
2-38 example 4-17
TMU 2-38
TMU COMMIT RECORD procedure 4-16
Wildcard characters
INTERVAL formats, internal and external 4-5
LOAD DATA statement 3-93
syntax 2-40 generating CREATE TABLE
UNLOAD statement 4-14
usage 2-41 statement 4-10
WORKING_SPACE keyword,
TMU COMMIT TIME INTERVAL generating LOAD DATA
TMU 3-89
example 2-43 statement 4-10
syntax 2-42 internal format
usage 2-42 described 4-5
TMU ROW MESSAGES example 4-15
;
parameter 2-38 procedure 4-14 XBSA backups 8-19
TMU_BUFFERS parameter 2-27 overview 1-10 configuration B-1
TMU_CONVERSION_TASKS privileges required 4-4 Xerces-C++ parser, for XML
parameter 2-48 row order 4-4, 4-9 loads 3-34, 3-129
syntax 2-49 selected rows 4-19 XML files
usage 2-50 SERIAL column 3-68 CDATA section 3-75
TMU_INDEX_TASKS UPDATE mode, loading data 3-32 nested structure 3-130
parameter 2-48 UPGRADE statement (TMU), structure 3-129
example 2-50 syntax 1-12 #PCDATA 3-75
syntax 2-49 Upgrading databases to new XML format
TMU_MMAP_LIMIT release 1-12 description 3-129
parameter 3-63, 6-16 USER statement (TMU) 1-9 discard files 3-44
TMU_ROW_MESSAGES, TUNE Username, database example load 3-78
option 2-38 TMU command line 2-6 LOAD DATA syntax 3-34, 3-73,
TMU_SERIAL_MODE TMU control file 2-21 3-75
parameter 2-50 Locale clause 3-40, 3-41
TRIM keyword, TMU 3-73, 3-82 XML paths 3-75, 3-129
Tuning parameters, TMU 2-27 9
VARIABLE format, UNLOAD
statement 4-9
=
8 Variable-format records, input Zoned decimal datetime
UNDO LOAD keyword, data 3-124 inputs 3-116 to 3-119
TMU 3-121 VARLEN EXTERNAL fieldtype,
UNIX dump and restore TMU 3-99
commands 8-20 VARLEN fieldtype, TMU 3-99
UNLOAD format, TMU input VERIFY command, ALTER
data 3-122 SEGMENT 8-11


Symbols
#PCDATA, in XML files 3-75
%, TMU NULL indicator 4-7, 4-21,
5-9
%, TMU wildcard character 3-93
.backup_dirty_psu file 8-26
.dbinfo file 8-26
.odbc.ini file, DSNs in 2-13
_ , TMU wildcard character 3-93

