Table Management Utility
Reference Guide
IBM Red Brick Warehouse
Version
August
Part No.
Note:
Before using this information and the product it supports, read the information in the appendix
entitled “Notices.”
This document contains proprietary information of IBM. It is provided under a license agreement and is
protected by copyright law. The information contained in this publication does not include any product
warranties, and any statements provided in this manual should not be interpreted as such.
When you send information to IBM, you grant IBM a nonexclusive right to use or distribute the information
in any way it believes appropriate without incurring any obligation to you.
© Copyright International Business Machines Corporation 1996, 2002. All rights reserved.
US Government User Restricted Rights—Use, duplication or disclosure restricted by GSA ADP Schedule
Contract with IBM Corp.
Table of Contents
Introduction
In This Introduction . . . . . . . . . . . . . . . . . 3
About This Guide . . . . . . . . . . . . . . . . . . 3
Types of Users . . . . . . . . . . . . . . . . . . 4
Software Dependencies . . . . . . . . . . . . . . . 4
Documentation Conventions . . . . . . . . . . . . . . 5
Typographical Conventions . . . . . . . . . . . . . 5
Syntax Notation . . . . . . . . . . . . . . . . . 6
Syntax Diagrams . . . . . . . . . . . . . . . . . 7
Keywords and Punctuation . . . . . . . . . . . . . 9
Identifiers and Names . . . . . . . . . . . . . . . 10
Icon Conventions . . . . . . . . . . . . . . . . . 11
Customer Support . . . . . . . . . . . . . . . . . . 12
New Cases . . . . . . . . . . . . . . . . . . . 12
Existing Cases . . . . . . . . . . . . . . . . . . 13
Troubleshooting Tips . . . . . . . . . . . . . . . . 13
Related Documentation . . . . . . . . . . . . . . . . 14
Additional Documentation . . . . . . . . . . . . . . . 16
Online Documents . . . . . . . . . . . . . . . . 16
Printed Documents . . . . . . . . . . . . . . . . 16
Online Help . . . . . . . . . . . . . . . . . . . 16
IBM Welcomes Your Comments . . . . . . . . . . . . . 17
Chapter 1  Introduction to the Table Management Utility
In This Chapter . . . . . . . . . . . . . . . . . . . 1-3
TMU Operations and Functions . . . . . . . . . . . . . 1-4
TMU Control Files and Statements . . . . . . . . . . . . 1-8
Termination . . . . . . . . . . . . . . . . . . 1-8
Comments . . . . . . . . . . . . . . . . . . . 1-9
Locales and Multibyte Characters . . . . . . . . . . . 1-9
USER Statement . . . . . . . . . . . . . . . . . 1-9
LOAD DATA and SYNCH Statements . . . . . . . . . 1-10
UNLOAD Statements . . . . . . . . . . . . . . . 1-10
GENERATE Statements . . . . . . . . . . . . . . 1-11
REORG Statements . . . . . . . . . . . . . . . . 1-11
BACKUP Statements . . . . . . . . . . . . . . . 1-11
RESTORE Statements . . . . . . . . . . . . . . . 1-12
UPGRADE Statements . . . . . . . . . . . . . . . 1-12
SET Statements . . . . . . . . . . . . . . . . . 1-13
Chapter 2  Running the TMU and PTMU
In This Chapter . . . . . . . . . . . . . . . . . . . 2-3
User Access and Required Permission . . . . . . . . . . . 2-4
Operating System Access . . . . . . . . . . . . . . 2-4
Database Access . . . . . . . . . . . . . . . . . 2-5
Permissions on TMU Output Files . . . . . . . . . . . 2-5
Syntax for rb_tmu and rb_ptmu Programs . . . . . . . . . 2-5
Exit Status Codes . . . . . . . . . . . . . . . . . . 2-7
Setting Up the TMU . . . . . . . . . . . . . . . . . 2-8
Remote TMU Setup and Syntax . . . . . . . . . . . . . 2-12
Client-Server Compatibility . . . . . . . . . . . . . 2-12
Client Configuration . . . . . . . . . . . . . . . 2-13
Server Configuration . . . . . . . . . . . . . . . 2-14
Syntax for the rb_ctmu Program . . . . . . . . . . . 2-14
Summary of Remote TMU Operation . . . . . . . . . . 2-18
Example: Windows-to-UNIX Remote TMU Operation . . . . 2-19
USER Statement for User Name and Password . . . . . . . . 2-21
SET Statements and Parameters to Control Behavior . . . . . . 2-23
Lock Behavior . . . . . . . . . . . . . . . . . . 2-25
Buffer-Cache Size . . . . . . . . . . . . . . . . 2-27
Temporary Space Management . . . . . . . . . . . . 2-28
Format of Datetime Values . . . . . . . . . . . . . 2-33
Load Information Limit . . . . . . . . . . . . . . . 2-34
Memory-Map Limit . . . . . . . . . . . . . . . . 2-35
Setting Precomputed View Maintenance . . . . . . . . . 2-36
Precomputed View Maintenance On Error . . . . . . . . 2-36
Managing Row Messages . . . . . . . . . . . . . . 2-38
Enabling Versioning . . . . . . . . . . . . . . . . 2-39
Commit Record Interval . . . . . . . . . . . . . . . 2-40
Commit Time Interval . . . . . . . . . . . . . . . 2-42
Displaying Load Statistics . . . . . . . . . . . . . . 2-45
Backup and Restore (BAR) Unit Size . . . . . . . . . . 2-45
External Backup and Restore Operations . . . . . . . . . 2-46
REORG Tasks . . . . . . . . . . . . . . . . . . 2-47
Parallel Loading Tasks (PTMU Only) . . . . . . . . . . 2-48
Serial Mode Operation (PTMU Only) . . . . . . . . . . 2-50
Suggestions for Effective PTMU Operations . . . . . . . . . 2-52
Operations That Use Parallel Processing . . . . . . . . . 2-52
Discard Limits on Parallel Load Operations . . . . . . . . 2-53
AUTOROWGEN with the PTMU . . . . . . . . . . . 2-53
Multiple Tape Drives with the PTMU . . . . . . . . . . 2-54
3480/3490 Multiple-Tape Drive with the PTMU . . . . . . 2-54
Chapter 3  Loading Data into a Warehouse Database
In This Chapter . . . . . . . . . . . . . . . . . . . 3-5
The LOAD DATA Operation . . . . . . . . . . . . . . 3-6
Inputs and Outputs . . . . . . . . . . . . . . . . 3-6
Processing Stages for Loading Data . . . . . . . . . . . 3-8
Procedure for Loading Data . . . . . . . . . . . . . . . 3-12
Some Preliminary Decisions . . . . . . . . . . . . . . . 3-14
Determining Table Order . . . . . . . . . . . . . . 3-14
Ordering Input Data . . . . . . . . . . . . . . . . 3-15
Maintaining Referential Integrity with Automatic Row Generation 3-16
Writing a LOAD DATA Statement . . . . . . . . . . . . . 3-23
LOAD DATA Syntax . . . . . . . . . . . . . . . . . 3-24
Input Clause . . . . . . . . . . . . . . . . . . . . 3-25
Format Clause . . . . . . . . . . . . . . . . . . . 3-29
EBCDIC to ASCII Conversion . . . . . . . . . . . . . 3-35
Locale Clause . . . . . . . . . . . . . . . . . . . . 3-38
Locale Specifications for XML Input Files . . . . . . . . . 3-41
Usage Notes . . . . . . . . . . . . . . . . . . . 3-42
Discard Clause . . . . . . . . . . . . . . . . . . . 3-43
Usage. . . . . . . . . . . . . . . . . . . . . 3-54
Row Messages Clause . . . . . . . . . . . . . . . . 3-57
Optimize Clause . . . . . . . . . . . . . . . . . . 3-59
MMAP Index Clause . . . . . . . . . . . . . . . . . 3-63
Table Clause . . . . . . . . . . . . . . . . . . . . 3-65
Loading a SERIAL Column . . . . . . . . . . . . . 3-68
Selective Column Updates with RETAIN and DEFAULT . . . 3-69
Simple Fields . . . . . . . . . . . . . . . . . . 3-71
Concatenated Fields. . . . . . . . . . . . . . . . 3-81
Constant Fields . . . . . . . . . . . . . . . . . 3-84
Sequence Fields . . . . . . . . . . . . . . . . . 3-85
Increment Fields . . . . . . . . . . . . . . . . . 3-86
Segment Clause . . . . . . . . . . . . . . . . . . 3-87
Criteria Clause . . . . . . . . . . . . . . . . . . . 3-90
Comment Clause . . . . . . . . . . . . . . . . . . 3-95
Field Types . . . . . . . . . . . . . . . . . . . . 3-97
Character Field Type . . . . . . . . . . . . . . . 3-99
Numeric External Field Types . . . . . . . . . . . . 3-101
Floating-Point External Field Type . . . . . . . . . . 3-103
Packed and Zoned Decimal Field Types . . . . . . . . . 3-104
Integer Binary Field Types . . . . . . . . . . . . . 3-105
Floating-Point Binary Field Types . . . . . . . . . . . 3-106
Datetime Field Types . . . . . . . . . . . . . . . 3-107
Format Masks for Datetime Fields . . . . . . . . . . . . 3-109
Subfield Components . . . . . . . . . . . . . . . 3-110
Restricted Datetime Masks for Numeric Fields . . . . . . . . 3-116
Writing a SYNCH Statement . . . . . . . . . . . . . . 3-119
Format of Input Data . . . . . . . . . . . . . . . . . 3-122
Disk Files . . . . . . . . . . . . . . . . . . . 3-123
Tape Files on UNIX Operating Systems . . . . . . . . . 3-131
Field-Type Conversions . . . . . . . . . . . . . . . . 3-133
LOAD DATA Syntax Summary . . . . . . . . . . . . . 3-137
Chapter 4  Unloading Data from a Table
In This Chapter . . . . . . . . . . . . . . . . . . . 4-3
The UNLOAD Operation . . . . . . . . . . . . . . . . 4-4
Internal Format . . . . . . . . . . . . . . . . . . 4-5
External Format . . . . . . . . . . . . . . . . . 4-5
Data Conversion to External Format . . . . . . . . . . 4-6
UNLOAD Syntax . . . . . . . . . . . . . . . . . . 4-8
Unloading or Loading Internal-Format Data . . . . . . . . . 4-14
Unloading or Loading External-Format Data . . . . . . . . . 4-16
Converting a Table to Multiple Segments . . . . . . . . . . 4-18
Moving a Database . . . . . . . . . . . . . . . . . . 4-18
Loading External-Format Data into Third-Party Tools . . . . . . 4-19
Unloading Selected Rows . . . . . . . . . . . . . . . . 4-19
Example: External Fixed-Format Data . . . . . . . . . . 4-20
Example: External Variable-Format Data . . . . . . . . . 4-22
Chapter 5  Generating CREATE TABLE and LOAD DATA Statements
In This Chapter . . . . . . . . . . . . . . . . . . . 5-3
Generating CREATE TABLE Statements . . . . . . . . . . . 5-3
Generating LOAD DATA Statements . . . . . . . . . . . . 5-5
Example: GENERATE Statements and External-Format Data . . . 5-8
Chapter 6  Reorganizing Tables and Indexes
In This Chapter . . . . . . . . . . . . . . . . . . . 6-3
The REORG Operation . . . . . . . . . . . . . . . . 6-3
REORG Operation Options. . . . . . . . . . . . . . 6-5
Data Processing During the REORG Operation . . . . . . . . 6-7
Coordinator Stage . . . . . . . . . . . . . . . . . 6-10
Input Stage . . . . . . . . . . . . . . . . . . . 6-10
Conversion Stage . . . . . . . . . . . . . . . . . 6-10
Index-Building Stage . . . . . . . . . . . . . . . . 6-11
Cleanup Stage . . . . . . . . . . . . . . . . . . 6-11
REORG Syntax . . . . . . . . . . . . . . . . . . . 6-12
discardfile Clause . . . . . . . . . . . . . . . . . 6-19
Usage Notes . . . . . . . . . . . . . . . . . . . 6-21
Chapter 7  Moving Data with the Copy Management Utility
In This Chapter . . . . . . . . . . . . . . . . . . . 7-3
The rb_cm Utility . . . . . . . . . . . . . . . . . . 7-4
System Requirements . . . . . . . . . . . . . . . 7-5
Database Security Requirements . . . . . . . . . . . 7-6
The rb_cm Syntax . . . . . . . . . . . . . . . . . . 7-7
TMU Control Files for Use with rb_cm . . . . . . . . . . . 7-10
LOAD and UNLOAD Statements . . . . . . . . . . . 7-11
SYNCH Statement . . . . . . . . . . . . . . . . 7-12
SET Statements . . . . . . . . . . . . . . . . . 7-13
Examples of rb_cm Operations . . . . . . . . . . . . . 7-13
Example: Copying Data Between Different Computers. . . . 7-14
Example: Copying Data Between Tables on the Same Computer 7-18
Verifying the Results of rb_cm Operations . . . . . . . . . 7-20
Chapter 8  Backing Up a Database
In This Chapter . . . . . . . . . . . . . . . . . . . 8-3
Backup Levels and Modes . . . . . . . . . . . . . . . 8-4
External Full Backups . . . . . . . . . . . . . . . 8-4
Restore Rules . . . . . . . . . . . . . . . . . . 8-5
Backup Data . . . . . . . . . . . . . . . . . . 8-5
Backup Strategies . . . . . . . . . . . . . . . . 8-6
Backup Procedure . . . . . . . . . . . . . . . . 8-8
Preparing the Database for Backups . . . . . . . . . . . 8-8
ALTER DATABASE CREATE BACKUP DATA Command . . 8-9
ALTER DATABASE DROP BACKUP DATA Command . . . 8-10
Storage Requirements for the Backup Segment . . . . . . 8-10
Altering the Backup Segment . . . . . . . . . . . . 8-11
How to Run a TMU Backup . . . . . . . . . . . . . . 8-13
Scope of Backup Operations . . . . . . . . . . . . . 8-14
Configuring the Size of Backup Files . . . . . . . . . . 8-15
Backups to Tape . . . . . . . . . . . . . . . . . 8-17
Using a Storage Manager for TMU Backups . . . . . . . 8-19
Using External Tools for Full Backups . . . . . . . . . 8-20
BACKUP Syntax . . . . . . . . . . . . . . . . . 8-22
Backup Metadata . . . . . . . . . . . . . . . . . . 8-26
Media History File (rbw_media_history) . . . . . . . . 8-27
Backup Log File (action_log) . . . . . . . . . . . . . 8-29
Chapter 9  Restoring a Database
In This Chapter . . . . . . . . . . . . . . . . . . . 9-3
Full and Partial TMU Restores . . . . . . . . . . . . . . 9-4
Restore Path . . . . . . . . . . . . . . . . . . . 9-4
Restore Examples . . . . . . . . . . . . . . . . . 9-5
How to Run a TMU Restore . . . . . . . . . . . . . . . 9-10
Recommended Procedure for Foreign Restore Operations . . . 9-11
Restore of Special Segments . . . . . . . . . . . . . 9-11
Cold Restore Operations. . . . . . . . . . . . . . . 9-12
PSUs for Objects Created After a Restored Backup. . . . . . 9-12
RESTORE Syntax . . . . . . . . . . . . . . . . . 9-13
Partial Restore Procedure . . . . . . . . . . . . . . 9-19
Appendix A  Example Using the TMU in AGGREGATE Mode
Appendix B  Storage Manager Configuration for XBSA Backups
Notices
Index
Introduction
In This Introduction . . . . . . . . . . . . . . . . . . 3
About This Guide . . . . . . . . . . . . . . . . . . . 3
Types of Users . . . . . . . . . . . . . . . . . . . 4
Software Dependencies . . . . . . . . . . . . . . . . 4
Documentation Conventions . . . . . . . . . . . . . . . 5
Typographical Conventions . . . . . . . . . . . . . . 5
Syntax Notation . . . . . . . . . . . . . . . . . . 6
Syntax Diagrams . . . . . . . . . . . . . . . . . . 7
Keywords and Punctuation . . . . . . . . . . . . . . 9
Identifiers and Names . . . . . . . . . . . . . . . . 10
Icon Conventions . . . . . . . . . . . . . . . . . . 11
Comment Icons . . . . . . . . . . . . . . . . . 11
Platform Icons . . . . . . . . . . . . . . . . . . 11
Customer Support . . . . . . . . . . . . . . . . . . . 12
New Cases . . . . . . . . . . . . . . . . . . . . 12
Existing Cases . . . . . . . . . . . . . . . . . . . 13
Troubleshooting Tips . . . . . . . . . . . . . . . . . 13
Related Documentation . . . . . . . . . . . . . . . . . 14
Additional Documentation . . . . . . . . . . . . . . . . 16
Online Documents . . . . . . . . . . . . . . . . . 16
Printed Documents . . . . . . . . . . . . . . . . . 16
Online Help . . . . . . . . . . . . . . . . . . . . 16
About This Guide
This guide provides the information you need to use both the Table
Management Utility (TMU) and its parallel version, the PTMU, to load and
maintain the tables and indexes in IBM Red Brick Warehouse databases. It
includes information necessary for the effective use of the TMU, as well as
syntax definitions and procedural descriptions. Use it in conjunction with the
Administrator’s Guide to develop and maintain an efficient data warehouse
operation.
Types of Users
This guide is written for the following users:
■ Database administrators
■ Database users who are responsible for loading and maintaining the
tables and indexes in IBM Red Brick Warehouse
Software Dependencies
This guide assumes that you are using IBM Red Brick Warehouse,
Version 6.2, as your database server.
IBM Red Brick Warehouse includes the Aroma database, which contains
sales data about a fictitious coffee and tea company. The database tracks daily
retail sales in stores owned by the Aroma Coffee and Tea Company. The
dimensional model for this database consists of a fact table and its
dimensions.
The scripts that you use to install the demonstration database reside in the
redbrick_dir/sample_input directory, where redbrick_dir is the IBM Red Brick
Warehouse directory on your system.
Documentation Conventions
This section describes the conventions that this document uses. These
conventions make it easier to gather information from this and other volumes
in the documentation set.
■ Typographical conventions
■ Syntax notation
■ Syntax diagrams
■ Keywords and punctuation
■ Identifiers and names
■ Icon conventions
Typographical Conventions
This document uses the following conventions to introduce new terms,
illustrate screen displays, describe command syntax, and so forth.
Convention   Meaning
italics      Within text, new terms and emphasized words appear in italics.
             Within syntax and code examples, variable values that you are
             to specify appear in italics.
monospace    Information that the product displays and information that you
             enter appear in a monospace typeface.
Convention   Meaning
KEYSTROKE    Keys that you are to press appear in uppercase letters in a
             sans serif font.
Syntax Notation
This guide uses the following conventions to describe the syntax of
operating-system commands.
Syntax Diagrams
This guide uses diagrams built with the following components to describe
the syntax for statements and all commands other than system-level
commands.
Component                                        Meaning
Begin symbol                                     Statement begins.
End symbol                                       Statement ends.
Item on a bypass line (for example, DISTINCT)    Optional item.
Stacked items on the main path (for example,     Required item with choice.
DBA TO, CONNECT TO, SELECT ON)                   One and only one item must
                                                 be present.
Items on a repeat loop marked with a comma       Optional items. Several items
(for example, ASC, DESC)                         are allowed; a comma must
                                                 precede each repetition.
The preceding syntax elements are combined to form a diagram as follows.
REORG table_name [ INDEX ( index_name [, index_name]... ) ]
    [ RECALCULATE RANGES ] [ OPTIMIZE { ON | OFF } ] ;

(Rendered here in linear form: brackets mark optional items; braces with a
vertical bar mark a required choice.)
Complex syntax diagrams such as the one for the following statement are repeated as point-of-reference aids for the detailed diagrams of their components. Point-of-reference diagrams are indicated by their shadowed corners, gray lines, and reduced size.
LOAD DATA input_clause [ format_clause ] [ discard_clause ]
    [ optimize_clause ] table_clause [ segment_clause ]
    [ criteria_clause ] [ comment_clause ] ;

input_clause:
      { INPUTFILE | INDDN } 'filename'
    | TAPE DEVICE 'device_name' ( 'filename' )
Keywords and Punctuation
Keywords are words reserved for statements and all commands except
system-level commands. When a keyword appears in a syntax diagram, it is
shown in uppercase characters. You can write a keyword in uppercase or
lowercase characters, but you must spell the keyword exactly as it appears in
the syntax diagram.
Identifiers and Names
Variables serve as placeholders for identifiers and names in the syntax
diagrams and examples. You can replace a variable with an arbitrary name,
identifier, or literal, depending on the context. Variables are also used to
represent complex syntax elements that are expanded in additional syntax
diagrams. When a variable appears in a syntax diagram, an example, or text,
it is shown in lowercase italic.
The following syntax diagram uses variables to illustrate the general form of
a simple SELECT statement:

    SELECT column_name FROM table_name
When you write a SELECT statement of this form, you replace the variables
column_name and table_name with the name of a specific column and table.
Icon Conventions
Throughout the documentation, you will find text that is identified by several
different types of icons. This section describes these icons.
Comment Icons
Comment icons identify three types of information, as the following table
describes. This information always appears in italics.
Platform Icons
Feature, product, and platform icons identify paragraphs that contain
platform-specific information.
Icon       Description
UNIX       Identifies information that is specific to the UNIX and Linux
           operating systems.
Windows    Identifies information that is specific to Windows platforms.
Customer Support
If you have technical questions about IBM Red Brick Warehouse but cannot
find the answer in the appropriate document, contact IBM Customer Support
as follows:
Internet access:   http://www-3.ibm.com/software/data/informix/support/
New Cases
To log a new case, you must provide the following information:
■ Name and version of the client tool that you are using
■ Version of the Red Brick ODBC Driver or Red Brick JDBC Driver that
you are using, if applicable
■ Name and version of client network or TCP/IP stack in use
■ Error messages returned by the client application
■ Server and client locale specifications
Existing Cases
The support engineer who logs your case or first contacts you will always
give you a case number. This number is used to keep track of all the activities
performed during the resolution of each problem. To inquire about the status
of an existing case, you must provide your case number.
Troubleshooting Tips
You can often reduce the time it takes to close your case by providing the
smallest possible reproducible example of your problem. The more you can
isolate the cause of the problem, the more quickly the support engineer can
help you resolve it.
Related Documentation
The standard documentation set for IBM Red Brick Warehouse includes the
following documents.
Client Installation and Connectivity Guide
    Includes procedures for installing ODBC, Red Brick JDBC Driver, RISQL
    Entry Tool, and RISQL Reporter on client systems. Describes how to
    access IBM Red Brick Warehouse using ODBC for C and C++ applications
    and JDBC for Java applications.

IBM Red Brick Vista User's Guide
    Describes the IBM Red Brick Vista aggregate computation and management
    system. Illustrates how Vista improves query performance by
    automatically rewriting queries to use aggregates, describes how the
    Advisor recommends the best set of aggregates based on data collected
    daily, and explains how aggregate tables are maintained when their
    detail tables are updated.
RISQL Entry Tool and RISQL Reporter User's Guide
    A complete guide to the RISQL Entry Tool, a command-line tool used to
    enter SQL statements, and the RISQL Reporter, an enhanced version of
    the RISQL Entry Tool with report-formatting capabilities.

SQL Reference Guide
    A complete language reference for the Red Brick SQL implementation and
    RISQL extensions for IBM Red Brick Warehouse databases.
Additional Documentation
For additional information, you might want to refer to the following types of
documentation:
■ Online documents
■ Printed documents
■ Online help
Online Documents
A Documentation CD that contains IBM Red Brick Warehouse documents in
electronic format is provided with your Red Brick products. You can copy the
documentation to your computer or access it directly from the CD.
Printed Documents
To order printed documents, contact your sales representative.
Online Help
IBM provides online help with each graphical user interface (GUI) that
displays information about those interfaces and the functions that they
perform. Use the help facilities that each GUI provides to display the online
help.
IBM Welcomes Your Comments
To help us with future versions of our documents, let us know about any
corrections or clarifications that you would find useful. Send your
comments to:

comments@vnet.ibm.com
This address is reserved for reporting errors and omissions in our documentation. If you prefer, you can fill out a comment form by going to:
http://www.ibm.com/software/data/rcf/
Chapter 1
Introduction to the Table Management Utility
In This Chapter . . . . . . . . . . . . . . . . . . . . 1-3
TMU Operations and Functions . . . . . . . . . . . . . . 1-4
TMU Control Files and Statements . . . . . . . . . . . . . 1-8
Termination . . . . . . . . . . . . . . . . . . . . 1-8
Comments . . . . . . . . . . . . . . . . . . . . 1-9
Locales and Multibyte Characters . . . . . . . . . . . . 1-9
USER Statement . . . . . . . . . . . . . . . . . . 1-9
LOAD DATA and SYNCH Statements . . . . . . . . . . . 1-10
UNLOAD Statements . . . . . . . . . . . . . . . . 1-10
GENERATE Statements . . . . . . . . . . . . . . . . 1-11
REORG Statements . . . . . . . . . . . . . . . . . 1-11
BACKUP Statements . . . . . . . . . . . . . . . . . 1-11
RESTORE Statements. . . . . . . . . . . . . . . . . 1-12
UPGRADE Statements . . . . . . . . . . . . . . . . 1-12
SET Statements . . . . . . . . . . . . . . . . . . . 1-13
In This Chapter
The Table Management Utility (TMU) is the IBM Red Brick Warehouse
program that you use to load data into the database and to maintain its tables,
indexes, precomputed views, and referential integrity. While the primary
function of the TMU is to load and index large amounts of data quickly, it also performs several other tasks, which the sections that follow describe.
TMU Operations and Functions
The TMU is a program that runs independently of the database server; it is
invoked from the operating-system command line and uses the same config-
uration information as other components of IBM Red Brick Warehouse. The
TMU program can be invoked remotely, allowing DBAs to load data from an
input file on a networked client machine into a database table on the
production server machine.
Before you invoke the TMU, you must use the TMU control language to write
a control file that specifies the task to be done and provides the information
needed to perform that task. Next you run the TMU, naming the control file as its input. The TMU reads the control file and carries out the task,
reading its input from tape, disk, or standard input, and modifying the
database or writing output files to tape or disk as directed. At the same time,
the TMU writes messages for system logging and accounting purposes. The
TMU supports a variety of tape, disk, and data formats.
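As a sketch of this workflow, a minimal control file might look like the following (the database user, password, table name, and filename are hypothetical, and the exact syntax of each statement is defined in later chapters):

```sql
-- nightly.tmu: hypothetical TMU control file.
-- Supplies credentials, then names the task to perform.
USER dbadmin 'secret';

REORG sales;
```

You would then run the TMU from the operating-system command line, naming this control file as its input; the exact command-line arguments for rb_tmu and rb_ptmu are given in Chapter 2, "Running the TMU and PTMU."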
Figure 1-1 illustrates the TMU and its input and outputs.
Figure 1-1
TMU Input and Output Options

Inputs: a control file; unload-format files; input data files; and input
data on standard input or pipes, read by the rb_tmu or rb_ptmu program.
Outputs: database system-table updates; table and index files; discard
files; unload-format files; LOAD DATA control files; CREATE TABLE DDL
files; accounting and log-request messages sent to the UNIX log daemon
(rbwlogd) or the Windows log thread; data on standard output or pipes;
and messages on standard error or in the warning-message file.
Figure 1-2 shows how the TMU can be invoked on a remote server from a
local client machine:
Figure 1-2
Remote TMU Architecture
In this case, the TMU is invoked from the client machine (using the rb_ctmu
program) but the LOAD DATA or UNLOAD operation is performed against an
IBM Red Brick Warehouse database on the server machine. This feature
allows DBAs to maintain control files, input files, and output files on the
client, reducing the security risk on the production machine.
TMU Control Files and Statements
A TMU control file contains one or more statements that specify the functions
to be performed and the information the TMU needs to perform those
functions. A single control file can contain multiple control statements of the
same or different types; for example:
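For example, a single control file might combine statements of different types along these lines (user, table, and index names are hypothetical; the full syntax of each statement appears in the later chapters):

```sql
USER dbadmin 'secret';               -- must be first, if present

REORG sales INDEX (sales_pk_idx);    -- rebuild one index
REORG store RECALCULATE RANGES;      -- then reorganize another table
```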
Termination
Each control statement must end with a semicolon (;) as the detailed syntax
diagrams for each statement show in the remaining chapters. If multiple
control statements are included in a single control file, each one must end
with a semicolon.
Comments
You can enclose comments in a control file with either C-language-style
delimiters (/*…*/), in which case they can span multiple lines, or precede
them with two dashes (--) and end them with end-of-line, in which case they
are limited to a single line.
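Both comment styles might appear in a control file as follows (the statements and names are hypothetical):

```sql
/* Weekly maintenance control file.
   C-language-style comments such as this one
   can span multiple lines. */
USER dbadmin 'secret';   -- a dash-style comment ends at end-of-line
REORG sales;
```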
Locales and Multibyte Characters
You must specify most of a TMU control statement (LOAD DATA, UNLOAD,
SYNCH, REORG, and so on) with ASCII characters, regardless of the database
locale. However, the TMU does support multibyte characters for database
object names and for some special characters used in those statements, and
you can specify a locale for a TMU input file that differs from the database
locale. Messages that the TMU returns are displayed in the language of the
database locale unless the RB_NLS_LOCALE environment variable for the
current user overrides that locale. In all other cases, TMU operations use the
locale of the database.
For a list of defined locales supported by IBM Red Brick Warehouse, see the
locales.pdf file in the relnotes directory on your installation CD.
USER Statement
A USER statement provides a database user name and password, which
allows you to invoke the TMU without entering a username and password on
the command line or interactively in response to a prompt.
Only one USER statement can occur in a control file and it must be the first
statement in the file. If a username and password are provided on the
command line, those values override a USER statement present in the control
file and a warning is issued that the USER statement was overridden.
For information about USER statements, refer to “USER Statement for User
Name and Password” on page 2-21.
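A control file that uses this statement might begin as follows (the credentials and table name are hypothetical):

```sql
-- USER must be the first statement in the control file.
USER dbadmin 'secret';

-- Statements that require database access follow.
REORG sales;
```

If you instead supply a user name and password on the command line, those values override this statement and the TMU issues a warning.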
LOAD DATA and SYNCH Statements
A LOAD DATA statement provides the control information you need to load
data into a database. This information includes the LOAD DATA keywords,
the source of data, its format, its locale, what to do with records that cannot
be loaded, and how to map the input data record fields into the database
table columns. The statement does not include the data itself.
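The general shape of such a statement is sketched below. The table, column, and file names are hypothetical, and the exact spellings of the clauses and field types are defined in Chapter 3, so treat this only as an outline:

```sql
LOAD DATA INPUTFILE 'daily_sales.txt'   -- source of the input data
DISCARDFILE 'daily_sales.disc'          -- where rejected records go
INSERT INTO TABLE sales (               -- field-to-column mapping
    perkey   INTEGER EXTERNAL,
    storekey INTEGER EXTERNAL,
    dollars  DECIMAL EXTERNAL
);
```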
For information about LOAD DATA and SYNCH control files, refer to
Chapter 3, “Loading Data into a Warehouse Database.” For additional
information about load operations with the rb_cm Copy Management
utility, refer to Chapter 7, “Moving Data with the Copy Management Utility.”
UNLOAD Statements
An UNLOAD statement provides the information you need to unload data
from a database table in any of several formats to move the data or to use it
with another tool. An UNLOAD statement contains the UNLOAD keyword
and other relevant information such as the name of the table to be unloaded,
a description of the desired output format, and where to write the output
files. Data can be unloaded in the order determined by either a table scan or
an index. Either a complete table can be unloaded, or only those rows that
meet the specified criteria.
In cases where you are unloading data that will later be loaded into another
table, you can include instructions in the UNLOAD statement for the TMU to
automatically generate an SQL CREATE TABLE statement that corresponds to
the table to be unloaded and a TMU LOAD DATA statement that corresponds
to its data. These automatically generated statements provide templates that,
with little or no modification, allow you to create a table and load it with the
unloaded data. (This functionality is also available in the GENERATE
statement.)
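In outline, an UNLOAD statement might look like the following sketch. The names are hypothetical, and the keyword spellings here are illustrative only; the actual keywords for the output destination and format are given in "UNLOAD Syntax" in Chapter 4:

```sql
UNLOAD sales                 -- table to unload
OUTPUTFILE 'sales.unl'       -- where to write the output
EXTERNAL FORMAT;             -- external rather than internal format
```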
GENERATE Statements
A GENERATE statement provides the information you need to automatically
generate an SQL CREATE TABLE or TMU LOAD DATA statement based on an
existing table. A GENERATE statement allows you to separate the task of
generating the CREATE TABLE or LOAD DATA statements from the task of
unloading the data, to generate these statements without actually unloading
the data.
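In outline (table and file names are hypothetical, and the keyword spellings here are illustrative only; see Chapter 5 for the actual statement forms):

```sql
-- Produce a CREATE TABLE statement and a matching LOAD DATA
-- statement for an existing table, without unloading its data.
GENERATE CREATE TABLE STATEMENT FOR sales OUTPUTFILE 'sales_ddl.sql';
GENERATE LOAD DATA STATEMENT FOR sales OUTPUTFILE 'sales_load.tmu';
```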
REORG Statements
A REORG statement instructs the TMU to reorganize a table, which includes
enforcing referential integrity and rebuilding any specified indexes to
improve internal storage. (Referential integrity is the relational property that
each foreign-key value in a referencing table exists as a primary-key value in
the referenced table.) A REORG statement can also be used to rebuild any
aggregate tables defined on the target table of the REORG statement. A
REORG statement includes the REORG keyword, a table name, index names,
and precomputed view names and instructions for rebuilding them.
BACKUP Statements
A BACKUP statement performs a level 0, 1, or 2 backup of the database, in
either online or checkpoint mode. For information about BACKUP control
files, refer to Chapter 8, “Backing Up a Database.”
RESTORE Statements
A RESTORE statement fully or partially restores the database from one or
more TMU backups. For information about RESTORE control files, refer to
Chapter 9, “Restoring a Database.”
UPGRADE Statements
An UPGRADE statement instructs the TMU to upgrade an existing database
so that it is compatible with a newer version of IBM Red Brick Warehouse.
Not all versions require that databases be upgraded; for those that do require
an upgrade, the information needed to upgrade a database is built into the
UPGRADE command in the new IBM Red Brick Warehouse software.
UPGRADE [DDLFILE ’filename’] ;
For version-specific upgrade information, refer to the release notes that
accompany each release of IBM Red Brick Warehouse.
For general instructions on how to install new software and plan an upgrade
for a production database, refer to the Installation and Configuration Guide for
your platform and the Release Notes.
SET Statements
Various options are available that allow you to customize certain aspects of
TMU behavior for a specific session. You can include SET statements for these
options in a control file to override global configuration parameters set in the
configuration file (rbw.config). For example, you can use a SET statement to
change the default location and amount of temporary space that a specific
load operation uses.
For information about these SET statements, refer to “SET Statements and
Parameters to Control Behavior” on page 2-23.
Chapter 2
Running the TMU and PTMU
In This Chapter . . . . . . . . . . . . . . . . . . . . 2-3
User Access and Required Permission . . . . . . . . . . . . 2-4
Operating System Access . . . . . . . . . . . . . . . 2-4
Database Access . . . . . . . . . . . . . . . . . . 2-5
Permissions on TMU Output Files . . . . . . . . . . . . 2-5
In This Chapter
Before you can use the TMU or the Parallel TMU (PTMU), you must prepare a
control file that contains statements that define the tasks to perform. These
statements are described in subsequent chapters. When the control file is
ready (either a newly created file or an existing file that was modified) and
any required input files are ready, you can run the TMU or the PTMU, as this
chapter describes.
Use of the PTMU is similar to use of the TMU, except as noted in the syntax on
page 2-5 and the specific suggestions for PTMU use on page 2-52.
User Access and Required Permission
To use the TMU or PTMU, you must have the required permissions for both
the operating system and the database.
Operating System Access
If you run either the TMU or the PTMU (rb_tmu or rb_ptmu) from a user other
than redbrick, you must ensure that the redbrick user ID has read access to
the control file and input files and write access to the directories in which the
table, index, discard, and generated files are written. For example, if the
administrative user at your site has the user name redbrick and you are
running the TMU under your user name calvin, then you must make sure that
the redbrick user has the necessary permissions to read, write, and execute
the required files. If you use another name for the administrative user, you
must make sure that user name has the necessary permissions on the
required files.
Windows On Windows platforms, the user who runs the TMU or PTMU must also
belong to one of the following user groups:
■ Administrator
■ REDBRICK_DBA
■ *_REDBRICK_DBA (where * refers to any prefix; for example,
TMU_REDBRICK_DBA)
If you do not want to make the TMU user part of the standard Windows
Administrator group, you must create the group REDBRICK_DBA or
*_REDBRICK_DBA with the User Manager administrative tool (accessible
from the Start menu) and assign the user to that group. ♦
Database Access
You can specify the database to access at the command line when you invoke
the TMU. If you do not specify a database at the command line, the TMU uses
the database that the RB_PATH environment variable specifies.
The database user ID you supply to the TMU must have the necessary object
privileges and task authorizations to perform the TMU operation. You can
supply the user ID and the password at the command line, in response to a
prompt, or in a USER statement.
UNIX Permissions on TMU Output Files
By default, all TMU output files (including discard files, generated files,
unload files, and backup files) inherit the permissions of the user who runs
the rb_tmu or rb_ptmu executable. These permissions are based on the
current umask setting for that user. If umask is set to 0, the output files are
readable and writable by all users. To restrict read and write access to the
redbrick user only, set umask to 077:
% umask 077
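The effect of umask on newly created files can be checked with ordinary shell commands; the file name below is arbitrary:

```shell
#!/bin/sh
umask 077                  # new files get no group or other permissions
f=/tmp/umask_demo.$$
touch "$f"
ls -l "$f"                 # first column shows -rw------- (owner-only rw)
rm -f "$f"
```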
Syntax for rb_tmu and rb_ptmu Programs
The executable files for the TMU and PTMU are named rb_tmu and rb_ptmu,
respectively, and they are located in the bin subdirectory in the redbrick
directory. These programs run under the redbrick user ID and the redbrick
user owns all files that they create. The rb_ctmu program is a client executable
used to start a load or unload operation on a remote server; see page 2-12 for details.
Options

You can specify any or all of the following options, in any order; each option
is described in the usage message shown below.
Usage
■ To display the TMU or PTMU syntax, enter rb_tmu or rb_ptmu with
no options at the system prompt. For example:
% rb_tmu
Usage: /redbrick_dir/bin/rb_tmu [<Options>]
<control_file> [<username> [<password>]]
Options:
-i or -interval <nrows> Display progress every
<nrows> rows.
-t or -timestamp Append a timestamp to all
TMU messages.
-d or -database <database> Database to use.
■ If the TMU or PTMU is interrupted, it exits immediately after closing
any open tables.
Exit Status Codes
Upon exiting, the TMU and PTMU return the highest status code encountered
during processing. For example, if the TMU generates only warning
messages, it returns an exit status code of 1; if, however, it generates both
warning and fatal messages, it returns an exit status code of 3. You can use
these exit status codes to control user-implemented applications that run the
TMU or PTMU. The following table defines the meaning of each exit status
code.
Status    Meaning
0         Success; no warning, error, or fatal messages
1         Warning messages were generated
2         Error messages were generated
3         Fatal messages were generated
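These codes are easy to act on from a shell wrapper. The helper below is a hypothetical sketch, not part of the TMU itself; the meanings of statuses 1 and 3 come from the text above, while 0 (success) and 2 (error) are assumed:

```shell
#!/bin/sh
# Hypothetical helper: map a TMU/PTMU exit status to a severity label.
# Statuses 1 (warning) and 3 (fatal) are documented above; 0 and 2 are assumed.
classify_status() {
  case "$1" in
    0) echo "success" ;;
    1) echo "warnings" ;;
    2) echo "errors" ;;
    3) echo "fatal" ;;
    *) echo "unknown" ;;
  esac
}

# Typical use after a load (control file and credentials are illustrative):
#   rb_tmu nightly_load.tmu "$DB_USER" "$DB_PASS"
#   echo "TMU finished with $(classify_status $?)"
```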
Setting Up the TMU
To set up your environment and invoke the TMU or PTMU:

1. Log in as user redbrick. (If you run the TMU as any user other than
redbrick, you must verify that user redbrick has the necessary access
to all locations used for input, output, discard, and generated files, as
described on page 2-4.)
2. Make sure the system is configured as you want it:
■ Verify that the RB_HOST environment variable is set to the
correct database daemon (UNIX) or service (Windows).
■ Verify that the RB_CONFIG environment variable is set to the
directory that contains the rbw.config file.
■ Verify that the RB_PATH environment variable is set to the
correct database. If it is not set to the database that you want to
access, you must use the -d option to provide the logical
database name when you invoke the TMU.
Windows
On Windows, the default value of these environment variables is determined
by the database service that RB_HOST selects in the Registry. ♦
3. Invoke the TMU. Use rb_tmu and specify the file that contains the TMU
control statements. You can enter your database user name and password
on the command line, in a USER statement within the control file, or in
response to a prompt from the TMU.
Windows
At a Windows shell prompt, enter:
c:\db1> set RB_PATH=AROMA
c:\db1> rb_tmu -i 10000 -timestamp aroma.tmu curly secret
The TMU first logs in user curly with password secret to the database
referenced by the RB_PATH variable. The TMU executes the control statements
contained in aroma.tmu located in the current directory. Progress messages
are issued approximately every 10,000 rows (based on row batch intervals, as
described on page 2-6). Time stamps are appended to all messages.
The following example illustrates how to invoke the TMU and specify a
database with the -d option. Enter the following command at a shell prompt:
rb_tmu -d AROMA aroma.tmu
The TMU checks the control file for a USER statement. If it does not find one,
it prompts for the user name and password before it executes the control
statements in the file aroma.tmu on the Aroma database located in the
directory defined in the rbw.config file.
The following example illustrates how to invoke the PTMU and specify an
interval with a timestamp. Enter the following command:
rb_ptmu -i 10000 -timestamp aroma.tmu curly secret
If you do not want to display messages on the terminal or write them to a file,
you can redirect the system stderr output to /dev/null to prevent the creation
of very large files. IBM does not recommend this practice because you cannot
detect any problems that occur. You can also redirect messages through a
filter such as UNIX grep to filter out repetitive informational messages. ♦
The following example illustrates how to use the output from another
program (in this case, zcat, a decompression program), piped at a shell
prompt, as the input for the TMU.
The following example illustrates another way to use standard input for the
input data, allowing a single control file (mydb.tmu) to process different data
input files (one of which is market.txt). At a shell prompt, enter:
rb_tmu mydb.tmu system secret < market.txt
The TMU uses the file market.txt as the input when it executes the mydb.tmu
control file. You can name another input file the next time you use the
mydb.tmu control file. The control file must specify standard input (’-’) as the
input source. For more information about file redirection and pipes, refer to
your operating-system documentation.
UNIX The following example, based on UNIX named pipes and the tee command,
illustrates how to run multiple instances of the TMU to read an input file once
and load multiple tables.
Windows Similar capabilities are available on Windows, using named pipes and
third-party software.
Assume the daily input data is in a file named Sales.txt and it is stored in a
table named Sales. This same input data is also used to generate the
aggregate sales data stored in tables named Sales_Monthly and
Sales_Quarterly.
To run multiple instances of the TMU to read an input file once and load
multiple tables:
1. Modify the load script that loads the Sales table to read its input from
the standard input (stdin). The modified script is in a file named
Sales.stdin.tmu.
2. Modify the load scripts that load the aggregate tables to read their
input from named pipes pipeM and pipeQ. These modified scripts
are in files named SalesMonthly.pipeM.tmu and
SalesQuarterly.pipeQ.tmu.
3. Create two named pipes pipeM and pipeQ with the UNIX mkfifo
utility:
% mkfifo pipeM pipeQ
4. Start two instances of the TMU in the background. They use the load
scripts modified in step 2, which read input from the two pipes, as
control files.
Important: In the C shell, the greater-than and ampersand characters (>&) direct
the output from each rb_tmu process to a separate file instead of to the terminal,
and the trailing ampersand (&) runs each process in the background. You must use
the corresponding characters for the UNIX shell that you are using.
% rb_tmu SalesMonthly.pipeM.tmu system manager >& pipe_mout &
% rb_tmu SalesQuarterly.pipeQ.tmu system manager >& pipe_qout &
5. Read the input data with the UNIX cat command, pipe the standard
output to the two pipes by using the UNIX tee command, and pipe
the tee standard output to a third instance of the TMU, which uses the
modified Sales load script as a control file:
% cat Sales.txt | tee pipeM pipeQ | rb_tmu Sales.stdin.tmu \
system manager >& sales_out
When you use the named pipes and the tee command, you read the input file
only once, but load it into three tables with a single operation in the same
time it takes to load a single table. You can also run step 5 in the background
so you can monitor all three output files. ♦
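The fan-out pattern itself can be rehearsed with standard tools before involving the TMU; in this miniature, wc -l stands in for each rb_tmu instance, and every reader sees the same three rows:

```shell
#!/bin/sh
set -e
dir=$(mktemp -d) && cd "$dir"
mkfifo pipeM pipeQ                    # the two named pipes (step 3)
wc -l < pipeM > countM &              # background readers stand in for the
wc -l < pipeQ > countQ &              # two background rb_tmu processes (step 4)
printf 'r1\nr2\nr3\n' | tee pipeM pipeQ | wc -l > countS   # step 5 pattern
wait
cat countM countQ countS              # each reader counted the same 3 rows
cd / && rm -rf "$dir"
```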
Remote TMU Setup and Syntax
The remote TMU feature allows DBAs to start a LOAD DATA, UNLOAD, or
GENERATE operation from a client machine, using local control files and
input files. The TMU runs on the remote server machine and returns its
output files to the client.
The Client TMU executable file is named rb_ctmu. When you invoke the
rb_ctmu, it establishes a connection with the server machine, and the server-
side Driver TMU program (rb_drvtmu) starts an rb_ptmu process for the
remote operation. Users must configure the client and server machines
correctly in order for remote TMU operations to work, as discussed in the
following sections.
Although the Client TMU is intended for TMU operations on remote servers,
the rb_ctmu program can also be used to run local TMU operations. The
environment setup on the machine where you start the rb_ctmu program
determines the target host and database for the TMU operation.
Client/Server Compatibility
The client and server do not need to be on the same platform or operating
system. For example, you can use a 32-bit Windows client to run a remote
load operation on a 64-bit UNIX server. Windows and UNIX clients and
servers are supported in any combination.
Client Configuration
The RB_CONFIG environment variable must be set on the client machine to
point to the location of the client copy of the rbw.config file.
The recommended way to specify the host and database for a remote TMU
operation is to create an ODBC DSN (data source name). Then you can either
specify the DSN with the -s option on the rb_ctmu command line or set the
RB_DSN environment variable.
If you choose not to create DSNs, you can either set the RB_HOST and
RB_PATH environment variables or specify the -h (host) and -d (database)
values on the rb_ctmu command line. If you use the -h option to specify the
remote server, the client rbw.config file must contain a SERVER entry that
exactly matches the SERVER entry in the server-side rbw.config file. For
example:
RB_620 SERVER brick:6200
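Putting these pieces together, a client-side session might be prepared as follows. The directory, DSN name, control file, and credentials are all illustrative, and the rb_ctmu call is shown as a comment because it requires a live server:

```shell
#!/bin/sh
export RB_CONFIG=/opt/redbrick/client   # directory containing the client rbw.config
export RB_DSN=red_brick_620             # DSN naming the remote host and database

# With the environment prepared, start the remote operation:
#   rb_ctmu sales.tmu orwell george
```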
Server Configuration
Before you use rb_ctmu, check that the REMOTE_TMU_LISTENER configuration
parameter is set to ON (the default) in the server-side rbw.config file:
RBWAPI REMOTE_TMU_LISTENER ON
This parameter enables the server to respond to requests from the remote
TMU. The listening port is the server port +2. For example, if you specified
port number 6200 during the IBM Red Brick Warehouse installation, the
remote TMU port will be 6202.
To turn off the remote TMU feature, you must set REMOTE_TMU_LISTENER to
OFF, then stop and restart the rbwapid daemon. If the parameter is not
present in the rbw.config file, the remote TMU is still, by default, turned on.
You can also check the rbwlogview output when the rbwapid daemon or
service is started to see whether the server is listening for remote TMU
requests.
The client TMU always invokes the parallel TMU (rb_ptmu) on the server
machine. If you want remote TMU loads to run in serial mode, you must
include the following SET command in the control file:
set tmu serial mode on;
Syntax for the rb_ctmu Program
The syntax for the client TMU is as follows:
rb_ctmu [options] control_file [db_username [db_password]]
Options
Most of the options are identical to those supported by the rb_tmu and
rb_ptmu programs; see page 2-6. The following options are specific to the
rb_ctmu program. These options can be specified in any order, as long as they
follow the rb_ctmu executable and precede the control file.
Control File for Remote TMU Operations
The control file for the rb_ctmu must reside on the client machine and must
contain only one executable LOAD, UNLOAD, or GENERATE statement.
Multiple SET statements can be included in the file.
■ LOAD DATA
■ UNLOAD
Generated DDL and TMU files are not supported for remote UNLOAD
operations.
■ GENERATE CREATE TABLE
■ GENERATE LOAD DATA
You cannot use the remote TMU feature to load from a tape device or unload
to a tape device.
Username and Password
The database username and password entries for the rb_ctmu follow the
same rules as those for the rb_tmu. The rb_ctmu does not validate usernames
and passwords; the validation occurs on the remote machine.
You can set the environment variable RB_USER on the client machine instead
of specifying your database username on the command line.
Usage
To display the client TMU syntax, enter rb_ctmu with no options at the
system prompt. For example:
109 brick % $RB_CONFIG/bin/rb_ctmu
(C) Copyright IBM Corp. 1991-2002. All rights reserved.
Remote TMU Client Tool Version 06.20.0000(0)TST
Usage: rb_ctmu [<Options>] <Control_file> [<Username>] [<Password>]
Options:
-i or -interval <nrows> Display progress every <nrows> rows.
-t or -timestamp Append a timestamp to all TMU messages.
-d or -database <database> Database to use.
-s <DSN> Data Source Name.
-h <Host> RB Host if different from RB_HOST.
-w or -waittime <secs> Wait time for connection.
-show Show connection information only. (No
execution)
Arguments:
<Control_file> Path to control file to be used.
<Username> User name, prompted for if not given.
<Password> Password, prompted for if not given.
Syntax Examples
■ This example uses a DSN named red_brick_620:
% rb_ctmu -s red_brick_620 sales.tmu orwell george
■ The following example specifies the host rb_6200, the database
aroma, and a wait time of 10 seconds:
% rb_ctmu -h rb_6200 -d aroma -w 10 sales.tmu wolfe tom
■ The following example does not specify a DSN, host, or database. The
client TMU will use the RB_DSN environment variable, if specified, or
connect to the host specified by RB_HOST and the database specified
by RB_PATH.
% rb_ctmu sales.tmu system manager
Summary of Remote TMU Operation
The following procedure summarizes the steps required to run a remote TMU
operation.
To run a remote TMU command:
Execute the control file from the command line, using the rb_ctmu
program with the options specified on page 2-15.
Example: Windows-to-UNIX Remote TMU Operation
This example shows how to run the Client TMU from a Windows machine to
connect to the Aroma database on a UNIX machine. The TMU operation in this
case is a GENERATE CREATE TABLE command for the Period table. All of the
steps are performed on the Windows client.
Set the RB_DSN environment variable to the value of the new DSN:
C:\RedBrick\Client32>set RB_DSN=BRICK_620
Create the TMU control file for the GENERATE operation. Name this
file gen_pd.tmu.
generate create table from period ddlfile ’/tmufiles/period.ddl’;
USER Statement for User Name and Password
If you prefer not to enter a database user name and password on the
command line, you can include a USER statement at the beginning of a
control file. A control file can include only one USER statement, and it must
be the first statement in the file.
USER db_username
    Database user under whose name the TMU is invoked. The user
    name can be either a literal value (for example, john, smith2, or
    elvis) or an environment variable.

PASSWORD db_password
    Password for the database user. The password can be either a
    literal value or an environment variable; it must be composed of
    single-byte characters.
UNIX If the user name or password begins with the dollar sign ($), the value is
taken as an environment variable. For example, $DBADMIN and $SECRET. ♦
Windows If the user name or password is surrounded by percent symbols (%) or begins
with the dollar sign ($), the value is taken as an environment variable. For
example, %DBADMIN% or $DBADMIN; and %SECRET% or $SECRET. ♦
The following USER statements use both a literal value and an environment
variable:

Operating System    Statement
UNIX                USER elvis PASSWORD $SECRET
Windows             USER elvis PASSWORD %SECRET%
SET Statements and Parameters to Control Behavior
Along with the functional statements in a control file, you can include SET
statements to specify certain aspects of TMU and PTMU behavior for a specific
session. For example, for a load operation, you can specify a different
directory for temporary space from the one specified in the rbw.config file.
SET Statement                          Controls
TMU COMMIT TIME INTERVAL Amount of time to load data into a table
before each commit operation.
Lock Behavior
The TMU automatically locks the database or the affected tables during its
operations. If the database or table is already locked, then whether the TMU
returns immediately or waits for the lock to be released depends on the
behavior that the SET LOCK statement selects.
Syntax
To specify the behavior to use for a specific session when locked tables are
encountered, use the following syntax to enter a SET LOCK statement in the
TMU control file.
SET LOCK {WAIT | NO WAIT};

WAIT      The TMU waits until the lock is released and then proceeds
          with the operation. This is the default. (The default behavior
          for the database server is also WAIT.)
NO WAIT   If the database or table is already locked, the TMU returns a
          message that the operation failed because the database or
          table was locked.
Important: In cases where waiting for a lock might result in a deadlock, the lock
request is refused and control is returned to the lock requestor. Deadlocks occur only
when the LOCK TABLE or LOCK DATABASE command is used. A deadlock cannot be
caused by the automatic locking operations of the TMU. For more information about
deadlocks, refer to the Administrator’s Guide.
The following example illustrates how to set the lock behavior with a SET
LOCK statement:
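For instance, a statement of this form (the NO WAIT spelling follows the parameter description above) makes the TMU return immediately instead of waiting:

```
SET LOCK NO WAIT;
```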
Buffer Cache Size
TMU performance is affected by the size of the program’s buffer cache, as well
as system load and other factors related to the database. IBM recommends
that you use the default settings for the TMU buffer cache. After careful
analysis of your hardware, software, and user environment, however, you
might determine that changes to the buffer-cache size would improve
performance.
Syntax
To specify TMU buffer-cache size for a specific session, enter a SET statement
in the TMU control file; for all sessions, edit the TUNE parameter in the
rbw.config file.
A SET statement can increase, but can never decrease, the current buffer-
cache size during a given session. For example, if the current buffer size is
1024 blocks, set either in the rbw.config file or by a previous SET statement in
the TMU control file, the size cannot be reduced to 512 with a SET statement.
■ SET command:
SET TMU BUFFERS 1024;
■ rbw.config file entry:
TUNE TMU_BUFFERS 1024
Usage Note
If your LOAD DATA statement includes OPTIMIZE OFF syntax, the 128-page
default buffer-cache size might be too low to adequately handle index
searches. A typical indication of inadequate buffer cache is when the logical
I/O count returned in statistics message 500 at the end of the LOAD operation
is greater than half the number of input rows for the load. To increase
performance in this kind of situation, you can increase the size of the buffer cache.
Example
The following examples illustrate how to increase buffer-cache size.
■ SET command:
SET TMU BUFFERS 5000;
■ rbw.config file entry:
TUNE TMU_BUFFERS 5000
Temporary Space Management
As data is loaded and indexed, intermediate results are stored in memory
until they reach a threshold value, at which point they are written to disk. The
following parameters control how temporary space, both memory and disk,
is used when the OPTIMIZE option is ON:
This section describes both the SET commands and the TUNE parameters in
the rbw.config file that control temporary space allocation and management.
For more information about temporary space management, which affects not
only TMU and PTMU operations but also SQL DDL statements, refer to the
Administrator’s Guide.
Syntax
To specify INDEX TEMPSPACE parameters for a specific session, enter a SET
statement in the TMU control file. For all sessions, edit the TUNE parameters
in the rbw.config file.
SET INDEX TEMPSPACE DIRECTORIES ’dir_path’ [, ’dir_path’]... ;
SET INDEX TEMPSPACE THRESHOLD value;
SET INDEX TEMPSPACE MAXSPILLSIZE size;
SET INDEX TEMPSPACE DUPLICATESPILLPERCENT percent;
SET INDEX TEMPSPACE RESET;
Usage Notes
In addition, use the following guidelines when you set index-building
temporary space parameters:
■ Always set the THRESHOLD value before you set the MAXSPILLSIZE
value.
■ Remember that the INDEX TEMPSPACE parameter settings in the
rbw.config file affect not only TMU index-building operations but
also SQL index-building operations.
Example
The following examples illustrate SET commands that you can use to change
parameters for a specific session:
SET INDEX TEMPSPACE THRESHOLD 2M;
SET INDEX TEMPSPACE MAXSPILLSIZE 3G;
SET INDEX TEMPSPACE DUPLICATESPILLPERCENT 5;
♦
Windows
SET INDEX TEMPSPACE DIRECTORIES ’d:\itemp’, ’e:\itemp’;
♦
The following examples illustrate entries in the rbw.config file that apply to
all sessions:
TUNE INDEX_TEMPSPACE_THRESHOLD 20M
TUNE INDEX_TEMPSPACE_MAXSPILLSIZE 8G
TUNE INDEX_TEMPSPACE_DUPLICATESPILLPERCENT 5
♦
Windows
TUNE INDEX_TEMPSPACE_DIRECTORY d:\itemp
TUNE INDEX_TEMPSPACE_DIRECTORY e:\itemp
TUNE INDEX_TEMPSPACE_DIRECTORY f:\itemp
♦
Format of Datetime Values
To use an alternative date format (not ANSI SQL-92 datetime format) for a
date constant specified in a TMU statement, you must use a TMU SET
DATEFORMAT statement to specify the input format of the date when the
format includes numeric month values and the default order of mdy is not
used. (You can use a date constant in a LOAD DATA statement to load a
constant into a date column, to specify a date value in an ACCEPT or REJECT
clause, or in an UNLOAD statement in a WHERE clause.) The SET
DATEFORMAT statement must precede the LOAD DATA or UNLOAD
statement that it applies to in the control file.
format    Order of month, day, and year components for non-ANSI SQL-92
          (alternative) date inputs, using a combination of the characters
          m, d, and y. The default format is mdy. This SET statement uses
          the same format combinations as the SQL SET statement. For more
          information about formats, refer to the SET DATEFORMAT
          statement in the SQL Reference Guide.
The following statement specifies that any date constants in the TMU
statement that follows are assumed to be in ymd (year, month, day) format,
for example 2000/11/30:
SET DATEFORMAT ’ymd’;
The following statement specifies that any date constants in the TMU
statement that follows are assumed to be in dmy (day, month, year) format,
for example 30/11/2000:
SET DATEFORMAT ’dmy’;
Load Information Limit
Information about each load operation is stored in the RBW_LOADINFO
system table, one row per operation. The RBW_LOADINFO_LIMIT
configuration parameter specifies the maximum number of rows that you can store
in that table, thereby allowing you to control the amount of historical load
information that the system records.
Usage Notes
■ If you set this parameter to a value less than the current value, the
RB_DEFAULT_LOADINFO file is truncated. However, the original file
is saved as RB_DEFAULT_LOADINFO.save.
■ The current value of RBW_LOADINFO_LIMIT is stored in the
RBW_OPTIONS system table.
Example
To check the value of RBW_LOADINFO_LIMIT, query the RBW_OPTIONS
table:
select option_name, value from rbw_options where option_name
= ’RBW_LOADINFO_LIMIT’;
OPTION_NAME VALUE
RBW_LOADINFO_LIMIT 256
Memory Map Limit
Memory-mapping the primary-key indexes of dimension tables into shared
memory can significantly accelerate referential-integrity checking.
By default, the TMU attempts to use available shared memory to map all the
indexes it needs. If there is not enough shared memory to fully map all the
dimension primary-key indexes, the TMU maps as much of each index as
possible into the available shared memory.
During a LOAD or REORG operation, the TMU maps indexes up to the
available memory-map limit. If the primary-key indexes of the dimension
tables are larger than that limit, the TMU maps each index up to the limit
and accesses the remainder through the buffer cache.
Setting Precomputed View Maintenance
Precomputed view maintenance automatically updates aggregate tables
whenever their detail tables are updated.
Syntax
To specify the behavior to use for a specific session, use the following syntax
to enter a PRECOMPUTED VIEW MAINTENANCE statement in the TMU
control file. To specify the behavior for all sessions, edit the OPTION
parameter in the rbw.config file.
OPTION PRECOMPUTED_VIEW_MAINTENANCE {ON | OFF}
You can view the current value for this parameter in the RBW_OPTIONS
system table. Within the OPTION_NAME column, locate
PRECOMPVIEW_MAINTENANCE; the setting is in the VALUE column.
Precomputed View Maintenance On Error
The PRECOMPUTED VIEW MAINTENANCE ON ERROR statement specifies the
action that the versioned database takes when it encounters an aggregate
table that cannot be maintained due to an error during maintenance. This
statement only applies to versioned databases.
Syntax
To specify the behavior to use for a specific session, use the following syntax
to enter a PRECOMPUTED VIEW MAINTENANCE ON ERROR statement in the
TMU control file. To specify the behavior for all sessions, edit the OPTION
parameter in the rbw.config file.
ROLLBACK      The entire transaction is rolled back, restoring all of the
              tables (including aggregate tables) to their original
              condition.
INVALIDATE    The offending aggregate tables are marked invalid.
You can view the current setting for this parameter in the RBW_OPTIONS
system table. Within the OPTION_NAME column, locate
PRECOMPVIEW_MAINTENANCE_ON_ERROR; the setting is in the VALUE
column.
Managing Row Messages
You can manage how and when you view messages and warnings returned
during the LOAD process, either by sending them to a specific file or by
viewing them as the load runs.
6\QWD[
To control the number of messages and warnings viewed for a single session,
enter a SET statement in the TMU control file. To control it for all sessions,
edit the TUNE parameters in the rbw.config file. The syntax is as follows.
FULL If TMU ROW MESSAGES is set to FULL, and the ROW MESSAGES
clause of the LOAD DATA statement specifies a filename, all row-
level warning messages go to that file. If TMU ROW MESSAGES
is set to FULL, and no ROW MESSAGES clause is present, all row-
level warning messages go to standard output. For both the
FULL and NONE settings, depending on their error-level setting,
messages also go to the log file.
NONE If TMU ROW MESSAGES is set to NONE, the ROW MESSAGES
clause, if specified, is ignored, and no warning-level messages
are produced. For both the FULL and NONE settings, depending
on their error-level setting, messages also go to the log file.
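For example, to see every row-level warning for a single session, you might set FULL before the LOAD DATA statement (a sketch):

```
SET TMU ROW MESSAGES FULL;
```

With this setting, a ROW MESSAGES clause in the LOAD DATA statement that names a file sends all row-level warnings to that file; without the clause, they go to standard output.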
Enabling Versioning
You can run a TMU operation as a versioned transaction on databases where
versioning is enabled. The SET TMU VERSIONING statement and OPTION
TMU_VERSIONING rbw.config file parameter specify whether TMU opera-
tions are executed as versioned or blocking transactions. A versioned
transaction allows query operations to occur on a previously committed
version of the database while a new version is being written. A blocking
transaction locks all of the tables involved and does not allow query opera-
tions to begin until the transaction is complete. For information about setting
up a versioned database, refer to the Administrator’s Guide.
To control TMU versioning for a single session, enter a SET statement in the
TMU control file. For all sessions, edit the OPTION parameter in the
rbw.config file.
SET TMU VERSIONING ON
                   OFF
                   RECOVER
When TMU VERSIONING is set to OFF, all TMU operations are run as blocking
transactions. The default value is OFF. When TMU VERSIONING is set to ON,
data is loaded directly into the version log, and all TMU LOAD and REORG
operations are run as versioned transactions.
When TMU VERSIONING is set to RECOVER, data is loaded directly into the
version log. When the load commits, the changed blocks in the version log
are moved back to the database. If the PTMU is used, this operation is
performed in parallel across the load processes. The RECOVER option needs
exclusive access to the target table, so concurrent queries on the target table
are not allowed during the LOAD operation.
Using the version log improves LOAD operation stability and recoverability.
Compared to TMU VERSIONING ON, TMU VERSIONING RECOVER reduces version-log
usage and makes it easier to back up database files.
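For example, to run a load as a versioned transaction with the recovery behavior described above, a control file might begin with this statement (a sketch):

```
SET TMU VERSIONING RECOVER;
```

The all-sessions equivalent is OPTION TMU_VERSIONING RECOVER in the rbw.config file. Remember that RECOVER requires exclusive access to the target table, so concurrent queries on that table are not allowed during the load.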
Commit Record Interval
The TMU COMMIT RECORD INTERVAL statement specifies the number of
records to load into a table between each commit operation. This function is
available for both the TMU and the PTMU.
Usage Notes
■ After each record-interval commit operation, the TMU issues an
information message indicating the number of rows that were
loaded. If the current transaction fails, the database rolls back to the
state of the last completed commit interval.
■ The TMU COMMIT RECORD INTERVAL statement is valid only with
TMU VERSIONING set to ON or RECOVER. This statement has no
effect if VERSIONING is set to OFF.
■ When performing a versioned load with OPTIMIZE ON, each commit
operation requires an index merge to occur. Depending on the values
of the INDEX TEMPSPACE parameters, this requirement might result
in more merges during the index-building phases, and therefore less-
efficient indexes than if the load operation completed as a single
transaction.
■ If this statement is used in conjunction with the TMU COMMIT TIME
INTERVAL statement, a commit is performed when either condition is
met. After the commit occurs, both counters are reset and loading
continues until the next interval (either TIME or RECORD) occurs.
■ If this statement is used in conjunction with the
PRECOMPUTED_VIEW_MAINTENANCE ON statement, all precom-
puted views are maintained automatically each time a commit is
made. Thus, precomputed views remain valid and synchronized
with the detail table data.
Commit Time Interval
The TMU COMMIT TIME INTERVAL statement specifies the amount of time, in
minutes, to load data into a table before each commit operation. This function
is available for both the TMU and the PTMU.
To control the TMU COMMIT TIME INTERVAL statement for a single session,
enter a SET statement in the TMU control file. To control the TMU COMMIT
TIME INTERVAL statement for all sessions, edit the OPTION parameter in the
rbw.config file.
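For example, to commit after every ten minutes of loading in a single session, a control file might contain the following statement (the interval value is illustrative):

```
SET TMU COMMIT TIME INTERVAL 10;
```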
Usage Notes
■ After each time-interval commit operation, the TMU issues an infor-
mation message indicating the number of rows that were loaded. If
the current transaction fails, the database rolls back to the state of the
last completed commit interval.
■ The TMU COMMIT TIME INTERVAL statement is valid only with TMU
VERSIONING set to ON or RECOVER. This statement has no effect if
VERSIONING is set to OFF.
Example
The following example shows a control file with a LOAD DATA statement for
the Sales table in a versioned Aroma database where the commit interval is
set for every 20,000 records. The following code segment is the text of the
TMU control file:
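The control-file text itself is not reproduced in this copy. A control file consistent with the messages shown below might look like this sketch; the input-file path, table name, and INSERT mode are taken from the messages, while the field specifications are omitted and the exact clause layout is an assumption:

```
SET TMU VERSIONING ON;
SET TMU COMMIT RECORD INTERVAL 20000;
LOAD DATA
    INPUTFILE '/redbrick/sample_input/aroma_sales.txt'
    INSERT INTO TABLE sales (
        ...field specifications...
    );
```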
The following example shows the messages produced by the previous TMU
operation:
** STATISTICS ** (500) Time = 00:00:00.01 cp time, 00:00:00.00
time, Logical IO count=0
** STATISTICS ** (500) Time = 00:00:00.01 cp time, 00:00:00.00
time, Logical IO count=0
** INFORMATION ** (366) Loading table SALES.
** INFORMATION ** (8555) Data-loading mode is INSERT.
** INFORMATION ** (8707) Versioning is active.
** INFORMATION ** (8710) Interval commit set to 20000 records.
** INFORMATION ** (352) Row 3 of index SALES_STAR_IDX is out of
sequence. Switching to standard optimized index building. Loading
continues...
** INFORMATION ** (8708) Performing interval commit...
** INFORMATION ** (513) Starting merge phase of index building
SALES_STAR_IDX.
** INFORMATION ** (367) Rows: 20000 inserted. 0 updated. 0
discarded. 0 skipped.
** INFORMATION ** (8709) Commit complete. Loading continues...
** INFORMATION ** (8708) Performing interval commit...
** INFORMATION ** (513) Starting merge phase of index building
SALES_STAR_IDX.
** INFORMATION ** (367) Rows: 40000 inserted. 0 updated. 0
discarded. 0 skipped.
** INFORMATION ** (8709) Commit complete. Loading continues...
** INFORMATION ** (8708) Performing interval commit...
** INFORMATION ** (513) Starting merge phase of index building
SALES_STAR_IDX.
** INFORMATION ** (367) Rows: 60000 inserted. 0 updated. 0
discarded. 0 skipped.
** INFORMATION ** (8709) Commit complete. Loading continues...
** INFORMATION ** (315) Finished file
/redbrick/sample_input/aroma_sales.txt. 69941 rows read from this
file.
** INFORMATION ** (513) Starting merge phase of index building
SALES_STAR_IDX.
** INFORMATION ** (367) Rows: 69941 inserted. 0 updated. 0
discarded. 0 skipped.
** STATISTICS ** (500) Time = 00:00:29.21 cp time, 00:00:33.73
time, Logical IO count=1044
Notice that this operation performs four commit operations—one each after
inserting 20,000, 40,000, 60,000, and 69,941 records.
Displaying Load Statistics
The SET STATS statement turns on statistics reporting for the current TMU
session. The following syntax diagram shows how to construct a SET STATS
statement.
SET STATS ON
          INFO
          OFF
INFO The INFO setting returns the same statistics as the ON setting,
along with additional information about the load operation,
such as server messages generated during precomputed
view maintenance.
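For example, to capture these additional server messages during a load, place the statement before the LOAD DATA statement in the control file:

```
SET STATS INFO;
```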
For more information about BAR unit configuration, see page 8-15.
External Backup and Restore Operations
To support a mixture of external full backups and TMU incremental backups
of the same database, execute the SET FOREIGN FULL BACKUP command
immediately before performing an external backup operation. This
command resets the backup segment and effectively states that a reliable
external backup is about to be created. In turn, TMU incremental backups can
follow, just as if a TMU full backup had been done. For more information
about external backups, see page 8-20.
The SET FOREIGN FULL BACKUP and SET FOREIGN FULL RESTORE
commands require the BACKUP_DATABASE and RESTORE_DATABASE task
authorizations, respectively.
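A typical sequence is sketched below; the external copy itself is performed outside the TMU with whatever backup tool the site uses:

```
SET FOREIGN FULL BACKUP;
```

Run this statement, perform the external full backup, and then schedule TMU incremental backups as if a TMU full backup had just completed.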
REORG Tasks
The following statements control the number of input tasks and
index-builder tasks during a REORG operation:
■ SET TMU MAX TASKS, which specifies an upper bound on the total
number of tasks allocated for input tasks and index builder tasks.
The total number of tasks specified by the MAX TASKS statement
must be at least two.
■ SET TMU INPUT TASKS, which specifies the number of tasks allocated
to scan the target table. Because the number of conversion tasks is
always identical to the number of INPUT tasks, this option also
controls the number of conversion tasks.
■ SET TMU INDEX TASKS, which specifies the number of tasks allocated
for indexes.
The SET TMU INPUT TASKS statement and the SET TMU INDEX TASKS
statement should not be used in conjunction with the SET TMU MAX TASKS
statement.
Syntax
The following syntax diagram shows how to construct a SET TMU TASK
statement or a TUNE TMU_TASKS rbw.config file parameter.
SET TMU MAX TASKS number;
SET TMU INPUT TASKS number;
SET TMU INDEX TASKS number;

TUNE TMU_MAX_TASKS number
TUNE TMU_INPUT_TASKS number
TUNE TMU_INDEX_TASKS number
Warning: The REORG operation might not allocate all the specified INPUT or INDEX
tasks if the tasks are deemed excessive.
Examples
The following examples illustrate SET statements that you can use to change
parameters for a specific session:
SET TMU MAX TASKS 5;
SET TMU INPUT TASKS 3;
SET TMU INDEX TASKS 2;
The following examples illustrate entries in the rbw.config file that apply to
all sessions:
TUNE TMU_MAX_TASKS 5
TUNE TMU_INPUT_TASKS 3
TUNE TMU_INDEX_TASKS 2
Parallel Loading Tasks (PTMU Only)
As the PTMU loads data, it can use multiple tasks (even on a single CPU) to
perform the data conversion and index-building portions of the load
operation. You can control the amount of parallel processing for both data
conversion and for index-building, based on your site resources and
workload requirements. For more information about how parallel processing
is used to load data, refer to “Processing Stages for Loading Data” on
page 3-8.
Syntax
To set the PTMU parallel-processing parameters for a single session, enter
either or both SET statements in the PTMU control file. To set the PTMU
parallel-processing parameters for all sessions, edit the TUNE parameters in
the rbw.config file.
TMU CONVERSION TASKS
    Tasks that convert input data to the platform-based internal
    format used to represent data. These tasks also ensure
    uniqueness and check referential integrity whenever such
    checks are performed. The number specified is the actual
    number of tasks used. The default value is one-half the
    number of processors on the computer (as determined
    from the hardware).

TMU INDEX TASKS
    Tasks that make index entries into nonunique indexes,
    corresponding to the data being loaded. Each nonunique
    index can have at most one task associated with it. The
    number specified with this parameter is the maximum
    number of tasks that can be used to process all nonunique
    indexes. The actual number of tasks used is the smaller of
    the number of nonunique indexes and the number
    specified with this parameter. The default value is one
    task per nonunique index.

    The task that makes the entries into unique indexes is not
    affected by this parameter.
Example
The following examples illustrate how to use the SET statements and TUNE
parameter entries.
To control parallel processing for a single PTMU session within a control file:
set tmu conversion tasks 5;
set tmu index tasks 8;
To control parallel processing for all PTMU sessions by using the rbw.config
file:
TUNE tmu_conversion_tasks 5
TUNE tmu_index_tasks 8
Example
To illustrate how the TMU CONVERSION TASKS parameter works, assume
you have 8 processors on the system. By default, 4 of them are used for
conversion tasks. If you want to use more than 4 processors, set the TMU
CONVERSION TASKS parameter to a number larger than 4 to increase the
number of processors.
Example
To illustrate how the TMU INDEX TASKS parameter works, assume you have
5 nonunique indexes and TMU INDEX TASKS is set to 3. In this case, 3 tasks
are used, and some tasks process multiple indexes in parallel. If you have 5
nonunique indexes and TMU INDEX TASKS is set to 6, 5 tasks are used, one
per nonunique index.
Serial Mode Operation (PTMU Only)
You can force the PTMU to run in a serial mode in which no parallel
processing is used. In this case, the PTMU is effectively running as the TMU.
This capability is useful in cases where you do not want the resource
consumption and overhead of parallel processing. You can use the TMU
instead of the PTMU, but the ability to run the PTMU in serial mode allows
you to combine operations in a single control file to be executed by the PTMU.
Within the control file, you specify those operations that are to be executed in
serial mode, with all other operations to be executed in parallel mode.
Important: The TMU SERIAL MODE parameter affects only those operations for
which the PTMU uses parallel processing. It does not affect those operations for which
the PTMU normally uses serial processing, as defined on page 2-52.
To control PTMU serial mode for a single session, enter a SET statement in the
PTMU control file. To control PTMU serial mode for all sessions, edit the TUNE
parameters in the rbw.config file.
Example
Suppose you have a PTMU control file that performs multiple operations.
Most operations are performed in parallel, but you want the following opera-
tions to be performed in serial mode:
The SET SERIAL MODE statement (included wherever needed in the control
file) directs the PTMU to switch between parallel and serial mode.
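As a sketch, a control file might bracket only the operations that should run serially; the SET TMU SERIAL MODE ON and OFF forms are assumed from the parameter name, and the placeholders stand for whatever LOAD or REORG statements apply:

```
SET TMU SERIAL MODE ON;
...operations to run in serial mode...
SET TMU SERIAL MODE OFF;
...operations to run in parallel mode...
```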
Suggestions for Effective PTMU Operations
Although the TMU and PTMU function similarly, the PTMU has some
exclusive functions. These are detailed in the following sections.
Operations That Use Parallel Processing
The PTMU uses parallel processing for some operations and serial processing
for others. It processes operations that involve the following clauses in
serial mode:
■ AGGREGATE clause
■ AUTOROWGEN clause
■ RETAIN clause
■ ACCEPT or REJECT clause with one column as a regular column
In addition, when loading data in MODIFY or UPDATE mode, the PTMU does not
support using a combination of pseudo- and regular columns in an ACCEPT or
REJECT criteria clause. Instead, the PTMU proceeds in serial mode.
If you use the PTMU to load data to take advantage of parallel processing, do
not use the MODIFY mode when you can use the APPEND or INSERT mode.
For example, to load data into an empty table, use the INSERT mode instead
of the MODIFY mode. To add new rows to an existing table without
modifying any existing rows, use the APPEND mode instead of the MODIFY
mode.
Discard Limits on Parallel Load Operations
Because the PTMU processes multiple input rows at the same time, its
behavior might be different from the TMU when a load operation reaches the
discard limit and ends early. For example, if input row 500 is discarded,
causing the maximum discard limit to be exceeded, the PTMU might already
be processing rows 501, 502, and so on. If these rows are also discarded,
messages appear for them, and they are written to the discard file (if
specified) even though they are beyond the limit. Any rows beyond the
discard limit that are inserted into the table are removed before the load
operation ends.
The same behavior occurs if the load ends prematurely for other similar
reasons, such as exceeding the MAXROWS PER SEGMENT limit on the target
table.
AUTOROWGEN with the PTMU
The PTMU supports automatic row generation during parallel-load opera-
tions. However, automatic row generation reduces the amount of parallelism
that can be applied to the operation because the referential-integrity checking
and row generation must all be done by a single process. For the best perfor-
mance, do not use automatic row generation for parallel loads. The decrease
in performance can be significant.
To load a large amount of data that might have only a few rows that fail refer-
ential-integrity checking, load the data by specifying AUTOROWGEN OFF
and naming a discard file. Then load the rows in the discard file by specifying
AUTOROWGEN ON.
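The two-pass strategy can be sketched as two successive LOAD DATA statements; the file names are illustrative, and the DISCARDFILE keyword and clause placement are assumptions:

```
LOAD DATA
    INPUTFILE 'sales.txt'
    INSERT INTO TABLE sales (...)
    AUTOROWGEN OFF
    DISCARDFILE 'sales.disc';

LOAD DATA
    INPUTFILE 'sales.disc'
    INSERT INTO TABLE sales (...)
    AUTOROWGEN ON;
```

The first pass loads the clean rows at full parallel speed; only the small discard file pays the serialization cost of automatic row generation.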
Important: This strategy is not appropriate if you expect a large number of discarded
rows because the discard processing significantly slows the load processing.
Multiple Tape Drives with the PTMU
UNIX Not all UNIX systems support multiple tape drives with the PTMU. ♦
If you use the PTMU with multiple tape drives (such as 8-mm drives), you can
load data in sequence into a database without operator intervention. Include
the following clause in the control file to specify the filename and each device
name:
INPUTFILE 'filename' TAPE DEVICE 'device_name[,…]'
The following example illustrates how to describe and load multiple tape
drives for the PTMU. The following line in the control file loads data from
tape devices tx0 and tx1:
INPUTFILE 'myfile' TAPE DEVICE '/dev/rmt/tx0,/dev/rmt/tx1'
If the PTMU loads all the data from tx0 and then from tx1 and still does not
reach the end of the file, it pauses and requests that the next tape be loaded.
Load the next tape in the tape device tx0.
Multiple Tape Drive with the PTMU
UNIX Support for the 3480/3490 tape drive is not available on all UNIX systems. ♦
If you use the PTMU with the 3480 or 3490 multiple-tape drive, include the
following clause in the control file, specifying a device name and a range of
cartridges for each tape device:
INPUTFILE 'filename' TAPE DEVICE 'device_name[(start-end)][,…]'
The following line in the control file loads data from 3480/3490 tape devices
tf0 and tf1:
INPUTFILE 'myfile' TAPE DEVICE '/dev/rmt/tf0(1-3), /dev/rmt/tf1(1-3)'
If, after loading all the data on tf1, the PTMU does not reach the end of the file,
it pauses and requests that the next tape be loaded.
Tape     Device   Cartridge
Tape 1   tf0      1
Tape 2   tf1      1
Tape 3   tf0      2
Tape 4   tf1      2
Tape 5   tf0      3
Tape 6   tf1      3
Tape 7   tf0      1
Tape 8   tf1      1
Tape 9   tf0      2
…
Chapter 3
Loading Data into a Warehouse Database
In This Chapter 3-5
The LOAD DATA Operation . . . . . . . . . . . . . . 3-6
Inputs and Outputs . . . . . . . . . . . . . . . . 3-6
Processing Stages for Loading Data . . . . . . . . . . . 3-8
Input Stage . . . . . . . . . . . . . . . . . . 3-10
Conversion Stage . . . . . . . . . . . . . . . . 3-11
Main Output and Index Stages . . . . . . . . . . . 3-11
Error Handling and Cleanup Stage . . . . . . . . . 3-11
Procedure for Loading Data . . . . . . . . . . . . . . . 3-12
Some Preliminary Decisions. . . . . . . . . . . . . . . 3-14
Determining Table Order . . . . . . . . . . . . . . 3-14
Ordering Input Data . . . . . . . . . . . . . . . . 3-15
Maintaining Referential Integrity with Automatic Row Generation 3-16
Discarding Records That Violate Referential Integrity . . . 3-16
Adding Generated Rows to Referenced Tables. . . . . . 3-17
Modifying the Input Rows . . . . . . . . . . . . 3-19
Adding Rows in Mixed Mode . . . . . . . . . . . 3-21
Specifying the AUTOROWGEN Mode . . . . . . . . 3-22
Writing a LOAD DATA Statement . . . . . . . . . . . . 3-23
LOAD DATA Syntax . . . . . . . . . . . . . . . . . 3-24
Input Clause . . . . . . . . . . . . . . . . . . . . 3-25
Format Clause . . . . . . . . . . . . . . . . . . . 3-29
EBCDIC to ASCII Conversion . . . . . . . . . . . . . 3-35
IBM Syntactic Code Set (CS 640) . . . . . . . . . . 3-36
Two Approaches to Loading EBCDIC Data . . . . . . . 3-36
Examples: Format Clause . . . . . . . . . . . . . 3-37
Locale Clause . . . . . . . . . . . . . . . . . . . . 3-38
Locale Specifications for XML Input Files . . . . . . . . . 3-41
Usage Notes . . . . . . . . . . . . . . . . . . . 3-42
criteria_clause on non-character column . . . . . . . . 3-144
criteria_clause on character column . . . . . . . . . 3-144
comment_clause . . . . . . . . . . . . . . . . 3-144
field_type . . . . . . . . . . . . . . . . . . 3-145
field_type (continued) . . . . . . . . . . . . . . 3-146
restricted date_spec . . . . . . . . . . . . . . . 3-146
In This Chapter
You use the TMU and a control file that contains a LOAD DATA statement to
load data into a data warehouse.
This chapter provides the information you need to write the LOAD DATA
statements, the field specifications within the LOAD DATA statements, and
the SYNCH statement for offline load operations.
The LOAD DATA Operation
Before you can load data, the database and the tables to load must already
exist. You build databases with a utility program (rb_creator on UNIX and
dbcreate on Windows) and define the user tables with SQL CREATE TABLE
statements. The load process automatically builds primary-key indexes for
each table that has a primary key. It also builds existing user-defined indexes.
For information about defining tables and indexes, refer to the Administrator’s
Guide and the SQL Reference Guide.
Inputs and Outputs
Input to the TMU for a data loading operation consists of:
The TMU also updates the IBM Red Brick Warehouse system tables that
contain the data format descriptions, table and index files, and other
information that the TMU and the database server need.
The TMU automatically creates indexes on all primary keys, based on table
definitions. It also builds and updates any additional user-created indexes
that exist at the time of the load operation.
If you are running on a 64-bit platform and have explicitly enabled your file
system for large files (files larger than 2 gigabytes), you can load and unload
input and output files larger than 2 gigabytes. You can only load input files
larger than 2 gigabytes in disk format.
The following TMU features can be useful when loading or copying data:
Processing Stages for Loading Data
When data is loaded into a database from an input file, the load process
includes several processing stages. By understanding what activities occur
during each stage, you are better able to avoid bottlenecks and resource
conflicts, thereby reducing the time required to load data.
■ Input stage
❑ Validates syntax of TMU control statement.
❑ Locks tables and segments.
❑ Reads input records, monitoring progress and status communi-
cated by error handling and cleanup stage.
❑ Sets up additional processes for conversion and index stages for
PTMU.
■ Conversion stage
❑ Converts input records to internal row format and validates
data.
❑ Checks referential integrity (if Automatic Row Generation is off).
The TMU uses a single process that controls all stages. It processes small
batches of rows, passing one batch through each stage before starting the next
batch.
Figure 3-1 illustrates the sequence of stages in a LOAD operation and the
additional parallelism that the multiple processes of the PTMU provide. (The
TMU uses a single process for all stages.)
Figure 3-1. LOAD DATA Processing Sequence: PTMU Multiple Processes.
[Figure not reproduced. It shows the flow from the input stage (control
file, system tables, and input records read by the input process), through
the conversion stage (conversion tasks, which reference the PK indexes),
the main output stage (data and unique index segments written by the main
output process), and the index stage (nonunique index segments, one index
task per index), to the error handling and cleanup stage (system tables).
The PTMU runs additional processes for the conversion, main output, and
index stages; status and control flow feed back to the input process.]
Input Stage
During the input stage, the PTMU checks the syntax of the LOAD DATA
statement and locks the table or segment for exclusive use. You can specify
whether you want the TMU to wait for a lock or to return immediately if the
table is in use.
Conversion Stage
During the conversion stage, the PTMU performs any necessary data
conversion on each record, including conversion between code sets (for
example, EBCDIC to ASCII or MS932 to EUC), conversion from the external
code set to internal (binary) format, and decimal scaling. In this stage,
the PTMU checks referential integrity (if Automatic Row Generation is off)
and validates the data by comparing it with the column data type and
checking for truncation, underflow, and overflow. Because the PTMU uses
multiple conversion processes, conversion performance improves significantly.
Main Output and Index Stages
During the main output stage, the PTMU writes data to the table and makes
entries in all unique indexes. In addition, if Automatic Row Generation is on,
the PTMU performs referential integrity checks during this stage, and inserts
into referenced tables any automatically-generated rows. In this stage, the
PTMU uses a single process to make all entries into each unique index.
During the index stage, the PTMU makes entries into any nonunique indexes. The
PTMU, by default, uses one index process per nonunique index, thereby
speeding up this part of the load operation.
Error Handling and Cleanup Stage
The error handling and cleanup stage performs the error handling, which
includes keeping track of rows loaded in case of interrupts, and cleans up
after the processing is complete. It also monitors progress through the
pipeline, providing feedback to the input stage to control the flow of records
being processed. The PTMU uses a single process for this stage.
Procedure for Loading Data
To load data into the tables of a warehouse database, complete the
following tasks.

Task: Determine the following information about your input data:
■ Source of your input data (disk, TAR or standard label tape, or
standard input)
■ Record length (fixed or variable)
■ Record format (fixed, variable, separated, or XML)
■ Record field order and type
■ Mapping between input fields and table columns
■ Code set (ASCII, EBCDIC, or XML encoding)
Description: The description of the input data is supplied in the Input
clause, as described on page 3-29. Additional information about file and
record formats is provided on page 3-122 and about data type conversions
from the input data to the server data types on page 3-133.

Task: Determine the load mode to use: APPEND, INSERT, MODIFY, REPLACE, or
UPDATE. If you use MODIFY or UPDATE mode, determine whether to use the
AGGREGATE mode.
Description: The load mode is specified in the Format clause, as described
on page 3-29.
Task: Determine how you want to display row-level warning messages.
Description: Row message file choices are specified in the Row Message
clause, as described on page 3-57.

Task: Determine whether to use Optimize mode to load the data.
Description: Optimize mode is selected in the Optimize clause, as
described on page 3-59.

Task: Write the LOAD DATA control statements, one per table, in a file. A
single file can contain multiple control statements of different types.
Description: See “TMU Control Files and Statements” on page 1-8.
Some Preliminary Decisions
Before you write the LOAD DATA statements to load data into the tables of
your database, make the following decisions, which affect how you write the
statements:
Determining Table Order
The TMU loads tables in the order of the LOAD DATA statements in the control
file. Each LOAD DATA statement corresponds to one table. To control the
order in which tables are loaded, place the LOAD DATA statements in the file
in the order in which you want the tables to load.
You can load tables in any order as long as any table referenced by a foreign
key is loaded before the table containing that foreign key. That is, a referenced
table must be loaded before the table that references it. For example, if the
Sales table, the referencing table, contains three foreign keys, each of the
three tables referenced by the foreign keys must be loaded before the Sales
table can be loaded.
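For the Sales example, the control file would list the referenced tables first and the referencing table last. In this sketch the table and file names follow the Aroma sample database, and the field specifications are omitted:

```
LOAD DATA INPUTFILE 'period.txt'  INSERT INTO TABLE period (...);
LOAD DATA INPUTFILE 'product.txt' INSERT INTO TABLE product (...);
LOAD DATA INPUTFILE 'store.txt'   INSERT INTO TABLE store (...);
LOAD DATA INPUTFILE 'sales.txt'   INSERT INTO TABLE sales (...);
```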
Ordering Input Data
You must decide whether to order the data in the input files, balancing any
improvement in load time against the amount of time available to load data,
the time spent ordering input data, and the difficulty of maintaining ordered
data.
The initial load of input data into a table is somewhat faster with ordered
data. However, for incremental loads of data into indexed tables, the
optimized load mode makes data-order issues unimportant. With more than
one STAR index, a combination of primary key and STAR indexes, or refer-
ences to multicolumn primary keys, it is usually not useful to attempt to
order data. IBM suggests that for OPTIMIZE OFF loads, the incoming data
should be sorted in the key order of the leading columns of the primary STAR
index.
To order the data for an initial load, order the data for each referenced table
by the primary-key values. If you have a single STAR index on the referencing
table, you can order the data in the key order of the STAR index definition,
which can result in a more efficient index. To order the input data based on a
single STAR index on the referencing table, order it so that the data in the
foreign-key columns named first in the CREATE STAR INDEX statement is the
slowest to change. Data in the foreign-key columns named next changes
more slowly, and so on. The order of data in each foreign-key column must
match the order of data in the corresponding primary key in the referenced
tables. If you use multiple STAR indexes, then the difficulty of choosing an
order for the input data increases and the benefits are reduced.
In a single column in the key, data can be in any arbitrary order, provided that
the order in that column is the same as the order in the corresponding
primary key in the referenced table. For example, if the input data for the
foreign-key column of the referencing table is in descending collation order,
the input data for the corresponding column in the referenced table must also
be in descending collation order.
Maintaining Referential Integrity with Automatic Row Generation
The Automatic Row Generation feature (AUTOROWGEN) allows the TMU to
add any rows needed to preserve referential integrity. If this feature is OFF,
the TMU discards records that violate referential integrity. However, this
behavior can be both time-consuming and frustrating in situations where the
data being loaded is dirty, unfamiliar, or incomplete. This feature offers the
following alternatives, in addition to discarding rows, to maintain referential
integrity:
This flexibility allows you to choose how you want to maintain referential
integrity on a table-by-table basis within a single load operation. Tables are
locked automatically, for either read or write access as needed, at the
beginning of the load operation.
You can set the AUTOROWGEN feature for ON and OFF mode in the
rbw.config file or in the Discard clause of a LOAD DATA statement. However,
you can set it for DEFAULT or mixed-mode operation only in the Discard
clause.
Precomputed views defined on the detail dimension tables for which you are
generating rows cannot be maintained. If aggregate maintenance is turned
on, such views are marked invalid.
Discarding Records That Violate Referential Integrity
If the AUTOROWGEN feature is OFF (the default behavior), all records that
violate referential integrity are discarded and written either to the standard
discard file or to files designated for referential-integrity violations. (You can
designate separate files for violations of each referenced table.)
Table Management Utility Reference Guide

Adding Generated Rows to Referenced Tables
If the AUTOROWGEN feature is ON, whenever an input row in the table being
loaded contains a value in a foreign-key column that is not present in the
primary-key column of the referenced table, a row is generated and added to
the referenced table before the input row is added to the target table. This
behavior cascades through any outboard tables that are in turn referenced by
the referenced table.
In this mode, the referenced tables grow as rows are inserted into them. If a
table grows beyond its MAXROWS PER SEGMENT value, a REORG operation
might be required on STAR indexes built on these foreign-key columns.
The generated rows get their values from default values defined for each
column when the table was created.
Example: AUTOROWGEN ON
Figure 3-2 illustrates the AUTOROWGEN ON feature. Assume you are the
database administrator for the following database. (Bold text indicates
primary keys. Bold italic indicates foreign keys.)
Figure 3-2: AUTOROWGEN ON Feature Adds Rows to Referenced Table

[Figure: a star schema in which the Sales fact table references the Period, Product, Store, and Promotion dimension tables; the Product table references the Class table, and the Store table references the outboard Market table.]
The Sales table contains daily total sales for products sold in a chain of retail
stores. Because all managers have authority to order goods for their own
stores, frequently when you load the daily sales data, new products appear
for which no supporting entries exist in the Product table. Your operations
run more smoothly when you can complete the nightly load and complete
the entries for these new items the next day.
Each manager has a range of Prodkey values to assign to new products in the
defined classes.
The LOAD DATA statement for the daily load operation on the Sales table sets
the Automatic Row Generation feature to ON.
load data
inputfile 'sales.txt'
recordlen 86
insert
discardfile 'sales_disc.txt'
autorowgen on
…
When a record containing the sales dollars for a brand new product is
encountered during the load process, the TMU inserts the record in the Sales
table and adds a row containing the new Prodkey value into the Product
table, filling in that row with any specified default values or NULL.
…:01/96/d22:7:789:78:…:236:…
(Perkey : Storekey : new Prodkey value : Classkey : … : Dollars)
The value 789, which was assigned by a store manager who added a new
item at that store, does not appear in the primary key of the Product table. The
AUTOROWGEN feature allows the TMU to insert the preceding record
into the Sales table after adding a row to the Product table that contains the
new Prodkey value and the default values defined for the remaining columns.

The task of replacing the default values with real values remains, but now the
data is loaded and analysis can proceed. To find any new products that were
added, use a SELECT statement of the form:

select prodkey, product from product
where product = 'new product';
You now need to find the missing information and update the Product table
entry.
Modifying the Input Rows
If the Discard clause specifies AUTOROWGEN DEFAULT mode for a list of
referenced tables, when an input row contains a value in a foreign-key
column that is not present in the primary-key column of a referenced table in
the list, the row is first modified by replacing the missing value with the
default value for the foreign-key column. The row is then added to the target
table. In this mode, referenced tables in the list do not grow. This mode is
useful for data that contains unknown values in foreign-key columns that are
not of critical importance to the application. It is also useful in cases where
you do not want a referenced table to grow to exceed the MAXROWS PER
SEGMENT value.
Example: AUTOROWGEN DEFAULT
Assume the load operation occurs on the Sales table with AUTOROWGEN
DEFAULT mode specified for the Product table. For all other tables that the
Sales table references, the default behavior is OFF mode.
load data
inputfile 'sales.txt'
recordlen 86
insert
discardfile 'sales_disc.txt'
autorowgen default (product)
…
The following records, which contain Prodkey values not present in the
Product table, are encountered in the load process.
…:01/96/d22:7:789:78:45:36:236.56:…
…:01/96/d22:7:790:78:46:42:168.72:…
…:01/96/d22:7:791:78:46:143:937.25:…
(Perkey : Storekey : new Prodkey value : Classkey : Promokey : Quantity : Dollars)
No changes are made to the Product table. The first two records are added
to the Sales table with the default Prodkey value 0, as the following table
shows; the third record is discarded because, after the substitution, its
primary key duplicates that of the second record.

Perkey      Storekey  Prodkey  Classkey  Promokey  Quantity  Dollars
1996-01-22  7         0        78        45        36        236.56
1996-01-22  7         0        78        46        42        168.72
Adding Rows in Mixed Mode
The AUTOROWGEN feature also allows you to combine the ON and DEFAULT
behaviors in a mixed-mode operation. To combine behaviors, use the
Discard clause of a LOAD DATA statement to specify, on a table-by-table
basis, whether rows are added to the referenced table (ON) or to the
referencing table with default values (DEFAULT).
Example: AUTOROWGEN mixed mode
As records are loaded into the Sales table, rows are also added to the Store
and Promotion tables as needed to maintain referential integrity. Rows are
also added to the Market table if necessary, because it is an outboard table
that the Store table references. However, if a record to be loaded contains a
foreign-key value not found in the Prodkey column of the Product table, the
record is added to the Sales table by using the default value (0) for the
Prodkey column of the Sales table. If a record to be loaded into the Sales
table contains a Perkey value not found in the Perkey column of the Period
table, it is discarded because the Period table is not in either table list and
hence is controlled by AUTOROWGEN OFF mode.
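The mixed-mode behavior described above corresponds to a Discard clause of roughly the following form. This is a sketch reconstructed from the behavior just described, not the original example; the filenames are illustrative.

load data
inputfile 'sales.txt'
insert
discardfile 'sales_disc.txt'
autorowgen on (store, promotion) default (product)
…

Store and Promotion (and, by cascade, Market) are in the ON list, Product is in the DEFAULT list, and Period appears in neither list, so it is handled in OFF mode.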
Specifying the AUTOROWGEN Mode
The default behavior for a data warehouse is controlled as a system default
with an entry in the rbw.config file.
The AUTOROWGEN feature also can be set for a specific load operation in the
Discard clause of a LOAD DATA statement, as described in “Discard Clause”
on page 3-43. Setting this feature in the Discard clause provides more
flexibility because you can specify the behavior (ON, OFF, DEFAULT) for each
table referenced by the table being loaded.
Writing a LOAD DATA Statement
The LOAD DATA statement for the TMU specifies, in this order:
■ The files that contain your input data, which can be tape files, disk files, or standard input.

■ The format of the input data (format specification).

■ The locale of the input data, if different from the database locale.

■ Optional discard instructions, which can include filenames and formats for discarded records and the Automatic Row Generation feature.

■ Optional row messages instructions, specifying a filename in which to view messages and warnings.

■ Optional optimization instructions that specify whether to build indexes in optimize mode and a discard file for discarded records.

■ The table into which the data is loaded and a map of data fields into table columns. Alternatively, you can specify an offline segment of a table.

■ Optional criteria that determine which input records should be loaded and which should be discarded.

■ Optional comment text that allows you to store information about a load operation or the data loaded.
Each LOAD DATA statement loads only one table. It can load data into all of
the columns in a table or into a subset of the columns. The names of the
columns to load are specified together with a description of the source data,
which is called a field specification. A field specification contains information
about the data type of the data field in the input record if the data is in an
input file, or it contains information about the automatically generated input
data if the TMU must produce the data.
A control file can contain multiple LOAD DATA statements for multiple
tables, which are processed sequentially.
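As a sketch of this structure, a control file that loads two tables in sequence might look like the following. The table and file names are illustrative, not taken from a shipped example.

load data
inputfile 'class.txt'
insert
into table class(
classkey integer external (2),
class_type char (10)
);

load data
inputfile 'product.txt'
insert
into table product(
classkey integer external (2),
prodkey integer external (2),
prod_name char (30)
);

The TMU processes the first LOAD DATA statement completely before it begins the second.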
LOAD DATA Syntax
LOAD DATA input_clause (p. 3-25) format_clause (p. 3-30) locale_clause (p. 3-39)
{table_clause (p. 3-65) | segment_clause (p. 3-88)} criteria_clause (p. 3-90) comment_clause (p. 3-95) ;
The clauses shown in this syntax diagram are described in detail in the
following sections. For convenient reference, a syntax summary for the LOAD
DATA statement is shown at the end of this chapter.
Input Clause
The TMU accepts input from tape drives, disk drives, and system standard
input. The Input clause specifies the file or files containing the input data, the
input device (for tape drives), and record numbers (for partial loads). In a
single LOAD DATA statement, files must be all tape files, all disk files, or all
standard input.
{INPUTFILE | INDDN} {'filename' | ('filename', …)} [TAPE DEVICE 'device_name']

INPUTFILE or INDDN 'filename'    File that contains the input data to load into the table. The filename must satisfy operating-system conventions for file specification. The name must be enclosed in single quotation marks. If you use multiple input files, you must enclose the list of filenames in parentheses and separate the names with commas.
Important: If the LOAD DATA statement appears in a control file for the rb_cm utility, INPUTFILE must be set to standard input.
For TAR tapes, filenames are case-sensitive and the case of the
filename in the LOAD DATA statement must match the
filename on the tape.
TAPE DEVICE 'device_name' (UNIX, Windows NT)    Tape device. It must be a rewind tape device. Use this clause if the input files are on one or more tapes.
Each name in the filename list can be the name of a single file
on a multifile standard-label tape. However, the TMU does
not support multiple TAR archive files on a tape. It reads only
the first file.
Important: The TAPE DEVICE parameter is not valid for a LOAD DATA statement appearing in a control file for the rb_cm utility. Load input must come from standard input when the LOAD DATA statement is in a control file that the rb_cm utility uses.
START RECORD, STOP RECORD    Specifies which records in the input file mark the beginning and end of loading. If the START RECORD keywords are specified, loading begins at the specified record. Earlier records are read and counted, but their contents are ignored. If the STOP RECORD keywords are specified, loading stops after the specified number of records.
When you use the START RECORD and STOP RECORD clauses, note how the
TMU counts records:

■ The TMU counts only the records it reads. For example, if tape 1 is not
used and the load starts with tape 2, then the first record is the first one on
tape 2 (the first tape used).
■ The number of rows counted is not reset between files or tapes. The
number keeps incrementing until the end of the current LOAD DATA
statement.
■ If START RECORD is specified on data fields defined as a SEQUENCE
field type, the sequence value is incremented for each row skipped.
For example, if you specify START RECORD 10 and are loading a
column using SEQUENCE (2,2), then the first row loaded is input row
10 with sequence value 20.
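A sketch of the SEQUENCE arithmetic in the last point follows; the table and column names are illustrative assumptions.

load data
inputfile 'sales.txt'
start record 10
insert
into table sales_audit(
seq_col integer sequence (2,2),
perkey char (9)
);

Because each skipped input record still advances the sequence, records 1 through 9 consume sequence values 2 through 18, and input record 10, the first record loaded, receives sequence value 20.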
In the following example the input file, market.txt, has fixed-format records.
The fields in market.txt start and end at the same position in each record.
They are not separated by characters. The input file is located on a disk
device, which is the default. The TMU starts counting records from the
beginning of the file, but starts loading at record 100 and stops loading at
record 200.
load data
inputfile 'market.txt'
start record 100
stop record 200
recordlen 7 replace
nls_locale 'English_Canada.MS1252@Default'
discardfile 'mktdisc.txt'
discards 1
optimize on discardfile 'mktdups.txt'
into table market(
mktkey integer external (4),
state char (2)
);
Format Clause
The Format clause is optional and specifies the format details of the input
data.
The absence of a Format clause indicates that records are in the ASCII code set,
and that their length is determined by the field lengths specified in the field
specifications in the Table clause.
{RECORDLEN n | FIXEDLEN n [INTRA RECORD SKIP n]}
[FORMAT IBM | FORMAT SEPARATED BY 'c' | FORMAT IBM SEPARATED BY 'c' | FORMAT UNLOAD | FORMAT VARIABLE | FORMAT IBM VARIABLE | FORMAT XML | FORMAT XML_DISCARD]
{APPEND | INSERT | REPLACE | MODIFY [AGGREGATE] | UPDATE [AGGREGATE]}
INTRA RECORD SKIP n    Valid only for variable format. Directs the read process to skip n bytes after it finishes reading the previous record. This option is provided for skipping newline characters between input records.
Tip: In REPLACE mode, the existing contents of a table are destroyed. Use this mode carefully.
No FORMAT keyword    Indicates that all records are fixed length, as defined with RECORDLEN n. If RECORDLEN is not defined, each record is read until a newline character is encountered. Binary data is not permitted. For more information on fixed-format records, see “Fixed-Format Records” on page 3-123.
FORMAT IBM    Specifies that data is in the EBCDIC code set. CHARACTER and EXTERNAL fields are converted from EBCDIC to ASCII, and integer fields are converted to the byte order of the computer that is running IBM Red Brick Warehouse. For details about specific EBCDIC-to-ASCII conversions, contact IBM Customer Support.
FORMAT IBM SEPARATED BY 'c'    Specifies that data is in the EBCDIC code set and that fields in a data record are separated by the character c, which must be a single-character literal. It must be different from the radix (decimal) point character.

FORMAT UNLOAD    Specifies that the data to load was unloaded in internal format from a database using a TMU UNLOAD statement. You cannot use this format choice to load data that was unloaded in external format. For more information about the UNLOAD statement, refer to Chapter 4, “Unloading Data from a Table.”

FORMAT VARIABLE, FORMAT IBM VARIABLE    Use only if at least one VARLEN data field type is present in subsequent field specifications. (The IBM keyword means that the input is in EBCDIC.)
EBCDIC to ASCII Conversion
If you are using the FORMAT IBM option to load data, note the restrictions
described in this section.

IBM Syntactic Code Set CS 640
ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz
0123456789
% (percent) * (asterisk)
; (semicolon) _ (underscore)
’ (apostrophe) , (comma)
Two Approaches to Loading EBCDIC Data

The two ways to load EBCDIC data with the TMU are:

■ Specify the FORMAT IBM option in the Format clause.

■ Specify an EBCDIC code set with the NLS_LOCALE clause.
If you are certain that you are loading only characters that comply with
CS 640, you can use either the FORMAT IBM or the NLS_LOCALE specification,
but the FORMAT IBM approach yields higher performance. However, if you
are unsure whether your input data complies with CS 640, use the
NLS_LOCALE clause to select an EBCDIC code set that is fully compatible with
the specified language. Although load performance might not be optimal,
this approach ensures the integrity of both the loaded data and database
objects (such as indexes) that are built based on that data.
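The two approaches can be sketched as follows. The EBCDIC code-set name IBM037 is an illustrative assumption; check the locales.pdf file for the code sets your installation actually supports.

load data
inputfile 'sales.ebc'
recordlen 86 insert
format ibm
…

load data
inputfile 'sales.ebc'
recordlen 86 insert
nls_locale '.IBM037'
…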
Examples: Format Clause
The following example shows the use of the Format clause. The Market table
contains existing data that is modified by the records in the input file,
market.txt. The keyword MODIFY specifies that if a record in market.txt has
the same primary key as a row in the Market table, the record in the file
replaces the row in the table. If a row does not yet exist in the table, the TMU
adds a new row.
load data
inputfile 'market.txt'
recordlen 7 modify
nls_locale 'English_Canada.MS1252@Default'
discardfile 'mktdisc.txt'
discards 1
optimize on discardfile 'mktdups.txt'
into table market(
mktkey integer external (4),
state char (2)
);
The following example shows the use of a Format clause to load data in
UNLOAD format. No format specifications or field specifications apply other
than the keyword UNLOAD.
load data
inputfile 'market.txt'
insert format unload
discardfile 'mktdisc.txt'
discards 1
optimize on discardfile 'mktdups.txt'
into table market;
The following example shows the use of a Format clause to load data of
variable format.
load data
inputfile 'market.txt'
fixedlen 6 intra record skip 1
insert format variable
…
(mktkey integer external (4),
state varlen external (2)
);
Locale Clause
The unique combination of a language and a location is known as a locale. A
locale specification consists of four components: language, territory, code set,
and collation order. The default locale for IBM Red Brick Warehouse is:
English_UnitedStates.US-ASCII@binary
where
■ English = language
■ UnitedStates = territory
■ US-ASCII = code set
■ binary = collation order
Important: Any collation order value other than binary implies a linguistic sort. The Default collation order refers to the sort definition specified by the CAN/CSA Z243.4.1 Canadian order, which covers English and several Western European languages.
For more information about locales, refer to the Administrator’s Guide and to
the locales.pdf file on your installation CD.
Although the TMU uses the database locale for most of its processing, you can
specify a different locale for a TMU input file. In this way, the TMU can
automatically convert data from one code set to another as it is loaded into a
database table. The locale of the input file, if different from the database
locale, is specified with the NLS_LOCALE keyword in the Locale clause of the
LOAD DATA statement. If the Locale clause is omitted, the input locale is
assumed to be the same as the database locale.
,PSRUWDQW The Locale clause refers only to the contents of the input file itself. All
information specified in TMU control files must be specified in the database locale.
Specifically, this means that separator, radix, and escape characters must be specified
with the database-locale code set. If the character used as a separator or radix point in
the input data cannot be expressed as a character in the database locale, then the input
data cannot be interpreted correctly.
{NLS_LOCALE | XML_ENCODING} 'language_territory.codeset@sort'
NLS_LOCALE    Locale of the input data when the code set of the input data differs from that of the database locale. The locale specification also determines which character is interpreted as the decimal (radix) point in the input data, but it can be overridden by a RADIX POINT definition, which is specified in the Table clause as part of a DECIMAL field description.

'language_territory.codeset@sort'    All or part of the locale for the input file. The locale specification must be enclosed in single quotation marks. You do not need to specify all four parts of a locale.
Tip: You do not need to specify all the separator characters (the underscore (_), the period (.), and the @ character) in a partial locale specification. Only the character that immediately precedes the specified component is required, such as the underscore character (_) in the _Japan territory example.
■ If you only specify the language, the omitted components are set to
the default values for that language. For example, if you set the locale
to Japanese, the complete locale specification is as follows:
Japanese_Japan.JapanEUC@Binary
For a list of default components for each language, refer to the
locales.pdf file in the RELNOTES directory of your installation CD.
■ If you only specify the territory, the language defaults to English, the
code set to US-ASCII, and the collation order to binary. For example,
if you set the locale to _Japan, the complete, but impractical, locale
specification is as follows:
English_Japan.US-ASCII@Binary
■ Similarly, if you only specify the code set, the language defaults to
English, the territory defaults to UnitedStates, and the collation order
component defaults to binary.
■ Finally, if you only specify the sort component (collation order), the
language defaults to English, the territory defaults to UnitedStates,
and the code set defaults to US-ASCII.
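The defaulting rules can be summarized with these partial specifications and the complete locales they produce. The completions below follow the rules just described rather than additional documented examples.

'Japanese'  =  Japanese_Japan.JapanEUC@Binary
'_Japan'    =  English_Japan.US-ASCII@Binary
'.MS1252'   =  English_UnitedStates.MS1252@Binary
'@Default'  =  English_UnitedStates.US-ASCII@Default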
Locale Specifications for XML Input Files

TMU loads in XML format are fully internationalized. The locale of an XML
input file can be specified in two different ways.

Usage Notes
■ For all discard files you specify in the Discard clause, the input file
locale is used. However, for a discard file you specify in the Optimize
clause, the database locale is used.
■ For all ACCEPT and REJECT processing in the Criteria clause, the
database locale is used.
■ The locale for TMU messages is either the database locale or the locale
specified for the current user with the RB_NLS_LOCALE
environment variable.
■ Before you specify a code set for the input file that is different from
the code set for the database, make sure that conversion between
those code sets is supported. For a complete list of supported
languages and code sets, refer to the locales.pdf file in the relnotes
directory of your installation CD.
Warning: IBM Red Brick Warehouse provides no recovery mechanism when data loss or data corruption occurs because of incompatible code sets.
Example
The following example shows the use of the Locale clause in a LOAD DATA
statement, where the language is English, the territory is Canada, the code set
is MS1252, and the collation sequence is Default, a Canadian collation
sequence definition.
load data
inputfile 'market.txt'
recordlen 7 modify
nls_locale 'English_Canada.MS1252@Default'
discardfile 'mktdisc.txt'
discards 1
optimize on discardfile 'mktdups.txt'
into table market(
mktkey integer external (4),
state char (2)
);
Discard Clause
If the TMU rejects any records because of data conversion, data content, or
referential integrity errors or because records do not meet the ACCEPT and
REJECT criteria in the Criteria clause, it places these records in one or more
discard files. The Discard clause lets you specify:

■ Discard filenames.
■ Whether to separate records discarded for referential-integrity
violations from those discarded for data-integrity violations (data
conversion, data content, or data that does not satisfy the ACCEPT
and REJECT criteria).
■ Whether to further separate records violating referential integrity by
specifying separate discard files for the referenced tables. (If a record
contains multiple referential-integrity violations, it is written to the
file specified for each violated dimension.)
Important: If you have multiple discard files for a TMU LOAD DATA statement (for example, DISCARDFILE and RI_DISCARDFILE), be sure that the names of the discard files are unique. If the names are not unique, one file will overwrite the other file.
XML Discards
When data is loaded in XML format, the resulting discard files are in fixed
format, not XML format. You can reload the discarded rows with a
subsequent fixed-format load operation.
Each input record is first converted to internal row format and the data
integrity is checked. If an error occurs during this phase, the record is
discarded and written to the standard discard file, if one is specified. If no
error occurs during this phase, referential integrity checks are performed on
each referenced table.
{DISCARDFILE | DISCARDDN} 'filename' [IN {ASCII | EBCDIC}]
RI_DISCARDFILE {'filename' | (table_name 'filename', …)} [OTHER 'filename']
DISCARDS n
AUTOROWGEN {OFF | ON [(table_name, …)] | DEFAULT [(table_name, …)] | ON (table_name, …) DEFAULT (table_name, …)}
table_name 'filename'    Table name and filename pair that names a table referenced by a foreign key in the table being loaded and a file to which to discard the records that violate referential integrity with respect to the referenced table. The use of these pairs allows resolution and reprocessing of referential-integrity violations more easily than if all discards are stored in a single file. You can specify multiple pairs. If some but not all referenced tables are listed here, records that violate referential integrity with respect to tables missing from the list are written either to the file following the OTHER keyword or, if that keyword is missing, to the standard discard file (following the DISCARDFILE keyword).

OTHER 'filename'    Optional. File to which to discard any records that violate referential integrity with respect to referenced tables not named in the table name and filename pairs. If a table name and filename pair list is present and the OTHER clause is omitted, any records that violate referential integrity with respect to tables missing from the list are written to the standard discard file (following the DISCARDFILE keyword).
Example: rejected records stored for referential integrity
In the following example, the TMU stores records rejected for data-integrity
violations in the file prod_di.txt and it stores records rejected for
referential-integrity violations in the file prod_ri.txt.
load data
inputfile 'prod.txt'
format separated by ':'
discardfile 'prod_di.txt'
ri_discardfile 'prod_ri.txt'
discards 10
optimize on discardfile 'prd_dups.txt'
into table product(
classkey integer external (2),
prodkey integer external (2),
prodname char (30),
pkg_type char (20)
);
Example: rejected records stored in different files for different tables
In the following example, the TMU stores records that are rejected for data-
integrity violations in the file orders_di.txt. It stores the records rejected for
referential-integrity violations against the referenced tables Supplier and
Deal in the files sup_ri.txt and deal_ri.txt, respectively. Any other records
discarded for referential integrity are stored in the file misc_ri.txt.
load data
inputfile 'aroma_orders.txt'
format separated by '*'
discardfile 'orders_di.txt'
ri_discardfile (supplier 'sup_ri.txt',
deal 'deal_ri.txt') other 'misc_ri.txt'
discards 10
optimize on discardfile 'orders_dups.txt'
into table orders(
order_no integer external,
perkey integer external,
supkey integer external,
dealkey integer external,
order_type char (20),
order_desc char (40),
close_date date 'YYYY-MM-DD',
price dec external (7,2)
);
Example: AUTOROWGEN DEFAULT
[Figure: a Fact table that references the Dim1, Dim2, and Dim3 dimension tables, with an outboard table Out1.]
…
autorowgen on (dim1, dim3) default (dim2)
…
If a record to insert into the Fact table violates referential integrity with
respect to the Dim2 table, the foreign-key value that violated referential
integrity is replaced by the default value for the foreign-key column, and the
modified row is then added to the Fact table.
Usage
The following information applies to the use of the AUTOROWGEN feature in
the Discard clause.
Default Values
Whenever a referential-integrity violation is detected and a row is inserted
with a default value (in the referenced table in ON mode, or in the
referencing table in DEFAULT mode), the default values used are determined
from the default values defined for each column when the table was created.
The default values can be literals, NULL, or system values such as
CURRENT_USER, CURRENT_DATE, CURRENT_TIME, or
CURRENT_TIMESTAMP. These default values also have specific interactions
and restrictions with the column attributes NOT NULL and UNIQUE, as
defined in the SQL Reference Guide.
You can change a default value assigned to a column with the ALTER TABLE
statement.
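For example, a statement of roughly the following form changes the default for one column. The exact ALTER TABLE default-value syntax is documented in the SQL Reference Guide; the form and the value shown here are assumptions for illustration.

alter table product alter column pkg_type set default 'unknown';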
You can determine the default values for each column in a table by selecting
from the RBW_COLUMNS system table as follows:
select name, defaultvalue from rbw_columns
where tname = 'TABLENAME';
Table Locks
Whether a referenced table is locked for read or write access during a load
operation depends on the AUTOROWGEN mode. Locks are obtained at the
beginning of the load operation and held throughout the operation. First, to
maintain referential integrity, write locks are obtained on all referenced tables
into which a write might occur. Read locks are obtained on all referenced
tables that must be read to verify referential integrity.
All required locks are obtained automatically by the TMU. You do not need to
lock any tables manually.
During a load operation, for all AUTOROWGEN modes, the table being
loaded is locked for write access. In addition, the modes, whether specified
in the rbw.config file or the LOAD DATA statement, require additional locks
on referenced tables: write locks on tables into which generated rows might
be inserted, and read locks on tables that are only checked for referential
integrity.
Conflicts in Mixed-Mode Operation
In mixed-mode operation, with both ON- and DEFAULT-mode table lists
present, the LOAD DATA statement might specify potentially conflicting
behavior. If such conflicts occur, a warning message is issued and the record
is discarded.
Assume a database with the following schema:

[Figure: a Fact table that references the Dim1, Dim2, and Dim3 tables, with an outboard table; Dim1 and Dim2 are marked ON and Dim3 is marked DEFAULT.]
Assume also that the LOAD DATA statement for the Fact table contains the
following Discard clause:
discardfile 'fact_dscd' discards 100
autorowgen on (dim1, dim2) default (dim3)
Assume that a record to load into the Fact table requires (for referential
integrity) that a row be added to the Dim2 table, which in turn requires (for
referential integrity) that a row be added to the Dim3 table. However,
according to the AUTOROWGEN clause, referential-integrity violations in
which Dim3 is referenced directly are resolved by replacing the foreign-key
value with the default-column value before adding the row to the Fact table.
Because of this conflict, the TMU discards the record.
DEFAULT Mode and Simple Star Schemas
In a simple star schema, the primary key is composed of all the foreign-key
columns and only those columns. In DEFAULT mode, the same value, the
default value for the column, is used for each row to enter in the referencing
table. Because each row must contain a unique primary key, repeated use of
the same value might cause records to be discarded as duplicates rather than
entered into the referencing table. The example on page 3-20 shows this
behavior.
Troubleshooting
If automatic row generation is in ON or DEFAULT mode but the TMU is unable
to generate rows in a referenced table:
■ Verify that the database user ID running the TMU has INSERT
privilege on the table.
■ Verify that the referenced table does not specify both NOT NULL and
DEFAULT NULL for any column. This combination on a single
column prevents automatic row generation.
■ Verify that a MAXROWS PER SEGMENT value is set for each
referenced table.
If a load operation ends before any rows are loaded, verify that the user
redbrick has write permission on any files named in the Discard clause.
Row Messages Clause
The Row Messages clause allows you to specify a filename where row-level
warning messages are sent as the LOAD operation progresses. If no Row
Messages clause is specified, the messages are displayed as part of the
standard error output.
Regardless of the presence of the Row Messages clause in the control file,
messages also go to the log file maintained by the log daemon. A large
number of message-logging operations could slow down the LOAD
operation significantly; however, if warning messages are being sent to a file
instead of displayed, you might not realize that it is the message logging that
is causing the performance degradation. For information about configuring
the severity level for message logging, refer to the Administrator’s Guide.
You can set the Row Messages mode globally in the rbw.config file for all
load operations or in the SET statement for individual loads. If Row
Messages is set to FULL, which is the default, the Row Messages clause
designates the filename for the messages. If the RowMessages filename is
specified but the rbw.config or SET statement is set to NONE, the
RowMessages filename is ignored.
ROWMESSAGES ’filename’
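As a sketch of how the clause reads in context (the file and table names are illustrative, and the clause placement follows the order in which the load-statement clauses are described in this chapter):

```
load data
inputfile ’sales.txt’
insert
discardfile ’sales_discards’
rowmessages ’sales_rowmsgs.txt’
into table sales(
perkey integer external (5),
prodkey integer external (2),
mktkey integer external (2),
dollars decimal external (7,2)
);
```

With Row Messages set to FULL, row-level warnings for this load go to the file sales_rowmsgs.txt instead of standard error output.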
Optimize Clause
The Optimize clause applies only to those operations that use REPLACE,
INSERT, APPEND, or MODIFY modes in the Format clause. You cannot use the
Optimize clause with UPDATE mode.
The Optimize clause specifies how to update the indexes during a TMU
incremental load operation.
You can set the optimize mode globally in the rbw.config file for all load
operations. If the optimize setting in the rbw.config file is what you want for
a given LOAD DATA statement, then you do not need to include the Optimize
clause in the LOAD DATA statement.
OPTIMIZE { OFF | ON [DISCARDFILE ’filename’] }
Usage Notes
In optimize mode, new index nodes in B-TREE and STAR indexes are built
using fill factors specified in the rbw.config file or found in the
RBW_INDEXES system table. For information about fill factors, refer to the
Administrator’s Guide.
In optimize mode, space is allocated for index entries for all records,
including duplicate records, and that space is not reclaimed for immediate
reuse when the duplicate records are discarded. If the data you are loading
contains many duplicate records, this behavior has several side effects:
Essentially, a MODIFY mode load with OPTIMIZE ON uses batch insertion
for insert-input rows and direct-index insertion for update-input rows. If
multiple unique indexes are defined on the table, OPTIMIZE is turned OFF.
Except for the primary-key index (which can be a B-TREE or a STAR index),
a unique index can only be a B-TREE defined on a unique column.
The duplicates going through the duplicate-removal phase are the rows that
have the same primary keys as the previous rows in the input file. The input
rows that have the same primary keys as the ones in the table do not count as
duplicates. Those input rows are updated directly. Therefore, when you use
the MODIFY mode load, duplicate handling is not as critical as when you use
the INSERT, APPEND, or REPLACE modes.
Checking for duplicate rows consumes both time and memory when the load
process is done in optimize mode. When the number of duplicate rows is
large, the amounts consumed can be significant. IBM recommends that you
do not use optimize mode for load processes in either of the following cases:
■ The target table has a UNIQUE index other than the primary-key
index (multiple UNIQUE indexes), and more than 5,000 records in the
input data are discarded because their key values are duplicates of
existing rows.
■ The target table has a single UNIQUE index (the primary-key index),
the input data contains more than 1 million records, and more than
10 percent of these records are discarded because their key values are
duplicates of existing rows.
In the following example, the TMU discards duplicate records and saves them
in the named file mktdups.txt. Any records discarded for other reasons
(referential-integrity violations, data-conversion errors) are saved (in a
different format) in the file mktdisc.txt.
load data
inputfile ’market.txt’
recordlen 7
discardfile ’mktdisc.txt’
discards 1
optimize on discardfile ’mktdups.txt’
into table market(
mktkey integer external (4),
state char (2)
);
MMAP Index Clause
The MMAP Index clause specifies one or more primary key indexes on tables
referenced by the table being loaded. The purpose of this specification is to
define the order in which those indexes are memory-mapped (with the
mmap system function), as a means of optimizing referential-integrity
checking. Use this clause in conjunction with the TUNE TMU_MMAP_LIMIT
parameter, which controls the amount of memory available for memory-
mapping during loads; see page 2-35.
MMAP INDEX ( pk_index_name [SEGMENT ( segment_name [, ...] )] [, ...] )
In the following example, the primary key indexes from the Period and
Product dimension tables will be memory-mapped in that order when the
Sales_Forecast table is loaded.
LOAD DATA INPUTFILE ’sales_forecast.txt’
RECORDLEN 62
INSERT
NLS_LOCALE ’English_UnitedStates.US-ASCII@Binary’
MMAP INDEX (period_pk_idx, product_pk_idx)
INTO TABLE SALES_FORECAST (
...;
If the MMAP INDEX clause is omitted from the control file, primary key
indexes on referenced tables are memory-mapped in descending order of size
(from largest to smallest). If insufficient memory is available to memory-map
an entire index, some of its segments are still memory-mapped.
Table Clause
The Table clause specifies the table into which the data is loaded. It can
include the table name, column names, pseudocolumns, and field specifications.
Each field or group of fields maps to a column within the specified table. The
mapping of input data types to database server data types is defined on
page 3-133. For all formats except FORMAT UNLOAD, one or more field speci-
fications is required. For UNLOAD, field specifications are not allowed.
( { col_name [AS $pseudocolumn] | $pseudocolumn }
  { RETAIN | DEFAULT |
    simple_field (p. 3-71) | concat_field (p. 3-81) | constant_field (p. 3-84) |
    sequence_field (p. 3-85) | increment_field (p. 3-86) } [, ...] )
table_name Table into which the data is loaded. The table must be
previously defined with a CREATE TABLE statement.
This name cannot be a view or synonym name.
Important: If the DEFAULT keyword is used for a column that is defined as NOT
NULL DEFAULT NULL, then the load operation ends.
Loading a SERIAL Column
When loading a SERIAL column, the TMU can either directly load input data
containing serial values, or automatically generate serial values. When
loading or unloading a serial column, the load script must specify the data
type as either a numeric-external field type or an integer-binary field type.
When the serial values are provided in the input data and loaded in INSERT,
APPEND, REPLACE, UPDATE, or MODIFY modes, all positive values for the
SERIAL column are loaded. Any rows with zero or negative values are
discarded.
When serial values are not provided in the input data, the TMU automatically
generates them. In this case, you can either leave the load-script data field
undefined, or define the data field as RETAIN. You cannot define a field as
DEFAULT, because the serial data type has no default value.
You cannot perform an offline load on a table with a SERIAL column. For
more information about SERIAL columns, refer to the SQL Reference Guide.
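As a sketch (the table and file names, and the assumption that the table has a SERIAL column named order_id, are illustrative), a load that lets the TMU generate the serial values can simply omit the SERIAL column from the field list:

```
load data
inputfile ’orders.txt’
insert
discardfile ’orders_discards’
into table orders(
perkey integer external (5),
prodkey integer external (2),
custkey integer external (2)
);
```

Because no field maps to the SERIAL column (order_id in this sketch), the TMU generates its values automatically; defining that field as RETAIN would have the same effect, whereas DEFAULT is not allowed.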
Selective Column Updates with RETAIN and DEFAULT
The following table defines the TMU behavior during load operations with
respect to the type of field specification you specify in the Table clause, and
the load mode you specify in the Format clause, of the LOAD DATA statement.
[Table: TMU behavior for each field-specification type and load mode (Format clause).]
In the following example, the Market table is being loaded with new data
that reflects a geographic reorganization of the districts and regions to which
each headquarters city belongs.
The basic LOAD DATA statement for the Market table is as follows:
load data
inputfile ’aroma_market.txt’
recordlen 45
replace
discardfile ’aroma_discards’
discards 1
into table market (
mktkey integer external(2),
hq_city char(20),
hq_state char(2),
district char(13),
region char(7) );
The new LOAD DATA statement retains the information in the Hq_city and
Hq_state columns but loads the new definitions in the District and Region
columns. Even if the Hq_city and Hq_state field specification lines are
omitted, the values currently in the table are retained.
load data
inputfile ’aroma_mkt_upd.txt’
recordlen 45
modify
discardfile ’mkt_upd_discards’
discards 1
into table market (
mktkey integer external(2),
hq_city retain,
hq_state retain,
district char(13),
region char(7) );
Simple Fields
A simple field specifies the data type of the field in the input record that is
loaded into the column.
[POSITION ( start [: end] )] field_type (p. 3-97) [xml_path]
[ROUND] [LTRIM | RTRIM | TRIM]
[ADD | SUBTRACT | MIN | MAX |
 ADD_NONULL | SUBTRACT_NONULL | MIN_NONULL | MAX_NONULL]
POSITION Offset in bytes from the beginning of the record. This option
is used for fixed-format or variable-format files only. Do not
use this option with the SEPARATED BY keywords. The
position of the first field in a record is 1. If no position is specified,
then the position of a field is one greater than the last byte of
the previous field.
start:end Refers to the position of the data in the record, not to the
position in the field. Therefore, when you specify start:end,
remember to use the position of the data in the record. In the
following example, the field to load into the Weight column
starts at position 30 and ends at position 32. Other fields or
blanks precede this field in the input file:
weight position (30:32) integer external (3)
xml_path Describes how the XML input file should be parsed to load
each input field.
field_type Data type of the input field, for example, integer external. For
information about field types, refer to page 3-97.
LTRIM, RTRIM, Input modifiers that handle leading and trailing spaces
TRIM when loading VARCHAR columns. They modify the input
column by removing leading blanks, trailing blanks, or both,
respectively.
NULLIF Provides a way to load a column with the default value. If the
default value is not defined, the column is loaded with NULL.
If the data in the specified position is equal to the value of the
string, then the column value for the corresponding row is set
to NULL.
SUBTRACT Subtracts the value in the input record from the correspond-
ing value in the table column.
MIN Keeps the smaller of the values in the input record and the
value in the corresponding table column.
MAX Keeps the larger of the values in the input record and the
value in the corresponding table column.
ADD_NONULL Adds the value in the input record to the corresponding table
column. When the value in the input record is NULL, the
value in the table column remains unchanged. A null value
in the table column is treated as 0.
SUBTRACT_NONULL Subtracts the value in the input record from the value in the
corresponding table column. When the value in the input
record is NULL, the value in the table column remains
unchanged. A null value in the table column is treated as 0.
MIN_NONULL Keeps the smaller of the values in the input record and the
value in the corresponding table column. If the value in the
input record is NULL, the value in the table column is
retained. If the value in the table column is NULL, it is
replaced by the value in the input record.
MAX_NONULL Keeps the larger of the value in the input record and the value
in the corresponding table column. If the value in the input
record is NULL, the value in the table column is retained. If
the value in the table column is NULL, it is replaced by the
value in the input record.
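A sketch of how a _NONULL operator is written, assuming the same Aroma sales table used in the other examples in this chapter:

```
load data
inputfile ’sales.txt’
update aggregate
discardfile ’sales_discards’
into table sales(
perkey integer external (5),
prodkey integer external (2),
mktkey integer external (2),
dollars decimal external (7,2) add_nonull
);
```

Unlike plain ADD, a NULL Dollars value in an input record leaves the existing column value unchanged instead of producing a NULL result.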
xml_path Specification
The following diagram shows how to construct an XML path for an input
field:
Tip: Specify the length of each field based on a careful estimation of the data in the
XML input file. The field lengths determine the size of each input row; if you set the
length values too high, the input rows will be larger than necessary.
Aggregate Operators
You can use the aggregate operators ADD, SUBTRACT, MIN, and MAX only
with the MODIFY AGGREGATE or UPDATE AGGREGATE modes. You cannot
use them with primary-key columns or true pseudocolumns (not the
AS $pseudocolumns) or non-numeric columns.
If a value in the specified field of the input record or a value in the specified
column of the table is NULL, then the result of the aggregation operation
(ADD, SUBTRACT, MIN, or MAX) is NULL.
Example: Position Clause
The following example shows a LOAD DATA statement that reads a fixed-
format file. The Position clause specifies the starting byte of each field relative
to the offset in bytes from the beginning of the record. The field that maps to
the Perkey column starts at position 4 and ends at position 8 followed by 3
spaces. The next field maps to the Prodkey column and starts at position 12,
and the field after that maps to the Custkey column and starts at position 17.
load data
inputfile ’orders.txt’
recordlen 39
modify
discardfile ’orders_discards’
discards 1
into table orders(
perkey position (4) integer external (5),
prodkey position (12) integer external (2),
custkey position (17) integer external (2),
invoice sequence (1000,1)
);
The field specification for Invoice does not contain a Position clause. The
TMU generates values to store in the Invoice column. Because the values do
not exist in the input file, a Position clause is not relevant.
The following sample data comes from the orders.txt file. The dashes (-)
represent spaces. Actual data would contain spaces.
Perkey   Prodkey   Custkey
---10045---12---56
---10046---13---57
---10047---14---58
Example: XML Data and Corresponding Control File
The following XML document can be used as the input for a TMU load in XML
format:
<?xml version="1.0"?>
<aromaproducts>
<coffee>
<product>
<ID classkey="12" prodkey="68"/>
<name>Aroma 2002 shirt </name>
<package>No_pkg </package>
</product>
</coffee>
</aromaproducts>
The following TMU and RISQL output shows the results of the load and a
subsequent query against the Product table:
157 brick % rb_tmu product_xml.tmu system manager
(C) Copyright IBM Corp. 1991-2002. All rights reserved.
Version 06.20.0000(0)TST
** STATISTICS ** (500) Time = 00:00:00.00 cp time, 00:00:00.00
time, Logical IO count=0, Blk Reads=0, Blk Writes=0
** INFORMATION ** (366) Loading table PRODUCT.
** INFORMATION ** (8555) Data-loading mode is APPEND.
** INFORMATION ** (9033) Parsing XML input file product.xml.
** INFORMATION ** (9036) XML Parsing Phase: CPU time usage =
00:00:00.00 time.
** INFORMATION ** (9018) Processed 1 rows in this LOAD DATA
operation from XML format input file(s).
** INFORMATION ** (513) Starting merge phase of index building
PRODUCT_PK_IDX.
** INFORMATION ** (513) Starting merge phase of index building
PRODUCT_FK_IDX.
** INFORMATION ** (367) Rows: 1 inserted. 0 updated. 0 discarded.
0 skipped.
** STATISTICS ** (500) Time = 00:00:00.14 cp time, 00:00:01.17
time, Logical IO count=55, Blk Reads=0, Blk Writes=1
For more details about loading tables from XML input files, see page 3-129.
Example: NULLIF
The following example shows a NULLIF condition on the destination column,
City. If a value starting at position 3 and ending at position 5 in market.txt is
San, the TMU stores a null indicator in the City column.
load data
inputfile ’market.txt’
discardfile ’market_discards’
into table market(
mktkey integer external (2),
city char (20) nullif (3:5) = ’San’
);
Example: Auto Aggregate
In the following example, the TMU adds the dollar values in the input record
to the existing values in the corresponding row of the Sales table. Because
UPDATE AGGREGATE is specified, new rows are not added to the table. Each
record must have a primary-key value that is already present in the Sales
table. An aggregate mode must be specified because ADD is part of the Auto
Aggregate mode.
load data
inputfile ’sales.txt’
update aggregate
discardfile ’sales_discards’
into table sales(
perkey integer external (5),
prodkey integer external (2),
mktkey integer external (2),
dollars decimal external (7,2) add
);
Example: ROUND Function

The following example shows the behavior of the ROUND function for
floating-point input fields.

Input Value   Column Type   Rounded Value Loaded into Table
1.5           INT           2
-1.5          INT           -2
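A sketch of where ROUND appears in a field specification, assuming a hypothetical Product table whose Weight column is defined as INT:

```
load data
inputfile ’product.txt’
append
discardfile ’product_discards’
into table product(
prodkey integer external (2),
weight float external (4) round
);
```

ROUND follows the field type, in the same position as the other input modifiers.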
Concatenated Fields
A concatenated field specifies the concatenation of fields in the input record
that is loaded into the column.
[Syntax diagram: concat_field (p. 3-65; returns to table_clause) — a CONCAT list
whose items can be column_name, $pseudocolumn, or ’character_string’, each
optionally wrapped in LTRIM or RTRIM.]
Example: Concatenated Fields
In the following example, two fields are concatenated and stored in a column.
All of the fields in the product.txt file are loaded into the Product table. The
values in the Aroma and Acid fields in product.txt are stored in the Aroma
and Acid columns and are also joined in the Body column with the
ampersand character (&) used as a separator. The LTRIM option removes
preceding blanks.
load data
inputfile ’product.txt’
replace
format separated by ’:’
discardfile ’product_discards’
discards 100
into table product (
prodkey integer external (2),
product char (12),
aroma char (8),
acid char (7),
body
concat (ltrim (aroma), ’&’, ltrim (acid))
);
Example: Concatenated Fields and Pseudocolumns
Constant Fields
A constant field specifies a constant value to load into the column. The TMU
generates the value. It does not exist in the input file.
CONSTANT { NULL | ’character_literal’ | float_constant | integer_constant |
           DATE ’date_literal’ | TIME ’time_literal’ |
           TIMESTAMP ’timestamp_literal’ | ’alternative_datetime_value’ }
Both ANSI SQL-92 datetime data types and the defined alternative datetime
formats are valid. If an ANSI SQL-92 datetime keyword is present, then the
literal that follows the keyword must be in ANSI SQL-92 format. If you use an
alternative datetime value with numeric months and a format other than mdy,
you must include a SET DATEFORMAT statement in the TMU control file. For
more information about the TMU SET DATEFORMAT statement, refer to
“Format of Datetime Values” on page 2-33.
Tip: You can also use a simple field and a datetime format mask to specify a
non-standard datetime value, as described on page 3-109.
In the following example, the TMU generates the value 999 and stores it in the
Dollars column for each record loaded into the Orders table.
load data
inputfile ’orders.txt’
replace
discardfile ’orders_discards’
discards 10
into table orders(
invoice integer external (5),
perkey integer external (5),
prodkey integer external (2),
custkey integer external (2),
dollars constant 999
);
In the following example, the TMU generates a constant date of March 10,
1998, and constant time and timestamp values and stores them in the
corresponding columns for each record loaded into the Period table.
load data
inputfile ’period.txt’
replace
format separated by ’*’
into table period (
perkey integer external (5),
date_col constant date ’1998-03-10’,
time_col constant time ’03:15:30’,
timestamp_col constant timestamp ’1998-03-10 3:15:30’
);
Sequence Fields
A sequence field specifies a sequentially computed integer value to load into
a numeric column. The TMU generates the numbers. They do not exist in the
input file.
SEQUENCE ( start [, increment] )
start, increment Starting value and the value by which to increment. These
values can be any negative or positive integer, including 0. The
increment value is applied to each new row, whether that row
is skipped, loaded, or discarded because of an error. The
default value for both start and increment is 1.
Important: Verify that the incremented values do not overflow the range of the data
type of the destination column.
Increment Fields

An increment field specifies a value to add to the existing column value. The
TMU generates the value to be added. It does not exist in the input file.

INCREMENT ( n )
If the value in the specified column is NULL, the result of the increment
operation on that row is also NULL.
The following example shows the use of the increment field and the UPDATE
AGGREGATE mode. The TMU updates the Weight column by adding the
value 15 to the values already existing in the column. Because UPDATE
AGGREGATE is specified, each record in the sales.txt file must have a
primary-key value that exists in the Sales table. Otherwise, the record is
discarded. An aggregate mode must be specified because Increment fields
use the Auto Aggregate mode.
load data
inputfile ’sales.txt’
recordlen 32
update aggregate
discardfile ’sales_discards’
discards 1
into table sales(
perkey integer external (5),
prodkey integer external (2),
mktkey integer external (2),
dollars decimal external (7,2),
weight increment (15)
);
Segment Clause
You can use the Segment clause instead of a Table clause to specify a segment
of a table into which to load data. To load data into a single segment, the
following conditions must be met:
( { col_name [AS $pseudocolumn] | $pseudocolumn }
  { simple_field | concat_field | constant_field |
    sequence_field | increment_field } [, ...] )
Important: You must omit the column names, pseudocolumns, and field specifications
if and only if the input data is in UNLOAD format, as specified in the Format
clause.
INTO OFFLINE SEGMENT segment_name — Offline row data segment into
which the data is loaded. The segment must be attached to the table specified
by table_name and must be offline.
After data is loaded into the offline segment, the partial indexes built in the
work segment must be synchronized, or merged, with the existing indexes on
the table with a SYNCH OFFLINE SEGMENT operation, as described in
“Writing a SYNCH Statement” on page 3-119.
Example
The following example shows a LOAD DATA statement to load data into an
offline segment, followed by a SYNCH operation to synchronize the offline
segment with the rest of the table.
load data
inputfile ’sales_96_data’
append
discardfile ’discards_sales_96’ discards 3
into offline segment s_1q96 of table sales
working_space work01 (
perkey date (10) ’MM/Y*/d01’,
Criteria Clause
The Criteria clause allows you to specify that a comparison be made of each
input record or each row of data. The result of the comparison, true or false,
loads or discards the record, depending on whether the Criteria clause
specifies ACCEPT or REJECT. You can use this clause to ensure that the correct
data is loaded or that rows are not aggregated more than once when a load
operation in an AGGREGATE mode is interrupted.
The Criteria clause can be used for comparisons on numeric, character, and
datetime data-type columns.
The Criteria clause uses the collating sequence and the code set from the
database locale for all processing.
ACCEPT Specifies that each row of input data that meets the
comparison criteria (that is, it evaluates to TRUE) is loaded
into the table. All others, including those containing
NULL indicators, are discarded.
REJECT Specifies that each row of input data that meets the
comparison criteria (that is, it evaluates to TRUE) is rejected
and discarded. All others, including those containing
NULL indicators, are loaded.
The constant data type must be the same data type as, or
compatible with, the data type of the column or
pseudocolumn with which it is compared. For example,
numeric constants cannot be compared with character or
datetime columns.
LIKE, NOT LIKE Compares column or field values with a character string.
The column or pseudocolumn referenced must be of
CHARACTER data type.
ESCAPE ’c’ The ESCAPE keyword, which can be used only with a
LIKE or NOT LIKE comparison, defines a character (c) to
serve as an escape character so that the wildcard
characters can be treated as character literals rather than
control characters. Use the ESCAPE keyword whenever the
pattern to match contains a percent or underscore
character.
Usage
Only one ACCEPT or REJECT Criteria clause can be present in each LOAD
DATA statement.
Example: Valid Criteria Clauses
The following examples illustrate valid Criteria clauses that compare column
values to constants. (You can use this kind of Criteria clause only in UPDATE,
MODIFY, or AGGREGATE mode.)
The following examples illustrate valid Criteria clauses that compare input
field values to constants:
accept $CITY = ’Los Angeles’
accept $TIME_COL >= ’13:13:13’
reject $TIME_COL >= time ’08:35:40’
accept $TIMESTAMP_COL >= timestamp ’1995-10-16 12:13:13’
Example: LIKE and NOT LIKE
The following examples illustrate the use of the LIKE and NOT LIKE operators
in a Criteria clause:
reject zip not like '950%'
-- rejects any zip codes that do not begin with '950'
Comment Clause
The Comment clause contains a user-defined text string that describes the
load operation or the data being loaded. This information is then stored in the
RBW_LOADINFO system table to provide a historical record regarding the
loading of data into the specified table. You can retrieve the information by
querying the RBW_LOADINFO system table.
COMMENT ’character_string’
Example
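A sketch of a Comment clause in context (the file, table, and comment text are illustrative):

```
load data
inputfile ’sales_week23.txt’
append
discardfile ’sales_discards’
comment ’weekly sales batch, week 23’
into table sales(
perkey integer external (5),
prodkey integer external (2),
mktkey integer external (2),
dollars decimal external (7,2)
);
```

The comment string is stored with this load’s entry in the RBW_LOADINFO system table.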
To check load activity on the Sales table, for example, to verify that a specific
batch of data was loaded, you can query the RBW_LOADINFO system table
as follows:
select *
from RBW_LOADINFO
where tname = ’SALES’
order by started;
The RBW_LOADINFO system table contains a row for each of the last
256 LOAD DATA operations. To retrieve data in any specific order, include an
ORDER BY clause in the SELECT statement.
Field Types
A field type specifies the data type of the input data in a simple field, as
described on page 3-71. The TMU converts this data type into the data type
defined for the column in the CREATE TABLE statement. The two data types
must be compatible, as defined on page 3-133.
Numeric external fields (p. 3-101):
  {INTEGER | INT} EXTERNAL [( length )]
  {DECIMAL | DEC} EXTERNAL [( length [, scale] )] [RADIX POINT ’c’]

Floating-point external fields (p. 3-103):
  FLOAT EXTERNAL [( length )]

Packed and zoned decimal fields (p. 3-104):
  {DECIMAL | DEC} {PACKED | ZONED} [( length [, scale] )] [restricted_date_spec (p. 3-116)]

Integer binary fields:
  {INTEGER | INT} [( scale )] [restricted_date_spec (p. 3-116)]
  SMALLINT
  TINYINT

Floating-point binary fields (p. 3-106):
  REAL
  DOUBLE PRECISION
Each field type is defined in the following sections, with examples of each
type.
Character Field Type
CHARACTER, CHAR Identifies a character string. A CHARACTER field can
contain any character in the computer code set.
VARLEN, VARLEN EXTERNAL Used for loading CHAR and VARCHAR column types, the
VARLEN and VARLEN EXTERNAL field types identify the
length of a character data section. VARLEN and VARLEN
EXTERNAL field types are not supported for loads in XML
format.
Tip: TMU performance is better if you can load substrings of fixed-format character
data with the POSITION keyword rather than the SUBSTR function. With multibyte
characters, however, you must use the SUBSTR function to extract a string because
POSITION is byte-based whereas SUBSTR is character-based.
char (10)
character (24)
char (24) substr (1, 5)
The following example shows the use of the SUBSTR keyword in a LOAD
DATA statement to load partial character strings into Col2 of the Sales table.
The numbers in parentheses define the starting character position and the
number of characters in the substring.
load data
…
For example, if the input data to Col2 is the string California, only the
substring Calif is loaded into the column.
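Such a statement might look like the following sketch (the table, file, and column names are illustrative):

```
load data
inputfile ’sales_col2.txt’
append
discardfile ’col2_discards’
into table sales(
mktkey integer external (2),
col2 char (24) substr (1, 5)
);
```

Here substr (1, 5) starts at character position 1 and takes 5 characters, so the input string California loads as Calif.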
Numeric External Field Types

INTEGER EXTERNAL, INT EXTERNAL String of characters representing a
number in [±]digits format. These numbers cannot exceed 38 digits. Use this
field type when loading a SERIAL column.

The character string takes the general form:
[+|-] digit... [. digit...]
Warning: If you specify the radix character, you must specify it by using the
database-locale code set. If the character used as a radix in the input data cannot be
expressed as a character in the database, then the input data cannot be interpreted
correctly.
Floating-Point External Field Type
FLOAT EXTERNAL String of characters representing a floating-point number
in the following format:
[+|-] digit... [. digit...] [{E|e} [+|-] digit...]
Example
int external --length specified by POSITION clause
integer external (8)
decimal external --length specified by POSITION clause
decimal external (5)
decimal external (5,2)
float external --length specified by POSITION clause
float external (8)
If the input records are in separated format, the length can be determined
implicitly.
Packed and Zoned Decimal Field Types
DECIMAL, DEC, DECIMAL PACKED Decimal numbers in packed format.
These numbers cannot exceed 38 digits.
Examples
decimal -- packed; length specified by POSITION clause
dec -- packed; length specified by POSITION clause
decimal packed -- length specified by POSITION clause
dec packed (5,2)
decimal zoned -- length specified by POSITION clause
decimal zoned (5,2)
decimal zoned (5)
decimal packed (8) date ’YYYYMMDD’
If the TMU LOAD DATA script references a packed decimal-input field type of
DECIMAL PACKED (6,3) for conversion to a database data type of DECIMAL
(5,2), the following conversions or errors occur.
Input Value    Result
473220         473.22
Integer Binary Field Types
For integer binary field types, the length is implied by the field type.
Example
If the TMU LOAD DATA script references a binary-integer input field type of
INT (3) for conversion to a database data type of DECIMAL (5,2), the following
conversions or errors occur.
Input Value    Result
473220         473.22
Floating-Point Binary Field Types
The REAL and DOUBLE PRECISION field types are not supported if the FORMAT IBM keywords are included in the Format clause.
Datetime Field Types
DATE, TIME, TIMESTAMP
    Character data that is processed and stored as date, time, and time-stamp information. For datetime field types in fixed-format input, the length must be specified either with the POSITION keyword or with the length parameter.
[Table: field types and their allowable subfields]
The TMU converts the M4DATE string into the Metaphor DIS date format and
stores it as an integer when it loads a table. The format must be one of those
listed in the following table.
Format              Example (April 10, 1996)
YYJJJ or YYYYJJJ    96/100 or 1996/100
YYMD or YYYYMD      960410 or 1996/4/10
MDYY or MDYYYY      4/10/96 or 04101996
DMYY or DMYYYY      10/4/96 or 10041996
YY and YYYY
    Two and four digits, respectively, specifying a year. A two-digit year nn is interpreted as 19nn.
The day, month, and year fields can be contiguous or can be separated by a
single blank, slash (/), hyphen (-), period (.), or comma (,). If the fields are
contiguous, then each day and month representation must contain two
digits.
Format Masks for Datetime Fields
A format mask for a datetime field type is created by concatenating allowable
subfield format specifiers, using either a fixed- or variable-length subfield or
a combination of both. A DATE format mask composed of month, day, and
year subfields might look like one of those listed in the following table.
Format          Definition
'm8d16y1997'    DATE format mask, constant date (Aug 16, 1997). Date constants can also be defined with a CONSTANT field specification, as described on page 3-84.
Examples
date (8) 'MMDDYYYY'
date (8) 'DDMMYYYY'
date 'y1996m8d17'
Subfield Components
The following table defines each subfield component, its default value, and
its specifier in the format mask. Examples for fixed- and variable-length
subfields are provided in the sections that follow.
Mon    *Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec
In a fixed-length subfield, the last letter of the subfield mask character repeats
once for each character of the input. Fixed-length subfields cannot contain
blanks.
You can use an underscore (_) to indicate the end of the subfield and that the
next character should be ignored. You can also use the underscore as a
wildcard to skip bytes. You must repeat it for each byte to skip.
Regardless of the format you specify for the subfields, the number of bytes
processed is limited by the length parameter for the datetime field.
Examples of Subfield Masks
Format for Fixed-length Subfields
MMDDYYYY    Indicates 2 digits each for month and day and 4 digits for year.
Format for Variable-length Subfields
D*/M*/YYYY     Indicates 1 or more digits for day and month, 4 digits for year; subfields separated by a slash (first non-digit character).
Mon d1 y?Y*    Indicates short month, day is 01, 1- or 2-digit year; subfields separated by spaces.
(1) Skip any blanks, read full-month name, and check for a blank.
(2) Skip any blanks, read one or more digits for date, and check for a comma.
(3) Skip any blanks and read two or more digits for year. Ignore the non-digit
character following the year.
(4) Read two digits for hours. No white space or other non-digits allowed.
Check for a colon.
(5) Read two digits for minutes. No white space or other non-digits allowed.
Check for a colon.
(6) Read two digits for seconds. No white space or other non-digits allowed.
Check for a period.
(7) Skip any blanks, read four or more digits for fractional seconds, and
ignore anything that follows.
Format Masks to Read Input Fields
The following table contains some types of input fields and suggests masks
to read them.
[Table: sample input fields and the format masks that read them]
Example: Loading Datetime Data
The following example shows how data in various formats is loaded into
DATETIME columns.
The data for the Datetime table is in a file named datetime_inputs with fields
separated by an asterisk (*). The first three records of input data in
datetime_inputs look like the following example.
[Sample input records: fields map to columns d1, d2, ts1, t1, and ts2]
A LOAD DATA statement to load this data follows. The current date is loaded
into column D3.
load data
inputfile 'datetime_inputs'
replace
format separated by '*'
into table datetime (
d1 date 'y19Y*/M*/D*',    -- Date subfields separated by /
d2 date 'Month D* y?Y*',  -- Date subfields separated by space
d3 current_date,          -- Rows loaded with date at time of load
ts1 timestamp(8) 'y1996MMDDHHII',  -- Fixed format mask
t1 time 'HH:II:SSAM',     -- Time subfields separated by :
ts2 timestamp '_ _ _ _Mon DD HH:II:SS_ _ _ _ _Y*'
    -- First 4 characters and 5 characters
    -- between S and Y are ignored
);
If the data is loaded on July 1, 1996, the information stored in the Datetime table looks like the following example.
[Table: resulting values in columns d1, d2, d3, ts1, t1, ts2]
Restricted Datetime Masks for Numeric Fields
The TMU can load binary integer or packed or zoned decimal input data into
datetime columns when the input fields are described by a restricted
datetime format mask. For example, if an input record contains a date value
for February 14, 1998, represented as 19980214 in a packed decimal field, the
TMU can extract the date from the input field and store it in a DATE column.
restricted_date_spec (back to field_type, p. 3-97)

DATE 'restricted_date_mask'
TIME 'restricted_time_mask'
TIMESTAMP 'restricted_timestamp_mask'
Integer Field Type    Maximum Digits in Mask
INTEGER (4 bytes)     10
SMALLINT (2 bytes)    5
TINYINT (1 byte)      3
You can use the underscore (_) character in the
mask to indicate that an input digit should be
ignored.
Important: Scale values are ignored when restricted date masks are used. A length value, from a length argument or a Position clause, is required with packed or zoned decimal field types and must be consistent with the format mask.
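As a hedged illustration (the file and table names are invented; the field declaration follows the pattern of the packed-decimal examples earlier in this chapter), a packed decimal field holding dates such as 19980214 might be loaded into a DATE column as follows:

load data
inputfile 'period_packed.dat'
replace
into table period (
date_col decimal packed (8) date 'YYYYMMDD'
);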
Requirements for Input Data for Datetime Masks
The input data must meet the following requirements:
Examples

Restricted Date Mask    Examples    Comments
YYYYMMDD                19980214    Valid: 4 digits of year, 2 digits each for month and day.
The following example shows how to use the restricted date mask in a LOAD
DATA statement. The Period table contains a DATE data-type column named
Date_Col. The following LOAD DATA statement loads input records that
contain date information stored as binary integers into the Date_Col column
in the Period table:
load data
inputfile 'aroma_period.txt'
replace
discardfile 'aroma_discards'
discards 1
into table period (
perkey integer external (4),
month char (4),
year integer external (4),
quarter integer external (1),
tri integer external (10),
date_col integer date 'YYYYMMDD'
) ;
In the preceding example, the input records contain the date information in the format 'YYYYMMDD' (for example, 19971225) stored as a binary integer. The TMU extracts the date information from the binary input and stores it as a DATE data type in the Date_Col column.
Writing a SYNCH Statement
If data is loaded into an offline segment, you must complete the load operation by synchronizing the segment with the table and its indexes before the segment can be brought online and made available for use. Synchronization is necessary only for offline load operations. If the segment into which data was loaded was online at the time of the load, synchronization is not necessary.
Important: The SYNCH operation acquires an exclusive lock on the target table, but this operation is much quicker than an online load of the table.
To perform this synchronization, run the TMU with a control file that contains
a SYNCH OFFLINE SEGMENT statement. You can include this statement in the
same control file as the LOAD DATA statement or you can put it in a separate
control file. At the end of the synchronization operation, the work segment
used for the offline load is detached from the table and is available for reuse.
If you decide, after loading data into the segment, that you want to remove
the newly loaded data rather than incorporate it into the table, you have two
choices:
■ Delete all the data in the segment with the ALTER SEGMENT…CLEAR
statement. This choice is appropriate if the segment was empty or if
you do not want the data that was in the segment before the load
operation.
■ Delete only the newly loaded data with the UNDO LOAD option to
the SYNCH SEGMENT statement. This choice is useful for segments
that contained data you want to preserve before the offline operation
was performed.
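For the second choice, a hedged sketch (the segment and table names are taken from the example later in this section) adds the UNDO LOAD option to the SYNCH statement:

synch offline segment s_1q96 with table sales
undo load;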
SYNCH OFFLINE SEGMENT segment_name WITH TABLE table_name
    [DISCARDFILE 'discard_filename'] [UNDO LOAD] ;

segment_name
    Offline segment that contains newly loaded data not yet synchronized with the owning table and its indexes.
DISCARDFILE discard_filename
    File to which all duplicate rows are discarded. This ASCII file contains rows in the same format as those rows discarded during an optimized load or an UNLOAD EXTERNAL operation on a table. For more information about discard files, refer to page 3-60.
UNDO LOAD
    Synchronizes the segment with the table and its indexes by deleting all the rows that were added to the segment. This operation is useful when you discover you loaded the wrong data or when a lot of rows are discarded unexpectedly and you want to start over. It removes all evidence of the previous offline load operation, leaving intact the rows that were in the segment before the offline load.
Example
The following example shows a control file that contains a LOAD DATA
statement and a SYNCH SEGMENT statement to synchronize the newly
loaded offline segment with the rest of the table.
load data
inputfile 'sales_96_data'
append
discardfile 'discards_sales_96' discards 3
into offline segment s_1q96 of table sales
working_space work01 (
perkey date (10) 'MM/Y*/d01',
prodkey integer external (2),
mktkey integer external (2),
dollars integer external (3)
) ;
synch offline segment s_1q96 with table sales
discardfile 'discards_synch';
Because the SYNCH operation requires an exclusive lock on the table, you
might prefer to use a separate control file for that operation so you can
perform it at a time when users are not accessing the table.
After the SYNCH operation, you must use ALTER SEGMENT…ONLINE before
you can access the segment.
Format of Input Data
The TMU supports a wide variety of input data formats. However, not all
platforms support all formats. The TMU accepts both disk and tape input and
system standard input. Tape files can be ANSI standard label or TAR (Tape
ARchive) formats. The record format for disk input files can be fixed,
variable, separated, or XML. XML input files cannot be loaded from tape.
Not all combinations of data, record format, and tape formats are valid. The
following table defines the valid combinations.
[Table: valid combinations of data format, record format, and tape format]
The 1/4-inch cartridge input device is supported only for TAR tapes. The
3480/3490 18-track cartridge is supported only by variable-block-length
device drivers for ANSI Standard Label tapes.
The TMU also provides limited support for IBM standard label tapes. It can
read IBM standard label tapes with fixed-length records in EBCDIC FB format.
However, it cannot read variable-length (VB or VBS) tapes. The filenames on
the tape must be uppercase.
The TMU also supports an internal storage format, UNLOAD, which loads
data files written with a TMU UNLOAD control file. You can write UNLOAD-
format tapes in either TAR or standard-label format.
The TMU handles disk and tape files differently with respect to file format,
record length, format, and data type, as the following sections describe.
Disk Files
Disk files can contain fixed-format, variable-format, separated-format, or
XML input data. If no format keywords are specified in the LOAD DATA
statement, the TMU assumes that fixed-format records will be loaded.
Fixed-Format Records
For fixed-format records, all records are a fixed length and all field types are
allowed. The TMU determines the record length from the defined size of
RECORDLEN in the FORMAT clause of the LOAD DATA statement according
to the following rules:
To read EBCDIC in fixed-record format, include the FORMAT IBM clause in the
LOAD DATA statement. This clause forces CHARACTER and EXTERNAL fields
to convert from EBCDIC to ASCII, and INTEGER fields to convert to the
byte-ordering of the native computer.
Examples
The RECORDLEN value equals the sum of the field lengths. Data is in EBCDIC.
LOAD DATA
INPUTFILE 'mkt.txt'
RECORDLEN 126
FORMAT IBM
…
Variable-Format Records
The variable-format record is a modified version of the fixed-format record.
A variable-format record consists of a fixed-length part and a variable-length
part. Every variable-format record has the same length for the fixed-length
part, but can have a different length for the variable-length part. For a
variable-length TMU column, the length of the column is the fixed-length
part, and the real data of the column is attached in the variable-length part.
The TMU reads the fixed-length part of the record first. Next, the TMU deter-
mines the length of the variable-length part, then reads the variable-length
part.
The TMU determines the fixed-length part of the record from the defined size
of FIXEDLEN in the FORMAT clause of the LOAD DATA statement according to
the following rules:
Based on the FORMAT clause in the TMU control file, the TMU reads the input
record two ways:
After the input record is read, the TMU converts it into an internal row. The
TMU uses the appearance order of VARLEN and VARLEN EXTERNAL in the
fixed-length part to calculate the offset and length of each data section in the
variable-length part. The data sections are then put in the corresponding
output column of the internal row.
Usage
The variable-format record is more compact than the fixed-format record, and it also preserves significant trailing spaces. However, the variable-format record data-input file is more complex than the fixed-format record data-input file. Use the variable-format record format:
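A hedged sketch of such a load (all names and length values are invented, and the exact VARLEN declaration should be checked against the field_type syntax at the end of this chapter):

load data
inputfile 'prod_data'
fixedlen 64
replace
format variable
into table product (
prodkey integer external (8),
prod_desc varlen external (4)
);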
Separated-Format Records
For separated-format records, the TMU determines the length of each field in
the record by the separator character defined in the FORMAT clause of the
LOAD DATA statement. The end of each record is indicated by the newline
character (or by the end of file for the last record in the file).
Only character and external field types are allowed. Length values and
POSITION keywords are ignored.
If records in separated format are longer than 8192 bytes, then a RECORDLEN
clause, which specifies the maximum length of a record, must be used.
Examples
The fields are separated by a comma and the record ends with a newline
character:
LOAD DATA
INPUTFILE 'mkt.txt'
FORMAT SEPARATED BY ','
…
The fields are separated by a slash (/) and the record length is 126 bytes:
LOAD DATA
INPUTFILE 'mkt.txt'
RECORDLEN 126
FORMAT SEPARATED BY '/'
…
The input is read from standard input, fields are separated by a colon, and the
record ends with a newline character:
LOAD DATA
INPUTFILE '-'
FORMAT SEPARATED BY ':'
…
XML Format
XML files consist of markup tags and data content. The markup tags define
elements that have a repetitive hierarchical structure. This structure has data
values embedded in it, but these values do not readily transform into flat
database rows and columns. In order for the TMU to locate the data content
in the XML file and construct rows, the XML file is parsed (using the Xerces-C++ parser), according to rules specified in the TMU control file. The syntax that defines these rules is the "xml_path Specification" on page 3-75.
The XML paths in the TMU control file must comply with the hierarchy of the
elements in the XML file. For each column to be loaded, the control file
specifies a path that points to the location of the data. The data is always
located at the end of a series of elements that comprise the path. The data is
either the value of an element’s attribute or the character data (PCDATA)
enclosed by the element’s start and end tags.
A row is constructed from sets of data values inside the XML input file that
map to corresponding sets of XML paths in the TMU control file. Only one row
can be constructed from the data inside the start and end tags of the last
common element specified in the control file. In order for multiple rows to be
loaded, the input file must contain repetitive blocks that begin and end with
the same common element, and that element must be the last common
element.
For example, the control file for loading four columns in a table might define
these four paths:
...
prod_brand /product/brand/#PCDATA char(20),
prod_name /product/brand/category/@name char(20),
prod_grind /product/brand/category/@grind char(50),
prod_weight /product/brand/category/@weight integer external (5),
...
The last common element is brand; therefore, the content that makes up a
single row must fall between the start tag <brand> and end tag </brand> in
the input file. In the following example of a partial input file, this content is
shown in bold:
<product>
<brand>Aroma
<category name='Coffee' grind='Whole bean' weight='2'/>
</brand>
</product>
Using a control file with the paths shown above, the TMU would load the
following values:
Prod_Brand Prod_Name Prod_Grind Prod_Weight
Aroma Coffee Whole bean 2
Note that the TMU loads only the data values, not the markup tags. If only a
subset of the data values is required in the table, the TMU could load that
subset, based on an equivalent subset of XML paths in the control file. If
further TMU processing is required on the resulting rows, such as substrings
or concatenations of the XML data, that functionality can also be built into the
control file.
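For instance, a hedged sketch (the substring bounds are invented, and the SUBSTR clause placement follows the field_type syntax summarized at the end of this chapter) that keeps only the first five characters of the brand value:

...
prod_brand /product/brand/#PCDATA char (20) substr (1, 5),
...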
XML Input with a Nested Structure
If the content between the tags for the last common element produces more
than one row, the structure of the XML file is said to be nested. Whether the
structure is nested or not cannot be determined from the XML paths specified
in the TMU control file. The structure must be determined at run-time, based
on the content of the XML input file. If a nested structure is detected, the load
operation fails with an error.
Consider a case where the control file in the previous example is structured
the same but the XML input file contains repetitive sets of data values inside
the last common element (brand):
<product>
<brand> Aroma
<category name='Coffee' grind='Whole bean' weight='2'/>
<category name='Coffee' grind='Espresso' weight='1'/>
</brand>
</product>
Within the content defined by the common element <brand>, there are two
“rows,” and the TMU will return an error if you attempt to load them with the
original control file. However, you could load three of the columns by editing
the control file as follows:
...
prod_name /product/brand/category/@name char(20),
prod_grind /product/brand/category/@grind char(50),
prod_weight /product/brand/category/@weight integer external
...
Now the load will work, but it cannot load the “Aroma” character string
defined by the <brand> element. Instead, the <category> element defines
the boundary for each row and only three columns are produced:
Prod_Name Prod_Grind Prod_Weight
Coffee Whole bean 2
Coffee Espresso 1
Tape Files on UNIX Operating Systems
Tape files can be read from TAR or ANSI standard label tapes, as described in
the following sections.
TAR Tapes
TAR tape files are handled like disk files for fixed-format, variable-format, and separated-format records.
The TMU can read TAR tape files that span multiple tape volumes. However,
the TMU does not support multiple TAR archives on a single tape.
Example
The following example shows a LOAD DATA statement to read TAR tape files.
The fields are separated by a comma and the record ends with a newline
character:
LOAD DATA
INPUTFILE '/disk1/mkt.txt'
TAPE DEVICE '/tape_dev'
FORMAT SEPARATED BY ','
…
ANSI Standard Label Tapes
ANSI-standard label tapes can contain either fixed-length or variable-length
records. The TMU determines the record length from the tape label. If the
RECORDLEN or FIXEDLEN clause is present, it is ignored.
Spanned format tapes are not supported, and a single tape record is not
separated into multiple table rows.
The label determines the record length and the statement specifies the field
lengths.
load data
inputfile 'mkt.txt'
tape device 'tape_dev'
…
The label determines the record length, a comma separates the fields, and the data is in EBCDIC:
load data
inputfile 'mkt.txt'
tape device 'tape_dev'
format ibm separated by ','
…
Field Type Conversions
The TMU performs conversions between compatible field types and data
types, converting the data in each field in the input record to the data type of
the corresponding column in the table.
Datetime input data (in either datetime or binary input fields) is compatible
only with other datetime data types, as the table on page 3-136 defines.
■ The data in an input field is not compatible with the data type of the
output table column.
■ The value of a numeric input field exceeds the maximum possible
value of the output table column.
If a column is defined as NOT NULL DEFAULT NULL, and an empty field for
that column is encountered, the load operation ends. The following table
defines the allowable conversions and the results that occur for non-datetime
data types. Rows in this table represent input-record field types declared in
the TMU LOAD DATA statement. Columns in this table represent the data
types declared with the CREATE TABLE statement. The entry in each table cell
defines what can happen when input data of a given field type is loaded into
a table column of a given data type.
[Table: allowable conversions from input-record field types (rows) to the table data types Char, Varchar, Integer, Serial, Smallint, Tinyint, Decimal, Real, Double, and Float (columns)]
The following table defines the allowable data-type conversions for datetime
data types. The input-record field types include binary numeric input data.
[Table: allowable conversions from input-record field types to the DATE, TIME, and TIMESTAMP data types]
LOAD DATA Syntax Summary
The following syntax diagrams provide the complete syntax for the TMU
LOAD DATA statement.
LOAD DATA input_clause (p. 3-25) format_clause (p. 3-30) locale_clause (p. 3-39)
    table_clause (p. 3-65) segment_clause (p. 3-88) criteria_clause (p. 3-90)
    comment_clause (p. 3-95) ;
input_clause

{INPUTFILE | INDDN} {'filename' | ( 'filename' , … )} [TAPE DEVICE 'device_name']
format_clause
[RECORDLEN n | FIXEDLEN n | INTRA RECORD SKIP n]
[APPEND | INSERT | REPLACE | MODIFY [AGGREGATE] | UPDATE [AGGREGATE]]
[FORMAT IBM | FORMAT SEPARATED BY 'c' | FORMAT IBM SEPARATED BY 'c' |
 FORMAT UNLOAD | FORMAT VARIABLE | FORMAT IBM VARIABLE | FORMAT XML |
 FORMAT XML_DISCARD]
locale_clause

NLS_LOCALE 'language[_territory][.codeset][@sort]'
XML_ENCODING
discard_clause

{DISCARDFILE | DISCARDDN} 'filename' [IN {ASCII | EBCDIC}]
RI_DISCARDFILE {'filename' | ( table_name 'filename' , … )} [OTHER 'filename']
DISCARDS n
AUTOROWGEN {OFF | ON [( table_name , … )] | DEFAULT [( table_name , … )]}
rowmessages_clause

ROWMESSAGES 'filename'
optimize_clause

OPTIMIZE {OFF | ON} [DISCARDFILE 'filename']
mmap_index_clause

MMAP INDEX ( pk_index_name , … ) [SEGMENT ( segment_name , … )]
table_clause

( {col_name [AS $pseudocolumn] | $pseudocolumn} {simple_field | concat_field |
  constant_field | sequence_field | increment_field} [RETAIN | DEFAULT] , … )
simple_field

field_type [POSITION ( start [: end] )] [xml_path]
[ROUND | LTRIM | RTRIM | TRIM | ADD | SUBTRACT | MIN | MAX |
 ADD_NONULL | SUBTRACT_NONULL | MIN_NONULL | MAX_NONULL]
xml_path
concatenated_field

[LTRIM | RTRIM] ( {column_name | $pseudocolumn | 'character_string'} , … )
['character_string' , RIGHT]
constant_field

CONSTANT {NULL | 'character_literal' | float_constant | integer_constant |
 DATE 'date_literal' | TIME 'time_literal' | TIMESTAMP 'timestamp_literal' |
 'alternative_datetime_value'}
sequence_field

SEQUENCE [( start [, increment] )]
increment_field

INCREMENT [( n )]
segment_clause

( col_name [AS $pseudocolumn] {simple_field | concat_field | constant_field |
  sequence_field | increment_field} , … )
criteria_clause on noncharacter column
criteria_clause on character column
comment_clause

COMMENT 'character_string'
field_type

CHARACTER | CHAR [( length )] [SUBSTR ( start , num )]
VARLEN [EXTERNAL]
INTEGER EXTERNAL | INT EXTERNAL [( length [, scale] )]
DECIMAL EXTERNAL | DEC EXTERNAL [( length [, scale] )] [RADIX POINT 'c']
FLOAT EXTERNAL [( length )]
DECIMAL | DEC [PACKED | ZONED] [( length [, scale] )] [restricted_date_spec]
INTEGER | INT [( scale )] [restricted_date_spec]
SMALLINT
TINYINT
REAL
DOUBLE PRECISION
field_type (continued)

DATE [( length )] ['date_mask'] | CURRENT_DATE
TIME [( length )] ['time_mask'] | CURRENT_TIME
TIMESTAMP [( length )] ['timestamp_mask'] | CURRENT_TIMESTAMP
M4DATE [( length )] [m4date_mask]
restricted_date_spec

DATE 'restricted_date_mask'
TIME 'restricted_time_mask'
TIMESTAMP 'restricted_timestamp_mask'
Chapter 4
Unloading Data from a Table
In This Chapter . . . . . . . . . . . . . . . . . . . . 4-3
The UNLOAD Operation. . . . . . . . . . . . . . . . . 4-4
Internal Format . . . . . . . . . . . . . . . . . . . 4-5
External Format . . . . . . . . . . . . . . . . . . 4-5
Data Conversion to External Format . . . . . . . . . . . 4-6
UNLOAD operations can be performed both locally and remotely; for infor-
mation about remote TMU operations, see page 2-12.
The UNLOAD Operation
The TMU UNLOAD operation is a flexible operation that you can use for many
purposes, which include:
You can unload an entire table or just the data in a specified segment. The
TMU can also perform a selective unload. By specifying constraints in a
WHERE clause in the UNLOAD statement, you can select the rows to be
unloaded. An UNLOAD operation can unload a maximum of 2,147,483,647 rows (2^31 - 1). If the table you want to unload contains more rows, break the operation into two separate unloads and apply constraints. Alternatively, use the SQL EXPORT command with a SELECT * query against the table.
You can specify whether the rows of a table are unloaded in the order of the
data (by doing a relation scan of the table) or in the order of one of the table
indexes.
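A hedged sketch of a selective unload in index order (the index name, column, and constraint values are invented):

unload sales using index sales_pk_idx external
outputfile 'sales_subset.txt'
where perkey >= 100 and perkey <= 200;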
You can also pipe the output data to another program for additional
processing, such as compressing or filtering.
The rb_cm utility uses the TMU unload capability. For more information
about this utility, refer to “The rb_cm Utility” on page 7-4.
You can unload data in two formats with the TMU UNLOAD statement: an
internal binary format and an external character-based format. Data you
unload in the internal format can be reloaded only into an IBM Red Brick
Warehouse database on the same platform. Data unloaded in external format
can be reloaded into a Red Brick database on the same platform or a different
platform.
Tip: You can use the SQL high-speed EXPORT command in the database server, which unloads data in various formats, to export the results of any query to a specified file. For more information on the EXPORT command, see the SQL Reference Guide.
Internal Format
Unloading a table to the internal format creates a binary output file. The TMU quickly reloads internal-format files; however, the files can be reloaded only on a system on the same platform. For example, you cannot unload a
table to the internal format on an HP 9000 and reload it on an IBM RISC
System/6000, or vice versa. In this context, the same platform also means that
the two systems must use the same Red Brick binaries—either 32-bit or 64-
bit. You cannot unload a table from a 32-bit system and reload the binary
output file on a 64-bit system.
External Format
Unloading a table to the external format creates an output file that you can
reload on the same or a different platform. Because external-format files are
character based, you can also read and edit the files, if necessary. You can also
use the files with other applications. For example, you can import the data
unloaded into an external-format file into a desktop spreadsheet application.
When you unload data in external format, multibyte characters are preserved
in data and table and column names, but data is not localized. Numeric and
datetime data are formatted according to ANSI SQL-92 rules for these data
types.
When the TMU unloads data in the external format, it generates the following
character-based files:
To reload the external-format unloaded data, invoke the TMU by using the
automatically generated TMU control file (named with the TMUFILE
keyword), which contains the LOAD DATA statement.
Data Conversion to External Format
The following table defines how data from a table is mapped into an external
data file.
[Table: data types, number of bytes, format, and notes]
UNLOAD Syntax
UNLOAD table_name [USING INDEX index_name] [SEGMENT ( segment_name , … )]
    [EXTERNAL | VARIABLE]
    {OUTPUTFILE | OUTPUTDDN} 'filename'
    [WHERE [NOT] search_condition (p. 4-13) [{AND | OR} [NOT] search_condition] …] ;
USING INDEX index_name
    Index to use for the unload operation. The index order determines the order in which the rows of data are unloaded. If no index is specified, the data is unloaded by a table scan.
TMUFILE 'filename'
File to which the TMU automatically writes a LOAD DATA statement during an unload-to-external operation. After unloading the data, you can use this file as the control file when you invoke the TMU to reload the data.
Important: If the UNLOAD statement appears in a control file that the rb_cm utility uses, OUTPUTFILE must be set to standard output.
If the output from a table or segment is piped to another program, the filename reference is '| command', where command is an operating-system program to which the output should be piped. For example, the following statement unloads the Sales table by using external format and pipes the output to the compress program, which compresses and writes the data to a file named outdata_sales:

unload sales external OUTPUTFILE '| compress > outdata_sales'
Tip: If you know that the search condition chooses rows from only a specific set of segments (for example, if the search condition contains a constraint on the segmenting key), you can achieve the fastest performance by listing only those segments in a SEGMENT clause in the UNLOAD statement.
The search_condition supports the following predicate forms:

    column_name { < | > | <= | >= } value
    column_name IS [NOT] NULL
    column_name [NOT] LIKE 'character_string' [ESCAPE 'c']
Unloading or Loading Internal-Format Data
To unload a table in the internal format (default):

To reload internal-format data into a table:

1. If you are creating a new table, create it with the SQL CREATE TABLE statement, either one that you write or one that the TMU generated.

Important: The CREATE TABLE statement must create either the same table as the table from which the data was unloaded or a table with the same number, type, and order of columns.

2. Prepare a control file that contains a LOAD DATA statement that specifies FORMAT UNLOAD and the name of the file that contains the unloaded data.

Important: A LOAD DATA statement with FORMAT UNLOAD cannot contain field specifications.

3. Invoke the TMU and specify the control file you created in step 2.

4. Create any needed indexes, synonyms, and views. As the table is loaded, the TMU automatically builds primary-key indexes and updates other existing indexes.
You can also use pipes to accomplish the unload and load processes without
using an intermediate tape or disk file. For more information about using
pipes, refer to your operating-system documentation.
To unload the Market table in the internal format and place the contents into the file market.output:

To reload the Market table with internal-format data:

Important: The output file specified in the UNLOAD statement is used as the input file of the LOAD DATA statement.
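As a sketch, an internal-format unload statement and the matching reload control file might look like this (the output file name market.output is illustrative):

unload market outputfile 'market.output';

LOAD DATA INPUTFILE 'market.output'
FORMAT UNLOAD
INSERT
INTO TABLE market;

The first statement goes in the unload control file; the second serves as the reload control file described in the preceding procedure.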
Unloading or Loading External-Format Data
To unload a table in either external fixed or external variable format:

To reload external-format data into a data warehouse table:
1. If you are creating a new table, create it with the SQL CREATE TABLE statement, either one that you write or one that the TMU generated automatically with a TMU UNLOAD or GENERATE statement executed on the table from which the data was unloaded.

Important: The CREATE TABLE statement must create either the same table as the table from which the data was unloaded or a table with the same number, type, and order of columns.

2. Review the file that contains the automatically generated TMU LOAD DATA statement to be sure the input file name is correct. If loading from tape, edit the TAPE DEVICE clause to specify the correct device name.

3. Invoke the TMU, specifying the file that contains the LOAD DATA statement as the TMU control file.

4. Create any needed indexes, synonyms, and views. As the table is loaded, the TMU automatically builds primary-key indexes and updates other existing indexes.
The following example shows how to unload a table into a file by using the
external fixed format. The Sales table is unloaded to the sales.output file. The
data is then reloaded by using the automatically generated TMU file.
To unload the Sales table in the external format and place the contents into the file sales.output:

To reload the Sales table in a new database:
1. Create the table by using the DDL file sales.create. For example, if you are using the RISQL Entry Tool, you can execute the file as follows:

   RISQL> run sales.create ;

2. Modify the sales.load file, which contains the LOAD DATA statement, to correctly specify the input tape device.

3. Invoke the TMU from the command line, specifying sales.load as the control file:

   rb_tmu sales.load db_username db_password

4. Create any needed indexes, synonyms, and views.
Converting a Table to Multiple Segments
If a table resides in a single segment, you can use the unload operation to split
the data among additional new segments as follows:
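The general shape of the operation follows the unload, re-create, reload pattern described earlier in this chapter; a sketch with illustrative names, assuming the table is re-created with the desired SEGMENT clauses between the unload and the reload:

unload orders external outputfile 'orders.output' tmufile 'orders.load';

After editing the generated DDL to define the additional segments and re-creating the table, reload with:

rb_tmu orders.load db_username db_password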
Moving a Database
To move a database, create a file that contains an UNLOAD statement for each table in the database. You can specify the internal format for fastest performance or the external format for increased flexibility. However, if you move a database to a system on a different platform, you must specify the external format.
If you unload the data in the external format, either include the TMUFILE
parameter so that the TMU generates a file that contains the LOAD DATA
statements needed to reload the data or use the GENERATE LOAD DATA
statement to create the appropriate TMU file.
If you unload the data in the internal format, you must create or generate a
control file that contains the LOAD DATA statements. You can use the
GENERATE LOAD DATA statement to create the appropriate TMU file.
Be sure to write the LOAD DATA statements in the order in which the tables
must be loaded. For information about determining table order, refer to
“Determining Table Order” on page 3-14.
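For example, an unload control file for a small database moved in the external format might look like this (the table and file names follow the Aroma examples and are illustrative):

unload market external outputfile 'market.output' tmufile 'market.load';
unload sales external outputfile 'sales.output' tmufile 'sales.load';

Because Sales references Market through its Mktkey foreign key, the generated LOAD DATA statement for Market must run before the one for Sales when the data is reloaded.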
Loading External-Format Data into Third-Party Tools
After unloading a table in the external format, you can use the information in
the automatically generated TMU file to load the data into products that
accept portions of fixed input. The TMU file contains the LOAD DATA
statements for reloading and provides information about the positions of the
columns.
For example, you can load data into IBM DB2 using the DB2 LOAD utility or
into Microsoft Excel using the Excel parse function.
When you load data into Excel, Excel does not accept ANSI SQL date formats.
You can use the parse function to obtain date components and use a date
function to turn the components into an Excel date. You must also use the
Excel parse function to interpret null-indicator characters to extract columns
with null values. Because the data is in the external format, you can look at
the data so that you can set up Excel to handle its format.
Unloading Selected Rows
To unload only selected rows of a table, create an UNLOAD statement that
contains a WHERE clause that specifies which rows to unload. Only those
rows that satisfy the column constraints in the WHERE clause are written to
the unload file.
You can combine a WHERE clause with a segment list to limit the scope of the unload operation. If specific segments are listed, the search condition applies only to those segments. If no specific segments are listed, the search condition applies to the entire table. If you know that the WHERE clause unloads rows from only specific segments, the unload operation is faster if you list just those segments in a SEGMENT clause of the UNLOAD statement.
To write the rows selected by the WHERE clause to a TAR file, you must first insert them into a temporary table or unload them to a disk file; you cannot use a WHERE clause in an unload operation to a TAR file because each header block in the TAR file must know the length of the file that follows it, which is not known in the case of a selective unload operation.
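Combining the two clauses, a selective unload restricted to known segments might be sketched as follows (the segment names are illustrative):

unload sales segment (sales_seg1, sales_seg2)
external outputfile 'q1_sales_data'
where perkey >= 96001 and perkey <= 96013;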
Assume you want to unload the 2000 sales data from the Sales table in the
Aroma database. The following UNLOAD statement unloads the rows for
2000 from the Sales table, based on the Perkey column. The rows are written
in external format to a file named 2000_sales_data.
unload sales
external outputfile ’2000_sales_data’
where perkey >= 96001 and perkey <= 96053;
Assume you want to unload the 2000 sales data for the northern region from
the Sales table. The following UNLOAD statement unloads the rows for 2000
for the northern region from the Sales table, based on the Perkey and
Mktkey columns. The rows are written in internal format to a file named
2000_northern_sales_data.
unload sales
outputfile ’2000_northern_sales_data’
where perkey >= 96001 and perkey <= 96053
and ( mktkey = 6
or mktkey = 7
or mktkey = 8 );
Example: External Fixed-Format Data
The following example shows the external format generated by the TMU, as
well as the automatically generated CREATE TABLE and LOAD DATA
statements for the Market table in the Aroma database.
The file market.txt contains unloaded data from the Market table in the
external format:
00000000001 Atlanta GA Atlanta South
00000000002 Miami FL Atlanta South
00000000003 New Orleans LA New Orleans South
00000000004 Houston TX New Orleans South
00000000005 New York NY New York North
…
The TMU UNLOAD statement also creates a file named market_load.tmu that
contains the following LOAD DATA statement:
LOAD DATA INPUTFILE ’market.txt’
RECORDLEN 97
INSERT
INTO TABLE MARKET (
MKTKEY POSITION(2) INTEGER EXTERNAL(11) NULLIF(1)=’%’,
HQ_CITY POSITION(14) CHARACTER(20) NULLIF(13)=’%’,
HQ_STATE POSITION(35) CHARACTER(20) NULLIF(34)=’%’,
DISTRICT POSITION(56) CHARACTER(20) NULLIF(55)=’%’,
REGION POSITION(77) CHARACTER(20) NULLIF(76)=’%’);
Important: The table in the LOAD DATA statement and the CREATE TABLE statement is MARKET. If you use these statements to create a new table, you might want to edit them to change the name. Similarly, often you must edit input filenames or tape-device names to make them correspond to the actual physical locations.

The NULLIF keyword and the percent character (%) in the LOAD DATA statement indicate whether a column in the unloaded table contained NULL. For example, if the value in the District column for New Orleans were NULL, the unloaded data would look like this:
00000000001 Atlanta GA Atlanta South
00000000002 Miami FL Atlanta South
00000000003 New Orleans LA % South
00000000004 Houston TX New Orleans South
00000000005 New York NY New York North
…
Example: External Variable-Format Data
This example shows the external-variable format generated by the TMU, as
well as the automatically generated CREATE TABLE and LOAD DATA state-
ments for the Market table in the Aroma database.
The Market table (which includes a VARCHAR column) is created with the
following statement:
create table market (
mktkey integer not null,
hq_city varchar(20) not null,
hq_state char(20) not null,
district char(20) not null,
region char(20) not null,
constraint mkt_pkc primary key (mktkey));
Chapter 5
Generating CREATE TABLE and LOAD DATA Statements
In This Chapter . . . . . . . . . . . . . . . . . . . . 5-3
Generating CREATE TABLE Statements . . . . . . . . . . . 5-3
Generating LOAD DATA Statements . . . . . . . . . . . . 5-5
Example: GENERATE Statements and External-Format Data . . . . 5-8
In This Chapter
The TMU can automatically generate CREATE TABLE and LOAD DATA state-
ments based on existing tables. You can use these statements as templates for
creating and loading new tables. These statements can also be generated as
part of the UNLOAD process; however, the GENERATE statements provide
more flexibility.
GENERATE operations can be performed both locally and remotely; for infor-
mation about remote TMU operations, see page 2-12.
Generating CREATE TABLE Statements
To write a CREATE TABLE statement for a new table to hold unloaded data or
to create a template for a new table that is similar to an existing table, you can
use the TMU GENERATE CREATE TABLE statement to generate one instead of
generating it as part of the UNLOAD operation. You can either use the
statement as generated or edit it to make any necessary changes (for example,
modifying filenames or table names or adding segment information or
MAXROWS or MAXROWS PER SEGMENT values).
GENERATE CREATE TABLE FROM table_name
    DDLFILE 'filename' ;

DDLFILE 'filename'
File to which the TMU writes the generated CREATE TABLE statement. The file does not include any segment information. You can use this file to create a table on any platform.
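For example, to write the generated DDL for the Sales table to a disk file:

generate create table from sales ddlfile 'sales.create';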
UNIX The following example shows how you can use system pipes and a remote
shell command (rsh) to create the table on a remote host. A CREATE TABLE
statement is generated for the existing table named Sales. Instead of writing
the generated statement to a disk file, however, the generated statement is
passed to a system pipe and executed with an rsh remote-shell command on
a UNIX host named north1.
The remote shell executes the cat UNIX command to copy the remote shell
input to a file named sales.create. The cat command is enclosed in quotation
marks because it contains the greater-than character (>) to redirect output. In
this single operation, a CREATE TABLE statement is automatically generated
and copied to a disk file on a remote host.
generate create table from sales
ddlfile ’| rsh north1 "cat > sales.create"’;
As in the previous example, generate a CREATE TABLE statement for the table named Sales and pass the output to a remote shell on a UNIX host named north1. With the rsh UNIX remote-shell command, pass the generated CREATE TABLE statement directly to the RISQL Entry Tool running on the north1 host to create the table in an existing database. In this single operation, a replica of the Sales table is created on the remote host.
generate create table from sales
ddlfile ’| rsh north1 risql user password’;
Tip: The combination of the GENERATE CREATE TABLE statement and the remote
shell capability illustrated in this example is particularly useful with the rb_cm
utility. You can include a similar GENERATE statement in the rb_cm unload control
file before the UNLOAD statement, causing the remote table to be created immediately
before data is copied to that table. For more information about this utility, refer to
Chapter 7, “Moving Data with the Copy Management Utility.”
Generating LOAD DATA Statements
To write a LOAD DATA statement to load unloaded data or to create a
template to load similar data into a new table, you can use the TMU
GENERATE LOAD DATA statement to generate one instead of generating it as
part of the UNLOAD operation. The GENERATE LOAD DATA statement allows
you to specify a name for the target table and a name for the input file. You
can either use the statement as generated or edit it to make any necessary
changes.
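A sketch, assuming the statement takes FROM, INPUTFILE, and TMUFILE clauses consistent with the generated example later in this chapter (the file names are illustrative):

generate load data from store inputfile 'store_data' tmufile 'store_load';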
TMUFILE 'filename'
Specifies the name of the file to which the TMU writes the generated LOAD DATA statement for the table. You can use this file unchanged as input to the TMU to load or reload the table from an internal or external unload-format file. You can also use the generated file as a template and edit it so that the generated field specifications match some other input format.
filename can also begin with a single vertical bar (|) character, followed by a command string. This special format causes the TMU to direct the generated output data to a system pipe rather than to a file. The generated LOAD DATA statement serves as input to the command string run as a shell command.
Example: GENERATE Statements and External-Format Data
This example illustrates the CREATE TABLE and LOAD DATA statements for
the Store table in the Aroma database that are generated by the GENERATE
statement, and the external-format data produced by an UNLOAD statement.
The file named recreate_store contains the following SQL statement to create
the Store table:
CREATE TABLE STORE (
STOREKEY INTEGER NOT NULL UNIQUE,
MKTKEY INTEGER NOT NULL,
STORE_TYPE CHARACTER(10),
STORE_NAME CHARACTER(30),
STREET CHARACTER(30),
CITY CHARACTER(20),
STATE CHARACTER(5),
ZIP CHARACTER(10),
PRIMARY KEY(STOREKEY),
CONSTRAINT STORE_FKC FOREIGN KEY(MKTKEY)
REFERENCES MARKET (MKTKEY) ON DELETE NO ACTION);
The file named store_load contains the following LOAD DATA statement:
LOAD DATA INPUTFILE ’store_data’
RECORDLEN 136
INSERT
NLS_LOCALE ’English_UnitedStates.US-ASCII@Binary’
INTO TABLE NEW_STORE (
STOREKEY POSITION(2) INTEGER EXTERNAL(11) NULLIF(1)=’%’,
MKTKEY POSITION(14) INTEGER EXTERNAL(11) NULLIF(13)=’%’,
STORE_TYPE POSITION(26) CHARACTER(10) NULLIF(25)=’%’,
STORE_NAME POSITION(37) CHARACTER(30) NULLIF(36)=’%’,
STREET POSITION(68) CHARACTER(30) NULLIF(67)=’%’,
CITY POSITION(99) CHARACTER(20) NULLIF(98)=’%’,
STATE POSITION(120) CHARACTER(5) NULLIF(119)=’%’,
ZIP POSITION(126) CHARACTER(10) NULLIF(125)=’%’);
Tip: You can specify a new target table name and an input filename in the
GENERATE LOAD DATA statement. The percent character (%) in the LOAD
DATA statement indicates whether a column in the unloaded table contained
NULL.
If you unload the Store table in external format, the output looks like this:
00000000001 00000000014 Small Roasters, Los Gatos
1234 University Ave Los Gatos CA 95032
00000000002 00000000014 Large San Jose Roasting Company
5678 Bascom Ave San Jose CA 95156
00000000003 00000000014 Medium Cupertino Coffee Supply
987 DeAnza Blvd Cupertino CA 97865
00000000004 00000000003 Medium Moulin Rouge Roasting
898 Main Street New Orleans LA 70125
00000000005 00000000010 Small Moon Pennies
98675 University Ave Detroit MI 48209
…
Chapter 6
Reorganizing Tables and Indexes
In This Chapter . . . . . . . . . . . . . . . . . . . . 6-3
The REORG Operation . . . . . . . . . . . . . . . . . 6-3
REORG Operation Options. . . . . . . . . . . . . . . 6-5
The REORG Operation
The REORG operation performs the following functions:
REORG Operation Options
The REORG operation offers the following options:
Data Processing During the REORG Operation
Serial and parallel REORG operations perform identical functions. However, a parallel REORG operation uses separate tasks concurrently, whereas a serial REORG operation uses only one task that proceeds serially from one stage to the next. The REORG operation consists of the following stages:
■ Coordinator stage
❑ Validates the REORG statement.
❑ Acquires all necessary locks and sets the state of each index
being rebuilt to prevent other users from accessing the index.
❑ Clears the indexes or segments of the index being reorganized.
Also during the coordinator stage, a parallel REORG operation:
❑ Determines how many additional tasks to use for each stage.
❑ Assigns work to each task.
❑ Determines the order of the tasks in the REORG pipeline.
❑ Creates the processes (UNIX) or threads (Windows) for each
stage in the REORG pipeline and starts their execution.
■ Input stage
❑ Reads each row from the target table.
❑ Passes the data on to a conversion task.
■ Conversion stage
❑ Checks referential integrity on all foreign keys if reference
checking is enabled. If reference checking is disabled, checks
referential integrity only on foreign keys used in any STAR index
being rebuilt.
❑ Constructs a key for each index being rebuilt.
❑ Identifies which index-builder task has work to do for the
current row. The row is skipped if the key value does not belong
to any index segment being rebuilt.
❑ Passes the data on to the first index-builder task in the pipeline.
■ Index-builder stage
❑ Inserts key values into their assigned segments of an index.
❑ Passes the data on to the next index-builder task or to the
cleanup task.
■ Cleanup stage
❑ Performs the function required for the selected ON DISCARD
option.
❑ Marks the successfully rebuilt indexes valid.
❑ Reports status of the REORG operation.
[Figure: REORG task sequence — the coordinator task reads the control file and system tables; the input task reads the database table's PSUs; the conversion task consults the PK indexes of referenced tables; the index-builder tasks write the index segments; and the cleanup task writes the discard files and the database table. The key distinguishes primary I/O from status/control flow.]
Coordinator Stage
During the coordinator stage, the coordinator task receives the REORG
statement and checks the validity of the REORG parameters. It determines
how many tasks to use, assigns work for each stage and determines the order
of the tasks in the REORG pipeline. After all the stages are complete, the
coordinator ends the REORG operation.
Input Stage
During the input stage, each row from the target table is read and the information is passed on to the conversion stage. (In some cases, to perform the REORG operation faster, a STAR index might be scanned instead of the table. This choice is invisible to the user.) The number of input tasks cannot exceed the number of PSUs in the target table.
Conversion Stage
During the conversion stage, referential integrity is checked (if you have
enabled this option) and a key is constructed for each index being rebuilt. The
index-building work to be done on each row is identified, and the row is then
directed to the next stage in the pipeline.
Index-Building Stage
During the index-building stage, a key value is inserted into a specified
segment of an index by an index-builder task. An index-builder task can
insert keys into multiple segments of one index, all the segments of one
index, or all the segments of a set of indexes. Multiple index-building tasks
handle different subsets of segments of the same index and each index is built
at a different stage of the pipeline. If the OPTIMIZE option is ON, sorting and
merging strategies are used to add rows to the index. Multiple index-builder
tasks can operate simultaneously. The number of index-building tasks cannot
exceed the total number of index segments being built. No index-builder
tasks are allocated for segments not being rebuilt.
Cleanup Stage
During the cleanup stage, the discarded rows are removed and recorded in
the discard files as you specify. The cleanup task, depending on the selected
ON DISCARD option, takes action as follows:
The successfully rebuilt indexes are marked valid and the status of the
REORG operation is reported to the coordinator task.
REORG Syntax
REORG table_name
    [SEGMENT (segment_name [, segment_name]...)]
    [INDEX (index_name [SEGMENT (segment_name [, segment_name]...)])]
    [{INCLUDE | EXCLUDE} DEFERRED INDEXES | EXCLUDE INDEXES]
    [MMAP INDEX ('pk_index_name')]
    [ON DISCARD {DELETE ROW | INVALIDATE INDEX | ABORT}]
    [discardfile_clause]   (p. 6-19)
    [DISCARDS n] [ROWMESSAGES 'filename']
;
RECALCULATE RANGES
Specifies that the ranges for any STAR index rebuilt with this REORG statement are to be recalculated to split index entries evenly among the segments for the index. Use this option when you change a MAXROWS PER SEGMENT or a MAXSEGMENTS value for a table that participates in the STAR index. If this option is included, at least one index in the index list must be a STAR index.
OPTIMIZE ON, OFF
Specifies to rebuild the index or indexes in the OPTIMIZE mode. This option overrides the OPTIMIZE mode set in the rbw.config file. If this clause is not present in the REORG statement, the rbw.config file determines the default behavior. For more information on OPTIMIZE, see "Optimize Clause" on page 3-59.
ON DISCARD
Indicates the action to take when data rows fail the referential-integrity checks or contain duplicate index key values. The three options are DELETE ROW, INVALIDATE INDEX, and ABORT. The default is DELETE ROW.
discardfile Clause
The discardfile_clause specifies where to store duplicate rows or rows that fail
the referential-integrity check.
DISCARDFILE 'filename'
    [RI_DISCARDFILE {'filename' | (table_name 'filename' [, table_name 'filename']...) [OTHER 'filename']}]
DISCARDFILE 'filename'
Specifies the name of the file to which the TMU writes discarded rows. Discarded duplicate rows are always written to this file. Rows discarded because of referential-integrity failure also can be written to this file if no separate RI_DISCARDFILE filenames are specified. This option can be used only when the ON DISCARD DELETE ROW option is specified.
RI_DISCARDFILE table_name 'filename'
A table name and filename pair that names a table referenced by a foreign key in the table being reorganized and a file in which to record the discarded rows that violate referential integrity with respect to the named table.
OTHER 'filename'
Specifies a file in which to discard any rows that violate referential integrity with respect to referenced tables not named in the table name and filename pairs. If a table name and filename pair list is present and this clause is omitted, then any records that violate referential integrity with respect to tables missing from the list are written to the standard discard file (following the DISCARDFILE keyword).
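Putting the clause together with ON DISCARD, a REORG might be sketched as follows (the index, table, and file names are illustrative, and the clause order is assumed):

reorg sales
index (sales_star_idx)
on discard delete row
discardfile 'sales_discards'
ri_discardfile (market 'sales_ri_discards')
other 'sales_other_discards';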
Usage Notes
Important: To use the REORG statement, you must be a member of the DBA system role or be the owner of the table.
Referential Integrity
Referential integrity is always preserved for databases, except in the
following cases:
In any of these cases, rows can be deleted from a referenced table in a way that violates referential integrity. Because delete operations that the REORG statement performs do not cascade to referencing tables, if you reorganize a referenced table, you must also reorganize each table that references it. To restore referential integrity, perform a REORG operation with the ON DISCARD DELETE ROW and REFERENCE CHECKING ON options enabled for all tables that reference the table from which rows were deleted. The REORG operation deletes any rows that reference a deleted row, thus restoring referential integrity.
Warning: Be sure that you perform the REORG operation on the referencing table, not the table from which the rows were deleted (the referenced table).
Figure 6-2 illustrates how rows are deleted from referencing tables in a
REORG operation. Assume the Fact1 table references the Dim1 table, which
in turn references the Out1 table.
[Figure 6-2: Deletes Do Not Cascade — Fact1 references Dim1, which in turn references Out1]
After some rows are deleted from Out1, a REORG operation is performed on Dim1. Any rows in Dim1 that reference rows that were deleted from Out1 (that is, rows that violate referential integrity) are deleted by the REORG operation to preserve referential integrity. However, rows in Fact1 that reference deleted rows in Dim1 are not deleted by the REORG operation on Dim1. To delete these rows and preserve referential integrity, you must also perform a REORG operation on Fact1.
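In terms of statements, the repair sequence might be sketched as follows, assuming REFERENCE CHECKING ON is enabled for both operations (the exact spelling of that clause is not shown here):

reorg dim1 on discard delete row;
reorg fact1 on discard delete row;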
Locking Behavior
The REORG operation initially places a read lock on the database (if the
RECALCULATE RANGES option is specified, the REORG operation places a
write lock on the database). The lock on the database is released after the
REORG operation locks the table that it is modifying.
If versioning is enabled, queries can access the previous version of the table
and its indexes while the table is being reorganized. For more information on
versioning, refer to the Administrator’s Guide.
Partial-Index REORG
The limitations of a partial-index REORG operation are as follows:
Online and Offline Operation
When the segments being reorganized are offline, the indexes are not marked
invalid. Instead, an internal flag is set to indicate that the segment is being
rebuilt. When this flag is set, the segment cannot be brought online. When the
REORG operation completes successfully, the flag is reset. If the REORG is
unsuccessful, the segment remains inaccessible until it is successfully
reorganized.
During a full-index REORG operation, all index segments must be online and
remain invisible to users until the rebuilding process is complete.
Tip: A REORG operation cannot rebuild a segment marked "damaged" until the problem is corrected.
Disk Space
An index that runs out of disk space is marked invalid, but the excess rows
are not deleted from the table. If you specify the DELETE ROW option or
INVALIDATE INDEX option, only the index that runs out of space is marked
invalid and the process of building other indexes continues. If you specify the
ABORT option, the REORG operation ends immediately. If versioning is
enabled, all REORG changes are rolled back to their initial state. If versioning
is not enabled, all indexes being rebuilt are marked invalid.
Discardfile Format
Discarded rows are written in external format. You can load them into a table
by using a load script generated by an UNLOAD statement. For more
information, see “Unloading or Loading External-Format Data” on
page 4-16.
Chapter 7
Moving Data with the Copy Management Utility
In This Chapter . . . . . . . . . . . . . . . . . . . . 7-3
The rb_cm Utility . . . . . . . . . . . . . . . . . . . 7-4
System Requirements . . . . . . . . . . . . . . . . 7-5
Database Security Requirements . . . . . . . . . . . . . 7-6
The rb_cm Utility
The TMU can perform high-performance unloads and loads to and from
physical storage. The rb_cm utility provides an interface that allows you to
combine the following tasks into a single operation:
Figure 7-1 illustrates the difference between an rb_cm copy operation and the
LOAD and UNLOAD statements.
[Figure 7-1: Difference Between rb_cm and LOAD and UNLOAD — an rb_cm copy operation moves data directly across the network between the warehouse on one host and the warehouse on another, whereas separate UNLOAD and LOAD operations pass through an intermediate disk or tape file]
The rb_cm utility can copy data between any two tables (in the same
database, in different databases, in different databases in different
warehouses, or in different databases on different platforms) as long as the
column data types are compatible between the source and destination tables.
The rb_cm utility supports all of the existing TMU load and unload functions.
For example, rb_cm can unload data from the source table in internal or
external format. It can unload by column value (selective unload) or by
segment, and load in APPEND, INSERT, MODIFY, REPLACE, or UPDATE mode.
These features give you substantial flexibility in copying data between tables.
You can issue an rb_cm command from either the computer on which the
source table is located or the computer on which the destination table is
located. The following sections discuss the requirements for running the
rb_cm utility.
System Requirements
If you are performing a copy operation over a network, either from the local
computer to the remote computer or from the remote computer to the local
computer, the system requirements are as follows:
If you are copying data between tables on the same computer, no special
system requirements apply.
If you are issuing the rb_cm command from a system other than the source
or destination computer, you must be able to access the remote shell on the
source computer, and you must also be able to access the remote shell on the
destination computer from the source computer.
Database Security Requirements
To copy table data by using the rb_cm utility, the user running rb_cm must
have the necessary authorizations on both the source and destination
databases. The required permissions depend on the load mode at the
destination:
Load Mode at      Required Permission
Destination       INSERT    DELETE    UPDATE
APPEND            Yes       No        No
INSERT            Yes       No        No
UPDATE            No        No        Yes
The rb_cm Syntax
The syntax of the rb_cm command is as follows:

rb_cm [-s unload_host] [-c unload_config_path] [-h unload_rbhost]
      [-d unload_database] [-e unload_prog_path] [-f filter_cmd]
      unload_control_file unload_username unload_password
      [-s load_host] [-c load_config_path] [-h load_rbhost]
      [-d load_database] [-e load_prog_path] [-f filter_cmd]
      [-p] load_control_file load_username load_password

The first group of arguments supplies the source (unload) parameters; the
second group supplies the destination (load) parameters.
Important: The prefix “unload” refers to the source of the data; the prefix “load”
refers to the destination.
-s unload_host, load_host
    Optional. Hostname of the computer on which the corresponding
    source or destination table is located.
Windows
The value of RB_PATH is taken from the Registry, based
on the value of RB_HOST. If desired, you can specify
another logical database name. ♦
load_control_file
    Full pathname of the file that contains the LOAD statement. This file
    must be located on the computer specified by load_host. For more
    information, refer to “TMU Control Files for Use with rb_cm” on
    page 7-10.
Important: Arguments in the rb_cm command must be in the order shown in the
syntax.
TMU Control Files for Use with rb_cm
The rb_cm utility works by directing the output from a TMU UNLOAD
statement to a TMU LOAD statement. Before you run rb_cm, therefore, you
must prepare compatible TMU LOAD and UNLOAD control files.
The syntax for an UNLOAD control file for use with rb_cm is as follows:

    UNLOAD statement
    SET statement

The syntax of a LOAD DATA and/or a SYNCH OFFLINE SEGMENT control file
for use with rb_cm is as follows.
In both LOAD and UNLOAD control files, you must separate multiple control
statements with a semicolon (;). You can enclose comments either in C-style
delimiters (/*…*/), in which case they can span multiple lines, or precede
them with two hyphens (--) and end them with an end-of-line character, in
which case they are limited to a single line.
LOAD and UNLOAD Statements
The LOAD and UNLOAD statements are required for their respective control
files. The syntax of these statements differs in one respect from the syntax of
LOAD and UNLOAD statements used with the TMU directly: you cannot
specify a disk file or tape device as the destination of the unload operation
and you cannot specify a disk file or a tape device as the data source of the
load operation. You must specify standard output as the unload destination
and standard input as the load source.
For LOAD and SYNCH statement syntax, refer to Chapter 3, “Loading Data
into a Warehouse Database.” For UNLOAD statement syntax, refer to
Chapter 4, “Unloading Data from a Table.”
Specifying INTERNAL Format
If you are using the rb_cm utility to copy data between tables stored on the
same platform type (for example, if both source and destination platforms
are Sun Solaris systems), unload the data by using the internal format.
Internal format is a binary data format that requires less time to load than
external-format data.
Internal format is the default unload format and does not need to be specified
explicitly in the UNLOAD statement. The corresponding LOAD statement
must include the FORMAT UNLOAD keywords, however, to indicate that the
data to be loaded is internal-format data.
Specifying EXTERNAL Format
If you are using the rb_cm utility to copy data between tables stored on
platforms of different types (for example, if the source platform is Compaq
TRU-64 and the destination platform is Sun Solaris), you must unload the
data into an external character format. External format produces plain-text
data in the database-locale code set, data that is compatible across different
platform types.
For example, if you create a file with the following GENERATE statement and
run the TMU for this file, the TMU produces a file named load_control_file:
GENERATE LOAD FROM unload_table
INPUTFILE ’-’
EXTERNAL
TMUFILE ’load_control_file’ ;
The load_control_file contains the necessary LOAD statement with all the
column data types defined. You can edit this file to include any other TMU
directives that might be required, such as a SYNCH statement or SET option.
SYNCH Statement
If data is copied into an offline segment of a table, that segment must be
synchronized with the rest of the table. To perform this synchronization,
write a control statement that contains a SYNCH statement, including the
segment and table names. Use this statement only in conjunction with load
operations into offline segments.
SET Statements
The SET statements provide controls for various load and unload options.
You can include all of these SET statements in a LOAD control file, and all
except the INDEX_TEMPSPACE options in an UNLOAD control file.
A SET statement affects load or unload behavior only during the rb_cm
operation that uses the control file containing that SET statement. After that
session, the option value reverts to the value specified in the rbw.config file;
if no value is specified there, the option reverts to its default.
For more information about these SET statements, refer to “SET Statements
and Parameters to Control Behavior” on page 2-23.
Examples of rb_cm Operations
This section presents two scenarios in which data needs to be copied between
tables and gives examples of the required rb_cm commands: copying data
between different computers, and copying data between tables on the same
computer.
Example: Copying Data Between Different Computers
Suppose that the Aroma database is located in a data warehouse for sales and
marketing. A regional marketing team maintains a version of the Aroma
database named Southregion, which contains only the data relevant to their
region. The regional team can make changes to the Southregion database
and run long queries against it without affecting users outside the team.
Periodically, the regional team wants to copy any new rows relevant to their
region from the Sales table in the Aroma database to the Sales table in the
Southregion database. They can do this by using rb_cm with a selective
unload operation. The following figure summarizes this scenario.
[Figure: new rows are copied from the Sales table in the Aroma database
(corporate Red Brick database on host main, a Compaq TRU-64 platform) to
the Sales table in the Southregion database (regional Red Brick database on
host south1, a Sun Solaris platform).]
To perform the operation that the preceding figure describes, the adminis-
trator needs to set up an UNLOAD control file, set up a LOAD control file,
and run an rb_cm command. All of these steps are described in the following
sections. The first two steps need to be performed only once, before the
initial copy operation. Subsequent copies can reuse the same control files.
Setting Up the UNLOAD Control File
The administrator sets up an UNLOAD control file named unload_new_sales
on the host computer. The UNLOAD statement that this file contains must
unload to external format because the source and destination computers in
this example have different architectures. The UNLOAD statement must also
perform a selective unload of the relevant rows. The following UNLOAD
statement fulfills both of these requirements:
UNLOAD SALES
EXTERNAL
OUTPUTFILE ’-’
WHERE PERKEY = 94050
AND (MKTKEY = 1
OR MKTKEY = 2
OR MKTKEY = 3
OR MKTKEY = 4);
This UNLOAD statement performs a selective unload of those rows that are
relevant to the south region (where Mktkey is equal to 1, 2, 3, or 4) and are
new (where Perkey is equal to the most recent value).
Setting Up the LOAD DATA Control File
The administrator sets up a LOAD control file named load_new_sales on the
destination computer. You can use the GENERATE statement to obtain a
LOAD DATA statement as follows:
GENERATE LOAD FROM SALES
INPUTFILE '-'
EXTERNAL
TMUFILE 'load_new_sales' ;
The administrator enters this statement in a TMU control file and runs the
TMU. The TMU creates a file named load_new_sales that contains the
following LOAD statement:
LOAD DATA INPUTFILE ’-’
RECORDLEN 62
INSERT
INTO TABLE SALES (
PERKEY POSITION(2) INTEGER EXTERNAL(11) NULLIF(1)=’%’,
PRODKEY POSITION(14) INTEGER EXTERNAL(11) NULLIF(13)=’%’,
MKTKEY POSITION(26) INTEGER EXTERNAL(11) NULLIF(25)=’%’,
DOLLARS POSITION(38) DECIMAL EXTERNAL(12) NULLIF(37)=’%’,
WEIGHT POSITION(51) INTEGER EXTERNAL(11) NULLIF(50)=’%’);
Important: The TMU adds a NULLIF clause to the end of each field specification
and uses it when it loads the data. An extra position before each field is
reserved for a null indicator.
Running rb_cm
After the control files are written, a user with the necessary privileges can
issue an rb_cm command by using those files. The rb_cm command can be
issued from either the source host computer (main) or the destination host
computer (south1).
If the rb_cm command is issued from main, the source host computer, the
command for this operation (formatted for readability) might resemble the
following excerpt of code.
% rb_cm \
    -s main -c redbrick_dir -h RB_HOST -d Aroma \
    $RB_CONFIG/util/unload_new_sales maindba secret \
    -s south1 -c /south1_redbrick_dir -h RB_HOST -d Southregion \
    '$RB_CONFIG'/util/load_new_sales southdba cryptic

(The first two argument lines supply the source parameters; the last two
supply the destination parameters.)
■ The -c, -h, and -d options are present for both source and destination
and override the corresponding environment variables. These
options are not required if the corresponding variables are set.
Windows
■ You cannot use the RB_CONFIG environment variable as part of an
explicit pathname with the rb_cm utility for a source or destination
that is a Windows system. ♦
■ If you use the RB_CONFIG environment variable to specify a control
file on the remote computer, you must use appropriate escape
characters so that it is passed to the remote computer and not inter-
preted on the local computer.

If you issue an equivalent command from the destination host computer,
the command (formatted for readability) might resemble the following
excerpt of code.
% rb_cm \
    -s main -c redbrick_dir -h RB_HOST -d Aroma \
    '$RB_CONFIG'/util/unload_new_sales maindba secret \
    -s south1 -c /south1_redbrick_dir -h RB_HOST -d Southregion \
    $RB_CONFIG/util/load_new_sales SouthDBA cryptic
Important: The preceding examples are broken into multiple lines for clarity; when
you enter the rb_cm command, enter it as a single line.
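The local-versus-remote expansion behind this quoting rule can be demonstrated with a plain POSIX shell, independent of rb_cm. The following Python sketch is illustrative only; the RB_CONFIG value and pathname are hypothetical:

```python
import os
import subprocess

env = dict(os.environ, RB_CONFIG="/local/redbrick")

# Double quotes: the local shell expands $RB_CONFIG before the argument
# is passed along, so the remote side would receive a local path.
double = subprocess.run(
    ["/bin/sh", "-c", 'echo "$RB_CONFIG"/util/load_new_sales'],
    capture_output=True, text=True, env=env).stdout.strip()

# Single quotes: the variable reference survives as literal text, to be
# expanded later in the remote computer's environment.
single = subprocess.run(
    ["/bin/sh", "-c", "echo '$RB_CONFIG'/util/load_new_sales"],
    capture_output=True, text=True, env=env).stdout.strip()

print(double)  # /local/redbrick/util/load_new_sales
print(single)  # $RB_CONFIG/util/load_new_sales
```

This is why, in the examples above, the control-file path on the remote computer is single-quoted while the path on the local computer is not.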
Example: Copying Data Between Tables on the Same Computer
Suppose that the same regional marketing team from the previous example
keeps a copy of their Southregion database named Testdb, to which they can
make numerous updates and simulate various scenarios. Periodically they
need to replace the modified data in Testdb with the actual data stored in the
Southregion database. The following figure illustrates this scenario for a
single table.
[Figure: all rows are copied from the Sales table in the Southregion database
to the Sales table in the Testdb database; both regional databases reside on
the same Sun Solaris host, south1.]
To perform this operation the administrator must set up the LOAD and
UNLOAD control files and then run an appropriate rb_cm command.
Setting Up the UNLOAD Control File
The administrator sets up an UNLOAD control file named
unload_south_sales. This file contains the following UNLOAD statement:
UNLOAD
SALES
OUTPUTFILE ’-’;
Setting Up the LOAD Control File
The administrator sets up a LOAD control file named load_south_sales. This
file contains the following LOAD statement:
LOAD
INPUTFILE ’-’
REPLACE
FORMAT UNLOAD
OPTIMIZE ON
INTO TABLE SALES;
This LOAD statement replaces all the rows in the Sales table with the loaded
data.
Running rb_cm
After the control files are set up, a user with the required privileges can issue
an rb_cm command to copy the data. The command might look like the
following example:
% rb_cm \
    -d Southregion $RB_CONFIG/util/unload_south_sales SouthDBA cryptic \
    -d Testdb /mktg/local/test/util/load_south_sales TestDBA cryp007tic
Verifying the Results of rb_cm Operations
To verify that all of the rows are successfully copied by the rb_cm utility,
query the RBW_LOADINFO table in the destination database. This system
table holds information on each load operation performed against the
database, including loads that are issued as part of an rb_cm operation. This
information includes the times at which the load started and completed, the
number of rows inserted into the table, and the status of the load. For more
information on the RBW_LOADINFO system table, refer to the Administrator’s
Guide.
Chapter 8
Backing Up a Database
In This Chapter . . . . . . . . . . . . . . . . . . . . 8-3
Backup Levels and Modes . . . . . . . . . . . . . . . . 8-4
External Full Backups . . . . . . . . . . . . . . . . 8-4
Restore Rules . . . . . . . . . . . . . . . . . . . 8-5
Backup Data . . . . . . . . . . . . . . . . . . . . 8-5
Backup Strategies . . . . . . . . . . . . . . . . . . 8-6
How Many Backups? . . . . . . . . . . . . . . . 8-6
Which Level? . . . . . . . . . . . . . . . . . . 8-6
Online or Checkpoint? . . . . . . . . . . . . . . . 8-7
How Important is Data Recovery? . . . . . . . . . . . 8-7
General Recommendations . . . . . . . . . . . . . 8-8
Backup Procedure . . . . . . . . . . . . . . . . . . 8-8
In This Chapter
Most databases are routinely modified by table loads and server-based DDL
and DML operations. Such databases should be backed up on a regular
schedule in case of system or software failure. In the event of a failure, the
presence of a recent backup makes it possible to fully recover the database.
The larger the database, the more important it is to back it up.
Before deciding when and how often to back up a database, you must under-
stand the types of Table Management Utility (TMU) backups you can perform
and anticipate the time and effort involved in each case. As your database
evolves, you must consider the amount of data at risk and be prepared to
restore the database if necessary.
This chapter explains how to back up an IBM Red Brick Warehouse database
with the TMU. The chapter contains the following main sections:
Backup Levels and Modes
The TMU supports full backups (level 0) and incremental backups (levels 1
and 2).
Regardless of its level, a TMU backup can be performed in either online mode
or checkpoint mode:
■ Online backups take place while the database is “live”; both read
operations (queries) and write operations (updates and loads) are
allowed.
■ Checkpoint backups take place with the database in read-only mode—
available for read operations but not for write operations. You cannot
modify the database while a checkpoint backup is in progress.
Important: You should not restore a database to the level of an online backup; online
backups do not guarantee database recovery to a consistent state. The restore process
is not complete until you have returned the database to its state at the time of a check-
point backup.
External Full Backups
TMU incremental backups can be seamlessly integrated with full backups
performed with third-party tools and operating-system utility programs.
This combination of backups is a good solution for very large data
warehouses and for customers who have a system-wide full backup solution
already in place. For more information about this approach, see page 8-20.
Restore Rules
The following table lists the supported combinations of backup levels and
modes and indicates the associated rules for database restores:
Backup Data
IBM Red Brick Warehouse tracks changes to the database by maintaining a
bitmap in a special data segment declared as the backup segment. Every 8K
block that has changed since the last backup is recorded in the backup
segment. Therefore, when databases are backed up incrementally, only the
minimum amount of data has to be copied. The larger the database, the more
advantageous this approach becomes.
For detailed information about the backup segment, refer to “Preparing the
Database for Backups” on page 8-8.
Backup Strategies
A certain amount of planning and scheduling is required to establish a sound
backup strategy for your database. The strategy you choose is a trade-off
between the performance requirements and time constraints of your routine,
scheduled backup operations and the degree of difficulty, reliability, and time
constraints of restore operations that you might have to perform in the future.
Remember that you can schedule when the next backup should be done, but
you cannot predict when a catastrophic failure might mandate a restore
operation.
How Many Backups?
The frequency of the backups you perform should be based on the extent of
the changes that routinely occur to the database. If your data warehouse is
static during the week while users are running queries, you need not
schedule backups during the week. If, on the other hand, the database is
refreshed with new data every night, a daily incremental backup is a wise
choice. In other words, you need to know how much data is at risk at any
given time. Any data that has yet to be copied to a safe backup file or tape is
at risk.
Which Level?
The level of backup you choose depends on the extent of the changes and the
size of the database. These factors determine how long a backup might take.
A full backup takes a long time to complete, and the larger the database, the
longer it takes. Incremental backups are faster, and are very effective for
picking up relatively small changes to the database, such as a new index or
some inserts into a dimension table.
The only disadvantage to frequent incremental backups is that they can cause
the restore process to be more difficult. If there is too long an interval between
a series of level 2 backups and the last full backup, you will have to restore
all of these backups (as well as the last full backup) in order to bring the
database back to a consistent state. Level 1 backups are slower, but because
they pick up all the changes since the last level 0, they reduce the number of
incremental backups from which the database needs to be restored.
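The interaction between levels can be sketched in a few lines of Python. This is a conceptual model only, assuming the conventional semantics (a level 1 backup captures changes since the last level 0; a level 2 captures changes since the last backup of any level); it is not TMU code:

```python
def restore_chain(history):
    """Given backup levels in chronological order (e.g. [0, 2, 2, 1, 2]),
    return the indices of the backups needed for a restore, oldest first."""
    last0 = max(i for i, lvl in enumerate(history) if lvl == 0)
    chain = [last0]
    # The most recent level 1 after the level 0 supersedes earlier level 2s.
    ones = [i for i in range(last0 + 1, len(history)) if history[i] == 1]
    start = ones[-1] if ones else last0
    if ones:
        chain.append(start)
    # Every level 2 after that point must also be restored, in order.
    chain.extend(i for i in range(start + 1, len(history)) if history[i] == 2)
    return chain

# A level 1 at position 3 makes the two earlier level 2s unnecessary:
print(restore_chain([0, 2, 2, 1, 2]))  # [0, 3, 4]
# Without it, every level 2 since the level 0 is needed:
print(restore_chain([0, 2, 2, 2]))     # [0, 1, 2, 3]
```

The second call shows why a long run of level 2 backups lengthens the restore chain, and how interposing level 1 backups shortens it.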
Online or Checkpoint?
The third factor to consider is whether the database needs to be available for
modifications during the backup operation. If there is not enough time to run
a full backup while the database is unavailable for loads and updates, you
might choose to run time-consuming backups in online mode, closely
followed by quick checkpoint backups that pick up any changes that the
online backup missed. You can schedule the online backups to run anytime
but schedule the checkpoints during database “downtime.” The checkpoints
are essential; without them, the database cannot be restored to a consistent
state.
How Important Is Data Recovery?
Finally, consider how much data loss your application can afford to sustain.
You might forego regular incremental backups in the knowledge that minor
daily updates to the database are easier to reload than to back up and restore.
On the other hand, if millions of rows are added or updated every night, you
must maintain a daily checkpoint backup of those changes.
General Recommendations
As part of a reliable and efficient TMU backup program, IBM recommends
that you:
Backup Procedure
The general procedure for backing up a Red Brick database is as follows:

1. Prepare the database for backups by creating the backup segment. See
   page 8-8.
2. Run an initial full backup (online, checkpoint, or external) to provide
   the baseline data for future restore operations. See page 8-13.
3. Periodically run incremental backups, making sure that checkpoint
   backups are done frequently enough to provide a means of
   consistent database recovery. See page 8-13.
Preparing the Database for Backups
Incremental backups rely on bitmap information that indicates which blocks
in each physical storage unit (PSU) have changed because of table load opera-
tions, data manipulation commands, or the creation of new database objects.
These bitmaps allow incremental backups to be performed efficiently on very
large databases because the changed blocks of each PSU can be distinguished
from those that have not changed since the last backup.
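As an illustration of the idea, per-PSU change tracking might look like the following toy Python model (a sketch of the concept, not the actual Red Brick implementation):

```python
BLOCK_SIZE = 8 * 1024  # changes are tracked per 8K block

class DirtyBitmap:
    """Toy model of the per-PSU change bitmap behind incremental backups."""

    def __init__(self, psu_bytes):
        nblocks = (psu_bytes + BLOCK_SIZE - 1) // BLOCK_SIZE
        self.dirty = [False] * nblocks

    def mark_write(self, byte_offset):
        # Any write dirties the 8K block that contains it.
        self.dirty[byte_offset // BLOCK_SIZE] = True

    def blocks_to_back_up(self):
        # An incremental backup copies only the dirty blocks.
        return [i for i, d in enumerate(self.dirty) if d]

    def clear(self):
        # After a successful backup, all blocks are clean again.
        self.dirty = [False] * len(self.dirty)

bm = DirtyBitmap(5 * BLOCK_SIZE)       # a PSU of five 8K blocks
bm.mark_write(0)                       # write into block 0
bm.mark_write(3 * BLOCK_SIZE + 100)    # write into block 3
print(bm.blocks_to_back_up())          # [0, 3]
```

However large the PSU, only the two dirty blocks would be copied by the next incremental backup.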
To create the backup segment, issue the ALTER DATABASE CREATE BACKUP
DATA command, described in the next section.
Alternatively, you can use the Manage Segments and Manage System
wizards in the IBM Red Brick Warehouse Administrator tool to create the
backup segment.
ALTER DATABASE CREATE BACKUP DATA Command
The ALTER DATABASE CREATE BACKUP DATA command names an existing
but unused segment as the backup segment for the database. When the
command is issued, the named segment is marked as the backup segment in
the DST_DATABASES table. Only one segment per database can be defined as
the backup segment. If no backup segment is defined, TMU backup and
restore operations cannot be performed.
The following SQL statements show how to create a segment, then define it
as the backup segment.
ALTER DATABASE DROP BACKUP DATA Command
The ALTER DATABASE DROP BACKUP DATA command removes the backup
data from the database and changes the backup segment to a regular
segment. The segment itself is not dropped. After this command has been
issued, TMU backup operations can no longer be performed.
For more information about the ALTER DATABASE command, refer to the SQL
Reference Guide.
Storage Requirements for the Backup Segment
You allocate space for the backup segment by specifying the size of one or
more PSUs when you issue the CREATE SEGMENT statement. You can also
add space to the segment with an ALTER SEGMENT ADD STORAGE command.
In general, the best practice is to anticipate the amount of space the backup
segment will need and allocate that amount of space when you first create the
segment.
Because operations that access the backup segment are lock-intensive, the
backup segment should be stored on local, rather than NFS-mounted,
filesystems. The amount of local disk space the segment will require depends
on the size of the database and how much it is expected to grow. You can use
the following formula to calculate the maximum space (in kilobytes) required
for a backup segment:
[Formula: maximum space in kilobytes, computed from the total number of
segments and the total number of PSUs in the database]
For example, the Aroma database contains 41 segments and 43 PSUs (the 39
default segments consist of one PSU each, and the 2 user-defined segments
consist of 2 PSUs each).
Assuming that no new segments are created for the database, the maximum
amount of space that its backup segment will ever need is 4,108 kilobytes
(about 4 megabytes). However, if this database doubles in size, the space
allocated to the backup segment needs to be closer to 8 megabytes.
If the backup segment runs out of space when you attempt to create a new
database object, an error message is displayed and you have to resubmit the
command that failed. To avoid subsequent out-of-space errors, immediately
add more space to the backup segment by using an ALTER SEGMENT...ADD
STORAGE statement. It is also recommended that you run a full backup after
adding storage.
Important: The backup segment itself is backed up only when a TMU checkpoint
backup is run. Online TMU backups do not back up the backup segment.
Damage to the Backup Segment
If the backup segment is damaged, backup data for the database is not
maintained and no backup and restore operations can be performed until the
segment is repaired. For a procedure on repairing damaged segments, refer
to the Administrator’s Guide.
After the damaged PSUs have been repaired, IBM recommends that you issue
an ALTER SEGMENT VERIFY statement to check that the PSUs are intact, then
run a full backup of the database.
Do not use the ALTER SEGMENT FORCE INTACT command to mark a repaired
backup segment as intact unless you are sure that the database was not modified
while the backup segment was damaged. If the database was modified during this
time, the next backup operation would fail to back up all the modified blocks
and the database might be left in an inconsistent state. In this case, the only
way to restore the consistency of the database would be to run a full backup.
For more information about the VERIFY and FORCE INTACT options, refer to
the Administrator’s Guide and the SQL Reference Guide.
Altering the Backup Segment
For detailed information about the ALTER SEGMENT command, refer to the
SQL Reference Guide. The following sections identify which ALTER SEGMENT
operations can and cannot be performed on the backup segment.
Valid Operations
The following ALTER SEGMENT operations can be performed on the backup
segment:
■ ADD STORAGE
■ CHANGE EXTENDSIZE
■ CHANGE MAXSIZE
■ CHANGE PATH
■ COMMENT
■ FORCE INTACT—Use this option with caution; see “Damage to the
Backup Segment” on page 11.
■ MIGRATE TO
■ RENAME
■ VERIFY
■ OPTICAL ON/OFF
ADD STORAGE Example
The following ALTER SEGMENT command adds a PSU to the backup segment.
RISQL> alter segment backup_seg add storage ’/test/bar5’
> maxsize 2048000;
Invalid Operations
The following ALTER SEGMENT operations cannot be performed on the
backup segment:
■ ATTACH
■ CLEAR
■ DETACH
■ DROP LAST STORAGE
■ ONLINE and OFFLINE
These commands do not apply to the backup segment, which is
brought online as soon as it is created and always remains online.
The RBW_SEGMENTS system table always indicates that the backup
segment is online.
■ RANGE
■ RELEASE STORAGE
■ SEGMENT BY
How to Run a TMU Backup
To run a TMU backup, follow these steps:
[Table: tasks and the authorization required for each]
Scope of Backup Operations
The scope of a TMU backup operation is always the entire database, including
the system catalog. You cannot back up a single object or set of objects. The
only objects (or changed blocks) that are never backed up are as follows:
Database Locale
TMU backup and restore operations are fully localized. When the database is
backed up, the database locale is stored with the data. When the database is
restored, the locale of the backup must match the locale of the database;
otherwise, the restore operation fails and an error message is displayed. If
you have to re-create an empty version of a corrupted database in order to
restore it, you must create the new database with the same locale that was
saved in the backups.
Versioned Databases
TMU online backups can be performed on versioned databases, whether or
not the version log is empty when the backup starts. However, TMU check-
point backups can only begin when the version log is empty. If the version
log is not empty, the backup will wait until the vacuum cleaner daemon has
cleaned the version log. Before running a checkpoint backup, issue the
following ALTER DATABASE command:
RISQL> alter database clean version log;
If the version log contains any damaged segments, the clean operation will
not complete. In this case, try to repair the damaged segments and use ALTER
SEGMENT VERIFY commands to make sure they are intact. Do not use the
REMOVE DAMAGED SEGMENTS clause in the ALTER DATABASE command; if
you do, you might remove data blocks that should have been fixed and
included in the backup.
Before doing a restore, you should also attempt to fix any damaged segments.
However, when you clean the version log before a restore, you can include
the REMOVE DAMAGED SEGMENTS clause in the ALTER DATABASE
command:
RISQL> alter database clean version log remove damaged segments;
Then you can restore the damaged segments, using either full or partial
restore operations. Because this command only clears the damaged segment
blocks from the version log, the segments still exist in the system tables and
partial restores will work. For more details about partial restores, see
page 9-19.
Configuring the Size of Backup Files
Whether your TMU backups are written to disk, tape, or a storage
management system, the amount of backup data committed per transaction
is defined by the size of the backup and restore unit or BAR unit. The BAR unit
size represents the maximum size of individual backup files and XBSA
objects, except in those cases where the size of the backed-up blocks for a
given PSU exceeds the BAR unit size. Backup blocks for a single PSU cannot
be split across different backup files, but blocks for different PSUs can be
backed up within a single backup file.
The following diagram illustrates a case where the BAR unit size is set to 300
megabytes. Because the backup data for PSU6 is approximately 320
megabytes, it occupies a single backup file that exceeds the configured size.
The other two files adhere to the size limit and contain the backed-up blocks
for multiple PSUs.
[Diagram: three backup files. bar_unit1 (290MB) and bar_unit2 (280MB) each
hold the backed-up blocks of multiple PSUs (PSU2, PSU3, PSU5, and others),
while bar_unit3 (320MB) holds PSU6 alone and exceeds the configured size.]
The ability to store the blocks for multiple PSUs in a single backup file reduces
the number of data commits required during each backup operation. In turn,
this approach optimizes the performance of both the backup operation and
any restore operations from that backup. However, if the BAR unit size is set
too high, more data is potentially at risk while a backup operation is in
progress. If a failure occurs, the amount of data not yet committed could be
much greater and the entire current unit has to be backed up again.
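The packing rule — blocks for one PSU are never split across backup files, but one file can hold blocks for several PSUs — can be sketched as a greedy pass. This is a simplified Python model; the PSU names and sizes are illustrative:

```python
def pack_bar_units(psu_sizes_mb, bar_unit_mb):
    """Assign per-PSU backup data to backup files (BAR units).
    A PSU's blocks are never split across files; a PSU larger than
    bar_unit_mb gets its own oversized file, as described above."""
    units, current, used = [], [], 0
    for name, size in psu_sizes_mb:
        if current and used + size > bar_unit_mb:
            units.append(current)   # close the current file
            current, used = [], 0
        current.append(name)        # a lone PSU may exceed the limit
        used += size
    if current:
        units.append(current)
    return units

# With a 300MB BAR unit size, a 320MB PSU occupies its own larger file:
files = pack_bar_units(
    [("PSU1", 150), ("PSU2", 140), ("PSU3", 280), ("PSU6", 320)], 300)
print(files)  # [['PSU1', 'PSU2'], ['PSU3'], ['PSU6']]
```

Fewer, fuller files mean fewer data commits per backup, which is the performance benefit described above; the trade-off is the larger amount of uncommitted data at risk if the operation fails.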
The default BAR unit size is 256 megabytes. To change this setting, enter a
value for the BAR_UNIT_SIZE parameter in the rbw.config file. For example:
TUNE BAR_UNIT_SIZE 512M
The TMU returns an error if you enter a value less than the minimum setting
of 1 megabyte (1M).
On disk, the maximum setting for this parameter is 2 gigabytes (2G), and the
TMU returns an error if the setting exceeds 2G. However, the physical
maximum for a single-PSU backup file on disk is 2G minus 8K. On tape and
XBSA devices, the maximum size can be as much as 2 terabytes, depending
on the device type and configuration.
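The packing behavior described above (blocks for one PSU are never split across files, but one file can hold blocks for several PSUs) can be sketched in a few lines. This is an illustration of the rule only, with assumed helper names; it is not TMU code:

```python
# Sketch of the BAR-unit packing rule: a backup file is "closed" once adding
# the next PSU would exceed the BAR unit size, and a PSU larger than the
# unit size gets a file to itself (which then exceeds the configured size).

def pack_psus(psu_sizes_mb, bar_unit_mb=256):
    """Return a list of backup files, each a list of (psu, size_mb) pairs."""
    files, current, used = [], [], 0
    for name, size in psu_sizes_mb:
        if current and used + size > bar_unit_mb:
            files.append(current)          # close the current backup file
            current, used = [], 0
        current.append((name, size))       # a PSU is never split
        used += size
    if current:
        files.append(current)
    return files

# With a 300 MB unit, a 320 MB PSU still lands in a single, oversized file:
files = pack_psus([("PSU2", 150), ("PSU3", 130), ("PSU5", 140),
                   ("PSU4", 140), ("PSU6", 320)], bar_unit_mb=300)
```

The example mirrors the diagram above: two files stay under the 300 MB limit, while PSU6 occupies a 320 MB file on its own.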
Table Management Utility Reference Guide
For more information about TMU SET commands and TUNE parameters, see
page 2-23.
Recommendations
IBM recommends using the default BAR_UNIT_SIZE value for your initial
TMU backups. You can track the number of BAR units that each TMU backup
produces by checking the contents of the action_log file; see page 8-29.
Before changing the parameter, you need to consider several factors in your
application environment, including the size of the database, I/O performance
for the operating system, the average number of dirty blocks per PSU, the
configuration of the storage manager (if you are using XBSA backups), and so
on.
Remember that the BAR unit size represents the recommended size of each
backup file, not an absolute size for all backup files. At run-time, a very large
PSU with a large number of dirty blocks will cause that backup file’s size to
be rounded up to some higher value (but a value as close as possible to the
specified setting).
Backups to Tape (UNIX)
The TMU can perform backups to a wide range of non-rewind tape devices
that support UNIX open/read/write/close interfaces, using 4mm, 8mm, and
DLT tapes.
Warning: You cannot reuse the same tape for a subsequent TMU backup; the second
backup will overwrite the first backup. Make sure you mount a new tape on the device
before proceeding with a new backup operation.
Standard Label Format
The backup files on tapes are ANSI standard-label tape files (ANSI STL
format). Only one backup is allowed on a single tape; however, one backup
can span several tapes. Before backing up the database to tape files, make
sure the tape device is configured as variable-length. (The same requirement
applies to TMU LOAD and UNLOAD operations.)
Tape Device Configuration
When you run a TMU backup to tape, you must specify a logical device name
in the command, as shown on page 8-22. This logical name must point to a
physical device specified in the rbw.config file. For example:
BARTAPE dev1 /dev/rmt0
When the backup starts, the logical name (in this case, dev1) is found in the
rbw.config file and the mapped physical device /dev/rmt0 is used for the
backup. In this way, DBAs can restore a backup using a different tape device
from the one that was originally used for the backup operation itself. The
BARTAPE entries can be edited at any time to update the mapping of logical-to-physical device names.
The three parts of each BARTAPE entry must be separated by spaces. The
logical name can contain any combination of alphanumeric characters. The
physical name must be the exact name of the device.
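As a rough illustration, resolving a logical device name from BARTAPE entries might look like the following. The parsing here is an assumption based only on the three-part format described above; it is not the TMU's actual rbw.config reader:

```python
# Sketch of resolving a logical tape device name from BARTAPE entries.
# Each entry has three space-separated parts: BARTAPE logical physical.

def resolve_tape_device(config_text, logical_name):
    """Return the physical device mapped to logical_name, or None."""
    for line in config_text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "BARTAPE" and parts[1] == logical_name:
            return parts[2]
    return None

config = "BARTAPE dev1 /dev/rmt0\nBARTAPE dev2 /dev/rmt1\n"
device = resolve_tape_device(config, "dev1")
```

Because the restore looks up the logical name at run time, remapping dev1 to a different physical device between backup and restore works without changing the backup metadata.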
Tape Capacity
When you use the TMU to back up a database to tape, you must specify a
CAPACITY value. This value represents the maximum amount of data that
can be backed up to each tape. The TMU counts every byte of uncompressed
data it writes to the tape toward this limit. Standard label and file marks are
excluded from the calculation. (Although these marks are device-dependent,
they should not exceed 10 megabytes, which is the minimum capacity you
can specify.)
If the tape device uses compression, the actual amount of space used for the
backup might be less than the capacity you specified. To take advantage of
the “extra” space, factor the compression ratio into the capacity setting.
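The compression arithmetic can be made concrete. The 2:1 ratio below is an assumed example figure, not a property of any particular drive; substitute the ratio you observe on your device:

```python
# Because the TMU counts uncompressed bytes against the CAPACITY limit,
# a drive that compresses at roughly 2:1 can absorb about twice its native
# capacity in uncompressed data.

def effective_capacity_gb(native_gb, compression_ratio):
    return native_gb * compression_ratio

capacity = effective_capacity_gb(20, 2.0)   # 20 GB native tape at 2:1
```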
Using a Storage Manager for TMU Backups
The X/Open Backup Services API (XBSA) is implemented by several storage
management products, including IBM Tivoli Storage Manager. The TMU
supports backups via the XBSA interface, whereby control of the location and
contents of the backup is managed entirely by the configured storage
manager.
To configure your storage manager to support XBSA backups
Using External Tools for Full Backups
You might have a very large database for which a full TMU backup takes a
long time, or you might already have an efficient process in place for taking
system backups that include all of the Red Brick files. In either case, you can
use an external tool for full backups while still taking advantage of the
optimized incremental backups that only the TMU provides. The TMU can
use any external full backup as the baseline for subsequent incremental
backups. If a reliable means of taking full backups already exists, you do not
have to re-create those backups with the TMU.
For example, you could use a system-wide backup utility such as the UNIX
dump and restore commands or a third-party tool that can back up data in
parallel to multiple tape drives. As well as providing a faster full backup,
such external tools guarantee that files outside the database but related to it
(such as the rbw.config file and initialization files) are kept in synch with the
actual database files (PSUs).
The SET FOREIGN FULL BACKUP command resets the backup segment and
effectively states that a reliable external backup is about to be created. TMU
incremental backups can then follow, just as if a TMU full backup had been done.
Issue the SET FOREIGN FULL RESTORE command immediately after an
external restore is performed and before any connections or changes can be
made to the restored database. For more details about foreign restore
operations, see page 9-11.
The SET FOREIGN FULL BACKUP and SET FOREIGN FULL RESTORE
commands require the BACKUP_DATABASE and RESTORE_DATABASE task
authorizations, respectively.
Recommended Procedure for Foreign Backup Operations
Do not allow any write operations against the database during a foreign
backup. Follow these steps:
Tip: If you already have a full backup of a database that was taken with an external
program, you can still issue the SET FOREIGN FULL BACKUP command and use
the TMU for incremental backups. However, the recommended procedure is to use the
SET FOREIGN FULL BACKUP command before running the external backup.
BACKUP Syntax
The following diagram shows how to construct a TMU BACKUP statement:
[Syntax diagram not reproduced. The surviving fragments show the XBSA
media keyword, capacity units K | M | G, the backup types ONLINE and
CHECKPOINT, backup levels such as 0 and 1, and the terminating semicolon;
the examples later in this section show the DIRECTORY form.]
TAPE DEVICE logical_device_name: Specifies the logical name of the UNIX
or Linux tape device to be used for a backup to tape. See "Tape Device
Configuration" on page 8-18.
Examples of Backup Operations
The following examples illustrate the syntax for various TMU backup
statements:
UNIX/Linux
backup to directory ’/disk1/db_bup/012202’ online level 0;
# full backup on 1/22/02
Windows
backup to directory ’e:\db_bup\012202’ online level 0;
# full backup on 1/22/02
Messages Displayed During Backups
In the following example, the TMU control file tb_db_level0.tmu backs up
the Aroma database to a directory named tb_db_backup. The messages
contain information about the backup process, the type and level of backup,
and the media used to store the backup files.
113 brick % rb_tmu -d AROMA tb_db_level0.tmu system manager
(C) Copyright IBM Corp. 1991-2002. All rights reserved.
Version 06.20.0000(0)TST
** INFORMATION ** (523) Backup of database ’AROMA’ with backup level
0, backup type CHECKPOINT, and backup media DIRECTORY started.
** INFORMATION ** (7051) Backup to
’/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.162542
.00032049.0001’ started on Monday, November 19, 2001 4:25:42 PM.
** INFORMATION ** (7061) Backup to
’/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.162542
.00032049.0001’ completed on Monday, November 19, 2001 4:25:42 PM.
** INFORMATION ** (7087) Backup of the database AROMA completed
successfully on Monday, November 19, 2001 4:25:42 PM.
** STATISTICS ** (500) Time = 00:00:00.14 cp time, 00:00:00.14 time,
Logical IO count=750, Blk Reads=0, Blk Writes=778
The following output is for a subsequent level 1 backup of the same database:
124 brick% rb_tmu -d AROMA tb_db_level1.tmu system manager
(C) Copyright IBM Corp. 1991-2002. All rights reserved.
Version 06.20.0000(0)TST
** INFORMATION ** (523) Backup of database ’AROMA’ with backup level
1, backup type CHECKPOINT, and backup media DIRECTORY started.
** INFORMATION ** (7051) Backup to
’/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.163859
.00032173.0001’ started on Monday, November 19, 2001 4:38:59 PM.
** INFORMATION ** (7061) Backup to
’/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.163859
.00032173.0001’ completed on Monday, November 19, 2001 4:38:59 PM.
** INFORMATION ** (7087) Backup of the database TBDB completed
successfully on Monday, November 19, 2001 4:38:59 PM.
** STATISTICS ** (500) Time = 00:00:00.07 cp time, 00:00:00.10 time,
Logical IO count=320, Blk Reads=0, Blk Writes=307
Backup Metadata
In order to automate the process of restoring a database to a consistent state,
the TMU relies on metadata files that maintain a history of all the backups that
have been performed since the database was created. The metadata history is
specific to each IBM Red Brick Warehouse database that exists within a single
installation of the warehouse server. The backup metadata makes it possible
to restore a database with one TMU operation, without requiring the DBA to
specify which particular backups need to be restored (or the sequence in
which they need to be restored).
The database_name directory (under $RB_CONFIG/bar_metadata) is not
created for a database until its first TMU backup is performed.
Media History File (rbw_media_history)
The rbw_media_history file is a text file that contains detailed PSU-level
information about every backup operation against the database. This backup
history is used to discover the sequence of backups that must be restored in
order to provide consistent database recovery to a specified point in time.
After each successful TMU or external backup operation, new backup records
are appended to the history file: one record per PSU for TMU backups and one
record for the whole database for external backups.
Important: The rbw_media_history file itself is not backed up by the TMU; you
should back it up regularly with an external program.
Each record in the file includes the following components:
■ Action sequence number: The number assigned to the backup session.
The same number is shared by all the PSUs that are backed up in a
single operation.
■ Backup level: 0, 1, or 2.
■ Time stamp: When the backup of the first dirty block in the PSU was
started. The time-stamp format, yyyy-mm-dd.hh:mm:ss, is defined on
page 9-14.
■ Full PSU changed: Whether the complete PSU was backed up or just
some of its blocks: True (T) or False (F).
■ Size of backup blocks: The total number of dirty blocks backed up for
this PSU.
■ PSU size: The total number of 8K blocks in the PSU (at the time of the
backup).
■ Segment name: The name of the segment to which the PSU belongs.
For an external backup, only one record is appended to the file and the PSU-level components of the record are left blank.
Editing the Media History File
The redbrick user can edit the rbw_media_history file if necessary, but
changes must be made with great care. Make sure that records required for a
valid database restore operation are not removed; otherwise, the TMU could
construct an incorrect restore sequence.
Records that belong to backup operations that are no longer needed can be
safely removed. Each backup operation has a unique backup sequence
number. First determine the sequence number for the backup operation you
want to remove, then be sure to remove all of the records that have that
number. For example, say the rbw_media_history file contains records for a
series of seven backups:
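The record-removal step can be sketched as follows. The record layout here is an assumption (one line per record, with the action sequence number as the first whitespace-separated field), because the exact file format is not reproduced in this section; verify the real layout, and keep a copy of the file, before editing it:

```python
# Sketch of pruning rbw_media_history records for one obsolete backup:
# drop every record carrying the obsolete action sequence number, and no
# others. Field positions are assumed for illustration.

def prune_history(lines, obsolete_seq):
    """Keep only records whose first field differs from obsolete_seq."""
    return [ln for ln in lines if ln.split() and ln.split()[0] != obsolete_seq]

history = ["12 0 2001-11-19.16:25:42 psu_a",
           "13 1 2001-11-20.16:38:59 psu_a",
           "13 1 2001-11-20.16:38:59 psu_b"]
kept = prune_history(history, "13")
```

Note that both records for sequence 13 are removed together, which matches the rule above: remove all of the records that share the sequence number, never a subset.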
PSU Offset Example
The “offset” defines the starting position for the backed-up blocks of a given
PSU. This information is important because a single backup file can contain
multiple PSUs. For example, assume that PSU1 is 80K (10 blocks), and 2 of its
blocks (16K) are backed up to /tmp/file1, starting from block 20 of that
backup file. In the rbw_media_history file, the offset value for PSU1 will be
20. The first 19 blocks of /tmp/file1 are occupied by dirty blocks that belong
to other PSUs.
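The offset arithmetic works out as follows, using 1-based block positions and 8K blocks:

```python
# Worked version of the offset example above: PSU1's two dirty blocks start
# at block 20 of the backup file, so 19 blocks belonging to other PSUs
# precede them.

BLOCK_KB = 8
offset_block = 20                            # 1-based starting block
preceding_blocks = offset_block - 1          # blocks owned by other PSUs
byte_offset_kb = preceding_blocks * BLOCK_KB # position of PSU1's data, in KB
psu1_backup_kb = 2 * BLOCK_KB                # the two dirty blocks = 16K
```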
Backup Log File (action_log)
Event-based records generated by TMU backup and restore operations
against a specific database are logged in the action_log file inside the
database_name directory. Each new record is appended to the end of the file.
Unlike the rbw_media_history file, the action_log file does not contain
PSU-level details and is not used by the TMU. Also, the log file contains
information about both backup and restore operations, whereas the
rbw_media_history file contains information about backups only.
The redbrick user can edit or delete the file if necessary. If the file does not
exist, the TMU creates a new one. If you want to activate a new, empty version
of the file, simply rename the current file. When the next backup operation
starts, a new action_log file will be created.
There is no limit on the maximum size of the log file other than any limit
imposed by the operating system.
The log file contains a record for each backup or restore operation that is
started and another record for each operation that is completed. The records
contain the following information:
An odd number of records (such as two “backup started” entries but only one
“backup completed” entry) implies that a certain operation failed, but the
action_log file does not contain entries that indicate what caused the failure.
The cause of the failure should be apparent from the detailed messages in the
server log files (rbwlog.*).
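The started/completed pairing check can be sketched as follows, using simplified event strings rather than the real action_log record format (which is not reproduced here):

```python
# Sketch of the "odd number of records" check: every "started" entry should
# be followed by a "completed" entry; an unmatched start suggests that an
# operation failed and the cause must be found in the server logs (rbwlog.*).

def unmatched_starts(events):
    open_ops = 0
    failures = 0
    for ev in events:
        if ev == "started":
            if open_ops:        # previous start never completed
                failures += 1
            open_ops = 1
        elif ev == "completed":
            open_ops = 0
    return failures + open_ops

# Two starts but only one completion between them: one failed operation.
count = unmatched_starts(["started", "completed", "started",
                          "started", "completed"])
```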
Example Log File Entries
The following backup log entries describe successful level 0 and level 2
backups of the Aroma database.
TMU 06.20.0000(0)TST [BAR] Backup started.
DB: /qa/local/bobr/toucaroma, HOST and USER: TOUCAN SYSTEM, DATE and
TIME: Thursday, December 20, 2001 1:29:53 PM, COMMAND: BACKUP TO
DIRECTORY /qa/local/bobr/bar0 CHECKPOINT LEVEL 0, BAR_UNIT_SIZE:
262144.
Backup to
’/qa/local/bobr/bar0/rb_bar_bobr_toucaroma_.20011220.132953.00001820.0
001’ started on Thursday, December 20, 2001 1:29:53 PM.
Backup to
’/qa/local/bobr/bar0/rb_bar_bobr_toucaroma_.20011220.132953.00001820.0
001’ completed on Thursday, December 20, 2001 1:29:54 PM.
Chapter 9
Restoring a Database
In This Chapter . . . . . . . . . . . . . . . . . . . . 9-3
Full and Partial TMU Restores . . . . . . . . . . . . . . . 9-4
Restore Path . . . . . . . . . . . . . . . . . . . . 9-4
Restore Examples . . . . . . . . . . . . . . . . . . 9-5
Example 1: Daily Level 2 Checkpoints . . . . . . . . . 9-6
Example 2: Daily Level 1 Checkpoints . . . . . . . . . 9-7
Example 3: Combined Level 1 and Level 2 Backups . . . . . 9-8
Example 4: Negative Case . . . . . . . . . . . . . . 9-9
How to Run a TMU Restore . . . . . . . . . . . . . . . . 9-10
Recommended Procedure for Foreign Restore Operations . . . . 9-11
Restore of Special Segments . . . . . . . . . . . . . . 9-11
Cold Restore Operations . . . . . . . . . . . . . . . 9-12
PSUs for Objects Created After a Restored Backup . . . . . . 9-12
RESTORE Syntax . . . . . . . . . . . . . . . . . . 9-13
Syntax Examples . . . . . . . . . . . . . . . . . 9-15
Example RESTORE Operation with Message Output . . . . 9-16
Example Output for RESTORE SHOW Operation . . . . . 9-17
Partial Restore Procedure . . . . . . . . . . . . . . . 9-19
FORCE Option . . . . . . . . . . . . . . . . . 9-20
Database Consistency After Partial Restores . . . . . . . 9-20
Partial Availability . . . . . . . . . . . . . . . . 9-21
In This Chapter
This chapter describes how to restore a database in the event of a system or
software failure. The chapter contains the following main sections:
Full and Partial TMU Restores
When you run a TMU backup, you always back up data across the entire
database, backing up either every object in its entirety or the changed
portions of every object. However, when you restore from a backup, you can
do either a full restore (the entire database) or a partial restore. A partial restore
recovers one specific segment or one specific physical storage unit (PSU).
The assumption is that you need to restore the database to the latest backup
before the failure; in practice, your restore requirements might be less
stringent. For example, you might want to restore to an earlier backup, then
reload the last set of changes to the database. In most cases your backup
strategy should make it possible for you to restore to a fixed point in time
without having to reload any data, but the TMU allows you to choose any
target date for the restore process. If you do not select a target date, the
default behavior is to restore to the date of the last checkpoint.
Restore Path
The restore path that the TMU constructs is basically the same whether you
are restoring to a specific timestamp or restoring “blind” (without an explicit
target date in the command). For all restores, the path must include at least
one checkpoint backup and at least one full backup (TMU level 0 or external).
If these critical backups are missing from the metadata history, the restore
operation will fail.
For a given target date, the restore path consists of the following backups:
a. The last level 0 backup that was taken before the target date
b. The last level 1 backup (if any) that was taken after (a) but before the
target date
c. All level 2 backups that were taken after (b) but before the target
date; if (b) does not exist, all level 2 backups that were taken after (a)
but before the target date
d. The last checkpoint backup that was taken before the target date
The restore process always stops at (d); any online backups taken after the
last checkpoint cannot be restored. The backups referred to in steps (a), (b),
and (c) could be online or checkpoint. The level 0 backup in (a) could also be
an external full backup.
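As a rough illustration, the selection rules can be expressed in code. This is a sketch of the described logic under assumed data shapes, not the TMU's internal metadata handling:

```python
# Sketch of restore-path rules (a)-(d). Backups are (day, level, kind) tuples,
# where kind is "check" or "online" and day is a simple integer for
# illustration.

def restore_path(backups, target):
    # (d) the path ends at the last checkpoint taken on or before the target
    d = max(day for day, level, kind in backups
            if kind == "check" and day <= target)
    eligible = [x for x in backups if x[0] <= d]
    # (a) the last level 0 backup in range
    a = max((x for x in eligible if x[1] == 0), key=lambda x: x[0])
    # (b) the last level 1 backup after (a), if any
    ones = [x for x in eligible if x[1] == 1 and x[0] > a[0]]
    b = max(ones, key=lambda x: x[0]) if ones else None
    # (c) all level 2 backups after (b), or after (a) if (b) does not exist
    floor = b[0] if b else a[0]
    twos = sorted(x for x in eligible if x[1] == 2 and x[0] > floor)
    return [a] + ([b] if b else []) + twos

# Weekend level 0 checkpoint, then daily level 2 checkpoints (Example 1):
path = restore_path([(0, 0, "check"), (1, 2, "check"),
                     (2, 2, "check"), (3, 2, "check")], target=3)
```

Restricting the path to backups no later than (d) also captures the rule that online backups taken after the last checkpoint cannot be restored.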
Data is automatically restored from the same backup media that was
specified in the original BACKUP command. To verify the contents and scope
of the backup from which you plan to restore the database, run the
RESTORE...SHOW command before starting the restore operation. If you have
moved any of the backup data that needs to be restored, the RESTORE
operation will fail. The only exception to this rule is the ability to switch tape
devices between backups and restores, as explained on page 8-18.
Restore Examples
The following examples illustrate a combination of different types and levels
of backups and the resulting restore paths. These examples demonstrate how
the backup strategy determines the restore path that the TMU constructs in
the event of a failure.
Example 1: Daily Level 2 Checkpoints
In this case, each daily incremental backup operation is relatively short and
provides a consistent recovery point, but the restore process potentially
consists of several operations, depending on when the failure occurs.
■ The DBA runs a level 0 checkpoint backup over the weekend to have
a fresh full backup to start each week.
■ On weekday evenings, the DBA runs level 2 checkpoint backups.
Each checkpoint backup stores only the changes that have occurred
since the previous day's backup, and each backup provides a
consistent recovery point:
[Timeline figure: a LEVEL 0 CHECKPOINT backup on the weekend, followed
by a LEVEL 2 CHECKPOINT backup each weekday; a failure occurs midweek,
and the daily level 2 checkpoints continue afterward.]
Example 2: Daily Level 1 Checkpoints
In this case, each daily incremental backup provides a consistent recovery
point but picks up all the changes since the last level 0 backup. This approach
reduces the number of restore operations but the daily backups become more
time-consuming as the week progresses.
■ The DBA runs a level 0 checkpoint backup over the weekend to have
a fresh full backup to start each week.
■ On weekday evenings, the DBA runs level 1 checkpoint backups.
Each level 1 backup is cumulative; it stores all of the changes that
have occurred since last Sunday's level 0 backup:
[Timeline figure: a LEVEL 0 CHECKPOINT backup on the weekend, followed
by a LEVEL 1 CHECKPOINT backup each weekday; a failure occurs midweek,
and the daily level 1 checkpoints continue afterward.]
Example 3: Combined Level 1 and Level 2 Backups
In this case, the DBA runs a full backup over the weekend and two backups
on each weekday, one online and one checkpoint. This approach ensures that
database modifications are allowed during business days, that a checkpoint
for database recovery exists at the end of each business day, and that a
smaller number of restore operations is required to complete a recovery.
[Timeline figure: a LEVEL 0 CHECKPOINT backup on the weekend, then an
alternating LEVEL 1 ONLINE backup and LEVEL 2 CHECKPOINT backup
each weekday; a failure occurs midweek.]
Example 4: Negative Case
This is a negative case that demonstrates the need for regular checkpoint
backups. If the DBA uses online backups every day and waits until the
weekend to run a checkpoint, it is impossible to restore the database to any
date during the week.
[Timeline figure: ONLINE backups each weekday with a CHECKPOINT
backup only on the weekend; when a failure occurs midweek, the changes
captured after the last checkpoint are shown as "Lost Changes".]
How to Run a TMU Restore
Follow these steps to run a TMU restore operation:
Task Authorization
Recommended Procedure for Foreign Restore Operations
The assumption is that the database does not exist prior to a full restore and
that there is no database-related activity in progress. Nonetheless, the DBA
should always make sure that no database activity is possible during the
restore operation.
Restore of Special Segments
The system segment, which contains the system tables, cannot be restored by
itself; it is restored as part of a full restore operation.
For versioned databases, the version log segment, which is never backed up
by the TMU, is automatically re-created, based on its definition in the restored
system catalog.
Cold Restore Operations
A cold restore is a full restore operation for a database that cannot be brought
online. If a cold restore is necessary, you have to re-create an empty version
of the same database in order to restore it, using the same environment
settings, location, and locale. You also have to re-create a database user with
the authority to do the restore (RESTORE_DATABASE authorization).
Restore both the rbw.config file and the contents of the following directory
before starting the cold restore of the database:
$RB_CONFIG/bar_metadata/database_name
These files must be backed up separately from the database; TMU backups do
not include them.
The backup segment and the version log segment (if required) are re-created
automatically during TMU restore operations.
PSUs for Objects Created After a Restored Backup
If you restore from a backup that was taken before an object was created in
the database, the PSUs for that object will not be present in the restored
database but they will exist in the filesystem. In the case of PSUs with default
names, this scenario can cause problems when the server attempts to reuse
the default names. A similar situation could arise in the case of user-defined
PSUs. In either case, after restoring from a backup, you should query the
restored RBW_STORAGE table and compare the list of segments and PSU
locations with the physical files in the system. You can then use system-level
commands to remove any unused PSUs.
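The catalog-versus-filesystem comparison suggested above can be sketched as a set difference. The paths below are illustrative examples only, and the helper is an assumption rather than a Red Brick tool:

```python
# Sketch of finding PSU files that exist on disk but are absent from the
# restored catalog (for example, after restoring from a backup taken before
# an object was created). Such files are candidates for manual removal.

def orphaned_psus(catalog_paths, disk_paths):
    """Files present on disk but not recorded in the restored catalog."""
    return sorted(set(disk_paths) - set(catalog_paths))

catalog = ["/rb/db/RB_DEFAULT_TABLES", "/rb/db/sales_psu1"]
disk = ["/rb/db/RB_DEFAULT_TABLES", "/rb/db/sales_psu1",
        "/rb/db/sales_psu2"]
orphans = orphaned_psus(catalog, disk)
```

In practice the catalog list would come from a query against the restored RBW_STORAGE table and the disk list from the filesystem; review any candidate carefully before removing it.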
RESTORE Syntax
The following diagram shows how to construct a TMU RESTORE statement:
RESTORE [ SEGMENT segment_name [ FORCE ] | PSU 'pathname' [ FORCE ] ]
    [ AS OF 'timestamp' ] [ SHOW ] ;
The timestamp value uses the format 'yyyy-mm-dd.hh:mm:ss'.
Syntax Examples
The following examples are all valid database restore statements:
■ restore;
Restore the database to the last checkpoint.
■ restore as of ’2001-12-31.11:59:59’;
Restore the database to the specified date and time.
■ restore segment sales_seg1;
Restore the specified segment to the last checkpoint.
■ restore segment sales_seg1 as of ’2001-12-31’;
Restore the specified segment to the specified date.
■ restore psu ’/rb/test/sales_psu1’ force;
Restore the specified PSU to the last checkpoint, regardless of
changes made to the PSU since the checkpoint backup was taken.
■ restore show;
Show the restore path for the database.
Example RESTORE Operation with Message Output
The following messages are displayed when a database is restored
successfully from a level 0 backup and a level 2 backup:
(C) Copyright IBM Corp. 1991-2002. All rights reserved.
Version 06.20.0000(0)TST
** INFORMATION ** (7054) Starting database restore of database
/qa/local/bobr/toucaroma.
** INFORMATION ** (7055) Starting restore from
/qa/local/bobr/bar0/rb_bar_bobr_toucaroma_.20011220.132953.00001820.0001 on
Friday, December 21, 2001 11:11:26 AM.
** INFORMATION ** (7044) Completed restore from
/qa/local/bobr/bar0/rb_bar_bobr_toucaroma_.20011220.132953.00001820.0001 on
Friday, December 21, 2001 11:11:27 AM.
** INFORMATION ** (7055) Starting restore from
/qa/local/bobr/bar2/rb_bar_bobr_toucaroma_.20011220.133620.00002002.0001 on
Friday, December 21, 2001 11:11:27 AM.
** INFORMATION ** (7044) Completed restore from
/qa/local/bobr/bar2/rb_bar_bobr_toucaroma_.20011220.133620.00002002.0001 on
Friday, December 21, 2001 11:11:27 AM.
** INFORMATION ** (560) Restore process will re-start the database
/qa/local/bobr/toucaroma now.
** INFORMATION ** (7088) Restore of the database /qa/local/bobr/toucaroma
completed successfully on Friday, December 21, 2001 11:11:27 AM.
** STATISTICS ** (500) Time = 00:00:00.14 cp time, 00:00:00.86 time, Logical IO
count=881, Blk Reads=933, Blk Writes=705
Example Output for RESTORE SHOW Operation
The output of the RESTORE...SHOW command starts with three pieces of
general backup information:
■ Level (0, 1, 2)
■ Media type (XBSA, tape, directory)
■ Type (online, checkpoint, or external)
This information is repeated for each individual backup that would feature
in the restore path. For example, the output might contain all of the detailed
information for a level 0 backup, followed by the equivalent information for
a level 2 backup.
For example, here is part of the output for a RESTORE SHOW command. In
this case, the restore path consists of a level 0 checkpoint backup and a level
1 checkpoint backup:
brick % rb_tmu -d AROMA tb_db_show.tmu system manager
(C) Copyright IBM Corp. 1991-2002. All rights reserved.
Version 06.20.0000(0)TST
** INFORMATION ** (7074) The following messages contain information about the
list of PSUs to be restored:
BACKUP_LEVEL: Level_0 MEDIA_TYPE: DIRECTORY BACKUP_TYPE: CHECKPOINT
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_IDX"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.162542.
00032049.0001" FULL_BACKUP:1 BACKUP_SIZE:41 TIMESTAMP:2001-11-19.16:25:42
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_LOCKS"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.162542.
00032049.0001" FULL_BACKUP:1 BACKUP_SIZE:8 TIMESTAMP:2001-11-19.16:25:42
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_TABLES"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.162542.
00032049.0001" FULL_BACKUP:1 BACKUP_SIZE:12 TIMESTAMP:2001-11-19.16:25:42
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_INDEXES"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.162542.
00032049.0001" FULL_BACKUP:1 BACKUP_SIZE:19 TIMESTAMP:2001-11-19.16:25:42
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_SEGMENTS"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.162542.
00032049.0001" FULL_BACKUP:1 BACKUP_SIZE:32 TIMESTAMP:2001-11-19.16:25:42
...
BACKUP_LEVEL: Level_1 MEDIA_TYPE: DIRECTORY BACKUP_TYPE: CHECKPOINT
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_IDX"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.163859.
00032173.0001" FULL_BACKUP:1 BACKUP_SIZE:42 TIMESTAMP:2001-11-19.16:38:59
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_LOCKS"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.163859.
00032173.0001" FULL_BACKUP:1 BACKUP_SIZE:8 TIMESTAMP:2001-11-19.16:38:59
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_TABLES"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.163859.
00032173.0001" FULL_BACKUP:1 BACKUP_SIZE:13 TIMESTAMP:2001-11-19.16:38:59
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_INDEXES"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.163859.
00032173.0001" FULL_BACKUP:1 BACKUP_SIZE:20 TIMESTAMP:2001-11-19.16:38:59
PSU_NAME:"/qa/local/bobr/tb_db/RB_DEFAULT_SEGMENTS"
MEDIA_ID:"/devel/local/bobr/tb_db_backup/rb_bar_AROMA__________.20011119.163859.
00032173.0001" FULL_BACKUP:1 BACKUP_SIZE:34 TIMESTAMP:2001-11-19.16:38:59
...
** STATISTICS ** (500) Time = 00:00:00.03 cp time, 00:00:00.03 time, Logical IO
count=3, Blk Reads=0, Blk Writes=2
Partial Restore Procedure
You can sometimes restore a database by using a partial restore operation.
Partial restores apply to base table and index segments and PSUs only; system
table data cannot be partially restored. (The system tables are restored as part
of a full restore operation.) Partial restore operations recover the specified
segment or PSU to its state as of the last checkpoint backup. As with full
restore operations, any subsequent changes captured by an online backup
cannot be restored.
Important: Where possible, partial restores should be avoided. If you have to use
them, it is better to use them in isolation than in combination with full restores. Try
to avoid several consecutive partial restores of the same object. There is no guarantee
that the database will be in a consistent state after any partial restore operation,
whether or not the FORCE option is required (see page 9-20).
You cannot restore a single segment or PSU if it does not exist in the database,
as recorded in the RBW_SEGMENTS or RBW_STORAGE table. For example,
you might have inadvertently dropped a table and its segment. In this case,
you must either:
IBM recommends that you do not attempt to restore a single segment or PSU
in the following cases:
■ If the table description for the table containing the segment, or the
segment that contains the PSU, has changed—for example, because
columns were added or dropped.
■ If the number of rows in the segment, or the segment that contains
the PSU, has changed. If you have inserted or deleted rows, the
number of rows has probably changed.
Important: If you do not have an up-to-date backup, you might be able to restore a
segment or a PSU by using the FORCE option. However, after the forced restore, you
will have to do some additional work to bring the database to a consistent, usable
state. It is highly recommended that you consult Customer Support before using the
FORCE option.
FORCE Option
If the segment or PSU specified in a partial restore has changed since you ran
the backup from which you are trying to restore, the changes will be lost
when the restore operation restores the segment or PSU to its backed-up state.
The lost changes could be modified segment ranges or rows that were added,
updated, or deleted.
Partial restore operations reset the backup data for the specified segment or
PSU; therefore, after restoring an object with the FORCE option, you will not
see the warning message again when you perform a subsequent restore of
that object.
Database Consistency After Partial Restores
After running any partial restore operation (with or without the FORCE
option), IBM recommends that you check the integrity of related database
objects.
Table Management Utility Reference Guide
Partial Availability
While you are restoring a damaged segment, you can give users partial access
to the affected table or index.
Appendix A. Example: Using the TMU in AGGREGATE Mode
The example in this appendix shows one way in which you can
use the TMU Auto Aggregate feature to generate and maintain
quarterly and yearly aggregates from daily input data.
■ Background
■ Strategy
■ Load Procedure: Refresh Loads
■ Load Procedure: Daily Loads
■ Results
Background
Daily input data is extracted from another system used for day-to-day
operations. This data uses daily, month-to-date, and monthly units of time.
The warehouse database is updated daily with information that includes the
new daily total and month-to-date total, as well as possible restated amounts
for previous daily totals. Twice a month the entire warehouse database is
completely reloaded from the operational system.
Strategy
The first two tasks are to decide how to capture restated daily values and to
devise a period-key strategy to handle the desired time-aggregate levels.
The fact that the daily inputs might contain restated values for previous days
requires extra care in computing the quarter-to-date and year-to-date aggre-
gates. Adding new daily totals to these aggregates does not capture any
restated daily totals for days already included in the aggregate totals. The
solution to this problem is to use the daily month-to-date totals from the
operational data, and each day subtract the month-to-date totals for the
previous day from the two aggregates and add the new month-to-date totals,
thereby capturing any restated daily totals. (Restatements for other than the
current month are captured by the semi-monthly refreshes.)
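The subtract-and-add arithmetic described above can be sketched in a few lines. The following is an illustrative Python model of the technique, not TMU syntax; the variable names and dollar amounts are invented for the example:

```python
# Keep a quarter-to-date aggregate correct when daily totals can be
# restated: each day, back out yesterday's month-to-date (MTD) total
# and add today's MTD total.

quarter_to_date = 0

def apply_daily_update(old_mtd, new_mtd):
    """Replace yesterday's MTD contribution with today's."""
    global quarter_to_date
    quarter_to_date += new_mtd - old_mtd

apply_daily_update(0, 100)    # day 1: MTD total is 100
# Day 2: today's sales are 40, and day 1 was restated from 100 to 95,
# so the new MTD is 95 + 40 = 135. The restatement is picked up
# automatically because the whole MTD total is replaced, not appended.
apply_daily_update(100, 135)
assert quarter_to_date == 135
```

Because each update replaces the entire month-to-date contribution, any restatement of an earlier day in the current month is absorbed without special handling.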
This plan requires that input data for the daily updates be split into two files,
one that contains daily totals and one that contains month-to-date totals so
that you can use only the month-to-date totals to compute the aggregates.
The semi-monthly refreshes (and the initial load) are made with three files:
the daily and month-to-date files used for the daily updates, and a file that
contains monthly data for previous months. The following figure shows
these files and their use.
For refresh and initial loads, all three files (month.dat, mtd.dat, and
daily.dat) are used; the daily updates use only the daily and month-to-date
files.
Use a period key of eight characters, two characters each for year, quarter,
month, and date: YYQQMMDD. With this format, the various levels of
aggregation are represented as the following table shows.
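With this key format, each coarser aggregation level is simply a prefix of the daily key, which is what the POSITION clauses in the aggregate load statements later in this appendix rely on. A small Python illustration (the function names are ours, not TMU's):

```python
# A daily period key is YYQQMMDD; coarser levels use shorter prefixes
# of the same key, as the Period table rows show (for example, "9602"
# for a quarter and "96" for a year).

def quarter_key(perkey):
    return perkey[:4]   # YYQQ, like position(1:4) in the load statements

def year_key(perkey):
    return perkey[:2]   # YY, like position(1:2)

daily = "96020503"      # year 96, quarter 2, month 05, day 03
assert quarter_key(daily) == "9602"
assert year_key(daily) == "96"
```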
The Dimension Tables
The dimension tables in this example, Period, Product, and Market, are
created and loaded with data as Figure A-1 through Figure A-3 show.
Figure A-1. Period Table
96010100
96010200
96010300
96020400
96020500
96020501
96020502
96020503
96020504
9601
9602
96
with this LOAD DATA statement:
load data
inputfile 'period.dat'
modify
into table period (
perkey char(8));
Figure A-2. Product Table
022
055
314
319
with this LOAD DATA statement:
load data
inputfile 'product.dat'
modify
into table product(
prodkey char(3));
Figure A-3. Market Table
478
523
with this LOAD DATA statement:
load data
inputfile 'market.dat'
modify
into table market (
mktkey char(3));
The Sales Table
Create the Sales table with the four columns used throughout this appendix:
perkey, prodkey, mktkey, and dollars.
The data for the Sales table comes from three input files:

month.dat    Contains monthly total sales dollars for January to April, 2000,
             for each month for each product in each market.
mtd.dat      Contains total sales dollars sold month-to-date (May 3, 2000) for
             each product in each market.
daily.dat    Contains total sales dollars sold for May 1 to 3, 2000, for each
             day for each product in each market.
The following figure shows the contents of these files. Each 17-character
record holds a perkey (positions 1-8), a prodkey (positions 9-11), a mktkey
(positions 12-14), and sales dollars (positions 15-17). Representative
records:

month.dat:   96010100022478132
mtd.dat:     96020500022478055
daily.dat:   96020501022478044

month.dat carries one record per month, product, and market for the completed
months; mtd.dat and daily.dat carry the corresponding month-to-date and daily
records for the current month.
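The fixed-width layout above can also be read outside the TMU. The following Python sketch mirrors the POSITION clauses used by the load statements in this appendix; the function name is ours, not part of the product:

```python
def parse_sales_record(line):
    """Split one 17-character sales record into its four fields:
    perkey in columns 1-8, prodkey in 9-11, mktkey in 12-14, and
    dollars in 15-17 (three external-format digits)."""
    return {
        "perkey": line[0:8],
        "prodkey": line[8:11],
        "mktkey": line[11:14],
        "dollars": int(line[14:17]),
    }

rec = parse_sales_record("96010100022478132")
assert rec == {"perkey": "96010100", "prodkey": "022",
               "mktkey": "478", "dollars": 132}
```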
Load Procedure: Refresh Loads
Use the following procedure to load the Sales table for the initial load and
for each semi-monthly refresh. (Steps 1 to 3 load the detail-level records.)
For Refresh Load Steps 1 to 3
Use the following LOAD DATA statements to load data into the Sales table
initially and for each semi-monthly refresh thereafter:
/*initial load */
/*loading month records */
load data
inputfile 'month.dat'
modify
into table sales(
perkey char(8),
prodkey char(3),
mktkey char(3),
dollars integer external (3));
/*loading month-to-date records */
load data
inputfile 'mtd.dat'
append
into table sales(
perkey char(8),
prodkey char(3),
mktkey char(3),
dollars integer external (3));
/*loading daily records */
load data
inputfile 'daily.dat'
append
into table sales(
perkey char(8),
prodkey char(3),
mktkey char(3),
dollars integer external (3));
For Refresh Load Steps 4 to 5
Use the following LOAD DATA statements to produce the quarterly and
quarter-to-date aggregates initially and for each semi-monthly refresh
thereafter:
/* load monthly data and compute aggregates for full quarters*/
load data
inputfile 'month.dat'
modify aggregate
into table sales(
perkey position(1:4) char (8),
prodkey position(9:11) char(3),
mktkey position(12:14) char(3),
dollars position(15) integer external(3) add);
/* load month-to-date data; compute aggregate for qtr-to-date*/
load data
inputfile 'mtd.dat'
modify aggregate
into table sales(
perkey position(1:4) char (8),
prodkey position(9:11) char(3),
mktkey position(12:14) char(3),
dollars position(15) integer external(3) add);
For Refresh Load Steps 6 to 7
Use the following LOAD DATA statements to produce the yearly and
year-to-date aggregates initially and for each semi-monthly refresh
thereafter:
/*load monthly data and compute year-to-date for prior months*/
load data
inputfile 'month.dat'
modify aggregate
into table sales(
perkey position(1:2) char (8),
prodkey position(9:11) char(3),
mktkey position(12:14) char(3),
dollars position(15) integer external(3) add);
/* load current month and complete year-to-date aggregate*/
load data
inputfile 'mtd.dat'
modify aggregate
into table sales(
perkey position(1:2) char (8),
prodkey position(9:11) char(3),
mktkey position(12:14) char(3),
dollars position(15) integer external(3) add);
Load Procedure: Daily Loads
Use the following procedure to load the Sales table each day with the daily
updates.
1. Load the daily.dat file containing daily totals for the current month,
   using MODIFY mode (to capture any restatements).
2. Load the mtd.dat.new file with the month-to-date totals as of the
   current date, using MODIFY mode.
(Steps 3 and 4 adjust the quarterly and quarter-to-date figures for any
restated totals.)
3. Subtract the month-to-date total for yesterday from the quarterly
   and quarter-to-date figures by loading the mtd.dat.old file, using
   MODIFY AGGREGATE mode and subtracting the dollars column.
4. Add the new month-to-date total for today to the quarterly and
   quarter-to-date figures by loading the mtd.dat.new file, using
   MODIFY AGGREGATE mode and adding the dollars column.
(Steps 5 and 6 adjust the yearly and year-to-date figures for any
restated totals.)
5. Subtract the month-to-date total for yesterday from the yearly and
   year-to-date figures by loading the mtd.dat.old file, using
   MODIFY AGGREGATE mode and subtracting the dollars column.
6. Add the new month-to-date total for today to the yearly and
   year-to-date figures by loading the mtd.dat.new file, using MODIFY
   AGGREGATE mode and adding the dollars column.
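Steps 3 through 6 can be modeled as follows. This is an illustrative Python sketch, not TMU behavior: aggregate rows live in a dictionary keyed by a period-key prefix plus prodkey and mktkey, and mtd.dat.old is simply yesterday's mtd.dat.new. All names and amounts are invented for the example:

```python
def apply(agg, records, key_len, sign):
    """Add (sign=+1) or subtract (sign=-1) each record's dollars into
    the aggregate row keyed by the first key_len characters of perkey."""
    for perkey, prodkey, mktkey, dollars in records:
        k = (perkey[:key_len], prodkey, mktkey)
        agg[k] = agg.get(k, 0) + sign * dollars

# State after yesterday's loads: quarter and year rows both hold 100.
agg = {("9602", "022", "478"): 100, ("96", "022", "478"): 100}
mtd_old = [("96020502", "022", "478", 100)]  # yesterday's month-to-date file
mtd_new = [("96020503", "022", "478", 135)]  # today's file, with restatements

for key_len in (4, 2):          # steps 3-4 (quarter), steps 5-6 (year)
    apply(agg, mtd_old, key_len, -1)
    apply(agg, mtd_new, key_len, +1)

assert agg[("9602", "022", "478")] == 135
assert agg[("96", "022", "478")] == 135
```

The subtract-then-add pair leaves each aggregate row equal to the latest month-to-date total, so restatements of earlier days in the month are captured without reloading history.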
For Daily Load Steps 1 and 2
Use the following LOAD DATA statements to load the daily detail data:
/*loading month-to-date records*/
load data
inputfile 'mtd.dat.new'
modify
into table sales(
perkey char(8),
prodkey char(3),
mktkey char(3),
dollars integer external (3));
/*loading daily records*/
load data
inputfile 'daily.dat'
modify
into table sales(
perkey char(8),
prodkey char(3),
mktkey char(3),
dollars integer external (3));
For Daily Load Steps 3 and 4
Use the following LOAD DATA statements to calculate the quarter and
quarter-to-date aggregates, including adjustments for any restated totals:
/*after initial loads, daily updates*/
/*using yesterday's mtd.dat file */
/*(you will always have to keep the prior day's data)*/
/* monthly totals and aggregates for previous quarters */
load data
inputfile 'mtd.dat.old' /*yesterday's file*/
modify aggregate
into table sales(
perkey position(1:4) char (8),
prodkey position(9:11) char(3),
mktkey position(12:14) char(3),
dollars position(15) integer external(3) subtract);
/* current month for current quarter-to-date data */
load data
inputfile 'mtd.dat.new' /*today's file*/
modify aggregate
into table sales(
perkey position(1:4) char (8),
prodkey position(9:11) char(3),
mktkey position(12:14) char(3),
dollars position(15) integer external(3) add);
For Daily Load Steps 5 and 6
Use the following LOAD DATA statements to calculate the yearly and
year-to-date aggregates, including adjustments for any restated totals:
/*aggregates of year-to-date data - previous months*/
load data
inputfile 'mtd.dat.old' /*yesterday's file*/
modify aggregate
into table sales(
perkey position(1:2) char (8),
prodkey position(9:11) char(3),
mktkey position(12:14) char(3),
dollars position(15) integer external(3) subtract);
/* current month; complete the year-to-date aggregate */
load data
inputfile 'mtd.dat.new' /*today's file*/
modify aggregate
into table sales(
perkey position(1:2) char (8),
prodkey position(9:11) char(3),
mktkey position(12:14) char(3),
dollars position(15) integer external(3) add);
Results
You update the database with a new daily file (daily.dat) and a new
month-to-date file (mtd.dat) for May 4, which includes some restated daily
totals. The following figure shows the new data. (Only the first row for
May 1, the three restated totals for May 1, and the May 4 data from the
daily.dat file are shown.)
Month-to-date data for May 4 (mtd.dat):
96020500022478100
96020500022523142
96020500022931093
96020500055478182
96020500055523233
96020500055931303

Daily data for May 4 (daily.dat):
96020501022478044
…
96020501314931265   (restated total)
96020501319478154   (restated total)
96020501319523060   (restated total)
…
96020504022478045
96020504022523036
96020504022931027
96020504055478058
Appendix B. Storage Manager Configuration for XBSA Backups
This appendix points out specific configuration requirements for
XBSA-compliant storage management systems. For details about
running TMU backups via the XBSA interface, see page 8-19.
■ General Guidelines
■ Informix Storage Manager (ISM)
■ Legato Networker (NSR)
■ Tivoli Storage Manager (TSM)
General Guidelines
To back up a Red Brick database to an XBSA-compliant storage
management system, you must first install and configure the
appropriate storage manager server and any required connectivity software.
Typically, apart from the “server” package itself, a “client” package and an
“Informix” or “XBSA” add-on module are required. Refer to your storage
manager documentation for details.
BAR_SM_USER Parameter
To make a successful connection to the storage manager, you must set the
BAR_SM_USER parameter correctly in the rbw.config file. This parameter
corresponds to the bsaObjectOwner variable in the XBSA specification;
however, the value of the parameter depends on the storage manager being
used. Check your storage manager’s client or API reference documents for
information about valid parameter values.
BAR_XBSA_LIB Parameter
The XBSA library provided by the storage manager vendor must be specified
with the BAR_XBSA_LIB entry in the rbw.config file. The name, suffix, and
location of the library sometimes depend on the operating system, as shown
in the following sections.
Informix Storage Manager (ISM)
Make sure the rb_tmu user (the user running backups) has ISM administrator
(-admin) privileges.
For example:
OPTION BAR_SM_USER INFORMIX
OPTION BAR_XBSA_LIB /usr/informix/lib/libbsa.so
Legato NetWorker (NSR)
Check the Legato installation guide for details about the installation
process and the packages that must be installed. Make sure the rb_tmu user
(the user running backups) has administrator privileges.
For example:
OPTION BAR_SM_USER INFORMIX
OPTION BAR_XBSA_LIB /usr/lib/libxnmi.so
Tivoli Storage Manager (TSM)
Check the Tivoli installation guide for details about the installation process.
The following packages must be installed:
■ Server software:
❑ Tivoli Storage Manager Server (“server” package)
❑ Tivoli Storage Manager Device Support (“devices” package)
■ Client software (on the machine where Red Brick server is installed):
❑ TSM Backup-Archive Client (“admin”, “api”, and “ba”
packages)
❑ Tivoli Data Protection for Informix (TDPI). See the TDPI
documentation for details.
The TSM client “node” name is used as the XBSA bsaObjectOwner variable
and the value of BAR_SM_USER. The example below uses the default CLIENT
node. On 32-bit platforms, the Informix XBSA library is under the following
directory:
tivoli_client_dir/informix/bin
The TMU and the Informix XBSA library must both be either 32-bit or 64-bit.
The example BAR_XBSA_LIB setting below is for 32-bit Solaris. TSM does not
provide a Windows NT XBSA library; however, you can use the Solaris
rb_tmu and XBSA library to run backups to a Windows NT Tivoli server.
OPTION BAR_SM_USER CLIENT
OPTION BAR_XBSA_LIB
/opt/tivoli/tsm/client/informix/bin/libTDPinf.so
Notices
Any references in this information to non-IBM Web sites are provided for
convenience only and do not in any manner serve as an endorsement of those
Web sites. The materials at those Web sites are not part of the materials for
this IBM product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it
believes appropriate without incurring any obligation to you.
Licensees of this program who wish to have information about it for the
purpose of enabling: (i) the exchange of information between independently
created programs and other programs (including this one) and (ii) the mutual
use of the information which has been exchanged, should contact:
IBM Corporation
J46A/G4
555 Bailey Avenue
San Jose, CA 95141-1003
U.S.A.
The licensed program described in this information and all licensed material
available for it are provided by IBM under terms of the IBM Customer
Agreement, IBM International Program License Agreement, or any equiv-
alent agreement between us.
Any performance data contained herein was determined in a controlled
environment. Therefore, the results obtained in other operating environ-
ments may vary significantly. Some measurements may have been made on
development-level systems and there is no guarantee that these measure-
ments will be the same on generally available systems. Furthermore, some
measurements may have been estimated through extrapolation. Actual
results may vary. Users of this document should verify the applicable data for
their specific environment.
Trademarks
AIX; DB2; DB2 Universal Database; Distributed Relational Database
Architecture; NUMA-Q; OS/2, OS/390, and OS/400; IBM Informix; C-ISAM;
Foundation.2000™; IBM Informix 4GL; IBM Informix DataBlade Module;
Client SDK™; Cloudscape™; Cloudsync™; IBM Informix Connect; IBM Informix
Driver for JDBC; Dynamic Connect™; IBM Informix Dynamic Scalable
Architecture™ (DSA); IBM Informix Dynamic Server™; IBM Informix Enterprise
Gateway Manager (Enterprise Gateway Manager); IBM Informix Extended Parallel
Server™; i.Financial Services™; J/Foundation™; MaxConnect™; Object
Translator™; Red Brick™; IBM Informix SE; IBM Informix SQL; InformiXML™;
RedBack; SystemBuilder™; U2™; UniData; UniVerse; wintegrate are trademarks
or registered trademarks of International Business Machines Corporation.
Java and all Java-based trademarks and logos are trademarks or registered
trademarks of Sun Microsystems, Inc. in the United States and other
countries.
Windows, Windows NT, and Excel are either registered trademarks or trade-
marks of Microsoft Corporation in the United States and/or other countries.
Other company, product, and service names used in this publication may be
trademarks or service marks of others.
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z @
Index
,QGH[
$ %
ABORT keyword, TMU Backup log file 8-29
REORG 6-17 Backup operations
ACCEPT keyword, TMU 2-6, 3-91 event logging 8-29
ACCESS_ANY task examples 8-24
authorization 7-6 external tools for 2-46, 8-20
action_log file 8-26 general procedure 8-8, 8-13
ADD aggregate operator, locks held during 8-13
TMU 3-73 metadata 8-26
Administrator tool, using to create objects not backed up 8-14
backup segment 8-9 preparing the database 8-8
Aggregate maintenance storage managers 8-19
described 1-7 strategy 8-6
setting 2-36 syntax diagram 8-22
AGGREGATE mode, loading syntax examples 8-24
data 3-32 tape devices 8-17
AGGREGATE operators, task authorization 8-13
TMU 3-76 versioned databases 8-14
ALTER DATABASE commands XBSA interface 8-19, B-1
CLEAN VERSION LOG 8-14, Backup segment 8-8 to 8-13
8-15 Administrator tool for
CREATE BACKUP DATA 8-9 creating 8-9
DROP BACKUP DATA 8-11 altering 8-12
ALTER SEGMENT operations, on automatically restored 9-11
backup segment 8-12 bitmap information 8-8
APPEND mode, loading data 3-31 creating 8-9
AS $pseudocolumn, TMU 3-67 damaged 8-11
Auto aggregate feature, TMU sizing 8-10
described 1-7 storage requirements 8-10
example A-1 BACKUP_DATABASE
usage 3-76 authorization 2-46, 8-13
Automatic Row Generation, TMU barxbsa utility 8-20
described 3-7, 3-16 to 3-22 bar_metadata directory 8-26
syntax and usage 3-43 BAR_SM_USER parameter 8-19,
AUTOROWGEN parameter B-2
See Automatic Row Generation. BAR_UNIT_SIZE parameter 2-45
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z @
BAR_XBSA_LIB parameter 8-19, Constant dates, loading 3-84, 3-109, unload formats 4-5
B-2 3-115 unloading 1-10, 4-3 to 4-20
Binary datetime CONSTANT fields, LOAD DATA Data processing 6-7 to 6-11
inputs 3-116 to 3-119 statement 3-84 Data source names, for remote
Bitmap information, in backup Contact information Intro-17 TMU configuration 2-13
segment 8-8 Control files, TMU 1-8 Database access 2-5
Blanks in input data 3-134 Conventions Database locale 8-14
Boldface type Intro-5 syntax diagrams Intro-7 database option (-d), TMU 2-6
Buffer cache, TMU 2-27 syntax notation Intro-6 Databases
Conversion stage backing up 8-3
in REORG operation 6-10 loading data 3-5 to 3-146
& loading data 3-8 locking by TMU 2-25
PTMU 3-11 moving 4-18
Cache size, buffer (TMU) 2-27
Coordinator stage, REORG restoring 9-3
syntax 2-27
operation 6-10 upgrading to new release 1-12
usage 2-28
Copy management. See rb_cm Datatype conversions, during load
Capacity parameter, for tape
utility. process 3-133 to 3-136
backups 8-18, 8-23
CREATE SEGMENT command 8-9 Dates, loading constant dates 3-84,
Cases tracked by Technical
CREATE TABLE control file 3-109, 3-115
Support Intro-13
from UNLOAD statement 4-10 Datetime inputs
Cautions
Criteria clause, LOAD DATA binary and packed/zoned
escape character and locale 3-93
statement decimal 3-116 to 3-119
UNDO LOAD and REPLACE
comparisons, three-valued Datetime
mode 3-121
logic 3-93 fieldtypes 3-107 to 3-116
CDATA section, in XML files 3-75
locale use 3-42 format masks 3-109
CHARACTER fieldtype, TMU 3-99
syntax and usage 3-90 restricted format masks 3-116
CHECK TABLE and CHECK
CURRENT_DATE keyword, DECIMAL fieldtype, TMU 3-101,
INDEX commands 9-21
TMU 3-107 3-104
Checkpoint backups 8-4
CURRENT_TIME keyword, DEFERRED INDEXES keyword,
Cleanup stage 3-11
TMU 3-107 TMU REORG 6-14
in REORG operation 6-11
CURRENT_TIMESTAMP DELETE ROW keyword, TMU
Client TMU 2-12
keyword, TMU 3-107 REORG 6-18
configuration 2-13
Customer Support Intro-12 Demonstration database, script to
syntax 2-14
install Intro-4
Cold restore operations 9-12
Directories, backups to 8-22
Columns, determining default
values 3-54
' Discard clause
loading data 3-43 to 3-57
Comment clause, LOAD DATA Damaged segments
reloading XML discards 3-44
statement 3-95 to 3-96 not backed up 8-14
Discard files, TMU
Comments, TMU control file 1-9 restoring 9-19
all discards 3-46
Commit record interval, Data
locale 3-42
setting 2-40 backing up 8-3
multiple 3-43
Commit time interval, setting 2-42 conversion to EXTERNAL
optimized load discards 3-60
Comparisons, TMU LOAD DATA format 4-6
referential integrity
statement 3-90 loading 1-10, 3-5 to 3-146
discards 3-46, 6-21
Compressed files loading into third-party
types and use 3-43
input to TMU 2-10 tools 4-19
Discarded rows during load 3-133
output from TMU 4-11 restoring 9-3
Discardfile clause, REORG 6-19
7DEOH0DQDJHPHQW8WLOLW\5HIHUHQFH*XLGH
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z @
DISCARDFILE keyword External full backups 8-4 Format clause, LOAD DATA
REORG 6-19 EXTERNAL keyword, TMU statement 3-29
TMU 3-46, 3-60 UNLOAD statement 4-9 FORMAT keyword, TMU
DISCARDS keyword External-variable data format, TMU IBM SEPARATED by ’c’
REORG 6-18 example 4-22 format 3-34
TMU 3-48 SEPARATED by ’c’ format 3-33
Disk file formats, input data 3-123 UNLOAD 3-34
Disk spill files, INDEX ) XML 3-34
TEMPSPACE parameters 2-28 Format masks, datetime
Fieldtypes, TMU input
Documentation datetime fieldtypes 3-109
records 3-97 to 3-116
list for IBM Red Brick numeric fieldtypes 3-116
CHARACTER 3-99
Warehouse Intro-14 Full backups
conversions during
online Intro-16 defined 8-4
load 3-133 to 3-136
DOUBLE PRECISION fieldtype, external 8-4, 8-20
CURRENT_DATE 3-107
TMU 3-106 foreign 8-4, 8-20
CURRENT_TIME 3-107
Driver TMU 2-12 syntax 8-22
CURRENT_TIMESTAMP 3-107
DST_DATABASES table 8-9 Full restores
DECIMAL 3-101, 3-104
dump and restore commands, defined 9-4
DOUBLE PRECISION 3-106
UNIX 8-20 foreign 9-11
FLOAT EXTERNAL 3-103
Duplicate records syntax 9-13
INTEGER 3-101, 3-105
discarding 3-60
M4DATE 3-108
optimize mode 3-61
REAL 3-106
scale of field 3-104
*
SMALLINT 3-105 GENERATE statements (TMU)
( TIME 3-107 CREATE TABLE syntax and
Empty input fields 3-134 TIMESTAMP 3-107 usage 5-3 to 5-5
Environment variables Intro-5 TINYINT 3-105 example 5-8
general use with TMU 1-9 File formats, input data example with rb_cm 7-15
RB_CONFIG with TMU 2-8 disk files 3-123 LOAD DATA syntax and
RB_PATH with TMU 2-6 fixed record 3-123 usage 5-5 to 5-8
remote TMU configuration 2-13 separated record 3-128
USER statements 2-22 tape files 3-131
Error codes 2-7 variable record 3-124 ,
Error-handling stage XML 3-129
IBM standard label tapes 3-122
loading data 3-9 File redirection, TMU 2-9
INCREMENT fields 3-87
PTMU 3-11 Filesize, TAR limit 4-12
Incremental backups
ESCAPE keyword Fixed-format records, input
defined 8-4
Criteria clause 3-93 data 3-123
syntax 8-22
UNLOAD statement 4-14 FIXEDLEN keyword, LOAD DATA
Index name clause, TMU
Event logging, for backup statement 3-124
REORG 6-14
operations 8-29 FLOAT EXTERNAL fieldtype,
Index-building stage
Exit status codes 2-7 TMU 3-103
in REORG operation 6-11
External backups 2-46, 8-20 FORCE INTACT command, ALTER
loading data 3-9
External data format, TMU SEGMENT 8-11
Indexes
conversion rules 4-6 FORCE option, for restore
default names 6-14
example 4-20, 5-8 operations 9-20
DEFERRED 6-4
for unloaded data 4-5 Foreign backups 2-46, 8-20
rebuilding with REORG 6-3
with rb_cm utility 7-11 Foreign restores 2-46, 8-21, 9-11
,QGH[
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z @
INDEX_TEMPSPACE parameters LOAD DATA control file trim options 3-73, 3-82
DIRECTORY 2-30 from UNLOAD statement 4-10 UPDATE mode 3-32
MAXSPILLSIZE 2-31 with rb_cm utility 7-12 Loading data 3-5 to 3-146
THRESHOLD 2-30 LOAD DATA statement Auto aggregate
TMU control 2-28 See also Loading data. example A-1 to A-16
INDEX_TEMPSPACE_DUPLICAT ACCEPT criteria 2-6, 3-91 conversion stage 3-8
E SPILLPERCENT parameter ADD aggregate operator 3-73 datatype
syntax 2-29 AGGREGATE mode 3-32 conversions 3-133 to 3-136
usage 2-31 APPEND mode 3-31 discard files 3-46, 3-60, 6-21
Input clause, LOAD DATA AUTOROWGEN keyword 3-49 discarded rows 3-6, 3-60, 3-133
statement 3-25 to 3-29 clauses, main error handling and cleanup
Input data Comment 3-95 to 3-96 stage 3-9
CONSTANT fields 3-84 Criteria 3-90 failure or interruption 3-28
fieldtypes 3-97 to 3-116 Discard 3-43 to 3-57 index stage 3-9
file formats 3-122 Format 3-29, 3-38 input stage 3-8
INCREMENT fields 3-87 Input 3-25 to 3-29 inputs and outputs 3-6
ordered 3-15 Locale 3-38 into segments 3-87
record formats 3-122 MMAP Index 3-63 load information 3-95
SEQUENCE fields 3-85 to 3-86 Optimize 3-59 main output stage 3-9
unused fields 3-66 Row Messages 3-58 memory-mapping indexes 3-63
Input files, LOAD DATA Segment 3-87 to 3-90 offline load 3-87
statement 3-25 Table 3-65 to 3-87 overview 1-10
Input locale, defined 3-39 CONSTANT fields 3-84 procedure 3-12 to 3-13
Input stage creating with GENERATE 7-12 processing flow 3-8
in REORG operation 6-10 fieldtypes 3-97 to 3-116 RBW_LOADINFO system
loading data 3-8 FIXEDLEN keyword 3-124 table 3-95
PTMU 3-10 INCREMENT fields, input SERIAL column 3-68
INSERT mode, loading data 3-31 data 3-87 terminating with NOT NULL
INTEGER fieldtype, TMU 3-101, input files 3-25 DEFAULT NULL 3-67
3-105 INSERT mode 3-31 unused input fields 3-66
Internal data format, TMU MAX aggregate operator 3-74 Locale clause
for unloaded data 4-5 MIN aggregate operator 3-73 LOAD DATA statement 3-38
with rb_cm utility 7-11 MODIFY mode 3-33 XML encodings 3-40, 3-41
Interrupted load 3-28 NLS_LOCALE keyword 3-38 Locales
interval option (-i), TMU 2-6 NULLIF keyword 3-73 backed-up databases 8-14
INTO OFFLINE SEGMENT POSITION keyword 3-72 default 3-41
keyword, TMU 3-87 pseudocolumns 3-66, 3-83, 3-92 TMU input files 3-38
Invalid STAR indexes, cause of 6-4 rb_cm control files 7-10 UNLOAD operations 4-5
INVALIDATE INDEX keyword, rb_cm example 7-15, 7-19 use by TMU 1-9
TMU REORG 6-17 RECORDLEN keyword 3-30, Locking
3-123 behavior during REORG 6-22
REJECT criteria 3-91 by TMU 2-25
/ REPLACE mode 3-31 during backups 8-13
RETAIN keyword 3-67 SET LOCK command 2-25
Large input and output files 3-7
SEQUENCE fields 3-85 to 3-86 wait behavior for TMU 2-26, 2-38
Level 0, 1, and 2 backups 8-4
SUBSTR keyword 3-100 Logging, for backup and restore
LIKE, NOT LIKE, TMU
SUBTRACT aggregate operations 8-29
wildcards 3-93
operator 3-73 LTRIM keyword, TMU 3-73, 3-82
syntax summary 3-137 to 3-146
7DEOH0DQDJHPHQW8WLOLW\5HIHUHQFH*XLGH
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z @
PRECOMPUTED_VIEW_MAINTE
0 2 NANCE_ON_ERROR
M4DATE keyword, TMU 3-108 Offline-load operations option 2-37
Main output stage overview 1-10 Primary key indexes, memory-
loading data 3-9 syntax 3-88 mapping 3-63
PTMU 3-11 See also Segments. Pseudocolumn, TMU
MAX aggregate operator, ON DISCARD keyword, TMU field specification 3-66
TMU 3-74 REORG 6-17 with ACCEPT or REJECT 3-92
MAXROWS PER SEGMENT Online backups 8-4 with concatenated fields 3-83
parameter Online manuals Intro-16 PSUs, restoring single 9-19
and duplicate records 3-61 Operating system access 2-4 PTMU
effect on STAR indexes 6-4 Optimize clause 3480/3490 multiple-tape
MAXSPILLSIZE value 2-31 LOAD DATA statement 3-59 drive 2-54
Memory-map limit OPTIMIZE keyword, TMU automatic row generation 2-53
MMAP INDEX clause 3-63, 6-16 REORG 6-15 conversion stage 3-11
SET command 2-35 Order discard limits 2-53
Messages input data, TMU 3-15 effective use 2-52 to 2-55
backup log file 8-30 table order for loads 3-14 error-handling stage 3-11
displayed during restores 9-16 unloaded data 4-9 exit status codes 2-7
locale of 3-42 OTHER keyword, TMU features, described 1-6
Metadata, for TMU backups 8-26 REORG 6-21 input stage 3-10
MIN aggregate operator, TMU 3-73 LOAD operation stages 3-10
MMAP INDEX clause main output and index
LOAD DATA statement 3-63 3 stages 3-11
REORG statement 6-16 multiple tape drives 2-54
Packed decimal datetime parallel-processing
MODIFY mode
inputs 3-116 to 3-119 parameters 2-48 to 2-50
ACCEPT or REJECT clause 3-92
Partial availability, for tables with performance capabilities 3-9
loading data 3-33
damaged segments 9-21 syntax for rb_ptmu 2-5
MODIFY_ANY task
Partial restore operations 9-4, 9-19 TMU SERIAL MODE
authorization 7-6
PARTIAL_AVAILABILITY parameter 2-50
Moving a database 4-18
parameter 9-21
Multiple discard files 3-43
Password, TMU command line 2-6
Pipes 5
as TMU input 2-10
1 for TMU outputs 4-11 Radix point, TMU
NLS_LOCALE keyword, multiple TMU inputs 2-10 overriding locale 3-40
TMU 3-38, 3-40 TMU GENERATE statement 5-4 specifying 3-102
NO WAIT on locks, TMU 2-26, 2-38 POSITION keyword, TMU 3-72 rbw.config file
NULL values Precomputed views backing up externally 8-20
for input data, example 7-16 maintaining 3-16, 3-49 not backed up 8-14
in external-format data 4-21, 5-9 maintaining with TMU RBW_LOADINFO table
in numeric columns 3-134 REORG 6-4 rb_cm results 7-20
NULLIF keyword, TMU rebuilding 6-4 retrieving data 3-96
GENERATE example 5-9 setting 2-36 RBW_LOADINFO_LIMIT
LOAD DATA statement 3-73 PRECOMPUTED_VIEW_MAINTE parameter, example 2-34
rb_cm example 7-16 NANCE option 2-36 rbw_media_history file 8-26
UNLOAD example 4-21 RBW_SEGMENTS system
table 8-12
,QGH[
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z @
rb_cm utility 7-3 to 7-20
    examples 7-13 to 7-19
    LOAD control file example 7-15, 7-19
    necessary authorizations 7-6
    overview 7-4 to 7-20
    requirements for running 7-5
    results, verifying 7-20
    TMU control files for use with 7-10
    UNLOAD control file example 7-15, 7-18
RB_CONFIG environment variable 2-8, 2-13
rb_ctmu 2-12
    See also Client TMU.
rb_drvtmu 2-12
RB_HOST environment variable 2-8, 2-13
RB_NLS_LOCALE environment variable 1-9, 3-40
    See also NLS_LOCALE keyword.
RB_PATH environment variable 2-6, 2-8, 2-13
rb_ptmu file, location 2-5
    See also PTMU.
rb_tmu file, location 2-5
    See also TMU.
Read-only operations, during backups 8-13
REAL keyword, TMU 3-106
RECALCULATE RANGES keyword, TMU REORG 6-15
RECORDLEN keyword 3-30, 3-123
Records to load between each COMMIT 2-40
redbrick directory, defined 2-5
redbrick user ID, defined 2-4
REFERENCE CHECKING option, REORG 6-16
Referential integrity
    maintaining with TMU REORG 6-4
    overriding of 6-21
    with AUTOROWGEN
        described 3-16 to 3-22
        syntax 3-45
Registry, Windows 2-8, 7-7 to 7-8
REJECT keyword, TMU 3-91
Remote TMU 2-12
    See also Client TMU.
    configuration 2-13
    example 2-19
    REMOTE_TMU_LISTENER parameter 2-14
    summary of operation 2-18
    syntax 2-14
REMOTE_TMU_LISTENER parameter 2-14
REORG
    after partial restore 9-21
    aggregate maintenance 6-4
    cleanup stage 6-11
    conversion stage 6-10
    coordinator stage 6-10
    discardfile format 6-24
    disk space 6-24
    index-building stage 6-11
    input stage 6-10
    locking behavior 6-22
    memory-mapping indexes 6-16
    online and offline operation 6-23
    parallel 6-7
    partial index
        limitations of 6-23
        options 6-5
    precomputed views 6-4
    sequence of tasks 6-9
    serial 6-7
    SET command syntax 2-47
    syntax 6-12, 6-13
    usage 6-4
REPLACE mode, loading data 3-31
RESET, TEMPSPACE parameters 2-31
Restated daily totals, with Auto Aggregate A-2
Restore operations
    cold 9-12
    damaged segments 9-19
    description 9-4
    event logging 8-29
    examples 9-5
    FORCE option 9-20
    foreign 9-11
    general procedure 9-10
    locks during 8-13
    partial 9-19
    restoring a segment 9-19
    SHOW option 9-17
    syntax diagram 9-13
    syntax examples 9-15
    task authorization 9-10
RESTORE...SHOW command, TMU 9-17
RESTORE_DATABASE authorization 2-46, 9-10
Restricted datetime masks 3-117
RETAIN keyword, TMU 3-67
RI_DISCARDFILE keyword, TMU 3-46
Row Messages clause, LOAD DATA statement 3-57 to 3-58
Row messages, managing 2-38
RTRIM keyword, TMU 3-73, 3-82

S
Scale, for fieldtype 3-104
Search condition
    WHERE clause in UNLOAD statement 4-13
    wildcard characters 4-14
Segment clause, LOAD DATA statement 3-87 to 3-90
Segment name clause, TMU REORG 6-13
Segments
    altering backup 8-11
    converting table to multiple 4-18
    creating backup 8-9
    loading data into 3-87
    restoring single 9-19
    unloading specific 4-9
Selective column updates 3-69 to 3-70
Selective unload, wildcard character 4-14
Separated-format records 3-33, 3-128
SEQUENCE fields, LOAD DATA statement 3-85 to 3-86
SERIAL column
    datatype, use with 3-101, 3-105
    loading 3-68
Table Management Utility Reference Guide
    Segment clause, using with 3-87
    unloading 3-68
Serial loader. See TMU. 2-50
SET commands, SQL
    PARTIAL AVAILABILITY 9-21
SET commands, TMU
    BAR_UNIT_SIZE 2-45
    DATEFORMAT 2-33
    FOREIGN FULL BACKUP 2-46, 8-20
    FOREIGN FULL RESTORE 2-46, 8-20, 9-11
    INDEX TEMPSPACE 2-28
    list of 2-23 to 2-25
    LOCK 2-25
    LOCK WAIT 8-13
    PRECOMPUTED VIEW MAINTENANCE 2-36
    PRECOMPUTED VIEW MAINTENANCE ON ERROR 2-37
    STATS 2-45
    TEMPSPACE DUPLICATESPILLPERCENT 2-28
    TMU BUFFERS 2-27
    TMU COMMIT RECORD INTERVAL 2-40
    TMU COMMIT TIME INTERVAL 2-42
    TMU CONVERSION TASKS 2-48
    TMU INDEX TASKS 2-47, 2-48
    TMU INPUT TASKS 2-47
    TMU MAX TASKS 2-47
    TMU MMAP LIMIT 2-35
    TMU ROW MESSAGES 2-38
    TMU SERIAL MODE 2-50
    TMU VERSIONING 2-39
SET options in copy operation 7-13
SHOW command, TMU RESTORE 9-17
Simple fields
    syntax 3-71
    XML path 3-73
SMALLINT fieldtype, TMU 3-105
Software dependencies Intro-4
Space, working space for offline loads 3-89
Standard error, redirecting from TMU 2-9
Standard input, to TMU 2-10
Standard label tapes
    ANSI 3-122, 3-132
    backups to 8-18
    IBM 3-122
Standard output, from TMU 4-11
START/STOP RECORD keywords, TMU 3-28
storage managers
    configuration B-1
    TMU backups 8-19
SUBSTR keyword, LOAD DATA statement 3-100
SUBTRACT aggregate operator, TMU 3-73
SYNCH statement
    in rb_cm copy operation 7-12
    usage 3-119
Syntax
    LOAD DATA 3-24, 3-137
    rb_ctmu 2-14
    rb_ptmu 2-5
    rb_tmu 2-5
    REORG statement 6-12, 6-13
Syntax diagrams
    conventions for Intro-7
    keywords in Intro-9
System requirements Intro-4
System segment 9-11

T
Table clause, LOAD DATA statement 3-65 to 3-87
Table name clause, TMU REORG 6-13, 6-20
Tables
    locking by TMU 2-25
    multiple segments, converting to 4-18
    order for load operations 3-14
    unloading data 4-3 to 4-20
Tab-separated data 3-33
Tapes
    backups to 8-23
    input data file formats 3-122 to 3-133
    input data record formats 3-122 to 3-133
    tape devices 3-122
TAR file, POSIX limit 4-12
TAR tapes, with input data 3-122, 3-131
Technical support Intro-12
Templates, GENERATE CREATE TABLE statement, TMU 5-3
TEMPSPACE_DUPLICATESPILLPERCENT parameters, TMU control 2-28
Third-party tools, loading with warehouse data 4-19
TIME fieldtype, TMU 3-107
Time interval to load data before COMMIT 2-42
TIMESTAMP fieldtype, TMU 3-107
timestamp option (-t), TMU 2-6
TINYINT keyword, TMU 3-105
TMU
    aggregate maintenance, described 1-7
    Auto aggregate example A-1
    backups 8-3
    buffer cache size 2-27
    comments in control file 1-9
    control files 1-8
    database option (-d) 2-6
    exit status codes 2-7
    file redirection 2-9
    generating control files
        with GENERATE 5-3 to 5-10
        with UNLOAD 4-10
    input data formats 3-122
    input data, decompressing 2-10
    input file locale 3-38
    interval option (-i) 2-6
    LOAD DATA statement 3-23 to 3-146
    loading data 3-5 to 3-146
    locking operations 2-26, 2-38
    logging in 1-9
    memory-mapping indexes 3-63
    pipes 2-9, 4-11
Symbols
#PCDATA, in XML files 3-75
%, TMU NULL indicator 4-7, 4-21, 5-9
%, TMU wildcard character 3-93
.backup_dirty_psu file 8-26
.dbinfo file 8-26
.odbc.ini file, DSNs in 2-13
_, TMU wildcard character 3-93
,QGH[