B308 Tpump
After completing this module, you will be able to:
State the capabilities and limitations of TPump.
Describe TPump commands and parameters.
Prepare a TPump script.
TPump
Allows near real-time updates from transactional systems into the warehouse.
Performs INSERT, UPDATE, and DELETE operations, or a combination, from the same source.
Up to 63 DML statements can be included for one IMPORT task.
Allows target tables to:
  Have secondary indexes and Referential Integrity constraints.
  Be MULTISET or SET.
  Be populated or empty.
  Have triggers, invoked as necessary.
Allows conditional processing.
Supports automatic restarts; uses Support Environment.
No session limit; use as many sessions as necessary.
No limit to the number of concurrent instances.
Uses row-hash locks, allowing concurrent updates on the same table.
Can always be stopped and locks dropped with no ill effect.
Designed for highest possible throughput.
User can specify how many updates occur minute by minute; can be changed as the job runs.
TPump Limitations
Use of SELECT is not allowed.
Concatenation of data files is not supported.
Exponential operators are not allowed.
The default of two numerals (yy) to represent the year is interpreted to be the 20th century.
The correct date format must be specified at the time of table creation.
TPump has numerous parameters on the .BEGIN LOAD statement that are unique to TPump.
SERIALIZE ON | OFF     (default ON if UPSERT)
PACK number            (default is 20, max is 600)
PACKMAXIMUM            (use maximum pack factor)
RATE number            (default is unlimited)
LATENCY number         (range is 10 - 600 seconds)
NOMONITOR              (default is monitoring on)
ROBUST ON | OFF        (default is ON)
MACRODB dbname         (default is logtable dbase)
;
PACK statements
PACKMAXIMUM
RATE statement rate
  Initial maximum rate at which statements are sent per minute. If the statement rate is zero or unspecified, the rate is unlimited.
LATENCY seconds
  Number of seconds before a partial buffer is sent to the database.
NOMONITOR
  Prevents TPump from checking for statement rate changes from, or updating status information for, the TPump Monitor.
ROBUST ON | OFF
  OFF signals TPump to use simple restart logic; TPump will begin where the last checkpoint occurred.
MACRODB dbname
  Indicates a database to contain any macros used by TPump.
Restrictions to consider:
64K message size limit
TPump limit of 600 statements
Teradata USING clause limit of 2560 columns (from 507)
Teradata Plastic Steps limit
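The relationship between RATE and PACK can be worked through with a little arithmetic. The sketch below is only an illustration in Python, not part of TPump: it assumes that PACK statements travel together in one request, so a RATE given in statements per minute implies RATE / PACK requests per minute. The function names and the lower PACK bound of 1 are assumptions made for this example.

```python
# Illustrative arithmetic only; function names are hypothetical.
# Assumption: PACK statements are sent together as one request.

def requests_per_minute(rate, pack):
    """Requests per minute needed to sustain `rate` statements per minute."""
    if rate <= 0:
        # The module states that a zero or unspecified RATE means unlimited.
        raise ValueError("RATE of zero or unspecified means unlimited")
    if not 1 <= pack <= 600:
        # 600 is the stated maximum; the lower bound of 1 is an assumption.
        raise ValueError("PACK must be between 1 and 600")
    return rate / pack


def seconds_between_requests(rate, pack):
    """Average spacing between requests at the given RATE and PACK."""
    return 60.0 / requests_per_minute(rate, pack)
```

With the values used later in the lab exercise (PACK 20, RATE 2400), `requests_per_minute(2400, 20)` gives 120.0, which is one request every 0.5 seconds. Doubling both values (PACK 40, RATE 4800) leaves the request count unchanged while each request carries twice as many statements.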
SERIALIZE OFF does not guarantee the order in which transactions are processed.
[Figure: SERIALIZE OFF. Rows from the transaction file (PI value and time, 8:00 through 8:13) fill the TPump buffers in arrival order, and the buffers are sent to AMPs 0 through N on Teradata. A callout in the figure reads "This set of transactions may be processed first."]
SERIALIZE guarantees both input record order and that all records with the same PI value will be handled in the same session.
It is recommended to specify the PI column(s) in the statement as KEY.
KEY fields determine the PE session to which TPump sends the transaction.
[Figure: SERIALIZE ON. The same transaction file (PI value and time, 8:00 through 8:13) is grouped in the TPump buffers by PI value, and each PI value is sent through a single session to AMPs 0 through N on Teradata. In the example, Session 1 receives PI values 01, 02, 06 and 08 and Session 2 receives PI values 03, 04, 05 and 07, each in time order.]
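The effect of SERIALIZE ON can be sketched in a few lines of Python. This is a simulation under stated assumptions, not TPump's actual algorithm: the hash used to pick a session (`zlib.crc32`) is a stand-in chosen for this example, so the session assignments will not match the figure. What it does show is the guarantee itself: every row with the same KEY value lands in the same session, and rows keep their input order within a session.

```python
import zlib
from collections import defaultdict

# (PI value, time) pairs taken from the transaction file in the figure.
TRANSACTIONS = [
    ("01", "8:00"), ("03", "8:01"), ("02", "8:02"), ("01", "8:03"),
    ("04", "8:04"), ("05", "8:05"), ("03", "8:06"), ("01", "8:07"),
    ("08", "8:08"), ("06", "8:09"), ("07", "8:10"), ("01", "8:11"),
    ("03", "8:12"), ("02", "8:13"),
]


def route_serialized(rows, n_sessions):
    """Assign each row to a session based on a hash of its KEY value.

    The hash function is hypothetical; it only needs to be deterministic
    so that equal keys always map to the same session.
    """
    sessions = defaultdict(list)
    for key, time in rows:
        session = zlib.crc32(key.encode()) % n_sessions
        # Appending in input order preserves the order within each session.
        sessions[session].append((key, time))
    return dict(sessions)
```

Calling `route_serialized(TRANSACTIONS, 2)` returns a dictionary of session number to row list. Because all updates for PI value 01 pass through one session in order, the 8:03 update can never overtake the 8:00 update, which is the hazard SERIALIZE OFF leaves open.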
ROBUST ON causes a row to be written to the log table each time a buffer has successfully completed its updates.
The larger the TPump PACK factor, the less overhead involved in this activity.
These rows are deleted from the log when a checkpoint is taken.
ROBUST ON is recommended for these specific conditions:
  INSERTs into multiset tables, as such tables will allow re-insertion of the same rows multiple times.
  When UPDATEs are based on calculations or percentage increases.
  If PACK factors are large, and applying and rejecting duplicates after a restart would be time-consuming.
ROBUST ON is always a good idea for TPump jobs that read from queues. It keeps duplicates from being re-inserted into the table in the event of a restart.
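The difference between the two restart modes can be illustrated with a toy simulation. The Python sketch below is a simplified model built on the description above, not TPump's implementation: buffers are lists of rows, the target is a multiset table (a plain list that accepts duplicates), and the function name and parameters are invented for this example. With `robust=False` the job restarts at the last checkpoint and re-applies every buffer after it; with `robust=True` it skips buffers that were logged as complete.

```python
def run_job(buffers, checkpoint_every, crash_after, robust):
    """Apply buffers, crash once, restart, and return the final table.

    buffers          -- list of row lists, applied in order
    checkpoint_every -- take a checkpoint after this many buffers
    crash_after      -- number of buffers applied before the crash
    robust           -- True models ROBUST ON, False models ROBUST OFF
    """
    table = []            # multiset table: duplicate rows are allowed
    completed = set()     # per-buffer log rows (only consulted when robust)
    last_checkpoint = 0   # index of the first buffer after the checkpoint

    # First run, interrupted by the crash.
    for i, buf in enumerate(buffers):
        if i == crash_after:
            break
        table.extend(buf)
        completed.add(i)
        if (i + 1) % checkpoint_every == 0:
            last_checkpoint = i + 1
            completed.clear()   # log rows are deleted at a checkpoint

    # Restart from the last checkpoint.
    for i in range(last_checkpoint, len(buffers)):
        if robust and i in completed:
            continue            # buffer already applied, skip it
        table.extend(buffers[i])
    return table
```

For four buffers of two rows each, a checkpoint interval of three and a crash after two buffers, the ROBUST ON model ends with 8 rows while the ROBUST OFF model ends with 12, because the first two buffers are inserted twice. This is the multiset duplication the recommendation is meant to prevent.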
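[Example script residue: error table Errors_tpp. The layout field names were lost; the surviving start positions and data types, in order, are listed below.]
Start  Data type
1      CHAR(1);
2      INTEGER;
*      INTEGER;
*      CHAR(25);
*      CHAR(20);
*      CHAR(2);
*      INTEGER;
*      DECIMAL(10,2);
*      DECIMAL(10,2);
2      INTEGER;
*      CHAR(30);
*      CHAR(20);
*      INTEGER;
2      INTEGER;
*      CHAR(10);
*      INTEGER;
*      CHAR(4);
*      DECIMAL(10,2);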
MultiLoad uses the DML statements.
TPump uses row-hash locking to allow for concurrent read and write access to target tables. It can be stopped with target tables fully accessible.
Invoking TPump
Network-Attached Systems:
tpump [PARAMETERS] < scriptname > outfilename
Channel-Attached MVS Systems:
// EXEC TDSTPUMP PARM= [PARAMETERS]
Channel-Attached VM Systems:

Channel Parameter    Network Parameter
BRIEF                -b
VERBOSE
                     < scriptname
                     > outfilename
TPump Statistics
.
.
                                        IMPORT 1    Total thus far
                                        =========   ==============
Candidate records considered:.....         200           200
Apply conditions satisfied:.......         200           200
Candidate records not applied:.......        0             0
Candidate records rejected:..........        0             0

** Statistics for Apply Label : UPS_ACCOUNT
Type   Database   Table or Macro Name   Activity
U      TLJC25     Accounts              100
I      TLJC25     Accounts              100

**** 17:33:50 UTY0821 Error table TLJC25.errtable_tpp is EMPTY, dropping table.
0018 .LOGOFF;
=====================================================================
=                                                                   =
=                        Logoff/Disconnect                          =
=                                                                   =
=====================================================================
**** 17:34:08 UTY6216 The restart log table has been dropped.
**** 17:34:08 UTY6212 A successful disconnect was made from the RDBMS.
**** 17:34:08 UTY2410 Total processor time used = '2.43 Seconds'
.       Start : 17:33:13 - TUE MAY 06, 2003
.       End   : 17:34:08 - TUE MAY 06, 2003
.       Highest return code encountered = '0'.

Note: These statistics are not for the example TPump job shown earlier in this module.
TPump Monitor
Tool to control and track TPump imports.
The table SysAdmin.TPumpStatusTbl is updated once a minute.
Alter the statement rate on an import by updating this table using macros.
View
INMODs
[Figure: data passes through the INMOD to TPump, which sends it to Teradata.]

Code 0:
  Valid data record in buffer; EOF not reached. Length field reflects the correct length of the output record.
  If an input record was supplied to the INMOD from TPump and is to be skipped, the length field should be set to zero.
  If no input record was supplied, setting the length to zero indicates EOF.
Code Non 0:
[Table residue: codes 0, 1, 2, 3, 4, 5, 6 and 7 are listed without their descriptions.]

Note: TPump can also use the MultiLoad INMOD return codes.
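The rules for code 0 amount to a small decision table, which the following Python sketch spells out. It is an interpretation of the text above rather than the INMOD interface itself: the function and its argument names are hypothetical, and treating any non-zero code as an error is an assumption, since this module does not describe that case.

```python
def interpret_inmod_reply(return_code, length, input_supplied):
    """Classify what an INMOD reply means, following the code-0 rules.

    return_code    -- value returned by the INMOD
    length         -- length field of the output record
    input_supplied -- True if TPump handed an input record to the INMOD
    """
    if return_code != 0:
        # Assumption: the meaning of non-zero codes is not given here.
        return "error"
    if length > 0:
        return "record"   # valid data record, EOF not reached
    # Length of zero: skip the supplied record, or signal end of file.
    return "skip" if input_supplied else "eof"
```

The only subtle case is a zero length field, whose meaning depends on whether TPump supplied an input record: `interpret_inmod_reply(0, 0, True)` yields "skip", while `interpret_inmod_reply(0, 0, False)` yields "eof".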
DDL Functions
DML Functions Multiple DML Multiple Tables Multiple Sessions Protocol Used Conditional Expressions Arithmetic Calculations
ALL
ALL Yes Yes Yes SQL Yes Yes
LIMITED
INSERT No No Yes
FASTLOAD
No
SELECT Yes Yes Yes EXPORT Yes Yes
ALL
ALL
INS/UPD/DEL INS/UPD/DEL
No No
1 per column
Yes Yes
Yes No
Data Conversion
Error Files Error Limits User-written Routines
Yes
No No No
Yes
No No Yes
Yes
Yes Yes Yes
Yes
Yes Yes Yes
Summary
Allows near real-time updates from transactional systems into the warehouse.
Performs INSERTs, UPDATEs, and DELETEs to more than 60 tables at a time.
Alternative to MultiLoad for low-volume batch maintenance of large databases; replacement for BulkLoad.
Uses row-hash locks, allowing concurrent updates on the same table.
Can always be stopped and locks dropped with no ill effect.
User can specify how many updates occur minute by minute; can be changed as the job runs.
No arithmetic functions or file concatenations.
Review Questions
Match the item in the first column to its corresponding statement in the second column.
_____ 1. TPump purpose          A. Query against TPump status table
Lab Exercises
Lab Exercise 8-1
Purpose
In this lab, you will perform an operation similar to lab 7-2, using TPump instead of MultiLoad. For this exercise, use a PACK of 20 and a RATE of 2400.

What you need
Data file (data8_1) created from macro AU.Lab8_1.

Tasks
1. Delete all rows from the Accounts Table and use the following INSERT/SELECT to create 100 rows of test data:
   INSERT INTO Accounts SELECT * FROM AU.Accounts WHERE Account_Number LT 20024101 ;
2. Export data to the file data8_1 using the macro AU.lab8_1.
3. Prepare a TPump script which performs an UPSERT operation (INSERT MISSING UPDATE) on your Accounts table as a single operation. Use the data from data8_1 as input to the UPSERT script. If the row exists, UPDATE the Balance_Current with the appropriate incoming value. If not, INSERT a row into the Accounts table. In your script, be sure to set a statement rate.
4. Run the script.
5. Validate your results. TPump should have performed 100 UPDATES and 100 INSERTS with a final return code of zero.
40 RATE 4800; INTEGER KEY; INTEGER; CHAR(25); CHAR(20); CHAR(2); INTEGER; DECIMAL (10,2); DECIMAL (10,2);
.DML LABEL Fix_Account
DO INSERT FOR MISSING UPDATE ROWS ;
UPDATE Accounts
SET Balance_Current = :in_balancecur
WHERE Account_Number = :in_accountno ;
INSERT INTO Accounts VALUES
(:in_accountno, :in_number, :in_street, :in_city, :in_state, :in_zip_code, :in_balancefor, :in_balancecur);

.IMPORT INFILE data8_1
LAYOUT Record_Layout_813
APPLY Fix_Account;
.END LOAD;
.LOGOFF;

tpump < lab813.tpp > lab813.out
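The UPSERT behaviour requested by DO INSERT FOR MISSING UPDATE ROWS (update the row if it exists, otherwise insert it) can be mimicked outside the database to check the expected counts. The Python sketch below is only a model for reasoning about the lab result: the table is a dictionary keyed by account number, and the function and key names are made up for this example.

```python
def upsert(accounts, incoming):
    """Apply incoming rows to `accounts`, returning (updates, inserts).

    accounts -- dict mapping account_number to a row dict (the target table)
    incoming -- iterable of row dicts with 'account_number' and
                'balance_current' keys (the input file)
    """
    updates = inserts = 0
    for row in incoming:
        key = row["account_number"]
        if key in accounts:
            # Row exists: update only the current balance.
            accounts[key]["balance_current"] = row["balance_current"]
            updates += 1
        else:
            # Row is missing: insert the whole incoming row.
            accounts[key] = dict(row)
            inserts += 1
    return updates, inserts
```

If the table starts with 100 rows and the input holds 200 rows, 100 of which match existing account numbers, the function returns `(100, 100)`, which corresponds to the 100 UPDATES and 100 INSERTS the lab expects.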