IBM Data Movement Tool (Developerworks)
This article presents a simple yet powerful tool for moving data from various sources.
The tool enables applications from Oracle and Sybase to run on IBM® DB2® Version
9.7 for Linux®, UNIX®, and Windows® as is, with little or no change. The tool can also be
used to move data from various other database management systems to DB2 for Linux, UNIX,
and Windows and DB2 for z/OS®. It also supports moving data from a source database
to DB2 in a pureScale environment. Download the IBM Data Movement tool.
Introduction
This tool can be used to move data from various data sources to DB2 in a pureScale
environment.
Beginning with DB2 V9.7 for Linux, UNIX, and Windows, the Migration Toolkit (MTK) is not
required in order to use applications from Oracle and Sybase (After Fixpack 3) on DB2 products.
This tool replaces the MTK functionality with a greatly simplified workflow.
For all other scenarios, for example, moving data from a database to DB2 for z/OS, this tool
supports the MTK, particularly in the area of high-speed data movement. Using this tool, as
much as 4TB of data has been moved in just three days.
A GUI provides an easy-to-use interface for the novice, while the command line API is often
preferred by the advanced user.
Preparation
Download
First, download the tool from the Download section to your target DB2 server. Additional steps are
required to move data to DB2 for z/OS. (Check for the latest available version of the tool.)
Installation
Once you have downloaded the IBMDataMovementTool.zip file, extract the files into a directory
called IBMDataMovementTool on your target DB2 server. A server side install (on DB2) is
strongly recommended to achieve the best data movement performance.
Prerequisites
• DB2 V9.7 should be installed on your target server if you are enabling an Oracle application to
be run on DB2 for Linux, UNIX, and Windows.
• Java™ version 1.5 or higher must be installed on your target server. To verify your current
Java version, run the java -version command. By default, Java is installed as part of DB2 for
Linux, UNIX, and Windows in <install_dir>\SQLLIB\java\jdk (Windows) or
/opt/ibm/db2/V9.7/java/jdk (Linux).
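The Java prerequisite can be checked from a script before the tool is started. The following is a minimal sketch, not part of the tool: the function names java_version_ok and installed_java_version are hypothetical, and the parsing assumes that java -version prints a quoted version string such as "1.5.0_22" or "11.0.2" on its first line.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when a Java version string is 1.5 or higher.
# Accepts legacy strings such as "1.5.0_22" and newer ones such as "11.0.2".
java_version_ok() {
    ver=$1
    major=${ver%%.*}              # text before the first dot
    rest=${ver#*.}                # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of the remainder
    [ -n "$minor" ] || minor=0
    if [ "$major" -gt 1 ]; then
        return 0
    fi
    [ "$major" -eq 1 ] && [ "$minor" -ge 5 ]
}

# Hypothetical helper: print the version string reported by the java
# command found on the PATH (java -version writes to standard error).
installed_java_version() {
    java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Example use (commented out so the sketch runs without Java installed):
# java_version_ok "$(installed_java_version)" || echo "Java 1.5 or higher is required"
```

The comparison is kept separate from the java call so that it can be exercised with sample strings on a machine where Java is not installed.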
Table 1. Location of JDBC drivers for your source database and DB2
Database                            JDBC drivers
DB2 for Linux, UNIX, and Windows    db2jcc.jar, db2jcc_license_cu.jar or db2jcc4.jar, db2jcc4_license_cu.jar
Environment setup
• UNIX: Log in to your server as the DB2 instance owner.
• Windows: Launch a DB2 Command Window.
• Change to the IBMDataMovementTool directory. The tool is a JAR file with two driver scripts
to run the tool.
IBMDataMovementTool.cmd - Command script to run the tool on Windows.
IBMDataMovementTool.sh - Command script to run the tool on UNIX.
IBMDataMovementTool.jar - JAR file of the tool.
Pipe.dll - A DLL required on Windows if pipe option is used.
On UNIX systems
$ db2set DB2_COMPATIBILITY_VECTOR=ORA
$ db2set DB2_DEFERRED_PREPARE_SEMANTICS=YES
$ db2stop force
$ db2start
$ db2 "create db testdb automatic storage yes on /db2data1,/db2data2,/db2data3 DBPATH ON /db2system PAGESIZE 32 K"
$ db2 update db cfg for testdb using auto_reval deferred_force
$ db2 update db cfg for testdb using decflt_rounding round_half_up
On Windows systems
C:\> db2set DB2_COMPATIBILITY_VECTOR=ORA
C:\> db2set DB2_DEFERRED_PREPARE_SEMANTICS=YES
C:\> db2stop force
C:\> db2start
C:\> db2 "create db testdb automatic storage yes on C:,D: DBPATH ON E: PAGESIZE 32 K"
C:\> db2 update db cfg for testdb using auto_reval deferred_force
C:\> db2 update db cfg for testdb using decflt_rounding round_half_up
On UNIX:
chmod +x IBMDataMovementTool.sh
./IBMDataMovementTool.sh
You will now see a GUI window. Some messages should also appear in the shell window. Please
look through these messages to ensure no errors were logged before you start using the GUI.
If you have not set DB2_COMPATIBILITY_VECTOR, the tool will report a warning. Please follow
the steps to set the compatibility vector if you have not done so.
[2010-01-10 17.08.58.578] INPUT Directory = .
[2010-01-10 17.08.58.578] Configuration file loaded: 'jdbcdriver.properties'
[2010-01-10 17.08.58.593] Configuration file loaded: 'IBMExtract.properties'
[2010-01-10 17.08.58.593] appJar : 'C:\IBMDataMovementTool\IBMDataMovementTool.jar'
[2010-01-10 17.08.59.531] DB2 PATH is C:\Program Files\IBM\SQLLIB
[2010-01-10 17.35.30.015] *** WARNING ***. The DB2_COMPATIBILITY_VECTOR is not set.
[2010-01-10 17.35.30.015] To set compatibility mode, discontinue this program and run the following commands
[2010-01-10 17.35.30.015] db2set DB2_COMPATIBILITY_VECTOR=FFF
[2010-01-10 17.35.30.015] db2stop force
[2010-01-10 17.35.30.015] db2start
After clicking on the Extract DDL/Data button, you will see the tool's messages in the View File tab,
as shown in Figure 2:
After completing the extraction of DDL and DATA, you will notice several new files created in the
working directory. These files can be used at the command line to run in DB2.
Configuration files
The following command scripts are regenerated each time you run the tool in GUI mode. However,
you can use these scripts to perform all data movement steps without the GUI. This is helpful
when you want to embed this tool in a batch process to accomplish an automated data
movement.
IBMExtract.properties - This file contains all input parameters that you specified through the
GUI or command line input values. You can edit this file manually to
modify or correct parameters. Note: This file is overwritten each time
you run the GUI.
unload - This script is created by the tool. It unloads data from the source
database server to flat files, if you check the DDL and Data options. The
same script moves data from the source database to DB2 using pipes,
if you check the pipe option in the GUI to eliminate intermediate flat
files. The pipe option is controlled through the usePipe option in the
IBMExtract.properties file.
rowcount - This script is created by the tool, and you can run it after deploying data
to verify row counts in the source and DB2 databases.
You will be presented with interactive options to specify the source and DB2 database connection
parameters in a step-by-step process. Sample output from the console window is shown below:
[2010-01-10 20.08.05.390] INPUT Directory = .
The interactive deploy option is likely your better choice when you are also deploying PL/SQL
objects such as triggers, functions, procedures, and PL/SQL packages.
The GUI screen, as shown in Figure 4, is used for interactive deployment of DDL and other
database objects. The sequence of events in this screen is:
On Windows systems
The tool uses Pipe.dll to create Windows pipes. Make sure that this DLL is placed in the same
directory as the IBMDataMovementTool.jar file.
On UNIX systems
The tool creates UNIX pipes using the mkfifo command to move data from the source to DB2.
Before you can use a pipe between the source and DB2 databases, the table
definitions must already exist. Follow this procedure:
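As a general illustration of how a named pipe avoids intermediate flat files (this is not the tool's own code), the sketch below creates a pipe with mkfifo, writes a few rows into it from a background process, and reads them on the other end. The pipe name and the sample rows are made up for the example; printf stands in for the unload side and wc for the load side.

```shell
#!/bin/sh
# Illustration only: a named pipe lets a reader consume rows while the
# writer is still producing them, so no intermediate flat file is stored.
workdir=$(mktemp -d)
fifo="$workdir/t1.pipe"           # hypothetical pipe name
mkfifo "$fifo"

# Stand-in for the unload side: write three delimited rows into the pipe.
printf '1,"a"\n2,"b"\n3,"c"\n' > "$fifo" &
writer=$!

# Stand-in for the load side: read from the pipe until the writer closes it.
rows=$(wc -l < "$fifo" | tr -d ' ')
wait "$writer"

echo "rows received through pipe: $rows"
rm -rf "$workdir"
```

The writer blocks until a reader opens the pipe, which is why it is started in the background before the reading command runs.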
You can use this tool from z/OS to do the data movement from a source database to DB2 for z/OS.
However, the following additional steps are required.
Change the file permission to 755 and run it. You will get the output shown below:
DNET770:/u/dnet770/migr: >./jd
USAGE: ibm.Jd <filter_key>
USAGE: ibm.Jd "DNET770.TBLDATA.**"
USAGE: ibm.Jd "DNET770.TBLDATA.**.CERR"
USAGE: ibm.Jd "DNET770.TBLDATA.**.LERR"
USAGE: ibm.Jd "DNET770.TBLDATA.**.DISC"
So, if you want to delete all datasets under "DNET770.TBLDATA", use the following command:
DNET770:/u/dnet770/migr: >./jd "DNET770.TBLDATA.**"
• You need a good network connection between the source and DB2 servers, preferably 1 Gbps
or higher. Network bandwidth limits how quickly the data movement can complete.
• The number of CPUs on the source server determines how many tables you can unload in parallel.
For a database larger than 1TB, you should have at least 4 CPUs on the source server.
• The number of CPUs on the DB2 server determines the speed of the LOAD process. As
a rule of thumb, loading takes 1/4 to 1/3 of the total time and the rest is
consumed by the unload process.
• Plan the DB2 database layout ahead of time. Please consult IBM's best practice papers for DB2.
• Pay attention to the order of the tables listed in the input tables file. The geninput script does not
put the tables in any particular order, so you need to order them in such a
way as to minimize unload time. The tables listed in the input file are fed to a pool of threads
in a round-robin fashion. It may happen that all threads but one have finished unloading
while that one is still running. To keep all threads busy, organize the input file so that
the tables appear in increasing order of row count.
• It may still happen that all other tables have been unloaded while a few threads are still
unloading very large tables. You can unload the same table in multiple threads if you
specify a suitable WHERE clause in the input file. For example:
"ACCOUNT"."T1":SELECT * FROM "ACCOUNT"."T1" WHERE id between 1 and 1000000
"ACCOUNT"."T1":SELECT * FROM "ACCOUNT"."T1" WHERE id between 1000001 and 2000000
"ACCOUNT"."T1":SELECT * FROM "ACCOUNT"."T1" WHERE id between 2000001 and 3000000
"ACCOUNT"."T1":SELECT * FROM "ACCOUNT"."T1" WHERE id between 3000001 and 4000000
Make sure that you use the right keys in the WHERE clause, which should preferably be
either the primary key or a unique index. The tool takes care of generating the proper DB2 LOAD
scripts to load data from the multiple files it produces. No other setup is required
to unload the same table in multiple threads, except to add a different WHERE clause to each line as
explained.
• After breaking your unload process into several steps, you can start loading data into DB2
as soon as a batch has finished unloading. The key here is a separate
output directory for each unload batch. All files necessary to load data into DB2 are generated in
the output directory. For DDL, use the generated db2ddl script to create table definitions.
For data, use the db2load script to load the data into DB2. If you combine DDL and data in
a single step, the name of the script will be db2gen.
• Automate the whole process in your shell scripts so that the unload and load processes are
synchronised. Each and every large data movement from Oracle or other databases to DB2
is unique. You will have your skills tested determining how to automate all of these jobs. Save
the output of the jobs in a file by using the tee command, so that you can keep watching the
progress, and the output is saved in a log file.
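The input-file lines that split one table across several threads can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: the function name split_table is hypothetical, the key column is numeric, and its values are spread evenly enough that equal-width ranges give similar row counts. It prints lines in the same format as the ACCOUNT.T1 example in the list above.

```shell
#!/bin/sh
# Hypothetical helper: print one input-file line per key range so that a
# large table can be unloaded by several threads.
# Usage: split_table SCHEMA TABLE KEY_COLUMN MIN_KEY MAX_KEY CHUNKS
split_table() {
    schema=$1; table=$2; key=$3; lo=$4; hi=$5; chunks=$6
    total=$((hi - lo + 1))
    size=$(( (total + chunks - 1) / chunks ))   # ceiling division
    start=$lo
    while [ "$start" -le "$hi" ]; do
        end=$((start + size - 1))
        if [ "$end" -gt "$hi" ]; then
            end=$hi                              # last range may be shorter
        fi
        printf '"%s"."%s":SELECT * FROM "%s"."%s" WHERE %s between %s and %s\n' \
            "$schema" "$table" "$schema" "$table" "$key" "$start" "$end"
        start=$((end + 1))
    done
}

# Reproduces the four ACCOUNT.T1 lines used as the example in this article.
split_table ACCOUNT T1 id 1 4000000 4
```

The ranges are contiguous and do not overlap, so each row is unloaded exactly once as long as every key value falls between MIN_KEY and MAX_KEY.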
1. Copy your data movement scripts and automation shell scripts to a mock directory.
2. Estimate your time by unloading a few large tables in a few threads, and accordingly stagger
the movement of the data.
3. Add a WHERE clause to limit the number of rows to test the movement of data. For example,
you can add a ROWNUM clause to limit the number of rows in Oracle or use the TOP clause
for SQL Server.
"ACCOUNT"."T1":SELECT * FROM "ACCOUNT"."T1" WHERE rownum < 100
"ACCOUNT"."T2":SELECT * FROM "ACCOUNT"."T2" WHERE rownum < 100
"ACCOUNT"."T3":SELECT * FROM "ACCOUNT"."T3" WHERE rownum < 100
"ACCOUNT"."T4":SELECT * FROM "ACCOUNT"."T4" WHERE rownum < 100
4. Practice your scripts and make changes as necessary, and prepare for the final run.
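For a mock run against Oracle, the row limit from step 3 can be added to a copy of the input file mechanically. This is a sketch only: the function name limit_rows and the file names in the usage comment are hypothetical, and it assumes each input line has the form shown in step 3 without the WHERE clause. Lines that already contain an uppercase WHERE are passed through unchanged rather than rewritten.

```shell
#!/bin/sh
# Hypothetical helper: read input-file lines on stdin and append an Oracle
# ROWNUM limit to each query. Lines that already contain " WHERE " are
# printed unchanged, and blank lines are dropped.
# Usage: limit_rows MAX_ROWS < tables.in > tables.test
limit_rows() {
    max=$1
    while IFS= read -r line; do
        case $line in
            '')          ;;                              # skip blank lines
            *' WHERE '*) printf '%s\n' "$line" ;;        # leave as is
            *)           printf '%s WHERE rownum < %s\n' "$line" "$max" ;;
        esac
    done
}
```

Writing the result to a separate file keeps the full input file intact for the final run.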
Final run
1. You have already extracted the DDL and made any required manual changes to the mapping
between tables and tablespaces.
2. Schedule downtime for the movement of the data.
3. If Oracle is the source, make sure its open cursors setting is around 10000.
4. Watch the output from the log file.
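One way to watch a job while keeping its output is to wrap each step with the tee command, as suggested earlier. The wrapper below is a minimal sketch: the function name run_logged and the log file name are hypothetical, and the script paths in the usage comment assume the generated unload and db2load scripts sit in the current directory.

```shell
#!/bin/sh
# Hypothetical wrapper: run one step, show its output on the terminal,
# and append the same output to a log file.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    log=$1
    shift
    # Record when the step started and which command was run.
    printf '=== %s: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$log"
    # Merge standard error into standard output so errors are logged too.
    "$@" 2>&1 | tee -a "$log"
}

# Example use (assumed script locations):
# run_logged movement.log ./unload
# run_logged movement.log ./db2load
```

Note that the function returns the exit status of tee, not of the wrapped command, so a failing step has to be detected from the log contents.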
For large movement of data, it is much more about planning, discipline and the ability to automate
jobs. The tool provides all the capability that you require for such movement. This little tool has
moved very large databases from source to DB2.
Question: Do I need to install anything on my source database server in order for this tool to work?
Answer: You do not need to install anything on your source database for this tool.
Question: What are the supported platforms for this tool?
Answer: Windows, z/OS, AIX, Linux, UNIX, HP-UX, Solaris, Mac and any other platform that has a JVM on it.
Question: I am running this tool from a secure shell window on my Linux/Unix platform and I see a few messages in the command line shell but I do not see the GUI and it seems that the tool has hung.
Answer: Depending upon your DISPLAY settings, the GUI window has opened on your display-capable server. You need to properly export your DISPLAY settings. Consult your Unix system administrator.
Question: I am trying to move data from PostgreSQL and I do not see a PostgreSQL JDBC driver attached with the tool.
Answer: No JDBC drivers are provided with the tool due to licensing considerations. You should get your database JDBC driver from your licensed software.
Question: It is not possible to grant DBA to the user extracting data from the Oracle database. How can I use the tool?
Answer: You will at least need SELECT_CATALOG_ROLE granted to the user and SELECT privileges on the tables used for migration.
Question: What are the databases to which this tool can connect?
Answer: Any database that has a type-IV JDBC driver. So, you can connect to MySQL, PostgreSQL, Ingres, SQL Server, Sybase, Oracle, DB2 and others. It can also connect to a database that has an ODBC-JDBC connector, so you can also move data from an Access database.
Question: What version of Java do I need to run this tool?
Answer: You need a minimum of Java 1.5 to run the tool. The dependency on Java 1.5 is basically due to the GUI portion of the tool. If you really need support for Java 1.4.2, send me a note and I will compile the tool for Java 1.4.2, but the GUI will not run to create the data movement driver scripts. You can determine the version of Java by running this command.
$ java -version
C:\>java -version
Question: How do I check the version of the tool?
Answer: Run IBMDataMovementTool -version on Windows or ./IBMDataMovementTool.sh -version on Linux/UNIX.
Question: I get the error "Unsupported major.minor version 49.0" or "(.:15077): Gtk-WARNING **: cannot open display: " when I run the tool. What does it mean?
Answer: You are using a version of Java lower than 1.5. Install a Java version higher than 1.4.2 to overcome this problem. We prefer that you install IBM Java.
Question: What information do I need for the source and DB2 database servers in order to run this tool?
Answer: You need to know the IP address, port number, database name, user id and password for the source and DB2 databases. The user id for the source database should have DBA privileges, and the user id for the DB2 database should have the SYSADM privilege.
Question: I am running this tool from my Windows workstation and it is running extremely slowly. What can I do?
Answer: The default memory allocated to this tool from the IBMDataMovementTool.cmd or IBMDataMovementTool.sh command script is 990MB, set with the -Xmx switch for the JVM. Try reducing this value, as your workstation might have less memory available.
Question: I am doing a data movement from SQL Server to DB2. How do I get my TEXT field to go to VARCHAR in DB2?
Answer: Specify mssqltexttoclob=true in the IBMExtract.properties file.
Question: I am doing a data movement from Sybase to DB2 and it did not move my T-SQL procedures to DB2.
Answer: The purpose of this tool is only DDL and DATA movement. You will have to use the MTK to move procedures and triggers.
Question: I am doing a DDL movement from Sybase to DB2 and I have my Sybase objects in a file. I do not see a way to specify a DDL file as a data source.
Answer: The purpose of this tool is high-speed data movement, and that is why there is no capability to transform a DDL file from a database to DB2. You can, however, use IBM InfoSphere Data Architect to transform DDL from a source database to a target.
Question: I am doing a data movement from MS Access to DB2 and I do not see all indexes etc. in the DDL generated.
Answer: We use a basic ODBC-JDBC connector to connect to the MS Access database. You will need a different commercial JDBC driver to obtain the complete set of DDL. You can try the HXTT JDBC driver for MS Access. If you use the HXTT driver, you will have to specify DBVENDOR=hxtt in the generated unload script instead of access.
Question: I am doing a data movement from Sybase to DB2 using this tool and I am getting tons of errors.
Answer: It is quite possible that your Sybase database is not enabled for the required JDBC support. Please consult your Sybase DBA to ensure that the correct JDBC stored procedures are installed in your Sybase database.
Question: I am doing a data movement from MySQL to DB2 and I am running out of memory.
Answer: Try different values for FETCHSIZE=nnn in the generated unload script and run the data movement from the command line. If you use the GUI tool, it will overwrite the unload script.
Question: I am doing a data movement from Oracle to DB2 and I notice that there are 3 JAR files required for the data movement. My understanding is that we only need a JDBC driver for data movement. Why the additional JAR files?
Answer: The additional JAR files are mainly required for Oracle XML data types. You should get those files from your Oracle installation directory.
Question: I want the Oracle data type CLOB to go as DBCLOB in DB2.
Answer: Go to the IBMExtract.properties file and set DBCLOB=true.
Question: I am using this tool to move data from Oracle to DB2 and I am getting many Oracle SQL errors that a table was not found.
Answer: The user ID connecting to Oracle should have SELECT_CATALOG_ROLE granted to it and SELECT privileges on the tables.
Question: I do not want NCHAR and NVARCHAR2 to go as GRAPHIC or VARGRAPHIC in DB2. I want them to go as CHAR and VARCHAR2 since I created the DB2 database as UTF-8.
Answer: Go to the IBMExtract.properties file and set GRAPHIC=false.
Question: Can I do data movement from an Oracle database to a DB2 version less than V9.7/V9.5?
Answer: Yes, go to IBMExtract.properties and set db2_compatibility=false.
Question: I noticed that your tool moved Oracle's NUMBER(38) to NUMBER(31) and I understand that DB2 supports only up to 31. I do not want to round down and I want to convert this to DOUBLE.
Answer: Go to IBMExtract.properties and set roundDown_31=false.
Question: I am getting lots of data rejected. How do I get that rejected data in a file so that I can analyze the reason for rejection?
Answer: Go to IBMExtract.properties and set dumpfile=true.
Question: I am trying to load data from a workstation to a DB2 server and I am getting errors. Do I have to run the tool from the server only?
Answer: It is preferable to run this tool from the DB2 server to extract data from the source database and avoid an intermediate server. However, if you want to run this tool from an intermediate server, you can specify REMOTELOAD=TRUE in the generated unload script. Please remember that the DB2 LOAD utility requires BLOB/CLOB/XML data to be available on the server. You will need to mount those directories with the same naming convention on the target DB2 server.
Question: I can only log in to my DB2 server through an SSH shell and we do not allow X-Windows to run on the DB2 server. How do I run this GUI tool to move DDL and DATA?
Answer: Run IBMDataMovementTool.sh from your SSH session. If there is no graphics support, the tool will switch to command line input automatically. If it does not switch for some reason, specify the -console option to the IBMDataMovementTool.sh command to force the tool to run in the interactive command line mode. The command line mode is just a way to gather the input and generate the necessary scripts for data movement. Likewise, the GUI is just a way to generate the scripts; the actual work is done through the scripts only.
Question: Why did you not create the DB2 database through your script since you ask for the name of the database?
Answer: DBAs normally like to create their database according to their own storage path information. We do, however, create the necessary table spaces so that tables are put automatically in the right table space by DB2. You should consider reading IBM's best practice papers to carefully plan your database. It is recommended that you create the DB2 database with a 32K page size as the default.
Question: Why do I need xdb.jar and xmlparserv2.jar in addition to the Oracle JDBC driver?
Answer: The xdb.jar and xmlparserv2.jar files are required if your Oracle data contains XML data. You can locate xdb.jar in the folder server/RDBMS/jlib and xmlparserv2.jar in the lib folder. If you are unable to locate these, you can download Oracle XDK for Java.
Question: I am getting java.lang.UnsatisfiedLinkError: Pipe Pipe.dll is not a valid Win32 application. How do I fix this?
Answer: This error occurs if you are running the tool on a Windows 64-bit platform using a Java 32-bit JVM. Install a Java 64-bit JVM on your Windows platform, and rerun the tool.
Acknowledgements
Many IBMers from around the world provided valuable feedback on the tool, and without their
feedback the tool would not have been possible in its present shape. I acknowledge significant help,
feedback, suggestions and guidance from the following people.
• Jason A Arnold
• Serge Rielau
• Marina Greenstein
• Maria N Schwenger
• Patrick Dantressangle
• Sam Lightstone
• Barry Faust
• Vince Lee
• Connie Tsui
• Raanon Reutlinger
• Antonio Maranhao
• Max Petrenko
• Kenneth Chen
• Masafumi Otsuki
• Neal Finkelstein
Disclaimer
This article contains a tool. IBM grants you ("Licensee") a non-exclusive, royalty free, license to
use this tool. However, the tool is provided as-is and without any warranties, whether EXPRESS
OR IMPLIED, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR
A PARTICULAR PURPOSE OR NON-INFRINGEMENT. IBM AND ITS LICENSORS SHALL NOT
BE LIABLE FOR ANY DAMAGES SUFFERED BY LICENSEE THAT RESULT FROM YOUR USE
OF THE SOFTWARE. IN NO EVENT WILL IBM OR ITS LICENSORS BE LIABLE FOR ANY LOST
REVENUE, PROFIT OR DATA, OR FOR DIRECT, INDIRECT, SPECIAL, CONSEQUENTIAL,
INCIDENTAL OR PUNITIVE DAMAGES, HOWEVER CAUSED AND REGARDLESS OF THE
THEORY OF LIABILITY, ARISING OUT OF THE USE OF OR INABILITY TO USE SOFTWARE,
EVEN IF IBM HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
Related topic
• IBM Data Movement Tool: A new build of the tool is uploaded very frequently, after bug
fixes and new enhancements. Click on Help > Check New Version from the GUI or enter
the command ./IBMDataMovementTool.sh -check to check if a new build is available for
download. You can find the Tool's build number from the Help > About menu option or by
entering the ./IBMDataMovementTool.sh -version command. This tool uses JGoodies Forms
1.2.1, JGoodies Look 2.2.2, and JSyntaxPane 0.9.4 packages for the GUI interface.