Oracle Data Integrator 11g: Advanced Integration and Development
Student Guide
D78191GC10
Edition 1.0
February 2013
D80589

Technical Contributors and Reviewers
Denis Gray
Alex Kotopoulis
Julien Testut
Christophe Dupupet
Rebecca Sly
Gerry Jurrens
Sophia Chen

Publishers
Michael Sebastian
Srividya Rameshkumar

This document is protected by copyright and other intellectual property laws. You may copy and print this document solely for your own use in an Oracle training course. The document may not be modified or altered in any way. Except where your use constitutes "fair use" under copyright law, you may not use, share, download, upload, copy, print, display, perform, reproduce, publish, license, post, transmit, or distribute this document in whole or in part without the express authorization of Oracle.

The information contained in this document is subject to change without notice. If you find any problems in the document, please report them in writing to: Oracle University, 500 Oracle Parkway, Redwood Shores, California 94065 USA. This document is not warranted to be error-free.

Restricted Rights Notice
Contents
1 Introduction
Lesson Objectives 1-2
Course Objectives 1-3
Target Audience 1-4
Class Introductions 1-5
Replacing Existing KMs 2-18
Developing Knowledge Modules 2-20
Quiz 3-27
Summary 3-28
Example: Flow with Multiple Data Sets 5-9
Defining a Data Set 5-10
7 Accelerating Development in ODI with Groovy
Objectives 7-2
Setting Up an Integration Project and Creating a Complex File Model:
Example 8-18
Quiz 9-38
Summary 9-39
Setting Up External Password Storage 10-41
Implementing External Password Storage 10-42
Quiz 10-44
Summary 10-45
Practice 10-1: Implementing ODI External User Authentication 10-46
5. Choose Right Knowledge Module 11-44
6. Other Best Practices 11-45
Quiz 11-46
Summary 11-47
Introduction
This lesson provides a general overview of the course objectives and the agenda of lessons.
Day 1:
Lesson 1: Introduction
Lesson 2: Overview of ODI Knowledge Modules
Lesson 3: Customizing Knowledge Modules
Lesson 4: Designing ODI Integration Interfaces
Day 2:
Lesson 5: Designing Advanced Integration Interfaces
In this course, the following products are used for lessons and practices:
Oracle Database 11g, WebLogic Server 11g, Oracle Data Integrator 11g, Oracle SOA Suite,
Oracle Internet Directory, and supporting products. These layered products together comprise the
Oracle Data Integrator runtime environment.
Student Guide
Activity Guide
The Student Guide contains the lecture slides and notes. The Activity Guide contains the practices
for the course.
Topic Website
Education and Training http://education.oracle.com
Product Documentation http://www.oracle.com/technology/documentation
Product Downloads http://www.oracle.com/technology/software
Product Articles http://www.oracle.com/technology/pub/articles
Product Support http://www.oracle.com/support
Your instructor can provide additional information regarding the availability and contents of the
courses listed in the slide.
ODI's E-LT architecture leverages disparate RDBMS engines to process and transform data.
This approach optimizes performance and scalability, and lowers overall solution costs.
ODI turns the promise of active integration into reality by providing all the key components
that are required to enable real-time data warehousing and operational data hubs. ODI
combines three styles of data integration: data-based, event-based, and service-based. ODI
unifies silos of integration by transforming large volumes of data in batch mode, by
processing events in real time through its advanced Changed Data Capture, and by
providing data services to the Oracle SOA Suite.
Oracle Data Integrator shortens implementation times with its declarative design approach.
Designers specify what they want to accomplish with their data, and then the tool generates
the details of how to perform the task. With ODI, the business user or the developer specifies
the rules to apply to the integration processes. The tool automatically generates data flows
and administers correct instructions for the various source and target systems. With
declarative design, the number and complexity of steps is greatly reduced, which in turn
shortens implementation times. Automatic code generation reduces the learning curve for
integration developers and opens the definition of integration processes and data formats to
non-IT professionals.
By helping companies capture and reuse technical expertise and best practices (for example, for
compliance with corporate standards, or for specific vertical know-how), ODI's Knowledge Module
framework reduces the cost of ownership. It also enables metadata-driven extensibility of product
functionality to meet the most demanding data integration challenges.
ODI streamlines the high-performance movement and transformation of data between
heterogeneous systems in batch, real-time, synchronous, and asynchronous modes. It dramatically
enhances user productivity with an innovative, modularized design approach and built-in connectivity
to all major databases, data warehouse appliances, analytic applications, and SOA suites.
Integration process: Extract - Transform (check) - Load
[Slide diagram: the ORDERS tables and a CORRECTIONS file on the source machines are extracted, transformed (with checks that isolate errors), and loaded into the target machine.]
This integration process is also known as an extract, transform, and load (ETL) process.
The first part of an ETL process involves extracting data from the source systems. Most data
warehousing projects consolidate data from different source systems.
The transform stage applies a series of rules or functions to the data extracted from the source to
derive the data for loading into the target. Some data sources will require very little or even no
manipulation of data. In other cases, transformations (such as filtering, joining, sorting, and so on)
may be required to meet the business and technical needs of the target database.
The load phase loads the data into the target, usually the data warehouse.
Note: You can add to this process the checks that ensure the quality of data flow, as shown in the
slide.
[Slide diagram comparing the two architectures: in ETL, data is extracted, transformed in a dedicated engine, and then loaded; in E-LT, data is extracted, loaded, and then transformed in the target server.]
Data is one of the most important assets of any company, and data integration constitutes the
backbone of any enterprise's IT systems. Choosing the technology for data integration is critical
for the productivity and responsiveness of business divisions within an enterprise.
E-LT stands for extract, load, and transform. It includes the processes that enable companies to
move data from multiple sources, reformat and cleanse the data, and load it into another
database or a data warehouse for analysis, to support a business process.
ODI provides a strong and reliable integration platform for IT infrastructure. Built on the next-
generation architecture of extract, load, and transform (E-LT), ODI delivers superior performance
and scalability connecting heterogeneous systems at a lower cost than traditional, proprietary ETL
products. Unlike the conventional extract, transform, and load (ETL) design, with ODI, the E-LT
architecture extracts data from sources, loads it into a target, and transforms it by using the
database power according to business rules. The tool automatically generates data flows,
manages their complexity, and administers correct instructions for the various source and target
systems.
Load
This step involves loading the data into the destination target, which might be a database or data
warehouse.
Transform
After the data has been extracted and loaded, the next step is to transform the data according to a
set of business rules. The data transformation may involve various operations including, but not
limited to, filtering data, sorting data, aggregating data, joining data, cleaning data, generating
calculated data based on existing values, and validating data.
The repository forms the central component of the ODI architecture. It stores configuration
information about the IT infrastructure, the metadata for all applications, projects, and scenarios,
and the execution logs. Repositories can be installed in an online transaction processing (OLTP)
relational database. The repository also contains information about the ODI infrastructure, defined
by the administrators. The two types of ODI repositories are the Master and Work repositories.
At design time, developers work in a repository to define metadata and business rules. The
resulting processing jobs are executed by the agent, which orchestrates the execution by
leveraging existing systems. The agent connects to available servers and asks them to execute
the code. It then stores all return codes and messages in the repository. The agent also stores
statistics, such as the number of records processed, and the elapsed time. Several repositories
can coexist in an IT infrastructure. The graphic in this slide shows two repositories: one for the
development environment and the other for the production environment. Developers release their
projects in the form of scenarios that are sent to production.
In production, these scenarios are scheduled and executed on a Scheduler Agent that also stores
all its information in the repository. Operators have access to this information and can monitor the
integration processes in real time.
Business users, as well as developers, administrators, and operators, can gain web-based read
access to the repository by using the ODI Console.
Technically, this term describes a template containing the code necessary to implement a
particular data integration task. These tasks include loading data, checking it for errors, or setting
up triggers necessary to implement journalization. However, all Knowledge Modules basically
work the same way: ODI uses them to generate code, which is then executed by a technology at
run time.
Note: Knowledge Modules are independent of the structure of the source and target data stores.
The same KM can be used, no matter which source table you have, or how many source tables
you have. Likewise, all target tables can use the same Knowledge Module.
ODI 11.1.1.6 introduces Global Knowledge Modules (KMs), allowing specific KMs to be shared
across multiple projects. In previous versions of ODI, Knowledge Modules were always specific to
a project and could only be used within the project into which they were imported. Global KMs are
listed in the Designer Navigator in the Global Objects accordion.
Knowledge Modules define a way of implementing business rules on a given technology. They
connect the abstract logic of your business rules to the concrete reality of your data servers.
If your data servers change, you just select a different Knowledge Module to match. Your business
rules remain unchanged. If your business rules change, you do not have to modify any code; you
just update the SQL expressions that define them in ODI.
In this example, you have Loading Knowledge Modules that describe how to move data from the
Sybase source server to the DB2/UDB staging area. Integration Knowledge Modules perform the
work of creating temporary tables and moving data onto the Oracle target server. Lastly, Check
Knowledge Modules implement constraint checking and isolate errors in a separate table.
Note: Knowledge Modules are generic because they enable data flows to be generated
regardless of the transformation rules. And they are highly specific because the code they
generate and the integration strategy they implement are finely tuned for a given technology.
[Slide diagram: business rules plus a Knowledge Module plus other metadata (the Topology of your sources and targets) produce the generated code, such as "Truncate Table SCOTT.EMP" and "Insert into SCOTT.EMP". At run time, the ODI Agent orchestrates the running of the generated code.]
Business rules and their implementation on a server are carefully kept separate. Now you
will see how ODI combines the two at run time.
For example, you have a mapping that defines the net income of an employee as the sum of the
income components multiplied by their coefficients. This is an expression in SQL, but there is no
code to perform this integration.
You have a Knowledge Module that can wipe a destination table and fill it with source data.
However, it knows nothing about your business rules.
Then you have all the other metadata that you defined in ODI: the Topology of your servers, the
models that exist on them, the technologies used by each server, and so on.
When you put these three things together, ODI generates code to carry out the integration. This
code is specific to a technology, a layout, and a set of business rules. However, if any of these
three things changes, all you need to do is regenerate the code.
At run time, the ODI Agent orchestrates the running of the generated code. Based on the execution
locations, the various parts of the generated code are executed on the source, the staging area, or
the target.
LKM (Loading): Assembles data from source data stores to the Staging Area.
IKM (Integration): Uses a given strategy to populate the target data store from the Staging Area.
JKM (Journalizing): Sets up a system for Changed Data Capture to reduce the amount of data that needs to be processed.
SKM (Data Services): Deploys data services that provide access to data in data stores.
The first group of Knowledge Modules is critical for doing any work with interfaces.
Loading Knowledge Modules (LKMs) extract data from the source of interfaces. So, if your
data is stored in flat files, you will need the File to SQL LKM.
Integration Knowledge Modules (IKMs) implement a particular strategy for loading the target
of an interface. Thus, to load an Oracle table while taking into account the slowly changing
dimensions properties, a particular IKM can be used.
Check Knowledge Modules (CKMs) are selected in interfaces so that constraints can be
individually enforced by ODI at the interface level.
The second group is used for setting up, checking, and configuring models.
Check Knowledge Modules (CKMs) enforce constraints defined on the target data store.
CKMs are used by models to perform static checks outside of interfaces.
Reverse Engineering Knowledge Modules (RKMs) are needed only to perform customized
reverse engineering. They are used to recover the structure of a data model and are used
when standard reverse engineering cannot be performed.
Journalizing Knowledge Modules (JKMs) are used to set up Changed Data Capture. This
makes interfaces react only to changes in data and can vastly reduce the amount of data
that needs to be transferred.
Service Knowledge Modules (SKMs) are the code templates for generating data services.
You now know what the various types of Knowledge Modules are. But, how do you know which
ones need to be imported into your specific project?
The most important thing to remember is to import all Knowledge Modules that may be used in a
project. Each time you create a project, you must import the Knowledge Modules that will be used
by that project.
The basic strategy for importing Knowledge Modules is as follows:
First, import the most basic SQL Knowledge Modules. These work on almost every database
management system (DBMS) with acceptable performance.
Then, consider importing more specific Knowledge Modules for the particular technologies
involved in your project. If you are transferring data to an Oracle server, consider adding the
Oracle-based IKMs.
Technology-specific Knowledge Modules can take advantage of certain characteristics or the
special tools provided by the technologies. However, it is often best to begin with the most
generic Knowledge Modules to start your interface, and then choose specific Knowledge
Modules later to increase performance. Refer to:
Oracle Fusion Middleware Knowledge Module Developer's Guide for Oracle Data
Integrator 11g Release 1 (11.1.1)
Oracle Fusion Middleware Connectivity and Knowledge Modules Guide for Oracle Data
Integrator 11g Release 1 (11.1.1)
In practice, you rarely create a Knowledge Module from scratch.
To create a new KM:
1. Add a new Knowledge Module of the appropriate type.
2. Specify the name and add the details to create the functionality of your KM based on your
project needs.
In most cases, you don't create a Knowledge Module from scratch, but rather develop a KM
starting from an existing Knowledge Module. To develop a Knowledge Module, import a KM of the
appropriate type, and then modify it by adding new functionality according to your project
requirements.
Use the Knowledge Module Editor to create your customized Knowledge Modules. In this
example, the third line, "Create work table," is highlighted. If you double-click this line, a detailed
editor opens, as shown in the next slide.
The KM Editor has a section to edit general information about an item within the KM, such as this
Create work table item in the LKM SQL to Oracle Knowledge Module.
The KM Editor also has a section to edit the KMs options.
Knowledge Modules are used later in this course. For additional detailed information about KMs,
see the ODI Knowledge Modules Reference Guide.
A Knowledge Module is made of steps. Each step has a name and a template for the code.
After execution, the step names are viewable in ODI Operator.
Each step has a command on source and a command on target. In the Command window, you can
view and modify each command manually.
The details of the steps are generic:
The source and target tables are not known; only the technologies are known.
Substitution methods are the placeholders for the table names and column names.
Parameters of the substitution methods specify tables or columns.
Each option can be On or Off and is defined on the Options tab of each step of the KM.
Normally, if you import a Knowledge Module with the same name as an existing Knowledge
Module, you end up with two copies in your project. This can be useful if you want to make
changes to the Knowledge Module without breaking your existing interfaces.
However, another mode of importing is available: the import replace mode. In this mode, the
existing Knowledge Module is replaced with the version imported from the disk. All interfaces that
used the old Knowledge Module are automatically updated to use the new version. Also, values
that are set for the Knowledge Module options in these interfaces are transferred to the new
version. However, any existing scenarios are not regenerated. If you want to incorporate changes
to the Knowledge Module into your scenarios, you must regenerate them.
There are various reasons why you would want to replace a Knowledge Module. The most
common reason is that a newer version of the Knowledge Module is released, perhaps by
someone on your team. You import the Knowledge Module again and all your existing interfaces
will still work.
Similarly, you may have made some undesirable changes to your Knowledge Module. You can
quickly undo these changes by reimporting from a saved version.
Note: Any interface that uses the replaced KM will also be impacted.
Very few KMs are ever created from scratch. They usually
are extensions or modifications of existing KMs.
Duplicate existing steps and modify them. This prevents
typos in the syntax of the odiRef methods.
All interfaces using the KM inherit the new behavior.
Remember to make a copy of the KM if you do not want
existing interfaces to be affected.
One of the main guidelines when developing your own KM is to avoid starting from the beginning.
ODI provides more than 100 KMs. Take a look at these existing KMs, even if they are not written
for your technology. The more examples you have, the faster you develop your own code. You
can, for example, duplicate an existing KM and start enhancing it by changing its technology, or
copying lines of code from another KM.
To speed up development, duplicate existing steps and modify them. This prevents typos in the
syntax of the odiRef methods.
When developing your own KM, remember that it is targeted to a particular stage of the integration
process. As a reminder:
LKMs are designed to load remote source data sets to the staging area (into C$ tables).
IKMs apply the source flow from the staging area to the target. They start from the C$
tables, may transform and join them into a single I$ table, may call a CKM to perform data
quality checks on this I$ table, and finally write the flow data to the target.
CKMs check data quality in a data store or a flow table (I$) against data quality rules
expressed as constraints. The rejected records are stored in the error table (E$).
KMs are written as templates by using the Oracle Data Integrator substitution API. The API
methods are Java methods that return a string value. They all belong to a single object instance
named "odiRef". The same method may return different values depending on the type of KM that
invokes it. The following example illustrates how you would write a CREATE TABLE statement in
a KM and what it would generate. The following code is entered in a KM:
CREATE TABLE <%=odiRef.getTable("L", "INT_NAME", "A")%>
(<%=odiRef.getColList("", "\t[COL_NAME] [DEST_CRE_DT]", ",\n", "", "")%>)
The generated code for the PRODUCT table is:
CREATE TABLE db_staging.I$_PRODUCT (PRODUCT_ID numeric(10),
PRODUCT_NAME varchar(250), FAMILY_ID numeric(4), SKU varchar(13),
LAST_DATE timestamp)
The generated code for the CUSTOMER table is:
CREATE TABLE db_staging.I$_CUSTOMER( CUST_ID numeric(10), CUST_NAME
varchar(250), ADDRESS varchar(250), CITY varchar(50), ZIP_CODE
varchar(12), COUNTRY_ID varchar(3))
Once executed with the appropriate metadata, the KM generates different code for the
PRODUCT and CUSTOMER tables.
Substitution methods are direct calls to ODI methods implemented in Java. These methods are
used to generate text that corresponds to the metadata stored in the ODI repository or session
information.
If you write an expression to express a business rule, you do not generally know in which context
this expression is run. For instance, if you refer to a specific schema by its name in your test
setup, it may not work in your production environment. But, you want to be able to specify your
table in a generic way. Substitution methods enable you to do this. You can specify the table
name with a substitution method. At run time, ODI automatically adds the name of the appropriate
physical schema for the context. You can also use substitution methods to retrieve information
about the current session. For example, you can include the time when the session was launched,
its name, or the code for the current context. Similarly, substitution methods give you access to the
metadata about the source and target of your interface. The general syntax to use a substitution
method is as follows:
<%=odiRef.method_name(parameters)%>
Angle brackets with percentage signs enclose all substitution method calls. The equal sign tells
ODI to replace the tags and everything inside with the result of the method call.
odiRef is a special ODI Java object, which contains all the methods for performing substitution.
You can obtain more information about these methods in Oracle Fusion Middleware Knowledge
Module Developer's Guide for Oracle Data Integrator 11g Release 1 (11.1.1).
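For example, the following expression is a minimal sketch of this syntax, using the getSession method (which returns information about the current session, such as its name) to resolve to the session name at run time:
/* Session: <%=odiRef.getSession("SESS_NAME")%> */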
The substitution methods can be categorized according to the type of Knowledge Module in
which they can be used. Global methods can be used in any situation (in all
Knowledge Modules and in actions). In addition to the methods from the Global Methods list,
Knowledge Modules and in actions). In addition to the methods from the Global Methods list,
some methods can be used specifically in Journalizing Knowledge Modules (JKMs), Loading
Knowledge Modules (LKMs), Integration Knowledge Modules (IKMs), Check Knowledge Modules
(CKMs), Reverse-Engineering Knowledge Modules (RKMs), or in Service Knowledge Modules
(SKMs).
For details on using each substitution method, refer to the appendix "Substitution API Reference" in
Oracle Fusion Middleware Knowledge Module Developer's Guide for Oracle Data Integrator 11g
Release 1 (11.1.1).
These examples show how to use substitution methods. First, look at how to access the
default value of a column. In ODI, you can specify a default value for each column. To access this
default value, use the getColDefaultValue substitution method. If your column is a string, you
must enclose the entire method call within single quotation marks. If the default value for this
column is the string unknown, the result of this expression is 'unknown', with single quotation
marks.
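A minimal sketch of this usage in a KM or mapping expression (the surrounding quotation marks are required because the column is a string):
'<%=odiRef.getColDefaultValue()%>'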
You might want to record the system date into a last-updated column. Normally, you use a
function available in your database engine. What if your target is not a database, but a flat file? In
this case, you can use the getSysDate substitution method. The system date on the machine
where the agent is running is used. To call this function, pass one argument, which is the date
format to be used. By passing yyyy, a four-digit year is returned. Thus, the final string is
something like "The year is: 2006".
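As a sketch, the expression described above could be written as follows:
The year is: <%=odiRef.getSysDate("yyyy")%>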
The next example shows how to use a SELECT statement in a filter. For example, you want to
import orders from the ORDERS table. You are retrieving only those orders processed in the last
week. To do this, you create a subquery that retrieves the maximum order date minus seven days
from the ORDERS table. However, the schema containing the ORDERS table may change on
different servers. Therefore, use the getObjectName substitution method to refer to the table
name. The generated code will contain the qualified name of the table. Thus, the final filter
expression refers to something like MSSQL_ORDERS_PROD.SRC_ORDERS.
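A hedged sketch of such a filter expression (the ORDER_DATE column name is illustrative; the getObjectName parameters follow the qualified-name usage shown later in this lesson):
SRC_ORDERS.ORDER_DATE > (
  select max(ORDER_DATE) - 7
  from <%=odiRef.getObjectName("L", "SRC_ORDERS", "D")%>
)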
An action corresponds to a DDL operation (CREATE TABLE, DROP REFERENCE, and so on) used
to generate a procedure that implements, in a database, the changes performed in a data integrator
model (the GENERATE DDL operation). Each action contains several Action Lines, corresponding to
the commands required to perform the DDL operation (for example, dropping a table requires
dropping all its constraints first).
Action lines contain statements valid for the technology of the action group. Unlike procedures or
Knowledge Module commands, these statements use a single connection (SELECT ... INSERT
statements are not possible).
In the style of the Knowledge Modules, actions make use of the substitution methods to make their
DDL code generic. Action call methods are usable in the action lines only. Unlike other
substitution methods, they are not used to generate text, but to generate actions appropriate for
the context. For example, to perform a DROP TABLE DDL operation, you must first drop all foreign
keys referring to the table. In the DROP TABLE action, the first action line will use the
dropReferringFKs() action call method to automatically generate a Drop Foreign Key action
for each foreign key of the current table. This call is performed by creating an action line with the
following code: <% odiRef.dropReferringFKs(); %> The syntax for calling the action call
methods is: <% odiRef.method_name(); %>
Note: The action call methods must be alone in an action line, must be called without a
preceding "=" sign, and require a trailing semicolon.
dropPK(): Call the Drop Primary Key for the primary key of the current table.
createTable(): Call the Create Table action for the current table.
dropTable(): Call the Drop Table action for the current table.
addFKs(): Call the Add Foreign Key action for all the foreign keys of the current table.
dropFKs(): Call the Drop Foreign Key action for all the foreign keys of the current table.
enableFKs(): Call the Enable Foreign Key action for all the foreign keys of the current table.
disableFKs(): Call the Disable Foreign Key action for all the foreign keys of the current table.
addReferringFKs(): Call the Add Foreign Key action for all the foreign keys pointing to the
current table.
dropReferringFKs(): Call the Drop Foreign Key action for all the foreign keys pointing to the
current table.
When working in Designer, you should avoid specifying physical information such as the database
name or schema name because they may change depending on the execution context. The
correct physical information will be provided by Oracle Data Integrator at execution time by using
substitution methods.
The substitution API has methods that calculate the fully qualified name of an object or data store,
taking into account the context at run time. These methods are listed in the following table.

Qualified name required                               Method                                 Usable in
Any object named OBJ_NAME                             getObjectName("L", "OBJ_NAME", "D")    Anywhere
The target data store of the current interface        getTable("L", "TARG_NAME", "A")        LKM, CKM, IKM, JKM
The integration (I$) table of the current interface   getTable("L", "INT_NAME", "A")         LKM, IKM
The loading (C$) table of the current interface       getTable("L", "COLL_NAME", "A")        LKM
Generating code from a list of items often requires a while or for loop. Oracle Data Integrator
addresses this issue by providing powerful methods that help you generate code based on lists.
These methods act as iterators to which you provide a substitution mask or pattern and a
separator and they return a single string with all patterns resolved separated by the separator.
All of them return a string and accept at least these four parameters:
Start: A string used to start the resulting string
Pattern: A substitution mask with attributes that will be bound to the values of each item of
the list
Separator: A string used to separate each substituted pattern from the following one
End: A string appended to the end of the resulting string
Some of them accept an additional parameter (the Selector) that acts as a filter to retrieve only
part of the items of the list. For example, list only the mapped column of the target data store of an
interface.
Some of these methods are summarized in the table on the next pages.
Note
Instead of the getPKColList() method, you can alternatively use getColList with the
selector parameter set to "PK".
Instead of getSrcTablesList(), whenever possible use the getFrom method. The
getFrom method is discussed later in this lesson.
Instead of getFilterList(), use the getFilter() method, which is more appropriate in
most cases.
Note
Instead of the getJoinList() method, you can alternatively use the getJoin method,
which is usually more appropriate.
Instead of getGrpByList(), whenever possible use the getGrpBy method.
Instead of getHavingList(), use the getHaving method, which is more appropriate in
most cases.
In this example:
Start is set to "(\n": The generated code will start with a parenthesis followed by a carriage
return (\n).
Pattern is set to "\t[COL_NAME] [DEST_WRI_DT]": The generated code will loop over
every target column and generate a tab character (\t) followed by the column name
([COL_NAME]), a white space, and the destination writable data type ([DEST_WRI_DT]).
The Separator is set to ",\n": Each generated pattern will be separated from the next one
with a comma (,) and a carriage return (\n).
End is set to "\n)": The generated code will end with a carriage return (\n) followed by a
parenthesis.
In this example, the values that need to be inserted into MYTABLE are either bind variables with
the same name as the target columns or constant expressions if they are executed on the target.
To obtain these two distinct sets of items, the list is split using the Selector parameter:
"INS AND NOT TARG": First, generate a comma-separated list of columns ([COL_NAME])
mapped to bind variables in the value part of the statement (:[COL_NAME]). Filter them to
get only the ones that are flagged to be part of the INSERT statement and that are not
executed on the target.
"INS AND TARG": Then generate a comma-separated list of columns ([COL_NAME])
corresponding to expression ([EXPRESSION]) that are flagged to be part of the INSERT
statement and that are executed on the target. The list should start with a comma if any
items are found.
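A hedged reconstruction of the corresponding INSERT template, following the description above (MYTABLE and the exact layout are illustrative; the leading comma of the second list appears only if that list is not empty):
insert into MYTABLE
(
<%=odiRef.getColList("", "[COL_NAME]", ", ", "", "INS AND NOT TARG")%>
<%=odiRef.getColList(", ", "[COL_NAME]", ", ", "", "INS AND TARG")%>
)
values
(
<%=odiRef.getColList("", ":[COL_NAME]", ", ", "", "INS AND NOT TARG")%>
<%=odiRef.getColList(", ", "[EXPRESSION]", ", ", "", "INS AND TARG")%>
)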
In this example, getSrcTablesList generates a message containing the list of resource names
used as sources in the interface, to append to MYLOGTABLE. The separator used is composed of a
concatenation operator (||) followed by a comma enclosed in quotation marks (',') followed by the
same operator again. When the table list is empty, the SOURCE_TABLES column of MYLOGTABLE
will be mapped to an empty string ('').
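A sketch of how such a statement could be written (the [RES_NAME] pattern attribute resolves to each source resource name; MYLOGTABLE and its SOURCE_TABLES column come from the description):
insert into MYLOGTABLE (SOURCE_TABLES)
values ('<%=odiRef.getSrcTablesList("", "[RES_NAME]", "' || ',' || '", "")%>')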
getFrom() method
The FROM clause is built with the appropriate keywords (INNER, LEFT, and so
on) and parentheses, when supported by the technology.
When used in an LKM, this method returns the FROM clause because it should be executed
by the source server.
When used in an IKM, it returns the FROM clause because it should be executed by the
staging area server.
getFilter method
When used in an LKM, it returns the filter clause because it should be executed by the
source server.
When used in an IKM, it returns the filter clause because it should be executed by the
staging area server.
getGrpBy() method
The GROUP BY clause includes all mapping expressions referencing columns that do not
contain aggregation functions.
This slide shows examples of the code you can use to obtain the result set from any SQL RDBMS
source server and any SQL RDBMS staging area server to build your final flow data.
Note that getColList is filtered to retrieve only expressions that are not executed on the
target and that are mapped to writable columns. Because all filters and joins start with an AND,
the WHERE clause of the SELECT statement starts with a condition that is always true (1=1).
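A hedged sketch of such a source SELECT template, consistent with that description (the selector "(not TRG) and REW" expresses "not executed on the target and mapped to writable columns"):
select <%=odiRef.getColList("", "[EXPRESSION]\t[COL_NAME]", ",\n\t", "", "(not TRG) and REW")%>
from <%=odiRef.getFrom()%>
where (1=1)
<%=odiRef.getJoin()%>
<%=odiRef.getFilter()%>
<%=odiRef.getGrpBy()%>
<%=odiRef.getHaving()%>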
Oracle Data Integrator supports data sets. Each data set represents a group of joined and filtered
source tables, with their mappings. Data sets are merged into the target data store using set-
based operators (UNION, INTERSECT, and so on) at the integration phase.
During the loading phase, the LKM always works on one data set. During the integration phase,
when all data sets need to be merged, certain odiRef APIs that support working on a specific data
set are called using an index that identifies the data set. The example in this slide explains how this
data set merging is done. A Java for loop iterates over the data sets. The number of data sets is
retrieved using the getDataSetCount method. For each data set, a SELECT statement is
issued, each statement separated from the previous one by the data set's set-based operator
retrieved using the getDataSet method. The SELECT statement is built as in "Generating the
Source Select Statement" (see the previous slide), except that each method call is parameterized
with i, the index of the data set being processed. For example, getFrom(i) generates the FROM
statement for the data set identified by the value of i. All the methods that support a parameter for
the data set index also support a syntax without this index value. Outside an IKM, these methods
should be used without the data set index. Within an IKM, if used without the data set index, these
methods address the first data set.
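A hedged sketch of this loop as it might appear in an IKM (the method names are those cited above; the inner SELECT is abbreviated):
<%for (int i = 0; i < odiRef.getDataSetCount(); i++){%>
<%=odiRef.getDataSet(i, "Operator")%>
select <%=odiRef.getColList(i, "", "[EXPRESSION]", ",\n\t", "", "")%>
from <%=odiRef.getFrom(i)%>
where (1=1)
<%=odiRef.getJoin(i)%>
<%=odiRef.getFilter(i)%>
<%=odiRef.getGrpBy(i)%>
<%=odiRef.getHaving(i)%>
<%}%>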
This slide shows methods that provide additional information, which may be useful. Note that for
getFlexFieldValue() with the List methods, flexfield values can be specified as part of the
pattern parameter.
Answer: b
Explanation: Duplicating an existing KM and enhancing it by changing its technology, or copying
lines of code from another KM, is a regular practice for developing a KM.
CKMs check data quality in a data store or an integration table (I$) against data quality
rules expressed as constraints. The rejected records are stored in the error table (E$).
RKMs are in charge of extracting metadata from a metadata provider to the Oracle Data
Integrator repository by using the SNP_REV_xx temporary tables.
JKMs are in charge of creating and managing the Change Data Capture infrastructure.
2. Try to avoid:
Creating too many KMs
Using hard-coded values, including catalog or schema
names in KMs
You should instead use the substitution methods getTable(),
getTargetTable(), getObjectName(), or KM options.
The code generation in Oracle Data Integrator is able to interpret any Java code enclosed
between <% and %> tags.
Generated code, after resolution of the context (3):
connect my_user/<@=odiRef.getInfo("DEST_ENCODED_PASS")@>
insert into SCOTT.EMP
select EMP_ID, EMP_NAME, EMP_CODE
...
Executed code, compiled by the agent (4):
connect my_user/M4678GHT
insert into SCOTT.EMP
select EMP_ID, EMP_NAME, EMP_CODE
...
This slide shows examples of code generation tags used in the compilation steps:
1. Procedure/KM as it appears in ODI Designer
2. Generated code
- Independent from the context
- Scenario or session ready to be executed
3. Generated code
- Resolution of the context
- Sensitive information
- Visible in the Sunopsis logs
4. Executed code
This slide shows an example of how you can use these advanced techniques. The following KM
code creates a string variable and uses it in a substitution method:
<%
String myTableName;
myTableName = "ABCDEF";
%>
drop table <%=odiRef.getObjectName(myTableName.toLowerCase())%>
This code generates the following:
drop table SCOTT.abcdef
You can use conditional branching and advanced programming techniques to generate code. The
following example illustrates how you can use conditional branching:
The following KM code generates code depending on the OPT001 option value:
<%
String myOptionValue = odiRef.getOption("OPT001");
if (myOptionValue.equals("TRUE"))
{out.print("/* Option OPT001 is set to TRUE */");}
else
{%> /* The OPT001 option is not properly set */ <%}
%>
If OPT001 is set to TRUE, then the following is generated:
/* Option OPT001 is set to TRUE */
Otherwise, the following is generated:
/* The OPT001 option is not properly set */
The getJoin() method retrieves the SQL join string (on the source during the loading, on the
staging area during the integration) for a given data set of an interface.
Parameter
pDSIndex (Int): Index identifying which of the data sets is taken into account by this command
Note: The pDSIndex parameter can be omitted when this method is used in an LKM. It can
also be omitted for IKMs; in this case, the data set taken into account is the first one.
The getFilter() method returns filter expressions separated by an AND operator. When used
in an LKM, it returns the filter clause because it should be executed by the source server. When
used in an IKM, it returns the filter clause because it should be executed by the staging area
server.
The getPK() method returns information relative to the primary key of a data store during a check
procedure. In an action, this method returns information related to the primary key currently
handled by the DDL command.
Parameters
ID: Internal number of the PK constraint
KEY_NAME: Name of the primary key
MESS: Error message relative to the primary key constraint
FULL_NAME: Full name of the PK generated with the local object mask
<flexfield code>: Flexfield value for the primary key
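As an illustration, a hedged sketch of getPK in a DDL action line (MYTABLE is illustrative; the "PK" selector for getColList is the alternative noted earlier in this lesson):
alter table MYTABLE add constraint <%=odiRef.getPK("KEY_NAME")%>
primary key (<%=odiRef.getColList("", "[COL_NAME]", ", ", "", "PK")%>)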
This slide shows you examples of substitution methods that can be used specifically in
Journalizing Knowledge Modules (JKMs).
The getJrnFilter() method returns the SQL Journalizing filter for a given data set in the
current interface. If the journalized table is in the source, this method can be used during the
loading phase. If the journalized table is in the staging area, this method can be used while
integrating.
Parameter: pDSIndex (Int): Index identifying which of the data sets is taken into account by this
command.
The getJrnInfo() method returns journalizing information about a data store.
Parameter: pPropertyName (String): The name of the requested property, for example:
FULL_TABLE_NAME: Full name of the journalized data store
JRN_NAME: Name of the journal data store
JRN_SUBSCRIBER: Name of the subscriber
JRN_METHOD: Journalizing Mode (consistent or simple)
Note: This method returns information about a data store's journalizing for a JKM while
journalizing a model/data store, or for an LKM/IKM in an interface.
The getSubscriberList() method returns a list of subscribers for a journalized table. The
pPattern parameter is interpreted and then repeated for each element of the list, and separated
from its predecessor with the parameter pSeparator. The generated string begins with pStart
and ends with pEnd. If no element is selected, pStart and pEnd are omitted and an empty string
is returned.
Parameters (String):
pStart: This sequence marks the beginning of the string to generate.
pPattern: The pattern is repeated for each occurrence in the list. The list of the attributes
usable in a pattern is detailed in the Pattern Attributes List below. Each occurrence of the
attributes in the pattern string is replaced by its value. Attributes must be between brackets ([
and ]). For example: My name is [SUBSCRIBER].
pSeparator: This parameter separates each pattern from its predecessor.
pEnd: This sequence marks the end of the string to generate.
Example of returning the list of subscribers:
<%=odiRef.getSubscriberList("\nBegin List\n", "- [SUBSCRIBER]", "\n", "\nEnd of List\n")%>
This slide shows you examples of substitution methods that can be used specifically in RKMs.
The getModel() method returns information on the current data model during the processing of
a customized reverse engineering. The list of available data is described in the pPropertyName
values table.
Note: This method may be used on the source connection (data server being reverse-engineered)
as well as on the target connection (repository). On the target connection, only the properties
independent from the context can be specified (for example, the schema and catalog names
cannot be used).
The pPropertyName values include:
ID: Internal identifier of the current model
MOD_NAME: Name of the current model
LSCHEMA_NAME: Name of the logical schema of the current model
MOD_TEXT: Description of the current model
REV_TYPE: Reverse-engineering type: S for standard reverse engineering, C for customized
REV_UPDATE: Update flag of the model
REV_INSERT: Insert flag for the model
This topic provides information on how to troubleshoot problems that you might encounter when
using Oracle Knowledge Modules.
Errors often appear in Oracle Data Integrator in the following way:
java.sql.SQLException: ORA-01017: invalid username/password; logon
denied at ...
at ...
at ...
The java.sql.SQLException code simply indicates that a query was made to the database
through the JDBC driver, which has returned an error. This error is frequently a database or driver
error, and must be interpreted as such.
Only the part of the text containing the database error code and message (ORA-01017 in this
example) must first be taken into account. Search for it in the Oracle documentation. If it contains
an error code specific to Oracle, the error can be identified immediately.
If such an error is identified in the execution log, it is necessary to analyze the SQL code sent to
the database to find the source of the error. The code is displayed in the description tab of the
erroneous task.
A frequent cause is also a call made in non-SQL syntax, such as calling an Oracle stored
procedure by using the syntax:
EXECUTE SCHEMA.PACKAGE.PROC(PARAM1, PARAM2)
The valid SQL call for a stored procedure is:
BEGIN
SCHEMA.PACKAGE.PROC(PARAM1, PARAM2);
END;
Note: The syntax EXECUTE SCHEMA.PACKAGE.PROC(PARAM1, PARAM2) is specific to
SQL*Plus and does not work with JDBC.
Other errors
ORA-00904 invalid column name: Keying error in a mapping/join/filter. A string which is not
a column name is interpreted as a column name, or a column name is misspelled. This error
may also appear when accessing an error table linked to a data store with a recently
modified structure. Modify or drop the error tables and let Oracle Data Integrator re-create it
in the next execution.
ORA-00903 invalid table name: The table used (source or target) does not exist in the
Oracle schema. Check the mapping logical/physical schema for the context, and check that
the table physically exists on the schema accessed for this context.
ORA-00972 identifier is too long: There is a limit on the length of object identifiers in Oracle
(usually 30 characters). When this limit is exceeded, this error appears.
ORA-01790 expression must have same datatype as corresponding expression: You are trying
to combine two different values that cannot be implicitly converted (in a mapping or a join). Use
explicit conversion functions on these values.
Answer: b
Explanation: For conditional code generation, you should use check box options rather than
<%if%> statements.
An integration process is always needed in an interface. This process integrates data from the
source or loading tables into the target data store, using a temporary integration table.
An integration process uses an integration strategy, which defines the steps required in the
integration process. The following elements are used in the integration process:
An integration table (also known as the flow table) is sometimes needed to stage data after all
staging area transformations are made. This table is named after the target table, prefixed with
I$. This integration table is the image of the target table with extra fields required for the
strategy to be implemented. The data in this table is flagged, transformed, or checked before
being integrated into the target table.
The source and/or loading tables (created by the LKM). The integration process loads data
from these tables into the integration table or directly into the target tables.
[Slide diagram: (1) create a temporary integration table (if needed); (2) load data from the source and loading tables into the integration table; (3) perform transformations to implement the integration strategy; the result is then integrated into the target.]
Now look at the first of the three factors that structure the flow. This is the most important:
choosing where to put the staging area.
An interface consists of a set of rules that define the loading of a data store or a temporary target
structure from one or more source data stores.
An integration interface is made up of and defined by the following components:
Target data store: The target data store is the element that will be loaded by the interface.
This data store may be permanent (defined in a model) or temporary (created by the
interface).
Data sets: One target is loaded with data coming from several data sets. Set-based
operators (Union, Intersect, and so on) are used to merge the different data sets into the
target data store. Each data set corresponds to one diagram of source data stores and the
mappings used to load the target data store from these source data stores.
Diagram of source data stores: A diagram of sources is made of source data stores, possibly
filtered, related using joins. The source diagram also includes lookups to fetch additional
information for loading the target. Two types of objects can be used as a source of an interface:
data stores from the models, and interfaces. If an interface is used, its target data store
(temporary or not) will be taken as a source. The source data stores of an interface can be
filtered during the loading process, and must be put in relation through joins. Joins and filters are
either copied from the models or defined for the interface. Joins and filters are implemented in
the form of SQL expressions.
[Slide diagram: (1) the orders tables are extracted, joined, and transformed into the C$_0 loading table; (2) the CORRECTIONS file is extracted and transformed into the C$_1 loading table; the loaded data is then joined and transformed and integrated into the SALES target (3, 4).]
The following sequence of operations is more or less unavoidable to implement this integration.
However, you have some latitude in controlling how and where they are performed.
First, the two orders tables must be extracted and joined. The constraint that orders must be
closed is applied here.
Second, data from the corrections file must be extracted and transformed into the correct
format. Now, data from these two sources must be joined and transformed into a temporary
table, I$_SALES. This temporary table looks identical to the target SALES table.
Lastly, the data from this table must be copied to the Oracle server. You can also do a data
check, or flow control, here.
[Slide diagram: the ORDERS and LINES tables are extracted, joined, and transformed; the CORRECTIONS file is extracted and transformed into TEMP_2 (2); the results are joined and transformed into TEMP_SALES (3); constraints are checked and errors are isolated into ERRORS (4) before the data reaches the SALES target.]
The staging area is where most of the transformation and error checking is usually performed.
After those tasks are performed, you just copy, or load, the data into the target server.
The staging area is a special area that you create when you set up a database in ODI. ODI
creates temporary tables in the staging area, and uses them to perform data transformation.
You can place the staging area on your source server, your target server, or another server
altogether. However, the best place for the staging area is usually on the target server. This gives
you the greatest scope for data consistency checking, and minimizes network traffic. You now look
at some of the consequences of different choices.
In an E-LT-style integration interface, ODI processes the data in a staging area, which is located
on the target. The staging area and target are located on the same RDBMS. The data is loaded
from the sources to the target. To create an E-LT-style integration interface, follow the standard
procedure of creating an ODI interface. See "Working with Integration Interfaces" in the Oracle
Fusion Middleware Developer's Guide for Oracle Data Integrator 11g Release 1 (11.1.1) for
generic information about how to design integration interfaces.
In an ETL-style interface, ODI processes the data in a staging area, which is different from the
target. The data is first extracted from the sources and then loaded to the staging area. The data
transformations take place in the staging area and the intermediate results are stored in temporary
tables in the staging area. The data loading and transformation tasks are performed with the
standard ELT KMs.
In this topic, you learn how to design an ETL-style interface where the staging area is an Oracle
database or any ANSI SQL-92 compliant database, and the target is an Oracle database.
In an ETL-style interface, ODI processes the data in a staging area, which is different from the
target. Oracle Data Integrator provides two ways for loading the data from an Oracle staging area
to an Oracle target:
Using a multiconnection IKM
Using an LKM and a mono-connection IKM
Note: Depending on the KM strategy that is used, flow and static control are supported.
Oracle Data Integrator provides the following multiconnection IKM for handling Oracle data: IKM
Oracle to Oracle Control Append (DBLINK). You can also use the generic SQL multiconnection
IKMs for ANSI SQL-92 compliant technologies: IKM SQL to SQL Incremental Update and IKM
SQL to SQL Control Append.
The IKM SQL to SQL Incremental Update KM integrates data from any ANSI SQL-92 compliant
database into any ANSI SQL-92 compliant database target table in incremental update mode. This
IKM is typically used for ETL configurations: source and target tables are on different databases
and the interface's staging area is set to the logical schema of the source tables or a third schema.
It allows an incremental update strategy with no temporary target-side objects. Use this KM if it is
not possible to create temporary objects in the target server. Because the application updates are
made without temporary objects on the target, the updates are made directly from source to target.
The configuration where the flow table is created on the staging area and not in the target should
be used only for small volumes of data. This KM supports flow and static control.
IKM SQL to SQL Control Append is typically used for ETL configurations: source and target tables
are on different databases and the interface's staging area is set to the logical schema of the
source tables or a third schema. Use this KM strategy to perform control append.
Note: This KM supports flow and static control.
If there is no dedicated multiconnection IKM, use a standard exporting LKM in combination with a
standard mono-connection IKM. This slide shows the configuration of an integration interface
using an exporting LKM and a mono-connection IKM to update the target data. The exporting LKM
is used to load the flow table from the staging area to the target. The mono-connection IKM is
used to integrate the data flow into the target table.
Note that this configuration (LKM + exporting LKM + mono-connection IKM) has the following
limitations:
Neither simple CDC nor consistent CDC is supported when the source is on the same data
server as the staging area (explicitly chosen in the Interface Editor).
Temporary indexes are not supported.
To use an LKM and a mono-connection IKM in an ETL-style interface, perform the following steps:
1. Create an integration interface using the standard procedure as described in the section
"Working with Integration Interfaces" in the Oracle Fusion Middleware Developer's Guide for
Oracle Data Integrator 11g Release 1 (11.1.1).
2. On the Definition tab of the Interface Editor, select Staging Area different from Target and
select the logical schema of the source tables or a third schema.
3. On the Flow tab, select one of the Source Sets. In the Property Inspector, select an LKM
from the LKM Selector list to load from the sources to the staging area. See the chapter in
the Oracle Fusion Middleware Connectivity and Knowledge Modules Guide for Oracle Data
Integrator 11g Release 1 (11.1.1) that corresponds to the technology of your staging area to
determine the LKM you can use. Optionally, modify the KM options.
4. Select the Staging Area. In the Property Inspector, select an LKM from the LKM Selector list
to load from the staging area to the target. Optionally, modify the options.
5. Select the Target by clicking its title. In the Property Inspector, select a standard mono-
connection IKM from the IKM Selector list to update the target. Optionally, modify the KM
options.
Audits provide statistics on the integrity of application data. They also isolate data that is detected
as erroneous by applying the business rules. After erroneous records have been identified and
isolated in error tables, they can be accessed from Oracle Data Integrator Studio, or from any
other front-end application.
In some cases, it is useful to recycle errors from previous runs so that they are added to the flow
and applied again to the target. This method can be useful, for example, when receiving daily
sales transactions that reference product IDs that may not exist. Suppose that a sales record is
rejected in the error table because the referenced product ID does not exist in the product table.
This happens during the first run of the interface. In the meantime the missing product ID is
created by the data administrator. Therefore, the rejected record becomes valid and should be re-
applied to the target during the next execution of the interface.
This mechanism is implemented by IKMs with an extra task that inserts all the rejected records of
the previous executions of this interface from the error table into the integration table. This operation is
made prior to calling the CKM to check the data quality, and is conditioned by a KM option usually
called RECYCLE_ERRORS.
You should have an answer to each of the questions in the slide before defining your data quality
strategy.
Answer: a
Explanation: In an ETL-style interface, ODI processes the data in a staging area, which is
different from the target. The data is first extracted from the sources and then loaded to the staging
area. The data transformations take place in the staging area and the intermediate results are
stored in temporary tables in the staging area. The data loading and transformation tasks are
performed with the standard ELT KMs.
Error Recycling
In this practice, you perform the steps to build an ODI interface
that loads an XML file, with a constraint, into a database table.
To maintain data quality, you enable error recycling. Any rows
that do not pass the constraint are loaded into the error table
E_CLIENT on the target database.
Now look at the first of the three factors that structure the flow. This is the most important:
choosing where to put the Staging area.
A wizard is available in the interface editor to create lookups by using a source as the driving table
and a lookup datastore or interface. These lookups now appear as compact graphical objects in
the interface sources diagram.
The user can choose how each lookup is generated: either as a Left Outer Join in the FROM clause
or as an expression in the SELECT clause (in-memory lookup with nested loop). The second
syntax is sometimes more efficient for small lookup tables.
This feature simplifies the design and readability of interfaces that use lookups, and enables
optimized code for execution.
The lookup wizard has two steps. In the first step, you simply select your driving and lookup
tables. In the second step, you define the lookup condition and you can specify several options.
A data set:
Represents the data flow that comes from a group of joined
and filtered source data stores
Includes the target mappings for this group of sources
Note: Several data sets can be merged into the interface target
data store by using set-based operators such as Union and Intersect.
A data set represents the data flow that comes from a group of joined and filtered source data
stores. One target can be loaded with data coming from several data sets. Set-based operators
(Union, Intersect, and so on) are used to merge the different data sets into the target data store.
Each data set corresponds to one diagram of source data stores and the mappings used to load
the target data store from these source data stores.
Benefits of using data sets:
- Accelerates the interface design
- Reduces the number of interfaces needed to merge several data flows into the same
target data store
In this slide, several data sets are shown merged into the interface target data store using set-
based operators such as Union and Intersect.
Note that support for data sets, as well as the set-based operators supported, depends on the
capabilities of the staging area's technology.
The set-based operators are always executed on the staging area.
When designing the integration interface, the mappings for each data set must be consistent. This
means that each data set must have the same number of target columns mapped.
The slide diagram shows the Flow tab of such an interface: the data set DataSet0 - Orders is
loaded to the staging area with LKM SQL to Oracle, a second data set sourcing from DB2_HSCM
is loaded with LKM DB2 to Oracle, and the two data sets are merged into the target on
ORACLE_SERVER1 with the MINUS set-based operator.
Partitioning features:
Support (sub)partitioning on data stores
Reverse-engineer partition information
Enable partition selection in interfaces
Auto-generate SQL statements
Oracle Data Integrator is able to use database-defined partitions when processing data in
partitioned tables used as sources or targets of integration interfaces. These partitions are created
in the data store corresponding to the table, either through the reverse-engineering process or
manually. For example with the Oracle technology, partitions are reverse-engineered using the
RKM Oracle.
The partitioning methods supported depend on the technology of the data store. For example, for
the Oracle technology the following partitioning methods are supported: Range, Hash, List.
After they are defined on a data store, partitions can be selected when this data store is used as a
source or a target of an interface.
In an interface's mapping, you can specify the use of partitions on a target data store.
You can create an interface that would not have a predefined, permanent target data store. In this
case, all rows in the target are derived from the rows of one or more source data stores. These
interfaces are referred to as temporary interfaces. Such interfaces are designed to be used by
other interfaces as sources for further transformation.
Note: Unlike permanent interfaces, temporary interfaces are symbolized with a yellow interface
symbol.
A temporary interface is used as a source data store in the permanent interface.
In this example, the temporary interface INT_TEMP_AGG_ORDERS, with its target data store
TEMP_AGG_ORDERS, is used in place of a source data store for the interface
INT_TRG_SALES.
In the Mapping Properties, you can see that the columns of the temporary interface are mapped to the
columns of the target data store.
When using a temporary data store that is the target in temporary interface as a source or as a
lookup table for another interface, you can choose:
To use a persistent temporary data store: You will run a first interface creating and
loading the temporary data store, and then a second interface sourcing from it. In this case,
you would typically sequence the two interfaces in a package.
Not to use a persistent data store: The second interface generates a sub-select
corresponding to the loading of the temporary data store. This option is not always available
because it requires all data stores of the source interface to belong to the same data server
(for example, the source interface must not have any source sets). You activate this option
by selecting Use Temporary Interface as Derived Table on the source.
Note the following when using a temporary interface as derived table:
The generated sub-select syntax can be either a standard sub-select syntax (default
behavior) or the customized syntax from the IKM used in the first interface.
All IKM commands are ignored, except the one that defines the derived-table statement
(option "Use current command for Derived Table sub-select statement"). This limitation
causes, for example, the disabling of support for temporary index management.
There are some limitations when using a temporary interface as a derived table. A temporary interface
INT_1 becomes eligible to be a source derived table in another parent interface INT_2 if, and only
if:
All source data stores and subinterfaces of INT_1 are on the same physical server as the
target table of INT_2
The technology of this physical server supports derived tables in the FROM clause
No data store within INT_1 is used as journalized
Answer: e
Explanation: Make sure that your technology supports set-based operators. Check Support Set
Operators. Save before adding data sets. Share the same target structure. Select an IKM that
supports data sets.
Using variables is highly recommended for creating reusable packages, packages with complex
conditional logic, interfaces, and procedures. Variables can be used everywhere within ODI. Their
values can be stored persistently in the ODI Repository if their Keep History parameter is set to All
values or Latest value. Otherwise, if their Keep History parameter is set to No History, their value
is kept only in the memory of the agent during the execution of the current session.
You can use a variable as a SQL bind variable by prefixing it with a colon rather than a hash.
However, this syntax is subject to restrictions: it applies only to SQL DML statements, not to
OS commands or ODI API calls, and using bind variables may result in a performance loss. It
is advised to use ODI variables prefixed with the "#" character to ensure optimal performance at
run time.
When you reference an ODI Variable prefixed with the : character, the name of the variable is
NOT substituted when the RDBMS engine determines the execution plan. The variable is
substituted when the RDBMS executes the request. This mechanism is called Binding. If using the
binding mechanism, it is not necessary to enclose in delimiters (such as quotation marks) the
variables that store strings because the RDBMS is expecting the same type of data as specified
by the definition of the column for which the variable is used.
For example, if you use the variable TOWN_NAME = :GLOBAL.VAR_TOWN_NAME, the VARCHAR
type is expected.
Note: When you reference an ODI variable prefixed with the "#" character, ODI substitutes the
name of the variable with its value before the code is executed by the technology. In expressions,
the variable reference needs to be enclosed in single quotation marks, for example TOWN =
'#GLOBAL.VAR_TOWN'. The call of the variable works for OS commands, SQL, and ODI API
calls.
Variables are often used in packages. The four types of variable steps are as follows:
The Declare Variable step is often placed at the start of a package. It explicitly defines the
variable that is used in an interface, procedure, or other package step. In general, variables
are implicitly declared when they are referred to in a package. However, in certain complex
situations, this cannot be guaranteed. Using this kind of step also improves readability, as
you can visually display all the variables that are used in the package.
The Set Variable step is straightforward and can be used to perform two different actions.
You can assign a specific value to a variable. For example, you can set an error counter to
zero. You can also increment the value of a variable.
The Refresh Variable step updates the variable value by executing the SQL expression of
the variable. Note that variables are not automatically updated.
For example, if a variable's expression retrieves the number of rows in a table, a Refresh
Variable step recomputes that count at that point in the package.
The Evaluate Variable step compares the value of the variable against a constant value or
another variable. This comparison is used to create branches and loops. You can use any
mathematical comparison, such as equals, greater than, less than, and so on. You can also
use the SQL membership operator IN. The execution path of the package then splits
according to the result of the comparison. Using the ok and ko tools, you define the true
and false paths.
This slide shows an example of using variable steps in a package to control the workflow. The
Refresh Variable step updates the variable value by executing the SQL expression of the variable.
The Evaluate Variable step compares the value of the variable against a constant value to create
branches.
It is also possible to use variables as substitution variables in graphical module fields, such as
resource names or schema names in the topology. You must use the fully qualified name of the
variable (example: #GLOBAL.MYTABLENAME) directly in the Oracle Data Integrator graphical
module field.
Using this method, you can parameterize elements for execution, such as physical names of files
and tables (Resource field in the datastore) or their location (Physical Schema's schema [data] in
the topology), Physical Schema, and Data Server URL.
You can use variables anywhere within your procedures' code. Code examples are shown in this
slide.
You should consider using options rather than variables whenever possible in procedures.
Options act like input parameters. Therefore, when executing your procedure in a package you
would set your option values to the appropriate values.
In this example, you would write Step 1's code as follows:
Insert into <%=snpRef.getOption("LogTableName")%> Values (1, 'Loading
Step Started', current_date)
Then, when using your procedure as a package step, you would set the value of option
LogTableName to #DWH.LOG_TABLE_NAME.
Note that when using Groovy scripting (discussed in the lesson titled Using ODI Groovy Editor),
you need to enclose the variable name in double quotation marks ("), for example "#varname"
and "#GLOBAL.varname"; otherwise, the variables are not substituted with the ODI variable
value.
It is sometimes useful to have variables depend on other variable values as shown in this slide.
Note: The bind variable mechanism must be used to define the refresh query for a date type
variable that references another date type variable. For example:
VAR1, a date-type variable, has the refresh query: select sysdate from dual.
VAR_VAR1, a date-type variable, must have the refresh query: select :VAR1 from dual.
You may face some situations where the names of your source or target datastores are dynamic.
A typical example of this is when you need to load flat files into your Data Warehouse with a file
name composed of a prefix and a dynamic suffix such as the current date. For example, the order
file for March 26 would be named ORD2009.03.26.dat.
Note that you can only use variables in the resource name of a datastore in a scenario when the
variable has been previously declared.
To develop your loading interfaces, you should follow the steps shown in the slide.
Note: The variable in the datastore resource name must be fully qualified with its project code.
When using this mechanism, it is not possible to view the data of your datastore from within
Designer.
Create the integration interface for flat file-to-RDBMS transformation.
In some integration projects, a flat file needs to be exported into a relational table. In this example:
1. The variable FileName is created and used as a dynamic name of a flat file to be exported
to a relational database table.
2. To reference the flat file dynamically, you edit the source datastore to point to the variable
rather than having the resource file name hardcoded: #FileName.
3. You create an interface to export a flat file to a relational table.
6. When you execute the scenario, the actual name of the flat file to be loaded into the
relational table will be set as a startup parameter.
Each server and each schema on the server have to be defined in the ODI topology. Now imagine
that you design processes that have to be run on a large number of systems. The exact same
code will be executed; only the physical location will be different. In this case, you have to either
create and maintain thousands of contexts or create an environment that will be more dynamic by
using variables in topology.
It is recommended to create at least two contexts:
A development context, where all URLs in the topology point to an actual server, not using the
variables. This ensures that data can be viewed and interfaces can be tested without any
concerns regarding variable resolution.
The dynamic context will use the variables in topology to name the servers, port numbers,
usernames for the connections, and so on. The package will assign the appropriate values to
the variable and run the interfaces on the appropriate servers. This context will only be used
to validate that the processes defined in the Development context work properly when you
use the variables.
Independently from ODI, you will need a table to store the values that will be used for the topology
variables.
Define one physical data server and set its JDBC URL, such as shown in the slide.
Define the package for loading data for your store:
The input variable STORE_ID will be used to refresh the values for STORE_URL and
STORE_ACTIVE variables from the StoresLocation table.
If STORE_ACTIVE is set to YES, then the next steps will be triggered. The interfaces refer
to source datastores located according to the value of the STORE_URL variable.
To connect to your stores using a variable in the URL of your data server's definition:
1. Create a StoresLocation table, such as the one shown in the slide.
2. Create three variables in your EDW project:
- STORE_ID: Takes the current store ID as an input parameter
- STORE_URL: Refreshes the current URL for the current store ID with the SELECT
statement: select StoreUrl from StoresLocation where StoreId =
#EDW.STORE_ID
- STORE_ACTIVE: Refreshes the current activity indicator for the current store ID with
the SELECT statement: select IsActive from StoresLocation where
StoreId = #EDW.STORE_ID
3. Define one physical data server for all your stores and set
its JDBC URL to:
jdbc:oracle:thin:@#EDW.STORE_URL
4. Define your package for loading data from your store.
The input variable STORE_ID will be used to refresh the
values for the STORE_URL and STORE_ACTIVE variables from
the StoresLocation table.
There are some cases where using contexts for different locations is less appropriate than using
variables in the URL definition of your data servers, for example, when the number of sources is
high (> 100) or when the topology is defined externally in a separate table. In these cases, you
can refer to a variable in the URL of a server's definition.
Suppose you want to load your warehouse from 250 source applications (hosted in Oracle
databases) used within your stores. Of course, one way to do it would be to define one context for
every store. However, doing so would lead to a complex topology that would be difficult to
maintain. Alternatively, you could define a table that references all the physical information to
connect to your stores and use a variable in the URL of your data server's definition.
Tracking Variables
Variable and sequence tracking records a history of the variables and sequences participating in a
session. The values of all variables are displayed in Operator Navigator, in the Variable
and Sequence Values section of the session step.
With the variable tracking feature, you can also determine whether the variable was used in a
source/target operation or an internal operation such as an Evaluate step.
Tracking variables is useful for debugging purposes. See Section 22.2.3, "Handling Failed
Sessions," in Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator 11g Release
1 (11.1.1) for more information on how to analyze errors in Operator Navigator and activate
variable tracking.
Answer: b
Explanation: When you reference an ODI variable prefixed with the "#" character, ODI substitutes
the name of the variable with its value before the code is executed by the technology.
The ODI SDK provides an interface to enable developers to leverage the ODI concepts through an
SDK instead of using the graphical interface.
ODI can now be embedded in other products that can drive ODI process creation and
execution from their own GUI.
Dynamic Mappings: The structure of the source and/or target systems is very dynamic.
That structure would not be easily implemented through a GUI that leverages fixed metadata
definitions.
An enhanced SDK enables the developer to execute virtually every ODI method through a Java
program that enables developers to leverage ODI concepts through an SDK instead of using ODI
11g Studio.
Note: ODI SDK and Public API are terms that are often used interchangeably. The Public API
mimics the usage of the graphical interface. For a better understanding of the logic implemented in
the SDK, ensure that you are familiar with the GUI. Classes, methods, and parameters will be
similar to the options available through the graphical interfaces. The valid code will match valid
objects in the ODI GUI.
This slide lists the ODI operations supported by SDK. The list is grouped by operations performed
in the Master Repository and operations performed in the Work Repository.
Master Repository:
- Creation of Master / Work Repositories
- Creation and management of the following objects:
Data servers, agents, contexts
Implementing mapping
Work Repository:
- Creation and management of metadata
Models and submodels
Datastores, columns, constraints
- Creation and management of ODI projects
Projects, Folders, Interfaces, Variables, Packages
- Retrieving values of flexfields
The following ODI operations are not supported by the SDK:
Master Repository:
- Modification of the data type conversion matrix
- Creation of new technologies
- Security management (Only users with the Supervisor role can use the SDK.)
Work Repository:
- Creation of new ODI procedures (though procedures can be imported and used in packages)
- Creation of new knowledge modules
- Creation of user functions
- Locking/unlocking of objects
- Versioning of objects
- Duplication of an object in the repository (though this can be achieved through code)
- Markers and memos
- Setting the value of flexfields
Public API
- Developers relying on the SDK will be protected from repository evolutions over time.
Sunopsis APIs
- Each tool has a corresponding API version.
Substitution Methods
- They are used by developers to build generic code that will dynamically retrieve the
names of all tables, columns, and mappings during the process of building interfaces.
Groovy is a scripting language with Java-like syntax for the Java platform. The Groovy scripting
language simplifies the authoring of code by employing dot-separated notation, yet still supporting
syntax to manipulate collections, Strings, and JavaBeans. Groovy language expressions in ADF
Business Components differ from the Java code that you might use in a Business Components
custom Java class because Groovy expressions are executed at run time, whereas the strongly
typed language of Java is executed at compile time. Additionally, because Groovy expressions are
dynamically compiled, they are stored in the XML definition files of the business components
where you use them. ADF Business Components supports the use of the Groovy scripting language in
places where access to entity object and view object attributes is useful, including attribute
validators (for entity objects), attribute default values (for either entity objects or view objects),
transient attribute value calculations (for either entity objects or view objects), bind variable default
values (in view object query statements and view criteria filters), and placeholders for error
messages (in entity object validation rules). Additionally, ADF Business Components provides a
limited set of built-in keywords that can be used in Groovy expressions.
Note: Groovy can be used in ODI Procedures, so you could write a procedure with a Groovy script
that does your work, and call the procedure through the regular channels, for example, as a
scenario through web services.
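As an illustration only, and assuming (as the note above states) that Groovy is available as a task technology for procedures in your ODI version, a minimal Groovy task body could look like the following sketch; the log file path is hypothetical:

// Minimal Groovy body for an ODI procedure task: append an audit line
// to a text file (C:/Labs/odi_proc.log is a hypothetical path).
def logFile = new File("C:/Labs/odi_proc.log")
logFile << "Procedure step executed at ${new Date()}\n"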
The Groovy editor provides all standard features of a code editor such as syntax highlighting and
common code editor commands except for debugging. The following commands are supported
and accessed through the context menu or through the Source main menu:
Show Whitespace
Text Edits
- Join Line
- Delete Current Line
- Trim Trailing Whitespace
- Convert Leading Tabs to Spaces
- Convert Leading Spaces to Tabs
- Macro Toggle Recording
- Macro Playback
Indent Block
Unindent Block
You can execute one or several Groovy scripts at once and also execute one script several times
in parallel.
You can only execute a script that is opened in the Groovy editor. ODI Studio does not execute a
selection of the script; it executes the whole Groovy script.
To execute a Groovy script in ODI Studio, select the script that you want to execute in the Groovy
editor. Click Execute in the toolbar. The script is executed.
You can now follow the execution in the Log window.
Note that each script execution launches its own Log window. The Log window is named
according to the following format: Running <script_name>.
In this example, the ODI 11g SDK is used to create a new ODI Project within the Work Repository
that is connected within ODI Studio. A Knowledge Module (KM) is also to be imported into this
new project.
1. Open the native Groovy editor from the ODI Tools menu.
2. Enter the Groovy script in the Groovy console.
3. Verify that the ODI project is created.
The Groovy editor is able to access external libraries; for example, if an external driver is needed.
To use external libraries, do one of the following:
Copy the custom libraries to the userlib folder. This folder is located:
On Windows operating systems: %APPDATA%/odi/oracledi/userlib
On UNIX operating systems: ~/.odi/oracledi/userlib
Add the custom libraries to the additional_path.txt file. This file is located in the userlib
folder and has the following content:
; Additional paths file
; You can add here paths to additional libraries
; Examples:
; C:\java\libs\myjar.jar
; C:\java\libs\myzip.zip
; C:\java\libs\*.jar will add all jars contained in the C:\java\libs\
directory
; C:\java\libs\**\*.jar will add all jars contained in the
C:\java\libs\ directory and subdirectories
You can define a Groovy execution classpath in addition to all classpath entries available to ODI
Studio.
To define an additional Groovy execution classpath, perform the following steps:
1. Before executing the Groovy script, select Tools > Preferences.
2. In the Preferences dialog box, navigate to the Groovy Preferences page.
3. Enter the classpath and click OK.
Oracle Data Integrator provides the odiInputStream variable to read input streams. This
variable is used as follows:
odiInputStream.withReader { println (it.readLine())}
When this feature is used, an Input text field is displayed at the bottom of the Log tab. Enter a
string text and press Enter to pass this value to the script. The script is exited after the value is
passed to it.
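For example, the following minimal sketch (a variation on the odiInputStream usage described above) reads the line typed in the Input field and echoes it to the Log window:

// Read one line from the Log tab's Input field and print it.
// The script exits after the value has been passed to it.
odiInputStream.withReader { reader ->
    def line = reader.readLine()
    println "You entered: ${line}"
}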
This slide provides an example that shows how to create an ODI Project with a Groovy script. In
the createProject method, you define parameters for the Project Name, Project Code, and Folder
Name of your interface.
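The createProject helper shown on the slide is defined in the course's sample script, not in the SDK itself. A minimal sketch of such a helper, assuming an odiInstance connected and authenticated as shown earlier, might look like this (the class names and transaction pattern follow the public ODI 11g SDK, but verify them against your version):

import oracle.odi.core.persistence.transaction.support.DefaultTransactionDefinition
import oracle.odi.domain.project.OdiProject
import oracle.odi.domain.project.OdiFolder

// Hypothetical helper: creates a project with one folder and persists it.
def createProject(odiInstance, String name, String code, String folderName) {
    def tm = odiInstance.getTransactionManager()
    def txnStatus = tm.getTransaction(new DefaultTransactionDefinition())
    def project = new OdiProject(name, code)   // project name and code
    new OdiFolder(project, folderName)         // the folder attaches itself to the project
    odiInstance.getTransactionalEntityManager().persist(project)
    tm.commit(txnStatus)                       // commit makes the project visible in Studio
    return project
}

createProject(odiInstance, "Hands-On Project", "HANDS_ON", "Folder1")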
This slide provides an example of how to create an ODI model with the necessary topology using
the Groovy script.
To create the topology, enter:
lschema = createLogicalSchema("<Context>", "<Technology>", "<Logical
Schema Name>", "<Data Server Name>", "<Database User Name>",
ObfuscatedString.obfuscate("<Database User Password>"), "<URL>",
"<Driver>", "<DB Schema used for ODI Physical Schema>")
In this example, the following parameters are used:
lschema = createLogicalSchema("GLOBAL", "ORACLE", "ORACLE_EBS",
"ORACLE_HQ_DEV", "ODI", ObfuscatedString.obfuscate("ODI"),
"jdbc:oracle:thin:@localhost:1521:orcl", "oracle.jdbc.OracleDriver",
"ODI")
To create a model, enter:
createModel(lschema, "<Context>", "<Model Name>", "<Model Code>")
In this example, you use the following parameters:
createModel(lschema, "GLOBAL", "ORACLE_WH", "ORACLE_WH")
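As with createProject, the createLogicalSchema and createModel helpers are defined in the course's sample script rather than in the SDK. For the model-creation part, a sketch along the following lines is plausible, assuming a connected odiInstance and an already-existing logical schema (the class and finder names follow the public ODI 11g SDK, but verify them against your version):

import oracle.odi.core.persistence.transaction.support.DefaultTransactionDefinition
import oracle.odi.domain.model.OdiModel
import oracle.odi.domain.topology.OdiLogicalSchema
import oracle.odi.domain.topology.finder.IOdiLogicalSchemaFinder

def tm = odiInstance.getTransactionManager()
def txnStatus = tm.getTransaction(new DefaultTransactionDefinition())

// Look up the logical schema created earlier and attach a new model to it.
def finder = (IOdiLogicalSchemaFinder) odiInstance.getTransactionalEntityManager()
                                                  .getFinder(OdiLogicalSchema.class)
def lschema = finder.findByName("ORACLE_EBS")
def model = new OdiModel(lschema, "ORACLE_WH", "ORACLE_WH")  // model name and code
odiInstance.getTransactionalEntityManager().persist(model)
tm.commit(txnStatus)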
This slide provides an example that shows how to create an ODI interface with a Groovy script.
To create an interface with Groovy, create a text file that contains a string (or strings) with the
following parameters separated by commas: interface name, source model, source datastore,
target model, and target datastore.
In the Groovy script, define the variables with the values for the ODI project, the project folder, and
the path to the text file you created.
Run the Groovy script to create the ODI interface.
Note: You can create more than one interface with your Groovy script if you define more than one
row in the text file (one line of parameters for each interface).
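A sketch of the driver loop for such a script is shown below. The parameter file path is hypothetical, and the println placeholder stands in for the actual interface-creation calls (typically done through the SDK's interactive interface helper classes), which the course's sample script provides:

// Hypothetical parameter file, one interface per line:
//   INT_LOAD_CUST,SRC_MODEL,CUSTOMER,TRG_MODEL,TRG_CUSTOMER
new File("C:/Labs/Files/interfaces.txt").eachLine { line ->
    def (intName, srcModel, srcDs, trgModel, trgDs) = line.split(",")*.trim()
    // Create the interface for this row with the ODI SDK here.
    println "Creating interface ${intName}: ${srcModel}.${srcDs} -> ${trgModel}.${trgDs}"
}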
You can automate creating elements of the ODI Studio user interface by using Java with the Groovy
editor. This slide provides an example that shows an implementation of ODI Studio UI automation.
Answer: c and e
Explanation: Creation of user functions and knowledge modules is not supported by the SDK.
This practice covers creating the ODI project, the topology and
model, and the ODI interface with Groovy.
For complex files it is possible to build a Native Schema description file that describes the file
structure. Using this native schema (nXSD) description and the Oracle Data Integrator Driver for
Complex Files, Oracle Data Integrator is able to reverse-engineer, read, and write information from
complex files. Oracle Data Integrator Driver for Complex Files (Complex File driver) converts
native format to a relational structure and exposes this relational structure as a data model in
Oracle Data Integrator.
The Complex File driver translates internally the native file into an XML (internal) structure, as
defined in the native schema (nXSD) description and from this XML file it generates a relational
schema that is used by Oracle Data Integrator. The overall mechanism is shown in this slide.
Most concepts and processes that are used for Complex Files are equivalent to those used for
XML files. The main difference is the step that transparently translates the Native File into an XML
structure that is used internally by the driver but never persisted.
A Complex File corresponds to an Oracle Data Integrator data server. Within this data server, a
single schema maps the content of the complex file.
The Oracle Data Integrator Driver for Complex File (Complex File driver) loads the complex
structure of the native file into a relational schema. This relational schema is a set of tables
located in the schema that can be queried or modified using SQL. The Complex File driver is also
able to unload the relational schema back into the complex file. The relational schema is
Note: For simple flat file formats (fixed and delimited), the File technology is recommended; for
XML files, the XML technology.
You can use a Complex File data server as any SQL data server.
Complex File data servers support both technology-specific KMs sourcing or targeting SQL
data servers, as well as generic KMs.
You can also use the IKM XML Control Append when writing to a Complex File target.
For more information on using KMs, see the chapter titled "Generic SQL" in Oracle Fusion
Middleware Connectivity and Knowledge Modules Guide for Oracle Data Integrator 11g Release 1
(11.1.1).
You set up the topology for Complex Files by creating a data server and a physical schema. Create
a data server for the Complex File technology using the standard procedure, as described in
"Creating a Data Server" in the Oracle Fusion Middleware Developer's Guide for Oracle Data
Integrator 11gR1. This slide details only the fields required or specific for defining a Complex File
data server:
In the Definition tab:
Name: Name of the data server that will appear in Oracle Data Integrator
User/Password: These fields are not used for Complex File data servers
In the JDBC tab, enter the following values:
JDBC Driver: oracle.odi.jdbc.driver.file.complex.ComplexFileDriver
JDBC URL: jdbc:snps:complexfile?f=<native file location>&d=<native
schema>&re=<root element name>&s=<schema
name>[&<property>=<value>...]
Note: Creating a Complex File physical schema is standard procedure for any technology.
The following are the key properties of the ODI Driver for Complex files:
f <native file name>: This is the native file location. Use slash "/" in the path name instead
of backslash "\". It is possible to use an HTTP, FTP, or File URL to locate the file. Files
located by URL are read-only. This parameter is mandatory.
d <native schema>: This is the native schema (nXSD) file location. This parameter is
mandatory.
re <root element>: The element to take as the root table of the schema. This value is
case-sensitive. This property can be used for reverse-engineering, for example, a specific
section of the Native Schema. This parameter is mandatory.
s <schema name>: The name of the relational schema where the complex file will be
loaded. This parameter is mandatory. This schema will be selected when creating the
physical schema under the Complex File data server.
For the full set of ODI driver properties specified in the JDBC URL, refer to Appendix C,
"Oracle Data Integrator Driver for Complex Files Reference," in Oracle Fusion Middleware
Connectivity and Knowledge Modules Guide for Oracle Data Integrator 11g Release 1
(11.1.1).
This slide shows how to create a data server to implement Complex File technology. In ODI
Studio:
1. Open the Topology tab. In the Physical Architecture, expand the Complex File technology
node. Right-click Complex File and then select New Data Server.
2. Enter the name for this new data server, for example: PURCHASE_SAMPLE_CPLX_FILE,
and click the JDBC tab.
The Complex File technology uses a JDBC driver to read the original input file as well as the
metadata definition of this complex file generated using the Native Format Builder, a SOA
component. To select the driver:
3. Click the magnifying glass icon and then select the JDBC driver:
oracle.odi.jdbc.driver.file.complex.ComplexFileDriver.
4. Edit the URL to point to the input file and to the xsd file:
jdbc:snps:complexfile?f=C:\Labs\Files\Complex
Files\Purchase_sample.txt&d=C:\Labs\Files\Complex
Files\Purchase_schema.xsd&re=invoice
5. Test the connection.
Note: The native schema (nXSD) provided in the data server URL is used as the XSD file to
describe the XML structure. For more information, see Section 5.5.2, "Reverse-Engineering an
XML Model," in Oracle Fusion Middleware Connectivity and Knowledge Modules Guide for Oracle
Data Integrator 11g Release 1 (11.1.1).
For the Complex File technology, you define physical and logical schemas the same way as you did
for other technologies:
1. Create a new physical schema for this data server. Save the physical schema.
2. Open the Logical Architecture tab. Expand Technologies > Complex File and create a new
logical schema as shown in the slide. Connect this logical schema to the physical schema (in
this example, PURCHASE_SAMPLE_CPLX_FILE.GEO) in all contexts. Save this logical
schema.
For setting up a project using the Complex File technology, you follow the standard procedure.
See "Creating an Integration Project" in Oracle Fusion Middleware Developer's Guide for Oracle
Data Integrator 11gR1.
It is recommended to import the following knowledge modules into your project for getting started:
LKM SQL to SQL
LKM File to SQL
IKM XML Control Append
A Complex File model groups a set of datastores. Each datastore typically represents an element
in the intermediate XML file generated from the native file using the native schema.
Complex File technology supports standard reverse-engineering, which uses only the abilities of
the Complex File driver (the same process as for XML Files).
To perform a standard reverse-engineering with a Complex File model, use the usual procedure,
as described in "Reverse-engineering a Model" in the Oracle Fusion Middleware Developer's
Guide for Oracle Data Integrator 11gR1.
Note: BPEL is included in the ODI Suite, but not in the stand-alone ODI Enterprise Edition.
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:nxsd="http://xmlns.oracle.com/pcbpel/nxsd"
            elementFormDefault="qualified"
            xmlns:tns="http://xmlns.oracle.com/pcbpel/demoSchema/csv"
            targetNamespace="http://xmlns.oracle.com/pcbpel/demoSchema/csv"
            attributeFormDefault="unqualified"
            nxsd:encoding="US-ASCII" nxsd:stream="chars" nxsd:version="NXSD">
  <xsd:element name="Root">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Customer" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="Name" type="xsd:string"
                           nxsd:style="terminated" nxsd:terminatedBy=","/>
              <xsd:element name="Street" type="xsd:string"
                           nxsd:style="terminated" nxsd:terminatedBy=","/>
              <xsd:element name="City" type="xsd:string"
                           nxsd:style="terminated" nxsd:terminatedBy="${eol}"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
The nXSD schema is created using the Native Format Builder Wizard in JDeveloper with the SOA
Extensions installed. The Oracle File and FTP Adapters are automatically integrated with Oracle
BPEL PM. When you drag the File Adapter or FTP Adapter from the Component Palette of the
JDeveloper BPEL Designer to the design area, the Adapter Configuration Wizard starts with a
Welcome page.
When you click the Define Schema for Native Format button in the Messages page of the Adapter
Configuration Wizard, the Native Format Builder Wizard is displayed. The Messages page is the
last page that is displayed in the Adapter Configuration Wizard before the Finish page. For details,
refer to Oracle Fusion Middleware User's Guide for Technology Adapters 11g Release 1
(11.1.1.6.3).
You can create a new nXSD schema for your complex file, or edit an existing native schema
generated using the Native Format Builder Wizard by sampling a delimited, fixed-length, or
complex-type file. To edit an existing native schema, select the Edit existing option in the Choose
Type page of the Native Format Builder Wizard, and click Browse to navigate to the location of the
existing schema file and then select the native schema file that must be edited. The Native Format
Builder Wizard guides you through the editing of the native schema file.
Before you edit a native schema file, you must ensure that the sample file specified in the
annotation within the schema exists. This annotation is automatically added when the native
schema is generated the first time from the sample file.
Note: If the format is complex, it is often a good idea to approximate it with a similar simple
format and then add the complex components manually. The resulting *.xsd file can be copied
and used as the format for ODI. Using this technique it is also possible to parse the same file
format in SOA Suite and ODI; for example, using SOA for small real-time messages, and ODI for
large batches.
This slide shows the nXSD schema for a complex file generated by Native Format Builder Wizard.
Answer: b
Explanation: Complex File data servers support both technology-specific KMs sourcing or
targeting SQL data servers, as well as generic KMs.
There are Standalone Agents and Java EE Agents in ODI 11g. Both types of agents are
multithreaded Java programs and can be configured for high availability. The main difference is
where and how you install them (in WebLogic Server or on top of a Java Virtual Machine [JVM])
and the benefits of this installation.
The Standalone Agent is easier to deploy anywhere, but does not have clustering or connection
pooling. It can be monitored by Oracle Process Manager and Notification Server (OPMN).
The Java EE Agent is slightly more complex to set up (you need to install WLS first, set up a
domain, and so forth), but is preferable in terms of enterprise-scale deployment: clustering,
load balancing, centralized monitoring (with Fusion Middleware Control), and so forth.
Note that the choice between the two types of agents is really a user's choice, and it is easy to mix
both types of agents seamlessly in an ODI architecture.
Java EE Agents require WebLogic Server whereas Standalone Agents run in their own JVM
container (no application server is required for Standalone Agents).
The Runtime Agent can be deployed as a Java EE component within an application server. It
benefits in this configuration from the application server layer features such as clustering and
connection pooling for large configurations. This Java EE Agent exposes an MBeans interface,
enabling life-cycle operations (start/stop) from the application server console and metrics that can
be used by the application server console to monitor the agent activity and health.
Oracle WebLogic Server Integration
- Oracle Data Integrator components integrate seamlessly with the Java EE application
server.
Java EE Agent Template Generation
- Oracle Data Integrator provides a wizard to automatically generate templates for
deploying Java EE Agents in Oracle WebLogic Server. Such a template includes the
Java EE Agent and its configuration, and can optionally include the JDBC data sources
definitions required for this agent, as well as the drivers and libraries files for these data
sources to work.
By using the Oracle WebLogic Configuration Wizard, domain administrators can extend their
domains or create a new domain for the Oracle Data Integrator Java EE Runtime Agents.
Java EE Agent
Oracle Data Integrator provides an extension integrated into the Fusion Middleware Control
Console (Enterprise Manager). The Oracle Data Integrator components can be monitored as a
domain through this console, and administrators can have a global view of these components
along with other Fusion Middleware components from a single administration console. To
implement integration with Enterprise Manager:
ODI Java EE Agent must be deployed and configured with the existing WebLogic server
domain
Enterprise Manager and the ODI Enterprise Manager Plug-in must be deployed in the
WebLogic server domain that has the ODI Java EE Agent deployed and configured
The ODI Console provides web access to ODI repositories. It enables an ODI developer to browse
ODI objects (such as projects, models, logs, and so on) and manage the ODI environment through
a web interface.
Business users, developers, operators, and administrators use their web browsers to access the
ODI Console. The ODI Console replaces the Metadata Navigator of ODI releases before ODI 11g.
Note that with the ODI Console, you also can perform executions.
ODI Run-Time Web Services and Data Services are two different types of web services.
ODI Run-Time Web Services:
The Public Web Service connects to the repository to retrieve a list of contexts and
scenarios. This web service is deployed in a Java EE application server.
The Agent Web Service is built in the Java EE or Standalone Agent.
Data services are specialized web services that provide access to data in datastores, and to the
captured changes in these datastores by using the CDC feature. These web services are
automatically generated by ODI and deployed to a web services container, typically a Java EE
application server.
ODI automatically generates the Data Services from a datastore or a model and deploys them to a web
services container in an application server.
This generation can be customized by using a Service Knowledge Module (SKM). The resulting
data service is presented in the form of a Java package. You can compile and deploy these Java
classes to a web services container.
Data services can be generated and deployed into a web service stack implementing the Java API
for XML Web Services (JAX-WS), such as Oracle WebLogic Server.
Note: For more information about how to set up, generate, and deploy Data Services, refer to
Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator 11g.
Note that the agent must be accessible from the web services container host, and it must have
access to the repositories.
The involved parameters are similar to the ones used when executing a scenario from an OS
command. The required parameters depend on the way you invoke the web service.
Public Web Services can be deployed in any web services container installed on your machine,
such as a Java EE application server. Web services can also be provided by the ODI Standalone
Agent or as an archive file.
The response of the web service request is written and saved in an XML file that can be used in
ODI.
OdiInvokeWebService in action:
1. OdiInvokeWebService invokes a specific operation on a port of the web service. The Web
Services Description Language (WSDL) file URL must be provided.
2. The OdiInvokeWebService tool sends a client request to the web service through the HTTP
or HTTPS protocol.
3. The response is written to a SOAP file.
4. The response is written to an XML file that can be processed with ODI.
Another way to invoke web services is by using the OdiInvokeWebService tool to create a tool
step in a package.
To create an OdiInvokeWebService tool step, perform the following steps:
1. Open the package where you want to create a tool step and click the Diagram tab.
2. From the toolbox, select the OdiInvokeWebService tool. Click the diagram; a step
corresponding to your tool appears.
3. Click Free choice to be able to edit the step, then click the step icon in the diagram to open
the Properties panel.
The next section of this lesson explores ways to integrate ODI within a Service-Oriented
Architecture (SOA).
This slide shows a simple example with the Data Services, Run-Time Web services (Public Web
Service and Agent Web Service) and the OdiInvokeWebService tool.
The Data Services and Run-Time Web Services components are invoked by a third-party
application, whereas the OdiInvokeWebService tool invokes a third-party web service.
Data Services provides access to data in datastores (both source and target datastores), as well
as changes trapped by the Changed Data Capture framework. This web service is generated by
Oracle Data Integrator and deployed in a Java EE application server.
The Public Web Service connects to the repository to retrieve a list of contexts and scenarios. This
web service is deployed in a Java EE application server.
The Agent Web Service commands the Oracle Data Integrator Agent to start and monitor a
scenario and to restart a session. Note that this web service is built into the Java EE or
Standalone Agent.
The OdiInvokeWebService tool is used in a package and invokes a specific operation on a port
of the third-party web service, for example, to trigger a BPEL process.
Oracle Data Integrator Run-Time Web Services and Data Services are two different types of Web
services. Oracle Data Integrator Run-Time Web Services enable you to access the Oracle Data
Integrator features through web services, whereas Data Services are generated by Oracle Data
Integrator to give you access to your data through web services.
In this example, you see an ODI data load that is controlled by SOA processes and uses web
services as well as databases as sources.
1. In this example, a business process in the product management department is changing
data in the product table of the operational database. This data needs to be propagated to
the data warehouse with minimal delay.
2. The process calls an ODI bulk data service to initiate a load based on the changes since the
last load.
3. The ODI Agent executes a scenario based on a package that first calls a data service from
the finance department to obtain price discount information.
4. The scenario then joins this information with the changed product data from the operational
database. The joined and transformed data is stored in the data warehouse.
This slide shows how ODI can be integrated with a BPEL process for handling ODI errors using
BPEL Human Workflow.
In this example, you expose an ODI Transformation Process as a web service, which is called
from within a BPEL process:
1. Create a BPEL process, which will call your ODI scenario.
2. Create your ODI transformation interface.
3. Create a Package with the OdiInvokeWebService step connected to your transformation
interface. When adding the OdiInvokeWebService tool step, you should connect to the
appropriate BPEL process WSDL and specify the parameters for the tool.
4. Generate a scenario from the package.
5. Edit the BPEL process by adding the Partner Links for integrating with ODI and calling your
package.
In JDeveloper, you may need to create the new application and the new project for your BPEL
process. The project type you create is BPEL Process Project. Create your BPEL process
according to your business project. This slide shows the example of a business process that
prepares the result that will be consumed and processed by ODI Interface. For details on creating
BPEL processes, refer to Oracle Fusion Middleware User's Guide for Oracle Business Process
Management 11g Release 1 (11.1.1.6.3).
This slide shows how to create an ODI interface and an ODI Package with the
OdiInvokeWebService step connected to your transformation interface:
1. Create your transformation interface.
2. Create the new package.
3. Open the Diagram tab and add a new OdiInvokeWebService step from the Internet
folder in the Toolbox panel. Also, add your interface step to the package.
4. Configure the OdiInvokeWebService step. Select the URL to connect to the BPEL
process WSDL. Enter other parameters for the OdiInvokeWebService tool as shown in
the slide. In the Properties panel, fill in the following parameters:
- Storage mode for Response File: New File
- Response File: /tmp/processResponse.xml
5. Connect the OdiInvokeWebService step to the Interface step in your diagram.
This slide shows the result of creating the ODI scenario for the package and the process of editing
the BPEL process for invoking the scenario.
1. Create the scenario for the ODI package.
2. Edit the BPEL process to invoke your package.
After the BPEL process is ready, it should be deployed to the application server and tested. To
deploy and test your BPEL process:
1. In JDeveloper, select your BPEL process, right-click and then select Deploy. Follow the
screens to deploy your process to the application server.
2. In Enterprise Manager, expand: SOA > soa-infra (AdminServer) > ODIInvoke. Select
ODIInvoke and click Test Service.
A business process is a set of coordinated tasks and activities, involving both human and system
interactions, that leads to accomplishing a set of specific organizational goals. Business Process
Execution Language (BPEL) is a programming abstraction that enables developers to compose
multiple discrete web services into an end-to-end process flow. BPEL enables the top-down
realization of a Service-Oriented Architecture (SOA) through composition, orchestration, and
coordination of Web services.
In the BPEL process, the activities called Partner Links are used to integrate the business
process (the BPEL process) with other applications within the SOA. They link the BPEL process to
corresponding web services.
ODI can be integrated with a BPEL process by using the OdiInvokeWebService tool. Thus, a
business process can be invoked from the ODI Package. After the execution, the BPEL process
can send a response back to ODI for subsequent data processing.
For enterprise SOA deployments, there is almost always a need for enterprise data extraction,
loading, transformation, and validation. By leveraging the native SOA architecture within ODI, you
can take advantage of ODI to perform the ELT (Extract Load Transform). ODI provides the ability
to validate data during the load to the target by using ODI constraints or database constraints.
When this data is checked against a constraint by using Flow Control, any errors that are found
are not loaded to the target but are loaded to an errors table that is created and managed by ODI.
Each row of this table represents a record that did not pass a constraint. The row also has a
message column that explains why the record was rejected. This table can be edited within ODI
Designer or any other tool that can edit relational tables.
However, this is not always a convenient way for the end user or business user to edit the data.
Alternatively, the ODI Error Hospital can be created with BPEL Human Workflow. Any rows that
do not pass the constraint will be loaded to the error table on the target database. The ODI
scenario can be executed, and after the ODI ELT process is completed, the ODI scenario will then
call a BPEL web service to notify it of any errors during the load. The BPEL process will import the
errors and manage them by using BPEL human workflow tasks. A user can then use the BPEL
Worklist application to update bad records. On subsequent executions of the ODI scenario, the
updated records are recycled into the ELT process.
2. Create the BPEL process that will import data errors from the error table. The BPEL process
will then create the human workflow tasks from the error records. You build a new BPEL
process to track the errors for each execution of the ODI package and present them to a
user for review. The user will be able to update each of the fields and correct any errors so
that on the next execution of the ODI package, the corrected rows are inserted into the
target. In this example, JDeveloper is the tool that you use to build the BPEL process and
deploy it to the application server:
1. Create a New Project for the BPEL process.
2. Create and configure the connection to an ODI data source.
3. Invoke database adapters to read errors and write error corrections back to the error
table.
3. Deploy the BPEL process; create and execute the ODI package. Deploy the BPEL process
to the application server.
1. Deploy the BPEL process.
2. Create the ODI package to execute the interface and invoke the Web service.
3. Execute your ODI package.
4. Monitor execution of the BPEL process from the BPEL Console, and complete the human
task:
1. Using Oracle BPM Worklist application, perform an action on the human task to correct
errors.
2. Return to the BPEL Console page and refresh the browser. The process is now
updated and you see that it has progressed past the Error Hospital Human workflow
activity.
3. Click the Invoke Corrections activity to see the data that the error table has been
updated with.
Note: You should now have a fully functional ODI to BPEL Human Workflow Error Hospital. If you
rerun your ODI scenario now, the corrected errors are picked up and resubmitted to the target
table.
Answer: a
Explanation: Both types of agents have web service capabilities and can be used within a SOA
environment.
Objects are the visible part of ODI object components (Java classes). It is necessary to
distinguish between the notions of object and object instance (or instances), which in ODI are similar to
object-oriented programming notions.
An example of an instance, MY_PROJ_1, is an instance of the project-type object. Similarly,
another instance of a project-type object is YOUR_PROJ_2. Thus, an instance is a particular
occurrence of an object. For example, the Datawarehouse project is an instance of the
Project object.
Each object has a series of methods that are specific to it. The notion of a method in Data
Integrator is similar to the one in object-oriented programming.
A profile contains a set of privileges for working with ODI. One or more profiles can be
assigned to a user to grant the sum of these privileges to the user. An authorization by profile
is placed on an object's method for a given profile. It allows a user with this profile to be
given, either optionally or automatically, the right to this object through the method. The
presence of an authorization by a profile for a method, under an object in the tree of a profile,
shows that a user with this profile is entitled (either optionally or automatically) to this object's
instances through this method. The absence of authorization by a profile shows that a user
with this profile cannot, under any circumstance, invoke the method on an instance of the
object.
Note: Objects and methods are predefined in ODI and must not be changed.
A user in the Security Navigator module represents an ODI user and corresponds to the
login name used for connecting to a repository. A user inherits the following rights:
- All privileges granted to its various profiles
- Privileges on objects or instances given to this user
An authorization by the user is placed on a method of an object for a given user. It allows the
user to be given, either optionally or automatically, the right to this object through the
method.
A User Method is a privilege granted to a user on a method of an object type. Each granted method allows the user to perform an action (edit, delete, and so forth) on instances of an object type (project, model, datastore, and so forth). These methods are similar to the Profile Methods, but are applied to users.
Note: It is possible to grant users privileges on instances in the specific work repositories where these instances exist. For example, you may grant a developer user the edit privilege on the LOAD_DATAWAREHOUSE scenario in a DEVELOPMENT repository and not in a PRODUCTION repository. An authorization by object instance is granted to a user on an object instance. It allows certain methods of this object instance to be granted to this user.
ODI Objects
By using the Security Navigator tab, you can manage security in ODI. The Security Navigator
module allows ODI users and profiles to be created. It is used to assign user rights for methods
(edit, delete, and so on) on generic objects (data server, data types, and so on), and to fine-tune
these rights on the object instances (Server 1, Server 2, and so on).
The Security Navigator stores this information in a master repository. This information can be used
by all the other modules.
Security Navigator objects available to the current user are organized into these tree views:
- The objects, describing each ODI element type (datastore, model, and so on)
- The user profiles, users, and their authorizations
You can perform the following operations in each tree view:
- Insert or import root objects to the tree view by clicking the appropriate button in the
frame title.
- Expand and collapse nodes by clicking them.
- Activate the methods associated with the objects (Edit, Delete, and so on) through the
pop-up menus.
- Edit objects by double-clicking them or by dragging them on the Workbench.
The windows for the object being edited or displayed appear in the Workbench.
Note: Each tree view appears in floatable frames that may be docked to the sides of the main
window. These frames can also be stacked up. When several frames are stacked up, tabs appear
at the bottom of the frame window to access each frame of the stack. Tree view frames can be
moved, docked, and stacked by selecting and dragging the frame title or tab. To lock the position
of views, select Lock window layout from the Windows menu. If a tree view frame does not appear
in the main window or has been closed, it can be opened by using the Windows > Show View
menu.
Generic profiles:
Have the Generic privilege option selected for all object methods
A user with such a profile is by default authorized for all methods of all instances of an object.
Nongeneric profiles:
Do not have the Generic privilege option selected for all object methods; rights must be granted per instance.
Generic profiles have the Generic privilege option selected for all object methods. This implies that
a user with such a profile is by default authorized for all methods of all instances of an object to
which the profile is authorized.
Nongeneric profiles are not by default authorized for all methods on the instances because the
Generic privilege option is not selected for all object methods. The administrator must grant the
user the rights on the methods for each instance.
If the security administrator wants a user to have the rights on all instances of an object type by
default, the user must be given a generic profile.
If the security administrator wants a user to have the rights on no instance by default, but wants to
grant the rights by instance, the user must be given a nongeneric profile.
ODI has some built-in profiles that the security administrator can assign to the users he or she
creates. This slide shows some built-in profiles delivered with ODI. For a complete list of built-in
profiles, see the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator 11g
Release 1 (11.1.1).
CONNECT: Must be granted with another profile
DESIGNER: Use this profile for users who will work mainly on projects.
METADATA_ADMIN: Use this profile for users who work mainly on models.
NG_METADATA_ADMIN: Nongeneric version of the METADATA_ADMIN profile
OPERATOR: Use this profile for production users.
REPOSITORY_EXPLORER: Use this profile for users who do not need to modify objects.
SECURITY_ADMIN: Use this profile for security administrators.
Allow selected methods in all repositories.
Allow selected methods in selected repositories.
As shown in the previous slide, you have to grant users authorizations on each object instance. To grant an authorization by object instance to a user:
1. In Security Navigator, expand the Users accordion. In the Designer, Operator, or Topology
Navigator, expand the accordion containing the object onto which you want to grant
privileges. Select this object and then drag it on the user in the Users accordion. Click Yes.
The authorization by object instance editor appears.
2. This editor shows the list of methods available for this instance and the instances contained in it. For example, if you grant privileges on a project instance, the folders, interfaces, and so forth contained in this project will appear in the editor. Fine-tune the privileges granted per object and method. You may want to implement the following simple privilege policies on methods that you select from the list:
- To grant all these methods in all repositories, click Allow selected methods in all
repositories.
- To deny all these methods in all repositories, click Deny selected methods in all
repositories.
- To grant all these methods in certain work repositories, click Allow selected methods in
selected repositories and then select the repositories from the list. Click OK.
Graphic: External Authentication, configured through the jps-config.xml file and an enterprise directory managed with Oracle Directory Services Manager.
By default, ODI stores all the user information, as well as the users' privileges, in the master repository. A user who logs in to ODI is authenticated against the master repository. This authentication method is called Internal Authentication. ODI can optionally use Oracle Platform Security Services
(OPSS) to authenticate its users against an external Identity Store, which contains enterprise
users and passwords. Such an identity store is used at the enterprise level by all applications, to
have centralized username and password definitions and Single Sign-On (SSO). In such a
configuration, the repository contains only references to these enterprise users. This
authentication method is called External Authentication.
By default, ODI stores all security information in the master repository. This password storage option is called Internal Password Storage. ODI can optionally use Java Platform Security (JPS) for storing critical security information. If you are using JPS with ODI, the data server
passwords and contexts are stored in the JPS Credential Store Framework (CSF). This password
storage option is called External Password Storage.
Note: When using External Password Storage, other security details such as usernames, passwords, and privileges remain in the master repository.
Note: When you perform a password storage recovery, context and data server passwords are
lost and need to be reentered manually in the Topology Navigator. If you are using External
Authentication, usernames and passwords are externalized. ODI privileges remain within the
repository. Data servers and context passwords also remain in the master repository. You can
externalize data server and context passwords by using the External Password Storage feature.
To use the External Authentication option, you need to configure an enterprise identity store (LDAP, Oracle Internet Directory, and so on), and configure each ODI component to refer to it by default. The configuration to connect to and use the identity store is contained in an OPSS configuration file called jps-config.xml.
Copy this file into the ODI_HOME/client/odi/bin/ directory. The Studio reads the identity
store configuration and authenticates against the configured identity store. If you want to locate
this file in a different location, edit the ODI_HOME/client/odi/bin/odi.conf file and edit the
option that sets the location of the configuration file. This option is set in the following line:
AddVMOption -Doracle.security.jps.config=./jps-config.xml.
Copy the jps-config.xml file also to the ODI_HOME/agent/bin/ directory. The agent and
the command line scripts will authenticate against the configured identity store.
This graphic depicts the options of using either the Embedded LDAP Server, which runs on the
ODI WebLogic Server domain, or the External Directory Server.
The applications use the identity, policy, and credential stores configured in the domain in which
they run.
A credential store is a repository of security data (credentials) that certify the authority of users,
Java components, and system components. In a credential store, data is used during
authentication, and then, during authorization, when determining what actions the subject can
perform. For more information about using the policy and credential store, refer to the Oracle Fusion Middleware Security Guide 11g Release 1 (11.1.1).
Graphic: High-level OID architecture. A WebLogic Server domain hosts Oracle Directory Services Manager; Oracle Internet Directory, managed and monitored by OPMN, stores its data in an Oracle Database.
OID is an LDAP v3 directory service that leverages the scalability, high availability, and security
features of Oracle Database. It serves as the central user repository for Oracle Access Manager
and other Oracle applications. This graphic depicts the high-level architecture of an OID
installation.
OID stores user data in an Oracle Database. It is recommended that a separate and dedicated
database for OID be used. The database may or may not be on the same host.
Oracle Directory Server is the component that actually services the directory requests. Directory
server instances listen to requests from the LDAP clients, fetch information from the database, and
return the data to the clients.
Oracle Directory Services Manager (ODSM) provides a GUI management application for OID.
This runs on a managed server within the WebLogic Server OID domain (called IDMDomain by
default). ODSM is the only OID component that runs on WebLogic Server.
Fusion Middleware Control can also be used to manage OID. Oracle Process Manager and
Notification Server (OPMN) manages and monitors OID.
For more information about OID, see the Oracle Fusion Middleware Administrator's Guide for
Oracle Internet Directory 11g Release 1 (11.1.1).
To run the script (odi_credtool.cmd) to set up the credentials for the identity store, navigate to
the c:\<ODI_Home>\oracledi\client\odi\bin directory and then execute the
odi_credtool.cmd command. Ignore possible warnings. The input entries are as follows:
Map: oracle.odi.credmap
Key: <OID Realm>
Username: <OID Administrator User Name>
Password: <OID Administrator Password>
To create a new master repository referencing a user in the external LDAP Server:
1. For the new master repository, create a new RDBMS schema/user (Oracle Database 11g) with Connect and Resource privileges.
2. Start Master Repository Creation Wizard, and then enter the necessary Database
connection information.
3. Select Use External Authentication and provide new Supervisor username and password.
These are the User and Password created in OID (or other external LDAP server).
4. To specify the type of password storage, select Internal or External Password Storage. Click Finish.
A user created in the LDAP server
You need to create a new connection to your new master repository. Configure Repository
Connections with necessary parameters. In the Oracle Data Integrator Connection section, enter
the User and Password of the authenticated user in your external LDAP store.
Switching the authentication mode of the ODI repository changes the way users authenticate. This
operation must be performed by a Supervisor user.
When switching from an External to an Internal authentication, user passwords are not copied
from the identity store to the repository. The passwords are nullified. All the user accounts are
marked as expired and must be reactivated by a SUPERVISOR that is created during the switch.
When switching from an Internal to an External authentication, the users that exist in the repository
and match a user in the identity store are automatically mapped. Users that do not match a user in
the identity store are disabled. A Supervisor must edit the users so that their name has a match in
the identity store.
Use the Switch Authentication Mode wizard to change the user authentication mode.
Note: Before launching the Switch Authentication Mode wizard, perform the following tasks:
Disconnect Oracle Data Integrator Studio from the repository.
Shut down every component that uses the ODI repository.
The next action varies depending on the current Authentication Mode in use:
If currently using Internal Authentication, you are prompted to switch to external
authentication.
If currently using External Authentication, you are prompted to switch to internal
authentication. You must provide and confirm a password for the SUPERVISOR user that the
wizard will automatically create in the repository.
Click Finish.
After switching to Internal authentication, you can reactivate all ODI users that have been
deactivated during the switch.
To reactivate a User:
1. In Security Navigator, expand the Users accordion. Select the user that you want to
reactivate from the list of users. Right-click and select Edit. The User editor appears.
2. Deselect Allow Expiration Date. If you want to set a password for this user, click Change
Password and enter the new password for this user.
3. From the File main menu, select Save.
If you have switched from internal to external authentication, you can reconnect to the ODI
repository as one of the users with supervisor privileges and re-enable the ODI users that have
been disabled during the switch.
1. In Security Navigator, expand the Users accordion. Select the user that you want to re-
enable from the list of users. Right-click and select Open. The User editor appears.
2. In the Name field, enter a username that matches the login of an enterprise user in the
identity store. Click Retrieve GUID. If the username has a match in the identity store, this
external user's GUID appears in the External GUID field.
3. Click the Save button.
ODI stores by default all security information in the master repository. This is internal password
storage.
To use the external password storage option, you need to install a WebLogic Server instance
configured with JPS, and all ODI components (including the Runtime Agent) need to have access
to the remote credential store.
When using JPS with ODI, the data server passwords and contexts are stored in the JPS
Credential Store Framework (CSF).
For more information, refer to Configuring Applications to Use OPSS in the Oracle Fusion Middleware Application Security Guide 11g Release 1 (11.1.1).
Answer: d
Explanation: When switching from an External to an Internal authentication, user passwords are
not copied from the identity store to the repository. The passwords are nullified. All the user
accounts are marked as expired and must be reactivated by a SUPERVISOR that is created
during the switch.
Practices
This lesson provides an overview of managing ODI security. You also learn how to integrate
WebLogic Server and Enterprise Manager with ODI.
The following slides explain some of the integration strategies that are used in Oracle Data
Integrator. They are grouped into two families:
Strategies with Staging Area on the Target
Strategies with the Staging Area Different from the Target
An integration process uses an integration strategy that defines the steps required in the
integration process. Examples of integration strategies are:
Append: Optionally delete all records in the target datastore and insert all the flow into the
target.
Control Append: Optionally delete all records in the target datastore and insert all the flow
into the target. This strategy includes an optional flow control.
Incremental Update: Optionally delete all records in the target datastore. Identify new and
existing records by comparing the flow with the target, and then insert new records and
update existing records in the target. This strategy includes an optional flow control.
Slowly Changing Dimension: This strategy implements a Type 2 Slowly Changing Dimension by identifying fields that require a simple update in the target record when changed, and fields that require the previous record state to be historized.
This phase may involve one single server when the staging area and the target are located in the
same data server. It may involve two servers when the staging area and target are on different
servers.
The Append strategy inserts the incoming data flow into the target datastore, possibly deleting the content of the target beforehand.
1. Delete (or truncate) all records from the target table. This step usually depends on a KM option.
2. Transform and insert data from sources located on the same server and from loading tables in the staging area.
This strategy inserts the incoming data flow into the target datastore, possibly deleting the content
of the target beforehand.
This integration strategy includes the following steps:
1. Delete (or truncate) all records from the target table. This step usually depends on a KM
option.
2. Transform and insert data from sources located on the same server and from loading tables in the staging area. When dealing with remote source data, LKMs will have already prepared loading tables. Sources on the same server can be read directly. The integration operation will be a direct INSERT/SELECT statement containing all the transformations performed on the staging area in the SELECT clause and all the transformations performed on the target in the INSERT clause.
3. Commit the transaction. The operations performed on the target should be done within a
transaction and committed after they are all complete. Note that committing is typically
triggered by a KM option called COMMIT.
Note: The same integration strategy can be obtained by using the Control Append strategy and
not choosing to activate flow control.
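As a rough illustration, the SQL generated by an Append IKM reduces to a delete followed by a single set-based insert. The table and column names below are hypothetical; C$_0TRG_CUSTOMER follows ODI's default naming for loading tables:
-- Step 1: empty the target (typically controlled by a KM option).
DELETE FROM TRG_CUSTOMER;
-- Step 2: one set-based INSERT/SELECT; transformations run in the
-- SELECT clause against sources and C$ loading tables.
INSERT INTO TRG_CUSTOMER (CUST_ID, CUST_NAME, CITY)
SELECT C.CUST_ID,
       UPPER(C.CUST_NAME),   -- transformation executed on the staging area
       C.CITY
FROM   C$_0TRG_CUSTOMER C;   -- loading table prepared by the LKM
-- Step 3: commit (typically controlled by the COMMIT KM option).
COMMIT;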
This approach can be improved by adding extra steps that will store the flow data in an integration
table ("I$"), and then call the CKM to isolate erroneous records in the error table ("E$").
The Control Append integration strategy includes the following steps:
1. Drop (if it exists) and create the integration table in the staging area. This is created with the
same columns as the target table so that it can be passed to the CKM for flow control.
2. Insert data into the integration table from the sources and loading tables by using a single INSERT/SELECT statement similar to the one loading the target in the Append strategy.
3. Call the CKM for flow control. The CKM will evaluate every constraint defined for the target
table on the integration table data. It will create an error table and insert the erroneous
records into this table. It will also remove erroneous records from the integration table. After
the CKM completes, the integration table will contain only valid records. Inserting them into
the target table can then be done safely.
4. Remove all records from the target table. This step can be
made dependent on an option value that is set by the
designer of the interface.
5. Append the records from the integration table to the target
table in a single INSERT/SELECT statement.
6. Commit the transaction.
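The following sketch shows the kind of statements such an IKM generates, assuming a hypothetical TRG_CUSTOMER target with columns CUST_ID, CUST_NAME, and CITY; the I$_ and E$_ names follow ODI's default prefixes:
-- Step 1: re-create the integration table with the target's structure.
CREATE TABLE I$_TRG_CUSTOMER AS
SELECT * FROM TRG_CUSTOMER WHERE 1 = 0;
-- Step 2: populate it with the transformed flow.
INSERT INTO I$_TRG_CUSTOMER (CUST_ID, CUST_NAME, CITY)
SELECT C.CUST_ID, UPPER(C.CUST_NAME), C.CITY
FROM   C$_0TRG_CUSTOMER C;
-- Step 3: the CKM checks every target constraint against the I$ table,
-- moving erroneous rows to E$_TRG_CUSTOMER and deleting them from I$.
-- Steps 4-6: empty the target, append the validated rows, and commit.
DELETE FROM TRG_CUSTOMER;
INSERT INTO TRG_CUSTOMER (CUST_ID, CUST_NAME, CITY)
SELECT CUST_ID, CUST_NAME, CITY FROM I$_TRG_CUSTOMER;
COMMIT;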
The Incremental Update strategy is used to integrate data in the target table by comparing the
records of the flow with existing records in the target according to a set of columns called the
"update key". Records that have the same update key are updated when their associated data is
not the same. Those that do not yet exist in the target are inserted. This strategy is often used for
dimension tables when there is no need to keep track of the records that have changed.
The challenge with such Integration Knowledge Modules (IKMs) is to use set-oriented SQL-based
programming to perform all operations rather than using a row-by-row approach that often leads to
performance issues. This method is described in the following:
1. Drop (if it exists) and create the integration table in the staging area. This is created with the
same columns as the target table so that it can be passed to the CKM for flow control. It also
contains an IND_UPDATE column that is used to flag the records that should be inserted ("I")
and those that should be updated ("U").
2. Transform and insert data into the integration table from the sources and loading tables by using a single INSERT/SELECT statement. The IND_UPDATE column is set by default to "I".
3. Recycle the rejected records from the previous run to the integration table if the
RECYCLE_ERROR KM option is selected.
4. Call the CKM for flow control. The CKM will evaluate every constraint defined for the target
table on the integration table data. It will create an error table and insert the erroneous
records into this table. It will also remove erroneous records from the integration table.
5. Update the integration table to set the IND_UPDATE flag to "U" for all the records that have
the same update key values as the target ones. Therefore, records that already exist in the
target will have a "U" flag. This step is usually an UPDATE/SELECT statement.
6. Update the integration table again to set the IND_UPDATE column to "N" for all records that
are already flagged as "U" and for which the column values are exactly the same as the
target ones. As these flow records match exactly the target records, they do not need to be
used to update the target data. After this step, the integration table is ready for applying the
changes to the target as it contains records that are flagged:
- "I": These records should be inserted into the target.
- "U": These records should be used to update the target.
- "N": These records already exist in the target and should be ignored.
7. Update the target with records from the integration table that are flagged as "U". Note that
the update statement is typically executed before the INSERT statement to minimize the
volume of data manipulated.
8. Insert records into the integration table that are flagged as "I" into the target.
9. Commit the transaction.
10. Drop the temporary integration table.
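A condensed sketch of steps 5 through 9 in SQL, using a hypothetical TRG_CUSTOMER target whose update key is CUST_ID (real IKMs also handle NULL values in these comparisons):
-- Step 5: flag rows that already exist in the target.
UPDATE I$_TRG_CUSTOMER I SET IND_UPDATE = 'U'
WHERE EXISTS (SELECT 1 FROM TRG_CUSTOMER T WHERE T.CUST_ID = I.CUST_ID);
-- Step 6: neutralize rows whose values are identical to the target.
UPDATE I$_TRG_CUSTOMER I SET IND_UPDATE = 'N'
WHERE IND_UPDATE = 'U'
AND EXISTS (SELECT 1 FROM TRG_CUSTOMER T
            WHERE T.CUST_ID = I.CUST_ID
            AND   T.CUST_NAME = I.CUST_NAME
            AND   T.CITY = I.CITY);
-- Step 7: update the target from the "U" rows.
UPDATE TRG_CUSTOMER T
SET (CUST_NAME, CITY) =
    (SELECT I.CUST_NAME, I.CITY FROM I$_TRG_CUSTOMER I
     WHERE I.CUST_ID = T.CUST_ID AND I.IND_UPDATE = 'U')
WHERE EXISTS (SELECT 1 FROM I$_TRG_CUSTOMER I
              WHERE I.CUST_ID = T.CUST_ID AND I.IND_UPDATE = 'U');
-- Step 8: insert the "I" rows, then commit (step 9).
INSERT INTO TRG_CUSTOMER (CUST_ID, CUST_NAME, CITY)
SELECT CUST_ID, CUST_NAME, CITY FROM I$_TRG_CUSTOMER
WHERE IND_UPDATE = 'I';
COMMIT;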
A Type 2 SCD is one of the most well-known Data Warehouse loading strategies. It is often used
for loading dimension tables, to keep track of changes that occurred on some of the columns. A
typical SCD table would contain the following columns:
A surrogate key that is calculated automatically. This is usually a numeric column that
contains an autonumber such as an identity column, a rank function, or a sequence.
A natural key: the list of columns that represent the actual primary key of the operational system.
Columns that may be overwritten on change
Columns that require the creation of a new record on change
A start date column indicating when the record was created in the Data Warehouse
An end date column indicating when the record became obsolete (closing date)
A current record flag indicating whether the record is the actual one (1) or an old one (0)
This slide shows an example of the behavior of the product SCD. In the operational system, a
product is defined by its ID that acts as a primary key. Every product has a name, a size, a
supplier and a family. In the Data Warehouse, you need to store a new version of this product
whenever the supplier or the family is updated in the operational system.
In this example, the product dimension is first initialized in the Data Warehouse on March 12, 2006. All the records are inserted and assigned a calculated surrogate key, and the ending date is set to January 1, 2400. Because these records represent the current state of the operational system, their current record flag is set to 1.
After the first load, the following changes happen in the operational system: The supplier is
updated for product P1; The family is updated for product P2; The name is updated for product P3;
Product P5 is added.
These updates have the following impact on the Data Warehouse dimension:
1. The update of the supplier of P1 is translated into the creation of a new current record
(Surrogate Key 5) and the closing of the previous record (Surrogate Key 1).
2. The update of the family of P2 is translated into the creation of a new current record
(Surrogate Key 6) and the closing of the previous record (Surrogate Key 2).
3. The update of the name of P3 updates the target record with Surrogate Key 3.
4. The new product P5 is translated into the creation of a new current record (Surrogate Key 7).
To create a Knowledge Module that implements this behavior, it is necessary to know which
columns act as a surrogate key, a natural key, a start date, and so on. Oracle Data Integrator
stores this information in the Slowly Changing Dimension Behavior field on the Description tab for
every column in the model.
When populating such a datastore in an interface, the IKM has access to this metadata by using
the SCD_xx selectors on the getColList() substitution method.
Note
There may be some cases where the SQL produced requires further tuning and optimization. For
more information, refer to Oracle Fusion Middleware Knowledge Module Developer's Guide for
Oracle Data Integrator 11g Release 1 (11.1.1).
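To make the mechanics concrete, the following is a minimal sketch of the two set-based operations at the heart of a Type 2 SCD IKM, using a hypothetical DIM_PRODUCT dimension; the shipped IKMs derive the column lists from the SCD metadata (through getColList() with selectors such as SCD_NK or SCD_END) rather than hardcoding them as shown here:
-- Close the current version of each changed record: set the end date
-- and clear the current record flag.
UPDATE DIM_PRODUCT T
SET T.END_DATE = SYSDATE, T.CURRENT_FLAG = 0
WHERE T.CURRENT_FLAG = 1
AND EXISTS (SELECT 1 FROM I$_DIM_PRODUCT I
            WHERE I.PRODUCT_ID = T.PRODUCT_ID     -- natural key
            AND  (I.SUPPLIER <> T.SUPPLIER        -- "add row on change"
               OR I.FAMILY   <> T.FAMILY));       -- columns
-- Insert the new current version with a fresh surrogate key.
INSERT INTO DIM_PRODUCT
  (PRODUCT_SK, PRODUCT_ID, NAME, SUPPLIER, FAMILY,
   START_DATE, END_DATE, CURRENT_FLAG)
SELECT DIM_PRODUCT_SEQ.NEXTVAL, I.PRODUCT_ID, I.NAME, I.SUPPLIER, I.FAMILY,
       SYSDATE, TO_DATE('01-01-2400', 'DD-MM-YYYY'), 1
FROM I$_DIM_PRODUCT I
WHERE I.IND_UPDATE = 'I';   -- rows flagged as new versions by earlier steps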
There are some cases when the source is a single file that can be loaded directly into the target table by using the most efficient method. By default, ODI suggests locating the staging area on the target server, using an LKM to stage the source file in a loading table, and then using an IKM to integrate the loaded data into the target table.
If the source data is not transformed, the loading phase is not necessary.
In this situation, you would use an IKM that directly loads the file data to the target. This requires setting the staging area on the source file logical schema. By doing this, ODI automatically suggests using a "Multi-Connection" IKM that moves data between a remote staging area and the target.
Such an IKM would use a loader, and include the following steps:
1. Generate the appropriate load utility script.
2. Run the loader utility.
An example of such a KM is the IKM File to Teradata (TTU).
When the target is a file or a JMS queue, the IKM moves the data from the staging area to this target. The method to perform this data movement depends on the target technology. For example, it is possible to use the agent or specific features of the target (such as a Java API). Typical steps of such an IKM include:
1. Reset the target file or queue. This step can be made dependent on an option.
2. Unload the data from the staging area to the file or queue.
A business process is a set of coordinated tasks and activities, involving both human and system participants.
Bulk data processing is the processing of a batch of discrete records between participating
applications. Oracle Application Integration Architecture (AIA) uses Oracle Data Integrator to
perform bulk data integrations.
This slide provides an overview of AIA-approved design patterns for AIA ODI architecture.
Because ODI data transfer is always point-to-point, the source and target systems must be
capable of processing batch loads of data. An integration project should not adopt ODI as a
solution if there is a limitation in the number of rows that can be processed on either the source-side or the target-side application. There are four AIA-approved design patterns for using ODI
with AIA architecture:
Initial data loads: The initial set of data of a particular object is loaded from the source
database to the target database. For example, loading Customer Account information or
loading Invoice information into a new application database from an existing source
application database. In the process, XREF data may or may not get established depending
on the integration requirement.
High-volume transactions with XREF table: The interface tables on the source side have
a mechanism to indicate processed rows. Whenever a need arises for a high-volume data
transfer, AIA recommends using the Oracle Data Integrator solution for data transfer
between applications. Using this approach, the Oracle Data Integrator package transfers
data from source to target system on a regular basis. AIA recommends that the interface
tables on the source side have a mechanism to indicate processed rows.
High-volume transactions without XREF table: When maintaining cross-reference data for high-volume transactions does not make sense, AIA recommends using point-to-point integration
by using Oracle Data Integrator. For example, the headquarters of a retail store chain
receives data from individual retail stores every day at the close of business. In this scenario,
you need not store XREF data between each individual local store with HQ because there
are not any DML operations on those datasets. Local enterprise resource planning (ERP)
applications load their interface table and invoke an Oracle Data Integrator package to send
data to the HQ Interface table. After the data is processed, Oracle Data Integrator updates
the local ERP application's Interface table with a Transferred or Processed status for each
row.
For details, refer to Using Oracle Data Integrator for Bulk Processing in the Oracle Fusion Middleware documentation for Oracle Application Integration Architecture.
When you create or update objects in one application, you may also want to propagate the
changes to the other applications. For example, when a new customer is created in an SAP
application, you may want to create a new entry for the same customer in your Oracle E-Business
Suite application named EBS. However, the applications that you are integrating may be using
different entities to represent the same information. For example, for each new customer in an
SAP application, a new row is inserted in its Customer database with a unique identifier such as
SAP_001. When the same information is propagated to Oracle E-Business Suite and Siebel
applications, the new row should be inserted with different identifiers such as EBS_1001 and
SBL001. In such cases, you need some type of functionality to map these identifiers with each
other so that they can be interpreted by different applications to be referring to the same entity.
This can be done by using cross-references.
The identifier mapping is also required when information about a customer is updated in one
application and the changes must be propagated to other applications. You can integrate different
identifiers by using a common value integration pattern, which maps to all identifiers in a cross-
reference table. For example, you can add one more column named Common to the cross-
reference table shown in the slide.
Cross-referencing is an Oracle Fusion Middleware function, available through the Mediator
component, and leveraged typically by any loosely coupled integration that is built on the Service
Oriented Architecture (SOA) principles. It is used to manage the run-time correlation between the
various participating applications of the integration.
In ODI, the creation of XREF data is a two-step process. Each step is an interface.
In the first interface, the user's source table is the source in Oracle Data Integrator and the user's
target table is the target in Oracle Data Integrator.
While transporting data from the source table to the target table, create the XREF data for the
source and common rows. In this process, if you want to populate any column of the target with
the COMMON identifier, the Oracle Data Integrator knowledge module takes care of that too.
In the second step, after data is posted from the interface table to the base table, the target
application must identify a datastore where the mapping between target identifier and common (or
source) identifier that you have sent during previous interface processing is available.
A second interface must be run in which this datastore is the source in Oracle Data Integrator and
the XREF table is the target. This creates the appropriate target columns in the XREF table.
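A simplified illustration of the two interfaces' effect, using a hypothetical column-per-application XREF table (the real AIA cross-reference store is generic; this layout is only for readability):
-- Hypothetical cross-reference table: one row per business entity.
CREATE TABLE XREF_CUSTOMER
( COMMON_ID VARCHAR2(30),   -- generated common identifier
  SAP_ID    VARCHAR2(30),   -- identifier in the SAP application
  EBS_ID    VARCHAR2(30),   -- identifier in Oracle E-Business Suite
  SBL_ID    VARCHAR2(30) ); -- identifier in Siebel
-- Interface 1: while moving the source data, record the source and
-- common identifiers.
INSERT INTO XREF_CUSTOMER (COMMON_ID, SAP_ID)
VALUES ('COMMON_001', 'SAP_001');
-- Interface 2: after the target application assigns its own key,
-- complete the row with the target identifier.
UPDATE XREF_CUSTOMER SET EBS_ID = 'EBS_1001'
WHERE COMMON_ID = 'COMMON_001';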
Note
Because XREF column names cannot be hardcoded, two variables must be defined to hold the source and target column names. Normally, these column names are derived from the AIA Configuration file. This section does not describe how to get them from the XML but rather how to refresh them from a SQL SELECT statement.
Batch:
Data is loaded in full or incrementally during an off-peak window
Capture: filter query
Mini-batch:
Data is loaded incrementally using intraday loads
Capture: filter query
Various architectures for collecting transactional data from operational sources have been used to
populate Data Warehouses. These techniques vary mostly in the latency of data integration, from
daily batches to continuous real-time integration. The capture of data from sources is either
performed through incremental queries that filter based on a timestamp or flag, or through a CDC
mechanism that detects any changes as they are happening. Architectures are further
distinguished between pull and push operations, where a pull operation polls in fixed intervals for
new data, while in a push operation data is loaded into the target once a change appears.
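For example, a pull-style filter-query capture is often just a predicate on a timestamp column; the table and column names below are hypothetical, and the window bounds would typically be held in ODI variables:
-- Pull only the rows changed since the last extraction.
SELECT *
FROM   ORDERS
WHERE  LAST_UPDATE_DATE >  :LAST_EXTRACT_DATE
AND    LAST_UPDATE_DATE <= :CURRENT_EXTRACT_DATE;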
A daily batch mechanism is most suitable if intraday freshness is not required for the data, such as
longer-term trends or data that is only calculated once daily (for example, financial close
information). Batch loads might be performed in a downtime window, if the business model does not require 24-hour availability of the Data Warehouse.
Graphic: Oracle GoldenGate overview.
- Heterogeneous: moves changed data across different databases and platforms
- Extensibility and flexibility: meets a variety of customer needs and data environments with an open, modular architecture
- High availability: open and active standby
- Zero-downtime upgrades and migrations to new database, OS, or hardware
- Log-based, real-time change data capture feeding live reporting (ODS, EDW), operational BI (ETL to EDW), and transactional data integration (BPM, BAM, CEP, message bus), bidirectionally over TCP/IP between source and target database(s)
Oracle GoldenGate consists of decoupled modules that are combined to create the best possible
solution for your business requirements.
On the source system(s):
Oracle GoldenGate's Capture (Extract) process reads data transactions as they occur, by reading the native transaction log, typically the redo log. Oracle GoldenGate moves only changed, committed transactional data, which is only a percentage of all transactions, and therefore operates with extremely high performance and very low impact on the data infrastructure.
Filtering can be performed at the source or target: at table, column, or row level.
Transformations can be applied at the capture or delivery stages.
Advanced queuing (trail files):
To move the transactional data efficiently and accurately across systems, Oracle GoldenGate converts the captured data into an Oracle Canonical Format in trail files. With both source and target trail files, Oracle GoldenGate's unique architecture eliminates any single point of failure and ensures that data integrity is maintained, even in the event of a system error or outage.
Graphic: GoldenGate feeding the ODI CDC framework. The Extract process captures changes from the transactional RDBMS source tables into source trail files; a Datapump process ships them across the WAN to staging trail files; the Replicat process applies them to replicated tables in a staging schema on the target database, where the ODI CDC framework picks them up and ODI interfaces load the target tables.
Oracle GoldenGate captures and replicates changes from the source system, typically into a staging schema on the target Data Warehouse platform. This staging schema contains a replicated copy of the source table and the structures that are used by ODI's own Changed Data Capture (CDC) framework. ODI then picks up the changes in the structures of the ODI CDC framework, transforms the data, perhaps joins it to look up other tables, and then loads it into the target schema. This approach does not require ODI to be changed in any way, and uses GoldenGate CDC with the regular, existing ODI CDC framework.
To initialize the CDC process with Oracle GoldenGate, you create a package consisting of a perpetual loop that is triggered by the arrival of new journal entries and processes the new changes with the CDC Load Customer interface. Execute this package to initialize ODI CDC with GoldenGate.
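Such a loop is typically built around the OdiWaitForLogData tool, which polls the journalized tables until changes are available. A sketch of the call follows; the logical schema and subscriber names are hypothetical, and the exact parameter list should be verified against the ODI Tools reference:
OdiWaitForLogData -LSCHEMA=CUSTOMER_MODEL -SUBSCRIBER_NAME=CDC_SUBSCRIBER -GLOBAL_ROW_COUNT=1
When the tool detects changes, the package runs the CDC Load Customer interface and then loops back to this step.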
In many ODI objects (procedure, variables, interfaces, and so on), you can add expressions and
SQL codes. A very common mistake is to enter qualified object names, as in the following
example that drops a TBL_001 table in the staging schema MY_SCHEMA:
DROP TABLE MY_SCHEMA.TBL_001
If you run this code in the production environment where the staging schema is called
PRD_MY_SCHEMA, your code does not work. Note that the schema names are defined in the
topology, and the contexts would handle getting the right schema name depending on the
execution context. The substitution methods (also known as the odiRef API) exist to leverage in your code the metadata stored in ODI and make the code context-independent.
Using substitution methods, the generic code would be:
DROP TABLE <%=odiRef.getObjectName("L", "TBL_001", "W")%>
With this code, the qualified table name is generated according to the context in which you run the code.
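Extending the example, a procedure step can create the work table in a context-independent way; TBL_001 and its columns are hypothetical, while getObjectName is the documented odiRef call:
-- Create a work table in whatever staging schema the context resolves to.
CREATE TABLE <%=odiRef.getObjectName("L", "TBL_001", "W")%>
( CUST_ID   NUMBER,
  CUST_NAME VARCHAR2(100) )
At run time, this resolves to MY_SCHEMA.TBL_001 in the development context and to PRD_MY_SCHEMA.TBL_001 in production, with no change to the code.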
Procedures allow you to perform very complex actions, including SQL statements. In addition,
they allow you to use source and target connections, and support data binding. In short, you can
move data by using procedures.
Developers who feel at ease with SQL may be tempted to code their transformations and data
movements within procedures instead of using interfaces.
There are several issues with using procedures in such a way:
Procedures contain manual code that needs to be maintained manually.
Procedures do not maintain cross-references to other ODI artifacts such as datastores,
models, and so on, making their maintenance very complex compared to interfaces.
Procedures should never be used to move and transform data. These operations should be
performed by using interfaces.
Data integration project developers sometimes do not take into account the quality of their data. This is a common mistake, because the data integration process may then move and transform erroneous data, and propagate it across the applications.
ODI allows you to enforce the quality of source data by using static checks, and the quality of the data before it is pushed into a target by using flow checks. By using both of these checks, you can make sure that the quality of the data is improved or enforced as the data is moved and transformed.
In a package, you can sequence any number of steps. Every single step in a package may fail for
some reason (for example, target or source database is down). You must always consider these
different error cases when designing packages.
The choice of KM is critical when designing interfaces: it determines the features available and the performance of an interface.
Do not begin your interfaces with complex KMs. Choosing, for example, technology-specific LKMs that use loaders may lead to a nonworking interface if the loader configuration or installation is not correct. A safe choice when starting to work is to use generic KMs (usually SQL KMs) that work in most cases. After designing your first interfaces with these KMs, you can start switching to technology-specific KMs (read their descriptions first) and leverage all the features of these KMs.
Avoid over-engineering interfaces: KMs with extra features have an extra cost in terms of performance. For example, performing a simple insert is faster than performing an incremental update. If you are deleting the data in the target before integration, using incremental update is over-engineering and causes some performance loss. Use the KM that fits your needs.
Similarly, activating or deactivating some of the KM features may add extra steps that degrade performance. The default KM options are sufficient for running the KM out of the box. After running the KM with default options, it is always good to review the options and see whether some of them can be changed for your needs. The KM and option descriptions are good documentation for understanding how to optimize KM usage.
Answer: a
Explanation: This technology does not require ODI to be changed in any way, and uses
GoldenGate CDC with the regular existing ODI CDC framework.