View Based ETL
Version 1.1

1. Document History
Scope
Purpose
Background
Data Flow
UTL_DML.LARGE_MERGE (Oracle)
CDC Managed Interfaces

1. Document History
Date        Revision  Author        Reason/Description
05/23/2019  1.0       Ron Behrendt  Initial Draft

This document is electronically controlled. Printed copies are considered uncontrolled.


Refer to GBSRCS for the official version of this document
Medtronic Confidential

75214-TMP, v3.0
Scope
This document describes an ETL method that uses views as the source for loading target
tables. The intended audience is ETL developers.

Purpose
The purpose of this document is to set a standard method for refreshing tables based on views
in the same database platform. This is an alternative to Informatica workflows and materialized
views.

Background
This method is designed for small and medium-sized data sets (fewer than 10 million records)
where the daily refresh is a full (complete) refresh. Traditionally, such tables are refreshed by
Informatica or created in Oracle as materialized views.
When the ETL logic is light and stays within Oracle, Informatica only adds overhead, because all
the data must be transferred to and from the Informatica servers. A refresh done via PL/SQL
keeps the data within Oracle.
Oracle materialized views also keep the data within Oracle, but they are inefficient when the
daily refresh method is "complete". A complete refresh deletes and re-inserts the entire data
set. Because every record is processed, the target table cannot support CDC or carry accurate
timestamps. Excessive writes are also incurred on the target database even if no data changed.
Fast refreshes for materialized views require materialized view logs on all source tables, which
is not always feasible.

This method can be used to refresh tables within the same database or over DB links.

Data Flow

Source View: The source view must have the same column names as the target table. Define the
view as close to the source tables as possible: create it in the database that contains the
source tables.
The view must be written so that the primary key columns are guaranteed not to contain
duplicate values.
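
As a minimal sketch of such a view: the source table XDS_MAIN.TIME_DIM and all column names below are hypothetical and are not taken from the actual system. The view aliases each column to the name used in the target table and groups on the key so that each key value can appear only once.

```sql
-- Hypothetical source table and columns, for illustration only.
create or replace view xds_main.etl_time as
select t.cal_dt          as time_id,        -- primary key of the target table
       max(t.fiscal_yr)  as fiscal_year,
       max(t.fiscal_per) as fiscal_period
from   xds_main.time_dim t
group by t.cal_dt;                          -- one row per key value
```

Grouping is only one way to guarantee uniqueness. If the source table already has a unique key that maps to the target primary key, a plain select is enough.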

Target Table: Create the target table normally. The minimum requirement is that it has a
primary key. Optionally, it can have standard timestamp columns that are maintained based
on changes to the target table. Each maintained timestamp column needs a default value of
SYSDATE and must use one of these standard column names:
INSERT_TIMESTAMP
UPDATE_TIMESTAMP
INSERT_TS
UPDATE_TS
INS_TS
UPD_TS
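
A target table that follows these rules might look like the sketch below. The business columns and the constraint name are hypothetical. Only the primary key requirement, the standard timestamp column names, and the SYSDATE defaults come from the rules above.

```sql
-- Hypothetical column list, for illustration only.
create table glbl_main.b_time (
    time_id          date      not null,
    fiscal_year      number(4),
    fiscal_period    number(2),
    insert_timestamp date      default sysdate,  -- standard maintained column
    update_timestamp date      default sysdate,  -- standard maintained column
    constraint b_time_pk primary key (time_id)
);
```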

UTL_DML.LARGE_MERGE (Oracle)
LARGE_MERGE merges data from the source view into the target table. Only the differences
between source and target are processed. Differences are determined by running MINUS queries.
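
The queries below sketch the general idea of finding differences with MINUS. They are not the actual LARGE_MERGE implementation, which is not shown in this document, and the column names are hypothetical. The maintained timestamp columns are left out of the comparison because they exist only in the target.

```sql
-- Rows that are new or changed in the source (insert/update candidates).
select time_id, fiscal_year, fiscal_period
from   xds_main.etl_time@dih
minus
select time_id, fiscal_year, fiscal_period
from   glbl_main.b_time;

-- Keys present in the target but no longer in the source (delete candidates).
select time_id
from   glbl_main.b_time
minus
select time_id
from   xds_main.etl_time@dih;
```

If both queries return no rows, source and target already match and nothing needs to be written to the target.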

Reference:
procedure large_merge(
    source_table varchar2, -- Source table
    target_table varchar2, -- Target table
    dml_mode     integer   -- Combination of INSERT,UPDATE,DELETE
);

-- MERGE procedures.
-- These procedures will merge all records in the source table
-- to the target table.
--
-- Source_table and target_table:
-- The source table and target table should have
-- common column names. They need to have at least the PK columns in common.
--
-- DML_MODE parameter: (1-7)
-- 1 - Insert
-- 2 - Update
-- 3 - Insert and Update
-- 4 - Delete
-- 5 - Insert and Delete
-- 6 - Update and Delete
-- 7 - Insert, Update, Delete
--
-- The operations done on the target are controlled by the DML_MODE
-- parameter.
-- Insert: Any record in the source not in the target is inserted.
-- Update: Any record in the source and target is updated.
-- Delete: Any record in the target but not the source is deleted.
--
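
The listed DML_MODE values are consistent with additive flags (1 = insert, 2 = update, 4 = delete). The sketch below builds the mode from named constants. The constants are hypothetical and local to this block; the document only shows the numeric values 1 to 7 being passed.

```sql
declare
    -- Hypothetical named constants, for readability only.
    c_insert constant pls_integer := 1;
    c_update constant pls_integer := 2;
    c_delete constant pls_integer := 4;
begin
    -- 1 + 2 = 3: insert and update, but never delete from the target.
    utl_dml.large_merge('XDS_MAIN.ETL_TIME@DIH', 'GLBL_MAIN.B_TIME',
                        c_insert + c_update);
end;
/
```

Mode 3 is a reasonable choice when rows that disappear from the source should be retained in the target. Mode 7 keeps the target an exact copy of the view.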

Example
begin
    utl_dml.large_merge('XDS_MAIN.ETL_TIME@DIH','GLBL_MAIN.B_TIME',7);
end;
/

In this example, the view XDS_MAIN.ETL_TIME, read over the DIH database link, is used to
populate the GLBL_MAIN.B_TIME table in the local DB.

If the target table is empty, LARGE_MERGE automatically runs the standard pre/post load
procedures:
UTL_DDL.DROP_INDEXES
UTL_DDL.SET_TRIGGER (Disable)
UTL_DDL.SET_TRIGGER (Enable)
UTL_DDL.ANALYZE_TABLE
UTL_DDL.BUILD_INDEXES

To force a truncate and reload, truncate the table before running LARGE_MERGE.
Example
begin
    utl_ddl.truncate_table('GLBL_MAIN', 'B_TIME');
    utl_dml.large_merge('XDS_MAIN.ETL_TIME@DIH','GLBL_MAIN.B_TIME',7);
end;
/

CDC Managed Interfaces


CDC can manage view-based refreshes. There are two advantages to using the CDC package.
• Execution logging to CDC_LOG. Every time the refresh is run, a log record is created in
CDC_LOG. This is useful for support to track execution times and error rates.
• Refresh Groups. Multiple view refreshes can be grouped and run together. The table
refreshes within a group run in parallel, which makes the overall refresh faster.
The TWS batch job can be configured to run a refresh group. Tables can be added to and
removed from refresh groups with no code changes.

CDC Configuration steps

The examples demonstrate a configuration where the source view is in a remote DB.
Target: GLBL_MAIN.B_TIME table in DCC1 database
Source: XDS_MAIN.ETL_TIME view in DIH database

Add to CDC_TABLES
insert into cdc_tables (TABLE_ID, NODE_ID, TYPE_CODE, TABLE_NAME, TABLE_OWNER)
values ('ETL_TIME','DIH','DATA','ETL_TIME','XDS_MAIN');

insert into cdc_tables (TABLE_ID, NODE_ID, TYPE_CODE, TABLE_NAME, TABLE_OWNER)
values ('B_TIME','DCC1','DATA','B_TIME','GLBL_MAIN');

TABLE_ID: Standard is to match the table or view name.
NODE_ID: Database where the table is located.
Must reference a valid NODE_ID value from CDC_NODES.
TYPE_CODE: Use 'DATA'
TABLE_NAME: Actual table name in the database
TABLE_OWNER: Actual schema owner of the table in the database
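
Because NODE_ID must reference CDC_NODES, it can help to confirm the nodes exist before inserting into CDC_TABLES. The check below is a minimal sketch that assumes CDC_NODES is queryable from the session doing the configuration. It uses only the NODE_ID column, since no other CDC_NODES columns are described here.

```sql
-- Both nodes used in the example should be returned. A missing row means
-- that node has to be registered in CDC_NODES first.
select node_id
from   cdc_nodes
where  node_id in ('DIH', 'DCC1');
```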

Add to CDC_INTERFACES

insert into cdc_interfaces (intf_id, path_code, trgt_table_id, src_table_id,
                            type_code, status_code, refresh_group)
values ('B_TIME','L','B_TIME','ETL_TIME','VIEW','IDLE','GLBL_MASTER_DATA');

INTF_ID: Standard is to use the target TABLE_ID.
PATH_CODE: Always 'L' for view-based interfaces.
TRGT_TABLE_ID: CDC_TABLES reference to the target table
SRC_TABLE_ID: CDC_TABLES reference to the source view
TYPE_CODE: Always use 'VIEW'
STATUS_CODE: Add with 'IDLE'
REFRESH_GROUP: Optional. Can be used to run multiple interfaces as a group.
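
To review what a refresh group will run, the interface rows can be joined back to CDC_TABLES. This is a sketch that uses only the columns shown in the inserts above, and it assumes TABLE_ID is unique in CDC_TABLES.

```sql
select i.intf_id,
       s.node_id                            as source_node,
       s.table_owner || '.' || s.table_name as source_view,
       t.node_id                            as target_node,
       t.table_owner || '.' || t.table_name as target_table,
       i.status_code
from   cdc_interfaces i
join   cdc_tables s on s.table_id = i.src_table_id
join   cdc_tables t on t.table_id = i.trgt_table_id
where  i.type_code = 'VIEW'
and    i.refresh_group = 'GLBL_MASTER_DATA';
```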

Executing the refreshes

Refreshes must be run on the target database.
To run an interface individually, use the cdc.run_view_intf procedure.
cdc.run_view_intf('intf_id')
Example:
begin
    cdc.run_view_intf('B_TIME');
end;
/

To run as a group, use the cdc.run_refresh_group procedure.
cdc.run_refresh_group('refresh_group')
Example:
begin
    cdc.run_refresh_group('GLBL_MASTER_DATA');
end;
/

To force a reload, truncate the target table before calling the refresh.
Example (Single table)
begin
    utl_ddl.truncate_table('GLBL_MAIN', 'B_TIME');
    cdc.run_view_intf('B_TIME');
end;
/
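
The same pattern could be applied to a refresh group, as sketched below using only the procedures shown above. GLBL_MAIN.B_OTHER is a hypothetical second target table in the group. The sketch assumes that tables in the group that are not truncated are refreshed normally, with only their differences processed.

```sql
begin
    -- Truncate each target that should be fully reloaded.
    utl_ddl.truncate_table('GLBL_MAIN', 'B_TIME');
    utl_ddl.truncate_table('GLBL_MAIN', 'B_OTHER');  -- hypothetical table
    -- Then run the whole group.
    cdc.run_refresh_group('GLBL_MASTER_DATA');
end;
/
```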
