
DATA STAGE REFERENCE GUIDE AND LAB HANDOUT

PALETTE:
The palette contains all the groups of stages:
1. General
2. Database
3. Development and Debug
4. File
5. Processing
6. Real Time
7. Restructure

GENERAL STAGES:
The General group contains 4 items:
1. Annotation
2. Container
3. Description Annotation
4. Link
Link: It is used to make the connection between two stages.
Annotation: It is used to add comments to a job.

DATABASE STAGES:
It contains all kinds of database stages:
1. DB2/UDB API
2. DB2/UDB Load
3. Dynamic RDBMS
4. ODBC Enterprise
5. SQL Server Enterprise
6. Oracle Enterprise
7. Stored Procedure
8. Teradata Enterprise
9. Teradata Multi Load

DEVELOPMENT AND DEBUG STAGES:


1. Column Generator
2. Head
3. Peek
4. Row Generator
5. Sample
6. Tail
7. Write Range Map

FILE STAGES:
1. Complex Flat File
2. Data set
3. External Source
4. External Target
5. File set
6. Look up File set
7. Sequential file
8. SAS Parallel Data Set

PROCESSING STAGES:
1. Aggregator
2. Change Apply
3. Change Capture
4. Compare
5. Compress
6. Copy
7. Decode
8. Difference
9. Encode
10. Expand
11. Filter
12. Funnel
13. Generic
14. join
15. Look up
16. Merge
17. Modify
18. Pivote
19. Remove Duplicate
20. External Filter
21. SAS
22. Sort
23. Surrogate Key Generator
24. Switch
25. Transformer

REAL TIME STAGES:


1. RTI Input
2. RTI Output
3. XML Input
4. XML Output
5. XML Transformer

RESTRUCTURE STAGES:
1. Column Export
2. Column Import
3. Combine Records
4. Make Sub Record
5. Make Vector
6. Promote Sub Record
7. Split Sub record
8. Split Vector

FAVORITES: None

SEQUENTIAL FILE STAGE:


It is a file stage which is used to extract data from flat files (.txt, .csv, .xls).
It supports one input link or one output link, plus one reject link.
It is used as either source or target when working with flat files.
The Sequential File stage processes the data sequentially by default.
The Sequential File stage has two read methods:
Read Method = Specific file(s)
            = File pattern
Example of specific files:
C:\emp.txt
Example of a file pattern:
C:\emp1.txt
C:\emp2.txt
C:\emp3.txt
If the files are laid out as above and we want to extract data from all of them,
we can specify the path with wildcards:
C:\emp*.txt   (* matches zero or more characters)
C:\emp?.txt   (? matches exactly one character)

LIMITATIONS OF SEQUENTIAL FILE:


1. It supports files only up to 2 GB of data.

EXAMPLE JOBS FOR SEQUENTIAL FILE STAGE:
Requirement: Extract EMP data from a text file and load it into a text file using
the Sequential File stage.

Here Read Method = Specific file(s)

Input File data:

Output File data:

Job:

Properties for sequential_File_0:

Target Sequential_File_1 Properties:

Importing a table definition from a sequential file:
Right click on Sequential_File_0, or double click on Sequential_File_0.

Click on Columns tab on left hand side tab

Click on the Load button at the bottom.
It will show a window like this.

Now click on Import and select Sequential File Definitions...

Now, in the file list, select the file emp1.txt.

Click on Import tab:

Tick the check box "First line is column names" and click on Define.

Now click OK and then Close. The file emp1.txt will now appear in the table
definition list.

Now select emp1.txt in the table definition list and click OK.

It will show a window like this.

Now click OK, and again click OK. This is the procedure for importing a table
definition.

2)Example Job for Sequential File:


Requirement: Extract EMP data from text files and load it into a text file using
the Sequential File stage.
Here Read Method = File pattern

Input sequential_File_0

Properties:

Emp1.txt

Emp2.txt

These two files match the pattern D:\dspractice\sanjeev\emp*.txt


Requirement:
Output target sequential_File_1:

Job:

Input Sequential_File_0 Properties:

Output Sequential_File_1 Properties:

3)Example Job for Sequential File:


Requirement: Extract EMP data from a text file and load it into a text file using
the Sequential File stage.

Here Read Method = Specific file(s)


Reject Mode = Continue
Continue: continue the job and simply discard any rejected rows.
Fail: fail the job if any row is rejected.
Output: send rejected rows down a reject link.

Here, in two records (empno=300 and empno=400), the field delimiter is not properly
terminated.

Input sequential file_0 data:

Job:

Input sequential file_0 data Properties:

Output sequential_File_01 Properties:

Output data:

4) Example Job for Sequential File:

Requirement: Extract EMP data from a text file and load it into a text file using
the Sequential File stage.

Here Read Method = Specific file(s)


Reject Mode=Output

Input file data:

Input sequential file data :

Output Sequential File Data:

Output Rejects Data:

Job:

Note: if you select Reject Mode = Output, then you must have a reject link.

5)Example Job for Sequential File:


Requirement: Extract EMP data from a text file and load it into a text file using
the Sequential File stage.

Here Read Method = Specific file(s)


Reject Mode = Output
Options:
File Name Column = InputRecFilepath
(This option adds a column recording which input file each record came from.)
Note: Here you should create the InputRecFilepath column in the extended column
properties.
Job:

Input Seq file data:

Input Properties:

Columns:

Output Properties:

Outputdata:

6)Example Job for Sequential File:


Requirement: Extract EMP data from a text file and load it into a text file using
the Sequential File stage.

Here Read Method = Specific file(s)


Reject Mode = Output
Options:
File Name Column = InputRecFilepath
Row Number Column = InputRowNumberColumn
(Row Number Column adds a column holding each record's row number in its source file.)
Note: Here you should create the InputRecFilepath and InputRowNumberColumn columns
in the extended column properties.

Job:

InputProperties:

Columns:

Input Data:

Output Properties:

OutputData:

7)Example Job for Sequential File:
Requirement: Extract EMP data from a text file and load it into a text file using
the Sequential File stage.

Here Read Method = Specific file(s)


Reject Mode = Output
Options:
File Name Column = InputRecFilepath
Row Number Column = InputRowNumberColumn
Options:
Filter = sed -n '3,5p'
(The filter command is applied to each input file before the rows enter the stage.)
Note: Here you should create the InputRecFilepath and InputRowNumberColumn columns
in the extended column properties.
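To make the filter concrete, here is a hedged shell illustration (the file name and
rows below are made up, not the lab data): sed -n suppresses normal output, and the
'3,5p' command prints only lines 3 through 5 of each file:

$ cat emp.txt
100,bhaskar,1000
101,mohan,2000
102,sanjeev,3000
103,ravi,4000
104,kiran,5000
$ sed -n '3,5p' emp.txt
102,sanjeev,3000
103,ravi,4000
104,kiran,5000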

Input Source file data:

Input Sequential filedata:

Job:

Input properties:

Input Columns:

Output properties:

Output Data:

8)Example Job for Sequential File:


Requirement: Extract EMP data from a text file and load it into a text file using
the Sequential File stage.

Here Read Method = Specific file(s)


Reject Mode = Continue
Options:
File Name Column = InputRecFilepath
Row Number Column = InputRowNumberColumn
Read First Rows = 3
(Read First Rows = 3 reads only the first three rows from each input file.)
Input source File data:

Job:

Input sequential file properties:

Columns:

Output sequential file properties:

9)Example Job for Sequential File:
Requirement: Extract EMP data from a text file and load it into a text file using
the Sequential File stage.

Here Read Method = Specific file(s)


Reject Mode = Output
Options:
File Name Column = InputRecFilepath
Row Number Column = InputRowNumberColumn
Read First Rows = 5
Filter = grep "bhaskar"

grep options:
1) grep "string"       Ex: grep "bhaskar"      (keep only lines containing the string)
2) grep -v "string"    Ex: grep -v "bhaskar"   (keep only lines NOT containing the string)
3) grep -i "string"    Ex: grep -i "bhaskar"   (match the string ignoring case)
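A hedged shell illustration of the three options (sample rows, not the lab data):

$ cat emp.txt
100,bhaskar
101,Bhaskar
102,mohan
$ grep "bhaskar" emp.txt
100,bhaskar
$ grep -v "bhaskar" emp.txt
101,Bhaskar
102,mohan
$ grep -i "bhaskar" emp.txt
100,bhaskar
101,Bhaskar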
Source filed data:

Job:

Input sequential file Properties:

Columns:

Output sequential file properties:

Output Data:

10)Example Job for Sequential File:


Requirement: Extract EMP data from a text file and load it into a text file using
the Sequential File stage.

Here Read Method = Specific file(s)


Reject Mode = Output
Options:
File Name Column = InputRecFilepath
Row Number Column = InputRowNumberColumn
Filter = grep -v "bhaskar"   It displays everything except the bhaskar records.
Filter = grep -i "bhaskar"   It displays only the bhaskar records; "-i" means
ignore case.

Source filedata:

Job:

Input Sequential file stage properties:

Columns:

Output seqfile stage properties:

DATA SET STAGE:
1. The Data Set stage is a file stage which is used for staging data for dependent jobs.
2. The data set is an internal stage format of DataStage.
3. The extension of a data set is .ds.
4. It is never used to extract data from a client location.
5. It is used as an intermediate stage between two jobs or tables.
Types of data sets:
1. Persistent data set
2. Virtual data set

DATA SET MANAGEMENT UTILITY:


It is a utility for organizing data sets:
Tools → Data Set Management → File Name → output.ds

MULTIPLE FILES FOR A DATA SET:


1. Descriptor file
2. Data file
3. Control file
4. Header file

Descriptor file:
It holds the addressing information and the structure (schema) of the data set.
Data file: represents the data in its native form.
Control and header files: these files operate at the OS level, controlling the
descriptor and data files.

Note: other names (aliases) for a data set:


1. OS file
2. Orchestrate file

EXAMPLE JOBS FOR DATA SET STAGE:


Example Job on DATA SET:
Source file data:

JOB:

Input sequential properties:

Output dataset Properties:

Note: the target data set file extension is .ds
Output data set data:

We can view the record schema of the data set.


Go to Tools → Data Set Management; it will show the window like this

It will show the list of files and data sets.

Now select empoutput1.ds and click OK.

You can view the record schema by clicking on the Table Definition icon.

You can see the data by clicking on the Data Viewer option:

Note: by using Data Set Management we can open a data set, show its schema window,
show its data window, copy a data set, and delete a data set.

DIFFERENCES BETWEEN SEQUENTIAL FILE AND DATA SET:

Sequential File stage                          Data Set stage


It executes in sequential mode                 It executes in parallel mode
Cannot apply partition techniques              Can apply partition techniques
It supports formats like .txt, .csv, .xls      It supports only .ds
It is used to extract data from flat files     It is never used to extract data from
                                               client flat files

SORT STAGE:
It is used to sort the data in either ascending or descending order based on a key
column while populating data from source to target. It takes one input link and one
output link.

Double click on the Sort stage → go to stage properties → Key = CID (the column the
data is sorted on) → Sort Order = ascending or descending → go to Output → drag and
drop all columns. In the Stage tab → Advanced there is an Execution Mode option; if
the developer wants to execute in sequential mode, set the execution mode to
Sequential (by default it is Parallel) → click OK → save → compile → run it.

Note: if the developer requires more keys, go to stage properties → click on Sorting


Keys → select one more key.

EXAMPLE JOB FOR SORT STAGE:


JOB:

Sequential_File_0 properties:

Sort stage properties:

Output Columns:

Target Dataset_1 Properties:

View data:

EXAMPLE JOB FOR SORT STAGE:


Input data:

Output Requirement:

Job:

Input Sequential_File_0 Properties:

Sort_4 Properties:

Output Mapping:

Output columns:

Sort_7 Properties:

Output: mappings

Output columns:

Target Dataset_1 properties:

OUTPUT DATA:

REMOVE DUPLICATE STAGE:
It is used to remove duplicates based on a key column while populating data from
source to target. It takes one input link and one output link.

Double click on the Remove Duplicates stage → Key = sale_id (column name) → go to


Output → drag and drop all columns.
If a partition technique is required, go to Input → Partitioning and choose the
partition technique we require → save → compile → run it.
EXAMPLE JOB FOR REMOVE DUPLICATE:

Input file :

Output File:

JOB:

Input file properties:

Remove duplicate properties:

Target file properties:

Example job for remove duplicate :
Input data:

Output requirement:

Job:

Oracle enterprise properties:

Remove duplicate properties:

Target file properties:

COPY STAGE:
It is used to copy the source data into multiple targets. It takes one input link
and gives one or more output links.

Double click on the Copy stage → go to Output → drag and drop to the required
target links by choosing the output names.

FILTER STAGE:
It is used to filter records while populating data from source to target. It takes
one input link and gives one or more output links and one reject link.

Double click on the Filter stage → give constraints in the Where clause, like
sale_id < 300; if you want more where clauses, click on Predicates → go to the
Output tab, drag and drop all columns to the 3 target tables → click OK → choose
the link ordering for the corresponding constraints.

MODIFY STAGE:
It is used to modify column names, modify data types, and modify nullability. It
takes one input link and one output link.

Note: the specification Target_Column_Name = Source_Column_Name is used for the


column changes, data type changes, and nullability changes.

1. Handling null values:


Column_Name = Handle_Null('Column_Name', Value)

2. Dropping a column:
DROP Column_Name

3. Keeping a column:
KEEP Column_Name

4. Type conversion:
Column_Name = type_conversion('Column_Name')
Ex: HIREDATE = date_from_timestamp("HIREDATE")
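A minimal sketch of a complete Modify specification, assuming an input link with
columns EMPNO, ENAME, COMM, DEPTNO and HIREDATE (hypothetical names, not from this
lab); the parenthesised remarks are annotations, not part of the specification:

EMP_NO = EMPNO                             (rename EMPNO to EMP_NO)
COMM = handle_null(COMM, 0)                (replace nulls in COMM with 0)
DROP DEPTNO                                (drop the DEPTNO column)
HIREDATE = date_from_timestamp(HIREDATE)   (convert timestamp to date)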

FUNNEL STAGE:
It is used to combine the data of all source tables into a single target table. It
takes one or more input links and gives one output link.
Note: it works like a UNION ALL operation;
the metadata of all source input links should be the same.

Double click on the Funnel stage → go to Output → drag and drop all columns.

There are 3 types of funnel:


1. Continuous Funnel
2. Sort Funnel
3. Sequence Funnel

CONTINUOUS FUNNEL:
A continuous funnel takes records from all input links in turn and populates them
to the target.

SORT FUNNEL:
A sort funnel takes the records from all input links, sorts them in ascending order
on the key, and populates them to the target.

SEQUENCE FUNNEL:
A sequence funnel takes all records from the first input link and populates them to
the target, then takes all records from the second input link, and so on.

Example:

SURROGATE KEY STAGE:


It is used to generate sequence numbers from 1 to n.
A surrogate key is a key which is defined by the user; it starts from 1 and is
incremented by 1, sequentially and continuously. The stage takes one input link and
gives one output link.

Double click on the Surrogate Key stage → Name = sequence_no (the column in which to


generate the sequence number) → choose Start Value = 1 → go to Output and drag and
drop the required columns.

Example:
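A hedged illustration of the result (made-up rows, since the screenshot is not
reproduced here), with Name = sequence_no and Start Value = 1:

Input: ENAME         Output: SEQUENCE_NO  ENAME
       Bhaskar               1            Bhaskar
       Mohan                 2            Mohan
       Sanjeev               3            Sanjeev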

AGGREGATOR STAGE:
It is used to find aggregate values, like sum, average, max, and min, after
grouping. It takes one input link and gives one output link.

Double click on the Aggregator stage → Group = deptno → Column for


Calculation = sal → choose the required functions like sum, max, min, avg → assign
the function values to the required columns → go to Output → drag and drop all
columns → click OK.
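A small hedged illustration of the group-by behaviour (made-up rows), with
Group = DEPTNO, Column for Calculation = SAL and the Sum function:

Input: DEPTNO  SAL        Output: DEPTNO  SUM_SAL
       10      1000               10      3000
       10      2000               20      1500
       20      1500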

1. EXAMPLE JOB FOR AGGREGATOR STAGE:

2. EXAMPLE JOB FOR AGGREGATOR STAGE:

3. EXAMPLE JOB FOR AGGREGATOR STAGE:
Input data:

Requirement:
Output file11 data:

Output file12 data:

JOB:

Database properties:

Copy stage properties:

Aggregator 2 properties:

Output tab:

Target seqfile11 properties:

Column generator properties:

Column Generator output tab:

Aggregator4 properties:

output tab:

Target seqfile12 properties:

ADVANCED PROCESSING STAGES:
JOIN STAGE:
It is used to join two or more tables based on a key column and populate the data
into a target table while populating from source to target. It takes more than one
input link and gives one output link.

Double click on the Join stage → Key = the common column name from the input links,
i.e. DEPTNO (it requires the same column name on all input links) →
Join Type = inner / left / right / full outer (choose any one) → go to Output and
drag and drop all columns.

The Join stage performs 4 types of join operations:


1. Inner join
2. Left outer join
3. Right outer join
4. Full outer join

1. Inner join:
In an inner join the Join stage extracts only the matched records from all input
links based on the key column and populates them into the target.

2. Left outer join:


In a left outer join the Join stage extracts all records from the left input link,
along with the matched records from the other links based on the key column, and
populates the data into the target table.

3. Right outer join:


In a right outer join the Join stage extracts all records from the right input
link, along with the matched records from the other links based on the key column,
and populates the data into the target table.

4. Full outer join:


In a full outer join the Join stage extracts all records from both the left and
right links and populates the data to the target table based on the key.
Note: in the case of a full outer join it supports only two inputs.

Example:

Emp Table:

Empno Ename Sal Deptno


100 Bhaskar 1500 10
101 Mohan 2000 20
102 Sanjeev 2500 30

Dept Table

Deptno Dname Loc


10 Admin Hyderabad
20 Sales Bangalore
40 Marketing Delhi

Inner join:

Empno Ename Sal Deptno Dname Loc


100 Bhaskar 1500 10 Admin Hyderabad
101 Mohan 2000 20 Sales Bangalore

Left Outer Join:

Empno Ename Sal Deptno Dname Loc


100 Bhaskar 1500 10 Admin Hyderabad
101 Mohan 2000 20 Sales Bangalore
102 Sanjeev 2500 30 null null

Right outer join:

Empno Ename Sal Deptno Dname Loc


100 Bhaskar 1500 10 Admin Hyderabad
101 Mohan 2000 20 Sales Bangalore
null null null 40 Marketing Delhi

Full Outer join:

Empno Ename Sal Deptno Dname Loc


100 Bhaskar 1500 10 Admin Hyderabad
101 Mohan 2000 20 Sales Bangalore
102 Sanjeev 2500 30 null null
null null null 40 Marketing Delhi

LOOKUP STAGE:
It is used to join two or more tables based on different key columns and populate
the result into a target table.
It takes one input (stream) link, one or more reference links, and gives one output
link and one reject link.

The Lookup stage can perform 2 types of join operations:


1. Inner join
2. Left outer join (the left link should always be the stream link)
Establish the join condition manually by dragging and dropping the required
columns → now take the required columns to the output link → click OK.
Note: by default an inner join occurs.

How to get a left outer join:


Double click on the Lookup stage → click on Constraint → for Lookup Failure choose
Continue → click OK.

How to take a reject link:


Take one reject link from the Lookup stage → double click on the Lookup stage → go
to Constraint → set it to Reject.
Note:
The constraint option contains Lookup Failure:
if Lookup Failure = Fail, the job is aborted if a value is not found in the
reference (lookup) table;
if Lookup Failure = Drop, this is an inner join;
if Lookup Failure = Continue, it is a left outer join;
if Lookup Failure = Reject, unmatched records are populated into the reject table.

There are two lookup types available in DataStage:
1. Normal lookup
2. Sparse lookup

Normal lookup:
A normal lookup loads the reference data into memory, so if the reference tables
hold a small amount of data, prefer a normal lookup.

Sparse lookup:
A sparse lookup sends each lookup directly to the database, so if the reference
tables contain a huge amount of data, prefer a sparse lookup.

How to find whether a lookup is a normal lookup or a sparse lookup:


Double click on the reference table → choose Lookup Type.
Note:
A sparse lookup cannot support more than one reference link.

Differences between the Join stage and the Lookup stage:

Join stage                                  Lookup stage


Performs 4 types of join operations:        Performs 2 types of join operations:
inner, left, right, full outer              inner, left
It does not give a reject link              It gives a reject link
It needs less memory compared to the        It needs more memory compared to the
Lookup stage                                Join stage

MERGE STAGE:
It is used to join two or more tables based on a common column while populating
from source to target. It takes more than one input link (one is the master link
and the remaining are child links, or update links) and gives one output link and
(n-1) reject links if there are "n" source links.
Note: the number of reject links should equal the number of child (update) tables.

Example:

TRANSFORMER STAGE:
The Transformer stage plays a major role in DataStage. It is used to modify the
data and apply functions while populating data from source to target.
It takes one input link and gives one or more output links.
It has 3 components:
1. Stage variables
2. Constraints
3. Derivations (or expressions)
1. The Transformer stage can work as a Copy stage or as a Filter stage.
2. The Transformer stage requires a C++ compiler; the generated code is compiled
into machine language.
Double click on the Transformer stage → drag and drop the required target columns →
click OK.

A Transformer stage can contain multiple stage variables, each target link contains
one constraint, and each target column contains one derivation.
The order of execution of the components is:
1. Stage variables
2. Constraints
3. Derivations
Example:

How to make the Transformer work as a Filter stage (or, how to apply constraints in
the Transformer stage):

Double click on the Transformer stage → double click on the Constraint row of the
particular link → build the condition in the window that opens (it provides all the
column information automatically) and view the constraint → for a reject link, tick
Otherwise.

Example derivation:

If Sale_Id < 300 Then Amount_Sold = Amount_Sold + 300


Else If Sale_Id > 300 And Sale_Id < 600 Then Amount_Sold = Amount_Sold + 600
Else If Sale_Id > 600 And Sale_Id < 1000 Then Amount_Sold = Amount_Sold + 1000
Else Amount_Sold = Amount_Sold + 100
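Written as an actual derivation on the Amount_Sold output column (assuming the
input link is named INPUT), the same logic is a single If/Then/Else expression:

If INPUT.Sale_Id < 300 Then INPUT.Amount_Sold + 300
Else If INPUT.Sale_Id > 300 And INPUT.Sale_Id < 600 Then INPUT.Amount_Sold + 600
Else If INPUT.Sale_Id > 600 And INPUT.Sale_Id < 1000 Then INPUT.Amount_Sold + 1000
Else INPUT.Amount_Sold + 100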

The Transformer stage expression editor provides some functions and other items.
These are:


1. DS macro
2. DS routine
3. Job parameter
4. Input column
5. Stage variables
6. System variables
7. String
8. Function
9. Parentheses
10. If Then Else

1. DS macros:
DS macros provide some built-in functions like:
1. DSProjectName()
2. DSJobName()
3. DSHostName()
4. DSJobStartDate()
5. DSJobStartTime()
6. DSJobStartTimestamp()

2. DS routines:
These are nothing but pre-set-up functions.
3. Job parameters:
Job parameters are nothing but variables; they are used to reduce the redundancy of
work.
4. Input columns:
This provides all the input column names.
5. Stage variables:

Stage variables are used to increase performance and to reduce the redundancy of
work.
How to define stage variable properties:
Right click on the stage variables area → select Stage Variable Properties → define
the stage variables.

6. System variables:
These contain some built-in variables like:
1. @INROWNUM
2. @OUTROWNUM
3. @NUMPARTITIONS

@INROWNUM and @OUTROWNUM give how many records are being loaded into and extracted
from the Transformer stage; @NUMPARTITIONS tells how many nodes (partitions) are
being used.

7. String:
This provides a hard-coded value given within double quotation marks.

8. Functions:
There are several groups of built-in functions in DataStage:
1. Date & Time
2. Logical
3. Mathematical
4. Null Handling
5. Number
6. Raw
7. String
8. Type Conversion
9. Utility

EXAMPLE JOBS OF TRANSFORMER STAGE:


1)EXAMPLE JOB FOR TRANSFORMER
JOB:1

Inputfile:

Output requirement

JOB:

Sequential file:

Transformer Stage properties:

INPUTCOLUMNS:

OUTPUTLINK:

TARGET FILE:

2) EXAMPLE JOB FOR TRANSFORMER:
JOB2:

Input file:

Output requirement:

JOB:

Input file properties:

Transformer stage properties:

Stage variable derivation:


Field(INPUT.HDATE,"/",3):"-": Field(INPUT.HDATE,"/",2):"-":
Field(INPUT.HDATE,"/",1)
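A worked illustration (sample value, not the lab data): if INPUT.HDATE holds
14/07/2012, then Field(INPUT.HDATE,"/",3) returns "2012",
Field(INPUT.HDATE,"/",2) returns "07", and Field(INPUT.HDATE,"/",1) returns "14",
so the whole derivation, concatenated with the ":" operator and "-" literals,
yields 2012-07-14.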

3) EXAMPLE JOB FOR TRANSFORMER:


Input file data:

Output requirements:


Output1:

Output2:

JOB:

INPUT:

Transformer1:

Stage variable derivation:


Left(INPUT.REC,1)
Transformer2:

Constraints logic:

OUTPUTINVC:

OUTPUTPRODID:

4. EXAMPLE JOB FOR TRANSFORMER STAGE:
Input file:seqfile1:

Input file:seqfile0:

Output requirement:

Job:

Input file1 properties:

Join properties:

Transformer stage properties:

Stage variable Status derivation:

If (DSLink9.leftRec_ENO = DSLink9.rightRec_ENO) And (DSLink9.ENAME =


DSLink9.ENAME1) And (DSLink9.SAL = DSLink9.SAL1) Then "SAME"
Else If (DSLink9.leftRec_ENO = DSLink9.rightRec_ENO) Then "UPDATE"
Else "NEW"

(In words: if the employee number, name and salary all match between the two links,
the record is flagged SAME; if the employee number matches but the name or salary
differs, it is flagged UPDATE; otherwise it is NEW.)

Target file properties:

1. TRANSFORMER DATE&TIME FUNCTIONS:
1.CurrentDate: Returns the date that the job runs in date format.
Syntax: CurrentDate()
Inputdata:

Output Data:

JOB:

Transformer_Time_Date Properties:

Currentdate Field derivation:


CurrentDate()

2.CurrentTime: Returns the time at which the job runs in time format.
Syntax:CurrentTime()
InputData:

OutputData:

JOB:

Transformer_Time_Date Properties:

Currenttime Field Derivation:


CurrentTime()

3.CurrentTimeStamp: Returns a timestamp giving the date and time that the job runs in
timestamp format
Syntax:CurrentTimeStamp()
InputData:

OutputData:

JOB:

Transformer_Time_Date Properties:

4.DateFromDaysSince:
Syntax: DateFromDaysSince(%number%,[%"yyyy-mm-dd"%])
Inputdata:

Outputdata:

JOB:

Transformer_Time_Date Properties:

DateFromDaysSince field derivation:
DateFromDaysSince(DSLink5.Field003, DSLink5.Field004)
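A hedged note on the semantics (sample values, not the lab data):
DateFromDaysSince(10, "2012-01-01") returns 2012-01-11, i.e. the date that is the
given number of days after the base date; a negative number gives an earlier date.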

5. DateFromJulianDay
Syntax: DateFromJulianDay(%juliandate%)
Inputdata:

Outputdata:

JOB:

Transformer_TimeandDate:

HireDate Field Derivation:


DateFromJulianDay(Input.JULIANDATE)

6. DaysSinceFromDate
Syntax: DaysSinceFromDate(%date%,%"yyyy-mm-dd"%)
Inputdata:

OutputData:

JOB:

Transformer_TimeandDate Properties:

DAYSSINCEFROMDATE Field Properties:


DaysSinceFromDate(Input.HIREDATE,"2012-12-30")

7.HoursFromTime:

Syntax:HoursFromTime(%time%)
InputData:

OutputData:

JOB:

Transformer_TimeandDate Properties:

HoursFromTime field derivation:


HoursFromTime(Input.DAYLOGINTIME)

8.JulianDayFromDate:
Syntax:JulianDayFromDate(%date%)

Outputdata:

JOB:

Transformer_TimeandDate Properties:

JULIANDAYFROMDATE Field Derivation


JulianDayFromDate(Input.DOJ)

9.MicroSecondsFromTime:
Syntax: MicroSecondsFromTime(%time%)

OutputData:

10. MinutesFromTime:

Syntax:MinutesFromTime(%time%)
Inputdata:

Outputdata:

JOB:

Transformer_TimeandDate Properties:

MINUTESFROMTIME Field derivation:


MinutesFromTime(Input.DAYLOGINTIME)

11.MonthDayFromDate:
Syntax: MonthDayFromDate(%date%)
InputData:

OutputData:

JOB:

Transformer_TimeandDate Properties:

MONTHDAYFROMDATE field derivation:


MonthDayFromDate(Input.DOJ)

12. MonthFromDate

Syntax:MonthFromDate(%date%)
InputData:

Outputdata:

JOB:

Transformer_TimeandDate Properties:

MONTHFROMDATE field derivation:


MonthFromDate(Input.DOJ)

13.NextWeekDayFromDate:
Syntax: NextWeekdayFromDate(%sourcedate%,%dayname%)
InputData:

OutputData:

JOB:

Transformer_TimeandDate Properties:

NEXTWEEKDAYFROMDATE Field Derivation:


NextWeekdayFromDate(Input.DOJ,"SATURDAY")

14.PreviousWeekdayFromDate
Syntax: PreviousWeekdayFromDate(%sourcedate%,%dayname%)
Inputdata:

Outputdata:

JOB:

Transformer_Time_Date properties:

Previous WeekdayFromDate Field derivation:


PreviousWeekdayFromDate(Intput.DATEOFJOIN,"SATURDAY")

15. SecondsFromTime
Syntax: SecondsFromTime(%time%)
Inputdata:

Outputdata:

JOB:

Transformer_Time_Date Properties:

Secondsfromtime Field derivation:


SecondsFromTime(Intput.TIME)

16.SecondsSinceFromTimeStamp:
Syntax:SecondsSinceFromTimestamp(%timestamp%,%"yyyy-mm-dd hh:nn:ss"%)

InputData:

OutputData:

JOB:

Transformer_TimeandDate Properties:

SECONDSSINCEFROMTIMESTAMP Field Derivation:


SecondsSinceFromTimestamp(Input.DOJTIMESTAMP,"2008-08-19 22:30:52")

17.TimeDate:
Syntax:TimeDate()

InputData:

OutputData:

JOB:

Transformer_TimeandDate Properties:

TIMEDATE Field Derivation:


TimeDate()

18.TimeFromMidNightSeconds
Syntax: TimeFromMidnightSeconds(%seconds%)

InputData

OutputData:

JOB:

Transformer_TimeandDate Properties:

TIMEFROMMIDNIGHTSECONDS field derivation:


TimeFromMidnightSeconds(Input.DOJINSEC)

19.TimestampFromDateTime
Syntax: TimestampFromDateTime(%date%,%time%)

Inputdata:

OutputData:

JOB:

Transformer_TimeandDate Properties:

TIMESTAMPFROMDATETIME Field Derivation:


TimestampFromDateTime(Input.DOJ, Input.TIME)

20. TimestampFromSecondsSince
Syntax: TimestampFromSecondsSince(%seconds%,[%timestamp%])
InputData:

OutputData:

21. TimeFromTimestamp
Syntax: TimeFromTimestamp(%timestamp%)
Inputdata:

Outputdata:

JOB:

Transformer_Time_Date Properties:

TimeFromTimestamp derivation:
TimeFromTimestamp(Intput.DATEOFJOIN)

22. WeekdayFromDate
Syntax:WeekdayFromDate(%date%,[%startdayname%])
Inputdata:

Outputdata:

JOB:

Transformer_Time_Date Properties:

Weekdayfromdate Field function Derivation:


WeekdayFromDate(Intput.DATEOFJOIN,"SATURDAY")

23. YeardayFromDate
Syntax: YeardayFromDate(%date%)
Inputdata:

Outputdata:

JOB:

Transformer_Time_Date Properties:

YeardayFromDate field derivation:


YeardayFromDate(Intput.DATEOFJOIN)

24. YearFromDate
Syntax: YearFromDate(%date%)
Inputdata:

Outputdata:

JOB:

Transformer_Time_Date Properties:

YearFromDate Field Derivation:


YearFromDate(Intput.DATEOFJOIN)

25. YearweekFromDate
Syntax: YearweekFromDate(%date%)
Inputdata:

Outputdata:

JOB:

Transformer_Time_Date Properties:

YearWeekFromDate Field Derivation:


YearweekFromDate(Intput.DATEOFJOIN)

2.TRANSFORMER LOGICAL FUNCTIONS


1. BitAnd:
Syntax: BitAnd(%integer%,%integer%)
InputData:

OutputData:

JOB:

Transformer_Logical_Functions Properties:

BITAND Field Derivation:


BitAnd(Intput.NUMBER1, Intput.NUMBER2)
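A worked illustration (sample values): BitAnd(6, 12) returns 4, because 6 is 0110
and 12 is 1100 in binary, and the bitwise AND of the two is 0100. For the related
functions below, BitOr(6, 12) returns 14 (1110) and BitXOr(6, 12) returns 10
(1010).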

2. BitCompress:
Syntax: BitCompress(%binarystring%)
Inputdata:

OutputData:

JOB:

Transformer_Logical_Functions Properties:

BITCOMPRESS Field Derivation:


BitCompress(Intput.BINARYNUMBER)

3. BitExpand
Syntax: BitExpand(%bitfield%)
InputData:

Outputdata:

JOB:

Transformer_Logical_Functions Properties:

BITEXPAND Field Derivation:


BitExpand(Intput.BITCOMPRESS)

4.BitOr:
Syntax:BitOr(%integer%,%integer%)
InputData:

OutputData

JOB:

Transformer_Logical_Functions Properties:

BITOR Field Derivation:


BitOr(Intput.NUMBER1, Intput.NUMBER2)

5. BitXOr
Syntax: BitXOr(%integer%,%integer%)
InputData:

OutputData:

JOB:

Transformer_Logical_Functions Properties:

BITXOR Field Derivation:


BitXOr(Intput.NUMBER1, Intput.NUMBER2)

6. Not
Syntax: Not(%expression%)
Returns the complement of the logical value of an expression. If the value of expression
is true, the Not function returns a value of false (0). If the value of expression is false, the
NOT function returns a value of true (1).

InputData:

OutputData:

JOB:

Transformer_Logical_Functions Properties:

NOT Field Derivation:


Not(Intput.EXPRESSION)

7. SetBit

Syntax: SetBit(%bitfield%,%bitliststring%,%bitstate%)
InputData:

OutputData:

JOB:

Transformer_Logical_Functions Properties:

SETBIT field derivation:


SetBit(Intput.NUMBER, Intput.BITLIST, Intput.BITSTATE)

3. TRANSFORMER MATHEMATICAL FUNCTIONS:


4. TRANSFORMER NULL HANDLING FUNCTIONS:

2. IsNotNull:
Syntax: IsNotNull(%value%)
InputData:

Outputdata:

3. NullToEmpty:
Syntax: NullToEmpty(%value%)

Output:

JOB:

Transformer_Logical_Functions Properties:

NULLTOEMPTY Field Derivation:
NullToEmpty(Intput.COMM)

4. NullToValue:
Syntax: NullToValue(%inputcol%,%value%)
Inputdata:

Outputdata:

JOB:

Transformer_Logical_Functions Properties:

NULLTOVALUE Field Properties:


NullToValue(Intput.COMM,"000")
5. SetNull
Syntax:SetNull()

Outputdata:

JOB:

Transformer_Logical_Functions Properties:

DOEXPIRE Field Derivation:


SetNull()

TRANSFORMER STRING FUNCTIONS:

1. Alnum: Checks whether the given string contains only alphanumeric characters.

Syntax: Alnum(%string%)

Input data:

Output data:

JOB:

INPUT PROPERTIES:

Transformer_Str_Alnum properties:

Logic:Alnum(INPUT.EMPNO)

The same job with different data. Input data:

Output data:

2. FUNCTION:
Alpha
Syntax: Alpha(%string%)
Input data:

Outputdata:

JOB:

Transformer_str_Alpha properties:

3.FUNCTION:
CompactWhiteSpace:
Syntax: CompactWhiteSpace(%string%)
Input Data:

Outputdata:

JOB:

Transformer_Str_CompactWhiteSpace properties:

4. FUNCTION

Compare
Syntax: Compare(%string1%,%string2%,[%justification%])

Input Data:

Output data:

If str1=str2 then it returns 0,


If str1>str2 then it returns 1
If str1<str2 then it returns -1
JOB:

5. FUNCTION

CompareNum:
Syntax: CompareNum(%string1%,%string2%,%length%)
Input data:

Output data:

Note: in this job, if the first 5 characters of string1 and string2 are the same it
returns 0; if string1 > string2 it returns 1; if string1 < string2 it returns -1.

JOB:

Transformer_str_compareNum properties:

CompareNum field derivation:


CompareNum(Input.STRING1, Input.STRING2,5)

6.FUNCTION:

CompareNumNoCase
Syntax: CompareNumNoCase (%string1%,%string2%,%length%)
Input data:

OutputData:

JOB:

Transformer_str_compareNumNoCase properties:

CompareNumNoCase field derivation:


CompareNumNoCase(Input.STRING1, Input.STRING2,5)

Note: as above, but the comparison ignores case: if the first 5 characters of
string1 and string2 match (ignoring case) it returns 0; if string1 > string2 it
returns 1; if string1 < string2 it returns -1.

7.FUNCTION:
Convert:
Syntax: Convert(%fromlist%,%tolist%,%expression%)
Inputdata:

Outputdata:

JOB:

Transformer_str_Convert properties:

ConvertFunction field derivation:


Convert("@$#"," ", Input.STRING1)
Try the same job with the logic: Convert("@$#","", Input.STRING1)
(Each character in the from-list is replaced by the corresponding character in the
to-list; from-list characters with no counterpart in the to-list are deleted.)

Outputdata:

8. FUNCTION
Count:
Syntax: Count(%string%,%substring%)
Inputdata:

Outputdata:

JOB:

Transformer_str_Count properties:

Transformer_str_Count Field derivation:


Count(Input.STRING1,"A")
Note: in the above data, the STRING1 field contains the letter "A" twice, so Count
returns 2.

9.FUNCTION:
Dcount:
Syntax: Dcount(%string%,%delimiter%)
Counts the number of delimited fields in a string.
InputData:

Outputdata:

JOB:

Transformer_str_Dcount properties:

Transformer_str_Dcount field derivation:
Dcount(Input.STRING1,"|")
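A worked illustration (sample value): Dcount("10|20|30|40", "|") returns 4, the
number of "|"-delimited fields; Count on the same string would return 3, the number
of delimiter occurrences themselves.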

10.FUNCTION:
DownCase:
Syntax: DownCase(%string%)
Inputdata:

Output Data:

JOB:

Transformer_str_DownCase properties:

STRING1 Field derivation:


DownCase(Input.STRING1)

11.FUNCTION:
Dquote:
Syntax: DQuote(%string%)
Inputdata:

Outputdata:

JOB:

Transformer_Str_Dquote properties:

CUSTID Field Derivation:


DQuote(input.CUSID)
CNAME Field Derivation:
DQuote(input.CNAME)
ADDRESS Field Derivation:
DQuote(input.ADDRESS)

12. FUNCTION
Field:
Syntax: Field(%string%,%delimiter%,%occurrence%,[%number%])

Input data:

Outputdata:

JOB:

Transformer_Str_Field properties:

ADDRESS Field derivation:


Field(input.ADDRESS,"-",2)
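A worked illustration (sample value): Field("Hyderabad-Ameerpet-500038", "-", 2)
returns "Ameerpet", the second "-"-delimited field. The optional fourth argument
returns that many adjacent fields, so Field("Hyderabad-Ameerpet-500038", "-", 2, 2)
returns "Ameerpet-500038".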

13. FUNCTION
Index:
Syntax: Index(%string%,%substring%,%occurrence%)
Input data:

Outputdata:

JOB:

Transformer_Str_Index properties:

Trxr_Str_Index field derivation:
Index(input.FLAVOUR,"chocolate",2)

14. FUNCTION
Left:
Syntax: Left(%string%,%length%)
Inputdata:

Outputdata:

JOB:

Transformer_Str_left properties:

Trx_Str_left field derivation:
Left(input.FLAVOUR,9)

15.FUNCTION
length:
Syntax: Len(%string%)
Inputdata:

Outputdata:

JOB:

Transformer_Str_length properties:

Trx_Str_length field derivation:


Len(input.FLAVOUR)

16.FUNCTION;
Num:
Syntax:Num (%string%)

Inputdata:

Outputdata:

JOB:

Transformer_Str_Num properties:

CUSTID Field derivation:


Num(DSLink5.CUSID)

Note: if the given field value is numeric then it returns 1; if the value is
alphabetic or alphanumeric then it returns 0.

17.FUNCTION
PadString

Syntax: PadString(%string%,%padstring%,%padlength%)
Inputdata:

Outputdata:

JOB:

Transformer_Str_PadString properties:

CUSTID Field derivation:


PadString(DSLink5.CUSID,"0",2)
Note: here we are padding two '0' characters onto the existing CUSID field.
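A worked illustration (sample value): PadString("7", "0", 2) returns "700"; the pad
character is appended padlength times to the trailing end of the string.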

18. FUNCTION
Right
Syntax: Right(%string%,%length%)
Inputdata:

Outputdata:

JOB:

Transformer_Str_Right properties:

CNAME Field derivation:


Right(DSLink5.CNAME,7)

19.FUNCTION:
Soundex:
Syntax: Soundex(%string%)
Returns a code which identifies a set of words that are (roughly) phonetically alike based
on the standard, open algorithm for SOUNDEX evaluation.
Inputdata:

Outputdata:

JOB:

Transformer_Str_Soundex properties

CNAME Field derivation:


Soundex(DSLink5.CNAME)

20.FUNCTION

Space:
Syntax: Space(%length%)
Returns a string of n space characters.
Inputdata:

Output:

JOB:

Transformer_Str_Space Properties:

ADDRESS Field derivation:


Space(DSLink5.ADDRESS)

21.FUNCTION

Squote:
Syntax: Squote(%string%)
Inputdata:

Outputdata:

JOB:

Transformer_Str_Squote properties:

ADDRESS Field derivation:


Squote(DSLink5.ADDRESS)

22.FUNCTION

Str
Syntax: Str(%string%,%repeats%)
Inputdata:

Outputdata:

JOB:

Transformer_Str_Str properties;

CNAME Field derivation:


Str(DSLink5.CNAME,2)

23.FUNCTION:

StripWhiteSpace
Syntax: StripWhiteSpace(%string%)
Inputdata:

Outputdata:

JOB:

Transformer_Str_StripWhiteSpace properties:

ADDRESS Field derivation;


StripWhiteSpace(DSLink5.ADDRESS)

24.FUNCTION;

Trim:
Syntax: Trim(%string%,[%stripchar%],[%option%])
Inputdata:

Outputdata:

JOB:

Transformer_Str_Trim properties:

ADDRESS Field derivation:


Trim(DSLink5.ADDRESS,".","A")

In this job we also trim the "|" character:

Outputdata:

JOB:

Transformer_Str_Trim_Tab Properties:

ADDRESS Field Derivation:


Trim(DSLink6.ADDRESS,"|","A")

Different Trim Options:

A Remove all occurrences of stripchar

B Remove both leading and trailing occurrences of stripchar

D Remove leading, trailing, and redundant white-space characters

E Remove trailing white-space characters

F Remove leading white-space characters

L Remove all leading occurrences of stripchar

R Remove leading, trailing, and redundant occurrences of stripchar

T Remove all trailing occurrences of stripchar
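Worked illustrations of the options (sample values): Trim("..Hyderabad..", ".", "A")
returns "Hyderabad" (all dots removed); Trim("..Hyderabad..", ".", "T") returns
"..Hyderabad" (trailing dots only); and Trim("  Hyderabad   city ") with no extra
arguments removes leading, trailing and redundant internal spaces, giving
"Hyderabad city".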

25.FUNCTION:

TrimF:

Syntax: TrimF(%string%)

Inputdata:

Output data:

JOB:

Transformer_Str_TrimF Properties:

ADDRESS Field Derivation;

TrimF(DSLink5.ADDRESS)

26.FUNCTION

TrimB:

Syntax: TrimB(%string%) (removes all trailing spaces and tabs)

Inputdata:

Outputdata:

JOB:

Transformer_Str_TrimB Properties:

CNAME Field derivation:

TrimB(DSLink5.CNAME)

27.FUNCTION:

TrimLeadingTrailing:

Syntax: TrimLeadingTrailing(%string%)

Inputdata:

Output data:

JOB:

Transformer_Str_Trim properties:

CNAME Filed Derivation:

TrimLeadingTrailing(DSLink5.CNAME)

28.FUNCTION:

UpCase:

Syntax: UpCase(%string%)

Inputdata:

Outputdata:

JOB:

Transformer_Str_UpCase properties:

ADDRESS Field derivation:

UpCase(DSLink5.ADDRESS)

7. TRANSFORMER TYPE CONVERSION FUNCTIONS:


JOB PARAMETERS:
Job parameters are variables which are used to reduce the redundancy of work. There
are two types of job parameters available:
1. Local parameters
2. Global parameters

Local parameters are defined in the job properties.

How to define parameters:


Go to the job properties (click on the icon, or go to Edit and select Job
Properties) → click on Parameters.

Parameter Name   Prompt              Type        Default value
SERVER_NAME      Enter server name   String      Oracle
USER_NAME        Enter user name     String      Scott
PASSWORD         Enter password      Encrypted   *****(Tiger)

Click OK → double click on the Oracle stage → insert the parameters for server,
user name and password.
A parameter is always written between two hashes, which is what marks it as a
parameter; otherwise it is treated as a literal value.
How to define a global parameter:
Go to the DataStage Administrator → go to the Projects tab → click on Properties →
click on Environment → click on User Defined.

Parameter Name   Prompt              Type        Default value


SERVER_NAME      Enter server name   String      Oracle
USER_NAME        Enter user name     String      Scott
PASSWORD         Enter password      Encrypted   *****(Tiger)

Go to the DataStage Designer, select one job → go to job properties → click on Add


Environment Variable; at the end you can find the user-defined environment
variables → now select one and it will be added as a job parameter → again click on
Add Environment Variable → select the remaining variables → click OK → go to the
Oracle stage and insert the (global) parameters again.

Note: a global parameter should have the $ symbol as a prefix.
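As an illustration (using the default values from the table above), the Oracle
stage properties would reference the parameters between hashes:

Server   = #SERVER_NAME#
User     = #USER_NAME#
Password = #PASSWORD#

With global (environment) parameters, the $ prefix stays inside the hashes, e.g.
#$SERVER_NAME#.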

How to hide global parameter values:


Go to the job properties → go to Parameters → double click on the value → select
$PROJDEF.

Differences between local parameters and global parameters:

Local parameter                             Global parameter


It is defined in the job parameters only    These parameters are defined as
                                            environment variables in the
                                            Administrator
These parameters are valid within a job     These parameters are valid within a
                                            project
These parameters do not require a $ symbol  These parameters start with a $ symbol

CONTAINERS:
Containers are used to minimize the complexity of a job, for better understanding
and for reusability.
There are two types of containers available in DataStage:
1. Local container
2. Shared container
Local container:
It is used only to minimize the complexity of a job for better understanding; it is
never used for reusability, and its scope is limited to the job.
Shared container:
It is used for both purposes, to minimize the complexity of a job and for
reusability, and its scope is the project.
Differences between local containers and shared containers:

Local containers                            Shared containers


Used to minimize job complexity for         Used to minimize job complexity and for
better understanding only, never for        reusability
reusability
Its scope is within a job                   Its scope is within a project
It occupies no extra memory                 It occupies some memory
It can be deconstructed directly            It cannot be deconstructed directly; to
                                            deconstruct it, first convert it to a
                                            local container, then deconstruct

How to construct a container:
Go to the DataStage Designer → open a specific job → select the required stages →
click on Edit → click on Construct Container → then choose Local or Shared. To
deconstruct, right click on the container → click Deconstruct.

How to use a shared container in another job:


Create a new job → drag and drop the shared container into the new job → design the
job according to the requirement → double click on the shared container → go to
Output → assign the old output link (the shared container link) to the new output
link → go to Columns, click on Load → select Reconcile from the container link (old
link) → click on Validate → do the same for the remaining links → click OK.

IMPORTING AND EXPORTING DATASTAGE JOBS:


How to export jobs:
Go to the DataStage Manager → click on Export → select DataStage Components →
select the path where you want to save → choose Whole Project or Selection → click
Export.
Note:
The extension of the backup file is .dsx

How to import jobs:


Go to the DataStage Manager → click on Import → select DataStage Components →
select the required path of the file → click OK.

SWITCH STAGE:
It is used to filter the records based on a constraint while populating the data
from source to target. It takes one input link and gives more than one output link.
Note:
It supports only the equality operator. Double click on the Switch stage →
Selector = deptno → Case = 10, Case = 20, Case = 30 → drag and drop the required
columns to each output link.

Example:

Oracle_Enterprise_0 properties:

Switch stage properties:

Output mappings:
Output name = T1:

Output_Dataset_2 properties:

output:

Output name = T2:

Output Dataset_3 Properties:

output:

Output name = T3:

Output Dataset_4 Properties:

Output:

DIFFERENCE STAGE:
It is used to find the difference between two input files (or two input tables).
It takes exactly two input links and gives one output link.
The Difference stage takes the metadata from the two input links and adds one
extra column to the output link called change_code.
If change_code = 1 it is a new record, if change_code = 2 it is a copy record, and
if change_code = 3 it is an updated record.

Example:

COMPRESS STAGE:
It is used to zip (compress) the data by using a UNIX command. It takes one input
link and gives one output link.

EXPAND STAGE:
It is used to unzip the compressed data back into its normal format. It takes one
input link and gives one output link.

ENCODE STAGE:
It is used to encode the data into an unreadable format, which is preferred for
security purposes. It takes one input link and gives one output link.

DECODE STAGE:
It is used to decode the encoded data back into its normal format. It takes one
input link and gives one output link.

PEEK STAGE:
It is used to find out which records are going to which node. It is a debug stage;
its records are written to the job log rather than saved to a file. It takes one
input link and one or more output links.

EXAMPLE JOB FOR PEEK STAGE:


Input data:

JOB:

Input sequential file properties:

Peek stage properties:

Peek stage Output columns:

Output seqfile properties:

Output seq file data:

EXAMPLE JOB FOR PEEK STAGE:


Input data:

Option: Peek Records Output Mode = Job Log
Job:

Input seqfile properties:

Input seqfile data:

Peek stage properties:

Here we set the option Peek Records Output Mode = Job Log, so we can see the data
in the logs only.
Procedure to see the data in the logs:
Go to Tools → Run Director → now click on View Log; it will show a screen like this.

In the above screen, click the Peek entry (the 8th row from the bottom here); it
will show the log details.

EXAMPLE JOB FOR PEEK STAGE:
Inputdata:

JOB:

Input seqfile data:

Input seqfile properties:

Peek stage properties:

Peek output1 columns:

peek output1 mappings:

Peek output2 columns:

peek output2 mappings:

peekoutput3 columns:

Peekoutput3 mappings:

peekout1 properties:

peek output1 data:

Peekoutput2 properties:

Peekoutput2 data:

Peekoutput2 properties:

Peekoutput3 data:

Peekoutput3 properties:

MULTIPLE INSTANCE:
Multiple instance is a useful concept available in DataStage parallel jobs. Through
multiple instances a developer can run one physical job more than once at the same
time, in parallel, each run with its own invocation ID.
How to run a job more than once at a time:
Go to the DataStage Designer → open the required job → go to Job Properties →
check the box Allow Multiple Instance → click OK.

RUN TIME COLUMN PROPAGATION:


This concept is also available in DataStage parallel jobs only. With runtime column
propagation (RCP), even if a column is not explicitly defined in the source, it can
be propagated at run time and treated as if it were available in the source.

How to run a job with RCP:


Go to the DataStage Administrator → go to Properties → in General, click on Enable
Runtime Column Propagation for Parallel Jobs → click OK → close.
Go to the DataStage Designer → select the required job → go to Properties → click
on Enable Runtime Column Propagation for New Links.

CONFIGURATION FILE:
The configuration file is available on the DataStage server; the extension of the
configuration file is .apt, and it is used to know how many nodes are available in
the particular project.
Each node entry contains four components:
1. Fastname: the host name of the node
2. Pools: used to reserve nodes for specific tasks
3. Resource scratchdisk: temporary storage
4. Resource disk: permanent storage
The default configuration file name is default.apt.
How to view the configuration file:
Go to the DataStage Manager → click on Tools → go to Configurations → open Default.
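A minimal sketch of a one-node configuration file (the host name and directory
paths here are assumptions for illustration, not values from this handout):

{
  node "node1"
  {
    fastname "dsserver1"
    pools ""
    resource disk "/opt/IBM/InformationServer/Server/Datasets" {pools ""}
    resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch" {pools ""}
  }
}

A four-node file simply repeats the node block as node "node2", node "node3", and
so on, usually with different resource paths.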

How to run a specific job with two nodes if the project is running with four nodes:
Go to the DataStage Manager → open the default configuration file → Save As with
another name (say sample) → now delete two nodes from the sample file → save and
close.
Go to the DataStage Designer → open the particular job → go to Job Properties → go
to Parameters → click on Add Environment Variable → select APT_CONFIG_FILE →
double click on the default value → choose sample.apt → click OK. Now the required
job runs with two nodes.

DATA SET MANAGEMENT:


From Data Set Management a developer can identify the size of a data set and the
records on each node, view the data set data, delete a data set, and copy a data
set.

How to open Data Set Management:


Go to the DataStage Manager → click on Tools → select Data Set Management → now
choose the required data set → click OK.

COMBINABILITY MODE:
It is used to assign a single process to all similar (combinable) stages in the
DataStage Designer.
How to set combinability:
Go to the DataStage Designer → open the required job → go to the processing stage →
go to Stage → Advanced → set Combinability Mode = Combinable.

HOW TO IMPORT A TABLE DEFINITION FROM AN ORACLE DATABASE:

Go to DataStage Designer → in the Repository, right click on Table Definitions → select Import →
choose Plug-in Meta Data Definitions → choose Oracle 9i → click OK → provide the server name,
user name and password → click Next → choose scott from the owners list → select the required
tables, choose a folder name → click Import

ACTIVE STAGES:
An active stage is one in which the data is modified as it passes through the stage.
Example: Transformer, Aggregator.

PASSIVE STAGE:
A passive stage is one in which the data is not modified as it passes through the stage; it only
reads or writes the data. Example: Sequential File, Data Set.
RUNNING A JOB THROUGH A UNIX COMMAND:
dsjob -run -jobstatus -param <parameter1>=<value1> -param <parameter2>=<value2>
-param <parameter n>=<value n> <project name> <job name>
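For example (the project name, job name, and parameter names below are hypothetical):

dsjob -run -jobstatus -param SRC_FILE=/data/emp.txt -param TGT_DIR=/data/out MyProject EmpLoadJob

The -jobstatus option makes dsjob wait until the job finishes and return an exit status that reflects the job's final state (finished OK, finished with warnings, or aborted).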

JOB SEQUENCE:
It is used to run all jobs in a sequence (or in an order) by considering their dependencies. It
has many activities.
How to create a job sequence:
Select Job Sequence → drag and drop the required jobs from Jobs in the Repository → give
connections → save it → compile it. Now these jobs will run sequentially.

These jobs are called job activities.


Double click on a job activity → we can find General/Job/Triggers → go to Job → set
Execution action = Reset if required, then run → go to Triggers → in Expression Type choose
Unconditional / Conditional OK / Conditional Fail
Unconditional = whether job1 finishes successfully or aborts, job2 will run
Conditional OK = only if job1 finishes successfully will job2 run
Conditional Fail = only if job1 fails will job2 run

NOTIFICATION ACTIVITY:
It is used to send a mail to the required persons automatically.

Double click on the notification activity → go to Notification → SMTP mail server name: the
company mail server (www.xyz.com), sender's email address: abreddy2@xyz.com, recipients'
email address: abreddy2@xyz.com, email subject: Aggregator job has been
aborted → give some information in the body → click OK

TERMINATOR ACTIVITY:
It is used to send a stop request to all running jobs.

WAIT FOR FILE ACTIVITY:


It is used to wait for a file for up to a specified length of time.

Double click on the wait-for-file activity → go to File → File name: select the file and set the
timeout (24-hour clock time only)

SEQUENCER:
It is used to connect one activity to another activity; it takes multiple input links and gives
one output link.

Double click on the sequencer → go to Sequencer → choose Mode = All/Any

All means the sequencer fires its output only after it receives a request from every input link.
Any means it fires as soon as a request arrives from any one input link.

ROUTINE ACTIVITY:
It is used to execute a routine between two jobs.
Double click on the routine activity → choose the routine name → if the routine requires
parameters, supply them.

EXECUTE COMMAND ACTIVITY:


It is used to execute a UNIX command between two jobs.
Double click on the execute command activity → enter the UNIX command → click OK

START LOOP / END LOOP ACTIVITY:


It is used to execute some jobs more than once within a sequence.

SLOWLY CHANGING DIMENSIONS:
There are 3 types of SCDs available in DWH.
Type1: It maintains only current data; old values are simply overwritten with the updated data
Type2: It always maintains current data and full historical data
Type3: It always maintains current data and partial historical data

EXERCISE-1:
Source table:
no name sal
100 Bhaskar 1500
101 Mohan 2000
103 Sanjeev 2000

Target table (before the load):
no name sal
100 Bhaskar 1000
101 Mohan 1500
102 Srikanth 2000

After implementing SCD Type1

no name sal
100 Bhaskar 1500
101 Mohan 2000
102 Srikanth 2000
103 Sanjeev 2000

Type-I:
In SCD Type-I, if a record exists in the source table and not in the target table, then simply
insert the record into the target table (record 103). If a record exists in both the source and
target tables, then simply update the target record with the source values (records 100, 101).
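As a hedged illustration, the same insert-else-update logic can be written in SQL (the table names source and target and the column list are taken from the exercise above; the MERGE syntax shown is Oracle-style). In a DataStage job the equivalent is typically built with a lookup against the target feeding an update-else-insert database target:

MERGE INTO target t
USING source s
ON (t.no = s.no)
WHEN MATCHED THEN
    UPDATE SET t.name = s.name, t.sal = s.sal
WHEN NOT MATCHED THEN
    INSERT (no, name, sal) VALUES (s.no, s.name, s.sal);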

Type-II:
While implementing SCD Type-II, two extra columns are maintained in the target, called
Effective Start Date and Effective End Date. Effective Start Date is also part of the
primary key.
If a record exists in the source and not in the target table, then simply insert the record into
the target table; while inserting, set Effective Start Date equal to the current date and set
Effective End Date to null.

If a record exists in both the source and target tables, we still insert the source record into
the target table, but before inserting it, the existing current record in the target is updated
with Effective End Date = Current Date - 1.
Then the source record is inserted with Effective Start Date = Current Date and Effective
End Date = Null.

no name sal Effective_Start_Date Effective_End_Date

100 Bhaskar 1000 2011-01-31 2011-01-31
101 Mohan 1500 2011-01-31 2011-01-31
102 Srikanth 2000 2011-01-31 Null
100 Bhaskar 1500 2011-02-01 Null
101 Mohan 2000 2011-02-01 Null
103 Sanjeev 2000 2011-02-01 Null

(Assuming the load ran on 2011-02-01: the changed records 100 and 101 are closed with Effective End Date = 2011-01-31, record 102 is untouched because it does not appear in the source, and records 100, 101 and 103 are inserted as the current versions.)
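A hedged SQL sketch of the same two steps (the table names source and target and the effective-date columns are assumed from the example above; the date arithmetic is Oracle-style):

-- Step 1: close the current version of every key whose values changed
UPDATE target t
   SET t.effective_end_date = CURRENT_DATE - 1
 WHERE t.effective_end_date IS NULL
   AND EXISTS (SELECT 1 FROM source s
               WHERE s.no = t.no
                 AND (s.name <> t.name OR s.sal <> t.sal));

-- Step 2: insert new and changed records as the current version
INSERT INTO target (no, name, sal, effective_start_date, effective_end_date)
SELECT s.no, s.name, s.sal, CURRENT_DATE, NULL
  FROM source s
 WHERE NOT EXISTS (SELECT 1 FROM target t
                   WHERE t.no = s.no
                     AND t.effective_end_date IS NULL);

Unchanged records are skipped in step 2 because they still have an open current version after step 1.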

Type-III:
If a record exists in the source and not in the target table, then simply insert the record into
the target table; while inserting, set Effective Start Date equal to the current date and
Effective End Date to null.
If a record exists in both the source and target tables, then check the target table count
grouped by the primary key. If the count = 1, then update Effective End Date = Current Date - 1
and simply insert the source record into the target.
If the count is greater than one, then delete the record from the target table (grouped by
primary key) where Effective End Date is not null, update the remaining target record with
Effective End Date = Current Date - 1, and then simply insert the source record into the
target. In this way the target keeps the current record plus at most one historical record per
key (partial history).

DATAWAREHOUSE:
A data warehouse is a collection of transactional and historical data, maintained in the DWH
for analysis purposes.
Three types of tools are generally maintained on any data warehousing project:
1. ETL Tools
2. OLAP Tools (or) Reporting Tools
3. Modeling Tool

ETL TOOL:
ETL stands for Extraction, Transformation, and Loading. An ETL developer (an expert in DWH)
extracts data from heterogeneous databases or flat files, transforms the data from source to
target (DWH) by applying transformation rules, and finally loads the data into the DWH.
There are several ETL tools available in the market; these are
1. DataStage
2. Informatica
3. Ab Initio
4. Oracle Warehouse Builder
5. BODI (Business Objects Data Integrator)
6. SSIS (Microsoft SQL Server Integration Services)

OLAP:
OLAP stands for Online Analytical Processing, and these tools are also called reporting tools.
An OLAP developer analyses the data warehouse and generates reports based on selection
criteria.
There are several OLAP tools available:
1. Business Objects
2. Cognos
3. Report Net
4. SAS
5. Micro Strategy
6. Hyperion
7. MSAS (Microsoft Analysis Services)

MODELING TOOL:
Those who work with the ERWIN tool are called data modelers. A data modeler designs the
database of the DWH with the help of such tools.
An ETL developer extracts data from source databases or flat files (.txt, .csv, .xls, etc.)
and populates the DWH. While populating data into the DWH, staging areas may be maintained
between source and target; these staging areas are called staging area 1 and staging area 2.

STAGING AREA:
A staging area is a temporary place which is used for cleansing unnecessary, unwanted, or
inconsistent data.

Note: A Data Modeler can design DWH in two ways


1. ER Modeling
2. Dimensional Modeling

ER Modeling:

ER modeling stands for entity-relationship modeling. In this model, tables are always called
entities, and the design may be in second normal form, third normal form, or somewhere between
2nd and 3rd normal form.

Dimensional Modeling:
In this model, tables are called dimension tables or fact tables. It can be subdivided into
three schemas:
1. Star Schema
2. Snow Flake Schema
3. Multi Star Schema (or) Hybrid (or) Galaxy

Star Schema:
A fact table surrounded by dimension tables is called a star schema; the layout looks like a star.
In a star schema, if there is only one fact table, it is called a simple star schema.
In a star schema, if there is more than one fact table, it is called a complex star schema.

Sales Fact table:
Sale_id
Customer_id
Product_id
Account_id
Time_id
Promotion_id
Sales_per_day
Profit_per_day

Account Dimension:
Account_id
Account_type
Account_holder_name
Account_open_date
Account_nominee
Account_open_balance

Promotion:
Promotion_id
Promotion_type
Promotion_date
Promotion_designation
Promotion_area

Product:
Product_id
Product_name
Product_type
Product_desc
Product_version
Product_startdate
Product_expdate
Product_maxprice
Product_wholeprice

Customer:
Cust_id
Cust_name
Cust_type
Cust_address
Cust_phone
Cust_nationality
Cust_gender
Cust_father_name
Cust_middle_name

Time:
Time_id
Time_zone
Time_format
Month_day
Week_day
Year_day
Week_year
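As a sketch of how the star schema ties together, the fact table above could be declared with a foreign key to each dimension. The data types and constraint names are assumed for illustration, each dimension table is assumed to exist already with the listed _id column as its primary key, and account_dim/time_dim are illustrative names chosen to avoid reserved words:

CREATE TABLE sales_fact (
    sale_id        NUMBER PRIMARY KEY,
    customer_id    NUMBER REFERENCES customer (cust_id),
    product_id     NUMBER REFERENCES product (product_id),
    account_id     NUMBER REFERENCES account_dim (account_id),
    time_id        NUMBER REFERENCES time_dim (time_id),
    promotion_id   NUMBER REFERENCES promotion (promotion_id),
    sales_per_day  NUMBER,
    profit_per_day NUMBER
);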

DIMENSION TABLE:
If a table contains a primary key and provides detailed information about an entity
(or master information), then it is called a dimension table.

FACT TABLE:
If a table contains many foreign keys, holds transactions, and provides summarized
information, such a table is called a fact table.

DIMENSION TYPES:
There are several dimension types available:

CONFORMED DIMENSION:

If a dimension table is shared with more than one fact table (or has its foreign key in more
than one fact table), then that dimension table is called a conformed dimension.

DEGENERATED DIMENSION:
If a fact table acts as a dimension and is shared with another fact table (or maintains a
foreign key in another fact table), such a table is called a degenerated dimension.

JUNK DIMENSION:
A junk dimension contains text values, genders (male/female), and flag values (true/false),
which are not useful for generating reports. Such a dimension is called a junk dimension.

DIRTY DIMENSION:
If a record occurs more than once in a table, differing only in a non-key attribute,
such a table is called a dirty dimension.

FACT TABLE TYPES:


There are 3 types of facts available in a fact table:
1. Additive facts
2. Semi additive facts
3. Non additive facts

ADDITIVE FACTS:
If a fact can be summed (added) across all the dimensions in the fact table, that fact is
called an additive fact.

SEMI ADDITIVE FACT:

If a fact can be summed only across some of the dimensions (up to some extent), it is called
a semi-additive fact.

NON ADDITIVE FACT:


If there is no possibility of summing the fact across any dimension, it is called a
non-additive fact.

SNOW FLAKE SCHEMA:


A snowflake schema maintains normalized data in the dimension tables. In this schema some
dimension tables do not maintain a direct relationship with the fact table; instead they
maintain a relationship with another dimension.

DIFFERENCE BETWEEN STAR SCHEMA AND SNOW FLAKE SCHEMA:

Star schema:
1. It maintains denormalized data in the dimension tables.
2. Performance is better when joining the fact table to the dimension tables, because fewer
joins are required compared with the snowflake schema.
3. All dimension tables maintain a direct relationship with the fact table.

Snow flake schema:
1. It maintains normalized data in the dimension tables.
2. Performance decreases when joining the fact table to the (shrunken) dimension tables,
because more inner joins are required.
3. Some dimension tables do not maintain a direct relationship with the fact table.

PREPARED BY
BHASKAR REDDY.A
Mail:abreddy2003@gmail.com
Contact: 91-9948047694

