
Syllabus of DataStage Course

Module 1: Introduction to Data Warehouse Concepts

 • What is a Data Warehouse?
 • Data Mart
 • OLAP vs. OLTP
 • Data Warehouse Architecture
 • What is Data Modelling?
 • Exploring Dimensional Modelling
 • Exploring the Star Schema
 • Explaining the Snowflake Schema
 • Understanding Dimensions
 • Understanding Facts
 • Slowly Changing Dimensions
 • Lifecycle of a Data Warehouse

Module 2: Understanding ETL (Extraction, Transformation, Load)

 • Overview of ETL
 • Features and benefits for business
 • Different SCD types
 • ETL tools on the market
 • Staging tables explained
 • Transformation explained
 • Loading data into tables at different stages

Module 3: Overview of InfoSphere DataStage

 • What is InfoSphere DataStage?
 • Architecture of DataStage
 • Topologies explained
 • Components in DataStage
 • Runtime Architecture
 • OSH Script and Execution Flow

Module 4: Installation and Configuration of InfoSphere DataStage

 • Prerequisites for InfoSphere DataStage
 • Install InfoSphere DataStage
 • Verify Installation
 • Set up Environment variables
 • Create / Update / Delete projects
 • User creation and granting permissions

Module 5: Working with DataStage Designer

 • Overview of Designer
 • Exploring DataStage Designer
 • High-level overview of commonly used Stages
 • Schema
 • Pipeline Parallelism
 • Manipulating the configuration file
 • Repository Palette
 • Passive and Active stages
 • Annotations and creating jobs
 • Import and Export Metadata
 • Dataset Management
 • Partitioning techniques

Module 6: DataStage Job

 • Overview of Job types
 • Sequence and Parallel Jobs explained
 • Server Jobs explained
 • Different stages
 • Understanding Containers

Module 7: DataStage Director

 • Introduction to DataStage Director
 • Director User Interface
 • Job status and view
 • Compiling Single and Multiple jobs
 • Run, Reset and Restart jobs
 • Scheduling Batches
 • Performance monitor

Module 8: Creating Parallel Job

 • Overview of Parallel Jobs
 • Design a Parallel Job using Designer
 • Pipeline Parallelism
 • Partition Parallelism
 • Working in NLS Mode
 • Maps in Parallel Jobs
 • Run Parallel Jobs

Module 9: Handling Files

 • Introduction to file handling
 • Sequential and Complex Flat File stages
 • Huge File Manipulation
 • Error and Invalid Record Rejection

Module 10: File Stages

 • File Stages
 • Sequential File stage
 • DataSet explained
 • Complex Flat File stage
 • Create jobs to read and write sequential files
 • Reading multiple files using file patterns
 • Null handling in File Stages
 • Lookup File Set

Module 11: Combining and Partitioning Data

 • Overview of data processing for combining and partitioning
 • Combine data using the Lookup stage
 • Combine data using the Merge stage
 • Combine data using the Join stage
 • Combine data using the Funnel stage

Module 12: Sorting and Aggregating Data

 • Sort data using in-stage sorts and the Sort stage
 • Data aggregation using the Aggregator stage
 • Unique data using the Remove Duplicates stage

Module 13: Transformation on Data

 • Understanding DataStage internal logical messages
 • Column Generator and Row Generator
 • Transform messages from one format to another
 • Filter data based on business criteria
 • Control data flow based on data conditions
 • Cover real-time scenarios using different Processing stages
 • Route creation

Module 14: Working with Relational Databases

 • Understanding Database stages
 • Database Metadata
 • ODBC Connection explained
 • Import Definitions for Tables
 • Use Connector stages in a job
 • Define SQL statements using SQL Builder
 • ODBC Connector
 • Oracle Connector
 • Parallel Job with Connector

Module 15: Advanced Parallel Jobs

 • Overview of Type 1 and Type 2 processes
 • Range lookup process
 • Job Performance analysis
 • Performance tuning

Module 16: Job Sequence

 • Job activities in Sequencer
 • Sequence Trigger
 • Notification and Terminator activity
 • Start and End Loop activity
 • Error and Exception handling activity

Module 17: Working with Data Cleansing

 • Overview of Cleansing
 • Standardization workflow explained
 • Create and configure a Standardize stage job
 • Rule sets explained
 • Managing Rule sets

Module 18: Exception Handling in DataStage

 • Introduction to Exception Handling
 • How to design a job that links to exceptions
 • Exception stage explained
 • One-source and Two-source Match Exception Stage
 • Route exceptions to the Exception Stage

Module 19: Deployment on InfoSphere DataStage

 • Introduction to InfoSphere Information Server Manager
 • Deployment life cycle explained
 • Adding a Domain in Information Server Manager
 • View job and asset properties
 • Deployment Packages explained
 • Deployment Workflow
 • Define a Deployment Package
 • Set up the Deploy Path
 • Deploy a Package
 • Import and Export Assets
 • Types of Source Control for DataStage explained

Module 20: Monitoring Jobs

 • Introduction to Monitoring Jobs
 • Operations Console explained
 • How to monitor jobs using the console

Module 21: Job Performance Tuning

 • Understanding activities that impact performance
 • Design Jobs for Optimal Performance
 • Design flows that minimize CPU and Memory usage
 • Buffering explained
 • Deadlock prevention

Module 22: Best Practices for DataStage and Data Loading
