
Azure Data Engineer Online Training Plan

Demo
Online Software
Domain Understanding
How Data Preparation Happens
Different Files to store Data
Where to Store the Data
How to Work with Stored Data
Data Transferring
Azure Cloud/Non Cloud
Azure Introduction
Azure Data Engineering Architectures
Azure Data Engineering Technologies with a few examples
Course Duration and Details
Data Manipulation and Querying with SQL, Snowflake, and Databricks

Concept | Azure SQL | Snowflake | Azure Databricks | Azure Task (Storage/Blob/Data Lake)
Introduction | Azure Subscription, Resource Groups, Azure AD
Technology creation in Azure account | Azure SQL creation | Snowflake creation | Azure Databricks creation | Create storage
Virtual machine processors | DTU | T-shirt sizes | Clusters | NA
Login
Library | NA | NA | PySpark | NA
Data store | Tables | Tables | DataFrames, tables | Files

variables NA
Data types | Varchar, Nvarchar, int, decimal, date, timestamp, etc. | Varchar, Nvarchar, int, decimal, date, timestamp, etc. | Spark SQL data types: Varchar, Nvarchar, int, decimal, date, timestamp, etc. | NA
Alter tables | Drop table, add columns, etc. | Drop table, add columns, etc. | Drop table, add columns, etc. | NA
Conditional constraint | Constraints | Constraints | DQ | NA
Integrity constraints | Primary Key, Foreign Key, Unique | Primary Key, Foreign Key, Unique | Not applicable | NA
Python & Spark basics | NA | NA | Spark data types, Python data types (set, list, tuple, dictionary) | NA
Data preparation | Insert data into tables | Insert data into tables | Insert data into DataFrames | NA
ETL/ELT (ADF, ADB)
Prepare data in source system | 1. SQL Server, 2. API, 3. File System | Files maintained in ADLS

Connection establishment | Connectors in ADF | Connectors in ADF | Connectors in Databricks | ADLS connections
Read data | How to read data from a SQL table using ADF | How to read data from a Snowflake table using ADF | How to read data from SQL, Snowflake, and files using Databricks | NA
Write data to ADLS | Write files to ADLS using ADF | Write files to ADLS using Databricks
Transformations
Querying on read data | SELECT, DISTINCT, WHERE, AND, OR, IS NULL, IS NOT NULL, IN, =, NOT IN, BETWEEN, != or <>, LIKE, ORDER BY, etc. | SELECT, DISTINCT, WHERE, AND, OR, IS NULL, IS NOT NULL, IN, =, NOT IN, BETWEEN, != or <>, LIKE, ORDER BY, etc. | Select, filter, etc. | NA
Temp tables | How to create temp tables | How to create temp tables | How to create temp tables | NA
External/other sources | NA | Internal tables, external tables | Delta files, other files | Data files
Store queried data into a table | Select * into | Create or replace as | DataFrame, temp table | NA
Working on multiple queries | Union, Union All, Intersect, Except | Union, Union All, Intersect, Except | Union, Union All, Intersect, Subtract | NA
Joins | Inner, left, right, full | Inner, left, right, full | Inner, left, right, full | NA
Subquery & correlated subquery | Examples | Examples | Not applicable | NA
Virtual storage | Views | Views | Views | NA
Temporary storage result | CTE | CTE | CTE in Spark tables | NA
Predefined functions | String, aggregate, date, math, conversion, window functions | String, aggregate, date, math, conversion, window functions | Conversion in Python; Spark string, aggregate, date, conversion, window functions | NA
User-defined functions | NA | NA | Creating methods with Python syntax, with various examples | NA
Updates on data | Update query | Update query | Merge | NA
Data removal | Delete, Truncate | Delete, Truncate | Drop | NA
Semi-structured data | NA | NA | JSON, JSON functions, structured to semi-structured, semi-structured to structured | NA
Indexing | Clustered, non-clustered | Not applicable | Not applicable | NA
Write all queries at once | Stored procedure | Stored procedure | Notebooks | NA
Write data | NA | NA | To SQL, Snowflake, Data Lake Storage
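Several of the querying topics listed above (filtering, joins, set operators, CTEs) can be tried out locally before touching a cloud service. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in; it is not Azure SQL, Snowflake, or Databricks, syntax details differ between engines, and the `customers` and `orders` tables are hypothetical examples rather than course material.

```python
import sqlite3

# In-memory SQLite is only a local stand-in (assumption for illustration).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical sample tables with primary key and foreign key constraints.
cur.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    amount REAL
);
INSERT INTO customers VALUES (1, 'Asha', 'Pune'), (2, 'Ben', 'Delhi'), (3, 'Chen', NULL);
INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);
""")

# Filtering: WHERE with IS NOT NULL, IN, and ORDER BY.
named_cities = cur.execute(
    "SELECT name FROM customers "
    "WHERE city IS NOT NULL AND city IN ('Pune', 'Delhi') ORDER BY name"
).fetchall()

# Left join plus aggregate: customers without orders still appear, with total 0.
totals = cur.execute("""
    SELECT c.name, COALESCE(SUM(o.amount), 0) AS total
    FROM customers c LEFT JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id, c.name ORDER BY c.id
""").fetchall()

# Set operator: customer ids that never appear in orders.
no_orders = cur.execute(
    "SELECT id FROM customers EXCEPT SELECT customer_id FROM orders"
).fetchall()

# CTE: a temporary named result used by the outer query.
big_spenders = cur.execute("""
    WITH spend AS (
        SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id
    )
    SELECT customer_id FROM spend WHERE total > 200
""").fetchall()

print(named_cities, totals, no_orders, big_spenders)
```

The same statements would need adapting per platform, for example `Select * into` on Azure SQL versus `Create or replace as` on Snowflake, as the table above notes.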
ETL/ELT with ADF and ADB
Role of Azure Data Factory in the Above Technologies

Concept | ADF | ADB
ADF introduction | What is ADF, use of ADF, how to call other technologies (Azure SQL, Snowflake, Databricks) in ADF | How to process the above technologies in ADB
Why ADF | Advantage of pipelines | Advantage of notebooks
Integration runtime | How to access ADF to process external data | How to access ADB to process external data
Linked services | Azure Blob Storage, Azure Data Lake Gen2, Azure SQL Database, File System, Azure Key Vault, on-premises SQL, Snowflake, REST API | Azure Blob Storage, Azure Data Lake Gen2, Azure SQL Database
Datasets | File formats (CSV, JSON, Parquet, Excel) on Azure Data Lake Storage; tables on Azure SQL, Snowflake | CSV, JSON, Parquet
Pipelines | Processing workflows in ADF to achieve data engineering architectures | Importance of ADB in data engineering compared to ADF
Activities | What is an activity in ADF, why ADF, how to use; various activities such as Copy, Lookup, Get Metadata, Set Variable, Append Variable, ForEach, If, Filter, Switch | Intro related to activities
Copy activity | Filter by last modified, wildcard, preserve hierarchy, flatten hierarchy, merge, list of files, skip incompatible rows | How copy is done in ADB
Delete activity | Delete folders and files from different datasets | Remove files and folders
Get Metadata activity | Read metadata from folders, files, and tables from different datasets | Read metadata using Databricks
Lookup activity | Read data from different datasets | Read data from different datasets
Web activity | How to access semi-structured data over HTTPS
DQ | Mapping data flows | Custom DQ
Stored Procedure activity | Calling different procedures in ADF | NA
Notebook | Calling a Databricks notebook in ADF | Calling notebooks from another notebook
Functions in ADF | Collection functions, string functions, conversion functions, logical functions, date functions, math functions | Already covered
Parameters in ADF | Pipeline parameters, dataset parameters, global parameters
Linkage activities | Pass results into the next activity | NA
Expression writing in ADF | Dynamic content expressions with different functions in ADF and previous activity results | Select expr
Azure Data Factory data flows | Derived Column, Joins, Aggregations, Select, Exists, Lookup, Rank, Window, Conditional Split, Alter Row | NA
Iteration & Conditional Activities
If | Bypass the activities
For each | Iterate over the datasets to execute in parallel
Filter | Filter the activities in the ADF workflow
Switch | Run the activities based on the matching condition
Triggers
Schedule, tumbling window, event-based triggers
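Of the trigger types listed, the tumbling window is the one whose behavior is easiest to show in a few lines: it fires over a series of fixed-size, contiguous, non-overlapping time intervals. The sketch below only illustrates how such intervals are laid out in plain Python; it does not call any ADF API, and the function name `tumbling_windows` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def tumbling_windows(start, end, size):
    """Yield (window_start, window_end) pairs covering [start, end).

    Windows have a fixed size, touch each other with no gaps, and never
    overlap. A trailing partial window is not emitted.
    """
    if size <= timedelta(0):
        raise ValueError("size must be positive")
    window_start = start
    while window_start + size <= end:
        yield window_start, window_start + size
        window_start += size


# Three hourly windows between midnight and 03:00 (illustrative dates).
windows = list(
    tumbling_windows(
        datetime(2024, 1, 1, 0, 0),
        datetime(2024, 1, 1, 3, 0),
        timedelta(hours=1),
    )
)
for window_start, window_end in windows:
    print(window_start.isoformat(), "->", window_end.isoformat())
```

Because each window ends exactly where the next begins, every record timestamp falls into exactly one window, which is why this trigger type suits incremental loads.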
Querying Best Practices & Optimization
Make Things Dynamic for Your Projects with the Above Technologies
Concept | SQL | Snowflake | Databricks
Variables | What is a variable, examples | What is a variable, examples | Variables in Python
Parameters | What is a parameter, types and examples | What is a parameter, examples | Databricks parameters
Support multiple things using a single statement | Dynamic querying | Dynamic querying | Dynamic querying
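One way to picture "multiple things using a single statement" is a query template whose table and filter value are supplied at run time. The sketch below assumes Python's built-in sqlite3 module as a local stand-in for the platforms above; the table names, the `ALLOWED_TABLES` allow-list, and the `total_for_region` helper are all hypothetical.

```python
import sqlite3

# Hypothetical table names; an allow-list guards the dynamic identifier.
ALLOWED_TABLES = {"sales_2023", "sales_2024"}


def total_for_region(conn, table, region):
    """Run one query template against any allowed table and region."""
    # Identifiers cannot be bound as parameters, so validate them explicitly.
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unknown table: {table}")
    # The value is bound with a ? placeholder instead of string concatenation.
    sql = f"SELECT COALESCE(SUM(amount), 0) FROM {table} WHERE region = ?"
    return conn.execute(sql, (region,)).fetchone()[0]


conn = sqlite3.connect(":memory:")
for table_name in sorted(ALLOWED_TABLES):
    conn.execute(f"CREATE TABLE {table_name} (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales_2023 VALUES (?, ?)", [("east", 10.0), ("west", 5.0)])
conn.executemany("INSERT INTO sales_2024 VALUES (?, ?)", [("east", 20.0), ("east", 1.5)])

print(total_for_region(conn, "sales_2023", "east"))
print(total_for_region(conn, "sales_2024", "east"))
```

Binding the value as a parameter and validating the table name separately keeps the single statement reusable without opening it to SQL injection.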
