Apache Airflow Developer Training

Course Content
Training Duration: 5 Days

*** Two modules are covered each day: Module 1 before the lunch break and Module 2 after it.

*** Every module includes a practical component.

• Prerequisite: Python

*** 90% hands-on, 10% theory

Day 1 ::: Apache Airflow Introduction

• Big Data vs. Normal ETL Pipelines

• Why Do We Need Airflow?
• First Approach to Airflow
• Introduction to Airflow
• Airflow Architecture
• Working Model of Airflow
• Installing Airflow
• Quick Tour of the Airflow UI
• Quick Tour of the Airflow CLI
• Setting Environment Variables and Starting the Web Server
• Setting Up Encryption to Secure Connection Secrets
• Airflow Configuration Option: Maximum Active Runs

Day 2 :::

• Airflow Configuration Overview

• Configuration Options: ORM Configuration
• Configuration Option: Maximum Active Runs Explained
• Configuration Option: Maximum Active Runs Explained (Continued)
• Configuration Options: Additional Configuration Settings
• Coding Your First Data Pipeline with Airflow
• DAG Explanation
• Time to Code Your First DAG (Python)

Day 3 :::

• Operators
• Let's Use Operators in Practice
• Operator Relationships and Bitshift Composition
• Adding Dependencies
• How Does the Scheduler Work?
• A Quick Play with Backfill and Catchup
• Workflow Description

• Developing a Data Pipeline

• Hands-on: Project Setup
• Hands-on: Data Retrieval from the File System
• Hands-on: Merging DataFrames
• Hands-on: Aggregation Using Pandas
• Hands-on: Database Connectivity (PostgreSQL / MySQL)
• Hands-on: Creating DAGs

Day 4 :::

• Databases and Executors

• Introduction: Sequential Executor with SQLite
• Local Executor with PostgreSQL
• Configure a DAG with Local Executor and PostgreSQL
• Celery Executor with PostgreSQL and RabbitMQ
• [Practice] Configure a DAG with Celery Executor, PostgreSQL and RabbitMQ

• Implementing Advanced Concepts in Airflow

• Introduction
• Minimising Repetitive Patterns with SubDAGs
• Minimising a DAG with SubDAGs
• How to Interact with External Sources Using Hooks
• Getting Results from PostgreSQL Using Hooks
• How to Share Data Between Your Tasks with XComs
• Sharing Your First Messages Using XComs
• How to Execute Tasks According to Criteria Using Branching
• Make Your First Conditional Task Using Branching
• Control Your Tasks with SLAs
• Defining an SLA in a DAG
 AIRFLOW SENSORS

Day 5 :::

• Creating Airflow Plugins with Elasticsearch and PostgreSQL

• Adding Functionalities to Apache Airflow
• Creating a Hook to Interact with Elasticsearch
• Creating a Transfer Operator: PostgresqlToElasticsearch
• Adding a View to the Apache Airflow UI

DATA PROFILING IN AIRFLOW

• Ad Hoc Queries
• Querying Metadata Tables
• Charts in Airflow

• Executors
• Configure Local Executor
• Configure Celery Executor
• Service Level Agreements (SLAs)
• Security: Authentication, Roles, Encryption
• Write Logs to a Remote Location
• Monitor Airflow with StatsD, Prometheus and Grafana
• Managed Airflow Services
