

Data Engineer Course (5 Days)

Understand the concepts of Big Data, Data Lakes, and data engineering.

Understand Apache NiFi dataflow development, common patterns (feature extraction, database connectivity, etc.), debugging, and best practices.

Target Audience
Data Engineers

Data Analysts

Technical professionals with data-oriented responsibilities (e.g. scientists needing near real-time data, or data engineers looking to build a new data pipeline)

Basic Knowledge of IT

There are no prerequisites for learning Apache NiFi. However, a basic knowledge of the Linux command line can help.

Course Detail
Big Data Overview
Introduction to Big Data and its various components

Data Lake Concepts and Constructs

Introduction to the concept of data lake, its attributes, support for
colocation of data in various formats and overcoming the problem of data



Data Ingestion Layer in Data Lakes

Used to transfer data from source to destination and varieties of data

Data Ingestion Tools and Concepts

NiFi Processor, the data ingestion tools available for importing, transferring,
loading and processing of data and introduction to Apache NiFi for data

Who Is a Data Engineer?

Introduction to the data engineer role and its responsibilities

SQL Refresher with MySQL

A refresher on querying data with SQL
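As a rough illustration of the kind of query such a refresher might cover, here is a minimal sketch of filtering, grouping and sorting. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server; the orders table, its columns and its rows are hypothetical and not part of the course material.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 120.0), ("bob", 35.5), ("alice", 80.0), ("carol", 15.0)],
)

# Total spend per customer, keeping only customers above a threshold,
# largest total first.
query = """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) > ?
    ORDER BY total DESC
"""
totals = conn.execute(query, (30,)).fetchall()
conn.close()

for customer, total in totals:
    print(customer, total)
```

The same SELECT statement should also be valid in MySQL, although the parameter placeholder style depends on the client library used.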

Apache NiFi Concepts
The fundamental concepts of Apache NiFi, the concepts of FlowFile,
FlowFile Processor, Flow Controller and their attributes and functions in
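To make the FlowFile idea more concrete, the sketch below models a FlowFile as a content payload plus key/value attributes, and a processor as a function from one FlowFile to another. This is a toy model in Python for teaching purposes only, not NiFi's actual Java API; the class, the function names and the `size.bytes` attribute are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class FlowFile:
    """Toy model: a content payload plus key/value attributes."""
    content: bytes
    attributes: Dict[str, str] = field(default_factory=dict)


# In this sketch a "processor" is any function from FlowFile to FlowFile.
Processor = Callable[[FlowFile], FlowFile]


def add_size_attribute(ff: FlowFile) -> FlowFile:
    """Attribute-only processor: records the content length as an attribute."""
    return FlowFile(ff.content, {**ff.attributes, "size.bytes": str(len(ff.content))})


def uppercase_content(ff: FlowFile) -> FlowFile:
    """Content processor: rewrites the payload and keeps the attributes."""
    return FlowFile(ff.content.upper(), dict(ff.attributes))


def run_flow(ff: FlowFile, processors: List[Processor]) -> FlowFile:
    """Loosely plays the role of a controller by passing the FlowFile along."""
    for processor in processors:
        ff = processor(ff)
    return ff


result = run_flow(
    FlowFile(b"hello nifi", {"filename": "greeting.txt"}),
    [add_size_attribute, uppercase_content],
)
print(result.content, result.attributes)
```

Keeping attributes separate from content means a step can route or annotate data by reading only the attributes, without touching the payload.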

Apache NiFi Architecture

Introduction to the architecture of Apache NiFi, various components
including FlowFile Repository, Content Repository, Provenance Repository
and web-based user interface

Installation Requirements and Cluster Integration

Apache NiFi installation requirements, cluster integration, successfully

running Apache NiFi, adding a processor, scaling up and down and working
with attributes

Key Features of Apache NiFi

Important features of Apache NiFi, the dataflow function, various aspects of
FlowFile, FlowFile Processor, Flow Controller, Process Group and connection

Queuing and Buffering Data

Connecting NiFi to database, transforming, splitting and aggregating data,
the process of data egress, monitoring of NiFi, reporting, data lineage, NiFi


administration and expression language

Lab #1: Create a simple data pipeline
Apply fundamental concepts of Apache NiFi to real world examples tailored
to your specific domain. Examples: working with IoT data, parsing financial
reports, or managing a streaming feed of social media data.

Real World Examples & Lab #2: Customizing dataflows

An extension of Lab #1 with more complex features and version control

Data Structures and Usage in Apache NiFi & Lab #3

How Apache NiFi works as an event processing framework rather than a batch framework, incorporating content from Days 1 and 2 and different approaches to common use cases. Input data is in Excel or CSV format.
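The difference between event-style and batch-style processing can be sketched outside NiFi in a few lines. The example below is only an illustration under stated assumptions: the CSV text, the sensor and reading columns, and the function names are hypothetical, and plain Python stands in for a real dataflow.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List

# Hypothetical CSV input; in a lab this would be a file on disk.
CSV_TEXT = "sensor,reading\na,1.5\nb,2.0\na,3.5\n"


def batch_total(text: str) -> float:
    """Batch style: load every row first, then compute one result."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return sum(float(row["reading"]) for row in rows)


def event_stream(text: str) -> Iterator[Dict[str, str]]:
    """Event style: hand rows downstream one at a time as they are read."""
    for row in csv.DictReader(io.StringIO(text)):
        yield row


def running_totals(events: Iterable[Dict[str, str]]) -> List[float]:
    """Update the result after every event instead of waiting for the end."""
    total = 0.0
    history = []
    for event in events:
        total += float(event["reading"])
        history.append(total)
    return history


print(batch_total(CSV_TEXT))
print(running_totals(event_stream(CSV_TEXT)))
```

The batch function produces a single answer once all the data has arrived, while the event version has an up-to-date answer after each record.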

Practice and Customization of Data Pipelines, Continue Lab #3
Continuation of Lab #3: a domain-specific data lab

Database Data Ingestion: Lab #4

Collecting data from SQL and NoSQL systems, building a data ingestion
pipeline, monitoring the pipeline to ensure that it runs smoothly,
and troubleshooting any issues that arise.
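One common pattern for database ingestion is incremental extraction: remember the highest key already ingested and fetch only newer rows on each run. The sketch below shows that idea with Python's built-in sqlite3 module standing in for the source database. The events table, its columns and the fetch_new_rows helper are hypothetical, and this is not how NiFi itself is configured.

```python
import sqlite3
from typing import List, Tuple


def fetch_new_rows(conn: sqlite3.Connection, last_id: int) -> Tuple[List[tuple], int]:
    """Return rows with id greater than last_id, plus the updated watermark."""
    rows = conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()
    new_last_id = rows[-1][0] if rows else last_id
    return rows, new_last_id


# Hypothetical source table, held in memory for the sake of the example.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
source.executemany("INSERT INTO events (payload) VALUES (?)", [("a",), ("b",)])

# First run: nothing ingested yet, so the watermark starts at 0.
first_batch, watermark = fetch_new_rows(source, 0)

# A new row arrives between runs; the second run picks up only that row.
source.execute("INSERT INTO events (payload) VALUES (?)", ("c",))
second_batch, watermark = fetch_new_rows(source, watermark)

print(first_batch, second_batch, watermark)
```

Persisting the watermark between runs is what lets a pipeline restart without re-ingesting rows it has already processed.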

API Data Ingestion: Lab #5

Collecting data from an API, building a data ingestion pipeline using NiFi,
scheduling the pipeline to run at regular intervals (daily, hourly, etc.),
and automating the process.
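The idea of polling an API on a fixed schedule can be sketched as a small loop. The example below is a minimal sketch rather than a NiFi configuration: poll and fake_fetch are hypothetical names, and the fetch step is a local stand-in so the code runs without network access. In practice fetch would wrap a real HTTP call.

```python
import json
import time
from typing import Callable, List


def poll(
    fetch: Callable[[], str],
    runs: int,
    interval_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Call fetch() a fixed number of times, pausing interval_s between calls."""
    collected: List[dict] = []
    for i in range(runs):
        # Each response is assumed to be a JSON array of records.
        collected.extend(json.loads(fetch()))
        if i < runs - 1:
            sleep(interval_s)
    return collected


def fake_fetch() -> str:
    """Stand-in for an HTTP request; returns one hypothetical record."""
    return json.dumps([{"id": 1}])


records = poll(fake_fetch, runs=3, interval_s=0.0)
print(len(records))
```

Passing the fetch and sleep functions in as parameters keeps the scheduling logic testable without waiting for real time to pass.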

Advanced Use Case (Complex Problem): Lab #6

Building solutions to one or two complex real-world problems.

