
👨🏻‍💻

Data Engineer Course (5 Days)


Outcomes
Understand the concepts of Big Data, data lakes, and the data engineer role.

Understand Apache NiFi dataflow development, common patterns (feature
extraction, database connectivity, etc.), debugging, and best practices.

Target Audience
Data Engineers

Data Analysts

Technical professionals with data-oriented responsibilities (e.g. scientists
needing more near real-time data, data engineers looking to build a new
data pipeline)

Prerequisites
Basic Knowledge of IT

There are no prerequisites for learning Apache NiFi. However, a basic
knowledge of the Linux command line can help.

Course Detail
Day-1
Big Data Overview
Introduction to Big Data and the various components of the big data
ecosystem

Data Lake Concepts and Constructs


Introduction to the concept of a data lake, its attributes, its support for
colocating data in various formats, and how it overcomes the problem of data
silos

Data Ingestion Layer in Data Lakes


How the ingestion layer transfers data from source to destination, and the
varieties of data ingestion

Data Ingestion Tools and Concepts


The NiFi Processor, the data ingestion tools available for importing,
transferring, loading and processing data, and an introduction to Apache NiFi
for data ingestion

Who is a Data Engineer?


Introduction to the data engineer role and its responsibilities

SQL Refresher with MySQL


A refresher on querying data with SQL
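
As a rough illustration of the kind of query such a refresher might cover, the sketch below filters, groups, aggregates, and sorts rows. It uses Python's built-in sqlite3 module as a stand-in for MySQL so it runs without a server; the `orders` table and its rows are made up for the example and are not part of the course material.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the table and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 30.0), ("bob", 12.5), ("alice", 20.0)],
)

# Filter, group, aggregate, and sort in a single statement.
rows = conn.execute(
    """
    SELECT customer, COUNT(*) AS n_orders, SUM(amount) AS total
    FROM orders
    WHERE amount > 10
    GROUP BY customer
    ORDER BY total DESC
    """
).fetchall()

for customer, n_orders, total in rows:
    print(customer, n_orders, total)

conn.close()
```

The same SELECT statement should also run on MySQL, although the connection code would differ.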

Day-2
Apache NiFi Concepts
The fundamental concepts of Apache NiFi: the FlowFile, FlowFile Processor,
and Flow Controller, and their attributes and functions in a dataflow
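
To make these concepts concrete, here is a toy model in Python. It is not the NiFi API: the `FlowFile` class, the two processor functions, `run_flow`, and the `size.bytes` attribute name are all hypothetical, chosen only to show that a FlowFile carries content plus attributes and that processors are chained by something playing the flow controller's role.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FlowFile:
    """Toy stand-in: a FlowFile pairs content bytes with key/value attributes."""
    content: bytes
    attributes: Dict[str, str] = field(default_factory=dict)


# A "processor" in this sketch is just a function from one FlowFile to another.
Processor = Callable[[FlowFile], FlowFile]


def tag_size(ff: FlowFile) -> FlowFile:
    # Adds an attribute without touching the content.
    return FlowFile(ff.content, {**ff.attributes, "size.bytes": str(len(ff.content))})


def to_upper(ff: FlowFile) -> FlowFile:
    # Rewrites the content and keeps the attributes.
    return FlowFile(ff.content.upper(), dict(ff.attributes))


def run_flow(ff: FlowFile, processors: List[Processor]) -> FlowFile:
    # Plays the flow controller's role: hands the FlowFile from step to step.
    for process in processors:
        ff = process(ff)
    return ff


result = run_flow(
    FlowFile(b"hello nifi", {"filename": "greeting.txt"}),
    [tag_size, to_upper],
)
print(result.content, result.attributes)
```

In NiFi itself the processors are configured components connected by queues rather than plain functions, so this only mirrors the data model.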

Apache NiFi Architecture


Introduction to the architecture of Apache NiFi and its components, including
the FlowFile Repository, Content Repository, Provenance Repository and the
web-based user interface

Installation Requirements and Cluster Integration

Apache NiFi installation requirements, cluster integration, successfully
running Apache NiFi, adding a processor, scaling up and down, and working
with attributes

Key Features of Apache NiFi


Important features of Apache NiFi, the dataflow function, and various aspects
of the FlowFile, FlowFile Processor, Flow Controller, Process Group and
connection

Queuing and Buffering Data


Connecting NiFi to a database, transforming, splitting and aggregating data,
the process of data egress, monitoring of NiFi, reporting, data lineage, NiFi
administration and expression language

Day-3
Lab #1: Create a simple data pipeline
Apply fundamental concepts of Apache NiFi to real world examples tailored
to your specific domain. Examples: working with IoT data, parsing financial
reports, or managing a streaming feed of social media data.

Real World Examples & Lab #2: Customizing dataflows

Expansion of Lab #1 with more complex features and version control

Data Structures and Usage in Apache NiFi & Lab #3

How Apache NiFi works as an event processing framework rather than a batch
framework, incorporating content from Days 1 and 2, and different approaches
to common use cases. The lab data is in Excel or CSV format.
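
The difference between the two styles can be sketched in plain Python on a small CSV sample. This is only an illustration of the idea, not NiFi code: the sample data and the function names `batch_total` and `running_totals` are made up. The batch version loads every row before computing, while the event version updates its result as each row arrives.

```python
import csv
import io
from typing import Dict, Iterator, List

SAMPLE_CSV = "sensor,reading\na,1.5\nb,2.0\na,3.5\n"  # made-up data


def batch_total(text: str) -> float:
    # Batch style: load every row first, then compute once.
    rows: List[Dict[str, str]] = list(csv.DictReader(io.StringIO(text)))
    return sum(float(r["reading"]) for r in rows)


def running_totals(text: str) -> Iterator[float]:
    # Event style: handle each row as it arrives and emit an updated result.
    total = 0.0
    for row in csv.DictReader(io.StringIO(text)):
        total += float(row["reading"])
        yield total


print(batch_total(SAMPLE_CSV))
print(list(running_totals(SAMPLE_CSV)))
```

Both functions reach the same final total; the event version simply makes intermediate results available before the input ends.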

Day-4
Practice and Customization of Data Pipelines, Continue Lab #3
Continuation of lab exercise #3 - domain specific data lab

Database Data Ingestion: Lab #4


Collecting data from SQL and NoSQL systems, building a data ingestion
pipeline, monitoring the pipeline to ensure that it is running smoothly,
and troubleshooting any issues that arise.
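
One common approach to database ingestion is to pull only the rows added since the previous run. The sketch below shows that idea with Python's built-in sqlite3 module standing in for the source system; the `events` table, the `fetch_new_rows` helper, and the watermark variable are hypothetical and are not taken from the lab itself.

```python
import sqlite3
from typing import List, Tuple


def fetch_new_rows(conn: sqlite3.Connection, last_id: int) -> List[Tuple[int, str]]:
    # Only rows past the watermark are read, so repeated runs do not re-ingest.
    cur = conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id", (last_id,)
    )
    return cur.fetchall()


# Made-up source table with two initial rows.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
source.executemany("INSERT INTO events (payload) VALUES (?)", [("a",), ("b",)])

watermark = 0
ingested: List[Tuple[int, str]] = []

# First run picks up everything currently in the table.
first = fetch_new_rows(source, watermark)
ingested.extend(first)
watermark = first[-1][0] if first else watermark

# A new row arrives; the second run picks up only that row.
source.execute("INSERT INTO events (payload) VALUES ('c')")
second = fetch_new_rows(source, watermark)
ingested.extend(second)
watermark = second[-1][0] if second else watermark

print(ingested, watermark)
```

Logging the number of rows fetched on each run is a simple starting point for the monitoring and troubleshooting part of the lab.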

Day-5
API Data Ingestion: Lab #5

Collecting data from an API, building a data ingestion pipeline using NiFi,
scheduling the pipeline to run at regular intervals (daily, hourly, etc.),
and automating the process
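
The polling pattern behind scheduled API ingestion can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the lab's NiFi flow: `fake_fetch` is a hypothetical stand-in for a real HTTP call, and `run_scheduled` with its `interval_s` and `runs` parameters is made up for the example.

```python
import json
import time
from typing import Callable, List


def fake_fetch() -> str:
    # Hypothetical stand-in for an HTTP call; returns a JSON payload.
    return json.dumps({"status": "ok", "items": [1, 2, 3]})


def run_scheduled(
    fetch: Callable[[], str],
    interval_s: float,
    runs: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    # Calls fetch on a fixed interval and collects the parsed responses.
    results: List[dict] = []
    for i in range(runs):
        results.append(json.loads(fetch()))
        if i < runs - 1:
            sleep(interval_s)  # wait before the next poll
    return results


# The interval is zero here so the example finishes immediately.
batches = run_scheduled(fake_fetch, interval_s=0.0, runs=3)
print(len(batches), batches[0]["items"])
```

For an hourly schedule the interval would be 3600 seconds; in NiFi the equivalent is set through the processor's scheduling configuration rather than a loop.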

Advanced Use Case (Complex Problem): Lab #6


Building solutions for one or two complex real-world problems.

References

1. https://www.webagesolutions.com/courses/TP2913-cloud-data-engineering-with-nifi-on-aws-or-gcp

2. https://intellipaat.com/apache-nifi-training/

3. https://www.gologica.com/course/apache-nifi-training/#AboutCourse
