Advanced Level Syllabus 1.0
1. Introduction
Data engineers work in a variety of settings to build systems that collect, manage, and convert raw data into usable
information for data scientists and business analysts.
8. Training Syllabus
The program provides work-in-progress courses and sections covering the skills of the data engineer role, including but not limited to the following topics:
Section I: Big data landscape & Advanced Spark and Python (21 hours)

1. Big data foundation
   Hours: 4 | Vendor: Udemy | Lessons: All
   Course: A foundation course for big data that covers all the big data tools
   for various stages of a big data project.
   - Big data characteristics
   - Big data storage
   - Big data ingestion
   - Big data analytics
   - Big data visualization, security and vendors
   - Big data project
2. Introduction to Apache NiFi | Cloudera
   Hours: 2 | Vendor: Udemy | Lessons: All
   Course: Apache NiFi - An Introductory Course to Learn Installation, Basic
   Concepts and Efficient Streaming of Big Data Flows
   - Introduction to NiFi and first concepts
   - Getting started with NiFi
   - Apache NiFi in depth
   - Annexes
3. Taming Big Data with Apache Spark and Python - Hands On!
   Hours: 7 | Vendor: Udemy | Lessons: All
   Course: Apache Spark tutorial with 20+ hands-on examples of analyzing large
   data sets on your desktop or on Hadoop with Python!
   - Getting started with Spark
   - Spark basics and the RDD interface
   - Spark SQL, DataFrames and Datasets
   - Running Spark on a cluster
   - Machine learning with Spark ML
   - Spark Streaming, Structured Streaming and GraphX
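The RDD lessons above revolve around functional transformations such as flatMap, map, and reduceByKey. As a minimal sketch of that style (plain Python standard library only, no Spark cluster; the sample `lines` data is made up for illustration):

```python
from collections import Counter

# Sample input standing in for an RDD of text lines (illustrative data only).
lines = [
    "big data with spark",
    "spark streaming and spark sql",
]

# flatMap-style step: split every line into individual words.
words = [word for line in lines for word in line.split()]

# map + reduceByKey-style step: count occurrences per word.
counts = Counter(words)

print(counts["spark"])  # "spark" appears 3 times in the sample lines
```

In Spark proper the same shape becomes `sc.textFile(...).flatMap(...).map(lambda w: (w, 1)).reduceByKey(add)`; the point here is only the map/reduce pattern the course teaches.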
4. Mock Project 1
   Hours: 8 | Vendor: Self-paced
   Using NiFi to ingest data from many sources and process with Spark.
Section II: Cloud (29~33 hours; learner selects Azure or GCP)

1. GCP - Google Cloud Professional Data Engineer Certification
   Hours: 17 | Vendor: Udemy | Lessons: All
   Course: Learn Google Cloud Professional Data Engineer Certification with
   80+ hands-on demos on storage, Database, and ML GCP Services.
   - Grasp basic Data Engineering & Database concepts
9. Mock Projects
In the mock projects, you'll define the scenario specifications and the environment to set up on the Azure cloud platform.
I. Mock Project 1
   Tech stack: MongoDB, Cassandra, NiFi, Spark, Python
   Using NiFi to ingest data from many sources (MongoDB, social network,
   CMS DB...) and process with the Spark framework, storing on Azure Data Lake.
II. Mock Project 2
   Tech stack: Azure/Google Cloud Platform, Spark, Scala/Python
   Build a full data pipeline on GCP/Azure:
   - Extract from multiple data sources.
   - Transform/process the data using cloud services.
   - Load and store on a cloud DWH.
   - Visualize using cloud visualization tools.
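The four pipeline stages above can be sketched end to end in plain Python, using an in-memory SQLite database as a stand-in for the cloud DWH. This is a minimal sketch under stated assumptions: the source data, table name, and column names are all illustrative, not part of the syllabus, and real cloud services replace each stage in the actual project.

```python
import sqlite3

# Extract: two illustrative "sources" standing in for real cloud data sources.
source_a = [("2024-01-01", "signup", 3), ("2024-01-02", "signup", 5)]
source_b = [("2024-01-01", "login", 7)]

# Transform: merge the sources and normalize into (day, event, count) rows.
rows = [(day, event.upper(), cnt) for day, event, cnt in source_a + source_b]

# Load: store into an in-memory SQLite table standing in for the cloud DWH.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (day TEXT, event TEXT, cnt INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)

# Visualize: a crude text bar chart in place of a cloud visualization tool.
for day, total in conn.execute(
    "SELECT day, SUM(cnt) FROM events GROUP BY day ORDER BY day"
):
    print(f"{day} {'#' * total}")
```

Each stage maps one-to-one onto a bullet above; swapping SQLite for BigQuery or Synapse and the list literals for real extractors changes the plumbing, not the shape.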
Notes: