
edureka!

Discover Learning

Data Scientist Masters Program

Course Curriculum

About Edureka

Edureka is a leading e-learning platform providing live instructor-led interactive online
training. We cater to professionals and students across the globe in categories like Big
Data & Hadoop, Business Analytics, NoSQL Databases, Java & Mobile Technologies,
System Engineering, Project Management and Programming.

We have an easy and affordable learning solution that is accessible to millions of
learners. With our students spread across countries like the US, India, UK, Canada,
Singapore, Australia, Middle East, Brazil and many others, we have built a community of
over 1 million learners across the globe.

About The Course

Edureka's Masters Program provides an in-depth hands-on experience with tools &
systems used by Data Scientists. This program starts with Data Science training to
master data extraction, exploration techniques, and Machine Learning algorithms,
followed by Python, Apache Spark, and AI & Deep Learning using TensorFlow. The
program is a combination of interactive online classroom and self-paced sessions
curated and led by industry experts. The exhaustive curriculum sets this program one
step ahead of short-term certifications and transforms you into an expert Data Scientist.

www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.

Data Scientist

Index
1. Statistics Essentials for Analytics

2. Data Science Certification Training

3. Python Certification Training

4. Apache Spark and Scala Certification Training

5. AI & Deep Learning with TensorFlow

6. Tableau Training and Certification


Statistics Essentials for Analytics


Course Curriculum

About The Course

The self-paced Statistics Essentials for Analytics course is designed to help learners
understand and implement various statistical techniques. These techniques are
explained using dedicated examples. A use case is taken up at the end of each
module to gather insights, so that by the end of the course you have a project
that has been worked on consistently throughout.

Module 1 : Introduction to Statistics and Basic Probability

Learning Objectives

At the end of this module, you will be able to understand Skewness, Modality, Measures
of Center, Measures of Spread, etc. You will also understand the relationship between
these terms, and you will be able to analyze an airlines data set to gather insights.

Topics

Statistics & Basic Probability Measures of Center

Sampling Methods Measures of Spread


Module 2 : Basic Probability, Conditional Probability and Bayesian Inference

Learning Objectives

At the end of this module, you will be able to understand the concept and rules of
probability, learn about Disjoint and Independent events, and implement these
concepts on a case study. You will also learn Bayes' Theorem and implement it on
a case study.

Topics

Conditional Probability & Definitions

Bayesian Inference Examples

Terms Concepts & Applications
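
As a rough illustration of the kind of Bayes' Theorem case study this module describes, the short Python sketch below computes a posterior probability. The screening scenario, the function name posterior and all of the numbers are assumptions made up for this example; they are not taken from the course material.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) from Bayes' Theorem.

    prior               -- P(H), probability of the hypothesis before evidence
    likelihood          -- P(E | H), probability of the evidence if H is true
    false_positive_rate -- P(E | not H), probability of the evidence if H is false
    """
    # Total probability of seeing the evidence at all.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% prevalence, 95% sensitivity, 5% false positives.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 3))
```

Even with a fairly accurate test, the posterior stays low here because the prior is small, which is the usual point such case studies make.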

Module 3 : Distributions and Regression Modeling

Learning Objectives

At the end of this module, you will be able to understand the Normal distribution,
interpret z-scores and calculate percentiles, and work with the Binomial Distribution,
its Mean and Standard deviation. You will also understand the Milgram Experiment.

Topics

Probability Distributions & Regression Modeling

Normal Distribution

Binomial Distribution

Linear Regression Model and Analysis
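
To make the z-score, percentile and Binomial mean/standard deviation ideas above concrete, here is a small Python sketch using only the standard library (statistics.NormalDist needs Python 3.8 or later). The test-score figures and the Binomial parameters are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical normally distributed scores: mean 100, standard deviation 15.
mean, sd = 100.0, 15.0
x = 130.0

# z-score: how many standard deviations x lies above the mean.
z = (x - mean) / sd

# Percentile: share of the distribution at or below x, via the standard normal CDF.
percentile = NormalDist().cdf(z) * 100

# Binomial distribution with n trials and success probability p (assumed values).
n, p = 20, 0.3
binom_mean = n * p
binom_sd = sqrt(n * p * (1 - p))

print(z, round(percentile, 2), binom_mean, round(binom_sd, 3))
```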



Data Science Certification Training


Course Curriculum

About The Course

Data science is a "concept to unify statistics, data analysis and their related methods"
in order to "understand and analyze actual phenomena" with data. It employs
techniques and theories drawn from many fields within the broad areas of
mathematics, statistics, information science, and computer science, in particular from
the subdomains of machine learning, classification, cluster analysis, data mining,
databases, and visualization. The Data Science Certification Training Course enables
you to gain knowledge of the entire Life Cycle of Data Science, including analyzing and
visualizing different datasets and applying Machine Learning Algorithms such as K-Means
Clustering, Decision Trees, Random Forest, and Naive Bayes.

Module 1 : Introduction to Data Science

Learning Objectives

At the end of this Module, you should be able to: Define Data Science. Discuss the era of

Data Science. Describe the Role of a Data Scientist. Illustrate the Life cycle of Data

Science. List the Tools used in Data Science. State what role Big Data and Hadoop, R,

Spark and Machine Learning play in Data Science.


Topics

What is Data Science? Tools of Data Science

What does Data Science involve? Introduction to Big Data and Hadoop

Era of Data Science Introduction to R

Business Intelligence vs Data Science Introduction to Spark

Life cycle of Data Science Introduction to Machine Learning

Module 2 : Statistical Inference

Learning Objectives

After this module students will be able to: Define Statistical Inference. List the

Terminologies of Statistics. Illustrate the measures of Center and Spread. Explain the

concept of Probability. State Probability Distributions.

Topics

What is Statistical Inference? Probability

Terminologies of Statistics Normal Distribution

Measures of Centers Binary Distribution

Measures of Spread

Module 3 : Data Extraction, Wrangling and Exploration

Learning Objectives

After this module students will be able to: Discuss Data Acquisition techniques. List the

different types of Data. Evaluate Input Data. Explain the Data Wrangling techniques.

Discuss Data Exploration.


Topics

Data Analysis Pipeline Data Wrangling

What is Data Extraction Exploratory Data Analysis

Types of Data Visualization of Data

Raw and Processed Data

Module 4 : Introduction to Machine Learning

Learning Objectives

After this module students will be able to: Define Machine Learning. Discuss Machine

Learning Use cases. List the categories of Machine Learning. Illustrate Supervised

Learning Algorithms.

Topics

What is Machine Learning? Supervised Learning

Machine Learning Use-Cases Linear Regression

Machine Learning Process Flow Logistic Regression

Machine Learning Categories

Module 5 : Classification

Learning Objectives

After this module students will be able to: Define Classification. Explain different Types of

Classifiers such as Decision Tree, Random Forest, Naïve Bayes Classifier and Support

Vector Machine.


Topics

What is Classification and its use cases? Algorithm for Decision Tree Induction

What is Decision Tree? Creating a Perfect Decision Tree

Confusion Matrix What is Naive Bayes?

What is Random Forest? Support Vector Machine: Classification

Module 6 : Unsupervised Learning

Learning Objectives

After this module students will be able to: Define Unsupervised Learning.
Discuss Cluster Analysis techniques such as K-means Clustering, C-means Clustering
and Hierarchical Clustering.

Topics

What is Clustering & its Use Cases? What is Canopy Clustering?

What is K-means Clustering? What is Hierarchical Clustering?

What is C-means Clustering?

Module 7 : Recommender Engines

Learning Objectives

After this module students will be able to: Define Association Rules. Define

Recommendation Engine. Discuss types of Recommendation Engines like Collaborative

Filtering and Content-Based Filtering. Illustrate steps to build a Recommendation Engine.


Topics

What are Association Rules & their use cases?

What is a Recommendation Engine & how does it work?

Types of Recommendation

User-Based Recommendation

Item-Based Recommendation

Difference: User-Based and Item-Based Recommendation

Recommendation Use-case

Module 8 : Text Mining

Learning Objectives

After this module students will be able to: Define Text Mining. Discuss Text Mining
Algorithms like Bag of Words Approach and Sentiment Analysis.

Topics

The concepts of text-mining Quantifying text

Use cases TF-IDF

Text Mining Algorithms Beyond TF-IDF

Module 9 : Time Series

Learning Objectives

After this module students will be able to: Describe Time Series data. Format your Time
Series data. List the different components of Time Series data. Discuss different kinds of
Time Series scenarios. Choose the model according to the Time Series scenario.
Implement the model for forecasting. Explain the working and implementation of the ARIMA
model. Illustrate the working and implementation of different ETS models. Forecast the
data using the respective model.


Topics

What is Time Series data?

Time Series variables

Different components of Time Series data

Visualize the data to identify Time Series Components

Implement ARIMA model for forecasting

Exponential smoothing models

Identifying different time series scenarios based on which different Exponential Smoothing models can be applied

Implement respective ETS model for forecasting

Module 10 : Deep Learning

Learning Objectives

After this module students will be able to: Define Reinforcement Learning. Discuss
Reinforcement Learning Use cases. Define Deep Learning. Understand Artificial Neural
Networks. Discuss the basic Building Blocks of an Artificial Neural Network. List the
important Terminologies of ANNs.

Topics

Reinforcement Learning Understand Artificial Neural Networks

Reinforcement Learning Process Flow Building an Artificial Neural Network

Reinforcement Learning Use cases How ANN works

Deep Learning Important Terminologies of ANNs

Biological Neural Networks


Project Work

Industry: Human Resource management

Challenge : A company has been in the industry for a long time. Its business had been
growing well in the past; however, in recent years growth has slowed because its best
and most experienced employees are leaving prematurely. The VP of the firm is not happy
about this and has employed you to find insights in the company's employee data and
determine why the best and most experienced employees are leaving prematurely.


Python Certification Training


Course Curriculum

About The Course

Edureka's Python Certification Training not only focuses on the fundamentals of Python,
Statistics, Machine Learning and Spark but also helps you gain expertise in applied Data
Science at scale using Python. The training is a step-by-step guide to Python and Data
Science with extensive hands-on practice. The course helps you gain practical experience
in addressing an automation problem that requires either Python alone or Machine
Learning using Python. It starts from the basics of Statistics, such as mean, median and
mode, and moves on to Data Analysis, Regression, Classification, Clustering,
Naive Bayes, Cross Validation, Label Encoding, Random Forests, Decision Trees
and Support Vector Machines, each with a supporting example and exercise to help you get
into the weeds. The course also covers basic & advanced concepts of Python, such as
writing Python scripts and sequence and file operations. You will use libraries like Pandas,
Numpy, Matplotlib, Scipy, Scikit and Pyspark, and master concepts like Python machine
learning, scripts, sequences, web scraping and big data analytics leveraging Apache Spark.

Module 1 : Introduction to Python

Learning Objectives

At the end of this Module, you should be able to understand Python – an Object oriented
Programming Language, List the Users of Python for Data Analytics, Define Identifiers
and Indentation, List Operations on Strings and Numbers, Run a Python Script.

Topics
Get an overview of Python Start Python


Learn about Interpreted Languages Discuss Interpreter PATH

List the Advantages/Disadvantages of Python Run a Python Script


Explore Pydoc Discuss Python Scripts on UNIX/Windows

Explore Python Editors and IDEs

Module 2 : Sequences and File Operations

Learning Objectives

At the end of this module, you will be able to Define Reserved Keywords and Command

Line Arguments, Describe how to Get User Input from Keyboard, Describe Flow Control

and Sequences, Practice Working with Files, Define and Describe Dictionaries and Sets.

Topics

Lists Iterating through a sequence

Tuples Functions for all sequences

Indexing and Slicing Using enumerate()

Generator expressions Operators and keywords for sequences

Dictionaries and sets The xrange() function

Working with files List comprehensions

Modes of opening a file File methods

File attributes
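
A few of the sequence topics listed above can be sketched in a handful of lines. The sample list and the variable names are hypothetical. Note that xrange() exists only in Python 2; the sketch assumes Python 3, where range() plays the same role.

```python
# Hypothetical sample data for illustration.
languages = ["Python", "Scala", "R"]

# enumerate() pairs each item with its index; a list comprehension collects the pairs.
indexed = [(i, name) for i, name in enumerate(languages)]

# A generator expression feeds sum() without building an intermediate list.
total_chars = sum(len(name) for name in languages)

# Slicing returns a new list holding the first two items.
first_two = languages[:2]

# range() lazily yields the indices 0, 1, 2 (the Python 3 counterpart of xrange()).
indices = list(range(len(languages)))

print(indexed, total_chars, first_two, indices)
```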

Module 3 : Deep Dive – Functions, Sorting, Errors and Exceptions, Regular Expressions and Packages

Learning Objectives

At the end of this Module, you should be able to explain Functions and various forms of Function

Arguments, explain Standard Library, define modules, describe Zip Archives and Packaging.


Topics

Functions Sorting

Function Parameters Alternate Keys

Global variables Lambda Functions

Variable scope and Returning Values Sorting collections of collections

Errors and Exception Handling Handling multiple exceptions

Sorting dictionaries The standard exception hierarchy

Sorting lists in place Using Modules

Module 4 : Object Oriented Programming in Python

Learning Objectives

At the end of this Module, you should be able to implement Regular Expression and its

Basic Functions; Use Classes, Objects, and Attributes, Develop applications based on

Object Oriented Programming and Methods

Topics

The sys Module Math Function

Interpreter information Random Numbers

STDIO Dates and Times

Launching external programs Zipped Archives

Paths Introduction to Python Classes

Directories and filenames Defining Classes

Walking directory trees Initializers

Instance methods Properties

Class methods and data Static methods

Private methods and Inheritance


Module 5 : Debugging, Databases, and Project Skeletons

Learning Objectives

At the end of this Module, you should be able to debug python scripts using pdb, debug

python scripts using IDE, classify Errors, develop Unit Tests, create project Skeletons,

implement Database using SQLite and perform CRUD operations on SQLite database.

Topics

Debugging Creating a database with SQLite 3

Dealing with errors CRUD operations

Using unit tests Creating a database object.

Project Skeleton Project Directory

Required packages Final Directory Structure

Creating the Skeleton Testing your set up

Using the skeleton
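
The CRUD operations on SQLite mentioned in this module can be sketched with Python's built-in sqlite3 module. This is a minimal example under assumptions: the employees table, its columns and the sample rows are hypothetical, and an in-memory database is used so that nothing is written to disk.

```python
import sqlite3

# In-memory database: it disappears when the connection is closed.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create: define a (hypothetical) table and insert two rows.
cur.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
cur.execute("INSERT INTO employees (name, dept) VALUES (?, ?)", ("Asha", "HR"))
cur.execute("INSERT INTO employees (name, dept) VALUES (?, ?)", ("Ravi", "IT"))
conn.commit()

# Read: fetch all rows in insertion order.
rows = cur.execute("SELECT name, dept FROM employees ORDER BY id").fetchall()

# Update: move one employee to another department.
cur.execute("UPDATE employees SET dept = ? WHERE name = ?", ("Finance", "Asha"))

# Delete: remove the other employee.
cur.execute("DELETE FROM employees WHERE name = ?", ("Ravi",))
conn.commit()

remaining = cur.execute("SELECT name, dept FROM employees").fetchall()
conn.close()

print(rows, remaining)
```

The ? placeholders let sqlite3 bind the values safely instead of building SQL strings by hand.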

Module 6 : Statistics – Machine Learning Prerequisites

Learning Objectives

At the end of this Module, you should be able to discuss: Statistics - data terminology,
measurement scales, types of data. Libraries - IPython, Matplotlib. Measures, Moments,
Variance, Std. Deviation using Numpy. Distributions, Probability and Bayes’ Theorem
using Scipy. Numpy - arrays, matrices, related operations. Scipy - overview, areas of
application.


Topics

Data terminology Covariance and correlation

Scales of measurement Conditional probability

Types of data Bayes theorem

IPython notebook installation Distribution/Probability functions

Numerical measure Installing Numpy

Matplotlib introduction Numpy arrays and matrices

Deviation and variance Installing Scipy

Standard deviation Scipy Modules and stats

Module 7 : Machine Learning using Python – Essentials

Learning Objectives

At the end of this Module, you should be able to Define Machine Learning and

understand Supervised vs Unsupervised. Apply Supervised Learning process flow,

regression analysis. Apply Unsupervised Learning process flow, clustering. Apply Linear

Regression, Multivariate Regression. Measure accuracy using Mean Squared Error, Cross

Validation. Analyze data using Pandas.

Topics

Introduction to Machine Learning

Linear regression and mean squared error

Areas of implementation of Machine learning

Why Python Multivariate regression

Major classes of Learning Algorithms Cross validation

Supervised vs. Unsupervised learning Regression Summary

Inference models Introduction to Pandas


Creating Data frames Creating functions

Grouping Converting different formats

Sorting Combining data from various formats

Plotting Data Slicing/Dicing operations
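
As a small illustration of the linear regression and mean squared error topics in this module, the sketch below fits a one-variable least-squares line in plain Python. The data points are made up for the example, and the closed-form slope/intercept formulas are used here instead of any particular library, which the course may or may not do.

```python
# Hypothetical data that lies exactly on the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares for a single feature:
# slope = covariance(x, y) / variance(x), intercept = mean_y - slope * mean_x
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x

# Mean squared error: average of the squared prediction errors.
predictions = [slope * x + intercept for x in xs]
mse = sum((y - p) ** 2 for y, p in zip(ys, predictions)) / n

print(slope, intercept, mse)
```

With noisy real-world data the error would not be zero; comparing it across models or cross-validation folds is how it is typically used to measure accuracy.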

Module 8 : Data Analysis and Machine Learning – Deep Dive

Learning Objectives

At the end of this Module, you should be able to: Feature engineer datasets using PCA,

Bias/Variance analysis. Apply classification algorithms like KNN, Random Forests, SVM

etc. Apply clustering algorithms like K-Means, Hierarchical clustering etc. Compute

classification and clustering metrics to ascertain model accuracy.

Topics

Feature engineering Random Forest classifier

Dealing with categorical data Support Vector Machines (SVM)

Dealing with text data Support Vector Classifier

Using encoders Accuracy measures - AUC, ROC, Confusion Matrix, Log Loss

Count, TF-IDF Vectorizer

Bias/Variance tradeoff Clustering algorithms and accuracy measures

Principal Component Analysis (PCA)

KNN K-Means clustering

Decision Trees Silhouette coefficient

Random Forests Hierarchical clustering using Dendrogram

Ensemble Learning Density-based clustering using DBSCAN

Averaging and boosting algorithms


Module 9 : Scalable Machine learning using Spark

Learning Objectives

At the end of this Module, you should be able to discuss: Apache Spark - Concepts, RDD,

MLlib, Data frames. Transformations, Actions, Shuffling, Persistence and Data Removal.

Shared variables - accumulators and broadcast. Spark SQL and Data frames. Spark MLlib.

Regression, Classification & Clustering with PySpark.

Topics

Apache Spark introduction

Spark engine

Shared variables - Accumulators, Broadcasts

Spark core API Spark SQL and Dataframes

Spark libraries Spark MLlib

SparkContext and SparkConf Regression with PySpark

Concepts - RDD, Shuffling and Persistence

Classification with PySpark

Clustering with PySpark

RDD transformations and actions

Module 10 : Web Scraping in Python and Project Work

Learning Objectives

At the end of this Module, you should be able to discuss: Web scraping and its

advantages. Discuss Steps Involved in Web Scraping. Use BeautifulSoup package and its

functions. Scrape IMDB webpage. Fetch Streaming Tweets from Twitter. Perform

Sentiment Analysis on tweets fetched from Twitter and determine which is more popular:
Ferrari or Porsche.


Topics

Web scraping

Introduction to Beautiful Soup package

How to scrape webpages

A real-world project showing scraping data from Google Finance and IMDB.

Project Work

Industry: Human Resource management

Challenge : AB Consultants is a company that outsources its employees as
consultants to various top IT firms. Its business had been growing well in the past;
however, in recent times growth has slowed because its best and most experienced
employees have started leaving the company. To prevent this proactively, you
first need to dive into the company’s employee data and find out why the best and
most experienced employees are leaving.


Apache Spark and Scala Certification Training

Course Curriculum

About The Course

The Edureka Apache Spark & Scala course will enable learners to understand how

Spark enables in-memory data processing and runs much faster than Hadoop

MapReduce. Learners learn about RDDs, different APIs which Spark offers such as

Spark Streaming, MLlib, SparkSQL, GraphX. This Edureka course is an integral part of a

developer's learning path.

Module 1 : Introduction to Scala for Apache Spark

Learning Objectives

In this module, you will understand the basics of Scala that are required for
programming Spark applications. You can learn about the basic constructs of Scala such as
variable types, control structures, collections, and more.

Topics

What is Scala? Basic Scala operations

Why Scala for Spark? Variable types in Scala

Scala in other frameworks Control Structures in Scala

Introduction to Scala REPL Foreach loop


Functions, Procedures, Collections in Scala - Array

ArrayBuffer

Map, Tuples, Lists, and more.

Module 2 : OOPS and Functional Programming in Scala

Learning Objectives

In this module, you will learn about object oriented programming and functional

programming techniques in Scala.

Topics

Class in Scala Extending a Class

Getters and Setters Overriding Methods

Custom Getters and Setters Traits as Interfaces

Properties with only Getters Layered Traits

Auxiliary Constructor Functional Programming

Primary Constructor Higher Order Functions

Singletons Anonymous Functions

Companion Objects and more.

Module 3 : Introduction to Big Data and Apache Spark

Learning Objectives

In this module, you will understand what Big Data is, the challenges associated with it
and the different frameworks available. The module also includes a first-hand
introduction to Spark.


Topics

What is Big Data?

Big Data Customer Scenarios

Limitations and Solutions of Existing Data Analytics Architecture with Uber Use Case

How Hadoop Solves the Big Data Problem

What is Hadoop?

Hadoop’s Key Characteristics

Hadoop Ecosystem and HDFS

Hadoop Core Components

Rack Awareness and Block Replication

Edureka’s VM Tour

YARN and Its Advantage

Hadoop Cluster and Its Architecture

Hadoop: Different Cluster Modes

Data Loading using Sqoop

Module 4 : Apache Spark Framework

Learning Objectives

In this module, you will understand the different frameworks available for Big Data
Analytics. The module also includes a first-hand introduction to Spark and a demo on
building and running a Spark application and using the Web UI.

Topics

Big Data Analytics with Batch & Real-Time Processing

Why Spark is Needed?

What is Spark?

How Spark Differs from Its Competitors?

Spark at eBay

Spark’s Place in Hadoop Ecosystem

Spark Components & Its Architecture

Running Programs on Scala IDE & Spark Shell

Spark Web UI

Configuring Spark Properties


Module 5 : Playing with RDDs

Learning Objectives

In this module, you will learn one of the fundamental building blocks of Spark,
RDDs, and related manipulations for implementing business logic (Transformations,
Actions and Functions performed on RDDs). You will learn about Spark applications, how
they are developed, and how to configure Spark properties.

Topics
Challenges in Existing Computing Methods

Probable Solution & How RDD Solves the Problem

What is RDD, Its Functions, Transformations & Actions?

Data Loading and Saving Through RDDs

Key-Value Pair RDDs and Other Pair RDDs

RDD Lineage

RDD Persistence

WordCount Program Using RDD Concepts

RDD Partitioning & How It Helps Achieve Parallelization

Module 6 : DataFrames and Spark SQL

Learning Objectives

In this module, you will learn about Spark SQL which is used to process structured

data with SQL queries. You will learn about data-frames and datasets in Spark SQL and

perform SQL operations on data-frames.

Topics

Need for Spark SQL Spark SQL Architecture

What is Spark SQL? SQL Context in Spark SQL


Data Frames & Datasets JSON and Parquet File Formats


Interoperating with RDDs Loading Data through Different Sources

Module 7 : Machine Learning using Spark MLlib

Learning Objectives

In this module, you will learn about the need for machine learning, types of
ML concepts, clustering and MLlib (i.e. Spark’s machine learning library), the various
algorithms supported by MLlib, and how to implement K-Means Clustering.

Topics

What is Machine Learning?

Where is Machine Learning Used?

Different Types of Machine Learning Techniques

Face Detection: USE CASE

Understanding MLlib

Features of MLlib and MLlib Tools

Various ML algorithms supported by MLlib

K-Means Clustering & How It Works with MLlib

Analysis on US Election Data: K-Means MLlib USE CASE

Module 8 : Understanding Apache Kafka and Kafka Cluster

Learning Objectives

In this module, you will understand Kafka and Kafka Architecture. Afterwards you

will go through the details of Kafka Cluster and you will also learn how to configure

different types of Kafka Cluster.


Topics

Need for Kafka

What is Kafka?

Core Concepts of Kafka

Kafka Architecture

Where is Kafka Used?

Understanding the Components of Kafka Cluster

Configuring Kafka Cluster

Producer and Consumer

Module 9 : Capturing Data with Apache Flume and Integration with Kafka

Learning Objectives

In this module you will get an introduction to Apache Flume and its basic

architecture and how it is integrated with Apache Kafka for event processing.

Topics

Need of Apache Flume Flume Channels

What is Apache Flume? Flume Configuration

Basic Flume Architecture

Flume Sources

Flume Sinks

Integrating Apache Flume and Apache Kafka

Module 10 : Apache Spark Streaming

Learning Objectives

In this module you will get an opportunity to work on Spark Streaming, which is
used to build scalable, fault-tolerant streaming applications. You will learn about
DStreams and the various Transformations performed on them. You will get to know the main
streaming operators, Sliding Window Operators and Stateful Operators.


Topics

Drawbacks in Existing Computing Methods

Why Streaming is Necessary?

What is Spark Streaming?

Spark Streaming Features

Spark Streaming Workflow

How Uber Uses Streaming Data

Streaming Context & DStreams

Transformations on DStreams

WordCount Program using Spark Streaming

Describe Windowed Operators and Why They are Useful

Important Windowed Operators

Slice, Window and ReduceByWindow Operators

Stateful Operators

Perform Twitter Sentiment Analysis Using Spark Streaming

Project Work

At the end of the course, you will be working on a live project:

Project 1 : US Election

Industry: Government

Technologies Used:

HDFS (for storage)

Spark SQL (for transformation) 

Spark MLlib (for machine learning)

Zeppelin (for visualization)


Problem Statement: In the 2016 US primary elections, Hillary Clinton won the Democratic
nomination over Bernie Sanders, while Donald Trump won the Republican nomination to
contest the presidency. As an analyst, you have been tasked with understanding the
different factors, based on demographic features, that led to Hillary Clinton and
Donald Trump winning the primary elections, in order to plan their next initiatives
and campaigns.

Project 2 : Design a system to replay real-time transactions in HDFS using Spark.

Technologies Used:

Spark Streaming

Kafka (for messaging)

HDFS (for storage)

Core Spark API (for aggregation)

Project 3 : Instant Cabs

Industry: Transportation

Technologies Used:

HDFS (for storage) Spark MLlib (for machine learning)

Spark SQL (for transformation) Zeppelin (for visualization)

Problem Statement: A US cab service start-up (Instant Cabs) wants to meet demand
optimally and maximize profit. It has hired you as a data analyst to interpret the
available Uber data set and find the busiest ("beehive") customer pick-up points and
peak hours, so that demand can be met profitably.


Project 4 : Droppage of Signal during Roaming

Industry: Telecom Industry

Technologies Used:

HDFS (for storage)

Spark SQL (for transformation)

Problem Statement: You will be given a CDR (Call Details Record) file and need to find
the top 10 customers facing frequent call drops while roaming. This is a very important
report that telecom companies use to prevent customer churn, by calling these customers
back and at the same time contacting their roaming partners to resolve the connectivity
issues.


AI & Deep Learning with TensorFlow


Course Curriculum

About The Course

Delve into neural networks, implement Deep Learning algorithms, and explore layers

of data abstraction with the help of this Deep Learning using TensorFlow Certification

Training. In this Training, you will be able to learn the basic concepts of TensorFlow, the

main functions, operations and the execution pipeline. Starting with a simple “Hello

World” example, throughout the course you will be able to see how TensorFlow can be

used in curve fitting, regression, classification and minimization of error functions. This

concept is then explored in the Deep Learning world. You will evaluate the common,

and not so common, deep neural networks and see how these can be exploited in the

real world with complex raw data using TensorFlow. In addition, you will learn how to

apply TensorFlow for backpropagation to tune the weights and biases while the Neural

Networks are being trained. Finally, the course covers different types of Deep

Architectures, such as Convolutional Networks, Recurrent Networks & Autoencoders.

Module 1 : Introduction to Deep Learning

Learning Objectives

At the end of this Module, you should be able to:

1. Discuss the revolution of Artificial Intelligence


2. Discuss the limitations of Machine Learning

3. List the advantages of Deep Learning over Machine Learning

4. Discuss Real-life use cases of Deep Learning

5. Understand the Scenarios where Deep Learning is applicable

6. Discuss relevant topics of Linear Algebra and Statistics

7. Discuss Machine learning algorithms

8. Define Reinforcement Learning

9. Discuss model parameters and optimization techniques

Topics

Deep Learning: A revolution in Artificial Intelligence
Limitations of Machine Learning
Discuss the idea behind Deep Learning
Advantage of Deep Learning over Machine Learning
3 Reasons to go Deep
Real-Life use cases of Deep Learning
Scenarios where Deep Learning is applicable
The Math behind Machine Learning: Linear Algebra
• Scalars
• Vectors
• Matrices
• Tensors
• Hyperplanes
The Math behind Machine Learning: Statistics
• Probability
• Conditional Probabilities
• Posterior Probability
• Distributions
• Samples vs Population
• Resampling Methods
• Selection Bias
• Likelihood
Review of Machine Learning Algorithms
• Regression
• Classification
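
The conditional and posterior probability topics listed above can be made concrete with a small Bayes' rule calculation. The sketch below is illustrative only: the function name `posterior` and the example numbers are assumptions made for this example, not course material.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) using Bayes' rule.

    prior               -- P(H), probability of the hypothesis before seeing evidence
    likelihood          -- P(E | H), probability of the evidence when H is true
    false_positive_rate -- P(E | not H), probability of the evidence when H is false
    """
    # Law of total probability: P(E) = P(E|H) P(H) + P(E|not H) P(not H)
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: the event is rare (1%), a detector fires on 90% of
# true cases and on 5% of the remaining cases.
p = posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
print(round(p, 4))  # 0.1538
```

Even with a fairly accurate detector, the posterior stays low because the prior is small, which is the kind of intuition the statistics review is meant to build.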

Module 2 : Understanding Fundamentals of Neural Networks with TensorFlow

Learning Objectives

At the end of this Module, you should be able to: Illustrate how Deep Learning works.
Illustrate how Neural Networks work. Understand the various components of a Neural
Network. Define TensorFlow. Illustrate how TensorFlow works. Discuss the functionalities
of TensorFlow. Implement a Single Layer Perceptron using TensorFlow.

Topics

How Deep Learning Works
Activation Functions
Illustrate Perceptron
Training a Perceptron
Important Parameters of Perceptron
What is TensorFlow?
TensorFlow code-basics
Graph Visualization
Constants, Placeholders, Variables
Creating a Model
Step by Step - Use-Case Implementation
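
To give a feel for the "Training a Perceptron" topic, here is a minimal sketch of a single-layer perceptron. It is written in plain Python rather than TensorFlow so that it runs without any installation. The function names, the learning rate and the AND-gate data are assumptions chosen for illustration, and the course's own implementation may differ.

```python
def step(z):
    """Heaviside step activation: 1 if z >= 0, else 0."""
    return 1 if z >= 0 else 0


def predict(weights, bias, inputs):
    """Weighted sum of the inputs plus bias, passed through the step activation."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_perceptron(samples, epochs=20, lr=1.0):
    """Train with the classic perceptron learning rule.

    samples -- list of (inputs, target) pairs with targets in {0, 1}
    Returns the learned (weights, bias).
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            # Update only when the prediction is wrong (error is -1 or +1).
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias


# Logical AND is linearly separable, so the perceptron rule converges on it.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND)
print([predict(w, b, x) for x, _ in AND])  # [0, 0, 0, 1]
```

The learning rate and number of epochs are the kind of "important parameters" the topic list refers to. A single perceptron of this form can only separate classes with a straight line, which is the limitation Module 3 starts from.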

Module 3 : Deep Dive into Neural Networks with TensorFlow

Learning Objectives

At the end of this Module, you should be able to: Understand the limitations of a
single Perceptron. Illustrate the working of a Multi-Layered Perceptron (MLP).
Understand the MLP training phases. Illustrate how TensorFlow works. Discuss the
functionalities of TensorFlow. Implement a Multi-Layered Perceptron using TensorFlow.

Topics

Understand limitations of a Single Perceptron
Illustrate Multi-Layer Perceptron
Backpropagation – Learning Algorithm
Understand Backpropagation – Using Neural Network Example
Understand Neural Networks in Detail
MLP Digit-Classifier using TensorFlow
TensorBoard
Summary
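
The backpropagation topics above can be illustrated with a very small multi-layer perceptron. The following is a sketch in plain Python (not TensorFlow) of one forward pass, one backward pass and one gradient-descent update for a network with 2 inputs, 2 sigmoid hidden units and 1 sigmoid output. The network size, starting weights, learning rate and squared-error loss are all assumptions made for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Forward pass of a 2-2-1 network; returns hidden activations and output."""
    W1, b1, W2, b2 = params
    h = [sigmoid(sum(W1[j][i] * x[i] for i in range(2)) + b1[j]) for j in range(2)]
    y = sigmoid(sum(W2[j] * h[j] for j in range(2)) + b2)
    return h, y


def loss(params, x, target):
    """Squared-error loss for a single example."""
    _, y = forward(params, x)
    return 0.5 * (y - target) ** 2


def backprop(params, x, target):
    """Gradients of the loss w.r.t. (W1, b1, W2, b2), computed with the chain rule."""
    W1, b1, W2, b2 = params
    h, y = forward(params, x)
    # Output layer: dL/dz_out = (y - target) * sigmoid'(z_out), sigmoid' = y * (1 - y).
    delta_out = (y - target) * y * (1.0 - y)
    grad_W2 = [delta_out * h[j] for j in range(2)]
    grad_b2 = delta_out
    # Hidden layer: push delta_out back through W2 and the hidden sigmoid.
    delta_hidden = [delta_out * W2[j] * h[j] * (1.0 - h[j]) for j in range(2)]
    grad_W1 = [[delta_hidden[j] * x[i] for i in range(2)] for j in range(2)]
    grad_b1 = delta_hidden
    return grad_W1, grad_b1, grad_W2, grad_b2


def sgd_step(params, grads, lr=0.5):
    """One gradient-descent update on every weight and bias."""
    W1, b1, W2, b2 = params
    gW1, gb1, gW2, gb2 = grads
    return (
        [[W1[j][i] - lr * gW1[j][i] for i in range(2)] for j in range(2)],
        [b1[j] - lr * gb1[j] for j in range(2)],
        [W2[j] - lr * gW2[j] for j in range(2)],
        b2 - lr * gb2,
    )


# Hypothetical starting weights and a single training example.
params = ([[0.1, -0.2], [0.4, 0.3]], [0.0, 0.1], [0.5, -0.6], 0.2)
x, target = [1.0, 0.5], 1.0
before = loss(params, x, target)
params = sgd_step(params, backprop(params, x, target))
after = loss(params, x, target)
print(before, after)  # the loss is lower after the update
```

Frameworks such as TensorFlow derive these gradients automatically. Writing them by hand once makes it easier to see what "tuning the weights and biases" means during training.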

Module 4 : Master Deep Networks

Learning Objectives

At the end of this Module, you should be able to: Define TensorFlow. Illustrate how
TensorFlow works. Discuss the functionalities of TensorFlow. Illustrate different ways
to install TensorFlow. Write and run programs on TensorFlow.

Topics

What is TensorFlow?
Use of TensorFlow in Deep Learning
Working of TensorFlow
How to install TensorFlow
HelloWorld with TensorFlow
Running Machine Learning algorithms on TensorFlow

Module 5 : Convolutional Neural Networks (CNN)

Learning Objectives

At the end of this Module, you should be able to: Define CNNs. Discuss the applications
of CNNs. Explain the architecture of a CNN. List the Convolution and Pooling layers in
a CNN. Illustrate a CNN. Discuss fine-tuning and Transfer Learning of CNNs.

Topics

Introduction to CNNs
CNNs Application
Architecture of a CNN
Convolution and Pooling layers in a CNN
Understanding and Visualizing a CNN
Transfer Learning and Fine-tuning
Convolutional Neural Networks
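
As a rough illustration of what convolution and pooling layers compute, the sketch below applies one small filter and one max-pooling step to a tiny image using plain Python lists. The image, the kernel values and the function names are assumptions made for this example. A real CNN learns its kernel values during training and would normally use a library such as TensorFlow.

```python
def conv2d(image, kernel):
    """'Valid' 2-D cross-correlation, the operation CNN libraries call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


# A 5x5 image that is dark (0) on the left and bright (1) on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
# Hypothetical 2x2 kernel that responds where intensity rises from left to right.
kernel = [[-1, 1], [-1, 1]]

features = conv2d(image, kernel)  # 4x4 feature map
pooled = max_pool2d(features)     # 2x2 map after pooling
print(features)  # [[0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0]]
print(pooled)    # [[2, 0], [2, 0]]
```

The feature map is non-zero only along the vertical edge, and pooling keeps that response while halving each spatial dimension.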

Module 6 : Recurrent Neural Networks (RNN)

Learning Objectives

At the end of this Module, you should be able to: Define RNN. Discuss the applications
of RNN. Illustrate how an RNN is trained. Discuss Long Short-Term Memory (LSTM).
Explain Recursive Neural Tensor Network theory. Illustrate the working of the Recurrent
Neural Network model.

Topics

Intro to RNN Model
Application use cases of RNN
Modelling sequences
Training RNNs with Backpropagation
Long Short-Term Memory (LSTM)
Recursive Neural Tensor Network Theory
Recurrent Neural Network Model

Module 7 : Restricted Boltzmann Machine (RBM) and Autoencoders

Learning Objectives

At the end of this Module, you should be able to: Define RBM. Discuss the Applications

of RBM. Illustrate Collaborative Filtering using RBM. Define Autoencoders.

Topics

Restricted Boltzmann Machine
Applications of RBM
Collaborative Filtering with RBM
Introduction to Autoencoders
Autoencoders applications
Understanding Autoencoders

Module 8 : Keras

Learning Objectives

At the end of this Module, you should be able to: Define Keras. Understand the Keras
model building blocks. Illustrate the different compositional layers for a Keras model.
Implement a use case step by step. Understand a few of the features available with Keras.

Topics

Define Keras
How to compose Models in Keras
Sequential Composition
Functional Composition
Predefined Neural Network Layers
What is Batch Normalization
Saving and Loading a model with Keras
Customizing the Training Process
Using TensorBoard with Keras
Use-Case Implementation with Keras

Module 9 : TFLearn

Learning Objectives

At the end of this Module, you should be able to: Define TFLearn. Understand the TFLearn
model building blocks. Illustrate the different compositional layers for a TFLearn model.
Implement a use case step by step. Understand a few of the features available with TFLearn.

Topics

Define TFLearn
Composing Models in TFLearn
Sequential Composition
Functional Composition
Predefined Neural Network Layers
What is Batch Normalization
Saving and Loading a model with TFLearn
Customizing the Training Process
Using TensorBoard with TFLearn
Use-Case Implementation with TFLearn

Module 10 : Hands-On Project

Learning Objectives

At the end of this Module, you should be able to: Define RBM. Discuss the Applications

of RBM. Illustrate Collaborative Filtering using RBM. Define Autoencoders.

Project Work

1. Create an image classifier using a CNN that classifies images into one of 100
predefined classes.

2. Create a script generator using an LSTM that generates scripts for any popular
novel that interests you.

3. Capstone project: choose a dataset of your own, explore the different challenges
faced in the dataset's domain, and try to solve any one of them with any neural
network architecture covered in the course.

Tableau Training and Certification

Course Curriculum

About The Course

Tableau is one of the hottest trends in business intelligence. With its intuitive and
user-friendly approach to data visualization, Tableau is today a popular choice for
organizations big and small. It is one of the many data-related tools that work
alongside R, Python or D3.js to help you create complex and beautiful data
visualizations. More than 35,000 companies worldwide have transformed the way they
uncover insights from the data they possess. Learn the key concepts of data
visualization and how data can be transformed by cleaning, splitting, pivoting, and
merging using Tableau 10. Understand the architecture of Tableau, establish
connections with datasets, perform Joins on the datasets and explore the new Cross
Join feature. Discover new ways to analyze your data, such as quick highlighting,
reference lines, and the new clustering function. Create personalized, dynamic
visualizations by using parameters to take user input and drive the visualization.
Understand good design practices for dashboards and how to make them fully
interactive using actions. Reinforce the concepts through extensive hands-on
activities using Tableau 10.

Module 1: Introduction to Data Visualization

Learning Objectives

In this module, you will learn to: Identify the prerequisites, goal, objectives,
methodology, material, and agenda for the course. Discuss the basics of Data
Visualization. Get a brief idea about Tableau, establish a connection with the dataset,
and perform Join operations on the dataset.

Topics

Data Visualization
Introducing Tableau 10.0
Establishing Connection
Joins and Union
Data Blending

Module 2 : Visual Analytics

Learning Objectives

In this module, you will learn to: Manage extracts and metadata (by creating hierarchies
and folders). Describe what Visual Analytics is, why to use it, and its scope.
Explain aggregating and disaggregating data and how to implement data granularity
using the Marks card on aggregated data. Describe what highlighting is, with the help
of a use case. Illustrate basic graphs including bar graph, line graph, pie chart, dual
axis graph, and area graph with dual axis.

Topics

Managing Extracts
Managing Metadata
Visual Analytics
Data Granularity using Marks Card
Highlighting
Introduction to basic graphs

Module 3: Visual Analytics in Depth I

Learning Objectives

In this module, you will learn to: Perform sorting techniques including quick sort,
sorting using measures, sorting using header and legend, and sorting using pill, with
the help of a use case. Master various filtering techniques such as Parameterized
Filtering, Quick Filter, and Context Filter. Learn about the filtering options available
with the help of use cases and different scenarios. Illustrate grouping using
data-window, visual grouping, and Calculated Grouping (Static and Dynamic). Illustrate
further graphical visualizations including Heat Map, Circle Plot, Scatter Plot, and
Tree Maps.

Topics

Sorting
Filtering
Grouping
Graphical Visualization

Module 4: Visual Analytics in Depth II

Learning Objectives

In this module, you will learn to: Explain the basic concepts of sets, followed by
creating sets using the Marks card, computed sets and combined sets. Describe the
concepts of forecasting with the help of a forecasting problem as a use case. Discuss
the basic concept of clustering in Tableau. Add trend lines and reference lines to your
visualization. Discuss Parameters in depth using Sets and Filters.

Topics

Sets
Forecasting
Clustering
Trend Lines
Reference Lines
Parameters

Module 5: Dashboard and Stories

Learning Objectives

In this module, you will learn to: Describe the basic concepts of a Dashboard and its UI.
Build a dashboard by adding sheets and objects to it. Modify the view and layout. Edit
how your dashboard should appear on phones or tablets. Create an interactive
dashboard using actions (filter, highlighting, URL). Create stories for your
visualizations and dashboards.

Topics

Introduction to Dashboard
Creating a Dashboard Layout
Designing Dashboard for Devices
Dashboard Interaction - Using Action
Introduction to Story Point

Module 6: Mappings

Learning Objectives

In this module, you will learn to: Map the coordinates on the map, plot geographic data,
and use layered view to get the street view of the area. Edit the ambiguous and
unrecognized locations plotted on the map. Customize territory in a polygon map.
Connect to the WMS Server, use a WMS background map and save it. Add a
background image, generate its coordinates and plot the points.

Topics

Introduction to Maps
Editing Unrecognized Locations
Custom Geocoding
Polygon Maps
Web Mapping Services
Background Images

Module 7: Calculations

Learning Objectives

In this module, you will learn to: Perform calculations using various types of functions
such as Number, String, Date, Logical, and Aggregate. In addition, you will get to know
about Quick Table Calculation. Cover the following LOD expressions: Fixed, Include,
and Exclude.

Topics

Introduction to Calculation: Number Functions, String Functions, Date Functions,
Logical Functions, Aggregate Functions
Introduction to Table Calculation
Introduction to LOD expression: Fixed LOD, Include LOD, Exclude LOD

Module 8: LOD Problem sets & Hands On

Learning Objectives

In this module, you will learn to: Tackle complex scenarios by using LOD expressions.

Topics

Use Case I - Count Customer by Order
Use Case II - Profit per Business Day
Use Case III - Comparative Sales
Use Case IV - Profit Vs Target
Use Case V - Finding the second order date
Use Case VI - Cohort Analysis

Module 9: Charts

Learning Objectives

In this module, you will learn to: Plot various types of charts using Tableau 10 and get
extensive hands-on practice with industry use cases.

Topics

Box and Whisker Plots
Gantt Charts
Waterfall Charts
Pareto Charts
Control Charts
Funnel Charts

Module 10: Integrating Tableau with R and Hadoop

Learning Objectives

In this module, you will learn the basics of Big Data, Hadoop, and R. You will discuss
the integration between Hadoop and R and will integrate R with Tableau. In addition,
you will publish your workbook on Tableau Server.

Topics

Introduction to Big Data
Introduction to Hadoop
Introduction to R
Integration among R and Hadoop
Calculating measure using R
Integrating Tableau with R
Integrated Visualization using Tableau
