
ACADEMY

SYLLABUS
Algoritma Education Center
Menara Kadin lvl.4, Jl. H. R.Rasuna Said Blok X-5 No.Kav 3, RT.1/RW.2,
Kuningan Timur, Setia Budi, Jakarta Selatan, DKI Jakarta 12950
community@algorit.ma
www.algorit.ma
+62 8777-8353-007

@teamalgoritma
Team Algoritma
Algoritma Data Science Education Center
Our Journey

Modern workforces are under-equipped to solve the kind of problems we face as a
collective in the digital era. While the complexity and enormity of data have grown
explosively in the past few years, the tools we use to solve these problems have seen only
incremental improvements every decade or so.

Data science, the field dedicated to the study of insights extraction, statistical modeling, and
artificial intelligence, can empower professionals in profound ways and it will continue to
play an increasingly integral role in our interaction with technology.

We were founded with the vision of democratizing data science skills and equipping every
professional with a set of core skills across the various domains of data visualization,
regression, data modeling, machine learning, and statistical programming literacy. Whether
you’re a marketing executive, a business analyst, an entrepreneur, or a financial market
professional, we want to help you be a rockstar and a highly effective executive.

Algoritma’s pedagogical excellence is recognized regionally, an achievement backed by
our illustrious track record in year-long corporate consulting projects and other shorter
consultative, customized training engagements.

June 2017
- Algoritma was founded in Jakarta by Samuel Chan and Nayoko Wicaksono
- Our first workshop: Kickstart Your Data Science Career at Rework, Setiabudi

January 2018
- Algoritma Data Science Academy launched for the very first time. Students are taught
  data science using the R programming language in 3 months

May 2018
- Student Track was initiated for the first time. Algoritma gave scholarships to chosen
  students from 6 universities (UI, UGM, ITB, Telkom University, Prasetya Mulya, and
  Bina Nusantara University) to learn data science every Saturday in the Algoritma
  Weekend Track program

June 2018
- Algoritma held the very first Data Career Day. 6 chosen alumni from the first and
  second cohort (Agon and Bifrost) showcased their project to the Hiring Partners and
  Algoritma’s corporate network

June 2019
- Data Analytics Specialization is introduced to the public. Students will learn how to
  use Python and SQL for Data Analysis in one month, using real business data donated
  by Algoritma’s Corporate Network.

August 2019
- Algoritma’s Data Science Academy (3-months bootcamp) is offered through Ngee Ann
  Polytechnic (with SkillsFuture) at WeWork Funan. The class is taught in English by
  highly qualified instructors based in Singapore.

Our Core Product

Data Science Academy
Non-stop, highly practical learning. Takes a learn-by-building approach in 14 modules and
3 capstone projects to become an employable Data Scientist.

ACADEMY REGULAR
Bootcamp using R in just 11 weeks
COURSES:
- Data Visualization Specialization
- Machine Learning Specialization
- 2 Capstone Projects
- Data Science Communications & Presentation Workshops*
- Demo Day Coaching*
DURATION & FREQUENCY:
4-5 days a week for 3 months
Day Class: 13:00 – 16:00
Night Class: 18:30 – 21:30

ACADEMY FULL-STACK
Bootcamp using R, Python, & SQL in just 15 weeks
COURSES:
- Data Analytics Specialization
- Data Visualization Specialization
- Machine Learning Specialization
- 3 Capstone Projects
- Data Science Communications & Presentation Workshops*
- Demo Day Coaching*
DURATION & FREQUENCY:
4-5 days a week for 3 months
Day Class: 13:00 – 16:00
Night Class: 18:30 – 21:30

Career Support & Algoritma Data Career Day

After graduating from Algoritma Data Science Academy Program, students from each cohort
who have fulfilled the criteria can join the Data Career Day. This event is held for students to
be able to showcase their project to the attendees, which consist of our Hiring Partners,
Corporate Clients, and the public.
Students will be given a 4-week period after they finish their project to prepare for
Data Career Day. During Data Career Day, participants will also get the chance
to be interviewed by our Hiring Partners in the Speed Dating session.

Other Workshops & Events

Kickstart Series
A collection of 3-4 hour introductory seminars for professionals who are curious about
what data science is and how it impacts their career and company.

Corporate Training
In-house corporate training for industries from a wide range of domains. We provide
training to enable your organization to capitalize on the potential already
working under your roof.

learn data science by building

DATA ANALYTICS SPECIALIZATION

Courses: Python for Data Analysts; Exploratory Data Analysis; Data Wrangling &
Visualization; SQL Query & Capstone Project

Data Analytics Specialization is a course that is light in theory and perfect for
beginners/non-programmers who are looking to learn data analysis. The course has an
easier learning curve and takes a more accessible approach by getting participants
to understand the “how” part first, rather than a detailed breakdown of the “why”. It is
modeled after real-world Analytics apps, with elements of storing and querying from
SQL, preprocessing with pandas, reshaping data and producing Visualization. An example
of an application that uses all of the elements we teach is pedagogyapp.com

In this program, students will learn how to use the Python programming language for
Data Analysis. Students are prompted to write short snippets of code at frequent intervals,
before being offered an explanation of the underlying theoretical frameworks. Instead of
mastering the syntactic design of the Python programming language, then moving
into data structures, then the pandas library, then the mathematical details of an
imputation algorithm, and finally its code implementation, we do the opposite: implement
the imputation, then give a succinct explanation of why it works and its applicational
considerations (what to look out for, what assumptions it makes, when not to use it, etc).
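To make the "implement first, explain after" order concrete, here is a minimal sketch of an imputation in pandas. The data frame, column names, and the choice of median imputation are illustrative assumptions, not taken from the course material.

```python
import pandas as pd

# Hypothetical toy data: two customers have no recorded age.
df = pd.DataFrame({
    "customer": ["a", "b", "c", "d", "e"],
    "age": [25.0, None, 31.0, None, 40.0],
})

# Step 1: implement the imputation (fill gaps with the column median).
median_age = df["age"].median()
df["age_imputed"] = df["age"].fillna(median_age)

# Step 2: only then look at the considerations, e.g. how many values
# were filled, since heavy imputation shrinks the column's variance.
n_filled = int(df["age"].isna().sum())
print(df)
print(f"filled {n_filled} values with median {median_age}")
```

The considerations that follow the code are the ones the paragraph lists: median imputation assumes the values are missing more or less at random, and it is a poor choice when a large share of the column is missing.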
Python for Data Analysts
4-Days Workshop

P4DA

In this 12-hour course, we will cover a comprehensive practice in applying Python’s
data analysis library: pandas. This library is the core member of many Python-based
scientific computing environments. You will be guided in a gentle introduction to Python
programming using Jupyter Notebook. Upon the completion of this course, you will be
familiar with Python programming and utilizing pandas for simple exploratory data analysis.

Module 1: Python Programming Basic
Setting up Anaconda environment
Working with Jupyter Notebook
Python Syntaxes and Jargons

Module 2: Introduction to Dataframes
Importing pandas Library
Reading CSV Data
Python Data Types
Data Frame Structure

Module 3: Exploratory Data Analysis Tools I
Categorical and Numerical Variables
Using pandas’ Built-in Statistics Summary
Indexing and Subsetting in pandas

Module 4: Graded Assignment
Use a small sample of a larger CRM dataset to perform exploratory data
analysis process. Perform what you have learned using pandas! Try and
extract insights from the data in order to understand which customers
offer a better value proposition to the business.

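As a rough illustration of the topics in Modules 2 and 3 (reading CSV data, data types, the built-in statistics summary, and subsetting), the sketch below uses a tiny inline CSV. The column names and values are hypothetical stand-ins for the CRM sample, which is not reproduced here.

```python
import io
import pandas as pd

# Hypothetical CRM extract; a real exercise would read a file path instead.
csv_text = """customer_id,segment,monthly_spend
1,retail,120.0
2,corporate,560.0
3,retail,80.0
4,corporate,640.0
"""
crm = pd.read_csv(io.StringIO(csv_text))

print(crm.dtypes)                       # data type of each column
print(crm["monthly_spend"].describe())  # built-in statistics summary

# Categorical variable: count the rows in each segment.
segment_counts = crm["segment"].value_counts()

# Subsetting: rows above a spend threshold, two selected columns.
high_value = crm.loc[crm["monthly_spend"] > 500, ["customer_id", "segment"]]
print(high_value)
```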
Exploratory Data Analysis
4-Days Workshop

EDA

Get more in-depth on the exploratory data analysis practice you can perform using pandas
in this 12-hour course. Pick up the essential exploratory tools in this library to cover more
statistical capabilities of pandas. We will also guide you through one of the most
demanding, yet important processes in data analytics: data cleansing. Upon completion of
this course, you will uncover more possibilities in working with data using pandas.

Module 1: Exploratory Data Analysis Tools II
Frequency Table in pandas
Higher Dimensional Table
Data Aggregation
Using Pivot Table

Module 2: Working with Data Types
Working with Date Time Data
Working with Categorical Data

Module 3: Dealing with Untidy Data
Not a Number (NaN)
Checking NaN Values
Missing Values Treatment
Removing Duplicate Values

Module 4: Graded Assignment
Use an item dataset listed on a popular e-commerce website to perform the
exploratory data analysis process you have learned. We will explore different
product categories in terms of various price and scale ranges. This dataset
was gathered as part of a larger research work where the analyst wanted to
study the price convergence of Indonesia's essential household items.

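The sketch below strings together several of the tools named in this workshop: a frequency table, date parsing, NaN checks, duplicate removal, and a pivot table. The item data is invented for illustration, and dropping rows with a missing price is just one possible missing-value treatment.

```python
import pandas as pd

# Hypothetical e-commerce listings; the last two rows are exact duplicates.
items = pd.DataFrame({
    "category": ["rice", "rice", "sugar", "sugar", "sugar"],
    "store": ["A", "B", "A", "B", "B"],
    "price": [12000.0, None, 14000.0, 15000.0, 15000.0],
    "listed_on": ["2019-01-05", "2019-01-06", "2019-01-05",
                  "2019-01-07", "2019-01-07"],
})

# Working with date time data: parse strings into timestamps.
items["listed_on"] = pd.to_datetime(items["listed_on"])

# Frequency table of category by store.
freq = pd.crosstab(items["category"], items["store"])

# Checking NaN values, then removing duplicates and rows without a price.
n_missing = int(items["price"].isna().sum())
clean = items.drop_duplicates().dropna(subset=["price"])

# Pivot table: average price per category on the cleaned data.
avg_price = clean.pivot_table(index="category", values="price", aggfunc="mean")
print(freq)
print(avg_price)
```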
Data Wrangling & Visualization
4-Days Workshop

DW&V

Reshaping data is an important component of any data wrangling toolkit as it allows the
analyst to “massage” the data into the desired shape for further processing. As we go
through a comprehensive process in uncovering pandas capabilities, we will learn more
adept techniques in working with data in this 12-hour course.

Module 1: Working with Multiindex Dataframe
Stack and Unstack
Slicing in MultiIndex

Module 2: Data Wrangling and Reshaping
Data Melt
Using Group by Aggregation

Module 3: Visual Data Exploratory
Using matplotlib
Plotting using pandas Object

Module 4: Graded Assignment
Use stock dataset to compare a stock’s price variance. Use data wrangling
techniques you have learned to answer the questions related to stock data
analysis. Find out which stock has the lowest volume, most funded, and has
the highest closing price difference each day on average!

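A small sketch of the reshaping steps listed above: melting a wide table, aggregating with group by, building a MultiIndex, slicing it, and unstacking back to wide form. The tickers and prices are made up; stacking is the inverse of the unstack step shown.

```python
import pandas as pd

# Hypothetical closing prices in wide form: one column per ticker.
wide = pd.DataFrame({
    "date": ["2019-01-02", "2019-01-03"],
    "AAA": [100.0, 104.0],
    "BBB": [50.0, 49.0],
})

# Data melt: wide -> long, one row per (date, ticker) pair.
tidy = wide.melt(id_vars="date", var_name="ticker", value_name="close")

# Group by aggregation: mean close per ticker.
mean_close = tidy.groupby("ticker")["close"].mean()

# MultiIndex: index by (date, ticker), sorted so slicing is well defined.
indexed = tidy.set_index(["date", "ticker"])["close"].sort_index()
one_value = indexed.loc[("2019-01-03", "BBB")]   # slice by full key
aaa_only = indexed.xs("AAA", level="ticker")      # slice by one level

# Unstack: move the ticker level back into columns.
back_to_wide = indexed.unstack("ticker")
print(back_to_wide)
```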
SQL Query & Capstone Project
5-Days Workshop

SQLQ

This 15-hour course guides students in using Python’s grammar of graphics for data
visualization. We will also cover an important part of the data analytics stack: SQL, and
how to integrate it with Python. Upon completion of this course, students will be able to
gain proficiency in an end-to-end process for data analysis using Python.

Module 1: Working with SQL Database
Grammar of Graphics
Creating Database Connection
SQL Basic Queries
SQL Joins and Conditional Statements

Module 2: Graded Assignment
Use a music store’s operations database to query all sales relating to
particular artists or albums. Using the relational schema of the database,
we will learn how to query invoice data from the database in order to
analyze the store’s top valuable customers. We will also apply the
visualization techniques we have learned to produce an accompanying
chart for the analysis.

Module 3: Capstone Project
After understanding the importance of data analysis and mastering its
processing steps, we will model an end-to-end data analysis process, from
upstream to downstream. Students can choose one amongst four provided
skeletons as their final data analysis product: creating an automated
generated email with fire, a simple web analytics dashboard with Flask, a
simple API from Heroku Server, or web scraping with Beautiful Soup.

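The following sketch shows one way to create a database connection, run a join with a condition, and pull the result into pandas. It uses an in-memory SQLite database with two invented tables; the table names, columns, and rows are assumptions and do not reflect the actual music store schema used in class.

```python
import sqlite3
import pandas as pd

# Hypothetical schema and rows, kept in memory so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE invoices (invoice_id INTEGER PRIMARY KEY,
                       customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Budi'), (3, 'Citra');
INSERT INTO invoices VALUES (10, 1, 5.0), (11, 1, 7.5),
                            (12, 2, 20.0), (13, 3, 1.0);
""")

# Join invoices to customers, total the spend, keep customers above a threshold.
query = """
SELECT c.name, SUM(i.total) AS total_spent
FROM invoices AS i
JOIN customers AS c ON c.customer_id = i.customer_id
GROUP BY c.name
HAVING SUM(i.total) > 2
ORDER BY total_spent DESC
"""
top_customers = pd.read_sql_query(query, conn)
conn.close()
print(top_customers)
```

Once the result is a DataFrame, the plotting and wrangling techniques from the earlier workshops apply to it directly.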
DATA VISUALIZATION SPECIALIZATION

A fun, hands-on, and project-based specialization that helps students gain full
proficiency in data visualization systems and tools. Create compelling narratives by
combining charting elements with custom aesthetics under the guidance of our
instructors.

The learn-by-building module in all the workshops brings our project-based learning
philosophy to this specialization. The course capstone requires that the student build a
real-world application under stringent criteria modeled after real business scenarios.

Programming for Data Science
3-Days Workshop

P4DS

Programming for Data Science is a course that covers the important programming
paradigms and tools used by data analysts and data scientists today. You will be guided
through a series of coding exercises designed to maximize your familiarity with data
science programming in RStudio, an integrated development environment for the
statistical computing language R.

Upon completion of this workshop, you will be familiar with the programming language,
popular tools, libraries (data science packages) and toolkits required to excel in
your data analysis and statistical computing projects.

Module 1: Data Science in R

Data Science in R
- R Programming Basics
- Why Learn R?
- R Studio Interface
- Data Structures in R

Working with Data
- Reading & Extracting Data
- Understanding Statistics
- Exploratory Data Analysis

Data Manipulation
- Working with your Global Environment
- Getting familiar with your Workspace
- Continuous and Categorical Data

Module 2: Data Manipulation

Data Manipulation II
- Vector Types and Classes
- List and Objects
- Matrix and Data Frames

Practical Data Cleansing
- The Data Transformation Process
- Reproducible Data Science Projects
- Reading and Writing from your IDE

R in Practice
- Programming Exercise: e-Commerce Retail Datasets
- In-depth review of Data Frame subsetting
- Sampling and Randomization
- Cross-Tabulations
- Aggregations

Academy Modules
Working with R
R Scripts and Functions
R Markdown
Why Care about Reproducibility
Graded Quiz

Learn by Building Module


Writing your code as R scripts allows for automation and
integration with other tools and services, while writing an
R Markdown document presents your findings and recommendations in a
way that is friendly to non-technical / managerial team members.

R Script to clean & transform the data
Write an R script containing a function (name the function
however you want) that reads a dataset as input, performs
the necessary transformation, and exports a cross-tabulation
numeric result or plot as output.

Reproducible Data Science
Create an R Markdown file that combines your step-by-step
data transformation code with some explanatory text. Add
formatting styles and hierarchical structure using Markdown.

Practical Statistics
2-Days Workshop

PS

Have the statistical foundation for more advanced machine learning theories later on in
the specialization by picking up the key ideas in statistical thinking. Learn to interpret
correlations, construct confidence intervals and other statistical principles that form the
basis of many common machine learning models.

The 2-days course is optional for participants of the Data Visualization and Machine
Learning Specialization and intended for learners without prior experience in statistics.

Module 1: Descriptive Statistics

5-Number Summary
- Mean, Median and Mode
- Measures of Central Tendency
- Quantiles in R

Central Tendency & Variability
- Visualizing Central Tendency
- Variance and Covariance

Standard Score and z-Score
- Standard Normal Curve
- Central Limit Theorem
- z-Score Calculation & Student's T-test

Module 2: Inferential Statistics

Probabilities
- Probability Mass Function
- Probability Density Function
- Expected Values
- p-Values

Intervals
- Confidence Intervals
- Prediction Intervals

Inferential Statistics in Practice
- Hypothesis Testing
- Deriving Scientific Truths from Data
- Case Study

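For readers who want to see the z-score and confidence interval calculations spelled out, here is a minimal sketch using only the Python standard library (the workshop itself works in R). The sample values are invented, and the interval uses a normal critical value for simplicity; with a sample this small, a t critical value would normally give a wider interval.

```python
import math
import statistics
from statistics import NormalDist

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations
n = len(sample)

mean = statistics.mean(sample)
pop_sd = statistics.pstdev(sample)   # population standard deviation

# z-score: how many standard deviations the value 9 lies from the mean.
z_of_9 = (9 - mean) / pop_sd

# Approximate 95% confidence interval for the mean.
z_crit = NormalDist().inv_cdf(0.975)               # about 1.96
std_error = statistics.stdev(sample) / math.sqrt(n)
ci = (mean - z_crit * std_error, mean + z_crit * std_error)

print(f"mean={mean}, z-score of 9={z_of_9:.2f}")
print(f"95% CI for the mean: ({ci[0]:.2f}, {ci[1]:.2f})")
```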
Academy Modules
Tips and Techniques: R for Statisticians
Density Plots
Interpreting Box Plots (Box and Whisker)
Better summary statistics with the `skimr` package
Pairs Matrix
Graded Quiz

Learn by Building Module

Statistical Treatment of Retail Dataset
Using what you’ve learned, formulate a question and derive a
statistical hypothesis test to answer it. You have to
demonstrate that you’re able to make decisions using data in
a scientific manner.
Examples of questions can be:
- Is there a difference in profitability between standard
  shipment and same-day shipment?
- Supposing there is no difference in profitability between the
  different product segments, what is the probability that we
  obtain the current observation due to pure chance alone?

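The second example question ("what is the probability that we obtain the current observation due to pure chance alone?") can be answered quite literally with an exact permutation test. This is a swapped-in technique chosen for transparency, not necessarily the test taught in the workshop, and the profit figures are invented.

```python
from itertools import combinations

# Hypothetical profit per order for two shipment types.
standard = [10, 12, 11, 13]
same_day = [14, 15, 16, 17]

def mean(values):
    return sum(values) / len(values)

observed = mean(same_day) - mean(standard)
pooled = standard + same_day
group_size = len(same_day)

# Under "no difference", every split of the pooled values into two groups
# is equally likely. Count the splits at least as extreme as the observed one.
extreme = 0
total = 0
for idx in combinations(range(len(pooled)), group_size):
    group_a = [pooled[i] for i in idx]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
    diff = mean(group_a) - mean(group_b)
    if abs(diff) >= abs(observed) - 1e-12:
        extreme += 1
    total += 1

p_value = extreme / total
print(f"observed difference={observed}, p-value={p_value:.4f}")
```

A small p-value says that a difference this large would rarely arise from random labelling alone, which is the scientific basis for rejecting the "no difference" assumption.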
Data Visualization in R
4-Days Workshop

DVinR

A fun, hands-on, and project-based workshop that helps students gain full proficiency in
data visualization systems and tools. Create compelling narratives by combining charting
elements with custom aesthetics under the guidance of our instructors.

The 4-days course follows our learn-by-building approach, in that students are tasked to
reproduce a series of plots applying what they’ve learned. While it covers the three main
plotting systems in R, its particular focus is on ggplot2 and the additional libraries centered
around it that bring interactivity and enhanced aesthetic options to the art of creating rich,
powerful visualizations.

Module 1: Plotting Essentials

Base Plotting I
- Plots and Lines
- Built-in Plot Types
- Legends and Annotations
- Other built-in Plotting Functionalities

Base Plotting II
- Histograms and Curves
- Cleveland's Dot Plot
- Axis, Titles, Subtitles and Panel Styles
- The Notorious Pie Chart

Working with ggplot2
- Grammar of Graphics System
- Mapping aesthetics
- Working with Geometries
- Background image

Enhancing with ggplot2
- Axis, titles and scales
- Adding themes to your plots
- Custom aesthetics and styles
- Working with Legends

Module 2: Richer Visualization Techniques

Enhancing ggplot2 II
- Flipping coordinates and Axis Rotation
- Multi-dimensional Faceting
- Text Layers and Label Layers
- Expected Values

Enhancing ggplot2 III
- Enriching: Scatterplots and bubble plots
- Enriching: Jitterplots
- Enriching: Boxplots and violin plots
- Layer transparency

Enhancing ggplot2 IV
- Enriching: Column Plots
- Enriching: Texts and Labels
- Enriching: Horizontal and Vertical Lines
- Fills and Colors

Other Visualization Toolset
- Discrete, Continuous, and Gradient colors
- Facet with wraps and grids
- Visualizing Spatial Data
- Working with Leaflet and Maps

Academy Modules
Project: Mining Trending Videos on YouTube
Hands-on data visualization
Identifying temporal patterns in trending videos
Combining aesthetics and geometries
Graded Quiz

Learn by Building Module


Creating a Publication-Grade Plot
Applying what you’ve learned, create an economics- or
social-related plot that is polished with the appropriate
annotations, aesthetics and some simple commentary. You
may use the same "YouTube Trending Videos" dataset or any
other dataset for this practice.
Creating an Interactive Map
Applying what you’ve learned, create a web page with an
interactive map embedded on it. Use a custom icon for the
map markers to represent business locations, and show
details about each location pin (“markers”) upon user’s
interaction with it.

Interactive Plotting & Web Dashboard
4-Days Workshop

IP&WD

Building on the foundation from previous classes, we will create a series of interactive plots
and gadgets that render multiple visualization elements based on the user’s input. This is
the final workshop leading up to the data visualization capstone project.

The 4-days course follows our learn-by-building approach, in that students are tasked to
reproduce a series of plots applying what they’ve learned. It covers an exhaustive list of
techniques that add interactivity to an R document and set the stage for the data science
capstone project.

Module 1: Interactive Visualization

Working with Plotly
- Refresher on dplyr
- ggplotly function
- Visualization as a HTML widget
- Range Slider and other interactivity

Publication & Layout Options
- Multiple Plots Arrangement
- More export functions
- Subplots
- Tips and Techniques for Layouts

Module 2: Web Dashboard Development

Flex Dashboard
- Creating Flex Dashboard from RStudio
- Layouts
- Hands-on Practice: Text, Plots, Tables
- Demonstration and Practical Advice

Interactive Document
- Inputs and Outputs
- The renderPlot() function
- Embedded Application
- Demonstration and Practical Advice

Shiny Web App
- Shiny Dashboard
- Tabs and Pagination
- UI, Server and Shiny Functions
- Custom Styles, Structure

Academy Modules
Tips on Web Dashboard Deployment
Working with live data
App deployment solutions
Tips for live dashboard performance

Learn by Building Module


Building an Interactive Dashboard
Applying what you’ve learned, create a paginated web
dashboard with a rich set of UI elements coupled with the
appropriate server logic. The web dashboard can be of any
theme, using any dataset, but must feature an input panel that
accepts end-user inputs and renders the output accordingly.

Data Visualization Capstone Project

After having learned and explored appropriate techniques for
visualizing data, students are required to deploy an interactive
dashboard web application using a shiny server which contains
any plotting objects such as ggplot and/or leaflet that display
useful insights. In addition, students are given the freedom to
use their own dataset or past datasets from previous classes.

The project is marked out of 30 points; the rubrics for
assessment and grading will be discussed in class.

VISUALIZE YOUR SUCCESS, THEN TAKE ACTION

MACHINE LEARNING SPECIALIZATION

An intensive specialization that strives for a fine balance between practical applications
and mathematical rigor in teaching essential machine learning concepts. By taking a
learn-by-building approach, you will learn to develop regression and classification
algorithms and incorporate them into real-life solutions or data products / business
applications.

The modules in all the workshops bring our project-based learning philosophy to this
specialization. The course capstone requires that the student build a real-world
application under stringent criteria modeled after real business scenarios.

Regression Models
4-Days Workshop

RM

This course strives for a fine balance between business applications and mathematical
rigor in its treatment of regression models, one of the most essential statistical techniques
in the field of machine learning. Its aim is to equip you with the knowledge to investigate
relationships between variables in a dataset effectively and rigorously.

We strongly recommend that you complete Practical Statistics prior to taking this course.
Upon completion of this workshop, you will acquire a rigorous statistical understanding of
machine learning models, allowing you to extrapolate the same ideas into other, more
advanced machine learning models.

Module 1: Regression Models I

OLS Regression
- Understanding Least Squares
- Outliers: Leverage and Influence
- Simple Linear Regression

Linear Models in R
- Understanding Coefficients
- Plotting Regression
- Model Construction

Interpreting Linear Models
- Residuals Manually
- Coefficients Manually
- R-Squared Manually

Module 2: Regression Models II

Interpreting Linear Models
- Estimates and Standard Errors
- t-value and P-value
- Adjusted R-Squared
- Confidence Interval

Multiple Regression
- Model Assumptions
- Bias-Variance Trade-off
- Outliers: Leverage and Influence
- Model Limitation and Evaluation

Dive Deeper: Regression Models
- Model Selection and Specification
- Step-wise Regression
- All-possible Regressions
- Residual Plots
- Model Diagnostics
- Limitations of Regression Models

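The "Coefficients Manually", "Residuals Manually", and "R-Squared Manually" topics come down to a few sums. Below is a minimal sketch in plain Python (the workshop uses R); the five data points are invented so the arithmetic is easy to follow.

```python
# Hypothetical data for a simple linear regression y = intercept + slope * x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least squares coefficients from the sums of cross products and squares.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residuals: observed minus fitted values.
fitted = [intercept + slope * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]

# R-squared: share of the total variation explained by the line.
sse = sum(r ** 2 for r in residuals)
sst = sum((y - y_bar) ** 2 for y in ys)
r_squared = 1 - sse / sst

print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.2f}")
```

With an intercept in the model the residuals sum to zero, which is a quick sanity check on a manual calculation.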
Academy Modules
Graded Quiz
Learn by Building Module
Recommendation on Lowering Crime Rates
Write a regression analysis report applying what you’ve
learned in the workshop. Using the dataset provided to you,
write your findings on the different socioeconomic variables
most highly correlated to crime rates.
Explain your recommendations where appropriate.

Classification in Machine Learning I
4-Days Workshop

CIML1

Learn to solve binary and multi-class classification problems using machine learning
algorithms that are easily understood and readily interpretable. You will learn to write a
classification algorithm from scratch, and appreciate the mathematical foundations
underpinning logistic regressions and nearest neighbors algorithms.

We strongly recommend that you complete the Regression Models workshop prior to
taking this course. Upon completion of this workshop, you will acquire the depth to
develop, apply, and evaluate two highly versatile algorithms widely used today.

Module 1: Logistic Regression

Relating Probabilities to Odds
- Understanding Odds
- Understanding Log of Odds
- Plotting Odds and Log of Odds

Logistic Regression from First Principles
- Sigmoidal Logistic Function
- Key Assumptions of Sigmoid Function
- Extra Proof: Intuition behind the Sigmoid Function

Logistic Regression in Action
- Binary Logistic Regression
- Interpreting Coefficients
- Interpretation Against Continuous & Discrete Variables

Practical Tips and Case Study
- Flight Delay Prediction Examples
- Customer Churn and Attrition Examples
- Risk Modeling on Loans from Quarter 4, 2017

Performance Evaluation and Model Selection
- AIC (Akaike Information Criteria)
- Null Deviance and Residual Deviance
- Hauck Donner Effect

Module 2: Nearest Neighbours Algorithm

Closer Look at Classification
- Probabilities vs Class Responses
- Cross Validation and Out-of-Sample Error
- Bias-variance trade off
- Confusion matrix (accuracy, sensitivity, specificity, & precision)

k-NN in Action
- Characteristics of k-NN
- Positives and Negatives
- Diagnosing Breast Cancer with k-NN

Building Blocks of k-NN
- Distance Function (Euclidean, Minkowski)
- The k Parameter
- Standardization vs Min-Max Normalization

k-NN from First Principles
- Classifying Customer Segments with k-NN
- Writing Your Own k-NN Classifier
- Predicting Using Your Own k-NN Classifier

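The relationship between probabilities, odds, log of odds, and the sigmoid function in Module 1 can be checked numerically. The sketch below is plain Python rather than the workshop's R, and the coefficient value at the end is a made-up number used only to show how a negative coefficient is read.

```python
import math

def odds(p):
    """Odds in favour of an event with probability p."""
    return p / (1 - p)

def log_odds(p):
    """Log of odds (logit), the scale on which logistic regression is linear."""
    return math.log(odds(p))

def sigmoid(z):
    """Inverse of the logit: maps any real number back to a probability."""
    return 1 / (1 + math.exp(-z))

p = 0.8
print(odds(p))               # about 4, i.e. 4-to-1 in favour
print(log_odds(p))           # about 1.386
print(sigmoid(log_odds(p)))  # recovers 0.8

# Interpreting a coefficient: a hypothetical coefficient of -0.5 multiplies
# the odds by exp(-0.5) for each one-unit increase in that predictor.
odds_ratio = math.exp(-0.5)
print(odds_ratio)            # about 0.61, so the odds shrink by roughly 39%
```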
Academy Modules
Graded Quiz
Learn by Building Module
Logistic Regression on Credit Risk
Applying what you’ve learned, present a simple R Markdown
document in which you demonstrate the use of logistic
regression on the lbb_loans.csv dataset. Explain your findings
wherever necessary and show the necessary data preparation
steps. To help you through the exercise, consider the following
questions throughout the document:
- How do we correctly interpret the negative coefficients
  obtained from your logistic regression?
- How do we know which of the variables are more
  statistically significant as predictors?
- What are some strategies to improve your model?

Customer Segment Prediction

Applying what you’ve learned, present a simple R Markdown
document in which you demonstrate the use of k-NN on the
wholesale.csv dataset. Compare the k-NN to the logistic
regression model and answer the following questions
throughout the document:
- What is your accuracy? Was the logistic regression better
  than k-NN in terms of accuracy? (recall the lesson on
  obtaining an unbiased estimate of the model’s accuracy)
- Was the logistic regression better than our k-NN model at
  explaining which of the variables are good predictors of a
  customer’s industry?
- List 1 disadvantage and 1 strength of each of the
  approaches (k-NN and logistic regression)

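As a companion to "Writing Your Own k-NN Classifier", here is a minimal from-scratch sketch in plain Python (the workshop does this in R). The two features, the segment labels, and the data points are invented; real features on different scales would need standardization or min-max normalization first.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training points nearest to the query."""
    neighbours = sorted(zip(train_X, train_y),
                        key=lambda pair: euclidean(pair[0], query))
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]

# Hypothetical customers described by two already-scaled features.
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0),
           (8.0, 8.0), (9.0, 8.5), (8.5, 9.5)]
train_y = ["retail", "retail", "retail", "horeca", "horeca", "horeca"]

pred_low = knn_predict(train_X, train_y, (2.0, 2.0), k=3)
pred_high = knn_predict(train_X, train_y, (8.0, 9.0), k=3)
print(pred_low, pred_high)
```

The sketch also hints at a known weakness for the comparison exercise: k-NN returns a label but no coefficients, so it says little about which variables drive the prediction.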
Classification in Machine Learning II
4-Days Workshop

CIML2

Learn to apply the law of probabilities, boosting, bootstrap aggregation, k-fold
cross-validation, ensembling methods, and a variety of other techniques as we build some
of the most widely used machine learning algorithms today. Learn to add performance to
your models using mathematically sound principles you’ll learn in this course.

We strongly recommend that you complete the Classification in Machine Learning 1
workshop prior to taking this course. Some concepts presented throughout the lecture
may be less-than-ideal for practitioners who have not completed the pre-requisite courses.

Module 1: Naive Bayes

Law of Probability
- Dependent and Independent Events
- Bayes Theorem
- Formula for Posterior Probability

Naive Bayes Classifier
- Characteristics of a Naive Bayes Classifier
- The "naive" assumptions
- Customer Churn example

Practical and Performance Considerations
- The Case for Smoothing
- Laplace (Add-One)
- Thinking about Training vs Prediction Speed

Naive Bayes in Action
- Spam Classification
- Predicting on Text (Corpus)
- Predicting Political Party Affiliation

Module 2: Tree-Based Methods and Ensembles

Decision Trees
- Advantages and Model Characteristics
- Information Gain and Splitting Criterion
- Pruning and Tree Size

Decision Trees in Action
- Predicting Diabetes from Diagnostics Measurement
- AUC Curve
- Key Considerations and Practical Advice

Random Forest
- Ensemble-based Methods
- Case Example: Predicting the Quality of Exercise
- Industrial Applications

Machine Learning Theories
- Logistic Regression, Naive Bayes and Decision Trees have more in common
  than you think
- Thinking about Decision Boundaries

High-Performance Machine Learning
- Bias-Variance Tradeoff revisited
- k-Fold Cross Validation
- Predicting Exercise Form with Fitness Tracker Data

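"The Case for Smoothing" is easiest to see on a tiny spam example: without Laplace (add-one) smoothing, a word never seen in a class gives that class a probability of zero. The sketch below is a simplified multinomial Naive Bayes in plain Python, with four invented training messages; it is an illustration of the idea, not the workshop's R implementation.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled messages.
train = [
    ("win cash now", "spam"),
    ("cheap cash offer", "spam"),
    ("meeting at noon", "ham"),
    ("lunch meeting today", "ham"),
]

def fit(docs):
    """Count words per class and documents per class."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in docs:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab

def predict(text, word_counts, class_counts, vocab, alpha=1.0):
    """Pick the class with the highest log posterior, with add-alpha smoothing."""
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)          # log prior
        for word in text.split():
            if word in vocab:                          # skip unseen words
                smoothed = ((word_counts[label][word] + alpha)
                            / (total_words + alpha * len(vocab)))
                score += math.log(smoothed)
        scores[label] = score
    return max(scores, key=scores.get)

model = fit(train)
# "cash" never appears in ham and "today" never appears in spam, so both
# classes would score zero without smoothing.
prediction = predict("cash today", *model)
print(prediction)
```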
Academy Modules
Graded Quiz
Learn by Building Module
Identifying Risky Bank Loans
Use any of the 3 classification algorithms you’ve learned in
this lesson to predict the risk status of a bank loan. The
variable default in the dataset indicates whether the applicant
did default on the loan issued by the bank.
Use an R Markdown document to lay out your process, and
explain the methodology in 1 or 2 brief paragraphs. The
student should be awarded the full (3) points when:
- The preprocessing steps are done, and the student shows
  an understanding of holding out a test/cross-validation set
  for an estimate of the model’s performance on unseen data.
- The model’s performance is sufficiently explained
  (accuracy may not be the most helpful metric here! Recall
  what you’ve learned regarding specificity and
  sensitivity).
- The student demonstrates extra effort in evaluating his/her
  model and proposes ways to improve the accuracy
  obtained from the initial model.

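The remark that accuracy may not be the most helpful metric is worth a quick numeric check. In the sketch below (plain Python, invented labels), a model that never predicts a default still scores 80% accuracy while catching none of the actual defaults.

```python
def confusion_counts(actual, predicted, positive="yes"):
    """Return (tp, fp, tn, fn) for a binary classification."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

# Hypothetical loans: 2 of 10 defaulted, and the model predicts "no" for all.
actual = ["no"] * 8 + ["yes"] * 2
predicted = ["no"] * 10

tp, fp, tn, fn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / len(actual)
sensitivity = tp / (tp + fn)   # share of actual defaults that were caught
specificity = tn / (tn + fp)   # share of non-defaults correctly cleared

print(f"accuracy={accuracy}, sensitivity={sensitivity}, specificity={specificity}")
```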
Unsupervised Machine Learning
4-Days Workshop

UML

Learn PCA (Principal Component Analysis), Clustering, and other algorithms to work with
unsupervised machine learning tasks where the target variable is not known or defined.
Applying what you’ll learn from this workshop, you will be tasked to develop an anomaly
detection or an e-commerce product recommendation model that can be related to
real-life business scenarios.

We strongly recommend that you complete the pre-requisite courses prior to taking this
course. Some concepts presented throughout the lecture may be less-than-ideal for
practitioners who are new to the field of machine learning.

Module 1: Dimensionality Reduction

Background
- Understanding Unsupervised Learning
- The "dimensionality" problem
- Industrial Use of PCA

Principal Component Analysis
- Rethinking about Covariances
- The Case for PCA
- Eigenvalues and Eigenvectors

PCA from First Principles
- Just enough Matrix Algebra
- Mathematical Proof
- Visualization and Visual Proof

PCA in Action
- Dubious Property Sales in NYC
- PCA on US Arrests data
- Biplot and the variables factor map

PCA in Action II
- Eigenfaces
- PCA on Credit Loan Data
- Deconstruction and Reconstructing Faces with PCA
- Principal Components by Hand

Module 2: k-Means Clustering

Understanding Clustering
- Centroid-based Clustering Algorithms
- The k-Means Procedure
- Mathematical Details

k-Means Clustering in Action
- Cluster-based Product Recommendation
- Scaling and Implementation Details
- Visualizing Clusters

Evaluating k-Means
- Between sum-of-squares
- Within sum-of-squares
- Combining k-Means with PCA

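"Principal Components by Hand" amounts to an eigen-decomposition of the covariance matrix. The sketch below does this with NumPy (an assumption; the workshop uses R) on five invented, nearly collinear points, so almost all the variance lands on the first component.

```python
import numpy as np

# Hypothetical two-feature data where the features move together.
X = np.array([[2.0, 1.0], [3.0, 2.0], [4.0, 3.0], [5.0, 4.0], [6.0, 5.5]])

# Step 1: centre the data, then compute the covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)

# Step 2: eigenvalues and eigenvectors of the symmetric covariance matrix,
# reordered so the largest eigenvalue (most variance) comes first.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Step 3: share of variance explained, and the data projected onto the
# principal components.
explained = eigvals / eigvals.sum()
scores = Xc @ eigvecs

print("explained variance ratio:", explained.round(4))
```

The projected scores are uncorrelated and their variances equal the eigenvalues, which is the property that makes it reasonable to keep only the first few components.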
Academy Modules
Graded Quiz
Learn by Building Module
Diving into Wholesale Transactions
Using either of the two unsupervised learning algorithms you’ve
learned, produce a simple R Markdown document where you
demonstrate an exercise of either clustering or dimensionality
reduction on the wholesale.csv data provided to you.

Digging Deep into NYC Property Sales
Using either of the two unsupervised learning algorithms you’ve
learned, produce a simple R Markdown document where you
demonstrate an exercise of either clustering or dimensionality
reduction on the nyc data provided to you.

Explain your choice of parameters (how you choose k for
k-means clustering, or how many dimensions of the original
data you choose to retain for PCA). What are some
business uses for the unsupervised model you’ve developed?
The R Markdown document should be no longer than 4
paragraphs and contain one or two visualizations.

Time Series
& Forecasting
4-Day Workshop

TS&F

Decomposition of time series allows us to learn about the underlying seasonality, trend and random fluctuations in a systematic fashion. In this workshop, we learn the methods to account for seasonality and trend, work with autocorrelation models and create industry-scale forecasts using modern tools and frameworks.

We strongly recommend that you complete the pre-requisite workshops prior to taking this course. Some concepts presented throughout the lecture may be less than ideal for practitioners who have not completed the pre-requisite courses.

Module 1: Time Series I

Working with Time Series
- Application of Time Series
- Definition of a ts object
- Functions to work with time series

Time Series in Action
- Indonesia's gas emissions, 1970-2012
- Frequency, Start and End
- Time Series Plots

Classical Decomposition
- Trend, Seasonality and Residuals
- Understanding Lags
- Additive vs Multiplicative

Classical Decomposition in Action
- Monthly Airline Passenger, 1949-1960
- The decompose function
- Understanding Smoothing

Techniques to work with Time Series
- Adjusting for Seasonality
- Detrending
- Decomposing Non-Seasonal Time Series

Module 2: Forecasting

Forecasting I
- Simple Moving Average
- Simple Moving Average from First Principles
- Log-transformation

Forecasting II
- Forecasting using One-sided SMA
- Forecasting using Exponential Smoothing
- Holt's Exponential Smoothing

Forecasting III
- The alpha, beta, and gamma coefficients
- Mathematical Details
- Holt-Winters Exponential Smoothing

Advanced Time Series
- ACF and PACF
- ARMA and ARIMA Models
- Stationarity and Differencing

Advanced Time Series II
- Augmented Dickey-Fuller (ADF) test
- Seasonal ARIMA
- Tips to work with xts
- Facebook's Prophet
- Quantmod for quantitative traders
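
To make the Forecasting I and II topics concrete, the sketch below computes a one-sided (trailing) simple moving average and a simple exponential smoothing level from first principles in plain Python. The series values, the window of 3, and alpha = 0.5 are made-up assumptions for illustration, and the function names are hypothetical rather than taken from the workshop.

```python
def trailing_sma(series, window):
    """One-sided simple moving average: mean of the last `window` values.

    Positions without enough history are returned as None.
    """
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(series[i + 1 - window:i + 1]) / window)
    return out


def simple_exp_smoothing(series, alpha):
    """Simple exponential smoothing: level = alpha * y + (1 - alpha) * level.

    The first observation is used as the starting level (an assumption;
    other initialisations exist).
    """
    level = series[0]
    levels = [level]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
        levels.append(level)
    return levels


# Hypothetical series with a steady upward trend.
series = [10, 12, 14, 16, 18, 20]

sma = trailing_sma(series, window=3)
levels = simple_exp_smoothing(series, alpha=0.5)

# With either method, the last smoothed value serves as a flat
# one-step-ahead forecast.
sma_forecast = sma[-1]
ses_forecast = levels[-1]
print(sma_forecast, ses_forecast)
```

Both forecasts fall below the latest observation of 20 because neither method models a trend. That lag is the gap Holt's method addresses with its beta coefficient, and Holt-Winters extends it to seasonality with gamma.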

Academy Modules
Graded Quiz
Learn by Building Module
Forecasting the Crime Rate in Chicago
Download the dataset from the Chicago Crime Portal, and use a
sample of these data to build a forecasting project where you
inspect the seasonality and trend of crime in Chicago. Submit
your project as an R Markdown (RMD) document, and address the
following questions:
- Has crime generally been rising in Chicago over the past decade
(last 10 years)?
- Is there a seasonal component to the crime rate?
- Which time series method seems to capture the variation in
your time series better? Explain your choice of algorithm
and its key assumptions.

Neural Network
& Deep Learning
4-Day Workshop

NN&DL

Develop artificial neural networks that can recognize faces and handwriting patterns, and that are at the core of some of the most cutting-edge cognitive models in the AI landscape. We will learn to create a backpropagation neural network from scratch, and use our neural network for classification tasks. This class is the final course in the Machine Learning Specialization.

We strongly recommend that you complete the pre-requisite workshops prior to taking this course. Some concepts presented throughout the lecture may be less than ideal for practitioners who have not completed the pre-requisite courses.

Module 1: Neural Network

Artificial Neural Networks
- The biological brain inspiration
- Cost function
- The building blocks of neural networks

Neural Network Architecture
- Layers, Nodes and Signals
- Network Topology
- Feed-forward vs Recurrent Signal

Neural Network Architecture II
- Hidden Layers
- Computing with Neural Network
- Mathematical Details

Multi-Layer Perceptrons (MLP)
- Backpropagation of error
- Feed-forward vs Recurrent
- Mathematical Details

Module 2: Deep Learning

Neural Networks from First Principles
- Sum of Squared Error
- Cross-Entropy Error
- The Gradient Descent Algorithm

Neural Networks from Scratch
- Gradient Descent by hand
- Neural Network by hand
- Learning Rate and Implementation Details

Neural Networks in Action
- Putting it all together
- Parameterization and Practical Advice
- Deep Learning for Classification and Regression

Deep Learning in Action
- Theorizing with Effect of Depth
- Activation Functions
- Visualizing Logarithmic Loss

Deep Learning in Action II
- Predicting Bank Telemarketing Campaign
- Visualization tricks for Deep Neural Networks
- Parameterization and Practical Advice

Keras in Action
- MNIST handwritten digit recognition
- Defining the Model
- Training and Evaluation
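
The "Gradient Descent by hand" and "Cross-Entropy Error" topics can be sketched with the smallest possible case: a single sigmoid neuron trained by batch gradient descent, in plain Python. This is only an illustration under assumed values (the four data points, the learning rate of 0.5, and 200 epochs are made up), and it covers one neuron rather than full backpropagation through hidden layers.

```python
import math


def sigmoid(z):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(xs, ys, lr=0.5, epochs=200):
    """Fit one sigmoid neuron p = sigmoid(w * x + b) by gradient descent."""
    w, b = 0.0, 0.0
    losses = []
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = loss = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            # Cross-entropy (log loss) for one observation.
            loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            # For sigmoid + cross-entropy the gradient simplifies to (p - y).
            grad_w += (p - y) * x
            grad_b += (p - y)
        # Step against the average gradient, scaled by the learning rate.
        w -= lr * grad_w / n
        b -= lr * grad_b / n
        losses.append(loss / n)
    return w, b, losses


# Hypothetical one-feature data: negative inputs are class 0, positive are class 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

w, b, losses = train_neuron(xs, ys)
predictions = [1 if sigmoid(w * x + b) >= 0.5 else 0 for x in xs]
print(predictions, round(losses[0], 4), round(losses[-1], 4))
```

The recorded loss starts at log(2), the value for a neuron that predicts 0.5 everywhere, and falls as the weight grows. Changing lr shows how the learning rate controls the speed of that descent.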

Academy Modules
Graded Quiz
Learn by Building Module
Image Classification Using Neural Network
Build a neural network capable of classifying images into one
of many classes and explain the choice of your architecture.
Test your neural network using unseen images – can your
algorithm correctly classify 80% of the images?

Machine Learning
Capstone Project

After learning various machine learning methods and their
applications, students are required to choose one project that
challenges them to construct an optimal model from the dataset
given. The available methods include Forecasting, Regression
and Classification.

The project is marked out of 36 points. The rubrics
for assessment and grading will be discussed in class.
ENROLL NOW TO OUR ACADEMY!
bit.ly/algo_academy

“Learning how to do data science is like learning to ski. You have to do it.”
~ Claudia Perlich, Chief Scientist, Dstillery.

How to Apply:
1 Go to bit.ly/algo_academy (or scan the QR CODE)
2 Click ENROLL NOW! and fill in the form.
3 One of our Education Consultants will reach out to you within 24 hours (working days).
4 Arrange payment
5 Congratulations! You're on your way to start your Data Science journey!
