Machine Learning Tutorial
An Overview
deeplearning4j.org
MACHINE LEARNING
X f(X,β) f(X)≡Y
N features or dimensions
M examples or observations
X= Y=
Xi= Ŷ=?
OUTLINE
• Prediction/Regression (temperature|day)
• Linear Regression
• Classification (cold/hot|day)
• Logistic Regression
• Nearest Neighbor Classifier
• Support Vector Machine
• Ensembling
• Bagging
• Boosting
• Random Forest
• Evaluation Metrics
• Confusion Matrix
• Cheatsheet
• ROC Curve
• PR Curve
• F1 Score
REGRESSION
X_MN = [matrix figure: M examples × N features]   Y_M = [vector figure: M target values]
Linear Regression
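As an illustration of the outline's "temperature|day" case, a one-feature least-squares fit might look like the sketch below. The function name `fit_line` and the toy data are assumptions made for this example; they do not come from the slides.

```python
def fit_line(days, temps):
    """Ordinary least squares for one feature: temps ~ b0 + b1 * days."""
    m = len(days)
    mean_x = sum(days) / m
    mean_y = sum(temps) / m
    # Slope = covariance(x, y) / variance(x); intercept follows from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, temps))
    sxx = sum((x - mean_x) ** 2 for x in days)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Made-up data: temperature is exactly 10 + 2 * day.
days = [1.0, 2.0, 3.0, 4.0]
temps = [12.0, 14.0, 16.0, 18.0]
b0, b1 = fit_line(days, temps)
y_hat = b0 + b1 * 5.0   # prediction for a new example (day 5)
```

With several features the same idea generalizes to solving for a coefficient vector β, which is usually done with a linear-algebra library rather than by hand.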
CLASSIFICATION
X_MN = [matrix figure: M examples × N features]   Y_M = [vector figure: M class labels]
Logistic Regression (classifier)
Logit - odds
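The relation between logit and odds can be sketched in a few lines: the logit is the logarithm of the odds p / (1 - p), and the sigmoid is its inverse. The helper names and the convention that `beta[0]` is the intercept are assumptions for this example.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def sigmoid(z):
    """Inverse of logit: maps a real-valued score back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, beta):
    """P(y = 1 | x) for a logistic model; beta[0] is the intercept (assumed layout)."""
    z = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    return sigmoid(z)
```

A classifier is obtained by thresholding the probability, for example predicting "hot" when `predict_proba(x, beta) >= 0.5`.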
ENSEMBLING
Q classifiers
hq(X)
Bagging
(Bootstrapping)
• Reduces overfitting
• Simple to implement
• Alternative to cross-validation
• Error correlation X
• picking examples
• weighted average
• …
• Complexity O(Q)
• Brute force
Sample with replacement: about 63% unique examples
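The "63% unique examples" figure can be checked with a short sketch: drawing M examples with replacement from M leaves each example out with probability (1 - 1/M)^M, which approaches 1/e, so about 63.2% of the examples appear at least once. The function names, the seed and the sample size are arbitrary choices for this example.

```python
import random

def bootstrap_indices(m, rng):
    """Draw m indices from range(m) with replacement (one bootstrap sample)."""
    return [rng.randrange(m) for _ in range(m)]

def expected_unique_fraction(m):
    """Expected share of distinct examples in a bootstrap sample of size m."""
    return 1.0 - (1.0 - 1.0 / m) ** m

rng = random.Random(0)          # fixed seed, chosen arbitrarily
m = 10_000
sample = bootstrap_indices(m, rng)
unique_fraction = len(set(sample)) / m   # empirical share of distinct examples
```

In bagging, each of the Q classifiers would be trained on its own bootstrap sample and their outputs combined, for instance by averaging or voting.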
Boosting (AdaBoost)
• New example O(Q·log(M))
• Feature selection
• Well suited for the cloud
• Access to a portion of the database
• Training O(Q·M·log(M))
• Generalization?
• each example
• Output weighted average
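The bullets above are fragmentary, so the following is only a minimal sketch of the standard AdaBoost scheme, not a reconstruction of the slide: every example carries a weight, misclassified examples gain weight each round, and the output is a weighted vote of the weak classifiers. The choice of one-dimensional threshold stumps as weak learners, the function names and the toy data are all assumptions.

```python
import math

def train_stump(xs, ys, w):
    """Pick the threshold and sign on a 1-D feature with the lowest weighted error."""
    best = None
    for t in sorted(set(xs)):
        for sign in (1, -1):
            preds = [sign if x >= t else -sign for x in xs]
            err = sum(wi for wi, p, y in zip(w, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, t, sign)
    return best

def adaboost(xs, ys, rounds):
    """Labels must be +1/-1. Returns a list of (alpha, threshold, sign) stumps."""
    m = len(xs)
    w = [1.0 / m] * m                      # start with uniform example weights
    model = []
    for _ in range(rounds):
        err, t, sign = train_stump(xs, ys, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, t, sign))
        # Raise the weight of misclassified examples, lower the others.
        w = [wi * math.exp(-alpha * y * (sign if x >= t else -sign))
             for wi, x, y in zip(w, xs, ys)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (s if x >= t else -s) for a, t, s in model)
    return 1 if score >= 0 else -1

# Toy data that no single stump can separate.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = adaboost(xs, ys, rounds=3)
```

On this toy set a single stump misclassifies at least two examples, while the weighted vote of three stumps fits all six.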
EVALUATION METRICS
(classifiers)
Confusion Matrix
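For a binary classifier the confusion matrix holds four counts. A minimal sketch, assuming labels are encoded as 1 (positive) and 0 (negative) and using made-up predictions:

```python
def confusion_matrix(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    return {
        "TP": sum(1 for t, p in pairs if t == 1 and p == 1),
        "FP": sum(1 for t, p in pairs if t == 0 and p == 1),
        "FN": sum(1 for t, p in pairs if t == 1 and p == 0),
        "TN": sum(1 for t, p in pairs if t == 0 and p == 0),
    }

# Made-up labels and predictions for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
cm = confusion_matrix(y_true, y_pred)
accuracy = (cm["TP"] + cm["TN"]) / len(y_true)
```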
Cheatsheet
ROC Curve
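An ROC curve plots the true positive rate against the false positive rate as the decision threshold is swept over the classifier's scores. The sketch below assumes binary labels (1/0), higher scores meaning "more positive", and made-up data; the function names are illustrative.

```python
def roc_points(y_true, scores):
    """(FPR, TPR) pairs obtained by sweeping the threshold over all scores."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]                  # threshold above every score
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Made-up labels and scores for illustration.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(y_true, scores)
area = auc(points)
```

The area equals the fraction of (positive, negative) pairs that the scores rank correctly, which is 8 of 9 pairs for this toy data.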
PR Curve
Precision: ‘usefulness’   Recall: ‘completeness’
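Precision is the share of predicted positives that are correct, and recall is the share of actual positives that are found; a PR curve traces both as the threshold varies. A minimal sketch with made-up data follows. Returning a precision of 1.0 when nothing is predicted positive is a convention assumed here, not something the slides state.

```python
def precision_recall(y_true, scores, threshold):
    """Precision and recall when predicting positive for score >= threshold."""
    tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
    fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
    fn = sum(1 for t, s in zip(y_true, scores) if s < threshold and t == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0   # assumed convention
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up labels and scores for illustration.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
pr_curve = [precision_recall(y_true, scores, thr)
            for thr in sorted(set(scores), reverse=True)]
```

Lowering the threshold can only keep recall the same or raise it, while precision may move in either direction.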
F1 Score
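The F1 score is the harmonic mean of precision and recall, F1 = 2·P·R / (P + R). A small sketch, where returning 0.0 when both inputs are zero is an assumed convention:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# Example values chosen for illustration.
f1 = f1_score(0.75, 1.0)
```

Because it is a harmonic mean, F1 stays low unless both precision and recall are reasonably high.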