Professional Documents
Culture Documents
ML Basic Concepts
ML Basic Concepts
Basic Concepts
!"#$%&"'('
Feature'2'
!"#$%&"')'
Feature'1' *"+,-,./'0.%/1#&2'
Terminology
Machine Learning, Data Science, Data Mining, Data Analysis, Sta-
tistical Learning, Knowledge Discovery in Databases, Pattern Dis-
covery.
Data everywhere!
1. Google: processes 24 peta bytes of data per day.
6. . . .
o X
Python Machine Deep SQL Excel
Learning Learning
COMBO COURSE
Available at just
259941$337,42) 9999 ($13499)
Data types
Data comes in different sizes and also flavors (types):
Texts
Numbers
Clickstreams
Graphs
Tables
Images
Transactions
Videos
Domain DB
m
Ti
expertise
+
models, - +
- - +
clustering, density -
-
+
Statistics!
Biology! Visualization!
Signal
processing! Databases!
ML versus Statistics
Statistics: Machine Learning:
http://statweb.stanford.edu/~jhf/ftp/dm-stat.pdf
Machine Learning definition
Unsupervised learning:
Learning a model from unlabeled data.
Supervised learning:
Learning a model from labeled data.
Unsupervised Learning
Training data:“examples” x.
x1, . . . , xn, xi ∈ X ⊂ Rn
• Clustering/segmentation:
Feature'2'
Feature'1'
Unsupervised learning
Feature'2'
Feature'1'
Unsupervised learning
Feature'2'
Feature'1'
Methods: K-means, gaussian mixtures, hierarchical clustering,
spectral clustering, etc.
Supervised learning
Training data:“examples” x with “labels” y.
!"#$%&"'('
!"#$%&"')'
Supervised learning
!"#$%&"'('
!"#$%&"')'
*"+,-,./'0.%/1#&2'
Supervised learning
!"#$%&"'('
!"#$%&"')'
*"+,-,./'0.%/1#&2'
Methods: Support Vector Machines, neural networks, decision
trees, K-nearest neighbors, naive Bayes, etc.
Supervised learning
Classification:
!"#$%&"'('
!"#$%&"')'