ML1 - Classification - KNN & NB
REFERENCES
kNN Classifier:
Book: Machine Learning with Python for Everyone (Chapter 3)
NB Classifier:
Book: Machine Learning with Python for Everyone (Chapter 3)
Naive Bayes Classifier in Machine Learning (enjoyalgorithms.com)
Bayes Theorem - Statement, Proof, Formula, Derivation & Examples (byjus.com)
Classification Tasks
• Depending on the number of outcomes
• Binary Classification (two-class classification)
• {Yes, No}; {Red, Black}; {True, False}
• {-1, +1}; {0, 1}
• Multiclass Classification
• {Cruiser, Destroyer, Frigate, Minesweeper, Aircraft Carrier, …}
• Depending on steps involved
• Direct outcome in one step
• K Nearest Neighbours
• Two step process
• (1) build a model of how likely the outcomes are and
• (2) pick the most likely outcome
• Naïve Bayes
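The one-step versus two-step distinction above can be sketched with scikit-learn. This is only an illustration: `KNeighborsClassifier` and `GaussianNB` are assumed stand-ins for the two approaches (the slides do not name a specific implementation), and `n_neighbors=3` is an arbitrary choice.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# One step: kNN goes straight from the nearest neighbours to a class label.
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
knn_labels = knn.predict(X[:5])

# Two steps: Naive Bayes (1) builds a model of how likely each outcome is ...
nb = GaussianNB().fit(X, y)
class_probs = nb.predict_proba(X[:5])  # one row per sample, one column per class

# ... and (2) picks the most likely outcome.
nb_labels = nb.classes_[np.argmax(class_probs, axis=1)]

print(knn_labels)
print(nb_labels)
```

Taking the arg-max over `class_probs` by hand should reproduce what `nb.predict` returns; it is written out here only to make the second step visible.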
Sample (& Simple) Classification Dataset
• IRIS Dataset
• Included with sklearn
• Fisher’s Dataset
• Sir Ronald Fisher, mid-20th-century statistician
• Used it in one of the first academic papers on classification
• Edgar Anderson
• Gatherer of data!
• Contents
• Each row: describes one iris flower in terms of the length and width of that flower’s sepals and petals
• Rows: Examples / samples
• Final Column: Particular species of that iris: setosa, versicolor, or virginica
• Features / Attributes / Independent Variables (IV): the initial columns; Target / Label / Dependent Variable (DV): the final column
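As a minimal sketch of the layout described above, the copy of the dataset bundled with sklearn can be loaded and inspected as follows. The variable names `features` and `target` are illustrative choices, not names from the slides.

```python
from sklearn.datasets import load_iris

iris = load_iris()

features = iris.data    # one row per flower, one column per measurement
target = iris.target    # species of each flower, encoded as an integer

print(iris.feature_names)   # sepal and petal length/width
print(iris.target_names)    # setosa, versicolor, virginica
print(features.shape, target.shape)
```

Note that sklearn stores the features and the target as two separate arrays rather than as one table with a final species column.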
Training and Test (Data) Sets
• Generalization
• Performance on novel data (general knowledge)
• Evaluation Schemes
• In-sample evaluation (training error)
• Out-of-sample evaluation (test error)
• sklearn’s train_test_split
• training data
• portion of the data that we will use to study and build up our understanding
• testing data
• portion of the data that we will use to test ourselves
• Split randomly
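The two evaluation schemes above can be compared with a short sketch built on `train_test_split`. The 25% test share, the fixed `random_state`, and the choice of a 3-nearest-neighbour model are assumptions made for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Random split: most of the data to learn from, the rest held back for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

train_accuracy = model.score(X_train, y_train)  # in-sample evaluation
test_accuracy = model.score(X_test, y_test)     # out-of-sample evaluation

print(f"train accuracy: {train_accuracy:.3f}")
print(f"test accuracy:  {test_accuracy:.3f}")
```

The test accuracy is the better indicator of generalization, because the model never saw those examples while fitting.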
• Aim: build an ML model that takes the humidity feature value and predicts whether play will happen
• Given that Humidity is Normal, let’s find the probability of play
• p(Play = Yes | Humidity = Normal)
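The conditional probability above can be worked out with Bayes' theorem. The slide's own data table is not reproduced in the text, so the counts below are an assumption: they follow the classic 14-day weather ("play tennis") table commonly used for this example. The helper function names are likewise illustrative.

```python
from fractions import Fraction

# ASSUMED (Humidity, Play) counts from the classic 14-day weather table;
# replace them with the counts from the actual dataset in use.
counts = {
    ("High", "Yes"): 3,
    ("High", "No"): 4,
    ("Normal", "Yes"): 6,
    ("Normal", "No"): 1,
}
total = sum(counts.values())


def p_play(play):
    """Prior p(Play = play)."""
    return Fraction(sum(n for (_, p), n in counts.items() if p == play), total)


def p_humidity(humidity):
    """Evidence p(Humidity = humidity)."""
    return Fraction(sum(n for (h, _), n in counts.items() if h == humidity), total)


def p_humidity_given_play(humidity, play):
    """Likelihood p(Humidity = humidity | Play = play)."""
    n_play = sum(n for (_, p), n in counts.items() if p == play)
    return Fraction(counts[(humidity, play)], n_play)


# Bayes' theorem:
# p(Yes | Normal) = p(Normal | Yes) * p(Yes) / p(Normal)
posterior = (
    p_humidity_given_play("Normal", "Yes") * p_play("Yes") / p_humidity("Normal")
)
print(posterior, float(posterior))
```

With a single feature the result equals the direct ratio of "Normal and Yes" days to all "Normal" days; the Bayes form becomes useful once several features are combined under the naive independence assumption.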
Naïve Bayes Classifier