03 Supervised Classification
Classification
[Figure: what the computer sees — a grid of pixel intensity values]
Challenges
Attempts have been made
Output
Supervised vs. Unsupervised Learning
• Supervised learning:
• Train the machine using data that is well "labeled"
• The machine then predicts unforeseen data in the same domain
• Labeling is not easy
• Challenging with huge data
• Unsupervised learning:
• Works with unlabeled data
• Clusters data patterns
• Finds all kinds of unknown patterns in the data
• Unlabeled data is easier to get than labeled data
• Better in the case of huge data
Supervised vs. Unsupervised Learning
• Are supervised and unsupervised learning data-driven approaches?
Supervised Learning
• Classification:
• Image classification
• Segmentation
• Object detection
• Regression
Unsupervised Learning
• Clustering
• Association
• Semi-supervised Learning
Training and Testing in Supervised Learning
[Figure: training phase (Input → learned model) and testing phase (Input → Prediction)]
Supervised Classification
Binary Classification
• Input: the image's pixel intensities stacked into a column vector
x = [255, 231, …, 255, 134, …, 142]ᵀ
• Output: a label y ∈ {0, 1}
Binary Classification
• m training examples: (x⁽¹⁾, y⁽¹⁾), …, (x⁽ᵐ⁾, y⁽ᵐ⁾)
• The data is split into m_train training and m_test test examples
• Inputs stacked as columns: X ∈ ℝ^(n_x × m)
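The stacked-columns convention above can be sketched in NumPy. The image sizes and labels here are made up purely for illustration:

```python
import numpy as np

# Hypothetical sketch: flatten m tiny images and stack them as the
# columns of X, giving the slide's shape X ∈ R^(n_x × m).
m, h, w, c = 5, 4, 4, 3                 # five made-up 4x4 RGB images
images = np.random.rand(m, h, w, c)
X = images.reshape(m, -1).T             # shape (n_x, m), n_x = h*w*c = 48
y = np.array([[0, 1, 1, 0, 1]])         # binary labels, shape (1, m)
print(X.shape)                          # (48, 5)
```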
Multi-class Classifier
• Data-driven approach:
• Collect a dataset of images and labels
• Use Machine Learning to train a classifier
• Evaluate the classifier on new images
[Figure: Nearest Neighbor decision regions for classes 1, 2, 3, 4, …]
First classifier: Nearest Neighbor
Example Dataset: CIFAR10
• 10 classes
• 50,000 training images
• 10,000 testing images
[Figure: test images and their nearest neighbors in the training set]
Distance Metric to compare images
• L1 distance: d₁(I₁, I₂) = Σ_p |I₁ᵖ − I₂ᵖ| (sum of absolute pixel-wise differences)
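As a quick illustration of the L1 distance, with two made-up 2×2 "images":

```python
import numpy as np

# L1 (Manhattan) distance between two images: sum over all pixels of
# the absolute difference. Pixel values here are invented for the demo.
I1 = np.array([[56, 32], [10, 8]], dtype=np.int32)
I2 = np.array([[10, 20], [24, 8]], dtype=np.int32)
d1 = np.abs(I1 - I2).sum()
print(d1)  # 46 + 12 + 14 + 0 = 72
```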
Nearest Neighbor Classifier
• Training: memorize the training data
• For each test image:
• Find the closest training image
• Assign it the label of that nearest image
Problem:
• Training is fast, but prediction is slow
• We want the opposite: classifiers that are fast at prediction; slow training is OK
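The procedure described above can be sketched in a few lines of NumPy, using the L1 distance and two made-up 2-D training points:

```python
import numpy as np

class NearestNeighbor:
    """Memorize the training set; label each test point with the label
    of its closest training point (L1 distance)."""
    def fit(self, X, y):
        self.X, self.y = X, y          # "training" is just memorization

    def predict(self, X_test):
        preds = []
        for x in X_test:
            d = np.abs(self.X - x).sum(axis=1)   # L1 distance to every train point
            preds.append(self.y[np.argmin(d)])
        return np.array(preds)

nn = NearestNeighbor()
nn.fit(np.array([[0, 0], [10, 10]]), np.array([0, 1]))
preds = nn.predict(np.array([[1, 1], [9, 9]]))
print(preds)  # [0 1]
```

Note how `fit` does no computation at all while `predict` scans the whole training set, which is exactly the fast-train / slow-predict problem the slide points out.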
K-Nearest Neighbors (KNN)
Instead of copying the label from the nearest neighbor,
take a majority vote among the K closest points
Setting hyperparameters
• KNN is useful for small datasets, but not used very often on real-world problems
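One common way to set the hyperparameter K (a sketch, not prescribed by the slides) is to score each candidate by cross-validation on a small dataset such as Iris and keep the best:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Score each candidate K by 5-fold cross-validation; keep the best one.
X, y = load_iris(return_X_y=True)
scores = {k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
          for k in (1, 3, 5, 7, 9)}
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```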
Pros and Cons
• Pros:
• Simple to implement and understand
• Takes no time to train
• Cons:
• Pay computational cost at test time
• A poor choice for high-dimensional data, since distances in high-dimensional spaces can be very counter-intuitive
sklearn.neighbors.KNeighborsClassifier
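A minimal usage sketch of the class named above, run on scikit-learn's small built-in digits dataset (8×8 images, standing in here for MNIST):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Train/test split, fit a 5-NN classifier, report test accuracy.
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(round(acc, 3))
```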
Assignment
Linear Regression
• Model: y ≈ f(x) = w·x + b
• w: constant (weight), b: bias
• The relationship between x and y is linear
• This is a regression problem: we want to find the optimized parameters w and b
Linear Regression
• Loss function: L(w) = ½‖y − X̄w‖²   (1)
• If X̄ᵀX̄ is invertible (non-singular), then (1) has a unique solution: w = (X̄ᵀX̄)⁻¹X̄ᵀy
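The closed-form least-squares solution can be checked numerically. This sketch (with made-up data lying exactly on y = 2x + 1) adds a column of ones to form X̄ and solves the normal equations:

```python
import numpy as np

# Data generated exactly from y = 2x + 1, so the fit should recover
# bias 1 and slope 2.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0
Xbar = np.column_stack([np.ones_like(x), x])   # prepend the bias column
# Solve (Xbar^T Xbar) w = Xbar^T y instead of inverting explicitly.
w = np.linalg.solve(Xbar.T @ Xbar, Xbar.T @ y)
print(w)  # ≈ [1. 2.]  (bias, slope)
```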
[Figure: data points and the fitted regression line]
Simple Linear Regression
• We have a table of height and weight of 15 persons as below
[Table: Height (cm) vs. Weight (kg) for the 15 persons — values not extracted]
sklearn.neighbors.KNeighborsRegressor
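A usage sketch of the class named above; the height/weight pairs below are invented stand-ins for the slide's 15-person table:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Made-up height (cm) / weight (kg) pairs for illustration only.
heights = np.array([[147], [150], [153], [158], [163], [168], [173], [178], [183]])
weights = np.array([49, 50, 51, 54, 58, 60, 63, 64, 68])

# A 3-NN regressor predicts the average weight of the 3 nearest heights.
reg = KNeighborsRegressor(n_neighbors=3).fit(heights, weights)
pred = reg.predict([[160]])
print(pred)
```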
Neural Network
• A neural network learns an unknown mapping from input to output: X → y = f(X)
Convolutional Neural Network (CNN)
Retinal Disease Classification
Skin Diseases Detection
Acne Grading
Linear Classification
• Parametric Approach
Parametric Approach: Linear Classifier
Example: an image with 4 pixels and 3 classes (cat/dog/ship)
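The toy setup above can be written out directly: with a 4-pixel image x, a 3×4 weight matrix W, and a 3-vector bias b, the class scores are Wx + b. The numbers below are made up for illustration:

```python
import numpy as np

# One score per class (cat, dog, ship); W and b are illustrative values.
x = np.array([56, 231, 24, 2])                 # 4 flattened pixel values
W = np.array([[ 0.2, -0.5,  0.1,  2.0],
              [ 1.5,  1.3,  2.1,  0.0],
              [ 0.0,  0.25, 0.2, -0.3]])       # shape (3, 4)
b = np.array([1.1, 3.2, -1.2])                 # shape (3,)
scores = W @ x + b                             # shape (3,)
print(scores)  # [-96.8  437.9  60.75]
```

Training a linear classifier means finding W and b that give the correct class the highest score.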
Non-linear
Sigmoid Function
• σ(x) = 1 / (1 + e⁻ˣ) squashes any real-valued score into (0, 1)
[Figure: sigmoid outputs for example scores — very negative inputs saturate near 0, large inputs near 1]
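A quick sketch of the sigmoid's squashing behavior:

```python
import numpy as np

def sigmoid(x):
    """Map any real score into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(0.0))    # 0.5
print(sigmoid(10.0))   # large scores saturate toward 1
print(sigmoid(-10.0))  # very negative scores saturate toward 0
```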
Homework 1
• Use KNN or a linear classifier (one dense layer without an activation function) to classify the Iris flower dataset or the MNIST hand-written digit dataset.
• Hint: Dense layer: tf.keras.layers.Dense(activation=None)
References
• https://cs231n.github.io/
• https://www.coursera.org/programs/data-science-program-6-months-5n0mk/browse?productId=W62RsyrdEeeFQQqyuQaohA&productType=s12n&query=deep+learning&showMiniModal=true