Supervised Learning - 2 - Classification - v2
Artificial Intelligence
(ME3181)
Supervised Learning
• Supervised Learning: learns from labeled training data to make predictions or decisions.
• Regression: finding the relationship between a dependent variable (label, target, output, outcome variable) and one or more independent variables (also known as predictors or features).
• Classification: assigning input data points to one of several predefined categories or classes.
• Unsupervised Learning: finds patterns, relationships, or structures in a dataset without labeled output or target variables.
[Diagram: Training set → Learning Algorithm → Hypothesis/Model ℎ; input data 𝑥 is mapped by ℎ to an estimated value 𝑦.]
https://www.amybergquist.com/
Lecture notes of Andrew Ng
Applications of AI (ME3181) 4
Classification
Goal: To learn a classification model from the data that can be used to predict the
classes of new (future, or test) cases/instances.
Each classification model can be referred to as a classifier.
https://datasciencedojo.com/
Types of classification
• Binary Classification: categorizes data into one of two classes or categories.
• Multiclass classification: data is classified into more than two classes or
categories.
• Multilabel classification: each data point can belong to multiple classes
simultaneously.
• Multioutput classification: a single model predicts several different targets for each input (one dataset, multiple prediction tasks).
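The label formats for these types differ in shape. A minimal sketch (illustrative NumPy arrays, not from the slides) of how binary, multiclass, and multilabel targets are typically encoded:

```python
import numpy as np

# Binary: each sample gets one of two labels (0 = negative, 1 = positive).
y_binary = np.array([0, 1, 1, 0])

# Multiclass: one label per sample, drawn from more than two classes.
y_multiclass = np.array([0, 2, 1, 2])

# Multilabel: each sample may belong to several classes at once,
# encoded as one binary indicator row per sample.
y_multilabel = np.array([[1, 0, 1],
                         [0, 1, 0],
                         [1, 1, 0],
                         [0, 0, 1]])

print(y_binary.shape, y_multiclass.shape, y_multilabel.shape)
```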
Simple classification: Binary
• 2 classes
• Designate one class as the reference class (1, Positive).
• The other class is the negative class (0 or -1, Negative).
[Figure: two classes separated by a classifier's decision boundary (separating hyperplane).]
https://machinelearningmastery.com/types-of-classification-in-machine-learning/
Categorical Encoding
Popular methods
Binary encoding
𝑦 = 1 if setosa, 0 if versicolor
Label encoding
Setosa class: y = ‘setosa’
Versicolor class: y = ‘versicolor’
Virginica class: y = ‘virginica’
Ordinal encoding
Setosa class: y=1
Versicolor class: y = 2
Virginica class: y=3
One-hot encoding
Setosa class: y = [ 1, 0, 0]T
Versicolor class: y = [ 0, 1, 0]T
Virginica class: y = [ 0, 0, 1]T
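The encodings above can be sketched in plain Python/NumPy (the mappings below mirror the slide; the helper `one_hot` is an illustrative name, not a library function):

```python
import numpy as np

classes = ["setosa", "versicolor", "virginica"]

# Label encoding: keep the class name itself as the target.
y_label = "setosa"

# Ordinal encoding: map each class to an integer (the order is arbitrary here).
ordinal = {c: i + 1 for i, c in enumerate(classes)}  # setosa→1, versicolor→2, virginica→3

# One-hot encoding: a unit vector with a 1 in the class's position.
def one_hot(name, classes):
    vec = np.zeros(len(classes), dtype=int)
    vec[classes.index(name)] = 1
    return vec

print(ordinal["versicolor"])          # 2
print(one_hot("virginica", classes))  # [0 0 1]
```

One-hot encoding avoids imposing a false ordering on the classes, which is why it is preferred for nominal categories.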
K-Nearest Neighbor (k-NN)
[Figure: k-NN classification examples in a two-dimensional feature space (𝑥1, 𝑥2).]
K-Nearest Neighbor (k-NN)
Pseudocode of the k-NN classifier

Input: 𝑿 - training data; 𝒀 - class labels of 𝑿; 𝐾 - number of nearest neighbors to consider; 𝒙_test - a new test sample.
1. For each 𝒙(𝑗) in 𝑿: calculate the distance 𝑑(𝒙_test, 𝒙(𝑗)).
2. Sort the calculated distances in increasing order.
3. Take the first 𝐾 points corresponding to the first 𝐾 sorted distances.
4. For each of the 𝐾 taken points: count the classes that appear.
5. Class of 𝒙_test = the most frequent class among the 𝐾 taken points.
Output: class of 𝒙_test.

Common distance metrics:
• Euclidean: 𝑑(𝒙_test, 𝒙(𝑗)) = √( Σ_k (𝑥_k^test − 𝑥_k^(𝑗))² )
• Manhattan: 𝑑(𝒙_test, 𝒙(𝑗)) = Σ_k |𝑥_k^test − 𝑥_k^(𝑗)|
• Minkowski: 𝑑(𝒙_test, 𝒙(𝑗)) = ( Σ_k |𝑥_k^test − 𝑥_k^(𝑗)|^𝑝 )^(1/𝑝)
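The steps above can be sketched as a short Python function, assuming Euclidean distance and a NumPy feature matrix (toy data below is illustrative):

```python
import numpy as np
from collections import Counter

def knn_predict(X, Y, x_test, k=3):
    """Classify x_test by majority vote among its k nearest training points."""
    # Step 1: Euclidean distance from x_test to every training sample.
    dists = np.sqrt(((X - x_test) ** 2).sum(axis=1))
    # Steps 2-3: indices of the k smallest distances.
    nearest = np.argsort(dists)[:k]
    # Steps 4-5: count the classes that appear and return the most common one.
    return Counter(Y[i] for i in nearest).most_common(1)[0][0]

# Toy data: two small clusters.
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])
Y = ["A", "A", "B", "B"]
print(knn_predict(X, Y, np.array([0.1, 0.0])))  # "A"
```

Note that k-NN has no training phase beyond storing the data; all work happens at prediction time.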
Naïve Bayes Classifier
Probability Review
Random variable: a variable whose value occurs randomly.
𝐴: Ω → 𝐸
Ω is the set (space) of values that 𝐴 can take (the sample space). Each value 𝑎 ∈ Ω is a realization of 𝐴.
Probability: the likelihood that 𝐴 takes a particular value or range of values.
Notation: 𝑃(𝐴 ≤ 𝑎) - the probability that the condition (𝐴 ≤ 𝑎) occurs.
• Maximum likelihood: find a parameter 𝜃 that maximizes the probability of the observed data:
𝜃* = argmax_𝜃 𝑃(𝑥 | 𝜃)
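A minimal numerical sketch of maximum likelihood (illustrative, not from the slides): for i.i.d. Gaussian samples with known variance, the likelihood 𝑃(𝑥 | 𝜇) is maximized at the sample mean, which a simple grid search recovers.

```python
import numpy as np

# Observed data (illustrative).
x = np.array([2.0, 2.5, 3.0, 3.5, 4.0])

def neg_log_likelihood(mu, x, sigma=1.0):
    # Negative log-likelihood of Gaussian data, up to an additive constant.
    return np.sum((x - mu) ** 2) / (2 * sigma ** 2)

# Scan candidate mu values and pick the likelihood maximizer.
candidates = np.linspace(0, 6, 601)
mu_star = candidates[np.argmin([neg_log_likelihood(m, x) for m in candidates])]
print(mu_star)  # ≈ 3.0, the sample mean
```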
Naïve Bayes Classifier
• Naïve assumption: suppose 𝑥 = [𝑥1, …, 𝑥𝑖, …, 𝑥𝑛]ᵀ consists of 𝑛 features. We "naïvely" assume that the 𝑥𝑖 are mutually independent, so
𝑃(𝑥) = 𝑃(𝑥1) 𝑃(𝑥2) … 𝑃(𝑥𝑛)
• Gaussian Naïve Bayes: each feature follows a Gaussian distribution, characterized by
𝜇: mean
𝜎²: variance, 𝜎² = Σ(𝑥 − 𝜇)² / 𝑁
𝑥𝑖 ~ 𝑁(𝜇𝑖, 𝜎𝑖²) → 𝑃(𝑥𝑖) = (1 / (𝜎𝑖 √(2𝜋))) exp(−(𝑥𝑖 − 𝜇𝑖)² / (2𝜎𝑖²))
The method is then: for each class 𝑦 = 𝑐𝑘, compute
1. The frequency (prior) of 𝑐𝑘
2. The parameters needed to compute the probabilities → choose the class maximizing 𝑃(𝑦 = 𝑐𝑘 | 𝑥)
[Figure: decision boundaries of the Naïve Bayes classifier.]
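The two-step procedure above can be sketched as a minimal Gaussian Naïve Bayes class (an illustrative implementation, not scikit-learn's; the variance floor `1e-9` is an assumption to avoid division by zero):

```python
import numpy as np

class GaussianNB:
    """Minimal Gaussian Naive Bayes: per-class priors plus per-feature
    Gaussian parameters, assuming features are conditionally independent."""
    def fit(self, X, y):
        self.classes = np.unique(y)
        self.prior, self.mu, self.var = {}, {}, {}
        for c in self.classes:
            Xc = X[y == c]
            self.prior[c] = len(Xc) / len(X)      # 1. frequency of class c_k
            self.mu[c] = Xc.mean(axis=0)          # 2. Gaussian parameters
            self.var[c] = Xc.var(axis=0) + 1e-9   # small floor avoids /0
        return self

    def predict(self, x):
        def log_posterior(c):
            # log P(c) + sum over features of log N(x_i; mu_i, sigma_i^2)
            return (np.log(self.prior[c])
                    - 0.5 * np.sum(np.log(2 * np.pi * self.var[c]))
                    - 0.5 * np.sum((x - self.mu[c]) ** 2 / self.var[c]))
        return max(self.classes, key=log_posterior)

# Toy 1-D data: two well-separated classes.
X = np.array([[1.0], [1.2], [0.8], [5.0], [5.2], [4.8]])
y = np.array([0, 0, 0, 1, 1, 1])
model = GaussianNB().fit(X, y)
print(model.predict(np.array([1.1])))  # 0
```

Working in log space keeps the product 𝑃(𝑥1)…𝑃(𝑥𝑛) from underflowing when 𝑛 is large.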
Logistic Regression
• General form:
ŷ(𝑥) = 𝑓(𝑤ᵀ𝑥)
• 𝑓 is built from the logistic (sigmoid) function: 𝑓 = 𝑔(𝜎(𝑤ᵀ𝑥)), where
𝜎(𝑡) = 1 / (1 + exp(−𝑡))
As 𝑡 → +∞, 𝜎 → 1; as 𝑡 → −∞, 𝜎 → 0.
• We can interpret 𝜎 as a probability: 𝑝̂(𝑦 = 𝑐𝑘 | 𝑥) = 𝜎(𝑤ᵀ𝑥).
Logistic Regression
• Prediction form of ŷ:
ŷ(𝑥) = 𝑓(𝑤ᵀ𝑥) = 𝑔(𝜎(𝑤ᵀ𝑥))
ŷ(𝑥) = 1 if 𝑝̂ ≥ 0.5, 0 if 𝑝̂ < 0.5
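A minimal sketch of this prediction rule (the weight vector `w` below is illustrative, not a trained model):

```python
import numpy as np

def sigmoid(t):
    # Logistic function: maps any real t into (0, 1).
    return 1.0 / (1.0 + np.exp(-t))

def predict(w, x, threshold=0.5):
    """Logistic-regression prediction: p_hat = sigma(w^T x), thresholded at 0.5."""
    p_hat = sigmoid(w @ x)
    return 1 if p_hat >= threshold else 0

w = np.array([2.0, -1.0])                 # illustrative weights
print(predict(w, np.array([1.0, 0.5])))   # sigma(1.5) ≈ 0.82 → 1
print(predict(w, np.array([0.0, 2.0])))   # sigma(-2.0) ≈ 0.12 → 0
```

The 0.5 threshold is conventional; it can be moved to trade precision against recall.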
Softmax Regression
• Prediction form of ŷ:
ŷ(𝑥) = 𝑓(𝑤ᵀ𝑥) = 𝑔(𝜎(𝑤ᵀ𝑥))
𝜎(𝑧)ᵢ = exp(𝑧ᵢ) / Σⱼ₌₁ᴷ exp(𝑧ⱼ)
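The softmax formula can be sketched directly (the max-shift inside `exp` is a standard numerical-stability trick, not part of the formula itself; the scores `z` are illustrative):

```python
import numpy as np

def softmax(z):
    """Softmax: exp(z_i) / sum_j exp(z_j), shifted by max(z) for stability."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([2.0, 1.0, 0.1])  # illustrative scores for K = 3 classes
p = softmax(z)
print(p.argmax())  # 0 - the class with the largest score
```

The outputs are nonnegative and sum to 1, so they can be read as class probabilities; the predicted class is the argmax.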
https://machinelearningcoban.com/
Support Vector Machine (SVM)
• Suppose we have a hyperplane as a classifier.
• Support vectors are the data points nearest to the hyperplane.
• The distance between the support vectors and the hyperplane is
the margin.
𝑦 = 𝑤ᵀ𝑥 − 𝑤0
[Figure: separating hyperplane with margin boundaries 𝑦 = 1 and 𝑦 = −1.]
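A minimal sketch of this decision function (the hyperplane parameters `w`, `w0` below are illustrative, not a trained SVM):

```python
import numpy as np

def svm_decision(w, w0, x):
    """Signed decision value y = w^T x - w0. The sign gives the class;
    |y| / ||w|| is the distance from x to the hyperplane."""
    return w @ x - w0

w = np.array([1.0, 1.0])  # illustrative hyperplane normal
w0 = 1.0
x = np.array([2.0, 1.0])

y = svm_decision(w, w0, x)
print(np.sign(y))                  # 1.0 → positive class
print(abs(y) / np.linalg.norm(w))  # geometric distance to the hyperplane
```

Support vectors are exactly the training points with |𝑦| = 1, i.e., those lying on the margin boundaries.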
Other classification algorithms
Model Evaluation
https://www.researchgate.net/publication/347447352_Classification_of_stages_of_
Diabetic_Retinopathy_using_Deep_Learning