
Artificial Intelligence and Machine Learning

A.A. 2020/2021
Tatiana Tommasi
Naïve Bayes Classifier

Slide Credit: Barnabás Póczos & Alex Smola
Deep Learning

How good are our predictions?
The avocado problem:

Example:
● Imagine buying your first avocado
● How do you tell whether it is ripe?
● Gather data: buy avocados and open them
● Features:
○ color: from dark green to dark brown (image: https://www.istockphoto.com/)
○ softness: from rock hard to mushy
● What is your prediction strategy on new avocados?
● Is perfect prediction of avocados possible?

slide credit: Francesco Orabona
Training Data → Build a Model → Make predictions on a new sample
...why do we minimize the training error when we care about the test error?
...is it always possible to get 100% accuracy?
...what are the important parameters characterizing a learning problem?

2020/21
slide credit: Francesco Orabona 6
Notation

Loss

True Risk

Bayes Classifier

Bayes Risk

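The slides' figures and formulas are not reproduced in this extraction; as a reference point, the standard binary-classification form of the Bayes classifier and the Bayes risk (assuming labels in {0, 1} and writing η(x) = P(y = 1 | x); this notation is a common convention, not necessarily the slides' own) is:

```latex
\eta(x) = P(y = 1 \mid x), \qquad
f^\star(x) =
\begin{cases}
1 & \text{if } \eta(x) \ge 1/2 \\
0 & \text{otherwise}
\end{cases}
\qquad
R^\star = R(f^\star) = \mathbb{E}_x\!\left[\min\{\eta(x),\, 1 - \eta(x)\}\right]
```

No classifier can achieve risk below R*, which is why it serves as the baseline for everything that follows.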
Batch Learning

Batch Learning - IID condition

Empirical Risk

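To make the empirical risk concrete, here is a minimal sketch (an illustration of the general idea, not code from the slides): the empirical risk of a hypothesis is simply its average loss over the training sample.

```python
def empirical_risk(predictor, samples, loss):
    """Average loss of `predictor` over a list of (x, y) pairs."""
    return sum(loss(predictor(x), y) for x, y in samples) / len(samples)

def zero_one(y_pred, y):
    """0-1 loss: 1 on a mistake, 0 otherwise."""
    return 0 if y_pred == y else 1

# Toy sample: the true label is 1 when x >= 0.5.
S = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
h = lambda x: 1 if x >= 0.7 else 0  # a hypothesis that misclassifies x = 0.6
print(empirical_risk(h, S, zero_one))  # 0.25
```

Minimizing this quantity over a hypothesis class is exactly empirical risk minimization (ERM).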
Can Only Be Probably Correct

Can Only Be Approximately Correct

Probably Approximately Correct (PAC) Learning

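For reference, the standard formal statement of PAC learnability (a sketch in common textbook notation, assumed rather than taken from the slides): a class H is PAC learnable if there exist a sample-complexity function m_H(ε, δ) and a learning algorithm A such that, for every accuracy ε, confidence δ, and data distribution D,

```latex
m \ge m_{\mathcal{H}}(\varepsilon, \delta)
\;\Longrightarrow\;
\Pr_{S \sim \mathcal{D}^m}\!\big[\, L_{\mathcal{D}}(A(S)) \le \varepsilon \,\big] \ge 1 - \delta
```

where L_D denotes the true risk. "Approximately" corresponds to ε, "probably" to δ.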
Let’s Start Easy: Realizability Assumption

Learning in the Realizable Setting

Analysis of a Consistent Classifier (1)

Analysis of a Consistent Classifier (2)

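The analysis referred to above, in its standard form for a finite hypothesis class H in the realizable setting (a sketch of the usual union-bound argument; the slides' own derivation is not reproduced in this extraction): the probability that some hypothesis is consistent with the sample yet has true risk above ε is at most

```latex
\Pr\!\big[\exists h \in \mathcal{H}:\ \hat{L}_S(h) = 0 \ \wedge\ L_{\mathcal{D}}(h) > \varepsilon\big]
\;\le\; |\mathcal{H}|\,(1 - \varepsilon)^m \;\le\; |\mathcal{H}|\, e^{-\varepsilon m}
```

so setting the right-hand side to δ shows that m ≥ (1/ε) ln(|H|/δ) samples suffice for any consistent classifier to be ε-good with probability at least 1 − δ.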
PAC Learning

What is Learnable and How to Learn?

Waiving the Realizability Assumption

What is Learnable and How to Learn?

Infinite Hypothesis Classes?

Shattering & VC-dimension

Example

[Figure: a single point c1 can be given either label (+ or -), so {c1} is shattered; for a two-point set {c1, c2}, the labeling c1 ➡ -, c2 ➡ + cannot be realized.]

This two-point set is not shattered, hence VCdim = 1.

Example
Consider the set of linear classifiers in two dimensions.

Case 1: one point inside the triangle formed by the others. We cannot label the inside point as positive and the outside points as negative.

Case 2: all four points on the boundary (convex hull). We cannot label two diagonally opposite points as positive and the other two as negative.

In both cases the four-point set is not shattered, so VCdim = 3.
In d dimensions, linear classifiers have VCdim = d + 1.
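The two cases above can be checked numerically. The sketch below (my own illustration with hypothetical helper names, not code from the slides) enumerates the labelings realizable by a 2D linear classifier sign(w·x + b) by sweeping directions w over a fine angular grid: for a fixed direction, exactly the "top-k points positive" splits of the projection ordering are realizable with some bias b. Three points in general position admit all 2³ = 8 labelings (shattered); four points in convex position admit only 14 of 16, missing the two diagonal splits.

```python
import math

def halfplane_dichotomies(points, n_dirs=3600):
    """Enumerate labelings of `points` realizable by sign(w.x + b) in 2D,
    by sweeping directions w on an angular grid. For each direction, every
    strict split of the projection-sorted points into 'top k positive /
    rest negative' is realizable by choosing b between the two groups."""
    n = len(points)
    achievable = set()
    for s in range(n_dirs):
        t = 2 * math.pi * s / n_dirs
        wx, wy = math.cos(t), math.sin(t)
        proj = [wx * x + wy * y for x, y in points]
        order = sorted(range(n), key=lambda i: proj[i])
        for k in range(n + 1):
            # require strict separation at the split (ties are not realizable here)
            if k in (0, n) or proj[order[n - k - 1]] < proj[order[n - k]]:
                achievable.add(frozenset(order[n - k:]))  # top-k = positive set
    return achievable

three = [(0, 0), (2, 1), (1, 3)]              # general position: shattered
four = [(0, 0), (4, 1), (5, 4), (1, 3)]       # convex position: not shattered
print(len(halfplane_dichotomies(three)))      # 8  = 2^3
print(len(halfplane_dichotomies(four)))       # 14 < 2^4 (diagonal splits missing)
```

This matches the known count of halfplane dichotomies of n points in general position, 2·(C(n−1,0) + C(n−1,1) + C(n−1,2)): 8 for n = 3 and 14 for n = 4.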
Not PAC Learnable

Infinite Hypothesis Classes?

Things can go wrong

Example: regression using a polynomial curve.

Figures from Pattern Recognition and Machine Learning, Bishop, and http://www-inst.eecs.berkeley.edu/~cs188/fa19/
Things can go wrong

General phenomenon:

Figure from Deep Learning, Goodfellow, Bengio and Courville
Overfitting

Cross Validation

Dataset: split into Training and Testing.
If data permits, split the Training part further into Training and Validation.
k-fold Cross Validation

Dataset: split into Training and Testing; cross validation is applied within the training part.

Figure from Pattern Recognition and Machine Learning, Bishop
If k = |S|: Leave-One-Out Cross Validation.
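The fold construction can be sketched in a few lines (a minimal illustration with a hypothetical function name, not the slides' code): partition the n sample indices into k contiguous validation folds, training on the rest each time.

```python
def k_fold_splits(n, k):
    """Yield (train_indices, val_indices) pairs for k-fold cross validation
    over a dataset of size n. Each sample appears in exactly one validation
    fold; the first n % k folds get one extra sample when k does not divide n."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size

for train, val in k_fold_splits(6, 3):
    print(val)  # [0, 1] then [2, 3] then [4, 5]
```

With k = n, every validation fold is a single sample, which is exactly leave-one-out cross validation.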
Train - Validation - Test

Model Selection In Summary

Learning Solution to Overfitting

Figure Credit: Nati Srebro

Control complexity by penalizing complex models in learning: regularization.
Regularization

Regularization

Empirical Risk(h) + λ Complexity(h)

● λ is a parameter, a positive number that serves as a conversion rate between the loss and the hypothesis complexity (they might not have the same scale)
● the form of the complexity/regularization function depends on the hypothesis space
● we still need cross validation, and it must now include the parameter λ: select the value that gives the best validation score

Linear Regression

Flashback: Loss Function

Empirical Loss

Linear Predictors in 1d

Linear Fitting to Data

Linear Functions

Least Squares Criterion

Least Squares in Matrix / Vector Form

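The matrix/vector formulation leads to the standard normal equations (a reference statement in common notation, with X the design matrix and y the target vector; the slides' own symbols may differ):

```latex
\hat{w} = \arg\min_{w} \|Xw - y\|_2^2
\quad\Longrightarrow\quad
X^\top X \hat{w} = X^\top y
\quad\Longrightarrow\quad
\hat{w} = (X^\top X)^{-1} X^\top y \quad \text{(when } X^\top X \text{ is invertible)}
```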
Least Squares via Calculus

Slide Credit: E. Rodolà


Linear Predictor in 1d

forget about the bias term b for one second…

Slide Credit: William Cohen
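In the bias-free 1d case, the least squares minimizer has a one-line closed form: for predictions ŷ = w·x, minimizing Σᵢ (w·xᵢ − yᵢ)² gives w = (Σᵢ xᵢ yᵢ) / (Σᵢ xᵢ²). A minimal sketch (function name my own):

```python
def fit_1d_no_bias(xs, ys):
    """Least squares for y ~ w * x (no bias term): the minimizer of
    sum_i (w * x_i - y_i)^2 is w = sum(x_i * y_i) / sum(x_i ** 2)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]   # data exactly on the line y = 2x
print(fit_1d_no_bias(xs, ys))  # 2.0
```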
add the bias term

Add regularization (Ridge Regression)

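Ridge regression adds the λ‖w‖² penalty to least squares, which keeps the closed form: w = (XᵀX + λI)⁻¹ Xᵀy. A sketch with the bias handled via an appended ones column (an illustration with hypothetical names; for simplicity the bias is regularized together with the weights, whereas in practice it is often left unpenalized):

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge regression in closed form: w = (X^T X + lam * I)^{-1} X^T y.
    A column of ones is appended so the last coefficient is the bias b."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    d = Xb.shape[1]
    return np.linalg.solve(Xb.T @ Xb + lam * np.eye(d), Xb.T @ y)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])     # data exactly on y = 2x + 1
w = ridge_fit(X, y, lam=1e-8)          # tiny lam: essentially plain least squares
print(w)                               # approximately [2.0, 1.0]
```

Note the connection to the regularization slide: λ is the conversion rate between fit and complexity, and its value is chosen by validation.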
Beyond Linear Models: Polynomial Regression

Fitting Polynomials

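Fitting a polynomial is just linear regression on the expanded features [1, x, x², …, x^d], so the machinery above applies unchanged. A sketch using a Vandermonde design matrix (function name my own):

```python
import numpy as np

def fit_polynomial(xs, ys, degree):
    """Fit a degree-`degree` polynomial by least squares on the
    Vandermonde features [1, x, x^2, ..., x^degree]."""
    V = np.vander(xs, degree + 1, increasing=True)
    coeffs, *_ = np.linalg.lstsq(V, ys, rcond=None)
    return coeffs  # coeffs[k] multiplies x**k

xs = np.array([-1.0, 0.0, 1.0, 2.0])
ys = xs ** 2                          # samples from y = x^2, no noise
c = fit_polynomial(xs, ys, degree=2)
print(np.round(c, 6))                 # approximately [0, 0, 1]
```

Raising the degree far beyond what the data supports reproduces the overfitting phenomenon from the "Things can go wrong" slides: training error keeps dropping while test error grows.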
Regression Summary

Final Summary

● Learning can only be Probably Approximately Correct (PAC)
○ true risk vs empirical risk
○ loss
○ empirical risk minimization can lead to learning algorithms with reasonable generalization guarantees (under some conditions)
○ when a task is not PAC learnable

● Underfitting / Overfitting
○ cross-validation

● Linear Regression
○ without and with regularization

