Lec1 Introduction
• Course Assistant:
– Zed Lee - zed.lee@dsv.su.se
• Schedule:
– 12 Lectures
– 5 Lab sessions
– Written Exam: Mar 14
• Guest lecturers:
– Ioanna Miliou - ioanna.miliou@dsv.su.se
– Korbinian Randl - korbinian.randl@dsv.su.se
– Alejandro Kuratomi - alejandro.kuratomi@dsv.su.se
– Sindri Magnusson - sindri.magnusson@dsv.su.se
• Assignments: 3 hp
– Three programming assignments (Python and TensorFlow)
*https://www.aiiottalk.com/machine-learning-examples-in-real-life/
Types of Machine Learning
• Supervised learning
• Unsupervised learning
• Semi-supervised learning
• Active learning
• Reinforcement learning
• Federated learning
• …
Supervised Learning
• Setup: a model is trained using labeled data, where the input is
paired with the correct output labels, with the objective of
learning to predict the output label, given an input data example.
• Learning: by analyzing a training set of labeled examples and
adjusting the model parameters in order to minimize the error
between the model’s predictions and the correct output labels.
Drawbacks:
• Labeled data is required
• Susceptible to bias
• Generalization issues
• May require feature engineering
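A minimal sketch of this setup (the toy data and the choice of a 1-nearest-neighbour model are illustrative assumptions, not course material):

```python
# Supervised learning in miniature: a 1-nearest-neighbour classifier.
# Training data is a list of (input vector, correct output label) pairs.

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def predict_1nn(train, x):
    """Predict the label of the training example closest to x."""
    nearest = min(train, key=lambda pair: euclidean(pair[0], x))
    return nearest[1]

# Labeled training set: feature vector -> class label
train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"),
         ((4.0, 4.2), "B"), ((4.5, 3.9), "B")]

print(predict_1nn(train, (1.1, 0.9)))  # near the "A" cluster
print(predict_1nn(train, (4.2, 4.0)))  # near the "B" cluster
```

Here "learning" is trivial (store the labeled examples); parametric models instead adjust their parameters to minimize the error between predictions and the correct labels, as described above.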
Unsupervised Learning
• Setup: a model is trained using unlabeled data, with the objective
of finding underlying hidden structures and patterns in the data.
• Learning: by analyzing a training set of unlabeled examples and
adjusting the model parameters in order to maximize a learning
objective.
Drawbacks:
• Lack of ground truth
• Difficult to evaluate
• Curse of dimensionality
• Computationally intensive
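A minimal sketch of this setup, using k-means clustering on invented 1-D data: no labels are given, and the algorithm discovers the two groups on its own.

```python
# Unsupervised learning in miniature: k-means clustering on unlabeled data.

def kmeans_1d(xs, k=2, iters=10):
    centers = xs[:k]                       # naive initialisation
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in xs:                       # assignment step
            i = min(range(k), key=lambda j: abs(x - centers[j]))
            clusters[i].append(x)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]   # update step
    return sorted(centers)

data = [1.0, 1.1, 0.9, 8.0, 8.2, 7.8]      # two obvious groups, no labels
print(kmeans_1d(data))                      # ≈ [1.0, 8.0]
```

The learning objective maximized here is (the negative of) the within-cluster squared distance; other unsupervised methods optimize other objectives.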
Semi-supervised Learning
• Setup: a model is trained using both labeled and unlabeled data,
with the objective of learning to predict the output label given an
input data example.
• Learning: by iteratively training the model using the most
confident predictions, and augmenting the training dataset.
Drawbacks:
• Sensitive to data quality
• Difficult to evaluate
• Difficult to interpret
• Computationally intensive
Semi-supervised learning: self-learning
• Pick a small amount of labeled data and use this dataset to train a base model.
• Then apply the process known as pseudo-labeling: take the partially
trained model and use it to make predictions for the unlabeled data.
• Take the most confident predictions made by the model and add them
to the labeled dataset, hence creating a new, combined input to train an
improved model.
• Repeat the process for several iterations.
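The steps above can be sketched as follows; the nearest-centroid base model, the distance-margin confidence measure, and the toy data are simplifying assumptions.

```python
# Self-learning loop: train a base model on a small labeled seed, then
# repeatedly pseudo-label the single most confident unlabeled example.

def centroids(labeled):
    by_label = {}
    for x, y in labeled:
        by_label.setdefault(y, []).append(x)
    return {y: sum(xs) / len(xs) for y, xs in by_label.items()}

def predict(cents, x):
    """Return (label, confidence): confidence = gap between the two distances."""
    d = sorted((abs(x - c), y) for y, c in cents.items())
    return d[0][1], d[1][0] - d[0][0]

labeled = [(0.0, "low"), (10.0, "high")]      # small labeled seed
unlabeled = [1.0, 2.0, 8.5, 9.0, 5.2]

for _ in range(3):                            # a few self-training rounds
    cents = centroids(labeled)                # (re)train the base model
    preds = [(x, *predict(cents, x)) for x in unlabeled]
    x, y, conf = max(preds, key=lambda t: t[2])   # most confident prediction
    labeled.append((x, y))                    # augment the training dataset
    unlabeled.remove(x)

print(sorted(labeled))
```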
Semi-supervised learning: co-learning
• Train two individual classifiers
based on two views of data.
• Views: different sets of features
but same instances.
• The bigger pool of unlabeled
data receives pseudo-labels
from both classifiers.
• Classifiers co-train one another
using the pseudo-labels with
the highest confidence level.
• The predictions of the two
classifiers are combined.
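The co-training scheme above can be sketched as follows, with the two views taken to be the two features of each instance and a nearest-centroid classifier per view (both choices, and the data, are illustrative assumptions):

```python
# Co-training sketch: two classifiers, one per feature view, take turns
# pseudo-labeling their most confident unlabeled instance for the shared pool.

def centroids(pairs):
    by = {}
    for v, y in pairs:
        by.setdefault(y, []).append(v)
    return {y: sum(vs) / len(vs) for y, vs in by.items()}

def predict(cents, v):
    d = sorted((abs(v - c), y) for y, c in cents.items())
    return d[0][1], d[1][0] - d[0][0]         # (label, confidence margin)

labeled = [((0.0, 0.0), "A"), ((10.0, 10.0), "B")]
unlabeled = [(1.0, 2.0), (9.0, 8.0), (2.0, 1.0)]

for _ in range(2):
    if not unlabeled:
        break
    for view in (0, 1):                        # each view's classifier in turn
        cents = centroids([(x[view], y) for x, y in labeled])
        preds = [(x, *predict(cents, x[view])) for x in unlabeled]
        x, y, _ = max(preds, key=lambda t: t[2])
        labeled.append((x, y))                 # pseudo-label shared with the other view
        unlabeled.remove(x)
        if not unlabeled:
            break

print(sorted(labeled))
```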
Active Learning
• Setup: a model is trained using both labeled and unlabeled data,
with the objective of learning to predict the output label given an
input data example.
• Learning: by iteratively training the model using a human expert
annotator for the least confident predictions.
Drawbacks:
• Sensitive to data quality
• Dependence on human
annotator
• Computationally intensive
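A minimal sketch of this loop (uncertainty sampling): the `oracle` function below stands in for the human expert annotator, and the nearest-centroid model and toy data are invented for illustration.

```python
# Active learning sketch: repeatedly ask the annotator to label the
# example the model is least confident about.

def centroids(pairs):
    by = {}
    for x, y in pairs:
        by.setdefault(y, []).append(x)
    return {y: sum(xs) / len(xs) for y, xs in by.items()}

def predict(cents, x):
    d = sorted((abs(x - c), y) for y, c in cents.items())
    return d[0][1], d[1][0] - d[0][0]           # (label, confidence margin)

def oracle(x):                                   # stands in for the human expert
    return "low" if x < 5 else "high"

labeled = [(0.0, "low"), (10.0, "high")]
pool = [4.9, 1.0, 9.0, 5.1]

for _ in range(2):                               # two annotation rounds
    cents = centroids(labeled)
    preds = [(x, *predict(cents, x)) for x in pool]
    x, _, _ = min(preds, key=lambda t: t[2])     # least confident: ask the expert
    labeled.append((x, oracle(x)))
    pool.remove(x)

print(sorted(labeled))
```

Note that the annotation budget is spent on the boundary points (4.9 and 5.1), exactly where the model is most uncertain.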
Types of Machine Learning…
• Reinforcement learning: an agent learns to interpret its
environment and optimize a given objective by interacting with
the environment through actions that are either rewarded or
penalized.
• Federated learning: training data is locally stored in interconnected
nodes and cannot be shared. A central model is learned using the
distributed local data by sharing only the model parameters and not
the data points themselves.
Examples of supervised learning
• Predict whether a patient, hospitalized due to
a heart attack, will have a second heart attack
• Identify the strongest predictors for heart attack
• Prediction based on many data modalities:
demographics, diet, clinical measurements, clinical
history
Examples of supervised learning
• Predict the price of a stock in 6 months from now, based on
– the performance of the company (e.g., sales, future outlook)
– other economic metrics (e.g., market value, Price-to-Earnings ratio)
– other exogenous variables (e.g., social media)
Examples of supervised learning
• Identify the handwritten numbers in a
digitized image
• Identify animals in an image
Examples of reinforcement learning
Define the measures that need to be taken by the public health authorities
during a pandemic
[Diagram: the RL system receives input "actions", is trained via "rewards" / "penalties", and produces an output]
Examples of federated learning
Given data from distributed hospitals learn a central model that can propose
the optimal patient treatment without sharing any data
[Diagram: distributed hospitals with local data (medical images, sensor readings) each contribute updates to a global model]
[Example dataset excerpt: compound name, activity label, and numeric descriptors]
dimethylanthracene   poor   0  5.29  4.61  108  7  16  18
hexahydropyrene      med.   0  5.00  3.82  117  7  16  19
triphenylene         poor   0  5.00  5.15  126  7  18  21
benzo(e)pyrene       poor   0  5.29  5.64  152  7  20  24

Input (predictors or independent variables) → Machine Learning (function f) → Output (responses or dependent variables, denoted as Y)
[Figure: underfitting vs. appropriate fitting vs. overfitting]
Underfitting: an ML model is unable to capture the relationship between the input and
output variables accurately, generating a high error rate on both training and test sets
Overfitting: an ML model is unable to generalize, hence the relationship it learns between
the input and output variables in the training set does not reflect the true relationship in the
test set, generating a low error rate on the training set but a high error rate on the test set
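A numeric illustration of these definitions, on invented noisy data: an underfit model (predict the global mean, ignoring the input) has a high error rate on both sets, while an overfit model (memorize the training set via 1-nearest neighbour) has zero training error but a high test error.

```python
# Underfitting vs. overfitting on toy data generated from y = 2x + noise.

import random

random.seed(0)
f = lambda x: 2 * x                                    # true relationship
train = [(x, f(x) + random.uniform(-1, 1)) for x in range(10)]
test = [(x + 0.5, f(x + 0.5) + random.uniform(-1, 1)) for x in range(10)]

mean_y = sum(y for _, y in train) / len(train)

def mse(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

underfit = lambda x: mean_y                            # ignores x entirely
overfit = lambda x: min(train, key=lambda p: abs(p[0] - x))[1]  # memorizer

print("underfit:", round(mse(underfit, train), 2), round(mse(underfit, test), 2))
print("overfit: ", round(mse(overfit, train), 2), round(mse(overfit, test), 2))
```

The memorizer's training error is exactly zero, yet its test error is not: the relationship it "learned" does not reflect the true one.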
• But we assume: Y = f(X) + ε, with E[ε] = 0 and Var(ε) = σ².
• Bias: high bias can cause an algorithm to miss the relevant relations between
features and target outputs (underfitting).
• Variance: over different realizations of the training set, how much does
f̂(X) deviate from its expectation E[f̂(X)]?
Recall: Y = f(X) + ε, with E[ε] = 0 and Var(ε) = σ²; the fitted model f̂ is trained on data independent of the test point x₀.

Proof
E[(Y − f̂(x₀))²] = E[(f(x₀) + ε − f̂(x₀))²]
= E[(f(x₀) − f̂(x₀))²] + 2E[ε(f(x₀) − f̂(x₀))] + E[ε²]
• square expansion
• linear property of expectation

Proof
= E[(f(x₀) − f̂(x₀))²] + σ²
• properties of the error: E[ε] = 0 and E[ε²] = Var(ε) = σ²
• independence of the random variables ε and f̂

Proof
f(x₀) is a constant, hence its expectation is also constant: E[f(x₀)] = f(x₀).

Continuing the decomposition
E[(f(x₀) − f̂(x₀))²] = E[(f(x₀) − E[f̂(x₀)] + E[f̂(x₀)] − f̂(x₀))²]
= (f(x₀) − E[f̂(x₀)])² + E[(E[f̂(x₀)] − f̂(x₀))²]
• the cross term vanishes: linearity of expectation is applied and E[E[f̂(x₀)] − f̂(x₀)] = 0

Continuing the decomposition
E[(Y − f̂(x₀))²] = Bias[f̂(x₀)]² + Var[f̂(x₀)] + σ²
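The bias-variance decomposition can be checked numerically by Monte Carlo simulation; the setup below (true f(x) = 2x, noise σ = 0.5, and a deliberately biased model that always predicts the mean of its training responses) is invented for illustration.

```python
# Monte Carlo check of E[(Y - f_hat(x0))^2] = Bias^2 + Variance + sigma^2.

import random, statistics

random.seed(1)
f = lambda x: 2 * x
sigma = 0.5                               # Var(eps) = sigma^2 = 0.25
x0 = 2.0                                  # evaluation point

preds, errors = [], []
for _ in range(20000):
    # draw a fresh training set of 5 noisy points
    train_y = [f(x) + random.gauss(0, sigma) for x in (0, 0.5, 1, 1.5, 2)]
    f_hat = sum(train_y) / len(train_y)   # predicts ~2 everywhere; f(x0) = 4
    preds.append(f_hat)
    y0 = f(x0) + random.gauss(0, sigma)   # fresh test observation at x0
    errors.append((y0 - f_hat) ** 2)

bias2 = (statistics.mean(preds) - f(x0)) ** 2   # ~ (2 - 4)^2 = 4
var = statistics.pvariance(preds)               # ~ sigma^2 / 5 = 0.05
lhs = statistics.mean(errors)                   # left-hand side of the identity
print(round(lhs, 2), round(bias2 + var + sigma ** 2, 2))
```

Both printed quantities agree up to Monte Carlo error, with the bias² term dominating for this deliberately misspecified model.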
Model selection & assessment
model fitting → model selection → assessment of the generalization error of the final model
• Shortcoming of a single train/test split:
– depends highly on which data points end up in the training set and
which end up in the test set
k-fold Cross Validation
• Partition the N data samples into k disjoint subsets
– train on k-1 partitions
– test on the remaining one
[Diagram: dataset D partitioned into disjoint folds 1, 2, 3, …, k-1, k]
• The training set is not representative enough (few examples, in order to
support the validation set).
• The validation set is not representative enough (few examples, in order to
support the training set).
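The k-fold procedure can be sketched as follows; the "model" (predict the training mean) is a stand-in chosen to keep the sketch dependency-free, and the data is invented.

```python
# k-fold cross-validation: k disjoint folds; train on k-1 of them,
# evaluate on the held-out fold, and average the k test errors.

def k_fold_indices(n, k):
    """Partition indices 0..n-1 into k disjoint folds."""
    return [list(range(i, n, k)) for i in range(k)]

def cross_validate(data, k):
    folds = k_fold_indices(len(data), k)
    scores = []
    for i in range(k):
        test = [data[j] for j in folds[i]]
        train = [data[j] for f in folds[:i] + folds[i + 1:] for j in f]
        model = sum(train) / len(train)          # "fit" on the k-1 folds
        scores.append(sum((x - model) ** 2 for x in test) / len(test))
    return sum(scores) / k                       # average held-out error

data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(cross_validate(data, k=3))                 # 3.75 for this toy data
```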
Comparing Learning Curves
• Indication of high bias
As # of training examples increases:
o training error increases too
o model is not fitting the data with
an appropriate curve
Which model do we finally deploy? Cross-validation tests a model-building
procedure and not a single model instance.
Leave-one-out bootstrap
• Train on each replication b
• Test each sample i on its out-of-bag set C⁻ⁱ, i.e., evaluate it only against
the replications b in which it does not appear
• Problems?
– the average number of distinct observations in each bootstrap sample is
about 0.632·N, since P(observation i is drawn) = 1 − (1 − 1/N)^N ≈ 1 − e⁻¹ ≈ 0.632
– upward bias on the validation error
– behaves roughly like two-fold Cross Validation
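The leave-one-out bootstrap can be sketched as follows; the mean-predictor "model" and the toy data are invented for illustration.

```python
# Leave-one-out bootstrap: train on each bootstrap replication, and evaluate
# each sample only on the replications in which it does not appear (its
# out-of-bag set).

import random

random.seed(2)
data = [1.0, 2.0, 3.0, 4.0, 5.0]
B = 200
# each replication: N indices sampled with replacement
reps = [[random.choice(range(len(data))) for _ in data] for _ in range(B)]

errs = []
for i, x in enumerate(data):
    oob = [r for r in reps if i not in r]       # replications not containing i
    if not oob:
        continue                                # i appeared in every replication
    mse = [(x - sum(data[j] for j in r) / len(r)) ** 2 for r in oob]
    errs.append(sum(mse) / len(mse))

err_loo_boot = sum(errs) / len(errs)
print(round(err_loo_boot, 2))
```

Because each "training set" holds only about 0.632·N distinct observations, this estimate is biased upward, which motivates the .632 correction below.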
.632 bootstrap
• Main idea:
– extension of the leave-one-out bootstrap: Err(.632) = 0.368 · err_train + 0.632 · Err(loo-boot),
where err_train is the training (resubstitution) error
– intuition: pulls the leave-one-out bootstrap estimate down towards the
training error rate, and hence reduces its upward bias