2024 MTH058 Lecture04 AILearningParadigms
Learning paradigms

Supervised learning (SL)
• A supervised learning algorithm fits a model to labeled pairs (X, y); the trained model outputs predictions ŷ for new inputs.
(Diagram: X, y → supervised learning algorithm → Model → ŷ. Image credit: EnjoyAlgorithms)
Example applications:
• Spam detection
• Recommender systems in BI
• Object recognition on the MNIST dataset
• Classifying protein structures at different levels
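The fit-then-predict pattern above can be sketched with a tiny 1-nearest-neighbour classifier; the spam-style features and data below are hypothetical, not from the slides.

```python
# Minimal supervised learning sketch: memorise labeled pairs (X, y),
# then predict the label of the closest training point (1-NN).
def fit_1nn(X, y):
    # "Training" for 1-NN is just storing the labeled examples.
    return list(zip(X, y))

def predict_1nn(model, x):
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    # Return the label of the nearest stored training point.
    nearest = min(model, key=lambda pair: dist(pair[0], x))
    return nearest[1]

# Toy spam-detection-style data: features = (num_links, num_caps_words)
X_train = [(0, 1), (1, 0), (7, 9), (8, 6)]
y_train = ["ham", "ham", "spam", "spam"]
model = fit_1nn(X_train, y_train)
print(predict_1nn(model, (6, 8)))  # spam
```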
Regression
• Train a model to predict a continuous dependent variable
Classification
• Train a model to predict a categorical dependent variable
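Regression can be sketched with an ordinary least-squares fit of a line; the data below are illustrative (chosen to lie exactly on y = 2x + 1).

```python
# Minimal regression sketch: fit y = a*x + b by ordinary least squares.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Closed-form slope: covariance(x, y) / variance(x)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]                # exactly y = 2x + 1
a, b = fit_line(xs, ys)
print(round(a, 6), round(b, 6))  # 2.0 1.0
```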
Classification vs. Regression
Classification: Problem types
• Binary vs. multiclass classification
Unsupervised learning (USL)
• Infer a function to describe hidden structure from "unlabeled"
data, i.e., labels are not included in the observations.
(Diagram: X → unsupervised learning algorithm → Model)
Outlier detection
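Outlier detection can be sketched without any labels y, e.g. a simple z-score rule; the data below are hypothetical.

```python
# Unsupervised sketch: flag points far from the mean (z-score rule).
# No labels are used; structure is inferred from the data alone.
import math

def zscore_outliers(xs, threshold=2.0):
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [x for x in xs if abs(x - mean) / std > threshold]

data = [10, 11, 9, 10, 12, 10, 11, 60]
print(zscore_outliers(data))  # [60]
```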
Supervised vs. Unsupervised learning
Human action recognition problem
(Figure source: Semi-Supervised Learning and Capsule Network for Video Action Detection. RIVF 2023. HCMUS.)
Reinforcement learning (RL)
• An agent learns to interact with an environment based on
feedback signals it receives from the environment.
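The feedback loop can be sketched with tabular Q-learning on a toy environment, assuming a hypothetical 5-state corridor where the agent earns reward 1 for reaching the right end.

```python
# Minimal RL sketch: the agent acts, the environment returns a reward
# signal, and the Q-table is updated from that feedback.
import random

random.seed(0)
N_STATES = 5                 # states 0..4; state 4 is the goal
ACTIONS = (1, -1)            # move right / move left
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.1

for _ in range(300):
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy: mostly exploit, sometimes explore
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0     # feedback signal
        # Q-learning update toward the bootstrapped target
        target = r + gamma * max(Q[(s2, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2

# Greedy policy after training: it should move right from every state.
policy = [max(ACTIONS, key=lambda act: Q[(s, act)])
          for s in range(N_STATES - 1)]
print(policy)
```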
From AlphaGo to AlphaZero
https://deepmind.com/blog/article/alphazero-shedding-new-light-grand-games-chess-shogi-and-go
Hybrid: Semi-supervised learning
• Self-training, co-training
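Self-training can be sketched with a nearest-centroid model: a model trained on a few labeled points pseudo-labels its most confident unlabeled points, then retrains. The 1-D data and labels "A"/"B" below are illustrative.

```python
# Self-training sketch: grow the labeled set with confident pseudo-labels.
def centroid_fit(labeled):
    groups = {}
    for x, y in labeled:
        groups.setdefault(y, []).append(x)
    # one centroid per class
    return {y: sum(v) / len(v) for y, v in groups.items()}

def predict(cents, x):
    return min(cents, key=lambda y: abs(cents[y] - x))

labeled = [(0.0, "A"), (10.0, "B")]
unlabeled = [1.0, 2.0, 8.0, 9.0]

for _ in range(2):                       # two self-training rounds
    cents = centroid_fit(labeled)
    # confidence: margin between the distances to the two centroids
    margin = lambda x: abs(abs(cents["A"] - x) - abs(cents["B"] - x))
    best = max(unlabeled, key=margin)    # most confident unlabeled point
    labeled.append((best, predict(cents, best)))
    unlabeled.remove(best)

print(sorted(labeled))                   # pseudo-labels included
```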
Autoencoder
• An autoencoder captures the most important patterns in the
input data to produce efficient representations.
Applications: image colorization, image denoising, image super-resolution
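A minimal sketch, assuming a linear autoencoder with a single latent unit and hypothetical 2-D data lying on the line y = 2x, so one latent number captures the important pattern.

```python
# Linear autoencoder sketch: compress 2-D inputs to one latent number,
# then reconstruct, training both maps on squared reconstruction error.
data = [(t, 2 * t) for t in (-2, -1, -0.5, 0.5, 1, 2)]

enc = [0.5, 0.5]    # encoder weights: h = enc . x   (2-D -> 1-D)
dec = [0.5, 0.5]    # decoder weights: x_hat = dec * h (1-D -> 2-D)
lr = 0.01

for _ in range(500):
    for x in data:
        h = enc[0] * x[0] + enc[1] * x[1]          # encode
        xh = (dec[0] * h, dec[1] * h)              # decode
        e = (xh[0] - x[0], xh[1] - x[1])           # reconstruction error
        # gradient descent on the squared reconstruction error
        new_dec = [dec[i] - lr * 2 * e[i] * h for i in range(2)]
        grad_h = 2 * (e[0] * dec[0] + e[1] * dec[1])
        enc = [enc[i] - lr * grad_h * x[i] for i in range(2)]
        dec = new_dec

x = (1.0, 2.0)
h = enc[0] * x[0] + enc[1] * x[1]
recon = (dec[0] * h, dec[1] * h)
print([round(v, 2) for v in recon])   # close to [1.0, 2.0]
```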
Generative Adversarial Networks (GANs)
• The key idea behind GANs is to train two neural networks, a generator and a discriminator, in a competitive framework.
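This competition is usually formalized as the minimax objective of Goodfellow et al. (2014), with generator G, discriminator D, data distribution p_data, and noise prior p_z:

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```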
GAN Applications: Image inpainting
Nvidia StyleGAN
GAN Applications: Image synthesis
Nvidia StyleGAN 3
Multi-instance learning
• An entire collection of examples is labeled as containing or
not containing an example of a class.
• However, the individual members are not labeled.
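Under one common MIL assumption (a bag is positive iff at least one of its instances is positive), bag prediction can be sketched with max-pooling over a per-instance scorer; the scorer and data below are hypothetical.

```python
# Multi-instance sketch: only bags carry labels, not instances.
def bag_predict(instance_score, bag, threshold=0.5):
    # A bag is positive if its highest-scoring instance crosses
    # the threshold (max-pooling MIL assumption).
    return max(instance_score(x) for x in bag) > threshold

# Hypothetical instance scorer: "how positive does this instance look?"
score = lambda x: x / 10.0

print(bag_predict(score, [1, 2, 3]))   # False: no strong instance
print(bag_predict(score, [1, 2, 9]))   # True: one positive instance
```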
Multi-instance learning: Pathology
Learning techniques
• Multi-task learning
• Active learning
• Online learning
• Transfer learning
• Ensemble learning
Multi-task learning
• A type of supervised learning in which a single model is fit on a dataset to address multiple related problems.
• E.g., the same word embedding is used to learn a distributed
representation of words and then shared across multiple NLP tasks.
(Figure: the architecture takes a single monocular RGB image as input and produces a pixel-wise classification, an instance semantic segmentation, and an estimate of per-pixel depth. Multi-task learning can improve accuracy over separately trained models because cues from one task, such as depth, are used to regularize and improve the generalization of another, such as segmentation.)
Kendall, Alex, Yarin Gal, and Roberto Cipolla. "Multi-task learning using uncertainty to weigh losses for scene geometry and semantics." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
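The shared-representation idea can be sketched with two task heads over one shared feature map and a weighted sum of per-task losses; the cited paper learns those weights from task uncertainty, while fixed weights and illustrative functions are used here.

```python
# Multi-task sketch: one shared representation, two task-specific heads,
# one combined training loss.
def shared_features(x):
    return [x, x * x]                    # shared representation

def head_a(feats, w):                    # e.g. one regression task
    return w[0] * feats[0] + w[1] * feats[1]

def head_b(feats, w):                    # e.g. a second, related task
    return w[0] * feats[0] - w[1] * feats[1]

def multitask_loss(x, ya, yb, wa, wb, lam_a=1.0, lam_b=1.0):
    f = shared_features(x)
    # weighted sum of the per-task squared errors
    return lam_a * (head_a(f, wa) - ya) ** 2 \
         + lam_b * (head_b(f, wb) - yb) ** 2

print(multitask_loss(2.0, 6.0, -2.0, [1.0, 1.0], [1.0, 1.0]))  # 0.0
```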
Active learning
• The learner resolves ambiguity during the learning process by
querying an oracle to request labels for new points.
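Uncertainty sampling, a common query strategy, can be sketched as follows; the scoring model and pool are hypothetical.

```python
# Active-learning sketch: query the oracle for the unlabeled point the
# current model is least sure about (probability closest to 0.5).
def most_uncertain(proba, pool):
    return max(pool, key=lambda x: -abs(proba(x) - 0.5))

# Hypothetical current model: probability of the positive class.
proba = lambda x: x / 10.0
pool = [1, 3, 5, 9]

query = most_uncertain(proba, pool)
print(query)  # 5: p = 0.5, the most ambiguous point
```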
Online learning
• The model is updated as each new data point arrives rather
than waiting until “the end” (which may never occur).
• Useful when the probability distribution of observations changes over time, or when there are too many observations to reasonably fit into memory.
• Stochastic gradient descent (SGD)
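The per-observation update can be sketched with SGD on a linear model; the data stream below is a toy example, not from the slides.

```python
# Online-learning sketch: update the model after every arriving
# observation instead of refitting on the full dataset.
def sgd_step(w, b, x, y, lr=0.1):
    err = (w * x + b) - y          # error on this single point
    return w - lr * err * x, b - lr * err

w, b = 0.0, 0.0
# Observations arrive one at a time; here they repeat y = 2x + 1.
stream = [(1, 3), (2, 5), (3, 7), (4, 9)] * 50
for x, y in stream:
    w, b = sgd_step(w, b, x, y)

print(round(w, 1), round(b, 1))  # close to 2.0 1.0
```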
Transfer learning
• A model learns from one type of problem, and what it has learned is then applied to solve a different but related problem.
• Classifier: the pre-trained model is used directly to classify samples.
• Feature extractor: the pre-trained model is used to extract relevant features.
• Weight initialization: the pre-trained model is integrated into a new model, and its layers are trained in concert with the new model.
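The feature-extractor strategy can be sketched as follows: a frozen "pre-trained" mapping turns raw inputs into features, and only a small new head is fit on the target task. All functions and data here are illustrative.

```python
# Transfer-learning sketch (feature-extractor strategy).
def pretrained_features(x):            # frozen: never updated
    return [x, x * x, 1.0]

def fit_head(samples, lr=0.01, epochs=1000):
    # Train only a new linear head on top of the frozen features.
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, y in samples:
            f = pretrained_features(x)
            err = sum(wi * fi for wi, fi in zip(w, f)) - y
            w = [wi - lr * err * fi for wi, fi in zip(w, f)]
    return w

# New, related task: y = x^2 + 2 (expressible in the frozen features)
data = [(-2, 6), (-1, 3), (0, 2), (1, 3), (2, 6)]
w = fit_head(data)
pred = sum(wi * fi for wi, fi in zip(w, pretrained_features(3)))
print(round(pred, 1))   # close to 11.0
```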
Ensemble learning
• Two or more models are fit on the same data and the
predictions from each model are combined.
• An ensemble is often more accurate than its base learners, especially when the models are significantly diverse.
• Parallelizability: each base learner can be trained on a different CPU
• Types of ensemble learning
(Figure: decision boundary by (a) a single decision tree and (b) an ensemble of decision trees for a linearly separable problem, i.e., where the actual decision boundary is a straight line. The decision tree struggles with approximating a linear boundary; the decision boundary of the ensemble is closer to the true boundary.)
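Majority voting, the simplest way to combine predictions, can be sketched as follows; the three base classifiers are hypothetical.

```python
# Ensemble sketch: diverse base learners vote; the majority class wins,
# which can correct individual mistakes.
from collections import Counter

def majority_vote(models, x):
    votes = [m(x) for m in models]
    return Counter(votes).most_common(1)[0][0]

# Three hypothetical (imperfect) threshold classifiers
m1 = lambda x: "pos" if x > 4 else "neg"
m2 = lambda x: "pos" if x > 6 else "neg"
m3 = lambda x: "pos" if x > 5 else "neg"

print(majority_vote([m1, m2, m3], 5.5))  # pos (2 of 3 agree)
```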
Random Forest (Breiman 2001)
• Each base learner is a decision tree, generated using a random
selection of attributes at each node to determine the split.
• Classification: each tree votes and the most popular class is returned.
AdaBoost (Freund and Schapire, 1997)
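For reference, the standard AdaBoost updates can be written as follows, with ε_t the weighted error of weak learner h_t on labels y_i ∈ {−1, +1}, α_t its vote weight, and Z_t a normalizer (notation is the standard one, not taken from the slides):

```latex
\varepsilon_t = \sum_{i:\, h_t(x_i) \ne y_i} w_t(i), \qquad
\alpha_t = \tfrac{1}{2} \ln\!\frac{1 - \varepsilon_t}{\varepsilon_t}, \qquad
w_{t+1}(i) = \frac{w_t(i)\, e^{-\alpha_t\, y_i\, h_t(x_i)}}{Z_t}
```

Misclassified points gain weight, so the next weak learner focuses on them.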
Video generation: Deepfake
Advanced learning
• Federated learning
• Explainable AI
• Generative AI
Federated learning (FL)
• Privacy-preserving models can be trained in heterogeneous
and distributed networks.
(Diagram: the local nodes exchange parameters to generate a global model shared by all nodes)
A typical FL workflow
• Only the updated model will be sent to the server side.
• Actual data based on user behavior need not be included.
• The new model is aggregated on the server side and then distributed to all client devices.
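The workflow can be sketched as federated averaging over two hypothetical clients: each client computes a local update on its own data, and only the updated weights, never the raw data, reach the server.

```python
# Federated-averaging sketch for a 1-parameter linear model y = w*x.
def local_update(w, data, lr=0.1):
    # one pass of gradient descent on the client's private data
    for x, y in data:
        err = w * x - y
        w -= lr * err * x
    return w

def fed_avg(global_w, client_datasets):
    # clients send back weights only; the server averages them
    updates = [local_update(global_w, d) for d in client_datasets]
    return sum(updates) / len(updates)

clients = [[(1, 2), (2, 4)], [(1, 2), (3, 6)]]   # both consistent with y = 2x
w = 0.0
for _ in range(50):                              # 50 communication rounds
    w = fed_avg(w, clients)
print(round(w, 2))  # close to 2.0
```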
Federated learning: Pros and Cons
Federated learning platforms
TensorFlow Federated
Explainable AI (XAI)
(Figure: marked areas in the context image were relevant for the AI answer; hidden areas were irrelevant. Image credit: Turing)
ChatGPT: An introduction
• ChatGPT is built upon GPT-3.5 or GPT-4, which were fine-tuned to target conversational usage.
• It leverages both supervised learning and reinforcement learning in a
process called reinforcement learning from human feedback (RLHF).
DALL·E: An introduction
• DALL·E is a text-to-image model using deep learning to
generate digital images from natural language prompts.
• DALL·E: a multimodal implementation of GPT-3 with 12 billion parameters
• DALL·E 2: a diffusion model with 3.5 billion parameters, conditioned on CLIP image embeddings
• DALL·E 3: released natively into ChatGPT Plus and implemented in Bing's Image Creator
DALL·E 2: Demonstration
Prompt: "A modern architectural building with large glass windows, situated on a cliff overlooking a serene ocean at sunset"
(Figure: the prompt is encoded into a meaningful representation, then decoded into an image. Created by DALL·E 3)
Virtual Interior Design
• "Japanese style, wooden wall, cotton puff"
• "A Van Gogh style painting on the wall" (applied to the original image)
Le, Minh-Hien, Chi-Bien Chu, Khanh-Duy Le, Tam V. Nguyen, Minh-Triet Tran, and Trung-Nghia Le. "VIDES: Virtual Interior Design via Natural Language and Visual Guidance." ISMAR 2023. HCMUS research work.
Audio generation: AI music
• Generative AI empowers creators to generate new warbles,
chimes, measures, and even entire songs.