
Artificial Intelligence Summer School: Machine Learning & Deep Learning

• Day 9: Popular CNN Architectures, Transfer Learning, Fine-tuning

Dr Rinki Gupta, Prof, ASET, ACAI
Amity Centre for Artificial Intelligence, Amity University, Noida, India


Outline: Day 9

• Popular CNN Architectures
  • ILSVRC
  • AlexNet
  • VGGNet
• Transfer Learning
  • Method
  • Types
• Fine-tuning and its Applications



Popular CNN Architectures



ImageNet Dataset
• ImageNet is a dataset of over 15 million labelled high-resolution images from ~22,000 categories
• ImageNet Large Scale Visual Recognition Challenge (ILSVRC):
  • Held between 2010 and 2017
  • Uses ~1000 categories, each with ~1000 images

https://image-net.org/index.php



Popular CNN Architectures
• ImageNet Large Scale Visual Recognition Challenge (ILSVRC) winners:
  • AlexNet (8 layers)
  • VGGNet (16 or 19 layers)
  • GoogLeNet/Inception (22 layers)
  • ResNet (152 layers; ResNet-50 has about 25 million parameters)

[Figure: number of layers of ILSVRC winners over the years]


Popular CNN Architectures
Top-5 error rate, e.g.:
• Input: cat image
• Neural network outputs:
  • Tiger: 0.4, Dog: 0.3, Cat: 0.1, Lynx: 0.09, Lion: 0.08, Bird: 0.02, Bear: 0.01
• Top-1 output contains Tiger (counted as a top-1 error)
• Top-5 output contains Cat (not counted as a top-5 error)
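As a quick illustration, here is a minimal NumPy sketch of the top-5 check; the class names and probabilities are simply the example values above, not real model outputs:

import numpy as np

# Hypothetical class scores for one image whose true label is "Cat"
classes = ["Tiger", "Dog", "Cat", "Lynx", "Lion", "Bird", "Bear"]
probs = np.array([0.4, 0.3, 0.1, 0.09, 0.08, 0.02, 0.01])
true_label = "Cat"

# Indices of the 5 largest scores, highest first
top5 = [classes[i] for i in np.argsort(probs)[::-1][:5]]

print("Top-1 prediction:", top5[0])          # Tiger -> counts as a top-1 error
print("Top-5 predictions:", top5)            # includes Cat
print("Top-5 correct:", true_label in top5)  # True -> not a top-5 error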



AlexNet
• Developed by Alex Krizhevsky, together with Ilya Sutskever and his PhD supervisor Geoffrey E. Hinton, in 2012

• Winner of the ILSVRC 2012 challenge (top-5 error of 15.3%)

• 8-layer CNN (5 Conv+MaxPool and 3 FC)

• More than 60 million parameters

Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in Neural Information Processing Systems 25 (2012): 1097-1105.



AlexNet
• Input: 227x227x3
• Conv filter sizes: 11x11, 5x5, three 3x3
• 3 MaxPool layers
• ReLU for hidden units
• Softmax for output
• Last pooling output flattened to 6x6x256 = 9216 features
• Dropout (p = 0.5) after the first two FC layers
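A minimal Keras sketch of an AlexNet-style network following the layer sizes listed above (LRN layers and the original two-GPU split are omitted), so it is an illustrative approximation rather than the exact original model:

import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(96, 11, strides=4, activation="relu", input_shape=(227, 227, 3)),
    layers.MaxPooling2D(3, strides=2),
    layers.Conv2D(256, 5, padding="same", activation="relu"),
    layers.MaxPooling2D(3, strides=2),
    layers.Conv2D(384, 3, padding="same", activation="relu"),
    layers.Conv2D(384, 3, padding="same", activation="relu"),
    layers.Conv2D(256, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(3, strides=2),
    layers.Flatten(),                        # 6x6x256 = 9216 features
    layers.Dense(4096, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(4096, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1000, activation="softmax"),
])
model.summary()                              # roughly 60 million parameters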



AlexNet
• ReLU activation (avoids vanishing gradients)
• Data augmentation (reduces overfitting)
• Dropout regularization (avoids co-adaptation)
• Introduced Local Response Normalization (LRN)
  • LRN is a non-trainable layer that square-normalizes the values in a feature map within a local neighbourhood (inter-channel or intra-channel)
  • It performs lateral inhibition: the capacity of a neuron to reduce the activity of its neighbours (see the sketch below)
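A small TensorFlow sketch of inter-channel LRN using the built-in op; the feature-map shape is an arbitrary assumption, and the constants roughly follow the AlexNet paper:

import tensorflow as tf

feature_map = tf.random.normal([1, 13, 13, 256])   # (batch, height, width, channels)

# Each value is divided by a squared sum of activations in neighbouring
# channels, i.e. strongly active channels suppress their neighbours
lrn_out = tf.nn.local_response_normalization(
    feature_map, depth_radius=5, bias=2.0, alpha=1e-4, beta=0.75)

print(lrn_out.shape)                               # same shape as the input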



VGGNet
• Developed by the Visual Geometry Group (Oxford) in 2014
• VGG16 was 2nd in the ILSVRC 2014 classification challenge (top-5 classification error of 7.32%)
• Characterized by simplicity and depth
• All Conv layers use 3x3 filters with stride 1 and SAME padding
• All max pooling layers use 2x2 filters with stride 2
• VGG16: 16-layer CNN (16 layers with trainable parameters, over 134 million parameters); VGG19: 19-layer CNN (about 144 million parameters)

Karen Simonyan and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).



VGGNet
• Conv = 3x3 filter, s = 1, SAME padding
• Max pool = 2x2, s = 2 (5 max pooling layers)
• ReLU activation in all hidden units
• Softmax activation in the output units
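To inspect this architecture directly, VGG16 can be loaded from Keras Applications (a minimal sketch; the ImageNet weights, roughly 500 MB, are downloaded on first use):

from tensorflow.keras.applications import VGG16

vgg16 = VGG16(weights="imagenet")   # 16 weight layers, 1000-class softmax output
vgg16.summary()                     # prints well over 134 million parameters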



Stacking of Conv layers
• Multiple stacked Conv layers lead to a wide effective receptive field: e.g., two stacked 3x3 Conv layers have a 5x5 effective receptive field
• In VGG, varying filter sizes are implemented by stacking Conv layers with fixed 3x3 filter sizes
• Limitation: Training is very slow

[Figure: effective receptive field of two stacked 3x3 convolutions vs a single 5x5 convolution]
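The sketch below compares two stacked 3x3 convolutions with a single 5x5 convolution that has the same effective receptive field; the channel count C = 64 and the input size are arbitrary assumptions made only for the comparison:

from tensorflow.keras import layers, models

C = 64  # assumed number of input/output channels

# Two stacked 3x3 convolutions: 5x5 effective receptive field
stacked = models.Sequential([
    layers.Conv2D(C, 3, padding="same", activation="relu", input_shape=(32, 32, C)),
    layers.Conv2D(C, 3, padding="same", activation="relu"),
])

# A single 5x5 convolution with the same receptive field
single = models.Sequential([
    layers.Conv2D(C, 5, padding="same", activation="relu", input_shape=(32, 32, C)),
])

print(stacked.count_params())   # 2 * (3*3*C*C + C) = 73,856
print(single.count_params())    #       5*5*C*C + C = 102,464

The stacked version uses fewer parameters and applies two ReLU non-linearities instead of one.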



Transfer Learning
• Analogy: knowing one programming language makes learning another programming language simpler



How does Transfer Learning work for Deep Learning models?
• The machine exploits the knowledge gained from a source task to improve generalization on a target task
• Learning in a new task is improved through the transfer of knowledge from a related task that has already been learned
• When to use TL? When the new dataset is small and similar to the original dataset



Steps in Transfer Learning



1. Obtain a pre-trained model and 2. Create a base model
• Commonly used pre-trained models include (see the sketch after this list):
  • VGG-16
  • VGG-19
  • Inception V3
  • Xception
  • ResNet-50
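A minimal Keras sketch of steps 1-2, assuming VGG16 as the pre-trained model; include_top=False drops the original 1000-class classifier so the network can serve as a base model:

from tensorflow.keras.applications import VGG16

base_model = VGG16(weights="imagenet",        # 1. pre-trained ImageNet weights
                   include_top=False,         # 2. keep only the convolutional base
                   input_shape=(224, 224, 3))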



3. Freeze layers
• Freezing the starting layers of the pre-trained model is essential to avoid making the model re-learn the basic features it has already captured.
• If we do not freeze the initial layers, we lose all the learning that has already taken place; this would be no different from training the model from scratch and would waste time, resources, etc. (see the sketch after this list).
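A sketch of step 3, re-creating the VGG16 base model from the steps 1-2 sketch and freezing it:

from tensorflow.keras.applications import VGG16

base_model = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))

# Freeze the whole convolutional base so its ImageNet weights are not updated
base_model.trainable = False

# Equivalent per-layer form, useful when only some layers should stay frozen
for layer in base_model.layers:
    layer.trainable = False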



4. Add new trainable layers

• The only knowledge we reuse from the base model is its feature-extraction layers. We need to add new layers on top of them to predict the specialized task of the model; these are generally the final output layers (see the sketch below).
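A sketch of step 4, stacking a new trainable head on the frozen base model; the head sizes and the 2-class output are illustrative assumptions:

from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base_model = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base_model.trainable = False

model = models.Sequential([
    base_model,                                # frozen feature extractor
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),      # new trainable layers
    layers.Dropout(0.5),
    layers.Dense(2, activation="softmax"),     # new 2-class output layer
])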



5. Train the new layers

• The pre-trained model's final output will most likely differ from the output we want for our model. For example, pre-trained models trained on the ImageNet dataset will output 1000 classes.
• However, if we need our model to work for two classes, we have to train the model with a new output layer in place (see the sketch after this list).
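A sketch of step 5, training only the new head while the base stays frozen. It continues the model from the step 4 sketch; train_ds and val_ds are placeholder names for your own tf.data datasets of images with integer labels.

import tensorflow as tf

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

history = model.fit(train_ds, validation_data=val_ds, epochs=5)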



Fine-tune the model

• One method of improving the performance further is fine-tuning.
• Fine-tuning involves unfreezing some part of the base model and training the entire model again on the whole dataset at a very low learning rate.
• A low learning rate helps improve the performance of the model on the new dataset while reducing the risk of overfitting (see the sketch after this list).
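A fine-tuning sketch, continuing the same base_model and model from the previous sketches: unfreeze part of the base model (here, roughly the last VGG16 block; how many layers to unfreeze is a judgment call) and retrain with a much lower learning rate.

import tensorflow as tf

base_model.trainable = True
for layer in base_model.layers[:-4]:     # keep everything except the last block frozen
    layer.trainable = False

# Re-compile after changing trainable flags, using a very low learning rate
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(train_ds, validation_data=val_ds, epochs=5)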



Applications of Transfer Learning

• One of the most popular and successful applications of transfer learning in neural networks is image recognition: the task of identifying and classifying objects, faces, scenes, or emotions in images.
• Many pre-trained neural networks trained on large and diverse image datasets, such as ImageNet or COCO, can be used as feature extractors or fine-tuned for new image recognition tasks.
• For example, you can use a pre-trained network like ResNet or VGG to extract features from your own images, and then add a new classifier layer on top to train on your specific image recognition problem, such as flower type identification or medical image diagnosis (see the sketch after this list).
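A minimal feature-extraction sketch with VGG16; random arrays stand in for your own images, and for a real task the 512-dimensional features would be fed to a small classifier such as logistic regression or a Dense layer:

import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input

# Frozen convolutional base with global average pooling -> one vector per image
extractor = VGG16(weights="imagenet", include_top=False, pooling="avg")

images = np.random.rand(4, 224, 224, 3) * 255.0    # stand-in for real RGB images
features = extractor.predict(preprocess_input(images))

print(features.shape)                              # (4, 512) feature vectors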



Summary

• There are many popular CNN architectures in the literature, such as AlexNet, VGGNet, GoogLeNet, and ResNet
• The above-mentioned models were winners or top performers in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
• In Transfer Learning, the machine exploits the knowledge gained from a source task to improve generalization on a target task
• Fine-tuning involves unfreezing some part of the base model and training the entire model again on the whole dataset at a very low learning rate






Thank You!

Any Questions?

