
Prediction methods and Machine learning

Session V

Pierre Michel
pierre.michel@univ-amu.fr

M2 EBDS

2021
1. Convolutional Neural Networks


1.1 Introduction to Convolutional Neural Networks


Artificial neural networks: recap

Until now, you have used classical Artificial Neural Networks (ANN),
also called Multi-Layer Perceptrons (MLP).
We used the MNIST dataset, which contains simple, low-resolution images
of hand-written digits. More realistic datasets involve more complex
images (high-resolution color images).
In an MLP, each neuron in the input layer feeds every neuron in the
hidden layer: this is a fully-connected network.


Fully-connected networks: illustration

[Figure: the 5 × 5 input image, with pixel intensities I_ij, is flattened into a
25 × 1 vector (layer 1, input); every input neuron feeds every neuron of the next
layer, through hidden layers 2 to L − 1, up to the output layer L producing h_{W,b}(x).]

Figure 1: Example of fully-connected neural network


Fully-connected networks

Considering the MNIST dataset (28 × 28 images), we have seen that a
fully-connected MLP with a simple architecture (a 3- or 4-layer network)
is sufficient to obtain satisfactory performance.
It is computationally feasible to use all of the features (28 × 28 = 784
features) when training the network.
Problem: with larger images (96 × 96), training on all the features becomes
very computationally expensive: the feedforward and backpropagation passes
are much slower to compute.
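
To see the scale of the problem, here is a quick back-of-the-envelope parameter
count (a minimal sketch; the hidden-layer width of 100 is an arbitrary assumption,
not a value from these slides):

import numpy as np

# Parameters of one fully-connected layer: inputs * hidden + hidden biases.
for side in (28, 96):
    n_inputs = side * side
    hidden = 100  # assumed hidden-layer width, for illustration only
    n_params = n_inputs * hidden + hidden
    print(f"{side}x{side} image: {n_inputs:,} inputs -> {n_params:,} parameters")
# 28x28:   784 inputs ->  78,500 parameters
# 96x96: 9,216 inputs -> 921,700 parameters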


Locally-connected networks

Solution: restrict the connections between hidden neurons and input
neurons: this reduces the number of parameters to estimate.
Each hidden neuron is connected to only a subset of input neurons.
In the case of time series (1-dimensional objects), each hidden neuron is
connected to a contiguous region of successive values (e.g., the values at
T − 2, T − 1, T, T + 1, T + 2), as sketched below.
In the case of images (2-dimensional objects), each hidden neuron is
connected to a contiguous region of pixels.
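
A minimal NumPy sketch of local connectivity in the 1-D case (the toy series
and the moving-average filter are assumptions for illustration): each hidden
value depends only on the five inputs around time T.

import numpy as np

series = np.arange(10, dtype=float)  # toy time series
filt = np.ones(5) / 5.0              # assumed local filter: a moving average

# Each hidden neuron sees only the window [T-2, T+2] of the input.
hidden = np.array([
    series[t - 2:t + 3] @ filt
    for t in range(2, len(series) - 2)
])
print(hidden)  # 6 hidden values computed from 10 inputs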


Locally-connected networks: illustration


[Figure: the 5 × 5 input image, with pixel intensities I_ij, is flattened into a
25 × 1 vector (input layer); a 3 × 3 convolution filter matrix (f_uv) scans the
image window by window, and each of the 9 neurons of the convolution layer is
connected only to one 3 × 3 window of the input, producing the 3 × 3 convolved
image with intensities Ĩ_ij.]

with

\tilde{I}_{ij} = \frac{1}{9} \left( f_{11} I_{(i-1)(j-1)} + f_{12} I_{(i-1)(j)} + f_{13} I_{(i-1)(j+1)} + f_{21} I_{(i)(j-1)} + f_{22} I_{(i)(j)} + f_{23} I_{(i)(j+1)} + f_{31} I_{(i+1)(j-1)} + f_{32} I_{(i+1)(j)} + f_{33} I_{(i+1)(j+1)} \right)

Figure 2: Example of locally-connected (convolutional) neural network


1.2 Convolutional Neural Networks


Convolution

A neural network containing convolutional layers is called a convolutional
neural network (CNN).
The main operation in a CNN is the convolution, which can be seen as a
feature extraction method (in the case of images, a filtering method).
Instead of applying a function to the entire image, a convolution scans
small windows and applies a kernel (typically an element-wise multiplication
followed by a sum) to each window. These kernels act as image filters.
Main effect: reduce the number of parameters.


Convolution

More formally, we can see an image X as a matrix of size r × c. Let us
consider k windows of size a × b, with a ≤ r and b ≤ c, scanning the image.
For each window w, representing a submatrix of X denoted X^{(w)}, a
convolution is computed, defined as a function f : R^{a×b} → R with:

f(X^{(w)}) = \sum_{i=1}^{a} \sum_{j=1}^{b} x_{ij}^{(w)} f_{ij}

Each of the k hidden maps of neurons has size (r − a + 1) × (c − b + 1) (the
size of the convolved image).
Example: r = 5, c = 5, a = 3, b = 3, so (r − a + 1) = 3 and (c − b + 1) = 3.
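
A minimal NumPy sketch of this definition (the toy 5 × 5 image and the
center-picking filter are assumptions for illustration):

import numpy as np

r, c, a, b = 5, 5, 3, 3
X = np.arange(r * c, dtype=float).reshape(r, c)  # toy 5x5 image
F = np.zeros((a, b))
F[1, 1] = 1.0  # assumed 3x3 filter (picks the center pixel)

# One output per window: f(X^(w)) = sum over i,j of x_ij^(w) * f_ij
conv = np.array([
    [np.sum(X[i:i + a, j:j + b] * F) for j in range(c - b + 1)]
    for i in range(r - a + 1)
])
print(conv.shape)  # (3, 3): (r - a + 1) x (c - b + 1)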


Convolution and curse of high dimension

Consider now larger images (say, 100 × 100 images).

Convolutional layer: we consider 100 windows (i.e., 100 hidden maps) of
size 10 × 10.
Each convolution produces an output of size (100 − 10 + 1) × (100 − 10 + 1) =
8281, so the total number of outputs of the convolutional layer is
100 × 8281 = 828,100.
Comparing with the size of the input image (100 × 100 = 10,000 pixels),
we can observe that:
Convolution implies an increase of the dimension.
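
The arithmetic behind these numbers, as a quick check:

m, n, k = 100, 10, 100           # image side, filter side, number of filters
per_filter = (m - n + 1) ** 2    # 91 * 91 = 8281 outputs per filter
total = k * per_filter           # 828,100 outputs for the whole layer
print(per_filter, total, m * m)  # 8281 828100 10000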


Pooling

Solution: pool the convolved images.

A pooling layer is also a locally-connected layer.
Idea: once a region of pixels has been convolved, a pooling layer
aggregates, for each pooling window, the values obtained, using a simple
function (typically mean pooling, using the mean value, or max pooling,
using the maximum value). The output for the whole convolved image is a
pooled image of smaller size.
Main effect: reduce the dimension.
Formally, compute \max(X^{(w)}) for max pooling, or
\frac{1}{ab} \sum_{i=1}^{a} \sum_{j=1}^{b} x_{ij}^{(w)} for mean pooling.
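
A minimal NumPy sketch of both pooling rules over non-overlapping 2 × 2
windows (the toy 4 × 4 input is an assumption for illustration):

import numpy as np

X = np.arange(16, dtype=float).reshape(4, 4)  # toy convolved image
p = 2                                         # pooling window size

windows = [
    X[i:i + p, j:j + p]
    for i in range(0, X.shape[0], p)
    for j in range(0, X.shape[1], p)
]
max_pooled = np.array([w.max() for w in windows]).reshape(2, 2)
mean_pooled = np.array([w.mean() for w in windows]).reshape(2, 2)
print(max_pooled)   # each entry is the max over one 2x2 region
print(mean_pooled)  # each entry is the mean over one 2x2 region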


Pooling: illustration
[Figure: the convolutional step of Figure 2, repeated: a 3 × 3 filter scans
the 5 × 5 input image, and the convolution layer of 9 neurons produces the
3 × 3 convolved image Ĩ.]

Figure 3: Example of locally-connected convolutional layer


Pooling: illustration
[Figure: the 3 × 3 convolved image, with intensities Ĩ_ij, is flattened into a
9 × 1 vector; a 2 × 2 pooling matrix scans it, and each of the 4 neurons of the
pooling layer aggregates one 2 × 2 window, producing the 2 × 2 pooled image with
values P_ij (the pooled value in (i, j)).]

Figure 4: Example of locally-connected pooling neural network


Convolutional neural network

A CNN contains one or more convolutional layers, often followed by a
pooling step, and then by one or more fully-connected layers (a
traditional multi-layer neural network: see Session IV slides).
The architecture of a CNN is typically designed for inputs with a 1D
structure (any time series) or a 2D structure (image, speech signal). It
relies on local connections and pooling steps.
Advantages: easier training, fewer parameters than fully-connected
networks.


Architecture

A CNN is a sequence of convolutional and pooling layers, often followed by
fully-connected layers.
Formally, the input of a CNN is an m × m × r image, with m rows of m
pixels and r channels (for an RGB image, r = 3).
The convolutional layer applies k filters of size n × n × q, with n < m and
q ≤ r.
Locally-connected structure: it outputs k convolved images of size
(m − n + 1) × (m − n + 1), which are mean- or max-pooled over p × p
contiguous regions (traditionally 2 ≤ p ≤ 5).
Finally, this can be followed by fully (densely) connected layers (a
standard multilayer neural network), as sketched below.
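
A minimal Keras sketch of this architecture (the concrete values m = 28,
r = 1, k = 32, n = 3, p = 2 and the dense-layer widths are assumptions for
illustration, not values prescribed in these slides):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),                       # m x m x r input
    layers.Conv2D(32, kernel_size=3, activation="relu"),  # k filters of size n x n
    layers.MaxPooling2D(pool_size=2),                     # p x p max pooling
    layers.Flatten(),
    layers.Dense(128, activation="relu"),                 # fully-connected layer
    layers.Dense(10, activation="softmax"),               # e.g., 10 output classes
])
model.summary()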


Backpropagation: fully-connected layers

Let δ^{(l+1)} be the error term of layer (l + 1) in a network.

Let J(W, b; x, y) be the cost function, with parameters (W, b), and let
(x, y) be the training data and target values. If layer l is fully-connected
to layer (l + 1), its error term is:

\delta^{(l)} = \left( (W^{(l)})^T \delta^{(l+1)} \right) \cdot f'(z^{(l)})

and the gradients are as follows:

\nabla_{W^{(l)}} J(W, b; x, y) = \delta^{(l+1)} (a^{(l)})^T

\nabla_{b^{(l)}} J(W, b; x, y) = \delta^{(l+1)}
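
A minimal NumPy sketch of these three formulas for one fully-connected layer
(the shapes and the sigmoid activation are assumptions for illustration):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Assumed shapes: 4 neurons in layer l, 3 neurons in layer l+1.
W_l = np.random.randn(3, 4)         # weights from layer l to layer l+1
a_l = np.random.randn(4, 1)         # activations a^(l)
z_l = np.random.randn(4, 1)         # pre-activations z^(l)
delta_next = np.random.randn(3, 1)  # error term delta^(l+1)

f_prime = sigmoid(z_l) * (1 - sigmoid(z_l))  # f'(z^(l)) for the sigmoid
delta_l = (W_l.T @ delta_next) * f_prime     # delta^(l)
grad_W = delta_next @ a_l.T                  # gradient w.r.t. W^(l)
grad_b = delta_next                          # gradient w.r.t. b^(l)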


Backpropagation: locally-connected layers

If layer l is a convolutional and pooling layer, then the error is
backpropagated as follows:

\delta_k^{(l)} = \text{upsample}\left( (W_k^{(l)})^T \delta_k^{(l+1)} \right) \cdot f'(z_k^{(l)})

where k refers to filter k and f'(z_k^{(l)}) is the derivative of the
activation function.
The upsample operator depends on the type of pooling used (see the sketch
below):

• max pooling: assigns all the error to the neuron corresponding to the
max-pooled value.
• mean pooling: distributes the error uniformly over the neurons of the
previous layer.
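
A minimal NumPy sketch of the two upsample rules for a single 2 × 2 pooling
window (the toy values are assumptions for illustration):

import numpy as np

window = np.array([[1.0, 5.0],
                   [3.0, 2.0]])  # activations in one 2x2 pooling window
delta = 0.8                      # error attached to the pooled output

# Max pooling: all the error goes to the position of the maximum.
up_max = np.zeros_like(window)
up_max[np.unravel_index(window.argmax(), window.shape)] = delta

# Mean pooling: the error is spread uniformly over the window.
up_mean = np.full_like(window, delta / window.size)

print(up_max)   # [[0.  0.8] [0.  0. ]]
print(up_mean)  # [[0.2 0.2] [0.2 0.2]]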


Backpropagation: locally-connected layers

Finally, the gradients with respect to the filter matrices are given by:

\nabla_{W_k^{(l)}} J(W, b; x, y) = \sum_{i=1}^{n} (a_i^{(l)}) * \tilde{\delta}_k^{(l+1)}

\nabla_{b_k^{(l)}} J(W, b; x, y) = \sum_{i=1}^{a} \sum_{j=1}^{b} (\delta_k^{(l+1)})_{ij}

where (a_i^{(l)}) * \tilde{\delta}_k^{(l+1)} is the transposed convolution
between input i of layer l and the error of filter k.
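
A minimal NumPy sketch of the filter gradient for a single input map and a
single filter (the shapes are assumptions for illustration): each filter
weight accumulates the error over every input patch it touched.

import numpy as np

a = np.random.randn(5, 5)      # input map a_i^(l), assumed 5x5
delta = np.random.randn(3, 3)  # error delta_k^(l+1) of the 3x3 convolved output
n = a.shape[0] - delta.shape[0] + 1  # filter side (here 3)

grad_W = np.zeros((n, n))
for u in range(n):
    for v in range(n):
        # Weight (u, v) multiplied input a[u+i, v+j] for every output (i, j).
        grad_W[u, v] = np.sum(a[u:u + delta.shape[0], v:v + delta.shape[1]] * delta)

grad_b = delta.sum()  # bias gradient: sum of the errors of filter k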


CNN: recap
[Figure: the convolutional layer of Figure 2, repeated: a 3 × 3 filter scans
the 5 × 5 input image and produces the 3 × 3 convolved image Ĩ.]

Figure 5: Convolutional layer


CNN: recap
[Figure: the pooling layer of Figure 4, repeated: 2 × 2 windows aggregate the
3 × 3 convolved image into the 2 × 2 pooled image P.]

Figure 6: Pooling layer


CNN: recap

[Figure: the fully-connected network of Figure 1, repeated: the image is
flattened and fed to a dense multi-layer network ending in the output h_{W,b}(x).]

Figure 7: Dense layer


2. CNN with Keras


2.1 Example of CNN using Keras: Fashion-MNIST


Load the library Keras


If the installation of TensorFlow was successful, you should be able to run
the following commands:

import numpy as np
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt

If not, here is some advice (Unix shell or Anaconda Prompt):

• install Keras (CPU version): pip install keras
• install TensorFlow: pip install tensorflow
• upgrade NumPy: pip install numpy --upgrade

https://keras.io/getting_started/

Load the data

Keras uses the classical train/test split.

It comes with a function, load_data(), that loads the data in this form:

from keras.datasets import fashion_mnist
(train_X, train_Y), (test_X, test_Y) = fashion_mnist.load_data()
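
A short follow-up sketch to check the shapes and prepare the data for a CNN
(the rescaling to [0, 1] and the added channel dimension are standard practice,
suggested here rather than taken from the slides):

import numpy as np
import matplotlib.pyplot as plt
from keras.datasets import fashion_mnist

(train_X, train_Y), (test_X, test_Y) = fashion_mnist.load_data()
print(train_X.shape, test_X.shape)  # (60000, 28, 28) (10000, 28, 28)

# Scale pixel intensities to [0, 1] and add the channel dimension (r = 1).
train_X = train_X.astype("float32") / 255.0
test_X = test_X.astype("float32") / 255.0
train_X = train_X[..., np.newaxis]  # shape (60000, 28, 28, 1)
test_X = test_X[..., np.newaxis]

plt.imshow(train_X[0, :, :, 0], cmap="gray")  # first training image
plt.title(f"label: {train_Y[0]}")
plt.show()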
