Session 5
Pierre Michel
pierre.michel@univ-amu.fr
M2 EBDS
2021
1. Convolutional Neural Networks
Until now, you have used classical Artificial Neural Networks (ANN), also called Multi-Layer Perceptrons (MLP).
We used the MNIST dataset, which contains simple low-resolution images of hand-written digits. More realistic datasets involve more complex images (high-resolution color images).
In an MLP, each neuron in the input layer is an input of every neuron in the hidden layer: this is a fully-connected network.
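As an illustration, a single fully-connected layer can be written directly in NumPy. This is a minimal sketch: the hidden-layer size of 10 neurons and the random weights are arbitrary choices for the example, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 5x5 grayscale image, as in the figure below
image = rng.random((5, 5))

# Flatten the image into a 25x1 input vector
x = image.reshape(-1)              # shape (25,)

# One fully-connected hidden layer with 10 neurons (illustrative size):
# every input pixel is connected to every hidden neuron
W = rng.standard_normal((10, 25))  # weight matrix
b = np.zeros(10)                   # bias vector

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

h = sigmoid(W @ x + b)             # hidden activations, shape (10,)
print(h.shape)                     # (10,)
```

Note that even this tiny layer already has 25 x 10 + 10 = 260 parameters; for high-resolution images, fully-connected layers quickly become very large, which motivates the convolutional layers below.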
[Figure: a 5x5 input image (pixel intensities I_ij) is flattened into a 25x1 vector feeding a fully-connected MLP: layer 1 (input), hidden layers 2 to L-1, and output layer L producing h_{W,b}(x).]
Fully-connected networks
Locally-connected networks

[Figure: a 3x3 convolution filter matrix (entries f_11, ..., f_33) slides over the flattened 5x5 input image (25x1, pixel intensities I_ij); each 3x3 window is connected to one neuron of the convolution layer (9 neurons), yielding a 3x3 convolved image with intensities Ĩ_ij.]

with Ĩ_ij = (1/9) (f_11 I_(i-1)(j-1) + f_12 I_(i-1)(j) + f_13 I_(i-1)(j+1) + f_21 I_(i)(j-1) + f_22 I_(i)(j) + f_23 I_(i)(j+1) + f_31 I_(i+1)(j-1) + f_32 I_(i+1)(j) + f_33 I_(i+1)(j+1))
Convolution

For an a x b filter f applied to a window X^(w) of the image, the convolved value is

f(X^(w)) = Σ_{i=1}^{a} Σ_{j=1}^{b} x_ij f_ij
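This windowed sum can be sketched in NumPy. Here the filter is a 3x3 averaging filter (all entries 1/9), which matches the 1/9 normalization in the Ĩ_ij formula of the slides; any other 3x3 filter could be used in its place.

```python
import numpy as np

rng = np.random.default_rng(1)
I = rng.random((5, 5))        # 5x5 input image
f = np.ones((3, 3)) / 9.0     # 3x3 averaging filter (entries 1/9)

def convolve_valid(I, f):
    """Slide the filter over the image (no padding, stride 1)."""
    a, b = f.shape
    out_h = I.shape[0] - a + 1
    out_w = I.shape[1] - b + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # f(X^(w)) = sum over the window of x_ij * f_ij
            out[i, j] = np.sum(I[i:i+a, j:j+b] * f)
    return out

I_tilde = convolve_valid(I, f)
print(I_tilde.shape)          # (3, 3): 5x5 image, 3x3 filter -> 3x3 output
```

With the averaging filter, each Ĩ_ij is simply the mean intensity of the corresponding 3x3 window, as in the slide formula.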
Pooling: illustration

[Figure: a 2x2 pooling window slides over the 3x3 convolved image (intensities Ĩ_ij), producing a 2x2 pooled output image; P_ij is the pooled value at position (i, j). The pooling layer has 4 neurons.]
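The pooling step can be sketched in NumPy as follows. A 2x2 window with stride 1 turns the 3x3 convolved image into a 2x2 pooled image, as in the figure; the choice of max pooling (rather than average pooling) is ours for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
I_tilde = rng.random((3, 3))   # convolved image from the previous step

def pool2x2(I, stride=1, mode="max"):
    """Apply a 2x2 pooling window; a 3x3 input with stride 1 -> 2x2 output."""
    out_h = (I.shape[0] - 2) // stride + 1
    out_w = (I.shape[1] - 2) // stride + 1
    P = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = I[i*stride:i*stride+2, j*stride:j*stride+2]
            P[i, j] = window.max() if mode == "max" else window.mean()
    return P

P = pool2x2(I_tilde)
print(P.shape)                 # (2, 2)
```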
Architecture
Finally, the gradients with respect to the filter matrices are given by:

∇_{W^(l)} J(W, b; x, y) = Σ_{i=1}^{n} (a_i^(l)) * δ̃_k^(l+1)

∇_{b^(l)} J(W, b; x, y) = Σ_{i=1}^{a} Σ_{j=1}^{b} (δ_k^(l+1))_ij

where (a_i^(l)) * δ̃_k^(l+1) is the transposed convolution between input i of layer l and the error of filter k.
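These gradient formulas can be checked numerically on a single-filter example. The squared-error loss and the target T below are hypothetical choices made for this sketch; the key point is that the gradient of the loss with respect to the filter is itself a (valid) convolution of the input with the output error map.

```python
import numpy as np

rng = np.random.default_rng(3)
I = rng.random((5, 5))             # input image
f = rng.standard_normal((3, 3))    # filter
T = rng.random((3, 3))             # hypothetical target output

def conv_valid(I, f):
    a, b = f.shape
    out = np.empty((I.shape[0] - a + 1, I.shape[1] - b + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(I[i:i+a, j:j+b] * f)
    return out

# Forward pass and error term delta at the output
O = conv_valid(I, f)
delta = O - T                       # dJ/dO for J = 0.5 * sum((O - T)^2)

# Analytic gradient: convolve the input with the error map
grad_f = conv_valid(I, delta)       # shape (3, 3), same as f

# Numerical check of one filter entry by finite differences
eps = 1e-6
f2 = f.copy(); f2[0, 0] += eps
J  = 0.5 * np.sum((conv_valid(I, f)  - T) ** 2)
J2 = 0.5 * np.sum((conv_valid(I, f2) - T) ** 2)
print(abs((J2 - J) / eps - grad_f[0, 0]) < 1e-4)  # prints True
```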
CNN: recap
[Figures repeated from the previous slides: the locally-connected convolution layer, the pooling layer, and the final fully-connected MLP.]
import numpy as np
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
https://keras.io/getting_started/
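Building on these imports, a minimal Keras CNN combining the three building blocks of this session (convolution, pooling, fully-connected output) could be sketched as follows. The number of filters and the layer sizes are illustrative choices, not prescribed by the slides.

```python
import tensorflow as tf
from tensorflow import keras

# Minimal CNN for 28x28 grayscale images (e.g. Fashion-MNIST):
# one convolution layer, one pooling layer, then a dense classifier.
model = keras.Sequential([
    keras.layers.Input(shape=(28, 28, 1)),           # image size
    keras.layers.Conv2D(32, kernel_size=3, activation="relu"),
    keras.layers.MaxPooling2D(pool_size=2),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation="softmax"),    # 10 classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
print(model.output_shape)   # (None, 10)
```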
Pierre Michel Prediction methods and Machine learning 26/28
2. CNN with Keras
2.1. Example of CNN using Keras: Fashion-MNIST