
Convolutional Neural Network

Convolutional Neural Network

• A convolutional neural network (CNN) is a type of artificial neural network used in image recognition and processing that is specifically designed to process pixel data.

• CNNs are made up of neurons that have learnable weights and biases.

• Each neuron receives some inputs, performs a dot product with its weights, and optionally applies a non-linear activation, as sketched below.
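A minimal sketch of that neuron computation (NumPy; the input, weight, and bias values are made up for illustration):

import numpy as np

x = np.array([0.5, -1.0, 2.0])   # inputs to the neuron (illustrative)
w = np.array([0.8, 0.1, -0.3])   # learnable weights (illustrative)
b = 0.2                          # learnable bias

z = np.dot(w, x) + b             # dot product plus bias
a = max(0.0, z)                  # optional non-linearity (ReLU)
print(z, a)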


Convolutional Neural Network
Types of layers in a CNN:

• Convolution Layer

• Pooling Layer

• Fully Connected Layer


CONVOLUTION LAYER
How Kernel Works (figure slides)
CONVOLUTION LAYER (cont.)
• CNNs make use of filters (also known as kernels) to detect what features, such as edges, are present throughout an image.
• A filter is just a matrix of values, called weights, that are trained to detect specific features.
• The filter moves over each part of the image to check whether the feature it is meant to detect is present.
CONVOLUTION LAYER (cont.)
In the following example, a filter that is in charge of checking for right-hand curves is passed over a part of the image. Since that part of the image contains the same curve that the filter is looking for, the result of the convolution operation is a large number.
CONVOLUTION LAYER (cont.)
But when that same filter is passed over a part of the image with a considerably different set of edges, the convolution's output is small, meaning that there was no strong presence of a right-hand curve.
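A minimal NumPy sketch of the idea on these two slides: the same filter is applied to two patches (element-wise multiply and sum, i.e. a dot product); the patch containing the filter's pattern gives a large response, the other a small one. All values are illustrative:

import numpy as np

# A 3x3 filter that (hypothetically) responds to one diagonal pattern.
kernel = np.array([[0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0]], dtype=float)

matching_patch = np.array([[0, 0, 9],
                           [0, 9, 0],
                           [9, 0, 0]], dtype=float)   # contains the pattern

other_patch = np.array([[9, 0, 0],
                        [0, 9, 0],
                        [0, 0, 9]], dtype=float)      # a different pattern

# Convolution at one location = element-wise multiply and sum.
print(np.sum(kernel * matching_patch))  # large response (27.0)
print(np.sum(kernel * other_patch))     # small response (9.0)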
Stride
• Stride is the number of pixels by which the filter shifts over the input matrix. When the stride is 1, we move the filter 1 pixel at a time; when the stride is 2, we move the filter 2 pixels at a time, and so on.
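A small sketch of the effect of stride, assuming a toy 6x6 input and a 3x3 filter (values illustrative):

import numpy as np

image = np.arange(36, dtype=float).reshape(6, 6)   # toy 6x6 input
kernel = np.ones((3, 3)) / 9.0                     # 3x3 averaging filter

def conv2d(img, k, stride):
    kh, kw = k.shape
    out_h = (img.shape[0] - kh) // stride + 1
    out_w = (img.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = img[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * k)
    return out

print(conv2d(image, kernel, stride=1).shape)  # (4, 4): filter moves 1 pixel at a time
print(conv2d(image, kernel, stride=2).shape)  # (2, 2): filter moves 2 pixels at a time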
Padding
• Padding is a critical technique used to manage the spatial dimensions of input data.
• Padding is the process of adding layers of zeros or other values outside the actual data in an
input matrix.
• The primary purpose of padding is to preserve the spatial size of the input so that the output
after applying filters (kernels) remains the same size, or to adjust it according to the desired
output dimensions.
Types of Padding
There are two common types of padding used in neural networks:
• Valid Padding: This type of padding involves no padding at all. The convolution operation is performed only on the valid overlap between the filter and the input. As a result, the output dimensions will be smaller than the input dimensions.
Types of Padding
• Same Padding: In this approach, padding is added to the input so that the output dimensions after the convolution operation are the same as the input dimensions. This is typically achieved by adding an appropriate number of zero-value pixels around the input.
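A sketch of the two modes, assuming a toy 6x6 input, a 3x3 filter, and stride 1; the output size follows out = (n + 2p - f) / s + 1:

import numpy as np

image = np.random.rand(6, 6)      # toy 6x6 input
f, s = 3, 1                       # 3x3 filter, stride 1

# Valid padding: no padding, the output shrinks.
valid_out = (image.shape[0] - f) // s + 1
print(valid_out)                  # 4 -> output is 4x4

# Same padding: pad with zeros so the output keeps the input size.
p = (f - 1) // 2                  # one ring of zeros for a 3x3 filter
padded = np.pad(image, p, mode="constant", constant_values=0)
same_out = (padded.shape[0] - f) // s + 1
print(padded.shape, same_out)     # (8, 8), 6 -> output is 6x6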
Pooling Layer
● Pooling layers are one of the building blocks of Convolutional Neural Networks. Where convolutional layers extract features from images, pooling layers consolidate the features learned by the CNN. Their purpose is to gradually shrink the representation's spatial dimension to minimize the number of parameters and computations in the network.

● A pooling layer downsamples the output of the convolutional layers by sliding a filter of some size with some stride over the input and calculating the maximum or average of each region it covers.
Types of Pooling Layers:
Max Pooling
Max pooling is a pooling operation that selects the maximum element from the region of the feature map covered by the filter. Thus, the output after the max-pooling layer is a feature map containing the most prominent features of the previous feature map.
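A minimal max-pooling sketch (NumPy, 2x2 window, stride 2, illustrative values): each output element is the maximum of the patch the window covers.

import numpy as np

fmap = np.array([[1, 3, 2, 0],
                 [4, 8, 1, 1],
                 [0, 2, 9, 5],
                 [3, 1, 4, 7]], dtype=float)

def max_pool(x, size=2, stride=2):
    out_h = (x.shape[0] - size) // stride + 1
    out_w = (x.shape[1] - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i*stride:i*stride+size, j*stride:j*stride+size].max()
    return out

print(max_pool(fmap))   # [[8, 2], [3, 9]] -- the largest value in each 2x2 patch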
Types of Pooling Layers:
Average Pooling
Average pooling computes the average of the elements present in the region of the feature map covered by the filter. Thus, while max pooling gives the most prominent feature in a particular patch of the feature map, average pooling gives the average of the features present in a patch.
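The same sketch with the maximum replaced by the mean gives average pooling (2x2 window, stride 2, illustrative values):

import numpy as np

fmap = np.array([[1, 3, 2, 0],
                 [4, 8, 1, 1],
                 [0, 2, 9, 5],
                 [3, 1, 4, 7]], dtype=float)

def avg_pool(x, size=2, stride=2):
    out_h = (x.shape[0] - size) // stride + 1
    out_w = (x.shape[1] - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i*stride:i*stride+size, j*stride:j*stride+size].mean()
    return out

print(avg_pool(fmap))   # patch means: 4.0, 1.0, 1.5, 6.25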
Fully-Connected Layer
• In the fully-connected operation of a neural network, the input representation is flattened into a feature vector and passed through a network of neurons to predict the output probabilities.

• The rows are concatenated to form a long feature vector. If multiple input layers are present, their rows are also concatenated to form an even longer feature vector.
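A minimal sketch of this step, assuming a toy 4x4x2 feature map and 3 output classes (shapes and weights are illustrative, not from the slides):

import numpy as np

feature_maps = np.random.rand(4, 4, 2)   # output of the conv/pooling layers

x = feature_maps.flatten()               # flatten into a feature vector (length 32)

# One dense layer: every input connects to every output neuron.
W = np.random.rand(3, x.size)            # learnable weights, one row per class
b = np.random.rand(3)                    # learnable biases

logits = W @ x + b                       # dot products, one score per class
print(x.shape, logits.shape)             # (32,) (3,)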


Output Layer
• The output layer of a CNN is in charge of producing the probability of each class given the input image.

• To obtain these probabilities, we initialize our final Dense layer to contain the same number of neurons as there are classes.

• The output of this dense layer then passes through the Softmax activation function.

• The Softmax function outputs a vector that represents the probability distribution over the list of potential outcomes.
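A sketch of the softmax step, assuming illustrative scores from the final dense layer:

import numpy as np

logits = np.array([2.0, 1.0, 0.1])   # scores from the final dense layer (illustrative)

def softmax(z):
    e = np.exp(z - z.max())          # subtract the max for numerical stability
    return e / e.sum()

probs = softmax(logits)
print(probs)            # approximately [0.66, 0.24, 0.10]
print(probs.sum())      # 1.0 -> a valid probability distribution over the classes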
Data Augmentation
• Overfitting happens because of having too few examples to train on, resulting in a model that
has poor generalization performance. If we had infinite training data, we wouldn’t overfit
because we would see every possible instance.

• The common case in most machine learning applications, especially in image classification tasks, is that obtaining new training data is not easy.

• Data augmentation is a way to generate more training data from our current set. It enriches or
“augments” the training data by generating new examples via random transformation of existing
ones.

• This way we artificially boost the size of the training set, reducing overfitting. So data augmentation can also be considered a regularization technique.
Data Augmentation
• Data augmentation is done dynamically during training time.

• We need to generate realistic images, and the transformations should be learnable; simply adding noise won't help.

• Common transformations are: rotation, shifting, resizing, exposure adjustment, contrast change, etc.

• This way we can generate a lot of new samples from a single training example. Also, data augmentation is only performed on the training data; we don't touch the validation or test set.
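One common way to apply such random, label-preserving transformations on the fly is Keras's ImageDataGenerator; a sketch, where x_train/y_train and the parameter values are assumed for illustration:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Random transformations applied dynamically during training.
datagen = ImageDataGenerator(
    rotation_range=20,            # random rotation up to 20 degrees
    width_shift_range=0.1,        # random horizontal shift
    height_shift_range=0.1,       # random vertical shift
    zoom_range=0.1,               # random resizing/zoom
    brightness_range=(0.8, 1.2),  # random exposure adjustment
    horizontal_flip=True,
)

# Applied only to the training data; validation/test images are left untouched.
# model.fit(datagen.flow(x_train, y_train, batch_size=32), epochs=10)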
Data Augmentation
• Visualization will help in understanding the concept. Let's say Image1 is our original image.

Image1

• Using data augmentation we generate these artificial training instances.

• Image2 shows new training instances; applying transformations to the original image doesn't change the fact that this is still a cat image.

• We can infer it as a human, so the model should be able to learn that as well.

Image2
CNNs are everywhere
• Image retrieval

• Detection

• Self-driving cars

• Semantic segmentation

• Face recognition (FB tagging)

• Disease detection

• Speech Recognition

• Text processing

• Analysing satellite data
