
Convolutional Neural Network

Convolutional neural networks, also known as ConvNets or CNNs, are a special kind of neural network for processing data with a known grid-like topology, such as 1D time-series data (weather data, temperature readings, stock prices, heart-rate monitoring) or 2D images.

The key components of a CNN include convolutional layers, pooling layers, and fully connected layers.

1. Convolution Layer:

Convolution is a mathematical operation that involves the element-wise multiplication of a filter (also called a kernel) with the input data, followed by summing the results. This operation is performed across the entire input to produce a feature map.

If I is the input matrix and K is the filter (kernel) matrix, the convolution operation at position (i, j) is given by:

(I ∗ K)ij = ∑m ∑n I(i+m, j+n) ⋅ Kmn

The output size is determined by the input size, filter size, stride, and zero-padding.
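As a concrete illustration, the formula above can be implemented directly with numpy. This is a minimal sketch for a single-channel input with stride 1 and no padding; `conv2d` is an illustrative name, not a library function:

```python
import numpy as np

def conv2d(I, K):
    """Valid convolution (CNN-style cross-correlation) of input I with kernel K.

    At each position (i, j), the k x k window of I is multiplied
    element-wise with K and the results are summed.
    """
    n, k = I.shape[0], K.shape[0]
    out = n - k + 1  # output size without padding, stride 1
    F = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            F[i, j] = np.sum(I[i:i + k, j:j + k] * K)
    return F

I = np.arange(9).reshape(3, 3)   # 3x3 input
K = np.ones((2, 2))              # 2x2 filter
print(conv2d(I, K))              # 2x2 feature map
```

With a 3x3 input and a 2x2 filter, the feature map is (3 − 2 + 1) = 2 in each dimension, matching the output-size rule above.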
Padding

Convolution produces a feature map by passing the input image through the filter or kernel. Since the feature map is smaller than the input image, applying an additional convolution layer shrinks the output further. Another issue is that border pixels participate in fewer convolutions than interior pixels.

Padding involves adding extra pixels around the input data to preserve spatial information during
convolutional operations. Two common types of padding are "valid" (no padding) and "same"
(zero padding).

The formula for the feature map size with padding is (n + 2p − k + 1), applied per spatial dimension, where

 n is the image size
 k is the filter size
 p is the padding size
Conv2D(32, kernel_size=(3,3), padding='valid', activation='relu', input_shape=(28,28,1))

 Conv2D - a convolution layer.
 32 - the number of filters.
 kernel_size=(3,3) means the filter size is 3x3.
 padding='valid' means there will be no padding.
 input_shape gives the dimensions of the input image.
 Flatten() converts the data into 1D.
 Dense is a fully connected layer.

In convolution layers, the output size decreases in the absence of padding. This is evident from the model summary, which shows a decrease of 2 in each spatial dimension (28 − 3 + 1 = 26).

Strides

Strides determine the step size of the convolutional filter as it moves across the input data. Strides
affect the spatial dimensions of the feature maps and influence the information preserved during
the convolution process.

Stride controls how aggressively the filter downsamples the input: a larger stride yields a smaller feature map that captures coarser, high-level features, while a smaller stride preserves more fine-grained, low-level detail.
The formula for the feature map size with padding and strides is:

⌊(n + 2p − k) / s⌋ + 1

where

 n: input image size
 p: padding size
 k: filter size
 s: stride size
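This formula can be wrapped in a small helper for checking output sizes by hand (a sketch; `conv_output_size` is an illustrative name, not a library function):

```python
def conv_output_size(n, k, p=0, s=1):
    """Feature map size per spatial dimension: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

# No padding, stride 1: 28 - 3 + 1 = 26 (matches the model summary above)
print(conv_output_size(28, 3))
# Padding p=1 with stride 2 on a 28x28 input and 3x3 filter
print(conv_output_size(28, 3, p=1, s=2))
```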

Padding and strides can be specified together in the same convolution layer.
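As an illustration, here is a numpy sketch of a convolution that supports both zero padding and stride, assuming square single-channel inputs and kernels (`conv2d` is an illustrative helper, not framework code):

```python
import numpy as np

def conv2d(I, K, padding=0, stride=1):
    """2D convolution (CNN-style cross-correlation) with zero padding and stride."""
    if padding > 0:
        I = np.pad(I, padding)  # add `padding` rows/cols of zeros on every side
    n, k = I.shape[0], K.shape[0]
    out = (n - k) // stride + 1  # floor((n + 2p - k) / s) + 1, p already applied
    F = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            r, c = i * stride, j * stride
            F[i, j] = np.sum(I[r:r + k, c:c + k] * K)
    return F

I = np.ones((5, 5))
K = np.ones((3, 3))
print(conv2d(I, K, padding=1, stride=2))  # 3x3 feature map
```

With a 5x5 input, 3x3 filter, p = 1 and s = 2, the output is ⌊(5 + 2 − 3) / 2⌋ + 1 = 3 per dimension, consistent with the formula above.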


Pooling:

Pooling layers reduce the spatial dimensions of the input by downsampling, helping with feature extraction and computational efficiency. Pooling reduces memory usage and helps achieve translation invariance.

In machine learning, objects or patterns may appear in multiple locations within an image. When
the model achieves translation invariance, it becomes capable of identifying these patterns or
objects independent of their location or position.

A MaxPooling layer with a 2x2 pool size halves each spatial dimension of the feature map.

The resulting output, when using the "valid" padding option, has a spatial shape (number of rows
or columns) of: output_shape = math.floor((input_shape - pool_size) / strides) + 1 (when
input_shape >= pool_size).

The resulting output shape when using the "same" padding option is: output_shape =
math.floor((input_shape - 1) / strides) + 1.
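The "valid" pooling behaviour can be sketched in numpy as follows (a minimal single-channel illustration; `max_pool2d` is an assumed helper name, not a library function):

```python
import numpy as np

def max_pool2d(F, pool_size=2, stride=2):
    """Max pooling with 'valid' behaviour: only complete windows are kept."""
    n = F.shape[0]
    out = (n - pool_size) // stride + 1  # floor((n - pool) / stride) + 1
    P = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            r, c = i * stride, j * stride
            P[i, j] = F[r:r + pool_size, c:c + pool_size].max()
    return P

F = np.arange(16).reshape(4, 4)  # a 4x4 feature map
print(max_pool2d(F))             # 2x2: each value is the max of a 2x2 window
```

A 4x4 feature map becomes 2x2, matching the "valid" output-shape formula above.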
Fully Connected Layer:

Fully connected layers connect every neuron in one layer to every neuron in the next layer,
transforming the input into an output.

If x is the input vector, W is the weight matrix, and b is the bias vector, the operation is given by:
FC(x)=Wx+b

 x: Input vector.
 W: Weight matrix.
 b: Bias vector.

The output size is determined by the number of neurons in the layer.
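The operation FC(x) = Wx + b is a single matrix-vector product plus a bias, as this numpy sketch shows (`fully_connected` is an illustrative name):

```python
import numpy as np

def fully_connected(x, W, b):
    """Fully connected layer: every output neuron sees every input value."""
    return W @ x + b  # W has shape (n_out, n_in), so output has n_out values

W = np.array([[1.0, 0.0],
              [0.0, 2.0]])   # 2 neurons, 2 inputs
x = np.array([3.0, 4.0])
b = np.array([1.0, 1.0])
print(fully_connected(x, W, b))
```

The number of rows of W (the number of neurons) determines the output size, as stated above.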

Overall Architecture:

Forward Propagation:

The forward propagation in a CNN involves passing the input through convolutional layers,
activation functions, pooling layers, and fully connected layers successively.
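The sequence above can be sketched end-to-end in numpy as a toy single-channel, single-filter forward pass. The shapes and the `forward` helper are illustrative assumptions, not framework API:

```python
import numpy as np

def relu(x):
    """ReLU activation: keep positive values, zero out the rest."""
    return np.maximum(0, x)

def forward(image, kernel, W, b):
    """Toy CNN forward pass: conv -> ReLU -> 2x2 max pool -> flatten -> dense."""
    k = kernel.shape[0]
    out = image.shape[0] - k + 1  # 'valid' convolution output size
    fmap = np.array([[np.sum(image[i:i + k, j:j + k] * kernel)
                      for j in range(out)] for i in range(out)])
    fmap = relu(fmap)             # activation after the convolution
    p = out // 2                  # 2x2 max pooling with stride 2
    pooled = np.array([[fmap[2 * i:2 * i + 2, 2 * j:2 * j + 2].max()
                        for j in range(p)] for i in range(p)])
    x = pooled.flatten()          # flatten to 1D for the dense layer
    return W @ x + b              # fully connected output

image = np.ones((6, 6))
kernel = np.ones((3, 3))
W = np.ones((1, 4))   # one output neuron over the 4 pooled values
b = np.zeros(1)
print(forward(image, kernel, W, b))
```

A 6x6 input convolved with a 3x3 filter gives a 4x4 feature map, pooling reduces it to 2x2, and flattening yields the 4 values fed to the dense layer.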

Activation Function:

Common activation functions include ReLU (Rectified Linear Unit) in convolutional layers and
fully connected layers.

Backward Propagation:

During training, backpropagation is used to update the weights and biases based on the error
between the predicted output and the actual output.
