Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 5

UNIT: 3

 Padding
• Padding is a technique used to preserve the spatial dimensions of the input image after convolution
operations on a feature map and can improve the performance of the model.

• Padding is simply a process of adding layers of zeros to our input images so as to avoid
the problems mentioned above.
• Padding involves adding extra pixels around the border of the input feature map before convolution.

• This can be done in two ways:


• Valid Padding: In the valid padding, no padding is added to the input feature map, and the
output feature map is smaller than the input feature map. This is useful when we want to reduce
the spatial dimensions of the feature maps.

• Same Padding: In the same padding, padding is added to the input feature map such that the
size of the output feature map is the same as the input feature map. This is useful when we want
to preserve the spatial dimensions of the feature maps.

• This prevents shrinking as, if p = number of layers of zeros added to the border of the
image, then our (n x n) image becomes (n + 2p) x (n + 2p) image after padding. So,
applying convolution-operation (with (f x f) filter) outputs (n + 2p – f + 1) x (n + 2p – f
+ 1) images. For example, adding one layer of padding to an (8 x 8) image and using a
(3 x 3) filter we would get an (8 x 8) output after performing convolution operation.
• This increases the contribution of the pixels at the border of the original image by
bringing them into the middle of the padded image. Thus, information on the borders is
preserved as well as the information in the middle of the image.

 Edge Detection
 As the name suggests, edge detection is the process of detecting the edges in an image
 Convolution operations are widely used in image processing for tasks such as edge
detection. The basic idea behind edge detection using convolutions is to apply a
convolutional kernel (also known as a filter) to an image.

 Strided Convolution
 A strided convolution is another basic building block of convolution that is used in
Convolutional Neural Networks.

 1 * 1 Convolution or Network in Network

-> A 1x1 convolution, also known as a pointwise convolution or


network-in-network.

 It is a type of convolutional operation commonly used in deep learning


architectures, particularly in neural networks such as convolutional neural
networks(CNN).

 Traditional convolutional layers that use larger filters (e.g., 3x3 or


5x5), a 1x1 convolution employs a filter with a size of 1x1. But when
we compare with the 1 * 1 convolutional layers, it only filter with size
of 1 * 1.

 A convolutional layer with a 1×1 filter can, therefore, be used at any point
in a convolutional neural network to control the number of feature maps.
As such, it is often referred to as a projection operation or projection layer,
or even a feature map or channel pooling layer.

Examples of How to Use 1×1 Convolutions


 We can make the use of a 1×1 filter concrete with some examples.
 Consider that we have a convolutional neural network that expected color
images input with the square shape of 256x256x3 pixels.
 These images then pass through a first hidden layer with 512 filters, each
with the size of 3×3 with the same padding, followed by a ReLU activation
function.
Benefits of using 1 * 1 convolutions:-
1. Computational Efficiency
2. Non-Linearity
3. Network Flexibility
Overall, 1x1 convolutions are a powerful tool in deep learning architectures, and they
are often used in combination with other convolutional layers to build more expressive
and efficient models.

 Inception Network Motivation


 Inception Network is also known as GoogleNet, and it is a family of CNN architecture
designed to improve the computational efficiency and accuracy of deep learning
models, especially for image classification tasks.

 When we designing a layer for a convolution operations, we might have to choose


between a 1×3 filter, a 3×3 or 5×5 or maybe a pooling layer.

 An inception network solves this by saying. “ Why shouldn’t we apply them all ? ”.
This makes the network architecture more complicated, but remarkably improves
performance as well. Let’s see how this works.

 Object detection using deep learning


 Deep learning is a powerful machine learning technique in which the object detector
automatically learns image features required for detection tasks.
 There are various techniques or algorithms for object detection in the field of deep
learning available such as Faster R-CNN, you only look once (YOLO) v2, YOLO v3,
YOLO v4, and single shot detection (SSD).
 Object detection is a phenomenon in computer vision that involves the detection of
various objects in digital images or videos.

 Applications for object detection include as follows.


1. Image classification
2. Scene understanding
3. Self-driving vehicles
4. Surveillance

 YOLO Algorithm for Object Detection


 YOLO is an abbreviation for the term ‘You Only Look Once’. This is an
algorithm that detects and recognizes various objects in a picture (in real-
time).
 It is a popular algorithm for object detection in computer vision.
 YOLO is an algorithm that uses neural networks to provide real-time object
detection.
 This algorithm is popular because of its speed and accuracy. It has been
used in various applications to detect traffic signals, people, parking
meters, and animals.
 Object detection is a phenomenon in computer vision that involves the
detection of various objects in digital images or videos. Some of the objects
detected include people, cars, chairs, stones, buildings, and animals.
 YOLO uses a single bounding box regression to predict the height, width,
center, and class of objects.

 Why the YOLO algorithm is important


1. Speed: This algorithm improves the speed of detection because it can predict objects
in real-time.
2. High accuracy: YOLO is a predictive technique that provides accurate results with
minimal background errors.
3. Learning capabilities: The algorithm has excellent learning capabilities that enable it
to learn the representations of objects and apply them in object detection.

 How the YOLO algorithm works


YOLO algorithm works using the following three techniques:

1. Residual blocks
2. Bounding box regression
3. Intersection Over Union (IOU)
First, the image is divided into various grids. Each grid has a dimension of S x S.
Every grid cell will detect objects that appear within them.

You might also like