PGP - AI & ML
• Please check your network connection and audio before the class for a smooth session
• All participants are muted by default; you will be unmuted when requested or as needed
• Please use the “Questions” panel in your webinar tool to interact with the instructor at any point during the class
• Please keep the support phone numbers handy (US: 1855 818 0063 (toll free), India: +91 90191 17772) and raise tickets from the LMS in case of any issues with the tool
• Most often, logging off and rejoining will resolve tool-related issues
Module 3: Deep Dive into Neural Networks with TensorFlow & Keras
Module 5: Convolutional Neural Networks
Module 10: Hands-On Project
▪ Convolutional Layer
▪ Convolutional Layer
▪ ReLU layer
▪ Pooling Layer
▪ We can represent the above image as an array of dimension 32 x 32 x 3 (the 3 refers to the RGB channels).
▪ Each of these numbers is a value from 0 to 255 describing the pixel intensity at that point.
▪ These numbers, while meaningless to us when we perform image classification, are the only inputs available to the computer.
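As a minimal sketch (in NumPy, with random values standing in for real pixel data, so the array contents are an assumption), such an image can be held in a 32 x 32 x 3 array:

```python
import numpy as np

# A 32 x 32 RGB image: height x width x 3 color channels,
# each entry a pixel intensity from 0 to 255.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

print(image.shape)  # (32, 32, 3)
print(image[0, 0])  # the R, G, B values of the top-left pixel
```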
Connecting an image to even a single fully connected neuron takes one weight per input value: 2,352 weights for a small image, and 120,000 weights for a 200 x 200 x 3 image. How do we manage that?
Moreover, we would almost certainly want to have several such neurons, so the parameters would add up quickly! Clearly, this
full connectivity is wasteful and the huge number of parameters would quickly lead to overfitting.
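The weight counts above are just the input sizes; a single fully connected neuron needs one weight per input value (reading the 2,352 figure as a 28 x 28 x 3 image, which is an assumption):

```python
# One fully connected neuron needs one weight per input value.
small = 28 * 28 * 3    # a 28 x 28 RGB image
large = 200 * 200 * 3  # a 200 x 200 RGB image

print(small)  # 2352
print(large)  # 120000
```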
Now let’s discuss a deep net that has completely dominated the machine-vision space in recent years.
Unlike regular neural networks with fully connected layers, in a CNN each neuron in a layer is connected only to a small region of the layer before it, instead of to all of the neurons in a fully connected manner.
▪ This is one of the most widely used methods for image recognition.
[Figure: a CNN takes a two-dimensional array of pixels as input and classifies it as an “X” or an “O”.]
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Trickier Cases
▪ Here we run into some problems, because the X and O images won’t always look exactly the same; there can be certain deformations such as translation, scaling, rotation, and stroke weight, as shown below.
[Figure: deformed X and O images illustrating translation, scaling, rotation, and weight.]
How to Decide?
▪ We need a classifier that can take these images and correctly predict which letter each one is.
How Computers See
▪ A computer understands an image as a number at each pixel.
▪ In our example, a black pixel has the value -1 and a white pixel has the value 1.
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
[Figure: comparing the stored X with a deformed X pixel by pixel; the positions marked X do not match, so a naive pixel-equality check fails.]
As we can see, this is not optimal, since such a technique requires exactly the same image to classify correctly. That is not the case in the real world, so let’s see how a CNN solves this problem.
How a CNN Works
▪ CNNs compare images piece by piece. The pieces they look for are called features. By finding rough feature matches, in roughly the same positions in two images, CNNs get much better at seeing similarity than whole-image matching schemes.
How a CNN Works
▪ Some examples of features used for classifying our input data (three 3 x 3 pieces: the main diagonal, the crossing, and the anti-diagonal):

 1 -1 -1     1 -1  1    -1 -1  1
-1  1 -1    -1  1 -1    -1  1 -1
-1 -1  1     1 -1  1     1 -1 -1

These are small pieces of the bigger image. If we choose a feature and place it on the input image, the positions where it matches tell us the image contains that piece.
 1 -1  1       -1 -1  1
-1  1 -1       -1  1 -1
 1 -1  1        1 -1 -1

Now let’s see how these features are used for pattern matching.
▪ Convolution is the process of trying every possible position: we move the feature to every possible position on the image.
▪ Steps to follow:
1. Line up the feature and the image patch (here the patch is 3 x 3 = 9 pixels, the same size as the feature).
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Filtering
2. Multiply each image pixel by the corresponding feature pixel.
[Figure: the filtering animation, frame by frame. The feature is lined up with a matching patch of the image and the nine products are computed one cell at a time; for a patch that matches the feature exactly, every product is 1.]
Adding up the filter
▪ When we are done, we add up the nine products and divide by 9, the size of the filter.
[Figure: for the perfectly matching patch all nine products are 1, so the result is 9 / 9 = 1.00; for a patch that does not match everywhere, some products are -1 and the result is lower.]
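The multiply, sum, and divide-by-9 steps can be sketched in NumPy (the mismatched patch below is an illustrative assumption, not taken from the slides):

```python
import numpy as np

# The 3 x 3 diagonal feature from the slides.
feature = np.array([[ 1, -1, -1],
                    [-1,  1, -1],
                    [-1, -1,  1]])

# A patch that matches the feature exactly.
patch = feature.copy()
value = (feature * patch).sum() / 9  # elementwise products, summed, / 9
print(value)  # 1.0

# Flip two pixels: two products become -1 and the score drops to 5/9.
patch2 = patch.copy()
patch2[0, 1] = 1
patch2[1, 0] = 1
value2 = (feature * patch2).sum() / 9
print(round(value2, 2))  # 0.56
```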
Map Creation
▪ Now we put the value the filter produced into a map at the position where the patch was taken.
[Figure: the 9 x 9 image with the computed value 0.55 recorded at the corresponding position of the map.]
From the map we can see that our feature matches much better along the diagonal than in other areas.
[Figure: sliding the 3 x 3 feature over every position of the 9 x 9 image produces a 7 x 7 map of match values; one row of such a map reads -0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11.]
In convolution, one image becomes a stack of filtered images; the number of filtered images depends on the number of filters.
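A sketch of the whole convolution, sliding the diagonal feature over the 9 x 9 “X” image (the helper name `convolve` is our own, not a library function):

```python
import numpy as np

# 9 x 9 "X" image: 1 on the two interior diagonals, -1 elsewhere.
X = -np.ones((9, 9))
for i in range(1, 8):
    X[i, i] = 1      # main diagonal
    X[i, 8 - i] = 1  # anti-diagonal

feature = np.array([[ 1., -1., -1.],
                    [-1.,  1., -1.],
                    [-1., -1.,  1.]])

def convolve(image, feat):
    """Slide feat over image (stride 1, no padding); each output cell
    is the mean of the elementwise products, as on the slides."""
    h = image.shape[0] - feat.shape[0] + 1
    w = image.shape[1] - feat.shape[1] + 1
    out = np.zeros((h, w))
    for r in range(h):
        for c in range(w):
            out[r, c] = (image[r:r + 3, c:c + 3] * feat).mean()
    return out

fmap = convolve(X, feature)
print(fmap.shape)            # (7, 7): a 9 x 9 image shrinks to 7 x 7
print(round(fmap[0, 0], 2))  # 0.78 (the slides show this 7/9 as 0.77)
```

Note that the map peaks at 1.00 exactly where the patch matches the feature perfectly, i.e. along the diagonal.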
▪ The activation used here is ReLU; other common activation functions are the unit step, sigmoid, and tanh. ReLU keeps positive values and turns negative values into 0, i.e. f(x) = max(0, x):

x     f(x)
-3    0
-5    0
 3    3
 5    5
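A one-line ReLU matching the table above:

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit: f(x) = max(0, x), applied elementwise."""
    return np.maximum(0, x)

print(relu(np.array([-3, -5, 3, 5])))  # [0 0 3 5]
```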
[Figure: ReLU applied to each filtered image, every negative value replaced by 0; e.g. the row 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 becomes 0.77 0 0.11 0.33 0.55 0 0.33.]
▪ In pooling we slide a small window (here 2 x 2) across the filtered image in strides of two and keep only the maximum value in each window. In our first window the maximum value is 1, so we record that and move the window two strides.
[Figure: each 7 x 7 rectified map pooled down to a 4 x 4 map of maxima.]
Here we can see that instead of a 7 x 7 pixel image we get a 4 x 4 one, and the higher values are preserved. This process is very helpful when we are working with images of 1000 x 1000 pixels and more.
We can also see that even if a particular feature in the image is slightly out of place, the pooling layer still picks it up.
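A sketch of 2 x 2 max pooling with stride 2 (our own helper, allowing the partial edge windows that turn a 7 x 7 map into a 4 x 4 one):

```python
import numpy as np

def max_pool(fmap, size=2, stride=2):
    """Keep only the maximum of each size x size window; edge windows
    may be smaller, so a 7 x 7 map becomes a 4 x 4 map."""
    h = -(-fmap.shape[0] // stride)  # ceiling division
    w = -(-fmap.shape[1] // stride)
    out = np.zeros((h, w))
    for r in range(h):
        for c in range(w):
            window = fmap[r * stride:r * stride + size,
                          c * stride:c * stride + size]
            out[r, c] = window.max()
    return out

fmap = np.arange(49).reshape(7, 7) / 48.0  # a stand-in 7 x 7 map
pooled = max_pool(fmap)
print(pooled.shape)  # (4, 4)
```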
[Figure: the 9 x 9 image passing through convolution, ReLU, and pooling layers in sequence; the layers can be stacked, so the output of one pass becomes the input of the next.]
Each time, the image gets more filtered as it goes through convolution layers and smaller as it goes through pooling layers.
▪ Here we take our filtered and shrunken images and put their values into a single list (a vector).
[Figure: the flattened vector; when one set of positions holds the high values (1.00), the image is classified as X, and when another set holds them, it is classified as O.]
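The voting can be sketched like this; the vector values follow the slides’ X example, but the weight vectors are hypothetical stand-ins for weights a real network would learn:

```python
import numpy as np

# Flattened pooled values (from the slides' X example).
vector = np.array([1.00, 0.55, 0.55, 1.00, 1.00, 0.55, 0.55, 1.00])

# Hypothetical weights: which positions vote for "X" and which for "O".
w_x = np.array([1, 0, 0, 1, 1, 0, 0, 1]) / 4.0
w_o = np.array([0, 1, 1, 0, 0, 1, 1, 0]) / 4.0

score_x = vector @ w_x  # mean of the "X" positions
score_o = vector @ w_o  # mean of the "O" positions
label = "X" if score_x > score_o else "O"
print(label)  # X
```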
[Figure: a new input vector (0.9, 0.65, 0.87, 0.96, 0.73, 0.23, 0.63, 0.44, 0.89, 0.94, 0.53, 0.45, ...) is scored against the positions that vote for X and those that vote for O; the stronger total wins, and here the image is classified as X.]
CNN Use-Case
Airplane
Automobile
Bird
Cat
Deer
Dog
Frog
Horse
Ship
Truck
Training steps: download the dataset, define the convolution layers, then define the fully connected layers.
Network architecture: Input layer -> Convolutional layer 1 -> Pooling 1 -> Convolutional layer 2 -> Pooling 2 -> Dense -> Output (one of the 10 classes listed above).
▪ Here we import the functions distorted_inputs() and inputs() from the cifar10_input file.
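As a rough sketch of how shapes flow through the architecture above (the filter sizes and counts here are assumptions for illustration, not the values used in the course code):

```python
def conv_shape(h, w, k, filters):
    """'Valid' convolution with a k x k filter shrinks each side by k - 1."""
    return h - k + 1, w - k + 1, filters

def pool_shape(h, w, c, size=2):
    """2 x 2 pooling with stride 2 halves each side."""
    return h // size, w // size, c

h, w, c = 32, 32, 3                          # CIFAR-10 input image
h, w, c = conv_shape(h, w, k=5, filters=16)  # conv layer 1 -> 28 x 28 x 16
h, w, c = pool_shape(h, w, c)                # pooling 1    -> 14 x 14 x 16
h, w, c = conv_shape(h, w, k=5, filters=32)  # conv layer 2 -> 10 x 10 x 32
h, w, c = pool_shape(h, w, c)                # pooling 2    -> 5 x 5 x 32
flat = h * w * c                             # flattened for the dense layer
print(flat)  # 800 inputs feeding the 10-way output layer
```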