
OFFICIAL (CLOSED) \ NON-SENSITIVE

Computer Vision
Mr Hew Ka Kian
hew_ka_kian@rp.edu.sg

Image Convolution


Image Convolution
• In image processing, convolution is the process of transforming an image by applying a kernel
over each pixel and its local neighbors across the entire image. The kernel is a matrix of values
whose size and values determine the transformation effect of the convolution process.
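
As a rough illustration of the description above, the following pure-Python sketch slides a kernel over a small grayscale image with no padding and a step of one pixel. The function name `convolve2d`, the sample image and the kernel values are assumptions chosen for this example. Like most deep-learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (lists of lists), no padding, step of 1."""
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out_h, out_w = img_h - k_h + 1, img_w - k_w + 1
    output = []
    for row in range(out_h):
        out_row = []
        for col in range(out_w):
            # Multiply each kernel value by the pixel it covers and sum the products.
            total = 0
            for i in range(k_h):
                for j in range(k_w):
                    total += kernel[i][j] * image[row + i][col + j]
            out_row.append(total)
        output.append(out_row)
    return output


# Hypothetical 4x4 grayscale image with a vertical edge in the middle.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# Hypothetical 3x3 kernel that responds to vertical edges.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
feature_map = convolve2d(image, kernel)
print(feature_map)  # [[27, 27], [27, 27]]
```

The 4x4 input shrinks to a 2x2 output because the 3x3 kernel only fits in two positions along each dimension.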

https://medium.com/@bdhuma/6-basic-things-to-know-about-convolution-daef5e1bc411

Grid Size
• The number of pixels a kernel “sees” at once
• Typically use odd numbers so that there is a “center” pixel
• Kernel does not need to be square

[Figure: three example kernels. Height: 3, Width: 3; Height: 1, Width: 3; Height: 3, Width: 1]


Padding
• Applying a kernel directly, without padding, causes an “edge effect”
• Pixels near the edge cannot be used as “center pixels” since they do not have
enough surrounding pixels
• Padding adds extra pixels around the frame
• Every pixel of the original image can then be a center pixel as the kernel
moves across the image
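
A minimal sketch of padding, assuming the extra pixels are zeros (zero padding is one common choice; the slide does not say which values are used). The helper name `zero_pad` and the sample image are hypothetical.

```python
def zero_pad(image, pad):
    """Return a copy of `image` surrounded by `pad` rows and columns of zeros."""
    width = len(image[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]          # rows added on top
    for row in image:
        padded.append([0] * pad + list(row) + [0] * pad)  # columns added left and right
    padded.extend([0] * width for _ in range(pad))      # rows added at the bottom
    return padded


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
padded = zero_pad(image, 1)
for row in padded:
    print(row)
```

With a 3x3 kernel, one ring of padding turns the 3x3 image into a 5x5 image, so the kernel can be centered on every original pixel, including the corners.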


Stride
• The “step size” as the kernel moves across the image
• Can be different for vertical and horizontal steps (but usually is the
same value)
• When stride is greater than 1, it scales down the output dimension
Stride 2 Example – No Padding
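
How stride scales down the output can be checked with the usual output-size formula, floor((input + 2 × padding − kernel) / stride) + 1, applied to one dimension at a time. The sketch below is illustrative only; the function names and the 5-pixel input are assumptions, not taken from the slide.

```python
def output_size(input_size, kernel_size, stride=1, padding=0):
    """Number of positions the kernel visits along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def window_starts(input_size, kernel_size, stride=1):
    """Indices where the kernel's first pixel lands along one dimension (no padding)."""
    return list(range(0, input_size - kernel_size + 1, stride))


size_stride_1 = output_size(5, 3, stride=1)
size_stride_2 = output_size(5, 3, stride=2)
starts_stride_2 = window_starts(5, 3, stride=2)
print(size_stride_1)    # 3
print(size_stride_2)    # 2
print(starts_stride_2)  # [0, 2]
```

For a 5-pixel-wide input and a 3-pixel-wide kernel with no padding, a stride of 1 gives 3 output columns while a stride of 2 gives only 2.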


Depth
• In images, each pixel may be represented by multiple values. This is
also known as image mode.
• The number of values is referred to as “channels”
• RGB image – 3 channels
• CMYK – 4 channels
• Each kernel has the same depth (number of channels) as its input and produces
one output, called a feature map, so the depth of a convolution layer's output
equals its number of kernels.


Basic Idea
• A convolution layer consists of n kernels. Each kernel
convolves with the input image to produce one output image,
giving n output images (feature maps).
• The activation function of a convolution layer is typically ReLU
because
(i) there is no requirement to ‘threshold’ the
convolved output image, and
(ii) there are no negative values in an image, and ReLU replaces
negative values with 0.
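
For reference, ReLU keeps positive values unchanged and replaces negative values with 0. A minimal sketch, with a hypothetical 2x2 convolution output as input:

```python
def relu(feature_map):
    """Apply ReLU element-wise to a 2D list: negative values become 0."""
    return [[max(0, value) for value in row] for row in feature_map]


# Hypothetical convolution output containing negative values.
feature_map = [
    [-3, 5],
    [2, -1],
]
activated = relu(feature_map)
print(activated)  # [[0, 5], [2, 0]]
```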


Other types of layers


• Pooling layer
• Shrinks the dimensions of an image (reduces its size) by mapping each patch of pixels to one
value.
• Max-pooling: keeps the largest value in each patch
• Average-pooling: keeps the mean of the values in each patch
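
The two pooling variants can be sketched as follows, assuming non-overlapping patches (the step equals the pool size, which is a common default). The function name `pool2d` and the sample values are hypothetical.

```python
def pool2d(feature_map, pool_size=2, mode="max"):
    """Map each non-overlapping pool_size x pool_size patch to one value."""
    if mode == "max":
        combine = max
    else:
        combine = lambda values: sum(values) / len(values)
    pooled = []
    for row in range(0, len(feature_map) - pool_size + 1, pool_size):
        pooled_row = []
        for col in range(0, len(feature_map[0]) - pool_size + 1, pool_size):
            # Collect the values covered by the current patch.
            patch = [
                feature_map[row + i][col + j]
                for i in range(pool_size)
                for j in range(pool_size)
            ]
            pooled_row.append(combine(patch))
        pooled.append(pooled_row)
    return pooled


feature_map = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 2, 1, 0],
    [3, 4, 5, 6],
]
max_pooled = pool2d(feature_map, mode="max")
avg_pooled = pool2d(feature_map, mode="average")
print(max_pooled)  # [[7, 8], [9, 6]]
print(avg_pooled)  # [[4.0, 5.0], [4.5, 3.0]]
```

In both cases the 4x4 input becomes a 2x2 output; only the rule for summarising each patch differs.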


Other types of layers


• Dropout layer
• Randomly deactivates one or more nodes in a network
• Does NOT change the network shape, as nodes are NOT removed; some are
just deactivated.
• Helps prevent overfitting, as the model explores more pathways from the
input layer to the output layer
• Dropout is ONLY performed during training
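
A minimal sketch of the idea, assuming the common "inverted dropout" convention in which surviving node outputs are scaled up by 1 / (1 − rate) during training; that rescaling is an addition not mentioned in the slide. The function name, the sample activations and the seed are hypothetical.

```python
import random


def dropout(activations, rate, training, rng=random):
    """Zero each node output with probability `rate`, only while training."""
    if not training:
        return list(activations)  # nothing is deactivated outside training
    keep_prob = 1.0 - rate
    # Deactivated nodes output 0; survivors are scaled so the expected sum is unchanged.
    return [a / keep_prob if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the example is repeatable
activations = [0.5, 1.2, 0.8, 2.0, 1.5]
train_out = dropout(activations, rate=0.4, training=True, rng=rng)
test_out = dropout(activations, rate=0.4, training=False)
print(train_out)
print(test_out)
```

The output list always has the same length as the input, which matches the point that dropout does not change the network shape.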

Ref: https://medium.com/analytics-vidhya/a-simple-introduction-to-dropout-regularization-with-code-5279489dda1e

Other types of layers


• Flatten layer
• Transforms a multi-dimensional array of features (height × width × channels) into a
one-dimensional vector that can be fed into a fully connected neural network classifier
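
A minimal sketch of flattening, assuming the features are stored as a height × width × channels nested list (a hypothetical 2x2 map with 3 channels here):

```python
def flatten(feature_maps):
    """Flatten a height x width x channels nested list into one flat list."""
    return [value for row in feature_maps for pixel in row for value in pixel]


# Hypothetical 2x2 feature map with 3 channels per position.
feature_maps = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
vector = flatten(feature_maps)
print(len(vector))  # 12
print(vector)       # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```

The vector length is the product of the dimensions (2 × 2 × 3 = 12), which is the number of inputs the following dense layer receives.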


Forming the CNN


• Combining the convolution, pooling, flatten and dense (fully
connected) layers forms a Convolutional Neural Network (CNN).


Forming the CNN


• The class whose output has the highest value is the model's
prediction
• If the output for Dog is 3.5, it is difficult to judge the significance
of 3.5 unless we view it relative to the total of all the output values
• Softmax converts the output values into probabilities that
add up to 1.0


Softmax
• $\text{softmax output at } i = \dfrac{\exp(\text{output at } i)}{\text{sum of all } \exp(\text{output})}$
• Use Excel to calculate exp() with =EXP(value)

        Output    EXP()     Softmax output
Dog     3.5       33.115    33.115 / 35.170 = 0.94
Cat     0.72      2.054     2.054 / 35.170 = 0.06
Total             35.170
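
The same calculation can be reproduced in Python instead of Excel. This is a minimal sketch using the Dog and Cat outputs from the table; the function name `softmax` and the dictionary layout are choices made for this example.

```python
import math


def softmax(outputs):
    """Convert raw output values into probabilities that add up to 1.0."""
    exps = [math.exp(o) for o in outputs]
    total = sum(exps)
    return [e / total for e in exps]


outputs = {"Dog": 3.5, "Cat": 0.72}
probabilities = dict(zip(outputs, softmax(list(outputs.values()))))
for label, probability in probabilities.items():
    print(f"{label}: {probability:.2f}")
# Dog: 0.94
# Cat: 0.06
```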


Student Activity
How many actual Mango samples are wrongly predicted as Orange?

[Figure: confusion matrix with Prediction and Actual axes]

Student Activity
How many predictions of Apple are actually Mango?

[Figure: confusion matrix with Prediction and Actual axes]

Student Activity
What is the accuracy when tested with actual Apple?
• $\text{Accuracy} = \dfrac{\text{number of \textbf{correct} predictions}}{\text{number of predictions}}$
Accuracy = 85 / (85 + 10 + 9) = 0.82


Student Activity
What is the overall accuracy?
• $\text{Accuracy} = \dfrac{\text{number of \textbf{correct} predictions}}{\text{number of predictions}}$
Accuracy = (85 + 93 + 85) / (85 + 10 + 9 + 6 + 93 + 12 + 7 + 8 + 85) = 0.83
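
The calculation can also be checked in code. This sketch assumes the nine numbers in the sum above are the confusion matrix read row by row, with rows as actual classes and columns as predictions, and with Apple as the first row; the figure itself is not reproduced here, so that layout is an assumption.

```python
# Assumed layout: rows are actual classes, columns are predicted classes.
confusion = [
    [85, 10, 9],
    [6, 93, 12],
    [7, 8, 85],
]


def overall_accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def class_accuracy(matrix, index):
    """Correct predictions for one actual class divided by that row's total."""
    return matrix[index][index] / sum(matrix[index])


overall = overall_accuracy(confusion)
first_class = class_accuracy(confusion, 0)
print(round(overall, 2))      # 0.83
print(round(first_class, 2))  # 0.82
```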


Student Activity
Calculate the test accuracy from the numbers in the confusion matrix below
and make sure you get the same accuracy as test_acc.
• $\text{Accuracy} = \dfrac{\text{number of \textbf{correct} predictions}}{\text{number of predictions}}$

Student Activity
Complete the code below to build the network as the summary shows.
# Assumed import; the slide does not show where layers and models come from.
from tensorflow.keras import layers, models

network = models.Sequential()
# Output will be 40 feature maps of 32x32. The 32x32 input size is preserved because of padding.
network.add(layers.Conv2D(40, (3, 3), activation="relu", padding="same", input_shape=(32, 32, 3)))
# Output is 60 feature maps of 30x30. With no padding, the size is reduced by 2 rows and 2 columns.
network.add(layers.Conv2D(60, (3, 3), activation="relu"))
# Pool size (2, 2) halves the number of rows and columns (30x30 becomes 15x15).
network.add(layers.MaxPooling2D(pool_size=(2, 2)))
# Dropout does not affect the output shape: it only deactivates some nodes at each
# training step and does not permanently remove them.
network.add(layers.Dropout(0.4))
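
The output shapes noted for these layers can be traced by hand without building the model. The helpers below are a hypothetical pure-Python sketch, not part of Keras; they assume a stride of 1 for the convolutions and non-overlapping pooling windows.

```python
def conv2d_shape(shape, filters, kernel=(3, 3), padding="valid"):
    """Output (height, width, channels) of a stride-1 convolution layer."""
    height, width, _ = shape
    if padding == "valid":
        # No padding: each dimension loses kernel_size - 1 pixels.
        height -= kernel[0] - 1
        width -= kernel[1] - 1
    # With "same" padding the height and width are unchanged.
    return (height, width, filters)


def maxpool_shape(shape, pool_size=(2, 2)):
    """Output shape of non-overlapping max pooling."""
    height, width, channels = shape
    return (height // pool_size[0], width // pool_size[1], channels)


input_shape = (32, 32, 3)
after_conv1 = conv2d_shape(input_shape, 40, padding="same")
after_conv2 = conv2d_shape(after_conv1, 60)
after_pool = maxpool_shape(after_conv2)
print(after_conv1)  # (32, 32, 40)
print(after_conv2)  # (30, 30, 60)
print(after_pool)   # (15, 15, 60)
```

The channel count after each convolution equals the number of kernels in that layer, while pooling and dropout leave it unchanged.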

Student Activity
# Flatten 2x2x40 into 160 nodes
network.add(layers.Flatten())
# 60 nodes
network.add(layers.Dense(60, activation='relu'))
network.add(layers.Dropout(0.3))
# Output layer of 10 nodes as there are 10 classes
network.add(layers.Dense(10, activation='softmax'))
