Lecture 5

Deep Learning
Dr. Felix Gonda

Assistant Professor of Computer Science
Email: uojdeeplearning@gmail.com
Web: https://uojai.github.io/deeplearning
UoJ Deep Learning - https://uojai.github.io/deeplearning
Recap of Last Week

Image (5x5):
1 1 1 0 0
0 1 1 1 0
0 0 1 1 1
0 0 1 1 0
0 1 1 0 0

Filter (3x3):
1 0 1
0 1 0
1 0 1

Feature map (3x3):
4 3 4
2 4 3
2 3 4

Top-left output: 4 = (1x1) + (1x0) + (1x1) + (0x0) + (1x1) + (1x0) + (0x1) + (0x0) + (1x1)
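The recap numbers can be checked with a few lines of plain Python. This is a minimal sketch, assuming a stride of 1, no padding, and the image and filter values read from the recap figure; the function name `convolve2d` is illustrative and not part of the lecture.

```python
# Image and filter values as read from the recap figure.
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]


def convolve2d(image, kernel, stride=1):
    """Slide a square filter over the image and sum the element-wise products."""
    k = len(kernel)
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    feature_map = []
    for p in range(out_h):
        row = []
        for q in range(out_w):
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[p * stride + i][q * stride + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
# [4, 3, 4]
# [2, 4, 3]
# [2, 3, 4]
```

Deep learning libraries compute this same sliding sum of products (without flipping the filter) and call it convolution.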

Convolutional Neural Networks (CNNs)

CNNs for Classification
[Figure: Input Image -> Convolution (Feature Maps) -> MaxPooling -> Fully Connected Layer]
Three Main Operations in a CNN:

1. Convolution: apply filters to generate feature maps.
2. Non-linearity: often ReLU.
3. Pooling: downsampling operation on each feature map.

Convolution Layers: Local Connectivity
For a neuron in the hidden layer:
• Take inputs from a patch of the input
• Compute the weighted sum
• Apply a bias
[Figure: a patch of the Input connected to one neuron in Hidden layer 1]

4x4 filter: matrix of weights w_ij

$$\sum_{i=1}^{4} \sum_{j=1}^{4} w_{ij}\, x_{i+p,\,j+q} + b$$

for neuron (p, q) in the hidden layer

[Figure: Input -> Hidden layer 1]
1. Applying a window of weights
2. Computing linear combinations
3. Activating with a non-linear function
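The three steps above (window of weights, linear combination, non-linear activation) can be sketched for a single hidden neuron. This is an illustrative sketch, not the lecture's code: the input values, weights, and bias are made up, ReLU is assumed as the activation, and indices start at 0 rather than 1 as on the slide.

```python
def relu(z):
    """Non-linear activation assumed for this sketch."""
    return max(0.0, z)


def neuron_output(x, w, b, p, q, activation=relu):
    """Hidden neuron (p, q): activation(sum_ij w[i][j] * x[i+p][j+q] + b)."""
    z = b
    for i in range(len(w)):
        for j in range(len(w[0])):
            z += w[i][j] * x[i + p][j + q]
    return activation(z)


# Hypothetical data: a 6x6 input of ones, a 4x4 filter of 0.5s, bias -1.
x = [[1.0] * 6 for _ in range(6)]
w = [[0.5] * 4 for _ in range(4)]
b = -1.0

# A 4x4 filter fits at 3 x 3 positions in a 6x6 input.
hidden = [[neuron_output(x, w, b, p, q) for q in range(3)] for p in range(3)]
print(hidden[0][0])  # 16 * 0.5 - 1 = 7.0
```

Every neuron in the hidden layer reuses the same 16 weights and the same bias; only the patch of the input it looks at changes with (p, q).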

CNNs: Spatial Arrangement of Output Volume
[Figure: output volume with axes Height, Width, and Depth]
Layer Dimensions: h x w x d
where h and w are spatial dimensions and d (depth) = number of filters
Stride: filter step size
Receptive Field: locations in the input image that a node is path connected to.
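To make the link between filter size, stride, and layer dimensions concrete, the sketch below uses the standard output-size relation for a convolution without padding, floor((n - f) / s) + 1. The function name and the 32x32 example input are assumptions for illustration only.

```python
def output_volume(h_in, w_in, filter_size, stride, num_filters):
    """Return (h, w, d) of a convolution layer's output, assuming no padding."""
    h = (h_in - filter_size) // stride + 1
    w = (w_in - filter_size) // stride + 1
    d = num_filters  # depth = number of filters
    return h, w, d


print(output_volume(32, 32, filter_size=3, stride=1, num_filters=32))  # (30, 30, 32)
print(output_volume(32, 32, filter_size=3, stride=2, num_filters=64))  # (15, 15, 64)
```

A larger stride shrinks h and w, while the depth d depends only on how many filters the layer has.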

Introducing Non-Linearity
Rectified Linear Unit (ReLU): g(y) = max(0, y)
[Figure: Input Feature Map (black = negative, white = positive values) -> Rectified Feature Map (only non-negative values); plot of g(y), which is zero for negative y and rises linearly for positive y]
• Apply after every convolution operation (i.e., after convolution layers).
• ReLU: a pixel-by-pixel, non-linear operation that replaces all negative values with zero.
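As a small illustration of g(y) = max(0, y) applied pixel by pixel, the sketch below rectifies a made-up 3x3 feature map. The values and the helper name `rectify` are hypothetical.

```python
def relu(y):
    """g(y) = max(0, y)"""
    return max(0, y)


def rectify(feature_map):
    """Apply ReLU to every value of a 2D feature map."""
    return [[relu(v) for v in row] for row in feature_map]


feature_map = [
    [-3, 2, 0],
    [5, -1, -7],
    [4, -2, 6],
]
rectified = rectify(feature_map)
print(rectified)  # [[0, 2, 0], [5, 0, 0], [4, 0, 6]]
```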

Pooling
Max pool with 2x2 filters and stride of 2:

Input (4x4):      Output (2x2):
1 1 2 4           6 8
5 6 7 8           3 4
3 2 1 0
1 2 3 4

• Reduced dimensionality
• Spatial invariance
How else can we downsample and preserve spatial invariance?
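The max-pooling example can be reproduced in plain Python. The sketch below also shows average pooling as one possible answer to the question above; that choice, and the names `pool2d` and `mean`, are assumptions of this sketch rather than something the slide states.

```python
def pool2d(feature_map, size=2, stride=2, reducer=max):
    """Downsample a 2D feature map by reducing each size x size window."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, stride):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, stride):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(reducer(window))
        pooled.append(row)
    return pooled


def mean(values):
    return sum(values) / len(values)


x = [
    [1, 1, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
max_pooled = pool2d(x)                # [[6, 8], [3, 4]]
avg_pooled = pool2d(x, reducer=mean)  # [[3.25, 5.25], [2.0, 2.0]]
print(max_pooled)
print(avg_pooled)
```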

Representation Learning in Deep CNNs
Convolution Layer 1: Low Level Features (edges, dark spots)
Convolution Layer 2: Mid Level Features (eyes, ears, nose)
Convolution Layer 3: High Level Features (facial structure)

CNNs for Classification: Feature Learning
[Figure: Input -> Convolution + ReLU -> Pooling -> Convolution + ReLU -> Pooling -> Flatten -> Fully Connected -> SoftMax. The convolution and pooling stages form Feature Learning; the remaining stages form Classification. Example output: Car 0.9, Truck 0.1, Van 0.0, Plane 0.0, Bus 0.0, Bicycle 0.0]
1. Learn features in the input image through convolution.
2. Introduce non-linearity through an activation function (real-world data is non-linear).
3. Reduce dimensionality and preserve spatial invariance with pooling.

CNNs for Classification: Class Probabilities
[Figure: Flatten -> Fully Connected -> SoftMax, with example output Car 0.9, Truck 0.1, Van 0.0, Plane 0.0, Bus 0.0, Bicycle 0.0]
1. Convolution and pooling layers output high-level features of the input.
2. The fully connected layer uses these features to classify the input image.
3. Express the output as the probability of the image belonging to a particular class:

$$\operatorname{softmax}(y_i) = \frac{e^{y_i}}{\sum_j e^{y_j}}$$
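The softmax step can be sketched with the standard library alone. The class names come from the slide, but the raw scores below are invented for illustration. Subtracting the largest score before exponentiating is a common numerical-stability trick that leaves the result unchanged.

```python
import math


def softmax(scores):
    """softmax(y_i) = exp(y_i) / sum_j exp(y_j)"""
    shift = max(scores)  # stability only; cancels out in the ratio
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


classes = ["Car", "Truck", "Van", "Plane", "Bus", "Bicycle"]
scores = [4.0, 1.8, -2.0, -3.0, -2.5, -4.0]  # hypothetical outputs of the last layer

probs = softmax(scores)
for name, prob in zip(classes, probs):
    print(f"{prob:.2f} {name}")

predicted = classes[probs.index(max(probs))]
```

The outputs are all positive and sum to 1, so they can be read as class probabilities; the predicted class is the one with the highest probability.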

Putting it all together

import tensorflow as tf

# model
model = tf.keras.Sequential()

# first layer (filters = number of features to learn)
model.add(tf.keras.layers.Conv2D(filters=32, kernel_size=(3, 3), activation='relu'))
model.add(tf.keras.layers.MaxPool2D(pool_size=(2, 2), strides=2))

# second layer
model.add(tf.keras.layers.Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(tf.keras.layers.MaxPool2D(pool_size=(2, 2), strides=2))

# fully connected classifier
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(units=1024, activation='relu'))
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
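To see what each layer of this model does to the data, the sketch below traces the tensor shape layer by layer without needing TensorFlow. It assumes a hypothetical 64x64 RGB input (the slide does not give an input size), 3x3 convolutions and 2x2 pooling with no padding, and fully connected layers of 1024 and 10 units.

```python
def conv_out(size, kernel, stride=1):
    """Output length along one spatial dimension, assuming no padding."""
    return (size - kernel) // stride + 1


def trace_shapes(height, width, channels):
    """Follow the shape through conv(32) -> pool -> conv(64) -> pool -> dense."""
    shapes = [("input", (height, width, channels))]
    h, w = conv_out(height, 3), conv_out(width, 3)
    shapes.append(("conv 3x3, 32 filters", (h, w, 32)))
    h, w = conv_out(h, 2, 2), conv_out(w, 2, 2)
    shapes.append(("max pool 2x2", (h, w, 32)))
    h, w = conv_out(h, 3), conv_out(w, 3)
    shapes.append(("conv 3x3, 64 filters", (h, w, 64)))
    h, w = conv_out(h, 2, 2), conv_out(w, 2, 2)
    shapes.append(("max pool 2x2", (h, w, 64)))
    shapes.append(("flatten", (h * w * 64,)))
    shapes.append(("dense 1024", (1024,)))
    shapes.append(("dense 10 (softmax)", (10,)))
    return shapes


shapes = trace_shapes(64, 64, 3)  # hypothetical 64x64 RGB input
for name, shape in shapes:
    print(f"{name:22s} {shape}")
```

The spatial size shrinks at every convolution and pooling step while the depth grows with the number of filters, until Flatten turns the volume into a single vector for the classifier.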

An Architecture for Many Applications
[Figure: the feature learning stage of the classification CNN, reused for other tasks]
• Object detection
• Segmentation
• Probabilistic control

Classification: breast cancer screening.

Object Detection
[Figure: street scene with objects labelled Taxi, Person, and Truck]
Classification outputs a single class label y for the whole image, e.g. Taxi.
Object detection outputs a class label and a box (x, y, w, h) for each object.
Output for one object:
Taxi: (x1, y1, w1, h1)
Output for several objects:
Taxi: (x1, y1, w1, h1)
Person: (x2, y2, w2, h2)
Person: (x3, y3, w3, h3)

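Naive Solution to Object Detection
[Figure: boxes placed at different positions in the image; each box is passed to a CNN that answers "Class?"]
Problem: way too many inputs. Every box is a separate input to the CNN, and there are too many scales, positions, and sizes to try.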
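A quick count shows how fast the number of inputs grows. The sketch below assumes a hypothetical 480x640 image, four box heights, three aspect ratios, and a step of 8 pixels between positions; none of these numbers come from the lecture.

```python
def count_windows(image_h, image_w, box_sizes, step):
    """Count the boxes a sliding-window detector would have to classify."""
    total = 0
    for box_h, box_w in box_sizes:
        if box_h > image_h or box_w > image_w:
            continue  # this box does not fit in the image
        rows = (image_h - box_h) // step + 1
        cols = (image_w - box_w) // step + 1
        total += rows * cols
    return total


# Hypothetical search space: 4 heights x 3 aspect ratios = 12 box shapes.
box_sizes = [(s, int(s * r)) for s in (32, 64, 128, 256) for r in (0.5, 1.0, 2.0)]
n = count_windows(480, 640, box_sizes, step=8)
print(f"{n} boxes, each needing its own CNN forward pass")
```

Even this coarse grid produces tens of thousands of boxes for one image, which is the motivation for proposing a smaller set of regions.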

Object Detection with R-CNNs
R-CNN algorithm: find regions that we think have objects. Use a CNN to classify.
1. Input image
2. Extract region proposals (~2K)
3. Compute CNN features on each warped region
4. Classify regions (e.g. Aeroplane: No, Person: Yes, Car: No)
Problems:
• Slow! Many regions; time-intensive inference.
• Brittle! Manually defined region proposals.
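The four R-CNN steps can be laid out as a loop. This is only a structural sketch: every helper below (`propose_regions`, `warp`, `cnn_features`, `classify`) is a toy stand-in invented for illustration, not a real region proposal method or a trained network.

```python
def rcnn_detect(image, propose_regions, warp, cnn_features, classify):
    """Run the four R-CNN steps; all callables are caller-supplied stand-ins."""
    detections = []
    for box in propose_regions(image):   # 2. extract region proposals
        crop = warp(image, box)          #    cut out (and normally resize) the region
        features = cnn_features(crop)    # 3. compute CNN features
        label = classify(features)       # 4. classify the region
        if label is not None:
            detections.append((label, box))
    return detections


# Toy stand-ins: a 4x4 "image" with a bright 2x2 object in the middle.
image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]


def propose_regions(image):
    return [(0, 0, 2, 2), (1, 1, 2, 2)]  # boxes as (x, y, w, h)


def warp(image, box):
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def cnn_features(crop):
    values = [v for row in crop for v in row]
    return sum(values) / len(values)  # mean brightness instead of real CNN features


def classify(features):
    return "object" if features > 0.5 else None


detections = rcnn_detect(image, propose_regions, warp, cnn_features, classify)
print(detections)  # [('object', (1, 1, 2, 2))]
```

The feature extractor runs once per proposed region, so with about 2K proposals per image the loop body runs about 2K times. That is the "Slow!" problem listed above.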
Faster R-CNN Learns Region Proposals
• Image input directly into convolutional feature extractor. Fast! Only input image once.
• Region Proposal Network: region proposal network to learn candidate regions from the feature maps. Learned, data driven.
• ROI Pooling: feature extraction over proposed regions.
• Classifier: classification of regions -> object detection.

Semantic Segmentation: Fully Convolutional Networks
FCN: Fully Convolutional Network
Network designed with all convolutional layers, with downsampling and upsampling operations.
Goal: for every pixel in the input image, learn what class it belongs to.
[Figure: feature map sizes through the network]
Input: 3 x H x W
High-res: D1 x H/2 x W/2
Med-res: D2 x H/4 x W/4
Low-res: D3 x H/4 x W/4
Med-res: D2 x H/4 x W/4
High-res: D1 x H/2 x W/2
Prediction: H x W
Pixels of the cow are differentiated from sky, grass, etc.
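The final step, turning per-class scores into an H x W prediction, amounts to picking the best class at every pixel. The sketch below assumes the network's last layer produces one score map per class; the tiny 2x3 score maps are made up for illustration.

```python
def predict_labels(scores):
    """scores[c][y][x] -> label map [y][x] holding the highest-scoring class index."""
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


classes = ["sky", "grass", "cow"]
scores = [
    [[0.90, 0.80, 0.70], [0.10, 0.20, 0.10]],  # sky
    [[0.05, 0.10, 0.10], [0.70, 0.20, 0.80]],  # grass
    [[0.05, 0.10, 0.20], [0.20, 0.60, 0.10]],  # cow
]

labels = predict_labels(scores)
for row in labels:
    print([classes[c] for c in row])
# ['sky', 'sky', 'sky']
# ['grass', 'cow', 'grass']
```

The prediction has the same height and width as the input image but a single class label per pixel.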

Semantic Segmentation: Biomedical Image Analysis
Automatic segmentation of neural tissue in 3D electron microscopy images using deep learning.
[Figure: Original Image -> Confidence Map -> 2D Segmentation -> 3D Reconstruction]
Credits: https://www.aivia-software.com/

Mid-Term Examination
When: Tuesday, November 24th, 2023
Where: Senate Hall
Time: Noon

Neural Network Basics
• Biological Neural Networks
• Structure of a neuron
• Neural Model
• Artificial Neural Networks
• Perceptron
• MultiLayer Perceptron (MLP)
• Deep Neural Networks
• Activation Functions
• Loss function
• Backpropagation
• Stochastic Gradient Descent
• Optimization (learning rate, mini batches)
• Underfitting, Overfitting
• Regularization (Dropout, Early Stopping)

Convolutional Neural Networks
• Convolution
• Pooling
• CNN structure
• CNN applications
  • Classification
  • Object Detection
  • Segmentation

TensorFlow
• Creating a model
• Layer types
  • Input
  • Convolution
  • Fully Connected
  • Pooling
  • Drop out
• Training
See Instructor after Lecture
Chris Sagin Alexander Monday Yok Muorwel Emmanuel Simon Sadig
Akie Isaac Deng Dhieu Kiir Yoll Dhieu Tariir Abdun Ngong Mathiang
Aluel Akeen Aguot Akeen John Buosh Mayiek Maper Christopher Adhar Mathon
David Gabriel Gatjiek Riak Matuor Ayor Kuol Dinando Majok
John Kulang Leme Achoul Isaac Sebit Faustino Mathew Doru Sylvanus Yama
Kuol Stephen Hoth Bong Sabir Basall Shatta Fadul Favicky Matur Mamer
Mary John Ajugo Ladu Stephen Jada Paul Kuot Agoth Majok
M o u Thiik Riiny Giir Koang Peter Makuil Richard George

Hana Noreldeen
Nelson Mawien Wieu Khamis Moderia Suzy John
Bosco Selfador Kabashi Francis Mathew Soak Kuri Lazarus Ajang Lual
Emmanuel Martin Loro Clement Malak Akech
Dominic Olelang Moi Sabio Sizy John Bornone Ladu

Next…
Laboratory on Creating a CNN for Sign Language Numbers
@ UoJ Computer Lab
