Lecture 5
Learning
Email: uojdeeplearning@gmail.com
Web: https://uojai.github.io/deeplearning
U o J Deep Learning - https://uojai.github.io/deeplearning
Recap of Last Week
Image (5x5):
1 1 1 0 0
0 1 1 1 0
0 0 1 1 1
0 0 1 1 0
0 1 1 0 0

Filter (3x3):
1 0 1
0 1 0
1 0 1

Feature map (3x3):
4 3 4
2 4 3
2 3 4

Bottom-right entry of the feature map:
4 = (1x1) + (1x0) + (1x1) + (1x0) + (1x1) + (0x0) + (1x1) + (0x0) + (0x1)
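The recap figure can be checked numerically. The following is a minimal sketch, assuming stride 1 and no padding; the function name `convolve2d_valid` is a placeholder chosen for this example, not part of any library. As is usual in deep learning, the filter is applied without flipping (strictly speaking, a cross-correlation).

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=image.dtype)
    for p in range(out_h):
        for q in range(out_w):
            window = image[p:p + kh, q:q + kw]   # patch currently under the filter
            out[p, q] = np.sum(window * kernel)  # element-wise multiply, then add
    return out

# Image and filter taken from the recap figure.
image = np.array([
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
])
kernel = np.array([
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
])

feature_map = convolve2d_valid(image, kernel)
print(feature_map)
```

The printed 3x3 array should match the feature map in the figure, including the bottom-right value of 4.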
Input Image -> Convolution (Feature Maps) -> MaxPooling -> Fully Connected Layer
[Figure: Input -> Hidden layer]
4x4 filter: matrix of weights w_ij
$$\sum_{i=1}^{4} \sum_{j=1}^{4} w_{ij}\, x_{i+p,\,j+q} + b$$
for neuron (p, q) in hidden layer
1. Applying a window of weights
2. Computing linear combinations
3. Activating with non-linear function
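The three steps can be sketched for a whole hidden layer as follows. This is only an illustration: the 6x6 input, the uniform 4x4 weights, the bias of -20, and the choice of ReLU as the non-linear function are all assumptions made for the example. The code uses 0-based indices, whereas the slide formula counts i and j from 1.

```python
import numpy as np

def hidden_layer(x, w, b):
    """Apply a k x k window of weights at every position, add the bias, then apply ReLU."""
    k = w.shape[0]
    out_h = x.shape[0] - k + 1
    out_w = x.shape[1] - k + 1
    z = np.empty((out_h, out_w))
    for p in range(out_h):
        for q in range(out_w):
            # Steps 1 and 2: window of weights and linear combination for neuron (p, q)
            z[p, q] = np.sum(w * x[p:p + k, q:q + k]) + b
    # Step 3: non-linear activation (ReLU assumed here)
    return np.maximum(z, 0.0)

x = np.arange(36, dtype=float).reshape(6, 6)  # hypothetical 6x6 input
w = np.full((4, 4), 1.0 / 16.0)               # hypothetical weights: average of the window
b = -20.0                                     # hypothetical bias

hidden = hidden_layer(x, w, b)
print(hidden.shape)  # a 6x6 input and a 4x4 filter give a 3x3 hidden layer
print(hidden)
```

Neurons whose linear combination is negative are set to zero by the activation, while positive values pass through unchanged.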
Layer Dimensions: h x w x d
[Figure labels: +ve, -ve]
Max pool with 2x2 filters and stride of 2

Input (4x4):
1 1 2 4
5 6 7 8
3 2 1 0
1 2 3 4

Output (2x2):
6 8
3 4

• Reduced dimensionality
• Spatial invariance
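The pooling example above can be reproduced with a short sketch. The function name `max_pool2d` is a placeholder for this example; real frameworks provide their own pooling layers.

```python
import numpy as np

def max_pool2d(x, size=2, stride=2):
    """Keep the maximum of each size x size window, moving by `stride` each step."""
    out_h = (x.shape[0] - size) // stride + 1
    out_w = (x.shape[1] - size) // stride + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = x[r:r + size, c:c + size].max()
    return out

# Input taken from the max pooling figure.
x = np.array([
    [1, 1, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
])

pooled = max_pool2d(x)
print(pooled)
```

The 4x4 input shrinks to 2x2, which is the reduced dimensionality the slide refers to.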
0.9 Car
0.1 Truck
0.0 Van
0.0 Plane
0.0 Bus
0.0 Bicycle
1. Convolution and pooling layers output high-level features of the input
2. Fully connected layer uses these features for classifying the input image
3. Express output as the probability of the image belonging to a particular class
$$\mathrm{softmax}(y_i) = \frac{e^{y_i}}{\sum_j e^{y_j}}$$
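To illustrate step 3, here is a minimal softmax sketch. The raw scores are hypothetical values chosen for this example; they are not taken from the lecture. Subtracting the maximum before exponentiating is a common numerical-stability trick and does not change the result.

```python
import numpy as np

def softmax(y):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = y - np.max(y)   # avoids overflow in exp; the result is unchanged
    exps = np.exp(shifted)
    return exps / np.sum(exps)

classes = ["Car", "Truck", "Van", "Plane", "Bus", "Bicycle"]
scores = np.array([6.0, 3.8, 0.5, 0.2, 0.1, 0.0])  # hypothetical network outputs

probs = softmax(scores)
for name, p in zip(classes, probs):
    print(f"{p:.2f} {name}")
```

The class with the largest raw score receives the largest probability, and the probabilities add up to 1.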
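import tensorflow as tf

model = tf.keras.Sequential()
# first layer: 32 filters of size 3x3, followed by 2x2 max pooling
model.add(tf.keras.layers.Conv2D(filters=32, kernel_size=(3, 3), activation='relu'))
model.add(tf.keras.layers.MaxPool2D(pool_size=(2, 2), strides=2))
# second layer: 64 filters of size 3x3, followed by 2x2 max pooling
model.add(tf.keras.layers.Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(tf.keras.layers.MaxPool2D(pool_size=(2, 2), strides=2))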
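It can help to trace how the layer dimensions (h x w x d) change through the two convolution and pooling blocks above. The sketch below assumes a 28x28x1 input (an assumption, since the slides do not state the input size) and the Keras defaults for Conv2D, namely stride 1 and no padding. The helper names are placeholders for this example.

```python
def output_size(n, window, stride=1):
    """Output length along one axis for a window of the given size, with no padding."""
    return (n - window) // stride + 1

def trace_shapes(h, w, d):
    """Follow (h, w, d) through two blocks of 3x3 convolution and 2x2 max pooling."""
    shapes = [(h, w, d)]
    for filters in (32, 64):
        # 3x3 convolution, stride 1: the depth becomes the number of filters
        h, w = output_size(h, 3), output_size(w, 3)
        shapes.append((h, w, filters))
        # 2x2 max pooling, stride 2: the depth is unchanged
        h, w = output_size(h, 2, 2), output_size(w, 2, 2)
        shapes.append((h, w, filters))
    return shapes

shapes = trace_shapes(28, 28, 1)  # hypothetical 28x28 grayscale input
for shape in shapes:
    print(shape)
```

Under these assumptions the spatial size shrinks from 28x28 to 5x5 while the depth grows from 1 to 64.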
Taxi
Class label y
Output:
Taxi: (x1, y1, w1, h1)

[Figure: image with Taxi, Person, and Truck labelled]
Output:
Taxi: (x1, y1, w1, h1)
Person: (x2, y2, w2, h2)
Person: (x3, y3, w3, h3)
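A detection output of this form can be represented as a list of labelled boxes. The sketch below is illustrative only: the `Detection` class, the coordinate values, and the convention that (x, y) is the top-left corner of the box are all assumptions, since the slides do not specify them.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """One detected object: a class label plus a bounding box (x, y, w, h)."""
    label: str
    x: float  # assumed: left edge of the box
    y: float  # assumed: top edge of the box
    w: float  # box width
    h: float  # box height

    def corners(self):
        """Return the box as (left, top, right, bottom)."""
        return (self.x, self.y, self.x + self.w, self.y + self.h)

# Hypothetical detections for one image; the numbers are made up.
detections = [
    Detection("Taxi", 40, 120, 200, 90),
    Detection("Person", 300, 80, 60, 160),
    Detection("Person", 380, 85, 55, 150),
]

for det in detections:
    print(det.label, det.corners())
```

Unlike classification, the number of entries in the output varies from image to image.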
Class?
Problem: far too many inputs. Candidate boxes span too many scales, positions, and sizes to classify one by one.
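A quick count shows why exhaustive search is impractical. The sketch below counts every axis-aligned box with integer corners that fits inside an image; the 100x100 image size is an assumption chosen for illustration.

```python
def count_boxes(height, width):
    """Count all boxes of every size at every position inside the image."""
    total = 0
    for box_h in range(1, height + 1):
        for box_w in range(1, width + 1):
            positions = (height - box_h + 1) * (width - box_w + 1)
            total += positions
    return total

n = count_boxes(100, 100)  # hypothetical, very small image
print(n)
```

Even for this small image there are tens of millions of candidate boxes, each of which would need its own pass through the CNN.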
Warped region -> CNN -> Aeroplane: No; Person: Yes; Car: No
Problems:
• Slow! Many regions; time-intensive inference.
• Brittle! Manually defined region proposals.
Faster R-CNN Learns Region Proposals
Region Proposal Network: region proposal network to learn candidate regions. Learned, data driven.
[Figure: Feature maps -> Region Proposal Network -> Proposals -> ROI Pooling -> Feature extraction over proposed regions]
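ROI pooling turns each proposed region of the feature map into a fixed-size grid, so that regions of different sizes can feed the same later layers. The following is a simplified sketch, not the Faster R-CNN implementation: it handles a single-channel feature map, takes integer region coordinates, and assumes the region is at least as large as the output grid. The function name and the example values are placeholders.

```python
import numpy as np

def roi_max_pool(feature_map, roi, out_size=(2, 2)):
    """Max-pool the region (x0, y0, x1, y1), end-exclusive, into a fixed out_size grid."""
    x0, y0, x1, y1 = roi
    region = feature_map[y0:y1, x0:x1]
    out_h, out_w = out_size
    # Split the region into roughly equal bins along each axis.
    row_edges = np.linspace(0, region.shape[0], out_h + 1).astype(int)
    col_edges = np.linspace(0, region.shape[1], out_w + 1).astype(int)
    out = np.empty(out_size, dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            cell = region[row_edges[i]:row_edges[i + 1],
                          col_edges[j]:col_edges[j + 1]]
            out[i, j] = cell.max()
    return out

feature_map = np.arange(36).reshape(6, 6)  # hypothetical 6x6 feature map
roi = (1, 1, 5, 4)                         # hypothetical proposal: 4 wide, 3 tall

pooled_roi = roi_max_pool(feature_map, roi)
print(pooled_roi)
```

Whatever the size of the proposal, the output here is always 2x2.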
Goal: for every pixel in the input image, learn what class it belongs to.
Input: 3 x H x W
High-res: D1 x H/2 x W/2
Med-res: D2 x H/4 x W/4
Low-res: D3 x H/4 x W/4
Med-res: D2 x H/4 x W/4
High-res: D1 x H/2 x W/2
Prediction: H x W
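The final H x W prediction assigns one class to each pixel. A common way to obtain it, sketched below, is to take the highest-scoring class at every pixel from a C x H x W array of class scores. The score values, the number of classes, and the function name are assumptions made for this example.

```python
import numpy as np

def predict_pixel_classes(scores):
    """Map class scores of shape (C, H, W) to a class index per pixel, shape (H, W)."""
    return np.argmax(scores, axis=0)

# Hypothetical scores for 3 classes on a 2x2 image.
scores = np.array([
    [[0.9, 0.1],
     [0.2, 0.3]],   # class 0
    [[0.05, 0.8],
     [0.1, 0.3]],   # class 1
    [[0.05, 0.1],
     [0.7, 0.4]],   # class 2
])

prediction = predict_pixel_classes(scores)
print(prediction.shape)
print(prediction)
```

Each entry of the printed array is the class index predicted for the corresponding pixel.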
Akie Isaac Deng Dhieu Kiir Yoll Dhieu Tariir Abdun Ngong Mathiang
Aluel Akeen Aguot Akeen John Buosh Mayiek Maper Christopher Adhar Mathon
John Kulang Leme Achoul Isaac Sebit Faustino Mathew Doru Sylvanus Yama
Kuol Stephen Hoth Bong Sabir Basall Shatta Fadul Favicky Matur Mamer
Mary John Ajugo Ladu Stephen Jada Paul Kuot Agoth Majok
Bosco Selfador Kabashi Francis Mathew Soak Kuri Lazarus Ajang Lual