Professional Documents
Culture Documents
Conv2d Multi Channels
Conv2d Multi Channels
Yes 5
End
5 Multi-channels input
Input image or feature map: multiple channels
1 3 1 0
2 12 10 21 0
3 11 210 111 00 0
1 11 202 000 01 1
1 12 12 01 1
0 1 0 2
Input: 3 channels
6 Multi-channels input
Input image or feature map: multiple channels
1 3 1 0 3 1 0 1 2 2 0 1 1 3 1 0
2 12 10 21 0 1 1 2 0 1 1 1 0 1 1 2 0
3 11 210 111 00 0 1 2 2 1 1 0 0 1 2 1 0 0
1 11 202 000 01 1 0 1 0 2 1 1 0 1 2 0 0 1
1 12 12 01 1
Channel 1 Channel 2 Channel 3
0 1 0 2 (RED) (GREEN) (BLUE)
Input: 3 channels
7 Multi-channels input
Input image or feature map: multiple channels
1 3 1 0 3 1 0 1 2 2 0 1 1 3 1 0
2 12 10 21 0 1 1 2 0 1 1 1 0 1 1 2 0
3 11 210 111 00 0 1 2 2 1 1 0 0 1 2 1 0 0
1 11 202 000 01 1 0 1 0 2 1 1 0 1 2 0 0 1
1 12 12 01 1
Channel 1 Channel 2 Channel 3
0 1 0 2 (RED) (GREEN) (BLUE)
Input: 3 channels
0 0 1
1 10 11 0
1 11 000 10 0
1 00 00 1
0 0 1
Kernel: 3 channels
8 Multi-channels input
Input image or feature map: multiple channels
1 3 1 0 3 1 0 1 2 2 0 1 1 3 1 0
2 12 10 21 0 1 1 2 0 1 1 1 0 1 1 2 0
3 11 210 111 00 0 1 2 2 1 1 0 0 1 2 1 0 0
1 11 202 000 01 1 0 1 0 2 1 1 0 1 2 0 0 1
1 12 12 01 1
Channel 1 Channel 2 Channel 3
0 1 0 2 (RED) (GREEN) (BLUE)
Input: 3 channels 1 1 0 1 0 1 0 0 1
0 0 1 1 0 0 1 0 0 1 1 0
1 10 11 0 0 0 1 0 0 1 0 1 0
1 11 000 10 0 Channel 1 Channel 2 Channel 3
1 00 00 1 (RED) (GREEN) (BLUE)
0 0 1
Kernel: 3 channels
9 Multi-channels input
1 Rotate kernel 1800
1 1 0 1 0 1 0 0 1
1 0 0 1 0 0 1 1 0
0 0 1 0 0 1 0 1 0
1 0 0 1 0 0 0 1 0
0 0 1 0 0 1 0 1 1
0 1 1 1 0 1 1 0 0
10 Multi-channels input
1 Flatten the rotated kernel to a vector
1 1 0 1 0 1 0 0 1
1 0 0 1 0 0 1 1 0
0 0 1 0 0 1 0 1 0
1 0 0 1 0 0 0 1 0
0 0 1 0 0 1 0 1 1
0 1 1 1 0 1 1 0 0
After flattening:
1 0 0 0 0 1 0 1 1 1 0 0 0 0 1 1 0 1 0 1 0 0 1 1 1 0 0
11 Multi-channels input
2 Padding the input
3 1 0 1 2 2 0 1 1 3 1 0
1 1 2 0 1 1 1 0 1 1 2 0
1 2 2 1 1 0 0 1 2 1 0 0
0 1 0 2 1 1 0 1 2 0 0 1
(No padding)
12 Multi-channels input
2 Allocate the output buffer
3 1 0 1 2 2 0 1 1 3 1 0
1 1 2 0 1 1 1 0 1 1 2 0
1 2 2 1 1 0 0 1 2 1 0 0
0 1 0 2 1 1 0 1 2 0 0 1
input: i1 = i2 = 4 i1 − k1 + 1 = 2
kernel: k1 = k2 = 3
padding: p1 = p2 = 0 i2 − k2 + 1 = 2
strides: s1 = s2 = 1
Output
3 1 0 1 2 2 0 1 1 3 1 0
1 1 2 0 1 1 1 0 1 1 2 0
1 2 2 1 1 0 0 1 2 1 0 0
0 1 0 2 1 1 0 1 2 0 0 1
3 1 0 2 2 0 1 3 1
·
1 1 2 1 1 1 1 1 2
1 2 2 1 0 0 2 1 0
3 1 0 1 1 2 1 2 2 2 2 0 1 1 1 1 0 0 1 3 1 1 1 2 2 1 0
dot-product
1 0 0 0 0 1 0 1 1 1 0 0 0 0 1 1 0 1 0 1 0 0 1 1 1 0 0
21
> Slide the kernel to right
3 1 0 1 2 2 0 1 1 3 1 0
1 1 2 0 1 1 1 0 1 1 2 0
1 2 2 1 1 0 0 1 2 1 0 0
0 1 0 2 1 1 0 1 2 0 0 1
1 0 1 2 0 1 3 1 0
·
1 2 0 1 1 0 1 2 0
2 2 1 0 0 1 1 0 0
1 0 1 1 2 0 2 2 1 2 0 1 1 1 0 0 0 1 3 1 0 1 2 0 1 0 0
dot-product
1 0 0 0 0 1 0 1 1 1 0 0 0 0 1 1 0 1 0 1 0 0 1 1 1 0 0
21 11
> Slide the kernel down
3 1 0 1 2 2 0 1 1 3 1 0
1 1 2 0 1 1 1 0 1 1 2 0
1 2 2 1 1 0 0 1 2 1 0 0
0 1 0 2 1 1 0 1 2 0 0 1
1 1 2 1 1 1 1 1 2
·
1 2 2 1 0 0 2 1 0
0 1 0 1 1 0 2 0 0
1 1 2 1 2 2 0 1 0 1 1 1 1 0 0 1 1 0 1 1 2 2 1 0 2 0 0
dot-product
1 0 0 0 0 1 0 1 1 1 0 0 0 0 1 1 0 1 0 1 0 0 1 1 1 0 0
21 11
> Slide the kernel to right
10
3 1 0 1 2 2 0 1 1 3 1 0
1 1 2 0 1 1 1 0 1 1 2 0
1 2 2 1 1 0 0 1 2 1 0 0
0 1 0 2 1 1 0 1 2 0 0 1
1 2 0 1 1 0 1 2 0
·
2 2 1 0 0 1 1 0 0
1 0 2 1 0 1 0 0 1
1 2 0 2 2 1 1 0 2 1 1 0 0 0 1 1 0 1 1 2 0 1 0 0 0 0 1
dot-product
1 0 0 0 0 1 0 1 1 1 0 0 0 0 1 1 0 1 0 1 0 0 1 1 1 0 0
21 11
10 10
18 Multi-channels input
Final result
1 3 1 0
2 12 10 21 0
3 21 11
11 210 111 00 0 CONV
1 ∂b 11 202 000 01 1 10 10
1 12 12 01 1
Output: 1 channel
0 1 0 2
0 0 1
Input: 3 channels 1 10 11 0
1 11 000 10 0
1 00 00 1
0 0 1
Kernel: 3 channels
Convolution
Multi-filters layer
Dr. Thanh-Sach LE
LTSACH@hcmut.edu.vn
0 0 1
11 10 01
11 01 00
1 0 0
1 3 1 0 0 1 0
32 12 00 11 00 00 11
11 11 12 00 Filter 1
1 1 2 0
12 01 00 10
1 2 2 1
12 10 00 11
0 1 0 2
Input feature map
(D = 3 channels)
22 Multi-filters layer
21 11
CONV
10 10
0 0 1
11 10 01
11 01 00
1 0 0
1 3 1 0 0 1 0
32 12 00 11 00 00 11
11 11 12 00 Filter 1
1 1 2 0 x x
12 01 00 10
1 2 2 1 CONV x x
12 10 00 11
0 1 0 2
x x
Input feature map CONV
(D = 3 channels) x x
10 00 11
2 1 0
11 01 00
1 1 1
00 01 10
0 0 1
Filter K
23 Multi-filters layer
21 11
CONV
10 10
0 0 1
11 10 01
11 01 00 xx xx
1 0 0 21 11
1 3 1 0 0 1 0 CONCAT xx xx
32 12 00 11 00 00 11 10 10
11 11 12 00 Filter 1
1 1 2 0 x x K channels
12 01 00 10 (input to next layer)
1 2 2 1 CONV x x
12 10 00 11
0 1 0 2
x x
Input feature map CONV
(D = 3 channels) x x
10 00 11
2 1 0
11 01 00
1 1 1
00 01 10
0 0 1
Filter K