Professional Documents
Culture Documents
Nam StradVision 2020 Embedded Vision Summit Slides Final
Nam StradVision 2020 Embedded Vision Summit Slides Final
© 2020 StradVision
StradVision
© 2020 StradVision 2
Agenda
© 2020 StradVision 3
Development Process of
Deep Learning-based Software
© 2020 StradVision 4
Development Process
Repeated for each task (e.g., object detection) and target platform
© 2020 StradVision 5
Scalability Challenge
© 2020 StradVision 6
Development Process
Scalability challenge
• Number of networks to maintain grows rapidly
• More than 100 networks become difficult to manage
# of edges = # of networks
© 2020 StradVision 7
Scalable Development Process
Main idea
• Grouping target platforms based on similar computational characteristics (or
capabilities) with respect to a given CNN transformation
# of edges = # of networks
Same group
© 2020 StradVision 8
Scalable Development Process
Network
Binary Report
Weights
Same group
© 2020 StradVision 9
Scalable Development Process
Cost-effective method
• Increasing computational efficiency
• E.g., lower latency, higher throughput, cheaper hardware
• Increasing engineering efficiency
• E.g., shorter development time, less engineers, more projects
• Increasing economic efficiency
• E.g., lower usage of cloud services for training, lower production-cost
© 2020 StradVision 10
Layer Transformations
© 2020 StradVision 11
Layer Transformations
• Basic operations
• Case studies
• YUV Input
• Fixed-size Kernel Acceleration in Convolution
• No FC Support
• No Custom Layer Support in Quantization
• Block-level Accumulation in Convolution
© 2020 StradVision 12
Layer Transformations
Basic operations
𝑦 = 𝑎𝑥 + 𝑏 and 𝑧 = 𝑐𝑦 + 𝑑
• (Linear) layer fusion
→ 𝑧 = (𝑎𝑐)𝑥 + (𝑏𝑐 + 𝑑)
• Conv & Conv → Conv New weight New bias
Transpose
© 2020 StradVision 13
Case Studies
© 2020 StradVision 14
Case Study – YUV Input
© 2020 StradVision
Fusion 15
Case Study – YUV Input
© 2020 StradVision 16
Case Study – Fixed-size Kernel Acceleration in Convolution
Zero filling
w31 w32 w33 0 0 0 0 0
© 2020 StradVision 17
Case Study – Fixed-size Kernel Acceleration in Convolution
iw=input width kw=kernel width ow=output width
ih=input height kh=kernel height oh=output height
→ Convert 1x1 Conv into 4(kh)x1(kw) Conv ic=input channel oc=output channel
1x1 Convolution
4x1 Convolution
© 2020 StradVision 18
Case Study – Fixed-size Kernel Acceleration in Convolution
iw=input width kw=kernel width ow=output width
ih=input height kh=kernel height oh=output height
→ Convert 1x1 Conv into 4(kh)x1(kw) Conv ic=input channel oc=output channel
iw=3 kw=1
ow=3
1 2 3
ih=2 7 8 9
kh=1 1
1 2 3
2
4 5 6
13 14 15 1x1 Convolution oh=2
10 11 12 3 4 5 6
19 20 21
16 17 18 ic=4 4
ic=4 22 23 24 oc=1
oc=2
ic=3 oc=2
1 4
batch=2
batch=2
1 2 3 1 2
* =
ic=3
2 5
4 5 6 3 4
3 6
Input Weight Output
© 2020 StradVision 20
Case Study – No FC Support
kh’=1
1
oc=2 iw’=batch=2 2 ow’=batch=2
oh’=1
ih’=1
ic=3 oc=2 3
oc’=oc=2
1 4 1 4 1 3
batch=2
batch=2
1 2 3 1 2
* = 2 5 4 2 4
ic=3
2 5
4 5 6 3 4 3 6 5
3 6 6
Input Weight Output Input Weight Output
FC 1x1 Conv
© 2020 StradVision 21
Case Study – No Custom Layer Support in Quantization
© 2020 StradVision 22
Case Study – No Custom Layer Support in Quantization
h’=batch*c=8
22 23 24 13 14 15 16 17 18
(batch)x(c)x(h)x(w) 19 20 21 22 23 24
batch=2
25 26 27 25 26 27 28 29 30
31 32 33 Subtract 128 & Reshape
28 29 30 31 32 33 34 35 36
37 38 39
34 35 36
43 44 45 37 38 39 40 41 42
40 41 42
46 47 48 43 44 45 46 47 48
2(batch)x4(c)x2(h)x3(w) 8(h’)x6(w’)
Tensor in Int8 format Gray image in Uint8 format
© 2020 StradVision 23
Case Study – Block-level Accumulation in Convolution
Input[0:8,:,:] Weight[:,0:8,:,:]
= Conv8(oc)x8(ic) ( 8(ic)x32(ih)x32(iw)
, 8(oc)x8(ic)x1(kh)x1(kw) )
Input[8:16,:,:] Weight[:,8:16,:,:]
+ Conv8(oc)x8(ic) ( 8(ic)x32(ih)x32(iw)
, 8(oc)x8(ic)x1(kh)x1(kw) )
Input[16:24,:,:] Weight[:,16:24,:,:]
+ Conv8(oc)x8(ic) ( 8(ic)x32(ih)x32(iw)
, 8(oc)x8(ic)x1(kh)x1(kw) )
© 2020 StradVision 24
Case Study – Block-level Accumulation in Convolution
© 2020 StradVision 25
Case Study – Block-level Accumulation in Convolution
oc=32
1x1 Conv Weight (dense) 1x1 Conv Weight (53% sparse)
© 2020 StradVision 26
Conclusions
© 2020 StradVision 27
Conclusions
© 2020 StradVision 28
Resources
• Sparsification
• “Exploring the Granularity of Sparsity in Convolution Neural Networks”, CVPR2017
https://openaccess.thecvf.com/content_cvpr_2017_workshops/w29/html/Mao_Ex
ploring_the_Granularity_CVPR_2017_paper.html
© 2020 StradVision 29