CV Course
Computer Vision 2023 / 2024 Assiut University | Faculty of computers and information 1
Motivation
Outline
❑ Common Architectures.
▪ AlexNet
▪ VGGNet
▪ GoogLeNet
▪ Inception v2, v3, and v4
▪ ResNet
➢ Residual Block
➢ ResNet v1, v2
Common Architectures
AlexNet | ILSVRC 2012 winner
VGGNet | ILSVRC 2014 2nd place
VGGNet | (design principle) More depth is better for the receptive field
▪ Suppose that:
▪ The input is H x W x C, where C is the depth of the input,
▪ and each convolution uses C filters, so the output feature map also has depth C (stride 1, with padding that preserves H and W).
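Under the setup above, one way to make the design principle concrete is to compare a stack of small filters with a single large filter. The sketch below is only an illustration: the function names are hypothetical, biases are ignored, and C = 64 is an arbitrary example depth. It uses the standard result that n stacked k x k convolutions with stride 1 see a window of 1 + n(k - 1) pixels.

```python
def stack_receptive_field(kernel: int, layers: int) -> int:
    """Receptive field of `layers` stacked kernel x kernel convolutions (stride 1)."""
    return 1 + layers * (kernel - 1)


def stack_weights(kernel: int, layers: int, channels: int) -> int:
    """Weight count (biases ignored) when every layer maps C channels to C channels."""
    return layers * kernel * kernel * channels * channels


C = 64  # example depth, chosen only for illustration

# Three stacked 3 x 3 layers versus one 7 x 7 layer.
print(stack_receptive_field(3, 3), stack_weights(3, 3, C))  # 7 110592
print(stack_receptive_field(7, 1), stack_weights(7, 1, C))  # 7 200704
```

Both options cover the same 7 x 7 window, but the stack needs 27C² weights instead of 49C² and applies three non-linearities instead of one.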
Let’s Code
VGG
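The code for this slide is not part of the extracted text. As a stand-in, here is a minimal plain-Python sketch that walks through a VGG-style layer list and counts parameters. It assumes the 16-layer layout from the VGG paper (configuration D), a 224 x 224 x 3 input, and 3 x 3 convolutions with padding that preserves H and W; the names `VGG16`, `FC`, and `count_vgg_params` are made up for this example, and a real model would be written in a deep-learning framework instead.

```python
# Numbers are the output depths of 3x3 convolutions; "M" is a 2x2 max-pool, stride 2.
VGG16 = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
         512, 512, 512, "M", 512, 512, 512, "M"]
FC = [4096, 4096, 1000]  # fully connected layer sizes (1000 ImageNet classes)


def count_vgg_params(cfg, fc_sizes, in_shape=(224, 224, 3)):
    """Return (total parameter count, feature-map shape before the FC layers)."""
    h, w, c = in_shape
    total = 0
    for item in cfg:
        if item == "M":
            h, w = h // 2, w // 2             # pooling halves H and W
        else:
            total += 3 * 3 * c * item + item  # weights + biases of one conv layer
            c = item                          # output depth = number of filters
    features = h * w * c                      # flattened input to the first FC layer
    for out in fc_sizes:
        total += features * out + out
        features = out
    return total, (h, w, c)


total, final_shape = count_vgg_params(VGG16, FC)
print(final_shape)  # (7, 7, 512)
print(total)        # 138357544
```

Most of the roughly 138M parameters sit in the first fully connected layer, not in the convolutional stack.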
GoogLeNet | 2015
[Figure: Inception module and full network architecture]
GoogLeNet | 2015
▪ Note that: a 1 x 1 convolution changes only the depth of the feature map; H and W stay the same.
▪ Remember that: the depth of the output feature map equals the number of filters.
▪ Remember that: the depth of each filter equals the depth of the input.
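The three points above can be checked with a tiny pure-Python 1 x 1 convolution. This is a sketch on nested lists with made-up sizes (H = 4, W = 5, input depth 8, 3 filters) and no bias term; `conv1x1` and `shape` are illustrative helpers, not part of any library.

```python
def conv1x1(feature_map, filters):
    """Apply a 1x1 convolution.

    feature_map: H x W x C_in nested lists.
    filters: C_out filters, each a list of C_in weights (filter depth = input depth).
    """
    return [[[sum(w * x for w, x in zip(f, pixel)) for f in filters]
             for pixel in row]
            for row in feature_map]


def shape(t):
    """Shape of an H x W x C nested list."""
    return (len(t), len(t[0]), len(t[0][0]))


H, W, C_in, C_out = 4, 5, 8, 3
fmap = [[[1.0] * C_in for _ in range(W)] for _ in range(H)]
filters = [[0.5] * C_in for _ in range(C_out)]

out = conv1x1(fmap, filters)
print(shape(fmap), "->", shape(out))  # (4, 5, 8) -> (4, 5, 3)
```

H and W are untouched, and the output depth is exactly the number of filters.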
Inception module
GoogLeNet | another version of GoogLeNet
Inception module
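GoogLeNet | another version of GoogLeNet
• Computational complexity! Each term is H x W x (number of filters) x (filter height) x (filter width) x (input depth).
Conv Ops (without 1 x 1 reduction):
[1x1 conv, 128] 28x28x128x1x1x256
[3x3 conv, 192] 28x28x192x3x3x256
[5x5 conv, 96] 28x28x96x5x5x256
Total: 854M ops
Conv Ops (with 1 x 1 reduction to depth 64):
[1x1 conv, 64] 28x28x64x1x1x256
[1x1 conv, 64] 28x28x64x1x1x256
[1x1 conv, 128] 28x28x128x1x1x256
[3x3 conv, 192] 28x28x192x3x3x64
[5x5 conv, 96] 28x28x96x5x5x64
[1x1 conv, 64] 28x28x64x1x1x256
Total: 358M ops as stated on the slide; the six terms listed here sum to about 271M ops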
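The totals can be recomputed directly from the listed terms. The sketch below assumes a 28 x 28 x 256 input, as implied by the terms, and counts one op per multiplication; `conv_ops` is an illustrative helper. The version without reduction reproduces the 854M figure, while the six terms of the reduced version as listed sum to about 271M, not 358M.

```python
def conv_ops(h, w, n_filters, k, in_depth):
    """Multiplications for one conv layer: H x W x filters x k x k x input depth."""
    return h * w * n_filters * k * k * in_depth


H = W = 28
D = 256  # input depth implied by the listed terms

# (number of filters, kernel size, input depth) for each convolution
without_reduction = [(128, 1, D), (192, 3, D), (96, 5, D)]
with_reduction = [(64, 1, D), (64, 1, D), (128, 1, D),
                  (192, 3, 64), (96, 5, 64), (64, 1, D)]

total_without = sum(conv_ops(H, W, *layer) for layer in without_reduction)
total_with = sum(conv_ops(H, W, *layer) for layer in with_reduction)

print(total_without)  # 854196224
print(total_with)     # 271351808
```

Either way, reducing the depth to 64 before the 3 x 3 and 5 x 5 convolutions removes most of the cost.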
GoogLeNet
Inception module
Inception v2, v3 (2016)
[Figure: V2 and V3 panels]
C. Szegedy et al., Rethinking the inception architecture for computer vision, CVPR 2016
Inception v4
C. Szegedy et al., Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, arXiv 2016
ResNet | Residual Block
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, Deep Residual Learning for Image Recognition, CVPR 2016 (Best Paper)
ResNet | v1 & v2
Let’s Code
ResNet
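The code for this slide is not part of the extracted text. As a stand-in, here is a toy plain-Python sketch of the residual idea, output = F(x) + x, on vectors instead of feature maps. It is heavily simplified: `f` stands for the stack of weight layers, batch normalization is omitted, and the helper names are hypothetical. The two functions differ only in where the ReLU sits, which is the main difference between the v1 (post-activation) and v2 (pre-activation) blocks.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def residual_block_v1(x, f):
    """v1 style (post-activation): ReLU is applied after the addition."""
    return relu(add(f(x), x))


def residual_block_v2(x, f):
    """v2 style (pre-activation): ReLU comes before the weights; the sum is left untouched."""
    return add(f(relu(x)), x)


def zero_branch(v):
    """A residual branch that has learned to output nothing."""
    return [0.0 for _ in v]


x = [1.0, -2.0, 3.0]
print(residual_block_v1(x, zero_branch))  # [1.0, 0.0, 3.0]
print(residual_block_v2(x, zero_branch))  # [1.0, -2.0, 3.0]
```

With a zero residual branch, the v2 ordering passes x through unchanged, while the v1 ordering still clips negative values at the final ReLU.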
Thank You