GoogLeNet


Unveiling GoogLeNet:
A Deep Dive into Google's State-of-the-Art Convolutional Neural Network Architecture

Presented By: Muhammad Mohsin Zafar


GoogLeNet
Introduction

• A deep Convolutional Neural Network (CNN) developed by researchers at Google.
• It was introduced in 2014.
• Named GoogLeNet because it was developed at Google, and LeNet was the first structured CNN.
• Prior to GoogLeNet, models mainly tried to attain better generalization by adding more layers. This results in higher complexity, because as the network goes deeper, the number of parameters grows rapidly.
GoogLeNet
ILSVRC

• Won the 2014 ILSVRC (ImageNet Large-Scale Visual Recognition Challenge) with a top-5 error rate of 6.67%.
• The ILSVRC 2014 classification challenge involves classifying an image into one of 1,000 categories in the ImageNet hierarchy. There are about 1.2 million images for training, 50,000 for validation, and 100,000 for testing.
GoogLeNet
Motivation (Going Deeper)

Researchers noted that as the number of convolutional layers increases, results keep getting better. But, as you can imagine, this often creates complications:
• The bigger the model, the more prone it is to overfitting. This is particularly noticeable when the training data is small.
• Increasing the number of parameters means you need more computational resources.
• Problems like vanishing gradients can occur when training very deep models.
Going Deeper and Wider
(Naïve Inception)

• GoogLeNet addressed the complexity problem by making the network wider instead of only deeper, introducing the concept of the inception module.
• The first inception module, called the naïve inception module, applies three convolutions with different kernel (filter) sizes (1x1, 3x3 and 5x5) at the same level, together with a 3x3 max pooling.
• The outputs of these three convolutions and the pooling are concatenated and fed to the next inception module, as sketched below.
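To make the structure concrete, here is a minimal PyTorch sketch of a naïve inception block of the kind described above; the branch channel counts and padding choices are illustrative assumptions, not values taken from the GoogLeNet paper.

```python
import torch
import torch.nn as nn

class NaiveInception(nn.Module):
    """Naive inception block: parallel 1x1, 3x3, 5x5 convolutions plus 3x3 max pooling."""
    def __init__(self, in_channels, c1, c3, c5):
        super().__init__()
        # Three convolutions with different kernel sizes at the same level.
        # Padding keeps the spatial size identical so the outputs can be concatenated.
        self.branch1 = nn.Conv2d(in_channels, c1, kernel_size=1)
        self.branch3 = nn.Conv2d(in_channels, c3, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(in_channels, c5, kernel_size=5, padding=2)
        # 3x3 max pooling branch (stride 1 to preserve the spatial size).
        self.pool = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)

    def forward(self, x):
        # Outputs of the three convolutions and the pooling are concatenated
        # along the channel dimension and passed on to the next module.
        return torch.cat(
            [self.branch1(x), self.branch3(x), self.branch5(x), self.pool(x)], dim=1
        )

# Example: a 28x28 feature map with 192 input channels (illustrative numbers).
x = torch.randn(1, 192, 28, 28)
print(NaiveInception(192, 64, 128, 32)(x).shape)  # torch.Size([1, 416, 28, 28])
```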
Naïve Inception

• The problem is that a large number of operations and parameters is generated when a 5x5 convolution with many filters is applied directly, which results in a complex model.
• Even if this architecture covered the optimal sparse structure, it would do so very inefficiently, leading to a computational blow-up within a few stages.
Inception Module with Dimension Reduction

• Due to the complexity of the naïve inception module, a 1x1 convolution is employed before the 3x3 and 5x5 convolutions.
• With a 1x1 convolution placed before the 5x5, the combined cost of the 1x1 and 5x5 convolutions is lower than that of a 5x5 convolution used alone.
• By introducing the 1x1 convolution, the number of filters can be increased without increasing the complexity of the model (see the sketch below).
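Below is a minimal PyTorch sketch of the same block with 1x1 dimension reduction in front of the 3x3 and 5x5 branches, and a 1x1 projection after the pooling branch; the reduction sizes r3 and r5 are illustrative parameters, not prescribed values.

```python
import torch
import torch.nn as nn

class InceptionReduced(nn.Module):
    """Inception block with 1x1 convolutions used as dimension reductions."""
    def __init__(self, in_channels, c1, r3, c3, r5, c5, c_pool):
        super().__init__()
        self.branch1 = nn.Conv2d(in_channels, c1, kernel_size=1)
        # A 1x1 convolution shrinks the channel count before the expensive 3x3 / 5x5.
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_channels, r3, kernel_size=1),
            nn.Conv2d(r3, c3, kernel_size=3, padding=1),
        )
        self.branch5 = nn.Sequential(
            nn.Conv2d(in_channels, r5, kernel_size=1),
            nn.Conv2d(r5, c5, kernel_size=5, padding=2),
        )
        # The pooling branch gets a 1x1 projection to keep its width under control.
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_channels, c_pool, kernel_size=1),
        )

    def forward(self, x):
        return torch.cat(
            [self.branch1(x), self.branch3(x), self.branch5(x), self.branch_pool(x)],
            dim=1,
        )
```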
Complexity Comparison

Suppose we need to perform a 5×5 convolution with 48 filters on a 14×14×480 feature map.

Without a 1×1 convolution:
• Number of operations = (14×14×480) × (5×5×48) = 112.9M

With a 1×1 convolution (reducing to 16 channels first):
• Number of operations for 1×1 = (14×14×480) × (1×1×16) = 1.5M
• Number of operations for 5×5 = (14×14×16) × (5×5×48) = 3.8M
• Total number of operations = 1.5M + 3.8M = 5.3M

5.3M is much smaller than 112.9M (see the check below).
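The figures above can be reproduced with a few lines of Python (multiplications only, following the slide's counting convention):

```python
# Operation counts for a 14x14x480 input feature map and 48 output filters.
direct_5x5 = (14 * 14 * 480) * (5 * 5 * 48)   # 5x5 convolution applied directly
reduce_1x1 = (14 * 14 * 480) * (1 * 1 * 16)   # 1x1 reduction down to 16 channels
then_5x5 = (14 * 14 * 16) * (5 * 5 * 48)      # 5x5 convolution on the reduced input

print(f"Direct 5x5:   {direct_5x5 / 1e6:.1f}M")                  # 112.9M
print(f"1x1 then 5x5: {(reduce_1x1 + then_5x5) / 1e6:.1f}M")     # 5.3M
```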
General Architecture

• It is already a very deep model compared with earlier networks such as AlexNet, ZFNet and VGGNet.
• Numerous inception modules are stacked one after another to go deeper, as sketched below.
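As a rough sketch of how the modules are chained (assuming the InceptionReduced class from the earlier sketch is in scope), the stem below is simplified and the channel numbers are only meant to show how the output width of one module becomes the input width of the next:

```python
import torch.nn as nn

# Simplified stem: ordinary convolutions and pooling before the inception stages.
stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3), nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
    nn.Conv2d(64, 192, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
)

# Inception modules stacked one after another to go deeper.
body = nn.Sequential(
    InceptionReduced(192, 64, 96, 128, 16, 32, 32),    # concatenated output: 256 channels
    InceptionReduced(256, 128, 128, 192, 32, 96, 64),  # concatenated output: 480 channels
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
    # ... further inception modules follow in the full network
)
```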
Global Average Pooling

• Previously, fully connected layers were used at the end of the architecture, which adds a lot of parameters and increases computational complexity. In GoogLeNet, global average pooling is introduced instead.
• A fully connected layer over the final 7×7×1024 feature map with 1024 outputs would need 7×7×1024×1024 ≈ 51.3M weights (connections).
• In GoogLeNet, global average pooling is used near the end of the network, averaging each feature map from 7×7 down to 1×1. Number of weights = 0.
• The move from FC layers to average pooling improved top-1 accuracy by about 0.6% (see the sketch below).
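A minimal PyTorch sketch of the two heads, assuming a final 7×7×1024 feature map and 1,000 ImageNet classes, makes the difference in weight count explicit:

```python
import torch
import torch.nn as nn

# Fully connected head: flatten the 7x7x1024 feature map and map it to 1024 units.
fc_head = nn.Sequential(nn.Flatten(), nn.Linear(7 * 7 * 1024, 1024))

# Global-average-pooling head: average each 7x7 map down to 1x1 (no weights),
# leaving only the final 1024 -> 1000 classifier.
gap_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(1024, 1000))

x = torch.randn(1, 1024, 7, 7)
fc_weights = sum(p.numel() for p in fc_head[1].parameters() if p.dim() > 1)
print(fc_weights)           # 51380224, i.e. the ~51.3M connections quoted above
print(gap_head(x).shape)    # torch.Size([1, 1000])
```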
Detailed Architecture
Inception Module Algorithm
GoogLeNet
Pros:

1. Reduced number of operations and computational cost because of the use of 1x1 convolutions.
2. Capable of capturing features at multiple scales with the help of different kernel sizes (3x3 and 5x5) at the same level.
3. Also performs well when working with low-contrast images.
4. Highly scalable to the available resources because of its modular architecture.
5. Avoids the vanishing gradient problem to some extent by introducing auxiliary classifiers (sketched below).
6. Requires less memory compared to previous models like VGG and AlexNet.
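For point 5, here is a minimal PyTorch sketch of an auxiliary classifier head of the kind GoogLeNet attaches to intermediate inception outputs; the 512 input channels assume it sits on a 14×14×512 feature map, as the first auxiliary head in the paper does.

```python
import torch.nn as nn

# Auxiliary classifier: 5x5 average pooling (stride 3), a 128-filter 1x1 convolution,
# a 1024-unit FC layer with dropout, and a 1000-way classifier.
aux_classifier = nn.Sequential(
    nn.AvgPool2d(kernel_size=5, stride=3),
    nn.Conv2d(512, 128, kernel_size=1), nn.ReLU(inplace=True),
    nn.Flatten(),
    nn.Linear(128 * 4 * 4, 1024), nn.ReLU(inplace=True),
    nn.Dropout(p=0.7),
    nn.Linear(1024, 1000),
)
# During training, its loss is added to the main loss with a small weight (0.3 in
# the paper), injecting extra gradient into the middle of the network; at inference
# time the auxiliary heads are discarded.
```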
GoogLeNet
Cons:

1. Due to the complex architecture, it is harder to interpret.
2. Still requires a substantial amount of memory.
3. Possible feature loss when moving from one module to another.
References:

Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., ... & Rabinovich, A. (2015).
Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision
and pattern recognition (pp. 1-9).
