AI Lec-08
Thien Huynh-The
HCM City Univ. Technology and Education
Jan, 2023
Challenges in Deep Learning
Local minima
• The objective function of deep learning usually has many local minima
• The numerical solution obtained by the final iteration may only minimize the
objective function locally, rather than globally.
• As the gradient of the objective function approaches or becomes zero near such points, the optimizer stalls there and makes no further progress.
Exploding gradient
• On the contrary, in some cases the gradients keep getting larger and larger as the
backpropagation algorithm progresses. This in turn causes very large weight updates
and causes gradient descent to diverge. This is known as the exploding-gradient problem.
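A toy numeric sketch (my own illustration, not from the slides) shows why deep chains explode or vanish: by the chain rule, backpropagation multiplies the gradient by one factor per layer, so any factor consistently above or below 1 grows or shrinks the gradient geometrically.

```python
# Toy model: a deep chain of linear layers that all share the same weight w.
# Backprop multiplies the upstream gradient by w once per layer, so the
# gradient magnitude behaves like w ** num_layers.
def backprop_gradient_magnitude(w, num_layers):
    grad = 1.0  # gradient of the loss w.r.t. the last layer's output
    for _ in range(num_layers):
        grad *= w  # chain rule: each layer contributes a factor of w
    return grad

print(backprop_gradient_magnitude(1.5, 50))  # ~ 1.5**50: explodes
print(backprop_gradient_magnitude(0.5, 50))  # ~ 0.5**50: vanishes instead
```

With 50 layers, a per-layer factor of 1.5 already yields a gradient in the hundreds of millions, while 0.5 drives it below 1e-15.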
Which activation is better at preventing the vanishing-gradient problem in a neural network
with many activation layers: sigmoid or ReLU?
[Figure: Gradient descent with 2 variables]
[Figure: Gradient descent with 2 variables (another example)]
Momentum
• Instead of using only the gradient of the current step to guide the search,
momentum also accumulates the gradient of the past steps to determine the
direction to go.
• The gradient descent update is revised as follows: the velocity v_t = γ·v_{t−1} + η·∇f(w_{t−1}) accumulates past gradients, and the weights move along it: w_t = w_{t−1} − v_t.
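The momentum update can be sketched in a few lines (a minimal illustration of the standard rule; the names `lr`, `beta`, and the test function are my own choices, not from the slides):

```python
# Momentum gradient descent sketch:
#   v_t = beta * v_{t-1} + lr * grad(w_{t-1})   (accumulate past gradients)
#   w_t = w_{t-1} - v_t                          (move along the accumulated direction)
def momentum_descent(grad, w, lr=0.1, beta=0.9, steps=200):
    v = 0.0
    for _ in range(steps):
        v = beta * v + lr * grad(w)  # velocity remembers previous steps
        w = w - v                    # update the parameter
    return w

# Usage: minimize f(w) = w**2, whose gradient is 2*w; the minimum is at w = 0.
w_opt = momentum_descent(lambda w: 2.0 * w, w=5.0)
print(w_opt)  # close to 0
```

Because the velocity averages past gradients, momentum damps oscillations across steep directions while accelerating along consistent ones.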
Adagrad
• Decay the learning rate for parameters in proportion to their update history
• Adapts the learning rate to the parameters, performing smaller updates (low learning rates) for
parameters associated with frequently occurring features, and larger updates (high learning rates)
for parameters associated with infrequent features
• It is well-suited for dealing with sparse data
• Adagrad greatly improved the robustness of SGD and has been used for training large-scale neural nets
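The per-parameter decay described above can be sketched as follows (my own minimal illustration; `lr` and `eps` values are assumptions, and a real implementation would vectorize this over all parameters):

```python
import math

# Adagrad sketch: accumulate the squared gradients of each parameter
# (its update history) and divide the learning rate by the square root
# of that accumulator, so frequently-updated parameters take smaller steps.
def adagrad_step(w, grad, accum, lr=0.5, eps=1e-8):
    accum = accum + grad * grad                     # G_t = G_{t-1} + g_t**2
    w = w - lr * grad / (math.sqrt(accum) + eps)    # scaled update
    return w, accum

# Usage: minimize f(w) = w**2 with gradient 2*w; the minimum is at w = 0.
w, G = 5.0, 0.0
for _ in range(500):
    w, G = adagrad_step(w, 2.0 * w, G)
print(w)  # close to 0
```

Note the trade-off the slides hint at: the accumulator only grows, so the effective learning rate shrinks monotonically, which is why later variants such as RMSProp and Adam replace the sum with a decaying average.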
Assignment 2 (mandatory)
Design a multilayer neural network with an input layer, 02 hidden layers (sigmoid), and
an output layer (softmax). Apply the optimization methods Momentum and Adam.
Compare the accuracy and convergence time of the two methods. Assume that the
MNIST dataset is used for training and testing the neural network. Important: the
use of built-in functions is prohibited.
Students submit the Python code on Google Classroom.