
Deep Neural Networks

RNN (Recurrent Neural Network)

Graphical representation (unrolled over time):

The input and the output can each be only one element or a whole sequence.
The same weights are reused at every iteration (time step).
Because of this weight sharing, the gradients from all time steps are summed during backprop!
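A minimal sketch of this weight sharing (NumPy; names and shapes are assumed, not taken from the slide): the same W_xh and W_hh are applied at every time step.

    import numpy as np

    def rnn_forward(xs, W_xh, W_hh, b_h):
        # xs: list of input vectors, one per time step
        h = np.zeros(W_hh.shape[0])                # initial hidden state
        hs = []
        for x in xs:                               # the same weights are reused at every step
            h = np.tanh(W_xh @ x + W_hh @ h + b_h)
            hs.append(h)
        return hs                                  # hidden states for all time steps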

Backpropagation through time (BPTT)

The same shared weights appear at every time step of the unrolled graph, so the full gradient is complex; it is computed part by part, one time step at a time, and the per-step contributions are summed.
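A sketch of that part-by-part computation (assumed to continue the forward pass above; names are illustrative): the gradient of the shared weight W_hh is accumulated step by step while walking backward in time.

    import numpy as np

    def bptt_Whh_grad(hs, W_hh, dL_dhs):
        # hs: hidden states from rnn_forward; dL_dhs[t]: dLoss/dh_t at step t
        dW_hh = np.zeros_like(W_hh)
        dh_next = np.zeros(W_hh.shape[0])
        for t in reversed(range(len(hs))):
            dh = dL_dhs[t] + dh_next               # gradient arriving at h_t
            dpre = dh * (1.0 - hs[t] ** 2)         # back through the tanh nonlinearity
            h_prev = hs[t - 1] if t > 0 else np.zeros_like(hs[t])
            dW_hh += np.outer(dpre, h_prev)        # contributions are SUMMED over time steps
            dh_next = W_hh.T @ dpre                # pass the gradient one step back in time
        return dW_hh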

LSTM:

In practice, the deepest (in time) you can actually go with RNNs: the gated, additive cell update mitigates vanishing gradients over long sequences.
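For reference, the standard LSTM cell equations (notation assumed, not reproduced from the slide):

    i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)             (input gate)
    f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)             (forget gate)
    o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)             (output gate)
    \tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)      (candidate cell state)
    c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t    (additive cell update)
    h_t = o_t \odot \tanh(c_t)                         (hidden state)

The additive form of the cell update is what lets gradients flow across many time steps.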

Auto-encoder: learns a latent representation by encoding the input and decoding it back.

Minimizing the reconstruction error:
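Presumably the usual objective (notation assumed): with an encoder f_\phi producing the latent code and a decoder g_\theta reconstructing the input,

    \min_{\theta, \phi} \; \frac{1}{N} \sum_{i=1}^{N} \| x_i - g_\theta(f_\phi(x_i)) \|^2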

GAN (Generative Adversarial Net):

Two players, a generator and a discriminator, competing against each other!
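The standard formulation (Goodfellow et al., 2014), presumably what the slide shows: the generator G and the discriminator D play the minimax game

    \min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]

D is trained to tell real samples from generated ones, while G is trained to fool D.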

DNN Theory

A few primary theoretical questions on DNNs

• Which classes of functions can deep networks approximate and learn well? Is this any different from what shallow networks can do?

• Why is stochastic gradient descent (SGD) so unreasonably efficient?

• Overparametrization may explain why minima are easy to find during training, but then why does over-fitting seem to be less of a problem than for classical shallow networks?

Deep Learning and Curse of Dimensionality

We are considering question 1.

The main message will be that deep networks have a theoretical guarantee, which shallow networks do not have: they can avoid the curse of dimensionality for an important class of problems.

We will be considering:
Compositional functions (functions of functions) -> hierarchically local compositional functions (each constituent function depends on only a small, bounded number of variables); an example is sketched below.

Of course, we mean networks of the deep convolutional type!
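An illustrative example (assumed, not taken from the slide): a hierarchically local compositional function of n = 8 variables with a binary-tree structure, in which every constituent function takes only 2 inputs:

    f(x_1, \dots, x_8) = h_3\big( h_{21}(h_{11}(x_1, x_2), h_{12}(x_3, x_4)),\; h_{22}(h_{13}(x_5, x_6), h_{14}(x_7, x_8)) \big)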

The key aspect of convolutional networks that can give them an
advantage is locality at each level of the hierarchy.

Function approximation by deep networks

Degree of approximation

Assume:
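A standard setup in this line of work (e.g., Poggio, Mhaskar et al., 2017), which these slides presumably follow: assume the target function f : [0,1]^n \to \mathbb{R} has continuous partial derivatives up to order m. Then, to reach a uniform approximation error \epsilon, the number of units N needed is roughly

    shallow network:  N = O(\epsilon^{-n/m})   (exponential in the input dimension n)
    deep network matching a hierarchically local compositional f:  N = O((n-1)\,\epsilon^{-2/m})   (linear in n)

This is the precise sense in which deep, but not shallow, networks can avoid the curse of dimensionality for compositional functions.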

Shallow and deep networks
