DeepLearning.AI makes these slides available for educational purposes. You may not use or distribute
these slides for commercial purposes. You may make copies of these slides and use or distribute them for
educational purposes as long as you cite DeepLearning.AI as the source of the slides.
RNNs: Advantages
● Captures dependencies within a short range
● Learnable parameters, the same at every step
● Trained with backpropagation through time

Solutions to vanishing gradients
● Gradient clipping
● Skip connections
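Gradient clipping can be sketched in a few lines. Below is an illustrative NumPy version (the function name and threshold are my own, not the course's code): if the gradient's L2 norm exceeds a threshold, rescale it to that norm.

```python
import numpy as np

def clip_by_norm(grad, max_norm=5.0):
    """Rescale the gradient if its L2 norm exceeds max_norm."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)  # shrink, keep direction
    return grad
```

Clipping keeps the update direction but caps its magnitude, which prevents a single exploding gradient from destabilizing training.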
Introduction to LSTMs
deeplearning.ai
Outline
● Meet the Long Short-Term Memory unit!
● LSTM architecture
● Applications
LSTMs: a memorable solution
● Learns when to remember and when to forget
● Basic anatomy:
○ A cell state
○ A hidden state
○ Multiple gates
Gates in LSTM
[Diagram: the gates process the input and hidden state to produce the output]
Applications of LSTMs
● Next-character prediction
● Chatbots
● Music composition
● Image captioning
● Speech recognition
Summary
● LSTMs offer a solution to vanishing gradients
LSTM Unit
● Three gates: 1. Forget gate, 2. Input gate, 3. Output gate (selects the information to use at the current step)
● Each gate uses a sigmoid, whose output is between 0 and 1: 0 = closed, 1 = open
● Candidate cell state: tanh shrinks its argument to be between -1 and 1 (other activations could be used)
● New cell state: the previous cell state, scaled by the forget gate, plus the candidate, scaled by the input gate
● New hidden state: the output gate applied to the tanh of the new cell state
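One LSTM step can be sketched with NumPy. The weight packing, function name, and shapes below are illustrative assumptions, not the course's implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W packs the forget, input, output, and candidate
    weights row-wise; W has shape (4H, H+D), b has shape (4H,)."""
    H = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    f = sigmoid(z[:H])            # forget gate: 0 = closed, 1 = open
    i = sigmoid(z[H:2*H])         # input gate
    o = sigmoid(z[2*H:3*H])       # output gate
    c_cand = np.tanh(z[3*H:])     # candidate cell state, in (-1, 1)
    c = f * c_prev + i * c_cand   # new cell state
    h = o * np.tanh(c)            # new hidden state
    return h, c
```

The cell state `c` carries long-range information forward, while the gates decide what to forget, what to write, and what to expose as the hidden state.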
Applications of NER systems
● Recommendation engines
● Customer service
● Automatic trading
Training NERs: Data Processing
deeplearning.ai
Outline
● Token padding
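Token padding just extends every sequence in a batch to a common length with a padding token. A minimal sketch (the function name and pad id are assumptions, not the assignment's code):

```python
def pad_batch(sequences, pad_id=0):
    """Pad each list of token ids to the length of the longest sequence."""
    max_len = max(len(seq) for seq in sequences)
    return [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
```

Equal-length sequences are what lets a batch be stacked into a single array for the model.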
Layers in Trax

from trax import layers as tl

# Embedding → LSTM → Dense → LogSoftmax (sizes are illustrative)
model = tl.Serial(
    tl.Embedding(vocab_size=vocab_size, d_feature=d_model),
    tl.LSTM(n_units=d_model),
    tl.Dense(n_units=n_classes),
    tl.LogSoftmax()
)
[Code slide: compute accuracy over predictions, then return accuracy]
Summary
● If using padding tokens, remember to mask them when computing accuracy
● Coding assignment!
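Masking padding tokens when computing accuracy can be sketched like this (NumPy; the names and pad id are assumptions, not the assignment's solution):

```python
import numpy as np

def masked_accuracy(predictions, labels, pad_id=0):
    """Accuracy over non-padding positions only."""
    mask = labels != pad_id                   # True where the token is real
    correct = (predictions == labels) & mask  # matches on real tokens
    return correct.sum() / mask.sum()
```

Dividing by `mask.sum()` rather than the sequence length keeps padded positions from inflating the score.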