Seq2seq, Attention, Self-attention, Transformer, BERT

Tuan Nguyen - AI4E
Outline
● RNN review
● Seq2seq
● Beam search
● Attention
● Self-attention
● Transformer
● BERT
Recurrent Neural Network
Usually drawn as:
[Figure: an RNN cell with input x, output y, and a feedback loop on its hidden state]
RNN Formula
The state consists of a single “hidden” vector h:
h_t = f_W(h_{t-1}, x_t), typically h_t = tanh(W_hh · h_{t-1} + W_xh · x_t) and y_t = W_hy · h_t

[Figure: forward pass unrolled over time: (x1, h0) → h1 → y1 with cost C1, (x2, h1) → h2 → y2 with cost C2, (x3, h2) → h3 → y3 with cost C3]
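As a quick sanity check of the formula above, here is a minimal numpy sketch of one forward step; the weight names (W_xh, W_hh, W_hy) and the sizes are illustrative, not taken from the slides.

```python
import numpy as np

def rnn_step(x, h_prev, W_xh, W_hh, W_hy, b_h, b_y):
    """One forward step of a vanilla RNN: new hidden state and output."""
    h = np.tanh(W_xh @ x + W_hh @ h_prev + b_h)   # h_t = tanh(W_xh x_t + W_hh h_{t-1})
    y = W_hy @ h + b_y                            # y_t = W_hy h_t
    return h, y

# Illustrative sizes: 4-dim input (e.g. a one-hot character), 8-dim hidden state
rng = np.random.default_rng(0)
W_xh, W_hh, W_hy = rng.normal(size=(8, 4)), rng.normal(size=(8, 8)), rng.normal(size=(4, 8))
b_h, b_y = np.zeros(8), np.zeros(4)

h = np.zeros(8)                      # h0
for x in np.eye(4)[[0, 1, 2, 2]]:    # a short one-hot input sequence
    h, y = rnn_step(x, h, W_xh, W_hh, W_hy, b_h, b_y)
```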
Deep RNN

[Figure: a deep (stacked) RNN unrolled over time; within each layer, the same parameters are shared across all time steps]
Recurrent neural network problem: gradients vanish or explode over long sequences, so plain RNNs struggle with long-range dependencies.
Character-level language model example

Vocabulary: [h, e, l, o]

Example training sequence: "hello"
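A small sketch of how the "hello" example is set up: indices over the vocabulary [h, e, l, o], one-hot encoded, with each character used to predict the next one. The encoding details are illustrative, not the exact slide code.

```python
import numpy as np

vocab = ['h', 'e', 'l', 'o']
char_to_ix = {c: i for i, c in enumerate(vocab)}

text = "hello"
# Each input character predicts the next one: "hell" -> "ello"
inputs  = [char_to_ix[c] for c in text[:-1]]   # [0, 1, 2, 2]
targets = [char_to_ix[c] for c in text[1:]]    # [1, 2, 2, 3]

# One-hot encode the inputs for an RNN like the one sketched above
X = np.eye(len(vocab))[inputs]
print(X.shape)   # (4, 4): four time steps, vocabulary size 4
print(targets)   # indices the softmax output should assign high probability to
```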
Long short-term memory (LSTM)
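A minimal numpy sketch of the standard LSTM cell update (input, forget, and output gates plus a candidate state); the stacked-weight layout and sizes are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W maps [x; h_prev] to the stacked gate pre-activations."""
    z = W @ np.concatenate([x, h_prev]) + b
    H = h_prev.shape[0]
    i = sigmoid(z[0*H:1*H])        # input gate
    f = sigmoid(z[1*H:2*H])        # forget gate
    o = sigmoid(z[2*H:3*H])        # output gate
    g = np.tanh(z[3*H:4*H])        # candidate cell state
    c = f * c_prev + i * g         # new cell state
    h = o * np.tanh(c)             # new hidden state
    return h, c

# Illustrative sizes: 4-dim input, 8-dim hidden/cell state
rng = np.random.default_rng(0)
W = rng.normal(size=(4 * 8, 4 + 8))
b = np.zeros(4 * 8)
h, c = np.zeros(8), np.zeros(8)
h, c = lstm_step(np.eye(4)[0], h, c, W, b)
```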
Translation model
Seq2seq
Seq2seq - prediction
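A schematic sketch of seq2seq prediction: an encoder RNN compresses the source sequence into its final hidden state, which initialises a decoder that emits one token at a time and feeds each prediction back in. All names, weights, and token ids below are toy placeholders, not a specific framework API; the argmax is greedy decoding, which the beam-search sketch further below generalises.

```python
import numpy as np

def encode(src_ids, embed, rnn_step, h0):
    """Run the encoder over the source sequence; return its final hidden state."""
    h = h0
    for tok in src_ids:
        h = rnn_step(embed[tok], h)
    return h

def greedy_decode(h, embed, rnn_step, out_proj, sos_id, eos_id, max_len=20):
    """Emit one token at a time, feeding each prediction back as the next input."""
    tok, out = sos_id, []
    for _ in range(max_len):
        h = rnn_step(embed[tok], h)
        tok = int(np.argmax(out_proj @ h))   # pick the most likely next token
        if tok == eos_id:
            break
        out.append(tok)
    return out

# Toy setup: vocabulary of 6 tokens, embedding size 5, hidden size 8, random weights
rng = np.random.default_rng(0)
V, H, E = 6, 8, 5
embed = rng.normal(size=(V, E))
W_x, W_h = rng.normal(size=(H, E)), rng.normal(size=(H, H))
out_proj = rng.normal(size=(V, H))
step = lambda x, h: np.tanh(W_x @ x + W_h @ h)

h_src = encode([2, 3, 4], embed, step, np.zeros(H))
print(greedy_decode(h_src, embed, step, out_proj, sos_id=0, eos_id=1))
```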
Greedy search
Beam search
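A sketch contrasting the two decoders: greedy search keeps only the single best token at each step, while beam search keeps the k highest-scoring partial sequences. The scoring function here is a made-up stand-in for a real decoder.

```python
import numpy as np

def beam_search(next_log_probs, sos_id, eos_id, beam_size=3, max_len=10):
    """Keep the beam_size best partial sequences instead of only the greedy best."""
    beams = [([sos_id], 0.0)]                       # (token prefix, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            if prefix[-1] == eos_id:                # finished hypotheses are kept as-is
                candidates.append((prefix, score))
                continue
            log_p = next_log_probs(prefix)
            for tok in np.argsort(log_p)[-beam_size:]:
                candidates.append((prefix + [int(tok)], score + float(log_p[tok])))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams[0][0]

# Toy stand-in for a decoder: the same next-token distribution at every step
toy_log_p = np.log(np.array([0.05, 0.10, 0.40, 0.25, 0.15, 0.05]))
# Greedy search would simply take int(np.argmax(toy_log_p)) at each step
print(beam_search(lambda prefix: toy_log_p, sos_id=0, eos_id=1))
```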
Word embeddings (GloVe, word2vec)
Word semantics
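A toy illustration of the "semantics as geometry" idea behind word2vec/GloVe: related words have high cosine similarity, and relations appear as vector offsets (king - man + woman ≈ queen). The 3-dimensional vectors below are invented for illustration; real embeddings are learned from corpora and are typically 100-300 dimensional.

```python
import numpy as np

# Hypothetical toy embeddings (real GloVe/word2vec vectors are learned, not hand-picked)
emb = {
    "king":  np.array([0.8, 0.7, 0.1]),
    "queen": np.array([0.8, 0.7, 0.9]),
    "man":   np.array([0.2, 0.1, 0.1]),
    "woman": np.array([0.2, 0.1, 0.9]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(emb["king"], emb["queen"]))                  # high: related words
analogy = emb["king"] - emb["man"] + emb["woman"]         # relation as a vector offset
print(max(emb, key=lambda w: cosine(emb[w], analogy)))    # -> "queen" in this toy space
```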
Attention - motivation
Attention

[Figure: a query vector q scored against keys k1, k2, k3]
Attention
Attention function
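A numpy sketch of the attention function in its scaled dot-product form: score the query against each key, softmax the scores into weights, and return the weighted sum of the values. Shapes are illustrative.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of each query with each key
    weights = softmax(scores, axis=-1)   # attention weights sum to 1 over the keys
    return weights @ V, weights

# One query attending over three key/value pairs (like the q, k1..k3 picture above)
rng = np.random.default_rng(0)
q = rng.normal(size=(1, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
context, w = attention(q, K, V)
print(w)   # three weights summing to 1
```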
Self-attention
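Self-attention applies the same function within a single sequence: queries, keys, and values are all linear projections of the same input X. A minimal sketch with illustrative sizes and no masking.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Each position attends to every position of the same sequence X."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    weights = softmax(Q @ K.T / np.sqrt(K.shape[-1]), axis=-1)
    return weights @ V

# 5 tokens with 8-dim representations, projected to 8-dim Q/K/V
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)   # (5, 8): one output vector per token
```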
Transformer
Seq2seq
Transformer
Multi-head attention
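Multi-head attention runs several attention "heads" in parallel on lower-dimensional projections and concatenates their outputs before a final projection; a sketch with illustrative sizes (2 heads of width 4 for a model width of 8).

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, heads, W_o):
    """heads is a list of (W_q, W_k, W_v) triples, one per attention head."""
    outputs = []
    for W_q, W_k, W_v in heads:
        Q, K, V = X @ W_q, X @ W_k, X @ W_v
        weights = softmax(Q @ K.T / np.sqrt(K.shape[-1]), axis=-1)
        outputs.append(weights @ V)
    return np.concatenate(outputs, axis=-1) @ W_o   # concat heads, then project

# 5 tokens, model width 8, 2 heads of width 4 each
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
heads = [tuple(rng.normal(size=(8, 4)) for _ in range(3)) for _ in range(2)]
W_o = rng.normal(size=(8, 8))
print(multi_head_attention(X, heads, W_o).shape)    # (5, 8)
```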
BERT
BERT model

● Masked Language Modeling (MLM)
● Next Sentence Prediction (NSP)
Masked LM (MLM)
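A sketch of how an MLM training example can be built: roughly 15% of the tokens are replaced by a [MASK] token and the model is trained to recover the originals at those positions. (BERT additionally swaps some selected tokens for random or unchanged tokens; that refinement is omitted here.)

```python
import random

def make_mlm_example(tokens, mask_token="[MASK]", mask_prob=0.15, seed=1):
    """Return (masked tokens, {position: original token}) for MLM training."""
    rng = random.Random(seed)
    masked, labels = list(tokens), {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels[i] = tok           # the model must recover this token...
            masked[i] = mask_token    # ...from this masked position
    return masked, labels

tokens = "the man went to the store to buy a gallon of milk".split()
masked, labels = make_mlm_example(tokens)
print(masked)   # original sentence with some positions replaced by [MASK]
print(labels)   # {position: original token} pairs the model is trained to predict
```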
Next sentence prediction
Fine-tuning BERT
● Classification tasks such as sentiment analysis.
● Question answering tasks (e.g. SQuAD v1.1).
● Named entity recognition (NER).
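A minimal fine-tuning sketch for the classification case, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint; the texts, labels, and single optimisation step are placeholders for a real dataset and training loop.

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Toy sentiment examples; a real setup would use a proper dataset and DataLoader
texts = ["a great movie", "a terrible movie"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
outputs = model(**batch, labels=labels)   # classification head sits on the [CLS] representation
outputs.loss.backward()                   # one fine-tuning step
optimizer.step()
```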
Q&A
