Pretrained Model


NLP Basic – 08

Pre-trained Word Vectors

AI VIET NAM
Nguyễn Quốc Thái
CONTENT

1 Pre-trained Word Vectors


2 Text Classification
3 Summary

1 – Pretrained Word Vectors
 Pre-trained Language Models (LMs)

Figure: when the task data alone is not enough, take the embedding matrix from a pre-trained model and use it to get word vectors.
1 – Pretrained Word Vectors
 Pre-trained Language Models (LMs)

Figure: a model is first trained with the language-modeling objective; the resulting pre-trained LM is then reused by a classifier for the supervised text-classification task.

Example data (label 0 = negative, 1 = positive):
This movie is bad → 0
This movie is good → 1
1 – Pretrained Word Vectors
 Pre-trained Language Models (LMs)

 Word2Vec
 GloVe
 fastText
 ELMo

Training Data
Wikipedia
News
Book
Social Network

1 – Pretrained Word Vectors
 Word2Vec

1 – Pretrained Word Vectors
 fastText

Figure: perplexity comparison.

1 – Pretrained Word Vectors
 ELMo

Resources: paper, pre-trained model

1 – Pretrained Word Vectors
 GloVe

 The conditional probability:

Q_{ij} = \frac{\exp(u_j^T v_i)}{\sum_{w \in V} \exp(u_w^T v_i)}

 Using co-occurrence probabilities, the loss is:

J = -\sum_{i} \sum_{j \in \mathrm{context}(i)} X_{ij} \log Q_{ij}

(Figure: evaluation results on the NER task)
1 – Pretrained Word Vectors
 Pre-trained Glove Embedding

Version: glove.6B (trained on 6 billion tokens; 400K-word vocabulary; 50-dimensional vectors)

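A minimal sketch of loading these vectors into a Python dictionary; the file name glove.6B.50d.txt is assumed to have been downloaded from the GloVe project page, and load_glove is an illustrative helper, not part of the slides:

```python
import numpy as np

def load_glove(path="glove.6B.50d.txt"):
    """Read the GloVe text format: one line per word, 'word v1 v2 ... v50'."""
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, vector = parts[0], np.asarray(parts[1:], dtype="float32")
            embeddings[word] = vector
    return embeddings

# glove = load_glove()
# print(len(glove))               # roughly 400K words
# print(glove["language"].shape)  # (50,)
```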
1 – Pretrained Word Vectors
 Pre-trained Glove Embedding

Find Synonyms
Find Analogies

1 – Pretrained Word Vectors
 Pre-trained Glove Embedding

Find Synonyms
Given a word
Find top k synonym words
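One possible way to implement this with the GloVe vectors loaded above, ranking words by cosine similarity; find_synonyms and the glove dictionary are illustrative names from the earlier sketch, not from the slides:

```python
import numpy as np

def find_synonyms(word, embeddings, k=5):
    """Return the k words closest to `word` by cosine similarity."""
    query = embeddings[word]
    words = [w for w in embeddings if w != word]
    matrix = np.stack([embeddings[w] for w in words])
    sims = matrix @ query / (np.linalg.norm(matrix, axis=1) * np.linalg.norm(query))
    top = np.argsort(-sims)[:k]       # indices of the k highest similarities
    return [(words[i], float(sims[i])) for i in top]

# find_synonyms("good", glove, k=5)
```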

1 – Pretrained Word Vectors
 Pre-trained Glove Embedding

Find Analogies
Given 3 words
Find the word that completes the analogy

Example:
“man” : “woman” :: “son” : “daughter”
a : b :: c : d
Vec(a) + Vec(d) = Vec(b) + Vec(c), i.e. Vec(d) = Vec(b) + Vec(c) − Vec(a)
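A sketch of the analogy query: compute Vec(b) + Vec(c) − Vec(a) and return the nearest word; find_analogy and glove are assumed helper names carried over from the earlier sketches:

```python
import numpy as np

def find_analogy(a, b, c, embeddings, k=1):
    """Given a : b :: c : ?, return the k words closest to vec(b) - vec(a) + vec(c)."""
    target = embeddings[b] - embeddings[a] + embeddings[c]
    words = [w for w in embeddings if w not in (a, b, c)]
    matrix = np.stack([embeddings[w] for w in words])
    sims = matrix @ target / (np.linalg.norm(matrix, axis=1) * np.linalg.norm(target))
    top = np.argsort(-sims)[:k]
    return [words[i] for i in top]

# find_analogy("man", "woman", "son", glove)   # expected: ["daughter"]
```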

2 – Text Classification

Preprocessing
Representation: One-hot, BoW, TF-IDF, Pre-trained GloVe
Classifier: Naïve Bayes, Logistic Regression, Neural Network
Metrics: Accuracy, Recall, Precision, F1 Score
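As a concrete illustration of one path through this pipeline (not the only combination the slides allow), a scikit-learn sketch using TF-IDF features and a logistic-regression classifier on the toy movie-review examples from earlier; the dataset is illustrative only:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

texts = ["This movie is bad", "This movie is good",
         "A terrible film", "A wonderful film"]
labels = [0, 1, 0, 1]

# Preprocessing + representation (TF-IDF) + classifier (logistic regression)
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)

print(clf.predict(["the film is good"]))            # likely [1]
print(accuracy_score(labels, clf.predict(texts)))   # training accuracy
```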
2 – Text Classification
 Neural Network

2 – Text Classification
 Review: Embedding Layer

Figure: the embedding layer as a lookup ("select" operation).

Embedding matrix (vocab size = 7, embedding dim = 2):
0: 0.1 3.1
1: 0.5 2.5
2: 1.3 0.6
3: 0.4 0.1
4: 0.7 1.4
5: 2.3 1.7
6: 2.5 2.5

Input matrix (index-based representation), shape 2x5:
[0 4 2 3 1]
[5 6 1 2 3]

Output matrix (one embedding row selected per index), shape 2x5x2:
[w[0] w[4] w[2] w[3] w[1]]
[w[5] w[6] w[1] w[2] w[3]]
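The same lookup can be sketched with PyTorch's nn.Embedding, using the embedding matrix and the 2x5 index input from the figure:

```python
import torch
import torch.nn as nn

# Embedding matrix from the figure: vocab size 7, embedding dim 2
weights = torch.tensor([[0.1, 3.1],
                        [0.5, 2.5],
                        [1.3, 0.6],
                        [0.4, 0.1],
                        [0.7, 1.4],
                        [2.3, 1.7],
                        [2.5, 2.5]])
embedding = nn.Embedding.from_pretrained(weights)

# Index-based input, shape 2x5
indices = torch.tensor([[0, 4, 2, 3, 1],
                        [5, 6, 1, 2, 3]])

output = embedding(indices)   # the "select" operation is a row lookup
print(output.shape)           # torch.Size([2, 5, 2])
```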
2 – Text Classification
 Pre-trained Glove Embedding
Embedding matrix (task vocabulary):
0 <oov>      0.1 3.1
1 <pad>      0.5 2.5
2 <unk>      1.3 0.6
3 neural     0.4 0.1
4 language   0.7 1.4

GloVe embedding matrix:
0 <oov>      0.1 0.1
1 <pad>      0.5 0.5
2 <unk>      0.3 0.6
3 language   0.7 0.7
4 mưa        0.7 0.4

Final embedding matrix (GloVe vectors copied for words GloVe contains, zeros for the rest):
0 <oov>      0.1 0.1
1 <pad>      0.5 0.5
2 <unk>      0.3 0.6
3 neural     0.0 0.0
4 language   0.7 0.7
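A sketch of building such a final matrix: copy the GloVe vector for every task-vocabulary word GloVe contains and keep zeros for the rest. vocab, glove, and final_matrix are illustrative names, and the tiny GloVe dictionary below stands in for the real loaded file:

```python
import numpy as np

vocab = ["<oov>", "<pad>", "<unk>", "neural", "language"]   # task vocabulary
embedding_dim = 2

# Stand-in GloVe dictionary (normally the output of load_glove above)
glove = {"<oov>": np.array([0.1, 0.1]), "<pad>": np.array([0.5, 0.5]),
         "<unk>": np.array([0.3, 0.6]), "language": np.array([0.7, 0.7]),
         "mưa": np.array([0.7, 0.4])}

final_matrix = np.zeros((len(vocab), embedding_dim), dtype="float32")
for idx, word in enumerate(vocab):
    if word in glove:
        final_matrix[idx] = glove[word]   # copy the pre-trained vector
    # words missing from GloVe ("neural") stay as zero rows

print(final_matrix)
```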
2 – Text Classification
 Pre-trained Glove Embedding

Final embedding matrix (its weights are updated during training):
0 <oov>      0.1 0.1
1 <pad>      0.5 0.5
2 <unk>      0.3 0.6
3 neural     0.0 0.0
4 language   0.7 0.7
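A short PyTorch sketch of loading that final matrix into a trainable embedding layer so its rows are updated during training (freeze=False); the matrix values are the ones from the slide:

```python
import numpy as np
import torch
import torch.nn as nn

# Final embedding matrix as built in the previous sketch
final_matrix = np.array([[0.1, 0.1],   # <oov>
                         [0.5, 0.5],   # <pad>
                         [0.3, 0.6],   # <unk>
                         [0.0, 0.0],   # neural (not found in GloVe)
                         [0.7, 0.7]],  # language
                        dtype="float32")

# freeze=False keeps the pre-trained rows trainable, so they are updated
# by the optimizer together with the rest of the classifier.
embedding = nn.Embedding.from_pretrained(torch.from_numpy(final_matrix), freeze=False)
print(embedding.weight.requires_grad)   # True
```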
3 - Summary
 Basic NLP Course

01 Introduction
02 Preprocessing
03 Language Modeling
04 Part Of Speech (POS)
05 Constituency Parsing
06 Basic Vectorization
07 Word2Vec
08 Pretrained Model
3 - Summary
 2 - Preprocessing
Figure: class distribution of Positive vs. Negative examples for balanced data and imbalanced data.
3 - Summary
 3 – Language Model

3 - Summary
 4 – POS Tagging - NER

Model: Hidden Markov Model (HMM)

3 - Summary
 5 – Constituency Parsing (CFG)

Grammar in CNF, CKY parsing

G = (T, N, P, S, R)
T: a set of terminal symbols
N: a set of non-terminal symbols
P (P ⊂ N): a set of pre-terminal symbols
S: a start symbol
R: a set of rules or productions, R = {X → γ | X ∈ N, γ ∈ (T ∪ N)*}

Example grammar:
1. Start → S
2. S → NP VP
3. NP → Det Noun
4. NP → NN PP
5. PP → Prep NP
6. VP → V NP
7. a. VP → V Args   b. Args → NP PP
8. V → ate
9. NP → John
10. NP → ice-cream, snow
11. Noun → ice-cream, pizza
12. Noun → table, guy, campus
13. Det → the
14. Prep → on
3 - Summary
 5 – Dependency Parsing

A graph G = (V, A)

V: vertices {w0 = ROOT, w1, …, wn}, usually one per word in the sentence

A: arcs {(wi, r, wj) : wi ≠ wj, wi ∈ V, wj ∈ V \ {w0}, r ∈ Rx}
Rx: the set of all possible dependency relations in x
3 - Summary
 5 – Dependency Parsing

Dependency Tree

Has a ROOT
Each word has a single head
The dependency structure is connected
Projective
Acyclic
There is a unique path from ROOT to each word
3 - Summary
 6 – Basic Vectorization

 One-hot encoding
 Bag-of-words (BoW)
 Bag-of-N-gram

3 - Summary
 7 – Word2Vec

3 - Summary
 7 – Pre-trained Embedding

 Word2Vec
 GloVe
 fastText
 ELMo

3 - Summary
 NLP Pipeline

3 - Summary
 The neural history of NLP

2001  Neural Language Models
2008  Multi-task Learning
2013  Word Embedding
2013  Neural Networks for NLP (RNN, LSTM, GRU, CNN)
2014  Sequence-to-sequence Models (Machine Translation, Image Captioning, Question Answering)
2015  Attention - Transformer
2018  Pretrained Language Models (BERT, GPT)
3 - Summary
 References

1. https://web.stanford.edu/~jurafsky/slp3/
2. http://web.stanford.edu/class/cs224n/
3. https://d2l.ai/
4. http://nlpprogress.com/
5. https://github.com/undertheseanlp/NLP-Vietnamese-progress

Thanks!
Any questions?
