SNLP Overview
• Left-context only?
  – The {big, pig} dog …
  – P(dog | the big) >> P(dog | the pig)
Noisy Channel Framework
[Diagram: I → Noisy Channel → O → Decoder → Î]

Î = argmax_i p(i | o) = argmax_i p(i) p(o | i) / p(o) = argmax_i p(i) p(o | i)
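The argmax above can be sketched as a toy decoder. The candidate words and probability tables below are invented for illustration; they are not part of the slides.

```python
# Toy noisy-channel decoder: pick the intended word i that maximizes
# p(i) * p(o | i) for an observed (possibly corrupted) word o.
# All probabilities below are made up for illustration.

prior = {"the": 0.6, "then": 0.3, "thee": 0.1}   # p(i): language model
channel = {                                       # p(o | i): error model
    ("teh", "the"): 0.8,
    ("teh", "then"): 0.1,
    ("teh", "thee"): 0.1,
}

def decode(observed, candidates):
    # argmax_i p(i) * p(o | i); p(o) is constant in i and can be dropped
    return max(candidates,
               key=lambda i: prior[i] * channel.get((observed, i), 0.0))

print(decode("teh", ["the", "then", "thee"]))
```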
• N-gram models:
  – Unigram: P(w_1,n) = P(w_1) P(w_2) … P(w_n)
  – Bigram:  P(w_1,n) = P(w_1) P(w_2 | w_1) … P(w_n | w_{n−1})
  – Trigram: P(w_1,n) = P(w_1) P(w_2 | w_1) … P(w_n | w_{n−2}, w_{n−1})
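The bigram probabilities can be estimated by maximum likelihood from corpus counts; a minimal sketch (the tiny corpus is invented for illustration):

```python
from collections import Counter

def bigram_probs(corpus):
    """Maximum-likelihood bigram estimates P(w_n | w_{n-1}) from a token list."""
    unigrams = Counter(corpus)
    bigrams = Counter(zip(corpus, corpus[1:]))
    return {(prev, w): c / unigrams[prev] for (prev, w), c in bigrams.items()}

tokens = "the big dog saw the big dog and the pig".split()
p = bigram_probs(tokens)
print(p[("the", "big")])   # 2 occurrences of "the big" / 3 of "the"
```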
[Figure: word count vs. rank, showing a long tail: a large number of rare words]
Data Smoothing
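Add-one (Laplace) smoothing is one baseline way to reserve probability mass for unseen n-grams; a minimal sketch (the corpus and vocabulary here are invented for illustration):

```python
from collections import Counter

def laplace_bigram(corpus, vocab):
    """Add-one smoothed bigram estimate: (c(u, w) + 1) / (c(u) + |V|)."""
    V = len(vocab)
    unigrams = Counter(corpus)
    bigrams = Counter(zip(corpus, corpus[1:]))
    def p(w, prev):
        return (bigrams[(prev, w)] + 1) / (unigrams[prev] + V)
    return p

tokens = "the big dog saw the big dog".split()
vocab = set(tokens) | {"pig"}       # vocabulary includes an unseen word
p = laplace_bigram(tokens, vocab)
print(p("pig", "the"))  # unseen bigram still gets non-zero mass
```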
[Figure: frequency curves for three topics: horses, ships, vehicles]
Example topic word lists*:
  – election, voter, vote, president, ballot
  – law, legal, lawyer, court, judge
  – city, state, florida, united, california
*Harvard Law Review, Vol. 118, No. 7 (May, 2005), pp. 2314-2335 (Note).
However, thematic information is hidden: we only observe the documents. Using all of this probabilistic reasoning, we wish to infer the latent structure of the words…
[Figure: a document annotated with topics, topic indices, and topic proportions]
Bayesian Probability
• Bayes’ Theorem:

  P(θ | x) = p(x | θ) p(θ) / p(x)

  posterior ∝ likelihood × prior
• Dirichlet distribution:

  P(θ | α) = (1 / B(α)) ∏_{i=1}^{K} θ_i^{α_i − 1},   where   B(α) = ∏_{i=1}^{K} Γ(α_i) / Γ(∑_{i=1}^{K} α_i)
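The Dirichlet density above can be sampled and evaluated numerically; a sketch assuming NumPy is available (the concentration parameters are arbitrary illustrative values):

```python
import numpy as np
from math import gamma

rng = np.random.default_rng(0)
alpha = np.array([2.0, 3.0, 4.0])   # concentration parameters (K = 3)

theta = rng.dirichlet(alpha)        # one draw: a point on the probability simplex
print(theta, theta.sum())           # components are non-negative and sum to 1

# Density, following the slide's formula:
# B(alpha) = prod_i Gamma(alpha_i) / Gamma(sum_i alpha_i)
B = np.prod([gamma(a) for a in alpha]) / gamma(alpha.sum())
density = np.prod(theta ** (alpha - 1)) / B
print(density)
```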
Latent Dirichlet Allocation (LDA)
  – Initially proposed by Blei et al. (2003)

[Plate diagram: α → θ → z → w ← ϕ ← β, with plates N (words), M (documents), K (topics)]

Generative process:
  1. ϕ(k) ~ Dir(β), for each topic k = 1, …, K
  2. For each document d ∈ M:
     a. θ_d ~ Dir(α)
     b. For each word w ∈ d:
        i.  z ~ Discrete(θ_d)
        ii. w ~ Discrete(ϕ(z))

Joint distribution:

p(w, z, θ, ϕ | α, β) = ∏_{k=1}^{K} p(ϕ_k | β) × ∏_{m=1}^{M} p(θ_m | α) × ∏_{m=1}^{M} ∏_{n=1}^{N_m} p(z_{m,n} | θ_m) p(w_{m,n} | ϕ_{z_{m,n}})
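The generative process above can be simulated directly; a sketch assuming NumPy, with arbitrary illustrative sizes and hyperparameters (K, M, V, N, α, β are not taken from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
K, M, V, N = 2, 3, 5, 8            # topics, documents, vocabulary size, words/doc
alpha, beta = 0.5, 0.1

# 1. One word distribution phi_k ~ Dir(beta) per topic
phi = rng.dirichlet(np.full(V, beta), size=K)

docs = []
for m in range(M):
    theta_m = rng.dirichlet(np.full(K, alpha))   # 2a. topic proportions
    words = []
    for n in range(N):
        z = rng.choice(K, p=theta_m)             # 2b-i.  draw a topic index
        w = rng.choice(V, p=phi[z])              # 2b-ii. draw a word from phi_z
        words.append(int(w))
    docs.append(words)

print(docs)   # M lists of N word ids
```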
Inference
• Example 1:
  – Our favourite city during the trip was _________.
• Example 2:
  – Is the word “book” a noun or a verb?
    · If we know that a “library” topic generated it, it’s much more likely to be a noun
    · If we know that an “airline” topic generated it, it’s more likely to be a verb (“to book a flight”)
• Example 3:
  – We know that the word “seal” is a noun; what is its topic?
    · More likely to be related to “marine mammals” than “construction” (“to seal a crack”)
POSLDA (Part-Of-Speech LDA) Model
• The emitted word w only depends on the class c itself

[Figure: POSLDA graphical model: classes c_1, c_2, c_3 emit words w_1, w_2, w_3; ϕ holds T·S_SEM + S_SYN word distributions with prior β; plate over M documents]

• This model results in POS-
Questions?
References