SNLP 2014
CSE, IIT Kharagpur
Machine Translation
An early MT system, when translating from English to Russian and then back to English:
    "The spirit is willing but the flesh is weak." → "The liquor is good but the meat is spoiled."
    "Out of sight, out of mind." → "Invisible idiot."
Problems
Translation Divergence
    It is running                  →  Wah bhaag raha hai
    It is raining                  →  Baarish ho rahi hai
Structural Divergence
    Ram will attend the meeting    →  Ram sabha mein jayega
    Ram will go to school          →  Ram school jayega
Other Divergence
    The fan is on [adverb]         →  Pankha chal [verb] raha hai
    The fan is good [adjective]    →  Pankha achcha [adjective] hai
IBM Model 1
First model proposed as part of CANDIDE, the first complete SMT system
Assumes a simple generative model of producing F from $E = e_1, e_2, \ldots, e_I$
Generative model
Choose the length $J$ of the F sentence: $F = f_1, f_2, \ldots, f_J$
Choose a one-to-many alignment $A = a_1, a_2, \ldots, a_J$
For each position $j$ in F, generate a word $f_j$ from the aligned word in E: $e_{a_j}$
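The three steps of the generative story can be sketched as a small sampler. This is a minimal illustration, not the CANDIDE implementation: the function name `generate`, the length distribution `length_probs`, and the toy translation table `t` with its probabilities are all hypothetical. The sketch also assumes a NULL word at position 0 of E and a uniform choice of each alignment link.

```python
import random

def generate(e_words, t, length_probs, rng):
    """Sample (F, A) from E by following the three generative steps.

    t[e][f] is an assumed toy translation probability t(f | e);
    length_probs[J] is an assumed toy length distribution P(J | E).
    """
    # Position 0 holds NULL, so each alignment link a_j ranges over 0..I.
    e = ["NULL"] + list(e_words)

    # Step 1: choose the length J of the F sentence.
    lengths = list(length_probs)
    J = rng.choices(lengths, weights=[length_probs[n] for n in lengths])[0]

    # Step 2: choose an alignment; each a_j is drawn uniformly from 0..I.
    alignment = [rng.randrange(len(e)) for _ in range(J)]

    # Step 3: generate each f_j from its aligned word e_{a_j}.
    f_words = []
    for a_j in alignment:
        options = t[e[a_j]]
        words = list(options)
        weights = [options[w] for w in words]
        f_words.append(rng.choices(words, weights=weights)[0])
    return f_words, alignment

# Toy parameters (made up for illustration only).
t = {
    "NULL": {"hai": 1.0},
    "Ram": {"ram": 1.0},
    "school": {"school": 0.8, "vidyalaya": 0.2},
}
rng = random.Random(0)
f_words, alignment = generate(["Ram", "school"], t, {2: 0.5, 3: 0.5}, rng)
print(f_words, alignment)
```

Because every link is sampled independently, several F positions may point at the same E word, which is what makes the alignment one-to-many from the E side.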
Generative Model
[Figures from slides 9 to 13 not recovered in this extraction]
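There are $(I + 1)^J$ possible alignments, so if all of them are taken as equally likely:
$$P(A \mid E) = P(A \mid E, J)\,P(J \mid E) = \frac{P(J \mid E)}{(I + 1)^J}$$
With $t(f_j, e_{a_j})$ the probability of generating $f_j$ from $e_{a_j}$, summing over all alignments $A$ gives:
$$P(F \mid E) = \sum_{A} P(F \mid E, A)\,P(A \mid E) = \sum_{A} \frac{P(J \mid E)}{(I + 1)^J} \prod_{j=1}^{J} t(f_j, e_{a_j})$$
The most probable alignment is:
$$\hat{A} = \arg\max_{A} P(F, A \mid E) = \arg\max_{A} \frac{P(J \mid E)}{(I + 1)^J} \prod_{j=1}^{J} t(f_j, e_{a_j})$$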
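The sum over alignments and the arg max can be checked numerically on a toy example. The sketch below is an illustration under stated assumptions: the function names, the value `p_len` standing in for P(J|E), and the translation table `t` are hypothetical. It also uses the standard observation, not stated on the slides, that the links are independent, so the sum of products factorizes into a product of sums and the arg max can be taken position by position.

```python
from itertools import product
from math import isclose, prod

def likelihood_bruteforce(f, e, t, p_len):
    """P(F | E) by explicitly summing over all (I + 1)^J alignments."""
    e = ["NULL"] + e
    I, J = len(e) - 1, len(f)
    total = 0.0
    for A in product(range(I + 1), repeat=J):
        total += prod(t.get((f[j], e[A[j]]), 0.0) for j in range(J))
    return p_len / (I + 1) ** J * total

def likelihood_factored(f, e, t, p_len):
    """Same quantity, using sum_A prod_j = prod_j sum_i (links are independent)."""
    e = ["NULL"] + e
    I, J = len(e) - 1, len(f)
    per_word = (sum(t.get((fj, ei), 0.0) for ei in e) for fj in f)
    return p_len / (I + 1) ** J * prod(per_word)

def best_alignment(f, e, t):
    """Most probable alignment: pick the best E position for each f_j separately."""
    e = ["NULL"] + e
    return [max(range(len(e)), key=lambda i: t.get((fj, e[i]), 0.0)) for fj in f]

# Toy t(f, e) values, made up for illustration; missing pairs count as 0.
t = {
    ("ram", "Ram"): 0.9,
    ("school", "school"): 0.8,
    ("jayega", "go"): 0.6,
    ("jayega", "will"): 0.3,
    ("jayega", "NULL"): 0.05,
}
E = ["Ram", "will", "go", "to", "school"]
F = ["ram", "school", "jayega"]

brute = likelihood_bruteforce(F, E, t, p_len=0.2)
fact = likelihood_factored(F, E, t, p_len=0.2)
print(isclose(brute, fact))       # True
print(best_alignment(F, E, t))    # [1, 5, 3]
```

The brute-force version enumerates 6^3 = 216 alignments here; the factored version needs only J(I + 1) table lookups, which is why the sum is tractable in practice.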
MT Evaluation
BLEU
Count the n-grams of various sizes that the MT output shares with the reference translations.
Compute a modified precision measure over the n-grams in the MT output.
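A minimal sketch of the modified n-gram precision follows, assuming the usual clipping rule: each candidate n-gram is credited at most as many times as it occurs in any single reference. The function names and the whitespace tokenization are choices made for this illustration.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    """Clipped n-gram matches divided by the number of candidate n-grams."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the largest count observed in any one reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    # Clip each candidate count by that maximum before summing.
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())

candidate = "the the the the the the the".split()
references = [
    "the cat is on the mat".split(),
    "there is a cat on the mat".split(),
]
print(modified_precision(candidate, references, 1))  # 2/7, about 0.2857
print(modified_precision(candidate, references, 2))  # 0.0
```

Without clipping, the degenerate candidate above would score a unigram precision of 7/7, since every token appears in a reference; clipping to the two occurrences of "the" in the first reference brings it down to 2/7.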
BLEU Example
[Worked example from slides 17 to 24 not recovered in this extraction]
Brevity Penalty
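$BP = e^{(1 - r/c)}$ if $c \le r$, 1 otherwise, where $c$ is the length of the candidate translation and $r$ is the reference length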
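The penalty can be sketched in a few lines. The helper `bleu_from_precisions` is an addition for illustration: it combines the penalty with a uniformly weighted geometric mean of the modified precisions, which is the standard BLEU definition but is not spelled out on these slides. Both function names are hypothetical.

```python
import math

def brevity_penalty(c, r):
    """BP = exp(1 - r/c) if c <= r, else 1 (c: candidate length, r: reference length)."""
    if c == 0:
        return 0.0  # an empty candidate gets no credit
    return 1.0 if c > r else math.exp(1 - r / c)

def bleu_from_precisions(precisions, c, r):
    """BP times the geometric mean of the modified n-gram precisions.

    Uniform weights over n are an assumption of this sketch.
    """
    if not precisions or min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return brevity_penalty(c, r) * math.exp(log_mean)

print(brevity_penalty(12, 10))  # 1.0, candidate longer than reference
print(brevity_penalty(5, 10))   # exp(-1), about 0.3679
print(bleu_from_precisions([0.5, 0.5], c=10, r=10))
```

The penalty compensates for the fact that precision alone rewards very short outputs: a candidate half the reference length has its score multiplied by roughly 0.37.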
BLEU Example
Reference 1 has the largest number of n-gram matches with candidate 1, while Reference 2 has the largest number of n-gram matches with candidate 2.