Machine Translation Post-Editing: Oryslava Bryska, April 28, 2021
Working definition
The term has been commonly used in:
- Subfields of natural language processing and MT, including:
  - Automated error correction;
  - Optical character recognition;
  - TM;
  - Controlled languages;
- A separate translation-related service with its own ISO standard (2017).
Distinguishing factor of the process:
Correction of a pre-translated text rather than translation from scratch (Wagner, 1985)
Previously defined as:
Term used for the correction of MT output by human linguists/editors (Veale and Way, 1997)
Allen (2003):
Editing, modification and/or correction of a pre-translated text that has been processed by an MT system from a source language into (a) target language(s)
Human-assisted MT vs Machine-assisted HT
Human-assisted MT
- Passive activity with editors closing the gap between defective MT outputs and high-quality translations;
- Often monolingual pre-editors and post-editors;
- Undesirable final step in MT application;
- Post-editors were viewed as ‘human partners’;
- Evoked negative perceptions of MT;
- Etc.
Machine-assisted HT
• Humans are at the centre of translation production;
• MT is often used as a part of CAT tools (TMs and terminology resources);
• Achievements in terms of effort and quality;
• Etc.
Pioneer articles describe the different tasks, processes and profiles in post-editing (Vasconcellos and León 1985, Wagner 1985, 1987, Vasconcellos 1986, 1989, 1992, 1993, Senez 1998) and the different levels of post-editing: rapid and conventional (Loffler-Laurian 1983, 1986).
Later contributions: Jeffrey Allen (2003, 2010), Ana Guerberof (2009, 2013), Ke Hu and Patrick Cadwell (2016), Lucas Nunes Vieira (2017).
Even though approaches to post-editing levels are wide-ranging, the most popular levels are often referred to as just ‘light’ and ‘full’ post-editing (see Hu and Cadwell 2016).
Influential guidelines published by the Translation Automation User Society (TAUS) refer to these levels based on two standards of expected target-text quality, namely:
‘good enough’
‘similar or equal to human translation’ (Massardo et al. 2016: 17–18).
These two quality standards roughly correspond to ‘light’ and ‘full’ post-editing, respectively.
Guidelines for Light PE
Guidelines for Full PE
Full PE guidelines
Case-study: ECTS (Wagner 1985)
Do retain as much of the raw translation as possible. Resist the temptation to delete and rewrite too much. Remember that many of the words you need are there somewhere, but probably in the wrong order.
Don’t allow yourself to hesitate too long over any particular problem; put in a marker and go back to the problem later if necessary.
Don’t worry if the style of the translation is repetitive or pedestrian; there is no need to change words simply for the sake of elegant variation.
Don’t embark on time-consuming research. Use only rapid research aids (Eurodicautom, knowledgeable colleagues, specialised terminology). If a terminology problem is insoluble, bring it to the attention of the requester by putting a question mark in the margin.
Case-study: ECTS (cont)
Do make changes only when they are absolutely necessary, i.e. correct
only words or phrases that are:
a) nonsensical
b) wrong
and, if there is enough time left,
c) ambiguous.
Case study: General Motors
CASL (Controlled Automotive Service Language) project:
• Well-rounded case of establishing and using documentation for PE
• Use of the Society of Automotive Engineers (SAE) J2450 standard metric for translation quality
• Several prioritized categories of errors rated as unacceptable in translated texts:
  wrong terms,
  syntactic errors,
  omissions,
  word-structure or agreement errors,
  misspelling,
  punctuation errors,
  miscellaneous errors
• Does not address stylistic considerations
• Identifying and correcting errors of the above-mentioned categories (minimal level of PE)
• Weights for each type of error (serious and minor)
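A weighted error metric of this kind can be sketched as follows. This is only an illustration of the idea on the slide (error categories, each with a serious and a minor weight): the weight values, the function name `weighted_error_score` and the normalisation by source word count are assumptions made for the example, not values quoted from the SAE J2450 standard or from the CASL project.

```python
# Illustrative (serious, minor) weights per error category.
# ASSUMPTION: these numbers are placeholders, not quoted from SAE J2450.
WEIGHTS = {
    "wrong_term": (5, 2),
    "syntactic_error": (4, 2),
    "omission": (4, 2),
    "word_structure_or_agreement": (4, 2),
    "misspelling": (3, 1),
    "punctuation": (2, 1),
    "miscellaneous": (3, 1),
}


def weighted_error_score(errors, source_word_count):
    """Sum the weights of all annotated errors, normalised by source length.

    errors: iterable of (category, severity) pairs,
            where severity is "serious" or "minor".
    A lower score means fewer or less severe errors.
    """
    if source_word_count <= 0:
        raise ValueError("source_word_count must be positive")
    total = 0
    for category, severity in errors:
        serious, minor = WEIGHTS[category]
        if severity == "serious":
            total += serious
        elif severity == "minor":
            total += minor
        else:
            raise ValueError(f"unknown severity: {severity}")
    return total / source_word_count


# Hypothetical annotation of a 100-word source segment.
annotated = [
    ("wrong_term", "serious"),
    ("punctuation", "minor"),
    ("omission", "minor"),
]
score = weighted_error_score(annotated, source_word_count=100)
```

Note that style is deliberately absent from the categories, matching the slide's point that the metric does not address stylistic considerations.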
Case-study: Pan-American Health Organization
Procedure of PE staff training
• Post-editors receive post-editing ‘macros’
• Basic guidance about how to take advantage of the raw MT output text
• How to avoid extensive reordering of concepts
• How to respect phrases enclosed in ‘reliability marks’ in the output
• How to deal with context-sensitive alternate translations
Case study: Microsoft (Groves and Schmidtke 2009)
The source text, the raw machine output and the post-edited text are used for the analysis.
Microsoft reports improvements in the quality of the MT and related productivity increases from 5-10 percent to 10-20 percent for certain languages, although they signal variations in post-editing productivity for the same language depending on project, product, different file deliveries of the same project, and between different translators.
Translators report on issues related to terminology, grammar, and incorrect handling of mark-up and formatting (tagging).
To analyze the post-editing patterns, two data sets are used: English into German and English into French. Using their own edit distance techniques (edit distance: the number of modifications a human editor is required to make to a system translation so that the resulting edited translation counts as accurate), they find that for French the edit distance is 5.60, whereas the German score is 8.81, indicating a greater post-editing effort for German.
The most common types of edits are deletion and insertion of function words (especially determiners), followed by edits in punctuation, especially inserting or deleting commas.
They also give a detailed report on structure-based comparison for each language.
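The notion of edit distance used in this case study can be illustrated with a minimal sketch. Groves and Schmidtke use their own techniques, which are not described on the slide; the code below swaps in a plain word-level Levenshtein distance (insertions, deletions and substitutions of whole tokens) as a stand-in. The function name `word_edit_distance` and the example sentences are hypothetical.

```python
def word_edit_distance(mt_output, post_edited):
    """Word-level Levenshtein distance between raw MT and its post-edited version.

    Counts the minimum number of token insertions, deletions and
    substitutions needed to turn mt_output into post_edited.
    ASSUMPTION: whitespace tokenisation; not the authors' own metric.
    """
    hyp = mt_output.split()
    ref = post_edited.split()
    # prev[j] = distance between the first i-1 MT tokens and the first j edited tokens
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(
                prev[j] + 1,         # delete an MT token
                curr[j - 1] + 1,     # insert a token
                prev[j - 1] + cost,  # keep or substitute
            ))
        prev = curr
    return prev[-1]


# One inserted determiner, the most common edit type reported above.
d1 = word_edit_distance("the file is saved in folder",
                        "the file is saved in the folder")
# One substituted function word plus one deleted comma.
d2 = word_edit_distance("click on button , then save",
                        "click the button then save")
```

Here `d1` is 1 and `d2` is 2. Averaging such distances over a data set gives a rough per-language indicator of technical post-editing effort, in the spirit of the French and German figures reported above.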
Recent Developments
• Static vs interactive mode in PE
• Automatic MT PE
• NMT PE
• MT Literary texts PE
Static vs interactive PE modes
• This same study found that even PBSMT led to an 18% increase in translators’ productivity compared to the from-scratch condition.
• The MT system was tailored to literary content (Toral and Way 2018), which is likely to have played a significant role in the productivity boost.
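How a productivity increase like the 18% figure is computed can be shown with a short sketch. It assumes productivity is measured as throughput in words per hour; the function name `productivity_gain` and the throughput numbers are hypothetical and are not data from the study.

```python
def productivity_gain(words_per_hour_scratch, words_per_hour_pe):
    """Relative productivity change of post-editing vs translating from scratch, in percent.

    ASSUMPTION: productivity is throughput in words per hour.
    """
    if words_per_hour_scratch <= 0:
        raise ValueError("from-scratch throughput must be positive")
    return (words_per_hour_pe - words_per_hour_scratch) / words_per_hour_scratch * 100


# Hypothetical throughputs: 500 words/hour from scratch, 590 words/hour when post-editing.
gain = productivity_gain(500, 590)
```

With these made-up numbers the gain is about 18 percent. A negative result would mean that post-editing was slower than translating from scratch.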
Post-editing
Human-assisted MT paradigm in PE
Machine-assisted HT paradigm in PE
Cognitive effort
Temporal effort
Technical effort
Level of PE
Automatic PE
Interactive PE
Thank you
See you at the next lecture