Memsource Unlock The Potential of MT With Post Editing Guide - Original
Machine Translation
Your Guide to Implementing MT Post-editing
HOW TO UNLOCK THE POTENTIAL OF MACHINE TRANSLATION
CONTENTS
1 INTRODUCTION
4 IMPLEMENTING MT POST-EDITING
7 CONCLUSION
8 ACKNOWLEDGMENTS
9 REFERENCES
www.memsource.com
1 INTRODUCTION
For organizations breaking into and establishing themselves in new markets, developing
an effective localization strategy that meets market needs requires considerable
planning. English may well be the language of business worldwide, but increasingly
customers expect that the content they consume or the services they use will be
available in their preferred language, almost instantaneously. This puts pressure on
companies to provide the same user experience to customers across the globe, all the
while staying within budget.
Human-only translation, either in-house or outsourced, has been the norm in many
global companies, but it isn’t scalable, is very costly, and has a long turnaround time.
It also isn’t necessary for many use cases. Machine translation (MT), combined with
human translators who edit the MT output (post-editors), can fill this gap.
There are two main advantages to using MT. The first is that it can increase the efficiency
of “traditional” human translation. A professional linguist edits the target text to a
“near-human level”, correcting any grammatical or linguistic errors, and the estimated
productivity rate is approximately double that of translation without MT. The
second is that it makes it possible to translate content which otherwise would not be
translated, typically because opting for human-only translation would be too costly or
time-consuming. The end result will still be a target text where any grammatical,
syntactical, and semantic errors are eliminated.
Over the past few years, we have seen the use of MT increase, and recent market
estimates suggest that MT usage will continue to grow, especially given the number of
tech giants entering the space: Microsoft, Google, Amazon, and now Apple. The MT
market has been valued at between USD 130 million and USD 400 million (Nimdzi, 2019)
and is estimated to exceed USD 1.5 billion by 2024 (Marketwatch, 2019).
We can see that the overwhelming majority of Memsource users work with MT engines
on at least some of their projects.
The pie chart below shows the ratio of organizations that used machine translation in
at least 1% of their jobs between 2018 and 2019. We included only organizations that
have translated at least 10,000 words in total.
[Pie chart] MT used: 82.16%. MT not used: 17.84%.
One key reason why MT usage is high is that general MT quality has improved
significantly, thanks to increasing investment and research into neural machine
translation (NMT) and deep learning. The quality improvements are reflected in
Memsource data from the past few years.
The graph below shows the proportion of MT output that is high quality. (50 and 75 are
quality scores on a scale of 0 to 100. The maximum quality score, 100, is on par with
professional human translation.) These scores are based on the amount of editing
required on the MT output. Over the last two years, the proportion of high-quality MT
has been increasing, so the amount of required post-editing has been decreasing.
[Graph: proportion of MT output with a quality score of 50+ and of 75+. Data labels:
57%, 62%, 68% (50+ Quality); 23%, 26%, 33% (75+ Quality).]
Despite this overall improvement in quality, high-quality output is not guaranteed.
This means that although there are some use cases where using raw, unedited MT is
appropriate, for the majority of content it is still important for human linguists to
review and edit the MT output.
Machine translation is the starting point for approximately 35% of content translated
by professional human translators, according to Memsource data. One major advantage
is that almost any content is suitable for post-editing, perhaps with the exception of
highly creative marketing or literary content that needs to be transcreated rather than
translated. For everything else, machine translation post-editing is suitable because
every single word is checked and, if needed, corrected by human translators.
3 CREATING A BUSINESS CASE FOR MT POST-EDITING
There are four key areas you should focus on to build a compelling MT post-editing
business case. We have taken each area and provided guidance on what you
should be presenting to your key decision-makers.
B. MT POST-EDITING SAVINGS
Presenting potential savings is key when persuading your main stakeholders.
Machine translation savings depend on two main factors: the language pair
and the content type. The language pair plays a role because, for the most
common language pairs, there is more bilingual training data, so machine
translation results tend to be more accurate and idiomatic. Content type
is also significant because MT is generally better suited to certain content
types, such as technical texts.
NOTE
When evaluating the success of your own MT
post-editing strategy it’s important to assess
the financial savings using your own data.
There are different ways of doing this, but one
effective method is presented in the TAUS
Post-Editing Guidelines.
Below is an estimate of the money that can be saved with machine translation
and post-editing, based on Memsource data.
When translators use results from translation memory (TM), a discount can
be applied using a Net Rate Scheme. The same is true for machine translation.
Here’s an example of a net rate scheme for translation memory and machine translation
results. The Match Type represents the amount of editing that was required, and the
Percentage paid of the full rate (discount rate) depends on this match type. You can
think of MT output which has not been post-edited as 100% MT matches and, in just the
same way as TM, the translation rate is reduced.
Percentage paid of full rate
Match type     TM     MT
Repetitions    10     -
101%           10     -
100%           10     30
95-99%         33     40
85-94%         66     70
75-84%         100    100
50-74%         100    100
0-49%          100    100
NOTE
Please be aware that the translation memory discount rates shown here reflect
general industry standards, although rates do vary. There are currently no industry
standards for machine translation, so the MT rates shown are estimations. Generally,
companies apply discount rates according to their own financial strategy.
[Chart: payable word count without and with MT post-editing]
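To see how a net rate scheme turns a job into a payable word count, here is a minimal Python sketch. The percentages are copied from the example scheme above; the function name, the dictionary layout, and the 1000-word match breakdown are assumptions made purely for illustration and are not Memsource data.

```python
# Percentage of the full rate paid per match band, copied from the example
# scheme above. None means the scheme lists no MT rate for that band.
NET_RATE_SCHEME = {
    "repetitions": {"tm": 10, "mt": None},
    "101%": {"tm": 10, "mt": None},
    "100%": {"tm": 10, "mt": 30},
    "95-99%": {"tm": 33, "mt": 40},
    "85-94%": {"tm": 66, "mt": 70},
    "75-84%": {"tm": 100, "mt": 100},
    "50-74%": {"tm": 100, "mt": 100},
    "0-49%": {"tm": 100, "mt": 100},
}


def weighted_word_count(words_per_band, source="mt"):
    """Return the payable word count for a job.

    words_per_band maps a match band to the number of words in that band;
    source is "tm" or "mt" and selects the column of the scheme.
    """
    total = 0.0
    for band, words in words_per_band.items():
        percent = NET_RATE_SCHEME[band][source]
        if percent is None:
            raise ValueError(f"the scheme has no {source} rate for band {band}")
        total += words * percent / 100
    return total


# Hypothetical breakdown of a 1000-word job by MT match band (invented numbers).
example_job = {"100%": 100, "95-99%": 100, "85-94%": 100, "0-49%": 700}
print(weighted_word_count(example_job))  # 840.0
```

With this invented breakdown, 1000 words become 840 payable words. Your own figure depends entirely on how your content is distributed across the match bands.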
Let’s look at this in financial terms. Say you pay $0.20 per translated word: 1000 words
would cost you $200, but if you used MT, you would only pay $173.20, the equivalent of
866 words at the full rate. The gross savings would then be $26.80. You need to deduct
the cost of the machine translation characters to get the net savings. Machine
translation is generally charged by the character, and 1000 words is around 5.5k
characters. One of the top MT providers charges $0.11 for 5.5k characters. The net
savings are therefore $26.69 (as shown in the table below).
                                             Word count    Cost ($) ($0.2 per word)
NO MT PE                                     1000          200
MT PE                                        866           173.2
Gross Savings ($)                                          26.8
Net Savings ($) (minus MT character cost)                  26.69
This equates to savings of around 13% from using machine translation post-editing.
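The arithmetic in this example can be reproduced in a few lines of Python. This is only a sketch using the figures quoted above ($0.20 per word, 866 payable words, $0.11 MT character cost); the function and parameter names are hypothetical.

```python
def post_editing_savings(full_words, payable_words, rate_per_word, mt_cost):
    """Return costs and savings for a job translated with MT post-editing.

    full_words    -- word count paid at the full rate without MT
    payable_words -- word count paid after the net rate scheme is applied
    rate_per_word -- full per-word rate in dollars
    mt_cost       -- what the MT provider charges for the job's characters
    """
    full_cost = full_words * rate_per_word
    mt_pe_cost = payable_words * rate_per_word
    gross_savings = full_cost - mt_pe_cost
    net_savings = gross_savings - mt_cost
    return {
        "full_cost": round(full_cost, 2),
        "mt_pe_cost": round(mt_pe_cost, 2),
        "gross_savings": round(gross_savings, 2),
        "net_savings": round(net_savings, 2),
        "net_savings_percent": round(100 * net_savings / full_cost, 1),
    }


# Figures from the example: 1000 words, 866 payable words, $0.20 per word,
# and $0.11 for roughly 5.5k machine-translated characters.
savings = post_editing_savings(1000, 866, 0.20, 0.11)
for name, value in savings.items():
    print(f"{name}: {value}")
```

Swapping in your own word rate, payable word count, and MT character cost gives a quick first estimate before you run a fuller analysis on your own data.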
C. NON-FINANCIAL BENEFITS
The potential increases in translation productivity are a huge benefit of MT
and MT post-editing (MT PE) and should be included in the business case.
One study on MT post-editing conducted by Intertranslations reported a
40% average increase in translation productivity per hour (Intertranslations,
2019).
Data from Memsource reflects this too. The graph below compares translation
productivity when using MT with productivity when translating from scratch
(with no translation resources).
NOTE
The main aim of this data analysis was to find trends in productivity, rather than
carry out a study into precise productivity values. The translation productivity data
shown in the graph is higher than the industry average of 2000-2500 words/day.
We suspect this is partly due to a certain level of noise in the data which is difficult
to remove and increases the productivity values. However, as this noise is common to
all language pairs and workflows, the overall productivity trends are not affected.
As Memsource is a productivity tool, it is likely that this has also driven up
productivity values.
The data is based on content translated from English into German, but every language
combination shows a similar trend. The MT quality scores were calculated based on
the human edits that were made to the MT suggestions. The fewer the edits, the higher
the MT quality and the higher the score; the more edits, the lower the MT quality and
the lower the score. A score of 100 means no edits were required.
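This guide does not describe the exact formula behind the MT quality score. As a rough illustration of the idea only, the Python sketch below derives a 0 to 100 score from the character-level Levenshtein edit distance between the MT suggestion and the post-edited text, normalized by the length of the longer string. Both the normalization and the function names are assumptions, not the Memsource method.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions, deletions
    and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + substitution_cost,  # keep or substitute
                )
            )
        previous = current
    return previous[-1]


def mt_quality_score(mt_output: str, post_edited: str) -> float:
    """Score from 0 to 100: 100 means no edits, 0 means everything changed."""
    longest = max(len(mt_output), len(post_edited))
    if longest == 0:
        return 100.0  # two empty segments need no editing
    return 100.0 * (1 - levenshtein(mt_output, post_edited) / longest)


print(mt_quality_score("abcd", "abcd"))  # 100.0
print(mt_quality_score("abcd", "abxd"))  # 75.0
```

A score built this way behaves as the text describes: fewer edits give a higher score, and an untouched MT suggestion scores 100.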
[Graph: translation productivity (y-axis, 0 to 4k) plotted against MT Quality Score
(x-axis, 0 to 100; higher means less post-editing). Series: short sentences, from
scratch; short sentences, MT PE; long sentences, MT PE.]
D. MT POST-EDITING TECHNOLOGY
Your business case should explain the technology options you are considering.
It’s best to select technology that’s easy to deploy and scale.
Here are the two main areas to consider when selecting MT post-editing
technology:
HOW TO CHOOSE AN MT ENGINE
In terms of choosing an MT engine, it’s best to use what’s readily available.
You are probably already familiar with generic MT engines, such as Google
Translate or Microsoft Translator. These engines are not trained with data
for a specific domain or topic. Starting with a generic out-of-the-box MT
option is relatively easy to do, especially when using a translation
management system (TMS), as typically you can just connect the MT engine to the
TMS and start using it. As you become more experienced with machine
translation, you can explore custom MT options, where an engine is trained
with your own data.
4 IMPLEMENTING MT POST-EDITING
Your implementation plan will depend on your use cases and your current localization
processes, but these are some general steps to consider when defining your approach.
For additional details on each step of the process, you can read this article on the
Memsource blog.
Implementing MT post-editing sections:
A. Select appropriate content for machine translation
B. Check the personal data policy of your MT provider/s carefully
C. Create a team of post-editors
D. Run (large) samples before deployment
E. Agree on a pricing model
B. CHECK THE PERSONAL DATA POLICY OF YOUR
MT PROVIDER/S CAREFULLY
Your company may have data storage policies that the MT provider must
comply with.
TRAINING POST-EDITORS
To get the most out of MT and to ensure your team is ready for the change,
it is a good idea to provide training that covers the limitations of MT and
teaches linguists how to post-edit. You may want to think about inviting your
linguists to take a post-editing course.
D. RUN (LARGE) SAMPLES BEFORE DEPLOYMENT
Compare human translation with MT output and generate an edit-distance
report or post-editing analysis.
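One simple way to sketch such a post-editing analysis is shown below. It uses `difflib.SequenceMatcher` from the Python standard library on word tokens, which gives a similarity ratio rather than a true edit distance, and groups segments into the match bands of the example net rate scheme. The function names and the sample segments are invented for illustration, and the analysis in a TMS may be calculated quite differently.

```python
from collections import Counter
from difflib import SequenceMatcher

# Lower bound of each band (in percent) and its label, highest band first.
# The labels follow the example net rate scheme in this guide.
BANDS = [
    (100, "100%"),
    (95, "95-99%"),
    (85, "85-94%"),
    (75, "75-84%"),
    (50, "50-74%"),
    (0, "0-49%"),
]


def similarity_percent(mt_output: str, final_text: str) -> float:
    """Word-level similarity between the raw MT output and the final text."""
    matcher = SequenceMatcher(None, mt_output.split(), final_text.split(), autojunk=False)
    return 100 * matcher.ratio()


def band_for(percent: float) -> str:
    """Return the label of the highest band whose lower bound is reached."""
    for lower_bound, label in BANDS:
        if percent >= lower_bound:
            return label
    return BANDS[-1][1]


def post_editing_analysis(segment_pairs):
    """Count segments per band for (mt_output, final_text) pairs."""
    report = Counter()
    for mt_output, final_text in segment_pairs:
        report[band_for(similarity_percent(mt_output, final_text))] += 1
    return report


# Invented sample: raw MT output paired with the post-edited or human version.
sample = [
    ("the cat sat", "the cat sat"),
    ("the cat sat on the mat", "the cat sat on a mat"),
    ("completely wrong output", "something else entirely"),
]
report = post_editing_analysis(sample)
for _, label in BANDS:
    print(label, report[label])
```

Running this on a large sample per language pair and content type shows how much of the MT output falls into the high bands, where little post-editing is needed.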
It’s always a good idea to conduct A/B testing for MT content vs human translated
content. That’s the best way to understand how customers are reacting to the MT
content.
Running different engine tests against each language was crucial to selecting the
most suitable option in terms of fluency and accuracy.
We promoted a culture of only correcting pre-translated content when there were real
stylistic or grammatical errors and avoiding subjective changes. We ran a pilot of
the MT post-editing process, and it required a lot of restraint to only make the
changes that were strictly necessary in terms of context, style, and client terminology.
We also had to come to terms with switching back and forth between the roles of
translator and proofreader/editor. An additional element was factoring in perceived
quality, i.e. the customer's perception of the overall quality of the service given
the purpose of the translation, as a way to decide whether a correction was necessary
or subjective. Conducting this pilot proved very worthwhile when it came to
the smooth roll-out of the post-editing strategy to all teams. After 6 months, we saw
overall efficiency gains and very motivated translators and proofreaders.
E. AGREE ON A PRICING MODEL
There are various ways in which MT post-editing pricing models can be
structured, and the right choice depends on how much experience your
organization and linguists have with post-editing.
Time spent: This is a good first option when starting out with
post-editing. You need to set reasonable and achievable goals, and
time must be tracked by all stakeholders. It is also important to
understand that productivity is dependent on the individual. This
option is particularly good when post-editors are also new, as it
ensures fair payment while they gain experience.
Fixed word rate: This is best for organizations which already have
some experience with MT post-editing, as it is based on historical
performance for the language, content type, and engine
chosen. It can, however, be problematic for vendors/resources as it
doesn’t account for time spent post-editing, which can vary
considerably given that MT quality is unstable.
5 EVALUATING THE SUCCESS OF MT IMPLEMENTATION
AND POST-EDITING
It is best to wait around 2 months after deploying the new MT and post-editing strategy
before starting to assess the success of the implementation.
There are three main indicators for monitoring the success of MT implementation:
A. QUALITY
Observe the post-editing analysis results per language and/or per domain.
Low levels of post-editing can be a good sign, as it means MT output quality
is generally high. However, under-editing is also a reality and can lead to
low-quality translations. Quality checks should be carried out regularly,
especially at the start, and the results compared to the original quality
goals: is near-human quality required, or should it be just good enough?
B. POST-EDITOR’S SATISFACTION
The happier post-editors are, the fewer quality issues you’ll have. This can
be measured by creating a satisfaction survey and sending it out to the
post-editors. Although it may not be possible to address all their concerns,
there are likely to be areas that can be improved and it’s important to
maintain effective lines of communication.
C. SAVINGS
You should monitor the savings over the first few months after deploying
the new strategy. By analyzing the numbers, you can decide where to act.
Should you change the MT engine? Do the post-editors require more
training? Is the content type not suitable? Is the language pair not suitable?
6 WHAT ABOUT RAW MACHINE TRANSLATION?
You have now learned about MT post-editing - but what are the alternatives? As MT
post-editing merely augments human translation, the cost and time savings are not as
significant as with raw machine translation. “Raw” means that the MT output is not
post-edited. The much higher savings, however, come with their own risks.
Since raw machine translation is not checked and corrected by humans before it is
presented to its consumers, it poses a number of risks, and not just the risk of
failing to translate content accurately. If not handled properly, machine
translation could even pose a risk to a brand by causing, for example, a cultural
faux pas. The volatile quality of MT suggestions means that sometimes the opposite
meaning of the original is communicated, idiomatic expressions are not properly
expressed, content becomes unintentionally offensive, and words are missed.
This does not mean that enterprise organizations can’t benefit from raw machine
translation. On the contrary. It’s a question of ensuring that the risks are managed
properly. Perhaps the easiest way of starting out with raw machine translation is for
internal use cases, such as company memos, internal documentation, etc. Any
incorrect or even ridiculous translations will do much less damage internally than
if you present them to a customer.
The real challenge is measuring the quality of raw MT. Consider this: should you
measure the translation quality by, for example, linguistic standards, user
satisfaction with the machine translation, or whether users were simply able to
achieve what they needed to do (e.g. resolve a technical issue)? Alternatively, will
you set up various conversion metrics related to actions supported by
machine-translated content? With a post-editing strategy, these kinds of quality
concerns need not arise because of the important human touch point.
7 CONCLUSION
The age of MT is here, and it’s here to stay. MT is the single biggest innovation in the
translation industry to date, which already affects many areas, and its influence will
continue to spread.
MT quality has improved dramatically in the last few years and although it is still not
entirely predictable, there are now both tools and strategies to estimate and improve
quality, as this e-book has explained. Adopting an effective post-editing strategy, and
using the right post-editing technology, is at the heart of this.
With MT and post-editing already delivering real commercial results, it’s important not
to be left behind, so start building your MT post-editing business case now.
8 ACKNOWLEDGMENTS
9 REFERENCES
Chrétiennot, G. Six Continents (2019). What is Machine Translation Post-editing?
[Video]. Retrieved from
https://register.gotowebinar.com/recording/4416112989671985933
Machine Translation Market to witness 19% CAGR till 2024, Driving Factors,
Growth Revenue, Key Vendors, Google, IBM, Lingotek, Microsoft, Omniscien
Technologies, Welocalize. (2019). [online] Available at:
https://www.marketwatch.com/press-release/machine-translation-market-to-witness-19-cagr-till-2024-driving-factors-growth-revenue-key-vendors-google-ibm-lingotek-microsoft-omniscien-technologies-welocalize-2019-02-07
[Accessed 10 Jan. 2020].
Rise of the machines - the state of machine translation (Report preview for
Project Opus) » Nimdzi Insights. (2019). Retrieved 10 January 2020, from
https://www.nimdzi.com/mt-report-preview-opus/
ABOUT MEMSOURCE
Memsource is the leading AI-powered translation management system. Global
companies choose Memsource for our robust TMS features combined with state-of-the-art
AI technology. At Memsource, we are focused on using AI to ensure you make the most of
the latest machine translation advances.
www.memsource.com