
HOW TO UNLOCK THE POTENTIAL OF MACHINE TRANSLATION
Your Guide to Implementing MT Post-editing

CONTENTS
1 INTRODUCTION
2 WHY CHOOSE AN MT POST-EDITING STRATEGY?
3 CREATING A BUSINESS CASE FOR MT POST-EDITING
4 IMPLEMENTING MT POST-EDITING
5 EVALUATING THE SUCCESS OF MT IMPLEMENTATION AND POST-EDITING
6 WHAT ABOUT RAW MT?
7 CONCLUSION
8 ACKNOWLEDGMENTS
9 REFERENCES

www.memsource.com
1 INTRODUCTION
For organizations breaking into and establishing themselves in new markets,
developing an effective localization strategy that meets market needs requires
considerable planning. English may well be the language of business worldwide, but
increasingly customers expect that the content they consume or the services they use
will be available in their preferred language, almost instantaneously. This puts
pressure on companies to provide the same user experience to customers across the
globe, all the while staying within budget.

Human-only translation, either in-house, or outsourced, has been the norm in many
global companies, but it isn’t scalable, is very costly, and has a long turnaround time.
Also, it isn’t necessary for many use cases. Machine translation (MT), combined with
human translators who edit the MT output – post-editors – can fill this gap.

There are two main advantages to using MT. The first is that it can increase the efficiency
of “traditional” human translation. A professional linguist edits the target text to
“near-human level”, correcting any grammatical or linguistic errors, and the estimated
productivity rate is approximately double, compared to normal translation. The
second is that it makes it possible to translate content which otherwise would not be
translated, typically because opting for human-only translation would be too costly or
time-consuming. The end result will still be a target text where any grammatical,
syntactical, and semantic errors are eliminated.

Over the past few years, we have seen the use of MT increase, and based on recent
market estimations, MT usage will continue to grow, especially given the number of
tech giants entering the space: Microsoft, Google, Amazon, and now Apple. The MT
market has been valued at between USD 130 million and USD 400 million (Nimdzi, 2019)
and is estimated to exceed USD 1.5 billion by 2024 (Marketwatch, 2019).

We can see that the overwhelming majority of Memsource users work with MT engines
on at least some of their projects.

The pie chart below shows the ratio of organizations that used machine translation in
at least 1% of their jobs between 2018 and 2019. We included only organizations that
have translated at least 10 000 words in total.

[Pie chart: MT used 82.16%; MT not used 17.84%]

One key reason why MT usage is high is that general MT quality has improved
significantly, thanks to increasing investment and research into neural machine translation
(NMT) and deep learning. The quality improvements are reflected in Memsource data
from the past few years.

The graph below shows the proportion of MT output that is high quality, that is, output
with a quality score of at least 50 or at least 75 on a scale of 0 to 100. (The maximum
quality score, 100, is on par with professional human translation.) These scores are
based on the amount of editing required on the MT output. Over the last two years, the
proportion of high-quality MT has been increasing, so the amount of required
post-editing has been decreasing.

[Bar chart: proportion of MT output at or above each quality score, by year]

                 2017    2018    2019
50+ Quality      57%     62%     68%
75+ Quality      23%     26%     33%

Despite this overall improvement in quality, high-quality output is not guaranteed.
This means that although there are some use cases where using raw, un-edited, MT is
appropriate, for the majority of content, it is still important for human linguists to
review and edit the MT output.

Adopting an MT strategy based on MT post-editing can be the most effective and
risk-free way of using machine translation. This ebook serves as a practical
step-by-step guide for anyone interested in using MT or trying to expand the use of MT
in their organization. And even if you outsource your translations to a vendor, this is a
useful resource for convincing them to move to an MT post-editing business model.

2 WHY CHOOSE AN MT POST-EDITING STRATEGY?


MT post-editing may be the most effective way to start out with machine translation.
Why? It helps eliminate one of the main risks associated with machine translation – its
volatile quality. When you provide machine translation as an additional resource to
your translators, they will be able to produce translations faster without the risk of
compromising translation quality. In fact, producing high-quality human translation and
using machine translation as an aid has become very common.

Machine translation is the starting point for approximately 35% of content translated
by professional human translators according to Memsource data. One major advantage
is that almost any content is suitable for post-editing, perhaps with the exception of
highly creative marketing or literary content that needs to be transcreated rather than
translated. Otherwise, machine translation post-editing is suitable because every single
word is checked and, if needed, corrected by human translators.

3 CREATING A BUSINESS CASE FOR MT POST-EDITING
There are four key areas you should focus on to build a compelling MT post-editing
business case. We have taken each area and provided guidance on what you
should be presenting to your key decision-makers.

Business case sections:
A. MT Post-Editing and your Business Strategy
B. MT Post-Editing Savings
C. Non-financial Benefits
D. MT Post-Editing Technology

It’s very important to build a strong business case and define your goals.
How easy or difficult it is to push through an MT business case all depends on the
stakeholders. If stakeholders belong to, for example, Support or Sales teams, it’s
easier because they often have content which changes frequently and needs to
be translated on a regular basis.

A. MT POST-EDITING AND YOUR BUSINESS STRATEGY


You should explain how MT post-editing fits in with your company’s
business plan. For example, preserving the human touch and guaranteeing
accuracy can be very important parts of a company’s strategy and brand.
Machine translation does not contradict this. While the perception by
some is that MT removes humans from the translation process, in fact it
can also be used as a tool to increase human efficiency.

Equally, if you define yourself as a global company or are looking to break
into the international market, it’s important to have a localization strategy
in place which meets your global clients’ expectations. When implemented
correctly, along with other translation resources, MT post-editing can
reduce your costs and time to market, allowing you to localize a larger
volume of content and reach more clients or prospects worldwide.

B. MT POST-EDITING SAVINGS
Presenting potential savings is key when persuading your main stakeholders.
Machine translation savings depend on two main factors: the language pair
and the content type. The language pair plays a role because, for the most
common language pairs, there is more bilingual training data so machine
translation results tend to be more accurate and idiomatic. Content type
is also significant because generally MT is better suited to certain content
types, such as technical texts.

NOTE
When evaluating the success of your own MT
post-editing strategy it’s important to assess
the financial savings using your own data.
There are different ways of doing this, but one
effective method is presented in the TAUS
Post-Editing Guidelines.

Below is an estimate of the money that can be saved with machine translation
and post-editing, based on Memsource data.

To calculate the potential savings from MT post-editing, we compared the
cost of translating 1000 words using MT post-editing with what it would
have cost to translate the same 1000 words from scratch. (The 1000 words
are taken from aggregated data gathered over 1 year which has been normalized
to 1000 words. Also, the example assumes no other translation resources,
such as translation memory, were used.)

When translators use results from translation memory (TM), a discount can
be applied using a Net Rate Scheme. The same is true for machine translation.

Here’s an example of a net rate scheme for translation memory and
machine translation results. The Match Type represents the amount
of editing that was required, and the Percentage paid of the full
rate (discount rate) depends on this match type. You can think of MT
output which has not been post-edited as 100% MT matches and, in just
the same way as TM, the translation rate is reduced.

Percentage paid of full rate

Match type      TM      MT
Repetitions     10      -
101%            10      -
100%            10      30
95-99%          33      40
85-94%          66      70
75-84%          100     100
50-74%          100     100
0-49%           100     100

[Bar chart: raw word count (RAW: 1000) compared with net word count (NET: 886)]

NOTE
Please be aware that the translation memory discount rates shown here reflect
general industry standards, although rates do vary. There are currently no industry
standards for machine translation – the rates shown are estimations. Generally,
companies apply discount rates according to their own financial strategy.

The graph shows the difference between the raw word count and the weighted word
count with the net rate scheme applied. By applying the net rate scheme and
identifying the different match types, the number of words to pay for decreases.
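
As a rough illustration of how a net rate scheme turns a raw word count into a weighted one, here is a minimal Python sketch. It uses the MT column of the example scheme above. The per-match-type word breakdown and the names `MT_NET_RATES`, `net_word_count`, and `example_job` are hypothetical, picked only to make the arithmetic easy to follow; they are not Memsource data.

```python
# Percentage of the full rate paid per MT match type
# (MT column of the example net rate scheme; estimations, not a standard).
MT_NET_RATES = {
    "100%": 30,
    "95-99%": 40,
    "85-94%": 70,
    "75-84%": 100,
    "50-74%": 100,
    "0-49%": 100,
}


def net_word_count(words_by_match, net_rates):
    """Return the weighted (net) word count for a job.

    Each match type contributes its word count multiplied by the
    percentage of the full rate that is paid for that match type.
    """
    weighted = sum(words * net_rates[match] for match, words in words_by_match.items())
    return weighted / 100


# Hypothetical breakdown of a 1000-word job by MT match type.
example_job = {
    "100%": 110,
    "95-99%": 45,
    "85-94%": 100,
    "75-84%": 200,
    "50-74%": 300,
    "0-49%": 245,
}

print(sum(example_job.values()))                  # 1000 raw words
print(net_word_count(example_job, MT_NET_RATES))  # 866.0 net words
```

Only the three highest match bands reduce the payable word count in this scheme, so the size of the discount depends almost entirely on how much of the MT output needed little or no editing.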

Let’s look at this in financial terms. Say you pay $0.2 per translated word:
1000 words would cost you $200, but if you used MT, you would only pay $173.2.
The gross savings would then be $26.8. You need to deduct the cost of the machine
translation characters to get the net savings. Machine translation is generally
charged in characters, and 1000 words is around 5.5k characters. One of the top MT
providers charges $0.11 for 5.5k characters. The net savings are therefore $26.69,
as shown in the table below.

                Word count      Cost ($) ($0.2 per word)
NO MT PE        1000            200
MT PE           866             173.2

Gross Savings ($)                               26.8
Net Savings ($) (minus MT character cost)       26.69

This equates to savings of around 13% from using machine translation post-editing.
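
The savings arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the calculation described in the text: the function name `post_editing_savings` and its parameters are made up for illustration, and the inputs are the example figures quoted above ($0.2 per word, 866 net words, $0.11 MT cost).

```python
from decimal import Decimal


def post_editing_savings(raw_words, net_words, rate_per_word, mt_cost):
    """Compare translating from scratch with MT post-editing.

    Decimal is used instead of float so the currency amounts come out exact.
    """
    full_cost = raw_words * rate_per_word   # translating every word from scratch
    pe_cost = net_words * rate_per_word     # paying only for the net word count
    gross = full_cost - pe_cost
    net = gross - mt_cost                   # deduct what the MT provider charges
    return {
        "full_cost": full_cost,
        "pe_cost": pe_cost,
        "gross_savings": gross,
        "net_savings": net,
        "net_percent": net / full_cost * 100,
    }


result = post_editing_savings(
    raw_words=1000,
    net_words=866,
    rate_per_word=Decimal("0.2"),
    mt_cost=Decimal("0.11"),
)
print(result["gross_savings"])  # 26.8
print(result["net_savings"])    # 26.69
```

With these inputs the MT character cost is small next to the translation fee, which is why gross and net savings differ by only a few cents.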

C. NON-FINANCIAL BENEFITS
The potential increases in translation productivity are a huge benefit of MT
and MT post-editing (MT PE) and should be included in the business case.
One study on MT post-editing, conducted by Intertranslations, reported a
40% average increase in translation productivity per hour (Intertranslations,
2019).

Data from Memsource reflects this too. The graph below compares translation
productivity when using MT with productivity when translating from scratch
(with no translation resources).

NOTE
The main aim of this data analysis was to find trends in productivity, rather than
carry out a study into precise productivity values. The translation productivity data
shown in the graph is higher than the industry average of 2000-2500 words/day.
We suspect this is partly due to a certain level of noise in the data which is difficult
to remove and increases the productivity values. But, as this noise is common to
all language pairs and workflows, the overall productivity trends are not affected.
As Memsource is a productivity tool, it is likely that this has also driven up
productivity values.

The data is based on content translated from English-German, but in every language
combination, there is a similar trend. The MT quality scores were calculated based on
the human edits that were made to the MT suggestions. (The lower the number of edits,
the higher the MT quality, and the higher the score. The greater the number of edits,
the lower the MT quality, and the lower the score. A score of 100 means no edits were
required).

There is often an assumption that only high-quality MT output improves translation
productivity. But this data shows that even for lower quality MT output, with a score
between 50-60, translation productivity is greater compared with translating from
scratch with no MT. And for “perfect” MT output (with a score of 100), translation
productivity triples for short sentences and quadruples for long sentences, compared
to translating from scratch.

[Line chart: productivity (words/hour, 0 to 4k) against MT quality score (0 to 100;
higher means less post-editing). Series: short sentences, from scratch; short
sentences, MT PE; long sentences, from scratch; long sentences, MT PE]

SAVE TIME WITH MACHINE TRANSLATION QUALITY ESTIMATION


To make post-editing more efficient, Memsource’s AI team has built a Machine
Translation Quality Estimation (MTQE) feature which automatically calculates
quality scores before post-editing, so that linguists know which content to
focus on. This removes the guesswork from MT.

D. MT POST-EDITING TECHNOLOGY
Your business case should explain the technology options you are considering.
It’s best to select technology that’s easy to deploy and scale.

When it comes to MT post-editing, most translation tools have some
degree of support but there are big differences in how well MT post-editing
is supported.

Here are the two main areas to consider when selecting MT post-editing
technology:

A good range of integrations: Leading translation management
system (TMS) providers, like Memsource, support dozens of MT
engines. There are so many MT options available that the tricky
part is deciding which one to choose. Some tech providers, including
Memsource, have introduced technology that selects the best
machine translation engine automatically for a customer’s content.

CHOOSING THE BEST MT ENGINE IS EASIER THAN EVER BEFORE WITH MEMSOURCE TRANSLATE
Selecting an MT engine used to be a slow process involving human evaluation
of the MT output. Now, Memsource Translate automatically selects the
best MT engine for your content.

Post-editing Analysis: Post-editing analysis measures the edit
distance between the raw machine translation output and the final
post-edited translation, that is, the editing effort required to turn
the raw output into the final translation. (It is what we used to
calculate the savings from MT post-editing above.) It gives you an
almost exact idea of the post-editing effort that was required on a
translation, so it can be used as part of an MT post-editing pricing
model (see the Implementing MT post-editing section below).
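
To make the idea of edit distance concrete, here is a small Python sketch that turns the difference between raw MT output and its post-edited version into a 0 to 100 score, where 100 means no edits were needed. This ebook does not give the formula Memsource uses, so the choices here are assumptions for illustration: character-level Levenshtein distance, normalized by the length of the longer string. The function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if char_a == char_b else 1)
            current.append(min(previous[j] + 1,      # deletion
                               current[j - 1] + 1,   # insertion
                               substitution))
        previous = current
    return previous[-1]


def mt_quality_score(raw_mt: str, post_edited: str) -> float:
    """Assumed scoring: 100 means the MT output was left unchanged."""
    longest = max(len(raw_mt), len(post_edited))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(raw_mt, post_edited) / longest)


print(levenshtein("kitten", "sitting"))  # 3
print(mt_quality_score("abcd", "abcf"))  # 75.0
```

A production implementation would more likely work on words or tokens than on characters and might weight edit types differently; the point is only that fewer edits give a higher score.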

HOW TO CHOOSE AN MT ENGINE
In terms of choosing an MT engine, it’s best to use what’s readily available.
You are probably already familiar with generic MT engines, such as Google
Translate or Microsoft Translator. These engines are not trained with data
for a specific domain or topic. Starting with a generic out-of-the-box MT
option is relatively easy to do, especially when using a translation
management system (TMS), as typically you can just connect the MT engine to the
TMS and start using it. As you become more experienced with machine
translation, you can explore custom MT options, where an engine is trained
with your own data.

4 IMPLEMENTING MT POST-EDITING
Implementing MT post-editing sections:
A. Select appropriate content for machine translation
B. Check the personal data policy of your MT provider/s carefully
C. Create a team of post-editors
D. Run (large) samples before deployment
E. Agree on a pricing model

Your implementation plan will depend on your use cases and your current
localization processes, but these are some general steps to consider when defining
your approach. For additional details on each step of the process, you can read
this article on the Memsource blog.

A. SELECT APPROPRIATE CONTENT FOR MACHINE TRANSLATION
As mentioned above, it’s best to avoid highly creative marketing content,
which needs to be transcreated. But, in general, when you adopt an MT
post-editing strategy, you can use MT for any other types of content.
Machine translation post-editing is particularly suited for large volumes of
text, tender documentation, internal documentation, manuals, etc., where
time is of the essence.

B. CHECK THE PERSONAL DATA POLICY OF YOUR
MT PROVIDER/S CAREFULLY
Your company may have data storage policies that the MT provider must
comply with.

C. CREATE A TEAM OF POST-EDITORS


Assemble a team of post-editors or re-trained linguists to execute the
post-editing step. Decide whether to opt for light post-editing (making
minimal changes) or full post-editing (more extensive).

TRAINING POST-EDITORS
To get the most out of MT and to ensure your team is ready for the change,
it is a good idea to provide training that covers both the limitations of MT
and post-editing itself. You may want to think about inviting your
linguists to take a post-editing course.

“Training was an important part of the process. We focused on and are still
engaged in presenting MT as a useful tool that is going to make life easier for
translators and linguists in general by taking care of the “heavy lifting”, leaving
space for more pleasant and challenging tasks. We promote post-editing as a
critical set of new skills to be learned to stay up to speed with a fast-paced and
future-oriented industry.

As a first step, we provided translators and post-editors with training materials
where they can learn the process of post-editing using our MT engine. We
then sent out questionnaires about the quality and usability of our MT engine
and the productivity increase or decrease they experienced.”

Aglatech14

D. RUN (LARGE) SAMPLES BEFORE DEPLOYMENT
Compare human translation with MT output and generate an edit-distance
report or post-editing analysis.

“It’s always a good idea to conduct A/B testing for MT content vs human
translated content. That’s the best way to understand how customers are reacting
to the MT content.

Running different engine tests against each language was crucial to selecting
the most suitable option in terms of fluency and accuracy.”

Senior Digital Marketing Specialist, Leading Software company

We promoted a culture of only correcting pre-translated content when there were real
stylistic or grammatical errors and avoiding subjective changes. We ran a pilot trying
out the MT post-editing process and it required a lot of restraint to only make the
changes that were strictly necessary in terms of context, style, and client terminology.
We also had to come to terms with switching back and forth between the role of a
translator and a proofreader/editor. An additional element was factoring in the perceived
quality aspect i.e. the customer's perception of the overall quality of the service given
the purpose of the translation as a way to decide whether something was a necessary
or subjective correction. Conducting this pilot proved very worthwhile when it came to
the smooth roll-out of the post-editing strategy to all teams. After 6 months, we saw
overall efficiency gains and very motivated translators and proofreaders.

E. AGREE ON A PRICING MODEL
There are various ways in which MT post-editing pricing models can be
structured and it depends on how much experience your organization and
linguists have with post-editing.

The three main options are:

Time spent: This is a good first option when starting out with
post-editing. You need to set reasonable and achievable goals and
time must be tracked by all stakeholders. It is also important to
understand that productivity is dependent on the individual. This
option is particularly good when post-editors are also new, as it
ensures fair payment as they gain experience.

Fixed word rate: This is best for organizations which already have
some experience with MT post-editing and it’s based on historical
performances, according to the language/content type/engine
chosen. It can, however, be problematic for vendors/resources as it
doesn’t account for time spent post-editing, which can vary
considerably given that MT quality is unstable.

Post-editing effort: This model is based on the editing effort
measured after a translation is delivered, using Post-Editing Analysis,
pioneered by Memsource. It means the final fee is only known after
the job is complete.

Source: (Chrétiennot, 2019)
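
The three pricing models differ mainly in what gets multiplied by what. The following Python sketch puts them side by side. All function names and all rates (the hourly rate, the reduced per-word post-editing rate, the hours worked) are hypothetical placeholders, not recommended or typical values; only the $0.2 full rate and the 866 net words come from the savings example earlier in this ebook.

```python
from decimal import Decimal


def time_spent_fee(hours, hourly_rate):
    """Time spent: pay for the tracked post-editing time."""
    return hours * hourly_rate


def fixed_word_rate_fee(raw_words, pe_rate_per_word):
    """Fixed word rate: pay a pre-agreed (reduced) rate for every word,
    regardless of how much editing each segment needed."""
    return raw_words * pe_rate_per_word


def effort_based_fee(net_words, full_rate_per_word):
    """Post-editing effort: pay the full rate on the net word count,
    which is only known once post-editing analysis has been run."""
    return net_words * full_rate_per_word


# Hypothetical 1000-word job.
print(time_spent_fee(Decimal("4"), Decimal("40")))      # 160
print(fixed_word_rate_fee(1000, Decimal("0.15")))       # 150.00
print(effort_based_fee(866, Decimal("0.2")))            # 173.2
```

The first two fees can be quoted before work starts, while the effort-based fee depends on a measurement taken after delivery, which is the trade-off the list above describes.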

IMPROVE PROJECT SCOPING WITH MACHINE TRANSLATION QUALITY ESTIMATION
Memsource’s Machine Translation Quality Estimation (MTQE) feature can be
used to estimate the amount of post-editing required on a job so you can
scope out a project more accurately.

5 EVALUATING THE SUCCESS OF MT IMPLEMENTATION AND POST-EDITING
It is best to wait around 2 months after deploying the new MT and post-editing strategy
before starting to assess the success of the implementation.

There are three main indicators for monitoring the success of MT implementation:

A. QUALITY
Observe the post-editing analysis results per language and/or per domain.
Low levels of post-editing can be a good sign as it means MT output quality
is generally high. But, under-editing is also a reality and can lead to low
quality translations. Quality checks should be carried out regularly,
especially to start off with, and the quality should be compared to the original
quality goals: is near-human quality required, or should it be just good enough?

B. POST-EDITOR’S SATISFACTION
The happier post-editors are, the fewer quality issues you’ll have. This can
be measured by creating a satisfaction survey and sending it out to the
post-editors. Although it may not be possible to address all their concerns,
there are likely to be areas that can be improved and it’s important to
maintain effective lines of communication.

C. SAVINGS
You should monitor the savings over the first few months after deploying
the new strategy. By analyzing the numbers, you can decide where to act.
Should you change the MT engine? Do the post-editors require more
training? Is the content type not suitable? Is the language pair not suitable?

6 WHAT ABOUT RAW MACHINE TRANSLATION?
You have now learned about MT post-editing - but what are the alternatives? As MT
post-editing merely augments human translation, the cost and time savings are not as
significant as with raw machine translation. “Raw” means that the MT output is not
post-edited. The much higher savings, however, come with their own risks.

Since raw machine translation is not checked and corrected by humans before it is
presented to its consumers, it poses a number of risks – this isn’t just in terms of not
meeting its purpose to accurately translate content. If not handled properly, machine
translation could even pose a risk to a brand, causing, for example, a cultural faux pas.
The volatile quality of MT suggestions means that sometimes the opposite meaning of the
original is communicated, idiomatic expressions are not properly expressed, content
becomes unintentionally offensive, and words can be missed.

[Illustration: the French “traduction automatique” rendered as a scrambled English
“Machine translation”]

This does not mean that enterprise organizations can’t benefit from raw machine
translation. On the contrary. It’s a question of ensuring that the risks are managed
properly. Perhaps the easiest way of starting out with raw machine translation is for
internal use cases, such as company memos, internal documentation, etc. Any
incorrect or even ridiculous translations will do much less damage internally than if you
present them to a customer.

As mentioned before, the quality of machine translation depends significantly on a
number of factors including content type and language pair. Although these factors
affect both machine translation post-editing and raw machine translation, the critical
difference is that with raw MT, errors remain unchecked. The actual machine
translation technology used will also determine the quality of the output.

The real challenge is measuring the quality of raw MT. Consider this: should you
measure the translation quality by, for example, linguistic standards, user satisfaction
with the machine translation, or whether users were simply able to achieve what they
needed to do (e.g. resolve a technical issue)? Alternatively, will you set up various
conversion metrics related to actions supported by machine translated content? With a
post-editing strategy, these kinds of quality concerns need not arise because of the
important human touch point.

7 CONCLUSION
The age of MT is here, and it’s here to stay. MT is the single biggest innovation in the
translation industry to date, which already affects many areas, and its influence will
continue to spread.

MT quality has improved dramatically in the last few years and although it is still not
entirely predictable, there are now both tools and strategies to estimate and improve
quality, as this e-book has explained. Adopting an effective post-editing strategy, and
using the right post-editing technology, is at the heart of this.

With MT and post-editing already delivering real commercial results, it’s important not
to be left behind, so start building your MT post-editing business case now.

8 ACKNOWLEDGMENTS

Thank you to everyone who contributed to this e-book.
Your input was invaluable.

GAËTAN CHRÉTIENNOT, CEO
TORBEN DAHL JENSEN, Language Technology Manager
ELENA MURGOLO, Language Technology Lead
DEEPAK NAGABHUSHANA, Senior Localization Project Manager
NANA TAJIMA, General Manager Quality Management Division

9 REFERENCES
Chrétiennot, G. Six Continents (2019). What is Machine Translation Post-editing?
[Video]. Retrieved from
https://register.gotowebinar.com/recording/4416112989671985933

Intertranslations. (2019). Machine Translation: From Translation to Post-editing
[Video]. Retrieved from
https://www.gala-global.org/ondemand/machine-translation-translation-post-editing#video

Machine Translation Market to witness 19% CAGR till 2024, Driving Factors,
Growth Revenue, Key Vendors, Google, IBM, Lingotek, Microsoft, Omniscien
Technologies, Welocalize. (2019). [online] Available at:
https://www.marketwatch.com/press-release/machine-translation-market-to-witness-19-cagr-till-2024-driving-factors-growth-revenue-key-vendors-google-ibm-lingotek-microsoft-omniscien-technologies-welocalize-2019-02-07
[Accessed 10 Jan. 2020].

Rise of the machines - the state of machine translation (Report preview for
Project Opus) » Nimdzi Insights. (2019). Retrieved 10 January 2020, from
https://www.nimdzi.com/mt-report-preview-opus/

ABOUT MEMSOURCE
Memsource is the leading AI-powered translation management system. Global
companies choose Memsource for our robust TMS features combined with state-of-the-art AI
technology. At Memsource, we are focused on using AI to ensure you make the most of
the latest machine translation advances.

Contact us to learn more at memsource.com/contact-us

MEMSOURCE EMPLOYEES ARE BASED ACROSS THE GLOBE!
