Corso-su-AI-post-editing_ENG


COURSE FOR

AI CONTENT POST-EDITORS
AND AI CONTENT REVIEWERS
About me
Lead Linguist for Creative Words
I translate and revise texts, supervise the
linguistic part of translation projects,
monitor and assess the quality of
translations and create instructions and
training materials for translators.

MARCO RUSSO
Contents
01 GENERATIVE AI AND LLMs
02 AI CONTENT POST-EDITORS & AI CONTENT REVIEWERS
03 TYPICAL ISSUES OF AI-GENERATED TEXTS
04 EXAMPLES OF AI-GENERATED TEXTS
05 CURRENT AND FUTURE TRENDS
06 USEFUL LINKS
01
GENERATIVE AI AND LLMs
What is generative AI?
Generative artificial intelligence (generative AI or GenAI) is a type of artificial
intelligence that has the ability to generate new and original content in response to
prompts. This type of artificial intelligence can be used to create images, videos, texts
and other types of content.

The main features of generative AI are as follows:


• It can create new and original content.
• It uses machine learning techniques, such as generative neural networks, to
generate content.
• It requires training on large amounts of representative data to generate
high-quality content.
• It can be used for multiple purposes, such as producing creative content, revising
existing texts, creating chatbots, generating synthetic data for scientific research,
writing code in various programming languages and generating images.
What are LLMs?
Large Language Models (LLMs) are a type of generative AI that uses natural
language processing (NLP) techniques to generate texts. These models are trained on
large text datasets and are able to perform various language-related tasks, including:

• Text generation: they can generate texts based on the prompts they receive, using
information learned from the corpus of texts they have been trained on. The texts
that can be generated include, for example, creative content, texts for chatbots,
product descriptions for marketing purposes and even entire articles on a specific
topic.
• Machine translation: they can be trained on bilingual datasets and used to
automatically translate text from one language to another.
• Natural language understanding: they can understand the meaning of texts and
answer questions or perform tasks that require natural language understanding.
For example, they can summarise articles, revise texts or create stories and poems.
NMT vs LLMs
Neural Machine Translation (NMT) is a machine learning approach specifically
designed for translation between languages. Large Language Models (LLMs), on the
other hand, were not designed with translation as their main use case and have
some limitations in this kind of task (see module 3). However, although less
predictable, they still have excellent linguistic skills: they can provide
good-quality machine translations and perform various tasks in the localisation
field, including QA, automatic post-editing, terminology extraction and
transcreation.

Source: The 2023 Nimdzi Language Technology Atlas


02
AI CONTENT POST-EDITORS
&
AI CONTENT REVIEWERS
Definition of new roles
The introduction of Generative Artificial Intelligence (generative AI) and Large
Language Models (LLMs), such as GPT and its variants, has had a significant
impact on the localisation sector, creating the need for new skills and new roles.

Two of these are AI Content Post-Editor and AI Content Reviewer. As these roles
are very recent and not yet established, there is currently no official
nomenclature, so you may find very different names referring to the same roles.

The AI Content Post-Editor revises texts in the target language within bilingual files, i.e. texts
that have been translated by a generative AI model rather than by a traditional machine
translation (MT) engine. This role, as the name suggests, is very similar to the role of
post-editor, and previous experience in it is certainly an advantage. However, compared to
traditional texts, AI-generated texts have some peculiarities (which we will see later) that you
need to be aware of during the post-editing process.

The AI Content Reviewer's task is to review original AI-generated texts, i.e. texts that have not
been translated from any language but generated directly in the language in which they will be
used. These are often creative texts produced by the AI based on instructions (prompts)
provided by the user. An example would be: "Could you create the copy for the advertising
campaign of a pair of high fashion shoes?". The prompt can of course become much more
detailed depending on the information provided by the customer (type of shoe, colour, type of
buyers it was designed for, level of formality, etc.).
This role does not require knowledge of a foreign language (and this is what mainly
differentiates it from the AI Content Post-Editor). Instead, it can be very useful to have
marketing and copywriting skills and experience in creating content from a client brief.
Skills of the new roles

• Linguistic skills: excellent knowledge of the language in which the texts are generated. This
includes a thorough understanding of grammar, syntax, vocabulary and stylistic rules. As
mentioned above, the AI Content Post-Editor should also have a very good knowledge of the
source language. In general, both must be aware of the peculiarities and errors associated
with AI-generated texts, such as inconsistencies, inappropriate style and tone, grammatical
errors, etc. (see module 3 for more details).
• Editing and proofreading skills: it may seem obvious, but, as for post-editors, you need to
be able to quickly identify and correct errors and improve fluency and consistency by using
the provided text as much as possible rather than rewriting it from scratch.
• Technological skills: basic understanding of how GenAI works, including the concepts of
neural networks, LLMs and AI training.
• Research and fact-checking skills: you need to carefully examine texts and
identify errors, inaccuracies, ambiguities, additions or omissions to ensure the accuracy of
information and avoid so-called 'hallucinations' (see module 3). To do this, the AI Content
Post-Editor just needs to compare the target text with the source, while the AI Content
Reviewer has to carry out specific searches.
03
TYPICAL ISSUES OF
AI-GENERATED TEXTS
• Hallucinations: the AI may produce sentences or paragraphs that seem coherent and
well-written, but contain made-up information or concepts that do not correspond to reality
(in the case of translation, reality corresponds to the source text). The AI could distort the
information provided in the source text or in the prompt and/or add new information that is
completely false or misleading. Besides additions, significant omissions can of course also
occur.
• Coherence issues: AI models may occasionally produce texts that lack coherence. There
may be logical inconsistencies or abrupt transitions between sentences that make the text
less fluent.
• Cultural and linguistic sensitivity issues: AI models may not be fully aware of different
cultural and linguistic sensitivities. Any insult, stereotype or inappropriate expression should
be identified and corrected. Furthermore, despite precautionary measures, generative AI
models can be negatively influenced by the data they have been trained on and include
prejudice, racism, sexism or other discrimination in the texts they produce.
• Originality and creativity issues: AI-generated texts are based on statistical models and
information included in the training data, so they are created without a deep understanding
of meaning or context and may have limited creativity and lack originality. It is important,
particularly for the AI Content Reviewer, to assess the originality and creativity of the
generated texts and to identify any plagiarised or unoriginal content.
• Grammatical and syntactical errors: despite significant progress in natural language
processing, texts generated by AI may still contain grammatical or syntactical errors, e.g.
subject-verb agreement errors, gender or number agreement errors, punctuation errors, etc.
• Capitalisation errors: frequent in titles and after colons. Examples (S = source,
T = AI translation, R = human review):
S: Our Values
T: I Nostri Valori
R: I nostri valori

S: International outreach: We are an international company
T: Presenza internazionale: Siamo un’azienda internazionale
R: Presenza internazionale: siamo un’azienda internazionale
• Inappropriate style and tone: even with specific instructions, AI models can sometimes
generate texts with a style, tone and level of formality that may not be appropriate for the
context or target audience.
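
The capitalisation errors listed above are regular enough to be flagged automatically before a human review. The following is only a minimal sketch in Python, assuming an Italian target text in which titles use sentence case and a lowercase letter is expected after a colon. The function names `find_capital_after_colon` and `find_title_case_words` are hypothetical, and both checks are rough heuristics: proper nouns will be flagged as well, so every hit still needs a human decision.

```python
import re

# Assumption: the target language (e.g. Italian) expects lowercase after a colon.
# The pattern captures the first non-space character that follows a colon.
_AFTER_COLON = re.compile(r":\s+(\S)")


def find_capital_after_colon(segment: str) -> list[int]:
    """Return the offsets of uppercase letters that directly follow a colon."""
    return [
        match.start(1)
        for match in _AFTER_COLON.finditer(segment)
        if match.group(1).isupper()
    ]


def find_title_case_words(title: str) -> list[str]:
    """Return every capitalised word after the first word of a title.

    Heuristic only: proper nouns are legitimate and will also be returned.
    """
    words = title.split()
    return [word for word in words[1:] if word[:1].isupper()]


if __name__ == "__main__":
    # Examples taken from the slide above (T = AI translation).
    print(find_capital_after_colon("Presenza internazionale: Siamo un'azienda internazionale"))
    print(find_title_case_words("I Nostri Valori"))
```

A check like this only narrows down where to look; it cannot decide whether a capital letter is actually wrong in context.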
04
EXAMPLES OF
AI-GENERATED TEXTS
Texts generated through translation

SOURCE: Fermatevi in alcuni punti per scattare foto dei luoghi più significativi della città.
AI TRANSLATION: Stop at some places to take photographs of those significant places in the city.
HUMAN REVIEW: Stop at some of the locations to take pictures of significant places in the city.

SOURCE: Passeggiate negli oltre 70.000 m2 del parco e trovate la vostra attrazione preferita.
AI TRANSLATION: Stroll around the more 70.000 m2 of the park and find your favourite attraction.
HUMAN REVIEW: Stroll around the more than 70.000 m2 of the park and find your favourite attraction.

SOURCE: Collaboriamo con i nostri clienti
AI TRANSLATION: We collaborate closely* with our valued* clients to meet their needs.*
HUMAN REVIEW: We collaborate with our clients

SOURCE: Ciò ci consente di favorire lo sviluppo dei nostri collaboratori.
AI TRANSLATION: This allows us to encourage the sustainable* development of our collaborators.
HUMAN REVIEW: This allows us to encourage the development of our collaborators.

SOURCE: Avere la botte piena e la moglie ubriaca.
AI TRANSLATION: To have the barrel full and the wife drunk.
HUMAN REVIEW: Have your cake and eat it too.

SOURCE: Sono andato a fare una passeggiata nel parco.
AI TRANSLATION: I went for a walk in the flying rabbit* park.
HUMAN REVIEW: I went for a walk in the park.

SOURCE: Sentii una brezza leggera sussurrare tra le foglie.
AI TRANSLATION: I heard a light breeze whispering through the leaves like an enchanted melody*.
HUMAN REVIEW: I heard a light breeze whispering through the leaves.

* Hallucinations
Texts generated directly in the language of use

PROMPT: Create an advertising text for the market launch of a new model of red canvas shoes.
AI-GENERATED TEXT: Our new red shoes are an explosion of style and boldness. The shoes are made of fine leather* in a deep red colour that catches the attention of anyone wearing them.
HUMAN REVIEW: Our new red shoes are an explosion of style and boldness. The shoes are made of fine canvas in a deep red colour that catches the attention of anyone around you.

PROMPT: Describe a hypothetical attraction of an amusement park by never addressing the reader.
AI-GENERATED TEXT: [...] "Whirlwind Wonder" is a roller coaster that combines high speeds with dizzying heights - it's an experience you won't want to miss!
HUMAN REVIEW: [...] "Whirlwind Wonder" is a roller coaster that combines high speeds with dizzying heights - it's an experience not to be missed!

PROMPT: Create a description of restaurant X based on features Y.
AI-GENERATED TEXT: [...] A refined ambiance with a cosy fireplace that overheats the room, creating a homely atmosphere and making every moment unique.
HUMAN REVIEW: [...] A refined ambience with a cosy fireplace that warms the room, creating a homely atmosphere and making every moment unique.

PROMPT: Create a description of an imaginary landscape with four-word sentences.
AI-GENERATED TEXT: The grass was green. The mountains were blue. The river was purple. The grass was brown*.
HUMAN REVIEW: The grass was green. The mountains were blue. The river was purple. The leaves were brown.

* Hallucinations
05
CURRENT AND FUTURE TRENDS
The field of generative AI, and of LLMs in particular, is constantly evolving. Over the past year
(2023), LLMs have developed rapidly and changed significantly, and they can now generate
very high-quality texts. At present, LLMs such as OpenAI's GPT-3 and GPT-4 have reached
levels of sophistication that allow them to generate coherent and persuasive texts in a wide
range of styles and formats. They can write blog articles, answer questions, translate texts,
create convincing dialogues and even write code in programming languages.

However, despite considerable progress, these models still have their limitations. For
instance, they may generate incorrect or misleading information (hallucinations), texts that
lack coherence or output that does not comply with the instructions given in prompts.

In the future, we are likely to see further improvement in the quality of texts generated by
AI models. Researchers continue to refine these models to reduce errors and improve
coherence. Furthermore, there is growing interest in using AI models in an increasingly
broad range of applications. These include not only text generation, but also text revision,
natural language processing (NLP), machine translation, speech synthesis and much more.

In summary, the field of GenAI and the content it generates is evolving very
rapidly, but for the time being, human review remains essential to ensure quality
and appropriateness. The exponential increase of such content will most likely
lead to a sharp increase in the demand for revisions to correct or improve the
generated texts or to assess their quality.

AI Content Post-Editors and AI Content Reviewers will therefore have to
continuously adapt and develop their skills to keep up with these trends.
06
USEFUL LINKS
• ChatGPT & LLMs – Separating Fact from Fiction for Localization (Nimdzi)
• The Language AI Alphabet: Transformers, LLMs, Generative AI, and ChatGPT (Nimdzi)
• How to Use ChatGPT as a Language Translation Tool (MAKE USE OF)
• Innovating at the Speed of AI (Slator)
• Embracing Generative AI and Linguist Prompt Engineering (GALA)
• The Impact of Generative AI in the Translation Industry and beyond
Thank you!
Do you have any questions?

Contacts

E-mail: info@creative-words.com
Web: www.creative-words.com
Tel: +39 (0) 10 897050
Mob: +39 320 9730292

Via Cairoli 1/4 16124 (GE)
Via Paolo da Cannobio 37, 20122 (MI)
IT02431070990
Working hours: 9 AM - 6 PM
