Corso-su-AI-post-editing_ENG
AI CONTENT POST-EDITORS
AND AI CONTENT REVIEWERS
About me
Lead Linguist for Creative Words
I translate and revise texts, supervise the
linguistic part of translation projects,
monitor and assess the quality of
translations and create instructions and
training materials for translators.
MARCO RUSSO
Contents
01 GENERATIVE AI AND LLMs
02 AI CONTENT POST-EDITORS & AI CONTENT REVIEWERS
03 TYPICAL ISSUES OF AI-GENERATED TEXTS
04 EXAMPLES OF AI-GENERATED TEXTS
05 CURRENT AND FUTURE TRENDS
06 USEFUL LINKS
01
GENERATIVE AI AND LLMs
What is generative AI?
Generative artificial intelligence (generative AI or GenAI) is a type of artificial
intelligence that has the ability to generate new and original content in response to
prompts. This type of artificial intelligence can be used to create images, videos, texts
and other types of content.
• Text generation: LLMs can generate texts based on the prompts they receive, using
information learned from the corpus of texts they have been trained on. The texts
that can be generated include, for example, creative content, texts for chatbots,
product descriptions for marketing purposes and even entire articles on a specific
topic.
• Machine translation: they can be trained on bilingual datasets and used to
automatically translate text from one language to another.
• Natural language understanding: they can understand the meaning of texts and
answer questions or perform tasks that require natural language understanding.
For example, they can summarise articles, revise texts or create stories and poems.
NMT vs LLMs
Neural Machine Translation (NMT) is a
machine learning model specifically
designed for translation between
languages. Large Language Models
(LLMs), on the other hand, were not
designed with translation as their main use
case and have some limitations in this kind
of task (see module 3). However, although
less predictable, they still have excellent
linguistic skills and can provide good
quality machine translations and perform
various tasks in the localisation field,
including QA, automatic post-editing,
terminology extraction and transcreation.
The AI Content Post-Editor revises target-language texts within bilingual files, i.e. texts
that have been translated by a generative AI model rather than by a traditional MT engine.
As the name suggests, this role is very similar to that of the post-editor, so previous
post-editing experience is certainly an advantage. However, compared to traditional MT
output, AI-generated texts have some peculiarities (which we will see later) that you need
to be aware of during the post-editing process.
The AI Content Reviewer's task is to review original AI-generated texts, i.e. texts that have not
been translated from any language but generated directly in the language in which they will be
used. These are often creative texts produced by the AI based on instructions (prompts)
provided by the user. An example would be: "Could you create the copy for the advertising
campaign of a pair of high fashion shoes?". Obviously, the prompt can become much more
detailed depending on the information provided by the customer (type of shoe, colour, type of
buyers it was designed for, level of formality, etc.).
In this role it is not necessary to know a foreign language (this is the main difference from
the AI Content Post-Editor). Instead, marketing and copywriting skills and experience in
creating content from a client brief can be very useful.
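The way a prompt becomes more detailed as the customer provides more information can be sketched in a few lines of Python. This is only an illustration: the Brief class, its field names and the sample values are assumptions made for this example and do not come from any specific tool.

```python
from dataclasses import dataclass


@dataclass
class Brief:
    """Hypothetical container for the customer's brief (field names are assumptions)."""
    product: str
    colour: str
    audience: str
    formality: str


def build_prompt(brief: Brief) -> str:
    """Combine the basic request with the details supplied in the brief."""
    return (
        f"Could you create the copy for the advertising campaign of {brief.product}? "
        f"Colour: {brief.colour}. "
        f"Target buyers: {brief.audience}. "
        f"Level of formality: {brief.formality}."
    )


# Sample values are invented for illustration only.
prompt = build_prompt(Brief(
    product="a pair of high fashion shoes",
    colour="red",
    audience="young professionals",
    formality="informal",
))
print(prompt)
```

Keeping the brief in a structured form makes it easier for the reviewer to check the generated copy against each requirement afterwards.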
Skills of the new roles
Linguistic skills: excellent knowledge of the language in which the texts are generated. This
includes a thorough understanding of grammar, syntax, vocabulary and stylistic rules. As
mentioned above, the AI Content Post-Editor should also have a very good knowledge of the
source language. In general, both must be aware of the peculiarities and errors associated
with AI-generated texts, such as inconsistencies, inappropriate style and tone, grammatical
errors, etc. (see module 3 for more details).
Editing and proofreading skills: it may seem obvious, but, as for post-editors, you need to
be able to quickly identify and correct errors and improve fluency and consistency by using
the provided text as much as possible rather than rewriting it from scratch.
Technological skills: basic understanding of how GenAI works, including the concepts of
neural networks, LLMs and AI training.
Research and fact-checking skills: you need to carefully examine texts and
identify errors, inaccuracies, ambiguities, additions or omissions to ensure the accuracy of
information and avoid so-called 'hallucinations' (see module 3). To do this, the AI Content
Post-Editor just needs to compare the target text with the source, while the AI Content
Reviewer has to carry out specific searches.
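As a rough aid when comparing target and source, a simple length check can point to segments where something may have been added or omitted. The sketch below is a minimal heuristic, not a replacement for reading both texts: the function names and the thresholds (0.6 and 1.5) are assumptions chosen for illustration, and the sentence pair is taken from the examples used in this course.

```python
def length_ratio(source: str, target: str) -> float:
    """Number of words in the target divided by number of words in the source."""
    source_words = source.split()
    if not source_words:
        raise ValueError("source segment is empty")
    return len(target.split()) / len(source_words)


def needs_attention(source: str, target: str,
                    low: float = 0.6, high: float = 1.5) -> bool:
    """Flag segments whose length differs a lot from the source.

    The thresholds are arbitrary assumptions; suitable values depend on
    the language pair and the type of text.
    """
    ratio = length_ratio(source, target)
    return ratio < low or ratio > high


source = "Sentii una brezza leggera sussurrare tra le foglie."
ai_output = ("I heard a light breeze whispering through the leaves "
             "like an enchanted melody.")
revised = "I heard a light breeze whispering through the leaves."

print(needs_attention(source, ai_output))  # True: possible addition
print(needs_attention(source, revised))    # False
```

A check like this only catches additions or omissions that change the length noticeably. A hallucination that replaces one word with another of similar length will pass unnoticed, so the full comparison with the source remains necessary.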
03
TYPICAL ISSUES OF
AI-GENERATED TEXTS
Hallucinations: the AI may produce sentences or paragraphs that seem coherent and well-
written, but contain made-up information or concepts that do not correspond to reality (in
the case of translation, reality corresponds to the source text). The AI could distort the
information provided in the source text or in the prompt and/or add new information that is
completely false or misleading. Besides additions, significant omissions can of course also
occur.
Coherence issues: AI models may occasionally produce texts that lack coherence. There
may be logical inconsistencies or abrupt transitions between sentences that make the text
less fluent.
Cultural and linguistic sensitivity issues: AI models may not be fully aware of the different
cultural and linguistic sensitivities. Any insult, stereotype or inappropriate expression should
be identified and corrected. Furthermore, despite precautionary measures, generative AIs
can be negatively influenced by the data they have been trained on and include prejudice,
racism, sexism or other discrimination in the texts they produce.
Originality and creativity issues: AI-generated texts are based on statistical models and
information included in the training data, so they are created without a deep understanding
of meaning or context and may have limited creativity and lack originality. It is important,
particularly for the AI Content Reviewer, to assess the originality and creativity of the
generated texts and to identify any plagiarised or unoriginal content.
Grammatical and syntactical errors: despite significant progress in natural language
processing, texts generated by AI may still contain grammatical or syntactical errors, e.g.
subject-verb agreement errors, gender or number agreement errors, punctuation errors, etc.
Capitalisation errors: frequent in titles and after colons. Examples:
S: Our Values
T: I Nostri Valori
R: I nostri valori
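A quick way to spot candidates for this kind of error is to list the words after the first one that start with a capital letter. The sketch below assumes a target language that uses sentence case in titles, as Italian does in the example above. The function name is hypothetical, and proper nouns will be flagged too, so every hit still needs a human decision.

```python
def capitalised_after_first(segment: str) -> list[str]:
    """Return the words after the first one that begin with an uppercase letter."""
    words = segment.split()
    return [word for word in words[1:] if word[:1].isupper()]


print(capitalised_after_first("I Nostri Valori"))  # ['Nostri', 'Valori']
print(capitalised_after_first("I nostri valori"))  # []
```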
S: Sono andato a fare una passeggiata nel parco.
T: I went for a walk in the flying rabbit* park.
R: I went for a walk in the park.

S: Sentii una brezza leggera sussurrare tra le foglie.
T: I heard a light breeze whispering through the leaves like an enchanted melody*.
R: I heard a light breeze whispering through the leaves.

* Hallucinations
Texts generated directly in the language of use
* Hallucinations
05
CURRENT AND FUTURE TRENDS
The field of generative AI, and of LLMs in particular, is constantly evolving. In the last year
(2023), we witnessed rapid development and significant changes in LLMs, which are now
capable of generating remarkably high-quality texts. At present, LLMs such as OpenAI's GPT-3
and GPT-4 have reached levels of sophistication that allow them to generate coherent and
persuasive texts in a wide range of styles and formats. They can write blog articles, answer
questions, translate texts, create convincing dialogues and even write code in programming
languages.
However, despite considerable progress, these models still have their limitations. For
instance, they may generate incorrect or misleading information (hallucinations), texts that
lack coherence or output that does not comply with the instructions given in prompts.
In the future, we are likely to see a further improvement in the quality of texts generated by
AI models. Researchers continue to refine these models to reduce errors and improve
coherence. Furthermore, there is growing interest in using AI models in an increasingly
broad range of applications. These include not only text generation, but also text revision, natural
language processing (NLP), machine translation, speech synthesis and much more.
Engineering (GALA)
Contacts