Professional Documents
Culture Documents
New Approaches To Audiovisual Translatio
New Approaches To Audiovisual Translatio
net/publication/299669269
CITATIONS READS
8 7,149
3 authors, including:
All content following this page was uploaded by Juan-Pedro Rica-Peromingo on 31 May 2016.
Universidad de Salamanca
Universidad de Salamanca
Abstract
1. Introduction
Thanks to the use of scientific data in the analysis of written and spoken
language, linguists are able to make objective accounts on different language
varieties and text types, leaving aside previous analytical procedures where
assertions were based on subjective, individual perceptions (McEnery and Wilson
2001:103). The work of language engineers and linguists has made important
advances in the generation of resources and tools in the field of language
description; improved corpora are now being built as well as computer software
being developed to assist the linguist in his/her work. Benefits brought by
computerized analysis of data to text processing are both innumerable and
invaluable, as we have to consider that not only the processing of data is
accelerated and improved, but also the quality of the evidence obtained are
favored by the help of computers in information managing. As it is stated by
McEnery and Wilson (2001:17), computers have meant the turning of the
discipline from a pseudoprocedure to a feasible methodology: computational
corpus linguistics, where the computer is the basic tool to assist the analysis and
description of language and meaning as they are contained in texts collected in a
corpus. Computational corpus linguistics has, for this reason, accomplished a
variety of techniques to retrieve, search for and compute data, to provide linguists
with information about language that can be very varied and, at the same time, to
empower them to solve problems that had been haunting language research for
years.
The main object of study in corpus linguistics is the corpus, a collection
of texts which implies not a mere gathering of language, but a purposeful
assemblage of texts intended to be analyzed looking for some linguistic pattern.
As McEnery and Wilson posit, although any collection of texts may be called a
12
corpus, linguistics has made use of this word in specific contexts where has
acquired specific characteristics:
In principle, any collection of more than one text should be called a corpus: the term
‘corpus’ is simply the Latin for ‘body’, hence a corpus may be defined as any body of
text. It need imply nothing more. But the term ‘corpus’ when used in the context of
modern linguistics tends more frequently to have more specific connotations [Sampling
and representativeness, finite size, machine readable form and standard reference] than
this simple definition provides for (McEnery and Wilson, 2001: 29).
This corpus has been compiled along 2010 and 2011 (Rica, 2011a). The
compilation process has been long and tedious because there are around 100
movies in each of the periods of the American film history among which those
included in the corpus have been selected. The subtitles files have been extracted
from the original DVDs using some software appropriate for this purpose:
SubRip: http://subrip.softonic.com/ or DVD Subtitle Ripper:
http://sourceforge.net/projects/subtitleripper/. The files obtained are .txt files
which can be analyzed by corpus software such as Wordsmith Tools 3.0.
Each file has a code (Sub), an order number (000X), followed by “a” (for
those original English subtitles) or “b” (for those translated subtitles into Spanish):
for example, for the movie It happened one night (Fran Capra, 1934), the two files
obtained are coded as follows: Sub0001a for the original English version and
Sub0001b for the translated version in Spanish.
At the beginning of this process the distribution of movies into different
movie styles was considered. That idea was discharged because there seems not to
be a clear agreement on the distribution of American movies into different styles
in some of the American movie databases consulted. That is why the distribution
into different historical periods was used. The five periods in which the
CORSUBIL is distributed are stated as follows: from the beginnings of the
“talking films” in the USA to the 40s; the 50s and the 60s; the 70s and the 80s; the
90s; and, finally, the cinema in the XXI century. We followed some of the most
important bibliographical references on the issue (Aguilar et al. 2002, Alsina
15
Thevenet 1993, Caparrós Lera 2009, Finler 2010). This distribution was
considered more functional and practical for the analysis of the subtitles.
For each of the subcorpora a table with all the details of each movie
chosen has been designed. This table includes, firstly, the file code, the original
title of the movie in English and the translated title into Spanish, the movie
director and the number of words in the file (see Figure 3 for an example of the
information included in this table).
After each of the periods collected a summary of all the words included
in the subcorpora is also included: each of the periods contains approximately 3
million words (1.5 million words in the CORSUBILIN and 1.5 million words in
the CORSUBILES), as can be seen in Figure 4 below.
16
2 Both versions 4.0 and 5.0 are available (Scott, 2003). Among their new
functions we find the WebGetter which allows users to build up a corpus using
web pages with the grammatical or lexical unit that is being analyzed. For the
purposes of the present study, since the .txt files have already been collected, we
are not in the need of using an updated version of the program with the Webgetter
function.
18
We have searched in the CORSUBILIN for all those terms and in the
CORSUBILES for their translated counterparts. The final goal is to analyze the
translation of such terms and build up a bilingual list of oral lexical units in
English and Spanish. Our final goal is threefold:
See whether the terms are translated correctly in Spanish or not.
See whether there are different possible translations for the same terms in
Spanish.
See whether all these oral lexical units are translated in the Spanish
subtitles or omitted (as it normally happens with this kind of formulae) in
the oral discourse.
For example, we have searched for the lexical unit “good luck” in the
CORSUBIL: 321 tokens have been found in the CORSUBILIN for such lexical
unit in the subtitles in English (see Figure 5 below).
Searching for the lexical unit in the Spanish translated subtitles, 311 tokens were
found in the CORSUBILES corpus (see Figure 6 below).
19
been omitted and in 2 other movies (4 of the tokens) the translation in Spanish has
been “¡Suerte!” and not “Buena suerte”. Therefore, in our future bilingual list of
this speechact formulae it will be necessary to include the three possibilities
found, as stated in Table 2 below.
It has been extensively proved in later years that for second language
acquisition (SLA) or foreign language acquisition (FLA) and teaching the use of
linguistic corpora constitutes a very useful tool (Rica, 2010; Granger, 2004;
Hunston, 2002). We believe that through a linguistic analysis of the subtitles in
English and in Spanish, then, we may attain a better knowledge of the lexical
networks that have been more frequently used in the translation of some films; we
also have precisely and quickly access to specific terms; and, finally, we may be
able to assess, either in a positive or negative way, the role of the translator in the
translation of the film.
9. Practical applications
The application of corpus studies to the analysis of AVT can mean a wide
variety of changes in different fields of study in language teaching and research,
such as lexicographic or translation studies.
Lexicographic studies benefitted from empirical data long before the
discipline of corpus linguistics was established as a source of samples for
linguistic research, but it has been the development of corpus linguistics which has
triggered the compilation of more corpora for lexicographic purposes, collections
of texts that have been used to study specific characteristics of the lexical elements
of the language under study, and which have mainly given birth to important
dictionaries that mean a step forward in the classification and description of
language. The study that we have carried out can also be a step forward in the
elaboration of dictionaries, providing lexicographers with uptodate information
and more complete and precise definitions, derived from the use of real examples
extracted from actual uses of the language. Glossaries and dictionaries are
essential tools for translators, data reservoirs of audiovisual language that can also
benefit from corpus analyses on AVT. Specific translations shown in film subtitles
21
can constitute a database for translators, who, from now on, do not depend
exclusively on their knowledge to carry out audiovisual translations, but also on
previous occurrences of the word, and how it has been translated.
Another of the most straightforward practical applications that corpus
linguistics has is language teaching. New insights into the shape, meaning, beha
vior or company words keep can be immediately transported to the teaching field,
thus obtaining real input coming from actual instances of language in use.
Including real instances of language was an essential premise for corpus building
and has now a direct repercussion on the outcomes and data that can be extracted
from corpora, thus being based on actual use and reliable to be presented to
learners.
A further step in this process is to include corpora as resources to learn
the language in the classroom as reservoirs of information that students can
analyze to discover the rules of language or the meanings of new words,
especially in the area of languages for specific purposes, a field where corpora of
particular linguistic varieties, like CORSUBIL, have been compiled to exploit
them as sources of information about domainspecific language, including
quantitative data about vocabulary and accounts of uses that can be especially
relevant for learners of specific areas of language rather than examples taken from
general corpora (McEnery and Wilson 2001:121).
Translation studies are a fruitful field of application for corpus linguistic
advances, a place to implement new theories and to use the tools that the
discipline provides linguists with. Many of the corpora compiled nowadays
include texts in more than one language, and some of them are used as tools to
compare specific elements across different languages. We have a clear reflection
of the use of corpora in translation in the alignment of concordance lines (Scott
and Tribble, 2006), outcomes from corpora that are compared in order to improve
translation techniques or to assess if a translation process has been correctly
carried out, and, more specifically, to be used as a learning resource in the
audiovisual translation class.
An additional feature that has to be borne in mind about corpora is that,
besides the opinion of some linguists, who consider them ‘unreal’ examples of
language and not valid sources of evidence for corpora, translated texts are
genuine communicative events that should not be considered inferior to texts in
the original language they were written. However, they are slightly different, and
are precisely these differences which have to be observed and described through
corpus analysis (Baker, 1993:234).
22
10. Conclusions
which are, firstly, identified by the visual context in which the unit has been
uttered and, secondly, due to the need to adjust the information contained in a
subtitle to the space provided on the screen (Díaz Cintas, 2008). With this
example we intend to show that a deeper and more qualitative analysis of all the
lexical units analyzed will be necessary in order to account for the initial
differences found between the original English subtitles and the translated Spanish
ones. An extensive study of all these lexical units will shed some light on the
strategies used when translating audiovisual texts for dubbing and subtitling
purposes: firstly, the same lexical units may be translated in different ways (i.e.,
Sorry translated as Perdón, Lo siento, Perdona or Perdóneme); secondly, the same
lexical unit in the Spanish subtitles corresponds to different lexical units in the
English subtitles (i.e., ¿Qué pasa? in the Spanish subtitles is found to be the
translation of lexical units in English such as What's the matter?, What is it? or
What's wrong?); and, finally, the absence of such lexical units in the Spanish
translations that we have already commented (Rica, 2011b).
We believe that our main achievement in this article has been the
presentation of an interdisciplinary study by blending AVT and Corpus
Linguistics and showing some of the potential possibilities they offer, but this
field of work still deserves further attention, and a deeper analysis will be carried
out in forthcoming stages of the research.
References
Aguilar, C. et al. (2002). Diccionario de películas del cine norteamericano: Antología crítica.
Madrid: T&B Editores.
Alsina Thevenet, H. (1993). Historia del cine americano I: desde la creación al primer sonido
(18931930). Barcelona: Laertes.
Alsina Thevenet, H. (1993). Historia del cine americano II: el esplendor y el éxtasis. Barcelona:
Laertes.
Baker, M., Francis, G. and E. TogniniBonelli (Eds.) (1993). Text and Technology: In
Honour of John Sinclair. Amsterdam/Philadelphia: John Benjamins.
Bell, A. (1991). The language of News Media. Oxford: Blackwell.
Bell, A. and Garret, P. (1998) Approaches to media discourse. Oxford: Blackwell.
BeltránPalanques, V. (2010). “Analising Refusals in Films” In Ruano García et al. (Eds.)
Current trends in Anglophone Studies: Cultural, Linguistic and Literary Research. Colección
Aquilafuente nº72. Ediciones Universidad de Salamanca: 7084.
Biber, D., Conrad, S. and Leech, G. (2003). The Longman student grammar of spoken and
written English. Londres: Longman.
Biber, D., Conrad, S. and Reppen, R. (1994). “Corpusbased approaches to issues in applied
linguistics”, Applied Linguistics, 15(2): 169189.
Biber, D., Conrad, S. and Reppen, R. (1998). Corpus Linguistics. Investigating Language
Structure and Use. Cambridge: Cambridge University Press.
24
Biber, D., Johansson, S., Leech, G., Conrad, S. and Finegan, E. (1999). The Longman grammar
of spoken and written English. Londres: Longman.
Calsamiglia, H. and López Ferrero, C. (2003). “Role and position in scientific voices: Reported
speech in the media”. Discourse Studies 5(2): 147173.
Caparrós Lera, J.M. (2009). Historia del cine mundial. Madrid: Ediciones RIALP.
Conrad, S. (2002). “Corpus linguistic approaches for discourse analysis”, Annual Review of
Applied Linguistics 22: 7595.
Delabastita, D. (1989). “Translation and MassCommunication: Film and TV Translation as
Evidence of Cultural Dynamics,” Babel 354: 193218.
Díaz Cintas, J. (2003). Teoría y práctica de la subtitulación. InglésEspañol. Madrid: Ariel.
Díaz Cintas, J. (ed.) (2008). The Didactics of Audiovisual Translation. Ámsterdam: Philadelphia:
John Benjamins Publishing Company.
Finler, J.W. (2010). El cine americano. Historia de Hollywood: un viaje completo por la historia
de la industria americana del cine. Barcelona: Ediciones MaNonTroppo.
García Riaza, B. (2010a). “Explicit attribution sources expressed by ‘according to’ in a corpus of
science popularization articles from The Guardian”. Interlingüística XXI: 157168.
García Riaza, B. (2010b). “Mapping Attribution across Texts: the particle ‘according to’ in
science popularizations and editorials from The Guardian”. Paper presented at Mapping
Language across Cultures Conference (MLAC10), Salamanca, 57 July 2010.Garret, P. &
Bell, A. (1998): Approaches to Media Discourse. Oxford. Blackwell.
Granger, S. (2004). “Computer Learner Corpus Research: Current Status and Future Prospects”.
In U. Connor & T. Upton (Eds.), Applied Corpus Linguistics: A Multidimensional
Perspective. Amsterdam: Rodopi: 123145.
Herbst, T. (1987): “A pragmatic translation approach to dubbing,” EBU Review. Programmes,
Administration, Law, 386: 2123.
Hunston, S. (2002). Corpora in Applied Linguistics. Cambridge: Cambridge University Press.
Hunston, S. and Francis, G. (2000). Pattern Grammar. A Corpusbased Approach to the Lexical
Grammar of English. Amsterdam/Philadelphia: John Benjamins.
Kennedy, G. (1998). An Introduction to Corpus Linguistics. London and New York: Longman.
Kim, C. K. and Thompson, G. (2010). “Obligation and Reader Involvement in English and
Korean Science Popularizations: A Corpusbased Crosscultural Text Analysis”, Text & Talk
30(1): 5373.
Lyons, J. (1981). Language and Linguistics: An Introduction. Cambridge: Cambridge University
Press.
Mason, I. (1989) “Speaker meaning and reader meaning: preserving coherence in screen
translating” en Kölmel, Rainer y Jerry Paine (Eds.) Babel: the Cultural and Linguistic
Barriers Between Nations. Aberdeen: Aberdeen University Press. 1324.
Mayoral, R. (2001). “Campos de estudio y trabajo en traducción audiovisual”, en M. Duro, La
traducción para el doblaje y la subtitulación. Madrid: Cátedra. 1945.
McEnery, T, and Wilson, A. (2001) [1996]. Corpus Linguistics. Edinburgh Textbooks in
Empirical Linguistics. Edinburgh. Edinburgh University Press.
OlveraLobo, M. D., et al. (2005). “Translator Training and Modern Market Demands ” in
PerspectivesStudies in Translatology, 13, 2: 132142.
Pearson, J. (1998) Terms in Context. Studies in Corpus Linguistics 1. Amsterdam/Philadelphia:
John Benjamins.
Rica, J.P., Albarrán, R. and García Riaza, B. (2010) “LSP Software tools and resources for
teaching linguistic aspects in audiovisual translation”, communication presented at the First
International Workshop on Technological Innovation for Specialized Linguistic Domains:
Theoretical and Methodological Perspectives. UNED, Madrid (Spain), October 21st.
Rica, J.P. (2010). “Lingüística de corpus en la enseñanza del inglés como lengua extranjera
(ILE)”, en J.L. Cifuentes, A. Gómez, A. Lillo, J. Mateo y F. Yus (Eds.), Los caminos de la
25