
Cohesive explicitness and explicitation in an English-German translation corpus*

Silvia Hansen-Schirra, Stella Neumann and Erich Steiner


Johannes Gutenberg-Universität Mainz/Universität des Saarlandes, Germany

Explicitness or implicitness as assumed properties of translated texts and other texts in multilingual communication have for some time been the object of speculation and, at a later stage, of more systematic research in linguistics and translation studies. This paper undertakes an investigation of explicitness/implicitness and related phenomena of translated texts on the level of cohesion. A corpus-based research architecture, embedded in an empirical research methodology, will be outlined, and first results and possible explanations will be discussed. The paper starts with a terminological clarification of the concepts of explicitness and explicitation in terms of dependent variables to be investigated. The two terms and their usage by other scholars will be discussed. An electronic corpus will then be described which provides the empirical data and techniques for information extraction. For the investigation carried out using our corpus, indicators will then be derived on the basis of which operationalizations and hypotheses can be formulated for patterns of explicitation occurring between source and target texts. Some initial results relating to cohesive explicitness and explicitation in the data will be presented and discussed, with particular attention being paid to the areas of reference, substitution, ellipsis, conjunction, and lexical cohesion. First attempts will also be made at explaining the findings.

Keywords: translation, explicitness, explicitation, cohesion, English/German

1. Introduction

Cohesive features have been the object of research in translation studies as indicators of explicitation. They have been studied in what can be called exploratory studies in example-based approaches (Blum-Kulka 1986) and in a psycholinguistic experiment (Englund Dimitrova 2005). Moreover, empirical corpus-driven research has employed concordances in monolingually comparable corpora of raw text to gain insight into the nature of cohesive features (cf. several contributions
Languages in Contrast 7:2 (2007), 241–265. issn 1387-6759 / e-issn 1569-9897 © John Benjamins Publishing Company


in Laviosa 1998; Olohan and Baker 2000). In spite of the insight yielded by this tradition of research, we argue that where explicitation is investigated in raw texts without taking into account the source texts of translations, the interpretation of results will remain limited to queries that are possible without annotation. It will also remain problematic, since explicitation can only be considered as a shift between source and target text, not as a comparison between comparable texts. Works on translations adopting a more linguistic perspective have addressed some of these limitations and problems (cf. relevant work in Johansson and Oksefjell 1998; Fabricius-Hansen 1999; House 2002; House and Rehbein 2004; Doherty 2002; 2006); the focus of these research interests and methodologies is, however, different from, and partly complementary to, ours with respect to corpus architecture, querying techniques and underlying linguistic modelling (for which cf. Hansen 2003; Neumann 2003; Steiner 2001; Teich 2003).

The remainder of the paper is organized as follows: for an initial clarification of key concepts, we shall differentiate between two distinct, though related, notions, those of explicitness and explicitation, in Section 2. After a description of the corpus architecture developed within the CroCo Project,¹ of which the present study forms a part (Section 3), Section 4 will attempt to stratify these notions in terms of the linguistic levels of lexicogrammar and text. We shall also derive some cohesive indicators operationalizing explicitness and explicitation in this section. In Section 5 we will discuss findings for the indicators thus derived, and finally come to some conclusions in Section 6.

2. Explicitness and explicitation

The main aim of the present paper is to discuss cohesive explicitation using a quantitative methodology. For this purpose, the discussion of key contributions to the study of explicitness is necessary because it helps to delineate our notion of explicitation in view of these contributions. Explicitness on the lexicogrammatical level is conceptually related to density and directness. These three are properties of (lexico-)grammatical constructions (cf. Steiner 2004; 2005a; 2005b; 2005c). The opposite of 'explicit' in this usage is lexicogrammatically not realized, but still part of the construction (unrealized participant roles, unrealized features in non-finite constructions, grammatical ellipsis, projection of units of meaning onto different grammatical categories, grammatical metaphor, transcategorization, etc.). At a textual level, explicitness is related to properties such as simple, normal, levelled-out, sanitized, explicit vs. implicit, direct vs. indirect; oriented towards self vs. other; oriented towards content vs. persons (cf. Baker 1996; House 2002 for relevant work). The explicitness


of higher level units such as texts/discourses is not simply the sum total of the explicitness features of clauses. It is a property emerging at a higher level in the sense that text-level properties are perceived as a result of the interaction of clause-level features, such as explicitness, directness, density, with textual features such as cohesion, markers of genre or register. All of the latter will, in turn, be realized as lexical and/or grammatical patterns, but their function is not accounted for by lexicogrammar. Explicitness on this level can furthermore be a result of global textual patterns (such as type-token ratio, lexical density, etc.), which are epiphenomena of lexicogrammatical patterns, but not lexicogrammatical themselves. Explicitness, a property of lexicogrammatical or cohesive structures and configurations in one text, is measured through operationalizations of the type we shall indicate below. Explicitation, on the other hand, is a process or a relationship between intralingual variants and/or translationally related texts. The texts resulting from explicitation are more explicit than their counterparts in terms of their lexicogrammatical and cohesive properties. Explicitation can only be observed in instantiated, indexed and aligned pieces of discourse/text sharing all or some of their meaning, which is particularly true for translations.
Definition: We assume explicitation if a translation (or, language-internally, one text in a pair of register-related texts) realizes meanings (not only ideational, but also interpersonal and textual) more explicitly than its source text; more precisely, meanings not realized in the less explicit source variant but implicitly present in a theoretically-motivated sense. The resulting text is more explicit than its counterpart.

Note that this definition deliberately excludes the indefinite number of possibilities through which meaning can simply be added to some text/discourse, without being in any motivated sense implicit in the source variant (a view similar to that of Doherty 2006: 49ff).

More general discussions in the literature regard the notions of explicitness/explicitation and its counterpart implicitness/implicitation as a challenge in several respects. These notions are very general, central to some models of language, especially for a philosophically anchored semantics, and in any case highly complex. They usually refer to fully interpreted acts of communication in a communicative context of situation. However, the data available to a methodologically empirical project will not consist of high-level interpretations of utterances by human interpreters, but of text corpora with relatively low-level lexicogrammatical and cohesive categories captured in multi-level annotations. The data thus yield information about properties of encoding, rather than about high-level interpretations of such data by human interactants. Precisely the former are the focus of the


current project: an attempt to enquire into properties of encoding which relate to explicitness and explicitation, rather than to add yet another set of example-based discussions of (interpretations of) the data.

Within this context, Linke and Nussbaumer (2000:435ff) anchor their discussion in their handbook article on concepts of implicitness in the widespread metaphor, or allegory, which conceptualizes texts as icebergs: only a small part of them is visible, the larger part is hidden from perception. The visible part (of form and meaning) is called 'explicit', the invisible part 'implicit'. More specifically, they draw a distinction between meanings which are implicit, non-literal, dependent on use (the province of pragmatics) on the one hand (B), and those meanings which are fixed, literal and independent of use (the province of semantics) on the other (A). Only within the latter do they distinguish, in a linguistically narrower sense, between implicit (non-realized) and explicit. Within the latter category, what they call semantics (A), they subclassify semantic, but implicit, meanings into presuppositions, implications (entailments), connotations, affective and deontic meanings, and remaining marginal types to do with inferencing. The remainder of meanings on the semantic level are assumed to be explicit. Meanwhile, with meanings which are implicit, non-literal, dependent on use (the province of pragmatics) (B), the sub-classification is into pragmatic presuppositions (frames, scripts), conversational maxims and conversational implicatures, and finally illocution and perlocution. Situating our own concept of explicitness vis-à-vis this overview, it appears as if our classification cuts across the one represented there, even though the two can be related.
First, our corpus-based research design enables the investigation of meanings which are explicit in one of the registerial or translational variants under comparison or else can be grammatically or cohesively related as explicit/implicit variants to our data. What remains outside of our methodology is the simple addition or omission of meanings without any grammatical or cohesive relationships between variants. Second, the meanings which we investigate do not have to be literal, they may, indeed, be (grammatically or lexically) metaphorical, provided they are explicit in one of our variants (registers, translations). Finally, the meanings which we are looking at are dependent on usage in that the data are drawn from linguistic instantiations, i.e. texts. However, our operationalizations in terms of lexicogrammatical or cohesive realization will bias our observations towards whatever is grammaticalized and lexicalized, or at least highly conventionalized (cohesive relations, rhetorical relations), and in that sense our approach may appear quite system- and grammar-oriented. The reason why our perspective seems to cut across that of Linke and Nussbaumer is that, being corpus-based, and thus product-based, rather than interpretation-based and process-based, we are forced to gear our methodology to the


investigation of lexicogrammatical realization. Consequently, any meanings they call pragmatic and not systematically linked to realization appear invisible to our method, which is not to say that they are unimportant. They will feature in hermeneutic example-based interpretations of our data, but only there.

Next, let us attempt to situate our own methodology relative to a discussion contrasting Relevance Theory with Gricean Pragmatics, this time taken from Burton-Roberts' (2005:389ff) review of Carston's (2002) Thoughts and Utterances: The Pragmatics of Explicit Communication. We assume, like Burton-Roberts (and Carston), that our explicit vs. implicit distinction cuts across at least several Gricean dichotomies: (A) semantics vs. pragmatics, (B) what is said vs. what is implicated, (C) explicit vs. implicit, (D) linguistically en(/de)coded vs. not linguistically en(/de)coded, (E) context-free vs. context-sensitive, (F) truth conditional (entailment) vs. non-truth-conditional (non-deductive). Furthermore, addressing Carston's (2002:117) and at this point also Burton-Roberts' (2005:391) position, we would also claim that the variants in Burton-Roberts' example 1 (a) to (d) below cannot simply be contrasted in terms of a binary explicit vs. implicit dichotomy:
(1) a. Mary Jones put the book by Chomsky on the table in the downstairs sitting room.
    b. Mary put the book on the table.
    c. She put it there.
    d. On the table.

According to Carston and Burton-Roberts, any of 1 (a-d) above could be used, in different contexts, to communicate explicitly one and the same proposition (or thought or assumption) (Carston 2002:117). This appears to be true; what we are investigating with our research design, however, is not an act of communication (and interpretation) situated in a specific context, but rather properties of the encoding (explicitness, alongside directness and density). In our terms, 1 (a) to (d) are identical as far as ideational and interpersonal explicitness are concerned. There is no difference between them in terms of directness, but there are differences along several dimensions in density, and there are differences in explicitness on the interpersonal and textual dimensions and in terms of some sub-parameters of cohesion. However, if we regarded 1 (a) to (d) as intralingual translations of each other, we could also investigate explicitation, rather than only explicitness. In this case, (b) to (d) would be partial implicitations of (a), with lexicogrammatical and cohesive markers which would still trigger a fully instantiated interpretation along the lines of (a) in a fully instantiated discourse. These lexicogrammatical and cohesive markers in (b) to (d) include definite articles, phoric elements and ellipses, all of which would implicitate some aspect of explicitly coded experiential meaning


from 1 (a), while still providing a trigger or clue. With respect to Relevance Theory, then, our approach is characterized by the measurement of explicitness as a property of encoding, not as a property of the communicative act as such.

A distinction which seems somewhat closer to our own modelling is that of von Polenz (1988:24ff, 40ff, 92ff, 202ff). He draws a basic distinction between elliptical, compressed/compact, and implicating modes of expression, and their respective corresponding full, expanded and explicating counterparts. According to his classification, our methodology focuses on the difference between: compressed/compact modes of expression and their expanded counterparts; elliptical textures which can be related through grammar or cohesion to non-elliptical full counterparts; implicit textual configurations and their explicit counterparts. Von Polenz, however, frequently uses 'explicit' in opposition to all three: compressed/compact, elliptical, implicit (1988:24ff). Our methodology is more constrained in that we would restrict our notion of realization to lexicogrammatical and cohesive realization. We would demand some sort of lexicogrammatical reflex for an assumed elliptical, compact/compressed, implicit meaning, rather than a potentially implicit meaning addable to the piece of discourse in question without violating coherence.

Summarizing our discussion so far, compared to Linke and Nussbaumer (2000), Carston (2002), Burton-Roberts (2005), and to a lesser extent to von Polenz (1988), our methodology appears restrictive in the sense of being tied to formal realization. However, all of the realizational patterns are considered to be signals only, instructions, to the full (inter-)textual meaning, and in that sense, we are opening the door to allow a fuller view, which ultimately extends to the previously invisible part of the iceberg.
Methodologically, though, we can only do this via additional example-based hermeneutic interpretations of individual examples, not in the empirical part of our investigations.

So far, we have located our position in relation to the semantic and pragmatic end of the spectrum of approaches to explicitness/explicitation. At the opposite end of the spectrum, there are notions of lexicogrammatically encoded types of implicitness, realized in non-finite constructions, unrealized participant roles, logico-semantic relators (conjunctions, prepositions), tense, aspect and number. Grammarians (e.g. Dixon 1991:68–71) have noted the optional dropping of complementizers, relative pronouns or copulas from complement clauses (see also Olohan and Baker 2000 in the context of explicitation). In all of these cases, it can of course be argued that the (highly generalized) grammatical meaning signalled by the absence of the lexical items is contained in the text, at least in the features


of the construction. It can be made visible by contrasting the construction with its counterparts. However, this notion of implicitness is very grammar-oriented and thus also very language-specific. Our methodology is suited to this type of implicitness, which will be used as an indicator, although not necessarily of experiential, but often of logical, interpersonal or textual meaning.

Our methodology is related to Biber's (1995:157ff, 161ff on explicit vs. situation-dependent reference, but also Biber et al. 1999) in many respects. However, we do believe that it is possible to develop a linguistically richer and theoretically more substantiated notion of data than is used by Biber, while building on his achievements in making linguistic enquiry a more empirical discipline than before. The linguistically richer conceptual tools to be outlined below, influenced by the notions of grammatical metaphor and of metafunctional diversification (Halliday and Matthiessen 1999; 2004), are intended to narrow the gap between the more conceptual and hermeneutic top-down and the more empirical bottom-up approaches. There are functional notions of implicitness/explicitness, as in accounts of modality (Halliday and Matthiessen 2004:620ff), or of inferred/implicit discourse relations, often triggered by genre or register (Halliday and Matthiessen 2004:363ff). A further context for the notion of implicitness is cohesive ellipsis (Halliday and Hasan 1976:142ff). And there is, of course, the important notion of grammatical metaphor. At least the type involving relocation in rank between semantics and grammar has far-reaching influences on how much and what kinds of information are made explicit (Halliday and Matthiessen 1999:231ff; 258; 270; Halliday and Martin 1993; Steiner 2004).
A final important source of hypotheses concerning the tendencies postulated for translations between English and German is Doherty (2002, 2006); this applies particularly to her studies of typological parameters of information distribution. These are our starting points for recognizing more and higher-level types of implicit meaning, even if operationalizations at the borderlines (i.e. those to do with genre and register) are often not sufficiently advanced to enable a reliable level of quantification.

3. A corpus for investigating explicitness and explicitation

Our investigation of explicitation and implicitation of cohesion markers in translations is based on a cross-linguistic corpus containing statistically meaningful and representative samples (cf. Biber 1993) of German and English. The corpus comprises multilingually comparable texts (English originals (EO) and German originals (GO)), monolingually comparable texts (EO and English translations (ETrans), GO and German Translations (GTrans)) as well as parallel texts (EO and GTrans, GO and ETrans). These sub-corpora represent eight registers relevant for


translation: popular-scientific texts (POPSCI), tourism leaflets (TOU), prepared speeches (SPEECH), political essays on economics (ESSAYS), fictional texts (FICTION), corporate communication (SHARE), instruction manuals (INSTR) and websites (WEB).

The design criteria we consider particularly important are comparability, balance and representativeness. Functional variety forms a basis for comparison ensuring comparability. Thus, our choice of registers to be included was determined by registerial considerations (for a basic description of this kind of register analysis see Halliday and Hasan 1989): each register is distinct from the other registers in terms of the three register variables of field, tenor and mode of discourse. Apart from functional variety, the following criteria were taken into account to achieve a balanced corpus design: publication date (including texts from the 1990s onwards), regional language variety (including American and British English as well as Standard German from Austria, Switzerland and Germany) and text length. For our purposes, drawing a representative sample from the basic population of all texts meant choosing enough specimens from one register to cover all relevant linguistic features. We follow Biber (1993), who shows that smaller corpora, if well-balanced, are capable of covering all the linguistic features of a given register. His calculations, i.e. 10 texts per register with a length of at least 1,000 words, serve as an orientation for the size of our core corpus. In our case, we collected full texts or text excerpts of about 3,000 words per text and ten texts per register. This means that the four sub-corpora EO, GO, GTrans and ETrans contain about 250,000 words each. In addition to register-controlled corpora, we also included reference corpora both in English (ER) and German (GR) for detecting contrastive restrictions of the respective language systems which force the translator to explicitate a source language structure.
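As a rough plausibility check on these design figures, the nominal size of one sub-corpus can be computed from the number of registers, the number of texts per register and the approximate text length. The following Python sketch is illustrative only: the constant and function names are assumptions introduced here, and since actual text lengths vary, the nominal product need not match the reported (presumably rounded) figure of 250,000 words exactly.

```python
# Design figures as stated in the text; all names are illustrative assumptions.
REGISTERS = ["POPSCI", "TOU", "SPEECH", "ESSAYS",
             "FICTION", "SHARE", "INSTR", "WEB"]
SUB_CORPORA = ["EO", "GO", "GTrans", "ETrans"]
TEXTS_PER_REGISTER = 10
WORDS_PER_TEXT = 3_000  # "about 3,000 words per text"


def nominal_sub_corpus_size() -> int:
    """Nominal word count of one register-controlled sub-corpus."""
    return len(REGISTERS) * TEXTS_PER_REGISTER * WORDS_PER_TEXT


def nominal_core_corpus_size() -> int:
    """Nominal word count of all four register-controlled sub-corpora."""
    return nominal_sub_corpus_size() * len(SUB_CORPORA)


print(nominal_sub_corpus_size())   # 240000
print(nominal_core_corpus_size())  # 960000
```

The nominal values (240,000 words per sub-corpus, 960,000 words overall) are of the same order as the rounded sizes given for the corpus.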
The reference corpora also allow the identification of specific features of register-controlled corpora, thus serving as a basis of comparison (cf. Neumann 2003 for a detailed description of the reference corpora). The overall corpus comprises 1 million words plus 68,000 words in the register-neutral (cross-register) reference corpora in both languages. We have stored meta-information on all texts on the basis of the TEI guidelines² (i.e. information on the author, translator, language variety, publication date, register information etc.) using the graphical user interface CroCo-Meta specifically developed for a user-friendly and efficient annotation of meta-information (cf. Vela and Hansen-Schirra 2006). In addition to storing references to the texts, meta-information allows us to filter the corpus according to particular characteristics of the texts when querying the corpus in view of linguistic research questions. A characteristic feature of our corpus is the annotation and alignment of source and target texts on different linguistically motivated layers: the texts are annotated with parts of speech, morphology, phrase structure and grammatical


functions. Our alignment is truly multidimensional since it comprises not only word and sentence level but also chunk (phrases and grammatical functions) and clause level (cf. Hansen-Schirra et al. 2006 for a detailed description of the tools and techniques used for corpus alignment and annotation). Each annotation and alignment layer is stored separately in a multi-layer stand-off XML representation format keeping the annotation and alignment of overlapping and/or discontinuous units in separate files. The mark-up is based on the XCES Standard.³

One of the methodological principles applying to the exploitation of the resource is the distinction between lexicogrammatical/cohesive annotation of source and target language texts (including the alignment) on the one hand, and the interpretation of the data in view of more abstract concepts (in our case, explicitation) on the other. The architecture of the CroCo Corpus enables the viewing of annotation in aligned segments and the combined querying of different layers of lower level linguistic features assumed to be indicators of the more abstract concept (using the query languages XSLT and XQuery; cf. Neumann and Hansen-Schirra 2005; Hansen-Schirra et al. 2006). The theory-neutral analysis of the texts permits the interpretation of a wealth of linguistic information, also in terms of other research questions.
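To illustrate what a stand-off representation implies for processing, the following minimal Python sketch joins a token index layer with a separate part-of-speech layer via their xlink references. It is an illustration under stated assumptions, not the CroCo tool chain: the attribute names (id, strg, pos, xlink:href) follow the excerpt shown in Figure 1, whereas the wrapping root element, the namespace declaration and the function name merge_layers are hypothetical.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"   # standard XLink namespace URI
HREF = f"{{{XLINK}}}href"                # attribute name in Clark notation

# Toy layers modelled on Figure 1; the <tokens> root is an assumption.
TOKEN_LAYER = """<tokens>
  <token id="t64" strg="ein"/>
  <token id="t65" strg="Handleser"/>
  <token id="t66" strg=","/>
  <token id="t67" strg="der"/>
</tokens>"""

POS_LAYER = f"""<tokens xmlns:xlink="{XLINK}">
  <token pos="art" xlink:href="#t64"/>
  <token pos="nn" xlink:href="#t65"/>
  <token pos="yc" xlink:href="#t66"/>
  <token pos="prels" xlink:href="#t67"/>
</tokens>"""


def merge_layers(token_xml: str, pos_xml: str) -> list[tuple[str, str, str]]:
    """Resolve each POS annotation to the token string it points to."""
    strings = {t.get("id"): t.get("strg")
               for t in ET.fromstring(token_xml).iter("token")}
    merged = []
    for ann in ET.fromstring(pos_xml).iter("token"):
        token_id = ann.get(HREF).lstrip("#")   # "#t67" -> "t67"
        merged.append((token_id, strings[token_id], ann.get("pos")))
    return merged


for row in merge_layers(TOKEN_LAYER, POS_LAYER):
    print(row)
```

Because each layer only points at token identifiers, further layers (morphology, chunks, alignment) could be joined in the same way without modifying the token file.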

4. Derivation of indicators

After the outline of our corpus provided above, let us derive a number of indicators and operationalizations for explicitness and explicitation. In the present context, indicators on the lexicogrammatical level will be given in linguistic terms only, whereas, for the level of cohesion, we shall narrow our discussion down to the level of specific queries into representations in our corpus. Operationalizations for explicitness in any text, and for explicitation between translationally related segments, will initially be carried out in a theory-neutral way. By adding a modularization of meaning and encoding according to metafunctions (cf. Halliday and Matthiessen 2004 and elsewhere; Steiner 2005c), we can measure lexicogrammatical explicitness as represented in Table 1. To these indicators and operationalizations, which up to this point have all been limited to grammatical phenomena and therefore expressed as indicators per grammatical unit, i.e. the clause, must then be added indicators and operationalizations for cohesion, i.e. indicators per text. In the following, we will exemplify the queries possible on the basis of the annotation and alignment for the cohesion markers described by Halliday and Hasan (1976) and their equivalents for German. Items 1 to 7 below are hypotheses about cohesion to be tested on the data: for either a given pair of non-aligned text segments, or else for a given aligned source-target


Table 1. Modularization of encoding according to metafunctions

Metafunction               | Gramm. system | Operationalization
Ideational (experiential)  | Transitivity  | Number of explicit functions : Number of implicit functions (per unit)
Ideational (logical)       | Taxis         | Number of explicit functions : Number of implicit functions (per unit)
Interpersonal              | Mood          | Number of explicit Mood-markers : Number of implicit Mood-markers (per unit)
Interpersonal              | Modality      | Number of explicit Modality-markers : Number of implicit Modality-markers (per unit)
Textual                    | Theme         | Number of auto-semantic Themes : Number of syn-semantic (phoric) Themes (per unit)

fragment of two texts in a translation relationship, we expect global differences across entire texts along the following parameters:

1. the proportion of explicit to implicit referents;
2. the proportion of phoric to fully lexical (auto-semantic) phrases;
3. the number of newly introduced discourse referents per discourse segment;
4. the amount of cohesive ellipsis and substitution;
5. the strength of lexical cohesion as measured by various ratios between content and function words, and as measured by type-token relationships;
6. the strength (internal connectivity) of lexical chains as measured by average number of items per lexical chain;
7. the ratio between explicit and implicit encoding of conjunctive relations.

Observe that in comparing any text fragments which are not in a unit-of-translation-relationship, as in our registerially parallel sub-corpora of originals, we are testing for the global property of (relative) explicitness. However, whenever we are comparing a specific aligned and instantiated source-target (translation) unit, we are testing explicitation (or its opposite, implicitation).
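Some of these parameters reduce to simple ratios over part-of-speech-tagged tokens. The following Python sketch shows one possible way of computing three of them (type-token ratio, share of content words, and proportion of pronouns to nouns) for the variants (1b) and (1c) above. It is a simplified illustration, not the operationalization used in the project: the coarse tag names, the tag sets and the function names are assumptions, and the share of content words among all tokens is only one common way of expressing the content/function relationship.

```python
# Coarse, illustrative tag sets (assumptions; not the CroCo tag sets).
CONTENT_TAGS = {"noun", "verb", "adj", "adv"}
PHORIC_TAGS = {"pron"}
NOUN_TAGS = {"noun"}

Token = tuple[str, str]  # (word form, coarse part-of-speech tag)


def type_token_ratio(tokens: list[Token]) -> float:
    """Distinct word forms divided by running words (case-insensitive)."""
    forms = [form.lower() for form, _ in tokens]
    return len(set(forms)) / len(forms) if forms else 0.0


def content_word_share(tokens: list[Token]) -> float:
    """Share of content words among all tokens."""
    if not tokens:
        return 0.0
    return sum(pos in CONTENT_TAGS for _, pos in tokens) / len(tokens)


def phoric_to_lexical(tokens: list[Token]) -> float:
    """Pronoun tokens per noun token; infinite if there are no nouns."""
    phoric = sum(pos in PHORIC_TAGS for _, pos in tokens)
    lexical = sum(pos in NOUN_TAGS for _, pos in tokens)
    return phoric / lexical if lexical else float("inf")


# Variants (1b) and (1c), tagged by hand for illustration.
variant_b = [("Mary", "noun"), ("put", "verb"), ("the", "det"),
             ("book", "noun"), ("on", "prep"), ("the", "det"),
             ("table", "noun")]
variant_c = [("She", "pron"), ("put", "verb"), ("it", "pron"),
             ("there", "adv")]

for name, variant in (("1b", variant_b), ("1c", variant_c)):
    print(name, type_token_ratio(variant),
          content_word_share(variant), phoric_to_lexical(variant))
```

On such short fragments the values are of course not meaningful in themselves; the ratios only become informative when computed over whole texts or sub-corpora and compared across variants.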

5. Cohesive explicitness and explicitation in the CroCo Corpus

The following discussion illustrates our current usage of the annotated and aligned CroCo corpus, following the account of cohesive devices in Halliday and Hasan's Cohesion in English (1976).


5.1 Reference

(Co-)Reference denotes those cohesive ties where the same referential meaning is represented by possibly different wordings, typically a fully lexical referent and a pro-form. It operates similarly both in English and German (cf. Kunz this volume for an extensive analysis of co-reference). First, we take a look at hypothesis 1 mentioned in Section 4 concerning the explicit presence of a referent. A comparison of German relative clauses containing a pronominal referent with their non-finite English correspondences lacking an equivalent referent could provide clues as to the proportion of explicit referents to implicit ones. The retrieval of (co-)reference markers separately from the source and target language corpora is a straightforward procedure. The part-of-speech information contained in the two corpora permits precise queries. Specific queries into these reference markers in the target texts which have no equivalent in the source texts are more complex. Yet, they address levels of encoding richer in linguistic information than merely string-based queries, which are unable to retrieve information encoded on higher linguistic levels. The relevant evidence is reflected in the annotation and alignment at word level. German relative pronouns, which reactivate the antecedent referent, are assigned the part-of-speech tag prels⁴ and, if they occur in both languages, they are linked to each other in the word alignment. However, if there is a relative pronoun in the German translation which cannot be found in the English original text, as in example 2,⁵ the German relative pronoun is not aligned at all: it receives a so-called 'empty' or 'undefined' link (see Figure 1 for the XML representation with the token, its part-of-speech tag and the empty link in bold face).
(2) a[t53] palmist[t54] ,[t55] inferring[t56] the[t57] future[t58] out[t59] of[t60] his[t61] own[t62] lined[t63] flesh[t64]
    ein[t64] Handleser[t65] ,[t66] der[t67] seine[t68] Zukunft[t69] aus[t70] den[t71] eigenen[t72] Linien[t73] ableitete[t74]
GTrans token index file:

    <token id="t64" strg="ein"/>
    <token id="t65" strg="Handleser"/>
    <token id="t66" strg=","/>
    <token id="t67" strg="der"/>
    <token id="t68" strg="seine"/>
    <token id="t69" strg="Zukunft"/>

GTrans part-of-speech annotation:

    <token pos="art" xlink:href="#t64"/>
    <token pos="nn" xlink:href="#t65"/>
    <token pos="yc" xlink:href="#t66"/>
    <token pos="prels" xlink:href="#t67"/>
    <token pos="pposat" xlink:href="#t68"/>
    <token pos="nn" xlink:href="#t69"/>

G2E word alignment:

    <token>
      <align xlink:href="#t55"/>
      <align xlink:href="#t66"/>
    </token>
    <token>
      <align xlink:href="#t56"/>
      <align xlink:href="#t74"/>
    </token>
    <token>
      <align xlink:href="#undefined"/>
      <align xlink:href="#t67"/>
    </token>

Figure 1. XML corpus annotation and alignment at word level including empty links


    for $k in $doc//tokens/token
    let $fileName := $doc//translations/translation[@n='1']/@trans.loc
    let $fileNameNew := replace($fileName, "tok", "tag")
    where ($k/align[1][@xlink:href != "#undefined"]
      and $k/align[2][@xlink:href = "#undefined"]
      and doc($fileNameNew)//token[@xlink:href eq $k/align[1]/@xlink:href][@pos eq "prels"])
    (: the return clause is not shown in the source figure; returning the matching alignment unit is an assumption :)
    return $k

Figure 2. XQuery for relative pronouns with empty links
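For readers without an XQuery processor, the logic of the query in Figure 2 can be approximated in a few lines of Python. This is a sketch under stated assumptions rather than the project's implementation: the data are toy excerpts, the root elements and the function name are hypothetical, and the order of the two links follows the query (first link pointing to the German token, second link to the English one), whereas the excerpt in Figure 1 lists the links in the opposite order, so the order may need to be swapped for real data.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"   # standard XLink namespace URI
HREF = f"{{{XLINK}}}href"
UNDEFINED = "#undefined"                  # the "empty" link

# Toy part-of-speech layer for the German translation (after Figure 1).
POS_LAYER = f"""<tokens xmlns:xlink="{XLINK}">
  <token pos="nn" xlink:href="#t65"/>
  <token pos="prels" xlink:href="#t67"/>
  <token pos="pposat" xlink:href="#t68"/>
</tokens>"""

# Toy word alignment: first link = German token, second = English token.
ALIGNMENT = f"""<tokens xmlns:xlink="{XLINK}">
  <token><align xlink:href="#t65"/><align xlink:href="#t54"/></token>
  <token><align xlink:href="#t67"/><align xlink:href="#undefined"/></token>
  <token><align xlink:href="#t68"/><align xlink:href="#t61"/></token>
</tokens>"""


def unaligned_relative_pronouns(alignment_xml: str, pos_xml: str) -> list[str]:
    """Return ids of tokens tagged 'prels' whose counterpart link is empty."""
    pos_by_ref = {t.get(HREF): t.get("pos")
                  for t in ET.fromstring(pos_xml).iter("token")}
    hits = []
    for unit in ET.fromstring(alignment_xml).iter("token"):
        links = [a.get(HREF) for a in unit.findall("align")]
        if len(links) != 2:
            continue  # skip units that do not have exactly two links
        german, english = links
        if (german != UNDEFINED and english == UNDEFINED
                and pos_by_ref.get(german) == "prels"):
            hits.append(german.lstrip("#"))
    return hits


print(unaligned_relative_pronouns(ALIGNMENT, POS_LAYER))  # ['t67']
```

The three conditions in the if-statement mirror the three conjuncts of the where clause: a defined first link, an empty second link, and a part-of-speech tag of prels for the token the first link points to.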

For the investigation of explicit pronominal referents in German relative clauses vs. implicitly encoded English referents (since the reference is encoded in the English participle), all German tokens with the part-of-speech tag prels which are not aligned at word level have to be extracted. The respective XQuery is shown in Figure 2 (the part-of-speech tag and its lacking equivalent in bold face). The output of this query yields sentences like 2. This example (taken from the FICTION sub-corpus) is interpreted as an instance of explicitation since participant role (and thus the reactivation of the referent), tense and mood are explicitly realized in the finite relative clause of the German translation, whereas they remain implicit in the English original.

The results of relative pronoun alignment suggest that the use of relative pronouns is typologically motivated: Figure 3 shows that zero-to-one alignment, which implies the lack of a relative pronoun in the original as opposed to the presence of a relative pronoun in the translation, occurs more frequently for English-German translations, whereas one-to-zero alignment is a feature of German-English translations. These findings, taken from the SHARE sub-corpus, indicate that regardless of the translation direction we can find more relative pronouns in the German texts than in the English ones. Hence, relative pronouns seem to be more characteristic of German than of English. The results for the other constellations depicted in Figure 3 corroborate this assumption. The evaluation of one-to-zero alignments typical of the German-English translations suggests that the loss of the relative pronoun in translation can be interpreted as implicitation. Example 3 is a case in point:
(3) Der Pkw-Markt in Deutschland ist der einzige wesentliche Markt, in dem ein Rückgang der Auslieferungen an Kunden zu verzeichnen war.
    The German passenger car market was the only major market to see a decline in deliveries to customers.

Here, the German relative pronoun again reactivates the referent, and the finite verb 'war' ('was') of the relative clause renders tense and mood explicit. In contrast, the English infinitive 'to see' implicitly refers to the nominal phrase 'the only major market'. Tense and mood are also implicit in the non-finite construction. The examples discussed here show that, where the use of relative pronouns is concerned,



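[Figure 3: bar chart titled "Alignment of relative pronouns". One axis gives the frequency of relative pronouns per aligned sentence (constellations 3:1, 2:1, 2:0, 1:2, 1:1, 1:0, 0:3, 0:2, 0:1), the other the number of constellations (0 to 400); series: EO-Gtrans and GO-Etrans.]

Figure 3. Alignment distribution of relative pronouns (see note 6)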

phenomena such as explicitation or implicitation seem to be triggered by typological language constraints.

In a next step, hypothesis 2 (cf. Section 4) concerning the proportion of phoric to fully lexical (auto-semantic) phrases was tested by comparing the number of tokens carrying a pronoun tag with the number of tokens carrying a noun tag in aligned sentences. To obtain an overview of the distribution of nouns and pronouns in the CroCo sub-corpora, we compared the frequencies of these two part-of-speech groups (see Table 2). From this table we can gather the following information. The register-neutral reference corpus for German (GR) includes a lower proportion of nouns and a higher proportion of pronouns than the English one (ER). The different proportions in EO and GO are probably a reflection of the broader registerial composition of ER and GR. The frequencies in the translation sub-corpora lie between the originals and the reference corpora, moving in the direction of the language
Table 2. Frequencies of nouns and pronouns in the sub-corpora in percentage terms

Subcorpus   ER      EO      ETRANS  GTRANS  GO      GR
Noun        24.60   27.21   26.14   23.84   24.51   22.93
Pronoun      5.46    4.73    4.54    8.67    9.32    8.45


[Figure 4: bar chart titled "Difference original-translation". Vertical axis: percentage points (-6.00 to 5.00); horizontal axis: part of speech (noun, pronoun), shown for EO-Gtrans and GO-Etrans.]

Figure 4. Shifts of noun and pronoun frequencies from originals to translations

average, i.e. the reference corpora (except for pronouns in ETrans). Finally, the comparison of originals and their matching translations in the respective target language reveals a strong influence of the target language, with similar percentages occurring in each target language category. The frequency of pronouns in ETrans is even lower than both the ER and the EO percentages, so that target language conventions are exaggerated.

With a focus on the comparison of source and target texts, Figure 4 shows this tendency to conform to target language conventions. It reflects the increase and decrease of noun and pronoun frequencies observed when comparing the source texts with the respective target texts. In the translation direction English-German, the noun frequency drops by 3.37 percentage points, while pronouns rise by 3.94 percentage points. In the opposite direction, there is a slight increase in the frequency of nouns (1.63 percentage points) accompanied by a considerable decrease in the frequency of pronouns (-4.77 percentage points). This may be due to a typological constraint which entails an increased use of pronouns in German and of nouns in English. The degree to which this happens may, however, also be influenced by register, as indicated in Figure 5.

This figure illustrates the impact of different register norms on the usage of nominal and pronominal constructions in English-German translations. It supports the overall impression created before, i.e. in all registers the German translations use more pronouns and fewer nouns than the English originals. Most of the registers seem to compensate for their low frequency of nouns through an increased use of pronouns. This is the case for instructional texts in particular. The German SHARE corpus conforms to this tendency with respect to the low number of nouns, whereas the texts also tend to use

[Figure 5: bar chart titled "Difference EO-Gtrans". Vertical axis: percentage points (-6.00 to 8.00); horizontal axis: register (Essay, Fiction, Instruction, Popsci, Share, Speech, Tourism, Web); series: noun and pronoun.]

Figure 5. Shifts of noun and pronoun frequencies across registers in the translation direction E to G

fewer pronouns than texts in other registers. This interesting finding encourages us to take a closer look at this register.

When comparing the distribution of nouns and pronouns for a sub-corpus globally, we are dealing with explicitness. We can only determine explicitation by comparing noun and pronoun tags in aligned versions. In the aligned sentence pairs, non-identical sentence boundaries can be detected through one-to-two mappings and through so-called crossing lines between sentence and clause alignments or sentence and phrase alignments. Table 3 illustrates this for a SHARE text in the translation direction English to German. Sentence pairs 12/15, 13/16 and 15/18 contain strong indicators for explicitation since both more nouns and more pronouns are used in the German translations. Shifts in sentence pairs 18/23, 20/27 and 24/33, where fewer nouns are used in the German translations compared to their English originals, might be interpreted as instances of implicitation.
Table 3. Noun and pronoun distribution in aligned sentence pairs

EO sentence id   no. of nouns   no. of pronouns   GTrans sentence id   no. of nouns   no. of pronouns
s12               3             2                 s15                   6             3
s13               5             2                 s16                   7             6
s15               6             4                 s18                  10             7
s18               8             1                 s23                   5             1
s20              11             1                 s27                   9             1
s24              10             0                 s33                   7             1
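The distinction between globally measured explicitness and pairwise measured explicitation can be made concrete with the figures from Tables 2 and 3. The Python sketch below is illustrative only: the function names shift and classify are hypothetical, and the classification rule is a simplified reading of the interpretation given in the text (more nouns and more pronouns in the translation as an explicitation candidate, fewer nouns as an implicitation candidate), not a procedure described by the authors.

```python
# Table 2: share of noun and pronoun tokens per sub-corpus, in percent.
TABLE2 = {
    "EO": {"noun": 27.21, "pronoun": 4.73},
    "GTRANS": {"noun": 23.84, "pronoun": 8.67},
    "GO": {"noun": 24.51, "pronoun": 9.32},
    "ETRANS": {"noun": 26.14, "pronoun": 4.54},
}


def shift(source, target, pos):
    """Percentage-point change of a part of speech from source to target sub-corpus."""
    return round(TABLE2[target][pos] - TABLE2[source][pos], 2)


# Table 3: (EO id, EO nouns, EO pronouns, GTrans id, GTrans nouns, GTrans pronouns).
TABLE3 = [
    ("s12", 3, 2, "s15", 6, 3),
    ("s13", 5, 2, "s16", 7, 6),
    ("s15", 6, 4, "s18", 10, 7),
    ("s18", 8, 1, "s23", 5, 1),
    ("s20", 11, 1, "s27", 9, 1),
    ("s24", 10, 0, "s33", 7, 1),
]


def classify(pair):
    """Simplified rule of thumb for one aligned sentence pair (an assumption)."""
    _, src_nouns, src_pronouns, _, tgt_nouns, tgt_pronouns = pair
    if tgt_nouns > src_nouns and tgt_pronouns > src_pronouns:
        return "explicitation candidate"
    if tgt_nouns < src_nouns:
        return "implicitation candidate"
    return "undecided"


# Global comparison (explicitness), English to German.
print(shift("EO", "GTRANS", "noun"))     # -3.37
print(shift("EO", "GTRANS", "pronoun"))  # 3.94
# Pairwise comparison (explicitation) on the aligned sentences of Table 3.
for pair in TABLE3:
    print(pair[0], pair[3], classify(pair))
```

Recomputing the German-English pronoun shift from the rounded table values gives -4.78 rather than the -4.77 reported in the text; the difference is presumably a rounding effect.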


A closer look at the sentence pair in example 4 shows that the English original contains 5 nouns and 4 pronouns, while the aligned German translation contains 9 nouns and 3 pronouns. The translation of the two instances of the pronoun 'it', which refer to 'downturn' in the original, illustrates how explicitation is created. In the translation, we find two paraphrases of the antecedent 'Abschwung der Konjunktur', namely a general noun in the noun phrase 'diese schwierige Situation' and a synonym, 'die konjunkturelle Delle', both creating lexical cohesion. Obviously this shift entails changes in co-reference.
(4) During challenging market transitions, successful companies usually get surprised by the downturn, they determine how long it will last and how deep it will be, and then they get ready for the upturn.
    In einem Marktumfeld, dessen ständige Veränderungen eine große Herausforderung darstellen, werden auch erfolgreiche Unternehmen von einem Abschwung der Konjunktur zunächst überrascht, untersuchen dann jedoch umgehend, wie lange diese schwierige Situation andauern und wie tief die konjunkturelle Delle sein wird, um für den folgenden Aufschwung gerüstet zu sein.

5.2 Substitution and ellipsis

Referring to hypothesis 4, we shall discuss substitution and ellipsis together, as they represent the same process, namely replacing one item either by a semantically weaker one or, in the case of ellipsis, by zero (cf. Halliday and Hasan 1976:88). Substitution is a cohesive device without a direct equivalent in German, which has implications for translation. The most neutral compensation strategy, used by the translator in 5, is to use ellipsis instead (see note 7).
(5) (...) proponents of enlargement - of which I am an enthusiastic one - occasionally fall into the rhetorical trap of arguing (...)
    (...) die Verfechter der Erweiterung - und ich zähle mich zu den enthusiastischen - gehen manchmal in die rhetorische Falle zu argumentieren, (...)

However, this is not the only strategy pursued by translators. In 6, instead of replacing or omitting the item in question, the translator chose to repeat it, thus making use of lexical cohesion. This can be interpreted as a case of cohesive explicitation because the syn-semantic element 'one', whose referential meaning can only be determined by retrieving its previous mention, is replaced by the auto-semantic item 'einen Partner', which spells out the referential meaning.


(6) (...) I again want to stress that the US genuinely wants the EU to be a strategic global partner - one we can work closely with to address political, security and economic issues of common concern around the world.
    (...) möchte ich noch einmal hervorheben, dass die Vereinigten Staaten aufrichtig an der EU als strategischen globalen [sic!] Partner interessiert sind - einen [sic!] Partner, mit dem wir bei politischen, sicherheitspolitischen und wirtschaftlichen Fragen, die die ganze Welt betreffen, eng zusammenarbeiten können.

From an explicitation-oriented point of view, the other translation direction is even more interesting. Since substitution is not available to the German source text author, we would expect not to find any substitution in the translations either. This is, however, not the case. Translators employ substitution in texts that do not contain any marked cohesive device in the source text, thereby normalizing the translations as in 7. Here, the referent is taken up in an additional cohesive realization ('one').
(7) Die Justizministerin werde ich dazu einladen, und der Verband und wir werden eine vernünftige Lösung finden, die sicherstellt, dass die Wettbewerbsbedingungen in Deutschland verglichen mit anderen europäischen Ländern nicht schlechter sind (...)
    I will ask the Justice Minister to join our discussions and together with your Association we will work out a sensible solution, one that ensures the competitive environment in Germany is no less favourable than in other European countries (...)

Retrieval of all these examples is possible at word level by searching for the substitutor 'one' in a concordance tool. Thanks to our annotation and alignment, substitutions can also be classified according to their grammatical functions or to the aligned segments. Since the annotation of grammatical functions and the corresponding alignment is still in progress, this classification is postponed to a later project stage.

As to ellipsis, German seems to offer more flexible means of realizing it than English, as can be seen from 8 below.
(8) Habt ihr gesehen, ob er sie küßt? Und sie ihn?
    Have you seen whether he kisses her? And does she kiss him?

Here, the whole verbal group is elided in the German source text. For language-typological reasons, this is not possible in the English target text; both the finite 'does' and the predicator 'kiss' are verbalized. This results in a more explicit wording in several ways. Not only is the verbal referent explicated, strengthening the lexical cohesion; looking at the example from the perspective of grammar, we can also conclude that tense, mood and voice are explicitated in the translation. The annotation of grammatical functions is particularly helpful for querying ellipsis. We can retrieve elided elements from the corpus, for instance by querying finites in one version with an empty link, i.e. no alignment with a matching finite, in the other version.

5.3 Conjunction

Hypothesis 7 in Section 4 addresses conjunctive relations. Halliday and Hasan describe English conjunctive elements as specifying the way in which what is to follow is systematically connected to what has gone before (1976:227). This applies to German conjunctive elements as well. While conjunction works quite differently from reference, we can nonetheless pose a query similar to the one described for reference in Section 5.1. Here, all German tokens with the part-of-speech tag kous (for subordinating conjunction) which are not aligned at word level (since the conjunctive relation is encoded implicitly, for instance through a participle clause) are extracted. The results for this query displayed in Figure 6 are taken from the SHARE sub-corpus. The two examples show explicitation in the German translations since the implicit conjunctive relations encoded in the English participles (marked in bold face) are translated explicitly with German conjunctions (again marked in bold face). It should be noted, however, that, formally speaking, 'wodurch' in the translation of the first example is not a conjunction but an adverbial relative pronoun. The fact that this sentence pair is retrieved by the query can be explained by a useful mistake of the statistical part-of-speech tagger. Example 9 shows a case of implicitation where the coordination marked by the conjunction 'und' in the German original is translated by a subordination which implicitates the conjunctive relation.
<result>
<ori_en>Baker Hughes Business Support Services has assumed accounting, payroll, benefits and IT support duties for many of the company's U.S. operations, eliminating duplicate efforts by division personnel. </ori_en>
<trans_ge>Baker Hughes Business Support Services hat die Buchführung, Gehalts- und Sozialleistungen sowie IT-Aufgaben für viele Niederlassungen des Unternehmens in den Vereinigten Staaten übernommen, wodurch doppelte Arbeit durch das Personal in den Tochterunternehmen vermieden werden konnte. </trans_ge>
</result>
<result>
<ori_en>In this environment, Baker Hughes' revenue declined 22% to $4.5 billion for 1999, compared to $5.8 billion in 1998. </ori_en>
<trans_ge>Vor diesem Hintergrund sanken die Umsatzerlöse von Baker Hughes im Jahre 1999 um 22% auf 4,5 Mrd. Dollar, während sie 1998 noch 5,8 Mrd. Dollar betragen hatten. </trans_ge>
</result>

Figure 6. Results for conjunctions with empty links
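Query output in the form shown in Figure 6 lends itself to simple post-processing. The following Python sketch is an assumption-laden illustration: it presupposes that the result elements are wrapped in a single root element (here a hypothetical 'results' element, which Figure 6 does not show), and the sample sentences are placeholders rather than corpus data. Only the element names result, ori_en and trans_ge are taken from the figure.

```python
import xml.etree.ElementTree as ET

# Placeholder input; the <results> wrapper is an assumption added so that the
# fragment is a well-formed XML document.
SAMPLE = """<results>
<result><ori_en>English sentence one. </ori_en>
<trans_ge>Deutscher Satz eins. </trans_ge></result>
<result><ori_en>English sentence two. </ori_en>
<trans_ge>Deutscher Satz zwei. </trans_ge></result>
</results>"""


def sentence_pairs(xml_text):
    """Return (original, translation) string pairs from query output."""
    root = ET.fromstring(xml_text)
    pairs = []
    for result in root.iter("result"):
        original = result.findtext("ori_en", default="").strip()
        translation = result.findtext("trans_ge", default="").strip()
        pairs.append((original, translation))
    return pairs


for original, translation in sentence_pairs(SAMPLE):
    print(original, "|", translation)
```

Extracted pairs of this kind could then be inspected manually, which is how tagging errors such as the one noted for 'wodurch' would come to light.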


(9) Wie lange sind die schon da drinnen? fragte ich und starrte auf das schwarze Fenster in der Baracke.
    How long have they been in there? I asked, staring at the black window of the structure.

5.4 Lexical cohesion

Lexical cohesion is achieved through the selection of vocabulary (Halliday and Hasan 1976:274). It is realized in English by the replacement of a lexical item with a general noun, a (near-)synonym, a superordinate or a hyponym. Finally, the lexical item may simply be repeated. All of these procedures are also available in German; the use of general nouns is, however, more restricted in German than in English. While most of these procedures require a semantic analysis which is currently not part of the CroCo resource, we obtain an impression of repetition by computing the two indices type-token ratio and lexical density as expressed in hypothesis 5 in Section 4.

As discussed in Section 2, we have to bear in mind that these ratios have to be seen as emergent phenomena and can only serve as indirect indicators for lexical cohesion. They are subject to different and conflicting influences. For instance, a low type-token ratio signalling frequent repetitions of types does not discriminate between the repetition of lexical words relevant to lexical cohesion and that of function words not contributing to lexical cohesion. The automatic counting even of lemmatized items has to allow for different spellings (the most prominent example being the different spelling of compounds in English and German). These ratios therefore have to be complemented by a thorough semantic analysis, particularly with respect to the other realizations of lexical cohesion (see note 8).

To compute the type-token ratio, we counted how often a given lexical item, i.e. a type, is repeated as a token in the text. Referring to the spoken-written distinction, Biber et al. (1999:43) claim that a high type/token ratio (i.e. a high degree of lexical diversity in a text) serves to increase the semantic precision and informational density of a written text. In CroCo, the calculation is based on the lemmatization included in the morphological analysis.
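The two indices can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the CroCo implementation: the type-token ratio is taken as distinct lemmas divided by tokens, lexical density is assumed to be the share of tokens carrying a lexical (content-word) tag, and the tag names in LEXICAL_TAGS as well as the toy sentence are placeholders. The last two lines apply the ratio to the type and token counts reported for text pair 007.

```python
def ratio_from_counts(types, tokens):
    """Type-token ratio in percent, given the number of types and tokens."""
    if tokens <= 0:
        raise ValueError("a text needs at least one token")
    return 100 * types / tokens


def type_token_ratio(lemmas):
    """Type-token ratio in percent, computed over a list of lemmas."""
    return ratio_from_counts(len(set(lemmas)), len(lemmas))


def lexical_density(pos_tags, lexical_tags):
    """Share of tokens whose tag counts as lexical, in percent (assumed definition)."""
    if not pos_tags:
        raise ValueError("a text needs at least one token")
    lexical = sum(1 for tag in pos_tags if tag in lexical_tags)
    return 100 * lexical / len(pos_tags)


# Placeholder tag inventory; a real tag set would need its own mapping.
LEXICAL_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}

# Toy input: lemmas and tags of "the market is the only major market".
lemmas = ["the", "market", "be", "the", "only", "major", "market"]
tags = ["DET", "NOUN", "AUX", "DET", "ADJ", "ADJ", "NOUN"]

print(round(type_token_ratio(lemmas), 2))             # 71.43
print(round(lexical_density(tags, LEXICAL_TAGS), 2))  # 57.14

# Counts reported for text pair 007 (English original, German translation).
print(round(ratio_from_counts(959, 6174), 2))   # 15.53
print(round(ratio_from_counts(1007, 5586), 2))  # 18.03
```

Working on lemmas rather than word forms keeps inflectional variants from inflating the number of types, which matters more for German than for English.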
Table 4 lists the ratios for each aligned text of the register TOU in the translation direction English to German. If we compare the ratios in aligned source and target texts, we can interpret them in terms of explicitation/implicitation, the assumption being that a lower type-token ratio in the aligned target text points to higher lexical cohesion and thus to explicitation. For the interpretation, we have picked out text pair number 007, which differs in type-token ratio by 2.5 percentage points. In absolute numbers, the English original has 959 types and 6,174 tokens while the German translation contains more types (1,007) and fewer tokens (5,586). This leads to the conclusion that the translated text is less repetitive and thus shows less lexical cohesion


Table 4. Type-token ratio in EO_TOU and GTrans_TOU in percentage terms

Text no.   EO      Gtrans
001        17.00   20.10
002        15.96   17.94
003        16.19   18.90
004        18.77   19.71
005        15.35   18.17
006        15.04   16.81
007        15.53   18.03
008        16.24   18.26
009        15.51   16.85
010        14.96   16.09
011        13.10   15.86

as realized through repetition. While part of the difference can be explained by the different compound chunking, the two following examples from text pair 007 show that there are indeed cases of reduced repetition in the translation.
(10) There are many bodies concerned with the preservation of the countryside and wildlife based in Wales and all local authorities pursue an active conservation policy.
     Es gibt hierzulande eine ganze Reihe von Vereinen, die sich dem Schutz der Natur und der Landschaft widmen. Außerdem betreiben alle Gemeinde und Kreisverwaltungen aktiven Naturschutz.
(11) As regards density of population, two thirds live in South East Wales and the rest are thinly spread throughout the West and North, which gives these areas a tremendous feeling of space.
     Die Bevölkerungsdichte ist sehr unregelmäßig: zwei Drittel leben im Südosten des Landes, und das andere Drittel ist dünn über den Westen und Norden des Landes verteilt, was diesen Landesteilen ein unbeschreibliches Gefühl des Raumes verleiht.

The most frequent content word of the text pair is the proper noun 'Wales'. It is replaced in 10 by the adverb 'hierzulande' ('here, in this country'), and in 11 by the noun 'Land' ('country'), consequently reducing the repetition of the noun 'Wales'. In terms of lexical cohesion this can be interpreted as implicitation. Other interpretations which may, for instance, emphasize the translator's tendency to diversify the vocabulary are beyond the scope of this paper.

Lexical density can be interpreted as an indicator of lexical cohesion in the following way: a high lexical density in a given text reflects a high number of lexical items which require a certain level of semantic connection, if we presume that the


Table 5. Lexical density in English original fiction texts and their translations in percentage terms

Text no.   EO_FICTION   GTRANS_FICTION   difference in percentage points
001        43.06        46.53            3.47
002        42.66        48.45            5.79
003        48.43        53.17            4.74
004        47.87        54.62            6.76
005        46.31        52.44            6.13
006        42.45        51.81            9.36
007        47.11        52.23            5.12
008        43.84        49.34            5.50
009        45.50        54.40            8.90
010        48.80        54.47            5.67

text is coherent at all. The comparison of the general figures for the language corpora across all registers does not show any marked tendency. However, in all four sub-corpora the register FICTION has the lowest lexical density. If we take a closer look at concrete aligned text pairs from the English-German FICTION corpora, we can see that each German translation in this register has a considerably higher lexical density than the source language text (see Table 5). This can be seen as an indicator of a higher level of lexical cohesion. The influence of contrastive differences does not provide a satisfactory explanation for this phenomenon, considering that the differences between source and target texts vary considerably. Therefore, this rise in lexical density might be due to the translation process.

It must be stressed, however, that both type-token ratio and lexical density can only serve to provide a first quantitative indication of lexical cohesion and can only be interpreted in terms of repetition. As stated above, a complete quantitative overview of lexical cohesion can only be obtained by way of an annotation of semantic relations.

In this section we have discussed the impact of the different cohesive devices on explicitness/implicitness and explicitation/implicitation in translations. It is important to point out that each of the devices may affect the explicitness of a translation in different or even conflicting ways; the substitution of one device with another may even generate an offsetting effect. A generalized statement on cohesive explicitness and explicitation would call for an assessment of the relative impact of the different devices and their contribution to explicit and/or implicit packaging of information. It is questionable whether such a prioritization is feasible and whether it results in substantiated assertions.


6. Conclusions and outlook

In this article we have attempted to show that explicitness and explicitation operate differently and thus have to be analyzed in different ways. After clarifying the two concepts and explaining how linguistic indicators from the areas of cohesion, and partly also lexicogrammar, serve to operationalize them, we have discussed results obtained by analyzing the CroCo Corpus of annotated English and German parallel texts. We have presented findings illustrating differences in cohesion in English and German and particularly variation in translations compared to their source texts and/or originals in the target language. In several cases, as for instance with respect to pronominal reference, the results suggested contrastive differences between the two languages involved as the explanation for the explicitation or implicitation diagnosed. The CroCo resource proved able to provide a wealth of information for investigating research questions in this area. Additional annotation of the corpus, e.g. by encoding grammatical functions and particularly semantic relations, will extend the range of possible queries and thus also the range of interpretations in future work.

Notes
* The authors would like to thank Mihaela Vela for her help with the queries discussed here, Kerstin Kunz for continuous discussions on cohesive indicators and Mary Mondt for proofreading. Without their contributions this paper would not have been possible in its present form. We would also like to thank two anonymous reviewers for helpful comments on an earlier version. The research reported on here is sponsored by the German Research Foundation (DFG) as project no. STE 840/51.
1. For current information cf. http://fr46.uni-saarland.de/croco/
2. http://www.tei-c.org
3. http://www.xml-ces.org
4. Following the Stuttgart-Tübingen Tag Set for German (Schiller et al. 1999)
5. Unless marked otherwise the examples are all taken from the CroCo Corpus.
6. Note that we did not test the significance of our results.
7. We concentrated on nominal substitution where the head of a nominal group is substituted by 'one'.
8. Teich and Fankhauser (2005) exemplify an interesting way of interpreting lexical cohesion on the basis of a corpus annotated with semantic relations.


References
Baker, M. 1996. Corpus-based Translation Studies: The Challenges that Lie Ahead. In Terminology, LSP and Translation Studies in Language Engineering, H. Somers (ed.), 175-186. Amsterdam: John Benjamins.
Biber, D. 1993. Representativeness in Corpus Design. Literary and Linguistic Computing 8/4: 243-257.
Biber, D. 1995. Dimensions of Register Variation: A Cross-linguistic Comparison. Cambridge: Cambridge University Press.
Biber, D., Johansson, S., Leech, G., Conrad, S. and Finegan, E. 1999. Longman Grammar of Spoken and Written English. London: Longman.
Blum-Kulka, S. 1986. Shifts of Cohesion and Coherence in Translation. In Interlingual and Intercultural Communication: Discourse and Cognition in Translation and Second Language Acquisition Studies, J. House and S. Blum-Kulka (eds.), 17-35. Tübingen: Narr.
Burton-Roberts, N. 2005. Review Article: Robyn Carston on Semantics, Pragmatics and Encoding. Journal of Linguistics 41 (2005): 389-407.
Carston, R. 2002. Thoughts and Utterances: the Pragmatics of Explicit Communication. Oxford: Blackwell.
Dixon, R.M.W. 1991. A New Approach to English Grammar, on Semantic Principles. Oxford: Clarendon Press.
Doherty, M. 2002. Language Processing in Discourse. A Key to Felicitous Translation. London, New York: Routledge.
Doherty, M. 2006. Structural Propensities. Amsterdam: John Benjamins.
Englund Dimitrova, B. 2005. Expertise and Explicitation in the Translation Process. Amsterdam: John Benjamins.
Fabricius-Hansen, C. 1999. Information Packaging and Translation: Aspects of Translational Sentence Splitting (German - English/Norwegian). In Sprachspezifische Aspekte der Informationsverteilung, M. Doherty (ed.), 175-214. Berlin: Akademie-Verlag.
Halliday, M.A.K. and Hasan, R. 1976. Cohesion in English. London: Longman.
Halliday, M.A.K. and Hasan, R. 1989. Language, Context and Text: Aspects of Language in a Social-Semiotic Perspective. Oxford: Oxford Univ. Press.
Halliday, M.A.K. and Martin, J. 1993. Writing Science: Literacy and Discursive Power. London: The Falmer Press.
Halliday, M.A.K. and Matthiessen, C.M.I.M. 1999. Construing Experience through Meaning. A Language-Based Approach to Cognition. London: Cassell.
Halliday, M.A.K. and Matthiessen, C.M.I.M. 2004. An Introduction to Functional Grammar. 3rd edition. London: Arnold (earlier versions by Halliday in 1985/1994).
Hansen, S. 2003. The Nature of Translated Text. An Interdisciplinary Methodology for the Investigation of the Specific Properties of Translations. Saarbrücken: Saarbrücken Dissertations in Computational Linguistics and Language Technology. vol. 13.
Hansen-Schirra, S., Neumann, S. and Vela, M. 2006. Multi-dimensional Annotation and Alignment in an English-German Translation Corpus. Proceedings of the 5th Workshop on Multidimensional markup in NLP, 35-42. Trento.
House, J. 2002. Maintenance and Convergence in Translation - some Methods for Corpus-Based Investigations. In Information Structure in a Cross-Linguistic Perspective, H. Hasselgård, S. Johansson, B. Behrens, and C. Fabricius-Hansen (eds.), 199-212. Amsterdam, New York: Rodopi.
House, J. and Rehbein, J. (eds.). 2004. Multilingual Communication. Amsterdam: John Benjamins.
Johansson, S. and Oksefjell, S. (eds.). 1998. Corpora and Cross-linguistic Research. Amsterdam: Rodopi.
Kunz, K. (this volume). A Method for Investigating Coreference in Translations and Originals.
Laviosa, S. (ed.). 1998. Meta Translators' Journal. vol. 43 no. 4.
Linke, A. and Nussbaumer, M. 2000. Konzepte des Impliziten: Präsuppositionen und Implikaturen. In Text- und Gesprächslinguistik. Ein internationales Handbuch zeitgenössischer Forschung, K. Brinker, G. Antos, W. Heinemann, and S. F. Sager, 435-448. Halbband 1. Berlin, New York: de Gruyter.
Neumann, S. 2003. Die Beschreibung von Textsorten und ihre Nutzung beim Übersetzen. Frankfurt/M.: Peter Lang.
Neumann, S. and Hansen-Schirra, S. 2005. The CroCo Project. Cross-Linguistic Corpora for the Investigation of Explicitation in Translations. In Proceedings from the Corpus Linguistics Conference Series, 1(1). Available at http://www.corpus.bham.ac.uk/PCLC/cl-134-pap.pdf, accessed May 2007.
Olohan, M. and Baker, M. 2000. Reporting that in Translated English. Evidence for Subconscious Processes of Explicitation? Across Languages and Cultures 1(2): 141-158.
Polenz, P. von 1988. Deutsche Satzsemantik. Grundbegriffe des Zwischen-den-Zeilen-Lesens. Berlin: Mouton de Gruyter.
Schiller, A., Teufel, S., Stöckert, C. and Thielen, C. 1999. Guidelines für das Tagging deutscher Textcorpora mit STTS. Stuttgart: Universität Stuttgart.
Steiner, E. 2001. Translations English - German: Investigating the Relative Importance of Systemic Contrasts and of the Text-Type Translation. SPRIK-Reports no. 7. Oslo.
Steiner, E. 2004. Ideational Grammatical Metaphor: Exploring some Implications for the Overall Model. Languages in Contrast 4(1): 139-166.
Steiner, E. 2005a. Some Properties of Texts in Terms of Information Distribution Across Languages. Languages in Contrast 5(1): 49-72.
Steiner, E. 2005b. Some Properties of Lexicogrammatical Encoding and their Implications for Situations of Language Contact and Multilinguality. Zeitschrift für Literaturwissenschaft und Linguistik Jg. 35, Heft 139: 54-75.
Steiner, E. 2005c. Explicitation, its Lexicogrammatical Realization, and its Determining (Independent) Variables - Towards an Empirical and Corpus-Based Methodology. SPRIK-Reports no. 36. Oslo.
Teich, E. 2003. Cross-Linguistic Variation in System and Text: A Methodology for the Investigation of Translations and Comparable Texts. Berlin: Mouton de Gruyter.
Teich, E. and Fankhauser, P. 2005. Exploring Lexical Patterns in Text: Lexical Cohesion Analysis with WordNet. Interdisciplinary Studies on Information Structure 2 (2005): 129-145.
Vela, M. and Hansen-Schirra, S. 2006. The Use of Multi-Level Translation and Alignment for the Translator. Proceedings of the Translating and the Computer 28 Conference. London. Available at http://www.aslib.co.uk/conferences/proceedings.html, accessed May 2007.


Authors' addresses

Silvia Hansen-Schirra
Fachbereich Angewandte Sprach- und Kulturwissenschaft
Johannes Gutenberg-Universität Mainz
Postfach 11 50
D-76711 Germersheim
Germany
hansenss@uni-mainz.de

Stella Neumann
Fachrichtung Angewandte Sprachwissenschaft, sowie Übersetzen und Dolmetschen
Universität des Saarlandes
Postfach 15 11 50
D-66041 Saarbrücken
Germany
st.neumann@mx.uni-saarland.de

Erich Steiner
Fachrichtung Angewandte Sprachwissenschaft, sowie Übersetzen und Dolmetschen
Universität des Saarlandes
Postfach 15 11 50
D-66041 Saarbrücken
Germany
e.steiner@mx.uni-saarland.de
