
Direct translation

• no complete intermediate sentence structure is built
• translation proceeds in a number of steps, each
dedicated to a specific task
• the most important component is the bilingual
dictionary
• typically general language
• problems with
– ambiguity
– inflection
– word order and other structural shifts
Simplistic approach
• sentence splitting
• tokenisation
• handling capital letters
• dictionary look-up and lexical substitution incl.
some heuristics for handling ambiguities
• copying unknown words, digits, signs of
punctuation etc.
• formal editing
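
The steps above can be sketched as a small pipeline. This is a minimal illustration, not a reference implementation: the dictionary entries, the function names and the one-sense-per-word choice are all assumptions made for the example, and no ambiguity heuristics are included.

```python
import re

# Toy bilingual dictionary (hypothetical entries, one target sense per source form).
DICTIONARY = {
    "ytterst": "ultimately",
    "handlar": "deals",
    "kampen": "the fight",
    "för": "for",
    "sysselsättning": "employment",
    "om": "about",
    "att": "to",
    "hålla": "hold",
    "samman": "together",
}

def split_sentences(text):
    """Sentence splitting: break after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenise(sentence):
    """Tokenisation: words and single punctuation marks become separate tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def translate_token(token):
    """Dictionary look-up on the lower-cased form; unknown tokens are copied."""
    target = DICTIONARY.get(token.lower())
    if target is None:
        return token  # unknown word, digit or punctuation mark
    # Capital letters: restore an initial capital if the source token had one.
    return target.capitalize() if token[0].isupper() else target

def format_output(tokens):
    """Formal editing: join tokens and remove the space before punctuation."""
    return re.sub(r"\s+([.,;:!?])", r"\1", " ".join(tokens))

def translate(text):
    return [format_output([translate_token(t) for t in tokenise(s)])
            for s in split_sentences(text)]

print(translate("Ytterst handlar kampen för sysselsättning om att hålla samman Sverige."))
# ['Ultimately deals the fight for employment about to hold together Sverige.']
```

The output keeps the Swedish word order and copies the unknown word "Sverige", which shows the ambiguity and word order problems listed earlier.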
Advanced classical approach
(Tucker 1987)
• Source text dictionary look-up and
morphological analysis
• Identification of homographs
• Identification of compound nouns
• Identification of noun and verb phrases
• Processing of idioms
Advanced approach, cont.
• processing of prepositions
• subject-predicate identification
• syntactic ambiguity identification
• synthesis and morphological processing
of target text
• rearrangement of words and phrases in
target text
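
Two of these steps, morphological analysis of the source word and morphological synthesis of the target word, might look roughly as follows. The lemma dictionary, the suffix rules and the naive English inflection are hypothetical and cover only a few forms; they are not taken from Tucker (1987).

```python
# Hypothetical lemma dictionary: source lemma -> (part of speech, target lemma).
LEMMAS = {
    "kamp": ("noun", "fight"),
    "by": ("noun", "village"),
    "handla": ("verb", "act"),
}

# Hypothetical suffix rules: (source suffix, part of speech, feature).
# Longer suffixes are listed first so that they are tried first.
SUFFIX_RULES = [
    ("arna", "noun", "def_pl"),
    ("en", "noun", "def_sg"),
    ("r", "verb", "present"),
]

def analyse(word):
    """Return (lemma, pos, feature) or None if no rule yields a known lemma."""
    word = word.lower()
    if word in LEMMAS:
        return word, LEMMAS[word][0], "base"
    for suffix, pos, feature in SUFFIX_RULES:
        if word.endswith(suffix):
            lemma = word[: -len(suffix)]
            if lemma in LEMMAS and LEMMAS[lemma][0] == pos:
                return lemma, pos, feature
    return None

def synthesise(lemma, feature):
    """Build the target form from the target lemma (naive English inflection)."""
    target = LEMMAS[lemma][1]
    if feature == "def_sg":
        return "the " + target
    if feature == "def_pl":
        return "the " + target + "s"
    if feature == "present":
        return target + "s"
    return target

def translate_word(word):
    analysis = analyse(word)
    if analysis is None:
        return word  # copy unknown words unchanged
    lemma, _pos, feature = analysis
    return synthesise(lemma, feature)

print(translate_word("kampen"), "|", translate_word("byarna"))
# the fight | the villages
```

Storing lemmas instead of full forms keeps the dictionary small, at the cost of needing rules for every inflection pattern.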
Feasibility of the direct
translation strategy

Is it possible to carry out the direct translation steps
as suggested by Tucker with sufficient precision without
relying on a complete sentence structure?
Assignment 1: manual direct
translation
Sv. Ytterst handlar kampen för sysselsättning om att hålla
samman Sverige.
En. Ultimately, the fight for full employment concerns the
cohesion of Swedish society.
(from Statement of Government Policy 1996)

• Define an algorithm and a dictionary (based on
Norstedts) for simplistic translation of the
example.
• Present the model and the result.
Assignment 1, cont.
• Improve the result stepwise in accordance with
the advanced direct translation strategy
– Specify each step carefully and demonstrate its effect
on the translation.
• Evaluate and discuss the final result.
• Translate the example using Systran
(http://kwic.systran.fr/systran/svdemo) and discuss
and evaluate the differences
• Report the assignment and upload it to the web
(041001)
Current trends in direct
translation
• re-use of translations
– translation memories of sentences and sub-sentence
units such as words, phrases and larger units
– lexicalistic translation
– example-based translation
– statistical translation
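
A translation memory of sentences can be sketched as a fuzzy look-up over stored source/target pairs. The sketch below assumes a character-based similarity from Python's standard difflib module and an arbitrary threshold of 0.75; real translation memory tools use their own matching schemes. The two stored pairs are the example sentences used elsewhere in these slides.

```python
from difflib import SequenceMatcher

# Stored (source, target) pairs taken from the examples in these slides.
MEMORY = [
    ("Ytterst handlar kampen för sysselsättning om att hålla samman Sverige.",
     "Ultimately, the fight for full employment concerns the cohesion of Swedish society."),
    ("Enskilda företagare som inte bildat bolag klassificeras hit.",
     "Individual entrepreneurs that have not formed companies are classified here."),
]

def lookup(source, memory=MEMORY, threshold=0.75):
    """Return (target, score) for the most similar stored source, or None.

    The threshold is an assumed value; below it the match is rejected.
    """
    best_score, best_target = 0.0, None
    for stored_source, stored_target in memory:
        score = SequenceMatcher(None, source.lower(), stored_source.lower()).ratio()
        if score > best_score:
            best_score, best_target = score, stored_target
    if best_target is None or best_score < threshold:
        return None
    return best_target, best_score

# An exact match is re-used as it stands; an unrelated sentence finds nothing.
print(lookup("Enskilda företagare som inte bildat bolag klassificeras hit."))
print(lookup("Vad kan vi lära av Arrawetestammen?"))
```

A fuzzy match only proposes a stored translation; the differing part of the sentence still has to be translated by some other means.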

Will re-use of translations overcome the problems with the
direct translation approach that were discussed above?

If so, how can they be handled?


Systran
• System Translation
• developed in the US by Peter Toma
• first version 1969 (Ru-En)
• EC bought the rights of Systran in 1976
• currently 18 language pairs
• demo version sv-en in 2003
(http://kwic.systran.fr/systran/svdemo)
• http://babelfish.altavista.com/
Systran, cont.
• more than 1,600,000 dictionary units
• 20 domain dictionaries
• daily use by EC translators, administrators
of the European institutions
• originally a direct translation strategy
– see H&S
• today more of a transfer-based strategy
Ex. 1: fairly good translation
/Systran sv-en
• "Enskilda företagare som inte bildat bolag
klassificeras hit."
• "Individual entrepreneurs that have not formed
companies are classified here."
• The system has recognised "bildat" as a perfect
form and translates the tense correctly as "have
formed", with the negation "not" in the right place.
Ex. 2: word order problem/
Systran sv-en
• "När byarna kontaktades hade de inte ens
utsatts för influensa."
• "When the villages were contacted had
they not even been exposed to flu."
• The system has not identified the subject and
the predicate and therefore produces the wrong
word order.
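
The word order error in Ex. 2 comes from Swedish verb-second order: after a fronted subordinate clause the finite verb precedes the subject. One possible repair step is sketched below. It assumes that the start of the main clause is already known and uses small hypothetical word lists instead of real subject-predicate identification; it is not a description of how Systran works.

```python
# Hypothetical closed word lists standing in for real part-of-speech information.
FINITE_VERBS = {"had", "has", "have", "was", "were", "can"}
SUBJECT_PRONOUNS = {"i", "you", "he", "she", "it", "we", "they"}

def undo_inversion(tokens, clause_start):
    """Swap a finite verb and a following subject pronoun at the main clause start.

    clause_start is the index of the first main-clause token and is assumed
    to be supplied by an earlier analysis step.
    """
    out = list(tokens)
    i = clause_start
    if (i + 1 < len(out)
            and out[i].lower() in FINITE_VERBS
            and out[i + 1].lower() in SUBJECT_PRONOUNS):
        out[i], out[i + 1] = out[i + 1], out[i]
    return out

tokens = "When the villages were contacted had they not even been exposed to flu".split()
print(" ".join(undo_inversion(tokens, clause_start=5)))
# When the villages were contacted they had not even been exposed to flu
```

The hard part in practice is finding clause_start and the subject, which is exactly the subject-predicate identification step that failed here.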
Ex. 3: ambiguity problem/
Systran sv-en
• "Vad kan vi lära av Arrawetestammen?"
• "What can we faith of the Arawete?"
• The system does not find the connection between
"kan" and "lära" and therefore does not see that
"lära" is a verb.
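
A simple context heuristic for the homograph in Ex. 3 could look like this. It assumes that a modal auxiliary within a few tokens to the left signals the verb reading; the modal list, the window size and the dictionary entry are hypothetical.

```python
# Hypothetical homograph entry: one target word per part of speech.
HOMOGRAPHS = {"lära": {"noun": "faith", "verb": "learn"}}

# A few Swedish modal auxiliaries (not a complete list).
MODALS = {"kan", "ska", "vill", "måste", "bör"}

def resolve(tokens, index, window=3):
    """Choose the verb reading if a modal occurs within `window` tokens to the left."""
    readings = HOMOGRAPHS[tokens[index].lower()]
    preceding = [t.lower() for t in tokens[max(0, index - window):index]]
    pos = "verb" if any(t in MODALS for t in preceding) else "noun"
    return pos, readings[pos]

print(resolve(["Vad", "kan", "vi", "lära", "av", "Arrawetestammen", "?"], 3))
# ('verb', 'learn')
```

The window has to span the subject "vi", which stands between the modal and the infinitive in a question.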
Ex. 4: ambiguity problem/
Systran sv-en
• "Extrapoleringen går till så här."
• "The extrapolation goes to so here."
• The system does not know the particle verb
"gå till" and therefore incorrectly translates
word for word.
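
Word-for-word output like that in Ex. 4 can be avoided when multi-word entries are stored in the dictionary and the longest entry is tried first. The sketch below assumes contiguous multi-word units only; the entries and their translations are hypothetical.

```python
# Hypothetical dictionary keyed by tuples of lower-cased source tokens.
PHRASES = {
    ("extrapoleringen",): "the extrapolation",
    ("går", "till"): "is done",
    ("så", "här"): "like this",
    ("går",): "goes",
    ("till",): "to",
}
MAX_LEN = max(len(key) for key in PHRASES)

def translate_longest_match(tokens):
    """Scan left to right, always taking the longest dictionary entry that matches."""
    out, i = [], 0
    while i < len(tokens):
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in PHRASES:
                out.append(PHRASES[key])
                i += n
                break
        else:
            out.append(tokens[i])  # no entry of any length: copy the token
            i += 1
    return out

print(" ".join(translate_longest_match(["Extrapoleringen", "går", "till", "så", "här"])))
# the extrapolation is done like this
```

Swedish particle verbs can also be split by other words, which this contiguous matching does not cover.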
Systran Linguistic Resources
• Dictionaries
– POS Definitions
– Inflection Tables
– Decomposition Tables
– Segmentation Dictionaries
• Disambiguation Rules
• Analysis Rules
Systran Processing Steps
• Analysis
– Lookup
– Compound Decomposition
– Disambiguation
– Syntactic Analysis
– Compound Expansion
• Sentence Transfer
– Initial Target Structure
– Lookup
– Default Transfer of Attributes
– Structure Transformation
Systran Processing Steps (cont)
• Sentence Synthesis
– Structure Transformation
– Inflection lookup
– Surface Transformation
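
The Compound Decomposition step listed under Analysis can be illustrated with a dictionary-driven splitter. This is a generic sketch and not Systran's actual procedure: the lexicon, the single linking element "s" and the longest-head-first strategy are assumptions.

```python
# Hypothetical lexicon of known simplex words.
LEXICON = {"regering", "förklaring", "språk", "teknologi"}

# Linking elements allowed between parts: none, or the Swedish linking "s".
LINKERS = ("", "s")

def decompose(word, lexicon=LEXICON):
    """Split a compound into known parts, or return None if no full split exists."""
    word = word.lower()
    if word in lexicon:
        return [word]
    for split in range(len(word) - 1, 0, -1):  # try the longest head first
        head = word[:split]
        for linker in LINKERS:
            if linker and not head.endswith(linker):
                continue
            stem = head[: len(head) - len(linker)]
            if stem in lexicon:
                rest = decompose(word[split:], lexicon)
                if rest is not None:
                    return [stem] + rest
    return None

print(decompose("regeringsförklaring"))
# ['regering', 'förklaring']
```

Each part can then be looked up separately, so that compounds missing from the dictionary still receive a translation.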
