COUPLING AN AUTOMATIC DICTATION SYSTEM WITH A GRAMMAR CHECKER

Jean-Pierre CHANOD, Marc EL-BEZE, Sylvie GUILLEMIN-LANNE


IBM France, Paris Scientific Center

Automatic dictation systems (ADS) are nowadays powerful and reliable. However, some inadequacies of the underlying models still cause errors. In this paper, we are essentially interested in the language model implemented in the linguistic component, and we leave aside the acoustic module. More precisely, we aim at improving this linguistic model by coupling the ADS with a syntactic parser, able to diagnose and correct grammatical errors. We describe the characteristics of such a coupling, and show how the performance of the ADS improves with the actual coupling realized for French between the Tangora ADS and the grammar checker developed at the IBM France Scientific Center.

Description of the Tangora system

The Tangora system is implemented on an IBM PS/2 or IBM RS/6000 personal computer. A vocal I/O card is added, as well as a specialized card equipped with two micro-processors, which provide the power needed by the decoding algorithms. The programs are written in assembly or C.

The multi-lingual aspect of the Tangora system (DeGennaro 91) constitutes a major asset. Indeed, it was initially conceived for English (Averbuch, 87) by the F. Jelinek team (IBM T. J. Watson Research Center), but it has since been adapted to process Italian, German and French inputs. As a whole, the average error rate is close to 5%. But problems specific to each language require adapted solutions.

The user is required to train the system by uttering 100 sentences during an enrollment phase, and to leave slight pauses between two words. For the French system, liaisons are currently prohibited.

Architecture of the system

The voice signal is submitted to a chain of signal processing, in order to extract acoustic parameters from the sound wave. Thus, the data flow is reduced from 30,000 to 100 bytes per second. Two passes of acoustic evaluation are performed: a relatively coarse pass (so-called Fast Match) selects a first list of candidate words (around 500 words); this list is further reduced thanks to the language model (see below), so that only a small number of remaining candidates are submitted to a second, more precise, acoustic pass (so-called Detailed Match). Storage constraints as well as the methods used to provide the language model explain why the size of the dictionary is limited to about 20,000 entries.

The decoding algorithm

This algorithm determines the most likely uttered sequence of words. It works from left to right by combining the various scores estimated by the acoustic and linguistic models, according to a so-called stack decoding strategy. At this stage, the elementary operation consists in expanding the best existing hypothesis which is not yet expanded, i.e. it consists in keeping the sentence segment which, followed by the contemplated current word, is rated with the highest likelihood.

Methods

If one formulates the problem of speech recognition according to an information theory approach, one naturally chooses probabilistic models among all available language models (Jelinek, 76). The trigram (Cerf, 90), triPOS [1] (Derouault, 84), or trilemma (Derouault, 90) models offer ways of estimating the probability of any sequence of words. For instance, the trigram model is given by the formula:

$$P(W_1^n) = P(w_1) \times P(w_2 \mid w_1) \times \prod_{i=3}^{n} P(w_i \mid w_{i-2}, w_{i-1})$$
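As an illustration of the trigram formula, the following minimal sketch estimates the probabilities by relative frequency on a toy corpus and scores a sentence. The function names (`train_counts`, `sentence_probability`), the toy corpus and the absence of smoothing are assumptions made for this example only; they are not taken from the Tangora system, where unseen events would have to be smoothed.

```python
from collections import Counter

def train_counts(sentences):
    """Count unigrams, bigrams and trigrams over tokenized sentences."""
    uni, bi, tri = Counter(), Counter(), Counter()
    for words in sentences:
        for i, w in enumerate(words):
            uni[w] += 1
            if i >= 1:
                bi[(words[i - 1], w)] += 1
            if i >= 2:
                tri[(words[i - 2], words[i - 1], w)] += 1
    return uni, bi, tri

def sentence_probability(words, uni, bi, tri):
    """P(w_1..w_n) = P(w_1) * P(w_2|w_1) * prod_{i>=3} P(w_i|w_{i-2}, w_{i-1}).

    Plain relative frequencies without smoothing (an assumption of this
    sketch): any unseen word, bigram or trigram yields probability 0.
    """
    if not words:
        return 1.0
    total = sum(uni.values())
    p = uni[words[0]] / total if total else 0.0
    if len(words) >= 2:
        denom = uni[words[0]]
        p *= bi[(words[0], words[1])] / denom if denom else 0.0
    for i in range(2, len(words)):
        denom = bi[(words[i - 2], words[i - 1])]
        p *= tri[(words[i - 2], words[i - 1], words[i])] / denom if denom else 0.0
    return p

# Hypothetical three-sentence training corpus.
corpus = [
    ["le", "chat", "dort"],
    ["le", "chat", "mange"],
    ["le", "chien", "dort"],
]
uni, bi, tri = train_counts(corpus)
p = sentence_probability(["le", "chat", "dort"], uni, bi, tri)
print(p)  # (3/9) * (2/3) * (1/2)
```

Because each word is conditioned on the two preceding words only, the score cannot reflect agreement between words that lie further apart.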

The analysis of decoding errors shows that half of them are due to the acoustic model, the other half being associated with the language model. Actually, the number of homophones being quite high (2.6) in an inflected language such as French, it is clear that no acoustic model, as perfect as it may be, can produce a satisfactory decoding without the support of a language model.

[1] Model based on triplets of parts of speech (POS).

ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992    940    PROC. OF COLING-92, NANTES, AUG. 23-28, 1992

Power and limitations of probabilistic language models

Probabilistic language models are powerful enough to considerably reduce ambiguities that the acoustic model alone cannot solve. However, they suffer from localized imperfections that are inherent in their formulation. This is clearly shown by testing a probabilistic model on the lattice formed by the set of the homophones of the words of every sentence. The decoding obtained by searching for the maximum likelihood path (Cerf, 91) gives an error rate close to 3%, thus showing some of the inadequacies of the probabilistic language models.

Besides, and again for reliability reasons, statistics need to be gathered from large learning corpora (tens or even hundreds of millions of words). In spite of all the preliminary cleaning that may be done (automatic correction of typos, tripled consonants for instance), such a huge corpus contains a certain number of grammatical errors, which introduce noise in the model.

Probabilistic estimations are produced by counting triplets of words or grammatical classes. In any of the trigram, triPOS or trilemma models, a word is generally predicted according to the two preceding words, classes or lemmas only. However, grammatical rules may apply to larger frames. Not only do the rules often apply to words located outside the window used by the probabilistic model, but the grammatically significant words may also be found either before or after the word to be predicted. Let us mention, as illustrations, some phenomena for which the probabilistic model does not fit:

• Adverbs and complements constitute an obstacle to the transfer of information on gender, number and person, while this information is needed to choose between different homophones, as in:
  La COMMISSION chargée d'établir un plan de soutien global aux populations des territoires occupés s'est RÉUNIE dimanche.
• Appositions and interpolated clauses increase the distance between elements which must agree:
  Plusieurs PARTIS d'opposition de gauche, notamment le parti communiste, PARTAGENT ce point de vue.
• Predicting a word thanks to the preceding words does not allow the system to appropriately control person agreement when the subject follows the verb. Example:
  Que sont DEVENUS les principaux PROTAGONISTES de la victoire du onze novembre?
• Moreover, some confusions due to homophony induce changes of grammatical category, which require a complete interpretation of the sentence to be properly diagnosed, as in "et"/"est" (conjunction/verb) or "à"/"a" (preposition/verb).

Coupling the ADS with the grammar checker

To bring a solution to the problems described above, we propose to perform a grammatical analysis after the decoding operation. The grammatical analysis applies to the best of the hypotheses selected by the ADS. It serves as a basis to diagnose grammatical errors and to suggest corrections [2].

[2] A similar approach was tested in English, but only to detect grammatically incorrect sentences (Bellegarda 92).

The syntactic parser must prove powerful and reliable enough to effectively improve the performance of the ADS. It must provide a broad coverage, in order to cope with a large variety of texts, the source and the domain of which are not known in advance. It must also compute a global analysis of the sentence in order to make up for the deficiencies of the probabilistic model.

Description of the syntactic parser

The syntactic parser we use meets the requirements described above (Chanod 91). It is actually conceived to provide the global syntactic analysis of extremely diversified texts.

It is based on an original linguistic strategy developed by Karen Jensen for US English (Heidorn 82, Jensen 86). The parser initially
computes a syntactic sketch, which represents the likeliest syntactic surface structure of the sentence; at this stage, such phenomena as coordinations, ellipses and interpolated clauses, if not totally resolved, do not block the parsing. The analysis is based on the so-called relaxed approach, which consists in rejecting linguistic constraints which, as pertinent as they may be in descriptive linguistics, are rarely satisfied stricto sensu in the surface structures of free texts. This strategy broadens the coverage of the grammar and also allows the parser to deal with erroneous texts.

Architecture of the parser

The system is written in PLNLP (Programming Language for Natural Language Processing, G. Heidorn, 72). It includes:

• A morphologic dictionary (50,000 lemmas plus their inflection tables) [3],
• A morpho-syntactic dictionary, which describes the sub-categorizations attached to each lemma,
• A set of more than 300 PLNLP production rules, which produce the syntactic sketches,
• A set of procedures built to re-interpret the syntactic sketches and to diagnose errors,
• A form generator, which provides corrected forms.

[3] These 50,000 lemmas produce about 350,000 inflected forms, which largely exceeds the 20,000 forms used by the Tangora system.

Indeed, some other techniques are also used. Strong syntactic constraints are relaxed during a second pass; this allows the system to detect errors which induce major syntactic changes (for instance the confusion "et/est"), while forbidding undesired or too numerous parses. Fitted parses are computed in case the global analysis fails (Jensen, 83) and multiple parses are ranked thanks to specific procedures (Heidorn, 76). This last point allows the system to automatically select the strongest hypothesis, according to the linguistic features (including the grammar errors) of the syntactic trees.

Adaptation of the parser to the ADS

As mentioned above, many grammatical errors in written French are actually caused by homophones (gender, number agreement, confusion between infinitive and past participle, "chantez/chanter", "et/est", etc.). The parser, initially built for written French, is thus well prepared to detect errors produced by an ADS.

It can however be adapted to the specific needs of the ADS, by adding specific procedures (detection of ill-recognized frozen phrases, etc.), and by filtering out non-homophonic corrections, or corrections which do not belong to the list of candidates initially proposed by the ADS.

Indeed, post-processing procedures are largely used to diagnose errors after the syntactic tree has been computed. This offers the immense advantage of making the system evolutionary: it can be easily modified, in order to improve the scope of the detections. This made the adaptation of the grammar checker to the ADS quite straightforward.

Description of the processing chain

In case of the ADS, the coupling is done by a simple call to the parser for each sentence. In case of the homophone scheme, the diagram of the processing chain is shown in Figure 1.
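The per-sentence coupling, combined with the filtering mentioned in the section on the adaptation of the parser, can be sketched as follows. This is only an illustration under stated assumptions: `couple`, `parse_and_correct`, `homophones` and `candidates` are hypothetical names, the toy checker stands in for the PLNLP parser (whose actual interface is not described here), and keeping a correction only when both conditions hold is one possible reading of the filtering.

```python
from typing import Callable, Dict, List, Set, Tuple

Correction = Tuple[int, str]  # (word position, suggested replacement)

def couple(decoded: List[str],
           parse_and_correct: Callable[[List[str]], List[Correction]],
           homophones: Dict[str, Set[str]],
           candidates: List[Set[str]]) -> List[str]:
    """Apply grammar-checker corrections to one decoded sentence.

    A suggestion is kept only if it is a homophone of the decoded word and
    belongs to the candidate list proposed by the decoder at that position;
    every other suggestion is filtered out.
    """
    result = list(decoded)
    for pos, suggestion in parse_and_correct(decoded):
        is_homophone = suggestion in homophones.get(decoded[pos], set())
        was_candidate = suggestion in candidates[pos]
        if is_homophone and was_candidate:
            result[pos] = suggestion
    return result

# Hypothetical data: a decoded sentence, a stand-in checker, a homophone
# table and the per-position candidate lists of the decoder.
decoded = ["ce", "document", "est", "a", "faire", "signé"]

def toy_checker(words: List[str]) -> List[Correction]:
    return [(3, "à"), (5, "signer"), (1, "documents")]

homophones = {
    "a": {"à"},
    "signé": {"signer", "signée"},
    "document": {"documents"},
}
candidates = [
    {"ce", "se"}, {"document"}, {"est", "et"},
    {"a", "à"}, {"faire"}, {"signé", "signer"},
]

corrected = couple(decoded, toy_checker, homophones, candidates)
print(" ".join(corrected))
```

In this toy run the suggestion "documents" is discarded because the decoder never proposed it at that position, while "à" and "signer" pass both filters.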
Figure 1. Coupling Diagram

Experiments

Our tests were carried out on the following texts:

corp1    AFP dispatches (1000 words)
corp2    AFP dispatches (3221 words)
corp3    e-mail notes (1909 words)
corp4    grammar books (1337 words)

Only the CORP1 file was obtained through a real decoding; the other corpora were processed by automatically generating their homophones.

Results

The experiments were made at an early stage of the coupling. They could certainly be improved with more extensive tests, as the adaptation of the grammar checker to the ADS would gain in accuracy.

Percentage of erroneous words left uncorrected

         LM without parser    LM with parser
corp1    4.5%                 3.6%
corp2    4.6%                 3.6%
corp3    6.3%                 6.1% [4]
corp4    7%                   5.8%

[4] The bad results of the CORP3 file are due in great part to the difficulties of e-mail, which make parsing less accurate.

Given the high performance of the ADS and the difficulty of improving it within the frame of the probabilistic model, the improvement of around one percentage point observed on three of the four test corpora is very promising.

Samples of corrected sentences:

Example 1: Subject-predicate, attributive adjective-noun, subject-verb agreement

Les conditions sont très durs mais le pays, devenus indéfendable, les acceptent.

After parsing, the suggested correction is:

Les conditions sont très DURES mais le pays, DEVENU indéfendable, les ACCEPTE.

Example 2: Subject-verb agreement; confusion between the conjunction "et" and the verbal form "est":

Le fait que le héros de chacun des trois romans soient différents et révélateurs.

After parsing, the suggested correction is:

Le fait que le héros de chacun des trois romans SOIT DIFFÉRENT EST RÉVÉLATEUR.

Example 3: Confusion between the verbal form "a" and the preposition "à"; confusion between the past participle and the infinitive form of the corresponding verb.

Ce document est a faire signé recto et verso par le propriétaire et par le gestionnaire.

After parsing, the suggested correction is:

Ce document est à faire SIGNER recto et verso par le propriétaire et par le gestionnaire.

Conclusion

Coupling the ADS and the syntactic parser meets the initially assigned objectives quite satisfactorily: broad coverage of the texts parsed by the grammar, meaningful percentage of justified corrections, adequacy of the syntactic parser to the types of errors specifically generated by the decoder.

The tests that we performed on various corpora are all the more encouraging, since a great deal of the remaining errors result from semantic ambiguities that no grammar checker based upon a syntactic analysis of the sentence can detect.
L'âge de la MER le plus fréquent à l'accouchement est de vingt-six ans.

A subsidiary advantage of the coupling would be to detect errors that would not be produced by the ADS but by the speaker him/herself (punctuation, stylistic infelicities, mood of subordinate clauses, etc.). We may contemplate not only transcribing the words of a speaker as accurately as possible, but also offering him/her a stylistic aid.

References

Averbuch A. et al., 1987: "Experiments with the TANGORA 20,000 word Speech Recognizer", Proceedings of ICASSP, Dallas, pp. 701-704.

Bellegarda J., Braden-Harder L., Jensen K., Kanevsky D., Zadrozny W., 1992: "Post-recognizer language processing: applications to speech, handwriting", submitted to EUSIPCO'92.

Cerf-Danon H., de La Noue P., Diringer L., El-Bèze M., Marcadet J.C., 1990: "A 20,000 words, automatic speech recognizer. Adaptation to French of the US TANGORA system", Nato 1990.

Cerf-Danon H., El-Bèze M., 1991: "Three different Probabilistic Language Models: Comparison and Combination", ICASSP 1991.

Chanod J-P., 1991: "Analyse automatique d'erreurs: stratégie linguistique et computationnelle", Colloque Informatique et Langue naturelle, 23-24 janvier 91, Liana Univ. de Nantes.

DeGennaro S., Cerf-Danon H., Ferretti M., Gonzales J., Keppel E., 1991: "Tangora - a large vocabulary speech recognition system for five languages", EuroSpeech 1991, Genoa.

Derouault A-M., Mérialdo B., 1984: "Language modeling at the syntactic level", 7th International Conference on Pattern Recognition, August 1984, Montreal.

Derouault A-M., El-Bèze M., 1990: "A Morphological Model for Large Vocabulary Speech Recognition", ICASSP 1990.

Heidorn G.E., 1972: "Natural Language Inputs to a Simulation Programming System", Ph.D. dissertation, Yale University.

Heidorn G.E., 1976: "An Easily Computed Metric for Ranking Alternative Parses", presented at the Fourteenth Annual Meeting of the ACL, San Francisco, October 1976.

Heidorn G.E., Jensen K., Miller L.A., Byrd R.J., Chodorow M.S., 1982: "The EPISTLE Text-Critiquing System", IBM Systems Journal, vol. 21, n°3.

Jelinek F., 1976: "Continuous Speech Recognition by Statistical Methods", Proceedings of the IEEE, Vol. 64, April 1976.

Jensen K., Heidorn G.E., 1983: "The Fitted Parse: 100% Parsing Capability in a Syntactic Grammar of English", Proc. Conf. on Applied Natural Language Processing, Santa Monica, California, pp. 93-98.

Jensen K., 1986: "A Broad-Coverage Computational Syntax of English", unpublished documents, IBM T.J. Watson Research Center, Yorktown Heights, N.Y.
