Classification of Discourse Coherence Relations
Proceedings of the 7th SIGdial Workshop on Discourse and Dialogue, pages 117–125,
Sydney, July 2006. © 2006 Association for Computational Linguistics
probabilistic models for identifying clausal elementary discourse units and generating discourse trees at the sentence level. These are built using lexical and syntactic information obtained from mapping the discourse-annotated sentences in the RST Corpus (Carlson et al., 2003) to their corresponding syntactic trees in the Penn Treebank.

Within SDRT, Baldridge and Lascarides (2005b) also take a data-driven approach to the tasks of segmentation and identification of discourse relations. They create a probabilistic discourse parser based on dialogues from the Redwoods Treebank, annotated with SDRT rhetorical relations (Baldridge and Lascarides, 2005a). The parser is grounded on headed tree representations and dialogue-based features, such as turn-taking and domain-specific goals.

In the Penn Discourse TreeBank (PDTB) (Webber et al., 2005), the identification of discourse structure is approached independently of any linguistic theory by using discourse connectives rather than abstract rhetorical relations. PDTB assumes that connectives are binary discourse-level predicates conveying a semantic relationship between two abstract object-denoting arguments. The set of semantic relationships can be established at different levels of granularity, depending on the application. Miltsakaki et al. (2005) propose a first step at disambiguating the sense of a small subset of connectives (since, while, and when) at the paragraph level. They aim at distinguishing between the temporal, causal, and contrastive uses of the connective, by means of syntactic features derived from the Penn Treebank and a MaxEnt model.

3 GraphBank

3.1 Coherence Relations

For annotating the discourse relations in text, Wolf and Gibson (2005) assume a clause-unit-based definition of a discourse segment. They define four broad classes of coherence relations:

(1) 1. Resemblance: similarity (par), contrast (contr), example (examp), generalization (gen), elaboration (elab);
    2. Cause-effect: explanation (ce), violated expectation (expv), condition (cond);
    3. Temporal (temp): essentially narration;
    4. Attribution (attr): reporting and evidential contexts.

The textual evidence contributing to identifying the various resemblance relations is heterogeneous at best, where, for example, similarity and contrast are associated with specific syntactic constructions and devices. For each relation type, there are well-known lexical and phrasal cues:

(2) a. similarity: and;
    b. contrast: by contrast, but;
    c. example: for example;
    d. elaboration: also, furthermore, in addition, note that;
    e. generalization: in general.

However, just as often, the relation is encoded through lexical coherence, via semantic association, sub/supertyping, and accommodation strategies (Asher and Lascarides, 2003).

The cause-effect relations include conventional causation and explanation relations (captured as the label ce), such as (3) below:

(3) cause: SEG1: crash-landed in New Hope, Ga.,
    effect: SEG2: and injuring 23 others.

It also includes conditionals and violated expectations, such as (4).

(4) cause: SEG1: an Eastern Airlines Lockheed L-1011 en route from Miami to the Bahamas lost all three of its engines,
    effect: SEG2: and land safely back in Miami.

The two last coherence relations annotated in GraphBank are temporal (temp) and attribution (attr) relations. The first corresponds generally to the occasion (Hobbs, 1985) or narration (Asher and Lascarides, 2003) relation, while the latter is a general annotation over attribution of source.[2]

[2] There is one non-rhetorical relation, same, which identifies discontiguous segments.

3.2 Discussion

The difficulty of annotating coherence relations consistently has been previously discussed in the literature. In GraphBank, as in any corpus, there are inconsistencies that must be accommodated for learning purposes. As perhaps expected, annotation of attribution and temporal sequence relations was consistent if not entirely complete. The most serious concern we had from working with
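The lexical and phrasal cues in (2) lend themselves to a simple lookup-style feature extractor. The sketch below is our own illustration (the phrase table and function are hypothetical, not part of the GraphBank tooling); it flags which resemblance relations a segment's cue phrases suggest.

```python
# Map the well-known lexical/phrasal cues of Example (2) to the
# resemblance relation they typically signal (illustrative table).
CUE_PHRASES = {
    "and": "similarity",
    "by contrast": "contrast",
    "but": "contrast",
    "for example": "example",
    "also": "elaboration",
    "furthermore": "elaboration",
    "in addition": "elaboration",
    "note that": "elaboration",
    "in general": "generalization",
}

def cue_candidates(segment: str) -> set:
    """Return the set of resemblance relations whose cue phrases
    occur in the (lowercased) discourse segment."""
    text = segment.lower()
    return {rel for cue, rel in CUE_PHRASES.items() if cue in text}

# A segment opening with "By contrast, ..." suggests a contrast relation.
print(cue_candidates("By contrast, the airline lost money."))
```

Naive substring matching suffices for a sketch; a real extractor would tokenize the segment and restrict ambiguous cues such as and to segment-initial position.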
the corpus derives from the conflation of diverse and semantically contradictory relations among the cause-effect annotations. For canonical causation pairs (and their violations) such as those above, (3) and (4), the annotation was expectedly consistent and semantically appropriate. Problems arise, however, when examining the treatment of purpose clauses and rationale clauses. These are annotated, according to the guidelines, as cause-effect pairings. Consider (5) below.

(5) cause: SEG1: to upgrade lab equipment in 1987.
    effect: SEG2: The university spent $30,000

This is both counter-intuitive and temporally false. The rationale clause is annotated as the cause, and the matrix sentence as the effect. Things are even worse with purpose clause annotation. Consider the following example discourse:[3]

(6) John pushed the door to open it, but it was locked.

[3] This specific example was brought to our attention by Alex Lascarides (p.c.).

This would have the following annotation in GraphBank:

(7) cause: to open it
    effect: John pushed the door.

The guideline reflects the appropriate intuition that the intention expressed in the purpose or rationale clause must precede the implementation of the action carried out in the matrix sentence. In effect, this would be something like

(8) [INTENTION TO SEG1] CAUSES SEG2

The problem here is that the cause-effect relation conflates real event-causation with telos-directed explanations, that is, action directed towards a goal by virtue of an intention. Given that these are semantically disjoint relations, which are furthermore triggered by distinct grammatical constructions, we believe this conflation should be undone and characterized as two separate coherence relations. If the relations just discussed were annotated as telic-causation, the features encoded for subsequent training of a machine learning algorithm could benefit from distinct syntactic environments. We would like to automatically generate temporal orderings from cause-effect relations from the events directly annotated in the text. Splitting these classes would preserve the soundness of such a procedure, while keeping them lumped generates inconsistencies.

4 Data Preparation and Knowledge Sources

In this section we describe the various linguistic processing components used for classification and identification of GraphBank discourse relations.

4.1 Pre-Processing

We performed tokenization, sentence tagging, part-of-speech tagging, and shallow syntactic parsing (chunking) over the 135 GraphBank documents. Part-of-speech tagging and shallow parsing were carried out using the Carafe implementation of Conditional Random Fields for NLP (Wellner and Vilain, 2006) trained on various standard corpora. In addition, full sentence parses were obtained using the RASP parser (Briscoe and Carroll, 2002). Grammatical relations derived from a single top-ranked tree for each sentence (headword, modifier, and relation type) were used for feature construction.

4.2 Modal Parsing and Temporal Ordering of Events

We performed both modal parsing and temporal parsing over events. Identification of events was performed using EvITA (Saurí et al., 2006), an open-domain event tagger developed under the TARSQI research framework (Verhagen et al., 2005). EvITA locates and tags all event-referring expressions in the input text that can be temporally ordered. In addition, it identifies those grammatical features implicated in temporal and modal information of events; namely, tense, aspect, polarity, modality, as well as the event class. Event annotation follows version 1.2.1 of the TimeML specifications.[4]

[4] See http://www.timeml.org.

Modal parsing in the form of identifying subordinating verb relations and their type was performed using SlinkET (Saurí et al., 2006), another component of the TARSQI framework. SlinkET identifies subordination constructions introducing modality information in text; essentially, infinitival and that-clauses embedded by factive predicates (regret), reporting predicates (say), and predicates referring to events of attempting (try), volition (want), command (order), among others.
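The per-event grammatical features identified in this step (tense, aspect, polarity, modality, and event class) can be pictured as one record per event-referring expression. The dataclass below is a hypothetical stand-in for illustration, not EvITA's actual output format; the two instances mirror the SlinkET feature examples given for Example 5 in Table 1.

```python
from dataclasses import dataclass

# Hypothetical record of the TimeML-style grammatical features that
# an event tagger identifies for each event-referring expression.
@dataclass
class Event:
    text: str         # the event-referring expression, e.g. "upgrade"
    tense: str        # e.g. "INFINITIVE", "PAST"
    aspect: str       # e.g. "NONE", "PROGRESSIVE"
    polarity: str     # "POS" or "NEG"
    modality: str     # e.g. "NONE", "would"
    event_class: str  # TimeML class, e.g. "OCCURRENCE", "REPORTING"

# The two events of Example 5, matching the SlinkET feature examples
# in Table 1 (seg1-tense-is-infinitive, seg2-tense-is-past):
upgrade = Event("upgrade", "INFINITIVE", "NONE", "POS", "NONE", "OCCURRENCE")
spent = Event("spent", "PAST", "NONE", "POS", "NONE", "OCCURRENCE")
```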
SlinkET annotates these subordination contexts and classifies them according to the modality information introduced by the relation between the embedding and embedded predicates, which can be of any of the following types:

- factive: The embedded event is presupposed or entailed as true (e.g., John managed to leave the party).
- counter-factive: The embedded event is presupposed or entailed as false (e.g., John was unable to leave the party).
- evidential: The subordination is introduced by a reporting or perception event (e.g., Mary saw/told that John left the party).
- negative evidential: The subordination is a reporting event conveying negative polarity (e.g., Mary denied that John left the party).
- modal: The subordination creates an intensional context (e.g., John wanted to leave the party).

Temporal orderings between events were identified using a Maximum Entropy classifier trained on the TimeBank 1.2 and Opinion 1.0a corpora. These corpora provide annotated events along with temporal links between events. The link types included: before (occurs before), includes (occurs sometime during), simultaneous (occurs over the same interval as), begins (begins at the same time as), and ends (ends at the same time as).

4.3 Lexical Semantic Typing and Coherence

Lexical semantic types, as well as a measure of lexical similarity or coherence between words in two discourse segments, would appear to be useful for assigning an appropriate discourse relationship. Resemblance relations, in particular, require similar entities to be involved, and lexical similarity here serves as an approximation to definite nominal coreference. Identification of lexical relationships between words across segments appears especially useful for cause-effect relations. In example (3) above, determining a (potential) cause-effect relationship between crash and injury is necessary to identify the discourse relation.

4.3.1 Corpus-based Lexical Similarity

Lexical similarity was computed using the Word Sketch Engine (WSE) (Killgarrif et al., 2004) similarity metric applied over the British National Corpus. The WSE similarity metric implements the word similarity measure based on grammatical relations as defined in (Lin, 1998) with minor modifications.

4.3.2 The Brandeis Semantic Ontology

As a second source of lexical coherence, we used the Brandeis Semantic Ontology or BSO (Pustejovsky et al., 2006). The BSO is a lexically-based ontology in the Generative Lexicon tradition (Pustejovsky, 2001; Pustejovsky, 1995). It focuses on contextualizing the meanings of words and does this by a rich system of types and qualia structures. For example, if one were to look up the phrase RED WINE in the BSO, one would find its type is WINE and its type's type is ALCOHOLIC BEVERAGE. The BSO contains ontological qualia information (shown below).

    wine
      CONSTITUTIVE       Alcohol
      HAS ELEMENT        Alcohol
      MADE OF            Grapes
      INDIRECT TELIC     drink activity
      INDIRECT AGENTIVE  make alcoholic beverage

Using the BSO, one is able to find out where in the ontological type system WINE is located, what RED WINE's lexical neighbors are, and its full set of part of speech and grammatical attributes. Other words have a different configuration of annotated attributes depending on the type of the word.

We used the BSO typing information to semantically tag individual words in order to compute lexical paths between word pairs. Such lexical associations are invoked when constructing cause-effect relations and other implicatures (e.g. between crash and injure in Example 3).

The type system paths provide a measure of the connectedness between words. For every pair of head words in a GraphBank document, the shortest path between the two words within the BSO is computed. Currently, this metric only uses the type system relations (i.e., inheritance), but preliminary tests show that including qualia relations as connections is promising. We also computed the earliest common ancestor of the two words. These metrics are calculated for every possible sense of the word within the BSO.
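The shortest-path computation over the type system can be sketched as a breadth-first search over inheritance edges, treated as undirected so that two words can connect through a common ancestor. The toy hierarchy below is invented for illustration; the actual BSO type system is far larger.

```python
from collections import deque

# Toy fragment of a type inheritance graph (child -> parents),
# invented for illustration; the real BSO is much larger.
PARENTS = {
    "Wine": ["AlcoholicBeverage"],
    "Beer": ["AlcoholicBeverage"],
    "AlcoholicBeverage": ["Beverage"],
    "Beverage": ["PhysicalObject"],
}

def shortest_path_len(a: str, b: str) -> int:
    """Length of the shortest path between two types, treating
    inheritance edges as undirected. Returns -1 if unconnected."""
    # Build an undirected adjacency list from the child -> parent map.
    adj = {}
    for child, parents in PARENTS.items():
        for p in parents:
            adj.setdefault(child, set()).add(p)
            adj.setdefault(p, set()).add(child)
    frontier, dist = deque([a]), {a: 0}
    while frontier:
        node = frontier.popleft()
        if node == b:
            return dist[node]
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                frontier.append(nbr)
    return -1

# Wine and Beer meet at their common ancestor AlcoholicBeverage,
# giving the path Wine - AlcoholicBeverage - Beer.
print(shortest_path_len("Wine", "Beer"))
```

Adding qualia relations as extra edges, as the preliminary tests above suggest, would only change the adjacency construction; the search itself stays the same.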
The use of the BSO is advantageous compared to other frameworks such as Wordnet because it focuses on the connection between words and their semantic relationship to other items. These connections are captured in the qualia information and the type system. In Wordnet, qualia-like information is only present in the glosses, and the glosses do not provide a definite semantic path between any two lexical items. Although synonymous in some ways, synset members often behave differently in many situations, grammatical or otherwise.

5 Classification Methodology

This section describes in detail how we constructed features from the various knowledge sources described above and how they were encoded in a Maximum Entropy model.

5.1 Maximum Entropy Classification

For our experiments of classifying relation types, we used a Maximum Entropy classifier[5] in order to assign labels to each pair of discourse segments connected by some relation. For each instance (i.e. pair of segments), the classifier makes its decision based on a set of features. Each feature can query some arbitrary property of the two segments, possibly taking into account external information or knowledge sources. For example, a feature could query whether the two segments are adjacent to each other, whether one segment contains a discourse connective, whether they both share a particular word, whether a particular syntactic construction or lexical association is present, etc. We make strong use of this ability to include very many, highly interdependent features[6] in our experiments. Besides binary-valued features, feature values can be real-valued and thus capture frequencies, similarity values, or other scalar quantities.

[5] We use the Maximum Entropy classifier included with Carafe, available at http://sourceforge.net/projects/carafe
[6] The total maximum number of features occurring in our experiments is roughly 120,000.

5.2 Feature Classes

We grouped the features together into various feature classes based roughly on the knowledge source from which they were derived. Table 1 describes the various feature classes in detail and provides some actual example features from each class for the segment pair described in Example 5 in Section 3.2.

6 Experiments and Results

In this section we provide the results of a set of experiments focused on the task of discourse relation classification. We also report initial results on relation identification with the same set of features as used for classification.

6.1 Discourse Relation Classification

The task of discourse relation classification involves assigning the correct label to a pair of discourse segments.[7] The pair of segments to assign a relation to is provided (from the annotated data). In addition, we assume, for asymmetric links, that the nucleus and satellite are provided (i.e., the direction of the relation). For the elaboration relations, we ignored the annotated subtypes (person, time, location, etc.). Experiments were carried out on the full set of relation types as well as the simpler set of coarse-grained relation categories described in Section 3.1.

[7] Each segment may in fact consist of a sequence of segments. We will, however, use the term segment loosely to refer to segments or segment sequences.

The GraphBank contains a total of 8755 annotated coherence relations.[8] For all the experiments in this paper, we used 8-fold cross-validation with 12.5% of the data used for testing and the remainder used for training for each fold. Accuracy numbers reported are the average accuracies over the 8 folds. Variance was generally low, with a standard deviation typically in the range of 1.5 to 2.0. We note here also that the inter-annotator agreement between the two GraphBank annotators was 94.6% for relations when they agreed on the presence of a relation. The majority class baseline (i.e., the accuracy achieved by calling all relations elaboration) is 45.7% (and 66.57% with the collapsed categories). These are the upper and lower bounds against which these results should be based.

[8] All documents are doubly annotated; we used the annotator1 annotations.

To ascertain the utility of each of the various feature classes, we considered each feature class independently by using only features from a single class in addition to the Proximity feature class, which serves as a baseline. Table 2 illustrates the result of this experiment.

We performed a second set of experiments, shown in Table 3, that is essentially the converse of the previous batch. We take the union of all the
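The segment-pair queries described in Section 5.1 (proximity, adjacency, shared words, segment-initial cue words) can be sketched as a feature extractor producing the binary and real-valued values a Maximum Entropy model consumes. The function and feature names below are illustrative, loosely following the examples in Table 1, not the paper's actual feature inventory.

```python
def extract_features(seg1_words, seg2_words, seg1_index, seg2_index):
    """Return a dict of binary (1.0) and real-valued features for a
    pair of discourse segments, in the style of Section 5.1.
    Feature names are illustrative."""
    feats = {}
    # Proximity features: raw distance plus binned binary features.
    dist = abs(seg2_index - seg1_index)
    feats["distance"] = float(dist)
    if dist == 1:
        feats["adjacent"] = 1.0
    if dist < 3:
        feats["dist-less-than-3"] = 1.0
    # Lexical overlap: one binary feature per shared word, plus a ratio.
    shared = ({w.lower() for w in seg1_words} &
              {w.lower() for w in seg2_words})
    for w in shared:
        feats[f"shared-word-{w}"] = 1.0
    feats["overlap-ratio"] = len(shared) / max(len(seg1_words) + len(seg2_words), 1)
    # Segment-initial cue words (cf. feature class C in Table 1).
    feats[f"first1-is-{seg1_words[0].lower()}"] = 1.0
    feats[f"first2-is-{seg2_words[0].lower()}"] = 1.0
    return feats

# The segment pair of Example 5:
feats = extract_features(["to", "upgrade", "lab", "equipment"],
                         ["The", "university", "spent", "$", "30,000"],
                         seg1_index=4, seg2_index=5)
```

A real feature set would add the knowledge-source classes of Table 1 (BSO paths, WSE similarities, event pairs, SlinkET attributes, grammatical relations, temporal links) on top of this skeleton.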
Table 1: Feature classes, their descriptions, and example feature instances for Example 5 in Section 3.2.

C: Words appearing at the beginning and end of the two discourse segments; these are often important discourse cue words. Examples: first1-is-to; first2-is-The

P: Proximity and direction between the two segments (in terms of segments). Binary features such as distance less than 3 or distance greater than 10 were used in addition to the distance value itself; the distance from the beginning of the document uses a similar binning approach. Examples: adjacent; dist-less-than-3; dist-less-than-5; direction-reverse; samesentence

BSO: Paths in the BSO up to length 10 between non-function words in the two segments. Example: ResearchLab - EducationalActivity - University

WSE: WSE word-pair similarities between words in the two segments, binned as (> 0.05, > 0.1, > 0.2). We also computed sentence similarity as the sum of the word similarities divided by the sum of their sentence lengths. Examples: WSE-greater-than-0.05; WSE-sentence-sim = 0.005417

E: Event head words and event head word pairs between segments, as identified by EvITA. Examples: event1-is-upgrade; event2-is-spent; event-pair-upgrade-spent

SlinkET: Event attributes, subordinating links and their types between event pairs in the two segments. Examples: seg1-class-is-occurrence; seg2-class-is-occurrence; seg1-tense-is-infinitive; seg2-tense-is-past; seg2-modal-seg1

C-E: Cue words of one segment paired with events in the other. Examples: first1-is-to-event2-is-spent; first2-is-The-event1-is-upgrade

Syntax: Grammatical dependency relations between the two segments as identified by the RASP parser. We also conjoined the relation with one or both of the headwords associated with the grammatical relation. Examples: gr-ncmod; gr-ncmod-head1-equipment; gr-ncmod-head2-spent; etc.

Tlink: Temporal links between events in the two segments. We included both the link types and the number of occurrences of those types between the segments. Example: seg2-before-seg1
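The WSE row of Table 1 bins word-pair similarities and derives a sentence-level similarity score. A minimal sketch, assuming a precomputed word-similarity lookup (the table of scores below is invented for illustration, not actual WSE output):

```python
# Invented similarity scores standing in for WSE values over the BNC.
WORD_SIM = {("crash", "injure"): 0.12, ("crash", "land"): 0.08}

def word_sim(w1: str, w2: str) -> float:
    """Symmetric lookup with 0.0 for unseen pairs."""
    return WORD_SIM.get((w1, w2), WORD_SIM.get((w2, w1), 0.0))

def wse_features(seg1_words, seg2_words):
    """Binned word-pair similarity features plus a sentence-level
    similarity: sum of word similarities divided by the sum of the
    two segment lengths (as in the WSE row of Table 1)."""
    feats = {}
    total = 0.0
    for w1 in seg1_words:
        for w2 in seg2_words:
            s = word_sim(w1, w2)
            total += s
            for threshold in (0.05, 0.1, 0.2):
                if s > threshold:
                    feats[f"WSE-greater-than-{threshold}"] = 1.0
    feats["WSE-sentence-sim"] = total / (len(seg1_words) + len(seg2_words))
    return feats

feats = wse_features(["crash", "landed"], ["injure", "others"])
```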
Relation  Precision  Recall  F-measure  Count
elab          88.72   95.31      91.90    512
attr          91.14   95.10      93.09    184
par           71.89   83.33      77.19    132
same          87.09   75.00      80.60     72
ce            78.78   41.26      54.16     63
contr         65.51   66.67      66.08     57
examp         78.94   48.39      60.00     31
temp          50.00   20.83      29.41     24
expv          33.33   16.67      22.22     12
cond          45.45   62.50      52.63      8
gen            0.00    0.00       0.00      0

Table 4: Precision, Recall and F-measure results.

6.3 Coherence Relation Identification

The task of identifying the presence of a relation is complicated by the fact that we must consider all n(n-1)/2 potential relations, where n is the number of segments. This presents a troublesome, highly-skewed binary classification problem with a high proportion of negative instances. Furthermore, some of the relations, particularly the resemblance relations, are transitive in nature (e.g., if resemblance(s1, s2) and resemblance(s2, s3) hold, then resemblance(s1, s3) holds). However, these transitive links are not provided in the GraphBank annotation; such segment pairs will therefore be presented incorrectly as negative instances to the learner, making this approach infeasible. An initial experiment considering all segment pairs, in fact, resulted in performance only slightly above the majority class baseline.

Instead, we consider the task of identifying the presence of discourse relations between segments within the same sentence. Using the same set of all features used for relation classification, performance is at 70.04% accuracy. Simultaneous identification and classification resulted in an accuracy of 64.53%. For both tasks the baseline accuracy was 58%.

6.4 Modeling Inter-relation Dependencies

Casting the problem as a standard classification problem where each instance is classified independently, as we have done, is a potential drawback. In order to gain insight into how collective, dependent modeling might help, we introduced additional features that model such dependencies: for a pair of discourse segments s1 and s2 whose relation is to be classified, we included features recording the other gold-standard relations in which the two segments participate, i.e., the values of rel(s1, sk) and rel(s2, sl) for the other segments sk and sl related to them. Adding these features improved classification accuracy to 82.3%. This improvement is fairly significant (a 6.3% reduction in error) given that this dependency information is only encoded weakly as features and not in the form of model constraints.

7 Discussion and Future Work

We view the accuracy of 81% on coherence relation classification as a positive result, though room for improvement clearly remains. An examination of the errors indicates that many of the remaining problems require making complex lexical associations, the establishment of entity and event anaphoric links and, in some cases, the exploitation of complex world-knowledge. While important lexical connections can be gleaned from the BSO, we hypothesize that the current lack of word sense disambiguation serves to lessen its utility, since lexical paths between all word senses of two words are currently used. Additional feature engineering, particularly the crafting of more specific conjunctions of existing features, is another avenue to explore further, as are automatic feature selection methods.

Different types of relations clearly benefit from different feature types. For example, resemblance relations require similar entities and/or events, indicating a need for robust anaphora resolution, while cause-effect class relations require richer lexical and world knowledge. One promising approach is a pipeline where an initial classifier assigns a coarse-grained category, followed by separately engineered classifiers designed to model the finer-grained distinctions.

An important area of future work involves incorporating additional structure in two places. First, as the experiment discussed in Section 6.4 shows, classifying discourse relations collectively shows potential for improved performance. Secondly, we believe that the tasks of 1) identifying which segments are related and 2) identifying the discourse segments themselves are probably best approached by a parsing model of discourse. This view is broadly sympathetic with the approach in (Miltsakaki et al., 2005).

We furthermore believe an extension to the GraphBank annotation scheme, with some minor changes as we advocate in Section 3.2, layered on top of the PDTB would, in our view, serve as an interesting resource and model for informational
discourse.

Acknowledgments

This work was supported in part by ARDA/DTO under grant number NBCHC040027 and MITRE Sponsored Research. Catherine Havasi is supported by NSF Fellowship #2003014891.

References

N. Asher and A. Lascarides. 2003. Logics of Conversation. Cambridge University Press, Cambridge, England.

J. Baldridge and A. Lascarides. 2005a. Annotating discourse structures for robust semantic interpretation. In Proceedings of the Sixth International Workshop on Computational Semantics, Tilburg, The Netherlands.

J. Baldridge and A. Lascarides. 2005b. Probabilistic head-driven parsing for discourse structure. In Proceedings of the Ninth Conference on Computational Natural Language Learning, Ann Arbor, USA.

T. Briscoe and J. Carroll. 2002. Robust accurate statistical annotation of general text. In Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002), pages 1499–1504, Las Palmas, Canary Islands.

L. Carlson, D. Marcu, and M. E. Okurowski. 2003. Building a discourse-tagged corpus in the framework of rhetorical structure theory. In Jan van Kuppelvelt and Ronnie Smith, editors, Current Directions in Discourse and Dialogue. Kluwer Academic Publishers.

K. Forbes, E. Miltsakaki, R. Prasad, A. Sakar, A. Joshi, and B. Webber. 2001. D-LTAG system: Discourse parsing with a lexicalized tree adjoining grammar. In Proceedings of the ESSLLI 2001 Workshop on Information Structure, Discourse Structure and Discourse Semantics.

J. Hobbs. 1985. On the coherence and structure of discourse. CSLI Technical Report 85-37, Center for the Study of Language and Information, Stanford, CA, USA.

A. Killgarrif, P. Rychly, P. Smrz, and D. Tugwell. 2004. The Sketch Engine. In Proceedings of Euralex, pages 105–116, Lorient, France.

D. Lin. 1998. Automatic retrieval and clustering of similar words. In Proceedings of COLING-ACL, Montreal, Canada.

E. Miltsakaki, N. Dinesh, R. Prasad, A. Joshi, and B. Webber. 2005. Experiments on sense annotation and sense disambiguation of discourse connectives. In Proceedings of the Fourth Workshop on Treebanks and Linguistic Theories (TLT 2005), Barcelona, Catalonia.

J. Pustejovsky, C. Havasi, R. Saurí, P. Hanks, and A. Rumshisky. 2006. Towards a Generative Lexical resource: The Brandeis Semantic Ontology. In Language Resources and Evaluation Conference, LREC 2006, Genoa, Italy.

J. Pustejovsky. 1995. The Generative Lexicon. MIT Press, Cambridge, MA.

J. Pustejovsky. 2001. Type construction and the logic of concepts. In The Language of Word Meaning. Cambridge University Press.

R. Saurí, M. Verhagen, and J. Pustejovsky. 2006. Annotating and recognizing event modality in text. In The 19th International FLAIRS Conference, FLAIRS 2006, Melbourne Beach, Florida, USA.

R. Soricut and D. Marcu. 2003. Sentence level discourse parsing using syntactic and lexical information. In Proceedings of the HLT/NAACL Conference, Edmonton, Canada.

M. Verhagen, I. Mani, R. Saurí, R. Knippen, J. Littman, and J. Pustejovsky. 2005. Automating temporal annotation within TARSQI. In Proceedings of ACL 2005.

B. Webber, A. Joshi, E. Miltsakaki, R. Prasad, N. Dinesh, A. Lee, and K. Forbes. 2005. A short introduction to the Penn Discourse TreeBank. In Copenhagen Working Papers in Language and Speech Processing.

B. Wellner and M. Vilain. 2006. Leveraging machine readable dictionaries in discriminative sequence models. In Language Resources and Evaluation Conference, LREC 2006, Genoa, Italy.

F. Wolf and E. Gibson. 2005. Representing discourse coherence: A corpus-based analysis. Computational Linguistics, 31(2):249–287.
A Appendix
A.1 Confusion Matrix
elab par attr ce temp contr same examp expv cond gen
elab 488 3 7 3 1 0 2 4 0 3 1
par 6 110 2 2 0 8 2 0 0 2 0
attr 4 0 175 0 0 1 2 0 1 1 0
ce 18 9 3 26 3 2 2 0 0 0 0
temp 6 8 2 0 5 3 0 0 0 0 0
contr 4 12 0 0 0 38 0 0 3 0 0
same 3 9 2 2 0 2 54 0 0 0 0
examp 15 1 0 0 0 0 0 15 0 0 0
expv 3 1 1 0 1 4 0 0 2 0 0
cond 3 0 0 0 0 0 0 0 0 5 0
gen 0 0 0 0 0 0 0 0 0 0 0
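Per-relation precision and recall in Table 4 follow directly from this matrix, whose rows are gold labels and columns predicted labels (the row sums match the Count column of Table 4). A quick check:

```python
# Confusion matrix from Appendix A.1: rows = gold labels,
# columns = predicted labels, in the order listed above.
LABELS = ["elab", "par", "attr", "ce", "temp", "contr",
          "same", "examp", "expv", "cond", "gen"]
MATRIX = [
    [488, 3, 7, 3, 1, 0, 2, 4, 0, 3, 1],
    [6, 110, 2, 2, 0, 8, 2, 0, 0, 2, 0],
    [4, 0, 175, 0, 0, 1, 2, 0, 1, 1, 0],
    [18, 9, 3, 26, 3, 2, 2, 0, 0, 0, 0],
    [6, 8, 2, 0, 5, 3, 0, 0, 0, 0, 0],
    [4, 12, 0, 0, 0, 38, 0, 0, 3, 0, 0],
    [3, 9, 2, 2, 0, 2, 54, 0, 0, 0, 0],
    [15, 1, 0, 0, 0, 0, 0, 15, 0, 0, 0],
    [3, 1, 1, 0, 1, 4, 0, 0, 2, 0, 0],
    [3, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

def precision_recall(label: str):
    """Percent precision and recall for one relation label.
    Guards against zero denominators (gen has no gold instances)."""
    i = LABELS.index(label)
    true_pos = MATRIX[i][i]
    predicted = sum(row[i] for row in MATRIX)  # column sum
    gold = sum(MATRIX[i])                      # row sum
    precision = 100.0 * true_pos / predicted if predicted else 0.0
    recall = 100.0 * true_pos / gold if gold else 0.0
    return precision, recall

p, r = precision_recall("elab")  # matches the elab row of Table 4
```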
[Figure: shallow parse of the Example 5 segment pair "The university spent $30,000 to upgrade lab equipment in 1987", showing NX/VX chunks, part-of-speech tags (DT NN VBD $ CD TO VB NN NN IN CD), and the two tagged events: spent (+Past, +Occurrence) and upgrade (+Infinitive, +Occurrence).]