Kintsch Van Dijk Toward A Model of Text Comprehension Production PsychRev 1978 PDF

syc e 1
VOL U M E 85 N UJiI B E R 5 S Elp T E lVI B E R i 9 78

.&.r
~, .. ""
./ '..
Toward a Model" of Text Com;ppehension Production

Walter Kintsch Teun A. van Dijk
University of Colorado University of Amsterdam,
Amsterdam, The ~etherlands
The semantic structure of texts can be described both at the local microlevel
and at a more global macrolevel. A model for text comprehension based on this
notion accounts for the formation of a coherent semantic text base in terms of
a cyclical process constrained limitations of working memory. Furthermore,
the model includes macro-operators, whose purpose is to reduce the information
in a text base to its gist, that is, the theoretical macrostructure. These opera-
tions are under the control of a schema, which is a theoretical formulation of
the comprehender's goals. The macroprocesses are only when the
control schema can be made On the production the Dodel is con-
cerned with rhe generation of recall and,~ummarization ;Jrowco:s. This process
is partly reproductive· and partly constructive, involving the inverse operation
of the macro-operators. The model is applied w ia paragraph froD a
I'esearch report, and methods for the empirical testing of the model
are developed.
The main goal of this article is to describe organized into a coherent whole, a process
the system of mental operations that underlie that results in multiple processing of some
the processes occurring in text comprehension elements and, hence, in differential retention.
and in the production of recall and summariza- /\. second set of operations condenses the full
tion protocols. A processing model will be meaning of the text into its gist. These pro-
outlined that specifies three sets of operations. cesses are complemented by a third set of
First, the meaning elements of a text'become operations that generate new texts from the
memorialco,nsequences of the comprehension
This research was supported by Grant MH15872 processes.
from the Xational Institute of .:YIental Health to the These goals involve a number of more con-
frst author and by grants from the Netherlands crete objectives. 'Vv'e want fIrst to be able to
Organization for the Advancement of Pure Research
go through a text, sentence by sentence,
and the Facalty of Letters of the University of Amster-
dam to the second author. specifying the processes that these ser:tences
,\Ve have benefited from many discussions with mem- undergo in comprehension as well as the out-
bers of our laboratories both at Amsterdam and puts of these processes at various stages of
Colorado and especially with Ely Kozminsky, who also comprehension. Next, we propose to analyze
performed the statistical analyses reported here.
for reprints should be sent to Wa;ter recall protocols and summaries in the same way
Department of Psychology, l.:niversity of and to specify for each senter:ce the operations
'_OlOTaU(}, Boulder, Colorado 80309. required to produce such a sentence. The
Copyright 1978 by the American Psychological Association, Inc. 0033·295X/7S/8505-0363$OO. 75
363
364 WALTER KINTSCH AND TEUN'A, VAN DIJK
model is, however, not merely descriptive. It ably ones that are shared by many readers in
will yield both simulated output ptotocols order to make feasible the collection of mean-
that can be qualitatively compared with ingful data. Hence, we shall emphasize the
experimental protocols and detailed predic- processing of certain conventional text types,
tions of the frequencies with which proposi- such as stories.
tions from the text as well as certain kinds of Comprehension is involved in reading as well
inferences about it appear in experimental as in listening, and our model applies to both.
protocols, which can then be tested with con- Indeed, the main differences between reading
ventional statistical methods. and listening occur at levels lower than the
LaBerge and Samuels (1974), in their in- ones we are concerned with (e.g., Kintsch &
fluential article on reading processes, have Kozminsky, 1977; Sticht, in press). We are
remarked that "the complexity of the compre- talking here, of course, about readers for whom
hension operation appears to be as enormous the decoding process has become automated,
as that of thinking in general" (p. 320). As but the model also has implications for readers
long as one accepts this viewpoint, it would be who still must devote substantial resources to
foolish to offer a model of comprehension decoding.
processes. If it were not possible to separate One of the characteristics of the present
aspects of the total comprehension process for model is that it assumes a multiplicity of
study, our enterprise would be futile. We processes occurring sometimes in parallel,
believe, however, that the comprehension sometimes sequentially. The fact that, phe-
process can be decomposed into components, nomenologically, comprehension is a very
some of which may be manageable at present, simple experience does not disprove tl'Js no-
while others can be put aside until later. tion of multiple, overlapping processes. Per-
Two very important components will be ception is likewise a unitary, direct experience;
excluded from the present model. First, the but when its mechanisms are analyzed at the
model will be concerned only with semantic physiological and psychological levels, they
structures. A full grammar, including a parser, are found to consist of numerous interacting
which is necessary both for the interpretation subprocesses that run off rapidly and with
of input sentences and for the production of verv little mutual interference. Resources
output sentences, will not be included. In see~ to be required only as attention, con-
fact, no such grammar is available now, nor sciousness, decisions, and memory become
is there hope for one in the near future. Hence, involved; it is here that the well-known
the model operates at the level of assumed capacity limitations of the human system
underlying semantic structures, which we seem to be located rather than in the actual
characterize in terms of propositions. Further- processing. Thus, in the comprehension model
more, comprehension always involves knowl- presented below, we assume several complex. ..(
edge use and inference processes. The model processes operating in parallel and interactively'
does not specify the details of these processes. without straining the resources of the system.
That is, the model only says when an inference Capacity limitations become crucial, however,
occurs and what it will be; the model does not when it comes to the storage of information in
say how it is arrived at, nor what precisely memory and response production. That such
was its knowledge base. Again, this restriction a system is very complex should not in itself
is necessary because a general theory of in- make it implausible as an account of compre-
ference processes is nowhere in sight, and the hension processes.
alternative of restricting the model to a small The organization of this article is as follows.
but manageable base has many disadvantages. First, the semantic structure of texts, includ-
Comprehension can be modeled if we ing their macrostructure, is discussed briefly,
are given a specific goal. In other words, we mainly with reference to our previous work.
must know the control processes involved in Then, a psychological processing model is
comprehension. Frequently, these are ill described that is based on these ideas about
specified, and thus we must select cases where text structure. While related, the functions of
well-defined control processes do exist, prefer- these two enterprises are rather different. The
------~~~~------------
TEXT COMPREHENSION AND PRODUCTION 365
structural theory is a semiformal statement coherence: We establish a linear or hierarchical

of certain linguistic intuitions, while the pro- sequence of in \vhich coreferential
cessing model attempts to predict the contents expressions occur. The first superordinate)
of experimental protocols. Therefore, they of these propositions often appears to have a
must be evaluated in rather different ways. specific status in such a sequence,
Finally, some preliminary applications and being recalled two or three times more often
tests of the processing model are described, than other propositions (see Kintsch,
and some linguistic considerations are raised Clearly, a processing model must account for
that may guide future extensions of it. the superior recall of propositions that func-
tion as superordinates In the text-base
structure.
Semantic Structures
We assume that the surface structure of a Inferences

discourse is interpreted as a set of proposi-
tions. This set is ordered by various semantic Natural language discourse may be con-
relations among the propositions. Some of these nected even if the propositions expressed by the
relations are explicitly expressed in the surface discourse are not directly connected. This
structure of the discourse; others are inferred possibility is due to the fact that language
during the process of interpretation with the users are able to provide, during comprehen-
help of various kinds of context-specific or sion, the missing links of a sequence on the
general knowledge. basis of their general or contextual knowledge
The semantic structure of a discourse is of the facts. In other words, the facts, as
characterized at two levels, namely, at the known, allow them to make inferences about
levels of microstructure and of macrostructure. possible, likely, or necessary other facts and to
The microstructure is the local level of the interpolate missing propositions that may make
discourse, that is, the structure of the in- the sequence coherent. Given the general
dividual propositions and their relations. The pragmatic postulate that we do not normally
macrostructure is of a more global nature, assert what we assume to be known by the
characterizing the discourse as a whole. These listener, a speaker may leave all propositions
levels are related by a set of specific semantic implicit that can be provided by the listener.
mapping rules, the macrorules. At both levels, An actual discourse, therefore, normally ex-
we provide an account of the intuitive notion presses what may be called an impUcit text
of the coherence of a discourse: A discourse is base. An explicit text base, then, is a theoretical
coherent only if its respective sentences and construct featuring also those propositions
propositions are connected, and if these necessary to establish formal coherence. This
propositions are organized globally at the means that only those propositions are inter-
macrostructure level. polated that are interpretation conditions, for
example, presuppositions of other propositions
expressed by the discourse .
•Microstructure of Discourse
In our earlier work (Kintsch, 1974; van IvIacrostructure of Discourse

* Dijk, 1972, 1977d), we attempted to character-
ize the semantic structure of a discourse in The semantic structure of discourse must be
terms of an abstract text base. The present described not only at the local microlevel but
. article will shift attention to the more specific also at the more global macrolevel. Besides
properties of comprehension, that is, to the the purely psychological motivation for this
ways a language understander actually con- approach, which will become apparent below
structs such a text base. in our description of the processing model,
Text bases are not merely unrelated lists the theoretical and linguistic reasons for this
of propositions; they are coherent, structured level of description derive from the fact that
units. One way to assign a structure to a text the propositions of a text base must be con-
base may be derived from its referential nected relative to what is intuitively called a
366 \YALTER KIXTSCH A~D TEU~ A. VAN DIJK
topic of discourse (or a topic of conversation), tions may be substituted by a proposition de-
that is, the theme of the discourse or a fragment noting a global fact of which the facts denoted
thereof. Relating propositions in a local manner by the microstructure propositions are normal
is not sufficient. There must be a global con- conditions, components, or consequences.
straint that establishes a meaningful whole,
The macrorules are applied under the con-
characterized in terms of a discourse topic.
trol of a schema, which constrains their opera-
The notion of a discourse tCDic can be made
lion so that macrostructures do not become
explicit in terms of semantic ~acrostructures.
virtually meaningless abstractions or general-
Like other semantic structures, these macro-
izations.
structures are described in terms of proposi-
Just as general information is needed to
tions and proposition sequences. In order to
establish connection and coherence at the
show how a discourse topic is related to the
microstructural world knowledge is also
respective propositions of the text base, we
for the operation of macrorules. This
thus need semantic mapping rules with micro-
is particularly obvious in Rule 3, where our
structural information as and macro-
frame knowledge (:Vlinsky, 1975) specifIes
structural information as output. Such macro-
which facts belong conventionally to a more
rules, as we call them, both reduce and
global fact, for example, "paying" as a normal
organize the more detailed information of the
component of both" shopping" and" eating in
microstn.~cture of the text. describe the
a restaurant" (d. Bobrow & Collins, 1975).
same facts but from a more global point of
view. This means, for instance, that we may
have several levels of macrostructure; macr~ Schematic Structures oj Discourse
rules may apply recursively on sequences of Typical examples of conventional schematic
macropropositions as long as the constraints
structures of discourse are the structure of a
on the rules are satisfied.
story, the structure of an argument, or the
The general abstract nature of the macro-
structure of a psychological report. These
rules is based on the relation of semantic en- structures are specified by a set of character-
tailment. That is, they preserve both truth istic categories and a set of (sometimes re-
and meaning: A macrostructure must be im- cursive) rules of formation and transiorma-
plied by the (explicit) microstructure from tion defining the canonical and possible order-
which it is derived. Kote that in actual pro-
ing of the categories van Dijk,
cessing, as we will see, macro structures may
in press).
also be inferred on the basis of incomplete Schematic structures play an important role
information. in discourse comprehension and production.
The basic constraint of the macrorules is Without them, we would not be able to
that no proposition may be deleted that is an why language users are able to understand a
condition of a following proposi- discourse as a story, or they are able to
the text. In iact, this also guarantees judge whether a story or an argument is cor-
that a macrostructure itself is connected and rect or not. More specifically, we must be
coherent. able to explain why a free-recall protocol or a ".
A formal description of the macrorules will summary of a story also has a narrative
not be given here. We shall merely mention structure, or why the global structure of a
them briefly in an informal way (for psychological paper is often conserved in its
see van Dijk, 1977b, 1977d). abstract (d. Kintsch & van Dijk, 1975; van
1. Deletion, Each proposition that is Dijk & Kintsch, Furthermore, schematic
neither direct nor an indirect interpretation structures play an important control function'
condition of a subsequent proposition may be in the processing model to be described below,
deleted. where they are used not only in connection with
2. Generalization. Each sequence of propo- conventional texts but also to represent
sitions may be substituted by the general idiosyncratic personal processing goals. The
proposition denoting an immediate superseL application of the macro rules on
3. Construction. Each sequence of proposi- whether a given proposition is or is not judged
TEXT COMPREHENSION AXD PRODUCTION 367
to be relevant in its context, with the schema The latter may be concepts or other embedded
specifying the kind of information that is to be propositions. The arguments of a proposition
considered relevant for a particular compre- fulfili different semantic functions, such as
hension task. agent, object, and goal. Predicates may be
realized in the surface structure as verbs,
adjectives, adverbs, and sentence connectives.
Process Model
Each predicate constrains the nature of the
Forming Coherent Text Bases argument that it raay take. These constraints
are imposed both linguistic rules and general
The model takes as its input a list of pro- "world knowledge and are assumed to be a
positions that represent the meaning of a text. part of a person's knovdedge or semantic
Thp derivation of this proposition list from the memory.
text is not part of the model. It has been de- Propositions are ordered in the text base
scribed and justified in Kintsch and according to the way in which they are ex-
further codified by Turner and Greene pressed in the text itself. Specificaliy, their
press). With some practice, persons using this order is determined by the order of the words
system arrive at essentially identica.l pro- in the text that correspond to the proposi-
positional representations, in our experience. tional predicates.
Thus, the system is simple to use; and while Text bases must be coherent. One of the
• it is undoubtedly too to capture nuances linguistic criteria for the semantic coherence
of meaning that are crucial for some linguistic of a text base is referential coherence. In terms
and logical analyses, its robustness is well of the present notational system, referential
suited for a process model. Most important, coherence corresponds to argument overlap
this notation can be translated into other among propositions. Specifically, (P, A, B) is
systems quite readily. For instance, one could referentially coherent with (R, B, C) because
translate the present text bases into the the two propositions share the argument B,
graphical notation of Norman and Rumelhart or with (Q, D, (P, A, B)) because one proposi-
(1975), though the result would be quite tion is embedded here as an argument into
cumbersome, or into a somewhat more sophisti- another. Referential coherence is probably the
cated notation modeled after the predicate most important single criterion for the co-
calculus, which employs atomic propositions, herence of text bases. It is neither a necessary
variables, and constants (as used in van Dijk, nor a sufficient criterion linguistically. How-
1973). Indeed, the latter notation may ever, the fact that in many texts other factors
eventually prove to be more suitable. The tend to be correlated with it makes it a useful
important point to note is simply that al- indicator of coherence that can be checked
though some adequate notational system for easily, quickly, and reliably. We therefore pro-
the representation of is required, the pose the following hypothesis about text
details of that system often are not crucial processing: The first step in forming a coherent
for our model. text base consists in checking out its referential
coherence; if a text base is found to be refer-
entially coherent, that is, if there is some
Propositional N alation argument overlap among of its propositions,
it is accepted for further processing; if gaps
The propositional notation employed below
are found, inference processes are initiated to
will be outlined here only in the briefest possible
close them; specifically, one or more proposi-
way. The idea is to represent the meaning of
tions will be added to the text base that make
a text by means of a structured list of proposi-
it coherent. Note that in both cases, the argu-
tions. Propositions are of concepts
(the names of which we shall write in capital ment-referent repetition rule also holds for
letters, so that they will not be confused ,vith arguments consisting of a proposi~ion, thereby
words). The composition rule states that each establishing relations not only behveen in-
proposition must include first a predicate, or dividuals but also between facts denoted by
relational concept, and one or more arguments. propositions.
368 WALTER KINTSCH AND TEUN A. VAN DIJK
Processing Cycles those propositions not available from long-

The second major assumption is that this term memory can be located re-reading the
checking of the text base for referential co- text itself. Clearly, we are assuming here a
herence and the addition of inferences wherever reader who processes all available information;
necessary cannot be performed on the text deviant cases will be discussed later.
base as a whole because of the capacity If this search process is successful, that is, if
limitations of working memory. We assum'e a proposition is found that shares an argument
that a text is processed sequentially from left to with at least one proposition in the input set,
right (or, for auditory inputs, in temporal the set is accepted and processing continues.
order) in chunks of several propositions at a If not, an inference process is initiated, which
time. Since the proposition lists of a text base adds to the text base one or more propositions
are ordered according to their appearance in that connect the input set to the already pro-
t?e text, this means that the first nl proposi- cessed propositions. Again, we are assuming
tlons are processed together in one cycle, then here a comprehender who fully processes the
the next n2 propositions, and so on. It is un- text. Inferences, like long-term memory
reasonable to assume that all n.,s are equal. searches, are assumed to make relatively
Instead, for a given text and a given compre- heavy demands on the comprehender's re-
the maximum 10i will be specifIed. sources and, hence, contribute significantly to
Within this limitation, the precise number of the difficulty of comprehension.
propositions included in a processing chunk
depends on the surface characteristics of the Coherence Graphs
text. There is ample evidence (e.g., Aaronson
& Scarborough, 1977; Jarvella, 1971) that In this manner, the model proceeds through
sentence and phrase boundaries determine the the whole text, constructing a network of
chunking of a text in short-term memory. The coherent propositions. It is often useful to
maximum value of ni, on the other hand, is a represent this network as a graph, the nodes
model parameter, depending on text as 'well of which are propositions and the connecting
as reader characteristics.! lines indicating shared referents. The graph
If text bases are processed in cycles, it be- can be arranged in levels by selecting that
comes necessary to make some provision in proposition for the top level that results in
the model for connecting each chunk to the the simplest graph structure (in terms of some
ones already processed. The following assump- suitable measure of graph complexity). The
tions are made. Part of working memory is a second level in the graph is then formed by all
short-term memory buffer of limited size s. the propositions connected to the top level;
When a chunk of 10; propositions is processed, propositions that are connected to any proposi-
s of them are selected and stored in the buffer.2
Only those s propositions retained in the buffer • 1 Presumably, a speaker employs surface cues to
are available for connecting the new incoming SIgnal to the comprehender an appropriate chunk size.
chunk with the already processed material. If Thus, a good speaker attempts to place his or her
a connection is found between any of the new sentence boundaries in such a way that the listener can -
use them effectively for chunking purposes. If a
propositions and those retained in the buffer, speaker-listener mismatch occurs in this respect, com-
that is, if there exists some argument overlap prehension problems arise because the listener cannot
between the input set and the contents of the use the most obvious cues provided and must rely on '
short-term memory buffer, the input is harder-to-use secondary indicators of chunk boundaries.
accepted as coherent with the previous text. One may speculate that sentence grammars evolved
because of the capacity limitations of the human
If not, a resource-consuming search of all ~ognitive system, that is, from a need to package
previously processed propositlons is made. In mfmmation in chunks of suitable size for a cyclical
auditory•
comprehension, this search ranaes 0
comprehension process.
2 Sometimes a speaker does not entrust the listener ..
over alJ text propositions stored in long-term
with this selection process and begins each new sentence
memory. (The storage assumptions are detailed by repeating a crucial phrase from the old one. Indeed,
below.) In reading comprehension, the search in some languages such hbackreference" is obligatory
includes all previous propositions because even (Longacre & Levinsohn, 1977).
tion at the second level, but not to the first- currently being processed may be stored in
level proposition, then form the third level. long-term memory and later reproduced in a
Lower levels may be constructed similarly. recall or summarization task, each with proba-
For convenience, in drawing such graphs, only bility p. This probability p is called the
connections between levels, not within levels, reproduction probability because it combines
are indicated. In case of multiple between- both storage and retrieval information. Thus,
level connections, we arbitrarily indicate only for the same comprehension conditions, the
one (the first one established in the processing value of p may vary depending on whether
order). Thus, each set of propositions, depend- the subject's task is recall or summarization
ing on its pattern of interconnections, can be or on the length of the retention interval. We
assigned a unique graph structure. are merely saying that for some comprehen-
The topmost propositions in this kind of sion-production combination, a proposition is
coherence graph may represent presuppositions reproduced with probability p for each time it
of their subordinate propositions due to the has participated in a processing cycle. Since at
fact that they introduce relevant discourse each cycle a subset of s propositions is selected
referents, where presuppositions are important and held over to the next processing cycle,
for the contextual interpretation and hence for some propositions will participate in more than
discourse coherence, and ,,,here the number of one processing cycle and, hence, will have
subordinates may point to a macrostructural higher reproduction probabilities. Specifically,
role of such presupposed propositions. Note if a proposition is selected k - 1 times for
that the procedure is strietly formal: There is inclusion in the short-term memory buffer, it
no claim that topmost propositions are ahvays has k chances of being stored in long-term
most important or relevant in a more intuitive memory, and hence, its reproduction proba-
sense. That will be taken care of with the bility will be 1 - (1 - p)k.
macro-operations described below. Which propositions of a text base will be
A coherent text base is therefore a con- thus favored by mUltiple processing depends
nected graph. Of course, if the long-term crucially on the nature of the process that
memory searches and mrerence processes selects the propositions to be held over from
required by the model are not execu ted, ~he one processing cyoie to the next. Various
resulting text base ,yill be incoherent, that is, strategies may be employed. For instance, the
the graph will consist of several separate s buffer propositions may be chosen at random.
clusters. In the experimental situations we are Clearly, such a strategy would be less than
concerned where the subjects read the optimal and result in unnecessary long-term
text rather carefully, it is usually reasonable memory searches hence, in poor reading
to assume that coherent text bases are performance. It is not possible at this time to
established. specify a unique optimal strategy. However,
To review, so far we have assumed a cyclical there are two considerations that probably
process that checks on the argument overlap characterize good strategies. First, a good
in the proposition list. This process is auto- strategy should select propositions for the
matic, that is, it has low resource require- buffer that are important, in the sense that
ments. In each cycle, certain propositions are they play an important role in the graph
retained in a short-term buffer to be connected already constructed. A proposition that is
with the input set of the next cycle. If no con- already connected to many other propositions
nections are found, resource-consuming search is more iikely to be relevant to the next input
and inference operations are required. cycle than a proposition that has played a less
pivotal role so far. Thus, propositions at the
top levels of the graph to ,,,hieh many others
lvI emory Storage
are connected should be selected preferentially.
In each processing cycle, there are lli Another reasonable consideration involves
propositions involved, plus the s propositions recency. If one must choose between two
held over in the short-term buffer. It is equally important propositions in the sense
assumed that in each cycle, the propositions outlined above, the more recent one might be
370 WALTER KINTSCH AND TEU~ A. VAN DIJK
expected to be the one most relevant to the Suppose that" the selection strategy, as dis-
next cycle. lJnfortunately, t~ese con- cussed is indeed biased in favor of
siderations do not specify a unique selection important, that is, high-level propositions.
strategy but a whole family of such strategies. Then, such propositions will, on t.he average,
In the example discussed in the next section, be processed more frequently than low-level
particular example of such a strategy will be propositions and recal'ed better. The better
explored in not because we have any recall of high-level propositions can then be
reason to believe that it is better than some explained because they are processed dif-
alterm,tives, but to show that each strategy ferently than low-Ieve·l propositions. Thus,
leads to empirically testable consequences. we are now able to add a processing explana-
Hence, 'which strategy is used tion to our previous account of levels effects,
in a given population of subjects becomes an which was merely in terms of a correlation
empirically decidable lssue because each between some st;uctural aspects of texts and
strategy specifies a somewhat different recall probabilities.
of reproduction probabilities over the we have been concerned here only
propositions of a text base. Experimental with the organization of the micro propositions,
results may be such that only one particular one should not forget that other processes,
selection strategy can account for them, or such as macro-operations, are going on at the
perhaps we shall not be able to discriminate same time, and that therefore the buffer must
among a dass of reasonable selection strategies contain the information about macroproposi-
on the basis of empirical frequency distribu- tions and presuppositions that is required to
tions. In either case, we shall have the in- establish the global coherence of the dis-
formation the mode! requires. course. One could extend the model in that
It is already clear that a random seleCIion direction. Instead, it appears to us worth-
strategy wiil not account for paragraph recall while to these various processing com-
data, and that some selection strategy in- ponents in isolation, in order to reduce the
corporating an "importance" principle is re- complexity of our problem to manageable
quired. The evidence comes from work on proportions. We shall remark on the feasibility
what has been termed the "levels effect" of this research strategy below.
(Kintsch & Keenan, 1973; Kintsch et aI.,
1975; .:\fIeyer, It has been observed that Testing the M oriel
propositions belonging to high levels of a text-
The frequencies with which different text
base hierarchy are much better recalled (by
proposition~ are recalled provide the major
a factor of two or three) than propositions low
possibility for testing the model experi-
in the The text-base hierarchies in
mentally. It is possible to obtain good esti-
these studies were constructed as follows. The
mates of such frequencies for a given popula-
proposition or proposltlons of a
tion of readers and texts. Then, the best-
paragraph (as determined its title or
fitting model can be determined by varying
otherwise by intuition) were selected as the
the three parameters of the the
top level of the hierarchy, and all those
maximum input size per cycle; s, the capacity
propositions connected to them via argument
of the short-term buffer; and p, the reproduc-
overlap ,vere determined, forming the next
tion probability-at the same time exploring
level of the hierarchy, and so on. Thus, these
different selection strategies. Thus, which
hierarchies were based on referential coherence,
selection strategy is used by a given popUla-
that is, arguments overlap among
tion of readers and for a given type of text
tions. In fact, the propositional networks
becomes decidable empirically. For instance,
constructed here are identical with these
in the example described belOlv, a selection
hierarchies, except for changes introduced by strategy has been assumed that emphasizes
the fact that in the present. coherence recency and ':leading-edge"
graphs are constructed cycle by cycle. strategy). It is conceivable that for college
Xote that the present model suggests an students reading texts rather carefuliy under
interesting reinterpretation of this levels effect. laboratory conditions, a fairly sophisticated
...
TEXT COMPREHENSION AND PRODGCTIOX 371
strategy like this might actually be used (i.e., ing a text base is perhaps best described as
it would lead to fIts of the model detectably apperception. That is, a reader's knowledge
better than alternative assumptions). At the determines to a large extent the meaning that
sar~le different selection strategies might he or she derives from a text. If the knowledge
be used in othcr reader populations or under base is lacking, the reader will not be able to
different conditions. It would be derive the same meaning that a person with
wori.hwhile to investigate vvhether there are adequate knowledge, the same text,
strategy differences between good and poor would obtain. Unfamiliar material would have
readers, for instance. to be processed in smaller chunks than familiar
If the model is successful, the values of the material, and hence, tt should be directly
parameters s, n, and p, and their dependence related to ~o experimental tests
on text and reader characteristics might also of this prediction appear to exist at present.
prove to be informative. A number of factors However, other factors, too, influence n.
might influence s, the short-term memory Since it is determined by the surface form of the
capacity. Individuals differ in their short-term text, the complexity of the surface
memory capacity. Perfetti and Goldman form while leaving the underlying meaning
for exan1ple, have shown that good intact ought to decrease the size of the pro-
readers are capable of holding more of a text cessing chunks. Again, no data are presently
in short-term memory than poor readers. At available, though Kintsch and Monk (1972)
the same time, good and poor readers did not and King and Greeno (1974) have reported
differ on a conventional memory-span test. that such manipulations aliect reading times.
Thus, the differences observed in the reading Finally, the third parameter of the model,
task may not be due to capacity differences p, wOclld also be expected to depend on famili-
per se. Instead, they could be related to the arity for much the same reasons as s: The
observation of Lunneborg, and Lewis more familiar a text, the fewer resources are
(1975) that persons with low verbal abiiities required for other aspects of processing, and
are slower in accessing information in short- the more resources are available to store
term memory. According to the model, a individual propositions in memory. The value
comprehender continually tests of p should, however, depend above all on the
hons against the contents of the short-term task demands that govern the comprehension
buffer; even a slight decrease in the speed process as well as the later production process.
with which these individual operations are If a long text is read with attention focused
performed would result in a noticeable de- on gist comprehension, the probability
terioration in performance. In effect, lowering of storing individual propositions of the text
the speed of scanning and matching operations base should be considerably lower than when
would have the same effect as decreasing the a short paragraph is read with immediate recall
capacity of the buffer. instructions. Similarly, when summarizing a
The buffer capacity may also depend on the text, the value of p should be lower than in
difticulty of the text, however, or more pre- recall because individual micropropositions arc
cisely, on how difficult particular readers find given less weight in a sumrnary relative to
a text. Presumably, the size of the buffer macropropositions.
depends, within some limits, on the amount of ~ote that in a model like the present one,
resources that must be devoted to other as- such factors as familiaritv may have rather
pects of processing (perceptual decoding, syn- complex effects. Not only ~an fa~iliar material
tactic-sen1antic analyses, inference generation, be processed in larger chunks and retained
and the macro-operations discussed below). more efficiently in the memory buffer; if a
The greater the automaticity of these processes topic is unfamiliar, there will be no frame avail-
and the fewer inferences required, the larger able to organize and interpret a given proposi-
~he buffer a reader has to ,vith. tion sequence (e.g., for the purpose of generat-
Certainly, familiarity should have pro- ing inferences from so readers might con-
nounced effect on tt, the number of propositions tinue to in the hope
accepted per cycle. The process of construct- of finding information that may organize what
- --~.
they already hold in their buffer. If, however, must be evaluated in a way that conslders
the crucial piece of information fails to arrive, both comprehension time and test perfor-
the working memory vvill be quickly overloaded mance because of the trade-off between the
and incomprehension will result. Thus, famili- two. Only measures such as reading time per
arity may have effects on cO'::l1prchension not proposition recaHed in Kintsch ct al.,
only at the level of processing considered here 1975) or reading speed adjusted for compre-
(the construction of a coherent text base) but hension (suggested by Jackson & .McClelland,
also at higher processing levels. 1975) can therefore be considered adequate
measures of comprehension difficulty.
The factors related to readability in the
Readability
model depend, of course, on the input size
The model's predictions are relevant not per cycle and the short-term memory
capacity (s) that are assumed. In addition,
only for recall but also for the readability of
they depend on the nature of the selection
texts. This aspect of the model has been ex-
strategy used, since that determines which
plored by Kintsch and Vipond (1978) and will
propositions in each cycle are retained in the
be only briefly summarized here. It was noted
short-term buffer to be interconnected with
there that conventional accounts of readability
the next set of input propositions. A reader
have certain shortcomings, in part because
with a poor selection strategy and a small
they do not concern themselves with the
buffer, reading unfamiliar material, might
organization of long texts, which might be
have all kinds of problems with a text that
expected to be important for the ease of
would be highly readable for a good reader.
difficulty with which a text can be compre-
Thus, readability cannot be considered a
hended. The present model makes some pre-
property of texts alone, but one of the text-
dictions about readability that go beyond these
reader interaction. Indeed, some preliminary
traditional accounts. Comprehension in the
analyses reported by Kintsch and Vipond
normal case is a fully automatic process, that
(1978) show that the readability of some texts
is, it makes low demands on resources. Some-
changes a great deal as a function of the short-
times, however, these normal processes are
term memory capacity and the size or input
blocked, for instance, when a reader must
chunks: Some texts were hard for all param-
retrieve a referent no longer available in his or
eter combinations that were explored; others
her working memory, ,,,hieh is done almost
were easy in every case; still others could be
consciously, requiring considerable processing
processed by the model easily when short-term
resources. One can simulate the comprehen-
memory and input chunks were large, yet
sion process according to the model and deter-
became very difficult for small values of these
mine the number of resource-consuming opera-
parameters.
tions in this process, that is, the number of
long-term memory searches required and the
number of inferences required. The assumption AIacroslmclure of a Text
was made that each one of these operations
disrupts the automatic comprehension pro- Macro-operators transform the propositions
cesses and adds to the difficulty of reading. of a text base into a set of macropropositions
On the other hand, if these operations are that represent the gist of the text. They do so
not performed a reader, lhe representation by deleting or generalizing all propositions
of the text that this reader arrives at will be that are either irrelevant or redundant and by
incoherent. Hence, the retrieval of the text constructing new inferred propositions. "De-
base and all performance depending on its lete" here does not mean' 'delete from memory"
retrieval (such as recall, summarizing, or
but "delete from the macrostructure." Thus,
question answering) will be poor. Texts re-
quiring many operations that make high de- a given text proposition-a microproposition--
mands on resources should yield either in- may be deleted from the text's macrostructure
creased reading times or low scores on com- but, nevertheless, be stored in memory and
prehension tests. Comprehension, therefore, subsequently recalled as a microproposition.
...
Role of the Schema A second type of reading situation where

well-defined schemata exist comprises those
The reader's goals in reading control the
~ases. wh~e o.ne reads with a sr>ecial purpose
application of the macro-operators. The formal
m mmd. 1:'or mstance, Hayes, VVaterman, and
representation of these goals is the schema..
Robinson (1977) have studied the relevance
~he schema determines which microproposi-
of subjects vvho read a text with a
tlOns or generalizations of micropropositions
specific problem-solving set. It is not necessary
are relevant and, thus, which parts of the text
that the text be conventionally structured.
will form its gist.
Indeed, the special purpose overrides whatever
It is assumed that text commehension is
tex~ structure there is. For instance, one may
always controlled by a specific s~hema. How-
read a story with the processing controlled
ever, in some situations, the controlling schema
not by the usual story schema but by some
may not be detailed, nor predictable. If a
~pecial-~urpose schema established by task
reader's goals are vague, and the text that he
mstructlOns, special interests, or the like.
or she reads lacks a conventional structure
~hus, Decameron stories may be read not for
different schemata might be set up by dif~
the plot and the interesting events but be-
ferent readers, essentially in an unpredictable
cause of concern with the role of women in
manner. In such cases, the macro-operations
fourteenth-century Italy or with the attitudes
would also be unpredictable. Research on com-
of th~ characters in the story toward morality
prehension must concentrate on those cases
and sm.
where texts are read with clear goals that are
T~ere are many situations ,,;here compre-
shared among readers. Two kinds of situations
henSIOn processes are similarly controlled. As
qualify in this respect. First of there are
one final example, consider the case of reading
a number of highly conventionalized text
the review of a play with the Durpose of de-
types. If a reader processes such texts in ac- ,,'
.
Clumg whether or not to go " to that play.
cordance with their conventional nature
Wh~t is important in a play for the particular
specific well-defined schemata are obtained:
reader serves as a schema controlling the gist
These are shared by the members of a given
formation: There are open slots in this schema,
;ultural group and, hence, are highly suitable
each one standing for a property of plays that
lor research purposes. Familiar examples of
this reader finds important, positively or
such texts are stories (e.g., Kintsch & van Diik,
negatively, and the reader will try to fill these
~9!5) ~nd ~5ychological research repo;ts slots from the information provided in the
-' "Kmtsch, 1914): These schemata specify
both the schematIC categories of the texts (e.g.,
review. Certain propositions in the text base
of the review will be marked as relevant and
a research report is supDosed to contain
assigned to one of these slots, while others will
introduction, method, results, and discussion
be disregarded. If thereby not all slots are
sections), as well as what information in each
filled, inference processes will take over, and
section is relevant to the macrostrucTure
... (e IT
\ '0"
an attempt will be made to fill in the missin0a
the introduction of a research report must • ,. •
mtormatlOn by applying available knowledge
specify the purpose of the study). Predictions
frames to the informa tion presented directly.
about the macrostructure of such texts re-
Thus, while the macro-operations themselves
quire a thorough understanding of the nature
are always information reducing, the macro-
) of their schemata. Foilowing the lead of
structure may also contain information not
anthropologists (e.g., Colby, 1973) and lin-
directly represented in the original text base,
guists . (e.g., Labo~ & Wal~tzky, 1967), psy-
'when such information is required by the con-
chologlsts have so far given the most attention
trolling schema. If the introduction to a re-
to story schemata, in part within the present
search report does not contain a statement of
framework (Kintsch, 1977; Kintsch & Greene,
~978; Poulsen, Kintsch, Kintsch, & Premack,
the study's purpose, one will be inferred ac-
m press; van Dijk, 1977b) and in part within cording to the present model.
the related "story grammar" approach (:::Vland- In many cases, of course, people read
lef & Johnson, 1977; Rumelhart, 1975; Stein loosely structured texts with no clear goals
& Glenn, in press). in mind. The outcome of such comprehension
374 WALTER Kn,TSCH AND TEt:N A. VAN DIJK
processes, as far as the resuhing macrostruc- even the macropropositions have been
ture is concerned, is indeterminate. V,'e believe, forgotten.
that this is not a failure of the model: :YIacrostructures are hierarchical, and hence,
If the schema that controls the rnacro-opera- macro-operations are applied in several cycles,
tions is not weI! the outcome wiE be with more and more stringent criteria of rele-
haphazard, and we wouid argue that no vance. At the lowest level of the macro-
scientific in principle, can it. structure, relatively many propositions are
selected as relevant by the controlling schema
and the are perforrr.ed ac-
!vIaero-operators
cordingly. At the next level, stricter criteria
for relevance are so that only some
The schema thus classifies all propositions
of a text base as either relevant or irrelevant. of the first-level macro propositions are re-
Some of these propositions have generaliza- tained as second-level macropropositions. At
tions or constructions that are also classitied subsequent levels, :he criterion is strengthened
further until o"J.ly a r~aacroproposltlOn
in the same ,vay. In ~he absence of a genera!
(essentially a title for that text unit) remains.
theory of knowledge, whether a microproposi-
A tlIa t was selected as
tion is generalizable in its particular context or
relevant at k levels of the hierarchy has there-
can be replaced by a construction must be
decided on the basis of intuition at present. fore a of 1- (l-m)\
which is the probability that at least one of the
Each microproposition may thus be either
,k times that this proposition was processed
deleted from the macrostructure or included in
the macrostructure as is, or its generalization results in a s"Jccessful reproduction.
or construction m,ty be induded in the macro-
structure. A proposition or its generalization/
construction that is included in tte macro-
Recall or summarization pro locals obtained
structure is called a macro proposition. Thus,
in experiments arc texts in their own right,
some propositions may be both micro- and
satisfying the general textual and contextual
macropropositions (when they are relevant);
conaltlons or and communication.
irrelevant propositions are only microproposi-
A protocol is not simply a replica of a memory
tions, and generalizations and constructions
representation of the original discourse. On the
are only macropropositions. 3
contr2cry, the subject will try to produce a new
The macro-operations are performed proba- text that satisfies the pragmatic conditions of
bilistically. Specifically, irrelevant micropro- a task context in an experiment or
positions never become macropropOSltlons. the requirements of effective communication
if such propositions have generalizations in a more naturalcomext. Thus, the language
or constructions, their generalizations or con- user will not produce information that he or
structions may become macropropositions, she assumes IS known or redundant.
with probability g when they, too, are ir- the operations involved in dis-
relevant, and with probability m when they are so that ~he
are relevant. Relevant micropropositions be- subject wiil be unable to retrieve at any onc
come macroproposi tions with m;
time the information tha~ is in principle
if they have generalizations or constructions, accessible to memory. protocols
these ,tre included in the macrostructure, also contain information not based on \vhat the
with probability m. The parameters m and g subject remembers from the original text, but
are reproduction probabilities, just as p in the
previous section: They depend on both storage
and retrieval conditions. m may be ;{ Longacre and Levinsohn (1977) mention a language
sma Ii because the reading conditions did not tha~ apparent'y uses macrostructu,re mar~ers ~n Y1C
encourage macro processing (e.g., only a short surtace st:ucture: In Cl'..DeO, a SO~Ln AmerIcan lnClan
language) certain sentences contain a particular particle;
paragraph \yas read, with the expectation of stringi::ig together the sentences of a text that are
an immediate recall or neca"J5e the pro- rnarked by this pa:-tic1c res~lts i:1 ad abstract or sum-
duction was delayed for such a long ~ime that mary of :he text.
TEXT COMPREHENSIOK AXD PRODUCTION 375
consisting or reconstructively added details, comprehension processes, hovvever, are speci-

explanations, and various features that are the fied by the model in detaiL Specifically, those
result of constraints characterizing traces consist of a set of
in general. For each
Opt~o/1,al transformations. Propositions the model tha i it Vv-ill
stored in memory may be transformed in be included in this :11eraory set, on
various essentially arbitrary and unpredictable the number of cycles in which it
ways, in addition to the schema- participated and on the reproduction param-
controlled transformations achieved through eter p. In the memory episode con-
the macro-operations that are the focus of the tains a set of macropropositions, with the
present model. An underlying conception may probability that a mac;:oproposition is included
be realized in the surface structure in a variety in that set being specified by the parameter m.
of 'ways, depending on pragmatic and stylistic (~ote that the parameters m and p can only be
concerns. Reproduction reconstruction) of specified as a function of storage and
a text is in terms of different lexical retrieval conditions; hence, the memory con-
itcms or larger units of meaning. Transforma- tent on which a production is based must also
tions may be at the level of the micro- be specified ,,"ith respect to the task demands
structure, the macrostructure, or the schematic of a situation.)
structurc. Among these transformations one A is assumed that
can distinguish reordering, explication of retrieves the memory contents as described
coherence relations ~mlOng propositions, iexical so that become part of the subject's
substitutions, and perspective changes. These text base from which the output protocol is
transformations may be a source of errors in derived.
protocols, too, though most of the time they Reconstruction. \Vhen mlcro- or macro-
preserve meaning. \f\'hether such transforma- information is no longer directly retrievable,
tions are made at the time of comprehension, the language user will usually try to recon-
or at the time of production, or both cannot be struct this information applying rules of
decided at present. inference to the information that is still
A systematic account of optional transforma- available. This process is modeled with three
tions might be possible, but it is not necessary reconstruction operators. They consist of the
within the present model. It would make an inverse of the macro-opera tors and
..i
already complex model and scoring system result in reconstruction of some of the
even more cumbersome, and thereIore, we information deleted from the" macrostructure
choose to ignore optional transformations in with the aid of world knowledge: (a) addition
the applications below. of plausible details and normal properties,
Reproduction. Reproduction is the simplest particularization, and (c) specification of
,>-
involved in "ext production. A norraal conditions, components, or conse-
subject's memory for a particular text is a quences of events. In all cases, errors may be
memory episode containing the foil owing types made. The language user may make guesses
of memory traces: (a) traces irom the various that are plausible but wrong or even give
perceptual and processes involved in dctails that are inconsistent with the
text processing, traces from the compre- text. However, the reconstruction operators
hension processes, and (c) contextual traces. are not applied operate under
Among the first would be memory for the type- the control of the schema, just as in the case of
face used or memory for words and the macro-operators. Only reconstructions that
phrases. The third kind of trace permits the are relevant in terms of this control schema
to remember the circumstances in are actually produced. Thus, for instance,
which the processing took place: the laboratory when "Peter went to Paris train" IS a
~ setting, his her own reactions to the whole remembered macroproposition, It be
procedure, and so on. The model is not con- expanded means of reconstruction opera-
cerned with either of these two types of tions to include such a normal as
'memory traces. The traces resulting from the "He went into the station to buy a ticket,"
but it would not include irrelevant reconstruc- transformed into an actual text will not be c
tions such as "His leg muscles contracted." considered here, although it is probably a less
The schema in control of the output operation intractable problem than that posed by the
need not be the same as the one that controlled inverse transformation of verbal texts into
comprehension: It is perfectly possible to look conceptual bases.
at a house from the of a buyer and Text generation. Although we are con-
later to change one's viewpoint to that of a cerned here with the production of second-
prospective burglar, though different informa- order discourses, that is, discourses with re-
tion will be produced than in the case ,,,hen no spect to another discourse, such as free-recall
such schema change has occurred (A.'1derson protocols or summaries, the present model may
& Pichert, 1978). eventually be incorporated into a more general
Afetastatements. In producing an output theory of text production. In that case, the
protocol, a subject will not only operate core propositions of a text 'would not be a
directly on available information but will also set of propositions remembered from some
make -all kinds of metacomments on the specliIc text, and therefore new
structure, the content, or the schema of the mechanisms must be described that would
text. The subject may also add comments, generate core propositions de novo.
opinions, or express his or her attitude.
Production plans. In order to monitor the
Processing Simulation
production of propositions and especially the
connections and coherence relations, it must How the model outlined in the previous
be assumed that the speaker uses the available section works is illustrated here with a simple
macropropositions of each fragment. of the example. A suitable text for this purpose must
text as a production plan. For each macro- be (a) sufficiently long to ensure the involve-
proposition, the speaker reproduces or recon- ment of macroprocesses in comprehension; (b)
structs the propositions dominated by it. well structured in terms of a schema, so that
Similarly, the schema, once actualized, will predictions from the present model become
guide the global ordering of the production possible; and (c) understandable without
process, for the order in which the technical knowledge. A 1,300-\vord research
macropropositions themselves are actualized. report by Heussenstam (1971) called "Bumper-
At the microlevel, coherence relations will stickers and the Cops," appeared to fill these
determine the ordering of the propositions to requirements. Its structure follows the con-
be expressed, as well as the topic-comment ventions of psychological research reports
structure of the respective sentences. Both at closely, though it is written informally, with-
~he micro- and macrolevels, production is out the usual subheadings indicating methods,
guided not only by the memory structure of results, and so on. Furthermore, it is con-
the discourse itself but also by general knowl- cerned 'with a social psychology experiment
edge about the normal ordering of events and that requires no previous familiarity with
episodes, general principles of ordering of either psychology, experimental design, or
events and episodes, general principles of statistics.
ordering of information in discourse, and We cannot analyze the whole report here,
schematic rules. This explains, for instance, though a tentative macro analysis of i'Bumper-
why speakers will often transform a non- stickers" has been given in van Dijk (in press).
canonical ordering into a more canonical one Instead, we shall concentrate only on its first
(e.g., when summariziing scrambled stories, paragraph:
as in Kintsch, Mandel, & Kozminsky, 1977).
Just as the model of the comprehension A series of violent, bloody encounters between police
and Black Panther Party members punctuated the
process begins at the propositional level rather early summer days of 1969. Soon after, a group of
than at the level of the text itself, the produc- Black students I teach at California State College, Los
Angeles, who were members of L~e Panther Party,
tion model will also leave off at the proposi-
began to complain of continuous harassment by law
tional level. The question of how the text enforcement officers. Among their many grievances, _
base that underlies a subject's protocol is they complained about receiving 50 many traffic-
p
TEXT COMPREHEKSIOK AND PRODUCTION 377
citations that some were in danger of losing their driv- Table 1

ing privileges. During one lengthy discussion, we Proposition List for the Bumperstickers
realized that all of them drove automobiles with Panther Paragraph
-Party signs glued to their bumpers. This is a report of
a study that undertook to assess the seriousness of
their charges and to determine \-vhether \ve were hearing Proposition
the voice of paranoia or reality. (Heussenstam, 1971, number Proposition
p.32) (SERIES, ENCOUl\;TER)
Our model does not start with this text but 2 (VIOLE:-lT, E:-lCOU}!TER)
3 (BLOODY, E]\COt:NTER)
with its semantic representation, presented in 4 (BETWEEN, ENCOuNTER, POLICE,
Table 1. \Ve follow here the conventions of BLACK PANTHER)
Kintsch (1974) and Turner and Greene (in 5 (n"m: IN, ENCOt:NTER, SL'II!MER)
press). Propositions are numbered and are listed 6 (EARLY, SUMMER)

7 (TIME: IN, SUMMER, 1969)
consecutively according to the order in 'which
their predicates appear in the English text. 8 (SOON, 9)
9 (AFTER, 4, 16)
Each proposition is enclosed by parentheses
10 (GROUP, STuDENT)
and contains a predicate (written first) plus 1 (BLACK, STt:DEl\T)
one or more arguments. Arguments are con- 12 (TEACH, SPEAKER, STUDENT)
cepts (printed in capital letters to distinguish 13 (LOCATION: AT, 12, CAL STATE
them from words) or other embedded proposi- COLLEGE)
14 (LOCATION: AT, CAL STATE COLLEGE,
tions, which are referred to by their number. LOS .~NGELES)
Thus, as is shown in Table 1, (CO:MPLAIN, 15 (IS A, STeDENT, BLACK PAl\THER)
STUDENT, 19) is shorthand for (COMPLAIN, 16 (BEGIN, 17)
STUDENT, (HARASS, POLICE, STUDENT)). The 17 (COl\lPLAI::f, STUDENT, 19)
semantic cases of the arguments have not been 18 (CONTINUOUS, 19)

19 (HARASS, POLICE, STUDENT)
indicated in Table 1 to keep the notation
simple, but they are obvious in most instances. 20 (AMONG, COMPLAINT)
Some comments about this representation 21 (MANY, COMPLAINT)

22 (COllI PLAIN, STUDENT, 23)
are in order. It is obviously very "surfacy" 23 (RECEIVE, STUDE}!T, TICKET)
and, in connection 'with that, not at ail un- 24 (~IAl\;Y, TICKET)
ambiguous. This is a long way from a precise, 25 (CAUSE, 23, 27)
logical formalism that 'would unambiguously 26 (smm, STUDENT)
27 (IN DANGER OF, 26, 28)
represent "the meaning" of the text, preferably 28 (LOSE, 26, LICE)lSE)
through some combination of elementary
semantic predicates. The problem is not only 29 (DURING, DISCUSSIOl\, 32)
30 (LENGTHY, DlSCUSSI01\)
that neither 'we nor anyone else has ever 31 (A1\D, STUDE1\T, SPEAKER)
devised an adequate formalism of this kind; 32 (REALIZE, 31, 34)
such a formalism 'would also be quite inap- 33 (ALL, STUDENT)
propriate for our purposes. If one wants to 34 (DRIVE, 33, AUTO)
develop a psychological processing model of 35 (HAVE, At:TO, SIGN)

36 (BLACK PANTHER, 51G1\)
comprehension, one should start with a non- 37 (GLUED, SIGN, BeMPER)
elaborated semantic representation that is
close to the surface structure. It is then the 38 (REPORT, SPEAKER, STL'DY)
39 (DO, SPEAKER, STUDY)
task of the processing model to elaborate this (P1;RPOSE, STUDY, 41)
40
representation in stages, just as a reader starts 41 (ASSESS, STUDY, 42, 43)
out with a conceptual analysis directly based 42 (TRt:E, 17)
on the surface structure but then constructs 43 (HEAR, 31; 4 Ll)
from this base more elaborate interpretations. 44 (OR, 45, 46)
45 (OF REALITY, VOICE)
To model some of these interpretative processes 46 (OF PARA:-lOIA, VOICE)
, is the goal of the present enterprise. These
processes change the inItial semantic repre- Note. Lines indicate sentence boundaries. Proposi-
tions are numbered for eace oi reference. Numbers as
sentation (e.g., by specifying coherel1ce rela- propositional arguments refer to the proposition with
tions of a certain kind, adding required in- that number.
- -----~
378 WALTER KINTSCH AKD TEUK A. VAK DIJK
ambiguity may remain. In general, neither a~

the original text nor a person's understanding tl
of it is completely S
This raises, of course, the question of what v
such a semantic representation represents. a
What is by (C01IPLAIK, STU- n
DENT) rather than The students complained?
r
C)'Cle 2: Buffer: P3,4, 5, 7: Input: P8-19. Capitalizing complain and it a concept t
~ 3 rather ~han a word explains in itself.
~5-1 There are two sets of reasons, however, 'xhy e
\\"~-8 11 model of comprehension must be based on a
semantic representation rather than directly
\
\
15~IO
- II
on English text. The notation clarifies ",hieh
aspects of the text are important (semantic
\ 12--13--14
and which are no~ (e.g., surface
\ 11--16 structure) for our purpose; it provides a unit
[9--18 that appears for studying com-
C)'Cle 3: Buffer: P4,9,15,19: Input: P20-28.
prehension processes, that is, the proposition
(see Kintsch & Keenan, for a demon-
~~\~~22~~~
stration that number of propositions in a
sentence rather than number of words deter-
mines reading time) ; finaily, this kind of nota-
23 -.:::::::-24 tion greatly simplifies the scoring of experi-
~ ........... 25 mental protocols in that it establishes fairly
26 -.:::::::-21 unambiguous equivalence classes within which
. . . . . . 28 paraphrases can be treated interchangeably.
Hmvever, there are other even more impor-
Cycle 4: Buffer: P4,9,i5, 19: Input P29-37.
tant reasons for the use of semantic renresenta-
4\\1~~31-32--29--30
tions. V\'ork that goes beyond the confmes of
the present article, for example, on the organi-
33--34 zation of text bases in terms of fact frames and
U! the generation of inferences from such knowl-
~~35 edge fragments, requires the development of
II suitable representations. For instance, CO:\1-
PLAIN is merely the name of a knowledge
Cycle 5: Buffer: P4, 19,36,37: Inpu!: P 38-46.
complex that specifies the normal uses of this
4 i9
-~361~~7
concept, including antecedents and conse-
quences. Inferences can be derived from this
I 31 38 knowledge for that the
\ ". I '339--40
I. students were probably unhappy or angry or
'-@+--42-- that they to someone, with the
'-43-44~45 text supplying the professor for that role. Once
46 a concept like cmIPLAIN is elaborated in that J
way, the semantic notation is anything but
Figure 1. The cyclical construction of the coherence vacuous. at present y,'e are not con-
graph for the propositions <P) shown in Table
cerned ,'lith this problem, the possibility for
ferences, and specifying a macrostructure) this kind of extension must open.
toward a deeper, less surface-dependent struc-
Cycling
ture. the deep, elaborated semantic
should stand at the end rather The first step in the processing model is to
than at the beginning of processing organize the inpu: into a co-
though even at that point some vagueness and herent graph, as illustrated in Figure 1. It was
TEXT CO~PREHENSION AND PRODUCTION 379
assumed for the purpose of this figure that

the text is processed sentence by sentence.
Since in our example the sentences are neither
very short nor very long, this is a reasonable
assumption and nothing more complicated is
needed. It means that ni, the number of
propositions processed per cycle, ranges be-
t,veeni and 12.
In Cycle 1 (see Figure 1), the buffer is
10
empty, and the propositions derived from the
first sentence are the input. P4 is selected as ~:I
the superordinate proposition because it is ~, \ : 2 - - - !3---14
the only proposition in the input set that is
directly related to the title: It shares with the
title the concept POLICE. a title,
propositions are organized in such a way that
~\\ \
"1
1
\\.\\\
h \ \22
\\ \ \
,18+
l.\\.\ 17~. 16
20.
the simplest graph results.) P2, P3, and PS 11' \ 2,
\\ ~ \ i
are directly subordinated to P4 because of the \\\23~: 24
shared argument E""Cou""TER; P6 and P7 are 1\\ '
subordinated to PS because of the repetition

II \ I 25
\1 \
of SLTMMER.
At this point, we must specify the short-term
\\261='
II
,I
27 28
1\
memory assumptions of the model. For the 131. 32---29--30
present illustration, the short-term memory '\ \.
capacity was set at s=4. Although this seems
like a reasonable value to try, there is no ~:~~
'i. . 34
particular justification for it. Empirical -~ 3B
methods to determine its adequacy will be , 39--40

described later. We also must specify some
strategy for the selection of the propositions
i 42--4,
that are to be maintained in the short-term

memory buffer from one cycle to the next. As
~43--"~::1
was mentioned before, a good strategy should
be biased in favor of superordinate and recent Figure 2. The complete coherence graph. (Numbers
represent propositions. The number of boxes represents
propositions. The "leading-edge strategy," the number of extra cycles required for processing.)
originally proposed by Kintsch and Vipond
(1978), does exactly that. It consists of the
Cycle 2 starts with the propositions carried
following scheme. Start with the top proposi-
over from Cycle 1; P9 is connected to P4, and
tion in Figure 1 and pick up all propositions P8 is connected to P9; the next connection is
along the graph's lower edge, as long as each
formed between P4 and P1S because of the
is more recent than the previous one (i.e., the
common argument BLACK PA:\,THER; P19 also
index numbers increase); next, go to the highest
connects to P4 because of POLICE; P10, P11,
level possible and pick propositions in order
P12, and P17 contain the argument STUDE""T
of their recency (i.e., highest numbers first);
and are therefore connected to PIS, which
stop whenever s propositions have been
first introduced that argument. The construc-
, selected. In Cycle 1, this means that the pro-
tion of the rest of the graph as well as the other
cess first selects P4 ,then P5 and P7) which are processing cycles should now be easy to follow.
along the leading edge of the graph; thereafter, The only problems arise in Cycle 5, where the
it returns to Level 2 (Levell does not contain propositions do not share a common
any other not-yet-selected propositions) and argument with any of the propositions in the
picks as a fourth and final proposition the buffer. This requires a long-term memory
most recent one from that level, that is, P3. search in this case; since some of the input
INTROOUCTlON· (00. S, ) \mE~i~ENT RESULTS DISCUSSION

j STUDY
( $
\ S, $»))
S£HI~G IS) lITE~ATURE' 1$) PURPOSE' (PURPOSE, \EX?ERIMENT, (FINO OUT, (CAUSE,
I
I ) STUDY
( $
I
I \
\
I S, S
(TIME, j
I
($)
I \
($) ($)
(lOC' S, Sl
Figure 3. Some components of the report schema. (Wavy brackets indicate alternatives. $ indicates
unspecified information.)
propositions refer back to P17 and P31, the an observational study, or some other type of
search leads to the reinstatement of these two research was performed. The introduction con-
propositions, as shown in Figure 1. Thus, the tains three kinds of information: the setting
model predicts that (for the parameter values of the report, literature references, and (c)
used) some processing difficulties should occur a statement of the purpose (hypothesis) of the
during the comprehension of the last sentence, report. The setting category may further be
having to do with determining the referents broken down into location and time ($ signs
for "we" and "their charges." indicate unspecified information). The litera-
The coherence graph arrived at by means of ture category is neglected here because it is
the processes illustrated in Figure 1 is shown in not instantiated in the present example. The
Figure 2. The important aspects of Figure 2 purpose category states that the purpose of
are that the graph is indeed connected, that is, the experiment is to find out whether some
the text is coherent, and that some proposi- causal relation holds. The nature of this
tions participated in more than one cycle. causal relation can then be further elaborated.
This extra processing is indicated in Figure 2
by the number of boxes in which each proposi- illaero-operations
tion is enclosed: P4 is enclosed by four
boxes, meaning that it was maintained during In forming the macrostructure, micro pro-
four processing cycles after its own input cycle. positions are either deleted, generalized, re-
Other propositions are enclosed by three, two, placed by a construction, or carried over un-
one, or zero boxes in a similar manner. This changed. Propositions that are not generalized
information is crucial for the model because it are deleted if they are irrelevant; if they are
determines the likelihood that each proposi- relevant, they may become macropropositions.
tion will be reproduced. Whether propositions that are generalized
become macropropositions depends on the
Schema relevance of the generalizations: Those that
are relevant are included in the macrostructure
Figure 3 shows a fragment of the report with a higher probability than those that are
schema. Since only the first paragraph of a not relevant. Thus, the first macro-operation
report will be analyzed, only information in is to form all generalizations of the micro-
the schema relevant to the introduction of a propositions. Since the present theory lacks a
report is shown. Reports are conventionally formal inference component, intuition must
organized into introduction, method, results, once again be invoked to provide us with the
and discussion sections. The introduction required generalizations. Table 2 iists the
section must specify that an experiment, or generalizations that we have noted in the text
base of the Bumperstickers paragraph. They propositions are denoted by M in the table;
are printed below their respective micro- micropropositions that are determined to be
propositions and indicated by an "M." Thus, relevant by the schema and thus also function
PI (SERIES, ENCO"GNTER) is generalized to M1 as macro positions are denoted by MP.
(SOME, ENCOtJNTER). The reproduction probabilities for all these
In Figure 4, the report schema is applied to propositions are also derived in Table 2. Three
our text-base example in order to determine kinds of storage operations are distinguished.
which of the propositions are relevant and The operation S is applied every time a
which are not. The schema picks out the (micro-) proposition participates in one of the
generalized setting statements-that the whole cycles of Figure 1, as shown by the number of
episode took place in California, at a college, boxes in Figure 2. Different operators apply to
and in the sixties. Furthermore, it selects that macropropositions, depending on whether or
an experiment was done with the purpose of not they are relevant. If a macroproposition
finding out whether Black Panther bumper- is relevant, an opera tor M applies; if it is
stickers were the cause of students receiving traffic irrelevant, the operator is G. Thus, the repro-
tickets. Associated with the bumperstickers is duction probability for the irrelevant rr..acro-
the fact that the students had cars with the proposition some encmmters is determined by
signs Ott them. Associated with the tickets is G; but for the relevant in the sixties, it is
many tickets, and that the students complained determined by M. Some propositions can be
about police harassment. Once these proposi- stored either as micro- or macropropositions;
tions have been determined as relevant, a for example, the operator SM2 is applied to
stricter relevance criterion selects not all MP23 - S because P23 participates in one
generalized setting information, but only the processing cycle and M2 because M23 is
major setting (in California), and not all incorporated in two levels of the macro-
antecedents and consequences of the basic structure.
causal relationship, but only its major com- The reproduction probabilities shown in the
ponents (bumperstickers lead to tickets). Finally, last column of Table 2 are directly determined
at the top level of the macrostructure hier- by the third column of the table, with p, g,
archy, ali the information in the introduction and m being the probabilities that each applica-
is reduced to an experiment was done. Thus, tion of S, G, and M, respectively, results in a
some propositions appear at only one level of successful reproduction of that proposition.
the hierarchy, others at two levels, and one It is assumed here that Sand M are statistically
at all three levels. Each time a proposition is independent.
selected at a particular level of the macro-
structure, the likelihood that it will later be Productiott
recalied increases.
The results of both the micro- and macro- Output protocols generated by the model
processes are shown in Table 2. This table are illustrated in Tables 3 and 4. Strictly
contains the 46 micropropositions shown in speaking, only the reproduced text bases
Table 1 and their generalizations. Macro- shown in these tables are generated by the
iNTRODUCTiON
I
~RIME~TI
IlOC' IN,CAUflIUiAI (PURPIlSE. EXP. (ASSESS. (CAUSE. (8LACIPAIfIHER. 8u.mSlIClERl (RfCEI'IE. STUDENT. Tlcms})))
! I /
ilOC· AT. COLlEGE) (HAVE. STUDENT. AUTO) iCO~'lAIN.

SiUom.
ITI~E' 1M. SIXTIES} (Hm. \Aw'TO • SIG~1 (MAUSS. PGUCI:. STlJilm)}
(STUDENT (~m, nCl£n
Figure 4. The macrostructure of the text base shown in Table 1. (Wavy brackets indicate alternatives.)
model; this output was simulated bv reuroduc- Preliminary Data Analyses

ing the propositions of Table 2 with th~ proba-
bilities indicated. particular values of p, g, The goal of the present section is to demon-
and m used here appear reasonable in light of strate how actual experimental Drotocols can
the data analyses to be reported in the next be analyzed with the methods de'veloped here.
section.) The results shown are from single We have used "Bumperstickers and the Cops"
simulation run, so that no selection is involved. in various experiments. The basic procedure
For easier reading, English sentences instead of of all these experiments was the same, and
semantic representations are shown. only the length of the retention interval varied.
The simulated protocols contain some meta- Subjects read the typewritten text at their own
statements and reconstructions, in addition speed. Thereafter, subjects were asked to
to the reproduced proposItIOnal content. recall the whole report, as well as they could,
Nothing in the model permits one to assign a not necessarily verbatim. They were urged to
probability value to either metastatements or keep trying, to go over their protocols several
reconstructions. All we can say is that such times, and to add anything that came to mind
statements occur, and we can identify them later. Most subjects worked for at least 20
as meta statements or reconstructions if they minutes at this task, some well over an hour.
are encountered in a protocol. The subjects typed their protocols into a
Tables 3 and 4 show that the model can computer-controlled screen and were shown
produce recall and summarization protocols how to use the computer to edit and change
that, with the addition of some metastatements their protocols. The computer recorded their
and reproductions, are probably indistinguish- writing times. After finishing the recall
able from actual protocols obtained in psycho- protocol, a subject was asked to write a sum-
logical experiments. mary of the report. The summary had to be
Table 2
M~emory Storage oj l'vficro- and Macropropositions
Proposition 5torage
number Proposition operation Reproduction probability
PI (SERIES, ENC) p
:VIl (smfE, E::'C) g
P2 (VIOL, ENe) P
P3 (BLOODY, EXC) 1 (1 - p)2
P4 (BETW, ENe, POL, BP) 1 - (1 - p)'
P5 (IK, ENC, SU:'vl) 1 - (1 - p)'
P6 (EARL Y j SUM) P
P7 (IK, SU:VI, 1969) 1 - (1 - p)2
M7 (IN, EPISODE, SIXTIES) m
P8 (SOON, S P
P9 (AFTER, 16) 53 1 - (1 - p)3
PI0 (GROUP STUD' 5 P
MIO (SO:v!E, 'STUD)' G g
Pl1 (BLACK, STUD; 5 P
P12 (TEACH, SPEAK, STUD) 5 p
:YI12 (HAVE, SPEAK; ST17D) G g
P13 (AT, 12, esc) S p
M13 (AT, EPISODE, COLLEGE) M m
PI4 (IN, esc; LA) 5 p
;\II14 (IN, EPISODE, C.\LIF) :VP 1 - (1 - m)'
PIS (IS A, STUD, BP; 5' 1-(1-p)3
P16 (BEGIl\, 17) S p
MP17 (COMPLAIN, STUD, 19) 5'.VI p+m-pm
P18 (COl\T, 19) S p
:YIP19 (HARASS, POL, STUD) S4:Y1 1 - (1 - p)' +m - [1 - (1 - p)4Jm
-
Table 2 (contl:nued)
Proposition Storage
number Proposition operation Reproduction probability
P20 (A11ONG, eml?) S p

P21 (:>lA:SY, CO:>lP) S p
P22 (COMPLAIN, STUD, 23) S p
MP23 (RECEIVE, STL'"D, TICK) S:W p + 1 - (1 - m)2 - P=l - ~1 - 1,")2J
MP24 (MANY, TICK) S:VI P -!- m - pm
P25 (CAUSE 23 27) S p
P26 (SOME, 'ST;D) S p
P27 (IX DA"GER, 26, 28) S p
P28 (LOSE, 26, LIC) S p
P29 (DURlNG, DISC) S p

P30 (LENGTHY, DISC) S p
P31 (AXD, STGD, SPEAK) S p
P32 (REALIZE, 31, 34) S p
P3:~ (ALL~ sTun) 5 p
P34 (DRIVE, 33, AUTO) S p
M34 (HeWE, STUD, Al'TO) ~I m
MP35 (HAVE, ACTO/STUD, SIG:-.r) SM P + m - pm
MP36 (BP, SIGN) S2M2 1 - (1 - -r - (1 - m)2
- [1 - - p)'] X [1 - (1 - m)2J
P37 (GLUED, SIGN, BU;>lP) S P
M37 (ON, SIG", BU~IP) ~F 1 - (1 - m)2
P38 (REPORT, SPEAK, STUDY) S p

MP39 (DO, SPEAK, STUDY) SIVI" p+l (1 - m)3 - p[l (1 - m)3]
:VIP40 (PURPOSE, STUDY, 41) 5M2 p+l (1 - m)2 - pil (1 - rn)2]
.:vIP41 (ASSESS, Sn;DY, 42, 43) SIV{2 p+1 (1 - m)2 - pEl (1 - m)21
:VIP42 (TRUE, 17)/(CAUSE, 36, 23) 5M' p+l (1 - mi' - PCl (1 - m)']
P43 (HEAR, 31, 44) S p
P44 (OR, 45, 46) 5 p
P45 (OF REALITY, VOICE) S p
P46 (OF PARANOIA, VOICE) 5 p
Note. Lines show sentence boundaries. P indicates a microproposition, M indicates a macroproposition, and
;vrp is both a micro- and macroproposition. 5 is the storage operawr that is applied to microproposirions, G
is applied to irrelevant macropropositions, and M is applied to relevant macropropositions.
between 60 and 80 words long to facilitate over a 3-month penoo IS hardly remarkable.
between-subjects comparisons. The compu~er But note that the overall length of the free-
indicated the number of words written to avoid recall protocols declined only by about a
the distraction of repeated 'word counts. All third or quarter with Obviously, sub-
subjects were undergraduates from the Uni- jects did very well even after 3 months,
versity of Colorado and were fulfilling a at least in terms of the number of words
course requirement. produced. If we restrict ourselves to the first
A group of 31 subjects was tested immedi- paragraph of the text and analyze proposi-
ately after reading the report. Another group tionally those portions of the recail protocols
of 32 subjects was tested after a 1-month referring to that paragraph, ,,'e arrive at the
retention interval, and 24 further subjects same conclusions. Although there is a (sta-
were tested after 3 months. tistically significant) decline over delay in the
Some statistics of the protocols thus ob- number of propositions in the protocols that
tained are shown in Table 5. Since the sum- could be assigned to the introduction category,
maries were restricted in length, the fact that the decline is rather moderate.
their average length did not change appreciably While ;:he changed relatively little
Table 3
A Simulated Recall Protocol
Reproduced text base

PI, P4, M7, P9, P13, P17, MP19, MP3S, MP36, M37, MP39, MP40, MP42
Text base converted to English, with reconstructions italicized

In the sixties (M7), there was a series (P1) of riots between the police and the Black Panthers (P4). The
police did not like the Black Panthers (normal consequence of 4). After that (P9), students who were enrolled
(normal component of 13) at the California State College (PI3) complained (P17) that the police were har-
assing them (PI9). They were stopped and had to show their driver's license (normal component of 23). The
police gave them trouble (PI9) because they had (MP35) Black Panther (MP36) signs on their
bumpers (M37). The police discriminated against students and violated their civ-it rights (particularization
of 42). They did an experiment (MP39) for this purpose (MP40).
Note. The parameter values used were p = .10, g = .20, and 111 = .40. P, M, and MP identify propositions
from Table 2.
in quantity, its qualitative composition changed For the summaries, the picture is similar,
a great deal. In the recall protocols, the propor- though the changes are not quite as pro-
tion of reproductive propositions in the total nounced, as shown in Figure 6. The contribu-
output declined from 72% to 48%, while at tion of the three response categories to the
the same time, reconstructions almost doubled total output still differs significantly, however,
their contribution, and metastatements quad- x2 (4) = 16.90, P = .002. ~ot surprisingly,
rupled theirs.4 These results are shown in summaries are generally less reconstructive
Figure 5; they are highly significant sta- than recall. ~ote that the reproductive com-
tistically, x 2 (4) = 56.43. As less material from ponent includes reproductions of both micro-
the text is reproduced, proportionately more and macropropositions!
material is added the production processes We have scored every statement that could
themselves. Interestingly, these production be a reproduction of either a macro- or micro-
processes operated quite accurately in the proposition as if it actually were. This is, of
present case, even after 3 months: Errors, that course, not necessary: It is not only possible
but quite probable that some of the statements
is, wrong guesses or far-fetched confabulations,
that we regarded as reproductive in origin
occurred with negligible frequency at all
are reconstructions that accidentally duplicate
delay intervals (1%, 0%, and 1% for zero, actual text propositions. Thus, our estimates of
I-month, and 3-month delays, respectively). the reproductive components of recall and
summarization protocols are in fact upper
Table 4 bounds.
A Simulated Summary
Protocol Analyses
Reproduced text base
In order to obtain the data reported above,
M14, .M37, MP39, MP40 all protocols had to be analyzed proposi-
Text base converted to English,
with metastatements and recon-
structions italicized • Metastatements seem to be most probable when
one has little else to say: Children use such statements
This report was abm.t (metastatement) an experiment most frequently when they talk about something they
done (MP39) in California (l"n4~. It was done do not understand (e.g., Poulsen et a1., in press), and
(MP39) in order to (MP40) protect students from the speech of aphasics is typically characterized by such
discrimination (normal consequence of 41). semanticaily empty comments (e.g., Luria & Tsvet-
kova, 1968). In boL~ cases, the function of these
Note. The parameter values used were p = .01, statements mav be to maintain the communicative
g = .05, and m = .30. M and MP identify proposi- relationship between speaker and listener. Thus, their
tions from Table 2. function is a pragmatic one.
-pz
Table 5
Average Number of Words and Propositions of
Recall and Summarization Protocols as a
Function of the Retention Interval
No. words No. propositions

Protocol and (total (lst paragraph
retention interval protocol) only)
Recall
Immediate 363 17.3
1 month 225 13.2
3 months 268 14.8
Summaries
Immediate 75 8.4
month 76 7.4
3 months 72 6.2
(d) errors and unclassifiable statements (not

represented in the present protocol).
Note that the summary is, for the most part,
Retention Interval
a reduced version of the recall protocol; on
Figure 5. Proportion of reproductions, reconstructions, the other hand, the second proposition of the
and metastatements in the recall protocols for three
retention intervals. summary that specifies the major setting did
not appear in the recall. Clearly, the recall
tionally. An example of how this kind of

analysis works in detail is presented in Tables 100
6 and 7, for which one subject from the im-
90
mediate condition was picked at random. The
recall protocol and the summary of this sub- 80
ject are shown, together with the propositional
representations of these texts. In construct- III
c 70 ~tions
0
ing this representation, an effort was made to
bring out any overlap between the proposi- ~
Q.
60
tional representation of the original text and e
a.. 50
its macrostructure (see Table 2) and the pro-
tocol. In other words, if the protocol contained '0
a statement that was semantically (not 'E 40~
:t
Q)
necessarily lexically) equivalent to a text

u... Reconstructio~...o
proposition, this text proposition was used in ~
0--_---0-..--
the analysis. This was done to simplify the
analysis; as mentioned before, a fine-grained i I
analysis distinguishing optional meaning-pre-
serving transformations would be possible.
Each protocol proposition was then assigned
to one of four response categories: (a) repro-
t IMMEDIATE ONE
MONTH
THREE
MONTH
l
ductions (the index number of the reproduced Retention Interval
proposition is indicated), (b) reconstructions Figure 6. Proportion of reproductions, reconstructions,
(the source of the reconstruction is noted and metastatements in the summaries for three reten-
wherever possible), (c) metastatements, and tion intervals.
386 WALTER KINTSCH AXD TEL'X A. VAX DIJK
Table 6
Recall Protocol and Its A nalysis From a Randomly Selected Subject
Recall protocol
The report started with telling about the Black Panther movement in the 1960s. A professor was telling
about some of his stUdents who were in the movement. They were complaining about the traffic tickets they
were getting from the police. They felt it was just because they were members of the movement. The pro·
fessor decided to do an experiment to find whether or not the bumperstickers on the cars had anything to
do with it.
Proposition
nU\TIber Proposition Classification
(START WITH, REPORT, 2) Schematic metastatement

2 (TELL, REPORT, BLACK PA~THER) Sernandc metastatement
3 (TIME: IN, BLACK PA»THER, SIXTIES) M7
4 (TELL, PROFESSOR, STl:DEKT) Semantic metastatement

5 (SOME, STUDENT) MiO
6 (HAVE, PROFESSOR, ST<.;DENT) .\112
7 (IS A, STUDENT, BLACK PANTHER) PIS (lexical transformation)
8 (C01!PLADf, STUDEKT, 9) MP17

9 (RECEIYE, STUDE~T, TICKET, POLlCE) MP23 (specification of component)
10 (FEEL, STUDENT) Specification of condition of MP17

11 (CAUSE, '7, 9) ::VIP42
12 (DECIDE, PROFESSOR, 13) Specification of condition of MP39

13 (DO, PROFESSOR, EXPERIMEKT) MP39
14 (PURPOSE, 13, 15) YIP40
15 (ASSESS, 16) ~1P41
16 (CArSE, 17, 9) ::v!P42
17 (OK, BUMPER, SIGN) .:vI 37
18 (HAVE, AUTO, SIGN) .:vIP35 I
Note. Only the portion referring to the first paragraph of the text is shown. Lines indicate sentence bound·
aries. P, lVI, and :vIP identify propositions from Table 2.
protocol does not reflect only what is stored in the three retention intervals. They do not
memory. differ significantly [maximum t(54) = 1.99J,
and the intercorrelations are .92 for immediate/
Statistical Analyses I-month, .87 for and .94
for l-month/3-month recall. These high cor-
"'hile metastatements anci reconstructions relations are not very instructive, however,
cannot be analyzed further, the reproductions because they hide some subtle but consistent
permit a more powerful test of the model. changes in the probabilities. A
If one looks at [he pattern of frequencies more powerful analysis that brings out these
',,-ith which both micro- and macroproposi· cnanges is by the model.
tions (i.e., Table 2) are reproduced in the The predicted frequencies for each micro-
various recall conditions, one notes that and macroproposition are shown in Table 3
al though the three delay condi tions differ as a function of the p, g, and 1n.
among each other significamly [minimum These equations are, oi course, for a special
t(54) = 3.67J, they are highly correlated: case of the model, assuming chunking by
Immediate recall correlates .88 with I-month sentences, a buffer capacity of four proposi-
recall and .77 \yith 3-month recall; the two tions, and the leading-edge strategy. It is,
delayed recalls correlate .97 betiyeen each worthwhile to see huw well this
other. The summaries are even more alike at special case of the model can fit the data ob-
Table 7
Summary and Its Analysis From the Same Randomly Selected Subject as in Table 6
Summary
This report was about an experiment in the Los Angeles area concerning whether police were giving out
tickets to members of the Black Panther Party just because of their association with the party.
Proposition
number Proposition Classification
(IS AB01:T, REPORT, EXPERIMEKT) Semantic rnetastatement

2 (LaC: I"', 1, LOS ANGELES) Y114
3 (PURPOSE, EXPERI~lENT, 4) MP40
4 (CAUSE,S, 6) MP42
5 (IS A, SOMEOKE, BLACK PANTHER) PIS
6 (GIVE, POLlCE, TICKET, SOc,lEOKE) YIP23 (lexical transformation)
Note. Only the portion referring to the first paragraph oi the text is shown. P, M, and::vIP identify proposi-
tions from TabJe 2.
tained from the first paragraph of Bumper- could not be adequately fit by the model in
stickers when the three statistical parameters its present form, though even there the
are estimated from the data. This was done numerical value of the obtained goodness-of-
by means of the STEPIT program (Chandler, fit statistic is by no means excessive. (The
1965), which for a given set of experimental problems are caused mostly by two data
frequencies, searches for those parameler points: Subjects produced Pll and P33 with
values that minimize the chi-squares between greater frequency than the model could account
predicted and obtained values. Estimates for; this did not happen in any of the other
obtained for six separate data sets data sets.)
retention intervals for both recall and sum- Perhaps more important than the goodness-
maries) are shown in Table 8. We first note of-fit statistics is the fact that the pattern of
that the goodness of ht of the special case of parameter estimates obtained is a sensible
the model under consideration here was and instructive one: In immediate recall, the
good in five of the six cases, with the obtained probability of reproducing micropropositions
values being s;ightly less than the expected (p) is about five times as high as after 3
values. 5 Only the imnlediate recall data months; the probability of reproducing ir-
relevant generalizations (g) decreases by a
Table 8 similar factor; however, the likelihood that
Parameter Estimates and Chi-Square macroDroDositions are reproduced (m) changes
Goodness-oj-Fit Values very (ittl~ with delay. In spite of the high
correlations observed between the immediate
Protocol and recall and 3-month delayed recall, the forget-
retention interval p m (j x'(SO) ting rates for micropropositions appear to be
about four times greater than that for macro-
Recall
propositions! In the summaries, the same kind
Immediate .099 .391 .198 88.34*
1 momh .032 .309 .097 46.08 of changes occur, but they are rather less pro-
3 months .018 .321 .041 37.41 nounced. Indeed, the summary data from all
three delay intervals could be pooled and fitted
'", Summaries
Immediate .015 .276 . 026 30 . 1
1 month .024 .196 .040 46.46 Ii Increasing (or decreasing) all parameter values in
3 months .001 .164 .024 30.57 Table 8 bv 33% vields significant in-
creases in the chi-s~uares. On the hand, the in-
"Vole. The average standard errors are .003 for 15, creases in the chi-gq~ares that result from 10% changes
.023 for m, and .025 for {j, respectively. in the parameter values are aot sufficiently large to be
*p < .001. statisticaliy significant.
by one set of parameter values with little of the protocols consisted of reproductions, fa
effect on the goodness of fit. Furthermore, with the remainder being mostly reconstruc- et
recall after 3 months looks remarkably like tions and a few metastatemems. With delay, dj
the summaries in terms of the parameter the reproductive portion of these protocols T
estimates obtained here. decreased but only slightly. ::\10st of the re- al
These observations can be sharpened con- productions consisted of macropropositions, w
siderably through the use of a statistical with the reproduction probability for macro- S(
technique originally developed by Keyman propositions being about 12 times that for
(1949). Keyman showed that if one fits a micropropositions. For recall, there were much
model with some set of r parameters, yielding more dramatic changes with delay: For im- o
a X2(n - r), and then fits the same data set mediate reproductions were three times 1=
again but with a restricted parameter set of as frequent as reconstructions, while their s
size q, where q < 1, and obtains a X2(n - q), contributions are nearly equal after 3 months. I
X2 (n - r) - X2(n - q) also has the chi-square Furthermore, the composition of the repro- s
distribution with r- q degrees of freedom under duced information changes very much: Macro- c
the hypothesis that the r - q extra parameters propositions are about four times as important
are merely capitalizing on chance variations as micropropositions in immediate recall;
in the data, and that the restricted model fits however, that ratio increases with delay, so
the data as well as the model with more after 3 months, recall protocols are very much
parameters. Thus, in our case, one can ask, like summaries in that respect.
for instance, whether all six sets of recall and It is interesting to contrast these results
summarization data can be fitted with one set with data obtained in a different experiment.
of the parameters p, m, and g, or whether Thirty subjects read only the nrst paragraph
separate parameter values for each of the six of Bumperstickers and recalled it in writing
conditions are required. The answer is clearly immediately. The macroprocesses described by
that the data cannot be fitted with one single the present model are clearly inappropriate
set of parameters, with the chi-square for the for these subjects: There is no way these
improvement obtained by separate fits, X2(15) subjects can construct a macrostructure as
= 272.38. Similarly, there is no single set of outlined above, since they do not even know
parameter values that will fit the three recall that what they are reading is part of a research
conditions jointly, x2(6) = 100.2. However, report! Hence, one would expect the pattern
single values of p, m, and g will do almost as of recall to be distinctly different from that
weH as three separate sets for the summary obtained above; the model should not fit the
data. The improvement from separate sets is data as well as before, and the best-fitting
still significant, x2(6) = 22.79,.01 < p < .001, parameter estimates should not discriminate
but much smaller in absolute terms (this between macro- and micropropositions.
corresponds to a normal deviate of 3.1, while Furthermore, in line with earlier observations
the previous two chi-squares translate into z (e.g., Kintsch et al., 1975), one would expect
scores of 16.9 and 10.6, respectively). Indeed, most of the responses to be reproductive. This
it is almost possible to fit jointly all three sets is precisely what happened: 87% of the pro-
of summaries and the 3-month delay recall tocols consisted of reproductions, 10% re-
data, though the resulting chi-square for the constructions, 2% meta statements, and 1%
significance of the improvement is again errors. The total number of propositions pro.-
significant statistically, X2(9) =35.77, p< .001. duced was only slightly greater (19.9) than
The estimates that were obtained when all immediate recall of the whole text (17.3; see
three sets of summaries were fitted with Table 5). The correlation with immediate
a single set of parameters were p = .018, recall of the whole text was down to r = .55;
m = .219, and g = .032; when the 3-month with the summary data, there was even less
recall data were included, these values changed correlation, for example, r = .32, for the
to p = .018, m = .243, and g = .034. 3-month delayed summary. The model fit
. The summary data can thus be described in was poor, with a minimum chi-square of 163.72
terms of the model by saying that over 70% for 50 df. Most interesting, however, is the
,....
fact that the estimates for the model param- sketched at the beginning of this article. There,
eters p, m, and g that were obtained were not we limited our discussion to points that were
differentiated: p = .33,m = .30, g = .36. directly relevant to our processing model.
Thus, micro- and macroinformation is treated The model, as developed here, is, however, not
alike here-very much unlike the situation a comprehensive one, and it now becomes
where subjects read a long text and engage in necessary to extend the discussion of semantic
sch ema-con trolled macro-operations! structures beyond the self-imposed limits v,'e
'Vhile the tests of the model reported here have observed so far. This is necessary for two
certainly are encouraging, they fall far short reasons: (a) to indicate the directions that
of a serious empirical evaluation. For this further elaborations of the model might take
purpose, one would have to work ";"ith at least and (b) to place what we have so far into a
several different texts and different schemata. broader context.
Furthermore, in order to obtain reasonably
stable frequency estimates, many more proto- Coherence
cols should be analyzed. Finally, the general
model must be evaluated, not merely a special Sentences are assigned meaning and refer-
case that involves some quite arbitrary assump- ence not only on the basis of the meaning and
tions. 6 However, once an appropriate data reference of their constituent components but
set is available. systematic tests of these also relative to the of other,
assumptions become feasible. Should the mostly previous, sentences. Thus, each sen-
model pass these tests satisfactorily, it could tence or clause is subject to contextual inter-
become a useful tool in the investigation of pretation. The cognitive correlate of this ob-
substantive questions about text comprehen- servation is that language user needs to
sion and production. relate new incoming information to the in-
formation he or she already has, either from
General Discussion the text, the context, or from the language
user's general knowledge system. \~le have
A processing model has been presented here
modeled this process in terms of a coherent
for several aspects of text comprehension and
propositional network: The referential identity
production. Specifically, we have been con-
of concepts was taken as the basis of the co-
cerned with problems of text cohesion and gist
formation as components of comprehension
processes and the generation of recall and sum- 6 We have used the leading-edge strategy here be-
marization protocols as output processes. Ac- cause it was used previously by Kintsch and Vipond
cording to this model, coherent text bases are (1978). In comparison, for the data from the immedi-
constructed by a process operating in cycles ate-summary condition, a recency:'only strategy in-
creases the minimum chi-square by 43%, while a
and constrained by iimitations of working levels-plus-primacy strategy yields an increase of 23%
memory. Macroprocesses are described that (both are highly significant). A random selection
reduce the information in a text base through strategy can be rejected unequivocally, x'(34) = 113.77.
deletion and various types of inferences to its Of course, other plausible strategies must be investi-
gated, and it is quite possible that a better strategy
gist. These processes are under the control of than the leading· edge strategy can be found. However,
the comprehender's goal, which is formalized in a systematic investigation would be pointless with only
the present model as a schema. The macro- a single text sample.
processes described here are predictable only Similar considerations apply to the determination of
if the controlling schema can be made explicit. the buffer capacity, s. The statistics reported above were
calculated for an arbitrarilv chosen value of s = 4.
Production is conceived both as a process of For s = 10, the minim~m chi-square increases
r reproducing stored text information, 'which drastically, and the model no longer can fit the data.
includes the macrostructure of the text, and On L'1e other hand, for s = 1, 2, or 3, the minimum
as a process of construction: Plausible in- chi-squares are only slightly larger than those reported
in Table 8 (though many more reinstatement searches
erences can be generated through the inverse are required during comprehension for the smaller
application of the macro-operators. values of s). A more precise determination of the buffer
This processing model 'was derived from the capacity must be postponed untii a broader data base
i onsiderations about semantic structures becomes available.
herence relationships among the propositions ential relationships are frequent concomitant
of a text base. While referential coherence is properties of coherence relations between
undoubtedly important psychologically, there propositions of a text base, such relations are
are other considerations that need to be ex- not sufficient and sometimes not even neces-
plica ted in a comprehensive theory of text sary. If we are talking about John, for instance,
processing. we may say that he was born in London, that
A number of experimental studies, follow- he was ill yesterday, or that he now smokes a
ing Haviland and Clark (1974), demonstrate pipe. In that case, the referential individual
very nicely the role that referential coherence remains identical, but we would hardly say
among sentences plays in comprehension. that a discourse mentioning these facts about
However, the theoretical orientation of these him 'would by this condition alone be coherent.
studies is somewhat broader than the one we It is essential that the facts themselves be
have adopted above. It derives from the work related. Since John may well be a participant
of the linguists of the Prague School (e.g., in such facts, referential identity may be a
Sgall & Hajicova, 1977) and others (e.g., normal feature of coherence as it is established
Halliday, 1967) who are concerned with the between the propositions taken as denoting
given-new articulation of sentences, as it is facts.
described in terms of topic-comment and
presupposition-assertion relations. Within psy- Facts
chology, this viewpoint has been expressed
The present processing model must therefore
most clearly by Clark (e.g., Clark, 1977).
be extended by introducing the notion of
Clark's treatment of the role of the "given-new
"facts." Specifically, we must show how the
contract" in comprehension is quite consistent
propositions of a text base are organized into
with the processing model offered here. Refer-
higher-order fact units. We shall discuss here
ential coherence has an important function in
some of the linguistic theory that underlies
Clark's system, but his notions are not limited
these notions.
to it, pointing the way to possible extensions
The propositions of a text base are con-
of the present model.
nected if the facts denoted by them are related.
Of the t,yO kinds of semantic relations that
These relations bet\veen facts in some possible
exist among propositions, extensional and
world (or in related possible worlds) are
intensional ones, we have so far restricted
typically of a conditional nature, where the
ourselves to the former. That is, we have
conditional relation may range from possi-
related propositions on the basis of their
bility, compatibility, or enablement via proba-
reference rather than their meaning. In the
bility to various kinds of necessity. Thus, one
predicate logical notation used here, this kind
fact may be a possible, likely, or necessary
of referential relation is indicated by common
condition of another fact; or a fact may be a
arguments in the respective propositions. In
possible, or necessary consequence of
natural surface structures, referential identity
another fact. These connection relations be-
is most clearly expressed pronouns, other
tween propositions in a coherent text base are
"pro" forms, and definite articles. 1\ote that
typically expressed by connectives such as
referential identity is not limited to individuals,
" and," "but," "because," "although," "yet,
such as discrete objects, but may also be based
"then," "next," and so on.
on related sets of propositions specifying prop-
Thus, facts are joined through temporal
erties, relations, and events, or states of affairs.
ordering or presuppositional relationships to
Identical referents need not be referred to with
form more units of information. At
the same expressions: Vve may use the expres-
the same time, facts have their own internal
sions "my brother," "he," "this man," or
structure that is reHected in the case structure
"this teacher" to refer to the same individual, of clauses and sentences. This framelike
depending on which property of the individual structure is not unlike that of propositions,
is relevant in the respective sentences of the except that it is more complex, with slots that
discourse. assign specifiC roles to concepts, propositions,
Although referential identity or other refer- and other facts. The usual account of sud
......
TEXT COMPREHENSION AND PRODuCTION 391
structures at the level of sentence meaning is tween propositions in a discourse in more

given in terms of the following: functional terms, at the or rhetorical
levels of description. one proposltlon
1. state/event/action (predicate)
2. participants involved may be taken as a "specification," a "correc-
" an "explanation," or a "generalization"
a.
h. of another (=Vleyer, 1975; van Dijk,
c. 1977c, 1977d). However, \ve do not yet have an
d. adequate theory of such functional rela~ions.
o. instramcnt(s) The present model was not extended beyond
and so on
the processes involved in referential coherence
3. clrcumstantials of texts because we do not feel that the
a. time problems involved are sufficiently wel! under-
b. place stood. However, by limiting the processing
c. direction
d. origin model to coherence in terms of argument
c. goal repetition, we are neglecting the important
and so on role that fact relationships play in comprehen-
4. properties of 1, 2, and 3 sion. For instance, TI10St of the inferences that
5. modes, moods, modalities. occur during comprehension probably derive
from the organization of the text base into facts
Without wanting to be more explicit or com-
that are matched up with knowledge frames
we surmise that a fact unit would feature
stored in long-term memory, thus providing
categories such as those mentioned above. The
information missing in the text base by a pro-
syntactic and lexical structure of the clause
cess of or "default assign-
permits a first provisional assignment of such
ments," in Minsky's terminology (Minsky,
a structure. The interpretation of subsequent
1975). Likewise, the fate of inconsistent in-
clauses and sentences might correct this
formation might be of interest in a text, for
hypothesis.
example, whether it would be disregarded in
In other words, we provisionally assume that
recall or, on the contrary, be particularly well
the case structure of sentences may have pro-
reported because of the extra processing it
cessing relevance in the construction of complex
receives. Finally, by looking at the organiza-
units of information. The propositions
tion of a text base in terms of fact relations,
the role of characterizing the various proper-
a level of representation is obtained that cor-
ties of such a structure, for specifying
responds to the "knowledge structures" of
which agents are involved, what their respec-
Schank and Abelson (1977). Schank and
tive properties are, hO\y they are related,
Abelson have argued convincingly that such
and so on. In fact, language users are able to
a level of representation is necessary in order to
derive the constituent propositions from once-
account for full comprehension. Kote, how-
established fact frames.
ever, that in their system, as in the present
We are now in a position to better under-
one, that level of representation is built on
stand how the content of the proposition in a
earlier levels (their microscopic and macro-
text base is related, namely, as various kinds
scopic conceptual dependency representations,
of relations between the respective individuals,
which correspond loosely to our microstructure
properties, and so on characterizing the con-
and macrostructure).
nected facts. The presupposition-assertion
structure of the sequence shows how new facts Limitations and Future Directions
, are added to the old facts; more specifically,
. each sentence may pick out one particular Other simplifying assumptions made in tne
aspect, for example, an individual or property course of developing this model should like-
already introduced before, and assign it further wise be examined. Consider first the relation-
properties or introduce new individuals reI a ted ship between a linguistic text grammar and a
to individuals introduced before (see Clark, text base actually constructed by a reader or
1977; van Dijk, 1977d). listener. In general, the reader's text base will
It is possible to characterize relations be- be different from the one abstractly specifIed
392 "'ALTER KINTSCH AND TEUN A. VAN DIJK
by a text grammar. The operations and strate- of a full-fledged comprehension model, such
gies involved in discourse understanding are as the text-to-text-base translation processes
not identical to the theoretical rules and con- and the organization of the text base in terms
straints used to generate coherent text bases. of facts.
One of the reasons for these differences lies in There are a number of details of the model
the resource and memory limitations of cogni- at its present stage of development that could
tive processes. in a specific not be seriously defended vis-a.-vis some
pragmatic and communicative context, a theoretical alternatives. For instance, we have
user IS able to establish textual co- assumed a limited-capacity, short-term buffer
herence even in the absence of strict, formal in the manner of Atkinson and Shiffrin (1968).
coherence. He or she may accept discourses It may be possible to achieye much the same
that are not properly coherent but that are effect in a quite different way, for example, by
nevertheless appropriate from a pragmatic or assuming a decaying, spreading activa-
communicative point of view. Yet, in the tion network, after Collins and Loftus (1975)
present article, we have neglected those and .Anderson Similarly, if one grants
properties of natural language understanding us the short-term buffer, we can do no more
and adopted the \vorking hypothesis that full than guess at the strategy used to select
understanding of a discourse is possible at the propositions to be maintained in that buffer.
semantic level alone. In future a dis- That is not a serious problem, however, be-
course must be taken as a complex structure of cause it is open to empirical resolution with
speech acts in a pragmatic and social context, presently available methods, as outlined in
which requires what may be called pragmatic this artide. Indeed, the main value of the
comprehension (see van Dijk, on these present model is that it is possible to formulate
aspects of verbal communication). and test empirically various specific hypotheses
Another processing assumption that may about comprehension processes. Some rather
have to be reevaluated concerns our decision to promising initial results have been reported
describe the two aspects of comprehension here as illustrations of the kind of questions
that the present model deals with-the co- that can be raised with the help of the model.
herence graph and the macrostructure-as if Systematic investigations using these methods
they were separate processes. Possible inter- must, of course, involve many more texts and
actions have been neglected because each of several different text types and will require an
these processes presents its own set of problems extended experimental effort. In addition,
that are best studied in isolation, without the questions not raised here at all will have to be
added complexity of interactive processes. investigated, such as the possible separation of
The experimental results presented above are reproduction probabilities into storage and
encouraging in this regard, in that they indicate retrieval components.
that an experimental separation of these sub- To understand the contribution of this
processes may be feasible: It appears that re- model, it is important to be clear about its
call of long tests after substantial delays, as limitations. The most obvious limitation of the
,yell as summarization, mostly reflects the model is that at both input and output it
macro-operations; while for immediate recall deals only with semantic representations, not
of short paragraphs, the macroprocesses play with the text itself. Another restriction is
only a minor role, and the recall pattern ap- perhaps even more serious. The model stops
pears to be determined by more local con- short of full comprehension because it does not
siderations. In any case, trying to decompose deal \yith the organization of the propositional
the overly complex comprehension problem text base into facts. Processing proceeds only
into subproblems that may be more accessible as far as the coherence graph and the macro-
to experimental investigation seems to be a structure; the component that interprets
promising approach. At some later stage, how- clusters of propositions as facts, as outlined
ever, it appears imperative to analyze above, is as yet missing. General "wrld knowl-
theoretically the interaction between these t,yO edge organized in terms of frames must playa
subsystems and, indeed, the other components crucial role in this process, perhaps in the.
:1 manner of Charniak (1977). Note that this Chandler, P. J. Subroutine STEPIT: An algorithm
component would also provide a basis for a that finds the values the parameters which lninimize
s a given continuous Bloomington: Indiana
s theory of inference, which is another missing University, Quantum Chemistry Program Exchange,
link from our model of macro-operations. 1965.
:1 In of these serious limitations, the Charniak, E. A framed PAINTING: The representation of
present model promises to be quite useful, even a common sense k:lOwledge fragment. Cogniti~e
:I
Science, 1977. 1, 335-394.
e at its present stage of development. As long as Clark, H. H. Inferences in comprehension. In D. La-
e "comprehension" is viewed as one undif- Berge & S. J. Samuels (Eds.), Basic processes in
r ferentiated process, as complex as "thinking reading, Hillsdale, N.J.: Erlbaum, 1977.
in general," it is simply impossible to formulate Colby, B. N. A partial grammar of Eskimo folktales.
American Anthropologist, 1973, 75, 645-662.
precise, researchable questions. The model
Collins, A. M., & Loftus, E. F. A spreading activation
opens up a wealth of interesting and significant theory of semantic processing. Psychological Revkd!,
research problems. 'Ve can ask more refined 1975, 82, 407-428.
questions about text memory. \Yhat about the Halliday, M. A. K. Xotes on transitivity and theme in
suggestion in the data reported above that gist English. Journal of Linguistics, 196-7, 3, 199-244.
Haviland, S. E., & Clark, H. H. \Vhat's new? Acquiring
and detailed information have different decay new information as a process in comprehension.
rates? Is the use of a short-term buffer in any J otwnal of Verbal Learning and Verbal Behavior,
way implicated in the persistent finding th,,:t 1974, 13, 512-521.
phonological coding during reading facilitates Hayes, J. R., Waterman, D. A., & Robinson, C. S.
Identifying the relevant aspects of a problem text.
comprehension (e.g., Perfetti & Lesgold,
Cognitive Science, 1977, 1, 297-313.
1978)? One can begin to take a new look at Heussenstam, F. K. Bumperstickers a~d the cops.
individual differences in comprehension: Can Transactions, 1971,8,32-33.
they be characterized in terms of buffer size, Hunt, E., Lunneborg, C., & Lewis, J. What does it
selection strategies, or macroprocesses? De- mean to be high verbal? Cognitire Psychology, 1975,
7, 194-227.
pending on the kind of problem a particular Jackson, ::VI. D., & McClelland,]. L. Sensory and cogni-
individual may have, very different educational tive determinants oi reading speed. Journal of Verbal
remedies ,yould be suggested! Similarly, vie Learn'ing and Ve'rbat Behavior, 1975, 14, 556-574.
may soon be able to replace the traditional J arvella, R. J. Syntactic processing of connected speech.
Journal of Verbal Learnittg and Verbal Behavior,
concept of readability with the question
1971, 10,409-416.
"What is readable for whom and whv?" and King, D. R. W., & Greeno, J. G. Invariance of inference
to design texts and teaching methods i~ such a times when information was presented in different
way that they are suited to the cognitive pro- linguistic formats. Memory & Cognition, 1974, 2,
cessing modes of particular target groups. Thus, 233-235.
Kintsch, W. The representation of mea.ning in memory.
the model to some extent makes up in promise Hillsdale, N.J. : Erlbaum, 1974.
for what it lacks in completeness. Kintsch, W. On comprehending stories. In M. A. Just
& P. Carpenter (Eds.), Cognitive processes itt compre-
References hen.sion. Hillsdale, N.]. : Eribaum, 1977.
Kbtsch, W., & Greene, E. The role of culture specific
Aaronson, D., & Scarborough, H. S. Performance schemata in the comprehension and recall of stories.
theories for sentence coding: Some quantitative Discourse Processes, 1978, 1, 1-13.
models. Journal of Verbal Learning and Verbal Be- Kintsch, W., & Keenan, J. ::VI. Reading rate as a func-
havior, 1977, 16, 277-303. tion of the number of propositions in the base struc-
Anderson, J. R. Language, memory a.nd thought. Hills- ture of sentences. Cogtlitire Psychology, 1973, 5,
dale, N.].: Erlbaum, 1977. 257-274.
Anderson, R. C., & Pichert, J. W. Recall of previously Kintsch, W., & Kozminsky, E. Summarizing stories
unrecallable information following a shift in per- after reading and listening. JlYtlrnaJ of Educationa.l
spective. Journal of Verbal Learning and Verbal Be- Psychology, 1977, 69,491-499.
havior, 1978, 17, 1-12. Kintsch, W., Kozminsky, E., Streby, W. J., McKoon,
Atkinson, R. C., & Shiffrin, R. M. Human memory: A G., & Keenan]. M. Comprehension and recall of
proposed system and its controJ processes. In K. W. text as a function of content variables. J ournill of
Spence & J. T. Spence (Eels.), The psychology oj Verbal Learning ena Verbal Behavior, 1975, 14,
learning and motivation: Advances in research and 196-214.
theory (VoL 2). Xew York: Academic Press, 1968. Kintsch, W., Mandel, T. S., & Kozminsky, E. Sum-
Bobrow, D., & Collins, A. (Eds.). Representation and marizing scrambled stories. Memory & Cogllition,
understanding: Studies in cogniti~e science. Xew 1977,5, 547-552.
York: Academic Press, 1975. Kintsch, 'N., & :Monk, D. Storage of complex informa-
394 WALTER KINTSCH AXD TEUN A. VAN DIJK
tion in memory: Some implications of the speed Bobrow & A. Collins (Eds.), Representation and
with which inferences can be made. Journal of understanding: Shld'ies in cogniti,'e science. New York:
Experimental Psychology, 1972, 94, 25-32. Academic Press, 1975.
Kintsch, W., & van Dijk, T. A. Comment on se rapelle Schank, R., & Abelson, R. Scripts, plans, goals, mtd
et on resume des histoires. Langages, 1975,40,98-116. un:aerslan·at:nf(. Hillsdale, : Erlbaum, 1977.
Kintsch, W., & Vipond, D. Reading Sgall, P., & W. Focus on focus. The
and readability in educational practice psycho- Prague Bulletin of Linguistics, 1977,
logical In L. G. Xilsson (Ed.), 2vlemory: 28,5-54.
Processes problems. Hillsdale, N.J.: Erlbaurp, Stein, X. L., & Glenn, C. F. An analysis of story com-
1978. prehension in school children. In R
LaBerge, D., & Samuels, S. J. Toward a theory or Freedle approaches to dis-
automatic information processing in reading. Cogni- C(n£rse : Ablex, in press.
tive Psychology, 1974, 6, 293-323. Sticht, of the audread model to reading
Labov, W. ]., & vValetzky, J. Narrative analysis: evaluation and instruction. In L. Resnick & P.
Oral versioas of persoaal experience. In J. Helm '''eaver (Eds.), Theory and practice in early reading.
(Ed.), Essays the ~erbal and visual arts. Seattle: Hillsdale, N.J.: in press.
Washington University Press, 1967. & Greene, E. construction of a pro-
Longacre, R, & Levinsohn, S. Field analysis of dis- positior:al base. JSAS Catalog oj Selected Docu-
course. In W. U. Dressler (Ed.), Current trends in s'V,cn"lo.n. in press.
textlingtlistics. Berlin, \Vest Germany: de Gruyter, van of text grammars. The
1977. Hague, : ::vrouton, 1972.
Luria, A. R, & Tsvetkova, L. S. The mechanism of van Dijk, T. A. Text and text logic. In J. S.
"dynamic aphasia." Foundations of Language, 1968, PetOfi & H. Rieser Studies in text grammar.
4,296-307. Dodrecht, The Reidel, 1973.
Mandler, J. lVI., & Johnson, N. J. Remembrance of van Dijk, T. A. Philosophy of action and theory of
things parsed: Story "tructure and recall. Cognitive narrative. Poetics, 1976, 5, 287-338.
Psychology, 1977,9, 111-151. van Dijk, T. A. Context and cognition: Knowledge
Meyer, B. The organization of prose and its effect ·upon frames and speech act comprehension. Journal of
memory. Amsterdam, The Netherlands: Xorth Pragmatics, 1977, 1, 211-232. (a)
Holland, 1975. van T. A. l'vlacro-structures, knowledge frames,
:.winsky, M. A framework for representing knowledge.
and discourse comprehension. In M. A. Just & P.
In P. Winston (Ed.), The psychology of computer Carpenter (Eds.), Cognitive processes in comprehen-
vision. New York: McGraw-Hill, 1975.
sion. Hillsdale, KJ. : Erlbaum, 1977. (h)
Neyman, J. Contribution to the theory of the test. In
J. Neyman Proceedings the Berkeley Sym- van Dijk, T. A. Pragmatic macro-structures in dis-
on and Probability. course and cognition. In M. de Mey et al. (Eds.),
Communication and cognition. Ghent, Belgium:
DL:rK'''~:Y: University of California Press, 1949.
A., & Rumelhart, D. E. Explorations in University of Ghent, 1977. (c)
W!,,,,,,",m. San Francisco: Freeman, 1975. van Dijk, T. A. Text and context: ExploratiotlS the
C. A., & Goldman, S. R Discourse memory semantics and pragmatics of discourse. London,
and reading comprehension skill. Journal oj Verbal England: Longmans, 1977. (d)
Learning and Verbal Beho/mor, 1976, 15, 33-42. van Dijk, T. A. Recalling and summarizing complex
Perfetti, C. A., & Lesgold, A. M. Coding and compre- discourse. In \'1. Burghardt & K. Rolker (Eds.),
hension in skilled reading and implications for Text processing. Berlin, \Vest Germany: de Gruyter,
reading instruction. In L. B. Resnick & P. Weaver in press.
(Eds.)) Theory and practice in early reading. Hillsdale, van Dijk, T. A., & Kintsch, W. Cognitive psychology
N.}.: Erlbaum, 1978. and discourse. In W. Dressler (Ed.), Currettt trends in
Poulsen, D., Kintsch, E., Kintsch, W., & Premack, D. textlinguistics. Berlin, West Germany: de Gruyter,
Children's comprehension and memory for stories. 1977.
, Journal of Child Psychology, in press.
Run~elhart, D. Notes on a schema for stories. In D. Received January 3, 1978 !III

Kintsch Van Dijk Toward A Model of Text Comprehension Production PsychRev 1978 PDF

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Kintsch Van Dijk Toward A Model of Text Comprehension Production PsychRev 1978 PDF

Uploaded by

Copyright:

Available Formats

syc e 1

VOL U M E 85 N UJiI B E R 5 S Elp T E lVI B E R i 9 78

Toward a Model" of Text Com;ppehension Production

Copyright 1978 by the American Psychological Association, Inc. 0033·295X/7S/8505-0363$OO. 75

TEXT COMPREHENSION AND PRODUCTION 365

structural theory is a semiformal statement coherence: We establish a linear or hierarchical

We assume that the surface structure of a Inferences

In our earlier work (Kintsch, 1974; van IvIacrostructure of Discourse

Processing Cycles those propositions not available from long-

TEXT COMPREHENSION AND PRODGCTIOX 371

372 WALTER KINTSCH AND TEUN A. VAN DIJK

TEXT COMPREHENSION AND PRODUCTION 373

Role of the Schema A second type of reading situation where

consisting or reconstructively added details, comprehension processes, hovvever, are speci-

TEXT COMPREHEKSIOK AND PRODUCTION 377

citations that some were in danger of losing their driv- Table 1

press). Propositions are numbered and are listed 6 (EARLY, SUMMER)

semantic cases of the arguments have not been 18 (CONTINUOUS, 19)

Some comments about this representation 21 (MANY, COMPLAINT)

develop a psychological processing model of 35 (HAVE, At:TO, SIGN)

378 WALTER KINTSCH AKD TEUK A. VAK DIJK

ambiguity may remain. In general, neither a~

assumed for the purpose of this figure that

subordinated to PS because of the repetition

particular justification for it. Empirical -~ 3B

methods to determine its adequacy will be , 39--40

that are to be maintained in the short-term

INTROOUCTlON· (00. S, ) \mE~i~ENT RESULTS DISCUSSION

ilOC· AT. COLlEGE) (HAVE. STUDENT. AUTO) iCO~'lAIN.

model; this output was simulated bv reuroduc- Preliminary Data Analyses

P20 (A11ONG, eml?) S p

P29 (DURlNG, DISC) S p

P38 (REPORT, SPEAK, STUDY) S p

Reproduced text base

Text base converted to English, with reconstructions italicized

TEXT COMPREHENSION AND PRODUCTION 385

No. words No. propositions

(d) errors and unclassifiable statements (not

tionally. An example of how this kind of

necessarily lexically) equivalent to a text

(START WITH, REPORT, 2) Schematic metastatement

4 (TELL, PROFESSOR, STl:DEKT) Semantic metastatement

8 (C01!PLADf, STUDEKT, 9) MP17

10 (FEEL, STUDENT) Specification of condition of MP17

12 (DECIDE, PROFESSOR, 13) Specification of condition of MP39

(IS AB01:T, REPORT, EXPERIMEKT) Semantic rnetastatement

TEXT COMPREHENSION AND PRODUCTION 389

TEXT COMPREHENSION AND PRODuCTION 391

structures at the level of sentence meaning is tween propositions in a discourse in more

You might also like