That difference has been termed linguistic creativity: the ability to produce and
understand strings of a language that have never been previously encountered.
Traffic-lightese is a language with a fixed number of sentences, what is termed a
finite language. Natural languages such as English, on the other hand, are
infinite languages, in the sense that there are an infinite number of sentences in
each natural language. What is the source of this infinity?
Well, for one thing, the words of English seem to be grouped into classes, so
that we can recognize new words coming in as members of these classes. Unlike
traffic-lightese, in which there are a fixed number of words (three, to be precise),
natural languages have an unlimited number of words that simply have to be
fitted into word-classes. The traditional grammar term for a word-class is part-
of-speech. We term a word-class a grammatical category. We will soon be
examining the basis for the notion of a grammatical category, and contrasting
two views of grammatical categories: the notional view, in which each
grammatical category has a particular meaning, and the distributional view, in
which each grammatical category has a unique distribution, but the
Jabberwocky example bears on the comparison of these two views. Let's see
why.
The reason that the Jabberwocky example is so striking is that the
words are nonsense words. We've never encountered them before, so we can't
possibly know what they mean. Nevertheless, we feel that (2) and (3) are
English sentences with unfamiliar words. The basis for our feeling is that the
words are in the right places for words of the appropriate word-classes (let's call
them grammatical categories from now on). To see this more carefully, let's
systematically deform, for example, (2), and see if, at each stage of the
deformation, we still have the feeling that the string of words is an English
sentence.
Let's start by removing the [-s] from the example in (2), and see if the
[-s]'s removal changes our perceptions of the status of the string:
(2) The blithy tove did gyre and gimble.
(2) does seem to be English: it's talking about a single tove, who
performed a compound action in the past of gyring and gimbling. Now, let us
remove the did:
?(2) The blithy tove gyre and gimble.
This has a somewhat shakier status as an English sentence, and the sense
that I have gotten in the past, when I've performed this experiment in classes, is
that speakers of English are split. Some people find this sentence to be non-English,
while others find it to be English if tove is taken to be an irregular
plural of some sort, like children or cattle. Removing the makes the string still
harder to recognize as English:
?(2) Blithy tove gyre and gimble.
Finally, removing the and causes the sequence of words to be felt by all
speakers as being simply a string of words, with the character of a list:
(3) *Blithy tove gyre gimble.
An asterisk before a set of words is taken by convention to mean that the
sequence of words is an ungrammatical sentence.
Let us stop for a minute and think about how we dealt with this example.
We couldn't have known the words. Rather, we took some words that we knew
(and parts of words, such as the [-s]), and figured out details about the
unfamiliar words from how they were positioned with respect to the familiar
parts of English. In this sense, the distributional account of what grammatical
categories are seems to fare better than the notional account. We had to be
figuring out what kind of structure to assign the string based on the sequencing
of its parts, looking at the unfamiliar parts and seeing where they were relative
to the familiar ones.
It is important to see what we've just done. We've taken two a
priori plausible views of what a grammatical category is, and we've tested them
by seeing what predictions they each make about phenomena in the part of the
world that we're investigating (i.e., sentences).
In any event, we've seen one reason for the open-endedness of a
language such as English, as opposed to traffic-lightese, and that is the fact that
natural languages (human languages, for our purposes) have a syntax: a set of
rules for arranging elements into more complex units of language (i.e., words
into sentences). Traffic-lightese does not have these principles: every word is a
complete sentence, and there are no principles for stringing words together to
form more complex sentences.
As we'll see very shortly, there are two ways in which natural
languages are infinite, meaning that there's an infinity of sentences in the
language. We have seen the first way, in which sentences are said to be made
up of members of grammatical categories, and new words can enter the
language to instantiate these grammatical categories.
A second way in which natural languages are infinite is that, as opposed
to traffic-lightese, in which you can specify the length of each sentence (because
each sentence is composed of one word and there are no procedures in
traffic-lightese for combining words), there is no specifiable bound on the length of a
sentence in a natural language. To see this, consider the following:
(4) a. The teacher left.
Does this sound like an English sentence? Most people would say that
it doesn't; it sounds like a main clause The horse raced past the barn, but then we
have no way of integrating the word fell. This analysis of the string (called
parsing: the assignment of a structure to a string) relies on analyzing raced as the
past tense of race.
However, the past tense of race is homophonous with another form of
race, called a participle form of race. The participle form of race is shown in
examples such as (5):
(5) ?? 1The horse was raced past the barn.
Keep in mind the two uses of the word raced: the past tense form and the
participle form. Now, let us alter (4) by substituting the word driven for the
word raced:
(6) The horse driven past the barn fell.
This sentence is perfectly acceptable, as is its paraphrase (7):
(7) The horse which was driven past the barn fell.
In (7), the sequence which was driven past the barn is an instance of
what is known as a relative clause- strictly speaking, a restrictive relative clause.
Restrictive relative clauses have the function of limiting the class of objects to
which the noun that precedes the relative clause can refer. When a restrictive
relative clause begins with a word like who or which (known as wh-words, which
we'll talk more about later on) and is followed by a form of the verb be, there is
under most conditions a synonymous sentence that simply omits the wh-word
plus be. This construction is known as the reduced relative construction. Further
examples:
(7) a. The girl who was sitting on the stoop was studying for her finals.
b. The girl sitting on the stoop was studying for her finals.
(8) a. People who are angry about this issue should write their elected
representatives.
b. People angry about this issue should write their elected representatives.
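The correspondence between the (a) and (b) examples above can be sketched as a simple string operation. What follows is only an illustration, assuming whitespace-separated words and two small hand-picked word lists; the name reduce_relative is hypothetical, and the sketch says nothing about when the reduced form is hard to perceive.

```python
# Hypothetical word lists, limited to the forms used in the examples.
WH_WORDS = {"who", "which"}
BE_FORMS = {"is", "are", "was", "were"}


def reduce_relative(sentence):
    """Drop the first 'wh-word + form of be' pair found in the sentence."""
    words = sentence.split()
    for i in range(len(words) - 1):
        if words[i].lower() in WH_WORDS and words[i + 1].lower() in BE_FORMS:
            # Keep everything except the two omitted words.
            return " ".join(words[:i] + words[i + 2:])
    # No reducible relative clause found: return the input unchanged.
    return sentence
```

Applied to The horse which was raced past the barn fell, the same operation yields the string with raced directly after horse, which is the string whose acceptability is at issue in this section.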
If we decide that (4) is ungrammatical, the question that we would ask is why.
Why is there no reduced relative counterpart to (9), which is perfectly
acceptable?
1. A question mark before a sentence indicates that the sentence is of dubious acceptability,
while an asterisk indicates ungrammaticality. The point of this section, however, is that
acceptability is a pre-theoretic notion, having to do with how we feel about certain strings,
while grammaticality is a post-theoretic notion, having to do with whether or not a certain
string is generated by the grammar. We cannot decide whether or not a certain string is
generated by the grammar until we have constructed the grammar, however, and it is in this
sense that grammaticality is post-theoretic, since the account that we are constructing, a
grammar, is a theory of what it means to know a language.
(9) The horse which was raced past the barn fell.
We have a textbook example here of the Rationale for Performance
Explanation. Obviously, (4) is unacceptable as a paraphrase of (9) because the
word raced is taken to be a main clause. We could of course say that (4) is
ungrammatical. However, in considering the implications of this decision, we
would be saying that the reduced relative clause construction does not occur in
English when the first word of the reduced relative clause would create a
sequence that is homophonous with a simple main clause. This restriction is
mysterious from the standpoint of grammar, but is explained naturally from the
vantage point of perception, given that we, as speakers of English, must have a
psychological mechanism for understanding sentences as they come in. In a
sense, placing the restriction within the grammar of English would make the
restriction look bizarre; the restriction as an instance of what Bever calls a
perceptual strategy is quite natural.
All of this is intended as a cautionary note that, paradoxically, the raw
data for syntactic analysis is speech, which is an instance of performance, but
what we are trying to construct is a model of competence, reflected
psychologically as our knowledge of language (as opposed to performance,
which is how that knowledge is put to use on particular occasions).
As a historical note, the competence-performance distinction that
Chomsky makes is a quite traditional one within linguistics, but under different
names for these concepts: the late Swiss linguist Ferdinand de Saussure coined
the terms langue (for language) and parole (speech).
The Role of Formalism
As linguists, we are trying to mimic the task of children in learning their
native languages. It is uncontroversial that children learn the rules of their
languages without explicit instruction, as can be seen by, e.g. the innovations
and over-regularizations (making forms regular that are irregular in the adult
language, such as goed and buyed as the past tenses of go and buy
respectively).
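Over-regularization can be pictured as applying the regular rule where the adult grammar lists an exception. The following is a minimal sketch, assuming a deliberately crude spelling rule for the regular past tense and a two-word table of irregulars; the function names are hypothetical.

```python
# Irregular forms that the adult grammar lists as exceptions.
IRREGULAR_PAST = {"go": "went", "buy": "bought"}


def regular_past(verb):
    """Apply the regular past-tense rule to any verb, with no exceptions."""
    if verb.endswith("e"):
        return verb + "d"
    return verb + "ed"


def adult_past(verb):
    """Use a listed irregular form if there is one, else the regular rule."""
    return IRREGULAR_PAST.get(verb, regular_past(verb))
```

On this picture, a child who produces goed has the rule but not yet the exception, which is evidence that the rule itself was learned without instruction.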
It is clear that children are constructing a grammar, but what form does
this grammar take? It cannot be expressed in, e.g., English, because they don't
know English yet. To borrow a term from the philosophy of language, we
would say that English in this case is the object language, the language being
described, while the rules of English are being formulated by children in what
is known as a meta-language, a language that is outside of the language being
described.
A good deal of what we will be doing will involve discovering the nature
of this meta-language. We will be posing hypotheses about the rules of
grammar, and the way that they interact, by viewing the grammar as what is
known as a formal system, a system in which all of the concepts have a precise
definition. It is by making this assumption that we can make testable predictions
about the grammar. Additionally, formalism has the advantage of ensuring that
the terms that are used in an account have the same meaning to all parties.
However, we are not only interested in generating the right set of strings.
Remember, what we are really interested in modeling is the full set of abilities
that native speakers have, and one of those abilities is the ability to recognize
the meanings of sentences. For example, we know that The cat is on the mat does
not mean that John saw Mary. We therefore have to build in this ability as well.
A traditional way of describing a grammar is as an infinite set of pairings of form
and meaning. Let us therefore revise our definitions of observational and
descriptive adequacy as follows:
Observational adequacy (Final Version): the ability of a grammar to generate all
and only the grammatical sentences of a language in a fixed body of data
(called a corpus), and to pair each grammatical sentence with its meaning.
Descriptive adequacy (Final Version): the ability of a grammar to generate all
and only the grammatical sentences of the language, and to pair each grammatical
sentence with its meaning.
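The "all and only" part of observational adequacy can be stated as a check over a corpus. This is a rough sketch, assuming the grammar is represented simply by the set of strings it generates and the corpus by a table of judgments; the pairing with meanings is left out, and the function name is hypothetical.

```python
def observationally_adequate(generated, corpus_judgments):
    """Check a set of generated strings against a judged corpus.

    corpus_judgments maps each string in the corpus to True (grammatical)
    or False (ungrammatical). The grammar passes only if it generates
    every grammatical string and none of the ungrammatical ones.
    """
    for sentence, grammatical in corpus_judgments.items():
        if grammatical != (sentence in generated):
            return False
    return True
```

A grammar fails the check either by missing a grammatical string in the corpus or by generating an ungrammatical one.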
C. Explanatory Adequacy:
The third requirement is not, strictly speaking, a requirement on grammars,
but, rather, a requirement on the account that underlies the construction of a
particular grammar, i.e. an account of what a possible grammar of a human
language can be. This needs a little more explanation.
When we formulate a grammar, we must have, at some level, a set of
assumptions as to what a possible grammar can be: there are certain possibilities for
rules that don't even occur to us. We therefore have, if only implicitly, a theory of
possible grammars.
It is commonplace to view the task of a linguist, in discovering a descriptively
adequate grammar of a language, as being identical to the task of a child, who is
trying to discover the descriptively adequate (adult) grammar of the language of her
or his community. Because linguists are trying to model the abilities of native
speakers, one of their goals is to try to formulate this theory, called a theory of
universal grammar, as well as the grammars of particular languages. We would
therefore say that the relation of the theory of grammar to grammars of particular
languages could be described as follows:
(10) Theory of Grammar (Universal Grammar) = {G1, ..., Gn}
In other words, a theory of grammar is a specification of the possible grammars,
which we're calling G1 through Gn (instead of French, English, Ewe, Chinese, etc.).
Now, a theory of grammar that is the correct account of what a possible grammar of
a natural language is should predict only the set of actual possible grammars, and
should not predict that some grammar is the grammar of a natural language that is
never realized in fact. In other words, a theory of grammar should not over-predict.
sentence needs. The term light verb is so-called because the verb is
semantically light, not carrying much, if any, meaning.
When we look at verbs, as I mentioned above, verbs are often said to
denote actions. However, what do we then make of the underlined elements
here, all of which are thought to be verbs, and none of which seem to describe
actions by the subject?
(5)
a. Germany endured a crushing defeat.
b. Jones underwent surgery.
c. Bill suffered a fatal blow to the head.
The underlined elements above all have an understood beginning
and end, a characteristic of actions, but they do not denote the volition, or free
will, on the part of the subject that is characteristic of actions. For example, the
subject of run chooses to run, while the subject of each of the above elements
doesn't choose to perform the action that each of the elements denotes. You don't
normally choose to endure something, or suffer something, and you can choose
to undergo surgery (if it's elective), but you can also undergo surgery that is totally
involuntary.
In fact, the philosopher Zeno Vendler came up with a classification for
verbs in 1967 (Zeno Vendler, Linguistics in Philosophy, Cornell University
Press). He classified verbs into accomplishments, achievements, activities, and
states. Accomplishments are actions that have a definite result, which the
subject intends to bring about. An example is (6):
(6)
John built a house.
Achievements are events that have a definite result, but the subject does not
intend to bring this about, such as dying or being born:
(7)
John died. (a rather dubious achievement, but an achievement
nevertheless!)
Activities are actions that do not necessarily have a result, so that
the notion of success is not an integral part of the felicitous use of the verb. An
example is walking:
(8)
John walked.
States are timeless, and, unlike the other three semantic types, don't have a
beginning and end. Examples:
(9)
a. John knows French.
b. John understands this point.
For our purposes, we see a real heterogeneity in Vendler's classification,
such that it is difficult if not impossible to pick out a single aspect of meaning
that unites all four types of verbs.
If we drop the idea that we classify words into parts of speech based on
meaning, we are left with a distributional basis for parts of speech, and this was
the idea of the structuralists, who essentially founded American descriptive
linguistics (Leonard Bloomfield (1933), Language, Holt, Rinehart, & Winston;
Zellig Harris (1951), Structural Linguistics, University of Chicago Press). In this
view, elements are classified into word classes on the basis of the environments
in which they can appear in sentences. To illustrate, consider the class of
elements that can appear in the slots in (10), (11), and (12). Think of ten
elements that can appear in (10), and ten elements that cannot.
(10)
_____ interest me.
(11)
I talked about ___.
(12)
John likes ___.
Restricting our attention to single words, ten elements that can appear in the
underlined slot in (10) are: ideas, people, cards, sheep, pencils, lectures, classes,
dogs, journals, computers. Ten elements that cannot appear there are: at, laugh,
from, angry, yellow, grow, because, incidentally, never, not. We set up classes, then,
of elements that can appear in a large number of identical environments, and
which are said (to introduce a somewhat technical term) to be mutually
substitutable. As a matter of historical accident, we call these terms nouns,
verbs, adjectives, etc., but they could have been called anything. In any event,
a grammatical category is defined as follows:
(13)
A grammatical category = a class of elements whose members
are mutually substitutable (i.e., interchangeable without any
diminution of acceptability of the resulting string) in a
sufficiently wide range of environments.
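The definition in (13) can be sketched as a comparison of the environments that two words share. This is a minimal illustration, assuming a table of acceptability judgments that is hypothetical (only the judgments for the first frame are taken from the text above) and an arbitrary threshold min_shared standing in for "sufficiently wide"; the function names are also hypothetical.

```python
FRAMES = ["___ interest me.", "I talked about ___.", "John likes ___."]

# Hypothetical judgments: for each word, the indices of the frames it can fill.
JUDGMENTS = {
    "ideas": {0, 1, 2},
    "dogs": {0, 1, 2},
    "computers": {0, 1, 2},
    "at": set(),
    "because": set(),
}


def mutually_substitutable(a, b, judgments, min_shared=2):
    """True if a and b are both acceptable in at least min_shared frames."""
    shared = judgments[a] & judgments[b]
    return len(shared) >= min_shared


def category_of(word, judgments, min_shared=2):
    """All words that count as mutually substitutable with the given word."""
    return sorted(
        w for w in judgments
        if mutually_substitutable(word, w, judgments, min_shared)
    )
```

How large min_shared should be, and which frames belong in the list, is exactly the part of the definition that cannot be fixed in advance.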
You will recall that we are trying to model as linguists the linguistic abilities
of fluent native speakers of the languages that we are describing, and it is useful
in this connection to return to the Jabberwocky example that I discussed in the
first lecture. You didn't know the meanings of the words there, by design, since
they were nonsense words, and yet you were able to understand the string The
blithy toves did gyre and gimble by virtue of the environments in which the words
appeared, and were able to assign the words to grammatical categories (known
as parts of speech). The only basis for this assignment had to be by applying the
definition of grammatical category given in (13), since there was no other.
The definition in (13) has a caveat, however: namely, the italicized
phrase. This is the point at which science becomes art, in that we really have no
way of determining in advance which set of environments are the right ones to
pick out the correct set of grammatical categories. We will now consider the
pitfalls in two polar extremes in applying the mutual substitutability criterion-
Man I, which would require that members of the same grammatical category be
mutually substitutable in all environments?
Obviously not, so we'd have to set up seven totally distinct parts of speech
(call them Thelma, Louise, Bob, Carol, Ted, Alice, Mortimer), so that we would
have the following assignment:
(21) a. like is of category Thelma.
     b. put is of category Louise.
     c. elapse is of category Bob.
     d. grow is of category Carol.
     e. become is of category Ted.
     f. persuade is of category Alice.
     g. dash is of category Mortimer.
What's wrong with having all of these words as separate parts of
speech?
Recall that in our last lecture, we talked about three criteria by which we
could evaluate proposed grammars: observational adequacy, descriptive
adequacy, and explanatory adequacy.
Descriptive adequacy involves a grammar not only generating the right
forms, but doing it in such a way as to show the regularities that speakers make
about their language.
While the seven words above have differences in their distributions, they
also have similarities. For example, they all agree with the preceding nouns:
(22) a. John likes pizza.
     b. People like pizza.
(23) a. John puts books on tables.
     b. People put books on tables.
(24) a. John grows despondent.
     b. People grow despondent.
(25) a. John becomes despondent.
     b. People become despondent.
(26) a. John persuades us that Clinton will be impeached.
     b. People persuade us that Clinton will be impeached.
(27) a. John dashes into rooms.
     b. People dash into rooms.
The (a) forms agree with the singular noun John, while the (b) forms agree
with the (irregular) plural form people.
A second similarity that all of the words have above can be generalized as
follows:
c. S → N V N A
d. S → N V N P
e. S → N V N S
f. S → N V S
g. S → N V P
h. S → N V A
The technical term for a rule such as (31), which uses abbreviatory
conventions to collapse a number of rules, is rule schema, an abstraction
over more than one rule which has the appearance of being only one rule.
Furthermore, application of one of the rules in (32) will create a structured
representation, so that application of, for example, (32)(b), will create a
representation as in (33):
(33) S
     N V N
And we could show that the second line was formed from the first by
drawing lines from the first symbol to the elements that are introduced in the
second line, as in (34):
(34)    S
      / | \
     N  V  N
Rules as in (32a-g) are known as phrase-structure rules, because they show
how phrases are structured. We can say that the phrase-structure rules generate,
or create, structured representations of sentences, which are known as phrase-
markers.
Phrase-structure rules have at most one symbol to the left of the arrow, a
necessary restriction, as we will see. The arrow is said to be an instruction to
rewrite the symbol to its left as a sequence of symbols on the right, and the
symbols on the right of the arrow are called the expansion of the symbol on
the left.
However, notice that (34) does not contain any words, and we still do not
have any device for registering the fact that not all verbs can, for example,
occur in the phrase-marker in (34).
Suppose that we extend our definition of grammatical categories, based
on mutual substitutability in a sufficiently wide range of environments,
and say that a grammatical category can be composed of different
subcategories, categories that are members of some larger category but
which have some further characteristic in common.
N V N
We have two options now for the symbol N. We could simply go to the
lexicon and insert two Ns and a V, or we could now apply our new phrase-
structure rule given in (2), to yield the sequence Art Adj N P N, so that
we would have the following sequence:
(5) S
N V N
Art Adj N P N V N
We now go to our lexicon, which includes the following:
the, Art
big, Adj
books, N
about, P
Nixon, N
interest, V, +[___ N]
me, N
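Lexical insertion against this lexicon can be sketched as a check that each word matches the category it is inserted under and that any subcategorization frame is satisfied. This is an illustration only, assuming the lexicon is stored as a table from words to a category and an optional frame; the representation and the name can_insert are hypothetical.

```python
# Each entry: word -> (category, frame). A frame lists the categories that
# must immediately follow the word; None means the entry has no frame.
LEXICON = {
    "the": ("Art", None),
    "big": ("Adj", None),
    "books": ("N", None),
    "about": ("P", None),
    "Nixon": ("N", None),
    "interest": ("V", ["N"]),  # the entry marked +[___ N]
    "me": ("N", None),
}


def can_insert(preterminals, words):
    """True if each word may be inserted under the matching category symbol."""
    if len(preterminals) != len(words):
        return False
    for i, (cat, word) in enumerate(zip(preterminals, words)):
        entry = LEXICON.get(word)
        if entry is None or entry[0] != cat:
            return False
        frame = entry[1]
        # The symbols right after the word must match its frame, if it has one.
        if frame is not None and preterminals[i + 1:i + 1 + len(frame)] != frame:
            return False
    return True
```

Both arguments are lists of strings, for example "Art Adj N P N V N".split() and "the big books about Nixon interest me".split().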
Let us now introduce some terminology. We would say that the grammar
generates, or creates, a set of sentences by allowing a set of derivations of those
sentences. A derivation is a sequence of representations such that each
representation, except for the initial representation, is formed from the
preceding representation by a rule of grammar. One symbol is designated as
the initial symbol of the grammar, and every derivation must therefore begin
with this symbol. In this case, the designated initial symbol is S, and so every
derivation must begin with S.
A symbol which appears to the left of an arrow in a phrase-structure rule is
said to be a non-terminal symbol, and a symbol which does not appear to the
left of any arrow is said to be a pre-terminal symbol. Lexical items, which are
introduced by the lexicon, are said to be terminal symbols.
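These definitions can be applied mechanically to a set of rules. The sketch below assumes a small rule set in the spirit of the rules discussed here; the exact rules, the dictionary representation, and the name classify are assumptions made for illustration.

```python
# Assumed rules: each left-hand symbol maps to a list of expansions.
RULES = {
    "S": [["N", "V", "N"]],
    "N": [["Art", "N"], ["Art", "Adj", "N", "P", "N"]],
}


def classify(rules):
    """Split the symbols of a rule set into non-terminals and pre-terminals."""
    # Non-terminals appear to the left of some arrow.
    nonterminals = set(rules)
    # Every symbol that some expansion introduces.
    introduced = {
        sym
        for expansions in rules.values()
        for expansion in expansions
        for sym in expansion
    }
    # Pre-terminals are introduced by a rule but never expanded by one.
    preterminals = introduced - nonterminals
    return nonterminals, preterminals
```

Under the definitions as stated, N comes out as a non-terminal here, because it appears to the left of an arrow, even though lexical items can also be inserted under it.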
Therefore, the grammar so far, with the phrase-structure rules in (6) and
the lexicon above, will generate the phrase-marker in (7):
(7)
N V N
The problem with the grammar in (6), however, is that, while it will generate
grammatical sentences such as (3), it will also generate many sentences that we
will want to say are ungrammatical. In this sense, the grammar is said to be too
powerful, in that it does more than we want it to be able to do.
We can see this by considering the symbol N. The phrase-structure rules
operate in a top-down fashion, beginning with the designated initial symbol S.
Therefore, whenever we reach the symbol N, we can, by the phrase-structure
schema expanding N in (6), take any of the options for expanding N that the
schema permits. For instance, we could allow the first N in
Art Adj N P N V N to be expanded as Art N, generating such strings
as The big the books about Nixon interest me.
The grammar in (6) exhibits a property that is known in mathematics as
recursion, the ability of a device to re-apply to its own output an infinite number
of times. The recursion in the phrase-structure component of the grammar
results from the same symbol that appears to the left of the arrow appearing in
an expansion of that symbol.
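The effect of this recursion can be made concrete by enumerating what a bounded number of rewrites of N can produce. The sketch assumes that the schema for N is N → (Art) (Adj) N (P N), which is inferred from the expansions mentioned above rather than quoted, and the function names are hypothetical.

```python
from itertools import product


def n_expansions():
    """All expansions abbreviated by the assumed schema N -> (Art) (Adj) N (P N)."""
    out = []
    for art, adj, pp in product([[], ["Art"]], [[], ["Adj"]], [[], ["P", "N"]]):
        expansion = art + adj + ["N"] + pp
        if expansion != ["N"]:  # skip the vacuous rewrite N -> N
            out.append(expansion)
    return out


def derive(form, steps):
    """All forms reachable from `form` by at most `steps` rewrites of an N."""
    seen = {tuple(form)}
    frontier = {tuple(form)}
    for _ in range(steps):
        reached = set()
        for f in frontier:
            for i, sym in enumerate(f):
                if sym == "N":
                    for expansion in n_expansions():
                        reached.add(f[:i] + tuple(expansion) + f[i + 1:])
        # Only forms not seen before need to be rewritten again.
        frontier = reached - seen
        seen |= frontier
    return seen
```

Starting from N V N, two rewrites already reach the sequence Art Adj Art N P N V N, the pattern of The big the books about Nixon interest me, and each further step adds new and longer forms.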
Exercise: Generate five ungrammatical sentences using the recursive
power of N.
A way to solve this problem was suggested by Zellig Harris in his book,
Structural Linguistics (1951, University of Chicago Press). Harris suggested
assigning integers to the different occurrences of N, so that the symbol N that
appears to the left of the arrow would be notated as N1, and the symbol N
that is the simple instance of N would be notated as N0. Hence, the phrase-structure
rule that expands N would be formulated as in (8):
(8) N1 → (Art) (Adj) N0 (P N1)
and the rule that expands S would be reformulated as in (9):
(19) S → N1 V
     V → V (N1) ({A, P N1, S})
Hence, we have a major division of the sentence into two parts,
consisting of a noun phrase and a verb phrase, so that a sentence
such as (20) would have the phrase-marker in (21):
(20) The man read the book.
(21)
              S
           /     \
         N1        V
        /  \      /  \
      Art   N0   V    N1
       |    |    |   /  \
      the  man read Art  N0
                     |    |
                    the  book
At this point, a question arises. We established that it was necessary to
distinguish levels of projection for nouns, via the superscript notation. Is it
necessary to distinguish levels of projection in the same way for other categories,
such as verbs?
Recall that we motivated the device of assigning integers to grammatical
categories as a method of preventing unwanted recursion. If we look at the
rule for expanding Vs in (19), it would permit a potentially infinite sequence of
verbs, as in, for example, (22):
(22)
                  V
                /   \
               V     N1
             /   \
            V     N1
          /   \
         V     N1
       /   \
      V     N1
We can get around this problem by simply using the superscript notation for Vs
as well, so that the phrase-structure component in (19) would be revised to
include the rules in (23):
(23)
S → N1 V1
V1 → V0 (N1) ({P N1, A, S})
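The difference between (19) and (23) can be checked mechanically by asking which symbols appear in their own expansions. The sketch below writes out a sample of the rules that each schema abbreviates; the dictionary representation and the name self_embedding are assumptions made for illustration.

```python
# A sample of the rules abbreviated by (19), without superscripts on V.
RULES_19 = {
    "S": [["N1", "V"]],
    "V": [["V"], ["V", "N1"], ["V", "N1", "A"], ["V", "N1", "P", "N1"], ["V", "N1", "S"]],
}

# The corresponding sample for (23), with V1 and V0 distinguished.
RULES_23 = {
    "S": [["N1", "V1"]],
    "V1": [["V0"], ["V0", "N1"], ["V0", "N1", "A"], ["V0", "N1", "P", "N1"], ["V0", "N1", "S"]],
}


def self_embedding(rules):
    """Symbols that occur in one of their own expansions."""
    return {
        lhs
        for lhs, expansions in rules.items()
        if any(lhs in expansion for expansion in expansions)
    }
```

The check looks only at a symbol reappearing directly in its own expansion. In (23) a V1 can still contain another V1 by way of S, which is the kind of recursion the grammar needs in order to embed one sentence inside another.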
B.
The Category A1
Let us look at some environments for simple adjectives. We have
seen one, following the verbs become and grow. In addition to this
environment, we can find adjectives occurring after the verb consider
followed by a noun phrase:
(24)
I consider the man crazy.
We can also substitute adjectives that are modified in that environment:
(25)
a. I consider him fond of chocolate.
b. I consider him partial to vanilla.
Furthermore, just as we can question simple adjectives, in which they appear at
the front of the sentence (we will return to question formation later), modified
adjectives can occur in that position:
(26)a. How angry are you?
b. How fond of Sally are you?
c. How partial to vanilla are you?
Hence, we can justify a phrasal projection of A as well.
C.
The Category P1
In a sense, the category P1 is the easiest category to motivate, in the sense that
prepositions usually occur with following Ns, as in (27):
(26)
John ran to Mary.
However, Joseph Emonds has argued (in Evidence that Indirect Object
Movement is a Structure-Preserving Transformation, Foundations of Language
(1972)) that, just as certain verbs, such as elapse, are intransitive (i.e., don't
take objects), or are optionally intransitive, such as eat, there are intransitive
prepositions as well. For example, consider the verb put, which requires a
locative prepositional phrase (see the subcategorization frame for put in (35) of
Lecture #2). Hence, (27a) is unacceptable:
(27)(a) *John put the book.
(b) John put the book on the table.
However, certain single words can satisfy put's requirement of
having an element after the object:
(27)
John put the book on.
Aside from Emonds' view of words such as on, which don't occur with a
following noun phrase, there has been another view: that of Bruce Fraser, who
analyzed such words in Fraser (1965) (An Examination of the Verb-Particle
Construction, MIT Doctoral Dissertation) as particles. Hence, Fraser posited the
category Prt.
Let us consider the two views more closely, and formalize them in our
phrase-structure grammar. Emonds posited a set of phrase-structure rules that
included (28):
(28)
a. V1 → V0 (N1) (P1)
b. P1 → P0 (N1)
Fraser's view can be described as follows:
(29)
a. V1 → V0 (N1) ({P1, Prt})
b. P1 → P0 N1
(36)
           S
        /     \
    N1          V1
    |       /   |   \
    N0    V0    N1    Prt1
    |     |     |     /  \
    He   send   N0   ?    Prt0
                |    |     |
                it  right  up
(37)
           S
        /     \
    N1          V1
    |       /   |     \
    N0    V0    N1        P1
    |     |     |      /  |  \
    He   send   N0    ?   P0    N1
                |     |   |    /  \
                it  right up  Art  N0
                               |    |
                              the  stairs
Under Emonds' analysis, (35) would receive the phrase-marker in (38), and
(31) would receive the phrase-marker in (39):
(38)
           S
        /     \
    N1          V1
    |       /   |   \
    N0    V0    N1    P1
    |     |     |     /  \
    He   send   N0   ?    P0
                |    |     |
                it  right  up
(39)
           S
        /     \
    N1          V1
    |       /   |     \
    N0    V0    N1        P1
    |     |     |      /  |  \
    He   send   it    ?   P0    N1
                      |   |    /  \
                    right up  Art  N0
                               |    |
                              the  stairs
Notice that the view of these single-word categories as particles forces us to
assign radically different structures in (36) and (37), while the view of these
categories as being intransitive prepositions, in this case optionally intransitive,
makes the structures in (38) and (39) minimally different. Notice that
the analysis of these words as being optionally transitive prepositions places
them on a footing parallel to that of verbs, which can, in some instances, be
optionally transitive, as in the case of the verb eat:
(40) a. He ate something.
b. He ate.
or adjectives:
(41) a. John is angry with Sally.
b. John is angry.
Hence, we will assume that what have been called particles are really
nothing more than intransitive prepositions.
D.
The Parallelism of Grammatical Categories and How to Reflect It In
the Grammar
Toward the end of the discussion of prepositions, we appealed to the
notion that there is a certain symmetry in the way that grammatical
categories are constructed. Zellig Harris originally noted this, and
Chomsky developed this idea into what is now known as X-bar theory
(N. Chomsky (1970), Remarks on Nominalization, in R. Jacobs and P.
Rosenbaum, eds., Readings in English Transformational Grammar,
Ginn-Blaisdell). The idea is that grammatical categories are constructed
according to a fixed template. It will be noted that all of the grammatical
phrasal categories that we have discussed so far are of the form in (42):
(42) X1 → (Y) X0 Z1* (see note 2)
The asterisk is known as the Kleene star (after a mathematician named
Kleene), and it means zero or more occurrences of the symbol to
which it is attached.
X, Y, and Z stand for arbitrary categories, with the only
understanding of this notation being that each symbol stands for the same
category in all of its occurrences in a given statement. It is what is known as a
variable, in this case ranging over categories. In other words, all phrasal
categories of N, V, A, and P have the same arrangement, and are constructed
the same way, so that a P1 would have the same structure as an A1, for
instance.
(42) is another instance of a rule schema. It is an abbreviation for a number of
different rules. Note that each phrase-structure rule has an obligatory
element in the expansion, while all of the other elements in the expansion
are optional. The obligatory element in the expansion is called the head, so
that N0 is the head of N1, A0 is the head of A1, P0 is the head of P1, and V0
is the head of V1.
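What it means for a schema to abbreviate a number of rules, each containing the obligatory head, can be shown by expanding the parenthesized options. The sketch uses the rule for N1 given in (8) as its example; the list representation and the name expand_schema are hypothetical, and the Kleene star of (42) is not modeled.

```python
from itertools import product

# Rule (8), N1 -> (Art) (Adj) N0 (P N1), as (symbols, is_optional) pairs.
SCHEMA_8 = [
    (["Art"], True),
    (["Adj"], True),
    (["N0"], False),  # the head: the only obligatory element
    (["P", "N1"], True),
]


def expand_schema(lhs, items):
    """List every rule that a schema with optional elements abbreviates."""
    # An optional element contributes two choices: absent or present.
    choices = [([[], syms] if optional else [syms]) for syms, optional in items]
    rules = []
    for combo in product(*choices):
        rhs = [sym for part in combo for sym in part]
        rules.append((lhs, rhs))
    return rules
```

With three optional elements the schema stands for eight rules, and N0 appears in every one of them.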
The notion of all categories being constructed in the same way is standard in
modern syntactic theories of all stripes. It is also well-supported in studies of
language typology, the study of the ways in which languages may be said to
differ from one another. In a classic study by Joseph Greenberg (1963) (Some
Universals of Word Order With Reference to Meaningful Elements, in J.
Greenberg, ed., Universals of Language, MIT Press), for example,
Greenberg classified languages into three main types: V(erb) S(ubject) O(bject),
SVO, and SOV. Describing SOV as verb-final, and the other two as non-verb-
final, he noted some striking correlations of the dimension of verb-finality
with, for example, the fact that some languages have postpositions, while
others have prepositions, so that SOV languages tended to have postpositions,
while VSO and SVO languages had prepositions. He also noted that VSO
languages always had SVO word order as an alternative, and we will discuss
this more later.
[2] We will discuss S in the next lecture.
E. Some Terminology
At this point, it would be useful to review where we've come to.
A grammar is a sequence of rules that generates a set of phrase-markers,
which are structured representations of sentences. For natural languages,
the set of phrase-markers is infinite, even though each phrase-marker is
finite in length. There is only a finite set of rules, and so the rules that
generate the phrase-markers for natural languages must be recursive, i.e.,
have the ability to reapply to their own output a potentially infinite number
of times. There must also be non-recursive rules in the grammars of natural
languages; otherwise, phrase-markers would never terminate.
So far, we have only seen one type of grammatical rule: a
phrase-structure rule, which determines how sentences are composed.
Phrase-structure rules have the formal requirement that they can have at
most one symbol to the left of the arrow in the phrase-structure rule
(called the symbol to be expanded), and can have, in principle, any number
of symbols to the right of the arrow (called the expansion).
A symbol that appears to the left of the arrow in a phrase-structure rule is
called a non-terminal symbol. A symbol that is introduced by the
phrase-structure rules, but does not appear to the left of the arrow, is
called a pre-terminal symbol. A grammar is said to generate a set of
derivations. A derivation is a sequence of representations such that each
representation is formed from the immediately preceding representation by
a rule of grammar. The one exception is the first representation, a
distinguished symbol that is said to be the designated initial symbol;
every derivation must start with this symbol. For our grammar so far, the
designated initial symbol is S.
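The notions of derivation and recursion can be illustrated with a small sketch. The toy rule set below is an assumption chosen for illustration (it is not the full grammar developed in these lectures); derive and its (symbol, option) step format are likewise hypothetical names. Each step rewrites the leftmost occurrence of a symbol, so every line of the derivation follows from the preceding line by one rule.

```python
# Toy phrase-structure rules: one symbol on the left, a list of possible expansions.
RULES = {
    "S":  [["N1", "V1"]],
    "N1": [["N0"], ["N0", "P1"]],   # the second option leads back to N1 via P1 (recursion)
    "V1": [["V0"], ["V0", "N1"]],
    "P1": [["P0", "N1"]],
}

def derive(steps, start="S"):
    """Apply each (symbol, option) step to the leftmost occurrence of symbol."""
    line = [start]                  # every derivation starts with the initial symbol
    derivation = [line]
    for symbol, option in steps:
        i = line.index(symbol)
        line = line[:i] + RULES[symbol][option] + line[i + 1:]
        derivation.append(line)
    return derivation

steps = [("S", 0), ("N1", 1), ("P1", 0), ("N1", 0), ("V1", 0)]
for line in derive(steps):
    print(" ".join(line))
```

The pair N1 → N0 P1 and P1 → P0 N1 can reapply to its own output indefinitely, while N1 → N0 is the non-recursive option that lets the derivation terminate.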
We also assume that, apart from S, all phrasal categories are expanded
according to a particular template, called an X-bar schema.
The levels of complexity of the various grammatical categories, such as
N0 versus N1, A0 versus A1, etc., are called the levels of projection of the
grammatical categories. So far, our X-bar schema has claimed that all
grammatical categories project up to level 1. The level 1 projection of the
category is said to be the maximal projection of the category (so far).
It is also useful to note some of the relations that are defined on phrase-
markers. Phrase-markers can be represented in any one of a number of
ways, just so long as the groupings are represented. For example, we have
given phrase-markers as trees, but they could also be represented as
labelled bracketings. I will represent them as trees, because I feel that
they are easier to inspect that way, but this is simply an expository
convenience.
The labelled points in the tree diagram are called nodes. A node A that
is above another node B in the phrase-marker, such that A contains B, is
said to dominate node B. If node A is the first node above node B, node A
is said to immediately dominate node B.
Phrase-markers show the groupings, and these groupings of elements
are called constituents. A constituent is a sequence of nodes that are all
immediately dominated by the same node, such that the immediately
dominating node exhaustively dominates the sequence (i.e. immediately
dominates the sequence and nothing else).
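The relations just defined can be sketched in a few lines of code. The Node class and its method names are assumptions made for illustration, and the example phrase-marker uses the one-bar analysis of the king of England.

```python
class Node:
    """A labelled point in a phrase-marker; words are nodes without children."""
    def __init__(self, label, *children):
        self.label = label
        self.children = list(children)

    def immediately_dominates(self, other):
        # A is the first node above B.
        return any(child is other for child in self.children)

    def dominates(self, other):
        # A is above B at any depth.
        return any(child is other or child.dominates(other) for child in self.children)

    def leaves(self):
        # The words that this node exhaustively dominates, in order.
        if not self.children:
            return [self.label]
        return [word for child in self.children for word in child.leaves()]

england = Node("N1", Node("N0", Node("England")))
pp = Node("P1", Node("P0", Node("of")), england)
np = Node("N1", Node("Det", Node("the")), Node("N0", Node("king")), pp)

print(np.dominates(england))              # np is above england, but not directly
print(pp.immediately_dominates(england))  # pp is the first node above england
print(pp.leaves())                        # the constituent headed by "of"
```

A constituent, on this sketch, is simply the sequence returned by leaves() for some node.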
In the next lecture, we will examine the X-bar schema more closely,
and refine the structures of NP, VP, AP, and PP.
(4)
a. This king of England was taller than that one.
b. The prospect of a slow trial of the President is more damaging than a
quick one.
c. This picture of Fred is clearer than that one.
It would seem, therefore, that one can refer to a noun and a PP headed by of.
One way to account for this would be to ascribe a structure such as (5) to a noun
phrase such as the king of England:
(5)
N2
  Det: the
  N1
    N0: king
    P1
      P0: of
      N1
        N0: England
We then have a unit that consists of the noun and the following
prepositional phrase, namely the constituent N1. We would then say that one
must be anaphoric to an N1. Because the N1 in, e.g., (5) comprises both
the noun and the following PP, the ungrammaticality of (3) stems from
violating the requirement that one replace an N1, rather than just an N0.
Interestingly, not all sequences of nouns followed by a PP disallow one
followed by a PP, as pointed out by Radford (1988) (Transformational
Grammar: A First Course, Cambridge University Press). For example, (6) is
perfectly acceptable:
(6)
The man with Sally left, but the one with Susan stayed.
However, we can also allow one to be interpreted as a noun plus a PP headed
by with:
(7)
One man with Susan left, but that one stayed.
We can account for this if we posit a structure in which two things happen: (i)
the sequence consisting of the noun itself is an N1, to the exclusion of the with-
PP; (ii) the sequence consisting of the noun itself plus the with-PP is an N1.
In short, we would need the following phrase-structure rule for noun
phrases:
(8)
N2 → (Det) N1
N1 → { N0 (P1) }
     { N1 P1   }
(9)
N2
  Det: the
  N1
    N1
      N0: man
    P1
      P0: with
      N2
        N1
          N0: Sally
In short, a recursive expansion of N1 allows, in some instances, a simple N0
to also be an N1, and allows a sequence of a noun followed by a PP to
contain two N1s: a simple N1 that is followed by the PP, as well as an N1
consisting of that N1 and the PP.
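As a sketch of what the recursive rule predicts, the following code lists every N1 in the phrase-marker for the man with Sally, i.e., every string that one could stand for. The nested-tuple encoding and the function names words and spans are assumptions made for illustration.

```python
# Phrase-marker for "the man with Sally", encoded as (label, child, child, ...).
TREE_9 = ("N2",
          ("Det", "the"),
          ("N1",
           ("N1", ("N0", "man")),
           ("P1", ("P0", "with"),
                  ("N2", ("N1", ("N0", "Sally"))))))

def words(tree):
    """Return the words dominated by a node, left to right."""
    label, *children = tree
    if all(isinstance(child, str) for child in children):
        return list(children)
    return [w for child in children for w in words(child)]

def spans(tree, category):
    """Return the word string of every node labelled `category`, top-down."""
    label, *children = tree
    found = [" ".join(words(tree))] if label == category else []
    for child in children:
        if not isinstance(child, str):
            found += spans(child, category)
    return found

print(spans(TREE_9, "N1"))
```

Both man and man with Sally come out as N1s, matching the two readings of one in (6) and (7).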
Exercise: Show the phrase-marker for the underlined noun phrase in (a),
using the grammar that we have developed so far:
(a) The picture of Sally with the green frame is bigger than the one with
the blue frame.
A. Adjectives within the Noun Phrase
It will also be noted that the assumption that one replaces an N1 tells us
about the hierarchical position of adjectives within the noun phrase, as in the
blue car. Consider sentences such as (10) and (11):
(10)
The blue car was prettier than the green one.
(11)
This blue car was prettier than that one.
Our previous reasoning forces us to posit a phrase-structure rule as in (12) for
N1s:
(12)
N1 → A N1
Hence, the structure of, e.g., the blue car, would be as in (13):
(13)
N2
  Det: the
  N1
    A: blue
    N1
      N0: car
Exercise: What is the structure of the big blue car, given the following sentences?
(b) The big blue car was faster than the small green one.
(c) The big blue car was faster than the small one (can mean either the small blue car or
the small car).
(d) This big blue car was smaller than that one (can mean either that big blue car, that
blue car, or that car).
Summary: We have motivated the following phrase-structure rules for the noun
phrase:
(14) N2 → Det N1
     N1 → { A N1    }
          { N1 P1   }
          { N0 (P1) }
We have not yet talked about genitives, such as John's mother's boyfriend's sister's
teacher. Notice that genitives seem to occur in the same position as the determiner.
Therefore, we can modify the rule for expanding N2 above as in (15), putting aside for
the moment the mechanism by which the possessive 's is introduced:
(15) N2 → { N2  } N1
          { Det }
B. Implications of the structure of the Noun Phrase for Other Categories
In the last section, we examined the structure of the noun phrase, and argued for two
levels of projection of the noun above the N0. If we assume that all categories are created
by the same X-bar schema, this would indicate that our X-bar schema of (42) of the last
section, repeated here, should be revised to (15):
(42) X1 → (Y) X0 Z1*
(15) X2 → (Z2) X1
     X1 → { Y2 X1  }
          { X1 Y2  }
          { X0 Y2* }
We might then say that Y is instantiated by articles in noun phrases (as well as
possessives, to which we shall return).
Notice, first of all, that the determiner that is found in noun phrases, a word that
appears before the head, is paralleled by degree words that precede adjectives in adjective
phrases and adverbs that precede verbs and prepositions. Examples are as in (16):
(16) a. The pictures of Sally.
b. quite fond of Sally.
c. completely lost his mind.
d. right up the stairs.
It will be noted that we have direct evidence, from one-pronominalization, for a
projection that includes the noun and a following PP (in some instances, specifically PPs
headed by of), and we have generalized from that evidence to saying that all categories
have two levels of projection. Are we warranted in leaping from evidence for two levels
of projection in N to two levels of projection in all categories, when we have no evidence
for the latter?
Recall that we are assuming that the grammar is as simple and as general as possible.
We have no direct evidence for two levels of projection in adjectives, verbs, and
prepositions, but we have no evidence against two levels, either. If we were to assume
that there is only one level of projection for these categories, but two for the noun, we
would clearly be assuming a more complicated grammar than if we assumed two levels of
projection for all the categories. For this reason, we assume that there is a single X-bar
schema for all grammatical categories, given in (15). Much of what we will be doing in
the succeeding weeks is finding evidence for the various options given by (15) for
particular sentences.
It will be noted that we have not fit S into the X-bar schema. We have not
defined S so far as the maximal projection of any category. That will change in the next
lecture.
B. Grammatical Relations and Grammatical Categories
Traditional grammar speaks of such notions as subject and object. Do these
notions play a role in grammar? Notice that our grammar generates partial
phrase-markers such as (17):
(17)
S
  N2
  V2
    V1
      V0
      N2
The phrase-marker shows the hierarchical (up-and-down dominance) and linear
(left-to-right precedence) relations of grammatical categories, in this case S,
N2, V2, V1, and V0. What is the difference between grammatical categories and
grammatical relations?
There is a once-and-forever characteristic of grammatical categories that is
not present in grammatical relations. An element or sequence of elements either
is or is not a given category depending on the nature of its head. Grammatical
relations, on the other hand, can be deduced from phrase-markers depending
on the positions of the various grammatical categories. Hence, in the partial
phrase-marker in (17), an N2 is a subject if it is immediately dominated by S,
and an object if it is immediately dominated by V1. If the N2s were in different
structural positions, they would bear different grammatical relations. This was
the point made by Chomsky in Aspects of the Theory of Syntax (1965, MIT
Press), who introduced the following formulations of subject and object
(updated for X-bar theory):
(18) subject = [N2, S] (meaning the N2 immediately dominated by S)
     object = [N2, V1] (meaning the N2 immediately dominated by V1)
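The claim that grammatical relations can be read off structural position can be sketched as follows. The tuple encoding, the function name daughters, and the bar levels N2, V2, V1, V0 (which the extracted text no longer shows) are assumptions made for illustration; the example sentence is Rome destroyed Carthage.

```python
# Phrase-marker encoded as (label, child, child, ...); bar levels are assumed.
TREE = ("S",
        ("N2", "Rome"),
        ("V2", ("V1", ("V0", "destroyed"), ("N2", "Carthage"))))

def daughters(tree, mother, category):
    """Return every node labelled `category` immediately dominated by a `mother` node."""
    label, *children = tree
    found = []
    for child in children:
        if isinstance(child, str):
            continue                      # words have no daughters
        if label == mother and child[0] == category:
            found.append(child)
        found += daughters(child, mother, category)
    return found

subject = daughters(TREE, "S", "N2")      # [N2, S]
obj = daughters(TREE, "V1", "N2")         # [N2, V1]
print(subject, obj)
```

Nothing in the tree is labelled "subject" or "object"; both relations are computed from position alone, which is the reason for leaving them out of the phrase-marker.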
If grammatical relations are predictable from the positions of the elements that bear
them, the reasoning goes, they should not be represented in phrase-markers, for the
same reason that we do not represent regular plurals in the lexical entries of nouns. We
only represent in representations what we cannot predict from something else.
C. Some Terminology
The grammatical relations of subject and object are akin to other grammatical
relations that are particular to X-bar theory. Let us take a representation as in (20), which
is generable from the schema in (15):
(20)
X2
  Z2
  X1
    Y2
    X1
      Y2
      X1
        X1
          X0
          Y2
        Y2
In phrase-markers, it is convenient to use the notions sister and daughter, defined in
terms of immediate domination. These notions are defined as in (21) and (22):
(21) A is a sister of B if A and B are both immediately dominated by the same node.
(22) A is a daughter of B if B immediately dominates A.
We can now define the following grammatical relations:
(23) a. A is a specifier if A is a daughter of X2.
b. A is an adjunct if A is a sister and daughter of X1.
c. A is a complement if A is a daughter, but not a sister, of X1.
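A minimal sketch of these definitions, applied to a phrase-marker of the shape in (20): the tuple encoding, the function name relations, and the bar levels written as trailing digits (X2, X1, X0) are assumptions made for illustration.

```python
# Phrase-marker of the shape in (20), encoded as (label, child, child, ...).
TREE_20 = ("X2",
           ("Z2",),
           ("X1",
            ("Y2",),
            ("X1",
             ("Y2",),
             ("X1",
              ("X1", ("X0",), ("Y2",)),
              ("Y2",)))))

def relations(tree, cat="X"):
    """Classify each non-head daughter along the projection line of `cat`."""
    label, *children = tree
    heads = [c for c in children if c[0][:-1] == cat]   # daughters that project cat
    found = []
    if label[:-1] == cat and heads:
        for child in children:
            if child[0][:-1] == cat:
                continue
            if label == cat + "2":
                found.append((child[0], "specifier"))    # daughter of X2
            elif heads[0][0] == cat + "1":
                found.append((child[0], "adjunct"))      # sister and daughter of X1
            else:
                found.append((child[0], "complement"))   # daughter of X1, sister of X0
    for child in heads:
        found += relations(child, cat)
    return found

print(relations(TREE_20))
```

The same label (Y2) is classified differently depending only on where it sits, which parallels the positional treatment of subject and object.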
These notions will play a role in the next lecture, when we try to integrate S into the X-
bar system. For now, however, let us note the following.
One of the original motivations for the X-bar system was the attempt by Chomsky to
capture the similarity in understood semantic relations between sentences and
nominalizations, as in (24):
(24) a. Rome destroyed Carthage.
b. Rome's destruction of Carthage.
We could say that, in (b), the N that realizes the agent semantic role is a specifier, but
we cannot say that the agent is the specifier of the phrase in (a), because the notion of a
specifier is defined in X-bar terms, and S is not (so far) an X-bar projection. We could
say that the object in (a) is really a complement, and the notion of a complement is
realized in sentences (which contain Vs) and noun phrases (which contain Ns).
We will return to this in the next lecture.
{ could }
{ will }
{ would }
{ shall }
{ should }
{ may }
{might }
{ must }
These, then, are the facts concerning linear order of helping verbs. If
we call the class of elements in (8) modals, and abbreviate this class by
the symbol M, we can account for the facts (aside from the affixes on the
verbs that occur with the aspectual helping verbs, which we will put aside
for now), by positing, as a first approximation, the phrase-structure rule
in (9):
(9) S → N (M) (have) (be) V
B. Yes-No Questions In English
Let us now consider the formation of yes-no questions in English. First, consider
yes-no questions in simple sentences that have all three types of helping verbs: modals,
perfective have, and progressive be.
When all three helping verbs occur, the modal will appear at the beginning of the
question:
(10) a. Would he have been eating?
b. *Have he would been eating?
c. *Been he would have eating?
When the modal is absent, however, and have and be occur, have will appear
at the beginning of the question:
(11) a. Has he been eating?
b. *Been he has eating?
When the modal and have are absent, but be occurs in the simple sentence, be
introduces the question:
(12) Is he eating?
There is a generalization about the helping verb that appears at the beginning of a yes-
no question, if one notices the corresponding order of helping verbs in the declarative.
Can you guess what it is?
You're right. It's (13):
(13) The helping verb that appears at the beginning of a yes-no question
is the helping verb that would appear immediately after the first N in
the declarative version of the sentence.
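Generalization (13) can be sketched as a single operation on the declarative word string. The function name yes_no_question and the small HELPING_VERBS set are assumptions made for illustration; the set lists only a few surface forms and is not meant to be complete.

```python
# A few surface forms of modals, have, and be (illustrative, not exhaustive).
HELPING_VERBS = {"could", "will", "would", "shall", "should", "may", "might",
                 "must", "have", "has", "had", "is", "are", "was", "were"}

def yes_no_question(subject, rest):
    """Front the helping verb that immediately follows the subject."""
    first, *others = rest
    if first not in HELPING_VERBS:
        raise ValueError("no helping verb immediately after the subject")
    return [first.capitalize(), subject] + others

print(" ".join(yes_no_question("he", ["would", "have", "been", "eating"])) + "?")
print(" ".join(yes_no_question("he", ["has", "been", "eating"])) + "?")
print(" ".join(yes_no_question("he", ["is", "eating"])) + "?")
```

One statement covers all three cases, whichever helping verb happens to come first; that is the generalization a list of separate expansions of S fails to express.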
We might try revising our phrase-structure rule in (9) to account for this, as in (14):
(14) S → { M N (have) (be) } V
         { have N (be)     }
         { be N            }
The phrase-structure rule in (14), while it gives us the effect in (13), doesn't directly
capture it as a generalization. The fact that the helping verb that appears at the beginning
of the yes-no question is just that helping verb which would appear after the N in the
declarative is accidental. We have three separate expansions of S which give us this
result, but we could just as well have had a grammar that had (15) as the phrase-structure
rule:
John
2
1
N have been V
1 2
have N been V
(20) S
Be N V
Transformation(s)
Final Phrase-marker
We can then say that the tense element moves in forming yes-no questions as
well, revising (16), repeated here, to (30):
(16) N - { M    }
         { have }
         { be   }
     1 - 2  →
     2 - 1
(30) N - T ( { M }    )
           ( { have } )
           ( { be }   )
     1 - 2  →
     2 - 1
Finally, the rule of tense-hopping would be formulated as in (31):
(31) T - { have }
         { be   }
         { V    }
     1 - 2  →
     0 - 2+1
A word of explanation is in order about the + symbol, and what it is supposed to
mean. It means that the two elements form a unit after movement. Let us illustrate
with the derivation of (23), John visited Sally.
The initial phrase-marker for (23) would be (32), with the Tense generated separately:
(32)
S
  N: John
  T: past
  V
    V
      V: visit
      N: Sally
The phrase-marker that results from tense-hopping is (33):
(33)
S
  N: John
  V
    V
      V
        V: visit
        T: Past
      N: Sally
There is a term for this type of forming a unit, as in the forming of a unit
between the verb and the tense in (33). It is called adjunction, which is
defined as follows:
(34) A adjoins to B iff A moves to the periphery of B (i.e., its beginning or end),
moves out of B, and forms a new instance of B dominating A and B.
If we look at the structural description of (31), however, it requires adjacency
between Factor #1 and Factor #2. It is this requirement that the Tense be
adjacent to the element that it adjoins to which seems to be violated in the
formation of the yes-no question.
There are two transformations that we are looking at here, in the formation
of the yes-no question. One is Subject-Helping Verb Inversion; the other is
Tense-Hopping. Their formulations are repeated here:
(30) N - T ( { M }    )
           ( { have } )
           ( { be }   )
     1 - 2  →
     2 - 1
(31) T - { have }
         { be   }
         { V    }
     1 - 2  →
     0 - 2+1
Suppose the transformations are ordered, in the sense that they apply, or can only get
the chance to apply, in a fixed sequence, and that (30) is ordered before (31). In that
case, the application of (30) to (32) would be (35):
(35)
S
  Tense: Past
  N: John
  V
    V
      V: visit
      N: Sally
Lecture #6: Additional Evidence for Tense-Hopping and a Revision of the
Phrase-Structure Rule for S
In the last lecture, we needed to account for the fact that present and
past tense in English, which usually are realized as affixes on verbs, can be
separated from those verbs in some instances. We accounted for this by positing a
level of representation at which the tense was not part of the verb, generated by
the phrase-structure rule (29) of Lecture #5, repeated here:
(29) S → N T (M) (have) (be) V
and formulating a transformation that turned it into a unit with the verb when it is adjacent
to it, i.e. the transformation of Tense-hopping, given in (31) of Lecture #5:
(31) T - { have }
         { be   }
         { V    }
     1 - 2  →
     0 - 2+1
We also saw that, in the event that the structural description of Tense-hopping was
not met, so that it could not apply, a rule that inserts do would apply, rule (36) of the
last lecture:
(36) Tense → do + Tense, if Tense is not affixed.
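The interaction of these rules can be sketched with flat token lists standing in for phrase-markers. Everything about the encoding is an assumption made for illustration: the token names Past and Pres, the "+" notation for an affixed unit, the small HOSTS vocabulary, and the function names. Modals are deliberately left out of HOSTS, since (31) as stated lists only have, be, and V.

```python
TENSES = {"Past", "Pres"}
HOSTS = {"have", "be", "visit", "eat"}        # have, be, and two sample verbs
HELPERS = {"have", "be"}                      # helping verbs that can move with T

def invert(tokens):
    """Subject-Helping Verb Inversion: T (plus a following helping verb) moves before N."""
    subject, tense, *rest = tokens
    moved = [tense]
    if rest and rest[0] in HELPERS:
        moved.append(rest.pop(0))
    return moved + [subject] + rest

def tense_hop(tokens):
    """Tense-hopping: T adjoins to an immediately following have, be, or V."""
    out = list(tokens)
    for i in range(len(out) - 1):
        if out[i] in TENSES and out[i + 1] in HOSTS:
            out[i:i + 2] = [out[i + 1] + "+" + out[i]]
            break
    return out

def do_support(tokens):
    """Do-support: any Tense left unaffixed is spelled out on do."""
    return ["do+" + t if t in TENSES else t for t in tokens]

deep = ["John", "Past", "visit", "Sally"]
print(do_support(tense_hop(deep)))            # declarative: T hops onto the verb
print(do_support(tense_hop(invert(deep))))    # question: inversion applies first
```

Because inversion is ordered first, it separates T from the verb in the question, tense-hopping finds nothing adjacent to attach to, and do-support applies, giving the pattern of Did John visit Sally?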
The analysis in which tense is generated separate from the verb and then affixed to it
transformationally, unless some earlier transformation destroys the adjacency between
Tense and V, was motivated by the fact that we could find an earlier stage of the
derivation at which Tense and the verb were separated from one another, and some
process applies at this earlier point to destroy the environment for Tense to attach to the
V. We found such a process in our examination of Subject-Helping Verb Inversion. We
will now find two other transformations that destroy the required adjacency between
Tense and V, a transformation that places negatives and a transformation that elides verb
phrases, and we will see that our rule of Tense-hopping, required by the interaction of
the occurrence of Tense and the distribution of yes-no questions in the last lecture, carries
over without modification to the analysis of Tense in these other two areas of English
syntax. The set of rules that is motivated on the basis of the analysis of one area of
grammar must be the same set of rules that is motivated on the basis of other areas of
grammar. We don't have an analysis of Tense for yes-no questions that is different from
the analysis of Tense for negation. If we find that we require a different analysis for
different areas, we assume that we must go back to the drawing board, and that one of
our two analyses is wrong. A grammar is an inter-locking system.
A. Negation
There are two types of negation in English: sentential negation and constituent
negation. Sentential negation is the negation of a sentence, and constituent
negation is the negation of a part of the sentence. For example, consider two
possible interpretations of (1):
(1) John could not read the book.
One interpretation is that John is unable to read the book, and could be
paraphrased as (2):
(2) It is not the case that John could read the book.
In this case, we would say that, with reference to the semantics, or meaning, of
the sentence, the negative is taking scope over the modal.
A second interpretation of (1) is that John is able to refrain from reading the book.
These two interpretations correlate with non-semantic distinctions, such as
phonological or morphological ones. For example, it is possible to contract the
negative onto a helping verb, but only if the negative is an example of sentential,
rather than constituent, negation. For example, (3) can only have the
interpretation of (2), and cannot mean that John is able to refrain from reading the
book:
(3) John couldn't read the book.
We will be concentrating on the distribution of sentential negation for now.
The Placement of Sentential Negation
To consider the placement of sentential negation, let us consider an
affirmative sentence, i.e. one that lacks negation. First, consider a sentence with all the
helping verbs present:
(4) John could have been reading the book.
The negative goes perfectly well after the modal:
(5) John could not have been reading the book.
It cannot occur after have:
(6)* John could have not been reading the book.
It also cannot occur after be:
(7)* John could have been not reading the book.
However, if the modal is absent, the negative will most naturally occur after have:
(8) John has not been reading the book.
(9) * John has been not reading the book.
If both the modal and have are absent, the negative will occur after be:
(10)John is not reading the book.
It seems, then, that the generalization about where the negative will occur is the
following:
(11)The negative will occur after the first helping verb in the sentence, and there is
only one sentential negation in a sentence.
Now, we will try to account for (11) within the grammar, putting Tense aside for the
moment. Assume the phrase-structure rule in (12):
(12) S → N (M) (have) (be) V
It is impossible to introduce negation by the phrase-structure rules and keep to (12). Let
us see why. Suppose we introduced the negative directly after the modal:
(13) S → N (M) (Neg) (have) (be) V
We could generate (5), but we could not generate (8) or (10). Similarly, if we
generated the negative after have, we could generate (8), but we would also
incorrectly generate (6), and could not generate (5) or (10). Similarly, if we
generated the negative after be, we could generate (10), but we would incorrectly
generate (9) and (7).
The impossible phrase-structure rules that are hypothesized in the preceding
paragraph are (14) and (15):
(14) S → N (M) (have) (Neg) (be) V
(15) S → N (M) (have) (be) (Neg) V
If we accounted for the multiplicity of positions for the negation by introducing the
negative at three different points in the phrase-marker, as in (16), we would run
afoul of the generalization that there can only be one negative per simple sentence,
and would incorrectly generate (17):
(16) S → N (M) (Neg) (have) (Neg) (be) (Neg) V
(17) * John could not have not been not reading books.
Chomsky, in Syntactic Structures, proposed a solution which involved dropping the
assumption that negatives are present in the initial phrase-marker. He proposed, instead,
that negatives are inserted via a transformation after the first helping verb in the
phrase-marker. This rule, called negative placement, was formulated as in (18):
(18) N - { M    }
         { have }
         { be   }
     1 - 2  →
     1 - 2 - Neg
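A minimal sketch of negative placement as stated in (18), working on surface word strings: the function name place_negative and the small HELPING_VERBS set are assumptions made for illustration.

```python
# A few surface forms of modals, have, and be (illustrative, not exhaustive).
HELPING_VERBS = {"could", "will", "would", "should", "may", "might", "must",
                 "have", "has", "had", "is", "are", "was", "were"}

def place_negative(tokens):
    """Insert the negative after the first helping verb following the subject."""
    subject, first, *rest = tokens
    if first not in HELPING_VERBS:
        # The structural description of (18) is not met.
        raise ValueError("no helping verb immediately after the subject")
    return [subject, first, "not"] + rest

print(" ".join(place_negative("John could have been reading the book".split())))
print(" ".join(place_negative("John has been reading the book".split())))
print(" ".join(place_negative("John is reading the book".split())))
```

A single insertion site, defined relative to whichever helping verb comes first, yields exactly one negative per sentence; a sentence with no helping verb does not meet the structural description at all, which is the case the lecture turns to next.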
As in the formulation of Subject-Helping Verb Inversion in the last lecture, we again
make reference to the notion first helping verb after the subject, not in the phrase-
structure rules, but in the structural description of a transformation. We will return
shortly to the question of why this set of elements is mentioned in the structural
description of two separate transformations, Subject-Helping Verb Inversion and Negative
Placement, but we will assume the formulation of Negative Placement in (18), and, as
we did with Subject-Helping Verb Inversion, consider the distribution in sentences
without helping verbs:
(18)
a. John did not read books.
b. John does not read books.
Again, the tense does not appear as a suffix on the verb, but is separated from it,
appearing instead on a form of do. We can account for this by generating Tense as an
element separate from the verb, as in the phrase-structure rule (29) in Lecture #5,
repeated here:
(29) S → N T (M) (have) (be) V
and reformulating negative placement as in (19):
(19) N - T ( { M }    )
           ( { have } )
           ( { be }   )
     1 - 2  →
     1 - 2 - Neg
We would then order negative placement before Tense-hopping, repeated at the
beginning of this lecture. Hence, the deep structure (initial phrase-marker) of (18) would
be (20):
(20)
S
  N
    N: John
  T: Past
  V
    V
      V: read
      N
        N: books
Negative Placement will insert the negative after T, transforming (20) to (21):
(21)
S
  N
    N
      N: John
  T: Past
  Neg
  V
    V
      V: read
      N
        N: books
It is clear, however, that Tense-hopping cannot apply, because T and V are not
adjacent, the adjacency having been destroyed by the insertion of the negative element.
Do-support will then apply, as in (36) of Lecture #5.
In short, the same analysis of Tense that we needed for the distribution of yes-no
questions is needed for the analysis of negation.
B. Verb-Phrase Deletion
In English, there is a process that allows verb phrases to fail to be expressed. An
example is (22):
(22) John reads books, and Bill does __, too.
Which means (23):
(23) John reads books, and Bill reads books, too.
Interestingly enough, verb phrases can only fail to be expressed when they follow a
helping verb. There are verbs in English that take verb phrases as complements, typically
the verbs of temporal aspect: start, begin, continue, stop, and keep on. Verb phrases
that appear after the verb begin, for example, cannot elide (pointed out by Joan Bresnan
in a 1976 article, On the Form and Functioning of Transformations, Linguistic Inquiry,
Vol. 7):
(24)*First fire began pouring out of the building, and then smoke began___.
It therefore seems as though Verb Phrase Ellipsis requires what Bresnan calls a
context predicate, or trigger, to be mentioned in the structural description of VP-
Ellipsis:
(25) { M }    - V
     { have }
     { be }
     1 - 2  →  1 - 0
There is an aspect of verb phrase ellipsis that is not specifically mentioned in (25), which
is that the verb phrase that is deleted must be identical to another verb phrase that is
specifically mentioned. In the terminology of (3) of Lecture #4, we would say that the
null element must be anaphoric to another verb phrase in the sentence, so that (22)
cannot mean, for instance, (26):
(26) John reads books, and Bill drinks wine, too.
Now, again notice that, in (22), the tense remains as a suffix to do, while the verb,
which is part of the verb phrase, has been deleted. We can account for this by saying that
the rule of Verb phrase Deletion, formulated finally as in (27), is ordered before Tense-
hopping:
(27) (VP-Ellipsis) { T }    - V
                   { M }
                   { have }
                   { be }
     1 - 2  →
     1 - 0
Applying VP-ellipsis to the deep structure of the second conjunct of (22) generates the
phrase-marker in (28):
(28) S
0
N T
N Past
Bill
Tense-hopping cannot apply, in this case because there is nothing for the Tense to
hop onto (i.e., no Factor #2), and so Do-support will apply, yielding (29):
(29) Bill does.
In this case, again, the same rules of Tense-hopping and Do-support that we needed
for Yes-No Questions and the placement of negation are needed for the distribution of
tense in sentences with elided verb phrases, confirming the original analysis of Tense.
This is what is meant by the grammar being an inter-locking system, with rules that are
justified on the basis of one set of considerations internal to the grammar having to jibe
with rules that are needed for other areas of grammar.
N T M have been V
N Past will V
N V N
the N
book
If we apply VP-ellipsis to (5), however, we predict that all of the helping verbs
would have to remain. They can all remain, as in (6):
(6) Although John wouldn't have been reading the book, Bill would have
been__.
However, it is also possible to just leave the modal, or the modal and have, as in
(7), and this is not predicted by (5), assuming (3):
(7) a. Although John wouldn't have been reading the book, Bill would__.
b. Although John wouldn't have been reading the book, Bill would have__.
If we assume that the ellipses in (7) arise via deletion of Vs, we would have to
assume that the structure of (4) is (8), rather than (5):
(8) S
N T M V0
N Past will V
N V V 1
Bill have V
V V 2
been V
V N
reading Det N
the N
book
This would allow for any of the numbered Vs to delete, assuming (3).
A. Are Modals Tensed?
It has occasionally been suggested that modals are generated directly under T.
This would imply that modals are not themselves tensed. A competing view about modals
is that they can themselves be tensed, implying that they must be generated separately
from Tense, so that the rule of Tense-Hopping should really be formulated as in (9):
(9) T - { M    }
        { have }
        { be   }
        { V    }
    1 - 2  →
    0 - 2+1
First, it can be noted that the modals, while a closed class of items, seem to contain
pairs such as those in (10):
(9) will-would
can-could
shall-should
may-might
This by itself is not persuasive, given the closed nature of this class (there are
fewer than ten modals in the language).
More persuasive is the evidence from idioms. Idioms are sequences of
words whose meaning is non-compositional in nature (i.e., the meaning of
the whole idiom cannot be predicted from the meanings of the individual
words). Examples of idioms are phrases such as make headway, keep tabs
on, kick the bucket, keep track of. Examples are given in (10):
(10)a. John made headway. (means John progressed)
b. John kept tabs on Mary. (John kept apprised of Mary's situation).
c. John kicked the bucket. (John died).
d. John kept track of Mary (same meaning as (b)).
Recalling the role of the lexicon as the repository of idiosyncratic information, idioms,
by the very nature of their unpredictability and irregularity, must be listed in the lexicon.
Hence, an idiom such as, e.g., make headway, will have a lexical entry as in (11):
(11) make headway, [V make] [N headway]
The lexicon has unpredictable information, but recall from our discussion of plurals on
English nouns (i.e., you don't want to specify the plural of book in the lexicon since it's
predictable) that you want to keep the amount of information in the lexicon to the bare
minimum. In this connection, there are idioms that include the modal can, specifically
the idioms can help but and can afford. The requirement that help but and afford occur
with can is seen in (12):
(12)a. *Did John help but notice?
b. *John afforded a new car.
Obviously, the lexical entries for these idioms will have to mention can, and will
look something like (13):
(13)a. can help but, [M can] [V help] [Conj but] V
b. can afford, [ M can] [V afford] N
Interestingly enough, could can replace can in these two idioms:
(14)a. Can he help but notice?
b. Could he help but notice?
(15)a. John can afford a new car.
b. John could afford a new car.
If we posit could as a past tense variant of can, formed by Tense-hopping, we can
keep to the lexical entries in (13). If we don't, we would have to have a disjunctive
lexical entry for each of these two idioms:
(16) a. { can   } help but, { [M can]   } [V help] [Conj but] V
        { could }           { [M could] }
     b. { can   } afford,   { [M can]   } [V afford] N
        { could }           { [M could] }
We would then have to answer the question of why the same set of elements appears in
two separate disjunctive statements (i.e., the two lexical entries in (16)), whereas if we
analyze could as a past tense variant of can, we do not have to posit a lexical entry that
leads to the posing of this question.
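The economy argument can be sketched as a lookup. If could is decomposed into can plus Past before the lexicon is consulted, each idiom needs only one entry. The tables SPELL_OUT and IDIOMS and the function names are assumptions made for illustration.

```python
# Surface form -> (lexical stem, tense); only the pair under discussion is listed.
SPELL_OUT = {"can": ("can", "Pres"), "could": ("can", "Past")}

# One lexical entry per idiom, each mentioning only the stem "can".
IDIOMS = {("can", "help", "but"), ("can", "afford")}

def stems(words):
    """Strip tense from any word that has a spell-out entry."""
    return tuple(SPELL_OUT.get(word, (word, None))[0] for word in words)

def is_listed_idiom(words):
    """Check the tense-free form of the words against the idiom entries."""
    return stems(words) in IDIOMS

print(is_listed_idiom(["can", "afford"]))
print(is_listed_idiom(["could", "afford"]))
print(is_listed_idiom(["could", "help", "but"]))
```

Without the decomposition step, IDIOMS would need a separate could entry for each idiom, which is the duplicated disjunction of (16).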
B. The Position of The Modal
Earlier in this lecture, I proposed that the phrase-structure rule for S should be
revised to (17):
(17) S-- N T (M) V
I would now like to propose that M heads its own projection as well, perhaps as V.
First, it is time to reconsider the treatment of sentential negation in English. Earlier, we
analyzed negatives as not being present in deep structures, but as being inserted after Tense
and the first helping verb. The rule was formulated as (19) in Lecture #6:
(19) N - T ({M, have, be})
     1 - 2 --
     1 - 2 - Neg
However, claiming that negatives will not be present, but rather inserted after T,
predicts that negatives will not be able to occur in clauses that apparently lack Tense.
Such clauses exist, however. Infinitives are a case in point (we will return to infinitives
in more detail later):
(19) For John to leave early.
Sentential negation precedes the to:
(20) For John not to leave early.
If we assume that negatives are only inserted after T, how do we then account for the
presence of negation in clauses that apparently lack T?
Another case that makes the same point is gerunds:
(21) a. John's eating steak bothered me.
     b. John's not eating steak bothered me.
One proposal that has been made is due to Jean-Yves Pollock (Verb Movement,
Universal Grammar, and the Structure of IP, Linguistic Inquiry, Vol. 20 (1989)). He
proposed that negation heads its own projection, so that there is a constituent Neg
Phrase (Neg"). If this is the case, we might adapt (17) to (22):
(22) S-- N" T (M) {Neg", V"}
     Neg"-- Neg'
     Neg'-- Neg V"
The deep structure (i.e., initial phrase-marker) of (23) would then be (24):
(23)
John would not eat steak.
(24)
S
N T M Neg
John Neg V
V N
eat N
steak
Note, however, that in the infinitive, to follows the negative. Therefore, if we assume
that negation is a head that is lower than Tense in the phrase-marker, to cannot be a sister
to Tense, but rather must also be lower than Tense in the phrase-marker.
With this in mind, let us consider the distribution of modals in infinitives. They
are absent in infinitives, and, in fact, are the only helping verbs that do not appear in
infinitives. We can account for this fact if we analyze to as occurring in the same position
as modals, so that the absence of modals in infinitives is simply a
consequence of the fact that, in English, we can only have one modal per simple
sentence.
We might, therefore, analyze modals as heads of their own projections. Let us
call them M"s. The phrase-structure rules for S would then be as in (25): 3
(25) S-- N" T {Neg", M", V"}
     Neg"-- Neg'
     Neg'-- Neg {M", V"}
     M"-- M'
     M'-- M V"
Looking at (25), however, we see two elements that must appear in every S: N"
and T. N", being a phrasal constituent, is not a possible head, but T is. We might
therefore view T as the head of S, so that S would really be T". However,
we would then have to posit a T', consisting of T and a following phrasal constituent,
which would be the complement of T. In other words, the structure of, e.g., (26)
would be (27):
3 At this point, note that we have the same disjunction, {M", V"}, in two separate
phrase-structure rules. There is a way to eliminate this, but we will not go into it at this time.
N T
N T V
John Pres V
V N
like N
pizza
We can actually find somewhat direct evidence for the constituency of T and
the following phrasal unit, if we assume that only constituents can conjoin
(This argument is originally due to Ray Dougherty in his (1970) article, Recent
Studies on Language Universals, Foundations of Language, Vol. 5).
We must find an element that we know resides in T, based on our analysis so
far. One such element is the do that results from Tense-Hopping being unable to
apply, as in (28):
(28) John does not like pizza.
As Dougherty points out, we can conjoin such sequences as those in (29):
(29) John does not like pizza and does not like steak.
Assuming a T' allows us to conjoin T's, so that the structure of (29) would be (30):
(30)T
N T
N T and T
N T Neg T Neg
Neg V Neg V
V V
V N V N
Hence, we have direct evidence for the constituency of T and a following phrase. If we
analyze S as the maximal projection of T, we can call this phrase a T, and analyze the
phrase following T as its complement. Hence, our phrase structure rules at the clausal
level are given in (31):
(31) T"-- N" T'
     T'-- T {Neg", M", V"}
     Neg"-- Neg'
     Neg'-- Neg {M", V"}
     M"-- M'
     M'-- M V"
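As a rough illustration of what the rules in (31) generate, the following Python sketch enumerates the possible sequences of heads along the clausal spine. The table COMPLEMENTS and the function head_sequences are hypothetical names; the sketch ignores the subject and any complement of V, and treats every complement listed in a rule as an available option.

```python
# Which head may head the complement of which head, read off the rules in (31).
# Assumption: V is treated as the bottom of the spine (its objects are ignored).
COMPLEMENTS = {
    "T": ["Neg", "M", "V"],
    "Neg": ["M", "V"],
    "M": ["V"],
    "V": [],
}


def head_sequences(head="T"):
    """Return every top-down sequence of heads the rules allow, starting at head."""
    options = COMPLEMENTS[head]
    if not options:
        return [[head]]
    return [[head] + rest for comp in options for rest in head_sequences(comp)]


for seq in head_sequences():
    print(" > ".join(seq))
```

The four sequences produced (T Neg M V, T Neg V, T M V, T V) all place negation above the modal, which is the ordering that the next section has to reconcile with the surface order of helping verbs and not.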
C. Restructuring
If we assume that negatives are generated between Tense and the main verb, and
are not placed there by a transformation of negative-placement, and we assume that the
helping verbs are generated lower, and to the right of, negatives, we must account for the
fact that the helping verbs precede, rather than follow, sentential negatives:
(31)a. John would not eat the steak.
b. *John not would eat the steak.
(32)a. John has not eaten the steak.
b. *John not has eaten the steak.
(33)a. John is not eating the steak.
b. *John not is eating the steak.
We might account for the ungrammaticality of (31)(b)-(33)(b) by noting that Tense-
hopping is blocked by the negation, but if that were the reason for the unacceptability of
the (b) examples, we would expect Do-support to be able to rescue them, contrary to
fact:
(34)*John did not will eat the steak.
(35)*John does not have eaten the steak.
(36)*John does not be eating the steak.
Rather, assuming that the negatives stay in the position in which they are generated by
the phrase-structure rules, we must move the helping verbs to the left of them. One way
to state this movement is as a movement of the helping verb to Tense, formulated as in
(37), giving the helping verbs a feature [ +Aux] (for Auxiliary):
(37) Restructuring
T- (Neg) - +Aux
1- 2 - 3 --
3+1 -2 - 0
Hence, the D-Structure of, e.g., (32) would be as in (38):
(38)
(37) T
N T
N T Neg
N pres Neg
John Neg V
V V
+Aux
V
have
V N
eaten Det N
the N
steak
Factoring the phrase-marker according to the structural description of restructuring gives
us the following factored phrase-marker:4
4 We will return to the question of whether anything is left behind when a head moves out of its
maximal projection.
(39) T
N T
N T Neg
N pres Neg
John Neg V
V V
+Aux
V
have
V N
eaten Det N
the N
steak
1 2
3
N T
N T Neg
N V T Neg
+Aux
John Pres Neg V
have V
V
V N
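The structural change of Restructuring in (37) can be sketched in Python over a flat string of terminals. This is only an approximation under stated assumptions: the phrase-marker is flattened to a list, Tense is written as Pres or Past, sentential negation as not, and the set AUX stands in for the feature [+Aux]. The function name restructure is hypothetical.

```python
TENSES = {"Pres", "Past"}
AUX = {"will", "have", "be"}  # assumption: the items bearing [+Aux]


def restructure(terms):
    """Apply (37): T - (Neg) - +Aux  -->  3+1 - 2 - 0 on a flat terminal string."""
    for i, term in enumerate(terms):
        if term not in TENSES:
            continue
        j = i + 1
        has_neg = j < len(terms) and terms[j] == "not"
        if has_neg:
            j += 1
        if j < len(terms) and terms[j] in AUX:
            moved = terms[j] + "+" + term          # term 3 adjoins to term 1
            middle = ["not"] if has_neg else []    # term 2 stays in place
            return terms[:i] + [moved] + middle + terms[j + 1:]
    return list(terms)  # structural description not met: no change


print(restructure(["John", "Pres", "not", "have", "eaten", "the", "steak"]))
# ['John', 'have+Pres', 'not', 'eaten', 'the', 'steak']
print(restructure(["John", "Pres", "not", "like", "pizza"]))
# ['John', 'Pres', 'not', 'like', 'pizza']
```

In the second call the main verb lacks [+Aux], so nothing moves and Tense remains separated from the verb by not, which is the configuration in which do appears in (28).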
The thematic relations that are exhibited in passive sentences are the same as the
thematic relations in the corresponding actives, but the thematic relations in passives
are simply realized in different positions. There is a simple algorithm (i.e., method) for
computing the positions in which thematic relations in passives are realized:
(8) The passive subject bears the thematic relation of the post-verbal NP 5 in the
corresponding active, and the passive object of by bears the thematic relation
of the active subject.
Clearly, it would be desirable to have the grammar of English reflect (8) in some
way, rather than stating (8) as a sort of post-hoc, after the fact observation.
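Stated as a procedure, (8) is just a swap of positions. The Python sketch below is a restatement of the observation, not an analysis: the function name passive_roles, the dictionary keys, and the role labels in the example are all illustrative assumptions.

```python
def passive_roles(active_roles):
    """Apply (8): compute where the active clause's thematic relations
    are realized in the corresponding passive."""
    return {
        "subject": active_roles["post_verbal_np"],
        "object_of_by": active_roles["subject"],
    }


# Illustrative role labels only; any pair of relations is carried over unchanged.
active = {"subject": "experiencer", "post_verbal_np": "theme"}
print(passive_roles(active))
# {'subject': 'theme', 'object_of_by': 'experiencer'}
```

Nothing in the function refers to any particular role, which is the content of (8): whatever relations the active positions bear are simply relocated.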
C. Idiom Chunks
The third regularity between passives and actives concerns the form of idioms,
sequences of words which have meanings that are non-compositional in nature.
Examples of idioms are: keep track of, keep tabs on, make headway.
Sentences with idioms include such sentences as (9):
(9) a. John kept track of Sally.
b. John kept tabs on Sally.
c. John made significant headway.
By their very nature, idioms are unpredictable. Every language has idioms, and
recall that there is a specific place in the grammar to put unpredictable information: the
lexicon, which we have called the repository of idiosyncratic information. One way
of representing idioms is as in (10):
(10) a. track, N, +[keep____[P of X ]]
b. tabs, N, +[ keep___ [P on X ]]
c. headway, N, +[make ____]
Representing the idioms as in (10) reflects the fact that the sequence of words is not just an
isolated list of words, but rather that the words are sequenced in a way that conforms to the
general syntactic patterns of the language. In particular, NPs are generated after verbs, and
keep and make in the idioms above are formally verbs, and track, tabs, and headway are nouns
that are sequenced in the same way that non-idioms are sequenced.
The nouns in each of the three idioms above can appear as subjects of the verbs in the
passive voice:
(11)a. Careful track was kept of Sally.
b. Close tabs were kept on Sally.
c. Significant headway was made by John.
If we generated actives and passives separately, we would have to have disjunctive
subcategorization frames for these nouns, so that, e.g. (10)(c) would have to be modified to (12):
(12) headway, N,{ +[ make__ ] }
{+[ ___be made] }
Clearly, disjunctive subcategorization frames are missing the relationship between actives
and passives. If we allow such disjunctive subcategorization frames, we would then have to ask
why we couldn't have a disjunctive frame for headway as in (13), for instance:
(13) *headway, N, {+[make ____]}
                  {+[____ be seen]}
Generating actives and passives separately does not predict the non-occurrence of lexical
entries such as (13).
5 There is a reason that I am using the term post-verbal NP rather than object, in that I will be
trying to show in Lecture #10 that there are post-verbal NPs that are not objects, but which
participate in the passive construction.
II. The Solution- Generating Passives Transformationally
Chomsky (1957), in Syntactic Structures (Mouton), after noticing the above
regularities between English actives and passives, proposed to capture them by
not generating verbal passives directly via the phrase-structure rules, but
rather by forming passives from the phrase-markers for the corresponding active
sentences. The transformation was formulated as in (14):
(14) N- X - V - N
1 - 2 - 3 - 4--
4- 2 - be+en 3- by +1
We immediately capture the fact that passives correspond to actives in
which the active verbs take post-verbal NPs, because of the mentioning of Ns as Term
#4 in the structural description of the passive transformation. Furthermore, if we
assume that thematic roles are assigned in deep structures, the correspondence stated
in (8) is accounted for. The passive subject is, in deep-structure, the NP that follows
the verb in the corresponding active, and since thematic roles are assigned in deep
structure, whatever thematic role the post-verbal NP got in deep structure will be
retained when it moves. Similarly, if the passive object of by is generated as the deep-
structure subject, whatever thematic role that the subject was assigned at deep structure
will be retained if the subject is postposed in the passive transformation.
Finally, given that the distribution of idioms is stated in the lexicon, and lexical
information is only accessed at deep structure, an idiom chunk that is a postverbal
noun phrase will be permitted to move to subject position.
Hence, a transformational derivation of verbal passives will capture all of the
regularities described at the beginning of this section.
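The structural change in (14) can be written out as a small Python sketch. It assumes the phrase-marker has already been factored into the four terms of the structural description, and it leaves be+en and by as unanalyzed strings; the function name passivize is hypothetical.

```python
def passivize(factors):
    """Apply (14): 1 - 2 - 3 - 4  -->  4 - 2 - be+en 3 - by + 1.

    factors = [N (subject), X (variable), V, N (post-verbal NP)]
    """
    if len(factors) != 4 or not factors[3]:
        return None  # structural description not met: no post-verbal NP
    subject, x, verb, post_verbal_np = factors
    return [post_verbal_np, x, "be+en", verb, "by", subject]


print(passivize(["John", "Past", "make", "significant headway"]))
# ['significant headway', 'Past', 'be+en', 'make', 'by', 'John']
print(passivize(["John", "Past", "telephone", None]))
# None
```

The first output corresponds to (11)(c), Significant headway was made by John: the idiom chunk headway is in its lexically specified position in the input, and only the output places it in subject position.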
III. On Restricting the Scope of the Passive Transformation
The passive transformation, as formulated in Section II, has a number of components:
(i) it preposes the post-verbal NP into subject position; (ii) it postposes the original
subject to the position after by; (iii) it inserts be+en and by. We shall now see that the
postposing of the subject, component (ii), and the insertion of be+en and by,
component (iii), are best viewed as not being transformations.
With respect to agent postposing, we can see that the grammar of English needs a
mechanism to give the object of by, in certain instances, the thematic role that it would have
received had it appeared in subject position, without movement being the way of accounting for
this dependency. Norbert Hornstein first noticed nominals such as (14) in S and the X-Bar
Convention, , Vol. 3 (1977):
(14) John's portrait of Nixon by Warhol.
Warhol is, of course, interpreted as the agent of the verb related to the nominalization portrait,
i.e. portray, and John is interpreted as the owner of the portrait. However, Warhol could not have
moved from any other position within the nominal, since all of the other positions are occupied.
Hornstein's conclusion is that the agent that occurs as the complement of by must be generated in
that position, and that there must be a semantic mechanism that interprets agents in two places:
subject position, and the object position of by. Nominalizations such as (14) point to the
necessity of such a mechanism in some cases, and it would seem natural, given its necessity, to
posit it for verbal passives. This sheds new light on so-called truncated passives, which lack
by-phrases altogether, as in (15):
(15) John was murdered.
It had previously been thought that there was an underlying agent there which was
deleted by a transformation called Unspecified Agent Deletion, an optional
transformation which, if it did not apply, would yield (16):
(16) John was murdered by someone.
Another way of interpreting truncated passives is to say that by-phrases are
adjuncts, and that the subject thematic role is present but not linked to an argument.
Notice that I say that by takes whatever thematic role the subject would take. It has
occasionally been suggested that by marks agents. This cannot be right, however, in
view of passives such as (17):
(17) a. A crushing defeat was endured by Germany.
b. A glancing blow was suffered by John.
The subjects of the verbs endure and suffer are not agents.
Another problem exists with the idea that passive be is always inserted via a passive
transformation, in that we find passives without be:
(18) I want him given a book.
A rule deleting to be is occasionally invoked, so that the deep structure of (18) would
be the structure corresponding to (19):
(19) I want him to be given a book.
Presumably, to be deletion would assign a common deep structure to (20)(a) and
(20)(b) as well:
(20)(a) I consider him to be crazy.
(b) I consider him crazy.
However, we can find evidence against the rule of to be deletion if we consider
the English expletive there , which needs a verb such as be for its appearance:
(21) There is a valid reason for his absence.
We can have a full infinitival counterpart to (21) within a VP headed by consider, but the copula
must be retained:
(22) a. I consider there to be a valid reason for his absence.
b. * I consider there a valid reason for his absence.
Assuming that the expletive there requires the copula, we must ask why, if there is a
rule of to be deletion, the expletive could not be licensed at deep structure by the
copula, followed by the copula's deletion, yielding (22)(b). Because (22)(b) is not
acceptable, we can account for its unacceptability by not positing a deep structure
containing be for such instances of secondary predication.
If there is no rule of to be deletion, we must conclude, then, that passives
contained in such larger structures as (18) must be derived without be having been
present in their formation.
Hence, the transformation involved in the formation of English verbal passives is
simply a transformation that preposes the post-verbal NP, formulated as in (23):
(23) N - X - V - N
     1 - 2 - 3 - 4 --
     4 - 2 - 3 - 0
We will now see that (23), which we will call NP-Preposing, operates in a wider
range of constructions than just passives. In the next section, we will see the operation
of NP-Preposing in a class of superficially intransitive verbs that do not take passive
morphology at all.
II. Unaccusatives
There is a great deal of evidence from other languages that superficially intransitive
verbs differ in the syntactic position of the one argument that occurs with the verb.
The sole argument generally acts as a surface structure subject, but for some verbs,
there is evidence that the surface subject is an underlying object, while the surface
subjects of other verbs are deep-structure subjects as well. Verbs of the latter class
take subjects that are agents, while verbs of the former class take subjects that are
non-agents. Examples of the two types of verbs are the verbs telephone and arrive:
(23) John telephoned.
(24) John arrived.
Therefore, the deep structures of (23) and (24) are (25) and (26), respectively:
(25) T
N T
John T V
Past V
Telephone
(26) T
N T
e T V
Past V
V N
arrive John
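On this analysis, the surface order of John arrived would come from NP-Preposing (formulated above as N - X - V - N, 1 - 2 - 3 - 4 -- 4 - 2 - 3 - 0) applying to the deep structure in (26). A minimal Python sketch follows, with the phrase-marker flattened to four factors and the empty subject written as the string e; both are assumptions made for illustration, and np_prepose is a hypothetical name.

```python
def np_prepose(factors):
    """Apply NP-Preposing: 1 - 2 - 3 - 4  -->  4 - 2 - 3 - 0."""
    if len(factors) != 4 or not factors[3]:
        return None  # no post-verbal NP, so the rule cannot apply
    _, x, verb, post_verbal_np = factors
    return [post_verbal_np, x, verb]


# Deep structure (26): empty subject, post-verbal John.
print(np_prepose(["e", "Past", "arrive", "John"]))
# ['John', 'Past', 'arrive']

# Deep structure (25): John is already the subject; there is nothing to prepose.
print(np_prepose(["John", "Past", "telephone", None]))
# None
```

The two verbs end up with the same surface order, John followed by the tensed verb, even though only arrive involves movement.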
There is no evidence for this distinction in English, but there is a great deal of
evidence from other languages. Furthermore, there seems to be no basis for
learning this distinction in these other languages, and the two classes in each of
the languages that show overt evidence of the distinction seem to have the same
set of verbs.
We will first look at the evidence from Italian, as first discussed by David
Perlmutter (1978, Impersonal Passives and the Unaccusative Hypothesis,
Proceedings of the Berkeley Linguistic Society). Perlmutter noted that Italian
has two auxiliaries that are used for expressing past tense- the verbs avere
(roughly have) and essere (roughly be). Transitive verbs take avere in the past
tense:
agentive intransitives may also be marked with the genitive of negation. Examples are given
in (36) and (37):
(36) (Babby's (4)(b)): V-nasem-lesu-ne-rastet-gribov.
In-our-forest-neg-grow(3rd. sg.) mushrooms (GEN pl.)
There are no mushrooms growing in our forest.
(37) (Babby's (6)(b)): Ne-ostalos-somnenij.
Neg-remained(3rd. n. sg.)-doubts (GEN pl.).
There were no doubts that remained.
Subjects of negated transitive verbs that are nominative in the affirmative cannot take the
genitive:
(39) (D. Pesetsky, Paths and Categories, unpublished Doctoral dissertation, MIT (1982), ex.
(15)):
a. ni odna gazeta ne pecataet takuju erundu.
Not one newspaper (fem nom sg) NEG prints (3sg) such nonsense (fem acc sg).
b. *ni odnoj gazety ne pecataet takuju erundu.
interpreted factively, for-to complements can never be, as noted by Kiparsky &
Kiparsky (1970).
A second point about the choice of complementizer is that it is lexically restricted,
a point made by Bresnan. While verbs such as hate can take either that-complements
or for-to complements, verbs such as claim can only take that-complements, and verbs
such as wait can only take for-to complements:
(8) a. John claimed that Fred was more popular than him.
b. *John claimed for Fred to be more popular than him.
(9) a. *John waited that Fred was more popular than him.
b. John waited for Fred to be more popular than him.
Therefore, if there were a complementizer-placement transformation, as proposed
by Rosenbaum, there would actually have to be two complementizer-placement
transformations-one to insert the that-complementizer, and the other to insert the for-to
complementizer. The transformation would have to be lexically restricted, so that
claim would trigger the that-placement transformation, wait would trigger the for-to
placement transformation, and hate would trigger either one, with a rule of
interpretation interpreting the that-complement as factive.
We have been making a division thus far between the lexicon, which contains
unpredictable information that is peculiar to particular lexical items, and the syntax,
which is more regular. Syntactic rules are thought to apply maximally generally, but
the marking of a large number of lexical items as to which of a family of
transformations apply to them undercuts this division between lexical (idiosyncratic)
and grammatical (systematic). Therefore, Bresnan proposes to base-generate
complementizers (i.e., generate them directly via the phrase-structure rules). She
originally proposed the phrase-structure rule in (10):
(10) S-- Comp S
and the selection by particular predicates is now a simple matter of selection, rather
than features that trigger particular transformations. Hence, the lexical entries for
hate, claim, and wait would be as in (11):
(11) a. hate, V, +[___ [S [Comp {that, for}]]]
     b. claim, V, +[___ [S [Comp that]]]
     c. wait, V, +[___ [S [Comp for]]]
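Selection of this kind amounts to a lookup in the verb's lexical entry. The Python sketch below encodes the three entries in (11) as a table and checks a verb-complementizer pair against it; the names SELECTS and licenses are hypothetical, and the entries are reduced to bare sets of complementizers.

```python
# The complementizers each verb selects, following the entries in (11).
SELECTS = {
    "hate": {"that", "for"},
    "claim": {"that"},
    "wait": {"for"},
}


def licenses(verb, complementizer):
    """True if the verb's lexical entry allows this complementizer."""
    return complementizer in SELECTS.get(verb, set())


print(licenses("claim", "that"))  # True, as in (8)(a)
print(licenses("claim", "for"))   # False, as in (8)(b)
print(licenses("wait", "that"))   # False, as in (9)(a)
print(licenses("wait", "for"))    # True, as in (9)(b)
```

No transformation has to be marked as applying to particular verbs here; the idiosyncrasy lives entirely in the table, which plays the role of the lexicon.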
Updating (10) into current X-bar terms, we would say that Comp is the head of this
clausal projection, so that (10) would be replaced by (12):
(12) a. C"-- C'
     b. C'-- C T"
Bresnan (1974) (The Position of Certain Clause-Particles in Phrase-Structure,
Linguistic Inquiry, Vol. 5) later provides direct evidence for the constituency in which
the rest of the sentence forms a constituent that is sister to the complementizer. There is
a construction known as the Right-Node-Raising construction, in which, in a conjoined
phrase, if the rightmost elements of the conjuncts are identical, the final rightmost
element is set off intonationally by a pause, and the previous rightmost elements are
deleted. An example is (13):
(13) Mary wrote, and John performed, a beautiful Peruvian love song.
which is presumably related to (14):
(14) Mary wrote a beautiful Peruvian love song, and John performed a beautiful
Peruvian love song.
The structure of (13) is plausibly (15):
(15) T
T and T N
N T N T a beautiful Peruvian
love song
Mary T V John T V
Past V Past V
V V
wrote performed
The assumption is that only constituents can appear in the position after the pause in
the Right-Node-Raising construction. With this in mind, Bresnan notes that the
sequence after the complementizer can appear in this position:
(16) I'm wondering whether, but I'm not sure that, your hypothesis is correct.
Hence, we have evidence for the constituency in which the complementizer is set off
from the rest of the clause.
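The string-level effect of Right-Node-Raising, relating (14) to (13), can be sketched in Python as factoring out the identical right edge of two conjuncts. This is a deliberately shallow approximation: it works on word strings and does not check that the shared material is a constituent, which is exactly the condition the argument above relies on. The function name is hypothetical.

```python
def factor_shared_right_edge(first, second):
    """Split two conjuncts into their distinct parts and their shared right edge."""
    shared = []
    while first and second and first[-1] == second[-1]:
        shared.insert(0, first[-1])
        first, second = first[:-1], second[:-1]
    return first, second, shared


left = "Mary wrote a beautiful Peruvian love song".split()
right = "John performed a beautiful Peruvian love song".split()
a, b, shared = factor_shared_right_edge(left, right)
print(" ".join(a) + ", and " + " ".join(b) + ", " + " ".join(shared) + ".")
# Mary wrote, and John performed, a beautiful Peruvian love song.
```

A fuller model would run the same factoring over phrase-markers and accept the result only when the shared edge is a single node, as the sequence after the complementizer is in (16).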
B. For-Infinitives
Let us now consider the position of the infinitive marker to. As we noted in
Lecture #7, it follows the sentential negation, and is incompatible with the presence
of a modal. The incompatibility of to with the modal could follow from the fact that
only one modal is permitted per clause if we analyzed the to itself as a modal.
We could therefore assume that the infinitive takes a null T, which selects to in
the modal position. Hence, the structure of (17) would be (18):
(17) For John to leave.
(18) C
C T
For N T
John T M
0 M
M V
to V
Leave
We can account for the dependencies between that-complementizers and finiteness, and
for-complementizers and non-finiteness, via the mechanism of selection, if we assume
that heads select for the heads of their complements ( as I had argued in Heads and
Projections, in M. Baltin & A. Kroch, eds., Alternative Conceptions of Phrase-
Structure (1989), University of Chicago Press). Hence, the following lexical entries
would suffice:
(19) that, C, +[____ [T {Pres, Past}]]
(20) for, C, +[____ [T 0]]
(21) 0, T, +[____ [M to]]
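The three entries above form a chain of head-to-head selection that can be checked mechanically. In the Python sketch below, a clause is reduced to its top-down sequence of heads, each written as a (category, item) pair, with the null Tense written as the string "0". The table SELECTION and the function well_formed are hypothetical names, and heads with no entry in the table are assumed to impose no restriction.

```python
# Each head lists the heads it allows for its complement, following (19)-(21).
SELECTION = {
    ("C", "that"): {("T", "Pres"), ("T", "Past")},
    ("C", "for"): {("T", "0")},
    ("T", "0"): {("M", "to")},
}


def well_formed(heads):
    """Check every head against the head of its complement, top down."""
    for head, complement_head in zip(heads, heads[1:]):
        allowed = SELECTION.get(head)
        if allowed is not None and complement_head not in allowed:
            return False
    return True


# For John to leave: for selects null T, which selects to.
print(well_formed([("C", "for"), ("T", "0"), ("M", "to"), ("V", "leave")]))  # True
# that cannot combine with the null Tense of an infinitive.
print(well_formed([("C", "that"), ("T", "0"), ("M", "to"), ("V", "leave")]))  # False
# for cannot combine with a finite Tense.
print(well_formed([("C", "for"), ("T", "Past"), ("V", "leave")]))  # False
```

Each check is strictly local, between a head and the head of its complement, yet the dependency between for and to follows because the null T mediates between them.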
C. Clauses in NP Positions
It is clear that clauses can appear in subject position. To see this, consider an
alternative way of expressing the previous sentence:
(21) That clauses can appear in subject position is clear.
It is also clear that clauses can appear in object position:
(22) John proved that Bill liked Sally.
It is also clear that clauses in object position can passivize:
(23) That Bill liked Sally was believed by everybody.
Rosenbaum proposed to account for the ability of clauses to appear in subject and
object position, as well as the ability of clauses to passivize, by positing a phrase-
structure rule as in (24):
(24) N"-- C"
However, this violates X-bar theory, in that N" is not a projection (ultimately) of
N0. J. Emonds (1976) (A Transformational Approach to English Syntax, Academic
Press) modified Rosenbaum's analysis by proposing that these clauses were actually
complements to a null N0 head, so that we have the phrase-structure rule as in (25):
(25) N-- N C
C T
N T
N T V
N C Pres V
0 C V A
C T be A
(27) For every sentence in which a clause appears in subject position, there is a
variant of the sentence in which the clause appears at the end of the sentence, and the
expletive it appears in subject position.
For example, we have the following pairs:
(28)a. For Fred to leave would bother me.
b. It would bother me for Fred to leave.
(29) a. That Fred is crazy is obvious.
b. It is obvious that Fred is crazy.
(30) a. That Fred has blood on his hands proves nothing.
b. It proves nothing that Fred has blood on his hands.
The exceptionless nature of this generalization strongly suggests that the grammar of
English should be formulated in such a way as to express it. There have been two main
approaches to capturing this generalization transformationally: extraposition and
intraposition. The extraposition approach moves the C rightward and inserts the
expletive it. The intraposition approach takes the variant in which the C is in clause-
final position as basic, and moves the C leftward into the subject position.
Extraposition was originally proposed by Rosenbaum, and Intraposition was proposed
by J. Emonds (1970) in his MIT Doctoral dissertation, Root, Structure-Preserving, and
Local Transformations.
Extraposition can be formulated as follows:
(31) [N it - C] - X - V
1- 2 - 3 - 4---
1 - 0 - 3 - 4+2
Hence, the D-structure of (28)(a) would be (32):
(32) C
C T
N T
N T M
N C Past M
0 C M V
C T will V
For N T V N
Fred T M bother me
0 M
M V
to V
Leave
(33) C
C T
N T
It T M
Past M
M V
will V C
V N
bother me
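The structural change of Extraposition in (31) can be sketched in Python over a factored string. The sketch assumes the four terms have already been identified and are given as word lists, and it uses the surface form would for the Past + will of the trees; the function name extrapose is hypothetical.

```python
def extrapose(it, clause, x, vp):
    """Apply (31): 1 - 2 - 3 - 4  -->  1 - 0 - 3 - 4+2.

    it: the expletive (term 1); clause: the C" (term 2);
    x: the variable (term 3); vp: the verb phrase (term 4).
    """
    return [it] + x + vp + clause  # the clause is removed from subject position
                                   # and adjoined to the right of the verb phrase


words = extrapose("it", ["for", "Fred", "to", "leave"], ["would"], ["bother", "me"])
print(" ".join(words))
# it would bother me for Fred to leave
```

The output corresponds to (28)(b). If the rule does not apply, the clause stays in subject position and the result is (28)(a).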
Intraposition works in reverse: the underlying structure of (28)(a) and (b) would, under
the intraposition analysis, be (34):
(34) C
C T
N T
It T M
past M
M V
will V
V N C
(36) C
C T
N T
N T M
C Past M
C M V
C T will V
For N T V N
Fred T M bother me
0 M
M V
to V
Leave
The intraposition account generates the clausal argument within the V, and it is tied
to the independent motivation for positions within the V for clausal arguments. For
example, the intraposition analysis of (28)(b), in which the clausal argument is generated
within the VP, depends upon the phrase-structure rule in (37):
(37) V-- V (N) (C)
and the analysis requires, for its plausibility, that there be independent instances of
this pattern in which the subject is something other than the expletive it. We can find
such independent instances of the pattern V N C. For example, we have the verbs
convince, tell, and persuade:
(38) a. John convinced Sally that she should leave.
b. John told Sally that she should leave.
c. John persuaded Sally that she should leave.
However, we have no instances of the pattern in (39), a verb followed by two
clausal arguments:
(39) * V
V C C
Verbs with sentential subjects and complements exist, however (these are known
as bisentential verbs):
(39) a. That John has blood on his hands proves that he's the murderer.
b. That John has blood on his hands convinces me that he's the murderer.
c. That John has blood on his hands suggests that he's the murderer.
d. That John has blood on his hands indicates that he's the murderer.
e. That John has blood on his hands means that he's the murderer.
If clausal arguments are generated within the VP, as they are under the intraposition
analysis, we would need to generate the configuration in (39), but this configuration
would only be employed for verbs in which one of the arguments ended up in subject
position. We would then have to answer the question of why no verbs existed which
allowed both clausal arguments to remain inside the V, i.e. why there are no verbs of
the form in (40):
(40) * John glorped that Fred has blood on his hands that hes the murderer.
If we adopt the extraposition analysis, which allows for clauses to be generated in
subject position and moved rightward, we do not have this problem. We would
generate the clausal subjects in (39) in subject position, and only generate one clause
inside the V, making (39)(a-e) parallel to (41)(a-e) or (42)(a-e), in which the clausal
subject or object is replaced by a constituent that is clearly an N:
(41) (a) This proves that he's the murderer.
(b) This convinces me that he's the murderer.
(c) This suggests that he's the murderer.
(d) This indicates that he's the murderer.
(e) This means that he's the murderer.
(42) (a) That John has blood on his hands proves nothing.
(b) That John has blood on his hands convinces me of nothing.
(c) That John has blood on his hands suggests his guilt.
(d) That John has blood on his hands indicates his guilt.
(e) That John has blood on his hands means nothing.
I believe that a further argument can be made for allowing clauses to be generated in
subject position, and this argument deals with the possibility of formulating a set of
linking principles , principles that link thematic roles and syntactic positions. Recall
that, in Lecture #8, when we discussed unaccusatives, we noted that the same set of
verbs (i.e. translation equivalents of each other) were unaccusative and unergative, so
that agentive intransitives were unergative, and non-agentive intransitives were
unaccusative. The cross-linguistic predictability of the membership of the two classes
of verbs indicated strongly that Universal Grammar has some linking principles that
require this. A problem with the formulation of such linking principles, however, is
that some psychological predicates seem to exist which are paired in such a way that
the two members of the pair take the same set of arguments, and the same set of
thematic relations of the arguments, but the thematic relations of the arguments of
each verb are realized in the opposite positions from the other verb. The verbs fear and
frighten show this:
(43) a. John fears Sally.
b. Sally frightens John.
Each of these verbs takes an argument that is called an experiencer, the experiencer of
the emotion, as well as what could, for convenience, be called the theme, the object of
the emotion. However, the experiencer is the subject of fear but the object of frighten,
and the theme is the object of fear but the subject of frighten. If these two sentences are
synonymous, and hence have the same array of thematic relations for their arguments,
how could we say that there is a universal set of linking principles that allows us to
predict the syntactic position of an argument from its thematic relation?
Grimshaw (1990) (Argument Structure, MIT Press) provides a solution. She
claims that the synonymy of the pair in (43) is only apparent. In particular, she notes
that there is a grammatical difference between verbs in which the theme is the subject
and the experiencer is the object, and verbs with an experiencer subject and a theme
object. Verbs of the former class can appear in the progressive, while verbs of the latter
class cannot:
(44) a. *John is fearing Sally.
b. Sally is frightening John.
She ties this difference in progressivizability to the claim that object-experiencer
verbs are accomplishment verbs (recall the discussion in terms of Zeno Vendler's
classification in the early lectures), while subject-experiencer verbs are states. She
then posits a lexical representation for accomplishment verbs in which they are
decomposed into two parts: (i) an activity of causing, which results in (ii) a state.
Hence, the lexical representation of the meaning of, e.g., frighten would be cause to fear,
as in (45):
(45) frighten, V, [[CAUSE] [ENTITYi] [STATE [FEAR] [ENTITYj]]]
The linking principle, then, would link the causer to the subject position.
A. Understood Subjects
We will now look at infinitives that are not introduced by a for-complementizer,
and do not even seem to contain subjects. An example is the following:
(1) To leave would be inconvenient.
It is clear that, in some sense, there is an understood subject, and this can
be brought out if we add a benefactive phrase to the main clause:
(2) To leave would be inconvenient for Fred.
We can understand (2) to mean either that it would be inconvenient for Fred if he
himself were to leave, or it would be inconvenient for Fred if somebody else were
to leave. The fact that we understand a subject, however, does not mean that
the subject is present in the syntactic representation, i.e. the phrase-marker. We
assume that the grammar of natural language is, like the grammars of logical
languages, organized in such a way that the syntactic component generates
representations that are then interpreted by the semantics. Therefore, the fact
that a subject is understood does not mean that it is present syntactically.
Let us then look for some syntactic evidence that infinitives have subjects.
There are a number of theories of grammar that claim that infinitives do not
have syntactic subjects, but rather, subjects that are, as it were, plugged in, or
supplied, by the semantics. The subject is missing, in all of these approaches,
because there is no structural position for it. For example, we could generate
subjectless infinitives as Ms, with the phrase-structure rules that we have used
in (3):
(3) M- M
M-- M V
And we could generate them in the same way that we decided to
generate clauses that function as Ns, i.e. as in (4):
(4) N- N ({ M } )
( {C } )
Hence, the D-structure of (1) would be as in (5):
(5) [Phrase-marker not reproduced. Terminal string: 0 to leave Past will be inconvenient.]
(15) a. C-- C
b. C-- C T
c. T- N T
d. T-- T { M }
{ V }
Notice that by using the phrase-structure rules in (15), there is a subject
position, and we must then ask why this subject position for the infinitive is not
overtly realized. We must also ask why the complementizer position is not
overtly realized.
In evaluating the claim of the plug-in theory of understood subjects of
infinitives, in which they are not syntactically present but instead supplied in
the semantics, and subjectless infinitives are generated as Ms, a further
complication arises with respect to the substitutability of for-infinitives and
subjectless infinitives. A for-infinitive cannot appear as the complement of a verb
if its subject is understood as identical to the main clause subject. English has a
form that expresses the identity of a noun phrase with another noun phrase in
the sentence, and this form is called the reflexive pronoun (we will be talking
more about reflexive pronouns shortly):
(16) John likes himself.
The identity of John and himself, termed referential identity because both terms
pick out the same individual, is usually expressed by writing the same index
on each of the two co-referential terms; this device is termed co-indexing. An example is
(17):
(17) John_i likes himself_i.
We cannot use a for-infinitive, however, when the subject of the infinitive is co-
indexed with the main clause subject. We must use the subjectless infinitive:
(18) a. *He would prefer for himself to win.
     b. He would prefer to win.
(19) a. *He would hate for himself to lose.
     b. He would hate to lose.
(20) a. *He was hoping for himself to win.
     b. He was hoping to win.
(21) a. *He was waiting for himself to leave.
     b. He was waiting to leave.
If we adopt the plug-in view of understood subjects, and all that it entails, we
would still need a mechanism to prevent the generation of for-infinitives with
subjects that are co-referential with main clause subjects. In other words, given
[Phrase-marker not reproduced; its example number is missing in the source. Terminal string: He past will prefer 0 for himself 0 to win.]
Reflexive Deletion then applies, yielding (30):
(30) [Phrase-marker not reproduced: the preceding structure with himself deleted, leaving for ... to win as the complement of prefer.]
Finally, For-Deletion applies, yielding (31):
(31) [Phrase-marker not reproduced: the structure in (30) with for deleted. Terminal string: He past will prefer 0 0 to win.]
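The order of the two deletion rules can be sketched as two string operations applied in sequence. This is only an illustration in Python, not the lecture's formalism: it works on flat word lists rather than phrase-markers, the function names are hypothetical, and the contexts assumed for the rules (a reflexive directly after for; for directly before to) are simplifications.

```python
REFLEXIVES = {"myself", "yourself", "himself", "herself", "itself",
              "ourselves", "themselves"}

def reflexive_deletion(words):
    """Delete a reflexive that directly follows 'for' (assumed context)."""
    return [w for i, w in enumerate(words)
            if not (w in REFLEXIVES and i > 0 and words[i - 1] == "for")]

def for_deletion(words):
    """Delete 'for' when it directly precedes infinitival 'to' (assumed context)."""
    return [w for i, w in enumerate(words)
            if not (w == "for" and i + 1 < len(words) and words[i + 1] == "to")]

def derive(words):
    """Apply Reflexive Deletion first, then For-Deletion."""
    return for_deletion(reflexive_deletion(words))

print(" ".join(derive("He would prefer for himself to win".split())))
print(" ".join(derive("He would prefer for John to win".split())))
```

With a non-reflexive subject such as John, neither rule's context is met, so for is retained.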
(40) [Phrase-marker not reproduced. Terminal string: careful track Past seem for e 0 to have been kept careful track of his progress.]
We would then need a mechanism to convert the second occurrence of the idiom
chunk careful track to a reflexive, after which it would undergo N-preposing to the
empty subject position of the infinitive, where it would undergo reflexive deletion,
and the for would undergo for-deletion.
However, we would be violating our lexical requirements on the occurrence of
these idiom chunks by generating them as subjects of seem.
It would also be desirable to relate the use of seem with the finite complement and
expletive subject to the use with the infinitive. Let us now try to do this.
Suppose we give seem the lexical entry in (41):
(41) seem, V, +[___C]
We must make one stipulation. Given that the overt complementizer for never shows up in this
type of infinitive construction, we actually have no evidence that the infinitive is introduced by for
here. We do have evidence, as we have just seen, that the subject of seem, when seem takes
an infinitive, is, for all intents and purposes, the subject of the infinitive. Furthermore, we have
seen that seem, when it takes a finite complement, lacks a subject in the semantic sense. Let us
then generate seem without a semantic subject in both instances, so that the D-structure of, e.g.
(35)(a), would be (42):
(42) [Phrase-marker not reproduced. Terminal string: e Past seem John 0 to be happy, with e in the subject position of seem and John in the subject position of the infinitive.]
The symbol e simply means empty, i.e. an unexpanded node in the phrase-marker. We then
apply N-preposing, the same transformation that was employed in the derivation of passive and
unaccusative constructions, to move the subject of the infinitive into the subject position of seem.
Recall that the formulation of N-preposing was given in Lecture #8, (23), repeated here:
(23) N - X - V - N
     1 - 2 - 3 - 4  -->  4 - 2 - 3 - 0
In order to have the phrase-marker in (42) meet the structural description of N-preposing, we
must disregard the intervening complementizer. Let us therefore, for the moment, assume that
null elements are not factored as being present when inspecting phrase-markers for compatibility
with the structural descriptions of transformations.
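The way a structural description is matched while null elements are ignored can be sketched in Python. The encoding is an assumption made for illustration: a phrase-marker is flattened to a list of (category, terminal) pairs, "e" marks an unexpanded N, "0" marks a null element, and the innermost empty N is tried first to mimic applying the rule in the lower clause before the higher one. None of these names come from the lecture.

```python
NULL, EMPTY = "0", "e"

def n_preposing(pm):
    """Apply N - X - V - N (1-2-3-4 --> 4-2-3-0) once.

    Returns the new list, or None if the structural description is not met.
    """
    empties = [i for i, (cat, word) in enumerate(pm)
               if cat == "N" and word == EMPTY]
    for i in reversed(empties):              # term 1: an empty N, innermost first
        for j in range(i + 1, len(pm)):
            if pm[j][0] != "V":              # term 3 must be a V; term 2 is pm[i+1:j]
                continue
            k = j + 1
            while k < len(pm) and pm[k][1] == NULL:
                k += 1                       # null elements are not factored
            if k < len(pm) and pm[k][0] == "N" and pm[k][1] != EMPTY:
                result = list(pm)
                result[i] = pm[k]            # term 4 fills the empty N
                result[k] = ("N", NULL)      # and leaves 0 behind
                return result
    return None

def overt(pm):
    """Return the terminals that are neither null nor empty."""
    return [word for _, word in pm if word not in (NULL, EMPTY)]

pm42 = [("N", "e"), ("T", "Past"), ("V", "seem"), ("C", "0"), ("N", "John"),
        ("T", "0"), ("M", "to"), ("V", "be"), ("A", "happy")]
print(overt(n_preposing(pm42)))
```

Because the null complementizer is skipped, John counts as immediately following seem, and the structural description is met.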
Therefore, N-preposing will apply, yielding (43):
(43) [Phrase-marker not reproduced: John now occupies the subject position of seem, and the subject position of the infinitive to be happy is null (0).]
The analysis of the subject position of seem in the infinitive as coming to be occupied
by the employment of N-preposing now makes the derivation of (35)(d), repeated here,
straightforward.
(35)(d) Headway seemed to have been made.
It simply involves two applications of N-preposing to the D-structure in
(44):
(44) [Phrase-marker not reproduced. Terminal string: e past seem 0 e 0 to have be make+en headway, with e in the subject position of seem and in the subject position of the infinitive.]
No such restriction exists for the verbs of the second class (called for
expository convenience the believe-class); any N infinitive sequence is
possible, provided that the N can be interpreted as the subject of the infinitive.
Hence, the star is removed for all of the examples in (3) if the verb is of the
believe-class:
(4) John { believed } { the rock to be on the table }
{ proved } { there to be a valid reason for his absence }.
{expected } { Fred to be six feet tall }.
We see, then, that while the N that follows a verb of either class must be
interpreted as the subject of the infinitive that follows, a verb of the persuade-
class imposes thematic restrictions on the N as well, while a verb of the believe-
class does not. Can we deduce anything about the structure of sentences
containing such verbs from these co-occurrence facts?
With respect to verbs of the persuade-class, we can deduce that the post-
verbal N is not syntactically the subject of the following infinitive, but rather is
in the structural position of the object. We can deduce this from the following
constraint on locality of theta-marking:
(5) Principle of Locality of Theta-Marking:
If α theta-marks (i.e. assigns a theta-role to) β, then α and β must be
sisters.
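The sisterhood requirement in (5) can be checked mechanically on a tree. The following Python sketch is a hypothetical encoding, not part of the lecture: the Node class and the function names are assumptions, and bar levels are omitted from the category labels. It builds a persuade-type structure in which Sally is a sister of the verb and herself is the subject of the for-infinitive.

```python
class Node:
    """A phrase-marker node that records its mother (assumed encoding)."""
    def __init__(self, label, *children, word=None):
        self.label, self.word, self.parent = label, word, None
        self.children = list(children)
        for child in self.children:
            child.parent = self

def are_sisters(a, b):
    """Two distinct nodes are sisters if they share the same mother."""
    return a is not b and a.parent is not None and a.parent is b.parent

def can_theta_mark(head, phrase):
    """Locality of theta-marking as a necessary condition: sisterhood."""
    return are_sisters(head, phrase)

persuade = Node("V", word="persuade")
sally = Node("N", word="Sally")
herself = Node("N", word="herself")
infinitive = Node("C", Node("C", word="for"),
                  Node("T", herself, Node("M", word="to")))
verb_phrase = Node("V", persuade, sally, infinitive)

print(can_theta_mark(persuade, sally))    # Sally is a sister of persuade
print(can_theta_mark(persuade, herself))  # herself is inside the infinitive
```

The check captures only the necessary condition; whether sisterhood is also sufficient for theta-marking is a separate question.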
We can see evidence of (5) by examining sentences containing clausal
complements that are overtly marked by complementizers. Turning first to verbs
that take complements that are introduced by the complementizer for, we see
that the matrix verb never restricts the content of the infinitive in any way, let
alone restricting the subject position of the infinitive:
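(8) [Phrase-marker not reproduced. Terminal string: John Past persuade Sally for herself 0 to be polite, with Sally a sister of persuade and herself the subject of the for-infinitive.]
(9) [Phrase-marker not reproduced. Terminal string: John past persuade 0 Sally 0 to be polite, with Sally the subject of the infinitival C.]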
We can then use the rules of reflexive deletion and for-deletion to derive the
structure for (1)(a).
For verbs of the believe-class, however, the matrix verb does not assign a
theta-role to the post-verbal N. Note that the Principle of Locality of Theta-
Marking only states a necessary condition for theta-marking; in order to be
theta-marked, the N must be a sister to the element that theta-marks it. We
might ask, however, whether sisterhood is also a sufficient condition for theta-marking, in
the sense that if an element is a sister to a lexical head, the head would assign a
theta-role to the element. If we could establish that sisterhood entails theta-
marking, we would then be in a position to establish the structures of sentences
containing believe-type verbs: the post-verbal N would have to be the subject of
the following infinitive, rather than the object of believe. Therefore, the
structure of, e.g., (2)(a) would have to be (10):
(10) [Phrase-marker not reproduced. Terminal string: John past believe 0 Sally 0 to be polite, with Sally the subject of the infinitival C that is the only complement of believe.]
There is some evidence that sisterhood entails theta-marking, as pointed out by
Chomsky (Lectures on Government and Binding (1981), Foris Publications).
Specifically, expletives appear in subject position, but not in object position.
Therefore, we have no intransitive verbs that take an expletive object, such as
the hypothetical laugh (having the thematic structure of laugh, but taking an
expletive object):
(11) a. *John laughed there.
     b. *John laughed it.
We might therefore propose that sisterhood, the environment for
subcategorization, entails theta-marking, requiring the structure in (10).
Proposals have been made in the literature, however, notably by Paul Postal
(On Raising (1974), MIT Press), that, while the post-verbal N may originate as
the subject of the infinitive complement of a believe-type verb, it becomes the
object by a transformation that is called Subject-to-Object Raising, and would
be formulated as follows:
(12) N - [C - N - X]
     1 -  2 - 3 - 4  -->
     3 -  2 - 0 - 4
We can concretize the analysis by positing an empty N position, as in (13):
(13) [Phrase-marker not reproduced. Terminal string: John past believe e 0 Sally 0 to be polite, where e is an empty N that is a sister of believe and of the infinitival C.]
We must ask what the evidence is for subject-to-object raising, which would alter the
structure but not the terminal string (as does restructuring of the helping verbs into T).
The best argument for subject-to-object raising concerns the placement of adverbs that
must modify the main clause, as in (14):
(14) I believe John with all my heart to be guilty.
The adverb obviously refers to the speaker's belief. Now, let us consider a verb that
takes an infinitive complement with the for-complementizer, as in (15):
(15) I would prefer for John to be the winner.
Because the complementizer for is present, we assume that the N that
immediately follows is within the infinitive. Notice, however, that an adverb which
modifies the main clause cannot intervene between the post-verbal N and the
infinitive marker when for is retained:
(16) * I would prefer for John with all my heart to be the winner.
When the for is deleted, however, a main clause adverb can occur there more
naturally.
(17) I would prefer John with all my heart to be the winner.
It would seem, therefore, that an adverb must occur in the clause that it modifies.
Therefore, the N must be in the matrix clause, according to Postal's argument.
One argument for subject-to-object raising that does not go through relies on the
assumption that a reflexive must be in the same clause as its antecedent,
known as a clause-mate condition. Evidence for the clause-mate condition can be seen in
the ungrammaticality of sentences containing reflexives when this condition is not met:
(18) a. * John thinks that nobody likes himself.
b. *John would prefer for himself to win.
However, reciprocals in English seem to be subject to the same distributional constraints
as reflexives:
(19) a. * They think that nobody likes each other.
b. * They would prefer for John to see each other.
However, reciprocals are clearly not subject to a clause-mate condition:
(20) They would prefer for each other to win.
We will return to the distribution of reciprocals and reflexives and their antecedents. It
is an extremely important topic in current syntactic theory, and we will account for the
ungrammaticality of such examples as (18) and (19) in a different way.
Another argument for subject-to-object raising is based on the interpretation of
logical words such as every and not (called logical operators). Consider the
interpretation of a sentence such as (21):
(21) Every boy did not read the book.
Many people say that (21) is ambiguous, and can have either the interpretation in
(22) or (23):
(22) Not every boy read the book.
(23) No boy read the book.
The two interpretations are said to correspond to a difference in the scope (roughly,
the logical jurisdiction) of the two logical operators every and not. In the interpretation
corresponding to (22), the negative is said to take wide scope relative to every, and
every (called a universal quantifier) is said to take narrow scope. In (23), the
universal quantifier is said to take wide scope relative to the negation, and the negative
is said to take narrow scope. The assumption is that there is a mapping procedure
between these expressions in natural language and a logical language which provides the
basis of their semantic interpretation. The logical language is called Logical Form.
The scope of negation in Logical Form corresponds to the clause in which it is contained.
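The two scope readings of (21) can be made concrete by evaluating them against small models. This is an illustrative Python sketch, not the lecture's Logical Form notation; the function names and the sample sets of boys and readers are invented for the example.

```python
def neg_wide(boys, readers):
    """Reading (22): it is not the case that every boy read the book."""
    return not all(b in readers for b in boys)

def every_wide(boys, readers):
    """Reading (23): for every boy, it is not the case that he read the book."""
    return all(b not in readers for b in boys)

boys = {"Al", "Bo", "Cy"}
for readers in ({"Al"}, set(), boys):
    print(sorted(readers), neg_wide(boys, readers), every_wide(boys, readers))
```

When only some of the boys read the book, (22) is true and (23) is false, so the two readings are truth-conditionally distinct. When there is at least one boy, any model that verifies (23) also verifies (22).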
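(18) [Phrase-marker not reproduced: a clause with subject John and Past, whose V takes either a [+wh] N (what) or a C headed by that. The verb itself is missing in the source.]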
We might then reformulate wh-movement to require that the C to whose
Spec the wh-phrase moves must be a +wh C. Hence, we can account for
the ungrammaticality of (20) and (21), because believe selects a CP
headed by a that-complementizer.
To account for whether, we might propose that it is a marker of a yes-no
question that is generated in [ Spec, C ] when the question is a yes-no
question. We might propose, then, that whether is deleted in the Spec of a
direct question, accounting for the interpretation of whether in embedded
contexts, but its absence in main clause contexts.
In short, (17) should be re-formulated as (34):
(34) X - C[+wh] - W - [X +wh]
     1 -    2   - 3 -    4     -->
     4 -    2   - 3 -    0
Thus, wh-movement to the Spec of a CP that does not contain a +wh
complementizer will be ruled out because the structural description of wh-
movement will not be met.
English, however, like most (but not all) other natural languages,
allows for more than one constituent to be questioned. When a multiple
question occurs, such as (35), only one element will undergo
wh-movement:
(35) Who gave what to whom?
We can see the reason for this if we consider the derivation of (35). The D-
structure will be (36):
(36) [Phrase-marker not reproduced: an empty [Spec, C], a [+wh] C, and the clause who past give what to whom.]
Let us assume that the subject wh-phrase moves by wh-movement into the Spec
of the matrix C, yielding (37). This is known as a string-vacuous movement
(like the restructuring of have or be into T, discussed in Lecture #5), in that it
changes the structure without changing the terminal string of the phrase-marker.
(37) [Phrase-marker not reproduced: Who in [Spec, C], a [+wh] C, and the clause Past give what to whom, whose subject position has been vacated.]
In the case of multiple wh-phrases, only one can move to [Spec, C] for the simple
reason that there is only one [Spec, C], and when it is occupied by one wh-
phrase, movement of another wh-phrase to that position would cause the first
wh-phrase to be irrecoverably deleted, violating Recoverability of Deletion.
Recall that movement only takes place to empty positions, as we saw in the case
of restructuring and N-preposing. Hence, these transformations are
obligatory, but only to the extent that their application does not violate
recoverability.
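The interaction between the +wh requirement on C and the single [Spec, C] position can be sketched in Python. The clause encoding (a dict with keys spec, comp, and body) and the choice to move the leftmost wh-phrase are assumptions made for illustration; they are not part of the lecture's formulation.

```python
def wh_movement(clause):
    """Apply wh-movement once; return the clause unchanged if it cannot apply."""
    if clause["comp"] != "+wh":          # structural description not met
        return clause
    if clause["spec"] is not None:       # [Spec, C] is filled: moving another
        return clause                    # phrase there would delete its content
    for i, (word, is_wh) in enumerate(clause["body"]):
        if is_wh:                        # assumption: the leftmost wh-phrase moves
            body = list(clause["body"])
            body[i] = ("0", False)       # the vacated position
            return {"spec": word, "comp": clause["comp"], "body": body}
    return clause

question = {"spec": None, "comp": "+wh",
            "body": [("who", True), ("gave", False), ("what", True),
                     ("to", False), ("whom", True)]}
once = wh_movement(question)
twice = wh_movement(once)
print(once["spec"], [word for word, _ in once["body"]])
print(twice == once)
```

A second application changes nothing, which models the observation that what and whom stay in place in (35).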
C. What the feature +wh selects
A striking discrepancy exists between declarative complements and
interrogative complements. We have seen that, among the set of verbs that select
for declarative complements, some only select finite complements, such as say,
and some only select infinitive complements, such as wait:
(38) a. John said that Sally was crazy.
     b. *John said for Sally to be crazy.
(39) a. *John waited that Sally left.
     b. John waited for Sally to leave.
However, whenever a verb selects an interrogative complement, the
complement can always be either finite or non-finite.
(40) a. I inquired as to whether or not to leave.
b. I inquired as to whether or not I should leave.
(41) a. I asked him whether or not to leave.
b. I asked him whether or not I should leave.
(42) a. He knew what to do.
b. He knew what he should do.
We can account for this by assuming, as we have, that when A selects for
B, A is selecting for the head of B, and the head of B imposes its own
selectional restrictions. Hence, selection is a head-to-head phenomenon. With
this in mind, let us assume that the complementizer that selects for finite T, for
selects for non-finite T, and +wh simply selects for T, regardless of
whether or not T is finite. Hence, it would allow either finite or non-finite
complements. Because selection is simply for the head of a sister, it would be
impossible for a verb that selected a +wh complement to require that the
complement be finite or non-finite, because the verb would be separated from
the complement's Tense by the intervening Complementizer.
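Head-to-head selection can be sketched as two independent lookup tables, one for what a verb selects (a complementizer) and one for what each complementizer selects (a value of T). The Python below is a hypothetical illustration: the table names and the partial lexical entries are assumptions based only on the examples in (38) to (42), not full lexical entries for these verbs.

```python
# What each complementizer selects (from the text): a value of T.
COMP_SELECTS = {
    "that": {"finite"},
    "for": {"non-finite"},
    "+wh": {"finite", "non-finite"},
}

# What each verb selects: a complementizer only (illustrative entries).
VERB_SELECTS = {
    "say": {"that"},
    "wait": {"for"},
    "ask": {"+wh"},
    "know": {"+wh"},
}

def licensed(verb, comp, tense):
    """A complement is licensed if the verb selects C and C selects T.

    The verb never inspects `tense` directly: selection is head-to-head.
    """
    return comp in VERB_SELECTS[verb] and tense in COMP_SELECTS[comp]

print(licensed("say", "that", "finite"))      # John said that Sally was crazy.
print(licensed("say", "for", "non-finite"))   # *John said for Sally to be crazy.
print(licensed("know", "+wh", "non-finite"))  # He knew what to do.
print(licensed("know", "+wh", "finite"))      # He knew what he should do.
```

Since no verb entry mentions finiteness, a verb that selects +wh cannot force its complement to be finite or non-finite, which is the asymmetry the text describes.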
statement of this could be found in a 1976 paper by Lasnik & Fiengo (Some
Issues In The Theory of Transformations, Linguistic Inquiry, Vol. 7). So, for
example, the rule of N-preposing that we have discussed operates in the
derivation of the passive construction, as well as the unaccusative construction
and in the subject-raising constructions.
Similarly, there is not thought to be a specific transformation of question-
formation, responsible for the generation of constituent questions, but rather a
transformation of wh-movement, that generates constituent questions as well as
other constructions in which wh-movement plays a role. One of these other
constructions is the relative clause construction, exemplified in (1):
(1) The man who I saw.
There are two types of relative clauses in English, known as restrictive
relative clauses and non-restrictive relative clauses. The relative clause in (1) is
known as a restrictive relative clause, and an example of a non-restrictive
relative clause is given in (2):
(2) John, who I like,
In written English, non-restrictive relative clauses are set off by commas,
and in spoken English, by pauses (the intonation with pauses around the
relative clauses is actually known as comma intonation). Semantically, the two
types of relative clauses are quite different. The restrictive relative clause serves
to restrict the reference of the head noun, so that in (1), the speaker is
specifying more closely which man is being referred to. Non-restrictive relative
clauses do not restrict the reference of the head noun, but simply provide, as a
sort of side-comment, a description of some property that the head noun
possesses (they are also known as appositive relative clauses). We will now
focus on the structure of restrictive relative clauses, but we must first distinguish
restrictive relative clauses from another construction in which a clause occurs
within a N that contains a lexical head noun, known as the noun-complement
construction.
{ belief }
{ claim }
Notice that the clause within the C in (3) does not contain a gap, and the C is
introduced by that rather than a wh-phrase. Notice that this type of clause
within an N is lexically restricted by the head noun in its occurrence, in that not
all nouns allow this type of clause within the N, as can be seen by the
impossibility of (4):
(4) * The { pencil } that John was the murderer.
{ book }
{ letter }
Hence, it would seem that the nouns exemplified in (3) subcategorize for the
clause, and, by local subcategorization, the noun and the clause must be sisters.
Hence, the structure of the N must be as in (5):
(5) [Phrase-marker not reproduced: within the N, the is the determiner, and the head noun theory and the C that John Past be the murderer are sisters.]
On the other hand, restrictive relative clauses can always occur within a N
that is headed by a common noun. In this sense, the licensing of relative
clauses, which must occur within Ns, is similar to the licensing of temporals,
which occur within a clause. Every simple sentence allows some kind of
temporal, and the specific type is restricted by the semantic class within which
the particular verb is situated, but the type of temporal that can occur within a
clause is not restricted by the individual verb. Hence, temporal phrases that
denote duration cannot occur within sentences headed by stative verbs:
(6) *John { knows       } French while Sally visited Fred.
          { understands }
However, this is a matter of the semantic class of stative verbs, not an
individual lexical choice. For temporals, a temporal can always occur in a
simple sentence, although the particular temporal is restricted by the semantic
context in which it occurs. Restrictive relative clauses show a similar freedom of
occurrence, suggesting that the phrase-structures of temporals and restrictive
relative clauses should be similar.
Interestingly enough, when a noun-complement occurs with a relative
clause, the order within the N is most naturally noun complement-relative
clause, as in (7):
(6) The theory that John is the murderer that Bill was propounding.
Furthermore, there is no upper bound on the number of restrictive relative
clauses that can modify a N:
(7) The book which John wrote which you wanted to read which was on
the table....
The fact that restrictive relative clauses (i) follow noun-complements in the
N, and (ii) are unbounded in number, suggests that relative clauses should be
adjoined to some projection of N. There are two possibilities: (i) adjunction to
N; and (ii) adjunction to N. The two possibilities are shown in (8):
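(8) [Two phrase-markers, (a) and (b), not reproduced: each shows the relative clause C adjoined to a different projection of N, with Det and an optional noun-complement (C) inside the N.]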
In fact, it is possible to choose between (8)(a) and (8) (b) if we analyze numerals
as determiners, as argued by Jean-Roger Vergnaud (1974, French Relative
Clauses, unpublished Doctoral dissertation, MIT). Consider a relative clause
such as (9):
(8) Five men and three women who were similar.
Under the interpretation in which the men in question are similar to the
women in question. A predicate such as similar is known as a symmetric
predicate (G. Lakoff & S. Peters (1969), Phrasal Conjunction and Symmetric
Predicates in English, in D. Reibel & S. Schane, eds., Modern Studies in
English, Holt, Rinehart, & Winston). Symmetric predicates require plural or
conjoined subjects, hence it is impossible to say (unless interpreted elliptically),
John is similar. Hence, the conjunction must be interpreted as being base-
generated. With this in mind, the structure of (8) must be (9):
(9) [Phrase-marker not reproduced: the conjoined N five men and three women, with the relative clause C (Past be similar) adjoined to it.]
If relative clauses are adjoined to N, as in (8)(a), there is no source for the second
numeral, which is analyzed as a determiner by hypothesis. Hence, we have direct
evidence for the adjunction to N for relative clauses.
B. Relative Clauses That Are Not Introduced By a Wh-Phrase
There are restrictive relative clauses that are not introduced by a wh-phrase, an
example of which is given as the title to this section. Notice, however, that all relative
clauses contain a gap. Assuming that the that which introduces these relative clauses is
a complementizer, the subject of the relative clause is missing in the title above. Other
instances of that-relatives which contain a gap in a position other than the subject
position are given in (10):
(10) a. The book that I read
b. The person that John was speaking to
The process that forms the gap in that-relatives has all of the characteristics of
wh-movement, with the exception that the wh-form does not occur. Interestingly
enough, earlier stages of English, and some Scandinavian languages, such as
Swedish, allow both the wh-phrase and the overt complementizer to occur, as in
(11):
So far, we have been looking at only one transformation that applies over an
apparently unbounded distance. Another such transformation is the movement
rule of topicalization, which can be seen to be operative in (4) and (5):
(4) John I really like.
(5) John I can't believe the claim that anybody likes.
It is clear that topicalization is a different transformation than wh-movement. For
one thing, topicalization moves the N that is topicalized to a position in the phrase-
marker that is distinct from [Spec, C]. Wh-phrases never follow a complementizer,
but topicalized phrases do, as pointed out in Baltin (1982) (A Landing Site Theory
of Movement Rules, Linguistic Inquiry, Vol. 13, No. 1):
(6) John said that this book, he really likes.
As I had also pointed out, topicalized elements can also follow fronted wh-
phrases, as in (7):
(7) He's a man to whom liberty, we could never grant.
I will adopt the analysis of topicalization in Baltin (1982), in which topicalized
elements adjoin to T. With this in mind, notice that topicalized elements show
the same restriction as the one exemplified by wh-movement in (2) and (3):
(8) *This book I can't believe the claim that anybody likes.
(9) *This book I saw the man who read.
Therefore, Ross argued that the relevant restriction should not be stated as a
condition on particular transformations, but rather as a separate
constraint on all transformations. It was stated as follows:
(10) Complex NP Constraint
No transformation can move an element out of a C that is
contained within an N that has a lexical head noun to a position out of that
N.
We can see how the Complex NP Constraint operates to block, e.g. (2).
The underlying structure of (2) would be (11):
(11) [Phrase-marker not reproduced. Terminal string: (empty [Spec, C]) [+wh C] John Pres believe the claim that Mary Pres like who. The N dominating the claim that Mary like who is circled in the original.]
The circled N counts as a complex NP by Ross's definition, and hence extraction
out of it is impossible.
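The configuration that the Complex NP Constraint rules out can be tested on a tree. The following Python sketch is a hypothetical encoding rather than Ross's formulation: the Node class, the function names, and the simplified trees (with bar levels omitted) are assumptions. A movement is flagged when the gap sits inside a C that is itself inside an N with a lexical head noun, and the landing site lies outside that N.

```python
class Node:
    """A phrase-marker node that records its mother (assumed encoding)."""
    def __init__(self, label, *children, word=None):
        self.label, self.word, self.parent = label, word, None
        self.children = list(children)
        for child in self.children:
            child.parent = self

def ancestors(node):
    """Return the nodes dominating `node`, from its mother up to the root."""
    result = []
    while node.parent is not None:
        node = node.parent
        result.append(node)
    return result

def has_lexical_head(n):
    """An N counts as lexically headed if one of its daughters is an overt N."""
    return n.label == "N" and any(c.label == "N" and c.word for c in n.children)

def violates_cnpc(gap, landing_site):
    above_gap, above_landing = ancestors(gap), ancestors(landing_site)
    for idx, node in enumerate(above_gap):
        if node.label != "C":
            continue
        for n in above_gap[idx + 1:]:    # nodes that contain this C
            if has_lexical_head(n) and not any(n is m for m in above_landing):
                return True
    return False

def build(complex_np):
    """Build a simplified tree; return the wh-phrase and the empty [Spec, C]."""
    who, spec = Node("N", word="who"), Node("N")
    inner = Node("C", Node("C", word="that"),
                 Node("T", Node("N", word="Mary"),
                      Node("V", Node("V", word="like"), who)))
    obj = (Node("N", Node("Det", word="the"), Node("N", word="claim"), inner)
           if complex_np else inner)
    Node("C", spec, Node("C", word="+wh"),
         Node("T", Node("N", word="John"),
              Node("V", Node("V", word="believe"), obj)))
    return who, spec

print(violates_cnpc(*build(complex_np=True)))   # believe the claim that Mary likes who
print(violates_cnpc(*build(complex_np=False)))  # believe that Mary likes who
```

Only the first structure contains an N headed by claim between the gap and the landing site, so only that extraction is flagged.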
There are other island constraints, and we shall now go through them.
A. The Coordinate Structure Constraint
Consider a coordination such as (12):
(12) John gave a book to Bill and Mary gave a magazine to Fred.
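(15) [Phrase-marker not reproduced: I Pres wonder, followed by an embedded C with a [+wh] complementizer whose complement is a coordination of two Ts (T and T) with subjects John and Mary, each containing Past and a V with N and P complements. The lower terminals are missing in the source.]
(22) [Phrase-marker not reproduced: only category labels survive in the source, including the clauses C0, C1, and C2 and the terminals Pres and be.]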
Such relative clauses were analyzed as being left-branching, with the structure
given in (25):
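(25) [Phrase-marker only partly preserved in the source: an N dominating an N and a C.]
(28) [Phrase-marker not reproduced: the matrix clause C0 (Someone ... arrived), with the relative clause C1 (who ... likes Mary) inside the subject N and the relative clause C2 (who Fred likes) contained within C1.]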
The structure for (27) would be derived by extraposing C2 to the end of C1.
In this view, sequences of relative clauses are derived by positing structures in
which the later relative clauses are contained within the earlier ones.
However, (27) has (29) as a variant:
(29) Someone who likes Mary arrived who Fred likes.
If (28) were the correct structure for (27), (29) would have to be derived by
extraposing C2 to the end of C0. However, this application of extraposition
would violate the Right Roof Constraint, otherwise well-motivated. Hence, if
we assume the Right Roof Constraint, we must allow the second relative clause
to be dominated by the matrix clause, rather than the first relative clause. In
short, we must assume the possibility of stacked relative clauses, as in (25),
rather than assuming that the only source for sequences of restrictive relative
clauses is one in which the second relative clause is embedded within the first
one.