The Chomskyan Theory and its Implications for Language Teaching and Learning
2017
PREFACE
This book tries to relate the work done in Chomskyan Theory to the areas of
language teaching and learning. The book is divided into two main parts.
Using the historical approach, the first part traces back the stages, changes
and developments that occurred during the development of the Chomskyan
Theory. The second explains in some detail the possible implications for
language teaching and learning. Although no direct connection can be made
between the Chomskyan Theory and language teaching and learning, this
humble work has shown that there are some important implications that can
be used in these two areas.
The topic of this book is divided into two parts. The first deals with the
Chomskyan Theory, and the second shows how we can benefit from it in
language pedagogy. This piece of work is mainly concerned with the most
important changes that took place during the course of developing the
transformational-generative framework. This particularly applies to those that
took place after the publication of Noam Chomsky's 1957 book Syntactic
Structures. The historical approach is used to trace each change and explain
it in a plain consistent way. Developments and changes are analysed and
traced back to its precursors. The book also shows how these developments
complement each other. Providing a way to benefit from the implications of
the theory in teaching and learning languages is also a main concern in this
book.
The work is divided into five (5) main chapters. Each chapter treats some
aspect of the framework and is subdivided into several sections. The first
chapter, which is divided into six (6) sections, briefly summarizes the history
of the Chomskyan Linguistics and its present state. It states the most
important issues, assumptions, and advancements that occurred during the
course of developing the Chomskyan Theory. Each section deals with a
distinct version of the Chomskyan Theory. Chapter (2) is divided into five (5)
sections. Section (1) identifies the labels used to refer to the Chomskyan
Theory. It discusses the legitimacy of these different labels and provides the
basis to accept or reject them as labels or names for this theory. Section (2)
reviews the fundamental assumptions and claims of the theory and the changes
and developments that occurred in the course of their modification. Section (3)
clarifies the shift from movement transformations to constraints on them in the
post Aspects Model and the replacement of movement transformations by a
single transformation: Affect Alpha. A derivation consists of distinct
representational levels; Section (4) traces the history of these levels and the
changes they underwent. Section (5) shows how trace theory and the Structure
Preservation Constraint (SPC) together ensure that the history of a derivation
can be recovered at any step in the course of a derivation.
As it will be seen in section (3) of chapter (2), the shift in attention in the
Revised Extended Standard Theory (REST henceforth) was from
transformations to constraints on them. These constraints are grouped into
what are often called "modules". Chapter (3), aptly titled 'Modules of the
Grammar', defines these modules and explains each one and its subsequent
modifications in a separate section.
Chapter (5) concludes the book with a summary of the main and most
important points of the Chomskyan Theory.
CHAPTER (1)
THE CHOMSKYAN LINGUISTICS
Past and Present
1.1 Introduction
It is widely known that Chomsky has brought about a revolution in the field of
linguistics. He has postulated a syntactic base called deep structure that
consists of phrase structure rewrite rules and a set of transformations. Those
phrase structure rules generate base or kernel sentences, and
transformational rules then transform them into derived sentences. The
sentences of a language can therefore be generated by applying an obligatory
and optional set of transformational rules to the kernel sentences. Thus, a
derivation involves a sequence
of phrase markers, the first of which is a base and the last is a surface
structure that is equivalent to an actual sentence.
Noam Chomsky has shifted the focus of investigation from the "performance"
(the speaker's actual use of language) to the "competence" (the
subconscious control of a linguistic system). He criticized the empirical
approaches of the previous decades and showed that they are inadequate to
explain the complexities of linguistic structure and that a generative model is
more adequate. He argued that semantic considerations were an integral part
of grammatical analysis and posited a deep structure in his grammatical
analysis.
In his book, Chomsky (1957) also proposed that tense was a separate
element apart from the verb in the underlying structure. A movement
transformation too was designed to derive the construction of 'inversion
questions'. To account for the negation of sentences, moreover, he proposed
an insertion transformation that positions 'not' in its appropriate place. Both
of these transformations intervene and prevent the combination of the verb
with the tense inflection1. For this reason, Chomsky devised a transformation
capable of inserting the dummy do in order to carry tense. Several other
functions of the auxiliary do (e.g. in ellipsis constructions) were analyzed as
instances of tense stranding. This syntactic dissection of the functions of the
auxiliary do, as well as the clear demonstration Chomsky provided in his book,
have convinced many linguists.
1
This process is known as Tense Stranding in linguistic studies.
Unlike the generative semanticists, who claimed that sentences with the same
meaning could be derived from the same underlying structure, Chomsky and
others (Stern, 1996) rejected the idea that sentences with identical deep
structures must be synonymous. They insisted that transformations involved in the
reordering of quantified expressions are capable of changing the scope of
quantifiers.
GB theory represents a great shift in the generative tradition. This shift was
from transformations to constraints on them. These constraints are grouped
together in "modules". These modules are semiautonomous systems that
contain principles and constraints on those principles. Each module applies at
particular points in a derivation. Each one of these modules has its own
universal principles. An output of a derivation is the result of the interaction
between these modules. Transformations, moreover, were reduced to a single
operation, Move α, capable of moving anything anywhere. Other general
principles prevent Move α from overgenerating by filtering out any ill-formed
derivation. GB is also enriched by a number of new empty categories (see
chapter 3). Binding theory, on which GB research concentrated, relates
constraints on movement to anaphor/pronoun-antecedent relations. As a
result of the link between movement and the
binding principles2, a richly interconnected system emerged. For example, a
constituent can only move to a position where it can bind its trace as shown in
(2), otherwise the derivation will be ill-formed.
If the word who has been positioned in a place where it does not bind its
trace, then the question in (2) will be ungrammatical. An important connection
among movement, c-command, and binding theory is the fact that
constituents cannot move rightward because they (i.e. constituents) will not be
able to c-command the traces left behind. Thus, they cannot bind these traces
either.
2
The principles of binding will be discussed in chapter (3).
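The claim that a moved constituent must c-command its trace can be made concrete with a small sketch. The code below is only an illustration, not part of the theory's formal apparatus: the tree shape and node labels are invented for a schematic wh-question of the form "Who did you see t?".

```python
# Sketch of the c-command relation on a toy constituent-structure tree.
# A node c-commands another if it does not dominate it and its first
# branching ancestor dominates it.

class Node:
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []
        self.parent = None
        for child in self.children:
            child.parent = self

    def dominates(self, other):
        """A node dominates every node beneath it in the tree."""
        return any(c is other or c.dominates(other) for c in self.children)

def c_commands(a, b):
    """a c-commands b iff neither dominates the other and the first
    branching node above a dominates b."""
    if a.dominates(b) or b.dominates(a):
        return False
    anc = a.parent
    while anc is not None and len(anc.children) < 2:
        anc = anc.parent
    return anc is not None and anc.dominates(b)

# "Who_i did you see t_i?" schematically: 'who' moves to a position
# from which it c-commands its trace.
trace = Node("t")
vp = Node("VP", [Node("V"), trace])
ip = Node("IP", [Node("NP"), vp])
who = Node("who")
cp = Node("CP", [who, ip])

print(c_commands(who, trace))   # the moved wh-word c-commands its trace
print(c_commands(trace, who))   # the trace does not c-command the wh-word
```

Since binding presupposes c-command, the same check explains why rightward movement fails: a constituent lowered into the tree could no longer c-command, and hence could not bind, its trace.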
that they be minimal: no extra steps in derivations and no extra symbols in
representations are allowed. Second, the theory itself has progressed in the
direction of Minimality. Thus, the collection of different earlier transformations
is substituted by Affect Alpha. The constraints on transformations and
representations also avoid redundancy by not overlapping in a process that
yields the same output.
2. Performance systems to use and access the data (the "external" systems
Articulatory-Perceptual and Conceptual-Intentional interacting with the
cognitive system at two interface levels of PF and LF respectively).
2.1. Overview
The theory under discussion suffers from considerable confusion in its
terminology, because it lacks a consistent label that is acceptable to
everybody. It has been given many different labels. Some call it
Government & Binding (GB), others call it the Principles & Parameters Approach
(P&P or PPA henceforth), and there are also those who prefer to identify it
with rather different labels such as Minimality or the Minimalist Program (MP).
It must be noted that these labels are not synonymous, because each one
refers to a quite distinct version of the theory, and there is no clear-cut
boundary between the stages defined by them. Thus, Alec Marantz (1995) was right in
referring to it as "this latest version of Chomsky's Principles and Parameters
Approach". He implies by this that, at least in his mind, "Minimality" is just a
newer version of "PPA". The list in (1) below shows the different labels used
to identify this theory; all these labels are objectionable for one reason or
another.
(1)
a. 'The framework that is associated with Noam Chomsky and his students at
the Massachusetts Institute of Technology.'
We might assign the name 'Chomskyan Theory' to (a) above, but this would
be unacceptable to many because this theory is a result of cooperation
between many researchers. In fact, this theory is not tied to a single individual
or small group of individuals. While Chomsky was the guide and evaluator of
the new developments, the research in this program is freewheeling and its
proponents frequently disagree among themselves including Chomsky
himself. So what is the most common label used to identify this theory?
One label is 'Standard Theory', which offends many people because it entails
that the "standard" is set by Chomsky and his followers, and whatever
deviates from it is non-standard. Many "Standard Theoreticians" who talk as if the
"Standard Theory" were the only theory available reinforce this attitude.
Furthermore, the label "Standard Theory" refers to the entire history of
syntactic theory that is built by Chomsky and his students over several
decades. In fact, it also includes several fundamental stages, which were
developed at different times and in different ways. This framework began in
the mid-sixties with the publication of Chomsky's 1965 book "Aspects of the
Theory of Syntax". The label "Standard Theory" refers specifically to the
theory presented there. It is also called the "Aspects Model".
Over the fifteen years that followed, the framework was revised to the extent
that its character changed fundamentally. By the early eighties, another
different framework was developed from the "Aspects Model". The publication
of Chomsky's 1979 lectures at Pisa under the title "Lectures on Government
and Binding" presents this framework in an organised coherent form for the
first time.
Unfortunately, the title of the book was given to the framework (Government &
Binding or GB). The Pisa lectures and the book were appropriately titled
because in them Chomsky concentrated on two particular sub-theories,
namely "government and binding"3, but the framework as a whole consists of
many such sub-theories, and besides, "government" and "binding" are not the
3
Government is the relation between a syntactic head and its dependents, but binding refers to the
relationship between a pronoun or anaphor and its antecedent.
most important ones in it. They are just those that Chomsky had more to say
about in 1979. Chomsky himself has expressed his regret for labelling the
entire theory with it, and his preference for the label "Revised, Extended
Standard Theory" (REST). As Steven Schäufele (1999) pointed out in his
Synthinar Lecturettes, this label can be used in referring to the Chomskyan
Theory.
During the second half of the eighties, the label "Principles & Parameters
Approach" (P&P or PPA) was developed among the proponents of the
framework. Recently, a new label has circulated among the proponents of the
framework, the result of work published in the early 1990s: the "Minimalist
Program" (MP).
Finally, one has to know all these labels, because some proponents of the
framework are sensitive about using one label or another. It is also useful
when writing research papers to give all the labels, state explicitly what they
denote and then choose one and use it throughout the research paper.
The term "Constituent Structure" denotes a complex concept that refers to the
way of organising the words and other constituents in a string that involves a
combination of two logical relations, namely "Dominance" and "Precedence".
If a constituent precedes another in linear order, the two are said to be in
precedence relation. "Dominance", by contrast, involves the notion of one
constituent being contained within another. "Tree diagrams" are used to
represent dominance relations. By way of illustration, consider the tree
diagram in (2).
The labels S, NP₁, VP, and NP₂ are called nodes. These nodes represent the
constituents of the string described by the diagram. A node that is linked to a
lower node by a line dominates that node. The S node in (2) for example,
dominates all the other nodes in the tree. NP₁ in turn immediately
dominates Det and N, and VP dominates the inflected verb killed. The nodes
occupied by the words him, the, boy and killed do not dominate anything.
They are called terminal nodes. Nodes that dominate a single node are
described as "non-branching", but nodes that dominate more than one node,
like S and NP₁, are "branching" (Schäufele, 1999; Ouhala, 1999).
There is also another relation called adjacency. Two nodes are adjacent if
there is no third node intervening between them. As we shall see in chapter
(3), the Adjacency relation is important for specific details of the theory.
What is critical now is the claim that the proper analysis of syntactic string
may involve several constituent structures, which share a common skeleton
and the same lexical items. The set of all the constituent structures involved is
called the derivation of that string. Note that a given item may occupy different
positions in different constituent structures of the derivation.
In questions that contain only one Wh-word, that word can move to the front
of the sentence as shown in (4b,c), but when the sentence contains two Wh-
words as shown in (4d) only one can move to the front of the sentence while
the other wh-word remains in its place. The problem is that why cannot what
in (5) move to the front of the sentence. Why is (5) ungrammatical? The
answer is that (4d) is more economical than (5). (4d) is more economical
because the wh-word who is nearer to Spec CP than what. Although the two
constituents can move, the "Shortest Movement" condition permits only the
nearest constituent to the specifier position of CP to move. This generalises to
other constructions showing that some principle of economy holds. Such
constructions include inversion questions in which the first auxiliary has to
move to the front of the sentence as shown in the following example.
Here also the same condition of economy applies to permit only the nearest
first auxiliary to move to the front of the clause.
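The economy condition at work in both cases can be rendered as a toy selection rule. Everything here is illustrative: positions are simple indices (lower = nearer the landing site), and the function name is invented, not part of any formal implementation of the theory.

```python
# A toy rendering of the "Shortest Movement" economy condition: of the
# candidates that could move to the front (Spec CP, or the pre-subject
# position in inversion questions), only the nearest one may do so.

def shortest_movement(candidates):
    """candidates: list of (word, position) pairs, where a lower position
    means nearer the landing site. Returns the only word the economy
    condition allows to move."""
    return min(candidates, key=lambda wp: wp[1])[0]

# In (4d), the subject wh-word 'who' is nearer Spec CP than the object
# 'what', so only 'who' may front; fronting 'what' yields the
# ungrammatical (5).
print(shortest_movement([("who", 1), ("what", 3)]))   # -> who

# The same condition picks out the first auxiliary in inversion questions.
print(shortest_movement([("has", 1), ("been", 2)]))   # -> has
```

The point of the sketch is that a single economy principle, not two construction-specific rules, decides both the wh-question and the inversion cases.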
During the seventies, while the formal language of the Aspects Model
continued to be used, the cutting edge of syntactic research, at least in the
Chomskyan School, was not to define specific transformations, but to identify
constraints on transformations and on the implicit power of the
4
There was a lot of talk in the sixties (and later) about specific transformations such as the Passive
Transformation, There-Insertion, and Dative Shift.
transformational component. It was eventually realised that the formalism for
identifying transformations in the Aspects Model was not on the right track.
There was a huge range of imaginable transformations that could be formally
described but which did not seem to be attested in any known human
language. If the goal of grammatical theory is to explain how human language
works, then the Aspects formalism was missing something.
By the end of the seventies, it became clear that the theory could operate with
a single transformation known as "Move α" 5. This transformation was
understood to mean, "move any constituent anywhere", provided that no
constraints are violated in the operation. At the early stages of REST, it
proved impractical to reduce every motivated transformation to an instance of
"Move α". As a result, an alternative broader transformation known as "Affect
α"6 was proposed. In addition to movement of constituents, "Affect α" has the
ability to rearrange the constituent structure without moving any constituent.
For example, Affect α can insert the dummy do to carry the inflectional
features of the verb in questions and negatives. Many proponents of REST,
however, admit only transformations that are describable as instantiations of
"Move α". These people regard the broader transformation "Affect α" as
evidence that the best analysis has not yet been discovered. In fact, the
notion of "Affect α" is very attractive, but any theory that can dispense with it is
indeed a stronger one.
Up to the seventies at least, the main focus in the grammatical theory was on
"rules", but the attention in REST focussed on general principles of the
grammar. What happens is "Move α", and grammatical theory is concerned
with the general principles that delimit its scope of operation. If the
transformational component can be reduced to a single transformation, "Move α",
why can we not dispense with it altogether? Indeed, there are certain frameworks of
syntactic theory that dispense with movement transformations altogether such
as Generalised Phrase-Structure Grammar (hence forth GPSG) and Lexical-
Functional Grammar (hereafter LFG). But in Standard Theory the essential
part of the point is to represent some generalisations that any syntactic theory
5
α is understood as a variable that can stand for any syntactic constituent
6
Affect α is interpreted as "do anything to anything".
of human languages must represent somehow. Non-transformational
frameworks utilise completely different ways to reach such generalisations.
For instance, in an agentless passive clause like "The door was opened", the
constituent "door" behaves in some respects like the "subject" and in others
like the "object". In Standard Theory, this fact is explained by claiming that it is
both the "subject" and "direct object", but at different levels connected by a
movement transformation. On the other hand, non-transformational
frameworks like GPSG and LFG represent grammatical relations such as the
subjecthood of "door" by constituent-structure tree diagrams, but semantic
relations are indicated in the verb's "argument structure". The link between the
two is shown in the verb's lexical entry.
7
There is a good discussion of this principle in Steven Schäufele's Synthinar Lecturettes.
In the Aspects Model, a derivation could consist of any number of levels. Each
level differs from the preceding one by a movement transformation. It was
believed that certain transformations might have to precede others in order to
achieve the desired result. Thus, in the Aspects Model a sentence like (1)
would be derived from the "deep structure" in (2) by a dozen
transformations. For instance, for "Equi-Deletion" to erase the NP "Sam" in
the lowest clause, the lowest clause would have had to be passivized so as to
get "Sam" into the subject position from which it can be deleted.
Note that instead of the sign e, "∆" was used to identify the empty positions in
earlier versions of the theory.
With the invention of "Move α", the motivation behind these assumptions fell
by the wayside. It became clear that there was no need to impose an order on
the application of transformations. As a consequence of the developments
during the seventies, the "deep structure" in (3) replaced the one in (2)
above as the structure underlying the sentence in (1).
The lexicon lists the lexical items and their properties that make up the atomic
units of the syntax. These properties include, for example, what sort of object
the lexical verb requires, etc. "DS" means "deep structure"; "PF" and "LF"
stand for "phonological form" and "logical form" respectively. "SS" stands
for "surface structure" (S-Structure). "SS" is central and connected
directly to all the other levels. PF and LF are called the interface levels. PF is
the interface with the phonology where phonological rules apply to give the
string its phonological manifestation. It is similar to the surface structure of the
derivation in terms of the Aspects Model. LF is the interface with the
semantics where meaning relationships of various kinds are explicitly
represented. DS is the interface level with the lexicon where lexical items are
combined. In REST the DS represents the base-generated form of a string.
Some transformational operations process the DS-representation to satisfy
certain constraints of the grammar, and the result is the SS-representation.
SS is not directly interpreted itself, but is converted into PF and LF. "Move α"
operates between any two levels and nothing important can be said about the
order of its operations. The difference in its character is due to the fact that
different constraints operate at different levels. For example, the "Theta
Criterion" is relevant at DS, the "Case Filter" at SS and PF, and the "Empty
Category Principle" primarily at LF8.
The proponents of the Aspects Model claimed that the semantics of language
had to be encoded in the Deep Structure that is the first level in the derivation
of any sentence. They hypothesised that a speaker/writer generates a deep
structure that represents the intended meaning, then performs certain
operations on that deep structure to produce the final surface sentence that
he pronounces or writes. The listener/reader on the other hand, receives the
surface sentence and applies the reverse operations to decode the abstract
deep structure and interpret it. During the period between the sixties and the
seventies linguists recognised that it was not plausible to relate all the
semantics of language to only one derivational level. The evidence came from
the ambiguity of sentences whose surface structure could have two possible
but distinct meanings. Some ambiguities were processed properly within the
Aspects Model. For example, the sentence in (4) has two possible meanings.
Each meaning can be derived from a different deep structure as shown in (5).
8
These constraints will be explained in the next chapter.
from the ranking of quantifiers. By way of illustration consider the sentence in
(6).
Now what about a sentence like (9)? It has either of the meanings in (10).
(10) a. There exists some person y such that for every person x, x loves y.
b. For every person x, there exists some person y such that x loves y.
The statement (10b) is true if we can find pairs like this for every single human
being. On the contrary, the meaning in (10a) says that there is some special
human being (call it Ala) who is loved by everybody: Faraj loves Ala,
Mohamed loves Ala, Basma loves Ala, etc.
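The two readings in (10) can be checked mechanically against a toy model. Here I assume that (9) is a sentence of the form "Everybody loves somebody"; the people and the 'loves' relation below are invented for illustration, following the text's Ala example.

```python
# The two scope readings in (10), checked against a toy 'loves' relation.

people = {"Faraj", "Mohamed", "Basma", "Ala"}
loves = {("Faraj", "Ala"), ("Mohamed", "Ala"),
         ("Basma", "Ala"), ("Ala", "Ala")}

# (10a): there exists some y such that every x loves y
# (wide scope for the existential quantifier).
reading_a = any(all((x, y) in loves for x in people) for y in people)

# (10b): for every x there exists some y such that x loves y
# (wide scope for the universal quantifier).
reading_b = all(any((x, y) in loves for y in people) for x in people)

print(reading_a)  # True here: everyone loves Ala
print(reading_b)  # True here: (10a) entails (10b), but not vice versa
```

Note the asymmetry the sketch exposes: whenever (10a) is true, (10b) is automatically true as well, but a model where each person loves a different person would make (10b) true while (10a) fails. This is exactly why the two scope orders are distinct meanings.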
The importance of saying all this is that in REST two levels of representation
are involved in the interpretation of sentences. At DS, thematic relations must
be represented. For instance, the verification of what a verb or any other
constituent subcategorizes for ought to be done at DS. LF, however,
represents scope relations, while DS is concerned only with lexical
semantics.
9
A 'trace' is typically represented by a lower case 't' for 'trace' or 'e' for 'empty'.
10
The letter 'i' is used for an index; if more than one is needed, the letters after 'i' in the
alphabet are used.
based on this typology. As shown in the list below, his typology classifies the various
kinds of transformations into three types.
This kind of transformation applies precisely to two adjacent constituents. It is
subject only to conditions within those two constituents. The constituents to which
this kind applies need not be sisters, but merely adjacent in linear order. There
must be a c-command relationship between the affected constituents, and at least
one of them must not be a maximal projection.
The difference in the Aspects Model was that the PP already exists at Deep
Structure, but without a complement. The Deep Structure of (1) then would be as in (3).
In the Aspects analysis, the PP in the passive sentence (1) is already there,
not created by the Passive Transformation. In this way, the Aspects' Passive
Transformation is said to be "structure-preserving". Since any embedded
clause can be passivized, the Passive Transformation cannot be considered a
Root Transformation. It also involves two NPs that are not adjacent, and so it
cannot be a Local Transformation either. Thus, according to the SPC it must
be a Structure-Preserving one. Assuming that the subject NP "Ali" has moved
into the PP in (3), the result will be a structure like that in (4). The moved NP
"Ali" leaves a trace in the subject position coindexed by the letter "i" with the
former subject NP "Ali". But since traces cannot be erased, it is impossible for
the direct object NP "the ball" to move into this position. To be licensed as a
new subject, changes must occur in the constituent structure, which
unfortunately violates the SPC.
In REST, as will become clear in the next chapter, this analysis is supported
by considerations of Theta and Case Theory. In fact, there is an exception to the
definition of a "Structure-Preserving" Transformation, namely what is called
"Adjunction". Adjunction is a recursive process that integrates adjuncts
(optional modifiers) into the syntactic structure. It targets the X1 projection
as illustrated by the box in (6a).
11
The question of the passive auxiliary is left out because it is irrelevant to the discussion.
Then it makes a copy of the target node right above the original one, as in
(6b). Finally it attaches the adjunct phrase as a daughter of the newly created
node as in (6c). Adjunction seems to create a new structure as it creates a
new node and therefore it cannot be called "structure-preserving". But since
adjunction is a recursive process and because the new V' node immediately
dominates the original V' node, it is considered a "structure-preserving"
transformation. This understanding is explicitly clarified in some works
published in the mid-eighties (see for example Chomsky, 1986b). But what
about "Local Transformations"? Steven Schäufele (1999) argued that they only
occur between SS and PF (e.g. subject-AUX inversion) as a consequence of
the theoretical desirability of restraining the power of Move α.
At the end of this section, it is worth mentioning that trace theory and the SPC
together ensure that the history of a derivation can always be recovered. As
will become clear in the next chapter, the framework has some intricate
complications that might obscure the derivational history, but they are
comparatively few. The result is that even at LF we can reconstruct the base
DS representation of a string. In fact, this is crucial to the operation of the
framework; moreover, it is attractive to many theorists.
CHAPTER (3)
THE MODULES OF THE GRAMMAR
3.1. Overview
As we have seen in chapter (2), the shift in attention in the Chomskyan
Theory was from transformations to constraints on them. These constraints
are grouped into what are often called "modules". These modules are semi-
autonomous systems consisting of basic principles and constraints on them.
Each module is relevant at certain levels of a derivation. The derivation of a
grammatical string involves the interaction of these different modules. In order
to be considered grammatical, a string has to be approved by all the modules.
Note that Chomsky here takes the grammar to be both autonomous and
"internally highly modularised, with separate subsystems of principles
governing sound, meaning, structure and interpretation of linguistic
expression . . ." (Chomsky, 1991). He maintains that linguistic theory is
concerned with a specific mental faculty operating in the brain, not external
phenomena such as linguistic behaviour. This is reflected in his 'competence-performance'
dichotomy.
It has been argued that grammar is autonomous and modular. This is
supported by the fact that people may damage one of their brain faculties in an
accident but still be able to use the others efficiently. For example, a
person whose brain was injured in an accident might have lost the ability to
speak but nevertheless he can perform well in solving very complicated
mathematical operations or in using aerodynamics to design a high-tech
aircraft. This fact led to the conclusion that the human brain consists of
independent (autonomous) faculties. Therefore the human brain has a
modular structure. Although the mind includes distinct modules responsible
for different abilities, using language requires the interaction of these
independent faculties.
The language faculty inside the brain is also modular in the sense that it is
highly structured (see McGilvary, 1999: 3-4, and Smith, 1999: 7-21). It
consists of separate subfaculties responsible for language acquisition,
production, and comprehension. In this way the structure of language faculty
is similar to that of the brain, which consists of various faculties responsible
for different senses such as vision, smell, hearing and touch. Although
these faculties are separate components of the brain, there is no reason to
assume that they do not interact. Likewise, the separate sub-faculties
responsible for language interact to yield the expected effect (ibid.).
(1) Ẋ (X-Bar) Theory is the theory of phrase structure. It identifies the shared
characteristics in the internal structure of the different kinds of phrases and
the relations between them. It applies primarily at DS.
(4) Case Theory is "concerned with the assignment of abstract case on the
basis of relations of government" (ibid.: 47-48). It is relevant at SS and PF. It
decides whether a given NP is in a legitimate slot in constituent structure or
not.
3.2. Ẋ Theory
3.2.1. Introduction
Ẋ Theory was first introduced with the publication of Noam Chomsky's paper
"Remarks on Nominalisation" in 1970. In this paper Chomsky identified nodes
like NP, VP, etc. with sets of feature specifications common to various nodes.
Thus, a lexical noun like "book" and a complex NP like "the last book that
Faraj wrote several months ago" share similar "nominality" features. Both are
referential and they can be inserted in similar positions. Similarly, a lexical
verb such as "speak" has certain qualities in common with VPs like "speak the
speech trippingly upon the tongue".
Conversely, lexical nouns and verbs like "book" and "speak" have some
features that other NPs and VPs do not share with them, and vice versa. The
node labels NP and VP are therefore decomposed into two parts: a "Category
Type" (N or V) and a "Projection Level". The projection level is usually
referred to as the bar level, and it has a numerical value. Lexical items,
however, have a zero bar level. Higher bar levels are identified by one or
more horizontal bars over the category label, by primes after it (N', V''), or by
a numeral written after the category label (N1, V2, etc.)12.
In the seventies there was a considerable debate on the number of bar levels.
Although Chomsky thought that two levels above the lexical are enough,
Peggy Speas (1990) asserted that there is no good reason to have more than
one bar level. Whatever the number of bar levels, a node that is given the
maximal value is called a "maximal projection". Therefore, nodes like NP and
VP are called "maximal projections".
12
I will use this technique to refer to specific bar levels.
3.2.2. Universal Base and Functional Heads
The basic assumption of X-bar Theory as introduced in the seventies and
eighties was the generic pattern given to the internal structure of any maximal
projection "XP". A maximal projection "XP" has a lexical head X0 and other
dependent constituents as its daughters. These dependents are classified into
three kinds: specifiers, complements, and adjuncts. All these dependents bear
different relations to the head. The difference between specifiers and
complements lies in the distinction between bar-1 and bar-2 projections. A
maximal projection XP (i.e. bar-2) has two daughters: a head X1 (i.e. bar-1)
and a specifier. The projection X1 dominates the lexical head X0 and its
complements. Both complements and adjuncts differ only in that complements
are sisters of the head whereas adjuncts are sisters of the X1 projection.
Adjuncts are similar to specifiers because the two are sisters of the X1
projection. But specifiers differ from adjuncts because they are daughters of
the maximal projection whereas adjuncts are daughters of the X1 projection.
The table given below summarizes the different syntactic relations of the three
dependent constituents and their relations with the head.
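The geometry just described can also be sketched programmatically. The following Python fragment is an illustrative sketch only: the `Node` class, the `xp` helper, and the example phrase are my own assumptions, not part of the theory, but the tree it builds follows the bar-2 schema in the text (specifier and X1 are daughters of XP; the head X0 sits under X1 with its complements).

```python
# A minimal sketch of the X-bar schema: [XP Spec [X1 X0 Compl*]].
# Node labels and the example phrase are illustrative, not the book's notation.

class Node:
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []

def xp(category, specifier=None, head=None, complements=None):
    """Build an XP: specifier and X1 are daughters of XP; X0 and its
    complements are daughters of X1."""
    x1 = Node(category + "1",
              [Node(category + "0", [Node(head)])] + list(complements or []))
    daughters = ([specifier] if specifier else []) + [x1]
    return Node(category + "P", daughters)

# "the book": a Det specifier, an N0 head, no complement
np = xp("N", specifier=Node("Det", [Node("the")]), head="book")
print(np.label)                        # NP
print([d.label for d in np.children])  # ['Det', 'N1']
```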
Specifiers are usually articles, determiners and degree adverbs such as "very" and "too" in
adjective phrases (APs) like "very difficult" or "too hard". Complements, however, are
the constituents for which a verb or a preposition subcategorizes13. Before 1980
complements were considered to be maximal projections but specifiers were either
maximal projections (e.g. genitive NPs modifying larger NPs, as in "[ [Lockerbie's]
13
We will discuss this issue in the section on Theta Theory.
case]") or X0 lexical heads (e.g. "the" in "the case") or morphological elements such
as the prefix marking definiteness in Arabic. In the late 1980s some linguists tried to
re-evaluate this controversial issue. Due to the work done by Steven Abney and
others (see S. Abney, 1987), specifiers are now regarded as being maximal
projections. Abney's work also redefined "noun phrases" as "determiner phrases"
(DPs). The head in a phrase like "the government" is the article "the" and
"government" is just its complement. The reason behind mentioning Abney's
argument here is due to the fact that some linguists talk about "DPs" instead of
"NPs".
All category types are supposed to have the same projection structure discussed
above. The constraints of Theta and Case theories permit different categories to take
distinct kinds of complements. But what about specifiers? If Abney's argument is not
right, then the specifiers of NPs are very clear, like those italicised in (1).
(1) a. a book
b. the book
c. some teachers
PPs and APs also have adverbial modifiers as specifiers like those in (2) and (3).
VPs too have adverbial modifiers as in (4), but unlike the case with APs and PPs
they are not called specifiers; the adverbial modifiers of VPs are called adjuncts.
Adjuncts are optional constituents integrated into the syntactic structure by a recursive
process called Adjunction. Adjunction targets the X1 projection, as indicated by the
box in (4a). Then it makes a copy of the target node right above the original one, as in
(4b). Finally, it attaches the adjunct phrase as a daughter of the newly created node, as
in (4c).
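The three steps of Adjunction can be sketched as a single tree operation. This is an illustrative sketch, assuming the same toy `Node` representation as before; the function name `adjoin` and the example constituents are mine:

```python
# A sketch of Adjunction: copy the targeted X1 node, place the copy above the
# original, and attach the adjunct as the new node's other daughter.

class Node:
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []

def adjoin(target_x1, adjunct):
    """Return a new node with the target's label dominating both the
    original X1 and the adjunct, leaving the original subtree intact."""
    return Node(target_x1.label, [target_x1, adjunct])

v1 = Node("V1", [Node("V0", [Node("walked")])])
pp = Node("PP", [Node("to the shop")])
new_v1 = adjoin(v1, pp)          # recursive: adjoin(new_v1, ...) also works
print(new_v1.label)                        # V1
print([d.label for d in new_v1.children])  # ['V1', 'PP']
```

Because `adjoin` returns another X1, the process can apply again to its own output, which is what makes Adjunction recursive.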
(5) a. We know that democracy means popular rule not popular expression.
In 1986 Chomsky applied the X-bar structure of lexical heads to "INFL" and
"COMP" (Chomsky, 1986b). Therefore, both "INFL" and "COMP" were supposed
15
See for example the Aspects Model's "affix-hopping", but note that it is not used nowadays. In the
minimalist framework a word is supposed to carry its features when it enters derivations. The
function of a head as INFL is not to supply the inflectional morphemes but rather to justify them on
the lexical items via checking. This implies that a verb is already inflected with tense/agreement
markers when it merges with another syntactic object, say, an object DP. Then it moves to INFL either
at SS or at LF so as to check its inflections against the abstract features found there.
to license full projections. "INFL" subcategorizes for VP as its complement and
the surface position of the subject as its specifier. "IP", the maximal projection
of "INFL", is what previously was called "S". "COMP", however, has an "IP"
complement. "COMP" or "C" is the place for complementizers as in (5-6), and
its specifier position is the landing-site of WH-movement as in (7). "CP" is the
maximal projection for "COMP". In the 70's "CP" was referred to as "S′" (S-bar). There
has been considerable dispute over whether every "IP" must be
a complement of a "CP" or not. In practice, the presence of a "CP" node is
assumed whenever it is required.
By the end of the 1980s "INFL" was considered to be overly simplified. Today
there is a functional node for every verbal inflection. So we have
categories for Tense (TNS), and Aspect (ASP) as well as "AGR" for
agreement features. It is very common to differentiate between "AGRs" and
"AGRo" for subject agreement and object agreement respectively. Steven
Schäufele (1999) argued that Mood could have its own node too. All these
functional heads occur in the position that was first occupied by "INFL". Thus,
instead of working within the structure Chomsky postulated in 1986,
represented here in (8a), we have to work within the structure in (8b). These
functional heads need not always be ordered as in (8b).
Their order depends on the language in question: for instance, TNSP may be
the complement of AGRs, or vice versa. The strength-weakness distinction (see
Chapters 16, 17 and 19 in Ouhala, 1999) assumed in dealing with the
structure of languages plays a central role in ordering the functional
categories, particularly TnsP and AgrP. (8b) therefore is just a hypothetical
structure that can be altered (rearranged) according to the strength or
weakness of the functional categories peculiar to the language in question. It
is clear from this structure that everything above "VP" at "DS" is functional
without any phonological manifestation16. "VP" dominates the lexical clause.
Hence the "Lexical Clause Hypothesis" which differentiates between the
"lexical clause" (i.e. VP) and the "functional clause" (i.e. the projection above
VP).
16
Here we ignored the assumption that there is a NEGP between VP and AGRS, which contains clausal
negators such as "not" in English. COMP is also ignored because it usually dominates lexical items as
in (5).
The hypothesis that increased the number of functional categories is
sometimes called the "Split Infl Hypothesis", and Steven Schäufele (1999) has
at least once referred to it as the "Exploded Infl Hypothesis". In this
hypothesis the node TNS has to be linked to Tense features, ASP with
aspectual features, and everything related to agreement has to be correlated
with the appropriate AGR nodes. These assumptions are by no means arbitrary
stipulations. Firstly, it has long been assumed that tense and agreement
features are associated with "AUX/INFL", a node separate from the verb in
pre-GB theory. This is contrary to a lexicalist account of tense/agreement
features (e.g. Chomsky, 1995) in which these features have been affixed to a
word when it enters syntactic derivations.
The objective of all this was to get rid of all the phrase-structure rules of the
pre-1980 generative theories. It was assumed that in the description of any
language, its grammar should only specify the order of heads, specifiers, and
complements. Other details are left to either UG or independent constraints.
From the discussion above we see why this framework is sometimes referred
to as the "Principles and Parameters Approach". Because X-bar Theory is a
crucial aspect of UG, it must be innate. But whether heads precede
complements or follow them is dependent on the language or the category
type in question. It is variable from language to language. Therefore, it is a
parameter of UG that permits variation in languages and so the language
learner has to acquire it from the data available to him. To borrow an analogy
from computer science, imagine the internal grammar to be something like
the Windows operating system, which offers several language choices when you
start its setup program. If you choose Arabic, the Windows interface will be in
Arabic, but if you choose English, you will have an English
interface. The internal code of the Windows program represents
UG, and the language choices represent the parameters. Just as the goal of
Microsoft is to enable its Windows program to work with interfaces
in all languages, the main goal of research in PPA has been to identify and
minimize the number of parameters while preserving descriptive adequacy over
the many attested human languages.
3.3. Ө Theory
3.3.1. Introduction
In our discussion of X-bar theory we said that DS represents thematic
relations. In REST, thematic roles are assigned to nominal constituents by
verbs and other constituents which have the ability to license them (ibid.).
b. He likes Pepsi.
'Ө-grid' is a term used to talk about the list of Ө-roles which a given verb can
assign17. The nominal assigned one of those Ө-roles is then considered to fill
the slot in the verb's Ө-grid (Schäufele, 1999). The Ө-grid of a given verb is
saturated whenever the roles are filled. The assignment process of Ө-roles to
the suitable constituents is called "saturation". Saturation is a semantic
condition imposed on linguistic expressions. Prepositions assign a single Ө-
17
The term "Ө-grid" is a synonym for the terms "valency" and "subcategorisation frame".
role within the PP headed by that preposition. NP specifiers and
complements receive their Ө-roles inside the NP that contains them. Every
VP includes all the arguments to which its head verb assigns Ө-roles. An
exceptional case in VPs is their specifier, but if we take into account the
Internal Subject Hypothesis, then even VP specifiers must receive their Ө-
roles from the head verb, at least at DS which is the working domain of Ө-
Theory. Another way of assigning the external Ө-role to the specifier of VP is
through the V-bar projection. The verb together with its complements is said
to assign an external Ө-role to the specifier of VP (ibid.).
Here adjuncts enter the scene. The verb "walked" is the head and because it
assigns only a single Ө-role, it must have only a single argument 18. An adjunct
is a constituent that is neither a head nor an argument of the head. The string
"to the shop" in (2) is an adjunct. It is in the form of a PP. The head of that PP is
"to" which also has a Ө-grid and can assign a Ө-role to its complement ('Goal'
or 'Path'). Thus, the NP "shop" does not receive its Ө-role from the verb
"walk", but it is the preposition that assigns it. Therefore, sentence (2)
contains two NPs and two Ө-roles, but each is assigned by a different lexical
head (Schäufele, 1999).
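The division of labour between heads just described, with each lexical head assigning Ө-roles only to its own arguments, can be sketched as a saturation check. This is an illustrative sketch: the miniature lexicon of Ө-grids below is my own assumption, not an exhaustive or authoritative one:

```python
# A sketch of Ө-grids and saturation: each head lists the Ө-roles it assigns,
# and its grid is saturated when every role is matched with an argument.
# The grids below are illustrative assumptions, not the book's lexicon.

THETA_GRIDS = {
    "walk": ["Agent"],
    "like": ["Experiencer", "Theme"],
    "give": ["Agent", "Theme", "Goal"],
    "to":   ["Goal"],
}

def saturated(head, arguments):
    """A head's Ө-grid is saturated when each role slot is filled."""
    return len(arguments) == len(THETA_GRIDS[head])

# "Hani walked": one argument fills the single role of "walk"
print(saturated("walk", ["Hani"]))              # True
# "to the shop": the preposition assigns its own role to its complement
print(saturated("to", ["the shop"]))            # True
# "*Hani gave the book": the Goal slot is unfilled, grid not saturated
print(saturated("give", ["Hani", "the book"]))  # False
```

In "Hani walked to the shop", then, "walk" and "to" each saturate their own grid, which is why the sentence contains two Ө-roles assigned by two different heads.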
As demonstrated in (1c) above, the verb "give" assigns three Ө-roles to three
different NPs. But as shown in (3) one of these NPs is dominated by a PP. So
does "Basma", like "shop" in (2), get its Ө-role from the preposition "to" or from
the verb "gave", as shown in (1c)?
In short, "Basma" still receives its Ө-roles from the verb "gave", and the
preposition "to" in (3) is merely a dummy case-marker inserted to satisfy the
18
"Argument" here means an NP filling a slot in the head's Ө-grid.
conditions of Case Theory19. It is also assumed that "to" mediates the
assignment of the Goal Ө-role by the verb to its indirect object. From (1c) and
(3) we see that there are two ways in English for assigning three Ө-roles by
the same head. In fact, it is logical to say that a verb like "sell" assigns four Ө-
roles, Agent (the seller), Recipient (the buyer), Theme (the thing sold), and
the price (Path) (ibid.). As shown in (4), there is no option for the last
argument but to be realised as a PP, though its Ө-role is assigned by the
verb.
The sentences above showed that "sell" assigns four Ө-roles. Now consider
the example in (5) where "sell" assigns only three Ө-roles. Does this violate
the Ө-Criterion?
There are two ways to solve this problem. The first says that the verb's Ө-
roles exist in its Ө-grid, but one of them is assigned to a null NP 20. The second
says that some verbs have optional arguments that can be left out. This
approach clarifies whether a verb is obligatorily transitive like "devour" in (6) or
optionally transitive like "eat" in (7). It is worth mentioning that verbs that
are optionally transitive have some generic argument when the argument is
not stated explicitly (i.e. lexicalized). For example, the verb eat in (7) has the
generic argument "food".
b. *He devoured.
19
Case Theory will be discussed in a separate section.
20
As will be discussed in the section on Case Theory, a condition called the "Case Filter" says that if
the NP has no phonological manifestation, it does not need Case and hence does not need a dummy
preposition to Case-mark it.
Essentially, the Ө-criterion holds at DS where Ө-relations are determined, but
by virtue of the Projection Principle it is expected to hold also at the level of S-
structure and LF. It does not apply at PF because its derivation from SS
essentially involves the elimination of traces and other non-phonological
elements. The Projection Principle (PP) requires that subcategorized
categories be present at all syntactic levels, but says nothing about non-
subcategorized categories such as subjects. The Extended Projection
Principle (EPP) extends this requirement to subjects (Ouhala, 1994). Verbs
like "rain" in (8) do not assign a Ө-role at all and they cannot even license a
subject position. This would seem to violate the Ө-Criterion. In such cases a
dummy NP "it" is inserted in the subject position and is licensed by the
Extended Projection Principle.
(8) It is raining.
It should be noted that although all A'-positions are Ө'-positions, not all A-
positions are also Ө-positions. As we saw above, the subject
position (Spec, IP) is an A-position, but the subject positions of clauses with
raising verbs are Ө'-positions even though they are A-positions (Ouhala,
1999: 161-162).
None of the words in the genitive NP "the boy's" c-commands the N1 "blue
bike", and none of the words in the N1 "blue bike" c-commands the genitive
NP. The words "blue" and "bike" however, do m-command the genitive NP
asymmetrically, because the first maximal projection at the top dominates
both "blue bike" and the genitive NP.
Definition of Government:
a. A must c-command B.
b. There must be no barrier between A and B.
The last condition is referred to as the Minimality Condition. It ensures that the
proper governor must be the closest to B in this case.
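The dominance relations discussed in the "blue bike" example can be checked mechanically. The sketch below is illustrative only: it assumes the standard definitions (A c-commands B iff neither dominates the other and the first branching node dominating A also dominates B; m-command substitutes "maximal projection" for "branching node"), and the toy tree and class names are mine:

```python
# A sketch of c-command vs m-command over the toy tree
# [NP [NP the boy's] [N1 [A blue] [N bike]]].

class Node:
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []
        for c in self.children:
            c.parent = self
        self.parent = None

def dominates(a, b):
    return any(c is b or dominates(c, b) for c in a.children)

def first_ancestor(node, pred):
    p = node.parent
    while p is not None and not pred(p):
        p = p.parent
    return p

def c_commands(a, b):
    if dominates(a, b) or dominates(b, a):
        return False
    anc = first_ancestor(a, lambda n: len(n.children) > 1)  # branching node
    return anc is not None and dominates(anc, b)

def m_commands(a, b):
    if dominates(a, b) or dominates(b, a):
        return False
    anc = first_ancestor(a, lambda n: n.label.endswith("P"))  # maximal proj.
    return anc is not None and dominates(anc, b)

gen = Node("NP", [Node("the boy's")])
blue, bike = Node("A", [Node("blue")]), Node("N", [Node("bike")])
n1 = Node("N1", [blue, bike])
np = Node("NP", [gen, n1])
print(c_commands(blue, gen))  # False: N1 intervenes above "blue"
print(m_commands(blue, gen))  # True: the top NP dominates both
```

The output reproduces the observation in the text: "blue" fails to c-command the genitive NP but does m-command it.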
21
We will discuss this issue in the next subsection.
positions to subject positions in passive constructions. In an active transitive
clause as in (1a), the verb assigns two Ө-roles to two distinct NPs at DS. It
also assigns accusative Case to its object NP. The subject NP gets its Case
from the head of S/IP. But as shown in the passive counterpart in (3), the verb
assigns only a single Ө-role to its internal argument at DS and the subject
position is empty as in (3a).
Another type of verb that does not assign accusative Case to its internal
argument is the class of Unaccusatives. Like passives, they assign no
external Ө-role and no accusative Case to their internal argument. But unlike
passives, unaccusatives do not manifest a morpheme that can be said to
absorb the (accusative) Case. Among others, the following verbs are all
unaccusatives: "break", "die", and "open". The inability of both passives and
unaccusatives to assign external Ө-roles and accusative Case led to the
generalisation that if a verb does not assign an external Ө-role to the subject
position, it will probably be unable to assign accusative Case to its internal
argument too and vice versa (see Ouhala, 1999: 173-175 and 212-213).
In fact not only Case is assigned to chains but also Ө-roles. Every chain can
have a single Ө-role and a Case. Thus, each chain must contain only one Ө-
position and one Case-position. The position that is c-commanded by all the
other positions is the typical Ө-position, and the one that c-commands all the
other positions is typically the Case-position.
b. Remember you
c. To you
d. With you
e. Think of you
b. Ergative Marking:
There are two kinds of Case assignment, Structural and Inherent. Structural
Case is assigned to an NP by its governor. For a governor to assign Inherent
Case, it must also assign a Ө-role to the NP in question. The difference
between structural and inherent case is that structural case is determined at
S-structure and does not necessarily involve a thematic relation between the
assigner and the assignee (see Ouhala, 1999: 218 – 220, 395 – 397).
Nominative and Accusative Cases are both structural Cases. Genitive Case
is also a structural case since it is assigned via Spec-Head Agreement with D,
which does not bear any thematic relation to it. A Case which is determined at
D-structure and involves a thematic relation between the assigner and the
assignee is called inherent Case. Oblique Case assigned by prepositions to
their complements qualifies as inherent Case.
The verb "explain" in the example above requires an NP to which to assign Case,
but the NP in question has moved to the front of the sentence (i.e.
topicalisation). The moved NP "this condition" establishes a chain with the
trace left behind by movement. This chain works as a medium through which
the moved NP transmits its properties to the trace. The trace receives case
from the verb and transmits it back to the moved NP. Because the trace is
adjacent to the verb and has the very same properties as the moved NP, the
adjacency condition is satisfied in this way.
Laziness
Procrastination
The principle of Greed prevents the Case Filter from motivating movement
because if the subject NP for example, is base-generated under Spec-VP with
its Nominative Case already brought from the lexicon, why should it move to
Spec-S/IP? Here licensing plays a very crucial role. The NP's base position
licenses its Ө-role but not necessarily its Case. If its Case is not licensed in its
base position, then it has to move to the Spec position of the appropriate
functional head where its Case can be licensed. In short, the abstract Case
features that an NP bears must be eliminated before the derivation is
complete. This is due to a principle known as Full Interpretation (FI) which
requires the elimination of unnecessary symbols that play no role in the
interpretation of expressions (Chomsky, 1997).
In the eighties, the Case Filter was held to apply at PF. Now it is assumed that
the abstract Case features must be eliminated before deriving PF by means of
movement to the Spec position of the proper functional head. If these were
visible at LF, then the NP in question must move to the proper Spec position
at LF. If the abstract Case features were visible at both PF and LF then their
elimination must be at SS22. Thus, we have three options upon which
languages differ (i.e. we have a parameter with three options).
These two NPs, and the like, get their Cases from their governing
prepositions in the older version of Chomskyan theory. In minimalist accounts,
these NPs must check and eliminate their Case features in the Spec position
of some functional head. In fact, these Case features can be eliminated
through head-complement checking, or by positing some functional head in
which these Case features can be erased.
3.6.1 Introduction
In this section we are going to look at the relations of the binding module in
REST. The notion of 'Binding' is a syntactic one. It is similar to the notion of
'coreference', but not identical, because 'coreference' and 'reference' are
semantic notions. If a linguistic expression refers to something, an entity, an
element or a condition in the physical world, then that linguistic expression is
said to be referential. And if two or more linguistic expressions refer to the
same thing in the real world, then they are considered coreferent. To illustrate
this, take a look at the passage in (1):
At that time, Mo'amar Alqaddafi was a very busy man. The Revolutionary
beloved leader had been working hard pointing out ideological and procedural
differences between the traditional ways of governing and the authority of the
people. The Engineer of the Industrial River from Sirte had also been touring
the whole country, making appearances most notably in Sebha, the site of the
first sign of the great Elfateh Revolution.
In (1) above, the underlined expressions all refer to the same thing and are
therefore coreferential. Each of these expressions occurs in a separate
sentence and there is no syntactic relationship between them. The province of
Binding Theory is the kind of coreferentiality represented in (2) where two
coreferential constituents inhabit the same clause.
This is called Big PRO. This interesting category will be discussed at some
length in the next subsection.
23
The Principles of Binding are also referred to as Principle A, B, and C.
c. Basma painted her blue.
a. [+Anaphor, -Pronominal]
This empty anaphor is called a 'trace' and is usually represented by the small
letter 't'24. As explained earlier, movement of constituents in REST must leave
a trace behind. This trace is an empty anaphor even if the moved constituent
was a verb and that is why the label NP-trace is avoided nowadays.
Accordingly, traces are subject to Principle A which restricts the distance to
which an antecedent can move. Thus, Principle A constitutes a constraint on
Move α. But successive movement in which a constituent climbs the tree
gradually to meet the binding requirement can avoid the violation of this
24
This is the same as the seventies' NP-trace that was used to distinguish it from the wh-trace, which
will be discussed below.
constraint. This movement is illustrated in the tree diagram in (1). The
constituent in question is x and each trace is positioned inside its governing
category. It is worth noting here that it is possible for an anaphor to be the
antecedent of another.
b. [-Anaphor, +Pronominal]
The empty pronominal is called 'small pro' or 'little pro' and is represented in
text and tree diagrams by 'pro' written in small letters. It functions as a
personal pronoun and it replaces a fully referential NP, but it lacks any
phonological manifestation. It occupies the subject position of imperatives as
in (2) and the subject position in the languages in which overt subject
pronouns are optional in all kinds of clauses as in (3).
c. [-Anaphor, -Pronominal]
It is generally agreed that each chain has only one Ө-position and only one
Case-position. Neither is an Ā-position, and therefore, the antecedent of a
variable cannot inhabit either. The variable itself occupies one of them.
Binding Theory says that if a 'trace' is in Case-marked position, then it is a
variable (Schäufele, 1999). Being neither anaphor nor pronominal, variables
are subject to Principle C and therefore completely free to refer. This does not
mean that they are identical to R-expressions. Variables must not be
syntactically bound but, nevertheless, they must have an antecedent
somewhere. They ought to be semantically bound by an operator. Thus, they
must have an antecedent. To prove this, remove the word 'who' from (4) and
the clause becomes unacceptable. This is not because a variable has lost its
antecedent; rather, with no operator present the empty category is not a
variable at all but 'pro', and English does not permit pro in subject positions
in normal declarative clauses.
d. [+Anaphor, +Pronominal]
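The four feature combinations discussed in (a)-(d) form a small typology, which can be laid out as a lookup table. The sketch below is illustrative, assuming the standard GB associations between each feature pair, its category, and its binding condition; the table is my summary, not the book's:

```python
# A sketch of the [±Anaphor, ±Pronominal] typology of (empty) categories
# and the binding condition standardly associated with each in GB theory.

TYPOLOGY = {
    (True,  False): ("anaphor / trace", "Principle A"),
    (False, True):  ("pronoun / little pro", "Principle B"),
    (False, False): ("R-expression / variable (wh-trace)", "Principle C"),
    (True,  True):  ("big PRO", "must be ungoverned"),
}

def classify(anaphor, pronominal):
    category, condition = TYPOLOGY[(anaphor, pronominal)]
    sign = lambda f: "+" if f else "-"
    return (f"[{sign(anaphor)}Anaphor, {sign(pronominal)}Pronominal]: "
            f"{category} ({condition})")

for feats in TYPOLOGY:
    print(classify(*feats))
```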
4.1 Introduction
If one wants to benefit from something, she/he should know what it is first. So
if we would like to benefit from the Chomskyan Theory in the field of language
teaching and learning, we ought to know exactly what it is, what it does, and
why it is developed in the first place. Part of the answer to the first question
has been provided in the preceding chapters. The rest of the answer can be
summarized in two or three sentences. The Chomskyan theory is a theory of
language. In addition to describing language in general, it describes its
structure in a consistent coherent manner. It does so to explain what
language is and what a person knows when she/he knows a language. It is
also an attempt to define the essential characteristics of human languages
so as to be able to differentiate them from artificial languages.
The answer to the last question involves the answer to the more basic
question "why do we study language at all?". Of course we learn language to
be able to communicate with other foreign speakers, but this is not the main
focus of theoretical linguistics. In fact, the story is that linguists study language
to reveal, at least, some secrets of the human mind. As Chomsky puts it in his
book Language and Mind:
Building on the antecedent work of his teacher Zellig Harris (see Stern, 1996:
141), Chomsky and his colleagues and students studied language and its
structure and discovered that language is both creative and rule-governed. After
proposing a theory of its structure, they tried to develop a theory of language
acquisition. They proposed that children have a natural endowment that
enables them to acquire language in a limited time with no instruction at all.
This assumption is called the 'Innateness Hypothesis'. It says that children are
genetically endowed with a device in the brain responsible for language
production, acquisition and maybe comprehension too. This device is called
the 'language faculty'. Nowadays, linguists call it 'Universal Grammar' or UG
for short. UG consists of all the principles that cannot be acquired through
experience. UG theorists argue that all humans are genetically endowed with a
set of principles and parameters, which tell children what sort of sounds and
grammar are or are not possible in human language. Thus UG facilitates the
children's task of language learning by restricting the possibilities available to
them. Principles show children what is possible in a language and what is
not. Parameters are possible options from which one can choose in learning
one language or another. For example, languages vary on what can be
relativized in relative clauses (see Flynn and others, 1998): a) subject, b)
object, c) indirect object, d) object of preposition, e) genitive, f) object of
comparison. All human languages allow the first option, subject relative
clauses, but languages differ in which of the remaining options they allow.
However, if one type of relative clause is possible in a given
language, then all the other ones to the left must be possible too. There is no
language in which object of preposition relative clauses are found but not
object relative clauses. Accordingly, children only have to set their parameter
on the furthest type of relative clause to the right that is possible in their
language. They do not have to learn each kind separately. Therefore, a UG-
analysis of second language acquisition (SLA) could help language teachers.
It has been suggested that researchers could tell teachers when the
parameters are set similarly or differently for both the L1 and L2. If the setting, for
example for relative clauses, were the same, then the teacher would not need
to concentrate on this aspect of language (Bartels, 1999).
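The implicational hierarchy just described lends itself to a short computational illustration. The sketch below is mine, assuming only what the text states: availability of a position on the hierarchy implies availability of every position to its left, so a learner need only record the rightmost attested position.

```python
# A sketch of the implicational relative-clause hierarchy: if a language
# allows a position, it allows every position to its left.

HIERARCHY = ["subject", "object", "indirect object",
             "object of preposition", "genitive", "object of comparison"]

def allowed_positions(rightmost_attested):
    """All relativizable positions implied by the rightmost attested one."""
    return HIERARCHY[:HIERARCHY.index(rightmost_attested) + 1]

# A learner who hears an object-of-preposition relative clause can infer
# that the three positions to its left are relativizable too:
print(allowed_positions("object of preposition"))
# ['subject', 'object', 'indirect object', 'object of preposition']
```

This is why, as the text notes, the child does not have to learn each relative-clause type separately: setting one parameter value fixes the whole prefix of the hierarchy.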
So, the children acquiring their mother language have some universal
principles that they do not have to learn; they are already there in their brains.
Other specific language rules have to be worked out by processing the speech
of adults they hear. By forming hypotheses about the speech of adults and
then testing them, children set the parameters which represent the specific
properties upon which languages vary. But when we compare first language
acquisition and second language acquisition within the Chomskyan
framework, we come up with three distinct views. First language acquisition
has only one form within the Chomskyan theory (see Cook and Newson,
1998). We have the speech of adults, UG, and finally a competence in a given
language. This view of first language acquisition can be illustrated as in figure
(1) below:
From figure (1) above we see that children have direct access to UG.
Children hear the speech of adults, process it in the UG device, and then have
their L1 competence. In L2 acquisition, however, children can have either
direct or indirect access to UG, or else no access at all. If they have
direct access, then learning a second language can be viewed in the same
way as acquiring a mother tongue. But it may happen that children have indirect
access to UG. They have the L1 knowledge, and thus, can utilize it in learning
a second language. They can transfer their L1 principles positively to the
second language that is being learnt. This view of L2 acquisition is illustrated in
figure (2) below:
Although all the studies (Flynn and others, 1988) show that UG has at least
some effect on SLA, there is no consensus regarding the difficulty learners
still have in acquiring L2 grammar even if they have access to UG, or why
they have this difficulty. Since UG theory in SLA is well developed, studies in the future
should be directed to uncover the difficulties learners would have if they have
access to UG rather than further justifying the existence of UG theoretically.
However, many authors (see Flynn and others, 1988) also point out that while
justifications of UG in interlanguage prevail, what we still lack is an explicit
learning theory.
After this brief introduction about the way the Chomskyan theory deals with
language acquisition, we are going to take a look at the impact of the
Chomskyan thought on language teaching and learning. After so doing we will
interpret some of its implications for language teaching and present them in an
explicit way.
The examples (1a & b) prove that want and to can contract to wanna if they
are adjacent. Because want and to are not adjacent in (2a & b) they cannot
contract to wanna. In the examples (2a & b) the NP this apple intervenes
between want and to, which prevents their contraction. To clarify the role of
trace theory here, consider now the examples in (3) and (4). These sentences
involve movement of the NP this apple to the front of the sentence.
As shown in (3) want and to can contract because the trace of the moved NP
this apple is in the object position of eat and does not intervene between want
and to. But in (4) the trace in question intervenes between want and to. For
this reason, the two categories cannot contract to wanna as shown in (4b).
What all this shows is that the grammatical rule which contracts want and to is
dependent on whether there is an intervening constituent between the two
categories or not. This intervening constituent could be a trace and therefore
Trace Theory plays an important role in explaining such phenomena to
learners of English.
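The contraction facts above amount to a simple adjacency rule that is sensitive to traces. The sketch below is an illustrative toy, assuming a flat token-list representation in which "t" stands for a trace; it is not a serious model of the phonology:

```python
# A sketch of wanna-contraction: "want" and "to" contract only when they are
# strictly adjacent; any intervening token, including a trace "t", blocks it.

def contract_wanna(tokens):
    out, i = [], 0
    while i < len(tokens):
        if tokens[i] == "want" and i + 1 < len(tokens) and tokens[i + 1] == "to":
            out.append("wanna")    # adjacent: contraction applies
            i += 2
        else:
            out.append(tokens[i])  # a trace (or any word) blocks contraction
            i += 1
    return out

# "Which apple do you want to eat t?": the trace follows "eat", so contraction is fine
print(contract_wanna(["you", "want", "to", "eat", "t"]))
# ['you', 'wanna', 'eat', 't']
# "Who do you want t to eat the apple?": the trace intervenes, no contraction
print(contract_wanna(["you", "want", "t", "to", "eat", "the", "apple"]))
# ['you', 'want', 't', 'to', 'eat', 'the', 'apple']
```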
Merging (i.e. combining) the noun school with the preposition to derives the
PP in (6).
Merging the PP in (6) with the verb going derives the VP in (7) below.
Merging the VP in (7) with the auxiliary derives the incomplete phrase I' in (8)
(not complete because it cannot be said in conversation as a reply to a
question or in any exchange of speech).
Merging the I' in (8) with the subject pronoun he will derive the IP in (9) below.
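The derivation in steps (6)-(9) can be sketched as successive applications of a single binary operation. This is an illustrative sketch only: the nested-tuple representation and the labels are mine, following the text's IP analysis of "He was going to school":

```python
# A sketch of Select and Merge building "He was going to school" bottom-up,
# following steps (6)-(9) above.

def merge(label, left, right):
    """Merge two syntactic objects into one labeled constituent."""
    return (label, left, right)

pp = merge("PP", "to", "school")   # (6) P + N    -> PP
vp = merge("VP", "going", pp)      # (7) V + PP   -> VP
i1 = merge("I1", "was", vp)        # (8) I + VP   -> I1 (still incomplete)
ip = merge("IP", "he", i1)         # (9) Spec + I1 -> IP (a full clause)

print(ip)
# ('IP', 'he', ('I1', 'was', ('VP', 'going', ('PP', 'to', 'school'))))
```

Each call to `merge` takes exactly two objects, which mirrors the stepwise character of the derivation: no constituent is built until its parts have been selected and merged.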
The student can use the operations Select and Merge, plus the idiosyncratic
properties of lexical items, to avoid ungrammaticality. For example, the
properties of the pronoun he in (5) specify that it has a nominative Case
feature and therefore must always be a subject. Its properties also specify
that it selects a singular verb, a requirement fulfilled by the auxiliary verb
was in (5). Among the properties of was is that it must have a singular
nominative specifier (subject). The verb was also requires a verb in the '–ing'
form as its complement (in its use as an auxiliary forming the past continuous
tense). This property is fulfilled by the verb going in (5). The idiosyncrasies
of the verb going in turn indicate that it needs a prepositional phrase headed
by the preposition to, and this is fulfilled by the PP to school in (5). The
preposition to also needs a noun as its complement, and this requirement is met
by the noun school in (5).
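The derivation in (6)–(9) can be mimicked in a short program. This is my own illustration, not the theory's formal machinery: the lexicon entries and category labels are a deliberate simplification, but the sketch shows how Select (picking items from the lexicon) and Merge (combining them while checking each head's selectional requirements) build the sentence step by step.

```python
# Toy lexicon: each item records its category and what category it selects.
LEXICON = {
    "school": {"cat": "N", "selects": None},
    "to":     {"cat": "P", "selects": "N"},
    "going":  {"cat": "V", "selects": "P"},
    "was":    {"cat": "I", "selects": "V"},  # also requires a nominative subject
    "he":     {"cat": "N", "selects": None, "case": "nominative"},
}

def merge(head, complement):
    """Combine a head with its complement, rejecting mismatched selections."""
    comp_head = complement if isinstance(complement, str) else complement[0]
    if LEXICON[head]["selects"] != LEXICON[comp_head]["cat"]:
        raise ValueError(f"'{head}' cannot select '{comp_head}'")
    return (head, complement)

pp    = merge("to", "school")   # (6)  PP: to school
vp    = merge("going", pp)      # (7)  VP: going to school
i_bar = merge("was", vp)        # (8)  I': was going to school
ip    = ("he", i_bar)           # (9)  IP: he was going to school
print(ip)  # ('he', ('was', ('going', ('to', 'school'))))
```

An ill-formed combination such as merging going directly with school raises an error, mirroring how the idiosyncrasies of lexical items block ungrammatical derivations.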
From the discussion above, we can see that these two operations (Select and
Merge) together with the idiosyncrasies of lexical items are very useful in
explaining how sentences are formed. Thus, they can be used in teaching
writing.
4. The schema of X-bar theory is the same for all phrases and sentences. The
university teacher can explain this to his/her students and then show them how
to insert lexical items in their appropriate positions. The idiosyncrasies of
lexical items, plus other principles, relations (for instance, C-command and
the Binding Principles), and operations (for example, the operation Merge
discussed above) of Chomskyan Theory can be used effectively in defining the
appropriate positions in which lexical items can be placed. In the end,
students will recognize that they can generate an infinite number of phrases
and sentences through the use of the X-bar schema and other useful operations,
constraints, and principles. The result will thus be the development of
linguistic creativity among students.
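The uniformity of the schema can be made concrete with a toy function. This is my own simplification, not the theory's formal notation: one template, [XP specifier [X' X complement]], serves for every category, and recursion through the complement slot is what yields an unbounded number of phrases.

```python
# One function instantiates the X-bar schema for any head X:
# an XP is a specifier plus X', and X' is a head plus a complement.

def xp(head, spec=None, comp=None):
    """Build a phrase according to the uniform X-bar schema."""
    return {"spec": spec, "head": head, "comp": comp}

# The same schema instantiated for different lexical categories:
np_ = xp("school")                       # NP: school
pp  = xp("to", comp=np_)                 # PP: to school
vp  = xp("going", comp=pp)               # VP: going to school
ip  = xp("was", spec=xp("he"), comp=vp)  # IP: he was going to school
```

Because each complement slot can itself hold another XP, the template can be reapplied without limit, which is the point about infinite generative capacity made above.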
An important basis of Chomskyan Theory is UG. The teacher can benefit from
UG in defining the content of his/her language course. For example, instead
of telling students that there are statements, questions, nouns, etc. in the
FL, the teacher should construct the language course so as to delve directly
into how to form statements, questions, and so on. In other words, he/she
should make use of the innate linguistic knowledge that the students already
possess.
(1) He has regarded syntax as the central feature of linguistic structure, and
proposed that its study should be central in linguistic research (Trask, 1997).
25. It should be noted that most of these ideas have been understood and
collected from different discussions and answers to several queries submitted
by different linguists to the Linguist List website.
mental processes are real, can be utilized in linguistic descriptions, and can
be studied themselves.
(6) While his predecessors had acknowledged that languages vary without
limit, Chomsky argued for the universal nature of language. He paid more
attention to the similarities among languages than their variable nature.
Recently, the emphasis upon universality has led to the search for universal
grammar, the supposedly universal structural properties of human languages.
(7) Accordingly, Chomsky has argued that these universal properties are
genetically built into our brains at birth: this is his innateness hypothesis. In
Chomsky's view, children know in advance what human languages are like,
and have only to acquire the particular idiosyncrasies of the specific language
they are learning.
From its early descriptions of languages, Chomskyan Theory has shown great
generalizing power and an ability to explain underlying regularities among
languages. By offering both the teacher and the learner more general rules in
place of the traditional lists of exceptions and special cases, Chomskyan
Theory is a very useful aid to teaching and learning. It is this point which
Rutherford (1968) takes up in the preface to his textbook:
Although Chomsky has changed his syntactic ideas frequently, his underlying
beliefs have remained broadly the same. He has revised his theory repeatedly,
sometimes dramatically, and this continuous change can be observed in each
version of the theory.
20. Cook, V. J. and Mark Newson (1998). Chomsky's Universal Grammar: An
Introduction, 2nd edition. Blackwell Publishers.
21. Di Pietro, R J. (1968). "Contrastive Analysis and the Notions of Deep and
Surface Grammar", in Alatis (1968), 65 – 80.
24. Rutherford, William E. (1968). Modern English: A Textbook for Foreign
Students. New York: Harcourt, Brace and World.
26. Flynn, S. and others (1998) (Eds). The Generative Study of Language
Acquisition. Mahwah, NJ: Lawrence Erlbaum.
27. Hudson, Richard (2000). Grammar Teaching and Writing Skills: the
Research Evidence, London.
29. Lotfi, Ahmed Reza (2000). "An Outline of the Pooled Feature Hypothesis:
How Imperfect are 'Language Imperfections'", Unpublished Paper, Azad
University, Iran.
39. Sag, Ivan A. and Thomas Wasow (1999). Syntactic Theory: A Formal
Introduction. CSLI Publications, USA.
42. Smith, Neil (1999). Chomsky: Ideas and Ideals, Cambridge University
Press.
47. Webelhuth, G. (ed.) (1995). Government and Binding Theory and the
Minimalist Program: Principles and Parameters in Syntactic Theory. A Reader's
Guide to "A Minimalist Program for Linguistic Theory". Blackwell, Cambridge.