Logical Issues in Language Acquisition
Linguistic Models
The publications in this series tackle crucial
problems, both empirical and conceptual, within the
context of progressive research programs. In
particular Linguistic Models will address the
development of formal methods in the study of
language with special reference to the interaction
of grammatical components.
Series Editors:
Teun Hoekstra
Harry van der Hulst
1 Michael Moortgat, Harry van der Hulst and Teun Hoekstra (eds)
The Scope of Lexical Rules
2 Harry van der Hulst and Norval Smith (eds)
The Structure of Phonological Representations. Part I
3 Harry van der Hulst and Norval Smith (eds)
The Structure of Phonological Representations. Part II
4 Gerald Gazdar, Ewan Klein and Geoffrey K. Pullum (eds)
Order, Concord and Constituency
5 W. de Geest and Y. Putseys (eds)
Sentential Complementation
6 Teun Hoekstra
Transitivity. Grammatical Relations in Government-Binding Theory
7 Harry van der Hulst and Norval Smith (eds)
Advances in Nonlinear Phonology
8 Harry van der Hulst
Syllable Structure and Stress in Dutch
9 Hans Bennis
Gaps and Dummies
10 Ian G. Roberts
The Representation of Implicit and Dethematized Subjects
11 Harry van der Hulst and Norval Smith (eds)
Autosegmental Studies on Pitch Accent
12 a. Harry van der Hulst and Norval Smith (eds)
Features, Segmental Structures and Harmony Processes (Part I)
12 b. Harry van der Hulst and Norval Smith (eds)
Features, Segmental Structures and Harmony Processes (Part II)
13 D. Jaspers, W. Klooster, Y. Putseys and P. Seuren (eds)
Sentential Complementation and the Lexicon
14 René Kager
A Metrical Theory of Stress and Destressing in English and Dutch
Logical Issues
in Language
Acquisition
1990
FORIS PUBLICATIONS
Dordrecht - Holland/Providence RI - U.S.A.
Published by:
Foris Publications Holland
P.O. Box 509
3300 AM Dordrecht, The Netherlands
Logical Issues in Language Acquisition / I.M. Roca (ed.). - Dordrecht [etc.]: Foris. - Ill. -
(Linguistic Models : 15)
With Index, Ref.
ISBN 90 6765-506-6
Subject Heading: Language Acquisition
No part of this publication may be reproduced or transmitted in any form or by any means,
electronic or mechanical, including photocopy, recording, or any information storage and
retrieval system, without permission from the copyright owner.
Contents
List of Contributors
I.M. Roca
Introduction
Martin Atkinson
The logical problem of language acquisition: representational and procedural issues
1. Background
2. Representational Issues
3. Procedural Issues
Footnotes
References
Vivian Cook
Observational data and the UG theory of language acquisition
1. Evidence in the UG model
2. I-language and E-language theories
3. Observational data, performance and development
4. Representativeness of observational data
5. Observational data and adult performance
6. Evidence of absence
7. Correlations within observational data
8. General requirements for observational data in UG research
References
Michael Hammond
Parameters of Metrical Theory and Learnability
1. Metrical Theory
2. Learnability
3. The Seven-Syllable Hypothesis
4. Levels and Options
5. Short-Term Memory Constraint
Footnotes
References
Teun Hoekstra
Markedness and growth
1. Parameters and Markedness
2. Developmental markedness
3. Extension and Intension
4. The notion of growth: the Unique External Argument Principle
5. A-chains
5.1. Ergatives
5.2. Passives
6. Conclusion
Footnote
References
James Hurford
Nativist and Functional Explanations in Language Acquisition
1. Preliminaries
1.1. Setting and Purpose
1.2. Glossogenetic and Phylogenetic mechanisms
1.3. Competence/performance, I-Language/E-language
1.4. The ambiguity of 'functional'
2. Glossogenetic mechanism of functional influence on language form
2.1. The Arena of Use
2.2. Frequency, statistics and language acquisition
2.3. Grammaticalisation, syntacticisation, phonologisation
2.4. The role of invention and individual creativity
2.5. The problem of identifying major functional forces
2.6. Language drift
3. Conclusion
Footnotes
References
Rita Manzini
Locality and Parameters again
1. Locality
2. English Anaphors and Pronouns
3. Italian Reciprocal Constructions
4. Parameters in Locality Theory
References
Marina Nespor
On the rhythm parameter in phonology
1. Phonetic evidence against two types of timing
2. Phonological evidence against two types of rhythm
2.1. Nonrhythmic characteristics of "stress-timed" and "syllable-timed" languages
2.2. On the existence of intermediate systems
2.3. On the development of rhythm
3. The Phonology of rhythm: arguments for a unified rhythmic component
3.1. The metrical grid in English and Italian
3.2. The Rhythm Rule in English and Italian
3.3. The definition of stress clash in Italian and English
3.4. Stress lapses in English and Italian
4. Conclusions
Footnotes
References
Mark Newson
Dependencies in the Lexical Setting of Parameters: a solution to the undergeneralisation problem
1. The Lexical Parameterisation Hypothesis and Ensuing Problems
2. A solution to the problems
3. Undergeneralisations and the Binding Theory
3.1. Background issues
3.2. Generalisations and the Lexical Dependency
4. Support for the Lexical Dependency
5. A further predicted generalisation
Footnotes
References
Andrew Radford
The Nature of Children's Initial Grammars of English
1. Introduction
2. Structure of nominals in early child English
3. Structure of clauses in early child English
4. The overall organisation of early child grammars
5. Summary
Footnotes
References
Anjum Saleemi
Null Subjects, Markedness, and Implicit Negative Evidence
1. Some background assumptions
2. The Licensing Parameter
3. The Learnability Problem
4. Positive Identification
5. Exact Identification
6. Is Implicit Negative Evidence Really Necessary?
7. Developmental Implications
8. Binding Parameters and Markedness
Footnotes
References
N. V. Smith
Can Pragmatics fix Parameters?
1. Introduction
2. Exclusions
3. Relevance
4. Parameters
5. Hyams
6. Fixing
7. Conclusion
Footnotes
References
Author Index
Subject Index
List of Contributors
Martin Atkinson
Department of Language and Linguistics
University of Essex
Colchester
Essex CO4 3SQ
ENGLAND
Vivian Cook
Department of Language and Linguistics
University of Essex
Colchester
Essex CO4 3SQ
ENGLAND
Michael Hammond
Department of Linguistics
University of Arizona
Tucson, AZ 85721
USA
e-mail:hammond@ccit.arizona.edu
Teun Hoekstra
Instituut voor Algemene Taalwetenschap
Rijksuniversiteit
Postbus 9515
2300 RA Leiden
THE NETHERLANDS
e-mail:letthoekstra@nl.leidenuniv.rulcri
James Hurford
Department of Linguistics
Adam Ferguson Building
40 George Square
Edinburgh EH8 9LL
SCOTLAND
e-mail:jim@uk.ac.ed.edling
Rita Manzini
Department of Phonetics and Linguistics
University College
Gower Street
London WC1E 6BT
ENGLAND
Marina Nespor
Italiaans Seminarium
Universiteit van Amsterdam
Spuistraat 210
1012 VT Amsterdam
THE NETHERLANDS
Mark Newson
Department of Language and Linguistics
University of Essex
Colchester
Essex CO4 3SQ
ENGLAND
Andrew Radford
Department of Language and Linguistics
University of Essex
Colchester
Essex CO4 3SQ
ENGLAND
Iggy Roca
Department of Language and Linguistics
University of Essex
Colchester
Essex CO4 3SQ
ENGLAND
e-mail:iggy@uk.ac.essex
Anjum P. Saleemi
English Department
Allama Iqbal Open University
H-8, Islamabad
PAKISTAN
N.V. Smith
Department of Phonetics and Linguistics
University College
Gower Street
London WC1E 6BT
ENGLAND
e-mail:uclynvs@uk.ac.ucl
Introduction
I.M. Roca
University of Essex
This volume grew out of a seminar series on the theme 'The Logical Problem
of Language Acquisition' that I organised for the Department of Language
and Linguistics of the University of Essex in 1988, and at which most
of the papers included here were first presented. The aim of the series
was to examine the impact of the issue on various areas of language research,
thus offering as broad as possible an overview of what is rapidly becoming
the focal point of generative linguistics.
The change in concerns and outlook which has taken place in linguistics
over the past quarter century is nicely encapsulated in the contrast between
the basic tenet of American descriptive linguistics that 'languages ... differ
from each other without limit and in unpredictable ways' (Joos 1957:96)
and Chomsky's current position that there is only one language (cf. e.g.
Chomsky 1988c: 2).
The apparent irreconcilability of these two stands betrays a more
fundamental truth that contemporary linguistics, under Chomsky's enduring
leadership, has been labouring to unravel and articulate. Specifically,
the crucial discovery has been that phenomenon must be kept distinct
from noumenon, or, in plainer words, that underlying the obvious diversity
of languages there is a unity more essential to language than its surface
geographical variety.
Chomsky has thus shifted the focus of linguistics from language to man,
from manifestation to source. The central question has now become that
of accounting for the possession of language, that is, of an object which
has the precise characteristics that human languages are known to have.
Pursuing this line of logical investigation, it is reasonable to conclude that
if all human languages are cut to the same shape, this shape must be
imposed by the very organism in which such languages are contained,
that is to say, by man himself. Moreover, given the obvious fact that
language develops in man rather than from him, like, say, a physical limb
or body hair, the need for interaction between the organism and its
environment becomes more acutely obvious. Briefly, what psychological
(or, more accurately, biological) attributes must humans possess in order
for language learning to take place in early childhood, under the usual
conditions of spontaneity, rapidity, satisfactory completion, and so on?
In turn, what traits are necessary in the ambient language itself to make
such learning possible in spite of the apparent input variety which so struck
linguists of Joos's generation? Here we have in a nutshell what has come
to be known as the logical problem of language acquisition.
Chomsky's unashamedly nativist position is of course well-known.
Briefly, the surface complexity of language is such that no acquisition
could meaningfully take place unless the organism already comes equipped
with a sort of mental template designed to anticipate and match the ambient
data in some way. Given the reality of cross-linguistic surface variation,
however, such matching cannot be simplistically direct. Rather, the idea
is that the variation is built into the template in the form of a limited
range of values for each of a set of parameters. From this perspective,
therefore, the task of the child learner is one of elucidating from the data
which of the available values must be assigned to the language to which
he is being exposed. To a large degree, the acquisition of this language
consists in the setting of such parameters. Further to this, there will be
the (of course non-negligible) task of rote learning the idiosyncratic
properties of lexical items. Not unexpectedly, these two tasks are in fact
interdependent, in ways that are gradually becoming better known.
It is not my intention to review here the short but already hefty history
of the topic which inspires the title of this book. For most of the relevant
information, the curious reader can refer to such works as Wexler and
Culicover (1980), Baker and McCarthy (1981), Hornstein and Lightfoot
(1981), Atkinson (1982), Borer (1984), Pinker (1984), Berwick (1985),
Chomsky (1986a), Hyams (1986), Roeper and Williams (1987), and
Chomsky (1988a, 1988b, 1988c).
Focussing then on the contents of the present collection, a range of
interwoven themes are discernable, and we shall now go through them
briefly.
Granting the reality of Universal Grammar in the form of principles
and parameters, one obvious question concerns the chronology of its
availability. Specifically, are all such principles and parameters present
and accessible from the onset of the acquisition process or do they (or
at least some of them) emerge as development unfolds, as in Borer and
Wexler's (1987) Maturation Hypothesis? While Atkinson is decidedly
sympathetic to the maturational account, an important part of Hoekstra's
paper is aimed against Borer and Wexler's key argument for the hypothesis,
which is based on the claim that the non-occurrence of verbal passives
in the early stages is the result of the unavailability of A-chains at this
point of development. Hoekstra's alternative hinges on the characterisation
of language acquisition as growth in the system of grammatical knowledge,
the central theme of his paper. Importantly, such intensional accruement
need not result in extensional expansion, but is also consistent with
[...]
that the operation of the child's pragmatic principles may not in fact
presuppose a total syntactic analysis.
Relevant to this issue is the detailed evidence presented by Radford
concerning the structure of early child grammars (20-24 months). These
grammars must be taken to lean heavily on Universal Grammar, given
the minimal amount of exposure to the ambient language by this time.
Obviously, thus, they constitute a privileged testing ground for claims
regarding the availability of grammatical devices to the developing child.
Radford's finding is that early grammars are lexical-thematic, that is to
say, they contain neither functional categories nor non-thematic constituents.
Correspondingly, all structures in these grammars are projections
of lexical categories and comprise networks of thematic relations. As a
consequence, child grammar will lack the functional properties associated
with functional categories, such as case and binding. Interestingly, the
lexical-thematic hypothesis can account for the absence of movement chains,
referred to above in connection with Hoekstra's contribution.
The topic of the interaction between Universal Grammar and the ambient
language is taken up again by Hammond, who draws a distinction along
standard lines between a default setting of a parameter, for which no external
evidence is required, and a marked setting, which can only be triggered
by positive evidence. He goes on to show that in the domain of word
stress the marked values cooccur with a maximum of seven syllables. Rather
than building the corresponding constraint into UG, Hammond opts for
the non-stipulative strategy of relating the observation to Miller's (1967)
magical number seven. In particular, he contends that the reason for the
7-syllable limit is derivable from the limitation of the storage capacity
of short-term memory to seven units. In this way, the statement of the
stress parameters is kept at its maximum level of generality, while still
being compatible with the facts.
Hurford explicitly sets out to reconcile nativist and (social or cognitive)
functional explanations of language acquisition, all too often incarnated
in the guise of two openly warring factions. He makes a general plea
for the integration of extra-grammatical factors into the domain of
learnability by drawing a distinction between the evolution of the species
and the evolution of particular languages, which he claims to be a function
of both innate and culturally transmitted factors. The central concept in
his theory is the 'Arena of Use', a performance-related abstraction
paralleling Chomsky's competence-related Language Acquisition Device. By
jointly providing the input data for the next generation of learners, the
LAD and the AoU are both instrumental in the acquisition of competence.
Importantly, a model of this kind allows for such factors as statistical
frequency and distribution, discourse structure, and individual invention
and creativity to play a role in language development without needing
to build them directly into the competence. It moreover goes some way
towards accounting for such recalcitrant phenomena as the existence of
language drift or the survival of the phoneme through adverse theoretical
conditions.
As follows from the broad range of contributions, the book ought to
be readable by, and useful to, linguists with a variety of interests and
from a variety of backgrounds: child language researchers, learnability
theorists, syntacticians and phonologists with an interest in principles and
parameters, functionalists, language phylogenists, second language
researchers, and so on. Indeed, it is perhaps not unreasonable to hope that
the volume will make some contribution towards the integration of the
rich and varied field of language acquisition.
During the period leading up to publication, the papers were subject
to critical reviews and subsequent extensive revision, and I wish to make
public my gratitude to the anonymous referees who so generously
contributed their time and expertise. My editing task has been considerably
facilitated by the help and encouragement I received from the series editors,
Teun Hoekstra and Harry van der Hulst, and from the Essex colleagues
who participated in the project, in particular Martin Atkinson, whose idea
the collection originally was, and who made funds and facilities available
during his period as chairman of the Department of Language and
Linguistics.
In the interest of symmetry, I have taken the liberty of introducing
a modest degree of style harmonisation across the papers, which will
hopefully make the reader's task a more pleasurable one. The generic use
of he adopted here should obviously not mislead the pragmatically aware
reader into believing that children (or adults) are all of one sex. Owing
to practicalities and, especially, time pressure, the choice of a number
of typographic conventions has however been left to the individual initiative
of the contributors.
Throughout the two years which have elapsed between conception and
delivery, the contributors have at all times borne my periodic bombardments
with patience and good humour. I apologise to them for my countless
inefficiencies and thank them warmly for their enthusiasm and cooperation.
It is of course to the contributors that any merit of this collection must
ultimately revert.
REFERENCES
Baker, C. and J. J. McCarthy. 1981. The Logical Problem of Language Acquisition. Cambridge,
Massachusetts: MIT Press.
Berwick, R. C. 1985. The Acquisition of Syntactic Knowledge. Cambridge, Massachusetts:
MIT Press.
Borer, H. 1984. Parametric Syntax. Dordrecht: Foris.
Borer, H. and K. Wexler. 1987. The Maturation of Syntax. In Roeper and Williams. 123-172.
Borer, H. and K. Wexler. 1988. The Maturation of Grammatical Principles. Ms. University
of California, Irvine.
Chomsky, N. 1986a. Knowledge of Language: Its Nature, Origin and Use. New York: Praeger.
Chomsky, N. 1986b. Barriers. Cambridge, Massachusetts: MIT Press.
Chomsky, N. 1988a. Generative Grammar. Studies in English Linguistics and Literature. Kyoto
University of Foreign Studies.
Chomsky, N. 1988b. Language and Problems of Knowledge: the Managua Lectures. Cambridge,
Massachusetts: MIT Press.
Chomsky, N. 1988c. Some Notes on Economy of Derivation and Representation. Ms. MIT.
Hornstein, N. and D. Lightfoot. 1981. Explanation in Linguistics. London: Longman.
Hyams, N. 1986. Language Acquisition and the Theory of Parameters. Dordrecht: Reidel.
Joos, M. 1957. Readings in Linguistics. Chicago: University of Chicago Press.
Manzini, R. 1988. Constituent Structure and Locality. In A. Cardinaletti, G. Cinque and
G. Giusti (eds.), Constituent Structure. Papers from the 1987 GLOW Conference, Annali
di Ca' Foscari 27, IV.
Manzini, R. 1989. Locality. Ms. University College, London.
Manzini, R. and K. Wexler. 1987. Parameters, Binding Theory and Learnability. Linguistic
Inquiry 18. 413-444.
Miller, G.A. 1967. The Magical Number Seven, plus or minus two: Some Limits on our
Capacity to Process Information. In G. A. Miller (ed.) The Psychology of Communication.
New York: Basic Books Inc. 14-44.
Piattelli-Palmarini, M. 1989. Evolution, Selection and Cognition: from 'Learning' to Parameter
Setting in Biology and in the Study of Language. Cognition 31. 1-44.
Pica, P. 1987. On the Nature of the Reflexivization Cycle. In Proceedings of NELS 17, GLSA,
University of Massachusetts.
Pike, K. 1945. The Intonation of American English. Ann Arbor, Michigan: University of
Michigan Press.
Pinker, S. 1984. Language Learnability and Language Development. Cambridge, Massachusetts:
Harvard University Press.
Roeper, T. and E. Williams. 1987. Parameter Setting. Dordrecht: Reidel.
Safir, K. 1987. Comments on Wexler and Manzini. In Roeper and Williams. 77-89.
Sperber, D. and D. Wilson. 1986. Relevance: Communication and Cognition. Oxford: Blackwell.
Wexler, K. and P. Culicover. 1980. Formal Principles of Language Acquisition. Cambridge,
Massachusetts: MIT Press.
Wexler, K. and R. Manzini. 1987. Parameters and Learnability in Binding Theory. In Roeper
and Williams. 41-76.
The logical problem of language
acquisition: representational and
procedural issues
Martin Atkinson
University of Essex
1. BACKGROUND
L2 = {aa, aaa, ...}
L3 = {aaa, ...}
etc.
(3) *aa
With our assumptions about the data available to the learner explicit,
it is easy to see how to formulate a procedure which will guarantee successful
identification after a finite time. This procedure simply instructs the learner
to set i, in his current hypothesis Li, as the length of the shortest sentence
to which he has been exposed so far. Since, by our assumptions about
the data, this shortest string will be presented after some finite time, at
that time the procedure will select the correct language and no subsequent
datum will modify this selection. Adding this procedure, then, to the
assumptions about the space of hypotheses and those concerning available
data will yield a learning theory for the languages of (2) in which no
logical problem arises.
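The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes the languages of (2) to be Li = the set of strings of a's of length at least i (which is what the surviving fragments of (2) suggest), and the function name and the sample presentation are hypothetical.

```python
from typing import Iterable, Iterator, Optional


def shortest_string_learner(data: Iterable[str]) -> Iterator[Optional[int]]:
    """Yield the learner's current index i (hypothesis Li) after each datum."""
    current: Optional[int] = None  # no hypothesis before any data arrive
    for sentence in data:
        # Revise only when a strictly shorter sentence is encountered.
        if current is None or len(sentence) < current:
            current = len(sentence)
        yield current


# A hypothetical presentation of positive data drawn from L2 = {aa, aaa, ...}.
presentation = ["aaaa", "aaa", "aaaaa", "aa", "aaaa", "aaa"]
guesses = list(shortest_string_learner(presentation))
print(guesses)  # [4, 3, 3, 2, 2, 2]
```

Once the shortest sentence of the target language has appeared, the guess is correct and no later datum can change it, which is the sense in which identification succeeds after a finite time.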
But now consider a superficially similar problem which leads to a radically
different outcome. Suppose that the hypothesis space is defined by the
languages in (4), only one of which contains an infinite number of sentences,
and that our assumptions about data remain unaltered:
(4) L1 = {a}
L2 = {a, aa}
L3 = {a, aa, aaa}
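A rough way to see why (4) behaves differently is the following sketch. It assumes that the one infinite language in (4) is the set of all strings of a's (the usual choice in the learnability literature, though not spelled out in the list above). A learner that cautiously guesses the smallest finite language consistent with the data succeeds on every finite language but keeps revising on the infinite one, whereas a learner that guessed the infinite language could never be corrected by positive data alone. The function names and sample data are hypothetical.

```python
from typing import Iterable, List


def longest_string_learner(data: Iterable[str]) -> List[int]:
    """Guess the finite language Ln, where n is the longest string seen so far.

    Returns the sequence of guesses (as indices n) made after each datum.
    """
    guesses: List[int] = []
    longest = 0
    for sentence in data:
        longest = max(longest, len(sentence))
        guesses.append(longest)
    return guesses


def mind_changes(guesses: List[int]) -> int:
    """Count how often the learner revises its hypothesis."""
    return sum(1 for prev, cur in zip(guesses, guesses[1:]) if prev != cur)


# Data from the finite language L3 = {a, aa, aaa}: the guesses stabilise.
finite_text = ["a", "aaa", "aa", "a", "aaa", "aa"]
finite_guesses = longest_string_learner(finite_text)
print(finite_guesses)  # [1, 3, 3, 3, 3, 3]

# A prefix of a presentation of the assumed infinite language {a, aa, aaa, ...}:
# every new, longer string forces another revision, however long the prefix.
infinite_prefix = ["a" * n for n in range(1, 11)]
changes = mind_changes(longest_string_learner(infinite_prefix))
print(changes)  # 9
```

The simulation only inspects a finite prefix, so it illustrates the difficulty rather than proving it.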
Or, to take a less widely discussed example, Baker (1988b) cites data from
Chichewa, showing that this language has applicative constructions
corresponding to both instrumentals and benefactives:
In (9) and (10) the instrumental NP mpeni and the benefactive NP mfumu
are 'promoted' to direct object position immediately following the verb,
creating a structure in which the verb appears to be followed by two objects.
However, these two objects behave rather differently in a number of respects
between the instrumental and benefactive cases. To take one such difference,
for the instrumental both 'objects' can appear as pronominal object prefixes
in front of the verb, as in (11):
The point now is that to the extent that these judgements are reliable
and diagnostic of a uniform, internally represented grammar, we are obliged
to seek an account of how they arise. That native-speakers of English
are not consistently (or even exceptionally) told that (5) is odd but nothing
like so bad as (6), or that (7) is fine but (8) is less good is surely
uncontroversial. Furthermore, resorting to analogy has no attractions here,
as the relevant English judgements concern degrees of ill-formedness and,
by assumption, the child is provided with no information of this nature
which could form the basis for an analogical inference. Also in the Chichewa
case, if analogy were to be employed, it would presumably yield the
conclusion that (12b) is well-formed alongside (11b), since simple applicatives,
lacking object prefixes, do not appear to differentiate between
instrumentals and benefactives. In these circumstances, we appear to be
driven to the conclusion that the judgements that are made regarding these
sentences must arise from an interaction of the data to which children
are exposed and knowledge which is brought to the acquisition task, this
knowledge amounting, within the framework of (1), to a substantive
constraint on the hypotheses considered by the child. Characterisation of
this knowledge is precisely the concern of linguists attempting to formulate
accounts of Universal Grammar and, as we shall see, is specifically aimed
at dealing with representational aspects of the logical problem.
Before being swept along with the current of opinion which claims that
information about non-sentences, or negative evidence as it is often called,
is not available to the learner, it is prudent to note that the above
observations claim only that acquirers of English (or Chichewa) do not
receive systematic explicit exposure to non-sentences together with
information about their status. This does not rule out the possibility of a causally
efficacious role for implicit negative evidence, as is noted in Chomsky
(1981). One way in which this suggestion could be given some substance
would be to equip the child with some mechanism which is sensitive to
non-occurring tokens which might be predicted as occurring on the basis
[...]
Compared to the structuralist account, there are major changes here. The
hypothesis space is now constrained by linguistic principles and the
formulation (or discovery) of hypotheses is replaced by selection from
an antecedently specified set of possibilities. However, the framework is
bedevilled by a number of problems, including the following: (i) Universal
Grammar, as a set of constraints on possible rule systems, makes available
a very rich set of descriptive options, many of which are not attested
and, indeed, are unlikely to be so; the descriptive poverty of structuralism
is replaced by profligacy; (ii) it is quite counterintuitive to assume that
the child has access to the same data as the linguist; yet without this
assumption, the problem raised under (i) takes on massive proportions,
as the child would then be required to pick his way through this forbiddingly
complex set of options on the basis of rudimentary data; (iii) the form
and operation of the evaluation measure, which was the mechanism enabling
the child to select the descriptively adequate grammar over one that was
merely observationally adequate, remained poorly understood and
undeveloped.
It was against this background of descriptive largesse that the current
Principles and Parameters model emerged with the characteristics in (15):
(15) a. any core grammar which results from the interaction of a set
of universal principles and a set of parameters, the values of
which can vary;
b. subject to a criterion of 'epistemological priority';
c. triggering, parameter setting and maturation.
Of course, the switch from rule systems to principles does not in itself
guarantee the restrictiveness of descriptive options, but this must be seen
alongside an emphasis on the deductive structure of the theory, which
enables a particular principle to have effects, as far as the properties of
sentences are concerned, only at the end of a lengthy deduction, perhaps
involving complex interactions with other principles and parameters. The
intention, anyway, is that the number of principles, and perhaps also
parameters, will be fairly small and this is clearly a shift away from a
situation in which each construction type in each language merits its own
rule.
Under (15b), the reference to epistemological priority is a recognition
that the development of the system must take place in the context of data
which it is plausible to assume the child actually has access to. Thus,
alongside the familiar restriction on negative data, this approach prohibits
reliance on complex data in the fixing of parameter values (in this
connection, see Wexler and Culicover 1980 on degree-2 learnability, Morgan
1986 on degree-1 learnability if the child has access to constituent
information, the speculations of Lightfoot (1989) on degree-0 learnability, and
Elliott and Wexler 1988 on the emergence of a set of grammatical categories
from an epistemologically plausible perspective).
Finally, triggering, parameter-setting and maturation under (15c) are
intended to have a character which makes them quite distinct from learning,
even when the latter is construed mentalistically as in (14c), and the extent
to which this can be maintained will be examined in Section 3 below.
I now turn to a discussion of issues surrounding (15a).
2. REPRESENTATIONAL ISSUES
Here S0 and Sn designate the initial and final states in the acquisition
process. P1, P2, ..., Pn is the set of universal principles, and p1, p2, ...,
pm is the set of parameters. The use of x in connection with the parameters
at S0 is intended to indicate that at this stage their values are open, and
the aj at Sn represent the values that are determined in the acquisition
process. Presumably, S0 will also contain specifications of the ranges of
the different parameter values, but I am not concerned with such niceties
here.10 (16) contains nothing corresponding to the observed gradualness
of the acquisition process and no detailed information about how the
transition between S0 and Sn is effected. Consideration of these questions
is set aside until Section 3.
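The contrast between the initial and final states can be pictured with a small data-structure sketch. It is an illustration only: the class, the function, and the principle, parameter and value names are all hypothetical, and, like (16) itself, it says nothing about the order or manner in which values come to be fixed.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional


@dataclass
class GrammarState:
    """A state in the acquisition process: fixed principles plus parameters."""
    principles: FrozenSet[str]                      # P1, ..., Pn (never change)
    parameters: Dict[str, Optional[str]] = field(default_factory=dict)

    def is_final(self) -> bool:
        """True when every parameter has a determined value."""
        return all(v is not None for v in self.parameters.values())


def set_parameter(state: GrammarState, name: str, value: str) -> GrammarState:
    """Return a new state with one parameter fixed; principles carry over."""
    if name not in state.parameters:
        raise KeyError(f"unknown parameter: {name}")
    updated = dict(state.parameters)
    updated[name] = value
    return GrammarState(state.principles, updated)


# Hypothetical names and values, for illustration only.
# None plays the role of the open value x at S0.
s0 = GrammarState(frozenset({"P1", "P2"}), {"p1": None, "p2": None})
s_mid = set_parameter(s0, "p1", "a1")
s_n = set_parameter(s_mid, "p2", "a2")
print(s0.is_final(), s_mid.is_final(), s_n.is_final())  # False False True
```

The principles are shared unchanged by every state, while only the parameter values differ between S0 and Sn.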
Perhaps the most serious problem confronting this way of looking at
things is that of the nature of the principles and parameters, i.e. what
is needed is a general theory of what principles and parameters are
legitimate, and this section is largely concerned with examining a number
of perspectives on this problem. I shall have little to say about principles
here, although the issues I raise deserve consideration from this perspective
too (see Safir 1987).
As things stand, there is not a great deal of agreement among researchers
on the identity of more than a small number of parameters. There is pro-
drop, the unitary status of which is the subject of considerable debate
(see, for example, Safir 1985), bounding node for Subjacency, direction
of Case and θ-role assignment, governing category, again involving some
dispute, the set of proper governors for ECP, and perhaps a few others
which have been the subject of systematic discussion. Alongside these,
however, there is a large set of proposals in the literature which might
be viewed simply as parametric relabellings for aspects of linguistic
variation. To take one example, in Lasnik and Saito (1984), in the context
of a discussion of the position of wh-phrases in English and several other
languages, we meet the suggestion that whether complementiser positions
marked as [+wh] must contain a [+wh] element at S-structure is a parameter,
and they speculate on whether such a parameter is implicationally related
to whether languages have syntactic wh-movement, concluding that it is
and that the 'basic' parameter is one expressing the presence or absence
of such movement. But these observations do not proceed significantly
beyond the data that lead to them, and it is difficult to resist the suggestion
that we are being offered nothing more than a translation of an aspect
of linguistic variation into a fashionable mode.
Now, I do not wish to suggest that the Lasnik and Saito parameter
is illegitimate, but the view that the theory of parameters is itself in a
position similar to that of the theory of transformational rules in the late
1960s is not easy to put aside. Of course, there is an important difference
in that individual transformational rules were seen as having to be learned,
whereas parameters and their values are given as part of the solution to
the logical problem, but from a methodological perspective, there are
uncomfortable similarities; just as it was all too easy to formulate con-
12 Martin Atkinson
that binarity sits most comfortably with the switch-setting analogy offered
by Chomsky (1988a), an analogy which Piattelli-Palmarini uses extensively
(see further below), although nothing in principle rules out the possibility
of multiple switch-settings.
Unfortunately, attractive as binarity might be conceptually, in the current
state of enquiry we are forced to acknowledge the existence of multiple-
valued parameters even among those where fairly extensive justification
exists. Thus, for example, Wexler and Manzini (1987) and Manzini and
Wexler (1987) offer a 5-valued parameter for governing category, Saleemi
(this volume) considers a 4-valued parameter in connection with his
reanalysis of pro-drop phenomena in terms of the postponement of Case
assignment to LF, and Baker (1988a) suggests that verbs (or perhaps
languages) admit multiple possibilities for Case assignment, including the
option of assigning two structural cases and the option of one structural
and one inherent case, alongside the common situation of having only
a single structural case. Nor can we maintain the converse position that
a defining property of parameters is non-binarity, as there appear to be
some, most notably the directionality parameters, which by their very nature
are binary. Naturally, there is nothing unintelligible about sets of parameters
some of which are binary and others of which are not, particularly if
it transpired that the two sets clustered together with respect to other
properties, perhaps thereby constituting parametric natural 'kinds' (see
below), but this first attempt to impose a substantive constraint on
parameters would appear to require major re-evaluations of central parts
of the theory if it were to be adopted. 13
A related possibility is that of whether parameters come pre-set, resetting
being determined by positive evidence (see Hammond, this volume) or
whether they are simply unset, requiring positive evidence to be set in
one way or another. Pre-set values, if they exist, can then be referred
to as unmarked. This possibility, of course, raises procedural questions,
and it will arise again in Section 3, but for now, since we are assuming
that S0 contains a specification of the range of permissible parameter values
for each parameter, it is natural to wonder whether some a priori ordering
might not be imposed on this range. Again, unfortunately, what we have
on the ground is a mixed bag. Thus, taking the governing category parameter
in the work of Wexler and Manzini, the notion of a default, pre-set value
makes perfect sense and, indeed, is necessary from the set-theoretic
perspective they adopt (see p. 18 below, and Newson, this volume). As
is well-known, a default value has also been suggested by Hyams (1986)
for pro-drop, this being [+pro-drop] and motivated by some controversial
claims about early child speech (see, for example, Aldridge 1988). Already,
with these two cases, however, whatever the empirical status of the claims,
there is the uncomfortable observation that Wexler and Manzini's account
14 Martin Atkinson
of the primitives which enters into its formulation. This begins to look
like a fairly tidy constraint to impose on the location of parameterisation,
but, again, there are claimed instances of parameterisation to which it
is not clearly applicable. Thus, the directionality parameters do not enter
directly into the principles of X-bar theory, Case Theory or θ-theory and
we appear to have a situation where parameterisation can occur in the
primitives appearing in principles and elsewhere. It is, of course, notable
that the directionality parameters have clustered together with respect to
the properties of binarity and default values, being binary when other
parameters are not and not having default values when other parameters
do. This may be symptomatic of an interesting partition in the set of
possible parameters. 16
Another parametric constraint may also be construed as locational and
involves the claim that parametric variation is restricted to occur in the
lexicon. The suggestion seems to have been first made by Borer (1984)
and it has a clear intuitive appeal. Everyone agrees that the lexical items
of a language have to be learned along with their idiosyncratic properties.
It is also apparent that those features of languages which make them
different from each other have to be somehow acquired and cannot be
antecedently specified in the structure of S0. It is natural, therefore, to
identify the locus of variation with that aspect of grammar that has to
be learned, viz. the lexicon.
Support for this localisation of parametric variation has been supplied
by Wexler and Manzini (1987), who argue in detail that different values
of the governing category parameter cannot be associated with a language
once and for all, but have to be linked to specific anaphors and pronouns,
since it is possible to find two such items in a single language the syntactic
behaviour of which is regulated by different values of the parameter.
A more radical alternative is considered in a tentative way by Chomsky
(1988b), basing his discussion on Pollock (1987). Having stated the lexical
parameterisation view, he goes on to say (p. 44): "If substantive elements
(verbs, nouns, etc.) are drawn from an invariant universal vocabulary,
then only functional elements will be parameterised". His subsequent
discussion argues for just such a parameterisation of functional elements,
proposing that AGR is 'strong' in French but 'weak' in English, these
attributes being spelled out in terms of the ability or lack of it to transmit
θ-roles, and that [+finite] is 'strong' for both languages, whereas [-finite]
is 'weak'. These proposals enable Chomsky, again following Pollock, to
produce a comprehensive account of the behaviour of adverbials, quan-
tifiers, negation, etc. in simple clauses in English and French. 17
There are at least two reasons for being cautious about these proposals,
which clearly represent a significant attempt to localise parametric effects.
First, they do not bear at all on the nature of parameters, so the questions
with which I began this section stand unanswered, i.e. we are no nearer
an understanding of exactly what forms of lexical parameterisation are
legitimate and we have at best partially responded to the dangers of
descriptivism. Second, as Safir (1987) observes, restricting variation to the
lexicon runs the risk of losing generalisations, if it transpires that all, or
even most, lexical items of a particular category behave in a certain way.
As we shall see in the next section, Wexler and Manzini themselves confront
this sort of problem in connection with Binding Theory phenomena, but
their way of dealing with it is not entirely satisfactory.
Safir's specific worries again concern the directionality parameters and
could be met by extending the notion of lexical parameterisation to zero-
level categories. Thus, the claim that verbs in English uniformly assign
Case and θ-role to the right would fall under this extended notion of
lexical parameterisation. As far as the more radical version of lexical
parameterisation, restricting it to functional elements, is concerned, it is
perhaps premature to speculate on its plausibility. Suffice it to say that
pursuit of it would require the development and justification of an inventory
of functional categories and their properties (for an initial view on such
an inventory, see Abney 1987) and a re-analysis of the whole range of
linguistic variation in terms of these properties. An instance of how progress
might be achieved in this regard is Fassi-Fehri's (1988) discussion of Case
assignment in Arabic and English. Adopting Abney's (1987) DP analysis,
he argues that verbs in English and Arabic uniformly assign accusative
case to the right, a necessary consequence of restricting parameterisation
to functional elements. However, D and I, both functional elements, differ
in the two languages in that they assign genitive and nominative case to
the left in English and to the right in Arabic. This proposal enables Fassi-
Fehri to construct an interesting account of word-order differences in the
two languages.18
This section has surveyed some of the obvious and less obvious ways
in which a theory of parameters might be constrained within the instan-
taneous idealisation. I hope that the need for such constraints is self-evident,
but it is not clear which, if any, of the possibilities raised, is appropriate
to pursue. Some of these issues will arise in a different context as we
now shift away from the instantaneous idealisation and construe the system
as developing in real time.
3. PROCEDURAL ISSUES
(18) S0 → S1 → ... → Sn
The logical problem of language acquisition 17
Here, again, S0 and Sn designate the initial and final states, but now we
recognise a succession of intermediate states. A large number of questions
arise in this context, but in this section I shall focus on aspects of just
two of these. What is the nature of the developmental process which
mediates between the various states in this sequence? And does S0 contain
a full inventory of the principles and parameters of Universal Grammar,
thereby implying that the same is true of the intermediate states, or do
some principles and parameters only become available as the child develops?
If this latter possibility is correct, an immediate further question arises:
what is responsible for the emergence of those principles and parameters
which are not available in the initial state?
Let us initially focus on the first of these questions, assuming for the
purposes of this discussion that, indeed, the full set of principles and
parameters is present from S0. An obvious way to view the offerings of
the instantaneous idealisation is in terms of it providing a restricted set
of hypotheses in line with (1a), each hypothesis corresponding to a core
grammar; then the learner's task is seen as that of selecting and testing
hypotheses on the basis of exposure to data which are subject to the criterion
of 'epistemological priority', and the job of the theorist, no longer operating
with the idealisation, is to provide a detailed account of exactly how
hypotheses are selected, what 'epistemological priority' amounts to, etc.
Presumably, there will be a relation of 'content' between selected
hypotheses and the data which occasion their selection. For example, we
would anticipate that the governing category parameter for a particular
anaphor will be set, or re-set, on the basis of exposure to data containing
that anaphor, represented as such by the child, in a relevant structural
configuration. From the perspective of Fodor (1981), such content-rela-
tedness is diagnostic of paradigmatic cases of learning, yet supporters of
the parameter-setting account often give the impression that they are
offering something quite distinct from a learning account, and Piattelli-Palmarini
(1989) suggests that application of the label 'learning' to the
envisaged procedures is quite wrong and should be resisted. With the
rejection of learning comes the rejection of hypothesis selection and testing,
since this is the only coherent account of learning within a mentalistic
framework.
Whether something conceptually distinct from learning is in fact
going on here is a question of some importance. Exactly where does
the distinctiveness of development in this model reside?
First, and most obviously, the restrictedness of the hypothesis space
might be seen as contributing to this distinctiveness, but a moment's
reflection should persuade us that this is unlikely. In the typical concept
'learning' experiment, there is normally only a finite (and small) number
of obvious candidates for stimulus variation, and the subject's task is to
fix a value for each of these. As Fodor (1975, 1981) maintains, the only
remotely plausible story that has ever been told about what goes on in
such experiments has the subject selecting and testing hypotheses, the
hypotheses being related in 'content' to occasioning stimuli, and this is
a learning situation.
Furthermore, there are cases in the linguistics literature which make
it clear that something like this is seen to be going on. So, consider Huang's
(1982) discussion of English and Chinese word-order and recall that he
takes Greenberg to task for (i) failing to account for why word-order
properties cluster in the way they do, and (ii) failing to account for exceptions
to his statistical tendencies. Huang's alternative is to formulate a version
of the head-direction parameter and, indeed, this comes to terms with
(i) in a straightforward way. For (ii), however, Huang has to recognise
that the head-direction parameter is not set once and for all for all categories
and all bar levels, and he has to contemplate a learner refining hypotheses
in the light of additional experience with the language. More generally,
any account that admits of a parameter being wrongly set, thereby requiring
re-setting (and this applies to some of the best-known proposals in the
field, e.g. Hyams 1986, Wexler and Manzini 1987), has to have some
mechanism for achieving this re-setting, and, at this level of generality,
it is not clear that authors have anything other than hypothesis testing
in mind. Qualitatively, we have no difference between this sort of account
and standard views on learning, although quantitatively, particularly in
comparison to the account offered in classic transformational grammar
in (14) above, there may be major differences in terms of the size of the
hypothesis space (see Atkinson 1987, for more extended discussion).
If distinctiveness does not lie in a shift away from hypothesis testing
per se, perhaps it resides in properties of the mechanism by which hypotheses
are selected and tested. The Subset Principle, as developed by Wexler and
Manzini (1987), building on earlier suggestions of Berwick (1985), can
be viewed in this context.
The Subset Principle is designed to directly alleviate the difficulty arising
from the no negative data assumption by rendering the learner conservative
in a straightforward sense. If parameter values give rise to set-theoretically
nested languages, then the Subset Principle obliges the learner to select
the least inclusive language compatible with the data received so far and
the parameter value yielding this language is deemed to be less marked
than those giving rise to more inclusive languages. Modifications 'upwards'
will always be possible in the light of further positive data to justify them;
modifications 'downwards' will never occur, but, then, if things have gone
according to plan, they will never be needed. There are several points
to make about the Subset Principle.
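The conservatism just described can be illustrated with a minimal Python sketch. It is not a formulation taken from Wexler and Manzini: the names `TOY_VALUES`, `subset_principle_choice` and `learn` are hypothetical, and languages are represented as small finite sets of sentence labels purely so that inclusion can be checked directly (real languages are, of course, infinite).

```python
from typing import Dict, FrozenSet, Iterable, List

Language = FrozenSet[str]

# Hypothetical toy parameter with three values whose languages are nested:
# narrow is a subset of mid, which is a subset of wide.
TOY_VALUES: Dict[str, Language] = {
    "narrow": frozenset({"s1"}),
    "mid": frozenset({"s1", "s2"}),
    "wide": frozenset({"s1", "s2", "s3"}),
}


def subset_principle_choice(values: Dict[str, Language], data: Iterable[str]) -> str:
    """Return the value yielding the least inclusive language compatible with the data."""
    seen = set(data)
    compatible = [v for v, lang in values.items() if seen <= lang]
    if not compatible:
        raise ValueError("no parameter value is compatible with the data")
    # For properly nested languages, the smallest compatible set is the
    # least inclusive one, so comparing sizes suffices in this toy setting.
    return min(compatible, key=lambda v: len(values[v]))


def learn(values: Dict[str, Language], text: Iterable[str]) -> List[str]:
    """Process positive data one sentence at a time, recording each choice."""
    seen: List[str] = []
    history: List[str] = []
    for sentence in text:
        seen.append(sentence)
        history.append(subset_principle_choice(values, seen))
    return history


# Only positive data are presented; the choice moves 'upwards' and never back.
history = learn(TOY_VALUES, ["s1", "s1", "s2", "s1", "s3"])
```

On this toy input the learner starts at the least inclusive value and revises only when a sentence outside its current language appears, which is the sense in which modifications 'downwards' are never required.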
category, but even if this proves feasible, it does not bear on the main
issue under consideration.
Overall, it seems that there is no compelling reason to view the Subset
Principle as requiring us to move away from a learning account. The learning
in question is 'special' in that it is governed by a domain-specific principle
for selecting hypotheses, but that selection and testing of hypotheses is
going on is surely incontestable.
The plausibility of the Subset Principle, a property of the learning module,
is seen as deriving from the Subset Condition, a constraint on possible
parameters, which might, therefore, be seen as responding to some of the
issues raised in Section 2 (cf. fn. 12) from a procedural perspective. The
Subset Condition is defined in Wexler and Manzini (1987, 60) as in (19):
(19) For every parameter p and every two values i, j of p, the languages
generated under the two values of the parameter are one a subset
of the other, that is, L(p(i)) ⊆ L(p(j)) or L(p(j)) ⊆ L(p(i))
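As a rough illustration of what (19) demands, the following Python sketch tests whether every pair of values of a single parameter yields languages related by inclusion. The function name and the toy parameters are hypothetical, and finite sets stand in for languages only for the sake of the example.

```python
from itertools import combinations
from typing import Dict, FrozenSet


def satisfies_subset_condition(values: Dict[str, FrozenSet[str]]) -> bool:
    """True if, for every two values, one language is a subset of the other."""
    return all(a <= b or b <= a for a, b in combinations(values.values(), 2))


# Hypothetical parameter whose values give nested languages.
nested = {
    "i": frozenset({"s1"}),
    "j": frozenset({"s1", "s2"}),
    "k": frozenset({"s1", "s2", "s3"}),
}

# Hypothetical parameter with two overlapping but non-nested languages,
# of the kind one might associate with a directionality parameter.
crossing = {
    "left": frozenset({"s1", "s2"}),
    "right": frozenset({"s2", "s3"}),
}
```

Under these assumptions `nested` meets the condition and `crossing` does not, since neither of its languages contains the other.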
D and D'. I conclude, therefore, that the linguistic tradition does not readily
accept stipulations of deductive relationships between parameter values.
The existence of implicational relationships between parameter values
comes under pressure from a different consideration in the learnability
context. Wexler and Manzini (1987), considering the operation of the Subset
Principle in the case where many parameters are to be set, formulate an
Independence Principle, the content of which is that the set-theoretic
relationships between languages generated by the values of one parameter
should not be disturbed by the values of other parameters. If these
relationships were not robust in this fashion, there would be no way for
the Subset Principle to function in a consistent way. At first glance, it
would appear that the Independence Principle rules out exactly the sort
of implicational relationships we are considering here.
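One way of reading the Independence Principle can be sketched in Python as follows. This is an interpretation offered for illustration, not Wexler and Manzini's own formalisation: the function `is_independent`, the two toy grammars and the parameter names `p` and `q` are all hypothetical, and languages are again finite sets. The check asks whether the inclusion relation between the languages of any two values of one parameter stays the same under every setting of the remaining parameters.

```python
from itertools import combinations, product
from typing import Callable, Dict, List, Set

Setting = Dict[str, int]


def is_independent(param_values: Dict[str, List[int]],
                   language: Callable[[Setting], Set[str]]) -> bool:
    """True if inclusion relations for each parameter are constant across
    all settings of the other parameters."""
    names = list(param_values)
    for name in names:
        others = [n for n in names if n != name]
        for i, j in combinations(param_values[name], 2):
            relations = set()
            for combo in product(*(param_values[n] for n in others)):
                rest = dict(zip(others, combo))
                li = language({**rest, name: i})
                lj = language({**rest, name: j})
                # Record which inclusions hold under this setting.
                relations.add((li <= lj, lj <= li))
            if len(relations) != 1:
                return False
    return True


def additive(setting: Setting) -> Set[str]:
    """Toy grammar: each parameter independently adds one sentence."""
    lang = {"a"}
    if setting["p"] == 1:
        lang.add("x")
    if setting["q"] == 1:
        lang.add("y")
    return lang


def interacting(setting: Setting) -> Set[str]:
    """Toy grammar: the effect of p depends on the value of q."""
    if setting["q"] == 0:
        return {"a", "x"} if setting["p"] == 1 else {"a"}
    return {"b"} if setting["p"] == 1 else {"a", "b"}


PARAMS = {"p": [0, 1], "q": [0, 1]}
```

In the `interacting` grammar the value `p = 1` yields the larger language when `q = 0` but the smaller one when `q = 1`, so a markedness ordering for `p` computed under one setting of `q` would be overturned under the other.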
Newson (this volume) argues that this is not necessarily the case, pointing
out that implicational relationships between the values of distinct para-
meters are not guaranteed to change set inclusion relations. What they
will do is make certain languages illegitimate, and it will follow that
markedness hierarchies for parameter values will not be calculable by the
learning module, since some of the languages which constitute the input
to the computation will not be available. 22 However, we have already seen
above that reservations about the extensional computations of the Wexler
and Manzini account have been expressed (Safir 1987), and it therefore
seems worthwhile to put these aside and give serious consideration to
including implicational statements as part of Universal Grammar. Newson
(1988 and this volume) pursues this course for Wexler and Manzini's
governing category parameter, arguing that the value of this parameter
for a pronominal is initially fixed on the basis of the value for a
corresponding anaphor. This enables Newson to produce coherent accounts
of two phenomena which are recalcitrant in the Wexler and Manzini
framework.
Manzini and Wexler (1987) note that it is never the case that the governing
category for pronominals in a language properly contains the governing
category for anaphors. If this were the case, there would be domains between
the pronominal and anaphor governing category boundary in which binding
relations would be inexpressible. To rule out this possibility, they formulate
the Spanning Hypothesis, as in (20):
Commenting on the status of (20), they say (p.440): "... it seems plausible
that [it] expresses a proposition that happens to be true of natural languages
as they have actually evolved, but has no psychological necessity, either
The logical problem of language acquisition 23
Borer and Wexler (1987) and Wexler (1988), following Pinker (1984), refer
to as the Continuity Hypothesis.
The best-known arguments against the Continuity Hypothesis are set
out in Borer and Wexler (1987). Most obviously, they draw attention to
what they refer to as the Triggering Problem in connection with Hyams'
(1986) account of the re-setting of the pro-drop parameter for children
acquiring a [-pro-drop] language such as English. For Hyams, this re-
setting is 'triggered' by the presence of expletive subjects, but the question
that immediately arises is that of why this triggering does not occur earlier,
since the child is exposed to sentences containing expletive subjects from
an early age. Borer and Wexler do not offer an alternative theory of pro-
drop in their paper, but, to illustrate an area where they feel a non-continuity
account is insightful, they propose that the child's early 'passives' in English
and Hebrew are all adjectival and therefore do not involve movement.
Movement involves the representation of A-chains, and, they claim, this
aspect of Universal Grammar is not available to the child at the stage
at which the earliest 'passives' are produced. These 'passives', it is assumed,
are all lexical. This suggestion receives further support from a consideration
of causatives in English and Hebrew and also plays a role in accounting
for a range of control phenomena in Wexler (1988). It seems to me plausible
to consider similar proposals in connection with Radford's (1988) claim
that "small children speak small clauses", his explication of this being
in terms of children lacking an I-system at the relevant stage, and his
extension of this claim to include the C-system and D-system in Radford
(1990).
It is not my purpose here to submit such proposals to critical scrutiny
(for some remarks on the Borer and Wexler proposals, see Hoekstra, this
volume; also Weinberg 1987). Rather, I shall take the correctness of some
kind of non-continuity hypothesis for granted and briefly consider the
question of the developmental mechanisms it requires.
We might be tempted to think that a non-continuity hypothesis is
consistent with a learning emphasis and that the representation of A-chains,
I-constituents, etc. is somehow induced by the child on the basis of exposure
to the linguistic environment. But well-known arguments of Fodor (1975,
1980, 1981) militate against this approach. If learning is to be viewed in
terms of hypothesis testing, the hypotheses must be available to be tested,
and Fodor's conclusion that a 'more expressive' system cannot develop
out of a 'less expressive' one by this mechanism follows. An alternative,
advocated by Borer and Wexler, is that the relevant representational
capacities mature, coming on-line according to some genetically determined
schedule. This perspective raises a number of interesting issues which are
likely to be the subject of considerable debate in the near future.
First, and most obviously, there is clearly nothing unintelligible about
The logical problem of language acquisition 25
in a more or less fixed order, and convinced that this is not explicable
in terms of later-acquired concepts being defined in terms of earlier-acquired
ones, Fodor extends the notion of brute-causal triggering to embrace the
possibility that certain concepts, while not defined in terms of others,
nevertheless have others as causal antecedents. To the extent that this view
of the development of concepts can be maintained, again the psychologist's
task should involve looking rather than attempting to analyse concepts
in terms of others. Such looking will reveal the layered conceptual structure
of the mind, but this structure will ultimately only be rationally explicable
in biological terms.
The extent to which representational capacities germane to the devel-
opment of syntax can be defined in terms of more basic capacities is,
in my view, a question still on the agenda (cf. fn. 15). To the extent that
they can, we may, at least in principle, contemplate producing a 'rational'
account of linguistic development. To the extent that they cannot, there
would appear to be no alternative to looking.
The conclusion suggested by the considerations in this section is that
if the acquisition of syntax is to be seen as having characteristics which
take it clearly outside the domain of learning, these will result from the
correctness of the non-continuity view. For the development of various
formal operations and the principles formulated in terms of them, this
is a perspective well worth pursuing. For linguistic variation, encoded in
distinct parameter settings, however, prospects of this kind do not look
inviting. While it makes sense for a parameter to become available as
a result of maturational scheduling, there is little to be said for its values
entering the system at different times. To date, I feel that there is no
compelling evidence to suggest that learning, perhaps in a very attenuated
sense, has no role to play in this aspect of development.
FOOTNOTES
1. Arguably, the problem has always had a central role in Chomsky's theorising, particularly
in his less technical works, e.g. Chomsky (1975). Isolated examples in the linguistics literature,
such as Peters (1972), Baker (1979) also exist.
2. An additional component of formalised versions of this framework is usually an
assumption about what should count as acquisition. The most obvious candidate here is
that there should be some finite time at which the learner selects the correct hypothesis
and then retains this hypothesis as further data are presented. For discussion of other
possibilities, see Wexler and Culicover (1980), Osherson, Stob and Weinstein (1986). I shall
suppress reference to this component in my discussion.
3. Readers familiar with the literature will recognise this as an informal characterisation
of Gold's (1967) text presentation.
4. This is an informal characterisation of the condition of informant presentation in Gold
(1967).
5. The core-periphery distinction is not one on which I shall focus here, although its utilisation
in linguistic argument may in itself constitute an interesting area for reflection. Chomsky
(1988b: 70), briefly referring to this distinction, says: "... [it] should be regarded as an expository
device reflecting a level of understanding that should be superseded as clarification of the
nature of linguistic inquiry advances".
6. What is interesting about (5) and (6) is that both of them involve a Subjacency violation
(although, see Chomsky (1986a, 50) for the suggestion that this may not be the correct
account of extraction from whether-clauses). In addition, (6) includes an ECP violation,
since the empty subject position in the embedded clause is not properly governed. (7) and
(8) are also distinguished in terms of the ECP, with a violation of this principle occurring
in (8). For discussion of the relevant theoretical concepts, see Lasnik and Saito (1984).
7. Baker accounts for these differences in terms of the instrumental NP receiving its
θ-role directly from the verb, whereas the benefactive NP is part of a PP at D-structure and
receives its θ-role from the preposition. Readers are referred to Baker's discussion for extensive
justification of this asymmetric behaviour of different sorts of PP.
8. It is a matter of some contention whether the hypothesis selection and testing framework
adopted here is appropriate for the speculations we shall be considering below. Due notice
will be given to this issue at the appropriate time. For the purposes of this introduction,
I believe that adopting this mode of talk is harmless and quite useful.
9. The clearest argument for the innocence of the idealisation for linguistic theory is the
fact that the end state, Sn, is remarkably uniform and does not seem to be at the mercy
of vagaries in the order in which data are presented. Given this, questions to do with the
order in which parameters are set, for example, will be irrelevant to the primary concern
of the linguist.
10. Whether information about markedness for parameter values should be included in
S0 is an issue to which I shall return in Section 3.
11. Safir (1987: 77-8) puts the worry thus: "... our assumptions about what counts as a
"possible parameter" or a "learnable parameter" remain very weak. ... what is to prevent
us from describing any sort of language difference in terms of some ad hoc parameter?
In short, how are we to prevent S[tandard] P[arameter] T[heory] from licensing mere
description?"
12. I refer here specifically to the Subset Principle and the Subset Condition of Manzini
and Wexler (1987) and Wexler and Manzini (1987). Another 'procedural' constraint might
be that any legitimate parameter must be capable of being set on the basis of the evidence
available to the child. Thus, a parameter requiring negative evidence or highly complex
evidence in order to be set would be illegitimate on this basis. See Lightfoot (1989) for
how some parameters could be set by degree-0 data.
13. This is not to suggest that such re-evaluation would be impossible, and it might be
interesting to consider the possibility of replacing non-binary parameters by several distinct
binary parameters which together conspire to yield the effects of the original.
14. More recently, Hyams (1987) has proposed a reanalysis of pro-drop phenomena in
terms of morphological uniformity (see Jaeggli and Safir 1988). Briefly, the idea is that
languages which have morphologically uniform verbal paradigms allow pro-drop. Thus,
Italian, in which all verbal forms are inflected, and Chinese in which none are, are
pro-drop languages, but English, which admits both inflected and non-inflected forms, is not.
With this reanalysis, Hyams is able to maintain that [+morphologically uniform] is the
unmarked parameter setting on learnability grounds, since positive data in the form of inflected
and uninflected forms will serve to re-set the parameter. If the initial setting were to be
[-morphologically uniform], no such re-setting could occur. This simple and attractive idea
involves a number of complications concerning licensing and identification, and I shall not
discuss it further here.
15. To talk of primitives in this connection is merely to acknowledge that principles refer,
in their formulation, to a variety of configurational and non-configurational notions. These
could be viewed as 'primitive' modulo the statement of the principle. Whether there is a
fundamental primitive basis for the whole account and, if there is, how it relates to the
issue of epistemological priority mentioned earlier is an interesting issue which will not be
pursued here.
16. The directionality parameters, presumably formulated in terms of the predicates 'right
of' and 'left of', would appear to be readily relatable to a primitive epistemological basis.
Again, this raises issues beyond the scope of this paper.
17. 'Strong' and 'weak' are, of course, mere mnemonic labels for the distinction. It seems
to me reasonable to ask why these properties involve θ-role transmission in the way claimed,
i.e. it is not clear that reference to θ-roles is more than another labelling of the distinction.
Speculations such as those being considered here take on additional perspectives in the light
of Radford's (1988, 1990) claims that children acquiring English pass through a pre-functional
stage during which they offer evidence of their control of lexical categories and their projections,
but give no indication of having mastered the functional systems based on I, C and D.
A number of very interesting questions emerge from a juxtaposition of Chomsky's speculations
and Radford's empirical claims, particularly if the latter are generalisable to the acquisition
of languages other than English. For example, it would appear to follow that any systematic
pre-functional variations in the speech of children, say in word order, must be referred
to factors which are not properly viewed as belonging to the language module. It would
not be appropriate to pursue the ramifications of this suggestion in this paper.
18. Fassi-Fehri's account also requires the assumption that subjects appear at D-structure
in some projection of V (or N) which occurs as complement to I (or D). Sportiche (1988),
who also adopts this proposal, suggests that languages are parameterised as to whether
this subject obligatorily moves to (Spec, I) at S-structure. This 'unconstrained' parameterisation
becomes principled on Fassi-Fehri's account; in languages like English, the subject has to
move in this way to get nominative case, assigned by I to its left. The movement of the
subject follows from a localised parameterisation and does not, in itself, constitute the
parameterisation.
19. Alternatively, the applicability of the Subset Principle might be relatable to binarity,
being restricted to multi-valued parameters, there then being two distinct types of development
countenanced by the model. One would fall under the Subset Principle and involve significant
learning; the other would conform more closely to the switch-setting analogy. It is premature
to speculate further in this respect.
20. In the literature, 'triggering' often appears to be identified with 'having consequences
beyond those immediately contained in the data' but this is obviously true of paradigmatic
learning situations, and it seems that some reference to 'content' and 'arbitrariness' is necessary
to distinguish these two notions (see Fodor 1978 for illuminating discussion). The appropriateness
of the label 'triggering' in this scenario will depend on whether the data leading
to the fixing of P1 bear an opaque relationship to P2. To take an implausible, but relevant
case, it is conceivable that the fixing of the governing category parameter is implicationally
dependent on the fixing of the head direction parameter, and if this were so, we would
surely be justified in asserting that a phrase with a particular head-complement order triggers
a value of the governing category parameter. This is not learning.
21. We could, of course, have D' without D if the implication is not bilateral, but I will
set this possibility aside here as it does not bear centrally on the discussion.
22. It is not clear that even this is necessary. Given a strict separation between Universal
Grammar and the learning module, it is conceivable that the latter could have access to
'impossible' languages to facilitate its computations, e.g. those obtained by removing just
the implicationally induced constraints from Universal Grammar.
23. It is noteworthy that, if this view is largely correct, then the traditional concerns of
developmentalists in accounting for how stages develop out of their predecessors evaporate;
there is no such development.
REFERENCES
Abney, S. 1987. The English Noun Phrase in its Sentential Aspect. Doctoral dissertation,
MIT.
Aldridge, M. 1988. The Acquisition of INFL. Research Monographs in Linguistics, UCNW,
Bangor 1. (Reprinted by IULC).
Atkinson, M. 1982. Explanations in the Study of Child Language Acquisition. Cambridge:
Cambridge University Press.
Atkinson, M. 1987. Mechanisms for language acquisition: learning, parameter-setting and
triggering. First Language 7. 3-30.
Baker, C. L. 1979. Syntactic theory and the projection problem. Linguistic Inquiry 10. 533-
81.
Baker, M. 1988a. Incorporation: A Theory of Grammatical Function Changing. Chicago:
University of Chicago Press.
Baker, M. 1988b. Theta theory and the syntax of applicatives in Chichewa. Natural Language
and Linguistic Theory 6. 353-89.
Berwick, R. C. 1985. The Acquisition of Syntactic Knowledge. Cambridge, Massachusetts:
MIT Press.
Borer, H. 1984. Parametric Syntax. Dordrecht: Foris.
Borer, H. and K. Wexler. 1987. The maturation of syntax. In T. Roeper and E. Williams
(eds.).
Chomsky, A. N. 1965. Aspects of the Theory of Syntax. Cambridge, Massachusetts: MIT
Press.
Chomsky, A. N. 1975. Reflections on Language. New York: Pantheon.
Chomsky, A. N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, A. N. 1986a. Barriers. Cambridge, Massachusetts: MIT Press.
Chomsky, A. N. 1986b. Knowledge of Language. New York: Praeger.
Chomsky, A. N. 1988a. Generative Grammar. Studies in English Linguistics and Literature.
Kyoto University of Foreign Studies.
Chomsky, A. N. 1988b. Some notes on economy of derivation and representation. MIT
Working Papers in Linguistics 10. 43-74.
Elliott, W. N. and K. Wexler. 1988. Principles and computations in the acquisition of
grammatical categories. Ms. UC-Irvine.
Fassi-Fehri, A. 1988. Generalised IP structure, Case and VS word order. MIT Working Papers
in Linguistics 10. 75-112.
Fodor, J. A. 1975. The Language of Thought. New York: Thomas Y. Crowell.
Fodor, J. A. 1978. Computation and reduction. In C. W. Savage (ed.) Perception and Cognition:
Issues in the Foundations of Psychology, Minnesota Studies in the Philosophy of Science
9. Minneapolis: University of Minnesota Press.
Fodor, J. A. 1980. Contributions to M. Piattelli-Palmarini (ed.) Language and Learning:
The Debate Between Jean Piaget and Noam Chomsky. London: Routledge & Kegan Paul.
Fodor, J. 1981. The present status of the innateness controversy. In J. A. Fodor Representations.
Hassocks: Harvester.
Gleitman, L. R., E. Newport, and H. Gleitman. 1984. The current status of the Motherese
hypothesis. Journal of Child Language 11. 43-79.
Gold, E. M. 1967. Language identification in the limit. Information and Control 10. 447-74.
Huang, J. C.-T. 1982. Logical Relations in Chinese and the Theory of Grammar. Doctoral
dissertation MIT.
Hyams, N. 1986. Language Acquisition and the Theory of Parameters. Dordrecht: Reidel.
Hyams, N. 1987. The setting of the null subject parameter: a reanalysis. Paper presented
to Boston University Conference on Child Language Development.
Jaeggli, O. and K. Safir. 1988. The null subject parameter and parametric theory. Version
of a paper to appear in O. Jaeggli and K. Safir (eds.) The Null Subject Parameter. Dordrecht:
Reidel.
Jakobson, R. 1968. Child Language, Aphasia and Phonological Universals. The Hague: Mouton.
Jakobson, R., G. Fant, and M. Halle. 1952. Preliminaries to Speech Analysis. Cambridge,
Massachusetts: MIT Press.
Lasnik, H. 1985. On certain substitutes for negative data. Ms. University of Connecticut.
Lasnik, H. and M. Saito. 1984. On the nature of proper government. Linguistic Inquiry
15. 235-89.
Lightfoot, D. 1989. The child's trigger experience: 'degree-0' learnability. Behavioral and
Brain Sciences 12. 321-34.
Manzini, R. and K. Wexler. 1987. Parameters, Binding Theory, and learnability. Linguistic
Inquiry 18. 413-44.
Morgan, J. L. 1986. From Simple Input to Complex Grammar. Cambridge, Massachusetts:
MIT Press.
Newport, E., L. R. Gleitman, and H. Gleitman. 1977. Mother, I'd rather do it myself: some
effects and non-effects of maternal speech style. In C. Snow and C. A. Ferguson (eds.)
Talking to Children. Cambridge: Cambridge University Press.
Newson, M. 1988. Dependencies in the lexical setting of parameters: a solution to the
undergeneralisation problem. Ms. University of Essex.
Oehrle, R. 1985. Implicit negative evidence. Ms. University of Arizona.
Osherson, D., M. Stob, and S. Weinstein. 1986. Systems that Learn. Cambridge, Massachusetts:
MIT Press.
Peters, S. 1972. The projection problem: how is a grammar to be selected? In S. Peters,
(ed.) Goals of Linguistic Theory. Englewood Cliffs, N. J.: Prentice Hall.
Piattelli-Palmarini, M. 1989. Evolution, selection and cognition: from 'learning' to parameter
setting in biology and in the study of language. Cognition 31. 1-44.
Pinker, S. 1984. Language Learnability and Language Development. Cambridge, Massachusetts:
Harvard University Press.
Pollock, J. Y. 1987. Verb movement, UG and the structure of IP. Ms. Université de Haute
Bretagne, Rennes II.
Radford, A. 1988. Small children's small clauses. Transactions of the Philological Society
86. 1-43.
Radford, A. 1990. Syntactic Theory and the Acquisition of Syntax. Oxford: Blackwell.
Randall, J. 1985. Positive evidence from negative. In P. Fletcher and M. Garman (eds.)
Child Language Seminar Papers. University of Reading.
Roeper, T. and E. Williams (eds.). 1987. Parameter Setting. Dordrecht: Reidel.
Safir, K. 1985. Syntactic Chains. Cambridge: Cambridge University Press.
Safir, K. 1987. Comments on Wexler and Manzini. In T. Roeper and E. Williams (eds.).
Saleemi, A. 1988. Learnability and parameter-fixation: the problem of learning in the ontogeny
of grammar. Doctoral Dissertation, University of Essex.
Solan, L. 1987. Parameter setting and the development of pronouns and reflexives. In T.
Roeper and E. Williams (eds.).
Sportiche, D. 1988. A theory of floating quantifiers and its corollaries for constituent structure.
Linguistic Inquiry 19. 425-50.
Weinberg, A. 1987. Comments on Borer and Wexler. In T. Roeper and E. Williams (eds.).
Wexler, K. 1982. A principle theory for language acquisition. In E. Wanner and L. R. Gleitman
(eds.) Language Acquisition: The State of the Art. Cambridge: Cambridge University Press.
Wexler, K. 1988. Aspects of the acquisition of control. Paper presented to Boston University
Conference on Language Development.
Wexler, K. and Y. C. Chien 1985. The development of lexical anaphors and pronouns.
Papers and Research on Child Language Development 24. 138-49.
Wexler, K. and P. W. Culicover. 1980. Formal Principles of Language Acquisition. Cambridge,
Massachusetts: MIT Press.
Wexler, K. and R. Manzini. 1987. Parameters and learnability in Binding Theory. In T.
Roeper and E. Williams (eds.).
Observational data and the UG theory of
language acquisition
Vivian Cook
University of Essex
The claims that UG theory makes for language acquisition are largely
based on the "poverty of the stimulus" argument; given that the adult
knows X, and given that X is not acquirable from the normal language
input the child hears, then X must have been already present in the child's
mind. This crucial argument uses the comparison of the knowledge of
language that the adult possesses with the initial state of the child to establish
what could not have been acquired from the types of evidence available
and must therefore be innate. Chomskyan UG theory would not be
discomfited if other evidence from acquisition were not forthcoming. On
the one hand, such research is not of prime importance, given the reliance
on the poverty of the stimulus argument. On the other, evidence from
language development in the child is related with difficulty to acquisition
because of the other factors involved in language performance and
development - production and comprehension processes, situation and
use, the growth in other mental faculties, and so on - all of which are
"non-stationary" (Morgan, 1986) and liable to change as the child grows
older.
The attraction of the current model is that the aspects of language built-
in to the mind are precisely the principles of GB theory - the Projection
Principle, the Binding Principles, and so on; the aspects that have to be
learnt are the settings for parameters of variation, and the properties of
lexical items. Hence built-in principles of syntax can now be postulated
in a rigorous form that has testable consequences; it is possible to start
looking for evidence of the effects of UG in children's language development.
And also the reverse; it is possible to start phrasing research
into acquisition in ways that can affect issues of linguistic theory, the
prime example being the work of Hyams (1986) and Radford (1986).
asked directly whether they accept Is the man who is here tall? as their
answer would not be meaningful. Children are by and large not capable
of attesting unambiguously that a particular sentence is or isn't generated
by their grammar. Other than single-sentence evidence or the pure poverty
of the stimulus argument, what else can count as evidence of language
acquisition in a UG framework? One possibility is to use experimental
techniques and statistical procedures from the psycholinguistic tradition.
The research of the past decade has employed a wealth of techniques ranging
from the elicited imitation tasks used by Lust and her associates (1989)
to the comprehension tasks employed by Matthei (1981) and others; indeed
the specific case of structure dependency seen in Is the man who is here
tall? was investigated by Crain and Nakayama (1983) through a question
production task.
The validity of such forms of evidence is not the concern here; an account
of some of their merits and demerits is seen in Bennett-Kastor (1988).
Instead the discussion will be restricted to one type of evidence that has
been used within UG theory, namely the use of sentences observed in
actual children's speech, which can be called "observational data". If a
child is heard to say Slug coming, what status does this sentence have
as evidence for UG theory? The main argument here is that there is an
inherent paradox in using observational data to support a UG model that
needs to be aired, even if it cannot be resolved. Observational data belongs
in essence to E-language; the typical E-language study of acquisition looks
at statistically prominent features found in a substantial collection of
children's speech, say Brown (1973) or Wells (1985). The major problem
is how to argue from E-language descriptive data of children's actual speech
to their I-language knowledge, a problem first perhaps highlighted in
Chomsky (1965) as "a general tendency... to assume that the determination
of competence can be derived from description of a corpus by some sort
of sufficiently developed data-processing technique". While it is interesting
and instructive to use observational data to investigate the UG claims,
the chain of qualifications and inferences between such data and language
knowledge is long and tortuous.
There are two related dimensions to this within the UG theory - performance
and development. Any use of performance data by linguists faces the
problem of distinguishing grammatical competence from the effects of
production and comprehension processes, short term memory, or other
non-competence areas of the mind involved in actual speech production.
Single-sentence evidence is immune to all of these factors. In this sense
children's speech presents exactly the same problems for the I-language
analyst as the speech of adults. GB oriented linguists base their syntactic
analyses on single example sentences rather than on chunks of performance;
they too have problems with deriving the knowledge of the native speaker
from samples of raw performance.
But children's language also ties in with their development on other
fronts; the actual sentences they produce reflect their developing channel
capacity, that is to say a mixture of cognitive, social, and physical
development, from which the effects of language acquisition need to be
filtered out. The distortions that performance processes cause in actual
speech are doubly difficult to compensate for in language acquisition
research because they may be systematically, or nonsystematically, different
from those of adults - short term memory may be smaller in capacity
or organised in a different way, cognitive schemas may be different, and
so on; insofar as these are involved in language performance they affect
children differently from adults. "Much of the investigation of early
language development is concerned with matters that may not properly
belong to the language faculty ... but to other faculties of the mind that
interact in an intimate fashion with the language faculty in language use"
(Chomsky, 1981b). Cook (1988) distinguishes "acquisition" - the logical
problem of how the mind goes from S0 (zero state) to Ss (steady state)
- from "development" - the history of the intervening stages, S1, S2, and
so on. To argue from observation of children's development to the theory
of acquisition means carefully balancing all these possibilities. Linguists
are frequently struck by the child's presumed difficulties in dealing with
primary linguistic data; their own difficulties in deriving a representation
of grammatical knowledge from samples of children's performance are
not dissimilar, or indeed worse since children's sentences are more deficient
than the fully grammatical sentences spoken by caretakers (Newport, 1976).
So the child saying Slug coming may be suffering from particular production
difficulties shared by adults or from specific deficits in areas that have
not yet developed, say the articulatory loop in working memory (Baddeley,
1986). The apparent syntax of the sentence may be different from the
child's competence for all sorts of reasons.
Observational data thus raise two problems related to performance; one
is the distortion resulting from the systematic or accidental features of
psychological processes; the other is the compounding effect of the
development of the child's other faculties. For observational data to be
used in a UG context, eventually these distortions need to be accommodated
within a developmental framework that includes adequate accounts of the
other faculties involved in the child's language performance, which, needless
to say, does not yet exist. Furthermore, observational data of children's
speech are still only evidence for production rather than comprehension,
the two processes being arguably distinct in young children (Cairns, 1984).
A major point also concerns what data from children should be compared
with - adult competence or adult performance? The significant paper by
Radford (1986), for instance, directly compares two sets of sentences, one
consisting of actual child performance such as That one go round, the other
of bracketed adult versions such as Let [that one go round], as if they
were the same type of data (pp. 10-11). It is difficult to offer children's
E-language data as evidence for their knowledge of language without
comparing them with E-language data from adults. A comparison of
observational data from children with single-sentence evidence from adult
competence begs many questions. Once it is conceded that adult perfor-
mance needs to be used, a range of phenomena must be taken into account
that GB syntax has mostly excluded. Let us take the pro-drop parameter
as an illustration. The main criterion for a pro-drop language is the absence
of certain subjects in declarative sentences. In her important work with
pro-drop Hyams (1986) found that children from three different language
backgrounds have null-subject sentences; she regarded this as confirmation
of an initial pro-drop setting, later rephrased as [+uniform] morphology
(Hyams, 1987). However, adult E-language data for English reveal that
subjects are often omitted in actual speech and writing, usually at the
beginning of the sentence. Taking a random selection of sources, Can't
buy me love and Flew in from Miami Beach come from well-known song
lyrics; the opening pages of the novel The Onion Eaters (Donleavy, 1971)
contain Wasn't a second before you came in, Must be ninety now, and Hasn't
been known to speak to a soul since anyone can remember; a column writer
in The Weekend Guardian with the pseudonym "Dulcie Domum" typically
uses sentences such as Drive to health food shop for takeaway, In fact might
be too exciting, and Replies that it's in my desk drawer (Domum, 1989)
- indeed in this article some 34 out of 68 sentences have at least one
null-subject; an anecdote in Preston (1989) concerns a prescriptively oriented
teacher denying that she uses gonna: "Ridiculous," she said; "Never did;
never will". Adult speakers of English appear to use null-subject sentences,
even if they only utilise them in certain registers and situations. So, if
the performance dimension of variation between styles of language is taken
into account, null-subjects may be expected to appear in children's
performance and children may also be expected to have encountered them
in some forms of adult speech.
But also, given the many ways in which children are different from
adults, an argument based on observational data has to explore the
alternative developmental explanations that might cause something to be
lacking from their speech. One explanation might indeed be a more frequent
use by children of some performance process whereby the initial element or
elements in the sentence can be omitted - a clipping of the start of the
sentence - which creates the illusion of pro-drop among other effects.
A counterargument is that the null-subject is not always initial and hence
not a product of utterance-initial clipping; however in a sample of children's
language discussed in Cook (in progress) only 3 out of 59 null-subject
examples had non-initial null-subject. Another explanation might be the
"recency" effect whereby children pay attention chiefly to the ends of
sentences (Cook, 1973), thus being more likely in SVO languages to omit
subjects than objects and hence giving the illusion of null subject sentences.
Hyams (1986) presents the counterargument that, at the same time as
children produce subjectless sentences, they also produce ones with subjects,
so that the lack of overt subjects is not a memory limitation; while this
may well be true, the use of null-subject sentences by English-speaking
adults is equally not a product of memory limitations. A further explanation
might be found in the type of subject that is missing. Children may leave
out some first person subjects because they feel they are not needed, and
this might be a cognitive universal; Sinclair and Bronckart (1971) suggested
that at a certain period children see themselves as the implicit subject
of the sentence; Halliday (1985) sees first person subjects as the most
prototypical form. According to Hyams (1986, p.69), however, "the referent
of the null-subject is not restricted to the child himself". Yet in the same
sample of sentences some 39 out of 59 null subjects were apparently first
person; a high proportion, though not all, of children's null-subject
6. EVIDENCE OF ABSENCE
Furthermore, if data from more than one child are being used, it is necessary
to define coexistence in terms of chronological age, mental age, MLU,
LARSP, grammatical stages or whatever developmental schedule one
prefers: developmental clocks need to be set to the same standard time
if comparisons are to be made.
Secondly, there is the question of how different aspects of behaviour
correlate within a stage. Ingram (1989) finds four meanings for "stage"
when considered in terms of a single behaviour and four more when
considered in terms of two behaviours; in his terms, much of the UG related
research goes beyond the simple "succession" stage to the "co-occurrence"
stage where two behaviours occur during the same timespan or the
"principle" stage in which a single principle accounts for diverse forms
of behaviour. In terms of observational data, given that many forms occur
or don't occur at the same stage, how can it be shown which correlate
with each other and which don't? All the forms present at the same stage
correlate in the sense that they coexist and are part of the same grammar;
what are the grounds for believing some are more closely related than
others? There may be entirely independent reasons why two things happen
at the same time; a paradigm statistical example showing that correlation is
not causation is the clocks in town all striking twelve simultaneously.
Correlation is more problematic when it relies on absence rather than
presence of forms. The Hyams (1986) analysis depends on a link between
null-subject sentences and lack of expletive subjects; the Hyams (1987)
analysis depends on a link between null-subject sentences and lack of
inflections. UG can predict grouping of precise features missing from
children's language and their absence can be correlated closely; but, if
their absence coincides with large numbers of other absent features, the
validity of such a correlation becomes hard to test; why should any two
pairs of missing bits of the sentence be more related than any two other
missing bits? Early children's language is difficult for observational data
because it is so deficient: arguing from absence provides too unconfined
a set of possibilities for correlation.
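The combinatorial worry can be made concrete with a small sketch. Assuming, purely for illustration, a stage at which some list of features is absent from a child's speech, every pair of absent features "correlates" perfectly, so the number of candidate correlations grows quadratically with the number of absences. The feature inventory below is a hypothetical placeholder, not data from any of the studies cited.

```python
from itertools import combinations


def co_absent_pairs(absent_features):
    """Return every pair of features that are absent together at a stage."""
    return list(combinations(sorted(absent_features), 2))


# Hypothetical inventory of absences at one early stage (illustrative only).
absent = [
    "expletive subjects",
    "inflections",
    "overt subjects",
    "determiners",
    "auxiliaries",
    "complementisers",
]

pairs = co_absent_pairs(absent)
# Each of these pairs is equally well "correlated" by co-absence alone.
print(len(pairs), "co-absent pairs from", len(absent), "absent features")
```

On this toy inventory the pair linking expletive subjects and inflections is only one among many; observation of co-absence alone gives no reason to single it out.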
The main conclusion to this paper is that the use of observational data
within the UG theory of language acquisition must always be qualified;
such data should be treated as showing the interaction of complex
performance processes that are themselves developing. The work with
observational data by Hyams, Radford, and others has provided a tre-
mendous revitalisation of the UG theory in recent years; greater discussion
REFERENCES
Chomsky, N. 1988. Language and Problems of Knowledge: The Managua Lectures. Cambridge,
Massachusetts: MIT Press.
Cook, V. J. 1973. The comparison of language development in native children and foreign
adults. IRAL XI/1. 13-28.
Cook, V. J. 1988. Chomsky's Universal Grammar: An Introduction. Oxford: Blackwell.
Cook, V. J., in progress. Universal Grammar and the child's acquisition of word order
in phrases.
Crain, S. C. and I. Nakayama. 1983. Structure dependence in grammar formation. Language
63. 522-543.
Domum, D. 1989. Plumbing the bidet depths. The Weekend Guardian. 11th April, p. 11.
Donleavy, J. P. 1971. The Onion Eaters. London: Eyre and Spottiswoode.
Dulay, H. C. and M. K. Burt. 1973. Should we teach children syntax? Language Learning
23. 245-258.
Feyerabend, P. 1975. Against Method. London: Verso.
Fodor, J. A. 1981. Some notes on what linguistics is about. In N. Block (ed.) Readings
in the Philosophy of Psychology. 197-207.
Halliday, M. A. K. 1985. An Introduction to Functional Grammar. London: Edward Arnold.
Hyams, N. 1986. Language Acquisition and the Theory of Parameters. Dordrecht: Reidel.
Hyams, N. 1987. The setting of the null subject parameter: a reanalysis. Paper presented
to the Boston University Conference on Child Language Development.
Ingram, D. 1989. First Language Acquisition. Cambridge: Cambridge University Press.
Lust, B., J. Eisele and N. Goss (in prep.). 'The development of pronouns and null arguments
in child language', Cornell University.
Matthei, E. 1981. Children's interpretation of sentences containing reciprocals. In Tavakolian
(ed.), 58-101.
Morgan, J. L. 1986. From Simple Input to Complex Grammar. Cambridge, Massachusetts:
MIT Press.
Newport, E. L. 1976. Motherese: the speech of mothers to young children. In N. Castellan,
D. Pisoni, and G. Potts (eds.) Cognitive Theory, vol 2. Hillsdale: Erlbaum.
Preston, D. R. 1989. Sociolinguistics and Second Language Acquisition. Oxford: Blackwell.
Radford, A. 1986. Small children's small clauses. Bangor Research Papers in Linguistics 1.
1-38.
Radford, A. 1988. Small children's small clauses. Transactions of the Philological Society
86. 1-43.
Saleemi, A. 1988. Learnability and Parameter Fixation. Doctoral Dissertation, University
of Essex.
Sinclair, H. and J. Bronckart. 1971. SVO a linguistic universal? Journal of Experimental
Child Psychology 14. 329-348.
Stromswold, K. 1988. Linguistic representations of children's wh-questions. Papers and Reports
in Child Language 27.
Wells, C. G. 1985. Language Development in the Preschool Years. Cambridge: Cambridge
University Press.
Parameters of Metrical Theory and
Learnability*
Michael Hammond
University of Arizona
1. METRICAL THEORY
(2) a. constituents,
b. directionality,
c. iterativity,
d. extrametricality,
e. destressing,
f. scansions/levels.
All theories include a set of constituents (2a). The constituents in (1) are
binary and left-headed. (The stress occurs on the left side of the disyllabic
unit.) However, there are other kinds of constituents as well. For example,
there are binary right-headed constituents (e.g. in Aklan per Hayes
1981). Theories differ in how many constituent types they allow and in
the precise properties of those constituents.
All theories include a parameter of directionality (2b). Are constituents
assigned from left to right or from right to left? The directionality parameter
has an effect in polysyllabic words with an odd number of syllables.
All theories include some mechanism to deal with superficial iterativity
(2c). Are constituents constructed iteratively, filling the span with stresses,
or is only a single constituent built, placing a stress at or near one of
the peripheries of the domain?4
All theories include some analogue to the mechanism of extrametricality
(2d). This allows a peripheral syllable or higher-level constituent to be
excluded from metrification.
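The interaction of directionality, iterativity and extrametricality can be illustrated with a minimal sketch. It assumes left-headed binary constituents only, numbers syllables from 1, and leaves any stray syllable unfooted; the function name parse_feet and its argument names are hypothetical conveniences, not part of any published formalism.

```python
def parse_feet(n, direction="left", iterative=True, extrametrical=None):
    """Return the stressed positions (1-indexed) in a word of n syllables.

    Assumptions (illustrative only): feet are binary and left-headed,
    a leftover single syllable is left unfooted, and extrametricality
    removes one peripheral syllable before feet are built.
    """
    syllables = list(range(1, n + 1))
    if extrametrical == "final":
        syllables = syllables[:-1]
    elif extrametrical == "initial":
        syllables = syllables[1:]

    heads = []
    if direction == "left":
        i = 0
        while i + 1 < len(syllables):
            heads.append(syllables[i])  # head is the left member of the foot
            i += 2
            if not iterative:
                break
    else:
        i = len(syllables)
        while i - 2 >= 0:
            heads.append(syllables[i - 2])  # left member of the rightmost foot
            i -= 2
            if not iterative:
                break
    return sorted(heads)


for n in (4, 5):
    print(n, parse_feet(n, "left"), parse_feet(n, "right"))
```

Under these assumptions the two directions coincide for the four-syllable word and diverge for the five-syllable one, which is the odd-syllable effect described above. A non-iterative right-to-left parse with final extrametricality, parse_feet(5, "right", iterative=False, extrametrical="final"), places a single stress on the antepenult.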
Metrical theory also includes some subsequent operations that will be
included under the rubric of "destressing" here (2e). Such rules manipulate
the metrical structure assigned by the parameters discussed above. In this
paper, some of the results concerning destressing rules of Hammond (1984/
1988) will be assumed. First, destressing rules may only remove stresses.
Second, stresses may only be removed to resolve stress clashes. Last, the
main stress of a domain may not be removed.
Finally, all versions of metrical theory include levels and scansions (2f).
These are discussed in section 4 below.
There are other aspects of metrical theory which are not discussed here,
e.g. cyclicity, exceptions, and the relationship between segmental rules and
metrical structure. Space limitations preclude an adequate treatment of
these. It is expected that including them would not alter the results arrived
at here.
2. LEARNABILITY
There are two minimal assumptions about the kinds of data that children
are exposed to that are accepted here. The first is that learning proceeds
on the basis of positive evidence (but cf. Saleemi, this volume). That is,
it is normally assumed that children are not systematically corrected for
ill-formed utterances (Brown and Hanlon, 1970).
A second assumption that is often made and that will be adopted here
is that learning proceeds on the basis of a presentation of a finite set
of data. This is a natural consequence of the assumption that speakers
do actually come up with a grammar at some point and that the time
up to that point is finite.
(5) i. H0 = {a, aa, aaa, ...}
ii. H1 = {a}
iii. H2 = {a, aa}
iv. H3 = {a, aa, aaa}
v. Hi = {a, aa, ..., a^i}
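A minimal sketch of why a family like (5) is awkward for learning from positive evidence, on the assumption that H0 is the unbounded language of all non-empty strings of a's and each Hi contains the strings of length up to i. The function names are hypothetical, and the "smallest consistent hypothesis" strategy is only one possible learner, chosen here for illustration.

```python
def hypothesis(i):
    """Finite language Hi: all strings of a's of length 1 to i."""
    return {"a" * k for k in range(1, i + 1)}


def in_h0(string):
    """Membership in the unbounded language H0 (assumed reading of (5i))."""
    return len(string) >= 1 and set(string) == {"a"}


def smallest_consistent(sample):
    """Index i of the smallest finite Hi that contains every observed string."""
    return max(len(s) for s in sample)


sample = ["a", "aaa", "aa"]          # a finite presentation of positive data
guess = smallest_consistent(sample)  # the conservative learner's conjecture

# The sample is compatible both with the finite guess and with H0, so no
# finite amount of positive data separates the two.
print(guess, set(sample) <= hypothesis(guess), all(in_h0(s) for s in sample))
```

A learner following this strategy never conjectures H0 on finite data, while a learner that conjectures H0 early can never be corrected by positive evidence if the target is some finite Hi.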
Where should the work be done? The answer depends on the character
of the theories of learning and UG that result. For example, if excluding
G4 in the learning algorithm would vastly complicate the learning algorithm,
but excluding G4 in UG would only slightly complicate UG, then G4 should
be excluded by UG. If, on the other hand, excluding G4 in UG would
overcomplicate UG, but excluding it in the learning algorithm would be
relatively minor, then G4 should be excluded by the learning algorithm.
In the next two sections, a case is presented that would seem to be best
accounted for in terms of the approach in (7b).
The word "occurring" here is crucial to the argument for (7b) above.
It will be shown that while all occurring metrical systems are learnable
on the basis of words of seven syllables or less, the larger set of metrical
systems licensed by UG is not. This distinction will form the centrepiece
of the argument for (7b).
A proof can be constructed if any two existing metrical systems can
be distinguished on the basis of words with n syllables (where n < 8).
Compare, for example, the following two systems. In Language I, a
simplified version of English, trochaic feet insensitive to syllable weight
are constructed from right to left. The rightmost foot is elevated to main
stress and adjacent stresses are resolved by removing one of the clashing
stresses. In Language II, a simplified version of Lenakel (Hammond, 1986,
1990b), one trochee is built from the right and then as many as possible
are built from the left. Again, adjacent stresses are resolved by destressing.
In both languages, destressing operates in a familiar fashion. The second
of two adjacent stresses is removed unless it is the main stress. Otherwise,
the first is removed.
The patterns produced in words of different lengths are diagrammed
with schematic words in (9). Notice how the two patterns only become
distinct in examples of at least seven syllables in length.
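The comparison can be made concrete with a small simulation. The sketch below is only an illustration of the two simplified systems as described in the prose, not the author's own formalization. It assumes that a leftover syllable forms a degenerate (one-syllable) foot and that syllables are numbered from the left starting at 1; all function names are hypothetical.

```python
def feet_language_i(n):
    """Language I: trochees built right to left.
    Assumption: a leftover syllable forms a degenerate foot."""
    feet, end = [], n
    while end >= 1:
        start = max(1, end - 1)
        feet.insert(0, (start, end))
        end = start - 1
    return feet


def feet_language_ii(n):
    """Language II: one trochee at the right edge, then as many
    feet as possible built from the left (same assumption)."""
    right = (max(1, n - 1), n)
    feet, start = [], 1
    while start < right[0]:
        end = min(start + 1, right[0] - 1)
        feet.append((start, end))
        start = end + 1
    return feet + [right]


def destress(stressed, main):
    """Remove the second of two adjacent stresses unless it is
    the main stress; otherwise remove the first."""
    stressed = set(stressed)
    for i in sorted(stressed):
        if i in stressed and i + 1 in stressed:
            stressed.discard(i if i + 1 == main else i + 1)
    return tuple(sorted(stressed))


def pattern(n, build_feet):
    """Return (stressed syllables, main stress) for an n-syllable word."""
    feet = build_feet(n)
    heads = [start for start, _ in feet]  # trochees are left-headed
    main = heads[-1]                      # rightmost foot carries main stress
    return destress(heads, main), main


def first_divergence(max_len=10):
    """Shortest word length at which the two systems differ, if any."""
    for n in range(1, max_len + 1):
        if pattern(n, feet_language_i) != pattern(n, feet_language_ii):
            return n
    return None
```

Under these assumptions the two systems assign identical stresses to words of one to six syllables. At seven syllables Language I stresses syllables 1, 4 and 6, while Language II stresses syllables 1, 3 and 6.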
The comparison in (9) shows that the two systems considered require that
Parameters of Metrical Theory and Learnability 53
In principle, one might expect to find the same options and parameters
available at each level of the hierarchy, but, in fact, that does not occur.
At each successively higher level, the number of options available decreases.
This fact finds an explanatory solution only when one looks to learnability
concerns.
In (11), the options for foot construction are diagrammed. 6
Here, there are fewer possibilities. For example, there seem to be only
left-headed binary cola.7 All languages that exhibit cola exhibit left-headed
binary cola, e.g. in Tiberian Hebrew, Passamaquoddy, Hungarian, Odawa,
etc. Moreover, no language exhibiting cola exhibits more than one scansion
of cola. The fact that these additional options are not available at the
colon level has no explanation within any current version of metrical theory.
At the word tree level, the options are even more restricted.
Again, this absence of the full power of metrical theory at the word tree
level is unexplained in all versions of metrical theory. 8
There are two ways to go about rectifying this lack of explanation in
metrical theory. One possibility might be to alter the theory in some radical
fashion so as to preclude these options at higher levels of the hierarchy.
This approach is problematic in two ways.
First, it would result in a rather "numerological" version of metrical
theory. The options can only be excluded by brute force and the resulting
theory does not have a desirable character.
The second problem is that excluding these options would be unexplanatory.
That is, altering UG directly would miss an important generalization
about the nature of the restrictions outlined in (11), (12), and
(13). Specifically, the restrictions on options available at each level are
directly related to the fact that words of seven syllables or less are sufficient
to distinguish all occurring stress systems. If the same number of options
were available at each level, then the seven-syllable hypothesis could not
be maintained. As a demonstration of this, let us consider several possible
enrichments of the system in (11), (12), and (13).
(15) a. word tree = [[x]...]
        x
        (x x x)
        (x) (x x) (x x)
        (x x) (x x) (x x) (x x) (x)
        σσ σσ σσ σσ σ
     b. word tree = [...[x]]
        x
        (x x x)
        (x) (x x) (x x)
        (x x) (x x) (x x) (x x) (x)
        σσ σσ σσ σσ σ
     c. word tree = [[x]x] (R->L)
        x x
        (x x) (x)
        (x) (x x) (x x)
        (x x) (x x) (x x) (x x) (x)
        σσ σσ σσ σσ σ
are built from left to right. Then cola are built right to left. The two
grammars diverge at that point. In the first, the rightmost colon is
extrametrical and a right-headed word tree is built. In the other, a left-
headed word tree is built. Figure (16) shows how these systems are indistinct
with words of eight syllables (or less); (17) shows how they are distinct
in words of nine syllables (or more).
(16)  x                                x
      (x) <x>                          (x x)
      (x x)(x x)                       (x x)(x x)
      (x x) (x x) (x x) (x x)          (x x) (x x) (x x) (x x)
      σσ σσ σσ σσ                      σσ σσ σσ σσ
(17)  x                                x
      (x x) <x>                        (x x x)
      (x) (x x)(x)                     (x) (x x) (x x)
      (x x) (x x) (x x) (x x) (x)      (x x) (x x) (x x) (x x) (x)
      σσ σσ σσ σσ σ                    σσ σσ σσ σσ σ
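The claim that these two grammars diverge only at nine syllables can be checked mechanically. The following is a minimal sketch under stated assumptions: a final odd syllable forms a degenerate foot, cola are binary and left-headed, and extrametricality is blocked when it would remove the only colon in the word. Only main stress is compared, since the two grammars build identical feet and cola. The function names are hypothetical.

```python
def foot_heads(n):
    """Heads of trochees built left to right over n syllables.
    Assumption: a final odd syllable forms a degenerate foot."""
    return list(range(1, n + 1, 2))


def colon_heads(feet):
    """Heads of binary left-headed cola built right to left over the feet."""
    heads, end = [], len(feet)
    while end > 0:
        start = max(0, end - 2)
        heads.insert(0, feet[start])
        end = start
    return heads


def main_stress_extrametrical(n):
    """First grammar: rightmost colon extrametrical, right-headed word tree."""
    cola = colon_heads(foot_heads(n))
    if len(cola) > 1:  # assumption: a sole colon cannot be extrametrical
        cola = cola[:-1]
    return cola[-1]


def main_stress_left_headed(n):
    """Second grammar: left-headed word tree."""
    return colon_heads(foot_heads(n))[0]


def distinguishing_lengths(max_len):
    """Word lengths up to max_len at which the two grammars differ."""
    return [n for n in range(1, max_len + 1)
            if main_stress_extrametrical(n) != main_stress_left_headed(n)]
```

With these assumptions, `distinguishing_lengths(8)` is empty, and a nine-syllable word is the first to separate the grammars: main stress falls on the third syllable in the first grammar and on the first syllable in the second.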
Thus the seven-syllable hypothesis can explain why fewer options are
available at successively higher levels of the metrical hierarchy.
Notice that the particular options available at any level do not follow
from the seven-syllable restriction. For example, it was argued above that
the seven-syllable restriction accounts for why the set of word tree
constituents cannot be augmented with an iterated trochee. The seven-
syllable restriction does not explain why the word tree constituents are
as in (18a), and not as in (18b). In (18a), the actually occurring possibilities
are given. In (18b), the left-headed unbounded foot is replaced with an
iterated trochee. The number of choices in each system is the same; the
particular choices are different.
(19) a. x                            b. x
        (x x)                           (x x)
        (x x)(x x)                      (x x)(x x)
        (x x) (x x) (x x) (x)           (x x) (x x) (x x) (x)
        σσ σσ σσ σ                      σσ σσ σσ σ
Thus an explanation for the specific asymmetries of (11), (12), and (13)
in terms of the seven-syllable hypothesis has to be supplemented with
something else. That "something else" would appear to be some kind
of markedness. Iterated trochees are more marked than [[x]...]. The
particular options available at any level are the least marked. The specific
details remain to be worked out.
To summarise thus far, it has been hypothesized that metrical systems
are all distinguishable on the basis of words of seven syllables or less.
It has been shown that there is an asymmetric use of the parameters provided
by the theory at the different levels of the metrical hierarchy. It has been
argued that accounting for this asymmetry directly in UG is undesirable:
the resulting theory would be inelegant and would not explain the
relationship between the restrictions and the
seven-syllable hypothesis. The seven-syllable hypothesis predicts that fewer
options should be available at higher levels of the metrical hierarchy.
Markedness accounts for what specific options are available at those higher
levels.
Possible support for this proposal comes from the psychological literature.
Miller (1967) discusses a number of psychological results that seem to
converge on the conclusion that human short-term memory is basically
limited to retaining seven elements (plus or minus two). The proposal
made here is that the seven-unit maximum on short-term memory applies
to language learning as well.
The idea is that forms can only be used to learn stress systems if they
can be held in short-term memory long enough for the learner to extract
the relevant generalizations. Words longer than seven syllables are learnable
because short-term memory does not constrain other aspects of acquisition.
The hypothesis is given in (21) below.
The specific claim is that the nonlinguistic effects Miller discusses are
mirrored by a constraint on the learning algorithm for metrical systems.
This constraint prevents the learner from paying attention to words of
more than seven syllables.
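The effect of such a constraint on a learner can be sketched as follows. This is an illustrative toy, not a model proposed in the text: the learner simply discards any word longer than the limit and keeps every candidate grammar that agrees with the remaining data. The two candidate grammars and the observations are invented stand-ins chosen so that the grammars agree up to eight syllables and differ at nine.

```python
SYLLABLE_LIMIT = 7  # the cap on attended word length proposed in the text


def surviving_grammars(grammars, observations, limit=SYLLABLE_LIMIT):
    """Return the names of grammars consistent with the attended data.

    grammars: dict mapping a name to a function from syllable count
              to main-stress position
    observations: iterable of (syllable count, main-stress position)
    """
    attended = [(n, s) for n, s in observations if n <= limit]
    return {name for name, predict in grammars.items()
            if all(predict(n) == s for n, s in attended)}


# Hypothetical grammars: identical up to eight syllables, distinct at nine.
toy_grammars = {
    "A": lambda n: 3 if n >= 9 else 1,
    "B": lambda n: 1,
}

# Invented observations from a speaker of grammar A.
data = [(4, 1), (6, 1), (9, 3)]
```

With the seven-syllable cap, `surviving_grammars(toy_grammars, data)` keeps both candidates, because the only distinguishing word is ignored. Raising the limit to nine leaves only grammar A. A pair of grammars like this one is therefore unlearnable for a capped learner, which is the sense in which the constraint restricts the set of occurring systems.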
This proposal solves both of the problems mentioned above. First, UG
is not complicated needlessly. The theory of UG allows all options at
all three levels, and the restrictions at higher levels are a function of the
fact that the number of options available increases the number of syllables
necessary to distinguish the resulting systems. The particular options
available are a function of markedness as discussed above. This proposal
also solves the second problem. The asymmetry is directly tied to the seven-
syllable restriction expressed as (20) or (21). For example, the absence
of bidirectional cola follows from the constraint on short-term memory
and is explained by it. The alternative tactic of complicating UG does
not connect the absence of bidirectional cola with the seven-syllable limit
at all.
Finally, this proposal is more general in that the seven-unit effect is
expected to have extralinguistic consequences, just as Miller shows.
In order to maintain this explanation, several aspects of the proposal
must be fleshed out. First, unlike some of the experiments Miller discusses,
it looks like the restriction with respect to language learning refers to
precisely seven syllables. It does not allow for variation. This is taken
as progress in our understanding of short-term memory.
Second, unlike the effects Miller discusses, the restriction on short-term
memory as it affects language is specific to the unit syllable. In the
psychological literature, the particular unit restricted in short-term memory
can vary. This is not the case in metrical phonology. The seven-syllable
restriction is specific to syllables, and not some other phonological unit,
like cola or word trees. This is arguably a consequence of the fact that,
while a variety of factors influence how metrical structure is applied, it
is always applied to syllables. For example, while syllable weight in a
language like English affects metrical structure, that structure is still applied
to syllables.9
Third, it might be thought that the seven-syllable limit is excessive as
there are many languages, e.g. English, where words of seven syllables
or more are vanishingly rare. This is not a problem at all, however. The
seven-syllable restriction makes the strong prediction that languages where
children are exposed only to relatively short words must opt for default
settings of parameters when contradictory data are impossible because
of the length of words. Contrast languages like English and Lenakel. In
English, there is a single scansion of right-to-left footing. Moreover, children
are exposed to relatively short words. Lenakel, on the other hand, exhibits
bidirectional footing (at least two scansions from different directions). The
demonstration in (14) requires that Lenakel children be exposed, at the
appropriate point of acquisition, to words of at least seven syllables. Our
approach predicts that learners not exposed to words of sufficient length
will have to opt for the default choice between one scansion and two
scansions: presumably one scansion (as in English).10 As a second example
of this sort, consider the possibility of foot extrametricality. 11 Contrast
the following systems. All involve building trochees from left to right.
The first two build a right-headed word tree. The second system also makes
a final degenerate foot extrametrical. The third builds a left-headed word
tree. As shown in (22), the first two systems only become distinct in words
of three syllables or more. The latter two only become distinct when words
of four syllables are considered.
X X X
X X X
x) X X
(x x)          (x) <x>        (x x)
(x x) (x)      (x x) (x)      (x x) (x)
σσ σ           σσ σ           σσ σ
x              x              x
(x x)          (x x)          (x x)
(x x)(x x)     (x x)(x x)     (x x)(x x)
σσ σσ          σσ σσ          σσ σσ
Again, the system developed here requires that learners exposed to words
of insufficient length to distinguish these systems will opt for the default
settings for the parameters that separate these systems.12
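The three systems contrasted above can be compared in the same way. The sketch below is a hedged illustration: it assumes that a final odd syllable forms a degenerate foot and that a degenerate foot cannot be extrametrical when it is the only foot in the word. Only main stress is computed, and the function names are hypothetical.

```python
def foot_heads(n):
    """Heads of trochees built left to right over n syllables.
    Assumption: a final odd syllable forms a degenerate foot."""
    return list(range(1, n + 1, 2))


def right_headed(n):
    """System 1: right-headed word tree."""
    return foot_heads(n)[-1]


def right_headed_extrametrical(n):
    """System 2: right-headed word tree, final degenerate foot extrametrical."""
    heads = foot_heads(n)
    final_foot_is_degenerate = n % 2 == 1
    if final_foot_is_degenerate and len(heads) > 1:  # assumption: a sole foot stays
        heads = heads[:-1]
    return heads[-1]


def left_headed(n):
    """System 3: left-headed word tree."""
    return foot_heads(n)[0]


def first_distinct(g1, g2, max_len=7):
    """Shortest word length up to max_len at which g1 and g2 differ."""
    for n in range(1, max_len + 1):
        if g1(n) != g2(n):
            return n
    return None
```

Under these assumptions the first two systems separate at three syllables and the latter two at four. A learner who hears only words of one or two syllables finds all three systems indistinguishable, and the choice among them must then be made by default parameter settings.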
Finally, the approach taken here makes an extremely interesting pre-
diction about other components of grammar. If the hierarchical asymmetry
is truly a function of an extralinguistic constraint on the size of short-
term memory, then we would expect the same constraint to also affect
other domains of grammar, e.g. syntax, semantics, etc.
To summarise, it has been shown that there is an asymmetry in the
use of metrical parameters at different layers of the metrical hierarchy.
This asymmetry is most appropriately handled by imposing an extralin-
guistic constraint on the learning of stress systems. This forces us to revise
the criterion of learnability so that only occurring grammars need to be
learned. It also forces us to revise our understanding of the character
and relevance of short-term memory.
These conclusions are based on a comparison of the predictions made
by metrical theory and the stress systems of the world. If there are significant
flaws in our understanding of either of these, the results would have to
be reconsidered. This is not a problem by any means. The proposal made
here is easily falsified and thus provides clear directions for further
investigation.
Last, note that our results with respect to the learnability criterion are
independent of how learning actually takes place.13 The seven-syllable
hypothesis says nothing about how learning happens. It constrains only
what the input to learning must be.
FOOTNOTES
*Thanks for useful discussion to the participants in my Spring 1990 seminar at the University
of Arizona, Diana Archangeli, Andy Barss, Robin Clark, Dick Demers, Elan Dresher, Kerry
Green, Terry Langendoen, Adrienne Lehrer, John McCarthy, Cecile McKee, Shaun O'Connor,
Dick Oehrle, Doug Saddy, Paul Saka, and Sue Steele. Thanks also to the editor and two
anonymous reviewers. Some of this material was presented at GLOW (Hammond, 1990a).
All errors are my own.
1. See, however, Braine (1974), Dell (1981), Dresher (1981), Dresher and Kaye (1990), and
McCarthy (1981).
2. This particular representation is used for typographical convenience. In all respects, the
representation employed here is a notational variant of the "lollipop" representation used
by Hammond (1984/1988) etc. See Hammond (1987) for discussion.
3. See the references cited in the text.
4. Halle and Vergnaud (1987) accomplish this indirectly with the mechanism of conflation.
5. See Wexler and Culicover (1980) and Gold (1967) for a discussion of what these properties
are.
6. Theories differ with respect to the number of constituents allowed. For example, Hayes
(1987) has three, Halle and Vergnaud (1987) have five, Hammond (1990b) has nine, and
Hayes (1981) has twelve. All of these theories allow all possibilities at the foot level.
7. Beat addition of the sort that promotes the first stress of Apalachicola can be accomplished
with a binary left-headed colon. An unbounded colon is not necessary for cases like this.
8. As pointed out to me by Iggy Roca, some of the parameters in (13) are dependent in
an interesting sense. For example, from the fact that the only constituents available at this
level are unbounded, it follows that there is no directionality, no iterativity, and a one-
scansion maximum. While this accounts for some of the restrictions (13b,c,f), it does not
account for all of them (13a,d,e).
It might be possible to derive the feet in (13a) from the requirement that metrical trees
terminate in a single node. This requirement is a stipulation that is otherwise unmotivated.
9. This is shown by the fact that syllables can never be split into separate metrical constituents
(Hayes, 1981). There are languages like Southern Paiute where stress is arguably assigned
to the mora, rather than the syllable. In such a language, the seven-unit restriction may
apply to morae. The Southern Paiute stress system is consistent with either hypothesis.
10. Obviously, it is unethical to test this hypothesis experimentally. The hypothesis can be
verified observationally, however, if the language learner's experience can be assessed for
word length at the critical stage. Language acquisition research has not reached the point
where this information is available.
11. Foot-extrametricality can only apply to degenerate feet, e.g. in English, Aklan, Odawa,
etc. (Hammond, 1990b).
12. See Dresher and Kaye (1990) for one proposal regarding default parameters in metrical
theory.
13. For some interesting recent proposals, see Barss (1989) and Clark (1990).
REFERENCES
Barss, Andrew. 1989. Against the Subset Principle. Paper presented at WECOL, Phoenix.
Baker, C.L. and John J. McCarthy (eds.). 1981. The Logical Problem of Language Acquisition.
Cambridge, Massachusetts: MIT Press.
Braine, M. 1974. On what might constitute learnable phonology. Language 50. 270-299.
Brown, R. and C. Hanlon. 1970. Derivational complexity and the order of acquisition of
child speech. In J.R. Hayes (ed.) Cognition and the Development of Language, New York:
Wiley.
Chomsky, Noam and Morris Halle. 1968. The Sound Pattern of English, New York: Harper
& Row.
Clark, Robin. 1990. Some elements of a proof for language learnability. Ms. Université
de Genève.
Dell, F. 1981. On the learnability of optional phonological rules. Linguistic Inquiry 12. 31-
38.
Dresher, Bezalel Elan. 1981. On the learnability of abstract phonology. In Baker and McCarthy
(eds.), 188-210.
Dresher, B. Elan and Jonathan D. Kaye. 1990. A computational learning model for metrical
phonology. Cognition 34. 137-195.
Gold, E.M. 1967. Language identification in the limit. Information and Control 10. 447-
474.
Halle, Morris. 1989. The exhaustivity condition, idiosyncratic constituent boundaries and
other issues in the theory of stress. Ms. MIT.
Halle, Morris and Jean-Roger Vergnaud. 1987. An Essay on Stress. Cambridge, Massachusetts:
MIT Press.
Hammond, Michael. 1984/1988. Constraining Metrical Theory: A Modular Theory of Rhythm
and Destressing, 1984 UCLA doctoral dissertation, revised version distributed by IULC,
1988, published by Garland, New York.
Hammond, Michael. 1986. The obligatory-branching parameter in metrical theory. Natural
Language and Linguistic Theory 4. 185-228.
Hammond, Michael. 1987. Accent, constituency, and lollipops. CLS 23/2. 149-166.
Hammond, Michael. 1990a. Degree-7 learnability. Paper presented at GLOW, Cambridge,
England.
Hammond, Michael. 1990b. Metrical Theory and Learnability. Ms. U. of Arizona.
Hayes, Bruce. 1981. A Metrical Theory of Stress Rules, 1980 MIT Doctoral Dissertation,
revised version available from IULC and Garland, New York.
Hayes, Bruce. 1987. A revised parametric metrical theory. NELS 17. 274-289.
Hayes, Bruce. 1989. Stress and syllabification in the Yupik languages. Ms. UCLA.
Levin, J. 1990. Alternatives to exhaustivity and conflation in metrical theory. Ms. University
of Texas, Austin.
McCarthy, John J. 1981. The role of the evaluation metric in the acquisition of phonology.
In Baker and McCarthy (eds.), 218-248.
Miller, George A. 1967. The magical number seven, plus or minus two: some limits on
our capacity for processing information. In G. A. Miller (ed.) The Psychology of Com-
munication. New York: Basic Books Inc. 14-44.
Wexler, Kenneth and Peter W. Culicover. 1980. Formal Principles of Language Acquisition.
Cambridge, Massachusetts: MIT Press.
Markedness and growth*
Teun Hoekstra
University of Leiden
sition. Yet, as I shall clarify below, there are several notions of markedness
in the current literature on language acquisition, which need to be kept
apart. Before getting into those matters, I shall start with a short description
of the parameters model.
b. in terms of PS-rules:
1 2 3 4 5 6 ⇒ 1 2 5 3 6
systems the child will have a hard time figuring out what the adequate
grammar for the language he is being exposed to should look like. Even
more difficult for the linguist is the interpretation of production data in
early stages of acquisition, as these data can also be analysed in a rich
variety of ways.
The first task facing generative theory was therefore to drastically reduce
the descriptive options made available by UG. Several changes in the theory
brought this goal within reach. Chief among them was the abandonment of a
construction-specific approach and its replacement by the modular
conception, according to which a particular construction can be seen as the
result of an interplay of several relatively simple modules. The reduction
of the transformational component to the Move α format, available in
both French and English, made it impossible to express the difference
between these languages with respect to dative constructions in the manner
described in (3). The proposal to reduce the content of PS-rules to the
principle that the internal structure of a phrase is to be regarded as a
projection of lexical properties makes (2) unavailable. This leaves us with
(1) as a means to capture the difference between French and English,
but it will be clear that this can only be regarded as a description of
the difference, as it raises the question of why French should not have
lexical items with the properties of English give. In fact, the modular
approach leads us to ask even more general questions: is the fact that
French does not have such lexical items related to other properties of
French in which it differs from English, and can these sets of properties
follow from a single difference at a more abstract level? Could there be
a principle P-prep with two values, such that a positive value of the
parameter in P-prep yields a grammar of the French type, while a negative
value yields an English type system? For the case at hand, one might
think of a correlation of such properties as those in (4) (cf. Kayne 1981):
2. DEVELOPMENTAL MARKEDNESS
that object pro-drop is much less common than subject pro-drop. It seems
to me that in order to evaluate this quantitative distinction one has to
also take into account the relative distribution of subject and object
pronouns in adult speech. From this we know that the frequency of
pronominal subjects in transitive clauses is much higher than that of
pronominal objects. Looking at it as a dropping process, we must take
into account that from a discourse point of view the number of candidates
in subject position far exceeds the number of object candidates. We may
then assume that from a grammatical point of view there is initially no
asymmetry between subject and object drop, contrary to what Hyams
concludes.
Drawing on Rizzi (1986), Hoekstra & Roberts (1989) make a distinction
between content licensing and formal licensing, where content licensing
is interpreted as licensing in terms of θ-roles and formal licensing as licensing
in terms of "morphological" features. It is argued that the former can
be considered a form of D-structure licensing, while the latter is S-structure
licensing. Mechanisms of S-structure licensing have to do with the iden-
tification of the referent of an argument, e.g. through AGR-coindexation,
chain formation, or visibility in terms of phi-features or descriptive features.
Thus, the two arguments of a sentence like He kicked the boy are D-
structure licensed in terms of the argument roles assigned by the predicate
kick, while the agent argument is S-structure licensed through the phi-
features of the pronoun he and the patient argument is S-structure licensed
in terms of the descriptive content (plus quantification) in the NP the
boy.
I would now like to put forward the hypothesis that early child grammars
are characterised by the absence of an S-structure licensing requirement,
i.e. D-structure licensing suffices. Adopting a maturational perspective we
may interpret this hypothesis in terms of a maturational delay of S-structure
licensing. In Hoekstra & Roberts (1989) it is argued that under certain
conditions adult systems too allow arguments that are D-structure licensed
only, e.g. the null objects in the constructions discussed by Rizzi (1986)
and the null external arguments in middle constructions. In those cases,
the lack of S-structure licensing is compensated for by an additional form
of D-structure licensing (cf. Hoekstra & Roberts 1989 for details).
To make our hypothesis more specific, let us assume that S-structure
licensing is a function of Case marking. This seems to be quite reasonable
if we regard Case assignment as a way of providing visibility to arguments.
As we saw, there are two structural Case configurations, complements
of Verbs and Prepositions and the Specifier of tensed clauses. While P
and V assign Case to their complement under government, Nominative
Case is assigned under the mechanism of Head-Spec agreement. Given
this formal dissimilarity, we might expect an asymmetric growth of the
Children deviate from this pattern in two respects: they uniformly have
agreement between the object and the participle in (8a), and there are
no occurrences of the perfect with intransitives of the type (8d). The question
is, how to capture the generalisation between non-occurrence of (8d) and
overgeneral agreement in (8a).
This is where UEAP comes in. With Borer & Wexler (1988) we must
make the basic assumption that agreement in early stages results from
the same mechanism that is operative in adult grammars, which is to say
that it results from a relation with a local subject (cf. Kayne 1986). UEAP
requires that every predicate element has its own unique subject. There
is no way in which this requirement can be met in (8d), as there are two
predicative elements (hanno and corso), but only one subject candidate.
UEAP can be met in (8a), however, if i libri is taken as the subject of
an adjectival participle letti, which must agree with its subject according
to the agreement rule. The overgeneralised agreement in (8a) is lost and
(8d) is let in as soon as UEAP disappears from the grammar. This way
the generalisation is captured.
Let us first turn to the epistemological status of UEAP. The interpretation
of UEAP given by B&W (1988) is a maturational one: "UEAP, we propose,
represents a maturational stage. While it constrains the early grammar,
it is, obviously, not a constraint on the grammar of adults" (1988:22).
Notice the implication of this for the hypothesis of UG-constrained
maturation. Not only are we to assume that certain portions of UG become
available at a certain maturational stage, but also that other portions of UG become
unavailable at a certain maturational stage, since UEAP, a principle of
UG, is not characteristic of any adult system (by definition), but only
of certain stages of language acquisition, disappearing from the organism
in a way similar to the loss of the drowning reflex.
Notice that UEAP comes very close to a principle of adult-systems,
in effect one of the most basic principles of GB-theory, viz. the Projection
Principle. If a predicate has a role, it must be assigned to a unique argument.
Rather than taking UEAP as an independent principle, B&W suggest
looking upon it as a proto-principle that ultimately develops into this
principle. The difference between UEAP and the Projection Principle is
mainly a matter of scope, UEAP being wider in scope in the sense that
a predicate requires a subject independent of the assignment of an argument
role to it. The question we have to answer then is how this scope is narrowed
down, so as to capture the relevant generalisation, viz. that loss of agreement
in (8a) and the emergence of (8d) are simultaneous.
5. A-CHAINS
5.1. Ergatives
in terms of the class distinction either will fail to hold, or are captured
in other terms. A case in point is the selection of perfective auxiliaries,
which is sensitive to the (un)ergativity of the verb (cf. Burzio 1981 for
Italian, Hoekstra 1984 for Dutch). Dutch children are correct in this respect
very early, long before the purported emergence of A-chains. The same
appears to be true for Italian children. This implies that they are sensitive
to the distinction. If the distinction is not represented in the way it is
assumed to be in the adult grammar, the mechanism for auxiliary selection
should equally be different. This would raise the question of why children
would ever change their system.
In order to motivate the claim that children represent ergative and
unergative predicates in the same way, viz. as unergatives, B&W adduce
cases of overgeneralisation of lexical causativisation, reported for English
children by Bowerman (1982). So, apart from transitives such as John
broke the glass related to the ergative intransitive The glass broke, children
are reported to form alongside unergative intransitives like I sneezed
transitive causatives like Daddy's cigar sneezes me. To explain this, B&W
assume that, given the fact that they also have to represent intransitive
break as unergative, children are forced to formulate a causativisation
rule that is marked, while in adult English causatives are formed by an
unmarked rule. The marked rule requires the internalisation of an external
argument, while the unmarked one would solely add an external causer
argument to a verb that did not yet have an external argument. It is only
after the maturation of A-chain formation that the child realises that some
of the causative/inchoative patterns are consistent with the unmarked
instantiation of the rule. Once this is realised, the child stops overgene-
ralising, as he drops the assumption that the marked rule is operative
in the language he is learning.
There are several problems with this analysis. The most basic of these
is that the hypothesis lacks a perspective on the way in which the difference
between an ergative and an unergative representation of a particular item
is determined. It is unclear, therefore, how, after the A-chain mechanism
has come into the child's reach, he finds out which of his intransitive
verbs have an erroneous representation, given that an unergative repre-
sentation is still available after the emergence of A-chain formation. Related
to this is the observation that reported cases of overgeneralisation are
not random. I shall elaborate on this matter below.
First, however, I would like to consider the notion of marked causative
rule itself. From a crosslinguistic point of view, the notion of a marked
causative rule as the one employed by B&W seems highly suspect. They
notice in passing that the English rule makes use of a zero-affix. Such
zero-causative formation of the English type occurs in many languages,
but the rule always seems to be restricted to ergative verbs. On the other
that the fact that the participant roles are not always uniquely determined
does not mean that the choice of an ergative or unergative representation
is always arbitrary.
The essential ingredients of this hypothesis also underlie ideas such as
Pinker's (1984) semantic bootstrapping hypothesis. According to UAH,
agents are uniformly represented as external arguments, while themes are
taken as internal arguments. In dealing with concepts that determine the
participant roles less clearly, the child has the same hypothesis space as
languages have: in the absence of any grammatical indications, one may
wonder whether the sole argument of (adult) sneeze is an experiencer or
theme, undergoing a process, or whether it should qualify as an agent.
The child might have the same difficulty. Precisely under these
circumstances, erroneous representations are to be expected. The non-random
character of the overgeneralisations which are reported follows from this
perspective on the nature of the determination of the external/internal
status of participant roles.
To sum up, overgeneralisations of the causative rule in English do not
provide sufficient motivation for the claim that all ergative verbs are initially
represented as unergatives. Such a claim would undercut the essence of
the UAH, in the absence of which the way in which arguments are linked
up with grammatical functions would be arbitrary in principle. Moreover,
the claim requires that children exploit mechanisms which should be
excluded as a matter of principle, such as a marked causative rule, as
well as mechanisms for e.g. auxiliary selection and agreement which are
quite different from the mechanisms assumed for adult systems. None of
this is needed if the claim that A-chain formation is unavailable is given
up.
5.2. Passives
6. CONCLUSION
FOOTNOTE
*I would like to thank the following persons for conversations about the subject matter:
Harry van der Hulst, Hans Bennis, Jan Voskuil and Rene Mulder. A special thanks goes
to Hagit Borer, for giving comments which may have led to clarifications, although she
is bound to disagree on a number of points.
REFERENCES
Bowerman, M. 1982. Evaluating competing linguistic models with language acquisition data.
Semantica 3. 1-73.
Borer, H. and K. Wexler. 1987. The maturation of syntax. In T. Roeper and E. Williams
(eds) Parameter setting. 123-172. Dordrecht: Reidel.
Borer, H. and K. Wexler. 1988. The maturation of grammatical principles. Ms. UC at Irvine.
Burzio, L. 1981. Intransitive verbs and Italian auxiliaries. Doctoral dissertation, MIT.
Chomsky, N. 1986. Knowledge of language: its nature, origin and use. New York: Praeger.
Hoekstra, T. 1984. Transitivity. Dordrecht: Foris.
Hoekstra, T. 1986. Passives and participles. In F. Beukema and A. Hulk (eds) Linguistics
in the Netherlands 1986. 95-104. Dordrecht: Foris.
Hoekstra, T., forthcoming. Theta theory and aspectual classification.
Hoekstra, T. and I. Roberts. 1989. The mapping from lexicon to syntax: null arguments. Paper
delivered at the Groningen conference "Knowledge and language".
Hyams, N. 1983. The acquisition of parametrized grammars. Doctoral dissertation, CUNY.
Jaeggli, O. 1986. Passive. Linguistic Inquiry 17. 587-622.
Jakobson, R. 1941. Kindersprache, Aphasie und allgemeine Lautgesetze. Uppsala.
Kayne, R. 1981. On certain differences between French and English. Linguistic Inquiry 12.
349-372.
Kayne, R. 1986. Principles of participle agreement. Ms. University of Paris VIII.
Pesetsky, D. 1987. Psych predicates, universal alignment, and lexical decomposition. Ms. UMASS
at Amherst.
Pica, P. 1987. On the nature of the reflexivization cycle. NELS 17. 483-500.
Pinker, S. 1984. Language learnability and language learning. Cambridge, Massachusetts:
Harvard University Press.
Rizzi, L. 1986. Null objects in Italian and the theory of pro. Linguistic Inquiry 17. 501-
557.
Roberts, I. 1985. [1987] The representation of implicit and dethematized subjects. Dordrecht:
Foris.
Siewierska, A. 1985. The passive. London: Croom Helm.
Wasow, T. 1977. Transformations and the lexicon. In T. Wasow, P. Culicover and A. Akmajian
(eds) Formal syntax. 327-377. New York: Academic Press.
Wexler, K. and R. Manzini. 1987. Parameters and learnability in binding theory. In T.
Roeper and E. Williams (eds) Parameter setting. 41-76. Dordrecht: Reidel.
Williams, E. 1982. Another argument that passive is transformational. Linguistic Inquiry
13. 160-163.
Nativist and Functional Explanations in
Language Acquisition
James R. Hurford
University of Edinburgh
1. PRELIMINARIES
remarkably little further gets done about it. Contributions from linguists,
of whatever theoretical persuasion, (e.g. Lightfoot's section "Evolution
of Grammars in the Species" (Lightfoot, 1983:165-169) and Givon's chapter
"Language and Phylogeny" (Givon, 1979:271-309)) remain sketchy, su-
perficial, and anecdotal.
On the other hand, a more promising sign is Pinker and Bloom's (1990)
paper, in which they systematically address some of the major skeptical
positions (e.g. of Piattelli-Palmarini, 1989, Chomsky, and Gould) concer-
ning natural selection and the evolution of the language faculty. Several
other articles (Hurford, 1989, 1991a, 1991b; Newmeyer, forthcoming) make
a start on working out proposals about how quite specific properties of
the human language faculty could have emerged through natural selection.
To whet the reader's appetite, without, I hope, appearing too enigmatic
or provocative at this stage, I give here a short paragraph with a diagram
(Figure 1), sketching the phylogenetic mechanism, and a table (Table 1),
summarising the major differences between the glossogenetic and the
phylogenetic mechanisms. Deep aspects of the form of language are not
likely to be readily identifiable with obvious specific uses, and one cannot
suppose that it will be possible to attribute them directly to the recurring
short-term needs of successive generations in a community. Here, nativist
explanations for aspects of the form of language, appealing to an innate
LAD, seem appropriate. But use or function can also be appealed to on
the evolutionary timescale, to attempt to explain the structure of the LAD
itself.
Fig. 1.
[Table 1 column headings: GLOSSOGENETIC (Sec. 2 of this paper); PHYLOGENETIC (Hurford, 1989, 1991b)]
Table 1.
"The borderline between the purely linguistic and the psychological aspects of language
... may not exist at all". (Clark and Haviland, 1974:91)
For concreteness, I will give some examples, all for Standard English,
of how I assume some relevant phenomena line up:
"Saussure (1959:11-23, 191ff) demarcates sharply between what he calls internal lin-
guistics, the study of langue, and external linguistics, which encompasses such significant
fields of study as articulatory phonetics, ethnographic linguistics, sociolinguistics,
geographical linguistics and the study of utterances (discourse?), all of which deal with
positive facts.
Classical structuralism thus establishes a gulf between the two spheres, so that
structuring forces or organizing principles which operate in the one domain will not
affect the other. Though this formulation will be seen to be too one-sided, given its
assumption that langue is in principle independent of structuring forces originating outside
it, I will suggest that the distinction between internal linguistics and external linguistics
nevertheless remains useful and in fact necessary. I will draw on this distinction to
show how certain phenomena can be at the same time unmotivated from the generative
synchronic point of view and motivated from a genuinely metagrammatical viewpoint
which treats grammars as adaptive systems, i.e. both partially autonomous (hence systems)
and partially responsive to system-external pressures (hence adaptive). This will be fruitful
only if we recognise the existence of competing motivations, and further develop a
theoretical framework for describing and analysing their interaction within specified
contexts, and ultimately for predicting the resolution of their competition. This (pan-
chronic) approach to metagrammar is part of the developing theory of what has been
called the ecology of grammar (Du Bois, 1980:273)." (1985:343-344).
"... the 'steady state' reached by adults also contains patterns of statistical variation
in the use of grammatical structures that cannot be captured by discrete rules". (Bates
and MacWhinney, 1987:158)
"If we ask ourselves why the various contexts of a linguistic alternation should, as
a general rule, be constrained to change in lock step, the only apparent answer consistent
with the facts of the matter is that speakers learning a language in the course of a
gradual change learn two sets of well-formedness principles for certain grammatical
subsystems and that over historic time pressures associated with usage (presumably
processing or discourse function based) drive out one of the alternatives". (Kroch,
1989:349)
This echoes a long tradition in linguistics (cf. Fries and Pike, 1949).
It is hard, perhaps impossible, to distinguish empirically between a
situation where a speaker knows two grammars or subsystems, correspon-
ding, say, to 'New Variety' and 'Old Variety', and a situation where a
speaker knows a single grammar or subsystem providing for a number
of options, where these options are associated with use-related labels, 'Old'
and 'New'. Plural competences would certainly be methodologically more
intractable to investigate, presenting a whole new, and more difficult, ball-
game for learnability theory, for instance. On the other hand, plural
competences do presumably arise in genuine cases of bilingualism, and
so the LAD is equipped to cope with internalizing more than one grammar
at a time. Perhaps plural competences are indeed the rule for the majority
of mankind, and the typical generative study of singular monolithic
competence is a product of concentrating on standardised languages (a
point made by Milroy). The question is forced on us by the pervasive
facts of statistical patterning in sociolinguistic variation, even in the usage
of single individuals, and language change. And the question is highly
relevant to language acquisition studies, as McCawley (1984:435) points
out: 'Do children possess only one grammar at a time? Or may they possess
multiple grammars, corresponding to either overlapping developmental
stages, or multiple styles and registers?'
In what follows I will simply assume that statistical facts belong to
the domain of performance and pragmatics (e.g. rules of stylistic preference
or, more globally, rules of 'code choice'), whereas facts of acquired adult
grammatical competence are not to be stated probabilistically. I do not
claim to have argued this assumption, or demonstrated that the variation
problem must be handled in this way. But one cannot explore all the
possibilities in one article, and I shall explore here how the interplay of
grammar and use might be envisaged, if one banishes probabilities from
the realm of competence. The research challenge then appears as the twin
questions: 'How does all-or-nothing competence give rise to phenomena
in which statistical distributions are apparent?' and 'How does exposure
to variable data result in all-or-nothing competence?' Possibly, these are
the wrong research questions to ask, but the only way to find out is by
seeing how fruitful theorising along these lines turns out to be. Other
researchers may pursue other assumptions in parallel. In a later subsection
(2.3), I will discuss the phenomenon of grammaticalisation, in which, over
time, a statistical pattern of use (as I assume it to be) gets fixed into
a nonstatistical fact of grammar.
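The division of labour assumed here can be pictured with a minimal sketch. It is an illustration under stated assumptions, not a model proposed in this paper: the variant labels, the preference weights and the function name `produce` are all hypothetical. Competence is an all-or-nothing set of licensed variants, and the statistics arise only in a separate performance component that chooses among them.

```python
import random
from collections import Counter

# Competence (assumed): an all-or-nothing set of licensed variants.
# The labels are hypothetical placeholders, not real constructions.
LICENSED = {"variant_old", "variant_new"}

# Performance (assumed): stylistic preference weights. A weight attached to
# an unlicensed variant has no effect, because competence filters it out first.
PREFERENCE = {"variant_old": 1, "variant_new": 3, "variant_unlicensed": 5}

def produce(n, seed=0):
    """Sample n utterances; only licensed variants can ever be produced."""
    rng = random.Random(seed)
    options = sorted(LICENSED)
    weights = [PREFERENCE[v] for v in options]
    return Counter(rng.choices(options, weights=weights, k=n))

counts = produce(1000)
print(counts)  # a statistical distribution over the licensed variants only
```

On this picture the output shows a frequency distribution although nothing in the competence component is stated probabilistically.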
"There have been two major kinds of attempts to explain linguistic structure as the
result of speech functions. One I shall call the 'behavioural context' approach, the other
the 'interactionist' approach. The 'behavioural context approach' argues that linguistic
patterns exist because of general properties of the way language is used and general
properties of the mind. The interactionist approach argues that particular mental
mechanisms guide and form certain aspects of linguistic structure". (Bever, 1975:585-
6)
where some clear connection between F (the putatively useful form) and
U (the user) and/or P (the purpose) is articulated. The connection between
form and user or purpose need not be immediate or direct but may be
mediated in some way, provided the plausibility of the connection is not
thereby lost. As a simple concrete example, consider a spade. Parts of
its form, e.g. the sharp metal blade, relate directly to the intended purpose,
digging into the earth, but other aspects of its form, e.g. its handle and
its manageable weight, relate more directly to the given (human) charac-
teristics of the user. Separating out which aspects of spade-design are
purpose-motivated and which user-motivated is not easy; likewise it can
also be difficult to separate out social (purpose-motivated) functional
explanations of language form from psychological (user-motivated) func-
tional explanations.
For the purpose of exploring the relationship between nativist and
functional explanations of linguistic phenomena, it will in fact be convenient
to continue to deal in terms of a single functional domain, which has
both cognitive and social components. This domain, which I will label
the 'Arena of Use' and discuss in the next section, is contrasted with the
'internal' domain, the domain of facts of grammar. The Arena of Language
Use must figure in any explanation of language form that can reasonably
be called a 'functional' explanation.
Fig. 2.
[Fig. 3 diagram labels: Primary Linguistic Data; Individual Grammatical Competence; ARENA OF USE]
Fig. 3.
really are 'out there'. The Arena of Use is not populated by just whatever
exists out there, but (in part) by entities that exist-as-some-category. The
relevant idea is put thus by Edie (commenting, as it happens, on Husserl):
useful purpose, and these are either simply not uttered, or uttered and
not taken up by interlocutors.
At the level of discourse, the filtering function of the Arena is accepted
as uncontroversial. A coherent discourse (monologue or dialogue) is not
just any sequence of sentences generated by a generative grammar. The
uses to which sentences are put when uttered determine the order in which
they may be strung together. With the usual reservations about performance
errors, interruptions, etc., sequences which do not serve useful purposes
in discourse do not occur in the Primary Linguistic Data to which the
child is exposed.
At the level of vocabulary, the filtering function of the Arena is also
uncontroversial. Words whose usefulness diminishes are uttered less fre-
quently, eventually falling out of use. When they fall out of use, they
are no longer present in the PLD and cannot pass into the competences
of new language acquirers. What words pass through the cycle in Figure
3, assuming their linguistic properties present no acquisition difficulties,
is almost entirely determined by considerations of use. I grant that the
relation between vocabulary and use is far from simple, as academic folk-
tales about Eskimo words for snow (cf. Pullum, 1989, Martin, 1986), and
Arabic words for camel might lead the gullible to believe. But there is
a large body of scholarship, under the various titles of ethnographic
semantics, ethnoscience, and cognitive anthropology (cf. Brown, 1984, for
a recent example), building up a picture of the relation between the structure
of a community's vocabulary and its external environment. Clearly the
usefulness of words is one part of this picture. One example from Brown
is:
"The fact that warm hues cluster with white and cool hues with dark contributes to
the likelihood that languages will make a "macro-white"/"macro-black" distinction
in the initial encoding of basic color categories. A utilitarian factor may also contribute
to this development. Basic color categories become important when people develop
a need to refer to colors in a general manner. An initial "macro-white"/"macro-black"
contrast is highly apt and useful since it permits people to refer to virtually all colors
through use of general terms". (Brown, 1984:125)
(a) Declarative
(b) Imperative
(c) Interrogative
(i) WH-question
(ii) Yes/No question.
It is hard to find a language in which some "norm" does not exist for (a), (b), (ci)
and (cii), i.e. some structural-syntactic means for keeping these four prototypes apart."
(Givon, 1986:94)
"... the position is occupied by a case-marked empty element associated with an empty
topic, which receives the interpretation of addressee from the discourse". (Beukema
and Coopmans, 1989:435)
"Innateness is not the only factor to which one can appeal when explaining universals.
Certain linguistic properties may have a communicative/functional motivation. If every
grammar contains pronouns distinguishing at least three persons and two numbers (cf.
Greenberg 1966:96), then an explanation involving the referential distinctions that
speakers of all languages regularly need to draw is, a priori, highly plausible". (Hawkins,
1985:583)
The facts of grammatical person are not quite so simple. Foley (1986:66-
74) (while subscribing to the same functional explanation as Hawkins for
distinctions of grammatical person) mentions languages without 3rd person
pronouns, and Mühlhäusler and Harré (1990) claim that even 1st versus
2nd person, as usually understood, is not universal. Nevertheless Hawkins'
point stands; it is not surprising that 'the referential distinctions that
speakers of all languages regularly need to draw' cannot be described by
a simple list, but rather require description in statistical terms of significant
tendencies.
Hawkins gives a number of further plausible examples, which I will
not take the space to repeat. In a more recent, and important, contribution
the same author accounts for universal tendencies to grammaticalise certain
word orders in terms of certain (innate) parsing principles:
"The parser has shaped the grammars of the world's languages, with the result that
actual grammaticality distinctions, and not just acceptability intuitions, performance
frequencies and psycholinguistic experimental results, are ultimately explained by it.
This does not entail, however, that the parser must also be assumed to have influenced
innate grammatical knowledge, at the level of the evolution of the species, as in the
discussion of Chomsky and Lasnik (1977). Rather, I would argue that human beings
are equipped with innate processing mechanisms in addition to innate grammatical
knowledge, that the grammars of particular languages are shaped by the former as
well as by the latter, and that the cross-linguistic regularities of word order that we
have seen in this paper are a particularly striking reflection of such innate mechanisms
for processing. The evolution of these word order regularities could have come about
through the process of language change (or language acquisition): the most frequent
orderings in performance, responding to principles such as EIC [Early Immediate
Constituents, a parsing principle], will gradually become fixed by the grammar. One
can see the kinds of grammaticalization principles at work here in the interplay between
"free" word order and fixed word order within and across languages today. The rules
or principles that are fixed by a grammar in response to the parser must then be learned
by successive generations of speakers". (Hawkins, 1990:258)
to be more 'superficial' than research into UG and the LAD. But the
intrinsic interest of such a theory is not thereby diminished.
A full and helpful discussion of the uses of 'deep' by generative
grammarians and others, and of the misunderstandings which have arisen
over the term, is to be found in Chapter 8 of Chomsky (1979). Putting
aside the use of 'deep' as a possible technical term applied to a level of
structure (which I am not talking about here), the term 'deep' can be
applied either to theories and analyses or to phenomena and data considered
pretheoretically. Those aspects of languages due to the LAD seem, at first
pretheoretical blush, to be 'deep', to require theories of notable complexity
to account for them. These aspects of a language's structure are subtle;
they are not the most obvious facts about it, and, for instance, probably
get no attention in courses teaching the language, even at an advanced
level. Exactly this point is stated by Chomsky:
"We cannot expect that the phenomena that are easily and commonly observed will
prove to be of much significance in determining the nature of the operative principles.
Quite often, the study of exotic phenomena that are difficult to discover and identify
is much more revealing, as is true in the sciences generally. This is particularly likely
when our inquiry is guided by the considerations of Plato's problem, which directs
our attention precisely to facts that are known on the basis of meager and unspecific
evidence, these being the facts that are likely to provide the greatest insight concerning
the principles of UG". (Chomsky, 1986:149)
the same aspects of language could well necessitate quite deep analyses.
If one casts a theory of language as a theory of communication systems
operating within human societies (systems transmitted from one generation
to the next), then the problem of acquisition is not the only problem one
faces. The kind of question one asks is, for instance: Why do these
communication systems (languages) have irregular morphological forms?,
Why do languages have words for certain classes of experience, but not
for others? And the answer to these questions may be quite deep, or at
least deeper than the answers to the corresponding acquisition questions.
(A similar argument is advanced in Ch.1 of Hurford, 1987)
Figure 3, introducing the Arena of Use, is actually a version of a diagram
given by H. Andersen (1973). Andersen's diagram looks like this:
Fig. 4.
"... through time the content of mentally represented grammars, which are not in my
view social objects, comes to contain a content which was in origin clearly social or
cultural in character". (Pateman, 1985:51)
George Miller also expresses the same thought concisely and persuasively:
"Probably no further organic evolution would have been required for Cro-Magnon
man to learn a modern language. But social evolution supplements the biological gift
of language. The vocabulary of any language is a repository for all those categories
and relations that previous generations deemed worthy of terminological recognition,
a cultural heritage of common sense passed on from each generation to the next and
slowly enriched from accumulated experience". (Miller, 1981:33)
It is worth asking whether the social evolution that Miller writes of affects
aspects of languages besides their vocabularies. An argument that it does
is presented in Hurford (1987), especially Ch.6.
It is clear that much of language structure can be explained by innate
characteristics of the LAD; I do not claim that all, or even 'central'
(according to some preconceived criterion of centrality) aspects of languages
can be explained by factors in the Arena of Use. Bates et al. (1988:235-
6) conclude: "we have found consistent evidence for 'intraorganismic'
correlations, i.e. nonlinguistic factors in the child that seem to vary
consistently with aspects of language development". Such factors belong
to the Arena of Use, as defined here, but so far as is yet known, affect
only development, and not the end product, the content of adult grammars.
On 'extraorganismic' correlations, Bates et al. conclude: "This search for
social correlates of language has been largely disappointing". (1988:236).
At a global level, one should not be 'disappointed' or otherwise at how
scientific results turn out. The question of interest is: 'What aspects of
language structure are attributable to the innate LAD, and what aspects
to the Arena of Use?' It seems likely that the search for influences of
the Arena of Use on acquired grammars will be least 'disappointing' in
the marked periphery of grammar, as opposed to the core, as the core/
periphery distinction is drawn by UG theorists.
whose truth may perhaps be taken for granted by the inventor of the
system, but which the system itself can in no way guarantee to be true.
The theorems of learnability theory are derived in systems which assume
a particular type of definition of 'language', in particular, languages are
assumed not to have stochastic properties. But, under a different definition
of 'language', different theorems are provable, showing that frequencies
in the input data can be relevant to language acquisition. See, for example,
Horning (1969), and comments by Macken (1987:391).
But, even with a nonstochastic definition of the adult competence
acquired, it is still easily conceivable that frequency factors in the input
should influence the process of acquisition. Pinker (1987), for example,
assumes that adult competence is nonprobabilistic, but proposes a model
of acquisition in which exposure to a piece of input data results in the
'strengths' of various elements of the grammar being adjusted, usually
being incremented. The point is that in Pinker's proposal one single example
of a particular structure in the input data does not automatically create
a corresponding all-or-nothing representation in the child's internal gram-
mar; it can take a number of exposures for the score on a given element
to accumulate to a total of 1. Presumably, if that number of exposures
isn't forthcoming in the input data, that element (rule, feature, whatever)
doesn't get into the adult grammar.
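The logic of such a strength-accumulating learner can be sketched as follows. This is only an illustration of the general idea described above, not Pinker's actual procedure: the increment size, the element names and the function name `acquire` are assumptions made for the example.

```python
from collections import defaultdict

def acquire(inputs, increment=0.25, threshold=1.0):
    """Return the grammar elements whose strength reaches the threshold.

    Each exposure to an element raises its strength by a fixed increment
    (an assumed value); only elements reaching the threshold count as part
    of the all-or-nothing adult grammar.
    """
    strength = defaultdict(float)
    for element in inputs:
        strength[element] = min(threshold, strength[element] + increment)
    return {e for e, s in strength.items() if s >= threshold}

# Hypothetical input data: rule_A is heard five times, rule_B only twice.
pld = ["rule_A"] * 5 + ["rule_B"] * 2
print(acquire(pld))  # rule_B never accumulates enough strength
```

The acquired grammar is nonprobabilistic (an element is either in or out), yet which elements get in depends on how often they occur in the input.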
Learnability theory typically operates with an assumption that the
learning device is 'one-memory limited'. This is the assumption that
"the child has no memory for the input other than the current sentence-plus-inferred-
meaning and whatever information about past inputs is encoded into the grammar
at that point". (Pinker, 1984:31)
But the success of learnability theory does not depend on the assumption
that its 'one-memory inputs' correspond to single events in the experience
of a child. It is quite plausible that there is some pre-processing front
end to the device modelled by learnability theory, such that an accumulation
of experiences is required for the activation of each one-memory input.
Likewise it is easy to envisage that the setting of parameters in the GB/UG
account needs some threshold number (more than one) of exemplars.
If there were some theorem purporting to demonstrate that this is alien
to language acquisition, one would need to examine carefully the relevant
axioms and definitions of terms, to see if they made assumptions cor-
responding appropriately to data uncovered by real acquisition studies.
There are studies revealing relationships between acquired (albeit interim)
grammars and statistical properties of the input.
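A minimal sketch of such an arrangement, under assumptions of my own (the class names, the threshold k = 3 and the representation of the grammar as a set are all hypothetical), separates a counting front end from a learner that is strictly one-memory limited:

```python
from collections import Counter

class FrontEnd:
    """Pre-processing stage (assumed): passes a datum on only after k exposures."""
    def __init__(self, k=3):
        self.k = k
        self.counts = Counter()

    def observe(self, datum):
        self.counts[datum] += 1
        if self.counts[datum] == self.k:
            self.counts[datum] = 0  # start counting afresh for this datum
            return datum
        return None

class OneMemoryLearner:
    """Sees only the current input and its own current grammar."""
    def __init__(self):
        self.grammar = frozenset()

    def update(self, current_input):
        self.grammar = self.grammar | {current_input}

def run(experiences, k=3):
    front, learner = FrontEnd(k), OneMemoryLearner()
    for e in experiences:
        passed = front.observe(e)
        if passed is not None:
            learner.update(passed)
    return learner.grammar

print(run(["a", "b", "a", "a", "b"]))  # only "a" reaches three exposures
```

The learner itself never consults past inputs; the sensitivity to frequency is located entirely in the front end.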
"... certain acquisition data in conjunction with an interpretation of the relevant evidence
and correlations show that there are stochastic aspects to language acquisition, like
sensitivity to frequency information". (Macken, 1987:393)
"... Gleitman et al. (1984) cite several studies showing that the development of verbal
auxiliaries is affected by the statistical distribution of auxiliaries in maternal speech.
In particular, mothers who produce a large number of sentence-initial auxiliaries ...
tend to have children who make greater progress in the use of sentence-internal auxiliaries
... Because this auxiliary system is a peculiar property of English, it cannot belong
to the stock of innate linguistic hypotheses. It follows that auxiliaries have to be picked
up by some kind of frequency-sensitive general learning mechanism". (Bates et al.,
1988:62)
"Neu (1980) found that adults delete the /d/ in 90 percent of their productions of
and, compared to a 32.4 per cent rate of /d/ deletion in other monomorphemic clusters;
... Fidelholtz (1975) has observed less in the way of perceptible vowel reduction for
frequent words, and Koopmans-van Beinum and Harder (1982/3) have confirmed this
in the laboratory. The frequency-reducibility effect evidently holds even where syllabic
and phonemic length are equated (Coker, Umeda and Browman 1973; Wright 1979),
and as the effect has little to do with differences in the information content or predictability
of high and low frequency words (Thiemann 1982), their different reducibility suggests
that frequent (i.e. familiar) words may be stored in reduced form. [Footnote:-] Though
it is not my purpose here to deal with the child's role in phonological change, my
discussion here ... has an obvious bearing on this subject". (Locke, 1986:248; footnote,
524)
"High frequency forms were found to be poorer primes of productive patterns than
medium frequency forms. Furthermore, the real verb classes which showed some
productivity were those with fewer high frequency forms. Because high frequency forms
are often rote-learned [Bybee and Brewer, 1980], they are less likely to be analysed
and related morphologically to the other members of their paradigm." (Moder, 1986:180)
"Changes affecting the most frequent words first are motivated by physiological factors,
acting on surface phonetic forms; changes affecting the least frequent words first are
motivated by other, non-physiological factors, acting on underlying forms". (Phillips,
1984:320)
"People of all ages and abilities are extremely sensitive to frequency of occurrence
information. ... [In] the domain of cognitive psychology ... we note that the major
conclusion of this area of research stands on a firm empirical base: The encoding of
frequency information is uninfluenced by most task and individual difference variables.
As a result, memory for frequency shows a level of invariance that is highly unusual
in memory research. This is probably not so because memory is unique but because
memory researchers have paid little attention to implicit, or automatic, information
acquisition processes. Here we demonstrated the existence of one such process. We
also showed its implications for the acquisition and utilisation of some important aspects
of knowledge". (Hasher and Zacks, 1984:1385)
Hasher and Zacks also briefly discuss the relation of their work to that
of Tversky and Kahneman; they conclude "... the conflict between our
view and that of Tversky and Kahneman is more apparent than real"
(p. 1383).
Thus far, my arguments have been that statistical patterns in the input
can and do affect the content of the acquired competence, perhaps especially
where the language changes from one generation to the next (i.e. where
the acquired competence differs from the competence(s) underlying the
PLD). There is another, powerful, argument indicating the necessity, for
language acquisition to take place at all, of a certain kind of statistical
patterning in the input data. This involves what has been called the 'Semantic
Bootstrapping Hypothesis', discussed in detail by Pinker (1984), but
advanced in various forms by several others.
Briefly, the Semantic Bootstrapping Hypothesis states that the child
makes use of certain rough correspondences between linguistic categories
(e.g. Noun, Verb) and nonlinguistic categories (e.g. discrete physical object,
action) in order to arrive at initial hypotheses about the structure of strings
he hears. Without assuming such correspondences, Pinker argues, the set
of possible hypotheses would be unmanageably large. This seems right.
It is common knowledge, of course, that there is no one-to-one corre-
spondence between conceptual categories and linguistic categories - any
such correspondence is statistical. Pinker (1984:41) lists 24 grammatical
elements that he assumes correspond to nonlinguistic elements. (In Pinker,
1989 the background to the hypothesis is modified somewhat, but not
in any way that endangers the main point.) Now, according to the Semantic
Bootstrapping Hypothesis, if these correspondences are not present in the
experience of the child, grammar acquisition cannot take place.
UG theory characterises a class of possible grammars. These grammars,
as specified by UG, make no mention of nonlinguistic categories. Of course,
for the grammars to be usable, nonlinguistic categories must be associable
with elements of a grammar. For instance, the lexical entry for table must,
if a speaker is to use the word appropriately, get associated with the
nonlinguistic, experiential concept of a table (or tablehood, or whatever).
"It has long been known that not everything a child hears has a noticeable or long-
term effect on the emergent mature capacity; some sifting is involved. Some of the
sifting must surely be statistical, some is effected through the nature of the endowed
properties ..." (Lightfoot, 1989b:364)
"... we would expect phenomena that belong to the periphery to be supported by specific
evidence of sufficient 'density'..." (Chomsky, 1986:147)
language acquisition can avoid making quantitative commitments altogether. After all,
it may turn out to be true that one rule is learned more reliably than another only
because of the steepness of the relevant rule strengthening function or the perceptual
salience of its input triggers". (Pinker, 1984:357)
Pinker then states a methodological judgement that 'For now there is little
choice but to appeal to quantitative parameters sparingly'. I share his
apprehension about the possibility of 'injudicious appeals to quantitative
parameters in the absence of relevant data', but the solution lies in making
the effort to obtain the relevant data, rather than in prejudging the nature
(statistical or not) of the theories that are likely to be correct.
"In absolute terms, if semantic agreement is possible in a given position in the hierarchy,
it will also be possible in all positions to the right. In relative terms, if alternative
agreement forms are available in two positions, the likelihood of semantic agreement
will be as great or greater in the position to the right than in that to the left." (Corbett,
1983:10-11)
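The two readings of this hierarchy, absolute and relative, can be stated as simple checks. The sketch below is an illustration only: the function names are hypothetical, positions are given as a list ordered from left to right along the hierarchy, and the sample values are invented rather than taken from Corbett's data.

```python
def satisfies_absolute(possible):
    """possible: booleans, one per hierarchy position, ordered left to right.

    True if, once semantic agreement is possible at some position,
    it is possible at every position to its right.
    """
    seen = False
    for p in possible:
        if seen and not p:
            return False
        seen = seen or p
    return True

def satisfies_relative(likelihoods):
    """likelihoods: likelihood of semantic agreement at each position where
    alternative agreement forms exist, ordered left to right.

    True if the likelihood never decreases towards the right.
    """
    return all(left <= right for left, right in zip(likelihoods, likelihoods[1:]))

# Invented example values for a four-position hierarchy.
print(satisfies_absolute([False, True, True, True]))   # True
print(satisfies_absolute([True, False, True]))         # False
print(satisfies_relative([0.1, 0.4, 0.4, 0.9]))        # True
```

The absolute statement is a categorical constraint of the kind a grammar might state; the relative statement is a constraint on frequencies.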
"Or one may view the phenomenon in both languages in the context of 'communicative
function', as being essentially of the same kind. The obvious inference to be drawn
from the presentation is as follows: If indeed the phenomenon is of the same kind
in both languages, then the distinction between competence and performance - or
grammar and speaker's behaviour - is (at least for these particular cases) untenable,
counterproductive, and nonexplanatory." (Givon, 1979:26)
"... through time the content of mentally represented grammars, which are not in my
view social objects, comes to contain a content which was in origin clearly social or
cultural in character." (Pateman, 1985:51)
[Diagram label: Diachronic change in either direction.]
"In the model proposed, individual language learners respond in a discrete all-or-nothing
way to overwhelming frequency facts. Language learners do not merely adapt their
own usage to mimic the frequencies of the data they experience. Rather, they 'make
a decision' to use only certain types of expression once the frequency of those types
of expression goes beyond some threshold. At a certain point there is a last straw
which breaks the camel's back and language learners 'click' discretely to a decision
about what for them constitutes a fact of grammar. What I have in mind is similar
to Bally and Sechehaye's suggestion about Saussure's view of language change. 'It is
only when an innovation becomes engraved in the memory through frequent repetition
and enters the system that it effects a shift in the equilibrium of values and that language
[langue] changes, spontaneously and ipso facto' (Saussure, 1966:143n). Bever and
Langendoen (1971:433) make the same point nicely by quoting Hamlet: 'For use can
almost change the face of nature'". (Hurford, 1987:282-3, slightly adapted)
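The discrete, all-or-nothing response to frequency described in this passage can be illustrated with a minimal sketch. The threshold value, the function name and the sample data are assumptions chosen for illustration; the passage itself commits to no particular figure.

```python
from collections import Counter


def acquire(observations, threshold=0.9):
    """Return the expression types a hypothetical learner adopts.

    If some type's share of the input reaches `threshold` (an assumed value),
    the learner 'clicks' and uses only that type. Otherwise all observed
    variants are retained.
    """
    counts = Counter(observations)
    total = sum(counts.values())
    if total == 0:
        return set()
    dominant = {t for t, n in counts.items() if n / total >= threshold}
    return dominant if dominant else set(counts)


# Below the threshold the learner keeps both variants ...
print(acquire(["A"] * 60 + ["B"] * 40))  # both 'A' and 'B' are kept
# ... above it, the minority variant is dropped outright.
print(acquire(["A"] * 95 + ["B"] * 5))   # only 'A' is kept
```

The point of the sketch is the discontinuity: the learner's output does not track the input frequencies, but switches once the threshold is crossed.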
Beyond the kind of vague remarks cited above, no-one has much idea
of how grammaticalisation works. Givon's book documents a large number
of interesting cases, but his account serves mainly to reinforce the conclusion
that grammaticalisation happens, rather than telling us how it happens.
And of course the fact that it does happen, that aspects of performance
get transmuted into aspects of competence, reinforces, rather than undermines,
the competence/performance distinction. But one thing that is
clear about grammaticalisation is that the LAD plays a vital part. This
emerges from Givon's discussion of Pidgins and Creoles, in which the
discrete step from Pidgin to Creole coincides with language acquisition
by the first-generation offspring of Pidgin speakers.
118 James R. Hurford
"Briefly, it seems that Pidgin languages (or at least the most prevalent type of Plantation
Pidgins) exhibit an enormous amount of internal variation and inconsistency both within
the output of the same speaker and across the speech community. The variation is
massive to the point where one is indeed justified in asserting that the Pidgin has no
stable syntax. No consistent "grammatical" word-order can be shown in a Pidgin, and
little or no use of grammatical morphology. The rate of delivery is excruciatingly slow
and halting, with many pauses. Verbal clauses are small, normally exhibiting a one-
to-one ratio of nouns to verbs. While the subject-predicate structure is virtually
undeterminable, the topic-comment structure is transparent. Virtually no syntactic
subordination can be found, and verbal clauses are loosely concatenated, usually separated
by considerable pauses. In other words, the Pidgin speech exhibits almost an extreme
case of the pragmatic mode of communication.
In contrast, the Creole - apparently a synthesis di novo [sic] by the first generation
of native speakers who received the Pidgin as their data input and proceeded to "create
the grammar" - is very much like normal languages, in that it possesses a syntactic
mode with all the trimmings ... The amount of variation in the Creole speech is much
smaller than in the Pidgin, indistinguishable from the normal level found in "normal"
language communities. While Creoles exhibit certain uniform and highly universal
characteristics which distinguish them, in degree though not in kind, from other normal
languages, they certainly possess the entire range of grammatical signals used in the
syntax of natural languages, such as fixed word order, grammatical morphology,
intonation, embedding, and various constraints". (Givon, 1979:224)
This passage makes the case so eloquently for the existence of an innate
Language Acquisition Device playing a large part in determining the shape
of normal languages that one would not be surprised to find it verbatim
in the introduction to a text on orthodox Chomskyan generative grammar.
In my terms, the prototypical Pidgin is a hybrid monstrosity inhabiting
the Arena of Use, limping along on the basis of no particular shared core
of individual competences. The main unifying features it possesses arise
from its particular spatial/temporal/social range in the Arena of Use. When
a new generation is born into this range, and finds this mess, each newborn
brings his innate linguistic faculty to bear on it and helps create, in
interaction with other members of the community, the grammar of the
new Creole.
The picture just given is, by and large, that of Bickerton's Language
Bioprogram Hypothesis (Bickerton, 1981), and is probably correct in broad
outline, if no doubt an oversimplification of the actual facts. "Usually,
however, the trigger experience of original Creole speakers is shrouded
in the mists of history, and written records of early stages of Creole languages
are meagre." (Lightfoot, 1988:100) A vast amount of empirical research
into the creolisation process needs to be done before interesting details
become discernible, but clearly the focal point of the process is the point
where the innate LAD meets the products of the Arena of Use. The step
from a Pidgin to a Creole is an extreme case of many simultaneous
Nativist and Functional Explanations in Language Acquisition 119
"The most obvious point is that not everything that the child hears 'triggers' a device
in the emerging grammar. For example, so-called 'performance errors' and slips of
the tongue do not entail that the hearer's grammar be amended in such a way as to
generate such deviant expressions, presumably because a particular slip of the tongue
does not occur frequently enough to have this effect. This suggests that a trigger is
something that is robust in a child's experience, occurring frequently. Children are
typically exposed to a diverse and heterogeneous linguistic experience, consisting of
different styles of speech and dialects, but only those forms which occur frequently
for a given child will act as triggers, thus perpetuating themselves and being absorbed
into the productive system which is emerging in the child, the grammar." (Lightfoot,
1988:98)
munity. They are not innate. Such aspects of languages, therefore, are
typically well-determined by the observable data of performance, since
they need to be sufficiently obvious to new generations to be noticed and
adopted.
Obviously, quite a lot is innate in the lexicon too. For instance, no
single verb can mean 'eat plenty of bread and...', 'persuade a woman that...',
'read many books but not...'. The constraints on possible lexical meanings
are strong and elaborate. My point is that, within such innately determined
constraints, the matter of what lexical items a language possesses is
influenced by factors of usefulness. Individual inventiveness cannot violate
the innately determined boundaries; see Hurford, 1987, Ch. 2, Sec. 5, for
a detailed discussion of the relation of individual inventiveness to the
capacity for language acquisition.
Aspects of languages transmitted culturally from one generation to the
next because of their usefulness have their origins in the inventiveness
and creativity (presumably in some sense innate) of the individuals who
first coined them and gave them currency. In the field of vocabulary again,
it is uncontroversial that new words are invented by individuals, or arise
somehow from small groups. Often it is not possible to trace who the
first user of a new word was, but nevertheless there must have been a
first user. In other parts of languages, such as their phonological,
morphological, syntactic, semantic and pragmatic rule components, it is difficult
to attribute the origins of particular rules to the creativity of individuals
or groups, but even here a kind of attenuated creativity in the use of
language, proceeding by small increments over many generations, seems
plausible. The approximate story would be of existing rules having their
domain of application gradually extended or diminished due to a myriad
of small individual choices motivated by considerations of usefulness. Very
few rules of syntax are completely general in the sense of having no lexical
exceptions. Such sets of lexical exceptions are augmented or lessened
continually throughout the history of languages. The specifically functional
considerations, that is considerations of usefulness, which motivate such
changes in the grammar of a language are of course usually impossible
to identify with accuracy, and will remain so until we have much subtler
theories and taxonomies of language use (which will help us to define
the notion of usefulness itself more precisely).
The historical role of invention and creativity that I have in mind is
envisaged by Gropen et al. (1989) and described by Mithun (1984):
"Instead, it could be that the historical processes which cause lexical rules to be defined
over some subclasses but not others seem to favour the addition or retention of narrow
classes of verbs whose meanings exemplify or echo the semantic structure created by
the rule most clearly. The full motivation for the dativisability of a narrow class may
come from the psychology of the first speakers creative enough or liberal enough to
extend the dative to an item in a new class, since such speakers are unlikely to make
such extensions at random. Thereafter speakers may add that narrow class to the list
of dativisable classes with varying degrees of attention to the motivation provided by
the broad-range rule - by recording that possibility as a brute memorised fact, by grasping
its motivation with the aid of a stroke of insight recapitulating that of the original
coiners, or by depending on some intermediate degree of appreciation of the rationale
to learn its components efficiently, depending on the speaker and the narrow class
involved". (Gropen et al., 1989:245)
[Figure 5: diagram with three levels. GRAMMARS: G1, G2, G3, linked in sequence through the LAD and the Arena of Use (AoU). REALISED LANGUAGES: L1, L2, L3. UNREALISED LANGUAGES COMPATIBLE WITH G1, etc.: La, Lb, Lc, ...; Li, Lj, Lk, ...; Lx, Ly, Lz, ...]
Fig. 5.
The upper two levels in this diagram indicate the course of actual linguistic
history: the actually mentally represented grammars G1, G2, G3, ..., and
the actually realised languages L1, L2, L3, ... The bottom level in the
diagram represents alternative language histories - what languages might
have been realised if the pressures of the Arena of Use had been other
than what they actually were. These possible but unrealised languages can
be thought of as aborted due to competition in the Arena of Use from
a more successful rival language. Competition in the Arena of Use, in
the case of this short-term functional mechanism, is therefore between
possible languages defined by the same LAD. (Figure 5 is in fact another
variant of Andersen's scheme in Figure 4.) The unrealised languages are
possible but non-occurring aggregates of real speech events in the language
community, alternative courses of history, in effect.
"Merely on the evidence provided so far, if my arguments are sound, the proponents
of any functional motivation whatever for linguistic change have to do one of two
things:
(i) Admit that the concept of function is ad hoc and particularistic and give up;
or
(ii) Develop a reasonably rigorous, non-particularistic theory with at least some
predictive power; not a theory based merely on post hoc identification plus a
modicum of strategies for weaseling out of attempted disconfirmations.
This is the picture as I see it: (i) is of course the easy way out, and (ii) seems to be
the minimum required if (i) is not acceptable. I am myself not entirely happy with
(i), and it should probably not be taken up - though failing a satisfactory response
to (ii) it seems inevitable." (Lass, 1980:79-80)
use. In other words, quite clearly, these terms do not describe phenomena
in the Arena of Use. Instances of contrast, mean degree of allomorphy,
and pervasiveness of homophony can all be ascertained from inspection
of a grammar, without ever observing a single speaker in action. This
is of course what makes them attractive to many linguists. These are formal
properties, in the same way that the simplicity of a grammar, measured
in whatever way one chooses, is a formal property. Martinet's 'functional
load' is likewise a formal property of language systems, not of language
use, which may account for the failure of that concept to blossom as
a tool of functional explanation. Obviously, the presence of contrast makes
itself felt in the Arena of Use, but then so do most other aspects of grammars.
In fact, an old and important debate in the transition from post-
Bloomfieldian structuralist phonology to generative phonology sheds light
on the relation between contrast, competence, and functionally motivated
language change. The classical, taxonomic, or autonomous phoneme, whose
essence was that it was defined in terms of contrast, was the central concept
of pregenerative phonology. This was before the emergence of a better
understanding of the competence/performance, or I-language/E-language,
distinction, that came with the advent of generative linguistics. To the
surprise of some, it turned out that generative phonology, conceived as
a model of an individual's mentally represented knowledge of the sound
pattern of his language, had no place at all for the classical phoneme.
The classical phoneme simply did not correspond to any linguistically
significant level of representation in competence grammars. The
phonemicists who found this puzzling had no arguments against this conclusion,
yet puzzlement remained, in some quarters. And, in 1971, a postscript
to the debate appeared, an article by Schane (Schane, 1971), which pointed
the way to a resolution of the puzzle. But even 1971 was too close to
the events for matters to have become completely clear, and Schane's
postscript still leaves something rather unsettled; I now offer a post-
postscript, taking Schane's ideas, and showing how they can be well
accommodated within the picture of the interaction between the LAD
and the Arena of Use.
Schane points to attested or ongoing sound changes in a number of
languages (French nasalisation, Rumanian palatalisation, Rumanian
delabialisation, Nupe palatalisation and labialisation, and Japanese
palatalisation). These changes conform to a pattern:
"If, on the surface, a feature is contrastive in some environments but not in others,
that feature is lost where there is no contrast". (Schane, 1971:505)
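Read literally, the pattern can be put as a small rule. The sketch below states it deterministically only to make its content explicit; the function name and the abstract environment labels are assumptions, and no claim is made that any actual language behaves this mechanically.

```python
def environments_losing_feature(contrastive_in):
    """Apply the stated pattern to one feature.

    `contrastive_in` is a hypothetical mapping from environment label to
    True if the feature is contrastive on the surface in that environment.
    The feature is predicted to be lost only where it is non-contrastive,
    and only if it is contrastive somewhere else.
    """
    values = contrastive_in.values()
    if any(values) and not all(values):
        return {env for env, contrastive in contrastive_in.items() if not contrastive}
    return set()


# Abstract environments E1 and E2, for illustration only.
print(environments_losing_feature({"E1": True, "E2": False}))  # {'E2'}
print(environments_losing_feature({"E1": True, "E2": True}))   # set()
```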
On the basis of these examples, Schane maintains that, for the speakers
involved, the (approximately) phonemic level of representation at which
these contrasts exist must have had some psychological validity. But he
has this problem:
like the 'Phonemic principle' is the Arena of Use. Speakers who allow
their phonetic performance to stray too far away from the surface contrasts
used as clues in reception by hearers are likely to be misunderstood. To
remain as (linguistically) successful members of the speech community,
they learn to respect, in a rough and ready way, a degree of surface
contrastivity.
I believe that Schane's basic account of the sound changes he discusses
does illuminate them. Something puzzling (e.g. denasalisation following
hard on the heels of nasalisation) is made to seem less puzzling by drawing
attention to the fact that this happened in an environment where no surface
contrast was lost. But Schane's principle is only explanatory in this weak
sense; it lacks the predictive power that Lass calls for, and falls into Lass's
category of 'a theory based merely on post hoc identification'. As Hock
(1976) points out:
"Though such changes undeniably occur, [Schane's] general claim is certainly too strong.
Note, first of all that the similar loss of u-umlaut before remaining u, referred to as
an 'Old Norse' change ..., is actually limited to Old Norwegian (cf. Benediktsson 1963)
- Old Icelandic does not participate in it:... Moreover, among such frequent conditioned
changes as palatalization and umlaut, examples of such a 'reversal' of change seem
extremely infrequent, suggesting that the phenomenon is quite rare". (Hock, 1976:208)
"It must be emphasised that functional theories are not performance theories. That
is, they seek to describe language in terms of the types of speech activities in which
language is used as well as the types of constructions which are used in speech activities.
They do not attempt to predict the actual tokens of speech events. ... They are theories
of systems, not of actual behavior". (Foley and Van Valin, 1984:15)
"I have attempted to avoid vague reference to properties such as "mental effort"
"informativeness" "importance" "focus" "empathy" and so on. I do not mean that
these terms are empty in principle: however they are empty at the moment, and
consequently can have no clear explanatory force". (Bever, 1975:600-601)
proverbial man searching for his keys under the street lamp, rather than
where he had dropped them, because the light was better under the lamp.
But it is precisely because the light is (at present) dim in the area of functional
influences on language change that adequate functional theories have not
emerged.
Perhaps in some cases there are indeed no functional causes of language
change, and the changes merely come about by random drift such as one
may expect in any complex culturally transmitted system. But it would
be quite unreasonable to assert that in no cases does the factor of usefulness
exert a pressure for change. The fact that we are unable to pinpoint specific
instances should not be mistaken for an argument that changes caused
by factors of usefulness do not exist. We can't see black holes in space,
but we have good reasons to believe they exist. Does anyone really doubt
that languages are useful systems and that (some) changes in them are
brought about by factors of usefulness? The only (!) issue is of the precise
nature and extent of the mechanisms involved.
"... the types of change that create grammatical morphemes are universal, and the same
or similar material is worn down into grammatical material in the same manner in
languages time after time ..." (Bybee, 1986:26)
"NI apparently arises as part of a general tendency in language for V's to coalesce
with their non-referential objects, as in Hungarian and Turkish. The drift may result
in a regular, productive word formation process, in which the NI reflects a reduction
of their individual salience within predicates (Stage I). Once such compounding has
become well established, its function may be extended in scope to background elements
within clauses (Stage II). In certain types of languages, the scope of NI may be extended
a third step, and be used as a device for backgrounding old or incidental information
within discourse (Stage III). Finally, it may evolve one step further into a classificatory
system in which generic NP's are systematically used to narrow the scope of V's with
and without external NP's which identify the arguments so implied (Stage IV)". (Mithun,
1984:891)
"Tendency I: Meanings based in the external described situation > meanings based
in the internal (evaluative/perceptual/cognitive) described situation.
Tendency II: Meanings based in the external or internal described situation > meanings
based in the textual and metalinguistic situation.
Tendency III: Meanings tend to become increasingly based in the speaker's subjective
belief state/attitude toward the proposition. ...
All three tendencies share one property: the later meanings presuppose a world not
only of objects and states of affairs, but of values and of linguistic relations that cannot
exist without language. In other words, the later meanings are licensed by the function
of language". (Traugott, 1989:34-35)
3. CONCLUSION
The LAD is born into, and lives in, the Arena of Use. The Arena does
not, in the short term, shape the Device, but, in conjunction with it, shapes
the learner's acquired competence. The interaction between this competence
and the enveloping Arena reconstructs the Arena in readiness for the entry
of the next wave of LADs.
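The cycle summarised here can be pictured as a loop over generations. The following is a toy sketch under loudly stated assumptions: the robustness cut-off, the multiplicative 'usefulness' weights and the three forms are all invented, and nothing in the argument depends on this particular mechanism.

```python
def lad(arena, robustness=0.2):
    """Acquisition step: keep only forms robust enough in the input.
    `robustness` is an assumed cut-off on relative frequency."""
    total = sum(arena.values())
    return {form for form, n in arena.items() if n / total >= robustness}


def use(competence, arena, usefulness):
    """Use step: rebuild the Arena from the acquired forms, reweighting each
    form's earlier frequency by a hypothetical usefulness factor."""
    weighted = {f: arena[f] * usefulness.get(f, 1.0) for f in competence}
    total = sum(weighted.values())
    return {f: w / total for f, w in weighted.items()}


def run(arena, usefulness, generations, robustness=0.2):
    """Return the competence acquired by each successive generation."""
    history = []
    for _ in range(generations):
        competence = lad(arena, robustness)
        history.append(competence)
        arena = use(competence, arena, usefulness)
    return history


# Invented starting frequencies and usefulness weights.
start = {"a": 0.5, "b": 0.35, "c": 0.15}
weights = {"a": 1.0, "b": 3.0, "c": 1.0}
for generation, competence in enumerate(run(start, weights, 3), start=1):
    print(generation, sorted(competence))
# 1 ['a', 'b']
# 2 ['a', 'b']
# 3 ['b']
```

In this toy the Device itself (the `lad` function) never changes; only the Arena handed to each new generation does, which is the division of labour the conclusion describes.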
FOOTNOTES
"Bever, Carrithers, Cowart, and Townsend (1989) have extensive experimental data showing
that right-handers with a family history of left-handedness show less reliance on syntactic
analysis and more reliance on lexical association than do people without such a genetic
background.
Moreover, beyond the "normal" range there are documented genetically-transmitted
syndromes of grammatical deficits. Lenneberg (1967) notes that specific language disability
is a dominant partially sex-linked trait with almost complete penetrance (see also Ludlow
and Cooper, 1983, for a literature review). More strikingly, Gopnik, 1989, has found
a familial selective deficit in the use of morphological features (gender, number, tense,
etc.) that acts as if it is controlled by a dominant gene". (Pinker and Bloom, 1990)
2. Sperber and Wilson's theory is, however, still controversial. See the peer review in Behavioral
and Brain Sciences 10 (1987), also the exchange in Journal of Semantics 5 (1988), and Levinson
(1989).
"The fundamental question that a theory of language seeks to answer is: How is it
possible for speakers and hearers to communicate by the production of acoustic wave
forms?". (Fodor, 1976:103)
4. In this quotation, I have (with the author's approval) three times replaced an original
instance of 'speakers' with 'language learners' and (indicating a shift in my opinion about
certain numeral expressions) replaced 'preferred usage' with 'a fact of grammar'.
5. This convention is actually quite standard. Pinker, for example, adopts this usage: "What
the Uniqueness principle does is ensure that languages are generally not in proper inclusive
relationships. When the child hears an irregular form and consequently drives out its
productively generated counterpart, he or she is tacitly assuming that there exists a language
that contains the irregular form and lacks the regular form, and a language that contains
the regular form and lacks the irregular form, but no language that contains both". (Pinker,
1984:360)
REFERENCES
Grimshaw, A. D. 1989. Infinitely Nested Chinese 'Black Boxes': Linguists and the Search
for Universal (Innate) Grammar. Behavioral and Brain Sciences 12. 339-340.
Grimshaw, J. and S. Pinker. 1989. Positive and Negative Evidence in Language Acquisition.
Behavioral and Brain Sciences 12. 341-342.
Gropen, J., S. Pinker, M. Hollander, R. Goldberg and R. Wilson. 1989. The Learnability
and Acquisition of the Dative Alternation in English. Language 65. 203-257.
Hasher, L. and R. T. Zacks. 1984. Automatic Processing of Fundamental Information: the
Case of Frequency of Occurrence. American Psychologist 39. 1372-1388.
Hawkins, J. A. 1990. A Parsing Theory of Word Order Universals. Linguistic Inquiry 21.
223-261.
Hock, H. H. 1976. Review article on Raimo Anttila 1972. An Introduction to Historical
and Comparative Linguistics. New York: Macmillan. Language 52. 202-220.
Hooper, J. 1976. Word Frequency in Lexical Diffusion and the Source of Morphophonological
Change. In W. M. Christie, Jr. (ed.). Current Progress in Historical Linguistics. 95-105.
Amsterdam: North Holland.
Horning, J. J. 1969. A Study of Grammatical Inference. Doctoral Dissertation, Stanford
University.
Hudson, G. 1980. Automatic Alternations in Non-Transformational Phonology. Language
56. 94-125.
Hurford, J. R. 1987. Language and Number. Oxford: Basil Blackwell.
Hurford, J. R. 1989. Biological Evolution of the Saussurean Sign as a Component of the
Language Acquisition Device. Lingua 77. 245-280.
Hurford, J. R. 1991a. The Evolution of the Critical Period for Language Acquisition.
Cognition.
Hurford, J. R. 1991b. An Approach to the Phylogeny of the Language Faculty. In J. A.
Hawkins and M. Gell-Mann (eds.) The Evolution of Human Languages. Santa Fe Institute
Studies in the Sciences of Complexity, Proceedings vol. X. Addison Wesley.
Hyman, L. M. 1984. Form and Substance in Language Universals. In Brian Butterworth,
B. Comrie and O. Dahl (eds.) Explanations for Language Universals. 67-85. Berlin: Mouton.
Ingram, D. 1979. Cross-linguistic Evidence on the Extent and Limit of Individual Variation
in Phonological Development. Proceedings of the 9th International Congress of Phonetic
Sciences. Institute of Phonetics. University of Copenhagen.
Itkonen, E. 1978. Grammatical Theory and Metascience: a critical investigation into the
methodological and philosophical foundations of 'autonomous' linguistics. Amsterdam: John
Benjamins.
Itkonen, T. 1977. Notes on the Acquisition of Phonology. English summary of: Huomiota
lapsen aanteiston kehitykseka. Virittaja. 279-308. (English summary 304-308).
Koopmans-van Beinum, F. J. and J. H. Harder. 1982/3. Word Classification, Word Frequency
and Vowel Reduction. Proceedings of the Institute of Phonetic Sciences of the University
of Amsterdam 7. 61-9.
Kripke, S. 1982. Wittgenstein on Rules and Private Language. Cambridge, Massachusetts:
Harvard University Press.
Kroch, A. 1989. Language Learning and Language Change. Behavioral and Brain Sciences
12. 348-349.
Labov, W. 1969. Contraction, Deletion and Inherent Variability of the English Copula.
Language 45. 716-762.
Lasnik, H. 1981. Learnability, Restrictiveness, and the Evaluation Metric. In C. L. Baker
and J. J. McCarthy (eds.) The Logical Problem of Language Acquisition. 1-21. Cambridge,
Massachusetts: MIT Press.
Lass, R. G. 1980. On Explaining Language Change. Cambridge: Cambridge University Press.
Lenneberg, E. H. 1967. Biological Foundations of Language. New York: John Wiley and
there is no longer any argument for even the weak version of the Subset
Condition, and the Subset Principle remains completely unsupported.
In the present study, I do not set out to uncover new evidence in favour
of the subset theory of learning. What I will argue however is that at
least the original evidence for it stands, in that precisely a locality parameter
for binding of the type in Manzini and Wexler (1987), Wexler and Manzini
(1987) is necessary, and an approach of the type in Pica (1987) is insufficient.
1. LOCALITY
(1) CP
This is the structure proposed in Chomsky (1986a; b), except that the
subject is taken to move to the Spec of IP position, where it can be assigned
a Case, from a VP-adjoined position, where it can be assigned a theta-
role, as in Sportiche (1988).
It is generally assumed that the locality theory for movement is based
at least in part on a notion of government, which following Chomsky
(1986b) is formulated as in (2) in terms of a notion of barrier:
In Manzini (1988; 1989) it is argued that the locality theory for movement
can be entirely based on the notion of government, if the notion of barrier
Locality and Parameters again 139
It is important to notice that (3)-(4) uses all and only the primitives used
in the definition of barrier (and minimality barrier) in Chomsky (1986b),
to the exclusion notably of the notion of subject. This in turn is the only
crucial property of (3)-(4) for the present discussion; hence the conclusions
that we will reach are essentially independent of the theory in Manzini
(1988; 1989).
Consider then the locality theory for referential dependencies. The locality
condition on anaphors in Chomsky (1981), Binding Condition A, states
that an anaphor must have an antecedent in its governing category. A
governing category for α is defined in turn as a category that dominates
α, a governor for α and a subject accessible to α. Let us compare this
definition of locality with (4). To begin with, there is no indication that
a governing category need ever be a non-maximal projection; thus in this
respect governing categories and barriers need never differ. Furthermore,
a governing category must dominate a governor for α, while a barrier
must dominate a g-marker for α, if α is g-marked. It is easy to check
however that in our theory the notion of g-marker reconstructs the notion
of governor in Chomsky (1981); thus in this respect the two definitions
of locality do not differ either. The only difference between the two remains
the notion of subject, which appears in the definition of governing category,
but not in the definition of barrier.
If so, the definition of governing category γ can be given as in (5),
which is the definition of barrier in (4) with the added requirement that
γ must dominate a subject accessible to α; in the first instance a subject
can be taken to be accessible to α just in case it c-commands α:
140 Rita Manzini
The facts in (7)-(8) are predictable on the basis of (5)-(6), but also on
the basis of a government condition, stating that anaphors must have an
antecedent that governs them. Consider first the object position. VP is
a barrier for it, hence an anaphor in object position can only have an
antecedent that is not excluded by VP, if government is to be satisfied.
The VP-adjoined subject position in (1) satisfies this condition, and no
position higher than it does. Thus it correctly follows that himself in (7)
can only be bound by the embedded subject. Consider now the ultimate
subject position, in the Spec of IP, as in (1). CP is a barrier for the Spec
of IP, hence an anaphor in the Spec of IP must be bound internally to
CP, if government is to be satisfied. However, all available positions in
this domain are A'-positions. Thus A-binding must violate government,
and the ungrammaticality of (8) is correctly derived.
(7)-(8), then, do not necessitate recourse to the notion of subject, and
therefore lend no support to the theory in (5)-(6) as opposed to (4). The
notion of subject is in fact needed to account for this type of examples
in Chomsky (1981; 1986a) but only because one subject position only is
postulated, the Spec of IP, a VP-external position; this has been noticed
also in Kitagawa (1986) and Sportiche (1988). Because of this, and because
under any definition of locality based only on the notion of maximal
projection (and governor/ g-marker) VP is a locality domain for the direct
object, the notion of subject must be referred to in order to allow the
(10) is wellformed with John as the antecedent for him, while (9) is wellformed
with John again, but not Peter as the antecedent. According to Chomsky
(1981), this behaviour is again accounted for by a condition formulated
in terms of the notion of governing category in (5), Binding Condition
B; following the format of Binding Condition A, as in (6), Binding Condition
B can be rendered as in (11):
As in the case of (5)-(6), the theory in (5) and (11) can account for the
data, in this case (9)-(10); but an account is equally possible in terms
of a government condition. Consider first a pronoun in object position,
as in (9). Its first barrier is VP, which under the theory of phrase structure
in (1) contains a subject position. It follows that the pronoun is correctly
predicted to be disjoint in reference from the immediately superordinate
subject, if the condition on it is that it cannot be governed by its antecedent.
Similarly, consider the subject pronoun in (10), ultimately in the Spec
of IP position. IP is not a barrier for the subject, if it is g-marked by
C; but CP is a barrier for it. Hence the superordinate subject is correctly
predicted to be a possible antecedent for the pronoun, since it does not
govern it.
As far as the object and subject position of a sentence are concerned,
or in general sentential positions, it appears then that Binding Conditions
A and B can be formulated in terms of the notion of government, as
in (12), and do not require reference to the notion of governing category
in (5):
(6) and (11) that has not been considered so far. These data, involving
NP-internal positions will be considered in the next section, where I will
conclude that anaphors are indeed associated with the notion of governing
category, though pronouns are associated with the notion of barrier. Thus
the notion of governing category cannot be reduced to the notion of barrier,
and vice versa.
Suppose we assume that NP's have a structure of the type in (13), where
α is the NP's object, and β the NP's subject:
(13) NP
(14) DP
It is not difficult to see that in this case the correct predictions follow
under both our definition of barrier and the definition of governing category.
Under the former, NP is a barrier for α because it is a maximal projection
that dominates α and the g-marker of α, namely N. Under the latter,
NP is a governing category for α for the same reasons and because it
also dominates a subject that c-commands α. Thus if an anaphor must
have a binder not excluded by its governing category, himself in (15) must
be bound by Peter's; the same result follows if an anaphor must have
a binder that governs it.
Consider now the case in which β in (13) is not realised, as in (16), which
exactly reproduces (15) but for the absence of the subject Peter's; crucially,
it is not necessary to the wellformedness of (16) that the subject of the
NP is interpreted as referentially dependent on John:
(20) John and Peter thought that [each other's pictures] were on sale
reciprocal. The reason is that in (19) the matrix verb is the g-marker for
each other in the Spec position of NP. NP then is not a governing category
for each other because it does not dominate its g-marker. Rather, the first
category that dominates the g-marker for each other is the matrix VP,
and this is its governing category. The correct predictions then follow,
in particular that each other can be bound by the matrix subject.
As for (20), our theory predicts its ungrammaticality. Notice that each
other does not have a g-marker in (20), since NP, being in subject position,
is not a sister to a head. NP itself then is a barrier for each other, and
since each other is accessible to itself by our definitions so far, NP is its
governing category. Since of course no antecedent is available for it within
NP, ungrammaticality is predicted to follow. In fact sentences of the type
in (20) appear not to be worse than their counterparts in (18).
However, leaving this problem aside, we have verified that the fundamental
data relating to English anaphors, including himself and each other,
are correctly predicted by our theory under a subject-based definition of
locality. In doing so, we have also shown that at least in the cases considered
so far the notion of accessibility can be reduced to that of c-command.
Remember that in Chomsky (1981) accessibility is defined in terms of
c-command and of the i-within-i constraint; in particular, β is said to be
accessible to α in case it c-commands α and it can be coindexed with
it under the i-within-i constraint. If we are correct, the second part of
this definition can be eliminated altogether. Similarly, in Manzini (1983)
γ is said to be a locality domain for α just in case two independent conditions
are satisfied, which can be expressed as follows: first, γ dominates a subject
that c-commands α, and second, this subject is accessible to α in the sense
that it does not violate the i-within-i constraint. In case the second condition
is not satisfied, no locality domain for α is defined. If we are correct,
the whole definition of accessibility must reduce to the first of these two
conditions.
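The reduced definition can be made concrete with a small sketch. What follows is only an illustration under stated assumptions, not the formalisation of Chomsky (1981) or Manzini (1983): the class `Node`, the hand-set flag `is_subject`, the helper names and the two toy trees (which omit the preposition and all functional structure) are all hypothetical. The sketch takes the locality domain of α to be the smallest category dominating α and a subject that c-commands α, with no appeal to the i-within-i constraint.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    is_subject: bool = False  # marked by hand: an assumption of this sketch
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        for child in self.children:
            child.parent = self


def dominates(a: Node, b: Node) -> bool:
    """True if a properly dominates b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a: Node, b: Node) -> bool:
    """a c-commands b: neither dominates the other, and the first
    branching node above a dominates b."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    node = a.parent
    while node is not None and len(node.children) < 2:
        node = node.parent
    return node is not None and dominates(node, b)


def subjects_under(node: Node) -> List[Node]:
    """All nodes flagged as subjects in the subtree rooted at node."""
    found = [node] if node.is_subject else []
    for child in node.children:
        found.extend(subjects_under(child))
    return found


def locality_domain(alpha: Node) -> Optional[Node]:
    """Smallest category dominating alpha and a subject c-commanding alpha."""
    node = alpha.parent
    while node is not None:
        if any(c_commands(s, alpha) for s in subjects_under(node)):
            return node
        node = node.parent
    return None


# Toy tree with an NP-internal subject (simplified, hypothetical structure).
himself = Node("himself")
peters = Node("Peter's", is_subject=True)
np = Node("NP", [peters, Node("N'", [Node("pictures"), himself])])
ip = Node("IP", [Node("John", is_subject=True),
                 Node("VP", [Node("likes"), np])])

# The same tree without the NP-internal subject.
himself_bare = Node("himself")
np_bare = Node("NP", [Node("N'", [Node("pictures"), himself_bare])])
ip_bare = Node("IP", [Node("John", is_subject=True),
                      Node("VP", [Node("likes"), np_bare])])
```

In the first toy tree `locality_domain(himself)` is the NP, because the NP contains a c-commanding subject; in the second, where NP is subjectless, the search continues upwards and stops at IP.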
Nothing that I have said so far touches yet on pronouns. If an anaphor
and a pronoun in a language are associated with the same definition of
locality, and if locality theory is in fact a biconditional to the effect that
an element is anaphoric just in case it is bound within that locality domain,
we expect the pronoun and anaphor to have complementary distribution
in the language. This is the prediction in Chomsky (1981) for English,
and as is well known the prediction fails.
Consider first a pronoun in the object and subject position of a sentence,
as in (9) and (10) again:
(26) [Diagram over NP ... l'uno ... l'altro marking the dependencies R2, R1 and R1*; layout not recoverable]
Two observations are in order before we dismiss the issue. First, if l'uno
at LF takes scope immediately over NP, the locality properties of R1*
are exactly the same as the locality properties of R1. Thus (26) and (27)
are equivalent in this respect. Second, accepting that something like R1*
characterises the quantifier part of a reciprocal in English as well, as argued
for instance in Heim et al. (1988), and that LF is not parameterised, the
only hope of accounting for the discrepancies that we will see exist between
Italian and English is at s-structure. Thus (26) and (27) are not equivalent
in this respect, and there is perhaps a reason why R1 must be postulated
as an s-structure dependency.
Given this background, consider an NP in the object position of a
sentence. L'uno can float either NP-externally or NP-internally. If l'uno
floats NP-externally, the sentence is wellformed, provided NP is otherwise
subjectless. Relevant examples are of the type in (28). Notice that in our
examples the NP containing (part of) the reciprocal is systematically made
into an accusative subject of a small clause, rather than into an object;
this is to avoid as much as possible readings with the reciprocal taken
as an argument of the verb:
(28) does not choose among locality domains for l'uno, which I assume
is VP-internal. By our definition of barrier its locality domain is VP, the
first maximal projection that dominates it. The subject quei pittori ('those
painters') is then predicted to be a possible antecedent for it, correctly.
The same correct prediction, that the subject of the sentence is a possible
antecedent for l'uno, follows however under a subject-based definition of
governing category, since in this case the subject itself defines the governing
category.
Consider now l'altro. Under a subject-based definition of locality, the
locality domain for l'altro is defined by the subject of the sentence, and
150 Rita Manzini
Consider now the cases, crucial to the determination of the locality domain
for l'uno, where this floats NP-internally. These are exemplified in (30)-
(31), where (30) differs from (31) in that an overt subject is present in
NP:
(30) represents by far the easier of the two examples, though again it
does not distinguish between locality domains for l'uno. The locality domain
for l'uno is NP, both under our definition of barrier, since NP is the maximal
projection that dominates l'uno, and under a subject-based definition of
governing category, since NP has a subject. Hence the only possible
antecedent for l'uno is the subject of NP, loro ('their'). This of course
is also true for l'altro, which we have just seen to be associated with the
subject-based definition of locality. The prediction correctly is that if the
subject of NP, which is pronominal, is interpreted as coreferential with
the subject of the sentence, so is the reciprocal; but not otherwise.
(32) *Quei pittori pensano che [NP lo stile l'uno dell'altro] sia ammirevole
Those painters think that the style each of the other is admirable
(33) Quei pittori pensano l'uno che [NP lo stile dell'altro] sia ammirevole
Those painters think each that the style of the other is admirable
I do not want to imply that these problems are insoluble; only that
their solution will involve a complication of the grammar. If so, the potential
simplicity argument for what I have called non-ad hoc theories would
disappear. What is crucial to our argument however is that the parameter
derived in Pica (1987) is a two-way parameter, between short-distance and
long-distance dependencies. If Manzini and Wexler (1987), Wexler and
Manzini (1987) are correct, there are of course many more values to the
parameter; but leaving these aside, on the basis solely of the new evidence
presented here, we must conclude that the parameter that is needed for
observational adequacy is at least a three-way one, the short-distance value
of Pica (1987) splitting into a subject-based and a non-subject-based value.
Notice that a parameter to this effect can presumably be added to the
theory in Pica (1987); but this only illustrates our point further, namely
that a locality parameter is required in any case.
In fact, a desirable property of the theory in Pica (1987) is that it links
the locality domain of anaphors and pronouns with their ability or inability
to take antecedents other than subjects. In particular, it appears to be
a fact that anaphors whose opacity element is a subject, do not necessarily
have a subject as their antecedent; on the other hand, long-distance anaphors
are subject oriented. The link between long-distance binding and subject
orientation follows if the landing site for a long-distance anaphor, which
is an X° and moves head-to-head, is I. This still needs to be stipulated
within the theory, but it at least provides a natural basis for linking
antecedence and locality.
By contrast, the theory in Manzini and Wexler (1987), Wexler and
Manzini (1987) cannot derive the link between long-distance binding and
subject orientation. Rather, the subject orientation of certain anaphors
is treated as a second parameter, the antecedent parameter. Thus it is
likely that at least this aspect of the theory in Pica (1987) is correct. This
however leaves our present argument unchanged. Once more, our argument
is simply that a crucial feature of the theory in Manzini and Wexler (1987),
Wexler and Manzini (1987) must be retained, namely parameterised locality
domains.
Any modification of the original conception of the locality parameter
in order to take into account at least the link with the antecedent parameter
must in turn raise the question whether the argument in favour of the
Subset Principle is preserved. In the meantime, however, our provisional
conclusions are supportive of the Subset Principle. If the necessity of a
parameter of the type in (34) is demonstrated, the original argument in
favor of the Subset Principle stands at least for the time being.
REFERENCES
Abney, S. 1987. The English Noun Phrase in its Sentential Aspect. Doctoral Dissertation,
MIT.
Belletti, A. 1983. On the Anaphoric Status of the Reciprocal Construction in Italian. The
Linguistic Review 2.
Borer, H. 1984. Parametric Syntax. Dordrecht: Foris.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. 1986a. Knowledge of Language: its Nature, Origin and Use. New York: Praeger.
Chomsky, N. 1986b. Barriers. Cambridge, Massachusetts: MIT Press.
Heim, I., H. Lasnik and R. May, to appear. Reciprocity and Plurality. Ms. UCLA, University
of Connecticut and UC Irvine.
Hyams, N. 1986. Language Acquisition and the Theory of Parameters. Dordrecht: Reidel.
Kitagawa, Y. 1986. Subjects in Japanese and English. Doctoral Dissertation, University of
Massachusetts.
Koster, J. 1986. Domains and Dynasties. Dordrecht: Foris.
Manzini, M. R. 1983. On Control and Control Theory. Linguistic Inquiry 14. 421-446.
Manzini, M. R. 1988. Constituent Structure and Locality. In A. Cardinaletti, G. Cinque
and G. Giusti (eds.) Constituent Structure. Papers from the 1987 GLOW Conference, Annali
di Ca' Foscari 27, IV.
Manzini, M. R. 1989. Locality. Ms. University College London.
Manzini, M. R. and K. Wexler. 1987. Parameters, Binding Theory and Learnability. Linguistic
Inquiry 18. 413-444.
Pica, P. 1987. On the Nature of the Reflexivization Cycle. In Proceedings of NELS 17. GLSA.
University of Massachusetts.
Sportiche, D. 1988. A Theory of Floating Quantifiers and its Corollaries for Constituent
Structure. Linguistic Inquiry 19. 425-449.
Wexler, K. and M. R. Manzini 1987. Parameters and Learnability in Binding Theory. In
T. Roeper and E. Williams (eds.) Parameters in Linguistic Theory. Dordrecht: Reidel.
Yang, D.-W. 1984. The Extended Binding Theory of Anaphora. Theoretical Linguistic Research
1.
On the rhythm parameter in phonology*
Marina Nespor
University of Amsterdam
The dichotomy stress-timed and syllable-timed has largely been taken for
granted since Pike (1945), although already from the early sixties many
studies devoted to the issue have put into question the physical basis of
this dichotomy. Shen and Peterson (1962), O'Connor (1965) and Lea (1974),
for example, have shown, with different types of experiments, that in
English, interstress intervals increase in duration in a manner that is directly
proportional to the number of syllables they contain. Bolinger (1965),
besides showing that the isochrony of interstress intervals in English is
not a physical reality, finds that the length of the intervals is influenced
not only by the number of syllables they contain, but, among other factors,
also by the structure of the syllables and the position of the interval within
the utterance.
More recently, Roach (1982) carried out some experiments to test two
claims made by Abercrombie (1967:98): first, that there is variation in
to hear such intervals as more isochronous than they really are (cf. also
Donovan and Darwin, 1979, Darwin and Donovan, 1980) might suggest
the presence of an underlying rhythm that imposes itself on the phonetic
material. In other words, the (more or less) regular recurrence of stresses
would be part of the rhythmic competence of native speakers of English.
A similar conclusion is reached by Cutler (1980a) on the basis of syllable
omission errors. These speech errors tend to produce sequences whose
interstress intervals are more regular than they are in the original target
sentence (cf. also Cutler, 1980b).
These results are very interesting for the present discussion in that T1
and T2 make different predictions about perception as well. According
to T1, first, the tendency of native speakers of English to regularise
interstress intervals should be foreign to native speakers of "syllable-timed"
languages, since stress would supposedly not play any role in their
rhythmic organisation; second, native speakers of syllable-timed languages
should tend to perceive syllables as more isochronous
than they really are.
As far as the latter prediction of T1 is concerned, there are, to my
knowledge, no perception experiments on syllable-timed languages that
would parallel those just mentioned for English. It is, however, interesting
to notice that most claims about Spanish, French, Italian or Yoruba having
syllables of similar length are made by native speakers of English, not
by native speakers of syllable-timed languages, the ones that supposedly
should most feel this type of regularity.
Concerning the first prediction of T1, important results have been reached
by Scott, Isard and de Boysson-Bardies (1985), who found that native
speakers of "syllable-timed" French and of "stress-timed" English behave
in the same way: they both hear the intervals in between stressed syllables
as more regular than they actually are. While the similar behaviour of
French and English listeners in the perception of linguistic rhythm contradicts
the first prediction of T1 mentioned above, it is just what T2 would
predict: since language is temporally organised according to universal
principles, these should have similar effects in the perception of all
languages. These results thus indicate that there is no perceptual support
for different underlying rhythmic systems for stress-timed and syllable-
timed languages.
weight and stress reinforce each other in some languages much more than
in others.
The second phonological factor that, according to Dauer, characterises
"stress-timed" English, Swedish and Russian as opposed to Spanish, Italian
or Greek, is the reduction of stressless vowels. A phenomenon instead
that is widespread in "syllable-timed" languages is the deletion of one
of two adjacent vowels. The important difference between the two processes
for the present discussion is that while a syllable whose vowel undergoes
reduction retains its syllabicity, a syllable whose vowel undergoes deletion
disappears. Very short syllables are thus originated in "stress-timed"
languages but not in "syllable-timed" languages. Thus, the lack of vowel
reduction in Spanish, Italian and Greek also contributes to the impression
of syllable isochrony in these languages. The presence of it in English
or Dutch, instead, is partly responsible for the impression that the stressed
syllables recur at regular intervals. That is, the fact that stressless syllables
are reduced and thus shortened, together with the fact that they are shorter
than stressed syllables to begin with, makes them so much less prominent
than the syllables that carry stress, and the impression is created that a
sequence of stressless syllables occupies a more or less constant amount
of time, independently of how many syllables it contains. 5
Finally, stress has a greater lengthening effect in English than it has
in Spanish (cf. Dauer, 1983). This is one more characteristic that makes
the difference in duration between stressed and stressless syllables much
greater in the former language than in the latter, thus reinforcing the illusion
of regular recurrence of stresses and syllables, respectively. Now that the
nonrhythmic processes have been identified that are present in the languages
most often used as examples of either stress-timing or syllable-timing, it
must be demonstrated that the causality relation between rhythmic and
nonrhythmic phonology supports T2. I turn to this task in the next section.
(cf. Allen and Hawkins, 1975). It is the acquisition of the reduction processes
that contributes to the development of adult rhythm.
Once more, we are confronted with data that are accounted for within
T2, while they are not explainable within T1.
From the observations presented in sections 1 and 2, the conclusion
must be drawn that T2 is superior to T1 for both phonetic and phonological
reasons. That is, no motivation has been found in favour of different
temporal organisations in language, but rather against it.
           x               x              x  x  x  x
       x   x   x   x       x              x  x  x  x
(1) a. the ma  na  ger's   here       b.  il po po lo
In this way, the observation that Italian syllables are more or less
isochronous is incorporated in the representation of rhythm.
Since, however, the length of a syllable depends crucially on the number
of segments it contains, both in Italian and in English, and since the number
of segments per syllable can vary in Italian, though less than in English,
the representation proposed by Selkirk for Italian is not a reflection of
physical reality. The results of an experiment described in den Os (1988)
indicate, in addition, that representing the timing of "stress-timed" and
        x
     x  x  x  x
(2)  il po po lo
prominence. The application of the rule is illustrated in (3), where " / "
marks word primary stress. 7
It has, in addition, been shown that the domain within which the rules
apply is identical in the two languages (cf. Selkirk, 1978, Nespor and Vogel,
1982, 1986). As shown in (5) and (6) for Italian and English, respectively,
this domain coincides with the phonological phrase [φ]. In (7) and (8),
it is shown that the rule does not apply across phonological phrases. The
analysis into φ's is made according to the rules of phonological phrase
formation and restructuring proposed in Nespor and Vogel (1986).8
It is clear that the rule of English and the rule of Italian are very similar
and that they apply in order to create a more alternating pattern of stressed
and unstressed syllables. This motivation is a very natural one for a language
that, like English, is supposed to be stress-timed. However, if Italian
were indeed to be syllable-timed, that is, have a rhythm based on the
succession of identical syllables rather than on the alternation of stressed
and stressless syllables, the existence of the rhythm rule just discussed
would be quite unexpected. It is, in fact, difficult to find a reason for its existence,
but if one could imagine such a reason, it would still be surprising to
have one and the same rule triggered in two different ways and with different
motivations in the two languages.
It seems much more natural to assume that if one rule applies in the
same way in two languages, its motivation is also the same. In our specific
case, then, Italian would also have a tendency towards the alternation of
stresses. Since for a "syllable-timed" language there is no reason to have
alternating rather than adjacent stresses, we may once more draw the
conclusion that the fact that most Italian syllables are similar in length
has nothing to do with the language's temporal organisation. Rather,
rhythm in Italian, as well as in English, is an accentual phenomenon.
That is, the one object that must recur at regular intervals to establish
"order in movement" is stress.
The rhythm rule has, in addition, been proposed to account for similar
facts in German (Kiparsky, 1966), Dutch (Schultink, 1979, Kager and Visch,
1983), Finnish (Hayes, 1981), Polish (Hayes and Puppel, 1985), French
(Dell, 1984), Canadian French (Phinney, 1980), Brazilian Portuguese
(Major, 1985), Tiberian Hebrew (McCarthy, 1979), Dari (Bing, 1980),
Passamaquoddy (Stowell, 1979) and Catalan (Nespor and Vogel, 1989). As
Hayes and Puppel (1985) suggest, it might very well be the exception for
stress languages not to have the rhythm rule (cf. also Nespor and Vogel,
1989, where it is proposed that the rules that ensure that stress clashes
are eliminated are part of universal grammar).
While with adjacent stresses the rule applies at all rates of speech, however,
in the case in which the two stresses are not adjacent the rule is gradient
in application, that is, its likelihood to apply increases as speech becomes
faster (cf. Hayes, 1984). On the basis of these data, it is proposed in Nespor
and Vogel (1989) that there is a parameter in the phonology of rhythm
that accounts for the different behavior of stress-timed English and syllable-
timed Italian in the definition of the configuration that constitutes a stress
clash. In particular, it is proposed that what counts as adjacent differs
in the two groups of languages: strict adjacency would be required in
Italian and more generally in syllable-timed languages; instead, two stressed
syllables would be considered adjacent in stress-timed languages, even
though one unstressed syllable intervenes to separate them.
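The parametric proposal just summarised can be pictured with a minimal sketch. It is an illustration only, not the grid-based formulation of Nespor and Vogel (1989): the function `stress_clashes` and its parameter `max_gap` are hypothetical names, and syllables are reduced to a stressed/unstressed flag. Strict adjacency corresponds to `max_gap=0`, and the value attributed to stress-timed languages to `max_gap=1`.

```python
from typing import List, Tuple


def stress_clashes(stresses: List[bool], max_gap: int) -> List[Tuple[int, int]]:
    """Return index pairs of consecutive stressed syllables that clash.

    stresses: one flag per syllable, True if the syllable is stressed.
    max_gap:  number of unstressed syllables that may intervene between
              two stresses that still count as adjacent (hypothetical
              encoding of the parameter value).
    """
    stressed = [i for i, s in enumerate(stresses) if s]
    return [(i, j) for i, j in zip(stressed, stressed[1:])
            if j - i - 1 <= max_gap]


# Two linearly adjacent stresses clash under either setting.
adjacent = [False, True, True, False]
# Two stresses separated by one unstressed syllable clash only if max_gap >= 1.
separated = [True, False, True]
```

Under this encoding `stress_clashes(separated, 0)` is empty while `stress_clashes(separated, 1)` contains the pair `(0, 2)`, which is exactly the difference the parameter is meant to capture.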
In this section, I will argue, contra Nespor and Vogel (1989), that the
difference between English and Italian is to be found in their nonrhythmic
phonological systems, rather than in their rhythmic component (cf. also
Nespor, 1990). The data that reveal that rhythm functions in the two
languages in a way more similar than previously thought come from
Northern Italian. Besides the contexts of application of the rule exemplified
in (4), there are other cases in which it applies, although the syllables
bearing the clashing stresses are not linearly adjacent, as shown in (10).
What these two examples have in common is that the (italicised) weak
syllable that separates the stressed syllables consists exclusively of one vowel.
Any longer intervening syllable blocks the application of the rule, as shown
in (11).
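The generalisation about the Northern Italian data can be stated as a single condition on the intervening material, sketched below. This is only an illustration: the syllables are schematic strings rather than the examples in (10) and (11), the vowel inventory `VOWELS` is a simplification, and the function names are hypothetical. Two stresses are treated as clashing if they are linearly adjacent or if the only syllable between them consists of exactly one vowel.

```python
from typing import List, Tuple

VOWELS = set("aeiou")  # simplified inventory, an assumption of this sketch


def is_single_vowel(syllable: str) -> bool:
    """True if the syllable consists of exactly one vowel segment."""
    return len(syllable) == 1 and syllable in VOWELS


def clashes(syllables: List[Tuple[str, bool]]) -> List[Tuple[int, int]]:
    """syllables: (segments, stressed) pairs. Return index pairs of
    stressed syllables that are adjacent, or separated only by one
    syllable made up of a single vowel."""
    stressed = [i for i, (_, s) in enumerate(syllables) if s]
    result = []
    for i, j in zip(stressed, stressed[1:]):
        between = syllables[i + 1:j]
        if not between or (len(between) == 1
                           and is_single_vowel(between[0][0])):
            result.append((i, j))
    return result


# Schematic inputs, not real words.
vowel_only = [("ta", True), ("a", False), ("ta", True)]
longer = [("ta", True), ("la", False), ("ta", True)]
```

With these schematic inputs `clashes(vowel_only)` returns `[(0, 2)]`, whereas `clashes(longer)` returns an empty list, since the intervening syllable contains a consonant.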
These data are interesting in light of the observation made by Hayes that,
in English, a syllable intervening between two clashing prominences should
be short in order for a rhythmic readjustment to take place (1984:70).
Stress clashes are not the only type of arhythmic configurations that may
arise when words are strung together in a sentence. Another type of
rhythmically ill-formed configuration is an "overlong" sequence of weak
positions, the so-called stress lapse (Selkirk, 1984:49). As is observed in
Selkirk (1984), a lapse is eliminated in both English and Italian (cf. also
Roca, 1986, Nespor and Vogel, 1989). Since English is stress-timed, it is
quite clear why its rhythmic component has a rule that adds a prominence
to a certain position in a lapse: its effect is that of producing alternation.
But what about the addition of a prominence in stress lapses in Italian,
exemplified in (13)?
Once more, if rhythm were parametric and based, in Italian, on the regular
succession of syllables rather than on the alternation of stressed and
stressless syllables, there would not be any reason why a sequence of
unstressed syllables like, for example, the ones italicised in (13), should
not be well-formed in Italian. If, however, rhythm is not parametric, but
is based on alternation in Italian, as well as in English, their similar
behaviour in the elimination of stress lapses is exactly what we expect.
Of course, this is not to say that the physical realisation of the added
prominence is the same: rather, it reflects the different nature of stress
in the two languages (cf. section 1).
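The elimination of lapses can likewise be sketched in a few lines. The sketch rests on assumptions that the text does not fix: a lapse is taken to be a run of at least `min_len` unstressed syllables (three by default), and the added prominence is placed in the middle of the run. Both choices, like the function names, are hypothetical and serve only to show the alternating output that the rule produces.

```python
from typing import List, Tuple


def find_lapses(stresses: List[bool], min_len: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, of maximal runs of
    unstressed syllables whose length is at least min_len."""
    lapses, start = [], None
    # A final True acts as a sentinel that closes a run at the end.
    for i, stressed in enumerate(list(stresses) + [True]):
        if not stressed and start is None:
            start = i
        elif stressed and start is not None:
            if i - start >= min_len:
                lapses.append((start, i))
            start = None
    return lapses


def add_beats(stresses: List[bool], min_len: int = 3) -> List[bool]:
    """Add a prominence in the middle of every lapse until none remains.
    The middle position is an arbitrary choice made for this sketch."""
    out = list(stresses)
    lapses = find_lapses(out, min_len)
    while lapses:
        for start, end in lapses:
            out[(start + end) // 2] = True
        lapses = find_lapses(out, min_len)
    return out


# Two stresses separated by three unstressed syllables.
pattern = [True, False, False, False, True]
```

Applied to `pattern`, `add_beats` returns a strictly alternating sequence, and the same function applies whether the input is taken to represent English or Italian.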
4. CONCLUSIONS
FOOTNOTES
* I would like to thank Iggy Roca for his comments on some of the ideas presented in
this paper and Joan Mascaro for discussions on Catalan phonology and for commenting
on a previous version of this paper and offering suggestions for improvements.
1. Strictly speaking, while the analogy between the sound of a machine-gun and a rhythm
with isochronous recurrence of events of sorts is appropriate, the same cannot be said for
the analogy between a message in morse code and any type of isochrony. There is, in fact,
no regular recurrence of events in the sequence of dots and dashes of a message in morse
code.
2. The arguments I will present are against two different claims made by T1: a) that there
are only two types of rhythmic organisation in language and, b) that the type of rhythm
according to which a language is temporally organised triggers a number of nonrhythmic
phonological properties. These two claims are not logically related. Since, however, they
are treated as strictly connected within T1, I will not always separate the arguments against
one from the arguments against the other.
3. The studies mentioned in this paper by no means exhaust the literature on isochrony.
For a more complete survey of the literature, cf. den Os (1988).
4. I am using here the term syllable in a by now traditional way: all consonants that appear
in the surface are included in a syllable whose nucleus is also present on the surface. This
is by no means an unquestioned assumption (cf. Kaye, Lowenstamm and Vergnaud, 1987).
5. It must be noted that the number of stressless syllables that may occur in between two
stressed ones cannot vary very much, since if the sequence of stressless syllables is long
enough to constitute a stress lapse, a stress is added to remedy this arhythmic configuration
(cf. among others, Selkirk, 1984, and section 3.4 below).
6. Jerzy Rubach has pointed out to me that traditional Polish linguists and phoneticians
also consider Polish a stress-timed language.
7. Although at present, I believe that a rule that deletes a prominence (Beat Deletion) plus
a rule that adds a prominence (Beat Addition) account for this phenomenon better than
a rule that moves a prominence (cf. Nespor and Vogel, 1988 or Nespor and Vogel, 1989
for an extended proposal about the rules of rhythm), I omit a discussion of this proposal,
since it is not crucial to the point being made here. For different accounts of these facts,
the reader is referred to Liberman and Prince, 1977, Prince, 1983, Selkirk, 1984, Hayes,
1984, Nespor and Vogel, 1988, 1989, among others.
8. In grid terms, the domain of Beat Deletion is derivable from the definition of what
constitutes a minimal clash in a given language. For reasons of space, I will not discuss
this analysis here, although at present I believe it is the most adequate account of the facts.
The interested reader is referred to Nespor and Vogel, 1989.
REFERENCES
Bolinger, D. 1965. Pitch Accent and Sentence Rhythm. In Forms of English: Accent, Morpheme,
Order. Cambridge, Massachusetts: Harvard University Press.
Bortolini, U. 1976. Tipologia sillabica dell'Italiano. Studio statistico. In R. Simone, U. Vignuzzi
and G. Ruggiero (eds.) Studi di Fonetica e Fonologia. 5-22. Roma: Bulzoni.
Borzone de Manrique, A. M. and A. Signorini. 1983. Segmental durations and the rhythm
in Spanish. Journal of Phonetics 11.117-128.
Classe, A. 1939. The Rhythm of English Prose. Oxford: Blackwell.
Cutler, A. 1980a. Syllable omission errors and isochrony. In H.W. Dechert and M. Raupach
(eds.) Temporal Variables in Speech. 183-190. The Hague: Mouton.
Cutler, A. 1980b. Errors of stress and intonation. In V.A. Fromkin (ed.) Errors in Linguistic
Performance: Slips of the Tongue, Ear, Pen and Hand. 67-80. New York: Academic Press.
Darwin, C. and A. Donovan. 1980. Perceptual studies of speech: isochrony and intonation.
In J. Simon (ed.) Proceedings of NATO ASI on spoken language generation and understanding.
77-85. Dordrecht: Reidel.
Dasher, R. and D. Bolinger. 1982. On pre-accentual lengthening. Journal of the International
Phonetic Association 12. 58-69.
Dauer, R. 1983. Stress-timing and syllable-timing reanalysed. Journal of Phonetics 11. 51-62.
Dell, F. 1984. L'accentuation dans les phrases en français. In François Dell, Daniel Hirst
and Jean-Roger Vergnaud (eds.) Forme Sonore du Langage. Paris: Hermann.
den Os, E. 1988. Rhythm and Tempo in Dutch and Italian, a contrastive study. Doctoral
dissertation, Utrecht.
Donovan, A. and C. Darwin. 1979. The perceived rhythm of speech. In Proceedings of the
Ninth International Congress of Phonetic Sciences. 268-274. Copenhagen.
Hayes, B. 1980. A Metrical Theory of Stress Rules. Doctoral dissertation, MIT (IULC, 1981).
Hayes, B. 1984. The phonology of rhythm in English. Linguistic Inquiry 15. 33-74.
Hayes, B. to appear. The prosodic hierarchy in meter. In P. Kiparsky and G. Youmans
(eds.) Proceedings of the 1984 Stanford Conference on Meter. New York: Academic Press.
Hayes, B. and S. Puppel. 1985. On the Rhythm Rule in Polish. In H. van der Hulst and
N. Smith (eds.) Advances in Nonlinear Phonology. 59-81. Dordrecht: Foris.
Kaye, J., J. Lowenstamm and J.-R. Vergnaud. (to appear). Konstituentenrektion und Rektion
in der Phonologie. In H. Prinzhorn (ed.) Phonologie. Wiesbaden: Westdeutscher Verlag.
Kager, R. and E. Visch. 1983. Een Metrische Analyse van Ritmische Klemtoonverschijnselen.
M.A. Thesis, Utrecht.
Kiparsky, P. 1966. Über den deutschen Akzent. Studia Grammatica 7. 69-98.
Lea, W. A. 1974. Prosodic Aids to Speech Recognition: IV. A General Strategy for
Prosodically-guided Speech Understanding. Univac Report. No. PX10791, Sperry Univac,
DSD, St. Paul, Minnesota.
Lehiste, I. 1973. Rhythmic units and syntactic units in production and perception. Journal
of the Acoustical Society of America 54. 1228-34.
Lehiste, I. 1977. Isochrony Reconsidered. Journal of Phonetics 5. 253-263.
Liberman, M. and A. Prince. 1977. On stress and linguistic rhythm. Linguistic Inquiry 8.
249-336.
Lloyd, James A. 1940. Speech Signals in Telephony. London.
Maia, E. A. D. M. 1981. Hierarquias de constituentes en fonologia. Anais do V Encontro
Nacional de Lingüistica. 260-289. Pontificia Universidade Católica, Rio de Janeiro.
Major, R. C. 1981. Stress-timing in Brazilian Portuguese, Journal of Phonetics 9. 343-351.
Major, R. C. 1985. Stress and Rhythm in Brazilian Portuguese. Language 61. 259-282.
Mascaró, J. 1976. Catalan Phonology and the Phonological Cycle. Doctoral dissertation, MIT
(IULC, 1978).
Mascaró, J. 1989. On the Form of Segment Deletion and Insertion Rules. Probus 1. 31-62.
McCarthy, J. 1979. Formal Problems in Semitic Phonology and Morphology. Doctoral
dissertation, MIT.
Nespor, M. 1990. On the Separation of Prosodic and Rhythmic Phonology. In S. Inkelas
and D. Zec (eds.) The Phonology-Syntax Connection. 243-258. CSLI. Chicago: The University
of Chicago Press.
Nespor, M. and I. Vogel. 1979. Clash Avoidance in Italian. Linguistic Inquiry 10. 467-482.
Nespor, M. and I. Vogel. 1982. Prosodic domains of external sandhi rules. In H. van der
Hulst and N. Smith (eds.) The Structure of Phonological Representations. Part I. 225-255.
Dordrecht: Foris.
Nespor, M. and I. Vogel. 1986. Prosodic Phonology. Dordrecht: Foris.
Nespor, M. and I. Vogel. 1988. Arhythmic sequences and their resolution in Italian and
Greek. Constituent structure. Papers from the 1987 GLOW Conference. Annali di
Ca'Foscari, University of Venezia.
Nespor, M. and I. Vogel. 1989. On clashes and lapses. Phonology 6. 69-116.
O'Connor, J. D. 1965. The Perception of Time Intervals. Progress Report 2. 11-15. Phonetics
Laboratory, University College, London.
Phinney, M. 1980. Evidence for a Rhythm Rule in Quebec French. NELS 9.
Pike, K. 1945. The Intonation of American English. Ann Arbor, Michigan: University of
Michigan Press.
Plato, The Laws. Loeb Classical Library. Cambridge, Massachusetts: Harvard University
Press, 1926.
Prince, A. 1983. Relating to the Grid. Linguistic Inquiry 14. 19-100.
Roach, P. 1982. On the distinction between "stress-timed" and "syllable-timed" languages.
In D. Crystal (ed.) Linguistic Controversies. London: Edward Arnold.
Roca, I. 1986. Secondary stress and metrical rhythm. Phonology Yearbook 3. 341-370.
Rubach, J. and G. E. Booij. 1985. A Grid Theory of Stress in Polish. Lingua 66. 281-319.
Schultink, H. 1979. Reacties op "Stress Clash". Spektator 8.5. 195-208.
Scott, D. R., S. D. Isard and B. de Boysson Bardies. 1985. Perceptual isochrony in English
and French. Journal of Phonetics 13. 155-162.
Selkirk, E. O. 1978. On Prosodic Structure and its Relation to Syntactic Structure. Paper
presented at the Conference on Mental Representation in Phonology. IULC, 1980.
Selkirk, E. O. 1984. Phonology and Syntax: the Relation between Sound and Structure.
Cambridge, Massachusetts: MIT Press.
Shen, Y. and G. G. Peterson. 1962. Isochronism in English. University of Buffalo Studies
in Linguistics. Occasional Papers 9. 1-36.
Stowell, T. 1979. Stress Systems of the World, Unite! MIT Working Papers in Linguistics
1. 51-76.
Wheeler, M. 1979. Phonology of Catalan. Oxford: Blackwell.
Dependencies in the Lexical Setting of
Parameters: a solution to the
undergeneralisation problem*
Mark Newson
University of Essex
some lexical learning but this view claims that this is all there is. Second,
it is conceptually simpler to locate language variation in the lexicon. One
result of this is that in a suitably abstract sense there is only one language;
all language variation can be put down to mere lexical differences. Moreover,
if such variation represents idiosyncratic properties of individual languages
then the lexicon is the rightful place to store such information. Finally,
storing all information about variability in the behaviour of lexical items
in the same place may make parsing less complex and thus the proposal
that all parameters are lexical may have computational advantages too.
The view that all parameters are lexical has been termed by Wexler
and Manzini (1987) the Lexical Parameterisation Hypothesis.
However, despite the empirical and theoretical support that the Lexical
Parameterisation Hypothesis receives, its adoption does lead to some major
problems. Although the Lexical Parameterisation Hypothesis may be the
minimal learning theory, it is not in all cases the most obvious one and,
in certain instances, it is positively counter-intuitive. For instance, take
word order parameters, however these are to be construed.1 Rather than
these being set once and for all for the language as a whole, if these
parameters are lexical they will have to be set for each and every individual
lexical item of the language. Obviously, this increases the amount of learning
that we suppose a child must do, perhaps beyond tolerable limits if we
consider that all parameters must be thus set. Furthermore, it is highly
counter-intuitive to suppose that each token of, say, a verb must be presented
so that the learner will know that each is head initial, for example. This
does not fare well with the fact that if native speakers are presented with
a newly invented verb, they will automatically know that that verb is head
final or head initial, depending on how this parameter is set for their
language. With respect to such "creativity" the traditional view is more
intuitive, as one instance of a head initial verb would be sufficient to
allow the learner to generalise this information to all other lexical items.
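The contrast between the two views can be made concrete with a toy sketch. It is only an illustration under simplifying assumptions: the class names, the single binary head-direction parameter, and the nonce verb "blick" are all hypothetical and are not part of the proposals discussed here.

```python
# Toy comparison of two ways of storing a head-direction parameter.
# All names are illustrative assumptions, not part of the original proposal.

class GlobalLearner:
    """Traditional view: one setting for the language as a whole."""

    def __init__(self):
        self.head_direction = None

    def observe(self, verb, direction):
        # A single datum fixes the value for every lexical item.
        self.head_direction = direction

    def direction_of(self, verb):
        return self.head_direction


class LexicalLearner:
    """Strict lexical parameterisation: one setting per lexical item."""

    def __init__(self):
        self.settings = {}

    def observe(self, verb, direction):
        # Each datum fixes the value for the observed item only.
        self.settings[verb] = direction

    def direction_of(self, verb):
        # Items that have never been observed have no value at all.
        return self.settings.get(verb)


global_learner = GlobalLearner()
lexical_learner = LexicalLearner()
for learner in (global_learner, lexical_learner):
    learner.observe("read", "head-initial")

# "blick" stands in for a newly invented verb.
novel_global = global_learner.direction_of("blick")
novel_lexical = lexical_learner.direction_of("blick")
print(novel_global)   # head-initial
print(novel_lexical)  # None
```

On this sketch the global learner extends one observation to the invented verb, whereas the strictly lexical learner has nothing to say about it, which is the intuition behind the "creativity" objection.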
This last point raises a second problem with the Lexical Parameterisation
Hypothesis. Consider, again, word order parameters. It appears to be the
case that languages in general either have head initial or head final verbs
(or, more generally, it may be possible to characterise a whole language
as being basically head initial or head final). If word order parameters
are set for each individual lexical item, we might expect to find some
languages in which some verbs are head initial and some are head final.
There are, to my knowledge, no such languages.2
In the main, it is very difficult to account for intra-language generalisations
if we assume that the mechanisms which were originally proposed
to capture cross-language variance are also responsible for variation within
languages. Safir (1987) has referred to this problem as the undergeneralisation
problem.
Danish1
a. ...at [Peterj sa [Johnsj fem billeder af ham,/j]]
that P saw J's five pictures of him
b. ...at [Peterj bad Johnj om [PROj at ringe til hanvj]]
that P asked J for to ring to him
c. *...at [Peter_i fortalte Anne om ham_i]
that P told A about him
d. ...at [Peter_i sa [Johns fem billeder af sig_i]]8
that P saw J's five pictures of self
e. ...at [Peter_i hørte [Anne omtale sig_i]]
that P heard A mention self
f. *John_i sagde at [Peter kritiserer sig_i meget ofte]
J said that P criticises self very often
(4) Norwegian9
a. De_i leste [mine klager mot dem_i]
they read my complaints against them
b. *De_i leste [klager mot dem_i]
they read complaints against them
c. Knut; ba Olaj [PROj korrigere s e g j
K asked O to-correct self
d. *Ola_j vet [vi beundrer seg_j]
O knows we admire self
(5) Icelandic10
a. Jon_i segir að [Maria elski hann_i]
J said that M love him
[subj]
(6) Japanese
a. J o h n r w a kare^-ni tsite-no Billj-no hon -o
J TOP he DAT about B GEN book OB
yonda
read
'John read Bill's book about him'
b. J o h n r w a kare r ni tsite-no hon -o yonda
J TOP he DAT about book OB read
'John read a book about him'
c. J o h n r w a zibunj/j-ni tsite-no Billj-no hon -o yonda
J TOP self DAT about B GEN book OB read
'John read Bill's book about self
d. J o h n r w a Billj-ga zibunj/j-o semeta to itta
J TOP B SUB self OB blamed that said
'John said that Bill blamed self
It is therefore the case that languages may have lexical items which take
governing categories which can include or be included within those of
other lexical items.
It is interesting to consider what this means for the distribution of
pronominals and anaphors in a language. There are three possible conditions.
If anaphors and pronominals select the same value of the Governing
Category Parameter, then they will be in complementary distribution, given
that anaphors must be bound and pronominals free within the same domain.
If the anaphors have a governing category which includes that of the
pronominals, then they will have overlapping distributions as the domain
within which the anaphors must be bound will extend beyond that within
which pronominals must be free, hence there will be a domain in which
either can be bound. Finally, if the pronominals have a governing category
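(8) [Diagram: the values (a) to (e) of the Governing Category Parameter listed in two columns, one for anaphors and one for pronominals, with vertical arrows for the two markedness hierarchies and horizontal arrows running from the anaphor column to the pronominal column.]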
In (8), the vertical arrows represent the markedness hierarchies for anaphors
and pronominals and the horizontal arrows represent the Lexical Dependency
operating from anaphor to pronominal.
What we have so far is a rather neat way of accounting for one generalisation
concerning the setting of the Governing Category Parameter. We do not,
as yet, have any other evidence to support this Lexical Dependency, nor
do we have any support for the idea that Lexical Dependencies are in
general operation for setting parameters, thereby offering a potential
solution to other undergeneralisations. However, if support can be found
for this particular Lexical Dependency, then it would be reasonable to
assume that Lexical Dependencies are generally available as parameter
setting devices, as it would be odd in the extreme for such a device to
be only available for the setting of the Governing Category Parameter.
In this section, support for this particular Lexical Dependency will be
presented.
The first piece of evidence concerns a generalisation about the values
of the Governing Category Parameter which both anaphors and pronominals
tend to select. If it is the case that markedness hierarchies of
parameters have any influence over which values are selected within
languages,13 we might expect that there would be a tendency toward the
selection of unmarked values. Thus, as far as the Governing Category
Parameter is concerned, we might expect that anaphors, in general, would
favour the selection of value (a) and pronominals that of value (e). These
expectations are borne out for the case of anaphors, with the majority
of these selecting their unmarked value. However, the same is not true
for pronominals and in fact most pronominals tend to select value (a),
their most marked value. There are very few cases of pronominals which
select their unmarked value; Manzini and Wexler (1987) present only one.
Obviously, the question raised here is: why should this be so? Note
that we cannot take this to mean that anaphors and pronominals should
properly be considered as having the same markedness hierarchy. After
all, the markedness hierarchies that have been proposed for anaphors and
pronominals are based on learnability arguments under some fairly reasonable
assumptions. If these hierarchies are not as proposed, the
Governing Category Parameter should be impossible to set for any lexical
item. Therefore we cannot simply reject the suggested pronominal markedness
hierarchy, even though the data concerning the values they tend
to select seem to suggest otherwise.
There is a very simple solution to this puzzle, however, which follows
directly from the Lexical Dependency suggested above. If it is the case
that pronominals are dependent for their parameter values on anaphors,
then the markedness hierarchy for anaphors will impose a restriction on
the values that pronominals can select. Simply put, if most anaphors tend
to select value (a) of the Governing Category Parameter, the Lexical
Dependency will force most pronominals to select this value. Given that
value (a) is most marked for pronominals, no further learning can take
place for these and thus most pronominals will end up with this value.
Furthermore, for a pronominal to be able to select its least marked value,
i.e. (e), will be dependent on the relevant anaphor also selecting this value.
This value is most marked for anaphors and therefore, presumably, least
likely to be selected by them. Further still, even if an anaphor were to
select its most marked value, thereby enabling the counterpart pronominal
to select the same, there is still the possibility of further learning for the
pronominal, with any value from (e) to (a) available for selection. We
can see, then, that given the Lexical Dependency, it is entirely expected
that pronominals should tend to select their most marked value and not
their least marked one.
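The reasoning in this paragraph can be sketched in a few lines of code. The sketch rests on assumptions that simplify the discussion above: the five values are treated as an ordered list, the dependency is taken to start the pronominal at whatever value its anaphor has selected, and further learning is taken to move only toward (a), the pronominal's most marked value. The function name is hypothetical.

```python
# Values of the Governing Category Parameter, ordered from (a) to (e).
# (a) is unmarked for anaphors; (e) is unmarked for pronominals.
VALUES = ["a", "b", "c", "d", "e"]


def pronominal_options(anaphor_value):
    """Values a pronominal could end up with, given its anaphor's value.

    Assumption: the Lexical Dependency starts the pronominal at the
    anaphor's value, and positive evidence can only move it toward (a).
    """
    start = VALUES.index(anaphor_value)
    # From the inherited value down to (a), in the order learning would proceed.
    return VALUES[start::-1]


print(pronominal_options("a"))  # ['a']
print(pronominal_options("e"))  # ['e', 'd', 'c', 'b', 'a']
```

When the anaphor has its unmarked value (a), the pronominal has no room for further learning. Only when the anaphor takes its most marked value (e) does the pronominal's unmarked value become available, and even then it is one option among five.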
There are a number of other places where the anaphor markedness
hierarchy seems to dominate pronominal behaviour quite contrary to
expectation. Each of these can be taken as empirical support for the Lexical
Dependency which would lead us to expect this situation. For example,
take the case of empty categories. Under standard assumptions, these are
seen as having pronominal and anaphoric features and thus come under
the restrictions of the binding theory. When we look into the question
of which values of the binding parameters empty categories conform to,
we find that most seem not to be parameterised at all; i.e. their behaviour
is the same in any language which has them. For example, the trace of
a moved NP, standardly considered as a pure anaphor, seems to conform
to value (a) of the Governing Category Parameter in all languages. There
FOOTNOTES
* I wish to thank Martin Atkinson, Annabel Cormack and Iggy Roca for invaluable comments
and suggestions concerning previous drafts of this paper.
1. One proposal concerning word order parameters, put forward by Huang (1982), is that
complements are ordered with respect to their heads by a head final/head initial parameter.
Another suggestion is that word order falls out from parameters determining the direction
of Case and theta role assignment (Koopman (1984), Travis (1984) and Fukui (1986)).
2. In actual fact, there are languages which have both prepositions and postpositions; for
example, German, which is basically prepositional, has at least two postpositions; entlang
'along' and gegenüber 'opposite' (thanks to Mike Jones for pointing these out to me) and
similarly Persian, another overwhelmingly prepositional language, has one postposition ra,
which is used for direct objects (Comrie (1981)). This, perhaps, indicates that word order
parameters for these items are set individually. However, a closer investigation of the properties
of these words would need to be undertaken before a strong claim about this can be made.
It is also interesting to find that such a phenomenon seems only to affect adpositions and
never verbs. This observation obviously warrants investigation.
3. This is similar to an idea proposed by Huang (1982) who claimed that an "unmarked"
setting of the head final/head initial parameter is one where all lexical categories conformed
to the same setting. More marked situations were possible where certain lexical categories
would have the value for this parameter changed on the basis of positive evidence.
4. The term "proper antecedent" is a little misleading in connection with pronominals given
that what is meant is that such elements cannot be co-referential with a pronominal within
its governing category and hence are not antecedents at all. However, the term is convenient
and far less awkward than Yang's (1983) more accurate term "disjoint reference target".
5. This is obviously a much simplified definition of the governing category than the one
that is required to capture the true distribution of anaphors and pronominals and also such
elements as PRO. For a more accurate but complex version of the parameter, the reader
is directed to Manzini and Wexler (1987) where such issues are addressed. However, for
the purpose of the present paper, the simplified parameter will suffice.
unmarked Proper Antecedent Parameter value) is not incompatible with pronominal behaviour,
as Manzini and Wexler point out. However, it is also true that such a statement
does not capture the restriction placed on pronominals that they conform to at most only
one value (a).
REFERENCES
Wexler, K. and Y-C. Chien. 1985. The Development of Lexical Anaphors and Pronouns.
Papers and Reports on Child Language Development 24. Stanford University: Stanford University
Press.
Wexler, K. and M. R. Manzini. 1987. Parameters and Learnability in Binding Theory. In
T. Roeper and E. Williams (eds.) Parameter Setting. 41-76. Dordrecht: Reidel.
Yang, D-W. 1983. The Extended Binding Theory of Anaphors. Language Research 19. 169-192.
The Nature of Children's Initial
Grammars of English
Andrew Radford
University of Essex
1. INTRODUCTION
It is widely held that we first have clear evidence that a child has developed
an initial grammar of his native language during the period of early patterned
speech, when the child shows evidence of being able to combine words
together productively to form systematic structures - a period which
typically lasts from around 20 to 24 (±20%) months of age (cf. e.g. Goodluck
1989). The nature of children's initial grammars is of particular interest
because this is the point at which the child has accumulated minimal
linguistic experience, and is thus the point at which the contribution made
by Universal Grammar to the child's linguistic development might therefore
seem to be most readily observable (albeit indirectly). In addition, children's
initial grammars provide an obvious testing-ground for maturational
theories of language acquisition (such as that proposed by Borer and Wexler
1987) which hold that different principles and parameters may come 'on
line' at different stages of linguistic maturation.
In this paper, I shall suggest that early child grammars of English differ
radically from adult grammars in two interesting and inter-related respects.
Firstly, whereas adult phrases and sentences are projections of both lexical
and functional categories, child phrase and sentence structures are projections
of the four primary lexical categories (Noun, Verb, Adjective, and
Preposition), and thus lack functional categories altogether. Secondly,
whereas adult phrases and sentences contain both thematic and nonthematic
constituents, their child counterparts are purely thematic structures (in
the sense defined below). We can represent what I am saying in diagrammatic
terms by positing that all phrases and clauses produced by young
children will be lexical-thematic structures of the canonical form (1) below
(where X, Y, and Z are lexical categories):
(1) [XP [specifier, θ-marked by X'] [X' [X lexical θ-marking head] [complement, θ-marked by X]]]
(2) [VP [NP Daddy (AGENT)] [V' [V read] [NP book (PATIENT)]]]
The whole structure would thus be a Verb Phrase (or verbal Small Clause):
it would be a lexical structure in that it comprises only projections of
the head lexical categories N and V. It would also be a thematic structure
in the sense that the V read theta-marks its sister constituent book, the
V-bar read book theta-marks its sister constituent Daddy, the NP book
is theta-marked by its sister V read, and the NP Daddy is theta-marked
by its sister V-bar read book (VP is not theta-marked, but is not required
to be as it is a root constituent and so has no sisters).
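The theta-marking relations just described can be read off the tree mechanically. The following is a minimal sketch, assuming a tuple encoding of the structure for Daddy read book and restricting theta-markers to V and V-bar; the encoding and the function names are illustrative only.

```python
# A node is (label, left, right) if branching, or (label, word) if a leaf.
TREE = ("VP",
        ("NP", "Daddy"),
        ("V'", ("V", "read"), ("NP", "book")))


def words(node):
    """Return the words dominated by a node, in left-to-right order."""
    if isinstance(node[1], str):
        return [node[1]]
    return [w for child in node[1:] for w in words(child)]


def theta_marking(node):
    """List (marker, marked) pairs: each V or V' theta-marks its sister."""
    if isinstance(node[1], str):
        return []
    left, right = node[1], node[2]
    pairs = []
    for marker, sister in ((left, right), (right, left)):
        # Only non-maximal verbal projections theta-mark in this sketch.
        if marker[0] in ("V", "V'"):
            pairs.append((" ".join(words(marker)), " ".join(words(sister))))
    return pairs + theta_marking(left) + theta_marking(right)


print(theta_marking(TREE))
# [('read book', 'Daddy'), ('read', 'book')]
```

The root VP appears in no pair, in line with the observation that it has no sister and so is not required to be theta-marked.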
The twin facts that children's phrases and sentences contain (i) lexical
but not functional, and (ii) thematic but not nonthematic constituents are
clearly closely inter-related. Abney (1987: 54 ff.) posits that the essential
difference between lexical and functional categories lies in the fact that
lexical categories have thematic content (by which he presumably means
that non-maximal lexical projections theta-mark any sister constituents
which they have), whereas functional categories do not (hence he refers
to non-functional categories as 'thematic categories'). However, the inter-
relationship between categorial status and thematic status is more complex
than this implies. For instance, some lexical categories do not theta-mark
their sisters: e.g. a single-bar constituent headed by a raising predicate
like seem or likely, or by a passive participle like thought does not theta-
mark its sister (subject) constituent; hence, in a sentence such as:
the suggestion that early child phrases and sentences are purely lexical-
thematic structures echoes earlier ideas in Radford 1986/1987/1988a/1990,
Abney 1987 (p. 64), Guilfoyle and Noonan 1989, Lebeaux 1987/1988,
Kazman 1988, Platzack 1989, and many others.
In the remainder of this paper, I shall present substantial empirical
evidence in support of the lexical-thematic analysis of early child English,
arguing that this provides a correct characterisation of the structure of
both phrasal and clausal structures produced by young children: in section
2, I shall look at typical nominal phrases produced by young children;
in section 3 I turn to examine children's clausal structures; and subsequently
(in section 4) I present an overview of the overall organisation of early
child grammars.
texts where adults would require determinate nominals (i.e. nominals with
a preceding Determiner). This can be illustrated by the spontaneous speech
data in (9) below:
children at this stage do not attach the genitive 's suffix to possessor
nominals, as the data in (11) below illustrate (the (a) examples are from
Bloom 1970, and the (b) examples from Braine 1976):
the differences between adult and child possessive structures support the
hypothesis that adult nominals are functional-nonthematic structures, while
their child counterparts are lexical-thematic structures.
There is a third piece of evidence which we can adduce in support of
the claim that early child nominals are indeterminate. As I noted earlier
in relation to examples such as (4) above, there are a number of reasons
for supposing that so-called 'personal pronouns' are Determiners (so that
we and you function as prenominal D constituents in structures such as
We linguists respect you psychologists, and as pronominal D constituents
in structures like We respect you). This being so, then one should expect
to find that early child English is characterised by the absence of personal
pronouns. In this connection, it is interesting to note the observation by
Bloom et al. (1978) that young children typically have a nominal style
of speech which is characterised by their nonuse of case-marked pronouns
such as I/you/he/she/we/they, etc. (cf. also the parallel remark by Bowerman
(1973: 109) that in the first stage of their grammatical development
'Seppo and Kendall used no personal pronouns at all'). We can illustrate
this nominal speech style from the transcript of the speech of Allison Bloom
at age 22 months provided in the appendix to Bloom (1973: 233-57), since
Bloom et al. (1978: 237) report that Allison's NPs at this stage were
'exclusively nominal'. Of particular interest to us is the fact that Allison
used nominals in contexts where adults would use pronominals. For
example, in conversation with her mother, Allison uses the nominal
expressions baby, baby Allison, or Allison to refer to herself (where an
adult would use the first person pronouns I/me/my), as we see from the
examples in (13) below:
(13) a. Baby Allison comb hair. Baby eat cookies. Baby eat cookie.
Baby eat. Baby open door. Baby drive truck. Baby ride truck.
Baby down chair.
b. Help baby
c. Allison cookie. Put away Allison bag. Baby cookie. Baby diaper.
Baby back. Wiping baby chin. There baby cup (Allison 22)
(14) a. Mommy open. Mommy help. Mommy pull. Mommy eat cookie
b. Peeking Mommy. Get Mommy cookie. Pour Mommy juice
c. Eat Mommy cookie. Eating Mommy cookie. Mommy lap (Allison 22)
However, the pronoun one arguably has a very different categorial status
from personal pronouns like we. Although (as we have seen) personal
pronouns have the status of pronominal Determiners, the pronoun one
by contrast has the status of a pronominal Noun; the analysis of one as
a pro-N would account for why it takes the Noun plural inflection
in the plural form ones, and why it can be premodified by Adjectives,
as we see from examples such as (15) (cf. Radford 1989 for evidence in
support of the claim that one is a pro-N constituent). Thus, more accurately,
we should say that our lexical-thematic analysis of early child English
predicts that children may have lexical pronouns but not functional pronouns
- e.g. they may have pro-N constituents like one, but not pro-D constituents
like he.2
(16) There is no reason to think that it would be politic for the president
to back down
(17) [CP [C for] [IP [DP [D the] [NP president]] [I to] [VP back down]]]
The structure is a functional one in the sense that the overall clause for
the president to back down is a Complementiser Phrase ( = CP), i.e. a
projection of the head functional C constituent for: moreover, the complement
of for is itself a functional IP constituent ( = the Infinitive Phrase
the president to back down), headed by the functional Infinitiviser to; and
the subject of the Infinitive Phrase is itself a functional DP ( = the Determiner
Phrase the president), headed by the functional D constituent the. Thus,
clauses are clearly functional structures; moreover, it should be apparent
from (17) that they are also nonthematic structures (in the sense defined
above): for example, for, and the are nonthematic constituents; and the
subject DP the president is in a nonthematic position, in the sense that
it is not theta-marked by (i.e. is not a logical argument of) its immediate
sister constituent ( = the I-bar to back down), but rather by its grand niece
( = the V-bar back down).
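The sense in which an adult clause is a functional structure can be checked mechanically from a labelled bracketing. The sketch below is a rough illustration: the regular expression, the set names, and the way the head category is recovered from a label are all assumptions made for the example, and the child bracketing is the Daddy read book structure given earlier.

```python
import re

FUNCTIONAL = {"C", "I", "D"}       # Complementiser, Inflection, Determiner
LEXICAL = {"N", "V", "A", "P"}     # Noun, Verb, Adjective, Preposition

ADULT = "[CP [C for] [IP [DP [D the] [NP president]] [I to] [VP back down]]]"
CHILD = "[VP [NP Daddy] [V' [V read] [NP book]]]"


def category_heads(bracketing):
    """Return the head categories whose projections label the bracketing."""
    labels = re.findall(r"\[([A-Z]+'?)", bracketing)
    heads = set()
    for label in labels:
        head = label.rstrip("'")              # V' -> V
        if len(head) > 1 and head.endswith("P"):
            head = head[:-1]                  # CP -> C, PP -> P
        heads.add(head)
    return heads


adult_heads = category_heads(ADULT)
child_heads = category_heads(CHILD)
print(sorted(adult_heads & FUNCTIONAL))  # ['C', 'D', 'I']
print(sorted(child_heads & FUNCTIONAL))  # []
```

The adult bracketing contains projections of all three functional heads, while the child bracketing contains projections of lexical heads only.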
Having looked briefly at the structure of adult clauses, we can now
turn to look at the structure of the earliest clause types produced by young
children. Typical examples of early child clauses are given in (18) below
(the (a) and (b) examples being from Bloom 1970):
(18) a. Man drive truck ( = 'The man drives the truck', Allison 22)
b. Baby eating cookies ( = 'I am eating the cookies', Allison 22)
c. Wayne taken bubble ( = 'Wayne has taken the bubble container',
Daniel 21)
d. Bear in chair ( = 'The bear is in the chair', Gerald 21)
e. Hand cold ( = 'My hand is cold', Elen 20)
f. Want lady get chocolate ( = 'I want the lady to get the chocolate',
Daniel 23)
but also where the bracketed complement clause lacks an overt subject,
as in (21) below (taken from a longitudinal study of a boy called Daniel):
There are two familiar differences between the adult and child structures.
Firstly, the adult structure is a functional one (since it contains the functional
categories I and D and their projections), whereas the child structure is
a lexical one (in that it contains only the lexical categories V and N and
This being so, then we should expect to find that if early child clauses
do indeed contain no I-system, then they will also lack Modals. Numerous
published studies have commented on the systematic absence of Modals
as a salient characteristic of early child speech: cf. e.g. Brown 1973, Wells
1979, and Hyams 1986. Indeed, this pattern was reported in studies of
imitative speech in the 1960s. For example, Brown and Fraser 1963, Brown
and Bellugi 1964 and Ervin-Tripp 1964 observed that children systematically
omit Modals when asked to repeat model sentences containing them, as
illustrated by the following examples which they provide (ibid.):
The systematic differences between the adult model sentences and their
child counterparts are directly predictable from our hypothesis that early
child grammars lack functional/nonthematic constituents, so that in place
of adult modal IPs children use nonmodal VPs (moreover, they replace
the functional DPs Mr Miller/the book/a cow/the doggy by the lexical
NPs Miller/book/cow/doggy, and likewise have a 'missing' argument in
place of the adult pronominal Determiner I). It should be immediately
apparent that (25)(b) is a lexical-thematic structure of the canonical form
(1) above (save for the fact that it is headed by an intransitive verb which
has no sister complement to theta-mark). The fact that children use
nonmodal VPs in contexts where adults require Modal IPs provides us
with further evidence that the earliest clauses produced by young children
are purely lexical-thematic structures.
Although we have concentrated on modal Auxiliaries here, there is parallel
evidence that children likewise make no productive use of nonmodal
Auxiliaries at this stage. We can illustrate this in terms of examples such
as the following:
Since Tense and Agreement are properties of I, the fact that children at
this stage have not acquired the relevant Tense/Agreement inflections
provides further evidence in support of our claim that they have not yet
acquired I.4 Nonacquisition of finite verb inflections is a characteristic
property of early child English widely reported in the relevant literature
(cf. e.g. Brown and Fraser 1963, De Villiers and De Villiers 1973a, Brown
1973, etc.).
It is perhaps useful to pinpoint the exact nature of the differences between
auxiliariless finite clauses in adult English, and their child counterparts.
Given the assumptions made here, the child's reply Pig say oink in (30)(b)
would have the structure (31)(a) below, whereas its adult counterpart The
pig says oink would have the superficial structure (31)(b):
The differences between the two structures reflect the familiar pattern that
adult clauses are functional-nonthematic structures, whereas their child
counterparts are lexical-thematic structures. Thus, the overall clause has
the status of a functional IP in adult English, but of a lexical VP in early
child English; the verb says carries the Tense/Agreement properties
discharged by I in the adult structure (31)(b), but the child's verb say
in (31)(a) carries no I-inflections for the obvious reason that the child's
grammar has no I-system; the adult subject the pig is a functional DP
which superficially occupies a functional position (as the specifier of the
functional category IP) in (31)(b), whereas its child counterpart is a lexical
NP which superficially occupies a lexical position (as the specifier of the
lexical category VP) in (31)(a).
Having argued that early child clauses lack an I-system, I shall now
turn to argue that early child clauses likewise lack the functional C-system
found in adult English Ordinary Clauses (the structure of which is discussed
in Radford 1988b, chapter 9). We can illustrate the nature of the adult
English C-system in terms of the bracketed complement clause in (32)
below:
Given the arguments in Radford 1988b (section 6.4) that preposed Auxiliaries
are superficially positioned in the head C position of CP, it follows
that a wh-question such as:
Given the assumptions we are making here, it follows that the head C
position of CP can be filled either by a base-generated Complementiser
(for/that/whether/if), or by a finite Auxiliary (can/could/will/has/is/was/
does etc.) preposed into C (from I); and that the specifier position in CP
can be filled by a preposed constituent of some kind (e.g. a preposed
wh-phrase).
Given that Complementisers are both functional and nonthematic constituents,
it follows that our lexical-thematic analysis of early child phrases
and sentences would predict that early child clauses will contain no C-
system whatever. Some evidence which supports this conclusion comes
from the fact that children's complement clauses at this stage are never
introduced by Complementisers like that/for/whether/if: on the contrary,
children's complement clauses at this stage are purely lexical-thematic
structures of the canonical form (1) above, as illustrated by the bracketed
clausal complements of want in (36) below:
The head of the complement clause is usually either a nonfinite Verb (as
in the (a-d) examples), or a Preposition (as in the (e-h) examples). In no
case, however, is the complement clause ever introduced by a Complementiser
- a fact which is clearly consistent with our view that early child
clauses entirely lack a C-system. Moreover, imitative speech data yield
The Nature of Children's Initial Grammars of English 217
much the same conclusion: Phinney 1981 argues that young children
consistently omit Complementisers on sentence repetition tasks.
Given that a second role of the C-constituent in adult speech is to act
as the landing-site for preposed Auxiliaries (e.g. in direct questions), we
should expect that children under two years of age will not show any
productive examples of 'subject-auxiliary inversion' in direct questions.
In fact, early child interrogative clauses show no evidence whatever of
Auxiliaries preposed into C, and more generally lack Auxiliaries altogether.
Typical examples of auxiliariless interrogatives found in early child speech
are given below (examples of yes-no questions from Klima and Bellugi
1966: 200):
(37) Fraser water? Mommy eggnog? See hole? Sit chair? Ball go?
(38) Chair go? Kitty go? Car go? Jane go home? Mommy gone?
(40) a. Bow-wow go? ('Where did the bow-wow go?', Louise 15)
b. Mummy doing? ('What is mummy doing?', Daniel 21)
c. Car going? ('Where is the car going?', Jem 21)
d. Doing there? ('What is he doing there?', John 22)
e. Mouse doing? ('What is the mouse doing?', Paula 23)
Klima and Bellugi conclude of children at this stage that 'They do not
understand this construction when they hear it.' Why should this be? The
answer we suggest here is that early child clauses are purely lexical-thematic
structures which lack a C-system, with the result that children are unable
to parse (i.e. assign a proper syntactic and thematic analysis to) adult
CP constituents. (They have to rely on pragmatic rather than syntactic
competence in order to attempt to assign an interpretation to adult CP
structures.)
Thus far, I have argued that the earliest phrasal and clausal structures
produced by young children are lexical-thematic in nature, and so lack
the functional and nonthematic constituents found in adult English. In
this section, I shall argue that in consequence, children's grammars are
radically different in nature from their adult counterparts.
In section 2, I argued that early child nominals are lexical NP constituents
which lack a functional D-system. If this is so, then we should expect
that children's NPs would lack the functional properties carried in the
adult D-system. One such property which is carried by the D-system in
adult grammars is morphological case. For example, in a sentence such
as:
the case properties of the italicised nominal arguments they and us Americans
are carried in the D-system, in the sense that they are overtly marked
on the nominative pronominal Determiner they, and the objective pre-
nominal Determiner us. Now, if (as I am arguing here) early child grammars
have no D-system, then we should expect to find that they lack the functional
case properties carried by the D-system: in other words, we should expect
to find that early child English is a caseless language. In this connection,
it is interesting that (as we have already noted), children typically have
in early child English (sentences like (5) above provide additional evidence
in support of this claim, in that the complement Noun is in a caseless
position).
An interesting corollary of our claim that early child nominals are caseless
NPs free of case constraints on their distribution is that we should expect
this to be true not only of overt nominal arguments used by young children,
but also of any covert nominal arguments which they may use. In a number
of influential works, Nina Hyams 1986/1987a/1987b/1988/1989 has argued
that apparently subjectless child utterances have (syntactically present but
phonologically null) 'understood' empty subjects, so that (details aside)
a child utterance such as (45)(a) below would have the 'fuller' structure
(45)(b):
But there are also numerous examples of what appear to be null object
sentences, such as:
Not only can the first or second argument of a predicate be null, but
in addition the third argument of a three-place predicate can likewise be
null, as illustrated by examples such as the following (where I use e to
designate the empty third argument; the examples also have 'missing'
subjects):
(52) a. Kendall see Kendall (= 'I can see myself', Kendall 23, from
Bowerman 1973)
b. Betty touch head...touch Betty head (Betty 24)
where the subscripts are binding indices which serve to indicate that the
trace t is bound by the pronominal wh-DP what (i.e. that what is the
antecedent of the trace). However, since binding is a property of a D-
system which children have not acquired at the lexical-thematic stage of
their development, it follows that nominal movement chains (because they
involve binding chains) will not be licensed in early child grammars. The
fact that two entirely different routes lead us to the same conclusion (viz.
that there are no nominal movement chains in early child grammars)
substantially reinforces the plausibility of the conclusion.
Given that movement of nominals from one position to another involves
movement of a maximal projection, an obvious question to ask is whether
the second type of movement which we find in adult grammars (viz.
movement of a head category into another head position, i.e. head-to-head
movement) is licensed in early child grammars. Typical instances of head-
to-head movement are movement from N to D in Arabic (cf. Fassi Fehri
1988), movement from V to I for Auxiliaries in English and Verbs in
French (cf. Pollock 1989), and movement from I to C for preposed
Auxiliaries in English. All instances of head-to-head movement that I am
familiar with involve movement into a head functional category position
(D, I, or C). However, if (as we have argued here), children have no
functional category systems in their initial grammars, then it follows that
there will be no head-to-head movement in early child grammars: empirical
support for this claim comes from our earlier observation that children
have not acquired direct questions with preposed Auxiliaries at this stage.
Thus, the fact that children have neither XP movement (i.e. movement
of maximal projections) nor X movement (i.e. movement of heads) in
their grammars leads us to the overall conclusion that there are no movement
chains of any kind in early child grammars of English. Our conclusion
thus echoes the words of McNeill 1966, who suggests that 'It is not
unreasonable to think of children "talking" base strings directly.'
5. SUMMARY
The overall conclusion to be drawn from this paper is that whereas adult
phrases and sentences are functional structures which may contain non-
thematic constituents, their child counterparts are purely lexical structures
which contain only thematic constituents, and thus conform to the category-
neutral schema (1) above. In consequence of the absence of functional
categories, early child nominals have no D-system, and early child clauses
have no I-system or C-system. Because case is a property of a D-system
not yet acquired, early child grammars are caseless systems, so that there
FOOTNOTES
1. There are cases reported in the acquisition literature of children at this stage using 's
pronominally, but not prenominally - cf. e.g. the following sequence produced by Gia at
age 20 months (from Bloom 1970):
In Radford (1990: 106-8) I discuss such examples, suggesting that 's may be misanalysed
by the child as a nominal proform having much the same status as adult one (save for
the fact that 's encliticises onto a preceding NP).
2. It should be acknowledged that some children do make limited use of demonstrative,
interrogative, and even personal pronouns at this stage (i.e. use items such as this/that/
what/it). However, in Radford (1990: 99-105) I argue that such pronouns have the status
of pronominal Nouns in early child speech, not of pronominal Determiners. From the pro-
N analysis, it follows that while children may use an item like what in a typical Noun position
(viz. as subject or object of a verb), they will not use it in its adult prenominal Determiner
function at this stage, and hence will not produce interrogative nominals like What car?
3. It might be supposed that the negative particles no/not used by young children are functors,
and thus belong to a functional category of some kind (so challenging our assertion that
children have no functional category systems at this stage). However, such negative particles
differ from functional heads in a number of respects. For example, functional heads typically
assign or are assigned functional properties (e.g. D and I can assign case to their specifiers
and C can likewise case-mark its complement, and D can carry case), whereas negative
particles neither assign nor carry functional properties. Moreover, functional (and other)
heads have specific subcategorisation properties (so that e.g. the Complementiser for
subcategorises an IP headed by to, and to in turn subcategorises a VP headed by an infinitival
V), whereas no/not impose no such restrictions on their choice of complement in child English,
so that we find e.g. Mummy not go/going/gone shops. All in all, it seems more plausible
that prepredicate negative particles like no(t) have the structural function of adjuncts to
V-bar in early child English, and may have the categorial status of Adverbs. Preclausal
negatives (as in No Mummy go shops) are probably best analysed as clausal adjuncts (with
no adjoined to the VP Mummy go shops).
4. While children have not acquired the tense/agreement inflections +s/+d at this stage,
it is nonetheless true that they make productive use of progressive +ing, and limited (though
not productive) use of perfective +n - cf. examples (26)(b) and (c) in the text. I do not
take this to indicate that they must therefore have developed one or more functional heads
(e.g. PROG or PERF) marking aspect. Rather, I take the view (defended in Radford 1988b)
that there is a clear distinction between lexical and functional inflections - i.e. those inflections
associated with lexical heads, and those associated with functional heads. In these terms,
the tense/agreement inflections +s/+d are functional inflections associated with finite I
constituents, whereas +ing/+n are lexical inflections associated with V constituents (in much
the same way as plural +s is a lexical inflection associated with N constituents).
5. Although we find no productive wh-movement at this stage, we do find formulaic wh-
questions like Whasat? ( = 'What's that?'); however, there is general agreement in the literature
that such structures do not involve wh-movement. More problematic for the analysis proposed
in the text is the fact that some children at this stage develop semiformulaic wh-questions
like What NP do(ing)?, and/or Where NP go(ing)?. However, since these structures are
item-specific and clause-bound, no productive movement rule seems to be involved. It may be
that children who produce item-specific structures like What Daddy do(ing)? initially develop
a lexical entry for do which projects an interrogative THEME argument (or perhaps, more
specifically, a what THEME argument) as an adjunct to the VP containing do, in which
case What Daddy doing? would have the skeletal structure:
On this account, no movement would be involved, since what would be directly projected
into the clausal adjunct position, not moved there from the complement position within
VP. It follows from this analysis that the interrogative word will always remain
clause-bound (i.e. positioned within its containing clause, so that there is no movement of
the wh-word out of one clause into another). For more detailed discussion of early
wh-questions, see Radford (1990: 122-36).
6. An interesting question which arises here is whether children master the directionality
of theta-marking from the very beginnings of early multiword speech. In this connection,
it is interesting to note that Bowerman's 1973 study of Kendall at age 22 and 23 months
showed Kendall producing not only AGENT+ACTION+PATIENT structures like (i) below,
but also PATIENT+ACTION+AGENT structures such as (ii):
While it is far from clear how to interpret the relevant data, one possibility would be to
posit that Kendall has not yet set the relevant parameter which determines the directionality
REFERENCES
Abney, S. P. 1987. The English Noun Phrase in its Sentential Aspect. Doctoral dissertation,
MIT.
Bever, T. 1970. The cognitive basis for linguistic structures. In J.R. Hayes (ed.) Cognition
and the Development of Language. 274-353. New York: Wiley.
Bloom, L. 1970. Language Development. Cambridge, Massachusetts: MIT Press.
Bloom, L. 1973. One Word at a Time. The Hague: Mouton.
Bloom, L., P. Lightbown and L. Hood. 1978. Pronominal-Nominal Variation in Child
Language. In L. Bloom (ed.) Readings in Language Development. 231-238. New York:
Wiley.
Borer, H. and K. Wexler. 1987. The Maturation of Syntax. In Roeper and Williams, 123-
172.
Bowerman, M. 1973. Early Syntactic Development. Cambridge: Cambridge University Press.
Braine, M. D. S. 1976. Children's first word combinations. Monographs of the Society for
Research in Child Development no. 41.
Brown, R. 1968. The Development of Wh Questions in Child Speech. Journal of Verbal
Learning and Verbal Behaviour 7. 279-290.
Brown, R. 1973. A First Language: the Early Stages. London: George Allen and Unwin.
Brown, R. and U. Bellugi. 1964. Three processes in the child's acquisition of syntax. Harvard
Educational Review 34. 133-51.
Brown, R. and C. Fraser. 1963. The Acquisition of Syntax. In C. Cofer and B. Musgrave
(eds.) Verbal behaviour and learning: problems and processes. 158-201. New York: McGraw-
Hill.
Cazden, C. B. 1968. The acquisition of noun and verb inflections. Child Development 39.
433-448.
Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, Massachusetts: MIT Press.
Chomsky, N. 1975. Reflections on Language. New York: Pantheon.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. 1986a. Knowledge of Language: Its Nature, Origin, and Use. New York: Praeger.
Chomsky, N. 1986b. Barriers. Cambridge, Massachusetts: MIT Press.
Clahsen, H. 1984. Der Erwerb von Kasusmarkierungen in der deutschen Kindersprache,
Linguistische Berichte 89. 1-31.
De Villiers, P.A. and J.G. 1973a. A cross-sectional study of the acquisition of grammatical
morphemes in child speech. Journal of Psycholinguistic Research 2. 267-78.
De Villiers, P.A. and J.G. 1973b. Development of the use of word order in comprehension.
Journal of Psycholinguistic Research 2. 331-341.
Ervin-Tripp, S.M. 1964. Imitation and structural change in children's language. In E.H.
Lenneberg (ed.) New Directions in the Study of Language. Cambridge, Massachusetts: MIT
Press. 163-89.
Fassi Fehri, A. 1988. Generalised IP structure, Case, and VS word order. In Fassi Fehri,
A. et al. (eds.) Proceedings of the First International Conference of the Linguistic Society
of Morocco. Rabat: Editions OKAD. 189-221.
Fraser, C., U. Bellugi, and R. Brown. 1963. Control of grammar in imitation, comprehension,
and production. Journal of Verbal Learning and Verbal Behaviour 2. 121-135.
Fukui, N. 1986. A Theory of Category Projection and its Applications. Doctoral dissertation,
MIT.
Gleitman, L. 1981. Maturational Determinants of Language Growth. Cognition 10. 103-
114.
Goodluck, H. 1989. The Acquisition of Syntax. Ms. University of Ottawa.
Greenfield, P. et al. 1985. The structural and functional status of single-word utterances
and their relationship to early multi-word speech. In M. D. Barrett (ed.) Children's Single-
Word Speech. New York: Wiley. 233-267.
Guilfoyle, E. and M. Noonan. 1989. Functional Categories and Language Acquisition. Ms.
McGill University.
Guillaume, P. 1927. Le développement des éléments formels dans le langage de l'enfant. Journal
de Psychologie 24. 203-229. Translated as The development of formal elements in the
child's speech. In C.A. Ferguson and D. I. Slobin (eds.) 1973. Studies of Child Language
Development. New York: Holt Rinehart and Winston. 240-251.
Hill, J. A. C. 1983. A Computational Model of Language Acquisition in the Two Year Old.
Indiana University Linguistics Club.
Hyams, N. 1986. Language Acquisition and the Theory of Parameters. Dordrecht: Reidel.
Hyams, N. 1987a. The Theory of Parameters and Syntactic Development. In Roeper and
Williams, 1-22.
Hyams, N. 1987b. The Setting of the Null Subject Parameter: A reanalysis. Text of paper
presented at the Boston University Conference on Child Language Acquisition.
Hyams, N. 1988. The Acquisition of Inflection: a parameter-setting approach. Ms. UCLA.
Hyams, N. 1989. The Null Subject Parameter in Language Acquisition. In O. Jaeggli and
K. Safir (eds.) The Null Subject Parameter. Dordrecht: Kluwer. 215-238.
Kazman, R. 1988. Null Arguments and the Acquisition of Case and Infl. Text of paper
presented at the University of Boston Conference on Child Language Acquisition.
Klima, E. S. and U. Bellugi. 1966. Syntactic Regularities in the Speech of Children. In J.
Lyons and R. Wales (eds.) Psycholinguistic Papers. Edinburgh: Edinburgh University Press.
183-207.
Koopman, H. 1984. The Syntax of Verbs. Dordrecht: Foris.
Lebeaux, D. S. 1987. Comments on Hyams. In Roeper and Williams, 23-39.
Lebeaux, D. S. 1988. Language Acquisition and the Form of the Grammar. Doctoral Dissertation,
University of Massachusetts.
Macnamara, J. 1982. Names for things: a study of child language. Cambridge, Massachusetts:
MIT Press.
Maratsos, M. 1974. Children who get worse at understanding the passive: a replication of
Bever. Journal of Psycholinguistic Research 3. 65-74.
Park, T-Z. 1981. The Development of Syntax in the Child with special reference to German.
Innsbruck: AMOE.
Phinney, M. 1981. Syntactic Constraints and the Acquisition of Embedded Sentential Com-
plements. Doctoral Dissertation, University of Massachusetts.
Platzack, C. 1989. A grammar without functional categories: A syntactic study of Early
Swedish Child Language. Ms. Lund University.
Pollock, J-Y. 1989. Verb Movement, Universal Grammar, and the structure of IP. Linguistic
Inquiry 20. 365-424.
Radford, A. 1986. Small Children's Small Clauses, Research Papers in Linguistics 1. 1-38.
Bangor: UCNW.
Radford, A. 1987. The Acquisition of the Complementiser System. Research Papers in
Linguistics 2. 55-76. Bangor: UCNW.
Radford, A. 1988a. Small Children's Small Clauses. Transactions of the Philological Society
86. 1-46 (revised extended version of Radford 1986).
could be related, and the general nature of the language learning process.
(See Saleemi (1988a) for related discussion, and Saleemi (1988b) for a
detailed treatment of the syntactic and learnability issues).
It is an established fact that pro can occur only in those contexts in which
overt Case-governed NPs can appear. Thus, Rizzi (1986) asserts that pro
can only be licensed by a Case-assigning head, which is Infl in the case
of a subject pro. We can draw many different conclusions on the basis
of this generalisation. Firstly, we can hypothesise that pro is Case-marked
just like lexical NPs (e.g. Chomsky (1982), 86; Hyams (1986), 32-33).
Alternatively, it can be stipulated that nominative Case is absorbed by
Infl (Rizzi (1986)), or that it is simply not phonetically realised (Safir (1985)).
None of these assumptions is straightforwardly consistent with the view,
crucial to the present analysis, that Case, standardly considered to be
assigned at S-structure, renders an NP "visible" at PF, and thus requires
it to be phonetically realised at that level (Bouchard (1984), Fabb (1984)).
In our analysis we also adopt the broader Visibility Condition: under this
condition θ-role assignment at LF can occur only if an argument is
Case-marked (Aoun (1985), Chomsky (1986a)). If the Case Filter
can be subsumed under the Visibility Condition, as Chomsky (1986a)
suggests, then it would appear that Case is instrumental in ensuring visibility
at both PF and LF: just as arguments can have a phonetic matrix at
PF only if they are Case-marked, they can be assigned θ-roles at LF,
238 Anjum P. Saleemi
and thus be interpreted, just in case they are Case-marked. This means
that if a pro reached LF without Case, it would not be visible for
θ-role assignment at that level.
In order to meet the dual visibility requirements, I first propose that
an NP may be assigned Case at LF as well as at S-structure (cf. Fabb
(1984), 43). This should account for the assignment of Case to null and
overt NPs alike, the former acquiring Case only at LF. 2 Second, I adopt
the idea, due to Bouchard (1984), that in pro-drop languages Case
assignment to the subject can be delayed until LF. This idea presupposes,
in keeping with the above discussion, that obligatory Case at S-structure
requires an NP to be lexically realised at PF, whereas in the event of
optionality of syntactic Case an NP need not be so realised, unless some
other feature (e.g. focus) forces it to acquire Case in order to be overt.
Thus the null subject parameter may now be (provisionally) stated as
follows.
To sum up, I hypothesise that null subjects are formally made possible
by optionality of syntactic Case. This licensing condition (together with
identification through language-particular means) determines whether a
language will allow null subjects. Much like the traditional approaches,
the licensing condition (2) differentiates null subject languages from non-
null subject ones on a binary basis. Considering (2) to be essentially correct,
I shall now explore the possibility of revising the null subject parameter
to encompass more than two types of language, as it appears that the
phenomenon under consideration is more diverse than can be captured
by a minimal binary parameter. It is notable that a multi-valued parameter,
such as the governing category parameter proposed by Wexler and Manzini, 3
makes it possible to directly capture a wider range of variation. On the
other hand, a binary formulation that is intended to account for the same
range of variation must somehow explain away part of the attested diversity.
Further, a many-valued analysis should be of greater advantage in de-
termining learnability if the corresponding binary analysis requires the
postulation of many additional grammatical mechanisms to the system,
the exact consequences of which may appear to necessitate some intricate
deductive reasoning on the part of the learner. Is the null subject
phenomenon diverse enough to warrant an expanded typological analysis?
The following crosslinguistic data suggest that such an analysis is quite
plausible.
First consider these German examples. 4
Null Subjects, Markedness, and Implicit Negative Evidence 239
In this respect Spanish, Portuguese, and many other null subject languages,
resemble Italian.
Clearly, in view of these data a revision of the traditional binary view
of pro-drop is in order. While acknowledging this diversity, Rizzi (1986)
adheres to a binary-valued formulation of the parameter, suggesting that
the reduced range of pro-drop in some essentially pro-drop languages may
be a result of the language-particular interaction of the null subject
parameter with an independent parameter that regulates the recovery of
pronominal features in a piecemeal fashion. If operative in a given language,
the recovery mechanism overrides the consequences of the null subject
parameter whenever the retrieval of certain designated feature(s) of the
empty subject from overt affixation on Infl is not possible. 7 Rizzi's account
is incomplete or unsatisfactory for several reasons. Empirically, it does
not have much to say about languages without Agr which allow pro-drop,
e.g. Chinese (Huang (1984)) and Japanese (Hasegawa (1985)), or indeed
about any cases of pro-drop where the overt correspondence between pro-
drop and Agr is weak. Theory-internally, it implies that parameters can
massively nullify each other's triggering conditions, with one parameter
almost undoing the effect of another one, a supposition that is likely to
run up against considerable descriptive and learnability problems. It would
be tantamount to a conspiracy for the suppression of relevant evidence,
which might complicate the learning procedure whereby the language
learner is supposed to pick the correct value of a parameter. This underscores
the point that the major motivation behind a parameterised theory of
grammar is to guarantee learnability, a key but often unrecognised
assumption being that parameterisation is meant to enable the learner
to make a set of independent decisions. This obviously means that
parameters have to be independent from each other in such a way that
they can be fixed directly on the basis of relevant evidence. Note that
the view of independence being implied here is much weaker than that
embodied in Wexler and Manzini's Independence Principle, which spe-
cifically requires the subset relations between the languages generated by
the values of a parameter to hold irrespective of such relations between
the languages associated with the values of all other parameters. Neither
view, I believe, rules out the possibility that parameters may interact to
some extent, or that the degree of mutual functional compatibility (in
some sense that can be made precise) between different parameters may
be rather strong; hence the tendency in null subject languages with a rich
Agr to depend on identification through overtly realised φ-features (namely,
person, number and gender).
A more effective way to describe the null subject phenomenon, in part
following the typological variation noted by Rizzi (1986) (see also Travis
(1984)), is to posit a wider range of parameterisation. Therefore, under
English, French, and Swedish are associated with value (a) of the parameter,
allowing no null subjects. On the other hand, German takes value (b),
that permits only nonarguments to be omitted, requiring all argumental
subjects to be lexically expressed. Yiddish, Malagasy, Icelandic and Faroese
take value (c), that allows the omission of quasi-arguments as well as
nonarguments, i.e. all nonreferential subjects. Finally, languages like Italian
and Spanish (and possibly also those like Chinese and Japanese) are
associated with value (d), under which any subject, referential or non-
referential, may remain null.
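The value assignments just described can be summarised as a small lookup table. The following Python sketch is offered only as an illustration: the names SubjectType, NULL_SUBJECTS, LANGUAGE_VALUE and may_be_null are hypothetical labels introduced here, and the table encodes nothing beyond the prose description of values (a) to (d). Chinese and Japanese are left out because the text associates them with value (d) only tentatively.

```python
from enum import Enum


class SubjectType(Enum):
    """The three subject types distinguished in the text."""
    NONARGUMENT = "nonargument"
    QUASI_ARGUMENT = "quasi-argument"
    REFERENTIAL = "referential"


# Subject types that may remain null under each parameter value
# (hypothetical encoding of the prose description of values (a)-(d)).
NULL_SUBJECTS = {
    "a": frozenset(),
    "b": frozenset({SubjectType.NONARGUMENT}),
    "c": frozenset({SubjectType.NONARGUMENT, SubjectType.QUASI_ARGUMENT}),
    "d": frozenset(SubjectType),  # all three types
}

# Language-to-value assignments as stated in the text.
LANGUAGE_VALUE = {
    "English": "a", "French": "a", "Swedish": "a",
    "German": "b",
    "Yiddish": "c", "Malagasy": "c", "Icelandic": "c", "Faroese": "c",
    "Italian": "d", "Spanish": "d",
}


def may_be_null(language: str, subject_type: SubjectType) -> bool:
    """Return True if the parameter value of `language` permits a null
    subject of the given type."""
    return subject_type in NULL_SUBJECTS[LANGUAGE_VALUE[language]]
```

On this encoding, may_be_null("German", SubjectType.NONARGUMENT) holds, whereas may_be_null("German", SubjectType.QUASI_ARGUMENT) does not, which mirrors the contrast between values (b) and (c).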
(6) is apparently problematic in one respect, though: it predicts that
pleonastic pro-drop will be optional, like the core cases of referential pro-
drop. However, whereas referential pro-drop is in general optional, non-
referential pro-drop seems to be mandatory in most pro-drop languages,
with some (possibly marked) exceptions. Practically, then, the pro-drop
option with respect to pleonastics might be no more than a Hobson's
choice. But there appears to be a simple solution to the problem. In the
spirit of Chomsky's (1981) Avoid Pronoun Principle, the absence of lexical
pleonastics in many pro-drop languages can be ultimately attributed to
their pragmatic infelicity (cf. Travis (1984), 229; Hyams (1986)). Briefly,
it can be assumed that since pleonastic subjects are nonreferential, they
would be superfluous, and thus might not, or might have ceased to, exist
in some null subject languages. What I am suggesting is that there could
be more or less fortuitous gaps in languages (though perhaps not in their
grammars) that might (at least in part) be ascribable to lack of functional
usefulness.
We can thus indirectly account for the lack of expletive subjects in most
pro-drop languages, and one can still claim (6), pragmatically qualified,
to be formally correct. This maximally general statement of the parameter
may in any case be required for some supposedly "marked" languages
- such as Welsh (Awbery (1976)), Irish (Travis (1984), 231ff.), substandard
Hebrew (Borer (1984), 216), Faroese (Platzack (1987)), and Urdu - in
3. THE LEARNABILITY PROBLEM
The null subject parameter, as formulated in the last section, raises some
interesting questions about the relationship between parameters and the
languages they generate, and about the resulting implications for lear-
nability. It affords an example of the intricate connection between parameter
values and the corresponding languages, in particular of a mismatch between
the two, perhaps suggesting that parameters do not, strictly speaking,
generate languages, but only fix the maximal bounds within which languages
can be realised. The problem is analogous to the well-known ontological
problem of structures that are predicted to be well-formed by a grammar
but that do not exist. This gives rise to some inconsistency between the
training instances available to the learner and the relevant generalisation
in the grammar. The parameter (6) illustrates that this state of affairs
is possible in a parametric theory as well, resulting in a projection puzzle
that is discussed below. As the data the child will get may not exactly
be the data he will expect under the parameter, the problem to resolve
is how the child infers the presence of gaps in the ambient language in
the absence of any negative information. In more general terms, the question
is how the learner deals with partial generalisations in the target grammar.
4. POSITIVE IDENTIFICATION
consider that the values of the parameter are ordered in terms of markedness
just as dictated by a subset hierarchy, as in theory the parameter is
compatible with the Subset Condition. However, that may not be desirable,
since it has been shown that although the languages associated with different
values of the parameter can fall into a subset hierarchy, they do not do
so in relation to a significant number of cases. Alternatively, one can define
the inclusion relations intensionally (i.e. among values), rather than extensionally (i.e. among
languages generated by these values), a possibility that follows naturally
from the internal structure of the parameter. The markedness hierarchy
and the learning procedure can accordingly be redefined.
Recall that the set of null subject types under the four values progressively
enlarges from value (a) to value (d): the set of null subjects under value
(a) of the parameter is ∅; the set of possible null subjects under value
(b) consists of nonarguments only; the set of possible null subjects under
value (c) has as its members quasi-arguments as well as nonarguments;
and the set of possible null subjects under value (d) contains nonarguments,
quasi-arguments, and referential arguments. In other words, in this specific
sense value (d) includes value (c), value (c) includes value (b), and value
(b) includes value (a). The following condition is proposed to determine
markedness among parameter values that are so related.
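The inclusion relations among values can be made concrete with ordinary set inclusion. The sketch below is a hedged illustration, not a statement of the markedness condition itself: the names NULL_SUBJECT_TYPES, includes and markedness_order are hypothetical, and the ranking rests on the assumption (suggested by the subset logic of the discussion) that a value is less marked the fewer null subject types it admits.

```python
# Null subject types admitted under each value, as listed in the text.
NULL_SUBJECT_TYPES = {
    "a": frozenset(),
    "b": frozenset({"nonargument"}),
    "c": frozenset({"nonargument", "quasi-argument"}),
    "d": frozenset({"nonargument", "quasi-argument", "referential"}),
}


def includes(v: str, w: str) -> bool:
    """True if value v includes value w, i.e. the null subject types
    admitted under w are a proper subset of those admitted under v."""
    return NULL_SUBJECT_TYPES[w] < NULL_SUBJECT_TYPES[v]


def markedness_order() -> list:
    """Values from least to most marked.

    Assumption for illustration: the less inclusive value is the less
    marked one. Since the four sets form a chain, sorting by set size
    yields the same ranking as the inclusion relation.
    """
    return sorted(NULL_SUBJECT_TYPES, key=lambda v: len(NULL_SUBJECT_TYPES[v]))
```

Because inclusion is stated over the values' own null subject sets, it can be checked without enumerating the languages the values generate.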
(9) formalises the learning procedure, which says that the learning function
f_p maps the set of data D onto a value P_i of parameter P if and only
if D is a subset of L(P_i), the language generated when P takes value P_i,
and P_i is the least marked value consistent with D. I consider (9) to be
a domain-specific learning procedure, comparable to the Subset Principle
in that respect; however, like the markedness condition (8) it possesses
greater likelihood of being computationally tractable.
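A minimal sketch of such a learning function is given below, under two simplifying assumptions that are not part of the original proposal: the data D are reduced to the set of null subject types attested in the input, and L(P_i) is represented by the set of null subject types admitted under P_i. The names NULL_SUBJECT_TYPES, MARKEDNESS_ORDER and learn are hypothetical.

```python
# Null subject types admitted under each value, as listed in the text.
NULL_SUBJECT_TYPES = {
    "a": frozenset(),
    "b": frozenset({"nonargument"}),
    "c": frozenset({"nonargument", "quasi-argument"}),
    "d": frozenset({"nonargument", "quasi-argument", "referential"}),
}

# Assumed ranking from least to most marked (less inclusive first).
MARKEDNESS_ORDER = ["a", "b", "c", "d"]


def learn(data) -> str:
    """Return the least marked value whose admitted null subject types
    contain every null subject type observed in `data`.

    `data` is any iterable of observed null subject types; only positive
    evidence (attested types) is consulted.
    """
    observed = frozenset(data)
    for value in MARKEDNESS_ORDER:
        if observed <= NULL_SUBJECT_TYPES[value]:
            return value
    raise ValueError(f"no parameter value is consistent with {sorted(observed)}")
```

With no null subjects in the input the learner stays at value (a); a single attested null referential subject is enough to select value (d), since no less marked value admits it.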
In present terms the markedness condition (8) defines the order in which
the parametric choices expressed in the null subject parameter are explored
by the child learner, and (9) the learning principle that can be used to
select the correct value of the parameter on the basis of positive-only data,
a process that may be termed positive identification.
Likewise in the case of L(d). The appearance in the data of null referential
subjects should straightaway rule out L(a), L(b), and L(c). But the problem
of possible overgeneralisation to overt pleonastics within L(d) is still there.
Considering that quite often expletives are homophonous with certain
referential pronouns or in some way semantically nonempty items (e.g.
Yiddish es and English it have referential counterparts; notably, Welsh
hi is 3rd person feminine singular), in principle overgeneralisation can occur
even though overt expletives are totally absent in the language being learned,
as they are in Italian. 9 This type of overgeneralisation may simply consist
of an expectation on the part of the learner that non-null pleonastic subjects
are possible, without any definite knowledge of the particular lexical form(s)
they would actually assume.
Null Subjects, Markedness, and Implicit Negative Evidence 247
The upshot is that a learner armed solely with (6), (8) and (9) may
not be guaranteed to be entirely successful. Though absolutely central to
the process of parameter fixation, positive identification could well prove
to be insufficient, since, owing to the nonexistence of certain predicted
structures, the principle (9) will not ensure that the learner's language
as defined by the parameter (6) is extensionally identical to the ambient
language. To put it more succinctly, (9) may not be able to exactly identify
the ambient language. Recall that the learning procedure (9) is designed
to be driven solely by positive-only evidence. It seems then that although
such evidence is effective in positively identifying a language from among
the four possible ones, it is not effective in exactly identifying that language.
Beyond the point in linguistic development where a parameter is fixed,
say following the application of the learning procedure (9), the learner
might have to employ further inferencing strategies that are essentially
data-driven; that is to say, exact identification would require that in relation
to the missing forms the learning system must be completely guided by
the record of linguistic examples made available to him, rather than
exclusively by the specific innate entities modelled in the form of (6), (8)
and (9) above.
5. EXACT IDENTIFICATION
Keeping this definition in mind, let us now try to ascertain the mechanisms
whereby exact identification can come about.
A solution to the projection problem described in the last section, which
could ensure exact identification, would be for the learner to undergeneralise
within the conjectured language. This can be accomplished by noticing
the nonoccurrence of the relevant types of overt pleonastic subjects in
the "incomplete" data, in other words by resorting to what Chomsky called
indirect negative evidence (Chomsky (1981), 8-9; see also Lasnik (to appear),
Oehrle (1985), Wexler (1987)). The basic idea is as follows. If there are
structures that are predicted to exist by the learner's grammar, and that
do not appear in the stream of data after n positive instances (where n
is a sufficiently large number indicating the size at a given time of the
ever expanding corpus of data), then the learner is capable of the negative
inference that these structures are in fact missing from the ambient language.
Under our approach exact identification is considered to come into
operation once the core learning process, i.e. positive identification, has
taken place, its task being to bridge any gaps between the predictions
made by the parameters of Universal Grammar and the linguistic data
actually exhibited. Whereas positive identification is a simple selective
process that can occur on the basis of a small number of triggering instances,
exact identification in addition consists of (relatively general) inferential
processes which involve much closer and more extensive inspection of the data,
including the keeping of a record of certain examples. Clearly, the use
of indirect negative evidence in a manner akin to that outlined above
will indeed be sufficient to exactly identify the correct language from data
that are incomplete with respect to the parameter.
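The inference from nonoccurrence can be sketched in the same informal way. The following Python fragment is only a schematic rendering under stated assumptions: the learner's record of examples is modelled as a simple count of structure types, the threshold n is supplied by hand, and the function name exact_identification and the structure labels used with it are hypothetical.

```python
from collections import Counter


def exact_identification(predicted_types, data, n):
    """Retain only those predicted structure types that are attested.

    predicted_types: structure types predicted by the fixed parameter value.
    data: the positive instances encountered so far (one label per instance).
    n: number of positive instances after which nonoccurrence is taken
       to be informative (an assumed, hand-set threshold).
    """
    record = Counter(data)          # the learner's record of examples
    instances_seen = sum(record.values())
    if instances_seen < n:
        # Too little data: no negative inference is drawn yet.
        return set(predicted_types)
    # Enough data: predicted types that never occurred are inferred
    # to be missing from the ambient language.
    return {t for t in predicted_types if record[t] > 0}
```

For a learner of an Italian-type language who has fixed value (d), the predicted types might include both null referential subjects and overt pleonastic subjects; once n instances have been recorded without a single overt pleonastic, that type is dropped from the conjectured language.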
7. DEVELOPMENTAL IMPLICATIONS
In this paper I have proposed a metric for parameter values which regards
the range of grammatical categories affected by a value as being criterial
for evaluating markedness. The idea is to dispense with the specific
extensional measure of markedness put forward by Wexler and Manzini.
It is natural to ask if our approach is extendible to the binding parameters
as defined by these authors (also cf. Koster (1987), 319ff.). If it is, that
would provide additional support for our contention that markedness is
not a function of the application of certain set-theoretical constructs to
the languages generated by parameter values.
It is relatively easy to show that the proper antecedent parameter of
Wexler and Manzini lends itself to an intensional view of markedness rather
straightforwardly. Recall that the parameter has two values, (a) and (b).
Under value (a) the set of proper antecedents contains only subjects, and
under value (b) the set of proper antecedents includes both subjects and
objects. Thus the set of proper antecedents defined by value (b) is a proper
superset of the set of proper antecedents defined by value (a), and value
(b) includes value (a) exactly as the markedness condition (8) requires.
What about the governing category parameter? I believe that this parameter
too is compatible with the present approach. I begin by pointing out a
weakness in the formulation of the parameter, repeated in (15) in the form
given in Wexler and Manzini (1987).
According to Wexler and Manzini the reflexives in (16a) and (17a) are
not bound in the minimal governing category containing them and their
antecedents, i.e. the NP, since it lacks Infl/tense, but in the maximal
governing category, i.e. the sentence containing the NP and Infl/tense.
I assume that it is legitimate to speak of the grammaticality of any kind
of maximal projections, not just sentences; this would be in keeping with
the well established assumption that grammatical processes are essentially
category-neutral, and that the category S has no privileged status in the
theory of Universal Grammar. Now, if (16b) and (17b) are also grammatical
alongside (16a) and (17a), as I presume they are, then the relevant definitions
of governing category under the parameter (15) are inadequate for these
constructions, as under these definitions both (16b) and (17b) are wrongly
predicted to be ill-formed. Note that in their independent capacity the
NPs in (16b) and (17b) do not contain Infl or tense, respectively, as required
by (15), so they can be grammatical only by virtue of being a part of
an S. I have delineated the problem only with respect to NPs, but the
logic is equally relevant to small clauses - which, just like NPs, can be
governing categories in their own right under no value other than (a) -
and indeed to all those types of governing categories which can be embedded
in another. The point I wish to make is that, although there is no way
out of this dilemma under the Subset Condition, a solution is possible
precisely in terms of the markedness condition (8).
Suppose that the set of governing categories permitted in Universal
Grammar contains five types corresponding to the five values of the
parameter (15). For convenience I shall write a governing category with
a subject as GC(A), a governing category with an Infl as GC(B), a governing
(18) a. GC(A)
b. GC(A), GC(B)
c. GC(A), GC(B), GC(C)
d. GC(A), GC(B), GC(C), GC(D)
e. GC(A), GC(B), GC(C), GC(D), GC(E)
FOOTNOTES
*I am deeply indebted to Martin Atkinson and Vivian Cook for their constant help and
encouragement while the research in part reported here was in progress, and to Michael
Jones, Ken Safir and Ken Wexler for valuable comments on the earlier versions of this
paper.
I also wish to acknowledge the helpful comments I received from audiences at Stanford
and Essex.
1. A word of caution is necessary at this point. It is not certain that the distinctions between
the three types of subjects are neatly held across most languages. It is quite possible that
a class of predicates that appears in one category in one language appears in another category
in a different language. For instance, in some languages the atmospheric-temporal predicates
are expressed without recourse to quasi-arguments. Consider the following Urdu (i), Jordanian
(ii) and Standard (iii) Arabic examples, where the italicized subject is a referential NP.
This is, in a way, consistent with the tendency among the subjects of atmospheric-temporal
predicates in languages like English to behave as if they are somewhat like referential
arguments, in that they appear to be thematic. In the text I shall presume, with the proviso
regarding cross-linguistic lexical variation in mind, that the distinctions between referential
arguments, quasi-arguments, and nonarguments are in general well motivated.
2. Lack of syntactic Case, rather than government, may be held to be responsible for the
distribution of PRO as well as pro. If this is correct, then one can assert that, with the
exception of variables, all empty categories lack syntactic Case, and that they get Case
universally at LF. This would straightforwardly account for the visibility of PRO at LF.
3. This and all subsequent undated references are to both Manzini and Wexler (1987) and
Wexler and Manzini (1987).
4. Throughout this paper, parentheses in examples indicate optional pro-drop. Further, an
asterisk outside the parentheses suggests that pro-drop is not available, and an asterisk inside
the parentheses denotes obligatory pro-drop.
5. The German facts are not as simple as shown in the text, and therefore deserve some
comment. Specifically, it is not the case that lexical nonarguments can be freely dispensed
with. Consider the following examples.
In neither of these examples is the expletive subject es allowed to drop. I assume that in
German subject-initial sentences the subject must appear in order to fulfil the V2 constraint.
Thus, in German nonarguments can be omitted as long as they are not required by an
independent factor, such as the V2 rule; according to Safir (1984), "for some speakers this
prediction is roughly borne out" (p. 216). However, one should note that some further
restrictions, of a relatively less systematic nature, appear to exist that counteract the possibility
of a null subject in certain cases; see Safir (1984, 1985) and Travis (1984).
6. As in German, pleonastic pro-drop in Icelandic is not without exceptions; see Platzack
(1987).
7. A similar view is adopted in Jaeggli and Safir (1989), who for example maintain that
underlyingly German and Icelandic are null subject languages in the same sense as Italian
and Spanish, but that in them the presence of the V2 effect blocks the process of identification,
and therefore the availability of referential null subjects. Interestingly, for Adams (1987)
the V2 effect is one of the two mechanisms that make pro-drop possible, the other being
Romance inversion; on her view the loss of V2 in Old French was responsible for the change
in the language from a pro-drop to a non-pro-drop character.
8. I assume, following Rothstein (1983), that "subject" and "predicate" are syntactic terms,
and not merely derivative functional categories; hence the appearance of the term "subject"
in the definition (6) in the text.
9. That expletives have homophonous referential or otherwise meaningful analogues is of
course not purely accidental; Nishigauchi and Roeper (1987) adduce evidence suggesting
that expletives are bootstrapped via their meaningful counterparts.
10. The use of implicit negative evidence is also faulted on the grounds that it would rule
out certain highly complex well-formed structures that tend to be very rare, for example
the fully expanded auxiliary phrase in English (Wexler (1987)). However, Berwick and Pilato
(1987) show that a machine induction model designed to learn the English auxiliary system
can infer such rare examples from relatively simple ones likely to be encountered quite
frequently.
REFERENCES
Adams, M. 1987. From Old French to the Theory of Pro-Drop. Natural Language and Linguistic
Theory 5. 1-32.
Aoun, J. 1985. A Grammar of Anaphora. Cambridge, Massachusetts: MIT Press.
Awbery, G. M. 1976. The Syntax of Welsh: a Transformational Study of the Passive. Cambridge:
Cambridge University Press.
Bennis, H. and L. Haegeman. 1984. On the Status of Agreement and Relative Clauses in
West-Flemish. In W. de Geest and Y. Putseys (eds.) Sentential Complementation. Dordrecht:
Foris.
Berwick, R. C. 1985. The Acquisition of Syntactic Knowledge. Cambridge, Massachusetts:
MIT Press.
Berwick, R. C. and S. Pilato. 1987. Learning Syntax by Automata Induction. Machine Learning
2. 9-38.
Borer, H. 1984. Parametric Syntax: Case Studies in Semitic and Romance Languages. Dordrecht:
Foris.
Bouchard, D. 1984. On the Content of Empty Categories. Dordrecht: Foris.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. 1982. Some Concepts and Consequences of the Theory of Government and Binding.
Cambridge, Massachusetts: MIT Press.
Chomsky, N. 1986a. Knowledge of Language: its Nature, Origin and Use. New York: Praeger.
Chomsky, N. 1986b. Barriers. Cambridge, Massachusetts: MIT Press.
Saleemi, A. P. 1988b. Learnability and Parameter Fixation: the Problem of Learning in the
Ontogeny of Grammar. Doctoral dissertation, University of Essex. (To be published by
Cambridge University Press).
Saleemi, A. P., in preparation. Choice and Maturation in Language Learnability and
Development.
Travis, L. 1984. Parameters and Effects of Word Order Variation. Doctoral dissertation, MIT.
Wexler, K. 1987. On the Nonconcrete Relation between Evidence and Acquired Language.
In B. Lust (ed.) Studies in the Acquisition of Anaphora, Vol. II: Applying the Constraints.
Dordrecht: Reidel.
Wexler, K. and M. R. Manzini. 1987. Parameters and Learnability in Binding Theory. In
Roeper and Williams (1987).
Second Language Learnability
Michael Sharwood Smith
University of Utrecht
1. INTRODUCTION
The aim of this paper will be to examine second language (L2) research
in recent years with particular reference to how the subconscious developmental
processes underlying non-native language development are currently
viewed. Since second language acquisition has by now been shown
to be a highly complex and poorly understood process not depending simply
on habit formation or indeed on the deliberate learning of rules and
vocabulary items, it is pertinent to ask whether the Chomskyan notion
of learnability is relevant for L2 research (for earlier discussions of this
topic, see, for example, papers in Pankhurst et al. 1988).
In the preliminary sections, I will set the scene for a discussion of second
language learnability by giving a brief sketch of the development of ideas
in this field over the last two decades. I will then summarise the main
points of view on learnability and describe the kind of research that is
currently being done. Since there is an explosion of literature on this subject
(see, for example, papers in Flynn and O'Neil 1988, Pankhurst et al. 1989,
and the last five volumes of Second Language Research), the
aim here will simply be to give an informative impression of the situation
with references to some of the important work in the field. In my
illustrations, I will use the better known (rather than the most recent)
theoretical analyses of various aspects of grammar such as subjacency and
adjacency, the analysis itself being immaterial for present purposes.
The term coined in the seventies by Heidi Dulay and Marina Burt to
describe the organising principles that (by hypothesis) create L1 or L2
grammars from primary linguistic input was "creative construction". The
idea was implicated in their successful attempt to discredit the view that
L2 learning was basically overcoming L1 habits or, to put it in non-behaviourist
terms, a process of gradually transforming mother tongue
(i.e. L1)-based cognitive structures into ones which conformed to the
information coming from the environment. Dulay and Burt, and later Steven
Krashen, pursued the line that L2 creative construction took place without
recourse to the mother tongue: this meant that all L1 patterns observed
in L2 production ought to be attributable to performance constraints, i.e.
2.1. L2 learnability
has argued, the learner's L1 may also play an important role in the setting
of L2 parameters.
Naturally, what is understood by parameters of UG is theory bound.
For example, prepositional languages vary in the degree to which extraction
of the NP out of the prepositional phrase is permitted. Preposition stranding,
generally seen as a marked phenomenon and perhaps representing the
marked value of a parameter of which pied-piping is the unmarked value,
may be epiphenomenal, and be a reflection of the operation of other
principles and parameters involving Empty Category Movement, direction
of government, and so forth (see overview in Van Buren and Sharwood
Smith 1985). Clearly, developmental researchers are going to have to keep
changes in linguistic theory in mind when formulating their hypotheses.
However, the notion of learnability provides a constant motivation for
such research, in that it is always possible to say that direct positive evidence
disconfirming some parameter-setting carried over from the L1 is not
available to the L2 learner. Whatever the theoretical analysis is, the French
learners of English will not encounter direct positive evidence to tell them
that interruption of the verb and its direct object NP is unacceptable in
native English except via stylistically motivated movement (heavy NP shift).
And "marked" English sentences exhibiting this stylistic property will
actually reinforce the mistaken (subconscious) assumption they may have
that English works like French in this respect (as in il mange souvent les
pommes = he eats often apples). By the same token, English learners learning
French will not encounter information in the input that a stranded
preposition such as qui parle-je à? is not one of the options open to Standard
French native-speakers as it is in English (who am I speaking to?). If learners
manage to acquire the native property despite the apparent deficiency of
the data, it is a matter for researchers to determine how they did this.
Was it because they received indirect positive evidence from some other
source enabling them to infer the property (see Hilles 1986, Zobl 1988
and Van Buren 1988) or was it because, after all, negative evidence allowed
them to reset the relevant parameters in their IL (see Rutherford and
Sharwood Smith 1986)? In this way, examining the nature of the input
from a learnability perspective allows research to formulate interesting
questions about IL grammars.
[Diagram not reproduced. Two panels, each linking Principles of UG, L1 grammar, IL grammar and L2 input: in one the L1 grammar serves as the initial template, in the other it is ignored.]
[Diagram not reproduced. Note in figure: UG ensures conformity within the grammar as it develops.]
Fig 2. A third view: the reorganisation view (UG active in L1 and IL grammar) (from Sharwood
Smith 1988)
The three main views on the roles of L1 and UG that have just been
outlined above can be re-expressed as three general (working) hypotheses
for IL investigations, i.e., respectively, the Parasitic Hypothesis (as advanced
by Bley-Vroman, Schachter and others; see also Clahsen and Muysken
1986), the Recreative Hypothesis (as advanced by Mazurkewich), and the
Reorganisation Hypothesis (as advanced by White, Liceras, Sharwood Smith
and Van Buren and others). 2
3. RESEARCH STRATEGIES
4. CONCLUSION
FOOTNOTES
1. Second language researchers have looked at various kinds of markedness although, here,
only markedness in the Chomskyan sense will be considered.
2. For various positions on this issue, see Mazurkewich 1984, White 1986, Flynn 1986, Liceras
1986, Bley-Vroman 1986, Schachter 1986, 1988, Van Buren and Sharwood Smith 1986.
REFERENCES
Adjemian, C. 1976. On the nature of Interlanguage systems. Language Learning 26. 297-
320.
Bialystok, E. and M. Sharwood Smith. 1985. Interlanguage is not a state of mind, an evaluation
of the construct for second language acquisition. Applied Linguistics 6. 101-107.
Bley-Vroman, R. 1986. Hypothesis testing in second language acquisition theory. Language
Learning 36. 353-376.
Bley-Vroman, R. 1988. The fundamental character of foreign language learning. In W.
Rutherford and M. Sharwood Smith (eds.) Grammar in the Classroom. New York: Harper
and Row.
Foster, S. 1985. Taking a modular approach to universals of language acquisition. Paper presented
at SLRF, Los Angeles, February 1985.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Clahsen, H. 1989. The comparative study of first and second language development. Ms.
University of Düsseldorf.
Clahsen, H. and P. Muysken 1988. The UG paradox in L2 acquisition. Second Language
Research 5. 1-30.
Clahsen, H. and P. Muysken 1986. The accessibility of Universal Grammar to adult and
child learners. A study of the acquisition of German word order. Second Language Research
2. 93-119.
Corder, S. Pit. 1967. The significance of learner's errors. International Review of Applied
Linguistics 5. 160-170.
Dulay, H., M. Burt and S. Krashen. 1982. Language Two. Oxford: Oxford University Press.
Dulay, H. and M. Burt. 1974. Natural sequences in child second language acquisition. Language
Learning 25. 37-53.
Duplessis, J., L. Solin, L. Travis and L. White. 1987. UG or not UG: that is the question.
Second Language Research 3. 56-75.
Felix, S. 1985. More evidence on competing cognitive systems. Second Language Research
1. 47-72.
Flynn, S. 1986. A Parameter-Setting Model of Second Language Acquisition. Dordrecht: Reidel.
Flynn, S. and W. O'Neil (eds.) 1988. Linguistic Theory and Second Language Acquisition.
Dordrecht: Reidel.
Fodor, J. 1983. Modularity of Mind. Cambridge, Massachusetts: MIT Press.
Gass, S. and L. Selinker (eds.) 1983. Language Transfer in Language Learning. Rowley,
Massachusetts: Newbury House.
Goodluck, H. 1986. Language acquisition and linguistic theory. In P. Fletcher and M. Garman
(eds.) Language Acquisition. Cambridge: Cambridge University Press.
Gregg, K. 1988. Epistemology without knowledge, Schwartz on Chomsky, Fodor and Krashen.
Second Language Research 4. 66-80.
Hilles, S. 1986. Interlanguage and the pro-drop parameter. Second Language Research 2.
33-52.
Jenkins, L. 1988. Second language acquisition: a biological perspective. In S. Flynn and
W. O'Neil (eds.) Linguistic Theory and Second Language Acquisition. Dordrecht: Reidel.
Jordens, P. 1986. Production rules in interlanguage: evidence from case errors in L2 German.
In E. Kellerman and M. Sharwood Smith (eds.) Crosslinguistic Influence in Second Language
Acquisition. Oxford: Pergamon.
Jordens, P. 1988. The acquisition of verb categories and word order in Dutch and German:
evidence from first and second language acquisition. In J. Pankhurst, M. Sharwood Smith
and P. Van Buren (eds.) Learnability and Second Languages. Dordrecht: Foris.
Kean, M-L. 1986. Core issues in transfer. In E. Kellerman and M. Sharwood Smith (eds.)
Crosslinguistic Influence in Second Language Acquisition. Oxford: Pergamon.
Kean, M-L. 1988. The relation between linguistic theory and second language acquisition:
a biological perspective. In J. Pankhurst, M. Sharwood Smith and P. Van Buren (eds.)
Learnability and Second Languages. Dordrecht: Foris.
Kellerman, E. 1985. Dative alternation and the analysis of data. Language Learning 35. 91-
101.
Kellerman, E. and M. Sharwood Smith. 1986. Crosslinguistic Influence in Second Language
Acquisition. Oxford: Pergamon.
Krashen, S. 1976. Formal and informal linguistic environments in language acquisition and
language learning. TESOL Quarterly 10. 157-168.
Krashen, S. 1982. Principles and Practice in Second Language. Learning and Acquisition. Oxford:
Pergamon.
Krashen, S. 1985. The Input Hypothesis: Issues and Implications. London: Longmans.
Liceras, J. 1986. Linguistic Theory and Second Language Acquisition: The Spanish Non-native
Grammar of English Speakers. Tübingen: Narr.
Mazurkewich, I. 1984. The acquisition of the dative alternation by second language learners
and linguistic theory. Language Learning 34. 91-110.
Mazurkewich, I. 1985. In reply to Kellerman: a response from Mazurkewich. Language Learning
30. 103-106.
Odlin, T. 1989. Language Transfer: Cross-linguistic influence in Language Learning. Cambridge:
Cambridge University Press.
Pankhurst, J., M. Sharwood Smith and P. Van Buren (eds.) 1989. Learnability and Second
Languages. Dordrecht: Foris.
Rutherford, W. 1987. Learnability, SLA and explicit metalinguistic knowledge. Ms. University
of Southern California.
Rutherford, W. 1989. Linguistics and SLA: the two-way street phenomenon. Ms. University
of Southern California.
Schachter, J. 1988. Second language acquisition and its relationship to Universal Grammar.
Applied Linguistics 9. 219-235.
Schumann, J. 1978. The relationship of pidginization, creolization and decreolization to
second language acquisition. Language Learning 8. 367-388.
Schwartz, B. 1986. The epistemological status of second language acquisition. Second Language
Research 2. 120-159.
Schwartz, B. and S. Tomaselli, forthcoming. Analyzing the acquisition stages in L2: support
for UG in adult SLA. Second Language Research.
Selinker, L. 1972. Interlanguage. International Review of Applied Linguistics 10. 109-230.
Sharwood Smith, M. 1988. On the role of linguistic theory in explanations of second language
developmental grammars. In S. Flynn and W. O'Neil (eds.) Linguistic Theory and Second
Language Acquisition. Dordrecht: Reidel.
Van Buren, P. and M. Sharwood Smith 1985. The acquisition of preposition-stranding by
second language learners and parametric variation. Second Language Research 1. 18-46.
Van Buren, P. 1988. Some remarks of the subset principle in second language acquisition.
Second Language Research 4. 33-40.
White, L. 1985. The pro-drop parameter in adult second language acquisition. Language
Learning 30. 43-47.
White, L. 1986. The principle of adjacency in second language acquisition. In S. Gass (ed.)
Second Language Acquisition: a Linguistic Perspective. Cambridge: Cambridge University
Press.
Wode, H. 1978. Developmental sequences in naturalistic L2 acquisition. In E. Hatch (ed.)
Second Language Acquisition. London: Newbury House.
Zobl, H. 1978. The formal and developmental selectivity of L1 influence on L2 acquisition.
Language Learning 30. 43-57.
Zobl, H. 1988. Configurationality and the subset principle. In J. Pankhurst, M. Sharwood
Smith and P. Van Buren (eds.) Learnability and Second Languages. Dordrecht: Foris. 116-
131.
Zobl, H. (forthcoming) Evidence for parameter-sensitive acquisition: a contribution to the
domain-specific vs. central processes debate. Second Language Research 6.
Can Pragmatics fix Parameters?
N.V. Smith
University College London
1. INTRODUCTION
2. EXCLUSIONS
of the normal input of data the child needs as triggering devices. That
is, there is no evidence that differences of social environment determine
differences of grammatical development. As Dore (1979:360) put it: "...
while abstract linguistic structures can not be acquired by the child on
the basis of his communicative experience, a communicative environment
is necessary to provide the child with empirical sources against which to
assess his hypotheses about structure". Apart from the questionable
assumption that children test their nascent hypotheses, this remark seems
as valid now as a decade ago. Despite the cogency of Dore's observation,
the same volume contains typical examples of a not unusual confusion
between the acquisition of grammar and the acquisition of the ability to
participate in inter-personal interaction. For instance, Bates & MacWhinney
claim that "the child's acquisition of grammar is guided not by abstract
categories, but by pragmatic and semantic structures of communication
interacting with the performance constraints of the speech channel" (Bates
& MacWhinney, 1979:168). The nature of such "pragmatic and semantic
structures" and how they eventuate in a complex syntax is never disclosed;
and elsewhere in the same article (p.210) they talk of the child "encoding"
aspects of the language in a way which presupposes the existence of the
grammar which is putatively being acquired.
It is necessary to exclude from consideration two further possible
interpretations of the original question. First, a number of writers have
suggested that certain rules or principles of the grammar might be usurped
by pragmatic considerations. That is, what were previously deemed to
be bona fide grammatical rules may turn out not to need incorporating
into the grammar at all, as the phenomena concerned fall out automatically
from independently motivated pragmatic considerations. A typical example
is provided by Lust (1986) who discusses whether part of Binding Theory
can be reduced to pragmatics. Similarly, Kempson (e.g. 1988 and work
in progress) has embarked on a revisionist attempt to construct a grammar
in which Binding Theory, while articulated within the grammar, is implemented
outside it, with the appropriate generalisations captured by
Relevance Theory. Clearly, to the extent that such attempts are successful,
there will be in these domains simply no parameters to fix. For present
purposes I shall assume that in some domains (including Binding Theory),
there are parameters and that therefore the question of whether pragmatics
is causally involved in fixing them remains coherent.
Second, there is an extensive literature on the effect of "pragmatic
context" on the child's interpretation of the sentences to which he is exposed.
For instance, Lust (1986:82ff) discussed the effect of priming on children's
judgements of coreference. She showed that when children were primed
with the name of one of the characters mentioned in the test sentences,
the probability of their opting for coreference increased even when such
3. RELEVANCE
The grammar tells us that some male person has done something involving
a collection, but whether "he" refers to the churchwarden you were just
chatting to, or an unknown burglar; whether "take" is synonymous with
"solicit" or "steal"; and whether the "collection" is the money solicited
or the Meissen absconded with are pragmatically determined. If you have
just entered a ransacked room with someone who then says (2) to you,
you will interpret it as a comment on a theft rather than as a quotation
from the vicar, simply because that is the only construal that a rational
speaker might have thought worth your attention: the only reading that
is consistent with the principle of relevance. If you have just asked your
pew neighbour where Fred has disappeared to and he responds with (2),
you will take it as a comment on a normal part of church ritual. In neither
case is the other interpretation impossible, given additional contextual
assumptions, but the complexity of the contextualising legerdemain necessary
to arrive at it makes it vanishingly unlikely.
Our ability to exploit contextual information in this way is automatic
and unconscious: so much so that we frequently fail to notice indeterminacies
or ambiguities in utterances addressed to us. Even the child still
in the process of acquiring his first language can represent to himself
sufficient of the context to make some understanding possible (cf. Smith
1988b for further discussion), and it is not implausible that the tendency
to maximise the relevance of incoming stimuli, and the notion of optimal
relevance, are innate. If so, one might well imagine that considerations
of relevance could be exploited in the process of language development.
4. PARAMETERS
5. HYAMS
The evidence that Hyams claims children use in progressing from one
stage to the next is in part structural, e.g. whether the language being
learned contains expletives, and in part pragmatic: specifically, the
exploitation of the Avoid Pronoun Principle which, in Chomsky's (1981a:65)
formulation, is "interpreted as imposing a choice of PRO over an overt
pronoun where possible".1
The Avoid Pronoun Principle accounts for the choice of (6) rather than
(7) (taken from Chomsky, ibid.) where his is to be construed as coreferential
with John:
"operates under the Avoid Pronoun Principle, and hence, expects that subject pronouns
will be avoided except where required for contrast, emphasis, etc. In English contrastive
or emphatic elements are generally stressed. Once the child learns this, any subject
pronoun which is unstressed might be construed as infelicitous ... the child could then
deduce that if the referential pronoun is not needed for pragmatic reasons, it must
be necessary for grammatical reasons; i.e. a null pronominal is impossible, and hence,
AG ≠ PRO" (1986:94)
of contrastive stress as a basis for learning other parts of the system. Second,
the assumption that pro-drop languages do not have expletives is suspect.
Welsh is (probably) a pro-drop language but normally manifests expletives
in sentences like that in (8) - where "e" is an expletive pronoun, not an
empty category:
(8) cy ydy e
so the structural evidence the child can use is less clear-cut than Hyams'
argument requires. Third, although it is "general", stress is not
a necessary concomitant of subject pronouns (in pro-drop or
non-pro-drop languages), so the evidence available to the child is of minimal salience.
6. FIXING
Let us see a little more closely how a parameter becomes fixed, working
on the assumption put forth in Chomsky (1987:61) that "the initial state
It is assumed that the child knows the meanings of the individual words
and that he perceives some relation between these words and the actions
associated with them on particular occasions. By hypothesis, moreover,
UG will provide him with categories like V and N, and X-bar theory
will give him the category VP. Accordingly, ate will be identified as a
V, beans as its internal argument, and ate beans will be automatically
analysed as VP. As V is the Head of VP it follows that the parameter
will be set to Head-first. That is, given the data and innately specified
knowledge about UG, in particular X-bar theory, the analysis is indeed
deterministic.
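
The deterministic step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a procedure given in the text: the `Phrase` class, the `head_direction` function and the labels "head-first" and "head-last" are hypothetical names, and the analysis of ate beans as a VP headed by ate is assumed to have been supplied already by UG and the child's perception of the situation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phrase:
    """A phrase that has already been analysed (hypothetical representation)."""
    category: str   # e.g. "VP", assumed to be supplied by X-bar theory
    words: tuple    # the words in surface order
    head: str       # the word identified as the Head of the phrase


def head_direction(phrase: Phrase) -> str:
    """Read the parameter setting directly off the analysed phrase.

    No search or hypothesis testing is involved: once the Head and its
    internal argument are identified, the setting follows from word order.
    """
    if len(phrase.words) < 2:
        raise ValueError("a Head with no complement gives no evidence")
    position = phrase.words.index(phrase.head)
    if position == 0:
        return "head-first"
    if position == len(phrase.words) - 1:
        return "head-last"
    raise ValueError("Head is neither initial nor final in this phrase")


# ate is identified as V, beans as its internal argument, ate beans as VP.
vp = Phrase(category="VP", words=("ate", "beans"), head="ate")
setting = head_direction(vp)
```

The point of the sketch is that all the work lies in arriving at the analysed phrase; given that analysis, the setting is a simple function of it.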
Provided one accepts that the child's perception of the situation described
above allows him to identify beans rather than Fred as the internal argument
of ate, it is not difficult to see how this particular parameter can be fixed
deterministically in the absence of further pragmatic considerations. Is
it possible to provide an equally deterministic account of the fixing of
other parameters in the same autonomous fashion? Take as an example
the Subject Antecedent Parameter, according to which a proper antecedent
for an anaphor is either: a) a subject NP or: b) any NP. In English, the
setting for the parameter is (b), in most other languages it is (a); so (10)
is ambiguous in English - with himself able to refer to John or Bill, whereas
its congener in Hindi or Swedish is univocal - with only John as a possible
antecedent.
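
The two settings of the Subject Antecedent Parameter can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the function name `possible_antecedents` and the representation of noun phrases as (name, is_subject) pairs are hypothetical, and the sentence is assumed to contain John as subject and Bill as a non-subject NP, as the discussion of the example suggests.

```python
def possible_antecedents(noun_phrases, setting):
    """Return the NPs that may serve as antecedent for an anaphor.

    noun_phrases: list of (name, is_subject) pairs (hypothetical format).
    setting: "a" = only a subject NP is a proper antecedent,
             "b" = any NP is a proper antecedent.
    """
    if setting == "a":
        return [name for name, is_subject in noun_phrases if is_subject]
    if setting == "b":
        return [name for name, _ in noun_phrases]
    raise ValueError("setting must be 'a' or 'b'")


# Assumed structure of the example: John is the subject, Bill is not.
nps = [("John", True), ("Bill", False)]

english_like = possible_antecedents(nps, "b")   # ambiguous: two candidates
hindi_like = possible_antecedents(nps, "a")     # univocal: one candidate
```

Under setting (a) only John is returned, whereas under setting (b) both John and Bill are candidates, which is the ambiguity the text attributes to English.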
7. CONCLUSION
Where does this leave us with our initial question? On the one hand, it
is clear that pragmatic factors are not directly causally involved in the
fixing of parameters in the way that principles of Universal Grammar
such as X-bar theory are, so Chomsky's claim that the development of
grammar is independent of cognitive considerations is partially vindicated.
On the other hand, Hyams' contention that pragmatic principles play a
role is indirectly correct, in that it seems necessary to assume that:
pragmatics (in the form of the Principle of Relevance) contributes to providing the data
which constitute the evidence for the analysis which, once arrived at, deterministically
sets the parameter.
Finally, this formulation suggests the need for greater care than is customary
in the use of the term "primary linguistic data". The linguistic data the
child uses in acquiring his first language are representations - phonological,
FOOTNOTES
* I am grateful to Iggy Roca for inviting me to present the predecessor of this paper at
the University of Essex, and for coercing me into resuscitating it subsequently. I am likewise
grateful to those who contributed to the discussion and forced me to revise my ideas (some
of which appeared in Smith, 1988a). I am particularly indebted to Michael Brody, Robyn
Carston, Annabel Cormack, Deirdre Wilson, and an anonymous referee, who have all plied
me with constructive suggestions and saved me from innumerable solecisms and stupidities.
I alone am to blame for remaining errors and infelicities in the paper. Iggy is to blame
for its appearing at all.
REFERENCES