Language Universals
With Special Reference to Feature Hierarchies
by
Joseph H. Greenberg
with a preface by
Martin Haspelmath
Mouton de Gruyter
Berlin New York
Mouton de Gruyter (formerly Mouton, The Hague)
is a Division of Walter de Gruyter GmbH & Co. KG, Berlin.
ISBN 3-11-017284-4
be sure, Language Universals was widely read and cited, and the fact
that the terms marked and unmarked are known to every second-
year linguistics student is to a considerable extent due to its influence.
But Greenberg's earlier 1963 article (with its even clumsier title
"Some universals of grammar with particular reference to the order
of meaningful elements") became far more influential; the book in
which it appeared had to be reprinted three years later and is still
widely available on the antiquarian market, and Greenberg's article
is still commonly assigned as reading to graduate students in linguis-
tics.
Language Universals, too, should be compulsory reading for lin-
guists. The main reason why it did not come close to Greenberg's
word order work was that it mostly deals with phonology, morphol-
ogy, and kinship terminology. But in the 1960s and 1970s, the field
of linguistics was obsessed with syntax and its relation to semantics,
and many of the students entering the field did not have the solid
grounding in historical-comparative linguistics or the linguistics of
some non-European languages that was characteristic of Greenberg's
generation, and that could have helped readers to appreciate the full
significance of the proposed universals. Morphology was simply not
a hot topic, and phonology had to be done in Chomsky and Halle's
(1968) generative framework, which was more interested in morpho-
phonology than in explaining truly phonological patterns and relat-
ing them to phonetic factors. Greenberg's (1963) work on word order
universals was just as remote in spirit from the widely popular gener-
ative syntactic model as his phonological work was from generative
phonology, but the potential relevance of his word order universals
to Chomsky's "Universal Grammar" approach to syntax was evident
to everyone. In the 1980s, generative linguists began to incorporate
Greenberg's discoveries into their theories of Universal Grammar.
The markedness universals of Language Universals never made it onto
the agenda of generative grammarians (in phonology, markedness is
now widely discussed again in the framework of Optimality Theory
[McCarthy 2002], but it mostly follows the markedness concept of
chapter 9 of Chomsky and Halle 1968 rather than Greenberg's).
The full impact of the ideas of Greenberg's typological marked-
ness theory on the field of linguistics is apparently still ahead of us.
That statistical regularities of language use are intimately connected
with language structure and are in fact an important ingredient for
Preface to the reprinted edition by Martin Haspelmath ix
did he not do this in Language Universals? This book does not
contain a single numbered universal, set off from the main text in the
way in which typologists now routinely highlight their precious dis-
coveries.
The reason is simple: Language Universals contains too many uni-
versals to list them all! In an understatement, Greenberg (p. 10) an-
nounces "a considerable number of specific universals". And they
need not be listed individually, because they can be derived in a me-
chanical fashion from "a single rich and complex set of notions"
(p. 10). All we need to list is the set of (un)markedness properties
(called "markedness criteria" in Croft 1990) and the set of category
pairs (or more generally, category hierarchies). A few such properties
and category pairs are listed in (1)-(2).
(1) phonology
    unmarkedness properties:            category pairs:
    neutralization                      voiceless/voiced
    higher text frequency               short/long
    greater phonemic differentiation    non-nasal/nasal
    greater subphonemic variation       unpalatalized/palatalized
    typological implicatum              non-glottalized/glottalized
    basic allophone                     unaspirated/aspirated
(2) grammar
    unmarkedness properties:            category pairs:
    facultative expression              singular/plural
    contextual neutralization           direct case/oblique case
    higher text frequency               masculine/feminine
    zero expression                     positive/comparative
    syncretism                          3rd person/1st and 2nd person
    defectivation                       indicative/hypothetical
    irregularity                        present tense/future tense
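The mechanical derivation described above can be sketched in a few lines of Python. This is only an illustration: it crosses the properties and category pairs listed under (2) and produces one candidate statement per combination. The function name `candidate_universals` and the sentence template are assumptions made for this sketch, not wording from the book, and the sketch ignores any restrictions on which properties are relevant to which pairs.

```python
from itertools import product

# Unmarkedness properties and category pairs as listed under (2).
PROPERTIES = [
    "facultative expression",
    "contextual neutralization",
    "higher text frequency",
    "zero expression",
    "syncretism",
    "defectivation",
    "irregularity",
]
CATEGORY_PAIRS = [  # (unmarked member, marked member)
    ("singular", "plural"),
    ("direct case", "oblique case"),
    ("masculine", "feminine"),
    ("positive", "comparative"),
    ("3rd person", "1st and 2nd person"),
    ("indicative", "hypothetical"),
    ("present tense", "future tense"),
]


def candidate_universals(properties, pairs):
    """Return one candidate statement per (category pair, property) combination.

    The template is hypothetical and deliberately neutral: it only records
    which member is predicted to pattern as unmarked for the given property.
    """
    return [
        f"{prop}: {unmarked} is the unmarked member, {marked} the marked member"
        for (unmarked, marked), prop in product(pairs, properties)
    ]


statements = candidate_universals(PROPERTIES, CATEGORY_PAIRS)
print(len(statements))  # 7 properties x 7 pairs = 49
print(statements[0])
```

Even this short list yields 49 candidate statements, which gives a sense of why the universals are not enumerated one by one.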
For each category pair, it is claimed that universally (i. e., in all lan-
guages), the unmarked member will exhibit the unmarkedness prop-
erties of (1) and (2). For example, the following universals are among
those hypothesized by Greenberg:
[Rotated tables or figures on pp. xi-xiv of the preface: content not recoverable from the source; the only legible label is "unmarked value".]
Notes
1 On p. 14, Greenberg mentions that Trubetzkoy (1939: 230-41) noted the correlation
between higher text frequency and unmarkedness, but Trubetzkoy (in contrast to Zipf)
did not assign much real significance to text frequency. He explicitly rejected Zipf's
ideas about frequency as a causal factor in phonological simplicity. In a letter to
Jakobson in 1930, he put it bluntly: "statistics are beside the point" (Trubetzkoy 1975:
162, cf. Andersen 1989:21).
2 The properties "dominance" (p. 30) and "agreement a potiori" (p. 31) seem to be
relevant only to number and gender, respectively, so they are not included in the
count here.
3 Notice, incidentally, that Greenberg often used the term "feature" where nowadays
"(feature) value" would be used.
4 As Croft (2003) points out, this is true for most of the correlating properties, but not
for facultative expression and neutralization, so this is another reason for treating
these properties separately.
5 Note that Greenberg's "feature hierarchies" are very different from Silverstein's (1976)
"hierarchy of features", which is a true hierarchy (not a scale) and involves binary
features (i. e., features with two values, plus and minus).
References
Aissen, Judith
1999 Markedness and subject choice in Optimality Theory. Natural Language
and Linguistic Theory 17: 673-711.
Andersen, Henning
1989 Markedness theory - the first 150 years. In Mišeska Tomić, Olga (ed.),
Markedness in Synchrony and Diachrony, 11-46. Berlin: Mouton de
Gruyter.
Barlow, Michael and Suzanne Kemmer (eds.)
2000 Usage-Based Models of Language. Stanford: CSLI Publications.
Bod, Rens, Jennifer Hay, and Stefanie Jannedy (eds.)
2003 Probabilistic Linguistics. Cambridge, Mass.: MIT Press.
Bybee, Joan L. and Paul Hopper (eds.)
2001 Frequency and the Emergence of Linguistic Structure. Amsterdam: Benja-
mins.
Chomsky, Noam and Morris Halle
1968 The Sound Pattern of English. New York: Harper & Row.
Croft, William
1990 Typology and Universals. Cambridge: Cambridge University Press.
2003 Typology and Universals. 2nd ed. Cambridge: Cambridge University
Press.
Greenberg, Joseph H.
1963 Some universals of grammar with particular reference to the order of
meaningful elements. In Greenberg, Joseph H. (ed.), Universals of
Language, 73-113. Cambridge, MA: MIT Press.
Haspelmath, Martin
2005 Against markedness (and what to replace it with). Ms., Max-Planck-
Institute for Evolutionary Anthropology, Leipzig.
Lehmann, Christian
1989 Markedness and grammaticalization. In Mišeska Tomić, Olga (ed.),
Markedness in Synchrony and Diachrony, 175-90. Berlin: Mouton de
Gruyter.
Mayerthaler, Willi
1981 Morphologische Natürlichkeit. Wiesbaden: Athenaion.
McCarthy, John J.
2002 A Thematic Guide to Optimality Theory. Cambridge: Cambridge
University Press.
Prince, Alan and Paul Smolensky
1993 Optimality Theory: Constraint Interaction in Generative Grammar. (Tech-
nical report, Rutgers University Center for Cognitive Science) Rutgers
University.
Silverstein, Michael
1976 Hierarchy of features and ergativity. In Dixon, R. M. W. (ed.), Gram-
matical Categories in Australian Languages, 112-71. Canberra: Austral-
ian Institute of Aboriginal Studies.
Tiersma, Peter
1982 Local and general markedness. Language 58: 832-49.
Trubetzkoy, Nikolaj
1939 Grundzüge der Phonologie. Göttingen: Vandenhoeck & Ruprecht.
1975 Letters and notes. The Hague: Mouton.
Zipf, George K.
1935 The Psycho-Biology of Language: An Introduction to Dynamic Philology.
Houghton Mifflin. (Republished 1965 by MIT Press.)
1949 Human Behavior and the Principle of Least Effort: An Introduction to
Human Ecology. Cambridge, MA: Addison-Wesley.
Joseph H. Greenberg
May 28, 1915 - May 7, 2001
PREFACE
Preface
1. Introduction: Marked and Unmarked Categories
2. Phonology
3. Grammar and Lexicon
4. Common Characteristics in Phonology, Grammar, and Lexicon
5. Universals of Kinship Terminology
References
INTRODUCTION
MARKED AND UNMARKED CATEGORIES
3
N. S. Trubetzkoy, Grundzüge der Phonologie (Prague, 1939). It is not the
purpose here to give a detailed historical account. The first occurrence of the
terminology marked and unmarked (in phonology) appears to be by Trubetzkoy
in 1931, "Die phonologischen Systeme", TCLP, 4, 96-116, especially p. 97.
The first explicit use of this terminology for grammatical categories is probably
by Jakobson in "Zur Struktur des russischen Verbums", Charisteria Guilelmo
Mathesio ... 74-84 (Prague, 1932). Cf. also with a different terminology
Hjelmslev in La Catégorie des Cas (Aarhus, 1935), particularly p. 113. Earlier
adumbrations of these ideas in reference to inflectional categories are to be
found in certain Russian grammarians, e.g. Peshkovskij, Karcevskij.
1
R. Jakobson, "Signe zéro", in Mélanges Bally 143ff. (Geneva, 1939).
subjects, applications will first be considered in phonology and
then in grammar and semantics. The treatment will be at least
partly in terms of the history of the subject, but it should be under-
stood that the historical material is purely illustrative and merely
incidental to the main purpose.
PHONOLOGY
spending short vowels, that is, that length is the marked feature.
In general Zipf's hypotheses regarding aspirated and voiced
consonants hold, although there are a few exceptions. It may be noted
that Ferguson's hypothesis regarding the relatively greater text
frequency of non-nasal over nasal vowels is consonant with the
general thesis of the greater frequency of unmarked features.2
Additional data on the less frequently considered cases of marked
and unmarked phonologic features compiled by myself are presented
here, along with some evidence already published in other sources
and cited here for purposes of comparison. My own data are to
be considered tentative insofar as the samples are small, usually
1000 phonemes. The results, nevertheless, are obviously significant
and unlikely to be seriously modified by subsequent work. The
following are examples of counts, all done by myself, on the relative
frequency of glottalic and non-glottalic consonants in the following
languages: Hausa, in West Africa, and the Amerind languages
Klamath, Coos, Yurok, Chiricahua Apache, and Maidu. In the
case of Hausa, voiced implosives contrast with ordinary voiced
consonants in the pairs b/ɓ and d/ɗ, and glottalized consonants
contrast with non-glottalized in the pairs k/k', s/s' (in Kano and
some other dialects usually ts'), and y/'y. In Maidu voiced im-
plosives as well as glottalized contrast with ordinary unvoiced
consonants in certain positions. In the other languages a single
series of unvoiced, unglottalized consonants occurs, but for
Chiricahua I have counted the three series unaspirated, aspirated,
and glottalized. The results for each language are found in Tables
I, II, III, IV, V and VI, and the results for the six languages are
summarized and compared in Table VII.3
2
C. A. Ferguson, "Assumptions about nasals; a sample study in phonological
universals", in Universals of language ed. J. H. Greenberg 46 (Cambridge,
Mass., 1963).
3 The Hausa count consists of the first 1000 phonemes on pages 1, 5 and 9 of
R. C. Abraham Hausa literature and the Hausa sound system (London, 1959)
[Greenberg]; Klamath from M. A. R. Barker, Klamath texts (Berkeley and
Los Angeles, 1963), first 1000 on pages 6, 16 and 26 [Greenberg]; Coos from
L. Frachtenberg, Coos texts (Leyden, 1913) first 1000 from pages 5, 7, 14, 17,
20 and 24 (14 from commencement of new story on middle of page) [Greenberg];
Yurok from R. H. Robins, The Yurok language (Berkeley and Los Angeles,
TABLE I
Hausa (1000 phonemes)
b    17.0    ɓ     00.2
d    19.8    ɗ     03.7
k    21.9    k'    02.8
s    14.2    ts'   00.3
y    19.3    y'    00.8
TABLE II
Klamath (1000 phonemes)
p    02.8    b    01.8    p'    00.3
t    07.6    d    04.7    t'    01.9
     08.7    j    00.2     '    01.5
k    10.4    g    06.1    k'    02.1
q    02.4    g    02.4    q'    01.9
l    05.4    L    00.5          00.7
m    04.0    M    00.4    m'    01.3
n    13.9    N    00.1    n'    00.8
w    08.3    W    00.2    w'    00.4
y    08.6    Y    00.0    y'    00.6
1958) first 1000 on pages 162, 164 and 166 [Greenberg]; Chiricahua from
H. Hoijer, Chiricahua and Mescalero Apache texts (Chicago, 1938), first 1000
on pages 5, 10, 15, 20, 23 and 25 [Greenberg]; Maidu from W. F. Shipley, Maidu
texts and dictionary (Berkeley and Los Angeles, 1963) first 1000 on pages 10,
20, 30 and 40 [Greenberg].
TABLE V
Chiricahua (1000 phonemes)
unaspirated/unglottalized    aspirated     glottalized
d    28.2                    t    05.3     t'    03.3
z    03.0                    c    05.8     c'    00.0
Z    07.7                         01.8     C'    02.8
     02.3                         00.1      '    01.2
g    21.4                    k    13.4     k'    03.7
TABLE VI
Maidu (1000 phonemes)
p     09.3    p'     00.5    ɓ    05.9
t     19.6    t'     01.4    ɗ    13.1
ts    00.2    ts'    19.9
k     19.2    k'     11.4
TABLE VII
Summary
Hausa:      non-glottalic 92.2; glottalic 07.8
Klamath:    unvoiced stops/voiced sonants 72.1; voiced stops/unvoiced sonants 16.4; glottalized 11.5
Coos:       unvoiced non-glottalized 88.2; glottalized 11.8
Yurok:      unvoiced non-glottalized 85.8; glottalized 14.2
Chiricahua: unaspirated 62.6; aspirated 26.4; glottalized 11.0
Maidu:      unvoiced unglottalized 48.3; glottalized 32.7; implosive 19.0
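The relation between the individual tables and the summary in Table VII is simple addition, which the following Python sketch illustrates for Hausa (Table I) and Chiricahua (Table V). The figures are copied from those tables; the dictionary layout and the function name `series_totals` are assumptions made for this illustration only.

```python
# Percentages copied from Tables I and V (each count is based on
# 1000 phonemes, so every figure is a percentage of the sample).
HAUSA = {
    "non-glottalic": [17.0, 19.8, 21.9, 14.2, 19.3],  # b, d, k, s, y
    "glottalic": [0.2, 3.7, 2.8, 0.3, 0.8],           # glottalic partners
}
CHIRICAHUA = {
    "unaspirated": [28.2, 3.0, 7.7, 2.3, 21.4],
    "aspirated": [5.3, 5.8, 1.8, 0.1, 13.4],
    "glottalized": [3.3, 0.0, 2.8, 1.2, 3.7],
}


def series_totals(table):
    """Sum each consonant series and round to one decimal, as in Table VII."""
    return {series: round(sum(values), 1) for series, values in table.items()}


hausa_totals = series_totals(HAUSA)
chiricahua_totals = series_totals(CHIRICAHUA)
print(hausa_totals)       # {'non-glottalic': 92.2, 'glottalic': 7.8}
print(chiricahua_totals)  # {'unaspirated': 62.6, 'aspirated': 26.4, 'glottalized': 11.0}
```

The totals agree with the Hausa and Chiricahua rows of Table VII.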
feature than aspiration. However, for the sets with the lowest
over-all frequency /' and /' this is reversed. There is, however,
no exception in Chiricahua to the rule that in each set the consonant
in the first column is more frequent than that in either the second
or third.
For vowel nasalization, Ferguson and Chowdhury report a short
count on Bengali vowels in which the ratio of non-nasalized to
nasalized vowels was 50:1. I counted the first thousand vowels
in Stendhal's Le rouge et le noir and found 82.5% oral vowels to
17.5% nasal.4 In connection with vowel length, further data are
given below for Chiricahua Apache in which the vowel system
involves both length and nasalization. Here the ratio of oral
vowels, whether short or long, was 12.8:1 to the number of nasalized
vowels, whether short or long.
Data are now presented for vowel length for Icelandic, Sanskrit,
Hungarian, Finnish, Karok, and Chiricahua.6
TABLE X                          TABLE XI
Czech                            Hungarian
a   6.83    a:   2.08            a   22.48    a:   11.62
e   9.40    e:   1.11            e   26.64    e:   07.57
i   6.49    i:   3.70            i   09.30    i:   01.06
o   8.24    o:   0.00            o   11.00    o:   01.95
u   3.07    u:   0.60            u   02.63    u:   00.70
                                     02.95     :   01.62
                                     01.08     :   00.32
TABLE XIV
Chiricahua
a   31.8    a:   06.8    ą   00.8    ą:   00.7
e   07.7    e:   05.0    ę   00.0    ę:   00.0
i   19.4    i:   02.5    į   03.5    į:   00.9
o   08.1    o:   04.4    ǫ   00.8    ǫ:   00.1
vocalic nasals: n 07.4; n: 00.1
For Icelandic, Sanskrit, and Czech the frequencies have been given
in reference to the entire set of phonemes; for the others the
percentages are of the vowel total. In Table XV the figures are all
reduced to percentages of vowel occurrences:
TABLE XV
Icelandic:  short vowels 83.3; long vowels 16.7
Sanskrit:   short vowels 74.8; long vowels 25.2
Czech:      short vowels 82.0; long vowels 18.0
Hungarian:  short vowels 75.2; long vowels 24.8
Finnish:    short vowels 91.7; long vowels 08.3
Karok:      short vowels 80.0; long vowels 20.0
Chiricahua:
  short non-nasal vowels 67.0; long non-nasal vowels 18.7;
  short nasal vowels 05.1; long nasal vowels 01.7;
  short syllabic nasal 07.4; long syllabic nasal 00.1
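The reduction to percentages of vowel occurrences can be illustrated with the Czech figures of Table X, which are given as percentages of all phonemes. The Python sketch below is a worked example under that reading; the function name `vowel_shares` is an assumption of this illustration, and the fourth row of the Czech table is read as o.

```python
# Czech figures from Table X: percentages of ALL phonemes in the count.
CZECH_SHORT = [6.83, 9.40, 6.49, 8.24, 3.07]  # a, e, i, o, u
CZECH_LONG = [2.08, 1.11, 3.70, 0.00, 0.60]   # a:, e:, i:, o:, u:


def vowel_shares(short, long_):
    """Re-express short and long totals as percentages of vowel occurrences."""
    short_total, long_total = sum(short), sum(long_)
    vowel_total = short_total + long_total  # vowels as a share of all phonemes
    return (
        round(100 * short_total / vowel_total, 1),
        round(100 * long_total / vowel_total, 1),
    )


short_pct, long_pct = vowel_shares(CZECH_SHORT, CZECH_LONG)
print(short_pct, long_pct)  # 82.0 18.0
```

The result matches the Czech row of Table XV (short vowels 82.0; long vowels 18.0).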
TABLE XVI
100 samples, 1000 each
p    23.090    p'     4.750
b    10.960    b'     3.680
f     9.470    f'      .590
v    29.780    v'    10.160
t    42.660    t'    18.850
d    16.650    d'    10.390
s    30.930    s'    18.630
k    31.750    k'     5.340
m    23.170    m'     8.050
n    41.000    n'    22.970
l    26.640    l'    20.810
r    29.070    r'    13.810
inner connection among these and to what extent they are logically
independent has not been treated. This matter will be taken up later
after the problem of the marked and unmarked in grammar and
semantics has also been considered. One additional observation
may be offered before going on to these other topics. It should be
noted that in some cases we had what might be called conditional
categories for marked and unmarked. For example, whereas for
obstruents, voicing seems clearly the marked characteristic, for
sonants the unvoiced feature has many of the qualities of a marked
category.
GRAMMAR AND LEXICON
1
L. Hjelmslev, Prolegomena to a theory of language (Baltimore, 1953);
B. Trnka, "On Some Problems of Neutralization", Omagiu lui Iorgu Iordan
861-6 (Bucharest, 1958).
where the context makes it clear that we are dealing with non-
phonological matters. In certain environments the opposition
between two or more categories is suppressed, and it is the un-
marked member which appears. In Hungarian, Turkish and
certain other languages only the singular form of nouns may appear
with cardinal numbers. This is obviously the closest analogue to
neutralization in phonology.
A fifth characteristic is the lesser degree of morphological
irregularity in marked forms. For example, in the verbs of classical
Arabic, the basic form as against such derived forms as the causative
and intensive shows variation in the internal vowel of the imperfect,
i.e. the forms yaqtilu, yaqtulu, and yaqtalu all exist so that there are
three allomorphs in the discontinuous morpheme of the imperfect
tense forms. In all the derived forms there is a single allomorph,
e.g. in yuqattilu in the corresponding form of the intensive. In
German all dative plurals have uniformly -n or -en depending on
phonological factors while the dative singular varies with gender
and declensional class. In Sanskrit, the dual which is so to speak
even more marked than the plural has not only extensive case syn-
cretism so that there are only three distinct forms but also greater
regularity than plural or singular, particularly in the oblique cases.
It may be observed that in general the oblique cases have a marked
character as against the direct cases.
A sixth characteristic will be called, in conformity with Hjelmslev's
terminology, defectivation. The marked category may simply
lack certain categories present in the unmarked category. In fact
for inflectional categories, defectivation can be considered a form
of syncretism. Thus one might say that in the marked subjunctive
category, French lacks a future. This would be in conformity with
the usual terminology of grammars of French, but one might also
argue that there has been syncretism of the present and future in the
subjunctive and that the concept of defectivation rests in the
identification of the subjunctive as a form of the present rather
than the future because of its greater formal resemblance to the
present indicative. It is of interest to note here that the present as
an unmarked category in relation to the future is taken as the
Another instance is Sanskrit ahani 'the days (dual)' for 'day and
night'. Compare also such usages as Spanish los padres for 'parents'
lit. 'the fathers'; los hijos 'the children' lit. 'the sons'.
A related phenomenon is agreement a potiori in which words
from two or more selective categories such as gender have a
common modifier and the modifier is in the unmarked category,
e.g. Spanish el hijo y la hija son buenos 'The son and the daughter
are good' (masculine plural).
Finally the question may be raised whether an analogue to the
frequency phenomenon in phonology exists likewise for grammatical
categories. Data here are very sparse, for there are very few word
frequency studies which give information about the frequency of
the grammatical categories to which the words belong. Data will
be presented at this point only in regard to the category of number
in the noun where there is much evidence for a hierarchy singular,
plural, dual from the most unmarked to the most marked. Cor-
responding to the situation in phonology we might expect that the
text frequency for nominal categories of numbers will be singular
(most frequent), plural (less frequent), and dual (least frequent).
The data that I have been able to collect are: (1) for the noun in
the Rigveda by C. H. Lanman; (2) for the Russian noun by
Josselson; (3) for the Latin noun by using the data in the exhaustive
concordance of Terence by Edgar B. Jenkins.8 Zipf's list of word
frequencies compiled from four plays of Plautus was not suitable
for this purpose because homonymous forms are lumped together
(even forms which differ in vowel quantity), thus the occurrences of
eo 'go, thither, in him, in it' are all under one undifferentiated
entry. (4) I have recorded for the first thousand nouns in François
Mauriac's La chair et le sang whether they were singular or plural.
Data from these and other studies will be cited later in regard to
other grammatical categories. The results for number in the noun
are set forth in Table XVII.
TABLE XVII
masculine singular, of the verb can occur whether the noun subject
is singular, dual, or plural. In Semitic languages where the verb
has both sex gender and nominal concord with the subject there
is often syncretization of gender in the plural. Thus in Biblical
Hebrew in the perfective of the verb, there is gender distinction
in the third person singular but not in the plural. Many colloquial
Arabic dialects have gender distinction in the singular second and
third person but not in the plural for either person. In Tunica, an
Amerind language of the Gulf group, the verb has singular and
plural agreement forms. According to Haas, "the use of the plural
is far from consistent; one finds cases of plural occurrence referred
to by singular".7 This then appears to be an instance of facultative
expression.
Zero expression of the singular is common in imperatives.
Instances are German which always suffixes -t or -et to the singular
imperative to form the plural, and Russian which always suffixes
-te for the plural.
Quantitative data regarding number in verb forms are presented
in Table XX for Vedic Sanskrit, Latin, and Russian.8
TABLE XX
                            Total     Singular    Dual    Plural
Sanskrit                    29,370    71.0        05.6    23.4
Latin                       10,948    91.0                09.0
Russian (conversational)     3,560    77.1                22.9
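The expectation that text frequency decreases along the hierarchy singular, plural, dual can be checked against the verb figures of Table XX with a short Python sketch. The data structure and the function name `follows_hierarchy` are assumptions of this illustration; where the table reports no dual (Latin, Russian), the value is recorded as `None` and skipped, and the second figure in those rows is read as the plural.

```python
# Verb-number percentages from Table XX (None = no dual reported).
VERB_NUMBER = {
    "Sanskrit": {"singular": 71.0, "dual": 5.6, "plural": 23.4},
    "Latin": {"singular": 91.0, "dual": None, "plural": 9.0},
    "Russian": {"singular": 77.1, "dual": None, "plural": 22.9},
}
HIERARCHY = ["singular", "plural", "dual"]  # most unmarked -> most marked


def follows_hierarchy(freqs, hierarchy=HIERARCHY):
    """True if the reported frequencies strictly decrease along the hierarchy."""
    reported = [freqs[c] for c in hierarchy if freqs.get(c) is not None]
    return all(a > b for a, b in zip(reported, reported[1:]))


results = {lang: follows_hierarchy(f) for lang, f in VERB_NUMBER.items()}
print(results)  # {'Sanskrit': True, 'Latin': True, 'Russian': True}
```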
like the same closeness of fit that is generally possible for the category
of number. There are often a different number of cases and even
the same conventional name may hide important differences.
However, some reasonable, if rough, equivalences can be made,
e.g. the notion of direct cases (nominative, accusative, vocative) as
a group and the oblique cases. There is generally a possessive or
genitive case and a case of the subject and one of the object. Con-
fining ourselves to the direct/oblique opposition, we often find
that one or more direct cases have zero expression as compared
to the oblique suggesting that the direct cases comprise an un-
marked category in relation to the oblique. Thus in Turkish the
nominative and the indefinite accusative have a zero affix, while
definite accusative and all the oblique cases have overt marking.
In Sanskrit neuter nouns are distinguished from masculine and
feminine nouns in the direct cases of the plural, but this gender
opposition is neutralized in the oblique cases. In Latin the neuter
and masculine in general are only distinguished in the direct cases
and are merged in the oblique.9
In Table XXI data for direct and oblique case of the noun are
given for Sanskrit, Latin, and Russian.
TABLE XXI
Sample Size Direct Oblique
Sanskrit 93,277 72.5 27.5
Latin 8,342 68.7 31.3
Russian 6,194 65.2 34.8
The total for the direct cases is thus substantially greater than for
the oblique cases even though for each language the number of
oblique cases is larger.
For the category of gender, whether sex or non-sex, the evidence
is less clear than for the items already discussed. Here the problem
of interlinguistic comparability is, in general, even more difficult
than for case systems. By such terms as masculine or feminine are
9 The fourth declension is a marginal exception in that standard grammars
give for the dative singular -ui but for the rare neuter -u. In fact -u is at least
equally as common as -ui for the masculine.
TABLE XXIII
           Spanish                French                 German
      Cardinal    Ordinal    Cardinal    Ordinal    Cardinal    Ordinal
 1    36,000+     9,698      1,000+      817        230,000+    10,960
 2    36,000+     4,188      1,000+      237        7,331       4,760
 3    36,000+     2,365      631+        97         4,535       2,489
 4    5,714       (3,923)    349+        31         2,073       760
 5    3,714       1,341      336         17         1,296       352
 6    2,654       611        193                    1,015       277
 7    1,960       273        157                    669         186
 8    1,894       (589)      229         12         (1,018)     (490)
 9    955         463        92          7          264         122
10    2,078       112        244         12         921         154
The vast size of the Kaeding count and the fact that in German
higher numbers are treated orthographically as single words e.g.
zweiundzwanzig versus 'twenty-two' allows us to pursue this topic
for the interval 11-99. The relatively higher frequency of multiples
of 10 and the small absolute frequencies of numbers in this range
even in a very large scale count would seem to justify a summary
by decades. The results once again are strikingly confirmatory as
can be seen from Table XXIV.11
10
Spanish from I. R. Bou, op. cit.; French, G. E. Vander Beke, French word
book (New York, 1929); German, F. W. Kaeding, Häufigkeitswörterbuch der
Deutschen Sprache (Berlin, 1897-8). In all three languages the numeral 'one'
includes occurrences of the definite article. For French 1000+ is arbitrarily
chosen for the 69 items not counted by Vander Beke. These are the 69 most
frequent items in the earlier and smaller count of Henmon. The highest
frequency among the items figuring in the Vander Beke count is quelque with
1232 occurrences. The figures in parentheses have larger than expected
frequencies because they include homonymous forms, e.g. Spanish cuarto, both
'fourth' and 'room.'
11
F. W. Kaeding, op. cit.
TABLE XXIV
          Cardinal    Ordinal
 2-9      18,199      9,590
10-19      2,307        822
20-29        721        115
30-39        416         69
40-49        220         93
50-59        239         17
60-69        113         10
70-79        101         13
80-89         90         20
90-99         55          8
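One way to read Table XXIV is to compare, interval by interval, the frequency of cardinals with that of ordinals. The Python sketch below does this with the figures of the table; the function names `cardinal_exceeds_ordinal` and `cardinal_to_ordinal_ratios` are assumptions of this illustration, and the comparison shown is only one of the readings the table allows.

```python
# (interval, cardinal frequency, ordinal frequency) rows from Table XXIV.
TABLE_XXIV = [
    ("2-9", 18199, 9590),
    ("10-19", 2307, 822),
    ("20-29", 721, 115),
    ("30-39", 416, 69),
    ("40-49", 220, 93),
    ("50-59", 239, 17),
    ("60-69", 113, 10),
    ("70-79", 101, 13),
    ("80-89", 90, 20),
    ("90-99", 55, 8),
]


def cardinal_exceeds_ordinal(rows):
    """True if cardinals outnumber ordinals in every interval."""
    return all(cardinal > ordinal for _, cardinal, ordinal in rows)


def cardinal_to_ordinal_ratios(rows):
    """Ratio of cardinal to ordinal frequency per interval, one decimal."""
    return {interval: round(c / o, 1) for interval, c, o in rows}


all_intervals_agree = cardinal_exceeds_ordinal(TABLE_XXIV)
ratios = cardinal_to_ordinal_ratios(TABLE_XXIV)
print(all_intervals_agree)  # True
print(ratios["60-69"], ratios["80-89"])  # 11.3 4.5
```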
We now consider the category of person. The situation is complex.
In general the third person appears as the 'most unmarked' and
may be considered as in opposition to the first/second person.
The following are examples tending to show the unmarked status
of the third person vis-a-vis the first and second persons. In Syriac
in the perfective form of all conjugations, basic and derived, the
third person masculine singular and the plural for both genders
has zero expression. In the Akkadian permansive and the Hebrew
perfective the third person masculine is the only zero form. In
some languages, e.g. Latin, there is a class of impersonal verbs
which are neutralized for person and in which the third person
appears as the surrogate for all three persons. In Masai and Nandi,
Nilotic languages of East Africa, there are verb forms which
include the pronominal object. The third person object has zero
expression. The same is true for Kanuri, a Saharan language of
a different branch of Nilo-Saharan. There are a few discordant
facts, e.g. the zero of the first person singular in the Dutch verb
where the third person has overt form. As between first and second
persons the predominant evidence is for the unmarked status of the
first person. In German in the preterite both the first and third
person singular have zero, whereas the second person singular
and the entire plural have suffixes. However in imperative and
hortatory forms, the second person is evidently the unmarked form
and frequently has zero expression.
Examples are Danish in which the passive suffixes -es, while the
active has no overt mark, and Swahili in which the passive is
formed by -w suffixed to the active stem. Sometimes as in English,
the passive is formed periphrastically. The passive often syncretizes
forms which are distinct in the active. Thus Finnish has a single
form for the three persons and two numbers of the active. In
Albanian the indicative and subjunctive are not distinguished in
the passive. The expected higher frequency of the active over the
mediopassive is shown by the data of Table XXVI, from Latin
(Terence) and Vedic Sanskrit:
TABLE XXVI
Active Passive
Latin 90.2 09.8
Sanskrit 73.1 26.9
We now consider mode. Here there are of course considerable
interlinguistic differences. Primary is the difference between the
indicative from which statements can be formed which are true
or false and the various non-indicatives, imperatives, hortatives,
subjunctives, optatives, etc.
Leaving aside for the moment imperative-hortatives, which raise
certain special problems, the indicative may be considered the
unmarked category as against the marked character of the one or
more hypothetical modes. An example of syncretism is Italian in
which for the present subjunctive all the three persons of the
singular, distinguished in the indicative, have a single form, while
in the past subjunctive the first and second person singular which
are distinct in the indicative are merged. In Akkadian the sub-
junctive is marked by a suffix - added to the indicative.
Hortatives usually of the first and third person but sometimes
found in the second person distinct from the imperative are surely
a marked category as against the indicative. Sometimes, as in
Latin, a form with general subjunctive meaning may be used
hortatively, including in this instance a second person hortative.
Latin has in addition an infrequent second or third so-called future
imperative always marked by an -o or -to formative added to the
TABLE XXVII
            Indicative    Subjunctive    Optative    Conditional    Imperative
Latin       70.0          22.7                                      07.3
Sanskrit    58.5          12.4           03.7                       25.4
Russian     84.1                                     02.3           13.6
over 28 percent; 'requests' less than 7 per cent; and 'calls' less than
1 per cent.15
In all of the grammatical examples considered here, the categories
have belonged to what are conventionally called the same part of
speech. However, it may be observed in passing that parts of speech
as a whole give some evidence of hierarchical structuring along the
lines being discussed. Thus the pronoun commonly has some of
the characteristics of an unmarked category as contrasted with the
noun, e.g. greater formal irregularity and greater differentiation
of inflectional categories. The frequency interpretation here would
presumably be applied in a somewhat different fashion. Thus
although the over-all frequency of nouns is greater than that of pronouns
in all instances where data are available, since the number of
pronominal forms is always far smaller than the number of nominal
forms, individual pronouns usually have very high frequency.
Perhaps here the average frequency of individual forms of the two
classes is a fitting measure but, of course, the details remain to be
worked out.
In general the same criteria for marked and unmarked apply to
the area of lexical meaning as for grammatical categories. Instances
of zero expression may be cited from kinship terminology, e.g.
brother vs. brother-in-law, father vs. grandfather. It will be shown
later that there is much evidence that, in kinship systems generally,
consanguineal terms are unmarked in relation to affinal ones, and
terms for less distant lineal kin are unmarked in relation to those
for more distant lineal kin.
As an example of facultative expression in a lexical category we
may note the use of the unmarked 'author', incidentally with zero
expression, to refer to a writer regardless of sex, while 'authoress'
indicates only a female writer. A further illustration from kinship
is the extended use of such terms as 'mother' to include both the
consanguineal kin type female parent and the affinal type female
parent of spouse, while the marked term 'mother-in-law' designates
only the affinal kin type. This is, then, further evidence of the
unmarked character of consanguineal in contrast to affinal in
one of which the response frequency was greatest for the same part
of speech.
If we hypothesize that, for example, singular nouns will, ceteris
paribus, elicit singular nouns and plural nouns will elicit plural
nouns, we will make a set of predictions of the following form.
A stimulus of an unmarked category will have responses of the
same unmarked category almost exclusively, since both factors,
the tendency towards responses in the same category and the
preference for the unmarked member on the marked-unmarked
hierarchy, are working in the same direction. A marked stimulus
will have a marked response, but to a substantially smaller degree,
since here the two factors work against each other.
It was possible to test this general hypothesis from the Palermo-
Jenkins material in the following instances with consistently
favorable results. For nouns the stimuli comprised 64 singulars,
11 plurals, and one ambiguous form ('sheep'). The noun responses to
each noun were classified as singular or plural with the following
results.
TABLE XXX
             Singular R  Plural R  Total R
Singular S   .940        .060      41456
Plural S     .367        .633      7058
Ambiguous S  .897        .103      817
For adjectives some comparatives were included along with the
usual positives, but no superlatives. The number of comparative
responses to positive stimuli was so small (4 in 15,353) that it does
not figure in the percentage summary. Superlative responses to
comparative stimuli were exclusively with the same adjective base,
e.g. 'hottest' to 'hotter' as stimulus. There were 29 positive and 9
comparative adjective forms in the study, with these results.
TABLE XXXI
               Positive R  Comparative R  Superlative R  Total R
Positive S     1.000       .000           .000           15353
Comparative S  .294        .689           .017           6018
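The two-part prediction stated before Table XXX can be checked mechanically against the proportions in Tables XXX and XXXI. The following Python lines are only a minimal sketch: the dictionary layout and the function name `prediction_holds` are illustrative assumptions, and the test encodes one reading of the prediction, namely that the marked stimulus still favors its own category, but less strongly than the unmarked stimulus does.

```python
# Response proportions transcribed from Tables XXX and XXXI
# (outer key = stimulus category, inner key = response category).
noun = {
    "singular": {"singular": 0.940, "plural": 0.060},
    "plural": {"singular": 0.367, "plural": 0.633},
}
adjective = {
    "positive": {"positive": 1.000, "comparative": 0.000, "superlative": 0.000},
    "comparative": {"positive": 0.294, "comparative": 0.689, "superlative": 0.017},
}


def prediction_holds(table, unmarked, marked):
    """Hypothetical helper: test one reading of the prediction.

    (a) The marked stimulus draws most of its responses from its own category.
    (b) The unmarked stimulus does so to a greater degree than the marked one.
    """
    unmarked_same = table[unmarked][unmarked]
    marked_same = table[marked][marked]
    favors_own = marked_same == max(table[marked].values())
    return favors_own and unmarked_same > marked_same


noun_ok = prediction_holds(noun, "singular", "plural")
adjective_ok = prediction_holds(adjective, "positive", "comparative")
print(noun_ok, adjective_ok)  # prints: True True
```

The ambiguous stimulus 'sheep' is left out of the sketch because it belongs to neither category; its responses (.897 singular) nonetheless pattern with the unmarked singular.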
For verbs the data only included the 'general' (i.e. infinitive) form
with different sex reference are consolidated shows that the
predicted hierarchy holds for these languages without exception.8
TABLE XXXIV
     English  Spanish  French  German  Russian
G+1  7,228    11,229   1,260   9,428   +721
G-1  1,858    5,514    1,030   6,047   721
G    1,249    4,931    419     3,449   703
G+2  519      2,774    83      614     293
G-2  65       152      31      242     20
G+* 27 93 31
0 4 29
A second set of hypotheses predicts greater frequency for lineal
than corresponding collateral terms. This is also verified in the
figures of Table XXXV.
TABLE XXXV
                English  Spanish  French  German
G+1 lineal      7,228    11,229   1,260   9,428
G+1 collateral  1,504    4,717    511     1,219
G-1 lineal      1,858    5,514    1,030   6,047
G-1 collateral  148      361      140     464
G lineal        1,249    4,931    419     3,449
G collateral    316      867      151     427
G+2 lineal      519      2,774    83      614
G+2 collateral  0 0 6
G-2 lineal      65       152      31      242
G-2 collateral  0 0 6
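The claim that lineal terms outrank the corresponding collateral terms can be verified cell by cell. The sketch below is illustrative only: it uses the three generation pairs of Table XXXV for which all four languages have figures, and it omits the second ascending and descending generations because their collateral rows are incomplete in the table as given. The variable names are assumptions made for the example.

```python
LANGUAGES = ("English", "Spanish", "French", "German")

# (lineal, collateral) frequencies per generation, transcribed from Table XXXV.
# Only generations with a complete collateral row are included.
pairs = {
    "G+1": ((7228, 11229, 1260, 9428), (1504, 4717, 511, 1219)),
    "G-1": ((1858, 5514, 1030, 6047), (148, 361, 140, 464)),
    "G": ((1249, 4931, 419, 3449), (316, 867, 151, 427)),
}

# Collect every (generation, language) cell where the lineal term is NOT
# more frequent than the collateral term; the hypothesis predicts none.
exceptions = [
    (generation, language)
    for generation, (lineal, collateral) in pairs.items()
    for language, lin, col in zip(LANGUAGES, lineal, collateral)
    if lin <= col
]
print(exceptions)  # prints: []
```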
Finally, as would be expected there is overwhelmingly greater
frequency for consanguineal terms over corresponding affinal ones.
8 English, E. L. Thorndike and I. Lorge, op. cit.; Spanish, I. R. Bou, op. cit.;
German, Kaeding, op. cit.; Russian, H. H. Josselson, op. cit. Blanks indicate
items not included in the count. In Russian both 'father' and 'mother', otets
and mat', are in Josselson's group of words (Group I) whose frequency was
so great that they were not counted after a certain point. The figures are there-
fore not comparable with the rest but are necessarily greater than any of the
others in the first sources counted.
82 UNIVERSALS OF KINSHIP TERMINOLOGY
For we can have either one, two, or three terms for these three
kin types. Obviously the use of a single term or of three separate
terms each gives one type. But for systems with two terms, any one
of the three can receive a unique designation, while the other two
fall under a second term. There are therefore three additional types,
producing a total of five, not four. The missing type is the one in
which the father and mother's brother are covered by a single kin
term, while the father's brother is given a separate name. The fact
that this type is not even mentioned is sufficient evidence of its
extreme rarity or non-existence. In fact, I do not know of a single
instance of this type. Its usual absence leads to the following
implicational universal: whenever the father and mother's brother
are designated by the same term, the father's brother is likewise
designated by that term. Note that the father and mother's
brother are the two most divergent, as it were, of the three relatives
in that they differ both in the lineal/collateral dimension and in
the line of descent, paternal/maternal.
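The count of five logically possible types, and the single type excluded by the implicational universal, can be reproduced by enumerating every way of grouping the three kin types under shared terms. The following Python lines are a minimal sketch under that reading; the helper names `set_partitions`, `same_term`, and `obeys_universal` are hypothetical and introduced only for illustration.

```python
def set_partitions(items):
    """Yield every partition of `items` as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Place `first` under each existing term in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or give it a term of its own.
        yield [[first]] + partition


KIN_TYPES = ["father", "father's brother", "mother's brother"]
types = list(set_partitions(KIN_TYPES))


def same_term(partition, a, b):
    """True if kin types `a` and `b` fall under one term in `partition`."""
    return any(a in block and b in block for block in partition)


def obeys_universal(partition):
    """If father and mother's brother share a term, so must father's brother."""
    if same_term(partition, "father", "mother's brother"):
        return same_term(partition, "father", "father's brother")
    return True


excluded = [p for p in types if not obeys_universal(p)]
print(len(types))     # prints: 5
print(len(excluded))  # prints: 1
```

The one excluded classification is the type described above as missing: father and mother's brother under a single term, with father's brother named separately.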
Analogous typologies can be constructed in other cases, and
their complexity, in the sense of number of possible types, will
depend of course on the size of the basic set of relatives. The
earlier observation that all languages distinguish father from
mother was an example of the simplest possible case. Here there
are only two kin types, father and mother, and therefore only
two logically possible types, those which use two terms and those
which use one, that is, have no separate father and mother terms.
Of these two types, apparently all languages belong to the first and
none to the second.
An example of a more complex typology is one based on grandparent
terms, for here there are four kin types to be considered:
father's father, mother's father, father's mother, and mother's
mother. In this instance there are fifteen logically possible
classifications. With one term there is one possibility. With two terms
either each term covers two relatives, or one covers three and the other
a single kin type. The former occurs three ways, the latter four,
making a total of seven. For three terms the only possible division
is two, one, one; and this can occur in six ways. There is only one