On the conceptual framework for voice phenomena
Article in Linguistics · January 2006

DOI: 10.1515/LING.2006.009

On the conceptual framework
for voice phenomena*
MASAYOSHI SHIBATANI
Abstract
This article attempts to lay the conceptual foundations of voice phenomena,
ranging from the familiar active/passive contrast to the ergative/antipas-
sive opposition, as well as voice functions of split case-marking in both tran-
sitive and intransitive constructions. We advance the claim that major voice
phenomena have conceptual bases rooted in the human cognition of actions,
which have evolutionary properties pertaining to their origin, development,
and termination. The notion of transitivity is integral to the study of voice
as evident from the fact that the so-called transitivity parameters identified
by Hopper and Thompson (1980) and others are in the main concerned
with these evolutionary properties of an action, and also from the fact that
the phenomena dealt with in these studies are mostly voice phenomena. A
number of claims made in past studies of voice and in some widely-received
definitions of voice are shown to be false. In particular, voice oppositions
are typically based on conceptual — as opposed to pragmatic — meanings,
may not alter argument alignment patterns, may not change verbal valency,
and may not even trigger verbal marking. There are also voice oppositions
more basic and wide-spread than the active/passive system, upon which
popular definitions of voice are typically based.
Linguistics 44–2 (2006), 217–269 0024–3949/06/0044–0217
© Walter de Gruyter
1. Introduction
Current studies on voice phenomena suffer from a number of inadequacies
at several levels of description and explanation. At the most fundamental
level, there is no coherent conceptual framework that adequately
addresses the matter, such that we are often left to wonder whether or not
a given phenomenon falls in the domain of voice. For one thing, people
differ in the treatment of causative and reflexive constructions; some
consider them to represent voice categories, while others do not. Still others
avoid raising the issue at all. Various definitions currently offered are of
little use, as they are typically based on an Indo-European active/passive
opposition, and arbitrarily include or exclude a particular phenomenon
from the domain of voice.1
Properly identifying construction types representing a voice sub-
domain is also a serious problem. In Crystal’s (2003) definition (cf. Note
1) reflexives are not recognized as proper voice constructions and their
relationship to the middle voice is not entirely clear. A similar problem
is seen in Kemmer’s (1993) extensive study of middle voice constructions.
There are also severe limitations at the level of explanation. Closer to
the main theme of this volume is the problem of understanding the
increases and decreases in valency and accompanying changes in argument
structure observed in voice phenomena. Why do certain phenomena (e.g.
the causative and applicative) show an increase in valency, while others
(e.g. the passive and antipassive) typically have a valency-reducing effect?
What motivates these valency changes in opposite directions?
Functional explanations regarding the distribution of certain voice
constructions go a long way toward an explanatory functional study of
grammatical phenomena (cf. Haiman 1985). However, being largely based on
formal properties such as ‘‘linguistic distance’’ and ‘‘full’’ vs. ‘‘reduced
form,’’ these explanations are not functional enough to make
more general predictions.2
The problems outlined above largely stem from two related method-
ological issues. One is the lack of a coherent conceptual framework for
characterizing and analyzing voice phenomena; the other is an over-
reliance on formal properties in both analysis and explanation. Clearly
the latter problem is caused by the former and by the lack of commitment
to the cognition-to-form approach in linguistic analysis.3 The purpose of
this article is thus to lay out a conceptual framework that coherently de-
lineates the domain of voice, which embraces both those phenomena that
are traditionally recognized as falling in the voice domain and those that
have been kept in limbo. The framework required must deal with the fact
that many voice phenomena straddle the semantics-pragmatics boundary,
although the active/middle opposition is basically conceptual or semantic,
and the active/passive opposition is largely pragmatic. We endeavor to
unify these manifestations of voice function by assuming that the prag-
matic relevance of clausal units is semantically determined in the first place.
The conceptual foundations of voice can only be arrived at by inspect-
ing contrasting phenomena across languages. Our initial task is therefore
to learn how a given language, using its own resources, achieves the goal
of expressing a relevant conceptual opposition found in another language.
While the ultimate goal of functional typology is to discover the correlative
patterns between form and function, this article is concerned primarily
with the initial task of postulating conceptual bases of voice phenomena
and identifying constructions across languages that express the relevant
oppositions.
One final introductory remark is due regarding the controversy over
the question of whether the formal relationships between opposing voice
categories should be treated as inflectional or derivational. We consider
this question to be academic in the absence of rigorous definitions for
these processes. In the realm of voice phenomena, some systems, for ex-
ample, the Ancient Greek active/middle system, incorporate voice mor-
phology in their inflectional paradigm. Others like the English active/
passive opposition do not show a simple morphological relationship —
inflectional or derivational — since constructions as a whole enter into
the formal opposition. The regularity or productivity of the pattern is
often taken to be an important criterion distinguishing inflections from
derivations; the former are thought to be regular and obligatory, while
the latter allow exceptions. But regularity in natural language is always
relative, and so are the patterns of voice oppositions. Even among the
known ones, nothing is one hundred percent regular. An alternation that
is well-integrated within the inflectional paradigm may show irregularity.
In Ancient Greek, for example, we find both active forms that do not
have middle counterparts (activa tantum) and middle forms lacking the
corresponding active (media tantum). In some languages the active/passive
opposition also shows a high degree of regularity without ever being one
hundred percent regular (as in the case of English); others place much
severer limitations on the range of permissible passive constructions.
2. The evolution of an action: voice, transitivity, and aspect
The basic claim of this article is that major voice phenomena have their
conceptual bases rooted in the human cognition of actions. Because such
actions have various effects upon us, we have special interest in the way
that they arise, how they develop, and the manner in which they terminate.
These are what this article refers to as the evolutionary properties or
phases of an action. Through a system of grammatical oppositions, a
language provides a means for expressing conceptual contrasts pertaining
to the evolutionary properties of an action that the speaker finds relevant
for communicative purposes. Among the evolutionary properties, voice is
primarily concerned with the way event participants are involved in
actions, and with the communicative value, or discourse relevance, that
accrues to the event participants from the nature of this involvement.
Mention of the evolution of an action immediately brings to mind two
other grammatical concepts, namely, transitivity and aspect. It is thus
appropriate to clarify the relationships and differences between these
notions. Traditionally, voice has been defined in reference to transitivity, or
more narrowly in terms of the transitivity of a verb or clause; the active/
passive opposition most typically obtains with transitive verbs. A more
important connection between transitivity and voice, however, lies in the
notion of semantic transitivity, rather than strictly verbal or clausal tran-
sitivity. Indeed, it is easy to see this connection, as in the work of Hopper
and Thompson (1980), where many of the phenomena discussed in terms
of transitivity are nothing but voice phenomena. This important article
concludes the section on grammatical transitivity as follows: ‘‘It is tempt-
ing to find a superordinate semantic notion which will include all the
Transitivity components. If there is one, it has so far not been discovered
. . .’’ (Hopper and Thompson 1980: 279). Our claim is that what they are
looking for is a theory of voice. In fact, the work of Hopper and Thompson
lays important groundwork for the study of voice. In this regard,
Kemmer (1993: 247) is absolutely correct in noting that ‘‘the scale of
transitivity . . . forms the conceptual underpinning for voice systems in
general, and for reflexive and middle marking systems in particular.’’4
While none of these works makes it quite clear, voice is a system of cor-
respondences between action or event types and syntactic structures. For
example, what is known as the active voice is the pattern of correspon-
dence between the high transitive event type or the prototypical transitive
action and the nominative-accusative coding pattern of the event partici-
pants, as in the English active sentence She killed him (see Section 5 below).
The parameters of transitivity identified by Hopper and Thompson
(1980) pertain to ‘‘different facet[s]’’ of ‘‘carrying-over or transferring an
action from one participant to another’’ (Hopper and Thompson 1980:
253), and they in effect represent the evolutionary properties of an action,
that is, they pertain to the way an action is brought about, to the way it
is transferred to the second participant, and to the way it affects this
participant. In order to bring grammar closer to cognition, we propose
to examine specific evolutionary properties of an action pertaining to
voice oppositions that are distilled as transitivity parameters in Hopper
and Thompson (1980) and others dealing with the issues of transitivity.
If transitivity is integral to a theory of voice, how then do aspect and
voice differ under the assumption that both are concerned with the way
an action evolves? These two grammatical categories invite different kinds
of questions. Aspect asks where the vantage point is with regard to the
temporal structure of an action. When the action is viewed holistically
encompassing all of its temporal phases, we obtain the perfective viewpoint
of the described action. On the other hand, if specific sections of internal
temporal structure are focused, we obtain various types of imperfective
aspectual construal of an event. The contrast between the perfective and
the imperfective aspects and the representative subcategories of the latter
seen across languages are represented in Figure 1.
Figure 1. Aspectual categories
Voice, on the other hand, asks how an action evolves; that is, it asks
about the nature of its origin, the manner in which it develops, and the
way that it terminates. These evolutionary phases of an action and the
various voice categories pertaining to them are depicted schematically in
Figure 2.
Figure 2. Evolutionary phases of an action and the relevant voice categories
3. Major voice oppositions and their conceptual bases
Under the present conception, the three principal evolutionary phases of
an action — origin, development, and termination — form the basis for
the major voice parameters. These parameters are generally expressible
in the form of questions concerning the evolutionary properties of an
action, as below:
Major voice parameters:
I. The origin of an action
(a) How is the action brought about?
(b) Where does the action originate?
(c) What is the nature of the agent?
II. The development of an action
How does the action develop?
(a) Does the action extend beyond the agent’s personal sphere or
is it confined to it?
(b) Does the action achieve the intended effect in a distinct
patient, or does it fail to do so?
III. The termination of an action
Does the action develop further than its normal course, extend
beyond the immediate participants of the event, and terminate in an
additional entity?
Figure 2 summarizes the voice constructions pertaining to these parameters.
Throughout the following discussion, we touch upon the theoretical
consequences of this diagrammatic representation of the voice domain.
3.1. Parameters pertaining to the origin of an action
The first opposition to be examined has to do with the nature of the ori-
gin of an action — namely, whether the action in question is brought
about volitionally or nonvolitionally by a human agent.
Volitional/spontaneous opposition:
Is the action brought about volitionally?
Yes → volitional
No → spontaneous
While not widely recognized as a voice opposition, this distinction has
been recognized as such in the Japanese grammatical tradition, perhaps
because the suffix for the spontaneous voice is identical with that used in
the passive construction. In fact, it is generally believed that the Japanese
passive arose from the spontaneous construction. Languages (or
grammarians’ interpretations of the facts?) may differ with regard to the
precise meaning contrast seen in the volitional/spontaneous opposition. In
Japanese, the spontaneous construction expresses a situation where the
agent does not intend to bring about an action, but where there is a
circumstantial factor external to the agent that induces an action (such as
eating ‘‘dancing-mushrooms’’ as in [1b] below). In other languages, a
spontaneous form conveys the meaning of an action accidentally brought
about. Other manifestations of the opposition may be alternatively
expressed in terms of such notions as intentional/unintentional or
controlled/uncontrolled, but we shall take the position that these contrasts
are included in the basic function of the volitional/spontaneous
opposition. That is, by ‘‘volitional voice’’ we mean a connection between a
particular syntactic form and a type of action that is brought about by the
willful involvement of an agent who ‘‘intends the action,’’ and sees to it
that the intended effect is achieved. Departure from this action type in
any significant way may be construed as constituting a spontaneous
action, expressed by a construction formally contrasting with the
volitional construction.
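The characterization just given can be read as a small decision procedure. The Python sketch below is only an illustration of that reading, not part of the framework itself: the class `ActionOrigin`, its two fields, and the function `origin_voice` are hypothetical names introduced here, and reducing ‘‘departure in any significant way’’ to two yes/no features is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionOrigin:
    """Hypothetical feature bundle for the origin of an action."""
    agent_intends_action: bool       # the agent "intends the action"
    agent_sees_effect_through: bool  # the agent sees to it that the effect is achieved


def origin_voice(origin: ActionOrigin) -> str:
    """Classify an action type under the volitional/spontaneous parameter.

    Assumption: only the fully willful action type counts as volitional;
    any departure from it is construed as spontaneous.
    """
    if origin.agent_intends_action and origin.agent_sees_effect_through:
        return "volitional"
    return "spontaneous"


# Illustrative action types (labels are informal, not data from the article).
deliberate = ActionOrigin(agent_intends_action=True, agent_sees_effect_through=True)
accidental = ActionOrigin(agent_intends_action=False, agent_sees_effect_through=False)

for label, origin in [("deliberate", deliberate), ("accidental", accidental)]:
    print(label, "->", origin_voice(origin))
```

Under this sketch the intentional/unintentional and controlled/uncontrolled contrasts are not separate categories; they are simply different ways of failing the single volitional test.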
In Modern Japanese, the domain of the volitional/spontaneous opposi-
tion has shrunk to such an extent that mental activities are the only ones
where the contrast is readily observed, with the spontaneous morphology
(-re/-rare) having generally given way to a passive interpretation in the
domain of physical actions. In Classical Japanese (ninth–twelfth centu-
ries), the volitional/spontaneous opposition was more widely observed,
as in the following examples:
Classical Japanese
(1) a. Kikori-domo mo mai-keri. (volitional)
wood cutter-PL also dance-PAST
‘Wood cutters also danced.’
b. Kikori-domo mo mawa-re-keri. (spontaneous)
wood cutter-PL also dance-SPON-PAST
‘Wood cutters also danced willy-nilly.’
Spontaneous expressions in Japanese typically do not contain an agent in
subject position. Because information regarding the volitional status of an
agent is most readily accessible to the speaker, the volitional/spontaneous
distinction is typically made with reference to a first person agent; accord-
ingly, the missing agent is understood to be the speaker unless otherwise
specified. This non-coding of an agent in subject position paved the way
for a spontaneous expression where a patient nominal is coded in subject
position, as in the following spontaneous construction (2b). Undoubtedly,
this was an important step in the development of the passive from the
spontaneous construction.
Modern Japanese
(2) a. Boku-wa yoku mukasi-no-koto-o
I-TOP often old days-GEN-things-ACC
omo-u. (volitional)
think-PRES
‘I often think about the things of the old days.’
b. Saikin mukasi-no-koto-ga yoku
recently old days-GEN-things-NOM often
omowa-re-ru. (spontaneous)
think-SPON-PRES
‘Recently the things of the old days often come to mind.’
Since the volitional/spontaneous opposition is not widely recognized as
a voice phenomenon, it is perhaps worth spending some time showing
how widespread in the world’s languages it actually is. As in other voice
sub-domains, languages make use of different resources in expressing the
volitional/spontaneous opposition. Indonesian and Malay use the multi-
functional prefix ter- to express unintended or accidental actions:
Indonesian
(3) a. Ali memukul anak-nya. (volitional)
Ali AF.hit child-3SG.POSS
‘Ali hit his child.’
b. Ali ter-pukul oleh anak-nya. (spontaneous)
Ali SPON-hit PREP child-3SG.POSS
‘Ali accidentally hit his child.’
(I Wayan Arka pers. comm.)
According to Winstedt (1927: 86–87), the function of ter- in Malay is
characterized as denoting an action due ‘‘not to conscious activity on the
part of the subject, but to external compulsion or accident.’’ It is
noteworthy that spontaneous constructions in both Japanese and
Indonesian/Malay have an affinity with the passive in that they share the
same affix in these languages. Compare the spontaneous constructions above
with the passives in Japanese and Indonesian below:
(4) a. Taroo-wa Ziroo-ni nagura-re-ta. (Japanese passive)
Taro-TOP Jiro-by hit-PASS-PAST
‘Taro was hit by Jiro.’
b. rumah itu tidak ter-beli oleh saya. (Indonesian passive)
house that NEG PASS-buy PREP 1.SG
‘The house cannot be bought by me.’
The diagrammatic representation of voice constructions in Figure 2 can
be thought of as a semantic map, where different constructions are
distributed over relevant territory within the voice domain. This is a useful
way of representing conceptual affinities among various voice
constructions, but its utility is predicated only on a comprehensive view of voice
as advocated in this article. Spontaneous and passive are both concerned
with the origin of an action. What they share is the idea that this lies not
in the pragmatically most relevant participant; in the case of the passive,
it is the agent of low discourse relevance and in the spontaneous case, it is
the external circumstance.
The map in Figure 2 also shows the ‘‘neighboring’’ relationship be-
tween the spontaneous, the middle, and the antipassive. In Russian and
a number of Australian languages, middle forms are recruited for the
volitional/spontaneous contrast, as in the following examples:
Russian
(5) a. Kostja poreza-l xleb.
Kostja cut.PERF-PAST.SG.MASC bread
‘Kostja cut the bread.’
b. Kostja poreza-l-sja.
Kostja cut.PERF-PAST.SG.MASC-SPON
‘Kostja has [accidentally] cut himself.’
(Vera Podlesskaya pers. comm.)
Diyari
(6) a. ŋatu yinana danka-na wawa-yi.
1SG.ERG 2SG.O find-PARTC AUX-PRES
‘I found you (after deliberately searching).’
b. ŋani danka-tadi-na wara-yi yiŋkaŋgu.
1SG.ABS find-SPON-PARTC AUX-PRES 2SG.LOC
‘I found you (accidentally).’
(Austin 1981: 154)
Another favorite source for the spontaneous construction, especially
prominent among Indo-Aryan and Dravidian languages of India, is
the so-called dative-subject construction, which typically expresses
uncontrollable states:
Sinhala
(7) a. mam ee wacne kiwwa.
I.NOM that word say.PAST
‘I said that word.’
b. maţ ee wacne kiywuna.
I.DAT that word say.P.PAST
‘I blurted that word out.’
(Gair 1990: 17)
The adaptation of the dative-subject construction for a spontaneous
action is also seen when the ‘‘dative-subject’’ is marked by cases different
from the dative as in the following Bengali examples, where the nominal
form corresponding to the dative subject is marked with genitive. Here
the volitional/spontaneous contrast takes on interesting nuances:
Bengali
(8) a. Ami toma-ke khub pɔchondo kor-i.
1SG.NOM 2ORDSG-OBJ very liking do-PRES.1
‘I like you very much.’ (According to my own criteria.)
b. Ama-r toma-ke khub pɔchondo hɔy.
1SG-GEN 2ORDSG-OBJ very liking become-PRES.3ORD
‘I like you very much.’ (According to some [socially] set
criteria.)
(Onishi 2001: 120)
When the basic meaning of the verb denotes a spontaneous (involun-
tary) action, the volitional voice form can be obtained by using a self-
benefactive construction, as in Marathi and other Indo-Aryan languages:
Marathi
(9) a. sitaa raD-l-i.
Sita.NOM cry-PERF-F
‘Sita cried.’
b. sitaa-ne raD-un ghet-l-a.
Sita-ERG cry-CONJ take-PERF-N
‘Sita cried (so as to relieve herself ).’
(Prashant Pardeshi pers. comm.)
Lhasa Tibetan has a set of auxiliaries expressing different categories of
perspective. ‘‘Perspective-choice’’ interacts with both person and
evidential categories in a complex way, but the relevant auxiliaries can be
divided into a ‘‘self-centered’’ and an ‘‘other-centered’’ group (Denwood
1999). Verbs denoting such intentional actions as reading and dancing
normally occur with self-centered auxiliaries when used with first person
subjects. They can be made nonintentional or spontaneous with the use of
other-centered auxiliaries, as in the following examples:
Tibetan
(10) a. ngas. yi.ge. klog.ba yin.
I-SMP letter read-LINK-AUX (self-centered)
‘I read the letter (on purpose).’
b. ngas. yi.ge. klog.song.
I-SMP letter read-AUX (other-centered)
‘I read the letter (without meaning to).’
(Denwood 1999: 137)
Conversely, although unintentional verbs expressing involuntary actions
such as coughing and seeing normally occur with other-centered auxilia-
ries, they can be rendered volitional by the use of self-centered auxiliaries:
Tibetan
(11) a. glo. rgyab.byung.
cough-AUX (other-centered)
‘I coughed (involuntarily).’
b. glo. rgyab.pa.yin.
cough-LINK-AUX (self-centered)
‘I coughed (deliberately).’
(Denwood 1999: 139)
A similar pattern is observed in Newar (Tibeto-Burman), where the rele-
vant contrast is expressed in terms of a distinction between conjunct and
disjunct verbal endings — apparently an evidentiality-related phenome-
non. Note that only clauses with first person subjects allow this contrast
to be expressed.
Newar
(12) a. ji-n kayo tachyâ-nâ.
1SG-ERG cup break-PC
‘I broke the cup (deliberately).’
b. ji-n kayo tachyâ-ta
1SG-ERG cup break-PD
‘I broke the cup (accidentally).’
(Kansakar 1999: 428)
Finally, the phenomenon now widely recognized under the name of ‘‘split
intransitivity’’ is rooted in the volitional/spontaneous opposition.
Observe first some well-known examples from Eastern Pomo below:
Eastern Pomo
(13) a. ha: c’e:xelka. (volitional)
1SG.A slip
‘I am sliding.’
b. wı́ c’e:xelka. (spontaneous)
1SG.P slip
‘I am slipping.’
(McLendon 1978: 1–3)
Although the verb forms are the same, when the pronominal form is in-
flected for the patient (13b), the sentence conveys a spontaneous action or
a ‘‘lack of protagonist control’’ (McLendon 1978: 4). A similar contrast
is seen in the Caucasian language Tsova-Tush (Batsbi), where ‘‘[the] ref-
erent of [an ergative] subject is a voluntary, conscious, controlling partic-
ipant in the situation named by the verb’’ (Holisky 1987: 113).
Tsova-Tush (Batsbi)
(14) a. (as) vuiž-n-as.
1SG.ERG fall-AOR-1SG.ERG
‘I fell down, on purpose.’
b. (so) vož-en-sO.
1SG.NOM fall-AOR-1SG.NOM
‘I fell down, by accident.’
(Holisky 1987: 104)
In addition to these cases of ‘‘fluid-S’’ marking (Dixon 1994), split
intransitivity may be realized as a lexically-conditioned phenomenon, where
intransitive verbs are classified into an ‘‘agentive’’ class and a
‘‘patientive’’ class. Agentive and patientive nominals respectively trigger
marking similar to the corresponding arguments of a transitive clause. The
Philippine language Cebuano shows this pattern through a focus system which
is characteristic of Formosan and Western Austronesian languages:
Cebuano
(15) Transitive actor-focus construction
Ni-basa ako ug libro.
AF-read I.TOP INDEF book
‘I read a book.’
(16) Transitive patient-focus construction
Gi-basa nako ang libro.
PF-read I TOP book
‘I read the book.’
(17) a. Agentive intransitive
Ni-dagan ako. (actor-focus form)
AF-run I.TOP
‘I ran.’
b. Patientive intransitive
Gi-kapoy ako. (patient-focus form)
PF-tired I.TOP
‘I got tired/I am tired.’
Generalizing processes have the effect of obliterating the basic semantic
motivation for distinguishing two classes of intransitive verbs; either the
larger agentive or larger patientive class of intransitive verbs tends to
have semantically heterogeneous verbs. Nevertheless, the split of
intransitive verbs into two classes is rooted in the distinction between volitional
and involuntary actions involving an animate protagonist. This is seen in
a minority class of verbs, such that a minority agentive class contains
verbs denoting controlled actions, and a minority patientive class includes
verbs denoting involuntary states of affairs (see Merlan 1985). In Cebuano
(and perhaps other Philippine languages as well) the larger agentive class
includes verbs denoting uncontrolled events such as raining or slipping
off, while the minority patientive class contains verbs that express strictly
involuntary states of affairs such as being hungry, becoming tired, or
contracting diseases.
The patterns of split intransitivity discussed here underscore an important
point that we wish to advance in this article: voice can also be
expressed by nominal forms. Traditionally, voice has been regarded as a
verbal category. Indeed, many linguists take verbal marking or verbal
inflection as the defining feature of voice.5 We reject this restrictive view.
As we define it, voice is concerned with the evolutionary properties of
an action. It is typically marked on the verb because a verb expresses an
action. Verbal voice marking is therefore simply a case of iconicity. An
action, however, also involves participants such as agent and patient. Be-
cause an action occurs in relation to these protagonist participants, any
form representing them could also bear voice marking. The volitional/
spontaneous opposition manifested in nominal forms also reflects the un-
derlying relationship between the origin of an action and the volitional
status of the agent.6 Nominal marking for certain voice contrasts is thus
also motivated by the iconicity principle.
Let us now turn to the causative/noncausative opposition. As noted in
the introduction, the causative has been problematic with respect to its
status as a voice category. Widely-received definitions of voice, such as
Crystal’s in Note 1, maintain that voice oppositions do not entail a
semantic contrast, which has prevented many grammarians from readily
accepting causative/noncausative as one. As the above discussion on the
volitional/spontaneous opposition shows, however, there is no reason to
believe that voice is a semantically neutral phenomenon. As it happens,
one of the oldest systems of voice contrast in Indo-European, the
active/middle opposition, also involves a meaning contrast (see
below).7 The question concerning the causative/noncausative opposition
(and other semantic oppositions) is whether the relevant contrasts can be
naturally integrated into a coherent conceptual framework of voice. Our
answer will be yes.
The causative/noncausative opposition pertains to the origin of an
action; that is, whether the action originates with the agent of the main
action or with another agent heading the action chain. The causative action
chain is represented in Figure 3.8
Figure 3. Causative action chain
In a noncausative situation, the initial agent (Agent2) is also the agent
of the main action. In a causative situation, the ultimate origin of the
main action lies in the agent (Agent1) heading the action chain, which is
different from the agent (Agent2) of the main action. The relevant
parameter for the causative/noncausative distinction can be formulated as
below:
Causative/noncausative opposition:
Does the action originate with an agent heading the action chain that is
distinct from the agent or patient of the main action?
Yes → causative
No → noncausative
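The parameter above can be sketched as a check on who heads the action chain. The following Python fragment is a minimal illustration under stated assumptions: the class `Situation`, its field names, and the function `causal_voice` are hypothetical, and representing the main action of a lexical causative such as kill as "die" is one possible reading adopted here for the example, not an analysis asserted by the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Situation:
    """Hypothetical encoding of an action chain and its main action."""
    main_action: str
    chain_head: str                     # agent heading the action chain
    main_agent: Optional[str] = None    # agent of the main action, if any
    main_patient: Optional[str] = None  # patient of the main action, if any


def causal_voice(s: Situation) -> str:
    """Causative if the chain head is distinct from the main action's agent or patient."""
    main_participants = {p for p in (s.main_agent, s.main_patient) if p is not None}
    if s.chain_head in main_participants:
        return "noncausative"
    return "causative"


examples = {
    "Bill walked": Situation("walk", chain_head="Bill", main_agent="Bill"),
    "John made Bill walk": Situation("walk", chain_head="John", main_agent="Bill"),
    # Assumption: the main action of 'kill' is treated as 'die' with Bill as patient.
    "John killed Bill": Situation("die", chain_head="John", main_patient="Bill"),
}

for sentence, situation in examples.items():
    print(sentence, "->", causal_voice(situation))
```

The sketch deliberately ignores form: periphrastic, lexical, and morphological causatives would all be encoded the same way, since the parameter is stated over the origin of the action rather than over its expression.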
The contrast between a noncausative situation represented by an expres-
sion such as Bill walked and its causative counterpart expressed by a peri-
phrastic causative form like John made Bill walk can thus be naturally
captured in terms of the nature of the origin of an action. Situations ex-
pressed by lexical causatives such as John killed Bill have an (initial) agent
distinct from the patient of the main action.
One of the important points of past studies of causative constructions
has to do with the fact that a voice category can be expressed by a con-
struction as a whole, rather than by local morphological entities such as
verb inflection or nominal case marking. Lexical and periphrastic
causative constructions such as John killed Bill and John made Bill walk are
a case in point. They differ in form from morphological causatives such
as Quechua wañu-či (die-CAUSE) ‘kill’ and Japanese aruka-se (walk-
CAUSE) ‘make walk’, where the causative meaning is expressed mor-
phologically. Traditionally, grammarians have tended to consider only
morphological causatives as proper cases. However, such a position leads
to the uncomfortable decision of treating the Quechua and Japanese
forms cited above as causative, while treating the semantically parallel
English expressions kill and make walk as noncausative. The form-based
treatment of causatives is tantamount to simply circumscribing morpho-
logical causatives, and does not lead to a comprehensive study of caus-
ative phenomena. Causation is a semantic, not a morphological notion,
and as such the whole range of expression types must be taken into ac-
count in a satisfactory analysis. Indeed, a (functional) typological study
is predicated on the view that a variety of expression types will obtain in
any given conceptual domain. The formal tripartite pattern of lexical,
morphological, and periphrastic causative constructions has now been
widely accepted, and some revealing correlations between form and func-
tion have been identified in the causative domain (see Shibatani and Par-
deshi 2002 on recent developments). We see below that a similar pattern
holds in other voice domains as well.
Having discussed two voice phenomena pertaining to the origin of an
action, we now turn to the next major voice parameter concerning its de-
velopment. We will consider the other voices associated with the nature
of the origin of an action — the passive and the inverse — after dealing
with other conceptually-based voice phenomena.
3.2. Parameters pertaining to the development of an action
In this section we recognize at least two sets of contrastive patterns in the
developmental phase of an action. One is concerned with whether the
action develops beyond the personal sphere of the agent or is instead
confined within it. The latter mode of development forms the conceptual
basis of what is known as the middle voice. The other contrastive pattern of
action development is concerned with whether or not the action has been
successfully transferred to the patient and has achieved its intended effect.
This contrast forms the conceptual basis for the ergative/antipassive
opposition.
The active/middle voice opposition is best known from studies of clas-
sical Indo-European languages such as Ancient Greek and Sanskrit, and
calls for a broad understanding of the notion of action confinement in the
agent’s personal sphere. The clearest case in which the development of an
action is confined to the agent’s sphere is when simple intransitive activ-
ities, such as sitting and walking, are lexicalized as intransitive verbs.
Here the development of the action is clearly confined within the domain of
the agent, as shown in the schematic representation in Figure 5a. These sit-
uations contrast with active (causative) situations (e.g. John sat his son in
the chair and John made his son walk) where the relevant actions involve
an agent that instigates an action which develops outside the (initial)
agent’s domain (see Figure 4). In the words of Benveniste (1971 [1950]:
148): ‘‘In the active, the verbs denote a process that is accomplished out-
side the subject. In the middle, which is the diathesis to be defined by the
opposition, the verb indicates a process centering in the subject, the sub-
ject being inside the process.’’
Reflexive situations also constitute one of the middle action types, since
here the action is also confined within the agent’s personal sphere. The
active expression John hit Bill contrasts with the reflexive expression
John hit himself, where the confinement of the hitting action within one’s
personal sphere (e.g. hitting one’s head or body) is marked by a corefer-
ential reflexive pronoun (see Figure 5b).9
Other middle situations of body-care action (bathing, combing one’s
hair, washing one’s hands, and dressing oneself) are straightforward,
since the agent’s action deals with its own body or body part. Because
an action confined to the agent’s sphere typically affects the agent itself,
this aspect of the middle, an effect accruing to the agent itself, plays
an important role in framing certain actions of the middle. Greek middle
expressions such as paraschésthai ti ‘to give something from one’s own
means’ and paratíthesthai sīton ‘to have food served up’ are a case in
point. Here the actions actually extend beyond the agent’s sphere, but
their effects accrue to the agent in the manner of a typical middle de-
picted in Figures 5b and 5c. In other cases, the notion of the agent’s per-
sonal sphere is more strictly adhered to, as in the following examples:
Sanskrit
(18) a. devadatto yajnadattasya bharyam
Devadatta.NOM Yajnadatta.GEN wife
upayacchati. (active)
have.relations.3SG.ACT
‘Devadatta has relations with Yajnadatta’s wife.’
b. devadatto bharyam upayacchate. (middle)
Devadatta.NOM wife have.relations.3SG.MID
‘Devadatta has relations with his (own) wife.’
(Klaiman 1988: 34)
Figure 4. Active/ergative situation type
Figure 5. Representative middle situation types
Sanxiang Dulong/Rawang
(19) a. aÐ 53 a 31 dffil 31 a 31 be 55 . (active)
3SG mosquito hit
‘S/he is hitting the mosquito.’
b. aÐ 53 a 31 dffil 31 a 31 be 55 -ffl 31 . (middle)
3SG mosquito hit-MID
‘S/he is hitting the mosquito (on her/his body).’
(LaPolla 1996: 1945)
Table 1. Balinese middle forms (Shibatani and Artawa 2003)
Periphrastic                    Morphological        Lexical
nyagur awak- ‘hit oneself’      xxx                  xxx
nyukur awak- ‘shave oneself’    ma-cukur ‘shave’     xxx
xxx                             ma-jujuk ‘stand’     xxx
xxx                             ma-jalan ‘walk’      xxx
(causative + awak-)             xxx                  negak ‘sit’
(causative + awak-)             xxx                  nyongkok ‘squat’
The active/middle opposition is shown diagrammatically in Figures 4 and 5
above, where the dotted circles and arrows represent the agent’s personal
sphere and actions respectively.
The conceptual basis of the active/middle opposition can then be
formulated in terms of the manner of the development of an action, as
follows:
Active/middle opposition:
Active: The action extends beyond the agent’s personal sphere and
achieves its effect on a distinct patient.
Middle: The development of an action is confined within the agent’s per-
sonal sphere so that the action’s effect accrues to the agent itself.
Defining the middle voice domain in terms of confinement of an action
within the sphere of the agent affords a unified treatment of various types
of middle construction. Just as in the case of causatives, middle construc-
tions come in three types (lexical, morphological, and periphrastic),
both within individual languages and across different ones. Balinese, for
example, exhibits all three types of middle construction, allowing some
situation types to be expressed either morphologically or periphrastically,
as shown in Table 1.
Our approach to middle voice phenomena is more consistent than
Kemmer’s (1993), which distinguishes reflexive situations from other
middle event types, although these two categories are assumed to form a
continuum, as shown in the diagrammatic representation in Figure 6. In
our approach, Kemmer’s reflexive, middle, and single participant situa-
tion types all fall in the middle voice domain, as defined above. Kemmer’s
distinctions among these types appear to be partly based on the typical
forms expressing them. Reflexive situations tend to be expressed peri-
phrastically, as in the case of Balinese nyagur awak ‘hit oneself ’. Kemmer’s
middle situation types are typically expressed morphologically, as in
ma-cukur ‘shave’ in Balinese, and single-participant events are typically
expressed by forms without any middle markers, as in the Balinese lexical
middle negak ‘sit’.
Figure 6. Kemmer’s classification of event types (Kemmer 1993: 73)
Kemmer arrives at her classification of event types as a result of her
decision to ‘‘[deal] with . . . middle-marking languages, or languages with
overt morphological indications of the middle category’’ (Kemmer 1993:
10; bold face original, underline added). As pointed out in the discussion
of causatives above, a strict form-based approach to the middle voice
tends to focus on morphological middles, which is similar to the narrow
treatment of morphological causatives, ignoring other possible form
types. Such an approach would consider the Tarascan (Mexico) form
ata-kurhi ‘hit oneself’ and the Quechua form maqa-ku ‘hit oneself’ as mid-
dles, while treating the English and Balinese equivalents hit oneself and
nyagur awak as distinct reflexives. Perhaps Kemmer would consider one-
self and awak here as ‘‘overt morphological indications of the middle cat-
egory.’’ But then, why is she distinguishing reflexive situations from the
middle situations in her diagram reproduced in Figure 6? Also, what of
the German form aufstehen ‘stand up’, which shows no middle marking?
Is it not a middle because it lacks any morphological marking? It is
semantically equivalent to the Balinese middle form ma-jujuk ‘stand up’.
A more systematic typological investigation of the form-function cor-
relation can be achieved if variation in form is taken as a function of
the ‘‘naturalness’’ of the middle action. Natural middle actions — for ex-
ample, sitting and walking — tend to be lexicalized as intransitive verbs,
while actions typically directed to others — for example, hitting and kick-
ing — tend to be expressed by periphrastic constructions involving a re-
flexive form when they are confined within the agent’s personal sphere.
What Kemmer (1993) has identified as middles — morphological middles
— center on those actions that people typically apply to themselves, but
that are applied to others often enough.10 One must, however, realize that
there are both intra- and crosslinguistic variations — such that in Bali-
nese ma-jujuk ‘stand up’ has a morphological middle prefix, but negak
‘sit (down)’ is simply lexical. The same marking pattern is reversed in
German, where sich hinsetzen ‘sit down’ has a middle marker, but aufste-
hen ‘stand up’ does not. These irregularities require individual accounts,
based on historical, cognitive, and even cultural data.
The middle voice system has several important implications for our
general understanding of the nature of voice phenomena. Recall that
most of the widely received definitions of voice (such as the one quoted
from Crystal [2003] in Note 1) hold that voice opposition does not entail
a meaning contrast. This is not the case for the active/middle opposition,
as shown by the examples above as well as by the contrast between the
English active form John hit Bill and the middle form John hit himself.
Secondly, these examples show that voice alternations do not necessar-
ily alter argument alignment patterns. There is no change in grammatical
relation in the contrastive pairs in (18) and (19). If the situations depicted
there give the impression of unusual utterances, consider the mundane
situations described by the following Greek examples, where a meaning
contrast is expressed without a realignment of arguments:
Ancient Greek
(20) a. loúô khitôna. (active)
wash.1SG.ACT shirt.ACC
‘I wash a shirt.’
b. loúomai khitôna. (middle)
wash.1SG.MID shirt.ACC
‘I wash my shirt/I wash a shirt for myself.’
While morphological middle constructions in some languages are
strictly intransitive (as in the case of the Balinese ma-), and middles de-
rived via the decausative function (as in the Greek forms poreûsai ‘to
cause to go, to convey’: poreusasthai ‘to go’ and kaíein ‘to light, kindle’:
kaíesthai ‘to be lighted, to burn’) are intransitive, intransitivity is not a
defining property of middle constructions. A large number of languages
allow middle constructions that are syntactically transitive, as shown in
the examples above and (21b) below, where the direct object is clearly
marked by the accusative case suffix -n.
Amharic
(21) a. lmma t-lač’ č.
Lemma MID-shave.PERF.3M
‘Lemma shaved himself.’
b. lmma ras-u-n t-lač’ č.
Lemma head-POSS.3M-ACC MID-shave.PERF.3M
‘Lemma shaved his head.’
(Amberber 2000: 325, 326)
The general tendency for morphological middles to be intransitive is
best viewed as the result of historical processes responding to the pres-
sure on the form to conform to the semantic intransitivity, which char-
acterizes middle events. This is exactly what has happened to many of
the middle forms expressing reflexive middle situations in European
languages, where the relevant affixes evolved from reflexive pronouns
in the parent languages. The course of this development can be
illustrated by using synchronic data below, where the Swedish exam-
ple shows an intermediate clitic stage, the Russian form sebja exem-
plifies the earliest transitive pattern, and -s’ (or -sja) the advanced fused
pattern.
(22) a. Ivan ubi-l sebj-a. (Russian)
Ivan kill.PERF-PAST.SG.MASC self-ACC
‘Ivan killed himself.’
b. Hon kamma-de sig. (Swedish)
she comb-PAST MID
‘She combed.’
c. Ona prichesa-l-a-s’. (Russian)
she comb-PAST-FEM-MID
‘She combed.’
Finally, in recognizing intransitive and transitive verbs as lexicalized
middle and active voice forms, we elevate the active/middle contrast to
the status of a central voice opposition observed in all human lan-
guages (cf. Dixon’s [1979: 68–69] observation that ‘‘all languages ap-
pear to distinguish activities that necessarily involve two participants
from those that necessarily involve one . . . Then all languages have
classes of transitive and intransitive verbs, to describe these two classes
of activity’’).11
Let us now turn to the antipassive voice. As the name suggests, the syn-
tactic properties of antipassive constructions mirror somewhat those of
passives, but the semantic aspect is different in these two voices. In the
case of the passive, there is no implication that an agent is not somehow
fully involved in the action. Indeed, full involvement of an agent is a cru-
cial feature distinguishing the passive (e.g. John was killed while he was
asleep) from the spontaneous middle (e.g. John died while he was asleep).
Antipassive situations contrast in meaning with those expressed in the
active and the ergative voice regarding the attainment of the intended
effect upon a patient, however.
The intended effect of an action on a patient differs depending on the
verb type. With contact verbs, the antipassive presents a situation as fail-
ing to make contact, as in the following examples:
Chukchee
(23) a. ltg=e keyŋ=n penr-nen.
father=GER bear=ABS attack=3SG:3SG/AOR
‘The father attacked the bear.’
b. ltg=n penr=tko=ge
father=GER attack=APASS=3SG.AOR
keyŋ=et. (antipassive)
bear=DAT
‘The father rushed at the bear.’
(Kozinsky et al. 1988: 652)
Warlpiri
(24) a. nyuntulu-rlu u-npa-ju pantu-rnu ngaju.
you-ERG u-2SG.A-1SG.P spear-PAST I.ABS
‘You speared me.’
b. nyuntulu-rlu u-npa-ju-rla pantu-rnu
you-ERG u-2SG.A-1SG-DAT spear-PAST
ngaju-ku. (antipassive)
I-DAT
‘You speared at me; you tried to spear me.’
(Dixon 1980: 449)
According to Dixon (1980: 449), (24b) above ‘‘indicates that the action
denoted by the verb is not fully carried out, in the sense that it does not
have the intended effect on the entity denoted by the object [read ‘‘pa-
tient’’, MS].’’ Similarly, visual contact is not made when situations in-
volving visual perception are presented in the antipassive voice:
Warrungu
(25) a. nyula nyaka+n wurripa+Ø.
3SG.NOM see+P/P bee+ABS
‘He saw bees.’
b. ngaya nyaka+kali+Ø wurripa+wu katyarra+wu.
1SG.NOM see+APASS+P/P bee+DAT possum+DAT
‘I was looking for bees and possums.’
(Tsunoda 1988: 606)
Moreover, for action types affecting a patient, the antipassive voice
presents a situation as not affecting the patient in totality, as in the fol-
lowing examples:
Samoan
(26) a. Sā ‘ai e le teine le i‘a.
PAST eat ERG ART girl ART fish
‘The girl ate the fish.’
b. Sā ‘ai le teine i le i‘a.
PAST eat ART girl LOC ART fish
‘The girl ate some (of the) fish.’
(Mosel and Hovdhaugen 1992: 108)
The voice parameter focusing on the ergative/antipassive contrast can be
formulated as below:
Ergative/antipassive opposition:
Does the action develop to its full extent and achieve its intended effect
on a patient?
Yes → ergative(/active)
No → antipassive
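The two developmental parameters formulated so far can be read as a small decision procedure. The Python sketch below is purely illustrative: the feature names leaves_agent_sphere and attains_intended_effect and the function development_voice are hypothetical labels introduced only for this illustration, not part of the analysis itself. The sketch also deliberately ignores cases such as the Greek middles cited earlier, in which the action leaves the agent's sphere while its effect still accrues to the agent.

```python
from dataclasses import dataclass
from enum import Enum


class Voice(Enum):
    ACTIVE = "active/ergative"
    MIDDLE = "middle"
    ANTIPASSIVE = "antipassive"


@dataclass(frozen=True)
class Development:
    # Hypothetical feature names, introduced for this sketch only.
    leaves_agent_sphere: bool      # does the action extend beyond the agent's personal sphere?
    attains_intended_effect: bool  # does it develop fully and affect a distinct patient?


def development_voice(d: Development) -> Voice:
    """Map the developmental profile of an action to a voice category."""
    if not d.leaves_agent_sphere:
        # Confinement within the agent's sphere defines the middle domain.
        return Voice.MIDDLE
    if not d.attains_intended_effect:
        # The action is directed at a patient but falls short of its intended effect.
        return Voice.ANTIPASSIVE
    return Voice.ACTIVE


# Profiles assumed for three situations discussed in the text.
examples = {
    "John hit Bill": Development(True, True),
    "John hit himself": Development(False, False),
    "You speared at me (24b)": Development(True, False),
}
for sentence, profile in examples.items():
    print(sentence, "->", development_voice(profile).value)
```

Checking sphere confinement before effect attainment is a choice made for the sketch; it reflects the observation that both the middle and the antipassive involve an action that does not (totally) affect a distinct patient.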
Notice that in (24b) an antipassive event is conveyed solely by the case
marking on the patient, underscoring our earlier point that voice may be
manifested in a nominal element denoting the relevant participant. In the
case of the antipassive, the status of the patient is at issue, and antipassiv-
ization iconically affects the form of the patient nominal, either case
marking it differently from the active/ergative (a case of the so-called dif-
ferential object marking [Moravcsik 1978]), or avoiding coding it (exam-
ples below).
As conceived here, both the middle and the antipassive relate to the
nature of the development of an action. Specifically, both have the onto-
logical feature of an action not (totally) affecting a distinct patient. The
conceptual affinity between the two explains the middle/antipassive poly-
semy seen in a fair number of languages. Observe:
Yidiny
(27) a. wagu:a bambi-inu.
man.ABS cover-MID
‘The man covered himself.’
b. wagu:a wawa-:iu gudaganda.
man.ABS saw-APASS dog.DAT
‘The man saw the dog.’
(Dixon 1977: 277, 280)
Balinese
(28) a. Ia sedek ma-sugi.
3SG ASP MID-wash.face
‘She is washing her face.’
b. Tiang ma-daar.
1SG APASS-eat
‘I ate.’
(Shibatani and Artawa 2002)
Russian
(29) a. Ivan mojetsja mylom.
Ivan wash.MID soap.INSTR
‘Ivan washed himself with soap.’
b. Babuška rugajetsja.
granny.NOM scold.APASS
‘Granny is scolding.’
(Geniušienė 1987: 9)
In addition, languages may show the well-known connection between the
middle and the passive12 through the use of the same form as the antipas-
sive, thus illustrating a three-way middle-passive-antipassive polysemy:
Russian (cf. the examples immediately above)
(30) Dom stroitsja turezk-oj firm-oj
house.NOM is.being.built.PASS Turkish-INST firm-INST
INKA.
INKA
‘The house is being built by the Turkish company INKA.’
Kuku Yalanji
(31) a. karrkay julurri-ji-y. (middle)
child.ABS wash-MID-NONPAST
‘The child is washing itself.’
b. warru (yaburr-ndu) bayka-ji-ny. (passive)
young man.ABS shark:LOC:pt bite-PASS-PAST
‘The young man was bitten (by a shark).’
c. nyulu dingkar minya-nga nuka-ji-ny. (antipassive)
3SG.NOM man.ABS meat-LOC eat-APASS-PAST
‘The man had a good feast of meat (he wasted nothing).’
(Patz 1982: 244, 248, 255)
3.3. The termination of action parameter
In a regular transitive event, an action terminates in a patient. However,
the action may extend beyond the patient and affect an additional entity,
which then functions as a new terminal point. Benefactives/malefactives
and applicatives express this kind of situation. The relevant parameter
can be formulated in the following form:
Benefactive/malefactive/applicative parameter:
Does the action develop further than its normal course, such that an
entity other than the direct event-participants becomes a new terminal
point registering an effect of the action?
No → active/middle
Yes → benefactive/malefactive/applicative
While the notion of benefit-giving is a broad one, there is one particular
type with a perceptible change in the beneficiary. This is the case involv-
ing transfer of an object, where the object itself is directly affected by the
act of giving. In a typical giving situation, the object is physically moved
from one owner to a new one. The recipient beneficiary is secondarily af-
fected because it comes into possession of the transferred object. Lan-
guages often have a special benefactive construction that portrays this
type of situation, where the effect on the beneficiary is indicated by its
argument status in syntactic coding. As shown in Shibatani (1996), bene-
factive constructions are typically based on the syntactic schema of the
give-construction, even involving the verb form for giving in some lan-
guages, as in the case of Japanese seen below:
(32) a. Taroo-wa Hanako-ni hon-o yat-ta.
Taro-TOP Hanako-DAT book-ACC give-PAST
‘Taro gave Hanako a book.’
b. Taroo-wa Hanako-ni hon-o kat-te
Taro-TOP Hanako-DAT book-ACC buy-CONJ
yat-ta.
BEN-PAST
‘Taro bought Hanako a book.’
In (32b) the buying action is extended beyond the patient (the book),
and affects the beneficiary nominal (Hanako) coded in the dative form.
Compare this construction to the one below, expressing a more general
benefit-giving in which the beneficiary takes on a nonargument form.
(33) Taroo-wa Hanako-no tame-ni hon-o
Taro-TOP Hanako-of sake-for book-ACC
kat-te yat-ta.
buy-CONJ GIVE-PAST
‘Taro bought a book for (the sake of ) Hanako.’
While (33) may express any type of benefit-giving — including one of
buying a book to help Hanako’s book-selling business — (32b) specifi-
cally conveys the meaning that the transfer of the book was intended.
Note also the English translations accompanying these examples, which
show the same contrast.
Benefactive/malefactive events are also realized by so-called external
possession constructions in Indo-European and some other languages
(cf. Payne and Barshi 1999), although the context may determine whether
or not a clear benefactive/malefactive reading obtains from them. When
a body part is involved as the primary patient (cf. below), the benefac-
tive/malefactive reading is not strongly pronounced beyond that which
is conveyed by the verb; cf. (34) and (35a):
German
(34) Ich wasche mir die Hände.
I wash I.DAT the hands
‘I wash my hands.’ (lit. ‘I wash me the hands.’)
(35) a. Man hat ihm den Arm gebrochen.
lit. ‘They broke him the arm.’
b. Man hat seinen Arm gebrochen.
‘They broke his arm.’
Where inalienable possession is implicated as above, the dative nomi-
nal indicates that the action has affected it as a new terminal point of the
action. In German, the external possession construction is generally
obligatory when the affected body part is inalienably possessed; the ex-
tension of the action to its owner is inevitable under such circumstances.
Indeed, an internal possession construction like (35b) suggests that the
arm in question was detached, and no effect on its owner is asserted by
such a sentence. Internal possession constructions involving inalienably
possessed body parts, as in the English form I broke his arm, suggest that
the arm’s owner was affected, but the implication is obtained through a
commonsensical world view. The dative construction (35a), on the other
hand, asserts that the body part owner is affected by the action.
The benefactive/malefactive reading can be seen more readily in the
following examples, where the dative nominal represents a mentally af-
fected party:
French
(36) a. Jean lui a cassé sa vaisselle.
lit. ‘Jean broke her her dishes.’
b. Jean a cassé sa vaisselle.
‘Jean broke her dishes.’
Modern Hebrew
(37) a. ha tinok lixlex li et ha xulca.
the baby dirtied I.DAT ACC the shirt
‘The baby dirtied the shirt on me.’
b. ha tinok lixlex et ha-xulca shel-i.
the baby dirtied ACC the-shirt of-me
‘The baby dirtied my shirt.’
(Berman 1982; T. Givón pers. comm.)
Where alienable possession is evident, as in these examples, a male-
factive meaning obtains more readily. The trade-off between inalienabil-
ity and affective reading shows that a principle of relevance is at work in
these constructions: the relevance of the dative arguments to the event
must be somehow ‘‘guaranteed.’’ Involvement of an inalienably possessed
object guarantees the relevance of the possessor to the event, since what-
ever happens to the body part will affect its possessor automatically.
When an inalienable possession relation does not obtain, as in (36a)
and (37a), a benefactive/malefactive effect upon the dative argument
is pronounced as a way of establishing its relevance to the event. The
attendant interpretation that a possessive relation exists contributes to
the establishment of the affective relationship; the owner of an object is
more easily affected by what happens to its possession.
Contrary to what the label suggests then, so-called external possession
constructions do not assert a possessive relation between the dative argu-
ment and the directly affected patient. Indeed, the relevant constructions
arise independently from externalization of the possessor, as in the Ger-
man example below (also in [36a] above), or when the notion of posses-
sion is irrelevant, as in the following examples (40)–(41) from River Wari-
hío (Uto-Aztecan):13
German
(38) Peter repariert mir mein Fahrrad.
‘Peter fixes me my bicycle.’
River Warihío
(39) a. hustína pasu-ré muní kukučí ičió.
Agustina cook-PERF beans children BEN
‘Agustina cooked beans for the children.’
b. hustína pasú-ke-re muní kukučí.
Agustina cook-BEN-PERF beans children
‘Agustina cooked beans for the children.’
(40) maniwíri no’ó wikahtá-ke-ru yomá aarí.
Manuel 1SG.NS sing-BEN-PERF all afternoon
‘Manuel sang all afternoon for me.’
(41) tapaná no’ó yukú-ke-ru.
yesterday 1SG.NS rain-BEN-PERF
‘Yesterday it rained on/for me.’
(Felix 2005: 253, 257, 258)
That the condition of physical proximity should be more important
than the possessive relation in inducing a benefactive/malefactive con-
struction is shown by the following River Warihío examples (see Shiba-
tani 1994 for other cases):
(42) a. maniwíri ihčorewapáte-re waní pantaóni-ra.
Manuel get.dirty-PERF John jeans-POSS
‘Manuel dirtied John’s jeans.’ (John’s jeans were over the
chair.)
b. maniwíri ihčorewapaté-ke-re pantaóni waní.
Manuel get.dirty-BEN-PERF jeans John
‘Manuel dirtied John’s jeans.’ (John was wearing his jeans.)
In general, applicative constructions have been considered as syntactic
valency-increasing operations that are pragmatically motivated (see Pe-
terson 1999). Our claim is that their conceptual basis is rooted in the
ontological feature of an action, as stated in the voice parameter above.
Peterson’s (1999) survey shows that certain applicatives are more basic
and prevalent than others. In the words of Peterson (who lumps benefac-
tives and applicatives together), ‘‘the locative and circumstantial applica-
tives depend on the presence of other applicative constructions, while
benefactive and instrumental/comitative applicatives do not. That is, there
are two core applicative constructions, benefactive and instrumental/
comitative, and these serve as anchors as it were for the development of
additional applicative constructions marked either by the same or distinct
morphology’’ (Peterson 1999: 135). This observation is consistent with
our view of the benefactive/applicative voice. Benefactive and instrumen-
tal/comitative participants are much more directly involved in the event
than a causal factor or a setting entity such as a location, and hence much
more likely to be affected by the action. That the benefactive applicative is
obligatory in some languages also underscores the point regarding the
affected nature of the recipient beneficiary (cf. above).
In the past, grammarians may not have paid sufficiently close attention
to the subtle meaning differences that exist between applicative construc-
tions and their nonapplicative counterparts. However, recent descriptions
of applicative constructions have begun to notice some revealing semantic
effects. For example, Donohue (1999) shows that the Tukang Besi comi-
tative applicative conveys a meaning whereby the applied comitative
nominal is actively engaged in the event:14
Tukang Besi
(43) a. No-moturu kene wowine ane ke hotu mopera.
3R-sleep and woman exist and hair short
‘He slept with the woman with the short hair.’
(i.e. they were sleeping near each other.) (≠ they had sex
together.)
b. No-moturu-ngkene te wowine ane ke hotu
3R-sleep-COM CORE woman exist and hair
mopera.
short
‘He slept with the woman with the short hair.’ (i.e. they had
sex together.)
(Donohue 1999: 231)
The following instrumental applicative from Pulaar also demonstrates
how an applied instrumental nominal can implicate a participant more
thoroughly affected by the agent’s action:
Pulaar
(44) a. mi loot-ii miñ am a
1SG wash-PERF.ACT y.s. 1SG.POSS PREP
saabunnde hee.
soap DET
‘I washed my younger sibling with (some of ) the soap.’
b. mi loot-r-ii miñ am
1SG wash-INST-PERF.ACT y.s. 1SG.POSS
saabunnde hee.
soap DET
‘I washed my younger sibling with (all of ) the soap.’
(Sebastian Ross-Hagebaum pers. comm.)
The various effects of locative applicatives have also been recognized in
the literature. The Balinese locative expression in (45b) below, for exam-
ple, describes a situation where the action of planting banana trees ex-
tends in such a way as to affect the garden. Here the entire garden ends
up being planted with banana trees, while no such implication is made in
the nonapplicative counterpart (45a).
Balinese
(45) a. Tiang mulan biyu di tegalan tiang-e.
1SG plant banana in garden 1SG-POSS
‘I planted bananas in my garden.’
b. Tiang mulan-in tegalan tiang-e biyu.
1SG plant-APPL garden 1SG-POSS banana
‘I planted my garden with bananas.’
(I. Wayan Arka pers. comm.)
4. Pragmatically motivated voice systems
While pragmatically motivated voice systems may appear to have little to
do with the manner in which an action evolves, they can still be framed in
the overall picture of voice as pertaining to the evolutionary properties of
an action. Both the direct/inverse and the active/passive opposition con-
cern the origin of an action and its relative discourse status with respect to
the terminal point, for example, whether or not the action originates
with a first person and terminates in a third person.
First, let us consider the direct/inverse opposition, exemplified by the
Plains Cree and Kutenai examples below:
Plains Cree
(46) a. ni-se:kih-a-wak. (direct)
1SG-frighten-DIRECT-3PL
‘I frighten them.’
b. ni-se:kih-ik-wak. (inverse)
1SG-frighten-INV-3PL
‘They frighten me.’
(Dahlstrom 1991: 69, 70)
Kutenai
(47) a. wu:kat-i palkiy-s titqat’. (direct)
see-IND woman-OBV man
‘The man saw the woman.’
b. wu:kat-aps-i titqat’-s palkiy. (inverse)
see-INV-IND man-OBV woman
‘The man saw the woman.’
(Dryer 1994: 65)
Inverse systems are normally described in reference to a topicality hier-
archy of the type shown below, also known as an ‘‘animacy’’ or ‘‘empa-
thy’’ hierarchy. When the agent of a transitive event is higher (farther to
the left) on the hierarchy than the patient, the direct form obtains; when
the agent is lower on the hierarchy than the patient, the inverse form is
used.
(48) Topicality hierarchy:
first person, second person > third person proximate > third per-
son obviative
(The ranking of first and second person varies across languages.)
The third person proximate-obviative distinction pertains to the topicality
status of a third person argument, such that when a higher referent (prox-
imate) acts on a lower one (obviative), a direct form obtains. The reverse
pattern leads to an inverse form (see the Kutenai examples above). Since
the speech act participants (first and second person nominals referring to
the speaker and hearer) are higher in topicality than a third person nomi-
nal, the entire inverse system is controlled by the relative topicality of the
agent and the patient. In other words, the inverse is concerned with the
degree of topicality of the origin of an action, relative to its terminal
point. The direct/inverse system is thus essentially concerned with the
question of where the action originates, and can be characterized in the
following manner:
Direct/inverse opposition:
Does the action originate in an agent higher in discourse relevance than
the patient?
Yes → direct
No → inverse
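The way hierarchy (48) feeds the direct/inverse parameter can be sketched as a simple rank comparison. The Python fragment below is a minimal illustration under stated assumptions: the names Rank and direct_or_inverse are hypothetical, first and second person are collapsed into a single rank because their relative ranking varies across languages, and the case of two participants of equal rank is left undecided rather than guessed at.

```python
from enum import IntEnum


class Rank(IntEnum):
    """Positions on hierarchy (48); a larger value means higher topicality."""
    THIRD_OBVIATIVE = 0
    THIRD_PROXIMATE = 1
    SPEECH_ACT_PARTICIPANT = 2  # first and second person, left unranked here


def direct_or_inverse(agent: Rank, patient: Rank) -> str:
    """Return the verb form selected by the relative rank of agent and patient."""
    if agent > patient:
        return "direct"
    if agent < patient:
        return "inverse"
    # Hierarchy (48) as stated does not settle equal ranks (e.g. first on second person).
    raise ValueError("equal ranks: hierarchy (48) does not decide this case")


# Plains Cree (46): first person versus third person
# (the third person is treated as proximate here, an assumption of the sketch).
print(direct_or_inverse(Rank.SPEECH_ACT_PARTICIPANT, Rank.THIRD_PROXIMATE))
print(direct_or_inverse(Rank.THIRD_PROXIMATE, Rank.SPEECH_ACT_PARTICIPANT))

# Kutenai (47): proximate versus obviative third persons.
print(direct_or_inverse(Rank.THIRD_PROXIMATE, Rank.THIRD_OBVIATIVE))
print(direct_or_inverse(Rank.THIRD_OBVIATIVE, Rank.THIRD_PROXIMATE))
```

Note that the function inspects only the relative ranks of the origin and the terminal point of the action; nothing in it changes which participant is the agent.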
Here we invoke a notion of discourse relevance that is more general
than that of topicality, though the two are intimately connected. By
discourse relevance we refer to two types of relationship which event
participants have with the speaker and hearer and with the discourse.
The first- and second-person event participants are most relevant to the
speaker and hearer, since we have a natural interest in what we do and
what happens to us. Our inclination to talk about ourselves and those
familiar to us leads to the high discourse-topic potential of first- and
second-person referents and other similar entities. The topicality hierar-
chy in (48) and the more elaborated version in (69) below represent the
degrees of discourse relevance in the above sense. Event participants also
vary along different degrees of relevance commensurate with their infor-
mation value. An entity that is crucial to the information content of a
message has a higher degree of discourse relevance than one supplying
information tangential to the core information conveyed. The two types
of discourse relevance naturally cohere and characterize the most fre-
quently occurring participants in a stretch of discourse.
The inverse system marks constructions denoting situations where the
action originates with an agent low in discourse relevance relative to the
patientive terminal. This is very much like what we see in the passive con-
struction; hence the question naturally arises whether the direct/inverse
and the active/passive systems should be distinguished or not. This has
been a controversial issue in the analysis of some languages. The most
telling difference between the two systems is that while the active/passive
opposition involves changes in the alignment of semantic roles and gram-
matical relations, the direct/inverse opposition does not. In the following
Japanese examples, agents are subjects in both direct and inverse forms:
Japanese
(49) a. Boku-ga Taroo-ni denwa-o si-ta. (direct)
I-NOM Taro-DAT phone-ACC do-PAST
‘I phoned Taro.’
b. Taroo-ga boku-ni denwa-o si-te
Taro-NOM I-DAT phone-ACC do-CONJ
ki-ta. (inverse15)
INV-PAST
‘Taro phoned me.’
Likewise in the Tibeto-Burman language Nocte, agent and patient are
consistently aligned with the ergative and absolutive relations, respectively,
in both direct and inverse forms:
Nocte
(50) a. nga-ma nang hetho-e.
I-ERG you teach-1PL
‘I will teach you.’
b. nang-ma nga hetho-h-ang.
you-ERG I teach-INV-1
‘You will teach me.’
(Das Gupta 1971, quoted after DeLancey 1981: 641)
Another major distinction between the inverse and the passive is that
the former is a transitive clause requiring the expression of an agent argu-
ment, whereas in the latter the agentive nominal is deranked from argu-
ment status. This well-known feature of the passive can be understood
in light of the extremely low discourse relevance of the agentive nominal
in a passive clause (Jespersen 1965; Shibatani 1985). This leads us to
the following formulation of the parameter controlling the active/passive
opposition:
Active/passive opposition:
Does the action originate with an agent extremely low in discourse rel-
evance, or at least lower relative to the patient?
Yes → passive
No → active
The major difference between the inverse and the passive is that in the
former, the agent has a relatively high discourse relevance compared to
the agent of a passive. That is, while the agents of both inverse and pas-
sive clauses have a low degree of discourse relevance or topicality in com-
parison to the patient, the agent of the inverse has a higher degree of
relevance than the agent of the passive.16 The consequence of this difference
is that while the agent of the inverse does not suffer syntactic deranking,
that of the passive does. The null expression of an agent in a passive
clause is a favored syntactic response to its extremely low discourse rele-
vance. This is more widespread in impersonal than in personal passives,
as in Nepali, where an agentive phrase cannot be overtly expressed in an
impersonal passive clause.
Nepali
(51) a. pulis-e ma-lAi jel-mA hAl-yo. (active)
police-ERG I-DAT jail-LOC put-3SG.PERF
‘The police put me in jail.’
b. pulis-dwArA/bATa ma jel-mA
police-ABL/INST I.NOM jail-LOC
hAl-i-ẽ. (personal passive)
put-PASS-1SG.PERF
‘I was put in jail by the police.’
c. ma-lAi jel-mA hAl-i-yo. (impersonal passive)
I-DAT jail-LOC put-PASS-3SG.PERF
‘(They) put me in jail.’
(Madav Pokharel pers. comm.)
Another telling contrast between the inverse and the passive in the treat-
ment of an agent is seen in the following Kutenai examples (from Dryer
1994):
Kutenai
(52) a. wu.kat-i. (direct)
see-IND
‘He/she/it/they [PROX] saw him/her/it/them [OBV].’
b. wu.kat-aps-i. (inverse)
see-INV-IND
‘He/she/it/they [OBV] saw him/her/it/them [PROX].’
(53) a. wu.kat-ap-ni. (active)17
see-1SGOBJ-IND
‘They/he/she saw me.’
b. hu wu.kat-iL-ni. (passive)
I see-PASS-IND
‘I was seen.’
In the inverse form (52b) above, a definite third person agent and a defi-
nite third person patient are expressed through zero pronominals. In con-
trast, in the passive clause (53b) the agent is not coded and its identity is
left unspecified. In the words of Dryer (1994: 69): ‘‘In sharp contrast to
the inverse construction, the A[gent] is never expressed in passive clauses.’’
When an agentive phrase is overtly expressed in a passive clause, its
discourse relevance is naturally greater than in one where it is not
overtly expressed. However, such an agent still has lower discourse
relevance than a patient or the agent of an inverse clause. Despite their
differences, both the passive and the inverse have a close conceptual affinity,
since both are concerned with the discourse relevance of the agentive
participant, as shown in Figure 2. We then expect to find situations where
the passive and the inverse share a functional domain, or where the fea-
tures of these two constructions combine. The first case can be observed
in Japanese, where the active/passive and direct/inverse systems divide
the task of indicating the direction of an action with regard to the deictic
center. When simple actions are involved, the active/passive opposition
is utilized. When an action involves the transfer of some entity — for ex-
ample, a letter in letter-writing, or a message in telephoning — the direct/
inverse pattern is invoked.
Japanese
(54) a. Boku-wa Taroo-o nagut-ta. (active)
I-TOP Taro-ACC hit-PAST
‘I hit Taro.’
b. ?Taroo-wa boku-o nagut-ta.18 (active)
Taro-TOP I-ACC hit-PAST
‘Taro hit me.’
c. Boku-wa Taroo-ni nagura-re-ta. (passive)
I-TOP Taro-by hit-PASS-PAST
‘I was hit by Taro.’
(54′) a. Boku-wa Taroo-ni hon-o okut-ta. (direct)
I-TOP Taro-DAT book-ACC send-PAST
‘I sent a book to Taro.’
b. ?Taroo-ga boku-ni hon-o okut-ta. (direct)
Taro-NOM I-DAT book-ACC send-PAST
‘Taro sent me a book.’
c. Taroo-wa boku-ni hon-o okut-te ki-ta. (inverse)
Taro-TOP I-DAT book-ACC send-CONJ INV-PAST
‘Taro sent me a book.’
A case where the features of the active/passive and the direct/inverse
combine can be seen in Southern Tiwa, which has constructions resem-
bling both passive and inverse. This happens in such a way that while
the agent is deranked, as in the passive, it is also controlled by the
topicality hierarchy as in inverse constructions. Observe the following
examples:
Southern Tiwa
(55) a. seuanide Ø-liora-mu-ban.
man 3SG.3SG-lady-see-PAST
‘The man saw the lady.’
b. liorade Ø-mu-che-ban seuanide-ba.
lady 3SG-see-PASS-PAST man-INST
‘The lady was seen by the man.’
(56) a. seuanide-ba te-mu-che-ban.
man-INST 1SG-see-PASS-PAST
‘I was seen by the man/The man saw me.’ (no active
counterpart)
b. seuanide-ba a-mu-che-ban.
man-INST 2SG-see-PASS-PAST
‘You were seen by the man/The man saw you.’ (no active
counterpart)
(57) a. *te-mu-che-ban ‘i-ba.
1SG-see-PASS-PAST 2SG-INST
‘I was seen by you.’
b. bey-mu-ban. (no passive counterpart)
2SG.1SG-see-PAST
‘You saw me.’
(58) a. *a-mu-che-ban na-ba.
2SG-see-PASS-PAST 1SG-INST
‘You were seen by me.’
b. i-mu-ban. (no passive counterpart)
1SG.2SG-see-PAST
‘I saw you.’
(Allen and Franz 1983: 304, 305)
The first pair in (55) indicates that both active and passive forms are pos-
sible when the agent and the patient are both third person.19 Allen and
Franz (1983) consider sentences like (55b) and (56b) as passive rather
than inverse, presumably because the agent is relegated to oblique status,
as indicated by the instrumental marking on it. Other examples show that
only active constructions obtain when the action originates in a speech
act participant and terminates in any person (57b)–(58b), while a passive
form is obligatory when the action originates in a third person and termi-
nates in a speech act participant (56a)–(56b). This is summarized below:
(59) Optional (either an active or a passive form is possible): 3 → 3
Active forms only: 1 → 2, 1 → 3, 2 → 1, 2 → 3 (i.e. SAP → X)
Passive forms only: 3 → 1, 3 → 2 (i.e. 3 → SAP)
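The distribution summarized in (59) can be restated as a small decision procedure. The following Python sketch is offered only as an illustration of that summary and is not part of the original analysis; the function name `tiwa_voice_options`, the integer encoding of persons, and the string labels are all hypothetical. Reflexive configurations with a speech act participant (1 acting on 1, 2 acting on 2) are not listed in (59), so the sketch rejects them rather than guessing.

```python
# Illustrative sketch of the pattern in (59); all names are hypothetical.
SAP = {1, 2}  # speech act participants: first and second person


def tiwa_voice_options(agent, patient):
    """Return the set of voice forms that (59) allows for agent -> patient."""
    if agent not in (1, 2, 3) or patient not in (1, 2, 3):
        raise ValueError("persons must be 1, 2, or 3")
    if agent in SAP and agent == patient:
        # (59) lists no such configuration, so none is assumed here.
        raise ValueError("configuration not covered by (59)")
    if agent in SAP:
        # SAP -> X: active forms only
        return {"active"}
    if patient in SAP:
        # 3 -> SAP: passive forms only
        return {"passive"}
    # 3 -> 3: either an active or a passive form is possible
    return {"active", "passive"}
```

Under these assumptions, `tiwa_voice_options(3, 1)` yields only the passive, matching (56a), while `tiwa_voice_options(2, 1)` yields only the active, matching (57b).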
The patterns described above are identical to the typical direct/inverse
pattern. In general, the use of a passive clause is considered to be op-
tional, depending on the context which determines the degree of dis-
course relevance of the event participants. What appears to be unique
about the Southern Tiwa passive is that discourse relevance is fairly rig-
idly determined by the topicality hierarchy — as in the case of an inverse
system — indicating the possibility that the Southern Tiwa passive has
historically developed from an inverse system. Nevertheless, passive con-
structions that are apparently unrelated to the inverse system actually
also show restrictions like those of the Tiwa passive. In Japanese, for ex-
ample, passivization is not as free or optional as normally described.
Indeed, the Japanese system is not unlike the Southern Tiwa system de-
scribed above. Generally, if an action originates with an agent higher
in discourse relevance than the patient, an active sentence is preferred,
whereas when the relevance status is reversed, a passive sentence is
chosen:
(60) a. Boku-wa Taroo-o nagut-ta.
I-TOP Taro-ACC hit-PAST
‘I hit Taro.’
b. ?Taroo-wa boku-ni nagura-re-ta.
Taro-TOP I-by hit-PASS-PAST
‘Taro was hit by me.’
(61) a. ?Taroo-ga boku-o nagut-ta.
Taro-NOM I-ACC hit-PAST
‘Taro hit me.’
b. Boku-wa Taroo-ni nagura-re-ta.
I-TOP Taro-by hit-PASS-PAST
‘I was hit by Taro.’
The parallelism of the active/passive and direct/inverse patterns is evident
in that in both cases the action pattern of 3 → 1 triggers the marked
passive and inverse constructions. Similar choices based on the degree of
discourse relevance can also be seen in the Korean active/passive system.
There, a combination of an inanimate agent and an animate patient trig-
gers passivization, while the reverse pattern of combination resists it, as
observed below:
Korean
(62) a. c salam-i ku ai-lul ccoch-ko issyo.
that man-NOM that child-ACC chase-CONJ be
‘The man is chasing that child.’
b. ku ai-ka c salam-eke ccoch-ki-ko
that child-NOM that man-by chase-PASS-CONJ
issyo.
be
‘That child is being chased by the man.’
(63) a. ?sikan-i na-lul ccoch-ko issyo.
time-NOM I-ACC chase-CONJ be
‘Time is chasing me.’
b. na-nun sikan-e ccoch-ki-ko issyo.
I-TOP time-by chase-PASS-CONJ be
‘I am being chased by time.’
(64) a. namca-ka kon-ul ccoch-ko issyo.
man-NOM ball-ACC chase-CONJ be
‘A man is chasing a ball.’
b. ?kon-i namca-eke ccoch-ki-ko issyo.
ball-NOM man-by chase-PASS-CONJ be
‘A ball is being chased by a man.’
(Klaiman 1988: 56–57)
Even in English — where passivization is believed to be highly grammati-
calized — a similar distribution pattern is generally observed, such that
passives like John was hit by me and actives like A dog bit me this morning
are generally avoided. All in all, then, the active/passive system and the
direct/inverse system are controlled by a similar principle based on the
discourse relevance of the origin of an action relative to the patient termi-
nal point.
Next we find a surprising application of the proposed approach to a domain
that is seemingly very different from voice, namely split case marking
in transitive clauses. The best known case occurs with the so-called split er-
gativity phenomenon, whereby both accusative and ergative case-marking
patterns coexist within a single language. Here we focus on a so-called
NP-split, where the case-marking pattern is conditioned by the nature of
the nominal element, as illustrated by the Warrgamay examples below:
Warrgamay
(65) a. ngulmburu gaga-ma.
woman.ABS go-FUT
‘The woman will go.’
b. ngulmburu-nggu maal
woman-ERG man.ABS
ngunda-lma. (absolutive-ergative pattern)
see-FUT
‘The woman will see the man.’
(66) a. ngana gaga-ma.
we.NOM go-FUT
‘We will go.’
b. ngana nyurra-nya
we.NOM you-ACC
ngunda-lma. (nominative-accusative pattern)
see-FUT
‘We will see you.’
(Australia; Dixon 1980: 287–289)
The examples with full nouns in (65) show an ergative case-marking
pattern, while those with pronouns in (66) show an accusative one. The
actual situation is a bit more complicated, such that among pronouns
nonsingular forms show the accusative pattern, while singular pronouns
show a tripartite pattern, where three distinct forms are used for the A,
S, and P functions. The first person singular pronoun, for example, has
forms ngaja (A), ngayba (S), and nganya (P). Note too that the term
‘‘split-ergativity’’ is in fact a misnomer in that the two case-marking
patterns are not clearly segregated; a single sentence may contain a
nominative (unmarked) agentive nominal and an absolutive (unmarked)
patientive nominal. An ergative agentive nominal and an accusative
patientive nominal may also co-occur in a single sentence. Indeed, all pos-
sible combinations obtain, as illustrated below:
(67) a. ngana-Ø ngulmburu-Ø ngunda-lma. (least marked)
we.NOM woman.ABS see-FUT
‘We will see the woman.’
b. ngana-Ø nyurra-nya ngunda-lma.
we.NOM you-ACC see-FUT
‘We will see you.’
c. maal-du ngulmburu-Ø ngunda-lma.
man-ERG woman.ABS see-FUT
‘The man will see the woman.’
d. ngulmburu-nggu ngana-nya ngunda-lma. (most marked)
woman-ERG we-ACC see-FUT
‘The woman will see us.’
These combinatory case-marking patterns suggest that there is just one
set of case-marking rules that accounts for the distribution of case in
Warrgamay if we recognize the notion of the origin and the terminal
point of an action as controlling factors. For example, the major rules
can be straightforwardly formulated as below:
(68) Major case marking rules in Warrgamay
Mark the relevant nominal with
a. -Ø (NOM/ABS) if the transitive action originates in 1/2NSG;
b. -du/-ngga/-nggu (ERG) if the transitive action originates in 3SG, 3DU, 3PL, N;
c. -nya (ACC) if the transitive action terminates in 1/2NSG, 1SG, 3DU/3PL;
d. -Ø (NOM/ABS) if the transitive action terminates in 3SG, N.
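The way the four rules in (68) combine to yield the patterns in (67) can be made explicit with a short sketch. The Python code below is an illustration under stated assumptions, not a description of Warrgamay morphology: the category labels ("1NSG", "N", etc.), the function names, and the return values "Ø", "ERG", and "ACC" are hypothetical, and the choice among the ergative allomorphs is not modeled. Categories that (68) does not mention, such as the singular pronouns with their own A forms, return `None`.

```python
# Illustrative sketch of the rules in (68); labels and names are hypothetical.
UNMARKED_ORIGINS = {"1NSG", "2NSG"}                           # rule (68a)
ERGATIVE_ORIGINS = {"3SG", "3DU", "3PL", "N"}                 # rule (68b)
ACCUSATIVE_TERMINALS = {"1NSG", "2NSG", "1SG", "3DU", "3PL"}  # rule (68c)
UNMARKED_TERMINALS = {"3SG", "N"}                             # rule (68d)


def mark_origin(category):
    """Case marking on the nominal where the transitive action originates."""
    if category in UNMARKED_ORIGINS:
        return "Ø"
    if category in ERGATIVE_ORIGINS:
        return "ERG"
    return None  # not covered by the major rules in (68)


def mark_terminal(category):
    """Case marking on the nominal where the transitive action terminates."""
    if category in ACCUSATIVE_TERMINALS:
        return "ACC"
    if category in UNMARKED_TERMINALS:
        return "Ø"
    return None  # not covered by the major rules in (68)


def case_frame(origin, terminal):
    """Apply the origin and terminal rules independently to one clause."""
    return (mark_origin(origin), mark_terminal(terminal))
```

Because the two rules apply independently, all four combinations in (67) fall out: `case_frame("1NSG", "N")` gives the least marked pattern of (67a), and `case_frame("N", "1NSG")` gives the most marked pattern of (67d).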
Notice that there are two rules, (68a) and (68d), for unmarked forms.
When a first or second nonsingular pronoun functions as the origin of a
transitive action, it will be unmarked. When a third person pronoun or a
full noun functions as a terminal point, it will receive no marking. Like-
wise, there are two rules for marked forms. When a third person pronoun
or a full noun is the origin of a transitive action, the form representing it
receives the ergative marker -du, -ngga, or -nggu. When a nonsingular first or
second person, first person singular, or third person dual/plural is the
termination of an action, the form representing it takes the accusative
marker -nya. These patterns reflect an elaborated version of the topicality
hierarchy seen earlier, which we now rename ‘‘relevance hierarchy’’:
(69) Relevance hierarchy (after Dixon 1994: 85)
1st person pronouns > 2nd person pronouns > Demonstratives, 3rd person pronouns > Proper nouns > Common nouns (human, animate, inanimate)
Again it is the speech act participants (SAPs) that constitute the core of
the group of unmarked agents. The status of SAPs as quintessential
agents is undoubtedly rooted in the child’s first experiences of transitive
events that typically involve either a first-person agent or a second-person
agent. Also relevant is the fact that the first person is the only party with
direct access to the volitional status of an agent, which is an essential fea-
ture of a transitive action. The generalization, then, is that an event con-
figuration with a SAP agent acting on a third person pronominal patient
or a full NP patient (SAP → 3/N) is the most natural event configuration
in human cognition, while an event involving the reverse pattern
(3/N → SAP) is the most marked. The least marked case marking pattern
in the former (67a) and the most marked pattern in the latter (67d)
reflect the naturalness status of these configurations. It is no accident
that these event configurations are exactly the ones that the direct/inverse
and the active/passive systems respond to with regard to markedness. In
the inverse system, direct constructions are typically unmarked, and are
used in expressing an event configuration such as SAP → 3/N, whereas
marked inverse forms express the reverse pattern 3/N → SAP. Previously,
we saw that the active/passive systems in Southern Tiwa and Japanese
(as well as Korean and English to a certain extent) followed the same pat-
tern; that is, the most natural event configuration is expressed by an un-
marked active construction and the least natural by marked passive
forms. To summarize, we obtain the following form-function correlations
regarding the relevant phenomena:
(70) a. The event configuration SAP → 3/N is realized as an unmarked
expression: a direct form in the direct/inverse system,
an active form in the active/passive system, and a NOM(ABS)-NOM(ABS)
case marking pattern in the split case marking system.
b. The event configuration 3/N → SAP is realized as a marked
expression: an inverse form in the direct/inverse system, a passive
form in the active/passive system, and an ERG-ACC case
marking pattern in the split case marking system.
To the extent that these seemingly different phenomena (inverse, passive,
and split case marking) receive a unified account under our conception
of voice, we are assured of its correctness.
5. Voice and argument structure
As a grammatical phenomenon, voice has two sides. On the conceptual
side, it is concerned with patterns of interaction between an action and
event participants in the course of the evolution of the action, as detailed
above. Grammar responds to the complex patterns of participant involve-
ment in events because they correlate with the information value of the
event participants. This in turn is determined in several ways. The most
basic type of discourse relevance correlates with the degree of participant
involvement in an event. Those participants upon which the realization of
an event depends (such as agent and patient in a transitive event type),
have what Shibatani (1994) calls constitutive relevance — meaning that
without their involvement, the event itself will not be constituted. They
naturally have high discourse relevance due to their intrinsic information
value, and their formal expression is mainly what lexical subcategoriza-
tion (e.g., of transitive verbs) insures in grammatical description.
The degree of discourse relevance of a nominal constituent (or more
precisely its referent) is also contextually determined. Those nominals
with high referential status, such as the speaker and hearer, and definite
and/or specific nominals, have a higher degree of discourse relevance
than indefinite or unindividuated ones. We also recognize that languages
generally assign a different degree of discourse relevance to the two central
participants of a transitive event, thus determining basic voice orientation
in a given language (see below).
What is known as active voice is the correlation between what is
known as the ‘‘high transitive situation type’’ and the transitive clause
structure organized in the nominative-accusative fashion, where the agen-
tive nominal is accorded the most central syntactic status, known as the
subject. The patient is a secondary argument, together with the subject
constituting the set of core arguments. In other words, a high transitive
situation type with the ontological features shown below defines the con-
ceptual basis of the active, and other primary voice categories.
Transitive situation type:
The action originates in a volitional agent, extends beyond the agent’s
personal sphere, and terminates in a distinct patient achieving an intended
effect on it.
The core/noncore division in argument structure on the expression side
of voice is rooted in the distinction between those nominals representing
participants with constitutive relevance and those without — or between
those having high discourse relevance vs. those with low relevance. How
these two core arguments are given syntactic primacy (or centrality, or
prominence) varies across languages, however. As mentioned above,
nominative-accusative clauses assign the highest syntactic status to the
agentive nominal. Clauses organized along an absolutive-ergative pattern,
on the other hand, assign syntactic primacy to the patientive nominal
over the agentive one.
The cognitive underpinnings of accusative/ergative coding and their
syntactic/discourse correlates are both controversial, and here we offer a
speculative, somewhat traditional interpretation of them in terms of dis-
course relevance. We assume that the difference between a nominative-
accusative pattern and an absolutive-ergative one is rooted in the cog-
nition of a transitive situation. Accusative languages — languages that
primarily use the nominative-accusative clauses in describing transitive
events — focus on the agentive participant, and treat it as the more cen-
tral of two focal participants in a transitive event. On the other hand, er-
gative languages — those that use absolutive-ergative clauses as the pri-
mary means of describing transitive events — opt for the patientive
participant in assigning the central role to an event participant.20 Support
for this view comes from two areas.
A fair number of ergative languages avoid assigning an ergative sec-
ondary syntactic status to a SAP agent, which has high discourse rele-
vance, and display the phenomenon of split case marking — as in the
case of Warrgamay examined above. A second type of support for our
view comes from the distribution of the passive and the antipassive. The
active/passive opposition, which deals with the discourse status of the
agentive nominal, tends to be grammaticalized in accusative languages,
for in this type of language the primary focus is on the agentive partici-
pant. In ergative languages, on the other hand, the status of the patient
is of primary concern, and the ergative/antipassive opposition dealing
with its status is typically grammaticalized here.
The basic voice orientation of accusative- and ergative-type languages
is thus role-based, which motivates a formal mechanism of the marked
voice — the passive and antipassive, respectively. Compared to this, lan-
guages with a Philippine-type focus system do not have a role-based voice
orientation. The choice of primary syntactic argument — known variously
as ‘‘topic,’’ ‘‘subject,’’ or ‘‘pivot’’ — is made broadly without confinement
to the primary focal participants of agent or patient. The choice is moti-
vated by pragmatic factors such as the definiteness of the participants and
aspectual orientation, as well as semantic factors determining the degree
of transitivity. The ‘‘fluid’’ character of Philippine-type voice systems
makes voice conversion facile without special morphosyntactic mecha-
nisms such as passive and antipassive morphology, or the radical re-
arrangement of argument structure seen in the languages with a basic
voice orientation. The best names for the Philippine-type voice patterns
are ‘‘actor-voice,’’ ‘‘patient-voice,’’ ‘‘instrumental-voice,’’ etc.
To summarize the above discussion, the active voice is the relationship
between the transitive situation type, as defined above, and its syntactic
coding in the nominative-accusative fashion. The ergative voice, on the
other hand, is the relationship between the transitive situation type and
its absolutive-ergative coding pattern. Notice that not all coding patterns
of the transitive situation type are active voice. The passive, for example,
codes the transitive situation type, yet it is not an active voice construc-
tion. Likewise, the relationship between the transitive situation type and
the ergative construction cannot be considered as active voice, contrary
to the view expressed by Dixon (1994: 216) in his statement: ‘‘there is
typically an active/antipassive voice contrast in ergative languages.’’ The
relevant contrast he has in mind is between the ergative and the anti-
passive voice.
Both active and ergative constructions are basic construction types
such that the transitive situation type would be coded according to these
basic construction patterns in accusative and ergative languages, respec-
tively, unless pragmatic factors call for the marked voice constructions
of passive and antipassive. In languages with a Philippine-type focus sys-
tem, there is no basic voice orientation such that the transitive situation
type can be coded by either the actor-focus or the patient-focus construc-
tion (see [15] and [16] above). While the patient-focus constructions are
favored over the actor-focus constructions when a definite patient is in-
volved, the system is not as codified as in the case of ergative languages
that code the transitive situation type in the absolutive-ergative manner
regardless of the definiteness status of the patient nominal.21
We now relate the choice of the central syntactic element of the clause
with the notion of discourse relevance. The primary argument has the
highest discourse relevance in the following senses: 1) it typically has con-
stitutive relevance; 2) it is most salient in the speaker’s mind; 3) it plays
an important role in the propositional act; and 4) it is the entity on which
the hearer’s attention is focused. What distinguishes the three primary
arguments in the three types of languages being examined here — the
nominative argument in accusative languages, the absolutive argument
in ergative languages, and the topic/subject/pivot of the Philippine-type
languages — is their indispensability. That is, except for a few other
minor sentence types, such as existential and exclamatory expressions, all
sentences (N.B.: not clauses) in the respective language types must con-
tain these arguments. We take this fact to be connected to the require-
ment of a proposition to contain an item to be predicated over. In other
words, the primary arguments under consideration all have the referential
function of pointing out what is to be talked about, or predicated over, in
a propositional verbal act. They are what the traditional term ‘‘subject’’
represents in both logic and grammar, and there is no harm in applying
this term to nominative, absolutive, and Philippine-style ‘‘topic’’ nomi-
nals, as long as they are understood in terms of their role in a propositio-
nal act. Just as importantly, they must not be understood in terms of the
agent-based notion of subject, as it applies to the subject in English and
other European languages, or in terms of syntactic roles, that is, their
behavioral and coding properties (Keenan 1976).22 The ranking of dis-
course relevant items at the clausal level may be necessitated by a require-
ment regarding the distribution of attention. The subject of a proposition
is what is most salient in the speaker’s mind, and on which the hearer’s
attention is expected to be focused.23
Situations deviating from prototypical transitive situations are often ex-
pressed by marked voice constructions. Still the structural complexity of a
given voice construction depends on what the normal state of affairs is.
A transitive situation involving an agent and a patient is ‘‘normal’’ for
such actions as killing and breaking that entail the transfer of the action
from one party to another. Departures from prototypical transitive situa-
tions for these kinds of activities therefore result in marked situations,
and languages reflect it in marked voice constructions. Thus, a middle
situation in which a transitive action is confined within the agent’s sphere
(rather than extending to another party) typically results in a middle
construction that is marked — relative to the active voice. On the other
hand, actions such as running and crying — whose realizations are nor-
mally achieved within the agent’s domain — receive lexicalization as basic
middle verbs, that is, lexical intransitives. In this case, it is the departure
from the pattern of confinement of the action to the agent’s sphere that
invokes a marked voice construction such as the causative (e.g. Japanese
hasira-se ‘cause to run’) — as when the act of running is induced exter-
nally. Lexical (i.e. underived) intransitive verbs as well as lexical transitive
ones respectively represent what are considered to be ‘‘normal’’ middle
and active voice situations. Marked voice constructions encode depar-
tures from these normal situation types.
Returning to the issue of argument structure (in particular, valency
change accompanying voice alternations), our framework offers a
straightforward answer to the question of why, for example, causatives
and applicatives have a valency-increasing effect, while passives and antipassives
are valency-decreasing. Both causative and applicative situations
involve the addition of an entity to a basic situation, the former a new
origin of action, the latter a new terminal point. These newly introduced
entities have discourse relevance because they become integral participants
in the event being reported. The causer (being an initiating agent) is
typically accorded an agent-like status, while the target of an extended
action functions as a new terminal point in taking on the status of an af-
fected patient.
The lower degree of patient involvement in an antipassive situation, on
the other hand, lessens its discourse relevance since it is not fully inte-
grated into the event being reported. Valency reduction in an antipassive
due to the deranking of the patient from core to adjunct status is a way of
signaling the low degree of discourse relevance it bears.
Valency reduction in the passive is likewise due to a lower degree of
discourse relevance of the agent. What is interesting about the passive
construction is that it represents the same situation type as an active voice
construction, namely, the transitive situation type. Its valency reduction is thus due
to the pragmatically determined discourse relevance of the event partici-
pants and not to the patterns of deviation from the basic transitive situa-
tion type.
We observed in Section 4 that many passive clauses are motivated by a
particular distribution pattern of event participants having different degrees
of pragmatic relevance, as in a situation where a third person acts
on a first person (the 3 → 1 case). In other cases, the discourse relevance
of the agent is determined contextually by the speaker. For example,
where the identity of the agent is either obvious from the context, can be
surmised, or is unknown to the speaker, its information value is deemed
low. The syntactic deranking of the agent in the passive, that is, valency
reduction, is a way of signaling its low discourse relevance.22
Various grammatical relations hierarchies proposed in the literature
(e.g. subject > object; absolutive > ergative; Philippine-type topic >
nontopic) capture the ranking of nominal arguments in terms of syntactic
centrality. The core/noncore and oblique/adjunct distinctions similarly reflect
the degree of discourse relevance borne by nominal constituents.
The relationship between the coding pattern and the discourse relevance
of nominal constituents can then be stated in the form of the follow-
ing principle, which governs the argument structure of various voice
constructions:
Coding principle:
The degree of syntactic centrality in nominal coding reflects the degree
of discourse relevance borne by the nominal constituents. That is,
the higher the degree of discourse relevance a nominal argument bears,
the more central its syntactic status will be.
It is this established correlation between the coding pattern and dis-
course relevance that lies behind the pragmatically-motivated use of cer-
tain conceptually-based voice constructions. It is well-known that anti-
passive constructions are used where attention on the patient is to be
defocused — as in the following examples from Central Yupik and
Warrungu — where the ergative/antipassive opposition revolves around
the referential status of the patient, or where the action itself rather than
the action’s e¤ect upon the patient is to be emphasized — as, for example,
when the speaker is taking a habitual imperfective perspective:
Central Yupik
(71) a. Arna-m neqa iir-aa.
woman-ERG.SG fish.ABS.SG hide-3.SG/3.SG
‘A/the woman is hiding the fish.’
b. Arnaq neq-mek iir-i-uq.
woman.ABS.SG fish-ABL.SG hide-APASS-3.SG
‘The woman is hiding a fish.’
(Miyaoka 1984: 197)
Warrungu
(72) Referring to a drunkard:
kamukmu+ngku nyula pitya+kali+n.
grog+INST 3SG.NOM drink+APASS+P/P
‘He drinks grog all the time.’
(Tsunoda 1988: 603)
Conversely, applicative constructions may be used in a situation where,
for example, an instrumental nominal has high discourse relevance by
virtue of referring to an item identifiable to the hearer. The coding of
such a nominal as an argument, as in example (73b) below, is a way to
signal its high discourse relevance.
Amharic
(73) a. aster sı̈ga btillik’ billa k’orrt’-čč.
Aster meat with-big knife cut.PF-3F
‘Aster cut some meat with a big knife.’
b. aster tillik’-u-n billa sga k’orrt’-čč-ı̈bb-t.
Aster big-DEF-ACC knife meat cut.PF-3F-APPL-3MO
‘Aster cut some meat with the big knife.’
(Mengistu Amberber pers. comm.)
6. Conclusion
This article has attempted to lay a comprehensive framework for voice
phenomena. We have endeavored to show that, contrary to
generally-held beliefs, voice oppositions reflect conceptual distinctions pertaining to
the evolutionary properties of an action — namely the nature of the ori-
gin of an action, the manner of its development, and the way it termi-
nates. In addition to conceptually-based voice phenomena (volitional/
spontaneous, causative/noncausative, active/middle, and ergative/anti-
passive oppositions), there are pragmatically-motivated ones that deal
with marked patterns of event configuration. These (the inverse, the pas-
sive, and split ergativity) regulate the syntactic prominence of agentive
and patientive nominals in response to the pragmatically-determined de-
grees of discourse relevance. Some voice constructions are conceptually-
based (such as the antipassive and benefactive/applicative), but are also
used for pragmatic purposes. All of them are unified under the notion of
discourse relevance, which, on the one hand, is rooted in the manner of
involvement of event participants in an action, and on the other, in the
pragmatically-determined relevance of the event participants in relation
to the speaker/hearer and discourse.
Based on both synchronic and diachronic distribution patterns, we must
conclude that voice phenomena reflecting conceptual contrasts are basic,
and that they constitute the main types of opposition found in the lan-
guages of the world. The active/middle and causative/noncausative op-
positions typically involve lexicalization of basic voice situations — those
considered to be normal states of affairs — while marked situations are
expressed morphologically or through the syntax. These voice constructions
are observed in most, if not all, languages of the world. Compared
to this, a majority of languages do not have a systematic formal expres-
sion for the active/passive opposition.24 It is also well known that pas-
sives often develop as a further stage of middle forms and other sources.
The valency pattern of various voice constructions represents a way of
arranging nominal constituents according to their degree of discourse rel-
evance. Argument structure, in other words, reflects the pattern of the lis-
tener’s attention. The most central syntactic role, for example, the subject
of a proposition, is reserved for the nominal bearing the highest degree
of discourse relevance; hence it is this constituent on which the listener’s
attention will be focused. Differences in clausal organization across
languages — especially those in the choice of primary syntactic argument
— mean that correspondence patterns between the situation types and
the clause types will be different, too. Moreover, since the concept of
voice refers to these correspondence patterns, we must conclude that the
active/passive, ergative/antipassive and Philippine-type focus systems are
actually different voice systems. In other words, constructions expressing
transitive situation types across languages are different in structure. This
comes as no surprise in view of the fact that what is called ‘‘passive’’
across languages is often vastly different in structure and even in function,
so that two or more types of passive can be happily accommodated within
a single language (the majority of European languages have both auxiliary
and middle-based passives). Crosslinguistic differences in specific
voice constructions make perfect sense when language is viewed as a
historically-evolving functional organism sustaining constant pressure for
adaptation, as eloquently advocated by Givón (2002).
Received 20 January 2004
Revised version received 11 September 2005
Rice University
Notes
* This is a thoroughly revised version of a paper presented in 2002 in Berlin at the work-
shop of the project ‘‘A cognitive-typological study of valency structures: Japanese-
German contrastive perspective’’ supported by the Japan Society for the Promotion of
Science and the Deutsche Forschungsgemeinschaft. A fuller account of the overall
framework and individual voice phenomena is presented in my forthcoming book
Voice from Cambridge University Press.
Uncommon abbreviations: AF = actor focus, APASS = antipassive, EV =
evidential, INV = inverse, NS = nonsubject, OBV = obviative, P = passive-like
verbal form in Sinhala, PC = past conjunct, PD = past disjunct, PF = patient-focus,
PP = present/past tense, pt = potent case inflection, TOP = topic, SMP =
subject-marking noun particle, SPON = spontaneous. Correspondence address:
Department of Linguistics — MS23, Rice University, P.O. Box 1892, Houston, TX
77251-1892, U.S.A. E-mail: matt@rice.edu.
1. Consider the following definition by Crystal (2003: 495):
A CATEGORY used in the GRAMMATICAL description of SENTENCE or
CLAUSE STRUCTURE, primarily with reference to VERBS, to express the way
sentences may alter the relationship between the SUBJECT and OBJECT of a verb,
without changing the meaning of the sentence. The main distinction is between ACTIVE
and PASSIVE . . . In other languages, further contrasts in voice may be encountered,
e.g. the ‘‘middle’’ voice of Greek (which included verbs with a REFLEXIVE meaning,
e.g. She cut herself ), and there are several other types of construction whose role
in language is related to that of voice, e.g. ‘‘reflexive,’’ CAUSATIVE, ‘‘impersonal’’
constructions . . .
This definition is problematic at least in the following four respects: 1) there are voice
oppositions that do not involve alterations in grammatical relation; 2) there are voice
phenomena — even passive constructions — which involve meaning contrast; 3) the
active/passive opposition is not the main voice distinction either in a diachronic or a
synchronic sense. Diachronically passive constructions develop secondarily from vari-
ous sources — for example, middles — and synchronically there are numerous lan-
guages that do not have a formal system for an active-passive opposition, even while
showing other types of voice opposition; 4) constructions such as reflexives are voice
constructions par excellence (see text).
2. See Shibatani and Pardeshi (2002).
3. Another general methodological problem, which we have little room to address in
this article, is the importance of diachronic understandings for structural diversities
and for boundary problems seen in various functional domains including the voice
domain.
4. Studies on transitivity subsequent to Hopper and Thompson (1980) — for example,
Lazard (2002), Kittilä (2002), and Næss (2003) — also deal with a set of phenomena
highly similar to that discussed in this study.
5. Cf. the following definitions of voice: ‘‘[voice] is a regular marking in the verb of the
correspondences between units at the syntactic level and units at the semantic level. In
short, voice is a diathesis grammatically marked in the verb’’ (Xolodovič 1970, as
quoted in Geniušienė 1987: 42–53; emphasis added); ‘‘The category of voice is an
inflectional category such that its grammemes specify such modifications of basic
diathesis of a lexical unit that do not affect its propositional meaning’’ (Mel’čuk 1993: 11;
emphasis added).
6. See Malchukov (forthcoming) for a similar view on the iconicity of formal marking of
transitivity features.
7. Cf. Givón (2001: Ch. 13), where the middle voice and some others are characterized as
semantically-contrastive voices and are distinguished from pragmatically-motivated
voices such as the passive. However, Givón does not group causatives and applicatives
with other voice constructions, presumably because they are not detransitivizing
constructions. Yet, they contrast with simple active transitive constructions in meaning,
and applicatives in particular are often pragmatically controlled as the passive and
antipassive are. Moreover, syntactic detransitivization is not even a defining feature of
middle voice constructions, which Givón recognizes as detransitive voice constructions.
8. See Langacker (1991: Ch. 7) for a fuller discussion of the notion of action chain. See
also Croft (1994), where voice is treated under a similar conception as in this article,
although its scope is much narrower than ours.
9. Not surprisingly, reflexive markers develop from nouns denoting a body or head in
many languages, for example, Amharic ras ‘head’.
10. See Haiman (1985) for a similar approach, which distinguishes two types of verb —
introverted and extroverted.
11. Our position in recognizing intransitive verbs as lexical middles may be challenged on
the basis of the fact that many languages allow marked middle forms for intransitive
verbs: for example, Spanish La pelota cayó de la canasta ‘The ball fell from the
basket’ (as in a basketball game) vs. La pelota se cayó de la mesa ‘The ball fell from the table’.
The question is, shouldn’t the former example be considered as an active expressing
the active/middle contrast with the latter? The answer is ‘‘No.’’ The doubling of forms
expressing the same voice domain occurs frequently, as in the case of lexical and mor-
phological causatives (e.g. Japanese ire- ‘put in’ and haira-se- ‘cause to go in’), as well
as that of morphological and periphrastic middle (e.g. Balinese ma-sugi ‘wash.face’ and
nyugiin awak ‘wash.face self ’). These pairs of forms show subtle meaning differences —
as past studies on causatives have shown — but not in terms of voice opposition. See
Maldonado (1992) for a detailed study of the Spanish middle, from which the Spanish
examples above are taken.
A more serious challenge could be raised regarding the verb ending interpreted as
an active marker in Classical Greek, which also occurs in intransitive verbs: politeu-ô
‘I am a citizen/have civic rights’ vs. politeu-omai ‘I act as a citizen/carry out my
civic rights for myself ’ (Klaiman 1988: 32). Our position on this is that the -ô — and
other so-called active endings — should be interpreted as default subject markers oc-
curring in both lexical middles and transitive actives. The middle endings mark
derived middles.
12. The connection between the middle and the passive is believed to be due to the shared
affected meaning of the subject.
13. A ‘‘possessive relation’’ is in turn implicated in constructions other than the benefac-
tive/malefactive, for example, the double subject construction (see Shibatani 1994).
14. According to Ad Foolen (pers. comm.), the Dutch applicative prefix be- shows a
similar effect. While Hij slaapt met een vrouw ‘He sleeps with a woman’ is euphemistically
used to mean ‘He has sex with a woman’, it still has the literal interpretation of simply
sharing a bed with a woman, as in English. The applicative form Hij beslaapt een
vrouw, lit. ‘He sleeps a woman’, on the other hand, specifically means ‘He has sexual
intercourse with a woman’.
15. Here the inverse marker is a grammaticalized form of the verb kuru ‘come’.
16. Givón (2001: 93) characterizes the difference in the degree of discourse relevance of the
agent and patient of the passive and inverse constructions as: Agt ≪ Pat (passive),
Agt < Pat (inverse), where ≪ and < respectively indicate a case where ‘‘the agent is
extremely non-topical . . . , so the patient is the surviving topical argument in the clause,’’
and a case where ‘‘the patient is more topical than the agent but both agent and patient
are topical.’’
17. According to Dryer (1994), Kutenai shows the direct/converse contrast only with com-
binations of a third person agent and a third person patient.
18. A pound sign (#) — as opposed to an asterisk — indicates that a sentence is
pragmatically odd, even though it is grammatical. This and other pragmatically odd sentences
are possible in a context that is not deictically anchored to speech time or location, for
example, in a narrative of past events recited from a detached perspective.
19. A closer examination may indicate a distinction similar to the proximate-obviative con-
trast seen in the inverse system of Kutenai. See Kroskrity’s (1985) discussion of the sta-
tus of third person referents in Arizona Tewa, which has a voice system very similar to
that of Southern Tiwa.
20. See Shaumyan (1986) for a similar view.
21. Basque, for example, uses the ergative construction in encoding both definite and
indefinite patient participants of a transitive situation, as in:
(i) Jonek ardoa ekarri du.
Jon.ERG wine bring AUX.TR
‘Jon brought (the) wine.’
(Hualde and Ortiz de Urbina 2003: 411)
In Philippine-type languages, the actor-focus construction is used for the indefinite pa-
tient reading and the patient-focus construction for the definite patient reading. The
term ‘‘fluid voice system’’ is intended for this kind of property of the Philippine-type
focus system, which (i) allows alternation between the actor-focus construction and
the patient-focus construction without involving an additional verbal morphology
such as a passive or antipassive marker, and (ii) ‘‘promotes’’ an oblique nominal to
the topic/subject/pivot status without involving an additional/separate applicative
morphology. Austronesian languages with a Philippine-type focus system vary with re-
spect to the second property. Balinese, for example, has developed a separate applica-
tive process, which is required for an oblique to be promoted to the patient status be-
fore it can participate in the actor-focus/patient-focus alternation.
22. The syntactic properties of the subject vary across languages, and in fact the list of
properties such as those in Keenan (1976) does not offer universal definitions of the
subject pace his assertion. See Langacker (1991: 305 ff.) on this point.
23. In this light, the controversial label of ‘‘focus system’’ for the Philippine-type voice
phenomenon is not entirely inappropriate. Also see Givón (1976), Langacker (1991: 305
ff.) and the related discussions on the subject and the topic.
24. By ‘‘a systematic formal expression’’ we mean a construction that has the passive func-
tion, approximating the passive prototype as defined, for example, in Shibatani (1985).
According to the surveys by Dik Bakker and Anna Siewierska, 53% of the languages in
their 397-language sample and 57% in their similar 374-language sample do not have a
passive construction. The difference between these two samples is that the former
includes languages with a Philippine-type focus system as languages having a passive
construction, while the latter does not. (I am grateful to Dik Bakker for supplying
this information to me.)
Languages that have not developed a mature passive construction can express pas-
sivity by means of those constructions that share the pragmatic function of the passive
such as indefinite person constructions (e.g. they normally ignore the speed limit on this
freeway) and inverted word order for patient prominence. See Dezső (1988) for the dis-
cussion of how Hungarian copes with the needs for expressing passivity in the absence
of a passive construction.
References
Allen, B.; and Franz, D. (1983). Advancements and verb agreement in Southern Tiwa. In
Studies in Relational Grammar 1, D. Perlmutter (ed.), 303–316. Chicago: University of
Chicago Press.
Amberber, Mengistu (2000). Valency-changing and valency-encoding devices in Amharic. In
Changing Valency: Case Studies in Transitivity, R. M. W. Dixon and A. Aikhenvald
(eds.), 312–332. Cambridge: Cambridge University Press.
Austin, Peter (1981). A Grammar of Diyari, South Australia. Cambridge: Cambridge Univer-
sity Press.
Benveniste, Emile (1971 [1950]). Active and middle voice in the verb. In Problems in General
Linguistics, 145–151. Coral Gables, FL: University of Miami Press.
Berman, Ruth (1982). Dative marking of the affectee role: data from modern Hebrew.
Hebrew Annual Review 6, 35–59.
Croft, William (1994). Voice: beyond control and affectedness. In Voice: Form and Function,
B. Fox and P. Hopper (eds.), 89–117. Amsterdam: John Benjamins.
Crystal, David (2003). A Dictionary of Linguistics and Phonetics. 5th ed. Oxford:
Blackwell.
Dahlstrom, Amy (1991). Plains Cree Morphosyntax. New York: Garland Press.
DeLancey, Scott (1981). An interpretation of split ergativity and related patterns. Language
57, 626–657.
Denwood, Philip (1999). Tibetan. Amsterdam: John Benjamins.
Dezső, László (1988). Passiveness in Hungarian: with reference to Russian passive. In Pas-
sive and Voice, M. Shibatani (ed.), 291–328. Amsterdam: John Benjamins.
Dixon, R. M. W. (1977). A Grammar of Yidiny. Cambridge: Cambridge University Press.
— (1979). Ergativity. Language 55, 59–138.
— (1980). The Languages of Australia. Cambridge: Cambridge University Press.
— (1994). Ergativity. Cambridge: Cambridge University Press.
Donohue, Mark (1999). A Grammar of Tukang Besi. Berlin and New York: Mouton de
Gruyter.
Dryer, Matthew (1994). The discourse function of the Kutenai inverse. In Voice and Inver-
sion, T. Givón (ed.), 65–100. Amsterdam: John Benjamins.
Felix, Rolando (2005). A grammar of River Warihío. Unpublished doctoral dissertation,
Rice University.
Gair, James W. (1990). Subjects, case and INFL in Sinhala. In Experiencer Subjects in South
Asian Languages, M. Verma and K. P. Mohanan (eds.), 13–42. Stanford, CA: CSLI.
Geniušienė, Emma (1987). The Typology of Reflexives. Berlin and New York: Mouton de
Gruyter.
Givón, T. (1976). Topic, pronoun and grammatical agreement. In Subject and Topic, C. N.
Li (ed.), 149–188. New York: Academic Press.
— (2001). Syntax: An Introduction. Vol. 2. Amsterdam: John Benjamins.
— (2002). Bio-Linguistics: The Santa Barbara Lectures. Amsterdam: John Benjamins.
Haiman, John (1985). Iconic and economic motivation. Language 59, 781–819.
Holisky, Dee A. (1987). The case of the intransitive subject in Tsova-Tush (Batsbi). Lingua
71, 103–132.
Hopper, Paul; and Thompson, Sandra (1980). Transitivity in grammar and discourse. Lan-
guage 56, 251–299.
Hualde, José I.; and Ortiz de Urbina, Jon (2003). A Grammar of Basque. Berlin and New
York: Mouton de Gruyter.
Jespersen, Otto (1965). The Philosophy of Grammar. New York: Norton.
Kansakar, Tej (1999). Verb agreement in Classical Newar and Modern Newar. In Topics in
Nepalese Linguistics, Y. P. Yadava and W. W. Glover (eds.), 421–443. Kathmandu:
Royal Nepal Academy.
Keenan, Edward (1976). Towards a universal definition of ‘‘subject’’. In Subject and Topic,
C. N. Li (ed.), 303–333. New York: Academic Press.
Kemmer, Suzanne (1993). The Middle Voice. Amsterdam: John Benjamins.
Kittilä, Seppo (2002). Transitivity: Toward a Comprehensive Typology. Turku and Åbo: Åbo
Akademis Tryckeri.
Klaiman, M. H. (1988). Affectedness and control: a typology of voice systems. In Passive
and Voice, M. Shibatani (ed.), 25–83. Amsterdam: John Benjamins.
Kozinsky, Igor; Nedjalkov, Vladimir P.; and Polinskaja, Maria S. (1988). Antipassive in
Chukchee: oblique object, object incorporation, zero object. In Passive and Voice, M. Shi-
batani (ed.), 651–706. Amsterdam: John Benjamins.
Kroskrity, Paul (1985). A holistic understanding of Arizona Tewa passives. Language 61,
306–328.
Langacker, Ronald W. (1991). Foundations of Cognitive Grammar. Vol. 2. Stanford, CA:
Stanford University Press.
LaPolla, Randy (1996). Middle marking in Tibeto-Burman languages. In Pan-Asian Linguis-
tics: Proceedings of the Fourth International Symposium on Languages and Linguistics,
Vol. V, 1940–1954. Bangkok: Mahidol University.
Lazard, Gilbert (2002). Transitivity revisited as an example of a more strict approach in ty-
pological research. Folia Linguistica XXXVI, 141–190.
Malchukov, Andrej (forthcoming). Transitivity parameters and transitivity alternations:
constraining co-variation. In Case, Valency and Transitivity: A Cross-linguistic Perspec-
tive, L. Kulikov, A. Malchukov, and P. de Swart (eds.). Amsterdam: John Benjamins.
Maldonado, Ricardo (1992). Middle voice: the case of the Spanish se. Unpublished doctoral
dissertation, University of California, San Diego.
McLendon, Sally (1978). Ergativity, case and transitivity in Eastern Pomo. International
Journal of American Linguistics 44, 1–9.
Mel’čuk, Igor (1993). The inflectional category of voice: towards a more rigorous definition.
In Causatives and Transitivity, B. Comrie and M. Polinsky (eds.), 1–46. Amsterdam: John
Benjamins.
Merlan, Francesca (1985). Split intransitivity: functional oppositions in intransitive inflec-
tion. In Grammar Inside and Outside the Clause, J. Nichols and T. Woodbury (eds.),
324–362. Cambridge: Cambridge University Press.
Miyaoka, Osamu (1984). On the so-called half-transitive verbs in Eskimo. Études/Inuit/
Studies VIII: Supplement Issue: The Central Yupik Eskimos, 193–218.
Moravcsik, Edith (1978). On the case marking of objects. In Universals in Human Language,
J. Greenberg (ed.), 249–289. Stanford, CA: Stanford University Press.
Mosel, Ulrike; and Hovdhaugen, Even (1992). Samoan Reference Grammar. Oslo: Scandina-
vian University Press.
Næss, Å. (2003). Transitivity: From Semantics to Structure. Nijmegen: Ponsen & Looijen.
Onishi, Masayuki (2001). Non-canonically marked S/A in Bengali. In Non-Canonical Mark-
ing of Subjects and Objects, A. Aikhenvald, R. M. W. Dixon, and M. Onishi (eds.), 113–
148. Amsterdam: John Benjamins.
Patz, Elizabeth (1982). A Grammar of the Kuku Yalanji Language of North Queensland.
Pacific Linguistics Series b (19). Canberra: Australian National University.
Payne, Doris; and Barshi, Immanuel (eds.) (1999). External Possession. Amsterdam: John
Benjamins.
Peterson, Donald A. (1999). Discourse-functional, historical, and typological aspects of appli-
cative constructions. Unpublished doctoral dissertation, University of California, Berkeley.
Shaumyan, Sebastian (1986). The semiotic theory of ergativity and markedness. In Marked-
ness, F. Eckman, E. Moravcsik, and J. Wirth (eds.), 169–217. New York: Plenum Press.
Shibatani, Masayoshi (1985). Passives and related constructions: a prototype approach. Lan-
guage 61, 821–848.
— (1994). An integrational approach to possessor raising, ethical datives, and adversative
passives. Proceedings of the 20th Annual Meeting of the Berkeley Linguistics Society,
461–486.
— (1996). Applicatives and benefactives: a cognitive account. In Grammatical Constructions:
Their Form and Meaning, M. Shibatani and S. A. Thompson (eds.), 157–194. Oxford: Ox-
ford University Press.
— ; and Artawa, Ketut (2003). The middle voice in Balinese. Paper presented at the XIIIth
Southeast Asian Linguistics Society Conference at UCLA on May 3, 2003. To appear in
the proceedings of the conference.
— ; and Pardeshi, P. (2002). The causative continuum. In The Grammar of Causation and
Interpersonal Manipulation, M. Shibatani (ed.), 85–126. Amsterdam: John Benjamins.
Tsunoda, Tasaku (1988). Antipassives in Warrungu and other Australian languages. In Pas-
sive and Voice, M. Shibatani (ed.), 595–649. Amsterdam: John Benjamins.
Winstedt, Richard O. (1927). Malay Grammar. 2nd ed. Oxford: Clarendon Press.
Xolodovič, A. A. (1970). Zalog. In Kategorija zaloga. Conference material, 2–26. Leningrad.