
Toward Multimodal Ethnopoetics

Article · January 2012


DOI: 10.1515/applirev-2012-0005



Kataoka, K. (2012). Toward multimodal ethnopoetics.
Applied Linguistics Review 3(1): 101-130.

Toward multimodal ethnopoetics

Kuniyoshi KATAOKA
Aichi University

Abstract:
Multimodal analysis of discourse is a fast-developing area of linguistic research. With
this trend in mind, the purpose of the current chapter is twofold: first, to briefly review
previous endeavors in the study of linguistic poetics with special attention to parallelism and
repetition (cf. Jakobson 1960, 1966), and to seek potential paths to expand it to multimodal
analyses of natural discourse by incorporating the ideas from ethnopoetics (Hymes 1981,
1996, 2003) and gesture studies (McNeill 1992, 2005); and second, to present a sample
analysis of media discourse in the framework of “multimodal ethnopoetics” by highlighting
the interplay between the verbal-nonverbal coordination and the audio-visual representations.
With these goals in mind, we confirm that poeticity is not a distinctive quality restricted to
constructed poetry but is an endowment to any kind of natural discourse that is
co-constructed by language, the body, and the environment.
Specifically, I first review some basic and extended concepts of repetition and parallelism,
identifying the notion of “lines” as the fundamental criterion for conducting Hymesian
ethnopoetics, in which lines are woven into larger, culture-specific units on the “verse/stanza”
levels. In addition, it is proposed that para-linguistic and nonverbal aspects of language use
may (un)consciously contribute to the construction of poetic structure, typically in terms of
“catchment” (McNeill 2005) and the distributional configuration of gestures (Kataoka 2009,
2010, 2012). In the latter half of the paper, we move on to examine an actual case (a Japanese
TV commercial) in which poetic intentions are apparently maximized for greater appeal to the
audience and larger profit from the product. The analysis indicates that the aesthetics
encoded and shared therein could be an outcome of the repeated practice, accumulated and
sedimented by attending to the ongoing—whether actual or virtual—participation, which is
generally facilitated by favored manners of conduct, or “habitus” (Bourdieu 1990).

Keywords: ethnopoetics, multimodality, repetition and parallelism, narrative

Introduction
The purpose of the current chapter is to briefly overview previous endeavors in
the study of linguistic poetics and to seek potential paths to expand it to
multimodal analyses of natural discourse. Multimodal analysis of discourse is a
fast-developing area of research, but the analysis of poetic functions therein is
still in a burgeoning stage. In the following, I will first examine basic concepts
of (ethno)poetics, and then I will review recent findings and the open issues that
have emerged as the research developed. Finally, a brief analysis based
on multimodal ethnopoetics will be given.
Since it is obviously not feasible with my capacity and in this limited space
to discuss all the aspects of poetics, what I call “poetics” here is mostly
restricted to linguistic poetics articulated by Roman Jakobson (1960, 1966) and
to the subsequent branches under the rubric of “ethnopoetics” developed by Dell
Hymes (1981, 1996, 2003), as well as to recent developments in gesture studies,
which include nonverbal aspects of poetics (McNeill 1992, 2005).
There is no doubt that poeticity in language use has been a major site of
(ethno)linguistic investigation, as seen in such works as Jakobson (1960),
Hymes (1981, 1996, 2003), Tedlock (1983), Woodbury (1985), Silverstein
(1985, 1998), Tannen (1989), Stockwell (2002), Friedrich (2001, 2006), and
Rumsey (2007). However, investigation of poetic functions of nonverbal
elements in naturally occurring discourse is relatively new, beginning essentially
with Cassell and McNeill (1991) and McNeill (1992), and further explored by
McNeill (2005) with the notion of “catchment” (see also Furuyama and Sekine
2007; Kataoka 2010). Now it is time to take these ideas seriously and expand
them to a wider scope, encompassing nonverbal aspects of poetic realization. I
propose below that multimodal ethnopoetics would make explicit inherent
properties that reside not only in literary text but also in the synthetic use of
language-body-environment amalgams.

Poetics
In ordinary terms, poetics (or poetry) usually evokes in our minds special skills
and/or fixed rules for writing or reciting poetry (in a broad sense), which
typically cultivate and are characterized by various figures of speech, rhetorical
techniques, and rhythmic/prosodic features. In the European literary tradition,
most conventional poetic forms are based on coordinated structures of the
sounds of words—rhyme and meter—and recurring words or lines (Fabb 2002),
while in linguistics the study of poetics is most closely associated with the
theories put forth by the Russian linguist Roman Jakobson, who defined the
features of poetics in terms of vertical (paradigmatic) and horizontal
(syntagmatic) relations of linguistic elements.1 He palpably envisioned a
structural construal based on paradigmatic/syntagmatic configurations of
phonetic and morpho-syntactic constituents.

On a societal level, poetic practice can be found anywhere, not only in
“high” cultures with which poems and verses are typically associated, but also in
the mass cultures in which spontaneous language use, and even
vulgarism/profanity, prevails. In other words, what we call “poetic” here is more
concerned with organization and formal structures than with theme or content.2
What defines such a form and organization is widely acknowledged to be
characterized by parallelism and repetition, to which we will turn later.
Also, on an individual level, poetics seeps into every realm of our lives,
although not all the time. Examining various literary traditions, Friedrich (2001)
identified and defined a phenomenon of what he calls “lyric epiphany” as a
momentary but momentous breakout of short stretches of lyric forms in an epic.
It is characterized by various rhetorical features such as certain prosodic contour,
increased lexical density, similes, and, most probably, repetition and parallelism
in all types and at all levels of performance. In fact, those features are also
characteristics of the “peak/climax” in (narrative) performance (Longacre 1996;
Turner 1981). That being the case, we could assume by extension that this
notion may also apply to a momentary breakout of poetic forms in casual and
spontaneous interaction including verbal and nonverbal features.
Then, what are the relationships among poetics, narrative, and conversation?
In practice, natural conversation may often include narratives (Norrick 2000),
which arguably comprise common components across cultures
(no matter what they are called). Since narratives, whether “small” or “large”
(Bamberg 2007; Georgakopoulou 2007), permeate conversation and typically
emerge and recur at a moment of focused attention or as an affect-laden
sequence in a mundane stretch of discourse, we would expect that “lyric
epiphany” also inheres and lurks behind them. Given this, poetics is not a
distinct quality restricted to constructed poetry but is an endowment to any kind
of natural discourse.

Parallelism
It is believed that the importance of parallelism in verse and poetic forms was
first advocated by the Reverend Robert Lowth in his lecture on Old Testament
and sacred Hebrew poetry, which he delivered at Oxford in 1753. It was later
published in 1778 as his “Preliminary Dissertation” of Isaiah, in which he
termed the feature parallelismus membrorum (Lowth 1778; cited in Jakobson
1966: 399–400):

The correspondence of one Verse, or Line, with another, I call
parallelism. When a proposition is delivered, and a second is subjoined
to it, or drawn under it, equivalent, or contrasted with it in Sense; or
similar to it in the form of Grammatical Construction; these I call
parallel lines; and the words or phrases, answering one to another in
corresponding Lines, Parallel Terms.

Lowth elsewhere identifies three forms of parallelism: the synonymous,
antithetic, and synthetic, all forms of which, he claims, are perpetually mixed
with one another, and such mixtures add essential beauty and coordinated
variety to the text. His analysis of biblical texts was immediately and
pronouncedly received, and it had a profound impact on the subsequent research,
spawning voluminous works that were inspired by his scholarship.
That tradition is diversely conceptualized in modern linguistics. The
best-known version of it is Jakobson’s definition of the “poetic” function of
language, which was heavily motivated by Russian formalism and the Prague
School linguists’ ideas of poeticity, as well as by Karl Bühler’s communication
theory. For example, the Prague School linguist Mukařovský (1964: 19) describes the
function of poetic language as consisting in “the maximum foregrounding of the
utterance.” The foregrounding is most effectively managed “to the extent of
pushing communication into the background as the objective of expression and
of being used for its own sake.” He claims that aesthetic intentions may be
brought to a maximum degree by distorting the standard usage to the extent
where the “ordinary” can be turned into the “extraordinary” by means of various
foregrounding techniques.
Likewise, as Jakobson (1960: 356) succinctly remarked, the poetic function
becomes manifest when it “focuses on the message for its own sake.”3 To put it
another way, discourse and text itself—including, in my interpretation, the
manners that convey them with phonation, and the accompanying
body—become the sources of fascination apart from the semantic content. This
idea is embedded in the two axes of language: a paradigmatic axis (based on
“selection”) and a syntagmatic axis (based on “combination”), the first of which
defines the projection of metaphor by means of similarity, and the second,
metonymy, by means of contiguity. Thus, the combination of syntagmatic
elements is always indexical, projecting an upper-level parallelism of relevant
components. This notion is condensed in Jakobson’s oft-quoted statement that
“[t]he poetic function projects the principle of equivalence from the axis of
selection into the axis of combination” (1960: 358).
This statement applies to our observation that selecting a certain language,
phonation, action, and/or sign triggers co-occurrence or avoidance of other
means of poetic representation, and it covertly requires those elements to be
indexically ordered as, in many societies, condolences require limited options of
words/phrases and phonological and behavioral control, all of which need to be
concerted to constitute a desired register for the occasion (see also conflict and
mediation in Warao [Briggs 1996] and modern psychiatric treatment in
Bangladesh [Wilce 2008] for the working of coordinated mobilization).
Those devices serve to maintain the equivalent properties ranging from
“shape/form” of text to the holistic organization of discourse appropriate for the
reciprocal exchange of emotion and information. As Silverstein (1976, 1987)
indicates, numerous linguistic (and non-linguistic) features have a potential to
participate in the hierarchical poetic formation by cultivating parallel
constructions at different levels (cf. Friedrich’s [2006] broader concept of
ethnopoetics4). In such situations, the poetic message, whether explicitly or
implicitly, and intentionally or unwittingly, emerges as the “figure” out of the
backdrop of normative interactional practices via various foregrounding devices
based on linguistic (and somatic) features.

Repetition
In relation to poetic parallelism, the notion of “repetition” instantly comes
to mind. It is a common feature of oral traditions worldwide and is
defined as “a grammatical, stylistic, poetic, and cognitive resource associated
with attention; as such it is a core resource in our mental and social life” (Brown
1999: 225, 1998). Parallelism and repetition are spread over every nook and
cranny of language use, and they sometimes are not clearly demarcated. In
ordinary definitions of the terms, repetition is a rhetorical device that includes
the repeated use of the same sounds, words, phrases, clauses, etc., for emphasis,
clarity, amplification, or emotional effect (see also Tannen 1989; Ferrara 1994;
Schegloff 1997; and Rieger 2003 for conversational functions such as
“participatory listenership,” “ratifying listenership,” “humor,” “savoring,”
“evaluation,” “expansion,” “rejoinder,” “initiation of repair,” and “floor
holding”), while parallelism may or may not include reiteration of such units,
but could consist of equivalent structures and ideas. Thus, we might say that,
although the distinction is always leaky, repetition is more about diction, while
parallelism is more about organization. In this sense, we could regard Jakobson’s
notions of “equivalence” (Jakobson 1960) and “recurrent returns” (Jakobson
1966) as broader concepts that incorporate both phenomena.5
On another level of the phenomenon, Tannen (1989) and Howard (2009)
distinguish between synchronic and diachronic repetition: synchronic repetition
is locally (i.e., “text-structurally”) achieved by resorting to the semantic
cohesion and schematic coherence of utterance (and gesture) in discourse,
whereas diachronic repetition is globally (i.e., “socially, culturally, and
historically”) achieved by connecting the current discourse to the inherited one
in terms of the implicit, shared assumptions and ideology (Urban 1991;
Silverstein and Urban 1996). Especially in this latter sense, the practice of
repetition is a performatively embedded asset or linguistic “habitus” (Bourdieu
1977, 1990) in everyday communication.
These two orientations, although their boundary is again blurred, may open
up another possibility for achieving repetition, one toward individual parallelism
(achieved by a single speaker/performer: Rieger 2003) and the other toward
interactional parallelism (achieved by more than one speaker/performer:
Schegloff 1997), and I take both orientations as valid instances of a poetic
practice, whether uttered, gesticulated, or a combination of both.
The most palpable and explicit poetic function should be achieved by heavy
(but not complete) repetition and parallelism. Natural conversations or oral
narratives will rarely be like that, as they are always decentering and allowing
for ad hoc diversion from the source. Nevertheless, they should potentially be
“poetic” as far as they converge on certain parallel formations within the
identifiable range. Here, if the idea of “poetic license” is further expanded, it
would offer to us a new perspective on naturally occurring conversations.
Tannen (1989), Hopper and Glenn (1994), and Jefferson (1996) have all pointed
out that conversation exhibits a natural orientation to interactive poetics, which
sustain, structure, and renew the “voice” of participants through various aspects
of repetition, imitation, and meta-talk of their verbal (and nonverbal) resources.
Seen that way, it would come as no surprise to say that “conversation operates
by poetic principles” (Hopper and Glenn 1994: 39).

Lines as a basic unit


A line as the basic component of a poetic text is largely acknowledged in many
European and non-European traditions and cultures. For example, in studies of
verbal performance—such as songs, prayers, mythical narratives, ceremonial
performances, jokes, riddles, political speeches, oratories, sermons, debates (see
relevant chapters in Bauman 1992)—one prominent criterion for defining a
basic unit has been claimed to be the determination of “lines” and their
concomitant parallelism, often based upon couplets (see Sherzer 1982; Bright
1990; Fox 1988; Rumsey 2007).

However, identification of lines is highly susceptible to indigenous norms.
As Fabb (2002: 143–144) maintains, “lineation,” the division of a text into lines,
is only implied and can compete with other options for how the text should be
divided up. Among those options, the following “boundary characteristics”
(though not exhaustive) are widely accepted as tried-and-true criteria for
dividing text into lines (1).

(1) Criteria for demarcating lines


Parallelism between lines
Parallelism within the line
Alliteration at the beginning of the line
Alliteration within the line
The initial letter in the line can be part of a pattern
Rhyme
Meter
Layout on the printed page
Boundaries at line ends
Size (in terms of normative length).

To take a palpable example from a traditional verse, a strict English meter
such as “x/x/x/x/x/,” which is canonically called “iambic pentameter,” is made
up by repeating the unit “x/” five times (where “x” stands for “unstressed” and
“/” for “stressed”). This grid has a strict periodic structure, but in performance,
“lines” are not usually strictly periodic because strict meters govern only
stressed syllables in polysyllabic words and do not necessarily apply to a
performance of a line (Fabb 2002: Ch. 4). Obviously, they are major
characteristics for written verses, but many of the features listed above may
apply to spoken utterance in different but comparable forms, as in “same
discourse marker/conjunction” for “alliteration,” “comparable rhythm” for
“meter,” “intonation unit” for “boundaries at line ends,” and so forth.
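
As a rough illustration of the gap between the abstract grid and a performed line, the periodic template can be generated by repeating the foot and then compared, position by position, with a scanned line. The following is only a minimal sketch in Python: the function names and the sample scansion are hypothetical and are not taken from Fabb (2002).

```python
def metrical_grid(foot: str = "x/", feet: int = 5) -> str:
    """Build a strictly periodic grid by repeating one foot.

    "x" stands for an unstressed position and "/" for a stressed one.
    """
    return foot * feet


def deviations(performed: str, grid: str) -> list:
    """Return the 0-based positions where a performed pattern departs from the grid."""
    if len(performed) != len(grid):
        raise ValueError("performed pattern and grid must have the same number of positions")
    return [i for i, (p, g) in enumerate(zip(performed, grid)) if p != g]


grid = metrical_grid()        # iambic pentameter: "x/" repeated five times
performed = "/xx/x/x/x/"      # hypothetical performance with an inverted first foot
print(grid)
print(deviations(performed, grid))
```

The sketch treats the grid as a template against which a performance is measured, not as a description of what a performer actually produces.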
The major issue here is what it is that constitutes poetic units and serves as
the determining factor for grouping them. Fabb (2002: 203–214) in fact resorted
to “lines” and “line-groups” of an English oral narrative and analyzed it using
the ethnopoetic model proposed by Hymes (1981), Tedlock (1983), and Labov
(1972), confirming that oral narratives have verse-like structures. It thus seems
reasonable to start from identifying lines and line-groups, although they may
possibly exhibit variation in size and element across communities and cultures.
As will be shown below, such basic units could be phrase-like, clause-like, or
sentence-like, and may (or may not) be determined by prosodic/intonational
contours.

Hymes’ Verse Analysis model


In linguistic studies of poetics, most attention has been paid to the poetic effect
achieved by the parallel and systematized arrangement of lexical,
morpho-syntactic, and semantic elements. The most prominent in this line of
research is Dell Hymes’ Verse Analysis, which has been extensively and
persistently refined in a series of his monographs (Hymes 1981,
1996, 2003).6
In Hymes’ model, smaller-to-larger units (“lines,” “verses,” “stanzas,”
“scenes,” and “acts”) are hierarchically organized in terms of such features as
prosodic, lexical, and morpho-syntactic elements. He maintains that
intermediate-level organizations such as “verse” and “stanza” exhibit a distinct
cultural property, which preserves “emically” entrenched poetic performance;
e.g., the Chinookan family, the Finnish family, American English, Japanese, etc.,
connect these levels in sequences of three and five,7 while others such as
Kwakiutl, Takelma, Zuni, Hopi, and Navajo prefer twos and fours.
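
The idea of culturally preferred grouping sizes can be made concrete with a small tally. The sketch below is an illustrative assumption on my part, not Hymes’ own procedure: it takes an already segmented narrative (a list of stanzas, each a list of verses) and reports what proportion of stanzas contain three or five verses versus two or four. All names and the sample data are hypothetical.

```python
# Hypothetical labels for the two grouping preferences mentioned in the text.
PATTERN_NUMBERS = {"three/five": {3, 5}, "two/four": {2, 4}}


def verses_per_stanza(stanzas):
    """Count how many verses each stanza contains."""
    return [len(stanza) for stanza in stanzas]


def pattern_fit(stanzas):
    """Proportion of stanzas whose verse count matches each preference."""
    sizes = verses_per_stanza(stanzas)
    if not sizes:
        return {name: 0.0 for name in PATTERN_NUMBERS}
    return {
        name: sum(size in numbers for size in sizes) / len(sizes)
        for name, numbers in PATTERN_NUMBERS.items()
    }


# Invented segmentation: three stanzas with 3, 5, and 2 verses.
narrative = [
    ["v1", "v2", "v3"],
    ["v4", "v5", "v6", "v7", "v8"],
    ["v9", "v10"],
]
print(verses_per_stanza(narrative))
print(pattern_fit(narrative))
```

Such a tally describes a tendency in a given segmentation; it cannot decide by itself where the boundaries should fall.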
It should be remembered, however, that such formations are never
prescribed: “Narrators are not restricted to just these alternatives. Some
command both principles and may adopt one or the other for a particular story or
situation, or part of a story, or level of organization…” (Hymes 1994: 331).
Moreover, when the intonation contours are available in audio-visual formats,
recent proposals concerning “intonation units” and prosodic contours should be
taken into consideration (e.g., Du Bois et al. 1993; Chafe 1994; Kataoka 2009).
As a general rule, lines are organized into verses, which are further bound up
into stanzas based on the sound, form, and content.
The following (2) is a list of proposed criteria for marking boundaries,
complemented by relevant findings:

(2) Hymes’ and other relevant criteria for verse/stanza analysis


LINES:
(a) A line is basically an idea unit (Chafe 1980; Hymes 1996) or an
intonation(al) unit (IU: Chafe 1994). When intonation contours are not
available, as in written or edited spoken narratives, a clause is the primary
unit in lines8;
(b) Connectives such as “and,” “so,” “then,” and “but” are usually boundary
markers of a line; also, (sentence) final particles (such as Japanese ne, yo,
and others: Minami and McCabe 1991) may possibly serve the same
purpose;
(c) When prosodic contours are available, all three types of IUs are considered
for analysis (Chafe 1994). Substantive and regulatory IUs basically
constitute single lines. Fragmentary IUs are appended to the beginning of the
next new line or to the end of the same line, depending on the significance
they achieve in each case.9

VERSES and STANZAS:


(a) In a narrative form, the verse is basically a central building block of a
sentence-like contour and may involve more than one line.
(b) A new verse is usually demarcated by a preceding, often intonational, period
(or a falling contour).10
(c) Shifts in temporal and spatial relations may indicate a new verse or stanza.
(d) Conversational turns are always verses—Hymes (1996) maintains that a turn
leads to a demarcation of a verse.
(e) A series of verbs in the same tense, a chunk of repetitions, or the same topic
may indicate a stanza (or a scene).

SCENES and ACTS:


(a) A drastic change of spatial and temporal relations often marks the change of
scenes.
(b) A change of participants often leads to a change of scenes.
(c) A hierarchically higher category of scenes constitutes an Act.
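
Two of the criteria above lend themselves to a mechanical first pass over written text. The sketch below is a deliberately crude approximation under stated assumptions: a sentence-final period stands in for the falling contour that closes a verse, and the four English connectives listed under LINES (b) open a new line. Intonation units, final particles, and shifts of time, place, or participants are not modeled, and the function names and sample text are hypothetical.

```python
import re

# Connectives named in criterion LINES (b); the list is not exhaustive.
CONNECTIVES = ("and", "so", "then", "but")


def split_into_verses(text):
    """Approximate verse boundaries by sentence-final periods."""
    return [v.strip() for v in re.split(r"(?<=\.)\s+", text) if v.strip()]


def split_into_lines(verse):
    """Start a new line before each connective inside a verse."""
    pattern = r"\s+(?=(?:%s)\b)" % "|".join(CONNECTIVES)
    return [part.strip() for part in re.split(pattern, verse) if part.strip()]


def verse_analysis(text):
    """Return a list of verses, each represented as a list of lines."""
    return [split_into_lines(verse) for verse in split_into_verses(text)]


sample = "She came home and she saw the dog. Then it barked but she laughed."
for verse in verse_analysis(sample):
    for line in verse:
        print(line)
    print()
```

A segmentation produced this way is at best a starting point for the analyst, since the criteria in (2) are heuristics that interact with prosody and content.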

Hymes’ form-content parallelism could be complemented even by seemingly
disruptive discourse features, both verbal and nonverbal. Conversation is
considered to be robustly resistant to ready segmentation in terms of form and
structure because the speaker exhibits numerous types of disfluency
characterized as speech errors, false starts, hesitations, truncations, trail-offs,
stuttering, etc.11 Such disfluencies are ordinarily seen as by-products of online
planning or continuous negotiation between the surface representations and
underlying mental processes. At other times, conversation and narration may be
interactionally suspended for Q/A, clarification, inattention, or problematization.
Even in these ongoing troubles, ad hoc system-sustaining mechanisms may
come into play to fit themselves into a preferred poetic formation (see Kataoka
[2009] for “diversions” excluded from an ethnopoetic formation).

It is not yet known, however, to what extent such interactionally
generated disfluencies could be part of an ethnopoetic construction and how
they are different from individually produced ones. Whether individually or
interactionally produced, natural conversation and even its concomitant
disfluencies—not to mention narrative components within—may potentially
participate in ethnopoetic formations.

Ethnopoetics in text/discourse analysis (focus on socialization and education)
Although there are numerous instances of ethnopoetic analysis of text and
performance, I will take up one notable theme of ethnopoetics
here, namely socialization and educational/instructional significance, as a firmly
established area of application.
In order to grow into responsible and respectable adults in a community,
people need to be somehow socialized into the respective cultural norms and
ways of conduct through observation of, participation in, and proactive immersion
in everyday interaction. Included among those explicit and implicit controls of
verbal and nonverbal (including bodily) conduct are ways to solicit and exercise
interaction with other cultural members. Ochs and Schieffelin (1984) called this
process “language socialization” (see also Duranti, Ochs, and Schieffelin 2011).
For example, in the Kaluli society in New Guinea, a certain speech act called
elema helps constitute an essential sister-brother relationship and is repeatedly
reinforced in the everyday lives of the community (Schieffelin 1990).
In this vein, we could also say that construction of ethnopoetic
awareness and competence may be a life-long process, starting from infancy via
childhood into puberty/adulthood and then into old age. Highly relevant to this
aspect is Miall and Dissanayake’s (2003) study on babytalk addressed to an
8-week-old infant, which was found to be highly “poetic” in that short and simple
words/phrases were metrically repeated, constituting parallelism of heavy
stresses/accents, alliteration, assonance, etc. (see also Beebe, Stern, and Jaffe
1979 for kinetic coordination in mother-infant interactions). Based on these
findings, they propose that babytalk “attunes cognitive and affective capacities
in ways that provide a foundation for the skills at work in later aesthetic
production and response” (353).
In a similar vein, Minami and McCabe (1991) argue that Japanese children’s
narrative development is facilitated by triplets, which are implicitly embedded in
various linguistic and non-linguistic activities of Japanese. They attribute the
origin of this pattern to haiku, a traditional poetic style consisting of a
three-line sequence of 5-7-5 syllables, which still is a popular literary form in
contemporary Japan (see also McCabe and Peterson 1991). Gomes and Martin
(1996) also found that an English teacher’s talk to a problem student showed a
strikingly formulated pattern of intonational segments that consisted of
triplets—a phenomenon also observed in other traditional oral cultures (Hymes
1981, 1996; Tedlock 1983; Bright 1984). With the triplets, the teacher clearly
exemplified the desirable manner of speaking in the host culture and used these
constructions as an implicit “remedial” strategy for delinquent students.
At the same time, students’ and teachers’ oral narratives in and out of
classrooms may serve to reveal ongoing processes in manifesting their identity
and agency (Juzwik 2004; Warriner 2010). A representative case is Juzwik’s
(2004) study of a teacher’s narrative in an American middle school classroom.
Focusing on the narrative about the Holocaust in World War II, she showed how
systematically the teacher constructed the stanza formation by employing
metricalized parallelism, succeeding in inviting hearers to identify with the
characters and to divide them into such oppositions as “we vs. they” and “Jews vs. Nazis,” and in
enhancing a moral stance against atrocities inflicted by Hitler.
On the other hand, variable conformity to and appreciation of different
ethnopoetic norms may lead to social inequality in various settings. In education,
such differences may be due to ethnic rhetorical styles (e.g., Scollon and Scollon
1979; Michaels 1981; Hester 1996), as often stated in the “deficit model”
interpretation, or to socioeconomic “class capitals” (cf. Bourdieu 1977).
Blommaert (2006) also explicitly points out possible disadvantages emanating
from such differences in diverse service encounters, typically where
cross-cultural storytelling is crucially required for, say, social-welfare
procedures, hospitalization procedures, police interviews, courtroom hearings,
and so forth. In these environments, the narrators’ (and their families’) life
quality and fate will drastically be affected by the storytelling abilities preferred
by and well grounded in the host community/culture.
The mismatch of ethnopoetic norms may be subtle, but the consequences
may be grave if such disparity continues to accumulate. Those covert norms and
preferences are inexorably embedded in verbal exchanges.

Poetics of para-linguistic and nonverbal features


It is not only utterance but also vocal qualities that can convey and contribute to
the perception of poetic formation. It remains, however, an open
question whether and to what extent the formations based on prosody and
linguistic elements may converge.12

It was Tedlock (1977, 1983) who raised the issue against Hymes’
verse/stanza-based approach. He most articulately emphasized the primacy of
pause groupings (e.g., pitch, loudness, rhythm, silence) in oral performance of
the Zuni language, complaining that Hymes’ model transforms what actually
happens through constantly changing sounds and silences into “regularized
typographical patterns” (which are largely based upon the Western literary
tradition) with his “verse-seeking” eye (see also Messineo 2004 and Purvis 2009
for recent rhythm/prosody-based analyses).
In fact, physical vocal quality may be highly relevant to the organizational
mechanism of narratives and conversations. As to the development of an
intonation-oriented survey of everyday interaction, one of the seminal works
was conducted by Erickson (1982), who argued that the musicality of speech
brings the listener’s attention to the key information in the speech stream and
that it cues the “transition relevance places,” where turn exchange between
speakers is appropriate. He also showed by using a musical score that cadence
and musicality are abundantly utilized in ritual speech, opera, dinner table
conversation, etc., so as to time the audience’s participation at the right moment
with the pitch and volume-stressed syllables that recur (Erickson 2002).
In addition to prosodic aspects, physiological vocal phenomena such as
“weeping” and “wailing” can be boundary-marking tools. For example, Hill
(1991) analyzed the incidence (or performance) of weeping in a Mexican
woman’s narrative by employing the Labovian narrative model. Interestingly,
she found that sobs, gasps, and sniffles tended to gather around phrase
boundaries, not disrupting the syntax. Weeping (and tears), on the other hand,
did deform the intonation contours and flood across episode boundaries,
disrupting the syntax and comprehension of the story. Nevertheless,
uncontrollable weeping then and there seemed acceptable and even desirable,
indexing her fidelity and good selfhood in the face of life’s challenges and
hardship. Also, Briggs (1993) noted that the musical and poetic synthesis in
Warao ritual wailing—especially the polyphonic and intertextual nature of
laments—plays an essential role in shaping/producing their symbolic power
sustained by socio-political and economic distinctiveness of individual voices
(see also Feld 1990 for Kaluli ritual wailing/weeping).
Furthermore, other para-/non-linguistic features may also come into play to
further corroborate the awareness of narrative structures. For example, laughter
is a typical means to represent the recipient’s “evaluation” and his or her
orientation toward the utterance. Thus, it is not hard to imagine that it will
cluster around, but not necessarily converge on, what narrative studies variably
term “evaluation,” “peak,” “climax,” or “punch line.” Goodwin (2007) gives a
good example in which Don and Ann (“knowing” participants) collaboratively
held their laughter until they reached the climax of the narrative because it is the
locus where laughter becomes most relevant for the other (“unknowing”)
audience.13 Kendon’s seminal study on gaze (Kendon 1967) is another good
example. He observed that the listener tends to gaze at the speaker more (i.e.,
longer and more frequently), while the speaker’s gaze is used more sparingly,
clustering around the boundaries of narrative and ending in mutual gaze.

Gesture and poetics


Examination of gesture in terms of poetics is not a particularly new area. Some
anthropologists have attempted to incorporate nonverbal aspects of performance into
the study of ethnopoetics; e.g., song and dance in the Kaluli ceremonial
performance called Gisaro in Papua New Guinea (Schieffelin 1976), “action
writing” where mime and calligraphic signs are combined in artistic combat by
two members in the Egbe society in Nigeria/Cameroon (Thompson 1983), or the
performance in American Sign Language poetry (Klima and Bellugi 1983; visit
for example “http://www.youtube.com/watch?v=GmhbuGZJyJA” for a beautiful
performance by Clayton Valli). These were either traditional rituals or
purely poetic performances achieved through bodily control, and in that sense the
nonverbal features there are “scripted” rather than captured “on the fly.”
Now the time is ripe for multimodally informed studies of poetics, and we
could incorporate spontaneous resources such as gesture, posture, and various
“participating referents” made available in the immediate context. In so doing,
we will notice that the relationship between particular textual/gestural features
and narrative components may not be as stable as has been previously assumed,
but rather that they can be sensitized and constructed in situ by incorporating the
surrounding resources at hand.
More researchers are currently turning their attention to the poetics of
naturally occurring discourses. A systematic relationship between narrative and
gesture was first articulated by McNeill (1992) in terms of three levels of the
narrative structure. First, the “narrative level” consists of references to events in
the story world, where the speaker narrates events that the listener takes
to be “a faithful simulacrum of world occurrences in their actual order”
(McNeill 1992: 185). This temporal constraint is what characterizes this level,
and it mostly corresponds to what Labov (1972) calls “Complication (or
complicating action)” based on “narrative clauses.”14 This level includes two
types of gestural perspectives: the “character” viewpoint (C-VPT) and the
“observer” viewpoint (O-VPT). C-VPT is a mode in which the narration is made
from the actor’s viewpoint in the scene, while O-VPT is anchored onto the
viewpoint of the narrator, who objectively depicts the scene from a detached
vantage point that can further be decomposed into two closely related modes of
description: “inside” and “outside” perspectives.
At the “meta-narrative” level, narrators make explicit references to the
structure of the story, as seen in such a comment as “this is a story about my
sophomore year, ….” or “That’s all.” This level is not constrained by the order
of events in the real or fictive world, and roughly corresponds to Labov’s
“Abstract,” “Orientation,” and/or “Coda.” The final level is called the
“para-narrative” level, where storytellers also make references to their own
experience that engenders the narration. At the para-narrative level, narrators
step out of the official narrator role and speak for themselves, objectifying the
relationship of the narrator to the listener, as seen in (3). Although there are no
corresponding elements in Labov’s model, it is probably equivalent to the
original interview question, “Have you ever been in a situation where you were
in serious danger of getting killed?”

(3) (excerpt from McNeill 1992: 186)


A: Um, have you seen any of the uh Bugs Bunny cartoons? <PARA>
B: yeah, like ….
A: Right, ok, this one actually wasn’t a Bugs Bunny cartoon. <META>
….
A: And uh the first scene you see is uh <META>
   This…this window with Birdwatcher’s Society underneath it. <NARR>
   And there’s Sylvester peeking around the window. <NARR>

These different levels of narration (based on “gesticulation”15) are also
manifested in different “voices” (Bakhtin 1981). In Bakhtin’s theory of dialogue,
or hybrid construction of dual selves or multiple personas, “voice” (who is
narrating at the moment through what media/language) is a very important
means of constructing identity and encoding ideology (see, e.g., Hill 1995). The
same is true of gestures. Certain perspectives, stances, and personas can be
inferred from the form and space of an iconic gesture (Cassell and McNeill
1991: 388). In other words, a different point of view is manipulated by a
different voice associated with a particular self in the scene, and accordingly
represented by different gestures.

However, voice can be split into dual selves, as can gesture. When two
“voices” compete, they may be represented separately or incorporated into one,
encoding “the perspective of the character himself and the perspective of an
outside observer” (Cassell and McNeill 1991: 391; see also Parrill 2009 for
“dual viewpoint gestures”).
More interesting is a claim that particular gesture types are more likely to be
embedded in specific narrative levels (Cassell and McNeill 1991; McNeill 1992).
Based on the above classification, McNeill (1992: Ch. 2) proposed that the
following relationships (4) are observed between different types of gesture and
the narrative levels.

(4)
Beats: They appear when there are rapid shifts of level, and they
indicate the temporal locus of the shifts without having to convey
the content on either of the levels involved.
Pointing: It appears at all levels when orientation or change of orientation
is the focal content.
Iconics: They appear at the narrative level, where the content consists of
emplotted story events.
Metaphorics: They appear at the meta-narrative level, where the content
consists of the story structure itself viewed as an object or space.

This idea of gesture and meta-structural correlation is further advanced in
McNeill (2005: 116) by elaborating on the function of “catchment.” Catchment
is defined as a recurrent visuo-spatial image across a long stretch of discourse that
provides access to thematic cohesion and underlying semantic structures. The
form, direction, and/or action space of gesture are recurrently used throughout
discourse, establishing consistency and unity where common features are
highlighted and/or maintained. These recursive gestural elements are assumed to
embody and converge onto certain poetic features, heavily relying upon the nature
of repetition.16
For example, McNeill (2005: 117–119) explains the notion by focusing on the
alternating use of three catchments (C1, C2, and C3) produced by one of his
informants (called “Vivian”), who was asked to narrate the content of an
animation just shown. There, C1 was embodied by one hand to describe the main
character in the story (Sylvester the cat); C2 concerned an object in the story (a
bowling ball), the recurring feature of which is a round shape created by
symmetrical hands; C3 depicted the relative positions of two entities in a
drainpipe (Sylvester and a bowling ball), which are indicated by two
asymmetrical hands. These systematic relationships showed that, however
hidden behind textual representations on the surface, gesture reflects the ongoing
achievement of thematic coherence.
However, it is still arguable whether such correspondences between gesture
and the referent are cross-linguistically valid or culturally relativized.17 Also,
since the results and findings of McNeill and his colleagues mostly came from
“monologic” narratives, interactional aspects of catchment have yet to be
examined in detail. That is, the catchment association may not only be
individually created but also collaboratively constructed and extended in situ.
For example, Kataoka (2009, 2010) closely examined the verbal and
non-verbal aspects of way-finding instructions between close friends and
confirmed that speakers coordinately cue subtle shifts of the narrative phases in
terms of forms, referents, and shifts of hand gestures. He also showed that, as in
Bright (1984), boundaries of both rhetorical and ethnopoetic components may
largely coincide, mutually cultivating and corroborating multimodal cohesive
ties with local and global catchments. Kataoka (2011, 2012) also demonstrated
that a cardiopulmonary resuscitation (CPR) instruction session was highly
poetically organized with the help of multimodal imports by cultivating
odd-number formations.
Gestural repetition and parallelism are a major resource for achieving
thematic coherence and a totality of instruction. Using this notion of “catchment,”
Pozzer-Ardenghi and Roth (2008) focused on one particular type of gesture
(“squeezing/pumping”) repeated many times both within and across biology
lectures. They showed that the teacher, even if he or she uses different
“signifiers” (such as “atria,” “ventricles,” and “systole”), tactfully differentiates
and relates the “signifieds” by using the same “squeezing/pumping” gesture
through the collaborative achievement of verbal and gestural elements,
facilitating the understanding of newly introduced terms.
Overall, not only linguistic but also multi-modal semiotic resources in
tandem substantially contribute to the achievement of thematic coherence.

Macro-/micro-components and arenas of poetic emergence


Below, I summarize the basic components that may be cultivated for creating
ethnopoetic formations. What I mean by macro- and micro-levels here relies
upon the size of the component, not upon the process. Thus, the macrostructure
here refers to the components in the narrative structure or to the sequential
structure of conversation, both of which translate into an overall architecture of
interactional discourse. On the other hand, the microstructure concerns the
smallest unit in narration here, a line-like component. The microstructure
elaborates on phrasal/clausal/sentential functions in terms of meaning, prosody,
verb types, linking patterns, etc. (Smaller units within a line are not
included here.) Table 1 summarizes such macro- and micro-components
proposed in previous studies.

MICROSTRUCTURE            Proponent                     MACROSTRUCTURE            Proponent
COMPONENT                 (genre)                       COMPONENT                 (genre)
------------------------  ----------------------------  ------------------------  ----------------------------
Narrative clause          Labov & Waletzky 1967         Abstract                  Labov & Waletzky 1967
Free clause               (interview, narrative)        Orientation               (interview, narrative)
                                                        Complicating Action
                                                        Evaluation
                                                        Resolution/Result
                                                        Coda
------------------------  ----------------------------  ------------------------  ----------------------------
line                      Hymes 1981, 1996, 2003        Exposition                Hymes 1981
verse                     (mythical/traditional         Complication              (mythical/traditional
stanza                    narratives)                   Climax                    narratives)
scene                                                   Denouement
act
------------------------  ----------------------------  ------------------------  ----------------------------
Substantive intonation    Chafe 1994                    Exposition                Longacre 1996
unit (IU)                 (conversation)                Inciting moment           (narrative, folk tales)
Regulatory IU                                           Developing conflict
Fragmentary IU                                          Climax (peak)
                                                        Denouement (peak)
                                                        Final suspense
                                                        Conclusion
------------------------  ----------------------------  ------------------------  ----------------------------
TCU (lexical, phrasal,    *Sacks, Schegloff, &          Pre-pre                   *Sacks, Schegloff, &
clausal, sentential)      Jefferson 1974                Pre-sequence              Jefferson 1974,
TRP                       (conversation)                (Main story)              Jefferson 1978
Adjacency pair                                          Pre-closing               (conversation)
                                                        Closing
------------------------  ----------------------------  ------------------------  ----------------------------
Gesture unit:             Kendon 1980, 2004             Narrative level           McNeill 1992
Preparation               McNeill 1992                  (C-VPT, O-VPT             (gesture)
(Pre-stroke hold)         (gesture)                     [outside/inside])
Stroke                                                  Paranarrative level
(Post-stroke hold)                                      Metanarrative level
Retraction

Table 1. Macro- and Micro-Components of Narrative/Conversation
* Conversation analytic categories are listed for comparison.

We have so far looked at structure-building features of verbal and nonverbal
elements by focusing on how they contribute to framing the event or
performance, and how they may shift, expand, and create new frames in ongoing
discourse. In fact, the arenas of poetic emergence are much wider than currently
assumed. So it would be helpful to illustrate how poetic language works in terms
of the media of realization and the genres of use. The list in (5) is offered only
to serve the current review and is hardly exhaustive. It simply
represents possible areas of research that have received attention in
previous studies. Owing to space constraints, this chapter deals only with the
first three media of realization.

(5)
(a) SOUND (alliteration/rhyme, homophone, sound symbolism, onomatopoeia,
etc.)
(b) LANGUAGE (linguistic [lexical, morphological, syntactic, semantic,
pragmatic], textual, discursive, meta-/paralinguistic, prosodic, etc.)
(c) BODY (gaze, gesture, posture, multi-party body formation,
scripted/improvisational performance, proxemics, etc.)
(d) THOUGHT (mental image, theorization, cognitive constraints/preference,
ethnosciences; see Friedrich 2006)
(e) ARTIFACT (sculpture, architecture, art forms, commercial products, visual
design, etc.)
(f) ENVIRONMENT (not necessarily related to human intention or capacity:
fractal structures, birds’ flying formation, snowflake patterns, etc.).

With this broader scope and enormous potential, ethnopoetics offers us
promising paths toward an integrated study of languages and semiotic signs as a
whole.

Sample analysis of multimodal (con)text


In this section, I will briefly examine one palpable instance by employing the
notions and methods introduced above. Specifically, we look into a case in
which poetic intentions are apparently maximized for the greatest appeal to the
audience and the highest expectation of profit: a TV commercial message. This
commercial was produced for a snack food product called “purittsu,” and it
won the Actor Performance Award at the 44th ACC (abbreviation of “All
Japan Radio and Television Commercial Confederation”) CM Festival in
2004. It is performed by a then-prominent national idol, Aya Matsuura,
affectionately called Ayaya, of the Mooningu Musume ‘Morning Gals’ (to view
the video clip, access “http://www.youtube.com/watch?v=e5I3-J-zpXE”). In this
30-second commercial, she wears an orange costume, like the one worn
by Bruce Lee in The Game of Death (1972) or by the Bride in Quentin
Tarantino’s movie Kill Bill (2003), and she dances to a fixed rhythm with two
sumo wrestlers at her sides, yelling “TSUppuri TSUppuri TSUppuri ….” That
is why this performance was later called the “Tsuppuri Dance” in the mass
media. The product name, Purittsu, is obviously a Japanized and accented
version of “pretzel,” although the actual snack is more like a stick made from
the same ingredients.
The funny and bizarre characterization in the commercial seemed to gain a
lot of popularity (and, as mentioned, eventually won an award) probably
because it heavily incorporates and cultivates several layers of poetic principles
of the Japanese language and culture. The first prominent feature is the rhythm,
which consists of the repeated sets of beats, going “ .” All
Japanese people would recognize that this is what is called san-san-nana
byooshi ‘3-3-7 beats,’ which is widely utilized in traditional cheering
performance for sports (often with a drum beating the rhythm).18 This rhythm is
overlaid with the performer’s chant “tsuppuri,” which is a punned
transformation of tsuppari “thrust” (a sumo punching technique) and purittsu,
the name of the product advertised here. These are the reasons that Ayaya (the
performer) is accompanied by two sumo wrestlers at her sides and performs
thrusts throughout the performance. Notice also that the background decoration,
a pair of huge hand models marked for acupuncture points, adds an Asian
atmosphere, the same type of amalgam representation pursued in Kill Bill.
First, I would like to show the overall structure of the commercial by
referring to the rhythmic and rhetorical features of the performance. As widely
known, the Japanese poetic form haiku consists of a fixed set of moraic
units—i.e., 5-7-5 moras—and this performance also cultivates the same sort of
traditional formats. What I would like to emphasize here is not the number of
moras (which is intrinsic to san-san-nana byooshi) but rather the higher levels of
organization equivalent to verses and stanzas.
Since there is a short pause/breath after a set of three beats and an
exclamatory cheer “dosukoi” at the end of the 3-3-7 beats, we can take the three
“lines” as forming a single verse, as represented in (6). The conspicuous scenes
and frames in the commercial are shown in Fig. 1, and they will be separately
referred to in the following analysis.

(6) Lines and a verse


1. TSÚppuri TSÚppuri TSÚppuri,
2. TSÚppuri TSÚppuri TSÚppuri,
3. TSÚppuri TSÚppuri TSÚppuri TSÚppuri TSÚppuri TSÚppuri
TSÚppuri (DosuKÓi!).
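
The grouping of beats into lines and of lines into a verse, as laid out in (6), can be restated as a short Python sketch. It is an illustration only: the names `BEATS_PER_LINE`, `CHANT`, `CHEER`, and `build_verse` are hypothetical labels introduced here and are not part of the original analysis, and the accent marks of the transcript are omitted for simplicity.

```python
# Illustrative sketch only: all names below are hypothetical labels.
BEATS_PER_LINE = (3, 3, 7)   # san-san-nana byooshi '3-3-7 beats'
CHANT = "TSUppuri"           # one chanted beat/phrase
CHEER = "(DosuKOi!)"         # verse-final exclamatory cheer


def build_verse(beats_per_line=BEATS_PER_LINE, chant=CHANT, cheer=CHEER):
    """Return one verse as a list of lines, each a list of chanted beats."""
    lines = [[chant] * n for n in beats_per_line]
    # The cheer closes the last line and so marks the verse boundary.
    lines[-1] = lines[-1] + [cheer]
    return lines


if __name__ == "__main__":
    for number, line in enumerate(build_verse(), start=1):
        print(f"{number}. {' '.join(line)}")
```

Under these assumptions, one verse contains thirteen chanted beats in three lines, with the cheer attached to the third line as the boundary marker.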

Figure 1. Conspicuous frames from Tsuppuri dance.
(a) “TSÚppuri TSÚppuri ….”
(b) “DosuKÓi!”
(c) (crunch!)
(d) (crunch!)
(e) (Narration) “Cod roe flavor is out!”
(f) (Narration) “Let’s eat Pretz! / Why not, let’s!!”
(g) (Narration) “GU..RI..KO”

Verses are bound into a stanza, and in this case, although not very noticeable
on a single viewing, a verse is repeated five times within the 30-second
commercial, amounting to a five-verse stanza on the higher level (Fig.
2). In Figure 2, a single beat/phrase “Tsuppuri” is represented by an asterisk “*”
(see also Fig. 1). The sign “ ” at the verse-final position represents the phrase
“DosuKÓi!” (Fig. 1 (b)), which is an exclamatory cheer used (especially in
sumo wrestling) to refer to a strenuous action or to a person doing it, but here it
is also used as a boundary marker of a verse.

Performance   Direction   Action type                 Close-up to…
              of motion
(Stanza A)
 (Verse 1)
  1.          forward     thrust
  2.          forward     thrust
  3.          forward     thrust/ceremonial posture
 (Verse 2)
  4.          backward    thrust
  5.          backward    thrust
  6.          backward    thrust/ceremonial posture
 (Verse 3)
  7.          forward     thrust                      Rock music
  8.          forward     thrust                      continues….
  9.          forward     thrust/ceremonial posture
 (Verse 4)
  10.         backward    thrust
  11.         backward    thrust
  12.         backward    thrust/ceremonial posture   Aya crunching Pretz x2
 (Verse 5)
  13.         forward     thrust
  14.         forward     thrust
  15.         forward     thrust/ceremonial posture   Aya dancing

Figure 2. Overall verse structure and bodily/visual control.

There are other features that corroborate the validity of this organization.
First, although this is a five-verse structure, it could be seen as a repetition of
two identical units, or partial lamination of “Verse 1 to Verse 3” and “Verse 3 to
Verse 5,” with Verse 3 working as a pivot of those units. The rationale for
identifying two identical units, rather than a single five-verse unit, is that they
are rhetorically differentiated by different patterns. That is, the second unit is
inaugurated by the background rock music, which lasts nearly to the end of the
commercial (until the 4th beat in Line 15) and gradually fades out at the
announcement of the company name “GU..RI..KO.”19 As shown below, the
second unit incorporates other visual frames and is more densely devised for
rhetorical effects (Fig. 3 is a detailed account of the second unit). That is, these
units are “equivalently differentiated,” but are constructed in a way that an
absurd equation “3 + 3 = 5” is made coherent by the distributional patterns
based on parallelism.20
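
The arithmetic behind “3 + 3 = 5” can be made explicit with a small Python sketch. It assumes only what Figure 2 and the paragraph above state: five verses, alternating forward and backward thrusts, and two three-verse units that share Verse 3. The names `VERSES`, `direction`, `first_unit`, `second_unit`, and `pivot` are hypothetical labels introduced for illustration.

```python
# Illustrative sketch only: names are hypothetical, not from the original analysis.
VERSES = [1, 2, 3, 4, 5]


def direction(verse):
    """Odd-numbered verses thrust forward, even-numbered ones backward (Fig. 2)."""
    return "forward" if verse % 2 == 1 else "backward"


first_unit = VERSES[0:3]    # Verse 1 to Verse 3
second_unit = VERSES[2:5]   # Verse 3 to Verse 5 (rock music sets in)

# The two units share exactly one verse, the pivot.
pivot = set(first_unit) & set(second_unit)

# Both units show the same forward-backward-forward pattern.
pattern_1 = [direction(v) for v in first_unit]
pattern_2 = [direction(v) for v in second_unit]

# Counting the pivot only once gives 3 + 3 - 1 = 5 verses.
total = len(first_unit) + len(second_unit) - len(pivot)
```

On this reading, the equation holds because the pivot verse belongs to both units and is counted once: the two units are identical in their motion pattern, while the stanza as a whole remains a five-verse structure.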

Now let us examine the second unit in more detail (Fig. 3). This unit is
characterized by numerous rhetorical features often observed in other narrative
and storytelling performances, especially around the “peak” or the “climax.”
There we have a “crowded stage,” where various actors/entities take turns
appearing, “rhetorical underlining” (parallelism, paraphrase, and tautologies),
and “heightened vividness” (close-ups, lamination of performance, visual frames,
and narration) (cf. Longacre 1996). First, after the rock music sets in, the initial
breakout from the routine occurs as the medial and proximal close-ups of
“Ayaya crunching a purittsu stick,” co-occurring with the third and the fourth
beat in Line 12 (Fig. 3 and Fig. 1 (c, d)). Following this, the image of “a box of
cod roe flavor” (Fig. 1 (e)) is overlaid on the performance, starting on the
seventh (i.e., the last) beat in Line 12 and lasting through the second beat in Line
13 (Fig. 3). Then it is immediately followed by another image of “four boxes of
different flavors” (Fig. 1 (f)), running from the third (i.e., the last) beat in Line
13 through the second beat in Line 15. Then the screen frame suddenly switches
to a medial close-up of Ayaya on the third beat in Line 15 (see also Fig. 1 (g)),
running on to the end of the commercial. Notice that the pattern of the switch of
shots is not random. The switches do not occur at the verse boundaries but do
occur in a staggered fashion, exactly on the same n-th beat on each line. For instance,
close-ups of Ayaya appear on the third beat in Lines 12 and 15, while boxes
appear on the last beat in Lines 12 and 13, and vanish after the second beat in
Lines 13 and 15, maintaining the multiple structural parallelism across the lines.

Verse  Line  Beat (mora)      Fig.1  Scene              Narration/sound
       12    (1)
…            (2)
             (3) 3rd          (c)    Close-up (Md)*     (crunch)
4            (4)              (d)    Close-up (Pr)      (crunch)
             (5)                                        men-
             (6)                                        taiko
             (7) Final        (e)    1 Purittsu box     mo ..
             (DosuKÓi)                                  DEtaa!
       13    (1)
             (2) 2nd                                    purittsu
             (3) Final        (f)    4 Purittsu boxes   TAbe-
             (breath)                                   ma-shoo.
       14    (1)
             (2)                                        SOo
5            (3)                                        shi-ma-
             (breath)                                   shoo.
       15    (1)
             (2) 2nd
             (3) 3rd          (g)    Close-up (Md)      GU..
             (4)                                        RI..      Rock music
             (5)                                        KO        fades out.
             (6)
             (7)

Figure 3. Focused poetic schema of Verses 4 and 5.
* “Md” and “Pr” in the Scene column represent “medial” and “proximal” respectively.

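
The claim that each type of shot switch falls on the same n-th beat can be checked mechanically. The following Python sketch transcribes the switch positions named in the paragraph above and tests whether every event type keeps a single beat position. The table `SWITCHES`, the line lengths in `BEATS_IN_LINE`, and the function names are hypothetical constructs introduced for illustration; the convention of labeling a line-final beat as "last" is likewise an assumption of this sketch.

```python
# Illustrative sketch only: the event table is transcribed from the
# description above; all names are hypothetical.
BEATS_IN_LINE = {12: 7, 13: 3, 14: 3, 15: 7}

# (event type, line, beat on which the switch falls)
SWITCHES = [
    ("close-up appears", 12, 3),
    ("close-up appears", 15, 3),
    ("box image appears", 12, 7),
    ("box image appears", 13, 3),
    ("box image ends", 13, 2),
    ("box image ends", 15, 2),
]


def position(line, beat):
    """Label a beat as 'last' if it closes its line, else by its number."""
    return "last" if beat == BEATS_IN_LINE[line] else beat


def positions_by_event(switches=SWITCHES):
    """Group the beat positions of all switches by event type."""
    grouped = {}
    for event, line, beat in switches:
        grouped.setdefault(event, set()).add(position(line, beat))
    return grouped


def is_parallel(switches=SWITCHES):
    """True if every event type always falls on the same beat position."""
    return all(len(p) == 1 for p in positions_by_event(switches).values())
```

With the positions as transcribed here, each event type maps onto exactly one beat position (third, last, and second, respectively), which is the sense in which the switches are staggered yet parallel.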
Not only does this visuo-rhythmic parallelism exist, but we also find
oral-rhythmic correspondences (Fig. 3, right-most column). These are the three
utterances that occur in Lines 12, 13, and 14. All of those utterances are
terse and simple, pronounced by a male voice, and occur in a staggered manner
so that they are terminated with the exclamatory cheer or the breath placed at the
boundaries. More interestingly, the utterance mentaiko mo DEtaa! ‘The cod roe
flavor is OUT!’ is given a slight pause before DEtaa! ‘out!,’ as if to wait for its
tonal peak to overlap with the exclamatory DosuKÓi!, both of which alliterate
on the plosive /d/.21 The next two utterances constitute clearer parallelism, both
semantically and syntactically. Purittsu TAbe mashoo! ‘Let’s EAT purittsu!’ is
paired with SOo shi mashoo! ‘Why not, let’s! (literally, “Let’s do SO!”),’
rhyming and roughly repeating the same meaning and construction. The
rhyming morpheme mashoo ‘let’s~’ contrapuntally falls on the breath of Lines
13 and 14. In addition, for these three utterances the tonal peaks tend to cluster
around the center, falling on the final beat in the first utterance (DEtaa), the
middle in the second (TAbe), and the initial in the third (SOo), rendering the
juncture point between Verses 4 and 5 most dense so that it amounts to the
highest tension. Furthermore, these utterances are inserted so as to roughly
correspond to different visual images of the boxes and distinct sentence
structures such that the “cod roe flavor box” (Fig. 1 (e)) appears in a structurally
different segment (Line 12: a 7-beat line), whereas the “four boxes” image
overarches semantically and syntactically equivalent sentences that rhyme with
each other (Lines 13 and 14: 3-beat lines).
Finally, although the next interpretation may sound far-fetched or accidental,
it is notable that all of the features mentioned above come in threes: (1) there are three
close-ups of Ayaya in Lines 12 and 15 (Fig. 3, “Scene” column), the first and the
last of which (“medial” close-ups) occur on the 3rd beat; (2) the three utterances
by a male voice about the product roughly match up with the images of the
boxes differentiated by equivalent meaning/structure and distinct rhyming
patterns; and (3) each male utterance consists of three smaller segments (e.g.,
“purittsu - TAbe - mashoo,” although the first utterance is an exception, presumably
aimed at alliteration) that correspond to the rhythm (including “breath”),
as does the announcement “GU..RI..KO.” This final announcement was made by
a female voice in a staccato manner with an ample pause (0.2 s) in between so as
to match the fading-out chants.
These observations show that this commercial heavily cultivates the
potential of a multimodal ethnopoetic narrative, which is largely devoid of
semantic and verbal content but rich in semiotic representations. In other words,
a narrative, whether “big” or “small” (Bamberg 2007; Georgakopoulou 2007),
may be achieved through non-verbal means, without referring to the temporal
sequentiality of experience or to the “complicating actions” in the scene—the
assumed mainstay of the narrative research. The important factor here is the
skeletal poetic/narrative configuration that emerges through the accumulation of
semiotic and evaluative layerings (Fig. 3).
Unfortunately, we cannot know for sure the extent to which the commercial
producers’ or creators’ intentions were incorporated in achieving the ethnopoetic
formation. Further, the odd-/even-number construction is essentially neutral as
to cultural values. What is certain, however, is that they opted, whether
consciously or unconsciously, to create the commercial the way it was. The
aesthetics encoded and shared therein must be an outcome of the repeated
practice, accumulated and sedimented by attending to the ongoing social
participation, with the creators and the audience included. Such actual and
virtual participations will generally be facilitated by following an interactionally
favored manner of conduct, as there is, say, a communally preferred and
naturalized length of TCU or an acceptable duration/amount of overlapping for
smoothly taking part in interaction. In other words, such preferred constructions
seem to encode and disseminate a greater appeal to the prospective
audience-consumers by covertly evoking shared cultural values, thus stimulating
the purchasing instinct toward the product. As we have seen so far, one such
preferred format (or “habitus”: Bourdieu 1990) of dissemination may be an
odd-number construction.22

Final remarks
We have so far reviewed basic principles and some recent developments in the
study of ethnopoetics, confirming the potential for expanding it to the study of
multimodal communication. Although we have largely focused on the
systematic and structure-abiding features of poetics, that does not mean that they
are always rigidly observed or stably utilized. Instead, they can be modified,
expanded, or even violated, even if their occurrence may be restricted, for
immediate manipulations or special rhetorical effects for the ongoing discourse.
Also, in the current climate of discourse analyses, what has generally drawn
attention is the emergent and ad-hoc achievement of interactional practice, and the
type of verbal and gestural semiosis we have seen here is often relegated to
narrow-minded determinism or pseudo-universalism, or labeled as regimenting
and stereotype-forming at best. I claim that, instead of presuming the existence of such a
formula, by focusing bottom-up on the naturalizing practice that engenders
it, we can elucidate culturally embedded practice that has been
accumulated and entrenched among the speakers of a language. Multimodal
ethnopoetics, I argue, will serve as an “emic” tool for revealing the naturalizing
process of cultural values and for examining the indigenous management of
language, the body, and the environment.

References
Alim, H. Samy. 2006. Roc the mic right: The language of Hip Hop Culture. London:
Routledge.
Bakhtin, Mikhail M. 1981. Forms of time and of the chronotope in the novel. In Holquist,
Michael (Ed.) The dialogic imagination: Four essays by M. M. Bakhtin. Austin and
London: University of Texas Press.
Bamberg, Michael 2007. Stories: Big or small—Why do we care? In M. Bamberg (ed.),
Narrative—state of the art, 165–174. Amsterdam: John Benjamins.
Basso, Keith H. 1990. Western Apache language and culture: Essays in linguistic
anthropology. Tucson, AZ: University of Arizona Press.
Bauman, Richard (ed.) 1992. Folklore, cultural performances, and popular entertainments.
Oxford, U.K.: Oxford University Press.
Beebe, Beatrice, Daniel Stern, & Joseph Jaffe 1979. The kinesic rhythm of mother-infant
interactions. In A. W. Siegman and S. Feldstein (eds.), Of speech and time: Temporal
speech patterns in interpersonal contexts, 23–34. Hillsdale, New Jersey: Erlbaum.
Blommaert, Jan 2006. Applied ethnopoetics. Narrative Inquiry 16(1), 181–190.
Bourdieu, Pierre 1977. Outline of a theory of practice. R. Nice (tr.). Cambridge, U.K.:
Cambridge University Press.
Bourdieu, Pierre 1990. The logic of practice. R. Nice (tr.). Palo Alto, CA: Stanford University
Press.
Briggs, Charles 1993. Personal sentiments and polyphonic voices in Warao women’s ritual
wailing. American Anthropologist 95(4): 929–957.
Briggs, Charles 1996. Conflict, language ideologies, and privileged arenas of discursive
authority in Warao dispute mediation. In Disorderly discourse: Narrative, conflict and
inequality, Charles Briggs (ed.), 204-42. Oxford, U.K.: Oxford University Press.
Bright, William 1984. American Indian linguistics and literature. Berlin: Mouton.
Bright, William 1990. ‘With One Lip, with Two Lips’: Parallelism in Nahuatl. Language
66(3), 437–452.
Brown, Penelope 1998. Conversational structure and language acquisition: The role of
repetition in Tzeltal adult and child speech. Journal of Linguistic Anthropology 8(2),
197–222.
Brown, Penelope 1999. Repetition. Journal of Linguistic Anthropology 9(2), 223–226.
Cassell, Justine, & David McNeill 1991. Gesture and the poetics of prose. Poetics Today 12,
375–404.
Chafe, Wallace L. 1980. The deployment of consciousness in the production of a narrative. In
W. Chafe (ed.), The pear stories: Cognitive, cultural, and linguistic aspects of narrative
production, 9–50. Norwood, NJ: Ablex.
Chafe, Wallace 1994. Discourse, consciousness, and time: The flow and displacement of
conscious experience in speaking and writing. Chicago: The University of Chicago Press.
Condon, William S. 1982 [1974]. Cultural microrhythms. In M. Davis (ed.), Interaction
rhythms, 53–77. New York: Human Sciences Press.
Daley, Michael 2007. Vocal performance and speech intonation: Bob Dylan’s ‘Like A Rolling
Stone.’ Oral Tradition 21(1), 84–98.
Du Bois, John W., S. Schuetze-Coburn, S. Cumming, & D. Paolino 1993. Outline of discourse
transcription. In J.A. Edwards & M.D. Lampert (eds.), Talking data: Transcription and
coding in discourse research 45–89. Hillsdale, NJ: Lawrence Erlbaum.
Duranti, Alessandro, Elinor Ochs, & Bambi B. Schieffelin (eds.) 2011. The handbook of
language socialization. New York: Wiley-Blackwell.
Erickson, Frederick 1982. Classroom discourse as improvisation: Relationships between
academic task structure and social participation structure in lessons. In Louise C.
Wilkinson (ed.), Communicating in the classroom, 155–81. New York: Academic Press.
Erickson, Frederick 2002. Some notes on the musicality of speech. In Deborah Tannen &
James E. Alatis (eds.), Linguistics, language, and the real world: Discourse and beyond,
11–35. Washington, D.C.: Georgetown University Press.
Fabb, Nigel 2002. Language and literary structure: The linguistic analysis of form in verse
and narrative. Cambridge, U.K.: Cambridge University Press.
Feld, Steven 1990. Sound and sentiment. Philadelphia, PA: University of Pennsylvania Press.
Ferrara, Kathleen 1994. Repetition as rejoinder in therapeutic discourse: Echoing and
mirroring. In B. Johnstone (ed.), Repetition in discourse, vol. 2, 66–83. Norwood, NJ:
Ablex.
Fox, James (ed.) 1988. Introduction. To speak in pairs: Essays on the ritual languages of
Eastern Indonesia, 1–28. Cambridge, U.K.: Cambridge University Press.
Friedrich, Paul 2001. Lyric epiphany. Language in Society 30(2), 217–247.
Friedrich, Paul 2006. Maximizing ethnopoetics: Fine-tuning anthropological experience. In C.
Jordan & K. Tuite (eds.), Language, culture, and society, 217–247. Cambridge, U.K.:
Cambridge University Press.
Furuyama, Nobuhiro & Kazuki Sekine 2007. Forgetful or strategic? The mystery of the
systematic avoidance of reference in the cartoon story narrative. In S. Duncan, D.J.
Cassell, & L.E. Levy (eds.), Gesture and the dynamic dimension of language: Essays in
honor of David McNeill, 75–81.
Gee, James P. 1986. Units in the production of narrative discourse. Discourse Processes 9,
391–422.
Gee, James P. 1989. Two styles of narrative construction and their linguistic and educational
implications. Discourse Processes 12, 287–307.
Georgakopoulou, Alexandra 2007. Thinking big with small stories in narrative and identity
analysis. In M. Bamberg (ed.), Narrative—state of the art, 145–154. Amsterdam: John
Benjamins.
Gomes, Barbara A., & Laura Martin 1996. “I only listen to one person at a time”: Dissonance
and resonance in talk about talk. Language in Society 25, 205–236.
Goodwin, Charles 2007. Interactive footing. In Elizabeth Holt and Rebecca Clift (eds.),
Reporting talk, 16–46. Cambridge, U.K.: Cambridge University Press.
Hester, E. 1996. Narratives of young African American children. In A. Kamhi, K. Pollock, & J.
Harris (eds.), Communication development and disorders in African American children:
Research, assessment, and intervention, 227–246. Baltimore: Brookes.
Hill, Jane H. 1991. Weeping as a meta-signal in a Mexicano woman's narrative. Journal of
Folklore Research 27(1/2), 29–49.
Hill, Jane H. 1995. The voices of Don Gabriel: Responsibility and self in a modern Mexicano
narrative. In Dennis Tedlock & Bruce Mannheim (eds.), The dialogic emergence of culture,
97–147. Urbana-Champaign: University of Illinois Press.
Hopper, Robert, & Phillip Glenn 1994. Repetition and play in conversation. In B. Johnstone
(ed.), Perspectives on repetition Vol. 2, 29–40. Norwood, NJ: Ablex.
Howard, Kathryn M. 2009. Breaking in and spinning out: Repetition and decalibration in Thai
children's play genres. Language in Society 38(3), 339–363.
Hymes, Dell 1981. In vain I tried to tell you: Essays in Native American ethnopoetics.
Philadelphia: University of Pennsylvania Press.
Hymes, Dell 1994. Ethnopoetics, oral-formulaic theory, and editing texts. Oral Tradition 9(2),
330–370.
Hymes, Dell 1996. Ethnography, linguistics, narrative inequality. Bristol, PA: Taylor and
Francis Inc.
Hymes, Dell 2003. Now I know only so far: Essays in ethnopoetics. Lincoln, NE: University
of Nebraska Press.
Iwasaki, Shoichi 1993. The structure of the intonation unit in Japanese. J/K Linguistics 3,
39–53.
Jakobson, Roman 1960. Linguistics and poetics. In T. Sebeok (ed.), Style in language,
350–377. Cambridge, MA: MIT Press.
Jakobson, Roman 1966. Grammatical parallelism and its Russian facet. Language 42(2),
399–429.
Jefferson, Gail 1996. On the poetics of ordinary talk. Text and Performance Quarterly 16(1),
1–61.
Juzwik, Mary M. 2004. What rhetoric can contribute to an ethnopoetics of narrative
performance in teaching: The significance of parallelism in one teacher's narrative.
Linguistics and Education 15(4), 359–386.
Kataoka, Kuniyoshi 2009. A multi-modal ethnopoetic analysis (Part 1): Text, gesture, and
environment in Japanese spatial narrative. Language and Communication 29(4),
287–311.
Kataoka, Kuniyoshi 2010. A multi-modal ethnopoetic analysis (Part 2): Catchment, prosody,
and frames of reference in Japanese spatial narrative. Language and Communication
30(2), 69–89.
Kataoka, Kuniyoshi 2011. Verbal and non-verbal convergence on discursive assets of
Japanese speakers: An ethnopoetic analysis of repeated gestures by Japanese first-aid
instructors. Japanese Language and Literature 45(1), 227–253.
Kataoka, Kuniyoshi 2012. The “body poetics”: Repeated rhythm as a cultural asset for
Japanese life-saving instruction. Journal of Pragmatics.
Kendon, Adam 1967. Some functions of gaze direction in social interaction. Acta
Psychologica 32, 1–25.
Kendon, Adam 1980. Gesticulation and speech: Two aspects of the process of utterance. In
Mary Ritchie Key (ed.), The relationship of verbal and nonverbal communication,
207–227. The Hague: Mouton.
Kendon, Adam 2004. Gesture: Visible action as utterance. Cambridge, U.K.: Cambridge
University Press.
Kita, Sotaro, & Asli Özyürek 2003. What does cross-linguistic variation in semantic
coordination of speech and gesture reveal? Evidence for an interface representation of
spatial thinking and speaking. Journal of Memory and Language 48(1), 16–32.
Klima, Edward S., & Ursula Bellugi 1983. Poetry without sound. In Jerome Rothenberg and
Diane Rothenberg (eds.), Symposium of the whole, 291–302. Berkeley, CA: University
of California Press.
Labov, William 1972. Language in the inner city. Philadelphia: University of Pennsylvania
Press.
Labov, William, & Joshua Waletzky 1967. Narrative analysis. In J. Helm (ed.), Essays on the
verbal and visual arts, 12–44. Seattle, WA: University of Washington Press.
Lomax, Alan 1982. The cross-cultural variation of rhythmic style. In M. Davis (ed.),
Interaction rhythms: Periodicity in human behavior, 149–174. New York: Human
Sciences Press.
Longacre, Robert E. 1996. The grammar of discourse (2nd ed.). New York: Plenum Press.
McCabe, Allyssa, & Carole Peterson 1991. Developing narrative structure. Hillsdale, NJ:
Lawrence Erlbaum Associates, Inc.
McNeill, David 1992. Hand and mind. Chicago: The University of Chicago Press.
McNeill, David 2005. Gesture and thought. Chicago: University of Chicago Press.
McNeill, David, & Susan Duncan 2000. Growth points in thinking-for-speaking. In D.
McNeill (ed.), Language and gesture, 141–161. Cambridge, U.K.: Cambridge University
Press.
McNeill, David 2003. Pointing and morality in Chicago. In S. Kita (ed.), Pointing: Where
language, culture, and cognition meet, 293–306. Hillsdale, NJ: Erlbaum.
Messineo, Cristina 2004. Toba discourse as verbal art. Anthropological Linguistics 46(4),
450–479.
Miall, David S., & Ellen Dissanayake 2003. The poetics of babytalk. Human Nature 14(4),
337–364.
Michaels, Sarah 1981. Sharing time: Children's narrative styles and differential access to
literacy. Language in Society 10, 423–442.
Minami, Masahiko, & Allyssa McCabe 1991. Haiku as a discourse regulation device: A stanza
analysis of Japanese children's personal narratives. Language in Society 20, 577–599.
Minamoto, Ryouen 1992. Kata to Nihon Bunka ‘“Pattern/type/style/form” and Japanese
culture.’ In Minamoto Ryouen (ed.), Kata to Nihon Bunka ‘“Pattern/type/style/form” and
Japanese culture,’ 5–68. Tokyo: Sôbunsha.
Mukařovský, Jan 1964. Standard language and poetic language. In Paul Garvin (ed.), A Prague
school reader on esthetics, literary structure, and style, 17–30. Washington D.C.:
Georgetown University Press.
Norrick, Neal 2000. Conversational narrative: Storytelling in everyday talk. Amsterdam: John
Benjamins.
Ochs, Elinor, & Bambi Schieffelin 1984. Language acquisition and socialization: Three
developmental stories and their implications. In R. Shweder & R.A. LeVine (eds.), Culture
theory: Essays on mind, self, and emotion, 276–320. New York: Cambridge University
Press.
Parrill, Fey 2009. Dual viewpoint gestures. Gesture 9(3), 271–289.
Pozzer-Ardenghi, Lilian, & Wolff-Michael Roth 2008. Catchments, growth points, and the
iterability of signs in classroom communication. Semiotica 172(1/4), 389–409.
Purvis, Tristan M. 2009. Speech rhythm in Akan oral praise poetry. Text and Talk 29(2),
201–218.
Rieger, Caroline L. 2003. Repetitions as self-repair strategies in English and German
conversations. Journal of Pragmatics 35(1), 47–69.
Rothenberg, Jerome, and Diane Rothenberg 1983. Symposium of the whole: A range of
discourse toward an ethnopoetics. Berkeley, CA: University of California Press.
Rumsey, Alan 2007. Musical, poetic, and linguistic form in “Tom Yaya” sung narratives from
Papua New Guinea. Anthropological Linguistics 49(3/4), 235–282.
Sacks, Harvey, Emanuel Schegloff, & Gail Jefferson 1974. A simplest systematics for the
organization of turn-taking for conversation. Language 50, 696–736.
Sapir, Edward 1921. Language: An introduction to the study of speech. New York: Harcourt,
Brace and Co.
Schegloff, Emanuel 1997. Practices and actions: Boundary cases of other-initiated repair.
Discourse Processes 23(3), 499–547.
Schieffelin, Bambi B. 1990. The give and take of everyday life: Language socialization of
Kaluli children. New York: Cambridge University Press.
Schieffelin, Edward 1976. The sorrow of the lonely and the burning of the dancers. New
York: St Martin’s Press.
Scollon, Ronald, and Suzanne B. K. Scollon 1979. Linguistic convergence: An ethnography
of speaking at Fort Chipewyan, Alberta. New York: Academic Press.
Sherzer, Joel 1982. Poetic structuring of Kuna discourse: The line. Language in Society 11(3),
371–390.
Silverstein, Michael 1976. Shifters, linguistic categories, and cultural description. In K. Basso
and H. Selby (eds.), Meaning in anthropology, 11–55. Albuquerque, NM: University of
New Mexico Press.
Silverstein, Michael 1985. On the pragmatic ‘poetry’ of prose: Parallelism, repetition, and
cohesive structure in the time course of dyadic conversation. In D. Schiffrin (ed.),
Meaning, form, and use in context: Linguistic applications, 181–99. Washington, D.C.:
Georgetown University Press.
Silverstein, Michael 1987. Cognitive implications of a referential hierarchy. In M. Hickmann
(ed.), Social and functional approaches to language and thought, 125–164. Orlando, FL:
Academic Press.
Silverstein, Michael 1998. The improvisational performance of culture in realtime discursive
practice. In K. Sawyer (ed.), Creativity in performance, 265–312. Greenwich, CT: Ablex
Publishing Corp.
Silverstein, Michael, & Greg Urban (eds.) 1996. Natural histories of discourse. Chicago:
University of Chicago Press.
Stockwell, Peter 2002. Cognitive poetics: An introduction. London: Routledge.
Tannen, Deborah 1989. Talking voices. Cambridge, U.K.: Cambridge University Press.
Tedlock, Dennis 1977. Toward an oral poetics. New Literary History 8(3), 507–519.
Tedlock, Dennis 1983. The spoken word and the work of interpretation. Philadelphia:
University of Pennsylvania Press.
Thompson, Robert F. 1983. Nsibidi/action writing. In Jerome Rothenberg and Diane
Rothenberg (eds.), Symposium of the whole: A range of discourse toward an ethnopoetics,
285–290. Berkeley, CA: University of California Press.
Turner, Victor 1981. Social dramas and stories about them. In W.J.T. Mitchell (ed.), On
narrative, 137–164. Chicago: The University of Chicago Press.
Urban, Greg 1991. A discourse-centered approach to culture: Native South American myths
and rituals. Austin, TX: University of Texas Press.
Warriner, Doris S. 2010. Communicative competence revisited: An ethnopoetic analysis of
narrative performances of identity. In Francis M. Hult (ed.), Directions and prospects for
educational linguistics, 63–78. Heidelberg: Springer.
Webster, Anthony K. 2008. “To all the former cats and stomps of the Navajo Nation”:
Performance, the individual, and cultural poetic traditions. Language in Society 37(1),
61–89.
Wilce, James M. 2008. Scientizing Bangladeshi psychiatry: Parallelism, enregisterment, and
the cure for a magic complex. Language in Society 37(1), 91–114.
Woodbury, Anthony C. 1985. The functions of rhetorical structure: A study of Central Alaskan
Yupik Eskimo discourse. Language in Society 14, 153–190.

Websites
ASL poetry (by Clayton Valli): http://www.youtube.com/watch?v=GmhbuGZJyJA
“Tsupuri dance”: http://www.youtube.com/watch?v=e5I3-J-zpXE

Professional bio: Dr. Kuniyoshi Kataoka is Professor of English Linguistics in the Faculty of Arts and
Letters at Aichi University, Japan. He is particularly interested in the relationship between linguistic and
para-/meta-linguistic means of representation of poeticity in written and spoken discourse. His studies have
appeared in many journals and books, including Language in Society, Journal of Linguistic Anthropology,
Pragmatics, Journal of Pragmatics, Language & Communication, Discourse Constructions of Youth
Identities (edited by J.K. Androutsopoulos and A. Georgakopoulou, 2003), and Style Shifting in Japanese
(edited by K. Jones and T. Ono, 2008). He is currently an editorial board member of Pragmatics (IPrA),
Pragmatics & Beyond New Series (John Benjamins), and Language & Communication (Elsevier).

1
Semiotically speaking, a poetic structure is closely akin to an icon in the sense of invoking
in our mind images and diagrams. It is also similar to an index in that it projects contiguous
extensions through association, and if such association becomes highly conventional to the
extent that the original semantic content is largely diluted and hardly perceived, it can
legitimately serve as a symbol. In this sense, even a single word can be poetic (e.g., Sapir
1921: 228; Basso 1990) because it may be associated with the entire episode or event of
cultural significance.
2
It is no wonder, therefore, that most researchers in this field (e.g., Rothenberg &
Rothenberg 1983) maintain that the poetries in the world are equal and comparably valued
in each milieu, whether it is a peasant’s bantering folksong performance or a prime
minister’s inaugural speech in the cabinet.
3
However, “(t)he poetic function is not the sole function of verbal art but only its dominant,
determining function, whereas in all other verbal activities it acts as a subsidiary, accessory
constituent” (Jakobson 1960: 356).
4
In fact, Friedrich (2006) broadens the scope of ethnopoetics by connecting it to other
(sub)disciplines of ethnoscience (e.g., ethnotaxonomy, ethnomathematics, ethnophysics),
in that topological aspects of our knowledge may largely be shared (on an abstract level)
among the world populations. In this broad concept of poetics, any patterns, structures, and
formations emerging from struggles and cooperation, ebbs and flows, and methods and
processes observed among indigenous practices would concern (ethno)poetics.
5
Parallelism, broadly conceived, should underlie different but related notions such as
“intertextuality,” “polyphony,” “voices,” and “pastiche,” as well as figures of speech such as
“metaphor/metonymy,” “synecdoche,” and “allusion” because they implicitly refer to and
evoke in our minds an entity comparable to or associated with the original.
6
This approach, however, was soon critically evaluated by attending to paralinguistic,
intonational features. For example, Tedlock (1983) most articulately emphasized the
primacy of pause groupings (e.g., pitch, loudness, rhythm, silence) in oral performance in
the Zuni language. Bright (1984), however, later observed that the two approaches to
Karok myths mostly converged (90% of the time) in identifying the basic units and juncture
points in the myths, showing that both elements could be coordinately incorporated. In
addition, Woodbury (1985) convincingly developed a set of modular systems for dealing
with micro- and macro-level connections, proposing that different criteria for identifying
lines may facilitate a holistic account of the performance. Other studies attempt to
incorporate not only formal and morpho-syntactic features but also the historical
contingencies of ideological formations (Silverstein & Urban 1996; Webster 2008).
7
Gee (1986, 1989) argues that the differences in the types of information statuses, emotional
representations, and line-linking styles are reflections of the different oral cultures of the
speakers. Contrary to Hymes’ contention that American English speakers heavily rely on the
patterns of threes and fives, Gee found that both black and white girls organized their talks
based on four-line stanzas. (However, this difference is not conclusive. Hymes’ attention is
mostly on verses and stanzas, and Gee’s is on ‘reconstructed’ lines; hence, the two
generalizations derive from slightly different levels of analysis. Gee also notes that he
makes no claim about the number.)
8
As Iwasaki (1993) mentions, IUs may exhibit language-specific skewing in the length and
proportion of the preferred units (e.g., lexical, phrasal, or clausal) that make up an IU. He found
that Japanese conversation mainly consists of phrasal IUs, whereas American English
conversation tends to include more clausal IUs.
9
Chafe’s (1994) three types of intonation units—substantive, regulatory, and fragmentary
units—may all be relevant to demarcating lines. Regulatory units (which mostly coincide
with discourse markers) are seen here as mediators before and/or after the substantive one,
attuning and reconciling propositional contents that are not straightforwardly utterable in
terms of social, physical, and psychological constraints. Although Chafe excluded
fragmentary IUs (e.g., truncation, restart, trail-off) from some of his narrative analyses, they
could be retained in the same or separate lines for ethnopoetic analysis (see also Note 11).
10
I say “often” because what is called up-talk (currently a widespread practice among the
younger generations in many parts of the world) could indicate the termination of an IU.
11
We have seen some attempts to deal with disfluencies in ethnopoetic analysis. For example,
Gee (1986, 1989) worked on the spontaneous speech data from black and white girls’ oral
stories, assuming that when disfluencies are stripped off and fragmented pieces of speech are
formed into lines—usually reconstructed as clauses—they represent an underlying structure
of idea units (Chafe 1980).
12
(Ethno)poetic formations also concern prosodic and rhythmic features in other genres of
performance such as songs and dances: see analyses of Bob Dylan’s use of pitch in “Like a
Rolling Stone” (Daley 2007) and of rhyming techniques used by hiphop artists (Alim 2006).
13
Rhythm, previously viewed from physiological or behavioral aspects, has now come under
interactive scrutiny of body postures and gestural management in terms of “interactional
synchrony” (Condon 1982). It is considered to link people’s actions, provide a framework
for cooperative endeavor, and intersubjectively facilitate their expressive intentions,
although it may be subject to cultural variation (Lomax 1982).
14
However, McNeill’s narrative level may partially include what Labov calls “orientation.”
15
Gesticulation represents “patterns of movement that are enactive or depictive of the ideas
being expressed,” most notably by hands and arms, but “such expressions are concurrent
with, indeed they often somewhat precede, verbal expression” (Kendon 1980: 209).
16
Building on Silverstein’s (1985) previous analysis of the poetic formation of ideology,
McNeill (2003) further investigated the contingent pointing behaviors that occurred there.
His focus was on what he calls “Growth Points” (“an analytic unit combining imagery and
linguistic categorical content”; McNeill & Duncan 2000: 144), and it provided an initial
form of thought for the complex manipulations of pointing. Although McNeill neatly
showed that those pointings served such interpersonal functions as “evasion, probing, and
confession” (300), it would also be possible to reanalyze them in terms of emergent catchment
under construction.
17
For example, Kita and Özyürek (2003) argue that different gestural representations of
motion events are heavily influenced by specific lexicalization patterns of language.
18
Another oft-used rhythm is a 3-3-5 rhythm, the last set of which typically includes a blank
beat in the fourth, as in “ ( ) .”
19
“Guriko” originally comes from the term “glycogen.”
20
Hymes (1996: 158, 215; 2003: 219, 304-311) emphasizes the “pivot” function of the
double triads, engendering the pentad structure.
21
A tone unit may be identified in terms of the nuclei of vocal prominence. Although the
relationship between a tone unit and a gesture phrase is complex, a gesture phrase is widely
assumed to manifest the ‘idea unit’ that is linked to the tone unit (Kendon 2004).
22
The importance of kata ‘pattern/type/style/form’ is widely acknowledged in Japanese
society (see Minamoto 1992). In practice, an odd-number construction is a valued and
culturally preferred unit, and it permeates various aspects of Japanese art forms and writing
conventions.
