Terry Janzen
University of Manitoba (Canada)
Recent work has shown that ASL (American Sign Language) signers not only articulate the language in the space in front of and around them, but also interact with that space bodily, such that those interactions are frequently viewpointed. At a basic level, signers use their bodies to depict the actions of characters, either themselves or others, in narrative retelling. These viewpointed instances seem to reflect “embodied cognition”, in that our construal of reality is largely due to the nature of our bodies (Evans and Green, 2006), and “embodied language”, in that the symbols we use to communicate are “grounded in recurring patterns of bodily experience” (Gibbs, 2017: 450). But what about speakers of a spoken language such as English?
While we know that meaning and structure for any language, whether spo-
ken or signed, affect and are affected by the embodied mind (note that the
bulk of research on embodied language has been about spoken, not signed,
language), we can learn much about embodied cognition and viewpointed
space when spoken languages are treated as multimodal. Here, we compare
signed ASL and spoken, multimodal English discourse to examine whether
the two languages incorporate viewpointed space in similar or different
ways.
1. Introduction
Recent work has shown that American Sign Language (ASL) signers not only articulate the language in the space in front of and around them, but also interact with that space bodily, such that those interactions are frequently viewpointed
(Janzen, 2004, 2012, 2019; Janzen et al., forthcoming). At a basic level, signers
use their bodies to depict the actions of characters, either themselves or others,
https://doi.org/10.1075/lic.00020.jan | Published online: 4 July 2022
Languages in Contrast 22:2 (2022), pp. 227–258. ISSN 1387-6759 | E‑ISSN 1569-9897
© John Benjamins Publishing Company
1. Additional elements of a signed utterance may also be treated as gestural, for example the
use of gesture spaces, discussed below.
2. This is not to mean that all gestures are considered nonconventional. Enfield (2009, 2013)
discusses emblematic gestures (e.g., thumbs up meaning all good) as conventional symbols.
Embodied cognition 229
perspective-taking that is addressed in the present study. Also, for the purpose of
this study, I use the term “multimodal” primarily for discussion of, and examples
from, spoken English, even though in the ASL examples I discuss gesture spaces
that signers incorporate into their “enactments” (Ferrara and Johnston, 2014) of
depicted actions or dialogue. Ferrara and Johnston (2014: 197) define enactments
as partial demonstrations of behaviour, either linguistic (as constructed dialogue)
or non-linguistic (as constructed action). The premise is that spoken English
discourse is inherently multimodal, and utterance meaning cannot be consid-
ered complete without taking into account the gestures that speakers use. In the
English texts described below, gestures of the body are constant, contributing to
meaning at the levels of object description, relational elements in space, and sub-
jective and epistemic stance.
In this study, I compare signed ASL and spoken, multimodal English narrative
discourse to examine whether the two languages incorporate viewpointed space
in similar or different ways. To do this, I look at video-recorded spontaneous narratives in discourse that is fully contextualized and, in most cases, occurs in face-to-face settings intersubjectively between interlocutors. I explore the following questions: What can we learn about how ASL signers use viewpointed spaces?
Do English speakers use viewpointed spaces in similar ways to ASL signers, and
to the same extent? Do their gestures, including body stance and body orienta-
tion, reflect such viewpointing, both in depicting characters and in more abstract
ways? Are there differences that could be due to modality differences? And finally,
what aspects of these narrative sequences indicate the signers’ and speakers’ sub-
jective and epistemic stance-taking?
It follows, then, that language and language use reflect embodied cognition, thus we can talk about “embodied language”. Language structure reflects embodied cognition in many ways, not least in the numerous metaphors we use every day. If we say ‘I can’t get that image out of my head’, for example, the linguistic construction reflects the conceptual metaphor the mind is a container. We
understand the meaning because cognitively we have had countless experiences of
the bodily action of putting things in and taking things out of containers, and can
conceptualize abstract processes as mimicking these physical actions. The basis
for such cognitive processing may be proprioception (Altman, 2009).
For MacWhinney (2013), a critical component of embodied cognition is
perspective-taking: “Speakers and listeners use language as a way of working
through various perspectives and shifts in perspective grounded on the objects
and actions described by language. We can refer to these processes of active
embodiment as the perspective-taking system” (MacWhinney, 2013: 214; italics in
original). MacWhinney discusses four aspects (or “levels”, ibid.: 215) of perspective-taking: affordances (sensations we experience when interacting with individual objects), spatio-temporal reference frames, causal action chains, and
social roles. Cognitive development begins with the individual being able only to
understand their own point of view, but as they mature, they are able not only to
grasp that there are other points of view, but can conceptualize what these other
points of view might be. This is one of the hallmarks of intersubjectivity (e.g.,
Zlatev, 2017).
including movements of the head and torso, facial gestures, eye gaze, and so on.3
Janney (1999: 954) proposes that even beyond considering prosodic performances
as gestural, the “acts of speech” are gestures because they are performed with a
particular intention in mind; they show the addressee what we are trying to do
with words. Even further, Janney’s main tenet is that words as gestures
somewhat like voice and body gestures in face-to-face speech, operate iconically,
producing ‘likenesses’ or ‘pictures’ of speakers’ states of mind, feelings, and inten-
tions. The difference between verbal and nonverbal gestures, however, is that the
former are performed at a higher level of abstraction than the latter. Verbal ges-
tures are figurative gestures. (Janney, 1999: 956; italics in original)
The relationship between signed language and gesture has been a topic for discus-
sion since the beginning of signed language research in linguistics (for a summary
and discussion of issues, see Janzen, 2006). In cognitive linguistics the tendency
has been to take a continuity approach that sees no great divide between what
might be considered linguistic and what gestural (Wilcox, 2002, 2004; see also
Müller, 2018), but rather to examine synergies between the two. More recent
explorations in composite utterances (Enfield, 2009, 2013; Ferrara and Hodge,
2018; Janzen, 2017; Kendon, 2014) along with the claims in Janney (1999) suggest
that similar continuities may exist for both signed and spoken languages. There-
fore, in a comparative study such as the present one, a multimodality approach
must underlie the analysis.
Returning briefly to embodied cognition and embodied language, we see that
the principles of embodiment are at the body-brain-mind level without regard to
language modality, although it is of great interest to note that when multiple bod-
ily resources are co-opted for the production of utterances, semantic composites
result from a much greater array of “articulatory” participants than simply speech or signing as lexico-syntactic strings. It is not the case that signed languages are more “embodied” than spoken languages because articulation involves
hands, face, head, and torso; rather, we might suggest that embodied language
is expressed in lexico-syntactic structure along with a plethora of bodily gestures
(see also Hostetter and Alibali, 2008).
3. See, for example, Kendon (2004) Chapter 2 for a detailed look at bodily gestures beyond
just what the hands do.
ing an utterance with aspects of the “space topology”, including temporal location
or epistemic stance, which goes far beyond a simple spatial viewing arrangement.
Parrill (2012) discusses some of the difficulties both with the terms themselves and with how they are defined, citing Chafe (1976) and DeLancey (1981) as defining viewpoint in
terms of the conceptualizer’s perspective on a scene or event, suggesting that it is
not clear what “perspective” means. Parrill makes the important point, however,
that “conceptual viewpoint” (Parrill, 2012: 98), even if based on perceptual view-
ing arrangement, is a mental representation involving mental simulations that can
differ from what one actually sees. For example, in recalling an event we can take
either a “character viewpoint” as if actually participating in a scene or an “observer
viewpoint” which is a much more distanced point of view (see also McNeill, 1992).
In the analysis below, there are two aspects of viewpoint I focus on. One is the
way that the discourse participants portray their recalled interactions with other
people and objects. As they do, they appear to set up a mental scene within which reported actions and reported discourse take place. The physical orientation they
display in the present retelling of the story I will refer to as perspective-taking, in
the sense that they are simulating a mental representation of their physical orientation within the past event. As relevant, I will also use the terms character view and observer view when these two perspectives contribute important elements to the discussion. The second aspect of viewpoint has to do with stance-taking.
Stance is defined by Du Bois (2007: 163) as a social action taking place in dia-
logical interchanges, a “public act by a social actor, achieved dialogically through
overt communication means, of simultaneously evaluating objects, positioning
subjects (self and others), and aligning with other subjects, with respect to any
salient dimension of the sociocultural field”. Stance-taking takes the form of eval-
uation, assessment, and appraisal. Stance is inherently subjective, as expressions
of stance are seen as representations of the speaker’s or signer’s subjective “evalua-
tion of assertability” (Dancygier, 2012a: 74), coded directly in the expression used.
Stance can also be epistemic, the expression of the subject’s conceptualization of
reality (Langacker, 2009), a judgement, as in a reason to have a particular expecta-
tion (Verhagen, 2005), or a strength of their commitment to the assessment of an
item being evaluated (Bybee et al., 1994; Dancygier, 2012a), and involves intersub-
jective understanding of both the speaker’s/signer’s and hearer’s/viewer’s beliefs
and reasoning processes (Ferrari and Sweetser, 2012). What appears to be the case
frequently in the narrative passages analyzed below are multifaceted viewpoint
sequences that dynamically build viewpoints composed of perspective-taking on
the scenes themselves, overlaid with discourse-level stance-taking and, because
individual expressions of stance combine into larger and more complex expres-
sions of speaker and signer viewpoint, a kind of “stance stacking” (Dancygier,
2012a) results.
The data examined in this study are three narrative passages in ASL and four narrative passages in multimodal English. It is a qualitative study in the sense that elements of these narratives that reflect perspective-taking and stance are analysed but not quantified. The narratives are impromptu (with some qualification, discussed below) descriptions of events that have taken place. Table 1
lists the seven narratives, the language used, the total length of each in minutes
and seconds, and the context within which each is found. All of the narratives were chosen because they are casual, typical examples of contextualized storytelling, recount personal experiences, and are not scripted.
The three ASL narratives are all taken from a small conversational ASL corpus
recorded at the University of Manitoba in 2000. There is a total of 9 hours and
9 minutes of conversational data, with all participants being deaf L1 ASL signers.
Dislocated Shoulder is an excerpt from one conversation where the signer tells
about a shoulder injury sustained at a sports event, and the ordeal of going to
emergency and attempting to communicate with the medical staff there, with
descriptions of some interesting characters and interactions in the emergency
waiting room. In the Mouse Story, the signer recounts how her family spent a hol-
iday at a camp, where one evening her father was cooking dinner at an outdoor
grill and encountered a mouse crawling down the brick chimney toward the food
on the grill. The signer in Rock Climbing tells how she found herself struggling to
complete a difficult climb with a friend, fearing that one slip would be disastrous.
The English narratives are taken from various sources and, unlike the ASL narratives, were not collected as part of a corpus. Two of the excerpts are from televised events, and two are from internet vlogs. Earth Landing is a brief excerpt of
an interview with Canadian astronaut Chris Hadfield on CBC’s The Hour, where
Hadfield is being interviewed by host George Stroumboulopoulos, posted on
YouTube, October 30, 2013.4 In this excerpt, Hadfield describes his return from
the International Space Station, and the ordeal of a disconcerting return to grav-
ity upon landing. In Working with Alan, British actor Maggie Smith appears
in a British Film Institute production interview with Mark Lawson, posted on
YouTube, May 19, 2017. In the excerpt chosen, she tells the story of working on the production of A Private Function with writer Alan Bennett, who gave her less feedback on playing her role than she would have liked.5 Paranormal Sighting is a
brief narrative told by Georgia on her YouTube channel GeorgiaAnimated, posted
October 25, 2020.6 She is from Brisbane, Australia, but spent her early years in
England. The paranormal incident took place in England when she was about
eight or nine years old. Georgia and her mother were driving down a country
road when they encountered a woman under a bridge who may or may not have
really been there. Finally, Kinsey is from California and has a YouTube channel
called kinsey b, on which she posted Barista on November 24, 2020.7 She tells us
about one rude customer who could not be made happy, no matter what the Star-
bucks staff tried to do for her.
Because these last two narratives are YouTube vlogs, the speakers do not have an interlocutor present but are talking to their virtual audiences. The discourse is not
scripted. Whether the quality of viewpoint expression differs in this circumstance
is not known, and is not examined here. However, these narrative video record-
ings are taken at face value, as they appear to be as rich in the expression of both
perspective-taking and stance-taking as the other narratives being examined.
In what follows I give examples of how the speakers and signers in the seven
narratives use perspectivized spaces, express stance, and overlap these elements
in both ASL and multimodal English, noting in particular where similarities and
differences occur. Example figures are identified by language and the name of the
narrative, as in ASL: Dislocated Shoulder.
4. https://www.youtube.com/watch?v=pVQlJeESUgo
5. https://www.youtube.com/watch?v=detAlTL9sbM; Smith notes in this excerpt that Bennett
also directed the movie, although Malcolm Mowbray is credited with direction.
6. https://www.youtube.com/watch?v=O8zxLdkyvGM
7. https://www.youtube.com/watch?v=FWSpS9LPE0A; ‘worst starbucks customer experi-
ences | storytime’.
5. Perspectivized spaces
One important question to consider is whether speakers and signers present nar-
rative events in objective ways. A truly objective portrayal would necessitate the
non-involvement of the speaker in the storyline. But since the videos chosen for this study each have the speaker or signer as a character in the story (in six cases the story was about the speaker’s or signer’s own experience; in one case the signer was a bystander as a family member), an objective portrayal may not be a reasonable expectation.
A second question has to do with how signers and speakers manipulate, and
interact with, the gesture spaces surrounding them. I begin with some examples
of positioning referents, either objects or other persons, within this space, exam-
ining both the ASL and English narratives together.
Winston (1995: 92) describes “spatial mapping” in ASL as beginning “when the
signer evokes a space during the discourse. It can begin by the use of a point, a
pointing sign, or the pointing or locating of a sign at a place” (see also Earis and
Cormier, 2013, for British Sign Language). She goes on to list a number of ways
“pointing” can take place to establish a referent, including articulating a sign in
a designated spatial location, using indicating verbs (Dudis, 2004; Liddell, 2003)
which move toward or away from such a location, etc. In these narratives, introducing referents into a scene by pointing to a location is extremely rare, and when such points do occur, they most often seem not to be simple locating points but to carry larger discourse meaning. Means of locating other than pointing are discussed below in
Section 5.2.
The clearest example is in ASL: Dislocated Shoulder, where the signer
describes coming into the hospital and looking for Emergency. He stops at the
Information Centre, which he introduces into the discourse simply as in (1):
(1) ASL: Dislocated Shoulder
INFORMATION CENTRE INDEX.centre8,9
“the Information Centre (there)”
Figure 1. Indexical point to a central location identifying the hospital Information Centre
However, the point in this example appears to do more than just locate an
entity in the signer’s articulation space. Janzen (2019) analyses this distal point as not only mapping to a location distant from the signer, but also representing a conceptualized time in the distant past, reflecting a conceptual distance is
spatial distance metaphor (see also Janzen et al., forthcoming). If it were only
an index toward a distant location in topographical space, the height of the point
would not be accurate: the camp is not located at a higher elevation from the
signer (see Gärdenfors, 2004, along with earlier Gärdenfors references in his bib-
liography, for discussion of conceptual spaces).
A second fact contradicts the idea that a referent is located in the signer’s artic-
ulation space so that further references to that entity index the same location,
as described in Winston (1995; see also Engberg-Pedersen, 1993, for Danish Sign
Language). In other words, the discourse function of positioning a conceptualized
entity at a location in articulation space sets up the potential for referential cohe-
sion in the continuing discourse. In this case, however, once the location has been
set, the signer begins a narrative of an event that took place at that location, but
brings it to a central, proximal location, incorporating character viewpoint, and
never refers to the distally indexed location again.
A third example seems to be a hybrid case, where the signer uses a lexical sign meaning to spot something visually, or to “spy” something. Figure 3 shows the
beginning and final handshapes for the sign, which has a hand-internal change
from an [S] handshape (a closed fist) opening to an extended index finger. The palm is oriented downward and, significantly, the sign is a directional (or indicating) verb with a short path moving toward the spatial location of the referent seen.
The utterance is given in (3):
(3) ASL: Dislocated Shoulder
PRO.1 SPY.left HANDCUFFS.ON.PERSON WITH TWO POLICE PRO.3
“I spotted a handcuffed guy with two police officers.”
Figure 3. Beginning handshape (a) and final handshape (b) for the sign SPY
Where a present and ‘directly’ perceivable referent is the target of a pointing ges-
ture, it is relatively presupposable: its existence, as well as its location and other
salient characteristics, may be taken for granted in the speech context. You can
exploit presupposable features of the actual location of a co-present referent, thus
rendering the interpretability of your gesture dependent on those presupposed
features… A gesture that ‘points at’ such a presupposable entity simply inserts it,
and its relevant features, into the current universe of discourse.
(Haviland, 2000: 19)
Figure 4 shows that her descriptive passage is presented using observer space, but
again, this is relatively rare in these seven narratives.10 The ASL examples above
10. While not examined in detail here, further study may show that the use of observer view-
point is topic specific. In the ASL: Rock Climbing narrative, much of the discourse is taken up
with describing the setting of the rock face, nearby waterfall, and pool of water far below where
the climber finds herself, with an observer view taken by the signer exemplified by the use of
“classifier” constructions, among other things. Without question, these observations demand
further study.
because such referent establishing takes place in the present narrative space, and
not as part of the past event. This is taken up in Section 7 below.
For such establishing gestures and indexical points in both ASL and English,
there may be genre and topic effects, in that only certain kinds of discourse neces-
sitate their usage. Discourse type matches and frequencies would be of interest to
examine, but this is left for further research. Once again, these occur only rarely
in the present study. Instead, the most prolific use of space concerns how signers
and speakers interact with referents in gesture spaces.
Figure 5. Character viewpoint/reported action in the three ASL signers’ narratives: (a)
Dislocated Shoulder, (b) Rock Climbing, and (c) Mouse Story
Figure 6, corresponding to the examples in (6), shows that for multimodal English
speakers, body positioning indicates a very similar perspectivized constructed
action/constructed speech orientation.
(6) a. English: Working with Alan
I would look at him, hopefully for, you know, notes.
b. English: Earth Landing
…and I’m calling my wife and going, how did the Leafs do?
c. English: Paranormal Sighting
I saw her, she was standing here…
d. English: Barista
Let me just go and grab my co-worker…
Figure 6. Perspectivized first person constructed action in (a) Working with Alan,11 (b)
Earth Landing, and (c) Paranormal Sighting, and constructed speech in (d) Barista
11. Smith’s interviewer is sitting to her right; in this constructed action, her conceptualized
face-to-face interaction with Alan Bennett is oriented just left of centre.
All of this might well be expected, because by default, what else would we expect
these signers and speakers to do other than to represent their past selves from
their own perspective? But how do they represent other persons? This is taken up next in Section 5.2.1.
In (7b) and (7e), with constructed dialogue, the character is named lexically,
while in (7d) the character was named a number of utterances prior to this one,
followed by a lengthy portrayal of her dialogue in which this particular utterance
occurred. The point is that these storytellers switch between story characters by
12. This is a rather difficult utterance to transcribe accurately. The narrator’s father was stand-
ing at an outdoor brick grill, chopping food and cooking, while the family sat off to the side,
engaged in animated ASL conversation. The hand position in 7a is held from the preceding
action of cooking, simultaneous with gesturally looking left toward the family and nodding
along, but in this sequence no lexical signs are apparent.
Figure 7. Third person character perspective (a) reported action in ASL: Mouse Story,
(b) reported dialogue in ASL: Rock Climbing, (c) reported action in English: Working
with Alan, (d) reported dialog in English: Barista, and (e) English: Paranormal Sighting
way of mentally rotating the gesture spaces, and give sufficient information for
the addressee to identify which character’s perspective is being portrayed without
physically moving into different spaces (which would be another overt means of
identifying a perspective shift). Once this has been established, however, it takes
less lexical material to evoke the characters, which may prompt an increase in the
reliance on physical stance gestures.
Given the definition of stance outlined above in Section 3 from Du Bois (2007)
as a “public” act that evaluates, assesses, or appraises states of affairs, positions or
aligns the speaker (or others) with others, we can look for multimodal evidence
for stance-taking in these narrative texts. Stance-taking is by its very nature sub-
jective (see also Scheibman, 2002), and directly coded in expressions (Dancygier,
2012a). Lexical expressions of stance have been studied (Dancygier, 2012a, is a
good overview), but there has been less work on gestures as contributing to
stance-taking in multimodal language, although this work has begun (Dancygier
et al., 2019). The phrase ‘I think’ very often indicates assessment or appraisal; words like ‘probably’ are evaluative; and adverbs, adjectives, and modals are all considered stance expressions. At its very heart, stance is dialogic, a “unit of
social action” (Du Bois, 2007: 173), because signers and speakers present a stance
in an effort to impact the addressee, often to persuade them to “see it like they
do” – alignment is frequently a social goal (see Croft, 2000, on jointly held con-
strual). Examples of lexical stance-marking are abundant in these narratives as
in Figure 8 for ASL and the utterances in (8). In (8c and d) the stance marker is
bolded.
(8) a. ASL: Mouse Story
EVERY.YEAR PRO.1 GO+ PRO.1 ALWAYS LOOK.AT(left) PRO.1 SICK
THAT PLACE
“Every time I go there I always see that place and feel sick!”
b. ASL: Dislocated Shoulder
Right hand: O-H F-U-C-K
Left hand: CL:F(stare at)
“As I stared at him I was thinking, oh fuck!”
c. English: Paranormal Sighting
This is basically the only time that I think I saw something.
d. English: Earth Landing
…and make him go lie down so that he stops metabolizing it so fast,
because this might kill him.
Figure 8. The sign SICK in ASL: Mouse Story in (a); fingerspelling O-H F-U-C-K (rh) in
ASL: Dislocated Shoulder in (b)
In each of these cases, the stance marker is a comment on some aspect of a conceptualized event or state of affairs, and is probably not something that actually took place or was said at the time of the event. In (8b), we don’t know whether this is what was truly thought or said at the time or a current appraisal of what the signer felt, but it may be considered a kind of quotation of what was uttered, overtly or mentally, by the signer as past self (cf. Clark and Gerrig, 1990; see also Sams, 2010). Nonetheless, all of these stance markers are for the benefit of the addressee more than they are a part of the story.
Stance-taking can be expressed gesturally as well, as the examples in Figure 9
and Example (9) show.
(9) a. ASL: Dislocated Shoulder
PRO.1 WAIT++ PRO.1
“I waited and waited.”
b. English: Working with Alan
It was very difficult to get him, you’d say please tell me.
c. English: Barista
(with a quiet, breathy voice, slightly higher pitch) ‘I don’t know why, I just –
I really hate it when people are like (gestures aggressively13).
In each of these examples, the gestures – facial gesture in Figure 9a, claw-hands
gesture in (9b), and voice quality in (9c) – add specific meaning at the level of
stance-taking. Each conveys something to the addressee about the signer’s or
13. This gesture, which is distinct from, and functions differently than, the vocal quality addressed here, is discussed below with regard to stance-stacking and dual viewpointing.
Figure 9. Exasperated facial gesture in (a) ASL: Dislocated Shoulder, and claw-hands
gesture coincidental with ‘please’ in Working with Alan in (b)
speaker’s attitude or assessment of the situation, inviting the addressee into a joint
experience. In Figure 9a, the facial gesture adds the sense of exasperation to the
situation of having to wait that is not conveyed by the sign or manner of signing
itself. In (9b) the spoken word ‘please’ is lengthened, which evokes pleading, but the double claw-hands (Figure 9b) additionally suggest a more intense negative
sense, perhaps tension or frustration. And in (9c) voice quality and pitch differ-
ence portray the speaker as helpless, the victim in this situation.
Bucholtz and Hall (2016) note that the voice is grounded in the body, and that Ohala (1994) has suggested that a high-pitched voice indicates smallness, femaleness, and nondominance. Voice quality, it turns out, may be a productive area of
study in terms of embodiment, intersubjectivity and stance, as Podesva (2013), for
example, discusses the use of falsetto voice as enacting a power stance in African
American women.14
Dancygier (2012a) was the first to point out that overall stance effects emerge when individual stance markers sprinkled throughout a stretch of discourse come together to form a composite stance. Dancygier refers to this
as “stance-stacking”. While this concept is just beginning to be explored, here it
seems evident that in these examples (which are in fact emblematic of numer-
ous instances in the multimodal discourse in all seven narratives), the realization
of stance can only be accomplished if multiple stance markers are considered
14. One reviewer insightfully pointed out that stance-taking aspects of voice characteristics
may have an analogy in the role that body posture in signed languages may play in terms of
stance. Details regarding this may become more apparent in future work.
together. In these cases, the stance markers are of various types: lexical, vocal, ges-
tures of the hands and body, even references to gesture spaces. Space does not
permit detailed discussion of this effect, but as the examples in (9) show, stance-
taking on the part of the signer or speaker cannot be understood without con-
sidering numerous stance-related elements over a series of constructions. As for
the “aggressive gesture” in (9c) for example, part of which is shown in Figure 10,
the speaker re-enacts the aggressive actions of the woman she is describing, but
in fact, it is not a straightforward re-enactment. In the utterance up to this point
she has used a voice quality that exemplifies a helpless victim of someone’s atti-
tude, and the enactment is her own representation of what happened, filtered
through her own more recent evaluation of what had taken place and the impact
she wishes to make now. It is very much an “in your face” representation, and
the speaker moves close to her video camera, as Figure 10b especially shows, whereas she has said earlier in her narrative that the woman was well on the other side of the counter from her. Further, again in Figure 10b, she is smiling, which is not what she intends to be the woman’s facial gesture, but which belies her own incredulity or, perhaps, indicates the ludicrousness of the woman’s actions.
Therefore, whether or not this is under her conscious control, she achieves
through body partitioning (Dudis, 2004) a presentation of her construal of the
woman’s aggressive actions and her own (present) attitude toward her, by “stack-
ing” several indications of stance-taking.
Figure 10. Aggressive gesturing by leaning the body forward sharply and assuming a stern facial gesture; in (b) the speaker is also smiling
It thus appears that there are both simultaneous and sequential potentials for stance-stacking for both speakers and signers, as represented in a rudimentary way in Figure 11. This figure represents the fact that stance-stacking, whether simultaneous or sequential, is not an either/or phenomenon but may be some combination of both, suggesting that with some refinement of these ideas, we might be able to graph instances of stance-stacking in terms of combinations of these features to learn about tendencies among groups of speakers or signers.
7. Dual viewpointing
The final topic considered here, however briefly, is whether more than one viewpoint can be conveyed simultaneously. Parrill (2009: 271) looks at dual viewpoint gestures, which she suggests might occur if a speaker takes on "multiple spatial perspectives on a scene at the same time". She considers McNeill's (1992) examples of 1) a character's viewpoint combined with an observer's point of view of the character's trajectory through space, and 2) two different characters' viewpoints being combined, for example if a speaker points to his own body when the pointing is associated with one character, and the speaker's body represents that of another character. Dudis's (2004) descriptions of body partitioning would be applicable here. Sweetser (2013: 240) makes the illuminating point that the narrator has just one body and voice "to represent all the aspects of viewpoint which
8. Conclusion
The examples and the discussion above suggest that perspective-taking and stance-taking involve complex interactions among lexico-syntactic material and various gesture types in both a signed language such as ASL and a spoken language such as English, as also claimed by Quinto-Pozos et al. (this volume) for the same pair of languages and by Parisot and Saunders (this volume) for the French – Quebec Sign Language pair. It is critical to understand both as multimodal systems, but perhaps one scalar difference is that because a signed language is necessarily articulated fully within a visual channel, whereas a multimodal spoken language is dual-channelled, we might expect that language within the single visual channel exhibits elaborated visual constructions. This comparison was not examined specifically in this study, but it is worthy of analysis in future work (see the contributions of Quinto-Pozos et al. and Parisot and Saunders to this volume). Without considering the articulation differences between speech and signing, which may not be particularly significant once a multimodal approach is assumed, it is evident that similarities far outweigh differences in terms of how perspective and stance are presented. Differences may in fact be attributed to individuals' communicative repertoires and habits, as there is bound to be variation among individuals whether they are signers or speakers. However, one of the most significant similarities between the ASL and English examples here is the use of perspectivized gesture spaces. When both signed languages and spoken languages are considered as multimodal systems, at least in the narrative contexts explored here, the experiential basis of language as embodied is fully realized in the ways that speakers and signers evoke past interactions in past spaces and present them in their current discourse as spatially contextualized. It appears that addressees can understand more about the interactions and the things the speakers and signers construe as significant when references to gesture spaces form part of the retelling. For both language modalities, the result is that beyond simply describing scenes in these narratives, multimodal expression frequently contributes to subjective and epistemic stance-taking, and the impression of stance is at least sometimes assembled from multiple parts – stacked, in Dancygier's terms – at times simultaneously and at times sequentially, which tells us that the expression of stance is complex and composed of both conventional linguistic items and less conventionalized verbal and gestural pieces.
When the focus of study is not singular gestures or singular signs but rather the dynamicity of gestures and signs in action, the division between what might be gestural and what might be lexical diminishes. This dynamicity is addressed by Müller (2018: 16; italics added), who argues that the object of examination must be "comparing multimodal languages in use, that involve singular, recurrent, and emblematic gestures, [which] is different from comparing signing or speaking only with regard to singular gestures and under experimental conditions". Doing so results in the analyst being faced with the seemingly complicated imperfections of language usage, but as Bybee (2006) notes, a usage-based approach to understanding the structure and functions of language tells us much about users' experiences, their cognitive representations of language, and therefore the ways in which they understand language can be used. What we have seen in the present study is that whether spoken or signed, language is experienced as embodied via embodied cognition and, in a sense, comes full circle in its bodily expression.
ASL signs are in uppercase glosses. PRO.1 and PRO.3 are first and third person pronouns. POSS.1 is a first person singular possessive pronoun. Glosses of more than one word for a single sign have words separated by a period, e.g., WHATS.UP. Plus signs indicate repeated movement, e.g., GO+. Fingerspelled words are indicated by letters separated by dashes, e.g., S-N-O-W. 'rt' means positioned or moving rightward; 'lt' means positioned or moving leftward. 'rh' is right hand; 'lh' is left hand.
References
Janzen, T., Shaffer, B. and Leeson, L. 2017. Does Grammar Include Gesture? Evidence from
Two Signed Languages. Paper presented at the Fourteenth International Cognitive
Linguistics Conference (ICLC 14), Tartu, Estonia, 10–14 July 2017.
Janzen, T., Shaffer, B. and Leeson, L. 2019. The Embodiment of Stance in Narratives in Two
Signed Languages. Paper presented at the Fifteenth International Cognitive Linguistics
Conference (ICLC 15), Nishinomiya, Japan, 6–10 August 2019.
Janzen, T., Shaffer, B. and Leeson, L. forthcoming. What I Know Is Here; what I don’t Know Is
Somewhere Else: Deixis and Gesture Spaces in American Sign Language and Irish Sign
Language. In Signed Language and Gesture Research in Cognitive Linguistics, T. Janzen
and B. Shaffer (eds). Berlin: de Gruyter Mouton.
Johnson, M. 1987. The Body in the Mind: The Bodily Basis of Meaning, Imagination, and
Reason. Chicago: University of Chicago Press.
https://doi.org/10.7208/chicago/9780226177847.001.0001
Kendon, A. 2004. Gesture: Visible Action as Utterance. Cambridge: Cambridge University
Press. https://doi.org/10.1017/CBO9780511807572
Kendon, A. 2014. Semiotic Diversity in Utterance Production and the Concept of ‘Language’.
Philosophical Transactions of The Royal Society B 369: 20130293. 1–13.
https://doi.org/10.1098/rstb.2013.0293
Kövecses, Z. 2010. Metaphor: A Practical Introduction (2nd ed). New York: Oxford University
Press.
Lakoff, G. and Johnson, M. 1999. Philosophy in the Flesh: The Embodied Mind and its
Challenge to Western Thought. New York: Basic Books.
Langacker, R. W. 2009. Investigations in Cognitive Grammar. Berlin: Mouton de Gruyter.
https://doi.org/10.1515/9783110214369
Liddell, S. K. 2003. Grammar, Gesture, and Meaning in American Sign Language. Cambridge:
Cambridge University Press. https://doi.org/10.1017/CBO9780511615054
MacWhinney, B. 2013. The Emergence of Language from Embodiment. In The Emergence of
Language, B. MacWhinney (ed.), 213–256. Mahwah: Lawrence Erlbaum.
https://doi.org/10.4324/9781410602367-13
McNeill, D. 1992. Hand and Mind: What Gestures Reveal about Thought. Chicago: University
of Chicago Press.
Müller, C. 2018. Gesture and Sign: Cataclysmic Break or Dynamic Relations? Frontiers in
Psychology 9(1651): 1–20. https://doi.org/10.3389/fpsyg.2018.01651
Ohala, J. J. 1994. The Frequency Code Underlies the Sound-Symbolic Use of Voice Pitch. In
Sound Symbolism, L. Hinton, J. Nichols and J. J. Ohala (eds), 325–347. Cambridge:
Cambridge University Press.
Parrill, F. 2009. Dual Viewpoint Gestures. Gesture 9(3): 271–289.
https://doi.org/10.1075/gest.9.3.01par
Parrill, F. 2012. Interactions between Discourse Status and Viewpoint in Co-Speech Gesture. In
Cambridge Handbook of Cognitive Linguistics, B. Dancygier (ed.), 97–112. Cambridge:
Cambridge University Press. https://doi.org/10.1017/CBO9781139084727.008
Podesva, R. J. 2013. Gender and the Social Meaning of Non-Modal Phonation Types.
Proceedings of the Annual Meeting of the Berkeley Linguistics Society (Vol. 37),
C. Cathcart, I-H. Chen, G. Finley, S. Kang, C. S. Sandy and E. Stickles (eds), 427–448.
Available at https://escholarship.org/uc/bling_proceedings/37/37
Quinto-Pozos, D. and Parrill, F. 2015. Signers and Co-Speech Gesturers Adopt Similar
Strategies for Portraying Viewpoint in Narratives. Topics in Cognitive Science 7: 12–35.
https://doi.org/10.1111/tops.12120
Sams, J. 2010. Quoting the Unspoken: An Analysis of Quotations in Spoken Discourse. Journal
of Pragmatics 42: 3147–3160. https://doi.org/10.1016/j.pragma.2010.04.024
Scheibman, J. 2002. Point of View and Grammar: Structural Patterns of Subjectivity in
American Conversation. Amsterdam: John Benjamins. https://doi.org/10.1075/sidag.11
Sweetser, E. 2012. Introduction: Viewpoint and Perspective in Language and Gesture, from the
Ground down. In Viewpoint in Language: A Multimodal Perspective, B. Dancygier and
E. Sweetser (eds), 1–22. Cambridge: Cambridge University Press.
https://doi.org/10.1017/CBO9781139084727.002
Sweetser, E. 2013. Creativity across Modalities in Viewpoint Construction. In Language and
the Creative Mind, M. Borkent, B. Dancygier and J. Hinnell (eds), 239–254. Stanford:
CSLI Publications.
Sweetser, E. and Stec, K. 2016. Maintaining Multiple Viewpoints with Gaze. In Viewpoint and
the Fabric of Meaning: Form and Use of Viewpoint Tools Across Languages and Modalities,
B. Dancygier, V.-l. Lu and A. Verhagen (eds), 237–257. Berlin: de Gruyter Mouton.
https://doi.org/10.1515/9783110365467-011
Traugott, E. C. and Dasher, R. B. 2002. Regularity in Semantic Change. Cambridge: Cambridge
University Press.
Vandelanotte, L. 2017. Viewpoint. In Cambridge Handbook of Cognitive Linguistics,
B. Dancygier (ed.), 157–171. Cambridge: Cambridge University Press.
https://doi.org/10.1017/9781316339732.011
Verhagen, A. 2005. Constructions of Intersubjectivity: Discourse, Syntax, and Cognition.
Oxford: Oxford University Press.
Wilcox, P. P. 2000. Metaphor in American Sign Language. Washington: Gallaudet University
Press.
Wilcox, P. P. 2004. A Cognitive Key: Metonymic and Metaphorical Mappings in ASL. Cognitive
Linguistics 15(2): 197–222. https://doi.org/10.1515/cogl.2004.008
Wilcox, S. E. 2002. The Gesture-Language Interface: Evidence from Signed Languages. In
Progress in Sign Language Research: In Honor of Siegmund Prillwitz/Fortschritte in der
Gebärdensprachforschung: Festschrift für Siegmund Prillwitz, R. Schulmeister and
H. Reinitzer (eds), 63–81. Hamburg: Signum-Verlag.
Wilcox, S. E. 2004. Gesture and Language: Cross-Linguistic and Historical Data from Signed
Languages. Gesture 4(1): 43–73. https://doi.org/10.1075/gest.4.1.04wil
Winston, E. A. 1995. Spatial Mapping in Comparative Discourse Frames. In Language, Gesture,
and Space, K. Emmorey and J. S. Reilly (eds), 87–114. Hillsdale: Lawrence Erlbaum.
Zlatev, J. 2017. Embodied Intersubjectivity. In The Cambridge Handbook of Cognitive
Linguistics, B. Dancygier (ed.), 172–187. Cambridge: Cambridge University Press.
https://doi.org/10.1017/9781316339732.012
Terry Janzen
Department of Linguistics
University of Manitoba
R3T 5V5 Winnipeg, Manitoba
Canada
terry.janzen@umanitoba.ca
Publication history