Emotion Recognition in Comics: The Effect of Visual Morphemes in Visual Narrative Contexts
Abstract
In comics, the term “visual morphemes” refers to one type
of graphic structure that can be combined with other graphic
elements to generate diverse meanings. For instance, the visual
morpheme of whirlwind-shaped lines indicates confusion if it
is placed above a person’s head. Prior empirical research has
shown that such emotive visual morphemes do in fact help comic
readers recognize the emotions of comic characters. However,
there has been little evidence of the effect of emotive visual
morphemes when emotion recognition is required in narrative
contexts where multiple images are arranged to form a story, as
opposed to solitary images of character-morpheme dyads.
This study thus examined how emotive visual morphemes affect
the identification of characters’ emotions in narrative contexts
consisting of three image panels. Results showed that emotion
recognition was slower when the visual morphemes did not
correspond to the emotions of characters than when they
corresponded or were not provided at all. The findings
thus add to our understanding of visual morpheme processing by
providing empirical support for the emotive visual morpheme
effects in the visual narrative structure.
1. Introduction
obtained. The experimental work offers some important insights into the
processing of visual morphemes in visual narrative structures.
2. Method
2.1 Participants
Participants were 36 native speakers of Korean (Mage = 21.47, SDage = 0.17)
who had no difficulties in reading comic scripts on a screen. They were
recruited by flyers posted at Sookmyung Women’s University, Seoul and
were paid for participation. All participants signed a consent form to
participate in the experiment.
2.2 Stimuli
2.2.1 Comic scripts drawing
A total of 90 webtoon-style comic scripts were created by the first author,
with 30 for each of anger, surprise, and embarrassment target emotion
conditions. The stimuli were created using Adobe Photoshop and Clip
Studio and were saved as PNG image files with a size of 1200 by 3200
pixels. Each comic script consists of three panels arranged vertically. It
follows the canonical comic narrative structure of Establisher-Initiation-
Peak, where the Establisher (the first panel) describes the background of
the story, the Initiation (the second panel) is the beginning of an event,
and the Peak (the last panel) describes the core of the event (Cohn, 2013,
2014). All stimuli scripts are about the stories of Korean college students’
daily life, with a main character who is directly involved in the events and
who, as a result, experiences the target emotions. No visual morphemes
corresponding to the target emotions are included in the stimuli, so
that the target emotions must be inferred from the events of the stories (see
Figure 1 for example scripts). The main character was colored orange, while
other characters were colored light yellow. Other objects or backgrounds
were in black and white, and texts were in Korean.
Figure 1. Examples of 3-panel webtoon-style comic scripts for a) anger,
b) embarrassment, and c) surprise.
We computed hit rates for each comic script as a measure of how well
the participants recognized the intended emotion. One-sample t-tests
were performed to compare each hit rate to the chance level of 0.25 (in
a 4-alternative forced choice task). Twenty-five stimuli with hit rates
above chance were selected. Additionally, 28 stimuli were selected
from the remaining 65 stimuli on the basis of coherence in the participants’
classification. That is, even when the hit rate was lower than chance
(i.e., participants failed to recognize the intended emotion), a stimulus
was included if the rate of the answer chosen by most participants was
significantly higher than the chance level. The understanding ratings
for each of the selected 53 stimuli were computed and compared to the
midpoint of the scale (i.e., 4) using one-sample t-tests. A total of 10 stimuli
with an understanding rating significantly lower than 4 (i.e., not well-
comprehensible stories) were excluded, leaving 43 experimental stimuli:
22 for anger, 15 for surprise, and 6 for embarrassment target emotion
conditions.
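
The comparison of a hit rate against the chance level can be sketched as follows. This is only an illustration under stated assumptions: the number of raters and their responses are hypothetical, since the norming data are not reported in this section, and the function returns just the t statistic and degrees of freedom (a p-value would additionally require a t distribution, for example from a statistics package).

```python
from math import sqrt
from statistics import mean, stdev

CHANCE_LEVEL = 0.25  # chance in a 4-alternative forced choice task (from the text)


def one_sample_t(values, mu):
    """Return (t, df) for a one-sample t-test of `values` against `mu`."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("t is undefined when all observations are identical")
    return (mean(values) - mu) / (sd / sqrt(n)), n - 1


# Hypothetical data: 20 raters, 15 of whom chose the intended emotion
# (1 = intended emotion chosen, 0 = another option chosen).
responses = [1] * 15 + [0] * 5
hit_rate = mean(responses)
t_value, df = one_sample_t(responses, CHANCE_LEVEL)
print(f"hit rate = {hit_rate:.2f}, t({df}) = {t_value:.3f}")
```

The same function could be applied to the understanding ratings by passing the scale midpoint (4) as `mu`.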
2.2.3 Manipulation
The selected 43 stimuli were manipulated by inserting visual morphemes
above the main character’s head in the last panels. The visual morphemes
either corresponded (congruent condition) or did not correspond
(incongruent condition) to the target emotions. The original stimuli, which
had no visual morphemes for the target emotions, were also used for the
empty condition. For the congruent condition, three visual morphemes that
are conventionally used to express each target emotion were used (Cohn,
2018a; Cohn & Ehly, 2016; Ojha et al., 2021): (1) the shape of smoke clouds
for anger; (2) the shape of a spark for surprise; and (3) the shape of beads
of perspiration for embarrassment (see Figure 2). For the incongruent
condition, the visual morpheme of the heart shape, which is widely used
to represent happiness, was selected. See Figure 3 for examples of the three
visual morpheme conditions.
2.3 Procedure
Participants were tested individually in an experimental room at
Sookmyung Women’s University. They were seated at a comfortable
viewing distance from a screen which showed the stimuli. Written
instructions were presented on a screen, asking them to read a 3-panel
webtoon stimulus, to press the space key on a keyboard when they
understood the stimulus, and to indicate the emotion they thought the
main character, colored in orange, had experienced among anger, surprise,
embarrassment, happiness, and other emotions. The instructions further
asked them to rate how easy it was to understand the story in the stimulus
on a scale of 1 (not very understandable) to 7 (very understandable) and
how natural the drawing expressions of the stimulus were on a scale of 1
(very contrived) to 7 (very natural).
Task trials began with a fixation mark at the center of the screen for 300 ms,
followed by a blank screen for 100 ms, and then one stimulus comic script at
the center of the screen until the participant pressed the space key. Five response options
(Anger, Surprise, Embarrassment, Happiness, and Other emotions) were
then presented, and participants responded by pressing the corresponding
keys on a keyboard. After that, the 7-point understandability rating scale
and the 7-point drawing naturalness rating scale (with descriptions at
the end points) were displayed one by one until participants responded.
Responses were given by pressing the number keys on a keyboard.
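
The trial sequence described above can be written down as a simple data structure. The sketch below is not the experiment script itself; the event names and the mapping of the five emotion options to number keys are assumptions made for illustration, since the text does not say which keys were used for the emotion choice.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TrialEvent:
    name: str
    duration_ms: Optional[int]  # None = displayed until the participant responds
    valid_keys: Tuple[str, ...] = ()


EMOTION_OPTIONS = ("Anger", "Surprise", "Embarrassment", "Happiness", "Other emotions")
# Hypothetical key mapping: one number key per emotion option.
EMOTION_KEYS = tuple(str(i) for i in range(1, len(EMOTION_OPTIONS) + 1))
RATING_KEYS = tuple(str(i) for i in range(1, 8))  # number keys for the 7-point scales


def build_trial():
    """Return the ordered events of one trial as described in the Procedure."""
    return [
        TrialEvent("fixation", 300),
        TrialEvent("blank", 100),
        TrialEvent("comic_script", None, ("space",)),
        TrialEvent("emotion_choice", None, EMOTION_KEYS),
        TrialEvent("understandability_rating", None, RATING_KEYS),
        TrialEvent("naturalness_rating", None, RATING_KEYS),
    ]


trial = build_trial()
fixed_ms = sum(e.duration_ms for e in trial if e.duration_ms is not None)
print(f"{len(trial)} events, {fixed_ms} ms of fixed-duration display")
```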
A total of 65 stimuli, comprising 43 experimental and 22 filler comic
scripts, were presented in random order. To avoid the effects of repetition,
the 43 experimental stimuli were divided into three sets of 14, 14, and 15
comic scripts (each set consisting of 5 for surprise, 2 for embarrassment,
and 7 or 8 for anger conditions), and each set was assigned to each of the
congruent, incongruent, and empty visual morpheme conditions. Thus,
every experimental comic script occurred only once in the congruent,
incongruent, or empty condition. The set assignment to the visual
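
The division of the 43 experimental stimuli into three sets can be illustrated with a short sketch. The stimulus identifiers, the random seed, and the round-robin dealing are assumptions chosen to reproduce the set sizes given in the text; how sets were mapped to conditions across participants is not shown here, and the single mapping at the end is only one possible assignment.

```python
import random
from collections import Counter

# Number of experimental stimuli per target emotion (from the text).
COUNTS = {"anger": 22, "surprise": 15, "embarrassment": 6}
CONDITIONS = ("congruent", "incongruent", "empty")


def split_into_sets(counts, n_sets=3, seed=0):
    """Deal the stimuli of each emotion round-robin into n_sets sets."""
    rng = random.Random(seed)
    sets = [[] for _ in range(n_sets)]
    for emotion, n in counts.items():
        items = [f"{emotion}_{i:02d}" for i in range(1, n + 1)]  # hypothetical IDs
        rng.shuffle(items)
        for idx, item in enumerate(items):
            sets[idx % n_sets].append(item)
    return sets


stimulus_sets = split_into_sets(COUNTS)
# One possible assignment of sets to visual morpheme conditions.
assignment = dict(zip(CONDITIONS, stimulus_sets))

for condition, items in assignment.items():
    per_emotion = Counter(item.rsplit("_", 1)[0] for item in items)
    print(condition, len(items), dict(per_emotion))
```

Each stimulus appears in exactly one set, so a participant sees it in only one of the three conditions.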
3. Results
Figure 4. Mean emotion inference response times (RTs) for three visual
morpheme conditions. Error bars show standard errors. * p < 0.05; ** p < 0.001.
3.2 Understandability rating
The mean understandability ratings were calculated for correct emotion
inference responses, and they were analyzed in the same way as described
above. As shown in Figure 5a, the understandability ratings were higher
for the empty and congruent conditions than for the incongruent condition.
A by-subject ANOVA indeed showed a significant main effect of Visual
morpheme (F(2, 70) = 19.70, p < .001, ηp² = .360), and the follow-up
analyses (Bonferroni-corrected) revealed significantly higher ratings for
the empty and congruent conditions than for the incongruent condition
(empty vs. incongruent: t(35) = 4.893, p < .001; congruent vs.
incongruent: t(35) = 5.521, p < .001); there was no difference between
the empty and congruent conditions (t(35) = 0.164, p = 1.0).
Figure 5. a) Mean understandability ratings and b) mean naturalness
ratings for three visual morpheme conditions. Error bars show standard
errors. ** p < 0.001.
3.3 Drawing naturalness rating
The mean ratings for drawing naturalness of correct emotion inference
responses were analyzed in the same way as understandability ratings.
Similar to the understandability ratings, the ratings for drawing naturalness
were higher for the empty and congruent conditions than for the
incongruent condition (see Figure 5b), and there was indeed a significant
main effect of Visual morpheme (F(2, 70) = 25.10, p < .001, ηp² = .418);
follow-up analyses (Bonferroni-corrected) showed significantly higher
ratings for the empty and congruent conditions than for the incongruent
condition (empty vs. incongruent: t(35) = 5.070, p < .001; congruent vs.
incongruent: t(35) = 5.353, p < .001), with no difference between the empty
and congruent conditions (t(35) = 0.483, p = 1.0).
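
As a reading aid, the reported effect sizes can be recovered from the F statistics and their degrees of freedom, because partial eta squared equals F·df_effect / (F·df_effect + df_error). The sketch below also shows the common convention of capping a Bonferroni-adjusted p-value at 1, which would be consistent with the reported p = 1.0 values; that this convention was used here is an assumption, and the uncorrected p-value in the example is illustrative only.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def bonferroni(p_value, n_comparisons):
    """Bonferroni-adjusted p-value, capped at 1.0."""
    return min(1.0, p_value * n_comparisons)


# F values and degrees of freedom as reported in Sections 3.2 and 3.3.
eta_understandability = partial_eta_squared(19.70, 2, 70)
eta_naturalness = partial_eta_squared(25.10, 2, 70)
print(f"understandability: {eta_understandability:.3f}")
print(f"naturalness:       {eta_naturalness:.3f}")

# Illustrative uncorrected p-value (not taken from the paper), adjusted for
# the three pairwise comparisons among the visual morpheme conditions.
adjusted_p = bonferroni(0.87, 3)
print(f"adjusted p: {adjusted_p}")
```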
4. Discussion
overall narratives.
The visual morphemes, however, had no effect on emotion inference
accuracy, as shown by comparable accuracy across the congruent,
incongruent, and empty visual morpheme conditions. This demonstrates
that congruent morphemes failed to provide additional benefits for
identifying emotive meanings in the narrative contexts and that incongruent
morphemes alone were unable to override the semantic information
retrieved from the narratives. This result differs from that of Cohn et al.
(2016), who showed the visual morpheme effect on the interpretations of
the overall meaning of morpheme-face dyads, with the interpretations being
more consistent under matching than mismatching morpheme conditions.
This discrepancy can be explained by the distinct methodologies applied
in the two studies. While Cohn et al. (2016) used an open-ended task with
no right or wrong answers and measured the consistency of participants’
responses, we employed a 5-alternative forced choice task and measured
response correctness. The type of stimulus also differed: Cohn et al. (2016)
used independent 1-panel comic images, whereas 3-panel comic sequences
were employed in this study to provide the narrative contexts. These
methodological differences may have resulted in assessing distinct aspects
of visual morpheme comprehension processes, which in turn may have led
to the inconsistent findings.
Another interesting finding of the current study was that participants’
fluency with reading comics, which was measured by VLFI, did not
correlate with the accuracy or RTs of emotion recognition, or with any of
the ratings. The results are in accordance with the pattern observed in
earlier studies, including Cohn & Foulsham (2022), who found no correlation
between VLFI scores and comprehension ratings for morpheme-face
dyads, as well as Ojha et al. (2021), who showed no difference in judgment
on comic characters’ emotions between groups that were familiar versus
unfamiliar with comics. In contrast, others found correlative effects of
350 Hyorim Han & Jiyoun Choi
Acknowledgements A part of the results was presented at the Korean Society for
Cognitive & Biological Psychology Annual Conference 2022.
Funding This work was supported by Sookmyung Women's University Research Grants
1-2203-2020.
Declarations
Ethics Approval This study was approved by the Sookmyung Women's University
Institutional Review Board (approval number: SMWU-2104-HR-021-03).
Consent to Participate and Consent for Publication Written consent was obtained from
all participants.
References
Ojha, A., Forceville, C., & Indurkhya, B. (2021). An experimental study on the
effect of emotion lines in comics. Semiotica, 2021(243), 305–324.
https://doi.org/10.1515/sem-2019-0079
Peirce, J. W., & MacAskill, M. R. (2018). Building Experiments in PsychoPy.
London: Sage.
Peirce, J., & Open Science Tools Ltd. (15 April 2021). Standalone PsychoPy
2021.1.4 for 64bit Windows (using Python3.6) [Computer Software].
Retrieved from https://github.com/psychopy/psychopy/releases