Geise S and Baden C 2013 Putting The Ima

PUTTING THE IMAGE BACK INTO THE FRAME
MODELING THE LINKAGE BETWEEN VISUAL COMMUNICATION
AND FRAME PROCESSING THEORY

INTRODUCTION
The vivid debate around framing as an important “new paradigm” (Entman, 1993) and as a “bridging
model” of communication science (Reese, 2001) has led to an elaborate corpus of empirical and
theoretical research in the last twenty years. In this literature, framing has been defined as the process by which
some aspects of perceived reality are selectively emphasized to render them more salient in a
communicative context (Gamson & Modigliani, 1989; Gitlin, 1980). As a consequence, people’s
attempts to attribute meaning to perceived reality are guided to construct specific interpretations
that conform to one “central organizing idea” (Gamson & Modigliani, 1987, p. 143) rather than
other, equally viable interpretations. Framing thus describes, in principle, a rather general process
of meaning construction through the guided reduction of complexity.
Although there is nothing in common definitions of framing that suggests that a frame
needs to be composed of words (de Vreese, 2005), the vast majority of existing framing research
has focused on textual (or, more generally, linguistic) media messages and their structures,
functions, and effects. Research on other modalities of communication messages – notably, a rich
research tradition on visual communication – has long developed relatively independently from
mainstream framing research. Only recently has a surge in research on visual framing brought the
possibility back to attention that framing may occur also based on non-linguistic – specifically,
visual – information (Borah, 2009; Coleman & Banning, 2006; Messaris & Abraham, 2001;
Schwalbe, Silcock & Keith, 2008). Noting the potentials of framing theory as a framework for
explaining visual communication contents and effects, Coleman (2010) describes the visual
framing approach as one of the “life lines” of visual communication research.
The relatively late encounter of research in framing and visual communication is
lamentable for several reasons. First, given the progressive shift from a logo-centric to an icono-centric
(political) culture (Hofmann, 2009), continuing to think of frames as primarily linguistic
phenomena fails to capture the many important ways in which visuals contribute to conveying
meaning in communication. Considering the frequent, often strategic joint use of visual and
textual information in various communication contexts (notably, political campaigning; Coleman
& Banning, 2006; Wicks, 2007), an integrated theory of visual, linguistic, or multimodal framing is
needed. Second, while considerable work has hitherto investigated the mechanisms and logics of
both framing and the processing of visual information, the mutual reception of this specialized
knowledge in the respective other field has been quite limited: While many framing studies –
both content analytic and experimental – involve also attention to visual information, their
precision in treating the complex possibilities of meaning conveyed by visuals is rather limited;
likewise, most of the recent studies focusing on visual framing apply a rather general notion of
(usually strategic)1 framing and pay little attention to the interaction of mediating and moderating
processes that govern the emergence of the framing effect (Scheufele, 1999: 94).2 Third, despite
several noted differences, there are important parallels between the cognitive and affective
mediation processes discussed in the contexts of framing and visual information processing
(VIP). Most notably, both framing and VIP deal with the problem of identifying a unique
interpretation based on a reality that is principally open to a variety of interpretations. In both
framing and VIP, directing attention to selectively disregard available information enables the
reduction of complexity which is necessary for comprehension. Additionally, both framing and
VIP draw upon prior knowledge for decoding symbolic devices, and both face the challenge of
constructing coherent meaning by connecting a set of elements perceived as carriers of relevant
meaning.
In this paper, we aim to unfold a model that describes the process of visual framing based
on information processing theory from both framing research and visual communication. We
discuss the ways in which specific properties of a stimulus message lead to the guided selection of
information during perception; subsequently, we investigate the characteristic implications of the
respective perceptual processes for possible interactions with the perceiver’s pre-existing

1 These studies typically compare visual images and accompanying texts used to describe certain issues (e.g.,
Zillmann, Gibson & Sargent, 1999; Gibson & Zillmann, 2000; Scheufele, 1999; 2001; Griffin, 2004; Ballensiefen,
2009)
2 Some first studies investigating visual framing effects include Grabe and Bucy (2009), Detenber, Gotlieb, McLeod
and Malinkina (2007), Coleman and Banning (2006) and Wicks (2007).
semantic and relational knowledge in the process of identifying relevant contents, classifying and
decoding their likely meanings, and constructing possible interpretations. In our view, the
construction of meaning through visual information involves several modality-specific variations,
but is not categorically different from framing processes based on non-visual information.
Relevance
Coleman (2010: 233) characterizes visual framing as an “important new direction for theory
building and future research”. A review of the recent literature in the field of visual
communication, however, not only illustrates the rising scientific interest in visual framing and its
effects, but also reveals that the scholarly debate on visual framing remains highly fragmented
(Rodriguez & Dimitrova, 2011). From a theoretical perspective, thus, modeling the linkage between
visual communication, VIP and frame processing theory should provide a useful anchor to help
integrate findings from current research. Even more urgently, the existing empirical state of
research reveals the necessity of a more elaborate examination of visual framing theory: The current
lack of integrated theory building on the foundation of the visual framing process confronts
visual research with essential challenges regarding the methodological operationalization of its
key variables and the measurement of effects. Several noted methodological inadequacies
originate from important theoretical shortcomings, particularly with regard to the field’s strong
focus on post-receptive one-time measurements that neglect the process-related character of
visual framing (de Vreese, 2005; [2 references removed for the review process]).
This hitherto rather unsatisfactory degree of theoretical differentiation may be one reason
why the visual framing approach has – despite its rising relevance in the field of visual research –
not yet entered the mainstream of the mostly linguistic-centered framing discourse (Matthes,
2009; Tewksbury & Scheufele, 2009). This neglect should be of concern for both sides, since
visual framing and “classic” framing theory have much to gain from one another with
regard to explanatory power, theoretical insights, and methodological development.
With the analytic integration of visual communication, VIP and frame processing theory
proposed in this paper, we aim to contribute to the progress of both visual and classic
framing theory. Moreover, our modeling is designed to serve as a possible starting point for
theoretically informed empirical research: In our argument below, we will theoretically distinguish
several steps and key constructs in (visual) frame processing that can be translated into an
operationalization on a level of specific indicators and thus made fruitful for empirical research.
The modeling thus aims to present a theoretical foundation strengthening the study of visual
framing as an “important new direction” for empirical future research.
The argument proceeds as follows. Based on the available literature on
VIP, we first characterize the processes of perceiving, decoding, and interpreting visual messages.
Next, we integrate these processes into a more general process theory of frame processing,
highlighting modality-specific differences within an otherwise unified framework. For the sake of
argument, this presentation initially disregards the many overlaps and hybrid cases between both
domains (e.g., visual or multimodal framing, conventional pictorial symbols, the information
conveyed by visual properties of written texts). Third, we discuss the implications of this
synthetic view for framing and visual communication theory and research. The paper concludes
with a few suggestions for a future research agenda.
THEORY
1. Visual communication and visual information processing
In recent years, visual communication scholars have accumulated considerable knowledge on the
specific mode of operation of visual communication processes. Visual communication – which
we conceptualize here as all mediation and information exchange processes of semantic context
that utilize visual phenomena, materialized in the format of pictures (Mueller, 2003) – differs in
structure from linguistic communication modes (Carroll, 1982), and is perceived and processed
according to its own “logic” (Mueller, 2001, p. 22; Mueller, 2007; Kress & Leeuwen, 2010). From
a theoretical perspective, explanations for this special logic can be located on two different levels:
The first one focuses on the reception perspective of visual communication, where the characteristics
of visual communication regarding its perception and processing are considered. The second one
places the emphasis on visual communication as a special mode of communication, affecting its
manner of decoding and encoding. In visual perception, both aspects are intrinsically linked: The
specific structure of visual communication messages induces a specific mode of visual reception
and VIP. In the following, we will discuss how exactly the properties of visual information
structure the reception process throughout the successive steps of perception, decoding, and the
construction of coherent meaning.
One first characteristic of VIP that has received considerable attention in the fields of
visual perception, visual cognition and visual psycho-physics (Seymour, 1979; Kennedy, 1984;
Pinker, 1986; Schneider & Maasen, 1998; Elkins, 2003) concerns the so-called picture superiority
effect (Nelson, Reed & Walling, 1976; Paivio & Csapo, 1973; Childers & Houston, 1984;
Kobayashi, 1986): Due to their resemblance to sensory real-life experience, their associative
communication mode, their vividness and concreteness, images command higher (visual)
attention and stronger physiological activation in the perception process than textual messages.
This high salience of visual information is connected to the holistic mode in which visuals are
perceived (Nelson, Reed & Walling, 1976; Childers & Houston, 1984; Kobayashi, 1986; Kroeber-
Riel, 1993; Schneider & Maasen, 1998; Lachmann, 2002; Elkins, 2003). Visuals are received
through a “parallel processing system”, which enables the very rapid perception of rather large and
multifaceted information chunks (Schneider & Maasen, 1998). As eye-tracking analyses show
(Bucher & Schumacher, 2006; Duchowski, 2007; Hammoud, 2008; [reference removed]), visual
information cues are highly salient especially in early stages of visual perception processes
(Yantis, 2005). Linked to its ability to trigger superior attention and activation, the observation of
visual information also leads to a concise mental anchoring even if perception occurs only briefly
or superficially (Nelson, Reed & Walling, 1976; Paivio & Csapo, 1973; Childers & Houston, 1984;
Kobayashi, 1986): After a viewing time of only one to two seconds, recipients can recollect a picture of moderate
complexity; by comparison, only five to seven simple
words can be deciphered in the same time (Paivio & Csapo, 1973). Being easily, quickly and
deeply encoded into memory, pictures can also be recognized and remembered better than
textual messages in a post-receptive context. Through their implied similar-to-reality character,
pictures establish a sense of eye witnessing and thus are considered more trustworthy and
authentic (Berger, 1989). Moreover, due to their holistic and very rapid perception, there is
relatively little cognitive control over the information processing of visual cues. This also holds
true for the process of visual perception: People can direct focused visual attention (see below),
but they are usually unable to filter perception effectively (Wedel & Pieters, 2007; Lester, 2010;
Schneider & Maasen, 1998). Visuals thereby can affect knowledge, attitudes and behavioral
dispositions more thoroughly and, potentially, more resiliently than information that can only be
reconstructed on a conceptual-verbal basis (Paivio & Csapo, 1973; Nelson, 1979; Nelson &
Castano, 1984; Schneider & Maasen, 1998; Lester, 2010). In sum, the mental hard-wiring toward
visual perception as well as the perceptual characteristics of visual messages lead to their superior
salience in perception and an at least potentially much deeper impact upon memory and memory-based
operations after the perception (Lester, 2010; Kress & Leeuwen, 2006; Yantis, 2005).
Premise 1a: With their implied similar-to-reality character, visuals present information that is
processed quickly, intuitively, and holistically, gaining superior salience in perception and information
processing. Visuals thus hold the potential to render aspects of perceived reality more salient than non-visual
messages.
Another implication of the same properties of visual stimuli is that the visual communication
mode does not already possess a formalized logical structure that determines the hierarchical
order of perception. Meaning is derived from visual percepts in a holistic manner, taking
associative relations between pictorial elements on different hierarchical levels into account. As a
consequence, visual content is perceived in a way that requires (1) a high degree of (primarily
implicit) online-structuring during the reception process, (2) the immediate selection of relevant
elements for comprehension and, in many cases, (3) elaborated ex-post structuring activities to
uncover the transported meaning: To derive information from images, recipients need to “decide”
which aspects to attend to foveally, in which perception sequence, and which aspects to disregard
as irrelevant or only focus on peripherally (Bundesen & Habekost, 2008; Findlay & Gilchrist,
2003). Reducing a rich, unstructured perception of a visual stimulus to such a set of depicted
elements and relevant configurations of these elements, however, is no trivial task – and it often
fails (Gordon, 2004). People need to draw upon both cues presented by the picture itself, and
prior knowledge about which perceived aspects of the image can be interpreted as meaningful
elements or configurations: On the one hand, pictures adhere to an analogous, spatial-associative
logic, a “spatial grammar” (Paivio, 1991; Kosslyn, 1995) which suggests both which perceptions
constitute one distinct element, and how depicted elements may be organized hierarchically in
relation to one another. Together with the visual salience of selected aspects (Proulx, 2007; Yantis,
2002), properties of the perceived image itself can therefore suggest a preferred parsing and
sequencing of contained elements (Proulx, 2007; Barnard, Breeding & Cross, 1984). On the other
hand, knowledge or communicative intentions either brought to the processing task or activated
by initially perceived aspects guide attention to search for expected elements in the picture
(Yarbus, 1967): People can actively shift foveal and cognitive attention when perceiving visual
information (Bundesen & Habekost, 2008; Luck & Hollingworth, 2008; Snowden, Thompson,
& Troscianko, 2006). Stimulus-driven and endogenous perception control, however, cannot fully
determine the structuring of the visual percept into meaningful elements or the sequential order
of attention directed at different elements within a picture, since endogenous and exogenous
visual perception processes are highly interrelated (Bullier, 2001). While properties of the image
as well as conventional knowledge can suggest certain preferred ways of structuring the perceived
stimulus into interpretable elements and configurations, pictures are principally open to multiple
ways of structuring and sequencing the contained information.
Premise 1b: Visuals contain a range of elements, which are arranged spatially in a possibly hierarchical, but
non-sequential fashion. They therefore allow variable selections and orders of perception, which are partly structured
in a stimulus-driven manner by the composition of the image, but can be overridden in a goal-driven manner by active attention shifting.
The true-to-life character and loose internal structuring of pictorial messages also have important
consequences for the decoding of visual information. Following Kress and Leeuwen (2010),
visual information has a specific semiotic quality that codes and decodes information as concrete
representations instead of transforming it into abstract signs (Emmison & Smith, 2000; Rose,
2012; van Leeuwen & Jewitt, 2010).3 Linked to this, VIP uses comparable communication
principles to those that underlie the direct perception of phenomena in the real world, rather than
non-concrete sign-theoretical representations (Sachs-Hombach 2003; Schuermann, 2011). This is
also true for abstracted representations as long as these bear an iconic or indexical relation to the
object that these depict. Through this rather intuitive mode of assigning meaning to visual
perceptions, many ideas, intentions and positions amongst others can be conveyed much more
easily than by explicit linguistic description or through other conventionalized sign systems (Rose,
2012; Wiesing, 2006). Pictures are even capable of creating new, fictitious realities, which can
be subjectively interpreted although they have neither been experienced nor linguistically
described before (Kroeber-Riel, 1993). However, due to the ambiguity and openness of visual
information to different possible ways of structuring their perception, this intuitive
comprehension simultaneously allows quite a variety of concurrently possible interpretations:
Through the ability of perceived and interpreted visual elements to influence “active vision”
(Findlay & Gilchrist, 2003) and thus guide the further structuring of image content, a series of
complex interactions can arise that is highly dependent on both the order of perception and the
knowledge activated during decoding. Depending on which aspects of an image are first
perceived and identified as meaningful elements for decoding, a path-dependent process unfolds

3Obviously, visual messages can also encode information into abstracted symbolic representations, which require
code knowledge to be appropriately decoded. However, unlike other modalities, this is not a dominant principle.
While the dominant encoding principle of linguistic communication is symbolic – with some exceptions such as
onomatopoetic expressions in auditive messages and some very rare textual representations that visually resemble
what they represent – the dominant encoding principle of visual communication is direct instantiation.
that can lead to the selection of quite different information contained in a visual stimulus (Bouma
& Bouwhuis, 1982; Cantoni, Marinaro & Petrosino, 2002; Yarbus, 1967).
Unlike linguistic codes, moreover, even if the same elements are selected, visual
representations do not necessarily require a fixed one-to-one relationship between signifier and
signified (Wiesing, 1998; Rose, 2012). Besides signifying the precise object that is visually
represented, visuals can always also represent a certain concept4 that is represented through one
out of many possible instantiations, or even more abstracted meanings that are related to the
visual sign through a semiotic code. Hence, to detect possible meanings beyond the picture – not
as a concrete representation but as a pictorial sign – recipients need to draw upon their
knowledge of existing semiotic codes, prototypes and/or familiar visual experiences to identify
appropriate, context-, convention- and culture-dependent mappings of specific signifiers to
signified meaning (Goodman, 1997; Seymour, 1979). Consequently, the semantic content of
visuals is relatively open to various interpretations. Despite the availability of certain socio-semiotic
conventions, which can help reduce the ambiguity of certain visually represented
information, the meaning of elements in a picture necessarily arises from an interactive process
driven by the stimulus, its intuitive interpretation, and the use of applicable knowledge.
What meaning is attributed to the set of identified elements and configurations depends on a
construction process that is shaped by the situational, temporal, spatial, individual, social
and medial context (Mitchell, 1986; Hall, 1966): As a performative practice, seeing is a process
situated in individual, social and cultural contexts – an “activity of determining as well as being
determined as such, neither purely receptive nor purely constructionist” (Schuermann, 2011: 95).
Premise 2: Visual representations use comparable communication principles to those that underlie the direct
perception of phenomena in the real world and therefore often allow for intuitive interpretation; however, their

4 The same instantiation can, moreover, represent many different concepts varying in their level of abstraction and
perspective: For instance, a visual representation of a house may equally well represent the concepts “house”,
“building”, “home”, “my home”, or many more.
successful decoding depends on the structuring of perception and allows multiple constructions of meaning, as visual
signifiers cannot usually be matched with one unique signified in an unambiguous fashion.
As with the detection of meaningful elements in visual communication, the detection of
relevant relations between these elements also follows from an interaction of stimulus-driven
suggestions and goal-driven, knowledge-based constructions (Findlay & Gilchrist, 2003). The
characteristic spatial grammar of configurations in pictures not only partially expresses which
perceived aspects jointly constitute meaningful elements (hierarchical part-of relations), but also
implies specific semantic relations between these elements. Such relations can again be partly
processed intuitively based on their perceptual similarity to familiar situations from real life
experience (Watt, 1992). On top of this intuitive interpretation, associative knowledge also
contributes to filling the depicted relations between elements with possible meaning. Just like the
decoding of visually represented elements, the representation of semantic relations in a
picture does not follow a set of conventional rules, but derives from a path-dependent interaction
between stimulus-suggested intuitive decodings and a construction of further propositional links
from associative knowledge.
Premise 3: Visual representations arrange picture elements in specific configurations that can often be
interpreted intuitively; however, their interpretation depends on the structuring of perception and allows multiple
possible ways of perceiving meaningful relations between the different elements.
Picture perception and picture understanding are thus based on a complex system of mental processes
that influence each other (Bergstroem, 2008; Joyce & Cottrell, 2004; Sugimoto & Cottrell, 2001).
To determine the contents of a visual stimulus, people need to integrate two fundamentally
different approaches to picture perception: On the one hand, the surrogate function of pictures for
the perceptions of the real world operates based on both the recognition of an intrinsic similarity
(iconicity) between the signifier and the signified, and the realization that there is a difference
between the real object and a picture used instrumentally and purposefully in a communicative
context (Schwan, 2005; Posner & Schmauks, 1998). Visuals are intuitively identified as direct
representations of what they depict, even if there is no prior conceptual or code knowledge to
support this decoding. On the other hand, visuals as signs can refer to various kinds of
conceptual meaning that needs to be identified based on the recipient’s knowledge of
conventional codes. However, since there are many possible meanings that can be assigned to the
same visual perception, people need to decide which kind of interpretations of perceived
elements and relations are relevant and suitable for interpretation. In this process, stimulus-driven
perception and endogenous perception control, structurings based on both the
compositional/configurational setup of the stimulus and prior experience, and intuitive
interpretation as well as knowledge-based decoding interact: Each processing step produces the
necessary input for the subsequent stage, but at the same time, these subsequent steps can always
feed back into renewed deliberate perception, alternative structuring, different decodings or the
foregrounding of different perceived or inferred relations (Bullier, 2001). Picture understanding is
thus to be understood as an interactive, multi-step structured process of partial tasks, where the semantic
information provided by the image is the key to the activation of, or the comparison with,
cognitive concepts pre-existent in the recipient’s memory.
Premise 4: Visual stimuli are inherently ambiguous such that their interpretation necessarily requires a
recursive process of perception, structuring, decoding, relating and interpretation. The outcome of each processing
stage is evaluated based on its ability to inform subsequent stages, and can be revised if it is found unsatisfactory.
Due to people’s high processing fluency with regard to the visual perception of real-life-like
complex stimuli, many of these operations function rapidly and “quasi-automatically”. Perceived
aspects, identified elements, decoded meanings and inferred relations that are found
unproductive with regard to the construction of an interpretation are discounted or discarded,
often without noticeable mental effort (Schwan, 2005). In addition, good familiarity with
conventional semiotic codes and similar visual experiences, which is acquired via (media and
non-media) socialization (Ludes, 2001; Schwan, 2005), facilitates the fast and efficient identification
of implied non-“literal” meaning as well: While all perceivers are typically able to rapidly proceed
through the entire process of picture comprehension to identify the kind of real-life situation
instantiated and directly represented in an image, the detection of symbolic meaning requires
both knowledge and practice in the decoding of visual information. Visuals that correspond
closely to conventionalized cultural prototypes and familiar experiences are more easily and more
consensually decoded than weakly conventionalized or unfamiliar depictions (Ludes, 2001;
Seymour, 1979).
In order to decide which out of a range of possible interpretations of a visual message is
most suitable, finally, people need to reflect on the communicative purpose of the perceived image.
Chiefly, the comprehension of visual messages requires the re-construction of a potential
communicative intention (Schwan, 2005; Kress & Leeuwen, 2006), based on the assumption that
the message refers to some coherent meaning that has been deliberately encoded into the image:
“Beyond detection and attention lie the analysis and evaluation of purpose” (Messaris, 1994: 154).
Hence, to interpret a message, recipients search for one way of structuring and decoding a visual
message that contributes to the construction of coherent meaning which contains a recognizable
information value. This process operates on two levels. Following Weidenmann (1988; 1998),
recipients usually restrict themselves to identifying in which way the depicted meaning augments
information otherwise available in the same situation (“picture understanding of 1st order”):
They relate constructed interpretations of the image to their prior knowledge about the same
subject or situation, and determine the significance of the visual message against this background.
This process is chiefly aided by the associative activation of related knowledge during the process
of image perception and comprehension, which references potentially relevant contexts against
which the message may be interpreted. On the second level, people can furthermore reflect on the
communicative intention of the message and contextualize constructed meaning against the
possible motivations of the message sender; however, such a systematic analysis happens rather
rarely (“picture understanding of 2nd order”; Weidenmann, 1988, 1989). One explanation for the
relatively low propensity of recipients to question the communicative intentions of visuals can be
found in the rather realistic, information-rich quality of visual messages and the intuitive and
rapid processing of their contents: Unlike the processing of language, visuals can be processed
seemingly effortlessly, such that often many highly consequential choices in perception,
structuring and decoding are already achieved before discretionary, controlled cognitive processes
become effective (Weidenmann, 1988; 1989; Elkins, 2003): “Television is ‘easy’, print is
‘tough’” (Salomon, 1984: 647). While the range of possible interpretations of the same visual
stimulus is typically rather wide, recipients make sense of images in a way that settles quite rapidly
on a plausible interpretation, which is characterized by both high salience and relatively little
reflection about other possible meanings or a possible strategic-communicative intention. To the
degree that visuals succeed in guiding the process of meaning construction based on those
aspects rendered salient in an image, they therefore possess the potential to powerfully convey
persuasive messages (Messaris, 1994; 1992; 1997): “The special qualities of visuals – their
iconicity, their indexicality, and especially their syntactic implicitness – makes them very effective
tools for framing and articulating ideological messages” (Messaris & Abraham, 2001: 220).
Principles of the Visual Communication Process
All of these considerations are rich in consequences for the theoretical examination and empirical
analysis of visual communication, emphasizing three fundamental principles of visual
communication: First, with the focus on mediation and exchange, the holistic, interacting process
character of visual communication has to be considered. This is particularly true as findings in
perception and cognition psychology imply that a systematic analysis of visual communication
requires the integration of “upstream” (stimulus-driven) and interacting, but interdependent
(goal-driven) mechanisms of visual perception and information processing, which in themselves
have to be understood as dynamic, highly interrelated processes.
Second, and consequently, the interactivity of image and recipient has to be taken into account.
Pictures transport meaning: They are “carriers” and “mediators” of social constructions that
do not reflect reality itself but have to be understood as independent forms of symbolic
expression. The active input of the recipient into the process of meaning construction has to
be included in the analysis and simultaneously grounded in the relevant situative, temporal,
spatial, individual and social contexts (Charlton, 1997). In addition, the “double dynamics” of
visual communication as (1) a mode and (2) a perception process increase the complexity of this
interaction. For example, conventional visualization strategies are highly interlinked with the
culturally coded act of seeing – being shaped by it and shaping the perception process
simultaneously. This implies, third, a dynamic, multi-step character of visual communication processes,
which also needs to be differentiated in its temporal dimension: Visual perception, processing,
meaning construction and contextualisation do not occur synchronously, but in different
temporal phases, on different hierarchical cognitive levels (MacInnis & Price, 1987; Cantoni et al.
2002; Mendelson, 2004; Bundesen & Habekost, 2008). The “logic” of visual communication can
therefore only be fully comprehended from a process perspective that is holistic, dynamic,
interactive, and structured into multiple steps.
2. Visual information processing as a specific case of frame processing
Unlike VIP, framing theory and research do not derive from a focus on a special mode of
analyzed stimuli, but from a specific communicative function that frames perform for the
processing of information (van Gorp, 2007): Frames reduce the complexity of available
information by discriminating between relevant and irrelevant information based on a
comprehensible “central organizing idea” (Gamson & Modigliani, 1987: 143; Entman, 1993).
Given limited capacity and a communicative purpose, messages necessarily represent reality in a
purposefully selective, coherently interpretable fashion.
The process by which frames convey specific meaning shares a number of important
parallels with the process of VIP and its principles. From a meta-theoretical perspective, this is
not surprising as both approaches address central ideas of information processing in general
(Graber, 1988). Nonetheless, these parallels are highly important, especially to understand how
visual communication and framing effects interact in visual framing processes and potentially
reinforce each other’s inherent process steps. In this context, particularly those processes that
originate from visual perception and result in a superior salience of visual cognitions become
significant: Communication, perception and information processing of visuals do not change the
framing process in general, but they are especially suitable for triggering framing processes and
inducing framing effects (Coleman, 2010).
Similarly to the configurational and compositional attributes of images that direct attention,
frames render selected aspects of perceived reality more salient than others by employing a wide
variety of framing devices (D'Angelo, 2006; Gamson & Modigliani, 1989; Pan & Kosicki, 1993;
Reese, 2001; van Gorp, 2007). These devices encompass more or less anything that has the capacity
to attract attention, depending on the modality of the message considered, ranging from
textual emphasis and bold typefaces, via pitch and loudness in auditory messages, to
those devices known from the processing of visual information (Pan & Kosicki, 1993; van
Gorp, 2007). Once the complex range of available perceptions is structured into a limited set of
salient aspects, frame processing next requires that people decode the information revealed by
the highlighted elements (van Dijk & Kintsch, 1983; Entman, Matthes & Pellicano, 2009).
Similarly to VIP again, this process consists in the multi-step structured decoding of perceived
signals; next, the identified information serves to activate knowledge “already at the recipients’
disposal” (Nelson, Oxley, & Clawson, 1997: 225; see also [reference removed]): While frames, like
images, can convey some information that already takes propositional form (Holyoak & Thagard,
1995),5 messages rarely explicate all relations between the provided bits of information that are
needed to interpret the meaning of the frame (Weßler, 1999; [reference removed]). Accordingly, the
set of exogenous framing devices interacts with people’s endogenous prior knowledge to search
for possible connections between the communicated information (Price & Tewksbury, 1997;
Scheufele & Scheufele, 2012; van Atteveldt, Ruigrok, & Kleinnijenhuis, 2006). Based on the

5 Propositional form is required for semantic interpretation: One cannot believe “that X”, but only “that X relates to
Y (in a specified way)” (Holyoak & Thagard, 1995). Propositional information can be communicated, for instance, by
means of language (which uses grammar to express propositional relations) or simple configurations of visual
representations that can be interpreted without prior knowledge.
identified connections, people attempt to construct coherent meaning by reconstructing a central
organizing idea that integrates the set of information rendered salient (Gamson & Modigliani,
1987; van Dijk & Kintsch, 1983). From an analytic perspective, framing as a general process of
deriving coherent meaning out of complex signals involves more or less the same cognitive tasks
that are needed to derive coherent meaning from visual imagery: It follows a highly interrelated
sequence of selective, stimulus- and knowledge-guided attention, decoding based on prior
familiarity with similar signals, the inductive as well as deductive search for connections, and
finally the endogenous construction of coherent meaning (Gamson, 1992; van Dijk & Kintsch,
1983; van Gorp, 2007; [reference removed]). This process can be relatively straightforward to the
degree that attention is guided well, decoding succeeds due to unambiguous code knowledge, and
the message specifies the propositional relations needed to identify the common organizing idea
(Kintsch, 1998). However, at each stage, ambiguity can lead to a return to prior stages, using
previous attempts at decoding, connecting and integrating information to guide the renewed
perception and interpretation of the stimulus (Cook & Guéraud, 2005; Kintsch, 1998; Veling &
van der Weerd, 1999). Frame processing cannot be reduced to a simple transfer of medially
communicated (visual or linguistic) frames into the recipient’s cognitive system. Instead, framing
effects are the result of a dynamic interaction of media frames with the recipients’ pre-existing
cognitive concepts (Gamson, 1992; Scheufele, 2004; Price & Tewksbury, 1997; van
Gorp, 2007). While the main mechanisms underlying VIP can be easily expressed in terms of the
general framing process, most specifics of VIP concern the degree of ambiguity and resulting
nonlinear interactions between the different processing steps.
Following this general information processing sequence, we will now characterize the
important differentiations between stimuli of different modality, focusing on visual versus textual
information.
• Proposition 1 (Structuring Perception): Both frames and images contain a range of devices which render specific
aspects of perceived reality more salient than others.
Framing theory states surprisingly little about how frame messages structure perception. Those
contributions that discuss the nature of framing devices mostly mention lists of techniques that
can be used to communicate emphasis in a text, or to complement texts with illustrative visual
images (Pan & Kosicki, 1993; Tankard, 2001; Tewksbury & Scheufele, 2009; van Gorp, 2007).
Framing theory has thus already recognized the effect of picture superiority to boost the salience
of information if it is depicted visually, as opposed to textually (de Vreese, 2005; van Gorp, 2007;
2010). However, salience manipulations occur on somewhat different levels in linguistic and
visual information: To the degree that framing theory assumes textual frame messages, most
variations in the salience of aspects consist in their simple inclusion or omission in the textual or
verbal description. Linguistic descriptions are necessarily highly selective due to their highly
abstracted code (van Dijk & Kintsch, 1983; Kintsch, 1998). Unlike visual stimuli, which typically
contain rich detail unless deliberately omitted by abstraction, linguistic descriptions require the
explicit addition of detail where desired.6 Those salience manipulations achieved by emphasis
devices within a text mostly serve to fine-tune the salience within the set of frame-relevant
information, while most selectivity is already achieved by the description (Druckman, 2001; Pan
& Kosicki, 1993). By contrast, visual stimuli typically contain rich information, much of which is
peripheral to frame construction (premise 1a). While images, too, necessarily manipulate salience by
including or entirely omitting aspects of reality, considerable further selectivity is needed to arrive
at a limited set of relevant information. By contrast, textual messages mostly discriminate
between absent and present information, adding minor variations only to further increase
salience. As outlined above, visual information can generally achieve higher salience than textual
information, but it requires much higher discretion to select relevant from unrelated information.
The textual communication mode structures perception much more strongly than visuals, but where
visuals succeed in communicating relevant information in an associative entirety, the structural
hierarchy in VIP exerts strong influence on the construction of meaning.

6 If a frame requires, for instance, the concept “child”, it fully suffices to use the word “child” in a textual description
and omit all information about gender, age, hair color, or current mood. Visual depictions of the concept “child”, by
contrast, typically provide this information even if it is not required for constructing the frame.
The same is also true about the order of perception: While texts possess a conventional
form (words as units, separated by spaces, read from the top left, linewise, to the bottom right)
resulting in a “sequential processing” mode (Paivio, 1979: 33; van Dijk & Kintsch, 1983) of
orderly jumps (Leven, 1991), images in most cases do not prescribe a specific order of attending
to their contents (premise 1b). Obviously, images can also suggest specific perceptual hierarchies
and perceptual sequences (especially by physiological attraction cues such as size, color
contrast, or positional configuration) and mark the set of relevant units more or less clearly (e.g.,
by means of abstraction); however, these suggested selections and perceptual sequences are
much less binding and can easily be overridden by discretionarily directed, endogenous
attention-shifting without disrupting the intelligibility of the message.
▸ Differentiation 1a (Strong Structure/Weak Structure): Texts largely determine both the selection and the
(linear) perceptual sequence. They therefore exert strong and immediate influence on which aspects appear as
salient during information processing. Visuals can suggest a selection and (hierarchical) order of perception, but
allow multiple ways of structuring the presented contents. They require a discretionary identification of elements
that are then processed holistically.
▸ Differentiation 1b (Low Salience/High Salience): The salience attribution triggered by textual information
is much weaker than that triggered by visual information.
• Proposition 2 (Decoding): Both frames and images contain a set of signifiers that require the classification and
decoding of meaningful information units.
Once the raw perception has been structured into a sequence of discrete signals, the next
challenge is to decode the elements and derive the contained conceptual information. For textual
stimuli, again, the underlying perceptual task is fairly straightforward: Given basic knowledge of
grammatical structures and lexical expressions, most words can be decoded quickly and uniquely;
remaining polysemic expressions can easily be disambiguated based on their context (Cook &
Guéraud, 2005; Kintsch, 1998; Rayner, 1978; 1998; see below). If the required code knowledge –
the dictionary – is available, decoding textual frames is thus no major challenge, which explains
why this step, too, receives little attention in the (predominantly text-focused) framing literature.
If, by contrast, the required language code is unavailable, the symbolic representation of text
cannot be decoded (Price & Tewksbury, 1997; Slothuus & de Vreese, 2010): The message
remains unintelligible, and framing fails. However, this is not necessarily the case for visual
frames. Even if there is no prior knowledge that helps in recognizing and decoding visual percepts,
many visual stimuli can be interpreted intuitively or by means of analogical inference (Schnotz &
Bannert, 1999) based on more general world knowledge and experience. Visual percepts can
rarely be mapped uniquely onto specific concepts, because their interpretation depends on which
aspects of their representation are considered relevant.7 Which conceptual information is derived
from a visual percept depends strongly on a) the perceiver’s expectations and intentions, b)
selective active or passive attention mechanisms, and c) its context and relations to other percepts
(Premise 2).
▸ Differentiation 2 (Dictionary-based decoding/Constructive interpretation): The classification and decoding of
textual framing devices is usually guided by a direct matching of signifier and signified; their semantic content is
conventionalized. By contrast, the meaning of visual contents is not usually conventionally defined and requires
an interaction of perception-based construction and knowledge-based interpretation.
• Proposition 3 (Identification of relations): For an interpretation of both frames and images, meaningful relations
must be identified between the decoded elements. This identification initially draws upon the relations expressed in
the stimulus and completes these by searching the recipient’s associative knowledge.
While the identified concepts form the information base needed for (re-)construction of the
frame message’s meaning, it still remains unclear how the elements relate to one another (van
Atteveldt et al., 2006; van Dijk & Kintsch, 1983; [reference removed]). Hence, the next task facing
recipients of a frame message is to meaningfully connect the identified concepts and find out

7 A depiction of a crying Asian female child, for instance, may be interpreted as a child, a girl, an Asian, a victim in
the Vietnam war, an innocent person, an immature person, someone unhappy, and so forth – depending on which
of its perceptible features are considered.
how they relate to one another. This is again relatively straightforward for linguistic stimuli,
where the explicit use of language can specify a set of propositions connecting the raised
concepts: All concepts in a textual frame message are necessarily explicitly related to at least one
other concept in the message and provide a lot of information for the identification of
meaningful relations to begin with (van Dijk & Kintsch, 1983; Kintsch, 1998; van Atteveldt,
Kleinnijenhuis, & Ruigrok, 2008). By contrast, visual configurational and compositional
arrangements can specify which elements are related to one another,8 but do not expressly specify
the nature of each association (Premise 3). In both cases, the identification of meaningful relations
by far exceeds the number of explicitly stated or visually represented links ([reference removed]):
As recipients draw upon their knowledge about how things are usually related, association quickly delivers a
wide range of additional possible connections between the available concepts (Nelson et al., 1997;
[reference removed]).9 In the processing of textual frames, association usually proceeds in a rather
guided fashion, using the explicitly stated propositions as information for further elaboration (van
Dijk & Kintsch, 1983; Collins & Loftus, 1975; [reference removed]). In VIP, by contrast, each and
every perceived configuration may require interpretation and may lead to an (at least initially) much
wider range of relations considered, before a limited set of plausible propositions can be found.
As a consequence, although both textual frames and images depend to a significant degree on the
recipient’s relational knowledge to connect the information (Edy & Meirick, 2007; Shah, Kwak,
Schmierbach, & Zubric, 2004; [reference removed]), texts structure the construction of relations
much more strongly than images.
▸ Differentiation 3 (Explicit Relation/Implicit Association): Textual frame stimuli explicitly specify the
nature of relations between elements. This identification requires familiarity with the grammatical and language

8 Indeed, visual information can easily express associations between very many elements, while texts are usually
limited in this respect: Due to their linear form, relating one concept to many others requires the use of
enumerations or multiple anaphora, which quickly leads to complex and inelegant descriptions.
9 A related task is the resolution of figurative or symbolic expressions, both in visual and textual messages: Both the
word and the image “child” can be used in communication to literally express the concept “child”, but they can also
assume a wide range of “non-literal”, symbolic meanings. Such connections can typically be easily identified by
searching for associated relational knowledge: If no direct links between raised concepts can be found, there are
usually other meanings associated with a concept which can be identified based on their ability to enter meaningful
relations with other present concepts. Thus, the identification of relations among concepts can lead to changes in the
decoding of the stimulus, replacing literal conceptual interpretations with symbolic ones.
code. Visual frame stimuli can suggest specific relations between elements, but the nature of the relation must be
inferred from experience and knowledge. This identification does not require familiarity with a specific code.
• Proposition 4 (Meaning construction): Based on the set of identified relations, a common macrostructure can be
constructed for both frames and images, which integrates the available information and renders it meaningful.
During the identification of meaningful relations between concepts, typically many relations are
found that do not cohere with one another (Pennington & Hastie, 1988; Tourangeau & Rasinski,
1988; van Gorp, 2007). Thus, in order to derive coherent meaning from an image or linguistic
frame message, people need to engage in a construction-integration process that discriminates the
relevant relations from unrelated content (van Dijk & Kintsch, 1983; [reference removed]). While sets
of coherent relations are identified and unrelated cues are discarded, people attempt to construct
a common macrostructure that summarizes the “central organizing idea” (Gamson & Modigliani,
1987: 143; van Dijk & Kintsch, 1983) integrating the participating propositions. If organizing
ideas are found that account only for a small part of identified relations while being contradicted
by other salient elements of the message, these are discarded as implausible (Pennington &
Hastie, 1988). This is the case specifically when highly salient elements of a message conflict with
the interpretation. Since texts typically render only relatively few aspects highly salient, while
associative inferences should be more easily discarded, identifying a suitable interpretation should
normally be relatively straightforward. Images, by contrast, render rich information highly salient
(Premise 1), giving rise to various possible interpretations while complicating the identification of
one macrostructure that integrates all salient elements (Premise 4). Where both visual and textual
stimuli are present, visual information should tend to override discrepant textual information
even though the text should more uniquely refer to a specific interpretation than the image
(Gibson & Zillmann, 2000). Only if an organizing idea integrates most of the salient propositions
raised by the stimulus, while all deviant propositions can be discarded as irrelevant, is a possible
interpretation found ([reference removed]). The more constrained and coherent the range of
identified propositions, the more easily should people be able to (re)construct the central
organizing idea of the frame (Price & Tewksbury, 1997; [reference removed]). This is facilitated by all
structure imposed upon the three prior steps in information processing: If similar units were
interpreted in unambiguous ways, connected by explicit relations and then elaborated upon
within narrow bounds, a unique interpretation should be readily found. By contrast, deviant
structurings of perception, ambiguous decoding, and a relatively weakly constrained associative
interpretation lead to a rather diverse set of identified propositions (Gamson, 1992; Vliegenthart
& van Zoonen, 2011; Druckman, 2001). These complicate the construction of unique meaning
and also fuel diverse interpretations by different individuals. The considerably higher openness of
visual (compared to textual) stimuli to diverse readings (Premises 2 & 3) thus results in a much less
constrained and predictable, more effortful and less certain construction of meaning.
▸ Differentiation 4a (Constrained construction/Ambiguous construction): Frame construction based on visual
and textual information follows more or less the same logic. However, the considerably higher ambiguity in the
prior steps of processing visual as opposed to textual information introduces higher variability into the
construction of meaning from images compared to textual frames.
▸ Differentiation 4b (Picture superiority): Compared to textual information, visual information is both richer
and more salient and therefore more difficult to disregard in the construction of coherent meaning.
DISCUSSION
Our paper had two starting points: First, the rising relevance of the framing approach in visual
research, which still lacks a thorough theoretical development of the underlying assumptions and
implications of visual framing as a process of VIP; and second, relatedly, the observation that the
current state of empirical research presents some methodological shortcomings that originate, at
least partly, in an unsatisfying theoretical underpinning. Against this background, we proposed
our integrated modeling of the linkages between visual communication, VIP and frame
processing, pursuing two main intentions: To contribute to the analytical differentiation of visual
frame processing theory, drawing upon theoretical insights from mainstream framing research,
and to render this theoretical perspective available to empirical research.
Regarding our first aim, the above discussion has shown that the process of deriving
coherent meaning from visual messages can be characterized as a specific case of a more general
framing process. If we understand framing as a process by which complex perceptions become
interpretable through the purposeful selection and constructive integration of salient aspects
(Entman, 1993; Gamson & Modigliani, 1987; van Gorp, 2007), any kind of stimulus capable of
signifying semantic information can initiate a framing process. The modality of the stimulus
matters merely with respect to the manner in which the signification takes place, but the
fundamental process does not differ. What does differ, chiefly, is the degree to which different
kinds of stimuli can suggest specific meaning with different degrees of ambiguity (Differentiations
1a, 2, & 3), and cause higher or lower salience (Differentiation 1b). Taking into consideration the
characteristic properties of VIP, hence, framing process theory can be easily generalized and
adapted to inform the study of visual communication. This integration of visual and “classic”
framing perspectives then enables several distinctions that lead to a much more nuanced view on
framed information processing.
Relative strength of visual and linguistic framing
Based on the not yet integrated literature discussing the specifics of the visual framing process,
one common conclusion leads to the expectation that visual frames should be generally more
powerful than textual or auditive ones. Following a “special logic” perspective on visual
communication, visual frames should gain a relatively high impact to induce visual framing
effects (Messaris & Abraham, 2001; Rodriguez & Dimitrova, 2011): The superior salience,
the richness in conveyed information, and the intuitive processing mode all support this expectation.
Through their associative logic, images should directly create a visual frame that is particularly
salient and hardly questioned by recipients, and therefore able to strongly shape even their
interpretation of textual information (Brantner, Lobinger & Wetzstein, 2011; Gibson & Zillmann,
2000; Griffin, 2004; Messaris & Abraham, 2001; Scheufele, 1999; Zillmann, Gibson & Sargent,
1999). For many scholars, hence, the important question “whether the visual framing or verbal
framing has the stronger effect” (Coleman 2010: 255) seemed settled. However, our theoretical
discussion above has also raised a range of considerations that suggest a more complex picture
and call for a more differentiated, empirically backed answer:
Interactive Processes
First, as we have shown, the processing of both visual and linguistic information requires a rather
complex process of stimulus-based perception, knowledge- and education-based refinement, and
constructive integration of available information. While visuals indeed command greater attention
and convey rich information during perception, and especially in its early stages (Differentiation 1b),
they provide comparatively little structure for the subsequent process of identifying and
decoding meaningful elements (Differentiations 1a & 2). The relatively low demands that visuals
put on conventional knowledge aid the framing process if unique meaning can indeed be
identified intuitively; however, the same property may slow down the construction of meaning
when the elemental composition and signified meaning are unclear, and ambiguity needs to be
resolved between different possible interpretations (Kintsch, 1998; Cook & Guéraud, 2005).
Linguistic frame processing may be more effortful and less salient, but due to their strong
selectiveness, linguistic frames reduce ambiguity well and facilitate the use of associative
knowledge for frame construction (van Dijk & Kintsch, 1983; Pan & Kosicki, 1993). Different
modalities thus have their specific strengths in conveying frame messages, suggesting that
different kinds of frames exert stronger influence depending on the frame communicated: For
frame messages that are easily understood intuitively, and contents that can draw upon well-
familiar, conventional semiotic codes and knowledge, the ambiguity of visuals should be
unproblematic, such that their potential for high salience can be actualized (van Gorp, 2007). For
frame messages referring to weakly conventionalized knowledge and unfamiliar situations, by
contrast, resolving the meaning of images should be difficult, such that linguistic frames provide
more effective guidance (Holyoak & Thagard, 1995; [reference removed]).
Differentiated modality-specific properties
Second, and related to the former point, the specific strengths of visual and linguistic frame
messages depend crucially upon the presence of their described characteristic properties.
However, not all visuals are indeed rich in detail, vivid and similar to direct perceptions of reality
(Rose, 2012; Wiesing, 2005). Likewise, not all linguistic expressions are equally unambiguous and
explicit about the relations between raised concepts (Kintsch, 1998; Holyoak & Thagard, 1995;
Langacker, 1998). To the degree that visuals are abstracted or used to signify symbolic meaning,
their appeal to intuitive comprehension diminishes while their demands on prior knowledge for
decoding and interpretation increase (Neuman, Just, & Crigler, 1992). Visuals encompass a wide
variety of phenotypes, ranging from photo-realistic pictures of largely unstructured reality, via
abstracted and reduced depictions, to visual symbols that have in many respects more
commonalities with textual than with photo-realistic visual representations (Abraham & Appiah,
2006). Likewise, there are some cases of linguistic representations such as onomatopoetic
expressions that can be understood intuitively based on their phonetic similarity to naturally
occurring sounds. Visual similarity of textual communication to the depicted objects is rare, but
also possible.10 Linguistic metaphors and catchphrases with rich associated meaning can conjure
up detail-rich imaginations that achieve a certain amount of vividness and suggest much more
detailed information than other, rather abstract expressions (Holyoak & Thagard, 1995; van
Gorp, 2007). While we have deliberately overstated the distinctness of visual and linguistic
(textual or auditive) modalities of information for the sake of argument, there are many cases in
between that share properties with both ideal cases to a certain extent. Both visual and linguistic
representations can more realistically be arranged along a continuum, ranging from relatively
unstructured, vivid and intuitively comprehensible information to highly abstracted, sequentially

10 One example is smileys composed from character sequences, such as :D
encoded information (Differentiation 2). The same is true also with regard to Differentiations 1a, 1b,
and 3: Written words can be spatially arranged in specific ways, different typesets and colors can
suggest orders of perception that deviate from conventions, and in several cases, text characters
can be used to simultaneously create a textual representation and a visual image (Card,
MacKinlay, & Shneiderman, 1999). Emphasis in spoken language, rhymes and creative uses of
the phonetic qualities of words can be used to communicate additional meaning and relations
that transcend the grammatical order (van Dijk & Kintsch, 1983). Inversely, texts may forego
their ability to express the quality of relations, using vague formulations or merely enumerating
associated elements (Kintsch, 1998); visuals can provide additional information to qualify implied
relations between depicted elements – e.g., in a sequence of pictures depicting ongoing events, or
by adding visual symbols to aid interpretation (Carroll, 1982; Moriarty, 2005). While the
differentiations presented above highlight the respective characteristic potentials and limitations
of visual and linguistic communication modes, neither is necessarily fully actualized, and there
are several ways of compensating for the specific deficits. Accordingly, frame processing can only
benefit from the specific properties of either processing mode to the degree that the named
properties are fully present.
Multimodal messages
Third, importantly, visual communication messages rarely occur alone: As Mitchell (2005)
pointedly put it, “there are no visual media”. Purely linguistic messages, by contrast, are more
common; however, these too are increasingly accompanied by visualizations of various kinds.
Both the development of digital, multimedia-based communication and its rising use by both
political and corporate strategic communication actors lead to a wide and salient proliferation of
multimodal messages: “Visual and verbal messages occur together in media, and audiences
process them simultaneously” (Coleman, 2010, p. 235). Consequently, an isolated view on only
one specific communication mode can yield only an incomplete understanding of any media
effects (Coleman, 2010). Theoretically, too, as we have argued above, visual and linguistic frame
processing should not be understood as fundamentally distinct, competing logics (Mueller, 2007;
Mitchell, 2005). However, multimodal messages come in many varieties – ranging from texts
utilizing visual attributes or simple illustrations to bolster their impact to images accompanied by
captions or containing textual elements. Most broadly, video communication messages often
combine visual, textual, and auditory elements to convey their intended meaning (van Gorp, 2007;
Abraham & Appiah, 2006). In multimodal messages, however, visual and linguistic
information processing operate in close interaction, and this interrelation, too, should
contribute significantly to the resulting framing effects (Coleman & Banning, 2006), leading to a
number of important contingencies with regard to either mode’s characteristic properties: Textual
captions can be used to disambiguate image content, visuals can raise the salience of linked
linguistic expressions, and both images and texts can initiate the renewed perception,
restructuring, and decoding as well as the re-interpretation of the respective other component (Son,
Reese, & Davie, 1987; van Gorp, 2007). At the same time, visuals, texts and verbal messages may
also compete for attention and consideration during the meaning construction process.
Multimodal messages wherein the different components reinforce each other’s suggested
meaning (high visual-verbal redundancy) should benefit from both the increased salience,
vividness and memorability of visuals, and from the guided structuring and unambiguous
signification of linguistic representations. Where different elements conflict, it remains unclear
whether salient but ambiguous visuals or clear but less salient texts ultimately gain the upper
hand (Graber, 1990; Grimes, 1991). While much can be said about the respective
subprocesses laid out above, their interactions depend strongly on the actualization of the various
properties of depicted and described contents, the amount and quality of knowledge and
exogenous control brought to the processing task, and a number of path-dependent
contingencies that cannot be anticipated based on the available theoretical and empirical
knowledge.
Conclusion
As a consequence, a full understanding of multimodal frame processing requires that all included
modalities as well as the interactions of their respective contributions are examined within the
same theoretical framework. Based on this more differentiated view, the question of which kinds of
framing messages exert stronger or weaker influence on people’s interpretations and
judgments (Coleman, 2010) remains largely open. The relative strength of the various interacting
influences during information processing cannot be deduced from theoretical knowledge alone,
but ultimately remains an empirical question. Hence, addressing our second aim, the above
theoretical view of visual and linguistic frame processing presents both a complex empirical
research program and an important methodological challenge for visual framing research:
Corroborating our above arguments requires detailed experimental data based on varying
combinations of visual and linguistic messages, exhibiting their characteristic properties to
varying degrees. Moreover, due to the complex interactions leading to the final interpretation
constructed from both uni- and multimodal messages, a specific research focus on the underlying processes
is desirable. While some studies have begun to address questions regarding the perception of
more or less ambiguous, weakly or strongly structured visual, textual, and multimodal messages
(Kahle, Yu & Whiteside, 2007; Kim & Kelly, 2007; Rodgers, Kenix & Thorson, 2007), little is
known to date about the processes of decoding, relating, and integrating the information thus
perceived. For each of these steps, stimulus-driven, intuitive comprehension processes should
interact with exogenous, knowledge-controlled re-assessments and constructions in manifold
ways. Untangling these interactions requires not only the development of an integrative,
modality-sensitive theory of the involved perceptual and cognitive processes, which we have
attempted to sketch above; it also requires theoretically informed research that attends to
both the processes and the specific properties of the processed messages. In our view, juxtaposing two
largely distinct, competing logics of visual versus linguistic information processing not only
neglects the immense amount of variability within either modality of information representation;
it also overlooks the many parallels and important interactions between the involved processes.
Categorizing frame messages as either visual, textual, verbal, or multimodal not only says
relatively little about the qualities of each message that shape the way it is
processed; it also suggests a distinctness of the respective processes – and hence, of the theoretical
frameworks suitable for their interpretation – that does not correspond to the actual complexity
of framed information processing.
To conclude, our theoretical discussion leads us to agree with Coleman (2010) that framing
theory has rightfully emerged as one of the life lines for visual research. Visual framing holds rich
potential both for theoretically understanding and empirically investigating visual (and
multimodal) media content and its related effects. It provides an important new direction for
theory building and future research. However, when we refer to visual framing, we understand
this to be one special facet within a general framing process, which is characterized by a range of
specific properties, but is not fundamentally distinct from it. Especially when studying the potential
effects of visual, verbal and multimodal media messages, an integrative approach to visual
framing within a more general theory of frame processing appears mandatory. We therefore
advocate a more differentiated perspective on visual framing to inform our theoretical
explanations and empirical research: Only if we carefully characterize the information presented
in different modalities and focus on the resulting variations in the manner of
information processing can we fully understand the intricate effects of visual and other frames
upon the construction of coherent meaning.
REFERENCES
[4 references removed for the review process]
Abraham, L., & Appiah, O. (2006). Framing news stories: The role of visual imagery in priming
racial stereotypes. The Howard Journal of Communications, 17(3), 183-203.
Ballensiefen, M. (2009). Bilder machen Sieger — Sieger machen Bilder. Die Funktion von Pressefotos im
Bundestagswahlkampf 2005. Wiesbaden: VS.
Barnard, W. A., Breeding, M., & Cross, H. (1984). Object Recognition as a Function of Stimulus
Characteristics. Bulletin of the Psychonomic Society, 22, 15-18.
Berger, A. A. (1989). Seeing is Believing. An Introduction to Visual Communication. Mountain View: McGraw
Hill.
Bergstroem, B. (2008). Essentials of Visual Communication. London.
Borah, P. (2011). Conceptual Issues in Framing Theory: A Systematic Examination of a Decade's
Literature. Journal of Communication, 61(2), 246-263.
Bouma, H., & Bouwhuis, D. G. (1982). Visual Selection in Reading, Picture Perception and Visual Search. A
Tutorial Review. Hillsdale.
Brantner, C., Lobinger, K., & Wetzstein, I. (2011). Effects of Visual Framing on Emotional Responses
and Evaluations of News Stories about the Gaza Conflict 2009. Journalism & Mass Communication
Quarterly, 88(3), 523-540.
Bucher, H.-J., & Schumacher, P. (2006). The Relevance of Attention for Selecting News Content. An
Eye-Tracking Study on Attention Patterns in the Reception of Print and Online Media.
Communications: The European Journal of Communication Research, 31(3), 347-368.
Bullier, J. (2001). Integrated Model of Visual Processing. Brain Research Reviews, 36, 96-107.
Bundesen, C., & Habekost, T. (2008). Principles of Visual Attention: Linking Mind and Brain. Oxford Portraits
in Science. Oxford, New York.
Cantoni, V., Marinaro, M., & Petrosino, A. (2002). Visual Attention Mechanisms. New York.
Card, S. K., MacKinlay, J. D., & Shneiderman, B. (1999). Readings in Information Visualization. Using
Vision to Think. San Francisco: Morgan Kaufmann.
Carroll, J. M. (1982). Structure in Visual Communication. Semiotica, 40, 371-392.
Charlton, M. (1997). Rezeptionsforschung als Aufgabe einer interdisziplinären Medienwissenschaft. In
M. Charlton & S. Schneider (Eds.), Rezipientenforschung. Theorien und Untersuchungen zum Umgang mit
Massenmedien. (pp. 16-39). Opladen: Westdeutscher.
Childers, T. L., & Houston, M. J. (1984). Conditions for a Picture Superiority Effect on Consumer
Memory. Journal of Consumer Research, 11(2), 643-654.
Coleman, R. (2010). Framing the Pictures in Our Heads: Exploring the Framing and Agenda-Setting
Effects of Visual Images. In P. D'Angelo & J. A. Kuypers (Eds.), Doing News Framing Analysis:
Empirical and Theoretical Perspectives (pp. 233-262). New York: Routledge.
Coleman, R., & Banning, S. (2006). Network TV News' Affective Framing of the Presidential Candidates:
Evidence for a Second-Level Agenda-Setting Effect Through Visual Framing. Journalism and Mass
Communication Quarterly, 83(2), 313-328.
Collins, A. M., & Loftus, E. F. (1975). A spreading-activation theory of semantic processing.
Psychological Review, 82, 407-428.
Cook, A. E., & Guéraud, S. (2005). What have we been missing? The role of general world
knowledge in discourse processing. Discourse Processes, 39(2&3), 265-278.
Detenber, B. H., Gotlieb, M. R., McLeod, D. M., & Malinkina, O. (2007). Frame Intensity Effects of
Television News Stories About a High-Visibility Protest Issue. Mass Communication and Society, 10,
439-460.
de Vreese, C. H. (2005). News framing: Theory and typology. Information Design Journal +
Document Design, 13(1), 51-62.
Druckman, J. N. (2001). The implications of framing effects for citizen competence. Political
Behavior, 23, 225-256.
Duchowski, A. T. (2007). Eye Tracking Methodology. Theory and Practice. London.
Edy, J. A., & Meirick, P. C. (2007). Wanted, dead or alive: Media frames, frame adoption, and
support for the war in Afghanistan. Journal of Communication, 57, 119-141.
Elkins, J. (2003). Visual Studies. A Skeptical Introduction. New York.
Emmison, M., & Smith, P. (2000). Researching the Visual: Images, Objects, Contexts and Interactions in Social and
Cultural Inquiry. London: Sage.
Entman, R. M. (1993). Framing: Towards Clarification of a Fractured Paradigm. Journal of Communication,
43(4), 51-58.
Entman, R. M., Matthes, J., & Pellicano, L. (2009). Nature, Sources, and Effects of News Framing. In K.
Wahl-Jorgensen & T. Hanitzsch (Eds.), The Handbook of Journalism Studies (pp. 175-190). New
York, NY: Routledge.
Findlay, J. M., & Gilchrist, I. D. (2003). Active Vision. The Psychology of Looking and Seeing. Oxford, New
York.
Gamson, W. A. (1992). Talking politics. Cambridge, UK: Cambridge University Press.
Gamson, W. A., & Modigliani, A. (1987). The Changing Culture of Affirmative Action. In R. G.
Braungart & M. M. Braungart (Eds.). Research in Political Sociology (pp. 137-177). Greenwich, CT:
JAI Press.
Gamson, W. A., & Modigliani, A. (1989). Media Discourse and Public Opinion on Nuclear Power: A
Constructionist Approach. American Journal of Sociology, 95(1), 1-37.
Gibson, R., & Zillmann, D. (2000). Reading Between the Photographs. The Influence of Incidental
Pictorial Information on Issue Perception. Journalism & Mass Communication Quarterly, 77(2), 355-
366.
Gitlin, T. (1980). The Whole World is Watching: Mass Media in the Making and Unmaking of the New Left.
Berkeley: University of California Press.
Goodman, N. (1997). Sprachen der Kunst. Entwurf einer Symboltheorie. Frankfurt: Suhrkamp.
Gordon, I. E. (2004). Theories of Visual Perception. New York: Wiley.
Grabe, M. E., & Bucy, E. P. (2009). Image Bite Politics: News and the Visual Framing of Elections. Oxford:
Oxford University Press.
Graber, D. (1988). Processing the news: How people tame the information tide (2 ed.). White Plains, NY:
Longman.
Graber, D. (1990). Seeing Is Remembering: How Visuals Contribute to Learning from Television
News. Journal of Communication, 40(3), 134-155.
Griffin, M. (2004). Picturing America‘s War on Terrorism in Afghanistan and Iraq. Photographic
Motifs as News Frames. Journalism, 5(4), 381-402.
Grimes, T. (1991). Mild Auditory-Visual Dissonance in Television News May Exceed
Viewer Attentional Capacity. Human Communication Research, 18(2), 268-298.
Hammoud, R. I. (2008). Passive Eye Monitoring. Algorithms, Applications and Experiments. Berlin,
Heidelberg: Springer.
Hofmann, W. (2009). «Ich schau Dir in die Augen»: Die Bedeutung visueller Medien für die politische
Kommunikation in entwickelten Demokratien. In H. Münkler & J. Hacke (Eds.), Strategien der
Visualisierung: Verbildlichung als Mittel politischer Kommunikation. (pp. 109-126). Frankfurt am Main:
Campus.
Holyoak, K. J., & Thagard, P. (1995). Mental leaps: Analogy in creative thought. Cambridge, MA: The
MIT Press.
Joyce, C., & Cottrell, G. W. (2004). Solving the visual expertise mystery. Paper presented at the Connectionist
Models of Cognition and Perception II . Proceedings of the Eighth Neural Computation and
Psychology Workshop., Singapore.
Kahle, S., Yu, N., & Whiteside, E. (2007). Another Disaster: An Examination of Portrayals of Race in
Hurricane Katrina Coverage. Visual Communication Quarterly, 14(2), 75-89.
Kennedy, J. M. (1984). How Minds Use Pictures. Social Research, 51, 885-904.
Kim, Y. S., & Kelly, J. (2007). Visual Framing and the Photographic Coverage of the Kwangju and
Tiananmen Square Prodemocracy Movements: A Partial Replication. Conference Paper, ICA 2007.
Kintsch, W. (1998). Comprehension: A paradigm for cognition. Cambridge, UK: Cambridge University
Press.
Kobayashi, S. (1986). Theoretical Issues Concerning Superiority of Pictures Over Words and Sentences
in Memory. Perceptual and Motor Skills, 63, 783-792.
Kosslyn, S. (1995). Mental Imagery. In S. Kosslyn, D. N. Osherson & L. Gleitman (Eds.), Visual
Cognition. An Invitation to Cognitive Science. (pp. 267-296). Cambridge: Massachusetts Institute of
Technology.
Kress, G., & Leeuwen, T., van. (2006). Reading Images. The Grammar of Visual Design. London, New York:
Routledge.
Kress, G., & Leeuwen, T., van. (2010). Reading Images. The Grammar of Visual Design. London, New York:
Routledge.
Kroeber-Riel, W. (1993). Bildkommunikation. The New Science of Imagination. München: Vahlen.
Lachmann, U. (2002). Wahrnehmung und Gestaltung von Werbung. Hamburg: Gruner und Jahr.
Langacker, R. W. (1998). Conceptualization, symbolization, and grammar. In M. Tomasello (Ed.),
The new psychology of language: Cognitive and functional approaches to language Structure (Vol. 1, pp. 1-
39). Mahwah, NJ: Lawrence Erlbaum Associates.
Lester, P. M. (2010). Visual Communication. Images with Messages. Belmont: Wadsworth/Thompson.
Leven, W. (1991). Blickverhalten von Konsumenten. Grundlagen, Messung und Anwendung in der Werbeforschung.
Heidelberg: Physika.
Luck, S. J., & Hollingworth, A. (2008). Visual Memory. Oxford Series in Visual Cognition. Oxford, New York.
Ludes, P. (2001). Multimedia und Multi-Moderne: Schlüsselbilder. Fernsehnachrichten und World Wide
Web. Medienzivilisierung in der Europäischen Währungsunion. Wiesbaden: Westdeutscher.
MacInnis, D. J., & Price, L. L. (1987). The Role of Imagery in Information Processing: Review and
Extensions. Journal of Consumer Research, 13, 473-491.
Matthes, J. (2009). What's in a Frame? A Content Analysis of Media Framing Studies in the World's
Leading Communication Journals, 1990-2005. [Article]. Journalism & Mass Communication Quarterly,
86(2), 349-367.
Mendelson, A. L. (2004). For Whom is a Picture Worth a Thousand Words? Effects of the Visualizing
Cognitive Style and Attention of News Photos. Journal of Visual Literacy, 24(1), 1-22.
Messaris, P. (1992). Visual 'Manipulation'. Visual Means of Affecting Responses to Images.
Communication, 3, 181-195.
Messaris, P. (1994). Visual Literacy. Image, Mind and Reality. Boulder: Westview.
Messaris, P. (1997). Visual Persuasion. The Role of Images in Advertising. Thousand Oaks: Sage.
Messaris, P., & Abraham, L. (2003). The Role of Images in Framing News Stories. In S. Reese, O. Gandy
& A. Grant (Eds.), Framing Public Life: Perspectives on Media and Our Understanding of the Social World.
(pp. 215-226). Mahwah, NJ: Erlbaum.
Mitchell, W. J. (1986). Iconology. Image, Text, Ideology. Chicago, London.
Mitchell, W. J. T. (2005). There are No Visual Media. Journal of Visual Culture, 4(2), 257-266.
Moriarty, S. (2005). Visual Semiotics Theory. In K. Smith, S. Moriarty, G. Barbatsis & K. Kenney
(Eds.), Handbook of visual communication. Theory, Methods, and Media. (pp. 227-242). Mahwah:
Lawrence Erlbaum Publishers.
Mueller, M. G. (2001). Bilder, Visionen, Wirklichkeiten. Zur Bedeutung der Bildwissenschaft im 21.
Jahrhundert. In T. Knieper & M. G. Müller (Eds.), Kommunikation visuell. Das Bild als
Forschungsgegenstand – Grundlagen und Perspektiven. (pp. 14-24). Köln.
Mueller, M. G. (2003). Grundlagen der Visuellen Kommunikation. Theorieansätze und Analysemethoden. Konstanz:
UTB.
Mueller, M. G. (2007). What is Visual Communication? Past and Future of an Emerging Field of
Communication Research. Studies in Communication Science, 7(2), 7-34.
Nelson, D. L. (1979). Remembering Pictures and Words: Appearance, Significance, and Name. In L. S.
Cermak & F. I. M. Craik (Eds.), Levels of Processing in Human Memory. (pp. 45-76). Hillsdale.
Nelson, D. L., & Castano, D. (1984). Mental Representations for Pictures and Words: Same or Different?
American Journal of Psychology, 97, 1-15.
Nelson, T. E., Oxley, Z. M., & Clawson, R. A. (1997). Toward a psychology of framing effects.
Political Behavior, 19(3), 221-246.
Nelson, D. L., Reed, V. S., & Walling, J. R. (1976). Pictorial Superiority Effect. Journal of Experimental
Psychology: Human Learning and Memory., 2, 523-528.
Neuman, R. W., Just, M. R., & Crigler, A. N. (1992). Common knowledge: News and the construction of
political meaning. Chicago: University of Chicago Press.
Paivio, A. (1979). Imagery and Verbal Processes. New Jersey: Psychology Press.
Paivio, A., & Csapo, K. (1973). Picture Superiority in Free Recall: Imagery or Dual Coding? Cognitive
Psychology, 5, 176-206.
Pan, Z., & Kosicki, G. M. (1993). Framing analysis: An approach to news discourse. Political
Communication, 10, 55-75.
Pennington, N., & Hastie, R. (1988). Explanation-based decision making: Effects of memory
structure on judgment. Journal of Experimental Psychology, 14(3), 521-533.
Pinker, S. (1986). Visual Cognition. Cambridge: MIT Press.
Posner, R., & Schmauks, D. (1998). Die Reflektiertheit der Dinge und ihre Darstellung in Bildern. In K.
Sachs-Hombach & K. Rehkämper (Eds.), Bild - Bildwahrnehmung - Bildverarbeitung. Interdisziplinäre
Beiträge zur Bildwissenschaft. (pp. 15-32). Wiesbaden: DUV.
Price, V., & Tewksbury, D. (1997). News Values and Public Opinion: A Theoretical Account of Media
Priming and Framing. In G. A. Barnett & F. J. Boster (Eds.), Progress in Communication Sciences: Advances
in Persuasion (pp. 173-212). Greenwich: Ablex.
Proulx, M. (2007). The Strategic Control of Attention in Visual Search- Top-Down and Bottom-Up Processes.
Saarbrücken: VDM.
Rayner, K. (1978). Eye Movements in Reading and Information Processing. Psychological Bulletin, 85(3),
618-660.
Rayner, K. (1998). Eye Movements in Reading and Information Processing. 20 Years of Research.
Psychological Bulletin, 124(3), 372-422.
Reese, S. (2001). Prologue - Framing Public Life: A Bridging Model for Media Research. In S. Reese, O.
Gandy & A. Grant (Eds.), Framing Public Life. Perspectives on Media and Our Understanding of the Social
World. (pp. 7-31). Mahwah: Erlbaum.
Rodgers, S., Kenix, L. J., & Thorson, E. (2007). Stereotypical Portrayals of Emotionality in News Photos.
Mass Communication and Society, 10(1), 119-138.
Rodriguez, L., & Dimitrova, D. V. (2011). The Levels of Visual Framing. Journal of Visual Literacy, 30(1),
48-65.
Rose, G. (2012). Visual Methodologies. An Introduction to the Interpretation of Visual Materials. (3 ed.).
London/Thousand Oaks/New Delhi: Sage.
Sachs-Hombach, K. (2003). Das Bild als kommunikatives Medium. Elemente einer allgemeinen Bildwissenschaft.
Köln: von Halem.
Salomon, G. (1984). Television is 'easy' and Print is 'tough'. The Differential Investment of Mental Effort
in Learning as a Function of Perception and Attribution. Journal of Educational Psychology, 76, 647-
658.
Scheufele, B. (1999). (Visual) Media Framing und Politik. Zur Brauchbarkeit des Framing-Ansatzes im
Kontext (visuell) vermittelter politischer Kommunikation und Meinungsbildung. In W. Hofmann
(Ed.), Die Sichtbarkeit der Macht. Theoretische und empirische Untersuchungen zur visuellen Politik. (pp. 91-
107). Baden-Baden: Nomos.
Scheufele, B. (2001). Visuelles Medien-Framing und Framing-Effekte. Zur Analyse visueller
Kommunikation aus der Framing-Perspektive. In T. Knieper & M. G. Müller (Eds.),
Kommunikation visuell - Das Bild als Forschungsgegenstand - Grundlagen und Perspektiven. (pp. 144-158).
Köln: von Halem.
Scheufele, B. (2004). Framing-Effects Approach: A Theoretical and Methodological Critique.
Communications: The European Journal of Communication Research, 29(4), 401–428.
Scheufele, B., & Scheufele, D. A. (2012 (in print)). Framing and Priming Effects: Exploring Challenges
connected to Cross-Level Approaches in Media Effects Research. In E. Scharrer (Ed.), Media
Effects/Media Psychology. Volume 8 of the International Companions to Media Studies. Malden: Blackwell.
Schneider, W., & Maasen, S. (1998). Mechanisms of Visual Attention: A Cognitive Neuroscience Perspective. A
Special Issue of the Journal Visual Cognition. East Sussex: Psychology Press.
Schnotz, W., & Bannert, M. (1999). Einflüsse der Visualisierungsform auf die Konstruktion mentaler
Modelle beim Text- und Bildverstehen. Zeitschrift für Experimentelle Psychologie, 7(3), 217-236.
Schuermann, E. (2011). Transitions from Seeing to Thinking. On the Relation of Perception, Worldview
and World-Disclosure. In K. Sachs-Hombach & R. Totzke (Eds.), Bilder, Sehen, Denken. Zum
Verhältnis von begrifflich-philosophischen und empirisch-psychologischen Ansätzen in der bildwissenschaftlichen
Forschung (pp. 93-105). Köln: von Halem.
Schwalbe, C. B., Silcock, W. B., & Keith, S. (2008). Visual Framing of the Early Weeks of the US-led
Invasion of Iraq: Applying the Master War Narrative to Electronic and Print Images. Journal of
Broadcasting & Electronic Media, 52(3), 448-465.
Schwan, S. (2005). Psychologie. In K. Sachs-Hombach (Ed.), Bildwissenschaft. Disziplinen, Themen, Methoden
(pp. 124-133). Frankfurt am Main: Suhrkamp.
Seymour, P. H. (1979). Human Visual Cognition. A Study in Experimental Cognitive Psychology. London:
Palgrave Macmillan.
Shah, D. V., Kwak, N., Schmierbach, M., & Zubric, J. (2004). The interplay of news frames on
cognitive complexity. Human Communication Research, 30(1), 102-120.
Slothuus, R., & de Vreese, C. H. (2010). Political parties, motivated reasoning, and issue framing
effects. The Journal of Politics, 72(3), 630-645.
Snowden, R., Thompson, P., & Troscianko, T. (2006). Basic Vision. An Introduction to Visual Perception.
Oxford: Oxford University Press.
Son, J., Reese, S. D., & Davie, W. R. (1987). Effects of Visual-Verbal Redundancy and Recaps on
Television News Learning. Journal of Broadcasting & Electronic Media, 31(2), 207-216.
Sugimoto, M., & Cottrell, G. W. (2001). Visual Expertise is a General Skill. Proceedings of the 23rd Annual
Cognitive Science Conference, Edinburgh.
Tankard, J. W. (2001). Prologue: Framing public life: A bridging model for media research. In S.
D. Reese, O. H. Gandy, Jr. & A. E. Grant (Eds.), Framing public life (pp. 95-103). Mahwah,
NJ.: Lawrence Erlbaum Associates.
Tewksbury, D., & Scheufele, D. A. (2009). News framing theory and research. In J. Bryant & M. B. Oliver
(Eds.), Media effects: Advances in theory and research (pp. 17-33). Hillsdale, NJ: Erlbaum.
Tourangeau, R., & Rasinski, K. A. (1988). Cognitive processes underlying context effects in
attitude measurement. Psychological Bulletin, 103(3), 299-314.
van Atteveldt, W., Kleinnijenhuis, J., & Ruigrok, N. (2008). Parsing, semantic networks, and
political authority: Using syntactic analysis to extract semantic relations from Dutch
newspaper articles. Political Analysis, 16(4), 428-446.
van Atteveldt, W., Ruigrok, N., & Kleinnijenhuis, J. (2006). Associative framing: A unified method for
measuring media frames and the media agenda. Paper presented at the ICA 56th Annual
Conference, Dresden, Germany.
van Dijk, T. A. & Kintsch, W. (1983). Strategies of Discourse Comprehension. New York: Academic Press.
van Gorp, B. (2007). The constructionist approach to framing: Bringing culture back in.
Journal of Communication, 57(1), 60-78.
van Gorp, B. (2010). Strategies to take subjectivity out of framing analysis. In P. D'Angelo & J. A.
Kuypers (Eds.), Doing news framing analysis: Empirical and theoretical perspectives (pp. 84-109).
New York: Routledge.
van Leeuwen, T., & Jewitt, C. (Eds.). (2010). Handbook of Visual Analysis. London/Thousand Oaks/New
Delhi: Sage.
Veling, A., & van der Weerd, P. (1999). Conceptual grouping in word co-occurrence networks. Paper
presented at the IJCAI.
Vliegenthart, R., & van Zoonen, L. (2011). Power to the frame: Bringing sociology back to frame
analysis. European Journal of Communication, 26(2), 101-115.
Watt, R. (1992). Visual Analysis and Representation of Spatial Relations. In G. Humphreys (Ed.),
Understanding Vision. Readings in Mind and Language. (pp. 19-38). Oxford, Cambridge.
Wedel, M., & Pieters, R. (Eds.). (2008). Visual Marketing: From Attention to Action. New York.
Weidenmann, B. (1988). Psychische Prozesse beim Verstehen von Bildern. Bern, Stuttgart, Toronto: Huber.
Weidenmann, B. (1989). Der mentale Aufwand beim Fernsehen. In J. Groebel & P. Winterhoff-Spurk
(Eds.), Empirische Medienpsychologie (pp. 134-150). Bern, Stuttgart, Toronto.
Weidenmann, B. (1998). Psychologische Ansätze zur Optimierung des Wissenserwerbs mit Bildern. In K.
Sachs-Hombach & K. Rehkämper (Eds.), Bild - Bildwahrnehmung - Bildverarbeitung. Interdisziplinäre
Beiträge zur Bildwissenschaft. (pp. 243-254). Wiesbaden: DUV.
Weßler, H. (1999). Öffentlichkeit als Prozess: Deutungsstrukturen und Deutungswandel in der deutschen
Drogenberichterstattung. Opladen, Germany: Westdeutscher Verlag.
Wicks, R. H. (2007). Does Presentation Style of Presidential Debates Influence Young Voters'
Perceptions of Candidates? American Behavioral Scientist, 50(9), 1247-1254.
Wiesing, L. (1998). Sind Bilder Zeichen? In K. Sachs-Hombach & K. Rehkämper (Eds.), Bild -
Bildwahrnehmung - Bildverarbeitung. Interdisziplinäre Beiträge zur Bildwissenschaft. (pp. 95-104).
Wiesbaden: DUV.
Wiesing, L. (2005). Artifizielle Präsenz. Studien zur Philosophie des Bildes. Frankfurt: Suhrkamp.
Yantis, S. (2002). Stimulus-Driven and Goal-Directed Attention Control. In V. Cantoni, M. Marinaro &
A. Petrosino (Eds.), Visual Attention Mechanisms. (pp. 125-134). New York.
Yantis, S. (2005). How Visual Salience Wins the Battle for Awareness. Nature Neuroscience, 8(8), 975-977.
Yarbus, A. (1967). Eye Movements and Vision. New York: Plenum Press.
Zillmann, D., Gibson, R., & Sargent, S. L. (1999). Effects of Photographs in News-Magazine Reports on
Issue Perception. Media Psychology, 1(3), 207-228.