
Literature and Cinema

James Monaco, “The Language of Film: Signs and Syntax”

Contents
S. No. Title
1. Introduction
1.1 About the Author
1.2 Film Semiotics
1.3 Learning Objectives
2. Signs
2.1 The Physiology of Perception
2.2 Film as Language
2.3 Denotative and Connotative Meanings
3. Syntax
3.1 Mise-en-scène
3.2 Montage
4. Conclusion

1. Introduction
About the Author
James Monaco was an eminent American film critic, historian, and academic. Over an
illustrious career spanning more than thirty years, from the 1970s to the 2010s, Monaco authored
several books on cinema and media and taught at The New School for Social Research,
Columbia University and New York University. As a film critic he wrote for The New York
Times, The Village Voice, and American Film, among several others, and appeared as a
cultural commentator on television and radio. How to Read a Film (1977; 2009) is his most
significant contribution to the field of film studies and has served as a key pedagogical text
for the discipline. It remains, to date, an incredibly popular introduction to cinema as a visual
art and to its practices of cultural meaning-making and storytelling.
Film Semiotics
In the 1970s, when film was gaining importance not just as an industrial practice but also as
an academic discipline in universities, Western film theory drew heavily on the study of
visual art histories, language, and literature in order to distinguish cinema as a specific
technological medium and art form. To identify the particular features of film and
how it conveys meaning through the moving image, several film studies scholars preferred
the semiotic approach. Semiotics, the study of signs and of language as a system of
communication with specific signs or words and rules of grammar and organization, had a major
impact on the study of cinema. It offers useful analytical tools for thinking about film as text, as
well as the means to deconstruct how film texts communicate with audiences in very specific
ways, unique to the medium. Semiotics provides a method for evaluating how spoken and
written languages work as structures of communication. Film theorists apply semiotics to
examine whether film, through the practice and production of images, sounds, and sequences,
functions like any other language.
How to Read a Film, especially the chapter “The Language of Film: Signs and Syntax,”
elaborates how cinema communicates through visual codes, cultural signs, connotations, and
denotations, generating images as icons and symbols, not only to construct the narrative but
also to convey meaning to the audience. Monaco draws on the semiotic approach in
linguistics to ascertain whether, as a form of communication, film follows certain principles
of language structures. It is a detailed chapter that compares film with various other
storytelling and visual art forms, like the novel, theatre, painting and photography, to
differentiate film as a medium of moving images.

Monaco’s work builds upon the theories of two scholars, Christian Metz and C. S.
Peirce. Metz’s Film Language (1974) and The Imaginary Signifier (1977) were
pioneering texts in film semiotics that have deeply influenced the direction of film theory
since the 1970s, as they initiated the comparison of film with language. C. S. Peirce was a
late-nineteenth-century American philosopher who is said to have coined the term
“semiotics,” and it is from his classification of signs as symbolic, iconic, and indexical that
Monaco borrows to elaborate how visual codes relay meanings in an image-based medium
such as cinema.
Learning Objectives
The prescribed reading, “The Language of Film: Signs and Syntax” is the third chapter of
Monaco’s book, How to Read a Film: The World of Movies, Media, and Multimedia. The
students are advised to read this chapter before going through the study material.
“The Language of Film: Signs and Syntax” is divided into two parts that raise two
fundamental questions:
• Does film, as a medium, constitute a language of its own?
• Does film have a grammar of its own, with unique features and ways of organization?
Monaco answers these two questions with detailed explanations of how film is primarily an
image-based art practice and lists the particular features that differentiate film from any other
medium. Remember that Monaco is interested in how the signification process occurs in
cinema or, to put it simply, how meaning is constructed with images and sequences such that
they imply certain messages or suggest possibilities that take the film narrative forward or
convey emotions and subjectivities that audiences can identify with. Therefore, as you
proceed through the chapter, you will learn about:
• the language of cinema and how it works differently from other art forms such as
theatre, photography, and literature; and
• the basic features that constitute the language of cinema such as shot, frame,
movement, mise-en-scène, and montage.
The following sections provide a detailed and critical analysis of the different sections of the
prescribed chapter.
2. Signs
The Physiology of Perception
James Monaco begins his account of the specific features of film as a visual medium with
a comparison between language and cinema. Any language system, be it a verbal language
like English or French or a scientific one like mathematics, requires learning, familiarity with
the units that constitute it, such as letters or numerals, and, most importantly,
knowledge of how to use them correctly in order to make sense. However, with film there is
no such compulsion. As Monaco points out, even toddlers who have not yet acquired any
spoken language are attracted to moving images and respond to them. This suggests that
cinema has a grammar of its own. It is an audio-visual medium that relays images in
sequences that anyone can interpret in their own way, even without any prior specific
knowledge of the medium.
According to Monaco, this is precisely where the challenge of studying film lies: because
almost everyone can understand it at a basic cognitive level, it is often read simplistically at
the visual level or taken too lightly as an art form. Instead, Monaco suggests that film, quite
like language, offers multiple levels of signification. Film is a quasi-language, which
means that even though formal learning is not needed, with time and repetition people can acquire
more nuanced skills and a sophisticated understanding of how cinema works. Thus, those
who watch and observe films more often are able to identify and grasp their particular features
and conventions, often acquiring observational skills and, in a way, educating themselves about
how cinema creates visual codes, gestures, and tropes to communicate with the audience.
Monaco suggests that while an educated approach is not necessary to watch and
understand a film, the metaphor of language is a useful one to describe film as a
phenomenon. Even though everyone can see pictures or moving images, not everyone will
interpret them in the same fashion. There are always differences in perception and cultural
interpretation. Film resembles language in this respect: only someone who is familiar with a
language can hear the sound of a word and associate it with what that word stands for.
Moreover, even in a shared language, words may generate different associations in the
minds of people, depending upon factors such as level of literacy, cultural and social status,
and even imagination and creativity. This is a quality that film shares with language even though
it is primarily an optical medium.
Monaco uses a few examples of optical illusions to expand on the optics and
physiological functions that make it possible to read and interpret any kind of visual.
Therefore, whether it’s a pattern of lines or a dot or a photograph, these patterns are
registered and interpreted by the human brain. However, what we perceive in optical illusions
(i.e. the shapes and patterns that we see) or interpret in any kind of image, depends on our
mental and cultural experiences. Another example that Monaco gives to explain the
physiology of perception is through a comparison of how we listen and how we see; while the
ear can hear everything in its immediate surroundings, in gradations of volume, the eye
functions by focusing on what it chooses to see in its field of vision.
Other than this physiological function of seeing, Monaco suggests the ethnographical
and the psychological as two other ways in which individuals might interpret the
images that they see. An ethnographical approach is often used by relatively more literate and
sophisticated viewers, who interpret images based on their knowledge and experience of
different kinds of cultural conventions. The psychological approach is used by viewers who
are able to incorporate various sets of meanings, observe through different perspectives, and
assimilate them into their experience. Therefore, Monaco opines that while just about anyone
can see a film, there are select and sophisticated viewers who go further to read and
comprehend films ethnographically and psychologically.
In order to enjoy literature in any language, one needs to know how to read in that
language, but cinema makes no such demands and can be seen by everyone. However, its
visual information is organized like any other language system and lends itself to
interpretation once its codes are identified. Audiences are able to read cinema as they become
familiar with these codes and signs.
Film as Language
The sign of cinema is a short-circuit sign (158).
James Monaco draws on the film semiotician Christian Metz’s observation that “it is not
because the cinema is language that it can tell such fine stories, but rather it has become
language because it has told such fine stories.” In semiotics, every language, in its spoken
and written forms, is an individual system, constituted by its own structures and signs,
through which communication takes place. A sign is made up of the signifier and the
signified. Take any word, like “cat”: the letters c-a-t, or the sound of the word, form the
signifier, and what it represents is the signified, that is, the idea or shape or form of a small
furry animal. In literature, this relationship between the signifier and signified creates much
of the pleasure of reading. As Monaco elaborates, when a poet composes poetry, the words or
the sounds of the words (signifiers) evoke the idea or the meaning of the words (signified),
such that when we read the words, we understand and imagine what they represent. Therefore,
if a poet tells us about a rose, the rose that each reader imagines is likely to be
different: seeing the word “r-o-s-e,” one person might think of a red rose in his
garden, another might think of a rose in a vase in the corner of her house, yet another might
think of a picture of a rose, and so on. Monaco says that writers engage readers with this
dance between the signifier and the signified, since the sounds of the words unleash multiple
possibilities of signification, which readers can then imagine and interpret, while writers can
utilize the same to refine their craft.
For Monaco, this “relationship between signifier and the signified is the main locus of
art” and the “power of language systems is that there is a great difference between the
signifier and the signified” (158). However, film doesn’t enjoy such a privilege. In film, the
signifier and the signified are almost identical, which means that if a book is shown in a film,
you not only know what is being shown but exactly which book is being shown; its title,
cover, colour are all details that are likely to be visible. That makes “the sign in cinema a
short-circuit sign” (158). In contrast, the letters b-o-o-k don’t resemble the object that is a
book in any way but reading it or hearing the word we know what it refers to. This is because
cinema is constituted of images and images have a more direct relationship with what they
signify, compared to written or spoken words. Another aspect specific to film is that, as viewers, we have
no control over the images: we cannot modify or change what is already on screen
or what we are seeing. Yet, as viewers, we need to interpret what we see, like a language.
Even the images that are put in a sequence signify more than just the objects or people or
places shown. Therefore, the more we learn to read images and observe films, the more we
can intellectually unpack them, understand meanings beyond the obvious, and read into their
cultural significations.
How does semiotics work for film? James Monaco cautions against the crude application
of linguistic theories to the signs and structures of film. Film bears some crucial differences
from other language systems. For instance, while a word can be taken as the fundamental,
smallest unit that constitutes spoken/written language, parts of a film cannot be so easily
broken down into such small, quantifiable units.
Early film theory attempted to equate the shot with a word, a scene with a sentence,
and a sequence with a paragraph. But this was a flawed project. Unlike any written word, a
shot is both an image and a unit of time. For example, think of the word ‘painting’: the
letters combined give a single unit of meaning that signifies a framed drawing or artwork.
Just seeing this word or hearing it, we understand what it is, but we still don’t know which
painting, by whom, or where it is placed. The complete sentence in which it appears will
provide that information.

Figure 1: Dreams (1990) Dir. Akira Kurosawa. This shot establishes that the man is looking at the
portrait painting of Vincent Van Gogh. The focus is on the painting, so our eyes are drawn towards
it. We are looking at the same thing that the character is looking at, but our
perspective as spectators is that of looking over his shoulder.
However, in a film, a shot of a painting could be of someone looking at a painting, framed in
such a way that we see a person enter the frame and stare at the painting in front of him. We
only see his back (See Fig. 1). In the same shot, with the viewer watching his back, he may
move closer to the painting or keep standing still. We know he is looking at the painting and
we are looking along with him. Therefore, a shot is about the frame within which an image is
captured, that is, what is inside the rectangular outline of the camera and the screen. Most
significantly, it is also about capturing time, in this case the time taken to establish that the
person is looking at a specific painting.
This same activity, of looking at a painting, when filmed differently, generates an
entirely different feeling and context. Below, we have a shot in which we see a woman sitting
in front of a painting (See Figs. 2-7), which then cuts to details of her surroundings, and with
each detail, a closer shot of a select portion of the painting follows. The camera focuses on
parts of the painting, leaving the woman out of its frame, and with each cut, it frames another
detail in her surroundings as well as in the painting, perhaps a clue towards the next scene. With
a change in frame, shot sequence and camera movement, the meaning of this setting changes
from merely a woman observing a painting to us piecing together details of why she is there.
Thus, features of film like the shot, frame or sequence are medium-specific,
incorporating time and movement: each shot contains many visual signs and is itself a construction in
time and space.

Figure 2: Vertigo (1958) Dir. Alfred Hitchcock. A woman sits in front of a large portrait, with a
bouquet next to her. Figure 3: Close-up of the bouquet.

Figure 4, 5: The mid-shot of the bouquet in the painting, followed by a close-up.

Figure 6: The next shot cuts to another mid-shot, this time focusing on how her hair is rolled and
pinned up. Figure 7: Cut to the shot of the painting again. This time we see a mid-close-up of the
painted face and the slender shoulders, and notice that the hair is tied in the same rolled-up fashion. The entire
sequence is established through visual codes of “seeing”: each shot offers new information and each
shift in perspective on the painting reveals a new clue. In the full sequence, we see that the woman
sitting on the bench is being watched by someone. Just as she is reading the painting, someone is
reading all the details in her surroundings to know who she is. See the full sequence here:
https://www.youtube.com/watch?v=d-kcczAff40&t=64s

Film is not constituted of such singular and neat units of meaning. Similarly, comparing a
scene to a sentence or a sequence to a paragraph doesn’t quite work either because, unlike
written/spoken languages, film has no fixed grammar for constructing a scene or a sequence.
Scenes and sequences are developed according to the choices that the filmmakers make, the
practices that are common in the industry, the skills of performers, and film genres, among several
other stylistic and cultural choices.
For example, a dramatic scene between a couple can be framed in a long-shot where the
couple is present inside a spacious dining room (See Fig.8). We see them along the horizontal
axis, like a theatre stage, and the camera gradually moves towards the closely seated couple.
The shots change to frame them individually, at two ends of the table, talking to each other.
With each change in shot, the couple is slightly older and looks richer. Soon, in this
shot-reverse-shot exchange, their conversation turns from disappointment to quarrels and finally
stops at a mid-shot of the man’s face, as we see them now as an aged couple harboring only
resentment towards each other.

Figure 8: Citizen Kane (1941) Dir. Orson Welles. The opening of this sequence is a mid-shot of the
newly married couple sitting at the dining table, fondly gazing at each other.

Figure 9: By the end of this sequence, the camera pulls back to frame the distance between the couple,
disinterested and resentful towards each other. It’s a sharp contrast to how their marriage had begun.
The long shot establishes this distance while the dark shadows in the frame indicate an unhappy
marriage. See the full sequence here: https://www.youtube.com/watch?v=Rfl2M8B9WA8

The scene encompasses the passage of time and ends with a dramatic pause as something
emotionally shattering is said in this heated conversation. Both remain present within the
frame and the camera pulls back, ending in a long shot that establishes the distance between
them.
In film, interpersonal dramatic context is often built up by the shot-reverse-shot
convention, where the shots alternate between the actors as they speak, emote and react to the
narrative context. As more close-ups and mid-shots are chosen to capture any one-on-one
conversation, the rapid intercutting accelerates the rhythm of the sequence, enhancing the
confrontational nature of such a dramatic episode. Such scenes can be sequences or even part
of a continuing narrative sequence within the film. Here it is a tightly bound sequence that
shows time lapse.
Sequences, with the combination of several shots and scenes, build the continuity and
pace of the film. For example, think of chase sequences in action movies which usually
involve alternating parallel actions; one which follows the chasers and another which follows
the one who is being chased. It usually ends with both the parallel actions converging at one
point, in which either the chased will escape or the chaser will win. Therefore, while film can
be compared to written language, it functions very differently. Rather than providing us with
a smallest unit of composition, film offers a “continuum of meaning” (160).
Check your progress
i) How does film function as a language?
ii) Film does not have any specific rules of grammar but has developed certain dramatic
conventions. Can you identify such conventions?
iii) Distinguish between frame, shot and sequence. How are these features related to
space, time and movement in film?
Denotative and Connotative Meanings
In the earlier section, Monaco identified certain features of film like frame, shot and
sequence, movement and time, that are very specific to the medium, thereby establishing
its difference from other language systems. In this section, he focuses on illustrating how
film functions as a language. Film, like any other written/spoken language, communicates
meaning through its denotative and connotative potential. As an image-based medium, film is
strongly denotative. As an audio-visual form it also creates the closest approximation of
reality. In written languages, the letters chosen to represent an object, person or
place don’t have any similarity to what is represented, whereas in cinema the denotative aspect is
stronger because of how closely the image on screen resembles what it represents. Languages primarily
produce meaning through connotative associations, expanding beyond the denotative aspects
of signs or words.
Cinema, remarkably, combines both its intrinsically denotative abilities and its connotative
potential. All other art forms that create connotative associations can be replicated in film.
Besides drawing from other arts, film can generate its own connotative meanings because
film is ultimately a cultural product. Thus, anything that is represented within a film’s
narrative has meaning, not only within that context, but also acquires meaning according to
how it resonates outside the narrative.
How is cinematic connotation created? What are the tools at cinema’s disposal to create
such meaning? Film’s unique connotative ability depends primarily on the shots that are
chosen by the filmmaker. Monaco uses the example of filming a rose to explain how
cinematic connotation works. Think about it: how you feel about a rose on screen will, first,
depend on what kind of rose is shown – is it fresh or dried up? Does it have thorns? Is it a
bright colour? The second aspect that influences the meaning of the rose will be how it is
framed – is it shown from a lower angle such that the flower looms and looks ominous or is it
shown from above where its colours and petals are distinctly visible? The lighting of the shot,
whether the rose is in a brightly or dimly lit setting, will also add to the quality and tone of the
shot. These tools, specific to film, aid in cinematic connotation and contribute to the
precision or efficiency of the form.
Monaco suggests that there are two kinds of cinematic connotation. The first is
paradigmatic connotation, where the sense of the connotation depends on the shot that has
been selected from a range of other possible shots of the same thing. The second is syntagmatic
connotation, where the sense of the shot is determined by which shot has preceded it and
which shot follows. We can look back at the above example of filming a rose to elaborate on this.
Paradigmatic connotations depend on which shot is chosen by a filmmaker to frame and
represent any object or place or person. Therefore, if the rose is filmed in a low-angle shot,
it acquires the meaning of being dominant or overpowering. Syntagmatic connotations are
determined not by what is within a particular shot but by the shots that precede and follow it.
Paradigmatic and syntagmatic connotations in cinema offer two axes of meaning and are
extremely useful for understanding the language of film.
Once a filmmaker has decided what to show, two main questions remain. The first is how to
shoot it, which is the paradigmatic aspect: that is, which shot, frame and composition to choose.
The second is how to present the shot, in what order in a sequence and for how long,
which involves the processes of cutting, editing and montage; these constitute the syntagmatic
aspect.
It is important to note that connotative and denotative meanings are not mutually
exclusive categories but can exist simultaneously. In order to distinguish between connotative
and denotative meaning construction in cinema, we return to the question of signs in cinema.
Monaco borrows the trichotomy of icon, index and symbol offered by philosopher C. S. Peirce
and film scholar Peter Wollen, to analyze film for the different orders of meanings it
generates. Cinematic signs are primarily denotative and can be categorized into three orders:
• The icon, in which the signifier represents the signified through close
likeness. For example, portraits are icons because of their resemblance to what is
represented.
• The index, which presents a quality; it has an inherent relationship to what is being
represented but is not identical to it. It can further be divided into technical and
metaphorical indexes. For example, shots of ticking clocks can be classified as
technical indexes of time. In a metaphorical index, one small detail might suggest the
overall quality; for example, the shot of someone walking with a rolling gait or a limp
could indicate the kind of person being represented.
• The symbol, which is easier to recognize because it usually already exists in cultural
conventions of representation.
According to Monaco, the icon remains the short-circuit sign that is characteristic of
cinema, as it refers to the identical. A symbol as sign is more arbitrary because it is drawn
from other existing conventions and usually cinema just borrows them from art or literature.
Finally, it is the index that Monaco considers most characteristic of cinema. An index is
neither identical nor arbitrary in signification but functions somewhere midway between the
cinematic icon and the literary symbol. The index signifies the idea of something; for
example, if one needs to express the idea of heat, it could be indicated through an extreme
close up of simmering beads of sweat on a forehead or by the rising temperature levels on a
thermometer. The indexical sign lends metaphorical power to film.
Film is a language of continuum; it is constructed, not only through the composition of
the frame but also through the sequencing of shots. This ability of cinema grants it a certain
degree of flexibility in terms of what can be put into the frame. Therefore, the indexical sign
contains immense potential to both denote as well as create connotative associations within
the composition of the frame and the order of shots. It is quite like how metonymy and
synecdoche function as figures of speech in language. Metonymy is when an idea or concept
is represented by the name of something that is associated with it. In cinema, the falling
pages of a calendar could indicate the passage of time. Such an indexical sign for time is a
cinematic shorthand; it shows you, in a compact and succinct way, that time has passed
within the narrative. Similarly, indexes also function like synecdoche, wherein a part of an
object represents the whole or a larger idea. For example, a close-up of boots marching in file
suggests marching men, which further indicates the presence of an army.
Therefore, indexical images signify a particular idea or notion as much through what is not
shown, or is omitted, as through what is shown.
At the end of this section, Monaco remarks that since cinema works within limits of
space and time, which means that it is about what we see within the frame or shot and within
the duration of the film, cinema needs to work with extensions and indexes. Meaning in
cinema is generated, not only from what we see and what we don’t see but also through
associations. This last aspect, that also functions like a sign in cinema, is the element of trope.
While all the examples of image construction and signification till now have been more or
less about static images, single shots, focusing on particular objects or ideas placed within
that shot, the trope is a more dynamic element, “the connecting element between denotation
and connotation” (170). Film makes it possible, through its technical capacity, to expand
meanings through “tropes of comparison” (171).
Check your progress
i) Explain connotative and denotative meaning in cinema.
ii) What is the difference between paradigmatic and syntagmatic connotation? How are
they related to the process of shot composition and selection?
iii) Cinematic signs can be classified as icon, index and symbol. Explain these
three categories.
3. Syntax
The syntax of film is a result of its usage, not a determinant of it (172).
After describing how images, much like words, function like language systems with different
levels of signification, in the following sections, James Monaco delves into the processes of
construction of meaning in film. He elaborates upon cinematic methods and choices of image
construction. In language systems, there are rules of grammar that are essential in order to
communicate properly. Words are organized in a sentence according to correct syntax and
grammar. However, as discussed earlier, there are no such fixed rules for organizing images
and sequences in cinema. Yet, cinema, across cultures and at different points in history, has
developed new means of presenting as well as organizing images in sequences, leading to
new forms and genres of cinema.
Monaco observes that the syntax of film or the systematic arrangement of images,
sounds, movement, and space is descriptive in nature rather than prescriptive. What develops
as film syntax is “an outcome of usage or cultural practice” (172). Syntax in film relates to
the fundamental choices that are made regarding filmmaking; such as what to shoot, how to
shoot, and how much to shoot. These questions are, in turn, dependent on industrial
conventions, cultural factors, aesthetic choices, and narrative needs. But the most important
aspect affecting the construction of any film sequence is that, unlike verbal or written
narratives, cinema incorporates spatial and temporal components. Film is a moving image art
form. It is also a narrative art form. Thus, film is not restricted to the depiction of objects,
people or places but can both record as well as construct space and time within its form. Any
progression of plot, action or story in a film consists of visual construction of space and time
within that narrative.
The construction of space within a film is termed mise-en-scène, which is a term derived
from French theatre, that means “putting in the scene.” In the case of theatre, this refers to
placing props, setting up the stage, and positioning the actors, whereas in film, while there is
a set or a location where a scene is set up, the selection of shots to frame that scene modifies
exactly how much of that space you will see and in what order. Thus, what you see within the
frame or on screen, which parts of the set, which prop, at what angle, constitutes the mise-en-
scène of film.
According to Monaco, setting up a scene is as much an organization of time as of space.
If mise-en-scène refers to the organization of space, montage is the modification of time.
Montage is the act of cutting or editing the film; as an editing technique, it also
composes the experience of time in a film, consequently controlling its pace and rhythm.
Therefore, how long a particular shot or sequence should be, or when one shot should cut to
another, are questions related to montage: “Mise-en-scène and montage are principles of
organization of time and space” (174). However, film theorists have debated about the proper
function of these two aspects of cinema. Mise-en-scène has been often seen as leading
towards realism while montage has been seen as expressionistic. Since the function of mise-
en-scène elements is to construct or frame space in a way that reflects some likeness to the
actual world, it is often seen as an essential practice of cinematic realism. In other words,
mise-en-scène in cinema follows the principle of verisimilitude. In contrast, montage
manipulates time and space such that it can contract or expand the experience of narrative
time. If a rapidly cut sequence can create a sense of urgency, then careful cross-cutting in a
dramatic dialogue sequence between two characters can enhance conflict and build upon the
emotional impact of the scene. Therefore, because of the relationship between shots
developed through cross-cutting, montage generates meaning by contrast and is often seen as
an expressionistic technique.
Monaco observes that while there have been debates in film theory regarding which
practice is a truer or more ethical method of representation in film, this binary between
montage and mise-en-scène creates a separation between time and space that is not particularly
useful. While mise-en-scène establishes narrative space, setting and context, montage creates
the experience of time within the narrative. Both are essential in organizing space and time in
film.
Codes
The structure of cinema depends on the codes that operate within it as well as the codes that it
follows. These are not fixed systems of meaning but systems of logical relationships within
texts, formed by practice and convention. Codes can be cultural as well as artistic,
something that cinema might share with other forms. But there are some codes that are
unique to cinema. For instance, gestures or facial expressions can convey similar meanings in
theatre as well as film. They hold communicative and expressive value in both forms.
However, if the shot of a gesture is rapidly intercut with the shot of a screaming face, then
such a montage structures the action or suspense within the film. This way, it builds upon
methods of narration or creates codes that audiences learn to identify, interpret, and even
anticipate.
Codes are critical constructions that structure film as well as build an interpretive
relationship between the audience and the cinematic text. Mise-en-scène and montage in
cinema function as codes. For example, think of specific film genres, such as action, horror,
or romantic comedy. In each of these genres, you can identify visual codes like colour,
texture, lighting: in short, the mise-en-scène elements that signal to you what to expect in
each genre. The horror film is likely to have a darker colour scheme than the romantic
comedy. Similarly, action films tend to draw upon spatial and temporal cinematic codes,
dependent on montage. Action films have established conventions of wide shots of
landscapes, automobiles, technology and speed, which are codes that make audiences
anticipate high-speed chases as well as intense fight sequences. Therefore, codes establish
the structure and communicate the logical relationships among various cinematic elements.
3.1 Mise-en-scène
“What to shoot? How to shoot? How to present the shot?” (179)
In the next three sections, Monaco delineates the actual elements and features that constitute
the cinematic mise-en-scène. According to Monaco, mise-en-scène in cinema is the tool that
filmmakers use to establish, change, subvert and challenge codes that we learn to interpret
while watching films. The elements of mise-en-scène are also the crucial tools at any
filmmaker’s disposal to find answers to the three most important questions they face
while making a film: “What to shoot? How to shoot? How to present the shot?” (179).
The answers to these questions are not arbitrary individual choices but a host of detailed
methods of shot composition and shot selection. All the codes of film operate within the
frame. Here we look at how the mise-en-scène constructs those codes of meaning. There are
three elements of the cinematic mise-en-scène that Monaco discusses.
(a) The Framed Image
If mise-en-scène means putting into the scene, then in cinema a scene is constructed as a
series of shots, and the shots are constituted by the composition within the frames. The limits
and possibilities of representation in cinema play out within the boundaries of this frame.
While image composition within the frame has been a preoccupation even in the classical
visual arts, in cinema there is also the matter of time and movement that usually shifts from
one frame to the other. Nevertheless, all framed images or compositions are underpinned by a
set of common concerns, as listed by Rudolf Arnheim, such as balance, shape, form,
growth, space, light, colour, movement, tension and expression. Now let’s look at how these
factors of composition appear in cinema.
Aspect Ratio: The aspect ratio is the ratio between the width and height of the screen upon
which a film is projected. While the rectangular shape of the frame or the screen sets the
limits of representation, the aperture of the film camera and dimensions of the screen offer
different possibilities for composition.

Figure 10: For a Few Dollars More (1965) Dir. Sergio Leone. An extreme long shot that establishes the
landscape and the characters within it. This is a well-known film in the Western genre, which typically utilized
the widescreen aspect ratio in its shot compositions.

Figure 11: Even when the perspective shifts, the shot is reversed, and we see the other end of the
previous shot, the spatial terrain is distinctly visible in the widescreen aspect ratio. These men are
framed in the vastness of the desert landscape, symbolizing a certain degree of loneliness. See the full
sequence here https://www.youtube.com/watch?v=0JPnR7C8mZQ

Monaco explains that when the standard aspect ratio in the 1940s was 1.33, films contained
more interiors, faces, and dialogue. As the frame became wider and the aspect ratio
increased to 2.33 and above, this change ushered in widescreen formats such as CinemaScope
and Panavision, eventually resulting in the visual composition shifting to exteriors and
landscapes (See Fig. 11).
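
To make the arithmetic behind these ratios concrete, here is a minimal Python sketch. It is an illustration only and does not come from Monaco: the function names and the frame height of 300 units are hypothetical, and only the ratios 1.33 and 2.33 are taken from the paragraph above.

```python
def frame_width(aspect_ratio: float, height: float) -> float:
    """Width of a frame with the given aspect ratio (width / height) and height."""
    return aspect_ratio * height


def relative_widening(old_ratio: float, new_ratio: float) -> float:
    """Fraction by which the new frame is wider than the old one at equal height."""
    return new_ratio / old_ratio - 1.0


if __name__ == "__main__":
    height = 300.0  # hypothetical frame height, in arbitrary units
    for label, ratio in [("standard", 1.33), ("widescreen", 2.33)]:
        print(f"{label}: {frame_width(ratio, height):.0f} x {height:.0f}")
    print(f"widening: {relative_widening(1.33, 2.33):.0%}")
```

At equal height, the wider frame offers roughly 75 percent more horizontal space, which is one way to see why composition could shift towards exteriors and landscapes.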

Open and Closed Forms: These forms depend on what kind of movement is organized or
allowed within the frame. If the scene is such that the frame is filled with the dramatic close-
up of a face, that would be a closed form, since the drama of the scene is communicated
through the facial expressions and no other visual information is in the picture. An open form
is when the framing of the shot is such that we expect an action to occur within it. (See
Fig. 10-11.)
Planes of Composition: This refers to the axis along which the composition within a frame
is organized. There are three planes: frame, geographical and depth. To put it simply, this is
about what is put in the foreground and the background and what kind of depth of space is
constructed. For example, if in a shot of an interior, the characters/actors and props are
positioned entirely along the horizontal axis then one doesn’t get much sense of how large or
small the room is. However, if the actors are positioned in an angular fashion such that one is
in the foreground, another in the background, the depth of the space is established.
Colour: Colour is an extremely significant element of composition that creates balance or
contrast within a frame. It can distinguish the spatial plane between the elements in the
background and foreground. It can attach coded meaning to a subject or an object. It can also
function as a trope that recurs through the film narrative (See Fig. 12 & 13). Colour is both a
constituent of the chemical processes or technological features of film as well as a culturally
defined code.

Figure 12: Singin’ in the Rain (1952) Dir. Gene Kelly, Stanley Donen. This is the titular song from the
celebrated 50s musical, in which the lead character is so happy because he has fallen in love that,
despite the heavy rain, he sings and dances about the city streets expressing his happiness. While
everything else is wet and grey, the bright red letter box is the only sign of any bright
colour. The letter box is the lone bright spot on a gloomy day, quite like our hero who is the only one
in such a joyful mood in the dark, empty streets drenched in rain.

Figure 13: Kabhi Alvida Naa Kehna (2006) Dir. Karan Johar. Romance films typically use colour to
signify the moods and emotions of their characters. In this song sequence, the couple’s feelings
of being in love are codified in the different bright colours that everyone around them is wearing. It is as if
their inner emotions have coloured the entire city.

Lights: Quite like colour and lines within frame composition, light also establishes depth and
meaning. While in classical visual arts, light was represented through the logical relationship
of the subject to the light source, in cinema light enhances the visual and cultural codes. Light
also has a specific relationship to the photochemical process of film.

Figures 14, 15: The Dark Knight (2008) Dir. Christopher Nolan. In the film, every time Bruce Wayne
needs to take on the guise of his vigilante alter ego, Batman, in order to deal with the criminal crisis in
the city, he is framed in very dimly lit settings. In the first figure, as he sits in the dark room, only his
profile is visible in silhouette, while there is still light outside. This contrast signifies his state of
mind. In the next figure, he leaves to go out into the city, but even now the light from
above and behind casts shadows on his face. The themes of light and dark are important elements in
the Batman films since he is a protagonist marked by personal trauma and loss. He is also a reluctant
hero; therefore, the low and high contrast lighting signifies his troubled persona. (For a detailed
illustration of these mise-en-scène elements see e-resource no. 3.)

However, we are interested in light as an element of the mise-en-scène, where light attaches
meaning to a scene. For instance, overhead lighting creates a dominating presence, while
lighting from below makes the subject look ominous. Conventionally, in the Hollywood
tradition, the “key light” and “fill lights” were used to create a sense of depth within a shot
and to highlight or foreshadow a character’s emotion and subjectivity.

(b) The Diachronic Shot
In the previous section, Monaco talked about the major film codes operating within a static
film frame. Here, he identifies and discusses, in detail, the variables of a dynamic or
diachronic shot: “distance, focus, angle, movement, and point-of-view” (195).
Monaco underlines not only the kinds of shots but also the compositional and conceptual
features of the shot that make it one of the fundamental features of the cinematic mise-en-scène.
Shots are usually described according to the perceived distance between the camera
and the subject, such as the full shot, mid-shot, long shot, extreme long shot and close-up.
Camera angles are integral in establishing depth, space and perception in a shot. They
also define the relationship between the subject and the filmmaker, depending on whether the
camera is overhead, at eye-level, or low-level. Along with the syntax or arrangement of
distance inside the frame, the other compositional feature of the shot is focus, namely deep
focus, shallow focus and soft focus. Focus in a shot is variable. It can change from one shot to
another or within the same shot but, most importantly, it is essential for
maintaining the continuum between shots. Soft focus is usually used to qualify the role and
emotions of a favourable character and set the emotional tone or mood of a sequence. Deep
focus is the technique in which, within a shot, the entire space is kept in focus, such that the
background is seen with as much clarity as the foreground.
Movement is another important variable in a diachronic shot. It is important to remember
that, in film, each shot anticipates the next one. A shot is not only based on the coordinates of
distance; movement is also integral to the film camera. Therefore, the camera can frame while
moving from one point to another, as in the tracking shot. Further, it can also convey an
impression of movement, without actually moving, as in the zoom. In a tracking shot, the
camera moves along one axis, so that our perception of depth shifts as the distance between
the objects on screen shifts. In a zoom shot, spatial relationships between the objects don’t
change, but the image becomes larger and we see the subject up close. (For a detailed
discussion on shot composition see e-resource no. 4.)
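
The difference between tracking and zooming can be sketched numerically. The following Python example is an illustration that does not come from Monaco: it assumes a simplified pinhole-camera model in which an object's size on screen is proportional to focal length divided by distance, and all names and numbers in it are hypothetical.

```python
def screen_size(object_size: float, distance: float, focal_length: float) -> float:
    """Apparent size under an assumed pinhole model: focal_length * size / distance."""
    return focal_length * object_size / distance


def size_ratio(near_distance: float, far_distance: float, focal_length: float) -> float:
    """Ratio of the apparent sizes of two equally sized objects, near over far."""
    near = screen_size(1.0, near_distance, focal_length)
    far = screen_size(1.0, far_distance, focal_length)
    return near / far


# Hypothetical scene: two equally sized objects, 4 and 10 units from the camera.
original = size_ratio(4.0, 10.0, focal_length=1.0)

# Zoom: the camera stays where it is and the focal length doubles.
zoomed = size_ratio(4.0, 10.0, focal_length=2.0)

# Tracking: the camera moves 2 units forward, so both distances shrink by 2.
tracked = size_ratio(4.0 - 2.0, 10.0 - 2.0, focal_length=1.0)

print(original, zoomed, tracked)
```

Under these assumptions the zoom enlarges both objects by the same factor, so their relative sizes stay the same, whereas the tracking movement changes the ratio between them. That change in ratio corresponds to the shift in perceived depth described above.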
Monaco also touches upon the point of view in cinema, which he considers the fifth
variable in a diachronic shot. He points out how Hollywood has an established idiom in this
respect, similar to prose narratives. Shot choices, such as long shots at the beginning,
quickly establish an omniscient point of view in the narrative.
(c) Sound
Borrowing from Christian Metz’s different channels of information that cinema provides,
Monaco identifies speech, music and noise (sound effects) as three entities of sound in film.
Sound has unique features because it is not only omnipresent but also omnidirectional. What
Monaco suggests is that sound is so pervasive that it often goes unnoticed.
In film, the soundtrack is where sound is controlled and constructed. Speech or dialogues
easily draw attention to the action but sound effects need to be inserted at the required places.
The soundtrack also generates codes between sound and image. Siegfried Kracauer has
divided film sound into “actual” sound, which connects to the image, and “commentative”
sound, which is external. Another division, from Karel Reisz, is more commonly used in film
theory: that between synchronous and asynchronous sound. Synchronous sound is that which comes
from a source in the scene, such as dialogue between characters or the sound of footsteps.
Asynchronous sound is not from a source within the scene but added to the soundtrack, like
musical scores and sound effects. It is outside the frame and the narrative but adds to the
mood, emotions and even direction of the film. Sometimes, asynchronous sound is also used
as contrapuntal sound which offers a counterpoint to the image and action on screen.
3.2 Montage
In Europe, the act of cutting or editing a film was termed montage. In the 1920s, the practice
of editing that was developed by early European filmmakers, in styles like German
Expressionism and Soviet Montage, built on the idea of editing as synthesis, i.e. putting
things together. Film was understood as a constructed form, based on the concept of putting
shots together one after the other. Monaco explains that, technically, shots can be put together
in only two ways: by overlapping, as in superimposition or dissolve, or by placing them end
to end.
Montage, as both practice and theory of editing, developed and expanded on the
possibilities of putting shots together. First, montage was viewed as a dialectical process
wherein, after the juxtaposition of two adjacent shots, a third meaning emerged - one that
might not be there in the two individual shots. Second, montage was understood as a process
that wove together multiple short shots within a rapid span of time; thus, not only packing in
a lot of information within a short duration sequence but also adding sensation and rhythm to
the filmic context.
Montage was the opposite of the editing practice followed in Hollywood, which
emphasized “invisible cutting,” to keep audiences from noticing the cut. Maintaining a strict
logic of setting, space and time within the narrative, Hollywood grammar insisted on
seamless change from one shot to another, so that the focus of the audience remained on the
narrative action. In contrast, the montage conventions were based on rapid cutting (even
between still shots and short duration shots) and repetition, to establish the nature of action
that was taking place. Montage followed continuity between shots only when building an
accelerated pace for a sequence. Parallel montage was used by filmmakers to alternate
between two stories or two spaces of action, cross-cutting between them to enhance dramatic
conflict. However, montage gained attention primarily for its juxtaposition of shots from
different spaces, times, and contexts. Montage did not place shots in chronological order:
instead, it created another meaning altogether through repetition and juxtaposition. Montage
techniques either compressed or expanded the experience of time in films. If the continuity
style wanted to make the editing process invisible to audiences, montage wanted to
demonstrate it. Montage was geared towards unsettling the audience or producing a shock
effect rather than seamless submission to the narrative. (For a detailed illustration of these
techniques see e-resource no. 5.)
Monaco concludes his detailed observations on the syntax of mise-en-scène and montage
with a quick comment on punctuation in cinema, that is, how the endings of shots or
sequences are typically marked. In early and silent cinema, “intertitles” were inserted
to provide brief information when moving from one scene to another, and the
“The End” title card worked as a full stop. Moreover, there were also cinematic
practices like the fade, dissolve, iris, or wipe effects that were stylized ways of transitioning
from one sequence to another. Such stylistic choices marked or announced the transition or
the end of one spatial sequence and the beginning of another. Today, the dominant
convention is to just cut to the next scene or, in case of the ending, fade out or freeze the
frame.
4. Conclusion
James Monaco’s essay is an overview of the codes and practices of making meaning in
cinema. It observes cinema as developing its own language system, with unique signs,
symbols, and icons, which audiences, whether consciously or unconsciously, always read to make
sense of a film. While, typically, audiences tend to pursue the story of a film, this essay
explains how that narrative is constructed through visual and cinematic elements such as
framing, shot composition, movement, editing, and soundtrack. These aspects constitute the
language and syntax of cinema, which are deployed by filmmakers to craft their characters
and narrate their films.
Monaco provides an illustrative survey of the fundamental cinematic tools and processes
as they developed in Western film cultures. The sections on mise-en-scène and montage
identify elements that are specific to cinema as a medium and form. Finally, depending upon
the cultural practices of filmmaking and shifts in film technology, conventions and stylistic
practices change, and the language of cinema also transforms.
Check your progress
i) What is mise-en-scène in cinema?
ii) Explain how cinematic mise-en-scène elements such as framing, light, and colour act
as codes of meaning.
iii) What is a diachronic shot? Mention some variables of the diachronic shot.
iv) What is montage? How does it combine space, time and movement in cinema?
Reading List
Corrigan, Timothy and Patricia White. The Film Experience. Bedford/St. Martin’s, Boston,
New York, 2012.
Dix, Andrew. Beginning Film Studies. Manchester University Press, 2008.

Additional E-resources for:
Setting, Props, Costume, Lighting, Colour, Space https://www.youtube.com/watch?v=clBT7O3A3wI
Frame, Shots and Composition https://www.youtube.com/watch?v=InfcMwcSG3g
Continuity Editing, Montage, Jump Cut, Fade https://www.youtube.com/watch?v=NUryfkLSwfM
