
Erik Smitterberg (erik.smitterberg@engelska.uu.se) Master’s Programme: OE and ME
Dept. of English, Uppsala University Autumn/Fall Term 2010

Introduction to Phonetics and Phonology1


Some of the symbols and terms in Baker (2007) and Horobin and Smith (2002) may be
unfamiliar to students who have limited experience of phonetics, i.e. THE SCIENTIFIC STUDY OF
SPEECH SOUNDS IN LANGUAGES, and phonology, i.e. THE SCIENTIFIC STUDY OF SOUND SYSTEMS OF
LANGUAGES. This compendium is intended to help such students to make sense of the account
in the textbooks. I first introduce the descriptive apparatus and the symbols used to discuss
speech sounds in writing, and then address the basic concepts of notation, phonemes,
allophones, syllables, and stress. The account focuses on matters that are relevant to the
production of English speech sounds.

1 The Description of Speech Sounds


In order to describe variation and change in speech sounds clearly in writing, it is necessary
to be able to represent sounds using written symbols that mean the same thing to all
readers. This section outlines the use of such phonetic symbols, and the different
characteristics of speech sounds that these symbols represent. Speech sounds fall into two
major categories: vowels and consonants. I will describe these in turn, starting with vowels.
It is important to remember that, when vowels and consonants are discussed
phonetically, it is the sound that matters, not the spelling; for instance, in a word like
through, the two vowel letters <o> and <u> as well as the two consonant letters <g> and <h>
represent the single vowel sound [uː]. As is common practice in linguistics, linguistic forms
will appear in italics in this compendium, and <angle brackets> will be used to draw
attention specifically to written forms, when necessary. [Square brackets] will be used
around exact descriptions of speech sounds, and /slashes/ will be used around the sound
units known as phonemes (see section 2.1).

1.1 Vowels
All vowel sounds meet two criteria:
1. When they are produced, THE AIR FLOWING THROUGH THE ORAL CAVITY IS RELATIVELY
UNIMPEDED.
2. THEY CAN FORM THE NUCLEUS OF A SYLLABLE (see section 2.2.1).
The first criterion is the more important one for our purposes. It means that there is no
closure anywhere that is tight enough to produce audible friction; instead, which vowel is
produced depends on the shape of the articulatory organs. In English, the tongue and the
lips are of particular importance.
There are two main types of vowel in English. The first kind of vowel is “simple” or
“pure” in that the tongue, lips, and other articulatory organs stay in the same place from
the moment the vowel begins to be produced until the next speech sound starts. Such
simple vowels are called monophthongs. English monophthongs can basically be described
using four features:
1. Openness. This feature concerns THE VERTICAL POSITION OF THE TONGUE IN THE ORAL CAVITY:
close (the tongue almost makes contact with the roof of the mouth), mid-close, mid,
mid-open, or open (the tongue is at the bottom of the oral cavity).2 In some sources,
this feature is called height, and the vowels are then referred to as being high,
mid-high, mid, mid-low, and low instead.

1 I am grateful to Gregory Garretson for his comments on a draft version of this compendium.

2. Backness. This feature concerns THE HORIZONTAL POSITION OF THE TONGUE IN THE ORAL
CAVITY: front (the highest part of the tongue is close to the teeth), central, or back (the
highest part of the tongue is close to the pharynx).
3. Rounding. This feature has to do with WHETHER OR NOT THE LIPS ARE ROUNDED WHEN THE
VOWEL IS PRODUCED: rounded or unrounded.
4. Quantity. This feature concerns THE LENGTH OF THE VOWEL SOUND: long or short. Long
vowels are indicated by the phonetic colon sign <ː>. This feature is called duration
(and occasionally length) in Baker (2007); Horobin and Smith (2002) use the term
quantity.
In some languages, e.g. French, nasality – whether or not the air passes through the nasal
cavity – is also an important criterion. But this feature is of lesser importance in English.
The other major type of vowel is called a diphthong. In a diphthong, the tongue – as
well as the lips, in some cases – moves from one position to another while the vowel is being
produced. Diphthongs are given a double phonetic symbol that combines the symbol for the
starting position of the articulatory organs and the symbol for the final position of the
articulatory organs; for instance, the symbol for the vowel sound in Present-day English my
is [aɪ], which indicates that the articulatory organs start in the position for [a] and move
into position for [ɪ] as the vowel is being produced.
In Old English, diphthongs could be either long (i.e. equivalent to a long
monophthong in length) or short (i.e. equivalent to a short monophthong in length). To
distinguish these two categories, long diphthongs are given a [ː] after the first vowel symbol
(because the first sound was more prominent in Old English diphthongs), e.g. long [eːo] vs.
short [eo]. In Middle English, in contrast, all diphthongs are equivalent to a long
monophthong in length, so no length marks are used for Middle English diphthongs.3

1.2 Consonants
In contrast to vowels, English consonants meet at least one of the two following criteria:
1. THEY ARE PRODUCED BY BLOCKING OR RESTRICTING THE AIRSTREAM THROUGH THE VOCAL TRACT.
2. THEY CANNOT FORM THE NUCLEUS OF SYLLABLES.
As with vowels, the first criterion is the more important one for our purposes.
Consonants in English can be described using three features:
1. Voicing, i.e. WHETHER THE VOCAL FOLDS ARE PULLED APART OR BROUGHT TOGETHER. A
consonant sound is thus either unvoiced (the vocal folds are pulled apart) or voiced
(the vocal folds are brought together). (In contrast, all vowels are typically voiced.)
Some sources use the term voiceless instead of unvoiced.
2. Place of articulation, i.e. WHERE THE AIRSTREAM IS MODIFIED. We chiefly need to include
the following categories:
a. Bilabial. The upper and lower lip are involved.
b. Labiodental. The upper teeth and lower lip are involved.

2 Note that it is not merely the position of the tip of the tongue that is relevant to the distinction between
close and open vowels; for instance, to produce the close vowel [u], it is the body of the tongue that is raised
towards the roof of the mouth.
3 This is a slightly inconsistent notation, in that a symbol like Old English [eo] indicates a diphthong equivalent
to a short monophthong in length, while a symbol like Middle English [ɔɪ] indicates a diphthong equivalent to
a long monophthong in length. However, these notations are used in virtually every book on the history of
English, so it is necessary to get used to the inconsistency.

c. Dental. The tip of the tongue and the front teeth are involved.
d. Alveolar. The tip of the tongue and the alveolum – the ridge behind the front
teeth – are involved.
e. Postalveolar. The front of the tongue and the sharply rising surface between
the alveolum and the hard palate are involved.
f. Palatal. The front of the tongue and the hard palate are involved.
g. Velar. The back of the tongue and the velum – the soft palate – are involved.
h. Glottal. The vocal folds in the glottis are involved.
3. Manner of articulation, i.e. HOW THE AIRSTREAM IS MODIFIED. In English, the most
important categories are as follows:
a. Stops. Stops are produced by completely closing off the airflow through the
mouth, and then releasing it.
b. Fricatives. Fricatives are produced by continuous airflow through a narrow
opening in the vocal tract, which produces audible friction. These sounds are
called spirants in Baker (2007); I follow Horobin and Smith (2002) and use the
more frequent term fricatives in this compendium.
c. Affricates. Affricates represent a combination of a stop and a following
fricative, but count as one single sound.
d. Nasals. Like stops, nasals are produced by sealing off the oral cavity at a
specific place, which modifies the sound; but the velum is lowered so that the
air escapes through the nose instead. Nasals are typically voiced in English.
e. Approximants. Approximants are produced with little obstruction of the
airflow. They can be subdivided into two groups.
i. Liquids include sounds like English [l] and (in most varieties) [r].
ii. Semivowels in English comprise the sounds [j] and [w]. Semivowels
are produced in the same way as vowels are, i.e. with very little
obstruction of the airflow. However, they count as consonants
because they cannot form the nucleus of a syllable.
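
The three-feature description above can be sketched as a small lookup table. This is only an illustration: the names Consonant, CONSONANTS, and differing_features are hypothetical, and the table covers a handful of consonants rather than the full English inventory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consonant:
    symbol: str
    voiced: bool
    place: str
    manner: str

    def describe(self):
        # Combine the three features into the usual descriptive label.
        voicing = "voiced" if self.voiced else "unvoiced"
        return f"{voicing} {self.place} {self.manner}"


# A small illustrative sample, not a complete inventory.
CONSONANTS = {
    c.symbol: c
    for c in [
        Consonant("p", False, "bilabial", "stop"),
        Consonant("b", True, "bilabial", "stop"),
        Consonant("m", True, "bilabial", "nasal"),
        Consonant("f", False, "labiodental", "fricative"),
        Consonant("v", True, "labiodental", "fricative"),
        Consonant("t", False, "alveolar", "stop"),
        Consonant("d", True, "alveolar", "stop"),
        Consonant("n", True, "alveolar", "nasal"),
        Consonant("k", False, "velar", "stop"),
        Consonant("g", True, "velar", "stop"),
        Consonant("ŋ", True, "velar", "nasal"),
    ]
}


def differing_features(a, b):
    """List the features in which two consonants differ."""
    first, second = CONSONANTS[a], CONSONANTS[b]
    return [
        name
        for name in ("voiced", "place", "manner")
        if getattr(first, name) != getattr(second, name)
    ]


print(CONSONANTS["f"].describe())    # unvoiced labiodental fricative
print(differing_features("f", "v"))  # ['voiced']
```

Comparing [f] and [v] in this way shows that the two sounds share place and manner of articulation and differ only in voicing.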

2 Phonological Concepts
2.1 Phonemes and Allophones
The descriptions in section 1 concern speech sounds, or phones. But we need only look at
how we use our first language to see that it is not enough to merely describe the sounds in
the language in order to explain how they function as a system. Most importantly, in any
given language, some differences in articulation will be considered important, in the sense
that two sounds that differ in those respects are not considered the “same” sound. Other
differences, in contrast, will be regarded as incidental, in the sense that they are systematic
and predictable from the sounds that occur before or after the sound in question.
To illustrate this, let us look at a common feature in natural languages known as
aspiration. If a consonant is aspirated, it is accompanied by a puff of air. This is signalled in
phonetic notation with the symbol [ʰ]; an aspirated [p], for instance, is given as [pʰ]. The
only difference between [p] and [pʰ] is aspiration; in other respects the sounds are identical.
It is easy to check whether a consonant is aspirated: if you hold the palm of your hand in
front of your lips while producing the sound, you will feel the puff of air clearly if the
consonant is aspirated, while an unaspirated consonant will produce a far weaker puff or no
puff at all. Test the difference between aspirated and unaspirated consonants by doing so
while saying first pit and then spit in English. You will notice that pit has an aspirated [pʰ],
while spit has an unaspirated [p]. This is because of a very simple rule that applies to most
Germanic languages (it works for Swedish too): an unvoiced stop – [p], [t], or [k] – is
aspirated at the beginning of a stressed syllable, except when it follows [s]. The [s] in spit
“blocks” the aspiration. The difference between aspirated and unaspirated unvoiced stops
in most Germanic languages is thus both systematic and predictable. For this reason, we
regard [p] and [pʰ] as the “same” sound in languages like English and Swedish – indeed, it is
likely that you had never considered the fact that there are two different p sounds in
English and Swedish before you read this, because we do not think of such differences as
significant.
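
Because the rule is predictable, it can be written down mechanically. The following is a minimal sketch for a single syllable, assuming the syllable is given as a list of phone symbols; the function names narrow_syllable and transcribe are hypothetical.

```python
UNVOICED_STOPS = {"p", "t", "k"}


def narrow_syllable(phones, stressed):
    """Mark aspiration on one syllable given as a list of phone symbols."""
    result = list(phones)
    if stressed and result and result[0] in UNVOICED_STOPS:
        # An unvoiced stop at the very beginning of a stressed syllable
        # is aspirated.
        result[0] += "ʰ"
    # A stop that follows [s], as in spit, is not syllable-initial,
    # so it is left unaspirated.
    return result


def transcribe(phones, stressed=True):
    """Return the syllable as a string in square brackets."""
    return "[" + "".join(narrow_syllable(phones, stressed)) + "]"


print(transcribe(["p", "ɪ", "t"]))       # [pʰɪt]
print(transcribe(["s", "p", "ɪ", "t"]))  # [spɪt]
```

The sketch only looks at the first sound of the syllable, which is why the [s] in spit is enough to block aspiration of the following [p].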
In contrast, let us consider a feature that was introduced in section 1.2: voicing.
Consonants may be voiced, in which case the vocal folds are brought together and vibrate
as the sound is produced (you can feel this vibration by holding a finger on your throat as
you produce a voiced sound). Alternatively, they may be unvoiced, in which case the vocal
folds are pulled apart. Let us look at the sounds [v] and [f]. Both sounds are labiodental
fricatives; the only difference between them is that [v] is voiced and [f] is unvoiced. But [v]
and [f] clearly do not count as the “same” sound in Present-day English. The easiest way to
test this empirically is to find two words that are identical in pronunciation with one single
exception: in one word the first of the two sounds we are looking at occurs, and in the other
word the second of the two sounds we are looking at occurs – in exactly the same place.
Such pairs are known as minimal pairs. For [v] and [f], one minimal pair is fan [fæn] and van
[væn] (note that only pronunciation counts, while spelling is irrelevant; for instance, few
[fjuː] and view [vjuː] also form a minimal pair). Native speakers of Present-day English
instantly perceive that fan and van are different words even though they only differ in this
one respect. This test thus enables us to conclude that the difference between [f] and [v] is
enough to distinguish Present-day English words on its own, and that [f] and [v] are not the
“same” sound in Present-day English. In linguistic terms, they belong to separate
phonemes. The phoneme is the smallest distinct sound unit in a particular variety of a
language.
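
The minimal-pair test described above can be sketched as a comparison of two transcriptions, sound by sound. This is an illustration only; the function name is_minimal_pair is hypothetical, and each word is assumed to be given as a list of phone symbols.

```python
def is_minimal_pair(first, second):
    """True if two transcriptions differ in exactly one sound, in the same position."""
    if len(first) != len(second):
        return False
    differences = sum(1 for a, b in zip(first, second) if a != b)
    return differences == 1


fan = ["f", "æ", "n"]
van = ["v", "æ", "n"]
few = ["f", "j", "uː"]
view = ["v", "j", "uː"]

print(is_minimal_pair(fan, van))   # True
print(is_minimal_pair(few, view))  # True
print(is_minimal_pair(fan, few))   # False
```

Note that the comparison works on sounds, not letters, which is why few and view count as a minimal pair despite their spelling.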
By contrast, it would be impossible to find a minimal pair for the two sounds [p] and
[pʰ] in Present-day English, because there is always one more feature that distinguishes
them: the [s] that comes before [p] but not before [pʰ] (see above). This means that [p] and
[pʰ] must instead be regarded as two variants of the same English phoneme. Such variants
are known in linguistics as allophones.4
Note that one can only decide what sounds belong to different phonemes and what
sounds belong to the same phoneme for a single variety of a language. Different languages
and dialects may have different systems; for instance, both Present-day English and Khmer
(the official language of Cambodia) have the sounds [p] and [pʰ]. As we saw above, these
sounds are allophones of the same phoneme /p/ in English; in contrast, in Khmer, these
sounds belong to different phonemes /p/ and /pʰ/, since there are minimal pairs like
/pɔːn/ ‘to wish’ vs. /pʰɔːn/ ‘also’. Conversely, we saw above that, in Present-day English, the
sounds [f] and [v] belong to separate phonemes /f/ and /v/; but as you will see in Baker
(2007), the Old English sounds [f] and [v] were allophones of the same phoneme /f/.5

4 There are some cases where it is difficult or impossible to find minimal pairs, but where we still conclude
that two sounds belong to different phonemes. For instance, [h] and [ŋ], as in hang [hæŋ], never form a
minimal pair in Present-day English, because [h] only occurs at the start of a syllable and [ŋ] never does. But
because they are so different, we still conclude that they belong to different phonemes. It is very difficult to
find minimal pairs for [θ] and [ð] (either/ether is one, if you pronounce either with [iː] or [i]), and they are also
similar phonetically. Nevertheless, we conclude that they belong to different phonemes, because their
distribution is not predictable; that is, we cannot look at their phonetic context and determine which of them
will appear, as we would have been able to do if [θ] and [ð] had been allophones of the same phoneme (cf. the
account of [p] and [pʰ] above, where, in contrast, the distribution is completely predictable).

There are several phonemes in varieties of English that have important allophones.
For instance, in many Present-day English accents, there is a “clear” [l], used before vowels,
as in lab [læb], and a “dark”, more [ʊ]-like [ɫ], used before consonants and at the end of
words, as in milk [mɪɫk] and deal [diːɫ]. These are allophones of the same phoneme because,
again, which allophone is used is predictable from the surrounding phonetic context. This
predictability is also why we never think about allophonic variation as native speakers: the
abstract unit of the phoneme is what we focus on subconsciously. The relative
unimportance of allophonic variation can also be seen from many alphabetic writing
systems, where one symbol typically corresponds to a phoneme, including its allophones, if
any; for instance, we use the same <p> in English to represent [p] and [pʰ], and the same <l>
to represent [l] and [ɫ].6
When we use symbols to represent sounds, it is necessary to distinguish between
two kinds of notation, depending on whether we wish to capture the exact sound, or merely
symbolize the phonemes. When we are interested in describing the exact sound, we
surround the phonetic symbols with [square brackets]; this is known as narrow or phonetic
transcription. In contrast, if we are only interested in symbolizing the phoneme, we
surround the symbols with /slashes/; we are then using broad or phonemic transcription.
In phonemic transcription of Present-day English, pit and spit above would thus have the
same symbol representing the unvoiced bilabial stop: pit /pɪt/, spit /spɪt/. But in phonetic
transcription, the symbol would be different for these two words: pit [pʰɪt] vs. spit [spɪt].
The relationship between a phoneme and its allophones can be described as in
Figure 1.

Phonemic level      /f/

Allophonic level    [v]   Distribution: used between voiced sounds within words
                    [f]   Distribution: used word-initially, word-finally, and within words
                          when preceded and/or followed by an unvoiced sound

Figure 1. The allophones of the Old English phoneme /f/ and their distribution

As illustrated in Baker (2007: 14–15), the Old English phoneme /f/ can be realized in two
distinct ways – [f] and [v] – and the realization is dependent on the phonetic context and is
thus predictable. When the situation is as described in Figure 1, with one allophone – [v] –
occurring in one specific context and the other allophone – [f] – occurring in all other
contexts, scholars frequently give only the contexts where the first allophone occurs and
then simply state that the other one occurs “elsewhere”; in this case, [f] would be the
“elsewhere” allophone.

5 In Old English, the same is true of [s] and [z], which were allophones of the same phoneme /s/, and of [θ] and
[ð], which were allophones of the same phoneme /θ/. See Baker (2007: 14–15) for the contexts that
determined the occurrence of the unvoiced and voiced allophones.
6 Writing systems are rarely perfect representations of the phonemes in a given language, so this statement is
not always true; for instance, in English, <a> represents the vowels in trap, bath, face, palm, and start, and these
vowels belong to three different phonemes in most varieties of English. In addition, alphabets are usually
more conservative and standardized than speech, which means that recent changes in pronunciation, as well
as regional differences from the standard pronunciation, are unlikely to be reflected in alphabets. But the
basis of alphabetic writing is typically phonemic, not phonetic.
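
The distribution in Figure 1 can be sketched as a rule with one specific context and an “elsewhere” case. This is a simplified illustration: the set UNVOICED is a hypothetical, deliberately small list of unvoiced sounds, every other symbol is treated as voiced, and the function name realize_f is an assumption of this sketch.

```python
# Hypothetical, deliberately small set of unvoiced sounds; every other
# symbol (vowels and voiced consonants) is treated as voiced.
UNVOICED = {"p", "t", "k", "f", "s", "θ", "x", "h"}


def realize_f(phonemes):
    """Turn a phonemic transcription into phones, choosing [v] or [f] for /f/."""
    phones = []
    for i, segment in enumerate(phonemes):
        if segment != "f":
            phones.append(segment)
            continue
        word_internal = 0 < i < len(phonemes) - 1
        between_voiced = (
            word_internal
            and phonemes[i - 1] not in UNVOICED
            and phonemes[i + 1] not in UNVOICED
        )
        # [v] occurs in one specific context; [f] is the "elsewhere" allophone.
        phones.append("v" if between_voiced else "f")
    return phones


print("".join(realize_f(["o", "f", "e", "r"])))       # over
print("".join(realize_f(["f", "æ", "t"])))            # fæt
print("".join(realize_f(["w", "u", "l", "f"])))       # wulf
print("".join(realize_f(["æ", "f", "t", "e", "r"])))  # æfter
```

Only the [v] context is tested explicitly; every other position falls through to [f], which mirrors the way scholars state the “elsewhere” case.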

2.2 Beyond the Phoneme


There are a few phonological features operative above the level of the individual phoneme
that need to be addressed. Two matters will be discussed briefly in this section: syllable
structure and stress.

2.2.1 The Syllable


A syllable is a unit of sound above the phoneme. Every syllable has a nucleus, and the
nucleus is almost always a vowel – either a monophthong or a diphthong.7 Any consonants
(one or several) that precede the nucleus in the syllable are known collectively as the onset
of the syllable. Any consonants (one or several) that follow the nucleus in the syllable are
known collectively as the coda of the syllable. The nucleus and the coda together are known
as the rhyme (also spelt rime), because two syllables that have identical nuclei and codas
rhyme. Let us look at two one-syllable words to illustrate this: print and mint.

Syllable    Onset    Rhyme
                     Nucleus    Coda
print       pr       ɪ          nt
mint        m        ɪ          nt

The syllables print and mint rhyme because both have the rhyme /ɪnt/.
Many words consist of more than one syllable (because they contain more than one
vowel). The Old English word nama, for example, which contains the phonemes /n/, /ɑ/,
/m/, and /ɑ/, has two syllables, one for each occurrence of the vowel /ɑ/. The /n/ must be
the onset of the first syllable; but where does the /m/ belong? In such cases, we count the
consonant appearing between vowels as belonging together with the following vowel: the
syllable structure is thus /nɑ/ (onset /n/, nucleus /ɑ/) + /mɑ/ (onset /m/, nucleus /ɑ/). If
there were two consonants between vowels, as in Old English wordum /wordum/ (the dative
plural of word ‘word’) and fremman /fremmɑn/ ‘to advance’, one consonant ends up in each
syllable; thus fremman would have the structure /frem/ (onset /fr/, nucleus /e/, coda /m/)
+ /mɑn/ (onset /m/, nucleus /ɑ/, coda /n/); wordum would have the structure /wor/ (onset
/w/, nucleus /o/, coda /r/) + /dum/ (onset /d/, nucleus /u/, coda /m/).8 (See Handout I for
why fremman is transcribed with a geminated consonant /mm/.) You need to know this
much about syllable division in order to understand two features that are important when
studying Old and Middle English: syllable length and the difference between open and
closed syllables.

7 There are exceptions, and the most important ones for Present-day English are the consonants /l/ and /n/,
which occasionally form the nucleus of syllables. Speakers who do not pronounce an /ə/ after the /t/ in words
such as bottle /ˈbɒtl || ˈbɑːtl/ and Britain /ˈbrɪtn/ are still thought to have two syllables in these words, but the
nucleus of the second syllable is /l/ and /n/, respectively. Such consonants are sometimes referred to as
syllabic consonants.
8 This account of syllable division is an oversimplification, but it provides you with the information that is
necessary to follow this course.
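
The simplified division rules above (a single consonant between vowels goes with the following vowel; two consonants are split between the syllables) can be sketched as follows. The set VOWELS is a hypothetical sample and the function name syllabify is an assumption. For clusters of three or more consonants, which the account above does not cover, the sketch simply sends the last consonant to the following syllable.

```python
# Hypothetical small vowel set, sufficient for the examples below.
VOWELS = {"ɑ", "æ", "e", "i", "ɪ", "o", "u"}


def syllabify(phonemes):
    """Split a list of phonemes into (onset, nucleus, coda) triples."""
    nuclei = [i for i, p in enumerate(phonemes) if p in VOWELS]
    syllables = []
    start = 0
    for n, nucleus in enumerate(nuclei):
        if n + 1 < len(nuclei):
            # The last consonant before the next nucleus becomes the onset
            # of the next syllable; anything before it stays in the coda.
            end = max(nucleus + 1, nuclei[n + 1] - 1)
        else:
            end = len(phonemes)
        onset = "".join(phonemes[start:nucleus])
        coda = "".join(phonemes[nucleus + 1:end])
        syllables.append((onset, phonemes[nucleus], coda))
        start = end
    return syllables


print(syllabify(["n", "ɑ", "m", "ɑ"]))
# [('n', 'ɑ', ''), ('m', 'ɑ', '')]
print(syllabify(["f", "r", "e", "m", "m", "ɑ", "n"]))
# [('fr', 'e', 'm'), ('m', 'ɑ', 'n')]
print(syllabify(["w", "o", "r", "d", "u", "m"]))
# [('w', 'o', 'r'), ('d', 'u', 'm')]
```

A syllable whose coda comes out empty, as in both syllables of nama, is an open syllable; one with a non-empty coda is closed.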
The division of syllables into long and short syllables is important in Old English; for
instance, a word may take different endings depending on whether it consists of a long or a
short syllable. Only the rhyme (nucleus + coda) is relevant to syllable length. In order to
decide whether a syllable is long or short, it is helpful to think of the rhyme as composed of
“length units” where each consonant is equivalent to one unit, each short vowel is
equivalent to one unit, and each long vowel is equivalent to two units. In linguistics, this
unit is called a mora (plural: morae).9 In the account in Baker (2007: 20), an Old English
syllable is long if its rhyme contains at least two morae. This is why fæt /fæt/ is a long
syllable: its rhyme contains two morae, one for /æ/ and one for /t/. Similarly, sǣ /sæː/ is
long because its rhyme /æː/ contains two morae on its own. But the two syllables in fæte
/fæte/ are both short, since the /t/ is considered to be the onset of the second syllable
rather than the coda of the first (see above): the first syllable has only one mora in the
rhyme (the nucleus /æ/), and the second syllable also has a rhyme consisting only of one
mora (the nucleus /e/).
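
Counting morae is simple arithmetic over the rhyme. The sketch below assumes that a long vowel is written with the length mark <ː> and that the coda is given as a list of consonant symbols; the function names morae and is_long are hypothetical.

```python
def morae(nucleus, coda):
    """Count the length units in a rhyme.

    nucleus: a vowel symbol, with "ː" if the vowel is long
    coda: a list of consonant symbols (empty for an open syllable)
    """
    nucleus_units = 2 if "ː" in nucleus else 1
    return nucleus_units + len(coda)


def is_long(nucleus, coda):
    """A syllable is long if its rhyme contains at least two morae."""
    return morae(nucleus, coda) >= 2


print(is_long("æ", ["t"]))  # fæt: True
print(is_long("æː", []))    # sǣ: True
print(is_long("æ", []))     # first syllable of fæte: False
print(is_long("e", []))     # second syllable of fæte: False
```

The onset is never passed to these functions, which reflects the point that only the rhyme is relevant to syllable length.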
There are two main types of syllable: open syllables and closed syllables. A closed
syllable is one that has a coda; both print and mint in Present-day English are thus closed
syllables because they have the coda /nt/. An open syllable is one that does not have a coda;
Present-day English free /friː/, for instance, is an open syllable, because it has only the onset
/fr/ and the nucleus /iː/. Since a single consonant between two vowels is considered to
belong with the following vowel, the Old English word nama, which we looked at above,
consists of two open syllables: /nɑ/ and /mɑ/. The difference between open and closed
syllables will become important when we deal with Middle English, since some vowels were
lengthened when they occurred in open syllables (see Horobin and Smith 2002: 59–60). In
fact, the reason why name, the present-day version of nama, is pronounced /neɪm/ and not
/næm/ in Present-day English is that the first syllable was open in Middle English!

2.2.2 Stress (= Accentuation)


Syllables can be stressed or unstressed.10 A stressed syllable is perceived as more prominent
than an unstressed syllable. Stress is signalled in different ways in different languages
(pitch, length, loudness, etc.); in English, a stressed syllable is primarily louder than an
unstressed syllable, but stressed and unstressed syllables often differ in pitch and length as
well.
When we look at individual words, there is no need to signal stress if the word has
only one syllable, since that syllable will carry lexical stress by default. If there are several
syllables in the word, the sign /ˈ/ is used to signal primary stress in a word. Note that stress
signs are placed before the syllable that takes the stress.
Stress is important in English for several reasons. First, unstressed syllables have
tended to be reduced to /ə/ over the history of the language. As you will learn during this
course, this reduction is one of the most important reasons why English now relies on word
order and prepositions rather than word endings to signal the relationship between clause
elements: after many endings, which were unstressed, merged as /ə/ (and then
disappeared), it was no longer possible to use endings to distinguish between functions such
as subject and indirect object.

9 You do not have to know the term mora (although it is a useful term to be familiar with). The important thing
is that you can tell the difference between long and short syllables in Old English.
10 Baker (2007) uses the term accentuation and discusses the difference between accented syllables and
unaccented syllables, while Horobin and Smith (2002) use the terms stress, stressed syllables, and unstressed
syllables to account for the same features. I follow Horobin and Smith (2002) in this compendium, as their
terminology is more widely used in linguistics.

Secondly, lexical stress was once predictable in English. As Baker (2007: 20–21)
shows, all Old English words were stressed on the first syllable, with two regular exceptions:
1. If any word began with the prefix ġe-, the stress fell on the syllable after the prefix.
2. If a verb began with a prefix (i.e. not just ġe-, but any prefix), the stress fell on the
syllable after the prefix.
Since these rules apply to all Old English words – including loanwords – there is no need to
indicate lexical stress in transcriptions of Old English words. In Middle English, however,
stress in English becomes variable, mainly as a result of an influx of French loanwords that
are stressed on the last syllable. It is thus necessary to indicate stress in transcriptions of
Middle English words.
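
Because Old English stress follows from the rules above, it can be computed rather than transcribed. The following is a minimal sketch, assuming that the caller already knows whether the word has a prefix, how many syllables the prefix has, and whether the word is a verb; the function name and its parameters are hypothetical, and the example words are given for illustration only.

```python
def stressed_syllable_index(prefix=None, prefix_syllables=1, is_verb=False):
    """Return the zero-based index of the syllable carrying primary stress."""
    if prefix is None:
        # Default case: stress on the first syllable.
        return 0
    if prefix == "ġe" or is_verb:
        # Exceptions 1 and 2: stress falls on the syllable after the prefix.
        return prefix_syllables
    # A prefixed word that is neither a verb nor a ġe- word keeps
    # its stress on the first syllable.
    return 0


print(stressed_syllable_index())                            # nama: 0
print(stressed_syllable_index(prefix="ġe"))                 # ġewrit: 1
print(stressed_syllable_index(prefix="for", is_verb=True))  # forġiefan: 1
print(stressed_syllable_index(prefix="and"))                # andswaru: 0
```

No comparable function could be written for Middle English without a word-by-word list, which is why stress has to be marked in Middle English transcriptions.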
