Dr. Yunisrina Qismullah Yusuf, M. Ling.

Phonetics vs. Phonology? (Rangelov, 2005)

Phonetics studies the nature of speech sounds:

• their production by the vocal tract (articulatory phonetics)
• their perception by the auditory system (auditory phonetics)
• their physical properties as sound waves (acoustic phonetics)

Phonology studies the ways in which speech sounds form
systems and patterns:
• the relationship between how sounds are pronounced and
how they are stored in the mind
• which phonetic distinctions are significant enough to signal
differences in meaning
• the ways sounds are organized within words
“The ear hears phonetics, but the brain hears
phonology.” - Dennis Preston
That is, your ear is capable of processing whatever linguistic sounds are
given to it (assuming someone with normal hearing), but your language
experience causes your brain to filter out only those sound patterns that
are important to your language(s).
Phonologists are often as interested in patterns related to the
manner of articulation as they are the patterns of the speech
Phoneticians, meanwhile, would have no way to analyze their data
sets if they didn't have phonological categories to help organize
How does P&P work?
Let’s observe the following:

(1) The first sound of the word fight is produced by
bringing together the top teeth and the
bottom lip, and then blowing air between them.
→ illustrates the fact that we use our vocal
tract to produce speech.
(2) The word war is produced with one
continuous motion of the lungs, tongue, lips,
and so on, yet we interpret this motion as a
series of three separate speech sounds.
→ illustrates the fact that words are
physically one continuous motion but are
psychologically a series of discrete units called
segments.
(3) The words pea, see, and key all have the same
vowel, even though the vowel in each word is
spelled differently.
→ illustrates that a single segment can be
represented by a variety of spellings.
→ observations (2) and (3) can be used to
justify a phonemic alphabet, a system of
transcription in which one symbol uniquely
represents one segment.
(4) p and b are alike in that they are both
pronounced with the lips; p and k are
different in that k is not pronounced with the
lips.
→ illustrates the fact that segments are
composed of smaller units called distinctive
features. Thus, “labial” (referring to the lips)
is a distinctive feature shared by p and b, but
not by p and k.
(5) The vowels in the words cab and cad are longer than the
same vowels in cap and cat.
→ illustrates that two segments can be the same on one
level of representation but different on another. Thus,
the vowels in cab, cad, cap and cat are the same on one
level (the vowel a), but different on another level (long a
in cab and cad; short a in cap and cat).
→ These systematic variations between levels of
representation can be stated in terms of phonological
rules (e.g. vowels are lengthened in a particular context).
Research in P&P
• Look at how sound systems have changed over
periods of time.
• You can examine specific languages to discover the
rules that apply to their phonology and how these
compare to other languages.
• You can also use phonology within a country to look
at specific dialects and compare them to one another.
• You can study the characteristics of sounds (vowels
and consonants) of a language, dialect, accent.
Vocal tract
The vocal tract consists of the passageway between
the lips and nostrils on one end and the larynx
(which contains the vocal cords) on the other.

It is important in the study of phonology because:

• it is used to produce speech.
• it provides the physical properties that are used to
describe the psychological units of phonology.
Vocal tract
1. Lips
2. Teeth
3. Alveolar ridge, the bony ridge
right behind the upper teeth
4. Hard palate, the bony dome
constituting the roof of the mouth
5. Velum (soft palate), the soft tissue
immediately behind the palate
6. Uvula, the soft appendage hanging
off the velum
7, 8, 9, 10. Tongue
11. Epiglottis, the soft tissue which covers
the vocal cords during eating, thus
protecting the passageway to the lungs
12. Pharynx, the back wall of the throat
behind the tongue
13. Larynx, containing the vocal cords
14. Esophagus, the tube going to the stomach
15. Trachea, the tube going to the lungs
Anatomy of vocal organs
In short, the vocal tract is a tube which produces sound
when air from the lungs is pumped through it.

Different speech sounds are produced by manipulating
the lips, tongue, teeth, velum, pharynx and vocal cords,
thus changing the shape of this tube.

The primary importance of the vocal tract is the fact that
phonological units and rules are described in terms of
these physical properties of the vocal mechanism.
Phonemic Alphabet
One type of segment that we perceive when we hear speech is
termed the phoneme.

Conventional orthography (i.e. spelling) does not provide an
adequate means of representing the phonological structure of
words.

For example: pea and key both contain the same vowel, but in pea
the vowel is spelled ea and in key it is spelled ey. Consequently,
a phonemic alphabet has been developed in which one symbol always
corresponds to a single phoneme.
So, the phonemic alphabet represents the vowel in pea, see, me and key
as /i/.

Phonemic transcription is always enclosed in
slashes to distinguish it from conventional
orthography.

From the previous example, we capture the fact that we perceive all the
words as having the same vowel by
transcribing them as: pea /pi/, see /si/, me
/mi/ and key /ki/.
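The one-symbol-per-phoneme idea can be sketched in a few lines of Python. This is only an illustration: the toy lexicon holds just the four words above, and the names `TRANSCRIPTIONS` and `vowel_of` are hypothetical, not part of any standard tool.

```python
# Toy lexicon: conventional spelling -> phonemic transcription
# (only the four example words from the slide).
TRANSCRIPTIONS = {
    "pea": "/pi/",
    "see": "/si/",
    "me": "/mi/",
    "key": "/ki/",
}

def vowel_of(word):
    """Return the vowel phoneme of a word in the toy lexicon."""
    phonemes = TRANSCRIPTIONS[word].strip("/")
    # Assumption: every word here is consonant + vowel,
    # so the vowel is simply the last symbol.
    return phonemes[-1]

# Four different spellings (ea, ee, e, ey) map onto one phoneme.
vowels = {word: vowel_of(word) for word in TRANSCRIPTIONS}
print(vowels)
print(set(vowels.values()))
```

The set printed at the end has a single member, which is the point of a phonemic alphabet: spelling varies, the symbol does not.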
Phonemic Alphabet of English VOWELS
No.  Symbol  Example                   No.  Symbol  Example
1    /i/     Seat (kursi)              9    /u/     Suit (setelan; sesuai)
2    /ɪ/     Sit (duduk)               10   /ʊ/     Soot (jelaga)
3    /e/     Say (ucap – present)      11   /o/     Sewed (menjahit – past)
4    /ɛ/     Said (ucap – past)        12   /ɔ/     Sought (dicari)
5    /æ/     Sad (sedih)               13   /aɪ/    Sight (penglihatan)
6    /ʌ/     Suds (buih soda)          14   /aʊ/    South (selatan)
7    /ə/     Alone, butter (unstressed /ʌ/)
8    /a/     Sod (tanah)               15   /ɔɪ/    Soy (kedelai)

Physical dimensions of the vowel phonemes
The vowel phonemes (i.e. percepts – psychological units) are
described in terms of their physical dimensions:
1. Tongue height: for any articulation corresponding to one
of these vowel phonemes, the tongue is either relatively
high in the mouth (/i, ɪ, u, ʊ/), mid (/e, ɛ, ʌ (ə), o/), or low
(/æ, a, ɔ/). Compare see /si/ (high) and say /se/ (mid).
2. Frontness: for any articulation corresponding to one of
these vowel phonemes, the tongue is either relatively
front (/i, ɪ, e, ɛ, æ/) or back (/ʌ (ə), a, u, ʊ, o, ɔ/). Compare
see /si/ (front) and sue /su/ (back).
3. Lip Rounding: for any articulation corresponding to
one of these vowel phonemes, the lips are either
relatively round (/u, ʊ, o, ɔ/) or spread (/i, ɪ, e, ɛ, æ, ʌ
(ə), a/). Compare so /so/ (round) and say /se/ (spread).
4. Tenseness: for any articulation corresponding to one
of these vowel phonemes, the vocal musculature is
either relatively tense (/i, e, u, o, ɔ/) or lax (/ɪ, ɛ, æ, ʌ,
ə, a, ʊ/). Compare aid /ed/ (tense) and Ed /ɛd/ (lax).
Distinctive feature
Each vowel phoneme is a composite of values (+ or -) along several dimensions
that constitute distinctive features. For example, /i/ and /ɔ/ are not really
units in themselves, but rather each is a bundle of features.

/i/ = +high       /ɔ/ = -high
      -low              +low
      -back             +back
      +tense            +tense
      -round            +round
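The idea of a vowel as a bundle of feature values can be sketched in Python. The class memberships below are copied from the four dimensions listed in these slides; the function names (`features`, `bundle`, `natural_class`) are hypothetical, and mid vowels are treated as neither high nor low, which is an assumption of this sketch.

```python
# Vowel classes as listed in the slides (mid vowels are -high, -low).
VOWELS = set("iɪeɛæʌəauʊoɔ")
HIGH = set("iɪuʊ")
LOW = set("æaɔ")
BACK = set("ʌəauʊoɔ")
TENSE = set("ieuoɔ")
ROUND = set("uʊoɔ")

def features(vowel):
    """Return the feature values of a vowel as a dict of booleans."""
    if vowel not in VOWELS:
        raise ValueError(f"unknown vowel: {vowel}")
    return {
        "high": vowel in HIGH,
        "low": vowel in LOW,
        "back": vowel in BACK,
        "tense": vowel in TENSE,
        "round": vowel in ROUND,
    }

def bundle(vowel):
    """Format the features in +/- notation."""
    return " ".join(("+" if v else "-") + k for k, v in features(vowel).items())

def natural_class(**wanted):
    """Return all vowels whose features match the requested values."""
    return {v for v in VOWELS
            if all(features(v)[k] == val for k, val in wanted.items())}

print("/i/ =", bundle("i"))
print("/ɔ/ =", bundle("ɔ"))
print(natural_class(high=True, back=True))
```

Asking for a combination of values, such as high back vowels, returns a set of phonemes rather than a single sound, which is why features are convenient for stating rules.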
Vowel quadrangle
Monophthongs, Diphthongs and Triphthongs of English

Vowels that are pronounced at one and the same place → monophthongs, and
English has 12 of them.

Vowels that change character during their pronunciation, that is, they begin at one
place and move towards another place → diphthongs, and English has 8 of them.

A glide from one vowel to another and then to a third, all produced rapidly and
without interruption → triphthongs, and English has 5 of them.

Compare for example the monophthong in car with the diphthong in cow, or the
monophthong in girl with the diphthong in goal. The vowels of cow and goal both
begin at a given place and glide towards another one.
In goal the vowel begins as if it were [ə], but then it moves towards [ʊ].
Therefore it is written [əʊ], as in [gəʊl] goal, with two symbols, one for how it starts
and one for how it ends.
Some people speak of triphthongs for groups of diphthong + schwa (ə), for
example [məʊər] mower.
English monophthongs
English monophthongs (12 monophthongs):
1. /i:/ → see, unique, feel
2. /ɪ/ → wit, mystic, little
3. /e/ → set, meant, bet
4. /æ/ → pat, cash, bad
5. /ɑ/ → half, part, father
6. /ɒ/ → not, what, cost
7. /ɔ/ → port, caught, all
8. /ʊ/ → wood, could, put
9. /u/ → you, music, rude
10. /ʌ/ → bus, come, but
11. /ə/ → alone, butter
12. /ɜ/ → bird, word, fur
English diphthongs
English diphthongs (8 diphthongs):
Centering diphthong:
1. three (3) ending in /ə/ : /ɪə/, /eə/, /ʊə/
• /ɪə/ : beard, weird, fierce, ear, beer, tear
• /eə/: aired, cairn, scarce, bear, hair,
• /ʊə/: moored, tour, lure, sure, pure

Closing diphthong
2. three (3) ending in /ɪ/: /eɪ/, /aɪ/, /ɔɪ/
• /eɪ/ : paid, pain, face, shade, age, wait, taste, paper
• /aɪ/: tide, time, nice, buy, bike, pie, eye, kite, fine
• /ɔɪ/: void, loin, voice, oil, boil, coin, toy, Roy

3. two (2) ending in /ʊ/: /əʊ/, /aʊ/

• /əʊ/: load, home, most, bone, phone, boat, bowl
• /aʊ/: loud, gown, house, cow, bow, brow, grouse
Diphthongs trajectory
English triphthongs: 5 closing diphthongs with
/ə/ added on the end.
1. /eɪ/ + /ə/ = /eɪə/, as in layer, player
2. /aɪ/ + /ə/ = /aɪə/, as in liar, fire
3. /ɔɪ/ + /ə/ = /ɔɪə/, as in loyal, royal
4. /əʊ/ + /ə/ = /əʊə/, as in lower, mower
5. /aʊ/ + /ə/ = /aʊə/, as in power, hour
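The construction above is mechanical enough to write down directly. The following is a minimal sketch, assuming the five closing diphthongs listed in these slides; the variable names are illustrative only.

```python
# The five closing diphthongs from the slides.
CLOSING_DIPHTHONGS = ["eɪ", "aɪ", "ɔɪ", "əʊ", "aʊ"]

# A triphthong is a closing diphthong with schwa added on the end.
TRIPHTHONGS = [d + "ə" for d in CLOSING_DIPHTHONGS]

# Group by the glide the diphthong closes towards (/ɪ/ or /ʊ/).
by_glide = {}
for t in TRIPHTHONGS:
    by_glide.setdefault(t[1], []).append(t)

print(TRIPHTHONGS)
print(by_glide)
```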
Triphthongs trajectory (figure panels: closing diphthongs ending in /ɪ/ + /ə/; closing diphthongs ending in /ʊ/ + /ə/)
Exercise 1
1. Describe each of the following vowel phonemes of English in terms of tongue height,
frontness, lip rounding, and tenseness.
a. /æ/ b. /o/ c. /e/ d. /ʌ/ e. /a/ f. /ɔ/ g. /ʊ/ h. /i/
2. The symbol /ʌ/ represents the vowel in:
a. pat b. pet c. pot d. put e. putt
3. Which symbol represents the vowel in look?
a. /ʌ/ b. /u/ c. /ʊ/ d. /o/ e. /a/
4. Match each of the following words with its phonemic vowel.
a. sues _____ /ɔ/
b. sews _____ /aʊ/
c. sows _____ /i/
d. sighs _____ /u/
e. sees _____ /ɪ/
f. says _____ /ɛ/
g. Sis _____ /aɪ/
h. sauce _____ /o/
Phonemic Alphabet of English CONSONANTS
No.  Symbol  Example                  No.  Symbol     Example
1    /p/     pat, zipper, cap         13   /ʃ/        shoe, thresher, rush
2    /b/     bat, fiber, cab          14   /ʒ/        ___, treasure, rouge
3    /t/     tab, catty, cat          15   /h/        ham, ahead, ___
4    /d/     dab, caddy, cad          16   /tʃ/       chain, sketchy, beseech
5    /k/     cap, dicker, tack        17   /dʒ/       Jane, edgy, besiege
6    /g/     gap, digger, tag         18   /m/        mitt, simmer, seem
7    /f/     fat, safer, belief       19   /n/        knit, sinner, seen
8    /v/     vat, saver, believe      20   /ŋ/        ___, singer, sing
9    /θ/     thin, ether, breath      21   /l/        light, teller, coal
10   /ð/     then, either, breathe    22   /r/        right, terror, core
11   /s/     sue, lacy, peace         23   /w/        wet, lower, ___
12   /z/     zoo, lazy, peas          24   /y/ (/j/)  yet, layer, ___
Note: English words which appear to end in /w/ and /y/
are analyzed as ending in vowels in this system. For
example, cow = /kaʊ/ and sky = /skaɪ/.

The consonant phonemes (i.e. percepts – psychological
units) are described in terms of their physical
dimensions:
1. Place of Articulation
2. Manner of Articulation
3. Voicing
Place of Articulation
For any articulation corresponding to one of these consonant
phonemes, the vocal tract is constricted at one of the
following points:
1. Bilabial: (from bi ‘two’ + labial ‘lips’), the primary constriction
is at the lips (/p, b, m, w/). Compare pea /pi/ (bilabial) and
tea /ti/ (non-bilabial).
2. Labiodental: (from labio ‘lip’ + dental ‘teeth’), the primary
constriction is between the lower lip and upper teeth (/f, v/).
Compare fee /fi/ (labiodental) and see /si/ (non-labiodental).
3. Interdental: (from inter ‘between’ + dental ‘teeth’), the
primary constriction is between the tongue and the upper
teeth (/θ, ð/). Compare thigh /θaɪ/ (interdental) and shy /ʃaɪ/
(non-interdental).
4. Alveolar: (from alveolar ridge), the primary constriction is
between the tongue and the alveolar ridge (/t, d, s, z, n,
l/). Compare tea /ti/ (alveolar) and key /ki/ (non-alveolar).
5. Palatal: (from palate), the primary constriction is
between the tongue and the palate (/ʃ, ʒ, tʃ, dʒ, r, y(j)/).
Compare shoe /ʃu/ (palatal) and sue /su/ (non-palatal).
6. Velar: (from velum), the primary constriction is between
the tongue and the velum (/k, g, ŋ/). Compare coo /ku/
(velar) and two /tu/ (non-velar).
7. Glottal: (from glottis, which refers to the space between
the vocal cords), the primary constriction is at the glottis
(/h/). Compare hoe /ho/ (glottal) and so /so/ (non-glottal).
Places of the articulation in the vocal tract
Manner of Articulation
For any articulation corresponding to one of these
consonant phonemes, the vocal tract is constricted in
one of the following ways:
1. Stops: two articulators (lips, tongue, teeth, etc.) are
brought together such that the flow of air through the
vocal tract is completely blocked (/p, b, t, d, k, g, ʔ/).
Compare tea /ti/ (stop) and see /si/ (non-stop).
2. Fricatives: Two articulators are brought near each
other such that the flow of air is impeded but not
completely blocked. The flow of air through the
narrow opening creates friction, hence the term
fricative (/f, v, θ, ð, s, z, ʃ, ʒ, h/). Compare zoo /zu/
(fricative) and do /du/ (non-fricative).
3. Affricates: Articulations corresponding to affricates are
those that begin like stops (with a complete closure in the
vocal tract) and end like fricatives (with a narrow
opening in the vocal tract) (/tʃ, dʒ/).
4. Nasals: A nasal articulation is one in which the airflow
through the mouth is completely blocked but the velum
is lowered, forcing the air through the nose (/m, n, ŋ/).
Compare no /no/ (nasal) and doe /do/ (non-nasal).
5. Liquids and Glides: Both of these terms describe
articulations that are mid-way between true consonants
(i.e. stops, fricatives, affricates and nasals) and vowels,
although they are both generally classified as consonants.
Liquid is a cover term for all l-like and r-like articulations
(/l, r/). Compare low /lo/ (liquid) and doe /do/ (non-liquid).
Glide refers to an articulation in which the vocal tract is
constricted, but not enough to block or impede the
airflow (/w, y/). Compare way /we/ (glide) and bay /be/
(non-glide).

Consonants can be divided into obstruents (stops, fricatives
and affricates, which are formed by obstructing airflow,
causing a strong gradient of air pressure in the vocal tract)
and sonorants (nasals, liquids and glides, no obstruction).
Voicing
For any articulation corresponding to one of
these consonant phonemes, the vocal cords
are either vibrating (voiced: /b, d, g, v, ð, z,
ʒ, dʒ, m, n, ŋ, r, l, w, y/) or not (voiceless: /p,
t, k, f, θ, s, ʃ, tʃ, h/). Compare zoo /zu/
(voiced), and sue /su/ (voiceless). Stops,
fricatives and affricates come in voiced and
voiceless pairs (except for /h, ʔ/); nasals,
liquids and glides are all voiced, as are vowels.
Consonant Phonemes of English

Note: some textbooks write /ʃ/ as /š/, /ʒ/ as /ž/, /tʃ/ as /č/, and /dʒ/ as /ǰ/ (Americanist notation rather than IPA).
The digraphs /tʃ/ and /dʒ/ reflect the fact that affricates can be described as a stop plus a fricative.
The previous chart shows the consonant phonemes of English
in terms of three physical dimensions: place of articulation,
manner of articulation and voicing.

For example: /p/ is a voiceless (voicing) bilabial (place) stop
(manner), /v/ is a voiced labiodental fricative, /tʃ/ is a voiceless
palatal affricate, /ŋ/ is a voiced velar nasal, and so on.

Therefore, each consonant phoneme constitutes a bundle of distinctive
features:

/p/ = +bilabial     /ŋ/ = +velar
      +stop               +nasal
      -voice              +voice
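The three-way description (voicing, place, manner) can be sketched as a lookup. The tables below are transcribed from the place, manner and voicing lists in these slides; the helper names `lookup` and `describe` are hypothetical, and /y/ is used for the palatal glide as in the slides.

```python
# Place and manner classes as listed in the slides.
PLACE = {
    "bilabial": {"p", "b", "m", "w"},
    "labiodental": {"f", "v"},
    "interdental": {"θ", "ð"},
    "alveolar": {"t", "d", "s", "z", "n", "l"},
    "palatal": {"ʃ", "ʒ", "tʃ", "dʒ", "r", "y"},
    "velar": {"k", "g", "ŋ"},
    "glottal": {"h"},
}
MANNER = {
    "stop": {"p", "b", "t", "d", "k", "g"},
    "fricative": {"f", "v", "θ", "ð", "s", "z", "ʃ", "ʒ", "h"},
    "affricate": {"tʃ", "dʒ"},
    "nasal": {"m", "n", "ŋ"},
    "liquid": {"l", "r"},
    "glide": {"w", "y"},
}
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ", "h"}

def lookup(table, phoneme):
    """Return the label of the class that contains the phoneme."""
    for label, members in table.items():
        if phoneme in members:
            return label
    raise KeyError(phoneme)

def describe(phoneme):
    """Describe a consonant as 'voicing place manner'."""
    voicing = "voiceless" if phoneme in VOICELESS else "voiced"
    return f"{voicing} {lookup(PLACE, phoneme)} {lookup(MANNER, phoneme)}"

for p in ["p", "v", "tʃ", "ŋ"]:
    print(f"/{p}/ is a {describe(p)}")
```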
Even though phonemes and distinctive features are
described in physical terms, they are actually
psychological entities: no one has ever uttered a
phoneme or a distinctive feature. But, when we talk,
we utter a physical speech signal which we interpret as
containing phonemes, which in turn consist of
distinctive features.

A phoneme is the smallest contrastive linguistic unit which
may bring about a change of meaning (Gimson, 2008).
Consider the following sentence:
[ðə kæt ɪz ɒn ðə mæt]
(1) The cat is on the mat.

If we change the first consonant of the noun cat and insert [h] instead
we get the sentence:
[ðə hæt ɪz ɒn ðə mæt]
(2) The hat is on the mat.

…which does not have the same meaning.

Again, if we substitute [b] for [k], we get

[ðə bæt ɪz ɒn ðə mæt]
(3) The bat is on the mat.
The three strings of sound [kæt], [hæt] and [bæt] differ
only in their initial sound, and they are three different
words: [k], [h] and [b] can be substituted for one another
in this position, and each substitution changes the
meaning. Likewise, if we swap [k] and [m] in sentence (1),
the set of sounds uttered is identical and only their order
differs, yet the meaning of the sentence changes and
we aren’t speaking about the same thing.

In our examples we produce a change in meaning
through a substitution of segments in a string of sounds.
These segments are called phonemes.
Imagine you’re in London and you want to go to Bond Street. You ask a
couple: “Excuse me, could you tell me where Bond Street is?” They both
answer in chorus: “Second left and then right”, which can be transcribed as:

(5a) [sekənd left ən ðen raɪt]

(5b) [sekənd left ən ðen Raɪt]

Both have given you the same information although you perceive a
difference in the sounds used, that is, the woman has used [r], the regular
English /r/ sound, whereas the man used the rolled lingual [R] instead.
They are transcribed phonetically as [raɪt] and [Raɪt] respectively.
This difference in pronunciation, which allows you to
assume that the wife is English and the husband
Scottish, doesn’t entail a change in meaning. The two
segments [r] and [R] can be used interchangeably since
there is no change of meaning: the difference between
the two is said to be phonetic.

But this was not the case for the substitution of [h] for
[k] in [kæt] - [hæt], which brings about a change in
meaning and is said to be phonological (or phonemic).
Minimal Pairs
A phoneme is a speech sound that can make one word
different from another in meaning. So, when there is a
difference between two otherwise identical strings of sound
and this difference results in a change of meaning, these two
strings are said to constitute a minimal pair.

If we substitute one segment for another and this results in a
change in meaning, the two segments belong to two different
phonemes. Thus [k] and [h] are realizations of two different
phonemes, /k/ and /h/, because substituting one for the
other as first element of the string
[-æt] gives two different words:
/kæt/ (cat) and /hæt/ (hat) → a minimal pair.
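The minimal-pair test can be sketched as a comparison of two transcriptions. This is a simplified illustration: it assumes each word is already segmented into a list of phonemes and that the entries are distinct words of the language, and the toy `LEXICON` and function name are hypothetical.

```python
from itertools import combinations

def is_minimal_pair(a, b):
    """True if two segment lists have equal length and differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1

# Toy lexicon: word -> list of phonemes (examples taken from the slides).
LEXICON = {
    "cat": ["k", "æ", "t"],
    "hat": ["h", "æ", "t"],
    "bat": ["b", "æ", "t"],
    "mat": ["m", "æ", "t"],
    "back": ["b", "æ", "k"],
    "bag": ["b", "æ", "g"],
}

pairs = [(w1, w2) for w1, w2 in combinations(LEXICON, 2)
         if is_minimal_pair(LEXICON[w1], LEXICON[w2])]
for w1, w2 in pairs:
    print(w1, "~", w2)
```

Words such as cat and back are not reported, because they differ in two segments rather than one.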
The phonemes of a given language form a system in which they are all
opposed to one another. The procedure can theoretically be applied to each
phoneme of the language. Even though all phonemes of a
given language form a system, oppositions in that language are organized in
such a way that consonants can only be opposed to consonants and vowels to
vowels.

/ɪ/ and /i:/ (sit and seat)
/e/ and /ɪ/ (desk and disk)
/e/ and /eɪ/ (wet and wait)
/æ/ and /ʌ/ (bat and but)
/əʊ/ and /ɔ:/ (so and saw)
/ɒ/ and /əʊ/ (not and note)
/æ/ and /e/ (bad and bed)
/ɑ:/ and /ɜ:/ (fast and first)
/b/ and /v/ (berry and very)
/b/ and /p/ (buy and pie)
/n/ and /ŋ/ (thin and thing)
/l/ and /r/ (alive and arrive)
/ʧ/ and /t/ (catch and cat)
/s/ and /ʃ/ (sea and she)
/f/ and /v/ (fan and van)
/f/ and /h/ (fat and hat)
/f/ and /θ/ (free and three)
/s/ and /θ/ (sing and thing)
/ð/ and /z/ (with and whizz)
/ʤ/ and /z/ (page and pays)
/d/ and /ʤ/ (bad and badge)
initial /f/ and /p/ (fast and past)
initial /k/ and /g/ (came and game)
initial /t/ and /d/ (two and do)


final /k/ and /g/ (back and bag)
final /m/ and /n/ (am and an)
final /t/ and /d/ (hat and had)
A phoneme can be pronounced in different ways according to
its context.

Compare (AE):
• The difference between /t/ in : tea, eat, writer, eighth, two,
• The difference between /i:/ in: see, seed, seat, seen
• /i/ - /i:/

Therefore, a phoneme may have more than one realization.

The different realizations of a phoneme are called allophones
of that phoneme. The allophone is a variant of a phoneme.
An allophone is one of a set of multiple possible spoken sounds (or phones)
used to pronounce a single phoneme (Jakobson, 1980). Allophones arise
because of the position of a phoneme and the phonetic characteristics of
neighboring sounds, e.g. /æ/ in man vs. tap.

All allophones of a phoneme share the same set of distinctive features but
each one can also show additional features. For example, the phoneme /p/ is
realized as [pʰ] in [pʰɪt], as it would be every time it occurs in a word as initial
consonant before a vowel, and as [p] in all other cases.
Please, prom, pray
Pit, pat, pot → [pʰ] → [pʰɪt] [pʰæt] [pʰɔt]

/a/ in hana /hana/ (Aceh Utara, Pidie), /hanɛa/ (Aceh Besar)

[pʰ] and [p] are said to be allophones because:
1) they can both be described as voiceless bilabial plosives
2) if we substitute one for the other we do not get any
change in meaning and the production is not considered
incorrect by native speakers.

Two or more sounds are allophones of the same phoneme if:

a) they have a predictable, complementary distribution;
b) they do not create a semantic contrast;
c) and they are phonetically similar.

Because of allophones…
• slashes enclose phonemes: /t/
• square brackets enclose allophones: [t]
This is an important distinction!
Types of Distribution in Allophones:

1. Contrastive distribution: Two sounds are said to be
contrastive if replacing one with the other results in a change
of meaning. Example:
‘cat’ [kʰæt] and ‘hat’ [hæt].
2. Complementary distribution: phones appear in differing
environments; are allophones of the same phoneme. Example:
‘top’ [tʰɔp] and ‘stop’ [stɔp].
3. Free Variation: phones appear in exactly the same
environments; no difference in meaning; are allophones of the
same phoneme. Example: ‘economics’ or ‘end’, [i] or [ɛ]
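The first step in telling these distributions apart is to collect the environments in which each phone occurs. The sketch below is a simplified illustration with a tiny hypothetical data set; the function names are assumptions. It only separates complementary from overlapping distribution: when environments overlap, the analyst still has to check meaning to decide between contrast and free variation.

```python
def environments(phone, words):
    """Collect (preceding, following) contexts of a phone; '#' marks a word edge."""
    envs = set()
    for word in words:
        padded = ["#"] + word + ["#"]
        for i in range(1, len(padded) - 1):
            if padded[i] == phone:
                envs.add((padded[i - 1], padded[i + 1]))
    return envs

def distribution(a, b, words):
    """Return 'complementary' if the two phones never share an environment."""
    shared = environments(a, words) & environments(b, words)
    return "overlapping" if shared else "complementary"

# Narrow transcriptions of top, stop, cat, hat (from the examples above).
WORDS = [
    ["tʰ", "ɔ", "p"],
    ["s", "t", "ɔ", "p"],
    ["kʰ", "æ", "t"],
    ["h", "æ", "t"],
]

print(distribution("tʰ", "t", WORDS))   # aspirated vs. plain t
print(distribution("kʰ", "h", WORDS))   # both occur in # __ æ
```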
Free variation is exceedingly common, and, along
with differing intonation patterns, variation in
allophones is the most important single feature in
the characterization of regional accents. So, more
systematic instances of allophones may be due to
regional “accent”:
• The case of the two /r/: [r] and [R], which can
occur in exactly the same context without change
of meaning, hence with an identical set of
distinctive features but accompanied by non-
distinctive features indicating that the speaker is,
for example, a Scotsman.  The Brave
Based on its production, there are 3 kinds of
1. Aspiration
2. Assimilation
3. Elision
1. Aspiration: an interval of air heard between the end of the
plosive and the following vowel. It is represented by the
symbol [ʰ]. Only voiceless plosives may be aspirated: /p/, /t/
and /k/ in the initial position. It is characterized by a strong
explosion of breath, or puff.
Aspiration may be strong or weak, depending on the

Strong aspiration: voiceless plosives are strongly
aspirated in initial stressed position. Examples: pen
[pʰen], potato [pəˈtʰeɪtəʊ].

Weak aspiration: voiceless plosives are weakly aspirated
in unstressed syllables and in final position. Examples:
pot [pʰɔt], tomorrow [təˈmɔrəʊ].
2. Assimilation: the influence of a sound on a neighboring sound
so that the two become similar or the same (Salzmann, 2004).

Assimilate → to incorporate

It is a common phonological process by which one sound
becomes more like a nearby sound. This can occur either within a
word or between words. In rapid speech, for example, "handbag"
is often pronounced [ˈhæmbæɡ].

As in this example, sound segments typically assimilate to a
following sound (this is called regressive or anticipatory
assimilation), but they may also assimilate to a preceding one
(progressive assimilation).
While assimilation most commonly occurs between immediately
adjacent sounds, it may occur between sounds separated by others
("assimilation at a distance").

• White pepper /waɪt ˈpepə/
If we pronounce this phrase rapidly, the phoneme /t/ in the word
“white” /waɪt/ becomes /p/ because of the influence of the phoneme
/p/ in the word “pepper” /pepə/. So the phrase becomes /waɪp ˈpepə/.
• On the house /ɒn ðə ˈhaʊs/
If we pronounce this phrase rapidly, the phoneme /ð/ in the word
“the” /ðə/ becomes /n/, because of the influence of the phoneme
/n/ in the word “on” /ɒn/. So the phrase becomes /ɒn nə ˈhaʊs/.
3. Elision: the omission of a sound for phonological reasons (Algeo, 1999).

It is an instance of complete sound deletion; for example:

• in consonant clusters, such as facts (deletion of [t]) or fifths (deletion of
[θ]) to ease the articulation process
• when unstressed, the word and often loses its [d]
• entire unstressed syllables are often elided from longer words, such
as comfortable, family and Wednesday

In English as spoken by native speakers, elision comes naturally, and it is often
described as ‘slurred’ or ‘muted.’ Often, elision is deliberate. It is a common
misconception that contractions automatically qualify as elided words, which
comes from slack definitions. Not all elided words are contractions and not all
contractions are elided words (for example, ‘going to’ → ‘gonna’: an elision
that is not a contraction; ‘can not’ → ‘cannot’: a contraction that is not an
elision).
Phonological rules
It often happens in language that the phonetic
environment before or after a sound influences how
this sound is pronounced  sounds become

When a compromise occurs in articulation, changes of
the sound(s) will follow.

Generalizations (simplifications / oversimplifications)
about the patterning of allophones can be stated as
phonological rules.
A phonological rule is a formal way of expressing a
systematic phonological process or diachronic
(historical) sound change in language.

The relationship between the phonemic
representations of words and the phonetic
representations that reflect the pronunciation of words
is rule-governed.

The concept of rule is central to phonology. Why?

Hayes (2007):

(1) rules are language-specific: a rule is not universal, nor
some kind of general principle of speech articulation. The
shortening rule of English (e.g. fifths, with deletion of [θ]) is part of
the phonological pattern of the English language, and must be
learned in some form by children acquiring English.
(2) rules are usually productive in the sense that they extend to
novel cases. “Madu” and “dikit” are not words of English, but if
they become words, we can be confident that they would obey
the rules and be pronounced [mædʊ] and [dhɪkhɪt].
(3) rules give rise to well-formedness intuitions. If a phonetician, or
a speech synthesizer, were to create exceptions to the rule, English
speakers sense the awkwardness of the result. In other words, rule
violations are sensed intuitively.
(4) phonological rules are untaught. Instead, they are learned
intuitively by children from the ambient language data, using
mechanisms that are as yet unknown. In this respect, phonological
rules are very different from rules that are imparted by direct
instruction, like (for example) the rules for traffic lights, or rules of
normative grammar like “don’t end a sentence with a preposition.”
(5) phonological rules are evidently a form of unconscious
knowledge. No matter how hard we try, we cannot access our
phonological rules through introspection.
Types of phonological rules
The following is a representation of the process:

Phonemic form
↓
Phonetic form

In other words, phonological rules apply to the phonemic form to produce the phonetic
form.

Phonological rules can be divided into seven types:

1. Assimilation
2. Dissimilation
3. Insertion
4. Deletion
5. Metathesis
6. Strengthening
7. Weakening

Assimilation: When a sound changes one of its features to be more similar to
an adjacent sound, with respect to some phonetic property.
Phone book /fonbuk/ → /fombuk/

Place assimilation in nasals:

• I can ask [ɑɪ kæn æsk]
• I can see [ɑɪ kæn si]
• I can bake [ɑɪ kæm beɪk]
• I can play [ɑɪ kæm pleɪ]
• I can go [ɑɪ kæŋ goʊ]
• I can come [ɑɪ kæŋ kʌm]

→ The nasal has the same place of articulation as the stop following it:
/n/ → [m] / __ C[labial]
/n/ → [ŋ] / __ C[velar]
/n/ → [n] / elsewhere
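The three-part rule for /n/ can be sketched as a function over a list of segments. This is a minimal illustration, assuming the triggering consonants are the labial stops /p, b/ and the velar stops /k, g/ as in the examples above; word boundaries are ignored and the names are hypothetical.

```python
# Assumption: only stops trigger the rule, matching the slide examples.
LABIAL_STOPS = {"p", "b"}
VELAR_STOPS = {"k", "g"}

def assimilate_nasal(segments):
    """Apply /n/ -> [m] before labial stops and /n/ -> [ŋ] before velar stops."""
    out = list(segments)
    for i in range(len(segments) - 1):
        if segments[i] == "n":
            if segments[i + 1] in LABIAL_STOPS:
                out[i] = "m"
            elif segments[i + 1] in VELAR_STOPS:
                out[i] = "ŋ"
            # elsewhere: /n/ stays [n]
    return out

can = ["k", "æ", "n"]
for verb in (["æ", "s", "k"], ["b", "eɪ", "k"], ["g", "oʊ"]):
    print("".join(assimilate_nasal(can + verb)))
```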
Dissimilation: When a sound changes one of its features to
become less similar to an adjacent sound, usually to make the
two sounds more distinguishable, with respect to some property.
Fifths /fɪfθs/ → /fɪfs/

Dissimilation rules are less common than assimilation rules, at
least in English.

One example of a dissimilation rule is fricative dissimilation,
where /θ/ changes to [t] following another fricative:
• fifth phonemically /fɪfθ/, phonetically often realized as [fɪft]
• sixth phonemically /sɪksθ/, but often realized as [sɪkst]
Insertion: When an extra sound is added between two others; a sound
appears in the surface phonetic form which is not in the underlying
phonemic form → epenthesis

Warmth /wɔrmθ/ → /wɔrmpθ/

Insertion of voiceless stops

• Dance /dænØs/ → [dænts]
• Strength /strɛŋØθ/ → [strɛŋkθ]
• Hamster /hæmØstər/ → [hæmpstər]

/Ø/ → [C[voiceless stop]] / between a nasal and voiceless fricative

→ A voiceless stop is inserted between a nasal and a voiceless fricative
→ The inserted stop has the same place of articulation as the preceding nasal
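The insertion rule can be sketched as follows. The mapping from each nasal to the voiceless stop at the same place of articulation follows the consonant classes in these slides; restricting the triggering fricatives to /f, θ, s, ʃ/ (leaving out /h/) is an assumption of this sketch, and the names are hypothetical.

```python
# Each nasal paired with the voiceless stop at the same place of articulation.
HOMORGANIC_STOP = {"m": "p", "n": "t", "ŋ": "k"}
# Assumption: /h/ is left out of the triggering fricatives.
VOICELESS_FRICATIVES = {"f", "θ", "s", "ʃ"}

def insert_stop(segments):
    """Insert a voiceless stop between a nasal and a following voiceless fricative."""
    out = []
    for i, seg in enumerate(segments):
        out.append(seg)
        nxt = segments[i + 1] if i + 1 < len(segments) else None
        if seg in HOMORGANIC_STOP and nxt in VOICELESS_FRICATIVES:
            out.append(HOMORGANIC_STOP[seg])
    return out

for word in (["d", "æ", "n", "s"],
             ["s", "t", "r", "ɛ", "ŋ", "θ"],
             ["h", "æ", "m", "s", "t", "ə", "r"]):
    print("".join(word), "->", "".join(insert_stop(word)))
```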
Deletion: When a sound, such as a stressless syllable or a weak consonant, is
not pronounced; when a sound which is present in the underlying phonemic
form is not expressed at all in the surface phonetic form.

Hand /hænd/ → [hæn]

List /lɪst/ → [lɪs]

The deletion of /h/:

He handed her his hat.
/hi hændəd hər hɪz hæt/
[hi hændəd Øər Øɪz hæt]

/h/ → [Ø] / in unstressed syllables

→ /h/ is deleted in unstressed syllables

Metathesis: When there is a change in the order of sounds; refers to a reordering of
sounds.
prescribe → perscribe
ask → aks

One way words evolve over time is through metathesis, which is the transposition
of sounds or syllables in a word. After some time, if enough people pronounce the
word in that way, the new pronunciation can eventually be adopted → best known
for the description of historical sound changes (sporadic).

Modern English ‘bird’, ‘first’ have earlier forms ‘brid’ & ‘frist’.

Here are some more examples:

• comfortable → comfterble
• iron → iern
• asterisk → asteriks
• cavalry → calvary
Strengthening: When a sound becomes stronger → fortition

English Aspiration
/C[voiceless stop]/ → [C[aspirated]] / $[stress] __

→ Voiceless stops become aspirated at the beginning of
stressed syllables
→ Aspirated stops are considered stronger because the
duration of voicelessness is much longer than in
unaspirated stops
Weakening: When a sound becomes weaker → lenition

English Flapping
/C[alveolar oral stop]/ → [ɾ] / V[stress] __ V[unstress]

→ An alveolar oral stop /t/ or /d/ becomes a flap when it
occurs after a stressed V and before an unstressed V
→ The flap is weaker because it is shorter and obstructs
air less than the alveolar stops
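The flapping rule can be sketched once stress is encoded in the input. In this illustration a stressed vowel is written with a leading stress mark (for example "ˈæ"); that encoding, the vowel inventory and the function names are assumptions of the sketch, not a standard representation.

```python
VOWELS = {"i", "ɪ", "e", "ɛ", "æ", "ʌ", "ə", "a", "u", "ʊ", "o", "ɔ"}
STRESS = "ˈ"  # assumed encoding: prefixed to a stressed vowel

def is_vowel(seg):
    return seg.lstrip(STRESS) in VOWELS

def is_stressed(seg):
    return seg.startswith(STRESS)

def flap(segments):
    """/t, d/ -> [ɾ] after a stressed vowel and before an unstressed vowel."""
    out = list(segments)
    for i in range(1, len(segments) - 1):
        prev, seg, nxt = segments[i - 1], segments[i], segments[i + 1]
        if (seg in {"t", "d"}
                and is_vowel(prev) and is_stressed(prev)
                and is_vowel(nxt) and not is_stressed(nxt)):
            out[i] = "ɾ"
    return out

print(flap(["k", "ˈæ", "t", "i"]))   # catty
print(flap(["k", "ˈæ", "d", "i"]))   # caddy
print(flap(["ə", "t", "ˈæ", "k"]))   # attack: the /t/ precedes the stressed vowel
```

Under this rule catty and caddy come out identical on the surface, which illustrates how a phonemic contrast can be neutralized in a specific environment.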
Rule Writing
Same or different phonemes?
If the sounds are allophones of the same phoneme,
describe the environments where each allophone occurs.
• The environment should be described as a natural
class, using sound properties
• One allophone may have the environment “elsewhere”

Now…you are ready to write a phonological rule

When one phoneme has multiple allophones, we write a
phonological rule (or rules) to determine where each
allophone appears

- The phoneme appears in its basic form in the mental lexicon
- When it needs to be changed into a different allophone, a
phonological rule applies to make that adjustment
- Phonological rules are part of the mental grammar of a
native speaker
Rule Writing
How to write a phonological rule:

(1) Choose one allophone as basic

- If one allophone has the environment ‘elsewhere’,
pick this one as basic
- Otherwise, if one allophone has an environment that
is a more general natural class, pick this one as basic
- If no allophone has a more general environment, just
pick one as the basic one
Rule Writing
(2) The basic allophone is the “name” of the phoneme (what to put
inside the / /)
- This is the allophone we will get when no phonological rule applies

(3) For each non-basic allophone of the phoneme, write a phonological rule
- The rule states which segment (or segments) it applies to
- The rule then states which properties need changing in order to
turn the basic form of the phoneme into the appropriate allophone
- The rule also states the environment in which it applies
Rule Writing
Conceptually, a phonological rule says, “When phoneme /P/
appears in the designated context, change it into the appropriate allophone.”

Proposal: It is sound properties like “voiced” or “nasal” that
the mental grammar manipulates, not individual speech
sounds like [m]

- Therefore: Always write your phonological rule in terms of
sound properties, even when only one sound is affected!
Rule Writing
Here is how we will express phonological rules in our model of mental grammar:
A → B / X __ Y

A The sound(s) affected by the rule

B The property(ies) that the rule changes
/ ‘In the environment of’
__ Where the affected sound(s) are located with respect to the context
X Preceding context, if any
Y Following context, if any

Always state A, B, X, Y in terms of properties

Phonological rules are commonly used as a notation to capture the sound-related operations
and computations the human brain performs when producing or comprehending spoken
language. They may use phonetic notation, distinctive features, or both.

The basic format for specifying phonological rules is as follows:

The Form: A → B / C __ D

This format is meant to be read as “A becomes B in the environment following C and
preceding D.”

A: affected segment
B: the change
C & D: the context or environment.
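
As an illustration of the A → B / C __ D format, the sketch below applies a rule to a list of segments. It is a hedged example, not a standard tool: `apply_rule`, `aspirate_word_initial` and the set `VOICELESS_STOPS` are hypothetical names, and the word-initial context used here is a simplification of the "beginning of a stressed syllable" environment given for English aspiration.

```python
VOICELESS_STOPS = {"p", "t", "k"}  # hypothetical natural class for this sketch


def apply_rule(segments, is_target, change, left=None, right=None):
    """Apply A -> B / C __ D, with '#' standing for a word boundary.

    is_target tests for A, change produces B, and left/right test the
    preceding (C) and following (D) context; None means "no condition".
    """
    padded = ["#"] + list(segments) + ["#"]
    output = []
    for i in range(1, len(padded) - 1):
        seg = padded[i]
        matches = (is_target(seg)
                   and (left is None or left(padded[i - 1]))
                   and (right is None or right(padded[i + 1])))
        output.append(change(seg) if matches else seg)
    return output


def aspirate_word_initial(segments):
    # Simplified aspiration: voiceless stop -> aspirated / # __
    return apply_rule(
        segments,
        is_target=lambda s: s in VOICELESS_STOPS,
        change=lambda s: s + "ʰ",
        left=lambda s: s == "#",
    )


print(aspirate_word_initial(["p", "ɪ", "t"]))       # word-initial /p/
print(aspirate_word_initial(["s", "p", "ɪ", "t"]))  # /p/ after /s/
```

The context tests are functions rather than fixed symbols so that A, C and D can be stated as classes of sounds, in line with the advice to write rules in terms of properties.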
Rule Writing
Consider the following words:
• write -- ride
• rope -- robe
• lock -- log
• cute -- cued
• pick -- pig
• tap -- tab
Is there a difference in the vowel sounds? Yes! The vowel is longer in the
second word of each pair. The consonants /p t k/ all differ in the same way
from their counterparts /b d g/: they are voiceless, whereas the counterparts
are voiced.

A → B / C ___ D “A becomes B following C and preceding D”.

V → Vː / ___ C[voiced] “Vowels are lengthened preceding voiced consonants”.
Conventional Symbols:
Ø → B / C ___ D “insert B between C & D”
A → Ø / C ___ D “delete A between C & D”

# Word boundary
C [-syllabic] segment
V [+syllabic] segment
+ Morpheme boundary
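
The vowel-lengthening rule V → Vː / ___ C[voiced] can be checked against the word pairs listed earlier. This is a rough sketch only: the segment sets and the name `lengthen_vowels` are assumptions for illustration, and the transcriptions are simplified.

```python
# Hypothetical, deliberately small segment classes for this sketch.
VOWELS = {"ɪ", "æ", "ɒ", "o", "a"}
VOICED_CONSONANTS = {"b", "d", "g", "v", "z", "m", "n", "l", "r"}


def lengthen_vowels(segments):
    """Apply V -> Vː / __ C[voiced] to a list of segments."""
    output = []
    for i, seg in enumerate(segments):
        following = segments[i + 1] if i + 1 < len(segments) else None
        if seg in VOWELS and following in VOICED_CONSONANTS:
            output.append(seg + "ː")  # mark the vowel as lengthened
        else:
            output.append(seg)
    return output


print(lengthen_vowels(["p", "ɪ", "k"]))  # pick: voiceless /k/ follows
print(lengthen_vowels(["p", "ɪ", "g"]))  # pig: voiced /g/ follows
```

Only the vowel before a voiced consonant receives the length mark, which is the pattern the pairs pick/pig and tap/tab are meant to show.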
Acoustic Phonetics
Acoustic phonetics is a branch of phonetics that deals with the physical
characteristics of sound waves which carry speech sounds between
mouth and ear (transmission of sound).

Phonetics refers to the physiological and acoustic parts of the following
diagram, while phonology resides in the brain:
Speech sound waves can be analyzed in terms of their acoustic properties:

PRAAT: a computer program for visualizing, playing, annotating, and
analyzing sound objects in terms of their acoustic properties (e.g.
frequency, pitch, etc.).

Pulses & pitch: In the oscillogram, pulses are indicated by
blue solid lines = phonation mode (voiced)

In the spectrogram, the pitch track of the voice shows what you
perceive as high and low frequencies.
Tiers are used to segment a speech waveform and attach labels for each
segment for further processing:

Speech acoustic analysis can be carried out using a spectrogram and an oscillogram.

Oscillogram: represents speech signals → amplitude (vertical axis) and time
or total duration (horizontal axis)

Spectrogram: graphic representation of sounds in terms of their components:
• frequency (vertical axis)
• time (horizontal axis)
• third dimension (dark shading → acoustic energy; F1, F2, F3, F4)
Frequency is the number of cycles completed per second → measured in
Hertz (Hz).
When the wave meets the axis for the second time, one cycle is completed.
A sine wave is the simplest kind of periodic wave; the lowest-frequency
sine wave component of a complex wave is the fundamental frequency (F0).

[Figure: one cycle of a sine wave]
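
The definitions of frequency and fundamental frequency can be made concrete with a little arithmetic. The function names and the numbers below are illustrative assumptions, not measurements from the slides.

```python
def frequency_hz(cycles, seconds):
    """Frequency = number of cycles completed per second (Hz)."""
    return cycles / seconds


def period_s(freq_hz):
    """Duration of one cycle in seconds."""
    return 1.0 / freq_hz


def fundamental_frequency(component_freqs_hz):
    """F0 = the lowest-frequency sine component of a complex wave."""
    return min(component_freqs_hz)


print(frequency_hz(200, 2.0))                        # 200 cycles in 2 s
print(period_s(100.0))                               # one cycle of a 100 Hz wave
print(fundamental_frequency([300.0, 100.0, 200.0]))  # components of a complex wave
```

A wave that completes 200 cycles in 2 seconds has a frequency of 100 Hz, so each cycle lasts one hundredth of a second.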
The spectrogram shows formants as concentrations of acoustic energy.
Vowels are characterized by four formants (F1, F2, F3, F4).

Formants in PRAAT are also shown by red dotted lines in the spectrogram.

Recording in a sound-proof room will reduce the number of dotted lines.
Formants can be digitally tracked by formant-based speech production and
linear predictive coding (LPC) (Harrington, 2010).

These formants are numbered from the lower to higher frequency.

The different shapes of the vocal tract and the different positions of the
tongue generate different formant patterns.

Therefore, different vowels are produced by changing the position of the
tongue and thereby the shape of the vocal tract.

Vowels are usually classified by the part of the tongue that is raised (front,
middle or back) and by the degree of raising that takes place,
namely: close, half-close, half-open and open.

For example, /i/ is produced with the tongue raised at the front of the mouth
and with unrounded lips, while /o/ is produced with the tongue raised at the
back of the mouth and with rounded lips.
Comparison of formant values is precarious across speakers of
different sex.
For adult females, the length of the vocal tract is around 13 cm
and for adult males, it can vary to over 18 cm (Maragakis,
The vocal tracts of women are shorter; therefore, they have
higher resonance frequencies than those of men (Flynn, 2011).
Females’ formant frequencies are roughly 10% to 15% higher;
therefore, females produce clearer speech compared to males
(Foulkes & Docherty, 1999; Simpson, 2009; Wang & van
Heuven, 2006).
Data for Vowel Analysis
Speakers produce words that contain the target vowels from a word list (elicited speech).
The rationale for a word list is to have control over the phonetic environment of the vowels
being investigated (King, 2006). How?

• embed the words in a carrier sentence → speakers read the word list and repeat
each word a number of times.
For example: “Say ___ again.”

• design questionnaires to elicit single words that illustrate the target vowels
For example:
Interviewer: “What number is this?”
Interviewee: “Four.”

• Lexical sets

• Read speech
Data for Vowel Analysis

Preferably, the vowels occur in an identical environment, where stops and
fricatives are favored as they have minimum effect on vowels
(King, 2006), such as in:
• [sVs] contexts
• [hVd] contexts
• [bV] or [bVt]
• [hV]
Data for Vowel Analysis
Target Words (Long and Short Vowels) for Monophthongs (source: Manueli,
Pillai, & Dumanig, 2010):

No. Vowel Word 1 (short) Word 2 (long)

1 ɪ bib bid
2 iː beep bead
3 e beck beg
4 a back bag
5 ʌ buck bug
6 ɑː bard barb
7 ɒ pot pod
8 ɔː bought board
9 ʊ put could
10 uː boot booed
11 ɜː burp bird
Data for Vowel Analysis
Target Words for Diphthongs from Pillai (2014):

No Vowel Word
1 eɪ bayed
2 aɪ bide
3 ɔɪ Boyd
4 əʊ bode
5 aʊ bout
6 ɪə beard
7 ʊə poor
8 eə bear
Data for Vowel Analysis
Lexical Set: a group of words that share a specific form or meaning.
In the study of vowels, each set groups words whose target vowel is
pronounced in a similar way in a given variety of a language.

A well-known lexical set design is Wells’ (1987) lexical sets,
used to describe the varieties of English.

Watt and Tillotson (2001) examined the fronting of /o/ in
Bradford English based on the GOAT lexical set.
Wells lexical set was also used to explain the varying
degrees of accents in the British Isles (Foulkes & Docherty,
1999) and to specifically describe changes in the London
vowel system of young and elderly informants from inner
and outer London (Torgersen, Kerswill & Fox, 2006).
Data for Vowel Analysis
The standard lexical sets by Wells (1987) (reproduced from Wells, 1987, p.
123):
No.  RP    GenAm  Keyword
1.   ɪ     ɪ      KIT
2.   e     ɛ      DRESS
3.   æ     æ      TRAP
4.   ɒ     ɑ      LOT
5.   ʌ     ʌ      STRUT
6.   ʊ     ʊ      FOOT
7.   ɑː    æ      BATH
8.   ɒ     ɔ      CLOTH
9.   ɜː¹   ɜr     NURSE
10.  iː    i      FLEECE
11.  eɪ    eɪ     FACE
12.  ɑː    ɑ      PALM
13.  ɔː    ɔ      THOUGHT
14.  əʊ    o      GOAT
15.  uː    u      GOOSE
16.  aɪ    aɪ     PRICE
17.  ɔɪ    ɔɪ     CHOICE
18.  aʊ    aʊ     MOUTH
19.  ɪə¹   ɪr     NEAR
20.  ɛə¹   ɛr     SQUARE
21.  ɑː¹   ɑr     START
22.  ɔː¹   ɔr     NORTH
23.  ɔː¹   or     FORCE
24.  ʊə¹   ʊr     CURE
¹ with /r/ following before a vowel only.
Data for Vowel Analysis
However, there are also shortcomings in the use of lexical
sets because phonological systems and phonetic
realizations are always evolving (Ferragne and Pellegrino,

For this reason, the Wells lexical set has been modified in
some studies. An example is a study by Hickey (1999) on
Dublin English, which added the words MEAT, GIRL, DANCE
and PRIDE because they were deemed necessary
to capture the vowel realizations of Dublin English.
Data for Vowel Analysis
Read Speech
• Data are also collected from segments of read speech.
• Commonly used text in English phonetics: The North Wind
and The Sun (NWS)
• Using specific texts to collect vowels may result in not
obtaining all of the target vowels under study.
• King (2006): if the text is not created specifically to
contain all target sounds, there is no control over
influences on vowel quality from different environments
or over the frequency of the target sounds.
Data for Vowel Analysis
(3) Spontaneous Speech
• Speech that is unrehearsed and produced spontaneously. It is used because
vowels that are collected in citation form may be hyper-articulated
or articulated too carefully and so may not correspond to true
representations of the vowels.
• Sources: interviews, having the speakers describe pictures,
recordings of a set of monologues such as the news and
commentaries, a televised series of talk shows, telephone
exchanges, or two speakers chatting freely and unmonitored for some
time while being recorded
• The disadvantage of obtaining the vowels from spontaneous
speech is that it may not cover all of the vowels being investigated,
and it is further prone to effects from elision, intonation, stress, vowel
reduction and other phenomena related to connected speech.
Practice 1
Before recording:
• Choose target words in English which cover all of the target vowels
• Choose two respondents, one boy and one girl
• They must be of the same age, born and bred in the same area
• They must be healthy and have no dental problems

Recording directly from your laptop to Praat:

• Find a quiet room
• Turn off all electronic devices (AC, fan, cellphone, fridge, etc.)
• Close the doors and windows
• Maintain the quietness: if the room becomes noisy while recording, stop,
wait until it is quiet again, and then resume recording
• Have each respondent repeat the word 30 times
Practice 2
Saving your sound file:
• Save as a WAV file
• Annotate in a TextGrid file
• A TextGrid object consists of a number of tiers. An interval
tier is a connected sequence of labeled intervals, with
boundaries in between. A point tier is a sequence of
labeled points.
• For interval tiers, create INTERVIEWER, CONSULTANT’S
• For point tiers, create F1 and F2
Practice 3:
Measuring your vowel:

F1 and F2
in Hertz
Practice 4
• Insert your measurements in Excel:

Boy: /o/ from "topi"
No.  Time    Duration  F1 (Hz)  F2 (Hz)  F1 (Bark)  F2 (Bark)
1    1.564   482       574      1149     5.37       9.41
2    6.861   488       574      1149     5.37       9.41
3    11.632  721       533      1232     5.02       9.88

Girl: /o/ from "topi"
No.  Time    Duration  F1 (Hz)  F2 (Hz)  F1 (Bark)  F2 (Bark)
31   0.980   600       737      1469     6.67       11.06
32   3.675   660       645      1423     5.95       10.84
33   6.455   687       776      1674     6.96       11.93

• Conversion from Hz to Bark:

*D refers to the column in Excel of the F1 or F2 measurement in Hertz
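
The conversion formula itself is not reproduced in the text above. The Bark values in the measurement table are, however, consistent with the Zwicker and Terhardt approximation, Bark = 13·arctan(0.00076·f) + 3.5·arctan((f/7500)²). The sketch below assumes that this is the intended formula; the function name `hz_to_bark` is hypothetical.

```python
import math


def hz_to_bark(f_hz):
    """Convert a frequency in Hz to Bark (Zwicker and Terhardt approximation).

    Assumption: this is the formula behind the table values; the slides
    do not show it.
    """
    return 13 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500) ** 2)


# Values taken from the first row of the boy's measurements.
print(round(hz_to_bark(574), 2))   # F1 = 574 Hz  -> 5.37
print(round(hz_to_bark(1149), 2))  # F2 = 1149 Hz -> 9.41
```

Under the same assumption, a spreadsheet equivalent would be `=13*ATAN(0.00076*D2)+3.5*ATAN((D2/7500)^2)`, where D2 holds the measurement in Hertz.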
Practice 5
• Conducting t-tests: to test the similarities or
differences between the boy and girl vowel
production, use VassarStats: Website for
Statistical Computation at
• Choose t-Tests & Procedures
• Use Two-Sample t-Test for Correlated Samples
• After calculation, choose p two-tailed for results
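
For readers who want to see what the correlated-samples test computes, here is a minimal sketch of the t statistic. It is not the VassarStats implementation: the function name `paired_t` is hypothetical, and pairing the boy's tokens 1 to 3 with the girl's tokens 31 to 33 is an assumption made only to have example input.

```python
import math
import statistics


def paired_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for a correlated-samples t-test."""
    if len(sample_a) != len(sample_b):
        raise ValueError("correlated samples must have the same length")
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# F1 (Hz) values copied from the measurement table above.
boy_f1 = [574, 574, 533]
girl_f1 = [737, 645, 776]
t_value, df = paired_t(boy_f1, girl_f1)
print(t_value, df)
```

The two-tailed p value is then read from the t distribution with the returned degrees of freedom, which is the step the website performs for you.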
Practice 6
• Plotting vowels in Excel:

[Figure: two panels plotting Boy [o] and Girl [o], with F2 (Bark) on the
horizontal axis (17 to 7) and F1 (Bark) on the vertical axis (2 to 11)]

• Left: scatter plot, right: vowel plot (in average)

Practice 7

• Writing up a simple report:

Our data are taken from the word “topi” to extract the vowel /o/ of Bahasa Indonesia. We
recorded one boy and one girl aged …, born and bred in … and living in … .
The results of our measurements are as follows: (copy paste your measurements from
Excel)
The measurements of the tokens are shown in the following graph: (copy paste your vowel
scatter plot from Excel)

Based on the graph above, there are a number of outliers (mention this if there are any).
These outliers are caused by … (the first measurement of the researchers) or (the
production from the speaker him/herself). Therefore, we conducted a second
measurement on these outliers, and the outcome is as follows: (copy paste the
corrected graph from Excel)

The average measurements of the tokens are shown in the following graph: (copy paste
your vowel plot from Excel)

In conclusion, the data show that the sound [o] from Girl is lower/higher than Boy, but
more fronted/back than Boy.
A syllable is a unit of organization for a
sequence of speech sounds. For example, the
word water is composed of two syllables:

A syllable is typically made up of a syllable nucleus (most often a vowel)
with optional initial and final margins (typically, consonants).
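
The nucleus-plus-margins structure can be sketched as a small function that splits one syllable into its three parts. The vowel set and the name `split_syllable` are assumptions for illustration, and the sketch only handles vowel nuclei, although the text notes that the nucleus is only "most often" a vowel.

```python
# Hypothetical vowel inventory for this sketch.
VOWELS = {"i", "ɪ", "e", "ɛ", "æ", "a", "ʌ", "ə", "o", "ɔ", "u", "ʊ"}


def split_syllable(segments):
    """Split one syllable into (initial margin, nucleus, final margin)."""
    vowel_positions = [i for i, seg in enumerate(segments) if seg in VOWELS]
    if not vowel_positions:
        raise ValueError("this sketch expects a vowel nucleus")
    first, last = vowel_positions[0], vowel_positions[-1]
    if vowel_positions != list(range(first, last + 1)):
        raise ValueError("vowels are not adjacent: more than one syllable?")
    return segments[:first], segments[first:last + 1], segments[last + 1:]


print(split_syllable(["b", "æ", "t"]))  # both margins present
print(split_syllable(["s", "i"]))       # no final margin
```

Either margin may come back empty, which reflects the point that the initial and final margins are optional while the nucleus is not.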
Types of Syllables
There are six types of syllables:
1. A closed syllable ends in a consonant. The vowel has a short vowel sound,
as in the word bat.
2. An open syllable ends in a vowel. The vowel has a long vowel sound, as in
the first syllable of apron.

3. A vowel-consonant-e syllable is typically found at the end of a word. The
final e is silent and makes the vowel before it long, as in the word
4. A vowel team syllable has two vowels next to each other that together
make a new sound, as in the word south.
5. A consonant+l-e syllable is found in words like handle, puzzle, and middle.
6. An r-controlled syllable contains a vowel followed by the letter r. The r
controls the vowel and changes the way it is pronounced, as in the word