Rhythm in West African tone languages: a study of Ibibio, Anyi and Ega


Ulrike Gut, Eno-Abasi Urua, Sandrine Adouakou & Dafydd Gibbon

Bielefeld University, University of Uyo
{gut, urua, adouakou, gibbon}

Abstract

The speech rhythm of three West African languages is measured with
current methods which focus on durational relationships of various
units. It is shown that predictions about the phonological structure
and their classification in terms of speech rhythm can be confirmed
with data from these languages.

1. Introduction

The concept of speech rhythm is a much discussed issue in phonetics and
phonology. Impressionistic accounts agree that the languages of the
world differ in their rhythm (syllable-timing, stress-timing,
mora-timing) but attempts to capture these differences acoustically
have so far been unsatisfactory. The aim of this paper is to describe
three West African tone languages, Anyi, Ega and Ibibio, which have not
previously been classified in terms of their speech rhythm. We started
from the first working hypothesis that they are syllable-timed, as
reflected in a low variation between syllable durations. Next, we tried
to correlate phonological features such as absence/presence of phonemic
vowel length, syllable structure, and tone system with the speech
rhythm.

2. Rhythm measurements

The languages of the world have traditionally been divided by rhythm
into stress-timed and syllable-timed languages (Pike 1945, Abercrombie
1967). Rhythm is understood to be a periodic recurrence of events,
which in stress-timed languages are stress beats and in syllable-timed
languages syllables. Abercrombie (1967) sees speech rhythm as
“essentially a muscular rhythm, and the muscles concerned are the
breathing muscles” (p. 96). In syllable-timed languages “chest-pulses,
and hence the syllables, recur at equal intervals of time – they are
isochronous” (p. 97). Syllables are assumed to be equal in length
(p. 98); stress-pulses, on the other hand, are unevenly spaced.
Abercrombie cites Yoruba, a West African tone language, as an example
of a syllable-timed language.

Stress-timed languages such as English, in contrast, are supposed to
have regularly recurring stress beats, with intervals of equal length
separating successive beats. Since the number of syllables between two
stress beats varies, their length is adjusted to fit into the stress
interval; syllable length is hence very variable in stress-timed
languages.

Many researchers have tried and failed to find an acoustic basis for
these claims. First, the interstress interval in stress-timed languages
such as English is not of equal length (Classe 1939, O’Connor 1965,
Uldall 1971, Hill et al. 1979, Fauré et al. 1980, Roach 1982, Dauer
1983), but varies from 488 to 566 ms. Roach (1982) divided the duration
of a tone unit in three stress-timed (English, Russian, Arabic) and
three syllable-timed languages (French, Telugu, Yoruba) by the number
of feet, which gives a hypothetical ideal duration for a foot with
complete isochrony assumed. Actual measurements of foot durations taken
from speech in these languages were then compared with the predicted
value and the percentage deviation was calculated. Roach showed that
the variance of the percentage deviation in English is higher than in
French, Telugu and Yoruba, which contradicts expectations.

TAPS Proceedings Gut,Urua,Adouakou,Gibbon

Second, the variability of syllable length does not differ
significantly between stress-timed and syllable-timed languages. Roach
(1982) calculated the standard deviation of syllable durations in
English, Russian, Arabic and French, Telugu and Yoruba. No significant
difference was found: English 86 ms, Russian 77 ms, Arabic 76 ms;
French 75.5 ms, Telugu 66 ms and Yoruba 81 ms.

Dauer (1983) suggested that “rhythmic differences […] between languages
[…] are more a result of phonological, phonetic, lexical, and syntactic
facts about that language than any attempt on the part of the speaker
to equalize interstress or intersyllable intervals” (p. 55). In Dauer’s
view, speech rhythm reflects the variety of syllable structures,
phonological vowel length distinctions, absence/presence of vowel
reduction and lexical stress. Whereas languages classified as
stress-timed such as English show a variety of different syllable
structures (CV (30% frequency), CVC (34%), VC (15%), V (8%), CVCC
(6%)), languages classified as syllable-timed have a majority of CV
syllables (58% for Spanish). Since syllables increase in length when
segments are added and closed syllables are longer than open ones
(Delattre 1966), speech rhythm measured in syllable duration
differences reflects syllable structure distribution. Equally,
differences in rhythm between languages reflect whether a language has
vowel reduction or not; those classified as stress-timed usually do. In
addition, syllable-timed languages either do not have lexical stress or
realize accent by variations in pitch contour. Conversely, stress-timed
languages realize word level stress by a combination of length, pitch,
loudness and quality changes, which result in clearly discernible
beats.

This approach is partly reflected in recent measurements of the
acoustic correlates of speech rhythm. Ramus, Nespor & Mehler (1999)
segment speech into vocalic and consonantal parts and compute the
proportion of the vocalic intervals of a sentence, the standard
deviation of these intervals, and the standard deviation of the
consonantal intervals. By comparing carefully selected read sentences
by four speakers each of eight different languages along the axes of
the percentage of vowels (%V) and the standard deviation of the
consonantal intervals (delta C), Ramus et al. succeed in grouping some
languages similarly to the originally suggested groups of stress-timing
and syllable-timing. English, Polish and Dutch, all presumed
stress-timed languages, group together with a relatively low vocalic
proportion (around 39%) and a relatively high standard deviation of
consonantal intervals. French, Italian, Spanish and Catalan group
together at a higher %V (about 44%) and lower delta C. Japanese,
finally, differs from those two groups by having an even higher vocalic
proportion (53%) and even lower consonantal standard deviation.

Grabe & Low (2001) measure the difference in duration between
successive vowels and between successive consonantal intervals. Both
approaches succeed in classifying languages that show mixed
phonological properties as suggested by Dauer (1983), e.g. vowel
reduction but small variation in syllable structure.

With the exception of Roach (1982), no study has yet investigated the
speech rhythm of West African languages. In this paper, we selected
three of these languages, Anyi (Kwa), Ega (putative Kwa) and Ibibio
(Benue-Congo: Lower Cross). Our hypothesis is that they have a
syllable-timed rather than stress-timed rhythm and that their
phonological features such as absence/presence of phonemic vowel
length, syllable structure, and tone system correlate with their speech
rhythm.

3. Languages

3.1 Anyi

Anyi is spoken in the Eastern part of Ivory Coast where, according to
an inventory by Burmeister (in prep.), 10 varieties exist. In Ghana,
two additional varieties of Anyi are spoken. According to the
classification by Stewart (1989), Anyi belongs to the
Kwa languages. The consonantal inventory of Anyi includes a series of
voiced and voiceless stops: bilabial, dental, palatal, velar and
labio-velar. Voiceless labio-dental, dental and glottal fricatives
occur. In addition, there are liquids, nasals and glides. In general,
apart from the labio-velars, which are absent in the Ndenye variety of
Anyi, all the consonants are permissible in the consonant structure of
Anyi. Anyi has 14 vowels, 9 oral and 5 nasal. The Sanvi variety has
more than 14 vowels, including the central vowel / / and the
corresponding nasal vowel / ~ /. Phonetic long vowels occur but their
analysis as a sequence of two vowels has been suggested, which is
corroborated by the fact that they almost always have a complex tone.
Words with contrasting vowel length therefore always also have a tonal
contrast. Only the vowels of the third and fourth degree of aperture do
not allow corresponding nasals. The following syllable structures occur
in Anyi: V, CV, N, CVV, C1C2V where C2 is a semivowel, and CLV where L
is a liquid. Anyi has four phonological tones: two level tones, H and
L, and two contour tones, rising LH and falling HL. The appearance of a
mid tone is due to the effect of tone sandhi rules.

3.2 Ega

Ega is an endangered isolate within a Kru-speaking area of South
Central Ivory Coast (Dida to the West, North, East; Godie to the
South), with around 1000 full speakers. Ega has been classified as a
Kwa language. However, much of the information about the language is of
uncertain status, and Ega is currently the subject of ongoing
linguistic and sociolinguistic documentation research by Connell,
Ahoua, and Gibbon (2002). A published sketch of the language by Rémy
Bole-Richard is included in Hérault et al. (1982), and a detailed
phonetic study has been made by Dago (1999). Although the language has
been classified as Kwa, Ega has phonological features, such as the full
implosive series, and morphological features, such as a complex nominal
classification system, which distinguish it from the geographically
nearest Kwa languages. Syllables are V, CV, CCV. The consonant system
contains a full series of unvoiced, voiced (voiced fortis) and
implosive (voiced lenis) stops: labial, dental, palatal, velar,
labiovelar. There are 9 vowels, with ATR harmony, but no nasal or
length contrast (except in isolated words such as /'fe:~/ "all"). The
tone system of Ega has a three-way contrast: high (H), mid (M) and low
(L). Initial observations indicate that Ega has discrete-level tone
patterning in context, with abrupt final lowering.

3.3 Ibibio

Ibibio has been classified as a Lower Cross language ((New)
Benue-Congo) spoken in the south-eastern part of Nigeria (Faraclas
1989, Williamson 1989) by about four million people (Essien 1991). The
Ibibio syllable structure is (V/N), CV, CVV, CVC, CVVC, CGV, CɾV. The
V/N is the syllabic prefix, which may be either a vowel or a syllabic
nasal consonant, usually homorganic to the following consonant.
Consonant clusters are hardly attested and, where they occur, are
usually restricted to only Cɾ or CG. The G is either a palatal or
labial-velar glide (Cy or Cw) which arises from deletion or other
phonological processes in the language. The ɾ is an alveolar tap
arising from two phonological processes: /d/ weakening to ɾ and V1
deletion in a CV1dV2 sequence. These structures may be modified through
suffixation. Owing to overlapping and neutralisation in Ibibio
consonants, there are differences in the number of phonemic and even
phonetic consonants proposed for Ibibio. Urua (1990, 2000) proposes
thirteen phonemic consonants, comprising five oral and four nasal
stops, two fricatives, one palatal and one labial-velar glide. Although
voiced and voiceless stops are attested, contrast between them is
significant mainly in word-initial position. In word-final position the
voicing contrast is neutralised and only unreleased and voiceless stops
occur. In intervocalic position, the stops are weakened to homorganic
continuants/taps. Consonants may be
lengthened for morphophonemic reasons. The number of phonemic vowels
proposed ranges from six to ten. The six which are common to all the
researchers include /i, e, a, , o, u/. These six vowels occur in all
positions in the word, albeit with allophonic variations. Crucially,
all six short vowels contrast with their long counterparts in a C-C
environment, e.g. dep/deep ‘buy/scratch’. Vowels undergo assimilation,
lengthening, shortening and deletion but not reduction in the sense in
which English vowels undergo reduction in unstressed syllables. The
tone system fits into a terrace level pattern since it has two level
pitches, High and Low, plus a contrastive downstepped High tone in
addition to two contour tones, High-Low and Low-High. The contour tones
are combinations of the level pitches. Both High and Low tones in
Ibibio manifest downdrift (Urua 1996/1997, 2000, Gibbon, Urua & Gut
2000).

4. Method

4.1 Subjects

One Anyi speaker, one Ibibio speaker and one Ega speaker were recorded.
The Anyi speaker is male and has been living in the Ivory Coast all his
life. The Ibibio speaker is female and has lived all of her life in
Ibibio land, with occasional sojourns of not more than two years at a
time in Scotland and Germany. The Ega speaker is male and has lived in
the Ivory Coast all his life.

4.2 Data

The Anyi speaker told a story lasting 1.42 minutes. The Ega speaker
spoke an address to the elders of his home village lasting about 3
minutes. The Ibibio speaker read a formal address to the elders of her
village lasting about two minutes. For all three speakers 12 sentences
were selected which had been spoken without hesitation, repairs or
restarts and which had a minimum length of 8 syllables.

4.3 Analysis

The sentences were transcribed using Transcriber 1.4 (Ega and Anyi) and
Praat (Ibibio). The length of each syllable was measured, a phonetic
transcription was made in SAMPA, and the syllable structure was
transcribed.

The Rhythm Index (RI) (Low & Grabe 1995) for each sentence was then
calculated using the following formula:

$$RI = 100 \times \left[ \sum_{k=1}^{m-1} \left| \frac{d_k - d_{k+1}}{(d_k + d_{k+1})/2} \right| \right] / (m-1)$$

where m stands for the number of units in the sentence, d stands for
duration, and d_i = d_k and d_j = d_{k+1} denote the two members of an
adjacent pair. That is, for a sequence of units (either syllables or
vowels) of length d, the average of the absolute differences between
adjacent units is calculated. The differences are normalised by
dividing each difference by the average duration of the two units in
the pair. If the units are very similar in duration, the RI will be
close to 0, whereas for maximal differences the RI will approach 200. A
value of 0 will be interpreted as perfect syllable-timing; higher
values reflect a tendency for stress-timing.

In addition, we calculated the Rhythm Ratio (Gibbon & Gut, 2001) for
each sentence. This measurement is based on the following formula:

$$RR = 100 \times \left[ \sum_{k=1}^{m-1} \frac{d_i}{d_j} \right] / (m-1)$$

where d_i = d_k and d_j = d_{k+1} if d_k is smaller than d_{k+1}, and
d_i = d_{k+1} and d_j = d_k otherwise. In other words, for each pair of
adjacent syllables, the shorter is divided by the longer. The average
of all these ratios is calculated and multiplied by 100. Thus, if the
RR equals 100 we have perfect syllable-timing. The lower the degree of
syllable-timing, the lower the RR value. Unlike the RI, the RR does not
calculate absolute differences in length between adjacent units but
computes their ratio. Also unlike the RI, the RR
measurement does not normalise for
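The two measures defined in this section can be sketched in a few lines of Python. This is only an illustrative reading of the RI and RR formulas, not the authors' implementation: the function names `rhythm_index` and `rhythm_ratio` are hypothetical, and the durations in the usage lines are invented values in milliseconds.

```python
from typing import Sequence


def rhythm_index(durations: Sequence[float]) -> float:
    """RI: mean absolute difference between adjacent units, each
    difference normalised by the mean duration of the pair, times 100."""
    if len(durations) < 2:
        raise ValueError("at least two units are needed")
    terms = [abs(a - b) / ((a + b) / 2)
             for a, b in zip(durations, durations[1:])]
    return 100 * sum(terms) / len(terms)


def rhythm_ratio(durations: Sequence[float]) -> float:
    """RR: mean ratio of the shorter to the longer unit in each
    adjacent pair, times 100."""
    if len(durations) < 2:
        raise ValueError("at least two units are needed")
    terms = [min(a, b) / max(a, b)
             for a, b in zip(durations, durations[1:])]
    return 100 * sum(terms) / len(terms)


# Invented example: three syllables of identical duration (ms).
print(rhythm_index([100, 100, 100]))  # 0.0   (no variation)
print(rhythm_ratio([100, 100, 100]))  # 100.0 (no variation)
```

Both functions assume strictly positive durations and take one sentence at a time, so the per-language values would be obtained by averaging over the selected sentences.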

With the help of a wide-band spectrogram, vocalic and consonantal parts
of the speech signal were annotated. In order to ensure comparability,
the annotation technique used by Ramus et al. (1999) was adopted. This
means that pre-vocalic glides were treated as consonants whereas
post-vocalic glides were treated as vowels. Thus, vowels (including
post-vocalic glides) were coded as V and stops, fricatives, liquids,
nasals, pre-vocalic glides, implosives and approximants were coded as
C. The beginning and end of a vocalic interval were determined using
standard phonetic criteria.

Figure 1. %V and delta C of English, Dutch, Polish, Spanish, Italian,
French, Catalan and Japanese classified by Ramus et al. (1999) and
Anyi, Ega and Ibibio.
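Given a V/C annotation of this kind, the vocalic proportion (%V) and the standard deviation of the consonantal intervals (delta C) can be computed roughly as follows. This is a sketch under stated assumptions rather than the procedure of Ramus et al. (1999): the function name `percent_v_and_delta_c` and the `(label, duration)` input format are hypothetical, adjacent segments of the same class are merged into one interval, and the population standard deviation is assumed because the text does not say which variant was used.

```python
from itertools import groupby
from statistics import pstdev


def percent_v_and_delta_c(segments):
    """segments: list of (label, duration) pairs, label 'V' or 'C'.

    Returns (%V, delta C) for one sentence.
    """
    # Merge runs of same-class segments into vocalic/consonantal intervals.
    intervals = [(label, sum(dur for _, dur in run))
                 for label, run in groupby(segments, key=lambda s: s[0])]
    total = sum(dur for _, dur in intervals)
    vocalic = sum(dur for label, dur in intervals if label == "V")
    consonantal = [dur for label, dur in intervals if label == "C"]
    percent_v = 100 * vocalic / total
    # Assumption: population standard deviation.
    delta_c = pstdev(consonantal)
    return percent_v, delta_c


# Invented annotation (durations in ms); the two adjacent C segments
# are merged into a single consonantal interval of 120 ms.
example = [("C", 80), ("V", 120), ("C", 50), ("C", 70), ("V", 100), ("C", 80)]
percent_v, delta_c = percent_v_and_delta_c(example)
print(percent_v, delta_c)
```

The function expects at least one consonantal interval per sentence; swapping `pstdev` for `statistics.stdev` would give the sample standard deviation instead.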
5. Results

Table 1 lists the average RI and RR across all twelve sentences for
Anyi, Ega and Ibibio. All three languages exhibit a clear tendency
towards syllable-timing, i.e. the difference in duration between
adjacent syllables is relatively small. Both the rhythm ratio (RR),
which is much closer to 100 than to 0, and the rhythm index (RI), which
is closer to 0 than to 200, indicate this.

          RR     RI
Anyi      65.8   43.9
Ega       70.1   37.3
Ibibio    66.3   42.4
Table 1. Average RR and RI (syllables) across the 12 sentences in Anyi,
Ega and Ibibio.

Figure 1 shows the speech rhythm of Anyi, Ega and Ibibio, measured with
the method proposed by Ramus et al. (1999), compared to the other
languages investigated by them. Anyi and Ibibio group closer to the
syllable-timed than the stress-timed languages, although they have a
higher delta C value. Ega lies even below Japanese, with a very high
vocalic proportion and a very low delta C value.

Table 2 presents the relative frequency of different syllable types in
Anyi, Ega and Ibibio. Differences in both the occurrence of syllable
types and their relative frequency appear. In the Anyi speech, only six
different syllable types (CV, V, CCV where the second C is either a
liquid or an approximant, syllabic N, CVV and CCCV with a liquid and an
approximant as the second and the third consonant) occur, with CV being
the most frequent (61.5%) and V being the second most frequent (21.9%).
In Ega, we observed nine different syllable types: CV, V, CCV with /r/
or /j/ as the second consonant, CVV, CCCV with a liquid and an
approximant as the second and the third consonant, VN, CVN, and CCVN.
The most frequent syllable type is CV (70.1%), the second most frequent
type is CVV (14.9%) and the third most frequent V (9.9%). In Ibibio,
seven different syllable types were produced: CV, V, CCV with /n/ and
/r/ as second consonant, N, CVV, CVC with either a nasal or a stop in
the coda position and CVVN. The most frequent syllable type is CV
(47%), the second most frequent is CVV (22.9%) and the third most
frequent is CVN (14.7%).
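Relative frequencies of this kind can be derived from the syllable structure transcription by simple counting. The following is a minimal sketch, assuming each syllable has already been labelled with its type as a string; the function name `syllable_type_percentages` and the sample labels are hypothetical and do not reproduce the recorded data.

```python
from collections import Counter


def syllable_type_percentages(syllable_types):
    """Map each syllable type to its share (in percent) of all syllables,
    ordered from most to least frequent."""
    counts = Counter(syllable_types)
    total = sum(counts.values())
    return {stype: 100 * n / total for stype, n in counts.most_common()}


# Invented labels for one short utterance.
labels = ["CV", "CV", "V", "CVV"]
print(syllable_type_percentages(labels))
# {'CV': 50.0, 'V': 25.0, 'CVV': 25.0}
```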
Anyi    61.5  21.9  9.1  3    3     -  0.6   -     -     -     164
Ega     70.1   9.9  2.5  -   14.9   1  0.35  0.35  0.35  0.35  281
Ibibio  47     5    1.7  7.8 22     -  -     -    15.1   0.9   231
Table 2. Percentage of different syllable types occurring in Anyi, Ega
and Ibibio.

In Anyi, 3% of all syllables consist of syllabic nasals, but no closed
syllables occur. Ega shows a very high percentage of open syllables
(99%) and a complete absence of syllabic nasals. Only nasals occur in
the coda of closed syllables. Ibibio has more than 16% closed
syllables, all with a nasal in the coda position, and more than 7%
syllabic nasals.

6. Discussion

Our data show that the degree of syllable-timing, as measured both in
the RR and the RI, as well as with the method proposed by Ramus et al.
(1999), is very high for all three West African languages under
investigation in this paper. As we had predicted in our first
hypothesis, durational differences between adjacent syllables are not
very pronounced. However, there is no acoustic evidence for equal
length of syllables as originally proposed by Abercrombie (1967).

The syllable structures found in Anyi, Ega and Ibibio speech show
differences between the first two and the third language. Closed
syllables do not exist in Anyi and are very rare (1%) in Ega, where
only a nasal can occur in the coda position. In Ibibio, conversely, 16%
of all syllables are closed and both nasals and stops occur in the coda
position. In addition, Ibibio is the only language with phonemic vowel
length contrast. However, this difference in syllable structure and
phonemic vowel length contrast is not reflected in a difference in
speech timing between those three languages. We therefore did not find
any support for Dauer’s (1983) claim that speech rhythm reflects the
variety of syllable structure and presence or absence of phonemic vowel
length contrast and must consequently reject our second hypothesis.
Instead, we propose that, in Ibibio, a compensation for these
durational factors occurs and that speech rhythm is the dominant frame
requiring this.

7. References

Abercrombie, D. (1967). Elements of General Phonetics. Edinburgh: Edinburgh University Press.
Allen, G. (1975). Speech rhythm: its relation to performance universals and articulatory timing. Journal of Phonetics 3, 75-86.
Bond, Z. & Fokes, J. (1985). Non-native patterns of English syllable timing. Journal of Phonetics 13, 407-420.
Burmeister, J. (in prep.) On Anyi.
Classe, André (1939). The Rhythm of English Prose. Oxford: Blackwell.
Connell, B., Ahoua, F. & Gibbon, D. (2002). Ega. Journal of the International Phonetic Association 32(1), 99-104.
Dago, Georgette (1999). Étude phonétique et phonologique de l'éga. M.A. thesis, Université de Cocody, Abidjan.
Dauer, R. (1983). Stress-timing and syllable-timing reanalysed. Journal of Phonetics 11, 51-62.
Delattre, R. (1966). A comparison of syllable length conditioning among languages. International Review of Applied Linguistics 4, 183-198.
Essien, O. E. (1991). The nature of tenses in African languages: a case study of the morphemes and their variants. Archiv Orientalni 59:1-11.
Faraclas, N. G. (1989). Cross River. Niger-Congo languages. Bendor-Samuel, J. (ed.), Lanham: University Press of America, 377-399.
Faure, G., Hirst, D. & Chafcouloff, M. (1980). Rhythm in English: Isochronism, Pitch, and Perceived
Stress. In: L. Waugh & C. van Schooneveld, The melody of language, Baltimore: University Park Press, pp. 71-79.
Gibbon, D. & Gut, U. (2001). Measuring speech rhythm. Proceedings of Eurospeech, Aalborg.
Gibbon, D., E. E. Urua & U. Gut (2000). How low is the floating Low tone in Ibibio. Paper presented at the 30th Colloquium on African Languages and Linguistics, Leiden, August 2000.
Grabe, E. & Low, E.-L. (2001). Durational Variability in Speech and the Rhythm Class Hypothesis. Papers in Laboratory Phonology.
Hérault, G. (1982). Atlas des langues kwa de Côte d'Ivoire. Tome 1. Abidjan: Institut de Linguistique Appliquée & Agence de Cooperation Culturelle et
Hill, D., Jassem, W. & Witten, I. (1979). A statistical approach to the problem of isochrony in spoken British English. In: pp. 285-294.
Hoequist, C. (1983). Durational Correlates of Linguistic Rhythm Categories. Phonetica 40, 19-31.
Hoequist, C. (1983). Syllable Duration in Stress-, Syllable- and Mora-Timed Languages. Phonetica 40, 203-237.
Lehiste, I. (1977). Isochrony reconsidered. Journal of Phonetics 5, 253-263.
Low, E.-L. & Grabe, E. (1995). Prosodic patterns in Singapore English. Proceedings of the International Congress of Phonetic Sciences, Stockholm, 3, 636-639.
O’Connor, J. (1965). The Perception of Time Intervals. Progress Report 2, Phonetics Laboratory, UCL, 11-15.
Pike, K. (1945). The Intonation of American English. Ann Arbor: University of Michigan Press.
Ramus, F., Nespor, M. & Mehler, J. (1999). Correlates of linguistic rhythm in the speech signal. Cognition, 73, 3:
Roach, P. (1982). On the distinction between ‘stress-timed’ and ‘syllable-timed’ languages. In: D. Crystal (ed.), Linguistic controversies, Essays in linguistic theory and practice, London: Edward Arnold, pp. 73-79.
Stewart (1989). “Kwa”. In Bendor-Samuel, J. (ed.), The Niger-Congo languages. A classification and description of Africa’s largest language family. Lanham: University Press of America.
Uldall, E. (1971). Isochronous Stresses in R.P. In: L. Hammerich, R. Jacobson & E. Zwirner, Form and substance, Copenhagen: Akademisk Forlag, pp. 205-210.
Urua, E. E. (2000). Ibibio phonetics and phonology. Cape Town, South Africa: Centre for Advanced Studies of African Society.
Williamson, K. (1989). Benue-Congo overview. Niger-Congo languages. Bendor-Samuel, J. (ed.), Lanham: University Press of America, 247-274.

