
Journal of Phonetics (1981) 9, 273-281

Durational relationship between Japanese stops and vowels
Yayoi Homma
Department of Foreign Languages, Osaka Gakuin University, Kishibe, Suita-shi,
Osaka 564, Japan

Received 28th April 1980

Abstract: The durational relationship between Japanese stops and vowels is
examined by measuring closure duration, voice onset time, and vowel
duration in two- and three-mora words with single and geminated
stops. The results reveal that these three variables are closely related
to fixed word duration, although the duration of each syllable in a
word is different. Not only universal but also language-specific
durational principles are incorporated into Japanese, and acoustic
measurements fit well at the word level with the linguistic uses of
segmental duration in Japanese.

1. Introduction
Japanese is a mora-counting language. The phonological length of an utterance depends on
the number of moras. For example, kan ("a can") and kana ("syllabary") are two-mora
words which linguistically have the same duration. The ratio of the duration of gaka
("painter") to that of gakka ("lesson") corresponds to the number of moras, namely 2 : 3, because
the first part of the geminated stop /kk/ is counted as one mora. An examination of the
durational relationship between Japanese stops and vowels in several contexts, measuring
closure duration, voice onset time (VOT) and vowel duration, might reveal acoustic evidence
for the linguistic uses of segmental duration in Japanese. This paper will contrast the results
of experiments in Japanese, a mora-counting language, with those of English, a
stress-counting language.
Various investigators have conducted experiments on closure duration, VOT, and vowel
duration in English. They have found that voiceless stops have greater closure duration
than voiced stops. According to Lisker (1957), the average closure duration for /p/ was
about 120 ms and the average value for /b/ was 75 ms in trochee words. Most investigators
agree that when other factors are constant, closure duration for labials is longer than
for alveolars and velars (Lehiste, 1970).
VOT increases as the place of closure moves toward the back (Lisker & Abramson, 1964).
VOT is longer in stressed syllables than in unstressed syllables (Lisker & Abramson, 1967).
VOT is closely related to the quality of the following vowel (Klatt, 1975; Port & Rotunno,
1979).
Vowel duration in English varies greatly according to environment. Vowels are
longest before pauses. Stressed vowels are longer than unstressed ones (Klatt, 1976). Vowels
before voiced consonants are longer than vowels before voiceless consonants (House &
Fairbanks, 1953; Denes, 1955; House, 1961; Peterson & Lehiste, 1960). The number of
syllables (Klatt, 1973; Harris & Umeda, 1974) and speaking rate (Lehiste, 1970; Port, 1976)
are also important factors which influence vowel duration.

0095-4470/81/030273+09 $02.00/0 © 1981 Academic Press Inc. (London) Ltd.

In Japanese, some experiments on closure duration and vowel duration have been reported.
In closure duration, voiceless stops are longer than voiced ones (Han, 1962; Okada, 1969,
1971); /p/ is longer than /t/ and /k/ (Han, 1962). Vowel duration is longer in voiced
environments (Okada, 1969; Homma, 1973). Pitch accent does not have a significant influence on
vowel duration (Homma, 1973). Temporal compensation works within a word (Homma,
1973; Maeda, 1979). VOT in Japanese, however, has been given the least attention (Homma,
1980).
The specific objectives of this paper are (1) to confirm the results of the previous experi-
ments in Japanese; (2) to discuss VOT; and (3) to study the durational relationship between
stops and vowels.

2. Experiment
The purpose of this experiment was to make acoustic measurements of closure duration,
VOT, and vowel duration of two and three mora words with single and geminated stops.

2.1 Methods
Twenty-four real and nonsense test words were prepared. The words contained the vowel /a/
and the stops /p, t, k, b, d, g/ in both initial and medial position.
Test words
        (A) two moras    (B) three moras
/p/     papa             pappa
        paba             pabba
/b/     bapa             bappa
        baba             babba
/t/     tata             tatta
        tada             tadda
/d/     data             datta
        dada             dadda
/k/     kaka             kakka
        kaga             kagga
/g/     gaka             gakka
        gaga             gagga
These words were placed in the following sentence frame.
Kore wa ___ desu. ("This is ___.")

The sentences with the test words were randomly arranged on a list. The list was read
three times by four Japanese speakers, three males and one female, at a natural speed with
the accent on the first syllable. A total of 288 sentences (24 words × 3 readings × 4 speakers)
were recorded, and wide-band spectrograms were made of each utterance with a
Kay-Sonagraph (6061B) in the Phonetic Laboratory of Indiana University.

2.2. Measurement procedures


2.2.1. Closure duration. At the initial position of the word, closure duration was difficult to
measure, because there was sometimes a pause before the test word. Therefore, closure
duration of the medial stop was measured between termination of the first vowel formant
transition and the burst of the medial stop.

2.2.2. Voice onset time. The release of a plosive consonant, especially a voiceless stop, involves
three phases: explosion, frication, and aspiration. Frication and aspiration together are called voice
onset time (VOT) (Lisker & Abramson, 1964). VOT is one of the most important cues
separating voiced and voiceless stops. Klatt (1975) measured frication and aspiration separately,
but in this experiment the whole VOT was measured by marking off the
interval between the release of the stop and the onset of glottal vibration for voice. The
onset of glottal vibration was shown on the spectrogram by the beginning of the regularly
spaced vertical striations.

2.2.3. Vowel duration. The duration of the first and second vowels was measured from the
onset of glottal vibration to the closure for the following stop shown by the abrupt cessation
of energy in all the formants.

2.3. Results and discussion


2.3.1. Closure duration. Table I gives the results.

Table I Average closure duration of the medial stops for each speaker in
milliseconds

        S1     S2     S3     S4     Mean

/p/     90     64     72     80     77
/b/     50     59     56     55     55
/t/     87     56     42     62     62
/d/     37     41     22     39     35
/k/     75     56     51     62     61
/g/     33     49     41     ?      41
/pp/    184    180    199    170    183
/bb/    144    161    181    148    159
/tt/    190    178    154    159    170
/dd/    130    148    156    141    144
/kk/    169    179    174    176    175
/gg/    109    124    151    151    134

The following points were observed.


(1) Closure duration in Japanese was longer for voiceless stops than for voiced stops, just as in
English (Lisker, 1957).
(2) Labials were longer than dentals and velars, but the difference was not very large.
(3) There was very little difference between dentals and velars.
(4) The ratio of single stops to geminated stops was about 1 : 3.
Table II shows the average closure duration in milliseconds and also in percentage. In (B),
the average closure duration for voiced stops was assumed to be 100%, and in (C), single
stop closure duration was assumed to be 100%.
Table II

            (A) Average closure    (B) Closure duration ratio    (C) Closure duration ratio
            duration (ms)          between voiced and            between single and
                                   voiceless stops (%)           geminated stops (%)
            Single   Geminated     Single   Geminated            Single   Geminated

Voiceless   67       176           152      121                  100      263
Voiced      44       146           100      100                  100      332

According to Table II, geminated stops were less influenced by the voicing feature than
single stops. This means that the last part of a voiced geminated stop is voiceless before the
burst. These findings may agree with the fact that VOT was measured not only for voiceless
stops but also for voiced ones in geminated stops, but seldom in single stops, a phenomenon
we will discuss later.
Voicing effects on closure duration and the ratio between single and geminated stop
closure by and large agreed with Han's measurements (1962). She reported that "/p, t, k, c, s/
show approximately 20 to 40% increase in duration over their corresponding voiced ones"
and that "the duration of short and long consonants is, on the average, in the ratio of 1.0
to 2.6 and often 1.0 to 3.0."
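
The arithmetic behind Table II can be retraced from the Mean column of Table I. The following is only a sketch: it assumes that the class averages were taken over the rounded per-stop means and that halves are rounded up, neither of which is stated in the text. The names `TABLE_I_MEAN_MS`, `group_mean` and `percent` are illustrative.

```python
import math


def round_half_up(x: float) -> int:
    """Round to the nearest integer, with .5 rounded up (assumed convention)."""
    return math.floor(x + 0.5)


# Mean closure durations (ms) copied from the Mean column of Table I.
TABLE_I_MEAN_MS = {
    "p": 77, "t": 62, "k": 61,        # single voiceless
    "b": 55, "d": 35, "g": 41,        # single voiced
    "pp": 183, "tt": 170, "kk": 175,  # geminated voiceless
    "bb": 159, "dd": 144, "gg": 134,  # geminated voiced
}


def group_mean(stops: list[str]) -> int:
    """Average the Table I means over a group of stops (ms, rounded)."""
    return round_half_up(sum(TABLE_I_MEAN_MS[s] for s in stops) / len(stops))


def percent(value: int, base: int) -> int:
    """Express value as a rounded percentage of base."""
    return round_half_up(100 * value / base)


# (A) Average closure duration per voicing and length class.
avg = {
    ("voiceless", "single"): group_mean(["p", "t", "k"]),
    ("voiced", "single"): group_mean(["b", "d", "g"]),
    ("voiceless", "geminated"): group_mean(["pp", "tt", "kk"]),
    ("voiced", "geminated"): group_mean(["bb", "dd", "gg"]),
}

# (B) Voiceless relative to voiced, with voiced taken as 100%.
voicing_pct = {
    length: percent(avg[("voiceless", length)], avg[("voiced", length)])
    for length in ("single", "geminated")
}

# (C) Geminated relative to single, with single taken as 100%.
gemination_pct = {
    voicing: percent(avg[(voicing, "geminated")], avg[(voicing, "single")])
    for voicing in ("voiceless", "voiced")
}

print(avg)
print(voicing_pct)
print(gemination_pct)
```

Under these assumptions the sketch reproduces the figures of Table II: the voicing effect shrinks from 152% in single stops to 121% in geminated stops, and the single-to-geminated ratio lies between roughly 1 : 2.6 and 1 : 3.3.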

2.3.2. Voice onset time. Table III shows the results.

Table III Voice onset time of the initial and medial stops for each speaker
in milliseconds

(A) VOT of the initial stops

         S1    S2    S3    S4    Mean

/p-p/    18    34    14    29    24
/p-b/    24    43    19    31    29
/t-t/    26    42    26    34    32
/t-d/    16    53    24    33    32
/k-k/    48    51    35    46    45
/k-g/    63    62    57    60    61
/g-k/    14    20    10    12    14
/g-g/    8     19    17    13    14

(B) VOT of the medial stops

         S1    S2    S3    S4    Mean

/p/      3     7     3     13    7
/t/      8     16    18    23    16
/k/      22    26    21    25    24
/pp/     0     16    7     20    11
/tt/     1     14    10    28    13
/kk/     19    37    20    36    28
/bb/     0     4     1     2     2
/dd/     0     7     11    12    8
/gg/     25    25    14    23    22

The following points were observed.


(1) VOT in Japanese clearly increased as the place of closure moved toward the back of
the mouth, just as in English (Lisker & Abramson, 1964, 1967).
(2) VOT of the initial stop was shorter when the following medial stop was voiceless, but
the difference was negligible except with velars.
(3) VOT was longer in accented syllables, as in English (Lisker & Abramson, 1967), although
Japanese has pitch accent, not stress accent. The average VOT of initial /p, t, k/ was 37 ms,
and the average of medial /p, t, k/ was 16 ms.
(4) Gemination of stops did not affect VOT. The average value of /pp, tt, kk/ was 17 ms.
(5) As mentioned above, VOT was observed in voiced geminated stops, but was very rare
in single voiced stops, except for /g/ at word-initial position.
Klatt (1975) reported that the average VOT in English was 61 ms for voiceless stops and
18 ms for voiced ones in stressed monosyllabic words. That voiced stops have VOT
indicates that English voiced stops are not truly voiced at the initial position. Although
comparing VOT of English and Japanese is not a simple task, it may be safe to say that
English has longer VOT than Japanese (Homma, 1980).
Lisker & Abramson (1967) found that although the effects of stress on VOT were rather
limited, in English "stress and VOT are not strictly independent of one another." They
reported that the difference between mean values for /p, t, k/ in isolated words as against
sentences was about 25 ms, and that the difference between stressed and unstressed /p, t, k/
was about 6 ms in sentences. In Japanese, accent and VOT were clearly related. The
difference between accented and unaccented /p, t, k/ was about 21 ms.
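
The VOT averages quoted above (37 ms, 16 ms, 17 ms and the 21 ms accent difference) can be checked against the Mean columns of Table III. This is a sketch only; it assumes the averages were taken over the rounded means and that halves round up. The variable names are illustrative.

```python
import math


def round_half_up(x: float) -> int:
    """Round to the nearest integer, with .5 rounded up (assumed convention)."""
    return math.floor(x + 0.5)


def mean_ms(values) -> int:
    """Rounded mean of a collection of durations in ms."""
    values = list(values)
    return round_half_up(sum(values) / len(values))


# Mean VOT (ms) from Table III (A): initial voiceless stops, accented syllable.
INITIAL_VOICELESS_MS = {
    "p-p": 24, "p-b": 29, "t-t": 32, "t-d": 32, "k-k": 45, "k-g": 61,
}
# Mean VOT (ms) from Table III (B): medial voiceless stops, unaccented syllable.
MEDIAL_SINGLE_MS = {"p": 7, "t": 16, "k": 24}
MEDIAL_GEMINATED_MS = {"pp": 11, "tt": 13, "kk": 28}

initial_mean = mean_ms(INITIAL_VOICELESS_MS.values())
medial_mean = mean_ms(MEDIAL_SINGLE_MS.values())
geminated_mean = mean_ms(MEDIAL_GEMINATED_MS.values())
accent_difference = initial_mean - medial_mean

# Initial VOT grouped by place of the initial stop (unrounded means).
place_means = {
    "labial": (24 + 29) / 2,
    "apical": (32 + 32) / 2,
    "velar": (45 + 61) / 2,
}

print(initial_mean, medial_mean, geminated_mean, accent_difference)
print(place_means)
```

The single and geminated medial means differ by only about 1 ms, which is consistent with the observation that gemination did not affect VOT, while the place means rise from labial to velar.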

2.3.3. Vowel duration. Tables IV and V show the average duration of the first vowel /a/ and
the second vowel /a/ respectively.

Table IV Average vowel duration of the first /a/ for each speaker in
milliseconds

         S1     S2     S3     S4     Mean

/p-p/    90     63     59     69     70
/p-b/    101    59     86     78     81
/b-p/    103    93     81     96     93
/b-b/    104    97     90     108    100
/t-t/    94     49     71     62     69
/t-d/    113    63     85     77     85
/d-t/    111    98     99     100    102
/d-d/    124    105    97     108    109
/k-k/    74     64     85     65     72
/k-g/    102    82     75     84     86
/g-k/    107    111    113    108    110
/g-g/    126    108    107    114    114

Table V Average vowel duration of the second /a/ for each
speaker in milliseconds

          S1     S2     S3     S4     Mean

/p-d/     106    76     107    92     95
/b-d/     114    98     108    110    108
/pp-d/    85     78     92     87     86
/bb-d/    92     104    103    103    101
/t-d/     106    75     96     87     91
/d-d/     123    100    109    115    112
/tt-d/    88     73     84     81     82
/dd-d/    102    98     89     104    98
/k-d/     100    64     83     89     84
/g-d/     123    96     92     95     102
/kk-d/    89     68     79     80     79
/gg-d/    95     94     105    100    99

The following points were observed.


(1) Voicing of both the preceding and the following stops had a lengthening effect on
vowel duration. In light of this we may ask whether the preceding or the following consonant
has more influence on vowel duration in Japanese. Okada (1969) reported that the following
consonant had a slightly stronger influence, while Homma (1973) recognized a stronger
effect of the preceding consonant, and Maeda (1979) found the same results as Homma.
Table VI shows the average vowel duration in milliseconds and in percentage under the
influence of (A) the preceding stop, and (B) the following stop. Table VI was based on
Table IV. For all the stops, vowel duration in voiced environments was assumed to be 100%.

Table VI Effects of voicing of the stops on vowel duration

             (A) The preceding stop    (B) The following stop
             ms      %                 ms      %

Voiceless    77      73                86      90
Voiced       105     100               96      100

Comparing (A) and (B), we find that the preceding stop has more influence than the
following stop. In (A) vowel duration is reduced by approximately 25% and in (B) by 10%.
Therefore, we can conclude that in Japanese the preceding consonant has a stronger effect on
vowel duration.
(2) As seen in Table VII, vowel duration of the first /a/ slightly increased as the place
of closure of the adjacent stops moved toward the back.

Table VII Comparison of the first and second vowels in milliseconds

           The first /a/    The second /a/

Labial     86               98
Apical     91               96
Velar      96               91
Mean       91               95

These findings agreed with the VOT results, which also increased toward the back. However,
in the second /a/, vowel duration decreased in the same direction.
(3) In duration there was no significant difference between accented and unaccented
vowels. The second, unaccented vowels were somewhat longer, as in Homma (1973). This is
very different from English.
(4) Gemination of the stop had a slight influence on the following vowel. Vowels were a little
shorter after geminated stops. The average difference was about 8 ms.
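
The grouping that yields Table VI from Table IV (point 1 above) can be sketched as follows. It assumes the group averages were taken over the rounded means of Table IV and that the percentages were computed from the rounded group averages; the function name `mean_by_voicing` is illustrative.

```python
import math


def round_half_up(x: float) -> int:
    """Round to the nearest integer, with .5 rounded up (assumed convention)."""
    return math.floor(x + 0.5)


# Mean duration (ms) of the first /a/ from Table IV, keyed by
# (preceding stop, following stop).
TABLE_IV_MEAN_MS = {
    ("p", "p"): 70, ("p", "b"): 81, ("b", "p"): 93, ("b", "b"): 100,
    ("t", "t"): 69, ("t", "d"): 85, ("d", "t"): 102, ("d", "d"): 109,
    ("k", "k"): 72, ("k", "g"): 86, ("g", "k"): 110, ("g", "g"): 114,
}
VOICELESS = {"p", "t", "k"}


def mean_by_voicing(position: int) -> dict[str, int]:
    """Group by voicing of the stop at position 0 (preceding) or 1 (following)."""
    groups: dict[str, list[int]] = {"voiceless": [], "voiced": []}
    for context, ms in TABLE_IV_MEAN_MS.items():
        key = "voiceless" if context[position] in VOICELESS else "voiced"
        groups[key].append(ms)
    return {k: round_half_up(sum(v) / len(v)) for k, v in groups.items()}


preceding = mean_by_voicing(0)
following = mean_by_voicing(1)

# Voiceless context as a percentage of the voiced context (voiced = 100%).
preceding_pct = round_half_up(100 * preceding["voiceless"] / preceding["voiced"])
following_pct = round_half_up(100 * following["voiceless"] / following["voiced"])

print(preceding, preceding_pct)
print(following, following_pct)
```

With these inputs the preceding stop shortens the vowel to 73% of its voiced-context duration, against 90% for the following stop, which matches the comparison of columns (A) and (B).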
Contextual effects on vowel duration in Japanese are very different from those in English.
First, in English, voicing of the following consonant has a much stronger influence on vowel
duration (Peterson & Lehiste, 1960; Chen, 1970) than in Japanese.
Secondly, in English, stress-accented vowels are much longer than unstressed vowels.
Klatt (1976) reported that the average duration for stressed vowels is approximately 130 ms,
and that the average duration for unstressed vowels is approximately 70 ms in connected
discourse. In Japanese the mean of the first /a/ in accented syllables was 91 ms, and that of
the second /a/ was 95 ms.
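
The place-of-articulation means in Table VII, the overall means of 91 ms and 95 ms, and the roughly 8 ms gemination effect can be retraced from the Mean columns of Tables IV and V. The sketch below assumes averaging over the rounded means with halves rounded up; the dictionary names are illustrative.

```python
import math


def round_half_up(x: float) -> int:
    """Round to the nearest integer, with .5 rounded up (assumed convention)."""
    return math.floor(x + 0.5)


def mean(values: list[float]) -> float:
    return sum(values) / len(values)


# First /a/ means (ms) from Table IV, grouped by place of the adjacent stops.
FIRST_A_MS = {
    "labial": [70, 81, 93, 100],
    "apical": [69, 85, 102, 109],
    "velar": [72, 86, 110, 114],
}
# Second /a/ means (ms) from Table V, split by single vs geminated medial stop.
SECOND_A_SINGLE_MS = {"labial": [95, 108], "apical": [91, 112], "velar": [84, 102]}
SECOND_A_GEMINATED_MS = {"labial": [86, 101], "apical": [82, 98], "velar": [79, 99]}

first_by_place = {p: round_half_up(mean(v)) for p, v in FIRST_A_MS.items()}
second_by_place = {
    p: round_half_up(mean(SECOND_A_SINGLE_MS[p] + SECOND_A_GEMINATED_MS[p]))
    for p in SECOND_A_SINGLE_MS
}

first_mean = round_half_up(mean(list(first_by_place.values())))
second_mean = round_half_up(mean(list(second_by_place.values())))

# Gemination effect on the following vowel: single minus geminated.
after_single = mean([ms for v in SECOND_A_SINGLE_MS.values() for ms in v])
after_geminated = mean([ms for v in SECOND_A_GEMINATED_MS.values() for ms in v])
gemination_effect = round_half_up(after_single - after_geminated)

print(first_by_place, first_mean)
print(second_by_place, second_mean)
print(gemination_effect)
```

The accented first vowel (91 ms) and the unaccented second vowel (95 ms) differ by only a few milliseconds, in contrast to the 130 ms versus 70 ms that Klatt (1976) reports for English.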

2.3.4. Durational relationship. Table VIII shows the average word duration in milliseconds
and ratio. That the word-duration ratio by acoustic measurements fits well into the linguistic
mora ratio 2 : 3 is very interesting.
Table VIII Average word duration in milliseconds

                   With single      With geminated
                   medial stop      medial stop

Labial             264              376
Apical             260              372
Velar              279              401
Mean               268              383
Ratio              2                2.9
Number of moras    2                3

Although Han (1962) wrote that a unit of duration in Japanese is associated with a syllable, it
may be more appropriate to say that the domain of the durational pattern is not a syllable, but a
word. The following examples illustrate this.

           VOT    Vowel       Closure     Vowel       Word
                  duration    duration    duration    duration

/papa/     18     67          77          98          260 ms
/gaga/     13     109         40          105         267 ms

Here /papa/ and /gaga/ are two-mora words which have almost the same word duration.
However, when we compare these two words, we find a big difference in vowel duration
and in closure duration. In /papa/ vowels are shorter and closure duration is longer because
of the voiceless environment, while in /gaga/ vowels are longer and closure duration is
shorter. As a result, the duration of the first syllable (VOT plus the first vowel) of /papa/ is
85 ms, and that of /gaga/ is 122 ms. In spite of the big difference in first-syllable duration,
the two words are almost the same in word duration. This means that temporal compensation works
within a word, not within a syllable or mora. The previous experiments by Homma (1973)
and Maeda (1979) support this observation.
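
The compensation argument can be made concrete by summing the segment durations given for /papa/ and /gaga/ and by recomputing the ratio row of Table VIII. This is a sketch only: the segment labels are illustrative, and the first syllable is taken to be VOT plus the first vowel, which is the reading that reproduces the 85 ms and 122 ms quoted above.

```python
import math


def round_half_up(x: float) -> int:
    """Round to the nearest integer, with .5 rounded up (assumed convention)."""
    return math.floor(x + 0.5)


# Segment durations (ms) from the /papa/ and /gaga/ examples.
SEGMENTS_MS = {
    "papa": {"vot": 18, "vowel_1": 67, "closure": 77, "vowel_2": 98},
    "gaga": {"vot": 13, "vowel_1": 109, "closure": 40, "vowel_2": 105},
}

word_ms = {w: sum(seg.values()) for w, seg in SEGMENTS_MS.items()}
first_syllable_ms = {w: seg["vot"] + seg["vowel_1"] for w, seg in SEGMENTS_MS.items()}

syllable_gap = first_syllable_ms["gaga"] - first_syllable_ms["papa"]
word_gap = word_ms["gaga"] - word_ms["papa"]

# Word durations (ms) from Table VIII: (single medial stop, geminated medial stop).
TABLE_VIII_MS = {"labial": (264, 376), "apical": (260, 372), "velar": (279, 401)}

mean_single = round_half_up(sum(s for s, _ in TABLE_VIII_MS.values()) / 3)
mean_geminated = round_half_up(sum(g for _, g in TABLE_VIII_MS.values()) / 3)

# Scale so that the two-mora word counts as 2, as in the Ratio row.
geminated_ratio = round(2 * mean_geminated / mean_single, 1)

print(word_ms, first_syllable_ms, syllable_gap, word_gap)
print(mean_single, mean_geminated, geminated_ratio)
```

A 37 ms gap between the first syllables shrinks to 7 ms at the word level, and the measured ratio of 2 : 2.9 stays close to the mora ratio of 2 : 3.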

3. Conclusions
In this durational study of Japanese, the following points were observed.

3.1. Closure duration


(1) Closure duration was longer for voiceless stops than for voiced ones.
(2) Closure duration for labials was longer than for apicals and velars.
(3) The ratio of closure duration between single stops and geminated stops was about
1 : 3. This ratio implies that a geminated stop does not merely double the stop
segment but also includes the length which corresponds to a larger unit, namely a mora, as
Han pointed out.

3.2. Voice onset time


(1) VOT increased as the place of closure moved toward the back of the mouth.
(2) VOT was shorter before voiceless stops than before voiced ones, but the difference was
negligible except with velars.
(3) VOT was clearly related to accent but not to gemination of stops.
(4) Japanese stops have shorter VOT than English stops.

3.3. Vowel duration


(1) As in English, vowel duration was longer before voiced consonants than before voiceless
consonants, but the extent differed drastically. One of the reasons for this may be that vowel
duration in Japanese is more influenced by the preceding consonant than by the following
one.
(2) Vowel duration was independent of accent. The mean of the second, unaccented
vowel duration was longer.
(3) The place of articulation of the adjacent stops affected the vowel duration. As the
place of closure moved toward the back, both VOT and vowel duration became longer
in the first syllable. In the second syllable, on the contrary, vowel duration became shorter
in this direction.
English is a rhythmic stress-timed language. Primary stresses tend to fall at roughly equal
intervals of time, regardless of the number of syllables between them. Although there
is a certain limit in compressibility (Klatt, 1973), the more syllables a rhythmic unit has, the
shorter the duration of the segments becomes (Lehiste, 1970; Homma, 1978). Therefore,
stress and the number of syllables have great effects on vowel duration. On the other hand,
Japanese is a mora-counting language, and given a certain number of moras, word duration
is relatively fixed, although the duration of each syllable in a word is phonetically different.
As a result, temporal compensation is observed within a word, not within a syllable. In
voiceless environments, we have VOT, shorter vowel duration and longer closure duration,
and in voiced environments, we have almost no VOT, but we do have longer vowel duration
and shorter closure duration. In other words, closure duration, voice onset time, and vowel
duration work together to obtain fixed word duration. Thus the difference in word duration
is small.
The present experiment clearly revealed that the durational relationship between
Japanese stops and vowels shows not only universal but also language-specific characteristics,
and that acoustic measurements fit well at the word level with the linguistic uses of segmental
duration in Japanese.

I appreciate the guidance and comments of Professor R. F. Port at Indiana University on
earlier versions of this paper.

References
Chen, M. (1970). Vowel length variation as a function of the voicing of the consonant environment.
Phonetica, 22, 129-159.
Denes, P. (1955). Effect of duration on the perception of voicing. Journal of the Acoustical Society of
America, 25, 105-113.
Han, S. M. (1962). The feature of duration in Japanese. Tokyo: Phonetic Society of Japan, Study of
Sounds, 10, 65-75.
Harris, M. S. & N. Umeda (1974). Effect of speaking mode on temporal factors in speech: vowel duration.
Journal of the Acoustical Society of America, 56, 1016-1018.
Homma, Y. (1973). An acoustic study of Japanese vowels: their quality, pitch, amplitude, and duration.
Study of Sounds, 16, 347-368.
Homma, Y. (1978). Vowel duration in English. Osaka: Osaka Gakuin University, Gaikokugo Ronshu, 6,
51-67 (in Japanese).
Homma, Y. (1980). Voice onset time in Japanese stops. Bulletin of the Phonetic Society of Japan, 163,
7-9.
House, A. S. (1961). On vowel duration in English. Journal of the Acoustical Society of America, 33,
1174-1178.
House, A. S. & G. Fairbanks (1953). The influence of consonant environment upon the secondary
acoustical characteristics of vowels. Journal of the Acoustical Society of America, 25, 105-113.
Klatt, D. H. (1973). Interaction between two factors that influence vowel duration. Journal of the
Acoustical Society of America, 54, 1102-1104.
Klatt, D. H. (1975). Voice onset time, frication and aspiration in word-initial consonant clusters. Journal
of Speech and Hearing Research, 18, 686-705.
Klatt, D. H. (1976). Linguistic uses of segmental duration in English: acoustic and perceptional evidence.
Journal of the Acoustical Society of America, 59, 1208-1221.
Lehiste, I. (1970). Suprasegmentals. Cambridge, Massachusetts; London: M.I.T. Press.
Lisker, L. (1957). Closure duration and the intervocalic voiced-voiceless distinction in English. Language,
33, 42-49.
Lisker, L. & A. S. Abramson (1964). A cross-language study of voicing in initial stops: acoustical
measurements. Word, 20, 384-422.
Lisker, L. & A. S. Abramson (1967). Some effects of context on voice onset time in English stops.
Language and Speech, 10, 1-28.
Maeda, S. (1979). Timing control in Japanese speech production. Nara: Tenri University, Tenri Daigaku
Gakuho, 121, 1-21.
Okada, T. (1969). The influence of voiced or voiceless consonants on vowel duration. Kyoto: Literary
Association of Doshisha University, Jimbungaku, 115, 68-84 (in Japanese).
Okada, T. (1971). A spectrographic study of the duration correlation between some Japanese vowels and
consonants. Literary Association of Doshisha University, Doshisha Studies in English, 1, 32-49 (in
Japanese).
Peterson, G. E. & I. Lehiste (1960). Duration of syllable nuclei in English. Journal of the Acoustical
Society of America, 32, 693-703.
Port, R. F. (1976). Influence of Speaking Tempo on the Duration of Stressed Vowel and Medial Stop in
English Trochee Words. Bloomington: Indiana University Linguistics Club.
Port, R. F. & R. Rotunno (1979). Relation between voice-onset time and vowel duration. Journal of the
Acoustical Society of America, 66, 654-662.
