Professional Documents
Culture Documents
Vowel and Consonant Characterization Using Fractal Dimension in Natural Speech
Vowel and Consonant Characterization Using Fractal Dimension in Natural Speech
Vowel and Consonant Characterization Using Fractal Dimension in Natural Speech
antonio.guillamon@upct.es
1. Introduction
1
2
constructed by the first two or three formants. This is true even if some
overlapping between the vowel regions occurs. However the consonant
characterization is more difficult because consonant waveforms may be
indistinguishable in time or frequency domain.
In this paper, nonlinear dynamic methods are used to describe the
complexity of natural speech. The fractal dimension is a geometric invariant
measure used for characterizing nonlinear systems, which is of interest to us
for of its properties.
The remainder of this paper is organized as follows. In Section 2, the
techniques used in this work for estimating the fractal dimension are
commented. In Section 3, we show the experimental results, and finally,
conclusions based in results are presented in Section 4.
2. Methods
i.e., the Df compares the actual number of units that compose a sequence
with the minimum number of units required to reproduce a pattern of the
same spatial extent, but the Df computed in this fashion depends on
measurement units used. Katz´s approach solves this problem by creating a
general unit or yardstick: the average distance between successive points,
L . Normalizing distances in equation (1) by this average results in:
L
log 10
Df = L (4)
d
log 10
L
This procedure discretizes the sequence into N-1 intervals,
L
then N −1 = , and (4) can be written as
L
log 10 (N − 1)
Df = (5)
d
log 10 + log 10 (N − 1)
L
Expression (5) summarizes Katz´s approach to calculate the Df of a waveform.
3. Results
This section shows the experimental results obtained during this study. The
fractal dimension is calculated for a variety of voiced sounds (vowels and
nasals) and unvoiced sounds (fricatives). Concretely, the sounds considered
in our research are:
• The vowels: /a/, /e/, /i/, /o/, /u/.
• The nasals: /m/, /n/.
• The fricatives: /f/. /s/, /z/.
Sound /a/ /e/ /i/ /o/ /u/ /m/ /n/ /f/ /s/ /z/
Df 2.2 2.2 2.2 2.1 2.1 1.8 1.8 3.3 4.9 4.1
4. Conclusions
This study suggests that the Df is a useful and practical tool for characterizing vowel and
nasal sounds. The Df values obtained show that vowel and nasal sounds are low-
dimensional, whereas fricatives are high-dimensional.
In conclusion, the fractal measures expand the distinguishing features in
characterizing voiced and unvoiced sounds and it gets the possibility of designing an
automatic system for speech recognition.
References
1. M. Katz, Comp Biol. Med., vol. 18, no. 3, pp. 145-156, 1988.
2. M. Banbrook and S. McLaughin, IEEE Collouium on Exploiting Chaos
and Signals Processing, 8/1-8/10, 1994.
3. A. Kumar and S.K. Mullick, Electronics Letters., vol. 26, no. 21, pp. 1790-
1791, 1990.
4. H.N. Teodorescu and cols., Int. J.of Chaos Theo. .and Appl., vol. 5, no.
3, pp. 1453-145, 1996.
5. F. Takens, Lectures Notes in Mathematics 898, Springer, Berlin, pp. 366-
381, 1981.
6. P. Grassberger and L. Procaccia, Phys. Rev. Lett., vol. 5, pp. 345-349,
1983.
7. T. Higuchi, Phys. D, vol. 31, pp. 277-287, 1988
8. A. Petrosian, IEEE Symposium on Computer-Based Medical Systems,
pp. 212-217, 1995.
9. R. García and L. Gómez, Base de datos para la Identificación y
Clasificación de locutores, Universidad Politécnica de Madrid, 1997.