Speech Trainer Kit Using Laryngeal Vibrations
ABSTRACT
The development of hardware and software tools for speech training for the hearing impaired, on
a multilingual basis using laryngeal vibrations, has been proposed. After identification of user
requirements in close co-operation with user groups for the three languages, a software package
for transforming speech signals into images has been developed and tested. These images
include displays of various parameters such as frequency, amplitude and pitch, and constitute
real-time visual feedback for the hearing impaired.
2. IMPLEMENTATION
Our implementation of the trainer kit is based on speech processing techniques that include feature extraction and identification using Mel Frequency Cepstral Coefficients (MFCC). Clusters are created for every sample using K-means clustering and stored in a database. The features extracted from the trainee's real-time input are compared with these clusters by calculating the Euclidean distance. The system outputs suitable feedback, depending on whether the trainee's utterance was correct or not, by analyzing the Euclidean distances. The product implements hardware and software tools for speech training on a multilingual basis.

[Block diagram; blocks labelled MIC (three inputs), MIXER, PC SOUND CARD]
Figure 1: Block diagram of signal acquisition module

The signal processing module consists of speech feature extraction and clustering. The trainee's utterances are analyzed, and the extracted features are compared with those extracted in the same way from a normal person's utterance of the same sound. The decision of the test, whether the trainee's utterance is correct or not, is conveyed through the visual display. Thus the technique may be useful for improving the effectiveness of speech training for hearing-impaired children by providing visual feedback to improve their articulation.

The Mel frequency cepstral coefficients algorithm is used for feature extraction and the K-means algorithm is used for clustering. The features extracted from the real-time input are then compared with the clusters by computing the Euclidean distance. The testing process is iterated until the Euclidean distance becomes minimum. This module also comprises visual feedback for the trainee, with information about the goodness of each utterance. It also provides a graph showing the level of improvement for each utterance.

[Flowchart; blocks labelled: Training signal, Feature Extraction using MFCC, Clustering (KMEANS); Real Time i/p, Feature Extraction using MFCC; Measure Euclidean Distance; Visual Feedback; Display Result]

3. RESULTS
The package was tested with speech signals for consistency and validity of the system. Figure 1 shows the signal acquisition module.

Figure 1. Signal acquisition module

Figure 2 shows the MIC input of vowel ‘a’ and its corresponding MFCC plot.

Figure 2. MIC input of vowel ‘a’ and corresponding MFCC plot

The GUI provides options for the trainer to select a language, create a database, retrieve data from the database, and analyse the improvement graph of the patient; it also provides visual feedback to the trainee. Figure 3 shows the GUI of the main menu and Figure 4 shows the visual feedback.
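The processing chain described in Section 2 (MFCC feature extraction, K-means clustering of the reference features, Euclidean distance between the trainee's features and the stored clusters) can be illustrated with a minimal sketch. The original package was written for MATLAB and its code is not given here, so the Python/NumPy version below is only an assumption-laden stand-in: the function names, the frame length, the number of mel filters, cepstral coefficients and clusters, and the synthetic two-tone "utterances" are all hypothetical choices made for illustration, not the authors' settings.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        lo, mid, hi = bins[i - 1], bins[i], bins[i + 1]
        for k in range(lo, mid):          # rising edge of the triangle
            fb[i - 1, k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling edge of the triangle
            fb[i - 1, k] = (hi - k) / (hi - mid)
    return fb

def mfcc(signal, sr, n_fft=256, hop=128, n_filters=20, n_ceps=12):
    """Return an (n_frames, n_ceps) array of MFCC feature vectors."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # Unnormalised DCT-II of the log mel energies gives the cepstral coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), n + 0.5) / n_filters)
    return log_mel @ basis.T

def kmeans(x, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns the k cluster centres."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):       # keep the old centre if a cluster empties
                centers[j] = x[labels == j].mean(axis=0)
    return centers

def utterance_distance(features, centers):
    """Mean Euclidean distance from each frame to its nearest cluster centre."""
    d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    return float(d.min(axis=1).mean())

def tone_pair(f1, f2, sr, seed, dur=0.5, noise=0.05):
    """Synthetic stand-in for a recorded vowel: two tones plus noise."""
    t = np.arange(int(sr * dur)) / sr
    rng = np.random.default_rng(seed)
    tones = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)
    return tones + noise * rng.standard_normal(t.size)

sr = 8000
reference = mfcc(tone_pair(700, 1200, sr, seed=1), sr)   # "training signal"
centers = kmeans(reference, k=4)                         # stored clusters
d_good = utterance_distance(mfcc(tone_pair(700, 1200, sr, seed=2), sr), centers)
d_bad = utterance_distance(mfcc(tone_pair(300, 2500, sr, seed=3), sr), centers)
print(f"close utterance: {d_good:.2f}, distant utterance: {d_bad:.2f}")
```

In this sketch a smaller distance means the trainee's features lie closer to the reference clusters. The text does not state how the distance is turned into a correct/incorrect decision, so any threshold used for the visual feedback would have to be calibrated on real recordings.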
Figure 4. Visual feedback for the user

4. CONCLUSION
Our aim of implementing a speech trainer on the MATLAB platform was achieved. The system was able to display the visual feedback in real time. The feasibility of the project was successfully tested. During the training process, the trainees were trained until their pronunciation matched that of a normal person, which required a trainer on a one-to-one basis. Our device automated the whole process. The feedback for improvement was provided in the form of visual images, which were easily perceptible.

[4] Kewley-Port D, Watson CS, Cromer PA: The Indiana Speech Training Aid (ISTRA): A microcomputer-based aid using speaker-dependent speech recognition. Presented at American Speech-Hearing-Language Foundation Computer Conference, Houston, 1987.

[5] Bernstein LE, Goldstein MH, Mahshie JJ: Journal of Rehabilitation Research and Development, Vol. 25, No. 4, 1988.

[6] Mahshie JJ: A computerized approach to assessing and modifying the voice of the deaf. In Proceedings of the 1985 International Congress on Education of the Deaf (in press).

[7] Murata N, Yamada Y, Sugimoto T, Hirosawa K, Shibata S, Yamashita S: Speech training aid for people with impaired speaking ability. New York: IEEE Press, 1986.