Electrolarynx in voice rehabilitation
Hanjun Liu, Manwa L. Ng
www.elsevier.com/locate/anl
Abstract
Objective: Patients with laryngeal cancer who have undergone surgical removal of the entire larynx lose the ability to phonate. Electrolarynx (EL) speech is the most commonly adopted form of alaryngeal phonation. However, EL speech is notorious for its monotonic, robotic sound quality, which stems from the lack of pitch control and the presence of radiated noise. This paper reviews the modalities of EL speech and introduces technologies for controlling the pitch and reducing the noise of the device.
Methods: Improvements in EL speech quality are divided into two parts: improving the sound quality of the EL device by applying different enhancement algorithms to reduce the radiated and additive noise, and adding a pitch-control function to the EL with advanced technology.
Results: Adaptive filtering and subtractive-type algorithms have been shown to reduce the noise level associated with EL speech. More mature technologies show promise for a hands-free EL system that produces more accurate and better synchronized pitch and voice onset control.
Conclusion: The advent of micro-technology and human-machine integration promises to improve EL speech quality, and more efficient algorithms enhance EL sound quality. Such improvements can be expected to improve the intelligibility of EL speech and thus the quality of life of EL speakers.
© 2007 Elsevier Ireland Ltd. All rights reserved.
1. Introduction

The removal of the entire larynx as a treatment for laryngeal cancer usually results in the loss of the ability to produce voice and speech. Statistical data show that there are over 600,000 laryngectomees in the world [1], and voice restoration is clearly essential to these people. Standard esophageal (SE) speech and tracheoesophageal (TE) speech are two main methods used by laryngectomees for voice rehabilitation. However, owing to the low acquisition rate of SE speech (6%) [2] and the fact that as many as one-third of laryngectomized patients find TE speech unsuitable for anatomical or personal reasons [3], electrolarynx (EL) phonation is the most commonly adopted form of phonation. An EL is a battery-powered device with an internal preset pitch that can be adjusted to suit the individual preferences of male and female speakers. Lauder [4] and Rothman [5] found that the EL was easier to use, produced longer sentences without special care, and was more effective for communication in many situations.

Since the debut of the first EL, named Sonovox, by Wright in 1942, the EL has undergone many modifications. In 1945, the Aurex company in Chicago started producing an EL named Aurex Neovox M-520T, which set the design foundation of the modern EL. In 1959, the transistorized EL was developed by Bell Laboratories [6]. To date, there are several commercially used ELs, including Nu-voice, Romet, Amplicode, Cooper-Rand, and Servox. The former four do not allow pitch adjustment during speaking, and Servox has only two preset pitch levels (high and low), selected with an external tone activation switch during conversation (see Fig. 1). There are two different types of EL: the neck-type and the

* Corresponding author. Tel.: +1 847 491 2428; fax: +1 847 467 2776.
E-mail address: hanjun-liu@northwestern.edu (H. Liu).
0385-8146/$ – see front matter © 2007 Elsevier Ireland Ltd. All rights reserved.
doi:10.1016/j.anl.2006.11.010
328 H. Liu, M.L. Ng / Auris Nasus Larynx 34 (2007) 327–332
control, respectively. To produce natural speech, the fingertip switch produces binary commands of voicing and accent, which are coded according to the amount of finger pressure, and the controller then implements the pitch generation model based on these commands. This finger-controlled EL is characterized by simplicity and availability, allowing for rapid pitch changes and more natural pitch inflection. While pressure-controlled pitch is valuable for many speakers, those who are new to electromechanical speech may find it distracting and difficult to adapt to this method of pitch control. Precise control of pitch is still in doubt because it is hard to match finger movement to the pitch changes during speaking. Poor synchronization between speech and finger maneuvers often forces a slow speaking rate in order to achieve high speech intelligibility. This certainly limits the practical application in actual verbal communication.

Another new EL with a pitch-control function was developed that uses expiration pressure from the stoma of the laryngectomee [28]. This EL consists of three parts: a pressure sensor that detects the expiration pressure produced from the stoma, an electrical circuit that converts expiration pressure into a fundamental frequency for voice, and an electromechanical vibrator that can be attached to the neck. This method has been shown to be useful in improving the naturalness of EL speech. However, some problems remain difficult to solve, especially for traditional EL users. The method appears to be very challenging for a long-time user of a traditional EL to grasp, but relatively easy for a laryngectomee who has never used an EL or who uses only SE or TE speech. Traditional EL users, according to previous studies, speak while holding the breath, without the use of airflow, which makes the expiration-controlled EL unsuitable for them. Moreover, the improvement in voice quality is limited because the pitch changes within a relatively narrow range, and it is difficult to find the optimal function for transforming expiration pressure into a frequency value.

Goldstein et al. [29] designed a hands-free EL triggered by neck muscle EMG activity. Signal processing circuitry in a belt-mounted control unit transforms EMG activity into control signals for initiation and termination of voicing. These control signals are then fed to an EL that is held against the neck by an inconspicuous brace. The EMG processing circuit produces an envelope waveform proportional to the time-averaged power in the EMG signal. The fast envelope turns the transducer on and off based on a controllable threshold voltage. Simultaneously, the slow envelope is used to control the pitch by directly modifying the frequency of the oscillator driving the EL transducer. The results show that this EL overcomes the inconvenience of using the hand and appears to be appropriate for daily communication. However, it is noted that the EMG-controlled EL is still limited to on/off control, with no further report on how it is used to control pitch.

Similarly, Painter et al. [30] designed an electromagnetic EL that can be implanted in laryngectomees. It consists of two parts: an activator and an implantable transducer. The activator is composed of a periodic pulse generator and a wireless frequency emitter; the transducer is composed of a subcutaneous transformer, a filter, and a sensor. Both are connected with wire and implanted in the neck tissue of the laryngectomee. During phonation, the pulse generator produces a periodic signal, like the vibrating source of other ELs, and passes it to the vocal tract by way of the emitter, transformer, filter, and sensor. Finally, speech is produced by movement of the vocal organs. Compared with the neck-type or intra-oral type EL, the main advantage of the implantable EL is that the pitch and the intensity can be adjusted during phonation without the use of the hand. However, several factors should be considered when using this EL in patients: the compatibility, repellency, and duration of the implanted materials, the size of the EL, and the technique and cost of the surgery. In addition, the surgery is very dangerous for older patients. All these factors tend to limit the development of the implantable EL, which is still at the stage of laboratory testing.

3. EL speech enhancement

During EL phonation, some of the sound produced by the vibrating diaphragm is radiated directly from the device. Poor interface with the neck and the surrounding neck tissues may result in radiated noise which interferes
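Returning briefly to the EMG-controlled EL of Goldstein et al. [29] described in the preceding section, the two-envelope idea (a fast envelope gating the transducer, a slow envelope steering the oscillator frequency) can be illustrated with a minimal sketch. It is not the authors' circuit: the exponential averaging, the linear envelope-to-frequency mapping, the function names, and every numeric value are assumptions chosen only for illustration.

```python
def power_envelope(samples, alpha):
    """Running estimate of signal power via exponential averaging.

    alpha in (0, 1]: larger values track changes faster
    (a shorter effective averaging time).
    """
    level = 0.0
    envelope = []
    for x in samples:
        level += alpha * (x * x - level)
        envelope.append(level)
    return envelope


def emg_to_el_control(emg, fast_alpha=0.2, slow_alpha=0.02,
                      threshold=0.05, f0_base=80.0, f0_gain=200.0):
    """Map an EMG trace to per-sample (voicing, f0_hz) control values.

    All numeric defaults are illustrative assumptions, not device settings.
    """
    fast = power_envelope(emg, fast_alpha)  # quick on/off decisions
    slow = power_envelope(emg, slow_alpha)  # smooth pitch contour
    # Fast envelope against a threshold: transducer on or off.
    voicing = [level > threshold for level in fast]
    # Slow envelope sets the oscillator frequency while voicing is on.
    f0_hz = [f0_base + f0_gain * s if on else 0.0
             for s, on in zip(slow, voicing)]
    return voicing, f0_hz


# Hypothetical input: rest, a burst of muscle activity, then rest again.
emg = [0.0] * 50 + [0.5 * (-1) ** n for n in range(200)] + [0.0] * 200
voicing, f0_hz = emg_to_el_control(emg)
```

In this sketch voicing switches on shortly after the burst begins and off shortly after it ends, while the pitch rises gradually as the slow envelope builds. How the threshold and the frequency mapping should be set for a real user is left open here, as it is in the text.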
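As background for the subtractive-type algorithms discussed in this section, the following is a minimal sketch of basic magnitude spectral subtraction in the sense of Boll [38]. It does not include the auditory-masking refinements. The function names, the over-subtraction factor `alpha`, the spectral `floor`, and the use of a naive DFT on a single frame (no windowing or overlap-add) are all assumptions made to keep the example short and self-contained.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (adequate for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def noise_magnitude(noise_frames):
    """Average magnitude spectrum over frames assumed to hold noise only."""
    spectra = [[abs(c) for c in dft(frame)] for frame in noise_frames]
    n = len(spectra[0])
    return [sum(s[k] for s in spectra) / len(spectra) for k in range(n)]


def spectral_subtract(frame, noise_mag, alpha=1.0, floor=0.02):
    """Subtract the noise magnitude estimate from one frame.

    The noisy phase is kept; each bin is floored at a fraction of the
    noisy magnitude so that no magnitude becomes negative.
    """
    cleaned = []
    for c, n_mag in zip(dft(frame), noise_mag):
        mag = abs(c)
        new_mag = max(mag - alpha * n_mag, floor * mag)
        cleaned.append(cmath.rect(new_mag, cmath.phase(c)))
    return idft(cleaned)


# Hypothetical data: a low-frequency "speech" tone plus a steady
# higher-frequency tone standing in for device noise.
N = 32
speech = [math.sin(2 * math.pi * 3 * t / N) for t in range(N)]
noise = [0.3 * math.sin(2 * math.pi * 10 * t / N) for t in range(N)]
noisy = [s + v for s, v in zip(speech, noise)]
enhanced = spectral_subtract(noisy, noise_magnitude([noise]))
```

The quality of the result depends heavily on how well the noise spectrum is estimated, which is the limitation of subtractive-type methods noted in the text.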
These two subtractive-type algorithms take into account the auditory masking properties of the human ear, which makes it possible to find a better tradeoff among the amount of noise reduction, the speech distortion, and the level of residual noise in a perceptual sense. Furthermore, to overcome the limitation of noise estimation in the subtractive-type algorithms, we are developing a new method based on the adapted wavelet packet transform to improve EL speech, and the results of the pilot study are encouraging.

4. Summary

As mentioned above, considerable research has been conducted to investigate practical and theoretical improvements of EL speech. With the development of state-of-the-art technology, we have reason to believe that significant advances will be made in improving the quality of EL speech in two aspects. The first aspect is the ability of the EL to adjust pitch and intensity in real time during phonation. The EMG-controlled EL will likely be adopted for several reasons. First, pitch control using EMG signals is the most natural method, and it is accomplished without intentional manipulation of finger or expiration pressure. Second, the response time of EMG is faster than that of finger or expiration pressure during phonation, which is expected to solve the synchronization issue between pitch control and speech production. Finally, surgery such as that needed for the implantable EL is not required, and such an EL can be used easily without depending on the hand. The other aspect of technological advances in EL development is the speech enhancement system of EL used for electronically mediated environments. With the rapid development of information technology, communication via electronic media is becoming more prevalent. It is essential to improve the quality of EL speech so that laryngectomees can live with comfort and convenience and thus enjoy a better quality of life. We are now developing such a system, in which different speech enhancement algorithms are embedded. They can be selected with a switch, according to the noise conditions, to obtain high intelligibility. Using Digital Signal Processing technology, we are trying to realize the enhancement function with a microprocessor and embed it into a telephone, microphone, or other electronic media. With the development of efficient enhancement methods, the quality of EL speech will be extensively improved for better understanding and thus a higher quality of life.

References

[1] Hirokazu S, Takahashi H. Voice generation system using an intra-mouth vibrator for the laryngectomee. MS thesis. Japan: The University of Tokyo; 2000.
[2] Hillman R, Walsh M, Wolf G, Fisher S, Hong W. Functional outcomes following treatment for advanced laryngeal cancer. Part 1. Voice preservation in advanced laryngeal cancer. Part II. Laryngectomy rehabilitation: the state-of-the-art in the VA system. Ann Otol Rhinol Laryngol 1998;107(Suppl 172, Pt 2):1–27.
[3] Karen C, Joel M. Utilization of microprocessors in voice quality improvement: the electrolarynx. Curr Opin Otolaryngol Head Neck 2000;8:138–42.
[4] Lauder E. The laryngectomee and the artificial larynx: a second look. J Speech Hear Disord 1970;35:62–5.
[5] Rothman H. Acoustic analysis of artificial electronic larynx speech. In: Seikey A, editor. Electroacoustics analysis and enhancement of alaryngeal speech. Springfield, IL: Charles Thomas; 1982. p. 95–118.
[6] Barney HL, Haworth FE, Dunn HK. An experimental transistorized artificial larynx. Bell Syst Tech J 1959;38:1337–56.
[7] Weiss MS, Yeni-Komshian GH, Heinz JM. Acoustic and perceptual characteristics of speech produced with an electronic artificial larynx. J Acoust Soc Am 1979;65:1298–308.
[8] Qi Y, Weinberg B. Low-frequency energy deficit in electrolaryngeal speech. J Speech Hear Res 1991;34:1250–6.
[9] Knox AA, Anneberg M. The effects of training in comprehension of electrolaryngeal speech. J Commun Disord 1973;6:110–20.
[10] Hyman M. An experimental study of artificial-larynx and esophageal speech. J Speech Hear Disord 1955;20:291–9.
[11] McCroskey RL, Mulligan M. The relative intelligibility of esophageal speech and artificial larynx speech. J Speech Hear Disord 1963;28:37–41.
[12] Shipp T. Frequency, duration, and perceptual measures in relation to judgments of alaryngeal speech acceptability. J Speech Hear Res 1967;10:417–27.
[13] Holly SC, Lernman C, Randolph K. A comparison of the intelligibility of esophageal, electrolarynx, and normal speech in quiet and in noise. J Commun Disord 1983;16:143–55.
[14] Williams SE, Watson JB. Differences in speaking proficiency in three laryngectomy groups. Arch Otol 1985;111:216–9.
[15] Gandour J, Weinberg B. Perception of contrastive stress in alaryngeal speech. J Phonet 1982;10:347–59.
[16] Gandour J, Weinberg B. Production of intonation and contrastive stress in electrolaryngeal speech. J Speech Hear Res 1984;27:605–12.
[17] Gandour J, Weinberg B, Petty SH, Dardarananda R. Vowel length in Thai alaryngeal speech. Folia Phoniatric Logo 1987;39:117–21.
[18] Gandour J, Weinberg B, Petty SH. Voice onset time in Thai alaryngeal speech. J Speech Hear Disord 1987;52:288–94.
[19] Gandour J, Weinberg B, Petty SH, Dardarananda R. Tone in Thai alaryngeal speech. J Speech Hear Res 1988;53:23–9.
[20] Ching TY, Williams R, Van Hasselt CA. Communication of lexical tones in Cantonese alaryngeal speech. J Speech Hear Res 1994;37:557–71.
[21] Ng M, Kwok C, Chow S. Speech performance of adult Cantonese-speaking laryngectomees using different types of alaryngeal phonation. J Voice 1997;11:338–44.
[22] Ng M, Lerman J, Gilbert H. Perceptions of tonal changes in normal laryngeal, esophageal, and artificial laryngeal male Cantonese speakers. Folia Phoniatric Logo 1998;50:64–70.
[23] Ng M, Gilbert H, Lerman J. Fundamental frequency, intensity, and vowel duration characteristics related to perception of Cantonese alaryngeal speech. Folia Phoniatric Logo 2001;53:36–47.
[24] Liu HJ, Wan MX, Wang SP, Niu HJ. Aerodynamic characteristics of laryngectomees breathing quietly and speaking with the electrolarynx. J Voice 2004;18(4):567–77.
[25] Liu HJ, Wan MX, Wang SP. Features of listeners affecting the perceptions of Mandarin electrolaryngeal speech. Folia Phoniatric Logo 2005;57:9–19.
[26] Takahashi H, Nakao M, Kikuchi Y, Kaga K. Alaryngeal speech aid using an intra-oral electrolarynx and a miniature fingertip switch. Auris Nasus Larynx 2005;32(2):157–62.
[27] Takahashi H, Nakao M, Okuas T, Hatamura Y, Kikuchi Y, Kaga K. A voice-generation system using an intra-mouth vibrator. J Artificial Organs 2001;4:288–94.
[28] Uemi N, Ifukube T, Takahashi M, Matsushima J. Design of a new electrolarynx having a pitch control function. In: Proceedings of the IEEE international workshop on robot and human communication; 1994. p. 198–203.
[29] Goldstein EA, Heaton JT, Kobler JB, Stanley GB, Hillman RE. Design and implementation of a hands-free electrolarynx device controlled by neck strap muscle electromyographic activity. IEEE Trans Biomed Eng 2004;51:325–32.
[30] Painter C, Kaiser T, Fredrickson JM, Karzon R. Human speech development for an implantable artificial larynx. Ann Otol Rhinol Laryngol 1987;96(5):573–7.
[31] Norton RL, Bernstein RS. Improved laboratory prototype electrolarynx (LAPEL): using inverse filtering of frequency response function of the human throat. Ann Biomed Eng 1993;21:163–74.
[32] Espy-Wilson CY, Chari VR, MacAuslan J, Walsh M. Enhancement of electrolaryngeal speech by adaptive filtering. J Speech Lang Hear Res 1998;41:1253–64.
[33] Espy-Wilson CY, Chari VR, Huang CB. Enhancement of alaryngeal speech by adaptive filtering. ICSLP Proc 1996;2:764–7.
[34] Niu HJ, Wan MX, Wang SP, Liu HJ. Enhancement of electrolarynx speech using adaptive noise cancelling based on independent component analysis. Med Biol Eng Comput 2003;41(6):670–8.
[35] Cole D, Sridharan S, Moody M, Geva S. Application of noise reduction techniques for alaryngeal speech enhancement. In: Proceedings of the IEEE TENCON-97, vol 2; 1997. p. 491–4.
[36] Pandey PC, Bhandarkar SM, Bachher GK, Lehana PK. Enhancement of alaryngeal speech using spectral subtraction. In: Proceedings of the DSP 2002, vol 2; 2002. p. 591–4.
[37] Pratapwar SS, Pandey PC, Lehana PK. Reduction of background noise in alaryngeal speech using spectral subtraction with quantile based noise estimation. In: Proceedings of the seventh world multiconference on systemics, cybernetics and informatics; 2003. p. 408–13.
[38] Boll SF. Suppression of acoustic noise in speech using spectral subtraction. IEEE Trans Acoust Speech Signal Process 1979;27:113–20.
[39] Liu HJ, Zhao Q, Wan MX, Wang SP. Enhancement of electrolaryngeal speech based on auditory masking. IEEE Trans Biomed Eng 2006;53:865–74.
[40] Liu HJ, Zhao Q, Wan MX, Wang SP. Application of spectral subtraction method on enhancement of electrolarynx speech. J Acoust Soc Am 2006;120:398–406.