The Laryngoscope

Lippincott Williams & Wilkins, Inc.

© 2006 The American Laryngological,
Rhinological and Otological Society, Inc.

A New Generation Videokymography for Routine Clinical Vocal Fold Examination
Qingjun Qiu, PhD; Harm K. Schutte, MD, PhD

Objective: This study aims to introduce a new-generation videokymographic system, which provides simultaneous laryngoscopic and kymographic images, for routine clinical vocal fold examination. Study Design: The authors explored a new imaging method for diagnosis and evaluation of voice disorders. Methods: The new-generation videokymographic system includes two charge-coupled device image sensors: a color area image sensor and a monochrome high-speed line-scan image sensor. The high-speed line-scan image sensor is used to capture the kymogram, and the color area image sensor is used to obtain the laryngoscopic image. The two images can be displayed simultaneously on a video monitor or stored in a standard video recorder. Three subjects with nonpathologic voice were investigated in detail with the new videokymographic system. Results: The high-quality laryngoscopic image and kymogram can be used directly for clinical purposes with no further postprocessing. The scan position of the kymogram is always indicated in the laryngoscopic image, which provides feedback for the operator to easily locate the expected scanning position. All varieties of vocal fold vibration, including irregular vibrations, phonation onset and offset, can be observed with the presented method. The continuous kymogram of the vocal fold vibration can be retrieved from a kymographic image sequence for quantitative analysis. Conclusions: The new-generation videokymography provides a simple, quick means to investigate vocal fold vibration, especially for voice disorders. It can emerge as an important tool for routine clinical vocal fold examination.
Key Words: Videokymography, vocal fold vibration, laryngoscopy, voice disorders, routine clinical vocal fold examination.
Laryngoscope, 116:1824–1828, 2006

INTRODUCTION

The use of kymographic imaging as a method for visualizing vocal fold vibration, especially for disordered vibration, has increased greatly since Gall and Hanson first introduced it to register the motion of the vocal folds in 1971 [1]. In their research, a special photographic camera with a slit shutter was used to expose the vocal fold movement onto the film, a method that is known as photograph kymography. This method is extremely time-consuming because the film must still be developed, with the result that photograph kymography is not practical for routine clinical diagnosis [2].
Fortunately, videokymography (VKG) [3], in which the kymographic image is encoded as a standard video signal, can reveal the vocal fold kymogram directly on a standard video monitor. Thus, the use of this real-time imaging technique spread rapidly into voice research and clinical practice. Schutte et al. reported the first clinical application of VKG in 1998. In their study, more than 800 patients with various functional and organic voice disorders were examined [4]. Jiang et al. used VKG to quantify vocal fold mucosal wave movements in canine larynges [5]. Verdonck-de Leeuw et al. combined videokymographic image sequences and speech signals to evaluate the effect of irregular vocal fold vibration on voice quality [6]. A common conclusion from these studies is that, with high spatial and temporal resolution, VKG would emerge as a valuable method for voice disorder diagnosis and a powerful tool for better understanding the mechanism of the vocal fold vibration.
However, several drawbacks of the first-generation VKG, which are illustrated in the “Discussion” section, slowed the progress of its application during the last 5 years. To overcome these obstacles, a new-generation videokymographic system has been developed that retains the advantages of the first-generation VKG while resolving its problems. A preliminary clinical result is also presented, illustrating what has been improved in the new system.

From the Groningen Voice Research Lab, Department of Biomedical Engineering, University Medical Center of Groningen, University of Groningen, Groningen, The Netherlands.
Editor’s Note: This manuscript was accepted for publication June 7, 2006.
The work was done in Groningen Voice Research Lab, the Department of Biomedical Engineering, University Medical Center of Groningen, University of Groningen, Groningen, The Netherlands.
This research was supported by the Technology Foundation STW, the applied science division of NWO, and the technology programme of the Ministry of Economic Affairs, The Netherlands, project No. G5973.
Send correspondence to Dr. Qingjun Qiu, BioMedical Department, University Medical Center of Groningen, University of Groningen, A. Deusinglaan 1, NL 9713AV Groningen, The Netherlands. E-mail: q.qiu@med.umcg.nl
DOI: 10.1097/01.mlg.0000233552.58895.d0

Laryngoscope 116: October 2006 Qiu and Schutte: A New Generation Clinical Videokymography

The New-Generation Videokymography
The new-generation videokymographic system includes two charge-coupled device (CCD) image sensors: a color area CCD and a monochrome high-speed line-scan CCD. For the kymographic imaging, the high-speed line-scan CCD is used to capture
a selected line, which is usually aligned perpendicular to the
glottal axis of the vocal folds. The color area CCD sensor is used
to record the laryngoscopic image. A beam splitter optically di-
vides the image from the laryngoscope into two paths, one path
for the area CCD and the other for the line-scan CCD. The two
CCDs work simultaneously so that the laryngoscopic image and
kymographic image can be obtained at the same time. By means
of the beam splitter, the position of the linear CCD is fixed on the
reflective center of the area CCD, i.e., the kymogram taken from
the line-scan CCD shows the vocal fold vibration at the center line
of the laryngoscopic image.
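
As a rough software analogue of this arrangement, a kymogram can be pictured as one fixed image row sampled repeatedly over time. The system itself captures that line optically with a dedicated line-scan CCD, not from the area image, so the sketch below is only an illustration. It assumes a hypothetical stack of grayscale frames held in a NumPy array; the function name and the array layout are not taken from the article.

```python
from typing import Optional

import numpy as np


def kymogram_from_frames(frames: np.ndarray, row: Optional[int] = None) -> np.ndarray:
    """Stack one image row from every frame into a (time, position) image.

    frames: grayscale stack of shape (n_frames, height, width).
    row:    index of the scan line; defaults to the center row, mirroring
            the statement that the kymogram shows the center line.
    """
    if frames.ndim != 3:
        raise ValueError("expected a (n_frames, height, width) array")
    n_frames, height, width = frames.shape
    if row is None:
        row = height // 2
    if not 0 <= row < height:
        raise ValueError("scan row lies outside the image")
    # Each output row is the selected scan line at one instant in time.
    return frames[:, row, :].copy()


# Synthetic data (assumption): a dark "glottis" whose width oscillates.
n_frames, height, width = 60, 9, 21
frames = np.full((n_frames, height, width), 200, dtype=np.uint8)
for t in range(n_frames):
    half_gap = int(round(3 * (1 + np.sin(2 * np.pi * t / 20)) / 2))
    frames[t, :, width // 2 - half_gap: width // 2 + half_gap + 1] = 0

kymo = kymogram_from_frames(frames)
print(kymo.shape)  # (60, 21)
```

In the resulting array, time runs down the rows and position along the scan line runs across the columns, which is the layout a kymogram is read in.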
The system is divided into two parts, a camera head and a
controller unit. The two parts connect with one another by a
high-performance video cable. The camera head houses only the
image sensors and the beam splitter, reducing weight and size for
easy handling. The main controller unit contains an embedded
microprocessor and a frame buffer memory, providing capability
for processing the video images in real time. The system has two
output interfaces, an analog and a digital. The analog video output
allows the two images to be displayed on a standard analog video
monitor, whose screen is vertically split into two equal parts with
the laryngoscopic image on the left and the kymogram on the right.
The digital output port allows the video data to be acquired by a
digital video frame-grabber. Custom-made software, which includes
a database system, was developed to capture, display, and store the
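
The split-screen layout of the analog output can be sketched in a few lines. This is a minimal illustration, assuming the 720 × 576 PAL frame and the 360 × 576 half-frames quoted in the article's Discussion; the function name, the array layout, and the way the scan-position line is drawn (overwriting the center row with white) are hypothetical and say nothing about how the controller unit implements it.

```python
import numpy as np

FRAME_H, FRAME_W = 576, 720   # PAL frame size quoted in the article
HALF_W = FRAME_W // 2         # each image occupies half of the frame width


def compose_split_frame(laryngo_rgb: np.ndarray,
                        kymo_gray: np.ndarray,
                        mark_scan_line: bool = True) -> np.ndarray:
    """Place the laryngoscopic image on the left and the kymogram on the right."""
    if laryngo_rgb.shape != (FRAME_H, HALF_W, 3):
        raise ValueError("laryngoscopic image must be 576 x 360 x 3")
    if kymo_gray.shape != (FRAME_H, HALF_W):
        raise ValueError("kymogram must be 576 x 360")
    left = laryngo_rgb.copy()
    if mark_scan_line:
        # White line at the center row marks the kymogram scan position.
        left[FRAME_H // 2, :, :] = 255
    # Replicate the monochrome kymogram into three channels.
    right = np.repeat(kymo_gray[:, :, None], 3, axis=2)
    return np.concatenate([left, right], axis=1)


laryngo = np.zeros((FRAME_H, HALF_W, 3), dtype=np.uint8)
kymo = np.full((FRAME_H, HALF_W), 128, dtype=np.uint8)
frame = compose_split_frame(laryngo, kymo)
print(frame.shape)  # (576, 720, 3)
```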

The Instruments and Subjects

The new-generation videokymographic system was used for gathering the kymographic data of the present study. The images were obtained with a 90° rigid laryngoscope (Richard Wolf 4450.57, Germany) together with a C-mount optical adapter (Richard Wolf 5261.27, f = 32 mm, Germany). The vocal folds were illuminated by the light from a 300-W xenon light source (Kay Elemetrics 7150), which is transmitted to the tip of the endoscope through a bundle of optical fibers. For digital recording, a 16-bit parallel digital frame-grabber (National Instruments PCI-1422) was used to directly obtain the images from the digital port of the new system. For analog video recording, an s-VHS video recorder (Panasonic AG-7355; Panasonic-Matsushita Electric Industrial Co., Ltd., Japan) was used. The analog video recording was then digitized by a video frame-grabber (National Instruments PCI-1411). For comparison purposes, the first-generation videokymographic system (Lambert Instruments, BV, Leutingewolde, The Netherlands) was also used in the experiment.
Three subjects with nonpathologic voices were investigated in detail in the Groningen Voice Research Lab in The Netherlands.

Fig. 1. The new kymographic images from subject no. 1. The upper one was captured during a sustained vowel /i/. The lower images show irregular vocal fold vibration when the subject intentionally produced a hoarse voice. Each image is vertically split into two parts. The left part shows the laryngoscopic image and the right part shows the kymographic one. The white line in the laryngoscopic image indicates the scanning position of the kymogram. “A” points out a blood vessel on the vocal folds. “B” and “C” indicate vibratory cycles, respectively, with and without closed phase. “L” and “R” indicate left and right sides, respectively.
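
The article names fundamental frequency and closed quotient as quantities that postprocessing software can extract from a kymogram. A minimal sketch of both follows, assuming the line rate of 7,200 lines per second quoted in the article and a hypothetical, already extracted glottal-width value for every scan line. The function names and the zero-width threshold are illustrative assumptions, not the authors' software.

```python
import numpy as np

LINE_RATE_HZ = 7200.0  # kymographic line rate quoted in the article


def f0_from_cycle_count(n_cycles: float, n_lines: int,
                        line_rate_hz: float = LINE_RATE_HZ) -> float:
    """Estimate fundamental frequency from cycles counted over n_lines scan lines."""
    if n_lines <= 0:
        raise ValueError("n_lines must be positive")
    # Duration covered is n_lines / line_rate_hz seconds.
    return n_cycles * line_rate_hz / n_lines


def closed_quotient(glottal_width, closed_threshold: float = 0.0) -> float:
    """Fraction of scan lines in which the glottis is closed.

    Assumes the trace spans a whole number of vibratory cycles, so that the
    fraction of closed lines equals closed time divided by cycle period.
    """
    width = np.asarray(glottal_width, dtype=float)
    if width.size == 0:
        raise ValueError("empty glottal-width trace")
    return float(np.mean(width <= closed_threshold))


# One 40 ms video frame holds 7200 * 0.040 = 288 scan lines.
print(f0_from_cycle_count(7, 288))                       # 175.0
print(closed_quotient([0, 0, 1, 2, 1, 0, 0, 1, 2, 1]))   # 0.4
```

With seven cycles counted in one 40 ms frame, the first function reproduces the roughly 175 Hz that the article reports for the sustained vowel in Figure 1.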

RESULTS

The upper part of Figure 1 was captured by the new system while subject no. 1 was sustaining the vowel /i/. The laryngoscopic and kymographic images are shown on the left and the right, respectively. The white line in the laryngoscopic image indicates the scan position for the kymogram. The vocal folds in the laryngoscopic image appear blurred because the area CCD is not fast enough to follow the vibration of the vocal folds (25 frames per second); the high-speed kymographic image, however, clearly displays the vibrations (7,200 lines per second). Each frame contains 40 milliseconds of vibratory history of the vocal folds. Thus, the fundamental frequency of the regular vocal fold vibration can easily be estimated by counting the number of vibratory periods in the kymographic image. In this particular frame, the kymographic image contains approximately seven vibratory cycles, implying that the fundamental frequency is approximately 175 Hz (seven cycles in 0.04 second) for this phonation. Other quantitative parameters such as closed quotient can be calculated by postprocessing software.
The lower part of Figure 1 shows an irregular vocal fold vibration when the subject intentionally produced a hoarse voice. As in the upper half of the figure, the laryngoscopic image is also blurred. The kymographic image, however, shows a complex disordered vibration pattern. The closed phase is much shorter than in a regular vibration (as indicated in B), and several of these cycles have no closed phase (as indicated in C).
Onset and offset are particularly revealing phases of phonation. In clinical practice, the assessment of a phonation onset can be used as a tool to diagnose vocal dysfunctions. To show the capability of revealing phonation onset and offset with the new system, Figure 2 illustrates a complete short phonation, including the phonation onset and offset period. The image was obtained when subject no. 2 produced a very short vowel /i/. In this case, during the onset period, the two vocal folds are synchronized. However, the right vocal fold is ahead of the left vocal fold during the offset period.
For comparison purposes, subject no. 3 was examined with both the new- and the old-generation videokymographic systems. The subject was instructed to keep the pitch and loudness constant for both examinations. Figure 3 shows the result. The left image was taken with the old videokymographic system and the right with the new one.

Fig. 2. A kymographic image sequence from subject no. 2. The images were taken during a very brief phonation of vowel /i/, documenting voice onset and offset. The phonation continues without interruption from the left segment through the right. “L” and “R” indicate left and right sides, respectively.

Fig. 3. Two kymographic images from subject no. 3. The left image was taken with the old videokymographic system and the right one with the new system. The subject was instructed to maintain the same pitch and loudness. “L” and “R” indicate left and right sides, respectively.

DISCUSSION

The simultaneous laryngoscopic and kymographic imaging in the new system dispenses with the complicated operational procedure of kymogram acquisition used in the old-generation VKG system, illustrated in Figure 4. The old system provides two working modes, normal mode and kymographic mode. In the normal mode, laryngoscopic images are obtained. The kymographic mode produces the high-speed kymogram. A footswitch controls the working mode. However, the two working modes are mutually exclusive, preventing the operator from seeing the scan position while using the kymographic mode. Therefore, to obtain the desired scanning position, a general operational rule must be observed. First, the normal mode is used to locate the vocal folds in the laryngoscopic image. The desired scanning position must be kept at the top of the laryngoscopic image, because only the top line will be scanned in the kymographic mode (Fig. 4A). Then the kymographic mode is activated to obtain the kymogram. However, if there is subsequently a slight relative movement between the endoscope and the vocal folds, the kymogram will represent the vibrations at a position other than the intended one. The error can be more serious if the operator does not notice that discrepancy, possibly resulting in an incorrect clinical diagnosis. Fortunately, the new VKG system presents the laryngoscopic and kymographic images simultaneously with the scan position of the kymogram always indicated on the laryngoscopic image. Even a slight relative movement between the vocal folds and the endoscope would be revealed in the laryngoscopic image. In this way, the simultaneous imaging greatly facilitates finding the desired position for the kymogram along the glottis.

Fig. 4. A general procedure for acquiring a kymographic image with the old-generation videokymograph. The videokymograph has two working modes, a (A) normal mode and a (B) kymographic mode. The normal mode is first used to locate the desired scanning position on the top line of the laryngoscopic image. Then one shifts to the kymographic mode, and the kymogram of the top line in the laryngoscopic image is revealed on the screen (B). The interlaced kymogram makes the images not directly observable from the screen. As a result, postprocessing is necessary to retrieve the kymogram, which will always lack the information during the vertical blanking period (C).

The new imaging system also provides an uninterrupted kymogram, which was not possible in the old system. The old VKG is a type of television-standard video camera (PAL standard is used in this article) in which each half frame of 20 milliseconds, called a field, contains an active time and a vertical blanking time. The video image can only be shown during the active time of 18.4 milliseconds. As a result, during the vertical blanking period (the remaining 1.6 milliseconds of each field), the kymographic information is missing (Fig. 4C). However, in the new system, this problem is solved by buffering the kymographic image. During vertical blanking, the line-scan CCD still captures kymographic images, which cannot be displayed but can be stored in the buffer memory of the controlling system. When the video is activated, these images will be displayed. This makes possible the construction of a continuous kymogram, as in Figure 2.
For the clinical application, it is very important to keep examination time brief. The real-time laryngoscopic and kymographic imaging of the new system greatly speeds the process. Examination time using the first-generation VKG is relatively short, but some postprocessing is always necessary to obtain satisfactory images. In that system, there are two problems with the raw kymographic image, which is displayed on the video screen in real time (see Fig. 4). First, every second line is black, interrupting the continuity of information. Second, in normal video, each frame is made up of two interlaced fields, an odd and an even. Without postprocessing, the display of these interlaced kymograms is difficult to read. Figures 4B and 4C show kymograms before and after postprocessing. Both problems are caused by the TV standard used in the old system. However, in the new system, the digital signal processor reformats output video data so that these problems are eliminated, and the images can be used directly for clinical purposes. If desired, postprocessing can be implemented for extracting quantities such as fundamental frequency and closed quotient.
The new videokymographic system has both analog and digital video outputs. Using the PAL standard, the images from the analog output can be shown on a standard video monitor, stored in a standard video recorder, or printed by a video image printer. However, the maximal spatial resolution is also limited by the PAL standard of 720 × 576 pixels. The laryngoscopic and kymographic images each occupy half of the frame, having a maximal spatial resolution of 360 × 576 pixels. This limitation is overcome in the digital video output, which is based on a nonstandard parallel digital interface. The resolution of the laryngoscopic image is 720 × 576 pixels in one frame with 25 frames per second. The resolution of the kymographic image is 625 pixels per line with 7,200 lines per second. A further advantage of using the digital port is that the speed of the digital port is high enough to transfer the raw image in real time without any loss through compression. Thus, the quality of the images from the digital port is superior to that from the analog port. Nevertheless, both the analog port and the digital port provide noninterrupted vibratory information of the vocal folds, which can be digitized or acquired by a computer, quantified by analysis software, and stored in a database system.
The CCD used to record the kymographic image has a particularly high sensitivity, resulting in two major advantages of the new system. First, the high-sensitivity imaging reduces the requirement for the light source. A standard 250-W or 300-W xenon light source is sufficient to obtain good-quality images. Even a 180-W xenon light source is adequate if a rigid laryngoscope with direct optical fiber connection is used, that is, one in which the optical fiber is an integral part of the laryngoscope. Second, the high-sensitivity feature gives it a marked advantage in image quality over other commercially available kymographic systems. Figure 3 shows the obvious difference in image quality between the old videokymographic system and the new one. In the noise-free image of the new system, even the blood vessels on the vocal folds are recognizable (Fig. 3).
The advances in image quality also provide a promising impression of three-dimensional vocal fold movement. In comparison to the upper part of Figure 1, the three-dimensional vibration of the whole vocal folds is more saliently recognizable in the kymographic image of the lower part of Figure 1. The chief explanation of the high-amplitude vocal fold movement is that the vocal folds are less coupled to the airflow in this specific irregular phonation with a high mean airflow. A similar phenomenon can be seen in vocal fold vibration in a unilateral laryngeal paralysis [4]. Evidence of such a three-dimensional movement pattern might help to interpret the cause of an existing vocal fold disorder.
The three-dimensional vocal fold movement can also be revealed with stroboscopy. However, stroboscopy has a serious limitation in that it works only with periodic vibration [7]. To observe aperiodic vibration, VKG or a full high-speed camera is the proper choice.
Several earlier papers have compared the old system of VKG and the full high-speed camera with respect to their various advantages [8,9], and we do not repeat that comparison here. The new system, however, adds major improvements, discussed previously, to the advantages of the old VKG. The most important of these advantages is its high image quality, which includes high spatial resolution, high temporal resolution, and a high signal-to-noise ratio. Another important advantage is lower data volume, allowing the kymographic image to be captured in real time without the time constraint that pertains to the full high-speed camera.
The audio and the electroglottograph signals also play an important role in voice research. The system provides a synchronization mechanism to align the audio signal and
electroglottogram. This is beyond the scope of the present study and will be addressed in a forthcoming paper.

Our results show that the new-generation videokymographic system provides high-quality laryngoscopic and kymographic images simultaneously. For the first time, the kymographic image can be visualized directly on a video monitor or stored in a standard video recorder without any waiting time for postprocessing, remarkably reducing examination time. A continuous kymogram of the vocal fold vibration can be retrieved from a kymographic image sequence without interruption by vertical blanking. This gives the new VKG system important advantages not only for clinical purposes, but also for quantitative voice research. In summary, the new-generation VKG provides a simple and fast way to study vocal fold vibration, especially when this is irregular. It will be an important tool for routine clinical vocal fold examination as well as for detailed research.

Acknowledgments
The authors gratefully acknowledge D. G. Miller for help with the English version of the article. The authors also appreciate the comments of J. G. Švec and N. A. George in reviewing the manuscript.

1. Gall V, Gall D, Hanson J. Laryngeal photokymography. Arch Klin Exp Ohren Nasen Kehlkopfheilkd 1971;200:34–41.
2. Gross M. Larynxfotokymographie. Sprache-Stimme-Gehör
3. Švec JG, Schutte HK. Videokymography: high-speed line scanning of vocal fold vibration. J Voice 1996;10:201–205.
4. Schutte HK, Švec JG, Šram F. First results of clinical application of videokymography. Laryngoscope 1998;108:1206–1210.
5. Jiang JJ, Chang CI, Raviv JR, Gupta S, Banzali FM Jr, Hanson DG. Quantitative study of mucosal wave via videokymography in canine larynges. Laryngoscope 2000;110:1567–1573.
6. Verdonck-de Leeuw IM, Festen JM, Mahieu HF. Deviant vocal fold vibration as observed during videokymography: the effect on voice quality. J Voice 2001;15:313–322.
7. Kitzing P. Stroboscopy - a pertinent laryngological examination. J Otolaryngol 1985;14:151–157.
8. Wittenberg T, Tigges M, Mergell P, Eysholdt U. Functional imaging of vocal fold vibration: digital multislice high-speed kymography. J Voice 2000;14:422–442.
9. Hertegard S, Larsson H, Wittenberg T. High-speed imaging: applications and development. Logoped Phoniatr Vocol 2003;28:133–139.
