Gonçalves 1998 A Hardware-Software System For Analysis of Video Images
MARIA INÊS GONÇALVES AND REBECCA LEONARD

Accepted for publication December 4, 1996.
Address correspondence and reprint requests to Rebecca Leonard, Ph.D., University of California, Davis, Department of Otolaryngology, 2521 Stockton Blvd., Suite 7200, Sacramento, California, 95817.

Quantitative analyses of videotaped images can be a desirable objective for both researchers and clinicians. In the past, most image measurements were manually performed, and calculations were subject to a large margin of error. New possibilities have been realized with advances in computer hardware and software. Currently available products improve precision, provide shortcuts, save time in analysis, and permit manipulation of images in ways that enhance their quality prior to measurement. At present, powerful digital image-processing techniques are readily available, and in some cases, at prices that are within the reach of even the most economy-conscious budgets.

SYSTEM DESCRIPTION

One such system, heavily used at our Center, is based on the software program Image, initially developed by Wayne Rasband at the U.S. National Institute of Mental Health. (The Image program can be downloaded from the Internet at the Image Home Page, http://rsb.info.nih.gov/nih-image/. It can also be obtained by contacting the National Institute of Mental Health in Bethesda, MD, U.S.A.) This software is a public domain image processing and analysis program for Macintosh computers (a PC version which runs on Windows 95 is also available). It requires a computer (from Mac II to contemporary models) with at least 8 MB of RAM, and a monitor
with the capacity to display 8-bit or 16-bit images (256 gray levels). A frame-grabber board is also required to digitize video images. Boards made by Data Translation* or Scion** support the NIMH program, and are available for around $1000. Alternatively, Image supports QuickTime digitizers, such as those built into AV Macs and selected PowerMacs.

* (100 Locke Drive, Marlboro, MA 01752)
** (152 West Patrick Street, Frederick, MD 21701)

With Image and appropriate frame-grabber hardware, a user can acquire, display, edit, enhance, analyze, print, and animate images directly from a videocamera, or from a VCR. Once an image is captured, it can be subjected to a number of enhancements, including contrast and brightness adjustments, smoothing, sharpening, edge detection, and a variety of filtering processes. Digitized images can be rotated, inverted, scaled, and manipulated in several other ways as well. The program can be used to measure areas, path lengths and angles, to average gray values, and to determine center and angle of orientation of defined regions of interest. An additional feature of the program is its capability to perform automated particle analysis. Editing of color and grayscale images (such as that seen with MacPaint), including the option of overriding automatic operations to manually outline, select, and/or measure particular regions of interest, make Image extremely "user-friendly." Any calculations obtained can be printed, exported to text files and spreadsheets, or copied to the "Clipboard" for further manipulation or analysis. The program supports multiple windows which can be simultaneously opened, and eight levels of magnification in which all editing, filtering, and measurement functions can operate.

In order to use Image, a videotaped frame of interest is input from a VCR or camera through the computer's frame-grabber board, digitized, and captured. Alternatively, sequences of images can be collected, with the number of frames per second and the amount of data limited by capabilities of the frame-grabber board and available memory on the computer. A VCR with stop frame or variable playback forward and reverse speeds is useful for identifying selected frames of interest for digitization. A TV monitor used with the computer monitor is helpful for simultaneously visualizing video and digitized images.

Once an image has been captured and enhanced or otherwise manipulated to meet the user's criteria, it can be subjected to analysis. Image contains several tools that facilitate measurement. Tools are similar to those in many draw and paint programs and include a magnifying glass, scrolling tool, selection rectangle, oval or polygon, a freehand drawing tool, line tool, pencil, eraser, paintbrush, and look-up table (LUT). The LUT permits the user to transform each of the 256 possible gray scale pixel values into color, if desirable. Areas chosen for processing are identified using rectangle, oval, or polygon selection tools. Lines are created using the line tool, and can be straight, freehand, or segmented. Any selection can be moved, stretched, added, subtracted, deleted, transferred, saved, or restored. Selection options are also useful to isolate and enhance a particular region of an image without changing other parts of the image.

APPLICATION EXAMPLES

Swallowing

One application of the Image program that we have found extremely useful is the analysis of videotaped fluoroscopic studies of swallowing. At our institution, dynamic swallow studies are performed in adults and children experiencing dysphagia related to head and neck pathology, neuromuscular disease, neurogenic, and other disorders. During these studies, patients are asked to swallow 1 cc, 3 cc, and self-selected amounts of barium, of both liquid and paste consistencies, during videofluoroscopic filming in lateral and anteroposterior views at 30 frames per second. Timing measures can be obtained without digitization, but other quantitative assessments are made possible with Image.

For this purpose, a radiopaque ring of known diameter is placed on the patient's midchin to serve as a referent measurement (Fig. 1), that is, x pixels = y displacement (in mm, cm, or other measurement standard), assuming linearity of images obtained. The line tool is used first to draw a straight line across the diameter of the ring. The number of pixels traversed in this distance is then entered in the calibration window to equal the number of mm of the known diameter.
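The calibration step described above amounts to a single scale factor. The following is a minimal sketch of that arithmetic only, not of how Image implements it; the function names, the endpoint coordinates, and the 19 mm ring diameter are hypothetical values chosen for illustration, and the sketch assumes the linearity of the image that the text mentions.

```python
import math


def pixels_per_mm(p1, p2, known_length_mm):
    """Scale factor from a line drawn across a referent of known size.

    p1 and p2 are the (x, y) pixel endpoints of the line drawn across
    the ring; known_length_mm is the ring's true diameter.
    """
    pixel_length = math.dist(p1, p2)
    if pixel_length == 0 or known_length_mm <= 0:
        raise ValueError("referent line and known length must be non-zero")
    return pixel_length / known_length_mm


def to_mm(pixel_distance, scale):
    """Convert a distance measured in pixels to millimetres."""
    return pixel_distance / scale


# Hypothetical example: a 19 mm ring spans 76 pixels in the digitized frame.
scale = pixels_per_mm((100, 200), (176, 200), 19.0)
print(scale)             # 4.0 pixels per mm
print(to_mm(50, scale))  # 12.5 mm
```

Because the factor is derived from one referent in one plane, it applies only to structures lying at roughly the same distance from the imaging source as the ring.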
FIG. 2. A: Pharyngeal area at rest in normal adult is outlined with tools in Image. B: Pharyngeal area at point of maximum constriction during swallow in normal adult.
FIG. 3. A: Pharyngeal area at rest in adult patient with oropharyngeal resection. B: Pharyngeal area in same patient at point of maximum constriction during swallow. Large area reflects difficulty in tongue-pharynx contact caused by resection.
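An outlined region such as the pharyngeal area in Figs. 2 and 3 can be reduced to a number once its outline is treated as a polygon. The sketch below uses the standard shoelace formula; it is an illustration of the geometry and is not taken from Image itself. The vertex list and the scale of 4 pixels per mm are hypothetical.

```python
def polygon_area_px(vertices):
    """Area of a closed polygon in square pixels (shoelace formula).

    vertices is a list of (x, y) pixel coordinates in drawing order;
    the polygon is closed implicitly from the last vertex to the first.
    """
    n = len(vertices)
    if n < 3:
        raise ValueError("an outline needs at least three vertices")
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def area_mm2(vertices, px_per_mm):
    """Convert the outlined area to square millimetres."""
    return polygon_area_px(vertices) / (px_per_mm ** 2)


# Hypothetical outline: a 40 x 20 pixel rectangle at 4 pixels per mm.
outline = [(0, 0), (40, 0), (40, 20), (0, 20)]
print(polygon_area_px(outline))  # 800.0 square pixels
print(area_mm2(outline, 4.0))    # 50.0 square mm
```

Note that areas scale with the square of the linear calibration factor, so an error in the referent measurement is magnified in the area result.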
laboratory suggest it is a useful measure in characterizing the nature of swallowing impairment in dysphagic patients.

ARTICULATORY MOVEMENTS

An additional application of Image at our center is in determining range of tongue + jaw motions during selected speech tasks. In a recent study, for example, range of tongue motion across the vowels /i/, /a/, and /u/ in normal speakers and in speakers with glossectomy was investigated. Range of tongue motion is defined here as the total area encompassed by the tongue across the three vowels, as measured from lateral view videofluoroscopy studies. To make this measurement, steady-state portions of subjects' productions of /i/, /a/, and /u/ were first identified on the videotape, captured, and digitized. The tongue was then outlined or traced anteriorly from its insertion in the floor of mouth and posteriorly to the vallecula.
Journal of Voice, Vol. 12, No. 2, 1998
FIG. 4. A: Hyoid at rest in normal adult. Referent lines are added and anterior hyoid is outlined using tools in Image. B: Hyoid at point of maximum elevation during swallow in normal adult. Anterior hyoid is outlined and referent lines are added. C: Portions of anterior hyoid and referent lines in A are selected and copied for pasting onto the image in B. D: Selection of hyoid and referent lines in A is superimposed on image in B, with referent lines aligned. The shortest distance between the two points can be calculated; alternatively, anterior and superior displacement of hyoid, or displacement in terms of vertebral height, can be quantified.
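Once the two hyoid positions share a common frame (referent lines aligned, as in Fig. 4D), the displacement measures reduce to simple coordinate arithmetic. The following is a sketch under stated assumptions: the function name and coordinates are hypothetical, image y-coordinates are assumed to increase downward, and the `anterior_is_left` flag is an assumed convention for which way the patient faces in the lateral view.

```python
import math


def hyoid_displacement(rest, peak, px_per_mm, anterior_is_left=True):
    """Anterior, superior, and straight-line displacement in mm.

    rest and peak are (x, y) pixel positions of the same hyoid landmark
    at rest and at maximum elevation, after the frames are aligned.
    """
    dx = peak[0] - rest[0]
    dy = peak[1] - rest[1]
    # Movement toward the patient's front counts as positive.
    anterior = (-dx if anterior_is_left else dx) / px_per_mm
    # Pixel rows grow downward, so upward movement is a negative dy.
    superior = -dy / px_per_mm
    shortest = math.hypot(dx, dy) / px_per_mm
    return anterior, superior, shortest


# Hypothetical positions, calibrated at 4 pixels per mm.
anterior, superior, shortest = hyoid_displacement((120, 300), (108, 284), 4.0)
print(anterior, superior, shortest)  # 3.0 4.0 5.0
```

Expressing displacement in units of vertebral height, as the caption suggests, would replace `px_per_mm` with the pixel height of the chosen vertebra measured in the same frame.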
FIG. 5. A: Arrow indicates location of UES at rest (closed) in normal adult. B: Arrow indicates maximum opening of UES during swallow in normal adult.

An example of this for the vowel /a/ is shown in Fig. 6A. As shown, a straight line was again projected along the floor of nose to the tubercle of the atlas, and a straight line was projected inferiorly from the tubercle. This process is then repeated for the subject's production of /i/ (Fig. 6B). With /i/ completed, the outline of the tongue and portions of the two reference lines were selected, copied (Fig. 6C), and then pasted onto the image of the speaker producing the vowel /a/ (Fig. 6D). This step was then repeated for the image of the speaker producing the vowel /u/. With each superimposition, care was taken to align the referent lines. When the composite picture was completed, the measurements of the total and shared areas of movement of the tongue for the three vowels were made, as illustrated in Fig. 6E and F. As noted, both overall range of tongue motion and the proportion of shared area to total area are being compared in control speakers and speakers with glossectomy. Although analyses of these data have not been completed, preliminary findings suggest that speakers may strive to preserve the ratio of shared and independent areas to total area even with extensive oropharyngeal resection.

LARYNGEAL PARAMETERS

Additional uses of Image include relative measurements of a number of laryngeal parameters. It has not been possible to make absolute measurement of laryngeal variables due to the difficulty of locating a known measurement referent for structures of interest. However, relative measures are quite possible, and include extent and degree of closure of the vocal folds, characteristics of anterior and posterior glottal chinks, angles formed by the vocal processes or anterior commissure, length of the true vocal folds associated with frequency changes, and displacement of the vocal fold edges associated with intensity variation. A simple example is presented in Fig. 7, in which the extent of a lesion along the vibratory portion of one vocal fold edge is compared to glottic length. In the example shown, the broad-based lesion occupies about one third of the entire length of the glottis. Repeated measures over time can provide quantitative information about any reduction in the extent of the lesion with various interventions.

Other applications of Image in our setting have ranged from measures of velopharyngeal function during speech to tissue measurements from histology slides input into the computer via a videocamera attached to a microscope, but the system lends itself to any type of video information for which measurement or quantitative analysis is desirable. With a Macintosh (or PC) computer, digitizing board or built-in digitizer, good quality VCR (preferably with stop frame and variable playback rates), and the Image software program from NIMH, the clinician or researcher has a powerful tool. Virtually any clinical or research material that can be prepared in video format can be subjected to a wide range of measurement and analysis techniques. Although many image analysis options are available, the system described here, involving free software (which is continually upgraded) and relatively inexpensive hardware, has proven to be an extremely valuable resource with a wide range of applications.

REFERENCES

1. Kendall K, McKenzie S, Leonard R, Gonçalves M, Walker A. Dynamic videofluoroscopic swallowing parameters in normal adults. Presented at Dysphagia Research Society Meeting, Aspen, CO, October, 1996.
FIG. 6. A: Lateral view videofluoroscopic frame of normal adult producing vowel /a/. Tongue is outlined from anterior floor of mouth to vallecula using tools in Image. Referent lines are added. B: Process is repeated for /i/. C: Tongue shape and portion of referent lines in B are selected and copied for pasting onto image in A. D: Selection in B is pasted onto frame of speaker producing vowel /a/, with care taken to align referent lines. E: Composite of images for the three vowels is completed, with referent lines aligned. A line connects vallecula to the anterior floor of mouth for each tongue shape. These points are then connected to form the inferior border of the composite. F: Area common to all three tongue positions (shared area) is outlined with segmented line. Measurements permit calculation of total, shared, and independent tongue areas for the three vowel productions.
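The total, shared, and independent areas in Fig. 6 can be thought of as set operations on the three aligned tongue regions. The sketch below is one way to express that idea and is not the procedure used in Image, where the composite was outlined by hand. It assumes each region is already available as a set of filled pixel coordinates in a common frame, and it takes "independent" to mean total minus shared, which is an interpretation rather than a definition given in the text. The rectangles and the scale of 5 pixels per mm are hypothetical stand-ins for real tongue outlines.

```python
def rect(x0, y0, x1, y1):
    """Hypothetical stand-in for a filled outline: pixels in a rectangle."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


def tongue_area_summary(regions, px_per_mm):
    """Total, shared, and independent areas (square mm) for aligned regions."""
    total_px = len(set.union(*regions))         # covered by any vowel
    shared_px = len(set.intersection(*regions))  # covered by every vowel
    px2_per_mm2 = px_per_mm ** 2
    total = total_px / px2_per_mm2
    shared = shared_px / px2_per_mm2
    return {
        "total": total,
        "shared": shared,
        "independent": total - shared,
        "shared_to_total": shared / total,
    }


# Three overlapping hypothetical regions for /a/, /i/, and /u/.
regions = [rect(0, 0, 10, 10), rect(5, 0, 15, 10), rect(5, 5, 15, 15)]
summary = tongue_area_summary(regions, 5.0)
print(summary)
# {'total': 8.0, 'shared': 1.0, 'independent': 7.0, 'shared_to_total': 0.125}
```

The shared-to-total ratio is dimensionless, so it can be compared across speakers even when calibration differs between recordings.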
FIG. 7. A lesion of the right true vocal fold is shown. Its extent along the vibratory edge of the fold is calculated as a percentage of the
total length of the membranous portion of the fold.
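Because the measure in Fig. 7 is a ratio of two lengths taken from the same frame, it needs no calibration referent, which is what makes it usable for laryngeal images. A minimal sketch of the arithmetic follows; the function names and point coordinates are hypothetical, and both lines are assumed to have been traced as segmented paths along the fold edge.

```python
import math


def path_length(points):
    """Length in pixels of a segmented line through the given points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def percent_of_length(lesion_points, fold_points):
    """Lesion extent as a percentage of the traced fold length."""
    fold_length = path_length(fold_points)
    if fold_length == 0:
        raise ValueError("fold path must have non-zero length")
    return 100.0 * path_length(lesion_points) / fold_length


# Hypothetical traces: a 100-pixel fold edge and a 30-pixel lesion along it.
fold = [(0, 0), (30, 40), (60, 80)]
lesion = [(30, 40), (48, 64)]
print(round(percent_of_length(lesion, fold), 1))  # 30.0
```

Repeating the same measurement on later recordings gives the change in relative lesion extent over time, provided the fold is traced between the same anatomical endpoints each time.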