
The Mediterranean Journal of Otology


Speech Audiometry: The Development of Modern Greek Word Lists for Suprathreshold Word Recognition Testing

Nikolaos Trimmis, MS; Evangelos Papadeas, MD, PhD; Theodoros Papadas, MD, PhD; Stefanos Naxakis, MD, PhD; Panagiotis Papathanasopoulos, MD, PhD; Panos Goumas, MD, PhD

The existing suprathreshold word recognition tests in Greece are of limited value in audiometric speech evaluation that is performed in audiology clinics. One reason for this is that none of the lists approximates phonemic balance. Another reason for the limited value of the existing tests is related to word familiarity. In this study, a valid speech test for determining the word recognition score in Modern Greek-speaking populations has been developed. The test material consists of 4 lists, each of which contains 50 open-set bisyllabic words. Monosyllabic words were not included because few exist in the Modern Greek language. The lists are phonemically balanced and are of approximately equal difficulty. Preliminary results of the test in 10 native Greek-speaking subjects whose hearing was within the normal range suggest that the lists are generally equivalent for clinical purposes. This study can be used as a guideline for the development of word recognition score testing materials in other languages.

From the Department of Otolaryngology, University Hospital of Patras, Greece

Correspondence
Nikolaos Trimmis
Technological Educational Institute of Patras
Department of Logotherapy
M. Alexandrou 1-Koukouli
26334 Patras
Phone: +30 2610 369160
Fax: +30 2610-328481
E-mail:

Submitted: 10 February, 2006
Revised: 12 May, 2006
Accepted: 28 June, 2006

Mediterr J Otol 2006; 3: 117-126

Copyright 2005 © The Mediterranean Society of Otology and Audiology


A primary purpose of human auditory function is to enable speech. As a result, speech can be used to assess auditory function. Speech audiometry that is properly conducted with calibrated equipment and standardized recorded speech material can be a useful tool for audiologic testing. A typical speech audiometric evaluation usually includes 2 measures. The first, which is a threshold for the identification of speech material, is called the speech recognition threshold (SRT). The second is the maximum word recognition score (WRS, PBmax), which is determined at suprathreshold intensities under optimum conditions. Various speech materials are used to obtain each of these measurements with maximum accuracy. The testing materials differ among languages because of differences in phonetic, syntactic, and semantic rules(1).

Suprathreshold word recognition testing has traditionally been performed for several reasons: to assess the degree of social handicap, to aid in the diagnosis of an auditory impairment by providing useful site-of-lesion information, to monitor rehabilitation, and to compare hearing aid performance(2).

Lists for WRS testing in Greece were first developed several decades ago. Kogias (1961) developed 6 lists of 40 words each(3). Manolidis (1964) constructed 5 lists of 30 words each(4). Kastelis developed 10 lists of 10 words each (G. Kastelis, MD, unpublished data). Iliades and colleagues developed 24 lists, each of which consists of 10 words (some of which are common) that are based primarily on the words used by Manolidis(5). Words from the above 45 lists are used for both SRT and WRS testing.

An analysis of phonemic balance and familiarity of words in all of the above lists revealed the following findings: phonemic balance does not approximate that of everyday speech in any of the lists; familiarity (ie, ratings from 50 male and 50 female judges [average age, 40.65 years; SD, 12.87] on a 3-point scale of familiarity) was adequate (> 90%) in 28 of the 45 lists; and most lists consist of an inadequate number of items, which increases the variability of test scores. These factors may have a negative effect on the accuracy of the existing measuring tools used to determine speech intelligibility (eg, tools used to differentiate the effectiveness of hearing aids or to determine the degree of communication impairment). Therefore, the existing speech corpora are of limited value for assessing auditory function in populations who speak Modern Greek. In our collective opinion, the current practices used for the clinical assessment of word recognition are in need of improvement.

Given this rationale regarding the existing tests in Greece for use in assessing speech intelligibility in the audiology clinic, the goal of this work was to develop a test typical of the circumstances of actual use. The specific aims of this study were as follows: to establish the frequency of occurrence of phonemes in Modern spoken Greek, to determine the word familiarity of list items, to construct word lists for suprathreshold word recognition testing, and to perform a preliminary investigation of list equivalence.

An analysis of English and Greek speech test literature led to the adoption of the following criteria that were necessary for generating the word lists:

Phonemic Balance (PB)

Phonemic balance is also referred to as "phonetic balance." There is a difference between phonemic and phonetic elements. Language works by associating meanings with sounds. Each language has a finite set of sounds from which the forms of utterances in that language are constructed. Phonemic elements (phonemes) are abstract concepts related to semantics. Phonetic elements (allophones) are the articulatory manifestations of phonemes. Thus, different allophones correspond to a particular phoneme. The difference between 2 sounds is said to be phonemic (or contrastive) if it can be used to distinguish one meaning from another(6). Two allophones belong to 2 different phonemes if, in any word of a language, the substitution of one allophone for the other results in a change of the semantic content of the word. Differences in allophones, such as those found in different dialects of a language, do not necessarily impede communication. Therefore, it is the phonemic rather than the phonetic balance that is relevant to speech perception(7).

The number of phonemes in any language depends on the method of determining phoneme versus allophone. The simplest way to demonstrate that the difference between 2 sounds is contrastive (ie, that the 2 sounds belong to separate phonemes) in a particular language is to find minimal pairs. A minimal pair consists of 2 words, phrases, or sentences with distinct meanings whose forms differ by only 1 sound. A study that used minimal pairs to evaluate the most frequent sounds in Modern Greek revealed that 30 sounds were contrastive(8). Tables 1 and 2 show the phonemic inventory alphabet found suitable for use in our study.

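
The minimal-pair test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the small lexicon, its broad transcriptions, and the function names are hypothetical and are not taken from the study, and the check covers only substitutions between forms of equal length (it ignores stress and pairs that differ by an added or deleted sound).

```python
from itertools import combinations

# Toy lexicon: gloss -> phonemic transcription as a tuple of phoneme symbols.
# The transcriptions are illustrative assumptions, not items from the study.
LEXICON = {
    "pain": ("p", "o", "n", "o", "s"),
    "tone": ("t", "o", "n", "o", "s"),
    "place": ("t", "o", "p", "o", "s"),
    "law": ("n", "o", "m", "o", "s"),
}


def differing_positions(a, b):
    """Return the indices at which two phoneme sequences differ (compared pairwise)."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def is_minimal_pair(a, b):
    """True when the two forms have the same length and differ in exactly one phoneme."""
    return len(a) == len(b) and len(differing_positions(a, b)) == 1


def find_minimal_pairs(lexicon):
    """List (gloss_1, gloss_2, phoneme_1, phoneme_2) for every minimal pair in the lexicon."""
    pairs = []
    for (g1, p1), (g2, p2) in combinations(sorted(lexicon.items()), 2):
        if is_minimal_pair(p1, p2):
            i = differing_positions(p1, p2)[0]
            pairs.append((g1, g2, p1[i], p2[i]))
    return pairs


for g1, g2, ph1, ph2 in find_minimal_pairs(LEXICON):
    print(f"{g1} / {g2}: /{ph1}/ contrasts with /{ph2}/")
```

Each pair that the sketch reports is evidence that the two differing sounds are contrastive, which is the sense in which the text counts 30 contrastive sounds.
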
Table 1. Consonantal phonemes of Modern Greek

Moving from left to right correlates with how much toward the front or the back of the oral cavity the sounds are produced. The first symbol within a cell
denotes an unvoiced sound (eg, "t"), and the second symbol denotes the corresponding voiced sound (eg, "d"). Each row of the table represents a sound
quality (International Phonetic Association, 1999). Examples of Greek symbols are included in parentheses.

Table 2. Vowel phonemes of Modern Greek

Tongue position in the oral cavity

Oral cavity         Front      Center     Back
Closed              i (ι)                 u (ου)
Partially closed                          o (ο)
Partially open      ε (ε)
Open                           a (α)

From left to right, the tongue moves from the front of the oral cavity (near the teeth) to the back of the oral cavity. In
the top rows, the mouth is relatively closed, but in the bottom rows, the mouth is more open (International Phonetic
Association, 1999). Examples of Greek symbols are included in parentheses.


To establish phonemic balance of the test lists, a large pool of data for everyday speech was created. It was important that the material to be selected was spoken rather than written, because the differences between spoken and written language can be great. This everyday distribution was represented by a phonemic analysis of 102,934 words obtained from 100 television and radio shows from the national Hellenic broadcasting station. All shows belonged to one of the following general categories: medical or health, psychology or human relations, culture, politics, theater or movies, music, education, economics, books, or athletics. Ten shows for each category were selected and were recorded. Each show consisted of about 1000 words. An analysis of the recordings revealed the frequency of occurrence of the 30 phonemes (Table 3).

Different phonemes should appear in the test material with the same relative frequency as that in

Table 3. Frequency of occurrence of phonemes in Modern Greek and frequency of lists 1 through 4

IPA, International phonetic alphabet(9).
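
A comparison of the kind summarized in Table 3 can be sketched as follows: count phonemes in a transcribed reference sample, convert the counts to percentages, and measure how far a candidate list departs from them. The sketch is hypothetical throughout. The toy transcriptions and the function names are assumptions made for illustration, and the article does not describe the tooling that was actually used.

```python
from collections import Counter


def relative_frequencies(transcribed_words):
    """Percent share of each phoneme across a collection of transcribed words."""
    counts = Counter(ph for word in transcribed_words for ph in word)
    total = sum(counts.values())
    return {ph: 100.0 * n / total for ph, n in counts.items()}


def largest_deviation(reference, candidate):
    """Largest absolute gap (in percentage points) between two phoneme distributions."""
    phonemes = set(reference) | set(candidate)
    return max(abs(reference.get(ph, 0.0) - candidate.get(ph, 0.0)) for ph in phonemes)


# Toy data only: two "corpus" words and a one-word "list".
corpus_words = [("p", "o", "n", "o", "s"), ("t", "o", "n", "o", "s")]
list_words = [("t", "o", "p", "o", "s")]

corpus_freq = relative_frequencies(corpus_words)
list_freq = relative_frequencies(list_words)

for phoneme in sorted(corpus_freq):
    print(f"/{phoneme}/ corpus {corpus_freq[phoneme]:.1f}%  list {list_freq.get(phoneme, 0.0):.1f}%")
print(f"largest deviation: {largest_deviation(corpus_freq, list_freq):.1f} percentage points")
```

With real data, the reference distribution would come from the transcribed broadcast sample and the candidate distribution from one 50-word list; a list is better balanced when its largest deviation is smaller.
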


everyday speech. According to that rationale, if the listener is totally unable to perceive a particular phoneme that occurs infrequently in normal everyday speech, the handicap that he or she experiences is not as severe as it would have been had the phoneme been a more common one(7). Test lists are regarded as interchangeable if each has the same phonemic balance. Exact phonemic balance is not possible to achieve because each phoneme occurrence in a list of 211 to 213 phonemes changes the phonemic balance by about 0.47% (see Table 3).

Bisyllabic words

Tests for the measurement of maximum speech identification performance must consist of items with low redundancy; otherwise, the multiplicity of clues available to the listener can obscure some of the inability to differentiate speech sounds by their acoustic properties(7). Therefore, monosyllabic word lists are widely used. Egan (1948) showed a relationship between the number of sounds in a word and the ability to recognize that word(10). The more phonemes and the more acoustic redundancy that characterize a word, the more easily it is recognized. In our study, fulfillment of the criteria necessitated the use of bisyllables for the selection of list items, because an inadequate number of monosyllables exists in the Modern Greek language. However, all bisyllables selected were low in redundancy because we selected words with the minimum possible number of phonemes, thus keeping the number of phonemes as low as possible in each list. List 1 contains 211 phonemes, and lists 2 through 4 contain 213 phonemes each.

Familiarity of material

To control familiarity, the 10,000 most frequent Greek lemmas were selected from the Hellenic National Corpus (HNC), which is the largest body of written Greek texts available over the Internet. The HNC, which is based on the general language corpus developed by the Greek Institute of Language and Speech Processing, currently contains about 32,000,000 words of written texts from several media (books, periodicals, newspapers, etc), from various genres (articles, essays, literary works, reports, biographies, etc), and on various topics (economy, medicine, leisure, art, human sciences, etc)(11). Initially, for the construction of the test lists, 900 bisyllabic words were selected from this pool of data and from another 100 bisyllabic words (common words in everyday speech) that were not selected from any established source for familiarity. No database on the word familiarity of Greek words existed at the time of this writing.

To further improve the criterion of familiarity, only the most familiar words were included. The most familiar words were obtained from the ratings of 50 male and 50 female judges from 3 age groups (12-25 years, 26-50 years, and 51-75 years; average age, 38.8 years; SD, 20.18) on a 3-point scale of familiarity. Words with a negative image (eg, terms associated with disease or criminality) were excluded. All 200 words selected to be in the final 4 lists were rated as most familiar.

Variability of test scores

The variability of test scores depends on both the sample size (the number of scorable items in the test) and the level of performance (the percent correct). The smaller the test list, the larger the test-retest difference score needed to exceed the 95% confidence limits (binomial distribution model) for percentage scores. Confidence limits based on that relationship are used to determine whether 2 word recognition scores are significantly different from each other on an individual-patient basis(12, 13, 14). If the difference between 2 scores is within the applicable confidence limits, then that difference cannot be considered statistically significant.

The only method of reducing variability and therefore increasing reliability is to increase the number of items on a test. However, the duration of the test is also an important factor to consider. To comply with the previously listed factors, the maximum number of words in each list was 50. For each word correctly understood, the patient received a score of 2%.

Equal distribution of stress

Stress is a suprasegmental linguistic feature of speech. In many languages (including Greek), stress
indicates which information in an utterance is most important. Stress emphasizes the most important syllable in a word, marks syntactic contrasts (phrase endings, interrogation versus declaration), signals attitudes and feelings(15), and highlights differences in meaning. For example, the first syllable in the noun /jεros/, which means "old man" in Modern Greek, is likely to have a higher fundamental frequency and a greater duration and amplitude than that of the same syllable in the adjective /jεros/, which means "strong." In Modern Greek, stress is placed on one syllable in multisyllabic words. The stressed vowels and consonantal phonemes in Modern Greek are phonetically slightly longer than those that are unstressed, but that difference is not distinctive(16). A stressed phoneme will not cause the perception of a different phoneme and thus will not affect the phonemic balance. For example, the word /jεros/ can have, depending on the meaning, a stress that is placed on the first or the second syllable. In either case, the listener perceives the same 5 phonemes (/j/, /ε/, /r/, /o/, /s/).

It is important to remember that the more specific the items of the test lists, the more the test results reflect an accurate measurement of peripheral hearing. Linguistic competence substantially affects the results of tests that rely heavily on additional information in the acoustic signal, such as suprasegmental features (stress, intonation, duration), and such tests become a measure of both peripheral and central processes. As a result, we endeavored to distribute syllabic stress equally in each list. In every list, 25 words in which stress is placed on the first syllable were selected, as were 25 words in which stress is placed on the second syllable.

Phonemic dissimilarity

To ensure that every bisyllable selected in each list could not be easily confused with another bisyllable in the same list, there were no minimal pairs in each list with the stress on the same syllable. When the stress is on different syllables, there is a minimum difference of 1 phoneme.

Equal average difficulty

To examine list equivalency, the 4 lists were administered monaurally (to the right ear) at 10-dB increments (range, 10-40 dBHL) to 10 subjects whose hearing was within normal limits. All participants (average age, 23.1 years; SD, 1.73) were native speakers of Modern Greek and had no history or signs of a neurologic disorder. All had pure tone thresholds of ≤ 15 dB HL at all frequencies from 250 Hz to 8000 Hz. All testing was performed in a sound chamber that exceeded standards for the ambient noise level for audiometric rooms.

All words were recorded in a double-walled Industrial Acoustic Company booth by an adult male speaker with professional experience as a radio announcer. A professional unidirectional microphone, a FireWire Solo sound card interfaced to a PC, and digital signal processing software (Adobe Audition, version 1; Adobe Systems Incorporated, San Jose, CA) were used for all recording and editing tasks. All words were monitored for minimum required suprasegmental features during recording. A carrier phrase with a 1-second interval separating the phrase from the target word was recorded before each word. A 5-second interval separated the presentation of each word. The words were digitized at a sampling frequency of 44.1 kHz and 16-bit resolution. The editing of each word involved equalization of the root mean square (RMS) amplitude that was scanned with a window width of 50 ms. Specifically, each average RMS power value, which represents the average power of the entire waveform selection and is a good measure of the overall loudness, was measured and was then adjusted by an appropriate dB factor to achieve the equal average RMS power value. In this way, each stimulus word was brought to an equivalent overall loudness level. A 1000-Hz calibration tone of 60 seconds' duration was synthesized and scaled to the overall RMS level of the 200 test words.

The initial four 50-word lists were reordered and adjusted numerous times so that the final 4 lists (Table 4) better satisfied the criteria of phonemic balance,


Table 4. The final four Modern Greek word lists for use in suprathreshold word recognition testing.
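
Each list in Table 4 holds 50 scored words, which fixes both the 2% weight per word and the binomial variability discussed under "Variability of test scores." The sketch below illustrates that relationship with a plain normal approximation to the binomial distribution. It is an assumption-laden illustration, not the procedure or the tabulated limits of the cited binomial-model papers, and the function names are hypothetical.

```python
import math


def percent_per_word(n_items):
    """Weight of one scored word, in percentage points."""
    return 100.0 / n_items


def binomial_sd_percent(score_percent, n_items):
    """Standard deviation of a percent-correct score under a simple binomial model."""
    p = score_percent / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n_items)


def rough_critical_difference(score_percent, n_items, z=1.96):
    """Rough 95% critical difference between two scores obtained with equal list lengths.

    Assumes both scores share the same underlying level and uses a normal
    approximation, so the result is only indicative, especially near 0% and 100%.
    """
    return z * math.sqrt(2.0) * binomial_sd_percent(score_percent, n_items)


for n_items in (25, 50):
    print(f"{n_items} words: {percent_per_word(n_items):.0f}% per word")
    for score in (50, 90, 98):
        sd = binomial_sd_percent(score, n_items)
        diff = rough_critical_difference(score, n_items)
        print(f"  score {score}%: SD about {sd:.1f}%, rough critical difference about {diff:.1f}%")
```

Under this approximation the spread is widest for scores near 50% and shrinks toward the extremes, and halving the list from 50 to 25 words widens it by a factor of about 1.4.
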

familiarity, and phonemic dissimilarity. Each of the new word lists consisted of 50 open-set bisyllables. Word recognition scores were assessed with the recorded materials. Comparison of the word recognition curves for mean scores and standard deviations of the 4 lists revealed that the lists were essentially equivalent for clinical purposes (Table 5).

Monaural (right ear) performance functions showed performance on the 4 lists at 4 hearing levels ranging from 10 dBHL to 40 dBHL in 10-dBHL increments. Mean word scores increased with an average slope per decibel of 3.37% in the first list, 3.35% in the second list, 3.31% in the third list, and 3.36% in the fourth list in the rapidly increasing portion of the function between 10 and 30 dBHL (Figure).

The degree of hearing impairment inferred from the results of a pure-tone audiogram cannot depict the degree of disability in speech communication caused


Table 5. Mean percent of the monaural (right ear) correct scores and standard deviations for the first, second, third, and fourth lists

            10 dBHL          20 dBHL          30 dBHL          40 dBHL
List No.    Mean     SD      Mean     SD      Mean     SD      Mean      SD
List 1      31.2     3.16    79.80    3.19    98.60    1.65     99.80    0.63
List 2      31.8     3.82    77.40    2.99    98.80    1.69    100.00    0.00
List 3      32       3.27    78.80    2.35    98.20    1.99    100.00    0.00
List 4      32       2.74    78.60    3.53    99.20    1.69     99.80    0.63
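
The slopes quoted in the text can be checked against the means in Table 5. The sketch below assumes a simple two-point slope between the 10 and 30 dBHL means; that assumption reproduces the reported values, although the article does not state how the slopes were computed. The variable and function names are hypothetical.

```python
# Mean percent-correct scores from Table 5, keyed by presentation level in dBHL.
TABLE_5_MEANS = {
    "List 1": {10: 31.2, 20: 79.80, 30: 98.60, 40: 99.80},
    "List 2": {10: 31.8, 20: 77.40, 30: 98.80, 40: 100.00},
    "List 3": {10: 32.0, 20: 78.80, 30: 98.20, 40: 100.00},
    "List 4": {10: 32.0, 20: 78.60, 30: 99.20, 40: 99.80},
}


def slope_per_db(means, low_level=10, high_level=30):
    """Two-point slope (percent per dB) between two presentation levels."""
    return (means[high_level] - means[low_level]) / (high_level - low_level)


slopes = {name: slope_per_db(means) for name, means in TABLE_5_MEANS.items()}

for name, slope in slopes.items():
    print(f"{name}: {slope:.2f}% per dB between 10 and 30 dBHL")

spread = max(slopes.values()) - min(slopes.values())
print(f"largest difference between lists: {spread:.2f}% per dB")
```

The four slopes differ by less than 0.1% per dB, which is consistent with the statement that the lists are essentially equivalent for clinical purposes.
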


by hearing loss. Because difficulty in hearing and understanding speech evokes the greatest complaints from patients with hearing impairments, it is logical that tests of hearing function should be performed with speech stimuli. A large number of suprathreshold word recognition tests are available for clinical use according to the restrictions and variations of different languages. Such tests provide a wide choice of test parameters, such as the type of speech material (eg, monosyllabic or bisyllabic words, nonsense syllables, sentences), the test format (open set or closed set), and the item scored (words, syllables, phonemes, key words in sentences, subjective intelligibility)(17). The speech test developed in this study retains those features. Also, the performance functions used are generally consistent with those commonly encountered in the literature for traditional WRS test materials. The new word lists are intended for people older than 12 years. When compared with older word lists for the assessment of WRS, this test offers considerable improvements in phonemic balance, familiarity, and variability. In addition, it increases reliability by increasing each list size to 50 scorable items while retaining the fundamental characteristics of the most widely accepted WRS tests. In addition, the new word lists provide the foundation for development of specialized (degraded) speech tests for the assessment of auditory processing disorders. An assessment procedure of this sort reduces extrinsic redundancy (spoken language) by filtering and by time alteration of the speech signal such as compression or interruption. Several methods have been used to distort the speech signal for central auditory tests. Speech has been periodically filtered, compressed in time, interrupted, and masked.

Figure. Mean percent of monaural (right ear) correct performance as a function of the presentation level*.

Recently, Iliadou and colleagues (2006) developed 3 PB lists, each of which contains 50 bisyllabic words in Modern Greek. The phonemic balance was calculated according to a corpus of 1000 bisyllabic words extracted from written materials (the 1000 most frequent two-syllable words from the Computational Lexicon of Modern Greek)(18). In that study, lists 1, 2, and 3 contained 239, 242, and 245 phonemes, respectively, which are about 13% to 15% more phonemes per list than those in the lists developed in our study.

The clinician who performs WRS testing must remember that the variability of WRS increases as the number of test items decreases. On the other hand, a long testing time may negatively affect both the test validity and overall clinical use. The problem of deciding how many items each list should contain is essentially a trade-off between precision and test duration. Most audiologists prefer to test with half lists (25 words), with a weight of 4% per word, which reflects a lack of understanding of the implications of the Thornton and Raffin application of the binomial distribution. Those investigators have demonstrated the need for the frequent use of full lists (50 words) to ensure reliable test scores. The largest standard deviations of a binomial distribution for WRSs occur around 50%. The most reliable scores occur at the upper (near 100%) and lower (near zero percent) limits, and audiologists should have greater confidence in those scores after the presentation of half lists (25 words). Differences in scores obtained from different treatments, for example hearing amplification devices, need to be larger in the 50% region than for scores close to either extreme (100% or zero percent) before significance can be attached to them. Therefore, depending on the number of items used, it is risky for professionals to assume that an increase or decrease in a
given patient's WRS represents a real change in speech recognition ability. Similarly, professionals should be cautious when selecting and verifying amplification device fittings that are based on the results of a WRS. Moreover, statements such as "You understand 66% of speech" are oversimplified because they ignore important variables such as contextual cues, speech-reading abilities, and speaker intelligibility.

The results of this study indicate that the lists used are reliable and valid for suprathreshold word recognition score testing, a finding that holds considerable promise for successful clinical use. Further investigations with larger numbers of subjects (those whose hearing is within the normal range as well as individuals whose hearing is impaired) are required to determine the validity and reliability of those lists. Greater experience with the lists will show whether they vary in the degree of difficulty. Finally, the methods that we used to develop our lists could also be used as a guideline in the development of audiometric materials for WRS testing in other languages.

1. Carhart R. Basic principles of speech audiometry. Acta Otolaryngol 1951;40:62-71.
2. Silman S, Silverman CA. Auditory diagnosis: principles and applications. San Diego, Calif: Academic Press, Inc; 1991.
3. Kogias A. How to choose words for speech audiometry. Acad Med 196;25:265.
4. Manolidis L. Development and use of speech audiometry in the Greek language [dissertation]. Thessaloniki, Greece: Aristotle University; 1964.
5. Iliades T, Arvanitidis V, Giannakakis A & Magganaris T. The value of speech audiometry. Otorhinolaryngological Society of North Greece; 1978.
6. Philippaki-Warburton I. Introduction to linguistics. Athens, Greece: Nefeli Publishers; 1992.
7. Martin M, editor. Speech audiometry. 2nd ed. London, England: Whurr Publishers Ltd; 1997.
8. Trimmis N, Papadeas E, Papadas T. The frequency of phonemes and allophones in Modern Greek spoken language. Poster session presented at: Seventeenth International Symposium on Theoretical and Applied Linguistics; April 15-17, 2005; Thessaloniki, Greece.
9. International Phonetic Association. Handbook of the International Phonetic Association: a guide to the use of the international phonetic alphabet. Washington, DC: Cambridge University Press; 1999.
10. Egan JP. Articulation testing methods. Laryngoscope 1948; 58: 955-91.
11. Institute of Language and Speech Processing. The Hellenic National Corpus. Available at: Accessed October 8, 2004.
12. Raffin MJ, Schafer D. Application of a probability model based on the binomial distribution to speech-discrimination scores. J Speech Hear Res 1980; 23:570-5.
13. Raffin MJ, Thornton AR. Confidence levels for differences between speech-discrimination scores. A research note. J Speech Hear Res 1980;
14. Thornton AR, Raffin MJ. Speech-discrimination scores modeled as a binomial variable. J Speech Hear Res 1978; 21:507-18.
15. Borden GJ, Harris KS, Raphael LJ. Speech science primer: physiology, acoustics, and perception of speech. 3rd ed. Baltimore, MD: Williams & Wilkins; 1994.
16. Holton D, Mackridge P, Philippaki-Warburton I. Greek Grammar: A Comprehensive Grammar of the Modern Language. London: Routledge; 1997.
17. Gelfand SA. Optimizing the reliability of speech recognition scores. J Speech Lang Hear Res 1998; 41:1088-102.
18. Iliadou V, Fourakis M, Vakalos A, Hawks JW, Kaprinis G. Bi-syllabic, Modern Greek word lists for use in word recognition tests. Int J Audiol 2006; 45:74-82.

