
Journal of Phonetics 40 (2012) 491–508

Contents lists available at SciVerse ScienceDirect

Journal of Phonetics
journal homepage: www.elsevier.com/locate/phonetics

Native Catalan learners’ perception and production of English vowels


Lucrecia Rallo Fabra a,*, Joaquín Romero b
a Departament de Filologia Espanyola, Moderna i Llatina, Edifici Ramon Llull, Universitat de les Illes Balears, Ctra. Valldemossa, km. 7.5, 07122 Palma de Mallorca, Spain
b Universitat Rovira i Virgili, Tarragona, Spain

Article info

Article history:
Received 23 November 2007
Received in revised form 19 December 2011
Accepted 7 January 2012
Available online 23 February 2012

Abstract

This paper reports two experiments on nonnative vowel perception and production. In Experiment 1, three groups of Catalan learners varying in English proficiency were tested on their ability to discriminate seven Catalan–English (C–E) and four English–English (E–E) vowel contrasts. The vowel contrasts were natural speech tokens obtained from native Catalan and native American English speakers. On average, listeners distinguished the C–E /i–ɪ/ contrast relatively well, and they could partially distinguish /i–i/, /u–u/, and /a–ɑ/, but they had great difficulty with the /a–ʌ/, /a–æ/ and /ɛ–ɛ/ contrasts. As for the E–E pairs, the learners could discriminate the speech sounds in the /i–ɪ/ and /u–ʊ/ pairs, suggesting that learners may have established new phonetic categories for /ɪ/ and /ʊ/. In Experiment 2, a subgroup of the Catalan learners and a control group of native English speakers produced words containing one of the English vowels /i/, /ɪ/, /ɛ/, /æ/, /ɑ/, /ʌ/, /ʊ/, and /u/. Vowel accuracy was assessed by means of acoustic measurements and by native listener judgments. The acoustic measurements revealed that, in spectral terms, learners produced vowels that were less peripheral than the native English (NE) versions, although there was a tendency for vowel expansion as a function of language proficiency. Vowel duration in the tense–lax vowel pairs also progressed toward more nativelike values in the productions of the more proficient learners. Finally, the NE listener judgments showed that most learners produced the vowels /ɪ/, /æ/, /ʊ/, and /u/ intelligibly but with significantly lower goodness ratings than did the NE speakers.
© 2012 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +34 971173019; fax: +34 971173473.
E-mail addresses: lucrecia.rallo@uib.es (L. Rallo Fabra), joaquin.romero@urv.cat (J. Romero).

0095-4470/$ - see front matter © 2012 Elsevier Ltd. All rights reserved.
doi:10.1016/j.wocn.2012.01.001

1. Introduction

One of the major goals of L2 learners is to become fluent speakers and pronounce the sounds of the target language without a foreign accent. However, not all learners succeed in this endeavour. Various factors account for accented perception and production of L2 speech. Flege, Schirru, and MacKay (2003) classify these factors into three types: maturational factors, amount and nature of L2 input, and interaction of L1–L2 sound systems. A growing number of speech perception studies have suggested that the mechanisms that are operative for L1 acquisition are less effective as starting age of exposure to the L2 increases (Johnson & Newport, 1989; Pallier, Bosch, & Sebastián-Gallés, 1997; Sanders, Yamada, & Neville, 1999). As a consequence, learners who were first exposed to the target language in childhood tend to be more successful than learners whose exposure started later in life. Flege, Yeni-Komshian, and Liu (1999) reported stronger foreign accents in the pronunciation of English sentences by native Korean speakers as age of arrival in the United States increased. Tsukada et al. (2005) found that native Korean children discriminated English vowel contrasts better than adults. These children also outperformed adults in a production task, in which acoustic distances between the target vowels produced by the children fell within the value range of those produced by their native English-speaking peers.

Failure to produce the target sounds accurately can also be attributed to the quality of the input learners have received. Learners who have acquired the target language in a naturalistic setting, and those who have learned the L2 through formal classroom instruction represent two distinct populations. In this respect, work by Flege et al. with an Italian community living in Ottawa for an average of 20 years (Flege, Frieda, & Nozawa, 1997; Flege, MacKay, & Meador, 1999; Piske, Flege, MacKay, & Meador, 2002) showed that some late learners achieve “nativelike” pronunciation despite their late exposure to English.

The instructional setting does not always provide the best conditions for speech learning, partly because many foreign language teachers themselves do not succeed in producing the L2 sounds accurately. As a consequence, it seems unlikely that students who are regularly exposed to foreign-accented speech will ever achieve nativelike pronunciation. Some empirical

studies on acquisition of L2 sounds by learners with short length of residence in the host country, or with non-naturalistic exposure to the target language, provide some evidence of the limitations of “nativelikeness” in L2 speech. For instance, Flege (1987) investigated the production of the French vowels /u/ and /y/ by four groups of speakers: a group of native French speakers living in the United States and three groups of American English speakers varying in L2 experience: a group with no French experience, a group of American graduate students who had spent one year in France, and a group of more advanced speakers of French who taught this language at university level. The two groups of English speakers with French experience produced French /y/ with F2 values that fell within the range of values measured in the native French productions. However, the three groups of learners produced more “English-like” /u/s with F2 values that were higher than those of the native speakers of French.

Bohn and Flege (1997) examined the production of the English vowels /ɛ/ and /æ/ by two groups of native German-speaking adults differing in English language experience as measured by length of residence in the United States (7.5 years vs. 0.5 years). The authors provided acoustic data that suggested that /æ/ is a new vowel for German speakers, since its location in a Bark-difference space did not overlap with the acoustic space taken by the neighboring German vowels /ɛ/, /ɛː/, and /a/. The productions of the experienced Germans and the native English speakers showed two distinct vowel categories,¹ one for /æ/ and one for /ɛ/. However, the productions by the inexperienced Germans overlapped, suggesting that the speakers had merged the two categories.

¹ The notion of phonetic category is widely discussed in Section 1.2.

Relevant to the present study are the experiments reported in Bongaerts (1999) and Birdsong (2007). In the first case, nativelike pronunciation was investigated in two groups of Dutch learners of English in a formal setting (Bongaerts, Planken, & Schils, 1995; Bongaerts, Van Summeren, Planken, & Schils, 1997). One of these groups was defined as “highly successful” by university EFL experts. Learners were tested on English pronunciation by reading aloud a series of sentences that contained sounds that were judged to be potentially difficult for Dutch learners. It was found that the pronunciation of some of these learners was consistently judged by native English listeners to be nativelike, or authentic. Similar findings were reported in a subsequent experiment with Dutch learners of French (Bongaerts, 1999), indicating that nativelikeness could also be generalizable to other L1–L2 pairings. The author concluded that nativelike pronunciation in some of these successful learners was possible thanks to the coexistence of three factors: high motivation, massive exposure to the L2, and intensive training in L2 perception and production skills.

In the Birdsong (2007) study, anglophone late learners of French were tested on their ability to read aloud French words and sentences in a nativelike fashion. At the segmental level, acoustic measures of vowel duration and VOT revealed that some learners produced vowels and consonants with values that fell within the range of the native French controls. Likewise, global pronunciation ratings by French judges showed that performance by some of these learners met nativelike standards. Again, it was concluded that those “exceptional learners” had some traits in common with the learners in the Bongaerts studies, i.e., they were highly motivated and had received phonetic training.

The Bongaerts and Birdsong studies provided some indication that nativelikeness is not completely out of reach for some late L2 learners, who learned the target language mostly in a classroom setting just like the participants in the present study.

1.1. Perception and production of English vowels by Catalan learners

Previous studies investigating acquisition of L2 speech sounds have suggested that the inability to produce non-native sounds accurately may be related to an inability to perceive these sounds in a nativelike fashion (Flege, Bohn, & Jang, 1997; Flege, MacKay et al., 1999; Rochet, 1995). Based on these findings, Cebrian (2006) investigated the use of temporal and spectral cues in the perception of the English vowels /i/, /eɪ/, /ɪ/, and /ɛ/ by two groups of Catalan listeners who differed in starting AOL and learning setting, and a control group of native Canadian listeners. The more experienced Catalans had been exposed to English later in life (age 20–45), but exposure had occurred mostly in the host country, since they had been living in Canada for an average of 25 years. The less experienced group was exposed to different varieties of English, since they were university English majors living in Catalonia. In a three-alternative, forced-choice task, listeners had to identify the vowel stimuli from a two-dimensional continuum in which spectral and durational values were varied in linear steps. Results indicated that native English listeners and Catalan listeners showed different trends in the use of perceptual cues to distinguish between the three English vowels. The Catalan groups relied more on vowel duration than the native English group, who made a greater use of spectral cues. No significant differences were found between the two groups of Catalan listeners, suggesting that experience did not influence their vowel identification. These findings indicated that lengthy residence in the host country might not have played a crucial role in how non-native listeners categorize the vowels of a non-native language.

The production of the English vowels /i/, /ɪ/, /eɪ/, and /ɛ/ was also examined in Cebrian (2007) by means of acoustic analysis and identification judgments by native listeners. The same group of Catalan speakers majoring in English produced the target vowels in nonsense words. Results showed that the F1 and F2 frequencies of /i/, /ɪ/, /eɪ/, and /ɛ/ fell within the range of the values obtained for native English speakers, but overlap in the acoustic areas of the /i–ɪ/ contrast suggested that the Catalan speakers had difficulties producing these two vowels differently. Further, the difference in duration between the tense and lax vowels of the /i–ɪ/ and /eɪ–ɛ/ pairs produced by the Catalan speakers indicated a nativelike use of this acoustic parameter. The judgments of the Catalan speaker productions yielded similar scores for /i/ and /ɪ/, leading to the assumption that learners might succeed in producing accurate instances of /ɪ/, a “new” sound, at the cost of occasionally producing /i/ inaccurately, which was perceptually assimilated to its Catalan counterpart.

Fullana Rivera and MacKay (2002) investigated the production of English words by a large group of Catalan EFL learners varying in onset age of learning (ages 8–18) and amount of exposure to English in terms of total hours of instruction. Native English judges rated the learner productions for foreign accent. Overall, the productions were rated as showing a moderate accent. A significant effect of experience was found, i.e., experienced learners received significantly lower ratings than less experienced learners, but comparison with a group of native English controls revealed poor performance of the experienced learners relative to the native English speakers. In a subsequent study (Fullana Rivera & MacKay, 2003), the authors examined production of the English vowels /i/ and /ɪ/ by means of intelligibility scores. It was found that identification scores were higher for /i/ than for /ɪ/. No significant differences were found among groups who varied in age of learning and experience with the target language, but there was a trend toward more targetlike /ɪ/ productions as exposure to English increased.

1.2. Theoretical frameworks of L2 speech acquisition: SLM and PAM

Most of the studies reviewed above were motivated by two theoretical models that aim to account for the acquisition

of non-native speech sounds, specifically, perceptual learning and accurate production of L2 sounds. The Speech Learning Model (SLM) (Flege, 1995) makes the assumption that the ability to perceive and produce L2 sounds is not lost in late adolescence or adulthood. Late learners can perceive and produce the target sounds with varying degrees of success, depending on the nature and conditions of L2 exposure and use. Crucial to the SLM is the relationship between the L1 and L2 sound systems, which has a strong impact on how the L2 sounds are perceived and produced. One of the postulates of the SLM is that the “L1 and L2 sound systems exist in a common phonological space” (p. 238) and hence influence each other. As argued in Flege et al. (2003), this interaction involves two mechanisms, namely, category assimilation and category dissimilation, with the understanding that an L2 sound assimilates to an L1 sound when it is perceived as an instance of the L1 sound, despite audible differences between the two sounds.

The notion of perceptual assimilation/dissimilation leads to another key issue of the SLM, namely, category formation. Flege hypothesizes that a new phonetic category for an L2 sound can be established if bilinguals can auditorily differentiate the L2 sound from the closest L1 sound and from neighboring L2 sounds. The notion of phonetic category implies the perceptual ability to (1) “identify a wide variety of phones as being the same, despite auditorily detectable differences between them along dimensions that are not phonetically relevant” and (2) “distinguish the multiple exemplars of a category from realizations of other categories, even in the face of noncriterial commonalities” (Flege, 1995, p. 244). The model predicts that category formation and accurate L2 speech production are closely related, and that many L2 production errors are caused by inaccurate discrimination of L1–L2 and L2–L2 sounds.

The Perceptual Assimilation Model (PAM) (Best, 1994) shares with the SLM the postulate that discrimination of non-native sounds can be predicted from the perceptual relatedness of non-native categories to native categories. In the first versions of the model, Best’s predictions of discrimination were applicable to naïve listeners/monolinguals who were not at all familiar with the target language. In a recent extension of the model (PAM-L2), Best and Tyler (2007) discuss similarities and differences between both frameworks and extend the predictions of successful speech perception to L2 learners.

The PAM-L2 distinguishes four distinct assimilation patterns of L2 phonological categories to L1 phonological categories:

(1) Type 1 (“two-category assimilation”): The two sounds of an L2 contrast are perceived as equivalent to two L1 categories. These contrasts are predicted to be discriminated with relative ease.
(2) Type 2 (“category-goodness assimilation”): The two members of the L2 contrast are heard as instances of the same L1 category, but one of the members is perceived as being more similar than the other; that is, they vary in category goodness. The PAM-L2 predicts moderately good discrimination for these contrasts and considers the possibility that learners establish a new phonological category for the more “dissimilar” sound.
(3) Type 3 (“single-category assimilation”): The two sounds of the L2 pair are assimilated to the same L1 category with equal degrees of goodness. The PAM-L2 predicts poor discrimination for this contrast unless the sound contrast is lexically productive, which in turn would facilitate perceptual learning.
(4) Type 4 (“uncategorized-uncategorized”): The two sounds of a given contrast are not clearly mapped onto a particular L1 category, but rather as in-between several L1 categories. Successful discrimination will depend on the distance in phonological space between the two sounds of the contrast. The more distant they are, the easier to discriminate they will be.

1.3. Cross-linguistic similarity of English vowels to Catalan vowels

As suggested by Strange (2007), similarity relationships between L1 and L2 sounds can be established in two ways, namely, acoustic/articulatory descriptions of sound inventories and direct assessment by means of perceptual tests. A comparison of English and Catalan vowels based on acoustic data from published studies is problematic for various reasons: studies exploring vowel variability in various American English dialects do not always provide acoustic data (Clopper, Pisoni, & de Jong, 2005), while others provide acoustic data but limit the scope of their study to certain regional varieties (Hagiwara, 1997; Hillenbrand, Getty, Clark, & Wheeler, 1995; Strange et al., 2007).

Given these conditions, we will offer a simple comparison of the Catalan and American English vowel inventories just for informative purposes. The sources of this comparison are the studies by Peterson and Barney (1952) for American English (E) and Recasens and Espinosa (2006) for Catalan (C). The former was selected because it provides acoustic data from various regional dialects of the United States, thus being more representative of General American than other recent studies that provide data from a single regional variety. The data from the Peterson and Barney study reported below are from quite a broad sample of 33 male speakers from different parts of the United States. The vowels were elicited in the /h_d/ context. As for Catalan, the vowels were produced by five male speakers in four different consonantal contexts (labial, dentoalveolar, palatal and dark /l/ and the trill /r/). To obtain the closest conditions for the crosslinguistic acoustic comparison, we used the average formant frequencies in a dentoalveolar context only. Further, in an attempt to neutralize the effects of possible acoustic differences due to variation in vocal tract length, the Catalan spectral values were normalized to those of General American English using the same method as previous studies on crosslinguistic comparison of vowels (Lee, Guion, & Harada, 2006; Yang, 1996).²

As shown in Fig. 1, Eastern Catalan has a smaller vowel inventory than General American English, with three front vowels, /i/, /e/, and /ɛ/, three back vowels, /ɔ/, /o/, and /u/, and two central vowels, /ə/³ and /a/. On the basis of the acoustic values in a dentoalveolar context obtained by Recasens and Espinosa (2006), normalized to those in Peterson and Barney (1952) for adult male speakers, it seems fair to say that C /i/ (F1 = 330; F2 = 2106) is intermediate between E /i/ (F1 = 270; F2 = 2290) and E /ɪ/ (F1 = 390; F2 = 1990). The formant frequencies of E /ɪ/ are close to those of C /e/ (F1 = 453; F2 = 1834). E /ɛ/ (F1 = 530; F2 = 1840) is acoustically close to C /ɛ/ (F1 = 582; F2 = 1781). E /æ/ (F1 = 660; F2 = 1720) occupies an intermediate position in the acoustic space between C /ɛ/ and C /a/. The low central and back English vowels E /ʌ/ (F1 = 640; F2 = 1190) and E /ɑ/ (F1 = 730; F2 = 1090) do not seem to have an equivalent in Catalan because C /a/ (F1 = 716; F2 = 1420) is a central vowel and, as such, it shows higher F2 frequencies than either E /ʌ/ or E /ɑ/. Regarding the back vowels, C /ɔ/ (F1 = 602; F2 = 1102) is slightly more centralized than E /ɔ/ (F1 = 570; F2 = 840). Finally, C /u/ (F1 = 389; F2 = 1002) is intermediate between tense E /u/ (F1 = 300; F2 = 870) and lax E /ʊ/ (F1 = 440; F2 = 1020).

² A scale factor between the General American English (GAE) and the Catalan (C) spectral data was calculated using the average F3 values of the open vowels by k = F3_GAE / F3_C.
³ /ə/ only occurs in unstressed position.
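The acoustic comparison in Section 1.3 can be made concrete with a small sketch. The code below is an illustration only, not the authors' procedure: it applies a scale factor of the form k = F3_GAE / F3_C and computes plain Euclidean distances in Hz between (F1, F2) points. The F1/F2 values are those quoted in the text; the F3 values in the usage section are hypothetical, because the source does not report them, and the function names are invented for this example. A distance in raw Hz is a crude similarity measure, so its rankings need not match the qualitative descriptions given in the text.

```python
import math

# F1/F2 values in Hz as quoted in Section 1.3 (adult male speakers).
# English: Peterson and Barney (1952). Catalan: Recasens and Espinosa (2006),
# already normalized by the authors.
ENGLISH = {
    "i": (270, 2290), "ɪ": (390, 1990), "ɛ": (530, 1840), "æ": (660, 1720),
    "ʌ": (640, 1190), "ɑ": (730, 1090), "ɔ": (570, 840),
    "ʊ": (440, 1020), "u": (300, 870),
}
CATALAN = {
    "i": (330, 2106), "e": (453, 1834), "ɛ": (582, 1781),
    "a": (716, 1420), "ɔ": (602, 1102), "u": (389, 1002),
}


def scale_factor(f3_reference, f3_target):
    """k = mean F3 of the reference data / mean F3 of the data to be rescaled."""
    mean_reference = sum(f3_reference) / len(f3_reference)
    mean_target = sum(f3_target) / len(f3_target)
    return mean_reference / mean_target


def normalize(formants, k):
    """Multiply every formant value by k and round to the nearest Hz."""
    return tuple(round(f * k) for f in formants)


def distance(v1, v2):
    """Euclidean distance between two (F1, F2) points, in Hz."""
    return math.hypot(v1[0] - v2[0], v1[1] - v2[1])


if __name__ == "__main__":
    # Hypothetical mean F3 values, for illustration only (not from the paper).
    k = scale_factor(f3_reference=[2400, 2500], f3_target=[2450, 2550])
    print("k =", round(k, 3), "->", normalize((500, 1500), k))

    # Catalan /u/ relative to tense E /u/ and lax E /ʊ/.
    for label in ("u", "ʊ"):
        d = distance(CATALAN["u"], ENGLISH[label])
        print(f"C /u/ to E /{label}/: {d:.1f} Hz")
```

A perceptually motivated scale (for example Bark) could be substituted for raw Hz inside `distance` without changing the rest of the sketch.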
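The four PAM-L2 assimilation types described in Section 1.2 can be read as a simple decision rule. The sketch below is a hedged illustration, not part of PAM-L2 or of the authors' method: it uses the percentage of modal L1 responses as a stand-in for category goodness, and both numeric thresholds (`goodness_gap` and `categorized_floor`) are hypothetical values chosen for the example. The percentages in the usage section are the ones listed in Table 2.

```python
from dataclasses import dataclass


@dataclass
class Assimilation:
    l1_category: str  # modal L1 response category for the L2 vowel
    percent: float    # percentage of responses given to that category


def pam_l2_pattern(first, second, goodness_gap=30.0, categorized_floor=30.0):
    """Classify an L2 contrast into one of the four types listed in the text.

    Both thresholds are hypothetical; the source states no numeric criteria.
    """
    # Neither vowel is mapped consistently onto any single L1 category.
    if first.percent < categorized_floor and second.percent < categorized_floor:
        return "type 4: uncategorized-uncategorized"
    # The two vowels are mapped onto different L1 categories.
    if first.l1_category != second.l1_category:
        return "type 1: two-category assimilation"
    # Same L1 category, but one vowel is a clearly better fit than the other.
    if abs(first.percent - second.percent) >= goodness_gap:
        return "type 2: category-goodness assimilation"
    # Same L1 category with roughly equal fit.
    return "type 3: single-category assimilation"


if __name__ == "__main__":
    contrasts = {
        "i-ɪ": (Assimilation("i", 96.0), Assimilation("i", 39.5)),
        "u-ʊ": (Assimilation("u", 81.1), Assimilation("u", 34.9)),
        "ɑ-ʌ": (Assimilation("a", 64.0), Assimilation("a", 47.0)),
        "ɛ-æ": (Assimilation("ɛ", 60.0), Assimilation("a", 58.0)),
    }
    for name, (first, second) in contrasts.items():
        print(name, "->", pam_l2_pattern(first, second))
```

With these illustrative thresholds the rule happens to reproduce the patterns assigned in Table 2, but other threshold choices would classify the same data differently.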

Fig. 1. Vowel charts for Eastern Catalan (top) and General American English (bottom). The sources of the spectral data are Peterson and Barney (1952) and Recasens and Espinosa (2006). The English vowels were produced by 33 male speakers in an alveolar context; the Catalan vowels were produced by 5 male speakers in a dento-alveolar context.

The perceptual similarity of English vowels to Catalan vowels was first examined in Cebrian (2006). In this study, two groups of Catalan listeners with and without English language experience identified the English front vowels /i/, /ɪ/, /eɪ/ and /ɛ/ as instances of the four Catalan vowels /i/, /e/, /ei/ and /ɛ/. The inexperienced listeners could be considered “naïve” in that they did not speak, and had not been exposed to, English. In contrast, the experienced group had regular exposure, having lived in Canada for an average of 24 years. The stimuli, produced by two male Canadian English speakers, consisted of /hVb/ syllables, from which the initial and final consonants had been edited out. The English vowels /i/ and /ɛ/ were the most clearly assimilated to their Catalan counterparts /i/ and /ɛ/, but /ɪ/ was not clearly mapped onto a single Catalan category. No significant differences were found between the inexperienced and experienced listeners, thus calling into question the role of experience found in previous studies.

The perceptual relatedness of English vowels to Catalan vowels was further investigated in Rallo Fabra (2005). Two groups of listeners, a group of “naïve” Eastern Catalan listeners with no experience with English (inexperienced Catalans) and a group of university students who were majoring in English (experienced Catalans), identified the English vowels /i/, /ɪ/, /ɛ/, /æ/, /ʌ/, /ɑ/, /ɔ/, /ʊ/, and /u/ and the Catalan vowels /i/, /e/, /ɛ/, /a/, /ɔ/, /o/, and /u/ in terms of the seven Catalan vowel categories. Both groups of listeners were familiar with the IPA phonetic symbols. Thus, the IPA symbols for the Catalan vowels were used as response categories along with an extra category labeled “nc”, which stood for “non-Catalan.” The stimuli were presented over headphones, and listeners were instructed to click on the vowel symbol that best fit the vowel they heard.

The stimuli used in the experiment were obtained from the recordings of six native Eastern Catalan speakers (three male and three female) and six native English speakers (one male and five female) with a mean age of 21 years. The American speakers were from different parts of the United States (Illinois, Idaho, Missouri and Pennsylvania). They had studied Spanish as a foreign language in the United States for an average of five years and had been living in Barcelona for about six months.

The assimilation patterns of the target English vowels to the native Catalan vowels are tallied in Table 1. Differences between the experienced and inexperienced groups were found. These differences involved two aspects: percentage of times that the “non-Catalan” label was chosen to classify the English vowels and the choice of the modal L1 category to classify a given L2 vowel. Overall, the experienced group used the “non-Catalan” label more often than the inexperienced group, possibly indicating that in some cases many listeners heard some English vowels as “different” from any Catalan vowel. More importantly, however, differences in the perceptual similarity between English and Catalan vowels were found for /ɪ/ and /æ/. The inexperienced Catalans perceived E /ɪ/ as similar to C /e/. In contrast, the experienced Catalans showed a preference for C /i/ over C /e/. As for /æ/, experienced Catalans heard this vowel as both C /a/ and “non-Catalan” whereas inexperienced Catalans identified it as C /a/ and C /ɛ/, suggesting that /æ/ had no clear equivalent in Catalan.

When compared to Cebrian (2006), the results of perceptual similarity between English and Catalan vowels obtained in Rallo Fabra (2005) reveal an important discrepancy regarding the classification of English /ɪ/. We found differences in how experience with English affected the perceptual mapping of this vowel. The inexperienced listeners perceived E /ɪ/ as similar to C /e/, replicating assimilation patterns found by Cebrian (2006) for both experienced and inexperienced listeners. In contrast, the experienced learners in our study classified this vowel both as C /i/ and also as a “non-Catalan” vowel. Likewise, another recent study (Cebrian, 2009) showed that E /ɪ/ was most frequently assimilated to C /i/, both by the Catalan inexperienced and experienced listeners. Various factors may have accounted for the disparity of these results. We speculate that listeners’ language experience, the different consonantal contexts in which the vowels were elicited, the dialects of English used for the stimuli and the response categories may all have interacted here. The vowel stimuli in Cebrian (2006) were produced by Canadian English speakers in a bilabial context, whereas Rallo Fabra (2005) used vowel stimuli produced by American speakers from different dialectal areas in an alveolar context. There is evidence in the literature that both language experience and consonantal context affect cross-language speech perception. For instance, Bohn and Steinlen (2003) found that Danish speakers with English language experience classified English /ɪ/ as instances of Danish /e/ in glottal and alveolar contexts but as /i/ in velar context. Levy (2009a) reported that assimilation patterns of French vowels to English vowels were more consistent in a bilabial context than in an alveolar context; further, experienced learners were found to be more consistent than inexperienced listeners in their mapping of French vowels onto English vowels. In an AXB discrimination experiment, Levy and Strange (2008) noted that inexperienced English listeners confused French /i–y/ more often in an alveolar context than in a bilabial context, but that experienced listeners showed no context effect. Finally, the response categories used in Rallo Fabra (2005) included an option labeled “non-Catalan”, which listeners were instructed to select if the vowel they heard was not similar to any Catalan vowel. This response option was not included in Cebrian (2006). On the basis

Table 1
Percent identification of American English vowels by 15 naïve Catalan listeners (inexperienced) and 37 Catalan EFL learners (experienced). Bold figures indicate the modal identification vowel. Responses selected 3% or less are omitted.
Adapted from Rallo Fabra (2005).

Vowel stimuli   Response category (inexperienced)

                a  ɛ  e  i  o  ɔ  u  nc

i    99.2
ɪ    19.2  62.5  15.8  2.5
ɛ    11.7  50.8  25  10.8
æ    45  35  5.8  11.7
ʌ    57.5  5  5.8  11.7  16.7
ɑ    78.3  9.2  10
ɔ    55.8  3.3  23.3  15.8
ʊ    5  11.7  15.8  6.7  30.8  30
u    95

Vowel stimuli   Response category (experienced)

                a  ɛ  e  i  o  ɔ  u  nc

i    95.9
ɪ    12.2  11.5  39.5  35.8
ɛ    13.9  60.5  7.8  16.9
æ    57.8  6.8  34.5
ʌ    47  11.1  37.5
ɑ    64.2  14.5  20.3
ɔ    40.5  3.5  33.1  22.3
ʊ    9.5  34.9  47.5
u    81.1  18.2

of the evidence reported in all the above studies, it seems that assimilation patterns of L2 vowels to L1 vowels are highly sensitive to experimental conditions such as language experience of the listener groups, consonantal context and dialect variation of the vowel stimuli.

1.4. Perception/production relationship

The assumption that perception and production do align in adult L2 acquisition has been one of the most controversial issues in L2 research. Various authors agree that the relation between these two abilities is a complex one (Llisterri, 1995; Wode, 1999). For instance, Bohn and Flege (1997) argue that “in the early stages of L2 speech learning, perception may lead production… Inexperienced L2 learners may differentiate a new vowel contrast perceptually, without differentiating this contrast in production” (p. 69). They add that the two abilities do not “progress in parallel” and that further experience with the target language will have positive effects on production, but perception will be highly resistant to malleability. Similarly, Strange (1995) shares the view that experienced and inexperienced learners represent two distinct groups as far as the perception/production alignment is concerned. She claims that production errors of inexperienced learners can be predicted from perceptual errors, but both skills might be uncorrelated for experienced speakers.

Flege (1999) challenges the Critical Period Hypothesis tenet that segmental production and perception no longer align after the passing of a critical period for language acquisition. He argues that “perception and production may not be brought into perfect alignment, as in L1 speech acquisition, but modest correlations will exist between L2 segmental production and perception for highly experienced speakers of an L2” (p. 1273). Flege’s claims are grounded on two empirical studies examining the relationship between perception and production data. In one of these studies, Flege, Bohn et al. (1997) investigated the perception and production of four English vowels, /i/, /ɪ/, /ɛ/, and /æ/ in 90 speakers from different L1 backgrounds (German, Spanish, Mandarin and Korean). Regression analyses of the perception and production data showed that non-native speaker accuracy in producing the English vowels was determined by accuracy in perceiving the same vowels. Similarly, Flege, MacKay et al. (1999) found significant correlations between the intelligibility scores and the vowel discrimination scores, supporting the hypothesis that the ability to produce accurate instances of the target vowels was related to a parallel ability to discriminate these vowels accurately.

The relationship between the perception and the production of the English vowels /i/, /ɪ/, /eɪ/, and /ɛ/ by Catalan learners was also explored in Cebrian (2002). Although the r values for the correlations were statistically significant, they were quite modest in size. In most cases (six out of eight), r values ranged from 0.33 to 0.44 for a sample size of 29. Only in two cases did correlations reach higher values of 0.61 or 0.73. Cebrian attributed the low correlations to methodological factors, as had been pointed out earlier by Flege (1999).⁴

⁴ Correlations might not be the most appropriate method to compare perception and production data.

1.5. The present study

The two experiments reported in this paper were designed with four major goals in mind. The aim of the perception experiments was to test whether some of the predictions and hypotheses of the PAM and the SLM could be applied to EFL learners who were learning the target language in a non-naturalistic setting, specifically, three groups of Catalan EFL learners varying in English proficiency. As reviewed earlier in this section, the PAM-L2 claims that discrimination of L2 contrasts can be predicted from the perceptual relatedness of L2 vowels to L1 vowels. In light of the PAM framework, we examined the discrimination of four English vowel contrasts on the basis of the crosslinguistic similarity results found in a previous study
(Rallo Fabra, 2005). Predictions of potential difficulty in discriminating English vowel pairs by Catalan learners of English were made on the basis of the four types of assimilation patterns proposed in the PAM-L2: two-category assimilation, category-goodness assimilation, single-category assimilation and uncategorized.

A key tenet of Flege’s SLM is that L2 learners may eventually establish phonetic categories for certain L2 sounds if they can discern the phonetic differences between the L2 sound and the closest L1 sound, as well as the differences between the L2 sound and other neighboring sounds in the acoustic space. The model also hypothesizes that category formation enhances accurate production in the L2. Thus, our second goal was to test the hypothesis that some L2 learners could establish new perceptual categories for certain English vowels. This was done by means of a perceptual test that examined discrimination of seven Catalan–English vowel contrasts and four English vowel contrasts.

The third goal was to investigate whether the Catalan learners could produce English vowels accurately and to what extent language proficiency would influence their accuracy in the production of these vowels. Vowel accuracy was assessed in two ways: acoustically, and by native English listener judgments. It was hypothesized that the proficient learners would produce vowels that were acoustically closer to those produced by native English speakers than the vowels produced by mid-proficient and low-proficient learners.

Finally, the fourth goal involved the relationship between perceptual and production abilities. The SLM is grounded on the hypothesis that production errors of L2 sounds have a perceptual basis. It should be noted here that these predictions are based on findings from populations that had extensive exposure to the L2 in a naturalistic setting. We are not certain to what extent they can be applied to foreign language acquisition in a formal classroom setting.

2. Experiment 1: Categorial discrimination test

As reported earlier, Eastern Catalan has a smaller vowel inventory than General American English. Further, many of the English vowels do not have an equivalent in Catalan or have different degrees of similarity to Catalan vowels. It can then be hypothesized that certain pairings of L1–L2 and L2–L2 vowels will be potentially difficult to discriminate by Catalan learners. In this experiment, the same groups of Catalan EFL learners from a previous study examining crosslinguistic similarity of vowels (Rallo Fabra, 2005) were tested on their ability to discriminate Catalan–English and English–English vowel contrasts. Two research questions derived from the PAM-L2 and the SLM were addressed:

(1) Can ease of discrimination be predicted from the perceptual assimilation of L2 sounds to L1 sounds?
The PAM-L2 calls for research testing predictions of success or failure to discriminate non-native contrasts by listeners with and without L2 experience. A recent study (Levy, 2009b) tested the PAM-L2’s predictions by comparing discrimination of French vowels by English listeners with differing language experience to perceptual assimilation data by the same listeners (Levy, 2009a). Following Levy (2009b), ease or difficulty in non-native vowel discrimination by experienced learners was predicted on the basis of results from a crosslinguistic similarity study by the same learners (Rallo Fabra, 2005).
Assimilation patterns were determined qualitatively considering the percentage of trials in which an English vowel was assimilated to a particular Catalan vowel (Table 2). For instance, the tense vowels in the /i–ɪ/ and /u–ʊ/ contrasts were quite consistently heard as C /i/ and C /u/, respectively, as opposed to the lax vowels, whose percentage of assimilation to C /i/ and C /u/ was quite low. We judged that these two pairs fell within the “category goodness” type, since one member of each pair was more consistently classified as /i/ or /u/ than the other. Based on the PAM-L2, we predicted moderately good discrimination for these two contrasts. The contrast /ɑ–ʌ/ was a case of “single category” assimilation since both /ɑ/ and /ʌ/ were perceived as instances of Catalan /a/. As predicted in the PAM-L2, Catalan learners are not as likely to discriminate this contrast successfully. Finally, /ɛ/ and /æ/ were heard as similar to C /ɛ/ and C /a/, respectively, and, as such, it was predicted that the /ɛ–æ/ contrast would be discriminated with relative ease.

Table 2
Assimilation patterns of English vowels to Catalan vowels based on PAM-L2 (Best & Tyler, 2007).

L2 vowel contrast   % assimilation to L1                  Perceptual assimilation pattern (PAM-L2)
/i–ɪ/               /i/-/i/ (96%);   /ɪ/-/i/ (39.5%)      “Category-goodness” assimilation (type 2)
/u–ʊ/               /u/-/u/ (81.1%); /ʊ/-/u/ (34.9%)      “Category-goodness” assimilation (type 2)
/ɑ–ʌ/               /ɑ/-/a/ (64%);   /ʌ/-/a/ (47%)        “Single-category” assimilation (type 3)
/ɛ–æ/               /ɛ/-/ɛ/ (60%);   /æ/-/a/ (58%)        “Two-category” assimilation (type 1)

(2) Can late L2 learners establish phonetic categories for some L2 sounds?
One of the predictions of the SLM is that adults can establish long-term memory representations or “phonetic categories” for some L2 sounds that are perceptually distant from the closest L1 sound. The model also hypothesizes that the chances of category formation are directly related to the perceived phonetic distance between the target sound and its L1 equivalent. On these grounds, we predicted that more proficient learners might establish new perceptual categories for English /ɪ/ and /ʊ/, as these were perceived by some listeners to be deviant from any Catalan vowel. The establishment of a new phonetic category for an L2 sound will thus be determined by learners’ ability to successfully discriminate contrasts made of an L2 sound and the closest L1 sound.

2.1. Method

2.1.1. Stimuli
The same stimuli as in Rallo Fabra (2005) were used. They consisted of sVt syllables that contained one of the seven Catalan vowels, /i/, /e/, /ɛ/, /a/, /ɔ/, /o/, and /u/, or one of the English target vowels, /i/, /ɪ/, /ɛ/, /æ/, /ʌ/, /ɑ/, /ʊ/, and /u/. The words were recorded on a CP-300 Marantz tape-recorder and subsequently digitized with a waveform editor at 22.00 kHz sampling rate and 16-bit resolution. The sVt words were then edited to eliminate the final stop and normalized for peak intensity. This was done to prevent listeners from basing their perceptual judgments on the crosslinguistic difference between the dentoalveolar released production of C /t/ and the alveolar unreleased production of
E /t/ in final position, but it had the limitation of creating open syllables for the lax vowels /ɪ/, /ɛ/, /æ/, /ʌ/, and /ʊ/.

The stimuli were combined in triads to examine the discrimination of eleven vowel contrasts in two separate perception tests. In the first test, seven Catalan–English (C–E) contrasts, each pairing a Catalan vowel with an English vowel, were tested: /a–æ/, /a–ɑ/, /ɛ–ɛ/, /a–ʌ/, /i–i/, /i–ɪ/, and /u–u/. The C–E contrasts were paired following the outcomes of the perceptual similarity study (Rallo Fabra, 2005). For instance, the English vowels /ɑ/, /ʌ/, and /æ/ were mapped onto C /a/, so three of the C–E contrasts consisted of one of these three vowels and the closest Catalan vowel /a/. English /ɔ/ was not included in any of the contrasts as many speakers of American English do not distinguish between /ɑ/ and /ɔ/. The C–E contrast /u–ʊ/ was not included, either, because /ʊ/ was mostly heard as non-Catalan, so we judged that the /u–u/ contrast was probably more difficult to discriminate than /u–ʊ/. In the second test, four English–English (E–E) contrasts, /ɑ–ʌ/, /ɛ–æ/, /i–ɪ/ and /u–ʊ/, were tested. The /ɑ–ʌ/ and /æ–ɛ/ contrasts were chosen over other possible combinations such as /ɑ–æ/ or /æ–ʌ/ simply because the former were potentially more difficult to discriminate than the latter on the basis of the acoustic distance between the two vowels within the pair.⁵

The three stimuli in all trials were produced by different speakers, which forced the listener to ignore within-category phonetic differences. Each vowel contrast was tested in 16 trials: eight “different trials” and eight “catch trials”. “Different trials” always had an odd item, that is, one of the stimuli belonged to a different vowel category and it could occur in any of the three positions. “Catch trials” were made of three physically different stimuli of the same vowel category, so there was no odd item. The total number of trials in each test was 112 for the C–E contrasts and 64 for the E–E contrasts.

2.1.2. Listeners
The same group of Catalan learners as in Rallo Fabra (2005)⁶ were tested individually in a quiet room. All participants were undergraduate students majoring in English (mean age = 22). At the time of testing they had just finished a two-term American English Phonology course at their home university in Tarragona (Spain), a course that included an in-depth study of the English sound system, phonetic transcription, and intensive pronunciation practice. Listeners had benefited from a similar quality and amount of exposure to English in terms of formal instruction and length of stay in an English-speaking country, but they exhibited different levels of proficiency in English. For this reason, they were divided into three groups according to their language proficiency: proficient (n = 10), mid-proficient (n = 12), and low-proficient (n = 12). The criterion for inclusion in one of the three groups was the grade they obtained in the pronunciation component of the American English phonology course, which consisted of a series of self-recorded pronunciation exercises that students had to provide throughout the term. These exercises tackled a wide variety of aspects of American English pronunciation, including vowels, consonants, stress, rhythm and intonation, with a special emphasis on those areas that are known to induce transfer problems for native speakers of Spanish and/or Catalan. The instructor, an experienced Catalan phonetician who had lived in the United States for an extended period of time and spoke English on a daily basis, was the only one responsible for the assessment of the recordings.⁷ The mean grade obtained by each group, on a 0–10 scale, was 8.72 (range 7.9–9.8) for the proficient learners, 6.66 (range 6–7.5) for the mid-proficient learners, and 4.25 (range 2.5–5.5) for the low-proficient learners.

2.1.3. Procedure
Listener discrimination of the target vowel contrasts was assessed using a categorial discrimination test (CDT) as proposed in Flege (2003). The CDT modifies the classical AXB categorial discrimination task to prevent listeners from discriminating between instances of the same category. The inclusion of “catch trials,” which consist of realizations of a single category spoken by three different speakers, “encourages subjects to respond only to phonetically relevant differences, not merely to auditorily detectable ones” (p. 23). This test has been used by Flege and colleagues in various perceptual studies to assess category formation in L2 speech learning (Flege & MacKay, 2004; Flege, MacKay et al., 1999; Guion, Flege, Akahane-Yamada, & Pruitt, 2000; Tsukada et al., 2005).

The test was administered to individual listeners in a single session divided into two separate blocks. In the first block, participants listened to the seven C–E contrasts while in the second block they listened to the four E–E contrasts. There was a 5-min break between the two blocks of the test. Four buttons appeared on the computer screen: three numbered 1, 2 and 3, and one labelled “no.” Listeners heard three stimuli in each trial and they were instructed to click on the button that corresponded to the “odd item out” if they heard one vowel that was different from the other two. This odd item could be any of the three stimuli or none of them. If they heard all three stimuli as being “the same,” they were to click the “no” button. The inter-trial interval was set at 2.8 s, and the inter-stimulus interval was set at 1.3 s.

2.2. Results

To test whether discrimination varied as a function of group (proficient, mid-proficient, low-proficient), contrast (/a–æ/, /a–ɑ/, /ɛ–ɛ/, /a–ʌ/, /i–i/, /i–ɪ/, and /u–u/ for the C–E contrasts and /ɑ–ʌ/, /ɛ–æ/, /i–ɪ/, and /u–ʊ/ for the E–E contrasts), or both group and contrast, two separate repeated measures two-way ANOVAs were performed, one for the C–E contrasts and one for the E–E contrasts. In both cases, the dependent variable was the A′ values (Snodgrass, Levy-Berger, & Haydon, 1985),⁸ a measure of sensitivity scored by each subject in discriminating each of the vowel contrasts. The independent variables or factors were vowel contrast (within-subject) and group (between-subject).

2.2.1. Catalan–English contrasts
The average A′ values for the seven C–E contrasts tested are shown in Table 3 and plotted in Fig. 2a and b. The error-bar plots reveal that listeners showed no sensitivity in discriminating pairs involving English /ɛ/, /æ/, /ʌ/, or /ɑ/ in contrast with Catalan /ɛ/ and /a/ and that they showed moderately good sensitivity to the English vowels /i/, /ɪ/, and /u/ in contrast with Catalan /i/ and /u/. The length of the error bars suggests high variability across individuals in all three groups. A general inspection of the graphs reveals practically no between-group differences in performance for the contrasts /a–æ/, /a–ɑ/, /ɛ–ɛ/ and /a–ʌ/, but moderate between-group differences for the contrasts /i–i/, /i–ɪ/ and /u–u/. Fig. 2b also suggests that discrimination involving the vowels /i/, /ɪ/, and /u/ was somewhat more accurate as a function of language proficiency, that is, overall, the proficient and mid-proficient groups showed moderately higher sensitivity than the low-proficient group.

Table 3
Mean A′ values for the seven C–E contrasts tested in experiment 1. The F-values are for separate one-way ANOVAs examining the contrast × group interaction. (*) indicates significance at the 0.01 level. Alpha level for the pairwise comparisons was set at 0.05.

Group        1       2       3       4       5       6       7       F                 Tukey’s HSD
             /a–ʌ/   /a–ɑ/   /ɛ–ɛ/   /a–æ/   /i–i/   /i–ɪ/   /u–u/
Proficient   0.425   0.452   0.361   0.274   0.638   0.802   0.650   (6,74) = 17.6*    6, 7 > 1, 2, 3, 4; 5 > 1, 3, 4
Mid-prof.    0.352   0.534   0.452   0.302   0.537   0.799   0.625   (6,75) = 11.1*    6 > 1, 2, 3, 4, 5; 7 > 1, 4; 5 > 4
Low-prof.    0.511   0.581   0.382   0.410   0.526   0.684   0.457   (6,62) = 1.77

Fig. 2. A′ scores obtained by the three groups of Catalan learners differing in English proficiency, for the seven C–E contrasts tested.

The A′ scores for the Catalan–English contrasts were submitted to a repeated measures ANOVA with contrast as the within-subject factor and group as the between-subjects factor. The main effect of contrast was significant (F(6, 21) = 20.3; p < 0.01). However, the ANOVA yielded no statistically significant effect of group (F(2, 26) = 0.008; p = 0.99), but a significant two-way interaction (F(12, 42) = 3.09; p < 0.01). Additional one-way ANOVAs (see Table 3) revealed that the main effect of contrast was significant for the proficient (F(6, 74) = 17.6; p < 0.01) and mid-proficient learners (F(6, 75) = 11.1; p < 0.01) but not for the low-proficient learners (F(6, 62) = 1.77; p = 0.12).

Pairwise comparisons with Tukey’s HSD tests showed that the proficient learners made fewer errors in the discrimination of the /i–ɪ/ and /u–u/ contrasts (A′ values between 0.6 and 0.8) than in discrimination of the /a–ʌ/, /a–ɑ/, /ɛ–ɛ/ and /a–æ/ contrasts (A′ values between 0.2 and 0.4). The /i–i/ contrast was also significantly more accurately discriminated (A′ = 0.63) than /a–ʌ/, /ɛ–ɛ/, or /a–æ/. There were no significant differences among the three “more accurately discriminated” contrasts or among the four “poorly discriminated” contrasts.

The mid-proficient learners showed significantly higher sensitivity to the /i–ɪ/ C–E contrast (A′ = 0.79) relative to the /a–ʌ/, /a–ɑ/, /ɛ–ɛ/, /a–æ/, and /i–i/ contrasts, whose A′ values ranged from 0.35 to 0.53, indicating a lack of sensitivity to these five contrasts. They also showed higher sensitivity to the /u–u/ C–E contrast in comparison with /a–ʌ/ and /a–æ/. Finally, the A′ scores for /i–i/ and /a–æ/ also differed significantly from one another.

2.2.2. English–English contrasts
The average A′ scores for the E–E contrasts are presented in Table 4 and plotted in Fig. 3. The error bars in Fig. 3 indicate high variability across groups and contrasts, especially among the low-proficient learners. A repeated measures ANOVA with contrast (4) as a within-subject factor and group (3) as a between-subjects factor yielded a significant simple effect of contrast (F(3, 29) = 4.85; p < 0.01), but no significant effect of group (F(2, 31) = 0.87; p = 0.42) and no significant interaction (F(6, 58) = 1.37; p = 0.24). Pairwise comparisons based on the estimated marginal means revealed that /i–ɪ/ and /u–ʊ/ were more accurately discriminated (average A′ = 0.59 and 0.58, respectively) than /ɑ–ʌ/ and /ɛ–æ/ (average A′ = 0.45 and 0.42, respectively), p < 0.05. However, there was no significant difference within the pair of “more accurately discriminated” contrasts or within the pair of “poorly discriminated” contrasts (p > 0.05).

⁵ Evidence from various studies that explore vowel variability across dialects in American English (e.g., Clopper et al., 2005, pp. 1668–1669) shows that, in speech production, /æ/ often merges with /ɛ/ but not with /ɑ/ or /ʌ/. Similarly, /ɑ/ merges with /ʌ/ but not with /æ/. Likewise, other studies that examine VISC (vowel-inherent spectral change) patterns (see Hillenbrand, 2011, for a review) show that formant frequencies measured from the beginning and end of vowels /æ/ and /ɛ/ share similar VISC patterns (from high to low frequencies). This is true of Canadian English (Nearey & Assmann, 1986) and Midwest American English (Hillenbrand et al., 1995). The vowels /ɑ/ and /ʌ/ also share similar VISC patterns (from low to high frequencies) which are different from those of /æ/ and /ɛ/ (high to low).

⁶ The number of participants in the perceptual assimilation study (Rallo Fabra, 2005) was 37. Three of these participants did not take the categorial discrimination test.

⁷ Inter-rater reliability analysis could not be performed because the recordings were returned to the students after they finished the course. It was not possible to present these recordings to other raters for assessment and thus strengthen the participants’ group division made on the basis of one rater only. Instead, we performed intra-rater reliability analysis based on the pronunciation grades obtained by 417 students who had taken the English Phonology course in five successive academic years (the two academic years in which the present study was conducted, the previous two years and the following year). Following Bruton, Conway, and Holgate (2000) we measured this one rater’s reliability by means of a one-way ANOVA examining the single effect of time. The ANOVA yielded no significant effect of academic year on the students’ pronunciation grades (F(4, 411) = 1.161; p = 0.328), indicating the rater’s consistency in assessing students’ pronunciation over time.

⁸ The function used to calculate the A′ scores depended on the proportion of hits and false alarms for each contrast. If the proportion of hits (H) equaled or exceeded the proportion of false alarms (FAl), then A′ = 0.5 + ((H − FAl) × (1 + H − FAl)) / ((4 × H) × (1 − FAl)). However, if FAl exceeded H, then A′ = 0.5 − ((FAl − H) × (1 + FAl − H)) / ((4 × FAl) × (1 − H)). A′ scores of 0.5 or below indicated a lack of phonetic sensitivity; values of 0.6 to 1.00 indicated that listeners were sensitive to the contrasts.
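As an illustration only, the A′ computation described in footnote 8 can be sketched in Python as follows. The function names `a_prime` and `a_prime_from_counts` are hypothetical, and the scoring convention is an assumption made for this sketch rather than a detail stated in the text: a hit is taken to be a correctly identified odd item in a “different” trial, and a false alarm is taken to be any odd-item response given on a “catch” trial.

```python
def a_prime(hit_rate: float, fa_rate: float) -> float:
    """Return the A' sensitivity index from a proportion of hits (H)
    and a proportion of false alarms (FAl), following footnote 8."""
    if not (0.0 <= hit_rate <= 1.0 and 0.0 <= fa_rate <= 1.0):
        raise ValueError("rates must lie between 0 and 1")
    h, fa = hit_rate, fa_rate
    if h == fa:
        # The formula reduces to 0.5 (chance) here; returning it directly
        # also avoids a 0/0 division when both rates are 0 or both are 1.
        return 0.5
    if h > fa:
        return 0.5 + ((h - fa) * (1 + h - fa)) / ((4 * h) * (1 - fa))
    # False alarms exceed hits: mirror-image formula, below chance.
    return 0.5 - ((fa - h) * (1 + fa - h)) / ((4 * fa) * (1 - h))


def a_prime_from_counts(hits: int, n_different: int,
                        false_alarms: int, n_catch: int) -> float:
    """Convenience wrapper (hypothetical): convert raw counts for one
    contrast into proportions before computing A'."""
    return a_prime(hits / n_different, false_alarms / n_catch)


if __name__ == "__main__":
    # One listener, one contrast: 8 "different" trials and 8 "catch" trials.
    print(round(a_prime_from_counts(6, 8, 2, 8), 3))
```

Under these assumptions, a listener who detects the odd item in 6 of the 8 different trials and responds with an odd item on 2 of the 8 catch trials obtains A′ ≈ 0.83, which falls in the range interpreted above as sensitivity to the contrast.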

Table 4
Mean A′ values for the four E–E contrasts tested in the CDT. The F-values are from a repeated measures ANOVA examining the main effects of contrast and group and the contrast × group interaction. (*) indicates significance at the 0.01 level. Alpha level for the pairwise comparisons was set at 0.05.

Group       1 /ɑ–ʌ/   2 /ɛ–æ/   3 /i–ɪ/   4 /u–ʊ/   F effect of contrast   F effect of group   F group × contrast interaction   Pairwise comparisons (effect of contrast)
Prof.       0.559     0.452     0.618     0.618     (3,29) = 4.85*         (2,31) = 0.87       (6,58) = 1.37                    3, 4 > 1, 2
Mid-prof.   0.470     0.410     0.548     0.555     (3,29) = 4.85*         (2,31) = 0.87       (6,58) = 1.37                    3, 4 > 1, 2
Low-prof.   0.333     0.449     0.608     0.587     (3,29) = 4.85*         (2,31) = 0.87       (6,58) = 1.37                    3, 4 > 1, 2

Fig. 3. A′ scores obtained by the three groups of Catalan learners differing in English proficiency, for the four E–E contrasts.

Overall discrimination of the E–E and C–E vowel contrasts was considerably poorer compared to the discrimination scores found in previous studies with L2 learners from different L1 backgrounds (Flege & MacKay, 2004; Flege, MacKay et al., 1999; Guion et al., 2000). It is likely that the higher sensitivity shown by listeners in the previous literature is related to amount of exposure to the target language. The participants in our study had limited exposure to English and hardly ever used the target language outside the classroom, quite a different scenario from the Italian speakers in two of the studies cited earlier, who had lived in Canada for many years and, consequently, had long-term exposure to the L2.

The A′ scores for both the E–E and the C–E contrasts were found to vary considerably. Individuals in each of the three groups showed varying degrees of sensitivity to the same contrast. This variability also increased as a function of group. Proficiency in English influenced discrimination of the C–E contrasts, but it did not influence discrimination of the E–E contrasts. The significant contrast by group interaction found for the C–E contrasts indicated that the proficient and mid-proficient learners could discriminate two contrasts relatively well (/i–ɪ/ and /u–u/) and that the proficient learners also showed modest sensitivity to the /i–i/ contrast. However, the low-proficient group discriminated all C–E contrasts quite poorly. As for the four E–E contrasts, neither the main effect of group nor the contrast by group interaction was significant. This suggests that the three groups of learners discriminated these contrasts in a similar fashion and that they all showed little sensitivity to the English vowel pairs.

2.3. Discussion

The first research question addressed in this experiment was whether ease of discrimination could be predicted from the perceptual relatedness of the target sounds to the existing native categories, as suggested by the PAM-L2. This model predicts poor discrimination when two L2 sounds assimilate to one L1 category (“single category” type). As predicted, overall discrimination of the E–E contrasts was poor for /ʌ–ɑ/ because both vowels were assimilated to a single Catalan category, i.e., /a/. The PAM-L2 further predicts moderate to good discrimination in cases when two L2 sounds are mapped onto one L1 sound but differ in “category goodness.” As predicted, learners showed more sensitivity to the /u–ʊ/ and /i–ɪ/ contrasts because in both cases the vowels were mapped onto a single Catalan vowel, /u/ and /i/, respectively, but one instance of each pair was more consistently assimilated to its Catalan counterpart than the other. Finally, the /ɛ–æ/ pair fails to meet the PAM-L2’s predictions: as a case of “two-category assimilation”, learners should have discriminated the vowels with relative ease, but this was not the case. The English vowels /ɛ/ and /æ/ were mostly heard as instances of C /ɛ/ and C /a/, respectively. However, Catalan learners frequently identified E /ɛ/ in terms of C /a/ and E /æ/ was heard as C /ɛ/, albeit in a small number of the trials. Further, listeners classified both vowels as “non-Catalan”, suggesting that the perceptual distance between the two vowels of the pair is very small, thus causing confusability among non-native listeners. Further, two more English vowels, /ʌ/ and /ɑ/, were also perceived as close to C /a/, a pattern of L2–L1 assimilation (three L2 vowels assimilated to the same L1 vowel) that is not accounted for by the PAM-L2, and which might have contributed to the difficulty in discriminating the /ɛ–æ/ contrast.

The second research question was concerned with one of the hypotheses of the SLM (Flege, 1995), which states that adult learners will eventually establish phonetic categories for some L2 sounds if they can discern the phonetic differences between the L2 sound and the closest L1 sound. It could be argued that testing vowel discrimination using stimuli that were elicited in an alveolar context only is insufficient to support or reject this hypothesis because it is uncertain whether the perceptual patterns found so far will extend to other consonantal contexts. However, the fact that the highest A′ scores in the CDT were obtained for the /i–ɪ/ and /u–ʊ/ pairs seems to indicate that learners could distinguish between the two vowels of each pair. In addition, learners could also discriminate between C /i/ and E /ɪ/, but only the proficient learners could modestly differentiate between C /i/ and E /i/. On the basis of the evidence provided so far, it seems reasonable to assume that the mid-proficient and proficient learners could have established new phonetic categories for E /ɪ/ and E /ʊ/, at least in an alveolar context, but it is unlikely that they could have established new phonetic categories for E /i/ and E /u/, which were perceptually merged with their Catalan counterparts.

The SLM also posits that a learner’s L1 and L2 phones exist in a common phonological space and interact with one another. The Catalan learners could not perceive the differences between the Catalan and English vowels in most contrasts. The learners showed high sensitivity to only one C–E contrast, i.e., C /i/–E /ɪ/. They showed marginal sensitivity to three C–E contrasts (C /a/–E /ɑ/, C /i/–E /i/, and C /u/–E /u/) and they showed no sensitivity to the other three vowel pairs, C /a/–E /æ/, C /a/–E /ʌ/, and C /ɛ/–E /ɛ/. These results suggest that in the learner’s internal representations,
E /ɑ/, E /ʌ/, and E /æ/ share the same perceptual space as C /a/; further, E /i/, E /ɛ/, and E /u/ are not separate categories from their Catalan counterparts C /i/, C /ɛ/, and C /u/, respectively. In other words, learners probably do not switch between two separate perceptual systems; instead, they reorganize their existing perceptual vowel space to fit both the L1 and the L2 vowels.

Language proficiency did not seem to influence learners’ discrimination abilities. All three groups showed similar patterns of perceptual sensitivity to the target vowels. Overall, A′ scores were poor in comparison with previous studies (Flege, MacKay et al., 1999; Guion et al., 2000). We could argue that this is due to the fact that there were no differences in starting age or amount of exposure to the target language among the three groups of learners. In all groups, exposure to English did not occur prior to ten or eleven years of age, and it was often restricted to formal instruction by both native and non-native speakers. In contrast, the participants in the studies mentioned above had a considerable amount of exposure to the target language, so this probably contributed to the development of a higher sensitivity for discriminating differences between L1 and L2 sounds. Taken together, our findings suggest that formal language instruction may produce only small perceptual changes. Consequently, malleability of the speech perception system seems to be severely limited when exposure to the L2 occurs in adolescence and, particularly, in non-naturalistic contexts.

3. Experiment 2: Vowel production

This experiment examined production of the English vowels by three groups of Catalan learners of English varying in proficiency in the target language, and a group of native American English speakers who served as controls. The following research questions were addressed:

(3) Do Catalan learners produce the English vowels authentically and to what extent does production accuracy vary as a function of proficiency in English?
Prior work with Catalan interlanguage provides limited and contradictory evidence on English vowel production by Catalan learners. For instance, Cebrian (2007) reported that experienced Catalans produced /i/, /ɪ/, /eɪ/, and /ɛ/ with spectral values that fall within the range of native English speakers, but the /i–ɪ/ pair showed overlap in the acoustic space. Both vowels were identified as intended by native listeners at similar accuracy levels. In contrast, a study with less experienced Catalan EFL learners (Fullana Rivera and MacKay, 2003) found that /i/ received higher intelligibility scores than /ɪ/. To our knowledge, no other published studies have investigated production of other vowels by Catalan learners. Consequently, in light of the evidence provided so far, we predicted that the Catalan learners in the present study would produce some of the target vowels accurately. Specifically, the lax vowels /ɪ/ and /ʊ/ were more likely to receive higher ratings since they were discriminated with relative ease from their tense counterparts /i/ and /u/, and also from their closest Catalan vowels /i/ and /u/. Further, we predicted that English proficiency would influence levels of accuracy, with better performance for the proficient group and poorer performance among the mid-proficient and low-proficient learners.

(4) Are perception and production data correlated?
The SLM claims that accurate production in the L2 can only be guaranteed if learners develop perceptual targets that monitor the sensorimotor learning of L2 sounds (Flege, 1995). It has been suggested that the relationship between these two abilities is a complex one and that they do not progress in parallel (Llisterri, 1995; Wode, 1999). Other authors (Bohn & Flege, 1997; Strange, 1995) argue that in the early stages of L2 acquisition perception leads production, but as learners gain experience with the target language, speech production improves while perceptual abilities are highly resistant to change. We are uncertain to what extent these claims can be extended to EFL learning, but assuming that they can, we can predict that low-proficient learners are more likely to exhibit a correlation of these two abilities than mid-proficient or proficient learners.

3.1. Method

3.1.1. Participants
The same group of Catalan learners who participated in the perception experiment produced the target words in a different testing session. Because female speakers outnumbered male speakers, only the speech productions of the 27 female speakers were used in this experiment. As with the perception test, the Catalan learners were divided into three groups according to their language proficiency: proficient (n = 10), mid-proficient (n = 9), and low-proficient (n = 8). The five native English speakers (NE) were visiting students at the University of Barcelona (mean age = 21). They came from four different regional areas in the United States (Mid-Atlantic, North, South and West), but none of them had a strong regional accent as judged by the second author. There was a strong reason for the geographical diversity in the control group. We reasoned that the learning conditions for the Catalan learners in terms of nature and amount of exposure to English were not ideal. Unlike L2 learners in a naturalistic setting, who are typically exposed to a single English dialect on a regular basis, the Catalan learners’ exposure was restricted to the instructional setting, and was often highly diversified in terms of the English variety spoken by the lecturers. Given this L2 multidialectal setting, it was unlikely that learners had a single English dialect as their target.

3.1.2. Word elicitation
Each talker produced a corpus of 33 words containing tokens of each of the eight English vowel categories tested, namely, /i/, /ɪ/, /ɛ/, /æ/, /ɑ/, /ʌ/, /ʊ/ and /u/ (see Table 5). The words were selected from The Oxford Acoustic Database (Pickering & Rosner, 1993), which provides word lists for various languages classified by sounds. Monosyllabic words in the context of voiceless stops or fricatives were chosen when possible to facilitate sound segmentation. In order to test the learners’ ability to produce accurate instances of the intended vowels, the target words were elicited in a variety of consonant environments. A task in which subjects are required to produce the target sounds in a single consonantal context would probably be too easy for subjects who had received some phonetic training that involved listening and repeating minimal word pairs.

Table 5
Speech materials used in the production experiment.

/i/     /ɪ/     /ɛ/     /æ/     /ɑ/     /ʌ/     /ʊ/     /u/
heap    hip     bed     cap     rob     rub     hood    boot
eve     fish    dead    dad     sob     shut    hook    moon
each    itch    head    had     knob    nun     full    fool
eel     ill     men     man     cop     cup     look
                less    lass

Participants were recorded individually in a sound-proof booth using a Marantz tape recorder, Model CP-300 and a Shure (model

SM58) microphone. The task was a single-word elicitation in which the experimenter visually presented the randomized target words, with no carrier phrases, using flashcards. Speakers were instructed to read the words that appeared orthographically on the flashcards and to produce them using a flat intonation. Four extra words, which were not included in the reported data, were added at the beginning of the randomized list to familiarize speakers with the task. Speakers were allowed to self-correct, that is, on the rare occasion that a speaker read the word incorrectly, she could repeat it once.

3.1.3. Accuracy measurement
Accuracy of the speech productions by the Catalan learners and the native English group was measured in two ways: by acoustic measurements and by native listener judgments in a forced-choice vowel identification experiment with goodness ratings. A total of 1056 words (32 speakers × 8 target vowels × 4 or 5 words) were digitized with Kay Elemetrics CSL 5500 at a 10000 Hz sampling rate. The spectral measurements were extracted automatically using the autocorrelation method of the linear predictive coding (LPC) analysis option. The pitch detection frame was set at 5 ms and the zero-crossing limit at 50 crossings. The onset and offset of each vowel segment were identified by visual inspection of wide-band spectrograms and time domain waveforms. The average F1 and F2 values from the five central frames were taken from the steady-state portion of the vowel segment at 5 ms steps. Duration measurements for the tense/lax vowel pairs /i–ɪ/ and /u–ʊ/ were obtained manually by visual inspection of waveform and time-synchronized wideband spectrographic displays using Praat (Boersma & Weenink, 2008). For CVC words, vowel onsets were defined as the onset of voicing following the frication noise for fricatives, the release burst for /b/ and as the onset of low intensity formants for /m/. Offset for both CVC and VC words was defined as the beginning of closure-interval/frication noise when the following consonant was a stop or a fricative and as a low intensity formant if a nasal followed. A total of 352 time measurements were obtained (11 words × 32 speakers).9

9 The words eel, ill, full and fool were not included in the analysis because of the on-glide that often appears between the vowel and the [l] in these sequences. Of these 32 speakers, 27 were Catalan learners and five were native English speakers.

A subset of 16 words was further assessed for accuracy by native English listeners. Specifically, the words heap, each, hip, itch, head, dead, had and dad were selected to assess productions of the front vowels /i/, /ɪ/, /ɛ/, and /æ/ and the words shut, cup, cop, sob, hood, look, boot, and moon were chosen to assess accuracy of the central and back vowels /ʌ/, /ɑ/, /ʊ/ and /u/. A total of 512 words (16 words × 32 speakers) were normalized for peak intensity and arranged in four blocks for auditory evaluation by four native English listeners. All listeners were native American English speakers from the Study Abroad Program hosted by the University of the Balearic Islands. Their mean age was 21. They came from different universities in the Midwest region of the United States. They were tested individually in a quiet room and were paid for their participation.
The 16 target words were presented via dual driver satellite loudspeakers using a laptop computer in four separate testing sessions, two sessions for the front vowel subset and another two sessions for the central and back vowel subset. The tests were run with the MFC application of the Praat software, which allows presentation of sound files in counterbalanced order and automatically collects listeners’ responses. Each testing session lasted about 10 min. In most cases, listeners completed two sessions in one day and had at least a one-hour break between them. In the first session, they received some short training (five trials) to familiarize themselves with the task. Participants were told that they would hear an unspecified proportion of tokens spoken by native and non-native speakers of English. (Native speakers produced 78 tokens, and non-native speakers produced 503 tokens.) They were instructed to perform two kinds of judgments. First, participants were to identify the vowel within the word they heard, selecting one of the keywords presented on the screen. Immediately after that, they had to rate the intended vowel for goodness using a five-point scale (1 = poor instance; 5 = good instance). They had the option to listen to the stimulus word again if necessary. Following Piske et al. (2002), we limited the number of response keywords in order for listeners not to feel overwhelmed; thus, we selected the keywords heed, hid, head, had, hot, and hut to identify the front vowels, and had, head, hut, hot, hood, and who’d to identify the central and back vowels.

3.2. Results

3.2.1. Spectral measurements
Eight separate one-way ANOVAs examined the effect of group on the formant frequency values of the vowels produced by the four groups of speakers. The F-values for the front vowels shown in Table 6 indicate that the simple effect of group was significant at the 0.01 alpha level for the F1 and F2 values of the /i/ tokens. Tukey’s HSD post hoc tests further revealed that the first formant of vowels produced by the mid-proficient group was significantly higher than the F1 values of tokens produced by the NE. The post hoc tests also showed that the second formant values of the /i/ tokens produced by the native English speakers were significantly higher than those produced by the mid-proficient and low-proficient learners. The proficient learners did not differ significantly from the NE, indicating that language proficiency positively influenced production of the high front tense vowel.
Again, the main effect of group for /ɪ/ was significant at the 0.01 level for F2 but not for F1. The three groups of learners produced /ɪ/ tokens with significantly higher F2 values than those of the native English speakers. For /ɛ/, F was significant at the 0.01 alpha level and Tukey’s HSD tests showed that this effect was due to a marginal difference between the F1 frequencies of the tokens produced by the low-proficient and mid-proficient groups. Apart from this, none of the three groups differed significantly in their productions from the NE group. This is also true of the formant values of the /æ/-tokens, which did not yield any significant differences between the native and the non-native productions.
As for the back and central vowels, the ANOVAs revealed a significant main effect of group on first-formant frequencies for vowels /ɑ/ and /ʊ/, a marginally significant effect for /ʌ/ and a nonsignificant effect for /u/. Tukey’s HSD (α = 0.05) revealed that for /ɑ/, the first formants for the proficient and low-proficient groups were significantly lower than for the native English speakers. The F1 values for the mid-proficient learners did not differ significantly from the native English speaker group. The productions of the low central vowel /ʌ/ were within the range of native English standards, as shown by the absence of significant differences in formant values. All three groups of learners had trouble producing the high back pair /u–ʊ/. Significant differences were found in F1 between the /ʊ/ tokens of the native English speakers and those of the learners, regardless of their language proficiency level. Finally, learners produced more peripheral instances of /u/ as judged by the lower F2 values relative to the NE speakers, who produced more centralized /u/ tokens as defined by the higher F2 values.
The mean F1 and F2 formant values for the eight vowels tested were plotted in three separate two-dimensional acoustic spaces

Table 6
Mean frequency values and standard deviations (in parentheses) for the vowels tested in the production task. The F-values are for separate one-way ANOVAs examining the simple effect of group. (*) and (**) indicate significance at the 0.01 and 0.001 levels, respectively. Alpha level for the pairwise comparisons was set at 0.05.

Vowel       1 Low-prof.   2 Mid-prof.   3 Prof.       4 NE          F                     Tukey’s HSD
/i/   F1    447 (83)      472 (61)      436 (70)      391 (49)      F(3,86) = 4.12*       2 > 4
      F2    2549 (186)    2593 (150)    2682 (164)    2801 (253)    F(3,86) = 6.53**      4 > 1, 2
/ɪ/   F1    493 (84)      546 (87)      557 (89)      583 (91)      F(3,90) = 3.85
      F2    2390 (195)    2361 (205)    2378 (188)    2171 (286)    F(3,90) = 4.00*       1, 2, 3 > 4
/ɛ/   F1    710 (130)     653 (123)     738 (89)      698 (74)      F(3,157) = 5.09*      1 < 2
      F2    2067 (169)    2188 (151)    2183 (204)    2205 (232)    F(3,157) = 2.99
/æ/   F1    794 (145)     831 (109)     782 (100)     862 (150)     F(3,155) = 2.87
      F2    2003 (261)    2003 (270)    2083 (243)    1921 (309)    F(3,155) = 2.11
/ɑ/   F1    751 (148)     828 (90)      788 (113)     905 (80)      F(3,124) = 8.36**     4 > 1, 3; 2 > 1
      F2    1478 (225)    1419 (185)    1386 (217)    1346 (123)    F(3,124) = 2.24
/ʌ/   F1    711 (140)     804 (122)     728 (109)     785 (110)     F(3,122) = 4.26*      2 > 1, 3
      F2    1620 (235)    1548 (207)    1605 (215)    1684 (241)    F(3,122) = 1.56
/ʊ/   F1    498 (69)      549 (72)      527 (69)      655 (96)      F(3,90) = 12.99**     4 > 1, 2, 3
      F2    1180 (269)    1243 (228)    1395 (283)    1337 (217)    F(3,90) = 3.43*       1 < 3
/u/   F1    498 (71)      482 (101)     461 (98)      435 (56)      F(3,58) = 1.16
      F2    1047 (187)    1031 (291)    1062 (252)    1354 (274)    F(3,58) = 3.81*       4 > 1, 2, 3
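
As a rough illustration of how the formant means in Table 6 relate to vowel overlap, the following sketch computes the straight-line distance in the F1 × F2 plane between the two members of the high front and high back tense/lax pairs for each group. The Euclidean metric in raw Hz is an assumption made here for illustration only, since no distance measure is specified for the vowel charts; the names `TABLE6_MEANS` and `pair_distance` are hypothetical, and the vowels are labelled with the response keywords (heed, hid, who'd, hood).

```python
import math

# Mean (F1, F2) values in Hz copied from Table 6.
# Dictionary keys are hypothetical labels: the keyword names the vowel.
TABLE6_MEANS = {
    "low":  {"heed": (447, 2549), "hid": (493, 2390),
             "who'd": (498, 1047), "hood": (498, 1180)},
    "mid":  {"heed": (472, 2593), "hid": (546, 2361),
             "who'd": (482, 1031), "hood": (549, 1243)},
    "prof": {"heed": (436, 2682), "hid": (557, 2378),
             "who'd": (461, 1062), "hood": (527, 1395)},
    "NE":   {"heed": (391, 2801), "hid": (583, 2171),
             "who'd": (435, 1354), "hood": (655, 1337)},
}


def pair_distance(group, vowel_a, vowel_b):
    """Euclidean distance (Hz) between two vowel means in the F1 x F2 plane.

    Assumption: raw Hz and equal weighting of F1 and F2.
    """
    f1_a, f2_a = TABLE6_MEANS[group][vowel_a]
    f1_b, f2_b = TABLE6_MEANS[group][vowel_b]
    return math.hypot(f1_a - f1_b, f2_a - f2_b)


for group in TABLE6_MEANS:
    front = pair_distance(group, "heed", "hid")
    back = pair_distance(group, "who'd", "hood")
    print(f"{group:>4}: front pair {front:6.1f} Hz, back pair {back:6.1f} Hz")
```

Under these assumptions the front-pair distance increases steadily from the low-proficient group to the NE group; a perceptually motivated scale (for example Bark) could give a different picture, so the numbers should be read as indicative only.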

for each group of speakers (Fig. 4). Eight additional one-way ANOVAs and a posteriori contrasts (Tukey’s post hoc tests) examining the effect of vowel were performed to determine whether the F1 and F2 dimensions had different weight across the four groups. The results showed significant effects for the two acoustic dimensions in all groups.10

10 NE: F1 F(7, 152) = 62.4, p < 0.001; F2 F(7, 152) = 99.2, p < 0.001; proficient learners: F1 F(7, 316) = 95.3, p < 0.001; F2 F(7, 316) = 240.2, p < 0.001; mid-proficient learners: F1 F(7, 278) = 90.5, p < 0.001; F2 F(7, 278) = 210, p < 0.001; and low-proficient learners: F1 F(7, 277) = 48.91, p < 0.001; F2 F(7, 277) = 227.1, p < 0.001.

The post hoc tests revealed three distinct patterns: the mid-proficient and low-proficient learners showed the same trend, but the proficient learners and the NE group showed two separate trends. Comparison of the mean formant frequencies for the vowels produced by the mid-proficient and low-proficient groups yielded three subsets of data for the F1 dimension and six subsets of data for the F2 dimension. The three separate F1 subsets included the high vowels /i/, /ɪ/, /u/, and /ʊ/, the mid-vowels /ɛ/ and /ʌ/, and the low vowels /ʌ/, /ɑ/, and /æ/. The F2 dimension included six subsets of data: /u, ʊ/; /ɑ, ʌ/; /æ/; /ɛ/; /ɪ/; and /i/. The F1 data for the proficient group yielded four separate subsets: /i, u/; /ʊ, ɪ/; /ɛ, ʌ/; /ɛ, æ, ɑ/. As with the other two groups of learners, six subsets were identified for the F2 data: /u/; /ʊ, ɑ/; /ʌ/; /æ, ɛ/; /ɪ/; and /i/. Finally, examination of the data obtained for the NE group revealed a different pattern. Unlike the three groups of learners, the F1 values of the vowel tokens produced by these speakers represent six separate subsets of data (/i, u/; /ʊ, ɪ/; /ɛ, ʊ/; /ɛ, ʌ/; /ʌ, æ/; and /æ, ɑ/), in contrast to the five subsets for the F2 values: /u, ʊ, ɑ/; /ʌ/; /æ, ɛ/; /ɛ, ɪ/; and /i/.
These data, along with the three vowel charts, showed that learners, due to the influence of the Catalan vowel system, exhibited a tendency to narrow the F1 dimension of the acoustic vowel space relative to the native English group, who showed six distinct degrees of vowel height in terms of F1 values. This suggests that learners gave more weight to the F2 dimension when distinguishing vowel categories. Unlike learners, native speaker relative weighting of the F1 and F2 dimensions was more balanced across vowels. This crosslinguistic difference had a negative effect on vowel confusability by the non-native groups: the acoustic distances between the two vowels in the pairs /i–ɪ/, /u–ʊ/, /ɛ–æ/, and /ʌ–ɑ/ narrowed, causing a high degree of overlap between the two members of each pair. However, it should be noted that there was a trend for vowel spaces to spread as proficiency in English increased, so that the acoustic distances between neighboring vowels became wider, especially in the pairs /i–ɪ/, /æ–ɛ/, and /u–ʊ/.

3.2.2. Duration measurements for the tense–lax pairs
Four separate one-way ANOVAs examined the effect of group on vowel duration for the tense/lax vowel pairs /i–ɪ/ and /u–ʊ/. Significant differences in vowel duration were found for the tense vowel /i/ (F(3, 88) = 3.79, p = 0.013), but no significant differences were found for /ɪ/ (F(3, 91) = 1.22; p = 0.304), /u/ (F(3, 56) = 2.19; p = 0.098) or /ʊ/ (F(3, 91) = 1.96; p = 0.125). Pair-wise comparisons for the /i/ durations indicated that the mid-proficient and low-proficient learners differed significantly from the proficient learners and native English speakers. Table 7 shows the average vowel duration and duration ratios for the four high vowels. In all cases except for /i/, vowel duration becomes closer to nativelike values as a function of proficiency in English. In the case of /i/, the proficient learners show a clear trend towards more nativelike vowel durations, relative to the low-proficient and mid-proficient groups. These results indicate that a considerable number of learners have great difficulty differentiating /i/ from /ɪ/ in terms of vowel length, and they contrast with previous studies that explore duration of these vowels as produced by Catalan learners (Cebrian, 2007; Mora & Fullana, 2007). Those studies found that Catalan learners relied on duration in the production of /i/ and /ɪ/ regardless of L2 experience. We speculate that the different methodologies used in the elicitation tasks in both these studies might not make them comparable with the present study. In the Cebrian study learners produced the target vowels in a repetition and vowel insertion task, while Mora & Fullana used a delayed sentence repetition task. These methodologies probably allow for a better control of speaking rate, which is known to affect spectral and temporal parameters of vowels (Gay, 1968; Lindblom, 1963; Miller, 1981). The present study was not primarily designed to

Table 7
Mean vowel durations (in ms) and duration ratios of the tense–lax vowel pairs /i–ɪ/ and /u–ʊ/. SDs are in parentheses.

Group            /i/        /ɪ/        Ratio /i–ɪ/   /u/        /ʊ/        Ratio /u–ʊ/
Low-proficient   120 (36)   101 (28)   1.18          149 (40)   105 (30)   1.41
Mid-proficient   124 (36)   100 (22)   1.24          168 (48)   118 (31)   1.42
Proficient       141 (47)   109 (25)   1.29          173 (42)   122 (32)   1.41
Native English   165 (63)   113 (27)   1.46          196 (42)   130 (40)   1.50
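
The ratio columns of Table 7 can be reproduced from the mean durations with a few lines of code. The sketch below assumes that each ratio is the tense mean divided by the lax mean, truncated (not rounded) to two decimals; this is an inference from the printed values, not a stated procedure, and the ratios may equally have been derived from unrounded means. `TABLE7_MEANS_MS` and `truncated_ratio` are hypothetical names.

```python
# Mean durations in ms copied from Table 7 as (tense, lax) tuples.
# "front" is the high front tense/lax pair, "back" the high back pair.
TABLE7_MEANS_MS = {
    "Low-proficient": {"front": (120, 101), "back": (149, 105)},
    "Mid-proficient": {"front": (124, 100), "back": (168, 118)},
    "Proficient":     {"front": (141, 109), "back": (173, 122)},
    "Native English": {"front": (165, 113), "back": (196, 130)},
}


def truncated_ratio(tense_ms, lax_ms):
    """Tense/lax duration ratio, truncated (not rounded) to two decimals.

    Integer floor division is used so that the truncation is not
    affected by floating-point representation.
    """
    return (tense_ms * 100 // lax_ms) / 100


for group, pairs in TABLE7_MEANS_MS.items():
    front = truncated_ratio(*pairs["front"])
    back = truncated_ratio(*pairs["back"])
    print(f"{group:<15} front {front:.2f}   back {back:.2f}")
```

With truncation the eight computed ratios agree with the printed ones; ordinary rounding would instead give, for example, 1.19 for the low-proficient front pair.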

investigate how learners implemented durational cues in the tense–lax English vowel pairs and, consequently, speaking rate was not controlled. Further, the syllabic structure in which the vowels were placed also differed. The Catalan learners in the Cebrian and Mora & Fullana studies produced the vowels in CVC syllables, whereas in the present study vowels appeared in VC as well as CVC syllables.

Fig. 4. Vowel chart plotting the mean values of the English vowels /i/, /ɪ/, /ɛ/, /æ/, /ʌ/, /ɑ/, /ʊ/ and /u/ produced by the three groups of Catalan learners varying in English proficiency and a control group of native English speakers.

3.2.3. Native English listener judgments

Reliability analyses were performed to assess inter-rater agreement of the vowel identification and goodness ratings. The intraclass correlation coefficients based on the forced-choice responses by four native English listeners were 0.96 for the front vowels and 0.91 for the back vowels, indicating a high level of inter-rater consistency. However, listeners were less consistent when rating the intended vowels for goodness: the correlation coefficients were 0.61 for the front vowel ratings and 0.58 for the back vowel ratings.
The percentage of times that a target vowel was correctly identified as an instance of the intended vowel, along with the corresponding goodness ratings, is shown in Table 8. These percentages are based on the forced-choice judgments by four native English listeners.11 The total percentage of “hits,” defined as the percentage of times that a target vowel was identified as intended, averaged across learners and vowels, was submitted to a two-way ANOVA. Both the single effects of vowel and group were significant (F(7, 2004) = 9.9; p < 0.001 for vowel and F(3, 2008) = 33.2; p < 0.001 for group), indicating that the percentage of vowels that were heard as intended by native English judges varied significantly across vowels and groups. Additional one-way ANOVAs examining the effect of group yielded significant effects for /i/ (F(3, 244) = 6.4, p < 0.001), /ɪ/ (F(3, 249) = 5.1, p = 0.002), /ɛ/ (F(3, 247) = 7.9, p < 0.001), /æ/ (F(3, 244) = 5.4, p < 0.001), /ʌ/ (F(3, 252) = 2.7, p < 0.05), /ɑ/ (F(3, 248) = 14.1, p < 0.001), and /ʊ/ (F(3, 248) = 5.2, p < 0.001), but not for /u/ (F(3, 248) = 2.2, p < 0.082). Pair-wise comparisons with Tukey’s post-hoc tests showed that, for /ɪ/, /ɛ/ and /ɑ/, percent identification by NE judges was significantly higher for the tokens produced by the NE speakers than for the tokens produced by any of the three groups of learners. For the vowels /i/ and /æ/, the tokens produced by the NE speakers were significantly better identified than the tokens produced by the mid-proficient and the low-proficient learners. The proficient learners produced instances of /i/ and /æ/ that were heard as intended and within the range of NE speakers. For the vowel /ʌ/, the mid-proficient and proficient learners did not significantly differ from NE speakers and, for /ʊ/, the mid-proficient learners did not differ from NE speakers either.

11 Originally, six native listeners were used but three of them failed to consistently identify the NE productions as intended. These three listeners had reported having lived in other dialect areas before they moved to the Midwest. The decision was made to reject the judgments by these three listeners, which had not reached 80% identification percentage of the NE data averaged across vowels, and obtain judgments from an additional listener. Still, instances of vowel /u/ by the NE group were identified less consistently than other vowels. Inspection of the identification percentages across listeners suggested that one listener systematically identified instances of the word moon as close to hood rather than who’d.

As for the goodness ratings, the main effect of group was also significant (F(3, 2008) = 69.9; p < 0.001); the pair-wise comparisons showed that the three groups of learners received significantly lower goodness ratings than the native English group. In turn,

Table 8
Percent identification and goodness ratings of eight English vowels produced by three groups of Catalan learners differing in English proficiency, and a group of native English speakers. Percentages indicate the average number of times a given vowel was identified as intended by four native English listeners, who also rated the speech productions for goodness on a one-to-five scale (1 = poor instance; 5 = good instance). SDs for the goodness ratings are given in parentheses. Responses selected 3% of the times or less have been omitted.

Target  Low-proficient learners                  Mid-proficient learners      Proficient learners                Mean Catalan learners  Native English
/i/     /i/ 43, /ɪ/ 55                           /i/ 43, /ɪ/ 57               /i/ 62, /ɪ/ 35                     /i/ 49%                /i/ 81, /ɪ/ 11, /e/ 8
        3.38 (1.37)                              3.58 (1.39)                  3.53 (1.35)                                               4.22 (1.26)
/ɪ/     /ɪ/ 72, /i/ 25                           /ɪ/ 71, /i/ 25               /ɪ/ 71, /i/ 26                     /ɪ/ 71%                /ɪ/ 100
        3.41 (1.26)                              3.95 (1.11)                  3.75 (1.22)                                               4.55 (0.63)
/ɛ/     /ɛ/ 61, /æ/ 27, /ɪ/ 12                   /ɛ/ 49, /æ/ 48               /ɛ/ 50, /æ/ 45, /ɪ/ 5              /ɛ/ 53%                /ɛ/ 87, /æ/ 8
        3.11 (1.18)                              3.37 (1.32)                  3.29 (1.18)                                               4.58 (0.87)
/æ/     /æ/ 67, /ɛ/ 27, /ɪ/ 5                    /æ/ 76, /ɛ/ 21               /æ/ 83, /ɛ/ 17                     /æ/ 75%                /æ/ 100
        3.22 (1.14)                              3.26 (1.27)                  3.52 (1.33)                                               4.89 (0.31)
/ʌ/     /ʌ/ 48, /æ/ 20, /ɑ/ 12, /ɛ/ 9, /ʊ/ 8     /ʌ/ 62, /æ/ 20, /ɑ/ 18       /ʌ/ 64, /æ/ 12, /ɑ/ 9, /ʊ/ 12      /ʌ/ 58%                /ʌ/ 83, /æ/ 7, /ɑ/ 2
        3.28 (1.06)                              3.74 (0.97)                  3.60 (1.05)                                               3.83 (0.95)
/ɑ/     /ɑ/ 32, /ʌ/ 57, /æ/ 7                    /ɑ/ 65, /ʌ/ 33               /ɑ/ 52, /ʌ/ 39, /æ/ 8              /ɑ/ 49%                /ɑ/ 92, /ʌ/ 7
        3.25 (0.87)                              3.61 (0.98)                  3.53 (1.13)                                               4.48 (0.78)
/ʊ/     /ʊ/ 65, /u/ 28, /e/ 5                    /ʊ/ 74, /u/ 24               /ʊ/ 62, /u/ 17, /e/ 6              /ʊ/ 67%                /ʊ/ 90, /u/ 10
        3.59 (1.08)                              3.86 (0.92)                  3.54 (1.08)                                               4.83 (0.44)
/u/     /u/ 52, /ʊ/ 43                           /u/ 67, /ʊ/ 33               /u/ 67, /ʊ/ 30                     /u/ 62%                /u/ 75, /ʊ/ 25
        3.77 (1.04)                              4.03 (0.87)                  4.19 (0.87)                                               4.73 (0.55)
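
The "Mean Catalan learners" column of Table 8 appears to be consistent with an unweighted mean of the three learner-group percentages, truncated to a whole percent. The sketch below checks this reading; it is an assumption rather than a documented procedure, since the published means may have been computed over individual tokens. The vowels are labelled with the response keywords used in the listening task, and `HITS_BY_GROUP` and `mean_hit_rate` are hypothetical names.

```python
# Percent "identified as intended" copied from Table 8, in the order
# (low-proficient, mid-proficient, proficient). Keys are the response
# keywords, used here as labels for the eight target vowels.
HITS_BY_GROUP = {
    "heed": (43, 43, 62),
    "hid": (72, 71, 71),
    "head": (61, 49, 50),
    "had": (67, 76, 83),
    "hut": (48, 62, 64),
    "hot": (32, 65, 52),
    "hood": (65, 74, 62),
    "who'd": (52, 67, 67),
}


def mean_hit_rate(hits):
    """Unweighted mean across groups, truncated to a whole percent.

    Assumption: each group counts equally, regardless of group size.
    """
    return sum(hits) // len(hits)


learner_means = {vowel: mean_hit_rate(hits) for vowel, hits in HITS_BY_GROUP.items()}

for vowel, mean in learner_means.items():
    print(f"{vowel:<6} {mean}%")
```

Under this reading all eight values match the printed column. Weighting by group size (8, 9 and 10 learners) would be an alternative, but it is not needed to reproduce the table.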

the low-proficient learners received significantly lower ratings than the proficient or mid-proficient learners.
The identification percentages across groups and vowels suggest that, on average, learners produced the English vowels /ɪ/, /æ/, /ʊ/, and /u/ relatively well, with identification percentages of 75% for /æ/, 71% for /ɪ/, 67% for /ʊ/, and 62% for /u/. However, learners were less successful at producing good instances of the vowels /ɛ/, /ʌ/, /ɑ/, and /i/. The vowel /i/ was often misidentified as /ɪ/ and the number of misidentifications was higher in the vowels produced by the mid-proficient and low-proficient groups. The vowel /ɛ/ was inaccurately produced, with most errors being confusions with /æ/ as judged by the native English listeners. The low back vowel /ɑ/ yielded the lowest identification percentages, although accuracy was higher in the mid-proficient and proficient learners. Listeners heard this vowel as instances of English /ʌ/, indicating that Catalan learners had great difficulty differentiating /ɑ/ from /ʌ/.
The vowel identification results partially parallel the results of acoustic measurements in terms of vowel formant frequencies. In some cases, lower identification percentages can be accounted for by significant spectral differences between learners and NE speakers. For instance, /ɑ/ was identified as intended in 49% of the cases; this is reflected in the spectral measurements in that the F1 values of the tokens produced by two groups of learners were significantly lower than the NE speaker values. Similarly, /i/ was identified as intended quite rarely (only 49% of the time); the spectral measurements showed highly significant differences between NE speakers and low-proficient and mid-proficient learners in terms of F2 values. A vowel that was consistently identified as intended with a reasonable degree of success (75%), namely /æ/, exhibited no significant spectral differences between learners and NE speakers. In one case, /u/, a moderate identification is indicated by smaller spectral differences (at the 0.01 level) between learners and NE speakers. Finally, there are two cases in which spectral measurements do not parallel identification percentages: the learners’ /ɛ/-tokens were identified as intended in 53% of the cases only, whereas these tokens were not found to differ spectrally from the NE tokens. Similarly, the learners’ /ʊ/-tokens were quite consistently identified (67%) even though spectrally they had significantly lower F1 values than the NEs. The mismatch between these two measurements could be an artifact of the speech materials measured in both experiments. We must remember that the acoustic measurements were taken from a larger number of words than were used in the auditory evaluation experiment. Further, the acoustic measurements were based on formant values obtained at the midpoint of the steady-state portion of the vowel and thus other spectral data such as vowel transitions, VISC or upper formant values were not considered.
There was a trend toward better identification as a function of language proficiency in the learners’ production of five target vowels (/i/, /æ/, /ʌ/, /ɑ/, and /u/), but no differences as a function of language proficiency were found for /ɪ/ and /ʊ/. This finding might suggest that the lax vowels present less difficulty for Catalan learners than their tense counterparts. For /i/, correct percent identification was considerably higher among the proficient learners relative to the low-proficient or mid-proficient learners, whose productions of /i/ were more often heard as /ɪ/. Similarly, the percentage of times that /u/ was heard as /ʊ/ was lower among the mid-proficient and proficient learners than among the low-proficient learners. Mid-proficient and proficient learners also showed better performance for the mid-vowels /æ/, /ʌ/, /ɑ/, relative to the low-proficient learners. Comparison of the percent identification of /ʌ/ and /ɑ/ suggests that the low-proficient group had great difficulty producing these two vowels since their productions of /ɑ/ were mostly heard as /ʌ/ by the native English listeners.

3.2.4. Perception/production relationship
In this section the research question addressed was whether a relationship existed between the results obtained in the categorial discrimination test (Experiment 1) and the vowel production task (Experiment 2). Following a previous study by Jia, Strange, Wu, Collado, and Guan (2006), the perception/production alignment was examined at the individual and group levels. Bivariate correlations were computed for the 27 Catalan learners between the mean A′ scores obtained by each subject in the perception test and the mean identification percentage obtained in the word elicitation task. No significant correlation was found when the data were pooled as a whole (r = 0.262; n = 27; p = 0.18); however, when the data were pooled separately for each group, they

exhibited three distinct trends. Production and perception data were not correlated for the low-proficient learners (r = −0.206; n = 8; p = 0.62) or for the proficient learners (r = 0.255; n = 10; p = 0.47) but a significant correlation was found for the mid-proficient learners (r = 0.755; n = 9; p = 0.01).
The perception/production alignment at the group level was computed by correlating the overall A′ score for each E–E vowel pair with the mean bidirectional error found in production. This variable was obtained by summing the confusion rates of the two vowels in each pair; that is, the percentage of times that each intended vowel was heard as its counterpart (see Table 9). The correlation between these two variables did not reach significance, either (r = 0.465; n = 4; p = 0.53), causing a mismatch of the perception and production difficulty rankings.

Table 9
Difficulty ranking of the bidirectional production error rate and A′ scores (discrimination) for the four E–E vowel contrasts.

Contrast   Production bidirectional error   Difficulty rank   Perception A′ scores   Difficulty rank
/i–ɪ/      49 + 25 = 75                     1                 0.621                  4
/ɛ–æ/      40 + 21 = 61                     2                 0.426                  1
/ɑ–ʌ/      43 + 13 = 56                     4                 0.559                  3
/u–ʊ/      23 + 35 = 58                     3                 0.557                  2

3.3. Discussion

In this experiment one of the research questions addressed was whether Catalan learners of English could produce some English vowels authentically and whether this varied as a function of language proficiency. We found that the total percentage of times that a vowel was correctly identified by native English listeners was significantly lower among the Catalan learners relative to the native English speakers, suggesting that language proficiency did not positively influence nativelike vowel production.
The spectral measurements showed that the vowels produced by the three groups of learners showed a tendency to centralization relative to the vowels produced by the NE. Overall, this trend diminished as a function of language proficiency, which means that the acoustic distances between the vowels that define the extremes of the acoustic space (the “point vowels” /i/, /u/, and /a/; Maddieson, 1984) were larger for the proficient group compared to the low-proficient group. The less-expanded vowel inventories of the Catalan learners have immediate consequences for vowel intelligibility, in that they increase potential confusions between neighboring vowel categories. This was quite obvious in the productions of the /i–ɪ/ and /u–ʊ/ pairs by the low-proficient learners. The dispersion of /i/ and /u/ in the acoustic space relative to /ɪ/ and /ʊ/ caused overlap between the two vowels of each contrast. This was also found in studies examining vowel production by the hearing-impaired (Angelocci, Kopp, & Holbrook, 1991; McGarr & Whitehead, 1992), which reported cases of centralized vowel productions by deaf speakers relative to their normal-hearing peers. Researchers attributed this fact to a lack of precision in the articulatory gestures due to impaired auditory feedback. Similar findings were reported in a study exploring acquisition of the Spanish vowel system by English learners (Reeder, 1998, 1999). In this study, speakers showed just the opposite

the vowel /æ/ that were close to nativelike; that is, they were heard as intelligible instances of the intended target vowel and, acoustically, they had spectral values that did not significantly differ from the NE speaker values. However, learners were less successful at pronouncing good instances of the vowels /i/ and /ɑ/, since these differed significantly from NE spectral values and were not consistently identified as intended. The production of /ɪ/ and /u/ was moderately good in that both vowels showed small spectral differences in comparison with the NE values and were identified as intended with relative success. It is hard to determine production success for /ɛ/, /ʌ/, and /ʊ/, given the fact that the spectral results did not parallel identification results or vice versa. In the first two cases, both vowels were heard as intended in 53% and 58% of the instances, respectively, but neither of them was spectrally distant from NE /ɛ/ and /ʌ/. The vowel /ʊ/ showed just the opposite trend: it was quite consistently identified by the NE judges but, conversely, it was spectrally quite distant from NE /ʊ/.
Comparison with a previous study on Catalan interlanguage (Cebrian, 2007) reveals a partial discrepancy in relation to the production of the vowels /i/ and /ɪ/, which in that study were identified by native listeners at similar identification percentages, 73% and 71% respectively, suggesting that Catalan learners could produce both vowels at comparatively similar levels of accuracy. In the present study, we found that only the proficient learners could produce both vowels at comparable levels of consistency (62% for /i/ and 71% for /ɪ/). When the performance of the low-proficient and mid-proficient groups was considered, we found that /ɪ/ was more consistently heard as intended than /i/. The finding that /ɪ/ could present less difficulty than its tense counterpart for less proficient learners is in line with Cebrian’s speculation that a gradual realization of the spectral characteristics of the lax vowel may go hand in hand with the deterioration of its tense counterpart.
As far as production of /ɛ/ is concerned, our findings differ substantially from Cebrian’s. He found that /ɛ/ was quite consistently identified by NE listeners, but this was not the case in the present study. Learners produced instances of /ɛ/ that were often heard as /æ/. Conversely, learners encountered fewer difficulties with /æ/, which was more consistently identified as intended by NE listeners. Production of these two vowels seems to follow a similar trend to the /i/–/ɪ/ contrast. As learners gain in L2 proficiency, they gradually produce a new L2 vowel sound authentically but this achievement comes at the cost of deteriorating production of a neighboring L2 sound. Similar findings involving production of English /ɛ/ and /æ/ by Portuguese speakers are reported in Major (1987).
It is likely that the type of elicitation task and the different contexts in which the target vowels were elicited could have accounted for the difficulty shown by the less-proficient learners in producing the tense vowel. In Cebrian’s study the target vowels were elicited in the hVb context, whereas in our study, the vowels were elicited in VC in addition to CVC positions. The possibility exists that low-proficient and mid-proficient learners encounter fewer difficulties when the target vowel occurs in CVC syllable types than when it occurs in VC syllables. Onsetless syllables are known to be acquired at a later stage in first language acquisition (Levelt, Schiller, & Levelt, 1999/2000). We also speculate that an elicitation task in which learners are asked to utter the vowel in a single consonantal context yields better results than a task in
trend from the Catalan learners in the present study: They had to which learners are asked to produce the vowels in a variety of
adjust from a large L1 vowel inventory to the five-vowel system of unpredictable contexts, as was the case of our study. However,
Spanish. Reeder also reported significant improvement in vowel the data obtained so far are insufficient to give support to this
attempts for higher level learners as compared with the lower level hypothesis.
groups. The group differences just reported align with acoustic mea-
Considering both spectral measurements and NE judgments, it surements of vowel formants and vowel duration. For instance,
could be stated that the Catalan learners produced instances of the proficient learners produced /i/ tokens with vowel durations
that did not differ significantly from native English values. Similarly, the three vowel charts plotted in Fig. 4 further attest that the spectral distance between pairs of neighboring vowels such as /i–ɪ/ and /e–>/ increased with language proficiency. Taken together, the acoustic measurements and the native listener judgments suggest that there may be limits on the number of vowels EFL learners can produce in a nativelike fashion when their L1 vowel inventory has fewer vowel phonemes than the target language. Late EFL learners may learn to produce relatively good instances of some new and old L2 vowels at the cost of mispronouncing other acoustically close vowels.

4. Summary and conclusions

This study examined perception and production of American English vowels by three groups of Catalan learners in a non-naturalistic setting. In Experiment 1, Catalan learners varying in English proficiency were tested on their ability to discriminate seven C–E and four E–E vowel pairs. Overall, they showed moderately good sensitivity to the two English /i–ɪ/ and /u–ʊ/ pairs but no sensitivity to /e–æ/ and /e–>/. These outcomes were interpreted in light of the PAM-L2 (Best & Tyler, 2007) and, except in one case, they met the model’s predictions of discriminability. Learners had great difficulty discriminating most C–E vowel pairs. With the exception of the C /i/–E /ɪ/ pair, which was easy to discriminate, learners showed only marginal sensitivity to the C /a/–E />/, C /i/–E /i/ and C /u/–E /u/ pairs and no sensitivity to the C /a/–E /æ/, C /a/–E /e/, and C /e/–E /e/ pairs. These findings suggest that learners had difficulties differentiating between the target English vowels /e/, /æ/, /e/, />/, /i/ and /u/ and their Catalan counterparts /e/, /a/, /i/, and /u/.

In Experiment 2, the same group of learners that participated in the perceptual test produced instances of eight English vowels /i/, /ɪ/, /e/, /æ/, /e/, />/, /ʊ/, and /u/ in a variety of consonantal contexts. Accuracy was assessed by means of acoustic measurements and native English listener judgments. Significant differences in formant frequency between non-native and native English speakers were found for the vowels /i/, />/, and /ʊ/. Native and non-native instances of the vowels /ɪ/ and /u/ also differed marginally in spectral terms. Vowel duration was also measured in the tokens that contained one of the vowels in the tense–lax pairs /i–ɪ/ and /u–ʊ/. Significant differences between the mid-proficient and low-proficient learners on the one hand, and native English speakers on the other hand, were found for the /i/ tokens only. However, the proficient learners produced /i/ tokens with vowel durations that did not differ significantly from native English speaker values.

A subset of the speech materials was further assessed by native English listeners who judged the words as instances of the intended vowel and rated them for goodness of accuracy. The native English judgments indicated that, overall, most learners had less difficulty with the lax vowel /ɪ/ than with its tense counterpart /i/, but this did not extend to the other tense–lax pair /u–ʊ/. Of the three low vowels, /æ/ was the most frequently identified as intended by NE listeners, followed by /e/, but this did not hold true of /e/ and />/, which are located next to /æ/ and /e/, respectively, in the acoustic space. As for the goodness ratings, our results are consistent with prior findings by Munro and Derwing (1999), who explored foreign accent and L2 speech intelligibility. They found that foreign language speech can be intelligible yet strongly accented. We reported that only the proficient learners produced tokens that were rated as significantly more accurate than the tokens produced by the other two learner groups, albeit less accurate than tokens produced by the NE speakers. It is reasonable to conclude, then, that late EFL learners might be able to produce some target vowels as intended but that these would be perceived as less accurate by native English listeners.

The lack of an overall significant correlation between the perception and production data at both the individual and group levels does not parallel prior studies exploring perception of L2 sounds (Bohn & Flege, 1997; Cebrian, 2002; Flege, Bohn et al., 1997; Flege, MacKay et al., 1999; Jia et al., 2006), in which these two abilities were brought into varying degrees of alignment. At the individual level, we only found marginally significant correlations in the data corresponding to the mid-proficient learners. Some learners could perform one of the two tasks at acceptable levels of accuracy and yet exhibit a poor performance in the other.

Various reasons may account for the lack of correlations in the reported data. Presumably, the nonsignificant correlations could be due to the different consonantal contexts in which the vowels occurred in the perception and production experiments. In the perception experiment, vowels were presented in an alveolar context, whereas in the production task, the target vowels were elicited in different consonant environments. Further, as Jia et al. (2006) suggest, results from certain perception and production tasks might not be comparable because of differences in the difficulty ranking of these tasks. Learners found the discrimination task quite difficult, in the sense that it required a high level of auditory attention. In contrast, the production task was cognitively less demanding, so they could do it effortlessly. Finally, the possibility exists that perception and production abilities cannot always be brought into perfect alignment in foreign language speech learning. The learners in the present study lacked a sufficient amount of exposure to the target language in a naturalistic setting. In these conditions, it is unlikely that their perceptual abilities will improve if they do not receive specific training aimed at reducing perceptual confusion of L2 sounds.

Even though the PAM-L2 warns that its predictions on perceptual learning may not be appropriate to foreign language acquisition situations in which the target language is learned through formal instruction and limited L2 exposure, the results obtained here seem to suggest otherwise. The findings of the perception experiment add modest evidence to two recent studies by Levy (2009a, 2009b) in which the model’s predictions of perceptual difficulty were also extended to foreign language learning in a formal setting.

Similarly, the SLM focuses on populations who have spoken the target language for many years, not beginners. One of the hypotheses of this framework is that, eventually, adult L2 learners might establish new phonetic categories for some L2 sounds if they can perceptually distinguish a given L2 sound from its closest L1 counterpart. We concluded that some mid-proficient and proficient learners may have established new phonetic categories for vowels /ɪ/ and /ʊ/ in an alveolar context. Again, these findings would suggest that category formation is not totally out of reach for some EFL learners who learn the target language in a predominantly L1-speaking environment.

The present study has some limitations in the sense that we did not directly test which specific acoustic cues learners relied on in discriminating the vowel contrasts. A question that should be addressed in future research is whether listeners attend exclusively to vowel duration and spectral differences and to what extent crosslinguistic vowel perception is influenced by the consonantal context. At present there is some evidence that L2 perception varies depending on the consonant environment in which a vowel appears (Bohn & Steinlen, 2003; Levy, 2009a; Levy & Strange, 2008). Subsequent studies should then focus on testing discrimination of vowel contrasts in consonantal contexts other than alveolar.

Another limitation that should also be addressed in subsequent studies involves the methodological issue of how nativelikeness should be investigated in L2 learning in a formal setting. The small significant differences between the three groups of learners call for an examination of both perceptual and production skills at the individual level, instead of at the group level. This method has already been adopted by Bongaerts (1999) and Birdsong (2007). Ideally, this approach would provide a useful tool to single out the "exceptional" learners from the more "average" learners and thus solve the problem of within-group variability in L2 speech research.

Acknowledgments

Parts of this work were presented at the AESLA 2005 and ClaSIC 2006 conferences, as well as at the ASA 2nd Special Workshop on Speech. This research was supported by grants PB94-0919 and HUM2007-66053-C02-02/FILO from the Spanish Ministry of Education and Science, and SGR-2009-003 from the Catalan Government. The authors would like to express their gratitude to Ocke-Schwen Bohn and to the three anonymous reviewers for their helpful comments on earlier versions of this manuscript. Many thanks to Daniel Recasens for providing acoustic data from the Catalan database.

References

Angelocci, A. A., Kopp, G. A., & Holbrook, A. (1991). The vowel formants of deaf and normal-hearing eleven to fourteen-year-old boys. In: R. J. Baken, & R. G. Daniloff (Eds.), Readings in clinical spectrography of speech (pp. 510–524). San Diego: Singular Publishing and Kay Elemetrics.
Best, C. T. (1994). The emergence of native-language phonological influences in infants: A perceptual assimilation model. In: C. Goodman, & H. C. Nusbaum (Eds.), The development of speech perception (pp. 167–224). Cambridge, MA: The MIT Press.
Best, C. T., & Tyler, M. D. (2007). Nonnative and second-language speech perception. In: O.-S. Bohn, & M. J. Munro (Eds.), Language experience in second language speech learning: In honor of James E. Flege (pp. 13–34). Amsterdam: John Benjamins.
Birdsong, D. (2007). Nativelike pronunciation among late learners of French as a second language. In: O.-S. Bohn, & M. Munro (Eds.), Language experience in second language speech learning: In honor of James E. Flege (pp. 99–116). Amsterdam: John Benjamins.
Boersma, P., & Weenink, D. (2008). Praat: Doing phonetics by computer (Version 5.0.36) [Computer program]. Retrieved from http://www.praat.org/.
Bohn, O.-S., & Flege, J. E. (1997). Perception and production of a new vowel category by adult second language learners. In: A. James, & J. Leather (Eds.), Second language speech: Structure and process (pp. 53–74). Berlin and New York: Mouton de Gruyter.
Bohn, O.-S., & Steinlen, A. K. (2003). Consonantal context affects cross-language perception of vowels. In: D. Recasens, M. J. Solé, & J. Romero (Eds.), Proceedings of the XV international congress of phonetic sciences (pp. 2289–2292). Barcelona: Causal Productions.
Bongaerts, T. (1999). Ultimate attainment in L2 pronunciation: The case of very advanced late L2 learners. In: D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 133–159). Mahwah: Erlbaum.
Bongaerts, T., Planken, B., & Schils, E. (1995). Can late learners attain a native accent in a foreign language? A test of the critical period hypothesis. In: D. Singleton, & Z. Lengyel (Eds.), The age factor in second language acquisition (pp. 30–50). Clevedon: Multilingual Matters.
Bongaerts, T., Van Summeren, C., Planken, B., & Schils, E. (1997). Age and ultimate attainment in the pronunciation of a foreign language. Studies in Second Language Acquisition, 19, 447–465.
Bruton, A., Conway, J. H., & Holgate, S. T. (2000). Reliability: What is it, and how is it measured? Physiotherapy, 86, 94–99.
Cebrian, J. (2002). Phonetic similarity, syllabification and phonotactic constraints in the acquisition of a second language contrast. Toronto Working Papers in Linguistics. Dissertation Series. Toronto: University of Toronto.
Cebrian, J. (2006). Experience and the use of non-native duration in L2 vowel categorization. Journal of Phonetics, 34, 372–387.
Cebrian, J. (2007). Old sounds in new contrasts: L2 production of the English tense–lax vowel distinction. In: J. Trouvain, & W. Barry (Eds.), Proceedings of the 16th international congress of phonetic sciences (pp. 1637–1640). Saarbrücken: Universität des Saarlandes.
Cebrian, J. (2009). Effects of native language and amount of experience on crosslinguistic perception. Journal of the Acoustical Society of America, 125, 2775.
Clopper, C. G., Pisoni, D., & de Jong, K. (2005). Acoustic characteristics of the vowel systems of six regional varieties of American English. Journal of the Acoustical Society of America, 118, 1661–1676.
Flege, J. E. (1987). Effects of equivalence classification on the production of foreign language speech sounds. In: J. Leather, & A. James (Eds.), Sound patterns in second language acquisition. Dordrecht: Foris Publications.
Flege, J. E. (1995). Second language speech learning: Theory, findings and problems. In: W. Strange (Ed.), Speech perception and linguistic experience (pp. 233–277). Baltimore: York Press.
Flege, J. E. (1999). The relation between L2 production and perception. In: J. Ohala, Y. Hasegawa, M. Ohala, D. Granville, & A. Bailey (Eds.), Proceedings of the XIVth international congress of phonetic sciences (pp. 1273–1276). Berkeley: University of California.
Flege, J. E. (2003). Methods for assessing the perception of vowels in a second language: A categorial discrimination test. In: E. Fava, & A. Mioni (Eds.), Issues in clinical linguistics (pp. 19–44). Padova: UniPress.
Flege, J. E., & MacKay, I. R. A. (2004). Perceiving vowels in a second language. Studies in Second Language Acquisition, 26, 1–34.
Flege, J. E., Bohn, O.-S., & Jang, S. (1997). The production and perception of English vowels by native speakers of German, Korean, Mandarin and Spanish. Journal of Phonetics, 25, 437–470.
Flege, J. E., Frieda, E. M., & Nozawa, T. (1997). Amount of native-language (L1) use affects the pronunciation of an L2. Journal of Phonetics, 25, 169–186.
Flege, J. E., MacKay, I. R. A., & Meador, D. (1999). Native Italian speakers’ production and perception of English vowels. Journal of the Acoustical Society of America, 106, 2973–2987.
Flege, J. E., Schirru, C., & MacKay, I. R. A. (2003). Interaction between the native and second language phonetic subsystems. Speech Communication, 40, 467–491.
Flege, J. E., Yeni-Komshian, G., & Liu, S. (1999). Age constraints on second language acquisition. Journal of Memory and Language, 41, 78–104.
Fullana Rivera, N., & MacKay, I. R. A. (2002). A study of foreign accent in Spanish and Catalan speakers’ production of English words: Preliminary evidence. In: J. Díaz García (Ed.), Actas del II Congreso de Fonética Experimental (pp. 198–203). Sevilla: Universidad de Sevilla.
Fullana Rivera, N., & MacKay, I. R. A. (2003). Production of English sounds by EFL learners: The case of /i/ and /ɪ/. In: M. J. Solé, D. Recasens, & J. Romero (Eds.), Proceedings of the 15th international congress of phonetic sciences (pp. 1525–1528). Barcelona: Causal Productions.
Gay, T. (1968). Effect of speaking rate on diphthong formant movements. Journal of the Acoustical Society of America, 44, 1570–1573.
Guion, S. G., Flege, J. E., Akahane-Yamada, R., & Pruitt, J. C. (2000). An investigation of current models of second language speech perception: The case of Japanese adults’ perception of English consonants. Journal of the Acoustical Society of America, 107, 2711–2724.
Hagiwara, R. (1997). Dialect variation and formant frequency: The American English vowels revisited. Journal of the Acoustical Society of America, 102, 655–658.
Hillenbrand, J. (2011). Static and dynamic approaches to understanding vowel perception. In: G. S. Morrison, & P. F. Assmann (Eds.), Vowel inherent spectral change. Heidelberg: Springer.
Hillenbrand, J., Getty, L. A., Clark, M. J., & Wheeler, K. (1995). Acoustic characteristics of American English vowels. Journal of the Acoustical Society of America, 97, 3099–3111.
Jia, G., Strange, W., Wu, Y., Collado, J., & Guan, Q. (2006). Perception and production of English vowels by Mandarin speakers: Age-related differences vary with amount of L2 exposure. Journal of the Acoustical Society of America, 119, 1118–1130.
Johnson, J. S., & Newport, E. L. (1989). Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cognitive Psychology, 21, 60–99.
Lee, B., Guion, S. G., & Harada, T. (2006). Acoustic analysis of the production of unstressed English vowels by early and late Korean and Japanese bilinguals. Studies in Second Language Acquisition, 28, 487–513.
Levelt, C. C., Schiller, N. O., & Levelt, W. J. (1999/2000). The acquisition of syllable types. Language Acquisition, 8, 237–264.
Levy, E. S. (2009a). Language experience and consonantal context effects on perceptual assimilation of French vowels by American-English learners of French. Journal of the Acoustical Society of America, 125, 1138–1152.
Levy, E. S. (2009b). On the assimilation–discrimination relationship in American English adults’ French vowel learning. Journal of the Acoustical Society of America, 126, 2670–2682.
Levy, E. S., & Strange, W. (2008). Perception of French vowels by American English adults with and without French language experience. Journal of Phonetics, 36, 141–157.
Lindblom, B. (1963). Spectrographic study of vowel reduction. Journal of the Acoustical Society of America, 35, 1773–1781.
Llisterri, J. (1995). Relationships between speech production and speech perception in a second language. In: K. Elenius, & P. Branderud (Eds.), Proceedings of the 13th international congress of phonetic sciences (pp. 92–99). Stockholm: Arne Stombergs.
Maddieson, I. (1984). Patterns of sounds. Cambridge: Cambridge University Press.
Major, R. (1987). Phonological similarity, markedness, and rate of L2 acquisition. Studies in Second Language Acquisition, 9, 63–82.
McGarr, N., & Whitehead, R. (1992). Contemporary issues in phoneme production by hearing impaired persons: Physiological and acoustic aspects. The Volta Review, 94, 33–45.
Miller, J. L. (1981). Effects of speaking rate on segmental distinctions. In: P. D. Eimas, & J. L. Miller (Eds.), Perspectives on the study of speech (pp. 39–74). Mahwah, New Jersey: Erlbaum Associates.
Mora, J. C., & Fullana, N. (2007). Production and perception of English /i/–/ɪ/ and /æ/–/e/ in a formal setting: Investigating the effects of experience and starting age. In: J. Trouvain, & W. Barry (Eds.), Proceedings of the 16th international congress of phonetic sciences (pp. 1613–1616). Saarbrücken: Universität des Saarlandes.
Munro, M. J., & Derwing, T. M. (1999). Foreign accent, comprehensibility and intelligibility in the speech of second language learners. Language Learning, 49, 285–310.
Nearey, T. M., & Assmann, P. F. (1986). Modeling the role of inherent spectral change in vowel identification. Journal of the Acoustical Society of America, 80, 1297–1308.
Pallier, C., Bosch, L., & Sebastián-Gallés, N. (1997). A limit on behavioral plasticity in speech perception. Cognition, 64, B9–B17.
Peterson, G. E., & Barney, H. L. (1952). Control methods used in a study of the vowels. Journal of the Acoustical Society of America, 24, 175–184.
Pickering, J. B., & Rosner, B. S. (1993). The Oxford acoustic database. Oxford: OUP.
Piske, T., Flege, J. E., MacKay, I. R. A., & Meador, D. (2002). The production of English vowels by fluent early and late Italian-English bilinguals. Phonetica, 59, 49–71.
Rallo Fabra, L. (2005). Predicting ease of acquisition of L2 speech sounds: A perceived dissimilarity test. Vigo International Journal of Applied Linguistics, 2, 75–92.
Recasens, D., & Espinosa, A. (2006). Dispersion and variability of Catalan vowels. Speech Communication, 48, 645–666.
Reeder, J. T. (1998). An acoustic description of the acquisition of Spanish phonetic detail by adult English speakers. Unpublished Doctoral Dissertation. Austin, TX: University of Texas.
Reeder, J. T. (1999). Acquisition of a second language vowel system: Evidence from English speakers learning Spanish. In: J. Ohala, Y. Hasegawa, M. Ohala, D. Granville, & A. Bailey (Eds.), Proceedings of the 14th international congress of phonetic sciences (pp. 1475–1478). Berkeley, CA: University of California.
Rochet, B. L. (1995). Perception and production of second-language speech sounds by adults. In: W. Strange (Ed.), Speech perception and linguistic experience (pp. 379–410). Baltimore: York Press.
Sanders, L., Yamada, Y., & Neville, H. J. (1999). Speech segmentation by native and non-native speakers: An ERP study. Society for Neuroscience Abstracts, 25, 358.
Snodgrass, J., Levy-Berger, G., & Haydon, M. (1985). Human experimental psychology. New York: Oxford University Press.
Strange, W. (1995). Phonetics of second-language acquisition: Past, present, future. In: K. Elenius, & P. Branderud (Eds.), Proceedings of the XIII international congress of phonetic sciences (pp. 76–83). Stockholm: Arne Stombergs.
Strange, W. (2007). Cross-language phonetic similarity of vowels. Theoretical and methodological issues. In: O.-S. Bohn, & M. J. Munro (Eds.), Language experience in second language speech learning: In honor of James E. Flege (pp. 35–55). Amsterdam: John Benjamins.
Strange, W., Weber, A., Levy, E. S., Shafiro, V., Hisagi, M., & Nishi, K. (2007). Acoustic variability within and across German, French and American English vowels: Phonetic context effects. Journal of the Acoustical Society of America, 122, 1111–1129.
Tsukada, K., Birdsong, D., Bialystok, E., Mack, M., Sung, H., & Flege, J. E. (2005). A developmental study of English vowel production and perception by native Korean adults and children. Journal of Phonetics, 33, 263–290.
Wode, H. (1999). Perception and production in early L1 acquisition and some theoretical implications. In: J. Ohala, Y. Hasegawa, M. Ohala, D. Granville, & A. Bailey (Eds.), Proceedings of the XIVth international congress of phonetic sciences (pp. 1265–1268). Berkeley, CA: University of California.
Yang, B. (1996). A comparative study of American English and Korean vowels produced by male and female speakers. Journal of Phonetics, 24, 245–261.
