
Feature Article

Using Intelligent Personal Assistants for Second Language
Learning: A Case Study of Alexa

GILBERT DIZON
Himeji Dokkyo University

The proliferation of smartphones has given rise to intelligent
personal assistants (IPAs), software that helps users accomplish
day-to-day tasks. However, little is known about IPAs in the
context of second language (L2) learning. Therefore, the primary
objectives of this case study were twofold: to assess the ability
of Amazon’s IPA, Alexa, to understand L2 English utterances
and to investigate student opinions of the IPA. Four university
students of English as a foreign language (EFL) in Japan
participated in the study, which involved each participant interacting
with Alexa in a 20-min session. Three sets of data were collected
and analyzed to achieve the study’s aims: learner-generated
command performance, interactive storytelling performance,
and interviews. The quantitative results showed that Alexa
accurately understood only 50% of learner commands, whereas
comprehensibility during the interactive storytelling skill,
Earplay, was much higher (94%). Three themes were identified
from the interviews based on criteria developed by Hubbard
(2009): hindered learner efficiency due to the lack of first
language (L1) support, improved learner effectiveness through
indirect pronunciation feedback, and better access to
conversational opportunities. These findings demonstrate that EFL
learners perceive Alexa to be a potentially useful tool to enhance
language learning and underscore the need for additional (L2)
research of IPAs.
doi: 10.1002/tesj.353

TESOL Journal 8.4, December 2017
© 2017 TESOL International Association

In an article discussing massive open online courses (MOOCs),
Delbanco (2013) contemplated a world where language teachers
are made obsolete due to advances in technology: “But if the new
technology threatens some professors with obscurity, others face
obsolescence. Language instructors may someday be replaced by
multilingual versions of Siri on your iPhone” (para. 24). Although
this statement might seem outrageous, the features of intelligent
personal assistants (IPAs) may paint the illusion that fully
autonomous language learning with artificial intelligence (AI) is
just around the corner. However, as Godwin-Jones (2014) points
out, real-world communicative ability involves not only the use of
vocabulary but also strategic and pragmatic competence. Thus, the
likelihood of these skills being developed solely through AI
without human interaction is low. Nevertheless, there is still a
need to investigate these emerging technologies to see if they can
be used to support second language (L2) development,
particularly due to their potential to foster autonomous learning
among students. Thus, the primary aims of this study were to
assess learners’ views of the use of Alexa, an IPA developed by
Amazon, as well as to determine its reliability in comprehending
L2 English utterances.

THEORETICAL FRAMEWORK
This study draws from an interactionist perspective, which
stresses the importance of interaction in second language
acquisition (Long, 1996). According to Chapelle (2005), interaction
provides three key benefits: opportunities for negotiation of
meaning, obtaining enhanced input, and directing attention to
linguistic form. According to interactionist theory, negotiation of
meaning is particularly important because the ongoing process of
conversational modifications helps learners make stronger
connections between form and meaning (Long, 1996; Pica, 1994).
While interaction is generally referred to as interpersonal activity
between two or more people, Chapelle (2005) posits that it should
also include interaction between a person and a computer. Given
this, Alexa seems to be a promising tool for L2 learning as it offers
opportunities for learners to interact with an IPA.
Several studies have incorporated interactionist theory in
computer-assisted language learning (CALL) research. Nagata
(1995) looked at the use of software that provided learners with
intelligent feedback and found that it helped them notice gaps in
their linguistic knowledge as well as improve their L2
pronunciation. In another study, Blake (2000) investigated the
efficacy of jigsaw tasks via online chat between pairs and found
that the activities promoted negotiation of L2 vocabulary more
effectively than other tasks (information gap, decision-making,
opinion tasks). A study by Pellettieri (2000) also examined L2
learners’ use of online chat versus the oral mode to complete tasks
which were intended to promote negotiation of form and meaning.
She discovered that computer-mediated chat encouraged
negotiation to the same degree as the speaking tasks. Moreover,
the computer-mediated chat may have enhanced the learners’
awareness of L2 forms and as a result made them more likely to
modify their language. In a similar study, Kötter (2003) found that
tandem learning through object-oriented multiple-user domains or
MOOs promoted negotiation in the L2. While the tasks were not
originally designed to enhance negotiation, the participants in his
study engaged in interaction in order to collaborate on projects,
with the first language (L1) also playing a key role in the
interactions. Based on the findings of the aforementioned studies,
it is clear that interaction is a significant factor in creating a fertile
environment for L2 learning to occur.

BACKGROUND

Automatic Speech Recognition


While research on IPAs and L2 learning is sparse (Moussalli &
Cardoso, 2016), more L2 studies have been done in the field of
automatic speech recognition (ASR), which is the technology that
allows users to speak and be understood by IPAs and other
related software. ASR converts speech to text, which in turn allows
IPAs to respond to users accordingly. Liakin, Cardoso, and
Liakina (2015) found that ASR had a greater positive effect on L2
French learners’ pronunciation of /y/ than a control group which
practiced pronunciation with a teacher. These results are in line
with the findings of Neri, Mich, Gerosa, and Giuliani (2008) who
incorporated ASR-based computer-assisted pronunciation training
(CAPT) in a study involving learners of English as a foreign
language (EFL). Although the CAPT group in their study received
half the amount of pronunciation training as the teacher-instructed
group, the students learning through CAPT improved at a
comparable level. According to the researchers, this was due to the
fact that the learners using CAPT software could receive
undivided attention while the teacher-led students were provided
with little individual support. These findings suggest that L2
students training with ASR-based CAPT learn more efficiently
than those studying through more traditional means. Similar
results were found in a recent study by Luo (2016). Her findings
showed that a combination of peer review and CAPT led to
greater overall improvements in L2 English pronunciation than a
control group. Moreover, a comprehensive study by Chiu, Liou,
and Yeh (2007) demonstrated that web-based ASR software could
improve oral proficiency among university EFL students in
Taiwan. What is interesting is that the non–English major students
in their study had more significant improvements than the
learners majoring in English, suggesting that beginner learners
might have more to gain from ASR activities. Despite the positive
results found in these studies, ASR-based systems do not always
outperform traditional teacher-led training. A study with L2 Dutch
learners by Neri, Cucchiarini, and Strik (2008) found that there
were no significant differences in pronunciation improvements
between the CAPT group and the control group. The researchers
posited that intensified training combined with a more targeted
system catering to speakers of the same language could have led
to greater gains among the students using CAPT.
In addition to the positive effects that ASR can have on L2
students’ oral skills, language learners seem to have mostly
positive perceptions towards their use. The participants in Neri,
Mich, et al.’s (2008) study thought that the ASR-based CAPT
system was beneficial to their pronunciation training. Moreover,
the students enjoyed the CAPT during the lessons, which is an
aspect that should not be overlooked, because motivational factors
are an essential part of developing pronunciation (Moyer, 1999).
Chiu et al. (2007) also had similar findings. The learners in their
study believed that ASR simulated real-life conversation, which
led to improvements in their speaking skills. However, the one
aspect of ASR that was viewed less favorably in Chiu et al.’s
(2007) study was the recognition rate; some of the students may
have had difficulty being understood by the ASR software. The
researchers posited that this may have been due to several factors,
including students’ mispronunciation, ASR misjudgment of
utterances, the limited size of the corpus, as well as the
effectiveness of the hardware. These issues speak to the need to
cater ASR-based activities to suit the backgrounds and abilities of
a particular group of learners.

IPAs and L2 Learning


Santos et al. (2016) define IPAs as “software agents that can
automate and ease many of the daily tasks of their users” (p. 194).
For most users of IPAs, this means aiding in the completion of
day-to-day activities such as searching for things, creating
reminders, or setting alarms and timers. However, recent advances
to IPAs have increased their functionality, enabling them to do
more complex tasks including controlling other Internet-connected
devices as well as interacting with third-party services. Bellegarda
(2014) states that advances of IPAs might have even greater
implications: “Far beyond merely offering more options in terms
of user interface, this trend may well usher in a genuine paradigm
shift in man–machine communication” (p. 3). The ability to have
authentic communication between people and machines via AI has
the potential to transform language learning. Therefore, it is
essential to investigate how IPAs and similar technologies can help
support second and foreign language development in and outside
of the classroom.
To the best of the author’s knowledge, there has only been
one study on IPAs in the context of L2 learning. In the study,
Moussalli and Cardoso (2016) examined the use of the Amazon
Echo and its companion IPA, Alexa, with four EFL learners. The
students interacted with Alexa via the Echo, using a variety of
set questions and commands as well as learner-generated
questions. Afterward, a questionnaire was administered and the
learners were interviewed in order to gain a comprehensive
understanding of their views on the Echo. Moussalli and
Cardoso (2016) found that the participants had positive opinions
toward the device. Specifically, the learners felt comfortable
speaking with the Echo, were able to understand its utterances,
and believed that the overall experience with the Echo was
enjoyable. Moreover, they thought it was a useful tool for
language development, particularly for learning pronunciation
and vocabulary.
In sum, studies on ASR suggest that the technology may have a
positive impact on the improvement of L2 skills, especially in
terms of the development of pronunciation. Additionally,
language learners seem to find its usage enjoyable, which is an
important affective factor in maintaining students’ motivational
levels. However, little is known about the use of IPAs in L2
contexts. In particular, it is still unclear whether or not language
learners view IPAs as beneficial to their studies. Additionally, the
accuracy of IPAs in understanding EFL students remains in
question; indeed, Moussalli and Cardoso (2016) found this to be an issue.
Thus, the following research questions were addressed in this
study:
(1) How accurate is Alexa at understanding L2 English utterances?
(2) What are EFL students’ opinions of Alexa to improve language learning?

METHODOLOGY

Research Design
A mixed-methods case study design was utilized in this study.
According to Creswell (2007), case study research incorporates
“the study of an issue explored through one or more cases
within a bounded system (i.e., a setting, a context)” (p. 73). In
the present study, the bounded system was a group of four EFL
students at a Japanese university. Each participant met
individually with the researcher for approximately 20 minutes in
order to interact with Alexa and reflect upon their experiences
with the IPA. Quantitatively, the study investigated the accuracy
of Alexa to understand L2 utterances under two conditions:
learner-generated commands and interactive storytelling. The
qualitative aim of this study was to examine the participants’
views of Alexa according to a set of criteria developed by
Hubbard (2009).
Participants
Two male and two female 2nd-year EFL learners studying at a
Japanese university provided written consent to participate in the
study. All of them were taking an elective class entitled
Communicative English within the Department of Foreign Studies
which was taught by the researcher. The participants were chosen
due to their upper-intermediate to advanced proficiency levels
(TOEIC exam scores ranging from 655 to 740). Prior to interacting
with Alexa, the students attended a brief orientation which
outlined the goals of the study and informed them of what they
would be asked to do.

Alexa
Alexa, the companion IPA to the Amazon Echo, was adopted in
this study for several reasons. First, the Echo is more affordable
than many other devices with IPAs such as smartphones and
tablets, a factor that should not be overlooked considering the
financial constraints of many language teachers and students
alike when adopting technology (Bateson & Daniels, 2012). In
addition, Alexa has the ability to gain numerous downloadable
“skills” through the Alexa Skills Store which makes it more
versatile than other IPAs, especially in the context of language
learning. For instance, applications such as Vocabulary Master
(quizzes users on different levels of vocabulary) and Today’s
Word of the Day (introduces a new word daily with its
definition and an example of the word in use) have the potential
to help students develop their L2 English vocabulary. Other
applications may also support listening and speaking
development; specifically, interactive audio storytelling
skills like Earplay, which is featured in this study. Last, while
other IPAs are largely tied to an individual line of devices—for
example, Siri with iPhones or iPads and Google Assistant with
Android mobile devices and Google Home—Alexa has been
enabled in a wide range of products including TVs, home
appliances, robots, portable audio accessories, and cars
(Kastrenakes, 2017). This greatly extends Alexa’s potential reach,
allowing a learner to use it virtually anytime and anywhere.
Earplay
According to its description on the Amazon Skills Store (Amazon,
2017), Earplay is a “storytelling medium where you become part
of the story, interacting with characters to affect how the events
unfold” (para. 1). When the skill is opened, a narrator guides users
through an interactive story, prompting them at different times to
make choices by orally responding to the Echo. An excerpt from
one of the participant’s Earplay interactions with Alexa is shown
in Table 1 to illustrate the types of exchanges that are possible
through the application.

Data Collection and Analysis


Three sets of data were collected and analyzed by the researcher:
command performance, storytelling performance, and interviews.
The command performance data consisted of five
learner-generated commands that were prepared in advance and
adjusted for grammar mistakes (Table 2; see the Appendix). If
Alexa did not understand or misunderstood a given command,
the learners were asked to
repeat the command up to two additional times. However, if
Alexa was not able to accurately understand the same student
utterance a third time, the student was asked to move on to the
next command.

TABLE 1. Extract From an Earplay Interaction

Earplay narrator: It’s a great restaurant but your guest is late, he’s always
  late. You’ve texted twice and now you’re toying with the silverware,
  wondering if you should just order, when suddenly a young woman you’ve
  never seen in your life sits down at your table. She looks anxious.
  “Pretend you know me.” Do you ask questions or play along?
Student: Play along.
Earplay narrator: “So glad you could make it,” you say. “I’m starving. Let’s
  order.” The woman smiles, then looks nervously around the room as you
  motion for the waiter. A server comes over and puts two drinks on your
  table. It’s not the same waiter who’s been refilling your water glass.
  And you never ordered any drinks. You can send them back, take a drink,
  or leave them alone.
Student: Take a drink.

TABLE 2. The Alexa Commands

1. What’s the definition of [word]?
2. How do you spell [word]?
3. Wikipedia [topic].
4. What’s in the news?
5. General question which varied from student to student.

The second phase of data collection involved the
Earplay demo. After each prompt in the interactive audio story,
the learners were given approximately six seconds to respond. If
the application could not understand the student or if the learner
gave no response within this set period of time, the prompt
would be replayed. With regard to assessing Alexa’s accuracy to
understand the L2 students, the transcripts were analyzed and
divided into categories depending on Alexa’s response to a given
command. For the first part of the session in which Alexa was
given five learner-generated commands, the student utterances
were divided into four categories: understood, not understood,
misunderstood, and wake word error. Wake word errors refer to
each time Alexa was not able to pick up the specific trigger
phrase, that is, the word Alexa, which enables it to listen and
respond to a user’s commands. The utterances in the second part,
which consisted of student responses in the interactive audio
storytelling skill Earplay, were categorized into three groups:
understood, not understood, and no response. No response refers
to each time a user did not give a response to a prompt.
After the completion of the interactive session with Alexa,
semi-structured interviews were conducted to evaluate the
students’ opinions. Interview data were audio-recorded,
transcribed, and then analyzed to identify the learners’ viewpoints
of Alexa as it pertains to six criteria created by Hubbard (2009) to
evaluate how CALL can improve the language learning process:
(1) learner efficiency, (2) learner effectiveness, (3) access, (4)
convenience, (5) motivation, and (6) institutional efficiency. As
Hubbard (2009) points out, the use of CALL can in fact hinder
improvement in one or more of these areas and language learning
outcomes may not be achieved simply through the usage of
technology.
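The tallying procedure described above can be illustrated with a short sketch. This is not the author's actual analysis script; the category labels mirror those used in the study, and the session data are hypothetical.

```python
# Illustrative sketch: counting categorized utterances and computing a
# comprehensibility rate, as described in the analysis procedure above.
from collections import Counter

def comprehensibility_rate(categorized_utterances):
    """Return the share of utterances Alexa understood, plus raw counts."""
    counts = Counter(categorized_utterances)
    total = sum(counts.values())
    rate = counts["understood"] / total if total else 0.0
    return rate, counts

# Hypothetical data for one learner's command session (repeats included).
session = ["understood", "misunderstood", "wake word error",
           "understood", "not understood", "understood"]
rate, counts = comprehensibility_rate(session)
print(f"{rate:.0%}")  # 3 of 6 utterances understood -> 50%
```

The same function applies unchanged to the Earplay responses, where the categories are understood, not understood, and no response.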



RESULTS AND DISCUSSION

RQ#1: How Accurate Is Alexa at Understanding L2 English Utterances?
Comprehensibility of command performance was quite poor, with
Alexa only understanding 50% of the students’ total commands
(N = 36). Misunderstood commands were the most common
reason for the inaccuracy (28%), followed by wake word error
(17%) and commands which were not understood by Alexa (5%).
These results mirror the findings of Moussalli and Cardoso (2016)
and Chiu et al. (2007), who found that L2 students may struggle
with being understood by ASR-based systems. The findings are
also in line with interactionist theory and illustrate that IPAs can
help learners become aware of gaps in their L2 pronunciation,
thereby forcing them to modify their output in order to be
understood by Alexa.
As shown in Figure 1, questions 1–3 had a considerably lower
level of comprehensibility than questions 4–5, suggesting that the
students were able to improve their pronunciation accuracy as
they spent more time with Alexa. As Fischer (2012) states, learner
experience and knowledge play significant roles in determining
the effectiveness that a student has with a given technology. These
results highlight the importance of experience and demonstrate
that learners ought to be given time to familiarize themselves with
Alexa and other IPAs in order to be effective users.
One explanation for the low level of accuracy that was found
during the command performance session is that Alexa supports
only three languages: American English,
British English, and German. Therefore, speakers of other varieties
of English as well as L2 speakers may have difficulty being
understood by Alexa, which may limit its usability to native
speakers and nonnative speakers with advanced proficiency levels.
Figure 1. Comprehensibility rate per learner-generated question

TABLE 3. The Comprehensibility Rate of Each Student

Student       Comprehensibility rate
Student #1    83%
Student #2    30%
Student #3    45%
Student #4    56%

Another reason why the accuracy was so poor can be
attributed to a single participant whose utterances were only
understood 30% of the time, as depicted in Table 3. This was less
than half of the mean comprehensibility rate of the other three
participants (61%). The primary reason for this discrepancy was
the commands given by the student. Specifically, the
learner asked for the definition of the word law and the spelling of
learner asked for the definition of the word law and the spelling of
the word raw. All six of his attempts regarding these commands
were either misunderstood or not understood by Alexa. These
results correspond to research which shows that Japanese students
struggle with the production of the /ɹ/ and /l/ sounds in English,
even among advanced learners who have experience living in
English-dominant countries (Larson-Hall, 2006).
The rate of comprehensibility was much higher during
interactive storytelling. Ninety-four percent of the students’
utterances were understood by Alexa. This accounted for 31 out of
the 33 total responses given by the learners in the session. In fact,
only one of the four participants had issues being understood or
responding fast enough to each prompt. The most likely reason for
the high rate of accuracy during the Earplay demo is that the
Alexa skill was programmed to listen for specific words in each
user’s answers and designated the choice that most closely
resembled one of the predetermined responses. Thus, there was
likely a greater range of responses that the skill could comprehend
compared with learner commands where it had to understand the
complete utterance in order to respond accurately. These results
suggest that Alexa may be best suited to understand L2 utterances
when students are given prompts to choose from. This may limit
the functionality of Alexa to promote L2 speaking ability, but it
would likely increase the accuracy of IPAs to understand
nonnative speakers of English or other languages.
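The keyword-spotting behavior described above can be sketched as follows. This is a hedged illustration of the general technique, not Earplay's actual implementation; the choices and keywords are hypothetical.

```python
# Illustrative keyword-spotting sketch: a skill accepts any response that
# contains a keyword from one of its predetermined choices, rather than
# requiring the complete utterance to be understood.
def match_choice(transcribed_response, choices):
    """Return the first predetermined choice whose keyword appears in the
    lowercased transcription, or None if nothing matches."""
    words = transcribed_response.lower().split()
    for choice, keywords in choices.items():
        if any(keyword in words for keyword in keywords):
            return choice
    return None

choices = {"play along": ["play", "along"],
           "ask questions": ["ask", "questions"]}
print(match_choice("uh, play along I guess", choices))  # -> play along
print(match_choice("tell me more", choices))            # -> None
```

Because any utterance containing a target keyword counts as a match, mispronunciations elsewhere in the response do not prevent recognition, which is consistent with the higher comprehensibility rate observed during the Earplay sessions.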

RQ#2: What Are EFL Students’ Opinions of Alexa to Improve Language Learning?
According to the interview data analyses, three criteria from
Hubbard (2009) were identified. The first was learner efficiency. As
shown in Table 4, all four of the students noted that incorporating
multiple language support, specifically Japanese, could have led to
better usability of Alexa. As noted above, Alexa currently
supports only three languages, which may restrict its usability to
native speakers and nonnative speakers with advanced proficiency
levels. Even upper-intermediate to advanced learners like the ones
included in this study may have difficulty effectively leveraging
all the features of Alexa without the incorporation of other
languages. Therefore, it can be concluded that learner efficiency was
impeded in the eyes of the learners due to the limited language
support of Alexa, which aligns with K€ otter’s (2003) research which
suggests that the L1 can be used to help support interaction in the
L2.
In addition, all of the students mentioned that they could
receive pronunciation feedback through Alexa. While the IPA does
not provide explicit feedback or correction, participants’ comments
suggest it does provide implicit feedback whenever the learners’
utterances are not understood. In other words, Alexa promoted
learner effectiveness by providing valuable feedback in a key area of
oral development. This complements the findings of Moussalli and
Cardoso (2016) and supports the notion that interaction via CALL
can support L2 pronunciation ability (Nagata, 1995). By receiving
indirect feedback, the students’ attention was directed toward
errors in their pronunciation, thereby encouraging them to correct
these mistakes in subsequent exchanges.

TABLE 4. Assessment of Students’ Comments

Hubbard’s (2009) CALL evaluation criterion: 1. Learning efficiency
Description: Learners are able to pick up language knowledge or skills
faster or with less effort.
Students’ comments:
  1. “If it has the ability to understand Japanese it is very useful. If I
     can find the meaning of the words by using Japanese I could understand
     it fully.”
  2. “If Alexa could understand my Japanese, I could ask things that I
     don’t know in Japanese.”
  3. “It would be better to if I could hear Japanese definitions. For
     example, if Alexa could say the definition in English and after the
     definition in Japanese too.”
  4. “It is hard for us to continue the [interactive audio] story because
     of our pronunciation. . . So maybe in the future they could add some
     new elements like different languages to help us continue the story or
     with our studies.”

Hubbard’s (2009) CALL evaluation criterion: 2. Learning effectiveness
Description: Learners retain language knowledge or skills longer, make
deeper associations, and/or learn more of what they need.
Students’ comments:
  1. “By talking to Alexa it can tell if my pronunciation is good or
     bad. . . When I try to talk to Alexa it didn’t work sometimes. It
     means my pronunciation was bad or difficult to understand.”
  2. “I thought studying with Alexa is very good to know my pronunciation.
     For example, I’m curious to ask about /ɹ/ sound and /l/ sound to Alexa
     but Alexa couldn’t understand my pronunciation of the words.”
  3. “And also, I know if Alexa can understand my speaking or not for
     pronunciation practice.”
  4. “Also, she can check our pronunciation.”

Hubbard’s (2009) CALL evaluation criterion: 3. Access
Description: Learners can get materials or experience interactions that
would otherwise be difficult or impossible to get or do.
Students’ comments:
  1. “It’s interesting because usually I talk to someone in English only in
     class. But I have a chance to hear a native speaker’s voice through it
     so it is useful.”
  2. “Maybe it’s useful to improve my English skills in the point that it’s
     actually speaking English and I have to respond so it is very nice for
     me. So, using this is like a conversation to respond to Alexa.”
  3. “I just feel like in that situation and it can also improve my
     listening skills and if I need to focus on the story I have to listen
     to it very carefully. And I also I could choose the different answers
     to see how the story goes. So, I think it is a very good way to
     practice your speaking skills and listening skills.”
Also, three of the participants noted that Alexa could simulate
conversations in the target language, which aligns with the views
of the participants in Chiu et al.’s (2007) study. As Nishino and
Watanabe (2008) point out, “Japanese students—like many other
foreign language learners—have few opportunities to engage in
face-to-face communication in English outside the classroom” (pp.
134–135). Given the few chances that L2 learners have to use
English in a meaningful way outside of the classroom, Alexa
seems to be a promising tool to promote autonomous language
learning through simulated conversations. Based on these results,
it can be inferred that Alexa can provide students with better
access to opportunities for interaction in the L2 that they otherwise
might not have. These findings underscore a key benefit of CALL
in that it provides learners with more opportunities for
meaningful interaction in the L2 (Chapelle, 2007).
In summary, the accuracy of Alexa to understand L2 utterances
was moderate. Although the IPA was able to understand more
than 90% of the student utterances during interactive storytelling,
learner-generated commands were comprehended only 50% of the
time. These findings indicate that using interactive storytelling
skills may be an effective way to maximize Alexa’s ability to
understand L2 utterances. In terms of the perceptions of Alexa for
language learning, three themes based on Hubbard’s (2009) criteria
were identified: (1) hindered learner efficiency due to the lack of L1
support, (2) improved learner effectiveness through indirect
pronunciation feedback, and (3) better access to conversational
opportunities in the L2. These results support research that shows
ASR-based systems are viewed positively by language learners,
particularly in the context of pronunciation and speaking practice
(Chiu et al., 2007; Neri, Mich, et al., 2008).

CONCLUSION
As IPAs continue to improve and add new capabilities, more and
more students will look to them as a way to support autonomous
language learning. Therefore, it is critical to understand how IPAs
such as Alexa can better serve the needs of language students.
Based on the participants’ comments, the inclusion of L1 support
would improve its usability and give them the ability to clarify
anything they do not understand as well as allow the learners to
give commands in their native language if they are not
understood by Alexa in the L2. Moreover, the results of this study
suggest that Alexa has the potential to support L2 development by
providing implicit feedback on pronunciation, which mirrors
previous research by Moussalli and Cardoso (2016). Additionally,
IPAs can afford students opportunities to practice their L2
speaking skills through simulated conversations, which is
especially challenging for EFL learners (Nishino & Watanabe,
2008). These benefits take on greater significance considering the
positive impact that interaction has on L2 development (Long,
1996). While most CALL studies on interaction have focused on
the impact of computer-mediated communication (Blake, 2000;
Kötter, 2003; Pellettieri, 2000), more researchers ought to look at
how IPAs can promote negotiation of meaning and attention to
form among L2 learners.
One of the limitations of this study is its small sample size,
which included only four EFL learners in Japan, thereby
limiting the generalizability of the findings. Moreover, future
research could administer questionnaires in addition to interviews
in order to gain a more comprehensive understanding of learner
views toward IPAs. Last, no attention was paid to the actual
linguistic improvements the learners could make through Alexa;
as a result, investigating pedagogically sound ways to
enhance language development with IPAs may be another
interesting avenue of study.

THE AUTHOR
Gilbert Dizon is a lecturer at Himeji Dokkyo University, Japan. His
major research interests are focused on the use of social
networking sites, intelligent personal assistants, and mobile apps
to enhance L2 teaching and learning.

REFERENCES
Amazon. (2017). Earplay. Retrieved from https://www.amazon.com/Earplay/dp/B01K8V6NSI
Bateson, G., & Daniels, P. (2012). Diversity in technologies. In G.
Stockwell (Ed.), Computer-assisted language learning: Diversity in
research and practice (pp. 127–146). New York, NY: Cambridge
University Press. https://doi.org/10.1017/cbo9781139060981
Bellegarda, J. R. (2014). Spoken language understanding for natural
interaction: The Siri experience. In J. Mariani, S. Rosset, M.
Garnier-Rizet, & L. Devillers (Eds.), Natural interaction with robots,
knowbots and smartphones (pp. 3–14). New York, NY: Springer.
Blake, R. (2000). Computer-mediated communication: A window
on L2 Spanish interlanguage. Language Learning and Technology,
4(1), 120–136. Retrieved from http://llt.msu.edu/vol4num1/
blake/
Chapelle, C. (2005). Interactionist SLA theory in CALL research. In
J. L. Egbert & G. M. Petrie (Eds.), CALL research perspectives (pp.
53–64). Mahwah, NJ: Lawrence Erlbaum.
Chapelle, C. (2007). Technology and second language acquisition.
Annual Review of Applied Linguistics, 27, 98–114. https://doi.
org/10.1017/S0267190508070050
Chiu, T.-L., Liou, H.-C., & Yeh, Y. (2007). A study of web-based
oral activities enhanced by automatic speech recognition for
EFL college learning. Computer Assisted Language Learning, 20,
209–233. https://doi.org/10.1080/09588220701489374
Creswell, J. W. (2007). Qualitative inquiry and research design: Choosing
among five traditions (2nd ed.). Thousand Oaks, CA: Sage.
Delbanco, A. (2013). MOOCs of hazard. The New Republic.
Retrieved from http://www.newrepublic.com/article/
112731/moocs-will-online-education-ruin-university-experience
Fischer, R. (2012). Diversity in learner usage patterns. In G.
Stockwell (Ed.), Computer-assisted language learning: Diversity in
research and practice (pp. 14–32). New York, NY: Cambridge
University Press. https://doi.org/10.1017/cbo9781139060981
Godwin-Jones, R. (2014). Global reach and local practice: The
promise of MOOCS. Language Learning and Technology, 18(3), 5–15.
Retrieved from http://llt.msu.edu/issues/october2014/emerging.pdf
Hubbard, P. (2009). Computer assisted language learning: Critical
concepts in linguistics (Vols. I–IV). London, England: Routledge.
Kastrenakes, J. (2017). Amazon’s Alexa is everywhere at CES 2017.
The Verge. Retrieved from http://www.theverge.com/ces/
2017/1/4/14169550/amazon-alexa-so-many-things-at-ces-2017
Kötter, M. (2003). Negotiation of meaning and codeswitching in
online tandems. Language Learning and Technology, 7(2), 145–172.
Retrieved from http://llt.msu.edu/vol7num2/kotter/
Larson-Hall, J. (2006). What does more time buy you? Another
look at the effects of long-term residence on production
accuracy of English /r/ and /l/ by Japanese speakers.
Language and Speech, 49, 521–548. https://doi.org/10.1177/
00238309060490040401
Liakin, D., Cardoso, W., & Liakina, N. (2015). Learning L2
pronunciation with a mobile speech recognizer: French /y/.
CALICO Journal, 32(1), 1–20. https://doi.org/10.1558/cj.v32i1.
25962
Long, M. (1996). The role of the linguistic environment in second
language acquisition. In W. C. Ritchie & T. K. Bhatia (Eds.),
Handbook of second language acquisition (pp. 413–666). San Diego,
CA: Academic Press.
Luo, B. (2016). Evaluating a computer-assisted pronunciation
training (CAPT) technique for efficient classroom instruction.
Computer Assisted Language Learning, 29, 451–476. https://doi.
org/10.1080/09588221.2014.963123
Moussalli, S., & Cardoso, W. (2016). Are commercial “personal
robots” ready for language learning? Focus on second language
speech. In S. Papadima-Sophocleous, L. Bradley, & S. Thouësny
(Eds.), CALL communities and culture—Short papers from
EUROCALL 2016 (pp. 325–329). Research-publishing.net.
https://doi.org/10.14705/rpnet.2016.eurocall2016.583
Moyer, A. (1999). Ultimate attainment in L2 phonology. Studies in
Second Language Acquisition, 21, 81–108. https://doi.org/10.
1017/s0272263199001035
Nagata, N. (1995). An effective application of natural language
processing in second language instruction. CALICO Journal,
13(1), 47–67.
Neri, A., Cucchiarini, C., & Strik, H. (2008). The effectiveness of
computer-based speech corrective feedback for improving
segmental quality in L2 Dutch. ReCALL, 20, 225–243. https://
doi.org/10.1017/S0958344008000724
Neri, A., Mich, O., Gerosa, M., & Giuliani, D. (2008). The
effectiveness of computer assisted pronunciation training for
foreign language learning by children. Computer Assisted
Language Learning, 21, 393–408. https://doi.org/10.1080/
09588220802447651
Nishino, T., & Watanabe, M. (2008). Communication-orientated
policies versus classroom realities in Japan. TESOL Quarterly,
42, 133–138. https://doi.org/10.1002/j.1545-7249.2008.tb00214.x
Pellettieri, J. (2000). Negotiation in cyberspace: The role of chatting
in the development of grammatical competence in the virtual
foreign language classroom. In M. Warschauer & R. Kern
(Eds.), Network-based language teaching: Concepts and practice (pp.
59–86). Cambridge, England: Cambridge University Press.
Pica, T. (1994). Research on negotiation: What does it reveal about
second-language learning conditions, processes, and outcomes?
Language Learning, 44, 493–527. https://doi.org/10.1111/j.1467-
1770.1994.tb01115.x
Santos, J., Rodrigues, J. J. P. C., Silva, B. M. C., Casal, J., Saleem, K.,
& Denisov, V. (2016). An IoT-based mobile gateway for
intelligent personal assistants on mobile health environments.
Journal of Network and Computer Applications, 71, 194–204.
https://doi.org/10.1016/j.jnca.2016.03.014



APPENDIX

List of Learner-Generated Commands Used by Each Student

Student #1
1. Alexa, what’s the definition of management?
2. Alexa, how do you spell technology?
3. Alexa, Wikipedia Taylor Swift.
4. Alexa, what’s in the news?
5. Alexa, what’s the population of Japan?

Student #2
1. Alexa, what’s the definition of law?
2. Alexa, how do you spell raw?
3. Alexa, Wikipedia Nintendo.
4. Alexa, what’s in the news?
5. Alexa, who is the richest woman in the world?

Student #3
1. Alexa, get the definition of the word condition.
2. Alexa, how do you spell customer?
3. Alexa, Wikipedia guitar.
4. Alexa, what’s in the news?
5. Alexa, how tall is Tokyo Tower?

Student #4
1. Alexa, what’s the definition of marriage?
2. Alexa, how do you spell communication?
3. Alexa, Wikipedia dog.
4. Alexa, what’s in the news?
5. Alexa, what’s the weather like tomorrow?