ChangKazemiEsmaeiliDavies2023_VRtraining
4 authors, including:
Matthew Davies
California State University, Northridge
All content following this page was uploaded by Ellie Kazemi on 04 November 2023.
ABSTRACT
Researchers have conducted studies on integrating autonomous artificial
intelligence (AI) in Virtual Reality Training (VRT); however, little is known
about the effectiveness of these trainings and the skills typically taught.
Out of the 2,017 related articles found, there were 20 articles that met our
inclusionary criteria. We analyzed the 20 articles along the dimensions of
participant demographics (e.g. age, disability, ethnicity); skills taught;
measurement methods; components of VRTs (e.g. feedback, communication medium,
degree of immersion); effectiveness; and social validity. We also checked the
11 VRTs mentioned in the present review for components of behavior skills
training (BST). Our results showed that VRT is effective in teaching social,
safety, and professional skills (e.g. initiation of play, emergency bystander
intervention, job interview) to 1,144 participants, including children with
disabilities and adults with and without disabilities. Across the reviewed
articles, authors probed for skill generalization and found that the targeted
skills generalized across setting or time in 15 out of 20 (75%) articles.
Our results indicate that VRT is a flexible and viable option for scaling
BSTs, although additional research is needed for cost-benefit analysis.
Lastly, we discussed ways for behavior analysts to leverage VRTs with
autonomous AI and recommendations for future research.
KEYWORDS
artificial intelligence; autonomous; training; virtual reality
Figure 1. Literature search and review process. Other reasons for excluding an article from the present
review include: (a) IV or DV does not meet inclusion criteria; (b) articles were inaccessible; (c)
articles were reviews, meta-analyses, or dissertations; (d) duplicates of articles in the present review.
Method
We identified articles in PsycINFO and Academic Search Premier (see Figure 1)
using identical keywords, filters, and inclusionary/exclusionary criteria in each
search. Our keywords included: virtual reality, train* or teach*, and skill*,
adding asterisks to include similar variations on the terms (e.g., train, training,
trained). We set our search range from July 2010 until December 10th, 2022, to
control for any major technological advances in VR hardware and software.
We also selected “peer-reviewed empirical studies” and “English” filters to
narrow our search to experimental studies written in English. We found a total
of 2,017 articles between both databases at the time of the search.
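The keyword logic described above can be sketched in code. This is only a minimal illustration, not the databases' actual query engine: it assumes the three keyword groups were combined with AND (with train* and teach* as alternatives), treats each asterisk as "any word ending", and assumes the search window opened on July 1, 2010. The names `KEYWORD_PATTERNS` and `matches_search` are hypothetical.

```python
import re
from datetime import date

# Truncation wildcards (train*, teach*, skill*) rewritten as regular expressions.
# Assumption: all three keyword groups must be present (AND logic).
KEYWORD_PATTERNS = [
    re.compile(r"\bvirtual reality\b", re.IGNORECASE),
    re.compile(r"\b(?:train|teach)\w*", re.IGNORECASE),  # train, training, teaches, ...
    re.compile(r"\bskill\w*", re.IGNORECASE),            # skill, skills, skilled, ...
]

SEARCH_START = date(2010, 7, 1)   # assumption: first day of July 2010
SEARCH_END = date(2022, 12, 10)   # December 10th, 2022


def matches_search(text: str, published: date) -> bool:
    """Return True if a record falls in the date range and hits every keyword group."""
    in_range = SEARCH_START <= published <= SEARCH_END
    return in_range and all(p.search(text) for p in KEYWORD_PATTERNS)
```

For instance, a record titled "Virtual reality training for job interview skills" published in 2015 would pass this filter, whereas the same title published in 2009 would not.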
Inclusion/Exclusion criteria
We included articles in the review that met all the following criteria: (a)
dependent variables that were observable and measurable; (b) authors
used VRT to teach a specific skill; (c) the explicit presence of an
autonomous AI avatar; (d) learners interacted with an autonomous AI
avatar in the virtual world (e.g., using voice, text, or gestures). We
defined “autonomous AI avatar” as a computer-generated avatar that
independently responds to the learner’s behavior within the training
without the control of a confederate. We added autonomy as
a criterion to include the growing trend of organizations using machine
learning to improve strategic data-informed decision-making (Jordan &
4 A. A. CHANG ET AL.
Participants
Participant characteristics (e.g., age, disabilities, ethnicity/culture) can influence
responsiveness to intervention (Jones et al., 2020; Li et al., 2017). To
evaluate whether effects of VRTs generalized across diverse populations, we
recorded the number of participants in each study, their age (defining adults as
18 years or older and children as 17 years or younger), and any diagnoses of
mental and/or physical disability. Age and disabilities were the two participant
characteristics consistently reported across the articles. Two
articles included student populations, so we looked at age distribution to
identify any potential learning differences between adults and children using
VRTs. Similarly, we were interested in whether mental and physical disabilities
could impede learners’ response to VRT as an intervention.
The skills being taught within the VRTs varied with overlapping topographies;
however, the contexts in which each skill was taught were clearly discriminable.
For example, engaging in appropriate job-interview skills and engaging
in appropriate conversation while waiting for public transportation both
involve social skills, but only job-interview skills are occurring in the context
of a job. We evaluated the different contexts of each skill being trained by
examining three different categories: social, occupational, and safety contexts.
We defined occupational skills as any observable and measurable behavior for
job-related tasks (e.g., classroom management, interview skills, patient triage).
We defined social skills as any observable and measurable verbal behavior for
non-job-related community contexts (e.g., appropriate responses while waiting
for a bus, asking airport staff for directions). We defined safety skills as any
observable and measurable behavior for emergency protocols (e.g., bystander
CPR administration).
JOURNAL OF ORGANIZATIONAL BEHAVIOR MANAGEMENT 5
The databases used in the present review include literature from disciplines
beyond behavior analysis and organizational behavior management, such as
clinical psychology, nursing, and education. We reviewed the various measurement
methods used across the multiple disciplines to identify potential
trends in the literature around VR and training. We defined “direct measures”
as any instance of authors recording behavior based on their objective observation
(e.g., tracking rate of speech, scoring roleplays) and “indirect
measures” as any instance of authors recording behavior based on participants’
self-reports or third-party reports. Additionally, since every interaction
with the virtual world requires technology that can run code and track
interactions, we checked for explicitly stated use of automatic data collection
within the VRT.
Social validity
Following the recommendations set by Wolf (1978), we evaluated VRTs’ social
validity based on the significance of targeted goals, acceptability of procedures,
and satisfaction with the results across participants and stakeholders (e.g., parents,
teachers, and trainers). For goal significance, we checked if authors
conducted pre-assessments or surveys to identify participants’ baseline skill
levels. For acceptability and satisfaction of VRTs, we reviewed the reported
results of post-training surveys and assessments. We also noted complications
learners reported after using the VR equipment (e.g., discomfort and restricted
movement).
Human-AI interaction
We examined how the learners interacted with the AI in the program. We
categorized interaction modalities into “point-and-click”, “text chat”, and
“vocal chat”. We defined point-and-click as interactions where the learner
selected from pre-determined scripts programmed into the system (e.g., clicking
a response from a list when prompted). We defined text chat as
interactions where the learner verbally interacted with the system by typing
(e.g., typing responses with a keyboard). We defined vocal chat as interactions
where the learner communicated with the AI vocally via microphone. We
coded systems with multiple modalities as “yes” for all modalities used (e.g.,
both vocal chat and point-and-click). We included both guided response
(point-and-click) and free response (text chat and vocal chat) to determine
whether topography of learners’ responses would influence skill acquisition.
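The coding scheme above amounts to a multi-label record per VRT system. The following is a minimal sketch of that idea, assuming each system is coded "yes" or "no" on each of the three modalities; the class `ModalityCoding` and the function `code_modalities` are hypothetical names introduced for illustration and do not come from the review.

```python
from dataclasses import dataclass
from typing import Iterable

MODALITIES = ("point-and-click", "text chat", "vocal chat")


@dataclass(frozen=True)
class ModalityCoding:
    """One yes/no code per interaction modality for a single VRT system."""
    point_and_click: bool
    text_chat: bool
    vocal_chat: bool

    @property
    def guided_response(self) -> bool:
        # Guided response: learner selects from pre-determined scripts.
        return self.point_and_click

    @property
    def free_response(self) -> bool:
        # Free response: learner types or speaks their own response.
        return self.text_chat or self.vocal_chat


def code_modalities(observed: Iterable[str]) -> ModalityCoding:
    """Code a system as 'yes' for every modality it uses."""
    observed = set(observed)
    unknown = observed - set(MODALITIES)
    if unknown:
        raise ValueError(f"Unrecognized modality labels: {sorted(unknown)}")
    return ModalityCoding(
        point_and_click="point-and-click" in observed,
        text_chat="text chat" in observed,
        vocal_chat="vocal chat" in observed,
    )
```

Under this sketch, a system offering both vocal chat and point-and-click would be coded as both guided response and free response, which mirrors the "yes for all modalities used" rule.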
We compiled all the information found across the articles reviewed and
created a table to compare the features of VRTs (see Table 3). We first checked
if the different components of BST (i.e., instruction, model, rehearsal, and
feedback) were presented in the real or virtual world. We then noted the
presence of human trainers across the different components to determine if
the VRT can be implemented without an expert human trainer present. Next,
we checked for multiple exemplars presented during rehearsals and feedback.
For the feedback portions, we noted the type and timing of feedback provided
(see Table 4). We defined natural consequences as any feedback provided in
training, not including vocal verbal behavior, that is presented contingent on the
learner’s behavior, independent of another person’s efforts (e.g., a correct
response leads to changes in the avatar’s heart rate). We defined socially mediated
consequences as any contingency of reinforcement (or punishment) that is
presented by another person (e.g., a store helper answering the learner’s question).
We defined “concurrent feedback” as any instance of learners receiving feedback
during rehearsal and “terminal feedback” as any instance of learners
receiving feedback post-rehearsal. Finally, we recorded whether the system is
capable of automatically collecting and presenting learners’ performance data.
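The feedback definitions above cross two dimensions: type of consequence (natural or socially mediated) and timing (concurrent or terminal). As a hedged sketch of how such codes could be tallied across articles, the example below counts how many articles fall into each type-by-timing cell. The names `Consequence`, `Timing`, and `tally_feedback` are hypothetical, and the two article codings are invented placeholders, not data from the review.

```python
from collections import Counter
from enum import Enum
from typing import Dict, Set, Tuple


class Consequence(Enum):
    NATURAL = "natural"
    SOCIALLY_MEDIATED = "socially mediated"


class Timing(Enum):
    CONCURRENT = "during rehearsal"
    TERMINAL = "post-rehearsal"


Cell = Tuple[Consequence, Timing]


def tally_feedback(codings: Dict[str, Set[Cell]]) -> Counter:
    """Count the number of articles coded into each (consequence, timing) cell."""
    counts: Counter = Counter()
    for cells in codings.values():
        counts.update(cells)  # a set, so each article counts once per cell
    return counts


# Invented placeholder codings for two made-up articles (not review data).
example_codings = {
    "Article A": {
        (Consequence.NATURAL, Timing.CONCURRENT),
        (Consequence.SOCIALLY_MEDIATED, Timing.CONCURRENT),
    },
    "Article B": {
        (Consequence.SOCIALLY_MEDIATED, Timing.CONCURRENT),
        (Consequence.SOCIALLY_MEDIATED, Timing.TERMINAL),
    },
}
example_counts = tally_feedback(example_codings)
```

Dividing each cell count by the number of articles gives the kind of percentage reported for a feedback distribution.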
Results
We compiled the characteristics of the 20 articles that met our criteria (see
Table 1). There were 1,144 participants across all studies (n = 20). Nineteen
out of 20 articles included exclusively adult or child participants. Smith et al.
(2021) were the only authors to include both adult and child participants (n = 71),
but they did not report the age distribution. Across the other 19 articles, there
were 1,073 participants: 619 were adults with no reported disabilities (57.69%),
451 were adults with some form of mental or developmental disability
(42.03%), and three were children with autism spectrum disorder (0.28%).
Authors reported information on ethnicity in 10 articles (50%), which was
insufficient to derive an accurate representation. Across articles, the authors
used VRTs to teach participants occupational skills in 16 articles (80%), social
skills in three articles (15%), and safety skills in one article (5%).
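The percentages in this paragraph can be reproduced with a short arithmetic check. The sketch below uses only the counts reported above; the variable and function names are hypothetical, and rounding to two decimal places is an assumption that happens to match the reported figures.

```python
TOTAL_PARTICIPANTS = 1144
MIXED_AGE_STUDY_N = 71   # Smith et al. (2021), age distribution not reported
TOTAL_ARTICLES = 20

# Participants in the other 19 articles, by subgroup.
subgroup_counts = {
    "adults, no reported disabilities": 619,
    "adults, mental or developmental disability": 451,
    "children with autism spectrum disorder": 3,
}

# Articles by skill context.
skill_counts = {"occupational": 16, "social": 3, "safety": 1}


def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


remaining = TOTAL_PARTICIPANTS - MIXED_AGE_STUDY_N  # 1,073 participants
subgroup_percentages = {k: percent(v, remaining) for k, v in subgroup_counts.items()}
skill_percentages = {k: percent(v, TOTAL_ARTICLES) for k, v in skill_counts.items()}
```

The three subgroup counts sum to 1,073, which is the total of 1,144 minus the 71 participants in the mixed-age study.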
Next, we examined the methodologies found in recent literature and found
that authors evaluated behavior change using direct measures (e.g., performance
scores, expert scores comparison), indirect measures (e.g., knowledge
tests and self-reported surveys), and combinations thereof in 50%, 10%, and
40% of the articles, respectively. All authors reported a significant increase in
skill performance after using VRTs, though there is evidence that VRTs may
not be effective across all topographies of skills. For example, in Leary et al.
(2019), participants were taught chest compressions by tapping on
a smartphone. Participants were able to practice their rhythm by tapping but
could not practice how deep to press down on the chest. As a result, when
tested with CPR mannequins, participants met the mastery criteria for rhythm
of the chest compressions but not the criteria for depth of the chest compressions.
In terms of measuring generality of VRTs, we found that authors probed
for skill generalization, and the targeted skills generalized across
setting and/or time, in 15 out of 20 (75%) articles. The longest delay to
a maintenance probe was 9 months after training ended (Smith
et al., 2022). Five studies did not include a generality probe (Aysina et al., 2016;
Hassani et al., 2013; Middeke et al., 2018; Sapkaroski et al., 2021; Ward &
Esposito, 2018).
In addition to the effectiveness of VRTs, we wanted to know whether VRTs
are socially valid interventions. We found that in 17 articles (85%), authors
surveyed participants about their training experience using Likert scales,
rating scales, and/or qualitative short answers. The learners and stakeholders
alike scored and described VRTs favorably. Additional benefits of VRTs
reported by authors included lower attrition rates for participants in VRT
experimental groups than for those in alternative training methods.
Authors reported no health-related complications. However, one article
(Park et al., 2011) mentioned that learners expressed some difficulty with movement
due to wires on the equipment.
Next, we reviewed characteristics of the mentioned VRTs to understand the
different mechanisms authors utilized that made VRTs effective. Across the 20
articles, authors mentioned 11 different VRTs. Virtual Reality Job Interview
Training (VR-JIT) was examined across nine different articles, and King et al.’s
VR training (King, Boyer, et al., 2022; King, Estapa, et al., 2022) was examined
across two articles. We compared the characteristics of the 11 different VRTs
in Table 2. No clear trend was found in terms of degrees of immersion,
topographies of Human-AI interactions, and data presentation capability
across the mentioned VRTs. However, we found that nine of the 11 (82%)
VRTs included multiple exemplars during practice. Multiple exemplar trainings
were presented in the form of different settings (e.g., school, home, and
community), situations (e.g., symptoms and events), levels of difficulty (e.g.,
easy and hard), and/or personas of conversation partners (e.g., friendly,
aggressive, and cold).
We reported the modality of BST components across the 11 mentioned
VRTs in Table 3. The presentation of BST components in VR and the inclusion of
human trainers’ support were not mutually exclusive (e.g., a learner could
receive feedback in VR and then have a follow-up with a human trainer to further
discuss performance). All the reviewed VRTs incorporated rehearsal and
feedback, although incorporation of human trainers varied. For example, the
authors showed flexibility of VR-JIT by implementing BST either completely
in VR or in combination with additional human trainer support (see Table 3).
In Table 4, we reviewed the distribution of feedback modality across all 20
articles. Learners contacted natural consequences in VR during rehearsals in
11 articles (55%) and after rehearsals in nine articles (45%). Learners contacted
socially mediated consequences presented within VR during rehearsals in 17
articles (85%) and after rehearsals in 16 articles (80%). Learners did not
contact any natural consequences in-person from human trainers during or
after rehearsal. Learners contacted socially mediated consequences in-person
from human trainers during rehearsal in one article (5%) and after rehearsal in
seven articles (35%). Natural consequences presented in VR were emotional
responding or non-verbal responses from AI-conversation partners (e.g., tone
change after being asked an inappropriate question) and changes in status or
event (e.g., getting the job). Socially mediated consequences in VR included
visual performance ratings (e.g., number of hearts left, trust or rapport meter,
thumbs up/down), numerical summary report (e.g., number of correct and
Table 5. (Continued).
| Articles | Visual Display | Descriptive Automatic Human-AI Interactions |
| M. J. Smith, E. J. Ginger, M. Wright, et al. (2014) | Monitor | Learner used both point-and-click and vocal responses to communicate with AI-interviewer. Contingent on learner’s responses, AI-interviewer provided verbal and nonverbal responses. A different AI-avatar praised or delivered corrective feedback. (pp. 660–661) |
| Smith, Fleming, et al. (2015) | Monitor | Learner used both point-and-click and vocal responses to communicate with AI-interviewer. Contingent on learner’s responses, AI-interviewer provided verbal and nonverbal responses. A different AI-avatar praised or delivered corrective feedback. (p. 87) |
| Smith, Humm, et al. (2015) | Monitor | Learner used both point-and-click and vocal responses to communicate with AI-interviewer. Contingent on learner’s responses, AI-interviewer provided verbal and nonverbal responses. A different AI-avatar praised or delivered corrective feedback. (pp. 273–274) |
| Smith et al. (2016) | Monitor | Learner used both point-and-click and vocal responses to communicate with AI-interviewer. Contingent on learner’s responses, AI-interviewer provided verbal and nonverbal responses. A different AI-avatar praised or delivered corrective feedback. (p. 325) |
| Smith et al. (2021) | Monitor | Learner used both point-and-click and vocal responses to communicate with AI-interviewer. Contingent on learner’s responses, AI-interviewer provided verbal and nonverbal responses. A different AI-avatar praised or delivered corrective feedback. (pp. 1538–1541) |
| Smith et al. (2022) | Monitor | Learner used both point-and-click and vocal responses to communicate with AI-interviewer. Contingent on learner’s responses, AI-interviewer provided verbal and nonverbal responses. A different AI-avatar praised or delivered corrective feedback. (p. 1030) |
| Ward and Esposito (2018) | Monitor | Learner used both point-and-click and vocal responses to communicate with AI-interviewer. Contingent on learner’s responses, AI-interviewer provided verbal and nonverbal responses. A different AI-avatar praised or delivered corrective feedback. (p. 425) |
Discussion
Across disciplines, researchers have provided evidence for the efficacy and
social validity of VRTs, as well as the generalization of target skills from the
virtual world to the real world. Learners with and without disabilities, young
and old alike, have all shown improvements in acquiring targeted skills after
using VRTs in settings such as health care (e.g., bedside manner and patient
triage), emergency responses (e.g., bystander CPR), employment services (e.g.,
job interview skills), and education (e.g., classroom management, learning
Limitations
Given the complexity and constant evolution of VR and AI technology, there
are several limitations to the present review. To narrow down
self-instructional VRTs, we specifically looked for the presence of autonomous
AI within each article as one of our inclusionary criteria. As a result, viable
VRTs like those using the Wizard of Oz technique may not have been included. For
example, if a research article used an autonomous AI avatar within the VRT
for roleplays, but used a different, controlled avatar for an experimenter to
provide feedback within the virtual environment, this would still have met our
criteria for inclusion. At the time of the present review, Wizard of Oz VRTs offer
some benefits that are still difficult to achieve with autonomous AI. For
instance, Wizard of Oz VRTs (e.g., Simmersion®; TeachLivE®) offer learners
non-scripted role-play experiences that cannot currently be simulated in VRTs
with autonomous AI even with extensive programming. Natural language
processing may be used to help the autonomous AI adapt to variations of
learner’s verbal behaviors, but large datasets are needed to effectively train the
autonomous AI. Although it is possible to program for VRT with AI to vary as
often as needed, based on the learner’s performance, the programming skills
and upfront time required may be currently cost-prohibitive.
Moreover, due to the lack of standardization in reporting the use of
technology, it was difficult for us to identify exactly which autonomous AIs
authors used and how they used them. Some articles only included a single
sentence describing the function of the autonomous AI avatar. It is possible
that more VRTs with autonomous AI were not included in this review because
it was unclear that autonomous AI was used. Our analysis of keywords showed
that only three articles specifically mentioned AI. Both King et al. articles
(King, Boyer, et al., 2022; King, Estapa, et al., 2022) included “artificial
intelligence” and Hassani et al. (2013) included “embodied conversational
agents” as one of their keywords.
With the advances in VR technology and the application of AI, more attention to
the specific characteristics of their capabilities is required. This would involve
examining questions such as how AI’s decision-making can better aid
real-time changes in virtual environments and their interactions with users, and how
VR with an autonomous AI can make the learning process more efficient. The
current literature review’s aim was narrower, so it did not capture
any such potential.
Future research
Disclosure statement
We have no known conflict of interest to disclose.
ORCID
Ellie Kazemi http://orcid.org/0000-0001-8316-4112
References
Albright, G., Bryan, C., Adam, C., Mcmillan, J., & Shockley, K. (2018). Using virtual patient
simulations to prepare primary health care professionals to conduct substance use and
mental health screening and brief intervention. Journal of the American Psychiatric Nurses
Association, 24(3), 247–259. https://doi.org/10.1177/1078390317719321
Aysina, R. M., Maksimenko, Z. A., & Nikiforov, M. V. (2016). Feasibility and efficacy of job
interview simulation training for long-term unemployed individuals. PsychNology Journal,
14(1), 41–60. https://www.researchgate.net/publication/324108447_Feasibility_and_
Efficacy_of_Job_Interview_Simulation_Training_for_Long-Term_Unemployed_
Individuals
Bartle, R. (2003). Introduction to virtual worlds. In Designing virtual worlds (pp. 1–108).
Pearson Education Limited. https://www.researchgate.net/publication/200025892_Designing_Virtual_Worlds
Burke, S. L., Bresnahan, T., Li, T., Epnere, K., Rizzo, A., Partin, M., Ahlness, R. M., &
Trimmer, M. (2018). Using Virtual interactive Training Agents (ViTA) with adults with
autism and other developmental disabilities. Journal of Autism and Developmental Disorders,
48(3), 905–912. https://doi.org/10.1007/s10803-017-3374-z
Çakıroğlu, Ü., & Gökoğlu, S. (2019). Development of fire safety behavioral skills via virtual
reality. Computers & Education, 133, 56–68. https://doi.org/10.1016/j.compedu.2019.01.014
Cheng, Y., Huang, C. L., & Yang, C. S. (2015). Using a 3D immersive virtual environment
system to enhance social understanding and social skills for children with autism spectrum
disorders. Focus on Autism and Other Developmental Disabilities, 30(4), 222–236. https://
doi.org/10.1177/1088357615583473
Dechsling, A., Orm, S., Kalandadze, T., Sütterlin, S., Øien, R. A., Shic, F., & Nordahl-Hansen,
A. (2021). Virtual and augmented reality in social skills interventions for individuals with
autism spectrum disorder: A scoping review. Journal of Autism and Developmental
Disorders, 52(11), 4692–4707. https://doi.org/10.1007/s10803-021-05338-5
Hanington, B., & Martin, B. (2019). 99 wizard of Oz. In Universal methods of design: 125 ways
to research complex problems, develop innovative ideas, and design effective solutions (p.
462). Rockport Publishers.
Harvard Business Review Analytic Services. (2020). The Future of Work Is Immersive. Harvard
Business School Publishing.
Hassani, K., Nahvi, A., & Ahmadi, A. (2013). Design and implementation of an intelligent
virtual environment for improving speaking and listening skills. Interactive Learning
Environments, 24(1), 252–271. https://doi.org/10.1080/10494820.2013.846265
Humm, L. B., Olsen, D., Bell, M., Fleming, M., & Smith, M. (2014). Simulated job interview
improves skills for adults with serious mental illnesses. Studies in Health Technology and
Informatics, 199, 50–54. https://doi.org/10.1007/s11414-014-9392-0
Jones, S. H., Peter, C., & Ruckle, M. M. (2020). Reporting of demographic variables in the
Journal of Applied Behavior Analysis. Journal of Applied Behavior Analysis, 53(3),
1304–1315. https://doi.org/10.1002/jaba.722
Jordan, M. I., & Mitchell, T. M. (2015, July 17). Machine learning: Trends, perspectives, and
prospects. Science, 349(6245), 255–260. https://doi.org/10.1126/science.aaa8415
King, S., Boyer, J., Bell, T., & Estapa, A. (2022). An automated virtual reality training system for
teacher-student interaction: A randomized controlled trial. JMIR Serious Games, 10(4),
e41097. https://doi.org/10.2196/41097
King, S., Estapa, A., Bell, T., & Boyer, J. (2022). Behavioral skills training through smart virtual
reality: Demonstration of feasibility for a verbal mathematical questioning strategy. Journal
of Behavioral Education, 1–25. https://doi.org/10.1007/s10864-022-09492-3
Leary, M., Mcgovern, S. K., Chaudhary, Z., Patel, J., Abella, B. S., & Blewer, A. L. (2019).
Comparing bystander response to a sudden cardiac arrest using a virtual reality CPR
training mobile app versus a standard CPR training mobile app. Resuscitation, 139,
167–173. https://doi.org/10.1016/j.resuscitation.2019.04.017
Li, A., Wallace, L., Ehrhardt, K. E., & Poling, A. (2017). Reporting participant characteristics in
intervention articles published in five behavior-analytic journals, 2013-2015. Behavior
Analysis: Research and Practice, 17(1), 84–91. https://doi.org/10.1037/bar0000071
Middeke, A., Anders, S., Schuelper, M., Raupach, T., Schuelper, N., & Ito, E. (2018). Training of
clinical reasoning with a serious game versus small-group problem-based learning:
A prospective study. PLOS ONE, 13(9), e0203851. https://doi.org/10.1371/journal.pone.020385
Pantziaras, I., Fors, U., & Ekblad, S. (2015). Training with virtual patients in transcultural
psychiatry: Do the learners actually learn? Journal of Medical Internet Research, 17(2), e46.
https://doi.org/10.2196/jmir.3497
Park, K. M., Ku, J., Choi, S. H., Jang, H. J., Park, J. Y., Kim, S. I., & Kim, J. J. (2011). A virtual
reality application in role-plays of social skills training for schizophrenia: A randomized,
controlled trial. Psychiatry Research, 189(2), 166–172. https://doi.org/10.1016/j.psychres.
2011.04.003
Proud, R. W., Hart, J. J., & Mrozinski, R. B. (2003). Methods for determining the level of
autonomy to design into a human spaceflight vehicle: A function specific approach. Proc.
Performance Metrics for Intelligent Systems (PerMIS ’03), NIST Special Publication 1014,
September 2003. https://ntrs.nasa.gov/citations/20100017272
Sapkaroski, D., Mundy, M., & Dimmock, M. R. (2021). Immersive virtual reality simulated
learning environment versus role-play for empathic clinical communication training.
Journal of Medical Radiation Sciences, 69(1), 56–65. https://doi.org/10.1002/jmrs.555
Smith, M. J., Bell, M. D., Wright, M. A., Humm, L. B., Olsen, D., & Fleming, M. F. (2016).
Virtual reality job interview training and 6-month employment outcomes for individuals
with substance use disorders seeking employment. Journal of Vocational Rehabilitation, 44
(3), 323–332. https://doi.org/10.3233/jvr-160802
Smith, M. J., Fleming, M. F., Wright, M. A., Roberts, A. G., Humm, L. B., Olsen, D., &
Bell, M. D. (2015). Virtual reality job interview training and 6-month employment outcomes