Bai et al. (2024)


IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES, VOL. 17, 2024

Write-Curate-Verify: A Case Study of Leveraging Generative AI for Scenario Writing in Scenario-Based Learning
Shurui Bai, Donn Emmanuel Gonda, and Khe Foon Hew

Abstract—This case study explored the use of generative artificial intelligence (GenAI), specifically chat generative pretraining transformer (ChatGPT), in writing scenarios for scenario-based learning (SBL). Our research addressed three key questions: 1) how do teachers leverage GenAI to write scenarios for SBL purposes? 2) what is the quality of GenAI-generated SBL scenarios and tasks? and 3) how does GenAI-supported SBL affect students’ motivation, learning performance, and learning perceptions? A three-step prompting engineering process (write the prompts, curate the output, and verify the output, WCV) was established during the teacher interaction with GenAI in the scenario writing. Findings revealed that by using the WCV approach, ChatGPT enabled the efficient creation of quality scenarios for SBL purposes in a short timeframe. Moreover, students exhibited increased intrinsic motivation, learning performance, and positive attitudes toward GenAI-supported scenarios. We also suggest guidelines for using the WCV prompt engineering process in scenario writing.

Index Terms—Generative artificial intelligence (GenAI), intrinsic motivation, prompt engineering, scenario-based learning (SBL).

I. INTRODUCTION

Since its release on November 30, 2022, chat generative pretraining transformer (ChatGPT) has gained over 100 million monthly users, surpassing Google’s record for the fastest-growing user base [1]. This conversational large-language model has impressed the world with its powerful capabilities in performing sophisticated tasks. Educators should therefore ask: how can we leverage generative artificial intelligence (GenAI) to facilitate better teaching and learning?

Scenario-based learning (SBL) uses scenarios as a medium to let students apply learning to real-world experiences [2]. SBL integrates elements, such as challenges, narratives, role-play, choices, and authenticity [2]. Research has indicated that creating situations mirroring real-life circumstances in learning can promote student self-efficacy [3], [4], course satisfaction [2], course engagement, and learning performance [5]. However, scenarios focusing on pure reasoning without emotional connection are unlikely to engage readers in the long term [6]. In contrast, using storytelling skills can deliver more compelling scenarios by tapping user emotion, holding students’ attention and engagement [7].

However, many teachers face challenges in designing effective SBL scenarios and tasks. SBL learning tasks need to be appropriately aligned with real-world scenarios and intended learning outcomes [8]. Teachers may find difficulty in aligning SBL learning tasks with appropriate real-world scenarios (challenge 1). Writing engaging scenarios is a laborious and time-consuming affair (challenge 2), and not all subject-matter experts are good storytellers, and not all good storytellers are subject-matter experts. Many teachers find it difficult to write an engaging SBL-related story that can reflect the teaching content (challenge 3).

In this study, we aim to address these challenges faced by teachers in writing engaging stories with the help of GenAI technology. Recent advancements in GenAI technology, such as Open AI’s ChatGPT, have spawned widespread interest among educators. Due to its impressive linguistics abilities and its ability to respond in a human-like fashion, GenAI has been employed to answer test questions (e.g., Economics tests) [9], perform language translation [10], write emails (e.g., writing a reply email to an angry customer who did not receive his order on time) [11], and write short stories using some given keywords and literary features such as suspense [12].

The present study is similarly concerned with using GenAI such as ChatGPT in writing stories, but it explores the problem from a novel angle. Although GenAI can produce writing within just a few seconds and thus help teachers save time in writing SBL-related stories, it is not clear how the GenAI-supported SBL may affect students’ outcomes. In other words, evidence of the effects of GenAI-created content on student motivation or learning has not been well explored.

Therefore, the aim of this case study is to explore how GenAI, and particularly ChatGPT, can be used to create scenarios for online SBL purposes and examine the effects of ChatGPT-created SBL tasks on students’ intrinsic motivation and learning performance.

Manuscript received 31 August 2023; revised 6 December 2023; accepted 13 March 2024. Date of publication 18 March 2024; date of current version 9 April 2024. The work was supported by a Faculty-Level Teaching Development Grant from The Education University of Hong Kong under Reference T0266. (Corresponding author: Shurui Bai.)
This work involved human subjects or animals in its research. Approval of all ethical and experimental procedures and protocols was granted by Human Research Ethics Committee of the Education University of Hong Kong under Application No. 2022-2023-0424, and performed in line with the Ethical Review Brief Guidelines.
Shurui Bai is with the Department of Mathematics and Information Technology, Education University of Hong Kong, Hong Kong SAR, China (e-mail: tstbai@eduhk).
Donn Emmanuel Gonda and Khe Foon Hew are with the Faculty of Education, University of Hong Kong, Hong Kong SAR, China.
Digital Object Identifier 10.1109/TLT.2024.3378306

1939-1382 © 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

Authorized licensed use limited to: SHENZHEN UNIVERSITY. Downloaded on May 28,2024 at 06:12:21 UTC from IEEE Xplore. Restrictions apply.

We used digital storytelling techniques to guide the scenario writing, documenting a teacher’s interactions with ChatGPT. After a 3-h interaction, we formulated a three-step prompt engineering process using GenAI (i.e., Write the prompts, Curate the output, and Verify the output; WCV) that emerged from the practice.

II. LITERATURE REVIEW

A. Application of GenAI in Assessment Design and Story Writing

The integration of GenAI in education allows teachers to optimize instructional efficiency. Research has indicated that it can be used for creating subject discipline assessments. For example, teachers have used GenAI to create questions pertinent to English comprehension [13], [14] or simplify hard-to-read math word problems [15]. In terms of question formats, GenAI has been found effective in generating multiple-choice questions (MCQs) [16]. Previous studies revealed that GenAI can generate useful quiz questions, although the quality of the questions is sometimes lower than human-generated quizzes [14] or too naïve where the wrong answer can be identified without any background needed [17]. In addition, GenAI can assist in story writing, such as cowriting a story with human writers by responding to human writers’ customized requests in natural language expression (e.g., asking GenAI to rewrite a text to be more Dickensian) and propose suggestions to help unblock human writers in the creative process of writing [18]. Recent research also revealed the potential of ChatGPT for creating realistic patient scenarios in medical education for medical students to practice their problem-solving skills [19].

However, few studies have actually investigated the effects of GenAI-supported content on students’ learning outcomes and their perceptions. In addition, the process of using GenAI to create educational content has not been fully explored. Although GenAI has many benefits, if instructions for creating educational content are not provided, teachers may remain unsure about its applications and hesitate to utilize this tool in their classrooms.

B. Use SBL to Enhance Student Learning

Learning can be optimized when integrated into scenarios that mirror the contexts in which students will eventually apply their acquired skills [3]. These can range from simple scenarios to complex sequences of actions involving multiple scenarios. Their key features are that they are challenging and authentic, involving narratives, choices, and role-play [2]. The authentic learning experiences provided by SBL can enhance students’ learning by bridging the divide between theory and practice [20]. Literature has suggested that SBL can benefit students’ computational thinking skills mastery due to the stepwise narrative that navigates students’ learning so that students can follow to learn, understand, and test their programming knowledge [21].

C. Research Questions

The following three research questions guided our study.
RQ1: How do teachers leverage GenAI to write scenarios for SBL purposes?
RQ2: What is the quality of GenAI-generated SBL scenarios and tasks?
RQ3: How does GenAI-supported SBL affect students’ motivation, learning performance, and learning perceptions?

III. METHOD

A. Overall Research Design

We conducted an exploratory case study to address our research questions. This method enabled us to conduct an in-depth investigation [22], as few relevant studies on this subject are available. The study included a teacher, the GenAI platform, and 37 student participants. An insider’s perspective can be accommodated in a case study research design [23], as the teacher-researcher is situated within the focal context and activity.

B. Student Participants

In this study, 37 postgraduates (21 females and 16 males) participated in GenAI-supported SBL prototype testing (in the spring semester of 2022–2023). Their ages ranged from 22 to 49 (M = 25.7, SD = 4.8). A total of 32 participants were from mainland China, 3 were from Hong Kong SAR, 1 was from Italy, and 1 was from Australia.

C. Course Information

We used a course called “coding and computational thinking” (a master’s-level course at a university in Hong Kong in Spring 2023) as a test-bed to write scenarios using ChatGPT. This course begins with a review of the knowledge and skills of computational thinking and its role in developing advanced and future technology. The important role of coding and computational thinking as an integral part of Science, Technology, Engineering, and Mathematics education and the rationale behind are critically examined. It discusses strategies for learning coding from various perspectives to develop students’ computational thinking skills.

D. Digital Storytelling-Driven Scenario-Based Learning Design Framework

We included a theoretical framework that consists of the four story elements [24], and the three digital storytelling elements to create an enhanced digital storytelling-driven SBL design (see Fig. 1).

The four story elements are lesson, character, setting, and plot [24]. Lesson represents the lesson the speaker wishes to convey [24], which is the intended learning content that students can derive from the story. Character is the individual who makes a situation come to life [24]. They can be either the students themselves or fictional characters created by teachers, which can help students connect with scenarios. Setting defines the time and location in which the plot unfolds [24]. Plot encompasses the sequence of actions that drive the events in the story [24].


Fig. 1. Digital storytelling-driven SBL design framework.

More details about the four story elements are provided in Section III-E.

Teacher-created digital stories have become a popular tool for facilitating comprehension of abstract subject matter [25]. Throughout various media forms (e.g., children’s literature, comics, and journalistic pieces), visual imagery and texts have been consistently employed in storytelling, making them more prevalent than alternative mediums, such as audio or animation. We employed digitalized visual elements and texts as foundational components in our digital storytelling approach. Our story, titled “The Startup,” was fashioned with two distinct attributes: teacher-created content and a heavy incorporation of visual imagery.

The seven elements of digital storytelling are point of view, dramatic questions, emotional content, the gift of your voice, the power of soundtrack, economy, and pacing [26]. We highlighted the use of three digital storytelling elements, point of view, dramatic questions, and pacing in our story design, because they were in visual imagery format so that the gift of your voice and the power of the soundtrack are not necessary. Emotional content helps evoke emotion from the audience, which is already embedded in our “plot” design. Economy means not overloading the audience with too much information [26], which often happens in digital stories where video or audio are involved. Since our story design employed only visual images without video or audio, we did not address the element of economy here. In the crafting of the scenarios for computational thinking skills development, we carefully employed the three essential digital storytelling elements and the four story elements as follows.

Point of view: This was realized through the creation of the central fictitious character, Tom. By assuming the role of Tom, students could immerse themselves in the story from a first-person perspective. This was facilitated by the deliberate use of the first-person pronoun “I” in Tom’s dialogues with supporting characters.

Dramatic questions: The narrative was strategically punctuated by a series of “dramatic questions.” The ultimate inquiry, revolving around the success of the startup company, was unveiled in the story’s final chapter. There were five chapters in the story and each chapter was accompanied by its own distinct pivotal question. These questions encompassed the initial stages of the business from its inception and market gap identification to fundraising, and from beta testing of a mobile application to the future corporate strategies.

Pacing: A deliberate “pacing” of the narrative was achieved through the episodic structure of the story. Each chapter served to advance the plotline and culminated in a distinct set of five scenario-based questions.

Lesson: It is the intended learning content that students can derive from the entire story, which is using computational thinking skills to solve real-world complex problems during the startup building and development in our story. This is also the purpose of learning computational thinking skills, which are used to solve a large complex problem [27].

Character: The main character that students need to assume the role is Tom (the startup main founder).

Plot: It refers to the plot for each chapter in our story. The five main plots were “the start of the business,” “find the gap in the market,” “raise funding for the startup,” “initial beta test of the mobile application,” and “next move of the company.”

Setting: There were five settings for five main plots. Setting 1: in a café in the morning. Setting 2: online meeting room in the afternoon. Setting 3: in the mentor’s office in the afternoon. Setting 4: in the company conference room after the first triumph of the company. Setting 5: in Tom’s office for his reflection in the morning.

A series of dramatic questions glue both the lesson and the character together. The motive to find the answer to the dramatic question drives the character (assumed by the student) to complete a series of learning activities by applying the learned knowledge in scenarios (the intended learning outcome). The point of view melds the character with the plot by inviting students to experience the story from a first-person viewpoint as they navigate through the unfolding plots of the story. Finally, this movement from one plot to another is guided by the story’s pacing, which, in turn, enhances both the plot and the setting. This deliberate pacing strategy serves to deconstruct complex learning content for students’ better cognitive processing.

E. Writing Scenarios by Prompting ChatGPT

“The Startup” is a linear story and has five main plots: Chapter 1: Introduction; Chapter 2: Discovery; Chapter 3: Challenge; Chapter 4: Triumph; and Chapter 5: Conclusion. In this section,


we unpack the themes that emerged from the scenario writing process using ChatGPT 3.5. We also used it to generate the opening and ending conversations in each scenario, with each scenario representing one main plot. In other words, scenario 1 represented Chapter 1: Introduction, scenario 2 represented Chapter 2: Discovery, and so on. Altogether, five scenarios were generated. We delivered these scenarios in a comic strip format. We used the quiz function in Moodle (a learning management system) to present the SBL tasks. We implemented the GenAI-created SBL scenarios and tasks in one session of the course (duration: 3 h) to examine their effectiveness. A representative scenario and SBL task (i.e., a quiz question) were illustrated in Figs. 2 and 3. Students assumed the role of Tom, the main character, and engaged in a dialogue with Anna, a supporting character. The setting was in a café in the morning. The plot revolved around building the recommendation engine for the food delivery mobile application startup. The main lesson was about decomposition and pattern recognition techniques in computational thinking. Students needed to answer the five MCQs after reading the scenario. Submission of one question’s answer could unlock a subsequent conversation and a question. This pattern would continue until all five scenarios were completed.

Fig. 2. Opening conversation in the first scenario of the Startup story.

Fig. 3. One sample MCQ in the first scenario of the Startup story.

We used ChatGPT 3.5 to write the story outline (scenarios 1–5), an opening conversation, five MCQs, and an ending conversation for each scenario (see Fig. 4). We selected different story elements for each scenario. Scenarios 1–5 had the four story elements (i.e., plot, character, setting, and lesson), along with the opening and ending conversations. For example, Scenario 3 presented the main challenge that Tom (main fictitious character, student role) encountered in building the startup company, which served as the conflict. Scenario 4 showed how the challenge had been resolved as a triumph, which served as the climax. We viewed the conflict and climax as the plot twists in the story.

Fig. 4. Startup SBL tasks design flowchart, with the support of ChatGPT. Notes: WP = Write the prompts; CO = Curate the output; VO = Verify the output; P = Plots; P∗ = Plot twists; C = Characters; S = Settings; and MCQs = Multiple-choice questions.

F. Measures

1) Quality of the SBL Scenarios and Tasks: We employed self-developed rubrics (see Appendixes I and II) to examine the quality of AI-generated SBL tasks and scenarios. In addition, we used classical test theory [28] to further examine the quality of AI-generated SBL tasks. Its effectiveness in evaluating self-developed MCQs has been demonstrated in previous studies [29]. The difficulty index is determined by calculating the proportion of test-takers who select the correct response. This proportion is also known as the item difficulty index (an item is a question). An item with a value approaching 1 is considered easy, whereas one with a value approaching 0 is considered difficult [30]. An optimum difficulty index is suggested to be close to 0.5, and items within a range of 0.15–0.85 are acceptable [31]. The discrimination index defined by Haladyna [32] measures the extent to which an item differentiates individuals based on their levels of learning: students with a higher level of learning are more likely to answer correctly, while those with a lower level are more likely to answer incorrectly. A good discrimination power is greater than 0.4, and a poor discrimination power is less than 0.2 [29]. However, items with low power may still be valuable if they test basic knowledge. It means no matter if the student is a high or low performer, he or she can answer the question correctly. In this case, this question should be included in the question bank. Items with a discrimination index of zero do not differentiate students. Those that are either too easy or too difficult are more likely to have low discrimination power [29].


2) Student Intrinsic Motivation: Students were explicitly informed that ChatGPT created the SBL tasks and scenarios. We measured and compared intrinsic motivation levels at pre- and postactivity stages using the interest/enjoyment subscale of the intrinsic motivation inventory (IMI) survey proposed by Ryan [33]. The responses to each item are rated on a 7-point scale, ranging from “not at all true” (1 point) to “very true” (7 points). Two sample statements were: While I was working on a storytelling-supported learning task, I was thinking about how much I enjoyed it, and I am satisfied with my performance in this storytelling-supported learning task.

3) Learning Performance: To assess students’ learning performance, a pretest and posttest design was adopted. The pretest, conducted voluntarily during the first-class session, examines students’ foundational comprehension and application of coding and computational thinking skills. The pretest comprised five MCQs and five short-essay questions, offering insights into students’ familiarity with pattern recognition in problem-solving. Two sample questions from the test were: “What is the purpose of pattern recognition in problem-solving?” and “Please name two ways of doing pattern recognition.” We used the course final assessments as the posttest measure. This encompassed an individual assessment (70% of the course grade) and a group assessment (30% of the course grade). The individual assessment focused on designing an interactive teaching module prototype. The assessment was about designing a lesson to develop students’ computational thinking skills in solving complex real-world problems. The group assessment involved crafting a mini video game using fundamental coding commands. The pre- and posttests evaluations carried a maximum score of 100 each. Study groups of five–six students each were formed in the first session of the class to facilitate the collaborative work.

4) Learning Perceptions: We used a self-developed open-ended survey to understand students’ perceptions of the AI-generated SBL tasks. We asked three questions: How would you describe your learning experience in “The Startup” story? Name three things you enjoyed in these scenario-based tasks. Have you any suggestions for improving “The Startup” learning experience (e.g., quiz quality, visual design, and story structure)?

IV. RESULTS

A. RQ1: How do Teachers Leverage GenAI to Write Scenarios for SBL Purposes?

We analyzed the prompting strategies used in interacting with ChatGPT to craft a scenario. We identified three themes: Write prompt (WP), Curate the output (CO), and Verify the output (VO). We elaborate on this prompt engineering process in the following section.

1) Write the Prompts: The first step in every text-based GenAI is writing prompts. We define it as the entries that users provide as inputs into the GenAI tool that elicit the intended output. This stage is related to the concept of prompt engineering, which involves an iterative process of improving the input to an AI tool to enhance its output [34]. This refers to how to talk to AI to get it to do what you want. However, as prompt writing is a relatively new concept, few prompts are available for education purposes, particularly for creating SBL tasks. Thus, we had to learn prompt writing from scratch and through trial and error to generate good-quality scenarios.

In writing the scenarios and SBL tasks, we used the structure of story outline – opening conversation – questions – ending conversation. In this section, we discuss our process of writing prompts, which involved deconstructing them to find a common pattern. Fig. 5 gives a sample prompt for the story outline.

Fig. 5. Prompt used to create an overall story outline.

Here, the pattern was the presence of character, plot, and lesson, which was evident in the opening conversations, MCQs, and ending conversations. Table I outlines the patterns for each SBL component. The lesson refers to the main information the teacher needs to provide. ChatGPT was useful in the development of character, setting, and plot (see Table I for details).

TABLE I
PROMPTS USED IN THE STARTUP, GENERIC PROMPTS, AND PATTERNS OF WRITING PROMPTS IN SBL TASKS

2) Curate the Output: The second step is curating the output. We define this as the user’s selection of the satisfactory output provided by the GenAI tool previously. The three initial evaluations of the generated output are unsatisfactory, need to be improved, or satisfactory.

a) Unsatisfactory output: The GenAI did not produce relevant or accurate content. The user should then rewrite the prompt with updated and specific keywords to guide the GenAI. As the topic of this new prompt is not relevant to the previous content, this step can be considered as going back to the “prompt writing” stage (see Section V for prompt improvement strategies).

Fig. 6 illustrates our request to ChatGPT to create an outline of the story. After reading the output, we found the story setting uninteresting, so we added the phrase “IT startup.” However, the new output derived from the previous output was just replaced


by a new character, and no new setting was provided. Thus, we stopped it from generating the output and rewrote the prompt, as Fig. 7 shows. This time, we were more specific in our prompting and added the phrases “about an entrepreneur,” and “keep the beginning, climax, and ending.”

Fig. 6. Example of writing prompts for a story outline.

Fig. 7. Example of the rewritten prompt and its output after curating the output for the story outline creation.

Old prompt: Can you change the story so it is about an IT technology startup?

Rewritten prompt: Can you change the story so it is about an entrepreneur trying to build an IT startup company? Keep the five parts with a beginning, climax, and ending.

b) Need-to-be-improved output: The user will “keep (partial output) and change (the next prompt).” The user needs to determine which part of the output is useful and will be included in the next prompt.

Old prompt: Continue the story where Tom did extensive research into the potential rate of return on investment in the food delivery app market in Hong Kong. You need to create an MCQ with one correct answer about a computational skill called decomposition. Keep it short.

Extended prompt: Continue … Increase the difficulty level of this question. The answer cannot be directly found in the conversation text, so put the question in the scenario. Tom is reporting the findings to a potential investor. Keep it a conversation mode (new addition).

Fig. 8. Old prompt and output about creating an MCQ on computational thinking skills.

The generated question given in Fig. 8 could be answered without reading the scenario and it was not seamlessly integrated into the characters’ conversation. Thus, this question did not reflect the good application of relevant content knowledge in a real-world scenario. However, its format was correct and accurate. Therefore, we added a new instruction to emphasize the application of knowledge, as shown in Fig. 9.

c) Satisfactory output: If the user is satisfied with the generated output, he or she can start verifying the output.
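The three curation outcomes described above amount to a small decision loop, sketched below in Python. This is an illustration only, not the authors' tooling. The names `Rating`, `revise_prompt`, and `curate` are hypothetical, and `fake_generate` and `fake_rate` are stand-ins for a GenAI call and the teacher's judgement. The sketch also assumes that "keep and change" is implemented by appending a new instruction to the old prompt.

```python
from enum import Enum


class Rating(Enum):
    UNSATISFACTORY = "unsatisfactory"        # irrelevant or inaccurate: rewrite the prompt
    NEEDS_IMPROVEMENT = "needs improvement"  # keep the useful part, extend the prompt
    SATISFACTORY = "satisfactory"            # move on to verifying the output


def revise_prompt(rating, old_prompt, change):
    """Build the next prompt from the rating and the teacher's requested change."""
    if rating is Rating.UNSATISFACTORY:
        # Back to the prompt-writing stage: the old prompt is discarded.
        return change
    if rating is Rating.NEEDS_IMPROVEMENT:
        # "Keep and change" (assumed here to mean: retain the prompt, append the instruction).
        return f"{old_prompt} {change}"
    return old_prompt


def curate(generate, rate, prompt, max_rounds=5):
    """Repeat generate -> rate -> revise until an output is rated satisfactory."""
    for _ in range(max_rounds):
        output = generate(prompt)
        rating, change = rate(output)
        if rating is Rating.SATISFACTORY:
            return prompt, output  # ready for the verify step
        prompt = revise_prompt(rating, prompt, change)
    raise RuntimeError("no satisfactory output within max_rounds")


def fake_generate(prompt):
    """Hypothetical stand-in for a GenAI call: it simply echoes the prompt."""
    return f"[draft for: {prompt}]"


def fake_rate(output):
    """Hypothetical stand-in for the teacher's judgement of one output."""
    if "entrepreneur" not in output:
        return Rating.UNSATISFACTORY, (
            "Can you change the story so it is about an entrepreneur "
            "trying to build an IT startup company?"
        )
    if "Keep it short." not in output:
        return Rating.NEEDS_IMPROVEMENT, "Keep it short."
    return Rating.SATISFACTORY, ""


final_prompt, final_output = curate(fake_generate, fake_rate, "Write a story outline.")
print(final_prompt)
```

The `max_rounds` cap is a design choice of this sketch rather than something the text prescribes. It keeps the loop from running indefinitely when no prompt revision produces an acceptable output.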


Fig. 9. Extended prompt and output after curating the output for Fig. 8 question.

3) Verify the Output: The last step is to verify the output. We define this stage as the user’s verification of the authenticity and accuracy of the GenAI output. ChatGPT can generate information that is seemingly credible yet incorrect (i.e., hallucination), which unfairly discriminates against specific groups of individuals (i.e., bias), or that is harmful (i.e., toxicity) [35]. This step should involve consulting subject matter experts or obtaining reliable evidence to verify the output. Finding fact-checkable claims involves examining legislative records, media sources, and social media platforms. This entails assessing prominent public assertions to ascertain both their verifiability and the necessity of subjecting them to fact-checking scrutiny [36]. For the subject knowledge, teachers can verify the GenAI output by himself/herself, consult scholarly, peer-reviewed articles, and books or other subject matter experts. For the scenario information, teachers need to consult legislative records, trustworthy media outlets, and social media.

B. RQ2: What is the Quality of GenAI-Generated SBL Scenarios and Tasks?

First, we examined the GenAI-supported SBL task. Table II presents the difficulty and discrimination indices of the 25 items in the GenAI-supported test. The difficulty index of the 25 tasks was M = 0.58, SD = 0.2, which was in the acceptable range of 0.15–0.85. The discrimination index of the AI-supported SBL test was M = 0.35, SD = 0.22; this was close to the discrimination power of 0.4, which is deemed as good. The two independent markers also graded the 25 tasks based on the self-developed rubrics (see Appendix I), with an overall M = 3.44 and SD = 0.34 (full score was 4.3). Both raters discussed the results of their coding until a mutual agreement was reached. Thus, the 25 tasks were in good quality in terms of provided content and feedback (see Table III).

TABLE II
DIFFICULTY AND DISCRIMINATION INDICES OF AI-SUPPORTED 25 TASKS

TABLE III
SCORES OF AI-SUPPORTED 25 TASKS ASSESSED BY TWO MARKERS

Second, we assessed the GenAI-supported SBL scenarios. The two markers graded the ten opening and ending conversations in the five chapters based on the self-developed rubrics (see Appendix II): M = 3.63 and SD = 0.35 (full score was 4.3). Both raters also discussed the results of their coding until a mutual agreement was reached. Thus, the ten opening and ending conversations were of good quality in terms of creativity, computational thinking, story elements, and content (see Table IV).

TABLE IV
SCORES OF AI-SUPPORTED TEN OPENING AND ENDING CONVERSATIONS ASSESSED BY TWO MARKERS

C. RQ3: How Does GenAI-Supported SBL Affect Students’ Motivation, Learning Performance, and Learning Perceptions?

1) Intrinsic motivation: A normality test suggested a non-normal distribution of the preactivity sample data. Thus, we used the Wilcoxon Signed Ranked test (a nonparametric test for within-subject comparison) to compare the students’ pre- and postactivity intrinsic motivation levels. The results indicated that 27 out of 37 students (response rate: 73%) reported a significantly higher level of intrinsic motivation for engaging in the SBL activities at the postactivity stage (Md = 5.57, n = 27) compared with the preactivity stage (Md = 4.70, n = 27); Z = −3.09, p = 0.002. The Cronbach’s α was 0.78 and 0.86 for the pre- and postactivity intrinsic motivation measurement.

2) Learning performance: Four participants did not join the pretest as it was a voluntary activity. All 37 students


(response rate: 89%) were counted in the post-test measurement. A normality test suggested a normal distribution of the pre- and post-performance test data. The paired-samples t-test indicated that students achieved significantly higher performance scores after doing the GenAI-supported SBL tasks (M = 77.8, SD = 5.94 for posttest, M = 5.26, SD = 5.08 for pretest; full mark is 100); t(65) = −90.33, p < 0.001.

TABLE V
STUDENTS’ LEARNING EXPERIENCES AND SUGGESTIONS FOR IMPROVEMENT OF THE GENAI-SUPPORTED SBL

3) Learning perceptions: A total of 24 students (response
rate: 65%) completed the open-ended survey about learn-
ing perceptions. We identified three themes from the
students’ reported learning experiences via the thematic
analysis method [37]: a) learning tasks were delivered
in well-written and realistic scenarios (n = 15); b) they
helped me review learning content and identify the knowl-
edge gap (n = 8); and c) I could put myself in the shoes of
the fictitious character Tom (student role) in the story (n
= 5). The four most common suggestions for improving
the SBL task design were as follows.

a) The storyline can be more complex and have more twists (n = 6).
b) The association between the story and tasks needs to be further enhanced (n = 5).
c) Analyses of wrong answers need to be included (n = 3).
d) The use of multimedia (audio and video) in presenting the scenarios is preferable (n = 2).
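
As an illustration of the internal-consistency figures reported for the intrinsic motivation scale (Cronbach’s α of 0.78 and 0.86), the following is a minimal sketch of how such a coefficient is commonly computed. It assumes the standard formula based on item variances and total-score variance; the function name `cronbach_alpha` and the small response matrix are hypothetical and are not the study’s data or the authors’ analysis script.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items matrix of scores.

    responses: list of rows, one per respondent, each a list of item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 7-point Likert responses (4 respondents x 3 items).
demo = [
    [5, 6, 5],
    [3, 4, 3],
    [6, 6, 7],
    [4, 5, 4],
]
print(round(cronbach_alpha(demo), 3))
```

Values closer to 1 indicate that the items vary together, which is how the reported coefficients are usually read.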

Table V provides examples of students’ quotations of these themes. The interrater reliability on qualitative data analysis was 92%. All discrepancies were discussed until full agreement was reached.
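
The difficulty and discrimination indices reported under RQ2 can be illustrated with a short sketch. It assumes the common classical test theory definitions: difficulty as the proportion of students answering an item correctly, and discrimination as the difference in correct answers between the highest- and lowest-scoring groups, divided by the group size. The 27% group fraction is a widely used convention and is an assumption here, as are the function name `item_analysis` and the demo marks; the exact procedure used in the study may differ.

```python
def item_analysis(scores, group_fraction=0.27):
    """Classical item statistics for a students-by-items matrix of 0/1 marks.

    Returns (difficulty, discrimination), each a list with one value per item.
    """
    n_students = len(scores)
    n_items = len(scores[0])

    # Difficulty index: proportion of students who answered the item correctly.
    difficulty = [sum(row[i] for row in scores) / n_students for i in range(n_items)]

    # Rank students by total score and take the top and bottom groups.
    ranked = sorted(scores, key=sum, reverse=True)
    group_size = max(1, round(n_students * group_fraction))
    upper, lower = ranked[:group_size], ranked[-group_size:]

    # Discrimination index: (correct in upper - correct in lower) / group size.
    discrimination = [
        (sum(row[i] for row in upper) - sum(row[i] for row in lower)) / group_size
        for i in range(n_items)
    ]
    return difficulty, discrimination


# Hypothetical marks for 6 students on 3 items (1 = correct, 0 = incorrect).
demo = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
difficulty, discrimination = item_analysis(demo, group_fraction=0.5)
for i, (p, d) in enumerate(zip(difficulty, discrimination), start=1):
    print(f"item {i}: difficulty={p:.2f}, discrimination={d:.2f}")
```

Under the thresholds quoted in the text, an item would be retained when its difficulty lies between 0.15 and 0.85, and a discrimination value near 0.4 would be read as good.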
V. DISCUSSION

A. Write the Prompts

Prompt writing is the first step in guiding GenAI to generate relevant content, which spans the entire SBL design process. ChatGPT can be used to initially generate the necessary SBL components. Our four SBL components are the story outline, opening conversations, MCQs, and ending conversations.

First, a prompt must be written for the story outline creation. Within the story outline creation, the four story elements (characters, plots, settings, and lessons) were used. Our story outline prompt was “Write me an outline of a story about an entrepreneur trying to build an IT startup company. You need to include computational thinking skills in the story. The story should consist of five parts with a beginning, climax, and ending.” We explicitly requested ChatGPT to generate the story elements by inputting keywords such as “an entrepreneur” (character), “trying to build an IT startup company” (plot), and “include computational thinking skills” (lesson). Information about the setting was not required as the teacher considered when and where the startup story happened was not important. However, this should be assessed on a case-by-case basis. If the story is time- and space-sensitive (e.g., historical events), then including settings in the prompt is necessary.

The teacher also found it helpful to include the number of story parts (i.e., the story should consist of five parts with a beginning, climax, and ending). Otherwise, the generated output will most likely be a simple description of the main plot, which does not help the teacher develop the story outline.

Second, prompts must be provided for the opening/ending conversation creation. Our prompt for one opening conversation was “Give a short opening for the part 1 introduction. Introduce Tom, and the setting is in a cafe with a friend where they discuss a new idea for a startup. Tom discusses how to use decomposition skills in conceptualizing his new idea about a food delivery app.” This prompt consisted of the main character, setting, and plot, which were generated from the previous output. Here, the teacher only needed to provide the “lesson” information.

Finally, SBL task creation prompts require output verification. We found that detailed and even repetitive instructions helped ChatGPT focus on the most important information. Our prompt was “Continue the story where Tom … You need to create a follow-up MCQ with one correct answer about giving a general idea for an algorithm for quality control measures. The answer cannot be derived directly from the conversation text, so put the question in the scenario.” This prompt included several key phrases, such as “MCQ,” “one correct answer,” and “the answer cannot be derived directly from the conversation.” In other prompts, we also


asked ChatGPT to ensure all answer options were of the same length.

Fig. 10. Initial prompt and the subsequent revised prompt.

Writing good prompts helps set the context for subsequent conversations and informs the GenAI of what information is important, and what the desired output form and content should be [38]. The prompt should include important keywords or phrases and offer sufficient information related to those keywords [38] rather than focusing on connecting words. Connecting words, such as function words and punctuation, do not contribute substantially to GenAI output [39].

B. Curate the Output

After obtaining the first round of generated output, users may feel conflicted. Some parts of the output are good while others need improvement. Refining the prompts can improve the GenAI output. White et al. [38] have suggested using alternative approaches to improve prompt writing. With this pattern, GenAI is asked to offer alternative approaches that the user can pursue in addition to the original input. For example, in our SBL task design, the teacher could ask ChatGPT to provide a different arrangement of four story elements (i.e., character, setting, plot, and lesson). If some of the generated output is satisfactory, the teacher can write follow-up prompts to expand those aspects and obtain more details.

If the generated output does not meet the teacher’s expectations, the teacher can revise the old prompt to retain a neater interaction string with ChatGPT (see Fig. 10). This editing approach can also benefit the teacher when writing branching scenarios. Every editing record can be viewed as one branching scenario that leads the story in a different direction. Multiple scenarios can then be developed under one main plot as well.

C. Verify the Output

The content of the story outline and the SBL tasks needs to be verified. There are gender biases in ChatGPT output [40]. For example, ChatGPT often described the economics professor as a man [40]. In our story, ChatGPT suggested a male character as the protagonist in the startup story. This gender bias can be assumed from the demographics of technology startup founders. Such “biases” are neither right nor wrong and thus are difficult to verify. They can, therefore, diverge from the culture, beliefs, and values of the teachers and the readers (students).

Teachers have the responsibility to provide a fair representation of students in terms of their ethnicity, gender, and socioeconomic status. Deshpande et al. [41] found in their recent study that ChatGPT has varying levels of toxicity and may provide a derogatory or negative output based on various personas. For example, ChatGPT tends to associate a bad person with a third-world country or poverty-stricken region [41]. Fortunately, we did not encounter this problem in our study. We should, however, keep in mind that GenAI potentially has these shortcomings.

The learning content must also be carefully verified. Teachers must be aware of “hallucination” (creating content that appears to be truthful but in fact is not), which is a common phenomenon in ChatGPT output. They must apply their subject-matter expertise when cocreating content with GenAI tools. Subject-matter expertise is essential to ensure that the learning content is not only accurate but also coherent.

D. Impacts on Students’ Multiple Outcomes

The results of the student intrinsic motivation survey showed a significant improvement after using the AI-generated digital storytelling-enhanced scenario-based tasks. According to Yang and Wu [42], task value is a motivational factor in digital storytelling. Task value represents students’ subjective assessments of learning tasks’ interest, relevance, and significance [43]. In our research, we integrated digital storytelling into authentic SBL activities, thereby augmenting the task value and subsequently elevating student motivation levels.

Furthermore, students demonstrated more favorable learning performance after using the AI-generated digital storytelling-enhanced scenario-based tasks. Empirical studies suggest that well-structured scenarios help make connections between theory and practice [5]. Although SBL can potentially improve student learning, this method requires careful design, such as structuring the course around a relevant and authentic situation [2], as scenarios are considered “essential slices of reality” [44]. Digital storytelling is able to provide visual alternatives to traditional text-based approaches and contributes to enhanced learning. The effective presentation of information through image-rich instruction can help students retain information faster and longer [45].

Regarding learning experience, most students reported their appreciation of the well-written and realistic scenarios. This provided evidence for teachers to leverage the power of GenAI to create SBL scenarios and tasks. Students also appreciated the plot twists, which required teachers to explicitly ask ChatGPT to embed a plot twist in between the main plots. Students were engaged in role-play experiences, immersing themselves in fictional characters’ perspectives. Such role-play can positively influence their self-identification and emotional attachment. Consequently, self-identification can motivate people to invest more time in the activity and enhance their positive sense of themselves [46]. By assuming the main character’s role, students can develop a deeper connection to the subject matter, leading to increased engagement and a heightened drive to acquire knowledge. Our findings highlight the potential of role-play as an effective pedagogical approach in fostering students’ intrinsic motivation within the learning process. Moreover, our SBL tasks were designed with different difficulty or challenge levels (e.g., easy and difficult mode tasks). The challenges in our study were communicated clearly to the students in the learning scenarios


to ignite students’ desire to complete them and help intrinsically motivate students to engage with the scenarios [2].

In summary, this study suggests the benefits of using GenAI properly to write digital storytelling elements (e.g., point of view, dramatic questions, and pacing) and story elements (characters, settings, plots, and lessons) in an efficient way to engage students and improve their learning.

VI. LIMITATIONS AND FUTURE STUDY

We acknowledge potential researcher bias in a case study design [47]. To mitigate these biases and enhance the trustworthiness of this study, the teacher-researcher left an audit trail and involved an independent research assistant to provide an external perspective on the study design and finding report [48]. The teacher-researcher documented her entire process of creating the SBL scenarios and tasks via audio recording to maintain transparency in data collection and finding reports. The teacher-researcher also reflected on the entire process of conducting the intervention and data analysis with the independent research assistant to help minimize the impact of the potential biases. To further minimize potential teacher-researcher bias, the research assistant helped distribute the IMI surveys (pre- and posttests), performance tests (pre- and posttests), and open-ended surveys at the end of the intervention. The research assistant also helped in marking the SBL scenarios and tasks as the second marker.

There are three main limitations of this study. First, teachers should note the potential variability of GenAI output quality based on the provided prompts. The results of this study were based on a limited number of prompts. Educators, as subject matter experts, should carefully assess and curate the GenAI output to maintain authenticity and accuracy. That is why curating the output is a very important process in prompting GenAI tools. Second, the 3-h (see Appendix III for action timetable) SBL task intervention is relatively short, and the novelty effect might have influenced students’ motivation and perceptions. Thus, future research can replicate this study with a longer intervention duration. Third, we focused on a higher education setting. Adult learners typically are more intrinsically motivated, mastery-oriented, and value personal growth and fulfillment than younger students [49]. The SBL tasks use real-world cases that conform to adult learners’ desire for relevance. Thus, our participants’ motivation to learn may have been inherent and robust before they engaged in the SBL. We, therefore, encourage future research to apply the WCV prompting process to other educational contexts. Finally, we call for collective efforts to build a resource hub for designing SBL tasks or other instructional strategies. Research into and applications of GenAI can then be developed rapidly to facilitate better teaching and learning purposes.

VII. CONCLUSION

In this case study, we explored the process and effects of using GenAI, specifically ChatGPT, in developing digital storytelling-driven SBL scenarios and tasks. A GenAI prompting process involving WCV emerged through iterative cycles. We conducted a field trial of the WCV prompting process for designing SBL and found that by applying this process, ChatGPT can help teachers create quality SBL scenarios and tasks within a short time frame. Students’ learning performance improved significantly after exposure to the GenAI-supported digital storytelling-driven SBL. In addition, most of our student participants reported positive attitudes toward this learning experience with a strong sense of self-identification with the fictitious characters in the story.

APPENDIX I
SELF-DEVELOPED EVALUATION RUBRIC FOR SBL TASK QUALITY

APPENDIX II
SELF-DEVELOPED RUBRIC FOR SBL SCENARIO QUALITY EVALUATION

APPENDIX III
ACTION TIMETABLE OF 3-H INTERACTION WITH GENAI TO DESIGN SBL TASKS AND SCENARIOS

Hour 1: Introduction and Discovery
1. Introduction
• Use GPT to create an opening for Chapter 1: Introduction, introducing Tom in a cafe with his friend.
2. Discussion of New Idea
• Create a conversation between Tom and his friend using GPT to discuss Tom’s new food delivery app idea.


• Generate ideas for the potential features and challenges of the app.

Hour 2: Challenge and Triumph
1. Discussing Challenges (30 min)
• Utilize GPT to continue the story with Tom encountering challenges raising funding for his startup.
• Develop a conversation between Tom and his mentor, discussing the difficulties and potential solutions.
• Mention the need for a prototype and AI development within a short timeframe.
2. Overcoming Challenges (30 min)
• Interact with GPT to create a conversation where Tom seeks advice on creating a prototype and outsourcing AI development.
• Develop the mentor’s response and Tom’s understanding of using computational thinking skills.
• Use GPT to outline the steps and pseudocode for building and training the AI model.

Hour 3: Triumph and Conclusion
1. Building a Successful Team
• Utilize GPT to write a conversation between Tom and his startup team members about their successful prototype testing.
• Mention the positive feedback and securing funding from investors.
• Discuss plans for further development, features, and team expansion.

REFERENCES

[1] K. Hu, “ChatGPT sets record for fastest-growing user base - analyst note,” Reuters, Feb. 2, 2023, Accessed: May 8, 2023. [Online]. Available: https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/
[2] M. S. Smith, S. Warnes, and A. Vanhoestenberghe, “Scenario-based learning,” in Teaching and Learning in Higher Education: Perspectives From UCL, J. P. Davies and N. Pachler, Eds., London, U.K.: UCL IOE Press, May 2018, pp. 144–156, Accessed: Apr. 13, 2023. [Online]. Available: https://www.ucl-ioe-press.com/ioe-content/uploads/2018/05/Teaching-and-Learning-in-Higher-Education.pdf
[3] L. Bardach, R. M. Klassen, T. L. Durksen, J. V. Rushby, K. C. P. Bostwick, and L. Sheridan, “The power of feedback and reflection: Testing an online scenario-based learning intervention for student teachers,” Comput. Educ., vol. 169, Aug. 2021, Art. no. 104194, doi: 10.1016/J.COMPEDU.2021.104194.
[4] R. M. Klassen, J. V. Rushby, L. Maxwell, T. L. Durksen, L. Sheridan, and L. Bardach, “The development and testing of an online scenario-based learning activity to prepare preservice teachers for teaching placements,” Teaching Teacher Educ., vol. 104, Aug. 2021, Art. no. 103385, doi: 10.1016/j.tate.2021.103385.
[5] S. Mamakli, M. K. Alimoğlu, and M. Daloğlu, “Scenario-based learning: Preliminary evaluation of the method in terms of students’ academic achievement, in-class engagement, and learner/teacher satisfaction,” Adv. Physiol. Educ., vol. 47, no. 1, pp. 144–157, 2023, doi: 10.1152/ADVAN.00122.2022/ASSET/IMAGES/LARGE/ADVAN.00122.2022_F003.JPEG.
[6] G. Bowman, R. B. MacKay, S. Masrani, and P. McKiernan, “Storytelling and the scenario process: Understanding success and failure,” Technol. Forecast. Soc. Change, vol. 80, no. 4, pp. 735–748, May 2013, doi: 10.1016/J.TECHFORE.2012.04.009.
[7] J. Moon and J. Fowler, “‘There is a story to be told …’; A framework for the conception of story in higher education and professional development,” Nurse Educ. Today, vol. 28, no. 2, pp. 232–239, Feb. 2008, doi: 10.1016/J.NEDT.2007.05.001.
[8] D. Nicholson, S. Fiore, J. J. Vogel-Walcutt, and S. Schatz, “Advancing the science of training in simulation-based training,” Proc. Hum. Factors Ergonom. Soc. Annu. Meeting, vol. 53, no. 26, pp. 1932–1934, Oct. 2009, doi: 10.1177/154193120905302611.
[9] W. Geerling, G. D. Mateer, J. Wooten, and N. Damodaran, “ChatGPT has mastered the Principles of Economics: Now what?” SSRN Electron. J., 2023, doi: 10.2139/ssrn.4356034.
[10] D. Baidoo-Anu and L. Owusu Ansah, “Education in the era of generative artificial intelligence (AI): Understanding the potential benefits of ChatGPT in promoting teaching and learning,” SSRN Electron. J., vol. 7, no. 1, pp. 52–62, Jan. 2023, doi: 10.2139/SSRN.4337484.
[11] M. A. AlAfnan, S. Dishari, M. Jovic, and K. Lomidze, “ChatGPT as an educational tool: Opportunities, challenges, and recommendations for communication, business writing, and composition courses,” J. Artif. Intell. Technol., vol. 3, no. 2, pp. 60–68, Mar. 2023, doi: 10.37965/JAIT.2023.0184.
[12] T. S. Wang and A. S. Gordon, “Playing story creation games with large language models: Experiments with GPT-3.5,” 2023, pp. 297–305, doi: 10.1007/978-3-031-47658-7_28.
[13] F. Qu, X. Jia, and Y. Wu, “Asking questions like educational experts: Automatically generating question-answer pairs on real-world examination data,” in Proc. Conf. Empirical Methods Natural Lang. Process., 2021, pp. 2583–2593, doi: 10.18653/V1/2021.EMNLP-MAIN.202.
[14] R. Dijkstra, Z. Genç, S. Kayal, and J. Kamps, “Reading comprehension quiz generation using generative pre-trained transformers,” in Proc. 4th Int. Workshop Intell. Textbooks, 2022, pp. 4–17. [Online]. Available: https://github.com/RamonDijkstra/EduQuiz
[15] N. Patel, P. Nagpal, T. Shah, A. Sharma, S. Malvi, and D. Lomas, “Improving mathematics assessment readability: Do large language models help?” J. Comput. Assist. Learn., vol. 39, no. 3, pp. 804–822, Jun. 2023, doi: 10.1111/jcal.12776.
[16] R. Rodriguez-Torrealba, E. Garcia-Lopez, and A. Garcia-Cabot, “End-to-end generation of multiple-choice questions using text-to-text transfer transformer models,” Expert Syst. Appl., vol. 208, Dec. 2022, Art. no. 118258, doi: 10.1016/j.eswa.2022.118258.
[17] A. Tlili et al., “What if the devil is my guardian angel: ChatGPT as a case study of using chatbots in education,” Smart Learn. Environ., vol. 10, no. 1, pp. 1–24, Dec. 2023, doi: 10.1186/S40561-023-00237-X/FIGURES/13.
[18] A. Yuan, A. Coenen, E. Reif, and D. Ippolito, “Wordcraft: Story writing with large language models,” in Proc. Int. Conf. Intell. User Interfaces, Mar. 2022, pp. 841–852, doi: 10.1145/3490099.3511105.
[19] G. Eysenbach, “The role of ChatGPT, generative language models, and artificial intelligence in medical education: A conversation with ChatGPT and a call for papers,” JMIR Med. Educ., vol. 9, no. 1, Mar. 2023, Art. no. e46885, Accessed: May 8, 2023. [Online]. Available: https://mededu.jmir.org/2023/1/e46885, doi: 10.2196/46885.
[20] H. Hussein Ahmed, “Adopting scenario based learning in Critical care nursing education: Students’ achievement and feedback,” Am. J. Nurs. Res., vol. 7, no. 4, pp. 581–588, Jun. 2019, doi: 10.12691/ajnr-7-4-20.
[21] A. Zitouniatis, F. Lazarinis, and D. Kanellopoulos, “Teaching computational thinking using scenario-based learning tools,” Educ. Inf. Technol., vol. 28, no. 4, pp. 4017–4040, Apr. 2023, doi: 10.1007/s10639-022-11366-0.
[22] R. K. Yin, Case Study Research and Applications: Design and Methods, 6th ed. Newbury Park, CA, USA: Sage, 2018.
[23] S. Unluer, “Being an insider researcher while conducting case study research,” Qual. Rep., vol. 17, pp. 1–14, 2012, Accessed: Nov. 28, 2023. [Online]. Available: http://www.nova.edu/ssss/QR/QR17/unluer.pdf
[24] S. D. Cohen, “The art of public narrative: Teaching students how to construct memorable anecdotes,” Commun. Teacher, vol. 25, no. 4, pp. 197–204, Oct. 2011, doi: 10.1080/17404622.2011.601726.
[25] B. R. Robin, “Digital storytelling: A powerful technology tool for the 21st century classroom,” Theory Pract., vol. 47, no. 3, pp. 220–228, Jul. 2008, doi: 10.1080/00405840802153916.
[26] J. Lambert, Digital Storytelling: Capturing Lives, Creating Community, 5th ed. (revised and updated). New York, NY, USA: Routledge, 2018 (4th ed. published in 2013 by Routledge), doi: 10.4324/9781351266369.
[27] J. M. Wing, “Computational thinking,” Commun. ACM, vol. 49, no. 3, pp. 33–35, 2006.
[28] A. F. De Champlain, “A primer on classical test theory and item response theory for assessments in medical education,” Med. Educ., vol. 44, no. 1, pp. 109–117, Jan. 2010, doi: 10.1111/J.1365-2923.2009.03425.X.
[29] J. M. Azevedo, E. P. Oliveira, and P. D. Beites, “Using learning analytics to evaluate the quality of multiple-choice questions: A perspective with classical test theory and item response theory,” Int. J. Inf. Learn. Technol., vol. 36, no. 4, pp. 322–341, Jul. 2019, doi: 10.1108/IJILT-02-2019-0023/FULL/PDF.

[30] P. Macdonald and S. V. Paunonen, “A Monte Carlo comparison of item and person statistics based on item response theory versus classical test theory,” Educ. Psychol. Meas., vol. 62, no. 6, pp. 921–943, 2002, doi: 10.1177/0013164402238082.
[31] M. McAlpine, Principles of Assessment. Luton, U.K.: CAA Centre, Univ. of Luton, 2002.
[32] T. Haladyna, Developing and Validating Multiple-Choice Test Items, 3rd ed. New York, NY, USA: Routledge, 2004, doi: 10.4324/9780203825945.
[33] R. M. Ryan, “Control and information in the intrapersonal sphere: An extension of cognitive evaluation theory,” J. Pers. Soc. Psychol., vol. 43, no. 3, pp. 450–461, 1982, doi: 10.1037/0022-3514.43.3.450.
[34] T. Teubner, C. M. Flath, C. Weinhardt, W. van der Aalst, and O. Hinz, “Welcome to the era of ChatGPT et al.: The prospects of large language models,” Bus. Inf. Syst. Eng., vol. 65, no. 2, pp. 95–101, Apr. 2023, doi: 10.1007/S12599-023-00795-X/METRICS.
[35] L. Ouyang et al., “Training language models to follow instructions with human feedback,” Adv. Neural Inf. Process. Syst., vol. 35, pp. 27730–27744, Dec. 2022.
[36] A. Mantzarlis, “Fact-checking 101,” Journalism, Fake News and Disinformation: Handbook for Journalism Education and Training, pp. 85–100, 2018, Accessed: May 8, 2023. [Online]. Available: https://reporterslab.org/big-year-fact-checking-not-new-u-s-fact-checkers/
[37] V. Braun and V. Clarke, “Using thematic analysis in psychology,” Qual. Res. Psychol., vol. 3, no. 2, pp. 77–101, Jan. 2006, doi: 10.1191/1478088706qp063oa.
[38] J. White et al., “A prompt pattern catalog to enhance prompt engineering with ChatGPT,” Feb. 2023, Accessed: May 8, 2023. [Online]. Available: https://arxiv.org/abs/2302.11382v1
[39] V. Liu and L. B. Chilton, “Design guidelines for prompt engineering text-to-image generative models,” in Proc. Conf. Hum. Factors Comput. Syst., Apr. 2022, Paper 384, doi: 10.1145/3491102.3501825.
[40] N. Gross, “What ChatGPT tells us about gender: A cautionary tale about performativity and gender biases in AI,” Social Sci., vol. 12, no. 8, Aug. 2023, Art. no. 435, doi: 10.3390/SOCSCI12080435.
[41] A. Deshpande, V. Murahari, T. Rajpurohit, A. Kalyan, and K. Narasimhan, “Toxicity in ChatGPT: Analyzing persona-assigned language models,” Apr. 2023, Accessed: May 8, 2023. [Online]. Available: https://arxiv.org/abs/2304.05335v1
[42] Y. T. C. Yang and W. C. I. Wu, “Digital storytelling for enhancing student academic achievement, critical thinking, and learning motivation: A year-long experimental study,” Comput. Educ., vol. 59, no. 2, pp. 339–352, Sep. 2012, doi: 10.1016/J.COMPEDU.2011.12.012.
[43] P. R. Pintrich, D. A. F. Smith, T. Garcia, and W. J. Mckeachie, “Reliability and predictive validity of the motivated strategies for learning questionnaire (MSLQ),” Educ. Psychol. Meas., vol. 53, no. 3, pp. 801–813, Sep. 1993, doi: 10.1177/0013164493053003024.
[44] T. Smith, “Gamified modules for an introductory statistics course and their impact on attitudes and learning,” Simul. Gaming, vol. 48, no. 6, pp. 832–854, Dec. 2017, doi: 10.1177/1046878117731888.
[45] L. Burmark, “Visual presentations that prompt, flash & transform here are some great ways to have more visually interesting class sessions,” Media Methods, vol. 40, pp. 4–5, 2004.
[46] S. R. Sioni, M. H. Burleson, and D. A. Bekerian, “Internet gaming disorder: Social phobia and identifying with your virtual self,” Comput. Hum. Behav., vol. 71, pp. 11–15, Jun. 2017, doi: 10.1016/j.chb.2017.01.044.
[47] R. J. Chenail, “Interviewing the investigator: Strategies for addressing instrumentation and researcher bias concerns in qualitative research,” Qual. Rep., vol. 16, no. 1, pp. 255–262, 2011.
[48] E. G. Guba, “Criteria for assessing the trustworthiness of naturalistic inquiries,” Educ. Commun. Technol., vol. 29, no. 2, pp. 75–91, Jun. 1981, doi: 10.1007/BF02766777/METRICS.
[49] H. Murphy and N. Roopchand, “Intrinsic motivation and self-esteem in traditional and mature students at a post-1992 university in the north-east of England,” Educ. Stud., vol. 29, no. 2/3, pp. 243–259, Jun. 2003, doi: 10.1080/03055690303278.

Shurui Bai received the Ph.D. degree in e-learning environment from the University of Hong Kong, Hong Kong, in 2022.
She is an Assistant Professor with the Department of Mathematics and Information Technology, Education University of Hong Kong, Hong Kong. She teaches courses on coding and computational thinking and design of innovative learning environments. Her research interests include gamification, digital storytelling, and student motivation to learn.
Dr. Bai was a recipient of second prize in course design at the International Contest on Blended Teaching and Learning 2021 for the project “Stories in gamification works! Amplifying the positive effects of gamification through fantasy to enhance students’ learning and engagement.”

Donn Emmanuel Gonda received the M.Sc. degree in information technology in education from the University of Hong Kong, Hong Kong, in 2018.
He is a Lecturer with the Faculty of Education, teaching courses on instructional design, adult learning, and storytelling. He has over a decade of experience in designing and developing higher education courses and corporate training, implementing innovative technologies, and leveraging data analytics and research to evaluate professional development courses and training. He also facilitates several professional development workshops and seminars across Asia on various topics, such as improving the overall learning experience through instructional design, leveraging e-learning strategies to achieve learning outcomes, and curating cutting-edge technology to make learning fun.
Mr. Gonda was a recipient of the University Outstanding Teaching Award (Team Category) for his work supporting university teachers.

Khe Foon Hew received the bachelor’s degree in computer technology and the master’s degree in instructional design and technology from Nanyang Technological University (NTU), Singapore, in 1993 and 2002, respectively, and the Ph.D. degree in instructional systems technology from Indiana University, Bloomington, IN, USA, in 2006.
He is a Full Professor of information technology studies with the Faculty of Education, University of Hong Kong, Hong Kong. He was a Systems Engineer and had worked for four years at Sony. His research focuses on engaging students in online and blended learning environments.
