ChatGPT for Automated Writing Evaluation in Scholarly Writing Instruction
Parker et al., 2023
Generative artificial intelligence (GAI) in education recently has sparked widespread interest. In contrast to other types of artificial intelligence (AI) used in education that rely on machine learning algorithms or rule-based systems (e.g., Grammarly), GAI produces content learned from its training on a large dataset. ChatGPT, a large language model developed by OpenAI, is one example of GAI and has been touted to have the potential to transform health care education, research, and practice (Sallam, 2023). Specific applications of ChatGPT in health care education include personalized learning experiences in nursing education (O'Connor & ChatGPT, 2023), academic writing …

… Nursing Practice domain emphasize students' ability to describe, articulate, examine, and integrate. Thus, written assignments should be designed to assess these crucial qualifiers (Lim et al., 2023).

However, nursing students at all postsecondary education levels struggle to meet such expectations; moreover, they face difficulties with organization, grammar, flow, sentence structure, and critical analysis (DeCoux Hampton & Chafetz, 2021; Cone & Van Dover, 2012; McMillan & Raines, 2011). Strategies such as peer review, self-assessment, and frequent feedback have been shown to improve postsecondary nursing students' writing proficiency (Bickes & Schim, 2010; Hampton, 2019; Kilmer et al., 2023; Tornwall & McDaniel, 2022). Yet providing thorough and frequent feedback is time-consuming, posing challenges for instructors and students (Bjerkvik & Hilli, 2019; Micsinszki & Yeung, 2021; Zhan et al., 2023).

One possible way to mitigate the challenges associated with student writing and faculty feedback is using GAI tools, such as ChatGPT, for automated writing evaluation (AWE). AWE systems originally focused on analyzing local- or micro-level characteristics, such as grammar, mechanics, lexis, and discourse (Attali & Burstein, 2006; Deane, 2013; Deane & Quinlan, 2010). However, as AWE systems have become more sophisticated, their foci increasingly have shifted to global- or macro-level features, such as organization, communicative goals, and genre characteristics.

Author note: Jessica L. Parker, EdD, MS, RDH, is a Lecturer, Massachusetts College of Pharmacy and Health Sciences, School of Healthcare Business. Kimberly Becker, PhD, is a Lecturer, English Department, Iowa State University. Catherine Carroca, DHS, MSN, RN, is an Associate Professor, Department of Nursing, Massachusetts College of Pharmacy and Health Sciences. Address correspondence to Jessica L. Parker, EdD, MS, RDH, School of Healthcare Business, Massachusetts College of Pharmacy and Health Sciences, 179 Longwood Avenue, Boston, MA 02115; email: Jessica.parker@mcphs.edu. Disclosure: The authors have disclosed no potential conflicts of interest, financial or otherwise. Received: May 29, 2023; Accepted: July 3, 2023. doi:10.3928/01484834-20231006-02
To explore the suitability of GAI for AWE, we utilized exemplary nursing student papers from the Michigan Corpus of Upper-Level Student Papers (MICUSP), a broad source of successful, multidisciplinary student writing (O'Donnell & Römer, 2012; Römer & O'Donnell, 2011). A total of 42 texts were downloaded from MICUSP, of which 18 were written by undergraduate nursing students and 24 by first- through third-year graduate nursing students. Each text was entered separately into ChatGPT-3 following an input prompt that included a rubric with constructs of language proficiency intended to represent aspects of complexity, accuracy, and fluency (CAF), which have been commonly used as criteria for assessing writing development (Ellis & Ellis, 2008; Housen & Kuiken, 2009). The six criteria and their association with CAF are outlined in Table 1, which presents the input prompt and rubric criteria with a 4-point scale ranging from 0 = unacceptable to 3 = meets or exceeds criteria. A paper with a score of 3 for each criterion was considered an "A" paper. The following three criteria from recent AWE research (Zhai & Xiaomei, 2021) were applied to explore the potential of ChatGPT for AWE: the ability to approximate human raters (Shermis, 2014), to address macro-level features such as organization and development (Burstein et al., 2013), and to support multiple submissions and learner autonomy (Wang et al., 2013).

Table 1 (excerpt)
Input Prompt and Rubric Criteria, Scored 0 (Unacceptable) to 3 (Meets or Exceeds Criteria)

[Preceding criterion; row label not captured in this excerpt]
1 – Wording is often imprecise or ambiguous; the sentence structure is often confusing
2 – The paper is, for the most part, precisely worded and unambiguous; the sentence structure is mostly clear
3 – Throughout the paper, the wording is precise and unambiguous; the sentence structure is consistently clear and lucid

Accuracy: Mechanics
Prompt: Now, evaluate the same paper for writing mechanics according to the provided criteria; provide a score and explain the assigned score; and also include suggestions for improvement with at least one specific example from the text if a score of 0 to 2 is achieved.
0 – The paper contains numerous spelling and grammatical errors, including incomplete and run-on sentences
1 – The paper contains several spelling and grammatical errors; there are some incomplete or run-on sentences
2 – The paper contains minor spelling or grammatical errors; there are minimal incomplete or run-on sentences
3 – The paper contains no spelling errors, grammatical errors, or run-on sentences
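The input-prompt procedure above can also be reproduced programmatically. The following is a minimal sketch, assuming the OpenAI chat API as a stand-in for the ChatGPT web interface the authors used; the model name, the abbreviated Mechanics rubric text, and the score_paper helper are illustrative assumptions rather than the study's materials.

# Minimal sketch: submit one student text with a rubric prompt modeled on
# the Table 1 Mechanics criterion. Assumes the OpenAI Python SDK (v1) and
# an OPENAI_API_KEY in the environment; not the authors' actual procedure.
from openai import OpenAI

client = OpenAI()

# Abbreviated Mechanics criterion and score descriptors from Table 1 (excerpt).
MECHANICS_PROMPT = (
    "Evaluate the paper for writing mechanics according to the provided "
    "criteria; provide a score, explain the assigned score, and include "
    "suggestions for improvement with at least one specific example from "
    "the text if a score of 0 to 2 is achieved.\n"
    "0 - numerous spelling and grammatical errors, including incomplete "
    "and run-on sentences\n"
    "1 - several spelling and grammatical errors; some incomplete or "
    "run-on sentences\n"
    "2 - minor spelling or grammatical errors; minimal incomplete or "
    "run-on sentences\n"
    "3 - no spelling errors, grammatical errors, or run-on sentences"
)

def score_paper(paper_text: str) -> str:
    """Send one student text plus the rubric prompt; return the feedback."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # stand-in for the ChatGPT-3 interface
        messages=[
            {"role": "user", "content": f"{MECHANICS_PROMPT}\n\nPaper:\n{paper_text}"},
        ],
    )
    return response.choices[0].message.content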
Results

After collecting ChatGPT feedback for the 42 papers, the criteria mentioned above were applied to assess its formative feedback potential. ChatGPT feedback was compared first to the benchmark score of 3 (grade A) assigned by a human rater. Overall, ChatGPT assigned a score of 3 to one graduate nursing paper and a score of 2 (grade B) to all other papers, indicating stricter grading. As the prompt input requested, ChatGPT provided a detailed explanation for assigning a score of 0 to 2 and also provided guidance on how to improve the writing (Figure 1). ChatGPT responses were then assessed for the tool's ability to provide feedback on macro-level features in writing. In this respect, ChatGPT consistently identified and addressed broader aspects of the writing rather than examining only surface-level grammar or sentence structure (Figure 2). ChatGPT also consistently provided specific suggestions for improvement using excerpts from the text. Notably, ChatGPT feedback contained human-like commentary.

Finally, the ability of ChatGPT to support multiple submissions and learner autonomy was assessed. When a score of 2 was assigned, users could ask clarifying questions for more detailed explanations. Further, users could implement the feedback and enter revised text, or ask ChatGPT to explain its score and provide additional feedback (Figure 3). In all cases, ChatGPT provided suggestions for improvement and answered clarifying questions. Table 2 presents excerpts from the feedback aligned with best-practice criteria from the literature.
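This resubmission loop maps naturally onto a running chat transcript. Below is a minimal sketch under the same assumptions as the earlier example (OpenAI chat API as a stand-in for the ChatGPT interface; continue_review is a hypothetical helper): the full message history is resent each turn so the model can explain its score or re-evaluate revised text in context.

# Sketch of the multiple-submission pattern described above. Names and the
# model choice are illustrative assumptions, not the authors' code.
from openai import OpenAI

client = OpenAI()

def continue_review(history: list[dict], follow_up: str) -> tuple[list[dict], str]:
    """Append a clarifying question or a revision, then fetch the next reply."""
    messages = history + [{"role": "user", "content": follow_up}]
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # stand-in for ChatGPT-3
        messages=messages,
    )
    reply = response.choices[0].message.content
    return messages + [{"role": "assistant", "content": reply}], reply

# Example turns after an initial score of 2:
# history, reply = continue_review(history, "Why did the paper score 2 for mechanics?")
# history, reply = continue_review(history, "Here is the revised paragraph: ...")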
Discussion

The assessment of ChatGPT's feedback on undergraduate- and graduate-level nursing texts demonstrated its potential utility as a tool for AWE in writing instruction. The …