Innovation in Language Learning and Teaching
Journal homepage: https://www.tandfonline.com/loi/rill20

The effect of integrating automated writing evaluation on EFL writing apprehension and grammatical knowledge

Hanan Waer

To cite this article: Hanan Waer (2021): The effect of integrating automated writing evaluation on EFL writing apprehension and grammatical knowledge, Innovation in Language Learning and Teaching, DOI: 10.1080/17501229.2021.1914062

To link to this article: https://doi.org/10.1080/17501229.2021.1914062

Published online: 15 Apr 2021.


The effect of integrating automated writing evaluation on EFL writing apprehension and grammatical knowledge

Hanan Waer

Department of Curriculum & Instruction, Faculty of Education, New Valley University, El Kharga, Egypt

ABSTRACT
Recent years have witnessed an increased interest in automated writing evaluation (hereafter AWE). However, few studies have examined the use of AWE with apprehensive writers. Hence, this study extends research in this area, investigating the effect of using AWE on reducing writing apprehension and enhancing grammatical knowledge. The participants were English majors at an Egyptian university, divided into experimental and control groups. The experimental group used Cambridge 'Write & Improve' to evaluate their writing, whereas the control group had an instructor evaluate their essays. Data were collected using the English Writing Apprehension Scale (EWAS; Abdel Latif [2015. "Sources of L2 Writing Apprehension: A Study of Egyptian University Students." Journal of Research in Reading 38 (2): 194–212. doi:10.1111/j.1467-9817.2012.01549.x]) and a grammar knowledge test (hereafter GKT; adapted from TOEFL). The findings showed statistically significant differences in the post-administration of the EWAS and GKT in favor of the experimental group. Additionally, the effect size of the intervention on apprehensive writers was large. The results indicated that AWE helped the apprehensive writers reduce their apprehension and slightly enhanced the GKT scores of the non-apprehensive writers. Further, negative correlations were found between writing apprehension and grammatical knowledge. This study's results suggest that AWE might be used as a remedial classroom treatment for struggling or apprehensive writers, purposefully integrated as a formative tool and augmented with feedback practices.

ARTICLE HISTORY
Received 22 November 2019
Accepted 3 April 2021

KEYWORDS
Automated writing evaluation; EFL writing apprehension; EFL grammatical knowledge; corrective feedback

Introduction
The twenty-first century has witnessed a radical movement towards digital education, including in
second language teaching. Accordingly, writing as an academic life skill has been influenced by this
movement. Different web-based tools have been used to provide unconventional writing support to
help learners deal with the new digital era. For example, various automated writing evaluation (AWE)
applications have been utilized to help learners with their writing by generating corrective feedback.
Research on using AWE has also shown various gains in writing motivation (Grimes and Warschauer
2010; Wilson and Czik 2016), quality of writing (Stevenson and Phakiti 2014; Wilson 2017), grammatical
accuracy (Liao 2015; Saricaoglu 2019) and writing self-efficacy (Wilson and Roscoe 2020).
Despite the interest in examining the use of AWE with different aspects of writing, few studies
have examined its effect on writing apprehension (also known as anxiety), which can have a negative

influence on writing performance (Daly 1978; Hayes 1996; Sanders-Reio et al. 2014). Previous studies
have also shown that avoidance of writing evaluation situations is an indispensable component in
defining the writing apprehension construct (Daly and Miller 1975; Daly and Shamo 1978; Mabrito
2000; Cheng 2004; Abdel Latif 2015; Autman and Kelly 2017). Most importantly, most writing appre-
hension research occurred in the 1970s before the start of the digital era in which new ways of
writing, such as email or blogging, emerged. Accordingly, it is time to investigate writing apprehen-
sion with ‘the modern learner’ (Autman and Kelly 2017, 516) using new writing applications such as
AWE, which provides immediate and individualized feedback. With this said, this study has focused
on whether integrating an online formative assessment tool, namely Cambridge Write & Improve,
could augment the writing evaluation context and thus reduce students’ writing apprehension.

Writing apprehension
Affect variables such as writing apprehension, writing self-efficacy and writing motivation (Hayes
1996) play a significant role in writing performance. Writing apprehension is an overwhelming
fear that behaviorally manifests in general avoidance of writing and writing evaluation situations.
The term writing apprehension was coined by Daly, who defines it as ‘the general avoidance of
writing situations perceived by individuals to potentially require some amount of writing
accompanied by the potential for evaluation of that writing’ (Daly 1979, 37).
Writing apprehension may have significant negative consequences. For instance, apprehensive
writers tend to avoid jobs that require writing and avoid situations in which they have to show
their writing to the teacher or other colleagues (Daly and Shamo 1978). Significantly, different
studies have shown that writing apprehension is correlated inversely with other affect constructs
such as self-esteem (Hassan 2001) and writing self-efficacy (Sanders-Reio et al. 2014; Abdel Latif
2015; Daniels et al. 2019). Furthermore, writing apprehension negatively affects writing performance.
High-apprehensive writers perform significantly lower on grammar, mechanics, and writing skills
(Daly 1978) and produce lower quality writing (Hassan 2001) than low-apprehensive writers. In par-
ticular, writing apprehension negatively correlates with grammar (Jebreil et al. 2015). Additionally,
Abdel Latif (2015) found a significant negative relationship between linguistic knowledge (vocabu-
lary and grammar) and L2 writing apprehension. Thus, L2 writing apprehension can affect students
emotionally and academically.
Various studies have investigated the sources or causes that may lead to writing apprehension. In
the Egyptian context, two studies focused on that issue. First, in El-Shimi's (2017, 81) research, students' and teachers' responses showed that the fear of negative evaluation and writing under time constraints were two main reasons for L2 writing anxiety. Second, Abdel Latif (2015) identified six
factors that account for a high level of writing apprehension among English majors in Egypt. The
factors were the lack of linguistic knowledge, low foreign language competence, a history of low
achievement in writing and perceived writing performance improvement, low English writing self-
efficacy, instructional practices of English writing and fear of criticism. Outside the Egyptian contexts,
other sources of apprehension were also identified. Hanna’s study (2009) found a direct relationship
between the tone of teacher comments and writing apprehension. Lipsou (2018) also identified four
reasons for apprehension: fear of evaluation, students’ self-beliefs as writers, students’ self-beliefs
about writing, and insufficient writing skills. Thus, apprehension sources can be internal, such as
fear of evaluation/criticism, or external, such as writing instructional practices and teachers’ evalua-
tive comments.
Accordingly, various treatments have been recommended to tackle writing apprehension pro-
blems. For instance, various digital tools, such as networked computer-assisted classrooms, were
used to reduce anxiety or apprehension (Sullivan and Pratt 1996). Similarly, Morphy and Graham
(2012) successfully used word-processing software with struggling writers and found that, when
compared with handwriting, the software use more successfully enhanced student writing
motivation.

Different writing arrangements might influence students’ writing apprehension. For example,
Nobles and Paganucci (2015) investigated students’ perceptions of writing skills and quality using
digital tools and online writing environments versus pen/pencil and paper. Their students perceived
online word processing as a helpful tool in deleting, inserting and revision strategies. Word proces-
sing was also reported as increasing writing motivation (Grimes and Warschauer 2010; Wilson and
Czik 2016). Wilson (2017, 694) argued that automated essay evaluation ‘goes further than basic
word processing by providing instantaneous feedback’. When combining word processing facilities
with effective feedback, students could gain more benefits and better revision quality (Nobles and
Paganucci 2015). Most web-based applications, including AWE, can provide a different supportive
context for writing instruction for apprehensive writers. However, few studies have used AWE to
ameliorate writing apprehension.

Automated writing evaluation (AWE)


Automated writing evaluation, also known as automated essay evaluation or automated writing scoring, is software that automatically generates scores and evaluative feedback on writing (Ware and Warschauer 2006; Saville 2017; Hockly 2019). AWE technology was initially used for sum-
mative assessment in high stakes writing tests, but it has been used increasingly in classroom writing
instruction (Stevenson 2016; Lee 2017). Various AWE applications are available: Project Essay Grade
(PEG), introduced by Ellis Page in 1968; Intelligent Essay Assessor by Pearson; Criterion by Educational
Testing Service, MY Access! by Vantage; Write & Improve by Cambridge English and Write to Learn by
Pearson. Although there are some subtle differences in the feedback layout of the different AWE
applications, the automated scores within those applications are generally based on ‘artificial intelli-
gence, natural language processing and latent semantic analysis’ (Stevenson 2016, 2). For example,
PEG has a peer review feature, whereas Write & Improve provides space for teacher feedback. Such
features provide learners with different writing experiences and individualized feedback.
AWE has various advantages for both teachers and learners. A key advantage of AWE is generat-
ing instantaneous individualized corrective feedback (Li, Link, and Hegelheimer 2015). Thus, AWE is
‘a viable, economically feasible alternative to the expensive endeavor of hand-scored writing assess-
ments’ (Ware and Warschauer 2006, 107), especially in large classes. AWE is also beneficial for learners
since it improves their self-revision skills (Stevenson and Phakiti 2019) and enhances students’ inde-
pendent writing ability and fosters autonomy (Chen and Cheng 2008; Stevenson 2016). Furthermore,
many students reported positive attitudes towards AWE feedback (Grimes and Warschauer 2010;
Wali and Huijser 2018).
A number of different studies have shown the effectiveness of AWE feedback in writing instruc-
tion, mainly grammatical accuracy. For example, Saricaoglu (2019) investigated the effect of auto-
mated formative feedback on the improvement of English as a second language (ESL) written
causal explanations. The results revealed statistically significant changes in learners’ causal expla-
nations within one cause-and-effect essay. Further, Li, Hui-Hsien, and Saricaoglu (2017) explored
both short-term and long-term effects of Criterion feedback on ESL students’ grammatical accuracy
development. The results showed a positive short-term effect of using Criterion on reducing
grammar errors in revised drafts for different proficiency levels in eight out of nine grammar cat-
egories; however, a positive long-term effect was found in only one category. Students reported
that the grammar category was the most helpful out of the five categories presented by Criterion.
Similarly, Wali and Huijser (2018) investigated Bahraini students’ perceptions of Write & Improve,
where 88% perceived grammar as the most useful aspect in helping participants improve their
writing. Their students’ positive attitude toward grammar in AWE is evident in a recent eye tracking
study by El Ebyary and Windeatt (2019). They used a questionnaire and a think-aloud technique to
see how students engaged with Criterion feedback. Results indicated a general tendency to focus
first on grammar, then organization and development.

Other studies have investigated revisions in students’ writing. Li, Link, and Hegelheimer (2015)
suggested that composing with Criterion increased the quality of revisions and that the AWE correc-
tive feedback helped improve accuracy from first to final drafts. While instructors confirmed the
benefits of using Criterion, students’ perceptions were varied. Liao (2015) studied the effects of
using AWE in a process-writing approach on grammatical accuracy in L2 writing. Analyzing the feedback reports on students’ essays, the author found that using AWE seemed to reduce grammatical
errors in revisions and new writing tasks. Chen and Cheng (2008) also found that MY Access! software
improved some formal aspects of students’ writing, such as word choice and sentence structure, but
their participants did not favor using the software.
Nevertheless, there has been criticism of AWE due to its limitations. It has been argued that AWE is
incapable of identifying creativity and self-expression (Stevenson and Phakiti 2019, 137) by focusing
more on formal correctness than supporting writing as a contextualized social activity (Ware and
Warschauer 2006). Thus, AWE feedback is not suitable for young learners who need humanized feed-
back (Lee 2017, 130). Moreover, research has overlooked the teaching and learning involved in using
AWE (Ware and Warschauer 2006); there is excessive focus on the writing product (Lee 2017, 130); and
the empirical evidence for its effectiveness is inconclusive (Cotos 2014). For example, in Huang and
Renandya (2018), although the students’ perceptions of AWE (Pigai) were positive, the improvement
in the quality of their revisions was not significant. However, since the intervention lasted for only
two weeks, their results are questionable.
As with any technological tool, we would not expect AWE to replace the teachers. AWE is neither
‘a magic bullet’ (Hockly 2019, 6) nor ‘a silver bullet for language and literacy development’
(Warschauer and Ware 2006, 175). It would be better to make use of its merits and compensate
for its demerits. AWE corrective feedback is particularly beneficial at the sentence-level, where it
identifies errors in grammar and language usage. It is not, however, sufficiently developed to
provide detailed feedback at the discourse level. If AWE can help with linguistic aspects, such as
grammar, teachers can use it in this capacity and direct their feedback to other aspects, such as ideas and organization. Thus, instructors would do better to ‘go beyond the design of the assessment instrument itself’ (Saville 2017, 204) and exploit its potential. Put simply, teachers can benefit by using AWE as ‘coaches rather than prescribers’ (Warschauer and Grimes 2008).
AWE can be integrated into an instructional framework that focuses on the writing process and
the writing product. For example, Liao (2015) integrated AWE within a process-based approach
where the teacher’s corrective feedback focused on ideas and organization and AWE was used to
revise grammatical and accuracy aspects. Cotos (2014) used an AWE program called ‘IAED’ as a revi-
sion tool in an academic writing class in the same vein. Such studies demonstrate how AWE may
supplement the teachers’ role by saving time and effort, focusing their own feedback on the
higher aspects of writing.
If AWE is helpful in grammar and other linguistic aspects (e.g. Chen and Cheng 2008; Liao 2015;
Li, Hui-Hsien, and Saricaoglu 2017), it may be a good asset or best fit for apprehensive writers. Some
studies have reported that apprehensive writers are more worried about linguistic errors than dis-
course ones (Sanders-Reio et al., 2014; as cited in Abdel Latif 2015). Consequently, the nature of
AWE feedback may mean that it can address the particular needs of apprehensive writers. The
researcher thought that AWE corrective feedback might be beneficial for enhancing incidental
grammar learning for apprehensive writers.
The studies discussed so far have dealt with students’ perceptions of AWE and its effects on aca-
demic gains, such as writing performance, grammatical accuracy and revision quality. Another two
studies dealt with writing affect constructs, namely, self-efficacy and apprehension. The first study by
Wilson and Roscoe (2020) investigated the direct and indirect effects of PEG when compared with
Google Docs software on writing quality, writing self-efficacy, English language arts test (ELA)
results, and teachers’ perceptions. Using the path analysis technique, the researchers found that
writing with PEG led to more positive writing self-efficacy and better ELA performance. Furthermore,
self-efficacy indirectly influenced or ‘partially mediated the effect of the composing condition in state
test performance’ (Wilson and Roscoe 2020, 29). Additionally, the teachers highly valued the PEG
feedback benefits and, significantly, its benefits for struggling writers.
The second study, by Fisher (2017), investigated the effect of using an AWE application called ‘Autograder’ on reducing writing anxiety among freshman community college students in the U.S. Daly and Miller’s (1975) writing apprehension scale was applied as a pre- and post-test to 130 stu-
dents. Autograder was used with the experimental group to evaluate their writing; whereas, a
control group had an instructor evaluate the essays following in-class writing. Although the
results showed no statistically significant difference between the two groups based on Autograder usage, gender, or age, it was found that Autograder students’ writing anxiety decreased with each
attempt.
Wilson and Roscoe (2020) is the first study that reveals how AWE can positively affect writing self-
efficacy as a writing affect construct. Their study is closely related to the present study as the pre-
vious studies showed that self-efficacy is correlated negatively with writing apprehension (e.g.
Abdel Latif 2015). Similarly, AWE may be a promising tool with writing apprehension as it provides
both linguistic and situational support that may relieve writing apprehension. As Abdel Latif (2019,
156) argued, using AWE can reduce writing apprehension among L2 students in two ways:
First, it raises their linguistic awareness and in turn helps in alleviating negative affect by improving language
and writing performance and competence beliefs. Second, it has a positive motivational impact on their willing-
ness to write due to the private window provided for evaluating their texts.

AWE may be particularly beneficial for apprehensive writers because it provides ample writing and
revision opportunities (Wilson and Czik 2016). It can also enhance explicit grammatical knowledge,
as ‘grammar is probably the most rule-based aspect of writing. An automated feedback tool is, there-
fore, suitable as it allows for clear feedback about whether the rules are being followed or not’ (Wali
and Huijser 2018, 17).
Importantly, using AWE as a digital formative assessment tool may provide a different evaluative
environment in which EFL learners will work at their own pace without the fear of face-to-face evalu-
ation of either teachers or peers. AWE may also help L2 apprehensive writers as it provides an uncon-
ventional method of evaluation, which was not available in other technological tools used in the
previously mentioned studies.
In conclusion, a gap in the literature appears to exist regarding AWE’s potential effect on L2 writing
apprehension, as there seems to be only one study (Fisher 2017) in the USA. No previous studies, to
the researcher’s best knowledge, have investigated this issue in an EFL context. The Egyptian EFL
context deserves further investigation since it differs from student populations in the previous
studies (Fisher 2017; Wilson and Roscoe 2020), where English is taught as the first or the second
language. Egyptian learners primarily practice the foreign language in the classroom and are only occasionally exposed to English outside it. Since writing apprehension is
context-sensitive due to some factors such as language proficiency and previous exposure to
writing (Masny and Foxall 1993, 9), it is vital then to consider the varied linguistic contexts when
researching the effectiveness of AWE. Moreover, the available qualitative studies in the Egyptian
context about writing apprehension indicated high levels among Egyptian students (Hassan 2001;
Abdel Latif 2015; El Shimi 2017). This apprehension problem requires an interventional solution, which has not yet been adequately addressed.
Additionally, most previous AWE classroom-based studies investigated specific software such as
Criterion, PEG, and My Access (Stevenson 2016). Only one study used Write & Improve (Wali and
Huijser 2018). What distinguishes Write & Improve is that it is freely available software. It is specifically
tailored to EFL learners’ needs (Harrison 2017) since the software feedback is aligned with the Common European Framework of Reference (CEFR) scale, which is designed to provide a comprehensive basis for language learning and teaching and the assessment of foreign language proficiency (Council of Europe 2020, 42).
Accordingly, the present study was conducted to fill these critical gaps and explore the impact of
integrating AWE on reducing writing apprehension and improving grammatical knowledge. This
study aims to examine whether the use of AWE could significantly reduce writing apprehension
levels and enhance explicit grammatical knowledge among Egyptian English majors. This study
seeks to answer the following questions:

(1) What is the effect of AWE on EFL writing apprehension?


(2) What is the effect of AWE on EFL grammatical knowledge?
(3) What is the effect of AWE on the relationship between writing apprehension and grammatical
knowledge?

Context and methodology


Participants and course design
The present study participants were 103 English majors at an Egyptian university (12 males and 91
females). They were between 19 and 21 years old. They were all native Egyptians whose first language
is Arabic, the official language of Egypt. Before joining the university, they studied English as a
foreign language for approximately 12 years. At university, they studied different courses in
English such as drama, poetry and criticism. They studied educational courses in Arabic, such as psy-
chology. All the participants had a year of computer classes and were computer literate. Prior to the
writing course described in this study, participants had three university-level courses: Writing I, II and
III. The participants’ language proficiency levels ranged from intermediate to upper-intermediate.
The study course, Writing IV, was taught by the researcher and a teaching assistant (TA). The
course aimed to help students enhance coherence and cohesion in their writing, to write
different types of essays: argumentative, comparison/contrast, and descriptive, and to use
different pre-writing techniques such as clusters, listing, and graphic organizers. The course followed
a process-writing approach. First, the focus was on analyzing the structure of sample essays, followed
by an essay assignment. The students brainstormed ideas, wrote their outlines in the classroom, and
then wrote their first draft as homework. Next, samples of students’ essays were selected by the TA,
who provided generic feedback on structure and content and individualized linguistic feedback. The
course lasted twelve weeks from February to April 2019, and the students studied one theoretical
hour with the course instructor and two practical hours with the TA. The intervention ran from
the last week of March until the last week of April. Participants were divided randomly into either
the control (50) or experimental group (53).
AWE was introduced in the middle of the course. Since participants in the experimental group had not used AWE before the intervention, they received focused training on the use of AWE in class.
The researcher explained to the students how to use the website, the layout views and the interpret-
ation of different types of feedback, and finally, how to revise and check their drafts. In groups, stu-
dents wrote one essay using their mobiles and extensively discussed the software feedback. Next,
the researcher explained that they were required to individually submit four essays as assigned on the AWE website and make the necessary revisions until they were satisfied with their final drafts.
Four writing tasks were used in the present study for both the experimental and control groups.
They were selected from the sample writing topics of the TOEFL. The writing task was to write a 300–
400 word essay. The following are the prompts for the four tasks:

(1) Do you agree or disagree with the following statement?


‘Children should grow up in the countryside rather than in a big city’.
Use specific reasons and examples to support your opinion.
(2) Some people say that the internet provides people with much valuable information. Others think
access to so much information creates problems. Which view do you agree with? Use specific
reasons and examples to support your opinion.
(3) Do you agree or disagree with the following statement?


‘Life today is easier and more comfortable than it was when your grandparents were children’.
Use specific reasons and examples to support your opinion
(4) Some students prefer classes with frequent discussions between the professor and the students
with almost no lectures. Other students prefer classes with many lectures and almost no discus-
sions. Which class do you prefer? Use specific reasons and examples to support your opinion.

Study design
A pre- and post-test (between- and within-subjects) experimental design was used with randomized assignment to the experimental (N = 53) and control (N = 50) groups. Each group included
apprehensive and non-apprehensive writers. The tests consisted of the EWAS as a measure of
writing apprehension and the grammatical knowledge test (GKT) to measure grammatical knowl-
edge: the two dependent variables in this study. The independent variable was the feedback or
evaluation method: traditional vs. AWE.

Instruments
English Writing Apprehension Scale (EWAS)
The researcher used the English Writing Apprehension Scale (EWAS) adapted by Abdel Latif (2015)
from three different measures: Gungle and Taylor’s (1989) ESL version of the Daly and Miller (1975)
Writing Apprehension Test (WAT), Graham, Schwartz, and MacArthur (1993) Attitudes Toward
Writing Scale and Cheng’s (2004) Second Language Writing Anxiety Inventory (SLWAI) for avoidance
behavior subscale. The researcher selected this scale for two main reasons. First, it is adapted from
three different valid scales. Second, it was administered with a satisfactory reliability Alpha coeffi-
cient (α = 0.85) in an Egyptian context similar to the present study.
The EWAS assesses only one construct, which is writing apprehension. The construct is composed
of three main components, which are (1) students’ dis-/liking of writing (e.g. I like writing in English),
(2) students’ tendency to avoid/approach the situations in which they may be required to write (e.g. I
usually do my best to avoid writing English essays.), and (3) their tendency to avoid/approach the situ-
ations in which their writing may be evaluated (e.g. I do not like my English essays to be evaluated).
The adapted EWAS is a 12-item 5-point Likert scale. The scale scores may range from 12 (the minimal
score) to 60 (the maximal score). All EWAS items (Appendix 1) were normally coded from 5 = totally
agree to 1 = totally disagree (e.g. I would rather read than write in English.) except for items 3, 4, 5, 6
and 8, which were reverse coded (e.g. I have no fear of my English writing being evaluated.). To
measure the reliability and validity of EWAS, it was piloted on a sample of 60 university students
in April 2018. The reliability coefficient was α = 0.80. In addition, Pearson correlations showed significant
correlations for all scale items (See Appendix 1).
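To make the scoring procedure concrete, the following is a minimal sketch (not the author's actual analysis script; the file and column names are hypothetical) of how 12-item EWAS responses could be reverse coded and summed in Python, with Cronbach's alpha computed via the pingouin library rather than SPSS:

```python
import pandas as pd
import pingouin as pg

REVERSE_ITEMS = [3, 4, 5, 6, 8]  # EWAS items that are reverse coded in this study

def reverse_code(responses: pd.DataFrame) -> pd.DataFrame:
    """Reverse code the negatively keyed items (5 -> 1, 4 -> 2, ...)."""
    coded = responses.copy()
    for i in REVERSE_ITEMS:
        coded[f"item_{i}"] = 6 - coded[f"item_{i}"]
    return coded

# Hypothetical pilot data: one row per respondent, columns item_1 ... item_12 with raw 1-5 answers
pilot = pd.read_csv("ewas_pilot.csv")
coded = reverse_code(pilot)
totals = coded.sum(axis=1)                 # EWAS total score, possible range 12-60
alpha, _ = pg.cronbach_alpha(data=coded)   # internal consistency (alpha = 0.80 reported in the pilot)
print(f"Mean EWAS = {totals.mean():.1f}, Cronbach's alpha = {alpha:.2f}")
```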
In Abdel Latif’s study (2015), his participants had a mean score of 32.3 on the EWAS and a stan-
dard deviation of 8.4. Using a cutting score formula, Abdel Latif identified the apprehensive and non-apprehensive writers as those whose scores on the scale fell at least half a standard deviation above and below the mean, respectively. Accordingly, the apprehensive writers in his study scored more than 36 on the EWAS, whereas the non-apprehensive writers scored less than 28 on the scale. Likewise, this
study used the same criteria. After pre-administration of the EWAS and analyzing the participants’
responses, the data showed that the control group had 24 apprehensive and 26 non-apprehensive
writers; whereas, the experimental group had 22 apprehensive and 31 non-apprehensive writers.
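As a worked example of this cutting-score rule (a sketch based on Abdel Latif's reported norms, not code from the study), the half-standard-deviation computation reproduces the cut-offs of 36 and 28 used here:

```python
MEAN, SD = 32.3, 8.4                      # EWAS mean and SD reported by Abdel Latif (2015)
print(MEAN + SD / 2, MEAN - SD / 2)       # 36.5 and 28.1 -> rounded cut-offs of 36 and 28

def classify(ewas_total: int) -> str:
    """Label a writer using the study's cut-offs on the EWAS total score."""
    if ewas_total > 36:
        return "apprehensive"
    if ewas_total < 28:
        return "non-apprehensive"
    return "unclassified"                  # scores in the middle band receive no label

print(classify(38), classify(26))          # apprehensive non-apprehensive
```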

Grammatical knowledge test (GKT)


The structure and written expression section of the Longman paper-based TOEFL test (Phillips 2004)
was used to check students’ grammatical knowledge. This section includes two parts. The structure
part (questions 1–15) has 15 incomplete sentences, and the test taker chooses the one word or
phrase that best completes each sentence. In the written expression part (questions 16–40), each
sentence has four underlined words or phrases, and the test taker is required to choose the incorrect
one. The researcher included questions 1–30 in the GKT. Hence, the adapted GKT has a score range of
0–30. The results of test piloting showed high reliability with Cronbach’s alpha (α = 0.91).

Reflective journals
The experimental group was asked to write two reflective journals discussing what they liked and did
not like about the software and what they would do differently in their next piece of online writing to
face the challenges they met. They were free to write the journal online on the Cambridge website or
submit a handwritten version. Most of the students wrote one journal.

Cambridge English write & improve


Write & Improve software is an AWE system that uses the Common European Framework of Reference
(CEFR) scale. Professor Ted Briscoe’s team developed it at Cambridge University Computer Labora-
tory. Its scoring system builds on a predictive model that combines a supervised machine learning algorithm, a support vector machine (SVM), with computational linguistics. Unlike other scoring systems
that use classification, Write & Improve is distinguished by using pairwise preference ranking. The
SVM is fed with graded essays and extracted text features (such as word sequences, grammatical
constructions) of annotated essays (Briscoe, Medlock, and Andersen 2010). The training data is
the Cambridge Learner Corpus (CLC) that includes five million Cambridge exams (65 million
words) and is benchmarked with CEFR scores graded by human examiners. The CLC written part
is manually error-coded and parsed using RASP (Briscoe, Carroll, and Watson 2006), which extracts
the linguistic features that characterize the richness of different essays. Therefore, Write & Improve
assesses and scores unseen written texts or an essay based on the previously extracted features
and immediately provides feedback. The system was evaluated via the Pearson correlation between human scores and the system’s predicted scores, which reached 0.77, whereas test analyses showed up to 90% precision in detecting incorrect word sequences (Briscoe, Medlock, and Andersen 2010).
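To illustrate the pairwise preference ranking idea in general terms, the sketch below uses scikit-learn and toy data; it is only a schematic illustration of the technique, not the actual Write & Improve model, corpus, or feature set. Pairs of essays with a known quality ordering are converted into feature-difference vectors, a linear SVM learns to predict which member of each pair is better, and the resulting weight vector scores unseen essays:

```python
import numpy as np
from sklearn.svm import LinearSVC

def pairwise_transform(X, y):
    """Turn essays with quality scores y into pairwise difference examples.

    For each pair (i, j) with y[i] != y[j], the feature is X[i] - X[j] and the
    label is +1 if essay i is better, -1 otherwise.
    """
    diffs, labels = [], []
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            if y[i] == y[j]:
                continue
            diffs.append(X[i] - X[j])
            labels.append(1 if y[i] > y[j] else -1)
    return np.array(diffs), np.array(labels)

# Toy data: 200 essays, 50 hypothetical text features, graded on a 1-6 (CEFR-like) band
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = rng.integers(1, 7, size=200)

X_pairs, y_pairs = pairwise_transform(X, y)
ranker = LinearSVC(C=1.0, max_iter=10000).fit(X_pairs, y_pairs)

# The learned weight vector acts as a scoring function for new, unseen essays.
scores = X @ ranker.coef_.ravel()
```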
Write & Improve can be freely used for self-study or with a classroom teacher. For self-study, the
user chooses a suitable level (beginners A1 and A2, intermediate B1 and B2, advanced C1 and C2)
and selects a writing task (an essay, a report or a letter). Once the user finishes the task, s/he submits the writing by clicking on the ‘check’ button. After submission, the system displays the
writer’s level and provides feedback at the sentence level and word level using different colors
(see Figure 1; extracted from this study’s data). For example, a white highlight color means the sentence is fair, whereas a yellow highlight color indicates problems in the sentence. As for word errors, they appear as colored icons indicating different errors such as spelling, punctuation, grammar and word choice. The software also generates some suggestions for improvement. As shown in Figure 1, the correct spelling of the word ‘moreover’ is suggested.

Figure 1. Students’ feedback view (an example extracted from the corpus of the present study).
The software also shows the user the writing level progress after the writer revises and checks for
automatic feedback. Figure 2 shows a progress chart of a student whose starting level is A1 with the
first submission, and after 34 checks, the level increases to B2.

Figure 2. A student’s level and progress chart in Write & Improve.

Teachers using Write & Improve with a class first need to create a workbook (e.g. Class A). They can
then copy ready-made tasks from the website into their workbook or create a new task and select the
appropriate CEFR level. To use a different scoring system, such as estimated IELTS scores, they need
to purchase a subscription.
Teachers can track their students’ changes or revisions through class view, which also needs a
subscription. The classroom data can also be downloaded as an Excel file. The teacher can
examine the score range, student progress, and student progress heat map via the class view
option (See Appendix 2). Figure 3 illustrates the students’ progress in all essays submitted in the
present study. This view shows the class teacher the different writing levels in the class. For
example, this figure shows that most students are located within the A2-C1 range; whereas, a
small number is at the lowest level (A1) and the highest level (C2). Teachers may also use this
data to identify low-level students and prepare individualized feedback for them.

Figure 3. Students’ progress in the teacher’s class view.



Procedures
In light of the research ethics of the study intervention, which involves collecting scale and test data
from the research participants, the researcher explained to the participants the research purpose,
any possible risks and benefits of participation, the confidentiality of the collected data, and the
ability to withdraw from participation at any time. After getting the participants’ consent on the
third week of March 2019, the researcher randomly divided students into experimental and
control groups. A few students did not have regular internet access, so they were assigned to the
control group. The researcher created a workbook on Write & Improve entitled ‘An English writing
course’ (see Appendix 3). The GKT and the EWAS were carried out in two different sessions with
both groups before the intervention (3rd week of March 2019). After confirming the groups’ hom-
ogeneity (see Table 1 in the following section), the researcher met with the experimental group
and introduced them to the Write & Improve website. Some students registered using their
mobile phones (4th week of March 2019). Additionally, a Facebook group for the course was estab-
lished to discuss any challenges or technical issues facing students on the Cambridge website. The
community had 73 student members, 22 of whom were from the control group, and a total of 5 posts. The researcher posted a video on 30 March about using ‘Write & Improve’. However, this Facebook community was not actively used by students, except for one student who posted on 5 April 2019 asking about a technical issue in accessing the writing task. The other 3 posts were by the researcher, and students did not comment on them. Thus, any sense of belonging to an online community was unlikely to have influenced students’ writing apprehension.
Both groups were assigned the same number of essay tasks and were required to submit two
drafts. The experimental group used Write & Improve. They submitted their first draft online,
checked the website, and worked on it until they were satisfied with the final version. The control
group wrote their paper drafts and submitted them to the teaching assistant.
Feedback on grammar and vocabulary categories was similar in AWE and the teaching assist-
ant’s corrective feedback, focusing on tenses, subject-verb agreement, spelling, prepositions,
articles and word choice. However, essay structure and organization categories were not included explicitly in Write & Improve. Therefore, the instructor provided supplemental feedback in the
teacher’s space after the essay to compensate for this shortage. For example, the thesis statement
was weak in one student’s essay, so the researcher provided explicit feedback on this point in the
space provided for teacher feedback and asked the student to fix the thesis pattern. To ensure
fairness and objectivity in providing equal feedback to both groups, the researcher used a
rubric (see Appendix 4), which the teaching assistant also used to evaluate the control group
essays. This rubric was a benchmark to ensure, as much as possible, that the feedback provided
to students in the control group was not different from their counterparts in the experimental
group.
The provided corrective feedback in both groups guided students in editing their second
drafts, correcting their errors, and improving writing quality. Using such feedback in the revision
process could improve the writing accuracy (e.g. usage and mechanics) of students’ drafts. In
such writing process practices, corrective feedback might improve some linguistic aspects and
accuracy from first to final drafts and develop learners’ grammatical knowledge. It also might
enhance grammatical accuracy in subsequent writing tasks, as reported in some AWE and
teacher feedback studies (Chen and Cheng 2008; Li, Link, and Hegelheimer 2015; Liao 2015;
Saadi and Saadat 2015).
The experimental group students submitted four different essays online with at least two drafts
on the website. In addition, some students wrote either online or handwritten reflective journals
about their experience with AWE software. By the end of April 2019, the GKT and the EWAS were
carried out again with both groups. Semi-structured interviews were then held individually with
20 students to explore the participants’ experience further. For reasons of space, the interview
data is not included.

Results
Students’ engagement with the AWE system
The classroom view data provided some valuable input to contrast patterns of interaction with the
Write & Improve website among the experimental group participants. It was found that the mean of
formative feedback checks by apprehensive writers (Sum = 1118; Mean = 50.81) surpassed that of
the non-apprehensive writers (Sum = 1113; Mean = 35.90). This result indicates that the apprehen-
sive writers (N = 22) used the website more and experienced a greater amount of AWE use than
the non-apprehensive writers (N = 31). In turn, that might have helped decrease writing apprehension, as is evident in the decreased number of apprehensive writers, which dropped from 22 before the intervention to only two after it. Accordingly, it can be concluded
that the initial analysis of AWE use provided some evidence that AWE feedback might have
helped the apprehensive writers in some way. The findings in the following subsections extensively
reveal how the AWE writing experience affected apprehensive writers.
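For readers who want to reproduce this kind of usage summary from the class-view export described in the methods section, a hedged pandas sketch follows; the file and column names are assumptions for illustration, not the actual export schema:

```python
import pandas as pd

# Hypothetical Write & Improve class-view export: one row per student with a count of feedback checks
usage = pd.read_excel("class_view_export.xlsx")     # assumed columns: student_id, checks
groups = pd.read_csv("ewas_pre_groups.csv")         # assumed columns: student_id, apprehension

merged = usage.merge(groups, on="student_id")
summary = merged.groupby("apprehension")["checks"].agg(["count", "sum", "mean"])
print(summary)
# In this study, the apprehensive writers (N = 22) checked more often (Sum = 1118, Mean = 50.81)
# than the non-apprehensive writers (N = 31; Sum = 1113, Mean = 35.90).
```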

Quantitative findings of EWAS scale and GKT test


Inferential statistics were used to draw conclusions from the sample tested. The Statistical Package
for the Social Sciences (SPSS 23) was used to code and tabulate scores collected from the EWAS and
the GKT. T-tests were used to identify differences between the main groups: the experimental and control groups. As this study involved three variables (time, condition and writing apprehension level) in a 2 (experimental vs. control group) × 2 (apprehensive vs. non-apprehensive) × 2 (pre- vs. post-test) arrangement, a mixed design was more reasonable. Thus, a one-way analysis of variance (ANOVA) was used to analyze data obtained from the EWAS and GKT (see Erkan and Saban 2011; Vanhille, Gregory, and Corser 2017, who used similar statistics). The differences being investigated were between more than two groups, and only ANOVA can compare means ‘in the context of more complex research designs with three or more conditions or groups’ (Kinnear and Gray 2008, 244). Nevertheless, when conducting the ANOVA analysis, apprehension was treated as a categorical rather than a continuous variable. Treating it as categorical is somewhat arbitrary and might reduce statistical power. To compensate for this potential limitation, paired-sample t-tests and an independent t-test were run between the apprehensive subgroups to assess the effectiveness of AWE and its
effect size. Next, the Pearson correlation was used to reveal the change in the relationship
between the study variables, namely writing apprehension and grammatical knowledge.
The GKT and the EWAS were carried out before the intervention. The experimental and the
control groups were homogenous as the EWAS pre-test results showed no significant difference
between the two groups (t (103) = 0.81; p = 0.418), nor did the GKT pre-test (t (103) = 0.74; p = 0.464). Further, the t-test for homogeneity between the apprehensive writers in both the experimental and control groups showed no significant difference (t (46) = 0.718; p = 0.476).
Descriptive statistics of the participants’ scores on the EWAS and GKT by condition (experimental
vs. control) and apprehension level are displayed in Tables 1 and 2.
The following findings answer the research questions, respectively.

Table 1. Control and experimental group scores.

              Experimental group (N = 53)    Control group (N = 50)
Tests         M        SD                    M        SD
EWAS  Pre     31.32    7.27                  32.46    6.90
      Post    28.51    5.17                  32.82    6.68
GKT   Pre     15.60    5.94                  14.78    5.38
      Post    19.42    6.632                 16.26    4.70

Table 2. Descriptive data according to the group and the apprehension level.

                                                          Pre-test        Post-test
DV          Group                  Apprehension level   N   Mean    SD    Mean    SD
EWAS scale  Control (N = 50)       Apprehensive        24   38.75   1.96  36.67   4.30
                                   Non-apprehensive    26   26.65   4.18  29.27   6.48
            Experimental (N = 53)  Apprehensive        22   38.27   2.52  31.23   3.59
                                   Non-apprehensive    31   26.39   5.15  26.58   5.30
GKT test    Control                Apprehensive        24   13.33   3.82  15.38   3.95
                                   Non-apprehensive    26   16.11   6.28  17.08   5.25
            Experimental           Apprehensive        22   13.77   4.74  17.18   4.03
                                   Non-apprehensive    31   16.90   6.43  21.00   7.65

Effect of AWE on writing apprehension


Initially, an independent samples t-test was conducted to determine whether there were differences
in the control and experimental groups’ scores regarding writing apprehension. Significant differ-
ences were found between the two groups on EWAS (t (101) = 3.68; p = 0.001). Accordingly, this indi-
cates a statistically significant difference between the mean scores of the experimental and control
groups in the EWAS post-test in favor of the experimental group. To determine the effect size of the
treatment for the experimental group (N = 53), Cohen’s (1988) d was calculated (d = 0.70; r = 0.32). This
value illustrates a medium effect of the independent variable (AWE) on the writing apprehension of
this group.
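These figures can be recomputed from the summary statistics in Table 1 alone. The sketch below (using SciPy rather than the original SPSS output) reproduces the post-test t value, the pooled-SD version of Cohen's d, and the d-to-r conversion; small discrepancies with the reported values are rounding:

```python
import math
from scipy.stats import ttest_ind_from_stats

# Post-test EWAS summary statistics from Table 1 (control listed first so t comes out positive)
m_ctl, sd_ctl, n_ctl = 32.82, 6.68, 50
m_exp, sd_exp, n_exp = 28.51, 5.17, 53

t, p = ttest_ind_from_stats(m_ctl, sd_ctl, n_ctl, m_exp, sd_exp, n_exp, equal_var=True)

sd_pooled = math.sqrt(((n_exp - 1) * sd_exp**2 + (n_ctl - 1) * sd_ctl**2) / (n_exp + n_ctl - 2))
d = (m_ctl - m_exp) / sd_pooled        # ~0.72, close to the reported d = 0.70
r = d / math.sqrt(d**2 + 4)            # ~0.34, close to the reported r = 0.32
print(f"t({n_exp + n_ctl - 2}) = {t:.2f}, p = {p:.3f}, d = {d:.2f}, r = {r:.2f}")
```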
To further examine the precise difference among the apprehensive writers in the experimental
and the control groups on the EWAS, an independent sample t-test was run. Although no significant
differences were found between the apprehensive writers in the experimental and control groups in
the EWAS pre-test (t (44) = 0.718; p = 0.051), significant differences were found in the post-test (t (44)
= 4.63; p = 0.001) favoring the apprehensive writers in the experimental group as shown in Table 3.
Moreover, Eta squared indicates that 32.8% of the variance in the writing apprehension is explained
by the independent variable, which possibly indicates the effectiveness of AWE intervention in redu-
cing writing apprehension for the apprehensive writers in the experimental group.

Table 3. Independent sample t-test for EWAS pre and post-tests for the apprehensive subgroups.

Apprehensive    N    Mean    SD     df    t      Sig.    Eta²
Experimental    22   31.23   3.59   44    4.63   0.000   0.328
Control         24   36.67   4.30
*p < 0.05.

Cohen’s d was calculated (d = 1.37; r = −0.57) to determine the extent of the effect of the AWE on
the experimental apprehensive writers. The value shows a large effect size of the AWE in reducing
writing apprehension among them. This effect also exceeds the effect size that Hattie (2017) reported for educational interventions on reducing anxiety (0.42).

Differences between the subgroups according to their writing apprehension level


ANOVA tests were run to determine the precise difference between the four subgroups based on the
apprehension level (see Table 2) on the EWAS. A one-way ANOVA was performed to compare the
impact of AWE use on writing apprehension. As shown in Table 2, there were four groups based
on two intervention conditions (control × experimental) and two apprehension levels (apprehensive
and non-apprehensive), coded as Exp Appre., Exp non-Appre, Cont. Appre and Cont. Non-Appre. The
EWAS results were normally distributed, but equal variances were not assumed based on Levene’s
test (F (99) = 3.071, p = 0.031). Therefore, the following robust tests of equality of means were
used. Both ANOVA’s Welch test (F (99) = 53.397, p = 0.001) and Brown-Forsythe test (F (99) = 78.83, p
= 0.001) indicated significant differences in the EWAS means between the four sub-groups.
Post-hoc comparisons using the Games-Howell test were run to determine where those differ-
ences lay. The test indicated a significant difference between the mean scores of the experimental
apprehensive (M = 31.23, SD = 3.59) and the control apprehensive (M = 36.67, SD = 4.30) at p = 0.001;
between the experimental non-apprehensive (M = 26.58, SD = 5.30) and the control apprehensive at
p = 0.001; and between the control apprehensive and the control non-apprehensive (M = 29.27, SD
= 6.48) at p = 0.001. There was also a significant difference between the experimental apprehensive
and the experimental non-apprehensive (M = 26.58, SD = 5.30) at (p = 0.002). The control non-appre-
hensive group did not differ from either the experimental apprehensive (p = 0.55) or the experimen-
tal non-apprehensive (p = 0.33). This finding shows that the significant pre-EWAS mean differences between the experimental apprehensive writers and the non-apprehensive writers in both the experimental (mean difference = 11.88, p = .001) and control (mean difference = 11.61, p = .001) groups were no longer significant in the post-EWAS. This means that the writing apprehension of the experimental apprehensive group changed and moved closer to that of the non-apprehensive writers (Figure 4 visually shows this change). Accordingly, this indicates that the AWE intervention might have contributed to reducing the writing apprehension of the experimental apprehensive group.
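For readers who want to replicate this step outside SPSS, the same robust one-way analysis and Games-Howell post-hoc comparisons are available in the pingouin library. The sketch below is illustrative only; it assumes a long-format DataFrame with a 'post_ewas' score column and a 'subgroup' column holding the four labels used above (the file and column names are not from the study's materials):

```python
import pandas as pd
import pingouin as pg

# One row per student: subgroup in {"Exp Appre.", "Exp non-Appre", "Cont. Appre", "Cont. Non-Appre"},
# post_ewas holding the EWAS total after the intervention
df = pd.read_csv("post_ewas_long.csv")   # hypothetical file name

# Levene's test for homogeneity of variances (reported as violated in this study)
print(pg.homoscedasticity(df, dv="post_ewas", group="subgroup", method="levene"))

# Welch's ANOVA, robust to unequal variances
print(pg.welch_anova(data=df, dv="post_ewas", between="subgroup"))

# Games-Howell post-hoc comparisons between the four subgroups
print(pg.pairwise_gameshowell(data=df, dv="post_ewas", between="subgroup"))
```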

Figure 4. EWAS means plot for time × treatment interaction.

Furthermore, a one-way repeated measures ANOVA was conducted to test the interaction effect
of time and treatment. Initially, the sphericity assumption was met as time has two levels (pre and
post) and thus, there was no value for Mauchly’s test. The results revealed a significant interaction
effect (F (3, 99) = 27.86, partial eta squared (η²) = 0.458, p = .001). This effect indicates that the change over time differed across the treatment conditions. The interaction is ordinal, as displayed in Figure 4. The lines’ slopes are not parallel, indicating that the interaction effect between time and treatment is significant, given enough statistical power.
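The time × treatment interaction itself can be checked with pingouin's mixed-design ANOVA. This is again an illustrative sketch with assumed file and column names, using a long-format table with one row per student per time point:

```python
import pandas as pd
import pingouin as pg

# Long format: columns id, time ("pre"/"post"), subgroup (condition x apprehension label), ewas
long_df = pd.read_csv("ewas_long.csv")   # hypothetical file name

aov = pg.mixed_anova(data=long_df, dv="ewas", within="time",
                     subject="id", between="subgroup")
print(aov[["Source", "F", "p-unc", "np2"]])   # the Interaction row corresponds to time x treatment
```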

Figure 4 shows the means of the four groups on EWAS before (time 1) and after the treat-
ment (time 2). First, it is noted that the experimental apprehensive group moved closer to the non-apprehensive writers in both the experimental and control groups. This result confirms the non-significant differences among these three groups, as shown in the post-hoc comparisons. Second, the experimental apprehensive writers’ apprehension level considerably decreased from time 1 to time 2 (with a mean difference of −7.04). In contrast, the control apprehensive writers’ apprehension decreased only slightly (with a mean difference of −2.08). This result explains
the decreased number of apprehensive writers in the experimental group, from 22 to 2, after
the intervention. It also explains the significant difference between these two groups, as dis-
played in the post-hoc tests.
Given the slight decrease among the control apprehensive writers, it is worth determining whether the reduction in apprehension level was significant in both the experimental and control groups. To that
end, a paired sample t-test was run and it indicated a significant difference for the experimental
apprehensive writers between the pre and post EWAS (t (21) = 9.6; p = 0.001). Interestingly, it also
showed a significant difference for the control apprehensive writers between the pre and post
EWAS (t (23) = 2.58, p = 0.017).

Effect of AWE on grammatical knowledge


An independent sample t-test for the GKT post-test showed a significant difference between the
experimental and control groups in the GKT (t (101) = 2.77; p = 0.007). Accordingly, this indicates a
statistically significant difference between the mean scores of the experimental and control
groups in the GKT post-test in favor of the experimental group. To determine the effect size of
the intervention (N = 53), Cohen’s d was calculated (d = 0.55; r = 0.26). This value illustrates the
medium effect of the independent variable (AWE) on grammatical knowledge, which implies that
the AWE use can explain a medium proportion of the total variance of the grammar knowledge vari-
able. Eta² indicates that 12% of the variance is explained, which shows an average effect of AWE on enhancing grammatical knowledge.

Differences in GKT among the experimental and control groups based on the
apprehension level
A one-way between-groups ANOVA was performed to identify the differences among the four
groups on the GKT. Initial analysis showed that the data were normally distributed, but equal var-
iances were not assumed based on Levene’s test (F (99) = 6.78, p = .001). Therefore, robust tests of
equality of means were used. Both ANOVA’s Welch test (F(99) = 54.82, p = .011) and Brown-Forsythe
test (F(99) = 85.74, p = .002) indicated significant differences in the GKT means. Post-hoc comparisons
using the Games-Howell test were done to determine where those differences lay. The multiple comparisons indicated no significant differences, with the exception of one difference between the experimental non-apprehensive (M = 21.00, SD = 7.65) and the control apprehensive group
(M = 15.38, SD = 3.95) at p = .009.
Further, a one-way repeated measures ANOVA was conducted to test the interaction effect of
time × treatment. Initially, the sphericity assumption was met as time has two levels (pre and post) and thus, there was no value for Mauchly’s test. The results did not reveal a significant interaction between time and treatment (F (3, 99) = 2.02, partial eta squared (η²) = 0.06, p = .12), which indicates that the change over time did not differ across the four study groups. The interaction is ordinal, as displayed in Figure 5, and the slopes of the lines are mostly parallel. This result confirms that there is not a
significant interaction effect.

Figure 5. GKT Means plot for time × treatment interaction.

Additionally, Figure 5 confirms slight differences between apprehensive writers in both the exper-
imental and control groups (mean difference 1.73). The non-apprehensive writers in the experimen-
tal group differed remarkably from the non-apprehensive in the control group (mean difference
3.92). Further, it seems that AWE had a more positive effect on the grammatical knowledge of the non-apprehensive experimental group, whereas the apprehensive writers did not show a similar effect. Nevertheless, this effect is medium, as indicated by Cohen’s d (d = 0.60, r = 0.28). This effect
is also close to the effect of AWE on the whole experimental group (d = 0.55; r = 0.26).

Effect of AWE on the relationship between writing apprehension and grammatical


knowledge
The relationship between writing apprehension and grammatical knowledge was investigated using
a Pearson product-moment correlation coefficient, which indicated a strong negative correlation
between the two constructs: r = −.46**, N = 103, p = .001. This negative or reverse relationship indi-
cates that as grammatical knowledge increases, writing apprehension decreases. Further, to deter-
mine whether the relationship between these two constructs remained the same or changed due
to the intervention, the correlation between EWAS and GKT at pre-test and then at post-test was
computed for both the control group (N = 50) and the experimental group (N = 53). The correlation
value before the intervention (r (50) = −.356**; r (53) = −.397**, p = .001) changed after the intervention for both groups (r (50) = −.377**, r (53) = −.469**, p = .001). Nevertheless, the correlation value increased more for the experimental group (by .072) than the control group (by .021), which indicates that
the relationship changed. This slight change might be attributed to the intervention.
Fisher’s r-to-z’ transformation (z’r1 = .42; z’r2 = .51) was then used to examine the significance of
the correlation difference between the two groups. The results showed that the observed Z’ score
was smaller than 1.96 (z’ = .55, p = 0.58). This result indicates no significant difference in the corre-
lation of writing apprehension and grammatical knowledge between the control and experimental
groups before and after the intervention. Consequently, this finding shows that the relationship
between these two constructs remained the same.
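The Fisher step can be made explicit: each correlation is transformed with z' = arctanh(r), and the difference is divided by its standard error, sqrt(1/(n1−3) + 1/(n2−3)). The following sketch applies the standard formula to the post-test correlations reported above; it is an illustration of the procedure, not the author's own script:

```python
import math
from scipy.stats import norm

def compare_correlations(r1, n1, r2, n2):
    """Two-tailed z test of the difference between two independent correlations."""
    z1, z2 = math.atanh(abs(r1)), math.atanh(abs(r2))   # Fisher r-to-z' transformation
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z_obs = (z1 - z2) / se
    p = 2 * (1 - norm.cdf(abs(z_obs)))
    return z1, z2, z_obs, p

# Post-test EWAS-GKT correlations: experimental r = -.469 (n = 53), control r = -.377 (n = 50)
z1, z2, z_obs, p = compare_correlations(-0.469, 53, -0.377, 50)
print(f"z'1 = {z1:.2f}, z'2 = {z2:.2f}, observed z = {z_obs:.2f}, p = {p:.2f}")
# The observed z (about 0.55) is below 1.96, so the two groups' correlations do not differ
# significantly, consistent with the result reported above.
```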

Discussion
This study has explored the use of AWE with EFL apprehensive writers. It investigated its impact on
writing apprehension and grammatical knowledge. The results extend previous research on the effec-
tiveness of AWE by addressing an aspect of writing affect, which is fundamental in researching the writing
construct. The results showed that the experimental group’s writing apprehension decreased on the
EWAS post-test compared to the control group. In particular, the apprehensive writers in the exper-
imental group had a significantly lower apprehension level than their control group counterparts.
The number of apprehensive writers decreased from 22 before the intervention to only 2 apprehen-
sive writers. The effect size of the treatment on the whole experimental group (N = 53) is medium
(Cohen’s d = 0.70); whereas, the effect size (d = 1.37) is remarkable with the experimental apprehen-
sive writers (N = 22), which may indicate that AWE might have contributed to reducing students’
writing apprehension. This result is not consistent with Fisher (2017), who found that the Autograder
application did not have a significant effect on reducing anxiety levels among freshman community college students.
Her results may be affected by the nature of the study participants whose age and educational back-
ground were heterogeneous. Similarly, the results of the present study are not consistent with Sullivan
and Pratt (1996), which may be due to the nature of AWE. It provides a formative assessment environ-
ment with instant feedback, which was not available in the networked-classroom used in their study.
Furthermore, the Write & Improve classroom data showed that the apprehensive students used
the system more and experienced a greater dose of AWE than the non-apprehensive writers. This
finding, in turn, may indicate that repeated usage of AWE could help in decreasing writing apprehen-
sion. Accordingly, this initial analysis tends to provide some evidence that AWE feedback might offer an engaging and supportive learning environment for apprehensive EFL writers. AWE use provides ample revision opportunities and immediate feedback, which may not be available in the regular classroom due to the teacher's workload. Writing opportunities and feedback are identified in the literature as significant elements in lowering writing apprehension (Fisher 2017).
Interestingly, the apprehensive writers in the control group also showed a significant decrease
after the intervention. It seems that practicing writing with an instructor's guidance and receiving formative feedback might be beneficial for students with writing apprehension, especially when the feedback is individualized. Writing apprehension could be reduced with instructor feedback and individual scaffolding (Stewart, Seifert, and Rolheiser 2015). The reduction of writing apprehension in this group might indicate that what helps is formative feedback and a quality writing experience, 'not whether the evaluation was human or automated' (Fisher 2017, 85). Nevertheless, time and workload might affect a teacher's availability. Therefore, AWE could be used as a scaffolding tool that might help teachers provide other forms of immediate feedback and target their feedback towards ideas and organizational aspects (Cotos 2014; Liao 2015; Warschauer and Grimes 2008).
Moreover, the results of the GKT showed statistically significant differences between the exper-
imental and the control groups. The non-apprehensive writers in the experimental group showed
a significant difference compared to their counterparts in the control group. The experimental
apprehensive group did not show a significant improvement compared to their counterparts in
the control group. In addition, the results of repeated measures ANOVA confirmed no interaction
between time and treatment. This finding can be explained in terms of the design of this study's intervention: the apprehensive writers did not receive direct grammar instruction. It was hypothesized that the experimental group might benefit from the incidental grammar feedback provided by AWE. The software might be more suitable for the non-apprehensive writers because their grammar level is slightly higher than that of the apprehensive students (Daly 1978; Abdel Latif 2015). The AWE feedback's
nature might explain this; the AWE grammar feedback appears to be tailored and individualized
rather than sequenced and focused grammar instruction. For example, one apprehensive student
reported in her journal that feedback on prepositions was helpful, while another student liked the
spelling feedback.
Additionally, the apprehensive writers might have needed a more extended treatment period to
expose them to more AWE feedback. Possibly, a different AWE tool, such as Criterion, which can provide detailed grammar categories (Li, Hui-Hsien, and Saricaoglu 2017), might have been more suitable for them. Accordingly, the GKT results may imply that more direct and sequenced grammar instruction would help apprehensive writers enhance their grammar and, consequently, their writing. Furthermore, AWE feedback uptake tends to depend on students noticing and attending to the corrective feedback generated by AWE, which could be influenced by individual cognitive and emotional differences, such as proficiency level (Bitchener 2012). It is also possible that using a grammar test other than the TOEFL would have yielded different results, since the unfamiliarity or difficulty of the test might have influenced the participants' performance.
The results also showed a negative relationship between grammatical knowledge and writing
apprehension. This finding confirms previous research on writing apprehension (Abdel Latif 2015)
that showed linguistic knowledge as a negative predictor of writing apprehension. In turn, this
result might imply the need for more focused grammar lessons for apprehensive writers.

Conclusion
The present study attempted to explore the effect of AWE on writing apprehension, an aspect that is
not adequately explored in the AWE literature. Data were collected from experimental and control
groups using the EWAS and the GKT. Statistical analysis showed a significant improvement in the
students' grammatical knowledge in the experimental group and a reduction in writing apprehen-
sion. The apprehension level of the apprehensive writers in the experimental group decreased sig-
nificantly and remarkably compared to their counterparts in the control group; whereas, no
significant difference was found between them on the GKT. The results imply that AWE might
help EFL apprehensive writers by reducing their writing apprehension and enhancing the gramma-
tical knowledge of the non-apprehensive writers. Negative correlations were also found between
writing apprehension and grammatical knowledge. These results suggest that AWE can be used as a remedial classroom treatment for struggling or apprehensive writers, provided that it is integrated as a formative feedback tool. Additionally, teacher feedback and the writing experience were deemed useful for the apprehensive writers in this study. Therefore, teachers are encouraged to
provide ample individualized and supportive feedback to help apprehensive writers.
Accordingly, it is recommended that future AWE research considers the role of writing affect vari-
ables such as writing apprehension, especially when investigating learners’ perceptions and writing
performance. Such variables probably interact with research results, and this may be one factor
behind inconsistent or inconclusive results about the effectiveness of AWE feedback. Methodologi-
cally, the homogeneity of writing apprehension levels among AWE users needs consideration, particularly in quantitative research designs. This need was evident in the difference between the effect size for the whole experimental group and the effect sizes for the differentiated apprehension levels.
Pedagogically, it is suggested that curriculum designers judiciously incorporate AWE in EFL
writing courses to make use of its instant feedback potential within an instructional framework
such as the process-based approach or process-genre approach. Moreover, ESL teachers should
be aware of the effect of writing apprehension on their students and use new technological tools
that provide unconventional writing experiences with an immediate feedback environment. Such
tools might help reduce students’ writing difficulties and their apprehension. Additionally, construc-
tive feedback and discussions should be provided to assist students’ writing. To this end, teachers
should be adequately trained to identify the apprehensive writers in their classes, use differentiated
feedback techniques and purposefully integrate AWE tools in their EFL classrooms.
In conclusion, the results of this study need to be considered within its limitations. For example, the
researcher was the instructor. As far as possible, careful procedures were implemented to ensure
equality among the study groups (e.g. using the same essay rubric and not segregating them). Never-
theless, the researcher and the teaching assistant might have behaved differently concerning instruc-
tion and feedback with students in the experimental and control groups. Thus, to strengthen the
results’ internal validity and decrease bias, it is recommended that future researchers consider this
point. Additionally, this study was limited to a small sample size within a short period. Although
the role of gender in writing apprehension is not conclusive in previous studies, the limitation of a pre-
dominantly female sample should be accounted for. Consequently, the results of the present study
might not apply to a majority male context or a sample with a more balanced gender representation.
Due to the limitations mentioned above, it is difficult to generalize the results of this study.
Accordingly, researchers might replicate this study with larger samples from different EFL/ESL or L1 contexts and with more balanced gender representation. They might also use other AWE applications or investigate other writing aspects such as writing motivation, self-efficacy or writing strategies. Finally, comparative studies might explore the impact of AWE and other digital applications on writing affect and writing performance.

Acknowledgment
The author would like to thank Thomas Kerr from Cambridge ‘Write & Improve’ for providing a free class view subscrip-
tion. The author is not employed by Cambridge and is solely responsible for the content and any inaccuracies presented
in this article. The author would also like to thank Dr. Ruth Petzold from the U.S. Department of State (Regional English
Language Officer) for revising and editing an early draft of the manuscript; Dr. Muhammad Abdel Latif from Cairo Uni-
versity for helpful data on EWAS; and the anonymous reviewers for their valuable feedback.

Compliance with ethical standards


This study was not funded by any organization or university. The author has not received any funding from any company.

Disclosure statement
No potential conflict of interest was reported by the author.

Notes on contributor
Hanan Waer studied at Assiut University (Egypt). She received her Ph.D. degree in Educational & Applied Linguistics at
Newcastle University, UK, in 2012. She currently works as a Lecturer of TEFL at the Faculty of Education, New Valley Uni-
versity, Egypt. Her research interests are teaching writing, AWE, CALL, Conversation Analysis and Teacher Education.

ORCID
Hanan Waer http://orcid.org/0000-0002-4484-9941

References
Abdel Latif, M. 2015. “Sources of L2 Writing Apprehension: A Study of Egyptian University Students.” Journal of Research
in Reading 38 (2): 194–212. doi:10.1111/j.1467-9817.2012.01549.x.
Abdel Latif, M. 2019. “Helping L2 Students Overcome Negative Writing Affect.” Writing & Pedagogy 11 (1): 151–163.
doi:10.1558/wap.38569.
Autman, H., and S. Kelly. 2017. "Reexamining the Writing Apprehension Measure." Business and Professional
Communication Quarterly 80 (4): 516–529. doi:10.1177/2329490617691968.
Bitchener, J. 2012. “A Reflection on ‘The Language Learning Potential’ of Written CF.” Journal of Second Language Writing
21 (4): 348–363. doi:10.1016/j.jslw.2012.09.006.
Briscoe, T., J. Carroll, and R. Watson. 2006. "The Second Release of the RASP System." In Proceedings of the COLING/ACL 2006
Interactive Presentation Sessions, 77–80. doi:10.3115/1225403.1225423.
Briscoe, T., B. Medlock, and Ø. Andersen. 2010. “Automated Assessment of ESOL Free Text Examinations”. Technical
report, University of Cambridge, Computer Laboratory. https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-790.pdf.
Chen, C., and W. Cheng. 2008. “Beyond the Design of Automated Writing Evaluation: Pedagogical Practices and
Perceived Learning Effectiveness in EFL Writing Classes.” Language Learning & Technology 12: 94–112. https://
scholarspace.manoa.hawaii.edu/bitstream/10125/44145/1/12_02_chencheng.pdf.
Cheng, Y. 2004. “A Measure of Second Language Writing Anxiety: Scale Development and Preliminary Validation.”
Journal of Second Language Writing 13 (4): 313–335. doi:10.1016/j.jslw.2004.07.001.
Cohen, J. 1988. Statistical Power Analyses for the Behavioral Sciences. 2nd ed. Hillsdale, NJ: Lawrence Erlbaum Associates,
Inc.
Cotos, E. 2014. “Enhancing Writing Pedagogy with Learner Corpus Data.” ReCALL 26 (2): 202–224. doi:10.1017/
s0958344014000019.
Council of Europe. 2020. Common European Framework of Reference for Languages: Learning, Teaching, Assessment.
Strasbourg: Companion Volume, Council of Europe Publishing. www.coe.int/lang-cefr.
Daly, J. 1978. “Writing Apprehension and Writing Competency.” The Journal of Educational Research 72 (1): 10–14. doi:10.
1080/00220671.1978.10885110.
Daly, J. A. 1979. “Writing Apprehension in the Classroom: Teacher Role Expectancies of the Apprehensive Writer.”
Research in the Teaching of English 13 (1): 37–44.
Daly, J., and M. Miller. 1975. “The Empirical Development of an Instrument to Measure Writing Apprehension.” Research
in the Teaching of English 9 (3): 242–249. https://www.jstor.org/stable/40170632.
Daly, J., and W. Shamo. 1978. “Academic Decisions as a Function of Writing Apprehension.” Research in the Teaching of
English 12 (2): 119–126. http://www.jstor.org/stable/27539858.
Daniels, S. M., L. Whitsitt, C. Skinner, J. Schwartz-Micheaux, and J. White. 2019. “Evaluating the Effects of a Writing Self-
Efficacy Intervention on Writing Quantity in Middle School Students.” Reading & Writing Quarterly, 1–17. doi:10.1080/
10573569.2019.1618226.
El Ebyary, K., and S. Windeatt. 2019. “Eye Tracking Analysis of EAP Students’ Regions of Interest in Computer-Based
Feedback on Grammar, Usage, Mechanics, Style and Organization and Development.” System 83: 36–49. doi:10.
1016/j.system.2019.03.007.
El Shimi, E. 2017. “Second Language Learners’ Writing Anxiety: Types, Causes, and Teachers’ Perceptions.” MA thesis,
The American University in Cairo, Egypt. http://dar.aucegypt.edu/handle/10526/5096
Erkan, D., and A. Saban. 2011. “Writing Performance Relative to Writing Apprehension, Self-Efficacy in Writing, and
Attitudes Towards Writing: A Correlational Study in Turkish Tertiary-Level EFL.” Asian EFL Journal 13 (1): 164–192.
Fisher, K. 2017. “The Intelligent Essay Assessor Autograder and Its Effect on Reducing College Writing Anxiety.” PhD diss.,
Keiser University, USA. https://pqdtopen.proquest.com/pubnum/10265396.html?FMT=AI
Graham, S., S. S. Schwartz, and C. A. MacArthur. 1993. "Knowledge of Writing and the Composing Process, Attitude
Toward Writing, and Self-Efficacy for Students With and Without Learning Disabilities.” Journal of Learning
Disabilities 26 (4): 237–249. http://dx.doi.org/10.1177/002221949302600404.
Grimes, D., and M. Warschauer. 2010. “Utility in a Fallible Tool: A Multi-Site Case Study of Automated Writing Evaluation.”
Journal of Technology, Learning, and Assessment 8 (6). http://www.jtla.org.
Gungle, B., and V. Taylor. 1989. "Writing Apprehension and Second Language Writers." In Richness in Writing: Empowering
ESL Students, edited by D. M. Johnson and D. H. Roen, 235–248. New York: Longman.
Hanna, K. 2009. “Student Perceptions of Teacher Comments: Relationships between Specific Aspects of Teacher
Comments and Writing Apprehension.” PhD thesis, The University of North Dakota. https://commons.und.edu/
theses/895.
Harrison, L. 2017. “Developing an ELT Product based on Machine Learning: Write and Improve.” ELTjam [website] https://
learnjam.com/developing-an-elt-product-based-on-machine-learning-write-improve/.
Hassan, B. 2001. “The Relationship of Writing Apprehension and Self-Esteem to The Writing Quality and Quantity of EFL
University Students.” Mansoura Faculty of Education Journal 39 (1): 1–36. https://eric.ed.gov/?id=ED459671.
Hattie, J. 2017. “250+ Influences on Student Achievement.” https://us.corwin.com/sites/default/files/250_influences_
chart_june_2019.pdf.
Hayes, J. 1996. “A New Framework for Understanding Cognition and Affect in Writing.” In The Science of Writing: Theories,
Methods, Individual Differences, and Applications, edited by M. Levy, and S. Ransdell, 1–27. Mahwah, NJ: Lawrence
Erlbaum Associates.
Hockly, N. 2019. “Automated Writing Evaluation.” ELT Journal 73 (1): 82–88. doi:10.1093/elt/ccy044.
Huang, S., and W. Renandya. 2018. “Exploring the Integration of Automated Feedback Among Lower-Proficiency EFL
Learners.” Innovation in Language Learning and Teaching 14 (1): 15–26. doi:10.1080/17501229.2018.1471083.
Jebreil, N., A. Azizifar, H. Gowhari, and A. Jamilinesari. 2015. “Investigating the Relationship between Anxiety and Writing
Performance among Iranian EFL Learners.” Iranian EFL Journal 11 (3): 49–60. https://www.journals.aiac.org.au/index.
php/IJALEL/article/view/1208.
Kinnear, P., and C. Gray. 2008. SPSS 15 Made Simple. Hove and New York: Psychology Press, Taylor & Francis Group.
Lee, I. 2017. Classroom Writing: Assessment and Feedback in L2 School Contexts. Singapore: Springer. doi:10.1007/978-
981-10-3924-9.
Li, Z., F. Hui-Hsien, and A. Saricaoglu. 2017. “The Short-Term and Long-Term Effects of AWE Feedback on ESL Students’
Development of Grammatical Accuracy.” CALICO Journal 34 (3): 355–375. doi:10.1558/cj.26382.
Li, J., S. Link, and V. Hegelheimer. 2015. “Rethinking the Role of Automated Writing Evaluation AWE Feedback in ESL
Writing Instruction.” Journal of Second Language Writing 27: 1–18. doi:10.1016/j.jslw.2014.10.004.
Liao, H.-C. 2015. “Using Automated Writing Evaluation to Reduce Grammar Errors in Writing.” ELT Journal 70 (3): 308–
319. doi:10.1093/elt/ccv058.
Lipsou, E. A. 2018. “The Most Common Reasons C’ Lyceum Students Fear Writing Composition in Cyprus.” PhD diss.,
Saint Louis University. https://search.proquest.com/openview/db0a19927d093acefaf0c668128a7f43/1?pq-origsite=
gscholar&cbl=18750&diss=y (ProQuest Dissertations Publishing, 10822486).
Mabrito, M. 2000. “Computer Conversations and Writing Apprehension.” Business Communication Quarterly 63 (1): 39–
49. doi:10.1177/108056990006300104.
Masny, D., and J. Foxall. 1993. “Writing Apprehension in L2 (ED352844).” ERIC. https://files.eric.ed.gov/fulltext/ED352844.
pdf.
Morphy, P., and S. Graham. 2012. “Word Processing Programs and Weaker Writers/Readers: A Meta-Analysis of Research
Findings.” Reading and Writing 25 (3): 641–678. doi:10.1007/s11145-010-9292-5.
Nobles, S., and L. Paganucci. 2015. “Do Digital Writing Tools Deliver? Student Perceptions of Writing Quality Using
Digital Tools and Online Writing Environments.” Computers and Composition 38: 16–31. doi:10.1016/j.compcom.
2015.09.001.
Page, E. 1968. “The Use of the Computer in Analyzing Student Essays.” International Review of Education 14 (3): 253–263.
Phillips, D. 2004. Longman Complete Course for the TOEFL Test: Preparation for the Computer and Paper Tests, 2nd ed.
New York: Pearson Education, Addison-Wesley Longman.
Saadi, Z. K., and M. Saadat. 2015. “Iranian EFL Learners’ Grammatical Knowledge: Effect of Direct and Metalinguistic
Corrective Feedback.” English Language Teaching 8 (8): 112–120. doi:10.5539/elt.v8n8p112.
Sanders-Reio, J., P. Alexander, T. Reio, and I. Newman. 2014. “Do Students’ Beliefs about Writing Relate to Their Writing Self-
Efficacy, Apprehension, and Performance?” Learning and Instruction 33: 1–11. doi:10.1016/j.learninstruc.2014.02.001.
Saricaoglu, A. 2019. “The Impact of Automated Feedback on L2 Learners’ Written Causal Explanations.” ReCALL 31 (2):
189–203. doi:10.1017/s095834401800006x.
Saville, N. 2017. “Digital Assessment.” In Digital Language Learning and Teaching: Research, Theory, and Practice, edited
by M. Carrie, R. M. Damerow, and K. M. Bailey, 198–207. New York: Taylor & Francis.
Stevenson, M. 2016. “A Critical Interpretative Synthesis: The Integration of Automated Writing Evaluation Into Classroom
Writing Instruction.” Computers and Composition 42: 1–16. doi:10.1016/j.compcom.2016.05.001.
Stevenson, M., and A. Phakiti. 2014. “The Effects of Computer-Generated Feedback on the Quality of Writing.” Assessing
Writing 19: 51–65. doi:10.1016/j.asw.2013.11.007.
Stevenson, M., and A. Phakiti. 2019. “Automated Feedback and Second Language Writing.” In Feedback in Second
Language Writing: Contexts and Issues, 2nd ed., edited by F. Hyland, and K. Hyland, 125–142. Cambridge:
Cambridge University Press.
Stewart, G., T. Seifert, and C. Rolheiser. 2015. “Anxiety and Self-Efficacy’s Relationship with Undergraduate Students’
Perceptions of the Use of Metacognitive Writing Strategies.” The Canadian Journal for the Scholarship of Teaching
and Learning 6 (1): 19. doi:10.5206/cjsotl-rcacea.2015.1.4.
Sullivan, N., and E. Pratt. 1996. “A Comparative Study of Two ESL Writing Environments: A Computer-Assisted Classroom
and a Traditional Oral Classroom." System 24: 491–501. doi:10.1016/S0346-251X(96)00044-9.
Vanhille, J., B. Gregory, and G. C. Corser. 2017. “The Effects of Mood on Writing Apprehension, Writing Self-Efficacy, and
Writing Performance.” Psi Chi Journal of Psychological Research 22 (3): 220–230. https://pdfs.semanticscholar.org/
b563/f391c95afa640804077ff6774c8325f2c706.pdf.
Wali, F., and H. Huijser. 2018. “Write to Improve: Exploring The Impact of an Online Feedback Tool on Bahraini Learners
of English.” Learning and Teaching in Higher Education: Gulf Perspectives 15. doi:10.18538/lthe.v15.n1.293.
Ware, P., and M. Warschauer. 2006. "Electronic Feedback and Second Language Writing." In Feedback in Second Language
Writing: Contexts and Issues, edited by K. Hyland, and F. Hyland, 105–122. New York: Cambridge University Press.
Warschauer, M., and D. Grimes. 2008. “Automated Writing Assessment in the Classroom.” Pedagogies: An International
Journal 3 (1): 22–36. doi:10.1080/15544800701771580.
Warschauer, M., and P. Ware. 2006. “Automated Writing Evaluation: Defining the Classroom Research Agenda.”
Language Teaching Research 10 (2): 157–180. doi:10.1191/1362168806lr190oa.
Wilson, J. 2017. “Associated Effects of Automated Essay Evaluation Software on Growth in Writing Quality for Students
with and Without Disabilities.” Reading and Writing 30 (4): 691–718. doi:10.1007/s11145-016-9695-z.
Wilson, J., and A. Czik. 2016. “Automated Essay Evaluation Software in English Language Arts Classrooms: Effects on
Teacher Feedback, Student Motivation, and Writing Quality.” Computers & Education 100: 94–109. doi:10.1016/j.
compedu.2016.05.004.
Wilson, J., and R. Roscoe. 2020. “Automated Writing Evaluation and Feedback: Multiple Metrics of Efficacy.” Journal of
Educational Computing Research 58 (1): 87–125. doi:10.1177/0735633119830764.
Appendices
Appendix 1
EWAS (Abdel Latif 2015, used with author’s permission)

Internal consistency of EWAS (correlation of each item with the scale sum; N = 60 for all items)

EWAS item    Pearson correlation with sum    Sig. (2-tailed)
EWAS1        .640**                          .000
EWAS2        .713**                          .000
EWAS3        .407**                          .001
EWAS4        .329*                           .04
EWAS5        .460**                          .000
EWAS6        .676**                          .000
EWAS7        .611**                          .000
EWAS8        .361**                          .005
EWAS9        .434**                          .001
EWAS10       .610**                          .000
EWAS11       .624**                          .000
EWAS12       .625**                          .000

Appendix 2
Class view at Write & Improve website: Student progress in all essays.
In the student progress view, the teacher sees each essay submitted by each student, located on the CEFR bar. Each blue
line represents an essay, and the white within it represents the highest level the student reached. The small vertical lines
represent the number of submitted revisions.
Class view: Student progress heatmap of the first essay

This heatmap view shows the teacher the level of each student and the number of submissions for each writing task. For example, this view shows the students' progress in essay 1, 'living in the city or the country', across all students. The light blue color indicates a high level in writing (B2–C1); whereas, the dark blue indicates a low level in writing (A1–A2).
Appendix 3
The online English writing course workbook on the Write & Improve website
Appendix 4
Essay writing rubric

Band 5 (Mark: 30). The essay:
• has a strong central idea or thesis that is related to the topic;
• provides compelling support for the thesis topic;
• has a clear, logical organization with well-developed major points that are supported with concrete and specific evidence;
• uses effective transitions between ideas;
• uses appropriate words, composing sophisticated sentences;
• expresses ideas freshly and vividly;
• is free of mechanical, grammatical, and spelling errors.

Band 4 (Mark: 24). The essay:
• has a strong central idea that is related to the assignment;
• has a clear, logical organization with developed major points, but the supporting evidence may not be especially vivid or thoughtful;
• uses appropriate words accurately, but seldom exhibits an admirable style, and the sentences tend to be less sophisticated;
• has few mechanical, grammatical, and spelling errors.

Band 3 (Mark: 18). The essay:
• has a good thesis statement;
• has an organization but uses some evidence;
• develops some ideas with some details;
• uses simple sentences;
• uses some transitions and accurate words;
• has some mechanical, grammatical, and spelling errors.

Band 2 (Mark: 12). The essay:
• is not related to the assignment;
• has a central idea that is presented in such a way that the reader understands the writer's purpose;
• has an organization that reveals a plan, but the evidence tends to be general rather than specific or concrete;
• uses common words accurately, but sentences tend to be simplistic and unsophisticated;
• has one or two severe mechanical or grammatical errors;
• is substantially more or less than the required page length.

Band 1 (Mark: 6). The essay:
• lacks a central idea;
• lacks clear organization;
• is not related to the assignment;
• fails to develop main points, or develops them in a repetitious or illogical way;
• fails to use common words accurately;
• uses a limited vocabulary in that chosen words fail to serve the writer's purpose;
• has three or more mechanical or grammatical errors.
