Language Awareness
https://doi.org/10.1080/09658416.2020.1743713

Exploring how rubric training influences students' assessment and awareness of interpreting
Wei Su
College of Foreign Languages and Cultures, Xiamen University, Xiamen, China

ABSTRACT
Previous studies explored how rubrics of writing and speaking could change students' self-assessment and awareness of language skills, yet few disclosed the benefits of rubrics of interpreting. To close the gap, the present paper tapped the effects of rubric training on students' self-assessment and awareness of interpreting. After giving three weeks of rubric training on Chinese-English interpreting to 32 Chinese undergraduates, it was found that students were able to simultaneously attend to multiple criteria, as they circled more descriptor words from the rubric and generated more numerous and elaborate comments. In addition, their assessments were extended from local, negative comments to global, positive ones, indicating their balanced and hierarchical thinking about interpreting skills. Post-training self-reports revealed that rubric training improved their understanding and learning of interpreting skills, forming a favourable cycle of assessment-awareness-acquisition. Pedagogical suggestions were proposed accordingly.

ARTICLE HISTORY
Received 28 April 2018; Accepted 11 March 2020

KEYWORDS
Rubrics; awareness; interpreting; self-assessment

Introduction
Rubrics have been found to assist students in their assessment in various language tasks, such as speaking (Glover, 2011; Kissling & O'Donnell, 2015; Patri, 2002) or writing (Andrade &
Du, 2005; Li & Lindsey, 2015; Wang, 2014). However, rubrics alone cannot yield benefits in
assessment; their full potential can only be realised with prior rubric training (Freeman, 1995;
Min, 2016; Patri, 2002). Rubric training usually involves teaching student assessors knowledge
of the rubric criteria and skills of using them in actual tasks. Past research shows that rubric
training has a positive impact on various areas, including assessors' high-quality evaluations
(Berg, 1999; Liou & Peng, 2009), their negotiation skills in peer feedback (Zhu, 1995), and
their ability to focus on high-order issues (Min, 2016; Zhu, 1995). Students who receive rubric
training will demonstrate better language awareness, the kind of conscious perception and sensitivity in language learning, language teaching and language use (Association for Language Awareness (ALA), 2018). As a form of higher-order meta-cognitive thinking, language awareness can regulate students' choices, planning, and judgments, as well as monitor and measure their progress in various language tasks.
Compared to conventional language tasks, rubric training in the task of interpreting has received only scant attention (e.g. Bartłomiejczyk, 2007; Su, 2019a). The questions of whether
and to what extent rubric training improves students' knowledge of interpreting still remain
largely unexplored. The present paper argues that a closer investigation of interpreting is
significant for the following two reasons.
Theoretically, as an interpreting rubric simultaneously taps multiple language constructs
(comprehending, speaking, language transfer, etc.), examining students’ use of such a rubric
can validate the extent to which students attend to multiple constructs simultaneously.
There has long been a debate on the role of rubrics in integrated tasks (e.g. the reading-to-write task in the TOEFL test). Some argue that thanks to rubrics assessors can form a consistent understanding of construct(s) and generate reliable judgments (e.g. Delaney, 2008), yet others contend that due to rubrics students could be overwhelmed by numerous dimensions and produce inconsistent judgments (e.g. Isaacs, 2016). The results of the current study could
help to address the debate and further our understanding of the role of rubrics and the
benefits of rubric training.
Practically, an investigation of student interpreters can address a surging demand in interpreting teaching: as of 2019, in China alone 249 schools provided Master of Translation and Interpreting (MTI) programs, which had enrolled nearly 60,000 translation students (Zhong, 2019), and in 44 countries there were 84 interpreting schools certified by AIIC (Association internationale des interprètes de conférence), an international organization for professional interpreters (Setton & Dawrant, 2016). Facing such a growing force of interpreting students, it is essential to understand rubric training and its pedagogical applications.

Self-assessment
Self-assessment in this study was defined as learners’ ability to ‘judge their own work to
improve performance [by identifying] discrepancies between current and desired perfor-
mance’ (McMillan & Hearn, 2008, p. 40). While self-assessment was widely seen and used as
an alternative testing method (Kim, 2006; Luoma, 2004), its influences on students' language awareness and learning autonomy have been increasingly noted and reported (Alderson, 2005; Gonzalez, 2009).
Past research suggested that self-assessment could heighten learners’ language aware-
ness in a number of ways: First, it could direct learners’ attention to the communicative
function of language as opposed to mastery of lexico-grammatical structures. Blanche and
Merino (1989) concluded that with self-assessments students tended to ‘estimate their purely
communicative competence level rather than to estimate their mastery of grammar’ (p. 332).
Second, it could reveal learners’ (mis)understanding of language construct variables, or doc-
ument their evolving perception of those constructs. Trofimovich et al. (2016) found that,
compared to native listener ratings, L2 speakers adopted the same linguistic dimensions
when self-assessing their accent. However, when judging comprehensibility, L2 speakers
relied on a limited range of dimensions, ignoring other components like lexicon and dis-
course structure. The relationship between assessment and the perception of language
constructs was also stressed in Bachman and Palmer (1989). Their study treated sociolin-
guistic competence as one of the three traits of language competence. By asking themselves
whether they could tell how polite English-speaking people were through the kind of English
they used, students could check their recognition of the sociolinguistic dimension. More
recently, Kissling and O’Donnell (2015) tracked 13 Spanish learners’ three self-assessments
over the course of one semester and noted their increasing awareness of language con-
structs, such as how various language features could be described and how they were related,
what constituted proficiency, and what characterised more advanced language
proficiency.
Third, self-assessment could help learners identify and develop their own language com-
petence. Trofimovich et al. (2016) found that learners who demonstrated greater language
awareness in their journals achieved more target-like pronunciation and sought more
contact with the target language outside of class. Likewise, Chen (2006) collected the
self-reported data from 40 English as a Foreign Language (EFL) students to infer potential
benefits of self and peer assessments. It was found that students perceived assessments to
be facilitative in enhancing their awareness of their strengths and weaknesses, developing
their critical thinking ability, and acquiring oral skills.
Fourth, self-assessment could affect students’ language learning process. As self-assess-
ment modified their language perceptions and revealed their strengths/weaknesses, they
would naturally adjust their learning habits to suit their language needs. Earl (2013) defined
this wash-back effect as learning autonomy, meaning that when students went outside the
classroom walls they had gained the skills to become lifelong learners. Alternatively, Kissling
and O’Donnell (2015) treated this change of learning habits as self-efficacy, as learners
became more aware of their specific abilities and limitations. They further argued that stu-
dents who set reasonable expectations and concrete goals would likely continue to improve
their oral production, forming a favourable cycle of assessment-awareness-acquisition.
In sum, self-assessment has been shown to positively influence students’ language aware-
ness by correcting/improving their understanding of language functions and dimensions,
by stimulating their reflections during language acquisition and by adjusting their language
learning agenda/goals. However, these benefits should be treated with caution, as they were
subject to various external factors including tasks, rubric training, and learning objectives.
In particular, rubric training that specified tasks and objectives and was specially designed
for self-assessments has attracted growing attention and conflicting opinions. The next sec-
tion reviews whether and in which way rubric training could change learners’ language
awareness.

Rubric training
A rubric is a scoring tool that lays out the specific expectations for an assignment. It divides
an assignment into its component parts and provides a detailed description of what con-
stitutes an acceptable or unacceptable level of performance for each of those parts (Stevens
& Levi, 2013).
Rubric training usually consists of two approaches: the first approach is lecture-based, in
which teachers elaborate each criterion/band and its requirement. Such declarative knowl-
edge enriches students’ understanding of rubrics and enlarges their vocabulary in explaining
their performance. Patri (2002) conducted such a lecture-based training in her speaking
class for two hours. She first explained the components of her rubric, telling students what
should be focused on while assessing speaking. Then she used three speech samples as well
as her own assessment as demonstration. Through such explanation and modelling Patri
found that the assessment criteria could be firmly set among student assessors. Similarly,
Freeman (1995) recommended showing sample videos during training so that students
could identify and describe best and worst presentations.
The second approach is practice-oriented, which involves letting students use rubrics in
actual language tasks. Compared to the first approach, which treats students as listeners/
viewers, this approach urges students to perform rubric activities, acquiring rubric use after
deliberate practice and reflections. Various activities have been designed to foster such
knowledge acquisition. In Glover’s (2011) rubric training, 62 English as a Foreign language
(EFL) students in Turkey first made an initial assessment based on a rubric, then they were
shown sample performances at specific levels with descriptions of those performances. After
that, they discussed with each other their own performances to modify their judgment. Such
a ‘Trial-Sample watching-Discussion’ training mode not only enlarged students’ vocabulary,
but also encouraged their engagement, reflection, and awareness of learning as a process,
leading to a greater self-awareness and a more realistic view of the learners’ own abilities.
Glover also stressed that students’ success seemed to be related to the kind of support from
their teacher and each other through training.
To identify which training activity led to better language awareness and assessment com-
petence, Min (2016) compared two modelling (mastery and coping) and two teacher feedback
(praise and correction plus explication) types in her writing class. Results showed that the two
modelling types interacted with the two feedback types to produce differential effects on
students’ assessment skills over time, with the combination of mastery modelling and correc-
tion and explication being the most effective approach. She concluded that with careful plan-
ning, systematic execution, and timely individualised feedback, the effect of rubric training
can emerge in a month. Yet she also cautioned that the content of training should be con-
stantly modified and updated with each session based on information gleaned from the
previous one so that novice student assessors could continue advancing their assessment skills.
In sum, while the effect of rubric training on students' awareness and assessment skills seems evident and relatively immediate, it could be mediated by the type of language task
and students' individual factors. According to Min (2016), more research was needed on the interaction among teachers' support, task demands and students' progress before effective training schemes could be formulated for the classroom.

Rubrics in interpreting
Compared to other task types, the task of interpreting poses some unique challenges to
student assessors (Pöchhacker, 2016; Su, 2019a). Pöchhacker (2016) defined interpreting as
a form of translation in which a first and final rendition in another language was produced
on the basis of a one-time presentation of an utterance in a source language. Immediacy as
its most salient feature imposes a great burden on assessors. Setton and Dawrant (2016) found that, due to the transient nature of interpreting, untrained assessors often felt stressed when trying to identify major weaknesses in interpreting performance.
In addition to immediacy, interpreting was also marked by its multi-tasking nature.
Compared to speaking and writing, interpreting tapped more sub-skills (comprehending,
language transferring, cultural mediating, etc.) and its rubric encompassed more dimensions
(fidelity, appropriateness, fluency, etc.). Further, the interplay and mutual interferences
among those multiple dimensions could complicate the assessment process. Hartley et al.
(2003) analysed seven postgraduate students’ assessment based on an interpreting rubric.
The assessors reported confusion between the criteria of accuracy and delivery, and
their comments were not evenly distributed across different criteria. Bartłomiejczyk (2007)
found that faithfulness to the original and completeness were the most frequently men-
tioned components of students' self-assessment, whereas the criterion of presentation received very little attention even though in most cases students' presentations were far from perfect.
More recently, Su (2019a) investigated peer assessments by his 18 interpreting students in
a Chinese university. He found that although the rubric extended their perspectives on
interpreting, some higher-order rubric criteria (e.g. cohesion) seemed too difficult to assist
their judgments. Su called on more teacher intervention before the rubric could exercise its
full benefits. Likewise, Lee (2017) also stressed the need to train students’ use of higher-order
rubric criteria in interpreting.
All in all, while interpreting rubrics are generally perceived as motivating, their scaffolding
role has not yet been fully acknowledged, much less examined, in students’ assessment and
learning. Although a few studies claimed that students could gain a better knowledge of
interpreting skills after rubric training, such gains have not been empirically investigated
and validated. To address this gap, the present paper examined whether and how rubric
training could enhance students’ understanding of interpreting skills. Specifically, it formu-
lates the following research questions.

1. How is students' assessment of interpreting different before and after rubric training?
2. Does rubric training change students' awareness of interpreting skills? If so, in what aspects?

The study
Participants
This paper reports on a study of how rubric training affected students' rubric use and
knowledge of interpreting. The participants were 32 third-year undergraduates (28 female,
4 male) in the English department at a Chinese university. All participants were 21 years old,
native speakers of Chinese, with Mandarin as their first language. Prior to the study, they
had received lectures on interpreting for 12 weeks during the previous spring semester and
acquired some basic knowledge of interpreting skills, but they had not undergone any
systematic training on interpreting assessment. Regarding their language proficiency, they
all passed a national English proficiency examination, Test for English Majors Band 4 (TEM 4), with scores ranging from 70 to 81 (full score: 100), roughly equivalent to IELTS 6.5-7.0, so they could be regarded as upper-intermediate English learners at similar levels.

Rubric
The rubric in this study was a skill-based self-assessment guide for interpreting. Most of the
rubrics used in interpreting were product-based and summative-test oriented (e.g. Han &
Riazi, 2017; Lee, 2008; Su, 2019a). Users of such rubrics approached interpreting from the
aspect of content fidelity or language accuracy. Their assessment results mainly revealed
features of interpretation products, falling short of explaining the kind of interpreting skills leading to such features. To better expose interpreting skills and to fully address the present research questions, this study used a modified version of the interpreting rubric at the researcher's university, formative-oriented scoring criteria designed for students' self-judgment of their skill acquisition (see Table 1).

Table 1. Skill-based assessment form (English-Chinese Sight Interpreting).

Skill         | Description                                                                              | Comment | Examples
Comprehension | - Can rapidly grasp the meaning of every word and of the whole text                      |         |
              | - Can quickly anticipate how sentences are going to end and where the argument is going  |         |
              | - Can accurately identify logical connections between sentences                          |         |
Reformulation | - Can transfer between languages at local and global level                               |         |
              | - Can add or delete words to produce idiomatic target language                           |         |
Delivery      | - Can use pauses properly with minimal self-corrections                                  |         |
              | - Can maintain steady volume and pace throughout the delivery process                    |         |

(The Comment and Examples columns are left blank for students to fill in during self-assessment.)
Table 1 presents the rubric for students learning English-to-Chinese sight interpreting, a common mode of interpreting in which interpreters orally translate a text into another language as they read it. The rubric included three skills, each having two to three can-do statements as its description. During their self-assessments, students evaluated their mastery of each skill by circling keywords in the descriptor statements. Based on the circled words under each criterion, they entered comments on the extent to which their performance satisfied those descriptor words. Next to each comment they also provided illustrative examples from their interpretations to justify the preceding comment. The language used for both comments and examples was English.
The reason why this paper used a sight interpreting task and its rubric is twofold: first, sight
interpreting is a frequently exercised interpreting activity and a core module in most interpret-
ing programs worldwide (Setton & Dawrant, 2016). The investigation of sight interpreting assess-
ment is thus highly relevant to actual interpreting teaching. Second, sight interpreting can be
regarded as a special form of simultaneous interpreting. It is the oral reproduction in the target
language of a text originally written in the source language, and such reproduction is rendered
almost simultaneously (Pöchhacker, 2016). Its stringent time pressure and multi-tasking
demand represent unique features of interpreting, so this task is deemed appropriate for the
current study.

Rubric training and data collection


The study took place in an interpreting course during the summer semester of 2016. The
summer semester, also known at the university as the short semester, lasted for five weeks
and provided a wide range of elective courses. One course, ‘assessment of interpreting’, was
taught by the researcher with its aim to improve students’ ability to use rubrics to assess
interpreting skills. It contained no training or practice on interpreting skills and was used to
collect data for the research. Its main activities are listed as follows:
Week#1: Course introduction; before-training self-assessment
Week#2: Training rubric component one: comprehension
Week#3: Training rubric component two: reformulation
Week#4: Training rubric component three: delivery
Week#5: After-training self-assessment; self-report; course summary

In Week#1, the teacher (also the researcher) explained the course objective and his
upcoming study on the influence of rubric training, obtained consent from students, and
throughout his research adhered to research ethics of keeping their identities confidential.
Students were also informed that the elective course would not assign any marks for their performance, and their rubric markings or comments would not be graded or penalised, as this study was meant to explore rubrics' effects on genuine and sustained changes in students' interpreting awareness.
In Week #1, he invited the students to finish an interpreting task: each student was given a 200-word English text (see Appendix) and orally translated it into Chinese while reading. Their interpretations were recorded and saved for later assessment. After interpretation, students were given an assessment form, familiarised themselves with the three criterion categories and their descriptions, and used them to assess their interpreting skills. As they listened to their own recordings, they could pause the audio, circle keywords from the descriptors and enter comments on their assessment forms. They could not circle the same keyword more than once, but they could circle as many keywords as they wished and justify their word choices with comments and examples under each criterion. The total time of self-assessment (listening + form filling) was about 30 minutes, after which the teacher collected all 32 assessment forms.
From Week#2 to Week#4, the teacher trained one criterion at a time. Modified from Patri
(2002), the training procedure included three parts: studying rubrics (S), identifying perfor-
mance features (I), and bridging the gap (B). This training cycle will be referred to as the SIB scheme.
Studying rubrics meant letting students discuss the skill description and summarise their dif-
ficulties in the assessment, followed by the teacher’s explanation of the nature and solutions
of such difficulties. At the stage of identifying performance features, the teacher gave two
samples of interpretation, one sample at the students’ present level with typical mistakes for
a criterion (i.e. Defect) and another sample at the excellent level, i.e. the model performance
of the criterion (i.e. Exemplar). Students used the descriptors to identify the prominent features
of both samples and produced a complete list of their discrepancies. For those easy-to-overlook
or hard-to-articulate features, the teacher showed how he would rate this criterion (demon-
stration). Bridging the gap meant students discussed among themselves how to close the gap
between their present level and the exemplar. Later the teacher intervened again, introducing
them to the textbook of interpreting (Chen, 2010) and explaining their methods to attain the
exemplar levels. He also encouraged them to try alternative methods after class, or to do research on interpreting skill acquisition to further their understanding. That way they could
both raise their awareness and speed up their acquisition of skills, thus undergoing a favourable
cycle of assessment-awareness-acquisition. The texts used during training were all English and
of similar length and difficulty level to the pre- and post-training ones (see Appendix).
It should be pointed out that an alternative instruction format (i.e. instruction without
rubric training or even without reference to rubrics at all) might also boost students’ aware-
ness and promote their assessment. However, the SIB scheme boasted two notable features
that uniquely contributed to students’ assessment success. First, during their training
students were required to compare two recordings (Defect and Exemplar) at a time under
a given criterion, and to list all discrepancies between the two samples. Both examples served
as practical instances of abstract descriptors, enabling students to grasp and apply the rubric.
More importantly, by identifying the Defect-Exemplar gap they could internalise the rubric,
cultivating an expert-like judgment ability. The teacher’s rubric-based demonstration could
further speed up such rubric-internalization. Second, the part of bridging the gap helps to
extend in-class practice to out-of-class self-exploration, as students could bring the rubric
elsewhere, re-practising, re-thinking or even researching possible ways to improve their
performance. Such a function of constantly alerting and motivating students is unique to
rubrics and cannot be easily replaced by time-limited, space-bounded teacher instruction.
In Week#5, students performed another sight interpreting task with an English text of
similar length (200 words) and of a comparable difficulty level. After interpretation, they used about 30 minutes to finish and submit their self-assessment forms. In both assessment sessions (Week #1 and Week #5), students were told to enter their comments in English, as the language of both the rubric descriptors and the rubric training was English. In addition, in Week #5 they also spent 30 minutes writing a self-reflection report (around 250 words) in English on whether and how rubric training improved their understanding of interpreting skills. The idea of 250-word self-reports was based on Glover (2011), because such length was 'short enough to be written comfortably by the students, yet long enough to have meaningful content' (p. 124). The teacher again collected 32 assessment forms and 32 reports for later data analysis.

Data analysis
To answer RQ1, a total of 64 assessment forms from the two sessions (32 from Week #1 and 32 from Week #5) were collected, and the following features were calculated: (a) the circled keywords and their frequencies, (b) the length (word count) and type (positive or negative) of the corresponding comments, and (c) the length (word count) of the corresponding examples. For example, in Week #1 Student #4 circled the descriptor 'logical', entered the comment 'not translate instead', and listed the corresponding example 'Instead, it's better to focus on making gradual changes to your diet' (see Table 2). The length of the comment was 3 words, and that of the example 12 words. Using this method the lengths of all comments and examples were calculated.
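(The paper does not describe any tooling for these counts, so the following is only a minimal illustrative sketch in Python; the record layout and all field names are invented for the example.)

```python
# Minimal sketch (hypothetical data layout) of the feature counts (a)-(c) above.
from collections import Counter

# One record per circled keyword on a student's assessment form (invented sample).
records = [
    {"student": 4, "session": "week1", "keyword": "logical",
     "comment": "not translate instead", "comment_type": "negative",
     "example": "Instead, it's better to focus on making gradual changes to your diet"},
    # ... one record for every circled keyword on the 64 forms ...
]

def word_count(text):
    """Word count by whitespace splitting."""
    return len(text.split())

# (a) circled keywords and their frequencies, per session
keyword_freq = Counter((r["session"], r["keyword"]) for r in records)

# (b) comment length and type, (c) example length
for r in records:
    print(r["keyword"], word_count(r["comment"]),
          r["comment_type"], word_count(r["example"]))
# The sample record yields a 3-word comment and a 12-word example,
# matching the Student #4 illustration above.
```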
To answer RQ2, all reports were read deductively so that the students' self-perceived roles of rubric training were identified; those roles were then coded into basic-level concepts such as 'rubric training can sequence my attention' and 'can prioritize my assessment'. After that, similar basic-level concepts were aggregated into categories. For example, the above two were grouped into the category of 'rubric training helps to form hierarchical judgments of interpreting skills'. Different categories were compared constantly, across data sources and cases. The identified categories were revised based on the comparison before they were finalised.
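(Likewise for RQ2, the aggregation of basic-level concepts into categories can be pictured as a simple mapping-and-tallying step; the sketch below is illustrative only, with invented labels and counts.)

```python
# Sketch of the RQ2 coding aggregation (labels and data are invented examples).
from collections import Counter

# Basic-level concepts mapped to the broader categories they were grouped into.
concept_to_category = {
    "rubric training can sequence my attention":
        "rubric training helps to form hierarchical judgments",
    "can prioritize my assessment":
        "rubric training helps to form hierarchical judgments",
    "I now see the pause as a skill":
        "rubric training cultivates a nuanced understanding",
}

# Concepts coded from the reports (in the study, one or more per report).
coded_concepts = [
    "rubric training can sequence my attention",
    "can prioritize my assessment",
    "I now see the pause as a skill",
]

category_counts = Counter(concept_to_category[c] for c in coded_concepts)
print(category_counts.most_common())
```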
To improve the validity of the analysis and coding for both research questions, another English teacher was invited to code the answers separately; the researcher and the second coder then met and compared their coding. There were altogether three instances where their opinions differed: in one instance, the coders came up with different names for one theme (correct understanding vs. complete understanding of interpreting skills). They then
re-examined students' reports together, and settled on the name of nuanced understanding (see below). In two other instances, two students' reports were ambiguous and led to different interpretations between coders. To verify the students' intention, both coders met with the students and jointly identified the themes.

Table 2. Comments and illustrative examples of the three circled words.

Circled word | Session         | Positive comments (words) | Negative comments (words) | Examples (underlined and double-slashed by students)
Logical      | Before training | 0                         | 51                        | 44 words; e.g. 'Instead, it's better to focus on making gradual changes to your diet'; e.g. 'far fewer calories than most people think'
Logical      | After training  | 0                         | 293                       | 410 words; e.g. 'a stressful marriage can increase the risk of heart attacks and stroke'; e.g. 'those who were married became less healthy'
Transfer     | Before training | 27                        | 32                        | 41 words; e.g. 'for long-term, sustainable weight loss'; e.g. 'Bike to work' (for positive comment)
Transfer     | After training  | 63                        | 205                       | 448 words; e.g. 'a stressful marriage can increase the risk of heart attacks and stroke'; e.g. 'the death of a spouse or long-term partner' (for positive comment)
Properly     | Before training | 0                         | 15                        | 21 words; e.g. 'There is a persistent myth // that you can exercise your calories away.'
Properly     | After training  | 9                         | 221                       | 610 words; e.g. 'Then, there is the issue of divorce and the death // of a spouse or long-term partner.'; e.g. 'A study this past year from Switzerland tracked 11,000 people annually for sixteen years asking health-related questions.'
Two points about the coding need to be noted. First, as students were explicitly told to answer 'how can rubric training help you in your assessment', nearly all students wrote in a clear, point-by-point fashion, with each paragraph stating one specific area of benefit. Consequently, the boundaries of a comment that fit one particular theme were relatively clear-cut and straightforward, both physically and conceptually. Second, altogether six themes were jointly identified by both coders, of which the three most salient will be discussed in more detail below. The other three themes were 'rubric training can boost confidence' (mentioned by only 6 students), 'rubric training too short' (mentioned by 5 students), and 'rubrics as new tools' (mentioned by 2 students). These three themes were infrequent and less relevant to this paper's focus on the change in awareness of the construct (here, interpreting skills), so themes such as confidence gains were excluded from the following discussion.
The data collection and analysis of the study are illustrated in Figure 1.

Figure 1. Procedure and measurement. (Data collection: students finish two assessments and one self-report. Data analysis: comparing assessment results identifies changes in rubric use (RQ1); coding self-reports identifies rubric perceptions (RQ2).)

Results and discussion


Differences in rubric use
Table 3 shows frequencies of circled descriptor words in both assessments. As a student
cannot circle the same word more than once, the maximum frequency of any given word
circling was 32, meaning that this word is noticed and marked by all the students (though
they may generate different lengths of comments and examples based on the same word,
see below). In other words, the frequency of one circled word represents the number of
students who mark it.
Table 3. Frequencies of circled descriptor words.

Criterion       | Before training: top 3 circled words (frequency) | After training: top 3 circled words (frequency)
Comprehension   | Word (27), Meaning (15), Rapidly (10)            | Meaning (32), Logical (30), Word (30)
Reformulation   | Idiomatic (30), Local (28), Delete (21)          | Transfer (32), Idiomatic (30), Global (22)
Delivery        | Pause (32), Self-correction (28), Steady (27)    | Properly (32), Pause (31), Steady (28)
Total frequency | 218                                              | 267

As can be seen from Table 3, for the nine most frequently circled words across the three criteria, students registered higher frequencies of marking in the after-training assessment session than in the before-training one (267:218). This suggests that after training more words in the descriptors were marked by more students. The trend is most apparent in the category of Comprehension: prior to training students circled 'Word' 27 times, meaning 27 students marked this word, while the next most frequently used word, 'Meaning', registered only 15. By contrast, after training the most frequent word, 'Meaning', was marked by 32 students, followed by 30 markings of 'Logical' and 30 of 'Word'. This indicates that training makes more students mark more words from the rubric in their self-assessments.
To illustrate the point that more students marked more keywords in the category of Comprehension, keyword frequencies in Week#1 and Week#5 were compared. Figure 2 visualises the frequency differences in Comprehension keywords between the two assessment sessions. Prior to training, student assessors relied on a limited number of keywords for self-comments. Their zero use of keywords like 'text' and 'argument' reflects their insufficient knowledge of the terms and inability to trace them in their performance. With rubric training, their awareness of descriptors was considerably boosted. They not only used higher frequencies of key concepts, but also a wider range of them, including previously ignored words like 'text'. The higher occurrence and wider spread of keywords in the back row of Figure 2 epitomise students' growing awareness of Comprehension.

Figure 2. Frequencies of some keywords used before and after training (axes: keywords × frequency; series: Week 1 and Week 5).
Another noticeable trend is the differentiated effect rubric training exerts across the three criteria. Before training, more students tended to circle words in Reformulation and Delivery, yet many fewer marked words in Comprehension (the second and third most marked words attracted only 15 and 10 students respectively). After training, however, the number of students attending to Comprehension increased markedly, with the top three words attracting 32, 30 and 30 students respectively. The change in Comprehension is the most remarkable among the three categories, suggesting that after receiving training students enjoy a much better understanding and use of Comprehension skills in their assessments.
A third trend is the change in students' choice of marked words. Most notably, the second most marked word after training, 'Logical' from Comprehension (frequency: 30), was
clearly not valued before training. Similarly, the word 'Transfer' from Reformulation (frequency: 32 after training) is largely ignored before training (frequency < 21). The most frequent word 'Properly' from Delivery (frequency: 32 after training) is also underused before training.
Regarding the change in comment length and example length between the two assessment sessions, Table 4 lists the differences. Prior to training, student assessors in total circled 14 keywords in the rubric, generating 520 words of comments and 478 words of examples. This means that on average one keyword is accompanied by 37 words of comments and 34 words of examples. In contrast, after training they marked 31 keywords, producing 3872 words of comments and 4112 words of examples, so each keyword attracts about 125 words of comments and 133 words of examples.

Table 4. Word count of comments and examples in the two assessment sessions.

                                    | Before training (Session #1) | After training (Session #2)
No. of circled keywords             | 14                           | 31
Total comment length (per keyword)  | 520 (37)                     | 3872 (125)
Total example length (per keyword)  | 478 (34)                     | 4112 (133)
Table 2 further displays how the three circled words (logical, transfer, properly) are com-
mented upon before and after training. Starting from the word ‘Logical’ in Comprehension,
we find that before training the students generated 51 words of comments, all negative, about their logical features. The interpretation examples they reported as having logical errors were also of limited length, with connectives (like 'instead') as their error sources.
In contrast, after training students become more aware of this logic-identification skill,
generating 293 words of comments. More importantly, their accompanying examples include
more and longer sentences. A closer study of the examples shows that their analysis is no longer limited to the word level (e.g. connectives). They are now able to see errors at a more global level, as their longer examples suggest. This finding partly confirms previous results (e.g.
Hudson & Shapiro, 1991; To et al., 2010). Hudson and Shapiro (1991) argued that from the
language acquisition point of view language learners acquired lower order, word-level abilities
first before they mastered the higher-order skills at a later developmental stage. To et al. (2010)
further held that logical relations not only reflected the ability to join sentences into a text, but
also indicated their ability to assess a listener’s knowledge needs during different points of
their utterance. Assessors needed to simultaneously attend to the context, the precise use of
linguistic devices and the listener’s needs. That is why understanding and assessing this skill
could be difficult. The current study contends that a possible solution to this difficulty is deliberate rubric training in self-assessment: self-assessment can enhance students'
self-awareness and their awareness of listeners’ needs; rubric training can direct their attention
to those easy-to-overlook or hard-to-analyse language skills like logic identification. A combi-
nation of both can thus empower them to be competent, knowledgeable assessors.
Regarding ‘Transfer’ in Reformulation, this circled word also leads to more comment words
and examples after training. The examples listed in Table 2 indicate that students on both occasions focus on how to convert a part of speech when performing interpreting, like a noun
into a verb (e.g. loss, work, death) or an adjective into a verb (e.g. stressful). In other words, at
both sessions students seem to be predominantly concerned with word level transfer, ignoring
conversion at the sentence level (like passive to active voice) or beyond. This indicates that
though students after training become more perceptive of word-level transference features,
their awareness of a higher-level manipulation achieves limited progress. Another feature in
this category is that at both sessions students can enter positive comments based on the
keyword ‘transfer’. This seems to suggest that students at this stage become at least more
confident, though not necessarily more proficient, in this particular criterion.
Regarding 'Properly' in Delivery, students before training generate many fewer comment words and example words under this category than under the other two. They point out having many pauses, or pausing too long, and thus circle the word 'Pause' rather than 'Properly' in their self-assessments. It seldom occurs to them that pauses, when used properly, could serve as a skill to organise their thoughts and manage their delivery in interpreting. After training, however, they acquire the knowledge of this pause function, circle the word 'Properly' more frequently, and consequently produce more comment words and example words.
As Table 4 shows, overall students who receive training can produce more comments and examples based on the chosen keywords. In addition to the above-mentioned words (logical, transfer, properly), all other keywords like 'anticipate', 'global' and 'pace' consistently see increases in the lengths of comments/examples.
In sum, rubric training can guide student assessors in attending to underlying skills rather
than surface features of their interpretation. They would focus on the function of pause rather
than the sheer frequency of its occurrence, focus on the logic behind words rather than
surface-level word errors. Admittedly, their awareness of some skills like language transfer achieves only minimal progress given the limited amount of training time and their developing
competence as language learners, yet overall with rubric training they can acquire a better understanding of the component skills, a sharper perception of the descriptors' keywords, and a keener sense of matching comments/examples to the keywords they have circled.
It should be noted that some of the above benefits could also be shared by a no-rubric
instruction mode. With teachers’ guidance students can still improve their understanding
of the skills, or sharpen their sensitivity to errors. However, this study found that rubric
training could uniquely boost students’ ability to attend to multiple quality aspects. For
example, after training more students simultaneously circled the rubric words 'add' and 'properly', and became more sensitive to the skill of 'adding words to produce idiomatic target language' and the skill of 'pausing properly'. On one occasion, after translating the segment 'tracked 11,000 people annually for sixteen years…', eight students wrote that their translations were filled with run-on, awkward Chinese sentences. Checking against the criteria of reformulation and delivery, they added that it would be more natural if
they could break the original sentence into two or three shorter Chinese sentences, each
using the verb ‘track’. Through using shorter sentences and adding the verb, they said the
language would be more idiomatic, and the pace of delivery could be smoother. Here the
two skills (adding words, pausing properly) used to be overlooked by students who had a
habit of closely following the original sentence. However, after a SIB training intervention,
students became quicker to identify the gaps on multiple fronts (here on reformulation
and delivery) and more capable of closing these gaps. Guided by the rubric, they would
listen to their recordings more critically and intentionally, locating the places to add words
or insert pauses. This purposeful, skill-led self-assessment is a unique benefit brought by
rubric training. Teachers’ instruction may raise students’ overall awareness of interpreting
quality, or focus students' attention on one aspect of their performance at a time, but it is the rubric that guides students to attend to multiple aspects (in this case, adding words plus pausing properly) and form a balanced, systematic judgment. To confirm and further
our understanding of rubric training, we now turn to students’ self-reports.

Self-reports of rubric training


Self-reports reveal that all students recognise to varying degrees that rubric training can
better their awareness of interpreting skills. Report coding has identified the top three
changes in their awareness:

Cultivating a nuanced understanding of interpreting skills


The most immediate and most frequently mentioned change is that after rubric training students report having a more sophisticated understanding of interpreting skills, as the following two excerpts display.
Excerpt 1: I use to think that adding words was bad, not allowed, now I realize that the differ-
ences between languages require some necessary additions, like the rubric says, the skill of
addition is necessary, to make target language idiomatic. (Student #12)

Excerpt 2: I now see the pause not as an error, but as a skill, because if I know how to pause at
proper places, I can also achieve a kind of fluency. (Student #3)

In Excerpt 1, Student #12 used to treat addition as being unfaithful. With rubric training he
realises that a rigid and blind word-for-word translation may not serve the objective of fidelity.
Rather, a judicious addition can reflect an interpreter’s scrupulous judgment of target audience
and skilful manipulation of language transfer. He then uses addition as a lens to investigate
whether and to what extent interpreters (including himself) have mastered this transfer skill.
Similarly, Student #3 also uses pause as a lens to assess the delivery skill. This finding marks a
contrast with Han and Riazi (2017), where students could only assess their own disfluency, falling
short of explaining or suggesting the skills to counter their delivery errors. The difference lies in
the provision of the skill-oriented rubric training. While the former used a product-based rubric,
the latter is more process-based, alerting students to the factors behind their performance and
stimulating their reflections. In these two excerpts, both students acknowledge the benefits of
rubric training as modifying or even correcting their attitude towards certain interpreting skills.

Forming hierarchical judgements of interpreting skills


Students not only form a nuanced understanding of interpreting skills, they also form a hierarchical judgement of them, as Excerpt 3 shows:
Excerpt 3: The biggest help is that I now know how to sequence my attention. I know I need
to form a systematic picture, like a pyramid, word transfer skill is easy to locate, but discourse
transfer skill is not as easy, so I have to prioritize my attention. (Student #11)

Student #11 thinks that the rubric training divides interpreting competence into several
interconnected componential skills, from more global, difficult-to-rate aspects to more local,
easy-to-rate ones. He is thus instructed to sequence his attention, to cover some elusive,
global skills while not ignoring local aspects. Here lies an important difference between
rating interpreting and rating speaking or writing: interpreting poses more challenges to student assessors, who need to simultaneously attend to multiple dimensions like fidelity
to the source text, acoustic features, and language use. That’s why novice assessors without
rubric training often feel overwhelmed by multi-tasking and produce limited comments, as
Table 3 shows. Now with the help of rubric training, students can form a hierarchy in their
mind and strategically allocate their attention, generating elaborate and balanced self-judg-
ments. Their changed understanding of interpreting skills has been reflected in their gen-
eration of discourse-level comments and examples: Before training, they only generate quite
limited discourse-level comments (14 words) and examples (10 words). After training, they
can give more such comments (121 words) and identify more examples (102 words).

Generating comprehensive judgments


Another often-quoted change is that students can balance both positive and negative examples. Before training, their total comments (520 words) are composed of 90% negative (470 words) and 10% positive (50 words) ones. After training, the total comments (3872 words) are about 66% negative (2556 words) and 34% positive (1316 words). On the one hand, rubric
training gives them confidence in singling out their strong points of interpretation, thus
producing more positive examples as is shown in Table 2. On the other hand, while stu-
dents’ own interpretations may be flawed, they are perceptive and confident in identifying
the areas or directions they can work on to acquire the needed skills. A typical excerpt
is Excerpt 4.

Excerpt 4: About the transfer skill, during the training I was shown an exemplar and knew what
an excellent transfer skill at a global level looked like. When I interpret, I may not be able to
transfer as well as the exemplar, but I can identify more places where my interpretation is unsat-
isfactory, mainly the global level transference. (Student # 9)

Student #9 claims that he can locate more examples of NOT mastering the transfer skill.
This is due to his prior training where he has studied the exemplar transfer skills at the
global level. He may not yet master the skill or apply it in his interpretation, but he knows
where it goes wrong and reports his weakness in the self-assessment. The finding that
students can locate more weakness examples at a global level is unique to this study.
Previous research has pointed out that the global-level features were difficult to com-
ment on (e.g. Malvern & Richards, 2002; Wang, 2011). For example, Malvern and Richards
(2002) claimed that judgment of global level speaking errors could be too complex for
students. This study argues that even though students could not improve their global
competence in a short term, their awareness and analysis of discourse errors could achieve
remarkable progress. Their knowledge of NOT mastering skills is as valuable as that of
mastering them.
Relating the report data to the earlier rubric data, we find that students after training claim to form a new attitude towards interpreting, thus generating more skill-oriented comments and examples. They also claim to form a hierarchical and comprehensive approach, covering both global and local, both positive and negative aspects of their interpretation. Thus their comments and examples are more numerous and balanced than those before training, as Tables 3 and 4 display. Their reports also suggest that the markings and comments represent sustained changes in their interpreting awareness, and that rubric training has a lasting effect on their perceptions and actual behaviour in self-assessment.
More importantly, both data analyses highlight some features unique to interpreting.
First, the task of interpreting may lead to errors in both languages: source language com-
prehension and target language production. Even when the target language is the students’
native language, they still find a lot of unidiomatic expressions due to their inability to
overcome the source language interference (Su, 2019b, 2019c). That is why students in this
study often report language-related errors both in English (comprehension) and in Chinese
(production).
Second, interpreting competence is more than a simple combination of language comprehension and production; it involves a series of activating and selecting operations.
While reading the source text, interpreters need to rapidly activate multiple target text
options across different levels (word, sentence, discourse, etc.) and select the most appro-
priate option to realise transference. Inexperienced students can only activate limited
target options at a local level (hence the numerous word-level transference in this study).
Rubric training can afford them knowledge of global level activation and selection, yet
they need more time to turn this explicit, declarative knowledge into automatic, procedural
interpreting skills, to turn their awareness of interpreting skills into the acquisition of the
skills. As DeKeyser (2014) explained, skills may be acquired through perceptive observation
(awareness) and assessment of others engaged in skilled behaviour. Students’ keen aware-
ness of skills may not immediately translate into their acquisition, but the guidance of
rubric, exemplar and teacher-led training can clearly accelerate the acquisition process as
is shown in this study.

Conclusion and implications


This study reports the effects of rubric training on students’ assessment and awareness of
interpreting. It suggests that after training students can notice more features of interpreting, as they circle more descriptor words in the rubric and generate more comments and corresponding examples in their self-assessment. Besides, in two areas students have demonstrated
better awareness of interpreting skills: they become more sensitive to global features of
language use like discourse devices, and they better understand the source language inter-
ference and the skill of cross-language transference. Their self-reports show that after training
they could learn to prioritise their attention to the higher-level skills, forming a comprehen-
sive judgment of their interpreting competence.
An immediate implication from this study is the design of a rubric training scheme for the language class. The proposed SIB scheme could encourage students to study rubrics, identify the gap between their level and a reference level, and seek ways to bridge the gap. This scheme is particularly useful for self-assessment in integrated tasks like a read-to-write test, where students could be taught to simultaneously treat multiple criteria and circle their errors on multiple fronts. In addition, the scheme extends self-assessment beyond the space-bound, time-limited classroom, promoting students' efforts in problem-solving and self-reflection. Teachers could thus monitor students' progress by collecting their after-class reports or their circling of rubric descriptors.
Another implication is for testing or teaching multi-construct language tasks. A task like interpreting can easily overwhelm student assessors with its multiple and interconnected constructs. To ease students' anxiety and to afford them a necessary tool kit, an instructor needs to explain to them the component constructs and ways to measure them, preferably with typical examples or exemplars. Then students can have a warm-up practice and watch the instructor's demonstration of rubric use before they formally undertake feedback sessions. This incremental, adequately guided rubric training can be replicated in many emerging integrated language tasks like reading-to-writing or listening-to-writing.
The methodological implication of the study is the use of rubric descriptors in exploring
students’ language awareness. Previous studies tap such awareness mainly by soliciting
students' self-reports (e.g. Glover, 2011) or comments (e.g. Kissling & O'Donnell, 2015). This
study finds that by asking students to circle key words from the descriptors and to justify
their word choice with accompanying examples, we can better judge the extent of their
knowledge of the construct. Where students have difficulty in judging some important
descriptor words (like ‘Global’ or ‘Properly’ in this study), we can design special training to
highlight those elusive aspects and direct their attention to those components. Thus descrip-
tor circling can serve as a window to gauge students’ awareness of language constructs.
An additional implication is for Master of Translation and Interpreting (MTI) programs in
China and the rest of the world. As the growth of interpreting students far outpaces that of
teachers, rubric training combined with self-assessments can ease teachers’ workload and
alleviate the shortage of teachers. More importantly, as students become competent assessors
and responsible learners, they can regulate their own pace of learning, fostering genuine
learner autonomy (Su, 2019b).
In addition, this study highlights the superiority of skill-based formative assessment over
product-based summative assessment when it comes to language awareness-raising.
Interpreter learners using product-based assessment can easily spot surface-level errors like pauses, but their knowledge of using pauses as an interpreting skill is generally lacking.
Similarly, in speaking assessment students may know the frequency of their pauses, but their
knowledge of pause functions is likely to be superficial (e.g. should avoid pauses, should be
more fluent). Once they see the pause as a measure of thought-organizing or pace-manag-
ing, they will form a deeper understanding of this feature and integrate it in their holistic,
hierarchical view of speaking competence.
This study is not without limitations. It only investigates interpreting in two self-assess-
ment sessions, and its training lasts only three weeks. One or two more assessments could
be added to reveal whether there is a continual trend of awareness-raising. Future research
could thus implement multiple self-assessments over a longer period of training, so that a
clear longitudinal trend could be documented. Having said that, it is hoped that this study
could serve as a starting point to further inquiries into this underexplored yet promising
area of rubric training.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This study was supported by the Ministry of Education of the People's Republic of China (17YJC740074) and the Fundamental Research Funds for the Central Universities of China (20720181002).

Notes on contributor
Wei Su is an Associate Professor at Xiamen University, China. He received his PhD in Translation
and Interpreting in 2011 and has published articles in journals such as Language and Education and
The Interpreter and Translator Trainer. His main research interests include interpreting compe-
tence and the assessment of interpreting in teaching.

ORCID
Wei Su http://orcid.org/0000-0003-2204-3418

References
Alderson, J. C. (2005). Diagnosing foreign language proficiency. Continuum.
Andrade, H. G., & Du, Y. (2005). Student perspectives on rubric-referenced assessment. Practical
Assessment, Research & Evaluation, 10(3), 1–11.
Association for Language Awareness (ALA). (2018). Language awareness defined. Retrieved January
2018, from http://www.lexically.net/ala/la_defined.htm
Bachman, L. F., & Palmer, A. S. (1989). The construct validation of self-ratings of communicative lan-
guage ability. Language Testing, 6(1), 14–29.
Bartłomiejczyk, M. (2007). Interpreting quality as perceived by trainee interpreters. The Interpreter and
Translator Trainer, 1(2), 247–267. https://doi.org/10.1080/1750399X.2007.10798760
Berg, E. C. (1999). The effects of trained peer response on ESL students’ revision types and writing
quality. Journal of Second Language Writing, 8(3), 215–241. https://doi.org/10.1016/S1060-3743
(99)80115-5
Blanche, P., & Merino, B. J. (1989). Self-assessment of foreign-language skills: Implications for teachers
and researchers. Language Learning, 39(3), 313–338.
Chen, J. (2010). Sight translation. Shanghai Foreign Language Education Press.
Chen, Y. M. (2006). Peer- and self-assessment for English oral performance: A study of reliability and
learning benefits. English Teaching and Learning, 30(4), 1–22.
DeKeyser, R. (2014). Skill acquisition theory. In B. VanPatten (Ed.), Theories in second language acquisi-
tion: An introduction (pp. 94–112). Routledge.
Delaney, Y. A. (2008). Investigating the reading-to-write construct. Journal of English for Academic
Purposes, 7(3), 140–150.
Freeman, M. (1995). Peer assessment by groups of group work. Assessment & Evaluation in Higher
Education, 20(3), 289–299.
Glover, P. (2011). Using CEFR level descriptors to raise university students’ awareness of their speaking
skills. Language Awareness, 20(2), 121–133. https://doi.org/10.1080/09658416.2011.555556
Gonzalez, J. A. (2009). Promoting student autonomy through the use of the European Language
Portfolio. ELT Journal, 63(4), 373–382.
Han, C., & Riazi, M. (2017). The accuracy of student self-assessments of English-Chinese bidirectional
interpretation: A longitudinal quantitative study. Assessment & Evaluation in Higher Education.
Advance online publication. https://doi.org/10.1080/02602938.2017.1353062
Hartley, A., Mason, I., Peng, G., & Perez, I. (2003). Peer and self-assessment in conference interpreter
training. www.llas.ac.uk/resourcedownloads/1454/hartley.rtf
Hudson, J. A., & Shapiro, L. R. (1991). From knowing to telling: The development of children's scripts,
stories and personal narratives. In A. McCabe & C. Peterson (Eds.), Developing narrative structure
(pp. 89–136). Erlbaum.
Isaacs, T. (2016). Assessing speaking. In D. Tsagari & J. Banerjee (Eds.), Handbook of second language
assessment (Vol. 12, pp. 131–146). Walter de Gruyter.
Kim, H. J. (2006). Providing validity evidence for a speaking test using FACETS. Teachers College,
Columbia University Working Papers in TESOL & Applied Linguistics, 6(1), 1–37.
Kissling, E. M., & O’Donnell, M. E. (2015). Increasing language awareness and self-efficacy of FL stu-
dents using self-assessment and the ACTFL proficiency guidelines. Language Awareness, 24(4),
283–302. https://doi.org/10.1080/09658416.2015.1099659.
Lee, J. (2008). Rating scales for interpreting performance assessment. The Interpreter and Translator
Trainer, 2(2), 165–184. https://doi.org/10.1080/1750399X.2008.10798772
Lee, S. B. (2017). University students’ experience of “scale-referenced” peer assessment for a consec-
utive interpreting examination. Assessment & Evaluation in Higher Education, 42(7), 1015–1029.
https://doi.org/10.1080/02602938.2016.1223269
Li, J., & Lindsey, P. (2015). Understanding variations between student and teacher
application of rubrics. Assessing Writing, 26, 67–79. https://doi.org/10.1016/j.asw.2015.07.003
Liou, H. C., & Peng, Z. Y. (2009). Training effects on computer-mediated peer review. System, 37(3),
514–525. https://doi.org/10.1016/j.system.2009.01.005
Luoma, S. (2004). Assessing speaking. Cambridge University Press.
Malvern, D., & Richards, B. (2002). Investigating accommodation in language proficiency interviews
using a new measure of lexical diversity. Language Testing, 19(1), 85–104.
McMillan, J., & Hearn, J. (2008). Student self-assessment: The key to stronger student motivation and
higher achievement. Educational Horizons, 87(1), 40–49.
Min, H. T. (2016). Effect of teacher modeling and feedback on EFL students’ peer review skills in peer
review training. Journal of Second Language Writing, 31, 43–57. https://doi.org/10.1016/j.
jslw.2016.01.004
Patri, M. (2002). The influence of peer feedback on self-and peer-assessment of oral skills. Language
Testing, 19(2), 109–131.
Pöchhacker, F. (2016). Introducing interpreting studies. Routledge.
Setton, R., & Dawrant, A. (2016). Conference interpreting: A trainer's guide. John Benjamins Publishing
Company.
Stevens, D. D., & Levi, A. J. (2013). Introduction to rubrics: An assessment tool to save grading time, convey
effective feedback, and promote student learning. Stylus Publishing, LLC.
Su, W. (2019a). Interpreting quality as evaluated by peer students. The Interpreter and Translator
Trainer, 13(2), 177–189. https://doi.org/10.1080/1750399X.2018.1564192
Su, W. (2019b). Exploring native English teachers’ and native Chinese teachers’ assessment of inter-
preting. Language and Education, 33(6), 577–594.
Su, W. (2019c). NNS and NES teachers’ co-teaching of interpretation class: A case study. Asia-Pacific
Education Researcher. Advance online publication. https://doi.org/10.1007/s40299-019-00489-7.
To, C. K.-S., Stokes, S. F., Cheung, H.-T., & T’sou, B. (2010). Narrative assessment for Cantonese-speaking
children. Journal of Speech, Language, and Hearing Research, 53(3), 648–669. https://doi.org/
10.1044/1092-4388(2009/08-0039)
Trofimovich, P., Isaacs, T., Kennedy, S., Saito, K., & Crowther, D. (2016). Flawed self-assessment:
Investigating self-and other-perception of second language speech. Bilingualism: Language and
Cognition, 19(1), 122–140. https://doi.org/10.1017/S1366728914000832
Wang, W. (2014). Students’ perceptions of rubric-referenced peer feedback on EFL writing: A longitu-
dinal inquiry. Assessing Writing, 19, 80–96.
Wang, Z. J. (2011). A case study of one EFL writing teacher's feedback on discourse for advanced
learners in China. University of Sydney Papers in TESOL, 21–42.
Zhong, W. (2019). China’s translation education in the past four decades: Problems, challenges and
prospects. Chinese Translators Journal, 1, 68–75.
Zhu, W. (1995). Effects of training for peer response on students’ comments and interaction. Written
Communication, 12(4), 492–528. https://doi.org/10.1177/0741088395012004004

Appendix
Text 1 (before training)
There is a persistent myth that you can exercise your calories away. Hit the gym. Bike to work. If you've
tried to lose weight, you know the importance of getting off the couch and getting moving. But in
reality, becoming more active generally does not result in rapid, substantial weight loss. Instead, it's
better to focus on making gradual changes to your diet, such as eating more vegetables and cutting
back on refined carbohydrates. Dietary changes are especially important at the beginning of any new
weight loss plan. Exercise burns off far fewer calories than most people think. One classic fast food
meal generally consists of thousands of calories, exceeding the amount most adults need in a day. For
instance, you have to walk 35 miles to burn off 3,500 calories. That is about two times the amount in
one supersized Big Mac meal. But that's not to say exercise is unimportant. A large review of studies
that included more than 1,000 adults suggested that for long-term, sustainable weight loss, a plan
that combines a healthy eating regimen and regular exercise works better than either diet or
exercise alone.

Text 2 (after training)


Many people believe that marriage is good for you, and there are an ample number of studies that
support that conclusion. The latest one, by researchers at the Aston Medical School in England, anal-
ysed data from more than 900,000 patients. Those patients diagnosed with type 2 diabetes, high
blood pressure and high cholesterol were more likely to survive if they were married than those who
were single. Fortunately, there are also studies that show single people aren't necessarily doomed. A
study this past year from Switzerland tracked 11,000 people annually for sixteen years, asking them
health-related questions. Their conclusion: those who were married became less healthy. Then there
is the issue of divorce and the death of a spouse or long-term partner. A study in 2009 by the University
of Chicago discovered that certain aspects of one's health were worse for people who had been
divorced or widowed than for those who had never been married. The happiness of one's marriage
seems to be the important factor. Studies have found that a stressful marriage can increase the risk of
heart attacks and stroke compared with a contented union.
