
European Journal of Training and Development

Quality of feedback following performance assessments: does assessor expertise matter?


Marjan J.B. Govaerts, Margje W.J. van de Wiel, Cees P.M. van der Vleuten
Article information:
To cite this document:
Marjan J.B. Govaerts, Margje W.J. van de Wiel, Cees P.M. van der Vleuten, (2013) "Quality of feedback following
performance assessments: does assessor expertise matter?", European Journal of Training and Development, Vol. 37
Issue: 1, pp.105-125, https://doi.org/10.1108/03090591311293310
Permanent link to this document:
https://doi.org/10.1108/03090591311293310
Downloaded on: 04 May 2018, At: 19:34 (PT)
References: this document contains references to 49 other documents.
Downloaded by Universitas Brawijaya at 19:34, 04 May 2018 (PT)

The full text of this document has been downloaded 1964 times since 2013*


*Related content and download information correct at time of download.


Quality of feedback following performance assessments: does assessor expertise matter?

Marjan J.B. Govaerts
Department of Educational Development and Research, FHML, Maastricht University, Maastricht, The Netherlands

Margje W.J. van de Wiel
Department of Work and Social Psychology, FPN, Maastricht University, Maastricht, The Netherlands, and

Cees P.M. van der Vleuten
Department of Educational Development and Research, FHML, Maastricht University, Maastricht, The Netherlands

Abstract

Received 11 July 2012. Revised 5 October 2012. Accepted 9 October 2012.

Purpose – This study aims to investigate the quality of feedback offered by supervisor-assessors with varying levels of assessor expertise following assessment of performance in residency training in a health care setting. It furthermore investigates if and how different levels of assessor expertise influence feedback characteristics.

Design/methodology/approach – Experienced (n = 18) and non-experienced (n = 16) supervisor-assessors in general practice (GP), representing different levels of assessor expertise, watched two videotapes, each presenting a trainee in a “real-life” patient encounter. After watching each videotape, participants documented performance ratings, wrote down narrative feedback comments and verbalized their feedback. Deductive content analysis of feedback protocols was used to explore quality of feedback. Between-group differences were assessed using qualitative-based quantitative analysis of feedback data.

Findings – Overall, the specificity and usefulness of both written and verbal feedback were limited. Differences in assessor expertise did not seem to affect feedback quality.

Research limitations/implications – Results of the study are limited to a specific setting (GP) and assessment context. Further study in other settings and with larger sample sizes may contribute to a better understanding of the relation between assessor characteristics and feedback quality.

Practical implications – Findings suggest that even when supervisor-assessors, whatever their level of assessor expertise, are trained in performance assessment and the provision of performance feedback, high-quality feedback is not self-evident; “on the job” coaching of feedback providers and continuous evaluation of feedback processes in performance management systems are crucial. Instruments should facilitate the provision of meaningful feedback in writing.

Originality/value – The paper investigates the quality of feedback immediately following assessment of performance, and links feedback quality to assessor expertise. Findings can contribute to the improvement of performance management systems and assessments for developmental purposes.

Keywords Feedback, Performance development, Performance management, Professional training, Performance assessment

Paper type Research paper

The authors would like to thank the Editors and the reviewers for their recommendations, which have improved this paper.

European Journal of Training and Development
Vol. 37 No. 1, 2013
pp. 105-125
© Emerald Group Publishing Limited
2046-9012
DOI 10.1108/03090591311293310

Introduction

Feedback is regarded as essential in the development of performance and professional expertise, as it helps to build knowledge and skills (Ericsson, 2009; Salas and Rosen, 2010). Observation and documentation of performance on a day-to-day basis, as well as feedback on performance, are considered key elements of well-implemented performance management systems which aim at continuous development of performance consistent with attainment of organisational goals (Aguinis and Pierce, 2008; Aguinis, 2009). In order to be useful for performance improvement, feedback must be informative; high-quality feedback is critical to an
effective performance management system.
Research on feedback, however, rather consistently shows that in many instances
feedback decreases rather than increases performance (Kluger and DeNisi, 1996;
Latham et al., 2005). Therefore, evaluation of the quality of feedback and its impact on
performance is crucial in monitoring effectiveness of performance management
systems (Aguinis, 2009). Although there is considerable research on the use of
quantitative feedback (performance ratings and scores), researchers have only recently
begun to focus on the role of narrative comments in performance assessments. For
instance, one of the first studies on narrative comments from a multi-source feedback
process was only published in 2004 (Smither and Walker, 2004). Even fewer studies
have focused on quality and documentation of written narrative feedback directly
following observation of performance. Given the impact of feedback on performance
improvement, further research into the quality of written narrative feedback and its
role in performance improvement is needed. The purpose of the present study is to
address this gap. It aims to investigate the quality of feedback given and documented
by supervisor-assessors directly following observation of performance in professional
training. We furthermore explore whether supervisors’ level of experience with regard
to assessment of performance affects quality of feedback.

Conceptual framework
Feedback: a definition
Research on feedback has a long and well-documented history, going back to the
beginning of the previous century. The concept of feedback is now used in many
different fields, including education, professional training, HRD as well as
mathematics and engineering. As a consequence, the term “feedback” is used and
interpreted in many ways (Van de Ridder et al., 2008). Van de Ridder et al. (2008), for
instance, define feedback in clinical education as “Specific information about the
comparison between a trainee’s observed performance and a standard, given with the
intent to improve the trainee’s performance” (Van de Ridder et al., 2008, p. 193). Recent
reviews on feedback in educational settings define feedback as “information provided
by an agent regarding aspects of one’s performance or understanding” (Hattie and
Timperley, 2007, p. 81) or “information communicated to the learner that is intended to
modify his or her thinking or behaviour for the purpose of learning” (Shute, 2008,
p. 154). Similarly, performance feedback in organisational settings has been defined in
different ways (Alvero et al., 2001), and definitions may include (providing) information
about past performance, telling performers what and how well they have been doing as
well as information about how to adjust performance according to task requirements.
For the purpose of this article we use Ramaprasad’s (1983) comprehensive definition of
feedback:

Feedback is information about the gap between the actual level [of performance] and the reference [or standard] level which is subsequently used to alter the gap in some way (Ramaprasad, 1983, p. 4).
Based on Ramaprasad’s definition, feedback that is useful for learners minimally requires information about observed performance (“performance measurement” or assessment); performance standards; as well as information about what needs to be done in order to achieve performance goals. In line with Ramaprasad’s definition, the term “feedback” not only implies “looking back”, but also incorporates what is known as “feed forward”: cues to directions for learning and performance improvement (Hattie and Timperley, 2007; Hounsell, 2007).

The role of feedback in performance development


In the recent past, the HRM literature has shown a shift from “measurement” of
performance (with an almost exclusive focus on ratings and assessment scores)
towards “performance management” systems which combine various purposes
(Latham et al., 2005). Aguinis (2009) describes six key purposes that can – and should
be – served by effective performance management systems, including administrative
purposes (to make decisions about employees or trainees); communication about
performance and performance expectations; as well as developmental purposes, to help
employees improve performance on an ongoing basis. As recently described in a
definitional review of the HRD domain, the intended purposes of HRD can be
summarised as improving individual, group and organisational effectiveness and
performance by developing knowledge, skills and competencies and/or by enhancing
human potential and personal growth (Hamlin and Stewart, 2011). Performance
management with its increasing emphasis on the use of performance assessments for
developmental purposes can thus be regarded as a core element of HRD. Recent
developments in HRM are very similar to developments in professional training, which
also show a shift in focus from assessment of trainee achievement towards assessment
as a tool for learning and development of professional expertise (e.g. Boud and
Falchikov, 2007). A principle underlying all these developments is that we want our
learners, trainees and employees to continue to improve their performance throughout
their lifetimes and to develop expertise, i.e. a body of knowledge and skills that allows
high-quality and adaptive performance in a domain. High quality feedback cycles are
crucial in this process (Bransford and Schwartz, 2009; Ericsson, 2009).
The relationship of assessment and feedback in performance development is
summarised in Figure 1. Performance is observed, interpreted and integrated in the
assessment process; this cognitive processing by the assessor results in a judgment of
task performance. If the goal is to improve performance, the assessor subsequently
offers information obtained in the assessment process as feedback to the trainee or

Figure 1. Feedback in performance development

employee. Finally, this feedback must be correctly interpreted to enable performance improvement.

Systematic assessment of performance and identification of relevant strengths and weaknesses is thus the foundation of feedback that is to be used for performance improvement and expertise development (Ericsson, 2009). Both in organisational settings and in professional training and education, assessments of performance mostly combine quantitative and qualitative feedback formats. Until recently,
performance appraisal research focused on quantitative feedback (performance ratings
and scores), rather than the role of written, narrative feedback comments in
performance assessments (Brutus, 2010; Finney, 2010). The exclusive focus on
quantitative data can be explained by the fact that performance assessment and
appraisal systems have been used mainly for administrative purposes, and for
justification of decisions about selection and promotion. The shift in focus towards
narrative comments in performance appraisal and management can be attributed to
several factors. First, it is being acknowledged that there is a limit to how well ratings
or scores can capture individual performance. Performance is deeply embedded in
context (Durning et al., 2012; Ferris et al., 2008), and narrative comments are needed to
capture context-specific aspects of task performance in order to be able to accurately
evaluate performance effectiveness (Brutus, 2010; Govaerts et al., 2005). Second, and
more importantly, narrative comments are better equipped than ratings to provide
information about strengths and weaknesses needed to actually improve performance
in line with personal and organisational goals. There is considerable evidence from
research in organisational as well as in educational settings, that providing written
comments is more effective for learning or personal improvement than just providing
grades or scores (Black and Wiliam, 1998; Overeem et al., 2010; Smither and Walker,
2004; Wiliam, 2007). Comments are considered vital to professional development
(Finney, 2010), and employees tend to process them more extensively than they do
numerical ratings (Brutus, 2010).

Effectiveness of feedback
The large body of research on feedback shows conflicting findings and mixed results
regarding effectiveness of feedback (Hattie and Timperley, 2007; Kluger and DeNisi,
1996; Shute, 2008; Smither et al., 2005). Recent reviews, however, have summarised
specific characteristics of feedback that are likely to make feedback more effective. In
general, feedback becomes more useful for performance improvement if the feedback is
directed at the task at hand, and provides feedback recipients with specific information
about aspects of task performance (i.e. what has been observed), about how and why
this deviates from desired goals and/or standards and how they can do better
(i.e. contains specific suggestions and strategies for improvement) (Hattie and
Timperley, 2007; Shute, 2008; Van de Ridder et al., 2008). Effective, high-quality feedback is thus specific and elaborate, targeting specific, observable behaviours that can be changed or adapted to improve task performance. Effectiveness of feedback
is also likely to increase if it is directed at the processes underlying task performance –
to promote transfer to other tasks – and if it helps learners to adopt self-regulated
learning approaches in which they plan, monitor and evaluate their own performance
and progress. These types of feedback, i.e. specific, behavioural feedback on task,
process or self-regulation, help learners to represent and understand what successful
performance entails. Feedback which lacks specificity does not provide learners with
concrete behavioural information that can be used to improve performance and is thus

less useful. For instance, feedback that restricts its informational value to the correctness of the outcome (overall evaluative judgment or verification) or, worse, is directed at the person (the “self” or character of the feedback recipient) rather than the task is not very informative, is likely to distract attention from the task and is thus mostly ineffective in terms of performance development (Hattie and Timperley, 2007; Kluger and DeNisi, 1996; Shute, 2008). Evaluation of feedback quality is therefore essential in monitoring and evaluating the effectiveness of performance management systems: high-quality performance management systems depend on high-quality feedback. Given the increasing importance of narrative comments in feedback, analysis of feedback comments may not only indicate effectiveness of feedback but also reveal opportunities to improve the feedback culture in organisations.

Assessment informs feedback


Feedback, however, can only be as good as the assessment that informs it: high-quality assessment is a precondition for high-quality feedback (Figure 1). It is generally assumed that accuracy in performance assessments is a precondition for accurate and meaningful feedback (e.g. Holmboe et al., 2010; Van de Wiel et al., 2011). Both in
organisations and in professional training a great deal of learning is “on-the-job”, in
working environments in which direct supervisors (e.g. line managers or clinical
supervisors) are responsible for assessment of performance as well as provision of
meaningful feedback. Performance assessments inherently rely on expert judgments
by assessors who – ideally – have two types of expertise: considerable expertise in
their original working field and assessor expertise, i.e. expertise in judging
performance. Obviously, both high-quality assessment and feedback require supervisor-assessors who are experts in their task domain. However, research in organisational psychology as well as in professional training also indicates that assessment of performance in work settings is to be seen as a complex cognitive task, in
which assessors are continuously challenged to observe, recognise and select relevant
performance information (information acquisition); to interpret and organize
information in memory (to build representations of employee behaviours); to search
for additional information; and to finally retrieve and integrate all relevant information
to arrive at judgments on performance (DeNisi, 1996; Feldman, 1981; Govaerts et al.,
2011). Research findings show that this cognitive information processing by assessors
– and thus assessment outcome – is determined by a broad range of assessor
characteristics, such as assessors’ personal goals, performance theories (e.g. Murphy
et al., 2004) or mood (e.g. Forgas, 2002; Forgas and George, 2001). Recent studies
furthermore indicate that cognitive processes that are related to judgment and decision
making in workplace-based assessment are influenced by assessor expertise, i.e. the
assessor’s level of expertise in assessment of performance (Govaerts et al., 2011;
Govaerts et al., 2012; Lievens, 2001). The studies by Govaerts et al. indicated that
cognitive processes underlying performance assessment change over time, due to
increased experience in the assessment task. In general, Govaerts et al.’s results were
similar to findings from expertise studies in other domains (Ericsson, 2009; Salas and
Rosen, 2010). Compared to non-experienced supervisor-assessors, cognitive
information processing in experienced supervisor-assessors indicated expertise
effects and more expert information processing – reflecting high-quality problem
representations that guided their performance in the assessment task. For instance,
when observing and assessing trainee performance in general practice (GP), experienced supervisor-assessors not only used more time to assess complex behaviours than less experienced supervisor-assessors, but also paid more attention to situation-specific cues in the assessment task. Experienced assessors were also more likely than inexperienced assessors to explicitly link task-specific cues (such as patient personality or specific features of the clinical case) to task-specific performance requirements and performance assessment. Furthermore, when judging performance, experienced assessors explicitly linked specific aspects of trainee
performance to (changes in) patient behaviours and outcome of the consultation.
Findings also showed that experienced assessors were more likely to compile and
integrate different pieces of information resulting in more comprehensive, overall
interpretations of trainee performance, whereas non-experienced assessors seemed to
pay more attention to specific and discrete behavioural aspects of performance. Similar
findings were reported by Kerrins and Cushing (2000), in their study on supervision of
teachers: inexperienced supervisors mostly provided literal descriptions of what they
had seen on videotape, whereas more experienced supervisors interpreted their
observations and made more evaluative judgments, combining various pieces of information into meaningful patterns of classroom teaching. These studies, however,
focused on processes underlying judgment of performance and did not look into
feedback information actually offered to trainees. An interesting question therefore is
whether the differences in cognitive processes underlying performance assessments as
described above impact the quality of feedback given to trainees following observation
and assessment of performance.

The present study


The present study examined the quality of feedback as offered by supervisor-assessors in professional training in a health care setting (residency training in general practice (GP)). For the purpose of the study, feedback quality was defined in terms of
characteristics that make feedback more informative and thus likely more useful and
effective (Hattie and Timperley, 2007; Shute, 2008; Van de Ridder et al., 2008). We
specifically aimed to address gaps in the literature regarding the quality of narrative
comments on assessment forms directly following observation and assessment of
performance. We examined the quality of the written feedback as well as feedback
verbally expressed to the trainee after assessment and documentation of performance
on an assessment and feedback form. We furthermore explored if and how the
differences in assessors’ information processing and expertise as demonstrated in the
studies of Govaerts et al. (2011) influenced the quality of feedback offered to trainees.
More specifically, we explored hypotheses that arose from the research findings of Govaerts et al. as described above. First, we hypothesised that the non-experienced
assessors would provide more specific feedback (i.e. feedback referring to specific
behaviours or specific behavioural strategies for improvement) compared to
experienced assessors, since the non-experienced assessors seemed to focus more on
specific observable behaviours in assessment of performance. Second, the experienced
assessors were expected to provide more general feedback as well as overall
judgments, reflecting clustering of observed behaviours and overall interpretations of
performance effectiveness. Third, we hypothesised that experienced assessors would
provide more comments explaining the reasons underlying (in-)effectiveness of
task-specific behaviours, by paying more attention to specific features of the task at
hand and by linking feedback on performance effectiveness to these task-specific cues.
In sum, the focus of our study is on the relationship between characteristics of information processing by experienced and non-experienced assessors during assessment of performance and the quality of the information these assessors offered as feedback to the trainee (Figure 1).

We used a quasi-experimental design and mixed methods approach to address our research questions. Two groups of supervisor-assessors with different levels of expertise (number of years being a supervisor-assessor in GP-training) watched two videotapes, each presenting a trainee performing an authentic professional task (a “real-life” patient encounter). After watching each videotape, assessors were asked to fill in a rating scale, to write down their feedback to the trainee, and to subsequently verbalize their feedback. Qualitatively, we used content analysis of feedback protocols to explore the content and quality of both written and verbal feedback. Quantitatively, we explored how supervisor-assessor expertise influences feedback quality.

Method

Participants
Participants in our study were GP-supervisors who were actively involved as supervisor-assessors in general practice residency training in the south of The Netherlands. GP-supervisors are all very experienced general practitioners, continuously involved in supervision of trainees on a day-to-day basis. General practice training in The Netherlands has a long tradition of systematic direct observation and assessment of
trainee performance throughout the training programme. For that purpose, video
equipment has been placed in GP practices, and trainees are expected to videotape their
patient consultations on a regular basis. Video recordings of patient consultations can
thus be used by supervisors to assess medical performance and to provide feedback to
trainees. All GP-supervisors need to attend workshops on supervision, assessment of
medical performance and provision of feedback for learning and performance
development.
In our study, we defined the level of assessor expertise as the number of years of task-relevant experience as a supervisor-assessor. Since there is no formal equivalent of elite assessor performance, we adopted a relative approach to expertise. This approach assumes that novices develop into experts through extensive task experience
and training (Chi, 2006; Norman et al., 2006). In general, about seven years of
continuous experience in a particular domain is necessary to achieve expert
performance (e.g. Arts et al., 2006). Registered GP-supervisors with different levels of
supervisor-assessor experience were invited to voluntarily participate in our study; a
total of 34 GP-supervisors participated. GP-supervisors with at least seven years of
experience as supervisor-assessor were defined as “experts”. The experienced group in
our study consisted of 18 GP-supervisors (number of years of GP experience M = 26.3, SD = 5.0; number of years of supervision experience M = 13.4, SD = 5.9); the “non-experienced” group consisted of 16 GP-supervisors (number of years of GP experience M = 12.9, SD = 5.0; number of years of supervision experience M = 2.6, SD = 1.2). Levels of supervisor-assessor experience differed significantly between the two groups (t(32) = 7.2, p < 0.001) and, as demonstrated by Govaerts et al. (2011), the two groups of supervisor-assessors represented groups with different levels of expertise in performance assessment, the more experienced participants showing more expert information processing while observing and judging performance.
Participants received financial compensation for their participation.
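The reported group difference in supervision experience can be checked from the summary statistics alone. The sketch below computes a pooled-variance two-sample t statistic, which is consistent with the reported df of 32 (the function name is ours, not the authors'):

```python
import math

def pooled_t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Two-sample t statistic with pooled variance, from group summaries."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two group variances.
    sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two group means.
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

# Years of supervision experience as reported above:
# experienced M = 13.4, SD = 5.9, n = 18; non-experienced M = 2.6, SD = 1.2, n = 16.
t, df = pooled_t_from_summary(13.4, 5.9, 18, 2.6, 1.2, 16)
print(f"t({df}) = {t:.1f}")  # t(32) = 7.2
```

Note that the two groups' standard deviations differ markedly (5.9 versus 1.2), so a Welch-corrected test would give fewer degrees of freedom; the pooled form is shown here only because it matches the statistic reported in the paper.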

Materials and procedure

Participants watched two videotaped patient encounters (VCs), each showing a final-year medical student in a “real-life” encounter with a patient. Participants had never met the medical students before. The VCs were selected purposively with respect to both patient problems and students’ performance. The patient problems (atopic eczema and angina pectoris) are common in general practice, which ensured that all participants were familiar with task-specific performance requirements. As indicated above, the video-assessment task represented an authentic assessment task for the GPs in our study.
For each VC, participants were asked to observe and assess the trainee’s
performance and to verbalize their judgments and thoughts while filling in a rating
form (Figure 2). They were subsequently asked to write down their feedback for the
trainee on the assessment form (in a box designated for provision of narrative
comments) and to verbalize their feedback. The think-aloud protocols and
verbalisations of feedback were audio-recorded and transcribed verbatim.

Data analysis
For the present study, all written and verbalized feedback protocols were first segmented into small, meaningful information units. Segments were identified on the basis of semantic features (i.e. content features, as opposed to non-content features such as syntax); each segment represented a single thought, idea or feedback statement (Chi, 1997). Deductive content analysis was used to determine the informational value of each of the feedback comments (Elo and Kyngäs, 2008). Each segment was assigned to one or more coding categories, using software for qualitative data analysis (Atlas.ti 6.2). Coding categories were based on theoretical models for feedback effectiveness as described before (Hattie and Timperley, 2007; Shute, 2008; Van de Ridder et al., 2008), and included feedback orientation (feedback aimed at person, process, task or self-regulation) as well as level of behavioural specificity (general feedback versus feedback directed at specific observable behaviours; general versus specific suggestions for improvement) and feedback topic (performance domains).

Figure 2. Assessment form: one-dimensional overall performance rating and text box for feedback

Definitions of coding categories and examples of feedback statements (verbal and written) are as follows:
(1) Feedback orientation:
.
Person – feedback oriented at the self (person) of feedback recipient
(e.g. personality traits).
.
Process – feedback oriented at processes underlying task performance;
generalizable to other, similar tasks.
.
Self-regulation – feedback oriented at fostering self-regulated learning.
.
Task – feedback oriented at the task at hand; does not generalize to other
tasks.
(2) Feedback topic: performance dimensions:
.
Performance in general.
.
Communication; doctor-patient relationship.
.
Handling biomedical aspects.
.
Structuring of consultation, time management.
(3) Level of specificity:
- Keyword – word or statement (written or mentioned) without any further explanation, clarification or valence; starting point for feedback dialogue (e.g. "communication skills").
- Verification – overall judgment of performance within main performance dimensions (e.g. excellent communication; good; satisfactory).
- General feedback on performance – feedback that needs further explanation to be meaningful or informative; error flagging (e.g. you didn't do enough to facilitate the patient's responses; you should have used a more open approach; patient management in acute coronary disease needs attention).
- Specific feedback on performance – feedback referring to specific instances of behaviours (e.g. you gave a summary at the end of the very first phase of the consultation, which was excellent; you used closed questions only).
- Providing explanation (why) – clarifying comment or explanation of why behaviours were (in)correct, or ineffective (e.g. you gave a summary at the end of the very first phase of the consultation, which was an excellent way to structure the consultation; "you acknowledged the patient's feelings by saying xxxxxx").
- Suggestion for improvement – general strategy (e.g. try to think of the doctor as medicine, even if you cannot prescribe any medication).
- Suggestion for improvement – specific behavioural strategy (e.g. try to ask more open-ended questions in the very first phase of the consultation).
- Goal setting and follow-up – suggesting concrete plans of action (e.g. you need to study the GP guidelines on acute chest pain and we will get together and discuss these guidelines next week).
(4) Feedback valence:
- Positive.
- Negative.
(5) Meta-remarks:
- Reflections on feedback process (e.g. "I think that he is an authoritarian doctor but obviously, that is not what I am going to tell him"; "Of course, you have to be careful about how to formulate your feedback, otherwise it will not be accepted").
- Reflections on assessment (e.g. "Overall, I think that he did not do so bad, for a final year medical student; it was a difficult case"; "I think that he will become a good physician, so I will not be too strict").
Initial coding was applied by two independent coders (MG; MvdW; members of the
research team), who met repeatedly to discuss coding discrepancies and to make
iterative refinements to the coding scheme, until the coding scheme was stable. Once
the coding scheme was agreed on, all protocols were coded by one researcher (MG).
Interpretative rigour was ensured through researcher triangulation throughout data
analysis: the research team met repeatedly to discuss any uncertainties, to cross check
coding strategies and interpretation of data (Barbour, 2001; Kitto et al., 2008). The final
coding scheme is presented in the list above; repetitions were coded as such.
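The four-dimensional coding scheme above can be sketched as a simple data structure. This is an illustrative reconstruction, not the authors' software (coding was done in Atlas-ti); the class, field names and example statements are hypothetical simplifications of the categories listed above.

```python
# Illustrative sketch (assumption, not the authors' code): each feedback
# statement receives one code per dimension, and codes are tallied per protocol.
from collections import Counter
from dataclasses import dataclass

@dataclass
class CodedStatement:
    text: str
    orientation: str   # person | process | self-regulation | task
    topic: str         # communication | biomedical | structure | general
    specificity: str   # keyword | verification | general | specific | why | ...
    valence: str       # positive | negative | neutral

def tally(protocol):
    """Count codes per dimension for one feedback protocol."""
    return {
        dim: Counter(getattr(s, dim) for s in protocol)
        for dim in ("orientation", "topic", "specificity", "valence")
    }

# Hypothetical protocol of two coded statements
protocol = [
    CodedStatement("communication skills", "task", "communication",
                   "keyword", "neutral"),
    CodedStatement("you used closed questions only", "task",
                   "communication", "specific", "negative"),
]
counts = tally(protocol)
```

Tallies of this kind per protocol are the raw counts that the quantitative analysis below operates on.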
After coding, data were exported from Atlas-ti to SPSS 20. For each participant, the
number of statements per coding category was converted to a percentage for each of the
written and verbal feedback protocols. Because of the small sample sizes and
non-normally distributed data, non-parametric tests (Mann-Whitney U) were used to
test for differences between the two groups of assessors (experienced and
non-experienced).
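This analysis step can be sketched as follows. It is a minimal illustration with hypothetical numbers, using a hand-rolled Mann-Whitney U statistic rather than the SPSS procedure the authors used.

```python
# Illustrative sketch (not the authors' code): converting per-assessor statement
# counts to percentages per coding category, then comparing two assessor groups
# with a Mann-Whitney U statistic. All numbers are hypothetical.

def to_percentages(counts):
    """Convert a dict of category -> statement count into percentages."""
    total = sum(counts.values())
    return {cat: 100.0 * n / total for cat, n in counts.items()}

def mann_whitney_u(sample_a, sample_b):
    """U statistic for two independent samples (mid-ranks used for ties)."""
    pooled = sorted(sample_a + sample_b)

    def midrank(v):
        # Mid-rank of value v in the pooled sample (ranks are 1-based)
        first = pooled.index(v) + 1
        count = pooled.count(v)
        return first + (count - 1) / 2.0

    rank_sum_a = sum(midrank(v) for v in sample_a)
    n_a, n_b = len(sample_a), len(sample_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    return min(u_a, n_a * n_b - u_a)

# Hypothetical percentages of "specific feedback" statements per assessor
experienced = [0.0, 5.0, 10.0, 0.0, 16.7]
non_experienced = [5.0, 7.7, 17.8, 0.0, 12.5]
u = mann_whitney_u(experienced, non_experienced)
```

The resulting U would then be compared against the null distribution for the given group sizes to obtain the p-values reported in the Results.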
Results
Table I provides a descriptive summary of the feedback reports. It shows that the GP
supervisor-assessors in our study provided on average four to five written feedback
statements per trainee. Protocol analysis of the verbal feedback showed that many
feedback statements were literal verbalisations of the feedback that had been written
on the form; however, new information or additional explanations were also expressed
verbally. Per trainee, on average two to four verbal feedback statements provided
information that was not presented in the written narrative comments. For
experienced assessors, 37 per cent of verbal feedback statements contained new
information, while for non-experienced assessors 51 per cent of verbal feedback
statements added to the information on the feedback form.
Table II presents percentages (median and interquartile range) of feedback
statements according to level of specificity, feedback orientation and feedback topic,
for each group of assessors and for written and verbal feedback protocols respectively.
Table II shows that, in general, assessors differ with respect to content and usefulness
of feedback messages. When looking at the median and interquartile range of feedback
statements, it can be noted that some feedback patterns seem to occur in a relatively
small number of assessors only; for instance, only 28 per cent of experienced assessors
and 31 per cent of non-experienced assessors offer specific suggestions for
improvement in written feedback (median = 0 per cent). Table II clearly shows that
the majority of feedback in writing, both from experienced and non-experienced
assessors, lacks behavioural specificity. On average, more than 55 per cent of
statements in written feedback protocols by experienced and non-experienced
assessors were formulated either as a keyword, a verification or a very general statement
requiring further clarification in order to become meaningful for the trainee. Especially
feedback phrased as a single keyword (as a reminder or starting point for the
feedback dialogue) lacks informative value. On average, 18 per cent of written feedback
statements by experienced assessors consisted of this type of feedback, and 40 per cent
of written feedback protocols from experienced assessors contained at least one
feedback statement phrased as a single keyword. Experienced assessors also
tended to limit their written feedback to overall judgments of performance
(verifications or overall evaluations), significantly more than non-experienced
assessors (p < 0.05); 47 per cent of written protocols from experienced assessors
contained at least one overall judgment, versus 7 per cent of protocols from
non-experienced assessors.
Table II also shows that, although the additional verbal feedback was partly aspecific in
nature, it contained many more statements reflecting higher levels of specificity compared
to feedback in writing. When providing verbal feedback, assessors in general were more
likely to provide explanatory comments on feedback (i.e. to clarify why performance was
(in)effective), as well as specific suggestions for improvement. Near-significant
differences between experienced and non-experienced assessors were found for the
provision of explanatory comments in verbal feedback (p = 0.06). Non-experienced
assessors seemed to pay more attention to elaborating on their feedback by clarifying to the
trainee how specific behaviours impact on task performance.
Negative feedback concerns the fact that [you] do not acknowledge the patients’ feelings, that
[you] do not pay attention to the real reasons underlying the patient’s visit to the office. As a
consequence, the patient does not feel understood, does not feel that you really understand his
problem (explanatory comment, verbal feedback P4; Experienced).
Table I. Sample descriptive statistics: number of assessors providing written and verbal feedback, and number of statements in written and verbal feedback protocols

                                                     Experienced      Non-experienced
                                                     assessors (a)    assessors (b)
Assessors who provided narrative feedback, n (%)     17 (94)          14 (88)
Assessors who gave verbal feedback, n (%)            18 (100)         16 (100)
Statements in narrative feedback, mean (SD)          9.6 (3.7)        8.4 (3.1)
Statements in verbal feedback, including
  repetitions of feedback in writing, mean (SD)      12.2 (5.1)       13.1 (5.6)
Statements in verbal feedback presenting new
  information, not previously given in written
  feedback, mean (SD)                                4.5 (3.5)        6.7 (4.8)

Notes: (a) N = 18; (b) N = 16
Table II. Percentages of feedback statements for experienced and non-experienced assessors, for written and verbal feedback, according to level of specificity, feedback orientation and feedback topic

                                 Experienced                                             Non-experienced
                                 Written FB        Verbal FB         Verbal FB NEW (a)   Written FB        Verbal FB         Verbal FB NEW (a)
Level of specificity
Keyword                          0.0 (0.0-31.2)    0.0 (0.0-0.0)     0.0 (0.0-0.0)       0.0 (0.0-18.1)    0.0 (0.0-0.0)     0.0 (0.0-0.0)
Verification/overall judgement   0.0 (0.0-10.4)    0.0 (0.0-0.0)     0.0 (0.0-0.0)       0.0 (0.0-0.0)     0.0 (0.0-0.0)     0.0 (0.0-0.0)
General feedback                 40.0 (20.8-60.0)  8.0 (3.1-14.3)    38.9 (15.7-50.0)    41.4 (25.9-64.6)  11.1 (5.9-21.5)   27.6 (14.9-33.3)
Specific feedback                0.0 (0.0-16.7)    7.7 (0.0-13.8)    23.6 (2.8-29.6)     5.0 (0.0-17.8)    7.7 (0.0-18.6)    18.3 (0.0-31.3)
Why – explanation/
  clarifying comment             0.0 (0.0-0.0)     4.4 (0.0-9.0)     9.2 (0.0-31.3)      0.0 (0.0-6.8)     8.4 (6.0-18.6)    17.1 (7.2-33.3)
Suggestions for improvement,
  general                        0.0 (0.0-11.4)    0.0 (0.0-0.0)     0.0 (0.0-0.0)       4.5 (0.0-23.8)    0.0 (0.0-2.7)     0.0 (0.0-5.0)
Suggestion for improvement,
  specific                       0.0 (0.0-10.6)    0.0 (0.0-2.8)     0.0 (0.0-10.7)      0.0 (0.0-15.7)    2.8 (0.0-10.0)    4.2 (0.0-21.7)
Goal setting and follow-up       0.0 (0.0-0.0)     0.0 (0.0-0.0)     0.0 (0.0-0.0)       0.0 (0.0-0.0)     0.0 (0.0-0.0)     0.0 (0.0-0.0)

Orientation
Task                             68.3 (37.5-86.7)  16.0 (12.2-33.3)  68.3 (50.0-100.0)   47.2 (23.2-74.4)  22.6 (7.4-45.7)   54.4 (33.3-78.8)
Process                          21.3 (0.0-40.0)   0.0 (0.0-7.9)     0.0 (0.0-27.7)      36.5 (2.1-61.2)   15.5 (2.3-23.5)   26.8 (5.0-66.7)
Self-regulation                  0.0 (0.0-0.0)     0.0 (0.0-6.8)     0.0 (0.0-22.3)      0.0 (0.0-0.0)     0.0 (0.0-0.0)     0.0 (0.0-0.0)
Person                           0.0 (0.0-10.4)    0.0 (0.0-5.0)     0.0 (0.0-8.0)       0.0 (0.0-0.0)     0.0 (0.0-0.0)     0.0 (0.0-0.0)

Topic
Communication                    63.1 (50.0-71.8)  17.0 (8.2-19.8)   58.3 (45.8-82.5)    73.2 (44.3-82.5)  29.4 (15.3-45.7)  66.7 (40.0-89.6)
Handling biomedical aspects      23.6 (9.4-40.0)   7.4 (0.0-15.4)    18.3 (0.0-43.3)     12.5 (0.0-25.0)   4.6 (0.0-16.9)    7.5 (0.0-29.9)
Structure                        0.0 (0.0-0.0)     0.0 (0.0-0.0)     0.0 (0.0-0.0)       0.0 (0.0-13.2)    0.0 (0.0-9.8)     0.0 (0.0-19.4)
Performance general              2.9 (0.0-9.9)     0.0 (0.0-0.0)     0.0 (0.0-31.3)      0.0 (0.0-11.9)    0.0 (0.0-4.1)     0.0 (0.0-28.6)

Valence
Positive                         21.6 (0.0-31.6)   0.0 (0.0-5.2)     0.0 (0.0-25.3)      0.0 (0.0-15.6)    2.9 (0.0-13.2)    5.0 (0.0-29.2)
Negative                         29.3 (27.8-55.6)  9.7 (0.0-27.0)    50.0 (0.0-58.0)     30.4 (2.8-65.9)   11.1 (6.0-26.4)   31.4 (15.7-41.3)

Notes: (a) Verbal FB NEW = statements concerning new information (not previously given in written feedback). Data are non-normally distributed; presented are median and interquartile range (between brackets). Percentages do not add up to 100%, because the table presents medians and not all coding categories are included in the table
If you had paid more attention to this patient's concerns in the beginning of the consultation,
the patient wouldn't have started the dispute [about the management plan] at the end
(explanatory comment, verbal feedback P15; Non-experienced).
With a patient such as this, always ask the patient what it means to have a skin disease like
this, what the impact is on daily, social life (specific suggestion for improvement, verbal
feedback P13; Non-experienced).
Sometimes you have to be very direct and just ask the patient "Do you think that you have
got cancer, or do you think that you had a heart attack?" (specific suggestion for
improvement, verbal feedback P17; Non-experienced).
The majority of feedback provided by both experienced and non-experienced assessors
is directed at the level of the task (Table II), both in written and verbal feedback
protocols. This indicates that the majority of feedback statements aim at instruction
and improving task performance by pointing out task-specific aspects of performance.
In patients with acute chest pain, you have to also consider other diagnoses, not just
myocardial infarction (feedback at task level, written feedback P2; Experienced).
When formulating the patient management plan for this patient, you have to consider factors
causing stress in this patient (feedback at task level, written feedback P12; Non-experienced).
Assessors paid less attention to feedback aiming at transfer of knowledge to similar
tasks (process-oriented feedback), or feedback aiming at fostering self-regulated
learning (Table II). Analysis of verbal protocols, however, showed significant
differences between assessor groups with respect to process-oriented feedback. In their
verbal feedback, non-experienced assessors paid significantly more attention to
process-oriented feedback compared to experienced assessors (p = 0.003).
[You] need to work on a more methodical approach in patient consultation. You need to look
again at everything that you learned concerning doctor-patient communication, summarizing
information, showing involvement and empathy... there is lots of room for improvement here
(feedback at process level, verbal feedback P28; Experienced).
In the beginning of patient consultations, try to take a more open approach. What does the
patient have to tell you; what is going on? (feedback at process level, verbal feedback P9;
Non-experienced).
Although person-oriented feedback (i.e. feedback directed at the "self") occurred quite
frequently both in written and verbal feedback protocols, this type of feedback
represented a very small percentage of total feedback statements in feedback protocols
in both assessor groups (Table II). Experienced assessors provided person-oriented
feedback in 41 per cent of written feedback and 28 per cent of verbal feedback
protocols; non-experienced assessors provided feedback oriented at the person of the
trainee in 21 per cent and 13 per cent of written and verbal feedback protocols
respectively.
[. . .]excellent physician! (person-oriented feedback, written feedback P10; Experienced).
[. . .][you] ask questions like a robot (person-oriented feedback, written feedback P4;
Experienced).
With respect to feedback topic, results show that assessors paid more attention to
communication and the doctor-patient relationship than to handling biomedical aspects of
the patient encounter, both in written and verbal feedback protocols. This may reflect
performance theories of supervisor-assessors in general practice. Compared to
non-experienced assessors, experienced assessors seemed to pay more attention to
feedback on biomedical aspects of task performance (difference near-significant;
p = 0.08), suggesting a more balanced feedback pattern with respect to behaviours
relevant for effective task performance (Table II).
Both written and verbal feedback contained many feedback statements carrying
negative valence, and most assessors included negative feedback phrases in their
feedback reports (88 per cent of experienced assessors; 86 per cent of non-experienced
assessors). Near-significant differences between assessor groups were found with
respect to positive feedback remarks in written protocols, with experienced assessors
providing more positive feedback than non-experienced assessors (p = 0.07;
Table II).
[You] do not show any empathy (negative valence feedback, written feedback P24;
Non-experienced).
[. . .][you] listen attentively (positive valence feedback, written feedback P5; Experienced).
Finally, some protocols of verbal feedback included reflective remarks either on the
feedback process or on the judgment and decision making process prior to giving
feedback. In general, these reflections indicated that feedback which is actually given
to the trainee may differ substantially from the judgments and observations that are
made during the assessment process. In other words: the public judgment (feedback in
writing or verbal feedback to trainee) may differ from the private judgement (judgment
resulting from assessment of performance according to private performance theories
and standards).
Well, I just evaluated his performance, but I will phrase my feedback in positive terms
(reflection on feedback P31; Experienced).
You have to start with positive comments, otherwise he will not listen any longer (reflection
on feedback, P17; Non-experienced).
Providing feedback in writing is always difficult, because you have to have a broader focus. I
just wrote down a few comments, but when discussing the consultation [with the trainee]
I would take a different approach (reflection on feedback, P12; Non-experienced).
Physical examination is incomplete. [He] didn’t do cardiac auscultation, percussion was also
missing, but I do not consider this to be very important (reflection on assessment, P6;
Experienced).
Discussion and conclusion
The present study investigated quality of feedback offered by assessors following
direct observation and assessment of performance in professional training. Using
theoretical frameworks of feedback effectiveness (Hattie and Timperley, 2007; Shute,
2008; Van de Ridder et al., 2008), we aimed to identify and compare quality of written
and verbal feedback offered by trained supervisor-assessors in general practice
residency training. To our knowledge, this is one of the first attempts to identify
quality of narrative comments on performance assessment forms immediately
following observation of performance. The study furthermore examined whether
quality of feedback was influenced by different levels of assessor expertise.
In HRM as well as in professional training, the importance of high-quality feedback
in writing, both for professional development and for administrative decision making,
has been acknowledged (Brutus, 2010; Dudek et al., 2005; Govaerts et al., 2005). Results
from our study indicate that assessors in general differed with respect to content and
usefulness of feedback messages. A more important and alarming finding from our
study, however, is the fact that most of the feedback lacked behavioural specificity,
even though all participants in our study were trained in performance assessment
and provision of performance feedback. Feedback in general, but especially the written
feedback, lacked information which could help trainees to identify specific behavioural
strengths and weaknesses, and specific suggestions for improvement were limited.
Only part of the feedback was elaborated in the verbal feedback. As a consequence,
feedback may not be very useful, and thus have only limited impact on learning and
professional development.
Findings with respect to this (lack of) feedback specificity contrast with findings
from the previous studies on think-aloud protocols during actual assessment of
performance by Govaerts et al. (2011, 2012). Think-aloud protocols of performance
assessments showed that the assessors observed, selected and interpreted a broad
range of task-specific behaviours to arrive at judgments on performance. The feedback
protocols in our study do not reflect the rich and specific behavioural information that
the assessors used when assessing performance. Even though assessors had been able
to identify effective and ineffective task-specific behaviours during assessment of
performance, feedback following performance assessment most often lacked specific
information on what was observed, why task-specific behaviours were effective or not,
and how to improve performance. In other words: high quality assessment in terms of
identification of effective and ineffective task behaviour does not automatically imply
high quality feedback.
Our findings also suggest that elaborated feedback occurs more often in verbal
feedback than in narrative comments, i.e. feedback in writing. Assessors used verbal
feedback to provide additional clarifying comments elaborating on the how and why of
feedback, capturing context-specific aspects of performance. Narrative comments,
therefore, appear to be less useful for learning and professional development than
verbal feedback. The lack of informational value in written feedback may be a cause of
concern, since findings from feedback research indicate that delivery of feedback in
writing is to be preferred over oral feedback delivery, as written feedback would be
provided in a more neutral manner (Kluger and DeNisi, 1996; Shute, 2008). Our
findings with respect to narrative comments can be explained by the fact that
production of meaningful and effective narrative comments places high demands on
assessors' feedback skills; production of narrative comments takes more time and
requires more cognitive effort, challenging assessors to put their thoughts into words,
and to provide performance information without the directive prompts of items on a
rating scale (Brutus, 2010).
Our findings do not seem to support our hypotheses with respect to relationships
between assessor expertise, cognitive information processing in assessment of
performance and quality of feedback information. Contrary to our expectations,
non-experienced assessors do not provide more specific (behavioural) feedback on task
performance nor do experienced assessors provide more elaborative feedback in terms
of explanatory comments which link performance feedback to situation-specific
features of the task at hand. Although experienced assessors do seem to provide more
overall judgements (verifications) in written feedback, and seem to have a more
balanced feedback pattern (i.e. paying more attention to all relevant performance
domains, including positive as well as negative comments in written feedback),
between-group differences are small. Our results may be explained by the experimental
setting of our study, which prevented assessors from having an actual feedback
dialogue with their trainees. This may have impacted on the quality (in terms of
richness, specificity and meaningfulness) of especially the verbal feedback. Another and
perhaps more likely explanation might be that feedback given by participants in our
study is in line with feedback literature which recommends that feedback should focus
on a limited number of behaviours that can be improved, whereas assessment and
judgment on performance is based on a more comprehensive set of behavioural
observations reflecting overall performance (Govaerts et al., 2012). Furthermore,
processes underlying assessment and feedback on performance – although
intrinsically related, may call on different skills, competencies and information
processing. This is supported by research findings in industrial and organisational
psychology which suggest that assessors’ cognitive processing is influenced by
assessment purposes. Assessors’ information processing has been found to be
impaired when the purpose of the final ratings differed from the purpose assessors had
in mind when observing trainees’ performance (DeNisi and Williams, 1988; Murphy
et al., 2004). For feedback to be meaningful, assessors do not only need skills for direct
observation and evaluation of performance, they must also be able to translate their
impressions, interpretations and personal (evaluative) judgments into high quality
information that motivates and modifies subsequent learning. In other words:
assessors need to shift in focus from assessment for summative purposes (judgment on
performance effectiveness or achievement) to formative assessment (i.e. assessment for
improvement of performance, or assessment for learning). The integration of
summative and formative (developmental) purposes in assessment of performance in
our study may therefore have had a negative impact on quality of feedback provided
by our participants. Finally, analysis of feedback protocols in our study confirmed
findings from previous studies which indicate that public ratings or feedback often
differ from assessors’ private judgments (Murphy and Cleveland, 1995).
A remarkable finding from our study is the large number of statements carrying
negative valence in feedback protocols. This contrasts with findings from previous
studies in medicine and other professional domains. In a study on narrative comments
in multisource feedback Canavan et al. (2010), for instance, found that only 10 per cent
of feedback reports contained negative feedback phrases, many of which were directed
at the person of the feedback recipient. Also in industrial and organisational
psychology many studies indicate that feedback in writing tends to have positive
valence, especially if feedback is used for administrative purposes and/or if feedback
providers are held accountable to feedback recipients (Murphy and Cleveland, 1995).
Findings in our study may be explained by the setting of our experiment, in which
feedback had no (negative) consequences for the trainee or the feedback provider.
Strengths and limitations
A strength of our study is that we could base assessor expertise on actual
measures of assessor performance showing that the more experienced
supervisors-assessors indeed formed higher-quality representations of trainee task
performance in their assessments than the less experienced participants (Govaerts
et al., 2011). However, the results of Govaerts et al. (2012) on assessor performance also
showed that assessor idiosyncrasy in both expertise groups was substantial. A gold
standard would be needed to show that the experts in our study indeed showed
reproducible superior performance (Ericsson, 2009).
There are several limitations to our study. Our study is limited to a specific setting
(professional training in health care) and a specific assessment context. We feel,
however, that our findings with respect to feedback are likely to reflect quality of
feedback in performance management systems. Feedback which is given infrequently,
targeting past performance over longer periods of time, is likely to be even less
specific than the feedback in our study, which was provided immediately following
direct observation and assessment of task performance.
Another limitation to our study is the fact that assessors were not able to have an
actual feedback dialogue with the trainee after watching a video-tape presenting
authentic trainee performance. This was a result of conducting our study using
authentic assessment tasks in an experimental setting, in order to be able to assess the
impact of different levels of experience on performance assessment and feedback. The
experimental setting may furthermore have had consequences for the commitment of our
participants to the feedback task, and thus quality of feedback – especially verbal
feedback. Reflective remarks in feedback protocols, however, indicated that many
assessors were concerned about how to motivate and improve trainee performance –
as in a real-life supervisor-trainee relationship. Finally, the sample size of our study is
small, although the sample used is not uncommon in qualitative research of this type.
Implications for practice and research
Findings from our study indicate that high-quality feedback in performance
management and professional training is not self-evident, even if supervisors are
trained, and even if feedback is provided immediately following observation and
assessment of performance. These findings have several implications for practice as
well as future research. High-quality feedback in performance management systems
calls for supervisor-assessor training which focuses not only on face-to-face feedback
but also on how to write narrative comments that are motivating and meaningful in
guidance of competence development, as well as useful in decision making about
competence achievement. Obviously, there is a limit to what formal training can
achieve. With respect to training of feedback providers, our results seem to indicate
that brief, one-off training sessions will not do if we aim at excellence in performance
management systems. Development of assessors’ feedback giving skills may require
long-term support, coaching and feedback on-the-job. Quality assurance of
performance management systems should include evaluation of feedback quality:
feedback on feedback is needed to ensure that performance feedback is informative and
meaningful, in order to promote use of feedback for learning and professional
development. Our findings also seem to confirm previous research indicating the need
for feedback dialogue in which the meaning of the feedback can be clarified. Trainees
and learners may need help and guidance in interpreting (aspecific) feedback for
performance improvement, through discussing feedback with supervisors or coaches
(e.g. Luthans and Peterson, 2003). Compared to feedback in writing, verbal feedback
requires less effort, is more specific and elaborate and thus seems to be more useful for
learning and performance development. Increasing pressures for accountability,
however, result in increasing emphasis on documentation of performance and high
quality information in narrative comments (Brutus, 2010; Dudek et al., 2005).
Therefore, accurate documentation of feedback has to be supported by well-designed
and user-friendly tools and instruments which elicit informative narrative comments,
as well as procedures that promote use of feedback for performance improvement and
trustworthiness in decision making (van der Vleuten et al., 2010).
Further research should examine whether our findings can be reproduced in other
settings and larger sample sizes. Although we were not able to establish a relationship
between assessors’ expertise and quality of feedback, effects of assessor
characteristics – including expertise – on feedback and performance assessment
remain important areas for study. Our findings also call for research into the
relationship between features of the assessment system such as assessment purposes
and feedback quality. Is it possible to combine summative and formative assessment
purposes in performance management systems – as advocated by Aguinis (2009)?
How should we design performance management systems to overcome assessors’
limitations in cognitive processing precluding integration of different assessment
purposes in performance management? What are implications for assessor training?
Research questions may also concern design of assessment instruments: how can we
facilitate assessors to provide comments that are meaningful and useful, to support
learning as well as credible decision making? Finally, future research should
investigate trainees’ perceptions of feedback quality, delivered both in writing and
verbally. Feedback reactions are the immediate predecessors of performance
improvement. Research into determinants underlying acceptance and use of
narrative comments should therefore have top priority.
The paper highlights feedback and feedback providers as a key in performance
management and assessment for learning. It identifies the need for quality assurance of
performance management systems (or: assessment programmes in professional
training) including continuous evaluation of feedback processes and coaching
“on-the-job” of feedback providers.
References
Aguinis, H. (2009), “An expanded view of performance management”, in Smither, J.W. and
London, M. (Eds), Performance Management: Putting Research into Action, Jossey-Bass,
San Francisco CA, pp. 1-43.
Aguinis, H. and Pierce, C.A. (2008), “Enhancing the relevance of organizational behavior by
embracing performance management research”, Journal of Organizational Behavior,
Vol. 29 No. 1, pp. 139-45.
Alvero, A.M., Bucklin, B.R. and Austin, J. (2001), “An objective review of the effectiveness and
essential characteristics of performance feedback in organizational settings”, Journal of
Organizational Behavior Management, Vol. 21 No. 1, pp. 3-29.
Arts, J.A.R.M., Gijselaers, W.H. and Boshuizen, H.P.A. (2006), “Understanding managerial
problem-solving, knowledge use and information processing: investigating stages from
school to the workplace”, Contemporary Educational Psychology, Vol. 31 No. 4, pp. 387-410.
Barbour, R.S. (2001), “Checklists for improving rigour in qualitative research. A case of the tail
wagging the dog?”, British Medical Journal, Vol. 322, pp. 1115-7.
Black, P. and Wiliam, D. (1998), “Assessment and classroom learning”, Assessment in Education:
Principles, Policy & Practice, Vol. 5 No. 1, pp. 7-74.
Boud, D. and Falchikov, N. (2007), Rethinking Assessment in Higher Education. Learning for the
Longer Term, Routledge, New York, NY.
Bransford, J.D. and Schwartz, D.L. (2009), “It takes expertise to make expertise: some thoughts
about why and how and reflections on the themes in chapters 15-18”, in Ericsson, K.A.
(Ed.), Development of Professional Expertise: Toward Measurement of Expert Performance
and Design of Optimal Learning Environments, Cambridge University Press, New York,
NY, pp. 432-48.
Brutus, S. (2010), “Words versus numbers: a theoretical exploration of giving and receiving
narrative comments in performance appraisal”, Human Resource Management Review,
Vol. 20 No. 2, pp. 144-57.
Canavan, C., Holtman, M.C., Richmond, M. and Katsufrakis, P.J. (2010), “The quality of written
comments on professional behaviors in a developmental multisource feedback program”,
Academic Medicine, Vol. 85 No. 10, pp. S106-9.
Chi, M. (1997), “Quantifying qualitative analyses of verbal data: a practical guide”, The Journal of
the Learning Sciences, Vol. 6 No. 3, pp. 271-315.
Chi, M.T.H. (2006), “Two approaches to the study of experts’ characteristics”, in Ericsson, K.A.,
Charness, N., Feltovich, P.J. and Hoffman, R.R. (Eds), The Cambridge Handbook of
Expertise and Expert Performance, Cambridge University Press, Cambridge, pp. 21-30.
DeNisi, A.S. (1996), Cognitive Approach to Performance Appraisal: A Program of Research,
Routledge, New York, NY.
DeNisi, A.S. and Williams, K.J. (1988), “Cognitive approaches to performance appraisal”, in
Ferris, G. and Rowland, K. (Eds), Research in Personnel and Human Resource
Management, Vol. 6, JAI Press, Greenwich, CT.
Dudek, N.L., Marks, M.B. and Regehr, G. (2005), “Failure to fail: the perspectives of clinical
supervisors”, Academic Medicine, Vol. 80 No. 10, suppl., pp. S84-7.
Durning, S.J., Artino, A.R., Boulet, J.R., Dorrance, K., Van der Vleuten, C.P.M. and Schuwirth,
L.W.T. (2012), “The impact of selected contextual factors on experts’ clinical reasoning
performance (does context impact clinical reasoning performance in experts?)”, Advances
in Health Sciences Education, Vol. 17 No. 1, pp. 65-79.
Elo, S. and Kyngäs, H. (2008), “The qualitative content analysis process”, Journal of Advanced Nursing, Vol. 62 No. 1, pp. 107-15.
Ericsson, K.A. (Ed.) (2009), Development of Professional Expertise: Toward Measurement of Expert Performance and Design of Optimal Learning Environments, Cambridge University Press, New York, NY.
Feldman, J.M. (1981), “Beyond attribution theory: cognitive processes in performance appraisal”,
Journal of Applied Psychology, Vol. 66 No. 2, pp. 127-48.
Ferris, G.R., Munyon, T.P., Basik, K. and Buckley, M.R. (2008), “The performance evaluation
context: social, emotional, cognitive, political and relationship components”, Human
Resource Management Review, Vol. 18 No. 3, pp. 146-63.
Finney, T.G. (2010), “Performance appraisal comments: the practitioner’s dilemma”, The Coastal
Business Journal, Vol. 9 No. 1, pp. 60-9.
Forgas, J.P. (2002), “Feeling and doing: influences on interpersonal behaviour”, Psychological
Inquiry, Vol. 13 No. 1, pp. 1-28.
Forgas, J.P. and George, J.M. (2001), “Affective influences on judgments and behavior in
organizations: an information processing perspective”, Organizational Behavior and
Human Decision Processes, Vol. 86 No. 1, pp. 3-34.
Govaerts, M., Van der Vleuten, C. and Schuwirth, L. (2005), “The use of observational diaries in in-training evaluations: student perceptions”, Advances in Health Sciences Education, Vol. 10 No. 3, pp. 171-88.
Govaerts, M.J.B., Schuwirth, L.W.T., van der Vleuten, C.P.M. and Muijtjens, A.M.M. (2011),
“Workplace-based assessments: effects of rater expertise”, Advances in Health Sciences
Education, Vol. 16 No. 2, pp. 151-65.
Govaerts, M.J.B., Van de Wiel, M.W.J., Schuwirth, L.W.T., Van der Vleuten, C.P.M. and Muijtjens, A.M.M. (2012), “Raters’ performance theories and constructs in workplace-based assessments”, Advances in Health Sciences Education, May, DOI: 10.1007/s10459-012-9376-x.
Hamlin, B. and Stewart, J. (2011), “What is HRD? A definitional review and synthesis of the HRD domain”, Journal of European Industrial Training, Vol. 35 No. 3, pp. 199-220.
Hattie, J. and Timperley, H. (2007), “The power of feedback”, Review of Educational Research,
Vol. 77 No. 1, pp. 81-112.
Holmboe, E.S., Sherbino, J., Long, D.M., Swing, S.R. and Frank, J.R. (2010), “The role of
assessment in competency-based medical education”, Medical Teacher, Vol. 32 No. 8,
pp. 676-82.
Hounsell, D. (2007), “Towards more sustainable feedback to students”, in Boud, D. and
Falchikov, N. (Eds), Rethinking Assessment in Higher Education. Learning for the Longer
Term, Routledge, New York, NY, pp. 101-13.
Kerrins, J.A. and Cushing, K.S. (2000), “Taking a second look: expert and novice differences when
observing the same classroom teaching segment a second time”, Journal of Personnel
Evaluation in Education, Vol. 14 No. 1, pp. 5-24.
Kitto, S.C., Chesters, J. and Grbich, C. (2008), “Quality in qualitative research”, Medical Journal of
Australia, Vol. 188 No. 4, pp. 243-6.
Kluger, A.N. and DeNisi, A. (1996), “The effects of feedback interventions on performance: a
historical review, a meta-analysis, and a preliminary feedback intervention theory”,
Psychological Bulletin, Vol. 119 No. 2, pp. 254-84.
Latham, G., Almost, J., Mann, S. and Moore, C. (2005), “New developments in performance
management”, Organizational Dynamics, Vol. 34 No. 1, pp. 77-87.
Lievens, F. (2001), “Assessor training strategies and their effects on accuracy, interrater
reliability, and discriminant validity”, Journal of Applied Psychology, Vol. 86 No. 2,
pp. 255-64.
Luthans, F. and Peterson, S.J. (2003), “360-degree feedback with systematic coaching: empirical
analysis suggests a winning combination”, Human Resource Management, Vol. 42 No. 3,
pp. 243-56.
Murphy, K.R. and Cleveland, J.N. (1995), Understanding Performance Appraisal. Social,
Organizational and Goal-based Perspectives, Sage Publications, Thousand Oaks, CA.
Murphy, K.R., Cleveland, J.N., Skattebo, A.L. and Kinney, T.B. (2004), “Raters who pursue
different goals give different ratings”, Journal of Applied Psychology, Vol. 89 No. 1,
pp. 158-64.
Norman, G., Eva, K., Brooks, L. and Hamstra, S. (2006), “Expertise in medicine and surgery”, in
Ericsson, K.A., Charness, N., Feltovich, P.J. and Hoffman, R.R. (Eds), The Cambridge
Handbook of Expertise and Expert Performance, Cambridge University Press, Cambridge,
pp. 339-54.
Overeem, K., Lombarts, M., Arah, O., Klazinga, N., Grol, R. and Wollersheim, H. (2010), “Three
methods of multi-source feedback compared: a plea for narrative comments and
coworkers’ perspectives”, Medical Teacher, Vol. 32 No. 2, pp. 141-7.
Ramaprasad, A. (1983), “On the definition of feedback”, Behavioral Science, Vol. 28 No. 1, pp. 4-13.
Salas, E. and Rosen, M.A. (2010), “Experts at work: principles for developing expertise in
organizations”, in Kozlowski, S.W.J. and Salas, E. (Eds), Learning, Training, and
Development in Organizations, Routledge/Taylor & Francis, New York, NY, pp. 99-134.
Shute, V.J. (2008), “Focus on formative feedback”, Review of Educational Research, Vol. 78 No. 1,
pp. 153-89.
Smither, J.W. and Walker, A.G. (2004), “Are the characteristics of narrative comments related to
improvement in multirater feedback over time?”, Journal of Applied Psychology, Vol. 89
No. 3, pp. 575-81.
Smither, J.W., London, M. and Reilly, R.R. (2005), “Does performance improve following multisource feedback? A theoretical model, meta-analysis and review of empirical findings”, Personnel Psychology, Vol. 58 No. 1, pp. 33-66.
Van de Ridder, M., Stokking, K., McGaghie, W. and Ten Cate, O. (2008), “What is feedback in clinical education?”, Medical Education, Vol. 42 No. 2, pp. 189-97.
Van de Wiel, M.W.J., Van den Bossche, P. and Koopmans, R.P. (2011), “Deliberate practice, the high road to expertise: K.A. Ericsson”, in Dochy, F., Gijbels, D., Segers, M. and Van den Bossche, P. (Eds), Theories of Learning for the Workplace: Building Blocks for Training and Professional Development Programs, Routledge, London, pp. 1-16.
van der Vleuten, C.P.M., Schuwirth, L.W.T., Scheele, F., Driessen, E.W. and Hodges, B. (2010), “The assessment of professional competence: building blocks for theory development”, Best Practice & Research Clinical Obstetrics and Gynaecology, Vol. 24 No. 6, pp. 703-19.
Wiliam, D. (2007), “Keeping learning on track: classroom assessment and the regulation of learning”, in Lester, F.K. Jr (Ed.), Second Handbook of Mathematics Teaching and Learning, Information Age Publishing, Greenwich, CT, pp. 1053-98.

About the authors
Dr Marjan J.B. Govaerts is an Assistant Professor of Medical Education at Maastricht University. She earned her medical degree and her PhD on assessment of performance in real life contexts, cum laude, from Maastricht University. Her primary teaching responsibilities are in medical education. She is currently the chair of the taskforce on student assessment, involved in design and quality assurance of assessment programmes, both in undergraduate and postgraduate medical education. Her research interests are in competency-based education and assessment, and more specifically in work-based assessment, in-training assessment programmes and expertise development in professional education. Marjan J.B. Govaerts is the corresponding author and can be contacted at: marjan.govaerts@maastrichtuniversity.nl
Dr Margje W.J. van de Wiel is Assistant Professor and Researcher at the Faculty of Psychology and Neuroscience at Maastricht University. She teaches courses on Learning and Human Resources and is responsible for the staff development programme at her faculty. Her research focuses on professional learning, workplace learning, professional decision making, expertise development, and knowledge sharing. She received her PhD at Maastricht University for her dissertation on medical expertise development in 1997. She obtained a Master’s degree in Cognitive Psychology at the Radboud University Nijmegen in 1991.
Prof. Dr Cees P.M. van der Vleuten came to the University of Maastricht in 1982. He was
appointed as a Professor of Education in 1996 at the Faculty of Health, Medicine and Life
Sciences and Chair of the Department of Educational Development and Research. In 2005 he was
appointed as the Scientific Director of the School of Health Professions Education
(www.maastrichtuniversity.nl/she). His area of expertise lies in evaluation and assessment. He
has published widely on these topics, holds numerous academic awards for his work, including
several career awards. He has frequently served as a consultant internationally. He is a mentor
for many researchers in medical education and has supervised more than 50 doctoral graduate
students in the past. In 2010 he received a Dutch royal decoration for the societal impact of his work. A full curriculum vitae can be found at: www.fdg.unimaas.nl/educ/cees/CV/