Assessment & Evaluation in Higher Education

ISSN: 0260-2938 (Print) 1469-297X (Online) Journal homepage: http://www.tandfonline.com/loi/caeh20

The impact of stress in self‐ and peer assessment

Nigel K. Ll. Pope

To cite this article: Nigel K. Ll. Pope (2005) The impact of stress in self‐ and peer
assessment, Assessment & Evaluation in Higher Education, 30:1, 51-63, DOI:
10.1080/0260293042003243896

To link to this article: http://dx.doi.org/10.1080/0260293042003243896

Published online: 14 Sep 2010.

Assessment & Evaluation in Higher Education
Vol. 30, No. 1, February 2005, pp. 51–63

The impact of stress in self- and peer assessment
Nigel K. Ll. Pope*

Griffith Business School, Queensland, Australia

While a large amount of interest has been shown in the use of peer and self-assessment, few studies
have considered the effect of stress on the students involved. None have considered whether the
resultant stress itself might account for any noticeable improvements in student performance. The
research presented in this paper addresses this question. An experimental design measured the
effects of type of assessment and gender on student stress levels and performance. Results suggest
that females are more stressed by self-assessment than males and that being subjected to self- and
peer assessment, while more stressful, leads to improved student performance in summative tasks.

It is generally accepted that in order to properly assess a student's knowledge, an
educator needs access to, and the use of, a variety of assessment methods (Fry,
1990; Orsmond et al., 2000). As a result of this, a great deal of interest has developed
in the areas of both self- and peer assessment of student learning. In addition to the
pedagogical advantages of these methods, there is also a managerial aspect. It has
been noted by several researchers that the structure of university education is such as
to create a bias toward summative, as opposed to formative, assessment. This in
turn requires faculty-based assessment that increases marking loads for tutors and
increases staff costs (see, for example, Gibbs, 1992; McDowell, 1996; Fallows &
Chandramohan, 2001). The ability to reduce this faculty dependence and provide an
effective means of assessment is attractive to higher education institutions in terms of
increased efficiencies and reduced costs (Hanrahan & Isaacs, 2001). It is therefore in
the interests of universities to consider the introduction of peer and self-assessment if
only on a purely financial basis. Within a summative environment, however, adoption
of such methods needs to be tempered by two important riders. First, the assessment
method must be as accurate as the more traditional, faculty-based methods, and
second, the practice must not be prejudicial to the students themselves.
This paper reports the results of an experimental study of the use of both self- and
peer assessment in a business school. A comparison is drawn between the marks
obtained by tutor, peer and individual (self) students. In addition, the effect of the
assessment methods on student stress levels is reported and the interaction between
such stress levels and final marks is examined. The paper begins with a brief overview
of prior research into self- and peer assessment.

*Department of Marketing, Griffith Business School, Nathan, QLD 4111, Australia.
Email: n.pope@griffith.edu.au
ISSN 0260–2938 (print)/ISSN 1469–297X (online)/05/010051-13
© 2005 Taylor & Francis Ltd
DOI: 10.1080/0260293042000243896

Theoretical background
A large amount of support for self- and peer assessment has been developed over the
last two decades. Most of these studies indicate enhanced student learning outcomes
(Falchikov, 1986, 1988; Boud, 1991, 1995; Hounsell et al., 1996; Dochy et al., 1999;
Lapham & Webster, 1999; Roach, 1999). There has been some question as to the
enthusiasm that should be accorded to these practices, however. For example,
Topping (1998) is somewhat sceptical and suggests that the evidence is limited,
particularly with regard to actual benefits. For that reason, it is necessary to examine
exactly the sort of benefits one would anticipate.
Some of these benefits can give the impression of being nebulous. For example,
Ellis (2001) suggests that the advantages of self- and peer assessment relate to student
involvement, independence and assertiveness. Other, often rigorous, studies
demonstrate improved thinking processes (Falchikov, 1986; Stefani, 1992; Boud, 1995;
Hanrahan & Isaacs, 2001) and system awareness (Falchikov, 1986). Such outcomes
are, of themselves, worthy, but do not address the issue, raised above, of possessing
the same accuracy as faculty-based marking. This is because they are formative
outcomes as opposed to the summative type preferred by university administrations.
In order for such formative benefits to be persuasive as ends in themselves, it becomes
necessary for educators to pursue an argument that can demonstrate longer-term
practicalities. Fortunately, such an argument has been mounted in a variety of
disciplines.
The crux of this argument is that the skills required for self-evaluation are generally
the same as those required for successful independent study (Clark, 1991). These
self-assessment skills will help graduates in their later, professional lives and peer
assessment will help them in collaborative working environments (Hanrahan & Isaacs,
2001). The end result should be a graduate whose reflective learning allows her or him
to cope with conflicting value positions within a chosen profession (Ellis, 2001). Peer
assessment will not be possible until the assessor has undergone some degree of self-
assessment and the two activities will inform each other (Boud, 1991; Ellis, 2001).
Therefore to be effective, it seems that both methods need to be used.
Examples from individual professions have been proffered to support this. For
example, in medicine, a critical skill is the ability to monitor and assess one's own
performance, as is the ability to monitor one's peers for professional and team
development (Wooliscroft et al., 1993; Gordon, 1997; Rudy et al., 2001). Research
shows that medical students have typically been found to accurately assess both
themselves and peers (Arnold et al., 1981; Rudy et al., 2001). But the argument that
the benefits of peer and self-assessment lie in these formative means invites the
counter argument that there is still a need for summative, faculty-based assessment in
order to judge a student's discipline knowledge. In order for self- and peer assessment
to be acceptable in the mainstream, they need to be able to stand as equal to the
summative task.
Some studies have addressed this issue. Peer assessment can be of either
performance (typically a group project or discussion) or product (Falchikov,
1988; Hanrahan & Isaacs, 2001; Smith et al., 2002). Examples of process studies
include those of Conway et al. (1993) and Goldfinch and Raeside (1990). The process
studies tend to be those that, while giving benefits of involvement and understanding
assessment, do not demonstrate a high level of summative value. That is to say, they
do not give us a number that can be used for grading a student's discipline knowledge.
It is the product type of assessment that can provide this. Several product studies have
examined such assessment items as the use of posters (Orsmond et al., 1996;
Billington, 1997; Berry & Nyman, 1998; Smith et al., 2002), computer projects (Lejk
& Wyvill, 2001a) and remedial English classes (Patri, 2002). All of these studies
reported high levels of correlation between student and faculty marks (e.g. Orpen,
1982; Oldfield & MacAlpine, 1995). In medicine, Morton and Macbeth (1977)
found a high correlation between faculty and peer assessment of surgery students,
while Rudy et al. (2001) found a high correlation between the same groups in a
medical interviewing cohort. Incidentally, both of these studies also showed students
to be overly critical in self-assessment. A combination of self- and peer assessment has
also been found to be effective in oral presentations of literature classes (Fallows &
Chandramohan, 2001) and with engineering students (Oldfield & MacAlpine, 1995).
By contrast, other studies have found the assessments poorly correlated (for
example, Swanson et al., 1991; Freeman, 1995). Medicine is one example where
studies have found lower correlations between self-, peer and faculty evaluations of
students and it appears that such assessments may represent different insights into
performances (Kegel-Flom, 1975). In other examples, Mowl and Pain (1995) found
poor correlations in geography students, but Billington (1997) found good correlation.
It is possible that success in the use of these forms of assessment is dependent on the
means of application. A lot of work has examined assessment of group members' work
by fellow group members (Lejk & Wyvill, 2001b; Li, 2001). It appears that holistic
assessment methods (one grade is awarded to a group member for overall
contribution) are superior to categorical methods (wherein contributions are put
together in various categories and then summated) (Lejk & Wyvill, 2001b). One
factor that appears to be vitally important is the clear setting of marking criteria and
training students in their use (Orpen, 1982; Mowl & Pain, 1995; Oldfield &
MacAlpine, 1995; Orsmond et al., 2000; Topping et al., 2000; Hanrahan & Isaacs,
2001; Smith et al., 2002). Failure to obey these rules may be the cause of cases where
the correlation between faculty, peer and self marks is poor.
A last point relating to the use of these methods is that of the stress it places on the
student. Earlier it was noted that students have been found to be overly critical of
themselves. This is one particular form of stressor. Similarly, Pope (2001) proposed
that where students were aware that they would have their work peer-assessed they
would (a) work harder, and (b) be subjected to greater levels of work-related stress.
Other researchers report similar findings (Sambell et al., 1997; Smith et al., 2002).
Much of this may derive from lack of familiarity (Smith et al., 2002), but there is also
evidence that it is difficult, and involves the possibility of harming a peer (Falchikov,
1986; Hanrahan & Isaacs, 2001).
Stress relates to a perceived imbalance between demands and the availability of
resources to meet them (Bonn & Bonn, 2000). In this sense, the requirement for
students to assess themselves and their peers, who will also assess them, can create a
stress in the student. That stress will derive from inexperience, possibly the fear of
hurting others, or of being hurt by others. Important questions that develop from this
are whether that extra stress is present to a high degree, and whether the effect of that
stress is beneficial to the student's work.
A problem with measuring stress in the educational setting relates to a lack of
theoretical basis. While several recent studies have examined stress in this setting
(e.g., Heiman & Precel, 2003; Last & Fulbrook, 2003; Natvig et al., 2003) none have
successfully developed a scale pertinent directly to the classroom. For example, in
their investigation of tertiary students with learning disabilities, Heiman and Precel
(2003) simply asked `How much stress do you feel due to academic studies?' In an
examination of teaching methods, Natvig et al. (2003) asked only three questions
relating to task difficulty and tiredness, while Last and Fulbrook's (2003) questionnaire
included four questions relating to time constraints and workload.

Development of hypotheses
From the above, it is clear that some questions exist regarding the use of peer and self-
assessment, particularly in the area of summative outcomes. Additionally of interest is
whether or not their use has a clear formative effect on those outcomes. Of great
importance is that, in order to be valid for summative purposes, both self- and peer
assessment should be consistent with those marks that would be awarded by a faculty
assessor. It appears, in fact, that this is more likely to be the case with peer than with
self-assessment. This forms the starting point for this study:

H1: Marks assigned by peers to a student's work will be more highly correlated with
faculty marks for the same work than will marks given by the same students to their own
work.
Logically, one would expect that the formative benefits that are claimed for self- and
peer assessment (those of greater involvement; clearer understanding of assessment
methods and criteria; and the ability to monitor oneself and one's peers) would result
in a superior summative outcome for students. To date, no such comparison study has
been reported in the literature. That is interesting, because should the marks for
students who have been involved in these processes be higher, in particular when
cross-marked by faculty, this would indicate that the use of these assessment methods
has an even stronger claim to use than has heretofore existed. The current study
examines that possibility. Formally stated:

H2: An individual assessed by a combination of faculty, peer and self-assessment will
record a higher mark for a piece of work from a faculty member than a similar
individual assessed by faculty assessment only.
There is a body of evidence that women have superior elaborative powers to men
(Meyers-Levy & Maheswaran, 1991; Meyers-Levy & Sternthal, 1991). This would
suggest that where different genders are placed in different groups for assessment
purposes, they will respond differently in terms of performance. A further hypothesis
is that this may occur in the case of self- and peer assessment:

H3: A student's gender will significantly interact with the type of assessment that student
undergoes in affecting resultant, independently derived marks.
One would also expect, given the demands of self- and peer assessment, that stress will
have an impact on final marks given by an independent assessor. Therefore:

H4: Stress will significantly co-vary with faculty marks in any main effects from either
gender or type of assessment that a student undergoes.
This directly implies that self- and peer assessment are stressful. In order to examine
this possibility it is further hypothesised that:

H5: Students undergoing self- and peer assessment will report higher levels of perceived
stress than students undergoing faculty marking only.
Because of evidence that women in a group setting are more susceptible to emotional
contagion (and therefore stress responses are stronger) than are men (Meyers-Levy
& Maheswaran, 1991), there is also the possibility that gender may impact on any
reported stress response. Therefore:

H6: A student's gender will significantly interact with method of assessment in any effect
on reported perceived stress.

Method
For this research, a two (Gender) by four (Group) experimental design was used.
Three treatment groups and a control were established, the groups being: (1) Tutor
mark only, which acted as a control (hereafter TO); (2) Tutor mark and self mark
(TS); (3) Tutor mark and peer mark (TP) and; (4) Tutor mark, self mark and peer
mark (TSP). The sample frame was an undergraduate research methods class at an
Australian east coast university. The class size was 192 and students were required to
participate in research projects as part of their assessment. Considerable data had
already been collected from the students that allowed for paired sampling.
On arrival for a two-hour lecture, students were informed that they would be
undergoing an assessment based on previous weeks' work. The assessment task was to
write an essay on research philosophy comparing qualitative and quantitative
paradigms. They were all advised of the assessment criteria to be used and questions
were taken for clarification. They were then allocated into groups by name.
Membership had already been determined by random split pair sampling, thereby
creating groups that were homogeneous and discrete. Absenteeism was allowed for
and a fifth group not participating in the experiment performed the task separately.
The four groups that participated in the experiment were each of 40 members
(N=160) and were evenly male and female (i.e., twenty males and twenty females per group).
This was done to remove any possibility of gender bias, following the recommendations
of Falchikov and Magin (1997).
After allocation, groups were sent to different rooms where the task was performed.
Before commencement, they were informed of their method of assessment. That is to
say, one group was told they would be marked by their peers and a tutor, another that
they would be marked by the tutor only, etc. On completion of the task (30 minutes),
those in self-assessment mode (TS and TSP) were required to mark their papers
according to the criteria earlier provided. Peer marking was the longest task to
perform due to the multiple marking necessary: multiple peer marks were used and
then averaged, consistent with earlier studies (Magin, 1993; Falchikov & Magin,
1997; Smith et al., 2002). In the case of the TSP group, peer assessment was
performed after the self-assessment was completed. This was done in order not to
confound the self-assessment stress with the known stress caused by peer assessment
(Pope, 2001). Finally, stress was measured.
Of the available means of measuring stress, only one is non-invasive, and that is self-
report (Stanton et al., 2001). The instrument used in this study was the Perceived
Stress Scale (PSS-10). This scale has received a great deal of support in applications in
the literature (e.g., Hewitt et al., 1992; Pbert et al., 1992). In addition, it has been
found to be uni-dimensional with no `practically meaningful differential item
functioning' (Cole, 1999, p. 320). Several other scales do exist, such as the Stress
in General Scale (Stanton et al., 2001) and the Job Demand-Control-Support model
(Karasek & Theorell, 1990) but these tend to be involved with employees in situations
wherein they report to a management structure (Pelfrene et al., 2001; Stanton et al.,
2001).
The PSS is based on the premise that the impact of stressful events is based on the
individual's perception of how stressful that event was (Cohen et al., 1983). Originally
a 14-item scale, it has since been reduced to 10 items (Cole, 1999). It is a Likert-type
scale with the following descriptors: (1) upset because something happened
unexpectedly, (2) unable to control the important things, (3) felt nervous and
stressed, (4) dealt successfully with irritating life hassles, (5) ineffectively coping with
important changes, (6) felt confident about the ability to cope with personal
problems, (7) felt things were going your way, (8) felt you could not cope with all the
things you had to do, (9) been able to control irritations in your life and, (10) felt you
were on top of things. Subjects are asked to respond to the statements or questions on
the basis of feelings they have had in the last month. Items are on a scale of 1 to 7 with
anchors of Never (1) and Very Often (7). Items 4, 5, 6, 7, 9, and 10 are reverse scored.
When scores for each item are summated they give a total stress rating with higher
scores indicating higher levels of perceived stress. In this case, the sum was divided by
10 to give an answer on a scale of 1 to 7. An item reliability analysis of the data in the
current study found the scale to be robust with a Cronbach's alpha of 0.89.
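The scoring procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the instrument's official scoring routine: the function name `pss_score` is hypothetical, and the set of reverse-scored items and the 1 to 7 response range are taken exactly as stated in the text.

```python
# Items the text lists as reverse scored (1-based item numbers, as stated in the paper).
REVERSE_ITEMS = {4, 5, 6, 7, 9, 10}
SCALE_MIN, SCALE_MAX = 1, 7  # anchors: Never (1) to Very Often (7)


def pss_score(responses):
    """Return the mean perceived-stress score (1 to 7) for ten item responses.

    Hypothetical helper: reverse-scored items are flipped, the ten items are
    summed, and the sum is divided by 10, as described in the text.
    """
    if len(responses) != 10:
        raise ValueError("expected exactly 10 item responses")
    total = 0
    for item_number, value in enumerate(responses, start=1):
        if not SCALE_MIN <= value <= SCALE_MAX:
            raise ValueError(f"item {item_number} is outside the 1-7 range")
        if item_number in REVERSE_ITEMS:
            # Reverse scoring on a 1-7 scale maps 1 to 7, 2 to 6, and so on.
            value = SCALE_MIN + SCALE_MAX - value
        total += value
    return total / 10


if __name__ == "__main__":
    print(pss_score([4] * 10))  # a respondent answering the midpoint throughout
    print(pss_score([1] * 10))  # "Never" on every item
```

Because six of the ten items are reversed, a respondent who answers "Never" throughout does not receive the minimum score; the midpoint response is the only uniform pattern that is unaffected by reversal.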
After all data were collected, the stress responses were removed from the hard copy
as were the self- and peer assessment mark sheets. These findings were entered into a
data set while an independent tutor marked all the work to the same criteria. Total
possible mark for the essay was 20. Group membership was not divulged to the tutor.
Students were debriefed as to the purpose of the experiment in their next class and
results were shared with them after analysis.

Findings
The ®rst hypothesis suggested that there would be greater correlation between marks
awarded by peers and those given by an independent tutor than would exist between
marks given by oneself to one's own work and that of the independent tutor. A
Product Moment Correlation Matrix was examined to test for relationships between
the tutor's mark and the student's self-assessment in groups with both tutor and self-
marking (TS and TSP). This showed a significant correlation between the student's
own mark and that of the tutor (R² = .59; t = 10.60; N = 80; p < .05). A similar matrix
between tutor and peer marks for groups with both present (TP and TSP) also
showed significant correlation (R² = .60; t = 10.83; N = 80; p < .05). No meaningful
difference exists between the correlations so Hypothesis 1 is rejected.
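The reported t values are consistent with the standard significance test for a product moment correlation, t = r√((N − 2)/(1 − r²)). The sketch below checks this; it assumes that this is the test that was used (the paper does not say so explicitly), and the function name is hypothetical.

```python
import math


def t_from_r_squared(r_squared, n):
    """t statistic for a Pearson correlation, given R squared and sample size.

    Assumes the usual test with n - 2 degrees of freedom; the paper does not
    state which formula was applied.
    """
    r = math.sqrt(r_squared)
    return r * math.sqrt((n - 2) / (1 - r_squared))


if __name__ == "__main__":
    # Values as reported in the text: R squared of .59 (self) and .60 (peer), N = 80.
    for label, r_squared in (("self vs tutor", 0.59), ("peer vs tutor", 0.60)):
        print(label, round(t_from_r_squared(r_squared, 80), 2))
```

Both results fall within rounding distance of the published figures (10.60 and 10.83), the small gap being attributable to R² having been reported to two decimal places.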
It was also suggested, in Hypothesis 2, that students subjected to a combination
of assessment methods (i.e., peer, self- and tutor marking) would perform better than
those in only one or two assessment conditions (i.e., tutor only, tutor and peer or tutor
and self). It was further argued, in Hypothesis 3, that a student's gender would have
a significant interaction with group membership in any effect on a tutor's mark. An
analysis of variance (ANOVA) was conducted with the tutor's marks for each of the
four groups as the dependent variable. Independent variables were group membership
and student gender. Levene's test of the assumption of homogeneity of variance was
not significant (F=0.18, p > .05) and the analysis was allowed to proceed.
Results of the analysis are presented at Table 1. The model was significant (p < .05)
and statistically significant main effects were found for both group membership and
student gender. There was no significant interaction between the two independent
variables. The main effect of group membership was explored post hoc using Fisher's
LSD.
The post hoc test revealed that students in groups subjected to self-assessment (TS
and TSP) performed better than students marked by the tutor only (TO). Mean score
for TS was 11.4, for TSP 12.2 and for TO, 8.8. Group TP (10.3) also scored higher
than TO, but the difference was not statistically significant. Females for the total
sample performed better than males (11.6 as opposed to 9.7), although no interaction
with group membership was observed. This difference for the overall sample was
statistically significant. Both Hypotheses 2 and 3 are rejected, but open interesting
areas for discussion (below).

Table 1. Results of univariate tests of significance, dependent variable tutor mark and
independent variables of group membership and subject gender

                       SS     df         MS        F      p
Intercept        18211.56      1   18211.56   607.55   0.00
Group              267.92      3      89.31     2.98   0.03
Gender             146.31      1     146.31     4.88   0.03
Group * Gender     136.97      3      45.66     1.52   0.21
Error             4556.25    152      29.98

Table 2. Results of univariate tests of significance, dependent variable perceived stress and
independent variables of group membership and subject gender

                       SS     df         MS        F      p
Intercept         2984.25      1    2984.25   994.64   0.00
Group               69.51      3      23.17     7.72   0.00
Gender              12.65      1      12.65     4.22   0.04
Group * Gender      30.51      3      10.17     3.39   0.02
Error              456.05    152       3.00
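The F ratios in Tables 1 and 2 follow from the sums of squares and degrees of freedom: each effect's mean square (SS/df) is divided by the error mean square. The sketch below recomputes them as a consistency check. The figures are as read from the two tables, and the dictionary layout and function name are illustrative assumptions rather than anything taken from the paper.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F ratio for one effect: effect mean square over error mean square."""
    return (ss_effect / df_effect) / (ss_error / df_error)


# (SS, df) pairs as read from Table 1 (tutor mark) and Table 2 (perceived stress).
TABLES = {
    "tutor mark": {
        "Group": (267.92, 3),
        "Gender": (146.31, 1),
        "Group * Gender": (136.97, 3),
        "Error": (4556.25, 152),
    },
    "perceived stress": {
        "Group": (69.51, 3),
        "Gender": (12.65, 1),
        "Group * Gender": (30.51, 3),
        "Error": (456.05, 152),
    },
}

if __name__ == "__main__":
    for dependent, rows in TABLES.items():
        ss_error, df_error = rows["Error"]
        for effect, (ss, df) in rows.items():
            if effect != "Error":
                print(dependent, effect, round(f_ratio(ss, df, ss_error, df_error), 2))
```

The error degrees of freedom (152) equal the 160 participants minus the eight Gender by Group cells, which matches the design described in the Method section.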
Recall that Hypothesis 4 suggested that stress would significantly impact on a
student's overall performance in this experiment. To examine this effect, the previous
ANOVA was repeated with stress scores treated as a covariate. Results of this analysis
showed that stress co-varied with the dependent variable of tutor's mark (Beta = 0.72,
R² = 0.51, df = 1, 158, p < .05). So strong is this relationship that all main effects lose
statistical significance. Hypothesis 4 is therefore supported.
The fifth hypothesis related to the location of stress within groups, while Hypothesis
6 suggested that females would find peer and self-assessment more stressful than
males. A further ANOVA was conducted using perceived stress as the dependent
variable and group membership and gender as the independent variables. Results
appear in Table 2.
Levene's test of the assumption of homogeneity of variance was not significant
(F=0.96, p > .05) and the analysis proceeded. The model was significant (p < .05)
and statistically significant main effects were found for both group membership and
student gender. There was also a significant interaction between the two independent
variables.
Females were more likely to report higher levels of perceived stress overall than
males (female mean=4.60; male mean=4.04; p<.05). The source of significant
differences between groups was analysed post hoc using Fisher's LSD. This analysis
showed that all groups in either or both self- and peer assessment conditions reported
higher levels of stress than those in the tutor only group (TSP mean=4.98; TP
mean=4.33; TS mean=4.73; TO mean=3.25) at a 95% confidence level. No
significant differences were found between treatment groups. Hypothesis 5 is
therefore accepted.
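The pattern of post hoc results can be illustrated with Fisher's least significant difference, LSD = t × √(2 × MSE / n), where two group means differ significantly if they are further apart than the LSD. The sketch below is a hedged reconstruction, not the paper's own computation: it assumes equal groups of 40, an error mean square of 3.00 as read from Table 2, and an approximate critical t of 1.976 for 152 degrees of freedom taken from standard tables. The function names are hypothetical.

```python
import math
from itertools import combinations

# Assumption: approximate two-tailed 5% critical value of t for 152 error
# degrees of freedom, from standard tables (not reported in the paper).
T_CRITICAL = 1.976


def lsd_threshold(ms_error, n_per_group, t_critical=T_CRITICAL):
    """Smallest difference between two group means that Fisher's LSD flags."""
    return t_critical * math.sqrt(2 * ms_error / n_per_group)


def significant_pairs(means, ms_error, n_per_group):
    """Return the sorted pairs of groups whose means differ by more than the LSD."""
    threshold = lsd_threshold(ms_error, n_per_group)
    return sorted(
        tuple(sorted((a, b)))
        for a, b in combinations(means, 2)
        if abs(means[a] - means[b]) > threshold
    )


if __name__ == "__main__":
    # Group means for perceived stress as reported in the text.
    stress_means = {"TSP": 4.98, "TP": 4.33, "TS": 4.73, "TO": 3.25}
    print(round(lsd_threshold(3.00, 40), 3))
    print(significant_pairs(stress_means, ms_error=3.00, n_per_group=40))
```

Under these assumptions only the three comparisons involving the tutor-only group exceed the threshold, which agrees with the finding that every treatment group differed from the control while the treatment groups did not differ from one another.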
The interaction between a subject's gender and group membership was also
examined post hoc. Because of the large number of resulting groups (eight), the more
conservative Scheffé's test was used in order to reduce the possibility of Type I error.
Significant differences were found in three instances. Differences in perceived stress
were identified between females in TS group (mean=5.7) and both males in the TO
group (mean=3.0) and females in the TO group (mean=3.5). A further difference was
found between females in the TSP group (mean=5.1) and males in the TO group
(mean=3.0). Hypothesis 6 is accepted.

Discussion
Several earlier commentators have observed that while peer assessment is generally
highly correlated with faculty marks (Morton & Macbeth, 1977; Orpen, 1982;
Oldfield & MacAlpine, 1995; Rudy et al., 2001), the same cannot also be said of self-
assessment. Reasons for this may include students being overly self-critical (Morton &
Macbeth, 1977; Rudy et al., 2001) or poorer students favouring themselves while
brighter students do not (Lejk & Wyvill, 2001). This study examined correlations
between self, peer and tutor marks and found that the correlations were very strong in
each case with about 60% of variance being explained. This does not mean that
students are not overly self-critical, but if that is the case then it is being evened out by
other factors not examined in this study.
What is interesting about the differences between self- and peer assessment in this
study relates to the effect on marks and the interaction of stress. To begin with the
marks, there is a clear indication that females performed better across the total sample
than did males. Additionally, students who were told they would be marked by either
themselves, their peers or both, performed better according to the independent
marker than those who were not given this prompt. There was no significant
difference in scores between the groups given these prompts. So it would seem that
merely telling individuals that they are to be assessed by their peers or themselves is
sufficient to improve their assessment performance.
But this changes as soon as allowance is made for stress as a potential confounding
variable. Not only were the main effects observed in the preceding paragraph reduced,
they also lost all statistical significance. What might be occurring is that the prompt of
receiving either self- or peer assessment induces a stress response of its own, and this
stress response is sufficient to raise performance in the assessed item. This may occur
either through greater effort or greater care being taken.
Now with the final two hypotheses, this stress response was examined. Any of the
three treatment prompts induced a stress response in the subjects that was
significantly different from the stress observed in the control group. Importantly,
there was no statistical significance in differences between treatment groups
themselves. So we can conclude that the mere prompt of self- or peer assessment is
sufficient to induce stress, and we know that this stress accounts for improved
assessment results when compared with the marks of a faculty member.
Of great interest is the gender effect in the stress response. Overall, females found all
of the assessment more stressful than males and this may well have been reflected in
their overall performance being superior. Additionally, females produced another
interesting response. It was self-assessment that they found more stressful. This is
intriguing and would warrant further examination.

Conclusion
The results of the research presented in this paper support the contention that peer
and self-assessment contribute to a student's work performance. They also suggest that
both these forms of assessment are highly correlated with faculty-awarded marks. The
contribution of this research lies in its examination of the stress effects of peer and self-
assessment. This has not been studied in conjunction with these forms of assessment
in previous, published work.
There seems to be a possibility from these findings that benefits of peer and self-
assessment on students' final performance are interactive with the stress level
perceived by the student. It also seems that this stress response is stronger in females,
particularly in the self-assessment condition. It is interesting that previous researchers
have not examined the issue of whether or not females tend to be more self-critical of
their own work, although it has been noted that this is a quality of better performing
students and, in this study, females outperformed males overall.
As with all research and, in particular, experimental designs, this study was subject
to limitations. The time scale used in this study was extremely short. There is a
possibility that had a longer delay been involved between setting of assessment
criteria, performance of task, assessment and stress tests, the results would have been
different. Effectively, this research did not allow for any `wear-in' or `wear-out' of the
various stimuli involved. Additionally, the sample used was already conditioned to
being involved in research projects and there is the possibility that a certain amount of
`yea-saying' (stating what one believes the researcher is seeking) occurred.
Despite these limitations, the study provides another insight into the possible
mechanisms working in the use of self- and peer assessment. It also demonstrated that
the PSS-10 scale is useable within the educational context. This is fortunate, given the
absence of suitable conceptualisation of stress in higher education, an area in need of
development. Future researchers may wish to replicate the current study, paying
attention to these particular limitations and varying the timeframes. Another
possibility for further research is in the area of gender and stress, with emphasis on
the effect of higher marks among females as opposed to males.

Notes on contributor
Nigel K. Ll. Pope is Associate Professor of Marketing at Griffith Business School.

References
Arnold, L., Willoughby, L., Calkins, V., Gammon, L. & Eberhart, G. (1981) Use of peer
evaluation in the assessment of medical students, Journal of Medical Education, 56, 35–42.
Berry, J. & Nyman, M. (1998) Introducing mathematical modelling skills to students and the use of
posters in assessment, Primus, 8, 103–115.
Billington, H. L. (1997) Poster presentations and peer assessment: novel forms of evaluation and
assessment, Journal of Biological Education, 31, 218–220.
Bonn, D. & Bonn, J. (2000) Work-related stress: can it be a thing of the past? The Lancet, 355,
p. 124.
Boud, D. (1991) Implementing student self-assessment, HERDSA Green-Guide Series (2nd edn)
(Campbelltown, Higher Education Research and Development Society of Australia).
Boud, D. (1995) Enhancing learning through self-assessment (London, Kogan Page).
Clark, R. (1991) Student opinion of flexible teaching and learning in higher education, in: W.
Wade, K. Hodgkinson, A. Smith & J. Arfield (Eds) Flexible learning in higher education
(London, Kogan Page), 136–150.
Cohen, S., Kamarck, T. & Mermelstein, R. (1983) A global measure of perceived stress, Journal of
Health and Social Behavior, 24, 385–396.
Cole, S. R. (1999) Assessment of differential item functioning in the Perceived Stress Scale-10,
Journal of Epidemiology and Community Health, 53, 319–320.
Conway, R., Kember, D., Sivan, A. & Wu, M. (1993) Peer assessment of an individual's
contribution to a group project, Assessment and Evaluation in Higher Education, 18, 45–56.
Dochy, F., Segers, M. & Sluijman, S. (1999) The use of self-, peer and co-assessment in higher
education: a review, Studies in Higher Education, 24, 331–350.
Ellis, G. (2001) Looking at ourselves – self-assessment and peer assessment: practice examples
from New Zealand, Reflective Practice, 2(3), 289–302.
Falchikov, N. (1986) Product comparisons and process benefits of collaborative group and
self-assessment, Assessment and Evaluation in Higher Education, 11, 146–166.
Falchikov, N. (1988) Self- and peer assessment of a group project designed to promote the skills of
capability, Programmed Learning and Education Technology, 25(4), 327–339.
Falchikov, N. & Magin, D. (1997) Detecting gender bias in peer marking of students' group
progress work, Assessment and Evaluation in Higher Education, 22(4), 385–396.
Fallows, S. & Chandramohan, B. (2001) Multiple approaches to assessment: reflections on tutor,
peer and self-assessment, Teaching in Higher Education, 6(2), 229–246.
Freeman, M. (1995) Peer assessment by groups of group work, Assessment and Evaluation in Higher
Education, 20, 289–300.
Fry, S. A. (1990) Implementation and evaluation of peer marking in higher education, Assessment
and Evaluation in Higher Education, 15, 177–189.
Gibbs, G. (1992) Assessing more students, Teaching More Students Project Series (Vol. 4) (Oxford,
Oxford Centre for Staff Development).
Goldfinch, J. & Raeside, R. (1990) Development of a peer assessment technique for obtaining
individual marks on a group project, Assessment and Evaluation in Higher Education, 15,
210–231.
Gordon, M. J. (1997) Cutting the Gordian knot: a two part approach to the evaluation and
professional development of residents, Academic Medicine, 72, 876–880.
Hanrahan, S. J. & Isaacs, G. (2001) Assessing self- and peer assessment: the students' views, Higher
Education Research and Development, 20, 53–70.
Heiman, T. & Precel, K. (2003) Students with learning difficulties in higher education: academic
strategies and profile, Journal of Learning Disabilities, 36(3), 248–258.
Hewitt, P. L., Flett, G. L. & Mosher, S. W. (1992) The Perceived Stress Scale: factor structure and
relation to depression symptoms in a psychiatric sample, Journal of Psychopathology and
Behavioral Assessment, 14(3), 247–258.
Hounsell, D., McCullouch, M. & Scott, M. (1996) Changing assessment practices in higher education:
the ASSHE inventory (Edinburgh, Centre for Teaching, Learning and Assessment,
University of Edinburgh and Napier University, in association with the Universities and
Colleges Staff Development Agency).
Karasek, R. A. & Theorell, T. (1990) Healthy work: stress, productivity, and the reconstruction of
working life (New York, Basic Books).
Kegel-Flom, P. (1975) Predicting supervisor, peer and self-ratings of intern performance, Journal
of Medical Education, 50(8), 812–815.
Lapham, A. & Webster, R. (1999) Peer assessment of undergraduate seminar presentations:
motivations, reflections and future directions, in: S. Brown & A. Glasner (Eds) Assessment
matters in higher education: choosing and using diverse approaches (Buckingham, Society for
Research in Higher Education and Open University Press), 183–190.
Last, L. & Fulbrook, P. (2003) Why do student nurses leave? Suggestions from a Delphi study,
Nurse Education Today, 23, 449–458.
Lejk, M. & Wyvill, M. (2001a) The effect of the inclusion of self-assessment with peer assessment
of contributions to a group project: a quantitative study of secret and agreed assessments,
Assessment and Evaluation in Higher Education, 26, 551–561.
Lejk, M. & Wyvill, M. (2001b) Peer assessment of contributions to a group project: a comparison
of holistic and category-based approaches, Assessment and Evaluation in Higher Education, 26,
61–72.
Li, L. K. L. (2001) Some refinements on peer assessment of group projects, Assessment and
Evaluation in Higher Education, 26, 5–18.
McDowell, L. (1996) Managing assessment in modular curriculum: issues, perceptions, responses
and opportunities, in: Higher Education Quality Council (Ed.) Modular higher education in the
UK in focus (London, Higher Education Quality Council), 96–100.
Magin, D. (1993) Should student peer ratings be used as part of summative assessment? Research
and Development in Higher Education, 16, 537–542.
Meyers-Levy, J. & Maheswaran, D. (1991) Exploring differences in males' and females' processing
strategies, Journal of Consumer Research, 18, 63–70.
Meyers-Levy, J. & Sternthal, B. (1991) Gender differences in the use of message cues and
judgments, Journal of Marketing Research, 28, 84–96.
Morton, J. B. & Macbeth, W. A. A. G. (1977) Correlations between staff, peer and self-assessment
of fourth-year students in surgery, Medical Education, 11, 167–170.
Mowl, G. & Pain, R. (1995) Using self and peer assessment to improve students' essay writing: a
case study from Geography, Innovations in Education and Training International, 32, 324–335.
Natvig, G. K., Albrektsen, G. & Qvarnstrom, U. (2003) Methods of teaching and class
participation in relation to perceived social support and stress: modifiable factors for
improving health and wellbeing among students, Educational Psychology, 23(3), 261–273.
Oldfield, K. & MacAlpine, J. M. K. (1995) Peer and self-assessment at tertiary level – an
experiential report, Assessment and Evaluation in Higher Education, 20, 125–132.
Orpen, C. (1982) Student versus lecturer assessment of learning: a research note, Higher Education,
11, 567–572.
Orsmond, P., Merry, S. & Reiling, K. (1996) The importance of marking criteria in the use of
assessment, Assessment and Evaluation in Higher Education, 21, 239–250.
Orsmond, P., Merry, S. & Reiling, K. (2000) The use of student derived marking criteria in peer
and self-assessment, Assessment and Evaluation in Higher Education, 25, 23–38.
Patri, M. (2002) The influence of peer feedback on self- and peer assessment of oral skills,
Language Testing, 19(2), 109–131.
Pbert, L., Doerfler, L. A. & Decosimo, D. (1992) An evaluation of the Perceived Stress Scale in
two clinical populations, Journal of Psychopathology and Behavioral Assessment, 14(4),
363–376.
Pelfrene, E., Vlerick, P., Mak, R. P., De Smets, P., Kornitzer, M. & De Backer, G. (2001) Scale
reliability and validity of the Karasek 'Job Demand-Control-Support' model in the Belstress
study, Work and Stress, 15(4), 297–313.
Pope, N. (2001) An examination of the use of peer rating for formative assessment in the context of
the theory of consumption values, Assessment and Evaluation in Higher Education, 26,
235–246.
Roach, P. (1999) Using peer assessment and self-assessment for the first time, in: S. Brown & A.
Glasner (Eds) Assessment matters in higher education: choosing and using diverse approaches
(Buckingham, Society for Research in Higher Education and Open University Press),
191–201.
Rudy, D. W., Fejfar, M. C., Griffith, C. H. III & Wilson, J. F. (2001) Self- and peer assessment in
a first-year communication and interviewing course, Evaluation and the Health Professions,
24(4), 436–445.
Sambell, K., McDowell, L. & Brown, S. (1997) 'But is it fair?': an exploratory study of student
perceptions of the consequential validity of assessment, Studies in Educational Evaluation, 23,
349–371.
Smith, H., Cooper, A. & Lancaster, L. (2002) Improving the quality of undergraduate peer
assessment: a case for student and staff development, Innovations in Education and Teaching
International, 39, 71–81.
Stanton, J. M., Balzer, W. K., Smith, P. C., Parra, L. F. & Ironson, G. (2001) A general measure
of work stress: the stress in general scale, Educational and Psychological Measurement, 61(5),
866–888.
Stefani, L. A. J. (1992) Comparison of collaborative, self, peer and tutor assessment in a
biochemistry practical, Biochemical Education, 20, 148–151.
Swanson, D., Case, S. & Van der Vleuten, C. (1991) Strategies for student assessment, in: D.
Boud & G. Feletti (Eds) The challenge of problem based learning (London, Kogan Page).
Topping, K. J. (1998) Peer assessment between students in colleges and universities, Review of
Educational Research, 68(3), 249–276.
Topping, K. J., Smith, E. F., Swanson, I. & Elliot, A. (2000) Formative peer assessment of
academic writing between postgraduate students, Assessment and Evaluation in Higher
Education, 25, 149–166.
Wooliscroft, J. O., Tenhacken, J., Smith, J. & Calhoun, J. (1993) Medical students' clinical
self-assessments: comparisons with external measures of performance and students'
self-assessments of overall performance and effort, Academic Medicine, 68, 285–297.