
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/6702238

An analysis of the Revised Olweus Bully/Victim Questionnaire using the Rasch Measurement Model

Article in British Journal of Educational Psychology · January 2007


DOI: 10.1348/000709905X53499 · Source: PubMed


All content following this page was uploaded by Geoff Lindsay on 26 October 2022.



Copyright © The British Psychological Society
Reproduction in any form (including the internet) is prohibited without prior permission from the Society

British Journal of Educational Psychology (2006), 76, 781–801
© 2006 The British Psychological Society

www.bpsjournals.co.uk

An analysis of the Revised Olweus Bully/Victim Questionnaire using the Rasch measurement model

Leonidas Kyriakides¹, Chrystalla Kaloyirou² and Geoff Lindsay²*

¹ Department of Education, University of Cyprus, Nicosia, Cyprus
² Centre for Educational Development, Appraisal and Research (CEDAR), University of Warwick, UK

Background. Bullying is a problem in schools in many countries. A psychometrically
sound instrument for measuring it would benefit both teachers and researchers. The Olweus Bully/Victim
Questionnaire has been used in a number of studies but comprehensive evidence on
its validity is not available.
Aims. To examine the conceptual design, construct validity and reliability of the
Revised Olweus Bully/Victim Questionnaire (OBVQ) and to provide further evidence
on the prevalence of different forms of bullying behaviour.
Sample. All 335 pupils (160 [47.8%] girls; 175 [52.2%] boys; mean age 11.9 years
[range 11.2–12.8 years]) in 21 classes of a stratified sample of 7 Greek Cypriot primary
schools.
Method. The OBVQ was administered to the sample. Separate scales were created
comprising (a) the items of the questionnaire concerning the extent to which pupils are
being victimized; and (b) those concerning the extent to which pupils express bullying
behaviour. Using the Rasch model, both scales were analysed for reliability, fit to the
model, meaning, and validity. Both scales were also analysed separately for each of two
sample groups (i.e. boys and girls) to test their invariance.
Results. Analysis of the data revealed that the instrument has satisfactory
psychometric properties; namely, construct validity and reliability. The conceptual
design of the instrument was also confirmed. The analysis leads also to suggestions for
improving the targeting of items against student measures. Support was also provided
for the relative prevalence of verbal, indirect and physical bullying. As in other countries,
Cypriot boys used and experienced more bullying than girls, and boys used more
physical and less indirect forms of bullying than girls.
Conclusions. The OBVQ is a psychometrically sound instrument that measures two
separate aspects of bullying, and whose use is supported for international studies of
bullying in different countries. However, improvements to the questionnaire were also
identified to provide increased usefulness to teachers tackling this significant problem
facing schools in many countries.

* Correspondence should be addressed to Professor Geoff Lindsay, CEDAR, University of Warwick, Coventry CV4 7AL, UK
(e-mail: geoff.lindsay@warwick.ac.uk)

DOI:10.1348/000709905X53499

Bullying is not simply a contemporary phenomenon in education. Nevertheless, in
many countries, it is only recently that bullying has received substantial research and
societal attention. A possible reason for this delay could be its multidimensional
character, which has raised a variety of constraints in its definition and measurement.
School bullying, as a form of aggressive behaviour, involves many factors. Olweus
(1993) provided a holistic definition of the phenomenon of bullying as it is expressed
within the school environment: ‘a student is being bullied or victimized when he/she
is exposed, repeatedly and over time to negative actions on the part of one or more
other students. It is a negative action when someone intentionally inflicts, or
attempts to inflict, injury or discomfort upon another’ (p. 9). Moreover, Olweus
argued that the term ‘negative actions’ need not refer only to physical contact but
could also refer to verbal or other methods, such as making faces or obscene
gestures, and intentional exclusion from the group. However, not every negative act
could be considered as bullying as this presupposes an imbalance in strength
between the participants.
This definition of bullying became the basis for the development of a worldwide
research activity on school bullying (e.g. Ortega & Mora-Merchan, 1995; Pateraki &
Houndoumadi, 2001; Smith et al., 1999; Stevens, De Bourdeaudhuij, & Van Oost, 2000;
Whitney & Smith, 1993), which revealed that bullying is a significant educational
problem in many countries which can impair the school’s effectiveness (Ma, 2002).
Gender effects have been demonstrated across a number of countries including Ireland,
France, Scandinavia, England, Scotland, and the Netherlands with boys more likely to be
exposed to and to exhibit bullying than girls (Byrne, 1992; Carra & Sicot, 1996;
Lagerspetz, Bjorkqvist, Berts, & King, 1982; Mellor, 1990; Olweus, 1978; Ortega &
Mora-Merchan, 1995; Smith et al., 1999; Vandermissen & Thys, 1993). Furthermore, there are
interaction effects with boys being more likely to use physical bullying while girls tend
to use indirect forms of bullying (Crick, Casas, & Mosher, 1997; Olweus, 1993; Osterman
et al., 1998; Smith et al., 1999; Vandermissen & Thys, 1993; Whitney & Smith, 1993).
However, the most common form for both is verbal bullying (Irish National Teachers
Organization, 1993; Lind & Maxwell, 1996; Rigby & Slee, 1991; Smith et al., 1999;
Whitney & Smith, 1993).
There is conflicting evidence regarding the mediation of bullying by age. Olweus
(1991) reported a clear and fairly steady decline with age in reports of being bullied from
8 to 16 years. This age trend was supported by a longitudinal study, from fifth to seventh
grade, of an entire rural school district in North America by Pellegrini and Long (2002).
However, two UK studies have failed to find an age effect (Smith, 1991; Johnson
et al., 2002).
It has been shown that bullying has a negative effect on the development of positive
self-esteem in the victims (Boulton & Smith, 1994); victims of bullying regard themselves
as responsible for what is happening to them. This attitude affects their concentration
and learning (Sharp & Smith, 1994). In addition, some children experience stress-related
symptoms (e.g. headaches, nightmares) and even school phobia (Sharp & Smith, 1994).
In the long term, some children continue to present low self-esteem and depression
(Olweus, 1993) or even commit suicide (Slee, 1994).

The parents of the victims may also feel ashamed that their child is not a social
success and may be reluctant to contact the school for help. They may expect him/her
to fend off attack (Besag, 1989). Bullying is a hidden problem, which can increase
teachers’ stress (Byrne, 1992; Charlot & Emin, 1997; Nakou, 2000). As for the bullies,
they soon realize that bullying is an easy and effective way to get what they want (Besag,
1989) and may present other forms of antisocial behaviour (Sharp & Smith, 1994).
Recently, the problem of bullying in the state primary schools of Cyprus has become
an issue of significant concern. Both the teacher trade union and the parents’ association
have sought the development of a national policy to tackle it. A recent study revealed
that the majority of Cypriot teachers (69%) claimed that the situation has deteriorated
recently (Kaloyirou, 2002).
Given the international prevalence of bullying, it is important to develop a
psychometrically appropriate instrument measuring the phenomenon of bullying in
different countries. Thus, this paper presents the findings of a study investigating the
usefulness of the Revised Olweus Bully/Victim Questionnaire (OBVQ; Olweus, 1996) for
this purpose. The OBVQ is a revised version of an earlier instrument developed by
Olweus (1978). It was based on the definition of bullying, proposed by Olweus (1993;
see above), and consists of 40 questions for the measurement of aspects of bully/victim
problems: physical, verbal, indirect, racial, and sexual forms of bullying/harassment;
initiation of various forms of bullying other students; where the bullying occurs;
pro-bullying and pro-victim attitudes; and the extent to which teachers, peers, and parents
are informed about and react to the bullying (Olweus, 1997).
The questionnaire content derives from the main findings of studies conducted on
bullying in several countries (e.g. Garcia & Perez, 1989; Genta, Menesini, Fonzi,
Costabile, & Smith, 1996; Mellor, 1990; Monbusho, 1994). More specifically, three forms
of bullying are consistently identified: physical, verbal, and indirect bullying (Besag,
1989; Morita, 1985; Olweus, 1993; Sharp & Smith, 1994). It has also been shown that the
bullies and victims are important sources of data for investigating this phenomenon
(Besag, 1986; Olweus, 1978, 1993; Salmivalli, Lagerspetz, Bjorkqvist, Osterman, &
Kaukiainen, 1996; Sharp & Smith, 1994; Smith & Sharp, 1994). The OBVQ is divided
into two parts. Part I (Questions 5 to 24) refers to the initiation of an act of bullying
against the child who is answering the questionnaire, whereas Part II (Questions 25 to
40) refers to the expression of bullying behaviour against others by this child. Given the
generally consistent research on bullying, it is hypothesized that the OBVQ operates
comparably in different countries. This will be tested in the current study with respect
to prevalence of bullying overall, and to types of bullying, by gender.
The duration and frequency of the problem are also examined as these dimensions
distinguish a bullying act from an accidental incident. Moreover, pupils are prompted to
refer to the place where the problem occurs more often, who is informed about the
bullying incidents, and the role of their teachers, parents and peers in addressing the
problem.
The thematic distribution of the questions indicates that more emphasis is given to
the measurement of verbal bullying, as this is the most frequent form of bullying in
schools (Irish National Teachers Organization, 1993; Lind & Maxwell, 1996; Rigby, 1997;
Smith et al., 1999; Whitney & Smith, 1993). Second, more attention is given to the
investigation of the characteristics of the bully, rather than those of the victim as the
bully is identified by teachers as the main issue for the school (Nakou, 2000). Third,
more emphasis is given to the role of teacher rather than other significant persons in
children’s lives such as their parents or peers, as the investigation of the teacher’s role
and the examination of the bully’s characteristics are the crucial elements in determining
action within the school setting.
The wide range of variables included in the OBVQ has enabled its use in an
international study of bullying. Two English versions of the questionnaire, for Grades 1
to 4 and Grades 5 to 9 and higher, respectively (Olweus, 1993), have been translated and
adapted in Spain (Ruiz, 1992), the Netherlands (Haeselager & Van Lieshout, 1992), Japan
(Hirano, 1992), Canada (Ziegler & Rosenstein-Manner, 1991), the USA (Perry, Kusel, &
Perry, 1988), Australia (Rigby & Slee, 1991), and Finland (Lagerspetz et al., 1982), as well
as England (Smith, 1991; Whitney & Smith, 1993). Internal consistency and test–retest
reliability of the questionnaire from large representative samples (more than 5000
students) are satisfactory (e.g. Genta et al., 1996; Olweus, 1997). More specifically, at
the individual level, combinations of items for being victimized or bullying others have
yielded satisfactory internal consistency reliabilities with values of Cronbach alpha
higher than .80.
Only a few studies have investigated validity and these have been mainly
concerned with the concurrent validity of the earlier versions of the OBVQ. In the
early Swedish studies (e.g. Olweus, 1978), composites of 3 to 5 self-report items on
being bullied or bullying and attacking others, respectively, correlated in the .40–.60
range with reliable peer ratings on related dimensions (Olweus, 1994). Similarly,
Perry et al. (1988) reported a significant correlation coefficient of .42 between a
self-report scale of three victimization items and a reliable measure of peer nominations
of victimization in elementary school children (Olweus, 1994). In addition, a recent
study (Bendixen & Olweus, 1999) provides some evidence for the construct validity
of the two main dimensions of the questionnaire (being victimized and bullying
others). They report fairly strong linear relations between degree of victimization
and variables such as depression, poor self-esteem and peer rejection, on the one
hand, and even stronger linear relations between degree of bullying others and
various dimensions of antisocial behaviour and several aspects of aggressive
behaviour, on the other.
While there is no denying that the OBVQ has proven useful to teachers,
researchers, and educational authorities, there are three aspects of the instrument
that are open to question. First, the OBVQ only provides data at the ordinal level
and not at the interval level since a Likert scale is used to collect attitude data.
Usually, Likert scales are regarded as a softer form of data collection, in which the
researcher clearly acknowledges that the questions require merely expressed
opinions (Hales, 1986). This recognizes the inherent subjectivity involved in
collecting information about any human conditions. However, the standard method
of analysing Likert scales disregards the subjective nature of the data by making
unwarranted assumptions about their meaning. Specifically, the coding for the five
response categories of the OBVQ scale is usually treated as follows: it happened to
me several times a week = 5, it happened to me once a week = 4, it happened to
me 2 to 3 times a month = 3, it happened to me only once or twice in the last 2
months = 2, it hasn’t happened to me in the last 2 months = 1. Thus, the higher
number indicates that the event occurs more frequently. Whenever scores are added
in this manner, the ratio, or at least the interval nature of the data, is being
presumed. That is, the relative value of each response category across all items is
treated as being the same, and the unit increases across the rating scale are given
equal values. Although the subjectivity of attitudinal data is acknowledged each time
the data are collected, the data are subsequently analysed in a statistical manner that
is rigidly prescriptive and inappropriate (Bond & Fox, 2001). It is both counterintuitive
and mathematically inappropriate to analyse Likert data obtained through the
OBVQ in the conventional way. Thus, a powerful measurement model such as the
Rasch model could be applied to the data to construct an interval-level measure (Andrich,
1978a). Further information regarding the process of analysing rating scale data
through the Rasch model is given in the next section. Second, Rasch analysis could
help to explain the conceptual structure of the OBVQ (its meaning and validity) and
test whether it is targeted correctly (i.e. whether the pupils’ measures and the item
difficulties can be represented on the same scale). Third, the adaptation of OBVQ so
that it meets the constraints of another language (in this case, Greek) provides an
opportunity to examine systematically its potential validity across cultures and to
develop an instrument that can be used in multinational studies.
The study reported in this paper was an attempt to use the Rasch model (Rasch,
1980) and create two interval-level measures: one scale is based on pupils’
responses to items concerning the extent to which children are being victimized,
and the other is based on their responses to items concerning the extent to which
children bully others. A further research aim was to investigate the conceptual
design of the OBVQ by examining the relationship of the two scales and the
difficulties of the items that refer to different forms of bullying. The third aim was
to explore the use of the OBVQ presented in a different language (Greek) and
different culture (Cyprus).
Our decision to use the Rasch model to analyse data derived from the OBVQ is
based on the idea that ‘useful measurement involves examination of only one
human attribute at a time on a hierarchical more than/less than line of inquiry’
(Bond & Fox, 2001, p. 32). This line of inquiry is a theoretical idealization against
which we can compare patterns of responses that do not coincide with this ideal.
Person and item deviations from the line can be assessed, alerting the investigator
to reconsider item wording and score interpretations from the data. Confusing a
number of attributes into a single score makes confident predictions from the score
more hazardous and the score a less useful summary of person ability or
achievement (Smith & Miao, 1994), but carefully constructed instruments that make
good measurement estimates of single attributes might be sufficient for a number of
specified purposes.
This implies that each of the items of an instrument is expected to contribute in
a meaningful way to the construct being investigated and thereby the Rasch model
helps examine the construct validity of an instrument (Cronbach, 1990). Specifically,
construct validity focuses on the idea that the recorded performances are reflections
of a single underlying construct: the theoretical construct as made explicit by the
investigator’s attempt to represent it in items and by the human ability inferred to
be responsible for those performances (Cronbach & Meehl, 1955). Given some
theoretical claims about a construct, the Rasch model permits the strong inference
that the measured behaviours are expressions of that underlying construct (Messick,
1995; Smith, 2001). Thus, Rasch analysis provides indicators of how well each item
fits within the underlying construct and thereby the construct validity of the
instrument can be examined (Overston, 1999). In the case of this study, we provide
in the next section further information on how the Rasch analysis of pupil
responses to the items which belong to the two scales of OBVQ, measuring the
extent to which children are being victimized and the extent to which children
bully others, aids examination of the construct validity of OBVQ.

Method
Instrument derivation and preliminary validation
The present study utilized a Greek version of the Revised Olweus Bully/Victim
Questionnaire (OBVQ). The quality of the data that have been collected, therefore, is
strongly dependent upon the process of translating the OBVQ into Greek (Van de Vijver
& Hambleton, 1996). The following methods were used to ensure translation into the
Greek language was appropriate and to examine the extent to which the Greek version
of the OBVQ provided a valid measure of pupils’ perceptions towards bullying. First,
two members of the research team (LK, CK) conducted a translation from English to
Greek and then a research colleague who was not aware of the OBVQ was asked to
translate the Greek version back to English. It was found that the new English version of
OBVQ, which derived from translating the Greek version back to English, was identical
to the original version of OBVQ in meaning for all but 3 items where small adjustments
were made to correct the observed discrepancies.
Second, one lecturer of educational psychology, two postgraduate students, and two
primary teachers, selected on the basis of their familiarity with the problem of bullying
in schools, evaluated the face validity of the instrument. In the light of their comments,
minor amendments were made, particularly where the structure used was not easily
comprehensible or terms that had been used were seen as not familiar to primary pupils.
The final Greek version of the OBVQ met the satisfaction of each of the five judges. Once
this process was complete, the whole procedure was repeated with a sixth judge who
had not seen the questionnaire before. The outcome served to validate the version
finally used to gather data. (The Greek version of the OBVQ is available upon request
from the first author).

Sample
The study was conducted in Cyprus, a comparatively large island (i.e. the third largest
island in the Mediterranean Sea) but a small country. One of the main characteristics of the
educational system in Cyprus is that its administration is centralized and both primary
and secondary schools are considered as government, and not as community,
institutions. The maintenance of the centralized system has historical and political
origins; in addition, a decentralized system in a small country like Cyprus would be very
demanding in personnel (Kyriakides, 1996). With 380 primary schools and 120
secondary schools, it has the same administrative range as a large local educational
authority in England. It is also much smaller than an administrative region for education
in France. Pre-primary, primary, and secondary education are under the authority of the
Ministry of Education and Culture, which is responsible for the educational policy
making, the administration of education and the enforcement of educational laws.
Permission for the study was thus granted by the Ministry of Education and Culture.
Ethical guidelines of the British Psychological Society regarding research with children
were followed. Seven primary schools in Nicosia were selected to form a stratified
sample where pupils were approximately equally represented by middle-class, working-class,
and rural families. In Cyprus, primary schools provide 6 years of compulsory
schooling for children from 6 to 12 years. For practical reasons, all the Year 6 students
(N = 335) from each class (N = 21) of the school sample were chosen. This focus
provides a reasonable sample of children at the end of their primary education. Thus,
the total sample comprised 160 girls (47.8%) and 175 boys (52.2%). There was no
statistically significant difference between the research sample and the population in
terms of students’ gender (χ² = 1.29, df = 1, p < .38). Moreover, the mean age of the
sample was 11.9 years (range 11.1–12.8 years).

Procedure
At the end of the second term of the school year, the OBVQ was administered to each
class by one of the two Greek-speaking members of the research team over a 40 minute
session. Students were asked to reflect upon their experiences during the last four
months in their school.

Measurement and measurement model


Taken individually, 8 items of the OBVQ can be used to interpret the responses with
respect to the extent to which pupils are victims of bullying (Items 6 to 13) whereas a
second set of 8 items refers to the extent to which pupils initiate an act of bullying
against other children (Items 26 to 33). It is, however, important to examine whether
performance on each of these two sets of items could be reducible to a scale that enables
the specification of a hierarchy of item difficulty. The Rasch model is appropriate for the
specification of this scale because it enables researchers to test the extent to which the
data meet the requirement that both students’ performances on each set of items of
OBVQ and the difficulties of the relevant items form a stable sequence (within
probabilistic constraints) along a single continuum (Bond & Fox, 2001). Because the
Rasch model converts ordinal data into interval data, it also makes it possible to make
statements about the relative difficulty of OBVQ items and investigate its construct
validity (Bond, 2003).
The Rasch model is based on the assumption that the difference between item
difficulty and person ability should govern the probability of any person being
successful on any particular item. For example, the simplest member of the Rasch family
of models, the dichotomous model, predicts the conditional probability of a binary
outcome (correct/incorrect), given the person’s ability and the item’s difficulty.
Specifically, the probability of a correct response is a logistic function of the difference
between the ability of the person and the difficulty of the item. This S-shaped function
transforms any value of the real line into a value between 0 and 1. The rating scale model
(Andersen, 1977; Andrich, 1978a, 1978b; Rasch, 1980; Wright, 1985) is an extension of
the dichotomous model to the case in which items have more than two response
categories and was therefore used to analyse the data that emerged from pupils’
responses to each set of items of OBVQ. Since each item of OBVQ has five response
choices (‘it happened to me several times a week’, ‘it happened to me once a week’, ‘it
happened to me 2 to 3 times a month’, ‘it happened to me only once or twice in the last
2 months’, ‘it hasn’t happened to me in the last 2 months’), it can be modelled as having
four thresholds. Each threshold has its own difficulty estimate, and this estimate is
modelled as the threshold at which a person has a 50% chance of choosing one category
over another (Andersen, 1977). These thresholds are calculated in log odds (otherwise
called logits) and should be ordered to represent decreasing probability of each event
occurring. Thresholds that do not increase monotonically are considered disordered
(Andrich, 1978b). The magnitudes of the distances between the threshold estimates are
also important. Threshold distances should indicate that each step defines a distinct
position on the variable and thereby they should be neither too close together nor too
far apart on the logit scale (Bond & Fox, 2001). Specifically, guidelines indicate that
thresholds should increase by at least 1.4 logits (i.e. to show distinction between
categories) but no more than 5 logits (i.e. to avoid large gaps in the variable; Linacre, 1999).
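The dichotomous and rating scale models described above can be illustrated with a minimal sketch. It is not the Quest implementation used in the study; the function names and the example threshold values are hypothetical, and the 1.4 and 5 logit bounds are simply the guideline values quoted in the text.

```python
import math


def rasch_probability(ability, difficulty):
    """Dichotomous Rasch model: logistic function of (ability - difficulty) in logits."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def rating_scale_probabilities(ability, difficulty, thresholds):
    """Rating scale model: probabilities of categories 0..m for one person and one item.

    `thresholds` holds the m threshold parameters shared by all items
    (four thresholds for five response categories).
    """
    # Category k has exponent sum_{j<=k} (ability - difficulty - F_j); category 0 has 0.
    exponents = [0.0]
    for f in thresholds:
        exponents.append(exponents[-1] + (ability - difficulty - f))
    peak = max(exponents)  # subtracting the maximum avoids overflow in exp()
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def check_thresholds(thresholds, min_gap=1.4, max_gap=5.0):
    """Return (ordered, gaps_ok) for threshold estimates expressed in logits."""
    gaps = [b - a for a, b in zip(thresholds, thresholds[1:])]
    ordered = all(g > 0 for g in gaps)
    gaps_ok = all(min_gap <= g <= max_gap for g in gaps)
    return ordered, gaps_ok


# Hypothetical threshold estimates for illustration only (not values from the study).
example_thresholds = [-2.4, -0.8, 0.7, 2.5]
probs = rating_scale_probabilities(0.0, 0.0, example_thresholds)
ordered, gaps_ok = check_thresholds(example_thresholds)
```

When a person's measure minus the item difficulty equals a threshold value, the two adjacent categories are equally probable, which is the 50% interpretation of a threshold given above.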
The data were analysed by using the computer programme Quest (Adams & Khoo,
1996) to create two relevant scales, based on the log odds of pupils’ opinions about the
extent to which they are either being bullied (Scale A) or they bully other children (Scale
B). The items are ordered along each scale at interval measurement level from those
which refer to acts of bullying which often happen in schools (negative logit values) to
those which rarely occur (positive logit values). The latter are most likely to be answered
as happening often in the school only by pupils who are most likely to be victimized
(Scale A) or bully others (Scale B).
The model fit statistics are the (a) infit (weighted) and (b) outfit (unweighted) mean
square statistics. Fit statistics are used to assess whether a given person’s performance
(or a given item) is consistent with other persons’ performances (or items) and are
based on the differences between the expected and observed performances. Outfit
statistics are based solely on the difference between observed and expected scores
whereas in calculating infit statistics extreme persons or items are downweighted.
All weighted (i.e. infit) statistics in the Rasch model actually increase the weight of
targeted responses. It is customary for items to be considered to fit the Rasch model if
they have item infit within the range of 0.77–1.30 (Adams & Khoo, 1996), although
many researchers recommend a more restricted range of 0.83–1.20 (Keeves &
Alagumalai, 1999). In the examination of the person statistics for fit to the Rasch model,
the outfit mean square statistic is considered to provide more useful information than the infit,
because a person’s performances on both the easier and the harder items are taken into
equal consideration (Andrich, 1988). Any marked difference between the calculated
values for the outfit and the infit statistics is highly informative, since it indicates a
tendency for a different pattern of responding to easier or harder items, when compared
with items at the centre of the scale. Moreover, the fit statistics can be approximately
normalized using the Wilson–Hilferty transformation. The normalized statistics, infit t
and outfit t, have a mean near zero and a standard deviation near one when the data
conform to the measurement model.
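The contrast between the two mean squares can be shown with a small sketch for the simpler dichotomous case. It assumes the expected probabilities have already been estimated; the function names are hypothetical, and `q` (the model standard deviation of the mean square, required by the Wilson–Hilferty transformation) is taken as a given input rather than computed here.

```python
def fit_mean_squares(observed, expected):
    """Infit and outfit mean squares for dichotomous (0/1) responses.

    observed: list of 0/1 responses; expected: model probabilities of a 1.
    """
    variances = [p * (1.0 - p) for p in expected]
    sq_resid = [(x - p) ** 2 for x, p in zip(observed, expected)]
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / w for r, w in zip(sq_resid, variances)) / len(observed)
    # Infit: squared residuals weighted by model variance, so well-targeted
    # responses count more and extreme (off-target) ones count less.
    infit = sum(sq_resid) / sum(variances)
    return infit, outfit


def wilson_hilferty_t(mean_square, q):
    """Approximately normalized fit statistic from a mean square and its model SD q."""
    return (mean_square ** (1.0 / 3.0) - 1.0) * (3.0 / q) + q / 3.0


# One well-targeted response (p = 0.5) and one unexpected success on an
# off-target item (p = 0.1); the values are illustrative only.
infit, outfit = fit_mean_squares([1, 1], [0.5, 0.1])
```

In this toy case the outfit mean square is larger than the infit mean square, because the unexpected response sits on an off-target item and is downweighted in the infit calculation.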
Finally, it is important to note that the general form of the rating scale model
expresses the probability of any person choosing any given category on any item as a
function of the agreeability of person n (B_n) and the endorsability of the entire item i
(D_i) at the given threshold k (F_k). The natural log of the odds of this probability results in
the direct comparison between a person’s ability and the difficulty of threshold k on
item i; that is, ln(P_nik / P_ni(k-1)) = B_n - D_i - F_k. This ability of the Rasch model to
compare persons and items directly allows the
creation of person-free measures and item-free calibrations. This characteristic,
parameter separation, is unique to the Rasch model, and holds for the entire family of
Rasch models. Andrich (1988) shows that the Rasch model provides item scale estimates
that are free of the distribution of locations of persons, provided the model holds.
Thus, specific objectivity, in Rasch’s terms, has given rise to expressions such as ‘sample-free’
or ‘population-free’ (Wright & Masters, 1981). However, this terminology can be
confusing and this is especially true since whether the model holds across different
classes of persons is an empirical question (Hambleton, Swaminathan, & Rogers, 1991).
For this reason, it was decided to examine whether each scale of the OBVQ was used
consistently by each group of our sample. Thus, separate analyses were conducted of
the responses of each group and the estimated difficulties of each item, which emerged
from each analysis, were compared in order to see whether the difficulties of the items
are invariant across the groups of our sample.
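One common way to compare item difficulties from two separate calibrations is a standardized difference per item. The sketch below illustrates that general idea and is not necessarily the comparison carried out in this study; the function names, the difficulty estimates, the standard errors, and the 1.96 cut-off are all assumptions made for illustration.

```python
import math


def compare_calibrations(diff_a, se_a, diff_b, se_b):
    """Standardized differences between item difficulties from two calibrations."""
    # Separate calibrations have arbitrary origins, so centre each set on its
    # own mean item difficulty before comparing.
    mean_a = sum(diff_a) / len(diff_a)
    mean_b = sum(diff_b) / len(diff_b)
    z_values = []
    for da, sa, db, sb in zip(diff_a, se_a, diff_b, se_b):
        joint_se = math.sqrt(sa ** 2 + sb ** 2)
        z_values.append(((da - mean_a) - (db - mean_b)) / joint_se)
    return z_values


def flag_items(z_values, critical=1.96):
    """Indices of items whose centred difficulties differ by more than the cut-off."""
    return [i for i, z in enumerate(z_values) if abs(z) > critical]


# Hypothetical estimates in logits for three items (not values from the study).
boys = [-1.0, 0.0, 1.0]
girls = [-0.9, -0.9, 1.8]
standard_errors = [0.15, 0.15, 0.15]
z = compare_calibrations(boys, standard_errors, girls, standard_errors)
flagged = flag_items(z)
```

Centring on the mean item difficulty is itself a design choice: an item that behaves very differently in one group shifts that group's mean, so flagged items are usually inspected individually rather than read off mechanically.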

Data analysis
For each scale, the data were analysed initially with the whole sample (N = 335) and all
its 8 items together. There was no item that did not fit the model, and the analyses
therefore enabled the testing of the meaning, targeting, validity, and reliability of the
OBVQ. Subsequently, the analyses were repeated with each of the two groups of the
sample (N = 160 girls, N = 175 boys) separately to test whether the instrument is used
consistently by boys and girls. Specifically, a final step in investigating the quality of a
new measure is to compare the estimates across two or more distinct groups of interest
to examine whether the items have significantly different meanings for the different
groups. This is called differential item functioning (DIF) and models the invariance of
item difficulty estimates by comparing items across two or more samples (see Bond,
2003; Swaminathan & Rogers, 1990). Finally, by taking into account the item difficulties
derived from the analyses of the whole sample, the procedure for detecting pattern
clustering in measurement designs developed by Marcoulides and Drezner (1999) was
used in order to examine whether the various acts of bullying could be classified and
whether the three forms of bullying specified for the questionnaire could be established.
This procedure enables segmentation of the observed measurements into constituent
groups (or clusters) so that the members of any one group are similar to each other
according to some selected criterion.

Results
Model fit
Scale A: ‘Being victimized’
Figure 1 illustrates the scale for the 8 items of the OBVQ concerning the extent to which
pupils are being victimized. Both item difficulties and pupils’ measures are calibrated on
the same scale. Figure 1 reveals that the items have a good fit to the measurement model,
indicating strong mutual consistency in the responses of the 335 pupils located at
different positions on the scale, across all 8 items. Moreover, the items are well targeted
against the pupils’ measures since pupils’ scores range from −2.16 to 3.09 logits,
whereas the item difficulties range from −2.08 to 3.04 logits. However, the targeting of
the items measuring the extent to which Cypriot pupils are being victimized could be
improved if items that are relatively difficult (i.e. their difficulties range from 0.60 to 2.50
logits) were included.

Scale B: ‘Bullying others’


Figure 2 illustrates the scale for the 8 items of the OBVQ concerning the extent to which
pupils express bullying behaviour against other pupils. Both item difficulties and pupil
measures are calibrated on the same scale. The items of this dimension also have a good
fit to the measurement model and are well targeted against the pupils’ measures since
pupils’ scores range from −2.08 to 3.03 logits whereas the item difficulties range from
−1.97 to 3.05 logits. However, the targeting of the items measuring the extent to
which Cypriot pupils express bullying behaviour against others could be improved
by adding items which are relatively difficult (i.e. their difficulties range from 0.50 to
2.60 logits).
For the sake of brevity, the item threshold values are not presented either in Figure 1
or in Figure 2. However, they are ordered from low to high, indicating that the pupils
answered consistently with the ordered response format of ‘it happened to me several
times a week’, ‘it happened to me once a week’, ‘it happened to me 2 to 3 times a
month’, ‘it happened to me only once or twice in the last 2 months’, ‘it hasn’t happened
to me in the last 2 months’. Moreover, the threshold distances range from 1.8 to 3.4
logits.

Figure 1. Scale for the dimension ‘Being Victimized’ of the Greek version of OBVQ.

Table 1 provides a summary of the statistics of each scale for the whole sample and
the two subgroups (boys and girls) separately. Reliability is calculated by the Item
Separation Index and the Person Separation Index. Separation indices represent the
proportion of the observed variance considered to be true. A value of 1 represents high
separability in which errors are low and item difficulties and pupil measures are well
separated along the scale (Wright & Masters, 1981). Table 1 reveals that for the whole
sample and for each group the indices of cases and item separation (i.e. reliability) are
higher than 0.80, indicating that the separability of each scale is relatively satisfactory.
However, this should be improved since a reliability of 0.90 or higher is sought for an
excellent scale (Wright, 1985). Moreover, for each scale, the infit mean squares and the
outfit mean squares are approximately 1 and the values of the infit t scores and the
outfit t scores are approximately zero. Looking at the actual values of the infit and outfit
of each item, one can identify that all items have item infit within the range 0.85–1.20,
and item outfit within the range 0.74–1.42. In addition, it was found that all the values
of infit t for both students and items are greater than −2.00 and smaller than 2.00. This
implies that in each analysis, there is a good fit to the Rasch model.

Figure 2. Scale for the dimension ‘Bullying Others’ of the Greek version of OBVQ.

Table 1. Statistics relating to each of the two scales of the OBVQ questionnaire for the whole sample
and the two groups

                                    Whole group       Boys (N = 175)    Girls (N = 160)
Statistic                           Scale A  Scale B  Scale A  Scale B  Scale A  Scale B
Mean (items)                         0.00     0.00     0.00     0.00     0.00     0.00
Mean (persons)                      −0.06    −0.09     0.41     0.46    −0.57    −0.32
Standard deviation (items)           2.01     1.98     2.04     1.96     2.07     2.02
Standard deviation (persons)         1.13     1.04     1.01     1.02     1.05     1.00
Reliability (items)                  0.92     0.91     0.86     0.88     0.90     0.87
Reliability (persons)                0.87     0.84     0.82     0.81     0.81     0.82
Mean Infit mean square (items)       1.00     1.01     1.02     1.00     1.00     1.01
Mean Infit mean square (persons)     1.00     1.00     1.00     1.01     1.01     1.02
Mean Outfit mean square (items)      1.00     1.00     1.00     1.00     1.01     1.02
Mean Outfit mean square (persons)    1.01     1.00     1.02     1.00     1.01     1.00
Infit t (items)                      0.00     0.02     0.03     0.01     0.01     0.02
Infit t (persons)                   −0.01     0.00     0.01     0.02     0.02     0.01
Outfit t (items)                     0.02     0.00     0.03     0.02     0.01     0.01
Outfit t (persons)                   0.00    −0.03     0.01     0.04    −0.01     0.03

For both Scales A and B, the item difficulties are calibrated with reasonably small
errors (0.10 or smaller for the whole sample). Although each scale consists of only 8
items, the errors of the person estimates are relatively small (i.e. smaller than 0.28).
Table 2 illustrates item difficulties in logits for the whole sample and for boys and girls
separately. All the items of both scales have difficulties that could be considered
invariant across the two groups, within measurement error. Specifically, the items of the
two scales were estimated for each sample separately and the item calibrations were
plotted against each other (see Figures 3 and 4). The model for invariance of item
estimates is represented by a straight line with a slope equal to 1 through the mean item
difficulty estimates from each scale (i.e. 0 logits for each). Control lines show the items
that do not display invariance, within the boundaries of measurement error, across the
person samples (see Wright & Masters, 1982). Figures 3 and 4 reveal that all of the items
of Scales A and B display invariance within the measurement errors, because they are
located within the two control lines. This implies that for Scales A and B, the Rasch
model provided item estimates that are free of the distribution of locations of persons.
It can therefore be claimed that the two main dimensions of the OBVQ have satisfactory
psychometric properties.
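The invariance check described above can be approximated numerically. The sketch below is an assumption about one common way to do this, not necessarily the computation behind the control lines in Figures 3 and 4: for each item, the difference between the girls' and boys' difficulty estimates is divided by the combined standard error, and items whose standardized difference exceeds 1.96 in absolute value would be flagged. The values are the Scale A rows of Table 2; the function names and the 1.96 cut-off are illustrative choices.

```python
import math

# (item number, girls' difficulty, girls' error, boys' difficulty, boys' error),
# copied from the Scale A rows of Table 2.
scale_a = [
    (8, 3.03, 0.09, 3.06, 0.11),
    (10, 2.89, 0.09, 2.86, 0.08),
    (9, 0.63, 0.09, 0.49, 0.10),
    (7, 0.28, 0.12, 0.18, 0.11),
    (6, -1.39, 0.09, -1.32, 0.10),
    (11, -1.66, 0.12, -1.59, 0.10),
    (13, -1.73, 0.10, -1.72, 0.08),
    (12, -2.10, 0.09, -2.04, 0.08),
]

def standardized_difference(d1, se1, d2, se2):
    """Difference between two difficulty estimates in units of their joint standard error."""
    return (d1 - d2) / math.sqrt(se1 ** 2 + se2 ** 2)

def flag_dif(rows, critical=1.96):
    """Return (item, z) pairs whose standardized difference exceeds the critical value."""
    flagged = []
    for item, d_girls, se_girls, d_boys, se_boys in rows:
        z = standardized_difference(d_girls, se_girls, d_boys, se_boys)
        if abs(z) > critical:
            flagged.append((item, round(z, 2)))
    return flagged

flagged_items = flag_dif(scale_a)  # an empty list means no item shows DIF under this rule
```

Under these assumptions no Scale A item is flagged, which agrees with the conclusion drawn from the control lines.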

Internal consistency
Because the first part of the questionnaire refers to the initiation of an act of bullying
against the child who is answering the questionnaire (Scale A), whereas the second part
refers to the expression of bullying behaviour against others by this child (Scale B) and
two relevant Rasch scales emerged from each part of the questionnaire, it was expected
that a negative correlation would be identified between pupils’ scores on each scale.
The Pearson correlation coefficient for the relationship between pupils’ measures on
the two scales was statistically significant and negative (r = −0.78, N = 335, p < .001)
and indicates that there is consistency in the responses of pupils to the two parts of the
questionnaire.

Table 2. Difficulties of items (in logits) of each scale and their errors for the whole sample and for each
of the two groups

No.  Item                                                                     All (N = 335)     Girls (N = 165)   Boys (N = 170)

Scale A: Being victimized
8    Hit, kicked, pushed, shoved around, or locked indoors                     3.04 (0.10)       3.03 (0.09)       3.06 (0.11)
10   Money or other things taken away from me or destroyed                     2.88 (0.07)       2.89 (0.09)       2.86 (0.08)
9    Other students told lies about me or tried to make others dislike me      0.58 (0.09)       0.63 (0.09)       0.49 (0.10)
7    Left out of things, excluded, or ignored                                  0.19 (0.10)       0.28 (0.12)       0.18 (0.11)
6    Called mean names, made fun of, or teased in a hurtful way               −1.34 (0.09)      −1.39 (0.09)      −1.32 (0.10)
11   Threatened to do things I didn’t want to                                 −1.56 (0.08)      −1.66 (0.12)      −1.59 (0.10)
13   Bullied with mean names with a sexual meaning                            −1.75 (0.07)      −1.73 (0.10)      −1.72 (0.08)
12   Bullied with mean names about my race or colour                          −2.08 (0.08)      −2.10 (0.09)      −2.04 (0.08)

Scale B: Bullying others
30   I took money or other things from them or damaged their belongings        3.05 (0.08)       3.09 (0.08)       3.02 (0.09)
28   I hit, kicked, pushed, and shoved them around or locked them indoors      2.77 (0.07)       2.85 (0.08)       2.79 (0.10)
29   I spread false rumours about them and tried to make others dislike them   0.44 (0.09)       0.46 (0.12)       0.41 (0.08)
27   I kept them out of things, excluded, or ignored them                      0.06 (0.07)       0.12 (0.09)       0.04 (0.08)
31   I threatened or forced them to do things they didn’t want to             −1.25 (0.09)      −1.23 (0.10)      −1.20 (0.08)
32   I bullied them with mean names about their race or colour                −1.42 (0.10)      −1.44 (0.12)      −1.40 (0.11)
26   I called them mean names, made fun of or teased in a hurtful way         −1.68 (0.08)      −1.74 (0.09)      −1.65 (0.07)
33   I bullied him or her with mean names with a sexual meaning               −1.97 (0.09)      −2.01 (0.08)      −1.94 (0.09)

Pupils’ consistency in answering the two parts of the questionnaire was also
explored by examining the relationship between the difficulties of items on each scale
referring to the same negative act. Specifically, for each negative action, the two item
difficulties that emerged from analysing pupil responses to Scales A and B were taken
into account, and the extent to which there is a relation between these two estimates of
their difficulty was examined. The Pearson correlation coefficient comparing the
difficulties of items from Scale A with the relevant item difficulties of Scale B revealed a
very strong statistically significant positive relationship (r = .98, N = 8, p < .001),
indicating a high level of internal consistency as the extent to which the children
mentioned being victimized in a specific way correlated with the extent to which they
mentioned bullying others in this way.
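The item-level correlation reported above can be re-derived from Table 2. The following Python sketch pairs each Scale A item with the Scale B item describing the same negative act and computes the Pearson coefficient by hand. The pairing is an assumption based on the item wording, and the function name is illustrative; with the rounded whole-sample difficulties the result should land close to the reported value.

```python
import math

# Whole-sample item difficulties (logits) from Table 2, paired by negative act:
# (Scale A item, Scale A difficulty, Scale B item, Scale B difficulty)
paired = [
    (8, 3.04, 28, 2.77),     # hitting, kicking, pushing
    (10, 2.88, 30, 3.05),    # taking or damaging belongings
    (9, 0.58, 29, 0.44),     # lies or false rumours
    (7, 0.19, 27, 0.06),     # exclusion
    (6, -1.34, 26, -1.68),   # mean names, teasing
    (11, -1.56, 31, -1.25),  # threats
    (13, -1.75, 33, -1.97),  # names with a sexual meaning
    (12, -2.08, 32, -1.42),  # names about race or colour
]

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

scale_a = [row[1] for row in paired]
scale_b = [row[3] for row in paired]
r = pearson_r(scale_a, scale_b)
```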

Figure 3. Differential item functioning of Scale A (Boys vs. Girls).

Figure 4. Differential item functioning of Scale B (Boys vs. Girls).

Items concerning the three forms of bullying


The procedure for detecting pattern clustering in measurement designs developed by
Marcoulides and Drezner (1999), which avoids many of the pitfalls of other algorithms
(Manly, 1994), was used to segment the observed measurements of the difficulties of the
8 items of the OBVQ into groups which were similar to each other according to their
difficulty level. This allowed examination of the relative occurrence of the three types of
bullying (verbal, physical, indirect) on Scale A. Information regarding this method of
clustering is provided in Note 1. Applying this method to segment the 8 items of Scale A
on the basis of their item difficulties emerging from the Rasch model, it was found that
the 8 items of Scale A are optimally clustered into three clusters or levels (see Table 3).
The cumulative D for the 3-cluster solution was 82% and the fourth gap adds only 6%.
Thus, three levels of negative acts can be identified which are identical to the three
forms of bullying mentioned in the specification of the OBVQ. More specifically, the first
level (i.e. below −1.30 logits) refers to verbal forms of bullying. The indirect forms of
bullying belong to the second level (0.00 up to 0.60) and, finally, the third level (higher
than 2.50 logits) refers to the physical forms of bullying. The analysis was repeated for
item difficulties on Scale B with comparable results (see Table 3). Specifically, the
cumulative D for the 3-cluster solution was 81% and the fourth gap adds only 6%. This
implies that there is a strong consistency both in the responses of pupils to the two parts
of the questionnaire and in the responses of pupils to items concerning comparable
forms of bullying behaviour. Finally, the above procedure for detecting pattern
clustering in measurement designs for each scale was conducted based on the item
difficulties that were estimated for boys and girls separately. For each sample, the results
of this analysis provided support to the existence of the above three groups of items
reflecting the three forms of bullying, providing strong evidence for the relative
prevalence of verbal, indirect and physical bullying.

Table 3. Identification of the clusters of the 8 items of each scale based on the Marcoulides and
Drezner procedure

Item   V      S      Sorted S   D      Sorted D   Cum D

Scale A: Being victimized
12    −2.08   0.00   0.00       0.06   0.45       0.45
13    −1.75   0.06   0.06       0.04   0.29       0.74
11    −1.56   0.10   0.10       0.05   0.08       0.82
6     −1.34   0.15   0.15       0.29   0.06       0.88
7      0.19   0.44   0.44       0.08   0.05       0.93
9      0.58   0.52   0.52       0.45   0.04       0.97
10     2.88   0.97   0.97       0.03   0.03       1.00
8      3.04   1.00   1.00

Scale B: Bullying others
33    −1.97   0.00   0.00       0.06   0.47       0.47
26    −1.68   0.06   0.06       0.05   0.26       0.73
32    −1.42   0.11   0.11       0.03   0.08       0.81
31    −1.25   0.14   0.14       0.26   0.06       0.87
27     0.06   0.40   0.40       0.08   0.05       0.92
29     0.44   0.48   0.48       0.47   0.05       0.97
28     2.77   0.95   0.95       0.05   0.03       1.00
30     3.05   1.00   1.00

Note. The meaning of each item is shown in Table 2. The meaning of each symbol is given in the Appendix.
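The columns of Table 3 can be approximately reproduced with a short Python sketch. This is a simplified reading inferred from the table itself, not the Marcoulides and Drezner (1999) algorithm as published: it assumes that S is the item difficulty rescaled to run from 0 to 1, that D is the gap between neighbouring values of S, and that clusters are formed by cutting at the largest gaps. The function names are hypothetical, and small rounding differences from the printed table are to be expected because the inputs are already rounded.

```python
def gap_table(difficulties):
    """Return sorted values, rescaled positions S, gaps D, sorted gaps, and cumulative gaps."""
    values = sorted(difficulties)
    span = values[-1] - values[0]
    s = [(v - values[0]) / span for v in values]
    gaps = [s[i + 1] - s[i] for i in range(len(s) - 1)]
    sorted_gaps = sorted(gaps, reverse=True)
    cumulative = []
    running = 0.0
    for gap in sorted_gaps:
        running += gap
        cumulative.append(running)
    return values, s, gaps, sorted_gaps, cumulative

def split_at_largest_gaps(difficulties, n_gaps):
    """Cut the ordered difficulties at the n_gaps largest gaps and return the clusters."""
    values, _, gaps, _, _ = gap_table(difficulties)
    largest = sorted(range(len(gaps)), key=lambda i: gaps[i], reverse=True)[:n_gaps]
    clusters, start = [], 0
    for i in sorted(largest):
        clusters.append(values[start:i + 1])
        start = i + 1
    clusters.append(values[start:])
    return clusters

# Scale A whole-sample difficulties (column V of Table 3).
scale_a = [-2.08, -1.75, -1.56, -1.34, 0.19, 0.58, 2.88, 3.04]
clusters_a = split_at_largest_gaps(scale_a, n_gaps=2)
```

Cutting Scale A at its two largest gaps separates the four items below −1.30 logits, the two items between 0.00 and 0.60 logits, and the two items above 2.50 logits, which correspond to the verbal, indirect, and physical levels described in the text.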

Gender differences
Comparison of the scores of boys and girls on Scales A and B revealed that boys were
more exposed to bullying than girls (t = 8.7, df = 333, p < .001) and used bullying
more than girls (t = 7.1, df = 333, p < .001), with large effect sizes of d = 0.92 and
0.76, respectively (Cohen, 1988).
By comparing the scores of boys and girls on the three relevant subscales of Scale B
(i.e. ‘bullying others’), it was shown that the boys who participated in this project used
more physical bullying (Person scale: M = 0.34, SD = 0.91) compared with girls
(M = −0.41, SD = 0.87), t = 7.72, df = 333, p < .001. Conversely, girls used more
indirect forms of bullying (M = 0.65, SD = 0.98) than boys (M = −0.18, SD = 0.94),
t = 7.93, df = 333, p < .001. In each case, the effect sizes (d = 0.94 and 0.92,
respectively) were large. However, the estimated difficulties of all the items concerning
verbal bullying, which emerged from analysing the responses of boys and girls
separately, were smaller than the difficulties of any other item. This implies that verbal
bullying is used more frequently than other forms of bullying by both boys and girls.
Nevertheless, it should be acknowledged that the number of items for each scale and
their subscales is very small and the methodological limitations of our attempt to
estimate gender differences should be taken into account.
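The subscale comparisons above can be checked from the reported summary statistics. The sketch below assumes an independent-samples Student t-test with pooled variance and takes the group sizes from Table 1 (175 boys, 160 girls); both are assumptions, since the exact test variant is not stated. The function name is illustrative. With the rounded means and standard deviations, the resulting t values should come out close to, but not exactly equal to, the reported ones.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic with pooled variance, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

# Physical bullying: boys (M = 0.34, SD = 0.91) versus girls (M = -0.41, SD = 0.87).
t_physical, df_physical = pooled_t(0.34, 0.91, 175, -0.41, 0.87, 160)

# Indirect bullying: girls (M = 0.65, SD = 0.98) versus boys (M = -0.18, SD = 0.94).
t_indirect, df_indirect = pooled_t(0.65, 0.98, 160, -0.18, 0.94, 175)
```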

Discussion
The present study addresses an important element in the investigation of bullying,
namely the development of a psychometrically appropriate instrument (the OBVQ).
The Rasch model was found to be useful in creating interval-level measures of the two
main dimensions of the OBVQ measuring the extent to which children are being bullied
(Scale A) or bully others (Scale B), and for investigating the validity and reliability of each
scale. The findings reveal that the Rasch analysis supports the conceptual design of the
instrument. The underlying trait, negative acts considered as bullying, seems to be an
overarching concept comprising three main forms of bullying: verbal, indirect, and
physical.
By comparing the difficulties of the items of the two scales measuring the extent to
which the same negative activity occurs in the school, a very high correlation was found
which reveals a high internal consistency in pupils’ responses to the questionnaire.
Moreover, by using the procedure for detecting pattern clustering in measurement
designs developed by Marcoulides and Drezner (1999) the observed measurements of
the difficulties of the OBVQ items dealing with negative acts clustered into groups that
were similar to each other according to their difficulty level. The three groups of items
that emerged were identical to the three forms of bullying expected from the structure
of the OBVQ. The measurement model of each scale places the items of the verbal form
of bullying at the easiest, high frequency, part of the scale (negative item difficulties),
and the items of physical form of bullying at the difficult, low frequency, part of the scale
(positive item difficulties), as would be expected. Items of the indirect form of bullying
were intermediate.
The internal consistency of pupils’ responses to the OBVQ was also demonstrated by
comparing the scores of each pupil on the two scales. A statistically significant negative
correlation was identified between the part of the questionnaire that refers to the
initiation of an act of bullying against the child who is answering the questionnaire, and
the part that refers to the expression of bullying behaviour against others by this child.
The present study adds to the validation of the OBVQ as a useful resource for
investigating the phenomenon of bullying in schools, and hence for forming the basis of
introducing appropriate intervention programmes. This instrument could also be used
in order to measure the effectiveness of such programmes by adopting relevant value-
added techniques (Kyriakides, 2002). However, further research is needed to examine
the psychometric properties of the questionnaire for conducting value-added analysis
and identify factors which make the intervention programmes more effective in solving
the problem of bullying. Adaptation of the OBVQ to meet the constraints of another
language (in this case, Greek) and the examination of its validity provide an opportunity
to examine systematically the potential validity of OBVQ as a psychometrically valid
instrument for use in multinational studies. Although a response to this issue is partly
provided by the findings of this study, the administration of the instrument in a variety of
contexts and the examination of its psychometric properties using the Rasch model are
also required to determine the generalizable value of the OBVQ. Further research should
also attempt to demonstrate item invariance across the different versions of OBVQ
(e.g. the English vs. the Greek version).
Despite the many cultural differences in the expression of the phenomenon in
different countries, this project revealed that there are similar features across different
cultures. This study has confirmed the gender effects found in research in other
countries. Firstly, boys were more exposed to and used more bullying than girls. Similar
findings are reported in studies conducted elsewhere (Byrne, 1992; Carra & Sicot, 1996;
Lagerspetz et al., 1982; Mellor, 1990; Ortega & Mora-Merchan, 1995; Olweus, 1978;
Smith et al., 1999; Vandermissen & Thys, 1993). In addition, the boys in this project used
more physical bullying while girls used more indirect forms of bullying (Crick et al.,
1997; Olweus, 1993; Smith et al., 1999; Vandermissen & Thys, 1993). Finally, verbal
bullying was used more frequently than other forms by both boys and girls (Irish
National Teachers’ Organisation, 1993; Lind & Maxwell, 1996; Rigby & Slee, 1991).
This study helps not only to examine the validity of the OBVQ, but also to suggest
improvements, especially in the targeting of items against student measures through the
addition of items which are relatively difficult (i.e. 0.50 up to 2.50 logits). For example,
the item on taking money or destroying the child’s possessions could be separated into
2 items. In addition, analysis of the content of the OBVQ indicates that,
although there are questions referring to racial bullying, questions on physical
appearance could also usefully be included as adolescents do not all follow the same
pattern of physical development. A similar case could be made for items addressing
academic and athletic performance. The OBVQ includes questions about indirect ways
of bullying, such as ‘I was left out from a group’, but these questions are very general.
More specific items could explore the reasons that may lead to indirect bullying.
For example, academic performance or athletic scores could influence the pupil’s
popularity and self-esteem (Harter, 1999).
Although this project focuses exclusively on the two main dimensions of OBVQ, it
could be argued that questions should be included addressing other factors regarding
bullying in a school. The OBVQ contains questions that prompt the children to give
information about their relationship with their peers, parents and teachers; these could
be enhanced by questions examining the quality of the relationship between the
children and their significant others. In addition, they could give some indications of the
bullies’ or the victims’ self-perceptions in relation to their contact with their teachers,
parents, and peers. Finally, some questions could be added which would address
whether the problem is embedded in the community culture.
The present study has provided support for the validity and reliability of the Revised
Olweus Bully/Victim Questionnaire using Rasch modelling. Furthermore, the use of a
sample of Greek Cypriot pupils provides further evidence of the international
usefulness of the instrument. These results also provide information on the theoretical
constructs underpinning the scale and the nature of bullying, particularly the typology
of verbal, physical, and indirect bullying. The international importance of bullying as a
challenge to schools requires instruments applicable in a variety of countries and
cultures. The present study supports the use of the OBVQ for this purpose, both through
its application to a Greek Cypriot sample and through the examination of its
psychometric properties using Rasch modelling, which provides evidence for its more
general applicability.

References
Adams, R. J., & Khoo, S. T. (1996). Quest: The interactive test analysis system. Camberwell,
Victoria: ACER.
Andersen, E. B. (1977). The logistic model for m answer categories. In W. E. Kempf & B. H. Repp
(Eds.), Mathematical models for social psychology. Vienna, Austria: Hans Huber.
Andrich, D. (1978a). Scaling attitude items constructed and scored in the Likert tradition.
Educational and Psychological Measurement, 38(3), 665–680.
Andrich, D. (1978b). A rating formulation for ordered response categories. Psychometrika, 43(4),
561–573.
Andrich, D. (1988). A general form of Rasch’s extended logistic model for partial credit scoring.
Applied Measurement in Education, 1, 363–378.
Bendixen, M., & Olweus, D. (1999). Measurement of antisocial behaviour in early adolescence and
adolescence: Psychometric properties and substantive findings. Criminal Behaviour and
Mental Health, 9, 323–354.
Besag, V. (1989). Bullies and victims in schools. Milton Keynes, UK: Open University Press.
Bond, T. G. (2003). Validity and assessment: A Rasch measurement perspective. Metodologia de
las Ciencias del Comportamiento, 5(2), 179–194.
Bond, T. G., & Fox, C. M. (2001). Applying the Rasch model: Fundamental measurement in the
human sciences. Mahwah, NJ: Erlbaum.
Boulton, M. J., & Smith, P. K. (1994). Bully/Victim problems among middle school children:
Stability, self-perceived competence and peer acceptance. British Journal of Developmental
Psychology, 12, 315–329.
Byrne, B. (1992). Bullies and victims in a school setting with reference to some Dublin schools.
Dublin: University College.
Carra, C., & Sicot, F. (1996). Pour une diagnostic local de la violence a l’école. Enquête de
victimation dans les collèges du département du Doubs. Besançon, France: Université de
France-Comte, Laboratoire de la sociologie et d’anthropologie.
Charlot, B., & Emin, J. C. (Eds.). (1997). La violence a l’ école: état des savoir. Paris: A. Colin.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). New York:
Academic Press.
Crick, N. R., Casas, J. F., & Mosher, M. (1997). Relational and overt aggression in preschool.
Developmental Psychology, 33, 579–588.
Cronbach, L. J. (1990). Essentials of psychological testing (3rd ed.). New York: Harper & Row.
Cronbach, L., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological
Bulletin, 52, 281–302.
Garcia, I. F., & Perez, G. Q. (1989). Violence, bullying and counselling in the Iberian Peninsula:
Spain. In E. Roland & E. Munthe (Eds.), Bullying: An international perspective. London: David
Fulton.
Genta, M. L., Menesini, E., Fonzi, A., Costabile, A., & Smith, P. K. (1996). Bullies and victims in
schools in southern Italy. International Journal of Educational Research, 11, 97–110.
Haeselager, G. J. T., & Van Lieshout, C. F. M. (1992, September). Social and affective adjustment of
self- and peer-reported victims and bullies. Paper presented at the European Conference of
Developmental Psychology, Seville, Spain.
Hales, S. (1986). Rethinking the business of psychology. Journal for the Theory of Social
Behaviour, 16(1), 57–76.
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response
theory. London: Sage.
Harter, S. (1999). The construction of self: A developmental perspective. New York: Guilford
Press.
Hirano, K. (1992, September). Bullying and victimization in Japanese classrooms. Paper
presented at the European Conference of Developmental Psychology, Seville, Spain.
Irish National Teachers’ Organisation (INTO). (1993). Discipline in the primary school. Dublin:
Author.
Johnson, H. R., Thompson, M. J. J., Wilkinson, S., Walsh, L., Balding, J., & Wright, V. (2002).
Vulnerability to bullying: Teacher-reported conduct and emotional problems, hyperactivity,
peer relationship difficulties and prosocial behaviour in primary school children. Educational
Psychology, 22, 553–556.
Kaloyirou, C. (2002). Teachers’ aspects on bullying in Greek Cypriot state primary schools in
Nicosia. Paper presented at the 7th Conference of Pedagogical Review of Cyprus, University of
Cyprus.
Keeves, J. P., & Alagumalai, S. (1999). New Approaches to Measurement. In G. N. Masters &
J. P. Keeves (Eds.), Advances in measurement in educational research and assessment
(pp. 23–42). Oxford: Pergamon.
Kyriakides, L. (1996). Reforming primary education in Cyprus. Education, 24(2), 3–13, 46–50.
Kyriakides, L. (2002). A research based model for the development of policy on baseline
assessment. British Educational Research Journal, 28, 803–824.
Lagerspetz, K. M., Bjorkqvist, K., Berts, M., & King, E. (1982). Group aggression among school
children in three schools. Scandinavian Journal of Psychology, 23, 45–52.
Linacre, J. M. (1999). Investigating rating scale category utility. Journal of Outcome Measurement,
3(2), 103–122.
Lind, J., & Maxwell, G. (1996). Children’s experience of violence at school. Wellington,
New Zealand: Office of the Commissioner for Children.
Ma, X. (2002). Bullying in Middle school: Individual and school characteristics of victims and
offenders. School Effectiveness and School Improvement, 13, 63–89.
Manly, B. F. J. (1994). Multivariate statistical methods: A primer. New York: Chapman & Hall.
Marcoulides, G., & Drezner, Z. (1999). A procedure for detecting pattern clustering in
measurement designs. In M. Wilson & G. Engelhard, Jr (Eds.), Objective measurement: Theory
into practice (Vol. 5). New Jersey: Ablex.
Mellor, A. (1990). Bullying in Scottish secondary schools. Edinburgh, UK: Scottish Council for
Research in Education.
Messick, S. (1995). Validity of psychological assessment. American Psychologist, 50(9), 741–749.
Monbusho [Ministry of Education]. (1994). The present situation of issues concerning student
tutelage and measures by the Ministry of Education. Tokyo: Author.
Morita, Y. (1985). Sociological study on the structure of bullying group. Osaka, Japan: Osaka City
University, Department of Sociology.
Nakou, I. (2000). Elementary school teachers’ representations regarding school problem
behaviour: Problem children in talk. Educational and Child Psychology, 17, 91–106.
Olweus, D. (1978). Aggression in the schools: Bullies and whipping boys. Washington,
DC: Hemisphere.
Olweus, D. (1991). Bully/victim problems among school children: Basic facts and effects of a
school based intervention programme. In D. Pepler & K. Rubin (Eds.), The development and
treatment of childhood aggression. Hillsdale, NJ: Erlbaum.
Olweus, D. (1993). Bullying at school: What we know and what we can do. Oxford: Blackwell.
Olweus, D. (1994). Annotation: Bullying at school: Basic facts and effects of a school based
intervention program. Journal of Child Psychology and Psychiatry, 35, 1171–1190.
Olweus, D. (1996). The revised Olweus Bully/Victim Questionnaire for Students. Bergen,
Norway: University of Bergen.
Olweus, D. (1997). Bully/victim problems in school: Facts and intervention. European Journal of
Psychology of Education, 12, 495–510.
Ortega, R., & Mora-Merchan, J. A. (1995, August). Bullying in Andalucian adolescence: A study
about the influence of the passage from primary school to secondary school. Paper
presented at the Seventh European Conference on Developmental Psychology, Krakow,
Poland.
Osterman, K., Bjorkqvist, K., Lagerspetz, K. M. J., Kaukiainen, K., Landau, S. F., Frazcek, A., &
Caprara, G. V. (1998). Cross-cultural evidence of female indirect aggression. Aggressive
Behaviour, 24, 1–8.
Overston, W. F. (1999, August). Construct validity: A forgotten concept in psychology? Paper
presented at the annual meeting of the American Psychological Association, Boston.
Pateraki, L., & Houndoumadi, A. (2001). Bullying among primary school children in Athens,
Greece. Educational Psychology, 21(2), 167–175.
Perry, D. G., Kusel, S. J., & Perry, L. C. (1988). Victims of peer aggression. Developmental
Psychology, 24, 807–814.
Rasch, G. (1980). Probabilistic models for some intelligence and attainment tests. Chicago:
University of Chicago Press.
Rigby, K. (1997). Attitudes and beliefs of Australian schoolchildren regarding bullying in schools.
Irish Journal of Psychology, 18, 202–220.
Rigby, K., & Slee, P. (1991). Victims in school communities. Journal of the Australian Society of
Victimology, 25–31.
Ruiz, R. O. (1992, September). Violence in schools. Problems of bullying and victimization in
Spain. Paper presented at the European Conference of Developmental Psychology, Seville,
Spain.
Salmivalli, C., Lagerspetz, K. M. J., Bjorkqvist, K., Osterman, K., & Kaukiainen, A. (1996).
Bullying as a group process: Participant roles and their relations to social status within a group.
Aggressive Behaviour, 22, 1–15.
Sharp, S., & Smith, P. (1994). Tackling bullying in your school: A practical handbook for teachers.
London: Routledge.
Slee, P. T. (1994). Situational and interpersonal correlates of anxiety associated with peer
victimization. Child Psychiatry and Human Development, 25, 97–107.
Smith, E. (2001). Evidence for the reliability of measures and validity of measure interpretation:
A Rasch measurement perspective. Journal of Applied Measurement, 2(3), 281–311.
Smith, P. (1991). The silent nightmare: Bullying and victimization in school peer groups.
The Psychologist, 4, 243–248.
Smith, P., Morita, Y., Junger-Tas, J., Olweus, D., Catalano, R., & Slee, P. (Eds.). (1999). The nature of
school bullying: A cross-national perspective. London: Routledge.
Smith, P., & Sharp, S. (1994). School bullying: Insights and perspectives. London: Routledge.
Smith, R. M., & Miao, C. Y. (1994). Assessing unidimensionality for Rasch measurement.
In M. Wilson (Ed.), Objective measurement: Theory into practice (Vol. 2, pp. 316–327).
Norwood, NJ: Ablex.
Stevens, V., De Bourdeaudhuij, I., & Van Oost, P. (2000). Bullying in Flemish schools: An evaluation
of anti-bullying intervention in primary and secondary schools. British Journal of
Educational Psychology, 70, 195–210.
Swaminathan, H., & Rogers, H. J. (1990). Detecting differential item functioning using logistic
regression procedures. Journal of Educational Measurement, 27, 361–370.
Van de Vijver, F., & Hambleton, R. K. (1996). Translating tests: Some practical guidelines.
European Psychologist, 1, 89–99.
Vandermissen, V., & Thys, L. (1993). Study into school experiences in Flanders: Contacts with
fellow pupils. Caleidoscoop, 4, 4–9.
Whitney, I., & Smith, P. K. (1993). A survey of the nature and extent of bully/victim problems in
junior/middle and secondary schools. Educational Research, 35, 3–25.
Wright, B. D. (1985). Additivity in psychological measurement. In E. E. Roskam (Ed.),
Measurement and personality assessment (pp. 101–112). Amsterdam: Elsevier Science.
Wright, B., & Masters, G. (1981). The measurement of knowledge and attitude (Research
memorandum no. 30). Chicago: University of Chicago, Statistical Laboratory, Department of
Education.
Wright, B., & Masters, G. (1982). Rating scale analysis. Chicago: MESA Press.
Ziegler, S., & Rosenstein-Manner, M. (1991). Bullying at school: Toronto in an international
context (Report No. 196). Toronto: Toronto Board of Education, Research Services.

Received 25 October 2004; revised version received 4 May 2005


Appendix
Suppose that $V_1, V_2, V_3, \ldots, V_N$ are the elements of the observed measurement
vector $V$ that have to be clustered into groups. First, we find the minimum value of the
observed measurements, $V_{\min} = \min\{V_i\}$, and the maximum value,
$V_{\max} = \max\{V_i\}$. Then we standardize the observed measurements using the formula
$S_i = (V_i - V_{\min})/(V_{\max} - V_{\min})$, so that the elements of the vector $S$ lie
between zero and one. Because the relative standing of the elements of $S$ is the same as
that of the elements of $V$, we sort the vector $S$ to obtain $S_{(i)}$ such that
$S_{(i)} \le S_{(i+1)}$. It follows that $S_{(1)} = 0$ and $S_{(N)} = 1$. At the next stage,
we calculate $D_i = S_{(i+1)} - S_{(i)}$ for $i = 1, 2, \ldots, N - 1$. The values of $D_i$
represent the gaps between consecutive values in the sorted vector $S$. Finally, the
vector $D$ is sorted in decreasing order: $D_{(1)}, D_{(2)}, D_{(3)}, D_{(4)}$, and so on.
In this way, the largest term $D_{(1)}$ divides the $N$ points into two clusters separated
by the widest possible gap. When the first $k$ sorted gaps are selected, $k + 1$ clusters
are defined, maximizing the smallest gap between any two clusters. Thus, the number of
clusters (identified in terms of the number of gaps between clusters) can be determined by
examining the contribution (as a percentage) of each $D_i$.
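
The procedure described in this appendix can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `gap_clusters`, the parameter `k` (the number of gaps to cut at) and the example data are hypothetical. How `k` is chosen is left to the analyst. Because the rescaled values run from zero to one, the gaps sum to one, so each gap can be read directly as its share of the total range.

```python
def gap_clusters(values, k):
    """Split `values` into k + 1 clusters at the k largest gaps.

    Hypothetical helper: rescale to [0, 1], sort, compute the gaps
    between consecutive sorted values, and cut at the k widest gaps.
    Returns the clusters and the gaps sorted in decreasing order.
    """
    n = len(values)
    if n < 2 or not 0 <= k < n:
        raise ValueError("need at least two values and 0 <= k < N")
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        raise ValueError("all values are identical; no gaps to examine")

    ordered = sorted(values)
    # S_(i): rescaled, sorted measurements, so S_(1) = 0 and S_(N) = 1
    s = [(v - v_min) / (v_max - v_min) for v in ordered]
    # D_i = S_(i+1) - S_(i): gap between consecutive sorted values
    d = [s[i + 1] - s[i] for i in range(n - 1)]

    # Positions of the gaps, widest first; keep the first k as cut points
    by_width = sorted(range(n - 1), key=lambda i: d[i], reverse=True)
    cuts = sorted(by_width[:k])

    clusters, start = [], 0
    for c in cuts:
        clusters.append(ordered[start:c + 1])
        start = c + 1
    clusters.append(ordered[start:])
    return clusters, [d[i] for i in by_width]


# Illustrative data only (not taken from the study)
clusters, gaps = gap_clusters([1.0, 1.2, 1.1, 5.0, 5.3, 9.0], k=2)
print(clusters)  # [[1.0, 1.1, 1.2], [5.0, 5.3], [9.0]]
```

In this sketch the sorted gaps are returned alongside the clusters so that their relative sizes can be inspected before settling on a value for `k`.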
