
School Effectiveness and School Improvement
Vol. 16, No. 3, September 2005, pp. 249 – 279

School Effectiveness Research: From a review of the criticism to recommendations for further development
Hans Luyten,* Adrie Visscher, and Bob Witziers
University of Twente, The Netherlands

(Received 21 April 2004; accepted 10 December 2004)

School effectiveness research (SER) has flourished since the 1980s. In recent years, however,
various authors have criticised several aspects of SER. A thorough review of recent criticism can
serve as a good starting point for addressing the flaws of SER, where appropriate, thereby
supporting its further development. This article begins by reviewing the criticism from different
perspectives by discussing the political-ideological nature of SER, its theoretical limitations and the
research methodology it applies. The review of each type of criticism is accompanied by a review of
the recommendations that the critics propose for improving SER. We then proceed to present our
views on each line of criticism and propose 5 avenues that we consider promising for the further
development of SER.

Introduction
This article refers to school effectiveness research (SER) as the line of research that
investigates performance differences between and within schools, as well as the
malleable factors that enhance school performance (usually using student achieve-
ment scores to measure the latter). The studies by Edmonds (1979) and by Rutter,
Maughan, Mortimore, and Ouston (1979) are generally considered the starting point
of SER. Much of the early work within the SER tradition had the explicit goal of
refuting the ''schools-don't-make-a-difference'' interpretation that had been
attributed to the research outcomes presented by Coleman et al. (1966) and by
Jencks et al. (1972).

*Corresponding author. Department of Educational Administration, Faculty of
Educational Science and Technology, University of Twente, PO Box 217, 7500 AE
Enschede, The Netherlands. Email: Luyten@edte.utwente.nl
School effectiveness research has flourished since 1979, and it has attracted
considerable political support in several countries. It has become sophisticated in
both data collection and data analysis, and some authors, including Stringfield (1995)
and Scheerens (1997), have sought to connect its empirical findings to economic and
social scientific theory. Publications by Scheerens and Bosker (1997) and by Teddlie
and Reynolds (2000) present the knowledge base of school effectiveness research.
Despite the progress that has been made, some aspects of SER have been criticised
for a number of years. Although much of that criticism has been aired by authors
belonging to the SER community, external critics have raised objections that are even
more fundamental. In our view, the criticism calls for a thorough review. As
Goldstein and Woodhouse (2000, p. 13) state,

. . .if many of its [SER’s] proponents remain superficially defensive and it [SER] ignores
or fails to understand the warnings of its critics, we have very little optimism that it will
survive its present state of adolescent turmoil to emerge into full maturity.

A thorough review can serve as a starting point for defining strategies to address the
valid elements of this criticism, thereby supporting the further development of SER.
This article starts by reviewing recent criticisms of SER expressed by various
authors and by considering their recommendations for improvement. Our goal is not
to present a comprehensive overview of every critique that has been written about
SER since its inception as a line of scientific research. We focus instead on criticism
that has been published more recently (from the 1990s to date). An extensive
discussion of earlier criticism (e.g., Cuban, 1983; Ralph & Fennessey, 1983; Rowan,
Bossert, & Dwyer, 1983) thus exceeds the scope of this article.
We used a ‘‘snowball’’ method to locate relevant literature for our review. Our
starting point was a discussion concerning 20 years of SER (Townsend, 2001), which
was presented in a special issue of this journal. References in this discussion led us to
other publications criticising SER, and these publications led us to yet others, and so
forth. In addition to presenting the critiques, we offer our own views on each of three
elements of criticism and suggest a number of new avenues for SER.

Criticism From Three Perspectives


In our view, all of the criticism that has been expressed about SER can be reduced to
three different elements: the supposed political-ideological nature of SER, its
assumed theoretical limitations, and its methodological flaws.

The Political-Ideological Nature of SER


The most fundamental criticism of SER concerns its political-ideological focus, as it
relates to the basic assumptions about education and educational research that are
said to underlie SER and that are fiercely contested by several authors outside the
SER community. These assumptions relate especially to the feasibility of maintaining
objectivity in (educational) research and of distinguishing between facts and values.
The assumptions also relate to the feasibility of predicting the outcomes of the
teaching-learning process and to the question of whether it is worthwhile to assess the
quality of education according to its outcomes. According to the critics, researchers
in the field of SER are blind to the political and moral aspects of their work.
Moreover, the close ties between researchers and policy-makers create the perception
that SER is not so much a scientific endeavour as an ideological force. The SER
research agenda has been accused of reflecting governmental concerns instead of
scientific considerations.

Objectivity. The SER tradition assumes that research will generate objective
knowledge through the application of rigorous quantitative methodologies. Many
critics from outside the SER community (Ball, 1998; Grace, 1998; Hamilton, 1998;
Lingard, Ladwig, & Luke, 1998; Rea & Weiner, 1998) consider this assumption
naïve. Rejecting the belief in true objectivity, they argue that all research is
contaminated, at least to some extent, by the personal, political, and ideological
sympathies of the researcher. In their view, the processes of formulating research
questions, collecting information, and reporting findings always involve ideological
and political choices, which (might) serve particular interests. According to the
critics, this is especially apparent when research is dominated and funded by
governmental or government-related agencies. Rea and Weiner (1998) even suggest
that university departments or centres advocating SER should be formally recognised
as ‘‘think-tanks’’ for policy-makers rather than as centres for independent research.
Further, some critics argue that the tendency of SER to focus on the relationship
between school factors and student achievement while failing to address the limits of
what can be achieved through schooling has led to a culture of blame (Rea & Weiner,
1998) and guilt (Hargreaves, 1994).

The view of teaching and learning. Angus (1993), Elliott (1996) and other authors also
raise fundamental objections to the view of teaching and learning on which SER is
(implicitly) based. Elliott (1996) contrasts the SER perception of education as a
‘‘coercive process of social induction’’ (p. 209) with the perception of education as a
process that is ‘‘shaped by a concern to respect pupils’ capacities for constructing
personal meanings, for critical and imaginative thinking and, self-directing and self-
evaluating their learning’’ (p. 209). He explicitly denounces the idea that the quality
of the teaching-learning process should be judged according to its results. Elliott
argues that learning is an unpredictable process; a teacher’s responsibility is to create
conditions ‘‘which enable pupils to generate personally significant and meaningful
outcomes for themselves’’ (p. 221). In other words, the quality of education lies not in
its results but in the teaching-learning process itself.

Recommendations for combating the political-ideological nature of SER. Few of the
aforementioned critics offer suggestions for improving SER. In fact, their comments
essentially argue that SER should be abandoned altogether. Some external critics
(e.g., Thrupp, 2001), however, do not suggest abandoning SER; rather, they ask that
SER researchers (e.g., in England) recognise the political implications of their
research and avoid compromising their research and writing wherever possible.

Our position regarding the political-ideological aspects of SER. In our opinion, objectivity
is a worthy ideal for scientific research, even though its full realisation is unlikely.
Although the critics argue justifiably that every study is ideologically biased to some
extent, this does not mean that striving for objectivity is useless. On the contrary,
giving up the ideal of objectivity would render any research activity futile. The
fundamental reason for the existence of scientific research is its capacity for
generating information and knowledge that is valid regardless of ideological
preferences. Approaching this ideal as closely as possible requires minimising
ideological biases as much as possible.
The goal of objectivity is often most accessible in one major aspect of scientific
research: data analysis. Many of the quantitative methods used in SER involve
nothing more than using computer algorithms to process data. Although the data
processing itself is a completely objective procedure, the use of quantitative methods
certainly does not guarantee unbiased conclusions, as the output of the analyses
usually requires some interpretation, and ideological biases might again play a role at
this point. Given sufficient information about the data, analysis, and outcomes,
however, careful readers should be able to detect such biases. Research findings
derived from datasets that are open for secondary analysis further increase the
probability of revealing such biases.
The impact of ideological bias is probably much stronger in the agenda-setting phase
of research. There are few generally accepted standards concerning the propriety and
legitimacy of research questions. The clearest examples of the ideological bias of SER
are evident in the choice of research questions, as many aspects of the research agenda
are based on governmental rather than scientific concerns.
The extent to which SER can be accused of lacking ideological independence is
clearly reflected in the research questions it addresses and even more clearly in those
that it does not address. For example, SER hardly ever reflects on the appropriateness
of officially stated educational goals. The correspondence between these goals and the
standardised tests that are typically used to assess educational effectiveness is another
question that SER tends to ignore. The tests that are commonly used address only the
cognitive development of students, whereas the officially stated goals in most
countries are much broader, extending to such aspects as personal development and
citizenship. Furthermore, SER pays little attention to the limits of what can be
achieved through schooling. For example, the SER literature has yet to address the
question of whether it is fair to hold schools accountable for the persistent language
disadvantages experienced by minority students who mainly speak their native
languages outside of school.
There is no doubt that school effectiveness researchers in many countries have
strong ties to the policy community. Some authors (e.g., Slee & Weiner, 2001;
Thrupp, 2001) raise the problem that SER findings are inherently influenced by the
fact that such research is funded primarily by the government or by government-related
agencies. This sceptical view of the relationship between the scientific community and
the world of policy is far too simplistic, however, particularly given the fact that the
work of these critics is usually funded by governments as well.
The second point of criticism relates to the fundamental view of teaching and
learning in SER. In our view, there is no alternative to viewing education as a
goal-oriented activity. Given the enormous amount of resources (taxpayers’
money) invested in education each year, it would be unethical not to consider
its effects. Education should prepare students for life. This includes stimulating
their personal development, citizenship and readiness for the labour market.
Assessing the extent to which these goals are accomplished is essential. We feel
that critics who argue that the teaching-learning process is unpredictable provide
an excellent argument for not investing in education—if the outcomes of
education are completely unpredictable, it makes no sense to invest any money
in it at all. Educational results must therefore be a major factor in assessing the
quality of education. This is not to say that the process of teaching is irrelevant.
Some teaching methods may be successful in certain aspects but also repellent
(e.g., because they are merely based on punishing students for failing to meet
targets set by the teacher). Teaching methods that fail to promote learning
progress satisfactorily must be rejected, regardless of any other ‘‘qualities’’ they
may have. It may be possible to choose between two or more methods that lead to
similar results. In that case, it makes sense for process characteristics to be the
decisive factor.
We must also acknowledge that, in practice, results may provide an incomplete or
biased picture. When teachers ‘‘teach to the test’’, the results may provide no
indication of any real interest, learning or understanding on the part of the students.
In our view, however, this implies no fundamental flaw in the use of results to assess
quality. It shows only that the application of the principle entails considerable
practical difficulties. Addressing these difficulties requires better instruments for
measuring the extent to which the intended results of teaching and learning have been
realised. The need for valid and reliable instruments for investigating the (presumed)
non-cognitive effects of education is especially acute.
Another serious problem is that educational goals are often formulated in abstract
or even vague terms. Governments frequently allocate large amounts of money and
other resources for educational activities without clearly specifying what these
activities are supposed to accomplish. To our knowledge, policy-makers hardly ever
identify a particular educational programme as a goal in and of itself. On the
contrary, they are usually quite eager to mention a wide range of effects that the
programme is supposed to have and goals that it is supposed to serve. If an
educational programme is to be seen as a goal in itself, that fact should be stated
explicitly right from the start. We feel, however, that this view is frequently
advanced only after the outcomes of an educational programme have turned out to
be less positive than originally hoped.
Theoretical Limitations of Research on School Effectiveness


A second line of criticism concerns the theoretical grounds that are used for
selecting and operationalising the variables studied in SER. The extent to which
SER clarifies how and why variables are interrelated has also been criticised.
Moreover, SER has been accused of focusing too narrowly on explaining variation in
school effectiveness and on defining ‘‘school effectiveness criteria’’. Finally, some
authors doubt the capacity of SER to lead to the development of a theory for
school improvement.

The theoretical basis of SER. According to Coe and Fitz-Gibbon (1998), the definitions
of SER variables lack theoretical grounding (their inclusion is justified on statistical
rather than theoretical grounds) and precision, and they lead to common-sense
operationalisations that vary considerably across studies. The same authors accuse
SER of ‘‘‘fishing’ for correlations’’ between particular indicators of school
effectiveness and particular school features, without clarifying why specific
characteristics of students, classes and schools are expected (or appear) to accompany
higher scores on school effectiveness indices. The correlational studies that pervade
SER are based on simple linear logic (‘‘more of this is associated with more of that’’)
and allow conclusions concerning neither the causal relations between variables nor
the mechanisms behind those relationships. According to Coe and Fitz-Gibbon
(1998), these flaws, combined with the ‘‘vote counting’’ (comparing apples to
oranges) that is common in SER reviews, imply that conclusions about how specific
phenomena influence school performance are incorrect, thereby jeopardising theory
development.
Coe and Fitz-Gibbon (1998) also argue that the perception of consensus
concerning the correlates of effective schools (e.g., Edmonds', 1979, five-factor
model, or the nine-factor model presented by Reynolds & Teddlie, 2000) is partly
the product of the vague formulation and sloppy measurement (e.g., through self-
reports and/or unstandardised instruments) of such factors. They contend that review
studies are likely to take the least well-defined concepts as the strongest confirmation
of general results, as these concepts tend to be measured in such a wide variety of
ways. In addition, studies rarely provide information about the reliability of such
measurements. Moreover, reviews that use vote counting to summarise research tend
to ignore the most important information about effectiveness factors—the size of their
effects. Another problem relates to the risk of chance capitalisation. The repeated
reports of significant correlations between effectiveness and ‘‘educational leadership’’
are due, at least in part, to its inclusion in so many studies. We could expect to find a
statistically significant result from time to time through chance alone. Finally,
researchers often tend to report only the significant findings, ignoring those that are
not significant.
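To illustrate the arithmetic behind this concern (a textbook calculation, not one drawn from the studies reviewed here): if a factor is examined in $k$ independent studies, each tested at significance level $\alpha$, the probability of at least one spurious ''significant'' finding is

$$P(\text{at least one significant result}) = 1 - (1 - \alpha)^{k},$$

which for $\alpha = .05$ and $k = 20$ already amounts to $1 - .95^{20} \approx .64$.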
Thrupp (2001) argues that SER studies continue to be undertheorised. He further
asserts that school effectiveness researchers have failed to embrace the detailed
microlevel research that could build a body of data that would be suitable for
School Effectiveness Research 255

generating theory, and that they have not tapped into the wealth of sociological
theories of education.

Blind spots in the research. In its most comprehensive form, SER can be conceptualised
as the study of relationships between school input, school context, the schooling
process (at both school and classroom levels), and school performance. School
performance is usually expressed in terms of average student achievement by school.
These measures ideally include adjustments for such student characteristics as entry-
level achievement and socioeconomic status (SES), in order to determine the ‘‘net’’
value added by a school. The main goal of such studies is to identify the factors that
lead to the best results. As a consequence, the most malleable factors receive the most emphasis
(Scheerens & Bosker, 1997).
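A minimal sketch of this value-added logic (our own illustration; the notation is not taken from any of the studies cited here) writes achievement as a two-level regression in which a school's contribution appears as a school-level residual:

$$y_{ij} = \beta_0 + \beta_1\,\mathrm{prior}_{ij} + \beta_2\,\mathrm{SES}_{ij} + u_j + e_{ij},$$

where $y_{ij}$ is the achievement of student $i$ in school $j$, $u_j$ is the school-level residual read as the ''net'' value added by school $j$, and $e_{ij}$ is the student-level residual.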
A number of conceptual models exist to represent the lines of thinking within SER.
The model developed by Scheerens (1997), shown in Figure 1, is a typical example.

[Figure 1. Central relationships in SER (source: Scheerens & Bosker, 1997)]
Figure 1 illustrates the relationships that SER assumes to exist among the factors it
addresses. For example, a school’s financial resources and the professional experience
of its teachers are two factors included within the category of school inputs. These
factors are assumed to have a direct impact on processes within the school as well as
on the general performance of the school. The nature of school leadership, teacher
cooperation within schools, and similar school-level characteristics are thought to
affect student achievement directly and indirectly, through processes occurring at the
classroom level (e.g., the quality of instruction). Other models used within SER have
expressed similar thoughts (Creemers, 1994; Hallinger & Heck, 1996; Stringfield &
Slavin, 1992). Although such models take the complex direct and indirect
relationships between factors into account conceptually, critics of SER argue that

empirical analyses are limited to the estimation of direct effects. We discuss this issue
in more detail in the section on criticism regarding research methodology.
To date, SER has concentrated especially on investigating relationships between
the nature of school-level processes and school performance. According to critics of
SER, some of the variation in school performance may be explained by the
relationship between school context and processes occurring within schools (2→3 in
Figure 1), as well as by the relationships between school context and school
performance (2→4). Thrupp (2001) argues that the background characteristics of
students, the composition of student populations within schools, and the curricula
used by the schools are often overlooked in the search for explanatory variables,
because they are erroneously taken as given. Slee and Weiner (1998) adopt an even
wider perspective, focusing on how schooling is influenced by the social, cultural, and
economic contexts of schools, as represented by the neighbourhoods, governments
and societies within which schools are embedded (cf. Lingard et al., 1998, and
Lauder, Jamieson, & Wikeley, 1998, as cited in our discussion of the political-
ideological criticism).
The student populations of schools can differ considerably in the proportion of
students coming from homes with particular characteristics. The extent to which and
how the home situation affects the schooling process (2→3) and how differences in
student populations promote or block student achievement (2→4) have yet to receive
much attention in SER. Authors have similarly ignored the question of which school
and classroom structures and strategies are most profitable for specific types of
student groups.
Thrupp (2001) refers to Hatcher (1998), who argues that school culture is a
product of the interaction between the culture of pupils and the formal school culture
(2↔3). Further, according to Slee and Weiner (1998), the student population of a
school both responds to and influences its structure and culture. Grace (1998) argues
that SER is overly reductionistic in its attempts to explain differences in school
effectiveness. Referring to the successes of Catholic schools, he stresses that school
effectiveness is not the product of separate, individual factors but results instead from
strongly interrelated factors that find their expression in such concepts as ‘‘school
communities’’ and ‘‘school mission’’. For this reason, Grace argues that future
research should be more sensitive to the complexity of school effects and that its
analysis and methodology should be more innovative.
In addition to calling for more attention to the impact of variations in school
context on school processes and school outputs, other scholars recommend ‘‘opening
the black box’’ of the schooling process. According to Hill (1998), ‘‘. . .most school
effectiveness research has been top-down, it has been driven by the agenda of the
researchers, and it has failed to make meaningful connections with the place where
most school learning takes place, namely the classroom. . .’’ (p. 427).
Lingard et al. (1998) state that the use of only a small set of indicators to assess
school quality has led SER to be excessively oriented toward accountability. These
authors do not feel that surveys are adequate for opening the black box of schooling;
they therefore call for more in-depth, qualitative analyses of processes that actually
School Effectiveness Research 257

occur in schools, which they perceive to have a potential influence on school


performance. For example, the tracking systems used within schools, the criteria used
for tracking decisions, and the effects produced by various tracking strategies are all
interesting topics that could be explored through the qualitative analysis of school
tracking practices.

Narrow indicators of school effectiveness. Coe and Fitz-Gibbon (1998) label the effect
criteria used in SER as ‘‘a very narrow range of measures’’ (p. 423), which they
perceive to be due to the fact that researchers use data that are already available (e.g.,
student scores at national examinations) or easy to measure, among other factors.
Publications by Slee and Weiner (1998) and by Bosker and Visscher (1999) also point
to the fact that student achievement in the basic skills is unfortunately the only
criterion for judging school’ performance in most SER studies.

A fruitful road toward school improvement? Although studying the extent to which
schools vary in effectiveness and the factors that seem to promote effectiveness is
academically interesting, few researchers see such studies as goals in themselves. They hope,
and probably expect, that the investigations will eventually help to promote school
effectiveness. Several scholars doubt that the approach followed in SER can ever
produce a theory of school improvement, however, or that it can be helpful in
improving the quality of schools. Lauder et al. (1998) wonder whether school
improvement based on SER will ever be possible. They doubt that universal critical
success factors can be found across schools, given the extent to which they differ in
context, organisational structure, culture, and other aspects. Coe and Fitz-Gibbon
(1998) state that the poor theoretical base of SER and the scarcity of evidence
concerning the relationships among variables prevent this line of investigation from
ever providing a solid foundation for school improvement. Moreover, the research
strategy of using information obtained from analysing features common to effective
schools as a means of improving underperforming schools ignores the fact that
correlation provides no proof of causation (Coe & Fitz-Gibbon, 1998).

Recommendations for upgrading the theoretical calibre of SER. Given the criticisms
described above, it is not surprising that critics argue for more theory-driven SER
(Coe & Fitz-Gibbon, 1998; Thrupp, 2001). Their general message is that the results
of previous research should be used more carefully. This applies to the results of
earlier SER in general, and especially to the concepts and findings from other fields of
educational and non-educational research that are relevant to SER. A closer
connection with earlier scientific work should be accompanied by formulations and
operationalisation of variables that are as precise as possible, as well as by hypotheses
about the relationships among variables. Critics urge researchers to use theory rather
than common sense to explain the relationships they study, in order to allow progress
in theoretical development.
Research should develop convincing explanations of how and why variables are
related. Studies by Thrupp (2001) and by Scheerens and Bosker (1997) argue that
258 H. Luyten et al.

qualitative studies into processes that occur within schools can provide a valuable
basis for developing hypotheses to be tested in large-scale research into the factors
that make some schools more effective than others. Scheerens and Bosker (1997)
recommend focusing on such school processes as teacher selection and recruitment,
the allocation of teachers to student groups, and the extent and nature of evaluation.
In their view, SER should also allow more differentiation in school effects across
grades, subjects and teachers, as well as among students of varying ages and with
varying abilities.
Several authors argue for including outcomes that reflect the actual (instead of the
assumed) educational objectives of the schools in question (Coe & Fitz-Gibbon,
1998) and metacognition (e.g., learning to learn, solve problems, and reflect; Bosker
& Visscher, 1999) along with the usual school effectiveness criteria (e.g., drop-out
rates and the percentage of grade repeaters; Scheerens & Bosker, 1997). Moreover,
the use of curriculum-embedded and criterion-referenced tests is considered to
improve the fairness of school effectiveness evaluations: testing what should have
been taught as well as studying the extent to which schools meet the (minimum) goals
that have been set for them (Coe & Fitz-Gibbon, 1998; Scheerens & Bosker, 1997).
Several critics recommend that SER focus more on features of the school
environment (Lauder et al., 1998; Thrupp, 2001), the classroom (Hill, 1998), the
composition of the student population, and other potentially explanatory variables
(Thrupp, 2001). Thrupp is especially interested in the measures that are needed in
order to serve student populations with specific characteristics appropriately. This
interest probably stems from the belief that the concept of a single universal best way
to treat and teach any student group under any circumstance is unrealistic. Such
research requires intervention into educational practice and the subsequent
evaluation of the effectiveness of those interventions. This is consistent with Hill
(1998), who ultimately expects ‘‘hit-and-miss’’ approaches to school reform to have
more impact than the ‘‘combing-through-natural-variation’’ approach of SER. Hill
thinks that the old paradigm of SER as specific research in itself has reached the end
of its ‘‘use-by-date’’. In the new paradigm, SER should be undertaken as a critical
component of continuous school improvement and accountability.

Our position regarding the theoretical calibre of SER. With some exceptions, we feel that
the theoretical basis for selecting and operationalising the variables studied in SER is
often quite weak; it seldom constitutes an elaborated theory. The concepts
investigated are often too vague, and the operationalisations vary greatly across
studies. Although it is not the case for all research, too many studies fish for
correlations. This is because they do not use theory to derive their hypotheses
concerning the interrelationships among variables. Progress in theoretical develop-
ment is slow because the results of SER are often quite disappointing; correlations are
not found, or they are low and inconsistent across studies. If correlations are found,
we know that there must be a reason that the variables are correlated. We cannot be
certain, however, whether the correlation is due to coincidence or whether it is
meaningful, nor can we identify the mechanisms underlying the correlations.

One important barrier to the accumulation of knowledge concerns the lack of
standardised instruments for measuring the core variables of SER. It is difficult to
draw appropriate overall conclusions about the interrelationships among variables
when the concepts are not measured consistently across studies. We should attempt
to make more use of theoretical development in other disciplines (e.g., theories on
rational choice, the learning organisation, feedback mechanisms, self-organisation).
We should use the results of qualitative research in schools to develop theoretical
notions about school functioning and to test these notions in large-scale research. For
example, rational choice theory has proven to be a powerful tool for explaining social
scientific phenomena in various fields. According to Abell (1991), rational choice
theory invites us to understand individual actors as acting—or, more accurately,
interacting—in such a manner as to do the best they can for themselves, given their
objectives, resources, and circumstances, as they perceive them. For example, De Vos
(1989) uses this theory to model the behaviour of actors within classrooms (students,
peers, and teachers) and processes of interaction among the actors, with the goal of
explaining why student performance is high under some circumstances and low
under others.
In order to develop testable theoretical notions, we should attempt to model the
processes and mechanisms that lead to specific school outcomes as accurately as
possible. Acknowledging the practical problems that are involved with meeting the
requirements of randomised controlled trials, we feel that experimental research
should be encouraged as much as possible for identifying variables that promote
school effectiveness.
School effectiveness researchers should also work to eliminate what have been
called ‘‘blind spots’’ of this type of research. In addition to explaining the relationship
between features of school processes and school performance, studies should place
more emphasis on the influence of non-educational factors in the school context (e.g.,
neighbourhood, family, peer group) on schooling processes and on student
achievement. SER has certainly not ignored the effects of school context (see e.g.,
Teddlie, Stringfield, & Reynolds, 2000, for an overview of what is known about
school context effects), but the manner in which they have been treated has served
primarily to draw attention to the fact that school context does matter. More insight is
needed, however, into why and how the school context interacts with school
performance and with processes at both the classroom and the school level. For
example, the conditions under which contextual characteristics match features of
classroom and school processes, and why they do so, represent an important area for
further investigation. Moreover, although we know that classroom-level factors can explain a
considerable proportion of the variance in school performance, SER pays little
attention to the characteristics of classroom life. For example, how the school as an
organisation influences processes in the classroom and with what results are both
relevant research questions, as are questions concerning the impact of classroom
composition (in terms of the backgrounds of students) on student achievement.
More ‘‘configurational’’ research may also support the development of SER
theories. Our plea for research on configurations of variables stems from our view that
the character of schools and classrooms would probably be better addressed by
studying meaningful combinations (configurations) of variables than it would by
studying relationships among isolated variables. Distinguishing between typologies of
classrooms and schools may therefore be fruitful. For example, some schools operate
as segmented organisations; there is little cooperation among teachers, and the
policy-making activities of the principal and school board focus primarily on resource
issues. In contrast, other schools are characterised by strong departments of teachers
who cooperate intensively with each other in the planning, execution, and evaluation
of instructional activities. The school management and teaching staff engage in
extensive consultation, and they work together to develop school policy in various
areas (e.g., resources, instruction). Processes occurring in these two types of schools
differ strongly, and that difference is likely to have an impact on their effectiveness.
Taking only one variable at a time, measuring it in each school and studying its
relationship with the overall effectiveness of each school does no justice to the
complexity of the realities experienced by these schools. Quinn’s (1988) competing-
values framework offers a basis for making meaningful distinctions among school
organisations; each of Quinn’s four models of organisation reflects several mutually
related organisational characteristics.
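As a hedged illustration of what such configurational analysis could look like in practice (our own sketch; the dataset, file name, and column names are hypothetical), schools might be grouped on several process variables jointly, and the resulting types, rather than single variables, related to effectiveness:

# A hedged sketch (hypothetical column names): deriving school
# "configurations" by clustering several process variables jointly,
# instead of correlating each variable with effectiveness in isolation.
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

schools = pd.read_csv("school_processes.csv")  # hypothetical dataset
features = ["teacher_cooperation", "principal_policy_focus",
            "departmental_strength", "joint_evaluation"]

# Standardise, then assign each school to one of two configurations
# (e.g., "segmented" versus "cooperative" organisations).
X = StandardScaler().fit_transform(schools[features])
schools["configuration"] = KMeans(n_clusters=2, n_init=10,
                                  random_state=0).fit_predict(X)

# Compare effectiveness across configurations rather than per variable.
print(schools.groupby("configuration")["value_added"].mean())

The number of clusters and the choice of variables would, of course, themselves need theoretical justification; the point is only that the unit of comparison becomes the configuration rather than the isolated variable.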
A third element of the ‘‘theoretical criticism’’ concerns the limitations of the school
effectiveness indicators used in SER and the use of norm-referenced instead of
criterion-referenced tests (although the prevalence of criterion-referenced testing is
growing, norm-referenced testing is still very popular in SER). We do not dispute the
importance of student performance on national examinations, as the core objectives
of schooling lie in the mastery of the skills that these tests evaluate. Including
other, deliberately chosen school effectiveness indicators remains valuable, however.
Finally, critics of SER do not think that it can ever provide a basis for school
improvement. Although school improvement does indeed require intervention
research—and the evaluation of the results of these interventions—school improve-
ment should not be the exclusive focus of such research. Even if SER does not
provide schools with clues for attaining higher levels of performance, it can still be
valuable in other respects. For example, such research can provide insight into the
stability of school effects across academic years, regardless of school improvement
initiatives, and into school effects across varying grades, subjects, school types (e.g.,
vocational as compared with more general education), educational systems (e.g.,
centralised as compared with decentralised) and countries (e.g., developed as
compared with developing).

Criticism Regarding Research Methodology


Critiques of SER methodology focus on the ways in which researchers plan and
conduct such studies. They identify flaws in the research designs that are common in
SER, express scepticism about the methods of data collection, and object to the
techniques of data analysis that are applied in SER. Much of this methodological
School Effectiveness Research 261

criticism originates from researchers within the field of SER. The methodological
criticisms of mainstream SER studies centre on three broad issues.

Does the school effect exist in the real world? Coe and Fitz-Gibbon (1998) point out that
the result that authors usually report as the ‘‘school effect’’ is actually the between-
school variance that cannot be explained by the school intake characteristics that are
controlled in the data analyses. School effectiveness studies typically find that a range
of control variables—primarily such student background characteristics as prior
achievement, SES, ethnicity and gender—account for a considerable proportion of
the variation between schools. They subsequently assume that the remaining variance
between schools is caused by certain school characteristics. A rigorous meta-analysis
of school effectiveness studies (Scheerens & Bosker, 1997; Witziers & Bosker, 1997),
however, revealed weak effects for each of the effectiveness factors that had been
investigated (cooperation, school climate, monitoring, opportunity to learn, parental
involvement, pressure to achieve, and school leadership). These school character-
istics, which are often said to enhance effectiveness, account for a small proportion of
the variation between schools; when authors report a ‘‘school effect’’, they are
generally referring to this proportion of the variation.
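In the notation of the two-level model sketched earlier, the ''school effect'' that studies report is essentially the share of residual variance located at the school level once the controls have been entered:

$$\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2},$$

where $\sigma_u^2$ is the residual between-school variance and $\sigma_e^2$ the residual student-level variance; the size of $\sigma_u^2$, and hence of $\rho$, depends directly on which control variables are included.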
For the reasons described above, SER studies that have been conducted to date
suggest that schools can have only a limited impact on the ‘‘school effect’’. In that
sense, the term is misleading (Coe & Fitz-Gibbon, 1998). Moreover, as Goldstein
(1997) observes, estimates of a school’s effectiveness are always based on its relative
position in comparison with other schools. As a result, such studies always identify
‘‘effective’’ and ‘‘ineffective’’ schools, even when they all accomplish acceptable
results according to some absolute standard. We know only that student achievement
differs by school and that the student background variables included in the analysis
cannot account for these differences. Although the differences may be due to
unmeasured aspects of the school organisation, they may just as well be caused by
factors beyond a school’s control (e.g., unmeasured student background character-
istics or variables that relate to the school’s context). As stated by Thrupp (1999, p.
5), ‘‘. . .they may be school-based, they may nevertheless not be school-caused’’.
Because residual variance between schools is assumed to express a school effect, its
size is strongly dependent on the control variables that are included in the data
analysis, and especially on their explanatory power with regard to the variance in
student achievement between schools. If the control variables account for much of
this variation, the school effect is small. In other cases, the school effect may be
spuriously large. The choice of control variables is therefore crucial (Coe & Fitz-
Gibbon, 1998; Goldstein, 1997; Thrupp, 1999, 2001). In most cases, pragmatic
considerations (e.g., the accessibility of information) are at least as important as
theoretical ones. Most researchers acknowledge that measures of prior achievement
are indispensable in effectiveness studies, however, and gender, ethnicity, and SES
are often taken into account as well. Coe and Fitz-Gibbon (1998) note that many
studies take no account of what has actually been taught in the school (curriculum
alignment).
262 H. Luyten et al.

The ultimate provider of education—the teacher—is an important factor that SER
tends to overlook. A typical ''state-of-the-art'' effectiveness study applies multilevel
analysis with two levels (school and student). Studies that consider the teacher (or
such teacher-related variables as class, grade or subject department) as an additional
source of variation are scarce (Luyten, 2003), but indicate that between-teacher
differences within schools probably outweigh the differences between schools.
Analyses that include both school and teacher (or class/grade) variance tend to reveal
less student-level variance than is usually found in two-level analyses (Hill & Rowe,
1996; Luyten & Snijders, 1996). On the one hand, this may imply that the effect of
education may be stronger than the results of two-level studies would suggest. On the
other hand, it may also imply that the effect of schooling depends more on the
teachers you get than on the school you choose. It is important to note, however, that
(residual) teacher variance may be due to factors beyond the teachers’ control, just as
school variance is not necessarily school-caused.
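To make the contrast concrete, the sketch below (our own illustration, using the Python statsmodels package; the data file and variable names are hypothetical) compares a typical two-level analysis with a three-level analysis that adds a class-within-school variance component:

# A hedged sketch: students nested in classes nested in schools.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("achievement.csv")  # hypothetical dataset

# Typical two-level model: variance split between schools and students.
m2 = smf.mixedlm("score ~ prior + ses", df, groups="school").fit()

# Three-level model: an extra variance component for classes within schools.
m3 = smf.mixedlm("score ~ prior + ses", df, groups="school",
                 vc_formula={"class_id": "0 + C(class_id)"}).fit()

print(m2.summary())  # school vs student variance
print(m3.summary())  # school, class-within-school, and student variance

If the class-within-school component absorbs variance that the two-level model assigns to students or to schools, this is consistent with the findings of Hill and Rowe (1996) and Luyten and Snijders (1996) cited above.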

Number crunchers. The strong reliance of SER on quantitative research methods has
met with harsh criticism from scholars working outside of the effectiveness
tradition (Angus, 1993; Slee & Weiner, 2001; Thrupp, 1999, 2001). They
contend that the application of scientific standards, objective measurement and
sophisticated, rigorous data analysis and findings expressed as figures (e.g.,
regression coefficients, variance components, levels of significance) amounts to
nothing more than the objectification of teachers and pupils. In their view, the
prevailing empirical-analytical approach in the field of SER ignores the values and
life experiences of research participants and pays no attention to the meanings that
they give to events.
Authors within the school effectiveness tradition (Coe & Fitz-Gibbon, 1998;
Goldstein, 1997; Goldstein & Woodhouse, 2000; Reynolds, Hopkins, & Stoll, 1993;
Scheerens & Bosker, 1997) have also expressed concerns regarding the use of
predominantly quantitative methods. Their main concern is that effectiveness
research draws primarily on large-scale datasets containing relatively superficial
information, and that data analysis usually stops after estimating direct linear
relationships between one dependent variable (student achievement) and several
independent variables. They observe a preference for cross-sectional studies that are
biased towards variables (both dependent and independent) that are easy to measure.
Research questions are often addressed through standard, quantitative methodolo-
gies, even though the methodology should be tailored to the research questions.
These authors also emphasise that investigation into relationships that are more
complex (e.g., indirect, reciprocal, and curvilinear relationships, interaction and
differential effects, thresholds; cf. Coe & Fitz-Gibbon, 1998; Goldstein, 1997;
Scheerens & Bosker, 1997) remains scarce. Contrary to the ‘‘external critics’’, who
raise fundamental objections to quantitative methodology, the critics who are part of
the SER community call for a more sophisticated use of the available research
methods (qualitative and quantitative) in order to arrive at a closer match between
method and research questions. Finally, Goldstein (1997) points to the largely
ignored problem of measurement error; the low reliability of the variables under
analysis may produce seriously biased research findings.
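The classical attenuation formula (a standard psychometric result, not one specific to SER) illustrates what is at stake: when two variables are measured with reliabilities $r_{xx}$ and $r_{yy}$, the observed correlation systematically understates the true one,

$$r_{xy}^{\mathrm{obs}} = r_{xy}^{\mathrm{true}}\,\sqrt{r_{xx}\,r_{yy}},$$

so that, for instance, two variables each measured with reliability .70 display at most 70% of their true correlation.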

Snapshot research. School effectiveness research has thus far focused strongly on
observational research, and studies based on (quasi-)experimental research are
relatively rare. Several authors within the SER community (Coe & Fitz-Gibbon,
1998; Hill, 1998; Scheerens & Bosker, 1997) refer to the limited value of studies that
mainly explore the natural variation between schools. This type of study basically
yields statistically sophisticated descriptions of existing situations, but it provides little
insight into how processes may differ in radically different situations.
The ultimate goal of effectiveness research is to find out ‘‘what works’’ and to
discover ways to improve education. Cross-sectional studies have been and continue
to be the prevailing research design in the SER tradition, essentially amounting to
comparisons of successful schools with ‘‘failing’’ counterparts. As Reynolds et al.
(1993, p. 51) noted about 10 years ago:

School effectiveness studies customarily show a ''snapshot'' of a school at one point in
time, not an evolutionary and moving picture of a school over time, a neglect which
hinders the usefulness of the knowledge for purposes of school development.

Over the past decade, school effectiveness researchers have cooperated more with
their colleagues from the tradition of school improvement research, which tends to
focus—almost by definition—more on development over time. The success of the
journal School Effectiveness and School Improvement and the annual International
Conference for School Effectiveness and Improvement are clear indications of this
cooperation, but actual long-term studies remain scarce. Most ‘‘longitudinal’’ school
improvement studies that have been conducted thus far, however, probably have very
little to say about the development of schools over the long term. Fullan (1991)
observes that institutional reform takes 5 years or more. The typical ‘‘longitudinal’’
school improvement research project comes to a halt before that time (Gray,
Goldstein, & Jesson, 1996). The ‘‘Louisiana School Effectiveness Study’’ (Teddlie &
Stringfield, 1993), which covers more than a decade, is probably the most notable
exception.
School effectiveness research has also focused more on successful schools than on
their less well-functioning counterparts (Reynolds & Teddlie, 2001). The factors that
enhance effectiveness may be quite different from those that lead to ineffectiveness. It
may simply be an unreachable goal for ineffective schools to adopt the policies and
practices that exist in well-performing schools (Slavin, 1998). While a school that is
already performing well may be able to increase its effectiveness by adopting a strong
focus on higher order thinking skills, the ineffectiveness of another school may be due
to such factors as an undisciplined school climate or insufficient attention to basic
skills. The need for studies that cover a wide time-span is generally acknowledged
(Coe & Fitz-Gibbon, 1998; Gray et al., 1996; Hill, 1998; Reynolds et al., 1993; Slater
& Teddlie, 1992). Cross-sectional studies essentially provide snapshots of successful
264 H. Luyten et al.

schools. Being effective and becoming effective are two different things, however, and
being effective is not the same as staying effective. While this statement may sound like
a truism, hardly any studies on school effectiveness have taken this distinction into
account. On the contrary, most SER studies implicitly assume that school
characteristics that correlate with being effective must also be related to becoming
and staying effective.

Recommendations for improving the methodological quality of SER. Coe and Fitz-Gibbon
(1998) call for a more appropriate description of the (adjusted) differences between
schools, which are usually referred to as ‘‘school effects’’. Their suggested alternative
‘‘adjusted academic performance of specific groups’’ has yet to receive much support.
Goldstein (1997) suggests qualifying the descriptions ‘‘effective’’ and ‘‘ineffective’’
with the term ‘‘relative’’. Several authors (Goldstein, 1997; Scheerens & Bosker,
1997; Thrupp, 1999, 2001) have argued that the selection of covariates in SER
should be based on theoretical relevance, although pragmatic considerations seem to
prevail in practice. Thrupp stresses the relevance of school population composition
and asserts that many school processes are deeply influenced by student intake
characteristics. Goldstein (1997) asserts that adjusting test scores for prior
achievement ignores the possibility that students may develop at different rates.
Measuring student development requires a series of test scores over an extended
period.
The need for a theoretical basis for variable selection applies also to the selection of
the dependent variables (measures of effectiveness). In this respect, Bosker and
Visscher (1999) propose ‘‘authentic testing’’ (e.g., teacher ratings of academic
performance based on observations, written material, etc.) as a means of obtaining
more valid information on student and school performance. Several authors call for
studies that pay more attention to teachers and departments as further sources of
variance, in addition to schools and students (Goldstein, 1997; Hill & Rowe, 1996;
Luyten & Snijders, 1996; Scheerens & Bosker, 1997). Responding to this call would
require multilevel analysis techniques that are somewhat more advanced than those
usually employed.
A widespread call for more longitudinal research on effectiveness and
improvement also exists within the SER community (Coe & Fitz-Gibbon, 1998;
Gray et al., 1996; Reynolds et al., 1993; Slater & Teddlie, 1992). Several authors
(Coe & Fitz-Gibbon, 1998; Hill, 1998; Scheerens & Bosker, 1997) argue that
future school effectiveness and school improvement research projects should be set
up as integrated components of educational innovation experiments or reform
initiatives. Scheerens and Bosker (1997) argue for studies that investigate the
processes by which ineffective schools improve and effective schools deteriorate. It
is not clear whether they also consider the possibility of retrospective research on
deterioration and improvement. They further recommend the use of large-scale
national datasets (involving cohorts of schools and students) to study the
development of schools in combination with additional, more in-depth data
collection.
School Effectiveness Research 265

Thrupp (1999, 2001), Slee and Weiner (2001), and other ‘‘external critics’’
criticise SER for its tendency to oversimplify the complex realities of education. In
their view, this tendency is due largely to the predominant use of large-scale
quantitative methodologies. These critics call for more small-scale, qualitative, and
detailed microlevel research. They see little use in striving for objective measurement
and applying rigorous data analysis, as ‘‘all forms of research are ideological’’ (Slee &
Weiner, 2001, p. 94). Researchers within the school effectiveness tradition (e.g.,
Teddlie & Stringfield, 1993) have also argued for the use of qualitative methods.
While internal critics consider qualitative approaches potentially useful supplements
to quantitative research methods, external critics tend to favour the replacement of
quantitative with qualitative methodologies. In the 1990s, cooperation increased
between researchers from the quantitative school effectiveness tradition and their
colleagues from the school improvement field, which began as a qualitatively oriented
research tradition (Reynolds et al., 1993).
In addition to the call for more qualitative methods, authors working within the
field of SER recommend the adoption of even more advanced quantitative methods
of data analysis (Coe & Fitz-Gibbon, 1998; Goldstein, 1997; Scheerens & Bosker,
1997; Scheerens, Bosker, & Creemers, 2001). These authors note that, while data
analysis in most studies stops at estimating direct linear relationships, analyses should
also explore relationships that are more complex. According to Goldstein and
Woodhouse (2000), the quantitative analysis in many SER studies oversimplifies
reality to the point of distortion. They note that the results of analysis can be only as
good as the data they concern. In their view, the main problem lies in the availability
of suitable data and not in any technical flaws in the analytical procedures.
Reynolds and Teddlie (2001, pp. 108 – 109) make the radical recommendation of
advising researchers to focus on failure and dysfunctionality:

. . .the dominant paradigm has been to study those schools already effective or ‘‘well’’
and to simply propose the adoption of the characteristics of the former organisations as
the goal for the less effective. In medicine, by contrast, research and study focuses upon
the sick person and their symptoms, the causes of their sickness and on the needed
interventions that may be appropriate to generate health. The study of medicine does not
attempt to combat illness through the study of good health, as does school effectiveness:
it studies illness to combat illness through promoting ‘‘wellness’’.

According to these authors, one of the main reasons why so little is known about the
road from ineffectiveness to effectiveness is the fact that prior research has tended to
focus on normal or average schools.

Our position regarding the methodological calibre of SER. We agree with Goldstein
(1997) and Coe and Fitz-Gibbon (1998) that the term ‘‘school effect’’ is hardly
appropriate, as SER has thus far failed to show conclusively that schools are able to
influence that which is commonly known as the ‘‘school effect’’. In our opinion,
however, efforts to replace this term with another are unlikely to result in any
substantial improvement in the quality of SER. Considering alternative methods of
expressing the effects of schooling would be more helpful. In addition to measuring
school effects as relative differences between schools, it would be useful to evaluate
school effects according to absolute standards (e.g., a minimum literacy level). The
current convention of focusing on relative differences will always produce similar
percentages of schools in the ‘‘top level’’, even if hardly any school accomplishes
satisfactory results. The best way to assess the effects of schooling would be to take a
situation in which students receive no schooling (e.g., during summer holidays) as the
baseline.
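One way to formalise this idea (our own sketch, not a design taken from the literature reviewed here) is to contrast learning rates per unit of time with and without schooling:

$$\text{absolute effect of schooling} \approx \frac{\Delta y_{\text{school year}}}{t_{\text{school year}}} - \frac{\Delta y_{\text{summer}}}{t_{\text{summer}}},$$

where $\Delta y$ denotes the achievement gain over the period concerned; schooling then matters to the extent that learning during the school year outpaces learning in its absence.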
Researchers usually measure school effects as between-school differences that
cannot be explained by a number of control variables; as a result, these control
variables largely determine the size of the school effect. Most researchers seem to
agree that controlling for prior achievement is indispensable in SER, but the
consensus stops there. While several additional covariates are often considered
important (e.g., student SES, ethnicity, gender, IQ, school or class population
characteristics), they are apparently not considered indispensable. Because the
covariates that are addressed differ from study to study, the size of school effects is
bound to vary as well. As far as we know, there is no explicit consensus on a minimal
set of covariates that should be included and even less agreement on which control
variables should be taken into account in the ideal situation. At any rate, generally
accepted guidelines (preferably based on theoretical rather than pragmatic
considerations) regarding the types of covariates that should be considered in the
estimation of school effects would greatly improve the comparability of SER findings.
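
The dependence of effect estimates on the chosen covariate set is easy to demonstrate. The sketch below (Python with statsmodels; the data, covariates, and effect sizes are simulated assumptions rather than empirical findings) fits two value-added models to the same data and shows that the estimated school-level share of variance shifts when SES, which clusters by school, is added to the model alongside prior achievement.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Simulated data: 50 schools x 40 students; SES clusters by school.
n_schools, n_students = 50, 40
school = np.repeat(np.arange(n_schools), n_students)
ses = rng.normal(0, 1, n_schools)[school] + rng.normal(0, 0.5, school.size)
prior = rng.normal(0, 10, school.size)                 # prior achievement
true_school = rng.normal(0, 2, n_schools)[school]      # genuine school effect
score = 50 + 0.6 * prior + 4 * ses + true_school + rng.normal(0, 8, school.size)
df = pd.DataFrame({"score": score, "prior": prior, "ses": ses,
                   "school": school})

# Same outcome, two covariate sets: the apparent size of the
# "school effect" depends on what is controlled for.
for label, formula in [("prior only   ", "score ~ prior"),
                       ("prior and SES", "score ~ prior + ses")]:
    fit = smf.mixedlm(formula, df, groups=df["school"]).fit()
    between = float(fit.cov_re.iloc[0, 0])  # between-school variance
    within = fit.scale                      # within-school residual variance
    print(f"{label}: school-level variance share = "
          f"{between / (between + within):.2f}")
```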
In our view, the effect of schooling is more appropriately reflected in the rate of
learning than in the level of student achievement. School effectiveness studies
should also pay more attention to teachers—the ultimate providers of education—as a
source of variance in educational outcomes. Prior research has revealed substantial
differences among teachers within schools (Luyten, 2003). This is not to say that
there is no validity in assessing school effects in a multilevel analysis that focuses
exclusively on the distinction between variance at the school and student levels.
Differences between schools are likely to remain practically important. Although it is
sometimes possible to choose between schools, it is hardly ever possible to choose
among teachers. To ignore differences among teachers within schools, however, is to
ignore an essential aspect of educational reality.
In practice, researchers often use a single label to cover a wide range of
operationalisations (mostly self-reports), whose reliability, validity or both are
dubious. Much work remains in the development of standardised instruments for
measuring the (supposed) key factors in school effectiveness (see also the section on
upgrading the theoretical calibre of SER).
The choice for a specific research methodology should be based primarily on the
questions a study seeks to answer. Qualitative approaches are particularly suited for
explorative studies, in which key concepts are not yet clearly defined and the causal
links between them are still unclear. This applies to many aspects of the SER field. In
this respect, the strong reliance of SER on quantitative methods (the qualitative study
by Reynolds and Teddlie, 2000, is a notable exception) appears to be somewhat
misguided. While most quantitative techniques were originally designed to test
specific hypotheses, most quantitative SER studies involve the exploration of
statistical evidence for a range of hypotheses. This observation should not be taken
as a recommendation to abandon quantitative methods altogether. On the contrary, a
wide range of useful and well-developed test instruments is available for assessing the
outcomes of education, and they yield data that call for at least some basic statistical
analysis.
Some authors (Coe & Fitz-Gibbon, 1998; Goldstein, 1997; Scheerens & Bosker,
1997) argue for data analysis techniques that go beyond the usual estimation of direct,
linear relationships. Such approaches have already obtained a foothold in some areas
of SER. For example, Witziers, Bosker, and Krüger (2003) conclude that more and
more empirical studies on educational leadership are emerging that investigate the
indirect effects of leadership on school performance (i.e., the indirect influence of
leadership on school performance by way of teacher behaviour), and that these
studies support the tenability of models that focus on indirect effects. Although this
development is promising, we feel that going beyond the estimation of direct effects
makes sense only if the search for more complex relationships is guided by
theoretically well-founded hypotheses. Without a firm theoretical foundation, the
search for more complex relationships is bound to result in the kind of ‘‘fishing
expeditions’’ that already abound in SER.
The need for longitudinal SER studies that focus on the long-term development
of schools is widely acknowledged, even though such research entails many
practical problems (cf. Hill, 1998) and requires much patience, as the first results
require many years to obtain. One pragmatic alternative would be to reconstruct
school histories. Although this approach is not without problems (e.g., research
must be based on recollections and retrievable documents), it has at least two
serious advantages. First and most obviously, it reduces the amount of time
necessary to obtain findings that relate to long-term developments. A second
advantage is that the approach allows the purposive selection of schools that are
known to have gone through significant changes. In any case, little is known about
the impact of changing circumstances on the development of schools. For
example, although there is no doubt that student intake characteristics correlate
strongly with achievement scores, we also know that unadjusted school means are
much more stable over time than are means that have been adjusted for intake
(Gray, Goldstein, & Thomas, 2001). This suggests that changes in intake do not
result directly in changes in achievement (i.e., schools apparently try to achieve the
same average level of school performance, regardless of the intake levels of their
students).
Studies that focus on the effects of educational interventions and school
improvement efforts (e.g., the Barclay project, cf. Stringfield, 1995) are particularly
interesting, as their results constitute the ultimate test of the theories on which such
programmes are (implicitly or explicitly) based. Finally, we support the call by
Reynolds and Teddlie (2001) to pay more attention to clearly ineffective schools as a
starting point for expanding the school improvement knowledge base.
Where Do We Go From Here?


Having formulated our position regarding three lines of criticism about SER, we now
elaborate strategies that we consider especially promising for the future of SER.
In our view, the main shortcomings of SER (as it has developed to date) are:

1. its narrow reflection on educational goals;
2. its limited attention to determinants of learning outside the school;
3. its failure to assess the absolute effect of education on student development;
4. its strong reliance on cross-sectional research and the under-utilisation of theory development;
5. its focus on one-size-fits-all solutions for improving schools that differ widely among themselves.

Thus far, SER has placed too much emphasis on learning that takes place inside the
school and on academic goals that have traditionally received the most attention from
educational authorities. A genuine school effect would express the difference between
attending school and receiving no education in school. To date, however, only
relative performance differences between schools have been investigated. Moreover,
SER has been strongly observational and has made little use of existing theories.
Finally, the goal of developing strategies for school improvement based on insights
into the features of effective schools does not sufficiently consider the extent of
variation between schools.

What Are the Goals of Schooling?


We argue for an approach in which the development and future of pupils are the
primary focus. This approach implies that indicators of school effectiveness should
relate to (1) preparation for the labour market, (2) personal development, and (3)
civic education. It therefore calls for a much wider range of school outcomes in
addition to the well-known achievement scores, which mainly reflect basic skills in
language and mathematics. Adopting this approach would require much more
reflection on the goals of education. Researchers must address questions concerning
the types of knowledge and skills that are needed to succeed in the labour market and
the competencies that are necessary for sound personal and social development.
Although the questions described above obviously involve normative choices
(especially with regard to personal and social competencies), it is not necessary to
start from scratch. The official educational goals of most western countries (stated in
varying degrees of detail) include objectives concerning personal and social
development in addition to preparation for the labour market. These official goal
statements can serve as a starting point for the formulation of more specific objectives
and the development of research instruments for measuring personal and social
competencies. Optimally preparing individual students for life, given their unique
talents and abilities, should also reduce existing social inequalities. While we do not
consider this a primary goal of education, it is nonetheless a likely consequence of
optimal education.
Adequately evaluating many valuable outcomes of schooling requires the development
or drastic improvement of assessment instruments. Outcomes in the
cognitive domain (especially lower order cognitive skills) form an obvious exception,
although the number of instruments for other skills (e.g., problem-solving, learning to
learn, and cross-curricular competencies) is growing. Although cognitive develop-
ment is highly relevant to preparation for the labour market, other characteristics
(e.g., communication skills, persistence, and self-esteem) are also relevant.
Instruments used in educational research to measure such personal and social
competencies rely almost exclusively on teacher ratings or on self-reports by students.
Little is known about the validity of these instruments, and some of the findings based
on the use of such instruments certainly call for explanation. For example, a number
of studies (e.g., Fend, 1981; Martinot, Kuhlemeier, & Feenstra, 1988) have identified
a tendency for the self-confidence of students to decrease over the course of their
school careers. The percentage of students who believe themselves to be good at
mathematics decreases between the 1st and 2nd years of secondary education.
Student perceptions apparently do not reflect the progress they actually make. Yet, it
seems hardly conceivable that students in their 2nd year are not aware that they
perform at a higher level than do their peers in the lower grade. This is just one of the
problems associated with measuring the personal and social aspects of student
development. One problem with teacher ratings is that they may not reveal
differences between schools and classrooms, as teachers tend to use the typical
student in their classroom—rather than some (unknown) national average—as the
reference point for assigning scores. This tendency may result in the underestimation
of variation between schools and classrooms.

Determinants Outside the School


In our opinion, SER should also pay much closer attention to factors outside the
educational system that influence learning (like the family and peer group). Even
though almost every SER study confirms the limited influence of school factors and
the substantial impact of family background on learning, the latter relation is hardly
ever investigated thoroughly. The influence of factors outside the school on learning
can be substantial, and this influence is sometimes beneficial and sometimes not. As
already noted, SER studies usually treat family background, peer groups, and similar
factors as control variables in data analysis, in an attempt to estimate the effects of the
particular factors on which the studies focus (e.g., teaching time, amount of content
covered, educational leadership) more accurately. The notion that factors outside the
school and the classroom may cancel out the efforts of schools and teachers, however,
is not within the scope of SER.
In addition to its capacity for improving the quality of SER, we feel that more
insight into the learning limitations imposed by outside factors could also be of great
practical importance. In practice, such insight could facilitate the exploration of a
number of complex issues, including how to determine the extent to which the
demands that are placed on schools are realistic. For example, is it realistic to expect
schools to compensate for the language disadvantages of minority students when
these students continue to speak other languages as soon as they leave the classroom?
The need to understand factors outside the school is likely to be even stronger for
non-cognitive outcomes (social and personal competencies) than it is for traditional,
cognitive goals.

The Absolute Effect of Schools


Perhaps the most striking shortcoming of SER is that it has thus far failed to assess the
true effect of schooling. The ‘‘school effects’’ reported in these studies are actually
differences between schools. Because differences between students within schools are
nearly always considerably larger than are differences between schools, the school
effects reported in SER studies hardly ever exceed 15% of the total variance in studies based on
achievement scores. For non-cognitive outcomes, the school effects are even smaller
(Thomas, Smees, MacBeath, Robertson, & Boyd, 2000; Van der Wal & Rijken,
2002). This does not mean, however, that education makes only a minor contribution
to the development of students. The small amount of variation between schools may
partly be due to a generally high level of teaching quality in all schools.
In school effectiveness studies that address both the school and the classroom
levels, differences between classes within schools explain much more of the variation
in student performance than do differences between schools. In other words, the
quality of teaching varies considerably between the classes of a school. In SER,
‘‘school characteristics’’ usually refer to features that are measured at the school level
(e.g., ‘‘educational leadership’’, ‘‘achievement-oriented policy’’ and ‘‘evaluative
potential’’). Given the strength of the impact of teaching quality, however, the average
teaching quality in schools can also be regarded as an important school feature (i.e.,
effective schools have many good teachers) that could explain performance variation
between schools. In this respect, the effect of schools may be considerable.
The common SER finding that ‘‘school effects’’ are small requires careful
interpretation; it means only that features measured at the school level explain
relatively little of the variation in student performance. The investigation of differences between schools with regard
to average teaching quality (both by and across subject areas) and relating those
differences in quality to student performance (both by and across subject areas) could
yield interesting results. Yet, differences in teaching quality within schools may be
much larger than the differences between schools. In that case, teaching quality may
explain a considerable amount of the variation in student performance, but only a
small amount of the variation between schools.
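
A worked decomposition clarifies this point. The sketch below (Python; the variance components are assumptions chosen purely for illustration) estimates school-, class-, and student-level variance components from a balanced simulated data set using standard ANOVA-type corrections; the class component clearly exceeds the school component, while the student component dominates both.

```python
import numpy as np

rng = np.random.default_rng(1)

# Balanced simulated data: 40 schools x 4 classes x 25 students, with
# assumed standard deviations of 1 (school), 2 (class), and 6 (student).
S, C, N = 40, 4, 25
scores = (50
          + rng.normal(0, 1.0, (S, 1, 1))    # school component
          + rng.normal(0, 2.0, (S, C, 1))    # class-within-school component
          + rng.normal(0, 6.0, (S, C, N)))   # student component

# ANOVA-type estimates of the variance components (balanced design):
# each observed variance of means is corrected for the sampling noise
# contributed by the level(s) below it.
student_var = scores.var(axis=2, ddof=1).mean()
class_means = scores.mean(axis=2)
class_var = class_means.var(axis=1, ddof=1).mean() - student_var / N
school_var = (class_means.mean(axis=1).var(ddof=1)
              - class_var / C - student_var / (C * N))

total = school_var + class_var + student_var
print(f"school share:  {school_var / total:.1%}")   # around 2-3%
print(f"class share:   {class_var / total:.1%}")    # around 10%
print(f"student share: {student_var / total:.1%}")  # around 88%
```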
Assessing the impact of schooling is complicated by the fact that nearly everyone
attends school. Although it is hypothetically possible to estimate the absolute effect of
education by comparing those who attend school with a control group of those who
do not, this strategy is not feasible in practice. If the criteria that determine which
individuals receive a ‘‘treatment’’ (education) and which do not are known, it is
possible to estimate the treatment effect nearly as accurately as in randomised
experimentation.
Admission to school is primarily based on a student’s date of birth (in Dutch
primary education, students born before October 1st are usually placed in a higher
grade than the ones born later). Data from two consecutive grades (e.g., achievement
scores relating to a common scale, as well as opinions, attitudes, and teacher ratings)
that include the birth dates of the students allow researchers to assess the absolute
contribution of schooling to the development of students. Estimating the effect of age
on achievement should reveal a discontinuity between the oldest students in the lower
grade and the youngest students in the higher grade. This discontinuity reflects the
effect of having received an extra year of schooling (i.e., being in the higher grade).
It is also possible to combine the regression-discontinuity approach with multilevel
modelling commonly used in SER. Multilevel analysis can assess the extent to which
such discontinuities vary between schools. This strategy would provide both an
estimate of the absolute effect of schooling and an estimate of the variation between
schools in this respect.
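
A minimal sketch of this design is given below (Python with statsmodels; the cut-off age, effect sizes, and data are simulated assumptions). A smooth age trend is fitted together with a dummy for being in the higher grade; the discontinuity at the cut-off estimates the absolute effect of one extra year of schooling, and a random slope for the grade dummy lets that effect vary across schools.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

# Simulated data for 60 schools: students from two adjacent grades, with
# grade placement determined by age relative to a hypothetical cut-off.
n_schools, per_school = 60, 50
school = np.repeat(np.arange(n_schools), per_school)
age = rng.uniform(6.0, 8.0, school.size)        # age at testing, in years
higher_grade = (age >= 7.0).astype(int)         # admission cut-off at age 7

# Assumed data-generating process: a smooth age effect plus a jump of
# roughly 8 points for the extra year of schooling, varying by school.
school_gain = 8 + rng.normal(0, 2, n_schools)
score = (20 + 5 * age + school_gain[school] * higher_grade
         + rng.normal(0, 6, school.size))
df = pd.DataFrame({"score": score, "age": age, "grade": higher_grade,
                   "school": school})

# Regression discontinuity embedded in a multilevel model: the 'grade'
# coefficient estimates the absolute effect of one extra year of
# schooling; its random slope captures between-school variation.
fit = smf.mixedlm("score ~ age + grade", df, groups=df["school"],
                  re_formula="~grade").fit()
print(fit.summary())
```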

Other Research Designs and More Theory-Driven Research


In our view, the incorporation of other research designs in addition to those already in
use and increased emphasis on theory-driven research could also help to expand the
SER knowledge base. To date, SER has relied too heavily on cross-sectional research
designs. Other designs, particularly experimental research designs (when aiming to
improve school features and student achievement), are indispensable to solving the
problem of causation, and longitudinal studies are needed. We acknowledge the
practical difficulties involved in addressing these needs. For example, most schools
would have no interest in participating in studies in which, if assigned to the control
group, they would not receive the treatment (e.g., a self-evaluation tool), or when
there is no funding for longitudinal research. If we are to deepen our insight into the
causes of school effectiveness, we must find ways to increase the number of
experimental and longitudinal studies.
Insight into the mechanisms behind school effectiveness requires the application of
causal analytical techniques and theories on the functioning of school organisations.
School effectiveness research should focus less on the well-known key variables and
make more use of theories that (attempt to) explain specific phenomena that can also
translate to the context of schools. Although the fact that this area of research is still
young necessarily implies that school effectiveness theory is still quite limited, there is
evidence of progress. For example, Annevelink (2004) bases her study on the work of
Anderson (2000), who developed a hypothetical model illustrating how reductions in
class size affect student achievement along three different lines (each consisting of
various chains).
Theories on organisational functioning could also be useful. For example, the
concept of ‘‘the learning organisation’’ implies a causal chain; the sustainability of an
organisation relies on innovation and improvement that, in turn, can be attained only
by unlocking individual potential and enhancing commitment by creating favourable
organisational conditions. Silins and Mulford (2003) and Silins, Mulford, and Zarins
(2002) show how these theories can be applied to research into the functioning of
schools by investigating causal relationships between particular leadership behaviours
(transformational leadership), characteristics of school cultures (organisational
learning), and indicators of school effectiveness (student well-being).
There is also considerable theory on the functioning of school organisations.
Visscher (1999) points to the fact that few of the theoretical notions on the
organisation-theoretical aspects of the teaching-learning process, coordination within
schools, teacher commitment, the alignment of schools to their environment, and
similar issues, have been explored from the perspective of school effectiveness. In
other words, little is known about the relationship between various organisational
arrangements of schools and their effectiveness. Rowan, Raudenbush, and Cheong
(1993) present a fine example of how the application of (school) organisational theory
can refine our insight into how schools function and how their organisational features
relate to their effectiveness. Based on the well-known organisational theories of
Thompson (1967), they have studied the following topics:

• workplace conditions that make teaching non-routine;
• whether organic forms of management arise when teaching becomes non-routine;
• whether organic management enhances teacher effectiveness by promoting job-related learning.

Micro-economic theory (Scheerens & Van Praag, 1998) may also provide a fruitful
theoretical basis for more cause-effect oriented studies in schools. From this
perspective, actors within educational organisations operate as rational decision-
makers, who allocate their time and efforts to activities in such a way as to maximise
their personal utility. Formal analysis can lead to hypotheses concerning what
happens to the allocation of time and effort when conditions within schools change.
These hypotheses can subsequently be verified through research and statistical
analyses. In this way, micro-economic theory may help to clarify why actors within
educational organisations act as they do, how they respond to changes in the school
setting and how their behaviour can be changed such that their own productivity—as
well as that of the organisation—will increase.
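
A toy version of such a formal analysis is sketched below (Python; the utility function, time budget, and reward weight are purely illustrative assumptions). A teacher allocates a fixed time budget between instructional effort and other activities so as to maximise personal utility; solving the allocation problem for different values of the reward weight yields the testable comparative-statics hypothesis that instructional effort rises when school conditions reward it more strongly.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# A teacher divides a fixed time budget T between instructional effort e
# and other activities (T - e). The hypothetical weight w reflects how
# strongly school conditions reward instructional effort (recognition,
# evaluation pressure, resources).
T = 10.0

def utility(e, w):
    # Diminishing returns on both uses of time (log utility).
    return w * np.log(1 + e) + np.log(1 + (T - e))

for w in (0.5, 1.0, 2.0):
    res = minimize_scalar(lambda e: -utility(e, w),
                          bounds=(0, T), method="bounded")
    print(f"reward weight w={w}: optimal instructional effort = {res.x:.2f}")
```

Consistent with the micro-economic reasoning above, the optimal effort in this toy model increases from about 3 to 7 hours as w rises from 0.5 to 2, which is exactly the kind of hypothesis that could subsequently be tested empirically.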
Micro-economic theory calls one key idea of SER into question: the expectation
that the manipulation of the right (school-level, classroom-level or both) variables
automatically produces improvements in student achievement levels. Research into
educational leadership can be used to illustrate this way of reasoning in much of the
SER. This type of research assumes that principals have central roles within schools
and can thus exercise a certain amount of control over processes within their
organisations. It is assumed that, should a principal initiate activities aimed at
improving academic achievement, both teachers and students will react, thereby
improving achievement. However, although the position of the principal within a
school is formally central and powerful, many studies in the field of educational
administration (e.g., Ingersoll, 1993; Witziers, 1992) have shown this assumption to
be highly questionable. Micro-economists argue that educational organisations
should be considered as multiplayer organisations in which power is spread among
school staff. Students and teachers may respond to school policies by withdrawing the
resources that are under their control (i.e., their effort), with the consequence that
policies initiated by the principal may have no impact at all.

A Different Approach to School Improvement


One of the goals of SER is to provide insight into the characteristics of schools and
classrooms that are associated with differences in school effectiveness. Such
knowledge is often regarded as a potential foundation for school improvement
interventions: If we know the features of effective schools, we can improve
underperforming schools by encouraging them to adopt the characteristics of
effective schools. As noted above, however, some authors are pessimistic about
whether the kind of correlational analysis carried out in SER will ever provide a solid,
general basis for improving schools, given that schools differ so much in several
important respects. The views of these critics are consistent with those of scholars in
the field of school improvement.
Dalin (1998), McLaughlin (1998), and Miles (1998) stress the local variability of
schools, implying that uniform, centrally developed reform policies and strategies will
not lead to the desired educational change in all schools. They argue that schools
differ so much in performance (and in the causes underlying their performance),
capacity for change, and contextual characteristics, that school improvement efforts
should carefully consider the ‘‘power of site or place’’. Smith (1998) goes a step further,
arguing that, because they are the most familiar with their educational practices,
practitioners should state the goals and changes to be pursued, and that they should
try to accomplish those goals, supported by extensive training. Only under such
conditions can adaptation to the user-context be achieved.
Related to the pessimism of the school improvement authors is the view of Glass
(1979), who considers ‘‘education’’ a complex, highly uncertain, and unpredictable
system about which our knowledge is incomplete. Glass argues that we should not
search for eternal truths about which strategies work well in particular circumstances,
with the goal of planning and manipulating education from positions that are far
removed from the teaching-learning process in schools. We should seek instead to
develop diligently monitored systems of highly decentralised services and flexible
actors who can choose what they consider best (rather than precisely implementing
universal approaches developed at a higher level).
Support for gradual, local improvement interventions was evident in the early work
of Dahl and Lindblom (1963), who advocated grounding the definition of goals and
values in concrete contexts rather than basing them on abstract goals. Decision-
makers simply are not sufficiently familiar with the systems that they must control to
be able to take solid decisions. These authors recommend a trial and error approach:
try to solve manageable, short-term problems incrementally by making testable
interventions. This strategy allows schools to adapt continuously to changing
circumstances, with improvement as their goal. The authors assume that this
approach would be more effective than the strategy of attempting to take major,
general steps forward. The latter strategy ultimately does not work—at least not for all
schools.
Other authors (e.g., Fullan, 1993; Stringfield, Millsap, & Herman, 1998) argue
that a strategy of simply supporting the efforts of schools to develop contextualised
strategies for improvement will not ‘‘do the trick’’; support should always be
accompanied by ‘‘pressure to improve’’. For example, such pressure can be exerted
by formulating clear targets, exercising external control, publishing school
performance results, and similar activities. Slavin (1998) observes that most schools
lack the capacity to improve on their own and therefore argues for comprehensive
models of reform: evidence-based prescriptions (e.g., ‘‘Success for All’’) that include
professional development, and instructional, curricular, and organisational prescrip-
tions for schools.
In our opinion, school-performance feedback systems (SPFSs) may be valuable tools to
use within the various perspectives on school improvement that we have presented.
Visscher and Coe (2002) define SPFSs as information systems external to schools
that provide schools with confidential information on their performance and
functioning as a basis for school self-evaluation. High-quality SPFSs have become
available in various parts of the world (cf. Visscher & Coe, 2002, for their description
and analysis). The systems can provide schools with systematically collected, high-
quality feedback on their performance and functioning in comparison with other
schools, and this feedback can serve as a basis for practitioner-led improvement
actions. The feedback can refer to both academic and non-academic outcomes of
schooling, features of school organisation (e.g., resources spent, educational
leadership, school climate, team cohesion, job satisfaction) and classroom processes
(e.g., the nature of instructional processes, the subject matter taught, the classroom
climate, student monitoring). Such feedback may help practitioners to detect
problems in the functioning of their schools in a timely fashion, to investigate the
underlying causes, and to develop remedies. By regularly diagnosing and
attempting to optimise their school organisations, practitioners may gain more insight
into how their schools function. More importantly, they may also learn to identify
strategies for change that work best in their particular situations.
The consistent (albeit small: .15; Scheerens & Bosker, 1997) effect of ‘‘evaluation
and monitoring’’ on educational effectiveness, the conceptual importance of the
feedback mechanism in control theory (and other scientific disciplines), combined
with the proven substantial positive effect of feedback interventions (cf. the meta-
analysis of 131 studies by Kluger & DeNisi, 1996) make SPFSs a promising lever for
school improvement. At the same time, however, the assumption that an evaluator or
evaluation system (such as an SPFS) can ‘‘know’’ some truth that practitioners should
know, and that the receipt of that knowledge will lead to behavioural change, has not
held up under many conditions. Even the utilisation of relevant, high-quality
evaluative data is by no means guaranteed (Huberman, 1987; Louis, 1998; Weiss, 1998). Based on her
long-standing research into the utilisation of evaluation findings, Weiss (1998)
concludes that evaluative information can be used partially, in fragments,
intermittently, inappropriately, or not at all. Prerequisites for successful use are that
the target group (e.g., teachers) hear about, understand, and believe the evaluative
data, know what to do in response to the data, and be convinced that they can and
should do something about the observed phenomena. Should practitioners actually
start taking action, they should ideally have the perseverance to continue in the face of
roadblocks (e.g., lack of time or skills). Simple, low-cost changes that are consonant with
existing practices are much easier to accomplish than are school improvement
interventions that involve far-reaching and controversial changes.
Research should answer questions concerning the conditions under which
feedback from SPFSs can have a beneficial effect on school effectiveness. More
specifically, the following research questions are important:

1. How can the feedback be made accessible to as many members of the target
group as possible?
2. Which characteristics of the feedback content are most appreciated and
considered most credible by school staff?
3. Which type of feedback is most accessible and easy to understand?
4. What is involved in using the feedback to detect problems, find remedies and
implement them successfully, thereby making schools more effective?
5. Which change strategies make schools more effective, given specific
school and context characteristics?

This type of research calls for longitudinal studies, as the process of change and
improvement takes years. Answering several of the research questions requires
qualitative investigations to explore how schools act in response to the feedback, what
problems occur, and how they can be addressed. In addition, more improvement-
oriented research, along with more traditional SER, could prove valuable for
portraying ‘‘the effectiveness history’’ of the schools under scrutiny (question 5); has
the effectiveness of the schools in question improved and, if it has, how is their
effectiveness related to the other features of the schools?
There is no guarantee that the full implementation of all of the suggestions we have
made will resolve all of the flaws of SER as they have been described. Further, the
new approaches are likely to generate new problems as well. These potential
improvements are worthy of exploration, however, as they have the potential to bring
us closer to understanding what makes schools effective and how they can become
(even) more effective.

References
Abell, P. (1991). Rational choice theory. Aldershot: Elgar.
Anderson, L. W. (2000). Why should reduced class size lead to increased student achievement? In
M. C. Wang & J. D. Finn (Eds.), How small classes help teachers do their best (pp. 3 – 24).
Philadelphia: Temple University Centre for Research in Human Development and Education.
Angus, L. (1993). The sociology of school effectiveness. British Journal of Sociology of Education, 14,
333 – 345.
Annevelink, E. (2004). Class size: Linking teaching and learning. Enschede, The Netherlands: Print
Partners Ipskamp.
Ball, S. (1998). Educational studies, policy entrepreneurship and social theory. In R. Slee & G.
Weiner (with S. Tomlinson) (Eds.), School effectiveness for whom? Challenges to the school
effectiveness and school improvement movements (pp. 70 – 83). London: Falmer Press.
Bosker, R. J., & Visscher, A. J. (1999). Linking school management theory to school effectiveness
research. In A. J. Visscher (Ed.), Managing schools towards high performance (pp. 291 – 322).
Lisse, The Netherlands: Swets & Zeitlinger.
Coe, R., & Fitz-Gibbon, C. T. (1998). School effectiveness research: Criticisms and
recommendations. Oxford Review of Education, 24(4), 421 – 438.
Coleman, J. S., Campbell, E., Hobson, C., McPartland, J., Mood, A., Weinfeld, F., & York, R.
(1966). Equality of educational opportunity. Washington, DC: Government Printing Office.
Creemers, B. P. M. (1994). The effective classroom. London: Cassell.
Cuban, L. (1983). Effective schools: A friendly but cautionary note. Phi Delta Kappan, 64, 695 –
696.
Dahl, R., & Lindblom, C. (1963). Politics, economics, and welfare: Planning and politico-economic
systems resolved into basic social processes. New York: Harper.
Dalin, P. (1998). Developing the twenty-first century school, a challenge to reformers. In A.
Hargreaves, A. Lieberman, M. Fullan, & D. Hopkins (Eds.), International handbook of
educational change (Vol. 5, pp. 1059 – 1073). Dordrecht/Boston/London: Kluwer Academic
Publishers.
De Vos, H. (1989). A rational-choice explanation of compositional effects in educational research.
Rationality and Society, 1, 220 – 239.
Edmonds, R. R. (1979). Effective schools for the urban poor. Educational Leadership, 37, 15 – 27.
Elliot, J. (1996). School effectiveness research and its critics. Alternative visions of schooling.
Cambridge Journal of Education, 26(2), 199 – 224.
Fend, H. (1981). Theorie der Schule, 2. durchgesehene Auflage [Theory of the school, second revised
edition]. München, Germany: Urban & Schwarzenberg.
Fullan, M. (1991). The new meaning of educational change. London: Cassell.
Fullan, M. (1993). Change forces: Probing the depths of educational reform. London: Falmer Press.
Glass, G. V. (1979). Policy for the unpredictable (Uncertainty research and policy). Educational
Researcher, 8(9), 12 – 14.
Goldstein, H. (1997). Methods in school effectiveness research. School Effectiveness and School
Improvement, 8, 369 – 395.
Goldstein, H., & Woodhouse, G. (2000). School effectiveness research and educational policy.
Oxford Review of Education, 26, 353 – 363.
Grace, G. (1998). Realizing the mission: Catholic approaches to school effectiveness. In R. Slee &
G. Weiner (with S. Tomlinson) (Eds.), School effectiveness for whom? Challenges to the school
effectiveness and school improvement movements (pp. 117 – 127). London: Falmer Press.
Gray, J., Goldstein, H., & Jesson, D. (1996). Changes and improvements in schools’ effectiveness:
Trends over five years. Research Papers in Education, 11, 35 – 51.
Gray, J., Goldstein, H., & Thomas, S. (2001). Predicting the future: The role of past performance
in determining trends in institutional effectiveness at A-level. British Educational Research
Journal, 27, 391 – 406.
Hallinger, P., & Heck, R. H. (1996). The principal’s role in school effectiveness: An assessment of
methodological progress, 1980 – 1995. Paper presented at the Annual Meeting of the American
Educational Research Association, New York.
Hamilton, D. (1998). The idols of the market place. In R. Slee & G. Weiner (with S. Tomlinson)
(Eds.), School effectiveness for whom? Challenges to the school effectiveness and school improvement
movements (pp. 13 – 20). London: Falmer Press.
Hargreaves, D. (1994). Changing teachers, changing times. London: Cassell.
Hatcher, R. (1998). Social justice and the politics of school effectiveness and school improvement.
Race, Ethnicity and Education, 1(2), 267 – 289.
Hill, P. W. (1998). Shaking the foundations: Research driven school reform. School Effectiveness and
School Improvement, 9, 419 – 436.
Hill, P. W., & Rowe, K. J. (1996). Multilevel modelling in school effectiveness research. School
Effectiveness and School Improvement, 7, 1 – 34.
Huberman, M. (1987). Steps towards an integrated model of research utilization. Knowledge:
Creation, Diffusion, Utilization, 8(4), 586 – 611.
Ingersoll, R. H. (1993). Loosely coupled organizations revisited. Research in the Sociology of
Organizations, 11, 81 – 112.
Jencks, C. S., Smith, M., Ackland, H., Bane, M. J., Cohen, D., Gintis, H., Heyns, B., & Michelson,
S. (1972). Inequality, a reassessment of the effects of family and schooling in America. New York:
Basic.
Kluger, A. N., & DeNisi, A. (1996). The effects of Feedback Interventions on performance: A
historical review, a meta-analysis, and a preliminary Feedback Intervention Theory.
Psychological Bulletin, 119(2), 254 – 284.
Lauder, H., Jamieson, I., & Wikeley, F. (1998). Models of effective schools: Limits and capabilities.
In R. Slee & G. Weiner (with S. Tomlinson) (Eds.), School effectiveness for whom? Challenges to
the school effectiveness and school improvement movements (pp. 51 – 69). London: Falmer Press.
Lingard, B., Ladwig, J., & Luke, A. (1998). School effects in postmodern conditions. In R. Slee &
G. Weiner (with S. Tomlinson) (Eds.), School effectiveness for whom? Challenges to the school
effectiveness and school improvement movements (pp. 84 – 100). London: Falmer Press.
Louis, K. S. (1998). Reconnecting knowledge utilization and school improvement: Two steps
forward, one step back. In A. Hargreaves, A. Lieberman, M. Fullan, & D. Hopkins (Eds.),
International handbook of educational change (Vol. 5, pp. 1074 – 1095). Dordrecht/Boston/
London: Kluwer Academic Publishers.
Luyten, H. (2003). The size of school effects compared to teacher effects, an overview of the
research literature. School Effectiveness and School Improvement, 14, 31 – 51.
Luyten, H., & Snijders, T. A. B. (1996). School effects and teacher effects in Dutch elementary
education. Educational Research and Evaluation, 2, 1 – 24.
Martinot, M. J., Kuhlemeier, H., & Feenstra, H. J. M. (1988). Het meten van affectieve doelen; de
validering en normering van een belevingsschaal voor wiskunde [Measuring affective goals;
validating and norming a mathematics attitude scale]. Tijdschrift voor Onderwijsresearch,
13(2), 65 – 76.
McLaughlin, M. W. (1998). Listening and learning from the field: Tales of policy implementation
and situated practice. In A. Hargreaves, A. Lieberman, M. Fullan, & D. Hopkins (Eds.),
International handbook of educational change (Vol. 5, pp. 70 – 84). Dordrecht/Boston/London:
Kluwer Academic Publishers.
Miles, M. B. (1998). Finding keys to school change: A 40-year odyssey. In A. Hargreaves, A.
Lieberman, M. Fullan, & D. Hopkins (Eds.), International handbook of educational change (Vol.
5, pp. 37 – 39). Dordrecht/Boston/London: Kluwer Academic Publishers.
Quinn, R. E. (1988). Beyond rational management; Mastering paradoxes and competing demands of high
performance. San Francisco: Jossey-Bass.
Ralph, J. H., & Fennessey, J. (1983). Science or reform: Some questions about effective schools. Phi
Delta Kappan, 64, 689 – 694.
Rea, J., & Weiner, G. (1998). Cultures of blame and redemption—When empowerment becomes
control: Practitioners’ views of the effective school movement. In R. Slee & G. Weiner (with S.
Tomlinson) (Eds.), School effectiveness for whom? Challenges to the school effectiveness and school
improvement movements (pp. 21 – 32). London: Falmer Press.
Reynolds, D., Hopkins, D., & Stoll, L. (1993). Linking school effectiveness knowledge and school
improvement practice: Towards a synergy. School Effectiveness and School Improvement, 4, 37 –
58.
Reynolds, D., & Teddlie, C. (2000). The processes of school effectiveness. In C. Teddlie & D.
Reynolds (Eds.), The international handbook of school effectiveness research (pp. 134 – 159).
London/New York: Falmer Press.
Reynolds, D., & Teddlie, C. (2001). Reflections on the critics and beyond them. School Effectiveness
and School Improvement, 12, 99 – 113.
Rowan, B., Bossert, S. T., & Dwyer, D. C. (1983). Research on effective schools: A cautionary
note. Educational Researcher, 12, 24 – 31.
Rowan, B., Raudenbush, S. W., & Cheong, Y. F. (1993). Teaching as a nonroutine task:
Implications for the management of schools. Educational Administration Quarterly, 29(4), 479 –
500.
Rutter, M., Maughan, B., Mortimore, P., & Ouston, J. (1979). Fifteen thousand hours: Secondary
schools and effects on children. Boston: Harvard University Press.
Scheerens, J. (1997). Conceptual models and theory-embedded principles on effective schooling.
School Effectiveness and School Improvement, 8, 269 – 310.
Scheerens, J., & Bosker, R. J. (1997). The foundations of educational effectiveness. Oxford/New York/
Tokyo: Pergamon.
Scheerens, J., Bosker, R. J., & Creemers, B. P. M. (2001). Time for self-criticism: On the viability
of school effectiveness research. School Effectiveness and School Improvement, 12, 131 – 157.
Scheerens, J., & Van Praag, B. M. S. (Eds.). (1998). Micro-economic theory and educational
effectiveness. Enschede: Print Partners Ipskamp.
Silins, H., & Mulford, B. (2003). Schools as learning organisations. The case for system, teacher
and student learning. Journal of Educational Administration, 40(5), 425 – 446.
Silins, H. C., Mulford, W. R., & Zarins, S. (2002). Organizational learning and school change.
Educational Administration Quarterly, 38(5), 613 – 642.
Slater, R. O., & Teddlie, C. (1992). Toward a theory of school effectiveness and leadership. School
Effectiveness and School Improvement, 3, 242 – 257.
Slavin, R. (1998). Sands, bricks and seeds: School change strategies and readiness for reform. In A.
Hargreaves, A. Lieberman, M. Fullan, & D. Hopkins (Eds.), International handbook of
educational change (Vol. 5, pp. 1299 – 1313). Dordrecht/Boston/London: Kluwer Academic
Publishers.
Slee, R., & Weiner, G. (2001). Education reform and reconstruction as a challenge to research
genres: Reconsidering school effectiveness research and inclusive schooling. School Effective-
ness and School Improvement, 12, 83 – 98.
Slee, R., Weiner, G (with Tomlinson, S.) (Eds.). (1998). School effectiveness for whom? Challenges to
the school effectiveness and school improvement movements. London: Falmer Press.
Smith, L. M. (1998). A kind of educational idealism: Integrating realism and reform. In A.
Hargreaves, A. Lieberman, M. Fullan, & D. Hopkins (Eds.), International handbook of
educational change (Vol. 5, pp. 100 – 120). Dordrecht/Boston/London: Kluwer Academic
Publishers.
Stringfield, S. (1995). Attempting to enhance students’ learning through innovative programs: The
case for schools evolving into high reliability organizations. School Effectiveness and School
Improvement, 6, 67 – 96.
Stringfield, S., Millsap, M. A., & Herman, R. (1998). Using ‘‘Promising Programs’’ to improve
educational processes and student outcomes. In A. Hargreaves, A. Lieberman, M. Fullan, &
D. Hopkins (Eds.), International handbook of educational change (Vol. 5, pp. 1314 – 1338).
Dordrecht/Boston/London: Kluwer Academic Publishers.
Stringfield, S., & Slavin, R. E. (1992). A hierarchical longitudinal model for elementary school
effects. In B. P. M. Creemers & G. J. Reezigt (Eds.), Evaluation of effectiveness (ICO
publication 2; pp. 35 – 68). Groningen, The Netherlands: ICO.
Teddlie, C., & Reynolds, D. (2000). The international handbook of school effectiveness research.
London/New York: Falmer Press.
Teddlie, C., & Stringfield, S. (1993). Schools make a difference: Lessons learned from a 10-year study of
school effects. New York: Teachers College Press.
Teddlie, C., Stringfield, S., & Reynolds, D. (2000). Context issues within school effectiveness
research. In C. Teddlie & D. Reynolds (Eds.), The international handbook of school effectiveness
research (pp. 160 – 186). London/New York: Falmer Press.
Thomas, S., Smees, R., MacBeath, J., Robertson, P., & Boyd, B. (2000). Valuing pupils’ views in
Scottish schools. Educational Research and Evaluation, 6, 281 – 316.
Thompson, J. D. (1967). Organizations in action. New York: McGraw-Hill.
Thrupp, M. (1999). Schools making a difference, let’s be realistic! Buckingham/Philadelphia: Open
University Press.
Thrupp, M. (2001). Sociological and political concerns about school effectiveness research: Time
for a new research agenda. School Effectiveness and School Improvement, 12, 7 – 40.
Townsend, T. (Ed.). (2001). Twenty years of school effectiveness research: Critique and response.
School Effectiveness and School Improvement, 12, 1 – 157.
Van der Wal, M., & Rijken, S. (2002). Cross Curriculaire Competenties: De samenhang tussen cognitieve
schoolprestaties, CCC en schoolkenmerken [Cross curricular competences: the relationship
between cognitive achievement, CCC and school characteristics]. Paper presented at the
Onderwijs Research Dagen, 29 – 31 May 2002, Antwerp, Belgium.
Visscher, A. J. (Ed.). (1999). Managing schools towards high performance. Lisse, The Netherlands:
Swets & Zeitlinger.
Visscher, A. J., & Coe, R. (2002). School improvement through performance feedback. Lisse, The
Netherlands: Swets & Zeitlinger.
Weiss, C. H. (1998). Improving the use of evaluations: Whose job is it anyway? In A. J. Reynolds &
H. J. Walberg (Eds.), Advances in educational productivity (Vol. 7, pp. 263 – 276). Greenwich/
London: JAI Press.
Witziers, B. (1992). Coördinatie binnen scholen voor voortgezet onderwijs [Coordination within
secondary schools]. Enschede, The Netherlands: Department of Education, University of
Twente.
Witziers, B., & Bosker, R. J. (1997, January). A meta-analysis on the effects of presumed school
effectiveness enhancing factors. Paper presented at the International Congress for School
Effectiveness and Improvement, Memphis, TN.
Witziers, B., Bosker, R. J., & Krüger, M. L. (2003). Educational leadership and student
achievement: The elusive search for an association. Educational Administration Quarterly,
39(3), 398 – 425.