birenbaum-et-al-2015-international-trends-in-the-implementation-of-assessment-for-learning-implications-for-policy-and
Christopher DeLuca
Queen’s University, Canada
Lorna Earl
Lorna Earl and Associates, Canada
Margaret Heritage
University of California, Los Angeles, USA
Val Klenowski
Queensland University of Technology, Australia
Anne Looney
National Council for Curriculum and Assessment, Ireland
Kari Smith
Norwegian University of Science and Technology and University
of Bergen, Norway
Helen Timperley
University of Auckland, New Zealand
Louis Volante
Brock University, Canada
Claire Wyatt-Smith
Australian Catholic University, Australia
Abstract
This paper discusses the emergence of assessment for learning (AfL) across the globe with
particular attention given to Western educational jurisdictions. Authors from Australia,
Canada, Ireland, Israel, New Zealand, Norway, and the USA explain the genesis of AfL, its
evolution and impact on school systems, and discuss current trends in policy directions for AfL
within their respective countries. The authors also discuss the implications of these various shifts
and the ongoing tensions that exist between AfL and summative forms of assessment within
national policy initiatives.
Keywords
Assessment for learning, international trends, educational policy
Introduction
Formative assessment has been an informal activity for a very long time in some classrooms
and countries around the world. Bloom et al. (1971) moved formative assessment
into a more formal space when they wrote a book entitled Formative and Summative
Evaluation of Student Learning, in which they described a view of education in which the
primary purpose of schooling was the development of the individual. In their view,
assessment and evaluation were a part of learning and classroom teachers played a
prominent role in using evaluation to improve and extend student learning. Many
educators and researchers began to advocate assessment as being educationally useful
(e.g., Black, 1998; Crooks, 1988; Gipps, 1994; Popham, 1995; Shepard, 1989; Wolf et al.,
1991; Stiggins, 1991; Sutton, 1995; Wiggins, 1993; Wiggins and McTighe, 1999). They
focused on the importance and value of the assessment that teachers do every day in
classrooms as a critical element in helping students learn.
Formative assessment moved to center stage when Black and Wiliam (1998a) synthesized
over 250 studies linking assessment and learning and found that the intentional use of
assessment in the classroom promoted learning and improved student achievement. The
Assessment Reform Group, in England, described the findings this way (Assessment
Reform Group, 1999):
Assessment that is explicitly designed to promote learning is the single most powerful tool we have
for raising standards and empowering life-long learning. (Assessment Reform Group, 2002: 2)
They called this focus on assessment that is directly connected to helping students learn
assessment for learning (AfL):
[The] process of seeking and interpreting evidence for use by learners and their teachers to decide
where the learners are in their learning, where they need to go and how best to get there.
(Assessment Reform Group, 2002: 2)
In the last two decades formative assessment has been taken up in practice and in policy around
the world. In country after country, formative assessment or assessment for learning has been
infused into or adopted for educational assessment and evaluation policies or practices.
Since 2001 a group of researchers, policy makers and professional development
facilitators from a number of countries have been meeting every three or four years to
share, examine and explore assessment for learning in a wide range of contexts
Birenbaum et al. 119
(Klenowski, 2009). In this paper, researchers from seven of these countries (Australia,
Canada, Ireland, Israel, New Zealand, Norway, and the USA) provide a chronicle of how
formative assessment has emerged and describe the current status of AfL in their countries.
Taken together they provide a rich and fascinating picture of the emergence of a major
innovation in education.
International profiles
Australia
Genesis of assessment for learning. In Australia, following the seminal work of Paul Black and
Dylan Wiliam (1998a; 1998b) and the Assessment Reform Group (1999), the Curriculum
Corporation developed a website on behalf of the education departments of the States and
Territories and Commonwealth of Australia entitled Assessment for Learning (AfL)
(www.curriculum.edu.au/assessment). AfL was defined according to the Assessment
Reform Group’s (2002) definition. To support teachers in their understanding and
practice, the website provided links to 32 assessment tasks, to background research
reference material and to professional learning modules. In addition, a series of DVDs to
promote professional learning was developed. These resources focused on the importance of
feedback, self-assessment, peer assessment and strategic questioning, which were identified
as areas for further teacher development and national promotion.
Today, Education Services Australia (ESA), a national, not-for-profit company owned by
all Australian education ministers, has taken on the work of the Curriculum Corporation and
Education Australia. ESA was established by the Standing Council on School Education
and Early Childhood (SCSEEC) and the company has responsibility for supporting national
priorities and initiatives in the schools. Two assessment-related initiatives are:
• an introduction to AfL;
• learning intentions;
• success criteria and rubrics;
• effective teacher feedback;
• strategic questioning;
• peer feedback;
• student self-assessment; and
• formative use of summative assessment. (http://www.esa.edu.au/projects/assessment-learning)
120 Policy Futures in Education 13(1)
. . . classroom assessment practices. . . observed are those practices of teachers who are more reliant
on assessment of learning, or summative assessments, as their dominant strategy. This often takes
the form of work sheets and short answer tests. In some SSLC schools, principals were aware of the
need to raise teachers’ awareness of contemporary understandings in assessment theory and practice
and had organised professional development and training. For example here a secondary,
metropolitan Hub school principal stated: ‘[s]o we try and look at a number of assessments. In
previous years we’ve been looking at assessment for learning. We haven’t looked at that this year
with our staff about assessment of, for and as learning. We’ll be doing a little bit of that or we’ll be
doing a lot of that next year’. (Luke et al., 2013: 310)
Evolution and impact of AfL on school systems. From the late 1990s there has been increased
international systemic awareness of the centrality of assessment for learning in educational
reform efforts. Such momentum for change, inspired by the work of Paul Black, Dylan
Wiliam and the Assessment Reform Group, became apparent in the Asia Pacific region –
notably in Australia, Hong Kong and New Zealand. In Australia, at the federal level, the
Curriculum Corporation developed resources, although each Australian state and territory
adopted its own approach to AfL. The Curriculum Corporation also published a practical guide
to AfL entitled ‘Improving Student Achievement’ (Glasson, 2009). This guide highlighted
the differentiated roles of the teacher and the student under the AfL umbrella as articulated
by Lorna Earl (Earl, 2003). Drawing on classroom examples, the collective experiences of
international researchers were brought together with the aim of illustrating to Australian
teachers the variety of AfL strategies and providing suggestions for teachers’ further
professional learning.
Each state and territory has adopted a particular approach to AfL. This national profile
describes the impact of AfL at a system level with reference to one Australian state (New
South Wales), as an example. Teachers in Australia have experienced major reform not just
with assessment but also with the introduction of an Australian Curriculum. The Australian
Curriculum Assessment and Reporting Authority (ACARA), an independent authority, is
responsible for the national curriculum (Foundation to Year 12), the national assessment
program and the national data collection and reporting program. ACARA has developed
achievement standards with the intention that they are ‘applied across every school in
Australia’ (http://www.acara.edu.au/curriculum/curriculum.html).
However, as with the uptake of AfL, each state and territory has adopted its own
approach to support the national curriculum and achievement standards. In New South
Wales (NSW), syllabuses and support materials are designed to promote an integrated
approach to teaching, learning and assessment.
Assessment for learning, assessment as learning and assessment of learning are approaches that can
be used individually or together, formally or informally, to gather evidence about student
achievement and to improve student learning. (http://syllabus.bos.nsw.edu.au/support-materials/assessment-for-as-and-of-learning/)
Teachers in NSW are informed that the common elements of assessment as learning
strategies and AfL include self-assessment, peer assessment, strategies for students to
actively monitor and evaluate their own learning, and feedback (together with evidence,
to help teachers and students decide whether students are ready for the next phase of
learning or whether they need further learning experiences to consolidate their knowledge,
understanding and skills). It is suggested to teachers that these approaches will help them
and their students know if current understanding is a suitable basis for future learning. In
addition, teachers are informed that using their professional judgement in a standards-
referenced framework is a way of extending the process of AfL into assessment of learning.
Specifically, AfL is described to teachers as integral to the teaching and learning process
and as central for clarifying student learning and understanding. The use of evidence by
teachers regarding students’ knowledge, understanding and skills to inform their teaching is
considered to be important. The key characteristics of AfL have been interpreted and
presented to teachers in NSW as follows:
• a view of learning in which assessment helps students learn better rather than just achieve
a better mark;
• involve formal and informal assessment activities as part of learning and to inform the
planning of future learning;
• include clear goals for the learning activity;
• provide effective feedback that motivates the learner and can lead to improvement;
• reflect a belief that all students can improve;
• encourage self-assessment and peer assessment as part of the regular classroom
routines;
• involve teachers, students and parents reflecting on evidence; and
• inclusive of all learners.
(http://syllabus.bos.nsw.edu.au/support-materials/assessment-for-as-and-of-learning/).
The centrality of AfL to learning and teaching, and practices that involve the students in
the assessment process, are emphasized.
Current trends in policy directions for AfL within Australia. In 2010 the Australian Institute for
Teaching and School Leadership (AITSL) was established, with responsibility for
professional standards, fostering and driving high quality professional development
for teachers and school leaders and working collaboratively across jurisdictions and
engaging with professional bodies. Apart from the national curriculum and the
achievement standards, teachers have recently had the introduction of the Australian
Professional Standards for Teachers. These standards aim to define the work of teachers
and to make explicit the elements of high-quality, effective teaching in 21st-century schools.
The goal is that these standards will drive improved educational outcomes for students.
The standards framework explicates the knowledge, practice and professional engagement
required across teachers’ careers with the rationale that the framework will provide
‘a common understanding and language for discourse between teachers, teacher
educators, teacher organisations, professional associations and the public’
(http://www.aitsl.edu.au/).
Standard 5 – Assess, provide feedback and report on student learning
(see http://www.teacherstandards.aitsl.edu.au/DomainOfTeaching/ProfessionalPractice/Standards/5)
– is the professional standard that addresses assessment skills. The
standard aligns with recent major curriculum and assessment reforms and consists of the
following.
Canada
Education in Canada falls under provincial jurisdiction. Each of Canada’s 10 provinces and
3 territories is responsible for generating its own assessment policies to support and monitor
student learning. Prior to 2000, many provincial assessment policies emphasized a traditional
diagnostic–formative–summative assessment sequence (Airasian et al., 2006). In this
sequence, diagnostic and formative assessments were used by teachers to improve and
tailor their instruction, while summative assessments were used to report publicly on
student achievement.
During the 1980s and 1990s, provincial assessment policies valued diagnostic, formative,
and summative assessments differently depending upon their curricular orientation and
provincial testing programs. In provinces with a long-standing history of large-scale
testing, mainly in Western Canada (i.e., Alberta and British Columbia), summative
classroom assessments were highly valued and regulated in relation to provincial test
content and criteria (Klinger et al., 2008). Provinces with minimal large-scale testing, such
as Manitoba, Ontario, and Prince Edward Island had a more balanced orientation toward
formative and summative assessments. This balanced approach was supported further with a
holistic curricular orientation characterized by less rigid provincial expectations, a
Assessment for learning, assessment as learning, and assessment of learning all serve valuable,
and different, purposes. It is not always easy, however, getting the balance right. If we want to
enhance learning for all students, the role of assessment for learning and assessment as learning
takes on a much higher profile than assessment of learning. (Western and Northern Canadian
Protocol for Collaboration in Education, 2006: 14)
Across provincial assessment policies, there is an explicit articulation of the value and
benefits of integrating assessment for and as learning into classroom teaching and
learning (e.g., Alberta Assessment Consortium, 2005; British Columbia Ministry of
Education, 2004; Ontario Ministry of Education, 2010a; see www.CAfLN.ca/resources for
complete listing). These policies emphasize that assessment for learning supports students’
growth toward educational standards while assessment as learning cultivates students’
autonomy, self-regulation, and general learning skills. Underpinning these policies is the
assertion that supporting the ability of students to learn (that is, assessment as learning)
will accelerate learning, increase summative assessment results, and contribute to lifelong
learning commitments.
However, despite provincial policies aimed at assessment for and as learning, several
researchers have noticed gaps in the capacity of teachers to implement rigorous
assessment for and as learning programs in their classrooms (DeLuca et al., 2012; Klinger
et al., 2012). These gaps are attributed to challenges related to teacher professional learning
opportunities in assessment, practical barriers (e.g., time, class size, resources), and limited
research on the nuances of integrating assessment for learning in diverse classroom contexts.
As a result of these challenges, several Canadian provinces have engaged in various
initiatives to support teachers’ integration of assessment for learning. For example, since
1999 Alberta school districts have been provided with provincial funding to engage in
cyclical professional development projects aimed at improving student learning and
performance, with many of these projects focused on assessment for learning. Ultimately,
these projects are intended to build capacity in assessment at classroom and school levels
with results shared provincially to encourage systemic adoption of assessment for learning
(Townsend et al., 2010).
Similarly, the Ontario Ministry of Education and teachers’ federations have supported
Ontario teachers through programs of professional learning and funding for collaborative
inquiries focused on assessment for learning (see: Ontario Ministry of Education, n.d.).
These professional learning programs aim to generate cultures of learning that value
assessment-informed decision-making that affects students, teachers and school
administrators. Increasing assessment literacy across the province, specifically the use of
assessment for, of, and as learning, is part of a larger effort toward a school-wide
comprehensive reform model known as the School Effectiveness Framework (SEF). The
SEF is ‘a school self-assessment tool, grounded in research and professional learning,
used to promote school improvement and student success’ (Ontario Ministry of
Education, 2010b: 1). In this framework, school improvement is predicated on reliable
and valid assessment information about student learning, teacher effectiveness and school/
district achievement of systemic goals. Accordingly, the use in Ontario of assessment for and
as learning is integral to the educational system not only for enhancing student achievement
but also for supporting teacher learning and school/district goals (Ontario Ministry of
Education, 2010b, 2011).
Overall, assessment for learning is taking hold as a key feature of educational assessment
programs in Canada. Classroom assessment policies that integrate and explain assessment
for and as learning are evident across the provinces. Significant efforts are currently being
made to support teachers and school administrators in interpreting and implementing these
policies and assessment priorities. However, additional research is needed on professional
learning structures that support teachers most effectively in this process as well as continued
research on the ways assessment for learning is operationalized and integrated across
curricular areas, disciplines, and diverse student learning groups. Ultimately, there is a
concerted effort across the majority of provinces to integrate assessment for learning to
support teacher learning and effectiveness, informed school decision-making and district
priorities and, most importantly, to enhance student learning across Canada.
Republic of Ireland
The publication of the Black and Wiliam research review on assessment and classroom
learning in Assessment in Education in 1998 coincided with the completion of a report on
a policy review of lower secondary education, the junior cycle, in Ireland which noted the
following:
Post-primary education in Ireland has an established tradition of assessment through high stakes
examinations. Indeed it is noticeable that teachers rarely speak of ‘assessment’, or the need to
‘assess’ but focus instead on preparing students for ‘tests’ and ‘examinations’. This feature was
commented on by the OECD report on 1991 which noted that post-primary teachers tended to
be ‘purveyors of facts and coaches for examinations’ rather than ‘articulators, managers and
organisers of learning’. (NCCA, 1999: 46)
AfL in practice and policy. The first developmental work under an Assessment for Learning
banner was initiated by the National Council for Curriculum and Assessment (NCCA) in
2003, arising from a visit by staff to the Medway Oxfordshire project in England in the
previous year. Interest in these projects had arisen from the initial Black and Wiliam review
(1998a) and their subsequent account of the project (Black et al., 2003). The focus of the
Irish project was the initial years of post-primary school and it was directly associated with
the 1998 report. The information leaflet on Assessment for Learning generated for the
project describes assessment for learning as a teacher ‘tool’ of sorts, in the following terms:
When a teacher is using assessment for learning he/she is trying to get learners ‘on the inside’ of
the learning process, as it were. So, he/she will try to get the learners to see that there is a specific
learning intention or target to each lesson and will share the learning intention with the learners.
(NCCA, 2003)
Of note, the project material goes to great lengths to emphasize what AfL is not. It is not, the
leaflet explains, ‘the same as continuous assessment’, or assessment of learning, ‘which is the
kind of assessment traditionally associated with the examination hall’ and the purpose of
which is ‘to measure achievement’. Clear efforts are made to take account of the sensitivities
of teachers about the assessment ‘boundaries’. However, a consequence of this was that
while the initiative was received positively by teachers, and those who participated reported
significant impact on classroom practice and on student motivation and autonomy (NCCA,
2005), the initiative had no impact either on other kinds of assessment practice or on
teachers’ willingness to engage with any formal school-based assessment. Further, system
and teacher efforts to extend assessment discourse into consideration of quality were notably
absent.
Work on assessment in primary schools resulted in the publication of guidelines in
2007, which also distinguished between the ‘two’ assessments but offered a slightly different
perspective on assessment for learning which,
. . .emphasises the child’s active role in his/her own learning, in that the teacher and child agree
what the outcomes of the learning should be and the criteria for judging to what extent the
outcomes have been achieved. (NCCA, 2007: 9)
Of note, the guidelines blur the carefully constructed boundary of the earlier post-primary
initiative, by suggesting that both AfL and AoL are ‘interrelated and complementary’ and
both are ‘central to the teacher’s work’ (NCCA, 2007: 8). Assessment is far less contested in
the primary context in Ireland, with no national tests or examinations in place; therefore the
boundary between assessment of and for learning can be more permeable. The agency of the
younger learner by comparison to their post-primary counterpart is noteworthy.
Current developments. The agency of the learner, and an intentional blurring of boundaries are
two features of the current wave of junior cycle reform. Although this reform was the focus
of a year-long consultation process, it effectively began in October 2012 with the
announcement by the Minister for Education and Skills of a new Framework for Junior
Cycle and the phasing out of the Junior Certificate examination by 2020. This Framework
for Junior Cycle is noteworthy, as a contemporary curriculum and assessment policy
document, for the almost complete absence of any reference to AfL. ‘Formative’ and
‘summative’ are used, but as purposes for, rather than types of assessment; and both of
these, it is proposed, should serve and promote learning (DES, 2012: 18).
This focus on learning and the exclusion of the AfL label is an attempt to respond, at least
in part, to Swaffield’s 2009 critique of what she termed the ‘(mis)interpretation of AfL as a
teacher driven mechanism for advancing students up a prescribed ladder of subject
attainment’ (Swaffield, 2009: 6), and the lack of attention paid to learning and to the
central role of the learner in that process. It is also an attempt to move beyond the initial
AfL developmental project and its focus on teacher action to reflect the emphasis on student
agency in the 2007 Primary Assessment Guidelines. The intention is to build sustainable
assessment cultures in schools whereby teachers develop specific assessment design
capabilities as well as informed standards-referenced judgment practice. The latter extends
to the use of stated standards (called ‘expectations for learners’ in the Framework for Junior
Cycle) in teachers coming together for professional conversations for moderating student
work. In turn, students develop knowledge about and expertise in using standards for self-
assessment and improvement purposes. Given earlier discussions about current assessment
practice and policy, this is a long term project. It will entail a concerted focus on shifting the
assessment gaze of teachers, students and the public away from the terminal examination as
the sole or only trustworthy arbiter of quality for junior cycle education. It will also entail a
new valuing of teacher judgment, informed by system checks and balances, to maintain
public confidence in the education system.
In the new junior cycle, all assessment should be for learning, and learning should be for
students and for teachers. Ironically, the most notable and most significant feature of AfL in
the most significant assessment policy of the modern era in the Republic of Ireland, is its
absence.
Israel
The Israeli education system is centralized; in 2012/13 the system served about 1.6 million
students from K–12 in 4502 education institutions (Israeli Ministry of Education, 2013a).
Genesis of AfL. Although AfL is not yet implemented systemically in Israel, its origin can be
traced back to the mid 1970s, when a student-centered education policy was adopted,
replacing the ‘melting pot’ policy aimed at integrating the many different culture groups
that had emigrated to Israel following establishment of the state in 1948. Diversity, which
had previously been considered detrimental, became valued under the new policy and thus
adaptive instruction was enacted to cater to the needs of the individual student. Efforts were
subsequently invested in promoting self-regulated learning and by the late 1990s alternative
assessment methods were already widely employed and were even considered as a possible
substitute for the matriculation exams. Performance tasks, research projects, portfolios,
learning journals, exhibitions, rubrics, self- and peer-assessment have dominated the
educational discourse, albeit often failing to be implemented successfully.
Although multiple-choice tests were never the predominant method of testing in Israeli
schools, teachers’ assessment knowledge was quite shallow and teacher-made assessments
lagged behind instruction and failed to tap higher-order skills or enable utilization of the
results to advance learning. Consequently, the achievement level of the Israeli students in
international tests was quite disappointing, leading to the adoption of an accountability
policy at the beginning of the new millennium. The establishment of the National
Authority for Measurement and Evaluation in 2005 supported this policy by offering
superior/expert developed large-scale accountability tests (GEMS) for elementary and
middle schools (grades 2, 5 and 8) which supplemented the existing matriculation exams
in high schools. As a result, the focus of many schools shifted from learning to achievement
and negative effects of accountability systems, similar to those witnessed in other countries
enacting such accountability systems (Nichols & Berliner, 2007), were noticed. The situation
deteriorated further as a result of the Supreme Court ruling that from 2012 the GEMS test
scores (by school) should be released to the public. A new policy has since been declared by
the recently nominated Minister of Education, which is aimed at fostering ‘meaningful
learning’ rather than encouraging ‘competition for ranking on league tables’ (Israeli
Ministry of Education, 2013b).
It seems that the current position of the ‘swaying pendulum’ holds great potential for AfL
to play a significant role in advancing a systemic reform in education in alignment with the
required competencies in the 21st century. Initiatives to fulfill the potential will be addressed
following a brief account of lessons learned from successful, albeit sporadic,
implementations of AfL in the past.
Evolution and impact of AfL on school systems. For over a decade our research group has been
studying implementation of AfL in different elementary and middle school contexts in Israel
in order to identify school-based conditions that support and those that constrain proper
AfL implementations. Our findings pointed to the critical role of school-based professional
learning communities in advancing teachers’ AfL practices (Birenbaum et al., 2009, 2011),
thus corroborating research findings from the UK and Canada (James et al., 2007; Pedder
and Opfer, 2011; Earl and Katz, 2006). We identified similarities between cycles of assessment
in the classroom (AfL) and in school-based professional learning communities (inquiry into
practice) as well as in their respective cultures. Furthermore, we underscored the school’s
assessment culture as a key component. A forthcoming paper (Birenbaum, 2014) offers a
conceptualization of the assessment culture from a complexity framework, viewing it as a
complex system in which two other complex systems (student learning and teacher learning)
are nested. By means of ongoing reciprocal relations among the three systems learning
emerges and a mindset, which we termed ‘AfL mindset’, evolves. The attributes of this
mindset, which were identified in schools with an assessment culture where AfL was
successfully implemented, seem to define the ‘spirit of AfL’ that is claimed to be missing
from common implementations of AfL, causing it to fail to fulfill its potential to promote
learning (Marshall & Drummond, 2006).
Current trends in policy direction for AfL. In line with his intention to foster ‘meaningful learning’
in schools, the current Minister of Education declared that the external GEMS tests will be
withheld for at least a year and the number of matriculation exams will be drastically
reduced. He further declared that the Ministry of Education intends to maintain trusting
relations with schools as a step toward increasing mutual trust among all the participants in
the educational system. Such steps could set the stage for a systemic reform; the question
raised is what additional policy forces would lead to the desired reform.
Michael Fullan (2011) describes four ‘good drivers’ that lead to systemic education
reforms: capacity building (rather than accountability); group quality (rather than
individual quality); pedagogy (rather than technology); and ‘systemness’ (rather than
fragmentation). It seems that the current Israeli evolving education policy is directed
toward embracing those ‘good drivers’ but the way is still long and challenging. Initiatives
addressing capacity building are beginning to take place with regard to in-service and pre-
service teachers. Respectively, the Ministry of Education has recently issued an outline for a
professional development program for in-service teachers, which consists of three AfL-
related modules (formal AfL, informal AfL, and school-based professional learning)
(Birenbaum, 2013). Likewise, teacher-training institutes are preparing to integrate AfL in
their programs.
If implemented successfully, such programs would empower in-service and pre-service
teachers, driving them to engage, as a habit of mind, in collaborative inquiries into their
practice, thus leading to improved pedagogical practices including mindful consumption of
technology (Salomon, 2002). Moreover, establishing meaningful connections between
goals and practices (of instruction, learning and assessment) as well as supporting
networking within and between schools including their stakeholders, would advance what
Fullan (2011) terms ‘systemness’ or ‘coherent wholeness’. Under such conditions assessment
culture is likely to sprout in schools and in teacher-training institutes enabling AfL to fulfill
its potential to promote meaningful learning among students and teachers.
New Zealand
To understand the evolution of assessment for learning in New Zealand requires an
understanding of the wider policy context and the changes enacted in 1989 in relation to
how schools are administered. At that time the New Zealand schooling system changed
almost overnight, from one of the most bureaucratic systems internationally to one of the
most decentralized. All layers of district administration were abolished and the old central
Department of Education was downsized to a policy-only Ministry of Education whose
primary role was to give policy advice to the Minister of Education in the government of
the day (New Zealand Government, 1989). Although aspects of this reform have evolved
over the 24 years since its introduction, most of the key tenets of the legislation are still in
place and it remains one of the most highly devolved systems in the world (Nusche et al.,
2012). Currently, there is no administrative layer between the Ministry of Education and the
governing authority (which takes the form of parent-elected boards of trustees) of individual
schools. This administrative organization means individual schools have considerable
discretion about how they assess students and, until very recently, there were few
mandated requirements about how they should do so. This administration system also
means that the principal influencing mechanisms available to the Ministry of Education
are concerned with the provision of nationally-funded professional development.
A brief history of assessment for learning. When other countries were introducing national
testing in the 1990s, New Zealand rejected this approach for students in the first 10
years of their schooling (ages 5–15 years). Throughout the 1990s the emphasis was on
the formative purpose of assessment on school or teacher-developed assessment tasks.
This policy emphasis was strongly influenced by the work of Crooks (1988) who,
through a review of the research literature, found that classroom evaluation
practices could have a powerful direct and indirect impact on student outcomes. In
this review, Crooks identified the circumstances under which the impacts of assessment
were likely to be positive or negative, thus developing a clear rationale for assessment
for learning.
The emphasis on assessment for learning was supported by funding from the Ministry of
Education for professional development through national contracts for both primary and
secondary school teachers. These programs were school-based with the original title in 1995
of ‘Assessment for Better Learning’ evolving subsequently into ‘Assess to Learn’. Schools in
these contracts typically participated over a two-year period with the professional learning
undertaken primarily by facilitators who visited the schools on a regular basis. In the last
two years, the importance of school leaders in assessment was recognized, with the most
recent contracts having the title ‘Leadership and Assessment’. The most important point
here is the long term commitment demonstrated by the Ministry of Education to assessment
for learning.
Part of the Ministry of Education’s support was to contract the development of nationally
normed assessment tools for teaching and learning (‘asTTle’) in reading, writing and
mathematics (e-asttle.tki.org.nz/). These innovative tools by design allowed teachers and
students to determine the content and the timing of what was assessed. The analysis
process developed individual student and class profiles of strengths and weaknesses,
together with suggestions about what to teach next.
These efforts to promote the formative purposes of assessment were also supported by the
revision to the New Zealand Curriculum in 2007 which similarly foregrounded assessment
for learning. It stated that ‘The primary purpose of assessment is to improve students’
learning and teachers’ teaching as both students and teachers respond to the information
it provides’ (New Zealand Ministry of Education, 2007: 38). More recently, a visionary
statement on student assessment was published by the Ministry of Education (2011). This
Evaluation and assessment frameworks have little value if they do not lead to the improvement
of classroom practice and student learning. Therefore securing effective links to classroom
practice is one of the most critical factors in designing the evaluation and assessment
framework. The variation in practices across New Zealand raises questions as to the degree of
consistency that is desirable set against what may be seen as legitimate diversity in the context of
school self-management. . . the New Zealand education system is conceived as a high trust model
relying strongly on teacher judgment. There is, however an inevitable tension between variety of
practice and consistency across the system. Autonomy at the school level helps to create a sense
of ownership and self-direction, but is not easy to reconcile with the drive for consistency of
standards. (Nusche et al., 2012: 34)
Recent threats. Since 2010 threats to assessment for learning have begun to creep into the
education system. In 2010 and 2011 the government introduced National Standards for
students in Years 1–8 in English and Maori-medium schools respectively, with the
ostensible aim of giving parents clearer information about students’ achievements. The
official formative approach to assessment strongly influenced the way in which these
standards were developed, with a deliberate focus on the use of professional teacher
judgment underpinned by assessment for learning principles rather than a narrow testing
regime (NZ Ministry of Education, 2011). Teachers must assess students by making an
overall teacher judgment which takes into account the full range of information about
student performance available to teachers. Standardized tests could be included but other
sources of information, including observations and teacher interviews, were to be used to
determine if a particular student was at or below the national standard.
Not surprisingly, considerable variability in practices became evident across schools in
undertaking this complex task (Wylie and Hodgen, 2010). There were always reporting
requirements associated with the National Standards. Teachers were required to report
this information, together with a given student’s progress, to parents. Boards were
required to report collated information to the community and the Ministry on the
numbers and proportions of students at, above, below or well below National Standards
by ethnicity, gender and year level.
However, the greatest threat to the formative purposes of assessment came in 2012, when
despite the variability in practices, the Minister of Education decided to publish school-level
results, thus providing for school comparisons. The introduction of the standards was highly
contested (Ministry of Education, 2010), and this step has no doubt increased the stakes
related to what have now become assessments with a more summative purpose. Although, in
theory, the overall teacher judgments can still be used for formative purposes, and teachers
are encouraged to do so, a recent report by the OECD (2013) identified that ‘there is a risk
that pressures for summative scores may undermine effective formative assessment practices
in the classroom. . . . Such tensions between formative and summative assessment need to be
recognised and addressed’ (OECD, 2013: 215).
Norway
In Norway, as in many other countries, the trigger of the Assessment for Learning
movement can, to a large extent, be traced back to Black and Wiliam’s 1998 review
paper, Inside the Black Box. Before its publication, assessment had not been a widely
discussed theme in Norway, mainly because the schools followed traditional assessment
approaches. Assessment was understood as testing, often as end-of-school level testing –
that is, end of primary, middle or secondary school. While no grades were given in
elementary schools, grades in secondary schools were based on tests. It is safe to say that
Scriven’s (1967) definition of summative and formative evaluation was not widely known by
policy makers or by teachers. Building on Scriven’s work, Bloom (1969) advocated that the
conceptualization of summative and formative assessment should not be related to the
assessment instrument, but more to the purposes of assessment. The use of information
collected through assessment is what determines whether assessment is characterized as formative
or summative (Bloom, 1969; Black and Wiliam, 2009).
At the same time as international educators engaged in promoting ‘assessment for
learning’ as opposed to ‘assessment of learning’ at the end of the 1990s and in the
beginning of the current millennium, Norwegian evaluation reports identified problems in
the assessment practices of Norwegian teachers. The school reform, Reform 97, was
evaluated by a number of Norwegian researchers (Haug, 2003; Klette, 2003; among
others) who concluded that Norwegian teachers were found to support their students and
to create a safe and inclusive learning environment. However, the main problem was that the
students experienced teachers’ feedback as general and too positive. The students did not
find this helpful because it lacked corrective information and did not provide directions for
the students’ future learning (Dale, 2008). More recent international studies (OECD, 2011)
support the findings from the earlier Norwegian research – that the quality of feedback
practice by Norwegian teachers is still not up to the expected level. Another important
finding from Norwegian research is that it seems that the quality of feedback correlates
negatively with the students’ advancement from level to level. In secondary schools,
assessment is to a large extent focused on testing, and grades are still given as the average
of test scores (Wendelborg et al., 2011).
In response to the research presented in summary above, various decisions were made
at the political level. At the start of the 1990s, in fact, a pioneer project – Assessment
and Mentoring (Vurdering og veiledning) – was initiated, which was based on the
principle that assessment can promote learning (Raaen, 1990). Similar ideas were also
expressed in policy documents from 1997 (L–97 by KUF, 1996). However, in reality there
was a large gap between the intentions presented in the policy papers and classroom practice,
and few initiatives were launched in order to implement the politicians’ intentions (Eggen,
2011).
Only in the last decade has more focus been placed on the implementation of the ideas
and principles behind assessment for learning in schools, and much effort and funding have
both been put into various in-service activities for teachers. The point of change came as a
response to the disappointing PISA results for Norwegian students in 2000. In 2006 a
steering document was published (Ministry of Knowledge, 2006) which gave clear
directives about how to practice assessment, emphasizing assessment of achievements of
pre-defined goals in each and every school subject. Moreover, the importance of ‘during-
learning-assessment’ was stressed, without any clear explanation of why the more
international term, formative assessment, was not used. This was due in particular to the
fact that ‘formative’ carries the same meaning in Norwegian as in English. Assessment for
learning subsequently became and remains the commonly used term. In the steering
documents published in 2006 the students’ rights to assessment with and without grades
were emphasized, which led to an intensive demand for documentation of all assessment
practice to avoid legal actions taken by students and their parents. In 2009 a revised
version of the steering document about assessment was introduced (Ministry of
Knowledge, 2009), which specified the demand for learning-focused assessment, students’
rights to receive mid- and end-of-year assessment and, not least, the requirement to involve
students in the assessment process.
Concurrent with the publication of explicit prescriptions of how to practice assessment to
which teachers must adhere, several in-service activities have been initiated to empower
teachers in assessment. The aim is to change teachers’ assessment practice and recently
there has been a welcome tendency to move away from a top-down prescriptive approach
to a more competence-development approach. The principles behind assessment for learning
and the understanding that schools and teachers need to develop their own practice of
assessment within a given framework are being addressed in the most recent initiative,
‘Assessment-for-Learning-2010–2014’ (http://www.udir.no/Vurdering-for-laring/VFL-skoler/).
The project covers all Norwegian counties and involves 184 of the counties’ 428
municipalities. It is important to note that in Norway the local authorities are given the
responsibility for implementing macro-decided educational changes: in a recent report,
which looked at the governance of the new project, some of the key findings were as follows.
For successful implementation of the ‘Assessment for Learning’ program, trust between the
local municipalities and the stakeholders of the school is needed.
The goals of the program had a better chance to be implemented if they aligned with wider
educational goals at a school and at a macro level (policy).
Learning networks between schools and sharing knowledge among peers supported
professional learning.
On-line examples of ‘best practice’ were of help, especially to the smaller, rural
municipalities.
Many of these felt overwhelmed by the multiple policy reforms (Hopfenbeck et al., 2013).
The above findings align with the meta-study by Timperley et al. (2007) of professional
learning with a positive impact on student achievement, and there is cautious optimism
that Norwegian policy makers understand the need to look at educational change in a
long-term perspective, and the need to trust teachers and provide autonomy within a
given framework.
The challenge for Norwegian teachers is that alongside political rhetoric, and the most
welcome movement toward an ‘assessment-for-learning’ practice, teachers are faced with an
increasingly extensive testing regime for accountability purposes. The number of national
tests is high and testing of all students is becoming more and more common and includes young
learners. In addition, the importance of end-of-school exams is increasing because of more
competition for entrance into higher education. Thus teachers have to practice assessment
within two competing paradigms, one more explicit – the assessment-for-learning – and one
implicit and ‘hidden’ in the political rhetoric, an increasing testing regime. The backlash
effect of the latter presents the many teachers who believe in the ‘assessment-for-learning’
principles with a professional dilemma, which could and should be avoided.
United States
Comparatively recently, in describing assessment for learning (or formative assessment as
the practice is referred to in the United States), noted assessment expert Lorrie Shepard
observed that ‘recently, this robust and well-researched knowledge base has made its way
back across the oceans, offering great promise for shifting classroom practices toward a
culture of learning’ (Shepard, 2005: 2). Assessment for learning has long been a staple of
educational practice in Europe, Australasia and Canada. Shepard’s observation implies that
its practice has been a recognizable feature of ‘good teaching’ in American classrooms in the
past, even though it lacked formalization in theory and research. The advent of robust
theory and research, much of it conducted outside the United States, offers the prospect
of the institutionalization of assessment for learning in teacher education and classroom
practice.
Shepard’s hope for a shift in classroom practice is a reaction to the dominance of the
culture of testing in the United States in which for most of the 20th century attention to
assessment was focused on large-scale testing (McMillan, 2013). This situation has been
compounded in the last fifteen years or so by state and, more recently, federal legislation
that held schools accountable for student performance on annual standardized tests, with a
menu of increasingly severe sanctions for low performance. The net effect of the
predominance of large-scale testing has been to squeeze out attention and resources to
support assessment for learning.
Since returning to North American shores, however, formative assessment has received a
good deal of attention. For example, in 2001 an influential committee of the National
Formative assessment is a process used by teachers and students during instruction that provides
feedback to adjust ongoing teaching and learning to improve students’ achievement of intended
instructional outcomes.
The member states, currently numbering about a third of the states in the USA, have
adopted this definition to guide their practice and policy. In addition, it has become a
widely cited definition and is used by many other organizations to inform their work with
respect to formative assessment.
This definition and other supporting material from the FAST group have moved the focus
in the member states from testing and providing teachers with more measurement
instruments, to offering professional development for teachers to incorporate assessment
for learning, rather than just assessment of learning, in their classrooms. It is also
noteworthy that students are prominently included in this definition as equal stakeholders
in the assessment process, indicating the beginning of a shift in the United States from a
testing to a learning culture. However, while gaining traction, the involvement of students,
or indeed a central focus on the student role in assessment for learning, is not as yet well
established in the United States.
This move to formative assessment as assessment for learning is also reflected in other
policy initiatives in the USA. Two consortia of states have been funded by the federal
government to develop new assessment systems that measure student skills against a
common set of college- and career-ready standards in mathematics and English (US
Department of Education, 2010). One of them, Smarter Balanced, will provide a digital
library of artifacts to support teachers’ formative assessment practice, ranging from
Conclusions
Although the research base for AfL seems to be well established and accepted in the various
countries studied here, it appears that education policies have yet to be fully enacted in a
manner that would lead to a significant shift in teacher practice. The ongoing tensions
between formative and summative forms of assessments, as reflected in large-scale testing
programs, continue to pose a significant risk to the uptake of authentic and sustained AfL
practices in school systems across much of the Western world. This threat has already been
noted by the OECD and represents the greatest challenge facing federal and regional
Ministries of Education as they seek to reconcile accountability demands with the push for
improved student learning. The synergy (or lack thereof) that often exists between large-scale
testing and teachers’ classroom assessments is often the result of contradictory messages given
to school leaders and classroom practitioners. From our perspective, this tension will never be
resolved until both modes of assessment complement one another in a meaningful way. In
essence, AfL research should inform both the design and administration of accountability
measures and more importantly promote national/state policies that underpin the prominence
of AfL as the key driver of student learning. The profiles presented here suggest that some
nations are further along in this regard than others and ultimately it is the overarching policy
context that is providing the necessary zeitgeist for success.
Funding
This research received no specific grant from any funding agency in the public, commercial or not-for-
profit sectors.
Note
1. In contrast, the Leaving Certificate examination, a wholly external and largely terminal
examination at the end of upper secondary, carries considerable stakes because it is used for the
allocation of places in higher education.
References
Airasian P, Engemann J and Gallagher T (2006) Classroom assessment: Concepts and Applications
(Canadian edition). Toronto, ON: McGraw-Hill Ryerson.
Alberta Assessment Consortium (2005) A framework for student assessment. Edmonton, AB,
(author) Assessment program application for new grants: Comprehensive assessment systems.
Available at: http://www.k12.wa.us/SMARTER/pubdocs/SBAC_Narrative.pdf (accessed 02
October 2010).
Assessment Reform Group (1999) Assessment for Learning: Beyond the Black Box. Cambridge, UK:
Cambridge University.
Assessment Reform Group (2002) Assessment for Learning: Research-based Principles to Guide
Classroom Practice. Cambridge, UK: Assessment Reform Group.
Birenbaum M (2013) Conditions that support formative assessment: Assessment for Learning (AfL) in
teacher preparation programs. Bimat Diun 51: 6–12. (Hebrew).
Birenbaum M, Kimron H and Shilton H (2011) Nested contexts that shape assessment for learning:
School-based professional learning community and classroom culture. Studies in Educational
Evaluation – Special Issue on Assessment for Learning 37(1): 35–48.
Birenbaum M, Kimron H, Shilton H and Shahaf-Barzilay R (2009) Cycles of inquiry: Formative
assessment in service of learning in classrooms and in school-based professional communities.
Studies in Educational Evaluation 35: 130–149.
Birenbaum M (2014) Conceptualizing assessment culture in school. In: Wyatt-Smith C, Klenowski V
and Colbert P (eds) Assessment for Learning Improvement and Accountability: The Enabling Power.
Berlin, Germany: Springer.
Birenbaum M (2013) Outline for professional development of teachers in assessment for learning
(AfL). The Israeli Authority for Measurement and Evaluation in Education (RAMA), Ministry
of Education. (Hebrew). Available at:
http://cms.education.gov.il/educationcms/units/rama/aarachabeitsifrit/mitve_halel.htm.
Black P and Wiliam D (1998a) Assessment and classroom learning. Assessment in Education:
Principles, Policy and Practice 5(1): 7–74.
Black P and Wiliam D (1998b) Inside the Black Box: Raising Standards through Classroom Assessment.
London, UK: King’s College.
Black P and Wiliam D (2009) Developing the theory of formative assessment. Educational Assessment,
Evaluation and Accountability 21(1): 5–31.
Black P, Harrison C, Lee C, Marshall B and Wiliam D (2003) Assessment for Learning. Putting it into
Practice. Buckinghamshire, UK: Open University Press.
Black P (1998) Testing: Friend or foe? Theory and practice of assessment and testing. London: Falmer.
Black P and Wiliam D (1998) Inside the black box: Raising standards through classroom assessment.
London: King’s College School of Education.
Bloom BS (1969) Some theoretical issues relating to educational evaluation. In: Tyler RW (ed.)
Educational Evaluation: New Roles, New Means. Chicago, IL: University of Chicago Press.
Bloom BS, Hastings JT and Madaus GF (1971) Handbook on formative and summative evaluation of
student learning. New York: McGraw Hill.
Bowe R, Ball S and Gold A (1992) Reforming Education and Changing Schools: Case Studies in Policy
Sociology. London, UK: Routledge.
British Columbia Ministry of Education (2004) Classroom Assessment and Evaluation. Victoria, BC:
BC Ministry of Education.
Council of Ontario Directors of Education (2006) Consistency in classroom assessment. Available at:
http://www.cpco.on.ca/ProfessionalDevelopment/Resources/CCA-Final.pdf (accessed 3 July 2014).
Crooks T (1988) The impact of classroom evaluation practices on students. Review of Educational
Research 58(4): 438–481.
OECD (2011) OECD Reviews of Evaluation and Assessment in Education: Norway. In: Nusche D, Earl
L, Maxwell W and Shewbridge C (eds) Paris, France: OECD. Available at:
www.oecd.org/dataoecd/60/60/48632032.pdf.
OECD (2013) Synergies for Better Learning. Paris, France: OECD.
Ontario Ministry of Education (2010a) Growing Success: Assessment, Evaluation and Reporting in
Ontario Schools. Toronto, ON: Queen’s Printer for Ontario.
Ontario Ministry of Education (2010b) School Effectiveness Framework: A Support for School
Improvement and Student Success. Toronto, ON: Queen’s Printer for Ontario.
Ontario Ministry of Education (2011) Learning For All: A Guide to Effective Assessment and Instruction
for All Students, Kindergarten to Grade 12. Toronto, ON: Queen’s Printer for Ontario.
National Research Council (2001) Knowing What Students Know: The Science and Design of
Educational Assessment. Pellegrino J, Chudowsky N and Glaser R (eds), Committee on the
Foundations of Assessment. Board on Testing and Assessment, Center for Education, Division of
Behavioral and Social Sciences and Education. Washington, DC: National Academies Press.
Ontario Ministry of Education (n.d.) AER Gains. Available at: http://www.edugains.ca/newsite/aer2/
index.html (accessed 3 July 2014).
Ontario Royal Commission on Learning (1994) For the Love of Learning: Report of the Royal
Commission on Learning. Toronto, ON: Queen’s Printer for Ontario.
Pedder D and Opfer VD (2011) Are we realising the full potential of teachers’ professional learning in
schools in England? Policy issues and recommendations from a national study. Professional
Development in Education, 37(): 741–758. Princeton, NJ: Educational Testing Service.
Popham J (1995) Classroom assessment: What teachers need to know. Boston: Allyn & Bacon.
Raaen FD (1990) Elevvurdering i nytt perspektiv: Sluttrapport for prosjektet «Vurdering og veiledning».
Oslo, Norway: Grunnskolerådet.
Salomon G (2002) Technology and pedagogy: why don’t we see the promised revolution? Educational
Technology 42(1): 71–75.
Scriven M (1967) The Methodology of Evaluation. Washington, DC: American Educational Research
Association.
Shepard LA (2005) Formative assessment: caveat emptor. Paper presented at the ETS Invitational
Conference, The Future of Assessment: Shaping Teaching and Learning, New York, NY.
Shepard L (1989) Why we need better assessments. Educational Leadership 46(7): 4–9.
SMARTER Balanced Assessment Consortium (SBAC) (2010) Race to the Top.
Stiggins RJ (2006) Balanced Assessment Systems: Redefining Excellence in Assessment.
Stiggins RJ (2008) Assessment Manifesto: A Call for the Development of Balanced Assessment Systems.
Portland, OR: ETS Assessment Training Institute.
Stiggins R (1991) Assessment literacy. Phi Delta Kappan 72(7): 534–539.
Swaffield S (2009) The misrepresentation of Assessment for Learning – and the woeful waste of a
wonderful opportunity. Paper presented at the Conference of the Association for Achievement and
Improvement through Assessment, Bournemouth, UK.
Timperley H, Wilson A, Barrar H and Fung I (2007) Teacher Professional Learning and Development:
Best Evidence Synthesis Iteration, pp. 291. Available at:
http://www.educationcounts.govt.nz/__data/assets/pdf_file/0017/16901/TPLandDBESentire.pdf (accessed).
Townsend D, Adams P and White R (2010) Alberta Initiative for School Improvement: Successful
Assessment for Learning Projects from AISI Cycle 3. Lethbridge, AB: University of Lethbridge.