International Journal of Research & Method in Education
To cite this article: Gunilla Näsström (2009) Interpretation of standards with Bloom’s revised
taxonomy: a comparison of teachers and assessment experts, International Journal of Research &
Method in Education, 32:1, 39-51, DOI: 10.1080/17437270902749262
Downloaded by [Memorial University of Newfoundland] at 01:25 03 August 2014
International Journal of Research & Method in Education
Vol. 32, No. 1, April 2009, 39–51
Introduction
Educational systems today are often standards-based. Standards are here defined as
descriptions of what students should know and/or be able to do, as well as descriptions
of how well students should attain this knowledge and these skills (Popham 2003).
These standards are often broad and vague (Luft, Brown, and Slutherin 2007) and
therefore need to be interpreted. Teachers need to interpret the standards to plan their
teaching (Bybee 2003) and to assess and grade their students (Popham 2003). Those
who construct and develop standardized assessments have to interpret the standards to
formulate a valid blueprint (Popham 2003). In alignment analysis, the judges have to
interpret the standards to be able to compare the standards with other standards, with
assessments, or with teaching (Bhola, Impara, and Buckendahl 2003).
It is important that individuals and organizations agree on their interpretations of
standards. To get equivalent grades in a country or a region, all teachers should have
the same interpretations. Teachers and assessment experts who develop standardized
assessments have to interpret the standards in the same way to give the students an
opportunity to perform well on the standardized assessments (Biggs 2003). In align-
ment analyses, all judges should have similar interpretations of the standards to derive
trustworthy comparisons (Bhola, Impara, and Buckendahl 2003).
*Email: gunilla.nasstrom@edmeas.umu.se
PISA (OECD 1999), Marzano’s new taxonomy (Marzano and Kendall 2007), Porter’s
taxonomy (Porter and Smithson 2001) and Bloom’s revised taxonomy (Anderson and
Krathwohl 2001). Bloom’s revised taxonomy (Anderson and Krathwohl 2001) is a
development and revision of Bloom’s original taxonomy from 1956.
In this study, Bloom’s revised taxonomy was chosen as a categorization tool for
standards for four reasons. Firstly, the taxonomy is designed for analysing and developing
standards, teaching and assessment, as well as for emphasizing alignment
among these main components of an educational system. Secondly, this taxonomy
has been applied in nursing education (Su, Osisek, and Starnes 2004), music educa-
tion (Hanna 2007) as well as in schools in several states in the USA (Pickard 2007),
but none of these studies have evaluated the usefulness of this taxonomy. Therefore,
there is a lack of studies about the quality of Bloom’s revised taxonomy, especially
as a categorization tool for standards. Thirdly, this taxonomy has generally stated
content categories, which allow comparisons of standards from different subjects.
Fourthly, in a study where standards in chemistry were categorized with two differ-
ent types of models, Bloom’s revised taxonomy was found to interpret the standards
more unambiguously than a model with topics-based categories (Näsström and
Henriksson 2008).
The focus of this article is on evaluating the usefulness of Bloom’s revised taxon-
omy for interpretation of standards. Interpretation of standards is based on human
judgements, and therefore inter- and intra-judge consistency is an important issue for
the trustworthiness of interpretation of standards. Another focus of this article is on
similarities and differences between teachers and assessment experts when interpret-
ing standards. The article is structured in the following way: Firstly, criteria for eval-
uating the usefulness of a taxonomy as a categorization tool are described. Secondly,
a short review of inter- and intra-judge consistency in interpretation of standards is
presented. Thirdly, results are presented describing the usefulness of Bloom’s revised
taxonomy as well as describing similarities and differences between the teachers and
the assessment experts. Fourthly, the usefulness of Bloom’s revised taxonomy,
similarities and differences between the teachers and the assessment experts, as well
as limitations of this study are discussed.
The criteria for evaluating the usefulness of the taxonomy are based on Hauenstein’s
(1998) five rules. A taxonomy should, according to Hauenstein, (1) be applicable; (2)
be totally inclusive, i.e. all standards can be categorized; (3) have mutually exclusive
categories, i.e. unambiguously categorize one standard into only one category; (4)
follow a consistent principle of order; and (5) use the terms in categories and sub-
categories that are representative of those used in the field. One aspect of applicability
is that judges can use the taxonomy. Another aspect of applicability is the number of
categories utilized in the taxonomy. In this article, the first three rules are used in the
evaluation of Bloom’s revised taxonomy.
Interpretation of standards is based on human judgements, and it is important to
obtain agreement on the categorization of standards. One important aspect of this
agreement is to obtain a high level of inter-judge consistency, indicating that the cate-
gorizations will be the same regardless of judges, as well as intra-judge consistency,
indicating stability in the judgements (Stephens et al. 2006).
In general, studies about inter- and intra-judge consistency for interpretation of
standards are conspicuous by their absence. There are at least two possible explana-
tions of this. One explanation stems from Bloom’s original taxonomy (Bloom 1956),
in which the author claimed that it is at least a bit more complicated to classify assessment
items than standards. This claim has been used as an argument for focusing only
on categorization of assessment items (Poole 1971). A second explanation is that
judges in alignment studies are supposed to be familiar with the specific standards,
and therefore the discussion about interpretations of standards is restricted to the train-
ing part (Bhola, Impara, and Buckendahl 2003). Even though it is important that the
judges agree on how to interpret standards, this assumption is very seldom verified.
In contrast, inter-judge consistency has been reported in several studies
dealing with categorization of assessment items with the same types of taxonomies that
can be used for interpretation of standards (e.g. Fairbrother 1975; Seddon 1978;
Herman, Webb, and Zuniga 2007; Webb, Herman, and Webb 2007). In such studies,
inter-judge consistency is commonly measured as the percentage of perfect agreement
among the judges and the kappa coefficient (Watkins and Pacheco 2000; Stemler
Herman, Webb, and Zuniga (2007) reported the percentage of agreement for a
clear majority (at least two thirds of the judges) and a bare majority (more than half
of the judges) to give a more nuanced picture than the overall percentage of agreement.
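These agreement measures are simple to compute. As an illustrative sketch (not code from the study), the function below takes, for each standard, the list of categories assigned by the judges and returns the proportions of standards with perfect agreement and with a clear majority; the function name and data layout are assumptions:

```python
from collections import Counter

def agreement_levels(categorizations):
    """For each standard, `cats` lists the category chosen by each judge.
    Returns (perfect, clear): the proportion of standards on which all
    judges agree, and on which at least two thirds of the judges agree."""
    perfect = clear = 0
    for cats in categorizations:
        # Size of the largest group of judges choosing the same category.
        top = Counter(cats).most_common(1)[0][1]
        if top == len(cats):
            perfect += 1
        if 3 * top >= 2 * len(cats):
            clear += 1
    n = len(categorizations)
    return perfect / n, clear / n
```

With four judges per panel, as in this study, a clear majority corresponds to at least three judges agreeing.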
The purpose of this study is to investigate the usefulness of Bloom’s revised taxon-
omy for interpretation of standards. Another purpose is to describe differences and
similarities between teachers and assessment experts when interpreting standards.
Method
Design
Two panels of judges categorized the same standards with Bloom’s revised taxonomy
under similar conditions. The judgements of these two panels were compared regard-
ing inter- and intra-judge agreement as well as the usefulness of Bloom’s revised
taxonomy as a tool for interpretation of standards.
The standards
The 35 interpreted standards in this study make up one syllabus in mathematics for
upper secondary schools in Sweden. The analysed syllabus contains 20 standards
named goals and 15 standards named grading criteria (see Skolverket 2007–08).
The knowledge dimension focuses on content as types of knowledge. The categories in
this dimension are factual knowledge, conceptual knowledge, procedural knowledge and
metacognitive knowledge. The categories in the knowledge dimension are assumed, by the authors, to lie
along a continuum, from concrete in factual knowledge to abstract in metacognitive
knowledge. The continuum between conceptual and procedural knowledge overlaps
somewhat, according to the authors.
The dimension of cognitive processes focuses on how the knowledge is used. The
categories in this dimension are remember, understand, apply, analyse, evaluate and
create. The underlying continuum in this dimension is cognitive complexity, ranging
from low-cognitive complexity in remember to high-cognitive complexity in create.
Bloom’s revised taxonomy provides a two-dimensional taxonomy table with
24 cells (see Figure 1). The rows in the taxonomy table represent the four categories
of the knowledge dimension and the columns the six categories of the cognitive
process dimension. One standard will thereby be categorized according to the two
dimensions and placed in the corresponding cell in the taxonomy table.
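The structure of the taxonomy table can be sketched as a mapping from (knowledge, process) pairs to counts; the category names follow Anderson and Krathwohl (2001), while the variable names and the example categorization are illustrative assumptions:

```python
KNOWLEDGE = ["factual", "conceptual", "procedural", "metacognitive"]
PROCESS = ["remember", "understand", "apply", "analyse", "evaluate", "create"]

# The 4 x 6 taxonomy table: one counter per cell.
table = {(k, p): 0 for k in KNOWLEDGE for p in PROCESS}

# A hypothetical categorization of one standard into a single cell.
table[("procedural", "apply")] += 1
```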
The judges
One panel consisted of four teachers with relevant education for, and experience of,
teaching the specific course in mathematics. The four teachers teach in different
schools in different parts of Sweden. These teachers were also engaged as a reference
group for developing national tests in mathematics for the particular course. It can
therefore be concluded that all the teachers were very familiar with the syllabus.
The other panel consisted of four assessment experts with relevant education for,
and prior experience of, teaching the specific course in mathematics. They have also
developed and constructed national tests for at least five years, but for different
courses in mathematics in upper secondary schools. These judges can be assumed to
have deeper and more detailed experience of analysing standards in mathematics
than the teachers in the other panel.
Procedure
The procedure for collecting data was the same for both panels, even though they took
place on different days. For both of the panels, data was collected on two occasions,
so that intra-judge consistency as well as inter-judge consistency could be studied.
A week before the first occasion, the judges received an introduction letter. This
letter presented the study, gave an overview of Bloom’s revised taxonomy and provided
classified examples of standards from a syllabus other than the one in the study. On
the first occasion, Bloom’s revised taxonomy was presented and exemplified,
followed by a discussion about classification of examples. Directly afterwards the
judges categorized the standards individually. On the second occasion, the judges
again individually categorized the standards in the same syllabus, but without any
introduction. The time between the two occasions was two to three months.
The cognitive process dimension is assumed to lie on a continuum from low to high
cognitive complexity (Anderson and Krathwohl 2001), and when categorizing each
standard regarding this dimension the judges were instructed to choose the category
with the highest cognitive complexity. The categories in the knowledge dimension are,
however, problematic to order along a continuum, because knowledge is commonly
assumed to consist of different types without any clear ordering (e.g. de Jong and
Ferguson-Hessler 1996). Therefore, the judges were allowed to place each standard
into more than one category in the knowledge dimension, i.e. multi-categorize.
However, factual and conceptual knowledge are ordered along a continuum: factual
knowledge provides the building blocks of conceptual knowledge (e.g. Anderson and
Krathwohl 2001). Therefore, the judges were allowed to choose only either factual or
conceptual knowledge. If standards are multi-categorized, then the cells are not
mutually exclusive, according to Hauenstein’s third rule (1998).
Statistical methods
Three measures of both inter- and intra-judge consistency among individual judges are
reported, namely the percentage of perfect agreement (all judges agree), the percent-
age of a clear majority of the judges (at least three judges agree) and Fleiss’s kappa.
These measures are all useful when treating nominal variables (Stemler 2004) and the
categories in at least the knowledge dimension of the taxonomy can only be assumed
to be nominal variables. The strength of kappa values compared to the percentage of
agreement is that kappa takes into account chance agreement among the judges
(Watkins and Pacheco 2000).
Fleiss’s kappa (1971) is used in this study because there were multiple judges and the
data were at the nominal level. However, each judge was allowed to place one single standard in one
to three categories with the same weight. To be able to compute Fleiss’s kappa, only
one category per standard and judge can be used. The category chosen was the one
that the judges in each panel most strongly agreed on. According to Landis and Koch
(1977) kappa values between 0.01 and 0.20 represent slight agreement, those between
0.21 and 0.40 fair agreement, those between 0.41 and 0.60 moderate agreement, and
those greater than 0.60 substantial agreement. For measuring percentage of agreement,
a rule of thumb is that an agreement of at least 70% is acceptable (Stemler 2004).
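As a hedged sketch of the computation (not code used in the study), Fleiss’s kappa can be calculated from a matrix of rating counts, assuming each judge places each standard in exactly one category:

```python
def fleiss_kappa(ratings):
    """Fleiss's kappa (1971). `ratings[i][j]` counts the judges who
    placed standard i in category j; every row must sum to the same
    number of judges n."""
    N = len(ratings)        # number of standards
    n = sum(ratings[0])     # judges per standard
    k = len(ratings[0])     # number of categories
    # Observed agreement per standard, averaged over standards.
    P_bar = sum((sum(c * c for c in row) - n) / (n * (n - 1))
                for row in ratings) / N
    # Chance agreement from the marginal category proportions.
    p = [sum(row[j] for row in ratings) / (N * n) for j in range(k)]
    P_e = sum(q * q for q in p)
    return (P_bar - P_e) / (1 - P_e)
```

Perfect agreement on every standard gives a kappa of 1, while agreement at chance level gives a kappa near 0.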
The judges were instructed to use either factual or conceptual knowledge in their
categorization of standards, but sometimes both these categories were used at the same
time. In such cases, only the cell with conceptual knowledge was counted.
The statistical analysis of inter- and intra-judge consistency for panels as wholes
is based on how the standards are distributed in the taxonomy table for all judges in
each panel. The percentage of the total number of categorizations of all standards is
presented in a taxonomy table for each panel on each occasion.
The distribution of standards over the cells in the taxonomy table for one panel
and occasion is compared with the distribution for the other panel or occasion, and
the emphasis index is used as a measure of how similar the two distributions are.
This index was used by Porter (2002) for comparing the distribution of standards
with the distribution of assessment items in alignment analyses, although he called
it the balance index. The emphasis index is:
E = 1 − (Σ |x − y|) / 2
where x is the proportion of the total number of categorized standards in each cell in
the taxonomy table for Panel 1 or Occasion 1 and y is the corresponding proportion
for Panel 2 or Occasion 2. When E = 1, the distributions are the same and emphasize
the same cells in the taxonomy table. E = 0 means that the distributions are completely
different.
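Given the formula above, the emphasis index is straightforward to compute; the sketch below assumes the two distributions are given as equal-length lists of cell proportions (the function name is illustrative):

```python
def emphasis_index(x, y):
    """Emphasis (balance) index between two distributions of cell
    proportions: 1 means identical emphasis, 0 completely different."""
    return 1 - sum(abs(a - b) for a, b in zip(x, y)) / 2
```

For example, two identical distributions yield 1, and two distributions with no overlapping cells yield 0.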
Webb (2002) used a similar index for balance between standards and assessment
items, and according to him index values of at least 0.70 indicate an acceptable level,
while values between 0.60 and 0.70 indicate an only weakly acceptable level.
Results
Firstly, results about the usefulness of Bloom’s revised taxonomy for interpretation of
standards are presented. These results will first be reported for both panels, and then
the results for each panel are presented. Finally, results concerning the consistency of
the use of the taxonomy will be reported, both inter- and intra-judge consistency for
individual judges.
Usefulness
All standards were categorized by all judges in both panels, i.e. the taxonomy is totally
inclusive. Table 1 shows proportions of multi-categorized standards for both of the
panels.
Both the teachers and the assessment experts multi-categorized standards, but the
assessment experts multi-categorized standards to a larger extent than the teachers. On
the first occasion, 31 of the 35 standards were multi-categorized by at least one assess-
ment expert, while only 5 standards were multi-categorized by at least one teacher.
The number of multi-categorized standards increased from the first occasion to the
second occasion for both panels (see Table 1). For example, the teachers more than
doubled the number of multi-categorized standards from 5 to 13.
The utilization of the cells in the taxonomy table is visualized in Figure 1. All four
judges in a panel on one occasion were treated as a whole; all their placements of the
standards form the total distribution for that whole, and the percentages in Figure 1
are based on these total distributions. Cells with a larger proportion of standards
contain more rectangles in Figure 1, and the cells with the most placements are also
coloured darkest.
The teachers categorized the standards into more cells in the taxonomy table than
the assessment experts (see Figure 1). On both occasions, the teachers used 21 cells in
the taxonomy table and 19 cells were used on both occasions. The assessment experts
used 16 cells on the first occasion and 18 cells on the second occasion, with 15 cells
used on both occasions. None of the judges used the cell create factual knowledge,
while all the other 23 cells were used by at least one judge on at least one occasion.
Table 2 presents the results of the emphasis index, which indicates how similar
two distributions of categorizations are, when each panel at each occasion is treated
as one whole.
Table 1. Proportions of multi-categorized standards (placed in more than one cell) for
teachers and assessment experts on both occasions.

             Teachers   Assessment experts
Occasion 1   14% (5)    89% (31)
Occasion 2   37% (13)   97% (34)

[Figure 1 appears here: a taxonomy table showing, for each panel and each occasion, the
percentage of total categorizations placed in each cell; rows are the knowledge dimension
(factual, conceptual, procedural and metacognitive knowledge) and columns the cognitive
process dimension.]
Figure 1. Distribution (in per cent) of each panel’s total categorizations of all standards on
each occasion.

Higher values on the emphasis index indicate a larger correspondence between the
two compared distributions of categorizations. When the distributions of the two panels
are compared, the emphasis index is higher for the second occasion compared to the
first occasion (see Table 2) indicating that the distributions for the two panels corre-
spond to a larger extent on the second occasion than on the first one. However, the
emphasis indices are high on both occasions. When the distributions for the two occa-
sions for each panel are compared, the emphasis indices are higher for the assessment
experts (0.84) than for the teachers (0.73). This indicates that the assessment experts’
distributions of standards agree to a larger extent than the teachers’ distributions.
However, the emphasis index is also high for the teachers. An index of at least 0.70
is, according to Webb (2002), an acceptable level and all, except the comparison
between the panels on Occasion 1, reach this level.
Inter-judge consistency
Table 3 presents results from an analysis of inter-judge consistency among the
individual judges in each panel on each occasion.

Table 2. Emphasis index, indicating the degree of similarity between two distributions of
categorizations of standards, for comparison between the two panels on each occasion as well
as for comparison between the two occasions for the respective panel.

Comparison                              E
Between panels: Occasion 1              0.64
Between panels: Occasion 2              0.75
Between occasions: Teachers             0.73
Between occasions: Assessment experts   0.84

Table 3. Consistency among judges in each panel on each occasion (inter-judge consistency),
reported both as percentage of agreement and kappa coefficients.

                    Teachers                  Assessment experts
                    Occasion 1  Occasion 2    Occasion 1  Occasion 2
Perfect agreement   3% (1)      11% (4)       26% (9)     14% (5)
Clear majority      29% (10)    29% (10)      46% (16)    46% (16)
Kappa coefficients  0.15        0.24          0.47        0.41

Note: (1) The percentage of agreement is reported both for all four judges in each panel (perfect
agreement) and for at least three judges in each panel (clear majority). (2) Number of standards
in parentheses.

The assessment experts agreed to a higher degree than the teachers about the
categorizations of the standards on both
occasions. All four assessment experts agreed about the categorization of nine
standards (26%) on the first occasion and five standards (14%) on the second
occasion, while all the teachers agreed on one standard (3%) on the first occasion and
four standards (11%) on the second occasion. A clear majority of assessment experts
agreed on 16 standards (46%) on both occasions, compared to 10 standards (29%) for
the teachers. If the acceptable level of at least 70% agreement is applied to these
results, the inter-judge consistency is non-acceptable.
The kappa coefficients (see Table 3) also show a higher degree of inter-judge
agreement for the assessment experts compared to the teachers on both occasions. For
the assessment experts, the kappa coefficients were 0.47 and 0.41 respectively, indi-
cating moderate agreement. For the teachers, the kappa coefficients were 0.15 and
0.24 respectively, indicating slight agreement on the first occasion and fair agreement
on the second occasion.
Intra-judge consistency
Table 4 presents results concerning intra-judge consistency for the individual judges.
The assessment experts placed standards in the same category on both occasions to a
higher degree than the teachers. On average, 51% of the standards (18) were catego-
rized in the same way on both occasions by the assessment experts compared to 25%
of the standards (9) for the teachers.
The kappa coefficients (see Table 4) also show a higher degree of intra-judge
consistency for the assessment experts compared to the teachers. For the assessment
experts, the average kappa coefficient is 0.43, indicating moderate agreement. For the
teachers, the average kappa coefficient is 0.18, indicating only slight agreement.

Table 4. Consistency between occasions for individual judges (intra-judge consistency), with
averages and standard deviations (SD) for each panel.

                              Teachers     Assessment experts
Agreement           Average   25% (9)      51% (18)
                    SD        7% (2.50)    12% (4.19)
Kappa coefficients  Average   0.18         0.43
                    SD        0.09         0.12

Note: (1) Intra-judge consistency is reported as percentage of agreement and kappa coefficients.
(2) Number of standards in parentheses.
Discussion
The purpose of this study was to investigate the usefulness of Bloom’s revised taxon-
omy for interpretation of standards. The purpose was also to study differences and
similarities between teachers and assessment experts when they interpreted standards
by means of the taxonomy. The discussion is structured in the following way. Firstly,
the usefulness of Bloom’s revised taxonomy will be discussed. Secondly, differences
and similarities between teachers and assessment experts will be discussed. Finally,
the limitations of this study will be discussed.
The conclusion is that Bloom’s revised taxonomy, on the whole, is a useful tool
for interpretation of standards in this study.
Limitations
In this study, the evaluation of the usefulness of the taxonomy was limited to
Hauenstein’s first three rules. To evaluate the fourth and fifth rules, i.e. whether the
categories are ordered by a consistent principle and whether the terms in the taxonomy
are representative of the field, other types of studies are needed.
The size of the samples in this study is quite small, and this may have influenced
the reliability negatively. Alignment studies are methodologically comparable to this
study, and in such studies the number of judges in a panel ranges from 2 (e.g.
Porter 2002) to 27 (e.g. D’Agostino et al. 2008). Webb (2007) recommends that a
panel should consist of five to eight judges and concludes that a larger number of
judges increases the reliability. However, a large number of judges also requires
substantial resources, such as time, people and money, and therefore the level of
acceptable reliability has to be weighed against the costs.
The teachers in this study are not fully representative of teachers in general,
because of their participation in the development of national tests. These teachers have
References
Anderson, L.W., and D.R. Krathwohl, eds. 2001. A taxonomy for learning, teaching, and
assessing: A revision of Bloom’s taxonomy of educational objectives. New York: Addison
Wesley Longman.
Bhola, D.S., J.C. Impara, and C.W. Buckendahl. 2003. Aligning tests with states’ content
standards: Methods and issues. Educational Measurement: Issues and practice 22, no. 3:
21–9.
Biggs, J. 2003. Teaching for quality learning at university. Glasgow: Society for Research
into Higher Education and Open University Press.
Bloom, B.S., ed. 1956. Taxonomy of educational objectives: Handbook I: Cognitive domain.
New York: David McKay.
Bybee, R.W. 2003. Improving technology education: Understanding reform – Assuming
responsibility. Technology Teacher 62, no. 8: 22–5.
D’Agostino, J.V., M.E. Welsh, A.D. Cimetta, L.D. Falco, S. Smith, W. Hester VanWinkle,
and S.J. Powers. 2008. The rating and matching item-objective alignment methods.
Applied Measurement in Education 21, no. 1: 1–21.
de Jong, T., and M.G.M. Ferguson-Hessler. 1996. Types and qualities of knowledge. Educational
Psychologist 31, no. 2: 105–13.
Fairbrother, R.W. 1975. The reliability of teachers’ judgement of the abilities being tested by
multiple choice items. Educational Research 17, no. 3: 202–10.
Fleiss, J.L. 1971. Measuring nominal scale agreement among many raters. Psychological
Bulletin 76, no. 5: 378–82.
Guilford, J.P. 1967. The nature of human intelligence. New York: McGraw-Hill.
Hanna, W. 2007. The new Bloom’s taxonomy: Implications for music education. Arts
Education Policy Review 108, no. 4: 7–16.
Hauenstein, A.D. 1998. A conceptual framework for educational objectives: A holistic
approach to traditional taxonomies. Lanham, MD: University Press of America.
Herman, J.L., N.M. Webb, and S.A. Zuniga. 2007. Measurement issues in the alignment of
standards and assessments: A case study. Applied Measurement in Education 20, no. 1:
101–26.
Landis, J.R., and G.G. Koch. 1977. The measurement of observer agreement for categorical
data. Biometrics 33, no. 1: 159–74.
Luft, P., C.M. Brown, and L.J. Slutherin. 2007. Are you and your students bored with the
benchmarks? Sinking under the standards? Then transform your teaching through
transition. Teaching Exceptional Children 39, no. 6: 39–46.
Marzano, R.J., and J.S. Kendall. 2007. The new taxonomy of educational objectives.
Thousand Oaks, CA: Corwin Press.
Mullis, I.V.S., M.O. Martin, T.A. Smith, R.A. Garden, K.D. Gregory, E.J. Gonzales, S.J.
Chrostowski, and K. M. O’Connor. 2001. TIMSS assessment frameworks and specifica-
tions 2003. Chestnut Hill: International Association for the Evaluation of Educational
Achievement.
Näsström, G., and W. Henriksson. 2008. Alignment of standards and assessment: A theoretical
and empirical study of methods for alignment. Electronic Journal of Research in
Educational Psychology 6, no. 3: 667–90.
OECD. 1999. Measuring student knowledge and skills: A new framework for assessment.
Paris: OECD.
Pickard, M.J. 2007. The new Bloom’s taxonomy: An overview for family and consumer
sciences. Journal of Family and Consumer Sciences Education 25, no. 1: 45–55.
Poole, R.L. 1971. Characteristics of the taxonomy of educational objectives: Cognitive
domain. Psychology in the Schools 8, no. 4: 379–85.
Popham, W.J. 2003. Test better, teach better: The instructional role of assessment.
Alexandria, VA: Association for Supervision and Curriculum Development.
Porter, A.C. 2002. Measuring the content of instruction: Uses in research and practice.
Educational Researcher 31, no. 7: 3–14.
Porter, A.C., and J.L. Smithson. 2001. Are content standards being implemented in the
classroom? A methodology and some tentative answers. In From the capitol to the classroom:
Standards-based reform in the States, ed. S.H. Fuhrman, 60–80. Chicago, IL: National
Society for the Study of Education, University of Chicago Press.
Seddon, G.M. 1978. The properties of Bloom’s taxonomy of educational objectives for the
cognitive domain. Review of Educational Research 48, no. 2: 303–23.
Skolverket. 2007–08. Upper secondary school. Mathematics. http://www3.skolverket.se/ki03/
front.aspx?sprak=EN&ar=0708&infotyp=8&skolform=21&id=MA&extrald= and http://
www3.skolverket.se/ki03/info.aspx?sprak=EN&id=MA&skolform=21&ar=0708&info-
typ=17 (accessed March 4, 2008).
Stemler, S.E. 2004. A comparison of consensus, consistency, and measurement approaches to