
Design and development of rubrics to improve assessment outcomes
A pilot study in a Master's level business program in India

Malini Y. Reddy
Department of Marketing and Strategy, IBS Hyderabad, Icfai Foundation for Higher Education, Andhra Pradesh, India

Received January 2010
Revised July 2010
Accepted November 2010

Quality Assurance in Education, Vol. 19 No. 1, 2011, pp. 84-104
© Emerald Group Publishing Limited, 0968-4883, DOI 10.1108/09684881111107771

Abstract
Purpose – This paper seeks to discuss the characteristics that describe a rubric. It aims to propose a
systematic method for developing curriculum wide rubrics and to discuss their potential utility for
program quality assessment.
Design/methodology/approach – Implementation of rubrics is a recent phenomenon in higher
education. Prior research and theoretical issues related to rubric design and development are
discussed. The proposed method for rubric development is illustrated by deriving generic analytic
scoring rubrics for two assessment methods, namely projects and cases in a Master’s level business
program in India. Aspects related to the validity of the rubrics developed are investigated, and the results
of a reliability study conducted using the intraclass correlation coefficient (ICC) are reported.
Findings – Upon testing, the rubrics were found to be reliable and valid grading tools. Results of
inter-rater and intra-rater reliability analyses illustrated that the rubrics supported standardization of
the manner in which cases and projects could be evaluated in different business courses.
Practical implications – Whereas previous studies on rubric construction have largely
concentrated on task specific rubrics, this study focuses on development of curriculum wide rubrics
that can be employed for assessment of students’ learning at both course and program level.
Originality/value – To date there has not been any published work on issues of assessment of
student learning through project and case analysis rubrics within diverse courses in a business
program. The method detailed in the study can guide the development of generic rubrics for
alternative assessment methods employed in business programs as well as in other disciplines.
Keywords Assessment, Learning, Higher education, Business schools, Students, India
Paper type Research paper

Introduction
A "rubric" in education literature is commonly understood as an assessment tool that is used to describe and score observable qualitative differences in performances. It captures the essence of performance in academic tasks by "listing the criteria, of what counts, and describing levels of quality from excellent to poor" (Andrade and Du, 2005, p. 1). In assessment methods such as projects, case analysis, essays and portfolios, where the constructed responses given by students cannot be evaluated with complete objectivity, rubrics are considered an effective approach for achieving reliable (consistent) and valid (accurate) professional judgment of students' performances (Pellegrino et al., 1999).
The development and use of rubrics for assessment of learning outcomes has become a popular and recognizable trend since the early 1990s. While they have been extensively used at the school level in the USA, their application in higher education, as well as their usage in other countries, has gathered momentum only in the last decade (Montgomery, 2002). Several business schools in the USA are currently using rubrics for assessment of students' learning at course and program level[1]. This interest in the use of rubrics in business programs is growing rapidly in other countries as well (Martell and Calderon, 2005). The departments of education and quality assurance agencies in several countries (e.g. the USA, the United Kingdom, Australia, France, Turkey) as well as globally recognized accreditation bodies (e.g. AACSB, EFMD, AMBA), which have adopted assessment of learning outcomes as the cornerstone of their assurance and accreditation processes, have contributed to the growing awareness and importance of rubrics in business education. Rubrics have been proposed as a viable means for scrutinizing actual samples of student work produced in the programs for assessing and assuring the attainment of the specified learning outcomes.
Having said this, it is puzzling to note the minuscule literature on the development and use of rubrics in business programs. A majority of the limited samples of rubrics designed for use in post-secondary business courses are task specific. There are no published studies on the development and use of a rubric for an assessment method such that it could be easily implemented for different assignments/tasks in a variety of business courses (henceforth referred to as curriculum wide generic rubrics). This paper attempts to fill this gap by demonstrating a systematic process for developing rubrics for two assessment methods, namely written analysis of cases and course based projects. The process illustrated can be used for developing rubrics that are generic enough to suitably serve a variety of assignment objectives in a course; can be implemented in different courses across academic terms; and can possibly be applicable at multiple institutions.

Literature on rubric design and development


Several educational researchers have studied aspects of rubric design and have made
recommendations for their appropriate construction. Literature presents two equally
relevant approaches:
(1) development from the perspective of describing the essential features of rubrics
(e.g. Huba and Freed, 2000; Arter and McTighe, 2001); and
(2) development by outlining a series of discrete steps to follow one after the other
(e.g. Moskal, 2000; Mertler, 2001).
Studying and synthesizing these approaches is essential for a thorough understanding
of the process of rubric construction.
A rubric has three essential features, namely evaluation criteria, quality definitions and scoring strategy (Popham, 1997). Evaluation criteria are the factors that an assessor considers when determining the quality of a student's work. These bring much needed clarity in understanding the requirements of performance, especially in areas which are difficult to define, such as critical thinking, problem solving and team skills. They also permit criterion-referenced discrimination of performances and enable monitoring of students' learning against each criterion. The criteria thereby provide the students with opportunities to reflect on both the content of their performance as well as the process of learning (Parke, 2001).
Describing the development of a rubric to assess student learning as a multi-step approach, the Council for Higher Education Accreditation (CHEA) recommends forming a team of four to eight members with representation from faculty, staff, and students (CHEA, 2002). Defining the goals and purpose for conducting assessment; becoming aware of the standards required by the national quality assurance systems; and reviewing the existing work in terms of what has been accomplished in the target area thus far are the preparatory steps that the team needs to take (Bresciani et al., 2004). These are followed by articulation of the student learning outcomes to be assessed; determining what meeting the outcome would look like; and converting these into a "handful" (Popham, 2003, p. 99) of the most important evaluative criteria to form the basic structure of the rubric. This requires drawing upon knowledge from scholarly literature, professional experience and feedback from colleagues.
Although attaining consensus on evaluative criteria amongst instructors and students is proposed as a desirable step, literature discussing and illustrating the possible approaches that may be used for attaining this consensus is rare. In fact, an overwhelming majority of empirical studies on rubric development do not incorporate this step, and rubrics are often developed by individual instructors. Studies by Campbell (2005), Parkes (2006), Petkov and Petkova (2006), Reitmeier et al. (2004), Green and Bowser (2006), Song (2006), and Lapsley and Moody (2007) are a case in point.
The second element, quality definitions, consists of detailed explanations of what a student must do to demonstrate a given level of achievement of a skill, proficiency or criterion. The definitions address the concern of telling a good response from a poor one. Definitions across levels, for example beginning, developing, accomplished and exemplary, can help in reflecting on the progress made by the students towards achievement of learning outcomes over a period of learning (Huba and Freed, 2000). The ideal number of levels in a rubric has been discussed minimally in the literature. While there is broad agreement that the levels should be few and meaningful, there is no consensus on the ideal number of descriptive levels in a rubric. For example, Popham (1997) suggests three to five levels, Stevens and Levi (2005) suggest a minimum of three levels and Callison (2000) recommends a maximum of four levels. The effect of the number of levels on the effectiveness and usability of rubrics by instructors as well as students has not been sufficiently examined. Another design feature of a rubric that plays a crucial role in determining its utility is the use of language that is easily understood by the students. Involving students in the development of rubrics increases their usability and has been suggested as a way of helping the students understand, develop and use their own conceptions of what constitutes good and poor work (Huba and Freed, 2000). It is expected to help them think "critically about their own work" (Stevens and Levi, 2005, p. 21). In practice, however, few studies mention involving students in the process of rubric development. Even those that do make no attempt at detailing the iterative process of incorporating changes in the rubric based upon student feedback.
One of the major flaws related to writing descriptions is a lack of clarity in meaningfully differentiating between performance levels. Descriptions could be too general and therefore ambiguous, or too specific, with many minute details spelt out, and therefore unwieldy. Development and description of the criteria by an interdisciplinary team of faculty members has been suggested as a strategy for achieving a balance between "generalized wording" for increasing applicability and "detailed description" for ensuring reliability (Suskie, 2004; Glickman-Bond and Rose, 2006). Providing instructors with samples of student performances that represent different accomplishment levels, expressing the criteria in terms of observable behaviors and product characteristics, and using measurement scales such as amount, frequency and intensity can bring out the distinction between achievement levels clearly (Tierney and Simon, 2004; Mertler, 2001). Actual accounts of rubric development that detail the process of writing the levels using these principles are, however, rare.
Another pressing issue related to quality definitions is inconsistency in the descriptions of performance criteria across levels. Stefl-Mabry (2004) recommends explaining the stages of development of valuable skills on a continuum. She suggests asking instructors to list the areas in which they would like their students to develop proficiencies and then detail how they would recognize that the students have achieved them. This process is expected to help the instructors in articulating the highest level of performance with observable elements. Similarly, asking the instructors to articulate the bare minimum requirements that would be acceptable to them for awarding a passing grade is expected to help in explaining the minimum level on the continuum. Moskal (2003) suggests a similar approach of beginning with a description of the highest level followed by that of the lowest level, so that the contrast between these could suggest an appropriate description of the levels that fall between them. Though widely discussed, the process of writing consistent definitions across levels, such that they effectively reflect development in a criterion/skill on a continuum, as well as the use of methods for enhancing objective observations, has not been sufficiently illustrated in the literature. These are crucial areas that determine the quality of the rubric and have implications for the reliability as well as the validity of the evaluations made using it.
The final element, scoring strategy, involves the use of a scale for interpreting and judging a product and/or process. A "holistic" rubric enables the scoring of the process or product as a whole, without allowing for judging of the component parts separately. Given this approach of overall evaluation, holistic rubrics are considered useful for getting a broad picture of attainment, especially when dealing with a large number of students or testing populations (Fraser et al., 2005). On the other hand, "analytic" rubrics use a scoring strategy where each criterion is scored separately for eventual aggregation to form an overall score. These therefore have a part-to-whole, criterion-by-criterion judgment approach which makes multidimensional assessment possible.
Mertler (2001, p. 3) states that "prior to designing a specific rubric, a teacher must decide whether the performance or product will be scored holistically or analytically". This important decision is based on how an instructor plans to use the results. An objective of providing formative feedback, for example, is better assisted by an analytic rubric, as these provide detailed and diagnostic feedback on the strengths and weaknesses of the product or performance. The information so gained can be used to influence individual students' learning trajectories. Further, aggregation of this individual level information can provide feedback to instructors about the effectiveness of their instruction, which can be used for bringing improvements in the process of planning instruction as well as in course design (Popham, 1997).
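As a minimal sketch of the analytic, criterion-by-criterion approach just described, the code below scores hypothetical submissions against four made-up criteria, aggregates them into an overall score, and averages each criterion across a class for diagnostic, formative feedback. The criterion names and the scores are illustrative assumptions, not the rubrics or data of this study.

```python
# A minimal sketch of analytic, criterion-by-criterion scoring; the criterion names
# and scores below are hypothetical, not the rubrics or data of this study.
from statistics import mean

CRITERIA = ["issues_identification", "analysis", "recommendations", "communication"]

def analytic_total(scores: dict) -> int:
    """Aggregate per-criterion scores (1-4 each) into an overall score (max 16)."""
    return sum(scores[c] for c in CRITERIA)

def class_profile(class_scores: list) -> dict:
    """Mean per criterion across a class: diagnostic feedback for the instructor."""
    return {c: round(mean(s[c] for s in class_scores), 2) for c in CRITERIA}

submissions = [
    {"issues_identification": 3, "analysis": 2, "recommendations": 3, "communication": 4},
    {"issues_identification": 4, "analysis": 3, "recommendations": 2, "communication": 3},
]
print([analytic_total(s) for s in submissions])  # [12, 12] out of 16
print(class_profile(submissions))                # strengths/weaknesses by criterion
```

A holistic rubric, by contrast, would record only a single overall judgment per submission, which is why the criterion-level profile above is only available under analytic scoring.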
Rubrics, whether holistic or analytic, can be applied to an individual assignment or task (i.e. task specific) or to a group of similar tasks (i.e. generic). The thoroughness of detail in task specific rubrics leads to higher reliability and validity than that in generic rubrics. However, given the extensive investment of instructors' time and energy needed for developing a rubric, the necessity as well as feasibility of developing a separate rubric for each task has been questioned (Popham, 1997; Moskal, 2000). Generic rubrics, on the other hand, are broader in scope and capture only the most essential ingredients of the outcome to be measured across different tasks in the same assessment method. This inherent flexibility is expected to permit a generic rubric developed for an assessment method to be used across multiple courses and institutions (Montgomery, 2002; Tierney and Simon, 2004). The proposition, however, remains largely untested.
After sufficiently deliberating and developing the key elements of the rubric, the next step involved in rubric development is "norming" the raters (Bresciani et al., 2004). Norming is the process of ensuring that raters understand the rubric in a similar manner. This can be achieved through:
. debate and discussion in the early stages of developing a rubric;
. practicing the scoring of samples of student work; and
. providing raters with "anchor papers", i.e. pre-scored student samples, to illustrate each score point (Tierney and Simon, 2004).

Discussions on the inconsistencies in the scores given during the norming process help in reconciling divergences in understanding of the rubric (Maki, 2001). Additionally, this serves the purpose of enhancing the reliability of subsequent scoring of student performance between different raters (inter-rater reliability) and by the same rater over a period of time (intra-rater reliability). Testing the rubric on a pilot sample of
student work or on “colleagues willing to role-play as students” (Bresciani et al., 2004,
p. 35) and using the knowledge so gained to revise the rubric have been suggested as
the final step involved in rubric development.
Establishing the reliability and validity of the rubrics developed has not received enough attention. Several empirical studies mention conducting pilot and reliability tests; however, very few (such as Dunbar et al., 2006; Oakleaf, 2006; Boulet et al., 2004) report the results. Information about the procedures, analyses, and results would enable readers to better understand claims made about validity and reliability. Studies need to report how the validity of a rubric was established and how scoring reliability was achieved, including rater training and its contribution toward inter-rater reliability. Further, applicability of a rubric at multiple sites necessitates attention to rater training as well as instrument use and bias.
From the literature reviewed it can be seen that the process of development of rubrics requires tremendous investments of time and effort by instructors. Rubric design also poses some serious limitations. While it is possible to attain a consensus on the process to be applied for rubric design and development, agreement on the criteria and performance levels is difficult to achieve. Selection of criteria and drafting of the descriptors is highly subjective and contextual. It is dependent upon a variety of factors, including the level of higher education (bachelor's, master's, doctoral); the emphasis of the program and course; the quality of the students; the instructor's experience and expertise; and so on. Even in the same context, different instructors may view the importance of criteria for inclusion and their descriptions differently. As a result, a rubric constructed for the same assessment method or task could be drafted differently by different instructors. A majority of the empirical studies do not elucidate the process used for rubric development; rarely involve students in the development process; and hardly ever provide information on the validity and reliability of the rubrics developed. Further, the literature does not provide an approach for designing and developing rubrics that:
. are acceptable to several instructors from multiple disciplines; and
. can be used across courses and academic terms.
This study proposes an approach for addressing these aspects.
Study
Based upon a review of the literature on rubric design and construction, this study proposed and tested a multi-step approach for the construction of curriculum wide rubrics. This is presented in Table I, followed by a description of the implementation processes.
The rubrics in this study were developed by involving 35 instructors and 95 students from two reputed business schools in Hyderabad, India. Four instructors, who have researched and published on the development and utility of rubrics and whose work has been extensively cited, were also involved as experts in the process of development of the rubrics. The rationale behind the procedure was to bring together the views of a group of interested colleagues in the schools to build upon their common knowledge, as this has a greater impact upon the definition of learning goals and expectations, course design and course delivery than any one individual may have (Suskie, 2004). The composition of the multidisciplinary team of instructors was: 61 percent male; average age 35 years; 56 percent with more than five years of teaching experience; and specialization in the disciplines of finance and accounting (8 percent), economics (18 percent), management (18 percent), human resources (14 percent), operations (8 percent), marketing (26 percent) and education (8 percent). Table II provides further demographic details of the participants involved in rubrics development.

Identifying the criteria of evaluation, levels of performance, scoring strategy (Steps 1, 2, 3)


The construction of rubrics was approached with overall clarity on what the desired learning outcomes were and what meeting each of the outcomes would involve. The first drafts of the rubrics were prepared on the basis of the combined knowledge gained from reviewing literature on rubric development, learning outcomes, and recommended evaluation criteria for the assessment methods concerned; analysis of student artifacts; and in-depth discussions with instructors.
A review of literature on the assessment methods of cases and projects reveals the considerable attention paid to identifying the criteria for grading them (Table III). Unlike case analysis, projects vary greatly in terms of the extent of structure and depth of questions explored. It can be observed that the literature does not present universally accepted criteria for what constitutes a satisfactory project.
In addition to studying these criteria listed in published research articles, several available (largely task specific) samples of written analysis of cases and project rubrics were scrutinized. These were from books (e.g. Martell and Calderon, 2005; Schrock, 2000; Huba and Freed, 2000; Arter and McTighe, 2001); web sites of business schools (Kania School of Management, The University of Scranton; Victoria School of Business, University of Houston; University of Wisconsin; College of Business, Idaho State University; College of Business, Winona State University); and online rubric resources (The ERIC Clearinghouse on Assessment and Evaluation, Rubric Builder, Rubistar, MyTeacher Tools, Rubrician).

Table I. Step-by-step procedure used for designing generic analytic rubrics
1 Identification of the learning objectives to be served by the use of the assessment method, leading to the identification of qualities (criteria) that need to be displayed in a student's work to demonstrate proficient performance
2 Identification of levels of performance for each of the criteria
3 Development of separate descriptive scoring schemes for each evaluation level and criterion
4 Obtaining feedback on the rubrics developed
5 Revision of rubrics based on feedback from primary stakeholders
6 Testing the reliability and validity of the rubrics
7 Pilot testing of the rubrics
8 Using the results of the pilot test to improve the rubrics

Table II. Demographic data on participants involved in development of rubrics
Instructors (n = 39)
Gender: Female 39%; Male 61%
Years of teaching experience: < 2 years 8%; 2-5 years 36%; 5-10 years 26%; > 10 years 30%
Area of specialization: Finance and accounting 8%; Economics 18%; Management 18%; Human resources 14%; Operations 8%; Marketing 26%; Education 8%
Students (n = 95)
Gender: Female 40%; Male 60%
Average age: 23 years
Work experience: < 1 year 48%; 1-2 years 34%; > 2 years 16%
In-depth discussions were held with the instructors to gain an understanding of the
aspects of learning that they attempted to assess while using these assessment
methods. They were asked to define the level and type of knowledge associated with
the different tasks they allot in their courses and use the taxonomies of learning
outcomes to aid their specifications of the criteria for assessment. Samples of student
work were collected from the instructors and the evaluation parameters along with the
scales used by them to assess the extent of learning attained by students were
deliberated upon. Descriptions of different levels of performances against each
criterion were detailed. Discussions with regard to scoring strategy revealed an
overwhelming preference for analytic scoring rubrics.
The initial rubric drafted for evaluation of cases contained the criteria "Issues; Perspectives; Analysis; Recommendations; and Documentation", and that drafted for projects contained "Statement of purpose; Information/literature search; Methodology; Findings and conclusions; and Documentation". A four-point scale with the labels beginning, developing, accomplished and exemplary, carrying numerical scores of 1-4 respectively, was used to denote a progression from the lowest acceptable performance to the highest anticipated performance.
Table III. Summary of the list of evaluation criteria for cases and projects

Cases
. Identification and evaluation of the issues in the case; knowledge of the facts and their implications; logical and analytical arguments; quality of language expression (Gopinath, 2004)
. Issues identification: how well and how much; use of the observations to draw relevant conclusions, construct analysis, offer recommendations from the conclusions drawn and outline a plan of action (Corey, 1980)
. Problem, analysis, solution, recommendation; narrative, interpretation as fact, logic and analysis (Greenhalgh, 2007)
. Stating the central issues with a clear rationale; substantiating with facts and data; identifying and prioritizing the issues/concerns based on importance and urgency; analyzing the case by assessing the information available and making assumptions about important missing information; considering a number of feasible alternative actions; examining the values as well as risks of the possible outcomes of the actions suggested; evaluating the alternatives; integrating theory and practice by way of applying theories and principles to situations; preparing a set of specific recommendations for action; addressing implementation concerns (Crittenden, 2005)
. Clarity of purpose; meticulousness, correct and effective analysis; feasible recommendations (alternatives and implementation); presence of logical consistency throughout a written presentation (Lundberg and Enz, 1993; Schroeder and Fitzgerald, 1984)

Projects
. Introduction; review; design; methods used; presentation of the results; interpretation and discussion; conclusion drawn; layout and referencing (Brown et al., 1997)
. Listing of important explicit and implicit issues in a situation; understanding the circumstances with the backdrop of relevant literature; defining a set of objectives; selecting and evaluating multiple alternatives to address the objectives set; delineating the implementation and follow up plan; identifying areas for further learning; and using effective communication (Bigelow, 2004)
. Clear statement of aims; appropriate underpinning theory; critical analysis; evaluation; extent to which problem illuminated; applicability of material to business; supporting evidence provided; conclusions drawn from evidence/analysis; appropriate (research) methods used; range of sources; logical structure; clarity of communication; presentation; referencing/bibliography (Woolf, 2004)
. Comprehension of concepts and aims, background information; initiatives; motivation/application; appropriateness of methods and/or experimental design; organizational skills; competence and independence; ability to problem solve (Heywood, 2000, as quoted by Petkov and Petkova, 2006)
Obtaining feedback on the rubrics developed and carrying out revisions (Steps 4 and 5)
The rubrics were discussed with the primary stakeholders (students, instructors) to assess and improve their appropriateness and usability. Feedback on the rubrics was collected from 39 instructors and 30 student volunteers (institution I, average age 24, 40 percent female) using a structured open ended format. The instructors and experts were asked to rate and comment on the clarity, completeness and generalized applicability of the rubrics, while the students were asked to comment on the language, clarity and completeness of the rubrics. Several important recommendations were made.
The students suggested a reduction in the length of descriptions, stating them to be
too “long” and “boring” to read and also suggested highlighting the most crucial
differentiating elements in each of the performance levels for easier identification and
greater impact. They expressed a preference for ordering the performance levels from highest to lowest, in contrast to the lowest-to-highest ordering presented in both the rubrics. They felt that reversing the order would motivate them "to read the complete rubric". The students found the language easy to interpret.
A suggestion that came from the instructors with regard to the language was to replace definitive phrases such as "student is unable to" with the more development oriented "student is not able to", thereby indicating a possibility for improvement. The instructors also suggested bringing down the criteria in the rubric for written analysis of cases from five to four by combining two criteria, namely "identification of issues" and "identification of multiple perspectives". Instructors, specifically those handling quantitative courses, suggested that the two criteria in the project rubric namely "methodology" and "findings and conclusions" be split into four criteria, namely methodology; computation and reporting of results; findings and analysis; and conclusions and recommendations. This was suggested to make the rubric more universally applicable and also to permit instructors to allot different weights to the criteria as per the specific requirements of an assignment. This increased the number of criteria in the project rubric from the existing five to seven.
Some instructors suggested that the descriptions of performance levels, which were presented as a collection of separate bullet points, should instead be presented in a summarized paragraph format. This was to counter the temptation on the part of raters to use the bullet points as a checklist instead of taking a holistic view of performance against the criterion. The three experts made suggestions for
replacing qualitative words such as very, less, more with either objective descriptions
or quantification to bring in observable differences between the performance levels.
Based on these suggestions, the rubrics were revised. The final rubrics developed are
presented in Tables IV and V.

Reliability and validity of the rubrics developed (Step 6)


Although the development of scoring rubrics is a strategy for reducing subjectivity in making qualitative judgments of students' work, for rubric-based assessments to be sound, they must be unbiased and free of distortions. The two concepts that are important for identifying and estimating bias and distortion are reliability and validity. While reliability signifies the extent to which an assessment tool yields consistent results on repeated implementation, validity denotes the degree of accuracy with which an assessment tool measures what the assessment method intends to measure.
Pretesting of a rubric has been suggested as an important requirement to reflect the
agreement between two or more raters and between the scores given by the same rater
over a period of time.
Table IV. Rubric developed for written analysis of cases (levels: Exemplary = 4, Accomplished = 3, Developing = 2, Beginning = 1; total score /16)

Issues identification
. Exemplary (4): Identifies and differentiates between the crucial and non-crucial issues while clearly describing the multiple perspectives of different characters in the case
. Accomplished (3): Identifies and differentiates between the crucial and non-crucial issues while partially describing the multiple perspectives of different characters in the case
. Developing (2): Identifies the issues in the case; however, differentiation between crucial and non-crucial is either superficial or not presented
. Beginning (1): Not able to clearly identify the issues presented in the case

Analysis
. Exemplary (4): Reasoning is logical, well articulated and based upon: detailed and correct examination of the facts presented in the case; sound theoretical knowledge; and personal experience
. Accomplished (3): Examination of the facts presented in the case is detailed and correct; however, the same is not suitably articulated to explain how the facts or relevant theories/concepts support the assertions/reasoning
. Developing (2): Examination of the facts presented in the case is at times incorrect and does not draw upon relevant theories/concepts, leading to inconsistent reasoning
. Beginning (1): Examination of facts presented in the case is superficial and incorrect, leading to unsupported ideas with flawed reasoning

Recommendations
. Exemplary (4): Vigorously explores multiple practical solutions and carefully details the consequences of each solution
. Accomplished (3): Proposes practical solutions for each of the issues identified, however only partially examines the consequences of each action proposed
. Developing (2): Proposes solutions which have a theoretical basis and are related to the issues but are too general to implement
. Beginning (1): Proposes solutions which are either impractical or unrelated to the issues identified

Communication
. Exemplary (4): Logical organization of write up presented in a professional manner with well connected sections and proper title page, referencing, section headings, word choice, grammar and spelling
. Accomplished (3): Minor problems in organization of write up presented in a professional manner with proper title page, referencing, section headings, word choice, grammar and spelling
. Developing (2): Errors in organization of write up presented, with some lapses in title page, referencing, section headings, word choice, grammar and spelling which distract the reader
. Beginning (1): Frequent errors in organization of write up and in presentation distract the reader and interfere with meaning/comprehension

Table V. Rubric developed for course based projects (levels: Exemplary = 4, Accomplished = 3, Developing = 2, Beginning = 1; total score /28)

Statement of purpose
. Exemplary (4): Presents focused and novel project objective(s) that reveal a new line of enquiry and enable involvement in meaningful and challenging work
. Accomplished (3): Presents focused project objective(s) that enable involvement in meaningful and challenging work
. Developing (2): Presents focused project objective(s) that enable involvement in meaningful, though not challenging, work
. Beginning (1): Presents research objective(s) which are either unfocussed or lend themselves to readily available answers

Information/literature review
. Exemplary (4): Presents a critical review of relevant and current literature/information to provide a clear context and rationale for the proposed project
. Accomplished (3): Partial attempt at presenting a critical review of relevant and current literature/information to provide a clear context and rationale for the proposed project
. Developing (2): Presents only a summary of relevant and largely current literature/information which does not provide a clear context and/or rationale for the proposed project
. Beginning (1): Presents a summary of either irrelevant or outdated literature/information

Methodology
. Exemplary (4): Takes an appropriate approach to study the project objective(s) and provides logical explanation for the selection of the same
. Accomplished (3): Takes an appropriate approach to study the project objective(s) and provides explanations which partially justify the selection of the same
. Developing (2): The approach to examine the project objective(s), though appropriate, appears to be pre-selected with no attempt to explain its selection
. Beginning (1): Takes an inappropriate approach to examine the project objective(s)

Computation and reporting of results
. Exemplary (4): Absence of any errors in computation and reporting of results
. Accomplished (3): Presence of such minor errors in computation and/or reporting of results that have no impact on the analysis
. Developing (2): Presence of such errors in computation and/or reporting of results that have a marginal negative impact on the analysis
. Beginning (1): Presence of such errors in computation and/or reporting of results that have a substantial negative impact on the analysis

Findings and analysis
. Exemplary (4): Presents factual findings as well as analysis of connections between them, along with synthesis with relevant theory/principles/literature and personal experience
. Accomplished (3): Presents factual findings as well as analysis of connections between them; however, synthesis with relevant theory/principles/literature and personal experience is limited
. Developing (2): Presents factual findings as well as analysis of connections between them, but synthesis with relevant theory/principles/literature and personal experience is either incorrect or absent
. Beginning (1): Presents only factual findings without even making an attempt to examine the connections between them or synthesis with literature and personal experience

Conclusions and recommendations
. Exemplary (4): Draws logical conclusions from the evidence/analysis to provide action oriented recommendations while clearly articulating their pros and cons, thereby demonstrating an understanding of the complexity of the problem and practicality of the recommendations
. Accomplished (3): Draws logical conclusions from the evidence/analysis to provide recommendations which are mostly action oriented, but articulation of the pros and cons does not demonstrate either an understanding of the complexity of the problem or the practicality of the recommendations
. Developing (2): Draws some of the conclusions from evidence/analysis and some based on personal opinion to provide recommendations which are either unrelated to the problem or are not action oriented, with articulation of pros and cons being vague
. Beginning (1): Draws conclusions based on personal opinion to provide recommendations which are either unrelated to the problem or are not action oriented, with articulation of pros and cons of the recommendations being either absent or sketchy

Documentation
. Exemplary (4): Format is professional with proper organization, indexing, cover page, referencing, titles, section headings, grammar and spelling
. Accomplished (3): Format is professional with minor lapses in organization, indexing, cover page, referencing, titles, section headings, grammar and spelling
. Developing (2): Lapses in organization, indexing, cover page, referencing, titles, section headings, grammar and spelling distract the reader though not interfering with meaning
. Beginning (1): Lapses in organization, indexing, cover page, referencing, titles, section headings, grammar and spelling distract the reader and interfere with meaning
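To make the structure of the two final rubrics concrete, the sketch below encodes their criteria and the four ordinal levels so that a grader's ratings can be totalled consistently (out of 16 for cases and 28 for projects). The encoding and the example ratings are illustrative assumptions, not part of the study's procedure.

```python
# Sketch: encoding the final analytic rubrics (Tables IV and V) so that ratings can be
# recorded and totalled consistently; the example ratings below are hypothetical.
LEVELS = {"exemplary": 4, "accomplished": 3, "developing": 2, "beginning": 1}

CASE_RUBRIC = ["Issues identification", "Analysis", "Recommendations", "Communication"]  # max 16
PROJECT_RUBRIC = ["Statement of purpose", "Information/literature review", "Methodology",
                  "Computation and reporting of results", "Findings and analysis",
                  "Conclusions and recommendations", "Documentation"]                     # max 28

def total_score(rubric, ratings):
    """ratings maps each criterion to a level label; returns (total, maximum)."""
    missing = [c for c in rubric if c not in ratings]
    if missing:
        raise ValueError(f"unrated criteria: {missing}")
    return sum(LEVELS[ratings[c]] for c in rubric), 4 * len(rubric)

ratings = {c: "accomplished" for c in CASE_RUBRIC}
ratings["Analysis"] = "exemplary"
print(total_score(CASE_RUBRIC, ratings))  # (13, 16)
```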
Fourteen faculty members participated in the inter-rater reliability study, and two of these raters participated in the intra-rater reliability study as well (average age 37 years; 64 percent male). Reliability of the rubrics developed was tested using the intraclass correlation coefficient (ICC), a measure of correlation, consistency or conformity for a data set with multiple groups. ICC falls within the framework of analysis of variance (ANOVA), and there are six formulas for calculating the ICC depending upon the purpose of the study, the design of the study and the type of measurements taken. This study used a two-way random effects model of ICC. In this model, both raters and samples are randomly selected from a larger pool. Each of the raters scores all the samples, and the individual rater's score constitutes the unit of analysis. Instead of an average measure, a single measure was selected because, eventually, the rubrics were intended to be used in a manner such that the scores given by individual raters (and not the average score of raters) would be used to assess student work. This version of ICC is interpreted as being generalizable to all possible judges (Shrout and Fleiss, 1979) and is appropriate when the purpose is to establish that the scales can be used effectively by several other raters. The analysis was carried out in SPSS version 12 and a 95 percent confidence interval was defined.
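Single-measure, two-way ICC values of this kind can be computed from a samples-by-raters score matrix using the ANOVA mean squares of Shrout and Fleiss (1979); the "absolute agreement" and "consistency" figures reported later roughly correspond to the ICC(2,1)- and ICC(3,1)-style formulas. The sketch below is a minimal illustration with a made-up score matrix, not the study's data or its SPSS output.

```python
# Sketch of the single-measure, two-way ICC (Shrout and Fleiss, 1979); the score
# matrix below is invented for illustration, not taken from the study.
import numpy as np

def icc_single(scores: np.ndarray):
    """scores: n_samples x k_raters. Returns (absolute agreement, consistency)."""
    n, k = scores.shape
    grand = scores.mean()
    row_means = scores.mean(axis=1)            # per student artifact
    col_means = scores.mean(axis=0)            # per rater
    ss_total = ((scores - grand) ** 2).sum()
    ss_rows = k * ((row_means - grand) ** 2).sum()
    ss_cols = n * ((col_means - grand) ** 2).sum()
    ms_r = ss_rows / (n - 1)                   # between-sample mean square
    ms_c = ss_cols / (k - 1)                   # between-rater mean square
    ms_e = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    agreement = (ms_r - ms_e) / (ms_r + (k - 1) * ms_e + k * (ms_c - ms_e) / n)
    consistency = (ms_r - ms_e) / (ms_r + (k - 1) * ms_e)
    return agreement, consistency

# Two raters' total rubric scores for eight hypothetical case write-ups
scores = np.array([[12, 13], [9, 9], [15, 14], [8, 9], [11, 11], [14, 15], [10, 9], [13, 13]])
print(icc_single(scores))
```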
The selection of the raters was from the pool of instructors who had participated in
the previous steps. All the raters had taught the particular course whose samples they
were evaluating at least twice. With an aim to achieve high inter-rater reliability,
extensive discussions were held separately with each one of the raters involved in the
study. Anchor papers were used to clarify the criteria and the levels of performance. A
total of 100 student artifacts of written case analysis (43) and projects (57) were
collected from a cross section of core introductory courses to represent a spread of
qualitative and quantitative orientation. The courses were Marketing management
(MM); Financial management (FM); Macro economics and business environment
(MEBE); Business research methods (BRM); Management control and information
systems (MCIS); and Organizational behavior (OB). Along with the samples, the
assignment briefs were collected and shared with the raters to bring in clarity on the
objectives and expectations of the assignments.
Literature on inter-rater reliability states that for classroom assessments two raters
are sufficient to produce acceptable levels of inter-rater agreement. In this study two
raters independently scored each of the student artifacts. The raters were blind to the
other raters and also to the origination of the samples. The sample size in this study was in the acceptable range of six to 12 samples per course and assessment method (Bonett, 2002; Walter et al., 1998). For each of the courses in which samples were collected, absolute agreement and consistency were calculated both for the total scores given by the raters and for the scores given against each criterion in the rubric.
The results presented in Figure 1 show that the absolute agreement for the total scores of case analysis across subjects ranged between 0.78 and 0.92, and consistency ranged between 0.90 and 0.95. The same for the project rubric stood at 0.61-0.99 (agreement) and 0.71-0.99 (consistency). This is an acceptable range for initial evidence of score reliability involving rubrics (Shrout and Fleiss, 1979; Wainer and Thissen, 1996).
Criterion-wise ICC results were also calculated to identify specific areas of low reliability. In the case analysis rubric, relatively lower agreement as well as consistency (0.64) was found in the MM course for criterion 1. In the course-based projects rubric, the same was seen in the MEBE course for criteria 1, 2, 3 and 6 (agreement and consistency both reflected 0.615). Discussions were held with the concerned raters to probe the reasons for the discrepant scores. It was found that one of the raters for MEBE had difficulties in interpreting the levels correctly, while one of the
raters in MM had misinterpreted the objectives of the assignment. The concerns of lower reliability could therefore be suitably addressed by undertaking more intensive rater training. The need for revision of the rubrics was not felt.

Figure 1. Aggregate inter-rater reliability (agreement and consistency) of case and project rubrics
In addition to inter-rater reliability, the internal consistency (intra-rater reliability) of a marker, as a reflection of the fairness of marks awarded, was assessed using a test-retest approach, wherein the same set of assignments was assessed by a rater twice, separated by four weeks. The results of the correlation between scores at time 1 and time 2 showed a "high" level of agreement and consistency in the scores given by both instructors. Rater 1 achieved 0.96 (agreement) and 0.98 (consistency) on the case rubric and 0.97 (agreement) and 0.99 (consistency) on the project rubric. The same for the other rater were 0.93, 0.95, 0.95 and 0.98 respectively.
The results of the reliability study therefore ranged between acceptable and high levels. Reliability is a necessary condition for validity, which refers to the meaningfulness of an assessment tool. In classroom assessment, validity is conceptualized in terms of establishing a linkage with the purposes of assessment. Relevance (measures outcomes as directly as possible), accuracy (as precisely as possible) and utility (provides results that can be used for improvement) are its attributes (Moskal and Leydens, 2000).
With expert opinions suggested as a means to assess the content validity of rubrics,
instructors (n = 15) were asked to reflect on the criteria of the rubrics developed in the
light of the purpose of the assignment method and the objectives of the assignments that
they had given in the past. This form of validity, labeled as “appropriateness” (Moskal
and Leydens, 2000) judges the relevance as well as representativeness of rubrics and is
important for generalizing the results beyond the sample under study. All the instructors
indicated high alignment. The instructors were then asked to reflect upon the following
set of questions suggested by Moskal and Leydens (2000): Do the evaluation criteria of the
scoring rubric address all aspects intended to be measured in the assignments given by
you? (Content validity); Are all the important criteria relevant to the assessment method
being evaluated through the rubric? (Construct validity); and Do the scoring criteria
reflect competencies that would suggest success on future or related performances?
(Criterion validity). The answers from the instructors were in the affirmative.

Pilot test (Steps 7 and 8)


The next step of the development process involved piloting the rubrics to assess their usability and perceived usefulness from the students' point of view. The rubrics were implemented in a Quantitative Methods course with the students (n = 33) of the Executive PG Program at institution I and in a Basic Marketing course with the students (n = 32) of the Bachelor's program in Management at institution II. They were handed over to the students along with the assignments and clarified in the classroom, and the students were encouraged to use them as guidelines to self assess their work prior to submission.
Upon completion of the course, the students were asked to give their feedback with regard to language, comprehension, clarity, usability and actual use of the rubrics for self assessment. All 65 students completed the feedback forms. In institutions I and II respectively, 73 percent and 67 percent of the students found the rubrics easy to understand and use. Five of the students mentioned the need for discussing the rubrics in the classroom, accompanied with anchor papers, to help understand the differences between levels better. While all the students said they used the rubrics as guidelines prior to starting their assignment, as a way of understanding what the instructor wants, only 42 percent and 38 percent of the respondents from institutions I and II respectively revealed having actually used the rubrics to self assess and revise their assignments. This shows that simply handing out a rubric to students may not ensure its usage. Students must be taught to actively use it, with clarity aided by handing out sample artifacts, and its usage must be reinforced so that they can reap its benefits for purposes of self-assessment.
The rubrics developed in this study were therefore found to be reliable, valid and easy to use. They support standardization of the manner in which cases and projects are evaluated in different business courses. Apart from the documented process used for development of the rubrics serving as evidence that validity issues were appropriately addressed, feedback from instructors and students served as an additional source of validity evidence. Also, the next phase of this study, involving the use of the rubrics in different institutions, could provide further evidence. The ease with which different instructors could use the rubrics in their classrooms would reflect the external validity of the rubrics developed. Further, consequential validity, which refers to evidence of short- and long-term consequences of the use of rubrics, could be established if the results of rubrics usage are found to have a positive effect on student learning.

Discussion and conclusion


The procedure for development of effective rubrics is undoubtedly demanding. It
requires substantial investment of time and effort by instructors and students. To
advance the development and usage of rubrics, it is therefore crucial to appreciate some
of the ways in which they can serve the purpose of measurement and enhancement of
student learning. The rubrics constructed in this study can help collect data to gain
valuable information regarding student learning at different levels of assessment,
namely, individual, course, and program level. At an individual student level, evidence
collected at different points in time during the course can be used to track a student’s
learning and improvement against the explicitly stated criteria and/or learning
outcomes. This would help in answering questions such as: How well has the student
achieved the learning outcomes set for the assessment task and course? Has the
student’s work improved over a period of time? What are the student’s strengths and
weaknesses? Thus sequential scores can serve the purpose of formative feedback by
enabling the student and the instructor to identify areas and ways for bringing
changes in learning processes.
Tables VI and VII give a hypothetical illustration of how an individual student’s (X)
incremental learning in the case analysis method can be mapped in a course (A) and
academic term respectively.
From the rubric scores data in Table VI, one can observe the overall as well as criterion-wise progress made by the student in the three sequential case assignments. The data reveals that the student's performance in criterion 1 remained at a constant acceptable level; in criterion 2 it showed a marked improvement; and in criteria 3 and 4 it improved only marginally. Aggregation of the sequential scores provides a summative and holistic
picture of performance in the course. Similarly, the data in Table VII illustrates how rubric scores obtained can be used to map a student's performance across courses in the same academic term to uncover consistencies, or otherwise, in the student's performance. Table VIII illustrates how rubric scores can be used to map the student's performance across courses in each of the academic terms. It can be seen that as the student progressed through the academic terms, there were improvements in criteria 1-3. The rubric can therefore be used to discern how well the program, from its beginning to end, fosters cumulative learning of the desired outcomes in the student during his/her entire time in college.

Table VI. Mapping student X's learning in the course "A", academic term I (scores attained on case assignments I-III)
Criteria                      I    II   III   Average score in each criterion
1: Issues identification      3    3    3     9/3 = 3
2: Analysis                   2    3    4     9/3 = 3
3: Recommendations            2    3    3     8/3 = 2.6
4: Communication              2    2    3     7/3 = 2.5
Total score on assignment     9    11   13    33/3 = 11
Similar reporting can be taken up to assess learning at the course level. Aggregation of student performances in a class helps to understand how well the class achieves the criteria or outcomes. As a formative measure, this information enables the instructor to bring changes in instruction to meet the immediate need. As a summative measure, on the other hand, it can be used to bring in changes in instruction as well as course design for future application. Rubric scores can also support cross-sectional analysis of how consistently multi-section courses achieve important learning outcomes. In this way, they can provide a well-founded measurement system for improving teacher performance.
In order to build consistency in terms of what and how one assesses students' work, it is important to keep the criteria and their descriptions the same across courses and academic terms. Using the same rubric across courses can ensure consistent measurement of the quality of performance by students. At the same time, provision for individual task/course specific requirements can be built in by allowing for the assignment of weights to criteria as appropriate. The longitudinal development purpose, where the expectations about student performance become more demanding, can also be suitably addressed by placing increasing emphasis on higher level learning outcomes in the rubric (for example, by assigning them higher weights) as the student progresses through the program.
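A small sketch of these two ideas follows: keeping a fixed set of criteria while allowing assignment-specific weights, and averaging a student's criterion scores over a sequence of assignments in the manner of Tables VI-VIII. The weights and scores used here are hypothetical illustrations, not prescribed by the study.

```python
# Sketch: fixed criteria with assignment-specific weights, plus averaging of sequential
# criterion scores across assignments (Tables VI-VIII style). Weights/scores are made up.
CRITERIA = ["Issues identification", "Analysis", "Recommendations", "Communication"]

def weighted_score(scores, weights):
    """scores/weights: dicts keyed by criterion; weights need not be equal."""
    return sum(scores[c] * weights[c] for c in CRITERIA)

def criterion_averages(assignments):
    """Average each criterion over a sequence of scored assignments."""
    return {c: round(sum(a[c] for a in assignments) / len(assignments), 1) for c in CRITERIA}

term_scores = [
    {"Issues identification": 3, "Analysis": 2, "Recommendations": 2, "Communication": 2},
    {"Issues identification": 3, "Analysis": 3, "Recommendations": 3, "Communication": 2},
    {"Issues identification": 3, "Analysis": 4, "Recommendations": 3, "Communication": 3},
]
weights = {"Issues identification": 1.0, "Analysis": 2.0, "Recommendations": 1.5, "Communication": 0.5}
print(criterion_averages(term_scores))           # per-criterion progress across assignments
print(weighted_score(term_scores[-1], weights))  # 3*1 + 4*2 + 3*1.5 + 3*0.5 = 17.0
```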

Table VII. Mapping student X's learning in courses A-E in academic term I (average scores attained on case assignments)
Criteria                        A     B     C     D     E     Average score in each criterion
1: Issues identification        3     3.2   3     3.2   2.9   15.3/5 = 3
2: Analysis                     3     3.3   3.5   3.5   3.5   15.3/5 = 3
3: Recommendations              2.6   2.8   2.7   2.9   2.9   13.8/5 = 2.8
4: Communication                2.5   2.6   2.5   2.7   2.6   12.9/5 = 2.6
Average score in the courses    11    12.3  11.2  11.7  11    57.2/5 = 11.4

Table VIII. Mapping student X's learning across courses in academic terms I-IV (average scores attained on case assignments across courses in each term)
Criteria (level of learning outcome)   I     II    III   IV    Average score in each criterion/outcome
1: Issues identification               3     3.5   3.7   3.8   14/4 = 3.5
2: Analysis                            3     3     3.8   3.8   13.6/4 = 3.4
3: Recommendations                     2.8   2.8   3.3   3.4   12.3/4 = 3.1
4: Communication                       2.6   2.6   2.7   2.8   10.7/4 = 2.7
Average score in academic terms        11.4  11.9  13.5  13.8  50.6/4 = 12.7
The data collected on assignments in different courses can be analyzed individually or collectively to reveal whether program level learning outcomes are being achieved. A rubric can be used as a pretest-posttest measure, wherein students' performance on the criteria can be collected at the beginning of a program and then again towards the conclusion of the program. A comparison of these two scores would provide a direct measure of the incremental learning brought about by the program. The information so gained can be used to map the quality of a program offered by an institution over several years. It can also be used to draw a comparison between multiple institutions offering the same or similar program.
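As one way such a pretest-posttest comparison could be operationalized, the sketch below computes cohort-level criterion means at program entry and exit and reports the per-criterion gain; the cohorts and scores are invented for illustration and are not data from this study.

```python
# Sketch of a pretest-posttest comparison of criterion means for a cohort, as a direct
# measure of incremental learning over the program; all numbers are hypothetical.
from statistics import mean

CRITERIA = ["Issues identification", "Analysis", "Recommendations", "Communication"]

def cohort_means(cohort):
    """Mean score per criterion across all students in a cohort."""
    return {c: round(mean(s[c] for s in cohort), 2) for c in CRITERIA}

def gains(pre, post):
    """Per-criterion change between program entry and exit."""
    pre_m, post_m = cohort_means(pre), cohort_means(post)
    return {c: round(post_m[c] - pre_m[c], 2) for c in CRITERIA}

pretest = [{"Issues identification": 2, "Analysis": 2, "Recommendations": 2, "Communication": 3},
           {"Issues identification": 3, "Analysis": 2, "Recommendations": 2, "Communication": 2}]
posttest = [{"Issues identification": 4, "Analysis": 3, "Recommendations": 3, "Communication": 3},
            {"Issues identification": 3, "Analysis": 4, "Recommendations": 3, "Communication": 3}]
print(gains(pretest, posttest))  # positive values indicate program-level improvement
```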
To summarize, the information from rubric scores can inform:
. students about the progress made by them over a period of time;
. instructors about the effectiveness of their instruction and course design; and
. administrators about the quality of the program.
In this manner, the use of the rubrics developed in this study would provide educational institutions with a coherent assessment system where the assessment work of classroom teachers and administrators is placed within the same conceptual framework.
While these rubrics may not be generic enough for application to every case and project assignment, the instructors involved in the process of development found that, with minimal effort, the rubrics could be revised to address a wide variety of assignment objectives. More importantly, the process of rubric development proposed and detailed in this paper can guide efforts in formulating rubrics for student learning assessment in other disciplines and at other academic levels.

Limitations of the study


The purpose of this study was to demonstrate the iterative process of developing a
curriculum wide generic rubric. While it would have been ideal to bring in participant
views from a large number of varied institutions, due to paucity of time and resources,
this study was conducted largely with participants from two institutions. The study
therefore could not identify and propose a method for addressing issues posed by wide
variations in academic settings and the resultant impact on rubric development
process.

Scope for further study


Current studies on the assessment of learning in business education show a predominant focus on researching program level summative assessments such as standardized exit tests. It is surprising to note the lack of research on summative and/or formative assessments at the course level. Rubrics can be used for collecting summative and formative assessment data at the individual, course, and program levels. The knowledge so gained can serve as an important indicator of instructional effectiveness, useful for bringing in improvements in instruction and course design.
Actual accounts (case studies) and detailed multi-phased studies (action research or otherwise) are needed to understand how rubrics can help in "closing the loop" (Suskie, 2004), i.e. how the changes incorporated in curriculum, course, and instruction can
enhance student learning. This would have implications for effective utilization of
rubrics for assessment and improvement of educational programs. Studies are also
needed to ascertain whether and how rubric assessments can complement or perhaps
even replace standardized tests as a program assessment measure.
Note
1. In this study, the term program refers to the Master of Business Administration (MBA) offered in a two-year, four academic term format or a one-year, three academic term format. The term course refers to the various foundational (and therefore mandatory) as well as optional subjects that a student needs to take to complete the MBA program. Examples of courses are Marketing management, Financial management, Business research methods, Managerial economics, etc.

References
Andrade, H. and Du, Y. (2005), “Student perspectives on rubric-referenced assessment”, Practical
Assessment, Research & Evaluation, Vol. 10 No. 3, available at: http://PAREonline.net/getvn.asp?v=10&n=3 (accessed December 14, 2006).
Arter, J. and McTighe, J. (2001), Scoring Rubrics in the Classroom, Corwin, Thousand Oaks, CA.
Bigelow, J.D. (2004), “Using problem-based learning to develop skills in solving unstructured
problems”, Journal of Management Education, Vol. 28 No. 5, pp. 591-609.
Bonett, D.G. (2002), “Sample size requirements for estimating intraclass correlations with desired
precision”, Statistics in Medicine, Vol. 21, pp. 1331-5.
Boulet, J.R., Rebbecchi, T.A., Denton, E.C., Mckinley, D. and Whelan, G.P. (2004), “Assessing the
written communication skills of medical school graduates”, Advances in Health Sciences
Education, Vol. 9, pp. 47-60.
Bresciani, M.J., Zelna, C.L. and Anderson, J.A. (2004), Assessing Student Learning and
Development: A Handbook for Practitioners, National Association of Student Personnel
Administrators (NASPA), Washington, DC.
Brown, G., Bull, J. and Pendlebury, M. (1997), Assessing Student Learning in Higher Education,
Routledge, Oxon and New York, NY.
Callison, D. (2000), “Rubrics”, School Library Media Activities Monthly, Vol. 17 No. 2, pp. 34-36, 42.
Campbell, A. (2005), “Application of ICT and rubrics to the assessment process where
professional judgement is involved: the features of an e-marking tool”, Assessment &
Evaluation in Higher Education, Vol. 30 No. 5, pp. 529-37.
Corey, E. (1980), Case Method Teaching, Harvard Business School, Boston, MA.
Council for Higher Education Accreditation (CHEA) (2002), “Student learning outcomes
workshop”, CHEA Chronicle, Vol. 5 No. 2, pp. 1-3.
Crittenden, W.F. (2005), “A social learning theory of cross-functional case education”, Journal of
Business Research, Vol. 58, pp. 960-6.
Dunbar, N.E., Brooks, C.F. and Kubicka-Miller, T. (2006), “Oral communication skills in higher
education: using a performance-based evaluation rubric to assess communication skills”,
Innovative Higher Education, Vol. 31 No. 2, pp. 115-28.
Fraser, L., Harich, K., Norby, J., Brzovic, K., Rizkallah, T. and Loewy, D. (2005), “Diagnostic and
value-added assessment of business writing”, Business Communication Quarterly, Vol. 68
No. 3, pp. 290-305, available at: http://bcq.sagepub.com/cgi/content/abstract/68/3/290
(accessed May 2, 2007).
Glickman-Bond, J. and Rose, K. (2006), Creating and Using Rubrics in Today’s Classrooms,
Christopher-Gordon Publishers Inc., Norwood, MA.
Gopinath, C. (2004), “Exploring effects of criteria and multiple graders on case grading”, Journal
of Education for Business, Vol. 79 No. 6, pp. 317-22.
Green, R. and Bowser, M. (2006), “Observations from the field: sharing a literature review rubric”, Journal of Library Administration, Vol. 45 Nos 1/2, pp. 185-202.
Greenhalgh, A.M. (2007), “Case method teaching as science and art: a metaphoric approach and curricular application”, Journal of Management Education, Vol. 31 No. 2, pp. 181-94.
Huba, M.E. and Freed, J.E. (2000), Learner-Centered Assessment on College Campuses: Shifting
the Focus from Teaching to Learning, Allyn and Bacon, Boston, MA.
Lapsley, R. and Moody, R. (2007), “Teaching tip: structuring a rubric for online course
discussions to assess both traditional and non-traditional students”, Journal of American
Academy of Business, Vol. 12 No. 1, pp. 167-72.
Lundberg, C.C. and Enz, C. (1993), “A framework for student case preparation”, Case Research
Journal, Vol. 13, pp. 133-44.
Maki, P.L. (2001), “From standardized tests to alternative methods: assessing learning in
education”, Change, Vol. 33 No. 2, pp. 28-32.
Martell, K. and Calderon, T. (2005), Assessment of Student Learning in Business Schools: Best
Practices Each Step of the Way, Vol. 1, Nos 1/2, The Association for Institutional Research,
Tallahassee, FL.
Mertler, C.A. (2001), “Designing scoring rubrics for your classroom”, Practical Assessment,
Research & Evaluation, Vol. 7 No. 25, available at: http://PAREonline.net/getvn.asp?v=7&n=25 (accessed December 17, 2005).
Montgomery, K. (2002), “Authentic tasks and rubrics: going beyond traditional assessments in
college teaching”, College Teaching, Vol. 50 No. 1, pp. 34-40.
Moskal, B.M. (2000), “Scoring rubrics: what, when, and how”, Practical Assessment, Research & Evaluation, Vol. 7 No. 3, available at: http://ericae.net/pare/getvn.asp?v=7&n=3
(accessed December 9, 2006).
Moskal, B.M. (2003), “Recommendations for developing classroom performance assessments and
scoring rubrics”, Practical Assessment, Research & Evaluation, Vol. 8 No. 14, available at:
http://PAREonline.net/getvn.asp?v=8&n=14 (accessed January 22, 2007).
Moskal, B.M. and Leydens, J.A. (2000), “Scoring rubric development: validity and reliability”,
Practical Assessment, Research & Evaluation, Vol. 7, pp. 71-81, available at: http://
pareonline.net/getvn.asp?v=7&n=10 (accessed December 10, 2005).
Oakleaf, M.J. (2006), Assessing Information Literacy Skills: A Rubric Approach, UMI No. 3207346,
PhD dissertation, University of North Carolina, Chapel Hill, NC.
Parke, C.S. (2001), “An approach that examines sources of misfit to improve performance
assessment items and rubrics”, Educational Assessment, Vol. 7 No. 3, pp. 201-25.
Parkes, K.A. (2006), The Effect of Performance Rubrics on College Level Applied Studio Grading,
UMI No. 3215237, PhD dissertation, University of Miami, Coral Gables, FL.
Pellegrino, J.W., Baxter, G.P. and Glaser, R. (1999), “Addressing the two disciplines problem:
linking theories of cognition and learning with assessment and instructional practice”,
Review of Research in Education, Vol. 24, pp. 307-53.
Petkov, D. and Petkova, O. (2006), “Development of scoring rubrics for IS projects as an
assessment tool”, Issues in Informing Science and Information Technology, Vol. 3,
pp. 499-510.
Popham, W.J. (1997), “What’s wrong and what’s right with rubrics”, Educational Leadership,
Vol. 55 No. 2, pp. 72-5.
Popham, W.J. (2003), Test Better, Teach Better: The Instructional Role of Assessment,
Association for Supervision and Curriculum Development, Alexandria, VA.
Reitmeier, C.A., Svendsen, L.K. and Vrchota, D.A. (2004), “Improving oral communication skills
of students in food science courses”, Journal of Food Science Education, Vol. 3, pp. 15-20.
Schrock, K. (2000), Kathy Schrock’s Guide for Educators, available at: http://school.discovery.
com/schrockguide/assess.html (accessed March 10, 2007).
Schroeder, H. and Fitzgerald, P. (1984), “Peer evaluation in case analysis”, Journal of Business
Education, Vol. 60, pp. 73-7.
Shrout, P.E. and Fleiss, J.L. (1979), “Intraclass correlations: uses in assessing rater reliability”,
Psychological Bulletin, Vol. 86 No. 2, pp. 420-8.
Song, K.H. (2006), “A conceptual model of assessing teaching performance and intellectual
development of teacher candidates: a pilot study in the US”, Teaching in Higher Education,
Vol. 11 No. 2, pp. 175-90.
Stefl-Mabry, J. (2004), “Building rubrics into powerful learning assessment tools”, Knowledge
Quest, Vol. 32 No. 5, pp. 20-5.
Stevens, D.D. and Levi, A.J. (2005), Introduction to Rubrics, Stylus, Sterling, VA.
Suskie, L. (2004), Assessing Student Learning: A Common Sense Guide, Anker Publishing
Company, Bolton, MA.
Tierney, R. and Simon, M. (2004), “What’s still wrong with rubrics: focusing on the consistency of
performance criteria across scale levels”, Practical Assessment, Research & Evaluation, Vol. 9 No. 2, available at: http://PAREonline.net/getvn.asp?v=9&n=2 (accessed
September 19, 2005).
Wainer, H. and Thissen, D. (1996), “How is reliability related to the quality of test scores? What is
the effect of local dependence on reliability?”, Educational Measurement: Issues and
Practice, Vol. 15, pp. 22-9.
Walter, S.D., Eliasziw, M. and Donner, A. (1998), “Sample size and optimal designs for reliability
studies”, Statistics in Medicine, Vol. 17 No. 1, pp. 101-10.
Woolf, H. (2004), “Assessment criteria: reflections on current practices”, Assessment &
Evaluation in Higher Education, Vol. 29 No. 4, pp. 479-93.


About the author


Malini Y. Reddy is an Assistant Professor in the discipline of marketing at IBS Hyderabad, India.
Her doctoral research was on the relationships between rubrics and student learning. She has
more than 15 years of experience in industry and academia. She teaches courses in marketing
management, services marketing, and customer relationship management. She has published
research articles in refereed international and national journals. Her full papers have also appeared in the proceedings of national seminars and conferences. Malini Y. Reddy can be
contacted at: malinireddy@yahoo.com and malinireddy@ibsindia.org
