Developing and Using Classroom Assessments, 4th Edition
BRIEF CONTENTS

1 Introduction 1

PART I How to Establish a Framework for Assessing Your Students 13
2 How Assessments Are Interpreted and Used 15
3 Measurable Objectives and Goals 26
4 Gathering Evidence of Validity 44
5 Generalizing Performance 56

PART II How to Develop, Administer, and Score Written Tests 71
6 Completion Items 72
7 Essay Items 78
8 Multiple-Choice Items 93
9 Alternate-Choice Items 112
10 Learning How to Take Written Tests 128

PART III How to Develop, Administer, and Score Alternative Assessments 139
11 Informal Observations and Questions 140
12 Performance Assessment Requisites 159
13 Creating Performance Assessments 174
14 Portfolios 195

PART IV How to Use Assessment Results 211
15 Reporting Student Performance 212
16 Norm-Referenced Test Scores 227
17 Standards-Based Test Scores 248
18 Integrating Assessments into Instruction 263

Appendix 282
References 283
Index 288

CONTENTS

1 Introduction 1
  Significance of Measurement 2
  Distinction between Measurement, Assessment, and Evaluation 2
  Settings Where Assessments Show Up 4
  Distinction between Formal and Informal Assessments 6
  Roles of Assessment in the Classroom 7
  Maximum versus Typical Performance 9
  Summary 9
  Key Terms 10

PART I How to Establish a Framework for Assessing Your Students 13

2 How Assessments Are Interpreted and Used 15
  Frames of Reference for Interpreting Performance 15
  Meaning of Criterion- and Norm-Referenced Interpretations 18
  Choosing the Appropriate Interpretation 21
  Summary 23
  Key Terms 23
  Answers: Apply What You Are Learning 24
  Something to Try 24
  Additional Reading 25

3 Measurable Objectives and Goals 26
  Categories of Learning Outcomes 27
  Levels of Cognitive Complexity 33
  Components of Performance Objectives 35
  Instructional Goals versus Performance Objectives 38
  Selecting Performances That Align with Standards 39
  Summary 41
  Key Terms 42
  Answers: Apply What You Are Learning 43
  Something to Try 43
  Additional Reading 43

4 Gathering Evidence of Validity 44
  Construct-Related Evidence of Validity 45
  Content-Related Evidence of Validity 47
  Criterion-Related Evidence of Validity 49
  Valid Interpretation and Use of Tests 50
  Summary 53
  Key Terms 54
  Answers: Apply What You Are Learning 54
  Something to Try 55
  Additional Reading 55

5 Generalizing Performance 56
  Why Observations Do Not Generalize 56
  Methods for Detecting Inconsistencies 62
  Techniques for Improving Generalizability 64
  Relation of Generalizability to Validity 67
  Summary 68
  Key Terms 68
  Answers: Apply What You Are Learning 69
  Something to Try 70
  Additional Reading 70

PART II How to Develop, Administer, and Score Written Tests 71

6 Completion Items 72
  What Completion Items Can Do 72
  Qualities Completion Items Should Have 74
  Practice Applying These Desired Qualities to Completion Items 76
  Summary 77
  Key Terms 77
  Answers: Apply What You Are Learning 77
  Something to Try 77

7 Essay Items 78
  What Essay Items Can Do 78
  Qualities Essay Items Should Have 81
  Practice Applying These Desired Qualities to Essay Items 85
  Scoring Students' Responses to Essay Items 89
  Summary 90
  Key Terms 91
  Answers: Apply What You Are Learning 91
  Something to Try 92
  Additional Reading 92

8 Multiple-Choice Items 93
  What Multiple-Choice Items Can Do 94
  Qualities Multiple-Choice Items Should Have 98
  Practice Applying These Desired Qualities to Multiple-Choice Items 103
  Variations of Multiple-Choice Items 106
  Optimal Number of Choices 109
  Summary 109
  Key Terms 110
  Answers: Apply What You Are Learning 110
  Something to Try 111
  Additional Reading 111

9 Alternate-Choice Items 112
  Variations of Alternate-Choice Items 113
  What Alternate-Choice Items Can Do 116
  Qualities Alternate-Choice Items Should Have 119
  Practice Applying These Desired Qualities to Alternate-Choice Items 122
  Summary 125
  Key Terms 125
  Answers: Apply What You Are Learning 126
  Something to Try 126
  Additional Reading 127

10 Learning How to Take Written Tests 128
  Familiarity with Testing Medium 128
  Preparing for a Test 130
  Testwiseness 132
  Summary 137
  Key Terms 137
  Answer: Apply What You Are Learning 138
  Something to Try 138
  Additional Reading 138

PART III How to Develop, Administer, and Score Alternative Assessments 139

11 Informal Observations and Questions 140
  Characteristics of Informal Observations and Questions 140
  Ensuring the Validity of Informal Assessments 147
  Guidelines for Using Informal Observations 148
  Guidelines for Using Informal Questions 153
  Record Keeping 154
  Summary 156
  Key Terms 157
  Answers: Apply What You Are Learning 157
  Something to Try 158
  Additional Reading 158

12 Performance Assessment Requisites 159
  Characteristics of Performance Assessments 160
  Options for Scoring Performance Assessments 164
  Actions to Take before Creating a Performance Assessment 170
  Summary 171
  Key Terms 172
  Something to Try 173
  Additional Reading 173

13 Creating Performance Assessments 174
  Establishing the Capability to Be Assessed 176
  Establishing the Performance to Be Observed 177
  Establishing a Plan for Scoring Students' Performance 182
  Additional Examples of Creating Performance Assessments 183
  Summary 192
  Key Terms 193
  Answers: Apply What You Are Learning 193
  Something to Try 194
  Additional Reading 194

14 Portfolios 195
  Characteristics of Portfolios 196
  Designing Student Portfolios 200
  Guiding Students' Use of Portfolios 207
  Summary 208
  Key Terms 208
  Something to Try 209
  Additional Reading 209

PART IV How to Use Assessment Results 211

15 Reporting Student Performance 212
  Reporting Systems 212
  Establishing Grading Criteria 217
  Role of Grades in Motivating and Disciplining Students 221
  Electronic Options for Reporting Performance 223
  Summary 224
  Key Terms 225
  Answer: Apply What You Are Learning 225
  Something to Try 225
  Additional Reading 225

16 Norm-Referenced Test Scores 227
  Distinction between Aptitude and Achievement Tests 228
  Interpreting Standard Scores 229
  Interpreting Percentile Ranks 233
  Focused Discussion 16.1 235
  Interpreting Grade Equivalents 237
  Summary 243
  Key Terms 244
  Answers: Apply What You Are Learning 245
  Something to Try 246
  Additional Reading 247

17 Standards-Based Test Scores 248
  Interpreting Reports Related to Specific Skill Areas 248
  Categorizing the Proficiency of Students 252
  Interpreting Scale Scores 255
  Summary 261
  Key Terms 261
  Something to Try 262
  Additional Reading 262

18 Integrating Assessments into Instruction 263
  Using External Standards-Based Assessments 264
  Using Formative and Summative Evaluations 271
  The Role of Electronic Tools 276
  Summary 279
  Key Terms 280
  Something to Try 281
  Additional Reading 281

Appendix 282
References 283
Index 288

Note: Every effort has been made to provide accurate and current Internet information in this book. However, the Internet and information posted on it are constantly changing, so it is inevitable that some of the Internet addresses listed in this textbook will change.
1 Introduction

■ Significance of measurement
■ Measurement versus evaluation
■ Formal and informal assessments
■ The different classroom roles of assessment

Think for a moment of the number of tests you have taken since you entered school. If you averaged only 30 minutes of exams and quizzes per week within each of the basic subject areas, by the time you completed high school you had already answered over 1,000 hours of written tests. That amounts to 40 solid 24-hour days. This does not include tests you completed in college. Nor does it include the thousands of nonwritten assessments, ranging from how you held your pencil to how you conducted a science experiment. It does not include the hundreds of papers and projects you completed, nor the notebooks or portfolios you prepared. Although students may beg to differ, the demands that classroom assessments place on teachers are greater than the demands placed on students. It is estimated that more than 25% of a teacher's time is devoted to developing and using the various forms of classroom assessments.

We are constantly being assessed, and we are constantly assessing others, not only in classroom settings but also in everyday life. We assess what others say to us, their facial expressions, their choice of words, and their attitudes. We evaluate their reactions to what we say and do. We often study who and what are around us to decide what to do or where to go.

This common experience we share with various assessments is essential to working with the material presented in this book. For example, you know that some written exams are better than others; we will use this knowledge when we try to isolate qualities that create better tests. You are aware of specific problems that occur in tests, such as questions that are ambiguous and content that does not correspond to material covered in the course. You know that essays you write, math problems you complete, and portfolios you prepare would be scored differently if graded by other teachers. You also know of the many problems associated with how assessments are used. Teachers sometimes distribute results long after a test is administered; results may be reported without explanations of areas that caused difficulty, and teachers often move on to subsequent instruction before addressing difficulties identified in tests. Portfolios often are reviewed several days and sometimes weeks after work samples have been completed, and sometimes only a portion of the work in a portfolio can be reviewed in detail.

We know that teachers do, and must, depend on informal assessments, such as casual observations and oral questions. Some teachers are more effective than others in their use of informal assessments and have a knack for knowing what is going on. Some teachers retain too long their initial impressions about students, whereas others are more sensitive to changes in students or errors in initial perceptions. Some teachers effectively use informal assessments to guide students' learning; other teachers seem more oblivious to what students are learning. Our common experiences with all these situations will be invaluable as we present procedures for developing written
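The rough arithmetic behind the "over 1,000 hours" estimate is easy to check. The short calculation below fills in plausible values for quantities the text does not specify: the number of basic subject areas, the weeks in a school year, and the years of schooling are all assumptions, not figures from the book.

```python
# Rough check of the "over 1,000 hours of written tests" estimate.
# Assumed values (not given in the text):
MINUTES_PER_WEEK_PER_SUBJECT = 30   # stated in the text
SUBJECTS = 5                        # assumption: basic subject areas
WEEKS_PER_YEAR = 36                 # assumption: one school year
YEARS = 13                          # assumption: kindergarten through grade 12

total_minutes = MINUTES_PER_WEEK_PER_SUBJECT * SUBJECTS * WEEKS_PER_YEAR * YEARS
total_hours = total_minutes / 60
full_days = total_hours / 24

print(f"{total_hours:.0f} hours, or about {full_days:.0f} full 24-hour days")
```

Under these assumptions the total comes out well above 1,000 hours, consistent with the text's claim of more than 40 solid 24-hour days; different assumptions shift the figure but not the order of magnitude.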
tests, preparing performance assessments, asking oral questions, and observing what students do.

■ SIGNIFICANCE OF MEASUREMENT

Dictionaries list a number of related definitions for the verb measure. The definitions most useful to our discussion include determining the characteristics of something, regulating through the use of a standard, and making comparisons to a reference such as the performance of others.

Measurement is essential in virtually every discipline. In the physical sciences, measurement is used to determine characteristics of various substances or to ascertain when a material has achieved a particular property. In engineering, such qualities as loads, torque, friction, and thermal characteristics are constantly measured. In sports, measurements are used for making decisions, such as calling a strike in baseball, designating a first down in football, or calling a personal foul in basketball. Measurement is used in politics to anticipate the outcome of an election or to evaluate another government's reaction to a policy decision. Measurement is used in the performing arts when the director selects an actor for a particular role and when critics review the performance on opening night. Business relies on data to determine the effectiveness of production techniques and to evaluate the impact of advertising.

Measurement is also critical to learning. Every theory of learning assumes the presence of feedback. Measurement, whether it originates from the teacher or the student, is a prerequisite to this feedback. Without good assessments, we cannot know whether effective learning has occurred.

In all disciplines, measurement is continuous. This is certainly true in scientific research, sports, the arts, business, and education. In each area, informal but deliberate observations are always taking place. In addition to these continuous measures, planned formal observations must be used. The scientist must incorporate carefully developed measurements at selected points during and at the end of an experiment. A football coach must schedule times to analyze the progress of the team, for example, by viewing films after a game.

Teachers must also periodically develop and administer tests and other formal observations. Unlike ongoing informal observations, formal assessments are not spontaneous; they are deliberately scheduled and fully developed in advance. If teachers were to depend only on impromptu observations, the students' mastery of many crucial skills would go unexamined. The results of formal observations often guide the nature of subsequent, more frequent informal observations. Formal and informal assessments are both critical and are complementary.

Perhaps unfortunately, measurement skills are not intuitive within any discipline, and experience by itself is insufficient for ensuring these measurement skills. For instance, watching the heavens at night does not provide the measurement skills essential to being an astronomer, even though that experience provides an important base from which to learn these skills. Driving a car does not provide the skills needed by traffic-control engineers to measure traffic flow, although it provides insights essential to interpreting these measures. Your experience with classroom assessments provides a context essential to learning the concepts discussed in this book, although you are likely to find many of the measurement concepts introduced here to be new. The application of measurement, particularly in the classroom, is dynamic and highly complex. Your experience, therefore, provides not only the context for learning measurement concepts, but also a basis for adapting, applying, and evaluating the educational measurement concepts we discuss here.

■ DISTINCTION BETWEEN MEASUREMENT, ASSESSMENT, AND EVALUATION

The terms measurement, assessment, and evaluation are often used interchangeably in education. For our discussion, it will be useful to maintain a distinction, although the differences among these terms are not absolute.
The distinction between measurement and evaluation is the easiest to establish, and as we will see, this distinction is important. In education, measurement is concerned with establishing characteristics of individuals or groups of individuals, usually students. Measurement does not associate value with what we see. Evaluation, however, combines our measures with other information to establish the desirability and importance of what we have observed. Evaluation is the outcome of measurement after value has been added. Here are some contrasts between measurement and evaluation:

Measurement: Performance on a test indicates that a student is unable to spell number words less than one hundred.
Evaluation: This performance is of significant concern, because spelling number words is a prerequisite to the next unit, on writing checks.

Measurement: A teacher observes a student speaking in class without first raising her hand.
Evaluation: This behavior is encouraging because that student has never participated in class discussion.

Measurement: In education, we observe that the terms measurement and evaluation are often used interchangeably.
Evaluation: Equating measurement and evaluation is undesirable because it confuses issues discussed later in this book.

Notice that with each of these examples, evaluation can occur only when that which was observed is combined with other information beyond that obtained through the measurement. Serious problems and misinterpretations arise if a clear distinction between measurement and evaluation is not maintained. A familiar example is the reporting by the press of how various schools or school districts performed on a standardized test, such as in a statewide assessment. Often, only test scores are reported, with value inferred directly from these scores. For instance, praise is given to teachers of one school and disdain to those of another based on these scores. Remember, however, that evaluation involves combining our measures with other information in order to establish the desirability and importance of what was observed. Test scores on standardized tests, just like classroom tests used by a teacher, have no value until they are combined with additional relevant information to establish the desirability and importance of how students performed. With scores on standardized tests, for example, this additional information would include things such as socioeconomic issues, students' prior achievement, and details related to what has been going on in the school.

The linkage of evaluation to measurement sometimes makes it difficult to discuss measurement and evaluation separately. For example, erroneous measures lead to inappropriate evaluations. In addition, establishing what should be measured is determined by the values we expect to attach to our observations. This book focuses more on classroom applications of educational measurement than on evaluation. It will provide ideas for determining what should be measured and how to interpret student performance on these measures. Evaluation is important; however, your professional expertise in your content area is fundamental in establishing what to measure and determining the significance of what you observe. Evaluations also depend on the context of your educational measures and cannot be discussed fully in the absence of that context.

In contrast to evaluation, the distinction between measurement and assessment is less concise. Assessment is often used as a stylistic alternative to measurement. Some sentences sound better if you use the word assessment rather than measurement. Sometimes, measurement is perceived as quantitative, cold, and less desirable, whereas assessment is seen as qualitative and possibly more humanistic or caring.

Let us clarify what is meant by measurement. Sometimes measurement is artificially limited to quantification. Statements such as "Measurement is the process used to assign numbers to attributes or characteristics of persons" are fairly common. These statements fit well with notions of
validity and reliability that are limited to statistical conceptualizations. Such statements are also consistent with the view that test scores are always numbers.

On the other hand, measurement is often given a broader meaning. Dictionary definitions such as those noted earlier do not limit measurement to quantification. Likewise, people who describe themselves as measurement specialists do more than merely associate numbers with attributes. The same can be said about books with "educational measurement" in their titles. Even the term test score is not necessarily a number. The late Samuel Messick, a highly distinguished psychologist, put it this way:

    The term "test score" is used generically here in its broadest sense to mean any observed consistency, not just on tests as ordinarily conceived but also on any means of observing or documenting consistent behaviors or attributes. This would include, for instance, any coding or summarization of observed consistencies on performance tests, questionnaires, observation procedures, or other assessment devices. This general usage also subsumes qualitative as well as quantitative summaries and applies, for example, to protocols, clinical interpretations, and computerized verbal score reports. (Messick, 1989a, p. 5)

In our discussion, educational measurement will refer to the process of determining a quantitative or qualitative attribute of an individual or group of individuals that is of academic relevance. Test will refer to the vehicle used to observe that attribute, such as a written test, an observation, or an oral question. Test score will refer to an indication of what was observed through the test.

Assessment generally has broader connotations than does measurement. Were measurements limited to assigning numbers to attributes, assessment might be thought of as combining qualitative and quantitative attributes. Or, if measurements were limited to paper-and-pencil tests, assessment might be said to involve all techniques, including observations and oral questions. In our discussion, however, measurement, not assessment, will be used to represent all these activities.

Airasian (2005) defines assessment as the "process of collecting, synthesizing, and interpreting of information to aid the teacher in decision making" (p. 2). This definition is broader than what we have defined as measurement. It is, however, unclear at what point assessment supersedes measurement. For instance, does determining a quantitative or qualitative attribute of an individual include only the collection of information, or does it include the synthesis of information, or, possibly, naming the characteristic being described? If the validation of test scores is concerned with how the scores are to be interpreted and used, then the meaning of measurement might be as broad as that of assessment.

As with many synonyms, measurement and assessment have similar meanings, yet they also have subtle differences. The subtleties are often useful, but not always applied consistently. In our discussion, assessment will be viewed as having broader connotations than measurement. Assessment will refer to a related series of measures used to determine a complex attribute of an individual or group of individuals. Portfolios, which include planned collections of student work, are an example of an assessment. Performance assessments, in which student performance on a complex task is observed, are assessments. A series of related but informal observations used to determine complex attributes of one student or a group of students is also an assessment. Each of these types of assessments involves collecting, synthesizing, and labeling information. At what point a measurement becomes an assessment, however, remains undefined.

■ SETTINGS WHERE ASSESSMENTS SHOW UP

In this book, we look at assessments within a variety of settings or categories, some more familiar than others. One obvious setting is that of written tests, such as those often used in quizzes and exams. As noted earlier, these written tests, as important as they are, represent a small fraction of educational assessments.
Another important category is performance assessments. A performance assessment involves students completing a task in a realistic setting. Examples include performing an experiment in a science lab, giving a persuasive speech in speech class, demonstrating a particular sketching technique in art class, and establishing how fast one can run 800 meters in a track event. As with all assessments, each of these helps establish proficiency with a particular skill we have learned. However, as is the case with each of the above examples, performance assessments can measure skills that cannot be assessed with a written test. Incidentally, performance assessments show up in a variety of situations. For instance, homework and class projects represent performance assessments when their purpose is to demonstrate proficiency with a particular instructional goal.

Two techniques that we discuss later, curriculum-based assessments and portfolio assessments, represent not only a particular assessment format, but also a strategy for integrating assessments into learning. With curriculum-based assessments, instructional intervention and assessments are closely coordinated. For example, if reading fluency for young children is defined in terms of how quickly a child can accurately read printed text, children would practice reading written passages, with the reading proficiency of each child assessed at a fixed interval, such as once a week. The same assessment procedure would be repeated each week, such as having each child read a new written passage to determine how many words the child can correctly read aloud in one minute. Graphs would be prepared to monitor how well the child was progressing toward a targeted reading goal over several weeks. If a child's rate of improvement were too slow, alternate instruction would be initiated. Curriculum-based assessments are particularly effective when the skill area is highly focused, such as speed of reading new text or the rate at which a specific type of multiplication problem can be completed.

Portfolio assessments similarly focus on integrating assessment and instruction. Each student creates a portfolio, which consists of a physical or electronic folder into which samples of work are placed. Specific instructional goals are established, and students, with guidance, select examples of their work to place in the portfolio to demonstrate how well they have achieved each goal. At intervals, the teacher and student use the contents of the portfolio to jointly evaluate the student's proficiency with each goal and, together, plan subsequent goals and instruction.

Informal assessments represent a very special and highly interactive form of classroom assessment. Informal assessments involve things like watching students' facial expressions, listening to what they say in discussion, and casually and spontaneously probing students' understanding with oral questions. In terms of quantity, most assessments in the classroom are informal. Ironically, most measurement books do not discuss informal assessments. One can easily argue that, cumulatively, informal assessments play a more important role than all other assessments combined. We devote a full chapter to addressing characteristics and related issues of informal assessments.

In education, most assessments are internal rather than external. Internal assessments are administered in the classroom and typically are created by the teacher. In contrast, external assessments are developed by agencies external to the school building, such as school districts, private testing companies, and state and federal agencies. Tests used with external assessments often are referred to as standardized tests because their administration is standardized; that is, the same administration procedures are followed each time the test is administered. Standardized tests might be administered individually, such as by a school psychologist. More typically, standardized tests are group administered.

In the United States, many state legislatures have for a number of years mandated statewide assessments involving standardized tests. Skills measured by these assessments and the format of the tests were determined by the individual states. Until recently, assessments in most states occurred every year, but only in selected grades.
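The weekly record keeping that curriculum-based assessment relies on, comparing a child's observed growth in words correct per minute (WCPM) against the growth needed to reach a goal, can be sketched in a few lines. The scores, goal, and helper functions below are invented for illustration; they are not taken from the text.

```python
# Hypothetical sketch of curriculum-based measurement record keeping:
# weekly words-correct-per-minute (WCPM) scores for one child, compared
# against the growth needed to reach a target by a deadline week.
# All numbers are invented for illustration.

def weekly_growth(scores):
    """Average week-to-week gain across the observed scores."""
    return (scores[-1] - scores[0]) / (len(scores) - 1)

def needed_growth(current, goal, weeks_left):
    """Gain per week required to reach the goal on time."""
    return (goal - current) / weeks_left

scores = [42, 45, 44, 48, 51, 53]   # WCPM, weeks 1-6 (invented data)
goal, total_weeks = 80, 12          # invented target: 80 WCPM by week 12

observed = weekly_growth(scores)
required = needed_growth(scores[-1], goal, total_weeks - len(scores))

print(f"observed gain: {observed:.1f} WCPM/week, required: {required:.1f}")
if observed < required:
    print("Progress too slow -- consider alternate instruction.")
```

This is the decision rule the text describes in graphical form: when the trend line of weekly scores falls below the aimline toward the goal, the teacher changes instruction rather than waiting for a unit test.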
In January 2002, the No Child Left Behind Act of 2001 (Public Law 107-110) was signed into law. Among other things, this law stipulated that all states would establish standards, with students in grades 3 through 8 assessed annually in math and reading. High school students were also to be tested, but not necessarily at each grade level, and assessments in science were to be added later. A controversial part of the legislation requires virtually all students to be tested, including many with special needs, with all students demonstrating mastery of all standards within 10 years of enactment of the legislation. Later in the book, we look at some of the important assessment issues and other implications associated with No Child Left Behind.

Assessments associated with No Child Left Behind, perhaps understandably, have received a disproportionate amount of media attention. A lesser-known yet important example of external assessment is the National Assessment of Educational Progress (NAEP). Also known as "The Nation's Report Card," NAEP was established by Congress in 1969 to provide a census of educational achievement in the United States within core content areas. Rather than test every student, NAEP uses carefully selected samples of students. The content assessed has been continuously updated; however, anchor test items are included that allow monitoring of historical changes in achievement. Because sampling is involved, the chance that a particular student will be tested is small. Likewise, NAEP can be used only to evaluate aggregates of students, not individual students. NAEP is the best source of data for establishing the status of, and changes in, the achievement of students in the United States.

As you can see, student assessments show up in a variety of settings. The nature and purpose of these assessments vary considerably. Nevertheless, important qualities that are discussed in subsequent chapters are present and critical in every one of these assessments. These include establishing evidence that the assessments are valid, and evaluating whether performance on the tasks sampled by a test generalizes to important tasks not included in the assessment. Other important qualities pertain to knowing how to develop good assessments and, similarly, knowing how to appropriately limit the interpretations of these assessments. Classroom teachers take on tremendous responsibilities; important among them is effectively managing the assessment of their students.

■ DISTINCTION BETWEEN FORMAL AND INFORMAL ASSESSMENTS

In the preceding section, we introduced the notion of informal assessments. The basic distinction between formal and informal assessments lies in their spontaneity. Formal assessments are devised in advance, whereas informal assessments happen on the spur of the moment.

Formal assessments include final exams, unit tests and quizzes, graded homework, critiques of prepared speeches, and judgments of performances in science labs. Formal assessments also include tryouts for athletic teams and roles in a school play. Details of how each of these assessments will be implemented are established prior to their occurrence.

Informal assessments, although more numerous, are less obvious. Informal assessments occur when a teacher listens to a student's questions or watches her facial expression to determine whether she understands the concept being taught. Informal assessments occur when a teacher arrives at first-day impressions of a student's ability by watching where he sits, looking at what he wears, and listening to what he says. Informal assessments include talking through a math problem to find out why the student gave a wrong answer, or asking questions while teaching a concept to determine whether students are learning what is being taught.

Because formal assessments are devised in advance and informal assessments are spontaneous, they differ in a number of ways. A basic difference is that informal assessments can and do occur while instruction is being delivered, whereas formal assessments often require a pause
CHAPTER 1 ■ Introduction 7

in instruction. As a result, informal assessments can be particularly effective at redirecting instruction as it occurs. On the other hand, informal assessments are more likely to be unsystematic and more often lead to faulty conclusions about student performance.

Formal assessments are used when more controlled measures are required, whereas informal assessments are used when more frequent and responsive measures are needed. Because of their respective advantages and limitations, the use of formal versus informal assessment depends to a large extent on the role an assessment is to play in the classroom.

■ ROLES OF ASSESSMENT IN THE CLASSROOM

Some time ago, Bloom, Hastings, and Madaus (1981) proposed that assessment is used to facilitate formative, summative, and diagnostic evaluations. These categories are still widely used. We will add a fourth role: preliminary evaluations.

Preliminary evaluations occur during the first days of school and provide a basis for expectations throughout the school year. They are obtained mainly through a teacher's spontaneous informal observations and oral questions and are concerned with students' skills, attitudes, and physical characteristics. These evaluations are basically the same as those you establish whenever you meet new people. They happen naturally, and they are essential to guiding our interactions with others and with students.

Formative evaluations occur during instruction. They establish whether students have achieved sufficient mastery of skills and whether further work with these skills is appropriate. Formative evaluations are also concerned with the attitudes students are developing. The purpose of formative evaluations is to determine what adjustments to the present learning environment should be made. Formative evaluations are based primarily on continuous informal assessments, such as listening to what students say, using oral questions to probe comprehension, and watching students' facial expressions and other behaviors. Formative evaluations are also based on formally developed assessments such as quizzes, seatwork, homework, and group projects. Most assessments that occur in the classroom lead to formative evaluations.

Summative evaluations occur at the conclusion of instruction, such as at the end of a unit or the end of the year. Summative evaluations are used to certify student achievement and assign end-of-term grades. They serve as the basis for promoting and sometimes for grouping students. Summative evaluations also help determine whether instructional strategies should be changed before the next school year. Unlike in formative evaluations, the role that assessment plays within summative evaluations is not to establish student proficiency with each skill, but instead to provide an overview of achievement across a number of skills. With summative evaluations, each skill might be measured by just one test item, not enough to establish with confidence proficiency with individual skills. Often, only a sampling of skills is tested. Summative evaluations are based on formal assessments.

Diagnostic evaluations occur before or, more typically, during instruction. Diagnostic evaluations are concerned with skills and other characteristics that are prerequisite to the current instruction or that enable the achievement of instructional objectives. During instruction, diagnostic evaluations are used to establish underlying causes for a student's failing to learn a skill. When used before instruction, diagnostic evaluations try to anticipate conditions that will negatively affect learning. In both cases, the role of measurement is to assess a student's performance in specific prerequisite skills not typically taught in the present classroom setting. Diagnostic evaluations are based mostly on informal assessments, although formal measures, such as standardized tests, are sometimes used.

Figure 1.1 illustrates the relationship among the four evaluative roles of classroom assessment. Preliminary evaluations feed into formative evaluations. Formative evaluations occur during

instruction and are based on frequent assessments. Diagnostic evaluations are concerned with problems that might be or, more typically, already are preventing students from learning. Summative evaluations follow instruction.

Preliminary Evaluations
Purpose: Provide a quick but temporary determination of students' characteristics
When it occurs: During the first 10 days of school
Techniques used: Mostly informal observations and questions

Formative Evaluations
Purpose: Determine what students have learned in order to plan instruction
When it occurs: Continuously, during instruction
Techniques used: Mostly informal observations and questions; also, written and oral quizzes, classroom activities and performance assessments, homework, and portfolios

Diagnostic Evaluations
Purpose: Identify problems that will prevent or are preventing a student from learning
When it occurs: When difficulties in learning new knowledge are anticipated; or more typically, after difficulties in learning have been observed
Techniques used: Typically informal observations and questions; sometimes formal assessments such as a teacher's written test or a standardized test

Summative Evaluations
Purpose: Certify what students have learned in order to assign grades, promote students, and refine instruction for next year
When it occurs: At the conclusion of a unit of instruction
Techniques used: Mostly formal assessments, including written tests, performance assessments, projects, and portfolios

FIGURE 1.1 ■ Four roles of classroom assessment

As indicated in Figure 1.1, preliminary, formative, and diagnostic evaluations depend mostly on informal assessments. These evaluations require measures of student performance that are highly responsive to immediate situations. They require

assessments that can occur without a pause in instruction. Summative evaluations, in contrast, require the more controlled measures provided by formal assessments. Summative evaluations do not depend on the spontaneous and responsive characteristics of informal assessments.

■ MAXIMUM VERSUS TYPICAL PERFORMANCE

Maximum performance and typical performance refer to whether students are performing at their best or at their normal level. Formal assessments such as written tests, portfolios, structured performance assessments, and graded homework tend to encourage maximum performance. These assessments establish a student's ability to perform when motivated, but the performance does not necessarily generalize to other settings. Determining that a student can distinguish between statements of fact and opinion on a test, for instance, does not indicate the student will apply this skill when reading newspapers. Likewise, determining that a student has achieved knowledge in history, mathematics, or reading does not mean that she will use these skills beyond the classroom.

Typical performance is more concerned with attitudes than with academic skills. Attitudes influence students' interest in applying what they learn as well as in learning it in the first place. Typical performance is usually measured by informal assessments, particularly observation. Observations or other measures have to be unobtrusive to assess typical performance.

Whether teachers should develop and administer measures of typical performance pertinent to the content they are teaching is unclear. Certainly, values are important and can be taught. The way in which a teacher facilitates learning in the classroom affects the attitudes students develop about a subject. Whether a teacher should measure these attitudes with the intent of modifying student behavior can be argued both ways. For instance, consider the following questions:

■ Should a study of world religions include measures of students' beliefs or interests in religion?
■ Should a study of political history include measures of political preferences or interests in politics?
■ Is it appropriate for a teacher of meteorology to measure students' attitudes toward meteorology or interest in weather systems?

Because we often are uncertain whether measuring typical performance is appropriate, particularly if the intent is to modify undesired attitudes, we might answer in the affirmative to all, part, or possibly none of these questions.

Typical performance can be measured only to a degree. To be valid, observations of typical performance may need to be conducted without students' being aware that they are being assessed. Often, obtaining unobtrusive measures of attitude is not possible, and, as a result, students fake typical responses.

SUMMARY

It is quite appropriate to examine closely the issues related to classroom assessment. Assessment is germane to everything that goes on in the classroom. Students do not learn effectively unless they receive feedback, which is obtained through assessments. Teachers similarly cannot be effective without information gained through their assessments.

Each of us has already had an extensive amount of experience with assessments, gained from years of involvement with written tests and other assessments. Our experience with assessments also derives from informal interactions with others. We can recognize the considerable differences in the quality of assessments and in the effectiveness with which they are used. This experience will be invaluable as we discuss the development and use of classroom assessments.

Several terms were defined in this chapter. Educational measurement refers to the process of determining a quantitative or qualitative attribute of an individual or group that is of academic relevance. Assessment refers to a related series of measures used to determine a complex attribute of an individual or group. A test is any vehicle used to observe that attribute, and includes written tests,

performance assessments, portfolio systems, and casual observations and questions. A test score is an indication of what is observed through the test and can be quantitative or qualitative in nature. Evaluation combines measures and assessments with other information to establish the desirability and importance of what we have observed.

Assessments occur in a variety of settings, some more evident than others. Obvious examples include written tests such as quizzes and exams, and also performance assessments. Portfolio assessments and curriculum-based assessments represent approaches that emphasize the integration of instruction and assessment. Informal assessments represent the dominant form of assessment, at least in sheer quantity, but probably also in terms of their cumulative effect on student learning. Although far fewer in number, external assessments are more dominant in the news media, particularly statewide assessments. The National Assessment of Educational Progress is also an example of a standardized test.

Four types of evaluation have been identified. Preliminary evaluations occur during the first two weeks of school and provide a quick determination of students' characteristics. Formative evaluations occur continuously during instruction to determine what students are learning and to enable instruction to be adjusted accordingly. Diagnostic evaluations identify problems that are preventing students from learning. Summative evaluations occur at the conclusion of a unit of instruction and are used for tasks such as certifying what students have learned, assigning grades, and refining instruction for the next year. Preliminary, formative, and diagnostic evaluations rely more heavily on informal assessments but often include formal assessments. Summative evaluations involve formal assessments.

The remainder of this book is divided into four parts. Part I describes how to establish a framework for assessing your students. This framework involves four interrelated components, including identifying student performances to be observed, establishing how results from assessments will be used, obtaining evidence that appropriate capabilities are being assessed, and determining whether observations of student performance will generalize to other settings.

Part II describes how to develop and score written tests. We will also discuss "testwiseness" and other issues that help students successfully complete written tests. Part III explains how to develop and score alternative assessments. We will discuss informal observations and questions, performance assessments, and portfolio systems. These alternative techniques help evaluate skills that cannot be assessed through written tests. Part IV describes how to interpret assessments and report results. This discussion will pertain to students as well as to audiences outside the classroom, such as parents. Topics of grading and standardized tests are included in this discussion.

KEY TERMS

Assessment: A related series of measures used to determine a complex attribute of an individual or group of individuals, such as a portfolio assessment or performance assessment. Another example of an assessment is a series of related informal observations used to determine complex attributes of one student or a group of students.

Curriculum-based assessment: A close coordination between a series of assessments and instructional intervention. A series of assessments occurs at a fixed interval, such as once a week measuring a student's reading proficiency using a random passage, with scores plotted over time and instruction altered if the student is not achieving the targeted rate of growth.

Diagnostic evaluation: Occurs before or, more typically, during instruction, with focus on identifying impediments to learning, such as mastery of skills prerequisite to the present unit of instruction.

Evaluation: The outcome of measurement after value has been added. Evaluation establishes the importance or desirability of what was observed. Evaluation can occur only when that which was observed is combined with other information beyond that obtained through the measurement.

External assessment: Is developed outside the classroom, and generally limited to summative evaluations. Can involve any assessment format, such as written tests, portfolio assessments, and performance assessments. Standardized tests, including statewide assessments, are external assessments.

Formal assessment: Includes written tests and quizzes, graded homework, and portfolio and performance assessments. Unlike informal assessments, the observational tool or test is developed prior to the assessment.

Formative evaluation: Occurs during instruction and is used to guide learning activities during a unit of instruction.

Informal assessment: Includes a teacher's casual observations and oral questions during instruction. Informal assessments occur casually yet are systematically planned. Most classroom assessments are informal.

Internal assessment: Is used and often developed by the teacher for evaluation of learning before, during, and after instruction.

Maximum performance: The demonstration of a student's best performance, given the present level of proficiency. A student's performance when highly motivated.

Measurement: In education, the process of determining a quantitative or qualitative attribute of an individual or group of individuals that is of academic relevance.

No Child Left Behind: The common name associated with Public Law 107-110, passed by the U.S. Congress in 2001, which among other things required individual states to establish standards in language arts and math that all students would achieve by the year 2014.

Performance assessment: A direct, systematic observation of an individual's performance, with the process or product outcome of that performance scored using pre-established criteria. Performance assessments often are required to measure cognitively complex knowledge, particularly problem solving. Performance assessments also are used when the task being observed is heavily dependent on psychomotor skills, such as in the arts, speech, science labs, and physical education.

Portfolio assessment: An assessment based on a collection of student work in a subject, with products gathered in the portfolio used to demonstrate achievement of instructional goals. Emphasis is placed on demonstrating growth over time, with the teacher and student using measures of achievement to jointly evaluate progress and plan the focus of subsequent learning.

Preliminary evaluation: Initial evaluation that occurs when the teacher is becoming familiar with students' proficiencies and other characteristics, typically during the first 2 weeks of the school term. These initial impressions provide an important framework for subsequent evaluations.

Summative evaluation: Occurs at the completion of a unit of instruction, and is used to certify what students have learned, often to assign grades, and to provide a basis for modifying a unit of instruction prior to the next academic year.

Test: In education, any vehicle used to observe a quantitative or qualitative attribute of individuals that is of academic relevance.

Test score: A quantitative or qualitative summary of what was observed through a test.

Typical performance: The level at which a student will normally perform through intrinsic motivation. Typical performance is usually but not always less than maximum performance.

Written test: A formal classroom or external assessment involving common item formats such as completion, essay, or multiple-choice. A written test can be administered using paper-and-pencil or a computer.