
Unit 1 (Chapter 1) Educational Testing: Context, Issues, and Trends

Why Learn About Externally Mandated Tests?


Pervasive
Many states and districts mandate them
Great variety (from MC to performance based)
Various purposes (e.g., monitor student progress, school accountability)

Controversial
Concerns over effectiveness/side effects
Fairness

Attract policy makers as reform tool
Inexpensive
Easy to implement
Quick to implement
Results are highly visible, easily reported by media

Will affect your life as a teacher
May be on panels determining test standards and test use
Will have to explain tests and scores to students, parents, public
Content standards and accountability procedures may affect your teaching

Earlier Waves of Test Based Educational Reform


Title I (TIERS)
Since 1965
Federal compensatory education program
Twice yearly tests to assess program

Minimum Competency Tests
1970s-1980s
State tests for basic skills
Passing required for HS graduation

Nation at Risk (and other major reports)
1983 +
Stress standards beyond the minimum
Testing often major instrument of reform
Basis of "report cards" on schools
"Report cards" increase stakes of results
Pressure on schools to get scores up
Pressure produces side effects (e.g., teaching to test, Lake Wobegon effect)

Recent Wave of "Standards Based" Reform (1990s)


Differs because:

Ambitious "world-class" standards
More performance based assessments (less MC)
High-stakes accountability for schools, teachers, and (sometimes) students
All students assessed

Established both content and performance standards

Specify ends, not means


Content (the "what"): what students should know and be able to do in
specific areas in specific grades
Performance (the "how well"): level of performance to be achieved
Almost every state has developed and adopted both
Basis for assessments intended to be aligned with curriculum

Emphasizes performance based assessment

Common names: authentic, alternative, or performance assessment
Common theme: shift from fixed-choice MC to students constructing
responses that judges rate
Rests on 3 premises
1. WYTIWYG (What you test is what you get)
2. You don't get what you don't assess
3. Make tests worth teaching to

High-stakes accountability mechanisms

Increasingly popular with policy makers


Rewards for schools (e.g., special recognition, money)
Sanctions for schools (e.g., remove principal, reassign teachers,
oversight)
Impact on students (e.g., promotion, graduation, types of diploma)

Includes all students

Many can take without any special accommodation (e.g., the recently
moved)
Others can take with only minor accommodations (e.g., extra time)

Some will need more extensive accommodations (e.g., test in a
different language), but then can take
Those with IEPs can take IEP based tests (that is, tests modified in the
same ways that instruction is already modified by the IEP, e.g., more
time, read instructions to student, student answers orally)

Growing Role of Federal Government in Testing


NAEP (1969+) (National Assessment of Educational Progress)

National sample
Many subjects
Variety of item types
Some items repeated over years to chart trends (see example on p.
11)
Ages 9, 13, 17
State by state option beginning 1990
Performance standards: below basic, basic, proficient, advanced
3 purposes: report level of achievement for 3 ages; changes over
time; differences across demographic groups

TIMSS (Trends in International Mathematics and Science Study)

Math
Difficult to compare across countries (e.g., selectivity, sample quality,
differences in definitions of education levels, translation)
Poor U.S. performance spurs calls for higher standards (see example on
p. 13)

Various presidential initiatives

Last three presidents (e.g., Goals 2000), the last too recent for the book
All proposed a system of voluntary national tests
Bush's was just approved

Public Concern over Testing


Active public involvement

Public often on panels determining objectives and standards


Has led to more testing

Some concern that too much testing

Takes too much class time?


Distorts curriculum?

Debates over social consequences

Often contentious
E.g., attacks on testing industry, calls for moratoriums on testing
3 concerns: Social consequences I, II, III

Social Consequences I: Nature and Quality of Tests


Complaints
MC (multiple-choice) may penalize the most able, creative students
Tests too structured, problems too narrow and unrealistic
Tests measure only a limited aspect of individual

Responses to complaints
Performance based tests are now common, fewer are MC tests
Many complaints reflect poor test use, not poor tests (e.g., overgeneralizing from a single score)
There are costs to not testing, to opting for less rather than more information. Are the alternatives really better or worse?

Social Consequences II: Effects of Testing on Students


1. Anxiety
Concern: Tests create anxiety
Response: A little anxiety helps most students; liberal time limits help
avoid harmful anxiety
2. Labeling
Concern: Tests categorize and label students
Response: Problems come when users overgeneralize from single
scores; ability grouping can help when it is flexible and responsive to
changes in performance
3. Self-concepts
Concern: Tests damage students' self-concepts
Response: Problems come from overgeneralizing from low scores; can
be avoided by mentioning strengths as well as weaknesses
4. Self-fulfilling prophecies
Concern: Tests create self-fulfilling prophecies
Response: Don't overgeneralize; attend to strengths as well as
weaknesses
5. Overall lesson
Use tests properly! Don't overgeneralize from single scores!

Social Consequences III: Racial and Gender Fairness


Definitions of test "fairness" often differ
1. "absence of bias": same score predicts the same thing, regardless of
race (one of the professional standards for test fairness)
2. "procedural fairness": testing conditions provide equal opportunity for
all to show what they know (e.g., comparable grading standards) (one
of the professional standards for test fairness)
3. "opportunity to learn": all students had the same opportunity to learn
the material being tested
4. "equality of results": all races get the same average score (this often
requires violating definitions 1 and 2)
"Proper" definition depends on your purposes
1. If it is valid scores, then 1 and 2 are essential
2. If it is to measure an enduring aptitude or ability, then 3 is also
essential; it is not essential if the aim is to know how much students
actually know (regardless of ability)
3. If it is equal scores for all groups, then 1 - 3 are irrelevant (and often
conflict)
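
Definition 1 ("absence of bias") can be illustrated with a small numeric sketch. This is only an illustration, not a procedure described in the notes. It assumes each group has test scores paired with some later outcome (for example, course grades), fits a separate least-squares line per group, and compares the predicted outcome at the same test score. The function names (fit_line, predicted_outcome, prediction_gap) and all the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predicted_outcome(scores, outcomes, at_score):
    """Outcome predicted for one test score from a group's own fitted line."""
    slope, intercept = fit_line(scores, outcomes)
    return slope * at_score + intercept


def prediction_gap(group_a, group_b, at_score):
    """Difference between two groups' predicted outcomes at the same score."""
    return (predicted_outcome(*group_a, at_score)
            - predicted_outcome(*group_b, at_score))


# Hypothetical data: (test scores, later outcomes) for two groups.
# Group B scores lower on average, but the score-outcome relation is the same.
group_a = ([40, 50, 60, 70], [2.0, 2.5, 3.0, 3.5])
group_b = ([30, 40, 50, 60], [1.5, 2.0, 2.5, 3.0])

gap = prediction_gap(group_a, group_b, at_score=60)
mean_a = sum(group_a[0]) / len(group_a[0])
mean_b = sum(group_b[0]) / len(group_b[0])
print(f"mean scores: {mean_a} vs {mean_b}; prediction gap at 60: {gap:.3f}")
```

In this made-up data the two groups have different average scores, yet a score of 60 predicts essentially the same outcome in both. That is the sense in which a test can meet definition 1 without meeting definition 4.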

Distinction between test bias and unfair test use


1. Test bias = flaw in the test (e.g., content that is unfamiliar or demeaning
to some groups)

test makers now use citizen panels and statistical tests to avoid this
this is a technical issue

lack of bias does not mean that all groups will score equally (there are
many possible reasons for average score and skill differences)

2. Unfair test use = unfair use of an unbiased test (e.g., do we use the
same or different score cutoffs for different racial groups? Should new tests
be added that favor minorities or women, e.g., SAT writing test?)

this is a social and political issue (there is no technical solution)


decision may be affected by why scores differ (e.g., lack of opportunity
to learn)
