NCME 2010 Update On Test Standards Revision 4-27-10
2010 Annual Meeting of the NCME, Denver, Colorado, May 1, 2010, 4:05-6:05 p.m.
Michael Kolen
University of Iowa
Lauress Wise, Co-Chair, HumRRO
Barbara Plake, Co-Chair, University of Neb.
Linda Cook, ETS
Fritz Drasgow, University of Illinois
Brian Gong, NCIEA
Laura Hamilton, Rand Corporation
Jo-Ida Hansen, University of MN
Joan Herman, UCLA
Update on Revisions to the Test Standards
Michael Kane, ETS
Michael Kolen, University of Iowa
Antonio Puente, UNC-Wilmington
Paul Sackett, University of MN
Nancy Tippins, Valtera Corporation
Walter (Denny) Way, Pearson
Frank Worrell, Univ of CA-Berkeley
received from invitation to comment
Summarized by the Management Committee in consultation with the Co-Chairs
Wayne Camara, Chair, APA
Suzanne Lane, AERA
David Frisbie, NCME
Theme Teams
Working teams
Cross-team collaborations
Chapter Leaders
Focusing on bringing into chapters content related to themes in coherent and meaningful ways
Fairness - Joan Herman
Accountability - Laura Hamilton
Technology - Denny Way
Workplace - Laurie Wise
Format and Publication Options - Barbara Plake
Discussant - Steve Ferrara, NCME Liaison to JC
Timeline
First meeting January, 2009
Three-year process for completing text of revision
Release of draft revision following December 2010 JC meeting
Open comment/Organization reviews
Projected publication Summer, 2012
Overview
1999 Approach
Chapter 7: Fairness in Testing and Test Use
Chapter 8: Rights and Responsibilities of Test Takers
Chapter 9: Testing Individuals of Diverse Linguistic Backgrounds
Chapter 10: Testing Individuals with Disabilities
Committee Charge
One element focused on adequacy and comparability of translations
One element focused on Universal Design
Revision Response
Fairness is fundamental to test validity: include as foundation chapter
Fairness and access are inseparable
Same principles of fairness and access apply to all individuals, regardless of specific subgroup
From three chapters to a single chapter that describes core principles and standards
Examples drawn from ELs, EWD, and other groups (young children, aging adults, etc.)
Comments point to applications for specific groups
Special standards retained where appropriate (e.g., test translations)
Section I: General Views of Fairness
Section II: Threats to the Fair and Valid Interpretations of Test Scores
Section III: Minimizing Construct Irrelevant Components Through the Use of Test Design and Testing Adaptations
Section IV: The Standards
Overview
Most notably in education but also in other areas such as behavioral health
Facilitated by increasing availability of data and analysis tools
Recent and impending federal and state initiatives will likely lead to further expansion
Under NCLB, or new pay for performance programs, tests often have consequences for individuals other than the examinees
Use of test scores in policy and program evaluations continues to be widespread
Reinforced by groups that fund and evaluate research (e.g., IES, What Works Clearinghouse)
student-level accountability (e.g., promotional gates, high school exit exams) and interim assessment
Institutional level (e.g., conjunctive and disjunctive rules for combining scores)
Individual level (e.g., teacher value-added modeling)
2. Issues related to validity, reliability, and reporting of individual and aggregate scores
3. Test preparation
4. Interim assessments
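The difference between the conjunctive and disjunctive rules mentioned above for institutional-level indices can be illustrated with a minimal Python sketch. The indicator names, scores, and cut scores are hypothetical and are not drawn from the Standards or from any actual accountability system; real systems typically layer further rules on top.

```python
from typing import Mapping


def conjunctive(scores: Mapping[str, float], cuts: Mapping[str, float]) -> bool:
    """Target is met only if EVERY indicator reaches its cut score."""
    return all(scores[name] >= cut for name, cut in cuts.items())


def disjunctive(scores: Mapping[str, float], cuts: Mapping[str, float]) -> bool:
    """Target is met if AT LEAST ONE indicator reaches its cut score."""
    return any(scores[name] >= cut for name, cut in cuts.items())


# Hypothetical school-level results and cut scores (illustrative values only).
school = {"reading": 72.0, "math": 58.0}
cuts = {"reading": 60.0, "math": 60.0}

print("conjunctive:", conjunctive(school, cuts))
print("disjunctive:", disjunctive(school, cuts))
```

With these illustrative values the school misses the target under the conjunctive rule (math falls below its cut) but meets it under the disjunctive rule, which is why the choice of combining rule matters for how an index is interpreted.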
1. Accountability Indices
Most test-based accountability systems require calculation of indices using a complex set of rules
Advances in data systems and statistical methodology have led to more sophisticated indices to support causal inferences
E.g., teacher and principal value-added measures
Consequences attached to these measures are growing increasingly significant
Describe exclusion rules, accommodations, and modifications
Address error stemming from small subgroups
Explain contribution of subgroup performance to accountability index
Teachers and other users should be given assistance to ensure appropriate interpretation and use of information from tests
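To illustrate why small subgroups are a source of error in aggregate scores, the sketch below computes the standard error of a subgroup's percent-proficient figure. It assumes a simple binomial model with independent examinees, which is a simplification chosen for illustration rather than a method prescribed by the Standards; the function name and the subgroup sizes are hypothetical.

```python
import math


def proportion_standard_error(p: float, n: int) -> float:
    """Standard error of an observed proportion under a simple binomial model.

    p: proportion of the subgroup scoring proficient (between 0 and 1)
    n: number of examinees in the subgroup
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return math.sqrt(p * (1.0 - p) / n)


# Hypothetical subgroup sizes, each with half of examinees proficient.
for n in (10, 40, 400):
    se = proportion_standard_error(0.5, n)
    print(f"n={n:>3}  standard error={se:.3f}")
```

Under this model the standard error shrinks with the square root of the subgroup size, so a percent-proficient figure based on ten students is far less stable than one based on several hundred.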
3. Test Preparation
High-stakes testing raises concerns about inappropriate test preparation
Users should take steps to reduce likelihood of test preparation that undermines validity
Help administrators and teachers understand what kinds of preparation are appropriate and desirable
Design tests and testing systems to limit likelihood of harmful test preparation
Some produced by commercial publishers, others homegrown
Vary in the extent to which they provide formative feedback vs. benchmarking to end-of-year tests
Need to determine which of these tests should be subjected to the Standards
Requirements for validity and reliability depend in part on how scores are used
If used for high-stakes decisions such as placement, evidence of validity for that purpose should be provided
Systems that provide instructional guidance should include rationale and evidence to support it
Overview
Technological advances are changing the way tests are delivered, scored, and interpreted and, in some cases, the nature of the tests themselves
The Joint Committee has been charged with considering how technological advances should impact revisions to the Standards
As with the other themes, comments on the standards that related to technology were compiled by the Management Committee and summarized in their charge to the Joint Committee
Security issues for tests delivered over the internet
Issues with web-accessible data, including data warehousing
What level of documentation/disclosure is appropriate and tolerable for automated scoring developers/vendors?
What sorts of evidence seem most important for demonstrating the validity and reliability of automated scoring systems?
What issues will emerge over the next five years related to automated scoring systems that need to be addressed by the standards?
Test development and simulations
Security & Fairness
Timed tasks & processing speed
Innovative clinical assessments & faking (effort assessment)
Use of computer for score interpretation
Actionable reports (e.g., routing students and teachers to instructional materials and lesson plans based on test results)
Revision of the Standards for Educational and Psychological Testing: Workplace Testing
Laurie Wise
Human Resources Research Organization (HumRRO)
Overview
Standards for testing in the workplace are currently covered in Chapter 14 (one of the testing application chapters). Workplace testing includes employment testing as well as licensure, certification, and promotion testing.
Comments on standards related to workplace testing were received by the Management Committee and summarized in their charge to the Joint Committee.
Comments suggested areas for extending or clarifying testing standards, but did not suggest major revisions to existing standards.
Including:
Alternatives to statistical tools for item screening
Alternatives to empirical validity evidence
Maintaining comparability of scores from different test forms
Assuring fairness
Assuring technical accuracy
Issues include:
Identifying test content
Establishing passing scores
Assessing reliability
Demonstrating validity
Differences in how test content is identified
Differences in validation strategies
Differences in test score use
Who oversees testing
Revision of the Standards for Educational and Psychological Testing: Format and Publication
Format Issues
Organization of Chapters
Scaling & Equating, Administration & Scoring, Documentation
Fairness: Fairness, Test Takers' Rights and Responsibilities, Disabilities, Linguistic Minorities
Applications: Test Users, Psychological, Educational, Workplace, Policy
Publication Options
Management Committee responsibility
Goal is for electronic access
Pursuing options for Kindle, etc.
Concerns about retaining integrity and financial support for future revision efforts