Tests and Measurements Presentation
Testing and Measurement:
Basic Definitions
Assessment - process of documenting
knowledge, skills, attitudes, and/or
beliefs
Evaluation - the making of a
judgment about the amount,
number, or value
Measurement - quantitative (involves
assigning numbers)
Testing - form of measurement
Basic Statistics
Mean, Median, and
Standard Deviation
Mean
Advantages
Calculation includes all scores
Indicates typical score for
group
Disadvantages
Easily distorted by extreme
scores
Median
Advantages
Not easily distorted by
extremely high or low scores
Disadvantages
Does not take into account the
value of all the scores in the
group
Mean or median?
Rule of Thumb
use median when extremely
high or low scores (outliers)
are present;
use the mean for most other
situations
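The rule of thumb can be checked with a small sketch. The scores below are hypothetical and chosen only to show how one extreme score pulls the mean much more than the median.

```python
from statistics import mean, median

# Hypothetical class scores (not from the presentation).
scores = [78, 80, 82, 84, 86]

# The same class with one extremely low score (an outlier) added.
scores_with_outlier = scores + [10]

# Without the outlier the mean and the median agree.
print("no outlier:   mean =", mean(scores), "median =", median(scores))

# With the outlier the mean drops sharply while the median barely moves.
print("with outlier: mean =", mean(scores_with_outlier),
      "median =", median(scores_with_outlier))
```

In this made-up data the single low score moves the mean by 12 points and the median by 1 point, which is why the median is the safer summary when outliers are present.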
Standard Deviation
Indicates by how much the
scores in a distribution typically
deviate from the mean
In a normal distribution, the mean
falls at the 50th percentile of the
norm group,
about 68% of scores fall within 1 SD
above or below the mean,
about 95% within 2 SD above or below
the mean,
about 99.7% within 3 SD above or
below the mean
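The percentages above can be reproduced with Python's standard library, assuming scores follow a normal distribution. The score list and the helper name share_within are illustrative assumptions, not part of the presentation.

```python
from statistics import NormalDist, mean, pstdev

# Hypothetical scores used only to show how a standard deviation is computed.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print("mean =", mean(scores), "SD =", pstdev(scores))

# Standard normal curve: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Proportion of scores within 1, 2, and 3 SDs of the mean.
for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")

# Half of the distribution lies below the mean.
print("below the mean:", std_normal.cdf(0))
```

The sketch uses the population standard deviation (pstdev), which treats the list as the whole group; real score distributions are only approximately normal, so observed percentages will differ somewhat.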
Types of Validity
Content Validity
Determined by the degree to
which the questions or items are
representative of the universe of
behavior the test was designed to
sample (does the test assess what
it claims to assess?)
Criterion-Related Validity
Determined by whether there is a
relationship between a test and an
immediate criterion measure
Examples: a driving test, an
employment test
Types of Reliability
Test-Retest: Coefficient of Stability
Alternate Form: Coefficient of
Equivalence
Internal Consistency: Consistency of
examinee across test items
Interrater Reliability: Consistency of
judges or scorers
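A coefficient of stability is commonly computed as the correlation between scores from two administrations of the same test. The sketch below shows that calculation with a Pearson correlation; the two score lists and the function name pearson_r are hypothetical and used only for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the summed squared deviations.
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (spread_x * spread_y)

# Hypothetical scores for five examinees on a first and second testing.
first_testing = [10, 12, 15, 18, 20]
second_testing = [11, 12, 14, 19, 21]

stability = pearson_r(first_testing, second_testing)
print(f"test-retest coefficient: {stability:.2f}")
```

The same function could be applied to scores from two alternate forms (coefficient of equivalence) or to ratings from two scorers, although other statistics are often preferred for interrater agreement.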
Reliability
General Guidelines
Test scores used for decisions
about individuals require a
much higher degree of
reliability than those for making
decisions about groups.
Higher reliability coefficients
are essential if decisions based
on test scores have long term
consequences.
How to Increase
Reliability
Use objective tests
Use a more heterogeneous
group
Make sure the difficulty level is
appropriate for the individuals
being tested
Increase the number of items
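The effect of adding items is often estimated with the Spearman-Brown formula, which the presentation does not name; it is brought in here as an assumption to illustrate the last point. It also assumes the added items are comparable in quality to the existing ones. The starting reliability of 0.70 is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length.

    reliability   -- reliability of the current test (0 to 1)
    length_factor -- new length divided by current length (2 = twice as long)
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Hypothetical test with a reliability of 0.70.
current = 0.70

# Predicted reliability if the test is lengthened with comparable items.
for factor in (1, 2, 3):
    print(f"{factor}x length: {spearman_brown(current, factor):.2f}")
```

The gain from each additional block of items shrinks as the test gets longer, so lengthening a test helps most when the starting reliability is modest.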
Types of Tests
Standardized Tests:
Norm-Referenced and
Criterion-Referenced Tests
Standardized Test
administered and scored in a
consistent, or "standard", manner.
designed in such a way that the
questions, conditions for
administering, scoring procedures,
and interpretations are consistent
administered and scored in a
predetermined, standard manner.
not necessarily high-stakes, time-limited, or multiple-choice.
Standardized Testing
Benefits
Objectivity
Evidence of validity or reliability of
results
Ability to compare across students,
schools, states, etc.
Ease of administration and scoring
Efficiency (group testing)
Developed over time and
supported with data and research
Standardized Testing
Possible issues
Norm-Referenced Scores
Based on the normal curve
Reflects student performance
compared to other similar students
Shows relative strengths and
weaknesses
Are not standards of what should
be - only indicators of what is
Examples: CogAT, Iowa, NNAT, WISC,
Stanford, Terra Nova
Norms
A set standard of development
or achievement usually derived
from the average or median
achievement of a large group
Used to compare one student's
results to those of a large sample
of students:
National norms - based on a large
sample from across the nation
Local norms - based on a large
sample from local schools within a
city, district, state, etc.
Norms
(Continued)
Representative
Because participation in the norm group is
voluntary, norm groups might not be
representative.
Relevant
The normal students used to establish the
norms may not have been provided a normal
instructional program.
Criterion-Referenced Tests
Allow inferences about:
a curricular domain of skills and
knowledge (e.g. the CCGPS, state
standards)
a cognitive domain of skill (e.g.
reading comprehension, math
computation)
Types of Scores
NRTs & CRTs
Raw Scores
Actual number of points
received on test
For example, 25 correct answers
out of 30 questions equals a raw
score of 25
Standard Scores
Raw scores converted to new
scale
Can be used to make direct
comparisons among classes,
schools, or districts
Can be misinterpreted because
scale values are somewhat
arbitrary and vary from test to test
Commonly Reported Standard Scores
SAT, GRE, NCEs, Stanines, SAS
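Converting a raw score to a standard score can be sketched in two steps: express the raw score as a z-score relative to the norm group, then rescale it. The norm-group mean and SD below are hypothetical; the NCE scale is taken to have a mean of 50 and an SD of 21.06, and the function names are illustrative.

```python
def z_score(raw, norm_mean, norm_sd):
    """Distance of a raw score from the norm-group mean, in SD units."""
    return (raw - norm_mean) / norm_sd

def rescale(z, new_mean, new_sd):
    """Place a z-score on a scale with the given mean and SD."""
    return new_mean + new_sd * z

# Raw score of 25 from the example above; norm-group values are hypothetical.
raw = 25
norm_mean = 20
norm_sd = 5

z = z_score(raw, norm_mean, norm_sd)

# Normal curve equivalent (NCE): mean 50, SD 21.06.
nce = rescale(z, 50, 21.06)

print(f"z = {z:.2f}, NCE = {nce:.2f}")
```

Because each test publisher picks its own scale mean and SD, the same z-score produces different numbers on different tests, which is the source of the misinterpretation noted above.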
Stanines
Standard Scores with whole
number values ranging from 1
to 9
Relate to percentile bands
Useful as a simple
approximation of performance;
May lead to a loss of precision
in reporting
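The link between stanines and percentile bands can be sketched as a lookup. The cutoffs below follow the commonly published stanine table (4%, 7%, 12%, 17%, 20%, 17%, 12%, 7%, 4% of the norm group); boundary conventions vary slightly between publishers, so treat them as an assumption. The function name is hypothetical.

```python
from bisect import bisect_right

# Lowest percentile rank that falls in stanines 2 through 9.
STANINE_LOWER_BOUNDS = [4, 11, 23, 40, 60, 77, 89, 96]

def stanine_from_percentile(percentile_rank):
    """Map a percentile rank (1 to 99) to a stanine (1 to 9)."""
    if not 1 <= percentile_rank <= 99:
        raise ValueError("percentile rank must be between 1 and 99")
    # Count how many lower bounds the rank has reached, then shift to 1..9.
    return bisect_right(STANINE_LOWER_BOUNDS, percentile_rank) + 1

# Percentile ranks 40 and 59 land in the same stanine.
for rank in (3, 40, 59, 96):
    print(f"percentile rank {rank} -> stanine {stanine_from_percentile(rank)}")
```

Two students at the 40th and 59th percentiles both receive a stanine of 5, which illustrates the loss of precision mentioned above.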
Percentile Scores
Commonly used in expressing results of
standardized tests
Probably the best single derived score
for general use in relaying test results
Indicate the percentage of students in
the norm group scoring lower than the
examinee
Range between values of 1 and 99
Used to interpret a student's
performance in comparison to other
students
Can result in misinterpretation because
percentile ranks are not equally
spaced along the score scale
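Both ideas can be sketched in a few lines: a percentile rank as the percentage of the norm group scoring lower, and the unequal spacing of percentile ranks when scores are assumed to be normally distributed. The norm group, the clamping to 1 through 99, and the function name are illustrative assumptions.

```python
from statistics import NormalDist

def percentile_rank(score, norm_group):
    """Percentage of the norm group scoring lower, reported from 1 to 99."""
    lower = sum(1 for s in norm_group if s < score)
    rank = round(100 * lower / len(norm_group))
    # Reported percentile ranks are kept within 1 and 99.
    return min(99, max(1, rank))

# Hypothetical norm group with raw scores 1 through 100.
norm_group = list(range(1, 101))
print("percentile rank of a raw score of 76:", percentile_rank(76, norm_group))

# Distance in SD units covered by ten percentile points, assuming normality.
std_normal = NormalDist()
gap_middle = std_normal.inv_cdf(0.60) - std_normal.inv_cdf(0.50)
gap_tail = std_normal.inv_cdf(0.99) - std_normal.inv_cdf(0.89)

print(f"50th to 60th percentile: {gap_middle:.2f} SD")
print(f"89th to 99th percentile: {gap_tail:.2f} SD")
```

Under the normality assumption, ten percentile points near the middle of the distribution cover a much smaller score difference than ten points near the top, so equal percentile gains do not mean equal gains in performance.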
Percentile Bands
Range of values thought to contain the
student's true percentile rank
smaller bands reflect higher reliability
Example: Susan might have a percentile
band ranging between 76 and 86 for
math computation on the ITBS, and a
percentile band ranging between 82 and
92 for reading.
Scores indicate that Susan probably
performs better at reading than
at math computation
However, because the two bands overlap
(82 to 86), her true percentile rank for
math could be higher than for reading
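Susan's two bands can be compared with a short sketch that reports the range the bands share. The function name band_overlap is hypothetical; the band values come from the example above.

```python
def band_overlap(band_a, band_b):
    """Return the shared (low, high) range of two bands, or None if disjoint."""
    low = max(band_a[0], band_b[0])
    high = min(band_a[1], band_b[1])
    return (low, high) if low <= high else None

# Percentile bands from the example.
math_band = (76, 86)
reading_band = (82, 92)

overlap = band_overlap(math_band, reading_band)
if overlap is None:
    print("Bands do not overlap: the difference is more likely to be real.")
else:
    print(f"Bands overlap from {overlap[0]} to {overlap[1]}: "
          "the true ranks could be in either order.")
```

When the bands share any range, the difference between the two reported scores should be interpreted with caution.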
Grade Equivalents
Identifies grade level at which
typical student obtains same
raw score
Expressed by grade and month
Are useful in measuring growth
Can be easily misinterpreted
Things to Know
Know the Test - study the manual and
understand the content and purpose
Know the Norms - you cannot interpret
scores well if you don't understand
the norming population
Know the Score - is it a standard score,
raw score, percentile rank, or
something else?
Know the Background - test results
don't tell the whole story, so consider
multiple sources of data and
information on the student
More to know
Research on your own - the more you
know, the more you can explain test
results with accuracy and confidence
Communicate effectively - provide
pertinent information in a clear,
understandable manner to approved
individuals
Use the test - understanding
increases with multiple uses
Use caution - test scores can reflect
ability but they do not determine
ability
Reference: Test Scores and What They Mean, 6th edition, by H. Lyman