Characteristics of a Good Test
Types of Validity
Face Validity- Face validity refers to the extent to which a test appears to measure what it is intended to
measure. A test has strong face validity if most people would agree that its items appear to measure
what the test is intended to measure.
Content Validity- Content validity refers to how well the items of a survey or test cover the full range
of content or skills that it sets out to measure.
Criterion Validity- Criterion validity (or criterion-related validity) measures how well one measure
predicts an outcome for another measure. A test has this type of validity if it is useful for predicting
performance or behavior in another situation (past, present, or future).
Construct Validity- Construct validity is about how well a test measures the concept it was designed to
evaluate.
Factors Affecting Validity
Unclear directions
Vocabulary and sentence construction
Ambiguity of options
Inadequate time limits
Overemphasis on easy or difficult questions
Test items with inappropriate instrument
Poorly constructed test items
Length of the test
Readiness of students
RELIABILITY
Test reliability refers to the degree to which a test is consistent and stable in measuring what it is
intended to measure. Most simply put, a test is reliable if it is consistent within itself and across time.
TYPES OF RELIABILITY
Test-retest- It measures stability over time. The same test is given to the same group again after a set
period of time, and the two sets of scores are compared.
Alternate Form- Alternate form reliability is assessed when an individual participating in a research or
testing scenario is given two different versions of the same test at different times. The scores on the
two versions are then compared to see how consistent they are.
Split-Half- In split-half reliability, a test for a single knowledge area is split into two parts, and both
parts are given to one group of students at the same time. The scores on the two parts are then
compared.
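The comparisons described above are usually expressed as a correlation between two sets of scores. The following is a minimal sketch, not a prescribed procedure: it assumes the Pearson correlation as the comparison statistic and an odd/even split of items for the split-half case, neither of which is specified in these notes. The function names and all scores are hypothetical, made up for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)

def split_half_scores(item_scores):
    """Split each student's item scores into odd- and even-numbered items
    (an assumed splitting rule) and return the two lists of half totals."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return odd, even

# Test-retest: the same five students take the same test twice (made-up data).
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 12, 16, 17, 21]
test_retest_r = pearson_r(first_administration, second_administration)

# Split-half: one administration, item-level scores (1 = correct, 0 = wrong).
item_scores = [
    [1, 1, 1, 1],
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
odd_half, even_half = split_half_scores(item_scores)
split_half_r = pearson_r(odd_half, even_half)

print("test-retest r:", round(test_retest_r, 3))
print("split-half r:", round(split_half_r, 3))
```

A correlation close to 1 suggests the two sets of scores rank students consistently. The same `pearson_r` helper could be applied to scores from two alternate forms.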
FACTORS AFFECTING RELIABILITY
1. Length of the test- One of the major factors that affect reliability is the length of the test. A longer test
provides a more adequate sample of behavior being measured and is less disturbed by chance factors
like guessing.
2. Moderate item difficulty- The test maker should spread the scores over a greater range rather than
having purely difficult or easy items. Bloom's taxonomy can serve as the basis for an even distribution
of difficulty.
3. Objectivity- Eliminate the biases, opinions, or judgments of the person who checks the test.
Socio-political beliefs should be set aside when checking the test.
4. Heterogeneity of the student group- Reliability is higher when test scores are spread out over a
range of abilities. Reliability is achieved when the test-takers represent a variety of intellectual levels
and skills.
5. Limited time- A test in which speed is a factor tends to be more reliable than one that is conducted
over a longer time. This factor also considers the chance that a student might cheat.
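The effect of test length (factor 1 above) can be illustrated with the Spearman-Brown prophecy formula, a standard psychometric result that these notes do not name. The sketch below assumes that any added items are comparable in quality to the existing ones. The function name and the reliability values are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predict reliability after changing test length.

    reliability   -- reliability of the current test (between 0 and 1)
    length_factor -- new length divided by current length (2 = twice as long)
    Assumes the added or removed items are parallel to the existing ones.
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

current = 0.60  # made-up reliability of a short quiz
for factor in (0.5, 1, 2, 3):
    predicted = spearman_brown(current, factor)
    print(f"length x{factor}: predicted reliability {predicted:.2f}")
```

Under this formula, lengthening a test raises the predicted reliability and shortening it lowers it, with smaller gains for each further increase in length.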
As a measurement tool, a test results in a score—a number. A number, however, has no intrinsic
meaning and must be compared with something that has meaning to interpret its significance. For a test
score to be useful for making decisions about the test, the teacher must interpret the score. Whether
the interpretations are norm referenced or criterion referenced, a basic knowledge of statistical
concepts is necessary to assess the quality of tests (whether teacher-made or published), understand
standardized test scores, summarize assessment results, and explain test scores to others.
Referencing Framework
A referencing framework is a structure you can use to compare a student's performance to something
external to the assessment itself.
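As a rough illustration of how a raw score gains meaning through comparison, the sketch below interprets one score in two ways: against a group (norm-referenced) and against a fixed standard (criterion-referenced). The function names, the group scores, and the cut score of 75 are all hypothetical. The percentile rank here is defined as the percentage of the group scoring strictly below the given score, which is only one of several definitions in use.

```python
from statistics import mean, pstdev

def z_score(score, group_scores):
    """Norm-referenced: distance from the group mean in standard deviations."""
    sd = pstdev(group_scores)  # population SD of the reference group
    if sd == 0:
        raise ValueError("group scores must vary")
    return (score - mean(group_scores)) / sd

def percentile_rank(score, group_scores):
    """Norm-referenced: percent of the group scoring strictly below `score`."""
    below = sum(1 for s in group_scores if s < score)
    return 100 * below / len(group_scores)

def meets_criterion(score, cut_score):
    """Criterion-referenced: compare the score with a fixed standard."""
    return score >= cut_score

group = [50, 60, 70, 80, 90]  # made-up class scores
student = 80

print("z-score:", round(z_score(student, group), 2))
print("percentile rank:", percentile_rank(student, group))
print("meets cut score of 75:", meets_criterion(student, 75))
```

The same raw score of 80 can therefore be read as "above most of the class" or as "passed the standard", depending on the framework chosen.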