Types of Norm
Percentile norms - express a score as the percentage of students in the norm group who scored below it.
Subgroup norms - norms segmented by any criterion used to select the group (e.g., a profile selected according to age).
Example: comparing national test results with local results (e.g., students taking an achievement test in Bauan).
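The percentile idea above can be sketched in a few lines. This is a minimal illustration, assuming the "percentage of scores below" definition given in these notes; the function name and the norm-group data are hypothetical.

```python
def percentile_rank(score, norm_scores):
    """Percentage of scores in the norm group that fall below `score`."""
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)


# Hypothetical norm group of ten raw scores (illustrative data only).
norm_group = [55, 60, 62, 68, 70, 74, 78, 81, 85, 93]

# Six of the ten scores are below 78.
print(percentile_rank(78, norm_group))  # 60.0
```

Some textbooks count half of the tied scores as well; that variant is not shown here.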
Types of Norms
• Age norms
• Grade norms
• National norms
• Local norms
• Subgroup norms
• Percentile norms
- Fixed reference group scoring: the distribution of scores obtained on the test from one group of test takers, called the fixed reference
group, is used as the basis for calculating test scores for future administrations of the test.
• Criterion-referenced tests measure a test taker's performance compared to a specific set of standards
or criteria.
- Suppose you received a score of 90% on a Math exam in school. This could be interpreted in both
ways. If the cut score was 80%, you clearly passed. If the average score was 75%, then you performed at
the top of the class.
CRITERION - comparing 90 to the cut score of 80
NORM - comparing 90 to the class average of 75
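The two readings of the same score can be written out as a small sketch. The function name and labels are hypothetical; the numbers are the ones from the Math exam example.

```python
def interpret(score, cut_score, group_mean):
    """Return (criterion-referenced, norm-referenced) readings of one score."""
    # Criterion-referenced: compare the score with a fixed standard.
    criterion = "pass" if score >= cut_score else "fail"
    # Norm-referenced: compare the score with how the group performed.
    if score > group_mean:
        norm = "above average"
    elif score == group_mean:
        norm = "average"
    else:
        norm = "below average"
    return criterion, norm


print(interpret(90, cut_score=80, group_mean=75))  # ('pass', 'above average')
```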
Correlation is an expression of the degree and direction of correspondence between two things.
• Pearson r: ranges from +1 (perfect positive) through 0 (no linear relationship) to -1 (perfect negative)
• Spearman rho
• Regression
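As an illustration of the first two coefficients, here is a plain-Python sketch. It assumes the usual textbook definitions: Pearson r as covariance divided by the product of the standard deviations, and Spearman rho as Pearson r computed on ranks. The helper names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation; assumes both lists have nonzero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))


hours = [1, 2, 3, 4, 5]          # hypothetical data
scores = [1, 4, 9, 16, 25]       # rises with hours, but not linearly
print(pearson_r(hours, scores))
print(spearman_rho(hours, scores))
```

Because the second variable always rises with the first, rho reaches its maximum while r stays slightly below it, which shows the difference between a linear and a rank-order relationship.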
Meta-Analysis
- a family of techniques used to statistically combine information across studies to produce single
estimates of the statistics being studied.
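One very simple way to form such a single estimate is a sample-size-weighted average of the correlations reported by each study. The sketch below assumes that approach only for illustration; the notes do not say which technique is meant, and the study data are hypothetical.

```python
def weighted_mean_r(studies):
    """Combine (sample_size, correlation) pairs into one weighted estimate."""
    total_n = sum(n for n, _ in studies)
    return sum(n * r for n, r in studies) / total_n


# Hypothetical studies: (number of participants, observed correlation).
studies = [(100, 0.30), (300, 0.50)]
print(weighted_mean_r(studies))  # close to 0.45: the larger study counts more
```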
Utility
• Benefits - should justify the costs of administering, scoring, and interpreting the test
Utility Analysis
- a family of techniques that entail a cost-benefit analysis designed to yield information relevant to a
decision about the usefulness and/or practical value of a tool of assessment
- A utility analysis can help determine whether:
• one test is preferable to another test for a specific purpose
• one tool of assessment (such as a test) is preferable to another tool of assessment (such as behavioral
observation) for a specific purpose
• the addition of one or more tests (or other tools of assessment) to one or more tests (or other tools of
assessment) that are already in use is preferable for a specific purpose
Some Practical
Cut Score
- Reference point derived as a result of a judgment and used to divide a set of data into two or more
classifications, with some action to be taken or some inference to be made on the basis of these
classifications.
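A minimal sketch of how cut scores divide a set of scores into classifications. The cut scores and labels are hypothetical, and the sketch assumes a score exactly at a cut score falls into the higher classification.

```python
from bisect import bisect_right


def classify(score, cut_scores, labels):
    """Place a score into one of len(cut_scores) + 1 classifications.

    `cut_scores` must be sorted in ascending order.
    """
    return labels[bisect_right(cut_scores, score)]


cut_scores = [60, 80]                    # hypothetical cut scores
labels = ["fail", "pass", "honors"]      # one more label than cut scores

for s in (59, 60, 79, 80):
    print(s, classify(s, cut_scores, labels))
```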
Type of Cut Score
• test construction
• test revision
Test Conceptualization
-Rating Scale
• Summative Scale
• Likert Scale
-Comparative Scaling
-Categorical Scaling
-Guttman Scale
For example: Rate Burger King in comparison to McDonald's
• Excellent
• Very good
• Good
• Poor
• Very poor
A comparative rating scale allows the researcher to interpret the resulting data in relation to another
company or product.
Categorical scaling
• In our running MDBS-R example, testtakers might be given 30 index cards on each of which is printed
one of the 30 items. Testtakers would be asked to sort the cards into three piles: those behaviors that
are never justified, those that are sometimes justified, and those that are always justified.
• Guttman scale: items are ordered by strength of attitude, so a respondent who agrees with a stronger
statement (a) is expected to agree with the milder ones (b, c) as well.
a. All people should have the right to decide whether they wish to end their lives.
b. People who are terminally ill and in pain should have the option to have a doctor assist them in
ending their lives.
c. People should have the option to sign away the use of artificial life-support equipment before they
become seriously ill.
Test Conceptualization
• Writing Items
- Selected-response format
• Multiple choice
• Matching
• True/False
Writing Items
- Constructed-response format
• Completion item
• Short answer
• Essay
Item A (parts of a multiple-choice item: stem, correct alternative, distractors)
- Stem: A psychological test, an interview, and a case study are
- Distractor: theory-linked measures
Item B
- Alternative e: includes as much of the item as possible in the stem to avoid unnecessary repetition
Test Conceptualization
- Item bank - a relatively large and easily accessible collection of test questions
- Item branching - the ability of the computer to tailor the content and order of presentation of test items
on the basis of responses to previous items.
Test Conceptualization
Scoring Items
• Cumulative scoring - the higher the score on the test, the higher the test taker is on the ability, trait, or other
characteristic that the test purports to measure
• Class or category scoring - test taker responses earn credit toward placement in a particular class or category with
other test takers whose pattern of responses is presumably similar in some way
• Ipsative scoring - comparing a test taker's score on one scale within a test to another scale within that
same test.
Item Analysis
• Other Considerations
• Guessing
• Item fairness
• Speed Test
• Item-difficulty index: a measure of the proportion of examinees who answered the item correctly; for this reason, it is
frequently called the p-value. Because it is the proportion of examinees who got the item right, the p-value might
more properly be called the item easiness index rather than the item difficulty index.
• It can range between 0.0 and 1.0, with a higher value indicating that a greater proportion of
examinees responded to the item correctly, and that the item was thus easier.
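The p-value described above is simply a column average when answers are coded 1 (correct) and 0 (incorrect). A minimal sketch with a hypothetical response matrix:

```python
def item_p_values(response_matrix):
    """Proportion correct for each item.

    `response_matrix` has one row per examinee and one column per item,
    with 1 for a correct answer and 0 for an incorrect one.
    """
    return [sum(col) / len(col) for col in zip(*response_matrix)]


# Hypothetical data: 4 examinees, 3 items.
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
print(item_p_values(responses))  # [0.75, 0.75, 0.25]
```

Here the third item, answered correctly by only one examinee in four, is the hardest.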
• Item-reliability index: the higher this index, the greater the test's internal consistency.
• This index is equal to the product of the item-score standard deviation (s) and the correlation (r)
between the item score and the total test score.
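The product s × r can be computed directly. This sketch assumes the population standard deviation for s (the notes do not say whether the population or sample form is intended) and uses hypothetical data and helper names.

```python
from math import sqrt
from statistics import pstdev


def pearson_r(x, y):
    """Pearson correlation; assumes both lists have nonzero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def item_reliability_index(item_scores, total_scores):
    """s (item-score standard deviation) times r (item-total correlation)."""
    s = pstdev(item_scores)  # assumption: population standard deviation
    r = pearson_r(item_scores, total_scores)
    return s * r


# Hypothetical data: one item scored 0/1 and total test scores
# for the same four examinees.
item = [1, 1, 0, 0]
totals = [9, 7, 4, 2]
print(item_reliability_index(item, totals))
```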
• Item-validity index: provides an indication of the degree to which a test is measuring what it purports to measure
• The higher the item-validity index, the greater the test's criterion-related validity.
• Item-discrimination index: a measure of how well an item is able to distinguish between examinees who are knowledgeable and
those who are not, that is, between masters and non-masters.
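One common way to quantify this is to compare the proportion answering the item correctly in the highest-scoring group with the proportion in the lowest-scoring group. The notes do not give a formula, so the sketch below is an assumption: it uses the upper and lower 27% of examinees by total score, and the function name and data are hypothetical.

```python
def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Proportion correct in the upper group minus that in the lower group."""
    n = len(total_scores)
    k = max(1, round(n * fraction))          # size of each extreme group
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_scores[i] for i in upper) / k
    p_lower = sum(item_scores[i] for i in lower) / k
    return p_upper - p_lower


# Hypothetical data: 10 examinees, total scores 1..10, one item scored 0/1.
totals = list(range(1, 11))
item = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
print(discrimination_index(item, totals))  # 1.0
```

A value near +1 means the item separates high and low scorers well, a value near 0 means it does not, and a negative value means low scorers did better on the item than high scorers.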
- The questions are raised either orally or in writing shortly after a test administration.
- Depending upon the objectives of the test users, the questions could be placed in other formats such as
true or false or multiple choice.
TEST REVISION
Cross-validation - revalidation of a test on a sample of testtakers other than those on whom test
performance was originally found to be a valid predictor of some criterion
Co-validation - a test validation process conducted on two or more tests using the same sample of
testtakers