
Types of Norm

Age norms - average scores of test takers at different ages.

Grade norms - average scores of students at different grade levels.

Percentile norms - a percentile indicates the percentage of test takers in the norm group who scored below a given raw score.

Ex: a score at the 75th percentile means 75% of students scored below it.
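As a minimal illustration (with made-up scores), a percentile rank can be computed as the share of the norm group scoring below a given raw score:

```python
# Percentile rank: the percentage of test takers in the norm group
# who scored below a given raw score (made-up scores for illustration).
norm_group_scores = [55, 60, 62, 67, 70, 72, 75, 78, 81, 90]

def percentile_rank(raw_score, norm_scores):
    below = sum(1 for s in norm_scores if s < raw_score)
    return 100 * below / len(norm_scores)

print(percentile_rank(78, norm_group_scores))  # 70.0 -> 70% scored below 78
```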

National norms - norms derived from a sample representative of the whole national population.

Ex: Luzon, Visayas, Mindanao

Subgroup norms - norms segmented by any criterion of interest (ex: a profile selected according to age).

Local norms - norms based on the local population taking the test.

Ex: comparing national test results with local results, as when students take an achievement test in Bauan.

• Standardization or test standardization - the process of administering a test to a representative sample of test-takers for the purpose of establishing norms.

• Developing norms for a standardized test.

Types of Norms

• Age norms

• Grade norms

• National norms

• National anchor norms

• Local norms

• Subgroup norms

• Percentile norms

Fixed Reference Group Scoring Systems

- the distribution of scores obtained on the test from one group of test takers, called the fixed reference group, is used as the basis for the calculation of test scores for future administrations of the test

Example: the Scholastic Aptitude Test (SAT)
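A minimal sketch of the idea, with hypothetical raw scores: the fixed reference group's mean and standard deviation are frozen and reused to scale every later administration (the SAT-style 500/100 metric here is only illustrative).

```python
import statistics

# Fixed reference group: scores from one (earlier) group of test takers.
reference_scores = [48, 52, 55, 60, 63, 66, 70, 74]
ref_mean = statistics.mean(reference_scores)   # frozen mean
ref_sd = statistics.stdev(reference_scores)    # frozen standard deviation

def scaled_score(raw, scale_mean=500, scale_sd=100):
    """Scale a raw score from any later administration against the
    fixed reference group's frozen mean and SD (SAT-style metric)."""
    z = (raw - ref_mean) / ref_sd
    return scale_mean + scale_sd * z

print(round(scaled_score(70)))  # ~600: about one SD above the reference mean
```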


Norm-Referenced versus Criterion- Referenced Evaluation

• Norm-referenced tests make comparisons between individuals.

• Criterion-referenced tests measure a test taker's performance compared to a specific set of standards or criteria.

Norm-referenced versus criterion- referenced evaluation

- Suppose you received a score of 90% on a Math exam in school. This could be interpreted both ways. If the cut score was 80%, you clearly passed. If the average score was 75%, then you performed at the top of the class.

CRITERION-REFERENCED - comparing 90 to the cut score of 80

NORM-REFERENCED - comparing 90 to the class average of 75

Correlation is an expression of the degree and direction of correspondence between two things.

• Pearson r - values range from −1 through 0 to +1

• Spearman Rho

• Regression

Meta-Analysis

- a family of techniques used to statistically combine information across studies to produce single estimates of the statistics being studied.

Positive correlation ⬆️⬆️ ⬇️⬇️ - the two variables move in the same direction

Negative correlation ⬆️⬇️ - the two variables move in opposite directions
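A minimal sketch with invented data, assuming scipy is available, showing a positive and a negative Pearson r plus the rank-based Spearman rho:

```python
from scipy import stats

hours_studied = [1, 2, 3, 4, 5, 6]
exam_score    = [50, 55, 63, 70, 74, 82]   # rises with hours: positive r
absences      = [9, 8, 6, 5, 3, 1]         # falls with hours: negative r

r_pos, _ = stats.pearsonr(hours_studied, exam_score)
r_neg, _ = stats.pearsonr(hours_studied, absences)
rho, _   = stats.spearmanr(hours_studied, exam_score)  # rank-based analogue

print(f"Pearson r (positive): {r_pos:.2f}")   # close to +1
print(f"Pearson r (negative): {r_neg:.2f}")   # close to -1
print(f"Spearman rho:         {rho:.2f}")
```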

Utility

- usefulness or practical value of testing to improve efficiency.


Factors that affect utility

• Psychometric soundness - the reliability and validity of a test

• Cost - economic and noneconomic

• Benefits - justify the costs of administering, scoring, and interpreting the test

Utility Analysis

- a family of techniques that entail a cost-benefit analysis designed to yield information relevant to a
decision about the usefulness and/or practical value of a tool of assessment
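One classic cost-benefit model in this family is the Brogden-Cronbach-Gleser utility estimate; a minimal sketch with purely hypothetical figures (all parameter values below are invented):

```python
def bcg_utility(n_selected, tenure_years, validity_r, sd_performance_dollars,
                mean_z_selected, n_tested, cost_per_applicant):
    """Brogden-Cronbach-Gleser utility gain (a textbook cost-benefit model):
    dollar gain from selecting with the test, minus the total cost of testing."""
    gain = (n_selected * tenure_years * validity_r
            * sd_performance_dollars * mean_z_selected)
    cost = n_tested * cost_per_applicant
    return gain - cost

# Hypothetical figures for illustration only.
print(bcg_utility(n_selected=10, tenure_years=2, validity_r=0.40,
                  sd_performance_dollars=12000, mean_z_selected=1.0,
                  n_tested=100, cost_per_applicant=50))  # 91000.0
```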

UTILITY-RELATED QUESTIONS

- Is one test preferable to another test for a specific purpose?

- Is one tool of assessment (such as a test) preferable to another tool of assessment (such as behavioral observation) for a specific purpose?

- Is the addition of one or more tests (or other tools of assessment) to those already in use preferable for a specific purpose?

- Is no testing or assessment preferable to any testing or assessment?

Some Practical Considerations

- The pool of job applicants

- The complexity of the job

- The cut score in use

Cut Score

- Reference point derived as a result of a judgment and used to divide a set of data into two or more
classifications, with some action to be taken or some inference to be made on the basis of these
classifications.

Types of Cut Scores

• relative cut score

• norm-referenced cut score

• fixed cut score

• multiple cut scores
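A minimal sketch with made-up scores contrasting a fixed cut score with a relative (norm-referenced) one; the half-SD-above-the-mean rule below is an arbitrary illustrative choice:

```python
import statistics

scores = [55, 60, 68, 72, 75, 80, 85, 90]

# Fixed cut score: the reference point does not depend on how the group did.
fixed_cut = 75
passed_fixed = [s for s in scores if s >= fixed_cut]

# Relative (norm-referenced) cut score: set from the group's own distribution,
# here half a standard deviation above the mean.
rel_cut = statistics.mean(scores) + 0.5 * statistics.stdev(scores)
passed_relative = [s for s in scores if s >= rel_cut]

print(f"fixed cut {fixed_cut}: {passed_fixed}")
print(f"relative cut {rel_cut:.1f}: {passed_relative}")
```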

Methods for Setting Cut Scores

• The Angoff Method - experts judge the probability that a minimally competent test taker answers each item correctly (see the sketch after this list)

• The Known Groups Method - compares the scores of groups known to possess (or lack) the trait or characteristic the job requires

• IRT-Based Methods - use item response theory models to locate the ability level that separates passing from failing
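For the Angoff Method, averaging the experts' per-item probability judgments and summing across items yields the cut score. A minimal sketch with hypothetical ratings:

```python
# Each row: one expert's judged probability that a minimally competent
# test taker answers each of five items correctly (hypothetical ratings).
expert_ratings = [
    [0.9, 0.7, 0.6, 0.8, 0.5],
    [0.8, 0.6, 0.7, 0.9, 0.4],
    [0.9, 0.8, 0.5, 0.8, 0.6],
]

# Average the experts per item, then sum across items: the expected
# number-correct score of a minimally competent test taker = the cut score.
per_item_means = [sum(col) / len(col) for col in zip(*expert_ratings)]
cut_score = sum(per_item_means)
print(f"Angoff cut score: {cut_score:.2f} out of {len(per_item_means)} items")
```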

Test Development

The process of developing a test occurs in five stages:

• test conceptualization - the initial self-generated idea for the test

• test construction

• test tryout - administering the test to check its validity (SCORING ITEMS)

• item analysis - evaluating whether items are acceptable or not

• test revision

Test Conceptualization

• Scaling methods - the process of assigning numbers in measurement.

- Rating Scale

• Summative Scale

• Likert Scale

- Method of Paired Comparisons

- Comparative Scaling

- Categorical Scaling

- Guttman Scale

For example: Rate Burger King in comparison to McDonald's:

• Excellent

• Very good

• Good

• Both are the same

• Poor

• Very poor

A comparative rating scale allows the researcher to interpret the resulting data in relation to another
company or product.

Categorical scaling

• In our running MDBS-R example, testtakers might be given 30 index cards on each of which is printed
one of the 30 items. Testtakers would be asked to sort the cards into three piles: those behaviors that
are never justified, those that are sometimes justified, and those that are always justified.

• MDBS-R-Morally Debatable Behaviors Scale-Revised

• Guttman scale

• Do you AGREE or DISAGREE with each of the following:

a. All people should have the right to decide whether they wish to end their lives.
b. People who are terminally ill and in pain should have the option to have a doctor assist them in
ending their lives.

c. People should have the option to sign away the use of artificial life-support equipment before they become seriously ill.

d. People have the right to a comfortable life.
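In a Guttman scale, items are ordered so that endorsing a stronger statement implies endorsing all milder ones; a perfect response pattern is a run of agreements followed by disagreements. A minimal sketch (items assumed ordered mildest to strongest, responses coded 1 = agree, 0 = disagree):

```python
def is_guttman_consistent(responses):
    """responses: 0/1 answers to items ordered mildest -> strongest.
    A perfect Guttman pattern is all 1s followed by all 0s."""
    first_zero = responses.index(0) if 0 in responses else len(responses)
    return all(r == 0 for r in responses[first_zero:])

print(is_guttman_consistent([1, 1, 1, 0]))  # True: endorsed up to a point
print(is_guttman_consistent([1, 0, 1, 0]))  # False: skipped a milder item
```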

Test Conceptualization

•Writing Items

- Selected-response format

• Multiple choice

• Matching

• True/False

Writing Items

- Constructed-response format

• Completion item

• Short answer

• Essay

Item A

Stem: A psychological test, an interview, and a case study are

CORRECT ALTERNATIVE

a. psychological assessment tools

DISTRACTORS

b. standardized behavioral samples

c. reliable assessment instruments

d. theory-linked measures

Item B

A good multiple-choice item in an achievement test

a. has one correct alternative

b. has grammatically parallel alternatives

c. has alternatives of similar length

d. has alternatives that fit grammatically with the stem

e. includes as much of the item as possible in the stem to avoid unnecessary repetition

f. avoids ridiculous distractors

g. is not excessively long

h. all of the above

i. none of the above

Test Conceptualization

• Writing items for computer administration

- Item bank - a relatively large and easily accessible collection of test questions

- Item branching - the ability of the computer to tailor the content and order of presentation of test items on the basis of responses to previous items.
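A minimal sketch of item branching with a hypothetical item bank tagged by difficulty: the computer serves a harder item after a correct response and an easier one after an incorrect response.

```python
# Hypothetical item bank keyed by difficulty level (1 = easiest).
item_bank = {
    1: ["easy item A", "easy item B"],
    2: ["medium item A", "medium item B"],
    3: ["hard item A", "hard item B"],
}

def next_level(current_level, last_answer_correct):
    """Branch up after a correct response, down after an incorrect one,
    staying within the bank's difficulty range."""
    step = 1 if last_answer_correct else -1
    return min(max(current_level + step, 1), max(item_bank))

level = 2
for correct in [True, True, False]:       # simulated response pattern
    level = next_level(level, correct)
    print(f"serving a level-{level} item: {item_bank[level][0]}")
```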
Test Conceptualization

Scoring Items

• Cumulative-the higher the score on the test, the higher the test taker is on the ability, trait, or other
characteristic that the test purports to measure

• Class or Category scoring - test taker responses earn credit toward placement in a particular class or category with other test takers whose pattern of responses is presumably similar in some way

• Ipsative Scoring- comparing a test taker's score on one scale within a test to another scale within that
same test.
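A minimal sketch contrasting cumulative and ipsative scoring, with invented scale scores:

```python
# Invented scores on two scales within one test.
scale_scores = {"assertiveness": 18, "agreeableness": 12}

# Cumulative scoring: sum credit across items; a higher total means more
# of the ability or trait the test purports to measure.
cumulative_total = sum(scale_scores.values())

# Ipsative scoring: compare one scale with another *within the same test
# taker* ("more assertive than agreeable"), not against other people.
stronger = max(scale_scores, key=scale_scores.get)

print(f"cumulative total: {cumulative_total}")
print(f"ipsative interpretation: {stronger} is the relatively stronger scale")
```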

Item Analysis

• Other Considerations

• Guessing

• Item fairness

• Speed Test

Item Difficulty Index

• It is a measure of the proportion of examinees who answered the item correctly; for this reason, it is frequently called the p-value. As the proportion of examinees who got the item right, the p-value might more properly be called an item easiness index rather than an item difficulty index.

• It can range between 0.0 and 1.0, with a higher value indicating that a greater proportion of
examinees responded to the item correctly, and it was thus an easier item.
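A minimal sketch with made-up responses (1 = correct, 0 = incorrect):

```python
# Responses of six examinees to one item (1 = correct, 0 = incorrect).
item_responses = [1, 1, 0, 1, 0, 1]

p_value = sum(item_responses) / len(item_responses)
print(f"item difficulty (p): {p_value:.2f}")  # 0.67 -> a fairly easy item
```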

Item Reliability Index

• Provides an indication of the internal consistency of a test; the higher this index, the greater the test's internal consistency.

• This index is equal to the product of the item-score standard deviation (s) and the correlation (r) between the item score and the total test score.
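A minimal sketch of the s × r product with made-up data; for a dichotomous item the item-score standard deviation can be taken as sqrt(p(1 − p)):

```python
import math
from scipy import stats

# Made-up data: one dichotomous item and total test scores for six examinees.
item_scores  = [1, 1, 0, 1, 0, 1]
total_scores = [28, 25, 14, 22, 11, 26]

p = sum(item_scores) / len(item_scores)
s = math.sqrt(p * (1 - p))                        # item-score standard deviation
r, _ = stats.pearsonr(item_scores, total_scores)  # item-total correlation

print(f"s = {s:.2f}, r = {r:.2f}, item reliability index = {s * r:.2f}")
```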

Item Validity Index

• Provides an indication of the degree to which a test is measuring what it purports to measure

• The higher the item-validity index, the greater the test's criterion-related validity.
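The computation mirrors the item reliability index, except that the correlation is with an external criterion score rather than the total test score (made-up data again):

```python
import math
from scipy import stats

item_scores     = [1, 1, 0, 1, 0, 1]
criterion_score = [85, 80, 60, 78, 55, 88]   # e.g., a later performance rating

p = sum(item_scores) / len(item_scores)
s = math.sqrt(p * (1 - p))                           # item-score standard deviation
r, _ = stats.pearsonr(item_scores, criterion_score)  # item-criterion correlation

print(f"item validity index = {s * r:.2f}")
```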

Item Discrimination Index

• a measure of how well an item is able to distinguish between examinees who are knowledgeable and those who are not, between masters and non-masters.
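A common version is d = (U − L) / n, the difference between the proportions of upper-group and lower-group examinees answering correctly; a minimal sketch with made-up groups:

```python
# Made-up item responses (1 = correct) for the top and bottom scoring groups,
# often taken as the upper and lower 27% of examinees on total score.
upper_group = [1, 1, 1, 1, 0]   # high scorers on the whole test
lower_group = [1, 0, 0, 0, 0]   # low scorers on the whole test

U = sum(upper_group)            # correct answers in the upper group
L = sum(lower_group)            # correct answers in the lower group
n = len(upper_group)            # examinees per group (groups of equal size)

d = (U - L) / n
print(f"item discrimination index d = {d:.2f}")  # 0.60 -> discriminates well
```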

Potential Areas of Exploration by Means of Qualitative Item Analysis

- Questions of possible interest to test users.

- The questions raised, either orally or in writing, shortly after a test administration.

- Depending upon the objectives of the test user, the questions could be placed in other formats, such as true/false or multiple choice.

TEST REVISION

• For new test development

• For existing test

Cross-validation - revalidation of a test on a sample of testtakers other than those on whom test performance was originally found to be a valid predictor of some criterion

Co-validation - a test validation process conducted on two or more tests using the same sample of testtakers
