Test Construction 2 With Item Analysis
Content                   Instructional  No. of  % of    KD   Levels of Behavior, Item Format, No. & Placement
                          Time           Items   Items        R      U      Ap     An     E      C
1. Methods of assessment                                 F    I #1   II #4
2. etc.
Total                     15             50      100%         5      10     20     10     3      2
Two-Way Format
What is the disadvantage of this format?
It is difficult to construct items for a target level without a statement of objectives that specifies the required behavior and the skills to be emphasized.
Three-Way Format
Content                   Test        Instructional  No. of  % of    KD   Levels of Behavior, Item Format, No. & Placement
                          Objectives  Time           Items   Items        R      U      Ap     An     E      C
1. Methods of assessment                                             F    I #1   II #4
2. etc.
Total                                 15             50      100%         5      10     20     10     3      2
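The tables above relate instructional time to the number and percentage of items, but the slides do not state how the item counts are derived. A minimal sketch, assuming the common practice of allocating items in proportion to instructional time: the function name `allocate_items` and the per-topic times (3, 5, 7) are hypothetical; only the totals (15 time units, 50 items) come from the tables.

```python
def allocate_items(time_by_topic, total_items):
    """Split total_items across topics in proportion to instructional time.

    Assumes whole-number time units. Items left over after rounding down
    go to the topics with the largest remainders, so the counts always
    add up to total_items.
    """
    total_time = sum(time_by_topic.values())
    counts, remainders = {}, {}
    for topic, time in time_by_topic.items():
        # Integer division keeps the arithmetic exact.
        counts[topic], remainders[topic] = divmod(time * total_items, total_time)
    leftover = total_items - sum(counts.values())
    for topic in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        counts[topic] += 1
    return counts


# Hypothetical split of the 15 time units shown in the Total row.
time_by_topic = {"Methods of assessment": 3, "Topic 2": 5, "Topic 3": 7}
items = allocate_items(time_by_topic, total_items=50)
for topic, n in items.items():
    print(f"{topic}: {n} items ({n / 50:.0%} of items)")
```

Other allocation rules (for example, weighting by the importance of an objective) are equally possible; proportional allocation is only one option.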
Flowchart of Test Construction
3. Have the TOS approved by experts
2. Supply Test
Short Answer - uses a direct question that can be
answered by a word, a phrase, a number, or a symbol.
Completion Test - consists of an incomplete statement
3. Essay Test and the Scoring Rubrics
Restricted Response - limits the content of the response by restricting the scope of the topic.
Extended Response - allows the students to select any factual information that they think is pertinent and to organize their answers in accordance with their best judgment.
Different Methods of Assessment
Objective Supply: Short Answer, Completion Test
Objective Selection: Multiple Choice, Matching Type, True/False
Essay: Restricted Response, Extended Response
Performance-Based: Paper presentations, Projects, Demonstrations, Exhibitions, Portfolios
Oral Question: Oral Examinations, Conferences, Interviews
Observation: Informal, Formal
Self-Report: Attitude Survey, Sociometric Devices, Questionnaires, Inventories
General Suggestions in Writing Tests
• Use the test specifications as a guide to item writing.
• Construct more test items than needed, so that extra items are available when deciding which items to discard or revise.
• Make the test long enough to adequately measure the target performance (Note: other things being equal, a longer test tends to be more reliable).
• Write the test items well in advance of the testing date to allow time for face and content validation.
• Write the test items with reference to the test objectives.
Specific Suggestions: Multiple Choice
Have:
• A clear problem
• Stems that are meaningful
• A negatively stated stem only when a significant learning outcome requires it, with the negative word highlighted
• Plausible distracters
• Alternatives that are grammatically parallel to the stem
• Only one correct and clearly best answer
• Choices that are arranged alphabetically, according to value, or by length
• Stems and options that are on the same page
Avoid:
• Double negatives in the stem
• Irrelevant information in the stem
• Patterns in the answers
• Verbal clues in the stem and the correct answer
• Alternatives like “all of the above,” especially when it is the correct answer
• Alternatives like “none of the above” when there are many possible distracters to the correct answer
• Correct answers that are noticeably longer than the other alternatives
• Using multiple choice when there are better test formats for the test objectives
Specific Suggestions: Alternative-Response Test
Have:
• Meaningful items
• Simple sentences
• Only one correct and clearly best answer
• An equal or approximately equal number of items for which each choice is the correct answer
Avoid:
• Trivial statements
• Long sentences, unless cause-and-effect relationships are being measured
• Obviously negative words or double negatives in an item
• Two ideas in one statement, unless cause-effect relationships are being measured
• Opinionated ideas, unless you acknowledge the source or the ability to identify opinion is being specifically measured
Specific Suggestions: Matching Type
Have:
• An unequal number of responses and premises, and an instruction to the pupils that responses may be used once, more than once, or not at all
• A list of items to be matched that is brief
• The shorter responses at the right
• Responses arranged in logical order
• Directions indicating the basis for matching the responses and premises
• A maximum of 15 items per match
Avoid:
• Clues or patterns for the correct answer
• Different or heterogeneous items in a single exercise
• Redundant items
• Breaking the whole match into two pages
Specific Suggestions: Supply Objective Test
Have:
• Items that require a brief and specific answer or unit
• A direct question, which is generally more desirable than an incomplete statement
• Blanks for answers that are equal in length
• The answers written before the item number for easy checking
Avoid:
• Clues or patterns for the correct answer
• Statements taken directly from textbooks as a basis for short answer items
• Too many blanks in a single item
• Blanks at the beginning of the sentence
Specific Suggestions: Essay Test
Have:
• Items that target high-level thinking skills
• Questions that clearly specify the behavior of the learning outcome
• Items that all students could fairly answer regardless of their religion, gender, or social status
• A rubric for scoring the work, which is given to the students as a guide in answering the question
Avoid:
• Items that simply require recall of facts
• Items that are taken directly from textbooks
• Optional questions that vary in levels of difficulty or items
Flowchart of Test Construction
5. Validate the face and content of the item
Check if:
• It looks good;
• The guidelines in test construction were followed; and
• The target behaviors in the TOS were met.
Flowchart of Test Construction
1. Qualitative Analysis
– It is done for all test formats and kinds.
– It is done by matching items and objectives and by editing poorly written test items.
Example of Qualitative Item Analysis
Content   Objective   No. of   % of    KD   Levels of Behavior, Item Format, No. & Placement
                      Items    Items        R      U      Ap     An     E      C
Note: In a classroom achievement test, the desired difficulty index is not lower than 0.20 nor higher than 0.80.
Average is from 0.30 to 0.80.
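The note gives a desired range for the difficulty index but not how the index is computed. A minimal sketch, assuming the usual definition (the proportion of examinees in the upper and lower groups combined who answered the item correctly): the function names and the sample counts are hypothetical; the 0.20 and 0.80 limits come from the note.

```python
def difficulty_index(upper_correct, lower_correct, upper_n, lower_n):
    """Proportion of the combined upper and lower groups answering correctly."""
    return (upper_correct + lower_correct) / (upper_n + lower_n)


def within_desired_range(p, low=0.20, high=0.80):
    """True when the index is not lower than `low` nor higher than `high`."""
    return low <= p <= high


# Hypothetical item: 8 of 10 upper-group and 4 of 10 lower-group students correct.
p = difficulty_index(8, 4, 10, 10)
print(round(p, 2), within_desired_range(p))
```

An item that almost everyone answers correctly (for example 19 of 20) falls outside the range and would be flagged as too easy for a classroom achievement test.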
• Maximum discrimination is the sum of the proportions of the upper group (Ug) and the lower group (Lg) who answered the item correctly. It applies when half or less of the combined upper and lower groups answered the item correctly.
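The definition above can be sketched in code. This is an illustration under assumptions, not a definitive implementation: it assumes upper and lower groups of equal size, and it takes the discrimination index to be the difference between the two groups' proportions correct, a common definition that the slides do not spell out. Function names and sample counts are hypothetical.

```python
def discrimination_index(upper_correct, lower_correct, group_size):
    """Difference between the upper and lower groups' proportions correct."""
    return (upper_correct - lower_correct) / group_size


def maximum_discrimination(upper_correct, lower_correct, group_size):
    """Sum of the two groups' proportions correct.

    Returned only when half or less of the combined groups answered
    correctly, the case stated in the text; otherwise None, because the
    text does not give a rule for that case.
    """
    if upper_correct + lower_correct > group_size:
        return None
    return (upper_correct + lower_correct) / group_size


# Hypothetical item: 4 of 10 upper-group and 1 of 10 lower-group students correct.
print(discrimination_index(4, 1, 10))    # 0.3
print(maximum_discrimination(4, 1, 10))  # 0.5
```

In this example five students answered correctly. If all five had been in the upper group, the discrimination index would have reached 0.5, which is the sum of the two proportions.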
.70 - .79     Good for a classroom test; in the range of most. There are probably a few items which could be improved.
.60 - .69     Somewhat low. This test needs to be supplemented by other measures (e.g., more tests) to determine grades. There are probably some items which could be improved.
.50 - .59     Suggests need for revision of test, unless it is quite short (ten or fewer items). The test definitely needs to be supplemented by other measures (e.g., more tests) for grading.
Below .50     Questionable reliability. This test should not contribute heavily to the course grade, and it needs revision.
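The bands above can be turned into a simple lookup. A minimal sketch: the function name `interpret_reliability` is hypothetical, the summaries are shortened from the table, and values of .80 or higher return None because those bands do not appear in this excerpt. Treating each band as starting exactly at its lower bound is an assumption.

```python
def interpret_reliability(r):
    """Return a short interpretation of a reliability coefficient."""
    if r >= 0.80:
        return None  # bands for .80 and above are not given in this excerpt
    if r >= 0.70:
        return "Good for a classroom test"
    if r >= 0.60:
        return "Somewhat low; supplement with other measures"
    if r >= 0.50:
        return "Needs revision unless the test is quite short"
    return "Questionable reliability; needs revision"


for r in (0.75, 0.64, 0.52, 0.41):
    print(r, "->", interpret_reliability(r))
```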
Examine the results of your item analysis
Example (* marks the keyed answer)
         A    B    C*   D
Upper    1    1    2    5
Causes of Poor Test Item Distracters
         A    B    C*   D
Upper    2    3    2    3

         A    B    C    D*
Upper    4    1    1    4
Checklist in Improving Criterion-Referenced Tests
1. Does your item fail to satisfy the level of difficulty you have targeted?
2. Does the item discriminate negatively?
3. Do the distracters discriminate positively?
4. In the upper group:
– Was there any distracter chosen more frequently than the key (miskeying)?
– Do the choices have almost the same frequency (guessing)?
– Was there a distracter chosen as frequently as the key (ambiguous)?
Note:
• If you answered YES to any one of the questions, then
revise the item
• If you answered YES to at least two questions, then it
would be better to eliminate the item
• If you answered NO to all questions, then retain the item
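The upper-group checks and the decision rule in the note can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, "almost the same frequency" is taken to mean that the most and least chosen options differ by at most `tolerance` (1 by default), and each sub-question under item 4 is counted as a separate YES. The option counts are the three upper-group examples shown earlier.

```python
def upper_group_flags(counts, key, tolerance=1):
    """Apply the three upper-group checks to one item's option counts."""
    key_count = counts[key]
    distracters = [n for option, n in counts.items() if option != key]
    return {
        # A distracter chosen more often than the key.
        "miskeying": any(n > key_count for n in distracters),
        # All options chosen with almost the same frequency (assumed threshold).
        "guessing": max(counts.values()) - min(counts.values()) <= tolerance,
        # A distracter chosen exactly as often as the key.
        "ambiguous": any(n == key_count for n in distracters),
    }


def decide(yes_count):
    """Decision rule from the note: retain, revise, or eliminate."""
    if yes_count == 0:
        return "retain"
    if yes_count == 1:
        return "revise"
    return "eliminate"


examples = [
    ({"A": 1, "B": 1, "C": 2, "D": 5}, "C"),
    ({"A": 2, "B": 3, "C": 2, "D": 3}, "C"),
    ({"A": 4, "B": 1, "C": 1, "D": 4}, "D"),
]
for counts, key in examples:
    flags = upper_group_flags(counts, key)
    print(counts, flags, decide(sum(flags.values())))
```

A full review would also count the answers to questions 1 to 3 (difficulty and discrimination) before applying `decide`; they are left out here because they need lower-group data that the examples do not show.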
Test Construction According to Garcia (2008)