
Learning Episode 5:

BASICS OF ITEM ANALYSIS

To accomplish this FS episode meaningfully and successfully, be sure to read through the whole episode before participating and assisting in your FS2 Resource Teacher's class (any class modality). Take note of all the information and tasks you will need to complete before working on this episode.

At the end of this Learning Episode, I must be able to:


1. Explain the meaning of item analysis, validity, reliability, item difficulty and
discrimination index;
2. Determine the quality of a test item by its difficulty index, discrimination index
and the plausibility of its options.

(BASICS OF ITEM ANALYSIS)

What is Item Analysis?

● a process that examines student responses to individual test items to assess the
quality of the items and of the test as a whole
● valuable in improving items that will be used again in later tests and in
eliminating ambiguous or misleading items
● valuable for increasing instructors' skill in test construction, and
● useful for identifying specific areas of course content that need greater emphasis
or clarity.

Several Purposes

1. More diagnostic information on students

–Classroom level:

⮚ determine which questions most students found very difficult or guessed on
–reteach that concept
⮚ identify which questions all students got right
–don't waste more time on this area
⮚ find which wrong answers students are choosing
–identify common misconceptions

–Individual level:

⮚ isolate the specific errors a particular student made

2. Build future tests; revise test items to make them better

– know how much work goes into writing good questions
– SHOULD NOT REUSE WHOLE TESTS: diagnostic teaching means responding to the
needs of students, so over a few years a test bank is built up from which a
test can be chosen for the class
– can spread difficulty levels across your blueprint (TOS)

3. Part of continuing professional development

– doing occasional item analysis will help you become a better test writer
– it documents just how good your evaluation is
– useful for dealing with parents or administrators if there is ever a dispute
– once you can present these statistics, parents and administrators will better
understand why some students failed.

Validity. Validity is the extent to which a test measures what it purports to
measure, or the appropriateness, correctness, meaningfulness and usefulness of the
specific decisions a teacher makes based on the test results. These two definitions
of validity differ in the sense that the first refers to the test itself while the
second refers to the decisions made by the teacher based on the test results. A test
is valid when it is aligned with the learning outcome.

Reliability refers to the consistency of the scores obtained: how consistent
they are for each individual from one administration of an instrument to another and
from one set of items to another. For internal consistency, for instance, we could
use the split-half method or the Kuder-Richardson formulae (KR-20 or KR-21).

Reliability and validity are related concepts. If an instrument is unreliable, it
cannot yield valid outcomes. As reliability improves, validity may improve (or it
may not). However, if an instrument is shown scientifically to be valid, then it is
almost certain that it is also reliable.
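To make the KR-20 formula mentioned above concrete, here is a minimal Python sketch, assuming the class's answers have already been scored dichotomously (1 = correct, 0 = wrong). The function name and the sample response matrix are hypothetical illustrations, not part of the source text, and population variance is used for simplicity.

```python
# Minimal sketch of the Kuder-Richardson formula 20 (KR-20):
#   r = (k / (k - 1)) * (1 - sum(p_i * q_i) / variance_of_totals)
# where k = number of items, p_i = proportion answering item i correctly,
# and q_i = 1 - p_i.

def kr20(scores):
    """scores: one row per student; each row holds 0/1 item scores."""
    n, k = len(scores), len(scores[0])
    totals = [sum(row) for row in scores]           # total score per student
    mean = sum(totals) / n
    variance = sum((t - mean) ** 2 for t in totals) / n
    pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in scores) / n       # proportion correct on item i
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / variance)

# Hypothetical 5-student, 4-item response matrix
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
]
print(round(kr20(responses), 2))  # about 0.41 for this sample data
```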

Item Analysis: Difficulty Index and Discrimination Index

There are two important characteristics of an item that will be of interest to the
teacher. These are: (a) item difficulty and (b) discrimination index. We shall learn
how to measure these characteristics and apply our knowledge in making a decision
about the item in question.

The difficulty of an item or item difficulty is defined as the number of students
who are able to answer the item correctly divided by the total number of students.
Thus:

Item difficulty = number of students with the correct answer / total number of students

The item difficulty is usually expressed as a percentage.

Example: What is the item difficulty index of an item if 25 students are unable to
answer it correctly while 75 answered it correctly?

Here, the total number of students is 100, hence the item difficulty index is
75/100 or 75%.

Another example: 25 students answered the item correctly while 75 students did
not. The total number of students is 100, so the difficulty index is 25/100 or 25%.

This is a more difficult test item than the one with a difficulty index of 75%.

A high percentage indicates an easy item/question while a low percentage indicates
a difficult item.

One problem with this type of difficulty index is that it may not actually
indicate that the item is difficult (or easy). A student who does not know the subject
matter will naturally be unable to answer the item correctly even if the question is
easy. How do we decide on the basis of this index whether the item is too difficult or
too easy?
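For illustration, the computation reduces to a one-line function. Below is a minimal Python sketch (the function name is a hypothetical choice) that reproduces the two examples above:

```python
def difficulty_index(num_correct, num_students):
    """Item difficulty = students answering correctly / total students."""
    return num_correct / num_students

# The two examples from the text
print(difficulty_index(75, 100))  # 0.75, i.e. 75% -- an easier item
print(difficulty_index(25, 100))  # 0.25, i.e. 25% -- a harder item
```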

DIFFICULTY INDEX TABLE

The following arbitrary rule is often used in the literature:

Range of Difficulty Index    Interpretation      Action
0.00 – 0.25                  Difficult           Revise or discard
0.26 – 0.75                  Right difficulty    Retain
0.76 and above               Easy                Revise or discard

Difficult items tend to discriminate between those who know and those who
do not know the answer. Conversely, easy items cannot discriminate between these
two groups of students. We are therefore interested in deriving a measure that will
tell us whether an item can discriminate between these two groups of students. Such
a measure is called an index of discrimination.

An easy way to derive such a measure is to measure how difficult an item is with
respect to those in the upper 25% of the class and how difficult it is with respect
to those in the lower 25% of the class. If the upper 25% of the class found the item
easy yet the lower 25% found it difficult, then the item can discriminate properly
between these two groups.

Thus:

Index of discrimination = DU - DL (U = upper group; L = lower group)

Example: Obtain the index of discrimination of an item if the upper 25% of the class
had a difficulty index of 0.60 (i.e. 60% of the upper 25% got the correct
answer) while the lower 25% of the class had a difficulty index of 0.20.

Here, DU = 0.60 while DL = 0.20,

Thus, index of discrimination = .60 - .20 = .40.

The discrimination index is the difference between the proportion of the top
scorers who got an item correct and the proportion of the lowest scorers who got the
item right. The discrimination index ranges between -1 and +1. The closer the
discrimination index is to +1, the more effectively the item can discriminate or
distinguish between the two groups of students. A negative discrimination index
means that more students from the lower group got the item correct; such an item is
not good and must be discarded.
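As a sketch, the computation might look like this in Python. The function names are hypothetical, and the counts (12 of 20 in the upper group, 4 of 20 in the lower group) are invented so that they reproduce the example above (DU = 0.60, DL = 0.20):

```python
def group_difficulty(num_correct, group_size):
    """Difficulty of an item within one group (upper or lower 25%)."""
    return num_correct / group_size

def discrimination_index(du, dl):
    """Index of discrimination = DU - DL; ranges from -1.0 to +1.0."""
    return du - dl

du = group_difficulty(12, 20)  # hypothetical: 12 of 20 upper-group students correct
dl = group_difficulty(4, 20)   # hypothetical: 4 of 20 lower-group students correct
print(round(discrimination_index(du, dl), 2))  # 0.4, same as 0.60 - 0.20
```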

Theoretically, the index of discrimination can range from -1.0 (when DU = 0 and
DL = 1) to 1.0 (when DU = 1 and DL = 0). When the index of discrimination is equal
to -1, this means that all of the lower 25% of the students got the correct answer
while all of the upper 25% got the wrong answer. In a sense, such an index
discriminates correctly between the two groups, but the item itself is highly
questionable. Why should the bright ones get the wrong answer and the poor ones get
the right answer? On the other hand, if the index of discrimination is 1.0, then
this means that all of the lower 25% failed to get the correct answer while all of
the upper 25% got the correct answer. This is a perfectly discriminating item and is
the ideal item that should be included in the test.

From these discussions, let us agree to discard or revise all items that have a
negative discrimination index, for although they discriminate between the upper and
lower 25% of the class, the content of the item itself may be highly dubious or
doubtful.
DISCRIMINATION INDEX TABLE

We have the following rule of thumb:

Index Range      Interpretation                               Action
-1.0 – -0.50     Can discriminate but item is questionable    Discard
-0.49 – 0.45     Non-discriminating                           Revise
0.46 – 1.00      Discriminating item                          Include
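The two rules of thumb above translate directly into code. Here is a minimal sketch (hypothetical function names) that returns the interpretation and suggested action for a computed index:

```python
def interpret_difficulty(p):
    """Apply the difficulty index table from this episode."""
    if p <= 0.25:
        return "Difficult", "Revise or discard"
    if p <= 0.75:
        return "Right difficulty", "Retain"
    return "Easy", "Revise or discard"

def interpret_discrimination(d):
    """Apply the discrimination index rule of thumb from this episode."""
    if d <= -0.50:
        return "Can discriminate but item is questionable", "Discard"
    if d <= 0.45:
        return "Non-discriminating", "Revise"
    return "Discriminating item", "Include"

print(interpret_difficulty(0.40))      # ('Right difficulty', 'Retain')
print(interpret_discrimination(0.50))  # ('Discriminating item', 'Include')
```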

Example: Consider a multiple choice test for which the following data were obtained
for item 1:

Item 1        Options
              A     B*    C     D
Total         0     40    20    20
Upper 25%     0     15    5     0
Lower 25%     0     5     10    5
The correct response is B. Let us compute the difficulty index and index of
discrimination:

Difficulty index = no. of students getting the correct response / total = 40/100 =
40%, within the range of a "good item"

The discrimination index can similarly be computed:

DU = no. of students in upper 25% with correct response / no. of students in the
upper 25% = 15/20 = .75 or 75%

DL = no. of students in lower 25% with correct response / no. of students in the
lower 25% = 5/20 = .25 or 25%

Discrimination Index = DU - DL = .75 - .25 = .50 or 50%.

Thus, the item also has "good discriminating power."

It is also instructive to note that distracter A is not an effective distracter
since it was never selected by the students. It is an implausible distracter.
Distracters C and D appear to have good appeal as distracters. They are plausible
distracters.
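The whole worked example, including the distracter check, can be scripted. The sketch below uses the tallies from the table above; the variable names are hypothetical, and the total of 100 examinees follows the text's own computation:

```python
# Option tallies from the worked example (B is the keyed answer)
total    = {"A": 0, "B": 40, "C": 20, "D": 20}
upper_25 = {"A": 0, "B": 15, "C": 5,  "D": 0}
lower_25 = {"A": 0, "B": 5,  "C": 10, "D": 5}
key = "B"
num_students = 100   # total examinees, as used in the text
group_size = 20      # students in each 25% group in this example

difficulty = total[key] / num_students         # 40/100 = 0.40
du = upper_25[key] / group_size                # 15/20 = 0.75
dl = lower_25[key] / group_size                # 5/20  = 0.25
print(f"Difficulty index: {difficulty:.2f}")
print(f"Discrimination index: {du - dl:.2f}")  # 0.50

# A distracter no one selects is implausible and should be replaced
for option, count in total.items():
    if option != key and count == 0:
        print(f"Option {option} is an implausible distracter")
```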

Index of Difficulty
Ru + RL
P= ___________ x 100
T
Where:
Ru — The number in the upper group who answered the item correctly.
RL — The number in the lower group who answered the item correctly.
T — The total number who tried the item.

Index of Item Discriminating Power

       Ru - RL
D = ___________
         ½T

Where:
Ru — The number in the upper group who answered the item correctly.
RL — The number in the lower group who answered the item correctly.
T — The total number who tried the item.

Example: Suppose an item was tried by 20 students, and 6 of the 10 in the upper
group and 2 of the 10 in the lower group answered it correctly (Ru = 6, RL = 2,
T = 20). The index of difficulty is:

P = 8/20 x 100 = 40%

The smaller the percentage figure, the more difficult the item.

The item discriminating power is then estimated as:

D = (Ru - RL) / ½T = (6 - 2) / 10 = 0.40
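Both formulas can be wrapped in short routines. A minimal sketch (hypothetical function names) using the figures above, Ru = 6, RL = 2, T = 20:

```python
def index_of_difficulty(ru, rl, t):
    """P = (Ru + RL) / T * 100, expressed as a percentage."""
    return (ru + rl) / t * 100

def discriminating_power(ru, rl, t):
    """D = (Ru - RL) / (T / 2), expressed as a decimal fraction."""
    return (ru - rl) / (t / 2)

print(index_of_difficulty(6, 2, 20))   # 40.0 -> 40%
print(discriminating_power(6, 2, 20))  # 0.4
```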

The discriminating power of an item is reported as a decimal fraction;


maximum discriminating power is indicated by an index of 1.00.

Maximum discrimination is usually found at the 50 percent level of difficulty.

0.00 – 0.20 = Very difficult

0.21 – 0.80 = Moderately difficult

0.81 – 1.00 = Very easy

For classroom achievement tests, most test constructors desire items with
indices of difficulty no lower than 20 nor higher than 80, with an average index of
difficulty from 30 or 40 to a maximum of 60.

The INDEX OF DISCRIMINATION is the difference between the proportion of the upper
group who got an item right and the proportion of the lower group who got the item
right. This index is dependent upon the difficulty of an item. It may reach a
maximum value of 100 for an item with an index of difficulty of 50, that is, when
100% of the upper group and none of the lower group answer the item correctly. For
items of less than or greater than 50% difficulty, the index of discrimination has a
maximum value of less than 100.
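This relationship can be checked numerically. The sketch below is an illustration under a simplifying assumption (the class splits into equal upper and lower groups), assigning as many correct answers as possible to the upper group; the function name is hypothetical:

```python
def max_discrimination(p_correct, group_size=50):
    """Largest possible discrimination index for a given overall difficulty,
    assuming the class splits into equal upper and lower groups."""
    total_correct = round(2 * group_size * p_correct)
    ru = min(total_correct, group_size)  # upper group takes all it can
    rl = total_correct - ru              # the remainder falls to the lower group
    return (ru - rl) / group_size

for p in (0.10, 0.30, 0.50, 0.70, 0.90):
    print(f"difficulty {p:.0%}: max D = {max_discrimination(p):.2f}")
# -> 0.20, 0.60, 1.00, 0.60, 0.20: the maximum occurs at 50% difficulty
```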
THE CONTENTS OF THIS MATERIAL ARE ADAPTED FROM:
Navarro, R.L., Santos, R.G., & Corpuz, B.B. (2019). Assessment of Learning 1 (4th ed.). LORIMAR Publishing, Inc.
All Rights Reserved

You are expected to observe in your subject assignment how item analysis is
conducted and implemented by your respective CT in the teaching-learning process.

(Note to Student Teacher: As you participate and assist your CT in conducting item
analysis, please take note of what you are expected to give more attention to, as
asked in the next step of the Learning Episode (NOTICE).)

1. Assist your CT in conducting the item analysis of the summative test in one
grading period of the assigned class.
2. Offer your assistance to engage in the conduct of item analysis through your
CT.

NOTICE

1. Take note of:


a. Alignment of the different learning behavior or domains with the
learning outcomes based on the TOS and the results of the item
analysis.
b. The distribution of the test items in the learning domains against the
retained/discarded items in the subject as the result of the item
analysis.
c. How the percentage allocation of lower- and higher-order thinking
skills is observed and distributed in the TOS, as manifested in the item
analysis results.
1. Did the results of the conducted item analysis measure the learning
competencies of students as expected?
_______________________________________________________________
_______________________________________________________________

2. Was the item analysis conducted favorably or unfavorably in assessing
students' performance?
_______________________________________________________________
___________________________________________________
3. What would be the effect of the results of the item analysis on the
teaching-learning process and the performance of students?
_______________________________________________________________
___________________________________________________

1. How would attainment of learning outcomes be measured if item analysis


were not employed or conducted after the summative test?

__________________________________________________________
________________________________________________________
__________________________________________________________
________________________________________________________
A. Give the term described/explained.

____________1. Refers to a statistical technique that helps instructors identify the
effectiveness of their test items.

____________2. Refers to the proportion of students who got the test item correctly.

____________3. Which is the difference between the proportion of the top scorers
who got an item correct and the proportion of the bottom scorers
who got the item right?

____________4. Which one is concerned with how easy or difficult a test item is?

____________5. Which adjective describes an effective distracter?

B. Problem Solving

1. Solve for the difficulty index of each test item:

Item No.                    1     2     3     4     5
No. of Correct Responses    2     10    20    30    15
No. of Students             50    30    30    30    40
Difficulty Index

1. Which is the most difficult? The easiest?

2. Which needs revision? Which should be discarded? Why?

2. Solve for the discrimination indexes of the following test items:

Item No.                    1        2        3        4        5
                            UG  LG   UG  LG   UG  LG   UG  LG   UG  LG
No. of Correct Responses    12  20   10  20   20  10   10  24   20  5
No. of Students             25  25   25  25   25  25   25  25   25  25
Discrimination Index
1. Based on the computed discrimination index, which are good test
items?
2. Which are not good test items?

3. A multiple choice type of test has 5 options. The table below indicates the number
of examinees out of 50 who chose each option.

Option      A     B     C*    D     E
Chosen by   0     20    15    5     10

* = correct answer

1. Which options are plausible?

2. Which ones are implausible?

4. Study the following data. Compute for the difficulty index and the discrimination
index of each set of scores.

1. N = 80, number of wrong answers: upper 25% = 2, lower 25% = 9
2. N = 30, number of wrong answers: upper 25% = 1, lower 25% = 6
3. N = 50, number of wrong answers: upper 25% = 3, lower 25% = 8
4. N = 70, number of wrong answers: upper 25% = 4, lower 25% = 10

Compile the activities and techniques in conducting item analysis used by your FS
Resource Teacher in the classes you observed and were assigned to. Include your
drafts/improvements/annotations on the conduct of item analysis.

Add other activities/techniques that you have researched, e.g., how item analysis
is conducted in different learning institutions using technology and software.
OBSERVE

1. One thing that went well in the conduct of item analysis is


_______________________________________________________________
___________________________________________________
2. One thing that did not go very well in the conduct of item analysis is
_______________________________________________________________
___________________________________________________
3. One good thing observed in the conduct of item analysis is
_______________________________________________________________
___________________________________________________
4. One thing in the conduct of item analysis that needs improvement based on
what we have observed is
_______________________________________________________________
___________________________________________________

REFLECT

a. The conduct of item analysis went well because


____________________________________________________________
__________________________________________________
b. The conduct of item analysis did not go well because
____________________________________________________________
__________________________________________________

ACT

To ensure that the process in the conduct of item analysis serves its purpose
and helps in the learning process, I will learn from others' best practices
by researching on
__________________________________________________________________
______________________________________________________

PLAN

To help improve the conduct of item analysis practices and implementation, I


plan to conduct an action research on
__________________________________________________________________
______________________________________________________
GRADING RUBRIC

Learning Activities (40%)
- Excellent (50): All episodes were done with outstanding quality; work exceeds expectations.
- Above Average (40): All or nearly all episodes were done with high quality.
- Sufficient (30): Nearly all episodes were done with acceptable quality.
- Minimal (20): Few activities of the episodes were done; only few objectives were met.
- Poor (10): Episodes were not done, or objectives were not met.

Analysis of the Learning Episode (30%)
- Excellent (50): All questions/episodes were answered completely; in-depth answers thoroughly grounded on theories; exemplary grammar and spelling.
- Above Average (40): Analysis questions were answered completely; clear connections with theories; grammar and spelling are superior.
- Sufficient (30): Half of the analysis questions were answered; vaguely related to the theories; grammar and spelling acceptable.
- Minimal (20): Few parts of the analysis were answered; grammar and spelling need improvement.
- Poor (10): Analysis questions were not answered.

Reflection/Insights (10%)
- Excellent (50): Reflection statements are profound and clear; clearly supported by experiences from the learning episodes.
- Above Average (40): Reflection statements are clear but not clearly supported by experiences from the learning episodes.
- Sufficient (30): Reflection statements are good and supported by experiences from the learning episodes.
- Minimal (20): Few reflection statements contain minimal support from concrete real-life experiences relevant to the learning episodes.
- Poor (10): Reflection statements are poor, and no personal experiences were stated as relevant to the learning episodes.

Learning Portfolio (10%)
- Excellent (50): Portfolio is complete, clear, and well-organized, and all supporting documentation is located in clearly designated sections.
- Above Average (40): Portfolio is complete, clear, and well-organized, and most supporting documentation is available in logical and clearly marked locations.
- Sufficient (30): Portfolio is incomplete; supporting documentation is organized but lacking.
- Minimal (20): Few documents/proofs/evidences of the learning experiences from the learning episode are presented.
- Poor (10): No documentation or other evidence of performing the episode is presented.

Submission of Learning Episodes (10%)
- Excellent (50): Submitted before the deadline.
- Above Average (40): Submitted on the deadline.
- Sufficient (30): Submitted a day after the deadline.
- Minimal (20): Submitted two to five days after the deadline.
- Poor (10): Submitted a week or more after the deadline.

Total: 100%

COMMENT/S

Grade (Score Range)
1.0 (94-100)
1.25 (88-93)
1.5 (82-87)
1.75 (76-81)
2.0 (70-75)
2.25 (64-69)
2.5 (58-63)
2.75 (52-57)
3.0 (50-51)

Over-all Score Rating: __________ (based on transmutation)
