
Item analysis

WHY do item analysis?

Item analysis is an important process in assessment that involves evaluating the performance of individual items on a test or assessment instrument. It is a statistical method used to determine the quality of each item in terms of its ability to discriminate between high-performing and low-performing test takers, as well as its reliability and validity. There are several reasons why item analysis is done in assessment:
1. Identify problematic items: Item analysis helps to identify problematic items that may not be measuring what they are intended to measure or may be biased towards certain groups of test takers. By analyzing the performance of each item, test developers can determine which items need to be revised, reworded, or removed from the assessment.
2. Improve test quality: Item analysis can also be used to improve the overall quality of an assessment instrument. By identifying items that are too easy or too difficult for the target population, or items that have a low correlation with the overall test score, test developers can modify or replace those items to improve the reliability and validity of the test.

3. Evaluate test performance: Item analysis allows for the evaluation of the performance of the assessment instrument as a whole. By analyzing the distribution of scores, test developers can determine the overall difficulty level of the test, the extent to which it measures the intended constructs, and whether it is appropriate for the target population.
4. Enhance fairness: Item analysis can help to ensure that the assessment instrument is fair and unbiased. By analyzing the performance of each item for different subgroups of test takers (e.g., males vs. females, different ethnic groups), test developers can determine whether the test is biased towards certain groups and take steps to ensure that all test takers are evaluated fairly.

Overall, item analysis is an essential component of assessment design and development, as it helps to ensure that the assessment instrument is of high quality, reliable, valid, fair, and measures what it is intended to measure.
Item analysis is a statistical procedure used to evaluate the quality of test items in terms of their difficulty, discriminatory power, and alignment with the intended learning outcomes. It involves analyzing the performance of each individual test item to determine how well it measures the intended construct.
Difficulty Level
The difficulty level of an item is determined by
calculating the percentage of test-takers who
answered the question correctly.

• P-value refers to the percentage of test-takers who answered a particular item correctly; it is a measure of the item's difficulty level.


To calculate the P-value, you can follow these steps:

1. Determine the number of test-takers who answered the item correctly.
2. Determine the total number of test-takers who attempted the item.
3. Divide the number of test-takers who answered the item correctly by the total number of test-takers who attempted the item.
4. Multiply the result by 100 to obtain the P-value as a percentage.
For example, if 80 out of 100 test-takers answered a math problem correctly, the P-value would be calculated as follows:

P-value = (number of test-takers who answered the item correctly / total number of test-takers who attempted the item) x 100
P-value = (80/100) x 100
P-value = 80%

This means that the item had a P-value of 80%, indicating that it may be too easy and may not effectively differentiate between high-performing and low-performing students.
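As a minimal sketch, this calculation can be written in a few lines of Python (the function name p_value and the sample numbers are illustrative, not taken from any particular library):

def p_value(num_correct: int, num_attempted: int) -> float:
    # Item difficulty: percentage of test-takers who answered correctly.
    return (num_correct / num_attempted) * 100

# The worked example above: 80 of 100 test-takers answered correctly.
print(p_value(80, 100))  # 80.0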
It is important to note that the
interpretation of P-value may vary
depending on the subject area and
the specific test being analyzed.
Therefore, it is recommended to
use P-value in conjunction with
other item analysis measures,
such as the discrimination index,
to gain a more comprehensive
understanding of the quality of
test items.
Discriminatory Power

The discriminatory power of an item is the degree to which it can differentiate between high-performing and low-performing students.

• The point biserial correlation coefficient is used to determine this value. A D-value of 0.3 or higher indicates good discriminatory power.

For example, if a reading comprehension question has a high D-value, it suggests that students who perform well on the question also perform well on the overall test, while students who perform poorly on the question also perform poorly on the overall test.
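The point biserial coefficient itself can be computed with scipy's pointbiserialr function. The sketch below assumes each student's score on the item is coded 0/1 and paired with their total test score; the data are invented for illustration:

from scipy.stats import pointbiserialr

# 0/1 scores on one item and total test scores for the same ten students
# (invented data for illustration only).
item_scores = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
total_scores = [92, 88, 85, 70, 80, 65, 60, 78, 55, 50]

r, p = pointbiserialr(item_scores, total_scores)
print(round(r, 2))  # closer to 1 means stronger discriminatory power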
In item analysis, the D-value refers to the discrimination index, which is a
measure of how well an item distinguishes between high-performing and low-
performing students. To calculate the D-value, you can follow these steps:

1. Divide the test-takers into two groups: high-performing and low-performing. You can
use a criterion such as the total test score or a predetermined cutoff score to
differentiate between the two groups.
2. Determine the number of high-performing test-takers who answered the item correctly.
3. Determine the number of low-performing test-takers who answered the item correctly.
4. Calculate the percentage of high-performing test-takers who answered the item
correctly.
5. Calculate the percentage of low-performing test-takers who answered the item
correctly.
6. Subtract the percentage of low-performing test-takers who answered the item correctly
from the percentage of high-performing test-takers who answered the item correctly.
7. Divide the result by 100 to obtain the D-value.
The formula for calculating the D-value is as follows:

D-value = [(percentage of high-performing test-takers who answered the item correctly) - (percentage of low-performing test-takers who answered the item correctly)] / 100

For example, if 70% of high-performing test-takers and 30% of low-performing test-takers answered an item correctly, the D-value would be calculated as follows:

D-value = (70 - 30) / 100
D-value = 0.4

This means that the item had a D-value of 0.4, indicating that it effectively
differentiated between high-performing and low-performing students. A D-value of 0.3
or higher is generally considered acceptable, while a D-value below 0.3 may indicate
that the item needs revision or removal from the test.
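The same calculation, expressed as a small Python sketch (function and variable names are illustrative):

def d_value(high_correct: int, high_total: int,
            low_correct: int, low_total: int) -> float:
    # Discrimination index: difference between the percentages of
    # high- and low-performing test-takers answering correctly, divided by 100.
    pct_high = high_correct / high_total * 100
    pct_low = low_correct / low_total * 100
    return (pct_high - pct_low) / 100

# The worked example above: 70% of the high group vs. 30% of the low group.
print(d_value(70, 100, 30, 100))  # 0.4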
Alignment

Alignment refers to the extent to which a test item measures the intended learning outcome or construct. Items that are not aligned with the intended construct may not effectively measure student learning. Alignment can be determined by comparing the items on the test with the intended learning outcomes.
For example, if a question on an
English test asks about a grammar rule
that was not covered in class, it may
indicate that the item is not aligned
with the intended learning outcomes.
Here is a sample of alignment in item analysis for an English language assessment:

Suppose a teacher is creating a reading comprehension test to assess students' understanding of a particular text. The intended learning outcomes for this assessment are:

1. Students will be able to identify the main idea of the text.
2. Students will be able to identify supporting details within the text.
3. Students will be able to make inferences based on the text.

To ensure alignment between the test items and the intended learning outcomes, the teacher may create the following items:

1. "What is the main idea of the passage?"
2. "Which sentence in the passage provides evidence for the author's argument?"
3. "What can you infer about the character's motivations based on their actions in the passage?"
These items are aligned with the intended learning outcomes because
they directly assess the skills and knowledge specified in the learning
outcomes.

For example, Item 1 assesses the ability to identify the main idea of
the text, Item 2 assesses the ability to identify supporting details, and
Item 3 assesses the ability to make inferences based on the text.

By analyzing the alignment between test items and intended learning outcomes, educators can ensure that their assessments are effectively measuring student learning and achieving their educational goals in English language comprehension.
Here are some examples of item analysis in different subject areas:

Math Test
A math test has a question that asks students to solve an algebraic equation. The item analysis reveals that only
30% of students answered the question correctly, indicating that the item is too difficult. However, the item
also has a high discriminatory power, with a D-value of 0.8, suggesting that it is effective in differentiating
between high-performing and low-performing students.

Reading Comprehension Test
A reading comprehension test has a question that asks students to identify the main idea of a passage. The item
analysis reveals that 90% of students answered the question correctly, indicating that the item is too easy.
Additionally, the item has a low discriminatory power, with a D-value of 0.2, suggesting that it may not
effectively differentiate between high-performing and low-performing students.

Science Test
A science test has a question that asks students to identify the different types of rocks. The item analysis
reveals that the question is not aligned with the intended learning outcomes, as the test did not cover this
topic. This item may need to be revised or removed from the test.
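As a closing sketch, the judgments in these examples can be combined into a simple screening routine. The thresholds below (roughly 30% and 90% for difficulty, 0.3 for discrimination) follow the rules of thumb used in this text and are not universal standards:

def review_item(p_value: float, d_value: float) -> list:
    # Flag an item against the rules of thumb used above.
    # p_value is the percentage correct (0-100); d_value ranges from -1 to 1.
    flags = []
    if p_value >= 90:
        flags.append("too easy")
    elif p_value <= 30:
        flags.append("too difficult")
    if d_value < 0.3:
        flags.append("weak discrimination: revise or remove")
    return flags or ["acceptable"]

# The math and reading items from the examples above:
print(review_item(30, 0.8))  # ['too difficult']
print(review_item(90, 0.2))  # ['too easy', 'weak discrimination: revise or remove']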
In conclusion, item analysis is a
valuable tool in evaluating the
quality of test items and ensuring
that assessments accurately
measure student learning. By
analyzing the difficulty level,
discriminatory power, and
alignment of each item, educators
can make informed decisions about
teaching and learning, and improve
the effectiveness of future
assessments.