Item Analysis Workshop
WORKSHOP FOR NURSING FACULTY
‣We use the overall score on the exam to assess the student’s aptitude/ability
‣We want to know who understands the material and who doesn’t
‣We want to make sure that the student’s score is stable
‣Question: How do we know we are assessing the student's ability as well as we can?
‣So that you can ensure that your items are effectively evaluating student ability
‣Examples:
‣Item Difficulty
‣Item Discrimination
‣Internal Consistency
‣Differential Item Functioning
ITEM ANALYSIS PROCESS
‣How well do the questions separate the students who knew the material from those who did not? (The more such questions you have, the more precisely your exam can measure ability)
‣Item Analysis can answer many questions (“Am I able to know who has
ability and who does not”, “Am I able to get a fine-grained view on ability”, “How
consistent are scores”)
‣Many parties are concerned with knowing the answers to those questions:
‣Academic exams
‣Dean, Head of the programme
‣Quality improvement committee in the college
PRELIMINARY PREPARATION
BEFORE BEGINNING ITEM ANALYSIS
‣Rows = An individual examinee
‣Cells = Whether the examinee got the question right (1) or wrong (0)

Examinee  Q1  Q2  Q3  Q4
Ahmed      1   1   1   1
Adnan      1   1   1   0
Fatima     0   1   1   0
2) The activity sheet contains five students' responses to five different multiple choice questions
c) Each cell indicates whether the student answered the question correctly (1) or incorrectly (0)
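The spreadsheet layout described above can be sketched in code. This is a minimal illustration only; the names and answer patterns mirror the example table, and the dictionary structure is our assumption, not the workshop's actual file:

```python
# Sketch: an exam response matrix like the activity sheet.
# Rows = one examinee each; cells = correct (1) or incorrect (0).
responses = {
    "Ahmed":  [1, 1, 1, 1],
    "Adnan":  [1, 1, 1, 0],
    "Fatima": [0, 1, 1, 0],
}

for examinee, row in responses.items():
    print(examinee, row, "total correct:", sum(row))
```

Keeping the data in this row-per-examinee shape makes every later statistic (difficulty, discrimination, alpha) a simple sum or correlation over rows and columns.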
ITEM DIFFICULTY
ITEM DIFFICULTY: WHAT IS IT?
‣Examples
‣If everyone got the question right, the item is considered “easy”
‣If half the people got the question right, then the item is somewhere
between easy and hard.
ITEM DIFFICULTY: WHY DOES IT MATTER?
‣If you don’t pay attention to Item Difficulty, you don’t get a precise measure of
ability
‣If true ability has a bell-shaped distribution, then your estimated ability
should have a bell-shaped distribution
ITEM DIFFICULTY: WHY DOES IT MATTER?
‣If a question is too easy -> everyone gets the question right (e.g., Ali 100%, Hussain 100%) -> you cannot tell who is on the lower end of ability
‣If a question is too hard -> everyone gets the question wrong (e.g., Hussain 0%) -> you cannot tell who has more or less ability
‣Therefore, it is important to pay attention to item difficulty to have it be just right
ITEM DIFFICULTY: WHY DOES IT MATTER?
‣If item difficulty is too high or low, then scores will be truncated (prevents
symmetry)
‣Imagine a test with THREE questions and FIVE examinees
‣We can total up how many people got each question correct by taking the sum for each question

Examinee  Q1  Q2  Q3
Huda       1   0   1
Ali        1   0   1
Hussain    1   0   1
Ahmed      1   0   1
Noor       1   0   0
Total      5   0   4

‣Question 1 = 5
‣Question 2 = 0
‣Question 3 = 4
ITEM DIFFICULTY: EXAMPLE
‣Divide the total number of people who got the question correct by the total number of people who took the test, and you have the item's difficulty

Question  Total correct / Total test takers  Item Difficulty
Q1        5 / 5                              1.00
Q2        0 / 5                              0.00
Q3        4 / 5                              0.80
‣Item Difficulty = P / N (P = number of examinees answering the question correctly, N = total number of examinees)
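The P / N computation above can be sketched as a short function. The data reproduce the three-question, five-examinee example; the function name `item_difficulty` is ours, not from the slides:

```python
# Sketch: item difficulty = P / N, using the example response matrix above.
scores = {
    "Huda":    [1, 0, 1],
    "Ali":     [1, 0, 1],
    "Hussain": [1, 0, 1],
    "Ahmed":   [1, 0, 1],
    "Noor":    [1, 0, 0],
}

def item_difficulty(scores, question_index):
    """Proportion of examinees answering this question correctly (P / N)."""
    correct = sum(row[question_index] for row in scores.values())  # P
    return correct / len(scores)                                   # divide by N

for q in range(3):
    print(f"Q{q + 1} difficulty: {item_difficulty(scores, q):.2f}")
# Q1 -> 1.00 (too easy), Q2 -> 0.00 (too hard), Q3 -> 0.80
```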
ITEM DIFFICULTY: HOW TO INTERPRET IT
‣How to Interpret Item Difficulty
‣Think of it as “Item Easiness”
‣If value is at an ideal “sweet spot”, then your test can better separate
high ability people from low ability people (discussed in next section)
‣If Item Difficulty is too Low / Item is too Hard (< .25 or .30):
‣The item may have been miskeyed
‣The item may be too challenging relative to the overall level of ability of the
class
‣Find where people are being confused and clarify in the question
‣Not meaningful if:
‣You are assessing content mastery rather than relative standing - the recommended values (.60-.75) assume you want to assess people's ability relative to others; if you are concerned with content mastery, you want all items answered correctly
‣The test had a short time limit ("speed test") - later items only seem difficult because many examinees did not reach them
ACTIVITY
3) Think of a question you can ask your fellow attendees that would probably have a difficulty value close to your group's assigned value
   Example: If you are assigned a difficulty value of .25, what is a question you could ask that only 25% of the attendees would know?
4) Think of 4 multiple choice options to go with your question, one of which is right
5) When you are ready, have one member of the group go up to the presenter and share the group's question and answers
6) WHEN TOLD THE SURVEY IS READY BY THE PRESENTER: Complete the combined survey online (link will be provided) - skip your own question
7) Access the spreadsheet link provided, and compute the difficulty for each question
8) How close was the actual difficulty to the difficulty you were assigned?

‣Remember: Item Difficulty = percentage of people answering the question correctly
‣Larger values = Easier
‣Smaller values = Harder
ITEM DISCRIMINATION
ITEM DISCRIMINATION: WHAT IS IT?
‣People who studied get the question right, people who didn’t study get the question
wrong
ITEM DISCRIMINATION
‣Imagine a test with THREE questions and FIVE students

Examinee  Q1  Q2  Q3
Huda       1   1   1
Ali        1   1   1
Hussain    0   0   1
Ahmed      0   1   0
Noor       0   0   1
ITEM DISCRIMINATION

Examinee  Q1  Q2  Q3  Total
Huda       1   1   1  100%
Ali        1   1   1  100%
Hussain    0   0   1   33%
Ahmed      0   1   0   33%
Noor       0   0   1   33%

‣Huda and Ali did the best (perfect scores)
‣Hussain, Ahmed, and Noor did the worst (33%)
‣Make one column that has whether people got the question right (1) or wrong (0) = Question Scores
‣Make another column that has people's total score on the exam (0 to 100%) = Total Scores
‣If above .20, item is useful for describing people’s overall ability
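One common way to compute item discrimination from the two columns described above is the correlation between each item's 1/0 scores and the total scores (a point-biserial correlation). This sketch uses the five-student example; the hand-rolled `pearson` helper is ours:

```python
from math import sqrt

# Sketch: item discrimination = correlation between question scores (1/0)
# and total exam scores, for the five-student example above.
scores = {
    "Huda":    [1, 1, 1],
    "Ali":     [1, 1, 1],
    "Hussain": [0, 0, 1],
    "Ahmed":   [0, 1, 0],
    "Noor":    [0, 0, 1],
}

def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

totals = [sum(row) for row in scores.values()]       # Total Scores column
for q in range(3):
    item = [row[q] for row in scores.values()]       # Question Scores column
    print(f"Q{q + 1} discrimination: {pearson(item, totals):.2f}")
# Q1 -> 1.00, Q2 -> 0.67, Q3 -> 0.41 (all above .20, so all useful)
```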
ITEM DISCRIMINATION: DIAGNOSTICS
‣Low item discrimination is problematic: it suggests that people who know the concepts well overall were not any more likely to understand the specific concept in the question
‣Not meaningful if:
‣Partial credit is given for answers (some answers are less wrong than others)
ACTIVITY: ITEM DISCRIMINATION
1) Think to yourself about a multiple choice exam you might give in your respective field
for a specific topic
EXERCISE
3) Which of those questions, if answered correctly, would indicate that this person understands the topic well as a whole?
a) This question has good discrimination
b) Can tell you who likely has high knowledge and who has lower knowledge
‣True ability = what you actually know for the entire topic
‣If a test is 100% reliable, then the score a person receives is their true score, and they get the same score each time they retake the exam
‣If a test is not 100% reliable, then the score a person receives may be either higher or lower than their actual true score, and the next score might be different
‣If our test is unreliable, then a student’s ability is not reflected in the score received
TEST RELIABILITY: WHAT IS IT?
‣Parallel forms reliability: Consistency from one exam form and another
‣Internal consistency is how consistent the items are with the other
items
‣If you know an exam's internal consistency, then you know the worst case (a lower bound) of its reliability
Alpha       Interpretation
> .90       Excellent
.80 - .90   Good
.70 - .80   Acceptable
.60 - .70   Marginal
< .60       Poor
INTERNAL CONSISTENCY: HOW TO CALCULATE IT
‣How to Calculate Internal Consistency
‣For each question, make a column that has whether people got the question right (1) or
wrong (0)
‣Calculate the correlation between each right/wrong column and every other
right/wrong column
‣If K questions, then K * (K - 1) / 2 comparisons
‣If 20 questions, then 20 * 19 / 2 = 190 comparisons
‣If 40 questions, then 40 * 39 / 2 = 780 comparisons
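Those inter-item relationships feed into Cronbach's alpha. The slides do not give the formula, so this sketch uses the standard variance form, alpha = K / (K - 1) * (1 - sum of item variances / variance of total scores), applied to the five-student, three-question example:

```python
# Sketch: Cronbach's alpha from a 1/0 score matrix (rows = examinees).
def variance(xs):
    """Population variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(matrix):
    """alpha = K/(K-1) * (1 - sum(item variances) / variance(total scores))."""
    k = len(matrix[0])                                   # number of questions K
    items = [[row[q] for row in matrix] for q in range(k)]
    totals = [sum(row) for row in matrix]
    return (k / (k - 1)) * (1 - sum(variance(i) for i in items) / variance(totals))

matrix = [
    [1, 1, 1],   # Huda
    [1, 1, 1],   # Ali
    [0, 0, 1],   # Hussain
    [0, 1, 0],   # Ahmed
    [0, 0, 1],   # Noor
]
print(f"alpha = {cronbach_alpha(matrix):.2f}")   # -> alpha = 0.50 (poor)
```

With only three items the alpha is low, which is consistent with the test-length point below: adding more questions on the same domain raises reliability.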
Test length: The more questions on the test, the more reliable the test will be.
Average inter-item correlation: The more the questions address a single common domain, the more reliable the test will be.
- All questions pertain to the same topic area = Higher average correlation between question scores
- All questions pertain to disparate topic areas = Lower average correlation between question scores

EXERCISE
2) What additional question could we ask that would probably INCREASE the average inter-item correlation?
‣Items represent too many distinct dimensions (too many concepts being asked)
‣Excel
‣SPSS
IMPLEMENTING ITEM ANALYSIS: EXCEL
Cronbach’s alpha = Internal Consistency
IMPLEMENTING ITEM ANALYSIS: SPSS
‣You can calculate item analysis with SPSS (or other packages such as R, Minitab, and Stata)
‣At the top menu, go to Analyze -> Scale -> Reliability Analysis
‣Click “Statistics” and ask for “item,” “scale,” and “scale if item deleted” statistics
‣Click “OK”
IMPLEMENTING ITEM ANALYSIS: SPSS - STEP 1 - CHOOSE ANALYSIS
IMPLEMENTING ITEM ANALYSIS: SPSS - STEP 2 - SELECT VARIABLES
IMPLEMENTING ITEM ANALYSIS: SPSS - STEP 3 - INTERPRETATION
Cronbach’s alpha = Internal consistency
‣Items can perform poorly due to wording ambiguity, lack of ability in that
domain, miscoding, lack of conceptual relevance, instructional issues
‣Item Response Theory - estimates each person's ability while taking into account the difficulty and discrimination of the items they answered correctly or incorrectly
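As a brief illustration of how IRT combines these ideas, the widely used two-parameter logistic (2PL) model gives the probability that a person with ability theta answers an item correctly, given the item's discrimination a and difficulty b. The parameter values below are invented for illustration:

```python
from math import exp

# Sketch: 2PL item response function.
# P(correct) = 1 / (1 + exp(-a * (theta - b)))
# theta = person ability, a = item discrimination, b = item difficulty.
def p_correct(theta, a, b):
    return 1 / (1 + exp(-a * (theta - b)))

# A high-ability examinee facing a hard, discriminating item (made-up values):
print(round(p_correct(theta=2.0, a=1.5, b=1.0), 2))   # -> 0.82
# The same item for an average examinee:
print(round(p_correct(theta=0.0, a=1.5, b=1.0), 2))   # -> 0.18
```

Higher a makes the curve steeper (the item discriminates more sharply around its difficulty b), which is exactly the discrimination idea from the earlier section.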
QUESTIONS