Item Analysis & Reliability
• Facility
• Discrimination
Item difficulty
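The slides name facility (item difficulty) without giving a formula. A minimal sketch follows, assuming the usual classical definition: the proportion of candidates who answer the item correctly. The function name and the response data are hypothetical.

```python
from typing import Sequence


def facility(item_scores: Sequence[int]) -> float:
    """Proportion of candidates answering the item correctly.

    Assumes each response is coded 1 (correct) or 0 (wrong).
    """
    if not item_scores:
        raise ValueError("need at least one response")
    return sum(item_scores) / len(item_scores)


# Hypothetical responses from ten candidates to a single item.
responses = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
item_facility = facility(responses)
print(f"facility = {item_facility:.2f}")
```

A value near 1 indicates an easy item, a value near 0 a hard one.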
Correlation
• Spearman
• Pearson
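The two coefficients named above can be sketched in plain Python. This is an illustration under stated assumptions, not the SPSS procedure used for the tables in these slides: Pearson is computed from deviations about the means, and Spearman is taken as Pearson applied to average ranks. The data values are invented.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical scores for six pupils (not the data behind the slides).
iq = [85, 95, 100, 110, 120, 130]
attitude = [4, 9, 7, 12, 11, 18]
print(f"Pearson r    = {pearson(iq, attitude):.3f}")
print(f"Spearman rho = {spearman(iq, attitude):.3f}")
```

Spearman depends only on rank order, so a perfectly monotonic but curved relationship gives rho = 1 while Pearson r stays below 1.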
IQ and attitude to school
[Figure: scatterplot of IQ score (y-axis) against attitude to school (x-axis)]
Correlation
Correlations
                                          IQ score   attitude to school
IQ score             Pearson Correlation   1.000      .564**
                     Sig. (2-tailed)       .          .000
                     N                     40         40
attitude to school   Pearson Correlation   .564**     1.000
                     Sig. (2-tailed)       .000       .
                     N                     40         40
**. Correlation is significant at the 0.01 level (2-tailed).
No Correlation
[Figure: scatterplot of random nos 2 (y-axis) against random nos 1 (x-axis)]
Strong correlation
Correlations
                                IQ score   NEWIQ
IQ score   Pearson Correlation   1.000      .995**
           Sig. (2-tailed)       .          .000
           N                     40         40
NEWIQ      Pearson Correlation   .995**     1.000
           Sig. (2-tailed)       .000       .
           N                     40         40
**. Correlation is significant at the 0.01 level (2-tailed).
[Figure: scatterplot of IQ score (y-axis) against NEWIQ (x-axis)]
Discrimination
Item-Total Statistics
N in a 27% group
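A sketch of a discrimination index based on upper and lower 27% groups, as the heading above suggests. It assumes the common upper-minus-lower definition: candidates are ranked by total score, and the index is the difference in the number of correct answers between the top and bottom groups, divided by the group size. The function name, the rounding of the group size and the data are assumptions for illustration.

```python
from typing import Sequence


def discrimination_index(
    item: Sequence[int], totals: Sequence[float], fraction: float = 0.27
) -> float:
    """Upper-lower discrimination index for one item.

    item   : 1/0 (correct/wrong) response per candidate
    totals : total test score per candidate, in the same order
    """
    n = len(totals)
    if n != len(item) or n < 2:
        raise ValueError("need matching item and total scores for >= 2 candidates")
    group_size = max(1, round(fraction * n))  # N in a 27% group (assumed rounding)
    order = sorted(range(n), key=lambda i: totals[i], reverse=True)
    upper = order[:group_size]
    lower = order[-group_size:]
    correct_upper = sum(item[i] for i in upper)
    correct_lower = sum(item[i] for i in lower)
    return (correct_upper - correct_lower) / group_size


# Hypothetical data for ten candidates.
totals = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
item = [1, 1, 1, 1, 0, 1, 0, 1, 0, 0]
d = discrimination_index(item, totals)
reject = d < 0.20  # threshold taken from the "<0.20 REJECT" rule in these slides
print(f"discrimination = {d:.2f}, reject = {reject}")
```

With ten candidates each group holds three people, so the index here is (3 - 1) / 3.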
What makes a good question?
                 FACILITY
Discrimination   Below 40%   40%-60%   Above 60%
<0.20            REJECT
Scale qualities
Reliability
• the extent to which the scores on the test
are measured consistently
– Parallel-form reliability
– Split-half reliability
– Internal consistency reliability
– Test-retest reliability
– Inter-rater reliability
• Parallel-form reliability
Correlations
                           A1       A2
A1   Pearson Correlation   1        .721**
     Sig. (2-tailed)       .        .000
     N                     28       25
A2   Pearson Correlation   .721**   1
     Sig. (2-tailed)       .000     .
     N                     25       25
**. Correlation is significant at the 0.01 level (2-tailed).
• Split-half reliability
• Spearman-Brown formula
Correlations
                             ODD      EVEN
ODD    Pearson Correlation   1        .807**
       Sig. (2-tailed)       .        .000
       N                     28       28
EVEN   Pearson Correlation   .807**   1
       Sig. (2-tailed)       .000     .
       N                     28       28
**. Correlation is significant at the 0.01 level (2-tailed).
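The Spearman-Brown formula steps up the correlation between two half-tests to an estimate of full-length reliability. A minimal sketch, assuming the standard form r_full = n·r / (1 + (n − 1)·r) with n = 2 for a split-half design, applied to the ODD/EVEN correlation of .807 reported above. The function and variable names are hypothetical.

```python
def spearman_brown(r_half: float, length_factor: float = 2.0) -> float:
    """Predicted reliability when a test is lengthened by `length_factor`.

    With length_factor = 2 this converts a split-half correlation into an
    estimate of the reliability of the full test.
    """
    return length_factor * r_half / (1 + (length_factor - 1) * r_half)


# ODD/EVEN half-test correlation from the table above.
full_test_reliability = spearman_brown(0.807)
print(f"estimated full-test reliability = {full_test_reliability:.3f}")
```

For r = .807 the estimate is approximately 0.893, higher than the half-test correlation because the full test has twice as many items.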
****** Method 1 (space saver) will be used for this analysis ******
RELIABILITY ANALYSIS - SCALE (SPLIT)
Reliability Coefficients
N of Cases = 28.0
Statistics for   Mean     Variance   Std Dev   N of Variables
Scale            3.2500   10.4167    3.2275    10
Item-total Statistics
     A     B     C     D     E
A    1     0.8   0.9   0.1   0.2
B          1     0.6   0.3   0.1
C                1     0.3   0.1
D                      1     0.9
E                            1
Factor structure of the scale
investigated above - Alpha = 0.88
Rotated Component Matrix(a)
            Component
            1       2       3
ITE0005     .897
ITE0010     .873
ITE0003     .873
ITE0006     .724
ITE0009     .661
ITE0008             .929
ITE0007             .750
ITE0004
ITE0002                     .895
ITE0001                     .731
Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 4 iterations.
Links between question difficulty and scale reliability
[Figure: plot of qu2 against qu1]
Negative alphas
Are unusual but can be the result of
• The problem of questions that are too easy
• Errors in coding
• Sampling error
OR
• The questions really don’t measure the same thing (right answers to some go with wrong answers to others)
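The last cause listed above can be illustrated with a small calculation. The sketch assumes "alpha" here means Cronbach's alpha in its usual form, alpha = k/(k − 1) · (1 − sum of item variances / variance of total scores). The slides do not state the formula, and both data sets are invented.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; each row holds one candidate's item scores."""
    if len(rows) < 2 or len(rows[0]) < 2:
        raise ValueError("need at least 2 candidates and 2 items")
    k = len(rows[0])
    items = list(zip(*rows))  # transpose: one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical 3-item scale where right answers tend to go together.
consistent = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0], [1, 1, 1], [0, 0, 0]]

# Hypothetical 2-item scale where a right answer to one item mostly
# goes with a wrong answer to the other.
opposed = [[1, 0], [0, 1], [1, 0], [0, 1], [1, 1], [0, 0]]

alpha_consistent = cronbach_alpha(consistent)
alpha_opposed = cronbach_alpha(opposed)
print(f"alpha (consistent items) = {alpha_consistent:.2f}")
print(f"alpha (opposed items)    = {alpha_opposed:.2f}")
```

When items pull in opposite directions, the item variances add up to more than the variance of the total score, so the bracketed term and therefore alpha become negative.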
• Classical Test Theory
• http://www.rasch.org/memo42.htm
• http://www.rasch.org/memo62.htm
Validity
• Does the test measure what it sets out to
measure?
• Concurrent validity
• Discriminant validity
• Predictive validity
• A simple design:
• OXO
Issues of validity at the design
stage – experimental designs
Internal threats to validity
• History
• Maturation
• Testing
• Instrumentation
• Selection
• Statistical regression
• Mortality
External threats to validity
• Interaction of selection bias and
treatment
• Interaction between testing and
treatment
• Reaction to being in an experiment
Hawthorne Effect
• Treatment group do better because they
are in a privileged group – or do they?
“Like other hallowed but unproven concepts in psychology, the so-called Hawthorne
effect has a life of its own.”
By Berkeley Rice
http://www.cs.unc.edu/~stotts/204/nohawth.html
Compensatory effect
John Henry Effect
• Control group do better because they
are not going to let the ‘smarties’ get the
better of them
True experiment
R   X   Oe   (experimental group)
R   -   Oc   (control group)
Quasi experiments – no R
O   X   O
O       O
Mixed methodology
• Use quasi experimental designs to reveal
possible consequences of actions
• Use interpretative designs to check the
causal relationship between the actions
and the consequences – from several
perspectives
– (including enquiring about the known threats
to validity – history, maturation etc)
Another view of mixed
methodologies
Linking qualitative and quantitative data
[Figure: scatterplot of courses by Shared Control (y-axis) against Student Negotiation (x-axis); labelled points include Connect2, ADMPA, GNVQBus, Pth4 Prn, Engneer and ASPsych]
Explaining the High SC/Low SN
grouping
Support for Mature Students
• Self-assessment, negotiation based on
assignments, individual learning plans
agreed and reviewed by tutor and student
Work-based assessment
• Individual support from tutor (underground
working)
• Different geographical placements
[Figure: path diagram; labels: Exogenous variable variance, IQ, Mediator var, mot, Direct effects – path coeff, disturbances; values shown: .5, .4, .3, .9]
Structural equation modelling
http://www2.chass.ncsu.edu/garson/PA765/structur.htm
One indicator per latent variable: SEM = PA (path analysis)
No dependent variables: SEM = CFA (confirmatory factor analysis)
Multilevel Modelling
• Bennett (1976), Teaching Styles and Pupil
Progress
– Children taught in a formal style did better