MEASUREMENT

RELIABILITY AND VALIDITY

GLENDA P. DE VERA MAN, RN


RESEARCH MENTOR
MEASUREMENT
• The assignment of numbers to represent the amount of an attribute present in an object or person, using a specified set of rules

• PURPOSE:
◦ To differentiate between people or objects that possess varying degrees of a critical attribute
◦ e.g. temperature, weight, BP


MEASUREMENT - PURPOSES

• Data collected should specify under what conditions or criteria the numeric values are assigned to the characteristics of interest.

• e.g. attitudes rated from strongly agree to strongly disagree
ADVANTAGES OF MEASUREMENT

• Removes subjectivity and guesswork

• Independently verifiable

• Language of communication
◦ accurate information (numbers)
SCALES (LEVELS) OF MEASUREMENT

• Scales clarify the characteristics of measurement processes

• Scales indicate which statistical procedures are appropriate
SCALES (LEVELS) OF MEASUREMENT
Nominal
• Categories without order
• Examples: colors, gender, political party, nationality
Ordinal
• Categories with order
• Examples: size (S, M, L), social class, agreement (strong, some, low, none)
Interval
• Distance between categories is meaningful
• Examples: temperature, shoe size, IQ
Ratio
• Scale has an absolute zero
• Examples: scores, age, income, all rates and percentages, vacation time
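
The point that the scale indicates which statistical procedures are appropriate can be illustrated with a small sketch. It assumes a common textbook convention that the slides do not spell out: the mode for nominal data, the median for ordinal data, and the mean for interval or ratio data. The function name `central_tendency` and all sample values are hypothetical.

```python
from statistics import mean, median_low, mode


def central_tendency(values, level):
    """Return a summary statistic conventionally matched to the level of measurement."""
    if level == "nominal":
        return mode(values)        # most frequent category; order is meaningless
    if level == "ordinal":
        return median_low(values)  # middle rank; always one of the observed values
    if level in ("interval", "ratio"):
        return mean(values)        # distances are meaningful, so averaging is allowed
    raise ValueError(f"unknown level of measurement: {level!r}")


# Hypothetical data, one list per level.
party = ["A", "B", "A", "C", "A"]   # nominal: category labels
agreement_ranks = [1, 2, 2, 3, 4]   # ordinal: 1 = none ... 4 = strong
temperatures_c = [36.5, 37.0, 38.0] # interval
incomes = [50, 60, 70]              # ratio

party_summary = central_tendency(party, "nominal")
agreement_summary = central_tendency(agreement_ranks, "ordinal")
temperature_summary = central_tendency(temperatures_c, "interval")
income_summary = central_tendency(incomes, "ratio")
```

The ordinal example stores categories as rank codes so that a middle value can be found; taking a mean of those codes would treat the gaps between ranks as equal, which an ordinal scale does not guarantee.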
RELIABILITY
Does my measurement procedure give the same accurate measurement each time it is used?
WHAT IS RELIABILITY?
• Reliability is consistency in measurement

◦ Does this procedure or test yield the same results if you repeat the measurement, so long as conditions have not changed?
◦ How do we know?

Image: Stern Tone Variator, from The Archives of the History of American Psychology
WHAT IS RELIABILITY?
• The less variation an instrument produces in repeated measurements, the higher its reliability

• Equated with stability, consistency, or dependability
RELIABILITY - STABILITY
• STABILITY: the extent to which similar results are obtained on two separate administrations
• TEST-RETEST: the simplest way to assess reliability
• Can be used whenever the process of measuring will not, by itself, affect the data
• Done with a RELIABILITY COEFFICIENT
RELIABILITY - STABILITY
• CORRELATION COEFFICIENT: a tool to describe the magnitude and direction of a relationship between two variables

• Ranges from -1.00 (perfect negative relationship) through zero to +1.00 (perfect positive relationship)
• For reliability coefficients, the usual range is zero to +1.00
• The higher the coefficient, the more stable the measure
◦ .70 or higher is considered satisfactory
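
As a rough illustration of a test-retest reliability coefficient, the sketch below computes the Pearson correlation between two administrations of the same test. It assumes the coefficient meant here is Pearson's r, which the slides do not state explicitly. The helper `pearson_r` and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for five people on two administrations of the same test.
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 12, 14, 19, 21]

r = pearson_r(first_administration, second_administration)
is_satisfactory = r >= 0.70  # the .70 threshold comes from the slide above
```

A coefficient near +1.00 means people kept roughly the same relative standing across the two administrations, which is what stability refers to.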
RELIABILITY - STABILITY
• Positive relationship
e.g. the relationship between height and weight
- an increase in height tends to be associated with an increase in weight
❖ in a perfect positive relationship, tallest person = heaviest in weight

• Negative or inverse relationship
▪ an increase in one variable is associated with a decrease in the second variable
❖ in a perfect negative relationship, shortest person = heaviest in weight
RELIABILITY - STABILITY
• Perfect relationship

❖ Tallest person = heaviest in weight
❖ Second tallest person = second heaviest
RELIABILITY - INTERNAL CONSISTENCY
• Refers to the extent to which all of the instrument’s items measure the same attribute (psychosocial instruments)

• Need to verify that all the items relate to the same dimension

- Use the split-half reliability technique OR Cronbach’s alpha method
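
The slides name Cronbach's alpha without giving a formula. The sketch below uses the standard textbook form, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    items = list(zip(*responses))                       # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: four respondents, three items on the same scale.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
]

alpha = cronbach_alpha(responses)
```

When respondents who score high on one item also score high on the others, the variance of the total scores is large relative to the item variances and alpha approaches 1.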
RELIABILITY - INTERNAL CONSISTENCY
SPLIT-HALF RELIABILITY technique

✓ A measure of consistency in which a test is split in two and the scores for each half of the test are compared with one another.

❖ If the two halves are consistent, the experimenter can infer that the items are most likely measuring the same thing.
• The correlation coefficient for scores on the two half-tests gives an estimate of the scale’s internal consistency

• If scores on the odd-numbered items closely match scores on the even-numbered items, the reliability coefficient is high
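
A minimal sketch of the odd/even split described above: each respondent's odd-numbered and even-numbered items are totalled separately, and the two sets of totals are correlated. It assumes Pearson's r as the correlation coefficient. The names `pearson_r` and `split_half_r` and the data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def split_half_r(responses):
    """Correlate each respondent's odd-item total with their even-item total."""
    odd_totals = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)


# Hypothetical responses: five respondents, four items.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 2, 1],
    [3, 3, 4, 3],
]

half_r = split_half_r(responses)
```

This correlation describes a test half as long as the real one. A Spearman-Brown adjustment is commonly applied to estimate full-length reliability; it is not covered in these slides and is left out of the sketch.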
RELIABILITY - EQUIVALENCE
• Focuses on observers’ rating or coding of behavior

• Estimates of interrater (interobserver) reliability are obtained

• The observation process is skilled and requires individual judgment.

• Trained researchers make observations of the same subject independently.
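
One simple way to estimate interrater reliability is the proportion of observations on which two raters assign the same code. The slides do not say which index to use, so percent agreement is an assumption here. The function name `percent_agreement`, the behavior codes, and the ratings are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Proportion of observations on which two raters assigned the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty, equal-length lists of codes")
    agreements = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return agreements / len(rater_a)


# Hypothetical codes from two trained observers rating the same ten episodes.
rater_a = ["calm", "agitated", "calm", "calm", "agitated",
           "calm", "agitated", "calm", "calm", "calm"]
rater_b = ["calm", "agitated", "calm", "agitated", "agitated",
           "calm", "calm", "calm", "calm", "calm"]

agreement = percent_agreement(rater_a, rater_b)  # raters differ on episodes 4 and 7
```

Percent agreement does not correct for agreement that would occur by chance, so it tends to look more favorable when one code dominates.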
SOURCES OF UNRELIABILITY
• Meaning of questions is unclear or produces random answers.
• Raters are not adequately trained in the rating method.
• Some of the questions or items measure a different dimension; they don’t “go with” the others.
• Instructions may be unclear or inconsistent, even if the test questions are fine.
• Outside events may be having an effect.
VALIDITY
Does my measurement procedure give a measurement of the construct or variable that I intend?
OR
Is it measuring something else?
VALIDITY
✓ The degree to which an instrument measures what it is supposed to measure

4 CRITERIA
✓ FACE VALIDITY
✓ CONTENT VALIDITY
✓ CRITERION-RELATED VALIDITY
✓ CONSTRUCT VALIDITY
“FACE” VALIDITY

◦ Whether the instrument appears, on its face, to measure the appropriate construct.

◦ May relate to the overall appearance of the instrument

Image: Manuscripts and checklists by Muffett at http://www.flickr.com/photos/calliope/173797447/
CONTENT VALIDITY
• Determines whether the instrument has an appropriate sample of items for the construct being measured (applies to measures of feelings, psychological traits, and cognitive measures)
◦ Are all aspects of the dimension or concept covered?
◦ Are any aspects over- or under-emphasized?
• To improve:
◦ Thorough search of the literature
◦ Consult with experts who disagree with your perspective
CRITERION VALIDITY
▪ Determines the relationship between the instrument and an external criterion

Image: GRE flashcards by NEPMET at http://www.flickr.com/photos/blahman/2168064272/
CRITERION VALIDITY
▪ Concurrent validity: refers to the instrument’s ability to distinguish individuals who differ on a present criterion.
◦ A music audition is a valid measure if it selects the better players over those with less ability

➢ Predictive validity: refers to the adequacy of the instrument in differentiating performance on some future criterion.
◦ The GRE is a valid measure if people who do well on the GRE succeed in graduate school
CONSTRUCT VALIDITY
• The instrument’s adequacy in measuring the focal construct
“Does it adequately measure the abstract concept of interest?”

The construct itself might be socially constructed
◦ classic studies of obedience are being re-interpreted.
Reliable, but not valid

• Reliable: the pattern shows that the shot hits the same part of the target each time; it is consistent, so it is reliable.

• Not valid: the goal is to hit the center of the target, but the shots are not in that area.
Valid, but not reliable
• Valid because the pattern is evenly distributed around the correct goal (center): the person probably tried to hit the correct place.
• Not reliable because the shots are off the mark in every possible direction; they are not consistent.
Neither reliable nor valid
• Not reliable because the shots are not tightly clustered together; they are not consistent.

• Not valid because, to the extent there is any pattern, it is not at the true target, the center.
Both reliable and valid
• Reliable: the darts land close together. The red player can reliably hit the same part of the target.

• Valid: the darts are clustered at the center, where they were aimed.

Image: Bullseye!! by modenadude at http://www.flickr.com/photos/modenadude/3280286776
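
The four target patterns can be described with two numbers: how far the midpoint of the shots lies from the center (the validity side of the analogy) and how far the shots scatter around their own midpoint (the reliability side). The sketch below is one hypothetical way to compute both. The function name `offset_and_spread` and all coordinates are made up for illustration, and the slides do not define any cut-off values.

```python
from math import hypot
from statistics import mean


def offset_and_spread(shots, center=(0.0, 0.0)):
    """Return (distance of the shots' midpoint from center, mean scatter around the midpoint)."""
    mid_x = mean(x for x, _ in shots)
    mid_y = mean(y for _, y in shots)
    offset = hypot(mid_x - center[0], mid_y - center[1])          # small = valid
    spread = mean(hypot(x - mid_x, y - mid_y) for x, y in shots)  # small = reliable
    return offset, spread


# Hypothetical shot coordinates; the target center is at (0, 0).
reliable_not_valid = [(5.0, 5.0), (5.2, 4.9), (4.9, 5.1), (5.1, 5.0)]
valid_not_reliable = [(4.0, 0.0), (-4.0, 0.0), (0.0, 4.0), (0.0, -4.0)]
neither = [(9.0, 5.0), (1.0, 5.0), (5.0, 9.0), (5.0, 1.0)]
reliable_and_valid = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]

patterns = {
    "reliable, not valid": offset_and_spread(reliable_not_valid),
    "valid, not reliable": offset_and_spread(valid_not_reliable),
    "neither": offset_and_spread(neither),
    "reliable and valid": offset_and_spread(reliable_and_valid),
}
```

A tight cluster far from the center gives a large offset with a small spread, while shots scattered evenly around the center give a small offset with a large spread.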


WHY DO RELIABILITY AND VALIDITY MATTER?
• All of our research uses data.
◦ Data are gathered through measurement procedures
◦ The scores only have meaning if they measure what they are supposed to measure (validity) and do so with accuracy and consistency (reliability).

• Evaluating whether data are reliable and valid is a key element in applying research findings.
