Fraenkel8 SMA ch08

Chapter 8: Validity and Reliability

Activity 8.1: Instrument Validity

Activity 8.2: Instrument Reliability (1)

Activity 8.3: Instrument Reliability (2)

Activity 8.4: What Kind of Evidence: Content-Related, Criterion-Related, or Construct-Related?

Activity 8.5: What Constitutes Construct-Related Evidence of Validity?

Activity 8.1:
Instrument Validity

A valid instrument is one that measures what it says it measures. If a researcher is
interested in measuring how much a student knows about the U.S. Civil War, for
example, he or she needs an instrument that will measure exactly that -- the student’s
knowledge -- not his or her feelings, attitudes, beliefs, or skills. For each of the three
objectives listed below, write one example of the kind of question or observation you
might engage in to measure, at least to some extent, attainment of the objective.

1. Objective: To measure the degree to which a person enjoys modern art

Instrument question or observation strategy:

• use a rating scale to have a person rate (on a scale of 1-low, to 5-high) various
types of paintings (modern and other)
• interview a person in depth about his or her feelings about modern art

2. Objective: To measure the level of anxiety that exists among university students
during final exam period

Instrument question or observation strategy:

• use a rating scale to have each student rate (on a scale of 1-low, to 5-high) his or
her level of anxiety during final exam period
• interview university students in depth about their feelings of anxiety during
final exam period

3. Objective: To measure the attitudes of local residents toward the building of a new
ballpark in downtown San Francisco

Instrument question or observation strategy:

• mail out a questionnaire to a randomly selected sample of residents in which you
ask them to respond to questions about the building of the ballpark
• interview a random sample of residents, asking questions concerning their feelings
about the building of the ballpark
Activity 8.2:
Instrument Reliability (1)

A reliable instrument is one that is consistent in what it measures. If an individual scores
highly on the first administration of a test, for example, he or she should, if the test is
reliable, score highly on a second administration. In this activity, you are going to
evaluate the reliability of an instrument.
Imagine that you are conducting a study for which you must develop a mastery
test in mathematics for ninth-grade students. You develop a 30-point test and distribute it
to a class of 13 ninth-graders in a certain school district on the west coast of the United
States in May of 2004. You then give the test again one month later to the day in June,
2004. The scores of the students on the two administrations of the test are shown below.
Plot each pair of scores on the scatterplot started below. We have entered “A’s” scores to
get you started. What do they suggest to you about the reliability of this instrument? Explain.



Student   First administration   Second administration
A         17                     15
B         22                     18
C         25                     21
D         12                     15
E         7                      14
F         28                     27
G         27                     24
H         8                      5
I         21                     25
J         24                     21
K         27                     27
L         21                     19
M         10                     15
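
Alongside the scatterplot, one common way to summarize how closely the two sets of scores agree is a test-retest correlation coefficient. The sketch below is only an illustration, not part of the activity as written. It assumes the Pearson correlation is an acceptable summary here, and the names `first`, `second`, and `pearson_r` are introduced purely for this example.

```python
from math import sqrt

# Scores from the two administrations listed above (students A through M).
first = [17, 22, 25, 12, 7, 28, 27, 8, 21, 24, 27, 21, 10]
second = [15, 18, 21, 15, 14, 27, 24, 5, 25, 21, 27, 19, 15]


def pearson_r(x, y):
    """Return the Pearson correlation coefficient for two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    r = pearson_r(first, second)
    print(f"test-retest correlation: r = {r:.2f}")
```

A coefficient close to 1 would suggest that students who scored highly in May also tended to score highly in June. With only 13 students, any such figure should be read cautiously and together with the scatterplot.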
Activity 8.3:
Instrument Reliability (2)

For each of the situations listed below, match the type of reliability with what the
researchers involved are evaluating.

Column B: Instrument

a. internal consistency
b. test-retest reliability
c. equivalent forms reliability
d. none of the above

Column A: Situation

1. __c__ A researcher develops two versions of a test meant to measure interests in
students prior to their taking an examination. He gives one version of the test to a
group of college sophomores on a Monday, and the other version of the test to them
the next

2. __d__ A teacher develops a new test for high school biology. She gives the test twice,
once to the students in her morning class and once to the students in her afternoon
class. She then compares the scores for the two classes of students.

3. __a__ A college professor is interested in evaluating her end-of-semester course
evaluations that are completed by her students. The instrument consists of 20 five-point
rating scale items. She obtains an average score for each student on the first 10 items,
and also an average score for each student on the second 10 items. She then compares
the scores.

4. __d__ A researcher prepares a 15-item multiple-choice test designed to measure
student knowledge of the causes of the Spanish-American War. She asks two of her
colleagues, specialists in American history, to identify any items that they think do not
measure what she is after.

5. __b__ A teacher prepares an algebra test and gives it to her students at the end of the
semester and again two months later.

6. __c__ Which of the following would be a way of assessing unreliability due to content
and time?
a. Administering a reading test (Form X) on Monday and again one month later
b. Administering a reading test (Form X) on Monday and calculating a split-half
c. Administering a reading test (Form X) on Monday and Form Y one month later
d. Administering a reading test (Form X) on Monday and deleting any questions that
more than 50 percent of those taking the test missed
Activity 8.4:
What Kind of Evidence: Content-Related, Criterion-Related or Construct-Related?

As we mention in the text, validity depends on the amount and type of evidence
there is to support one’s interpretations concerning data that has been collected. In
Chapter Eight, we describe three kinds of evidence that a researcher might collect:
content-related, criterion-related, and construct-related evidence of validity.
Listed below are a number of questions that each represent one of these three
types. In the space provided, write content if the question refers to content-related
evidence, criterion if the question refers to criterion-related evidence, and construct if
the question refers to construct-related evidence of validity.

1. How strong is the relationship between student scores obtained using this
instrument and their teacher’s rating of their ability? __ criterion-related __

2. How adequately do the questions in the instrument represent that which is being
measured? __ content-related _____

3. Do the items that the instrument contains logically get at that which is being
measured? __ content-related ___

4. Are there a variety of different types of evidence (e.g., test scores, teacher ratings,
correlations, etc.) that all measure this variable? __ construct-related __

5. How well do the scores obtained using this instrument predict future
performance? __ criterion-related _

6. Is the format of the instrument appropriate? __ content-related ___

Activity 8.5:
What Constitutes Construct-Related Evidence of Validity?

On page 156 of the text, we provide an example of one piece of evidence that could be
used to establish construct validity for a pencil and paper test on honesty.

1. In the space provided below, after discussing this with a partner, suggest some
additional information that a researcher might collect as evidence of honesty in an
effort to establish construct validity for the test.








2. What about interest in the subject of chemistry? Suppose another researcher
wishes to develop a test to measure an individual’s interest in chemistry. What
sort of information might he or she collect in an attempt to establish construct
validity for the test?







Problem Sheet 8: Validity and Reliability

1. If you plan to use an existing instrument, describe what you have learned about the
validity and reliability of scores obtained with this instrument.

2. If you plan to develop an instrument, explain how you will try to ensure the validity
and reliability of results obtained with this instrument by using one or more of the
tips described on page 114 (specify which):

3. If you have not already indicated so above for each instrument that you plan to use,
tell specifically how you will check for:

a. internal consistency:_______________________________________________

b. stability (reliability over time): _______________________________________


c. validity: _________________________________________________________
