
A Guide for Instrument Development and Validation

(instrument development, occupational therapy, test construction)

Jeri Benson Florence Clark

As occupational therapists become increasingly concerned with accountability, the paucity of adequate instrumentation available for documenting therapeutic effectiveness surfaces as a major problem. Therapists will need to construct new or refine existing instruments to satisfy the requirements of third-party payment. The purpose of this paper is to illustrate how a new instrument is planned, developed, and validated. A sequential step-by-step process is illustrated with a flowchart and applied in the hypothetical construction of an attitude scale to assess school administrators' valuing of the role of occupational therapists in the schools. This example is provided to show how general psychometric principles are applied within an occupational therapy context.

Jeri Benson, Ph.D., is Assistant Professor, School of Education, Department of Occupational Therapy, University of Southern California, Los Angeles, California.

Florence Clark, Ph.D., OTR, FAOTA, is Assistant Professor, Department of Occupational Therapy, University of Southern California, Los Angeles, California.

In 1976, Hasselkus and Safrit (1) claimed that too often principles of measurement theory were not applied in the construction of the

The American Journal of Occupational Therapy 789


Downloaded From: http://ajot.aota.org/pdfaccess.ashx?url=/data/journals/ajot/930542/ on 09/19/2018 Terms of Use: http://AOTA.org/terms
Figure 1
Flowchart for Instrument Development

[Flowchart not reproduced. Phase I, Planning, and Phase II, Construction, cover steps (1) through (6); box labels: Review Literature on Construct or Variable of Interest; Develop Table of Specifications; Develop New or Revise Items. Phase III, Quantitative Evaluation, and Phase IV, Validation, cover steps (7) through (13); box labels: Prepare Instrument for First Pilot Testing; First Pilot Administration; Revise Instrument, Prepare for Second Pilot Testing; Second Pilot Administration; Begin Validation.]

measurement tools occupational therapists used to assess the existing and potential functional capabilities of clients. Assuming that this situation was caused in part by a lack of familiarity with measurement theory, these authors proceeded to summarize basic measurement theory and selected procedures for estimating reliability and validity within an occupational therapy context. The purpose in publishing their article was to enable therapists to construct valid and reliable assessments capable of yielding data of scientific value.

Since then, a number of instruments have been designed by occupational therapists as clinical or research tools possessing varying degrees of adherence to the requirements for adequate instrumentation (2-10). Not only does the construction of these instruments signify therapists' dissatisfaction with depending exclusively on subjective clinical judgment, but it also testifies to their increased sophistication regarding measurement concepts. To avoid becoming overly optimistic and so as not to exaggerate the extent to which this trend has had an impact upon the larger occupational therapy community, it is important to recognize that many of these instruments were constructed by graduate students in conjunction with requirements for the master's degree. The students probably had access to the consultation of psychometricians or other faculty with expertise in measurement. Other therapists who do not have this access may need to develop instruments that are scientifically sound, but may be precluded from doing so because they do not know the steps necessary to proceed systematically and correctly with test development and validation.

790 December 1982, Volume 36, Number 12

Numerous textbooks on measurement and research design (11-13) include chapters on instrument reliability, validity, and item analysis; summaries of this content have been published in AJOT (14, 15). However, no one source provides a step-by-step guide to instrument development and validation. The purpose of this paper, therefore, is to outline the step-by-step process through which a new occupational therapy instrument can be planned, developed, and validated.

The Steps Toward Instrument Development and Validation

The flowchart shown in Figure 1 illustrates the general step-by-step process involved in planning, constructing, and validating a new instrument.

Although the illustration is general in scope, specificity within a chosen domain of interest to occupational therapists is gained through consultation with a psychometrician or reference to the psychometric literature.

Each of the four phases in Figure 1 contains several steps: 1. planning (steps 1-2), 2. construction (steps 3-6), 3. quantitative evaluation (steps 7-11), and 4. validation (steps 12-13). Each phase and its corresponding steps will be illustrated by an example of how one would embark upon the development and validation of a "hypothetical" attitude scale designed for the purpose of assessing school administrators' attitudes toward the inclusion of occupational therapy in the schools. The steps described will apply in the construction of a wide range of instruments including those of the developmental, cognitive, motor, affective, or functional assessment variety.

Planning Phase: Steps 1-2. This most important phase begins with the formulation of a statement of the purpose of the intended instrument (Step 1). The statement should include a specification of the domain (content area) or construct (abstract psychological trait) to be measured and the target group for which the instrument is intended. In the hypothetical example the statement of purpose might read: "The instrument to be constructed will assess the receptivity of school administrators toward the inclusion of occupational therapy in the schools, where receptivity is the construct to be measured and the target group is school administrators."

Step 2 requires more time than Step 1 since it involves a review of the related literature. First, this step ensures that an appropriate, reliable, and valid instrument does not already exist. Critical reviews of existing tests can be found in the Mental Measurements Yearbooks edited by Buros (16), and in selected journals such as the Journal of Educational Measurement, Applied Psychological Measurement, and Measurement and Evaluation in Guidance. Second, the review of the literature aids in formulating an operational definition for the construct to be measured, an idea difficult to grasp. When a construct is operationalized, the components necessary to measure it are spelled out (13). Classically, for example, the operational definition of intelligence (a construct) has been a score on a standardized intelligence test such as the Stanford-Binet. In the example of the school administrators' "receptivity" scale, the construct receptivity is operationally defined as a score on the scale to be constructed. An operational definition does not imply that the construct is being measured well. In Step 2, therefore, the review of the literature helps to identify the types of items likely to assess the construct as accurately and meaningfully as possible. For the school administrators' scale, the test developer might, for example, review the recently published set of manuals on occupational therapy in the schools in order to identify the philosophy of occupational therapy educational management, the objectives of occupational therapy management, and the role of the occupational therapist in the schools (17). Measurement of school administrators' attitudes toward issues in each of these domains might appropriately assess "receptivity" to the inclusion of occupational therapy in the schools.

The major product of Step 2 is the formulation of open-ended questions that are then presented to representatives of the target group. This process usually elucidates areas of inquiry related to the construct being measured that were not revealed in the literature review and sensitizes the test constructor to the concerns of, and the terminology customarily used by, the target group. The following are examples of open-ended questions that might be presented to the school administrator:

1. "As a school administrator I believe an occupational therapist could assist teachers by _____."
2. "The differences between the special educator and occupational therapist are _____."

Knowledge gained from the review of the literature and the responses to the open-ended questions are synthesized, concluding Step 2, and the construction of the instrument itself begins.

Construction Phase: Steps 3-6. The construction of the instrument
begins with listing the specific objectives of the instrument that pinpoint the purpose of the instrument and indicate the content areas to be assessed.

For example, for the school administrator's receptivity scale, objectives might address the philosophy of occupational therapy educational management, the objectives of occupational therapy educational management, and the role of the occupational therapist in the schools. (These three content areas were identified in the literature review undertaken in Step 2 as those with which the instrument should deal.)

Specifying the content areas to be addressed by the objectives is necessary but not sufficient for the formulation of the objectives. In addition, a process dimension, which hierarchically reflects various types of human responses or reactions to the content included on the questionnaire, typically is adopted for formulating the objectives of the instrument. In the field of education, formally constructed process taxonomies in the psychomotor (18), cognitive (19), and affective domains (20) exist and, while these are used more commonly in the preparation of educational objectives, they can also be used to formulate the objectives of an instrument. In the example of the school administrator's attitude survey, the affective taxonomy developed by Krathwohl, Bloom, and Masia (20) could be employed as a guide for classifying the types of responses school administrators might demonstrate. In this taxonomy, affective reactions are hierarchically arranged: receiving, responding, valuing, organizing, and value complex. At the most basic level, an individual simply demonstrates awareness of a phenomenon, whereas at the highest level the individual not only holds a particular value toward the phenomenon, but the individual can also be characterized by a set of values related to the phenomenon. Objectives that reflect a feeling tone or the degree of acceptance or rejection the administrators hold toward occupational therapy in school can be formulated by using the taxonomy for the scale. Objectives so constructed, each of which addresses one of the specified content areas of the instrument, might read:

Objective 1: The instrument will assess awareness of the role of the occupational therapist in the schools.
Objective 2: The instrument will assess the valuing of the objectives of occupational therapy educational management.
Objective 3: The instrument will assess internalization of the philosophy of occupational therapy educational management.

When the objectives of the instrument have been constructed in this manner, they readily lend themselves to the next step, preparing the table of specifications (Step 4). The purpose of the table of specifications is to delineate, as clearly as possible, the scope and emphasis of the test by relating items to objectives. It ensures that the items are derived from and appropriate to the objectives of the instrument. Most tables of specifications are two-way grids with a number of cells. The horizontal axis lists content areas; the vertical axis lists the categories in the process hierarchy used in the construction of the objectives. Each cell, therefore, represents the intersection of a content category and a level of the process hierarchy. Because each objective was constructed with only one content area and one level of the process hierarchy in mind, it should "fit" in only one cell of the table of specifications. The relative emphasis the test will give to each objective is then determined by the number of items the test constructor apportions to each cell.

An abbreviated "sample" table of specifications is presented in Table 1 that could be used for the development of the proposed attitude scale. The objectives formulated in Step 3 are placed in their corresponding cells. Also, the number of items that will be constructed for each objective is indicated. The extent to which school administrators have internalized the philosophy of occupational therapy educational management is shown in the table to be the most heavily weighted area on the scale.

With the objectives of the instrument clearly stated, and the table of specifications completed, the test constructor begins to write items, Step 4 on the flowchart (Figure 1). This step involves several facets. First, the item format must be selected. The choice of which format best suits the intended respondent in relation to age and ability is important. Research in the cognitive domain, for example, indicated that item format may confound what is actually being measured (21, 22). For example, multiple-choice questions on social studies content can, in part, be measuring reading ability rather than knowledge of social studies if the question is written using a high-level vocabulary. The choice of item formats differs according to the type of scale being designed. Multiple-choice, true-false, or matching items are typically used on instruments that measure cognitive functions. Likert (23), Thurstone (24), or Guttman (25) scales are generally selected for the development of affective instruments. Psychomotor tests involve a variety of item formats ranging from
simple imitative tasks to those in which higher levels of skill must be demonstrated. Greenstein (26) and Harrow (18) provide discussions on taxonomies for constructing psychomotor objectives. These taxonomies may be useful for developing items for psychomotor tests.

Table 1
Table of Specifications for Attitude Scale

                                    Content Domain
Item Level        Philosophy of       Objectives of       Role of OT
(Process)         OT Ed Management    OT Ed Management    in Schools      Total

Awareness         (2 items)           (1 item)            Objective 1     4 items
                                                          (1 item)
Valuing           (3 items)           Objective 2         (3 items)       10 items
                                      (4 items)
Internalization   Objective 3         (1 item)            (1 item)        7 items
                  (5 items)
Total             10 items            6 items             5 items         21 items

Since the Likert format is the most commonly used in the development of attitude scales in the social sciences, this format will be chosen for the hypothetical attitude survey. Likert scales typically have five options for responses ranging from strongly agree to strongly disagree. The response, strongly agree, might be assigned a value of 5, and strongly disagree assigned a value of 1; thus the higher the score the more positive the administrator's attitude. The importance of the number and nature of response options has been discussed by numerous authors (27-30). The following are examples of items corresponding to objectives 1-3 that might be included on the attitude scale for administrators:

Circle the response that most closely matches your feeling regarding the following statements. There are no right or wrong answers. Key: strongly agree = 5; agree = 4; neutral = 3; disagree = 2; strongly disagree = 1.

1. Students who need occupational therapy are attending public schools. 5 4 3 2 1
2. I endorse the integration of occupational therapy principles into classroom curricula. 5 4 3 2 1
3. I regard the human as an active being who is positively influenced by purposeful activity. 5 4 3 2 1

Often, the item writing is not performed exclusively by the test developer. Experts in the field can be assembled and trained for this purpose (Step 4), especially when funding is available. Writing items can be a long and tedious process if it is done solely by one person, especially since it is strongly recommended that a pool of items be developed containing twice as many items as will be needed for the final form of the instrument. Thus, for the attitude scale at least 42 instead of 21 items correctly apportioned to the 9 cells in Table 1 will be needed. For efficiency, each item should be written on a separate note card to enable revising, shuffling for placement on the test form, and keying to the table of specifications. Once the pool of items has been written, careful review by the writers is necessary to determine insofar as possible whether: 1. the items are clearly stated, 2. the items conform to a selected format, 3. the response options for each item are plausible, and 4. the wording is familiar to the target group. At this point, review of the item pool has just begun. Steps to follow will include content validation and further qualitative evaluation in which the quality of the items is assessed in relation to the target group (Step 5).

In the content validation session, the 42 items typed as they would appear on the final form of the test and the table of specifications without the items represented in the cells are given to colleagues or other experts in the field for their review. This panel is then asked to match the items with the appropriate cell on the table of specifications. When absolute agreement is not reached upon the placement of an item, that item must be revised until a consensus on its placement is reached. Obviously, some items will never reach this standard despite several revisions and therefore must be discarded. The final product should be an instrument in which all items can be classified into the appropriate cell of the table of specifications. An instrument is considered to be content valid when the items adequately reflect the process and content dimensions of the specified objectives of the instrument as determined by expert opinion.

Following the content validation process, a second qualitative evaluation session is convened, using subjects considered to be representative of the target group. For example, school administrators would be assembled for the qualitative assessment of the hypothetical attitude scale. The participants would be asked to critique the instrument, but, unlike the panel of experts in the content validation session, the school administrators would be asked to actually take the scale to determine the time it takes for completion, and then judge the clarity of the items and directions, and


PRESCHOOL PLAY SCALE*

Child Number: _____
Session Number: _____
Observer's Initials: _____

The following to be filled in after observations are recorded:
Mean ages for dimensions:
Space Management: _____
Material Management: _____
Imitation: _____
Participation: _____
Play Age (mean of all dimensions): _____

SPACE MANAGEMENT

Gross motor activity
0-1 year: reaches; plays with hands and feet; touches hands to feet; crawls; sits with balance; pulls to stand; moves to continue pleasant sensation
1-2 years: stands unsupported; sits down; bends and recovers balance; walks and runs - wide stance; climbs low objects; broad movements involving large muscle groups; rides kiddie car
2-3 years: beginning integration of entire body in activities - concentrates on complex movements (i.e., throwing, jumping, climbing); pedals tricycle
3-4 years: more coordinated body movements, smoother walking, jumping, climbing, running (accelerates, decelerates)
4-5 years: increased activity level; can concentrate on goal instead of movement; ease of gross motor ability allows stunts, tests of strength, exaggerated movements; clambers
5-6 years: more sedate; good muscle control and balance; hops on one foot; skips; somersaults; skates; lifts self off ground

Territory
0-1 year: crib; playpen; house
1-2 years: home; immediate surrounds
2-3 years: outside; short excursions
3-4 years: home; immed. neighborhood
4-5 years: likes to be up off ground
5-6 years: neighborhood

Exploration
0-1 year: of self and objects within reach
1-2 years: of all unfamiliar things; oblivious to hazards
2-3 years: increased exploration of all unfamiliar objects; very curious
3-4 years: interest in new experiences, places, animals, nature
4-5 years: anticipates trips, likes change of pace
5-6 years: plans and enjoys excursions and trips

Comments:

MATERIAL MANAGEMENT

Manipulation
0-1 year: predominant - handles, mouths toys; brings two objects together; picks up; hits; bangs; shakes
1-2 years: predominant - throws; inserts; pushes; pulls; carries; pounds
2-3 years: remains predominant - feels; pats; dumps; squeezes; fills
3-4 years: small muscle activity - hammers, sorts, inserts small objects (i.e., peg boards); cuts
4-5 years: increasing fine motor control allows quick movements, force, pulling
5-6 years: uses tools to make things (i.e., cuts more precisely); copies; traces; combines various types of material

Construction
0-1 year: not evident
1-2 years: little attempt to make product; relates two objects appropriately (i.e., lid on pot); stacks; takes apart; puts together
2-3 years: manipulation predominates; scribbles; strings beads; puzzles 4-5 pieces
3-4 years: makes simple products (i.e., blocks, crayons, clay); combines play materials; takes apart; arranges in spatial dimension - design is evident
4-5 years: predominates - makes products, specific designs evident, builds complex structures; puzzles 10 pieces
5-6 years: predominates - makes recognizable products; likes small construction, attends to detail (i.e., eyes, nose, fingers apparent in drawings); uses products in play

Interest
0-1 year: people; gazes at faces; follows movements; attends to voices and sounds
1-2 years: movement of self - explores various kinesthetic and proprioceptive sensations; moving objects (i.e., balls, trucks, pull toys)
2-3 years: explores new movement patterns (i.e., jumping); toys with moving parts (i.e., dump trucks, jointed dolls); makes messes
3-4 years: anything new; fine motor manipulation of play materials
4-5 years: takes pride in work (i.e., shows and talks about products, compares with friends, likes pictures displayed); complex ideas
5-6 years: in reality - manipulation of real life situations (i.e., miniature things); making something useful - props for play; permanence of products; toys that "really work"

Purpose
0-1 year: sensation or function - uses materials to see, touch, hear, smell, mouth (i.e., rattles, teething rings, colored objects)
1-2 years: experiments in movement - practices basic movement patterns (i.e., rock, walk, run); process important
2-3 years: process important - less interest in finished product (i.e., scribbles, squeezes play dough); repetition of gross motor skills
3-4 years: beginning to show interest in result or finished product
4-5 years: product very important - use to express self; exaggerates
5-6 years: replicate reality

Attention
0-1 year: follows moving objects with eyes
1-2 years: rapid shifts
2-3 years: intense interest; quiet play up to 15 minutes; plays with single object or theme 5-10 minutes
3-4 years: longer span - around 30 minutes; plays with single object or theme 5-10 minutes
4-5 years: amuses self up to one hour; plays with single object or theme 10-15 minutes
5-6 years: concentration for long period of time; plays with single object or theme 10-15 minutes

Comments:

IMITATION

Imitation
0-1 year: of observed facial expressions and physical movement (i.e., smiling, pat-a-cake); emotions (hugs toys)
1-2 years: of simple actions; present events and adults - self-related mimicry (i.e., feeds self with spoon)
2-3 years: of adult routines with toy-related mimicry (i.e., child feeding doll); toys as agents (i.e., doll feeds self)
3-4 years: more complex imitation of real world - part of dramatization
4-5 years: more complex imitation of real world as part of dramatization
5-6 years: more complex imitation of real world as part of dramatization

Imagination
0-1 year: not evident
1-2 years: imaginary objects (i.e., pretend food on spoon)
2-3 years: personifies dolls, stuffed animals; starts having imaginary friends (i.e., animals, persons)
3-4 years: assumes familiar roles - domestic themes, past experiences
4-5 years: prominent - able to use familiar knowledge to construct a novel situation (i.e., expanding on the theme of a story or TV show)
5-6 years: prominent - continues to construct new themes but emphasis on reality - reconstruction of real world

Dramatization
0-1 year: not evident
1-2 years: not evident
2-3 years: portrays single character
3-4 years: imitates simple action and reaction episodes - mirrors experience, emphasis on domestic and animals; portrays multiple characters with feelings (mostly anger and crying); little interest in costumes
4-5 years: role playing for or with others; portrays more complex emotions; sequences stories - themes from domestic to magic; enjoys dress-ups
5-6 years: sequences stories - emphasis on copying what occurs in real world; costumes important; props; puppets

Music
0-1 year: attends to sounds
1-2 years: sways; listens
2-3 years: responds to music with whole body (i.e., marching, twirling)
3-4 years: sings simple songs - not necessarily on pitch; plays instruments
4-5 years: sings whole songs on pitch; musical games (i.e., Farmer in the Dell); good rhythm
5-6 years: meaning of songs important; enjoys catchy tunes, songs that tell stories; dances reflect interpretation of music

Books
0-1 year: pats; strokes; picks at pictures
1-2 years: handles; points to pictures; begins to name pictures
2-3 years: likes familiar stories; fills in words and phrases
3-4 years: new or information books; pictures important; relates own experiences to story
4-5 years: listens better - doesn't need physical contact with book; looks at books independently - repeats familiar stories
5-6 years: looks at books independently or with peer; describes picture to tell story; must be credible

Comments:

PARTICIPATION

Type
0-1 year: solitary play (no effort to interact with other children or choose similar activities)
1-2 years: combination of solitary, onlooker play (watches others - speaking but not entering their play)
2-3 years: parallel play (plays beside others, play remains independent, but child situates self among others, enjoys their presence)
3-4 years: associative play (similar activities with groups of 2-3, no organization to reach a common goal, more interest in peers than activity)
4-5 years: cooperative (groups of 2-3 organized to achieve a goal, i.e., assigns roles for pretend play)
5-6 years: cooperative (groups of 2-5, organization of more complex games and dramatic play)

Cooperation
0-1 year: demands personal attention; simple give and take interaction with immediate family or caretaker (i.e., tickling, peek-a-boo); 7-10 months - initiates games rather than follows
1-2 years: more complex games with a variety of adults (i.e., hide and seek, chasing); offers toys but somewhat possessive; persistent
2-3 years: possessive (much snatch and grab, hoarding, no sharing, resists toys being taken away); independent (does not ask for help, initiates own play)
3-4 years: limited - some turn taking; asks for things rather than grabbing; little attempt to control others
4-5 years: takes turns; attempts to control the activities of others (often self-centered, bossy)
5-6 years: social give and take evident (i.e., compromises to facilitate group play); rivalry seen in competitive games

Language
0-1 year: attends to sounds and voices; babbles; uses razzing sounds
1-2 years: jabbers during play - talks to self, often in sing-song rhythm; uses gestures and words to communicate wants; labels objects
2-3 years: talkative - very little jabber; begins to use words to communicate ideas, information
3-4 years: uses words to communicate with peers, interest in new words (repeats them, asks their meaning)
4-5 years: very talkative - plays with words; fabricates - capable of long narratives; questions persistently; communicates with peers to organize activities
5-6 years: very prominent in sociodramatic play (uses words as part of play as well as to organize play); interest in present; relevant how, what for questions

Comments:

*Revised from Knox, Susan. A Play Scale. In Play as Exploratory Learning, M. Reilly, Editor. Beverly Hills: Sage Publications, Inc., 1974. Jayne Shepherd, Nancy Bledsoe, 1981.
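The two summary lines on the scale form ("Mean ages for dimensions" and "Play Age (mean of all dimensions)") imply a small piece of arithmetic, sketched below in Python. This is a hypothetical illustration, not the authors' scoring procedure: the form does not say how an age value is assigned to each observed item, so the sketch simply assumes the observer records one numeric age in years per item. The function names and the sample ratings are invented for the example; the grouping of items into four dimensions follows the form.

```python
from statistics import mean

# Items grouped by dimension, as laid out on the scale form.
DIMENSIONS = {
    "Space Management": ["Gross motor activity", "Territory", "Exploration"],
    "Material Management": ["Manipulation", "Construction", "Interest",
                            "Purpose", "Attention"],
    "Imitation": ["Imitation", "Imagination", "Dramatization", "Music", "Books"],
    "Participation": ["Type", "Cooperation", "Language"],
}


def dimension_means(item_ages):
    """Mean age per dimension, using only the items that were scored.

    item_ages maps (dimension, item) to an age in years. How that age is
    chosen for an observed behavior is an assumption left to the observer.
    """
    means = {}
    for dimension, items in DIMENSIONS.items():
        scored = [item_ages[(dimension, item)] for item in items
                  if (dimension, item) in item_ages]
        if scored:
            means[dimension] = mean(scored)
    return means


def play_age(item_ages):
    """Play age as the mean of the dimension means, as the form specifies."""
    return mean(dimension_means(item_ages).values())


if __name__ == "__main__":
    # Hypothetical ratings for one child: every item in a dimension at one age.
    ratings = {}
    for dim, age in [("Space Management", 3.0), ("Material Management", 3.0),
                     ("Imitation", 4.0), ("Participation", 2.0)]:
        for item in DIMENSIONS[dim]:
            ratings[(dim, item)] = age
    print(dimension_means(ratings))
    print(play_age(ratings))
```

Because the dimensions contain different numbers of items, averaging the four dimension means is not the same as averaging all sixteen items; the sketch follows the wording on the form and weights each dimension equally.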

make any other recommendations relevant to the overall quality of the instrument. This aspect of instrument development is crucial if the response format is new to the target group. In Step 6, the items are revised and, where indicated, new items are written. It is often easier to improve a bad item than to develop a new one (12); therefore, do not discard any item too quickly.

Quantitative Evaluation Phase: Steps 7-11. Quantitative evaluation of the instrument is substantially accomplished in the first pilot testing. Prior to the pilot test, the 112 revised items must be typed to produce the test form. Then, in the pilot test (Step 7), the scale is administered under optimal conditions to a group that is representative of the target group. Ideally, the group should be large so that the responses of a few subjects do not distort the results of the quantitative evaluation. It is generally useful to follow the pilot test with another qualitative evaluation, a debriefing session in which the subjects who participated in the pilot study are asked to provide feedback regarding the clarity of each item, are informed of the purpose of the instrument, and are asked to offer any additional comments. As subjects in the pilot study, this group has responded to twice as many items as will comprise the final instrument.

The pilot study provides the data from which the initial reliability estimates are calculated (Item Analysis, Step 8). Reliability is defined as the consistency of the measurement over time or the precision of measurement. It reflects the confidence given to the observed score as being a reflection of what a person really knows, believes, or is able to do. The reliability coefficient can range from a low of zero (negative values are considered zero) to a high of 1.00. The difference between the observed reliability coefficient and 1.00 is attributed to error. Thus, if the observed reliability coefficient was .75, then .25 represents the degree of inconsistency in the measurement. The above coefficients would be interpreted as follows: 75 percent of the variance in the test was measuring the subject's actual ability, achievement, attitude, or personality, and 25 percent was due to chance or random error.

Several methods are available for estimating the reliability of an instrument. The choice of method is dependent upon the intended purpose of the instrument. Table 2 displays the three methods of estimating reliability, the types of instruments for which each reliability coefficient is required, the procedure to follow in calculating the reliability coefficient, and the generally accepted value recommended by these writers for the observed coefficient. If decisions are to be made about individuals, rather than groups, these reliability coefficients should be considered a minimum.

Table 2
Methods of Estimating Reliability

Type of Reliability: Stability (test-retest)
Types of Instruments: Instruments used to predict or select: aptitude tests, psychomotor tests, personality inventories, neuropsychological tests.
Procedures: Give the same test to the same group at two different times; correlate the two scores using the PPM.
Accepted Values: .60 or greater

Type of Reliability: Equivalence (parallel form)
Types of Instruments: Any tests that have alternate forms.
Procedures: Give Form 1 immediately followed by Form 2; correlate the two scores using the PPM.
Accepted Values: .80 or greater

Type of Reliability: Internal Consistency
Types of Instruments: Instruments used to infer an underlying construct: aptitude tests, attitude scales, psychomotor tests, personality inventories, neuropsychological tests.
Procedures: For tests with dichotomously scored items use KR20; for all other tests use Coefficient Alpha.
Accepted Values: .80 or greater

Note: PPM = Pearson Product-Moment Correlation Coefficient. KR20 and Coefficient Alpha formulas can be found in Mehrens and Lehmann.

In the example of the attitude scale for school administrators, several types of reliability estimates might be needed. For example, an internal consistency estimate using coefficient alpha would be required. This type of reliability is important because it ensures that all the items on the scale are measuring the same

trait, in this case "responsivity" to occupational therapy in the schools. In addition, if one assumes that the scale is to be used to predict future behavior, a measure of stability is required. Although the coefficient alpha estimate can be obtained from the pilot study data, calculation of the stability coefficient will necessitate that the test in its final form be given twice to the same group of people at two different times and that these scores be correlated.

Reliability to a large extent reflects the adequacy of the items since its calculation is dependent upon the scores achieved by the subjects on the items in the pilot study. Items can often be improved through item analysis, which provides two types of quantitative indices of the adequacy of each item: the item difficulty index and the item discrimination index. Item difficulty indices, denoted by the letter p, provide a measure of the difficulty of the respective items and are calculated as follows.

$$p = \frac{\text{Number of people correctly answering the item}}{\text{Total number of people responding to the item}}$$

Obviously, the only types of items for which this statistic is relevant are those for which a correct answer can be keyed, typically cognitive or psychomotor items. A p value reflecting moderate item difficulty, ranging from .30 to .70, is considered optimal for most instruments. This statistic would not be calculated for the hypothetical scale.

The item discrimination index, denoted by the letter D, reflects the correlation of the item with the total test score and provides information about how well the item discriminates among respondents of different types. When items have only one correct answer, typical of those included on psychomotor and cognitive instruments, D is calculated by first dividing the respondents into two halves, putting those with the highest scores in one group and those with the lowest scores in the other. These groups serve as comparative groups in the calculation of D for each item. The formula for obtaining D is:

$$D = \frac{\text{Number of persons in the UPPER group correctly answering the item}}{\text{Total number of persons in the upper group}} - \frac{\text{Number of persons in the LOWER group correctly answering the item}}{\text{Total number of persons in the lower group}}$$

For affective instruments, the discrimination index is calculated by correlating the score each person received on each item with his or her total score. Regardless of the type of instrument, item analysis is tedious and is best handled by computer, for which there are readily available reliability and item analysis programs (31). In their absence, item analysis can be obtained by using the formulas already presented.

Interpretation of the statistics obtained in the item analysis results in retaining only the best items for the final, reduced form of the instrument. Some rules of thumb do exist for making decisions in this area: any item discrimination index that is negative is considered to be zero. To produce a test with high reliability, the researcher must select items that have high discrimination indices. Ebel (11) suggested retaining items whose discrimination value is .30 or greater. If the item shows a discrimination index of .20 to .29, it should be revised. If the discrimination index is .19 or below (including negative values), the item should be eliminated.

In summary, then, the first pilot study provides quantitative data on each item together with reliability estimates for the total instrument. When these data are interpreted with the information gathered in the debriefing and qualitative assessment sessions, the test constructor can make sound judgments regarding which items should be retained, revised, or discarded. The end product of Steps 5 through 8 is a revised instrument with the number of items reduced to that which was originally planned and which conforms to the table of specifications. In rare instances, all or most of the items may be considered adequate. The test constructor then may randomly choose the number of items needed and reserve the remainder for an alternative test form. The new reduced form of the instrument must next be pilot tested (Step 9). As in the first pilot testing, the items are administered to a representative target group comprising individuals other than those who were in the first pilot testing, then analyzed, and new reliability estimates are obtained for the reduced form of the instrument. The second pilot testing (Step 10) usually results in a final form of the instrument and establishes that the reliability of the instrument is acceptable. Step 11 is necessary only if the revised instrument needs improvement. After Step 10, or Step 11 when it is needed, test validation can begin.

Validation Phase: Steps 12-13.

The validation of a newly developed instrument is almost never accomplished through one study or by one researcher. Instead, it requires numerous research efforts and, for this reason, must be considered an ongoing process. Despite its time-consuming demands, validation is essential because validity allows one to be confident of what the instrument is actually measuring. Since there are several methods for establishing the validity of a given instrument (14), the selection of a method is dependent upon the intended use of the instrument. For example, a therapist may want to: 1. determine how well a client performs in a designated content domain at a particular point in time, for example, performance in activities in daily living, 2. predict how well a client may perform in the future, or 3. infer the degree to which a client possesses some amount of a directly nonobservable hypothetical trait, a construct, for example, sensory integration. Each of these purposes requires a different method of validation. In Table 3, procedures for establishing validity corresponding to each of the three purposes of testing are described and the types of instruments for which each is appropriate are listed. Logically then, Steps 12 and 13, the validation process, assume that the purpose of the test was well defined at the outset (Steps 1 and 2). Quite often a test will have multiple purposes, in which case several types of validation may be needed.

Table 3
Methods of Estimating Validity

Type of Validity: Content
Use of the Instrument: To determine how well an individual performs at one point in time for a given content domain.
Procedure: Provide an expert with a copy of the objectives, table of specifications, and the instrument; the expert judges whether the content domain has adequately been assessed.
Type of Instrument: Achievement or physical performance tests.

Type of Validity: Criterion-Related
Use of the Instrument: To predict future performance.
Procedure: Give the test and correlate the results with the criterion variable. The criterion may be obtained concurrently or at some time in the future.
Type of Instrument: Tests used to select or classify, e.g., aptitude or developmental scales.

Type of Validity: Construct
Use of the Instrument: To infer some amount of a hypothetical trait.
Procedure: Based upon a theory underlying the trait, hypotheses are set up and tested regarding the behavior of persons who possess large or small amounts of the trait.
Type of Instrument: Any test that purports to measure a hypothetical trait, such as personality or attitude scales, aptitude tests, or developmental scales.

Content validity, defined by Cronbach (32), pertains to whether the set of items adequately covers the content domain of interest as well as the set of behaviors implied by the test score. The "behaviors implied by the test score" relate to the context in which the items are presented (22). For example, in the illustration of the attitude scale for school administrators, vocabulary level of test items should not be so difficult or technical that the scores reflect in part lack of familiarity with occupational therapy terminology. Attitude is what the instrument is intended to measure. For this instrument to have content validity, measurement of attitude should not be confounded with vocabulary level or some other unidentified variable.

Procedures for the preliminary establishment of content validity were described in Step 5. When that step is properly undertaken, the content validation of the final form is easily accomplished. Essentially, Step 5 is repeated but now with only the items on the final test form (Step 12). The table of specifications is



presented to experts who are asked to classify the final items into the cells. The percentage of items assigned appropriately reflects the extent of the test's content validity.

Criterion-related validity is needed when the new test is to be used to classify individuals or predict their future performance. In brief, the establishment of criterion-related validity involves correlating scores obtained on the new test with those obtained on an older, already validated test. Although this procedure appears to be rather simple, in reality it has inherent difficulties. The major problem is locating an adequate criterion, which must be either a well-established test or highly respected clinical ratings. For example, the Slosson Intelligence Test, a group-administered paper-and-pencil intelligence test that is a quick and efficient measure of intelligence, was validated by comparing scores on it with those obtained by the same individuals on the Stanford-Binet. In this instance, the Stanford-Binet served as the criterion. The acceptability of the Slosson Intelligence Test as a measure of intelligence is dependent upon the adequacy and acceptability of the Stanford-Binet as a criterion; that the criterion must have adequate reliability and validity in its own right is a given.

The two types of criterion-related validity are concurrent and predictive. The former is obtained by concurrently testing a group of subjects on the new test and the criterion. The latter is established by first testing a group of subjects with the new test and then later with the criterion. The selection of concurrent or predictive validity is based upon the intended purpose of the new instrument.

The third type of validity, construct validity, is the most difficult and perhaps the most important form to obtain. It is needed when test scores are used to infer the presence of some underlying hypothetical trait or construct. The term construct validity was first discussed in the now classic article by Cronbach and Meehl (33). The general procedure typically used to assess construct validity of a new scale is to define the trait operationally and set up a hypothesis or hypotheses about how individuals who possess varying degrees of the trait will behave. In the example used throughout the paper, it might be hypothesized that school administrators who agree with the idea of employing occupational therapists in public school systems will have high scores on the attitude scale and, conversely, individuals not receptive to this innovation will have low scores on the attitude scale. Once this hypothesis is formulated, it needs to be tested using any of three general methods: 1. the known groups procedure, 2. factor analysis, and 3. the multitrait-multimethod procedure.

The known groups procedure requires that a known group be obtained that already possesses the type of attitudes just described; for example, those who agree and those who disagree that occupational therapists should be delivering services in the schools. The intent is to verify that the scale does measure what it was intended to measure and the known group acts as a criterion for verification. If the scale is valid, those "known" to be receptive should achieve high scores; those nonreceptive should obtain low ones.

The second procedure, factor analysis, requires that the researcher hypothesize the nature and the number of factors underlying a scale. In this instance, the term factor refers to a theoretic variable derived from analyzing the intercorrelations of test items. For example, on the school administrators' scale, it would be reasonable to hypothesize that, in factor analysis, separate factors relating to each of the content areas or process levels would emerge. If this hypothesis is confirmed, the test constructor can be reasonably confident that the scale is actually measuring the intended trait, "receptivity." Unfortunately, one factor analytic study never suffices; generally a series is conducted. Factor analysis requires a very large sample size and a general rule is to use ten people for every item (13). Several factor analytic solutions are often necessary to validate the underlying structure of the new scale where each subsequent solution must be obtained on a new sample. Thus, factor analytic validation presents a formidable challenge and probably cannot be undertaken in occupational therapy unless samples of sufficient size are available.

For test developers who do not have accessibility to large samples, it is recommended that they use the multitrait-multimethod procedure of establishing construct validity (34). Campbell and Fiske (34) explain the procedure in detail. Essentially, the procedure allows for determination of the extent to which tests purported to measure the same trait with different methods or formats are correlated. These correlations are compared with those obtained between tests thought to measure different traits with similar formats. If the former set of correlations is not higher than the latter, this is taken as evidence that the tests are not valid for measuring the construct. Campbell and Fiske report specific criteria for making the judgment. The advantage of this procedure over factor analysis is

that it requires fewer subjects; the disadvantage is that available subjects are required to respond to two or more scales under two or more formats, which can be time consuming. Moreover, it is sometimes difficult to locate sets of instruments having the appropriate trait and format characteristics.

Validation of a new scale requires much time and funding. Several studies across different populations and across different times are usually needed to ensure that the new instrument is valid for all individuals with whom it is to be used. Validation is a continual process (Step 13), one in which an end point is rarely achieved, but is only successively approximated.

Conclusion

Test planning, construction, and validation require time and patience. The compressed guide to instrument development and validation presented in this paper provides an overview of the steps required and is useful as a methodological "road map" for the prospective test constructor. It is helpful for therapists who plan to embark upon instrument development to consult the psychometric literature and seek out the expertise of a psychometrician. Instruments developed within the occupational therapy field according to the guidelines presented here are not only likely to satisfy the requirement of third-party payment but will also be appropriate for research purposes. Instruments of caliber are needed in all areas of occupational therapy practice.
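
As an illustration of the item analysis described under the Quantitative Evaluation Phase (Step 8), the difficulty index p, the discrimination index D, and the retention cutoffs attributed to Ebel (11) can be sketched in a few lines of Python. This is a minimal sketch and not the program cited in the paper: the function names, the hypothetical data, the handling of tied total scores (kept in input order), and the choice to leave out the middle respondent when the group size is odd are all assumptions made here for illustration.

```python
from typing import List


def item_difficulty(item_scores: List[int]) -> float:
    """p: share of respondents answering the item correctly (1 = correct, 0 = incorrect)."""
    return sum(item_scores) / len(item_scores)


def item_discrimination(item_scores: List[int], total_scores: List[float]) -> float:
    """D: proportion correct in the upper half minus proportion correct in the lower half."""
    # Rank respondents from highest to lowest total test score.
    ranked = sorted(range(len(total_scores)), key=lambda i: total_scores[i], reverse=True)
    # Assumption: with an odd number of respondents the middle one is left out.
    half = len(ranked) // 2
    upper, lower = ranked[:half], ranked[-half:]
    p_upper = sum(item_scores[i] for i in upper) / half
    p_lower = sum(item_scores[i] for i in lower) / half
    return p_upper - p_lower


def ebel_decision(d: float) -> str:
    """Apply the cutoffs reported in the text: .30+ retain, .20 to .29 revise, else eliminate."""
    if d >= 0.30:
        return "retain"
    if d >= 0.20:
        return "revise"
    return "eliminate"  # .19 or below, including negative values


# Hypothetical data: one keyed item and the total scores of eight respondents.
item = [1, 1, 1, 0, 1, 0, 0, 0]
totals = [20, 18, 17, 15, 12, 10, 8, 5]

p = item_difficulty(item)
d = item_discrimination(item, totals)
print(p, d, ebel_decision(d))
```

For these hypothetical data, three of the four upper-half respondents and one of the four lower-half respondents answer correctly, so p = .50 and D = .75 - .25 = .50, and the item would be retained. For an affective scale such as the hypothetical attitude scale, the text instead calls for correlating each item score with the total score.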

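
The reliability estimates listed in Table 2 can be sketched in the same way. The paper refers the reader to Mehrens and Lehmann (12) for the formulas, so the standard textbook form of coefficient alpha is assumed here: k/(k - 1) times one minus the ratio of summed item variances to total-score variance. The function names and data are hypothetical. Population variances are used throughout, and under that convention the same computation reduces to KR20 when items are scored dichotomously. The Pearson product-moment correlation (PPM) is written out by hand for the test-retest and parallel-form cases.

```python
import math
from statistics import pvariance
from typing import List, Sequence


def coefficient_alpha(scores: List[List[float]]) -> float:
    """Coefficient alpha; scores[r][i] is respondent r's score on item i."""
    k = len(scores[0])  # number of items
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """PPM between two sets of scores, e.g. first and second administration."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: four respondents, three dichotomously scored items.
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = coefficient_alpha(scores)
error_share = 1 - alpha  # portion of variance attributed to error
print(alpha, error_share)
```

With these hypothetical scores the item variances sum to .625 and the total-score variance is 1.25, so alpha = 1.5 x (1 - .5) = .75, which matches the paper's worked interpretation: 75 percent of the variance reflects the trait and 25 percent reflects error. A stability or equivalence estimate would pass the two sets of total scores to `pearson_r` and compare the result with the accepted values in Table 2.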
REFERENCES

1. Hasselkus BR, Safrit MJ: Measurement in occupational therapy. Am J Occup Ther 30: 429-436, 1976
2. Cermak SA, Coster W, Drake C: Representational and nonrepresentational gestures in boys with learning disabilities. Am J Occup Ther 34: 19-26, 1980
3. Lindquist JC: A study of physical inactivity and its relationship to vestibular function in chronic schizophrenia. Master's Thesis, University of Southern California, Los Angeles, California, June 1981
4. Drucker B: An imitation gesture test for adolescents. Master's Thesis, University of Southern California, Los Angeles, California, June 1981
5. Koomar JA, Cermak SA: Reliability of dichotic listening using two stimulus formats with normal and learning-disabled children. Am J Occup Ther 35: 456-463, 1981
6. Yerxa EJ, Burnett SE, Stocking S, Azen SP: Development of the satisfaction with performance scaled questionnaire (SPSQ). Unpublished manuscript, University of Southern California, Los Angeles, California
7. De Gangi GA, Berk RA, Larsen LA: The measurement of vestibular-based functions in pre-school children. Am J Occup Ther 34: 452-459, 1980
8. Gilligan MB, Mayberry W, Stewart L, Kenyon P, Gaebler C: Measurement of ocular pursuits in normal children. Am J Occup Ther 35: 250-255, 1981
9. Harris NP: Duration and quality of the prone extension posture in four-, six- and eight-year-old normal children. Am J Occup Ther 35: 26-30, 1981
10. Baum B: The establishment of reliability and validity of a perceptual evaluation on a sample of adults with head trauma. Master's Thesis, University of Southern California, Los Angeles, California, June 1981
11. Ebel R: Essentials of Educational Measurement (3rd Edition). Englewood Cliffs, NJ: Prentice Hall, 1979
12. Mehrens W, Lehmann I: Measurement and Evaluation in Education and Psychology (2nd Edition). New York: Holt Rinehart & Winston, 1973
13. Kerlinger F: Foundations of Behavioral Research (2nd Edition). New York: Holt Rinehart & Winston, 1973
14. Crocker LM: Validity of certification measures for occupational therapists. Am J Occup Ther 30: 229-233, 1976
15. Crocker LM: Linking research to practice: suggestions for reading a research article. Am J Occup Ther 31: 34-39, 1977
16. Buros OK (Editor): Mental Measurement Yearbook (Vols 1-8). Highland Park, NJ: Gryphon Press, 1972
17. Gilfoyle EM (Editor): Training: Occupational Therapy Education Management in Schools (Vols 1-4). Rockville, MD: American Occupational Therapy Association, 1978
18. Harrow A: A Taxonomy of Psychomotor Domain. New York: David McKay, 1972
19. Bloom B (Editor): Taxonomy of Educational Objectives: The Classification of Educational Goals, Handbook I: Cognitive Domain. New York: David McKay, 1972
20. Krathwohl D, Bloom B, Bertram M: Taxonomy of Educational Objectives: The Classification of Educational Goals, Handbook II: Affective Domain. New York: David McKay, 1972
21. Benson J, Crocker L: The effects of item format and reading ability on objective test performance: A question of validity. Educ Psychol Measure 39: 381-387, 1979
22. Benson J: A redefinition of content validity. Educ Psychol Measure 41: 793-802, 1981
23. Likert R: A technique for the measurement of attitudes. Arch Psychol 140: 52, 1932
24. Thurstone L: Attitudes can be measured. Am Sociol Rev 9: 139-150, 1944
25. Guttman L: A basis for scaling qualitative data. Am Sociol Rev 9: 139-150, 1944
26. Greenstein L: Psychomotor objectives in occupational therapy education. Am J Occup Ther 30: 301-357, 1976
27. Edwards AL: Techniques of Attitude Scale Construction. New York: Appleton-Century-Croft, 1957
28. Kamorita SS, Graham WK: Number of scale points and the reliability of scales. Educ Psychol Measure 4: 987-995, 1965
29. Lissitz R, Green S: The effect of the number of scale points on reliability: A Monte Carlo approach. J App Psychol 60: 10-13, 1975
30. Frisbie D, Brandenburg D: Equivalence of questionnaire items with varying response formats. J Educ Measure 16: 43-48, 1979
31. Hull CH, Nie NH: SPSS Update. New York: McGraw Hill, 1979
32. Cronbach L: Test validation. In Educational Measurement (2nd Edition), RL Thorndike, Editor. Washington, DC: American Council on Education, 1971
33. Cronbach L, Meehl P: Construct validity in psychological tests. Psychol Bull 52: 281-302, 1955
34. Campbell D, Fiske D: Convergent and divergent validation by multitrait-multimethod matrix. Psychol Bull 56: 81-105, 1959

800 December 1982, Volume 36, Number 12
