Stages in Test Development: Purpose of the Test

By :

Cica Cahyani : 1311040238

Novi Amilia : 1311040233

Vera Deviyana : 1311040232

Class : F

Major / Semester : PBI / V

Lecturer : M. Ridho Kholid, M.Pd.

FACULTY OF EDUCATION AND TEACHER TRAINING

STATE INSTITUTE OF ISLAMIC STUDIES OF RADEN INTAN LAMPUNG

2015/2016
PREFACE

The authors realize that the research for and the completion of this paper were possible only because of the Lord, who gives blessings of health, strength, and ability, as well as people who are always ready to help in all matters and at all times; God always gives according to His will, and gives more than we request. The authors therefore offer praise and gratitude to the Lord God.

A sincere thank-you from our hearts to our lecturer, Mr. M. Ridho Kholid, M.Pd., who has shared his knowledge with us.

It is highly expected that this paper might contribute to the betterment of English instruction in this institution. We would, then, be pleased to accept any suggestions or corrections for the sake of the paper's improvement.

Bandar Lampung, 10 November 2015

Authors
CONTENTS

CHAPTER II

DISCUSSION

STAGES IN TEST DEVELOPMENT: PURPOSE OF THE TEST

A. Stages and activities in the development of language tests for specific purposes
We now turn from our discussion of frameworks that can be used in test development to a specific set of procedures for developing useful language tests. This chapter serves as a pre-organizer for the remainder of the book, which explains how to carry out each step in test development and provides extensive examples. The chapters in Part Two are organized, in a very general way, according to the process of test development. Since numerous exercises will be provided for the specific steps in the test development process in the chapters that follow, there are no exercises for this chapter. We would note again that although we list, in Stage 3, a number of activities that involve statistical analyses, these activities and analyses will not be discussed in this book. We have, instead, provided references to relevant readings in Chapters 3 and 11, as well as at the end of this chapter.
Readers might want to read this chapter now as a general
preview of material to come and then review it prior to reading each of
the remaining chapters in order to re-acquaint themselves with the
whole test development process before considering the details of each
activity.
Test development is the entire process of creating and using a test, beginning with its initial conceptualization and design, and culminating in one or more archived tests and the results of their use. The amount of time and effort we put into developing language tests will, of course, vary depending upon the situation. At one extreme, with low-stakes tests, the processes might be quite informal, as might be the case if one teacher were preparing a short test to be used as one of a series of weekly quizzes to assign grades. At the other extreme, with high-stakes tests, the processes might be highly complex, perhaps involving extensive trialing and revision, as well as coordinating the efforts of a large test development team. This might be necessary if a test were to be used to make important decisions affecting a large number of people. We would again point out that although the amount of time and effort that goes into test development may vary, the qualities of usefulness need to be carefully considered, and this consideration should not be sacrificed in either low-stakes or high-stakes situations.
Whatever the situation might be, we strongly believe that careful planning of the test development process in all language testing situations is crucial, for three reasons. First, and most importantly, we believe that careful planning provides the best means for assuring that the test will be useful for its intended purpose. Second, careful planning tends to increase accountability: the ability to say what was done and why. As teachers, we must expect that test users (students, parents, and administrators) will be interested in the quality of our tests. Careful planning should make it easier to provide evidence that the test was prepared carefully and with forethought. Third, we favor careful planning because it increases the amount of satisfaction we experience. When we have a plan to do something that we value, and complete it, we feel rewarded. The more careful the plan (the more individual steps it contains), the more opportunities we create to feel rewarded. The less careful the plan, the fewer the rewards. At the extreme, with no plan at all except the completion of the test, there is only one reward: the completed test.
We organize test development conceptually into three stages: design, operationalization, and administration. We say 'conceptually' because the test development process is not strictly sequential in its implementation. In practice, although test development is generally linear, with development progressing from one stage to the next, the process is also an iterative one, in which the decisions that are made and the activities completed at one stage may lead us to reconsider and revise decisions, and repeat activities, that have been done at another stage. While there are many ways to organize the test development process, we have discovered over the years that this type of organization gives a better chance of monitoring the usefulness of the test throughout the development process and hence producing a useful test. The test development process is illustrated in Figure 5.1. We have included 'consideration of qualities of usefulness' in order to emphasize that all decisions and activities involved in test development are made in order to maximize the overall usefulness of the test.

1. Stage 1: Design
In the design stage we describe in detail the components of the test design that will enable us to ensure that performance on the test tasks will correspond as closely as possible to language use, and that the test scores will be maximally useful for their intended purposes. Design is in general a linear process, but in some cases some activities are iterative, that is, they will need to be repeated a number of times. For example, there are certain parts of the process, such as considering qualities of usefulness and resource allocation and management, that are recurrent and will need to be considered and thought about throughout the process.

Figure 5.1  Test development: stages, activities, and products

1. Design
Activities: describing, identifying, selecting, defining, developing, allocating, managing
Product: design statement
- purpose of the test
- description of TLU domain and task types
- characteristics of test takers
- definition of construct(s)
- plan for evaluating the qualities of usefulness
- inventory of available resources and plan for their allocation and management

2. Operationalization
Activities: selecting, specifying, writing
Product: blueprint
- test structure: number of parts/tasks, salience of parts, sequence of parts, relative importance of parts/tasks, number of tasks per part
- test task specifications: purpose, definition of construct(s), setting, time allotment, instructions, characteristics of input and expected response, scoring method

3. Administration
Activities: administering, collecting feedback, analyzing, archiving
Products: tests (test 1, test 2, ..., test n); feedback on usefulness (qualitative and quantitative); test scores

Consideration of the qualities of usefulness informs all stages and activities.
The product of the design stage is a design statement, which is a
document that includes the following components:
a) a description of the purpose(s) of the test,
b) a description of the TLU domain and task types,
c) a description of the test takers for whom the test is intended,
d) a definition of the construct(s) to be measured,
e) a plan for evaluating the qualities of usefulness, and
f) an inventory of required and available resources and a plan for
their allocation and management.

The purpose of this document is to provide us with a principled basis for developing test tasks, a blueprint, and tests. It is important to prepare this document carefully, for this enables us to monitor the subsequent stages of development.
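Purely as an illustration (not part of the original framework), the six components of a design statement could be kept together in one structured record so that later decisions can be traced back to it; the Python names below are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DesignStatement:
    """Hypothetical record holding the six components of a design statement."""
    purposes: List[str]                        # a) intended inferences and decisions
    tlu_domain_and_task_types: List[str]       # b) descriptions of the TLU domain and task types
    test_taker_characteristics: str            # c) description of the intended test takers
    construct_definitions: List[str]           # d) theoretical definition(s) of the construct(s)
    usefulness_evaluation_plan: str            # e) plan for evaluating the qualities of usefulness
    resources_and_allocation: Dict[str, str]   # f) required/available resources and their management
```

Keeping the components in a single record of this kind makes it easier, at the operationalization and administration stages, to check that each blueprint decision and test task can be traced back to a design decision.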

There are six activities involved in the design stage, corresponding to the six components of the design statement, as indicated above. These are described briefly below.

a. Describing the purpose(s) of the test


This activity makes explicit the specific uses for which the test is intended. It involves clearly stating the specific inferences about language ability or capacity for language use we intend to make on the basis of test results, and any specific decisions which will be based upon these inferences. The resulting statement of purpose provides a basis for considering the potential impact of test use. This activity is discussed in detail in Chapter 6.

b. Identifying and describing tasks in the TLU domain


This activity makes explicit the tasks in the TLU domain to which we want our inferences about language ability to generalize, and describes TLU task types in terms of distinctive characteristics. It provides a set of detailed descriptions of the TLU task types that will be the basis for developing actual test tasks. These descriptions also provide a means for considering the potential authenticity and interactiveness of test tasks. This activity is discussed in Chapter 6.

c. Describing the characteristics of the language users/test takers


This activity makes explicit the nature of the population of
potential test takers for whom the test is being designed. The resulting
description provides another basis for considering the potential impact
of test use. This activity is discussed in Chapter 6.

d. Defining the construct to be measured


This activity makes explicit the precise nature of the ability we
want to measure, by defining it abstractly. The product of this activity is
a theoretical definition of the construct, which provides the basis for
considering and investigating the construct validity of the
interpretations we make of test scores. This theoretical definition also
provides a basis for the development, in the operationalization stage,
of test tasks. In language testing, our theoretical construct definition
can be derived from a theory of language ability, a syllabus
specification, or both. This activity is discussed in Chapter 6.

e. Developing a plan for evaluating the qualities of usefulness


The plan for evaluating usefulness includes activities that are
part of every stage of the test development process. A plan for
assessing the qualities of usefulness will include an initial
consideration of the appropriate balance among the six qualities of
usefulness and setting minimum acceptable levels for each, and a
checklist of questions that we will ask about each test task we develop.
These are discussed in Chapter 7. Assessing usefulness in pretesting
and administering will include collecting feedback. This will deal with a
range of information, both quantitative, such as test scores and scores
on individual test tasks, and qualitative, such as observers' descriptions
and verbal self-reports from students on the test taking process.
Finally, the plan will include procedures for analyzing the information
we have collected. This will include procedures such as the descriptive
analysis of test scores, estimates of reliability, and appropriate analysis
of the qualitative data.
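As a minimal sketch only, assuming the six qualities of usefulness named in the wider framework (reliability, construct validity, authenticity, interactiveness, impact, and practicality) and an invented 0-5 rating scale, minimum acceptable levels and a per-task check might look like this:

```python
# Hypothetical minimum acceptable levels on an invented 0-5 scale; the quality names
# follow the six qualities of usefulness discussed in the wider framework.
minimum_levels = {
    "reliability": 3,
    "construct_validity": 4,
    "authenticity": 3,
    "interactiveness": 3,
    "impact": 3,
    "practicality": 4,
}


def qualities_below_minimum(ratings):
    """Return the qualities of a draft test task rated below the agreed minimum level."""
    return [quality for quality, floor in minimum_levels.items()
            if ratings.get(quality, 0) < floor]


# Example: the development team's ratings for one draft task.
print(qualities_below_minimum({
    "reliability": 4, "construct_validity": 3, "authenticity": 4,
    "interactiveness": 3, "impact": 3, "practicality": 5,
}))  # -> ['construct_validity']
```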

f. Identifying resources and developing a plan for their allocation and management
This activity makes explicit the resources (human, material,
time) that will be required and that will be available for various activities
during test development, and provides a plan for how to allocate and
manage them throughout the development process. This activity
further provides a basis for considering the potential practicality of the
test, and for monitoring this throughout the test development process.
This activity is discussed in Chapter 8.

2. Stage 2: Operationalization
Operationalization involves developing test task specifications for the types of test tasks to be included in the test, and a blueprint that describes how test tasks will be organized to form actual tests. Operationalization also involves developing and writing the actual test tasks, writing instructions, and specifying the procedures for scoring the test. By specifying the conditions under which language use will be elicited and the method for scoring responses to these tasks, we are providing the operational definition of the construct.

a. Developing test tasks and a blueprint


In developing test tasks, we begin with the descriptions of the TLU task types provided in the design statement, and modify these, again taking into consideration the qualities of usefulness, to produce test task specifications. These comprise a detailed description of the relevant task characteristics, and provide the basis for writing actual test tasks. We would note that the particular task characteristics that are included and the order in which they are arranged in the test task specifications are likely to vary somewhat from one testing situation to another, and hence will not necessarily correspond exactly to either the theoretical framework of task characteristics presented in Chapter 3 or the way in which they are presented in Figure 5.1.

A blueprint consists of characteristics pertaining to the structure, or overall organization, of the test, along with test task specifications for each task type to be included in the test. The blueprint differs from the design statement primarily in terms of the narrowness of the focus and the amount of detail included. A design statement describes the general parameters for the design of a test, including its purpose, the TLU domain for which it is designed, the individuals who will be taking the test, what the test is intended to measure, and so forth. A blueprint, on the other hand, describes how actual test tasks are to be constructed, and how these tasks are to be arranged to form the test. Procedures for developing test task specifications and a blueprint are discussed in Chapter 9.

b. Writing instructions
Writing instructions involves describing fully and explicitly the structure of the test, the nature of the tasks the test takers will be presented with, and how they are expected to respond. Some instructions are very general and apply to the test as a whole. Other instructions are closely linked with specific test tasks. Considerations and procedures for writing instructions are discussed in Chapter 10.

c. Specifying the scoring method

Specifying the scoring method involves two steps:

1) Defining the criteria by which the quality of the test takers' responses will be evaluated, and
2) Determining the procedures that will be followed to arrive at a score.

Scoring methods are discussed in Chapter 11.
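The two steps can be sketched, purely for illustration, with hypothetical criteria, weights, and an invented 0-4 rating scale; none of this is prescribed by the source.

```python
# Step 1: criteria by which the quality of responses will be evaluated
# (hypothetical criteria and weights on an invented 0-4 rating scale).
criteria_weights = {
    "grammatical_accuracy": 0.4,
    "vocabulary_range": 0.3,
    "task_fulfilment": 0.3,
}


def score_response(ratings):
    """Step 2: procedure for arriving at a score, here a weighted average of the ratings."""
    return sum(weight * ratings[criterion] for criterion, weight in criteria_weights.items())


print(round(score_response(
    {"grammatical_accuracy": 3, "vocabulary_range": 4, "task_fulfilment": 2}), 2))  # -> 3.0
```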

3. Stage 3: Test administration

The test administration stage of test development involves giving the test to a group of individuals, collecting information, and analyzing this information, for two purposes:
a) Assessing the usefulness of the test, and
b) Making the inferences or decisions for which the test is intended.

a. Administration typically takes place in two phases: try-out and operational testing.
Try-out involves administering the test for the purpose of
collecting information about the usefulness of the test itself, and for the
improvement of the test and testing procedures. The revisions made
on the basis of feedback obtained from a tryout might be fairly local,
and might consist of minor editing. Or the analysis of the results of the
tryout might indicate that a more global revision is required, perhaps
involving returning to the design stage and rethinking some of the
components in the design statement. In major testing efforts, tests or
test tasks are almost always tried out before they are actually used. In
classroom testing, try-outs are often omitted, although we strongly
recommend giving the test to selected students or fellow teachers in
advance, since this can provide the test developer with information that
can be useful in improving the test and test tasks before operational
test use.
Operational test use involves administering the test primarily in
order to accomplish the specified use/purpose of the test, but also for
collecting information about test usefulness. In all cases of test
development, we administer and score the test and then analyze the
results, as appropriate to the demands of the situation.

b. Procedures for administering tests and collecting feedback


Administering a test involves preparing the testing environment, collecting test materials, training examiners, and actually giving the test. Administrative procedures need to be developed for use in both try-out and operational test use. Collecting feedback involves obtaining qualitative and quantitative information about usefulness from test takers and test users. Feedback is collected first during try-outs and later during operational test use. These activities are discussed in Chapter 12.

c. Procedures for analyzing test scores


Although we do not discuss these procedures in this book, we
feel that listing them here will be helpful for understanding the entire
test development process. References to sources that describe these
procedures are provided in Chapter 2 and at the end of this chapter.
1) Describing test scores: using descriptive statistics to characterize the quantitative characteristics of test scores.
2) Reporting test scores: using statistical procedures for determining how to report test scores most effectively to both test takers and other test users.
3) Item analysis: using various statistical procedures for analyzing and improving the quality of individual test tasks, or items.
4) Estimating reliability of test scores: using a number of statistical procedures for estimating the consistency of test scores across different specific conditions of test use.
5) Investigating the validity of test use: this includes a number of logical considerations and empirical procedures, both quantitative and qualitative, for investigating the validity of inferences made from test scores under specific conditions of test use. We discuss a number of qualitative procedures that are relevant to investigating construct validity, but do not discuss any quantitative procedures for this.
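Although these procedures are outside the scope of this paper, a small illustrative sketch of items 1), 3), and 4) may help; the response data below are invented, and the particular computations (proportion-correct difficulty, Cronbach's alpha) are common choices, not necessarily those recommended in the source.

```python
import statistics

# Invented item-level scores (1 = correct, 0 = incorrect) for five test takers on four items.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
totals = [sum(row) for row in responses]

# 1) Describing test scores: simple descriptive statistics of the total scores.
print("mean =", statistics.mean(totals), "sd =", round(statistics.stdev(totals), 2))

# 3) Item analysis: item difficulty as the proportion of test takers answering each item correctly.
difficulty = [sum(item) / len(responses) for item in zip(*responses)]
print("item difficulty =", difficulty)

# 4) Estimating reliability: Cronbach's alpha as one common internal-consistency estimate.
k = len(responses[0])
item_variances = [statistics.variance(item) for item in zip(*responses)]
alpha = (k / (k - 1)) * (1 - sum(item_variances) / statistics.variance(totals))
print("alpha =", round(alpha, 2))
```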

d. Archiving
Archiving involves building up a large pool, or bank, of test tasks so as to facilitate the development of subsequent tests. Archiving makes it possible to make the test potentially more adaptable or appropriate to specific kinds of test takers. Typically, archiving procedures are designed to allow easy retrieval of tasks and important information about the tasks. Archiving also facilitates the maintenance of test security. Finally, archiving procedures may be used to facilitate the selection of tasks with particular characteristics. As with procedures for test analysis, we do not discuss archiving, but provide some references at the end of this chapter.
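As one possible sketch (with hypothetical names, not the source's procedures), an archived task bank might store each task together with the characteristics needed for later retrieval:

```python
# Hypothetical archive of test tasks, each stored with the characteristics and item
# information needed for later retrieval.
task_bank = [
    {"id": "T001", "skill": "reading", "task_type": "multiple choice", "difficulty": 0.62},
    {"id": "T002", "skill": "writing", "task_type": "essay", "difficulty": 0.48},
    {"id": "T003", "skill": "reading", "task_type": "short answer", "difficulty": 0.71},
]


def retrieve(bank, **wanted):
    """Return archived tasks whose stored characteristics match all requested values."""
    return [task for task in bank if all(task.get(key) == value for key, value in wanted.items())]


print(retrieve(task_bank, skill="reading"))  # both archived reading tasks
```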
CHAPTER III

CONCLUSION

A. Summary
Test development is the entire process of creating and using a test. The process is organized into three stages: design, operationalization, and administration. While test development is generally linear, with development progressing from one stage to the next, the process is also an iterative one, in which the decisions that are made and activities that are completed at any stage may lead us to reconsider and revise decisions and repeat activities that have been performed at another stage.

In the design stage we describe in detail the components of the test design that will enable us to ensure that performance on the test tasks will correspond as closely as possible to language use, and that the test scores will be maximally useful for their intended purposes.

The operationalization stage involves developing test task specifications for the types of test tasks to be included in the test, and a blueprint that describes how test tasks will be organized to form actual tests. Operationalization also involves developing and writing the actual test tasks, writing instructions, and specifying the scoring method.
