Unit Three - Planning and Evaluating Teacher-Made Tests

Unit Three

Planning and Evaluating Teacher-Made Tests
Learning Objectives

At the end of this unit, the learner should be able to:


1) Explain the teacher-made test and its types
2) Define and prepare a table of specifications
3) Select an item format
4) Discuss the factors to consider in planning tests
5) List the suggestions for writing test items
6) Describe the criteria for test evaluation
Teacher-made Tests
• Teachers have an obligation to provide their
students with the best instruction possible. This
implies that they must have some procedure(s)
whereby they can reliably and validly evaluate how
effectively their students have learned what has
been taught.
• Classroom test results may also be used by the teacher
to help her develop more efficient teaching strategies.
For example, Ms. Atom may feel that her pupils must
understand valence before they can be introduced to
balancing chemical equations in chemistry.
• Classroom tests, because they can be tailored to fit a
teacher's particular instructional objectives, are essential
if we wish to provide for optimal learning on the part of
the pupil and optimal teaching on the part of the teacher.
Teacher-made Tests
• There are a variety of ways in which teacher-made
tests can be classified:
– Classification by Item Format - There are several
ways in which items have been classified by format:
supply and selection type; free answer and
structured answer; essay and objective. Some prefer
to make the distinction in format as free response
(supply) versus choice response (select), and
scoring is dichotomized as objective versus
subjective.
– Classification by Stimulus Material – tests are mostly
thought of in terms of a series of verbal problems that
require some sort of verbal response. There are
many instances, however, where the stimulus
material used to present the problem to the student
need not be verbal. In a humanities or art course, the
stimulus materials can be pictorial.
Teacher-made Tests
– Classification by Purpose - Teacher-made, or for that
matter, standardized achievement tests can also be
classified in terms of their purpose or use.
– Criterion versus Norm-Referenced Interpretation -
The test score in a criterion-referenced interpretation
is used to describe the status of the individual. Does
Mahad know how to add a single column of figures?
Does Ali know how to balance an equation? A norm-
referenced interpretation of the test score permits the
teacher to make meaningful comparisons among
students in terms of their achievement. Hence, if the
teacher wants to compare Mahad's performance in
arithmetic to that of his peers, he would use a norm-
referenced interpretation (see the sketch after this list).
– Achievement versus Performance - Education is
concerned with both what we know in an academic
sense and how well we are able to apply our
knowledge.
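To make the criterion- versus norm-referenced distinction above
concrete, here is a minimal sketch in Python; the pupil names,
marks, and mastery cut-off are invented for illustration and are
not taken from the text.

# A minimal sketch with invented pupil names and marks (out of 20).
scores = {"Mahad": 14, "Ali": 18, "Amina": 11, "Hodan": 16}

# Criterion-referenced interpretation: has Mahad reached a fixed
# mastery cut-off? (The cut-off of 15 marks is hypothetical.)
mastery_cutoff = 15
print("Mahad has mastered the skill:", scores["Mahad"] >= mastery_cutoff)

# Norm-referenced interpretation: where does Mahad stand relative
# to his peers?
class_scores = sorted(scores.values())
percentile = 100 * sum(s <= scores["Mahad"] for s in class_scores) / len(class_scores)
print(f"Mahad scored at or above {percentile:.0f}% of the class")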
Table of Specification
• A table of specifications, sometimes called a test
blueprint, is a table that helps teachers align
objectives, instruction, and assessment.
• This strategy can be used for a variety of
assessment methods but is most commonly
associated with constructing traditional summative
tests. When constructing a test, teachers need to
be concerned that the test measures an adequate
sampling of the class content at the cognitive level
that the material was taught.
• The table of specifications can help teachers map
the amount of class time spent on each objective to
the cognitive level at which each objective was
taught, thereby helping teachers to identify the types
of items they need to include on their tests.
Table of Specification
• There are factors to consider in preparing a table of
specifications:
– When to Prepare Specifications - Ideally, to be of
most benefit, the table of specifications should be
prepared before beginning instruction. Why?
Because these "specs" may help the teacher be a
more effective teacher. They should assist the
teacher in organizing his teaching material, his
outside readings, his laboratory experiences (if
necessary) - all the resources he plans on using in
teaching the course. In this way, the specs can help
provide for optimal learning on the part of the pupils
and optimal teaching efficiency on the part of the
instructor. In a way, then, the specs serve as a
monitoring device and can help keep the teacher
from straying off his instructional track.
Table of Specification
– Preparing the Table of Specifications – Once the
course content and instructional objectives have
been specified, the teacher is ready to integrate them
in some meaningful fashion so that the test, when
completed, will be an accurate measure of the
students' knowledge. The following table contains the
course content in natural science and simultaneously
relates that content to Bloom's taxonomy.
One could, of course, delineate the course content
into finer subdivisions. Whether this needs to be
done depends upon the nature of the content and the
manner in which the course content has been
outlined and taught by the teacher. A good rule of
thumb to follow in determining how detailed the
content area should be is to have a sufficient number
of subdivisions to ensure adequate and detailed
coverage. The more detailed the blueprint, the easier
it is to get ideas for test items.
Table of Specification
[Sample table relating the natural science course content to the
levels of Bloom's taxonomy - not reproduced in this text version]
Table of Specification
– Determination of Weights - You will recall that one of
the major advantages of the teacher-made versus
commercially published test is that the teacher-made
test can be tailor-made to fit the teacher's unique
and/or particular objectives. Each teacher can
prepare a test that is valid for his students. Because
the classroom teacher, more so than any other
person, knows the relative emphasis placed upon the
various instructional objectives, it naturally follows
that he should have the major responsibility for
assigning the various weights to the cells in the above
table. There is no hard-and-fast rule that can be
prescribed for the teacher to use in determining the
weights to be assigned to the various cells in the
table of specifications. The weights assigned should
reflect the relative emphasis used by the teacher
when he taught the course.
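Although there is no hard-and-fast rule, one straightforward way to
apply the weights is to allocate test items to each cell in
proportion to its weight. The sketch below illustrates this idea;
the topics, cognitive levels, weights, and test length are
hypothetical and are not the table referred to earlier.

# A minimal sketch: converting the relative weights of a table of
# specifications into item counts for a test of a given length.
# Topics, Bloom levels, and weights are invented for illustration.
blueprint = {
    ("Matter and its states", "Knowledge"):     15,
    ("Matter and its states", "Comprehension"): 10,
    ("Energy",                "Knowledge"):     20,
    ("Energy",                "Application"):   25,
    ("Simple machines",       "Comprehension"): 10,
    ("Simple machines",       "Application"):   20,
}

total_items = 40  # planned test length
total_weight = sum(blueprint.values())

for (topic, level), weight in blueprint.items():
    n_items = round(total_items * weight / total_weight)
    print(f"{topic:<22} {level:<14} {n_items} items")

# Rounded counts may need a small manual adjustment so that they
# sum exactly to the planned test length.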
Selecting an item format
• The following are some suggestions for your
consideration in deciding which item format to use.
Factors to consider include the following:
– Purpose of the test - The most important factor to be
considered is what you want the test to measure. To
measure written self-expression, you would use the
essay; for spoken self-expression, the oral. To
measure the extent of the pupil's factual knowledge,
his understanding of principles, or his ability to
interpret, we prefer the objective test because it is
more economical and tends to possess higher score
reliability and content validity. If your purpose is to
use the test results to make binding decisions for
grading purposes or admission to college, we
recommend the objective test because of greater
sampling of content and more objective scoring.
Selecting an item format
– Time - It will take less time to prepare 5 extended-
response essay questions for a two-hour twelfth-
grade history test than it would to prepare 75
multiple-choice items for that same test. However,
the time saved in preparing the essay test may be
used up in reading and grading the responses. The
time element becomes of concern in relation to when
the teacher has the time.
– Numbers tested - If there are only a few pupils to be
tested and if the test is not to be reused, then the
essay or oral test is practical. However, if a large
number of pupils are to be tested and/or if the test is
to be reused at a later time with another group, we
recommend the objective test. It's much harder to
remember 75 objective items than it is to remember 5
or 6 essay topics.
Selecting an item format
– Skill tested - Hanson et al. (1986) showed that
certain item formats worked better for one skill than
for another. They also provided a design that could
be used to determine the specific combination and
number of items that should be included in a test for
each skill to be tested.
– Difficulty desired - Early research consistently
indicated that use of different formats had little effect
on pupils' ranking but did have a differential effect on
item-difficulty levels. Multiple-choice formats were
consistently found easier to answer than constructed
formats.
– Physical facilities - If duplication and reproduction
facilities are limited, the teacher is forced to use
either the essay test, with the questions written on
the board, or the oral test; or he can use the true-
false or short-answer item by reading the questions
aloud.
Selecting an item format
– Age of pupils - Unfortunately, there are still some
teachers who believe that a good test is
characterized by many different item formats. They
no doubt feel that this introduces an element of
novelty or that a change of pace will result in keeping
the pupils' motivation high. This may be true for older
pupils, but is definitely not so for younger pupils.
– Teacher's skill - Teachers may be prone initially to
more frustration and disappointment when writing
test items of one item format than another. As will be
seen in later sections, some item formats are easier
to write than others, and teachers do a better job with
one type than another. In fact, Ebel (1975a) found
that teachers are able to write more discriminating
multiple-choice items than true-false items.
Factors to consider in planning tests
• Once we have decided on the purpose of the test and
have at least tentatively decided on the item formats to
be used, we still must answer five questions before we
are able to sit down and begin writing test items and
administer the test. They are (a) How long should the
test be? (b) How difficult should the test be? (c) When
and how often should tests be given? (d) Should the
nature of the stimulus (the item) be pictorial, verbal, or of
some other type? (e) Should the test (exam) be open- or
closed-book?
– Test Length - There is no readymade formula to tell
the teacher how many items should be used. Suffice
it to say that the total number of items should be large
enough to provide for an adequate sample of student
behavior across objectives and content areas. The
appropriate length depends on factors such as:
• Purpose
• Kinds of items used
Factors to consider in planning tests
• Reliability desired
• Pupil's age
• Ability level of pupils
• Time available for testing
• Length and complexity of the item
• Amount of computation required
• Instructional objective tested
– Item Difficulty - Classroom teachers can make their
tests very easy, very difficult, or in between. Some
teachers feel that they can purchase the respect of
their students by giving them easy tests. They are
wrong! Some other teachers feel that the more
difficult the test, the better; that a difficult test will
command respect from pupils and parents. They are
also wrong. About the only positive thing that we
know about difficult tests is that they tend to make
pupils study harder.
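After a test has been given, one standard way to check how
difficult an item actually turned out to be is its difficulty
index: the proportion of pupils who answered it correctly. The
sketch below uses invented response data (1 = correct, 0 = incorrect).

# A minimal sketch: item difficulty as the proportion of pupils
# answering the item correctly (responses invented for illustration).
responses = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]

difficulty = sum(responses) / len(responses)
print(f"Difficulty index: {difficulty:.2f}")  # 0.70: 70% answered correctly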
Factors to consider in planning tests
– When to Test - Teachers often ask, "Should I test
every week? Once or twice a semester?" Some
teachers prefer to test on small segments of the
course at frequent intervals. Other teachers prefer
testing less frequently and on large units of the
course. The majority of teachers usually govern
themselves by the marking and reporting schedules
of their schools. As of now, there is no evidence to
show that a test based on a small segment of the
course is better than a test that samples a larger unit
of the work.
– Nature of the Stimulus: Verbal, Pictorial, or Other? -
The nature of the test-item stimulus is highly
dependent on the nature of the content being tested
and the age of the pupils tested. For young children
we recommend using lots of pictures, a minimum of
verbal material, and a simple vocabulary appropriate
to the students' age and ability.
Factors to consider in planning tests
– Open-Book versus Closed-Book Examinations - Most
teachers want to maximize the opportunity for their
students to do their best on classroom achievement
tests. There is some disagreement, however, as to
the best method for achieving this. There are some
teachers who contend that students should be able to
use any and all external aids such as notes, their
text(s), and other references when taking an exam.
Teachers preferring open-book exams say that (a)
they eliminate cheating; (b) they do not substitute for
studying because the time required to look through
one's notes or references for answers will dissuade
pupils from relying on these sources; (c) there are
not too many instances in life where one cannot look
up a formula, or equation, or piece of factual
information; and (d) they make students study for the
application and comprehension of knowledge rather
than for sheer recall of facts.
Preparing the test item
• The following are some suggestions in preparing test
items:
– Carefully define your instructional objectives
– Prepare a table of specifications, keep it before you,
and continually refer to it as you write the test item
– Formulate well-defined questions
– Avoid excess verbiage
– The test item should be based on information that the
examinee should know (or be able to deduce from
the context) without having to consult a reference
source
– Use the most appropriate stimulus
– Try to avoid race and sex bias
– Write each test item on a separate card
– Prepare more items than you will actually need
– Avoid specific determiners
Preparing the test item
– Write and key the test item as soon as possible after
the material has been taught
– Prepare the items well in advance to permit review
and editing
– Be careful when rewording a faulty item
– Insert some novelty into your test
– Avoid textbook or stereotyped language
– Obtaining the correct answer to one test item should
not be based on having correctly answered a prior
item
Criteria for test evaluation
• There are many criteria, but we are going to consider five
(5) major ones. Let us now look at these criteria:
– Consistency with objectives - Every curriculum has
objectives that control and guide the
teaching/learning process. Evaluation is therefore
intended to help us ascertain or find out the extent to
which these objectives have been achieved. The
content of the evaluation programme should
therefore tally with the pre-stated objectives of the
programme.
– Comprehensiveness - You know that what we teach
covers a variety of things. Even the syllabuses we
use have different units, topics and sub-topics. By
comprehensiveness we mean that the evaluation
programme should include items which cater for all
the objectives of the course or programme and
should cover all the topics taught and learnt.
Criteria for test evaluation
– Sufficient diagnostic value - You have already learnt
in this unit that one of the roles of evaluation is
diagnostic. The examination or evaluation
programme should be designed in such a way that it
helps in distinguishing various levels of performance,
or mastery attained. It should help teachers and
other educationists to describe the strengths and
weaknesses in the teaching/learning process as well
as in the product of that process, the performance itself.
– Reliability - The dictionary meaning of the word
reliable refers to something that can be trusted
because it works well. Similarly, reliability in
evaluation simply refers to the consistency of the
results. For example, the reliability of a test refers to
the consistency of scores obtained by the same
individuals on different occasions of administering a
measuring instrument or on sets of equivalent items.
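One common way to estimate this kind of reliability is to correlate
the scores of the same pupils on two administrations of the test; a
high correlation indicates consistent results. The sketch below
uses invented scores and Python's standard statistics module
(statistics.correlation requires Python 3.10 or later).

# A minimal sketch: test-retest reliability estimated as the Pearson
# correlation between two administrations (scores are invented).
from statistics import correlation  # Python 3.10+

first_administration  = [12, 15,  9, 18, 14, 11]
second_administration = [13, 14, 10, 17, 15, 10]

r = correlation(first_administration, second_administration)
print(f"Test-retest reliability estimate: r = {r:.2f}")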
Criteria for test evaluation
– Validity - You need to be aware that sometimes the
term validity in evaluation is confused with reliability.
You need to know that the validity of a test or any
evaluation instrument is the extent to which the
instrument measures what it is intended to measure.
• Content Validity - This is equivalent to the
criterion of comprehensiveness which you have
already read. According to Ahmann and Glock,
“finding the content validity of measuring
instruments is equivalent to showing how well they
sample certain types of situations or subject
matter”.
• Predictive Validity - To predict is to say that
something will happen in future on the basis of
what you already know or have experienced. If
you can recall, some of our evaluation results
help us to make decisions on promotion.
Criteria for test evaluation
• Concurrent Validity - As you may already know,
things that happen concurrently actually happen
at the same time. You therefore need to know
that in evaluation, a test or an examination can
help you find out the ability of the student in the
intended area and another area.
For example: A reading test may also indicate a
student’s comprehension ability. A teacher with first
class performance in teaching practice may be
equally good at theory and vice versa.
