
TYPES OF EVALUATION TOOLS/TECHNIQUES

Presented by
Nirsuba Gurung
TOOLS OF MEASUREMENT

1. Classroom Test Measures
2. Clinical Test Measures
Tools for Classroom Evaluation

Paper Pencil Tests

Subjective type
• Extended Response or Essay Type
• Short Response

Objective type
1. Supply types
a. Short answer type
b. Completion type
2. Selection types
a. True false type
b. Matching types
c. MCQ
Clinical test measures

Observational techniques
• Rating scale
• Check list
• Anecdotal report

Written communication reports
• Patient care studies
• Problem oriented records
• Patient care conference

Practical Examination
• OSPE
• OSCE

Viva Voce
EVALUATION METHODOLOGY

Class room Evaluation Clinical Evaluation

Paper pencil test Oral OSPE/OSCE Observation Written Report

Subjective type Objective Type Rating Scale Check List Anecdotal Record

Essay Short Answer Supply Type Selection type

Short Answer Completion Alternative response, Matching, Multiple choice


TOOLS OF MEASUREMENT

1. Paper pencil test: It plays a central role in the
evaluation of students' classroom learning.
SUBJECTIVE TYPE
A. Extended Response or Essay Type:
• It gives students complete freedom in their response.
• The learner is free to select the factual information he/she
thinks is pertinent and to organize the response
according to his/her best judgment.
ADVANTAGES:

1. Makes it possible to measure complex learning
outcomes that cannot be measured
by objective type tests.
2. Measures such higher cognitive learning as the
application of thinking and problem solving.
3. Serves as a means of improving writing
skills.
4. Takes less time to construct
than good objective type items.
5. It is more economical to use than
objective type questions, which consume
much more stationery.
6. It induces study habits among students.
LIMITATIONS:

1. Scoring is time consuming.
2. Scoring is less reliable because it relies on
subjective judgment; scores may differ
from teacher to teacher.
3. The volume of writing, rather than the content,
might impress some teachers.
4. The quality of the examinee's handwriting
might influence the scoring of answers.
5. At times students write everything
they know rather than just what is asked for.
6. An essay test is less representative of the
subject matter because it cannot cover a wide
sampling of the course.
7. The examinee's speed of writing might
influence his/her response.
EXAMPLES OF EXTENDED RESPONSE OR ESSAY TYPE

1. Describe the similarities and differences between
teacher centered and student centered
learning.
2. Describe the main characteristics of learning.
3. Describe three leadership styles with their
benefits and limitations.
HOW TO IMPROVE ESSAY TYPE OF TESTS

1. Use essay questions to measure only those learning outcomes
that cannot be measured by objective items.
2. Formulate the essay questions in such a way that the task for
examinees is clearly indicated.
3. Remove vagueness by structuring the questions. An
unstructured question can be turned into a structured one by
breaking it into sub-parts. E.g.
Unstructured: Discuss leadership styles.
Structured: Describe three leadership styles with their benefits and
limitations.
Unstructured: Describe curriculum and instruction.
Structured: Define the terms curriculum and instruction, and
differentiate them.
4. Give clear directions in terms of time for
completion, full marks, pass marks and the number
of questions to be attempted.
5. Draft the test items some time in advance.
6. Keep the blueprint in view as you write the test
items.
7. Get the test examined or critiqued by
one or more colleagues to establish validity.
8. Prepare a model answer.
9. Conceal the identity of students from the
examiner to avoid the halo effect in scoring.
10. Score all answers to one question before
proceeding to the next question.
11. Allow adequate time for scoring.
SHORT RESPONSE TYPE

• Limits the response.
• Short-answer questions require students to
supply or write an answer, rather than to select
or to guess from a fixed number of options.
• The major limitation of the short-answer test is
that it is not suitable for testing complex
learning outcomes.
Procedure for setting and marking SAQ
1. Make the questions precise.
2. Prepare a structured marking sheet: allocate
marks or part marks for the acceptable
answer(s).
3. Be prepared to consider other equally
acceptable answers, some of which you may
not have predicted.
4. Mark questions with the following points in
mind:
• Mark anonymously.
• Preferably have a different examiner for each page of
questions to reduce bias.
EXAMPLES

1. List the four advantages of using overhead
projectors.
2. Draw a diagram of the internal structure of the heart
and label it.
OBJECTIVE TYPE

• It is highly structured and requires students to
supply a word or two or select the correct answer
from the given alternatives.
1. Supply type
2. Selection type
SUPPLY TYPE ITEMS
• A supply type test item can be answered by a
word, phrase or symbol.
Classified into two subtypes:
a. Short answer type:
Example:
• What is the name of the person who invented the vaccine? ------------
• Which is the largest digestive gland? ---------
b. Completion type: The question is phrased as an incomplete
sentence and the examinee has to complete it by filling in the
blank space given. Example: The name of the person who
invented the vaccine is -----
USES OF SUPPLY TYPE ITEMS

• They are suitable for measuring a wide variety of
relatively simple learning outcomes such as:
• Knowledge of terminology
• Specific facts
• Principles
• Simple interpretation of data
BENEFITS OF SUPPLY TYPE

 Easy to construct
 A relatively large area of content can be covered.

 The chance of guessing to supply is reduced.

 Takes less time in scoring the test.


LIMITATIONS

• It is suitable for measuring only lower
cognitive achievements, especially rote
memory, and not complex
learning.

• Unless the question is very carefully
phrased, ambiguity might set in and an
answer has to be accepted as partially
correct. For example: Where was Buddha
born?
SUGGESTIONS FOR WRITING SUPPLY TYPE ITEMS
1. Phrase the item in such a way that the required
answer is both brief and definite.
2. A direct question is more
desirable than an incomplete sentence.
3. Leave blank spaces of equal length for all
items to avoid giving a clue to the answer.
4. Do not use too many blanks in one item; it may
create confusion by breaking the thought
process.
5. Prepare an answer key for scoring.
SELECTION TYPE ITEM

1. Alternative Response Item (T/F type)
2. Matching type
3. MCQ
SELECTION TYPE ITEM
1. Alternative Response Item: It consists of a
declarative statement that the student is
asked to mark as true or false.

Example:
a) DM develops mostly after the age of 40
years (T/F)
b) Penicillin is an effective drug for treatment
of pneumonia. (T/F)
c) All bacteria cause diseases. (T/F)

Uses: To measure the ability to identify
statements of fact, definitions of terms, and
statements of principles.
BENEFITS/LIMITATIONS
Benefits

• Takes less time in scoring.
• Students can respond to a larger number of
questions in a limited time.

Limitations

• The simple true false item usually measures
low cognitive achievements.
• The validity and reliability of the test results
are low, as only a limited number of
questions can be included.
• Students have a 50% chance of getting the
correct answer by guessing.
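
The effect of the 50% guessing chance can be worked out with a short calculation. The sketch below is only an illustration: the test length (10 items) and pass mark (5) are assumed values, not figures from this presentation, and the function names are hypothetical.

```python
from math import comb


def expected_guess_score(n_items: int, p_guess: float = 0.5) -> float:
    """Average number of items answered correctly by blind guessing."""
    return n_items * p_guess


def prob_at_least(n_items: int, pass_mark: int, p_guess: float = 0.5) -> float:
    """Binomial probability of at least `pass_mark` correct guesses."""
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess) ** (n_items - k)
        for k in range(pass_mark, n_items + 1)
    )


# Assumed example: a 10-item true/false test with a pass mark of 5.
print(expected_guess_score(10))          # average score from guessing alone
print(round(prob_at_least(10, 5), 3))    # chance of passing by guessing alone
# For comparison, a four-option MCQ gives a guessing chance of 0.25 per item.
print(round(prob_at_least(10, 5, p_guess=0.25), 3))
```

Under these assumptions a student who guesses every true/false item still passes more often than not, which is why a short true/false test says little about what the student knows.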
SUGGESTIONS FOR WRITING ALTERNATIVE RESPONSE ITEMS
1. Avoid using broad general statements
as they are catchy.
2. Avoid using trivial statements
(unimportant, small, minor) that have
little significance from a learning
standpoint. E.g. The word "gastric" pertains to
the stomach (T/F)
3. Avoid using long, wordy and complex
sentences. Poor: Bleeding of the gum,
associated with gingivitis, can be cured by
brushing teeth daily (T/F)
Better: Daily brushing will cure gingivitis
(T/F)
4. Avoid using negative statements, particularly
double negatives.

5. Make sure that true and false
statements are of approximately equal
length.

6. Try to make the number of true and false
statements almost equal.

7. Be sure that everybody accepts the item as
true or false without doubt.
MATCHING TYPE TEST ITEMS

• It consists of two parallel columns, with each
word, number or symbol in one column being matched to a word,
sentence or phrase in the other column.
• Characteristics:

Premises: the items in the column for which a match
is being made.
Responses: the column from which a selection is
being made.
USES OF MATCHING TYPE TEST ITEMS
• It makes it possible to measure a large amount
of factual material in a relatively short time.

Benefits:
• Its compact form makes it possible to cover a
larger sample of the factual matter of the
content in a relatively short time.
• Test construction is easy and takes less time.
• Students can respond to a larger number of
questions in a limited time.
• It takes less time in scoring.
Limitations:
• They are not very suitable for measuring higher
cognitive achievements.
• Sometimes students can answer just by
guessing.
SUGGESTIONS FOR WRITING:

1. Use only homogeneous material in a single
matching exercise.
2. Keep the items relatively short.
3. Have more response choices than the number of
premises.
4. Specify in the directions whether responses can be
used more than once or not.
5. Place all of the items for one matching exercise
on the same page.
MULTIPLE CHOICE QUESTIONS
• The multiple choice question consists of a
stem, which presents a problem situation,
and several alternatives, which provide
possible solutions to the problem.

• The stem may be a question or an
incomplete statement.

• The alternatives include the correct answer
and several similar wrong answers, called
distracters.

• The function of the distracters is to
distract those students who are uncertain of
the answer.
EXAMPLE
MCQ in incomplete form or in question form ?
Questions form
1. What will you do first if an electric appliance in your
office is on fire?
A. Dial 000
B. Report to the person concerned
C. Evacuate from the site at once
D. Switch off the power*
Incomplete statement form
2. If an electric appliance in your office is on fire, the
first thing you will do is to:
A. dial 000.
B. report to the person concerned.
C. evacuate from the site at once.
D. switch off the power.*
GENERAL RULES FOR CONSTRUCTING
MCQS

1. Each question should be designed to assess an
important learning outcome.

2. State the stem in simple and clear
language.
3. LIST THE ALTERNATIVES VERTICALLY AND
IN A LOGICAL ORDER.
3. PLACE AS MUCH OF THE WORDING AS POSSIBLE IN THE
STEM

Example
Poor: In objective testing, the term objective:
A. refers to the method of identifying the learning
outcomes.
B. refers to the method of selecting the test content.
C. refers to the method of presenting the problem.
D. refers to the method of scoring the answers.*
Better: In objective testing, the term objective
refers to the method of:
A. identifying the learning outcomes.
B. selecting the test content.
C. presenting the problem.
D. scoring the answers.*
4 . WHENEVER POSSIBLE, STATE THE STEM
IN POSITIVE FORM

Example
Better: Question 1 : Which one of the following is
a category of the cognitive domain in Bloom’s
Taxonomy?
A. Comprehension*
B. (distracter needed)
C. (distracter needed)
D. (distracter needed)
Poor: Question 2: Which one of the following is
not a category of the cognitive domain in Bloom’s
Taxonomy?
A. Comprehension
B. Application
C. Analysis
D. (answer needed)
Note on Question 2: Being able to pick
the answer that does not apply
provides no guarantee that the student
possesses the desired knowledge.
5. WHENEVER NEGATIVE WORDING IS USED IN THE
STEM, EMPHASIZE IT

Sometimes the use of negative wording
is basic to the assessment of an
important learning outcome. For
example, in a chemistry lesson students have to know that
certain chemicals should not be mixed.
When negative wording
is used in the stem, it should be
emphasized by using bold or upper case
and by being placed near the end of the
statement.
Example
Poor: Which one of the following is not a desirable
practice when preparing MCQs?
A. Stating the stem in positive form.
B. Using a stem that could function as a short-answer
question.
C. Underlining certain words in the stem for emphasis.
D. Shortening the stem by lengthening the alternatives.*
Better: All of the following are desirable practices
when preparing MCQs EXCEPT:
A. stating the stem in positive form.
B. using a stem that could function as a short-answer
question.
C. underlining certain words in the stem for emphasis.
D. shortening the stem by lengthening the alternatives.*
6. MAKE CERTAIN THAT THE INTENDED
ANSWER IS CORRECT OR CLEARLY BEST

When the correct-answer form of MCQ is used,
there should be only one correct answer and it
should be unquestionably correct. It may also be
necessary to include ‘of the following’ in the
stem.
Example
Poor: What is the best method of selecting
subject content for test questions?
Better: Which one of the following is the best
method of selecting subject content for test
questions?
7 . Check all alternatives are grammatically
consistent with the stem and parallel in form.
Grammatical inconsistency in tense or article
could provide a clue to the correct answer, or at
least make some of the distracters ineffective.
Example
Poor: The recall of factual information can be
measured best with a:
A. matching questions.
B. multiple-choice questions.
C. short-answer question.*
D. essay questions.
Better: The recall of factual information can be
measured best with:
A. matching questions.
B. multiple-choice questions.
C. short-answer questions.*
D. essay questions.
8. There should be no similarity of wording
between the stem and the correct answer

Poor: Which one of the following would you
consult first to locate research articles on
achievement testing?
A. Journal of Educational Psychology
B. Journal of Educational Measurement
C. Journal of Consulting Psychology
D. Review of Educational Research*
9. DO NOT INCLUDE TOO MUCH
DETAIL IN THE CORRECT ANSWER.
Too much detail in the correct answer may
provide a clue.
Example
Poor: Lack of attention to learning outcomes
during test preparation:
A. will lower the technical quality of the
questions.
B. will make the construction of test questions
more difficult.
C. will result in the greater use of essay
questions.
D. may result in a test that is less relevant to
the instructional program.*
10. Use of more homogeneous alternatives
results in greater plausibility.
Example
Poor: Obtaining a dependable ranking of
students is of major concern when using:
A. norm-referenced summative tests.*
B. behavior descriptions.
C. check lists.
D. questionnaires.
Better: Obtaining a dependable ranking of
students is of major concern when using:
A. norm-referenced summative tests.*
B. criterion-referenced formative tests.
c. -------------------------------------------
12. Avoid using the alternative ‘all of the above’
and use ‘none of the above’ with extreme
caution:
The use of ‘all of the above’ as an option makes it
possible to answer the question on the basis of
partial information.

When ‘none of the above’ is used as the right
answer in a correct-answer type of question, this
option may be measuring nothing more than the
ability to detect incorrect answers.
Example
Poor: Which one of the following is a category of
the cognitive domain in Bloom’s Taxonomy?
A. Critical thinking.
B. Scientific thinking.
C. Reasoning ability.
13. DO NOT INCLUDE ALTERNATIVES
SUCH AS “BOTH (A) AND (D)” OR “ALL
BUT MAINLY (C)”

• These complicate the structure of the question
and tend to confuse students and/or slow them
down.
14. IF NUMBERS ARE USED AS
DISTRACTERS, THEY SHOULD BE LISTED EITHER
IN ASCENDING OR DESCENDING ORDER.
Better: The minimum score in GCS is:
a. 2
b. 3
c. 4
d. 5
Poor: The minimum score in GCS is:
a. 3
b. 2
c. 6
d. 1
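
When many items are drafted, the ordering rule can be checked mechanically. The following is a minimal sketch, assuming the numeric alternatives of each item are available as a list; `is_ordered` is a hypothetical helper name, and the two lists are the alternatives from the examples above.

```python
def is_ordered(alternatives):
    """Return True if the numbers are strictly ascending or strictly descending."""
    pairs = list(zip(alternatives, alternatives[1:]))
    ascending = all(a < b for a, b in pairs)
    descending = all(a > b for a, b in pairs)
    return ascending or descending


better = [2, 3, 4, 5]   # alternatives a-d of the first GCS item
poor = [3, 2, 6, 1]     # alternatives a-d of the item marked "Poor"

print(is_ordered(better))  # True
print(is_ordered(poor))    # False
```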
STRENGTHS AND LIMITATIONS OF MCQ
Strengths
1. It allows for a precise interpretation for
content validity.
2. Purely objective in scoring and testing.
3. Easy for the teacher to administer and for
students to take.
4. Student success does not depend on his/her
writing skills.
5. Results can easily be compiled and analyzed .
6. It can be used with all subject areas.
Limitations

1. It inhibits students from expressing creativity
or demonstrating original and imaginative
thinking.

2. Question design is restrictive, forcing students
to fit their understanding into the designer’s
way of understanding a concept.

3. The success of a question depends on the suitability of
the distracters.
4. Longer reading time is required, and students with
poor reading skills may be disadvantaged.

5. Some students may guess at answers without
understanding them.

6. Designing good questions is time consuming. It is
very easy to construct poor questions, and bad
questions may be much worse than other
methods of assessing the same learning outcome.
OBSERVATIONAL TECHNIQUES
• They are especially used in evaluating psychomotor
skills and attitudes.
They include:
1. Check list
2. Rating scale and
3. Anecdotal report
1. CHECK LIST

• It consists of a list of behaviors or characteristics (in
one column) that are essential to a successful
performance.

• The next column calls for a simple “Yes/No”
judgment.

• An additional column to jot down the observations made
might be useful for giving feedback to the students.
SAMPLE OF CHECK LIST FOR PERFORMING DRESSING
Name of student: ---------  Programme: ---------  Year: --------  Posted at: --

S.N.  Behavioral criteria                                         Response    Remarks
                                                                  Yes   No
1     Prepare the dressing trolley
2     Explain the procedure
3     Arrange equipment conveniently
4     Keep patient in comfortable position and maintain privacy
5     ----------------------------------------
USES OF CHECK LIST

• It is used to assess those competency areas
that are critical for gaining proficiency in
the clinical field.

• With a check list the observer can only record
whether a certain behavior or characteristic is
present or absent.

• It cannot be used to find out the degree or
frequency of occurrence.
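
A check list record can be sketched as a small data structure that holds only a Yes/No judgment and an optional remark per criterion. The criteria below are taken from the sample check list for performing dressing; the Yes/No values, the remark, and the names `ChecklistItem` and `summarize` are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass
class ChecklistItem:
    criterion: str
    observed: bool      # the "Yes/No" column: behavior present or absent
    remark: str = ""    # optional note used for feedback to the student


def summarize(items):
    """Count behaviors marked Yes and list the criteria marked No."""
    done = sum(item.observed for item in items)
    missed = [item.criterion for item in items if not item.observed]
    return done, len(items), missed


# Hypothetical observation of one student.
record = [
    ChecklistItem("Prepare the dressing trolley", True),
    ChecklistItem("Explain the procedure", False, "Started without informing the patient"),
    ChecklistItem("Arrange equipment conveniently", True),
    ChecklistItem("Keep patient in comfortable position and maintain privacy", True),
]

done, total, missed = summarize(record)
print(f"{done} of {total} behaviors observed; not observed: {missed}")
```

The record stores presence or absence only, so it cannot express how well or how often a behavior was performed, which matches the limitation stated above.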
2. RATING SCALE
• It measures the degree to which characteristics
are present.
• It does not merely note the absence or
presence of desirable behavior.
• It locates the behavior on a continuum and notes
the qualitative and quantitative abilities.
COMPONENTS OF RATING SCALE

• Typically a rating scale consists of a set of
characteristics to be judged in the
left hand column and some type of scale on
the right hand side indicating the degree to which
each attribute is present. Thus the
components of a rating scale are:
1. Stimulus variable (criteria): the quality to be
rated.
2. Response (options) for rating
TYPES OF RATING SCALE

1. Graphic Rating Scale
2. Numerical Rating Scale
1. GRAPHIC RATING SCALE
• The graphic rating scale is rather broad, and here
qualifying words are used rather than numbers.

Example:

Observes working hours             never   seldom   usually   always
Ability to get along with others   poor    fair     good      best
2. NUMERICAL RATING
• This is one of the simplest types of scale, where the
rater simply marks the number representing some
qualitative judgment. Some options for the standard of
reference are given here:
A. 1-unsatisfactory, 2-Below average, 3- Average, 4- above
average, 5- outstanding.
B. 1-poor, 2- Fair, 3- Average, 4- good, 5- best
C. 1- never, 2- seldom, 3- occasionally, 4- frequently, 5-
Always
D. 1- unsatisfactory, 2- satisfactory, 3- Good, 4- Very
Good, 5- Excellent
EXAMPLE OF NUMERICAL RATING SCALE
Direction: Indicate to what degree the student performed during her
practice teaching by marking her behaviors under the appropriate
response.

Key: The numbers represent the values as follows:
1 - unsatisfactory, 2 - below average, 3 - average, 4 - above
average, 5 - outstanding

SN   Criteria   Responses        Remarks
                1  2  3  4  5
1

4
COMMON ERROR IN RATING
1. Personal Bias
2. Halo Effect
3. Logical error
A. PERSONAL BIAS
Personal bias error results from the evaluator's natural
tendency to rate all the students at approximately the
same position on the scale.

1. Generosity error or leniency error: The most
common type of bias occurs with the evaluator's
natural tendency to rate the students at the high end
of the scale only.
2. Horn effect: Another error, occurring less
frequently, is called severity error or horn effect,
in which the rater favors the lower end of the
scale continuum.
3. Central tendency error: There are some raters
who avoid both extremes of the scale and tend
to rate every student within a very narrow range
around the center of the rating scale. This
tendency to rate everyone as average is called
central tendency error.
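
The three patterns described above share one feature: a rater's scores cluster in a narrow band at the top, bottom, or center of the scale. The sketch below flags such clustering on a 1 to 5 scale. It is a rough heuristic for prompting a review, not a validated method: the function name and the thresholds `narrow` and `margin` are assumptions chosen for illustration, and a group of students can genuinely deserve similar ratings.

```python
from statistics import mean, pstdev


def describe_rating_pattern(scores, low=1, high=5, narrow=0.5, margin=1.0):
    """Label one rater's scores; thresholds are illustrative assumptions."""
    avg = mean(scores)
    spread = pstdev(scores)          # how tightly the scores cluster
    if spread > narrow:
        return "no narrow clustering"
    if avg >= high - margin:
        return "possible generosity (leniency) error"
    if avg <= low + margin:
        return "possible severity error (horn effect)"
    return "possible central tendency error"


# Hypothetical scores given by four different raters to five students.
print(describe_rating_pattern([5, 5, 4, 5, 5]))
print(describe_rating_pattern([1, 2, 1, 1, 2]))
print(describe_rating_pattern([3, 3, 3, 3, 3]))
print(describe_rating_pattern([1, 3, 5, 2, 4]))
```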
B. HALO EFFECT

• It is an error that occurs when the evaluator’s general impression
of the student influences how he/she rates the student on
individual characteristics.
• There is a tendency for an evaluator to rate the
student’s performance as a whole on the basis of a
good impression of one or two previous
performances.
C. LOGICAL ERROR

• It results when the evaluator rates the student
high on one characteristic because he/she scored
high on another characteristic that is related to the one
being measured.
• In this situation a student may be rated higher
than he/she should have been.
PRINCIPLES OF EFFECTIVE RATING

1. The rating scale must specify who is being rated and
for what purpose.
2. It must specify the total weight it carries.
3. The traits/criteria to be rated must be educationally
significant or must be representative of the essential
competencies.
4. The characteristics/criteria must be directly
observable.
5. The points on the scale must be defined.
6. Ratings from different observers must be combined
whenever possible.
7. Raters must be instructed to omit ratings when they
feel unqualified to judge.
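
Principles 6 and 7 can be applied together: ratings from several observers are averaged per criterion, and an omitted rating is simply left out of the average. The sketch below assumes ratings are stored as dictionaries with `None` marking an omitted rating; the observer names, criteria, and scores are hypothetical, and averaging is only one possible way of combining ratings.

```python
from statistics import mean


def combine_ratings(ratings_by_observer):
    """Average each criterion over the observers who actually rated it."""
    collected = {}
    for ratings in ratings_by_observer.values():
        for criterion, score in ratings.items():
            if score is not None:            # None = rater omitted this rating
                collected.setdefault(criterion, []).append(score)
    return {criterion: mean(scores) for criterion, scores in collected.items()}


# Hypothetical ratings on the 1-5 numerical scale.
ratings = {
    "Observer A": {"Voice clarity": 4, "Use of teaching aids": 3},
    "Observer B": {"Voice clarity": 5, "Use of teaching aids": None},
}

print(combine_ratings(ratings))
```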
3. ANECDOTAL REPORT
• Anecdotal reports are factual descriptions of the
meaningful incidents and events that the clinical
teacher/supervisor observes in a student and
records on plain paper or a form.
• Factual description of the event:
  • What happened?
  • To whom?
  • When?
  • Where?
  • Under what circumstances?
  • Who observed?
BENEFITS OF ANECDOTAL REPORT
• It provides a description of actual behavior
in a natural situation, which provides a check on
other evaluation methods.

• It makes it possible to gather information on
events that are exceptional but significant.

• It acts as a means of directing the teacher’s
individualized attention to the students.

• It provides a cumulative record of the student’s
progress in competency development.
LIMITATION

1. Time consuming
2. Difficult for evaluators to maintain
objectivity (less reliable)
3. Difficult to obtain the sample behavior to
generalize (lacks validity)
4. There is tendency on the part of evaluators to
observe only negative incidents and neglect the
positive incidents.
PRINCIPLES OF WRITING EFFECTIVE ANECDOTAL NOTES

1. Select a meaningful event.
2. Observe and record enough of the situation to make
the behavior/incident meaningful.
3. Record the incident as soon as possible after the
observation is made.
4. Always describe a single event in each note.
5. Keep the factual description of the incident and your
interpretation separate.
6. Record both positive and negative incidents.
7. Collect a number of anecdotes on the student before
drawing inferences concerning typical behavior.
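
An anecdotal note can be sketched as a record whose fields follow the factual questions listed earlier (what happened, to whom, when, where, under what circumstances, who observed), with the interpretation held in a separate field as principle 5 requires. The class name, field names, and the sample incident below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AnecdotalNote:
    what_happened: str
    student: str
    when: str
    where: str
    circumstances: str
    observer: str
    interpretation: str = ""   # kept apart from the factual description

    def is_complete(self) -> bool:
        """True when every factual field has been filled in."""
        facts = [self.what_happened, self.student, self.when,
                 self.where, self.circumstances, self.observer]
        return all(fact.strip() for fact in facts)


# Hypothetical incident, recorded soon after the observation.
note = AnecdotalNote(
    what_happened="Reassured an anxious patient before a dressing change",
    student="Student A",
    when="Morning shift",
    where="Surgical ward",
    circumstances="Patient's first dressing change after surgery",
    observer="Clinical supervisor",
    interpretation="Shows developing communication skills",
)

print(note.is_complete())
```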
PRACTICAL EXAMINATION

• Used to assess the psychomotor skills and attitudes
of health professionals
• Used to assess clinical competence in health
professionals

• Two types:
  • OSPE
  • OSCE
OSPE (OBJECTIVE STRUCTURED PRACTICAL EXAMINATION)

• It is an assessment tool that tests the
student’s competence in communication skills,
decision making, psychomotor skills and
knowledge as specified in the
instructional objectives.

• Methodology:
  • Written station
  • Practical station
USES
• Formative evaluation to facilitate learning through
feedback
• Summative evaluation to determine the student’s
achievement

• Advantages
  • Fair and reliable, as all the students are exposed to the
same situation
  • Multiple psychomotor skills and their related
knowledge may be assessed within a limited time
  • A large number of students can be assessed within the given time
  • More objective evaluation, as the tool is structured and
there is no room for the examiner’s opinion to enter
  • Transparent and fair, as responses are documented
DISADVANTAGES

• Tool formulation is time consuming.

• Planning and organization of the OSPE is time
consuming.

• As measurement here follows the principle of
reduction in terms of time and content, the
student’s performance in totality cannot be
measured.
OSCE (OBJECTIVE STRUCTURED CLINICAL EXAMINATION)

• Assesses the components of clinical competence,
such as:
  • History taking
  • Physical examination
  • Simple procedure
  • Communication
  • Attitude

• Response station:
The student performs the procedure practically while an observer
measures the student’s achievement.
4. ORAL EXAMINATION

• It is more probing and requires deep, explorative
thinking on the part of the students.
• It is also helpful in assessing communication and
attitude.
