

EDUCATIONAL ASSESSMENT AND EVALUATION

ALLAMA IQBAL OPEN UNIVERSITY ISLAMABAD

ASSIGNMENT NO: 02

Submitted By Anila Zafar

Registration No 0000090833

Course Title Educational Assessment and Evaluation

Course Code 8602

Level B.Ed (1.5 years)

Semester Autumn 2021



Assignment No.2

(Units: 6-9)

Question No: 1 How can the validity of a test be measured?

Answer:

VALIDITY

The validity of an assessment tool is the degree to which it measures what it is designed to measure.

For example, if a test is designed to measure the skill of adding three-digit numbers in mathematics, but the problems are presented in language beyond the ability level of the students, then it may not measure three-digit addition at all and consequently will not be a valid test. Many experts in measurement have defined this term; some of the definitions are given below. According to the Business Dictionary, "Validity is the degree to which an instrument, selection process, statistical technique, or test measures what it is supposed to measure."

Cook and Campbell (1979) define validity as the appropriateness or correctness of inferences, decisions,

or descriptions made about individuals, groups, or institutions from test results.

According to the APA (American Psychological Association) standards document, validity is the most important consideration in test evaluation. The concept refers to the appropriateness, meaningfulness, and

usefulness of the specific inferences made from test scores. Test validation is the process of accumulating

evidence to support such inferences. Validity, however, is a unitary concept. Although evidence may be

accumulated in many ways, validity always refers to the degree to which that evidence supports the

inferences that are made from the scores. The inferences regarding specific uses of a test are validated, not

the test itself.

Howell's (1992) view of test validity is that a valid test must measure specifically what it is intended to measure. According to Messick, validity is a matter of degree, not a question of being absolutely valid or absolutely invalid. He advocates that, over time, validity evidence will continue to gather, either enhancing or contradicting previous findings. Overall, we can say that in terms of assessment, validity refers to the extent to which a

test's content is representative of the actual skills learned and whether the test can allow accurate

conclusions concerning achievement. Therefore, validity is the extent to which a test measures what it

claims to measure. It is vital for a test to be valid in order for the results to be accurately applied and

interpreted.

Methods of Measuring Validity

Validity concerns the appropriateness of a particular use of test scores; test validation is then the process of collecting evidence to justify that intended use of the scores. Many types of validity evidence can be collected to establish the usefulness of an assessment tool. Some of them are listed below.

Content Validity

Evidence of content validity comes from a judgmental process that may be formal or informal. The formal process follows a systematic procedure to arrive at a judgment; its important components are the identification of behavioural objectives and the construction of a table of specifications. Content validity

evidence involves the degree to which the content of the test matches a content domain associated with

the construct. For example, a test of the ability to add two numbers should include a range of

combinations of digits. A test with only one-digit numbers, or only even numbers, would not have good

coverage of the content domain. Content-related evidence typically involves Subject Matter Experts (SMEs) evaluating test items against the test specifications. It is a non-statistical type of validity that

involves “the systematic examination of the test content to determine whether it covers a representative

sample of the behaviour domain to be measured”.
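One common way to quantify SME judgments of this kind is Lawshe's content validity ratio (CVR), in which each panelist rates an item as "essential" or not. Below is a minimal Python sketch; the panel size and ratings are hypothetical.

```python
def content_validity_ratio(essential_count: int, total_panelists: int) -> float:
    """Lawshe's CVR = (n_e - N/2) / (N/2).

    n_e = number of panelists rating the item "essential", N = panel size.
    Ranges from -1 (nobody rates it essential) to +1 (everybody does).
    """
    half = total_panelists / 2
    return (essential_count - half) / half

# Hypothetical panel: 9 of 10 subject matter experts rate an item "essential".
print(content_validity_ratio(9, 10))  # 0.8 -> strong content validity evidence
```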

Curricular Validity

Curricular validity is the extent to which the content of the test matches the objectives of a specific curriculum as it is formally

described. Curricular validity takes on particular importance in situations where tests are used for high-

stakes decisions, such as Punjab Examination Commission exams for fifth and eighth grade students and

Boards of Intermediate and Secondary Education Examinations. In these situations, curricular validity

means that the content of a test that is used to make a decision about whether a student should be

promoted to the next level should measure the curriculum that the student has been taught in school. Curricular

validity is evaluated by groups of curriculum/content experts. The experts are asked to judge whether the

content of the test is parallel to the curriculum objectives and whether the test and curricular emphases are

in proper balance. A table of specifications may help to improve the validity of the test.

Construct Validity

Before defining construct validity, it is necessary to elaborate the concept of a construct: the concept or characteristic that a test is designed to measure. A construct provides the target that a

particular assessment or set of assessments is designed to measure; it is a separate entity from the test

itself. According to Howell (1992), construct validity is a test's ability to measure factors which are relevant to the field of study. Construct validity is thus an assessment of the quality of an instrument or experimental design. It asks, 'Does it measure the construct it is supposed to measure?' Construct validity is rarely applied to achievement tests.

Construct validity refers to the extent to which operationalizations of a construct (e.g. practical tests

developed from a theory) do actually measure what the theory says they do. Construct validity evidence

involves the empirical and theoretical support for the interpretation of the construct. Such lines of

evidence include statistical analyses of the internal structure of the test including the relationships

between responses to different test items. They also include relationships between the test and measures

of other constructs. As currently understood, construct validity is not distinct from the support for the

substantive theory of the construct that the test is designed to measure. As such, experiments designed to

reveal aspects of the causal role of the construct also contribute to construct validity evidence.

Construct validity occurs when the theoretical constructs of cause and effect accurately represent the real-

world situations they are intended to model. This is related to how well the experiment is operationalised.

A good experiment turns the theory (constructs) into actual things you can measure. Sometimes just

finding out more about the construct (which itself must be valid) can be helpful. Construct validity addresses the constructs that are mapped onto the test items; it is assured either by a judgmental method or by developing the test specifications before the test is constructed.

Criterion Validity

Criterion validity evidence involves the correlation between the test and a criterion variable (or variables)

taken as representative of the construct. In other words, it compares the test with other measures or

outcomes (the criteria) already held to be valid. If the test data and criterion data are collected at the same

time, this is referred to as concurrent validity evidence. If the test data is collected first in order to predict

criterion data collected at a later point in time, then this is referred to as predictive validity evidence.

Concurrent Validity

According to Howell (1992) “concurrent validity is determined using other existing and similar tests

which have been known to be valid as comparisons to a test being developed. There is no other known

valid test to measure the range of cultural issues tested for this specific group of subjects”. Concurrent

validity refers to the degree to which scores taken at one point in time correlate with other measures (test,

observation or interview) of the same construct that is measured at the same time. Returning to the

selection test example, this would mean that the tests are administered to current employees and then

correlated with their scores on performance reviews. This measures the relationship between measurements

made with existing tests. The existing test is thus the criterion. For example, a measure of creativity

should correlate with existing measures of creativity.

Predictive Validity

Predictive validity addresses how well the test predicts some future behaviour of the examinee. It

refers to the degree to which the operationalizations can predict (or correlate with) other measures of the

same construct that are measured at some time in the future. Again, with the selection test example, this

would mean that the tests are administered to applicants, all applicants are hired, their performance is

reviewed at a later time, and then their scores on the two measures are correlated. This form of validity evidence is particularly useful and important for aptitude tests, which attempt to predict how

well the test taker will do in some future setting.

This measures the extent to which a future level of a variable can be predicted from a current

measurement. This includes correlation with measurements made with different instruments. For example,

a political poll intends to measure future voting intent. College entry tests should have a high predictive

validity with regard to final exam results. When the two sets of scores are correlated, the coefficient that

results is called the predictive validity coefficient.
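Concretely, both concurrent and predictive validity evidence reduce to a correlation between two sets of scores; only the timing of the criterion measure differs. Below is a minimal Python sketch with hypothetical scores (statistics.correlation computes the Pearson coefficient).

```python
import statistics

def validity_coefficient(test_scores: list[float], criterion_scores: list[float]) -> float:
    """Pearson correlation between test scores and criterion scores.

    Criterion collected at the same time -> concurrent validity coefficient.
    Criterion collected later in time    -> predictive validity coefficient.
    """
    return statistics.correlation(test_scores, criterion_scores)

# Hypothetical data: college entry-test scores vs. later final-exam scores.
entry_test = [55, 62, 70, 48, 85, 77]
final_exam = [58, 65, 74, 50, 88, 73]
print(round(validity_coefficient(entry_test, final_exam), 2))  # close to +1
```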

----------------------------------------------------Q#1 THE END-------------------------------------------------

Question No: 2 What are the rules for writing multiple-choice test items?

Answer:

Multiple Choice Questions (MCQs)

Norman E. Gronlund (1990) writes that the multiple choice question is probably the most popular as well

as the most widely applicable and effective type of objective test. The student selects a single response from a list of options. It can be used effectively for any level of course outcome. An item consists of two parts: the stem, which states the problem, and a list of three to five alternatives, one of which is the correct (key) answer while the others are distracters ("foils", incorrect options that draw the less knowledgeable pupil

away from the correct response). The stem may be stated as a direct question or as an incomplete

statement. For example:

Direct question

Which is the capital city of Pakistan? --------------- (Stem)

A. Lahore. ------------------------------------- (Distracter)



B. Karachi. ------------------------------------- (Distracter)

C. Islamabad. ---------------------------------- (Key)

D. Peshawar. ----------------------------------- (Distracter)

Incomplete Statement

The capital city of Pakistan is

A. Lahore.

B. Karachi.

C. Islamabad.

D. Peshawar.

RULES FOR WRITING MULTIPLE-CHOICE QUESTIONS

1. Use Plausible Distracters (wrong-response options)

• Only list plausible distracters, even if the number of options per question changes

• Write the options so they are homogeneous in content

• Use answers given in previous open-ended exams to provide realistic distracters

2. Use a Question Format

Experts encourage multiple-choice items to be prepared as questions (rather than incomplete statements)

Incomplete Statement Format:

The capital of AJK is in-----------------.

Direct Question Format:

In which of the following cities is the capital of AJK?

3. Emphasize Higher-Level Thinking

• Use memory-plus application questions. These questions require students to recall principles, rules or facts in a real-life context.

• The key to preparing memory-plus application questions is to place the concept in a life situation or context that requires the student to first recall the facts and then apply or transfer those facts to the situation.

• Seek support from others who have experience writing higher-level thinking multiple-choice questions.

4. Keep Option Lengths Similar

• Avoid making your correct answer noticeably longer or shorter than the distracters

5. Balance the Placement of the Correct Answer

• Correct answers tend to be placed as the second and third options; spread the key evenly across all positions (see the sketch after this list)

6. Be Grammatically Correct

• Use simple, precise and unambiguous wording

• Keep all options grammatically consistent with the stem; otherwise students will be more likely to select the correct answer simply by finding the grammatically correct option

7. Avoid Clues to the Correct Answer

• Avoid answering one question in the test by giving the answer somewhere else in the test

• Have the test reviewed by someone who can find mistakes, clues, grammar and punctuation problems before you administer the exam to students

• Avoid extremes – never, always, only

• Avoid nonsense words and unreasonable statements

8. Avoid Negative Questions

• 31 of 35 testing experts recommend avoiding negative questions

• Students may be able to find an incorrect answer without knowing the correct answer

9. Use Only One Correct Option (Or be sure the best option is clearly the best option)

• The item should include one and only one correct or clearly best answer

• With one correct answer, alternatives should be mutually exclusive and not overlapping

10. Give Clear Instructions

Such as:

• Questions 1 - 10 are multiple-choice questions designed to assess your ability to remember or recall basic and foundational pieces of knowledge related to this course.

• Please read each question carefully before reading the answer options. When you have a clear idea of the question, find your answer and mark your selection on the answer sheet. Please do not make any marks on this exam.

11. Use Only a Single, Clearly-Defined Problem and Include the Main Idea in the Question

12. Avoid the “All of the Above” Option

13. Avoid the “None of the Above” Option

14. Don’t Use MCQ When Other Item Types Are More Appropriate
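Rule 5 can be supported mechanically when assembling a test. The sketch below (a hypothetical helper using Python's standard random module) shuffles each item's options independently, which balances the key's position across the test on average:

```python
import random

def shuffle_options(stem: str, key: str, distracters: list[str]) -> tuple[str, list[str], int]:
    """Return the stem, the shuffled options, and the key's new index."""
    options = distracters + [key]
    random.shuffle(options)  # independent shuffles balance key placement on average
    return stem, options, options.index(key)

# Hypothetical item reusing the earlier example.
stem, options, key_index = shuffle_options(
    "Which is the capital city of Pakistan?",
    key="Islamabad",
    distracters=["Lahore", "Karachi", "Peshawar"],
)
print(options, "-> key at position", key_index)
```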

Advantages

• Quick and easy to score, by hand or electronically

• Can be written so that they test a wide range of higher-order thinking skills

• Can cover lots of content areas on a single exam and still be answered in a class period

Disadvantages

• Often test literacy skills: “if the student reads the question carefully, the answer is easy to recognize even if the student knows little about the subject”

• Provide unprepared students the opportunity to guess, and with guesses that are right, they get credit for things they don’t know

• Expose students to misinformation that can influence subsequent thinking about the content

• Take time and skill to construct (especially good questions)

---------------------------------------------------Q#2 THE END----------------------------------------------------



Question No: 3 Write a detailed note on scales of measurement.

Answer:

MEASUREMENT

Measurement is the assignment of numbers to objects or events in a systematic fashion. Measurement

scales are critical because they relate to the types of statistics you can use to analyze your data. An easy

way to have a paper rejected is to have used either an incorrect scale/statistic combination or to have used

a low-powered statistic on a high-powered set of data. The following four levels of measurement scales are commonly distinguished so that the proper analysis can be applied to the data; at the lowest level, a number is used merely to label or categorize a response.

Scales of measurement describe how variables are defined and categorised. Psychologist Stanley

Stevens developed the four common scales of measurement: nominal, ordinal, interval and ratio. Each

scale of measurement has properties that determine how to properly analyse the data. The properties

evaluated are identity, magnitude, equal intervals and a minimum value of zero.

Properties of Measurement

• Identity: Identity refers to each value having a unique meaning.

• Magnitude: Magnitude means that the values have an ordered relationship to one another, so there is a specific order to the variables.

• Equal intervals: Equal intervals mean that data points along the scale are equal, so the difference between data points one and two will be the same as the difference between data points five and six.

• A minimum value of zero: A minimum value of zero means the scale has a true zero point. Degrees, for example, can fall below zero and still have meaning. But if you weigh nothing, you don’t exist.

THE FOUR SCALES OF MEASUREMENT

By understanding the scale of the measurement of their data, data scientists can determine the kind of

statistical test to perform.



1. Nominal Scale

The nominal scale of measurement defines the identity property of data. This scale has certain

characteristics, but doesn’t have any form of numerical meaning. The data can be placed into categories

but can’t be multiplied, divided, added or subtracted from one another. It’s also not possible to measure

the difference between data points.

Nominal scales are the lowest scales of measurement. A nominal scale, as the name implies, is simply

some placing of data into categories, without any order or structure. You are only allowed to examine if a

nominal scale datum is equal to some particular value or to count the number of occurrences of each

value. For example, classmates’ blood groups can be categorized into A, B, AB, and O. The only mathematical operation we can perform with nominal data is to count. Variables assessed on a nominal scale are called categorical variables; categorical data are measured on nominal scales, which merely

assign labels to distinguish categories. For example, gender is a nominal scale variable. Classifying

people according to gender is a common application of a nominal scale.

Nominal Data

Nominal data can be broken down again into three categories:

• Nominal with order: Some nominal data can be sub-categorised in order, such as “cold, warm, hot and very hot.”

• Nominal without order: Nominal data can also be sub-categorised as nominal without order, such as male and female.

• Dichotomous: Dichotomous data is defined by having only two categories or levels, such as “yes” and “no”.

2. Ordinal Scale

The ordinal scale defines data that is placed in a specific order. While each value is ranked, there’s no

information that specifies what differentiates the categories from each other. These values can’t be added

to or subtracted from.

Something measured on an "ordinal" scale does have an evaluative connotation. You are also allowed to

examine if an ordinal scale datum is less than or greater than another value. For example, consider rating job

satisfaction on a scale from 1 to 10, with 10 representing complete satisfaction. With ordinal scales, we

only know that 2 is better than 1 or 10 is better than 9; we do not know by how much. It may vary. Hence,

you can 'rank' ordinal data, but you cannot 'quantify' differences between two ordinal values. An ordinal scale also includes the properties of a nominal scale.

3. Interval Scale

The interval scale contains properties of nominal and ordered data, but the difference between data points

can be quantified. This type of data shows both the order of the variables and the exact differences

between the variables. They can be added to or subtracted from each other, but not multiplied or divided.

An ordinal scale whose differences between values are quantifiable becomes an interval scale. You are allowed to

quantify the difference between two interval scale values but there is no natural zero. A variable measured

on an interval scale gives information about more or better as ordinal scales do, but interval variables have

an equal distance between each value. The distance between 1 and 2 is equal to the distance between 9

and 10.

For example, Celsius temperature is interval data: 25°C is warmer than 20°C, and a 5°C difference has a consistent physical meaning. Note that 0°C is arbitrary, so it does not make sense to say that 20°C is twice as hot as 10°C, but there is exactly the same difference between 100°C and 90°C as there is between 42°C and 32°C. Students’ achievement scores are measured on an interval scale.

4. Ratio Scale

Something measured on a ratio scale has the same properties that an interval scale has except, with a ratio

scaling, there is an absolute zero point. Ratio scales of measurement include properties from all four

scales of measurement. The data has an identity, can be classified in order, contains equal intervals and can be broken down into exact values. Weight, height and distance are all examples of ratio

variables. Data in the ratio scale can be added, subtracted, divided and multiplied.

Ratio scales also differ from interval scales in that the scale has a ‘true zero’. A value of zero means that none of the quantity being measured is present. An example of this is height or weight, as someone cannot be zero

centimeters tall or weigh zero kilos – or be negative centimeters or negative kilos. Examples of the use of

this scale are calculating shares or sales. Of all types of data on the scales of measurement, data scientists

can do the most with ratio data points.

To summarize, nominal scales are used to label or describe values. Ordinal scales are used to provide

information about the specific order of the data points, mostly seen in the use of satisfaction surveys. The

interval scale is used to understand both the order of data points and the differences between them. The ratio scale gives the most information: identity, order and equal differences, plus a true zero that makes ratios between data points meaningful.
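To make the four scales concrete, the following minimal Python sketch (hypothetical data) pairs each scale with a summary statistic it can legitimately support:

```python
import statistics

# Nominal: labels only -> counting and the mode are the meaningful summaries.
blood_groups = ["A", "B", "AB", "O", "A", "O", "A"]
print(statistics.mode(blood_groups))      # "A"

# Ordinal: ranked, gaps unknown -> the median is meaningful, the mean is not.
satisfaction = [1, 3, 3, 4, 5, 2, 4]      # 1 = low, 5 = high
print(statistics.median(satisfaction))    # 3

# Interval: equal gaps, arbitrary zero -> differences and means are meaningful.
temps_c = [20.0, 25.0, 18.5, 22.0]
print(statistics.mean(temps_c))           # valid; but 20 C is not "twice" 10 C

# Ratio: true zero -> ratios are meaningful as well.
weights_kg = [60.0, 75.0, 82.5]
print(weights_kg[1] / weights_kg[0])      # 1.25: 75 kg is 1.25 times 60 kg
```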

------------------------------------------------------Q#3 THE END------------------------------------------------

Question No: 4 What are the considerations in conducting parent-teacher conferences?

Answer:

Parent-Teacher Conferences

Parent-teacher conferences are mostly used in elementary schools. In such conferences, portfolios are discussed. This is a two-way exchange and provides much information to the parents. But one of the limitations is that many parents do not come to attend the conferences. It is also a time-consuming activity and needs sufficient funds to hold the conferences.

The literature also highlights the ‘parent-student-teacher conference’ instead of the ‘parent-teacher conference’, as the student is also a key participant in the process and is the one directly benefited. In many

developed countries, it has become the most important way of informing parents about their children’s

work in school. Parent-teacher conferences are productive when these are carefully planned and the

teachers are skilled and committed.

The parent-teacher conference is an extremely useful tool, but it shares three important limitations with the informal letter. First, it requires a substantial amount of time and skill. Second, it does not provide a systematic record of the student’s progress. Third, some parents are unwilling to attend conferences, and attendance cannot be enforced. Parent-student-teacher conferences are frequently convened in many states of the USA

and some other advanced countries. In the US, this has become a striking feature of Charter Schools.

Some schools rely more on parent conferences than written reports for conveying the richness of how

students are doing or performing. In such cases, a school sometimes provides a narrative account of

student’s accomplishments and status to augment the parent conferences.

Parent-teacher conferences have been an important part of educating generations of K–12 students.

Successful conferences require both parties to listen respectfully to what each has to say, and they’re a

proven way to help students.

A parent-teacher conference is a breeze when the student is doing well, but teachers are often in the

difficult position of explaining to parents that their child is struggling with some aspect of schoolwork.

This can trigger a wide range of responses. Some parents will blame their child, while others will blame

themselves or feel overwhelmed or inadequate.

Teachers need to assure parents that the conference isn’t about assigning blame. All parents can

encourage their children and advocate for them, including those who can’t provide much help with their

children’s homework.

The primary purpose of a parent-teacher conference is for the teacher to brief the parents on the child’s

academic progress and share anything notable about the child’s behavior and development at school.

It’s helpful to let parents know if their child is attentive in class, participates in discussions, or has some

potential that might only be obvious in the classroom. This is a great time to give parents suggestions

about how they can help their child and to tell them about any additional resources available — like

enrichment and remedial programs — that could be helpful.

While the student’s academic progress is the main focus of parent-teacher conferences, the sessions are

much more than simply an opportunity to tell parents how their child can improve their grades.

Parent-teacher conferences, which are typically held only once per semester and seldom last more than 30

minutes, are also a rare opportunity for the teacher to learn from the parents about the child’s patterns for

doing school work, reading, and preparing for tests. As a teacher, you need to listen closely when parents

answer questions about anything that might affect the student’s academic performance.

When you’re assessing the student’s academic performance, you want to get as much information as

possible to determine whether they need help and what kind. It’s important to discuss factors outside the

classroom that could influence a student’s behavior, classroom focus, motivation, and their relationships

with schoolmates.

Your discussion with parents can touch on the student’s home life, family dynamics, or family finances.

While they’re sometimes awkward, frank discussions are necessary for the teacher to understand the

student’s challenges and for the parents to learn how they can help their child reach their academic

potential. It may be valuable to include school staff — such as counselors — in your meeting with the

parents. While they’re a lot of work, the most effective parent-teacher conferences boost family

involvement — and that promotes positive outcomes.

Considerations in Parent-Teacher Conference

Parent-teacher conferences are not just for parents who have concerns about their child in the classroom.

These meetings provide an opportunity for parents to partner with teachers in providing a good learning

experience for children.

Here’s how to strengthen that partnership at your child’s parent-teacher conference:

1. Prepare for the conference. Keep a folder with tests, papers, or any other topics you want to address.

Do your research by talking to your child; find out how she is doing in class, what’s happening at lunch

and recess.

2. Respect the teacher’s time. Be prompt, try not to bring other children and do not answer your cell

phone during the meeting!

3. Begin with a positive attitude. Start with a compliment for the teacher before addressing concerns.

Starting with a negative comment could put the teacher on the defensive.

4. Work together. Aim to partner with the teacher by being a team player. Approach issues

the teacher raises with the attitude of "We have to work together on this problem."

5. Remember that this is about your child, not you. The teacher is not trying to tell you how to parent.

He is only there to help your child. No need to get defensive.

6. Listen and be open minded. A conference is for teachers to give parents information based on

classroom observation. The teacher would not bring up an issue if it was not necessary.

7. Teachers are not the enemy. If you have a problem, no need to go on the attack. They care about your

child and want to help you and your child resolve the issue.

8. Ask questions. After you’ve listened, it’s your turn to ask questions. Make a list beforehand so you’ll

remember what’s important. 

9. Learn the communication protocol. Ask the teacher her preferred method of communication: phone,

email, or notes. Let the teacher know you want to be an available parent.

10. Ask how you can be involved. Whether you work full-time, part-time or not at all, involvement is

about being aware of what's happening with your child's education, not just about volunteering in the

classroom.

--------------------------------------------------Q#4 THE END----------------------------------------------------



Question No: 5 Write a note on the advantages and disadvantages of criterion-referenced testing.

Answer:

CRITERION-REFERENCED TEST

A criterion-referenced test is designed to measure how well test takers have mastered a particular body of

knowledge. The term "criterion-referenced test" is not part of the everyday vocabulary in schools, and

yet, nearly all students take criterion-referenced tests on a routine basis. These tests generally have an

established "passing" score. Students know what the passing score is and an individual's test score is

determined by knowledge of the course material.

The criterion-referenced test definition states that this type of assessment compares a student’s academic

achievement to a fixed set of criteria or standards. These criteria are established before candidates begin

the test.

Usually, schools or districts set the standard as a percentage. The test-taker’s score shows how far they’ve

progressed toward the approved standard. If they miss the mark, they must work harder.

A good example is measuring your body temperature. The accepted normal level is 98.6 degrees

Fahrenheit. If your temperature is too high in comparison, you are running a fever.

Criterion-referenced evaluations are used in schools to examine specific knowledge and abilities that

students have most likely gained. This determines how close they are to mastering a standard. They allow

teachers to assess how they can help students improve in specific areas. Criterion-referenced

evaluations will show you where your learners are in terms of an accepted standard, allowing you to tailor

instruction and assistance for students. Criterion-referenced assessment examples include driving tests,

end-of-unit exams in school, clinical skill competency tools, etc.

It is important to distinguish between criterion-referenced tests and norm-referenced tests. The

standardized tests used to measure how well an individual does relative to other people who have taken

the test are norm-referenced.
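The difference can be seen in a few lines of code. A minimal Python sketch, assuming a hypothetical 80-mark cut-score and hypothetical group scores, contrasts the two interpretations of the same score:

```python
def criterion_referenced_pass(score: float, cut_score: float = 80.0) -> bool:
    """Mastery decision: compare the score to a fixed, pre-set standard."""
    return score >= cut_score

def norm_referenced_percentile(score: float, group: list[float]) -> float:
    """Relative standing: percentage of the group scoring below this score."""
    return 100 * sum(1 for s in group if s < score) / len(group)

group_scores = [55, 62, 68, 74, 79, 83, 90]
print(criterion_referenced_pass(79))                 # False: below the standard
print(norm_referenced_percentile(79, group_scores))  # ~57: above most peers
```

Note how the same score of 79 fails the fixed criterion yet sits above most of the group, which is exactly the distinction between the two types of test.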



TYPES OF CRITERION-REFERENCED TESTS

Criterion-referenced tests are mainly of the following types:

1. QUESTIONNAIRES AND SURVEYS

These could be about, for example, the number of children served, the number of children in each language group, or the usual level of schooling of parents. Replies are recorded on a scale of 1 to 5 on an observation form or survey, and this information can be scored and examined.

2. MULTIPLE-CHOICE QUESTIONS

In this type of criterion-referenced test, several answer choices follow a single question. There is only one correct answer, and the score depends on the number of correct answers chosen.

3. TRUE OR FALSE QUESTIONS

In this format, a given sentence can either be true or false. The student might be asked to select the correct

statement or the false statement, or state whether the given statement is true or false.

4. OPEN-ENDED QUESTIONS

In this format, the student may be asked to write a short answer or an essay, or to summarize a passage. It may also include a combination of different question types.

ADVANTAGES OF CRITERION-REFERENCED TESTS

Mastery of Subject Matter.

Criterion-referenced tests are more suitable than norm-referenced tests for tracking the progress of

students within a curriculum. Test items can be designed to match specific program objectives. The scores

on a criterion-referenced test indicate how well the individual can correctly answer questions on the

material being studied, while the scores on a norm-referenced test report how the student scored relative

to other students in the group.

Criterion-Referenced Tests can be Managed Locally.

Assessing student progress is something that every teacher must do. Criterion-referenced tests can be

developed at the classroom level. If the standards are not met, teachers can specifically diagnose the

deficiencies. Scores for an individual student are independent of how other students perform. In addition,

test results can be quickly obtained to give students effective feedback on their performance. Although

norm-referenced tests are most suitable for developing normative data across large groups, criterion-

referenced tests can produce some local norms.

Criterion-referenced assessments are needs-based, meaning the tests are created around the students’ needs. If a student really needs to improve their knowledge of proper nouns, then a test will be created

on proper nouns.

When discussing the advantages of criterion-referenced tests, it is also important to mention that since

students are only judged against themselves, they have a better chance of scoring high, which will help

improve their self-esteem as well. Studies show that students with special needs tend to have lower self-

esteem. Any way that we can help students feel better about themselves is a great opportunity.

One thing to remember is that each student is an individual and is different. By using criterion-referenced

assessments in your classroom, you can meet the individual needs of the students and differentiate your

assessments with the sole purpose of helping the students achieve to their fullest potential.

DISADVANTAGES OF CRITERION-REFERENCED TESTS

Criterion-referenced tests have some built-in disadvantages. Creating tests that are both valid and reliable

requires a fairly extensive and expensive investment of time and effort. In addition, results cannot be generalized beyond

the specific course or program. Such tests may also be compromised by students gaining access to test

questions prior to exams. Criterion-referenced tests are specific to a program and cannot be used to

measure the performance of large groups.

Analyzing Test Items

Item analysis is used to measure the effectiveness of individual test items. The main purpose is to improve

tests, to identify questions that are too easy, too difficult or too susceptible to guessing. While test items

can be analyzed on both criterion-referenced and norm-referenced tests, the analysis is somewhat different

because the purpose of the two types of tests is different.

Items on norm-referenced tests need to discriminate between high and low performers because those tests

are generally used to make aptitude, proficiency or placement decisions. Criterion-referenced tests, in

contrast, are used to measure mastery of specific material and the goal is success for all students. The best

items on criterion-referenced tests are those that tap the important concepts.
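Two standard item-analysis statistics make this concrete: the difficulty index p (the proportion of examinees answering an item correctly) and the discrimination index D (the difference in p between high- and low-scoring groups). Below is a minimal Python sketch with hypothetical response counts:

```python
def difficulty_index(correct_count: int, total_examinees: int) -> float:
    """p = proportion of examinees who answered the item correctly."""
    return correct_count / total_examinees

def discrimination_index(upper_correct: int, lower_correct: int, group_size: int) -> float:
    """D = p(upper group) - p(lower group).

    Norm-referenced tests look for high D; on a criterion-referenced test,
    a low D together with a high p may simply mean that most students
    mastered the concept, which is the goal.
    """
    return (upper_correct - lower_correct) / group_size

# Hypothetical item: 40 examinees; upper and lower groups are the top and
# bottom 10 ranked by total test score.
print(difficulty_index(30, 40))        # 0.75 -> a moderately easy item
print(discrimination_index(9, 4, 10))  # 0.50 -> discriminates well
```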

----------------------------------------------------Q#5 THE END-----------------------------------------------------
