Topic: How To Assess? – Essay Tests

Topic 5  How to Assess? – Essay Tests
LEARNING OUTCOMES

By the end of the topic, you should be able to:


1. Define and list the criteria for an essay question;
2. Explain the formats of essay tests;
3. List the advantages and limitations of essay questions;
4. Construct well-written essay questions that assess given learning
outcomes; and
5. Describe different types of marking schemes for essays.

 INTRODUCTION
In Topic 4, we discussed in detail the use of objective tests in assessing students.
In this topic, we will examine a different type of test called the essay test. The essay
test is a popular technique for assessing learning and is used extensively at all
levels of education.

It is also widely used in assessing learning outcomes in business and professional
examinations. Essay questions are used because they challenge students to create
their own responses rather than simply selecting a response. Essay questions have
the potential to reveal students’ abilities to reason, create, analyse and synthesise,
which may not be effectively assessed using objective tests.

Copyright © Open University Malaysia (OUM)


TOPIC 5 HOW TO ASSESS? – ESSAY TESTS  91

5.1 WHAT IS AN ESSAY QUESTION?


According to Stalnaker (1951), an essay is “a test item which requires a response
composed by the examinee usually in the form of one or more sentences of a nature
that no single response or pattern of responses can be listed as correct, and the
accuracy and quality of which can be judged subjectively only by one skilled or
informed in the subject.” Though the definition was provided a long time ago, it is
a comprehensive definition. Elaborating on this definition, Reiner, Bothell,
Sudweeks and Wood (2002) argued that to qualify as an essay question, it should
meet the following four criteria:

(a) The learner has to compose rather than select his or her response or answer.
In essay questions, students have to construct their own answer and decide
on what material to include in their response. Objective test questions (MCQ,
true-false, matching) on the other hand, require students to select the answer
from a list of possibilities.

(b) The response or answer the learner provides will consist of one or more
sentences. Students do not respond with a “yes” or “no” but instead have to
respond in the form of sentences. In theory, there is no limit to the length of
the answer. However, in most cases, its length is predetermined by the
demand of the question and the time limit allotted for the test question.

(c) There is no one single correct response or answer. In other words, the
question should be composed so that it does not ask for one single correct
response. For example, the question “Who killed JWW Birch?” assesses
verbatim recall or memory and not the ability to think. Hence, it cannot
qualify as an essay question. You can modify the question to “Who killed JWW
Birch? Explain the factors that led to the killing.” Now, this is an essay
question that assesses students’ ability to think and give reasons for the
killing supported with relevant evidence.

(d) The accuracy and quality of students’ responses or answers to essay
questions must be judged subjectively by a specialist in the subject. The
nature of essay questions is such that only specialists in the subject can judge
to what degree responses (or answers) to an essay question are complete,
accurate and relevant. Good essay questions encourage students to think
deeply about their answers, which can be judged only by someone with
appropriate experience and expertise in the content area. Thus, content
expertise is essential for both writing and grading essay tests. For example,
the question “List three reasons for the opening of Penang by the British in
1786” requires students to recall a set list of items. The person marking or
grading the essay does not have to be a subject matter expert to know
whether the student has listed the three reasons correctly as long as the list
of three reasons is available as an answer key. For the question “To what
extent is commerce the main reason for the opening of Penang by the British
in 1786?”, a subject matter expert is needed to grade or mark the answer to
this essay test question.

5.2 FORMATS OF ESSAY TESTS


Essay formats are usually classified into two groups: restricted response essay
questions and extended response essay questions. Both types are useful tools but
for different purposes.

(a) Restricted Response Essay Questions


Restricted response essay questions restrict or limit both the content and the
form of students’ answers. The following are three examples:

(i) Discuss two advantages and two disadvantages of essay questions in
measuring students’ performance.

(ii) List five guidelines for writing good essay items. For each guideline,
write a short statement explaining why it is useful in improving the
validity of essay assessment.

(iii) Distinguish formative assessment from summative assessment
in terms of their aims, the timing of their implementation and their content
coverage.

As shown in the examples, students are specifically informed what to respond
to and how they should respond. The questions indicate the number of points
required and/or the scope of the responses. The restriction or limitation on
the students’ responses can also be imposed by including interpretive
material (e.g. a graph, a paragraph describing a particular problem or an
extract from a literary work) and asking students to respond to one or two
questions based on it.

Restricted response questions are more structured and are useful for
measuring learning outcomes requiring the interpretation and application of
knowledge in a specific area. They narrow the focus of the assessment task
to a specific and well-defined performance. The nature of these questions
makes it more likely that the students will interpret each question the way it
is intended. The teacher is also in a better position to assess the correctness
of students’ answers when a question is focused and all students interpret it
in the same way. When the teacher is clear about what makes up correct
answers, it improves scoring reliability and the scores’ validity. Although
restricting students’ responses makes it possible to measure more specific
learning outcomes, these same restrictions make them less valuable as a
measure of those learning outcomes emphasising integration, organisation
and originality. For higher-order learning outcomes, greater freedom of
response is needed.

(b) Extended Response Essay Questions


Extended response essay questions provide less structure and this promotes
greater creativity, integration and organisation of material. The following are
three examples:

(i) Examine to what extent essay questions are effective in measuring
students’ performance.

(ii) Evaluate the usefulness of multiple-choice questions as an assessment
tool in education.

(iii) “Research without theory is blind.” Discuss.

In responding to extended response essay questions, students are free to select any
information that they think pertinent, to organise the answer in accordance with
their best judgement, to integrate and to evaluate ideas they deem appropriate.
This freedom enables them to demonstrate their ability to analyse problems,
organise their ideas, describe in their own words, and/or develop a coherent
argument. The extended-response essay questions are therefore useful in assessing
higher-order thinking skills. They can also be used to assess writing skills.

The freedom for students to respond to extended response essay questions can
cause some problems. First, there is usually no single correct answer to the
question. Students are free to choose the way to respond, and the degree of
correctness or merit of their answers can only be judged by a skilled subject-matter
expert. A large number of examiners is required if the assessment involves a big
student population. Inter-rater reliability in scoring can be an issue. Second, the
same freedom that enables the demonstration of creative expression and other
higher-order thinking skills makes the extended response essay question
inefficient for measuring more specific learning outcomes. Third, the extended
response essay questions require good writing skills on the part of the students.
This type of question is thus disadvantageous to students whose writing skills are
poor. Due to these limitations, it is often recommended that more restricted
response essay questions be used in place of extended response essay questions.


ACTIVITY 5.1

Select a few essay questions that have been used in tests or examinations.
To what extent do these questions meet the criteria of an essay question
as defined by Stalnaker (1951) and elaborated by Reiner et al. (2002)?

Discuss with your coursemates in the myINSPIRE online forum.

5.3 ADVANTAGES OF ESSAY QUESTIONS


Essay questions are used to assess learning for the following reasons:

(a) Essay questions provide an effective way of assessing complex learning
outcomes. They allow one to assess students’ ability to synthesise, organise
and express ideas, and evaluate the worth of ideas. These abilities cannot be
effectively assessed directly with other paper-and-pencil test items.

(b) Essay questions allow students to demonstrate their reasoning. These
questions not only allow students to present an answer to a question but also
to explain how they have arrived at their conclusions. This allows teachers
to gain insight into a student’s way of viewing and solving problems. With
such insight, teachers can detect problems which students may have with
their reasoning process and help them overcome these problems.

(c) Essay questions provide authentic experiences. Constructing responses is
closer to real life than selecting responses as in the case of objective tests.
Problem solving and decision making are vital life competencies which
require the ability to construct a solution or decision rather than select a
solution or decision from a limited set of possibilities. In the work
environment, it is unlikely that an employer will give a list of “four options”
for a worker to choose from when the latter is asked to solve a problem. In
most cases, the worker will be required to construct a response.


5.4 DECIDING WHETHER TO USE ESSAY QUESTIONS OR OBJECTIVE QUESTIONS

Keep in mind that essay questions should target higher-order thinking skills.
Even so, deciding whether to use essay questions or objective questions in
examinations can be problematic for some educators. In such a situation, one has
to go back to the objectives of the assessment. What kinds of learning outcomes do
you intend to assess? Essay questions are generally suitable to assess:

(a) Students’ understanding of subject matter or content; and

(b) Thinking skills that require more than simple verbatim recall of information
by challenging the students to reason with their knowledge.

It is challenging to write test items to tap into higher-order thinking. However,
students’ understanding of subject matter or content, and many of the other
higher-order thinking skills, can also be assessed through objective items. When in
doubt about whether to use an essay question or an objective question, just
remember that essay questions are used to assess students’ ability to construct
rather than select answers.

To determine what type of test (essay or objective) to use, it is helpful to
examine the verb(s) that best describe the desired ability to be assessed (refer to
Topic 2).

These verbs indicate what students are expected to do and how they should
respond. They serve to focus the students’ responses and channel them towards
the performance of specific tasks. Some verbs clearly indicate that students need
to construct rather than select their answer (such as to explain). Other verbs
indicate that the intended learning outcome is focused on students’ ability to recall
information (such as to list). Recall is perhaps best assessed through objectively
scored items. Verbs that test for understanding of subject matter or content or other
forms of higher-order thinking, but do not specify whether the student is to
construct or select the response (such as to interpret), can be assessed either by
essay questions or objective items.
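
The verb-based decision described above can be sketched as a simple lookup. The following is only an illustration: the three verbs and their classifications come from the paragraph above, while the names SUGGESTED_FORMAT and suggest_format are hypothetical and any verb not listed is deliberately left unclassified rather than guessed.

```python
# Hypothetical lookup built only from the three verbs discussed in the text.
SUGGESTED_FORMAT = {
    "explain": "essay",        # students must construct the answer
    "list": "objective",       # recall of information
    "interpret": "either",     # verb does not say construct or select
}


def suggest_format(verb):
    """Return the suggested item format for a learning-outcome verb.

    Verbs outside the small table are reported as "unclassified" so that
    the teacher makes the judgement instead of the lookup.
    """
    return SUGGESTED_FORMAT.get(verb.strip().lower(), "unclassified")


for verb in ["Explain", "list", "interpret", "design"]:
    print(verb, "->", suggest_format(verb))
```

A table like this is only a starting point; the final choice still depends on the intended learning outcome.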


ACTIVITY 5.2

Consider the following verbs: compare, explain, arrange, apply, state,
classify, design, illustrate, describe, name, complete, choose and defend.
Decide which of the verbs in the list are best assessed by essay questions,
by objective tests, or by both objective and essay questions.

Post your answer on the myINSPIRE online forum.

5.5 LIMITATIONS OF ESSAY QUESTIONS


While essay questions are popular because they enable the assessment of
higher-order learning outcomes, this format of evaluating students in examinations
has a number of limitations which should be kept in mind.

(a) One purpose of testing is to assess a student’s mastery of subject matter. In
most cases, it is not possible to assess the student’s mastery of the complete
subject matter domain with just a few questions. Because of the time it takes
for students to respond to essay questions and for markers to mark students’
responses, the number of essay questions that can be included in a test is
limited. Therefore, using essay questions will limit the degree to which the
test is representative of the subject matter domain, thereby reducing content
validity. For instance, a test of 80 multiple-choice questions will most likely
cover more of the content domain than a test of three to four essay questions.

(b) Essay questions have limitations in reliability. While essay questions allow
students some flexibility in formulating their responses, the reliability of
marking or grading is questionable. Different markers or graders may vary
in their marking or grading of the same or similar responses (inter-scorer
reliability) and one marker can vary significantly in his or her marking or
grading consistency across questions depending on many factors
(intra-scorer reliability). Therefore, essay answers of similar quality may receive
notably different scores. Characteristics of the learner, length and legibility
of responses, and personal preferences of the marker or grader with regard
to the content and structure of the response are some of the factors that may
lead to unreliable marking or grading.


(c) Essay questions require more time for marking student responses. Teachers
need to invest a large amount of time to read and mark students’ responses
to essay questions. On the other hand, relatively little or no time is required
for teachers to score objective test items like multiple-choice items and
matching exercises.

(d) As mentioned earlier, one of the strengths of essay questions is that they
provide students with authentic experiences because students are challenged
to construct rather than select their responses. To what extent does the short
time normally allotted to a test affect student responses? Students have
relatively little time to construct their responses and this time limit does not
allow them to give appropriate attention to the complex process of
organising, writing and reviewing their responses. In fact, in responding to
essay questions, students use a writing process that is quite different from
the typical process that produces excellent writing (draft, review, revise and
evaluate). In addition, students usually have no resources (such as a dictionary
or thesaurus) to aid their writing when answering essay questions. This
disadvantage may offset whatever advantage accrues from the fact that
responses to essay questions are more authentic than responses to
multiple-choice items.
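
To make limitation (b) more concrete, the sketch below shows one simple way a teacher might check inter-scorer consistency when two markers have marked the same scripts. It is a minimal illustration under stated assumptions, not a prescribed procedure: the marks are invented, the function names are hypothetical, and exact agreement and the Pearson correlation are only two of several indices that could be used.

```python
from statistics import mean


def exact_agreement(scores_a, scores_b):
    """Proportion of scripts given exactly the same mark by both markers."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty mark lists of equal length")
    matches = sum(1 for a, b in zip(scores_a, scores_b) if a == b)
    return matches / len(scores_a)


def pearson_r(scores_a, scores_b):
    """Pearson correlation between the two markers' marks."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two mark lists of equal length (at least 2)")
    mean_a, mean_b = mean(scores_a), mean(scores_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(scores_a, scores_b))
    var_a = sum((a - mean_a) ** 2 for a in scores_a)
    var_b = sum((b - mean_b) ** 2 for b in scores_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined when a marker gives identical marks")
    return cov / (var_a * var_b) ** 0.5


# Invented marks (out of 10) given by two markers to the same six scripts.
marker_1 = [7, 5, 8, 6, 9, 4]
marker_2 = [6, 5, 8, 4, 9, 5]

print("Exact agreement:", exact_agreement(marker_1, marker_2))
print("Correlation:", round(pearson_r(marker_1, marker_2), 2))
```

Low agreement or a weak correlation would suggest that the marking scheme needs to be clarified before the marks are used.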

5.6 MISCONCEPTIONS ABOUT ESSAY QUESTIONS IN EXAMINATIONS

Other than the limitations of essay questions discussed earlier, there are also some
misconceptions about this form of assessment. These misconceptions are:

(a) By Their Very Nature, Essay Questions Assess Higher-order Thinking


Whether or not an essay item assesses higher-order thinking depends on the
design of the question and how students’ responses are scored. Not all essay
questions can assess higher-order thinking skills. Indeed, it is possible to
write essay questions that simply assess recall. Also, if a teacher designs an
essay question meant to assess higher-order thinking but then scores
students’ responses in a way that only rewards recall ability, that teacher is
not assessing higher-order thinking. Therefore, teachers must be well-trained
to design and write higher-order thinking questions.


(b) Essay Questions are Easy to Construct


Essay questions are easier to construct than multiple-choice items because
teachers do not have to create effective distractors. However, that does not
mean that good essay questions are easy to construct. They may be easier to
construct in a relative sense, but they still require a lot of effort and time.
Essay questions that are hastily constructed without much thought and
review usually function poorly.

(c) The Use of Essay Questions Eliminates the Problem of Guessing


One of the drawbacks of objective test items is that students sometimes
get the right answer by guessing which of the presented options is correct.
This problem does not exist with essay questions because students need
to generate the answer rather than identifying it from a set of options
provided. At the same time, the use of essay questions introduces bluffing,
another form of guessing. Some students are “good” at using various
methods of bluffing (vague generalities, padding, name-dropping) to add
credibility to an otherwise weak answer. Thus, the use of essay questions
changes the nature of the guessing that occurs, but does not eliminate it.

(d) Essay Questions Benefit All Students by Placing Emphasis on the Importance
of Written Communication Skills
Written communication is a life competency that is required for effective and
successful performance in many vocations. Essay questions challenge
students to organise and express subject matter and problem solutions in
their own words, thereby giving them a chance to practise written
communication skills that will be helpful to them in future vocational
responsibilities. At the same time, the focus on written communication skills
is also a serious disadvantage for students who have marginal writing skills
but know the subject matter being assessed. If students who are
knowledgeable in the subject obtain low scores because of their inability to
write well, the validity of the test scores will be diminished.

(e) Essay Questions Encourage Students to Prepare More Thoroughly


Some research seems to indicate that students are more thorough in
their preparation for examinations using essay questions than in their
preparation for objective examinations such as those using multiple-choice
questions. However, after an extensive review of existing literature and
research on this topic, Crooks (1988) concluded that the extent of students’
preparation is based more on the expectations teachers set for them
(higher-order thinking and breadth and depth of content) than on the type of
test questions they expect to be given in examinations.


SELF-CHECK 5.1

1. What are some limitations in the use of essay questions?

2. List some of the misconceptions about essay questions.

ACTIVITY 5.3

Compare the following two essay questions and decide which one
assesses higher-order thinking skills.

(a) “What are the major advantages and limitations of solar energy?”

(b) “Given its advantages and limitations, should governments spend
money developing solar energy?”

Post your answer on the myINSPIRE online forum.

5.7 GUIDELINES ON CONSTRUCTING ESSAY QUESTIONS

When constructing essay questions, whether they are for coursework assessments
or examinations, the most important thing is to ensure that students have a clear
idea of what they are expected to do after they have read the question or problem
presented.

Here are specific guidelines that can help you improve existing essay questions
and create new ones.

(a) Clearly Define the Intended Learning Outcome to be Assessed by the Question
Knowing the intended learning outcome is crucial for designing essay
questions. In specifying the intended learning outcome, teachers clarify the
performance that students should be able to demonstrate as a result of what
they have learnt. The intended learning outcome typically begins with a verb
that describes an observable behaviour or action that students should
demonstrate. The focus is on what students should and should not be able to
do in the learning or teaching process. Reviewing a list of verbs can help to
clarify what ability students should demonstrate, thereby defining the
intended learning outcome to be assessed (refer to subtopic 4.8).

(b) Avoid Using Essay Questions for Intended Learning Outcomes that are
Better Assessed with Other Kinds of Assessment
Some types of learning outcomes can be more efficiently and more reliably
assessed with objective tests than with essay questions. Since essay questions
sample a limited range of subject matter or content, are more time-
consuming to score and involve greater subjectivity in scoring, the use of
essay questions should be reserved for learning outcomes that cannot be
better assessed by some other means. Let us look at Example 5.1.

Example 5.1:
Learning Outcome:
To be able to differentiate the reproductive habits of birds and amphibians.

Essay Question:
What are the differences in egg laying characteristics between birds and
amphibians?

Note: This learning outcome can be better assessed by an objective test.

Objective Item:
Which of the following differences between birds and amphibians is correct?

     Birds                       Amphibians
A    Lay a few eggs at a time    Lay many eggs at a time
B    Lay eggs                    Give birth
C    Do not incubate eggs        Incubate eggs
D    Lay eggs in nest            Lay eggs on land

(c) Clarity About the Task and Scope


Essay questions have two variable elements – the degree to which the task is
structured and the degree to which the scope of the content is focused. There
is still confusion among educators as to whether more structure (of the task
required) and more focus (on the content) are better than less structure and
less focus. When the task is more structured and the scope of content is more
focused, two problems are reduced:

(i) The problem of student responses containing ideas that were not meant
to be assessed; and

(ii) The problem of extreme subjectivity when scoring student answers or
responses.


Although more structure helps to avoid these problems, how much and what
kind of structure and focus to provide are dependent on the intended
learning outcome that is to be assessed by the essay question. The process of
writing effective essay questions involves defining the task and delimiting
the scope of the content in an effort to create an effective question that is
aligned with the intended learning outcome to be assessed by it (as
illustrated in Figure 5.1).

Figure 5.1: Alignment between content, learning activities and assessment tasks
Source: Phillips, Ansary Ahmed and Kuldip Kaur (2005)

This alignment is absolutely necessary for obtaining students’ responses that
can be accepted as evidence that a student has achieved the intended learning
outcome. Hence, the essay question must be carefully and thoughtfully
written in such a way that it elicits student responses that provide the teacher
with valid and reliable evidence about the students’ achievement of the
intended learning outcome. Failure to establish adequate and effective limits
for students’ answers to the question may result in students setting their own
boundaries for their responses. This means that students might provide
answers that are outside the intended task or address only a part of the
intended task. If this happens, then the teacher is left with unreliable and
invalid information about the students’ achievement of the intended learning
outcome. Also, there is no basis for marking or grading students’ answers.
Therefore, it is the responsibility of the teacher to write essay questions in
such a way that they provide students with clear boundaries for their
answers or responses. Let us look at Example 5.2.


Example 5.2: Improving Clarity of Task and Scope of Essay Questions

Weak Essay Question:


Evaluate the impact of the Industrial Revolution on England.

The verb is “evaluate”, which is the task the student is supposed to do. The
scope of the question is the impact of the Industrial Revolution on England.
Very little guidance is given to students about the task of evaluating and the
scope of the task. A student reading the question may ask:

(i) The impact on what in England? The economy? Foreign trade? A
particular group of people? (The scope is not clear.)

(ii) Evaluate based on what criteria? The significance of the revolution? The
quality of life in England? Progress in technological advancements?
(The task is not clear.)

(iii) What exactly do you want me to do in my evaluation? (The task is not
clear.)

Improved Essay Question:


Evaluate the impact of the Industrial Revolution on the quality of family life
in England. Explain whether families were able to provide for the education
of their children.

The improved question determines the task for students by specifying a
particular unit of society in England affected by the Industrial Revolution
(family). The task is also determined by giving students a criterion for
evaluating the impact of the Industrial Revolution (whether or not families
were able to provide for their children’s education). Students are clearer
about what must be done to “evaluate”. They need to explain how family life
has changed and judge whether or not the changes are an improvement for
the children.

SELF-CHECK 5.2

1. When would you decide to use an objective item rather than an
essay question to assess learning?

2. What is the difference between the task and the scope of an essay
question?


(d) Questions that are Fair


One of the challenges that teachers face in composing essay questions is that
because of their extensive experience with the subject matter, they may be
tempted to demand unreasonable content expertise on the part of the
students. Hence, teachers need to make sure that their students can “be
expected to have adequate material with which to answer the question”
(Stalnaker, 1951). In addition, teachers should ask themselves if students can
be expected to adequately perform the thought processes which are required
of them in the task. For assessment to be fair, teachers need to provide their
students with sufficient instruction and practice in the subject matter
required for the thought processes to be assessed.

Another important element is to avoid using indeterminate questions. A
question is indeterminate if it is so unstructured that students can redefine
the problem and focus on some aspect of it with which they are thoroughly
familiar or if experts in the subject matter cannot agree that one answer is
better than another. One way to avoid indeterminate questions is to stay
away from vocabulary that is ambiguous. For example, teachers should
avoid using the verb “discuss” in an essay question. This verb is simply too
broad and vague. Moreover, teachers should also avoid including
vocabulary that is too advanced for students.

(e) Specify the Approximate Time Limit and Marks Allotted to Each Question
Specifying the approximate time limit helps students allocate their time in
answering several essay questions. Without such guidelines, students may
feel at a loss as to how much time to spend on a question. When deciding the
guidelines for how much time should be spent on a question, keep the slower
students and students with certain disabilities in mind. Also make sure that
students can be realistically expected to provide an adequate answer in the
given and/or suggested time. Similarly, state the marks allotted to each
question so that students can estimate how much they should write to
answer the question.

(f) Use Several Relatively Short Essay Questions Rather than One Long
Question
Only a very limited number of essay questions can be included in a test
because of the time it takes for students to respond to them and the time it
takes for teachers to grade the studentsÊ responses. This creates a challenge
with regard to designing valid essay questions. Several shorter essay questions
are better suited to assess the breadth of student learning within a subject,
whereas a longer essay question is better suited to assess the depth of student
learning within a subject. Hence, there is a trade-off when choosing between
several short essay questions or one long question. A focus on assessing the
depth of student learning within a subject limits the assessment of the
breadth of student learning within the same subject. Meanwhile, a focus on
assessing the breadth of student learning within a subject limits the
assessment of the depth of student learning within the same subject. When
choosing between using several short essay questions or a long question, also
keep in mind that short essays are generally easier to mark than long essays.

(g) Avoid the Use of Optional Questions


Students should not be permitted to choose one essay question to answer
from two or more optional questions. The use of optional questions should
be avoided for the following reasons:

(i) Students may waste time deciding on an option; and

(ii) Some questions are likely to be harder than others, which could make the
comparative assessment of students’ abilities unfair.

The issue of the use of optional questions is debatable. It is often practised,
especially in higher education, and students often demand that they be given
choices. The practice is acceptable if it can be assured that the questions have
equivalent difficulty levels and the tasks as well as the scope required by the
questions are equivalent.
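
As an illustration of guideline (e), the sketch below suggests a time for each question in proportion to its marks. The proportional rule, the function name allocate_minutes and the sample paper are assumptions made for this example only; the guideline itself simply asks that the time and marks be stated, and the result should still be adjusted with slower students and students with certain disabilities in mind.

```python
def allocate_minutes(marks_per_question, total_minutes, reading_minutes=0):
    """Suggest minutes per question in proportion to the marks allotted.

    marks_per_question: dict mapping a question label to its marks.
    total_minutes: length of the test.
    reading_minutes: time set aside for reading the paper (assumed, optional).
    """
    total_marks = sum(marks_per_question.values())
    if total_marks <= 0:
        raise ValueError("total marks must be positive")
    writing_minutes = total_minutes - reading_minutes
    if writing_minutes <= 0:
        raise ValueError("no writing time left after reading time")
    return {
        question: writing_minutes * marks / total_marks
        for question, marks in marks_per_question.items()
    }


# Hypothetical 60-minute paper with 10 minutes of reading time.
paper = {"Q1": 10, "Q2": 15, "Q3": 25}
suggested = allocate_minutes(paper, total_minutes=60, reading_minutes=10)

for question, minutes in suggested.items():
    print(f"{question}: {paper[question]} marks, about {minutes:.0f} minutes")
```

Printing the suggested time next to the marks on the question paper gives students a rough guide to how much they should write for each question.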

Last but not least, let us improve the essay questions through preview and review.

Improving Essay Questions Through Preview and Review

The following steps can help you improve the essay item before and after you
administer it to your students.

PREVIEW (before handing out the essay question to the students)

Predict Students’ Responses

Try to respond to the question from the perspective of a typical student.
Evaluate whether students have the content knowledge and the skills
necessary to adequately respond to the question. After detecting possible
weaknesses in the essay questions, revise them before using them in the
exam.

Write a Model Answer


Before using a question, write model answer(s) or at least an outline of major
points that should be included in an answer. Writing the model answer allows
reflection on the clarity of the essay question. Furthermore, the model answer
serves as a basis for the grading of student responses. Once the model answer
has been written, compare its alignment with the question and the intended
learning outcome, and make changes as needed to assure that the intended
learning outcome, the question and the model answer are aligned with one
another.

Before using the question in a test, ask a knowledgeable person in the subject
to critically review the essay question, the model answer and the intended
learning outcome to determine how well they are aligned with each other.

REVIEW (after receiving the student responses)

Review Students' Responses to the Essay Question

After students have answered the questions, carefully review the range of
answers given and the manner in which students seem to have interpreted the
question. Make revisions based on the findings. Writing good essay questions
is a process that requires time and practice. Carefully studying the students'
responses can help to evaluate students' understanding of the question as well
as the effectiveness of the question in assessing the intended learning
outcomes.

In addition, you can use a checklist as shown in Figure 5.2 to check your essay
questions.

Figure 5.2: A checklist for writing essay questions

SELF-CHECK 5.3

1. Why should you specify the time allotted for answering each
question?

2. Why should you avoid optional questions?

3. What is meant when it is said that questions should be "fair"?

4. What should you do before and after administering a test?

5.8 VERBS DESCRIBING VARIOUS KINDS OF MENTAL TASKS

Using the list suggested by Moss and Holder (1988), and Anderson and Krathwohl
(2001), Reiner et al. (2002) proposed the following list of verbs that describe mental
tasks to be performed (refer to Table 5.1).

Table 5.1: Verbs, Definitions and Examples

| Verbs | Definitions | Examples |
|---|---|---|
| Analyse | Break material into its constituent parts and determine how the parts relate to one another and to an overall structure or purpose. | Analyse the meaning of the line "He saw a dead crow, in a drain, near the post office" in the poem The Dead Crow. |
| Apply | Decide which abstractions (concepts, principles, rules, laws, theories, generalisations) are relevant in a problem situation. | Apply the principles of supply and demand to explain why the consumer price index (CPI) in Malaysia has increased in the last three months. |
| Attribute | Determine a point of view, bias, value or intent underlying the presented material. | Determine the point of view of the author in the article about her political perspective. |
| Classify | Determine which category something belongs to. | Classify the organisms into vertebrates and invertebrates. |
| Compare | Identify and describe points of similarity. | Compare the roles of the Dewan Rakyat and Dewan Negara. |
| Compose | Make or form by combining things, parts or elements. | Compose an effective plan for solving flooding problems in Kuala Lumpur. |
| Contrast | Bring out the points of difference. | Contrast the contributions of Tun Hussein Onn and Tun Abdul Razak Hussein to the political stability of Malaysia. |
| Create | Put elements together to form a coherent or functional whole; reorganise elements into a new pattern or structure. | Create a comprehensive solution for the traffic problems in Kuala Lumpur. |
| Critique | Detect consistencies and inconsistencies between a product and relevant external criteria; detect the appropriateness of a procedure for a given problem. | Judge which of the two methods is the better way of reducing high absenteeism in the workplace. |
| Defend | Develop and present an argument to support a recommendation, to maintain or revise a policy or programme, or to propose a course of action. | Defend the government's decision to raise fuel prices. |
| Define | Give the meaning of a word or concept; place it in the class to which it belongs and distinguish it from other items in the same class. | Define the term "chemical weathering". |
| Describe | Give an account of; tell or depict in words; represent or delineate by a word picture. | Describe the contribution of Za'ba to the development of Bahasa Melayu. |
| Design | Devise a procedure for accomplishing some task. | Design an experiment to prove that 21 per cent of air is composed of oxygen. |
| Differentiate | Distinguish relevant from irrelevant parts or important from unimportant parts of presented material. | Distinguish between supply and demand in determining price. |
| Explain | Make clear the cause or reason of something; construct a cause-and-effect model of a system; tell "how" to do; tell the meaning of. | Explain the causes of the First World War. |
| Evaluate | Make judgements based on criteria and standards; determine the significance, value, quality or relevance of; give the good points and the bad ones; identify and describe the advantages and limitations. | Evaluate the contribution of the microchip in telecommunications. |
| Generate | Come up with alternative hypotheses, examples, solutions, proposals based on criteria. | Generate hypotheses to account for an observed phenomenon. |
| Identify | Recognise as being a particular person or thing. | Identify the characteristics of the Mediterranean climate. |
| Illustrate | Use a word picture, a diagram, a chart or a concrete example to clarify a point. | Illustrate the use of catapults in the amphibious warfare of Alexander. |
| Infer | Draw a logical conclusion from presented information. | What can you infer happened in the experiment? |
| Interpret | Give the meaning of; change from one form of representation (such as numerical) to another (such as verbal). | Interpret the poetic line, "The sound of a cobweb snapping is the noise of my life." |
| Justify | Show good reasons for; give your evidence; present facts to support your position. | Justify the American entry into the Second World War. |
| List | Create a series of names or other items. | List the major functions of the human heart. |
| Predict | Know or tell beforehand with precision of calculation, knowledge or shrewd inference from facts or experience what will happen. | Predict the outcome of a chemical reaction. |
| Propose | Offer for consideration, acceptance or action; suggest. | Propose a solution for landslides along the North-South Highway. |
| Recognise | Locate knowledge in long-term memory that is consistent with presented material. | Recognise the important events in the road to independence in Malaysia. |
| Recall | Retrieve relevant knowledge from long-term memory. | Recall the dates of important events in Islamic history. |
| Summarise | Sum up; give the main points briefly. | Summarise the ways in which man preserves food. |
| Trace | Follow the course of; follow the trail of; give a description of progress. | Trace the development of television in school instruction. |

The definitions specify thought processes a person must perform to complete the
mental tasks. Note that this list is not exhaustive and local examples have been
introduced to illustrate the mental tasks required in each essay question.

ACTIVITY 5.4

Discuss the following with your coursemates in the myINSPIRE online
forum:

(a) Select some essay questions in your subject area and examine
whether the verbs used are similar to those in the list given in
Table 5.1. Do you think the tasks required by the verbs used are
appropriate? Justify.

(b) Do you think students are able to differentiate between the tasks
required in the verbs listed? Justify.

(c) Are teachers able to describe to students the tasks required by using
these verbs? Explain.

5.9 MARKING AN ESSAY


Marking or grading of essays is a notoriously unreliable activity. If we read an
essay at two different times, the chances are high that we will give the essay a
different grade each time. If two or more of us read the essay, our grades will likely
differ, often dramatically so. We all like to think we are exceptions, but study after
study of well-meaning and conscientious teachers shows that essay grading is
unreliable (Ebel, 1972; McKeachie, 1987). Eliminating the problem is unlikely, but
we can take steps to improve grading reliability. Using a scoring guide or marking
scheme helps control the shifting of standards that inevitably takes place as we read
a collection of essays and papers. The common types of marking scheme used in
scoring students' responses to essay questions are presented diagrammatically as
follows (refer to Figure 5.3):

Figure 5.3: Types of marking scheme

A marking scheme may take the form of a checklist, a rubric or a combination of
both.

(a) Checklist
In a checklist, a score is awarded for every correct or relevant point in a
response. The sum of these individual scores provides the final score of the
response. Table 5.2 is an example of a checklist.

Table 5.2: Sample of a Checklist

Reference: Topic 5, Section 5.7, p. 74

Suggested answers:
Strengths
• Essay questions provide an effective way of assessing complex
learning outcomes.
• Essay questions allow students to demonstrate their reasoning
and creativity.
• Essay questions provide authentic experiences because students
are given the opportunity to organise, write and review their
responses.
• Guessing is very much reduced.
(Accept any other appropriate answers.)

Marks allocation: Award 1 mark for each point. (1 mark × 4 = 4 marks)

This marking scheme can be used to assess students' responses to an essay
question that asks for the strengths of essay questions as an assessment tool.
A checklist is easy to use. The teacher just needs to read through the student's
response and count the number of points for the calculation of marks. A
checklist is useful for assessing factual content and it is relatively easy to
construct. The teacher just needs to present a list of points required in the
response and decide on the marks for each point. However, a checklist with
a list of points does not provide for the assessment of intangible learning
outcomes such as "to discuss", "to evaluate" or "to explain" and other
complexity levels of Bloom's taxonomy. It also offers limited feedback for
formative purposes and students cannot use it as a guide for writing
assignments.
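
The arithmetic behind a checklist can be sketched in a few lines of Python.
This is only an illustration, not part of the module's marking scheme: the
function name checklist_score, the point labels and the cap of four credited
points (taken from the "1 mark × 4 = 4 marks" allocation in Table 5.2) are
assumptions made for the example.

```python
def checklist_score(points_found, marks_per_point=1, max_points=4):
    """Total mark for one response marked against a checklist.

    points_found: labels of the correct or relevant points the marker ticked.
    Repeated points are credited once, and credit stops at max_points
    (an assumption based on the 4-mark allocation in Table 5.2).
    """
    credited = min(len(set(points_found)), max_points)
    return credited * marks_per_point


# Hypothetical labels for three of the suggested points in Table 5.2.
response_points = ["complex_outcomes", "reasoning_creativity", "guessing_reduced"]
print(checklist_score(response_points))  # prints 3
```

Capping the total keeps a response that lists many acceptable points from
exceeding the marks allocated to the question.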

(b) Rubric
The two most common approaches used in scoring rubrics are the holistic
and the analytic methods.

(i) Holistic Method (Global or Impressionistic Marking)


The holistic approach to scoring essay questions involves reading an
entire response and assigning it to a category identified by a score or
grade. This method involves considering the student's answer as a
whole and judging the total quality of the answer relative to other
students' responses or the total quality of the answer based on certain
criteria that have been developed.

Think of it as sorting into bins. You read the answer to a particular
question and assign it to the appropriate bin. The best answers go into
the "exemplary" bin, the good ones go into the "good" bin and the
weak answers go into the "poor" bin (refer to Table 5.3).

Table 5.3: Sample of a Marking Scheme Using the Holistic Method

| Level of Achievement | Descriptor |
|---|---|
| 7–8 (Exemplary) | Addresses the question; states a relevant argument; presents arguments in a logical order; uses acceptable style and grammar (no errors) |
| 5–6 (Good) | Combination of above traits, but less consistently represented (few errors) |
| 3–4 (Adequate) | Does not address the question explicitly, though does so tangentially; states a somewhat relevant argument; presents some arguments in a logical order; uses adequate style and grammar (some errors) |
| 1–2 (Poor) | Does not address the question; states no relevant arguments; is not clearly or logically organised; fails to use acceptable style and grammar |
| 0 | Irrelevant response or no answer |

Then, each paper is given the points appropriate to the bin it is in. The
score is based on an overall impression. The holistic method is also referred
to as global or impressionistic marking.

One of the strengths of a holistic rubric is that students' responses can be
scored quite quickly. The teacher needs to read through the student's
response and decide in which band of scores the response lies. This
rubric can provide an overview of student performance but it does not
provide detailed information about the student's performance. It may also be
difficult to settle on a single overall score for a student's response.

How best can a teacher use the holistic method in scoring students'
responses? Before he or she starts marking, the teacher can develop a
description of the type of response that would illustrate each category,
and then try out this draft version using several actual papers. After
reading and categorising all of the papers, it is a good idea to
re-examine the papers within a category to see if they are similar enough
in quality to receive the same points or grade. It may be faster to read
essays holistically and provide only an overall score or grade, but
students do not receive much feedback about their strengths and
weaknesses. Some instructors who use holistic scoring also write brief
comments on each paper to point out one or two strengths and/or
weaknesses so students will have a better idea of why their responses
received the scores they did.
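
As a rough illustration of the "bins" in Table 5.3, the sketch below looks up
the band that a holistic score falls into. The names BANDS and band_for are
hypothetical; only the score ranges and labels come from Table 5.3.

```python
# (lowest score, highest score, label) for each band in Table 5.3.
BANDS = [
    (7, 8, "Exemplary"),
    (5, 6, "Good"),
    (3, 4, "Adequate"),
    (1, 2, "Poor"),
    (0, 0, "Irrelevant response or no answer"),
]


def band_for(score):
    """Return the Table 5.3 band label for a whole-number holistic score."""
    for low, high, label in BANDS:
        if low <= score <= high:
            return label
    raise ValueError(f"score {score} is outside the 0-8 scale")


print(band_for(6))  # prints Good
```

A lookup like this can help a marker confirm that the score written on a paper
matches the bin the paper was placed in.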

(ii) Analytic Method


The analytic method of marking is the system most frequently used in
large-scale public examinations and also by teachers in the classroom.
Its basic tool is a two-dimensional table with the performance criteria
down the vertical column on the left and the performance levels across
the top row. The cells then present the performance descriptors as
shown in Table 5.4.

Table 5.4: Sample of a Marking Scheme Using the Analytic Method

Holistic scoring gives students a single, overall assessment score for
the response as a whole. Analytic scoring provides students with at
least a rating score for each criterion. For example, based on the rubric,
a student's response may get 3 points for focus/organisation, 2 points
for elaboration and 4 points for mechanics, giving a total of 9 marks.

Alternatively, an analytic rubric may take the form of a weighted
rubric, whereby different weights (values) are assigned to different
criteria and an overall achievement score is obtained by totalling the
weighted scores for the criteria. Refer to Table 5.5 for a sample of a
weighted analytic rubric.

Table 5.5: Sample of a Marking Scheme Using the Weighted Analytic Method

To use the rubric, the performance level achieved by the student is
multiplied by the weight to give a score for each criterion. For example,
for focus/organisation, the score is 3 × 1.25 = 3.75, for elaboration, the
score is 2 × 1.25 = 2.5 and for mechanics the score is 4 × 0.5 = 2.0. This
gives the student a total of 8.25 marks out of 12.
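
The calculation above can be checked with a short Python sketch. It is an
illustration only: the names WEIGHTS, MAX_LEVEL, analytic_total and
weighted_total are hypothetical, and the top performance level of 4 is an
assumption inferred from the stated maximum of 12 marks
(4 × 1.25 + 4 × 1.25 + 4 × 0.5 = 12).

```python
# Weights quoted in the worked example for the weighted analytic rubric.
WEIGHTS = {"focus/organisation": 1.25, "elaboration": 1.25, "mechanics": 0.5}
MAX_LEVEL = 4  # assumption: highest performance level, implied by "out of 12"


def analytic_total(levels):
    """Unweighted analytic score: the sum of the levels for all criteria."""
    return sum(levels.values())


def weighted_total(levels, weights=WEIGHTS):
    """Weighted analytic score: level multiplied by weight, summed over criteria."""
    return sum(levels[criterion] * weights[criterion] for criterion in weights)


def max_weighted_total(weights=WEIGHTS, max_level=MAX_LEVEL):
    """Highest possible weighted score for the rubric."""
    return max_level * sum(weights.values())


levels = {"focus/organisation": 3, "elaboration": 2, "mechanics": 4}
print(analytic_total(levels))   # prints 9
print(weighted_total(levels))   # prints 8.25
print(max_weighted_total())     # prints 12.0
```

Keeping the weights in one place makes it easy to see how changing the relative
importance of a criterion changes the final mark.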

The analytic rubric provides more detailed feedback on areas of
strength and weakness because the performance criteria are given and
each criterion can be weighted to reflect its relative importance in the
student's response. Generic rubrics which are not task specific can also
be a useful aid to learning. Students can use them too as a guide to
doing the assignments. As shown in Table 5.5, the performance
descriptors are stated in general terms, and do not give away the
answers. However, an analytic rubric takes more time to create and use
than a holistic rubric. Moreover, it is important that each performance
level for each criterion is well defined. Otherwise, different raters may
not arrive at the same score.

5.10 SUGGESTIONS FOR MARKING ESSAYS


Here are some suggestions for marking or scoring essays:

(a) Grade the papers anonymously. This will help control the influence of our
expectations of the student on the evaluation of the answer.

(b) Read and score the answers to one question before going on to the next
question. In other words, score all the students' responses to Question 1
before looking at Question 2. This helps to keep one frame of reference and
one set of criteria in mind through all the papers, which results in more
consistent grading. It also prevents an impression that we form in reading
one question from carrying over to our reading of the student's next answer.

(c) If a student has not done a good job on the first question, we may let this
impression influence our evaluation of the student's second answer.
However, if other students' papers come in between, we are less likely to be
influenced by the original impression.

(d) If possible, try to grade all the answers to one particular question without
interruption. Our standards might vary from morning to night or one day to
the next.

(e) Shuffle all the papers after each item is scored. Changing the order of papers
this way reduces the context effect and the possibility that a student's score
may be the result of the location of the paper in relation to other papers.
If Rakesh's "B" work always follows Jamal's "A" work, then it might
look more like "C" work and his grade would be lower than if his paper were
somewhere else in the stack.

(f) Decide in advance how you are going to handle extraneous factors and be
consistent in applying the rule. Students should be informed about how you
treat such things as misspelled words, neatness, handwriting, grammar and
so on.

(g) Be on the alert for bluffing. Some students who do not know the answer may
write a well-organised coherent essay but one containing material irrelevant
to the question. Decide how to treat irrelevant or inaccurate information
contained in the students' answers. We should not give credit for irrelevant
material. It is not fair to other students who may also have preferred to write
on another topic, but instead wrote on the required question.

(h) Write comments on the students' answers. Teacher comments make essay
tests a good learning experience for students. They also serve to refresh your
memory of your evaluation should the student question the grade given.

(i) Be aware that the order in which papers are marked can have an impact
on the grades awarded. A marker may grow more critical (or more lenient)
after having read several papers, thus the early papers may receive lower (or
higher) marks than papers of similar quality that are scored later.

(j) Also, when students are directed to take a stand on a controversial issue, the
marker must be careful to ensure that the evidence and the way it is
presented are evaluated, not the position taken by the student. If the student
takes a position which differs from that of the marker, the marker must be
aware of his or her own possible bias in marking the essay.
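
Suggestions (a), (b) and (e) can be combined into a simple marking plan. The
Python sketch below is a minimal illustration under stated assumptions: papers
are identified by anonymous codes such as "P01" (a hypothetical scheme), and the
function name marking_order is made up for this example. It lists every answer
to Question 1 in a shuffled order, then reshuffles the papers before Question 2,
and so on.

```python
import random


def marking_order(paper_ids, n_questions, seed=None):
    """Return (question, paper_id) pairs in the order they should be marked.

    All answers to one question are marked before the next question starts,
    and the papers are reshuffled for every question.
    """
    rng = random.Random(seed)  # a seed makes the order reproducible
    order = []
    for question in range(1, n_questions + 1):
        batch = list(paper_ids)  # copy so the caller's list is left unchanged
        rng.shuffle(batch)
        order.extend((question, paper_id) for paper_id in batch)
    return order


# Hypothetical anonymous paper codes used instead of student names.
for question, paper_id in marking_order(["P01", "P02", "P03"], n_questions=2):
    print(f"Mark Question {question} of paper {paper_id}")
```

Because the order is random, the exact sequence differs from run to run unless
a seed is supplied.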

ACTIVITY 5.4

1. Compare the analytical method and holistic method of marking
essays.

2. Which method is widely practised in your institution? Why?

3. Do you think there would be a difference in marking an answer
using the two methods? Justify your answer.

Post your answers on the myINSPIRE online forum.

• An essay question is a test item which requires a response composed by the
examinee, usually in the form of one or more sentences, of a nature that no single
response or pattern of responses can be listed as correct, and the accuracy and
quality of which can be judged subjectively only by one skilled or informed in
the subject matter.

• There are two types of essays based on their function: restricted response and
extended response essay questions.

• Essay questions provide an effective way of assessing complex learning
outcomes.

• Essay questions provide authentic experiences because constructing responses
is closer to real life than selecting responses.

• It is not possible to assess a student's mastery of the complete subject matter
domain with just a few questions.

• Essay questions have two variable elements: the degree to which the task is
structured and the degree to which the scope of the content is focused.

• Whether or not an essay item assesses higher-order thinking depends on the
design of the question and how students' responses are scored.

• Specifying the approximate time limit helps students allocate their time in
answering several essay questions.

• Avoid using essay questions for intended learning outcomes that are better
assessed with other kinds of assessment.

• Analytical marking is the system most frequently used in large-scale public
examinations and also by teachers in the classroom. Its basic tool is the marking
scheme with proper mark allocations for elements in the answer.

• The holistic approach to scoring essay questions involves reading an entire
response and assigning it to one of several categories, each given a score or
grade.

Analytical method
Checklist
Complex learning outcomes
Constructed responses
Essay
Grading
Holistic method
Marking scheme
Mental tasks
Model answer
Rubric
Time consuming

Anderson, L. W., & Krathwohl, D. R. (2001). A taxonomy for learning, teaching,
and assessing: A revision of Bloom's taxonomy of educational objectives.
Boston, MA: Allyn & Bacon.

Crooks, T. J. (1988). The impact of classroom evaluation practices on
students. Review of Educational Research, 58(4), 438–481.

Ebel, R. L. (1972). Essentials of educational measurement. Oxford, England:
Prentice-Hall.

McKeachie, W. J. (1987). Can evaluating instruction improve teaching? New
Directions for Teaching and Learning, 31(1987), 3–7.

Moss, A., & Holder, C. (1988). Improving student learning: A guidebook for faculty
in all disciplines. Dubuque, IA: Kendall/Hunt.

Phillips, J. A., Ansary Ahmed, & Kuldip Kaur. (2005). Instructional design
principles in the development of an e-learning graduate course. Paper
presented at The International Conference in E-Learning. Bangkok, Thailand.

Reiner, C. M., Bothell, T. W., Sudweeks, R. R., & Wood, B. (2002). Preparing
effective essay questions. Stillwater, OK: New Forums Press.

Stalnaker, J. M. (1951). The essay type examination. In E. F. Lindquist (Ed.),
Educational measurement (pp. 495–530). Menasha, WI: George Banta.
