Unit 4 Learning Environment & Assessment
collects assessment results to monitor individual student progress and to inform future
instruction.
14. Peer instruction: Perhaps the most accurate way to check for understanding is to have one
student try to teach another student what she’s learned. If she can do that successfully, it’s
clear she understood your lesson.
The first step of the test cycle is designing your test. Here you formulate your learning
objectives and your purposes for testing, and you make a test plan to check whether your
assessment program is in line with your learning objectives and your teaching activities.
FORMULATING GOOD LEARNING OBJECTIVES
Objectives refer to learning outcomes: statements of what a learner is expected to know,
understand and/or be able to do and demonstrate after completion of a process of learning.
Outcomes should be observable and measurable.
DECIDE ON THE PURPOSE OF TESTING
Before we think about how we should assess, what methods we will use or what we have to
do, it is important to consider why we test in the first place. What is our purpose when we
test?
There are many purposes for testing, as stated on the homepage of this website, but here we
will make a distinction between testing as a means of process evaluation (formative) and
testing as a means of grading or judgement at the end of a process (summative).
To give a concrete example of the difference between formative and summative testing, think
about a cook in a restaurant:
When the cook (or his colleague or assistant) tastes the soup, that’s formative; when the
guests taste the soup, that’s summative.
You use interactive assessment tools to get insight into your students' understanding of the
material. For example, during your lecture you can check at what level the students have
mastered the material, or see whether there are any misconceptions.
Choosing the right testing method
A test plan helps ensure that your test is valid: that you test what you want to test. It
provides an overview of all tests involved in your course or module in relation to the
learning outcomes/objectives of the course or module.
help you to construct suitable items.
There are many formats you can use. A test plan with the most basic information in it
contains: the learning outcomes, all tests with their test methods, the weight per test and,
if there are any, special conditions.
A test specification table helps to ensure that there is a match between what should be learned
(objectives), what is taught and what is tested. It also ensures transparency (for yourself but
also for colleagues) and repeatability.
A test specification table zooms in on an individual (written) test. For assignments such as
a report or presentation, a test specification table isn't necessary.
Lecturers cannot measure every topic or objective and cannot ask every question they might
wish to ask. A test specification table allows you to construct a test which focuses on the key
areas and weights those different areas based on their importance. A test specification
table provides you with evidence that a test has content and construct validity.
There are many possible formats for a test specification table. You need to state the
relation between the learning objectives and each individual question, the question format
(open, closed, essay, etc.), the score or number of points per question, and the weight of
each learning objective and/or question. You can also add the book or lesson material each
question belongs to; that way you immediately know whether you need to adjust your written
exam when you change anything in your lesson material. However, this is not obligatory.
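To illustrate, a test specification table can be held in a small data structure, and the weight of each learning objective can then be derived from the points per question. All questions, objectives, points, and materials below are hypothetical:

```python
# A test specification table for one written exam: each question is linked
# to a learning objective, a question format, points, and source material.
# All entries are invented for this example.
spec = [
    {"q": 1, "objective": "LO1", "format": "closed", "points": 2, "material": "Ch. 1"},
    {"q": 2, "objective": "LO1", "format": "open",   "points": 4, "material": "Ch. 2"},
    {"q": 3, "objective": "LO2", "format": "essay",  "points": 6, "material": "Ch. 3"},
]

total = sum(row["points"] for row in spec)

# Weight of each objective = its share of the total points.
weights = {}
for row in spec:
    weights[row["objective"]] = weights.get(row["objective"], 0) + row["points"]
weights = {obj: pts / total for obj, pts in weights.items()}
print(weights)  # → {'LO1': 0.5, 'LO2': 0.5}
```

Because each question carries its source material, changing a chapter immediately shows which exam questions need review.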
The idea for this primer series germinated from a simple question: "Could you do an article
looking at the validity of tests used in public safety assessment?" As my forgiving readership
already knows, I have trouble containing my thoughts to a single entry. So, as I began to
frame out how I would respond to the question of the validity of public safety assessments,
the amount of material I wanted to cover started to grow exponentially. At some point, I
decided it would be best to start from the beginning with a series of primers on topics related
to validity, building up to an answer to the question of “what is the validity of public safety
assessments.”
So this post will be the first in a series looking at that question. Over a series of articles
intended to inform while keeping things simple, I will cover the characteristics of a good test:
Reliable
Valid
Practical
Socially Sensitive
Candidate Friendly
Briefly and simply, I will review the meaning of each of these characteristics.
Reliable
Reliability refers to the accuracy of the obtained test score or to how close the obtained scores
for individuals are to what would be their “true” score, if we could ever know their true score.
Thus, reliability is the absence of measurement error: the less measurement error, the better. The
reliability coefficient, similar to a correlation coefficient, is used as the indicator of the
reliability of a test. The reliability coefficient can range from 0 to 1, and the closer to 1 the
better. Generally, experts tend to look for a reliability coefficient in excess of .70. However,
many tests used in public safety screening are what is referred to as multi-dimensional, which
complicates the interpretation of a single overall coefficient. Interpreting the meaning of a
reliability coefficient for a knowledge test based on a variety of sources requires a great deal
of experience, and even experts are often fooled or offer incorrect interpretations. There are a
number of types of reliability, but the type usually
reported is internal consistency or coefficient alpha. All things being equal, one should look
for an assessment with strong evidence of reliability, where information is offered on the
degree of confidence you can have in the reported test score.
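As an illustration of the kind of computation behind a reported coefficient, coefficient alpha can be estimated from a matrix of item scores. This is a minimal sketch with hypothetical scores, not a substitute for proper psychometric software:

```python
def cronbach_alpha(scores):
    """Coefficient alpha for a list of examinees' item-score rows."""
    k = len(scores[0])  # number of items
    
    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    # Variance of each item (column) and of the total scores (row sums).
    item_vars = [var([row[i] for row in scores]) for i in range(k)]
    total_var = var([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 4 examinees x 3 dichotomously scored items.
scores = [[1, 1, 1], [1, 1, 0], [0, 1, 0], [0, 0, 0]]
print(round(cronbach_alpha(scores), 3))  # → 0.75
```

Note that alpha assumes the items measure a single dimension, which is exactly why a single coefficient can mislead on the multi-dimensional tests mentioned above.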
Valid
Validity will be the topic of our third primer in the series. In the selection context, the term
“validity” refers to whether there is an expectation that scores on the test have a demonstrable
relationship to job performance, or other important job-related criteria. Validity may also be
used interchangeably with related terms such as “job related” or “business necessity.” For
now, we will state that there are a number of ways of evaluating validity including:
Content
Criterion-related
Construct
Transfer or transportability
Validity generalization
A good test will offer extensive documentation of the validity of the test.
Practical
A good test should be practical. What defines or constitutes a practical test? It is a
balancing of a number of factors.
Socially Sensitive
A consideration of the social implications and effects of the use of a test is critical in public
sector, especially for high stakes jobs such as public safety occupations. The public safety
assessment professional must be considerate of and responsive to multiple groups of
stakeholders. In addition, in evaluating a test, it is critical that attention be given to:
Avoiding adverse impact – Recent events have highlighted the importance of balance
in the demographics of safety force personnel. Adverse impact refers to differences in
the passing rates on exams between males and females, or minorities and majority
group members. Tests should be designed with an eye toward the minimization of
adverse impact. Adverse impact is a complicated topic, which I addressed in greater depth in
previous blog posts here and here.
Universal Testing – The concept behind universal testing is that your exams should be
able to be taken by the most diverse set of applicants possible, including those with
disabilities and by those who speak other languages. Having a truly universal test is a
difficult, if not impossible, standard to meet. However, organizations should strive to
ensure that testing locations and environments are compatible with the needs of as wide
a variety of individuals as possible. In addition, organizations should have in place
committees and procedures for dealing with requests for accommodations.
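The comparison of passing rates described under adverse impact is commonly operationalized with the four-fifths (80%) rule from the U.S. Uniform Guidelines on Employee Selection Procedures; the text above does not name a specific rule, so treat this as one common approach. A minimal sketch with hypothetical applicant numbers:

```python
def selection_rate(passed, applied):
    """Fraction of applicants in a group who pass the exam."""
    return passed / applied

def four_fifths_ok(minority_rate, majority_rate):
    """True if the minority selection rate is at least 80% of the
    majority rate (the four-fifths rule); False flags potential
    adverse impact."""
    return minority_rate / majority_rate >= 0.8

# Hypothetical passing data for two applicant groups.
majority = selection_rate(60, 100)  # 0.60
minority = selection_rate(30, 80)   # 0.375
print(four_fifths_ok(minority, majority))  # → False (0.375 / 0.60 = 0.625 < 0.8)
```

A failed check is a signal for closer review of the exam, not by itself proof of an unlawful test.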
Candidate Friendly