
VALIDITY

INTRODUCTION
1. Content validity
2. Criterion-related validity
3. Other forms of evidence for construct validity
4. Validity in scoring
5. Face validity
6. How to make tests more valid
Content validity
 It refers to how accurately an assessment or measurement tool taps into the various aspects of the specific construct in question. In other words, do the questions really assess the construct in question?
 A test needs to be related to the content of the class (relevant content).
 HOW TO JUDGE IT? We need a specification of the skills or structures that the test is meant to cover.
 Not all the course content needs to appear in the test.
The importance of content validity
 Language specifications provide the test constructor with a basis for making a principled selection of elements to include in the test.
 A comparison between test specification and test content is the basis for judgments related to content validity (a rough sketch of such a comparison follows below).
 The greater a test’s content validity, the more likely it is to be an accurate measure of what it is supposed to measure.
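As one possible sketch of that comparison (in Python, with invented skill labels and items that are not taken from any real specification), each test item can be tagged with the skill it targets so that the share of the specification actually covered can be computed:

# Hypothetical A1 specification: the skills the test is meant to cover.
specified_skills = {"greetings", "numbers", "simple present", "personal pronouns"}

# Each draft test item is tagged with the skill it is meant to assess.
test_items = {
    "item 1": "greetings",
    "item 2": "numbers",
    "item 3": "numbers",
}

covered = set(test_items.values())
coverage = len(covered & specified_skills) / len(specified_skills)
print(f"Share of specified skills covered: {coverage:.0%}")  # prints 50%

A low coverage figure would signal that the test samples the specification too narrowly, which is exactly the judgment content validation asks us to make.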
ACTIVITY 1
1. According to the CEFR, which specification of the language skills would you need to take into account to test an A1 student?
PLEASE USE YOUR GADGETS TO HAVE ACCESS TO THE CEFR LANGUAGE SPECIFICATION.
2. Do you think teachers care about those specifications while testing a student?
Criterion-related validity
 It is the degree to which test results agree with those provided by some independent and highly dependable assessment of the candidate’s ability.
 The independent assessment is the criterion measure against which the test is validated.
 TWO TYPES:
CONCURRENT VALIDITY
PREDICTIVE VALIDITY
CONCURRENT VALIDITY
 It refers to the extent to which the results of a particular test or measurement correspond to those of a previously established measurement for the same construct.
 Is it possible to test everything you need to test in a short time?
 This will always depend on how many functions are tested in the component, and how representative they are of the complete set of functions included in the objectives.
How is the level of agreement measured?
 Using the “correlation coefficient”. This is a mathematical measure of similarity (see the sketch below).
 Perfect agreement = 1
 Total lack of agreement = 0
Whether the level of agreement is regarded as satisfactory depends on the purpose of the test and the decisions that are made based on it.
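As a minimal illustration (in Python, with invented scores for five candidates; the numbers are not from any real study), the coefficient can be computed like this:

import statistics

# Hypothetical scores on the new test and on the criterion measure
new_test = [55, 62, 70, 81, 90]
criterion = [58, 60, 74, 79, 93]

# Pearson's correlation coefficient (statistics.correlation needs Python 3.10+)
r = statistics.correlation(new_test, criterion)
print(f"Correlation coefficient: {r:.2f}")  # close to 1 = strong agreement

Here the two score lists rise together, so r comes out close to 1; scores that bore no relation to the criterion would push it toward 0.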
PREDICTIVE VALIDITY
 This concerns the degree to which a test can predict a candidate’s future performance.
 How helpful is it to use final outcomes as the criterion measure when so many factors other than ability in English (such as subject knowledge, intelligence, motivation, health and happiness) will have contributed to every outcome?
 Example: placement tests
Other forms of evidence for construct validity
 We cannot be sure that the items of the test are measuring what we expect them to measure.
 Construct validity: “construct” refers to any underlying ability that is hypothesized in a theory of language ability.
 It is important to establish whether distinct abilities exist, whether they can be measured, and whether they are measured in a test.
 Research is needed for evidence.
Another way of obtaining evidence about the construct validity of a test is to investigate what test takers actually do when they respond to an item.
TWO PRINCIPAL METHODS:
THINK ALOUD
Test takers voice their thoughts as they respond to the item.
Problem: The very voicing of thoughts may interfere with what would be the natural response to the item.
RETROSPECTION
Test takers try to recollect what their thinking was as they responded.
Problem: Thoughts may be forgotten.
VALIDITY IN SCORING
 It is worth pointing out that if a test is to have validity, not only must the items be valid; the way in which the responses are scored must also be valid.
 If the test is meant to test one specific skill and, while grading, another one interferes with the scoring process, then the test is not valid.
ACTIVITY 2
To discuss:
 Do the PUCE language tests lack scoring validity? Yes or no?
 So, why do teachers take away points when they are grading the reading comprehension part?
 What is the reason for doing so? Do you think this is fair to the student?
FACE VALIDITY
 A test is said to have face validity if it looks as if it measures what it is supposed to measure.
 A test which does not have face validity may not be accepted by candidates, teachers, education authorities or employers.
How to make tests more valid
 Write explicit specifications for the test which take into account all that is known about the constructs that are to be measured.
 Make sure that you include a representative sample of the content of these specifications in the test.
 Use direct testing.
 Make sure that the scoring of responses relates directly to what is being tested.
 Do everything possible to make the test reliable. If a test is not reliable, it cannot be valid.
ACTIVITY 3
 In pairs, answer the following test. Then, check whether it has enough validity to say that the student knows all the content of the A1 level.
 Would you say that the test is measuring the most important content of the A1 level?
 Which contents would you test instead of those, and why?
Conclusion
 It is important to take into account the language specifications.
 It is necessary to do research to say that a test has content validity.
