
UNIT 2: PRINCIPLES OF LANGUAGE ASSESSMENT

PRINCIPLES OF LANGUAGE ASSESSMENT

• Practicality
• Validity
• Reliability
• Authenticity
• Washback/ Backwash
1/ PRACTICALITY

Practicality refers to the logistical, down-to-earth, administrative issues involved in making, giving, and scoring an assessment instrument (Brown, 2018).

A practical test…
• stays within budgetary limits
• can be completed within an appropriate time
• has clear directions for administration
• appropriately utilizes available human resources
• does not exceed available material resources
• considers the time and effort involved in both designing and scoring
1/ PRACTICALITY
Why are the following tests impractical?
1/ A test of language proficiency that takes a student 5 hours to complete
2/ A test that requires individual one-on-one proctoring for a group of several hundred test-takers when only a handful of examiners is available
3/ A test that can be scored only by computer when it takes place a thousand miles from the nearest computer
4/ A test that is conducted in only 15 minutes per student but requires 5 raters to score
5/ A test whose administrators or proctors need special training to administer it
6/ An essay-type test used with several hundred test-takers
2/ VALIDITY

The validity of a test is the extent to which it exactly measures what it is supposed to measure (Hughes, 2003).

A valid test…
• measures exactly what it proposes to measure
• does not measure irrelevant variables
• relies as much as possible on empirical evidence
• involves performance that samples the test's criterion
• offers useful, meaningful information about a test-taker's ability
• is supported by a theoretical rationale or argument
2/ VALIDITY

Five ways to establish validity:


1/ Content validity
2/ Criterion validity
3/ Construct validity
4/ Consequential validity
5/ Face validity
2.1 Content validity

• The correlation between the contents of the test and the language skills, structures, etc. which it is meant to measure has to be clear.

• The test items should really represent the course objectives.

What do you think if a listening test requires students to read passages in order to complete the items, instead of requiring them to listen attentively?
2.2 Criterion validity

• This kind of validity emphasizes the relationship between the test score and the outcome.
• The test score should really represent the criterion that the test is intended to measure.
• Criterion validity usually falls into one of two categories (a minimal computational sketch follows this list):
+ concurrent validity: the test's results are supported by other concurrent performance beyond the assessment itself
+ predictive validity: the test assesses and predicts a student's likely future success
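
As a rough illustration (not part of the original slides), predictive validity is commonly estimated by correlating test scores with a later criterion measure, such as end-of-course grades. The sketch below uses hypothetical numbers and Python's standard-library statistics.correlation (Python 3.10+); the resulting coefficient is the kind of evidence a predictive-validity claim can rest on.

# A minimal, hypothetical sketch: correlate entrance-test scores with a later
# criterion measure (imaginary end-of-course grades) to estimate predictive validity.
from statistics import correlation  # standard library, Python 3.10+

placement_scores = [52, 61, 70, 75, 83, 90]   # hypothetical entrance-test scores
course_grades    = [58, 60, 72, 71, 85, 92]   # hypothetical end-of-course results

validity_coefficient = correlation(placement_scores, course_grades)
print(f"Predictive validity coefficient (Pearson r): {validity_coefficient:.2f}")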
2.3 Construct validity

• Construct validity refers to the concepts or theories underlying the use of a certain ability, including language ability.

• Construct validity shows that the result of the test really represents the same construct as the students' ability that is being measured.
2.4 Consequential validity (impact)

• Consequential validity encompasses all the consequences of a test, including such considerations as its accuracy in measuring intended criteria, its effect on the preparation of test-takers, and the social consequences of a test's interpretation and use.
2.5 Face validity

• Face validity refers to the degree to which a test looks right and appears to measure the knowledge or abilities it claims to measure.

• A test can be judged to have face validity simply by looking at its items.

• Face validity can affect how students perform on the test.


3/ RELIABILITY

Reliability refers to the consistency of the scores obtained (Gronlund, 1977).

A reliable test…
• has consistent conditions across two or more administrations
• gives clear directions for scoring/evaluation
• has uniform rubrics for scoring/evaluation
• lends itself to consistent application of rubrics by the scorer
• contains items/tasks that are unambiguous to the test-taker
3/ RELIABILITY
Reliability falls into 4 kinds (Brown, 2018):
1/ Student-related reliability
2/ Rater reliability
3/ Test administration reliability
4/ Test reliability
3.1 Student-related reliability

• To get reliable scores from the test-takers, we need to be sure that the test-takers are in good physical and mental condition when taking the test.

• Furthermore, the test-takers must be familiar with the procedure of the test in order to reach optimal performance.
3.2 Rater reliability

• The raters or scorers of a test should possess reliability; they should be consistent in scoring a test.
• There are two kinds of rater reliability, i.e. intra-rater reliability and inter-rater reliability (see the sketch after this list):
+ Intra-rater reliability means consistency within the rater/scorer himself/herself.
+ Inter-rater reliability means consistency between two or more raters.
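
As a rough illustration (not part of the original slides), inter-rater reliability is commonly estimated by having two raters score the same set of performances and correlating the two sets of scores. The sketch below uses hypothetical essay scores and Python's standard-library statistics.correlation (Python 3.10+); a coefficient close to 1 suggests high inter-rater reliability.

# A minimal, hypothetical sketch: correlate two raters' scores on the same essays
# to estimate inter-rater reliability.
from statistics import correlation  # standard library, Python 3.10+

rater_a = [7, 5, 8, 6, 9, 4, 7, 8]   # hypothetical essay scores from rater A (0-10 scale)
rater_b = [6, 5, 8, 7, 9, 5, 6, 8]   # the same essays scored by rater B

inter_rater_r = correlation(rater_a, rater_b)
print(f"Inter-rater reliability (Pearson r): {inter_rater_r:.2f}")

# Intra-rater reliability can be estimated the same way: have one rater re-score
# the same scripts after an interval and correlate the two rounds of scores.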
3.3 Test administration reliability

• Test administration reliability concerns the conditions and situation in which the test is administered.

• To increase this kind of reliability, teachers, as the administrators, should consider everything related to the test administration (e.g. noise, lighting, equipment, and seating).
3.4 Test reliability

• Test reliability refers to the test itself.

• Unreliable scores can be caused by a poor-quality test, for example: unclear instructions, ambiguous answers, badly constructed items, or clues to the correct/wrong answer.
4/ AUTHENTICITY

Authenticity is the degree of correspondence of the characteristics of a given language test task to the features of a target language task (Bachman and Palmer, 1996).

An authentic test…
• contains language that is as natural as possible
• has items that are contextualized rather than isolated
• includes meaningful, relevant, interesting topics
• provides some thematic organization to items, such as through a story line or episode
• offers tasks that replicate real-world tasks
4/ AUTHENTICITY

Compare the following task pairs:

• Task 1.1: Write the meaning of "trivial".
• Task 1.2: "The students think that the test is difficult, but the teacher regards it as trivial." The underlined word means ....

• Task 2.1: List different ways of greeting.
• Task 2.2: Muthia, if you meet your teacher in a supermarket at 7 p.m., how would you greet him/her?
5/ WASHBACK/BACKWASH

Washback can be defined as the effect of a test or assessment on teaching, learning, the learner, or government and society. Washback can be positive or negative.

A test that provides beneficial washback…
• positively influences what and how teachers teach
• positively influences what and how learners learn
• offers learners a chance to adequately prepare
• gives learners feedback that enhances their language development
• is more formative in nature than summative
• provides conditions for peak performance by the learner
Questions for further discussion
Questions (to be discussed by Group 1, Group 2, and Group 3):
1/ Why does a loss of concentration by the test-takers during a test lead to unreliable test results?

2/ In a university entrance test, a proctor is very often prohibited from answering any questions from the test-takers. Why?

3/ Give an example of a proctor's behavior in a test room which causes unreliability.

4/ Give a reason why a mechanical/substitution drill is considered not meaningful.

5/ What kind of washback may happen when teachers let their students cheat in the final examination? Explain.
Questions for further discussion

• https://vnuaeduvn-my.sharepoint.com/:w:/g/personal/ttmai_nn_vnua_edu_vn/ET3Z-4f_HDBCuZObK7N3U_QBQbyf8bzvYtqSAsqqVHAbtw?e=1COlfx
