
UNIT 2: PRINCIPLES OF LANGUAGE ASSESSMENT

PRINCIPLES OF LANGUAGE ASSESSMENT

• Practicality
• Validity
• Reliability
• Authenticity
• Washback/ Backwash
1/ PRACTICALITY

Practicality refers to the logistical, down-to-earth, administrative issues involved in making, giving, and scoring an assessment instrument (Brown, 2018).

A practical test…
• stays within budgetary limits
• can be completed within an appropriate time
• has clear directions for administration
• appropriately utilizes available human resources
• does not exceed available material resources
• considers the time and effort involved in both designing and scoring
1/ PRACTICALITY
Why are the following tests impractical?
1/ A test of language proficiency that takes a student 5 hours to complete
2/ A test that requires individual one-on-one proctoring for a group of several hundred test-takers when only a handful of examiners is available
3/ A test that can be scored only by computer when it takes place a thousand miles from the nearest computer
4/ A test that is conducted in only 15 minutes per student but requires 5 raters to score
5/ A test whose administrators or proctors need special training to administer it
6/ An essay-type test used with several hundred test-takers
2/ VALIDITY

The validity of a test is the extent to which it exactly measures what it is supposed to measure (Hughes, 2003).

A valid test…
• measures exactly what it proposes to measure
• does not measure irrelevant variables
• relies as much as possible on empirical evidence
• involves performance that samples the test's criterion
• offers useful, meaningful information about a test-taker's ability
• is supported by a theoretical rationale or argument
2/ VALIDITY

Five ways to establish validity:


1/ Content validity
2/ Criterion validity
3/ Construct validity
4/ Consequential validity
5/ Face validity
2.1 Content validity

• The correlation between the contents of the test and the language skills, structures, etc. which it is meant to measure has to be clear.

• The test items should really represent the course objectives.

What do you think if a listening test requires students to read passages in order to complete the items, instead of requiring them to listen attentively?
2.2 Criterion validity

• This kind of validity emphasizes the relationship between the test score and the outcome.
• The test score should really represent the criterion that the test is intended to measure.
• Criterion validity usually falls into one of two categories (a minimal computational sketch follows this list):
+ concurrent validity: the test's results are supported by other concurrent performance beyond the assessment itself
+ predictive validity: the test assesses and predicts a student's likely future success
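
As a rough illustration (not part of the original slides), predictive validity is commonly estimated by correlating test scores with a later criterion measure, such as end-of-course grades. The sketch below uses hypothetical numbers and Python's standard-library statistics.correlation (Python 3.10+); the resulting coefficient is the kind of evidence a predictive-validity claim can rest on.

# A minimal, hypothetical sketch: correlate entrance-test scores with a later
# criterion measure (imaginary end-of-course grades) to estimate predictive validity.
from statistics import correlation  # standard library, Python 3.10+

placement_scores = [52, 61, 70, 75, 83, 90]   # hypothetical entrance-test scores
course_grades    = [58, 60, 72, 71, 85, 92]   # hypothetical end-of-course results

validity_coefficient = correlation(placement_scores, course_grades)
print(f"Predictive validity coefficient (Pearson r): {validity_coefficient:.2f}")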
2.3 Construct validity

• Construct validity refers to the concepts or theories underlying the use of a certain ability, including language ability.

• Construct validity shows that the result of the test really represents the same construct as the students' ability that is being measured.
2.4 Consequential validity (impact)

• Consequential validity encompasses all the consequences of a test, including such considerations as its accuracy in measuring intended criteria, its effect on the preparation of test-takers, and the social consequences of a test's interpretation and use.
2.5 Face validity

• Face validity refers to the degree to which a test looks right and appears to measure the knowledge or abilities it claims to measure.

• A test can be judged to have face validity simply by looking at its items.

• Face validity can affect how students perform on the test.


3/ RELIABILITY

Reliability refers to the consistency of the scores obtained (Gronlund, 1977).

A reliable test…
• has consistent conditions across two or more administrations
• gives clear directions for scoring/evaluation
• has uniform rubrics for scoring/evaluation
• lends itself to consistent application of rubrics by the scorer
• contains items/tasks that are unambiguous to the test-taker
3/ RELIABILITY
Reliability falls into 4 kinds (Brown, 2018):
1/ Student-related reliability
2/ Rater reliability
3/ Test administration reliability
4/ Test reliability
3.1 Student-related reliability

• To get reliable scores from the test-takers, we need to be sure that the test-takers are in good physical and mental condition when taking the test.

• Furthermore, the test-takers must be familiar with the procedure of the test in order to reach optimal performance.
3.2 Rater reliability

• The raters or scorers of a test should possess reliability; they should be consistent in scoring a test.
• There are two kinds of rater reliability, i.e. intra-rater reliability and inter-rater reliability (see the sketch after this list):
+ Intra-rater reliability means consistency within the rater/scorer himself/herself.
+ Inter-rater reliability means consistency between two or more raters.
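
As a rough illustration (not part of the original slides), inter-rater reliability is commonly estimated by having two raters score the same set of performances and correlating the two sets of scores. The sketch below uses hypothetical essay scores and Python's standard-library statistics.correlation (Python 3.10+); a coefficient close to 1 suggests high inter-rater reliability.

# A minimal, hypothetical sketch: correlate two raters' scores on the same essays
# to estimate inter-rater reliability.
from statistics import correlation  # standard library, Python 3.10+

rater_a = [7, 5, 8, 6, 9, 4, 7, 8]   # hypothetical essay scores from rater A (0-10 scale)
rater_b = [6, 5, 8, 7, 9, 5, 6, 8]   # the same essays scored by rater B

inter_rater_r = correlation(rater_a, rater_b)
print(f"Inter-rater reliability (Pearson r): {inter_rater_r:.2f}")

# Intra-rater reliability can be estimated the same way: have one rater re-score
# the same scripts after an interval and correlate the two rounds of scores.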
3.3 Test administration reliability

• Test administration reliability concerns the conditions and situation in which the test is administered.

• To increase this kind of reliability, teachers, as the administrators, should consider everything related to the test administration (e.g. noise, lighting, equipment, and seating).
3.4 Test reliability

• Test reliability refers to the test itself.

• Unreliable scores can be caused by a poor-quality test, for example: unclear instructions, ambiguous answers, badly constructed items, or clues to the correct/wrong answer.
4/ AUTHENTICITY

Authenticity is the degree of correspondence of the characteristics of a given language test task to the features of a target language task (Bachman and Palmer, 1996).

An authentic test…
• contains language that is as natural as possible
• has items that are contextualized rather than isolated
• includes meaningful, relevant, interesting topics
• provides some thematic organization to items, such as through a story line or episode
• offers tasks that replicate real-world tasks
4/ AUTHENTICITY

Compare the following task pairs:

• Task 1.1: Write the meaning of "trivial".
• Task 1.2: "The students think that the test is difficult, but the teacher regards it as trivial." The underlined word means ....

• Task 2.1: List different ways of greeting.
• Task 2.2: Muthia, if you meet your teacher in a supermarket at 7 p.m., how would you greet him/her?
5/ WASHBACK/BACKWASH

Washback can be defined as the effect of a test or assessment on teaching, learning, the learner, or government and society. Washback can be positive or negative.

A test that provides beneficial washback…
• positively influences what and how teachers teach
• positively influences what and how learners learn
• offers learners a chance to adequately prepare
• gives learners feedback that enhances their language development
• is more formative in nature than summative
• provides conditions for peak performance by the learner
Questions for further discussion
Questions (to be discussed by Group 1, Group 2, and Group 3):
1/ Why does a loss of concentration by the test-takers during a test lead to unreliable test results?

2/ In a university entrance test, a proctor is very often prohibited from answering any questions from the test-takers. Why?

3/ Give an example of a proctor's behavior in a test room which causes unreliability.

4/ Give a reason why a mechanical/substitution drill is considered not meaningful.

5/ What kind of washback may happen when teachers let their students cheat in the final examination? Explain.
Questions for further discussion

• https://vnuaeduvn-my.sharepoint.com/:w:/g/personal/ttmai_nn_vnua_edu_vn/ET3Z-4f_HDBCuZObK7N3U_QBQbyf8bzvYtqSAsqqVHAbtw?e=1COlfx
