Criterion
This type of validity concerns the extent to which a candidate’s test results align with their actual abilities.
In essence, there are two kinds of criterion-related validity: concurrent validity and predictive validity.
Concurrent validity is established when the criterion and the test are administered at roughly the same
time. In the case of the National English Exam University Entrance Exam 2023, the test criteria (ability to
accurately pronounce different sounds; ability to use diverse grammatical structures and lexical items; ability to
read passages; ability to identify errors; ability to combine sentences) are assessed at the same time that the test
taker sits the exam. Predictive validity, on the other hand, concerns the degree to which a test can predict a
candidate’s future performance. For the National English Exam 2023, the test can indicate whether a
candidate is likely to use English at a level suitable for higher education studies.
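Criterion-related validity is commonly expressed as a correlation between test scores and scores on the criterion measure. The following is a minimal sketch of that calculation, not the method used for this exam. The function name `pearson_r` and the sample figures are hypothetical and chosen only for illustration. For predictive validity, the criterion is assumed to be something like first-year university English grades.

```python
from math import sqrt


def pearson_r(test_scores, criterion_scores):
    """Pearson correlation between test scores and a criterion measure."""
    if len(test_scores) != len(criterion_scores) or len(test_scores) < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    n = len(test_scores)
    mean_x = sum(test_scores) / n
    mean_y = sum(criterion_scores) / n
    # Sum of cross-products and sums of squares around the means
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(test_scores, criterion_scores))
    sxx = sum((x - mean_x) ** 2 for x in test_scores)
    syy = sum((y - mean_y) ** 2 for y in criterion_scores)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: exam scores and later first-year English grades
exam = [4.0, 5.5, 6.0, 7.5, 9.0]
first_year = [5.0, 5.5, 6.5, 7.0, 8.5]
validity_coefficient = pearson_r(exam, first_year)
```

A coefficient near 1 would suggest that the test ranks candidates much as the criterion does; a coefficient near 0 would suggest little relationship.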
Validity in scoring
Not only must test items be valid; their scoring must be valid as well. For the National English Exam
University Entrance Exam 2023, objective scoring was employed to minimize scorer subjectivity; each
question has only one correct answer.
Scorer reliability
This type of reliability quantifies the level of agreement among scorers of the same test. For the
National English University Entrance Exam 2023, the scoring process was not made public, so scorer reliability
cannot be established.
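If scoring records from two scorers were available, their agreement could be quantified with a statistic such as Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is illustrative only: the function name `cohen_kappa` and the sample marks are hypothetical, and no such data exists for this exam.

```python
from collections import Counter


def cohen_kappa(scorer_a, scorer_b):
    """Cohen's kappa for two scorers marking the same set of responses."""
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("need two equally long, non-empty lists of marks")
    n = len(scorer_a)
    # Proportion of responses on which the two scorers gave the same mark
    observed = sum(a == b for a, b in zip(scorer_a, scorer_b)) / n
    # Agreement expected by chance, from each scorer's mark frequencies
    freq_a, freq_b = Counter(scorer_a), Counter(scorer_b)
    expected = sum(freq_a[m] * freq_b[m] for m in freq_a.keys() | freq_b.keys()) / (n * n)
    if expected == 1:
        # Both scorers used one identical mark throughout; treat as full agreement
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical marks (1 = correct, 0 = incorrect) from two scorers
marks_a = [1, 1, 0, 1, 0, 0, 1, 1]
marks_b = [1, 1, 0, 1, 0, 1, 1, 1]
kappa = cohen_kappa(marks_a, marks_b)
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance.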
Suggestions
Test makers need to develop test specifications in as much detail as possible. These should include the test constructs, a
representative sample of the test content, and a reference test on which teachers and students can base their
teaching and learning strategies. The test should include as much direct testing as possible, and there should be a close
relationship between how language items are tested and how they are scored. In addition, to make the test more reliable for
future administrations, it is important that test makers take enough samples of behavior. By analyzing previous
years’ samples, test makers can empirically trial items and include them in future tests. Additionally, items
which do not distinguish between stronger and weaker students should be revised or excluded; this means that there should be more questions
at the application level, rather than at the recall and understanding levels.
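Whether an item distinguishes stronger from weaker students can be checked with a discrimination index: the proportion of the highest-scoring candidates who answered the item correctly minus the same proportion among the lowest-scoring candidates. The sketch below is a hedged illustration. The function name `discrimination_index`, the sample data, and the default group size of 27% (a common convention, assumed here) are not taken from the exam.

```python
def discrimination_index(item_correct, total_scores, fraction=0.27):
    """Upper-lower discrimination index for one item.

    item_correct: 1 or 0 per candidate (answered this item correctly or not)
    total_scores: each candidate's total test score, in the same order
    fraction: share of candidates in each comparison group (assumed 27%)
    """
    if len(item_correct) != len(total_scores) or len(total_scores) < 2:
        raise ValueError("need two equally long lists with at least 2 candidates")
    n = len(total_scores)
    group_size = max(1, round(n * fraction))
    # Candidate indices ordered from lowest to highest total score
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower = order[:group_size]
    upper = order[-group_size:]
    p_upper = sum(item_correct[i] for i in upper) / group_size
    p_lower = sum(item_correct[i] for i in lower) / group_size
    return p_upper - p_lower


# Hypothetical data for eight candidates
totals = [45, 42, 38, 30, 27, 20, 15, 10]
item_a = [1, 1, 1, 0, 1, 0, 0, 0]  # answered correctly mainly by stronger candidates
item_b = [1, 1, 1, 1, 1, 1, 1, 1]  # answered correctly by everyone
d_a = discrimination_index(item_a, totals)
d_b = discrimination_index(item_b, totals)
```

An index near 0, as for `item_b`, marks an item that does not separate the two groups and is a candidate for revision.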