Ede 206 Sedillo
Language and
Literature
Assessment
SCORING
LANGUAGE
TESTS
RICKY C. SEDILLO JR., LPT
Presenter
Scoring Items
• Scoring is the first step in processing test results: converting the
answers to the test questions into numbers.
• Scoring is the act of quantifying the answers given by a
test taker in a learning outcome test.
• In language testing, a construct of great interest has
always been the ability to recognise logical relations
between clauses, such as those indicating situation–
problem–response–evaluation patterns (Hoey, 1983).
One way of testing this is to use a multiple-choice
approach, as in the following example from Fulcher
(1998b):
Most human beings are curious. Not, I mean, in the sense that they are odd, but
in the sense that they want to find out about the world around them, and about
their own part in the world.
1a. But they cannot do this easily.
1b. They therefore ask questions, they wonder, they speculate.
1c. Or, on the other hand, they may wish to ask many questions.
What they want to find out may be quite simple things: What lies beyond that
range of hills? Or they may be rather more complicated inquiries: How does
grass grow? Or they may be more puzzling inquiries still: What is the purpose
of life? What is the ultimate nature of truth? To the first question the answer
may be obtained by going and seeing. The answer to the next question will not
be so easy to find, but the method will be essentially the same.
2a. So, he is forced to observe life as he sees it.
2b. Although, often, it may not be the same.
2c. It is the method of the scientist.
• You may wish to spend some time looking at this
particular task to identify problems and may also
wish to reverse engineer a specification and
attempt to improve it.
Below is a short story that happened recently. The order of the
five sentences of the story has been changed. Your task is to
number the sentences to show the correct order: 1, 2, 3, 4 and
5. Put the number on the line. The first one has been done for
you.
_____ (a) She said she’d taken the computer out of the box,
plugged it in, and sat there for 20 minutes waiting for something
to happen.
__1__ (b) A technician at Compaq Computers told of a frantic
call he received on the help line.
_____ (c) The woman replied, ‘What power switch?’
_____ (d) It was from a woman whose new computer simply
wouldn’t work.
_____ (e) The tech guy asked her what happened when she
pressed the power switch.
•Assuming that a mark is given for each sentence
placed in the correct slot, the most obvious scoring
problem with an item like this is that, once one
sentence is placed in the wrong slot, not only is
that sentence incorrect, but the slot for another
correct answer is filled.
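The effect can be made concrete with a small sketch. It assumes the keyed order for the four unscored sentences is d, a, e, c (our reading of the story, not a key given in the source), and it contrasts slot-by-slot marking with an adjacent-pair alternative that is introduced here purely for illustration.

```python
# Assumed key for slots 2-5 (slot 1, sentence b, is given to the test taker).
KEY = ["d", "a", "e", "c"]


def score_exact_slots(key, response):
    """One mark for each sentence placed in its keyed slot."""
    return sum(k == r for k, r in zip(key, response))


def score_adjacent_pairs(key, response):
    """One mark for each neighbouring pair that also appears in the key.

    Illustrative alternative only: it credits partial sequences that
    slot-by-slot marking ignores.
    """
    keyed_pairs = set(zip(key, key[1:]))
    return sum(pair in keyed_pairs for pair in zip(response, response[1:]))


# One misjudgement: sentence (c) is placed second instead of last,
# which pushes d, a and e one slot to the right.
response = ["c", "d", "a", "e"]

print(score_exact_slots(KEY, response), "of", len(KEY))         # 0 of 4
print(score_adjacent_pairs(KEY, response), "of", len(KEY) - 1)  # 2 of 3
```

Under slot-by-slot marking this test taker receives no marks, even though three of the four sentences are still in the right relative order.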
SCORABILITY
• Scorability means that the test should be easy to
score. Directions for scoring should be clearly stated
in the instructions. Provide the students with an answer
sheet, and provide whoever checks the test with the
answer key.
• Even with closed response items, Lado saw that if
the answers are ‘scattered in the pages’, the time
taken to score a test is extended and the chances
of making errors when marking and transferring
results to a mark book increase.
•He therefore recommended the use of separate
answer sheets upon which test takers could record
their responses. Scoring speeds up significantly,
and errors are reduced.
• The most commonly used keys are stencils that
enable the scorer to see whether a response is
correct or incorrect without having to read the
question.
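In software terms, a stencil is simply a key that is compared with the answer sheet position by position. The following is a minimal sketch, assuming a hypothetical five-item multiple-choice test; the key and the responses are invented for illustration.

```python
def score_answer_sheet(key, sheet):
    """Count the responses that match the key; blank (None) responses score 0.

    The scorer never consults the question text, only the item position and
    the keyed option, which is what a stencil makes possible on paper.
    """
    if len(key) != len(sheet):
        raise ValueError("answer sheet and key must cover the same items")
    return sum(1 for keyed, given in zip(key, sheet) if given == keyed)


ANSWER_KEY = ["b", "d", "a", "a", "c"]   # hypothetical key
sheet = ["b", "d", "c", None, "c"]       # hypothetical answer sheet

print(score_answer_sheet(ANSWER_KEY, sheet))  # 3
```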
•Even today, many teachers construct
‘templates’ to score tests much more quickly.
•From the very earliest days, a further perceived
advantage of scoring closed response items
was cost. Of course, increasing the speed of
marking through the use of stencils reduced
cost, but using clerical or other untrained staff
also reduced personnel costs.
•With the emphasis on speed, cost and
accuracy, there were many ingenious attempts
to make scoring easier. Lado (1961: 365–366)
mentions two of these.
The first was the Clapp-Young Self-Marking test
sheets.
• This consisted of two pieces of paper sealed
together with a sheet of carbon paper between
them.
• When the test taker marked an answer on the sheet,
it was printed on the second sheet, which already had
the correct answers circled.
• The marker separated the sheets and counted off
the correct answers from the second sheet of
paper.
A second method was the punch-pad self-scoring
device.
•Not surprisingly, computer-based testing has
become exceptionally popular.
Branching Tests
• Branching enables authors to deliver dynamic
assessments that adapt to each individual learner,
providing them with a more personalized learning
experience.
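A minimal sketch of the idea follows. It assumes a hypothetical three-level item bank and a simple routing rule (move up one level after a correct answer, down one level after an incorrect one); the source does not specify a branching rule, so both are assumptions.

```python
# Hypothetical item bank: difficulty level -> item identifiers.
ITEM_BANK = {
    1: ["easy-1", "easy-2", "easy-3"],
    2: ["mid-1", "mid-2", "mid-3"],
    3: ["hard-1", "hard-2", "hard-3"],
}


def next_level(level, correct, lowest=1, highest=3):
    """Assumed rule: up one level if correct, down one if not, within bounds."""
    step = 1 if correct else -1
    return max(lowest, min(highest, level + step))


def run_branching_test(outcomes, start_level=2):
    """Return the (item, level) pairs delivered for a list of outcomes.

    `outcomes` holds True/False for each item in the order it was delivered.
    Delivery stops early if the current level has no unused items left.
    """
    level = start_level
    used = {lvl: 0 for lvl in ITEM_BANK}   # items already taken per level
    path = []
    for correct in outcomes:
        if used[level] >= len(ITEM_BANK[level]):
            break
        path.append((ITEM_BANK[level][used[level]], level))
        used[level] += 1
        level = next_level(level, correct)
    return path


print(run_branching_test([True, True, False]))
# [('mid-1', 2), ('hard-1', 3), ('hard-2', 3)]
```

Two learners who answer differently therefore see different items, which is the sense in which the assessment adapts to the individual.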
Holistic Scales
•A single score is awarded, which reflects the
overall quality of the performance.
Strong Performance: Writing has a clear focus and engages the
reader in the opening lines. Information is accurate. Transitions
help the reader move smoothly from one idea to another. Any
errors in structures and/or spelling are minor and infrequent;
they do not interfere with communication.
Meets Expectations: Writing has a clear opening statement and
logical sequence of ideas. The information is accurate. Any
errors in structures and/or spelling are minimal and do not
interfere with communication.
Primary Trait Scales
•A single score is awarded, but the descriptors are
developed for each individual prompt (or
question) that is used in the test.
PERSUADING AN AUDIENCE
Multiple Trait Scoring
• Unlike the two scale types already mentioned,
multiple trait scoring requires raters to award
two or more scores for different features or traits
of the speech or writing sample.
• The traits are normally prompt or prompt-type
specific, as in primary trait scoring.
Time on task (10 9 8 7 6 5 4 3 2 1)
Excellent: The group forms immediately to work on activity until the
teacher indicates otherwise; if group finishes early, members discuss
topics related to TL.
Average: The group forms fairly soon to work mostly on activity until
the teacher indicates otherwise; if group finishes early, members are
either silent or discuss topics not related to TL.
Needs Improvement: The group takes a long time to form; members do not
work on activity (unless the teacher walks by); if group finishes early,
members discuss topics not related to TL.

Participation (5 4 3 2 1)
Excellent: All group members participate equally throughout the entire
activity.
Average: All group members but one participate equally throughout the
activity.
Needs Improvement: More than one group member does not participate
equally throughout the activity.

Group Cooperation (10 9 8 7 6 5 4 3 2 1)
Excellent: All members cooperate to help each other learn; if anyone has
been absent, the group helps him/her; no one acts ‘superior’.
Average: Most members cooperate to help each other learn; if anyone has
been absent, the group sometimes helps him/her; no one acts ‘superior’.
Needs Improvement: Members do not cooperate to help each other learn; if
anyone has been absent, the group does not help; some members act
superior.

Use of Target Language (5 4 3 2 1)
Excellent: Members use as much TL as possible (also to greet and say
farewells).
Average: Members use some TL during activity (also to greet and say
farewells).
Needs Improvement: Members rarely use TL during activity (neither do
they greet nor say farewells).
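A rater's scores on a scale like this can be recorded and checked with a short sketch. The maximum points per trait are taken from the point scales above; the ratings are hypothetical, and adding the trait scores into a single total is an assumption, since the rubric itself only shows a separate point scale for each trait.

```python
# Maximum points per trait, taken from the point scales in the rubric.
MAX_POINTS = {
    "Time on task": 10,
    "Participation": 5,
    "Group Cooperation": 10,
    "Use of Target Language": 5,
}


def total_score(ratings):
    """Validate one rater's trait scores and return (total, maximum)."""
    if set(ratings) != set(MAX_POINTS):
        raise ValueError("every trait needs exactly one score")
    for trait, points in ratings.items():
        if not 1 <= points <= MAX_POINTS[trait]:
            raise ValueError(f"{trait}: {points} is outside 1-{MAX_POINTS[trait]}")
    return sum(ratings.values()), sum(MAX_POINTS.values())


# Hypothetical ratings for one group.
ratings = {
    "Time on task": 8,
    "Participation": 4,
    "Group Cooperation": 9,
    "Use of Target Language": 3,
}

print(total_score(ratings))  # (24, 30)
```

Keeping the per-trait scores alongside any total preserves the diagnostic information that distinguishes multiple trait scoring from a holistic scale.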
AUTOMATED
SCORING
•‘The examiner who is conscientious hesitates,
wonders if this response is as good as another he
considered good, if he is being too easy or too
harsh in his scoring.’
e-rater
•The software is capable of analysing syntactic
features of the essay, word and text length, and
vocabulary.
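To give a feel for what length and vocabulary features look like, here is a small sketch. It is not e-rater's implementation and says nothing about how e-rater weights or combines features; the function name, the chosen measures and the sample text are all assumptions, and syntactic analysis is left out entirely.

```python
import re


def surface_features(essay):
    """Return a few simple length and vocabulary measures for a text."""
    words = re.findall(r"[A-Za-z']+", essay.lower())
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    return {
        "text_length_words": len(words),
        "sentence_count": len(sentences),
        "mean_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
        "distinct_words": len(set(words)),
    }


sample = "Most human beings are curious. They ask questions. They wonder."
print(surface_features(sample))
```

For the sample text this yields 10 words, 3 sentences and 9 distinct words.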
PhonePass
• The kinds of tasks utilised by computer-scored
speaking tests include reading sentences aloud,
repeating sentences, providing antonyms for
words, and uttering short responses to questions.
THANK YOU
FOR
LISTENING!!!