
ROCAIRA RACMAN-GUMAL

Associate Professor V
English Department, College of Social Sciences & Humanities
Mindanao State University, Marawi City
“Test what you teach and teach what you
test.”

What is a test?

A test is a method of measuring a person's ability, knowledge, or performance in a given domain. (Brown, 2004)
A method
 It is a method - an instrument, a set of procedures, techniques, or items - that requires performance by the test takers.
 The method must be explicit and structured:
 multiple-choice questions with prescribed correct answers;
 writing prompts with scoring rubrics;
 an oral interview based on a question script; and
 a checklist of expected responses to be filled in by the administrator.
It must measure.
 Some tests measure general ability.
 A multi-skill proficiency test determines a general ability level.
 Others measure very specific competencies or objectives.
 A quiz on recognizing correct use of definite articles measures specific knowledge.
A test measures an individual's ability, knowledge, and performance.
 Testers need to understand who the test takers are:
 their previous experience and background;
 the matching of tests with their abilities (appropriacy); and
 how test takers should interpret their scores.
 A test measures performance.
 Most language tests measure one's ability to perform language.
 Some tests measure knowledge about the language.
A test measures a given domain.
 Proficiency tests measure overall proficiency in a language - general competence in all language skills.
 Other tests measure specific domains; a test of pronunciation, for example, might cover only a limited set of phonemic minimal pairs.
What are the Purposes of Testing?
 to measure language proficiency;
 to discover how successful students have been in achieving the objectives of the course of study;
 to diagnose students' strengths and weaknesses to identify what they know and what they do not know; and
 to assess placement of students by identifying the stage or part of a teaching programme most appropriate to their ability.
Why test?
• Diagnose students' strengths and needs.
• Provide feedback on student learning.
• Provide a basis for instructional placement.
• Inform and guide instruction.
• Communicate learning expectations.
• Motivate and focus students’ attention and effort.
• Provide practice applying knowledge and skills.
Assessment, Tests, and Teaching

Tests

Assessment

Teaching
Assessment vs Testing
Assessment
• an ongoing process
• generally encompasses a wide domain
• subconscious impression
• incidental judgment
• implicit evaluation
Testing
• an administrative procedure
• occurs at specific times
• curriculum referenced
• peak performance
• measured and evaluated
Forms of Assessment
 Informal and Formal Assessment

 Formative and Summative Assessment

 Norm-referenced and Criterion-Referenced Tests


Informal and Formal Assessment
 Informal Assessment can take a number of forms:
 incidental, unplanned comments and responses;
 coaching and other impromptu feedback to the student;
 classroom tasks designed to elicit performance without
recording results; and
 making fixed judgements about a student's competence (marginal comments, etc.)
 Formal Assessments are systematically planned sampling
techniques constructed to give teacher and student an
appraisal of student achievement.
 exercises or procedures specifically designed to tap into a storehouse of skills and knowledge.
Summative & Formative Assessment
Summative Assessment
It is used at the end of a term or a year.
It is used to assess how much has been achieved by individuals or groups.
Formative Assessment
It is used to monitor the students' progress during the course.
It takes the form of informal tests, quizzes, and class observation.
Norm-Referenced and Criterion-Referenced Tests

 Norm-Referenced Tests
 Each test-taker's score is interpreted in relation to a mean,
median, standard deviation, and/or percentile rank.
 The purpose is to place test-takers along a mathematical continuum in rank order.
 Scores are usually reported back to test-takers in numerical
form.
 SAT, TOEFL
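Norm-referenced interpretation is, at bottom, arithmetic over the norm group's score distribution. As an illustration (the raw scores and the score of 70 below are hypothetical, not from any actual test), this sketch computes a z-score and a percentile rank:

```python
from statistics import mean, pstdev

# Hypothetical raw scores from a norm group (illustrative only)
norm_group = [52, 61, 58, 70, 66, 74, 49, 63, 68, 59]

def percentile_rank(score, scores):
    """Percentage of the norm group scoring at or below this score."""
    return 100 * sum(s <= score for s in scores) / len(scores)

def z_score(score, scores):
    """How many standard deviations the score lies from the group mean."""
    return (score - mean(scores)) / pstdev(scores)

raw = 70
print(round(z_score(raw, norm_group), 2))   # distance from the mean in SD units
print(percentile_rank(raw, norm_group))     # rank relative to the norm group
```

A raw score of 70 here sits about one standard deviation above the mean, outscoring 90% of this (tiny, made-up) norm group - the kind of numerical report SAT- or TOEFL-style tests return to test-takers.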
Criterion-Referenced Tests

 These are designed to give test-takers feedback usually in


the form of grades on specific course or lesson objectives.

 Classroom tests

 The distribution of students' scores across a continuum may be of little concern as long as the instrument assesses appropriate objectives.
Approaches to Language Testing
 Discrete-Point Testing

 Integrative Testing

 Communicative-Language Testing

 Performance-Based Testing
 Discrete-Point Testing
 constructed on the assumption that language can be broken down into
its component parts that can be tested successfully.
 tests on reading, writing, speaking, listening, phonology, morphology,
lexicon, syntax, and discourse

 Integrative Testing
 cloze tests and dictation
 “indivisible” view of language proficiency (unitary trait hypothesis)
 vocabulary, grammar, phonology, the four skills, and other discrete points of language could not be disentangled from each other in language performance
 Communicative Language Testing
 for authenticity wherein designers centered on communicative
performance
 constructs include components of communicative competence model

 Performance-Based Testing
 involves oral and written production, open-ended responses,
integrated performance, group performance, & other interactive
tasks.
 time-consuming & expensive but more direct since students are
assessed as they perform actual or simulated real-world tasks.
 interactive tasks
Current Issues in Classroom Testing
 New Views on Intelligence

 Traditional and Alternative Assessment

 Computer-Based Testing
Old Views on Intelligence
 ability to perform (a) linguistic and (b) logical-mathematical
problem solving

 world of standardized, norm-referenced tests timed in a


multiple-choice format consisting of a multiplicity of logic-
constrained items, many of which are inauthentic.
New Views on Intelligence
 seven components of intelligence (linguistic, logical-mathematical, spatial, musical, bodily-kinesthetic, interpersonal, intrapersonal)
 creative thinking and manipulative strategies (Robert
Sternberg, 1988, 1997)
 Emotional Quotient (Daniel Goleman, 1995). Those who
manage emotions (esp. detrimental) tend to be more
capable of fully intelligent processing.
Traditional & Alternative Assessment
Traditional Alternative
 one-shot, standardized exams  continuous long-term assessment
 timed, multiple-choice format  untimed, free-response format
 decontextualized test items  contextualized communicative tasks
 scores suffice for feedback  individualized feedback and washback
 norm-referenced scores  criterion-referenced scores
 focus on the “right” answer  open-ended, creative answers
 summative  formative
 oriented to product  oriented to process
 non-interactive performance  interactive performance
 fosters extrinsic motivation  fosters intrinsic motivation
 Computer-Based Testing
 some computer-assisted or web-based tests are small-scale "home-grown" tests available on websites
 others are standardized, large-scale tests that involve thousands of test-takers

Advantages
 classroom-based
 self-directed practice on various aspects of language
 practice for upcoming high-stakes standardized tests
 some individualization
 large-scale standardized tests
What is Good Testing?
 It is valid. VALIDITY

 It is reliable. RELIABILITY

 It is practical. PRACTICALITY

 It has positive impact on

the teaching process. WASHBACK


Principles of Language Assessment
 Practicality
 Reliability

 Validity

 Authenticity

 Washback
Practicality
An effective test is practical if:
 it is not excessively expensive;
 it stays within appropriate time constraints;
 it is relatively easy to administer; and
 it has a scoring/evaluation procedure that is specific and time-efficient.
Reliability
 A reliable test is consistent and dependable.
 If you give the same test to the same or matched students at different times, it should yield similar results.
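One common way to put a number on the "similar results" claim is test-retest reliability: correlate scores from two administrations of the same test. A minimal sketch, using hypothetical score pairs for five students:

```python
from statistics import mean

# Hypothetical scores for five students on two administrations of the same test
first_try  = [78, 85, 62, 90, 70]
second_try = [80, 83, 65, 88, 72]

def pearson_r(x, y):
    """Pearson correlation: values near 1.0 suggest consistent (reliable) scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

print(round(pearson_r(first_try, second_try), 3))
```

For these made-up pairs the coefficient comes out close to 1.0, i.e. the ranking of students barely changed between administrations; the factors listed below are what push this figure down in practice.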
Factors that may contribute to test
unreliability :
 Student-related Reliability (student's illness, fatigue, "bad day", anxiety, and other physical or psychological factors)
 Rater Reliability (human error, subjectivity, and bias on the part of the rater)
 Test Administration Reliability (conditions, environment, facilities, quality of test production and photocopying)
 Test Reliability (the nature of the test - tests that are too long, timed tests, poorly written test items)
Validity
 A test is valid if it measures accurately what it is
intended to measure.
 To establish that a test is valid, empirical evidence is
needed. The evidence comes from different sources.
 A test is said to be valid to the extent that it measures what it is supposed to measure. For example, a test designed to measure control of grammar becomes invalid if it contains difficult vocabulary.
Types of Validity
 Content validity:
 It is the extent to which a test adequately and sufficiently
measures the particular skills it sets out to measure (cf. test
specifications)

 Response validity:
 Test takers respond in the way expected by the test
developers

 Predictive validity:
 A test accurately predicts future performance
 Concurrent validity:
 Scores on one test relate to scores on another external measure.

 Face validity:
 Test appears to measure whatever it claims to measure.
(Hughes 2003: 26-35)
 Construct validity:
 the extent to which a test measures the underlying
psychological construct (“ability, capacity”)
 the extent to which a test reflects the essential aspects of
the theory on which that test is based
Authenticity
 A test is said to be authentic if:
 the language in the test is as natural as possible;
 items are contextualized rather than isolated;
 topics are meaningful (relevant, interesting) for the learner;
 some thematic organization to items is provided, such as through a story line or episode; and
 tasks represent, or closely approximate, real-world tasks.
Washback
 Washback is generally defined as the influence of testing on teaching and learning.
 Washback refers to the extent to which the introduction and use of a test influences language teachers and learners to do things that they would not otherwise do that promote or inhibit language learning.
Beneficial Washback
If it makes teachers and learners do ‘good’ things they
would not otherwise do:

For example:
prepare lessons more thoroughly,
do their homework,
take the subject being tested more seriously, & so on.
Teachers are often said to use tests to get their
students to do things they would not otherwise do:
to pay attention to the lesson,
to prepare more thoroughly,
to learn by heart, and so on.

The concept is rooted in the notion that tests or examinations can and should drive teaching, and hence learning, and is also referred to as measurement-driven instruction.
Negative Effects of Washback
 anxiety in the learner brought about by having to take a test of whatever
nature
 concern in teachers, if they believe that some consequence will follow
on poor performance by the pupils.
 Any learner obliged to do something under pressure will perform
abnormally and may experience anxiety.
 Similarly for teachers, the fear of poor results, and the associated guilt,
shame, or embarrassment, might lead to the desire for their pupils to
achieve high scores in whatever way seems possible.
 This might lead to ‘teaching to the test’, with an undesirable ‘narrowing of
the curriculum’.
How to achieve Positive Washback?
1. Test the abilities/skills whose development you want
to encourage.
2. Sample widely and unpredictably.
3. Use direct testing.
4. Make testing criterion-referenced.
5. Base achievement tests on objectives.
6. Make sure that the test is known and understood by
students and other teachers.
Assessing the Four Skills (Reports)
 Assessing Listening
 Assessing Speaking
 Assessing Reading
 Assessing Writing
Designing Classroom Language Tests

A. Test Types (Reports)


 Language Aptitude Test

 Proficiency Test

 Placement Test

 Diagnostic Test

 Achievement Test

 Alternative Assessment

B. Test Specifications and Designs


ALTERNATIVES IN ASSESSMENT
Defining Characteristics
of Alternatives in Assessment
 1. They require students to perform, create, produce, or do
something;
 2. They use real-world contexts or simulations;
 3. They are nonintrusive in that they extend the day-to-day
classroom activities;
 4. They allow students to be assessed on what they normally do in class every day;
 5. They use tasks that represent meaningful instructional
activities;
 6. They focus on processes as well as products;
 7. They tap into higher-level thinking and problem-solving
skills;
 8. They provide information about both the strengths and
weaknesses of students;
 9. They are multiculturally sensitive when properly
administered;
 10. They ensure that people, not machines, do the
scoring, using human judgment;
 11. They encourage open disclosure of standards and
rating criteria; and
 12. They call upon teachers to perform new
instructional and assessment roles.
Types of Alternatives in Assessment

 A. Portfolios
 B. Journals
 C. Conferences and Interviews
 D. Observations
 E. Self- and Peer-Assessment
A. Portfolios

 A portfolio is “a purposeful collection of students' work


that demonstrates their efforts, progress, and
achievements in given areas.” (Genesee and Upshur,
1996)
Portfolios include the following materials:
 a. essays and compositions in draft and final forms;
 b. reports, project outlines;
 c. poetry and creative prose;
 d. artwork, photos, newspaper or magazine clippings;
 e. audio and/or video recordings of presentations, demonstrations, etc.;
 f. journals, diaries, and other personal reflections;
 g. tests, test scores, and written homework exercises;
 h. notes on lectures; and
 i. self- and peer-assessments - comments, evaluations, and checklists.
Six (6) Possible Attributes of a Portfolio

 C - collecting (collections of students' lives & identities)


 R - reflecting (reflective practice and assessment)
 A - assessing (evaluating quality and development
over time)
 D - documenting (documenting students'
achievements)
 L - linking (link between teacher and student)
 E - evaluating (to generate accountability)
Advantages of Portfolios
a. They foster intrinsic motivation, responsibility, and ownership;
b. They promote student-teacher interaction with the teacher as
facilitator;
c. They individualize learning and celebrate the uniqueness of each
student;
d. They provide tangible evidence of a student's work;
e. They facilitate critical thinking, self-assessment, and revision processes;
f. They offer opportunities for collaborative work with peers; and
g. They permit assessment of multiple dimensions of language
learning.
Steps and Guidelines
for Portfolio Development
a. State objectives clearly.
b. Give guidelines on what materials to include.
c. Communicate assessment criteria to students.
d. Designate time within the curriculum for portfolio
development.
e. Establish periodic schedules for review and conferencing.
f. Designate an accessible place to keep portfolios.
g. Provide positive washback-giving final assessments.
B. Journals

A journal is a log of one's thoughts, feelings,


reactions, assessments, ideas, or progress
towards goals, usually written with little
attention to structure, form, or correctness.
Categories or Purposes in Journal Writing

 1. language-learning logs
 2. grammar journals
 3. responses to readings
 4. strategies-based learning logs
 5. self-assessment reflections
 6. diaries of attitudes, feelings, and other affective factors
 7. acculturation logs
Some Important Pedagogical Purposes
of Journals
 1. practice in the mechanics of writing
 2. using writing as a “thinking” process
 3. individualization
 4. communication with the teacher
General Steps and Guidelines for
Using Journals as Assessment Instrument
 1. Sensitively introduce students to the concept of journal
writing.
 2. State the objective(s) of the journal.
 3. Give guidelines on what kinds of topics to include.
 4. Carefully specify the criteria for assessing or grading
journals.
 5. Provide optimal feedback in your responses.
 6. Designate appropriate time frames and schedules for review.
 7. Provide formative, washback-giving final comments.
Three Different Kinds of Feedback to Journals

 a. cheerleading feedback, in which you celebrate successes


with the students or encourage them to persevere through
difficulties;
 b. instructional feedback, in which you suggest strategies or
materials, suggest ways to fine-tune strategy use, or instruct
students in their writing; and
 c. reality-check feedback, in which you help the students set
more realistic expectations for their language abilities.
C. Conferences and Interviews

 A conference is a teacher-student interaction intended to facilitate the improvement of student performance.

 An interview is one specialized kind of conference in which a teacher interviews a student for a designated assessment purpose.
Conference activities

 commenting on drafts of essays and reports


 reviewing portfolios
 responding to journals
 advising on a student's plan for an oral presentation
 assessing a proposal for a project
 giving feedback on the results of performance on a
test
 clarifying understanding of a reading
Conference activities

 exploring strategies-based options for enhancement or


compensation
 focusing on aspects of oral production
 checking a student's self-assessment of a
performance
 setting personal goals for the near future
 assessing general progress in a course
Generic Kinds of Questions to Pose in a
Conference (Genesee and Upshur, 1996)
 What did you like about this work?
 What do you think you did well?
 How does it show improvement over previous work?
 Are there things about it you do not like? Are there things you would like to improve?
 Did you have any difficulties with this piece of work? If so, where, and what did you do (or will you do) to overcome them?
 What strategies did you use to figure out the meaning of words you could not understand?
 What did you do when you did not know a word that you wanted to write?
Goals for Interviews
 assesses the student's oral production
 ascertains a student's needs before designing a course or curriculum
 seeks to discover a student's learning styles and preferences
 asks a student to assess his or her own performance
 requests an evaluation of a course
Guidelines in Framing Interview Questions
 Offer an initial atmosphere of warmth and anxiety-lowering (warm-
up)
 Begin with relatively simple questions.
 Continue with level-check and probe questions, but adapt to the
interviewee as needed.
 Frame questions simply and directly.
 Focus on only one factor for each question. Do not combine
several objectives in the same question.
 Be prepared to repeat or reframe questions that are not
understood.
 Wind down with friendly and reassuring closing comments.
D. Observations

 Observation as an alternative assessment is a


systematic, planned procedure for real-time, almost
surreptitious recording of student verbal and nonverbal
behavior.
Potential Observation Foci
(Student Performance to be Observed)
 sentence-level oral production skills - microskills (pronunciation of target sounds, intonation, etc.) and grammatical features (verb tenses, question formation, etc.);
 discourse-level skills (conversation rules, turn-taking, and other macroskills);
 interaction with classmates (cooperation, frequency of oral production);
 reactions to particular students, optimal productive pairs and groups, which "zones" of the classroom are more vocal, etc.
Potential Observation Foci
(Student Performance to be Observed)
 frequency of student-initiated responses (whole class, group
work);
 quality of teacher-elicited responses;
 latencies, pauses, silent periods (number of seconds, minutes,
etc.);
 length of utterances;
 evidence of listening comprehension (questions, clarifications,
attention-giving verbal and nonverbal behavior);
 affective states (apparent self-esteem, extroversion, anxiety, motivation, etc.)
Potential Observation Foci
(Student Performance to be Observed)
 students' verbal or nonverbal responses to materials, types of activities, and teaching styles;
 use of strategic options in comprehension or production (use of communication strategies, avoidance, etc.); and
 culturally specific linguistic and nonverbal factors (kinesics, proxemics, use of humor, slang, metaphor, etc.)
Steps in Carrying out Observations
 1. Determine the specific objectives of the observation.
 2. Decide how many students will be observed at one time.
 3. Set up the logistics for making unnoticed observations.
 4. Design a system for recording observed performances.
 5. Do not overestimate the number of different elements you
can observe at one time - keep them very limited.
 6. Plan how many observations you will make.
 7. Determine specifically how you will use the results.
Recording observations can take the form of :
 anecdotal records (as specific as possible in focusing on the
objectives but varied in form, more note-taking than record-
keeping)
 checklists
 whole-class, group and individual participation
 content of the topic
 linguistic competence (form, function, discourse, sociolinguistic)
 materials being used
 skill (four skills)
 rating scales
E. Self- and Peer-Assessments

 Self-assessment derives its theoretical justification from SLA principles such as the principle of autonomy (setting one's own goals and pursuing them as well as monitoring them independently, and developing intrinsic motivation).
 Peer-assessment appeals to similar principles such as
cooperative learning and collaborative education.
Self- and Peer-assessment benefits:

 direct involvement of students in their own destiny;


 the encouragement of autonomy;
 increased motivation
Self- and Peer-assessment Drawbacks

 Subjectivity is a primary obstacle to overcome.


 Students may either be too harsh on themselves or too
self-flattering;
 Students may not have the necessary tools to make
an accurate assessment;
 They may not be able to discern their own errors.
Types of Self- and Peer-Assessment

 1. Assessment of (a specific) performance


 2. Indirect assessment of (general) competence
 3. Metacognitive assessment (for setting goals)
 4. Socioaffective assessment
 5. Student-generated tests
Guidelines for Self- and Peer Assessment

 1. Tell students the purpose of the assessment.


 2. Define the task(s) clearly.
 3. Encourage impartial evaluation of performance or ability.
 4. Ensure beneficial washback through follow-up tasks.
Self- and Peer-Assessment Tasks

 Listening Tasks
 listening to TV or radio broadcasts and checking comprehension with
a partner
 listening to bilingual versions of a broadcast and checking
comprehension
 asking when you do not understand something in pair or group work
 listening to an academic lecture and checking yourself on a "quiz" of the content
 setting goals for creating/increasing listening opportunities
Self- and Peer-Assessment Tasks

 Speaking Tasks
 filling out student self-checklists and questionnaires
 using peer checklists and questionnaires
 rating someone's oral presentation (holistically)
 detecting pronunciation or grammar errors on a self-recording
 asking others for confirmation checks in conversational settings
 setting goals for creating/increasing opportunities for speaking
Self- and Peer-Assessment Tasks

 Reading Tasks
 reading passages with self-check comprehension questions following
 reading and checking comprehension with a partner
 taking vocabulary quizzes
 taking grammar and vocabulary quizzes on the internet
 conducting self-assessment of reading habits
 setting goals for creating/increasing reading opportunities
Self- and Peer-Assessment Tasks

 Writing Tasks
 revising written work on your own
 revising written work with a peer (peer editing)
 proofreading
 using journal writing for reflection, assessment, and goal-setting
 setting goals for creating/increasing writing opportunities
Test Specifications and Designs
 See another PPT (MAELT 277 - Test Construction & Administration).
Scoring, Grading,
and Giving Feedback
 Writing Assignment
Fairness, Ethics and Standards in Testing
THANK YOU VERY MUCH
and
ASSALAMO 'ALAYKOM!
