
STANDARDIZED TESTING
Group 3

1. Sadira Elin Kinasih (2520007)
2. Nur Rahmah (2520008)
3. Hanna Fairussania (2520012)
4. Ma’rifatul Azkiyah (2520014)
TABLE OF CONTENTS
01 The Meaning of Standardization
02 Plus and Minus of Standardized Test
03 Developing a Standardized Test
04 Standardized Language Proficiency Testing
05 Four Standardized Language Proficiency Tests
01.
The Meaning of
Standardization
What Is a Standardized Test?
A standardized test presupposes certain standard objectives, or criteria, that
are held constant from one form of the test to another. The criteria in
large-scale standardized tests are designed to apply to a broad band of
competencies that are usually not exclusive to one particular curriculum. A
good standardized test is the product of a thorough process of empirical
research and development. It dictates standard procedures for
administration and scoring. Finally, it is typically a norm-referenced
test, the goal of which is to place test-takers on a continuum across a range
of scores and to differentiate test-takers by their relative ranking.
Examples include the TOEFL, the IELTS, and college entrance exams.
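The norm-referenced idea above, ranking test-takers relative to one another, can be illustrated with a percentile rank. The following is a minimal Python sketch, not the scoring method of any real test: the function name `percentile_ranks`, the sample scores, and the convention used (percentage of the group scoring strictly below) are all assumptions chosen for illustration.

```python
from bisect import bisect_left

def percentile_ranks(scores):
    """Map each test-taker to a percentile rank.

    Convention assumed here: the percentage of the group that scored
    strictly below the test-taker. Other conventions exist.
    """
    ordered = sorted(scores.values())
    n = len(ordered)
    return {
        name: 100.0 * bisect_left(ordered, score) / n
        for name, score in scores.items()
    }

# Hypothetical scores for five test-takers (not real test data).
scores = {"A": 520, "B": 610, "C": 450, "D": 610, "E": 570}
ranks = percentile_ranks(scores)

for name in sorted(ranks, key=ranks.get, reverse=True):
    print(name, scores[name], ranks[name])
```

The rank says nothing about what a test-taker can do in absolute terms; it only locates that person within the group, which is the defining feature of a norm-referenced score.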
02.
Plus and Minus of Standardized
Test
+ Plus
1. A ready-made, previously validated product that frees the teacher from
having to spend hours creating a test.
2. Administration to large groups can be accomplished within reasonable
time limits.
3. In the case of multiple-choice formats, scoring procedures are streamlined
(for either scannable computerized scoring or hand-scoring with a
hole-punched grid) for fast turnaround time.
4. There is often an air of face validity to such authoritative-looking
instruments.
- Minus
1. Inappropriate use of the test, such as using an overall proficiency test
as an achievement test simply because standardization is convenient.
2. The potential for misunderstanding the difference between direct and
indirect testing.
03.
Developing Standardized
Test
How Are Standardized Tests Developed?

In the steps outlined below, three different standardized tests will be used to
exemplify the process of standardized test design:

01 The Test of English as a Foreign Language (TOEFL), Educational Testing Service (ETS).

02 The English as a Second Language Placement Test (ESLPT), San Francisco State University (SFSU).

03 The Graduate Essay Test (GET), SFSU.


01 Determine the purpose and objectives of the test

Most standardized tests are expected to provide high practicality in administration
and scoring without unduly compromising validity. It is therefore important for the
purpose and objectives of the test to be stated specifically.

A. The purpose of the TOEFL is “to evaluate the English proficiency of people
whose native language is not English” (TOEFL Test and Score Manual, 2001, p.
9). More specifically, the TOEFL is designed to help institutions of higher
learning make “valid decisions concerning English language proficiency in
terms of [their] own requirements” (p. 9).

B. The ESLPT, referred to in Chapter 3, is designed to place already admitted
students at San Francisco State University in an appropriate course in academic
writing, with the secondary goal of placing students into courses in oral
production and grammar-editing.

C. The GET, another test designed at SFSU, is given to prospective graduate
students, both native and non-native speakers, in all disciplines to determine
whether their writing ability is sufficient to permit them to enter graduate-level
courses in their programs.
02 Design test specifications

Before specs can be addressed, a comprehensive program of research must identify a set of
constructs underlying the test itself.

a) Construct validation for the TOEFL is carried out by the TOEFL staff at ETS under the
guidance of a Policy Council that works with a Committee of Examiners composed of
appointed external university faculty, linguists, and assessment specialists.
Because the TOEFL is a proficiency test, the first step in the developmental process is to define
the construct of language proficiency.
Bachman and Palmer (1996) prefer the term ability to proficiency and thus speak of language
ability as the overarching concept.
TOEFL Specifications

- Listening Section
The Listening Section measures the examinee’s ability to understand spoken English. Conversational features of the language
are stressed, and the skills tested include vocabulary and idiomatic expression as well as special grammatical
constructions that are frequently used in spoken English.

- Structure Section
The Structure Section measures the examinee’s ability to recognize language that is appropriate for standard
written English.

- Reading Section
The Reading Section measures the ability to read and understand short passages similar in topic and style to academic
texts.

- Writing Section
The Writing Section measures the ability to write in English, including the ability to generate, organize, and
develop ideas, to support those ideas with examples or evidence, and to compose a response to one assigned
topic in standard written English.
b) The designing of the test specs for the ESLPT was a somewhat simple task because the
purpose is placement and the construct validation of the test consisted of an examination of the
content of the ESL courses.

c) Specifications for the GET are the skills of writing grammatically and rhetorically acceptable
prose on a topic of some interest, with clearly produced organization of ideas and logical
development. The GET is a direct test of writing ability in which test-takers must, in a two-hour
time period, write an essay on a given topic.
03 Design, select, and arrange test tasks/items

The specs act much like a blueprint in determining the number and types of items to be created.

a) TOEFL test design specifies that each item be coded for content and statistical
characteristics. Content coding ensures that each examinee will receive test questions that
assess a variety of skills (reading, comprehending the main idea, or understanding inferences)
and cover a variety of subject matter without unduly biasing the content toward a subset of test-
takers.
Consider the following sample of a reading selection and ten items based on it, from a practice
TOEFL (Philips, 2001, pp. 423-424).

The items target the assessment of comprehension of the main idea (item 11), stated details (17, 19),
unstated details (12, 15, 18), implied details (14, 20), and vocabulary in context (13, 16).
b) The selection of items in the ESLPT entailed two entirely different processes. In the two
subsections of the test that elicit writing performance (summary of reading; response to
reading)

c) The GET prompts are designed by a faculty committee of examiners who are specialists in
the field of university academic writing. The assumption is made that the topics are universally
appealing and that the intended product is an essay that requires an organized logical argument and
conclusion.
04 Make appropriate evaluations of different
kinds of items
Item facility, item discrimination, and distractor analysis may not be practical to perform,
especially if the classroom-based test is a one-time test. But for a standardized multiple-choice
test that is designed to be marketed commercially, administered a number of times, or
administered in different forms, these indices are a must.

Item Facility (IF): the percentage of test-takers who answer the item correctly.

Item Discrimination (IDis): the extent to which success on an item corresponds to
success on the whole test.

Distractor Analysis: the percentage of test-takers in the try-out group who choose each incorrect
option, which shows whether every distractor is doing its job.
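The three item indices can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the procedure used by ETS or any other test developer: the function names, the answer key, and the six response records are hypothetical, item discrimination is computed with the upper-group minus lower-group comparison (one common approach), and distractor analysis is taken to mean counting how often each incorrect option is chosen.

```python
from collections import Counter

def item_facility(responses, answers, item):
    """Proportion of test-takers who answered `item` correctly."""
    correct = sum(1 for r in responses if r[item] == answers[item])
    return correct / len(responses)

def item_discrimination(responses, answers, item):
    """IF in the top third of scorers minus IF in the bottom third."""
    def total_score(r):
        return sum(1 for i, a in enumerate(answers) if r[i] == a)
    ranked = sorted(responses, key=total_score, reverse=True)
    n = max(1, len(ranked) // 3)  # size of each comparison group
    upper, lower = ranked[:n], ranked[-n:]
    return (item_facility(upper, answers, item)
            - item_facility(lower, answers, item))

def distractor_counts(responses, answers, item, options=("A", "B", "C", "D")):
    """How many test-takers chose each incorrect option for `item`."""
    chosen = Counter(r[item] for r in responses)
    return {opt: chosen[opt] for opt in options if opt != answers[item]}

# Hypothetical try-out data: 6 test-takers, 3 multiple-choice items.
answers = ["B", "A", "D"]
responses = [
    ["B", "A", "D"],
    ["B", "A", "C"],
    ["B", "C", "D"],
    ["A", "A", "C"],
    ["C", "C", "D"],
    ["A", "C", "C"],
]

for item in range(len(answers)):
    print(item,
          item_facility(responses, answers, item),
          item_discrimination(responses, answers, item),
          distractor_counts(responses, answers, item))
```

In this made-up data, items 0 and 2 have the same facility but very different discrimination, which is why the indices are read together. A distractor with a count of zero attracts nobody and is a candidate for revision.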

Other types of response format call for different forms of evaluation.

Practicality
clarity of directions, timing of the test, ease of administration, time required to score responses

Reliability
the degree to which an assessment tool produces stable and consistent results

Facility
unclear directions, complex language, obscure topics, fuzzy data, and culturally biased information can make an item more difficult than intended
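One way to put a number on "stable and consistent results" is to correlate scores from two administrations of the same test to the same group (a test-retest estimate). The sketch below is an assumption-laden illustration: the slides do not say which reliability estimate is intended, the score lists are invented, and `pearson_r` is a hand-written helper rather than a call into any testing library.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same five test-takers on two occasions.
first_sitting = [52, 61, 45, 70, 58]
second_sitting = [55, 60, 48, 68, 61]

r = pearson_r(first_sitting, second_sitting)
print(round(r, 3))
```

A value close to 1 suggests that test-takers keep roughly the same relative position across sittings; a low value would signal that scores depend heavily on the occasion rather than on ability.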
05 Specify scoring procedures and reporting
formats

A systematic assembly of test items in pre-selected arrangements and
sequences, all of which are validated to conform to an expected difficulty
level, should yield a test that can be scored accurately and reported back to
test-takers and institutions efficiently.
06 Perform ongoing construct validation studies

No standardized instrument is expected to be used repeatedly without a rigorous program of
ongoing construct validation. Any standardized test, once developed, must be accompanied by
systematic corroboration of its effectiveness and steps towards its improvement.

For the ESLPT, the process of ongoing validation will no doubt continue as new forms of the editing section
are created and as new prompts and reading passages are created for the writing section. Such a
validation process should also include consistent checks on placement accuracy and on face
validity.
04.
Standardized Language
Proficiency Testing

Tests of language proficiency presuppose a comprehensive definition of the
specific competencies that comprise overall language ability.
The specifications for the TOEFL provided an illustration of an operational
definition of ability for assessment purposes. This is not the only way to
conceptualize proficiency.

Swain (1990) offered a multidimensional view of proficiency assessment by
referring to three linguistic traits (grammar, discourse, and sociolinguistics)
that can be assessed by means of oral, multiple-choice, and written responses.

Another definition and conceptualization of proficiency is suggested by the American Council
on the Teaching of Foreign Languages (ACTFL). ACTFL takes a holistic and more unitary view of
proficiency in describing four levels:

1. Superior
2. Advanced
3. Intermediate
4. Novice

Within each level, descriptions of Listening, Speaking, Reading, and Writing are provided
as guidelines for assessment.
The ACTFL Guidelines describe the superior level of speaking as follows (ACTFL
speaking guidelines, summary, superior level). Superior-level speakers are characterized by
the ability to:
- Participate fully and effectively in conversations in formal and informal settings on topics
related to practical needs and areas of professional and/or scholarly interest.
- Provide a structured argument to explain and defend opinions and develop effective
hypotheses within extended discourse.
- Discuss topics concretely and abstractly.
- Deal with a linguistically unfamiliar situation.
- Maintain a high degree of linguistic accuracy.
- Satisfy the linguistic demands of professional and/or scholarly life.
05.
FOUR STANDARDIZED
LANGUAGE PROFICIENCY TESTS
Three standardized tests of productive skills:
1. TSE (Test of Spoken English)
2. OPI (Oral Proficiency Interview)
3. TWE (Test of Written English)

Four commercially produced standardized tests of English language proficiency are
described briefly in this section:
1. TOEFL (Test of English as a Foreign Language)
2. MELAB (Michigan English Language Assessment Battery)
3. IELTS (International English Language Testing System)
4. TOEIC (Test of English for International Communication)

When you consult the appendix for these tests, use the following questions to help you evaluate
the four tests and their subsections:
1. What item types are included?
2. How practical and reliable does each subsection of each test appear to be?
3. Do the item types and tasks appropriately represent a conceptualization of language
proficiency (ability)? That is, can you evaluate their construct validity?
4. Do the tasks achieve face validity?
5. Are the tasks authentic?
6. Is there some washback potential in the tasks?
THANK YOU
Do you have any questions?
