Standardized Testing


STANDARDIZED

TESTING
Group 3

1. Sadira Elin Kinasih (2520007)
2. Nur Rahmah (2520008)
3. Hanna Fairussania (2520012)
4. Ma’rifatul Azkiyah (2520014)
TABLE OF CONTENTS
01 The Meaning of Standardization
02 Plus and Minus of Standardized Test
03 Developing a Standardized Test
04 Standardized Language Proficiency Testing
05 Four Standardized Language Proficiency Tests
01.
The Meaning of
Standardization
What Is a Standardized Test?
A standardized test presupposes certain standard objectives, or criteria, that
are held constant from one form of the test to another. The criteria in
large-scale standardized tests are designed to apply to a broad band of
competencies that are usually not exclusive to one particular curriculum. A
good standardized test is the product of a thorough process of empirical
research and development. It dictates standard procedures for
administration and scoring. Finally, it is typically a norm-referenced
test, the goal of which is to place test-takers on a continuum across a range
of scores and to differentiate test-takers by their relative ranking.
Examples include the TOEFL, the IELTS, and college entrance exams.
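
The idea of differentiating test-takers by relative ranking can be illustrated with a small sketch. The Python below is only an illustration: the function name `percentile_rank`, the norm-group scores, and the choice of "percentage of scores strictly below" as the definition are all assumptions, not figures or procedures from any real test.

```python
def percentile_rank(score, norm_scores):
    """Percentage of norm-group scores that fall strictly below `score`.

    This is one common definition of percentile rank; other definitions
    also count half of the tied scores.
    """
    below = sum(1 for s in norm_scores if s < score)
    return 100 * below / len(norm_scores)


# Hypothetical norm group of ten test-takers (invented scores).
norm_group = [410, 450, 480, 500, 520, 550, 570, 600, 620, 650]

# A test-taker scoring 550 outranks five of the ten norm-group members.
print(percentile_rank(550, norm_group))  # 50.0
```

A norm-referenced score is interpreted this way, against other test-takers, rather than against a fixed list of course objectives.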
02.
Plus and Minus of
Standardized Test
+ Plus
1. A ready-made previously validated product that frees the teacher
from having to spend hours creating a test.
2. Administration to large groups can be accomplished within
reasonable time limits.
3. In the case of multiple-choice formats, scoring procedures are
streamlined (for either scannable computerized scoring or hand-
scoring with a hole-punched grid) for fast turnaround time.
4. There is often an air of face validity to such authoritative-looking
instruments.
- Minus
1. Inappropriate use of such tests, for example using an overall proficiency
test as an achievement test simply because of the convenience of the
standardization.
2. The potential misunderstanding of the difference between direct and
indirect testing.
03.
Developing
Standardized Test
How Are Standardized Tests Developed?

In the steps outlined below, three different standardized tests will be
used to exemplify the process of standardized test design:

01 The Test of English as a Foreign Language (TOEFL), Educational Testing
Service (ETS).
02 The English as a Second Language Placement Test (ESLPT), San Francisco
State University (SFSU).
03 The Graduate Essay Test (GET), SFSU.

01 Determine the purpose and objectives of
the test

Most standardized tests are expected to provide high practicality in
administration and scoring without unduly compromising validity. It is
therefore important for their purpose and objectives to be stated specifically.

A. The purpose of the TOEFL is “to evaluate the English proficiency of
people whose native language is not English” (TOEFL Test and Score
Manual, 2001, p. 9). More specifically, the TOEFL is designed to help
institutions of higher learning make “valid decisions concerning English
language proficiency in terms of [their] own requirements” (p. 9).
B. The ESLPT, referred to in Chapter 3, is designed to place already admitted
students at San Francisco State University in an appropriate course in academic
writing, with the secondary goal of placing students into courses in oral
production and grammar-editing.

C. The GET, another test designed at SFSU, is given to prospective graduate
students, both native and non-native speakers, in all disciplines to determine
whether their writing ability is sufficient to permit them to enter graduate-level
courses in their programs.
02 Design test specifications
Before specs can be addressed, a comprehensive program of research must identify
a set of constructs underlying the test itself.

a) Construct validation for the TOEFL is carried out by the TOEFL staff at ETS under
the guidance of a Policy Council that works with a Committee of Examiners that is
composed of appointed external university faculty, linguists, and assessment
specialists.
Because the TOEFL is a proficiency test, the first step in the developmental process is
to define the construct of language proficiency.
Bachman & Palmer (1996) prefer the term ability to proficiency and thus speak of
language ability as the overarching concept.
TOEFL Specifications

- Listening Section
The Listening Section measures the examinee’s ability to understand spoken English. Conversational
features of the language are stressed, and the skills tested include vocabulary and idiomatic expression
as well as special grammatical constructions that are frequently used in spoken English.

- Structure Section
The Structure Section measures the examinee’s ability to recognize language that is appropriate for
standard written English.

- Reading Section
The Reading Section measures the ability to read and understand short passages similar in topic and
style to academic texts.

- Writing Section
The Writing Section measures the ability to write in English, including the ability to generate, organize, and
develop ideas, to support those ideas with examples or evidence, and to compose a response to one
assigned topic in standard written English.
b) The designing of the test specs for the ESLPT was a somewhat simple task
because the purpose is placement and the construct validation of the test consisted of
an examination of the content of the ESL courses.

c) Specifications for the GET are the skills of writing grammatically and rhetorically
acceptable prose on a topic of some interest, with clearly produced organization of
ideas and logical development. The GET is a direct test of writing ability in which
test-takers must, in a two-hour time period, write an essay on a given topic.
03 Design, select, and arrange test tasks/
items

The specs act much like a blueprint in determining the number and types of items to
be created.

a) TOEFL test design specifies that each item be coded for content and statistical
characteristics. Content coding ensures that each examinee will receive test questions
that assess a variety of skills (reading, comprehending the main idea, or
understanding inferences) and cover a variety of subject matter without unduly biasing
the content toward a subset of test-takers.
Consider the following sample of a reading selection and ten items based on it, from a
practice TOEFL (Phillips, 2001, pp. 423-424).

Items target the assessment of comprehension of the main idea (item 11), stated
details (17, 19), unstated details (12, 15, 18), implied details (14, 20), and vocabulary in
context (13, 16).
b) The selection of items in the ESLPT entailed two entirely different processes. In the
two subsections of the test that elicit writing performance (summary of reading;
response to reading)

c) The GET prompts are designed by a faculty committee of examiners who are
specialists in the field of university academic writing. The assumption is made that the
topics are universally appealing and capable of yielding the intended product: an essay
that requires an organized logical argument and conclusion.
04 Make appropriate evaluations of different
kinds of items
Computing item indices may not be practical for a classroom-based test, especially if it
is a one-time test. But for a standardized multiple-choice test that is designed to be
marketed commercially, or administered a number of times, or administered in different
forms, these indices are a must.

Item Facility (IF) – the percentage of test-takers who give the right answer

Item Discrimination (IDis) – indicates the extent to which success on an item
corresponds to success on the whole test

Distractor Analysis – finding out the percentage of test-takers in the try-out group
who choose each incorrect option (distractor), to check that every distractor is working
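
The three indices above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names, the try-out responses, and the split into a high-scoring and a low-scoring group of ten are hypothetical, and the discrimination formula shown (difference in correct answers between the two groups, divided by half the combined group size) is one common textbook version rather than the procedure of any particular testing agency.

```python
from collections import Counter


def item_facility(responses, key):
    """Proportion of test-takers who chose the keyed (correct) option."""
    return sum(1 for r in responses if r == key) / len(responses)


def item_discrimination(high_group, low_group, key):
    """(high-group correct - low-group correct) / (half of both groups combined)."""
    high_correct = sum(1 for r in high_group if r == key)
    low_correct = sum(1 for r in low_group if r == key)
    return (high_correct - low_correct) / (0.5 * (len(high_group) + len(low_group)))


def distractor_counts(high_group, low_group, options):
    """For each option, how many high-group and low-group test-takers chose it."""
    high, low = Counter(high_group), Counter(low_group)
    return {opt: (high[opt], low[opt]) for opt in options}


# Hypothetical try-out data for one item whose correct answer is "C".
high = ["C"] * 7 + ["A", "B", "D"]               # ten highest scorers on the whole test
low = ["C"] * 2 + ["A"] * 3 + ["B"] * 4 + ["D"]  # ten lowest scorers on the whole test

print(item_facility(high + low, "C"))        # 0.45
print(item_discrimination(high, low, "C"))   # 0.5
print(distractor_counts(high, low, ["A", "B", "C", "D"]))
# {'A': (1, 3), 'B': (1, 4), 'C': (7, 2), 'D': (1, 1)}
```

In this invented data, distractors A and B attract more low scorers than high scorers, which is what a working distractor should do, while D draws equally from both groups and would be a candidate for revision.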
There are different forms of evaluation for other types of response formats.

Practicality
clarity of directions, timing of the test, ease of administration, time required to score
responses

Reliability
the degree to which an assessment tool produces stable and consistent results

Facility
whether items are harder than intended because of unclear directions, complex language,
obscure topics, fuzzy data, or culturally biased information
05 Specify scoring procedures and reporting
formats

A systematic assembly of test items in pre-selected arrangements
and sequences, all of which are validated to conform to an expected
difficulty level, should yield a test that can be scored accurately and
reported back to test-takers and institutions efficiently.
06 Perform ongoing construct validation
studies
No standardized instrument is expected to be used repeatedly without a rigorous
program of ongoing construct validation. Any standardized test, once developed, must
be accompanied by systematic corroboration of its effectiveness and steps towards its
improvement.

For the ESLPT, for example, the process of ongoing validation will no doubt continue as
new forms of the editing section are created and as new prompts and reading passages are
created for the writing section. Such a validation process should also include consistent
checks on placement accuracy and on face validity.
04.
Standardized Language
Proficiency Testing

Tests of language proficiency presuppose a comprehensive definition of
the specific competencies that comprise overall language ability.
The specifications for the TOEFL provided an illustration of an operational
definition of ability for assessment purposes. This is not the only way to
conceptualize proficiency.

Swain (1990) offered a multidimensional view of proficiency assessment
by referring to three linguistic traits (grammar, discourse, and
sociolinguistics) that can be assessed by means of oral, multiple-choice, and
written responses.
Another definition and conceptualization of proficiency is suggested by the American
Council on the Teaching of Foreign Languages (ACTFL). ACTFL takes a holistic and more
unitary view of proficiency in describing four levels:

1. Superior
2. Advanced
3. Intermediate
4. Novice

Within each level, descriptions of Listening, Speaking, Reading, and Writing are
provided as guidelines for assessment.
The ACTFL Guidelines describe the Superior level of speaking as follows (ACTFL
Speaking Guidelines, summary, Superior level). Superior-level speakers are
characterized by the ability to:
- Participate fully and effectively in conversations in formal and informal settings
on topics related to practical needs and areas of professional and/or scholarly
interest.
- Provide a structured argument to explain and defend opinions and develop
effective hypotheses within extended discourse.
- Discuss topics concretely and abstractly.
- Deal with a linguistically unfamiliar situation.
- Maintain a high degree of linguistic accuracy.
- Satisfy the linguistic demands of professional and/or scholarly life.
05.
FOUR STANDARDIZED
LANGUAGE PROFICIENCY TESTS
Three standardized tests of oral and written production:
1. TSE (Test of Spoken English)
2. OPI (Oral Proficiency Interview)
3. TWE (Test of Written English)

Four commercially produced standardized tests of English language proficiency
are described briefly in this section:
1. TOEFL (Test of English as a Foreign Language)
2. MELAB (Michigan English Language Assessment Battery)
3. IELTS (International English Language Testing System)
4. TOEIC (Test of English for International Communication)
Use the following questions to help you evaluate these four tests and their
subsections:
1. What item types are included?
2. How practical and reliable does each subsection of each test appear to be?
3. Do the item types and tasks appropriately represent a conceptualization of
language proficiency (ability)? That is, can you evaluate their construct validity?
4. Do the tasks achieve face validity?
5. Are the tasks authentic?
6. Is there some washback potential in the tasks?
THANK
YOU
Do you have any questions?
