TOEFL (Test of English as a Foreign Language) Listening Sub-Test
RICHARD BADGER

The TESOL Encyclopedia of English Language Teaching, First Edition.
Edited by John I. Liontas (Project Editor: Margo DelliCarpini).
© 2018 John Wiley & Sons, Inc. Published 2018 by John Wiley & Sons, Inc.
DOI: 10.1002/9781118784235.eelt0627

Framing the Issue

The Test of English as a Foreign Language (TOEFL) was introduced in 1964 as a
measure for determining the English proficiency of international students whose
first language is not English. It has been taken by 27 million candidates since
then (Educational Testing Services, 2012, p. 2). It is designed and run by
the Educational Testing Service (ETS). The TOEFL is a test of general English
language proficiency but is most often used to evaluate whether candidates have
the linguistic ability to study effectively in anglophone academic settings. It is
largely taken by students who want to study in anglophone universities in
undergraduate or graduate programs. The TOEFL can now be taken by children
of school age: the TOEFL Junior is for students older than 11 years and the TOEFL
Primary is for children over the age of 8. While these tests are likely to increase
in importance, at the moment they are not central to the TOEFL and will not be
discussed further in this entry.
The TOEFL is now an Internet-based test, the iBT. It has sections on reading,
writing, speaking, and listening. The listening part of the test lasts between 60 and
90 minutes and comprises two sections. In the first section candidates listen to
audio recordings of four to six extracts from lectures or seminars, which last about
5 minutes each and are followed by six questions. The recording is accompanied
by a photograph intended to make it clear whether the recording involves a
monologue or a dialogue between a lecturer and students. The lectures relate to the
arts, life sciences, physical sciences, and social sciences. In the second section
candidates listen to two or three conversations in academic settings, each extract
lasting about 5 minutes and being followed by five questions. The conversations
might be between a lecturer and a student discussing a course, or might be a
service encounter, for example with a university agency dealing with housing.
Again, the recording is accompanied by a photograph. Like the lectures, the
recordings are designed to represent authentic academic discourse and may
include a range of native speaker accents
of English. While ETS aims for realism (i.e., how the materials are experienced)
rather than authenticity (i.e., the origin of the texts), materials in the iBT practice
book seem generally authentic as they have many elements of natural speech.
Candidates can take notes while listening but do not see the questions until the
recording has finished. The question formats in both sections are objective but may
require candidates to select one or more options from a range of choices, sequence
events, match multiple choices, or categorize objects or text extracts. The questions
are designed so that they can be understood without any background knowledge
but may involve the implications or purpose of a particular utterance.
In task 1 of the writing section, the integrative writing test, candidates summa-
rize a text of 230 to 300 words that they have listened to and relate it to information
they derive from a separate reading passage.
The two main issues related to the TOEFL are: Is it a valid listening test? And
what is its impact on teaching, learning, and admissions policies in anglophone
universities? The next sections of this entry examine the validity of the TOEFL
listening sub-test and the TOEFL’s impact.

Making the Case

Test evaluation has traditionally been carried out on the criterion of validity, which
is the central concept in testing. Validity can be understood as a measure of the
correspondence between real-world facets and test facets. This is not so much a
quality of the test as of how the information from a test is used; and validity is best
addressed by considering the evidence upon which a decision based on a test score
could be justified (Chapelle, Enright, & Jamieson, 2008).
The listening sub-test tasks differ from listening in higher education institutions.
First, lectures are typically just under 1 hour, as opposed to the 3- to 5-minute-long
extracts in the test. Second, lectures are parts of modules selected by students and
are intended to convey discipline-specific arguments, unlike a listening test, which
aims to provide samples of a wide range of disciplines, covers the arts, life sci-
ences, physical sciences, and social sciences, and requires a general understanding
of the basic and pragmatic meaning of the extract (Bejar, Douglas, Jamieson,
Nissan, & Turner, 2000). Third, the listening sub-test of the TOEFL does not reflect
the extent to which listening is embedded in other forms of communication in
universities. For example, listening to a lecture leads into the creation of essays or
term papers. But these are, in part, addressed by the integrative writing task,
where one of the measures of success is the ability to connect and synthesize infor-
mation from different sources. The differences described in this paragraph seem to
be a necessary part of designing a practical test. It would be hard to imagine tests
where candidates listened to 1-hour lectures depending on their choice of major.
As practicable ways of bringing the TOEFL iBT closer to typical university listen-
ing are difficult to imagine, they will not be discussed further here.
Other aspects of the validity of the listening element of the TOEFL are more
easily addressed within a testing frame and might be considered more powerful
challenges to the validity of the TOEFL test. First, the listening sub-test focuses on
lectures, interactions between academics and students, and service exchanges
within an academic context. Lectures are the key listening task in university
programs. Sawaki and Nissan’s (2009) survey of 145 undergraduate and
postgraduate students in three American universities found that 42% of the
undergraduate and 52% of the postgraduate programs were lecture courses. They also
found that what students learned in lectures made an important contribution to
the assessment. The inclusion of listening activities other than lectures is a strength
of the TOEFL and addresses Lynch’s critique of academic listening exams in
general (Lynch, 2011). However, while the inclusion of service encounters has
plausibility, it is not entirely clear that this is the most important interaction
outside the lecture theatre that involves listening. Many international students
prioritize social conversations, often with non-native speakers of English, over
service encounters. There is a lack of information about the range of interactions,
including service encounters, that are important for international students in
higher education. It would be helpful to have more research on the interactions in
which international students engage as a basis for the inclusion of service
encounters in the listening section; but the omission of social encounters is at least
potentially problematic.
Second, the listening sub-test of the TOEFL does not accurately represent the
multimodal nature of lectures where diagrams, drawings, and videos are a routine
part. The use of photographs in the iBT is an attempt to address this issue but
remains a limited reflection of actual lectures where the majority of lecturers make
use of presentation software, most often PowerPoint. Research on whether
PowerPoint leads to better or worse recall of information has produced mixed
results. However, there is very little research on how students draw on PowerPoint
slides to support their own learning and none that I know of that examines how
PowerPoint presentations, handouts, and lecturers’ oral presentation are com-
bined by students as they make sense of lectures. For some students, reading the
PowerPoint slides is a more important part of lecture attendance than listening to
what the lecturer says, and this creates problems for what counts as an academic
listening test. There is also a lack of expertise on the use of multimodal recordings
in testing, but tests that incorporate video are now appearing, though there is still
little research as to how well they function as tests.
The TOEFL listening sub-test does not currently reflect the multimodal nature of
lectures. It is important that tests should be robust, particularly when there is a
dearth of evidence about this particular kind of “real-life language” use. However,
video is a well-established part of language teaching and could be an area of
TOEFL development.

Pedagogical Implications

In this section I look at the washback of the TOEFL on preparation courses and at
how the TOEFL is, or can be, used in the admissions process.
Washback or backwash is the impact of tests on courses that are intended to pre-
pare candidates for these tests. While many teachers believe that tests have consid-
erable impact, the way in which tests impact teaching depends on the beliefs and
practices of teachers and learners. This argument was investigated in a study car-
ried out by Wall and Horak (2006, 2008, 2011) on TOEFL preparation courses in six
countries in Central and Eastern Europe; the study aimed to understand how
teachers’ beliefs and practices changed as the TOEFL iBT replaced computer-based
testing (CBT). One of the reasons for the revision of the TOEFL, and in particular for
the introduction of the integrated listening–reading–writing task, was to create
positive washback (Wall & Horak, 2006). The two researchers found that, for the
TOEFL CBT, the most common activity in the listening class was for the students to
practice test-like listening items (Wall & Horak, 2006). The changes that teachers
were planning in response to the iBT version were designed to allow students to
take notes, to use longer listening passages, and to practice integrating information
from listening and reading texts; but, in practice, teachers did not provide much
support for the development of either note-taking skills or abilities to integrate
information from different sources (Wall & Horak, 2011). The main impact of the
TOEFL remained the fact that students were doing test-like activities, though there
was more student–student interaction because of the teaching materials (Wall &
Horak, 2011).
The rather depressing impact of the TOEFL on pedagogy might be addressed in
three ways. One is to improve teacher education for preparing candidates for the
TOEFL so that teachers are able to help their students develop the underlying
skills that the TOEFL listening sub-test is attempting to assess. For example, rather
than simply doing exam practice on integrating information sources, learners can
be scaffolded to develop the metacognitive and reflective abilities that would help
them synthesize what they have learnt from lectures and reading. Second, and in
line with Wall and Horak’s emphasis on the importance of teaching materials in
mediating the washback of a test, designing teaching materials related, say, to
effective note taking and developing strategies for combining spoken and written
information would be likely to lead to more effective preparation classes. Third,
some changes to test content are desirable, as current classes are not providing
students with the most effective preparation for study in anglophone universities,
as a result of the exclusion of purely social interactions and of the limited extent to
which the test reflects the multimodal nature of lectures. Developments of the test
to encompass varieties from non-native speakers of English and a wider range of
visual elements in the lecture extracts would go some way to addressing this
last issue.
Another way in which the TOEFL has an impact on broader society is through
its use for university admissions. TOEFL scores are often used as part of the selec-
tion for admission to study at anglophone universities, and it is important that
those involved in the admission process have a clear understanding of the limits
of the information that the TOEFL or similar tests can provide. However, one
important kind of evidence for the validity of the TOEFL is the extent to which it
can be used to predict future academic success. Wait and Gressel’s (2009) study of
over 6,000 students at an American university in the United Arab Emirates found
that higher TOEFL scores were associated with higher grade point averages
(GPAs) but that there were important differences between disciplines; for exam-
ple, the association was weaker for engineering students. Wait and Gressel also
found that there were many students whose academic performance defied the
general pattern. Cho and Bridgeman (2012) carried out a study of 2,594 interna-
tional students at universities in America and found that the TOEFL score
accounted for between 6% and 7% of the variance in GPA for postgraduate pro-
grams and 3% for undergraduate ones. Further analysis showed that students
with better TOEFL scores tended to have better GPAs. These results confirm that
there is a significant, though small, relationship between TOEFL and academic
performance. This is not surprising. The fact that a non-native speaker of English
has a good command of this language has no necessary connection with that
speaker’s academic ability as reflected in his or her grade point average (Cho &
Bridgeman, 2012). This means that admission processes should be wary about
overreliance on TOEFL scores.
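To give a sense of scale for the variance figures above, the following minimal sketch converts a proportion of variance explained into the size of the correlation it implies. It assumes a simple linear relationship with a single predictor, where variance explained equals the squared correlation; the cited study's own modeling may differ, and the function name is hypothetical. Under that assumption, 6% to 7% of variance corresponds to a correlation of roughly 0.24 to 0.26, and 3% to roughly 0.17.

```python
import math


def correlation_from_variance_explained(r_squared: float) -> float:
    """Return the absolute correlation implied by a proportion of variance explained.

    Assumption: one predictor and a linear relationship, so that the
    proportion of variance explained equals the squared correlation.
    """
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must be between 0 and 1")
    return math.sqrt(r_squared)


# Proportions of GPA variance reported in the text (Cho & Bridgeman, 2012).
reported = {
    "postgraduate (lower bound)": 0.06,
    "postgraduate (upper bound)": 0.07,
    "undergraduate": 0.03,
}

for label, r2 in reported.items():
    r = correlation_from_variance_explained(r2)
    print(f"{label}: implied correlation of about {r:.2f}")
```

Read this way, the figures describe a weak association, which is consistent with the caution about relying heavily on TOEFL scores in admissions.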

SEE ALSO: Construct of Listening; Listening Activities; Task-Based Approach to
Listening; Text Types in Listening; Texts for Listening Instruction and Assessment;
Video in Listening

References

Bejar, I., Douglas, D., Jamieson, J., Nissan, S., & Turner, J. (2000). TOEFL 2000 listening
framework: A working paper. Retrieved from
https://www.ets.org/research/policy_research_reports/publications/report/2000/iciu
Chapelle, C. A., Enright, M. K., & Jamieson, J. (2008). Building a validity argument for the Test
of English as a Foreign Language. London, England: Routledge.
Cho, Y., & Bridgeman, B. (2012). Relationship of TOEFL iBT® scores to academic performance:
Some evidence from American universities. Language Testing, 29, 421–42.
doi:10.1177/0265532211430368
Educational Testing Services. (2012). The official guide to the TOEFL Test (4th ed.). New York,
NY: McGraw-Hill.
Lynch, T. (2011). Academic listening in the 21st century: Reviewing a decade of research.
Journal of English for Academic Purposes, 10, 79–88.
Sawaki, Y., & Nissan, S. (2009). Criterion-related validity of the TOEFL iBT listening section.
Retrieved from
https://www.ets.org/research/policy_research_reports/publications/report/2009/hvea
Wait, I. W., & Gressel, J. W. (2009). Relationship between TOEFL score and academic success
for international engineering students. Journal of Engineering Education, 98, 389–98.
doi:10.1002/j.2168-9830.2009.tb01035.
Wall, D., & Horak, T. (2006). The impact of changes in the TOEFL examination on teaching and
learning in central and eastern Europe: Phase 1, the baseline study. Princeton, NJ: Educational
Testing Service.
Wall, D., & Horak, T. (2008). The impact of changes in the TOEFL examination on teaching and
learning in Central and Eastern Europe: Phase 2, coping with change. Princeton, NJ: Educational
Testing Service.
Wall, D., & Horak, T. (2011). The impact of changes in the TOEFL® Exam on teaching in a sample
of countries in Europe: Phase 3, the role of the coursebook, Phase 4, describing change. Princeton,
NJ: Educational Testing Service.