
WASHBACK FROM ENGLISH LANGUAGE TESTS

FOR PRIMARY STAGE ON LANGUAGE TEACHING


AND LEARNING: AN EGYPTIAN PERSPECTIVE

BY

Dr. Hanan Waer


Curriculum & Teaching Methods Dept.
Faculty of Education at the New Valley
Assiut University

DOI: 10.12816/0053050

Abstract

Recent years have witnessed an interest in teaching English early to


young children. Hence, there is a growing body of research in this
under-investigated area, especially concerning the assessment of
young language learners. This issue has not received considerable
attention in the Egyptian context. Thus, this study tries to fill this
gap by exploring how far the Egyptian English primary tests reflect
principles of assessing young learners (Cameron 2001) and
examining test characteristics, mainly: construct validity, washback
and test impact (Messick 1996). Data were collected from different
sources, including sample tests, test specifications, classroom visits,
and in-depth interviews (with teachers, supervisors and parents). The test
critique revealed that in terms of construct validity, speaking skill is
underrepresented and hence the test lacks a keystone in assessing
young learners. In terms of consequential validity, it was found that
the test has potential for negative washback on teaching practices
and learning, manifested in aligning teaching practices to the test
agenda, focusing on discrete linguistic items, using materials that
are related to the test, manipulating dialogues for test practice
and downplaying oral activities. Besides, children frequently
depend on private tutoring and test preparation books, memorize
vocabulary and dialogues, and feel under pressure. Consequently,
the study suggests a revised test with a rationale based on the
previous literature on assessing young language learners, language
testing concepts and the Cambridge Starters test. It also argues for a more
performance-based assessment to suit the characteristics of young
language learners. Finally, some implications and recommendations
for further research are suggested.
Keywords: Assessment of young language learners, construct
validity, washback, impact, primary stage English test.


1. Introduction
Teaching English early to young children has been on the rise in
different countries in the last twenty years. Similarly, the Egyptian
system introduced English in the primary stage at grade four in
1993, then to all primary 1–6 graders in 2003 (Abdel Latif 2017).
Thus currently, children study English when they begin their
primary education at the age of six. The aim of introducing English
at this stage is "raising the learners' awareness of English as a
foreign language and the cultures it represents, in order to prepare
them for global citizenship" (Ministry of Education Standards
Document 2012, p. 7).
It follows then that the trend of teaching English to young
learners has implications for curricula, teaching and most
importantly, assessment. There has been a growing body of research
in this under-investigated area (Golis 2016; Hsieh 2016; Nikolov
2016; Hauck 2017). Hence, this study aims to contribute to this area
by examining the Egyptian English language tests for primary six
and their impact on language teaching and learning.
2. Literature review
2.1. Principles for assessing YLLs
Young language learners (hereafter YLLs) are children "learning a
foreign or second language and who are doing so during the first six
or seven years of formal schooling" (McKay 2006, p.1). YLLs are
different from adults. Hence, some considerations are associated
with assessing them (Cameron 2001; Taylor and Saville 2002;
Hughes 2003). Understanding the three categories of "growth, literacy,
and vulnerability" identified by McKay (2006, pp. 5-24) helps in
designing suitable test tasks for YLLs.

Table [1] Characteristics of YLLs and their implications for assessment tasks

Growth (Cognitive)
- Characteristics: short attention span; concentration.
- Assessment task demands: brief and varied tasks; colorful and interactive pictures; a short and interesting story to motivate them to complete the task; assessment should take place in a quiet, calm setting.

Growth (Physical)
- Characteristics: developing gross and fine-motor skills, the ability to sit still and hand-eye coordination.
- Assessment task demands: taking tiredness into consideration.

Growth (Social/Emotional)
- Characteristics: developing an understanding of the self in relation to others.
- Assessment task demands: familiar content; a "psychologically safe environment"; assessment should be familiar and involve familiar adults.

Literacy
- Characteristics: developing reading and writing skills in L2¹; literacy in L2 builds on L1 literacy understanding and skills.
- Assessment task demands: tasks should be ones that children can handle comfortably in their own language (Hughes 2003, p. 202); appropriate choice of reading texts; writing is at the word/sentence level.

Vulnerability
- Characteristics: children are vulnerable to criticism.
- Assessment task demands: assessment should have a positive impact on YLLs.

The characteristics identified in the above table relate to the learners


in the Egyptian context. Their age range is 10-11 years old, so they
are still developing their cognitive abilities and literacy skills. These
characteristics imply using appropriate reading texts that match their
interests and their topical knowledge. They also need attractive
pictures and stories as the attention span is short. What is more
important is that testing should motivate them to learn English as
they are vulnerable to criticism. Thus, all those characteristics need
to be taken into consideration when assessing Egyptian YLLs. As
Cameron puts it:
"For young children, what matters is a solid base in spoken
language, confidence and enjoyment in working with the
spoken and written language, and a good foundation in
learning skills. We should be searching out assessment
practices that will reinforce the value of these to learners
and to their parents." (p. 240).
This remark highlights the basics of YLL testing at three levels:
1) the primacy of spoken language, 2) creating a non-threatening testing
atmosphere for both written and spoken language, and 3) testing that
boosts pupils' self-esteem. Only then will testing motivate
them to learn more.

¹ L1 stands for the first language; L2 stands for the second language.


Building on such principles (Cameron 2001), the Cambridge Young
Learners English tests were developed. The suite of tests covers three key
levels: Starters, Movers and Flyers. They are good models of
suitable tests for YLLs since they take account of children's cognitive
and social development, are consistent with good practice in
primary school teaching (materials and methodology), support
language use, are relevant and look interesting (e.g., by making
use of color and graphics), and report meaningful results in order to
encourage further learning (Taylor and Saville 2002, p. 1). Such
characteristics provide valuable guidelines for evaluating tests for
primary pupils. Hence, the present study applies these
principles. They also provide a rationale for the critique of the
present test and for suggesting some modifications to it.

2.2 Main concepts related to testing


2.2.1. Construct Validity
Messick (1989 p.13) defines validity as follows:
"an integrated evaluative judgment of the degree to which
empirical evidence and theoretical rationales support the
adequacy and appropriateness of inferences and actions
based on test scores and other modes of assessment."

Thus, face validity - the judgement of a test's value by non-testers such as
students, teachers, and administrators (Alderson et al. 1995, p. 172) -
is not enough in test validation procedures. According to
McNamara (2000, p. 50), test developers should "ensure
defensibility and fairness of interpretations based on test
performance" as there are different threats to validity relating to
test content, test method, test construct, and test impact (ibid,
pp. 50-54). Construct validity faces two major threats: 'construct
under-representation' and 'construct irrelevance' (Messick 1989).
So, it is essential to ensure that the construct tested represents a
sufficient sample of behavior and is relevant to the inferences made.

2.2.2. Test washback


According to Messick (1996), washback is an aspect of construct
validity that contributes to the validity of test interpretations. He (p.
259) defines washback as "the extent to which the test influences


language teachers and learners to do things they would not


otherwise necessarily do." Weir (2005 p. 37) puts it slightly
differently, stating that washback (or backwash) "occurs only at the
'micro' level of the individual participant (primarily teachers and
students)," but that there is also "impact [which] may occur at a
'macro' or social and institutional level." Hence, it is essential to
investigate washback and impact because, in many countries, tests
are powerful levers as "they are often the single indicators for
determining the future of individuals" (Shohamy 1998, p. 332). A
considerable body of literature on washback has reported that high-
stakes exams influence materials, teachers, and learners (Bailey
1996; Wall and Alderson 1996; Andrews et al. 2002; Chowdhury
and Ahmed 2013; Safa and Gafari 2016).
2.2.3. Test impact
In many contexts, including Egypt, YLLs are assessed by using
tests. McKay (2006 p. 48) argues that "External tests [also] have a
strong influence on what and how children learn, and on how
teachers are inclined to teach and assess in the classroom."
Similarly, Moon (2005 p. 33) suggests that "testing at primary levels
can have a strongly negative wash-back effect on what happens in
the classroom and undermine attempts to introduce more
experiential and activity-based teaching." YLLs are vulnerable to
criticism and failure, which highlights that testing influences them in
a way different from older students (McKay 2006, p. 23). Cameron
(2001 p. 216) also explains different facets of impact as follows:
- stress is placed on children by the demands of assessment;
- individual children's learning needs are downgraded in a
push to cover the syllabus or coursebook before the
following assessment;
- classroom activity is restricted to test preparation;
- the power of assessment machinery limits educational
change.
It follows then that language tests should aim at achieving a positive
impact on YLLs.
2. 3. Previous studies
A number of studies (e.g., Abdel Latif 2012; Chowdhury and
Ahmed 2013; Gebril and Brown 2014; Hsieh 2016; Safa and
Gafari 2016; Gebril and Eid 2017) have examined testing in


different countries and the Egyptian education system. Abdel Latif


(2012) explored how the modified second version of Hello! - a
standards-based communicative textbook series - influenced
secondary school teachers' classroom practices. The researcher
found that the new curricular reform in secondary school English in
Egypt has not resulted in expected changes in teachers' practices.
Analysis of interview and questionnaire data showed five factors
that affected teachers' practices: washback, the culture of teaching,
inadequate time, students' low English level, and lack of equipment
and materials. Washback was the most influential factor. Abdel
Latif (2012) concluded that there should be a similar reform in the
examination system vis-à-vis the language education system in
Egypt. In the same vein, Chowdhury and Ahmed (2013)
investigated the side effects of assessment in secondary schools and
its impact on students. They used a qualitative approach to collect
and analyze data from three schools employing classroom
observation, in-depth interviews and focus group discussion. The
study found some side effects of assessment such as "suffering from self-
inferiority, losing self-confidence, disregard for school and teachers
… increase of competitive behavior" (p. 380).
Gebril and Brown (2014) investigated teacher beliefs about
the purposes of assessment in Egypt. They used the Teachers'
Conceptions of Assessment (TCoA) inventory to elicit responses
about four main factors: Improvement, School Accountability,
Student Accountability, and Irrelevance. The sample was 507
Egyptian pre-service and in-service teachers. Using confirmatory
factor analysis, it was found that the pre-existing New Zealand model
was not relevant. The model showed a strong positive relationship
between Improvement and Student Accountability. The researchers
argued that "greater changes to the examination system are required
if teacher beliefs are expected to be more positive about the priority
of formative, improvement-oriented uses of assessment" (p. 2).
Hsieh (2016) examined content validity evidence for the
new young language learner assessment—TOEFL Primary (for
young learners ages 8 and above). The participants were a panel of
17 experienced EFL teachers from 15 countries. The researcher


used Content Validity Indices (CVIs) to determine the degree of


match between the test contents and the target constructs. The
participants evaluated the relevance and importance of the
knowledge, skills, and abilities (KSAs) assessed by the TOEFL
primary reading and listening items. Results showed that most of
the items had an average CVI above the cut-off
value of .80, which indicated that the test items have construct
validity (they measure what they were intended to measure). The
test was found to be supportive of classroom practices.
Safa and Gafari (2016) examined the washback effect of third-
grade high school tests in Iran on EFL teachers' teaching
methodology, assessment procedures, and attitudes towards different
aspects of the educational system. The researchers administered a
questionnaire they designed to 160 EFL teachers. The results showed that
the final exam has negative effects on EFL teachers' teaching
methodology and assessment procedures, aligning teaching practices
with the content and format of the test. Gebril and Eid (2017) also
investigated high-school assessment and teachers' test preparation
beliefs and practices in the Egyptian context. The authors used a
questionnaire with 200 secondary school teachers and a follow-up
interview with some teachers. The results showed a wide range of
test preparation activities used in Egyptian schools, and the teachers
reported both valuable and harmful effects of test preparation activities.
Reviewing the previous studies, it is evident that all the
Egyptian studies are mainly focused on English testing in the
secondary stage (Abdel Latif 2012; Gebril and Brown 2014; Gebril
2017; Gebril and Eid 2017). Regarding the assessment of primary young
learners, to the researcher's best knowledge, no studies have
dealt with the impact of testing on YLLs in the Egyptian context.
Hence, there is a need for such a study.
3. Context of the Study
Based on the researcher's experience of supervising student teachers
in Egyptian primary schools where English is taught as a foreign
language, I realized that many pupils struggle to express themselves
in English. I also noticed that the syllabus "Time for English" has
been introduced, targeting a more communicative orientation. In
contrast, English language tests and classroom


assessment practices are still primarily focused on linguistic


knowledge. Moreover, the private tutoring phenomenon - where
students seek extra teaching from teachers - is also prevalent across
society, alongside the use of external books (McIlwraith and Fortune
2016). Besides, in an informal meeting, some parents asked me
about engaging ways to study English, as their children get
bored of memorizing words and grammar rules.
Reading the previous literature on assessing YLLs, the
researcher could not locate any study in the Egyptian context. So,
all these factors urged me to examine some samples of those
tests and see how far they are consistent with language testing
criteria and with principles for assessing YLLs. As tests have crucial
consequences for children, we should validate them, ensure their
fairness, and ensure they motivate students to learn English. Accordingly, the
present paper addresses the following three research questions:

Questions of the study


1- To what extent do English language tests for primary six
reflect principles of testing young language learners?
2- To what extent do English language tests for primary six
reflect construct validity?
3- What is the impact/washback of the English language test for
primary six on teaching and learning?

4. Research method

This study combines an analytical approach to evaluate test samples
and a qualitative approach collecting data from classroom visits and
in-depth interviews.
4.1 Tools of the Study
4.1.1. The English language test for primary six
The English language test for primary six (hereafter ELTfPS) is a
final achievement test at the end of the primary stage when pupils
are approximately 11-12 years old. Those who pass the test will
move to the preparatory stage, but those who fail will have another
chance in the summer. The test takers are from different socio-


economic backgrounds. They are beginners; they have
knowledge of familiar areas related to their age: family, animals,
fruits, colors and sports. All children study English as a foreign
language three lessons/week. The majority use English mainly in the
classroom.
The ELTfPS is divided into three main parts: listening (12
marks), reading comprehension (14 marks), and writing and
usage (14 marks). As for test specifications, the National Centre of
Examinations and Evaluation (NCEE) "has tried to develop test
specifications for use at the primary level. It has also created sample
examination materials for all primary education levels in an activity
book for learners, with sections on 'test specifications' and 'language
enrichment exercises'" (McIlwraith and Fortune 2016, p. 7).

4.1.2. Classroom visits


To gain an accurate picture of the washback of the ELTfPS on
classroom practices, the researcher visited 12 primary schools in the
New Valley governorate (first semester, October-December
2016/2017), with a total of 20 classes. The observation concentrated
on teaching activities, materials, and assessment practices.

4.1.3 Interviews
Twelve teachers (9 males, 3 females) from the visited
primary schools, four supervisors and five parents were
interviewed. All participants are residents of the New Valley
governorate, except one parent who lives in Cairo. The research aim was
explained to the participants, and their confidentiality was
assured. After obtaining their consent, open-ended questions focused on the
same aspects that were observed in class, in addition to the participants'
opinions about the ELTfPS. The total recording time was
approximately 6 hours, and the average time per interview was
about half an hour. The teachers' teaching experience ranges
from seven to twenty years. Ten teachers have a BA in Education
and Literature and a diploma in Education. One teacher has a BA in
Arts and Literature, and another has a BA in both Literature
and Education, besides a specialized diploma in teaching English
(FELT).


5. Data Analysis, Results and Discussion


This section presents the results of a critique of the current test
according to principles of assessing YLLs (c.f. 2.1) and construct
validity (c.f. 2.2.1) and the main themes that emerged from the
interview and classroom data analysis, followed by a discussion of
the results.
5.1 Evaluating the English language test for primary six
(ELTfPS)
5.1.1 Examining Test specifications
Examining the specifications of the present test (Appendix 2), it is
clear that they merely describe the question types rather than providing
detailed information about the purpose of the test, a description of the
test format, tasks and sample items (Alderson et al. 1995; Bachman
and Palmer 1996; Hughes 2003). This highlights the lack of a central
keystone in building the present test. Thus, the specifications seem
inadequate as they do not clearly explain the skills or the test
tasks. This confirms McIlwraith and Fortune (2016), who claimed
that the specifications of the test "lack structure, e.g., there are no
clear definitions of which language skills or elements are to be
tested, criteria for marking or justifications for choosing specific
items or task types" (p. 7).
5.1.2 Evaluating ELTfPS according to principles of assessing
YLLs
Examining some samples of the ELTfPS from different governorates
(Appendix 1), it was found that the ELTfPS reflects some
characteristics of YLLs in using pictures in Q6² for paragraph writing.
However, the reading text in Q5 is not accompanied by pictures.
So, it would be better to have colored pictures to suit the age of the learners.
The writing questions (Q7 and Q8) are at the word/phrase level. This
suits the characteristics of YLLs, as recommended by Taylor and
Saville (2002): writing is largely at the 'word/phrase (enabling
skills)' level. Nevertheless, the Qs include discrete points rather than
presenting them in a meaningful context such as a story.

² Q stands for question, and Qs stands for questions.


However, there is no representation of speaking skills. In so doing,


the test underestimates "the primacy of spoken language over
written language among children" (Taylor and Saville, 2002).
Hence, this threatens the construct validity of the test, as the next
section explains.
5.1.3 Evaluating Construct Validity
A language test should cover different skills to provide a
representative sample of language behavior (Hughes 2003, p. 26).
However, the ELTfPS neglects students' speaking skills (Appendix
1 and 2). Besides, listening is not tested properly. For instance, in
the first question, pupils must listen first to one question and circle
the answer they heard, and second, they circle the question they
hear, choosing either a or b (all the questions and their answers are
related to the Set Books). Multiple-choice items test recognition
knowledge (Hughes 2003, p. 75) rather than language use³. This
threatens the construct validity as it focuses on knowledge and
neglects listening comprehension subskills.
As for Q2 in the listening section, according to the stated test
specifications, pupils are required to fill in the missing words after
listening to "a long dialogue of SIX (6) exchanges between two
persons related to the Set Books" (Test specifications). Examining
different dialogues from different governorates, it is evident that
pupils can easily guess the missing word without listening. The
same conversation (taken from unit 4) is repeated in many tests with
only the caller names changed. For example, in the test samples
below from two different governorates (Sohag and Sharqia), the
test-takers can easily guess or remember the first missing word,
"May", and the other missing words as well. Hence, there is a
possibility of using "test-wiseness" strategies.

³ Many difficulties are related to the design of multiple-choice questions, such as finding appropriate distracters.


In the interview, when asked about the listening part in the
test, one teacher said, "it (Q2) covers what they study in the dialogues…, if
not one of them". The teachers are aware of the easiness of this
question, as one teacher clearly stated:
"I don't focus on listening. It is easy. Pupils get used to it. I do this
with my daughter, we read each conversation, memorizing the
words. Then we answer previous exams".
Similarly, another teacher confirms, "I do not see any difficulties in
the listening part. The student understood the questions very much
and the private books already make them used to it. At the beginning
it was hard, but not anymore".
Thus, the listening part seems to provide incomplete
information about the learners' ability to listen. Instead of being asked
to perform listening comprehension, pupils just recognize or
complete missing words. Hughes (2003, p. 75) remarks that "direct
testing implies the testing of performance skills with texts and tasks
as authentic as possible." Thus, when assessing aural/oral skills,
direct testing of performance with authentic texts and tasks is preferable.
As for the reading tasks, the reading text in Q5 assesses
students' comprehension via two multiple-choice questions and two WH

questions. The length of the text is appropriate to the learners' age.


However, using only one text does not provide enough information
about the learner's reading comprehension ability. Hughes (ibid, p.
142) recommends including as many passages as possible to achieve
both content validity and acceptable reliability.
Furthermore, the test under investigation primarily focuses on
measuring students' linguistic knowledge through discrete point
items (Q2, Q3, Q5, and Q7). For example, in Q3, the MCQs test
recognition of either grammar knowledge or vocabulary. Hence, this
may threaten the construct validity in terms of whether or not this
grammatical knowledge "underlies the productive use of grammar"
(Hughes 2003, p. 76). Besides, McKay (2006 p. 318) argues that
YLLs need communicative language tasks to use the language in
different social situations and their physical surroundings.
So far, the test critique has identified some issues in the
listening, reading and writing parts. It also reveals that the test does
not cover the speaking construct and focuses more on language
usage. The following section provides evidence from classroom
observation and interview data about the test washback as perceived
by different agents: teachers, supervisors, and parents.

5.2 Results of qualitative data analysis


The collected data were transcribed and coded according to the
main emerging themes (T stands for a teacher, P for a parent and S
for a supervisor). As this study looks into washback manifestations,
the keywords are underlined in the extracts.
1-Aligning teaching practices to test items and narrowing their
scope
The data shows how the test influences teachers' classroom
practices. Many teachers are keen to align their teaching with, and
organize it in light of, the exam requirements. For example, T6
reported that "In month exam, I follow the test specifications
exactly." T3 was keen to explain exam requirements from the onset
of the semester "From the first class period, I speak about the test
specifications, for example, the new parts that were not in year five
exam, I explain them for my students." With the test in mind,
teacher test preparations narrow the curriculum with a major focus
on specific test items. Thus many teachers reshape their teaching


decisions according to the testing regime. For example, T10 clarifies
her teaching philosophy: "I systematize every lesson according to the
test technique. The kids won't have an oral test, so I use 'rearrange',
I use MCQ questions… I show them when to choose this and not
that, and how to answer". In so doing, it seems that the test
machinery drives teaching practices.
2- Over-emphasis on specific linguistic items (grammar, vocabulary)
The data reveals teachers' decisions regarding which language
aspects are considered more important to teach. In classroom observation,
it was noticed how grammar and vocabulary are mostly prioritized
over other language components. T3 reports, "grammar and
vocabulary take the lion's share in our teaching". T11 shows a case
of pragmatic teacher thinking, doing what is needed for the test
despite his beliefs about authentic communication:
For me, I consider vocabulary better or more important. … I
really want my students to speak. I know that abroad you do not
treat with the grammar we have in mind but on message……..
But when I make an exam or teach, I totally forgot this point
and concentrated on grammar for the exam. …..The most
important thing is that my student does not lose half a mark.

In classroom observation, it was noticed in 90% of the
observed classes how teachers extract grammar from reading texts
or conversations. T7 also remarks on this practice:
For example, the first lesson is about concerts and musical
instruments…. So what is useful in this lesson is the grammar:
how do you play the guitar?.... It will come in the exam in the
grammar part. So I train my students to focus on the structure.

Similarly, a parent explained this focus with her son: "First, we
focus on the new words and new structures… then he answers
questions". This shows how children are directed to focus on words
and new structures either by teachers in class or by parents during home study.


3- Material selection

With the test agenda in mind, teachers and even parents make decisions
about which materials to use and which parts to skip. Both teachers
and parents confirmed this classroom observation. For example, T4
stated, "the sixth lesson in each unit is neglected; it's a kind of
training. But, if it has 'choose,' we focus on them."
Moreover, all the interviewees questioned the usefulness of the
workbook (the activity book):
The disaster is really the workbook. It has no relation with the test,
and I mean the questions in the test. My son asks me to answer it...
But when we come to the sample tests, I tell him to answer them
himself because that will come in the exam (P3)

This point is also confirmed by T5: "the most important three
papers in the workbook are the last three papers. The exam comes
from them". Similarly, P2 selects extra material that would help their
children in the test: "we study from the external book because the school
book is far away from the exam. … It asks questions that will not
come in the test". Besides, P1 justifies the use of external books: "The
workbook is in one direction and the exam is in another direction.
The main source is the private book." Thus, the test has become the
benchmark for selecting or skipping materials.
4- Manipulation of dialogues for the test regime and downplaying
oral activities
From classroom visits, it was clear that most pupils are reluctant to
speak. Both parents and teachers also confirmed this problem. P3 is
not satisfied with this method: "The teachers train them on
memorizing specific answers…, why do you make them memorize,
why do you not let him express himself." Similarly, T9 admits this,
giving reasons: "Speaking is totally neglected… By the end of the
day, kids memorize what will come in the test". T7 explains how
dialogues are manipulated to serve the test regime: "They are trained
since year four… five… The pupils do not have a problem in writing
words in listening. I train them, and they memorize the words by
heart". Interestingly, T11 explains how he manipulates
conversations towards a word focus serving the test: "with practice we
know the words that come in the test. I myself when I teach,


at the end of the conversation, we have 5 or 6 words, for example".


This shows clearly how the test design can influence classroom
teaching. Thus, oral practice activities are downgraded as speaking
is not represented in the test, whereas vocabulary is upgraded as it
comes in the test.
5- Teachers' assessment awareness
The data shows that some teachers are aware of problems in the
assessment of their pupils, such as the assessment of speaking and
listening. They are also keen on change. For example, one
of them suggested integrating a speaking test as a formative
assessment:
"Speaking is the basis…. Even we make an exam of 30 marks, and
we assign even 3 marks for speaking, a day before the exam, or
during the school study. …….... This practice will give students
the courage to speak."
T1 also questions the listening tests in the primary stage in general
("it is not a real test") and suggests an authentic oral test for
primary six:
T1: But why do not they put listening in schools? Just a simple task,
we only have an oral test in primary 2. Even in this, they write
cat, dog on the board, the child reads and that's it. It is not a real
test, then….So, why doesn't the child sit with the teacher, talk,
and make a conversation, even with simple marks?
This extract indicates how far some teachers are aware that the
current English primary tests do not seem authentic because they do
not elicit a representative sample of learners' speaking and
listening. This remark also shows an awareness of the nature of
YLL assessment, which emphasizes spoken language skills more
than written language (McKay 2006).
So far, the findings of this study can be summarized in the
following figure (1):


Test design: focused mainly on linguistic usage; unclear test specifications.
Threats to validity:
- Construct validity: construct under-representation (speaking).
- Consequential validity: negative washback.
Reasons: indirect testing; skills under-representation; multiple-choice questions; predictability of test items.
Behavior manifestations:
- Learning: memorizing words/structures.
- Teaching: narrowing curriculum scope; focus on linguistic aspects (grammar and vocabulary); shallow speaking activities; written test sheets/samples.
Impact on society: shallow learning; private lessons/books; the pressure to achieve high scores; stress on children/family.

Figure (1) Summary of the study results


5.3. Discussion
The test analysis (cf. 5.1) identified some problems with the construct
validity of the ELTfPS. It also revealed the existence of negative
washback of English testing in the primary stage. Negative washback
seems evident in many aspects, such as aligning teaching practices to the test
agenda, focusing on specific test items and narrowing the curriculum
scope; the excessive use of grammatical explanations and vocabulary
memorization; a focus on materials that are related to the test items and
skipping of unrelated ones; over-reliance on extra test preparation
materials; and manipulation of dialogues to serve the test regime, so that,
ultimately, oral activities are downplayed. These manifestations appear
inside and outside the classroom. Thus, the test has become the
machinery that regulates classroom practices, and teaching and
learning have become test-driven. The findings of the present study
echo those of Gebril and Eid (2017), Chowdhury and
Ahmed (2013), and Abdel Latif (2012), who also found similar
negative washback manifestations.
Examining the construct validity of the present test in the
previous section has shown that speaking skills are under-represented.
This factor might lead to adverse washback on teaching, as "teachers
might come to overemphasize those constructs that are well-
represented and downplay those that are not" (Messick 1996, p. 14).
Accordingly, many Egyptian primary teachers tend to focus on
vocabulary and grammar, because the test focuses on linguistic
knowledge, and to marginalize speaking activities, as they are not tested.
As Weir (2005, p. 18) puts it, "teachers may simply not teach certain
important skills if they are not in the test". By the same token,
Andrews (1995) reported that a test might also influence the time
allocated to particular aspects of teaching and learning. Hence,
according to parents and teachers in this study, the children give more
time to memorizing dialogues, grammar structures and words.
The present study has shown that the current test has the potential
for negative washback in directing and restricting learners' behavior
towards memorization: "I train the children, they memorize the words
by heart" (T11). Teachers seem to narrow the curriculum scope by
aligning their classroom practices to test items. In so doing, the test
influences the degree and depth of teaching and learning, as
suggested by Alderson and Wall (1993). Many teachers might use


traditional techniques that facilitate test practices: "The majority are
traditional, just fill the board" (S3). The traditional techniques that
focus on memorization might affect learning quality, turning it into
shallow learning. Consequently, this does not facilitate meaningful
and holistic language learning: "yea, but after the test, I can
differentiate; those who were taught by traditional teachers will forget once
they finish the test. The others learn and understand better" (S3).
Nevertheless, some teachers have shown some awareness
of the test's negative impact on their students. For example, they
highlighted some problems with pupils' oral skills: "They
memorize, the kids can answer the test very well. But if you try to
make a conversation in English, they can't" (T6). Besides, another
teacher is concerned about the test's inability to show students'
actual level in English:
"Sometimes I have a student who doesn't know how to speak and
in the test, he gets a high mark… For example, I have a student
who got 35 out of 40, and he does not raise his hand in the class.
However, he used to answer the test; if you see this, do this".

Furthermore, the problem of the listening section is also
highlighted, which points to the negative washback of the test:

"Why don't we have listening assessment all school year? They
have to cancel the listening questions. The students memorize the
six dialogues. So the listening question does not measure anything.
Especially in the 'complete the dialogue' question (Q2)".

Moreover, teacher awareness extends to suggesting some
solutions to enhance test validity at primary six. They suggested
incorporating listening and speaking in a formative way.
The present study results have shown various washback
manifestations of the Egyptian test on children and parents. The
children are driven by the test requirements, which focus on
linguistic knowledge rather than helping them perform to their best.
They want to gain marks as high as they can, as parents value
children's success through their marks: "My son may understand
English well, but the final measure is the marks. Thus when he
answers previous exams, I know where he is before the final exam."


(P5). On their part, the teachers are concerned with their students'
grades as well: "The most important thing is that my student does not
lose half a mark." (T6). Consequently, as parents want their children
to get the highest marks, they pay for extra private tutoring and extra
test preparation books. This condition can place a financial burden
on many families as well as exhaust the children. Some families
can afford this, while others cannot.
The data also reveals some evidence for the test's impact on
learners and their families. Some parents express how the current
testing system seems to be a burden on the family. It seems that
children are not enjoying learning due to the prevalence of private
tutoring and the nature of the test. One parent reported:
I feel that education has become ridiculous, not only in
English but in other subjects as well. I'm speaking through
my experience with my kids as a parent. It does not make us
enjoy our life. (P3)

Besides, the test-driven system drives families to hunt for the best
teacher to maximize their children's marks. However, the quality of
learning is not guaranteed, as the private teachers help the children to
get high marks but do not enhance language performance.

My kid takes a private lesson with this teacher to get marks, but
the other teacher is better. You know as if you are running after
something, but you cannot reach it. The kid is neither learning
something useful for life nor enjoying what he is learning…

Another parent explains how family life is under pressure during the
exam period.

The whole family is under pressure. The exam period is like a
camp; there is no time for anything. I see that the whole of education
needs a reconsideration. We are in a war, and we are not doing
something worthwhile.

Other parents stated that memorization is tedious, frustrating and
exhausting for learners. One said, "I have to revise continuously with
my kids many times, as she always forgets what she


memorized quickly. I really feel sorry for her, but I have no way".
As the test is mainly focused on written language and discrete
linguistic items, the magic key to the exam is memorization. This
remark was also made by some teachers in the previous analysis section.
For example, T2 maintained:
We mainly focus on memory. One of the things that teased me in
the diploma was that (memorization)… You know, there is a
famous quotation from Einstein which means "I do not waste my
time on something that exists in books".
These extracts indicate how far parents and teachers are concerned
about the drawbacks of the current testing system and the need for
a change towards more enjoyable testing that facilitates good quality learning.
As for the impact of the test on the educational system, the
change towards the new communicative syllabus seems to be
obstructed by the test design, which encourages memorization. The
alignment with test requirements largely hinders the curriculum
reform. As one supervisor puts it, "'Time for English' is a good
book; I taught it myself before supervision. The aim is how to
practice English and communicate with each other. Nevertheless,
what we do mainly is the exam work" (S2). The aim of introducing
English is to help learners communicate effectively, but the actual
picture in the classroom is slightly different. The syllabus aims are
downplayed in preparation for test requirements, which are focused
on linguistic knowledge. The qualitative data shows some evidence
for the misalignment between teaching and learning practices and the
aims of introducing English in the primary stage. Hence, alignment
between learning objectives, teaching and learning activities, and
assessment tasks (Biggs and Tang 2007) is highly needed to reach
those stated aims and bring about positive washback.
In a nutshell, the results of the present study have
revealed some problems with the test's construct validity, a mismatch
with the principles of assessing YLLs, and some negative washback
manifestations. Consequently, the fairness and meaningfulness of the
test scores are questionable, as they provide incomplete information
about learners' language ability. Thus, the test does not fit the
purpose of using the scores to transfer learners to the next stage.
So, the problems mentioned above should be tackled, as the test is a
"high-stakes test" (McIlwraith and Fortune 2016).


The following section tries to find a solution to such problems by
proposing a revised test.
6. Implications
Based on the findings of this study, this section proposes a revised
test to diminish the possible negative washback of the current test. Then it
presents some pedagogical implications.
6.1. Suggested revised test tasks
The study results reveal that speaking skills are underrepresented.
This affects construct validity (Messick 1989) and consequently
facilitates negative washback on teaching and learning (Weir 2005).
This is consistent neither with the principles for assessing YLLs
(Cameron 2001) nor with those set by the Starters test (cf. 2.1), which
emphasize the importance of spoken language and the primary
consideration of positive impact in creating tests for YLLs (Taylor
and Saville 2002; Cameron 2001). To this end, revised test
specifications are suggested (Appendix 3) to avoid these limitations
and encourage a more positive washback. The rationale for building
the revised test tasks is based on the results shown in Figure 1 as
well as the relevant literature. Thus, the proposed new
specifications cover all four language skills: speaking, listening, reading,
and writing. Besides, the task specifications take into consideration
the characteristics of YLLs (McKay 2006), the principles for assessing
them (Cameron 2001) and the Starters test (Taylor and Saville
2002).
6.1.1 Construct validity
Construct validity is built into the test through the following:
- Content relevance and coverage: the specifications are linked
directly to the learning objectives of the syllabus. Thus, this
should provide a high degree of content relevance. Moderation
teams can evaluate the degree of correspondence
between the specifications and the learning objectives.
- The content constitutes a representative sample of the four
language skills and structures, and the construct is consistently
linked to the purpose of the test (Hughes 2003, p. 26).
- Test tasks are designed to provide appropriate inferences about
test takers' language ability, capture a sufficient sample of behavior,
and give a comprehensive representation of the learner's
overall English level.


6.1.2. Test tasks and principles for assessing YLLs


The test tasks are built in the light of the Cambridge tests as a model of
suitable tests for YLLs. Some of the tasks are adapted from them; others
are from Hughes (2003). They are also based on the characteristics of
YLLs (cf. Table [1]). The following is a brief analysis of the
different tasks and how far they reflect principles for assessing
YLLs.
Listening tasks
The listening tasks are geared towards creating a non-threatening
and enjoyable atmosphere for the children. They also require minimal
responses from them. For example, children look at a colorful
picture, listen to instructions, and perform simple operations such
as drawing a line. Instead of using discrete points, which encourage
memorizing vocabulary, the tasks are authentic and relevant to the
children's world. The topics are also familiar to children, and the tasks
are simple and game-like (cf. 2.1).

Reading and writing tasks


In designing the test tasks, some tasks are accompanied by pictures, and
they require a minimum response from the pupils; these are tasks 1, 2,
and 3 in the reading and writing test. The new contribution is
to clarify the specifications for those tasks in terms of defining the skill
measured, the input and the response. Besides, new tasks from the Starters
test are added. Task 4 provides a meaningful context to test reading
ability via a cloze text accompanied by pictures. Pupils are required to
write one word for each picture.
Similarly, task 5 tests the same ability via a reading text
accompanied by 3 pictures and 6 questions. In so doing, it differs
from the reading text in the previous test, which was not
accompanied by pictures. It also helps to elicit reading ability
via the different texts related to the three pictures instead of only one
reading comprehension passage as in the previous test. The new format
for the reading text reflects the nature of YLLs by providing an interactive
context and pictures, as suggested by Taylor and Saville (2002).
This format, in turn, lessens the cognitive load, taking into account
children's short attention span (Hughes 2003, p. 201).


Speaking tasks
The new tasks highlight the importance of aural/oral skills as
maintained by Cameron (2001, p. 240), Taylor and Saville (2002)
and McKay (2006). They provide an avenue for children to use the
language via cards, describing pictures, and responding to personal
questions. The tasks measure pupils' speaking ability via direct
testing in a meaningful, authentic context. Moreover, they stimulate
student thinking. For example, in task 3, 'Odd-one-out,' the pupil
identifies the odd card and then gives a logical reason for his/her choice. In
this way, this will hopefully have a positive washback on teaching,
as it will foster using the language in a meaningful way rather than
focusing on grammar or vocabulary.
Example Task 3
The examiner shows the pupil four cards and asks him/her which
one is the odd one out:
Examinee: The duck
Examiner: Why?
Examinee: Because it's a bird and the others are animals

Task 5 also provides an authentic context for the YLLs to


understand and respond to personal questions. In sum, the speaking
tasks foster language use via direct testing. They also use pictures
and thus create motivating stimuli to produce the language.
Test washback/impact
The new test specifications are designed to yield a positive impact
on pupils, teachers, and society. Covering the learning objectives of
the course will support the positive impact, as suggested by Hughes
(2003). Unlike the previous test, in which the speaking skill
was under-represented, the newly revised test represents the four
skills. In this way, the skills are equally crucial for both teachers and

learners. The new tasks are based on using direct testing to avoid
negative washback (ibid). Besides, they focus on using the language
in a meaningful context rather than testing discrete points, as
recommended by McKay (2006). In this way, the test will hopefully
encourage teachers to create more communicative tasks that integrate
listening and speaking skills. The suggested test will, in turn,
diminish the negative washback resulting from neglecting speaking
skills in the test.
Anticipated challenges for the new specifications
It is expected that there might be some challenges or requirements in
using the new specifications. Financial support will be needed at
different levels: producing colorful and clear test papers;
implementing the new test; training teachers in communicative,
task-based teaching; and producing manuals about the new test. All
these requirements and others will need extra expenditure. However,
our children's future as competent learners of a foreign language is
more important than wasting money on ineffective testing that
has negative washback on their learning (Hughes 2003, p. 56).
Another potential challenge is resistance from teachers who are not used
to teaching speaking and listening communicatively. This difficulty can
be minimized by training them so that the newly revised test
achieves the intended positive effects.
6.2. Pedagogical Implications
The present study has shown that English language tests for primary
six have the potential for negative washback. Washback is a complex
process (Rea-Dickins and Scott 2007), and other variables may have
contributed to its existence or intensified it. For example, the
teaching methods observed might be due to inadequate teacher training. In
the classrooms observed, not all participant teachers used traditional
teaching. As one supervisor mentioned, "out of each six,
there is one creative." This extract implies an urgent need for
specialized practical training that can be applied in the classroom.
For example, some participants in the study asked for specialized
training in listening. As the listening part in the test is read by
teachers, they should be qualified for this task since children's
listening might be affected by teacher mispronunciation. Besides,
primary English teachers need specialized training in teaching and
assessing young learners as the majority have a general


methodology training. In this regard, faculties of education may
contribute. The Ministry can also cooperate with AMIDEAST and the
British Council to train primary teachers.
Other variables might be due to the lack of resources:
some teachers in this study asked for listening materials and
pronunciation resources, as they are unsure which dictionary to
follow. Besides, the participants in this study highlighted the
need for listening labs, or at least some computers for listening, in
primary schools. Furthermore, the teachers need to be provided with
the coursebook's CDs to teach listening properly.
A good point that emerged from the data was teacher awareness
of the existence of negative washback. The teachers even gave some
suggestions, such as integrating speaking, testing listening properly,
linking test tasks to similar tasks in the book, and changing the
examination system as a whole. Though they do not use the
academic terms for these suggestions, such as construct validity or
content validity, this reflects their awareness of some problems and
their hope for change, as well as their suggestions for solving them.
This implies that policymakers should listen to teachers, as they are
the ones who apply the testing policy in schools and can identify
assessment challenges in their contexts, especially when introducing
a new curriculum or a new testing system.
Parents in this study have complained of the panic that exams
impose on the children and the family as well. Besides, some teachers
have shown some tension between fostering oral skills and teaching to the test.
This implies that the examination system needs reconsideration. If
the new communicative syllabus "Time for English" is hoped to
bring positive results and enhance children's communicative
competence, it follows that a correspondingly radical assessment change
is a must. Principles of assessing young learners should be the basis
for this change. The revised test suggested in this study might help
in this respect. Additionally, formative assessment in the primary
stage should be implemented. Finally, this study calls for emphasizing
oral formative assessment in the primary stage to suit the
specific characteristics of children and to provide more
enjoyable English learning to young learners. We need an
examination that is performance-based rather than competence-
based, one that stimulates thinking and promotes 21st-century skills.


To quote Einstein, as mentioned by one teacher in this study, "I
never commit to memory anything that can be looked up in a book."
With less emphasis on memorization, primary children can enjoy
learning a foreign language and hopefully develop their second
language in a conducive learning environment.

7. Conclusion

This study has attempted to investigate the extent to which the
English language test for the primary stage reflects concepts in
testing YLLs and in language testing in general. The data analysis
showed that the speaking skill is underrepresented; hence,
this threatens the construct validity of the test (Messick 1989).
Classroom observation and interview data helped to identify the
following manifestations of negative washback on both teaching and
learning: aligning teaching practices to the test agenda, focusing on
specific test items and narrowing the curriculum scope; the excessive
use of grammatical explanations and vocabulary memorization;
a focus on materials related to the test items and skipping of
unrelated ones; and manipulation of dialogues to serve the test regime,
whereas oral activities are downplayed.
Based on the results of this study, new specifications for the test were designed. The rationale of the revised test is to address the identified drawbacks of the current test by strengthening construct validity, promoting positive washback, and meeting the principles of assessing YLLs. It also builds on the Cambridge Starters test in designing the test tasks. Speaking tasks and new listening tasks are added; some reading and writing tasks are kept, while new ones are added as well. The test tasks suit YLLs' unique characteristics by using stories, cards, and colorful, attractive pictures. Hence, it is hoped that the newly revised test will yield a positive impact and encourage meaningful teaching and learning rather than narrowing the curriculum scope to mere linguistic items.
The present study is limited by its small sample size (21 participants) drawn from one governorate; therefore, it is difficult to generalize the results. Nevertheless, the data gathered from the in-depth interviews gave valuable insights into participants' testing practices. Given these limitations, researchers can replicate this study
with a larger sample from different governorates in Egypt. Future studies can use other methods, such as stimulated recall, to collect an adequate amount of qualitative data. Furthermore, questionnaires can be used to explore the washback effect of testing on both teachers and learners in different locations.


References

1.Abdel Latif, M. (2012). Teaching a standard-based communicative English textbook series to secondary school students in Egypt: Investigating teachers’ practices and beliefs. English Teaching: Practice and Critique, 11(3), 78-97.
2.Abdel Latif, M. (2017). English Education Policy at the Pre-university
Stages in Egypt: Past, Present and Future Directions. In R.
Kirkpatrick (Ed.), English Language Education Policy in the
Middle East and North Africa (pp. 33-45). Cham: Springer
International Publishing.
3.Alderson, J. C., and Wall, D. (1993). Does Washback Exist? Applied
Linguistics, 14(2), 115-129. doi: 10.1093/applin/14.2.115
4.Alderson, J., Clapham, C. and Wall, D. (1995). Language Test Construction and Evaluation. Cambridge: Cambridge University Press.
5.Andrews, S. (1995). Washback or washout? The relationship between examination reform and curriculum innovation. In D. Nunan, V. Berry and R. Berry (Eds.), Bringing about change in language education (pp. 67-82). Hong Kong: University of Hong Kong.
6.Andrews, S., Fullilove, J., and Wong, Y. (2002). Targeting washback—a
case-study. System, 30(2), 207-223. doi:
https://doi.org/10.1016/S0346-251X(02)00005-2
7.Bachman, L. and Palmer, A. (1996). Language Testing in Practice:
designing and developing useful language tests. Oxford: Oxford
University Press.
8.Bailey, K. (1996). Working for washback: a review of the washback concept in language testing. Language Testing, 13(3), 257-279.
9.Biggs, J. and Tang, C. (2007). Teaching for quality learning at university: What the student does. Third edition. Maidenhead: Open University Press.
10.Cameron, L. (2001). Teaching English to Young Learners. Cambridge: Cambridge University Press.
11.Chowdhury, S. and Ahmed, S. (2013). Exploring the Side Effects of
Assessment in Secondary Schools and Its Impact on Students:
Perspective from Bangladesh. American Journal of Educational
Research, 1(9), 380-390.
12.Gebril, A. (2017). Language teachers’ conceptions of assessment: an
Egyptian perspective. Teacher Development, 21(1), 81-100. doi:
10.1080/13664530.2016.1218364

DOI: 10.12816/0053050 044


7102
7102 ‫ لـــسنة‬-‫مجمة الدراسات التربوية واالنسانية ـ كمية التربية ـ جامعة دمنهورـ المجمد التاسع – العددالرابع – الجزء الثانى‬

13.Gebril, A., and Brown, G. (2014). The effect of high-stakes examination systems on teacher beliefs: Egyptian teachers’ conceptions of assessment. Assessment in Education: Principles, Policy & Practice, 21(1), 16-33.
14.Gebril, A., and Eid, M. (2017). Test Preparation Beliefs and Practices in
a High-Stakes Context: A Teacher’s Perspective. Language
Assessment Quarterly, 14(4), 360-379.
15.McKay, P. (2005). Research into the assessment of school-age
language learners. Annual Review of Applied Linguistics,
25, 243–263.
16.Golis, A. (2016). Teachers’ Beliefs and Practices of Assessing Young
Learners. In M. Pawlak (Ed.), Classroom-Oriented Research:
Reconciling Theory and Practice (pp. 151-166). Cham: Springer
International Publishing.
17.Hauck, M. (2017). Designing Task Types for English Language Proficiency Assessments for Young Learners. In M. K. Wolf and Y. G. Butler (Eds.), English Language Proficiency Assessments for Young Learners for K-12 English Learners in the US (pp. 79-96). Taylor & Francis.
18.Hsieh, C. (2016). Examining Content Representativeness of a Young
Learner Language Assessment: EFL Teachers’ Perspectives. In
M. Nikolov (Ed.), Assessing Young Learners of English: Global
and Local Perspectives (pp. 93-107). Cham: Springer
International Publishing.
19.Hughes, A. (2003). Testing for Language Teachers. Second
Edition. Cambridge: Cambridge University Press.
20.McKay, P. (2006). Assessing young language learners.
Cambridge: Cambridge University Press.
21.McNamara, T. (2000). Language Testing. Oxford: Oxford University Press.
22.Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational Measurement. Third edition. New York: Macmillan.
23.Messick, S. (1996). Validity and washback in language testing. Language Testing, 13(3), 241-256. Available at: http://www.eric.ed.gov/ERICDocs/data/ericdocs2sql/content_storage_01/0000019b/80/14/dc/6f.pdf
24.Moon, J. (2005). Teaching English to young learners: The
challenges and the benefits. In English! British Council,
30-34. http://www.britishcouncil.org/az/ie2005w30-jayne-
moon.pdf [31/12/2016]

044
Washback from English Language tests for primary stage Dr. Hanan Waer

25.McIlwraith, H. and Fortune, A. (2016). English language teaching and learning in Egypt: an insight. British Council, Brand and Design / F239.
26.Egyptian Ministry of Education (2012). The National Curriculum Framework for English as a Foreign Language: Grades 1-12. Egypt.
27.Nikolov, M. (2016). Trends, Issues, and Challenges in Assessing Young
Language Learners. In M. Nikolov (Ed.), Assessing Young
Learners of English: Global and Local Perspectives (pp. 1-17).
Cham: Springer International Publishing.
28.Rea-Dickins, P. and Scott, C. (2007). Washback from language tests on teaching, learning and policy: evidence from diverse settings. Assessment in Education: Principles, Policy & Practice, 14(1), 1-7. Available from: http://dx.doi.org/10.1080/09695940701272682
29.Safa, M. and Gafari, F. (2016). The Washback Effects of High School Third Grade Exam on EFL Teachers’ Methodology, Evaluation and Attitude (Vol. 5).
30.Shohamy, E. (1998). Critical language testing and beyond. Studies
in Educational Evaluation, 24, 331–345.
31.Taylor, L., and Saville, N. (2002). Developing English language tests for young learners. Extract from Research Notes, 7, pp. 2-5. Cambridge: University of Cambridge Local Examinations Syndicate.
32.Weir, C. (2005). Language Testing and Validation: An evidence-based approach. Basingstoke: Palgrave Macmillan.
