
Language Testing

http://ltj.sagepub.com/

Examining dialogue: another approach to content specification and to validating inferences drawn from test scores
Merrill Swain Language Testing 2001 18: 275 DOI: 10.1177/026553220101800302 The online version of this article can be found at: http://ltj.sagepub.com/content/18/3/275

Published by:
http://www.sagepublications.com

Downloaded from ltj.sagepub.com by ROSNGELA RODRIGUES BORGES on September 4, 2010

Examining dialogue: another approach to content specification and to validating inferences drawn from test scores
Merrill Swain University of Toronto

In this article one aspect of the many interfaces between second language (L2) learning and L2 testing is examined: the oral interaction (the dialogue) that occurs within small groups. Discussed from within a sociocultural theory of mind, the point is made that, in a group, performance is jointly constructed and distributed across the participants. Dialogues construct cognitive and strategic processes which in turn construct student performance, information which may be invaluable in validating inferences drawn from test scores. Furthermore, student dialogues provide opportunities for language learning, i.e., opportunities for the joint construction of knowledge. It is suggested that an examination of the content of these dialogues can provide test developers with targets for measurement. Other implications for L2 testing are also discussed.

I Introduction

In this article I examine one aspect of the many interfaces between the fields of second language (L2) assessment and L2 learning: small groups and the oral interaction that occurs within them. Small groups consist of two or more individuals. They differ from interview contexts, where asymmetry in the exchanges is a given and where 'one person is solely responsible for beginning and ending the interaction [and] for ending one topic and introducing a new topic . . .' (van Lier, 1989: 489). Both language educators and language assessors are interested in what individuals say to each other in small groups. I am interested because what individuals say to each other can inform us about L2 learning processes and strategies. Language assessors are interested because there are tests which evaluate the performance of individuals as they interact in pairs or small groups. Importantly, some of those tests are high-stakes tests.
Address for correspondence: Merrill Swain, The Ontario Institute for Studies in Education of the University of Toronto, 252 Bloor St. W., Toronto, Ontario, M5S 1V6, Canada; email: MSwain@oise.utoronto.ca
Language Testing 2001 18 (3) 275–302 0265-5322(01)LT208OA © 2001 Arnold

Given that interaction in small groups is one point of overlap among our interests, I discuss some of the research that I am currently engaged in and the theoretical orientation within which this research is being conducted. Small groups are important to both fields. My goal is to suggest some of what we, as researchers, might discover about L2 learners and about L2 test-takers by examining their dialogues as they jointly construct their performance in group activities. Specifically, I raise two issues for consideration: 1) Might these dialogues generate content which could serve as new targets for measurement? 2) Might analyses of the dialogues provide additional insights for examining the validity of the inferences we draw from a test and the uses we make of them?

It is perhaps important to say at this point that the ways in which we have been examining dialogue (e.g., Swain and Lapkin, 1998) are somewhat different from the text and discourse analyses that are now quite commonly applied in our respective fields. Text and discourse analyses focus on linguistic and interactional features of speaking, rather than on its content and on the underlying cognitive and strategic processes which both generate that talk, and that that talk generates. Our literatures are rich in the use of text and discourse analyses which have been used, for example, to examine the similarities and differences of 'test-speak' with that used outside of the testing situation (e.g., Lazaraton, 1992), or of the performances generated by different oral tasks. A large number of features such as lexical density, fluency, structural complexity and turn-taking have been examined (e.g., Shohamy et al., 1993; Young and He, 1998).
Also, the ways in which we have been examining dialogue are somewhat different from those of other verbal protocol techniques (Green, 1998) in that the data we have examined are the dialogues that occur between participating individuals, not those which occur in solo think-aloud reports and retrospective accounts. What I propose could complement these analyses as a source of validity evidence.

In the next section I provide two examples of high-stakes tests which make use of small groups, along with a sampling of some of the issues related to small group testing that have been considered by researchers in the testing field. This is done as a reminder of the importance of understanding the dynamics of small group interaction to assessment practices. In Section III I introduce some ideas emanating from a theoretical perspective, a sociocultural theory of mind, which has recently been attracting interest amongst some L2 learning researchers. In Section IV I then consider aspects of our research, which is being conducted from within this theoretical framework. The
purpose of this is to explore the relevance of this theoretical orientation and research to language testing.

II Small group tests and related issues

A number of high and lower stakes tests incorporate a speaking section in which two or more candidates are required to talk to each other. For example, each of the five EFL Examinations in UCLES's main suite has a speaking component that involves interaction among the candidates. The non-interview-like part of the speaking section of the First Certificate in English, which lasts for approximately seven minutes, is described in their handbook as follows: 'The candidates are given visual prompts (photographs, line drawings, diagrams, etc.) which generate discussion through engaging in tasks such as planning, problem solving, decision making, prioritising, speculating, etc.' (UCLES, 1996: 84) (three minutes). The examiner then encourages a discussion among the participants of matters related to the theme of the visual prompt (four minutes).

A second example among high-stakes tests comes from Hong Kong. There, performance on the Hong Kong Use of English A/S level Examination determines whether a student can gain entry into Hong Kong's tertiary institutions. Since 1994, this test has included a two-part oral component, the second part involving groups of four students interacting in a university-like setting, replicating, for example, a small academic seminar.

There are a number of reasons why some tests now include pairs or small groups of individuals interacting together to debate an issue or to solve a problem. Dissatisfaction with the oral interview as the sole means of assessing oral proficiency and a search for other tasks that elicit different aspects of oral proficiency (Shohamy et al., 1986) are concomitant reasons. An attempt to influence teaching practices (Hilsdon, 1991) or, alternatively, to mirror teaching practices has also played an important role.
Economic reasons, too, have played their part: where there are many students to be tested, it can be less expensive to test them in groups (Berry, 2000).

Given that small group testing occurs in even one high-stakes test, and given the reasoned case for its use, it is surprising that so little validation work has been carried out. How are scores based on interaction among participants to be interpreted as an indication of individual performance ability? Can they be interpreted as individual performance ability at all? McNamara, in his thought-provoking paper entitled 'Interaction in second language performance assessment: Whose performance?' (1997), has raised precisely these questions. His questions encompass a broad range of interactions including those between
interviewer and interviewee, and those between test-raters and test-takers. The research of Lumley and Brown (1996), for example, found that features of an interviewer's language behaviour differentially support or handicap a test candidate's performance. The point here is that performance is not solo performance, but rests on a joint construction by the participating individuals. I return to this point below.

Other researchers have raised different questions that impact on an interpretation of scores from group testing. Fulcher (1996), for example, questioned students about their reactions to oral tasks they had recently completed. The tasks included one-on-one interviews and a group discussion. Students indicated that they felt least anxious prior to carrying out the group discussion task, and that they considered the conversation that emerged during the group discussion to be more natural than in the one-on-one interviews. These affective responses translated into performance differences. On the basis of his G-study, Fulcher (1996: 36) states, however, that while task does have a significant effect upon scores, this effect is so small that it does not seriously reduce the ability to generalize from one task to another.

Berry (2000) has examined the relationship between extraversion and performance on a group oral test. What she has found is a complex relationship between the characteristics of the test-taker and those of the rest of the group. The scores of introverts are suppressed when they take part in a discussion . . . in a group with a low mean level of extraversion, and are elevated when in a group with a high mean level of extraversion (p. 163, section 5.6.6). The reverse holds for extroverts. Berry's findings caution us against interpreting individual scores as reflecting underlying linguistic abilities, and support an interpretation of situated performances.
This potential for unfair biases if students of differing compatibilities (or of differing abilities) are grouped together needs further investigation. Not only is there the likelihood of real performance differences of any one test-taker in different contexts, but there may also be induced biases, i.e., because of its embeddedness in different contexts, test-raters may unintentionally judge the same performance differently. Importantly, there is also the issue of whose performance it is anyway.

III Theoretical perspective: sociocultural theory of mind

Since the 1980s the notions of 'input' and 'interaction' have played a significant role in theorizing about the essential ingredients and conditions for L2 learning (e.g., Pica, 1994; Gass, 1997). This is familiar
territory for most researchers of L2 learning. What is less familiar is the program of research that we (e.g., Swain, 1995; 2000; Kowal and Swain, 1997; Swain and Lapkin, 1998; 2000a; 2000b) have been pursuing since the mid-1990s. The purpose of our research has been to discover if, and how, what I have long called 'output' (that is, language production) serves L2 learning. The basic argument is that the learner's drive to communicate successfully in the target language pushes him or her to go beyond the cognitive activity that occurs in comprehension and to engage in more complete grammatical processing. In attempting to communicate, learners will create linguistic form and meaning and, in so doing, will discover the limitations of their current system. Depending on the individuals and the circumstances, noticing a gap in their linguistic knowledge may stimulate learning processes.

Certainly, for many of the learners we have recorded as they interacted while working together on tasks (e.g., Kowal and Swain, 1997), we have observed that those learners noticed gaps in their linguistic knowledge and worked to fill them by turning to a dictionary or grammar book, by asking their peers or teacher, by generating and testing hypotheses, or by noting to themselves to pay attention to future relevant input. Our data show that these actions generated linguistic knowledge that was new for the learner or consolidated their existing knowledge (e.g., Swain and Lapkin, 1998). Importantly, it was the attempt to communicate, as distinct from comprehending, that focused the learner's attention on what he or she did not know, or knew only imperfectly. This view of output is embedded in the concept of language as a communicative activity.
Since about 1995, individuals such as van Lier (2000) and Kramsch (1995) have argued against the continued use of terms like 'input' and 'output', claiming that they limit our understanding of L2 learning to an information-processing, machine-like perspective. These terms are based on a conduit metaphor that sees language use simply as items which are transmitted as output from one source to be received as input elsewhere. Using this metaphor, they suggest, inhibits the development of a broader understanding of language use and language learning.

This is a valid and productive critique. In accepting it and moving forward, I am engaged in reworking the notion of output to incorporate it within a view that focuses on language learning and use as dialogue (dialogue with others, and dialogue with the self) serving both communicative and cognitive functions. I have come to this position through research and reading the works of Vygotsky (e.g., 1978; 1987) and those who have further developed his theoretical stance,
including psychologists (e.g., Wertsch, 1985; 1991; Cole, 1996) and applied linguists (e.g., Lantolf and Appel, 1994; Lantolf, 2000a). There are two important points I wish to make regarding this theoretical perspective, both of which are based on one of its primary premises: the origins of higher mental processes (that is, of cognitive functioning) are primarily social.

1 Higher cognitive processes as mediated activities

The first point is that higher cognitive processes (processes such as attending, predicting, planning, monitoring and inferencing) are mediated activities whose source is the interaction that occurs between individuals. That interaction initially takes the form of dialogue between individuals. According to Wertsch (1980), strategies that an individual participates in through social dialogue develop into strategic patterns of reasoning at the cognitive level, such that at a later stage the individual has taken over the cognitive responsibilities of both agents who had formerly participated in the social dialogue (p. 159). These strategic processes become visible in the dialogue between individuals when they jointly engage in problem-solving activity. As Lantolf (2000b) states, 'an extremely important implication of research on mediated learning [is that] . . . attending to the talk generated by learners during peer mediation allows us access to some of the specific cognitive processes learners deploy to learn a language' (p. 20 of manuscript).

2 Knowledge as constructed through dialogue

The second point is that knowledge is constructed through dialogue. That dialogue may be with the self; it may be with others. The important point here is that dialogue mediates the construction of knowledge; through dialogue participants co-construct knowledge. In the case of researchers of L2 learning, we are interested in the construction of linguistic knowledge.
And, in fact, our own research is oriented towards demonstrating that dialogue mediates L2 learning. Most of us have little trouble in understanding that dialogue mediates our learning of such substantive areas as mathematics, science and history. In principle the notion of dialogue mediating the learning of language is no different. When learners engage in a joint activity, particularly one in which successful communication is important, efforts to use language to solve language problems can be observed in their dialogue. I have called this sort of knowledge-building dialogue 'collaborative dialogue'.

The implications of this perspective for language testing are twofold. Let me mention them briefly here, and return to them after discussing some relevant L2 learning research. The first implication is that, because cognitive and strategic processes are made visible in dialogue, studying that dialogue will provide us with evidence of how participants in group interaction approach and process the task demands. If understanding these strategies and processes is important to an understanding of the construct being measured, then the dialogue amongst participants will be an important source of validation evidence. The second implication is that the process of interaction and the outcome of interaction are a joint achievement, not one of individual performance.

IV Current research

I now discuss our current research in which we have been examining closely the dialogues of pairs of students as they work collaboratively on different tasks. I would like to suggest that those dialogues offer opportunities for L2 learning (that is, they are a possible source of learning), that they offer evidence of the cognitive processes and strategies learners are using, and that they provide content that can establish targets for measurement.

In our current work in L2 learning, this neo-Vygotskian, sociocognitive perspective has meant that we have designed our research to provide opportunities for students to work together in pairs on different tasks. We have been particularly interested in what students say to each other when they encounter a linguistic problem as they are doing the task: how do they go about solving it? And if they solve it, what evidence can we present that their solution provided an occasion for language learning? Learning is understood to be a continuous process of constructing and extending meaning that occurs during learners' involvement in situated joint activities (Halliday and Matthiessen, 1999; Wells, 1999).

In our most recent studies (Swain and Lapkin, 1998; 2000a; 2000b), we had two main questions we wished to investigate. Our first question was whether the dialogues of the students were, indeed, a source of L2 learning. Our second question related to the attentional focus and cognitive processes that different tasks engendered. We wanted to know if one type of task would lead our students to focus more on language form than would another type of task.

We worked with two Grade 8 (aged 13–14) mixed-ability French immersion classes from the same school.¹ These two classes of Grade 8 Anglophone students had been in a French immersion program since kindergarten. Until Grade 3, all instruction was in French; thereafter, English language arts was introduced, and from about Grade 5 on, approximately 50% of instruction was in French and 50% in English. Table 1 shows the stages of the research, and the time frame. Data collection took place over a five-week period.

Table 1 Stages of the research and time frame
Week 1: pretest developed from pilot study was administered
Week 2: informal instructor-led training session; task done in pairs (focus on adjective agreement)
Week 3: video-taped lesson; instructions; and modelling of task performance; task done in pairs and tape-recorded (focus on reflexives)
Week 4: tapes transcribed and class-specific posttest items developed
Week 5: posttest was administered

Because one of our goals was to trace the learning that occurred in the student dialogues, we administered both a pretest and posttest. The posttest consisted of the pretest plus items we constructed based on the language-related episodes (see definition below) occurring in the student dialogues. In other words, we used the content of the language-related episodes as the basis for the construction of new, additional posttest items. This was necessary because of our research question as to whether the student dialogues were, in fact, a source of L2 learning. We had found from previous research that in tasks like the jigsaw and dictogloss (see below for descriptions), we could not predict ahead of time precisely what aspects of language form and meaning each pair of students would focus on, no matter how precisely we thought we had structured the task, and no matter to what extent we intervened with such manipulations as, for example, using videoed mini-lessons. In other words, as each pair of students progressed through the task, they did so focusing on linguistic aspects that the story they created, created for them.
1 The research was conducted during regular classroom time. In one class (30 students), pairs of students worked together on the dictogloss task; in the other class (35 students), pairs of students worked together on the jigsaw task (see below for description).

This meant that any pre-designed posttest we might have administered would misrepresent the knowledge we believed our learners would attain. This is the reason that in week 4 we transcribed the tapes of the students and constructed test items based on their dialogues. It is in this sense that I raised the question at the beginning of this article of whether the dialogues of test-takers doing group oral tasks might serve as the basis for developing new targets for measurement. In order to be able to develop the new posttest items quickly, we had decided ahead of time on the format of the test items we would use. Two examples are given in Figure 1. The first example shows a multiple-choice item, and the second shows two related grammaticality judgement items. These items were, in fact, developed based on two of the language-related episodes (Examples 1 and 2) discussed below.
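
The assembly of such a posttest can be sketched in code. What follows is a minimal illustration under stated assumptions, not the procedure the research team actually followed: the names `LanguageRelatedEpisode`, `PosttestItem` and `build_posttest`, the record fields, the de-duplication rule and the sample data are all hypothetical and introduced here only to make the logic concrete. The sketch shows a posttest formed from the pretest items plus one new item per distinct target found in the transcribed dialogues, each cast in one of two formats fixed ahead of time.

```python
from dataclasses import dataclass

# The two item formats decided on ahead of time (the types shown in Figure 1).
FORMATS = ("multiple_choice", "grammaticality_judgement")


@dataclass(frozen=True)
class LanguageRelatedEpisode:
    """Hypothetical record of one episode identified in a transcript."""
    pair: str          # which pair of students produced the episode
    target: str        # the word or form the students discussed
    item_format: str   # one of FORMATS, chosen by the test writer


@dataclass(frozen=True)
class PosttestItem:
    source: str        # "pretest" or "dialogue"
    target: str
    item_format: str


def build_posttest(pretest, episodes):
    """Return the pretest items followed by one new item per distinct
    (target, format) combination found in the episodes."""
    posttest = list(pretest)
    seen = {(item.target, item.item_format) for item in pretest}
    for episode in episodes:
        if episode.item_format not in FORMATS:
            raise ValueError(f"unknown item format: {episode.item_format}")
        key = (episode.target, episode.item_format)
        if key in seen:  # do not write the same item twice
            continue
        seen.add(key)
        posttest.append(PosttestItem("dialogue", episode.target, episode.item_format))
    return posttest


# Illustrative data only; the pretest item and the second pair are invented.
pretest = [PosttestItem("pretest", "adjective agreement", "grammaticality_judgement")]
episodes = [
    LanguageRelatedEpisode("Kim and Rick", "l'oreiller", "multiple_choice"),
    LanguageRelatedEpisode("another pair", "l'oreiller", "multiple_choice"),
]
posttest = build_posttest(pretest, episodes)
print(len(posttest))  # prints 2: the pretest item plus one new item
```

The design point the sketch tries to capture is that the item formats are fixed in advance while the item content is not: the content can only be filled in once the dialogues have been transcribed.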

Figure 1 Two test item types: multiple-choice and grammaticality judgement

As can be seen in Table 1, during the second week, we conducted a session to familiarize the students with the particular type of task they would be doing again the following week when we tape-recorded them. To focus students' attention on at least one aspect of French that has proven problematic for French immersion students and that was highlighted in the task itself, a short mini-lesson on adjective agreement was given. Then the students did the task.

In the third week, the grammatical point focused on was the reflexive verb. A pre-recorded mini-lesson on French reflexive verbs was shown on video. The video also showed two students working together on a similar task, serving as a model for what the students were to do following the viewing of the video. The talk of each pair of students in the class was tape-recorded as they did their task. The stories the students wrote were collected. Later they were rated on a five-point scale for each of content, organization, vocabulary, morphology and syntax. The stories produced by each pair of students were scored by two experienced French immersion teachers. The two sets of ratings for each writing sample were averaged. For the descriptors of the five scales, see Swain and Lapkin, 2000a.

We used two contrasting tasks in the study: a jigsaw task and a dictogloss task. The jigsaw task, a task in which each member of a pair of students holds information the other does not, is shown in Figure 2. One student in each pair held pictures numbered 1, 3, 5 and 7 and the other held those numbered 2, 4, 6 and 8. The students were required to construct the story told by the pictures by looking only at the cards each held. Typically the students worked through the cards sequentially, alternately telling each other what was in their pictures. Then together they wrote out the story. Pica et al.
(1994) suggest that this type of two-way information exchange task is thought maximally to foster negotiation of meaning, a condition hypothesized to increase comprehensible input and therefore enhance L2 learning.

The dictogloss task we used is shown in Figure 3. The text of this dictogloss was read twice in French at normal speed to the students. Each student took notes on its content while listening to the passage being read. Then pairs of students, using their notes, worked together to reconstruct the passage in writing. Teachers we had worked with had tried out dictogloss tasks in action research in their own classrooms. They had found that it provided a context in which students not only negotiated meaning, but also focused on linguistic form in context (Kowal and Swain, 1994; 1997).

Our intent was that the two tasks be as similar and comparable as possible in terms of content, so we constructed the dictogloss text from the pictures in the jigsaw task. We showed the series of eight pictures to three adult native speakers of French and asked them to narrate the story they saw unfolding. Combining their transcribed narratives gave us the text we used for the dictogloss (Figure 3).

Figure 2 Jigsaw task
Source: The tricky alarm-clock, I. Baltova 1994

Le réveil-matin de Martine
Il est six heures du matin et le soleil se lève. Martine dort tranquillement dans son lit. Elle fait de beaux rêves, la tête au pied du lit et les pieds sur l'oreiller. Quand le réveil sonne, Martine ne veut pas se lever. Elle sort son pied et avec le gros orteil, elle ferme le réveil. Elle se rendort tout de suite. Mais elle a le réveil qu'il faut pour ne pas être en retard. À six heures et deux minutes, une main mécanique tenant une petite plume sort du réveil et lui chatouille le pied. C'est efficace. Finalement Martine se lève. Elle se brosse les dents, se peigne les cheveux et s'habille pour prendre le chemin de l'école. Encore une journée bien commencée.

Translation of dictogloss task
It's 6 a.m. and the sun is rising. Martine is sound asleep in her bed. She's having sweet dreams, her head at the foot of the bed and her feet on the pillow. When the alarm clock rings, Martine doesn't want to get up. She sticks her foot out and with her big toe she shuts off the alarm. She falls asleep again immediately. But she has the kind of alarm clock you need to prevent being late. At 6:02, a mechanical hand holding a small feather comes out of the alarm clock. It tickles her foot. To good effect! Finally Martine gets up. She brushes her teeth, combs her hair and gets dressed to go to school. Another great start to the day!

Figure 3 Dictogloss text and translation

Our working hypothesis about the cognitive processes the tasks would engage was that although both tasks provided opportunities for the negotiation of meaning, the dictogloss task would lead students to focus more on linguistic form than the jigsaw task. As both tasks involve using language communicatively, we predicted that the additional focus on form which we thought the dictogloss task would promote would enhance the learning opportunities for the students who did that task. Thus, when the posttest results showed that this prediction was not borne out, an examination of the dialogue of the students became crucial in helping us understand why.

Because of our particular interest in collaborative dialogue (that dialogue which ensued when students encountered a linguistic problem and worked jointly to solve it), we identified in the transcripts all instances of language-related episodes. A language-related episode (an instance of collaborative dialogue) is a unit of analysis that emerged from our data, and that we defined as any part of a dialogue where students talk about the language they are producing, question their language use, or other- or self-correct their language
production. Language-related episodes (LREs) thus entail discussion of meaning and form, but may emphasize one of these more than the other. In our analysis we distinguished lexis-based and form-based language-related episodes. Lexis-based LREs involve searching for French vocabulary and/or choosing among competing French vocabulary items. Form-based LREs involve focusing on spelling or an aspect of French morphology, syntax or discourse. (Conferencing among the research team members achieved consensus in identifying and classifying LREs.)

An example of a lexis-based LRE is shown in Example 1. Kim and Rick are two Grade 8 French immersion students. Here they are engaged in doing the jigsaw task; they are working on writing out the part of the story illustrated in picture 2 of Figure 2.
Example 1
1 Kim: Quelque chose uh . . . est sur l' . . . quelque chose est sur l'oreiller. (Something uh . . . is on the . . . something is on the pillow.)
2 Rick: Is that l'oreiller? [pointing to something in picture 2] (Is that the pillow?)
3 Kim: No, this is l'oreiller. [pointing to it] (No, this is the pillow)
4 Rick: Pillow?
5 Kim: Yeah, pillow's oreiller. [Yeah, the French word for pillow is oreiller.]
(from Swain and Lapkin, 1998)

In turn 1 Kim is working out what they might write down about this picture. However, Rick is not sure of the meaning of what Kim has said. Specically, Rick is unsure about the meaning of loreiller and, in turn 2, Rick seeks to clarify its meaning by pointing to something in the picture and asking whether what he is pointing to is in fact an oreiller. His shift to English is signicant; he could certainly ask his question in French. His shift allows him to focus on the French lexical item of importance here: loreiller; and his use of English frames the French word and holds it up for attention and reection. At this point, both the French word and its referent in the picture are in focus, and remain so in turn 3. In turn 3, Kim tells Rick that what he has pointed to is not a pillow while at the same time pointing to the correct referent. Here loreiller is again highlighted through its embedding in an English sentence. In turn 4, Rick appears to be making the essential connection: that loreiller means pillow, which Kim conrms in turn 5. In the L2 acquisition literature, this example would constitute a classic example of negotiation of meaning, where a conrmation request by Rick leads to input that is made comprehensible for him. This is hypothesized as being a necessary condition for learning to occur. However, what is going on here is more and different than that:


it is a collaborative venture. Two students are engaged in a dialogue in which the meaning of a lexical item is concretized, used, translated into the native language and back again into the target language, and learned. This dialogue is not enhancing learning, or leading to learning; it is learning. In the posttest item (shown in Figure 1), Kim and Rick both correctly checked l'oreiller. It is from data of this sort that we have inferred that learning occurred during the dialogues (LREs). Rick learned the meaning and perhaps the lexical item for 'pillow'. For Kim, this LRE perhaps served to consolidate previous learning. In this brief LRE, we see evidence of the processes and strategies in which the students engage. Rick generates a hypothesis ('Is that l'oreiller?') and has his hypothesis disconfirmed, being provided with a correct solution ('No, this is l'oreiller.'). What is going on here is much more than just input and output. Figure 4 shows items from Purpura's (1998) taxonomy of cognitive strategies that are made visible in this short dialogue between Kim and Rick. There is evidence of clarifying (line 1), verifying (line 3), translating (lines 4 and 5), inferencing (line 2), associating

Comprehending processes:
  Clarifying/verifying: I try to improve my English by asking other people to tell me if I have understood or said something correctly.
  Translating: When I'm learning new material in my second language (L2) I translate it into my native language. I learn new words in my L2 by translating them into my own language.
  Inferencing: I try to improve my listening in my L2 by listening for the important words to help me understand better. I try to improve my listening in my L2 by guessing the meaning of new words from the situation.

Storing or memory processes:
  Associating: I learn new words in my L2 by connecting the sound of any new word with an image or picture of it to help me remember it.
  Linking with prior knowledge: When I'm learning new material in my L2 I try to connect what I am learning with what I already know. I try to improve my listening in English by using what I know from other situations to help me understand what is being said.

Figure 4 Items from Purpura's (1998) cognitive strategies questionnaire


(lines 2 and 4), and linking with prior knowledge (entire LRE). These are strategies which occur dialogically here. We can see here an instantiation of Wertsch's (1980) point that a large share of strategic activity in daily life has distributed responsibility. Our interpretations about the cognitive processes apparent in the dialogues would be strengthened through the collection of additional data (e.g., asking the participants what they thought about what they were doing). In our current research (e.g., Swain and Lapkin, 2001), we have done so. An example of a form-based LRE is shown as Example 2. This dialogue also occurred between two Grade 8 French immersion students doing the jigsaw task while they were writing out the story, and in reference to picture 8 (Figure 2):
Example 2
1 S1: Yvonne va à l'école.
  (Yvonne goes to school.)
2 S2: Se part à l'école.
  (Yvonne leaves [uses non-existent reflexive form] for school.)
3 S1: Oui. Elle . . . se marche.
  (Yes. She . . . walks [uses non-existent reflexive form])
4 S2: Se part, parce que . . .
  (Leaves [uses non-existent reflexive form], because . . . )
  S2: Est-ce que c'est part ou se part?
  (Is it leaves or leaves [in the non-existent reflexive form]?)
5 S1: Part.
  (Leaves.)
6 S2: Part? Just part?
  (Leaves? Just leaves?)
7 S1: Yeah.
8 S2: Ok. Yvonne part à l'école.
  (Ok. Yvonne leaves for school.)
(from Swain and Lapkin, 2000a)

What is going on in this dialogue? It is important to remember, in interpreting this LRE, that these students watched on video a short mini-lesson on reflexive verbs just prior to doing the task. The effect of this on many of the students was to lead them to overgeneralize and use reflexive verbs where it is not possible to do so in French. This appears to be the case in this LRE. In turn 1, S1 proposes that they write Yvonne va à l'école. S1 chooses to use the all-purpose verb aller ('to go'), a verb whose meaning and form she knows very well. However, perhaps because her partner, S2, remembers the instructions to try to use reflexive verbs, S2 suggests using a more specific verb, partir ('to leave'), providing it, in turn 2, in the reflexive form. Se partir, however, is a non-existent form in French. Whether S1 realizes this or not, she appears not to like her partner's suggestion and so offers yet another alternative in turn 3: se marche.


The verb marcher also cannot be used in the reflexive form in French. S2, in turn 4, returns to her original suggestion of using se part, is about to explain why she wants to use se part, and then demonstrates her uncertainty with the verb form by asking: Est-ce que c'est part ou se part? ('Is it "leaves" without the reflexive pronoun, or is it "leaves" with the reflexive pronoun?'), that is, 'Is it part or se part?'. S1, in turn 5, responds by saying that the correct form is part, that is, the correct form is without the reflexive pronoun. Still unsure, S2, in turn 6, asks 'Part? Just part?' ('Leaves, just leaves without the reflexive pronoun?'). Note here, again, the switch to English ('Just part?'), again having the effect of highlighting and focusing attention on the verb form in question. In turn 7, S1 reassures her partner that part is the correct form and so, in turn 8, S2 uses the form correctly and completes the sentence they had started to write back at turn 1. In the posttest item shown in Figure 1, S1 and S2 accurately stated that the first sentence (Les garçons partent pour l'école) was certainly correct and that the second sentence (Les garçons se partent pour l'école) was certainly incorrect, evidence of the learning that occurred during the dialogue. In this LRE, the issue is not one of comprehension as in the previous example. Here retrieval (e.g., word repetition) and production processes play more of a role. Additionally, we see ample evidence of hypothesis generation and testing. The students are trying to find the correct and best way to express their intended meaning. What they wrote down, Yvonne part à l'école, is a more precise and sophisticated way to express what they wanted to say than what they started with. How they got there is clearly a collaborative effort, and the question 'whose performance is it anyway?' is a good question to ask.
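As an illustration of how coded episodes of this kind might be tallied, the following Python sketch counts lexis-based and form-based LREs for each pair and converts the counts to percentages. Only the two category labels come from the classification described earlier; the LRE record, the pair identifiers and the sample episodes are hypothetical and are not taken from the study data.

```python
from collections import Counter
from dataclasses import dataclass

KINDS = ("lexis", "form")  # the two LRE categories named in the text


@dataclass
class LRE:
    pair_id: str  # hypothetical identifier for a student pair
    kind: str     # "lexis" or "form"


def summarize(lres):
    """Return, for each pair, LRE counts and the percentage of each kind."""
    by_pair = {}
    for episode in lres:
        if episode.kind not in KINDS:
            raise ValueError(f"unknown LRE kind: {episode.kind!r}")
        by_pair.setdefault(episode.pair_id, Counter())[episode.kind] += 1

    summary = {}
    for pair, counts in by_pair.items():
        total = counts["lexis"] + counts["form"]
        summary[pair] = {
            "total": total,
            "lexis": counts["lexis"],
            "form": counts["form"],
            "pct_lexis": 100 * counts["lexis"] / total,
            "pct_form": 100 * counts["form"] / total,
        }
    return summary


# Invented episodes, for illustration only.
episodes = [
    LRE("pair-A", "lexis"),
    LRE("pair-A", "form"),
    LRE("pair-A", "form"),
    LRE("pair-B", "lexis"),
]
result = summarize(episodes)
```

Averaging such per-pair figures across the pairs in a class would give class-level means and standard deviations of the sort a task comparison requires.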
The purpose of having given these two examples of language-related episodes was four-fold. First, I wanted to show how rich even these very brief dialogues are in informing us of the mental strategies and processes students use in performing these tasks. Secondly, I wanted to provide them as examples of the two ways in which we classified the language-related episodes that occurred in our data: as lexis-based LREs or as form-based LREs. We used LREs as a unit of analysis because we had set up the tasks with the expectation that they would lead students to focus their attention on form and meaning differentially across tasks. Thirdly, because we used LREs as our unit of analysis in attempting to understand the strategies the students used in carrying out the tasks, it is important to see examples of LREs, so as to better understand the findings. And, fourthly, LREs were used as the basis for developing posttest items (see above). The results of our analysis are shown in Table 2. As shown in Table 2, there were no statistically significant differences between



Table 2 Language-related episodes (LREs)(a)

                                     Class J                Class D
                                 N     M     s.d.       N     M     s.d.    Sig.(b)
Count of total episodes         12    8.8    8.0       14    9.2    4.2     ns
Count of lexis-based LREs       12    4.0    3.7       14    3.7    2.3     ns
Count of form-based LREs        12    4.8    4.5       14    5.5    2.9     ns
Percent lexis-based LREs        12    41%    21%       14    40%    19%     ns
Percent form-based LREs         12    59%    21%       14    60%    19%     ns

Notes: (a) A language-related episode is any part of a dialogue where students talk about the language they are producing, question their language use, or other- or self-correct. (b) Two-tailed t-test, p < 0.05. Class J: pairs of students who did the jigsaw task. Class D: pairs of students who did the dictogloss task.

those doing the dictogloss task and those doing the jigsaw task in the number and percent of lexis-based LREs, nor in the number and percent of form-based LREs. In other words, quite different from our expectations, students responded similarly to the two tasks with respect to the attention they paid to form and meaning. The very reason for our having developed the tasks in the way that we did was, in effect, not confirmed. We could not have known that but for our examination of the students' dialogues. Examining their dialogues in this way led us to rethink the nature of our tasks. In the end, we believe that the fact that both tasks involved producing a polished written product in the target language led both sets of students to focus equally on language form (Swain and Lapkin, 2000a). An interesting feature of the data that appear in Table 2 is that the standard deviations of Class D, consisting of the pairs of students who did the dictogloss task, are in general considerably smaller than those of Class J, consisting of the pairs of students who did the jigsaw task. Levene's test for equality of variances showed that the range in the total number of LREs was smaller for Class D than Class J (p < .05), suggesting that the dictogloss task constrains student responses to a greater degree than the jigsaw task (Swain and Lapkin, 2000a).2 I return to this point below.
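The comparison of total episodes in Table 2 can be illustrated from the reported summary statistics alone. The Python sketch below computes a pooled-variance two-sample t statistic for the count of total episodes, together with the ratio of the two class variances. It assumes equal-variance pooling, which the source does not specify, and the function name pooled_t is a hypothetical helper. The variance ratio is given only as a descriptive quantity: Levene's test itself requires the pair-level counts, which are not reproduced here.

```python
import math


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Two-sample t statistic from means, standard deviations and group sizes.

    Assumes a pooled (equal-variance) estimate; returns (t, degrees of freedom).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error, df


# Count of total episodes, as reported in Table 2.
t, df = pooled_t(8.8, 8.0, 12, 9.2, 4.2, 14)  # Class J vs. Class D

# Descriptive ratio of the Class J variance to the Class D variance.
variance_ratio = (8.0 / 4.2) ** 2

print(f"t = {t:.3f}, df = {df}, variance ratio = {variance_ratio:.2f}")
```

With 24 degrees of freedom, the two-tailed critical value at the .05 level is approximately 2.064, so a statistic of this size is consistent with the 'ns' entry for total episodes, while the variance ratio reflects the much wider spread among the jigsaw pairs.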
2 Of course, it is also possible that the composition of the students constrained the range of responses as opposed to the characteristics of the tasks. However, all students were given a French cloze pretest to establish comparability of the two classes. There were no statistical differences (p > .05) between the average scores of the two classes. Also, the two classes were described by their teachers and the researcher who collected the classroom data as interchangeable.

To summarize up to this point, we have so far seen that the students' dialogues: 1) provide opportunities for knowledge construction and thus can be a source of learning; 2) can serve as the basis for developing test items; and 3) make visible some of the cognitive and strategic activities of learners as they jointly undertake a task. We used language-related episodes as a unit of analysis to understand whether the two tasks we used led to a differential focus on language form and meaning. We also conducted a separate analysis focusing on the students' use of English, their first language (L1), in performing the two tasks. Our goal in doing so was to uncover the strategic purposes for which English was used, and to discover whether the two tasks led to the differential use of English. Information about how and why the students used their L1 can provide us with insights about the tasks and the students' final product (i.e., their written stories). Based on what the students said in English, we developed a set of categories (Swain and Lapkin, 2000b). These categories were influenced by our theoretical orientation: students' use of L1 would serve strategic purposes as a cognitive tool, mediating task performance (see also Anton and DiCamilla, 1998). We discovered that students used their L1 for three principal purposes: (1) to move the task along; (2) to focus attention; and (3) for interpersonal interaction. All instances of English use were accounted for within this rubric (for details of how the coding scheme was developed, see Swain and Lapkin, 2000b). What we mean by moving the task along is that students used English to, for example, figure out what they were supposed to be doing in the task, figure out the order of events in the story (particularly in the dictogloss task), or develop an understanding of the story. Example 3 is illustrative.

Example 3
J1: Is that a foot? Yeah, ok, it's a foot.
(from Swain and Lapkin, 2000b)

In this example, as in many others, it is clear that the student is externalizing his own internalized dialogue: he asks a question and answers it himself. The presence of his partner may be the reason he spoke out loud. However, what he says is also for himself as he works


out an understanding of one of his pictures (picture 2) in the jigsaw task. Examples such as these make obvious the mediating role of dialogue in problem solving, and reflect the very process of comprehension the student is undergoing. As can be seen in Table 3, we found that students who did the dictogloss task were twice as likely to use their L1 to develop an understanding of the task as the students who did the jigsaw task: 22% of the turns in which English was used vs. 10% respectively. As they talked, students also used their L1 to focus attention. By this we mean that the students used their L1 to search for vocabulary, to focus on language form, to retrieve grammatical information, etc. An example of a search for vocabulary is shown in Example 4.
Example 4
1 J1: Et elle est tickelée. How do you say tickled?
  (And she is tickled. How do you say tickled?)
2 J2: Chatouillée.
  (Tickled.)
3 J1: Ok. Chatouillée, chatouillée. How do you say foot?
  (Ok. Tickled, tickled. How do you say foot?)
4 J2: Le pied.
  (Foot.)
5 J1: Ah, chatouillée les pieds.
  (Ah, tickled her feet.)
(from Swain and Lapkin, 2000b)

Here we see one student, J1, searching for lexical items. She uses English to identify and search for the words she needed to construct a phrase in French. In Purpura's terms, her repetition of chatouillée in turn 3 is an example of a storing or memory process, something that she doesn't need to do with the French word for 'foot', a word that is certainly known to her. (Not only do we see here the retrieval and storage of lexical items in this dialogue, but we can also see the way in which the phrase chatouillée les pieds ('tickled her feet') was constructed incorrectly on a word-by-word basis.) As shown in Table 3, the jigsaw students were twice as likely to use English to
Table 3 Mean percentage of English turns used for understanding and vocabulary search by task

                              Task
Use of L1               Jigsaw    Dictogloss
Understanding             10          22
Vocabulary search         27          14


search for vocabulary as the dictogloss students: 27% of the turns in which English was used vs. 14% respectively. (No differences were statistically significant.) These data suggest that the dictogloss and jigsaw tasks made quite different processing demands on the students, even though both required the students to produce a written story. The dictogloss task made more demands on the students' abilities to comprehend and remember than did the jigsaw task, whereas the jigsaw task made more demands on the students' productive abilities than did the dictogloss task. As I pointed out earlier, we had developed these two tasks to be as similar as possible in content, with the expectation that there would be a differential focus on form and meaning. Our analysis of the dialogues demonstrated that this was not the case. However, the analysis of the use of English in the student dialogues suggests that an important variable was the nature of the stimulus materials used: textual vs. visual. On the one hand, the dictogloss task provided the students with a French text, thus providing a set of French vocabulary and structures that were drawn on to do the task. However, the students could not proceed with the task until they had made sense of the French text they had heard. Thus student comprehension processes were taxed. The jigsaw pictures, on the other hand, although easy to understand, provided no language model. So, for the jigsaw students, processes of lexical retrieval and the creation of linguistic structures were of greater importance. Of course, to state these conclusions unequivocally, an analysis of what functions French was used for would also be necessary. Perhaps particularly intriguing in examining the use of English in the students' dialogues is our finding that task performance interacted with student proficiency.
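The proficiency comparison reported in Table 4 rests on dividing the pairs at the median story rating and then comparing the share of English turns in each half. A minimal Python sketch of such a median split follows. The ratings and percentages are invented for illustration, the function name median_split is hypothetical, and the treatment of pairs falling exactly on the median (dropped here) is an assumption, since the source does not say how such pairs were handled.

```python
from statistics import mean, median


def median_split(dyads):
    """Split dyads at the median rating and average the percent of L1 turns.

    dyads: list of (story_rating, percent_l1_turns) tuples.
    Pairs rated exactly at the median are left out of both groups (assumption).
    """
    cut = median(rating for rating, _ in dyads)
    above = [pct for rating, pct in dyads if rating > cut]
    below = [pct for rating, pct in dyads if rating < cut]
    return {"median": cut, "above": mean(above), "below": mean(below)}


# Invented (rating, percent of L1 turns) values for six pairs.
dyads = [(4.0, 10), (3.5, 20), (3.0, 15), (2.0, 40), (1.5, 45), (2.5, 35)]
split = median_split(dyads)
```

Running the same split separately for each task would allow the above-median and below-median groups to be compared across tasks.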
As shown in Table 4, for those students whose written stories were judged as above the median rating on the scale of language (an average of the scale ratings on vocabulary, morphology and syntax), the amount of English use was approximately the same across tasks: 15% and 18%. However, for those whose written stories were judged as below the median rating on the language scale, considerably more English was used by the jigsaw students than the dictogloss students: 41% vs. 25%. The same pattern is shown for the content scale. The dictogloss task, in effect, evened the playing field for students with respect to proficiency. We think that the dictogloss evened the playing field because of the nature of the stimulus (input) material: a target language text. Additionally, the performance of the jigsaw students relative to the dictogloss students was all over the map. Although no statistically significant differences were found between the average ratings of the


Table 4 Percentage of first language turns by student dyads who are above or below median language and content ratings on their written stories
Jigsaw
s.d. Percent of L1 turns Number of pairs Mean rating s.d.

Story rating

Dictogloss
Percent of L1 turns

Number of pairs

Mean rating

Languagea Above median Below median .55 .79 15 41 6 6 3.2 1.9

5 5

4.0 2.6

.42 .66

18 25

Content Above median Below median .71 .71 15 41 7 5

5 5

4.0 2.0

3.0 1.6

.57 .55

20 22


Note: a Average of ratings for vocabulary, morphology and syntax.

dictogloss and jigsaw students on the content, organization, vocabulary and syntax scales, the standard deviations for each of these measures were much greater for the jigsaw students than for the dictogloss students, significantly so in the case of the vocabulary ratings. As I stated earlier, we also found that there was more variation in the number of language-related episodes produced by the jigsaw pairs than the dictogloss pairs. It would appear, then, that the two tasks we used have the potential for creating more or less variation in the learners' performance.

V Summary

To sum up and pull together the various threads of this article, a few preliminary comments are first in order. In this article, I have discussed the student dialogues in terms of the information we gleaned concerning (1) the cognitive and strategic processes the tasks invoked and (2) the fact that, as the locus of language learning, they provided a source of information as to possible targets for the measurement of task outcomes. I have not discussed explicitly the issue of dialogue as the focus of measurement itself. Many scales for rating oral proficiency have been developed; others, like Skehan (1998), have applied measures of fluency, accuracy and complexity; still others, like Young (2000), have examined aspects of interactional competence. What I have discussed in this article could also be applied directly to the dialogues that occur in oral group testing as a measure of the strategic and cognitive uses of an L2, aspects of language performance that are surely crucial in problem-solving tasks, and tasks that attempt to simulate academic linguistic performance. That said, I now summarize and contextualize the points I have made. First, I have suggested that one point of contact between the fields of L2 learning and testing is what happens in small groups.
As language testers, it is important to be able to measure accurately the performance of test-takers interacting in a small group setting. What I have suggested is that, in a group, the performance is jointly constructed and distributed across the participants. We have seen that dialogues construct cognitive and strategic processes which in turn construct student performance. One implication for testing is, minimally, that serious thought needs to be given to the most adequate and fair means of scoring the linguistic activity and its product that derives from group interaction. It also means that in a testing situation, who one is paired or grouped with, is not unimportant. Secondly, our research has shown that the dialogues of student dyads can be a source of learning, of the joint construction of knowledge. Language testers, too, are interested in language learning. One


aspect of measuring group performance might be to measure the knowledge that has been jointly constructed. What seems fairly certain is that an examination of the content of the dialogues of test-takers could provide test-developers with targets for measurement, should thinking about the outcomes of group performance in this way be productively pursued. I imagine that thinking further along these lines might most productively be pursued within group performances where the task is tightly specified, and an attempt to replicate academic and problem-solving contexts is at issue. Thirdly, there are many situations in which both L2 learning and testing researchers might find it useful to understand the cognitive and strategic processes underlying performance. One example from the L2 learning literature is Wesche and Paribakht's (2000) study. Here, they report on their study of the acquisition of word knowledge by students as the students carried out different types of text-based vocabulary exercises, any of which could serve as test items. Each exercise was expected to promote learning of different aspects of word knowledge. Wesche and Paribakht used think-aloud and retrospective reports to uncover the learners' cognitive processes. Their conclusion is that learners tended to work from the principle of minimal effort . . . they did not necessarily follow all the instructions provided or engage themselves in the mental processes envisaged (2000: 207). This conclusion is important for understanding lexical acquisition and is also important for language testers to consider. The use of qualitative methods for seeking validation evidence has increased in recent years. As Banerjee and Luoma (1997: 276) point out, Qualitative validation techniques can provide information on the content of the test, the properties of the test tasks, and the processes involved in test taking and assessing.
These techniques include expert judgement, introspective (including think-aloud protocols) and retrospective accounts of test-takers and test-raters, interviews, and text and discourse analysis of performance. Each of these has a rich theoretical and research literature supporting its use. The recording and examination of the dialogue of individuals jointly doing a task provides test-developers and test-researchers with additional insights to aid in the interpretation of test scores and to make recommendations about appropriate uses. In the examples I have given, I have shown that, by examining the students' dialogues, assumptions we made about how task performance would be achieved were shown to be wrong; that different tasks differentially engaged comprehension and production processes; and that one task constrained the use of certain cognitive and strategic processes in ways


in which the other task did not. These are important things to know about tasks, whether the tasks are to be used for research purposes, pedagogical purposes or testing purposes. Green (1998: 49), writing about verbal protocol analysis, states that:
Under some circumstances, reports generated by two individuals working on a task can be useful and can serve to make explicit information that might not be apparent within a protocol generated by an individual working alone on a task . . . The difficulty with paired reports is that the presence of another individual changes the way in which the task would be approached by an individual working alone on that task. Two individuals working together on a task interact, and each modifies the behaviour of the other. The manner in which the task is solved by a pair may differ enormously from the way in which either individual might solve the task alone.

In response to this, I have three points to make. First, Green's claims are, in fact, empirical questions, and they should be investigated. Secondly, investigating these questions in the context of small group vs. individual oral testing would be a useful validation endeavour. Thirdly, I do not think that these investigations should stop with oral testing: they can be studied in the context of solo or joint writing, as we have done; or in the context of solo or joint understanding of the meaning of a written or spoken text. To consider these joint activities as tasks for tests would fit into current ideas about integrating language skills in test-tasks (e.g., Chapelle, 1998). They would also more faithfully mirror regular, daily classroom and non-classroom activity. Furthermore, because students have reported less anxiety in group situations and because, in other disciplines at least (e.g., in mathematics), group performance has been superior to individual performance (Webb, 1993), group testing just might be, under the right circumstances, a means of 'biasing for best' (Swain, 1983). What seems certain is that research in L2 learning and research in L2 testing have much in common (Bachman and Cohen, 1998). Both sets of researchers need to understand and measure language performance in groups, and both need to know how to interpret that performance, which includes understanding the processes and strategies underlying that performance. In the final analysis, both sets of researchers need to know that whoever is doing the task is engaging in construct-relevant processes while doing so. This is why we must pay serious attention to each other's theories and research.

Editors' comments

This article was the Samuel Messick Memorial Lecture and opening plenary address at the 22nd Annual Language Testing Research


Colloquium held in Vancouver, British Columbia in March, 2000. Although invited for publication in Language Testing, the paper was also peer reviewed in accordance with normal practice. The editors believe that the SLA perspective it offers represents an important contribution to thinking in language testing. The theme of the conference was 'Interdisciplinary interfaces with language testing'.

Acknowledgements

The author would like to thank the following people for their valuable feedback on earlier drafts of this paper: Lindsay Brooks, Andrew Cohen, Alister Cumming, Jean Handscombe, Jim Lantolf, Sharon Lapkin, Tim McNamara and Helen Moore.

VI References
Anton, M. and DiCamilla, F. 1998: Socio-cognitive functions of L1 collaborative interaction in the L2 classroom. The Canadian Modern Language Review 54, 314–42.
Bachman, L.F. and Cohen, A.D., editors, 1998: Interfaces between second language acquisition and language testing research. Cambridge: Cambridge University Press.
Banerjee, J. and Luoma, S. 1997: Qualitative approaches to test validation. In Clapham, C. and Corson, D., editors, Encyclopedia of language and education, Volume 7: Language testing and assessment. Dordrecht: Kluwer Academic, 275–87.
Berry, V. 2000: An investigation into how individual differences in personality affect the complexity of language test tasks. Unpublished PhD thesis, King's College, University of London.
Chapelle, C. 1998: Construct definition and validity inquiry in SLA research. In Bachman, L.F. and Cohen, A.D., editors, Interfaces between second language acquisition and language testing research. Cambridge: Cambridge University Press, 32–70.
Cole, M. 1996: Cultural psychology. Cambridge, MA: Belknap Press of Harvard University Press.
Fulcher, G. 1996: Testing tasks: issues in task design and the group oral. Language Testing 13, 23–51.
Gass, S. 1997: Input, interaction and the second language learner. Mahwah, NJ: Lawrence Erlbaum.
Green, A. 1998: Verbal protocol analysis in language testing research: a handbook. Cambridge: Cambridge University Press.
Halliday, M.A.K. and Matthiessen, C.M.I.M. 1999: Construing experience through meaning: a language-based approach to cognition. London: Cassell.

Hilsdon, J. 1991: The group oral exam: advantages and limitations. In Alderson, J.C. and North, B., editors, Language testing in the 1990s. London: Modern English Publications and the British Council, 189–97.
Kowal, M. and Swain, M. 1994: Using collaborative language production tasks to promote students' language awareness. Language Awareness 3, 73–93.
Kowal, M. and Swain, M. 1997: From semantic to syntactic processing: how can we promote metalinguistic awareness in the French immersion classroom? In Johnson, R.K. and Swain, M., editors, Immersion education: international perspectives. Cambridge: Cambridge University Press, 284–309.
Kramsch, C. 1995: The applied linguist and the foreign language teacher: can they talk to each other? Australian Review of Applied Linguistics 18, 1–16.
Lantolf, J.P., editor, 2000a: Sociocultural theory and second language learning. Oxford: Oxford University Press.
Lantolf, J.P. 2000b: Second language learning as a mediated process. Language Teaching 33, 79–96.
Lantolf, J.P. and Appel, G., editors, 1994: Vygotskian approaches to second language research. Norwood, NJ: Ablex, 1–32.
Lazaraton, A. 1992: The structural organization of a language interview: a conversation analytic perspective. System 20, 373–86.
Lumley, T. and Brown, A. 1996: Specific-purpose language performance tests: task and interaction. In Wigglesworth, G. and Elder, C., editors, The language testing cycle: from inception to washback. Australian Review of Applied Linguistics, Series S, Number 13, 105–36.
McNamara, T. 1997: 'Interaction' in second language performance assessment: whose performance? Applied Linguistics 18, 446–66.
Pica, T. 1994: Research on negotiation: what does it reveal about second-language learning conditions, processes and outcomes? Language Learning 44, 493–527.
Pica, T., Kanagy, R. and Falodun, J. 1994: Choosing and using communication tasks for second language instruction. In Crookes, G. and Gass, S., editors, Tasks and language learning: integrating theory and practice. Clevedon, Avon: Multilingual Matters, 9–34.
Purpura, J.E. 1998: The development and construct validation of an instrument designed to investigate selected cognitive background characteristics of test-takers. In Kunnan, A.J., editor, Validation in language assessment. Mahwah, NJ: Lawrence Erlbaum, 111–39.
Shohamy, E., Donitsa-Schmidt, S. and Waizer, R. 1993: The effect of the elicitation mode on the language samples obtained on oral tests. Paper presented at the Language Testing Research Colloquium, Cambridge, UK. Available from the authors.
Shohamy, E., Reves, T. and Bejarano, Y. 1986: Introducing a new comprehensive test of oral proficiency. English Language Teaching Journal 40, 212–20.
Skehan, P. 1998: A cognitive approach to language learning. Oxford: Oxford University Press.

Swain, M. 1983: Large-scale communicative language testing: a case study. Language learning and communication 2, 133–47. Reprinted in Savignon, S. and Berns, M., editors, Initiatives in communicative language teaching. Reading, MA: Addison Wesley, 185–201.
Swain, M. 1995: Three functions of output in second language learning. In Cook, G. and Seidlhofer, B., editors, Principle and practice in applied linguistics: studies in honour of H.G. Widdowson. Oxford: Oxford University Press, 125–44.
Swain, M. 2000: The output hypothesis and beyond: mediating acquisition through collaborative dialogue. In Lantolf, J.P., editor, Sociocultural theory and second language learning. Oxford: Oxford University Press, 97–114.
Swain, M. and Lapkin, S. 1998: Interaction and second language learning: two adolescent French immersion students working together. Modern Language Journal 82, 320–37.
Swain, M. and Lapkin, S. 2000a: Focus on form through collaborative dialogue: exploring task effects. In Bygate, M., Skehan, P. and Swain, M., editors, Researching pedagogic tasks: second language learning, teaching, and testing. London: Longman, 99–118.
Swain, M. and Lapkin, S. 2000b: Task-based second language learning: the uses of the first language. Language Teaching Research 4, 253–76.
Swain, M. and Lapkin, S. 2001: What learners notice in their reformulated writing, what they learn from it, and their insights into the process. Paper presented at the AAAL Annual Conference, St. Louis, MO. Available from the authors.
UCLES (University of Cambridge Local Examinations Syndicate) 1996: First certificate in English handbook. Cambridge: UCLES.
van Lier, L. 1989: Reeling, writhing, drawling, stretching, and fainting in coils: oral proficiency interviews as conversation. TESOL Quarterly 23, 489–508.
van Lier, L. 2000: From input to affordance: social-interactive learning from an ecological perspective. In Lantolf, J.P., editor, Sociocultural theory and second language learning. Oxford: Oxford University Press, 245–59.
Vygotsky, L.S. 1978: Mind in society: the development of higher psychological processes. Cambridge, MA: Harvard University Press.
Vygotsky, L.S. 1987: Thought and speech. In Rieber, R.W. and Carton, A.S., editors, The collected works of L.S. Vygotsky: Volume 1. New York: Plenum, 243–85.
Webb, N.W. 1993: Collaborative group versus individual assessment in mathematics: processes and outcomes. Educational Assessment 1, 131–52.
Wells, G. 1999: Dialogic inquiry: towards a sociocultural practice and theory of education. Cambridge: Cambridge University Press.
Wertsch, J.V. 1980: The significance of dialogue in Vygotsky's account of social, egocentric, and inner speech. Contemporary Educational Psychology 5, 150–62.
Wertsch, J.V. 1985: Vygotsky and the social formation of mind. Cambridge, MA: Harvard University Press.

Wertsch, J.V. 1991: Voices of the mind: a sociocultural approach to mediated action. Cambridge, MA: Harvard University Press.
Wesche, M. and Paribakht, S. 2000: Reading-based exercises in second language vocabulary learning: an introspective study. Modern Language Journal 84, 196–213.
Young, R. 2000: Interactional competence: challenges for validity. Paper presented at a joint symposium of the Language Research Colloquium and the American Association for Applied Linguistics, Vancouver, British Columbia.
Young, R. and He, A. 1998: Talking and testing: discourse approaches to the assessment of oral proficiency. Amsterdam, PA: John Benjamins.
