
Presupposition

In the branch of linguistics known as pragmatics, a presupposition (or ps) is an implicit assumption about the world or background belief relating to an utterance whose truth is taken for granted in discourse. Examples of presuppositions include:

Do you want to do it again?
Presupposition: you have done it already, at least once.

Jane no longer writes fiction.
Presupposition: Jane once wrote fiction.

A presupposition must be mutually known or assumed by the speaker and addressee for the utterance to be considered appropriate in context. It will generally remain a necessary assumption whether the utterance takes the form of an assertion, denial, or question, and it can be associated with a specific lexical item or grammatical feature (a presupposition trigger) in the utterance.

Crucially, negation of an expression does not change its presuppositions: I want to do it again and I don't want to do it again both presuppose that the subject has done it already one or more times; My wife is pregnant and My wife is not pregnant both presuppose that the subject has a wife. In this respect, presupposition is distinguished from entailment and implicature. For example, The president was assassinated entails The president is dead, but if the expression is negated, the entailment does not necessarily survive.

If the presuppositions of a sentence are inconsistent with the actual state of affairs, one of two approaches can be taken. Given the sentences My wife is pregnant and My wife is not pregnant when one has no wife, then either:

Both the sentence and its negation are false; or
(Strawson's approach) both "my wife is pregnant" and "my wife is not pregnant" rest on a false presupposition (i.e. that there exists a referent which can be described with the noun phrase my wife) and therefore cannot be assigned truth values.

Bertrand Russell tried to resolve this dilemma by giving the negated sentence two interpretations:

"There exists exactly one person who is my wife and who is not pregnant."
"There does not exist exactly one person who is my wife and who is pregnant."
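Russell's two interpretations can be written out in the standard first-order notation for definite descriptions (a conventional formalization, not part of the original text), with W(x) for "x is my wife" and P(x) for "x is pregnant":

```latex
% Reading 1: negation inside the description
\exists x \, \big( W(x) \wedge \forall y \, (W(y) \rightarrow y = x) \wedge \neg P(x) \big)

% Reading 2: negation taking wide scope over the description
\neg \exists x \, \big( W(x) \wedge \forall y \, (W(y) \rightarrow y = x) \wedge P(x) \big)
```

On the wide-scope reading the sentence comes out true precisely because no unique wife exists, which is how Russell avoids truth-value gaps.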
For the first reading, Russell would claim that it is false, whereas the second would be true according to him.

Projection of presuppositions

A presupposition of a part of an utterance is sometimes also a presupposition of the whole utterance, and sometimes not. We have seen that the phrase my wife triggers the presupposition that I have a wife. The first sentence below carries that presupposition even though the phrase occurs inside an embedded clause; in the second sentence, it does not. John might be mistaken about his belief that I have a wife, or he might be deliberately trying to misinform his audience, and this affects the meaning of the second sentence but, perhaps surprisingly, not of the first one.

John thinks that my wife is beautiful.
John said that my wife is beautiful.

This behaviour thus seems to be a property of the main verbs of the sentences, think and say, respectively. Following work by Lauri Karttunen,[1] verbs that allow presuppositions to "pass up" to the whole sentence (to "project") are called holes, and verbs that block such projection are called plugs.

Some linguistic environments are intermediate between plugs and holes: they block some presuppositions and allow others to project. These are called filters. An example of such an environment is the indicative conditional ("if-then" clause). A conditional sentence contains an antecedent and a consequent: the antecedent is the part preceded by the word "if," and the consequent is the part that is (or could be) preceded by "then." If the consequent contains a presupposition trigger, and the triggered presupposition is explicitly stated in the antecedent of the conditional, then the presupposition is blocked; otherwise, it is allowed to project up to the entire conditional. Here is an example:

If I have a wife, then my wife is blonde.
Here, the presupposition triggered by the expression my wife (that I have a wife) is blocked because it is stated in the antecedent of the conditional: the sentence does not imply that I have a wife. In the following example, the presupposition is not stated in the antecedent, so it is allowed to project, i.e. the sentence does imply that I have a wife.

If it's already 4am, then my wife is probably angry.

Hence, conditional sentences act as filters for presuppositions that are triggered by expressions in their consequent. A significant amount of current work in semantics and pragmatics is devoted to a proper understanding of when and how presuppositions project.

Presupposition triggers

A presupposition trigger is a lexical item or linguistic construction which is responsible for the presupposition.[2] The following is a selection of presuppositional triggers following Stephen C. Levinson's classic textbook on pragmatics, which in turn draws on a list produced by Lauri Karttunen. As is customary, the presuppositional triggers themselves are italicized, and the symbol >> stands for "presupposes."[3]

Definite descriptions

Main article: Definite description

Definite descriptions are phrases of the form "the X" where X is a noun phrase. The description is said to be proper when the phrase applies to exactly one object, and conversely, it is said to be improper when there exists either more than one potential referent, as in "the senator from Ohio," or none at all, as in "the king of France." In conventional speech, definite descriptions are implicitly assumed to be proper; hence such phrases trigger the presupposition that the referent is unique and exists.

John saw the man with two heads. >> There exists a man with two heads.

Factive verbs

In Western epistemology, there is a tradition originating with Plato of defining knowledge as justified true belief. On this definition, for someone to know X, it is required that X be true.
A linguistic question thus arises regarding the usage of such phrases: does a person who states "John knows X" implicitly claim the truth of X? Steven Pinker discusses the usage of the phrase "having learned" as an example of a factive verb in George W. Bush's statement that "British Intelligence has learned that Saddam Hussein recently sought significant quantities of uranium from Africa." The factivity thesis, the proposition that relational predicates having to do with knowledge, such as knows, learns, remembers, and realizes, presuppose the factual truth of their object, was however subject to notable criticism by Allan Hazlett.

Martha regrets drinking John's home brew. >> Martha drank John's home brew.
Frankenstein was aware that Dracula was there. >> Dracula was there.
John realized that he was in debt. >> John was in debt.
It was odd how proud he was. >> He was proud.

Some further factive predicates: know; be sorry that; be proud that; be indifferent that; be glad that; be sad that.

Implicative verbs

John managed to open the door. >> John tried to open the door.
John forgot to lock the door. >> John ought to have locked, or intended to lock, the door.

Some further implicative predicates: X happened to V >> X didn't plan or intend to V; X avoided Ving >> X was expected to, or usually did, or ought to V; etc.

Change of state verbs

John stopped beating his wife. >> John had been beating his wife.
Joan began beating her husband. >> Joan hadn't been beating her husband.
Kissinger continued to rule the world. >> Kissinger had been ruling the world.

Some further change of state verbs: start; finish; carry on; cease; take (as in X took Y from Z >> Y was at/in/with Z); leave; enter; come; go; arrive; etc.

Iteratives

The flying saucer came again. >> The flying saucer came before.
You can't get gobstoppers anymore. >> You once could get gobstoppers.
Carter returned to power. >> Carter held power before.

Further iteratives: another time; to come back; restore; repeat; for the nth time.

Temporal clauses

Before Strawson was even born, Frege noticed presuppositions. >> Strawson was born.
While Chomsky was revolutionizing linguistics, the rest of social science was asleep. >> Chomsky was revolutionizing linguistics.
Since Churchill died, we've lacked a leader. >> Churchill died.

Further temporal clause constructors: after; during; whenever; as (as in As John was getting up, he slipped).

Cleft sentences

Cleft construction: It was Henry that kissed Rosie. >> Someone kissed Rosie.
Pseudo-cleft construction: What John lost was his wallet. >> John lost something.

Comparisons and contrasts

Comparisons and contrasts may be marked by stress (or by other prosodic means), by particles like too, or by comparative constructions.

Marianne called Adolph a male chauvinist, and then HE insulted HER. >> For Marianne to call Adolph a male chauvinist would be to insult him.
Carol is a better linguist than Barbara. >> Barbara is a linguist.

Counterfactual conditionals

If the notice had only said mine-field in English as well as Welsh, we would never have lost poor Llewellyn. >> The notice didn't say mine-field in English.

Questions

Is there a professor of Linguistics at MIT?
>> Either there is a professor of Linguistics at MIT or there isn't.
Who is the professor of Linguistics at MIT? >> Someone is the professor of Linguistics at MIT.

Possessive case

John's children are very noisy. >> John has children.

Accommodation of presuppositions

A presupposition of a sentence must normally be part of the common ground of the utterance context (the shared knowledge of the interlocutors) in order for the sentence to be felicitous. Sometimes, however, sentences carry presuppositions that are not part of the common ground and are nevertheless felicitous. For example, upon being introduced to someone, I can explain out of the blue that my wife is a dentist, without my addressee ever having heard, or having any reason to believe, that I have a wife. In order to interpret my utterance, the addressee must assume that I have a wife. This process of an addressee assuming that a presupposition is true, even in the absence of explicit information that it is, is usually called presupposition accommodation. We have just seen that presupposition triggers like my wife (definite descriptions) allow for such accommodation.

In "Presupposition and Anaphora: Remarks on the Formulation of the Projection Problem,"[7] the philosopher Saul Kripke noted that some presupposition triggers do not seem to permit such accommodation. An example is the presupposition trigger too. This word triggers the presupposition that, roughly, something parallel to what is stated has happened. For example, if pronounced with emphasis on John, the following sentence triggers the presupposition that somebody other than John had dinner in New York last night.

John had dinner in New York last night, too.

But that presupposition, as stated, is completely trivial given what we know about New York: several million people had dinner in New York last night, and that in itself doesn't satisfy the presupposition of the sentence.
What is needed for the sentence to be felicitous is really that somebody relevant to the interlocutors had dinner in New York last night, and that this has been mentioned in the previous discourse, or that this information can be recovered from it. Presupposition triggers that disallow accommodation are called anaphoric presupposition triggers.

Presupposition in critical discourse analysis

Critical discourse analysis (CDA) seeks to identify presuppositions of an ideological nature. CDA is critical not only in the sense of being analytical but also in the ideological sense.[8] Van Dijk (2003) says CDA "primarily studies the way social power abuse, dominance, and inequality" operate in "text and talk" (including written text).[8] Van Dijk describes CDA as written from a particular point of view:[8] "dissident research" aimed to "expose" and "resist social inequality."[8] One notable feature of ideological presuppositions researched in CDA is a concept termed synthetic personalisation.

This paper explores the implications of the principles and parameters theory of Universal Grammar for language teaching. Learning the core aspects of a second language means re-setting values for parameters according to the evidence the learner receives, perhaps starting from the L1 setting. Implications for the classroom can only be drawn for core areas of grammatical competence. Classroom acquisition depends crucially on the provision of appropriate syntactic evidence to trigger parameter-setting; certain aspects of vocabulary are also crucial. Variability, interaction, active production or comprehension, consciousness-raising and hypothesis-testing are irrelevant. Existing textbooks already supply appropriate evidence for parameter-setting; the grammatical component of syllabuses may be improved by use of principles and parameters, even if this reveals what does not need to be taught, as may the teacher's awareness of language.

UNIVERSAL GRAMMAR AND LANGUAGE ACQUISITION

Given the widely differing interpretations of Universal Grammar, it is necessary to specify that the present article considers Universal Grammar (UG) within the current Chomskyan model, described for instance in Chomsky (1988) and Cook (1988); this is different not only from the type of Universal Grammar studied by those working in the implicational universals tradition, such as Hawkins (1983, 1987), but also from much of the L2 discussion of Universal Grammar, which has looked at earlier models of language acquisition with different emphases or has looked at syntactic issues that are not directly relevant to this model.

Current UG theory describes the speaker's knowledge of language in terms of principles and parameters, as captured in the Government/Binding theory of syntax (Chomsky, 1981, 1988; Cook, 1988), not in terms of rules; hence it is sometimes called the principles and parameters model. To take the English example sentence Max played the drums with Charlie Parker: principles of phrase structure require every phrase in it to have a head of a related syntactic category and permit it to have complements of various types. A Verb Phrase such as played the drums must have a head that is a verb, play, and may have a complement, the drums; a Prepositional Phrase such as with Charlie Parker must have a head that is a preposition, with, and a complement, Charlie Parker; Noun Phrases such as Max, the drums, and Charlie Parker must have noun heads and may, but in this case do not, have complements. This is not true only of English; the phrases of all languages consist of heads and possible complements: Japanese, Catalan, Gboudi, and so on.
The difference between the phrase structures of different languages lies in the order in which head and complement occur within the phrase. In English the head verb comes before the complement, the head preposition comes before its complement, the adjective comes before its complement (easy to play), and the noun before its complement (belief that he can play well); Japanese is the opposite in that the head verb comes after the complement in the Verb Phrase, and the preposition comes after its complement (and so is known as a postposition), as in E wa kabe ni kakatte imasu (picture wall on is hanging). This variation between languages is captured by the head parameter, which has two settings, head-first and head-last, according to whether the head comes before or after the complement in the phrases of the language. So, while all languages have the same principles of phrase structure, they differ in their setting for the head parameter. Principles do not vary from one language to another, because they are built into the human mind; no human language breaches them. Parameters confine the variation between languages within circumscribed limits.

Complementary to these phrase structure principles is the Projection Principle, which claims that syntax and the lexicon are closely tied together. As well as knowledge of where the complement goes in the phrase, we need to know whether a complement is actually allowed, and this depends upon the lexical item that is used. Hence the Projection Principle states that the English verb play must be specified as taking a complement (i.e. it is normally transitive); the lexical entry for the verb faint must specify that it has no complement (i.e. it is intransitive), while that for the verb give must specify that it has two complements (i.e. direct and indirect objects).
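The interaction of the head parameter and the Projection Principle can be caricatured in code. The following is a purely illustrative toy sketch, not anything proposed in the article or in Government/Binding theory; all function and variable names are invented, and multi-word complements are crudely treated as single tokens.

```python
# Toy sketch: a Verb Phrase is "grammatical" here only if the head verb sits in
# the position dictated by the head parameter AND the number of complements
# matches the verb's lexical entry (a crude stand-in for the Projection Principle).

LEXICON = {
    "play": 1,   # normally transitive: one complement (direct object)
    "faint": 0,  # intransitive: no complement
    "give": 2,   # two complements (direct and indirect objects)
}

def vp_ok(words, head_parameter="head-first", lexicon=LEXICON):
    """Check a token list such as ["play", "the-drums"] against the parameter
    setting and the lexicon. Everything except the head counts as a complement."""
    verbs = [w for w in words if w in lexicon]
    if len(verbs) != 1:
        return False
    verb = verbs[0]
    position = words.index(verb)
    complements = len(words) - 1
    if head_parameter == "head-first" and position != 0:
        return False
    if head_parameter == "head-last" and position != len(words) - 1:
        return False
    return complements == lexicon[verb]

# "played the drums": head-first position, one complement -- accepted
assert vp_ok(["play", "the-drums"])
# "the drums played": head in the wrong position for English -- rejected
assert not vp_ok(["the-drums", "play"])
# "played" alone: play demands a complement -- rejected
assert not vp_ok(["play"])
# the same order is fine under a head-last (Japanese-style) setting
assert vp_ok(["the-drums", "play"], head_parameter="head-last")
```

The point of the sketch is only that grammaticality falls out of one language-wide setting plus per-word lexical entries, with no construction-specific rules.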
The question of whether the phrase structure of a sentence is grammatical is a matter not just of whether it conforms to the overall possible structures of the language but also of whether it conforms to the particular structures associated with the lexical items in it. Max played the drums is grammatical because the verb occurs in the correct head-first position, compared to Max the drums played, and because the verb play has an Object Noun Phrase following it, compared to Max played.

The Universal Grammar theory claims that the speaker's knowledge of a language such as English consists of several such general principles and of the appropriate parameter settings for that language. Some principles lay down the relationship between items that have been moved in the sentence, as in questions and passives (the Subjacency Principle); others concern the ways in which words such as himself may or may not corefer with the same entity as other words in the sentence (the Binding Principles). This model is not centrally concerned with conventional rules; it does not deal with the passive, or relative clauses, or any particular construction as such. Instead rules are seen as the interaction of various principles and settings for parameters; the English passive reflects the combined effects of principles of syntactic movement, of phrase structure, and of case, each of which also applies to other areas of the grammar.

The model of acquisition is essentially straightforward. As the principles of UG are built into the mind, they do not have to be learnt; the learner automatically applies them to whatever language he or she encounters. It does not matter whether the learner is faced with Japanese or English; the same principles of phrase structure apply.
The settings for parameters are not constant but vary from one language to another; the crucial aspects of a language for the learner to master are the appropriate settings for the parameters. Since the learner already knows the principles, as they are part of his or her mind, all that is needed is sufficient evidence to set the values for the parameters. Given that the learner knows the phrase structure principles, all that has to be learnt is whether the setting for the head parameter is head-first or head-last. For this the learner needs linguistic evidence in the form of actual sentences spoken by the people around him or her: hearing Mukashi mukashi ojiisan to obaasan ga koya ni sunde imashita (once upon a time an old man and old woman cottage in lived) the child learns that the setting is head-last in Japanese; hearing John ate an apple the child learns that it is head-first in English. The learner needs to hear relevant evidence for setting the parameters of the grammar. A simplified picture of acquisition in the UG model is then as shown in Fig. 1.

[Figure not available in this copy] Fig. 1. Acquisition in the UG model.

Alongside this there is massive learning of vocabulary in a particular form. Due to the Projection Principle, the acquisition of vocabulary means not just learning the meanings and pronunciations of words but also learning what structures the words can be used in; thus the crucial point to learn about the verb play is that it is used with a following object: play something. Since it relies on triggering from evidence, UG does not have a learning theory as such; nothing more than this framework is needed to describe acquisition: no learning strategies, motivations, cognitive or social schemas, or whatever. The concerns that linguists have within this model relate chiefly to the nature of the evidence that the learner needs to encounter, and to the starting position for parameters in the learner's mind.
Many arguments have suggested that the learner must be able to learn solely from positive evidence, that is to say naturally occurring sentences, rather than from negative evidence such as correction or sentences people do not say. The interpretation of acquisition in which the child creates hypotheses that are modified in the light of feedback is no longer accepted, since such appropriate feedback has never been found. In addition the evidence available has to meet the requirements of occurrence (does it actually happen?) and uniformity (is it available to all children?). Since virtually all normal children learn their first language, the crucial evidence must be freely available to all children rather than to a select few; Kaluli children, for example, are not treated as conversational partners for the first few years of life (Schieffelin, 1985) yet acquire language; a theory of language acquisition cannot therefore rely on particularly beneficial conversation with adults. The input to the child is vital for triggering parameter-setting, since nothing would happen without it; nevertheless a bare sentence or two may suffice to demonstrate how a parameter should be set. A single sentence such as Max played the drums with Charlie Parker may be enough to set the head parameter for English and thus impart a knowledge of how to construct Verb Phrases, Noun Phrases, Adjective Phrases, and Prepositional Phrases; for a fuller discussion see Cook (1989a).

The second issue is the initial setting for parameters. The setting for a parameter might be neutral, and so equally settable for any language, or there might be a preferred position (the unmarked setting), which has to be reset to a different position for certain languages (the marked setting) but not for others.
Hyams (1986) took the example of the pro-drop parameter: whether a language permits subjectless declarative sentences, like Chinese and Spanish (pro-drop languages), or does not permit them, like English and French (non-pro-drop languages). She argued that children start from the pro-drop setting, in that their early sentences in all languages omit subjects; consequently pro-drop is the unmarked setting and non-pro-drop the marked; children learning English or French have to set the pro-drop parameter away from its first setting, while those learning Chinese or Spanish can retain the original setting. (See Cook, 1989b for a different interpretation of pro-drop.)

It is not a tenet of the theory that the whole of Universal Grammar is necessarily present from the start, interesting as this question may be in its own right. Instead the theory is neutral between no-growth models, which maintain that all principles and parameters are equally present at all times, subject to other constraints on the child's use of language, and growth models, which hold that principles unfold in a developmental sequence. The fact that something is dictated by our genes does not mean it is necessarily present from the start, as the eyes are; instead it may reveal itself over time, as milk teeth yield to permanent teeth and finally to wisdom teeth.
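Hyams's proposal can be caricatured as a toy sketch in code (purely illustrative, not part of the article; all names are invented): a learner starts from the unmarked pro-drop setting and is driven off it only by positive evidence, here simplified to the presence of an expletive subject such as there or it.

```python
# Toy sketch of parameter-setting as triggering from positive evidence.
# Principles themselves are fixed and so are not represented as learnable;
# only the parameter value changes, and only in response to a trigger.

UNMARKED = {"pro_drop": True}  # Hyams: pro-drop as the unmarked initial setting

class Learner:
    def __init__(self):
        # Every learner begins with the unmarked settings.
        self.settings = dict(UNMARKED)

    def hear(self, sentence_features):
        # Simplification: an expletive subject ("there is...", "it is...")
        # is a by-product of non-pro-drop grammars, so encountering one is
        # positive evidence that resets the parameter to the marked value.
        if sentence_features.get("expletive_subject"):
            self.settings["pro_drop"] = False

learner = Learner()
learner.hear({"expletive_subject": False})  # e.g. a plain SV sentence: no change
assert learner.settings["pro_drop"] is True
learner.hear({"expletive_subject": True})   # e.g. "There is a book": trigger
assert learner.settings["pro_drop"] is False
```

Note that nothing in the sketch depends on correction, practice, or interaction: hearing the triggering sentence is enough, which is exactly the claim discussed below for classroom learning.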

Second language researchers have had similar concerns, magnified by the problems peculiar to second language learning. The question of evidence is more open since, unlike L1 children, many L2 learners receive copious correction of their errors and grammatical explanation; it is dubious, however, whether they receive correction of the appropriate errors, or grammatical explanations of the right type, to learn the relevant kinds of syntactic knowledge, partly because teachers are unacquainted with the pro-drop and head parameters (see Cook, 1988 for further discussion). The question of parameter-setting in L2 learning is interesting because there is already one setting for the parameters present in the learner's mind; the question is how much influence this exerts on L2 learning. Does a Japanese learner approach English with a head-last setting for the head parameter, or is he or she neutral between settings? This reintroduces the issue of transfer into L2 learning research in a new form: does the L2 learner transfer the L1 setting to the new language or start from scratch? Research by White (1986) on the pro-drop parameter suggests that the first language setting is carried over to the second; that is to say, Spanish learners of English assume it is pro-drop, French learners that it is non-pro-drop, rather than both groups starting from the same position. Again an overall question arises, namely the relationship between UG and L2 learning; this can be put as a choice between a direct access model, which suggests that UG is still available for L2 acquisition, an indirect access model, which claims it is only available via the mediation of the L1, and a no access model, in which UG is no longer available for L2 learning (Cook, 1988).
The arguments against no access are, briefly, the difference of the language system from other cognitive systems, so that language knowledge would be acquired only with difficulty via alternative routes, and the absence in L2 learners, so far as researchers can tell, of grammars that breach principles of UG. Interlanguages seem to stay within the limits of possible human languages rather than to go against the UG principles. The no access model logically leads to treating language like any other area of learning, and so in school terms to dealing with it in the same way as, say, geography and gymnastics. The picture of L2 learning can be diagrammed as in Fig. 2. The input of language sentences and the output of language knowledge are the same in the L1 and L2 models; the intervening parameter-setting differs according to whether one adopts the direct access model or the indirect access model, which mediates L2 via the L1 setting for parameters.

[Figure not available in this copy] Fig. 2. L2 learning. VP, Verb Phrase; PP, Preposition Phrase; AP, Adjective Phrase; NP, Noun Phrase.

Neither for first nor for second language acquisition can it be said that UG acquisition models are based on extensive empirical research within the principles and parameters framework. Most L1 and L2 research has dealt with rules, not principles; much of the research that purports to deal with Universal Grammar treats areas of syntax that are not principle-based or, when they are, not based on the actual principles proposed within the Government/Binding theory. But evidence from actual children is not of prime importance to the theory, for two reasons.
First, the theory claims that acquisition research can establish what must be built into the mind without reference to an actual child at all, by comparing what the speaker knows with the possible language evidence he or she has encountered; if we can show that a speaker knows something about language, say the phrase structure principles, and that this could not be worked out from the sentences the learner hears, then we can demonstrate that it must be part of the speaker's mind: the poverty of the stimulus argument. Secondly, the theory separates the idealized picture of acquisition that is its concern from the history of the child's actual development, in which language acquisition is combined with physical, social and cognitive development; using actual children's use of language for learning about acquisition necessitates disentangling the thread of language acquisition from all the others with which it is interwoven, something at present impossible. Any sentence from a child we try to study is a product of development, not acquisition, and depends on the child's memory capacity, social development, and cognitive stage, all of which have an indirect connection to language acquisition proper. The case of L2 learning may be slightly different, in that L2 learners may be more developed in all aspects except language than the L1 child learner; L2 development is still not, however, immune to the effects of other cognitive deficits in a second language, such as reduced short-term memory capacity.

This article is not the place to survey the contribution of UG theory to L2 learning in general; broadly similar accounts will be found in Cook (1988), Ellis (1985), Flynn (1988), Lightbown and White (1988), and McLaughlin (1987). The present argument presupposes that UG theory is relevant to L2 learning, looking narrowly at the principles and parameters version of UG outlined above.
UG AND CLASSROOM LEARNING

The UG model is primarily about language knowledge, not language use or language development; indeed, strictly speaking, it is about grammar rather than about language. Furthermore it is concerned with the abstract central areas of syntax rather than with broader aspects of language; its interests lie in what the speaker knows about language, grammatical competence, rather than in how the speaker uses language, pragmatic competence. The UG theory is arguably of minor importance in dealing with how people communicate, or how they meet and understand other people, or how their language behaviour varies from one situation to another. Classroom second language learning and teaching is made up of many components (psychological, social, and linguistic); UG theory can play only one part in this framework. When looking at the relevance of UG theory to classroom learning we need to remember its restricted scope: general principles of syntax such as the phrase structure principles, and precise areas of variation such as the pro-drop or head parameters. It would be misleading to attempt to draw conclusions from UG theory for anything other than the central area that is its proper domain; much of the ensuing discussion will be concerned with reminding the reader that UG theory is neutral about many of the issues that arise in the classroom. UG theory does not regard language acquisition as depending upon particular circumstances; the uniformity and occurrence requirements mean that it deals with features that can be learnt regardless of situation, regardless of variation between learners, and regardless of types of input, provided the learner has sufficient examples of appropriate sentences to trigger the settings for the various parameters; in L2 learning this may be modified by any transfer of parameter settings from the L1.
If UG is involved in L2 learning, there should be no intrinsic difference between classroom acquisition and any other form of language acquisition with respect to the type of language knowledge involved in UG theory. However important the concepts of variability and the context of language learning may be to other areas of L2 learning research (Ellis and Roberts, 1987), they have no relevance to UG-related areas. The classroom learner is setting values for parameters from positive evidence; so long as the classroom provides appropriate evidence, parameter-setting will take place.

What would such appropriate evidence consist of? On the one hand there is the extreme case where it is believed that learning may take place on the basis of one sentence or a small set of sentences, as with the head parameter example, called in Cook (1989b) one-time setting; indeed experiments with Micro Artificial Languages have shown that learners can choose appropriate settings for the head parameter from around 30 sentences, even if these appear to be not quite the settings that UG theory utilizes (Cook, 1989a). On the other hand some of the necessary evidence may be indirect; Hyams (1986) believes that the crucial element in the English child's switching to non-pro-drop is not the absence of subjectless declarative sentences themselves, a form of negative evidence, but the presence of the expletive subjects there and it, which are a by-product of the non-pro-drop setting and absent from pro-drop languages: a form of positive evidence. Setting a parameter correctly sometimes requires a range of syntactic forms rather than just one paradigm sentence. Furthermore L2 learners may be exposed to forms of evidence, such as explanation or correction, which are rare in L1 acquisition.
While the effects of this on the knowledge of parameters appear minimal, since teachers do not have the academic knowledge to correct errors or make explanations on the basis of such syntactic principles and parameters, they cannot be entirely dismissed. On the whole, then, the well-established features of teacher-talk - shorter utterances (Wesche and Ready, 1985), less subordination (Ishiguro, 1986), slower speed (Mannon, 1986), and so on - have nothing to do with the desirable properties of input for a UG model, except in so far as they segment the input more readily into grammatical constituents, as UG theory implies (Cook, in progress; Morgan, 1986). Nor can L2 knowledge be derived from particular types of interaction or behaviour by the learner, say, understanding the message in the sentence or taking part in a mutually constructed dialogue: hearing the sentence is enough. It is not necessary for the learner to do anything in particular; hence work with learners' strategies in the classroom such as those enumerated by O'Malley et al. (1985) is beside the point, as it does not reflect acquisition of any of the core areas of syntax. The learner's grammar will conform in one way or another to the principles of UG, even if not in the same way as in the first language or the second language or in either of them, since the only type of grammar that may be entertained by the language faculty of the human mind must conform to the principles of UG. At all stages the learner's interlanguage will reflect UG principles, regardless of its wild deviancy from the L1 or the L2. Thus Error Analysis misses the point if it emphasises the self-contained internal system of the learner's language rather than seeing it as one of the possible instantiations of human language; learner languages vary within the finite possibilities set by UG, always provided that UG is in fact available to the L2 learner. The UG model is then neutral so far as interaction in the classroom is concerned; whether teachers vary the types of question (Long and Sato, 1983), or provide corrective feedback of various types (Day et al., 1984), or engage in the three-fold classroom moves of Sinclair and Coulthard (1975) is irrelevant so far as the UG areas of syntax are concerned, as these are not acquirable by such means. The uniformity requirement of the UG model insists in addition that whatever it is that fosters acquisition in the input is freely available in all L1 situations; generalized to the L2 situation, this suggests that an L2 learning theory cannot depend solely on a particular type of interaction that is highly idiosyncratic and does not occur in situations where some L2 learners have been shown to be successful. The UG theory has nothing in common with models that stress amount of practice, or active production by the learners, common for instance in language teaching methods from the audiolingual to the communicative; as for the active comprehension advocated by supporters of Listening First methodologies (Cook, 1986), there is indeed the necessity for the learner to be aware of syntactic categories and of vocabulary meanings, but there seems no particular need for the depth of semantic processing suggested by such models; triggering implies no deeper processing than syntactic and lexical codebreaking.
A few sentences are all that is needed to set parameters; practice or production is neither good nor bad, as the parameter is either set or it isn't; a parameter is a switch with two or more discrete positions rather than a steadily increasing response strength. Some research has shown that a minimal amount of data may turn the switch for the learner; Cromer (1987), for example, showed that giving children 10 examples of sentences illustrating the eager/easy to please construction every three months, without telling them if they were right or wrong, brought them way ahead of their peers. Though not couched in terms of the current UG theory, this shows how linguistic evidence supplied at the right time may facilitate acquisition of syntax. The crucial point so far as UG is concerned is that the appropriate input for triggering particular parameters be available to the learner; not the amount of input in terms of quantity, nor the properties of the input in other syntactic terms, nor whether it conveys a message. The provision of input is crucial to acquisition, but the necessary input may consist of a handful of sentences. We come then to the question of sequence of development. Much L2 learning research has prided itself on discovering sequences of acquisition, as if a sequence were itself an explanation rather than a fact that needed to be explained. The main UG theory is neutral about L1 sequence; there might be a tendency to start with unmarked settings (in so far as these are not in any case synonymous with learnt earlier). But this tendency is likely to be obscured in the L1 by the gross developmental changes in the child's other attributes, and in the L2 by the more subtle deficiencies in the learner's other cognitive systems when using the L2. If the growth model of UG is accepted, there may be a difference between older L2 learners and younger L2 learners or L1 learners, in that all the principles are present in the minds of the older L2 learners.
Far from the claim of the standard Critical Period Hypothesis that there is a cut-off point for language acquisition, and far from the usual claim of the Monitor Model that acquisition can take place at any time while conscious learning may occur only after a certain age (Krashen, 1982), if a growth model of UG is correct and UG is still available, the acquisition of older L2 learners will reflect UG better than that of younger L1 or L2 learners, since they would have all the principles simultaneously present. Like other contemporary linguistic theories, UG also emphasizes the importance of vocabulary. The L2 learner needs to spend comparatively little effort on phrase structure, since it results from the setting of a handful of parameters. He or she needs, however, to acquire an immense amount of detail about how individual words are used. The comparative simplicity of syntax learning is achieved by increasing the burden of vocabulary learning, where the learner needs to acquire masses of words, not just in the conventional way of knowing their dictionary meaning or pronunciation but also in knowing the way they behave in sentences; it is not just a matter of the beginner in English learning the syntax, function, and meaning of He plays football; it is learning that in English the verb play needs to be followed by a Noun Phrase. It has often been reported that learners feel vocabulary to be particularly important (Hatch, 1978); a questionnaire I administered to 351 students of English found that they placed the statement I want to learn more English words and phrases second out of 10 possible aims for their English course, after I want to practice English so that I can use it outside the classroom, and some way above structures, functions, or life in England. A major learning component according to the UG theory will indeed be vocabulary, if not perhaps in the way that either learners or teachers presently conceive of it.
UG theory clearly has little to say about many of the controversies about classroom language learning; it cannot be taken to support or deny various positions that are outside its remit. Thus for instance it is unjustifiable to invoke UG theory, or indeed any Chomskyan view of language acquisition, as supporting the provision of explicit rules to the learner: "It must be recognized that one does not learn the grammatical structure of a second language through explanation and instruction beyond the most rudimentary level for the simple reason that no one has enough explicit knowledge about this structure to provide explanation and instruction" (Chomsky, 1969). If such evidence were to help learners to set parameters in L2 acquisition, it would suggest that L2 learning was taking place through some faculty other than the language faculty, a possibility denied by the current theory, and that the resulting knowledge acquired was language-like rather than true grammatical competence. Proper language knowledge must be derivable via triggering from positive input. Needless to say, grammatical explanation may work for aspects of grammar that are not the central factors of UG; such peripheral areas as the acquisition of closed-class grammatical morphemes may well yield to such treatment. There is no warrant for seeing consciousness-raising in the form of explicit statements of grammatical rules to learners as having anything to do with UG theory, whatever its merits on other grounds (Rutherford, 1987). Knowledge of language is not conscious; the model has no way for conscious knowledge to become unconscious. And of course whatever explanations were vouchsafed to learners would need to be in terms of principles (which they already possess unconsciously anyway) rather than of construction-specific rules. Again this is not to say that such explanations would not work for aspects of grammar or of language outside the UG purview.
But such views cannot be accommodated within the areas of language acquisition covered by the UG theory itself. Nor is it possible to interpret UG theory as supporting the hypothesis-testing theory in the form in which it became familiar in L2 research and language teaching - the learner makes a hypothesis about the grammar, tries it out, and modifies it in the light of how successful it is. Such a process requires feedback to the learner concerning the correctness of his or her temporary hypothesis, as first argued by Braine (1971); without such feedback the learner would never know whether the hypothesis were correct. But in first language acquisition correction of the appropriate syntax is not universally provided, and so cannot be an essential component of L1 acquisition. So far as the learning of aspects of language other than the syntactic core is concerned, UG theory is simply neutral; perhaps these are precisely the parts of language that have to be learnt, since the rest is innate. It may be that communicative goals imply other forms of learning; pragmatic competence is multi-functional and includes a communicative function as well as others; such uses are not part of UG, which is concerned with knowledge of language, grammatical competence. The argument here has implied that UG is only one component out of many in L2 learning. The UG approach may indeed tackle the most profound areas of L2 acquisition, those that are central to language and to the human mind. But, once these are established, there may be rather little to say about them; the UG principles are not learnt, and the parameter settings probably need rather little attention. On the one hand this is indeed proof of their central importance to language learning; UG is proof against situation and against learner variation because of its central importance.
On the other hand the complexity, the difficulty, and indeed much of the interest in L2 learning may lie in the aspects that are not predictable from UG theory - learner variation, situational purpose, foreign accent, motivation, and an endless list. The study of classroom L2 learning needs to operate within a framework that includes not only a linguistic model of acquisition such as UG but also psychological models of speech processing, language development, and cognitive development, a sociolinguistic model of discourse interaction, and an educational model of the values and purposes of language teaching. Having produced so many caveats, is it possible to venture some simple concrete applications to language teaching? So far as classroom interaction is concerned we have seen that the most that the UG theory would recommend is the provision of adequate language samples for parameter setting to take place. Let us take a modern beginners' course, The Cambridge English Course (Swan and Walters, 1984), to see what linguistic evidence it provides the students. The evidence for setting the head parameter needs to be sentences showing Object complements following verbs rather than preceding them; Unit 1 of the course concentrates on My name's . . .; the first conversation the students hear has two examples of Object Noun Phrases following is; the first practice for the students is Say your name. Hello, my name's . . .. In other words, in the first minutes of the course the student is given sufficient information for setting the head parameter, one of the major aspects of the phrase structure of English; the only possible confusion is the use of questions such as Is your name Mark Perkins? in the same context, where the Object is separated from the verb by the Subject. Furthermore the student is learning properties of the verb is, namely that it has to be followed by a complement, except in short answers such as No, it isn't. Turning to the pro-drop parameter, the evidence needs to be the absence of null-subject sentences, something eschewed by all EFL course books, even if they are not infrequent in ordinary spoken English for performance reasons, and, according to Hyams, the presence of expletive subjects such as it and there. Unit 5 of the Cambridge course introduces existential there in such sentences as There's an armchair in the living room, Unit 7 in such sentences as There's some water in the big field; Unit 9 introduces weather-report it in It rains from January to March and It'll cloud over tomorrow, together with there in There will be snow; Unit 10 teaches dummy it in It's a man. Again everything necessary to set the parameter is introduced within the first few weeks of the course. And it would be surprising if it weren't; any small sample of English sentences must reflect this basic fact, just as it is hard for any small sample not to use all the phonemes of English. A traditional interpretation of Chomskyan thinking for the classroom is what I have termed elsewhere laissez-faire (Cook, 1988): leave the student alone so that the natural processes of his or her mind can get to grips with language. The argument in favour of this originally was that our ignorance of language acquisition meant we interfered with it at our peril.
This is no longer the case so far as current UG theory is concerned: the contents of the speaker's mind and the evidence necessary for acquisition are known, both at the general level of the need for positive evidence and at the specific level of the need for expletive subjects, say, for setting the pro-drop parameter. Laissez-faire works for the accidental reason that the necessary evidence is simple and common, and so bound to be present in the input. Behind the classroom stands the syllabus. In so far as grammar forms part of contemporary syllabuses, one consequence of UG might be that the division between what needs and doesn't need to be taught can be based on a notion of principles and parameters. Facts that are part of general principles don't need to be taught. A student does not need to learn that a phrase always has a head of a related syntactic category because, quite literally, everyone knows that. A glance at current syllabuses may disclose areas which can be eliminated for these reasons. Above all we need to reinterpret the grammatical syllabus in terms of principles rather than separate rules or structures; the syllabus often gives the impression of consisting of discrete grammatical items - the present tense, the definite article - rather than the interlinked knowledge that the UG theory suggests. Constellations of syntactic structures that have so far been widely separated might be combined; for instance, suppose that we wanted to teach movement; this would involve at least wh-questions, yes/no questions, relative clauses, the passive, and the use of seem - grammatical topics whose common factor has almost certainly never been emphasized in teaching! The use of the concept of parameters by teachers depends upon a decision about whether L1 parameter-settings are transferred or not. Finally, as a slightly tangential point, there is the teacher's awareness of language and of syntax.
Firstly, being aware of what is taken care of by UG can free the teacher to pay attention to other things that actually need teaching; to some extent this already takes place, since the teacher is not aware of the general principles or specific parameters that he or she has been covering: ignorance is bliss. But secondly the use of the Government/Binding model of syntax associated with UG theory can provide insights to the teacher confined to the traditional models of grammar used in language teaching. Take the case of pro-drop. One volume of the Cambridge Handbooks for Language Teachers is Learner English (Swan and Smith, 1987), which collects information on the English spoken by 19 different groups of learners. For Italian we learn that "use of the subject pronoun is not obligatory" and "the order of subject and predicate is freer than in English" (p. 66); for Spanish, that "subject personal pronouns are largely unnecessary" (p. 85) and "subject-verb and verb-subject do not regularly correspond to statement and question respectively" (p. 79); for Chinese, that "English uses pronouns much more than Chinese, which tends to drop them when they may be understood" (p. 232) and "Not only interrogatives but also other sentences with inverted word order are also error-prone" (p. 232); similar comments are made about Greek, Portuguese, and Thai, together with remarks about missing it in Portuguese (p. 99) and Spanish (p. 85). To a UG theorist these all reflect the pro-drop parameter; a crucial generalization is being overlooked. Teachers are missing an important insight if they see these as separate bits of information about different languages rather than as a two-way variation in languages. Teachers I have spoken to have indeed found the two examples of the head parameter and the pro-drop parameter useful insights that help them to understand their students. UG may be of help at one stage removed from the student.

CORE PRINCIPLES OF TRANSFORMATIVE LEARNING THEORY - MEZIROW & OTHERS
Introduction to Transformative Learning
Transformative Learning is a theory of deep learning that goes beyond mere content knowledge acquisition - learning equations, memorizing tax codes, or learning historical facts and data. It is a desirable process through which adults learn to think for themselves, achieving true emancipation from the sometimes mindless or unquestioning acceptance of what we have come to know through our life experience, especially those things that our culture, religions, and personalities may predispose us towards without our active engagement and questioning of how we know what we know. For us as adults to truly take ownership of our social and personal roles, being able to develop this self-authorship goes a long way towards helping our society and world become a better place through our greater understanding and awareness of the world and issues beyond us, and can help us to improve our role in our lives and those of others.

Making Meaning as a Learning Process
Adult learning needs to emphasize contextual understanding, critical reflection on assumptions, and validating meaning by assessing reasons. Recent approaches to transformative learning also include transformation through our intuitive, unconscious processes. John Dirkx and Patricia Cranton have researched "soulfulness" and the influence of intuition and the unconscious on our meaning-making. While transformative learning theory originally consisted of critical self-reflection and disorienting dilemmas leading to cognitive adjustments that reframe one's world, it has expanded to include what Jung considers the "unconscious functions." Dirkx and Cranton expand on this in their work.

Context
The justification for much of what we know and believe, our values and our feelings, depends on the context - biographical, historical and cultural - in which they are embedded.
We make meaning with different dimensions of awareness and understanding; in adulthood we may more clearly understand our experience when we know under what conditions an expressed idea is true or justified. In the absence of fixed truths and confronted with often rapid change in circumstances, we cannot fully trust what we know or believe (Mezirow, 2000, pp. 3-4). Our understandings and beliefs are more dependable when they produce interpretations and opinions that are more justifiable or true than would be those predicated upon other understandings or beliefs (Mezirow, 2000, p. 4). Developing more dependable beliefs about our experience, considering them in the context of our lives, and being able to improve our decision-making based on our insights are all critical to adult learning. Bruner identified four means of meaning making:
- Establishing, shaping, and maintaining intersubjectivity
- Relating events, utterances, and behavior to the action taken
- Construing of particulars in a normative context - dealing with meaning relative to obligations, standards, conformities, and deviations
- Making propositions - applying rules of the symbolic, syntactic, and conceptual systems used to achieve decontextualized meanings, including rules of inference and logic and such distinctions as whole-part, object-attribute, and identity-otherness

Transformative learning (Mezirow's addition): becoming aware of one's own tacit assumptions and expectations and those of others and assessing their relevance for making an interpretation (Mezirow, 2000, p. 4).

Kitchener identified three levels of cognitive processing:
- First-order cognition - compute, memorize, read, and comprehend
- Metacognition - monitoring one's own progress and products while engaged in first-order cognitive tasks
- Epistemic cognition - how humans monitor their problem solving when engaged in ill-structured problems; concerns the limits of knowledge; emerges in late adolescence, and its form may change during the adult years (Mezirow, 2000, pp. 4-5)

In this formulation, transformative learning pertains to epistemic cognition (Mezirow, 2000, p. 5). Heron discusses a type of learning, called presentational, where we do not require words to make meaning (e.g. art, music, empathy, feeling, transcendence, inspiration, the kinesthetic, the aesthetic). Weis brings up the intuitive process, or the unconscious acquisition of knowledge, which is "much more sophisticated and rapid than conscious capacity" (Mezirow, 2000, p. 6). Art, music, and dance are alternative languages. Intuition, imagination, and dreams are other ways of making meaning. Inspiration, empathy, and transcendence are central to self-knowledge and to drawing attention to the affective quality and poetry of human experience. Dirkx writes of learning through soul as involving "a focus on the interface where the socioemotional and the intellectual world meet, where inner and outer converge" (Mezirow, 2000, p. 6). These processes offer another approach beyond the purely rational and cognitive lens.

Domains of Learning
Habermas identified two major domains of learning with different purposes, logics of inquiry, criteria of rationality, and modes of validating beliefs:
- Instrumental learning - learning to control and manipulate the environment or other people (task-oriented problem solving to improve performance)
- Communicative learning - learning what others mean when they communicate with you; this often involves feelings, intentions, values, and moral issues
In communicative learning we need to be mindful of assessing the meaning behind the words, the truthfulness and qualifications of the speaker, and the authenticity of expressions of feeling. We must become critically reflective of the assumptions of the person communicating. Assumptions include intent, what is implied as subtext, conventional wisdom, or a particular religious world view; this often requires a critical assessment of the assumptions supporting the justification of norms (Mezirow, 2000, p. 9).

Reflective Discourse
In the context of Transformative Learning Theory, reflective discourse is that specialized use of dialogue "devoted to searching for a common understanding and assessment of the justification of an interpretation or belief" (Mezirow, 2000, p. 10). This is about forming a personal understanding of issues or beliefs by assessing the evidence and arguments for a point of view, being open to looking at alternative points of view or alternative beliefs, then reflecting critically on the new information and making a personal judgment based on a new assessment of it. An example might be having a conviction that a tax levy in your community is superfluous; politicians may tell the electorate that any new taxes are unnecessary and that the current leaders have failed to manage budgets responsibly. If we actually seek data, we might find that inflation and other economic factors (including the loss of tax-paying companies in our area) have undermined our community's effort to continue to provide essential services, or those we consider essential, and that these organizations are actually being fairly well managed. At this point we need to consider alternatives: what are we willing to give up, or reduce in scope, to maintain a balanced budget, or how high are we willing to go with new taxes to continue our services? Mezirow reminds us of our need to draw on collective experience and arrive at a best decision.
This flies in the face of what Deborah Tannen (1998) calls our "argument culture," and we find evidence of this daily in the political realities of our country. It is all about win or lose and, unfortunately for most of us, not about seeking common ground. To develop common ground we need to help others and ourselves move from self-serving debate towards empathetic listening and informed, constructive discourse. Recent studies reveal that effective discourse in transformative learning requires emotional maturity, or what Daniel Goleman calls emotional intelligence - knowing and managing our own emotions and motivating ourselves, as well as recognizing emotions in others and handling relationships. Goleman's research shows that emotional intelligence accounts for about 87% of success at work (Mezirow, 2000, p. 11).

Meaning Structures, or a Frame of Reference
This is our structure of assumptions and expectations (including our cultural assumptions, often received as repetitive affective experiences outside of our conscious awareness) through which we filter sense impressions. A frame of reference has two dimensions:
- A Habit of Mind - broad-based assumptions that act as a filter for our experiences; these include moral consciousness, social norms, learning styles, philosophies including religion, world view, artistic tastes, and personality type and preferences. Sometimes this may be expressed as our point of view (Mezirow, 2000, pp. 17-18).
- Resulting Point of View - our points of view, attitudes, beliefs, and judgments. Since it is here that our sense of self and our values are interwoven, we must be mindful of viewpoints that challenge our beliefs, and realize that if someone expresses a contrary or different point of view we are not under personal attack, even though our beliefs and views may be.
Transformations
A transformation is a process whereby we move over time to reformulate our structures for making meaning, usually through reconstructing dominant narratives or stories. This provides us with a more dependable way to make meaning within our lives, since we are questioning our own points of view, reflecting on alternate points of view, and often creating a new, more reliable and meaningful way of knowing that may differ from our old habits of mind. This requires us to become open to others' points of view, to reflect on new points of view and information, and often to go back and reconstruct what we know and how we know it. Often we make judgments about others and think we know why they do or do not do what we expect. If we are truly open to understanding, we might engage them in dialogue and through discussion find that our sense of them and their issues is totally erroneous, leading us to make a new frame for how we see and experience them. Mezirow suggests transformations come about in one of four ways:
- Elaborating existing frames of reference
- Learning new frames of reference
- Transforming points of view
- Transforming habits of mind

Transformations often follow some variation of the following phases of meaning becoming clarified:
1. A disorienting dilemma - loss of a job, divorce, marriage, going back to school, or moving to a new culture
2. Self-examination with feelings of fear, anger, guilt, or shame
3. A critical assessment of assumptions
4. Recognition that one's discontent and the process of transformation are shared
5. Exploration of options for new roles, relationships, and actions
6. Planning a course of action
7. Acquiring knowledge and skills for implementing one's plans
8. Provisional trying of new roles
9. Building competence and self-confidence in new roles and relationships
10. A reintegration into one's life on the basis of conditions dictated by one's new perspective (Mezirow, 2000, p. 22)

When we speak of reframing, we are speaking of two different means of reframing:
- Objective reframing - involving critical reflection on the assumptions of others encountered in a narrative or in task-oriented problem solving
- Subjective reframing - involving critical self-reflection on one's own assumptions about a narrative (applying reflective insight from someone else's narrative to one's own experience), a system (economic, social, or educational), an organization or workplace, feelings and interpersonal relations (counseling or psychotherapy), or the way we learn (Mezirow, 2000, p. 23)

Critical to teachers helping effect transformative learning in adults is an understanding of the importance of supportive relationships in the lives of adult students who may be experiencing transformative learning. Having a safe and supportive system of teachers and other significant people may greatly facilitate the student's willingness to move forward with transformative learning. In summary, Transformative Learning Theory provides a structure and process through which to better understand adult growth and development. Early theorists, including Jean Piaget and Maria Montessori, developed very thorough theories about childhood development, but for years few scholars probed how adults learn and make meaning of their lives - until Jack Mezirow, in a study of women returning to school as adults, discovered much of what we now know as Transformative Learning Theory, a theory that started with Mezirow and has been greatly enriched by many others.

Transformative Learning
What is Transformative Learning? According to Mezirow, learning occurs in one of four ways: by elaborating existing frames of reference, by learning new frames of reference, by transforming points of view, or by transforming habits of mind.
And cognitive processing involves three levels:
- First-order thinking - compute, memorise, read, and comprehend
- Metacognition - monitoring the progress and products of first-order thinking
- Transformative learning - reflecting on the limits of knowledge, the certainty of knowledge, and the criteria for knowing; emerges in late adolescence

Transformative learning therefore involves the transformation of frames of reference (points of view, habits of mind, worldviews) and critical reflection on how we come to know. From Mezirow: "Transformation theory's focus is on how we learn to negotiate and act on our own purposes, values, feelings, and meanings rather than those we have uncritically assimilated from others - to gain greater control over our lives as socially responsible, clear thinking decision makers. ... we transform frames of reference - our own and those of others - by becoming critically reflective of their assumptions and aware of their context." Assumptions on which habits of mind and related points of view are predicated may be epistemological, logical, ethical, psychological, ideological, social, cultural, economic, political, ecological, scientific, or spiritual, or may pertain to other aspects of experience. "Transformative learning refers to transforming a problematic frame of reference to make it more dependable ... by generating opinions and interpretations that are more justified." We become critically reflective of those beliefs that become problematic. (Ref. Mezirow, Jack et al. (2000) Learning as Transformation.)

Do we need to be more explicit about Transformative Learning? Until recently transformative learning has largely been the province of adult learning theory. However, there are several reasons to consider transformative learning theory and practice for students (particularly adolescents) in schools and colleges.
The transition to adult life often involves personal transformation as students move from a safe school environment to take on complex work, study and social responsibilities. Transformative learning equips students with the concepts and understanding necessary to make a success of this transition. When students are led to a deeper understanding of concepts and issues, their fundamental beliefs and assumptions may be challenged, leading to a transformation of perspective or worldview. Students who understand transformative learning may be better able to recognise the common stages of transformative change and have the tools to assist them during this process. As we ask students to develop critical and reflective thinking skills and encourage them to care about the world around them, they may decide that some degree of personal or social transformation is required. Students will need the tools of transformative learning in order to be effective change agents. Otherwise, students may feel disempowered, become pessimistic about the future, fear change, or develop a degree of cynicism towards those who promote change. We are living through a period of transformational change in society and culture. Students will be better able to understand and deal with such change if they understand the nature of transformation and the impact it has on individuals, groups, organizations and nations.

Communicative language teaching (CLT) is an approach to the teaching of second and foreign languages that emphasizes interaction as both the means and the ultimate goal of learning a language. It is also referred to as the communicative approach to the teaching of foreign languages, or simply the communicative approach.

Relationship with other methods and approaches

Historically, CLT has been seen as a response to the audio-lingual method (ALM), and as an extension or development of the notional-functional syllabus. Task-based language learning, a more recent refinement of CLT, has gained considerably in popularity.

The audio-lingual method

The audio-lingual method (ALM) arose as a direct result of the need for foreign language proficiency in listening and speaking skills during and after World War II. It is closely tied to behaviorism, and thus made drilling, repetition, and habit-formation central elements of instruction. Proponents of ALM felt that this emphasis on repetition needed a corollary emphasis on accuracy, claiming that continual repetition of errors would lead to the fixed acquisition of incorrect structures and non-standard pronunciation.

In the classroom, lessons were often organized by grammatical structure and presented through short dialogues. Often, students listened repeatedly to recordings of conversations (for example, in the language lab) and focused on accurately mimicking the pronunciation and grammatical structures in these dialogues. Critics of ALM asserted that this over-emphasis on repetition and accuracy ultimately did not help students achieve communicative competence in the target language. Noam Chomsky argued, "Language is not a habit structure. Ordinary linguistic behaviour characteristically involves innovation, formation of new sentences and patterns in accordance with rules of great abstractness and intricacy". Such critics looked for new ways to present and organize language instruction, and advocated the notional-functional syllabus, and eventually CLT, as the most effective way to teach second and foreign languages. However, audio-lingual methodology is still prevalent in many textbooks and teaching materials. Moreover, advocates of audio-lingual methods point to their success in improving aspects of language that are habit driven, most notably pronunciation.

The notional-functional syllabus

Main article: Notional-functional syllabus

A notional-functional syllabus is more a way of organizing a language learning curriculum than a method or an approach to teaching. In a notional-functional syllabus, instruction is organized not in terms of grammatical structure, as had often been done with the ALM, but in terms of notions and functions. In this model, a notion is a particular context in which people communicate, and a function is a specific purpose for a speaker in a given context. As an example, the notion or context of shopping requires numerous language functions, including asking about prices or features of a product and bargaining. Similarly, the notion of a party would require numerous functions, like introductions and greetings and discussing interests and hobbies.
Proponents of the notional-functional syllabus claimed that it addressed the deficiencies they found in the ALM by helping students develop their ability to communicate effectively in a variety of real-life contexts.

Learning by teaching (LdL)

Learning by teaching is a widespread method in Germany (Jean-Pol Martin). The students take the teacher's role and teach their peers.

CLT is usually characterized as a broad approach to teaching, rather than as a teaching method with a clearly defined set of classroom practices. As such, it is most often defined as a list of general principles or features. One of the most recognized of these lists is David Nunan's (1991) five features of CLT:

An emphasis on learning to communicate through interaction in the target language.
The introduction of authentic texts into the learning situation.
The provision of opportunities for learners to focus not only on language but also on the learning management process.
An enhancement of the learner's own personal experiences as important contributing elements to classroom learning.
An attempt to link classroom language learning with language activities outside the classroom.

These five features are claimed by practitioners of CLT to show that they are very interested in the needs and desires of their learners as well as the connection between the language as it is taught in their class and as it is used outside the classroom. Under this broad umbrella definition, any teaching practice that helps students develop their communicative competence in an authentic context is deemed an acceptable and beneficial form of instruction. Thus, in the classroom CLT often takes the form of pair and group work requiring negotiation and cooperation between learners, fluency-based activities that encourage learners to develop their confidence, role-plays in which students practice and develop language functions, as well as judicious use of grammar- and pronunciation-focused activities.
In the mid-1990s the Dogma 95 manifesto influenced language teaching through the Dogme language teaching movement, which proposed that published materials can stifle the communicative approach. As such, the aim of the Dogme approach to language teaching is to focus on real conversations about real subjects, so that communication is the engine of learning. This communication may lead to explanation, but this in turn will lead to further communication.[1]

Classroom activities used in CLT

Example activities:
Role play
Interviews
Information gap
Games
Language exchanges
Surveys
Pair work
Learning by teaching

However, not all courses that utilize the communicative language approach will restrict their activities solely to these. Some courses will have the students take occasional grammar quizzes, or prepare at home using non-communicative drills, for instance.

Critiques of CLT

One of the most famous attacks on communicative language teaching was offered by Michael Swan in the English Language Teaching Journal in 1985.[2] Henry Widdowson responded in defense of CLT, also in the ELT Journal (1985 39(3):158-161). More recently other writers (e.g. Bax[3]) have critiqued CLT for paying insufficient attention to the context in which teaching and learning take place, though CLT has also been defended against this charge (e.g. Harmer 2003[4]). Often, the communicative approach is deemed a success if the teacher understands the student. But if the teacher is from the same region as the student, the teacher will understand errors resulting from the influence of the student's first language, while native speakers of the target language may still have difficulty understanding them. This observation may call for new thinking on, and adaptation of, the communicative approach. The adapted communicative approach should be a simulation where the teacher pretends to understand only what any regular speaker of the target language would, and reacts accordingly (Hattum 2006[5]).
Connectionism is a set of approaches in the fields of artificial intelligence, cognitive psychology, cognitive science, neuroscience and philosophy of mind that models mental or behavioral phenomena as the emergent processes of interconnected networks of simple units. There are many forms of connectionism, but the most common forms use neural network models.

Basic principles

The central connectionist principle is that mental phenomena can be described by interconnected networks of simple and often uniform units. The form of the connections and the units can vary from model to model. For example, units in the network could represent neurons and the connections could represent synapses.

Spreading activation

In most connectionist models, networks change over time. A closely related and very common aspect of connectionist models is activation. At any time, a unit in the network has an activation, which is a numerical value intended to represent some aspect of the unit. For example, if the units in the model are neurons, the activation could represent the probability that the neuron would generate an action potential spike. If the model is a spreading activation model, then over time a unit's activation spreads to all the other units connected to it. Spreading activation is always a feature of neural network models, and it is very common in connectionist models used by cognitive psychologists.

Neural networks

Neural networks are by far the most commonly used connectionist model today. Much research using neural networks is done under the more general name "connectionist". Though there is a large variety of neural network models, they almost always follow two basic principles regarding the mind:

Any mental state can be described as an N-dimensional vector of numeric activation values over neural units in a network.
Memory is created by modifying the strength of the connections between neural units.
The connection strengths, or "weights", are generally represented as an N×N-dimensional matrix.
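These two principles, together with the spreading-activation idea above, can be sketched in a few lines of Python; the three-unit network and its weight values here are invented purely for illustration:

```python
# A "mental state" as a vector of activations over three units, and
# memory as a 3x3 matrix of connection weights (all values invented).
activation = [1.0, 0.0, 0.0]   # only unit 0 is currently active
weights = [                    # weights[i][j]: connection strength i -> j
    [0.0, 0.5, 0.2],
    [0.5, 0.0, 0.0],
    [0.2, 0.0, 0.0],
]

def spread(activation, weights):
    """One step of spreading activation: every unit passes its
    activation to its neighbours along the weighted connections."""
    n = len(activation)
    return [sum(activation[i] * weights[i][j] for i in range(n))
            for j in range(n)]

print(spread(activation, weights))  # activation flows out of unit 0
```

On this picture, learning would mean changing the numbers in the weight matrix, while the activation vector itself is the momentary mental state.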

Most of the variety among neural network models comes from:

Interpretation of units: units can be interpreted as neurons or groups of neurons.
Definition of activation: activation can be defined in a variety of ways. For example, in a Boltzmann machine, the activation is interpreted as the probability of generating an action potential spike, and is determined via a logistic function on the sum of the inputs to a unit.
Learning algorithm: different networks modify their connections differently. Generally, any mathematically defined change in connection weights over time is referred to as the "learning algorithm".

Connectionists are in agreement that recurrent neural networks (networks wherein connections of the network can form a directed cycle) are a better model of the brain than feedforward neural networks (networks with no directed cycles). Many recurrent connectionist models also incorporate dynamical systems theory. Many researchers, such as the connectionist Paul Smolensky, have argued that connectionist models will evolve towards fully continuous, high-dimensional, non-linear, dynamic systems approaches.

Biological realism

The neural network branch of connectionism suggests that the study of mental activity is really the study of neural systems. This links connectionism to neuroscience, and models involve varying degrees of biological realism. Connectionist work in general need not be biologically realistic, but some neural network researchers, computational neuroscientists, try to model the biological aspects of natural neural systems very closely in so-called "neuromorphic networks". Many authors find the clear link between neural activity and cognition to be an appealing aspect of connectionism. This has been criticized[1] as reductionist.

Learning

Connectionists generally stress the importance of learning in their models. Thus, connectionists have created many sophisticated learning procedures for neural networks.
Learning always involves modifying the connection weights. These generally involve mathematical formulas to determine the change in weights when given sets of data consisting of activation vectors for some subset of the neural units. By formalizing learning in such a way, connectionists have many tools. A very common strategy in connectionist learning methods is to incorporate gradient descent over an error surface in a space defined by the weight matrix. All gradient descent learning in connectionist models involves changing each weight by the partial derivative of the error surface with respect to the weight. Backpropagation, first made popular in the 1980s, is probably the most commonly known connectionist gradient descent algorithm today.

History

Connectionism can be traced to ideas more than a century old, which were little more than speculation until the mid-to-late 20th century. It wasn't until the 1980s that connectionism became a popular perspective among scientists.

Parallel distributed processing

The prevailing connectionist approach today was originally known as parallel distributed processing (PDP). It was an artificial neural network approach that stressed the parallel nature of neural processing, and the distributed nature of neural representations. It provided a general mathematical framework for researchers to operate in. The framework involved eight major aspects:

A set of processing units, represented by a set of integers.
An activation for each unit, represented by a vector of time-dependent functions.
An output function for each unit, represented by a vector of functions on the activations.
A pattern of connectivity among units, represented by a matrix of real numbers indicating connection strength.
A propagation rule spreading the activations via the connections, represented by a function on the output of the units.
An activation rule for combining inputs to a unit to determine its new activation, represented by a function on the current activation and propagation.
A learning rule for modifying connections based on experience, represented by a change in the weights based on any number of variables.
An environment which provides the system with experience, represented by sets of activation vectors for some subset of the units.

These aspects are now the foundation for almost all connectionist models. A perceived limitation of PDP is that it is reductionistic: that is, it holds that all cognitive processes can be explained in terms of neural firing and communication. A lot of the research that led to the development of PDP was done in the 1970s, but PDP became popular in the 1980s with the release of the books Parallel Distributed Processing: Explorations in the Microstructure of Cognition - Volume 1 (Foundations) and Volume 2 (Psychological and Biological Models), by James L. McClelland, David E. Rumelhart and the PDP Research Group. The books are now considered seminal connectionist works, and it is now common to fully equate PDP and connectionism, although the term "connectionism" is not used in the books.

Earlier work

PDP's direct roots were the perceptron theories of researchers such as Frank Rosenblatt from the 1950s and 1960s. But perceptron models were made very unpopular by the book Perceptrons by Marvin Minsky and Seymour Papert, published in 1969. It demonstrated the limits on the sorts of functions which single-layered perceptrons can calculate, showing that even simple functions like the exclusive disjunction (XOR) could not be handled properly. The PDP books overcame this limitation by showing that multi-level, non-linear neural networks were far more robust and could be used for a vast array of functions. Many earlier researchers advocated connectionist-style models, for example, in the 1940s and 1950s, Warren McCulloch, Walter Pitts, Donald Olding Hebb, and Karl Lashley.
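The gradient descent rule described above, in which each weight is adjusted according to the partial derivative of the error with respect to that weight, can be sketched for the smallest possible case: a single linear unit with one weight. The training pair and learning rate below are invented:

```python
# One-weight "network": output = w * x, error E = (target - output)^2.
# Gradient descent changes w by stepping against the derivative dE/dw.
x, target = 2.0, 1.0      # invented training example
w = 0.0                   # initial connection weight
learning_rate = 0.1

for _ in range(20):
    output = w * x
    grad = -2.0 * x * (target - output)  # dE/dw for squared error
    w -= learning_rate * grad            # step down the error surface

print(round(w, 3))  # converges to 0.5, at which point w * x == target
```

Backpropagation applies this same idea to every weight in a multi-layer network at once, computing the partial derivatives layer by layer.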
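The exclusive disjunction (XOR) limitation mentioned above is easy to see concretely: no single-layer perceptron can compute XOR, but two layers of the same thresholded units can. The weights below are a standard hand-picked solution, not learned ones:

```python
def step(x):
    """Threshold activation: fire (1) only if the net input is positive."""
    return 1 if x > 0 else 0

def unit(inputs, weights, bias):
    """A single perceptron-style unit: thresholded weighted sum."""
    return step(sum(i * w for i, w in zip(inputs, weights)) + bias)

def xor(a, b):
    """XOR via a hidden layer: OR and NAND units feeding an AND unit."""
    h_or = unit([a, b], [1, 1], -0.5)     # fires unless both inputs are 0
    h_nand = unit([a, b], [-1, -1], 1.5)  # fires unless both inputs are 1
    return unit([h_or, h_nand], [1, 1], -1.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor(a, b))  # prints 0, 1, 1, 0
```

No choice of weights and bias for a single such unit reproduces this truth table, which is precisely the Minsky and Papert result; the hidden layer is what makes the function computable.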
McCulloch and Pitts showed how neural systems could implement first-order logic: their classic paper "A Logical Calculus of the Ideas Immanent in Nervous Activity" (1943) is important in this development. They were influenced by the important work of Nicolas Rashevsky in the 1930s. Hebb contributed greatly to speculations about neural functioning, and proposed a learning principle, Hebbian learning, that is still used today. Lashley argued for distributed representations as a result of his failure to find anything like a localized engram in years of lesion experiments.

Connectionism apart from PDP

Though PDP is the dominant form of connectionism, other theoretical work should also be classified as connectionist. Many connectionist principles can be traced to early work in psychology, such as that of William James. Psychological theories based on knowledge about the human brain were fashionable in the late 19th century. As early as 1869, the neurologist John Hughlings Jackson argued for multi-level, distributed systems. Following from this lead, Herbert Spencer's Principles of Psychology, 3rd edition (1872), and Sigmund Freud's Project for a Scientific Psychology (composed 1895) propounded connectionist or proto-connectionist theories. These tended to be speculative theories. But by the early 20th century, Edward Thorndike was experimenting on learning that posited a connectionist-type network. In the 1950s, Friedrich Hayek proposed that spontaneous order in the brain arose out of decentralized networks of simple units. Hayek's work was rarely cited in the PDP literature until recently. Another form of connectionist model was the relational network framework developed by the linguist Sydney Lamb in the 1960s. Relational networks have only been used by linguists, and were never unified with the PDP approach. As a result, they are now used by very few researchers.
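Hebb's learning principle mentioned above, often summarised as "units that fire together wire together", amounts to a strikingly simple update rule. The two-unit example and the learning rate here are invented:

```python
# Hebbian learning: the weight between two units grows in proportion
# to the product of their activations (all numbers invented).
w = 0.0                 # connection strength between two units
learning_rate = 0.1

pre, post = 1.0, 1.0    # both units active at the same time
for _ in range(5):
    w += learning_rate * pre * post  # co-activation strengthens the link

print(round(w, 3))  # 0.5: repeated co-activation built up the connection
```

If either unit were silent (activation 0.0), the product, and hence the weight change, would be zero; that is exactly the "fire together" condition.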
There are also hybrid connectionist models, mostly mixing symbolic representations with neural network models. The hybrid approach has been advocated by some researchers (such as Ron Sun). Connectionism vs. computationalism debate As connectionism became increasingly popular in the late 1980s, there was a reaction to it by some researchers, including Jerry Fodor, Steven Pinker and others. They argued that connectionism, as it was being developed, was in danger of obliterating what they saw as the progress being made in the fields of cognitive science and psychology by the classical approach of computationalism. Computationalism is a specific form of cognitivism which argues that mental activity is computational, that is, that the mind operates by performing purely formal operations on symbols, like a Turing machine. Some researchers argued that the trend in connectionism was a reversion towards associationism and the
abandonment of the idea of a language of thought, something they felt was mistaken. In contrast, it was those very tendencies that made connectionism attractive for other researchers. Connectionism and computationalism need not be at odds, but the debate in the late 1980s and early 1990s led to opposition between the two approaches. Throughout the debate some researchers have argued that connectionism and computationalism are fully compatible, though full consensus on this issue has not been reached. The differences between the two approaches that are usually cited are the following:

Computationalists posit symbolic models that do not resemble underlying brain structure at all, whereas connectionists engage in "low-level" modeling, trying to ensure that their models resemble neurological structures.
Computationalists generally focus on the structure of explicit symbols (mental models) and syntactical rules for their internal manipulation, whereas connectionists focus on learning from environmental stimuli and storing this information in the form of connections between neurons.
Computationalists believe that internal mental activity consists of the manipulation of explicit symbols, whereas connectionists believe that the manipulation of explicit symbols is a poor model of mental activity.
Computationalists often posit domain-specific symbolic sub-systems designed to support learning in specific areas of cognition (e.g. language, intentionality, number), while connectionists posit one or a small set of very general learning mechanisms.

But despite these differences, some theorists have proposed that the connectionist architecture is simply the manner in which the symbol manipulation system happens to be implemented in the organic brain.
This is logically possible, as it is well known that connectionist models can implement symbol manipulation systems of the kind used in computationalist models, as indeed they must be able to if they are to explain the human ability to perform symbol manipulation tasks. But the debate rests on whether this symbol manipulation forms the foundation of cognition in general, so this is not a potential vindication of computationalism. Nonetheless, computational descriptions may be helpful high-level descriptions of some aspects of cognition, such as logic. The debate largely centred on logical arguments about whether connectionist networks were capable of producing the syntactic structure observed in this sort of reasoning. This was later achieved, although using processes unlikely to be possible in the brain, and thus the debate persisted. Today, progress in neurophysiology and general advances in the understanding of neural networks have led to the successful modelling of a great many of these early problems, and the debate about fundamental cognition has thus largely been decided amongst neuroscientists in favour of connectionism. However, these fairly recent developments have yet to reach consensus acceptance amongst those working in other fields, such as psychology or philosophy of mind. Part of the appeal of computational descriptions is that they are relatively easy to interpret, and thus may be seen as contributing to our understanding of particular mental processes, whereas connectionist models are generally more opaque, to the extent that they may be describable only in very general terms (such as specifying the learning algorithm, the number of units, etc.) or in unhelpfully low-level terms. In this sense connectionist models may instantiate, and thereby provide evidence for, a broad theory of cognition (i.e. connectionism), without representing a helpful theory of the particular process being modelled.
In this sense the debate might be considered as to some extent reflecting a mere difference in the level of analysis at which particular theories are framed. The recent popularity of dynamical systems in philosophy of mind has added a new perspective on the debate; some authors now argue that any split between connectionism and computationalism is more conclusively characterised as a split between computationalism and dynamical systems. The recently proposed Hierarchical temporal memory model may help to resolve this dispute, at least to some degree, given that it explains how the neocortex extracts high-level (symbolic) information from low-level sensory input.

Part 1: What is connectionism?

Connectionism as a term was first mentioned in Thorndike's study (1898) of the way cats learn in incremental stages. Connectionism as a paradigm of learning has its roots in associationism. Associationism dates from classical times but was substantially refined by the seventeenth-century philosophers Hobbes and Locke. The fundamental belief of associationism is that learning can be regarded as the formation of associations between previously unrelated information based on their contiguity. Connectionism is also based on this principle but is somewhat different in that it encompasses much more, as outlined below. Connectionism borrows heavily from associationism and is a term that covers neural networks and parallel distributed processing (PDP). Neural networks seek to explain cognition in biological or neurological terms, and PDP tries to show that information is not stored in the brain in one place but is distributed throughout the various parts of the brain which serve certain linguistic and non-linguistic functions. Generally PDP and connectionism are seen as being synonymous. Associationism, by contrast, does not contain many of the more advanced and sophisticated notions of connectionism (see Bechtel and Abrahamsen (1991) or Cohen et al.
(1993) for reviews in this area). There is no unified agreement on what exactly connectionism is; however, most connectionist models seem to share several properties. Connectionist architectures of cognition are loosely based on the architecture of the brain. Connectionists do not use neurological terms such as synapses and neurons directly, but instead use the terms nodes and networks, which are said to represent a crude but effective approximation of the neural state of the brain at a superficial level. These nodes are massively interconnected with other nodes to form a network of interconnections, hence the term connectionism. Each of these nodes can be connected to many different networks. The knowledge is stored in these interconnections and is associated with other kinds of knowledge contained in the network and in other networks, hence the relationship to associationism. Connectionists believe that these interconnections store the lexical information; however, this does not mean that the information is stored in one place (one cannot look inside the brain and find a particular word, for example), but rather in the interconnections between the nodes in the form of a network. One could visualize that the representation of a word might involve interconnections between various parts of the network, for example to the phonological, semantic or orthographic parts of the network. From this we can see that the knowledge is distributed among many interconnections. This distribution of information provides us with several advantages, which will be discussed later. Some connectionists believe that information is related in the brain in the form of massively interconnected sub-networks rather than as a simple unified system. These sub-networks store information that can be accessed by other sub-networks.
For example, a sub-network of morphological knowledge can connect with a sub-network of word roots, which in turn can connect to a semantic sub-network which stores meanings of words. While the exact make-up of these interconnections is not known, we do have some insights from our knowledge of the mental lexicon as to what it might look like. Future research may be able to clarify this knowledge. From the interaction of these inter-related networks we can form the meaning of a word and find the correct word to choose. If we have to find a past tense form, for example, the morphological network can be tapped to retrieve it. Each sub-network making up a 'word', as such, would be connected to hundreds or thousands of other nodes making up a mini-network for that word. These sub-networks will be connected to areas of the brain that control the phonological, speech and auditory functions, as well as the storing of lexical-specific information. The sum of all these interconnections for that word makes up the knowledge about that word which the learner has. Therefore, a well-known word will have a very intricate network of interconnections and less well-known words will have fewer interconnections. A different word would have a different set of nodes connected to hold that information - another mini-network. It may, of course, share many of the same nodes as other words, or may not, depending on the make-up of that word. In essence then, our lexicon (or lexicons) is made up of hundreds or thousands of these sub-networks, all massively interconnected to form the lexicon. Within a network the nodes are organized into 'levels' such that any one node excites or inhibits other nodes at its own or different levels. Patterns, habits and rules are not stored in these interconnections, but what is stored are the interconnection strengths that allow these
patterns and rules to be recreated. Knowledge is seen at the micro-structure level rather than the macro-structure level of cognition. Therefore, the strength of the interconnections reflects the relative knowledge one has about an item of vocabulary. Prototypical representations of the lexical environment emerge as a natural outcome of the learning process. See Bechtel and Abrahamsen (1991), Broeder and Plunkett (1994), and Ney and Pearson (1990) for more detail in these areas. Learning, therefore, is a by-product of processing. The reader should immediately notice several things about this network and the limitations of representing the network in diagrammatic form. The first, and most obvious, is that the representation is incomplete and is only a partial representation of a learner's knowledge of see. That said, the concept of see in diagram 1 is distributed among many interconnections, some of which are thin and some of which are thick. The stronger the interconnection, the thicker the line and the more 'well-known' the information is; a thinner line represents less 'well-known' information. The learner is relatively sure that see means something like 'an image comes to my eyes' and that it collocates with some objects. She is less sure in her knowledge (partial knowledge) that the past tense of see is saw, pronounced /sɔ:/, represented by the thinner line. Secondly, diagram 1 shows, for diagrammatic purposes only, nodes that have been labeled 'meaning', 'past tense ending', 'preposition use' and 'objects that collocate', with their sub-categories. The learner could assign labels to these quite differently, and in fact might not have them categorized as shown but in some completely different way, reflecting her own view of the word see. Alternatively, these nodes might not exist at all, or there might be no interconnections between them, reflecting no knowledge between these nodes and thus no knowledge of see.
The diagram does not show all the other possible nodes about see - for example, there are no nodes for its 'idiomatic use', or for the knowledge that see is pronounced the same as sea, and so on. Thirdly, each of the nodes and sub-categories, such as 'meaning', is shown as being connected to other parts of the network by the lines leaving the diagram. Therefore, the network is immensely complex in structure. It would take only a little imagination to conceive of a diagram which could represent knowledge of 'affix knowledge', 'words I have problems with', 'semantic networks', 'words to use when apologizing in German' and indeed many facets of vocabulary acquisition all linked together. Such a highly interconnected network would, of course, be beyond diagrammatic representation.

What can the model demonstrate?

Associative learning. Clearly, the associative nature of vocabulary is shown here. Each network of knowledge is connected to many other networks. This model can also demonstrate how we could instantiate knowledge from the network in a schematic way. One piece of lexical information connects to another and can instantiate a related idea or word (see Rumelhart and Ortony, 1977 for a discussion of schema). Schema theory has shown us the importance of background knowledge and the relationship it has to comprehension (see Brewer and Treyens, 1981 for an example). Sometimes learners cannot comprehend a lexical item due to insufficient conceptual development or lack of background knowledge. This model can show the interconnections (or lack thereof) to non-linguistic knowledge that can hamper comprehension. By the same token, if a learner comes across a new word he may be able to guess its meaning from context using prior lexical knowledge. Clearly, the richer the network of associations, the more chance there will be of comprehension. The learning of an L2 lexicon would involve deepening and enriching these networks and their interconnections.
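The associative instantiation described above can be illustrated with a toy lexical network. The miniature 'lexicon', its entries and its association strengths are all invented for illustration, and a real connectionist model would store this as distributed connection weights rather than an explicit table:

```python
# Toy associative lexicon: each word is linked to related words with an
# invented interconnection strength between 0 and 1.
associations = {
    "soccer": {"ball": 0.9, "goal": 0.7, "grass": 0.3},
    "ball": {"soccer": 0.9, "round": 0.8},
}

def activate(word, threshold=0.5):
    """Activating a word instantiates its strongly associated words,
    as in schema-driven comprehension or guessing from context."""
    links = associations.get(word, {})
    return sorted(w for w, s in links.items() if s >= threshold)

print(activate("soccer"))  # strongly linked words become active
```

A richer network, with more entries and stronger links, would give the learner more routes to a word's meaning, which is the sense in which enriching interconnections aids comprehension.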
Partial knowledge. The model can account for full, partial and incorrect storage of lexical knowledge. Knowledge that things are not something can also be accounted for in this model. For example, this learner may explicitly know that the past tense of see is not seed (is not /i:d/). This would be represented by drawing a line to that part of the diagram - either thickly or thinly, depending on the strength of that knowledge. Incremental learning. Interlanguage phenomena point to a learner system whereby learning is incremental, done in successive and/or recursive steps. This model reflects this well, as it can account for information that is part of neither the L1 nor the L2, but is nevertheless systematic and which the learner is constantly updating (or has fossilized) (see Klein, 1986). Content addressability. Word knowledge in this network is content addressable. This means that if a learner is asked for a word that means 'a round, hollow leather thing that you can play soccer with', or asked for the meaning of 'soccer ball', he can answer from both directions. Therefore, the information stored in a connectionist architecture can be accessed in many ways. Individual variation. Each learner will have a different network of associations and interconnections. Clearly the L1 can intrude on the transfer of L1 lexical knowledge. This model can account for learner variation, even where learners from the same L1 with the same input have differing lexicons. Some SLVA researchers have proposed different lexicons serving different purposes, such as productive or receptive L2 lexicons. In this model there is no reason to assume that sub-networks for separate lexicons could not exist side by side or be interconnected. At the other extreme, those who say there is only one lexicon for all lexical knowledge, from both the L1 and the L2, could also be accommodated here. Advantages of a distributed network. 
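The content-addressability point, that the same stored link can be queried from either direction, can be sketched as follows. This is my own toy illustration under assumed data, not a claim about how any actual connectionist model is implemented.

```python
# Sketch of content-addressable lexical storage: a single set of
# form-meaning links answers queries from either side, mirroring the
# 'soccer ball' example in the text.
links = [
    ("soccer ball", "round hollow leather thing you play soccer with"),
    ("collapse", "fall down suddenly"),
]

def lookup(query, links):
    """Return every item linked to `query`, whichever side of the link it is on."""
    hits = []
    for form, meaning in links:
        if query == form:
            hits.append(meaning)
        elif query == meaning:
            hits.append(form)
    return hits
```

The same `links` list serves both the production-like query (meaning to form) and the comprehension-like query (form to meaning), which is the sense in which the stored information "can be accessed in many ways".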
The knowledge is stored in the interconnections; therefore each node can connect to many networks. A major advantage of this is economy of the network, in the sense that a single node can be connected to many others, allowing one node to form part of many representations. This in turn allows the parallel processing of information, whereby the brain can process many things at once. Clearly we receive many different forms of input at any given time, all of which must be processed simultaneously. Another advantage of a distributed system is that if one part of the system deteriorates (for example, a given word is known but temporarily cannot be recalled) the whole system does not break down, as the forgotten word will be connected to other words which could replace it. This is often called graceful degradation. For example, if a learner had learned collapse but, when called upon to produce it, cannot access it, a substitute could be found from within the network, such as fall down. The network thus has built-in redundancy: the capacity to continue correct operation despite the loss of part of the information comes from the fact that the original network had encoded more information than was necessary to maintain it. Human-like behaviour. One of the main achievements of a connectionist system is that it can process information and learn in ways which mirror some aspects of human learning and information processing, such as pattern matching, spontaneous generalization, stimulus categorization and concept learning. This makes these models very attractive to psychologists in particular. In addition, some of the models developed have been able to model at least some specific aspects of human performance (see Cohen et al., 1993, and Haberlandt, 1994 for numerous examples). Generation from experience.
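The graceful-degradation point, that a temporarily inaccessible word is replaced by a weaker neighbour rather than the whole retrieval failing, can be sketched like this. It is a hypothetical illustration of mine using the article's collapse / fall down example; the concept label and strength values are invented.

```python
# Sketch of graceful degradation: words for a concept are ranked by
# link strength, and retrieval falls back to the strongest *accessible*
# word instead of failing outright.
lexicon = {
    "building-falls": [("collapse", 0.9), ("fall down", 0.6)],
}

def retrieve(concept, lexicon, accessible):
    """Return the strongest currently accessible word for `concept`, else None."""
    candidates = sorted(lexicon.get(concept, []), key=lambda wv: -wv[1])
    for word, _strength in candidates:
        if word in accessible:
            return word
    return None
```

When collapse is momentarily blocked, the network still yields fall down; the redundancy in the stored links is what keeps the system operating.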

Connectionist systems have the ability to automatically or spontaneously generalize from experience. This can be done both productively and receptively. For example, if a learner knows the affix '-ist' can refer to a person doing a particular kind of job or work, then when the learner meets an unknown word ending in '-ist' he can guess that it refers to a person doing a certain type of work. Similarly, if a learner wanted to generate a word, '-ist' could be added to a person's job to create that word. For example, he might generate 'pianist' (if it was not known) from knowledge about piano, work, and -ist. Alternatively, he could create a novel word such as 'computerist'. Furthermore, overgeneralization of lexical applications can be explained in these terms. Lack of lexical knowledge can be represented. Beginning vocabulary learners often will not have an L2 network set up for some words/concepts, let alone one that can find substitutes when needed. This lack of a developed network could help to account for why learners are at a loss for words at times - simply, the network has not been set up or it contains insufficient knowledge. It would therefore follow that lexical items which are not repeated or met frequently could have tenuous interconnections. Therefore there needs to be constant practice and reinforcement. It should be noted that this practice is not behaviourist in the sense that each item needs repeating many times and that this is the only way to learn. What it does mean is that the interconnections may need reinforcing to strengthen them. Learning under this model. As we learn, we constantly match new input to old information and adjust our knowledge store network according to the new information. 
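The two directions of the '-ist' generalization described above, receptive guessing and productive coining, can be sketched in code. This is a deliberately crude illustration of my own: real affixation is messier (piano drops its final vowel in pianist), and the string operations here are assumptions, not a model the article proposes.

```python
# Sketch of spontaneous generalization over the '-ist' affix.
# Receptively: an unknown '-ist' word is guessed to name a kind of worker.
# Productively: '-ist' is attached to a known base to coin a (possibly novel) form.
def guess_receptively(word):
    """Guess that an unfamiliar '-ist' word denotes a person doing that work."""
    if word.endswith("ist"):
        return "a person whose work involves " + word[:-3]
    return None

def coin_productively(base):
    """Generate an agent noun from a base, e.g. the novel coinage 'computerist'."""
    return base + "ist"
```

The same over-simple rule also produces overgeneralizations, which is exactly how the article explains overgeneralized lexical applications: the pattern is applied wherever the form fits, correct or not.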
Our processing of the input affects our future potential output, in that the present knowledge store has been altered by new input and a new status quo is established until further input comes along to confirm the present state or lead us to revise it again. Connectionism rests on the assumption that we learn by trial and error, in successive steps, incrementally and through exposure to input. Successive steps in the learning process alter the associative interconnections by strengthening or weakening them. The more well known a piece of word knowledge is, the stronger the interconnection that makes up that part of the word's knowledge. This matches the view that a new word will not be learned completely on first meeting, but that knowledge of that word (such as its pronunciation, spelling, 'grammatical' features, collocates, register and so on) will incrementally grow with the number of times the word is met in various contexts. It will be a rare occasion that a new word is learned in one trial with all its features readily available for use, though connectionist networks do not prevent this happening. This does not mean that a word cannot be learned in one trial, despite the fact that present connectionist simulations cannot do so. The network or sub-networks making up the lexicon are ever changing, and one could view the lexicon as never resting. Imagine for a moment that we could take a snapshot of the network at rest. If a network representing, say, the word/concept 'do' were caught at rest, we could see that some interconnections were strong, reflecting perceived well-known information (even if it is wrong), and others were weak, reflecting less well-known information. As new information is added, new interconnections are made to different nodes to account for it. 
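The incremental strengthening and weakening of interconnections can be sketched with a toy weight-update rule, here applied to the article's do/commit a crime collocation. The update rule and its rate are my own assumptions, not the article's; the point is only that one correction shifts the weights without immediately reversing the learner's preference.

```python
# Toy sketch of incremental collocation learning: each exposure to a
# correction nudges the link weights, so the preference changes only
# after repeated exposure, not on the first trial.
weights = {("do", "crime"): 0.8, ("commit", "crime"): 0.1}

def expose(weights, correct_pair, wrong_pair, rate=0.2):
    """Strengthen the corrected link and weaken the competing one (clamped to [0, 1])."""
    weights[correct_pair] = min(1.0, weights[correct_pair] + rate)
    weights[wrong_pair] = max(0.0, weights[wrong_pair] - rate)

def preferred_verb(weights, noun):
    """Return the verb with the strongest link to `noun`."""
    return max((v for (v, n) in weights if n == noun),
               key=lambda v: weights[(v, noun)])
```

After a single correction the learner still prefers 'do' (0.6 versus 0.3); only after several more exposures does 'commit' win out, which matches the claim that a link being made is not the same as the collocation being learned.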
The strength of these interconnections is altered by the input, strengthening some interconnections (for example, confirming that we do in fact say 'do the washing') and weakening interconnections in other parts of the mini-network making up 'do'. For example, if a learner said 'do a crime' and was corrected, the learner could then connect 'crime' with 'commit' rather than with 'do', making a new interconnection. This would not mean that the collocation had been 'learned', but that a link had been made; the learner will probably continue to use 'do' in preference to 'commit' until the network has been so altered through repeated exposure, practice and use that it reflects the preference for 'commit' over 'do'. Evidence from second language data. Very little work on connectionism has been done in second languages. Notable exceptions are Schmidt's review (1990); Broeder and Plunkett's study of the developmental order for pronouns in L2s (1994); Blackwell and Broeder's work on frequency (1992); Gasser's work on word order (1988, 1990); Shirai's work on L1 transfer (1992); and Marchman (1992) and Sokolik and Smith's work on critical periods (1992). This lack of work does not mean a lack of interest, however, and is understandable in that the field is only 10 years old. Most of the research has been done in the first language, and it is only very recently that work has started on second languages. An extensive search found no specific studies of second language vocabulary acquisition from a connectionist perspective. This may be due to the very complex and multi-faceted nature of SLVA, and to the fact that researchers may be more interested in the bigger picture of SLA than in SLVA in particular. What can the model not demonstrate? The working mind, intention and higher cognitive functions. One thing missing in many discussions of connectionism is the conscious working mind - the things that we call consciousness, memory, intention and so on. 
These are often referred to as higher cognitive functions. These quite obviously exist in some form or another, as we can all say we have them. Purist connectionism views them as by-products of the processing of information, whereas more traditional views of cognition (the current dominant experimental paradigm) see the mind as somehow broken into parts. This modular view says we have different forms of memory and storage, and that these can be tested in certain ways to find out how our lexicons work. It is clear that there are levels of human processing for which PDP models may not be an appropriate level of analysis, at least given the current generation of PDP models. If the higher cognitive functions exist in PDP terms, we would need to be able to explain why some parts of a PDP system are transparent and others are not. Universal application. It is not generally accepted, even by PDP researchers, that a connectionist (=PDP) model can account for all areas of human cognition, although many try to resist external explanations. The challenge for these researchers, then, is to develop a system to 'account for the phenomena which are handled rather well by rules but also, without additional mechanisms, give an elegant account of other phenomena as well' (Bechtel and Abrahamsen, 1991, p. 217). Connectionist models are good at the lower levels of cognition, such as content addressability, low-level perception and spontaneous generalization. However, there has been little success in discovering such examples at higher levels of cognition. It may be that we should not be trying to explain all things at all levels, but could fall back on the idea of levels (to be outlined below). Capturing syntactic structure. Fodor and Pylyshyn (1988) argue that a connectionist system cannot capture the representation of syntax well. 
Their example says that a PDP system can connect Joan, loves and florist in 'Joan loves the florist' to give it meaning, but it cannot discriminate that sentence, in semantic terms, from the relationship in 'The florist loves Joan'. A network could add the representation, but it could not disambiguate the two sentences. Therefore, they say, a PDP system is inadequate to the task of representing syntactic knowledge. This is despite Chomsky (1986) stating recently that the generatability of syntax is no longer the goal of generative linguistics. Developmental sequences. Fodor and Pylyshyn state that the model is not good at learning in the developmental stages which a rule-based approach can capture. This ignores the fact that some aspects of language tend to be rule-governed and some do not. All languages have exceptions, such as go/went in English and suru (do) and kuru (come) in Japanese. PDP systems do in fact go through developmental stages, as do first language learners.

This is not clear for second language learners, however. In the first phase the systems tend to catch the irregularities by rote; in the second phase they concentrate on rule-governed regularities. In the final stage the model strikes a balance between the two poles of regularity and irregularity, and even overgeneralizes at times as children do (e.g. feets instead of feet). Non-human behaviour. Because these systems are by their very nature not transparent, they cannot be tested empirically at the micro level of cognition, and we are left with computer simulations of learning. These computer models cannot model human behaviour exactly, and indeed sometimes generate very non-human responses. In addition, the computer simulations take a long time to learn, whereas humans can learn in one trial; new simulations need to be developed to account for these inadequacies. Differences from symbolic processing. It is important to distinguish connectionism from a symbolic account of learning and knowledge storage. In symbolic systems, word knowledge is couched in terms of parts of speech, such as nouns and verbs, or semantic groups, such as 'words for travel', and so on, each having a label for the kind of knowledge stored - a symbol for that knowledge. Typically, symbolic systems have rules by which this information can be processed and rules which state what is impossible in a language. Symbolic systems are context-insensitive in that they are distinct from their environment. Elman (1991, p. 221) says 'this insensitivity allows for the expression of generalizations which are fully regular at the highest level of representation (e.g. purely syntactic), but they require additional apparatus to account for regularities which reflect the interaction of meaning with form and which are more contextually defined. Connectionist models on the other hand begin the task at the other end of the continuum. 
They emphasize the importance of context in the interaction of form with meaning'. Symbolic systems, therefore, are subject to the fallacy that things can only be referred to in symbolic terms, and so do not connect themselves to the real world. That is, an alien listening to us via radio signal might learn the sounds of the language but not the semantics, unless it could observe a word's relationship with the objects and events to which it refers. A network system, by comparison, can deal with anomalies by adding further assumptions. See Johnson-Laird et al. (1984) for a review in this area. Symbolic systems such as the generative linguistic paradigm would account for linguistic knowledge in terms of nouns, subjects, objects and so on. These terms do not exist in purist (PDP) systems. PDP researchers will accept that rules can be stored in a connectionist network, but they are not the foundation stone on which the network is made. This means that under a PDP paradigm the symbolic system loses its causal role in cognition - an unacceptable outcome to many linguists, as a typical UG proponent would see these rules as essential to human linguistic processing. However, it may be that aspects of human performance which appear so regular as to be conveniently summarized by rules (like the rules of grammar in a language) arise out of the general properties of parallel distribution, which operate without any reference to such rules. Recent debates by Fodor and Pylyshyn (1988) and between Jacobs and Schumann (1992) and Eubank and Gregg (1995), and reviews by Bechtel and Abrahamsen (1991) and Morris (1989), underscore these differences. These debates take place on the basis that accepting one view means the other is unacceptable. Both sides tend to see things in extreme terms - a universal take-it-all-or-leave-it-all view (Pinker and Prince, 1988). 
Neither side has produced evidence for this universality, and clearly both have their limitations (see Cohen et al., 1993 for a review). However, if one views the connectionist/symbolic argument in terms of a non-universal answer, then the situation changes somewhat and one can see things in terms of a complementary rather than a confrontational stance. Clearly much has been learned, in cognitive terms, about the workings of memory in relation to vocabulary learning in a second language (see Nation, 1990 for a review), but such accounts offer little in the way of insights into the micro-view of cognition which connectionism seems to explain quite well. It seems, therefore, that the issue of whether the current symbolic paradigm or connectionism is the one and only explanation misses the point. Summary. Connectionist systems of vocabulary acquisition have many characteristics that are desirable in simulations of human cognition, for example graceful degradation, automatic generalization and so on. Many of these are found in other models of cognition, but it is unusual to find so many in one model. These models show the learning process over time; this is important, as most studies in SLVA have been cross-sectional in nature. There are parts of our cognitive apparatus which are open to inspection, transparent in nature and empirically testable, such as memory span, lexical competence, attention and so on. There are other parts of our cognitive system which are not open to inspection, such as how we retrieve lexical information from the brain or how we process auditory information and add it to our store. A connectionist account of lexical knowledge is good at describing the storehouse of vocabulary: how words are connected through their associations; how we may store and retrieve lexical knowledge; how lexical knowledge is schematic or associative; how it can substitute for lack of knowledge; how we can guess the meaning of words, and so on. 
It seems that the connectionist architecture could operate at a lower, 'impenetrable' level of cognitive activity which we are not able to access by introspection; in a sense it is unavailable to us, and the interconnections are made automatically, without our intervention. The transparent part of our cognitive system may operate at a higher level and would include what we know about memory and so on. This would lead to a two-level interdependent model of vocabulary acquisition. It would make sense to have a two-level hybrid system because the symbolic machine operates according to its own autonomous set of principles. This view is the one currently coming into fashion (see Kempen, 1992; Marcus et al., 1992, 1993; and Pinker, 1991).
