Pygmalion Effect PDF
In the mid-1950s I nearly ruined the results of my doctoral dissertation at UCLA. The sordid details are available elsewhere (Rosenthal, 1985) but briefly, it appeared that I might have treated my research participants in such a way as to lead them to respond in accordance with my experimental hypothesis or expectancy. All of this was quite unwitting, of course, but it did raise a sobering question about the possibility of interpersonal expectancy effects in the psychological laboratory. If it were my unintentional interpersonal expectancy effect or my "unconscious experimenter bias" that had led to the puzzling and disconcerting results of my dissertation, then presumably we could produce the phenomenon in our own laboratory and with several experimenters rather than just one. Producing the phenomenon in this way not only would yield the scientific benefit of demonstrating an interesting and important concept; it would also yield the very considerable personal benefit of showing that I was not alone in having unintentionally affected the results of my research by virtue of my bias or expectancy.
This chapter is based in part on an invited address given to the Teachers of Psychology in the Secondary Schools (TOPSS) and subsequently published in Psychology Teacher Network (PTN), 1998, Vol. 8, pp. 2–4, 9. It is an updated version of papers cited in the references of the PTN paper. Correspondence concerning this chapter should be addressed to Robert Rosenthal, Department of Psychology, University of California, Riverside, CA 92521-0426.
Improving Academic Achievement Copyright 2002, Elsevier Science (USA). All rights reserved.
SOME EARLY RESULTS
Human Subjects
Robert Rosenthal
In the first of our studies employing human subjects, 10 students of psychology, both undergraduate and graduate, served as the experimenters (Rosenthal & Fode, 1963b). All were enrolled in an advanced course in experimental psychology and were already involved in conducting research. Each student experimenter was assigned as his or her research participants about 20 students of introductory psychology. The procedure was for the experimenters to show a series of 10 photographs of people's faces to each of their participants individually. Participants were to rate the degree of success or failure shown in the face of each person pictured in the photos. Each face could be rated at any value from -10 to +10, with -10 meaning extreme failure and +10 meaning extreme success. The 10 photos had been selected so that, on the average, they would be seen as neither successful nor unsuccessful, but quite neutral, with an average numerical score of zero. All 10 experimenters were given identical instructions on how to administer the task to their participants and were given identical instructions to read to them. They were cautioned not to deviate from these instructions. The purpose of their participation, it was explained to all experimenters, was to see how well they could duplicate experimental results that were already well-established. Half the experimenters were told that the "well-established" finding was such that their participants should rate the photos as being of successful people (ratings of +5) and half the experimenters were told that their participants should rate the photos as being of unsuccessful people (ratings of -5). Results showed that experimenters expecting higher photo ratings obtained higher photo ratings than did experimenters expecting lower photo ratings. Subsequent studies tended to obtain generally similar results (Rosenthal, 1969; Rosenthal & Rubin, 1978).
Animal Subjects
Pfungst's work with Clever Hans and Pavlov's work on the inheritance of acquired characteristics had both suggested the possibility of experimenter expectancy effects with animal subjects (Gruenberg, 1929; Pfungst, 1965). In addition, Bertrand Russell (1927) had noted this possibility, adding that animal subjects take on the national character of the experimenter. As he put it: "Animals studied by Americans rush about frantically, with an incredible display of hustle and pep, and at last achieve the desired result by chance. Animals observed by Germans sit still and think, and at last evolve the solution out of their inner consciousness" (pp. 29–30).
But it was not only the work of Pavlov, Pfungst, and Russell that made us test the generality of experimenter expectancy effects by working with animal subjects. It was also the reaction of my friends and colleagues who themselves worked with animal subjects. That reaction was: "Well of course you'd find expectancy effects and other artifacts when you work with humans; that's why we work with rats."
A good beginning might have been to replicate with a larger sample size Pfungst's research with Clever Hans; but with horses hard to come by, rats were made to do (Rosenthal & Fode, 1963a). A class in experimental psychology had been performing experiments with human participants for most of a semester. Now they were asked to perform one more experiment, the last in the course and the first employing animal subjects. The experimenters were told of studies that had shown that maze-brightness and maze-dullness could be developed in strains of rats by successive inbreeding of the well and the poorly performing maze runners. Sixty laboratory rats were equitably divided among the 12 experimenters. Half the experimenters were told that their rats were maze-bright and the other half were told their rats were maze-dull. The animal's task was to learn to run to the darker of two arms of an elevated T-maze. The two arms of the maze, one white and one gray, were interchangeable; and the "correct" or rewarded arm was equally often on the right as on the left. Whenever animals ran to the correct side they obtained a food reward. Each rat was given 10 trials each day for 5 days to learn that the darker side of the maze was the one that led to the food.
Beginning with the first day and continuing on through the experiment, animals believed to be better performers became better performers. Animals believed to be bright showed a daily improvement in their performance, while those believed to be dull improved only to the third day and then showed a worsening of performance. Sometimes an animal refused to budge from the starting position.
This happened 11% of the time among the allegedly bright rats; but among the allegedly dull rats it happened 29% of the time. When animals did respond and correctly so, those believed to be brighter ran faster to the rewarded side of the maze than did even the correctly responding rats believed to be dull.
When the experiment was complete, all experimenters rated their rats and their own attitudes and behavior vis-à-vis their animals. Those experimenters who had been led to expect better performance viewed their animals as brighter, more pleasant, and more likable. These same experimenters felt more relaxed in their contacts with the animals and described their behavior toward them as more pleasant, friendly, enthusiastic, and less talkative. They also stated that they handled their rats more and also more gently than did the experimenters expecting poor performance.
The next experiment with animal subjects also employed rats, this time using not mazes but Skinner boxes (Rosenthal & Lawson, 1964). Because the experimenters (39) outnumbered the subjects (14), experimenters worked in teams of two or three. Once again about half the experimenters were led to believe that their subjects had been specially bred for excellence of performance. The experimenters who had been assigned the remaining rats were led to believe that their animals were genetically inferior. The learning required of the animals in this experiment was more complex than that required in the maze learning study. This time the rats had to learn in sequence and over a period of a full academic quarter the following behaviors: to run to the food dispenser whenever a clicking sound occurred; to press a bar for a food reward; to learn that the feeder could be turned off and that sometimes it did not pay to press the bar; to learn new responses with only the clicking sound as a reinforcer (rather than the food); to bar-press only in the presence of a light and not in the absence of the light; and, finally, to pull on a loop that was followed by a light that informed the animal that a bar-press would be followed by a bit of food. At the end of the experiment the performance of the animals believed to be superior was superior to that of the animals believed to be inferior, and the difference in learning favored the allegedly brighter rats in all five of the laboratory sections in which the experiment was conducted.
Lenore replied, mainly to discuss concerns over the ethical and organizational implications of creating false expectations for superior performance in teachers. If this problem could be solved, her school would be ideal, she felt, with children from primarily lower-class backgrounds. Lenore also suggested gently that I was "a bit naive" to think one could just tell teachers to expect some of their pupils to be "diamonds in the rough." We would have to administer some new test to the children, a test the teachers would not know. Phone calls and letters followed, and in January of 1964, a trip to South San Francisco to settle on a final design and to meet with the school district's administrators to obtain their approval. This approval was forthcoming because of the leadership of the school superintendent, Dr. Paul Nielsen. Approval for this research had already been obtained from Robert L. Hall, Program Director for Sociology and Social Psychology for the National Science Foundation, which had been supporting much of the early work on experimenter expectancy effects.
An Unexpected Finding
At the time the Pygmalion experiment was conducted there was already considerable evidence that interpersonal self-fulfilling prophecies could occur, at least in laboratory settings. It should not then have come as such a great surprise that teachers' expectations might affect pupils' intellectual development. For those well-acquainted with the prior research, the surprise value was, in fact, not all that great. There was, however, a surprise in the Pygmalion research. For this surprise there was no great prior probability, at least not in terms of many formal research studies.
At the end of the school year of the Pygmalion study, all teachers were asked to describe the classroom behavior of their pupils. Those children in whom intellectual growth was expected were described as having a better chance of becoming successful in the future, as more interesting, curious, and happy. There was a tendency, too, for these children to be seen as more appealing, adjusted, and affectionate, as less in need of social approval. In short, the children in whom intellectual growth was expected became more intellectually alive and autonomous, or at least were so perceived by their teachers.
But we already know that the children of the experimental group gained more intellectually, so that perhaps it was the fact of such gaining that accounted for the more favorable ratings of these children's behavior and aptitude. But a great many of the control group children also gained in IQ during the course of the year. We might expect that those who gained more intellectually among these undesignated children would also be rated more favorably by their teachers. Such was not the case. The more the control group children gained in IQ the more they were regarded as less well-adjusted, as less interesting, and as less affectionate. From these results it would seem that when children who are expected to grow intellectually do so, they are benefited in other ways as well. When children who are not specifically expected to develop intellectually do so, they seem either to show accompanying undesirable behavior or at least are perceived by their teachers as showing such undesirable behavior.
If children are to show intellectual gain, it seems to be better for their real or perceived intellectual vitality, and for their real or perceived mental health, if their teacher has been expecting them to grow intellectually. It appears worthwhile to investigate further the proposition that there may be hazards to unpredicted intellectual growth (Rosenthal, 1974).
Reactions to Pygmalion
Reactions to Pygmalion were extreme. Many were very favorable, many were very unfavorable. Elsewhere we have noted in considerable detail the best known of these negative criticisms and given reasons why they were not compelling (Rosenthal, 1985, 1987, 1995; Rosenthal & Rubin, 1971, 1978). For our present purposes, it will be enough to give a very brief overview of these criticisms and why they were not compelling.
1. The analyses of the data were criticized on the grounds that the analyses should have focused on classrooms as a whole rather than on individual children. In fact, we had analyzed the data both ways, that is, by classrooms and by children, a fact made clear in our report, and both ways gave essentially the same results.
2. A second criticism claimed that because the same IQ test had been employed for both the pretest and the posttest, the study might suffer from practice effects. It puzzles us how practice effects could bias the results of a randomized experiment. If practice effects were so great as to drive everyone's performance up to the limit, or ceiling, of the test, then practice effects could operate to diminish the effects of the experimental manipulation but they could not operate to increase those effects.
3. A third criticism was that teachers themselves administered the group tests of IQ. As it turned out, however, when children were retested by testers who were blind to the experimental conditions, indeed to the existence of experimental conditions, the effects of teacher expectations actually increased rather than decreased.
4. A fourth criticism was that we had employed group tests of IQ, tests that were less reliable than individually administered tests of IQ. This criticism suggested that the teacher expectancy effect might be due to this greater unreliability of the test instrument. As it turns out, however, lower reliability of a test instrument makes it harder, not easier, to obtain statistically significant results.
5. A fifth criticism was that the IQ of the youngest children had been measured with low validity. Actually, the validity of the measures of even these youngest children (r = .65) was substantially higher than the validity of many IQ tests, and much higher than the validity of psychological tests in general (Cohen, 1988). Incidentally, even if we set aside the results from these youngest children, ample evidence remains for the operation of teacher expectancy effects.
6. A sixth criticism suggested that the data should have been transformed mathematically before being analyzed. Critics then transformed the original data of Pygmalion in eight different ways. Some of these transformations were seriously biased (e.g., discarding data showing greater teacher expectancy effects). Despite this, however, none of the transformations gave results noticeably different from those reported in the Pygmalion experiment. For total IQ, every transformation gave a significant result when one had been reported in the Pygmalion experiment. When verbal IQ and reasoning IQ were considered separately, the various transformations yielded more significant teacher expectancy effects than had been reported in Pygmalion.
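The reply to the fourth criticism can be illustrated with classical test theory. Under that model, random measurement error inflates the observed standard deviation without shifting group means, so an observed standardized difference shrinks by the square root of the test's reliability. The sketch below is illustrative only: the true effect size and the reliabilities are made-up values, not figures from the Pygmalion data, and the function name is hypothetical.

```python
import math


def attenuated_effect(true_d, reliability):
    """Observed standardized mean difference when the outcome measure
    has the given reliability (classical test theory assumption)."""
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return true_d * math.sqrt(reliability)


true_d = 0.50  # hypothetical true effect, in true-score SD units
for reliability in (1.0, 0.9, 0.7, 0.5):
    observed = attenuated_effect(true_d, reliability)
    print(f"reliability={reliability:.1f} -> observed d={observed:.3f}")
```

On these assumptions a less reliable test can only shrink the observed effect, which is why unreliability works against, not for, finding a significant expectancy effect.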
A Heuristic Note
Before leaving the topic of negative criticisms of Pygmalion, one casual observation should be offered for any possible heuristic value it may hold. The bulk of the criticism of Pygmalion came neither from mathematical statisticians, nor from experimental social psychologists, nor from educators (though the president of a large teachers' union attacked Pygmalion bitterly as an affront to the good name of the teacher or the teachers' union). The bulk of the negative reactions came from workers in the field of educational psychology. Perhaps it was only they who would have been interested enough to respond. That seems unlikely, however, as many other kinds of psychologists regarded the Pygmalion effect as of great interest. We leave the observation as just a curiosity, one that might be clarified by future workers in the fields of the history, the sociology, and the psychology of science.
summarized by Raudenbush in which the teacher knew the children 2 weeks or less, the typical (median) size of the IQ increase due to induced favorable teacher expectations was about a quarter of a standard deviation. For the best known of the individually administered tests of IQ, that represents about 4 IQ points; in terms of SAT scores, that would represent a gain of about 25 points. In sum, it seems clear, then, based on the accumulated evidence, as well as on the evidence provided by the original Pygmalion experiment, that the educational self-fulfilling prophecy (Merton, 1948) has now been well established, and that is the first step in the scientific study of any phenomenon (Merton, 1987). For many years the central question in the study of interpersonal expectancy effects was whether there was any such thing. The replication evidence has answered that question sufficiently, based on the current full number of 479 studies, so that simple additional replications will add little new knowledge. The central questions in the study of interpersonal expectancy effects have changed so that now the more interesting questions include the specification of the variables that (a) moderate expectancy effects and (b) mediate expectancy effects. Moderator variables are preexisting variables such as sex, age, and personality that are associated with the magnitude of interpersonal expectancy effects; mediating variables refer to the behaviors by which expectations are communicated.
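The conversions from a quarter of a standard deviation to IQ and SAT points are simple arithmetic: the standardized effect is multiplied by the standard deviation of the scale in question. The sketch below reproduces them under assumed scale SDs that the text does not state (15 points for an individually administered IQ test, 100 points for an SAT section); the function name is hypothetical.

```python
def effect_in_scale_units(d, scale_sd):
    """Convert a standardized effect size d into raw scale points."""
    return d * scale_sd


# Assumed standard deviations (not given in the chapter).
IQ_SD = 15.0
SAT_SD = 100.0

d = 0.25  # median effect when teachers had known pupils 2 weeks or less

iq_gain = effect_in_scale_units(d, IQ_SD)
sat_gain = effect_in_scale_units(d, SAT_SD)
print(f"IQ gain: {iq_gain:.2f} points")
print(f"SAT gain: {sat_gain:.1f} points")
```

With these assumed SDs the IQ figure comes to 3.75 points, which rounds to the 4 points cited, and the SAT figure to 25 points.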
impressive. Teachers appear to teach more and to teach it more warmly to students for whom they have more favorable expectations. From these results we cannot infer that if we select warmer and more material-presenting teachers our nation's children will learn more. We also cannot infer from these results that training teachers to be warmer and more material-presenting will lead to improved learning on the part of our nation's children. Our results, however, do suggest that conducting the research required to determine the benefits of selection and training for climate (or affect) and input (or effort) may well yield substantial benefits both for science and for society.
on written work in English and social studies. Detailed feedback may be an indication that the teacher feels the child is worth the effort. Accepting shoddy work and giving grades higher than merited by the child's performance can be an indication of a teacher's belief that the child cannot improve over her or his current level. (Such beliefs about students are rarely warranted whether one is working in special education at the elementary school level or with doctoral students at major research universities.)
Q: When greater intellectual gains are expected of children by adults, why does this work to result in higher student achievement? What is taking place there?
A: There is considerable evidence now that the two most important factors mediating the effects of teachers' favorable expectations are affect and effort. Affect refers to the tendency of teachers to provide warmer, more pleasant socioemotional climates for students for whom they hold more favorable expectations. Effort refers to the tendency to teach more material to students for whom they hold more favorable expectations. Emotional warmth combined with high standards (tough warmth) may well communicate to students "I'm with you and I know you can do it."
References
Clark, K. B. (1963). Educational stimulation of racially disadvantaged children. In A. H. Passow (Ed.), Education in depressed areas. New York: Bureau of Publications, Teachers College, Columbia University.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
Gruenberg, B. C. (1929). The story of evolution. Princeton, NJ: Van Nostrand.
Harris, M. J., & Rosenthal, R. (1985). The mediation of interpersonal expectancy effects: 31 meta-analyses. Psychological Bulletin, 97, 363–386.
Harris, M. J., & Rosenthal, R. (1986). Four factors in the mediation of teacher expectancy effects. In R. S. Feldman (Ed.), The social psychology of education (pp. 91–114). New York: Cambridge University Press.
Merton, R. K. (1948). The self-fulfilling prophecy. Antioch Review, 8, 193–210.
Merton, R. K. (1987). Three fragments from a sociologist's notebooks: Establishing the phenomenon, specified ignorance, and strategic research materials. Annual Review of Sociology, 13, 1–28.
Pfungst, O. (1965). Clever Hans (C. L. Rahn, Trans.). New York: Holt, Rinehart & Winston. (Original work published 1911)
Raudenbush, S. W. (1984). Magnitude of teacher expectancy effects on pupil IQ as a function of the credibility of expectancy induction: A synthesis of findings from 18 experiments. Journal of Educational Psychology, 76, 85–97.
Raudenbush, S. W. (1994). Random effects models. In H. Cooper & L. V. Hedges (Eds.), The handbook of research synthesis. New York: Russell Sage Foundation.
Rosenthal, R. (1963). On the social psychology of the psychological experiment: The experimenter's hypothesis as unintended determinant of experimental results. American Scientist, 51, 268–283.
Rosenthal, R. (1969). Interpersonal expectations. In R. Rosenthal & R. L. Rosnow (Eds.), Artifact in behavioral research (pp. 181–277). New York: Academic Press.
Rosenthal, R. (1973). The mediation of Pygmalion effects: A four factor "theory." Papua New Guinea Journal of Education, 9, 1–12.
Rosenthal, R. (1974). On the social psychology of the self-fulfilling prophecy: Further evidence for Pygmalion effects and their mediating mechanisms (Module 53, pp. 1–28). New York: MSS Modular Pub.
Rosenthal, R. (1984). Meta-analytic procedures for social research. Newbury Park, CA: Sage.
Rosenthal, R. (1985). From unconscious experimenter bias to teacher expectancy effects. In J. G. Dusek, V. C. Hall, & W. J. Meyer (Eds.), Teacher expectancies (pp. 37–65). Hillsdale, NJ: Lawrence Erlbaum.
Rosenthal, R. (1987). Pygmalion effects: Existence, magnitude, and social importance. Educational Researcher, 16, 37–41.
Rosenthal, R. (1991a). Meta-analytic procedures for social research (rev. ed.). Newbury Park, CA: Sage.
Rosenthal, R. (1991b). Teacher expectancy effects: A brief update 25 years after the Pygmalion experiment. Journal of Research in Education, 1, 3–12.
Rosenthal, R. (1995). Critiquing Pygmalion: A 25-year perspective. Current Directions in Psychological Science, 4, 171–172.
Rosenthal, R. (1998). Interpersonal expectancy effects: A forty year perspective. Psychology Teacher Network, 8, 2–4, 9.
Rosenthal, R., & Fode, K. L. (1963a). The effect of experimenter bias on the performance of the albino rat. Behavioral Science, 8, 183–189.
Rosenthal, R., & Fode, K. L. (1963b). Three experiments in experimenter bias. Psychological Reports, 12, 491–511.
Rosenthal, R., & Jacobson, L. (1966). Teachers' expectancies: Determinants of pupils' IQ gains. Psychological Reports, 19, 115–118.
Rosenthal, R., & Jacobson, L. (1968). Pygmalion in the classroom. New York: Holt, Rinehart & Winston.
Rosenthal, R., & Jacobson, L. (1992). Pygmalion in the classroom (expanded ed.). New York: Irvington.
Rosenthal, R., & Lawson, R. (1964). A longitudinal study of the effects of experimenter bias on the operant learning of laboratory rats. Journal of Psychiatric Research, 2, 61–72.
Rosenthal, R., & Rubin, D. B. (1971). Pygmalion reaffirmed. In J. D. Elashoff & R. E. Snow (Eds.), Pygmalion reconsidered (pp. 139–155). Worthington, OH: C. A. Jones.
Rosenthal, R., & Rubin, D. B. (1978). Interpersonal expectancy effects: The first 345 studies. The Behavioral and Brain Sciences, 3, 377–386.
Russell, B. (1927). Philosophy. New York: Norton.
Smith, M. L. (1980). Teacher expectations. Evaluation in Education, 4, 53–55.
CHAPTER
Messages That Motivate: How Praise Molds Students' Beliefs, Motivation, and Performance (in Surprising Ways)
CAROL S. DWECK
Department of Psychology, Columbia University, New York
Why do some very bright students do poorly in school and end up achieving little in life? Why do other, seemingly less bright students rise to the challenges and accomplish far more than anyone ever expected? Much of my career has been devoted to answering these questions, and that's where social psychology comes in. One of the most important things social psychology has done is to show us how profoundly people's beliefs affect their behavior. This has been shown very clearly in the realm of students' motivation and achievement. Do students believe their intelligence is a fixed trait or an expandable quality? Do they believe their failures are due to a lack of effort or to a lack of ability? Do they believe they are doing a task to learn something new or to show how smart they are? These beliefs are key components of students' eagerness to learn, their love of challenge, and their ability to persist and thrive in the face of difficulty. This is why they are key factors in what students achieve, quite apart from their intellectual ability. The most exciting thing about this is that beliefs can be changed. So, even more important than showing that beliefs matter for students' motivation and