Three-Way ANOVA
You will need to use the following from previous chapters:

Symbols
k: Number of independent groups in a one-way ANOVA
c: Number of levels (i.e., conditions) of an RM factor
n: Number of subjects in each cell of a factorial ANOVA
NT: Total number of observations in an experiment

Formulas
Formula 16.2: SSinter (by subtraction); also Formulas 16.3, 16.4, 16.5
Formula 14.3: SSbet or one of its components

Concepts
Advantages and disadvantages of the RM ANOVA
SS components of the one-way RM ANOVA
SS components of the two-way ANOVA
Interaction of factors in a two-way ANOVA

So far I have covered two types of two-way factorial ANOVAs: two-way independent (Chapter 14) and the mixed design ANOVA (Chapter 16). There is only one more simple two-way ANOVA to describe: the two-way repeated measures design. [There are other two-way designs, such as those including random-effects or nested factors, but they are not commonly used; see Hays (1994) for a description of some of these.] Just as the one-way RM ANOVA can be described in terms of a two-way independent-groups ANOVA, the two-way RM ANOVA can be described in terms of a three-way independent-groups ANOVA. This gives me a reason to describe the latter design next. Of course, the three-way factorial ANOVA is interesting in its own right, and its frequent use in the psychological literature makes it an important topic to cover, anyway. I will deal with the three-way independent-groups ANOVA and the two-way RM ANOVA in this section and the two types of three-way mixed designs in Section B.

Computationally, the three-way ANOVA adds nothing new to the procedure you learned for the two-way; the same basic formulas are used a greater number of times to extract a greater number of SS components from SStotal (eight SSs for the three-way as compared with four for the two-way). However, anytime you include three factors, you can have a three-way interaction, and that is something that can get quite complicated, as you will see. To give you a manageable view of the complexities that may arise when dealing with three factors, I'll start with a description of the simplest case: the 2 × 2 × 2 ANOVA.
Chapter 22
A
CONCEPTUAL FOUNDATION
Table 22.1
                        Nurturing   Exploitive   Row Mean
Control:   Men             40           28          34
           Women           30           22          26
           Mean            35           25          30
At risk:   Men             36           48          42
           Women           40           88          64
           Mean            38           68          53
Column Mean                36.5         46.5        41.5
Figure 22.1
Graph of Cell Means for Data in Table 22.1
in amount of two-way interaction for men and women constitutes a three-way interaction. If the two graphs had looked exactly the same, the F ratio for the three-way interaction would have been zero. However, that is not a necessary condition. A main effect of gender could raise the lines on one graph relative to the other without contributing to a three-way interaction. Moreover, an interaction of gender with the experimenter factor could rotate the lines on one graph relative to the other, again without contributing to the three-way interaction. As long as the difference in slopes (i.e., the amount of two-way interaction) is the same in both graphs, the three-way interaction will be zero.
Main Effects
In addition to the three-way interaction there are three main effects to look at, one for each factor. To look at the gender main effect, for instance, just take the average of the scores for all of the men and compare it to the average of all of the women. If you have the cell means handy and the design is balanced, you can average all of the cell means involving men and then all of the cell means involving women. In Table 22.1, you can average the four cell means for the men (40, 28, 36, 48) to get 38 (alternatively, you could use the row means in the extreme right column and average 34 and 42 to get the same result). The average for the women (30, 22, 40, 88) is 45. The means for the other main effects have already been included in Table 22.1. Looking at the bottom row you can see that the mean for the nurturing experimenter is 36.5 as compared to 46.5 for the exploitive one. In the extreme right column you'll find that the mean for the control subjects is 30, as compared to 53 for the at-risk subjects.
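The averaging described above can be sketched in a few lines of Python. This is only an illustration, assuming a balanced design so that unweighted averages of cell means are appropriate; the dictionary layout and the function name `marginal_means` are my own choices, and the cell means are the eight values quoted in the text for Table 22.1.

```python
from statistics import fmean

# Cell means from Table 22.1, keyed by (alcohol group, gender, experimenter).
cell_means = {
    ("control", "men", "nurturing"): 40, ("control", "men", "exploitive"): 28,
    ("control", "women", "nurturing"): 30, ("control", "women", "exploitive"): 22,
    ("at risk", "men", "nurturing"): 36, ("at risk", "men", "exploitive"): 48,
    ("at risk", "women", "nurturing"): 40, ("at risk", "women", "exploitive"): 88,
}

def marginal_means(cells, position):
    """Average the cell means at each level of the factor stored at `position` in the key."""
    levels = {}
    for key, mean in cells.items():
        levels.setdefault(key[position], []).append(mean)
    return {level: fmean(values) for level, values in levels.items()}

alcohol_means = marginal_means(cell_means, 0)
gender_means = marginal_means(cell_means, 1)
experimenter_means = marginal_means(cell_means, 2)

print(alcohol_means)
print(gender_means)
print(experimenter_means)
```

With unequal cell sizes the cell means would have to be weighted, so this shortcut should not be carried over to unbalanced designs without modification.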
Figure 22.2
Graph of Cell Means in Table 22.1 after Averaging Across Gender
simple main effects (rather than the overall main effects), a significant three-way interaction suggests that we focus on the simple interaction effects: the two-way interactions at each level of the third variable (which of the three independent variables is treated as the third variable is a matter of convenience). Even if the three-way interaction falls somewhat short of significance, I would recommend caution in interpreting the two-way interactions and the main effects, as well, whenever the simple interaction effects look completely different and, perhaps, show opposite patterns. So far I have been focusing on the two-way interaction of alcohol and experimenter in our example, but this choice is somewhat arbitrary. The two genders are populations that we are likely to have theories about, so it is often meaningful to compare them. However, I can just as easily graph the three-way interaction using alcohol as the third factor, as I have done in Figure 22.3a. To graph the overall two-way interaction of gender and experimenter, you can go back to Table 22.1 and average across the alcohol factor. For instance, the mean for men in the nurturing condition is found by averaging the mean for control group men in the nurturing condition (40) with
Figure 22.3a
Graph of Cell Means in Table 22.1 Using the Alcohol Factor to Distinguish the Panels
Figure 22.3b
Graph of Cell Means in Table 22.1 after Averaging Across the Alcohol Factor
the mean for at-risk men in the nurturing condition (36), which is 38. The overall two-way interaction of gender and experimenter is shown in Figure 22.3b. Note that once again the two-way interaction is a compromise. (Actually, the two two-way interactions are not as different as they look; in both cases the slope of the line for the women is more positive, or at least less negative.) For completeness, I have graphed the three-way interaction using experimenter as the third variable, and the overall two-way interaction of gender and alcohol in Figures 22.4a and 22.4b.
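Collapsing a three-way table of means over one factor, as was just done for the alcohol factor, can be sketched as follows. The helper name `collapse` and the key layout are assumptions made for illustration, and the approach again relies on the design being balanced.

```python
from statistics import fmean

# Cell means from Table 22.1, keyed by (alcohol group, gender, experimenter).
cell_means = {
    ("control", "men", "nurturing"): 40, ("control", "men", "exploitive"): 28,
    ("control", "women", "nurturing"): 30, ("control", "women", "exploitive"): 22,
    ("at risk", "men", "nurturing"): 36, ("at risk", "men", "exploitive"): 48,
    ("at risk", "women", "nurturing"): 40, ("at risk", "women", "exploitive"): 88,
}

def collapse(cells, drop):
    """Average the cell means over the factor stored at key position `drop`."""
    groups = {}
    for key, mean in cells.items():
        reduced_key = key[:drop] + key[drop + 1:]
        groups.setdefault(reduced_key, []).append(mean)
    return {key: fmean(values) for key, values in groups.items()}

# Averaging across alcohol (position 0) leaves a gender x experimenter table.
gender_by_experimenter = collapse(cell_means, 0)

# Slope of each gender's line: exploitive mean minus nurturing mean.
slopes = {
    gender: gender_by_experimenter[(gender, "exploitive")]
    - gender_by_experimenter[(gender, "nurturing")]
    for gender in ("men", "women")
}

print(gender_by_experimenter)
print(slopes)
```

Passing `drop=1` or `drop=2` instead would produce the tables of means that underlie graphs averaged across gender or across experimenter.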
Figure 22.4a
Graph of Cell Means in Table 22.1 Using the Experimenter Factor to Distinguish the Panels
Figure 22.4b
Graph of Cell Means in Table 22.1 after Averaging Across the Experimenter Factor
a large three-way interaction while all of the other effects are quite small. By changing the means only for the men in our example, I will illustrate a large, disordinal interaction that obliterates two of the two-way interactions and two of the main effects. You can see in Figure 22.5a that this new three-way interaction is caused by a reversal of the alcohol by experimenter interaction from one gender to the other. In Figure 22.5b, you can see that the overall interaction of alcohol by experimenter is now zero (the lines are parallel); the gender by experimenter interaction is also zero (not shown). On the other hand, the large gender by alcohol interaction very nearly obliterates the main effects of both gender and alcohol (see Figure 22.5c). The main effect of experimenter is, however, large, as can be seen in Figure 22.5b.
Figure 22.5a
Rearranging the Cell Means of Table 22.1 to Depict a Disordinal 3-Way Interaction
Figure 22.5b
Regraphing Figure 22.5a after Averaging Across Gender
Figure 22.5c
Regraphing Figure 22.5a after Averaging Across the Experimenter Factor
two genders do not look the same. In Figure 22.6, I created the means for the men by starting out with the women's means and subtracting 10 from each (this creates a main effect of gender); then I added 30 only to the men's means that involved the nurturing condition. The latter change creates a two-way interaction between experimenter and gender, but because it affects both of the men/nurturing means equally, it does not produce any three-way interaction. One way to see that the three-way interaction is zero in Figure 22.6 is to subtract the slopes of the two lines for each gender. For the women the slope of the at-risk line is positive: 88 − 40 = 48. The slope of the control line is negative: 22 − 30 = −8. The difference of the slopes is 48 − (−8) = 56. If we do the same for the men, we get slopes of 18 and −38, whose difference is also 56. You may recall that a 2 × 2 interaction has only one df, and can be summarized by a single number, L, that forms the basis of a simple linear contrast. The same is true for a 2 × 2 × 2 interaction or any higher-order interaction in which all of the factors have two levels. Of course, quantifying a three-way interaction gets considerably more complicated when the factors have more than two levels, but it is safe to say that if the two (or more) graphs are exactly the same, there will be no three-way interaction (they will continue to be identical, even if a different factor is chosen to distinguish the
Figure 22.6
Rearranging the Cell Means of Table 22.1 to Depict a Zero Amount of Three-Way Interaction
graphs). Bear in mind, however, that even if the graphs do not look the same, the three-way interaction will be zero if the amount of two-way interaction is the same for every graph.
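The slope-difference reasoning for a 2 × 2 × 2 table of means can be written out as a short Python sketch. The function and argument names are hypothetical; the women's means and the first set of men's means come from Table 22.1, and the second set of men's means follows the recipe given for Figure 22.6 (subtract 10 from each of the women's means, then add 30 in the nurturing condition).

```python
def two_way_contrast(nurt_control, expl_control, nurt_risk, expl_risk):
    """L for a 2 x 2 table: slope of the at-risk line minus slope of the control line."""
    return (expl_risk - nurt_risk) - (expl_control - nurt_control)

def three_way_contrast(women, men):
    """Difference between the two simple interaction contrasts (women minus men)."""
    return two_way_contrast(**women) - two_way_contrast(**men)

women = dict(nurt_control=30, expl_control=22, nurt_risk=40, expl_risk=88)
men_table_22_1 = dict(nurt_control=40, expl_control=28, nurt_risk=36, expl_risk=48)

# Figure 22.6 recipe: women's means minus 10, plus 30 for the nurturing means only.
men_figure_22_6 = {
    name: mean - 10 + (30 if name.startswith("nurt") else 0)
    for name, mean in women.items()
}

print(two_way_contrast(**women))                   # simple interaction for the women
print(two_way_contrast(**men_figure_22_6))         # same amount for the Figure 22.6 men
print(three_way_contrast(women, men_figure_22_6))  # zero three-way contrast
print(three_way_contrast(women, men_table_22_1))   # nonzero for the Table 22.1 means
```

A contrast of zero for the Figure 22.6 means, despite the visibly different graphs, is exactly the point made in the text: only a difference in the amount of two-way interaction contributes to the three-way interaction.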
Main Effects
The calculation of the main effects is also the same as in the two-way ANOVA; the SS for a main effect is just the biased variance of the relevant group means multiplied by the total N. Let us say that each of the eight cells in our example contains five subjects, so NT equals 40. Then the SS for the experimenter factor (SSexper) is 40 times the biased variance of 36.5 and 46.5 (the nurturing and exploitive means from Table 22.1), which equals 40(25) = 1000 (the shortcut for finding the biased variance of two numbers is to take the square of the difference between them and then divide by 4). Similarly, SSalcohol = 40(132.25) = 5290, and SSgender = 40(12.25) = 490.
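As a quick check on the arithmetic, the rule "SS equals NT times the biased variance of the level means" can be sketched with Python's standard library, where `statistics.pvariance` computes the biased (population) variance. The function name `ss_main_effect` is my own, and the sketch assumes a balanced design with five subjects per cell, as in the example.

```python
from statistics import pvariance

def ss_main_effect(level_means, n_total):
    """SS for a main effect in a balanced design: N_T times the biased variance of the means."""
    return n_total * pvariance(level_means)

N_T = 40  # eight cells with five subjects each

ss_exper = ss_main_effect([36.5, 46.5], N_T)    # nurturing vs. exploitive
ss_alcohol = ss_main_effect([30.0, 53.0], N_T)  # control vs. at risk
ss_gender = ss_main_effect([38.0, 45.0], N_T)   # men vs. women

print(ss_exper, ss_alcohol, ss_gender)
```

For two means the biased variance reduces to the squared difference divided by 4, which is the shortcut mentioned above.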
At the end of the analysis, SStotal (whether or not it has been calculated separately) has been divided into eight components: SSA, SSB, SSC, the four interactions listed in Formula 22.1, and SSW. Each of these is divided by its corresponding df to form a variance estimate, MS. Using a to represent the
Because the df happens to be 1 for all of the numerator terms, the critical F for all seven tests is F.05 (1,32), which is equal (approximately) to 4.15. Except for the main effect of gender, and the three-way interaction, all of the F ratios exceed the critical value (4.15) and are therefore significant at the .05 level.
Higher-Order ANOVA
This text will not cover factorial designs of higher order than the three-way ANOVA. Although higher-order ANOVAs can be difficult to interpret, no new principles are introduced. The four-way ANOVA produces 15 different F ratios to test: 4 main effects, 6 two-way interactions, 4 three-way interactions, and 1 four-way interaction. Testing each of these 15 effects at the .05 level raises serious concerns about the increased risk of Type I errors. Usually, not all of the F ratios are tested; specific hypotheses should guide the selection of particular effects to test. Of course, the potential for an inflated rate of Type I errors only increases as factors are added. In general, an N-way ANOVA produces 2^N − 1 F ratios that can be tested for significance. In the next section I will delve into more complex varieties of the three-way ANOVA, in particular those that include repeated measures on one or two of the factors.
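The count of testable effects follows from choosing which factors take part in each effect: there are "N choose k" effects that involve exactly k of the N factors. A minimal Python sketch (the function names are hypothetical) confirms the tallies for the three-way and four-way cases.

```python
from math import comb

def effects_by_order(n_factors):
    """Number of effects involving exactly k factors, for k = 1 (main effects) up to N."""
    return {k: comb(n_factors, k) for k in range(1, n_factors + 1)}

def total_effects(n_factors):
    """Total number of F ratios in an N-way factorial ANOVA: 2^N - 1."""
    return 2 ** n_factors - 1

for n in (3, 4, 5):
    counts = effects_by_order(n)
    print(n, counts, sum(counts.values()), total_effects(n))
```

The sum of the binomial coefficients for k = 1 to N is always 2^N − 1, which is why the two functions agree.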
A
SUMMARY
1. To display the cell means of a three-way factorial design, it is convenient to create two-way graphs for each level of the third variable and place these graphs side by side (you have to decide which of the three variables will distinguish the graphs and which of the two remaining variables will be placed along the X axis of each graph). Each two-way graph depicts a simple interaction effect; if the simple interaction effects are significantly different from each other, the three-way interaction will be significant.
2. Three-way interactions can occur in a variety of ways. The interaction of two of the factors can be strong at one level of the third factor and close
EXERCISES
1. Imagine an experiment in which each subject is required to use his or her memories to create one emotion: either happiness, sadness, anger, or fear. Within each emotion group, half of the subjects participate in a relaxation exercise just before the emotion condition, and half do not. Finally, half the subjects in each emotion/relaxation condition are run in a dark, sound-proof chamber, and the other half are run in a normally lit room. The dependent variable is the subject's systolic blood pressure when the subject signals that the emotion is fully present. The design is balanced, with a total of 128 subjects. The results of the three-way ANOVA for this hypothetical experiment are as follows: SSemotion = 223.1, SSrelax = 64.4, SSdark = 31.6, SSemo×rel = 167.3, SSemo×dark = 51.5, SSrel×dark = 127.3, and SSemo×rel×dark = 77.2. The total sum of squares is 2,344. a. Calculate the seven F ratios, and test each for significance.
b. Calculate partial eta squared for each of the three main effects (use Formula 14.9). Are any of these effects at least moderate in size?
2. In this exercise there are 20 subjects in each cell of a 3 × 3 × 2 design. The levels of the first factor (location) are urban, suburban, and rural. The levels of the second factor are no siblings, one or two siblings, and more than two siblings. The third factor has only two levels: presently married and not presently married. The dependent variable is the number of close friends that each subject reports having. The cell means are as follows:
Urban No Siblings Married Not Married 1 or 2 Siblings Married Not Married 2 or more Siblings Married Not Married Suburban Rural
a. Given that SSW equals 1,094, complete the three-way ANOVA, and present your results in a summary table. b. Draw a graph of the means for Location × Number of Siblings (averaging across marital status). Describe the nature of the interaction. c. Using the means from part b, test the simple effect of number of siblings at each location.
3. Seventy-two patients with agoraphobia are randomly assigned to one of four drug conditions: SSRI (e.g., Prozac), tricyclic antidepressant (e.g., Elavil), antianxiety (e.g., Xanax), or a placebo (offered as a new drug for agoraphobia). Within each drug condition, a third of the patients are randomly assigned to each of three types of psychotherapy: psychodynamic, cognitive/behavioral, and group. The subjects are assigned so that half the subjects in each drug/therapy group are also depressed, and half are not. After 6 months of treatment, the severity of agoraphobia is measured for each subject (30 is the maximum possible phobia score); the cell means (n = 3) are as follows:
a. Given that SSW equals 131, complete the three-way ANOVA, and present your results in a summary table. b. Draw a graph of the cell means, with separate panels for depressed and not depressed. Describe the nature of the therapy × drug interaction in each panel. Does there appear to be a three-way interaction? Explain. c. Given your results in part a, describe a set of follow-up tests that would be justifiable. d. Optional: Test the 2 × 2 × 2 interaction contrast that results from deleting Group therapy and the SSRI and placebo conditions from the analysis (extend the techniques of Chapter 13, Section B, and Chapter 14, Section C).
4. An industrial psychologist is studying the relation between motivation and productivity. Subjects are told to perform as many repetitions of a given clerical task as they can in a 1-hour period. The dependent variable is the number of tasks correctly performed. Sixteen subjects participated in the experiment for credit toward a requirement of their introductory psychology course (credit group). Another 16 subjects were recruited from other classes and paid $10 for the hour (money group). All subjects performed a small set of similar clerical tasks as practice before the main study; in each group (credit or money) half the subjects (selected randomly) were told they had performed unusually well on the practice trials (positive feedback), and half were told they had performed poorly (negative feedback). Finally, within each of the four groups created by the manipulations just described, half of the subjects (at random) were told that performing the tasks quickly and accurately was correlated with other important job skills (self motivation), whereas the other half were told that good performance would help the experiment (other motivation). The data appear in the following table:
PAID SUBJECTS
Positive Feedback 21 17 15 21 33 29 35 29 Negative Feedback 25 23 30 26 21 22 19 17
Other
a. Perform a three-way ANOVA on the data. Test all seven F ratios for significance, and present your results in a summary table. b. Use graphs of the cell means to help you describe the pattern underlying each effect that was significant in part a. c. Based on the results in part a, what post hoc tests would be justified?
5. Imagine that subjects are matched in blocks of three based on height, weight, and other physical characteristics; six blocks are formed in this way. Then the subjects in each block are randomly assigned to three different weight-loss programs. Subjects are measured before the diet, at the end of the diet program, 3 months later, and 6 months later. The results of the two-way RM ANOVA for this hypothetical experiment are given in terms of the SS components, as follows: SSdiet = 403.1, SStime = 316.8, SSdiet×time = 52, SSdiet×S = 295.7, SStime×S = 174.1, and SSdiet×time×S = 230. a. Calculate the three F ratios, and test each for significance. b. Find the conservatively adjusted critical F for each test. Will any of your conclusions be affected if you do not assume that sphericity exists in the population?
6. A psychologist wants to know how both the affective valence (happy vs. sad vs. neutral) and the imageability (low, medium, high) of words affect their recall. A list of 90 words is prepared with 10 words from each combination of factors (e.g., happy, low imagery: promotion; sad, high imagery: cemetery) randomly mixed together. The number of words recalled in each category by each of the six subjects in the study is given in the following table:

                      SAD                  NEUTRAL                 HAPPY
Subject No.    Low  Medium  High     Low  Medium  High     Low  Medium  High
1               5     6      9        2     5      6        3     4      8
2               2     5      7        3     6      6        5     5      6
3               5     7      5        2     4      5        4     3      7
4               3     6      5        3     5      6        4     4      5
5               4     9      8        4     7      7        4     5      9
6               3     5      7        4     5      6        6     4      4
a. Perform a two-way RM ANOVA on the data. Test the three F ratios for significance, and present your results in a summary table. b. Find the conservatively adjusted critical F for each test. Will any of your conclusions be affected if you do not assume that sphericity exists in the population?
c. Draw a graph of the cell means, and describe any trend toward an interaction that you can see. d. Based on the variables in this exercise, and the results in part a, what post hoc tests would be justified and meaningful?
An important way in which one three-factor design can differ from another is the number of factors that involve repeated measures (or matching). The design in which none of the factors involve repeated measures was covered in Section A. The design in which all three factors are RM factors will not be covered in this text; however, the three-way RM design is a straightforward extension of the two-way RM design described at the end of Section A. This section will focus on three-way designs with either one or two RM factors (i.e., mixed designs), and it will also elaborate on the general principles of dealing with three-way ANOVAs, as introduced in Section A, and consider
B
BASIC STATISTICAL PROCEDURES
One RM Factor
I will begin with a three-factor design in which there are repeated measures on only one of the factors. The ANOVA for this design is not much more complicated than the two-way mixed ANOVA described in the previous chapter; for instance, there are only two different error terms. Such designs arise frequently in psychological research. One simple way to arrive at such a design is to start with a two-way ANOVA with no repeated measures. For instance, patients with two different types of anxiety disorders (generalized anxiety vs. specific phobias) are treated with two different forms of psychotherapy (psychodynamic vs. behavioral). The third factor is added by measuring the patients' anxiety at several points in time (e.g., beginning of therapy, end of therapy, several months after therapy has stopped); I will refer to this factor simply as time.

To illustrate the analysis of this type of design I will take the two-way ANOVA from Section B of Chapter 14 and add time as an RM factor. You may recall that that example involved four levels of sleep deprivation and three levels of stimulation. Performance was measured only once, after 4 days in the sleep lab. Now imagine that performance on the simulated truck driving task is measured three times: after 2, 4, and 6 days in the sleep lab. The raw data for the three-factor study are given in Table 22.2, along with the various means we will need to graph and analyze the results; note that the data for Day 4 are identical to the data for the corresponding two-way ANOVA in Chapter 14. To see what we may expect from the results of a three-way ANOVA on these data, the cell means have been graphed so that we can look at the sleep by stimulation interaction at each time period (see Figure 22.7). You can see from Figure 22.7 that the sleep × stimulation interaction, which was not quite significant for Day 4 alone (see Chapter 14, Section B), increases over time, perhaps enough so as to produce a three-way interaction. We can also see that the main effects of stimulation and sleep, significant at Day 4, are likely to be significant in the three-way analysis. The general decrease in scores from Day 2 to Day 4 to Day 6 is also likely to yield a significant main effect for time. Without regraphing the data, it is hard to see whether the interactions of time with either sleep or stimulation are large or small. However, because these interactions are less interesting in the context of this experiment, I won't bother to present the two other possible sets of graphs.

To present general formulas for analyzing the kind of experiment shown in Table 22.2, I will adopt the following notation. The two between-subject factors will be labeled A and B. Of course, it is arbitrary which factor is called A and which B; in this example the sleep deprivation factor will be A, and the stimulation factor will be B. The lowercase letters a and b will stand for the number of levels of their corresponding factors; in this case, 4 and 3, respectively. The within-subject factor will be labeled R, and its number of levels, c, to be consistent with previous chapters.

Let us begin with the simplest SS components: SStotal, and the SSs for the numerators of each main effect. SStotal is based on the total number of observations, NT, which for any balanced three-way factorial ANOVA is equal to abcn, where n is the number of different subjects in each cell of the A × B table. So, NT = 4 × 3 × 3 × 5 = 180. The biased variance obtained by entering all 180 scores is 43.1569, so SStotal = 43.1569 × 180 = 7,768.24. SSA is based
Table 22.2

                         PLACEBO                         MOTIVATION                       CAFFEINE
                Day 2   Day 4   Day 6 Subject   Day 2   Day 4   Day 6 Subject   Day 2   Day 4   Day 6 Subject     Row
                                        Means                           Means                           Means   Means
None               26      24      24   24.67      29      28      26   27.67      29      26      26   27.00
                   30      29      25   28.00      26      23      23   24.00      24      22      23   23.00
                   29      28      27   28.00      23      24      25   24.00      23      20      17   20.00
                   23      20      20   21.00      29      30      27   28.67      31      30      30   30.33
                   21      20      20   20.33      35      33      22   30.00      29      27      25   27.00
  AB means       25.8    24.2    23.2   24.40    28.4    27.6    24.6   26.87    27.2    25.0    24.2   25.47   25.58
Jet Lag            24      22      17   21.00      27      26      33   28.67      24      25      20   23.00
                   20      18      15   17.67      29      30      17   25.33      30      27      24   27.00
                   15      16      13   14.67      34      32      25   30.33      30      31      25   28.67
                   27      25      19   23.67      23      20      18   20.33      25      24      17   22.00
                   28      27      22   25.67      25      23      20   22.67      23      21      22   22.00
  AB means       22.8    21.6    17.2   20.53    27.6    26.2    22.6   25.47    26.4    25.6    21.6   24.53   23.51
Interrupt          17      16       9   14.00      25      16      10   17.00      23      23      20   22.00
                   19      19       6   14.67      21      13       9   14.33      29      28      23   26.67
                   22      20      11   17.67      19      12       8   13.00      28      26      23   25.67
                   11      11       7    9.67      25      18      12   18.33      20      17      12   16.33
                   15      14      10   13.00      24      19      14   19.00      21      19      17   19.00
  AB means       16.8    16.0     8.6   13.80    22.8    15.6    10.6   16.33    24.2    22.6    19.0   21.93   17.35
Total              16      14       5   11.67      24      15      14   17.67      25      23      18   22.00
                   18      17       6   13.67      19      11       8   12.67      16      16      14   15.33
                   20      18      10   16.00      20      11      15   15.33      19      18      12   16.33
                   14      12       7   11.00      27      19      17   21.00      27      26      21   24.67
                   11      10       7    9.33      26      17      10   17.67      26      24      21   23.67
  AB means       15.8    14.2     7.0   12.33    23.2    14.6    12.8   16.87    22.6    21.4    17.2   20.40   16.53
Column Means     20.3    19.0    14.0   17.77    25.5    21.0   17.65   21.38    25.1   23.65    20.5   23.08
Figure 22.7
Graph of the Cell Means in Table 22.2
on the means for the four sleep deprivation levels, which can be found in the rightmost column of the table, labeled Row Means. SSB is based on the means for the three stimulation levels, which are found where the bottom row of the table (Column Means) intersects the columns labeled Subject Means (these are averaged over the three days, as well as the sleep levels). The means for the three different days are not in the table but can be found by averaging the three Column Means for Day 2, the three for Day 4, and similarly for Day 6. The SSs for the main effects are as follows:
SSA = σ²(25.58, 23.51, 17.35, 16.53) × 180 = 15.08 × 180 = 2,714.4
SSB = σ²(17.77, 21.38, 23.08) × 180 = 4.902 × 180 = 882.36
SSR = σ²(23.63, 21.22, 17.38) × 180 = 6.622 × 180 = 1,192.0
As in Section A, we will need the SS based on the cell means, SSABR, and the SSs for each two-way table of means: SSAB, SSAR, and SSBR. In addition, because one factor has repeated measures we will also need to find the means for each subject (averaging their scores for Day 2, Day 4, and Day 6) and the SS based on those means, SSbetween-subjects.
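These components can also be obtained directly from the raw scores, which avoids the rounding that creeps in when the means are rounded first. The sketch below is one way to do it with Python's standard library; the nested-dictionary layout and all variable names are my own, the scores are transcribed from Table 22.2, and each SS is computed as NT times a biased variance, as in the text. Computed this way, SStotal and SSR agree with the values above, whereas SSA and SSB come out a few points different from the hand calculations based on rounded means (roughly 2,710.5 and 884.7).

```python
from statistics import fmean, pvariance

# scores[sleep][stim] = [Day 2, Day 4, Day 6], five subjects per list (Table 22.2).
scores = {
    "none": {
        "placebo":    [[26, 30, 29, 23, 21], [24, 29, 28, 20, 20], [24, 25, 27, 20, 20]],
        "motivation": [[29, 26, 23, 29, 35], [28, 23, 24, 30, 33], [26, 23, 25, 27, 22]],
        "caffeine":   [[29, 24, 23, 31, 29], [26, 22, 20, 30, 27], [26, 23, 17, 30, 25]],
    },
    "jet lag": {
        "placebo":    [[24, 20, 15, 27, 28], [22, 18, 16, 25, 27], [17, 15, 13, 19, 22]],
        "motivation": [[27, 29, 34, 23, 25], [26, 30, 32, 20, 23], [33, 17, 25, 18, 20]],
        "caffeine":   [[24, 30, 30, 25, 23], [25, 27, 31, 24, 21], [20, 24, 25, 17, 22]],
    },
    "interrupt": {
        "placebo":    [[17, 19, 22, 11, 15], [16, 19, 20, 11, 14], [9, 6, 11, 7, 10]],
        "motivation": [[25, 21, 19, 25, 24], [16, 13, 12, 18, 19], [10, 9, 8, 12, 14]],
        "caffeine":   [[23, 29, 28, 20, 21], [23, 28, 26, 17, 19], [20, 23, 23, 12, 17]],
    },
    "total": {
        "placebo":    [[16, 18, 20, 14, 11], [14, 17, 18, 12, 10], [5, 6, 10, 7, 7]],
        "motivation": [[24, 19, 20, 27, 26], [15, 11, 11, 19, 17], [14, 8, 15, 17, 10]],
        "caffeine":   [[25, 16, 19, 27, 26], [23, 16, 18, 26, 24], [18, 14, 12, 21, 21]],
    },
}

all_scores, by_sleep, by_stim, by_day, subject_means = [], {}, {}, {}, []
for sleep, stims in scores.items():
    for stim, days in stims.items():
        for day, column in enumerate(days):
            all_scores += column
            by_sleep.setdefault(sleep, []).extend(column)
            by_stim.setdefault(stim, []).extend(column)
            by_day.setdefault(day, []).extend(column)
        # zip(*days) regroups the three day-columns into one tuple per subject.
        subject_means += [fmean(subject) for subject in zip(*days)]

n_total = len(all_scores)  # abcn

def ss_from_groups(groups):
    """N_T times the biased variance of the group means (balanced design assumed)."""
    return n_total * pvariance([fmean(values) for values in groups.values()])

ss_total = n_total * pvariance(all_scores)
ss_a = ss_from_groups(by_sleep)
ss_b = ss_from_groups(by_stim)
ss_r = ss_from_groups(by_day)
ss_between_subjects = n_total * pvariance(subject_means)
ss_within_subjects = ss_total - ss_between_subjects

for name, value in [("SStotal", ss_total), ("SSA", ss_a), ("SSB", ss_b), ("SSR", ss_r),
                    ("SSbet-S", ss_between_subjects), ("SSW-S", ss_within_subjects)]:
    print(f"{name}: {value:.2f}")
```

The same grouping trick (collecting scores under a different key) would yield the two-way tables needed for SSAB, SSAR, and SSBR.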
The within-subject variability can be divided into five components, which include the main effect of the RM factor and all of its interactions:
SSW-S = SSR + SSA×R + SSB×R + SSA×B×R + SSS×R
The last term is the basis for the error term that is used for all of the effects involving the RM factor (it was called SSS×RM in Chapter 16). It is found conveniently by subtraction:
SSS×R = SSW-S − SSR − SSA×R − SSB×R − SSA×B×R     Formula 22.5

We are now ready to get the remaining SS components for our example.
SSW = SSbet-S − SSAB = 5,799.6 − 3,974 = 1,825.6
SSW-S = SStotal − SSbet-S = 7,768.24 − 5,799.6 = 1,968.64
SSS×R = SSW-S − SSR − SSA×R − SSB×R − SSA×B×R = 1,968.64 − 1,192 − 160.2 − 94.74 − 113.34 = 408.36
A more tedious but more instructive way to find SSS×R would be to find the subject by RM interaction separately for each of the 12 cells of the between-groups (AB) matrix and then add these 12 components together. This overall error term is justified only if you can assume that all 12 interactions would be the same in the entire population. As mentioned in the previous chapter, there is a statistical test (Box's M criterion) that can be used to give some indication of whether this assumption is reasonable.

Now that we have divided SStotal into all of its components, we need to do the same for the degrees of freedom. This division, along with all of the df formulas, is shown in the degrees of freedom tree in Figure 22.8. The dfs we will need to complete the ANOVA are based on the following formulas:
a. dfA = a − 1
b. dfB = b − 1
c. dfA×B = (a − 1)(b − 1)
d. dfR = c − 1
e. dfA×R = (a − 1)(c − 1)
f. dfB×R = (b − 1)(c − 1)
g. dfA×B×R = (a − 1)(b − 1)(c − 1)
h. dfW = ab(n − 1)
i. dfS×R = dfW × dfR = ab(n − 1)(c − 1)
Formula 22.6

For the present example,
dfA = 4 − 1 = 3
dfB = 3 − 1 = 2
dfA×B = 3 × 2 = 6
dfR = 3 − 1 = 2
dfA×R = 3 × 2 = 6
dfB×R = 2 × 2 = 4
dfA×B×R = 3 × 2 × 2 = 12
dfW = 4 × 3 × (5 − 1) = 48
dfS×R = dfW × dfR = 48 × 2 = 96
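The df formulas lend themselves to a small helper that also verifies the partition adds up to NT − 1. This is a sketch only; the function name and the dictionary keys (e.g., "AxB" for the A × B interaction) are hypothetical labels.

```python
def mixed_design_dfs(a, b, c, n):
    """Degrees of freedom for a three-way design with between factors A, B and RM factor R."""
    df_w = a * b * (n - 1)  # subjects within groups
    df_r = c - 1            # RM factor
    return {
        "A": a - 1,
        "B": b - 1,
        "AxB": (a - 1) * (b - 1),
        "W": df_w,
        "R": df_r,
        "AxR": (a - 1) * (c - 1),
        "BxR": (b - 1) * (c - 1),
        "AxBxR": (a - 1) * (b - 1) * (c - 1),
        "SxR": df_w * df_r,
    }

a, b, c, n = 4, 3, 3, 5  # sleep levels, stimulation levels, days, subjects per cell
dfs = mixed_design_dfs(a, b, c, n)
df_total = a * b * c * n - 1

print(dfs)
print(sum(dfs.values()), df_total)
```

The first four entries sum to the between-subjects df (abn − 1) and the last five to the within-subjects df, abn(c − 1).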
Figure 22.8
Degrees of Freedom Tree for Three-Way ANOVA with Repeated Measures on One Factor
[Branches recoverable from the figure include df groups = ab − 1, df W = ab(n − 1), and df A×R = (a − 1)(c − 1).]
Note that the sum of all the dfs is 179, which equals dftotal (NT − 1 = abcn − 1 = 180 − 1). The next step is to divide each SS by its df to obtain the corresponding MS. The results of this step are shown in Table 22.3 along with the F ratios and their p values. The seven F ratios were formed according to Formula 22.7:
Table 22.3
Source                        SS        df      MS        F        p
Between-subjects           5,799.6      59
  Sleep deprivation        2,714.4       3     904.8     23.8    <.001
  Stimulation                882.4       2     441.2     11.6    <.001
  Sleep × Stim               375.8       6     62.63     1.65    >.05
  Within-groups            1,825.6      48     38.03
Within-subjects            1,968.64    120
  Time                     1,192         2
  Sleep × Time               160.2       6
  Stim × Time                 94.74      4
  Sleep × Stim × Time        113.34     12
  Subject × Time             408.36     96
Note: The errors that you get from rounding off the means before applying Formula 14.3 are compounded in a complex design. If you retain more digits after the decimal place than I did in the various group and cell means or use raw-score formulas or analyze the data by computer, your F ratios will differ by a few tenths of a point from those in Table 22.3 (fortunately, your conclusions should be the same). If you are going to present your findings to others, regardless of the purpose, I strongly recommend that you use statistical software, and in particular a program or package that is quite popular (so that there is a good chance that its bugs have already been eliminated, at least for basic procedures, such as those in this text).
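a. FA = MSA / MSW
b. FB = MSB / MSW
c. FA×B = MSA×B / MSW
d. FR = MSR / MSS×R
e. FA×R = MSA×R / MSS×R
f. FB×R = MSB×R / MSS×R
g. FA×B×R = MSA×B×R / MSS×R
Formula 22.7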
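To see how the two error terms are used, the following sketch starts from the SS and df values of the chapter's example, obtains SSS×R by subtraction, and forms each F ratio with the error term appropriate to it (MSW for the between-subjects effects, MSS×R for every effect involving the RM factor). The dictionary keys are hypothetical labels, and the F ratios for the within-subjects effects are simply what this arithmetic yields from the hand-calculated SSs, so they may differ slightly from software output.

```python
# SS components from the worked example (hand-calculated values from the text).
ss = {"A": 2714.4, "B": 882.4, "AxB": 375.8, "W": 1825.6,
      "R": 1192.0, "AxR": 160.2, "BxR": 94.74, "AxBxR": 113.34}
df = {"A": 3, "B": 2, "AxB": 6, "W": 48,
      "R": 2, "AxR": 6, "BxR": 4, "AxBxR": 12, "SxR": 96}

# Error term for the RM effects, found by subtraction from SSW-S.
ss_within_subjects = 1968.64
ss["SxR"] = ss_within_subjects - ss["R"] - ss["AxR"] - ss["BxR"] - ss["AxBxR"]

# Each MS is its SS divided by its df.
ms = {source: ss[source] / df[source] for source in ss}

# Which error term each effect is tested against.
error_term = {"A": "W", "B": "W", "AxB": "W",
              "R": "SxR", "AxR": "SxR", "BxR": "SxR", "AxBxR": "SxR"}
f_ratios = {effect: ms[effect] / ms[err] for effect, err in error_term.items()}

for effect, f in f_ratios.items():
    err = error_term[effect]
    print(f"F_{effect}({df[effect]}, {df[err]}) = {f:.2f}")
```

Keeping the effect-to-error-term mapping in one dictionary makes it harder to divide a within-subjects MS by MSW by mistake.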
Figure 22.9
Graph of the Cell Means in Table 22.2 After Averaging Across the Time Factor
Assumptions
The sphericity tests and adjustments you learned in Chapters 15 and 16 are easily extended to apply to this design as well. Box's M criterion can be used to test that the covariances for each pair of RM levels are the same (in the population) for every combination of the two between-group factors. If M is not significant, the interactions can be pooled across all the cells of the two-way between-groups part of the design and then tested for sphericity with Mauchly's W. If you cannot perform these tests (or do not trust them), you can use the modified univariate approach as described in Chapter 15. A factorial MANOVA is also an option (see Section C). The dfs and p levels for the within-subjects effects in Table 22.3 were based on the assumption of sphericity. Fortunately, the effects are so large that even using the most conservative adjustment of the dfs (i.e., lower-bound epsilon), all of the effects remain significant at the .05 level (although the three-way interaction is just at the borderline with p = .05).
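The most conservative adjustment multiplies both the numerator and the denominator df of each within-subjects test by the lower-bound epsilon, which for an RM factor with c levels is 1/(c − 1). The sketch below applies that rule to the four within-subjects effects of the example; the function name is my own, and looking up the critical F for the adjusted dfs is left to a table or statistical software.

```python
def lower_bound_dfs(df_effect, df_error, c):
    """Adjusted (numerator, denominator) df using lower-bound epsilon = 1 / (c - 1)."""
    epsilon = 1 / (c - 1)
    return df_effect * epsilon, df_error * epsilon

c = 3          # number of levels of the RM factor (days)
df_error = 96  # df for the Subject x Time error term
within_effects = {"R": 2, "AxR": 6, "BxR": 4, "AxBxR": 12}

adjusted = {effect: lower_bound_dfs(df_effect, df_error, c)
            for effect, df_effect in within_effects.items()}

for effect, (df_num, df_den) in adjusted.items():
    print(f"{effect}: ({within_effects[effect]}, {df_error}) -> ({df_num:g}, {df_den:g})")
```

Notice that the adjusted dfs are the ones each effect would have if the RM factor had only two levels, which is why this adjustment can never be too liberal.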
Figure 22.10
Graph of Cell Means for the Bruder, et al. (1997) Study
presentation consistent with similar figures in this chapter). The authors state:
There was a striking difference in PA between cognitive-therapy responders and nonresponders on the syllables test but not on the complex tones test. In contrast, there was no significant difference in PA between placebo responders and nonresponders on either test. The dependence of PA differences between responders and nonresponders on treatment and test was reflected in a significant Outcome × Treatment × Test interaction in an overall ANOVA of these data, F (1, 72) = 5.81, p = .018. Further analyses indicated that this three-way interaction was due to the presence of a significant Outcome × Test interaction for cognitive therapy, F (1, 29) = 5.67, p = .025, but not for placebo, F (1, 43) = 0.96, p = .332. Cognitive-therapy responders had a significantly larger right-ear (left-hemisphere) advantage for syllables when compared with nonresponders, t (29) = 2.58, p = .015, but no significant group difference was found for the tones test, t (29) = 1.12, p = .270.
Notice that the significant three-way interaction is followed by tests of the simple interaction effects, and the significant simple interaction is, in turn, followed by t tests on the simple main effects of that two-way interaction (of course, the t tests could have been reported as Fs, but it is common to report t values for cell-to-cell comparisons when no factors are being collapsed). Until recently, F values less than 1.0 were usually shown as F < 1, p > .05 (or ns), but there is a growing trend to report Fs and ps as given by one's statistical software output (note the reporting of F = 0.96 above).
Two RM Factors
There are many ways that a three-way factorial design with two RM factors can arise in psychological research. In one case you begin with a two-way RM design and then add a grouping factor. For instance, tension in the brow and cheek, as measured electrically (EMG), can reveal facial expressions that are hard to observe visually. While watching a happy scene from a movie, cheek tension generally rises in a subject (due to smiling), whereas brow tension declines (due to a decrease in frowning). The opposite pattern occurs while watching a sad scene. If tension is analyzed with a 2 (brow vs. cheek) × 2 (happy vs. sad) ANOVA, a significant interaction is likely to emerge. This is not an impressive result in itself, but the degree of the two-way (RM) interaction can be used as an index of the intensity of a subject's (appropriate) emotional reactions. For example, in one (as yet unpublished) experiment, half the subjects were told to get involved in the movie scenes
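Table 22.4
[Table body not recovered. Surviving headings: attractiveness columns (Below, Average, Above), each split into Female and Male with a Mean; experience rows (Low, Moderate, High), each followed by a Cell Mean row; Row Means.]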
The appropriate critical F is F.05(2, 9) = 4.26, so FA is easily significant. A look at the means for the three groups of subjects shows us that managers with greater experience are, in general, more cautious with their hirability ratings (perhaps they have been burned more times), especially when comparing low to moderate experience. However, there is no point in trying to interpret this finding before testing the various interactions, which may make this finding irrelevant or even misleading. I have completed the between-groups part of the analysis at this point just to show you that at least part of the analysis is easy and to get it out of the way before the more complicated within-subject part of the analysis begins. With only one RM factor there is only one error term that involves an interaction with the subject factor, and that error term is found easily by subtraction. However, with two RM factors the subject factor interacts with each RM factor separately, and with the interaction of the two of them, yielding three different error terms. The extraction of these extra error terms requires the collapsing of more intermediate tables, and the calculation of more intermediate SS terms. Of course, the calculations are performed the same way as always; there are just more of them. Let's begin, however, with the numerators of the various interaction terms, which involve the same procedures as the three-way analysis with only one RM factor. First, we can
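a. $F_A = \dfrac{MS_A}{MS_W}$
b. $F_Q = \dfrac{MS_Q}{MS_{Q \times S}}$
c. $F_R = \dfrac{MS_R}{MS_{R \times S}}$
d. $F_{A \times Q} = \dfrac{MS_{A \times Q}}{MS_{Q \times S}}$
e. $F_{A \times R} = \dfrac{MS_{A \times R}}{MS_{R \times S}}$
f. $F_{Q \times R} = \dfrac{MS_{Q \times R}}{MS_{Q \times R \times S}}$
g. $F_{A \times Q \times R} = \dfrac{MS_{A \times Q \times R}}{MS_{Q \times R \times S}}$
Formula 22.11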
The completed analysis is shown in Table 22.5. Notice that each of the three different RM error terms is being used twice. This is just an extension of what you saw in the two-way mixed design when the S × RM error term was used for both the RM main effect and its interaction with the between-groups factor.
Figure 22.11
Degrees of Freedom Tree for 3-Way ANOVA with Repeated Measures on Two Factors
[Recoverable nodes: df total [aqrn − 1]; df between-S [an − 1]; df within-S [an(qr − 1)]; df A [a − 1]; df Q × S [a(n − 1)(q − 1)]; df R [r − 1]; df Q × R × S [a(n − 1)(q − 1)(r − 1)]]
Table 22.5
[Summary table not recovered. Surviving value pairs: 6.48, 1.35; 35.81, 3.92; 3.69, 2.23.]
Note: The note from Table 22.3 applies here as well.
Figure 22.12
Graph of the Cell Means for the Data in Table 22.4
Figure 22.13
Graph of the Cell Means for Table 22.4 After Averaging Across Gender
the line for the low experience group is consistently above the line for moderate experience seems to account, at least in part, for the significance of the main effect for that factor. The significant attractiveness by experience (i.e., group) interaction is clearly due to a strong interaction for the male condition being averaged with a lack of interaction for the females (Figure 22.13 shows the male and female conditions averaged together, which bears a greater resemblance to the male than to the female condition). This is a case in which a three-way interaction that is not significant should nonetheless lead to caution in interpreting significant two-way interactions. Perhaps the most interesting significant result is the interaction of attractiveness and gender. Figure 22.14 shows that although attractiveness is a strong factor in hirability for both genders, it makes somewhat less of a difference for males. However, the most potentially interesting result would have been the three-way interaction, had it been significant; it could have shown that the impact of attractiveness on hirability changes with the experience of the employer, but more for male than female applicants.
Figure 22.14
Graph of the Cell Means for Table 22.4 After Averaging Across the Levels of Hiring Experience
Follow-Up Comparisons
Given the significance of the attractiveness by experience interaction, it would be reasonable to perform follow-up tests, similar to those described for the two-way mixed design in Chapter 16. This includes the possibility of analyzing simple effects (a one-way ANOVA at each attractiveness level or a one-way RM ANOVA for each experience group), partial interactions (e.g., averaging the low and moderate experience conditions and performing the resulting 2 × 3 ANOVA) or interaction contrasts (e.g., the average of the low and moderate conditions and the high condition crossed with the average and above average attractiveness conditions). Such tests, if significant, could justify various cell-to-cell comparisons. To follow up on the significant gender by attractiveness interaction, the most sensible approach would be simply to conduct RM t tests between the genders at each level of attractiveness. In general, planned and post hoc comparisons for the three-way ANOVA with two RM factors follow the same logic as those described for the design with one RM factor. The only differences concern the error terms for these comparisons. If your between-group factor is significant, involves more than two levels, and is not involved in an interaction with one of the RM factors, you can use MSW from the overall analysis as your error term. For all other comparisons, using an error term from the overall analysis requires some questionable homogeneity assumption. For tests involving one or both of the two RM factors, it is safest to perform all planned and post hoc comparisons using an error term based only on the conditions included in the test.
The significant two-way interaction was then followed with an interaction contrast (dropping the neutral and nonword prime conditions) and cell-to-cell comparisons:
The specific Prime Gender × Target Gender interaction (excluding the neutral conditions) was also reliable, F (1, 66) = 117.56, p < .0001. Subjects were faster to judge male pronouns after male than female primes, t (67) = 11.59, p < .0001, but faster to judge female pronouns after female than male primes, t (67) = 6.90, p < .0001 (p. 138).
B
SUMMARY
1. The calculation of the three-way ANOVA with repeated measures on one factor follows the basic outline of the independent three-way ANOVA, as described in Section A, but adds elements of the mixed design, as delineated in Chapter 16. The between-subject factors are labeled A and B, whereas the within-subject factor is labeled R (short for RM). The numbers of levels of the factors are symbolized by a, b, and c, respectively. The following steps should be followed:
a. Begin with a table of the individual scores and then find the mean for each level of each factor, the mean for each different subject (averaging across the levels of the RM factor), and the mean for each cell of the three-way design. From your table of cell means, create three two-way tables of means, in each case taking a simple average of the cell means across one of the three factors.
b. Use Formula 14.3 to find SStotal from the individual scores; SSA, SSB, and SSR from the means at each factor level; SSbetween-subjects from the means for each subject; SSABR from the cell means; and SSAB, SSAR, and SSBR from the two-way tables of means.
c. Find the SS components for the three two-way interactions, the three-way interaction, and the two error terms (SSW and SSS×R) by subtraction. Divide these six SS components, along with the three SS components for the main effects, by their respective df to create the nine necessary MS terms.
d. Form the seven F ratios by using MSW as the error term for the main effects of A and B and their interaction, and then using MSS×R as the error term for the main effect of the RM factor, its interaction with A, its interaction with B, and the three-way interaction.
2. The calculation of the three-way ANOVA with repeated measures on two factors is related to both the independent three-way ANOVA and the two-way RM ANOVA. The between-subject factor is labeled A, whereas the two RM factors are labeled R and Q. The numbers of levels of the factors are symbolized by a, r, and q, respectively. The following steps should be followed:
a. Begin with a table of the individual scores and then find the mean for each level of each factor, the mean for each different subject (averaging across the levels of both RM factors), and the mean for each cell of the three-way design. From your table of cell means, create three two-way tables of means, in each case taking a simple average of the cell means across one of the three factors. In addition, create two more two-way tables in which scores are averaged over one RM factor or the other, but not both, and subjects are not averaged across groups (i.e., each table is a two-way matrix of subjects by one of the RM factors).
b. Use Formula 14.3 to find SStotal from the individual scores; SSA, SSQ, and SSR from the means at each factor level; SSS from the means for each subject; SSAQR from the cell means; SSAQ, SSAR, and SSQR from the
EXERCISES
1. A total of 60 college students participated in a study of attitude change. Each student was randomly assigned to one of three groups that differed according to the style of persuasion that was used: rational arguments, emotional appeal, and stern/commanding (Style factor). Each of these groups was randomly divided in half, with one subgroup hearing the arguments from a fellow student, and the other from a college administrator (Speaker factor). Each student heard arguments on the same four campus issues (e.g., tuition increase), and attitude change was measured for each of the four issues (Issue factor). The sums of squares for the three-way mixed ANOVA are as follows: SSstyle = 50.4, SSspeaker = 12.9, SSissue = 10.6, SSstyle × speaker = 21.0, SSstyle × issue = 72.6, SSspeaker × issue = 5.3, SSstyle × speaker × issue = 14.5, SSW = 189, and SStotal = 732.7.
a. Calculate the seven F ratios, and test each for significance.
b. Find the conservatively adjusted critical F for each test involving a repeated-measures factor. Will any of your conclusions be affected if you do not make any assumptions about sphericity?
2. Based on a questionnaire they had filled out earlier in the semester, students were classified as high, low, or average in empathy. The 12 students recruited in each category for this experiment were randomly divided in half, with one subgroup given instructions to watch videotapes to check for the quality of the picture and sound (detail group) and the other subgroup given instructions to get involved in the story portrayed in the videotape. All subjects viewed the same two videotapes (in counterbalanced order): one presenting a happy story and one presenting a sad story. The dependent variable was the subject's rating of his or her mood at the end of each tape, using a 10-point happiness scale (0 = extremely sad, 5 = neutral, and 10 = extremely happy). The data for the study appear in the following table:
            LOW EMPATHY     AVERAGE EMPATHY   HIGH EMPATHY
            Happy   Sad     Happy   Sad       Happy   Sad
Detail        6      5        5      4          7      3
              6      5        7      2          8      3
              5      7        5      3          7      1
              7      4        5      5          5      5
              4      6        4      4          6      4
              6      5        5      5          5      5
Involved      5      4        6      2          7      2
              5      4        6      2          8      1
              6      4        7      1          9      1
              4      5        4      4          7      2
              5      3        6      2          6      1
              4      5        4      4          7      2
a. Given that SStype × S = 224, SSdifficulty × S = 130, SStype × difficulty × S = 62, and SSW = 528, perform the appropriate three-way ANOVA on the data. Present your results in a summary table.
b. Graph the Type × Difficulty means, averaging across instruction group. Compare this graph to the Type × Difficulty graph for each instruction group. Can the overall Type × Difficulty interaction be meaningfully interpreted? Explain.
c. Find the conservatively adjusted critical F for each test. Will any of your conclusions be affected if you do not assume that sphericity exists in the population?
d. Given the results you found in part a, which simple effects can be justifiably analyzed?
5. Imagine that the psychologist in Exercise 6 of Section A runs her study under two different conditions with two different random samples of subjects. The two conditions depend on the type of background music played to the subjects as they memorize the list of words: very happy or very sad. The number of words recalled in each word category for each subject in the two groups is given in the following table:
IMAGERY INSTRUCTIONS
Spatial 3.9 5.2 7.8 Verbal 2.2 2.4 2.8
                   SAD                  NEUTRAL                HAPPY
              Low  Medium  High    Low  Medium  High    Low  Medium  High
Happy Music    4     6      9       3     5      6       4     4      9
               2     5      7       4     6      7       5     6      6
               4     7      5       3     5      5       4     5      7
               2     5      4       4     6      6       4     4      5
               4     8      8       5     7      7       5     5     10
               3     5      6       5     5      6       6     4      5
Sad Music      5     6      9       2     4      6       3     4      6
               3     5      9       3     6      5       4     5      5
               6     7      6       2     4      5       3     3      6
               3     6      7       3     4      6       4     4      5
               4    10      9       4     6      7       5     5      8
               5     5      7       4     5      6       4     4      5
a. Perform a three-way mixed-design ANOVA on the data. Present your results in a summary table.
b. Find the conservatively adjusted critical F for each test. Will any of your conclusions be affected if you do not assume that sphericity exists in the population?
c. Draw a graph of the cell means (with separate panels for the two types of background music), and describe the nature of any effects that are noticeable. Which 2 × 2 × 2 interaction contrast appears to be the largest?
d. Based on the variables in this exercise, and the results in part a, what post hoc tests would be justified and meaningful?
6. A neuropsychologist is testing the benefits of a new cognitive training program designed to improve memory in patients who have suffered brain damage. The effects of the training are being tested on four types of memory: abstract words, concrete words, human faces, and simple line drawings. Each subject performs all four types of tasks. The dependent variable is the number of NO TRAINING
Right brain damage 11 13 9 5 7 3 7 8 6 19 20 18 5 8 5 6 5 7 7 10 4 13 15 11 11 8 14 5 9 1 11 7 15 7 9 5
TRAINING
12 10 14 7 9 5 8 7 9 18 19 17 10 8 12 9 11 7 10 7 13 15 17 13 11 9 13 8 11 5 12 9 15 9 7 11
Equal damage
C
OPTIONAL MATERIAL
Figure 22.15
Plot in which Two Groups of Students Differ Strongly on Two Variables
[Axes: IQ and Grades]
Figure 22.16
Plot in which Two Groups of Students Differ Strongly on One Variable and Weakly on a Second Variable
[Axes: IQ and Grades]
where n1 and n2 are the sizes of the two groups. The critical F is found with P and n1 + n2 − P − 1 degrees of freedom. Notice that when the sample sizes are fairly large compared to P, T² is multiplied by approximately 1/P. Of course, when P = 1, there is no adjustment at all. There is one case in which it is quite easy to calculate T². Suppose you have equal-sized groups of left- and right-handers and have calculated t tests for two DVs: a verbal test and a spatial test. If across all your subjects the two DVs have a zero correlation, you can find the square of the point-biserial correlation corresponding to each t test (use Formula 10.13 without taking the square root) and add the two together. The resulting rpb² can be converted back to a t value by using Formula 10.12 (for testing the significance of rpb). However, if you use the square of that formula to get t² instead, what you are really getting is T² for the combination of the two DVs. T² can then be tested with the preceding formula. If the two DVs are positively correlated, finding T² as just described would overestimate the true value (and underestimate it if the DVs are negatively correlated). If you have any number of DVs, and each possible pair has a zero correlation over all your subjects, you can add all the squared point-biserial rs and convert to T², as just described. If any two of your DVs have a nonzero correlation with each other, you can use multiple regression to combine all of the squared point-biserial rs; the combination is called R². The F ratio used in multiple regression to test R² would give the same result as the F for testing T² in this case. In other words, the significance test of a MANOVA with two groups is the same as the significance test for a multiple regression to predict group membership from your set of dependent variables.
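The shortcut for uncorrelated DVs can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the squared point-biserial correlation is taken to be t²/(t² + df), and the conversion back is taken to be t² = df · r²/(1 − r²), with df = n1 + n2 − 2. The input values come from Exercise 3 at the end of this section (two t tests, nine subjects per group, zero correlation between the DVs).

```python
def rpb_squared(t, df):
    """Squared point-biserial r corresponding to a two-group t test."""
    return t ** 2 / (t ** 2 + df)

def t2_from_uncorrelated_dvs(t_values, n1, n2):
    """Hotelling's T-squared when every pair of DVs correlates zero.

    The squared point-biserial rs are summed (giving R-squared) and
    the sum is converted back to a squared t value.
    """
    df = n1 + n2 - 2
    r_squared = sum(rpb_squared(t, df) for t in t_values)
    return df * r_squared / (1 - r_squared)

t2 = t2_from_uncorrelated_dvs([1.9, 1.8], n1=9, n2=9)
print(f"T squared = {t2:.2f}")
```

Because intermediate values are not rounded here, the result can differ in the second decimal place from a hand calculation that rounds each rpb² first.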
If you divide an ordinary t value by the square root of n/2 (if the groups are not the same size, n has to be replaced by the harmonic mean of the two sample sizes), you get g, a sample estimate of the effect size in the population. If you divide T² by n/2 (again, you need the harmonic mean if the ns are unequal) you get MD², where MD is a multivariate version of g, called the Mahalanobis distance. In Figure 22.17 I have reproduced Figure 22.15, but added a measure of distance. The means of the LV and HV groups are not far apart on either IQ or grades separately, but if you create a new axis from the discriminant function that optimally combines the two variables, you can see that the groups are well separated on the new axis. Each group has a mean (called a centroid) in the two-dimensional space formed by the two variables. The MD is the standardized distance between the centroids, taking into account the correlation between the two variables. If you had three discriminator variables, you could draw the points of the two groups in three-dimensional space, but you would still have two centroids and one distance between them. The MD can be found, of course, if you have even more discriminator variables, but unfortunately I can't draw such a case. Because T² = (n/2)MD², even a tiny MD can attain statistical significance with a large enough sample size. That is why it is useful to know MD in addition to T², so you can evaluate whether the groups are separated enough to be easily discriminable. I will return to this notion when I discuss discriminant analysis.
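The two conversions described above, from T² to F (Formula 22.12 in the Key Formulas) and from T² to the Mahalanobis distance, can be sketched in Python. The function names are hypothetical; the example values (T² = 14.44 from three DVs, groups of 12 and 12, then 10 and 20) are those of Exercise 1 in this section.

```python
import math

def f_from_t2(t2, n1, n2, p):
    """F ratio for testing T-squared with P dependent variables;
    its df are P and n1 + n2 - P - 1."""
    return t2 * (n1 + n2 - p - 1) / (p * (n1 + n2 - 2))

def harmonic_mean(n1, n2):
    """Harmonic mean of two sample sizes (equals n when n1 == n2)."""
    return 2 * n1 * n2 / (n1 + n2)

def mahalanobis_from_t2(t2, n1, n2):
    """Mahalanobis distance: MD squared = T squared / (n / 2)."""
    n = harmonic_mean(n1, n2)
    return math.sqrt(t2 / (n / 2))

for n1, n2 in [(12, 12), (10, 20)]:
    f = f_from_t2(14.44, n1, n2, p=3)
    md = mahalanobis_from_t2(14.44, n1, n2)
    print(f"n1 = {n1}, n2 = {n2}: F(3, {n1 + n2 - 4}) = {f:.2f}, MD = {md:.2f}")
```

Note that MD depends on the sample sizes only through the harmonic mean, which is why the same T² implies a somewhat smaller separation when the total N is larger.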
Figure 22.17
Plot of Two Groups of Students Measured on Two Different Variables Including the Discriminant Function and the Group Centroids
The critical F is based on P and n1 + n2 − P − 1 degrees of freedom. The ratio of 1 − Λ to Λ is equal to SSbet/SSW, and when this ratio is multiplied by the ratio of dfs as in Formula 22.13, the result is the familiar ratio, MSbet/MSW, that you know from the one-way ANOVA and gives the same value as Formula 22.12. [In the two-group case, Λ = df/(T² + df) where df = n1 + n2 − 2.] The problem that you encounter as soon as you have more than two groups (and more than one discriminator variable) is that more than one discriminant function can be found. If you insist that the scores from each of the discriminant functions you find are completely uncorrelated with those from each and every one of the others (and we always do), there is, fortunately, a limit to the number of discriminant functions you can find for any given MANOVA problem. The maximum number of discriminant functions, s, cannot be more than P or k − 1 (where k is the number of groups), whichever is smaller. We can write this as s = min(k − 1, P). Unfortunately, there is no universal agreement on how to test these discriminant functions for statistical significance. Consider the case of three groups and two variables. The first discriminant function that is found is the combination of the two variables that yields the largest possible F ratio in an ordinary one-way ANOVA. This combination of variables provides what is called the largest or greatest characteristic root (gcr). However, it is possible to create a second discriminant function whose scores are not correlated with the scores from the first function. (It is not possible to create a third function with scores uncorrelated with the first two; we
Discriminant Analysis
When a MANOVA is performed, the underlying discriminant functions are tested for significance, but the discriminant functions themselves are often ignored. Sometimes the standardized weight or the discriminant loading of each variable is inspected to characterize a discriminant function and better understand how the groups can be differentiated. Occasionally, it is appropriate to go a step further and use a discriminant function to predict an
Figure 22.18
A Territorial Map of Three Groups of Subjects Measured Along Two Discriminant Functions
[Axis label: Orientation to Reality (Low to High); plotted labels include Centroid for Psychotics and Centroid for Normals]
Complex MANOVA
The MANOVA approach can be used with designs more complicated than the one-way RM ANOVA. For instance, in a two-way mixed design, MANOVA can be used to test the main effect of the RM factor, just as described for the one-way RM ANOVA. In addition, the interaction of the mixed design can be tested by forming the appropriate difference scores separately for each group of subjects and then using a two- or multigroup (i.e., one-way) MANOVA. A significant one-way MANOVA indicates that the groups differ in their level-to-level RM differences, which demonstrates a group by RM interaction. The MANOVA approach can also be extended to factorial RM ANOVAs (as described at the end of Section A in this chapter) and designs that are called doubly multivariate. The latter design is one in which a set of DVs is measured at several points in time or under several different conditions within the same subjects.
C
SUMMARY
EXERCISES
1. In a two-group experiment, three dependent variables were combined to give a maximum t value of 3.8. a. What is the value of T²? b. Assuming both groups contain 12 subjects each, test T² for significance. c. Find the Mahalanobis distance between these two groups. d. Recalculate parts b and c if the sizes of the two groups are 10 and 20.
2. In a two-group experiment, four dependent variables were combined to maximize the separation of the groups. SSbet = 55 and SSW = 200. a. What is the value of Λ? b. Assuming one group contains 20 subjects and the other 25 subjects, test Λ for significance. c. What is the value of T²? d. Find the Mahalanobis distance between these two groups.
3. Nine men and nine women are tested on two different variables. In each case, the t test falls short of significance; t = 1.9 for the first DV, and 1.8 for the second. The correlation between the two DVs over all subjects is zero. a. What is the value of T²? b. Find the Mahalanobis distance between these two groups. c. What is the value of Wilks' Λ? d. Test T² for significance. Explain the advantage of using two variables rather than one to discriminate the two groups of subjects.
4. What is the maximum number of (orthogonal) discriminant functions that can be found when a. There are four groups and six dependent variables? b. There are three groups and eight dependent variables? c. There are seven groups and five dependent variables?
5. Suppose you have planned an experiment in which each of your 12 subjects is measured under six different conditions. a. What is the df for the error term if you perform a one-way RM ANOVA on your data? b. What is the df for the error term if you perform a MANOVA on your data?
6. Suppose you have planned an experiment in which each of your 20 subjects is measured under four different conditions.
b. What is the df for the error term if you perform a MANOVA on your data?
KEY FORMULAS
The SS components for the interaction effects of the three-way ANOVA with independent groups:
a. $SS_{A \times B} = SS_{AB} - SS_A - SS_B$
b. $SS_{A \times C} = SS_{AC} - SS_A - SS_C$
c. $SS_{B \times C} = SS_{BC} - SS_B - SS_C$
d. $SS_{A \times B \times C} = SS_{ABC} - SS_{A \times B} - SS_{B \times C} - SS_{A \times C} - SS_A - SS_B - SS_C$
Formula 22.1
The df components for the three-way ANOVA with independent groups:
a. $df_A = a - 1$
b. $df_B = b - 1$
c. $df_C = c - 1$
d. $df_{A \times B} = (a - 1)(b - 1)$
e. $df_{A \times C} = (a - 1)(c - 1)$
f. $df_{B \times C} = (b - 1)(c - 1)$
g. $df_{A \times B \times C} = (a - 1)(b - 1)(c - 1)$
h. $df_W = abc(n - 1)$
Formula 22.2
The SS for the between-groups error term of the three-way ANOVA with one RM factor:
$SS_W = SS_{bet\text{-}S} - SS_{AB}$
Formula 22.3
The within-subjects portion of the total sums of squares in a three-way ANOVA with one RM factor:
$SS_{W\text{-}S} = SS_{total} - SS_{bet\text{-}S}$
Formula 22.4
The SS for the within-subjects error term of the three-way ANOVA with one RM factor:
$SS_{S \times R} = SS_{W\text{-}S} - SS_R - SS_{A \times R} - SS_{B \times R} - SS_{A \times B \times R}$
Formula 22.5
The df components for the three-way ANOVA with one RM factor:
a. $df_A = a - 1$
b. $df_B = b - 1$
c. $df_{A \times B} = (a - 1)(b - 1)$
d. $df_R = c - 1$
e. $df_{A \times R} = (a - 1)(c - 1)$
f. $df_{B \times R} = (b - 1)(c - 1)$
g. $df_{A \times B \times R} = (a - 1)(b - 1)(c - 1)$
h. $df_W = ab(n - 1)$
i. $df_{S \times R} = df_W \times df_R = ab(n - 1)(c - 1)$
Formula 22.6
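The F ratios for the three-way ANOVA with one RM factor:
a. $F_A = \dfrac{MS_A}{MS_W}$
b. $F_B = \dfrac{MS_B}{MS_W}$
c. $F_{A \times B} = \dfrac{MS_{A \times B}}{MS_W}$
d. $F_R = \dfrac{MS_R}{MS_{S \times R}}$
e. $F_{A \times R} = \dfrac{MS_{A \times R}}{MS_{S \times R}}$
f. $F_{B \times R} = \dfrac{MS_{B \times R}}{MS_{S \times R}}$
g. $F_{A \times B \times R} = \dfrac{MS_{A \times B \times R}}{MS_{S \times R}}$
Formula 22.7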
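As a rough illustration of how the formulas for the design with one RM factor fit together, the sketch below finds the within-subjects error SS by subtraction, divides each SS by its df, and pairs each effect with its error term (MSW for A, B, and A × B; the S × R term for every effect involving R). The function name and dictionary keys are hypothetical. The SS values are those given in Exercise 1 of Section B, with Style as A (a = 3), Speaker as B (b = 2), Issue as R (c = 4), and n = 10 per cell.

```python
EFFECTS = ("A", "B", "AxB", "R", "AxR", "BxR", "AxBxR")
ERROR_TERM = {"A": "W", "B": "W", "AxB": "W",
              "R": "SxR", "AxR": "SxR", "BxR": "SxR", "AxBxR": "SxR"}

def f_ratios_one_rm(ss, a, b, c, n):
    """Return the seven F ratios for a three-way ANOVA with one RM factor.

    ss must hold the SS for each effect plus 'W' and 'total'.
    """
    df = {"A": a - 1, "B": b - 1, "AxB": (a - 1) * (b - 1),
          "R": c - 1, "AxR": (a - 1) * (c - 1), "BxR": (b - 1) * (c - 1),
          "AxBxR": (a - 1) * (b - 1) * (c - 1),
          "W": a * b * (n - 1), "SxR": a * b * (n - 1) * (c - 1)}
    ss = dict(ss)
    # SS for S x R is what remains of SStotal after removing SSW and
    # all seven effects (Formulas 22.3 through 22.5 combined).
    ss["SxR"] = ss["total"] - ss["W"] - sum(ss[e] for e in EFFECTS)
    ms = {term: ss[term] / df[term] for term in df}
    return {e: ms[e] / ms[ERROR_TERM[e]] for e in EFFECTS}

ss_exercise = {"A": 50.4, "B": 12.9, "R": 10.6, "AxB": 21.0, "AxR": 72.6,
               "BxR": 5.3, "AxBxR": 14.5, "W": 189.0, "total": 732.7}
for effect, f in f_ratios_one_rm(ss_exercise, a=3, b=2, c=4, n=10).items():
    print(f"F[{effect}] = {f:.2f}")
```

Each F would then be compared with a critical value based on the df of its own numerator and error term.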
The SS for the between-groups error term of the three-way ANOVA with two RM factors:
$SS_W = SS_S - SS_A$
Formula 22.8
The SS components for the within-subjects error terms of the three-way ANOVA with two RM factors:
a. $SS_{Q \times S} = SS_{QS} - SS_Q - SS_S - SS_{A \times Q}$
b. $SS_{R \times S} = SS_{RS} - SS_R - SS_S - SS_{A \times R}$
c. $SS_{Q \times R \times S} = SS_{total} - SS_{AQR} - SS_W - SS_{Q \times S} - SS_{R \times S}$
Formula 22.9
The df components for the three-way ANOVA with two RM factors:
a. $df_A = a - 1$
b. $df_Q = q - 1$
c. $df_R = r - 1$
d. $df_{A \times Q} = (a - 1)(q - 1)$
e. $df_{A \times R} = (a - 1)(r - 1)$
f. $df_{Q \times R} = (q - 1)(r - 1)$
g. $df_{A \times Q \times R} = (a - 1)(q - 1)(r - 1)$
h. $df_W = a(n - 1)$
i. $df_{Q \times S} = df_Q \times df_W = a(q - 1)(n - 1)$
j. $df_{R \times S} = df_R \times df_W = a(r - 1)(n - 1)$
k. $df_{Q \times R \times S} = df_Q \times df_R \times df_W = a(q - 1)(r - 1)(n - 1)$
Formula 22.10
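The df components of Formula 22.10 can be checked mechanically, since they must add up to the total df, aqrn − 1. The following sketch uses a hypothetical function name; the example design (a = 2 groups, q = 3 and r = 3 RM levels, n = 6 subjects per group) matches the layout of Exercise 5 in Section B.

```python
def df_two_rm(a, q, r, n):
    """df components for a three-way ANOVA with two RM factors
    (A between subjects; Q and R within subjects; n per group)."""
    df_w = a * (n - 1)
    return {
        "A": a - 1,
        "W": df_w,
        "Q": q - 1,
        "AxQ": (a - 1) * (q - 1),
        "QxS": (q - 1) * df_w,
        "R": r - 1,
        "AxR": (a - 1) * (r - 1),
        "RxS": (r - 1) * df_w,
        "QxR": (q - 1) * (r - 1),
        "AxQxR": (a - 1) * (q - 1) * (r - 1),
        "QxRxS": (q - 1) * (r - 1) * df_w,
    }

df = df_two_rm(a=2, q=3, r=3, n=6)
for term, value in df.items():
    print(f"df[{term}] = {value}")
print("sum of components:", sum(df.values()), " total df:", 2 * 3 * 3 * 6 - 1)
```

If the components fail to sum to the total df, one of the tables of means was probably collapsed over the wrong factor.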
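The F ratios for the three-way ANOVA with two RM factors:
a. $F_A = \dfrac{MS_A}{MS_W}$
b. $F_Q = \dfrac{MS_Q}{MS_{Q \times S}}$
c. $F_R = \dfrac{MS_R}{MS_{R \times S}}$
d. $F_{A \times Q} = \dfrac{MS_{A \times Q}}{MS_{Q \times S}}$
e. $F_{A \times R} = \dfrac{MS_{A \times R}}{MS_{R \times S}}$
f. $F_{Q \times R} = \dfrac{MS_{Q \times R}}{MS_{Q \times R \times S}}$
g. $F_{A \times Q \times R} = \dfrac{MS_{A \times Q \times R}}{MS_{Q \times R \times S}}$
Formula 22.11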
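The pairing of effects with error terms in Formula 22.11 can be written as a simple lookup. In this sketch the names `ERROR_TERM_TWO_RM` and `f_ratios_two_rm` are hypothetical, and the MS values are the rounded ones from the answer to Exercise 5 of Section B (A = background music, Q = affect, R = image), so the resulting Fs can differ slightly from Fs computed from unrounded values.

```python
# Error term used for each effect, following Formula 22.11.
ERROR_TERM_TWO_RM = {
    "A": "W",
    "Q": "QxS", "AxQ": "QxS",
    "R": "RxS", "AxR": "RxS",
    "QxR": "QxRxS", "AxQxR": "QxRxS",
}

def f_ratios_two_rm(ms):
    """Divide each effect's MS by the MS of its own error term."""
    return {effect: ms[effect] / ms[error]
            for effect, error in ERROR_TERM_TWO_RM.items()}

ms_example = {"A": 0.93, "W": 4.29,
              "Q": 6.86, "AxQ": 9.51, "QxS": 0.97,
              "R": 65.53, "AxR": 0.12, "RxS": 1.28,
              "QxR": 4.60, "AxQxR": 0.58, "QxRxS": 0.98}

for effect, f in f_ratios_two_rm(ms_example).items():
    print(f"F[{effect}] = {f:.2f}")
```

Each RM error term appears twice in the lookup, which mirrors the observation that every RM error term serves both a main effect (or RM interaction) and its interaction with the between-groups factor.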
The F ratio for testing the significance of T² calculated for P dependent variables and two independent groups:
$F = \dfrac{n_1 + n_2 - P - 1}{P(n_1 + n_2 - 2)} \, T^2$
Formula 22.12
The F ratio for testing the significance of Wilks' lambda calculated for P dependent variables and two independent groups:
$F = \dfrac{1 - \Lambda}{\Lambda} \cdot \dfrac{n_1 + n_2 - P - 1}{P}$
Formula 22.13
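Formulas 22.12 and 22.13 should agree in the two-group case, and that agreement is easy to verify numerically. The sketch below is illustrative only, with hypothetical function names. It relies on relations stated in Section C: (1 − Λ)/Λ = SSbet/SSW, so that Λ = SSW/(SSbet + SSW); Λ = df/(T² + df) with df = n1 + n2 − 2; and s = min(k − 1, P) for the maximum number of discriminant functions.

```python
def wilks_lambda(ss_bet, ss_w):
    """Lambda from the SS of the discriminant scores:
    (1 - Lambda) / Lambda = SSbet / SSW."""
    return ss_w / (ss_bet + ss_w)

def lambda_from_t2(t2, n1, n2):
    """Two-group case: Lambda = df / (T squared + df)."""
    df = n1 + n2 - 2
    return df / (t2 + df)

def f_from_lambda(lam, n1, n2, p):
    """Formula 22.13; df are P and n1 + n2 - P - 1."""
    return (1 - lam) / lam * (n1 + n2 - p - 1) / p

def f_from_t2(t2, n1, n2, p):
    """Formula 22.12; same df as Formula 22.13."""
    return t2 * (n1 + n2 - p - 1) / (p * (n1 + n2 - 2))

def max_discriminant_functions(k, p):
    """s = min(k - 1, P) for k groups and P dependent variables."""
    return min(k - 1, p)

# Illustrative values: T squared = 8.69, nine subjects per group, two DVs.
lam = lambda_from_t2(8.69, 9, 9)
print(f"Lambda = {lam:.3f}")
print(f"F from Lambda    = {f_from_lambda(lam, 9, 9, 2):.3f}")
print(f"F from T squared = {f_from_t2(8.69, 9, 9, 2):.3f}")
print("s for 3 groups, 2 DVs:", max_discriminant_functions(3, 2))
```

The two F values printed should match, because in the two-group case (1 − Λ)/Λ is just T² divided by n1 + n2 − 2.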
REFERENCES
Banaji, M. R., & Hardin, C. D. (1996). Automatic stereotyping. Psychological Science, 7, 136–141.
Bruder, G. E., Stewart, J. W., Mercier, M. A., Agosti, V., Leite, P., Donovan, S., & Quitkin, F. M. (1997). Outcome of cognitive-behavioral therapy for depression: Relation to hemispheric dominance for verbal processing. Journal of Abnormal Psychology, 106, 138–144.
Cole, D. A., Maxwell, S. E., Arvey, R., & Salas, E. (1994). How the power of MANOVA can both increase and decrease as a function of the intercorrelations among the dependent variables. Psychological Bulletin, 115, 465–474.
Harris, R. J. (1985). A primer of multivariate statistics (2nd ed.). Orlando, Florida: Academic Press.
Hays, W. L. (1994). Statistics (5th ed.). New York: Harcourt Brace College Publishing.
Huynh, H., & Mandeville, G. K. (1979). Validity conditions in repeated measures designs. Psychological Bulletin, 86, 964–973.
Marlowe, C. M., Schneider, S. L., & Nelson, C. E. (1996). Gender and attractiveness biases in hiring decisions: Are more experienced managers less biased? Journal of Applied Psychology, 81, 11–21.
CHAPTER 22
Section A 1. a & b) Femotion = 74.37/14.3 = 5.2, p < .01, = .122 Frelax = 64.4/14.3 = 4.5, p < .05, 2 = .039 Fdark = 31.6/14.3 = 2.21, n.s. , 2 = .019 Femo rel = 55.77/14.3 = 3.9, p < .05, 2 = .095 Femo dark = 17.17/14.3 = 1.2, n.s, 2 = .031. Frel dark = 127.3/14.3 = 8.9, p < .01, 2 = .074 Femo rel dark = 25.73/14.3 = 1.8, n.s., 2 = .046
2
d) For Fyear and Fsize × year, conservative F.05 (1, 18) = 4.41; for Fsystem × year and Fsize × system × year, conservative F.05 (2, 18) = 3.55. All of the conclusions involving RM factors will be affected by not assuming that sphericity holds, as none of these tests are significant at the .05 level once dfs are adjusted by lower-bound epsilon. It is recommended that conclusions be determined after adjusting dfs with an exact epsilon calculated by statistical software.
Assuming that a moderate effect size is about .06 (or 6%), the main effect of emotion is more than moderate, as are the two-way interactions of emotion × relaxation and relaxation × dark.
3. a) (df and MS columns not recovered)
Source                        SS        F        p
Drug                        496.8     60.65    < .001
Therapy                      32.28     5.91    < .01
Depression                   36.55    13.4     < .01
Drug × Therapy              384.15    23.45    < .001
Therapy × Depression         31.89     5.84    < .05
Drug × Depression            20.26     2.47    n.s.
Drug × Therapy × Depress     10.2       .62    n.s.
Within-groups               131
b) Although there are some small differences between the two graphs, indicating that the three-way interaction is not zero, the two graphs are quite similar. This similarity suggests that the three-way interaction is not large, and is probably not significant. This observation is consistent with the F ratio being less than 1.0 for the three-way interaction in this example.
c) You could begin by exploring the large drug by therapy interaction, perhaps by looking at the simple effect of therapy for each drug. Then you could explore the therapy by depression interaction, perhaps by looking at the simple effect of depression for each type of therapy.
d) L = [(11.5 − 8.7) − (11 − 14)] − [(19 − 14.5) − (12 − 10)] = [2.8 − (−3)] − [4.5 − 2] = 5.8 − 2.5 = 3.3; SScontrast = nL²/Σc² = 3(3.3)²/8 = 32.67/8 = 4.08375; Fcontrast = 4.08375/2.73 = 1.5 (not significant, but better than the overall three-way interaction).
5. a) Fdiet = 201.55/29.57 = 6.82, p < .05; Ftime = 105.6/11.61 = 9.1, p < .01; Fdiet × time = 8.67/7.67 = 1.13, n.s.
b) conservative F.05 (1, 5) = 6.61; given the usual .05 criterion, none of the three conclusions will be affected (the main effect of time is no longer significant at the .01 level, but it is still significant at the .05 level).
Section B
1. a) Fstyle = 25.2/3.5 = 7.2, p < .01; Fspeaker = 12.9/3.5 = 3.69, n.s.; Fissue = 3.53/2.2 = 1.6, n.s.; Fstyle × speaker = 10.5/3.5 = 3.0, n.s.; Fstyle × issue = 12.1/2.2 = 5.5, p < .01; Fspeaker × issue = 1.767/2.2 = .80, n.s.; Fstyle × speaker × issue = 2.417/2.2 = 1.1, n.s.
b) For Fissue and Fspeaker × issue, conservative F.05 (1, 54) = 4.01; for Fstyle × issue and Fstyle × speaker × issue, conservative F.05 (2, 54) = 3.17. None of the conclusions involving RM factors will be affected.
3. a) (F and p columns not recovered)
Source                     SS       df      MS
Between-Subjects
Size                      21.1       1     21.1
System                    61.6       2     30.8
Size × System              1.75      2       .88
Within-groups             52.1      18      2.89
Within-Subjects
Year                       4.36      3      1.46
Size × Year                4.70      3      1.57
System × Year              6.17      6      1.03
Size × System × Year       9.83      6      1.64
Subject × Year            27.19     54       .50
b) You can see that the line for the new system is generally the highest (if you are plotting by year), the line for the old system is lowest, and the combination is in between, producing a main effect of system. The lines are generally higher for the large school, producing a main effect of size. However, the ordering of the systems is the same regardless of size, so there is very little size by system interaction. Ratings generally go up over the years, producing a main effect of year. However, the ratings are aberrantly high for the first year in the large school, producing a size by year, as well as a three-way interaction. One partial interaction would result from averaging the new and combined system and comparing to the old system across the other intact factors.
c) Given the significant three-way interaction, it would be reasonable to look at simple interaction effects; perhaps, the system by year interaction for each school size. This two-way interaction would likely be significant only for the large school, and would then be followed by testing the simple main effects of year for each system. To be cautious about sphericity, you can use an error term based only on the conditions included in that follow-up test. There are other legitimate possibilities for exploring simple effects, as well.
5. a)
Source                        SS       df      MS       F        p
Between-groups
Background                    .93       1      .93      .22     n.s.
Within-group                42.85      10     4.29
Within-subjects
Affect                      13.72       2     6.86     7.04    <.01
Background × Affect         19.02       2     9.51     9.76    <.01
Subject × Affect            19.48      20      .97
Image                      131.06       2    65.53    51.21    <.001
Background × Image            .24       2      .12      .09    n.s.
Subject × Image             25.59      20     1.28
Affect × Image              18.39       4     4.60     4.71    <.01
Back × Affect × Image        2.32       4      .58      .59    n.s.
Subject × Affect × Image    39.07      40      .98
Section C
1.
a) T² = 3.8² = 14.44
b) F = 14.44 × (24 − 3 − 1)/[3(22)] = 4.376 > F.05 (3, 20) = 3.1, so T² is significant at the .05 level.
c) MD² = T²/(n/2) = 14.44/6 = 2.407; MD = 1.55
d) F = 14.44 × (30 − 3 − 1)/[3(28)] = 4.47; harmonic mean of 10 and 20 = 13.33, MD² = 14.44/(13.33/2) = 2.167; MD = 1.47
3. a) R² (the sum of the two rpb²s) = .184 + .168 = .352; T² = 16 × [.352/(1 − .352)] = 8.69
b) MD² = 8.69/4.5 = 1.93; MD = 1.39
c) Λ = 16/(8.69 + 16) = .648
d) F = (15/32) × 8.69 = 4.07 > F.05 (2, 15) = 3.68, so T² is significant at the .05 level. As in multiple regression with uncorrelated predictors, each DV captures a different part of the variance between the two groups; together the two DVs account for much more variance than either one alone.
5. a) df = (6 − 1)(12 − 1) = 5 × 11 = 55
b) df = 12 − 6 + 1 = 7
Section B (continued)
5. b) The conservative F.05 (1, 10) = 4.96 for all of the Fs involving an RM factor (i.e., all Fs except the main effect of background music). The F for the affect by image interaction is no longer significant with a conservative adjustment to df; a more exact adjustment of df is recommended in this case. None of the other conclusions are affected (except that the main effect of affect and its interaction with background music are significant at the .05, instead of .01 level after the conservative adjustment).
c) If you plot affect on the X axis, you can see a large main effect of image, because the three