
ONE-WAY ANOVA

Introduction

If you want to determine whether there are any statistically significant differences between the means of two or more independent groups, you can use a one-way analysis of variance (ANOVA).
EXAMPLES:
Example 1:
Determine whether exam performance differed based on test anxiety levels amongst students.
dependent variable: "exam performance", measured from 0-100
independent variable: "test anxiety level", which has three groups: "low-stressed students", "moderately-stressed students" and "highly-stressed students"

Example 2:
To understand whether there is a difference in salary based on degree subject.
dependent variable: "salary"
independent variable: "degree subject", which has five groups: "business studies", "psychology", "biological sciences", "engineering" and "law"
Basic requirements of the one-way
ANOVA
Assumption #1: You have one dependent variable that is measured
at the continuous level.

Assumption #2: You have one independent variable that consists of two or more categorical, independent groups. Typically, a one-way ANOVA is used when you have three or more categorical, independent groups, but it can be used for just two groups (although an independent-samples t-test is more commonly used for two groups).
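The two-group case can be sketched numerically: with only two groups, the one-way ANOVA and the independent-samples t-test give the same answer (F equals t squared and the p-values coincide). The sample values below are hypothetical and assume scipy is available; they are not data from this guide.

```python
from scipy import stats

# Two illustrative samples (hypothetical data, not from the guide's study)
a = [24.1, 25.3, 22.8, 26.0, 23.5, 24.9]
b = [27.2, 28.4, 26.1, 29.0, 27.8, 26.5]

f_stat, f_p = stats.f_oneway(a, b)
t_stat, t_p = stats.ttest_ind(a, b)  # equal-variance t-test by default

# For two groups, F = t squared and the two p-values are identical
print(f"F = {f_stat:.4f}, t^2 = {t_stat**2:.4f}, p(F) = {f_p:.6f}, p(t) = {t_p:.6f}")
```

This is why the t-test is preferred for two groups only by convention, not by necessity.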

Assumption #3: You should have independence of observations, which means that there is no relationship between the observations in each group of the independent variable or between the groups themselves.

Assumption #4

There should be no significant outliers in the groups of your independent variable in terms of the dependent variable.

Assumption #5

Your dependent variable should be approximately normally distributed for each group of the independent variable.
Study Designs
(a)determining if there are differences
between three or more independent
groups;
(b)determining if there are differences
between three or more conditions; and
(c)determining if there are differences in
change scores.
Study Design #1
Determine if there are differences between three or more independent
groups
Study Design #2
Determine if there are differences between three or more
conditions/treatments (with no pre-test measurement taken)
Study Design #3
Determine if there are differences in
change scores
If you have a study design where three or more independent groups have performed different interventions (e.g., control/interventions), the same dependent variable is measured at the beginning and end of the study in all groups, and a change score is calculated (i.e., post-values minus pre-values), a one-way ANOVA might be appropriate.
For example, pre- and post- blood glucose concentration measurements were taken and change scores calculated for an exercise intervention group, a dietary intervention group and a control group. These change scores were then compared between the three groups using a one-way ANOVA. This determines whether the changes in blood glucose concentration were equal across groups or whether there were statistically significant differences in change scores (i.e., whether the intervention type had a differential effect on the change in blood glucose concentration).
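The change-score design can be sketched as follows. The pre/post blood glucose values below are simulated, not real study data, and scipy's `f_oneway` stands in for the ANOVA the guide runs in SPSS.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Simulated pre/post blood glucose (mmol/L), 10 participants per group
pre = {g: rng.normal(5.5, 0.4, 10) for g in ("exercise", "diet", "control")}
post = {
    "exercise": pre["exercise"] - rng.normal(0.6, 0.2, 10),  # assumed drop
    "diet":     pre["diet"]     - rng.normal(0.4, 0.2, 10),  # assumed drop
    "control":  pre["control"]  + rng.normal(0.0, 0.2, 10),  # no assumed change
}

# Change score = post-value minus pre-value, per participant
change = {g: post[g] - pre[g] for g in pre}

# The change scores (not the raw pre/post values) enter the one-way ANOVA
f_stat, p_value = stats.f_oneway(change["exercise"], change["diet"], change["control"])
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
```

The key point is that each participant contributes a single change score, so the design reduces to an ordinary between-groups comparison.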
Example
A researcher believes that individuals who are more physically active are better
able to cope with stress in the workplace. To test this theory, the researcher
recruited 31 subjects and measured how many minutes of physical activity they
performed per week and their ability to cope with workplace stress. The subjects
were categorized into four groups based on the number of minutes of physical
activity they performed: namely, "sedentary", "low", "moderate" and "high"
physical activity groups. These groups (levels of physical activity) formed an
independent variable called group.

The ability to cope with workplace stress was assessed as the average score of a series of items on a questionnaire, which allowed an overall "coping with workplace stress" score to be calculated, with higher scores indicating a greater ability to cope with workplace-related stress. This dependent variable was called coping_stress.

The researcher would like to know if the CWWS score is dependent on physical activity level. In variable terms, is the mean coping_stress score different for different levels of group?
Setting up your data
For a one-way ANOVA, you will have two variables. In this example, these are:

1) The dependent variable, coping_stress, which is the "ability to cope with workplace-related stress" (abbreviated as "CWWS" score); and
2) The independent variable, group, which has four ordered categories: "Sedentary", "Low", "Moderate", and "High" (N.B., the categories do not have to be ordered in a one-way ANOVA).

To set up these variables, SPSS Statistics has a Variable View where you define the types of variables you are analysing and a Data View where you enter your data for these variables. First, we show you how to set up your independent variable and then your dependent variable in the Variable View window of SPSS Statistics. Finally, we show you how to enter your data into the Data View window.
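The same layout can be sketched outside SPSS. The Data View expects one row per participant, holding a group code and that participant's score; in pandas this is the familiar "long" format. The scores below are made up for illustration.

```python
import pandas as pd

# One row per participant: a group label plus the CWWS score,
# mirroring the long format entered in the SPSS Data View
# (scores are hypothetical, not the guide's data)
df = pd.DataFrame({
    "group": ["Sedentary", "Sedentary", "Low", "Low",
              "Moderate", "Moderate", "High", "High"],
    "coping_stress": [2.1, 2.4, 3.0, 2.8, 3.6, 3.4, 4.1, 4.3],
})

# SPSS value labels correspond roughly to an ordered categorical here
df["group"] = pd.Categorical(
    df["group"],
    categories=["Sedentary", "Low", "Moderate", "High"],
    ordered=True,
)

print(df.groupby("group", observed=True)["coping_stress"].mean())
```

Keeping the data long (one score column, one grouping column) is what lets any one-way ANOVA routine, SPSS or otherwise, split the scores by group.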
The Variable View in SPSS Statistics
At the end of the setup process, your Variable View window will look like
the one below, which illustrates the setup for both the independent and
dependent variable:
The Data View in SPSS
Running the Explore... procedure

The following instructions show you how to run the Explore... procedure in
order to detect outliers and check if your data is normally distributed:
Determining if your data is normally distributed

Your data can be checked to determine whether it is normally distributed using a variety of tests. This section of the guide will concentrate on one of the most common methods: the Shapiro-Wilk test of normality. This is a numerical method and the result of this test is available in the output because it was run when you selected the Normality plots with tests option in the Explore: Plots dialogue box.
Shapiro-Wilk test of normality
The Shapiro-Wilk test is recommended if you have small sample sizes (<
50 participants) and are not confident visually interpreting Normal Q-Q
Plots or other graphical methods. The Shapiro-Wilk test assesses whether the data are normally distributed for each group of the independent variable.
Therefore, there will be as many Shapiro-Wilk tests as there are groups
of the independent variable. In this example, this would mean that four
tests have been run – one for each group of the independent variable,
group (i.e., the "Sedentary", "Low", "Moderate" and "High" groups). Each
test is presented on a new row in the Tests of Normality table, as shown
below:
In order to understand whether the scores in each group are normally
distributed, you need to consult the "Sig." column located under the
"Shapiro-Wilk" column, as highlighted above. If your data is normally
distributed (i.e., the assumption of normality is met), the significance level
(the value in the "Sig." column) should be more than .05 (i.e., p > .05). If
your data is not normally distributed (i.e., the assumption of normality is
violated), the significance level ("Sig.") will be less than .05 (i.e., p < .05).
The null hypothesis of the Shapiro-Wilk test is that your data's distribution
is equal to a normal distribution and the alternative hypothesis is that
your data's distribution is not equal to a normal distribution. Thus, if you
reject the null hypothesis (p < .05), this means that your data's
distribution is not equal to a normal distribution, and if you fail to reject
the null hypothesis, your data is normally distributed.
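The same per-group check can be sketched with scipy's `shapiro` function, one test per group, matching the rows of the SPSS Tests of Normality table. The scores below are hypothetical stand-ins for the study's data.

```python
from scipy import stats

# Hypothetical CWWS scores per group (not the guide's actual data)
groups = {
    "Sedentary": [2.1, 2.4, 1.9, 2.6, 2.3, 2.0, 2.5],
    "Low":       [2.9, 3.1, 2.7, 3.3, 3.0, 2.8, 3.2],
    "Moderate":  [3.5, 3.8, 3.4, 3.9, 3.6, 3.7, 3.3],
    "High":      [4.2, 4.5, 4.0, 4.6, 4.3, 4.1, 4.4],
}

# One Shapiro-Wilk test per group of the independent variable
results = {}
for name, scores in groups.items():
    stat, p = stats.shapiro(scores)
    results[name] = p
    print(f"{name}: W = {stat:.3f}, p = {p:.3f}")

# Normality is tenable for a group when its p-value exceeds .05
normal = all(p > 0.05 for p in results.values())
```

As in SPSS, a p-value above .05 in a given row means the normality assumption is not rejected for that group.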
The following instructions show you how to run a one-way ANOVA using
SPSS Statistics' ONEWAY procedure, including which options to select to
generate a test of the homogeneity of variances and a post hoc test to
determine which group means differ from which other group means.
Because you will not know whether the assumption of homogeneity of
variances is met until after you have run the procedure, you will also be
shown how to run a test that allows there to be unequal variances called
the Welch ANOVA and how to run a post hoc test that also allows unequal
variances. This will result in a set of parallel results: one for when the
assumption of homogeneity of variances is met and one for when it is
violated. Based on the test result for this assumption, you will be shown
which results to interpret (e.g., standard one-way ANOVA or Welch
ANOVA).
To run a one-way ANOVA with a post hoc test in SPSS Statistics, follow
the instructions in the next slide:
Interpreting Results
SPSS Statistics will have generated a Descriptives table containing some useful
descriptive statistics for each group of the independent variable – the "Sedentary",
"Low", "Moderate" and "High" groups – which will help you get a "feel" for your data and
will be used when you report your results. You can make an initial interpretation of your
data using the Descriptives table:
You will want to report these descriptive statistics in your results using the
mean (the "Mean" column) and standard deviation (the "Std. Deviation"
column) rather than the standard error of the mean (the "Std. Error"
column). Although both the standard deviation and standard error of the
mean are used to describe data, the latter is considered to be erroneous in
many of the cases where it is presented (e.g., see discussion by Carter
(2013) and explanation by Altman & Bland (2005)). You might, therefore,
report these results as follows:
Assumption of homogeneity of variances

The one-way ANOVA assumes that the population variances of the dependent variable
are equal for all groups of the independent variable. If the variances are unequal, this
can affect the Type I error rate. In our example, the (population) variance for CWWS
scores, coping_stress, for all levels of group should be equal. If this is not the case,
corrections can be applied to the calculations of the one-way ANOVA so that any
violation of homogeneity of variances can be compensated for and the test remains
valid.
The assumption of homogeneity of variances is tested using Levene's test of equality of
variances, which is but one way of determining whether the variances between groups
for the dependent variable are equal. The result of this test is found in the Test of
Homogeneity of Variances table, as highlighted below:
The important column of the table above is the "Sig." column, which
presents the significance value (i.e., p-value) of the test. If Levene's
test is statistically significant (i.e., p < .05), you do not have equal
variances and have violated the assumption of homogeneity of
variances (i.e., you have heterogeneous variances). On the other hand,
if Levene's test is not statistically significant (i.e., p > .05), you have
equal variances and you have not violated the assumption of
homogeneity of variances. In our example, the "Sig." value (i.e., p-
value) is .120 (i.e., p = .120), which indicates that the variances are
equal (i.e., the assumption of homogeneity of variances is met).
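An equivalent check can be sketched with scipy's `levene` function. Note that scipy defaults to the median-centred (Brown-Forsythe) variant, so `center="mean"` is passed to match the mean-based Levene's test SPSS reports; the scores are hypothetical and will not reproduce the guide's p = .120.

```python
from scipy import stats

# Hypothetical CWWS scores for the four activity groups
sedentary = [2.1, 2.4, 1.9, 2.6, 2.3, 2.0, 2.5]
low       = [2.9, 3.1, 2.7, 3.3, 3.0, 2.8, 3.2]
moderate  = [3.5, 3.8, 3.4, 3.9, 3.6, 3.7, 3.3]
high      = [4.2, 4.5, 4.0, 4.6, 4.3, 4.1, 4.4]

# center="mean" matches the mean-based Levene's test
stat, p = stats.levene(sedentary, low, moderate, high, center="mean")
print(f"Levene's statistic = {stat:.3f}, p = {p:.3f}")

# p > .05 -> variances treated as equal; p < .05 -> assumption violated
equal_variances = p > 0.05
```

The decision rule mirrors the "Sig." column: a non-significant result lets you proceed with the standard one-way ANOVA.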
Results when homogeneity of variances is met
You established on the previous page that you have homogeneity of variances. This
means you can interpret the standard one-way ANOVA and, if this test is statistically
significant, either: (a) interpret the results from the Tukey post hoc test to understand
where any difference(s) lie; or (b) run contrasts to investigate specific differences
between groups. The one-way ANOVA result is found in the ANOVA table, as shown
below:
The most important part of the table above is the "Sig." column, as
highlighted below:

This column contains the statistical significance value (i.e., p-value) of the test. If the ANOVA is statistically significant (i.e., p < .05), it can be concluded that not all group means are equal in the population (i.e., at least one group mean is different from another group mean). Alternatively, if p > .05, you do not have any statistically significant differences between the group means. The p-value in this example would appear to be .000 (obtained from the "Sig." column). However, if you ever see SPSS Statistics print out a p-value of .000, do not interpret this as a significance value that is actually zero; it actually means p < .0005. As the statistical significance value in this example is less than .05 (i.e., p < .0005 satisfies p < .05), it can be concluded that there is a statistically significant difference in mean coping_stress scores for the different levels of group. That is, you know that at least one group mean differs from the other group means. You could report this result as:
The last part of the statement above (i.e., F(3, 27) = 8.316, p < .0005) is obtained from the ANOVA table, as
shown below:
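The omnibus test itself can be sketched with scipy's `f_oneway`. The data below are hypothetical, but the group sizes (7 + 8 + 8 + 8 = 31 subjects in 4 groups) are chosen so the degrees of freedom match the guide's F(3, 27); the F and p values will differ from 8.316 and < .0005.

```python
from scipy import stats

# Hypothetical CWWS scores for 31 subjects across four groups
sedentary = [2.1, 2.4, 1.9, 2.6, 2.3, 2.0, 2.5]
low       = [2.9, 3.1, 2.7, 3.3, 3.0, 2.8, 3.2, 2.6]
moderate  = [3.5, 3.8, 3.4, 3.9, 3.6, 3.7, 3.3, 3.5]
high      = [4.2, 4.5, 4.0, 4.6, 4.3, 4.1, 4.4, 4.2]

f_stat, p_value = stats.f_oneway(sedentary, low, moderate, high)

k = 4                      # number of groups
n = 7 + 8 + 8 + 8          # total sample size
df_between, df_within = k - 1, n - k   # (3, 27) here
print(f"F({df_between}, {df_within}) = {f_stat:.3f}, p = {p_value:.5f}")
```

The degrees of freedom reported in the F(3, 27) statement come directly from k − 1 between groups and N − k within groups.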
Tukey post hoc test
As has been mentioned previously, if you have no prior hypotheses about which specific
groups might differ or your interest is in all possible pairwise comparisons, you should
run a post hoc test that tests all possible group comparisons. The Tukey post hoc test is
a good (Westfall et al., 2011) and recommended (Kirk, 2013) test for this purpose when
the assumption of homogeneity of variances is not violated (and all other assumptions
of the one-way ANOVA are met). This test is useful in that it not only provides the
statistical significance level (i.e., p-value) for each pairwise comparison, but also
provides confidence intervals (aka Tukey's intervals) for the mean difference for each
comparison.
The results from the Tukey post hoc test are presented in the Multiple Comparisons
table, as shown below:
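An all-pairs comparison of the same kind can be sketched with scipy's `tukey_hsd`, which returns both a p-value and a simultaneous confidence interval per pair, analogous to the SPSS Multiple Comparisons table. The scores are hypothetical.

```python
from scipy import stats

# Hypothetical CWWS scores for the four groups
sedentary = [2.1, 2.4, 1.9, 2.6, 2.3, 2.0, 2.5]
low       = [2.9, 3.1, 2.7, 3.3, 3.0, 2.8, 3.2, 2.6]
moderate  = [3.5, 3.8, 3.4, 3.9, 3.6, 3.7, 3.3, 3.5]
high      = [4.2, 4.5, 4.0, 4.6, 4.3, 4.1, 4.4, 4.2]

res = stats.tukey_hsd(sedentary, low, moderate, high)
names = ["Sedentary", "Low", "Moderate", "High"]
ci = res.confidence_interval(confidence_level=0.95)

# One line per pairwise comparison: p-value plus Tukey's interval
for i in range(4):
    for j in range(i + 1, 4):
        print(f"{names[i]} vs {names[j]}: p = {res.pvalue[i, j]:.4f}, "
              f"95% CI [{ci.low[i, j]:.3f}, {ci.high[i, j]:.3f}]")
```

An interval that excludes zero corresponds to a pairwise p-value below .05, which is what "where the differences lie" means in practice.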
Putting it all together
In conclusion, reporting all the results, including information about the
assumptions run, you could write the complete results as:
Results when homogeneity of variances is violated

When the assumption of homogeneity of variances is violated, the Welch ANOVA is used instead of the standard one-way ANOVA. If this test is statistically significant, you can either: (a) interpret the results of the Games-Howell post hoc test to understand where any difference(s) lie; or (b) run contrasts to investigate specific differences between groups. The result of the Welch ANOVA is found in the Robust Tests of Equality of Means table, as shown below:
The important column is again the "Sig." column, which contains the statistical significance value (i.e., p-value) of the Welch ANOVA. If this test is statistically significant (i.e., p < .05), it can be concluded that not all group means are equal in the population (i.e., at least one group mean is different from another group mean). Alternatively, if p > .05, you do not have any statistically significant differences between the group means. The p-value in this example would appear to be .000 (obtained from the "Sig." column). However, if you ever see SPSS Statistics print out a p-value of .000, do not interpret this as a significance value that is actually zero; it actually means p < .0005. As the statistical significance value in this example is less than .05 (i.e., p < .0005 satisfies p < .05), it can be concluded that there is a statistically significant difference in mean coping_stress scores for the different levels of group. That is, you know that at least one group mean differs from the other group means. You could report this result as:
Games-Howell post hoc test
The Games-Howell post hoc test is a good test if you want to compare
all possible combinations of group differences when the assumption of
homogeneity of variances is violated. This post hoc test provides
confidence intervals for the differences between group means and
shows whether the differences are statistically significant. The Games-
Howell post hoc test is presented in the Multiple Comparisons table,
as shown below:
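The Games-Howell procedure can be sketched as follows: each pair of groups gets its own unpooled standard error and a Welch-Satterthwaite df, and the p-value comes from the studentized range distribution over all k groups. This is an illustrative implementation of the textbook formula using scipy's `studentized_range` distribution, run on hypothetical data, not SPSS output.

```python
import numpy as np
from scipy import stats

def games_howell(samples, names):
    """Games-Howell pairwise comparisons (sketch of the textbook formula)."""
    k = len(samples)
    out = []
    for i in range(k):
        for j in range(i + 1, k):
            a = np.asarray(samples[i], dtype=float)
            b = np.asarray(samples[j], dtype=float)
            na, nb = len(a), len(b)
            va, vb = a.var(ddof=1) / na, b.var(ddof=1) / nb  # unpooled SE components
            diff = a.mean() - b.mean()
            # Welch-Satterthwaite degrees of freedom for this pair
            df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
            # Studentized range statistic and its p-value over k groups
            q = abs(diff) / np.sqrt((va + vb) / 2)
            p = stats.studentized_range.sf(q, k, df)
            out.append((names[i], names[j], diff, p))
    return out

# Hypothetical CWWS scores with unequal spreads
groups = [
    [2.1, 2.4, 1.9, 2.6, 2.3, 2.0, 2.5],
    [2.9, 3.1, 2.7, 3.3, 3.0, 2.8, 3.2, 2.6],
    [3.5, 4.6, 2.9, 4.2, 3.1, 3.8, 2.7, 4.0],
    [4.2, 5.5, 3.1, 5.0, 3.6, 4.8, 3.3, 4.6],
]
names = ["Sedentary", "Low", "Moderate", "High"]

for g1, g2, diff, p in games_howell(groups, names):
    print(f"{g1} vs {g2}: mean difference = {diff:.3f}, p = {p:.4f}")
```

Because each pair uses its own standard error and df, the test stays valid when group variances differ, which is exactly the situation that rules out Tukey's test.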
Reporting
One-way ANOVA was not statistically significant and variances were equal
One-way ANOVA was not statistically significant and variances were unequal
One-way ANOVA was statistically significant, variances were equal and a post hoc test was carried out
One-way ANOVA was statistically significant, variances were unequal and a post hoc test was carried out
