Session 9

Two-Sample T-Tests and ANOVA



Independent Samples T-Test - Comparing Two Independent Means
Levene's Test for Equality of Variances
Paired Samples T-Test - Comparing Two Dependent Means
One-Way ANOVA - Comparing Multiple Independent Means
Practical session 9

SESSION 9: Two-Sample T-Tests and ANOVA

In Session 8 we looked at independence between two variables, where both
variables were categorical. How do we investigate relationships between
variables where one is numerical?

Independent Samples T-Test - Comparing Two Independent Means

Open the STATLAB data found in H:\My Documents\spss data\Statlaba.sav.

In this data set, the children were weighed and measured (among other
things) at the age of ten. We want to know whether there is any difference in
the average heights of boys and girls at this age. We do this by performing
an Independent Samples T-Test. The boys and girls are our two
independent samples: there is no connection between them.

We start by stating our Null Hypothesis:

We assume there is no difference between boys and girls in terms of their
average height.

The Alternative Hypothesis, the one used if the Null Hypothesis is rejected,
is therefore that there is some difference in the average heights of boys and
girls at this age.

To perform the T-Test, we select:

Analyze
Compare Means
Independent-Samples T Test

which gives the Independent-Samples T Test dialogue box, as in Figure 9.1.
Move the appropriate variables into the fields as seen in this figure.


Figure 9.1

We want to test for differences in the mean HEIGHTS of the children; we
therefore move the variable CTH to the Test Variables area. We want to look
at differences in the heights of the two groups BOYS and GIRLS, and so our
Grouping Variable is SEX.

We need to tell SPSS which groups we want to use. In this case we only
have two categories (boys and girls), but if we had a grouping variable with
several categories, we could choose any two. We click on the Define
Groups button, which gives the Define Groups dialogue box in Figure 9.2.


Figure 9.2

We enter the values representing the two groups (1 for the females, 2 for the
males), then click on Continue. In the Independent-Samples T Test
dialogue box, the two question marks in the Grouping Variable area have
changed to the values we entered.

Clicking OK gives the output found in Figure 9.3.

Group Statistics

     sex        N     Mean     Std. Deviation   Std. Error Mean
cth  1.00 girl  647   53.287   2.5916           .1019
     2.00 boy   648   53.644   2.5264           .0992

Independent Samples Test

F, Sig.: Levene's Test for Equality of Variances.
t to Upper: t-test for Equality of Means.
Lower, Upper: 95% Confidence Interval of the Difference.

cth                          F     Sig.  t       df        Sig.        Mean        Std. Error  Lower   Upper
                                                           (2-tailed)  Difference  Difference
Equal variances assumed      .547  .460  -2.512  1293      .012        -.3573      .1422       -.6363  -.0782
Equal variances not assumed              -2.512  1292.056  .012        -.3573      .1422       -.6363  -.0782

Figure 9.3

The first part of the output gives some summary statistics: the number of
children in each group, and the mean, standard deviation and standard error
of the mean of the heights in each group.

In the second table in the output, we have two lines of figures; we are going to
look at the first line, but will come back to the reason we made this decision
later. For the moment, ignore the two columns headed 'Levene's Test.'

The Mean Difference, the result of subtracting the mean of the boys' heights
(group 2) from the mean of the girls' heights (group 1), is -0.3573. Our Null
Hypothesis says that there is no difference between the boys and girls in
terms of their heights; so we are testing whether this difference (-0.3573) is
significantly different from zero. If it is, we must reject the Null Hypothesis,
and instead take the Alternative.

SPSS calculates the t-value, the degrees of freedom and the Significance
Level and displays them in the columns headed t, df and Sig. (2-tailed). Just
as with the χ² test, we can make our decision quickly based on the displayed
Significance Level.

If the Significance Level is less than 0.05, we reject the Null Hypothesis and
take the Alternative Hypothesis instead.

In this case, with a Significance Level of 0.012, we say that there is evidence,
at the 5% level, to suggest that there is a difference between the heights of
boys and girls at age ten (the Alternative Hypothesis).
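
The figures in the first line of the Independent Samples Test table can be
checked by hand from the Group Statistics. The following is a minimal Python
sketch, not part of the SPSS procedure; the function name pooled_t_test is our
own, and the p-value uses the normal distribution as a stand-in for the t
distribution, an assumption that is only reasonable because the degrees of
freedom are so large here.

```python
import math
from statistics import NormalDist


def pooled_t_test(n1, mean1, sd1, n2, mean2, sd2):
    """Equal-variance (pooled) two-sample t-test from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    # Standard error of the difference between the two means
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    return diff, se, diff / se, df


# Group Statistics from Figure 9.3: girls are group 1, boys are group 2
diff, se, t, df = pooled_t_test(647, 53.287, 2.5916, 648, 53.644, 2.5264)

# Assumption: normal approximation to the t distribution (df is over 1000)
p_approx = 2 * NormalDist().cdf(-abs(t))

print(f"difference = {diff:.4f}, std. error = {se:.4f}")
print(f"t = {t:.3f}, df = {df}, approximate two-tailed p = {p_approx:.3f}")
```

The results should agree closely with the SPSS output; small discrepancies in
the last decimal place arise because the means in the Group Statistics table
are rounded.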

Levene's Test for Equality of Variances

Why did we choose the first line of output? The independent samples t-test is
sensitive to groups having different dispersions (i.e. variances). If the
variances in the two groups are equal (or are close enough), the first line of
output is used, which is the usual t-test.

If, however, the variances are unequal, SPSS can apply a correction. The
results when the correction is applied (whether or not it is needed) are shown
in the second line of output, where equal variances are not assumed.

Which line we use depends on the result of Levene's Test for Equality of
Variances, displayed in the first two columns. A significance value (Sig.) of
0.05 or more means that the Null Hypothesis of assuming equal variances is
acceptable, and we therefore take the first line of the output; a significance
value of less than 0.05 means that the second line of output should be used
for the T-Test.

In this case, the significance value is comfortably above this threshold, and
therefore equal variances are assumed; we use the results from the first line.
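
The handout does not name the correction used in the second line. The sketch
below assumes it is the Welch-Satterthwaite adjustment, which does reproduce
the fractional degrees of freedom in Figure 9.3 from the Group Statistics.
The function names are our own, and row_to_read simply encodes the 0.05 rule
described above.

```python
import math


def welch_se_and_df(n1, sd1, n2, sd2):
    """Standard error and Welch-Satterthwaite df, without pooling variances."""
    v1 = sd1**2 / n1  # squared standard error of the first group's mean
    v2 = sd2**2 / n2  # squared standard error of the second group's mean
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se, df


def row_to_read(levene_sig, alpha=0.05):
    """Pick the line of the output to use, given the Sig. of Levene's Test."""
    if levene_sig >= alpha:
        return "Equal variances assumed"
    return "Equal variances not assumed"


# Standard deviations and group sizes from Figure 9.3
se, df = welch_se_and_df(647, 2.5916, 648, 2.5264)
row = row_to_read(0.460)

print(f"std. error = {se:.4f}, df = {df:.3f}")
print(f"Levene Sig. = .460, so read the line: {row}")
```

Because the two groups are almost the same size and have almost the same
standard deviation, the corrected line barely differs from the first one.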


Paired Samples T-Test - Comparing Two Dependent Means

Imagine you want to compare two groups that are somehow paired; for
example, husbands and wives, or mothers and daughters. Knowing about
this pairing structure gives extra information, and you should take account of
this when performing the T-Test.
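
To see why the pairing matters, here is a minimal Python sketch. The six pairs
of values are made up purely for illustration and are not taken from any of
the course data sets; the function names are our own. The same numbers are
analysed twice, once ignoring the pairing and once using it.

```python
import math
from statistics import mean, stdev, variance

# Hypothetical paired measurements: first[i] and second[i] belong together
first = [12, 14, 10, 16, 11, 13]
second = [13, 16, 11, 18, 12, 15]


def independent_t(x, y):
    """Pooled two-sample t statistic, treating x and y as unrelated groups."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(y) - mean(x)) / math.sqrt(pooled * (1 / nx + 1 / ny))


def paired_t(x, y):
    """Paired t statistic: a one-sample t-test on the within-pair differences."""
    d = [b - a for a, b in zip(x, y)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))


t_indep = independent_t(first, second)
t_paired = paired_t(first, second)

print(f"ignoring the pairing: t = {t_indep:.3f}")
print(f"using the pairing:    t = {t_paired:.3f}")
```

In this made-up example every pair moves in the same direction, so the
differences vary very little and the paired t-value is much larger than the
independent one, even though the two group means are identical in both
analyses.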

In the GSS91t data, we have variables recording how many years of
education the respondent and his or her spouse, father, and mother received.
If we want to know if there is a difference between the respondent and his or
her father in terms of education, we can perform a Paired Samples T-Test on
the two variables educ and paeduc. Our Null Hypothesis is that there is no
difference. We select:

Analyze
Compare Means
Paired-Samples T Test.

This gives us the Paired-Samples T Test dialogue box found in Figure 9.4.

Both variables must be selected from the list before clicking on the arrow
button to move them to the Paired Variables area. As you click on a variable
it will appear in the Current Selections area.


Figure 9.4

Clicking on OK gives the output in Figure 9.5.

Paired Samples Statistics

Pair 1                                          Mean    N      Std. Deviation   Std. Error Mean
educ    Highest Year of School Completed        13.42   1065   2.859            .088
paeduc  Highest Year School Completed, Father   10.87   1065   4.120            .126

Paired Samples Correlations

Pair 1          N      Correlation   Sig.
educ & paeduc   1065   .463          .000

Paired Samples Test

Pair 1: educ (Highest Year of School Completed) - paeduc (Highest Year
School Completed, Father)

Paired Differences
Mean    Std. Deviation   Std. Error Mean   95% Confidence Interval of the Difference
                                           Lower   Upper
2.548   3.773            .116              2.322   2.775

t        df     Sig. (2-tailed)
22.044   1064   .000

Figure 9.5

As with the Independent Samples T-Test, we are first given some summary
statistics; we are also given the Correlation between the two variables (we
will deal with Correlation in a later session). The Paired Samples Test table
shows that the mean difference between the years of education of the
respondents and their fathers is 2.548 years. Is this significantly different
from zero?

We use this table just as we did in the Independent Samples T-Test, and
since the Sig. (2-tailed) column shows a value of less than 0.05, we can say
that there is evidence, at the 5% level, to reject the Null Hypothesis that
there is no difference between the respondents and their fathers in terms of
their years of education.

In other words, there is strong evidence (p < 0.0005) to show that
respondents have a different amount of education compared to their fathers;
it appears the children stay in education longer.
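
The Paired Samples Test figures can be checked from the summary values in
Figure 9.5. This is a minimal Python sketch; the variable names are our own,
and the confidence interval uses the normal multiplier 1.96 in place of the
exact t multiplier, an assumption that is reasonable only because the sample
is large.

```python
import math

# Values read from Figure 9.5
n = 1065
mean_diff, sd_diff = 2.548, 3.773           # Paired Differences
sd_educ, sd_paeduc, r = 2.859, 4.120, 0.463  # Statistics and Correlations

# The paired t-test is a one-sample t-test on the differences
se = sd_diff / math.sqrt(n)
t = mean_diff / se
df = n - 1

# The spread of the differences follows from the two standard deviations
# and the correlation between the paired variables
sd_from_r = math.sqrt(sd_educ**2 + sd_paeduc**2 - 2 * r * sd_educ * sd_paeduc)

# Assumption: 1.96 (normal) is used instead of the t multiplier for df = 1064
lower = mean_diff - 1.96 * se
upper = mean_diff + 1.96 * se

print(f"std. error = {se:.3f}, t = {t:.3f}, df = {df}")
print(f"std. deviation of differences from r = {sd_from_r:.3f}")
print(f"approximate 95% interval: {lower:.3f} to {upper:.3f}")
```

The positive correlation between educ and paeduc is what makes the standard
deviation of the differences smaller than it would be for unrelated samples;
this is the extra information that the pairing provides.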

One-Way ANOVA - Comparing Multiple Independent Means

We now look at the situation where we want to compare several independent
groups. For this we use a One-Way ANOVA (ANALYSIS OF VARIANCE).

Open the GSS91t data in H:\My Documents\spss data\Gss91t.sav. We
can split the respondents into three groups according to which category of the
variable LIFE they fall into: EXCITING, ROUTINE or DULL. We want to know
if there is any difference in the average years of education of these groups.
Our Null Hypothesis is that there is no difference between them in terms of
education.

Select:

Analyze
Compare Means
One-Way ANOVA.

In the One-Way ANOVA dialogue box (Figure 9.6) we move the variable we
want to test, EDUC, into the Dependent List, and our grouping variable,
LIFE, into the Factor area. Unlike the Independent Samples T-Test, we
don't have to specify the categories; each category will be taken as a
separate group.


Figure 9.6

If we were to click on OK now, SPSS would produce output that would enable
us to decide whether to accept or reject the Null Hypothesis that there is no
difference between the groups. But if we find evidence of a difference, we will
not know where the difference lies.

For example, those finding life exciting may have a significantly different
number of years in education from those finding life dull, but there may be no
difference when they are compared to those finding life routine.

We therefore ask SPSS to perform a further analysis for us, called Tukey's b
Test. We select this by clicking on the Post Hoc button, which gives us the
One-Way ANOVA: Post Hoc Multiple Comparisons dialogue box in Figure
9.7.


Figure 9.7

Click next to Tukey's-b to select that option. Notice that the Significance
level is set at 0.05 (5%), but the user can change this. Click on Continue to
return to the One-Way ANOVA dialogue box, and click OK. The output
produced by SPSS is below in Figure 9.8.


Figure 9.8

The first table gives the results of the One-Way ANOVA. A measure of the
variability found between the groups is shown in the Between Groups line,
while the Within Groups line gives a measure of how much the observations
within each group vary. These are used to perform the F-Test which we use
to test our Null Hypothesis that there is no difference between the three
groups in terms of their years in education.
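
The way the Between Groups and Within Groups lines combine into the F-value
can be sketched in a few lines of Python. The three small groups below are
made-up numbers for illustration only (they are not the GSS91t data), and the
function name one_way_anova is our own.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (SS between, SS within, df between, df within, F)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between Groups: how far each group mean lies from the overall mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within Groups: how far each observation lies from its own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    # F is the ratio of the two mean squares
    f = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f


# Hypothetical years of education for three groups
exciting = [14, 16, 15, 17]
routine = [12, 13, 14, 13]
dull = [10, 11, 12, 11]

ss_b, ss_w, df_b, df_w, f = one_way_anova([exciting, routine, dull])
print(f"Between Groups: SS = {ss_b:.3f}, df = {df_b}")
print(f"Within Groups:  SS = {ss_w:.3f}, df = {df_w}")
print(f"F = {f:.3f}")
```

A large F-value means the group means differ by more than the variation
within the groups would lead us to expect. The significance of F comes from
the F distribution with these two degrees of freedom, which SPSS reports in
the Sig. column.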

We interpret the F-Test in the same way as we did the T-Test; if the
significance (in the Sig. column) is less than 0.05, we have evidence, at the
5% level, to reject the Null Hypothesis, and say that there is some difference
between the groups. Otherwise, we accept our Null Hypothesis.

We can see from Figure 9.8 that the F-value of 34.077 has a significance of
less than 0.0005, and therefore we reject the Null Hypothesis. The second
table then shows us where these differences lie.

Tukey's b Test creates subsets of the categories; if there is no difference
between two categories, they are put into the same subset. The groups are
arranged in ascending order of their mean number of years in education. We
can say that, at the 5% level, those who find life DULL are significantly
different from both those who find life ROUTINE and those who find it
EXCITING in terms of education. Similarly, there is a significant difference in
number of years in education between those who find life ROUTINE and
those who find it EXCITING.


Practical session 9

Open the STATLAB data (H:\My Documents\spss data\Statlaba.sav).

At the age of ten, the children in the sample were given two tests; the
Peabody Picture Vocabulary Test and the Raven Progressive Matrices
Test. Their scores are stored in the variables CTP and CTR.

Create a new variable called TESTS which is the sum of the two tests; this
new variable will be used in Questions 1 and 2.

In each of the questions below, state your Null and Alternative Hypotheses,
which of the two you accept on the evidence of the relevant test, and the
Significance Level.

1. The Scores of the Two Sexes

Use an Independent Samples T-Test to decide whether there is any
difference between boys and girls in terms of their scores. Which group seems
to do better?

2. The Scores of the Offspring of Different Paternal Occupational
Groups

In the STATLAB data, the fathers' occupation is stored in the variable FTO,
with the following categories:

0 Professional
1 Teacher / Counsellor
2 Manager / Official
3 Self-employed
4 Sales
5 Clerical
6 Craftsman / Operator
7 Labourer
8 Service worker

Using a One-Way ANOVA, test whether there is any difference between the
occupation groups in terms of the test scores of their children. Which
occupations are similar and which are different?


3. Education - respondents, spouses and parents

Open the GSS91t data.

By pairing the husbands and wives, decide whether there is any difference in
terms of years of education between

Respondents and their spouses (educ and speduc)
The parents of the respondents (paeduc and maeduc)

What do you interpret from this with respect to changes over generations?


Save all your output as exer9.spo
