
Steve Saffhill

Research Methods in Sport & Exercise


Difference Testing
Aims and Objectives
Introduce how to test for significant
differences between two groups of data
Detail basic principles of difference testing
Introduce parametric and non-parametric
tests
How to interpret SPSS outputs
Parametric Assumptions - Reminder
1. The data must be randomly sampled

2. The data must be high level data (interval/ratio not


nominal or ordinal)

3. The data must be normally distributed……


- normal curve on histograms
- z scores between -1.96 & +1.96

4. The data must be of equal variance


 These assumptions are of progressive importance.

 If you do not meet #1 then use Non-parametric


inferential tests

 Some can be violated but you must justify doing so


with supporting evidence! (Vincent, 2005)

 # 4 is the least important
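As a rough illustration of how assumptions 3 and 4 might be checked outside SPSS, the Python sketch below turns skewness into a z score and runs Levene's test. It assumes that the z scores mentioned above are skewness z scores (skewness divided by its standard error) and that SciPy is installed. The data and the function name `skewness_z` are hypothetical.

```python
import math
from scipy import stats

def skewness_z(scores):
    """Return skewness divided by its standard error (a z score)."""
    n = len(scores)
    skew = stats.skew(scores, bias=False)  # bias-corrected sample skewness
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return skew / se_skew

# Hypothetical scores for two groups (illustration only)
group_a = [12, 14, 15, 15, 16, 17, 18, 19, 21]
group_b = [10, 13, 14, 16, 16, 17, 19, 20, 23]

z_a = skewness_z(group_a)
normal_enough = -1.96 < z_a < 1.96  # assumption 3

# Levene's test for equal variances (assumption 4), centred on the group means
levene_stat, levene_p = stats.levene(group_a, group_b, center="mean")
equal_variances = levene_p >= 0.05

print(f"skewness z = {z_a:.2f}, normal enough: {normal_enough}")
print(f"Levene p = {levene_p:.3f}, equal variances: {equal_variances}")
```

The same check would normally be repeated for kurtosis and for every group in the analysis.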


Basic Principles – Different Tests
• Statistical tests enable us to evaluate the effect of an
independent variable on a dependent variable.

• IV = the presumed cause of the effect being researched.


 The researcher controls the IV (i.e., differing levels of
athletic ability)

• DV = those that can be explained by the effects of the IV


 What is actually being measured (Performance changes -
that the researcher cannot control)!!!!
 PARAMETRIC = t-test
• Use for both repeated measurements (Paired t test) (i.e.
measurements carried out on the same subjects)
and
• independent measurements (independent t test) (i.e.
measurements carried out on two different groups of subjects).

 NON-PARAMETRIC =
• Wilcoxon – Repeated measures/Paired equivalent
• Mann-Whitney-U – independent t-test equivalent
• The statistical test is used to determine if the two levels of treatment differ
significantly (p < .05) so that their difference would not be attributable to a chance
occurrence more than 5 times in 100.
• The statistical test is always of the null hypothesis.
• All that statistics can do is reject or fail to reject the null hypothesis.
• Statistics cannot accept the research hypothesis...you do!
• Only logical reasoning, good experimental design, and appropriate theorising can
do so.
• Statistics can determine only whether the groups are different, not why they are
different.

• You the researcher say why!


Experimental Research Designs
• Within-Participants Design = repeated measures test on same
group

• Allows you to control for inter-individual confounding variables

• If you use different groups, there is a chance that some variable
other than your IV distinguishes between your groups!

• In this design you have same people in all conditions so there is


less extraneous variation between conditions!

• Fewer participants too!!


• Between-Participants Design = different groups in
each condition of the IV
• Each group is less likely to get bored, tired etc
• Less susceptible to practice effects/results bias

• Needs more participants


• Need different participants in each group
• So lose some control over confounding influences!
Independent t - test
• The most frequently used t test determines whether two
sample means differ reliably from each other.

• In this test the samples are independent of one another


– also referred to as a between comparison.

• E.g., male v female scores in anxiety (IV/DV?)


• E.g., football v boxing VO2max (IV/DV?)
Types of t test

Example

• In an experiment of training intensity and distance run in 12


minutes the results are as follows:

 mean distance run in 12 min after 70% training intensity = 3004 m


 after 40% training intensity = 2456 m.

 Can you identify the IV and DV??


Types of t test

 IV = training intensity (70% vs 40%)

 DV = distance run
Types of t test

• The question that statistics has to answer is:

“Is the difference in the two mean scores significant or is it one


that could have occurred by chance given the inherent
variability of groups produced by random sampling?”
Using SPSS to carry out the analysis gives the following result:
t (2.8) = 13.81, p <.03

• The t value is basically a ratio: the difference between the group
means divided by a measure of the variability within the groups
(the standard error of the difference)

• The larger the difference between the groups compared with the
variability within the groups, the larger the t value
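The ratio idea above can be sketched in Python. This is a minimal illustration, not the SPSS procedure itself: it computes an equal-variance independent t by hand and compares it with SciPy's `ttest_ind`. The distances and the helper name `pooled_t` are hypothetical and are not the data behind the result reported above.

```python
import math
from scipy import stats

# Hypothetical distances (m) run in 12 minutes by two independent groups
high_intensity = [3050, 2980, 3100, 2950, 3020, 2990]
low_intensity = [2500, 2400, 2480, 2430, 2510, 2420]

def pooled_t(a, b):
    """Independent t: difference between means / standard error of the difference."""
    n1, n2 = len(a), len(b)
    mean1, mean2 = sum(a) / n1, sum(b) / n2
    var1 = sum((x - mean1) ** 2 for x in a) / (n1 - 1)
    var2 = sum((x - mean2) ** 2 for x in b) / (n2 - 1)
    # Pool the two within-group variances (assumes equal variances)
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff

t_manual = pooled_t(high_intensity, low_intensity)
t_scipy, p_value = stats.ttest_ind(high_intensity, low_intensity)  # equal variances assumed

print(f"manual t = {t_manual:.3f}, SciPy t = {t_scipy:.3f}, p = {p_value:.4f}")
```

A big gap between the means with little spread inside each group gives a large t, which is why these made-up groups differ so clearly.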

DV goes into dependent


list in SPSS and IV goes into
FACTOR list!

Then you have to define


your groups (i.e., tell SPSS
who is who in what group)
What is the probability of obtaining that t value by
chance?
• The larger t is, then the more likely there is a TRUE difference between the
groups that is theoretically caused by our independent variable

• Each t value comes with its own associated probability level and this is where
the p value comes from

• p = .03

• Yes – there is a significant difference in distance between the two groups

• 70% intensity group ran reliably further than the 40% intensity of training
group.

• There is a significant difference between the two mean scores!


Caution!!!

• All comparison-between-groups techniques assume that the
variances of the groups are equivalent (homogeneity of variance).
 Although mild violations of this assumption do not present major
problems, serious violations are more likely if group sizes are not
approximately equal.

• Most computer programs allow unequal group sizes. However,


the homogeneity assumption should be checked if group sizes
are very different or even when variances are very different
(automatically covered by SPSS) - Levene’s equality of variance
test.
Independent T-test on SPSS
Group Statistics

      LEVEL    N     Mean     Std. Deviation   Std. Error Mean
CS1   senior   94    4.3616   .88860           .09165
      junior   101   4.1074   1.16502          .11592

Independent Samples Test

Levene's Test for Equality of Variances (CS1): F = 8.311, Sig. = .004

t-test for Equality of Means (CS1)
(95% CI = 95% Confidence Interval of the Difference)

                              t       df        Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
Equal variances assumed       1.704   193       .090              .2542             .14919                  -.04009        .54843
Equal variances not assumed   1.720   185.961   .087              .2542             .14778                  -.03737        .54571

If Levene's Sig. is less than 0.05 we can say that there is a significant
difference in the variance of the two sets of scores!!!

If this is the case we say equal variances are not assumed and
use the bottom row of values.
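As a cross-check, the t values in the output above can be approximately reproduced from the Group Statistics alone. The sketch below assumes SciPy is available and uses its `ttest_ind_from_stats` function. The helper name `t_from_summary` is hypothetical, and small rounding differences from SPSS are to be expected.

```python
from scipy import stats

# Summary statistics copied from the Group Statistics table (CS1 by LEVEL)
senior = dict(mean=4.3616, sd=0.88860, n=94)
junior = dict(mean=4.1074, sd=1.16502, n=101)

def t_from_summary(equal_var):
    """Independent t-test computed from means, SDs and group sizes."""
    return stats.ttest_ind_from_stats(
        senior["mean"], senior["sd"], senior["n"],
        junior["mean"], junior["sd"], junior["n"],
        equal_var=equal_var,
    )

levene_sig = 0.004               # Sig. for Levene's test in the output above
equal_var = levene_sig >= 0.05   # False here, so read the "not assumed" row
t_value, p_value = t_from_summary(equal_var)

print(f"equal variances assumed: {equal_var}")
print(f"t = {t_value:.3f}, p (2-tailed) = {p_value:.3f}")
```

Either way, p is above 0.05 in this output, so the senior and junior means do not differ significantly.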
Dependent t Test – also called a repeated measures or
Paired t Test

• This means that the two groups of scores are related in


some manner.
• one group of subjects is tested twice on the same variable,
and the experimenter is interested in the change between
the two tests
• Hence repeated measures design
There is no IV as such and
hence both variables go
into dependent list in SPSS
(i.e., nothing goes into
factor list)
Example: Effects of visualisation on pain
– Condition 1 = imagine performing an exciting t-test whilst
plunging hands into ice cold water
– Condition 2 = imagine being on a beach drinking beer
whilst plunging hands into ice cold water

IV = Condition DV = time hand immersed

• Similar formula to independent t-tests, however it is a bit


more sensitive as it takes into consideration that we are using
the same participants in both conditions
Counter-Balancing

• We couldn’t have everyone do C1 first as they would never return
for C2

• It might also lead to order effects!


 Learning, practice etc

• ½ do C1 first and ½ do C2 first, and then they swap
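One simple way to set up the counterbalancing described above is to shuffle the participant list and split it in half. The Python sketch below is only an illustration. The participant IDs and the function name `counterbalance` are hypothetical.

```python
import random

def counterbalance(participants, seed=None):
    """Randomly split participants so half start with C1 and half with C2."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    orders = {p: ("C1", "C2") for p in shuffled[:half]}
    orders.update({p: ("C2", "C1") for p in shuffled[half:]})
    return orders

# Hypothetical participant IDs
orders = counterbalance([f"P{i:02d}" for i in range(1, 9)], seed=1)
for participant in sorted(orders):
    print(participant, "->", " then ".join(orders[participant]))
```

Random allocation to the two orders means any order effects (learning, practice) should be spread evenly across both conditions.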


SPSS Output
Paired Samples Statistics

               Mean     N     Std. Deviation   Std. Error Mean
Pair 1   CS1   4.2219   269   1.11603          .06805
         CS2   4.3379   269   .98975           .06035

Paired Samples Correlations

                     N     Correlation   Sig.
Pair 1   CS1 & CS2   269   .523          .000

Paired Samples Test (Paired Differences)
(95% CI = 95% Confidence Interval of the Difference)

                     Mean     Std. Deviation   Std. Error Mean   95% CI Lower   95% CI Upper   t        df    Sig. (2-tailed)
Pair 1   CS1 - CS2   -.1161   1.03399          .06304            -.2402         .0081          -1.841   268   .067

The difference between the means of C1 and C2
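The paired t value in the output is the mean difference divided by its standard error. The short Python sketch below reworks that arithmetic from the Paired Samples Test table, assuming SciPy is available for the t distribution. Because the table values are rounded, the result may differ from SPSS in the third decimal place.

```python
from scipy import stats

# Values copied from the Paired Samples Test table (CS1 - CS2)
mean_diff = -0.1161      # mean of the paired differences
se_mean_diff = 0.06304   # standard error of that mean
n_pairs = 269

t_value = mean_diff / se_mean_diff
df = n_pairs - 1
p_two_tailed = 2 * stats.t.sf(abs(t_value), df)  # area in both tails

print(f"t({df}) = {t_value:.3f}, p (2-tailed) = {p_two_tailed:.3f}")
```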
Issues of Significance

• Differences in pain between the two conditions were not
statistically significant (p = 0.067)

• Remember p must be < 0.05 to be statistically
significant
– There is no significant difference (p = 0.067)
– This only reflects a tendency
– Power issue??

• Tendency accepted as p < 0.1


Non-Parametric Difference Tests

• Wilcoxon - 2 groups – within groups/repeated


measures – Paired t-test

• Mann Whitney U- 2 groups – between groups –


independent t-test
Mann Whitney U

• Do males and females differ on their emphasis on importance


of body image?

• Hypothesis = males and females will differ on their


emphasis of importance of body image

• Imagine the data were not randomly sampled/high


level, and/or not Normally distributed
2 output boxes appear:

Irrelevant

Similar function to t in t-test
High = better
p value: > 0.05 = no significant difference
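Outside SPSS, a Mann-Whitney U test might be run as in the Python sketch below. It assumes SciPy is available. The ratings are hypothetical and exist only to show the call and how the p value is read.

```python
from scipy import stats

# Hypothetical body-image importance ratings (1-7), for illustration only
males = [3, 4, 2, 5, 4, 3, 4, 2]
females = [6, 5, 7, 6, 5, 7, 6, 6]

u_stat, p_value = stats.mannwhitneyu(males, females, alternative="two-sided")
significant = p_value < 0.05

print(f"U = {u_stat}, p = {p_value:.4f}, significant: {significant}")
```

As with the SPSS output, the decision rests on the p value: above 0.05 means no significant difference between the groups.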
Wilcoxon
• Differences between imagery rating scores from memory and
after watching video playback

• Hypothesis = there will be a significant difference between
imagery ratings from memory and after watching the video of the
skill

• IV = presence/absence of video (operationalised by asking
subjects to rate likeness to actual performance)
• DV = 1-7 scale
2 tables appear:

Similar function to t in t-test
High = better

This is the p value!
> 0.05 = NOT significant
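A Wilcoxon signed-rank test on paired ratings could be sketched in Python as follows, again assuming SciPy is available. The ratings are hypothetical and are not the data behind the SPSS output.

```python
from scipy import stats

# Hypothetical imagery ratings (1-7) from the same ten participants
from_memory = [3, 4, 2, 5, 3, 4, 2, 3, 4, 3]
after_video = [5, 5, 4, 6, 5, 6, 3, 5, 6, 4]

w_stat, p_value = stats.wilcoxon(from_memory, after_video)  # two-sided by default
significant = p_value < 0.05

print(f"W = {w_stat}, p = {p_value:.4f}, significant: {significant}")
```

The two lists must be in the same participant order, because the test works on the difference within each pair.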
• Generally speaking parametric tests are
preferred

• However, they are not always possible!


 It depends on YOUR data
Meaningful versus statistical
significance
• Meaningful significance cannot be determined by statistics.

• It is a decision made by the researcher.

• Statistical significance is not the same as meaningful


significance.

• A small effect of altering a surgical method may emerge as


statistically significant but may be unimportant when we
measure surgical survival
One tailed or two tailed tests
• This topic is concerned with directional or non-
directional hypotheses.
• If the hypothesis is non-directional, e.g. performance of
Group A will be different to Group B following an
intervention, then we must choose a two-tailed statistical
test.
• If a hypothesis is directional, e.g. Group A will score
significantly higher than Group B following the
intervention, then we must choose a one-tailed statistical
test.

• A test of a directional hypothesis is more powerful.


One or two tailed
• This is concerned with Directional vs. non-directional hypotheses.

• If the hypothesis is non-directional (e.g. performance of Group A will be


different to Group B following an intervention)
• two-tailed statistical test.

• If a hypothesis is directional (e.g. Group A will score significantly higher than


group B following the intervention)
• one-tailed statistical test.

• In a directional hypothesis, not only do you say there will be a difference, but
also what that difference will be
• E.g. women have better ultra endurance than men
What are the implications of one and two tailed hypotheses?

• SPSS USUALLY assumes we conduct two tailed research so the


p value it produces is for a two tailed test! So we do nothing!!!

• However, if our hypothesis is one-tailed we must change the p
value SPSS has given us into a one-tailed value, OR click on
1-tailed in SPSS if it has it!

The two-tailed p value of .067 becomes 0.0335!!!

• We simply halve it!!!
Paired Samples Test (Paired Differences)
(95% CI = 95% Confidence Interval of the Difference)

                     Mean     Std. Deviation   Std. Error Mean   95% CI Lower   95% CI Upper   t        df    Sig. (2-tailed)
Pair 1   CS1 - CS2   -.1161   1.03399          .06304            -.2402         .0081          -1.841   268   .067

Notice there was no significant difference and now there is!!!!
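The halving rule can be written as a small Python helper. This is a sketch under one important assumption: the direction of the effect was predicted before the study, and halving only applies when the observed result goes in that direction. The function name `one_tailed_p` and the predicted direction used here are hypothetical.

```python
def one_tailed_p(p_two_tailed, t_value, predicted_direction):
    """Convert a two-tailed p value to a one-tailed p value.

    predicted_direction is "negative" or "positive": the sign of t that the
    hypothesis predicted before the study was run.
    """
    observed = "negative" if t_value < 0 else "positive"
    if observed == predicted_direction:
        return p_two_tailed / 2      # halve it
    return 1 - p_two_tailed / 2      # result went the wrong way

# Values from the Paired Samples Test table; the predicted direction is assumed
p_one = one_tailed_p(0.067, -1.841, predicted_direction="negative")
print(f"one-tailed p = {p_one:.4f}, significant: {p_one < 0.05}")
```

If the prediction had been in the opposite direction, the same data would give a one-tailed p far above 0.05.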


‘Why not always conduct one-tailed tests if it is more likely to
demonstrate significance?’

• The answer lies in the fact that you have to declare what you are
going to do before conducting the study (hypothesis).

(Remember - your study should be rooted in theory: you must have


an idea of what should happen)

• If we have conducted a one-tailed test and the result goes in the


opposite direction to that predicted, no matter how extreme, then
you cannot claim this as significant.
Why not always carry out two-tailed tests?

• For example, if our theory concerns the effects of stimulants on motor


performance, then stimulants generally speed up motor reactions.

• In which case it makes no sense to predict that tasks performed with


a stimulant will be performed significantly faster or slower than those
performed without it.

• In this case the theory dictates a directional test and hence a one-
tailed test of the hypothesis.

Think about your research first and select a


statistical test that fits your design/the
literature.
What if more than 2 groups?
• Often you will want to conduct a test to see if there are
differences between more than two groups/conditions

• VO2 max in football, rugby and hockey?

• Motivation in Year 1, Year 2, Year 3?

• ANOVA or MANOVA!

• More advanced stats - we will cover this next week!


Summary
 Check parametric assumptions to use correct test

 These tests let you test whether the IV has a significant effect on the DV

 T-tests used to test data from TWO groups

 Can run either a paired or an independent-samples t-test

 For non-parametric data use either:


 Wilcoxon – paired groups
 Mann Whitney U – independent groups

 Can have 1- or 2-tailed significance, it depends on your hypothesis!
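The choice of test summarised above can be expressed as a simple lookup. The Python sketch below is only a memory aid. The function name `choose_difference_test` and its argument values are hypothetical.

```python
def choose_difference_test(design, parametric):
    """Pick a two-group difference test.

    design: "within" (same participants twice) or "between" (different groups)
    parametric: True if the parametric assumptions are met
    """
    tests = {
        ("within", True): "Paired t-test",
        ("between", True): "Independent t-test",
        ("within", False): "Wilcoxon",
        ("between", False): "Mann-Whitney U",
    }
    return tests[(design, parametric)]

print(choose_difference_test("between", parametric=False))  # Mann-Whitney U
```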
