
Steve Saffhill

Research Methods in Sport & Exercise


Basic Statistics
Analysing Quantitative Data

• Data collected on its own does not answer your research
question(s)

• The data needs to be interpreted – to do this we need to
organise it and analyse it!

• Most students panic at this stage – statistics work!!!!

• In your reading, have you come across terms such as “multiple
regression”, “t-test”, “ANOVA” etc.?
Statistics
• The more exercise classes attended by an individual the fitter they get.

• There is a positive and statistically significant relationship between
student attendance and academic achievement

• There is a positive and statistically significant relationship between 5km
and 10km PBs

• There is a statistically significant difference in mortality rates between
owners of red cars and owners of cars of other colours
• Statistics can be split into 2 forms:

1. Descriptives – organise the data...describe it!


2. Inferentials – allow you to make inferences...what does it mean?

• You need to ask yourself:


1. What exactly do I need to find out from this data to answer my
research question?
2. What statistical test will give me this info?
3. What do the results from this test mean?

• Statistics themselves have no meaning! The importance lies in
how you interpret them!!!
Computer Software

• Most common to use software to conduct the tests

• Most common is SPSS
(Statistical Package for the Social Sciences).

• Globally used in health and social sciences!

• Knowledge of SPSS is a very valuable transferable skill to
have on your C.V.!
Statistical Inference

• Work of statistician = making predictions about a group based on
data collected from a small sample of that population
• (exploring differences, relationships and making statements about the
meaningfulness)

• Stats allow us to make a statement and then cite the odds that it is
correct!
• Computers make stats easier: organise, analyse, and display data
much faster than we can.
• BUT, the PC is only an extension of you. It will only perform if you
enter the data correctly and understand its output.
• Before a PC is useful to you, you must know what you want it to do
and what is expected. That is where this module comes in!
Descriptive statistics
• Important to know the full range of information for different
variables
• PB, season's best, distance, age, BMI etc

• We need to know the mean, median & mode to describe
the data!

• 1st type: central tendency (mean, median, mode)

• If these are known we can interpret the value of a single
score (e.g., 2nd place) by comparing it to the mean, median
and mode!
Central Tendency

• Suppose we collect data on the monthly wage of
employees from two specific companies.

Company one        Company two
1000   1890        1000   1890
1200   2135        1200   2142
1215    786        1215   1390
1300    980         850   9800
 990   1200         970   1200
 875    768        1875   3256
1345   1000        1345   1000

Such RAW DATA on its own is not particularly informative.
We usually need to SUMMARISE/DESCRIBE our data.
• Mean (M)
– Average of a particular set of scores (the sum of the scores
divided by the number of scores)
• Median
– Central value (mid-point)
– E.g., if weekly hrs spent training for a sport were 2, 2, 4,
5, 6, 10, 10, 11, 15 the median = 6
– If you want to compare two groups on a measure
(e.g., high v low fear of failure) you can use a median split:
scores above the median form one group and scores below it the other
• Mode
– Most frequent number
(e.g., most common age for people who drop out from
sport)
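As a small illustration of the three measures, here is a sketch using Python's standard `statistics` module on the weekly training hours from the median example above. The variable names are only illustrative.

```python
import statistics

# Weekly training hours from the median example above
hours = [2, 2, 4, 5, 6, 10, 10, 11, 15]

mean_hours = statistics.mean(hours)        # sum of scores / number of scores
median_hours = statistics.median(hours)    # middle value of the ordered scores
modes_hours = statistics.multimode(hours)  # every value tied for most frequent

print(f"Mean    = {mean_hours:.2f}")
print(f"Median  = {median_hours}")
print(f"Mode(s) = {modes_hours}")
```

Note that in this data set 2 and 10 each occur twice, so `multimode` reports both values.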
However, Averages can be distorted by outliers!
Name               Club                Annual Salary
Robin van Persie   Arsenal             £4.5 million
Darren Bent        Aston Villa         £3.5 million
Nicolas Anelka     Chelsea             £3 million
Didier Drogba      Chelsea             £3.5 million
Mario Balotelli    Manchester City     £4 million
Michael Owen       Manchester United   £2.5 million
Carlos Tevez       Manchester City     £13 million
Tuncay Sanli       Stoke City          £3 million
Darren Bent        Sunderland          £2.25 million
Jermain Defoe      Tottenham           £3 million

Total = £42.25m     Mean = £4.225m

• What has happened is that an outlier (an extreme value, here the
£13 million salary) has distorted the information carried by the mean.
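To see how much the single extreme salary pulls the mean, the sketch below recomputes the mean and median from the salary table above, with and without the £13 million value. It uses only Python's standard library; the variable names are illustrative.

```python
import statistics

# (name, annual salary in £ million), as listed in the table above
salaries = [
    ("Robin van Persie", 4.5),
    ("Darren Bent", 3.5),
    ("Nicolas Anelka", 3.0),
    ("Didier Drogba", 3.5),
    ("Mario Balotelli", 4.0),
    ("Michael Owen", 2.5),
    ("Carlos Tevez", 13.0),
    ("Tuncay Sanli", 3.0),
    ("Darren Bent", 2.25),
    ("Jermain Defoe", 3.0),
]

values = [salary for _, salary in salaries]
# Drop the largest value to see what the outlier was doing to the mean
without_outlier = [v for v in values if v != max(values)]

mean_all = statistics.mean(values)
median_all = statistics.median(values)
mean_trimmed = statistics.mean(without_outlier)

print(f"Mean (all 10 players)    = £{mean_all:.3f}m")
print(f"Median (all 10 players)  = £{median_all:.3f}m")
print(f"Mean (outlier removed)   = £{mean_trimmed:.3f}m")
```

The median barely notices the outlier, which is why it can be the safer summary for skewed data such as salaries.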
The Median
• In some instances the MEDIAN may be a better measure of
central tendency
• The median is basically the central value of a data set when
that set is numerically ordered
• Sometimes the median is very simple to find...

£100,000 £125,000 £200,000 £225,000 £300,000


The Median
• At times, the median can be slightly more difficult to calculate
due to the fact that there may not be a single middle value
• In such instances we take the MEAN of the TWO MIDDLE
VALUES to be the MEDIAN

£100,000 £150,000 £200,000 £250,000 £300,000 £500,000

Median = (£200,000 + £250,000) / 2 = £225,000
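The two cases (one middle value versus two) can be sketched as a short hand-written function. This is only an illustration of the rule described on these slides; in practice SPSS or a library function would do this for you.

```python
def median(values):
    """Return the middle value of the ordered data.

    With an even number of values there is no single middle value,
    so the mean of the two middle values is returned instead.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_set = [100_000, 125_000, 200_000, 225_000, 300_000]
even_set = [100_000, 150_000, 200_000, 250_000, 300_000, 500_000]

print(median(odd_set))   # single middle value
print(median(even_set))  # mean of the two middle values
```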
The Mode

• The mode is the most frequently occurring value in a
given data set:
• 7, 4, 8, 8, 9, 2, 4, 5, 7, 8, 4, 8, 8, 6
Here the mode would be 8

• But what about this data set?
• 3, 4, 5, 6, 4, 5, 4, 5, 6, 9, 1, 2
Such data sets have two modes (here 4 & 5). These are
usually referred to as bimodal.
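A frequency count makes both cases easy to check. The sketch below uses `collections.Counter` from Python's standard library and returns every value that shares the highest frequency; the function name `modes` is just an illustrative choice.

```python
from collections import Counter


def modes(values):
    """Return all values that occur most often, in ascending order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


single_mode_data = [7, 4, 8, 8, 9, 2, 4, 5, 7, 8, 4, 8, 8, 6]
bimodal_data = [3, 4, 5, 6, 4, 5, 4, 5, 6, 9, 1, 2]

print(modes(single_mode_data))
print(modes(bimodal_data))
```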
Information can be presented as...

Numerical
• They convey info about the degree of your measure….

Graphical = e.g., box plot


• Contains detailed info about the distribution of scores
• Usual to use both in your study!
Measures of Dispersion/Variance
• Central tendency alone does not always provide an adequate
summary of our data
– the dispersion or variability of scores within a data set gives
us supplementary information about the data

• We often need an idea of how our data values vary
around the central measure
• For example, we might know the mean of a data set, but people
might vary quite dramatically around that central value
• Suppose the manager asked for a comparison of the wages of
his four most featured strikers and his four most featured
midfielders...
Strikers £/week Midfielders £/week

A £130,000 1 £175,000

B £250,000 2 £160,000

C £125,000 3 £150,000

D £125,000 4 £150,000

MEAN = £157,500 MEAN = £158,750

If we present the manager with the means alone, we do not
give him the full story…
• Clearly the variability in the first data set far outweighs that of the
second.
Variability of Data
• A statistic that allows the spread (dispersion) of the data
to be appreciated is the range.

• The range is simply the difference between the smallest
and largest values in the data set.
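As a quick sketch, the range for the strikers' and midfielders' weekly wages from the manager example can be computed like this (the helper name `data_range` is illustrative):

```python
def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


# Weekly wages in £ from the manager example
strikers = [130_000, 250_000, 125_000, 125_000]
midfielders = [175_000, 160_000, 150_000, 150_000]

print(f"Strikers range    = £{data_range(strikers):,}")
print(f"Midfielders range = £{data_range(midfielders):,}")
```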
Range

Strikers £/week Midfielders £/week

A £130,000 1 £175,000

B £250,000 2 £160,000

C £125,000 3 £150,000

D £125,000 4 £150,000

MEAN = £157,500 MEAN = £158,750

RANGE = £125,000 RANGE = £25,000


BUT…..The range alone does not tell us the full story of how
much variability there is on average around the mean

Standard deviation does.

• It is a measure of the extent to which scores deviate from the
mean
• You will very frequently see it mentioned in research papers:
Descriptive statistics suggested that males (M = 4.4, SD = 0.8) had higher
levels of confidence than females (M = 3.6, SD = 0.5)

• If SD is large, then the Mean may not be a good
representation of the scores
• Say two samples have identical means:
• BUT....They can have different standard deviations (Spread
of scores around the mean)

• This tells the researcher that scores in the sample with the
larger standard deviation tend to lie further from the mean
• i.e., the scores are more spread out.
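To make the idea concrete, the sketch below computes the standard deviation for the strikers' and midfielders' wages from the earlier example, whose means are almost identical. It assumes the sample formula (dividing by n − 1), which is what `statistics.stdev` implements; the hand-written version is included only to show the steps.

```python
import math
import statistics

# Weekly wages in £ from the manager example
strikers = [130_000, 250_000, 125_000, 125_000]
midfielders = [175_000, 160_000, 150_000, 150_000]


def sample_sd(values):
    """Sample standard deviation, step by step (divides by n - 1)."""
    m = sum(values) / len(values)
    squared_deviations = [(x - m) ** 2 for x in values]
    variance = sum(squared_deviations) / (len(values) - 1)
    return math.sqrt(variance)


sd_strikers = sample_sd(strikers)
sd_midfielders = sample_sd(midfielders)

for label, wages, sd in [
    ("Strikers", strikers, sd_strikers),
    ("Midfielders", midfielders, sd_midfielders),
]:
    print(f"{label}: M = £{statistics.mean(wages):,.0f}, SD = £{sd:,.0f}")
```

The means differ by little, but the strikers' SD is several times larger, which is the "full story" the means alone hide.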
Presenting Descriptive Statistics

• Generally presented in tables and graphs

• Tables should be included where the information is appropriate
to the research question

• Include notation to show significance


Using SPSS to find out your
descriptive data!
Coding Data: to find descriptives of groups
• SPSS only deals with numbers and NOT words!!!

• Sometimes (quite often in sport) we need to CODE our data


• Coding = translating responses into common categories
each with an assigned numerical value to allow you to run
some statistics
It is very easy!

• If you get non-numerical data (e.g., gender, level of
participation, sport played etc) you need to give each group
a code number (e.g., 1 for male, 2 for female).
For example...
• All males are coded 1 and females 0
• All football players are coded 0, rugby players 1 and hockey
players 2 etc.
• The computer then knows which cases are 0 and which are 1
• Then when you run your descriptives, SPSS will be able to
give you them for each group and not just the sample as a
whole. Therefore it allows you to compare...
• Then when you run the inferential statistics you can
formally compare the groups!
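The same coding idea can be sketched outside SPSS. In the Python example below the sport codes follow the slide (football 0, rugby 1, hockey 2), but the responses and training hours are hypothetical values made up purely for illustration.

```python
from statistics import mean

# Coding scheme from the slide: each category gets a number
SPORT_CODES = {"football": 0, "rugby": 1, "hockey": 2}

# Hypothetical responses: (sport played, weekly training hours)
responses = [
    ("football", 4),
    ("rugby", 6),
    ("hockey", 5),
    ("football", 6),
    ("rugby", 8),
    ("hockey", 3),
]

# Translate each word into its numeric code
coded = [(SPORT_CODES[sport], hours) for sport, hours in responses]

# Collect the scores for each code so descriptives can be run per group
groups = {}
for code, hours in coded:
    groups.setdefault(code, []).append(hours)

for code in sorted(groups):
    print(f"Group {code}: n = {len(groups[code])}, mean = {mean(groups[code])}")
```

Once the data are coded, descriptives can be reported for each group rather than only for the sample as a whole.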
Descriptives: Social Physique Anxiety

Grouping variable: exercise dependent but good body image/no MD
                                        Statistic    Std. Error
Mean                                    26.5625      1.32906
95% Confidence Interval   Lower Bound   23.7297
for Mean                  Upper Bound   29.3953
5% Trimmed Mean                         26.5139
Median                                  24.0000
Variance                                28.263
Std. Deviation                          5.31625
Minimum                                 20.00
Maximum                                 34.00
Range                                   14.00
Interquartile Range                     11.25
Skewness                                .544         .564
Kurtosis                                -1.370       1.091

Grouping variable: Exdependent and MD
                                        Statistic    Std. Error
Mean                                    56.5238      .86084
95% Confidence Interval   Lower Bound   54.7281
for Mean                  Upper Bound   58.3195
5% Trimmed Mean                         56.8598
Median                                  58.0000
Variance                                15.562
Std. Deviation                          3.94486
Minimum                                 47.00
Maximum                                 60.00
Range                                   13.00
Interquartile Range                     6.00
Skewness                                -1.416       .501
Kurtosis                                1.445        .972
Inferential Statistics
• Used to draw inferences (logical conclusions) about a
population from a sample

• E.g., we want to explore the effects of sleep deprivation on
performance.
• 10 subjects who performed a task after 24hrs of sleep
deprivation scored 12pts lower than 10 subjects who performed
the task after ‘normal’ sleep.

• Is the difference real or due to chance?

• Tests of significant difference: t-test, ANOVA etc

• Tests of association: correlation

These are the most common inferential tests for you!
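As a rough sketch of how a test of difference addresses the "real or due to chance?" question, the Python example below computes an independent-samples t statistic by hand for the sleep deprivation scenario. The scores are hypothetical, invented so that the groups differ by 12 points as in the example; the critical value 2.101 is the two-tailed 5% value for 18 degrees of freedom from a standard t table. SPSS would report an exact p value instead.

```python
import math
from statistics import mean, variance

# Hypothetical task scores for two groups of 10 (illustration only)
rested = [50, 52, 48, 55, 47, 53, 49, 51, 54, 51]
deprived = [38, 40, 36, 43, 35, 41, 37, 39, 42, 39]


def independent_t(a, b):
    """Independent-samples t statistic using a pooled variance estimate."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / standard_error


t_value = independent_t(rested, deprived)
df = len(rested) + len(deprived) - 2

# Two-tailed critical t for df = 18 at alpha = 0.05 (standard t table)
CRITICAL_T = 2.101

print(f"Mean difference = {mean(rested) - mean(deprived)} points")
print(f"t({df}) = {t_value:.2f}")
if abs(t_value) > CRITICAL_T:
    print("Reject the null hypothesis: difference unlikely to be due to chance")
else:
    print("Retain the null hypothesis: difference could be due to chance")
```

The pooled-variance form shown here assumes the two groups have similar variances, which is one of the criteria a parametric test relies on.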
2 Types of Inferential Tests
• Inferential tests test a null hypothesis (i.e., there will be
no relationship or difference between two variables).

1. Parametric tests – used on data that meet strict criteria
2. Non-Parametric tests – used on data that do not meet
the strict criteria

• We will be exploring these criteria next week!


Summary

• Statistics are used to describe data (descriptive stats)

• Also used to discern what data mean (inferential)

• The type of test used is determined by the experimental
design

• First step in data analysis is exploring the data

• What is the effect of one variable on another?
