
Discrete/Categorical Variables:
are categories such as male or female.
Fractions are not meaningful; only whole values can be taken.

Continuous Variables:
any score that lies on a continuum (0 to infinity).
Fractions are meaningful.

Dependent Variables

Independent Variables

VARIABLE TYPES

LEVELS OF MEASUREMENT: are classifications that tell you what measurement properties the values have.

Nominal: categories are assigned arbitrary numbers, e.g. gender can be assigned a 1 or a 2. These numbers do not give any information about the score.

Ordinal: rank. Scores can be arranged on a scale.

Interval: scores are ordered and the distance between each score is equal.

Ratio: scores can be ordered, the difference between each score is equal, and the scale has a true absolute 0.

CENTRAL TENDENCY: is a method that gives a single number to represent the best version of your variable. Each group in the data set must have its own separate measure of central tendency (see the sketch below).

MODE: the most frequent score.
MEDIAN: the middle score, found at position (n+1)/2.
MEAN: the average.
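As a quick illustration (not from the original notes), a minimal Python sketch computing all three measures for a made-up set of scores, using the standard library's statistics module:

    from statistics import mean, median, mode

    scores = [2, 3, 3, 5, 7, 8, 9]  # hypothetical data set

    print(mode(scores))    # MODE: most frequent score -> 3
    print(median(scores))  # MEDIAN: middle score at position (n+1)/2 -> 5
    print(mean(scores))    # MEAN: average -> 5.2857...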

GRAPHS

Histograms
Bar Charts
Pie Charts
Box-and-whisker plots

DESCRIPTIVE STATISTICS

DISPERSION: is a method that aims to give a single number to represent the spread or variability of your variable.

1. DEVIANCE: how far are the individual scores from the mean? When the individual deviations from the mean are added up you will often get 0, because the mean is the central point of the data set. You need to square the individual deviations to get the deviance squared.

2. DEVIANCE SQUARED: each individual score's deviation from the mean, squared.

3. SUM OF SQUARED ERRORS / SUM OF SQUARES (SS): the deviance squared values are used to calculate the sum of squared errors. SS is an indicator of the total deviance of the scores from the mean and can be used to assess the fit of a model. The sum of squares gives rise to the variance.

4. VARIANCE: the sum of squares divided by the number of observations minus the degrees of freedom (1). The variance is the average error between the mean and the observations. Variance measures are in units squared.

5. STANDARD DEVIATION: SD is the square root of the variance. It quantifies how much variation and dispersion is found in the data set, and tells you the deviation between a score and the mean of the data set. A small SD means most data points were close to the mean; a large SD means the data points are widely spread from the mean.

RANGE: is the easiest to calculate. It is found by subtracting the lowest score from the highest score in the data set.

INTERQUARTILE RANGE: is the difference between the upper quartile and the lower quartile.
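A minimal Python sketch of steps 1-5 above, on a made-up data set (illustrative values only):

    scores = [4, 6, 7, 9, 14]  # hypothetical data set
    n = len(scores)
    mean = sum(scores) / n

    deviance = [x - mean for x in scores]     # 1. DEVIANCE (sums to 0)
    deviance_sq = [d ** 2 for d in deviance]  # 2. DEVIANCE SQUARED
    ss = sum(deviance_sq)                     # 3. SUM OF SQUARES
    variance = ss / (n - 1)                   # 4. VARIANCE (df = 1)
    sd = variance ** 0.5                      # 5. STANDARD DEVIATION

    print(sum(deviance), ss, variance, sd)    # 0, 58, 14.5, 3.807...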

DESCRIBING DATA

HYPOTHESIS TESTING

NULL HYPOTHESIS: is the opposite of the outcome you expect to find from conducting your research. You accept or reject it based on the p value.

ALTERNATIVE HYPOTHESIS: is the outcome you expect to find from conducting your research.

DIRECTIONAL HYPOTHESIS: one-tailed analysis. Occurs when we are evaluating the association between two variables/groups.

NON-DIRECTIONAL HYPOTHESIS: two-tailed analysis. Occurs when we evaluate more than two variables/groups.

PROBABILITY

TYPE 1 ERROR: occurs when you reject the null hypothesis because p < .05 when, in reality, there is no effect.

TYPE 2 ERROR: occurs when you accept the null hypothesis because p > .05 when, in reality, there is an effect in the population. The chance of making this error is related to the power of the tests.

RESEARCH DESIGN
Independent groups design
Correlational design
Experimental design
Repeated measures design: the same group of people is used in the different experiments.

INDEPENDENT GROUPS RESEARCH DESIGNS

INDEPENDENT T-TEST: parametric. Tests for a significant difference between 2 independent groups.
MANN-WHITNEY U TEST: non-parametric.
ONE-WAY BETWEEN GROUPS ANOVA: parametric.
TWO-WAY BETWEEN GROUPS ANOVA: parametric.
Both ANOVAs test for a significant difference between 3+ independent groups.
KRUSKAL-WALLIS ANOVA: non-parametric.

CONFIDENCE INTERVALS: an interval estimate of the population parameters, used when we do not know the population mean. Confidence intervals are affected by the variation in the population and by the sample size.

95% CONFIDENCE INTERVALS: the confidence interval allows us to communicate how accurate our population mean estimate is likely to be within a specific data range.

CENTRAL LIMIT THEOREM: argues that regardless of the shape of the population, parameter estimates of that population will have a normal distribution provided the samples are big enough.

CALCULATING CONFIDENCE INTERVALS (as sketched below)
Step 1: calculate the standard error: sample standard deviation / square root of the sample size.
Step 2: multiply a t value by the standard error; this gives the margin of error.
Step 3: add and subtract the margin of error from the sample mean. These two numbers give you the confidence interval range.
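A minimal Python sketch of the three steps, assuming scipy is available for the t critical value; all numbers are hypothetical:

    from math import sqrt
    from scipy import stats

    sample_mean, sample_sd, n = 50.0, 10.0, 25  # hypothetical sample summary

    # Step 1: standard error = sample SD / square root of the sample size
    se = sample_sd / sqrt(n)

    # Step 2: t value (95%, two-tailed, df = n - 1) times SE = margin of error
    t_crit = stats.t.ppf(0.975, df=n - 1)
    margin = t_crit * se

    # Step 3: mean +/- margin of error gives the confidence interval range
    print(sample_mean - margin, sample_mean + margin)  # roughly 45.87, 54.13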

THE WORLD OF STATISTICS

DATA ANALYSIS TYPES

INFERENTIAL STATISTICS: allow you to make inferences from your sample about the larger population, and tell you the probability of your results occurring in the population.
REPEATED MEASURES RESEARCH DESIGN

PAIRED T-TEST: parametric. Tests for a significant difference in scores between 2 points in time.
WILCOXON SIGNED-RANK TEST: non-parametric.
ONE-WAY WITHIN GROUPS ANOVA: parametric.
TWO-WAY WITHIN GROUPS ANOVA: parametric.
Both ANOVAs test for a significant difference in scores between two or more conditions in a repeated measures design.
FRIEDMAN'S ANOVA: non-parametric.

STATISTICAL SIGNIFICANCE: alpha level of .01 or .05.

NORMAL DISTRIBUTION: a distribution is considered normal when:
the distribution is symmetrical from the midpoint
the mean, median and mode are the same
the variable in the distribution is continuous
the distribution looks like a bell shape
Normal distributions are assessed by using histograms.

STATISTICAL POWER: as SD decreases, the power of the analysis increases. Power also increases with sample size.

STANDARDISED SCORES: standardised scores allow you to compare apples and oranges: you convert the scores from the different data sets into a common numerical score. The mean and standard deviation are used to standardise variable scores.

TESTS ASSESSING NORMALITY

KOLMOGOROV-SMIRNOV (K-S) TEST: assesses whether the sample distribution is normally distributed, as parametric tests assume. It compares the scores in the sample to a normally distributed set of scores with the same mean and SD (see the sketch below).
p < .05 (significant difference): the sample distribution is significantly different from a normal distribution.
p > .05 (non-significant): the distribution from the sample is not significantly different from the normal distribution.

KURTOSIS
Negative kurtosis: flat, with light tails.
Positive kurtosis: pointy, with heavy tails.

RELATIONSHIPS BETWEEN VARIABLES

CORRELATION: indicates the extent to which two variables are related (e.g. the PEARSON CORRELATION).

EFFECT SIZE: the measure of the size of an effect of the experimental manipulation. To calculate the effect size you can take the means of the two experimental groups and subtract them; the difference in the means equals the effect size.
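As an illustration (not part of the original notes), a minimal scipy sketch of a K-S check against a normal distribution with the sample's own mean and SD; note that estimating the mean and SD from the same sample makes the p value only approximate:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    sample = rng.normal(loc=100, scale=15, size=200)  # hypothetical scores

    # Compare the sample to a normal distribution with the same mean and SD
    d_stat, p_value = stats.kstest(sample, "norm",
                                   args=(sample.mean(), sample.std(ddof=1)))

    # p < .05 -> significantly different from normal
    # p > .05 -> not significantly different from normal
    print(d_stat, p_value)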

Z SCORES

CONVERT TO Z SCORES: Z = (raw score - mean) / standard deviation. Z-scores have a mean of 0 and an SD of 1.

These calculations allow you to identify how far your score is from the mean, in terms of standard deviations. Z-scores can also be used to calculate probability based on the normal distribution: given the score you get, they tell you the probability of obtaining that score. If the resulting score is greater than 1.96 (p < .05), it is significant.

COHEN'S D
Step 1: take the control mean and the experimental mean and subtract them to find the effect size.
Step 2: divide the effect size by the SD; this gives you Cohen's d (see the sketch below).
Step 3: interpret the result against the benchmarks: D = 0.2 is a small effect, D = 0.5 is a medium effect, D = 0.8 is a large effect.

R EFFECT SIZES
R = .10 (small effect): the effect explains 1% of the variance.
R = .30 (medium effect): the effect accounts for 9% of the total variance.
R = .50 (large effect): the effect accounts for 25% of the total variance.
We then use tests to identify whether the effect size is significant.
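A minimal sketch of both calculations on hypothetical numbers:

    # Hypothetical values (illustrative only)
    raw, mean, sd = 130, 100, 15

    # CONVERT TO Z SCORES: how many SDs the raw score is from the mean
    z = (raw - mean) / sd
    print(z)  # 2.0 -> greater than 1.96, so p < .05

    # COHEN'S D: difference between group means, divided by the SD
    control_mean, experimental_mean, pooled_sd = 52.0, 47.0, 10.0
    d = (control_mean - experimental_mean) / pooled_sd
    print(d)  # 0.5 -> a medium effect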
allows you to see if one or more variables MULTICOLLINEARITY Step 3: SKEWNESS Negative Skew
can predict the outcome variable. takes one step further variables that have very high D= 0.2 is a small effect build up of high scores
by allowing you to predict the outcome variable based on correlations can suggest the variables R= .10 (small effect) 1% explaines the variance D= 0.5 is a medium effect
the application of the DV are measuring similar things. It doesn't allow R= .30 (medium effect) the effect accounts for 9% of the total D=0.8 is a large effect
you to ascert independent influence variance
Regression analysis does not R= .50 (large effect) the effect accounts for 25% of he total is the effect size significant?
prove causality variance.
Postive skew
Least Square Regression
too many low scores
is used to fit the regression line:
y=a+bx RESIDUALS
the difference between the actual scores
predictor scores. The line of best fit are the
predictor scores
Confidence intervals can be extremely inaccurate when
HOMOGENEITY OF VARIANCE homogeneity of variance cannot be assumes.
This assumption means that the variance of the outcome
variable should be stable at all levels of the predictor variable. heteroscedastic errors
The error variance is assumed to be the same at all points of a If you have Heteroscedastic errors it means there is another
linear relationship system in the data that is at work tha5t can explain the outcome
variable
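A minimal numpy sketch (hypothetical data) of fitting the line y = a + bx by least squares and computing the residuals:

    import numpy as np

    # Hypothetical predictor and outcome scores
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

    # Least squares estimates for the line y = a + bx
    b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    a = y.mean() - b * x.mean()

    # RESIDUALS: actual scores minus the predicted scores from the line of best fit
    residuals = y - (a + b * x)
    print(a, b, residuals)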
