Formula Sheet

Median position = (N + 1)/2; for even N, average the two middle values (N = number of scores)
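A minimal Python sketch of the median-position rule above; the function name and example scores are illustrative, not from the sheet:

```python
def median(scores):
    """Median via the (N + 1)/2 position rule."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = (n + 1) / 2                  # position, 1-indexed
    if n % 2 == 1:                     # odd N: one middle value
        return ordered[int(mid) - 1]
    lo, hi = ordered[n // 2 - 1], ordered[n // 2]
    return (lo + hi) / 2               # even N: average the two middle values

print(median([3, 1, 7, 5]))            # 4.0
print(median([3, 1, 7]))               # 3
```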

Index of Qualitative Variation (IQV) = K(100² – Σ(squared percentages)) / (100²(K – 1))
- K = number of categories; IQV ranges from 0 to 1
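A small sketch of the IQV formula, assuming the category percentages sum to 100 (the example distribution is hypothetical):

```python
def iqv(percentages):
    """IQV = K(100^2 - sum of squared percentages) / (100^2 (K - 1))."""
    k = len(percentages)
    sum_sq = sum(p ** 2 for p in percentages)
    return k * (100 ** 2 - sum_sq) / (100 ** 2 * (k - 1))

# Hypothetical distribution across 4 categories (percentages sum to 100)
print(round(iqv([40, 30, 20, 10]), 3))  # 0.933
```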
Inter-quartile range (IQR) = Q3 – Q1
- Q1 position = N × .25; Q3 position = N × .75
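A brief sketch of the IQR using the quartile positions noted above; there are several quartile conventions, and this uses a simple nearest-position rule:

```python
def iqr(scores):
    """Inter-quartile range Q3 - Q1 using the N*.25 and N*.75 positions."""
    ordered = sorted(scores)
    n = len(ordered)
    q1 = ordered[max(round(n * 0.25) - 1, 0)]       # value at the Q1 position
    q3 = ordered[min(round(n * 0.75) - 1, n - 1)]   # value at the Q3 position
    return q3 - q1

print(iqr([2, 4, 6, 8, 10, 12, 14, 16]))  # 8  (Q3 = 12, Q1 = 4)
```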
Deviation from mean = Y – Ȳ
- Ȳ = mean
Variance (Sy²) = [(Y1 – Ȳ)² + (Y2 – Ȳ)² + …]/(N – 1) = Σ(Y – Ȳ)²/(N – 1)
Standard deviation (Sy) = √(variance)
± 1 standard deviation: 68.26%
± 2 standard deviations: 95.46%
± 3 standard deviations: 99.72%
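A short sketch of the variance and standard deviation formulas (sample versions, dividing by N – 1; the data are made up):

```python
import math

def sample_variance(scores):
    """Sy^2 = sum of squared deviations from the mean / (N - 1)."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((y - mean) ** 2 for y in scores) / (n - 1)

def sample_sd(scores):
    """Sy = square root of the variance."""
    return math.sqrt(sample_variance(scores))

data = [4, 8, 6, 5, 3]                      # hypothetical scores
print(round(sample_variance(data), 3))      # 3.7
print(round(sample_sd(data), 3))            # 1.924
```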
Z score = (Y – Ȳ)/Sy for a raw score, or Z = (Ȳ – μy)/(σy/√N) for a sample mean
Y = Ȳ + Z(Sy)
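A minimal sketch of the two Z-score forms and the Y = Ȳ + Z(Sy) conversion, with made-up numbers:

```python
import math

def z_raw(y, mean, sd):
    """Z for a raw score: (Y - Ybar) / Sy."""
    return (y - mean) / sd

def z_sample_mean(sample_mean, pop_mean, pop_sd, n):
    """Z for a sample mean: (Ybar - mu) / (sigma / sqrt(N))."""
    return (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))

def y_from_z(mean, z, sd):
    """Convert back: Y = Ybar + Z * Sy."""
    return mean + z * sd

print(z_raw(70, 60, 5))                         # 2.0
print(round(z_sample_mean(62, 60, 5, 25), 2))   # 2.0
print(y_from_z(60, 2.0, 5))                     # 70.0
```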
Probability Sampling
- μy = population mean
- σy = standard deviation of population
- Ȳ = sample mean
- M = number of samples used to create the sampling distribution
- μȲ = mean of the sampling distribution
- μȲ = ΣȲ/M
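A small simulation sketch of these sampling-distribution ideas: draw M samples, record each sample mean Ȳ, and average them to approximate μȲ = ΣȲ/M (the population here is simulated, not from the sheet):

```python
import random

random.seed(1)
population = [random.gauss(50, 10) for _ in range(10_000)]  # hypothetical population

M, n = 500, 25                                   # M samples, each of size n
sample_means = [sum(random.sample(population, n)) / n for _ in range(M)]

mu_ybar = sum(sample_means) / M                  # mean of the sampling distribution
print(round(mu_ybar, 2))                         # close to the population mean (about 50)
```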
Confidence Interval
- 68%: Z = 1.00 (± 1.00 standard errors of the mean)
- 95%: Z = 1.96
- 99%: Z = 2.58
- Standard error of the mean (σȲ), also called the standard deviation of the sampling distribution
- σȲ = σy/√N
- If the sample size is > 50, Sy can be used in place of σy
- Confidence interval of the mean = Ȳ ± Z(σȲ)
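A short sketch of the confidence interval of the mean, using Sy in place of σy as allowed for samples larger than 50; the sample values are hypothetical:

```python
import math

def ci_mean(sample_mean, sd, n, z=1.96):
    """Ybar +/- Z * (Sy / sqrt(N)); Z = 1.96 for 95% confidence."""
    se = sd / math.sqrt(n)               # standard error of the mean
    return sample_mean - z * se, sample_mean + z * se

low, high = ci_mean(sample_mean=52.0, sd=10.0, n=100)
print(round(low, 2), round(high, 2))     # 50.04 53.96
```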
Confidence Interval for Proportions
- Standard error of a proportion (σp) = √(π(1 – π)/N)
- σp = standard error of a proportion
- π = population proportion
- N = sample size
- If the population proportion (π) is unknown, use the sample proportion (p)
- Estimated standard error of a proportion (Sp) = √(p(1 – p)/N)
- N = sample size
Confidence Interval of proportion = p ± Z(Sp)
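A minimal sketch of the proportion confidence interval p ± Z(Sp) with the estimated standard error (sample values are hypothetical):

```python
import math

def ci_proportion(p, n, z=1.96):
    """p +/- Z * sqrt(p(1 - p)/N); Z = 1.96 for 95% confidence."""
    sp = math.sqrt(p * (1 - p) / n)      # estimated standard error of the proportion
    return p - z * sp, p + z * sp

low, high = ci_proportion(p=0.60, n=400)
print(round(low, 3), round(high, 3))     # 0.552 0.648
```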
Hypothesis Testing (assumptions: random sample, interval-ratio level, normal distribution)
Research hypothesis – H1: μy <, >, or ≠ the specified value / Null hypothesis – H0: μy = the specified value
Type I error: rejecting the null hypothesis when it is true/Type II error: failing to reject the null when it is false
t statistic (similar to the Z score) = (Ȳ – μy)/(Sy/√N); degrees of freedom = N – 1
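A sketch of the one-sample t statistic with df = N – 1 (hypothetical scores; the result would be compared to a critical value from a t table):

```python
import math

def one_sample_t(sample, mu):
    """t = (Ybar - mu) / (Sy / sqrt(N)), df = N - 1."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((y - mean) ** 2 for y in sample) / (n - 1)
    t = (mean - mu) / (math.sqrt(var) / math.sqrt(n))
    return t, n - 1

t, df = one_sample_t([52, 48, 55, 50, 53, 49], mu=47)
print(round(t, 2), df)   # 3.87 5
```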
Difference between Means:
Standard deviation of the sampling distribution of the difference: σ(Ȳ1–Ȳ2) = √(σ²y1/N1 + σ²y2/N2)
Estimated standard error (pooled): S(Ȳ1–Ȳ2) = √(((N1 – 1)S²y1 + (N2 – 1)S²y2)/(N1 + N2 – 2)) × √((N1 + N2)/(N1N2))
t statistic: (Ȳ1 – Ȳ2)/S(Ȳ1–Ȳ2)
df = (N1 + N2) – 2
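A sketch of the two-sample t test using the pooled estimated standard error and df = N1 + N2 – 2 (group values are made up):

```python
import math

def two_sample_t(g1, g2):
    """t = (Ybar1 - Ybar2) / S_(Ybar1 - Ybar2), pooled variance, df = N1 + N2 - 2."""
    n1, n2 = len(g1), len(g2)
    m1, m2 = sum(g1) / n1, sum(g2) / n2
    v1 = sum((y - m1) ** 2 for y in g1) / (n1 - 1)
    v2 = sum((y - m2) ** 2 for y in g2) / (n2 - 1)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled) * math.sqrt((n1 + n2) / (n1 * n2))
    return (m1 - m2) / se, n1 + n2 - 2

t, df = two_sample_t([10, 12, 14, 16], [8, 9, 11, 12])
print(round(t, 2), df)   # 1.9 6
```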
Chi-Square Test (random, nominal/ordinal)
H1 = two variables are statistically dependent/H0 = two variables are stat. independent
Expected frequencies (fe) = (column total)(row total)/N
Chi-square (χ²) = Σ(fo – fe)²/fe    df = (# rows – 1)(# columns – 1)
- Distribution always positively skewed, one-tailed, values always positive (0 and up); the greater the df, the more normal the curve
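A compact sketch of the chi-square computation for a table of observed frequencies, using fe = (row total)(column total)/N (the counts are hypothetical):

```python
def chi_square(observed):
    """Chi-square = sum of (fo - fe)^2 / fe over all cells."""
    rows, cols = len(observed), len(observed[0])
    row_tot = [sum(r) for r in observed]
    col_tot = [sum(observed[i][j] for i in range(rows)) for j in range(cols)]
    n = sum(row_tot)
    chi2 = 0.0
    for i in range(rows):
        for j in range(cols):
            fe = row_tot[i] * col_tot[j] / n        # expected frequency
            chi2 += (observed[i][j] - fe) ** 2 / fe
    df = (rows - 1) * (cols - 1)
    return chi2, df

chi2, df = chi_square([[30, 20], [20, 30]])
print(round(chi2, 2), df)   # 4.0 1
```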
Measures of Association for Nominal and Ordinal Var
Measured by PRE = (E1 – E2)/E1, where E1 = error when the independent variable is ignored and E2 = error when the independent variable is included
Lambda (λ): nominal; ranges from 0 to 1; strength only. Gamma (γ): ordinal/dichotomous nominal; ranges from –1 to 1; symmetrical; strength/direction
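A short sketch of lambda as a PRE measure, computed from a crosstab with the DV in rows and the IV in columns (the table is hypothetical):

```python
def lambda_pre(table):
    """Lambda = (E1 - E2) / E1.
    E1 = errors predicting the DV ignoring the IV (N minus largest DV row total).
    E2 = errors predicting the DV within each IV column (column total minus largest cell)."""
    rows, cols = len(table), len(table[0])
    n = sum(sum(r) for r in table)
    row_tot = [sum(r) for r in table]
    e1 = n - max(row_tot)
    e2 = 0
    for j in range(cols):
        col = [table[i][j] for i in range(rows)]
        e2 += sum(col) - max(col)
    return (e1 - e2) / e1

# DV categories in rows, IV categories in columns
print(round(lambda_pre([[40, 10], [20, 30]]), 3))  # 0.4
```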
Multiple Regression
Ŷ = a + b1X1 + b2X2
- Ŷ = predicted score on the DV; X1 = value of the 1st independent/control variable; X2 = value of the 2nd IV
- b1 = change in Y (the DV) with a unit change in X1, holding X2 constant; b2 = change in Y with a unit change in X2, holding X1 constant; b1 and b2 are partial slopes
Direct causal: partial slope and original slope similar in value. Spurious/intervening relationship: partial slope substantially smaller than the original slope.
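A tiny sketch of a prediction from the multiple regression equation; the intercept and partial slopes here are made-up numbers, not estimated from data:

```python
def predict(a, b1, b2, x1, x2):
    """Yhat = a + b1*X1 + b2*X2."""
    return a + b1 * x1 + b2 * x2

# Hypothetical equation: Yhat = 5 + 2*X1 + 0.5*X2
print(predict(a=5, b1=2, b2=0.5, x1=3, x2=10))  # 16.0
```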
Central Tendency/Variability
Nominal: Mode, IQV, Bar Graph, pie chart
Ordinal: Mode, Median, IQV, Pie, bar graph
Interval Ratio: Mode, Median, Mean, Range, IQR, Variance, Standard Deviation, Histogram, Freq. polygon
- Causal relationship: Time-order, correlation, non-spuriousness
3 Aspects of a Bivariate Relationship
1. Existence: DV values vary across categories of the IV; 10% rule, Chi-square, r²
2. Strength: Measures of association (Lambda, Gamma, r); closer to 1 or -1 means stronger; larger % difference = stronger relationship
3. Direction: Interval-ratio and ordinal only (nominal categories are not ranked, so direction does not exist)
Statistical Literacy Issues
- Appropriate stats: use the correct measure of central tendency (mean, median, or mode), use statistics appropriate to the level of measurement, and beware of inappropriate operationalization and conceptualization
- Sampling issues: Overgeneralization in non-random sampling, Sample size (greater than 50)
- Provide context: Reliability/accuracy of research (expertise, random sample), Significance, Use percent/rate/proportion instead of
raw number
- Interpretation: Precision vs. Accuracy, Correlation vs. Causation (time order, non-spuriousness, empirical relationship),
Overgeneralization/stereotyping
Linear regression
- Provides a predicted value of Y for any value of X; r² = the PRE measure for interval-ratio variables
- Least squares line: the best-fitting line; the sum of the squared residuals (differences between predicted and observed Y) is at a minimum
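A brief sketch of fitting the least squares line Ŷ = a + bX by hand, using the standard slope and intercept formulas (b = Σ(X – X̄)(Y – Ȳ)/Σ(X – X̄)², a = Ȳ – bX̄); the data are hypothetical:

```python
def least_squares(xs, ys):
    """Fit Yhat = a + b*X by minimizing the sum of squared residuals."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    b = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
         / sum((x - xbar) ** 2 for x in xs))
    a = ybar - b * xbar
    return a, b

a, b = least_squares([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(a, 2), round(b, 2))   # 2.2 0.6
predicted = a + b * 3             # predicted Y for X = 3
print(round(predicted, 2))        # 4.0
```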
Multiple Regression
- looks at the effects of two or more IVs at the same time; standardized coefficients allow variables to be ranked
- comparing the bivariate and multiple equations shows the effect on the DV when other variables are taken into account
- R² increases with more IVs; adjusted R² corrects for this and is used with many IVs or control variables
- coefficients for dummy variables can be seen as differences in means on the DV for the two categories on the IV, controlling for
other variables in the model
- b = unstandardized coefficient; β = standardized coefficient; Sig = p value
- testing the slope: the null hypothesis sets the slope/partial slope equal to 0 (the independent/control variable has no effect on the dependent variable); uses a t test where the t statistic is the coefficient for the partial slope divided by its standard error (t = b/SE). This provides statistical significance; substantive significance = the size or magnitude of the partial slope. The slope of a dummy variable shows how much higher/lower the value of the dependent variable is for the category coded 1, compared to the category coded 0.
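A minimal sketch of the slope test t = b/SE described above; the coefficient and standard error are hypothetical stand-ins for values from a regression output table:

```python
def slope_t(b, se):
    """t statistic for H0: slope = 0, computed as the coefficient divided by its standard error."""
    return b / se

# Hypothetical output: partial slope b = 1.8 with standard error 0.6
t = slope_t(b=1.8, se=0.6)
print(round(t, 2))   # 3.0  (compare with the critical t at the model's df)
```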

- t distribution: a family of curves based on different df; the more df, the more it resembles the normal curve
- sampling distribution: theoretical distribution of all possible sample values
- sampling distribution of the mean: theoretical distribution of sample means obtained by drawing all possible samples of the same size
- statistical control: introduces an additional variable to learn more about the relationship between an independent and a dependent variable; determines whether the relationship is direct causal, spurious, indirect causal, or conditional. Example: elaboration. Use a crosstab for nominal and/or ordinal variables or bivariate regression for interval-ratio variables for the original relationship, and multiple regression to introduce a control variable.
