Statistics Cheatsheet
Mr. Roth, Mar 2004
1. Fundamentals
a. Population – everybody to be analysed
Parameter – number summarizing the population
o. Shape, center, spread. Symmetric, skewed right, skewed left
p. Stemplots
0 11222 0 112233
q. Mean: x̄ = ∑xi / n
r. Median: M: if n is odd – center value; if even – mean of the 2 center values
s. Boxplot: Min Q1 M Q3 Max
o. Then solve, knowing the line passes through the centroid (x̄, ȳ): a = ȳ − b·x̄
p. b0 = (∑y − b1∑x) / n
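A minimal sketch of the mean, median, and boxplot entries (q, r, s), assuming Python's standard statistics module. The sheet does not say how Q1 and Q3 are computed, so this sketch assumes they are the medians of the lower and upper halves of the sorted data. The name five_number_summary and the sample data are hypothetical.

```python
from statistics import mean, median

def five_number_summary(data):
    """Return (Min, Q1, M, Q3, Max).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data, with the middle value left out when n is odd.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]           # values below the median position
    upper = xs[half + n % 2:]   # values above the median position
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

sample = [2, 4, 4, 5, 7, 9, 11]        # hypothetical data
x_bar = mean(sample)                   # sum of xi divided by n
summary = five_number_summary(sample)  # the five boxplot values
print(x_bar, summary)
```

Other quartile conventions exist and can give slightly different Q1 and Q3 for small samples.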
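A minimal sketch of fitting the least squares line and checking the intercept rule a = ȳ − b·x̄. The slope formula used here, b = ∑(x − x̄)(y − ȳ) / ∑(x − x̄)², is the standard least squares slope and is an assumption, since it does not appear in this part of the sheet. The function name fit_line and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Assumed standard slope formula (not shown in this part of the sheet).
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar   # the line passes through the centroid
    return a, b

xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)

# Same intercept written with sums: (sum of y minus b times sum of x) / n.
b0_check = (sum(ys) - b * sum(xs)) / len(xs)

# residual = observed - predicted
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# r^2 as the proportion of variation in y described by the line.
y_bar = sum(ys) / len(ys)
sse = sum(e ** 2 for e in residuals)
sst = sum((y - y_bar) ** 2 for y in ys)
r_squared = 1 - sse / sst
print(a, b, r_squared)
```

For this data the line is roughly y = 2.2 + 0.6x, and the residuals sum to zero up to rounding.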
q. r^2 is the proportion of variation described by the linear relationship
r. residual = y − ŷ = observed – predicted
s. Outliers: in y direction -> large residuals; in x direction -> often influential to least squares line
t. Extrapolation – predict beyond domain studied
u. Lurking variable
v. Association doesn't imply causation
5. Data – Sampling
a. Population: entire group
b. Sample: part of population we examine
c. Observation: measures but does not influence response
d. Experiment: treatments controlled & responses observed
e. Confounded variables (explanatory or lurking): when effects on response variable cannot be distinguished
f. Sampling types: Voluntary response – biased to opinionated; Convenience – easiest
g. Bias: systematically favors outcomes
h. Simple Random Sample (SRS): every set of n individuals has equal chance of being chosen
i. Probability sample: chosen by known probability
j. Stratified random: SRS within strata divisions
k. Response bias – lying/behavioral influence
6. Experiments
a. Subjects: individuals in experiment
b. Factors: explanatory variables in experiment
c. Treatment: combination of specific values for each factor
d. Placebo: treatment to nullify confounding factors
e. Double-blind: treatments unknown to subjects & individual investigators
f. Control Group: control effects of lurking variables
g. Completely Randomized design: subjects allocated randomly among treatments
h. Randomized comparative experiments: similar groups – nontreatment influences operate equally
i. Experimental design: control effects of lurking variables, randomize assignments, use enough subjects to reduce chance
j. Statistical significance: observations rare by chance
k. Block design: randomization within a block of individuals with similarity (men vs women)
7. Probability & odds
a. 2 definitions:
b. 1) Experimental: Observed likelihood of a given outcome within an experiment
c. 2) Theoretical: Relative frequency/proportion of a given event given all possible outcomes (Sample Space)
d. Event: outcome of random phenomenon
e. n(S) – number of points in sample space
f. n(A) – number of points that belong to A
g. p. 183: Empirical: P'(A) = n(A)/n = #observed/#attempted
h. p. 185: Law of large numbers – Experimental -> Theoretical
i. p. 194: Theoretical P(A) = n(A)/n(S), favorable/possible
j. 0 ≤ P(A) ≤ 1, ∑ (all outcomes) P(A) = 1
k. p. 189: S = Sample space, n(S) – # sample points. Represented as listing {(, ), …}, tree diagram, or grid
l. p. 197: Complementary Events: P(A) + P(Ā) = 1
m. p. 200: Mutually exclusive events: both can't happen at the same time
n. p. 203: Addition Rule: P(A or B) = P(A) + P(B) – P(A and B) [which = 0 if exclusive]
o. p. 207: Independent Events: Occurrence (or not) of A does not impact P(B) & vice versa
p. Conditional Probability: P(A|B) – Probability of A given that B has occurred. P(B|A) – Probability of B given that A has occurred
q. Independent Events iff P(A|B) = P(A) and P(B|A) = P(B)
r. Special Multiplication Rule: P(A and B) = P(A)*P(B)
s. General Multiplication Rule: P(A and B) = P(A)*P(B|A) = P(B)*P(A|B)
t. Odds / Permutations
u. Order important vs not (Prob of picking four numbers)
v. Permutations: nPr = n!/(n – r)!, number of ways to pick r item(s) from n items if order is important. Note: with repetitions, p alike and q alike = n!/(p!q!)
w. Combinations: nCr = n!/((n – r)!r!), number of ways to pick r item(s) from n items if order is NOT important
x. Replacement vs not (AAKKKQQJJJJ10): (a) Pick an A, replace, then pick a K. (b) Pick a K, keep it, pick another
y. Fair odds – if odds are 1/1000 and payout is 1000. May take 3000 plays to win, may win after 200
8. Probability Distribution
a. Refresh on number of heads from tossing 3 coins. Do grid {HHH, …, TTT}, then #Heads vs frequency chart {(0,1), (1,3), (2,3), (3,1)} – note Pascal's triangle
b. Random variable – circle #Heads on graph above. "Assumes unique numerical value for each outcome in sample space of probability experiment".
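A minimal sketch of the three-coin refresher, assuming Python's standard library. It lists the sample space, tallies #Heads against frequency, and compares the tally with a row of Pascal's triangle. It also checks the Addition Rule on two events chosen only for illustration: A = "at least two heads" and B = "first toss is heads". All variable names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb

# Sample space S for tossing 3 coins: HHH, HHT, ..., TTT.
sample_space = ["".join(toss) for toss in product("HT", repeat=3)]
n_S = len(sample_space)

# Random variable: number of heads in each outcome, tallied as a frequency.
heads_count = Counter(outcome.count("H") for outcome in sample_space)

# Theoretical probability P(x) = n(A) / n(S) for each number of heads.
distribution = {x: Fraction(f, n_S) for x, f in sorted(heads_count.items())}

# Row 3 of Pascal's triangle: the combinations 3Cr for r = 0..3.
pascal_row = [comb(3, r) for r in range(4)]

def prob(event):
    """Theoretical probability of an event (a set of outcomes)."""
    return Fraction(len(event), n_S)

# Illustrative events (assumptions, not from the sheet).
A = {s for s in sample_space if s.count("H") >= 2}
B = {s for s in sample_space if s[0] == "H"}

# Addition Rule: P(A or B) = P(A) + P(B) - P(A and B).
p_union = prob(A | B)
p_rule = prob(A) + prob(B) - prob(A & B)
print(distribution, p_union, p_rule)
```

Using Fraction keeps the probabilities exact, so the check that they sum to 1 does not depend on floating-point rounding.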
Base: x̄ = ∑x / n, s² = ∑(x − x̄)² / (n − 1)
        Frequency Dist        Probability Distribution
Mean    x̄ = ∑xf / ∑f          µ = ∑[xP(x)]
Var     ∑(x − x̄)² f           σ² = ∑[(x − µ)² P(x)]
h. Idea – an outcome that would rarely happen if a claim were true is evidence that the claim is not true
i. Ho – Null hypothesis: test designed to assess evidence against Ho. Usually a statement of no effect
j. Ha – alternative hypothesis about population parameter to null
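A minimal sketch of the probability-distribution formulas µ = ∑[xP(x)] and σ² = ∑[(x − µ)²P(x)], applied to the number of heads in three coin tosses. It also computes the frequency-distribution mean x̄ = ∑xf / ∑f from the matching frequency table. The variable names are hypothetical.

```python
from fractions import Fraction

# Frequency table for #Heads in 3 coin tosses: value x -> frequency f.
freq = {0: 1, 1: 3, 2: 3, 3: 1}
total = sum(freq.values())

# Frequency-distribution mean: sum of x*f divided by sum of f.
x_bar = Fraction(sum(x * f for x, f in freq.items()), total)

# Matching probability distribution: P(x) = f / (sum of f).
P = {x: Fraction(f, total) for x, f in freq.items()}

# Probability-distribution mean and variance.
mu = sum(x * p for x, p in P.items())
sigma_sq = sum((x - mu) ** 2 * p for x, p in P.items())
print(x_bar, mu, sigma_sq)
```

Here both means come out to 3/2 and the variance to 3/4.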