
Stat 110 - Final Notes

Bernoulli Story. A light that turns on with probability p.

X ∼ Bern(p)

Sampling Table.

                  w/ repl    w/o repl
fixed trials      Bin        HGeom
fixed successes   NBin       NHGeom

Binomial Story. Flipping a coin that lands heads with probability p a fixed number n of times. X is the number of heads. Also, X = I1 + · · · + In where Ii ∼ Bern(p).

X ∼ Bin(n, p)

LOTP (Partitioning).

P(A) = P(A|B1)P(B1) + · · · + P(A|Bn)P(Bn)

More fancy LOTP.

P(A|C) = P(A|B1, C)P(B1|C) + · · · + P(A|Bn, C)P(Bn|C)

Testing for independence. Each of the following is equivalent to A and B being independent:

P(A ∩ B) = P(A)P(B)
P(A) = P(A|B)
P(B) = P(B|A)

Variance.

Var(X) = E(X^2) − (E(X))^2

Exponential series.

e^x = 1 + x + x^2/2! + x^3/3! + · · ·

LOTUS.

E(g(X)) = Σ_k g(k) P(X = k)
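A quick numerical check of LOTUS in Python; the choices g(x) = x^2 and X ∼ Bin(10, 0.3) are illustrative, not from the notes:

from math import comb

# LOTUS check: E(g(X)) = sum_k g(k) P(X = k), with g(x) = x^2 and
# X ~ Bin(n, p), whose PMF is comb(n, k) p^k (1-p)^(n-k).
n, p = 10, 0.3
lotus = sum(k**2 * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
# For the Binomial, E(X^2) = Var(X) + (EX)^2 = np(1-p) + (np)^2.
print(lotus, n*p*(1 - p) + (n*p)**2)   # both 11.1 (up to float error)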
Geometric Story. Flipping a coin that lands heads with probability p until you get the first head. X is the number of tails before that point.

X ∼ Geom(p)

First Success Story. Geometric plus one.

X ∼ FS(p)

Linearity of Exp. (holds for dependent vars!)

E(X + Y) = E(X) + E(Y)

PIE.

P(A1 ∪ · · · ∪ An) = Σ_i P(Ai) − Σ_(i<j) P(Ai ∩ Aj) + · · · + (−1)^(n+1) P(A1 ∩ · · · ∩ An)

Coherent. If we receive multiple pieces of information and wish to update our probabilities to incorporate all the information, it does not matter whether we update sequentially, taking each piece of evidence into account one at a time, or simultaneously, using all the evidence at once.

Gambler's Ruin. Let a and n be integers with 0 < a < n. We start at a, and we want to get to n before getting to 0. We make progress towards n with probability p, and we lose ground with probability q = 1 − p. For p ≠ q,

P(Win) = (1 − (q/p)^a) / (1 − (q/p)^n)

and P(Win) = a/n when p = q = 1/2.

First Step Conditioning. Condition on the outcome of the first step and apply LOTP; this is how recursive quantities such as the Gambler's Ruin win probability are derived.
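A minimal simulation sketch of Gambler's Ruin (a = 3, n = 10, p = 0.6 are arbitrary illustrations):

import random

# Gambler's Ruin: start at a; each step goes +1 w.p. p, -1 w.p. q = 1 - p;
# stop when hitting 0 or n. Returns True if n is reached first.
def wins(a, n, p):
    pos = a
    while 0 < pos < n:
        pos += 1 if random.random() < p else -1
    return pos == n

a, n, p = 3, 10, 0.6
q = 1 - p
sim = sum(wins(a, n, p) for _ in range(100_000)) / 100_000
exact = (1 - (q/p)**a) / (1 - (q/p)**n)   # the formula above (p != q case)
print(sim, exact)   # both ~ 0.716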
Hypergeometric Story. Consider an urn with w white balls and b black balls. We draw n balls out of the urn at random without replacement, such that all (w+b choose n) samples are equally likely. Let X be the number of white balls in the sample.

X ∼ HGeom(w, b, n)

Coupon Collector. X = X1 + · · · + Xn, where Xi ∼ FS((n − i + 1)/n) is the number of draws needed to see the i-th new coupon type.
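A simulation sketch of the Coupon Collector decomposition (n = 10 coupon types is an arbitrary choice):

import random

# Coupon Collector: number of draws needed to see all n coupon types.
def draws_to_collect(n):
    seen, draws = set(), 0
    while len(seen) < n:
        seen.add(random.randrange(n))
        draws += 1
    return draws

n = 10
sim = sum(draws_to_collect(n) for _ in range(20_000)) / 20_000
# FS(p) has mean 1/p, so E(Xi) = n/(n-i+1) and
# E(X) = n(1/n + 1/(n-1) + ... + 1) = n * H_n.
exact = n * sum(1/k for k in range(1, n + 1))
print(sim, exact)   # both ~ 29.29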
68 − 95 − 99.7 Rule. For X ∼ N(µ, σ^2), about 68% of the probability is within µ ± σ, about 95% within µ ± 2σ, and about 99.7% within µ ± 3σ.

Going from N(0, 1) to N(µ, σ^2): if Z ∼ N(0, 1), then

µ + σZ ∼ N(µ, σ^2)

Standardization. Going from N(µ, σ^2) to N(0, 1): if X ∼ N(µ, σ^2), then

(X − µ)/σ ∼ N(0, 1)

CDF of a Standard Normal Distribution.

Φ(z) = (1/√(2π)) ∫ from −∞ to z of e^(−t^2/2) dt

Φ(z) + Φ(−z) = 1

Universality of the Uniform.

• If U ∼ Unif(0, 1), F^(−1)(U) has CDF F.
• If X has CDF F (with F continuous), F(X) ∼ Unif(0, 1).
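A sketch of both directions of universality, using Expo(λ) as the illustrative F (an assumption for the example; F(x) = 1 − e^(−λx), so F^(−1)(u) = −log(1 − u)/λ):

import math, random

# Universality of the Uniform, with F the Expo(lam) CDF.
lam = 2.0
xs = [-math.log(1 - random.random()) / lam for _ in range(100_000)]
print(sum(xs) / len(xs))          # ~ 1/lam = 0.5, the Expo(lam) mean
# Conversely, F(X) ~ Unif(0, 1):
us = [1 - math.exp(-lam * x) for x in xs]
print(sum(us) / len(us))          # ~ 0.5, the Unif(0, 1) mean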
Negative Binomial Story. Number of failures until the rth success. X = X1 + · · · + Xr where the Xi ∼ Geom(p) are iid.

X ∼ NBin(r, p)

Vandermonde's Identity.

(m+n choose r) = Σ from k=0 to r of (m choose k)(n choose r−k)
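Vandermonde's identity is easy to sanity-check in Python (the values of m, n, r are arbitrary):

from math import comb

# Vandermonde: comb(m + n, r) equals sum_k comb(m, k) * comb(n, r - k).
m, n, r = 7, 5, 6
lhs = comb(m + n, r)
rhs = sum(comb(m, k) * comb(n, r - k) for k in range(r + 1))
print(lhs, rhs)   # both 924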
Conditional Probability.

P(A|B) = P(A, B) / P(B)

Bayes' Rule.

P(A|B) = P(B|A)P(A) / P(B)

Bayes' Rule, Odds Form.

P(A|B) / P(A^c|B) = [P(B|A)P(A)] / [P(B|A^c)P(A^c)]

MGFs. The kth derivative of the moment generating function, evaluated at 0, is the kth moment of X. Note for independent X, Y:

M_(X+Y)(t) = E(e^(t(X+Y))) = E(e^(tX)) E(e^(tY))

Poisson Distribution. Support is k ∈ {0, 1, 2, . . .}.

P(X = k) = e^(−λ) λ^k / k!

When is Poisson Used? Large number of trials, each with a small probability of success. Each trial is independent or weakly dependent. In this case, let λ = np (the expected number of successes).
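A quick check of this approximation (n = 1000 and p = 0.002 are illustrative, giving λ = 2):

from math import comb, exp, factorial

# Poisson approximation: Bin(n, p) pmf vs Pois(lam) pmf with lam = n*p,
# for large n and small p.
n, p = 1000, 0.002
lam = n * p
for k in range(5):
    binom = comb(n, k) * p**k * (1 - p)**(n - k)
    pois = exp(-lam) * lam**k / factorial(k)
    print(k, round(binom, 4), round(pois, 4))   # columns nearly match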

The joint CDF of X and Y is

F(x, y) = P(X ≤ x, Y ≤ y)

In the discrete case, X and Y have a joint PMF

pX,Y(x, y) = P(X = x, Y = y).

In the continuous case, they have a joint PDF

fX,Y(x, y) = ∂²/∂x∂y FX,Y(x, y).

The joint PMF/PDF must be nonnegative and sum/integrate to 1.

Gamma story and representation. We wait for n shooting stars that come at Expo(λ) intervals, so X1, . . . , Xn iid ∼ Expo(λ); then

X1 + · · · + Xn ∼ Γ(n, λ)

Beta story and representation. We run two independent shooting-star processes, waiting for m stars in one and n stars in the other; Beta is the fraction of the total waiting time that the first process takes. If Gm ∼ Γ(m, λ) and Gn ∼ Γ(n, λ) with Gm ⊥ Gn, then

Gm / (Gm + Gn) ∼ Beta(m, n)

Markov Chains. A Markov chain is a random walk in a state space, which we will assume is finite, say {1, 2, . . . , M}. We let Xt denote which element of the state space the walk is visiting at time t. The Markov chain is the sequence of random variables tracking where the walk is at all points in time: X0, X1, X2, . . . .

By definition, a Markov chain must satisfy the Markov property, which says that if you want to predict where the chain will be at a future time, and you know the present state, then the entire past history is irrelevant. In symbols,

P(X_(n+1) = j | X0 = i0, . . . , Xn = i) = P(X_(n+1) = j | Xn = i)
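A minimal simulation sketch of a Markov chain (the 2-state transition matrix is an arbitrary example, not from the notes):

import random

# A Markov chain on states {0, 1} with transition matrix P,
# where P[i][j] = P(X_{t+1} = j | X_t = i).
P = [[0.9, 0.1],
     [0.4, 0.6]]

x, visits = 0, [0, 0]
for _ in range(100_000):
    x = 0 if random.random() < P[x][0] else 1
    visits[x] += 1
print([v / 100_000 for v in visits])   # long-run fractions ~ [0.8, 0.2]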
State Properties. A state is either recurrent or transient.

• If you start at a recurrent state, then you will always return back to that state at some point in the future.
• Otherwise you are at a transient state. There is some positive probability that once you leave, you will never return.

A state is either periodic or aperiodic.

• If you start at a periodic state of period k, then the GCD of the possible numbers of steps it would take to return back is k > 1.
• Otherwise you are at an aperiodic state. The GCD of the possible numbers of steps it would take to return back is 1.

Properties of Conditional Expectation.

1. E(Y|X) = E(Y) if X ⊥ Y
2. E(h(X)W | X) = h(X)E(W | X) (taking out what's known); in particular, E(h(X) | X) = h(X).
3. E(E(Y|X)) = E(Y) (Adam's Law, a.k.a. Law of Total Expectation)

Adam's Law (a.k.a. Law of Total Expectation) can also be written in a way that looks analogous to LOTP. For any events A1, A2, . . . , An that partition the sample space,

E(Y) = E(Y|A1)P(A1) + · · · + E(Y|An)P(An)

For the special case where the partition is A, A^c, this says

E(Y) = E(Y|A)P(A) + E(Y|A^c)P(A^c)

Eve's Law (a.k.a. Law of Total Variance).

Var(Y) = E(Var(Y|X)) + Var(E(Y|X))

Euler's Approximation for Harmonic Sums.

1 + 1/2 + 1/3 + · · · + 1/n ≈ log n + 0.577 . . .

Uniform Order Statistics. Suppose we have U1, . . . , Un; the ordered values are U_(1), . . . , U_(n). Then we have

U_(i) ∼ Beta(i, n − i + 1)

For integrals that look like this:

∫₀¹ p^a (1 − p)^b dp

we pattern match to the Beta PDF, so in this case we have

∫₀¹ [Γ(a + b) / (Γ(a)Γ(b))] p^(a−1) (1 − p)^(b−1) dp = 1

⟹ ∫₀¹ p^(a−1) (1 − p)^(b−1) dp = Γ(a)Γ(b) / Γ(a + b)

I think this is called Bayes' Billiards.
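The identity is easy to verify numerically; a sketch with a midpoint Riemann sum (a = 3, b = 2 are arbitrary):

from math import gamma

# Check: integral over [0,1] of p^(a-1) (1-p)^(b-1) dp = Γ(a)Γ(b)/Γ(a+b).
a, b, N = 3, 2, 100_000
mid = [(k + 0.5) / N for k in range(N)]
riemann = sum(p**(a - 1) * (1 - p)**(b - 1) for p in mid) / N
exact = gamma(a) * gamma(b) / gamma(a + b)
print(riemann, exact)   # both ~ 1/12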
Gamma and Beta Integrals. You can sometimes solve complicated-looking integrals by pattern-matching to a gamma or beta integral:

∫₀^∞ x^(t−1) e^(−x) dx = Γ(t)

∫₀¹ x^(a−1) (1 − x)^(b−1) dx = Γ(a)Γ(b) / Γ(a + b)

Also, Γ(a + 1) = aΓ(a), and Γ(n) = (n − 1)! if n is a positive integer.

Stirling's Approximation for Factorials.

n! ≈ √(2πn) (n/e)^n

CLT Distribution Approx. We use →D to denote "converges in distribution to" as n → ∞. The CLT says that if we standardize the sum X1 + · · · + Xn, then the distribution of the sum converges to N(0, 1) as n → ∞:

(1/(σ√n)) (X1 + · · · + Xn − nµ_X) →D N(0, 1)

In other words, the CDF of the left-hand side goes to the standard Normal CDF, Φ. In terms of the sample mean, the CLT says

√n (X̄n − µ_X) / σ_X →D N(0, 1)
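A simulation sketch of the CLT, standardizing sums of iid Unif(0, 1) draws (the choice of Uniform and n = 50 is illustrative):

import math, random

# CLT: standardized sums of iid Unif(0,1) draws (mu = 1/2, sigma^2 = 1/12)
# should be approximately N(0, 1).
n, mu, sigma = 50, 0.5, math.sqrt(1/12)

def z():
    s = sum(random.random() for _ in range(n))
    return (s - n*mu) / (sigma * math.sqrt(n))

zs = [z() for _ in range(100_000)]
print(sum(v <= 1 for v in zs) / len(zs))   # ~ Phi(1) = 0.8413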