
Reference Sheet

Prepared by: Amit Goyal [1]

Sampling Table. Number of ways of choosing k objects out of n


                      Order matters                            Order does not matter
With replacement      n^k                                      \binom{n+k−1}{k} = \frac{(n+k−1)!}{k!(n−1)!}
Without replacement   k!\binom{n}{k} = \frac{n!}{(n−k)!}       \binom{n}{k} = \frac{n!}{k!(n−k)!}
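All four counts can be cross-checked against brute-force enumeration with the standard library; a minimal sketch (n = 5 and k = 3 are arbitrary illustrative values):

```python
from math import comb, factorial
from itertools import product, permutations, combinations, combinations_with_replacement

n, k = 5, 3

# Ordered, with replacement: n^k
assert n ** k == len(list(product(range(n), repeat=k)))
# Ordered, without replacement: n!/(n-k)!
assert factorial(n) // factorial(n - k) == len(list(permutations(range(n), k)))
# Unordered, without replacement: C(n, k)
assert comb(n, k) == len(list(combinations(range(n), k)))
# Unordered, with replacement: C(n+k-1, k)
assert comb(n + k - 1, k) == len(list(combinations_with_replacement(range(n), k)))
print("all four sampling counts verified")
```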

Results.

Inclusion-exclusion:
    Pr(\bigcup_{i=1}^{n} A_i) = \sum_{i=1}^{n} Pr(A_i) − \sum_{i<j} Pr(A_i ∩ A_j) + \sum_{i<j<k} Pr(A_i ∩ A_j ∩ A_k) − ⋯ + (−1)^{n+1} Pr(\bigcap_{i=1}^{n} A_i)

Conditioning:
    Pr(\bigcap_{i=1}^{n} A_i) = Pr(A_1) Pr(A_2 | A_1) Pr(A_3 | A_1 ∩ A_2) ⋯ Pr(A_n | A_1 ∩ A_2 ∩ ⋯ ∩ A_{n−1})

Bayes' Rule:
    Pr(A | B) = Pr(B | A) Pr(A) / Pr(B)

Law of Total Probability: Given a partition A_1, A_2, …, A_n of S,
    Pr(E) = \sum_{i=1}^{n} Pr(E ∩ A_i) = \sum_{i=1}^{n} Pr(E | A_i) Pr(A_i)
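Bayes' Rule combined with the law of total probability is easy to check on a worked example; a sketch with hypothetical diagnostic-test numbers (1% prevalence, 95% sensitivity, 5% false-positive rate, all assumed for illustration):

```python
# Hypothetical numbers, for illustration only.
p_d = 0.01               # Pr(D): prevalence of the condition
p_pos_given_d = 0.95     # Pr(+ | D): sensitivity
p_pos_given_not_d = 0.05 # Pr(+ | not D): false-positive rate

# Law of total probability: Pr(+) = Pr(+|D)Pr(D) + Pr(+|not D)Pr(not D)
p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)

# Bayes' Rule: Pr(D|+) = Pr(+|D)Pr(D) / Pr(+)
p_d_given_pos = p_pos_given_d * p_d / p_pos
print(round(p_d_given_pos, 3))  # prints 0.161
```

A positive test raises Pr(D) from 1% to only about 16%, because false positives from the large healthy group dominate the numerator's contribution.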

Discrete vs. Continuous Random Variables.

PMF/PDF: Discrete: PMF p_X(x) = Pr(X = x). Continuous: PDF f_X(x).
CDF: Discrete: F_X(x) = Pr(X ≤ x). Continuous: F_X(x) = Pr(X ≤ x) = \int_{−∞}^{x} f_X(t) dt.
Expectation: Discrete: E(X) = \sum_{x ∈ X(S)} x p_X(x). Continuous: E(X) = \int_{−∞}^{∞} x f_X(x) dx.
LOTUS: Discrete: E(g(X)) = \sum_{x ∈ X(S)} g(x) p_X(x). Continuous: E(g(X)) = \int_{−∞}^{∞} g(x) f_X(x) dx.
Variance (both cases): V(X) = E(X − E(X))² = E(X²) − E(X)².
MGF (both cases): M_X(t) = E(e^{tX}), defined over some interval (−a, a) where a > 0.
Joint PMF/PDF: Discrete: p_{X,Y}(x, y) = Pr(X = x, Y = y). Continuous: f_{X,Y}(x, y).
Marginal PMF/PDF: Discrete: p_X(x) = Pr(X = x) = \sum_y p_{X,Y}(x, y). Continuous: f_X(x) = \int_{−∞}^{∞} f_{X,Y}(x, y) dy.
Conditional PMF/PDF: Discrete: p_{X|Y}(x|y) = Pr(X = x | Y = y) = p_{X,Y}(x, y)/p_Y(y). Continuous: f_{X|Y}(x|y) = f_{X,Y}(x, y)/f_Y(y).
Independence of X and Y: Discrete: p_{X,Y}(x, y) = p_X(x) p_Y(y) for all x, y. Continuous: f_{X,Y}(x, y) = f_X(x) f_Y(y) for all x, y.
Conditional Expectation: Discrete: E(X | Y = y) = \sum_x x p_{X|Y}(x|y). Continuous: E(X | Y = y) = \int_{−∞}^{∞} x f_{X|Y}(x|y) dx.
Conditional Expectation Laws (both cases): E(X) = E(E(X|Y)); V(X) = E(V(X|Y)) + V(E(X|Y)).
2-D LOTUS: Discrete: E(g(X, Y)) = \sum_x \sum_y g(x, y) p_{X,Y}(x, y). Continuous: E(g(X, Y)) = \int_{−∞}^{∞} \int_{−∞}^{∞} g(x, y) f_{X,Y}(x, y) dy dx.
Covariance (both cases): C(X, Y) = E((X − E(X))(Y − E(Y))) = E(XY) − E(X)E(Y).
Covariance Law (both cases): C(\sum_{i=1}^{n} a_i X_i, \sum_{j=1}^{m} b_j Y_j) = \sum_{i=1}^{n} \sum_{j=1}^{m} a_i b_j C(X_i, Y_j).
Correlation (both cases): ρ(X, Y) = C((X − E(X))/\sqrt{V(X)}, (Y − E(Y))/\sqrt{V(Y)}) = C(X, Y)/\sqrt{V(X) V(Y)}.

[1] Contact: amit.kr.goyal@gmail.com

Important Distributions.

X ~ Bern(p): p_X(x) = p if x = 1, 1 − p if x = 0.  E(X) = p, V(X) = p(1 − p).
X ~ Binom(n, p): p_X(x) = \binom{n}{x} p^x (1 − p)^{n−x} for x ∈ {0, 1, …, n}.  E(X) = np, V(X) = np(1 − p).
X ~ Geom(p): p_X(x) = (1 − p)^x p for x ∈ {0, 1, 2, …}.  E(X) = (1 − p)/p, V(X) = (1 − p)/p².
X ~ Pois(λ): p_X(x) = e^{−λ} λ^x / x! for x ∈ {0, 1, 2, …}.  E(X) = λ, V(X) = λ.
X ~ Unif(a, b): f_X(x) = 1/(b − a) for x ∈ [a, b].  E(X) = (a + b)/2, V(X) = (b − a)²/12.
X ~ N(µ, σ²): f_X(x) = \frac{1}{σ\sqrt{2π}} e^{−(x−µ)²/(2σ²)} for x ∈ (−∞, ∞).  E(X) = µ, V(X) = σ².
X ~ Expo(λ): f_X(x) = λ e^{−λx} for x ∈ (0, ∞).  E(X) = 1/λ, V(X) = 1/λ².
X ~ χ²_n: X = \sum_{i=1}^{n} Z_i², where Z_1, Z_2, …, Z_n are i.i.d. N(0, 1).  E(X) = n, V(X) = 2n.
T ~ t_n: T = Z/\sqrt{X/n}, where Z ~ N(0, 1) and X ~ χ²_n are independent.  E(T) = 0 for n > 1, V(T) = n/(n − 2) for n > 2.
Indicator I_A: I_A(s) = 1 if s ∈ A, 0 otherwise.  E(I_A) = Pr(A), V(I_A) = Pr(A)(1 − Pr(A)).
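The tabulated moments can be recovered numerically from the PMFs themselves; a sketch for the Binomial (n = 10 and p = 0.3 are illustrative values):

```python
from math import comb

def binom_pmf(x, n, p):
    """PMF of Binom(n, p): C(n, x) p^x (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.3
mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))
var = sum(x**2 * binom_pmf(x, n, p) for x in range(n + 1)) - mean**2

assert abs(mean - n * p) < 1e-9           # E(X) = np = 3.0
assert abs(var - n * p * (1 - p)) < 1e-9  # V(X) = np(1-p) = 2.1
```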

Statistical Inequalities.
Cauchy-Schwarz: For any random variables X and Y, |E(XY)| ≤ \sqrt{E(X²) E(Y²)}.
Jensen's Inequality: For any random variable X and any convex function g, E(g(X)) ≥ g(E(X)).
Markov's Inequality: For any random variable X and any a > 0, Pr(|X| ≥ a) ≤ E(|X|)/a.
Chebyshev's Inequality: For any random variable X and any a > 0, Pr(|X − µ| ≥ a) ≤ V(X)/a², where µ = E(X).
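Chebyshev's inequality can be checked against an exact tail probability; a sketch comparing the bound with the true tail of a Binomial (the parameters are illustrative):

```python
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.3
mu, var = n * p, n * p * (1 - p)  # E(X) = 3.0, V(X) = 2.1
a = 3

# Exact tail probability Pr(|X - mu| >= a), summed over the support
tail = sum(binom_pmf(x, n, p) for x in range(n + 1) if abs(x - mu) >= a)

# Chebyshev bound V(X)/a^2
bound = var / a**2
assert tail <= bound  # the bound holds, though it is typically loose
print(f"Pr(|X - mu| >= {a}) = {tail:.4f} <= {bound:.4f}")
```

Here the exact tail is roughly 0.076 against a bound of about 0.233, illustrating that Chebyshev holds for every distribution at the price of being conservative.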
Definitions. Given a sequence Y_1, Y_2, … of random variables:

Y_n converges in probability to a: ∀ε > 0, lim_{n→∞} Pr(|Y_n − a| ≥ ε) = 0.
Y_n converges in distribution to Y: lim_{n→∞} F_{Y_n}(y) = F_Y(y) for all y at which F_Y is continuous.
Sample Mean M_n: Given an i.i.d. sample X_1, X_2, …, X_n, M_n = (X_1 + X_2 + ⋯ + X_n)/n.
Limit Theorems. Given a sequence of i.i.d. random variables X_1, X_2, … with E(X_i) = µ and V(X_i) = σ²:

Weak Law of Large Numbers: the sample mean M_n converges in probability to µ.
Central Limit Theorem: \sqrt{n}(M_n − µ)/σ converges in distribution to Z ~ N(0, 1).
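The law of large numbers is easy to see by simulation; a sketch with seeded Unif(0, 1) draws (sample sizes and the 0.02 tolerance are illustrative choices):

```python
import random
random.seed(0)  # fixed seed so the run is reproducible

# i.i.d. Unif(0, 1): mu = 1/2, sigma^2 = 1/12
def sample_mean(n):
    return sum(random.random() for _ in range(n)) / n

# Weak LLN: sample means of large samples concentrate around mu.
# The standard deviation of M_n here is sqrt(1/12/10000) ~ 0.003,
# so a 0.02 window is many standard deviations wide.
means = [sample_mean(10_000) for _ in range(100)]
assert all(abs(m - 0.5) < 0.02 for m in means)
print("100 sample means of size 10,000 all lie within 0.02 of mu = 0.5")
```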
Point Estimator. Given a sequence of random variables X_1, X_2, …, X_n drawn from a distribution with unknown parameter θ, let Θ̂ = h(X_1, X_2, …, X_n) be a point estimator for θ.

Bias of Θ̂: B_{Θ̂}(θ) = E_θ(Θ̂) − θ.
Θ̂ is unbiased: B_{Θ̂}(θ) = 0 for all θ.
Mean squared error of Θ̂: MSE_{Θ̂}(θ) = E_θ[(Θ̂ − θ)²] = V_θ(Θ̂) + (B_{Θ̂}(θ))².
Consistency of Θ̂: Θ̂ converges in probability to θ, for all values of θ.
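The bias and MSE definitions can be verified exactly for a simple estimator; a sketch taking Θ̂ to be the sample mean of n Bernoulli(p) draws, with the expectations computed by summing over the Binomial distribution of the success count (n = 8 and p = 0.3 are illustrative):

```python
from math import comb

# Theta_hat = (X_1 + ... + X_n)/n for X_i ~ Bern(p), estimating theta = p.
n, p = 8, 0.3

def pmf(k):
    """Number of successes ~ Binom(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

E_hat = sum((k / n) * pmf(k) for k in range(n + 1))
E_hat2 = sum((k / n)**2 * pmf(k) for k in range(n + 1))

bias = E_hat - p
var = E_hat2 - E_hat**2
mse = sum(((k / n) - p)**2 * pmf(k) for k in range(n + 1))

assert abs(bias) < 1e-12                   # the sample mean is unbiased
assert abs(mse - (var + bias**2)) < 1e-12  # MSE = V + B^2 decomposition
assert abs(var - p * (1 - p) / n) < 1e-12  # V(Theta_hat) = p(1-p)/n -> 0, so consistent
```

The variance shrinking like 1/n, together with zero bias, is exactly why this estimator is consistent.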
