Results.
Inclusion-exclusion    $\Pr\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} \Pr(A_i) - \sum_{i<j} \Pr(A_i \cap A_j) + \sum_{i<j<k} \Pr(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n+1} \Pr\left(\bigcap_{i=1}^{n} A_i\right)$

Conditioning    $\Pr\left(\bigcap_{i=1}^{n} A_i\right) = \Pr(A_1) \Pr(A_2 \mid A_1) \Pr(A_3 \mid A_1 \cap A_2) \cdots \Pr(A_n \mid A_1 \cap A_2 \cap \cdots \cap A_{n-1})$
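As a quick numerical sanity check, inclusion-exclusion can be verified on a small finite sample space. The die and the three events below are illustrative assumptions, not from the text; exact rational arithmetic makes the equality exact.

```python
from itertools import combinations
from fractions import Fraction

# Sample space: a fair six-sided die; events are subsets of outcomes
# (illustrative choices, not from the text).
S = set(range(1, 7))
A = [{1, 2, 3}, {2, 4, 6}, {3, 6}]

def pr(event):
    return Fraction(len(event), len(S))

# Direct probability of the union A1 ∪ A2 ∪ A3.
direct = pr(set().union(*A))

# Inclusion-exclusion: alternate signs over intersections of size k.
n = len(A)
ie = Fraction(0)
for k in range(1, n + 1):
    sign = (-1) ** (k + 1)
    for subset in combinations(A, k):
        ie += sign * pr(set.intersection(*subset))

assert ie == direct
print(direct)  # 5/6
```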
Bayes' Rule    $\Pr(A \mid B) = \dfrac{\Pr(B \mid A) \Pr(A)}{\Pr(B)}$
Law of Total Probability    Given a partition $A_1, A_2, A_3, \ldots, A_n$ of $S$, $\Pr(E) = \sum_{i=1}^{n} \Pr(E \cap A_i) = \sum_{i=1}^{n} \Pr(E \mid A_i) \Pr(A_i)$
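The two rules combine naturally: total probability gives the denominator of Bayes' rule. A minimal sketch, with illustrative prior and likelihood numbers (not from the text):

```python
# Partition A1, A2, A3 of S with priors Pr(Ai), and likelihoods Pr(E | Ai).
# All numbers are illustrative assumptions.
prior = {"A1": 0.5, "A2": 0.3, "A3": 0.2}
likelihood = {"A1": 0.9, "A2": 0.5, "A3": 0.1}

# Law of total probability: Pr(E) = sum_i Pr(E | Ai) Pr(Ai)
pr_e = sum(likelihood[a] * prior[a] for a in prior)

# Bayes' rule: Pr(A1 | E) = Pr(E | A1) Pr(A1) / Pr(E)
posterior_a1 = likelihood["A1"] * prior["A1"] / pr_e
```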
Term           Discrete                                          Continuous
PMF/PDF        PMF $p_X(x) = \Pr(X = x)$                         PDF $f_X(x)$
CDF            $F_X(x) = \Pr(X \le x)$                           $F_X(x) = \Pr(X \le x) = \int_{-\infty}^{x} f_X(t)\,dt$
Expectation    $E(X) = \sum_{x \in X(S)} x\, p_X(x)$             $E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx$
LOTUS          $E(g(X)) = \sum_{x \in X(S)} g(x)\, p_X(x)$       $E(g(X)) = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx$
Contact: amit.kr.goyal@gmail.com
Important Distributions.

Distribution                       PMF/PDF/Description                                                                              Expectation                          Variance
$X \sim \mathrm{Bern}(p)$          $p_X(x) = p$ if $x = 1$, $1 - p$ if $x = 0$                                                      $E(X) = p$                           $V(X) = p(1 - p)$
$X \sim \mathrm{Binom}(n, p)$      $p_X(x) = \binom{n}{x} p^x (1 - p)^{n - x}$ for $x \in \{0, 1, \ldots, n\}$                      $E(X) = np$                          $V(X) = np(1 - p)$
$X \sim \mathrm{Geom}(p)$          $p_X(x) = (1 - p)^x p$ for $x \in \{0, 1, 2, \ldots\}$                                           $E(X) = \frac{1 - p}{p}$             $V(X) = \frac{1 - p}{p^2}$
$X \sim \mathrm{Pois}(\lambda)$    $p_X(x) = \frac{e^{-\lambda} \lambda^x}{x!}$ for $x \in \{0, 1, 2, \ldots\}$                     $E(X) = \lambda$                     $V(X) = \lambda$
$X \sim \mathrm{Unif}(a, b)$       $f_X(x) = \frac{1}{b - a}$ for $x \in [a, b]$                                                    $E(X) = \frac{a + b}{2}$             $V(X) = \frac{(b - a)^2}{12}$
$X \sim N(\mu, \sigma^2)$          $f_X(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$ for $x \in (-\infty, \infty)$    $E(X) = \mu$               $V(X) = \sigma^2$
$X \sim \mathrm{Expo}(\lambda)$    $f_X(x) = \lambda e^{-\lambda x}$ for $x \in (0, \infty)$                                        $E(X) = \frac{1}{\lambda}$           $V(X) = \frac{1}{\lambda^2}$
$X \sim \chi^2_n$                  $X = \sum_{i=1}^{n} Z_i^2$, where $Z_1, Z_2, \ldots, Z_n$ are i.i.d. $N(0, 1)$                  $E(X) = n$                           $V(X) = 2n$
$T \sim t_n$                       $T = \frac{Z}{\sqrt{X/n}}$, where $Z \sim N(0, 1)$ and $X \sim \chi^2_n$ are independent        $E(T) = 0$ for $n > 1$               $V(T) = \frac{n}{n - 2}$ for $n > 2$
Indicator $I_A$                    $I_A(s) = 1$ if $s \in A$, $0$ otherwise                                                         $E(I_A) = \Pr(A)$                    $V(I_A) = \Pr(A)(1 - \Pr(A))$
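The tabulated moments can be spot-checked by simulation. The sketch below draws from $\mathrm{Geom}(p)$ as parameterized in the table (failures before the first success, support $\{0, 1, 2, \ldots\}$); $p = 0.3$ and the sample size are illustrative choices.

```python
import random

random.seed(0)  # fixed seed so the Monte Carlo check is reproducible

# Geom(p) as in the table: count failures before the first success.
p = 0.3

def geom_draw():
    k = 0
    while random.random() >= p:
        k += 1
    return k

draws = [geom_draw() for _ in range(200_000)]
m = sum(draws) / len(draws)
v = sum((x - m) ** 2 for x in draws) / len(draws)
# Theory: E(X) = (1-p)/p ≈ 2.333 and V(X) = (1-p)/p^2 ≈ 7.778.
```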
Statistical Inequalities.

Inequality                Description
Cauchy-Schwarz            For any random variables $X$ and $Y$, $|E(XY)| \le \sqrt{E(X^2) E(Y^2)}$
Jensen's Inequality       For any random variable $X$ and any convex function $g$, $E(g(X)) \ge g(E(X))$
Markov's Inequality       For any random variable $X$ and any $a > 0$, $\Pr(|X| \ge a) \le \frac{E(|X|)}{a}$
Chebyshev's Inequality    For any random variable $X$ and any $a > 0$, $\Pr(|X - \mu| \ge a) \le \frac{V(X)}{a^2}$, where $\mu = E(X)$
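Chebyshev's inequality is easy to check empirically. The sketch below assumes $X \sim N(0, 1)$ and $a = 2$ (illustrative choices); the bound $V(X)/a^2 = 1/4$ holds but is far from tight, since the true tail probability is about $0.0455$:

```python
import random

random.seed(1)  # reproducible Monte Carlo

# Chebyshev for X ~ N(0, 1): mu = 0, V(X) = 1, with a = 2.
a = 2.0
n = 100_000
hits = sum(1 for _ in range(n) if abs(random.gauss(0, 1)) >= a)
empirical = hits / n     # true Pr(|X| >= 2) ≈ 0.0455
bound = 1 / a**2         # Chebyshev bound = 0.25
assert empirical <= bound  # the bound holds (and is loose here)
```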
Definitions. Given a sequence $Y_1, Y_2, \ldots$ of random variables

Term                                         Definition
$Y_n$ converges in probability to $a$        $\forall \epsilon > 0$, $\lim_{n \to \infty} \Pr(|Y_n - a| \ge \epsilon) = 0$
$Y_n$ converges in distribution to $Y$       $\forall y$ at which $F_Y$ is continuous, $\lim_{n \to \infty} F_{Y_n}(y) = F_Y(y)$
Sample Mean $M_n$                            Given an i.i.d. sample $X_1, X_2, \ldots, X_n$, $M_n = \frac{X_1 + X_2 + \cdots + X_n}{n}$
Limit Theorems. Given a sequence of i.i.d. random variables $X_1, X_2, \ldots$ with $E(X_i) = \mu$ and $V(X_i) = \sigma^2$

Term                           Theorem
Weak Law of Large Numbers      The sample mean $M_n$ converges in probability to $\mu$
Central Limit Theorem          $\frac{\sqrt{n}(M_n - \mu)}{\sigma}$ converges in distribution to $Z \sim N(0, 1)$
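The CLT can be illustrated by standardizing sample means of a non-normal distribution. The sketch below assumes $X_i \sim \mathrm{Unif}(0, 1)$ ($\mu = 1/2$, $\sigma^2 = 1/12$, from the distributions table); the sample size and repetition count are illustrative:

```python
import random
import statistics

random.seed(2)  # reproducible simulation

# Standardized sample means of Unif(0, 1): mu = 0.5, sigma^2 = 1/12.
mu, sigma = 0.5, (1 / 12) ** 0.5
n = 50          # observations per sample mean
reps = 20_000   # number of standardized means

def std_mean():
    m = sum(random.random() for _ in range(n)) / n
    return n ** 0.5 * (m - mu) / sigma

zs = [std_mean() for _ in range(reps)]
# By the CLT these should have mean ≈ 0 and variance ≈ 1, like N(0, 1).
```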
Point Estimator. Given a sequence of random variables $X_1, X_2, \ldots, X_n$ drawn from a distribution with unknown parameter $\theta$, let $\hat{\Theta} = h(X_1, X_2, \ldots, X_n)$ be a point estimator for $\theta$.

Term                                    Definition
Bias of $\hat{\Theta}$                  $B_{\hat{\Theta}}(\theta) = E_\theta(\hat{\Theta}) - \theta$
$\hat{\Theta}$ is unbiased              $B_{\hat{\Theta}}(\theta) = 0$ for all $\theta$
Mean squared error of $\hat{\Theta}$    $\mathrm{MSE}_{\hat{\Theta}}(\theta) = E_\theta\left[(\hat{\Theta} - \theta)^2\right] = V_\theta(\hat{\Theta}) + (B_{\hat{\Theta}}(\theta))^2$
Consistency of $\hat{\Theta}$           $\hat{\Theta}$ converges in probability to $\theta$ for all values of $\theta$
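A classic illustration of bias: estimating $\sigma^2$ from normal data. The $1/n$ variance estimator has $E_\theta(\hat{\Theta}) = \frac{n-1}{n}\sigma^2$ (biased), while the $1/(n-1)$ version is unbiased. The distribution and the numbers below are illustrative assumptions, not from the text:

```python
import random
import statistics

random.seed(3)  # reproducible simulation

# Data: i.i.d. N(0, sigma^2) with sigma^2 = 4, small samples of size n = 5.
sigma2, n, reps = 4.0, 5, 50_000

biased, unbiased = [], []
for _ in range(reps):
    xs = [random.gauss(0, sigma2 ** 0.5) for _ in range(n)]
    m = sum(xs) / n
    ss = sum((x - m) ** 2 for x in xs)
    biased.append(ss / n)          # divides by n: biased downward
    unbiased.append(ss / (n - 1))  # divides by n - 1: unbiased

# E(biased) ≈ (n-1)/n * sigma^2 = 3.2, so its bias is about -0.8;
# E(unbiased) ≈ sigma^2 = 4.0, so its bias is about 0.
```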