Topic 3: Probability
Asef Nazari
Noise
there are always sources of inaccuracies in any real data collected
Missing values
Missing variable
latent variable analysis (latent variable = hidden variable = missing variable)
Bias
systematically inaccurate estimates of population values
Outliers
These are data points that lie well beyond the bulk of samples
Ω = {m, f }
P (E) = 2/4 = 0.5
Two events are called disjoint or exclusive if they have no
outcome in common
Event E the first coin lands heads: E = {hh, ht}
Event F the first coin lands tails: F = {tt, th}
Some extensions
For any two events E1 , E2 ⊂ Ω:
P (E1 ∪ E2 ) = P (E1 ) + P (E2 ) − P (E1 ∩ E2 )
For pairwise disjoint events E1 , E2 , . . . , En :
P (∪_{i=1}^n Ei ) = Σ_{i=1}^n P (Ei )
Asef Nazari Math AI - SIT787 Topic 3 23 / 128
Some propositions
P (E c ) = 1 − P (E)
P (A ∩ B) = P (A|B)P (B)
P (A ∩ B) = P (B|A)P (A)
These propositions are useful for checking whether events are dependent or independent.
Two events with non-zero probabilities are independent if and only if any of the following equivalent conditions holds:
P (A ∩ B) = P (A)P (B)
P (A|B) = P (A)
P (B|A) = P (B)
If E and F are independent, then so are E and F c .
If we have several independent models, it is better to make an
ensemble model.
Conditional probabilities: example of die tossing
P (F ) = P (F ∩ E) + P (F ∩ E c )
= P (F |E)P (E) + P (F |E c )P (E c )
= P (F |E)P (E) + P (F |E c )[1 − P (E)]
the probability of the event F is a weighted average of the conditional probability of F given that E has occurred and the conditional probability of F given that E has not occurred.
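The law of total probability above can be checked numerically. A minimal sketch; the probabilities below are hypothetical, chosen only to illustrate the formula:

```python
# Law of total probability: P(F) = P(F|E)P(E) + P(F|E^c)P(E^c).
# All numbers here are hypothetical illustration values.
p_E = 0.4                 # P(E)
p_F_given_E = 0.7         # P(F|E)
p_F_given_Ec = 0.2        # P(F|E^c)

# Weighted average of the two conditional probabilities:
p_F = p_F_given_E * p_E + p_F_given_Ec * (1 - p_E)
print(round(p_F, 2))  # 0.4
```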
Bayes’ Theorem
P (E|F ) = P (E ∩ F ) / P (F )
P (D|E) = P (D ∩ E) / P (E) = P (E|D)P (D) / [P (E|D)P (D) + P (E|Dc )P (Dc )] = 0.3322
Bayes’ Theorem: extension
Bayes’ formula
P (Fj |E) = P (E ∩ Fj ) / P (E) = P (E|Fj )P (Fj ) / Σ_{i=1}^n P (E|Fi )P (Fi )
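Bayes’ formula with a partition F1 , . . . , Fn can be sketched in a few lines of Python. The scenario and all numbers below are hypothetical (e.g. three sources Fi producing an observed event E):

```python
# Extended Bayes' formula over a partition F_1,...,F_n:
#   P(F_j|E) = P(E|F_j) P(F_j) / sum_i P(E|F_i) P(F_i)
# Hypothetical numbers for illustration only.
priors = [0.5, 0.3, 0.2]          # P(F_i), must sum to 1
likelihoods = [0.01, 0.02, 0.05]  # P(E|F_i)

evidence = sum(l * p for l, p in zip(likelihoods, priors))   # P(E)
posteriors = [l * p / evidence for l, p in zip(likelihoods, priors)]
print([round(p, 3) for p in posteriors])  # [0.238, 0.286, 0.476]
```

The posteriors always sum to 1, since the denominator is exactly the total probability of E.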
A random variable is a function X : Ω → R, taking values x1 , x2 , . . . , xn , with probabilities P (X = x).
x          0     1     2
P (X = x)  1/4   2/4   1/4

P (X ≥ 1) = P (X = 1) + P (X = 2) = 1/2 + 1/4 = 3/4
Probability mass function (PMF) pX (x) = p(x) = P (X = x)
Let X be a random variable with SX = {x1 , x2 , . . . , xn } and probabilities {p1 , p2 , . . . , pn }, where P (X = xj ) = pj for all j. Then
pj ≥ 0
Σ_{j=1}^n pj = 1
Lottery game:
win $100 with probability 0.5
lose $100 with probability 0.3
lose $50 with probability 0.2
The related random variable:

x          100   −100   −50
P (X = x)  0.5   0.3    0.2
µ = E[X] = 100(0.5) + (−100)(0.3) + (−50)(0.2) = 10
dollars per game
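The expected value of the lottery game above can be computed directly from the PMF. A minimal sketch using the slide’s values:

```python
# Expected value of the lottery game:
# win $100 w.p. 0.5, lose $100 w.p. 0.3, lose $50 w.p. 0.2.
values = [100, -100, -50]
probs = [0.5, 0.3, 0.2]

# E[X] = sum of x * P(X = x) over the support
mu = sum(x * p for x, p in zip(values, probs))
print(round(mu, 2))  # 10.0, i.e. 10 dollars per game on average
```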
Finite support:

x          x1   x2   . . .   xn
P (X = x)  p1   p2   . . .   pn

Countably infinite support:

x          x1   x2   . . .   xn   . . .
P (X = x)  p1   p2   . . .   pn   . . .
Ω = {H, T H, T T H, T T T H, . . .}
P (X = 0) = P ({T T T T T }) = (1 − p)^5
P (X = 1) = P ({HT T T T, T HT T T, T T HT T, T T T HT, T T T T H}) = 5p(1 − p)^4 = C(5, 1) p(1 − p)^4
P (X = 2) = C(5, 2) p^2 (1 − p)^3
For B(n, p): P (X = k) = C(n, k) p^k (1 − p)^(n−k) for k ∈ SX
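The binomial PMF above is easy to evaluate with the standard library; here it reproduces the n = 5 coin-tossing probabilities from the slide:

```python
from math import comb

# Binomial PMF: P(X = k) = C(n, k) p^k (1-p)^(n-k) for X ~ B(n, p)
def binom_pmf(n: int, p: float, k: int) -> float:
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p = 0.5  # fair coin, n = 5 tosses as in the slide's example
print(binom_pmf(5, p, 0))  # (1-p)^5      = 0.03125
print(binom_pmf(5, p, 1))  # 5 p (1-p)^4  = 0.15625

# The PMF sums to 1 over the support {0, 1, ..., n}:
assert abs(sum(binom_pmf(5, p, k) for k in range(6)) - 1) < 1e-12
```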
Geometric and Poisson distributions with infinite support sets
Geometric distribution
Toss a coin with P (H) = p until the first head.
X is the number of tosses (including the last one).
P (X = k) = P ({T T . . . T H}) = p(1 − p)^(k−1), k ≥ 1
Poisson distribution
X is the number of arrivals.
P (X = k) = (λ^k / k!) e^(−λ) for parameter λ > 0, k ≥ 0
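Both infinite-support PMFs can be sketched with the standard library; the parameter values below (p = 0.3, λ = 2) are arbitrary illustration choices:

```python
from math import exp, factorial

# Geometric: P(X = k) = p (1-p)^(k-1), k >= 1
# (number of tosses up to and including the first head)
def geom_pmf(p: float, k: int) -> float:
    return p * (1 - p) ** (k - 1)

# Poisson: P(X = k) = (lambda^k / k!) e^(-lambda), k >= 0
def poisson_pmf(lam: float, k: int) -> float:
    return lam**k / factorial(k) * exp(-lam)

# Each PMF sums to 1 over its infinite support; a long partial
# sum gets arbitrarily close.
assert abs(sum(geom_pmf(0.3, k) for k in range(1, 200)) - 1) < 1e-9
assert abs(sum(poisson_pmf(2.0, k) for k in range(100)) - 1) < 1e-9
```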
These measure how far a random variable deviates from its mean.
We want to measure X − E[X], but E[X − E[X]] = 0.
Let’s consider E[(X − E[X])^2 ] instead, which is the variance.
Lottery game 1: µ = E[X] = 5(0.1) + (−1)(0.9) = −0.4 dollars per game

x               5       −1
P (X = x)       0.1     0.9
X − E[X]        5.4     −0.6
(X − E[X])^2    29.16   0.36

Var(X) = E[(X − E[X])^2 ] = 29.16(0.1) + 0.36(0.9) = 3.24
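The variance computation for lottery game 1 can be reproduced directly from the definition Var(X) = E[(X − E[X])^2]:

```python
# Lottery game 1 from the slide: win $5 w.p. 0.1, lose $1 w.p. 0.9.
values = [5, -1]
probs = [0.1, 0.9]

mu = sum(x * p for x, p in zip(values, probs))               # E[X]
var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))  # E[(X - E[X])^2]
print(round(mu, 2), round(var, 2))  # -0.4 3.24
```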
(X, Y ) pairs where Y is a function of X:

x          −1    3     4
P (X = x)  0.2   0.5   0.3

y          1     5     6
P (Y = y)  0.2   0.5   0.3        (here Y = X + 2)

x          −1    3     4
P (X = x)  0.2   0.5   0.3

y          −2    6     8
P (Y = y)  0.2   0.5   0.3        (here Y = 2X)
E[X] = Σ_{i=1}^n xi pi
E[Y ] = E[g(X)] = Σ_{i=1}^n g(xi )pi
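The formula E[g(X)] = Σ g(xi)pi lets us compute the expectation of Y = g(X) straight from the PMF of X, without first deriving the PMF of Y. A quick check with the slide’s table (x: −1, 3, 4 with probabilities 0.2, 0.5, 0.3) and g(x) = 2x:

```python
# E[g(X)] = sum_i g(x_i) p_i, computed from the PMF of X alone.
xs = [-1, 3, 4]
ps = [0.2, 0.5, 0.3]

e_x = sum(x * p for x, p in zip(xs, ps))         # E[X]  = 2.5
e_y = sum((2 * x) * p for x, p in zip(xs, ps))   # E[2X] with g(x) = 2x

# Linearity: E[2X] = 2 E[X]
assert abs(e_y - 2 * e_x) < 1e-12
```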
Properties of variance
Var(aX + b) = a^2 Var(X)
if a = 0, Var(b) = 0
The quantity √Var(X) is called the standard deviation of X:
std(X) = √Var(X)
Var(X1 + X2 ) = E[(X1 − E[X1 ])^2 + (X2 − E[X2 ])^2 + 2(X1 − E[X1 ])(X2 − E[X2 ])]
Y \X      −1    0     1     margin
1         1/3   0     1/3   2/3
−2        0     1/3   0     1/3
margin    1/3   1/3   1/3
Corr(X, Y ) = Cov(X, Y ) / √(Var(X)Var(Y ))
If there is a perfect linear relationship between X and Y , Y = aX + b, then
Corr(X, Y ) = Corr(X, aX + b) = aVar(X) / √(Var(X) · a^2 Var(X)) = a/|a| = ±1
Corr(aX, Y ) = Corr(X, Y ) if a > 0
Corr(X, Y ) ∈ [−1, 1]
If Corr(X, Y ) = 0, the two random variables are called uncorrelated.
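The ±1 result for a perfect linear relationship can be verified numerically. A sketch using the slide’s PMF (x: −1, 3, 4 with probabilities 0.2, 0.5, 0.3) and the assumed choice Y = 2X:

```python
from math import sqrt

# Corr(X, Y) = Cov(X, Y) / sqrt(Var(X) Var(Y)); for Y = aX + b
# it should come out as a/|a| = +-1. Here a = 2, b = 0 (illustration).
xs = [-1, 3, 4]
ps = [0.2, 0.5, 0.3]
a, b = 2, 0
ys = [a * x + b for x in xs]

ex = sum(x * p for x, p in zip(xs, ps))
ey = sum(y * p for y, p in zip(ys, ps))
cov = sum((x - ex) * (y - ey) * p for x, y, p in zip(xs, ys, ps))
var_x = sum((x - ex) ** 2 * p for x, p in zip(xs, ps))
var_y = sum((y - ey) ** 2 * p for y, p in zip(ys, ps))

corr = cov / sqrt(var_x * var_y)
print(round(corr, 6))  # 1.0 -- perfect positive linear relationship
```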
X : Ω → R, Y : Ω → R, Z : Ω → R
Expectation
Z = X + Y then E[Z] = E[X] + E[Y ]
Variance
Suppose Var(X) > 0.
If Y = −X, then Var(Y ) = Var(X).
If Z = X + Y , then Var(Z) = Var(X − X) = Var(0) = 0,
while Var(X) + Var(Y ) = 2Var(X).
So in general Var(X + Y ) ≠ Var(X) + Var(Y ).
For the special case where X and Y are independent random variables, Var(X + Y ) = Var(X) + Var(Y ).
x          x1   x2   . . .   xm
P (X = x)  p1   p2   . . .   pm

y          y1   y2   . . .   yn
P (Y = y)  q1   q2   . . .   qn
X = 1 if the 1st toss is H, and 0 otherwise.
Z = 1 if the 2nd toss is H, and 0 otherwise.
X\Z   0                      1
0     P (X = 0 ∩ Z = 0)      P (X = 0 ∩ Z = 1)
1     P (X = 1 ∩ Z = 0)      P (X = 1 ∩ Z = 1)

X\Z   0      1
0     0.25   0.25
1     0.25   0.25
Marginal probabilities
For the (X, Y ) table:

X\Y   0     1
0     0     0.5
1     0.5   0

P (X = 0) = 0 + 0.5 = 0.5,  P (X = 1) = 0.5 + 0 = 0.5
P (Y = 0) = 0 + 0.5 = 0.5,  P (Y = 1) = 0.5 + 0 = 0.5

For the (X, Z) table:

X\Z   0      1
0     0.25   0.25
1     0.25   0.25

P (X = 0) = 0.25 + 0.25 = 0.5,  P (X = 1) = 0.25 + 0.25 = 0.5
P (Z = 0) = 0.25 + 0.25 = 0.5,  P (Z = 1) = 0.25 + 0.25 = 0.5
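Marginal probabilities are just row and column sums of the joint table; for the two-coin example where every cell is 0.25:

```python
# Joint PMF of (X, Z) for two fair coin tosses: rows = X, columns = Z.
joint = [[0.25, 0.25],
         [0.25, 0.25]]

p_x = [sum(row) for row in joint]        # marginal of X: row sums
p_z = [sum(col) for col in zip(*joint)]  # marginal of Z: column sums
print(p_x, p_z)  # [0.5, 0.5] [0.5, 0.5]
```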
Independent random variables
X\Y   0     1          X\Z   0      1
0     0     0.5        0     0.25   0.25
1     0.5   0          1     0.25   0.25

X and Z are independent: every cell equals the product of its marginals (0.25 = 0.5 × 0.5). X and Y are not: P (X = 0 ∩ Y = 0) = 0 ≠ 0.5 × 0.5.
X\Y       0     1     2     marginal
0         1/8   1/8   0     1/4
1         1/8   1/4   1/8   1/2
2         0     1/8   1/8   1/4
marginal  1/4   1/2   1/4
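Independence can be checked cell by cell: the joint probability must equal the product of the marginals everywhere. Exact fractions avoid floating-point noise; the 3 × 3 table above fails the test:

```python
from fractions import Fraction as F

# The 3x3 joint table from the slide, rows = X, columns = Y.
joint = [[F(1, 8), F(1, 8), F(0)],
         [F(1, 8), F(1, 4), F(1, 8)],
         [F(0),    F(1, 8), F(1, 8)]]

p_x = [sum(row) for row in joint]        # marginals of X: 1/4, 1/2, 1/4
p_y = [sum(col) for col in zip(*joint)]  # marginals of Y: 1/4, 1/2, 1/4

independent = all(joint[i][j] == p_x[i] * p_y[j]
                  for i in range(3) for j in range(3))
print(independent)  # False: cell (0, 2) is 0, but P(X=0)P(Y=2) = 1/16
```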
Independent random variables
x          x1   x2   . . .   xm
P (X = x)  p1   p2   . . .   pm

y          y1   y2   . . .   yn
P (Y = y)  q1   q2   . . .   qn

X and Y are independent iff P (X = xi ∩ Y = yj ) = pi qj for all i, j.
dF (a)/da = f (a)
P (a < X < b) = area under the density curve between a and b
For example,
f (x) = e^(−x) for x ≥ 0, and f (x) = 0 for x < 0
FX (a) = P (X ≤ a)
FX (x) is non-decreasing
lim_{x→−∞} FX (x) = 0 and lim_{x→+∞} FX (x) = 1
FX (x) ∈ [0, 1]
Relationship between PDF and CDF
X is a random variable with CDF FX (x) and PDF fX (x)
P (X ∈ [a, b]) = P (X ≤ b) − P (X ≤ a) = F (b) − F (a)
P (X ∈ [a, b]) = ∫_a^b fX (x) dx
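The PDF–CDF relationship can be verified numerically for the density f(x) = e^(−x) above, whose CDF is F(x) = 1 − e^(−x). A sketch using a simple midpoint-rule integral (the interval [0.5, 2] is an arbitrary choice):

```python
from math import exp

# Density f(x) = e^(-x) for x >= 0, with CDF F(x) = 1 - e^(-x).
def f(x: float) -> float:
    return exp(-x) if x >= 0 else 0.0

def F(x: float) -> float:
    return 1 - exp(-x) if x >= 0 else 0.0

# Midpoint rule: P(X in [a, b]) = integral of f over [a, b]
a, b, n = 0.5, 2.0, 100_000
h = (b - a) / n
integral = sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# Should agree with F(b) - F(a)
assert abs(integral - (F(b) - F(a))) < 1e-6
```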
Joint PMF for discrete random variables: pX,Y (x, y) = P (X = x ∩ Y = y)
Joint PDF for continuous random variables: fX,Y (x, y)
Joint CDF: FX,Y (x, y) = P (X ≤ x ∩ Y ≤ y)
Properties of X       Properties of the sample
µ = E[X]              x̄
σ^2 = Var(X)          s^2
σ = std(X)            s
m (the median)        x̃
The sample correlation coefficient satisfies −1 ≤ r ≤ 1.
Using the standard normal distribution:
P (−1 ≤ Z ≤ +1) = 0.68
P (−2 ≤ Z ≤ +2) = 0.95
P (−3 ≤ Z ≤ +3) = 0.997
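These three standard-normal probabilities (the 68–95–99.7 rule) can be reproduced with the error function, since P(−z ≤ Z ≤ z) = erf(z/√2):

```python
from math import erf, sqrt

# P(-z <= Z <= z) for standard normal Z equals erf(z / sqrt(2)).
for z in (1, 2, 3):
    print(z, round(erf(z / sqrt(2)), 3))
# 1 0.683
# 2 0.954
# 3 0.997
```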
P (X = k) = e^(−λ) λ^k / k!
E[X] = Var(X) = λ
Sharing or publishing the contents in part or full is liable for legal action.
Asef Nazari Math AI - SIT787 Topic 3 119 / 128
Important continuous random variables
E[X] = µ, Var(X) = σ 2
Important continuous random variables
µ, σ^2 , σ for a population
x̄, s^2 , s computed from one sample
X̄, S^2 , S when we have many samples
From a population with µ and σ^2 :
X̄ is the random variable of sample means
S^2 is the random variable of sample variances
x̄ = (Σ_{i=1}^n xi )/n,   s = √( Σ_{i=1}^n (xi − x̄)^2 / (n − 1) ),   where n is the sample size
z is the critical value for the chosen confidence level; the CI for the mean is x̄ ± z s/√n
z = 1.645 for 90% CI for the mean
z = 1.96 for 95% CI for the mean
z = 2.576 for 99% CI for the mean
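A z-based confidence interval for the mean can be sketched end to end; the sample data below is hypothetical, chosen only to illustrate the computation:

```python
from math import sqrt

# z-based CI for the mean: x_bar +- z * s / sqrt(n).
# Hypothetical sample data for illustration.
sample = [9.8, 10.2, 10.1, 9.9, 10.4, 9.6, 10.0, 10.3]
n = len(sample)

x_bar = sum(sample) / n                                   # sample mean
s = sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1)) # sample std dev

z = 1.96  # critical value for a 95% CI
half_width = z * s / sqrt(n)
print(f"95% CI: [{x_bar - half_width:.3f}, {x_bar + half_width:.3f}]")
```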