
Basics on Probability

Jingrui He
09/11/2007
Coin Flips
 You flip a coin
 Heads with probability 0.5
 You flip 100 coins
 How many heads would you expect?
Coin Flips cont.
 You flip a coin
 Heads with probability p
 Binary random variable
 Bernoulli trial with success probability p
 You flip k coins
 How many heads would you expect?
 Number of heads X: discrete random variable
 Binomial distribution with parameters k and p
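The expectation above is easy to check with a quick simulation; this is an illustrative sketch (the function name is ours, not from the slides):

```python
import random

random.seed(0)

# Flip k coins, each landing heads with probability p
# (one Bernoulli trial per coin); return the number of heads.
def simulate_flips(k, p):
    return sum(1 for _ in range(k) if random.random() < p)

# Average over many repetitions; for k = 100, p = 0.5 the
# expected number of heads is k * p = 50.
k, p = 100, 0.5
trials = [simulate_flips(k, p) for _ in range(10_000)]
average_heads = sum(trials) / len(trials)
print(average_heads)
```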
Discrete Random Variables
 Random variables (RVs) which may take on
only a countable number of distinct values
 E.g. the total number of heads X you get if you
flip 100 coins

 X is a RV with arity k if it can take on exactly one value out of {x1, …, xk}
 E.g. the possible values that X can take on are 0, 1, 2, …, 100
Probability of Discrete RV
 Probability mass function (pmf): P(X = xi)
 Easy facts about pmf
 Σi P(X = xi) = 1
 P(X = xi ∧ X = xj) = 0 if i ≠ j
 P(X = xi ∨ X = xj) = P(X = xi) + P(X = xj) if i ≠ j
 P(X = x1 ∨ X = x2 ∨ … ∨ X = xk) = 1
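These facts are easy to sanity-check in code; a minimal sketch with a hand-picked pmf (the numbers are our own toy values):

```python
# A discrete RV X with arity 3 and a hand-picked pmf over {x1, x2, x3}.
pmf = {"x1": 0.2, "x2": 0.5, "x3": 0.3}

# Fact: the probabilities are in [0, 1] and sum to 1.
assert all(0 <= p <= 1 for p in pmf.values())
assert abs(sum(pmf.values()) - 1.0) < 1e-12

# The events X = xi are disjoint, so probabilities of distinct
# values add: P(X = x1 or X = x2) = P(X = x1) + P(X = x2).
p_x1_or_x2 = pmf["x1"] + pmf["x2"]
print(p_x1_or_x2)
```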
Common Distributions
 Uniform: X ~ U{1, …, N}
 X takes values 1, 2, …, N
 P(X = i) = 1/N
 E.g. picking balls of different colors from a box
 Binomial: X ~ Bin(n, p)
 X takes values 0, 1, …, n
 P(X = i) = C(n, i) p^i (1 − p)^(n − i)
 E.g. coin flips
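Both pmfs translate directly into code; a sketch in exact rational arithmetic (the function names are ours):

```python
from fractions import Fraction
from math import comb

# Uniform over {1, ..., N}: P(X = i) = 1/N.
def uniform_pmf(i, N):
    return Fraction(1, N) if 1 <= i <= N else Fraction(0)

# Binomial(n, p): P(X = i) = C(n, i) p^i (1 - p)^(n - i).
def binomial_pmf(i, n, p):
    return comb(n, i) * p**i * (1 - p)**(n - i)

p = Fraction(1, 2)
probs = [binomial_pmf(i, 4, p) for i in range(5)]
print(sum(probs))             # 1
print(binomial_pmf(2, 4, p))  # 3/8: exactly 2 heads in 4 fair flips
```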
Coin Flips of Two Persons
 Your friend and you both flip coins
 Heads with probability 0.5
 You flip 50 times; your friend flips 100 times
 How many heads will each of you get?
Joint Distribution
 Given two discrete RVs X and Y, their joint
distribution is the distribution of X and Y
together
 E.g. P(you get 21 heads AND your friend gets 70 heads)
 Σx Σy P(X = x ∧ Y = y) = 1
 E.g. Σi=0..50 Σj=0..100 P(you get i heads AND your friend gets j heads) = 1
Conditional Probability
 P  X  x Y  y  is the probability of X  x ,
given the occurrence of Y  y
 E.g. you get 0 heads, given that your friend gets
61 heads
P X  x  Y  y
 P X  x Y  y 
P Y  y
Law of Total Probability
 Given two discrete RVs X and Y, which take values in {x1, …, xm} and {y1, …, yn}, we have
 P(X = xi) = Σj P(X = xi ∧ Y = yj)
           = Σj P(X = xi | Y = yj) P(Y = yj)
Marginalization

 Marginal probability from the joint probability:
 P(X = xi) = Σj P(X = xi ∧ Y = yj)
 Marginal probability from conditional probabilities:
 P(X = xi) = Σj P(X = xi | Y = yj) P(Y = yj)
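Marginalization is just a row sum over a joint table; a toy sketch with made-up numbers:

```python
# A small joint distribution P(X, Y) stored as a table (toy numbers).
joint = {
    ("x1", "y1"): 0.10, ("x1", "y2"): 0.25,
    ("x2", "y1"): 0.30, ("x2", "y2"): 0.35,
}

# P(X = xi) = sum over j of P(X = xi, Y = yj)
marginal_x = {}
for (x, _y), p in joint.items():
    marginal_x[x] = marginal_x.get(x, 0.0) + p

print(marginal_x)  # P(X = x1) and P(X = x2), summing to 1
```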


Bayes Rule
 X and Y are discrete RVs:
 P(X = x | Y = y) = P(X = x ∧ Y = y) / P(Y = y)
 P(X = xi | Y = yj) = P(Y = yj | X = xi) P(X = xi) / Σk P(Y = yj | X = xk) P(X = xk)
Independent RVs
 Intuition: X and Y are independent means that knowing X = x makes it neither more nor less probable that Y = y
 Definition: X and Y are independent iff
 P(X = x ∧ Y = y) = P(X = x) P(Y = y)
More on Independence
 P  X  x  Y  y  P  X  x P  Y  y

P  X  x Y  y  P  X  x P  Y  y X  x  P  Y  y

 E.g. no matter how many heads you get, your


friend will not be affected, and vice versa
Conditionally Independent RVs
 Intuition: X and Y are conditionally
independent given Z means that once Z is
known, the value of X does not add any
additional information about Y
 Definition: X and Y are conditionally
independent given Z iff

P X  x  Y  y Z  z  P X  x Z  z P Y  y Z  z
More on Conditional Independence

P X  x  Y  y Z  z  P X  x Z  z P Y  y Z  z

P  X  x Y  y, Z  z   P  X  x Z  z 

P  Y  y X  x, Z  z   P  Y  y Z  z 
Monty Hall Problem
 You're given the choice of three doors: Behind one
door is a car; behind the others, goats.
 You pick a door, say No. 1
 The host, who knows what's behind the doors, opens
another door, say No. 3, which has a goat.
 Do you want to pick door No. 2 instead?
 [Diagram: if you picked the car, the host reveals Goat A or Goat B; if you picked Goat A, the host must reveal Goat B; if you picked Goat B, the host must reveal Goat A]
Monty Hall Problem: Bayes Rule
 Ci: the car is behind door i, i = 1, 2, 3
 P(Ci) = 1/3
 Hij: the host opens door j after you pick door i
 P(Hij | Ck) = 0 if i = j or j = k
 P(Hij | Ck) = 1/2 if i = k (and j ≠ k)
 P(Hij | Ck) = 1 if i ≠ k and j ≠ k (and i ≠ j)
Monty Hall Problem: Bayes Rule cont.
 WLOG, i = 1, j = 3
 P(C1 | H13) = P(H13 | C1) P(C1) / P(H13)
 P(H13 | C1) P(C1) = 1/2 × 1/3 = 1/6
Monty Hall Problem: Bayes Rule cont.
 P  H13   P  H13 , C1   P  H13 , C2   P  H13 , C3 
 P  H13 C1  P  C1   P  H13 C2  P  C2 
1 1
  1
6 3
1

2
16 1
 P  C1 H13   
12 3
Monty Hall Problem: Bayes Rule cont.
 P(C1 | H13) = (1/6) / (1/2) = 1/3
 P(C2 | H13) = 1 − 1/3 = 2/3 > P(C1 | H13)
 You should switch!
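The 1/3 vs 2/3 answer is easy to confirm with a Monte Carlo simulation (a sketch; the names are ours):

```python
import random

random.seed(1)

def play(switch):
    doors = [1, 2, 3]
    car = random.choice(doors)
    pick = random.choice(doors)
    # The host opens a goat door that is not your pick.
    opened = random.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

n = 20_000
switch_wins = sum(play(True) for _ in range(n)) / n
stay_wins = sum(play(False) for _ in range(n)) / n
print(switch_wins, stay_wins)  # near 2/3 and 1/3
```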
Continuous Random Variables
 What if X is continuous?
 Probability density function (pdf) instead of
probability mass function (pmf)
 A pdf is any function f(x) that describes the probability density in terms of the input variable x
PDF
 Properties of pdf
 f(x) ≥ 0, ∀x
 ∫ f(x) dx = 1
 Is f(x) ≤ 1 always? No: a density can exceed 1; only its integral must equal 1
 Actual probability can be obtained by taking the integral of the pdf
 E.g. the probability of X being between 0 and 1 is P(0 ≤ X ≤ 1) = ∫₀¹ f(x) dx
Cumulative Distribution Function
 FX  v   P  X  v 
 Discrete RVs
 FX  v    vi
P  X  vi 
 Continuous RVs
v
 F v 
X   

f  x  dx
d
 FX  x   f  x 
dx
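For a discrete RV the CDF is a cumulative sum of the pmf; a sketch using a 4-flip binomial in the spirit of the coin examples (function names are ours):

```python
from math import comb

# pmf of X ~ Bin(4, 0.5): the number of heads in 4 fair flips.
def pmf(i, n=4, p=0.5):
    return comb(n, i) * p**i * (1 - p)**(n - i)

# F_X(v) = P(X <= v) = sum of P(X = i) for i <= v.
def cdf(v, n=4):
    return sum(pmf(i, n) for i in range(0, int(v) + 1))

print(cdf(2))  # P(X <= 2) = (1 + 4 + 6) / 16 = 0.6875
```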
Common Distributions
 Normal: X ~ N(μ, σ²)
 f(x) = (1 / (√(2π) σ)) exp(−(x − μ)² / (2σ²)), x ∈ ℝ
 E.g. the height of the entire population
 [Plot: standard normal pdf f(x) for x ∈ [−5, 5], peaking at about 0.4 at x = 0]
Common Distributions cont.
 Beta: X ~ Beta(α, β)
 f(x; α, β) = x^(α−1) (1 − x)^(β−1) / B(α, β), x ∈ [0, 1]
 α = β = 1: uniform distribution between 0 and 1
 E.g. the conjugate prior for the parameter p in the Binomial distribution
 [Plot: Beta pdf f(x) for x ∈ [0, 1]]
Joint Distribution
 Given two continuous RVs X and Y, the joint pdf can be written as fX,Y(x, y)
 ∫x ∫y fX,Y(x, y) dx dy = 1
Multivariate Normal
 Generalization to higher dimensions of the
one-dimensional normal
 fX(x1, …, xd) = (1 / ((2π)^(d/2) |Σ|^(1/2))) exp(−(1/2) (x − μ)ᵀ Σ⁻¹ (x − μ))
 μ: mean vector; Σ: covariance matrix
Moments
 Mean (Expectation): μ = E[X]
 Discrete RVs: E[X] = Σ_{vi} vi P(X = vi)
 Continuous RVs: E[X] = ∫ x f(x) dx
 Variance: V(X) = E[(X − μ)²]
 Discrete RVs: V(X) = Σ_{vi} (vi − μ)² P(X = vi)
 Continuous RVs: V(X) = ∫ (x − μ)² f(x) dx
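The discrete-case definitions translate directly; a sketch computing the mean and variance of a fair die (an example of our own):

```python
# Mean and variance of a discrete RV, straight from the definitions.
def mean(pmf):
    return sum(v * p for v, p in pmf.items())

def variance(pmf):
    mu = mean(pmf)
    return sum((v - mu) ** 2 * p for v, p in pmf.items())

# A fair six-sided die: values 1..6, each with probability 1/6.
die = {v: 1 / 6 for v in range(1, 7)}
print(mean(die))      # approximately 3.5
print(variance(die))  # approximately 35/12 = 2.9167
```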
Properties of Moments
 Mean
 E[X + Y] = E[X] + E[Y]
 E[aX] = a E[X]
 If X and Y are independent, E[XY] = E[X] E[Y]
 Variance
 V(aX + b) = a² V(X)
 If X and Y are independent, V(X + Y) = V(X) + V(Y)
Moments of Common Distributions
 Uniform X ~ U{1, …, N}: mean (1 + N)/2; variance (N² − 1)/12
 Binomial X ~ Bin(n, p): mean np; variance np(1 − p)
 Normal X ~ N(μ, σ²): mean μ; variance σ²
 Beta X ~ Beta(α, β): mean α/(α + β); variance αβ/((α + β)²(α + β + 1))
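These closed forms can be cross-checked against the definitions; a sketch for the binomial case (our own toy parameters):

```python
from math import comb

# Binomial(n, p): compare mean/variance computed from the pmf
# definitions against the closed forms np and np(1 - p).
n, p = 10, 0.3
pmf = {i: comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)}

mu = sum(i * q for i, q in pmf.items())
var = sum((i - mu) ** 2 * q for i, q in pmf.items())

print(mu, n * p)             # both 3.0 up to rounding
print(var, n * p * (1 - p))  # both 2.1 up to rounding
```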
Probability of Events
 X denotes an event that could possibly happen
 E.g. X=“you will fail in this course”
 P(X) denotes the likelihood that X happens, or
X=true
 What’s the probability that you will fail in this
course?
 Ω denotes the entire event set: Ω = {X, ¬X}
The Axioms of Probability
 0 ≤ P(X) ≤ 1
 P(Ω) = 1
 P(X1 ∨ X2 ∨ …) = Σi P(Xi), where the Xi are disjoint events
 Useful rules
 P(X1 ∨ X2) = P(X1) + P(X2) − P(X1 ∧ X2)
 P(¬X) = 1 − P(X)
Interpreting the Axioms

 [Venn diagram: events X1 and X2 shown as regions of the sample space Ω]