Random Variable

MATH 2411, Spring 2024

Yu Hu, MATH and LIFS, HKUST

1
Outline
• Definition and types of random variables
• Distributions of random variables
• Expectation, variance
• Special random variables
(Each topic is covered for both discrete and continuous random variables.)

2
Definition of a random variable
• Random experiment → an outcome in sample space
• We need random variables to work with numerical data.

Definition: A random variable (rv) X is a function on sample space Ω that associates a
real number X(ω) with each element in Ω.

• After an experiment, the random variable X will take a particular value or realization x. This
describes how (numerical) data is generated by a probability model.

Example diagram: toss two coins and let X(ω) = number of “H”. The four possible
outcomes in the sample space map to three possible values on the real number line:

HH → 2, HT → 1, TH → 1, TT → 0.

3
Definition of a random variable
• Example: Roll two dice, and define X to be the sum of the faces.
• Q: What is P(X = 6)?
• The following outcomes lead to X = 6: (1,5), (2,4), (3,3), (4,2), (5,1). Since Ω is a
sample space with equally likely outcomes,

P(X = 6) = P({(1,5), (2,4), (3,3), (4,2), (5,1)}) = 5/36.

• Define the event X⁻¹(6) = {(1,5), (2,4), (3,3), (4,2), (5,1)} as the pre-image of 6 under the
function X.
• More generally, P(a < X ≤ b) = P(X⁻¹((a, b]))

4
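The enumeration above can be double-checked with a short script. (Python is used here as an illustration; it is not part of the course materials.)

```python
from fractions import Fraction
from itertools import product

# Enumerate the 36 equally likely outcomes of rolling two dice
# and collect the pre-image X^{-1}(6): outcomes whose faces sum to 6.
outcomes = list(product(range(1, 7), repeat=2))
favorable = [w for w in outcomes if sum(w) == 6]

p = Fraction(len(favorable), len(outcomes))
print(favorable)  # [(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)]
print(p)          # 5/36
```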
Definition of a random variable
• Example: We randomly choose a point (x, y) in the square region spanned by (0,0), (0,1), (1,0),
(1,1), and define a random variable Z = x + y.

• Q: What is P(Z ≤ 1)? (The event {Z ≤ 1} is the triangle below the line x + y = 1, which has
area 1/2, so P(Z ≤ 1) = 1/2.)

5
Distribution of a random variable
• Example: Roll two dice, and define X to be the sum of the faces.
• More generally, P(a < X ≤ b) = P(X⁻¹((a, b]))
• Example: P(3 ≤ X ≤ 4) = P({(1,2), (2,1), (1,3), (2,2), (3,1)}) = 5/36
• Example: P(X ≤ 4) = P({(1,1), (1,2), (2,1), (1,3), (2,2), (3,1)}) = 6/36 = 1/6
• Alternatively: P(X ≤ 4) = P(X = 2) + P(X = 3) + P(X = 4) = 1/36 + 2/36 + 3/36 = 1/6
• It is easy to see we can answer any probability question regarding X if we know P(X = x) for
every possible value x:

x     2     3     4     5     6     7     8     9     10    11    12
p(x)  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

• This table is called the distribution, or probability mass function, of random variable X.

6
Probability mass function
• The range of a rv (same as the range of a function), RX, is the set of all its
possible values.
• The pmf specifies the distribution of a rv when it takes finitely or countably
infinitely many possible values, in which case we call X a discrete random variable.
• When the range of a rv consists of intervals on the real line, we call it a
continuous random variable.

Definition: The probability mass function (pmf) is p(x) = P(X = x), for each
possible value x of X.

Proposition: For all x ∈ RX, 0 < p(x) ≤ 1 and ∑_{x∈RX} p(x) = 1. A pmf satisfying
these conditions defines a discrete rv.

7
Probability mass function
Definition: The probability mass function (pmf) is p(x) = P(X = x), for each
possible value x of X.

Proposition: For all x ∈ RX, 0 < p(x) ≤ 1 and ∑_{x∈RX} p(x) = 1. A pmf satisfying
these conditions defines a discrete rv.

• Example: we toss a fair coin twice; let X be the number of heads in these
tosses.
• Find the pmf of X.
• Check that your pmf satisfies the properties of a pmf.

Range of X = {0, 1, 2}
p(0) = 1/4
p(1) = 1/2
p(2) = 1/4

8
Cumulative distribution function
• As an alternative to the pmf, we can consider the cumulative distribution function (cdf),
which works for both discrete and continuous rvs.

Definition: The cumulative distribution function (cdf) F : ℝ → [0,1] of a rv is

F(x) = P(X ≤ x).

• P(a < X ≤ b) = F(b) − F(a)

• For a discrete rv, e.g., the two-dice example, is its cdf defined for values outside
the possible values of X? What is F(−3.2)?

9
Distribution for a discrete rv
Definition: The cumulative distribution function (cdf) F : ℝ → [0,1] of a rv is F(x) = P(X ≤ x).

• Example: Find the cdf of the rv with the pmf of the two-dice sum:

         0,      if x < 2
         1/36,   if 2 ≤ x < 3
         3/36,   if 3 ≤ x < 4
         6/36,   if 4 ≤ x < 5
         10/36,  if 5 ≤ x < 6
         15/36,  if 6 ≤ x < 7
F(x) =   21/36,  if 7 ≤ x < 8
         26/36,  if 8 ≤ x < 9
         30/36,  if 9 ≤ x < 10
         33/36,  if 10 ≤ x < 11
         35/36,  if 11 ≤ x < 12
         1,      if 12 ≤ x

10
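The cdf above is just the running sum of the pmf. A minimal sketch of this accumulation (illustrative Python, not from the slides; note `Fraction` reduces 15/36 to 5/12):

```python
from fractions import Fraction
from itertools import product

# pmf of X = sum of two fair dice, built by enumerating the 36 outcomes
pmf = {}
for a, b in product(range(1, 7), repeat=2):
    pmf[a + b] = pmf.get(a + b, 0) + Fraction(1, 36)

# cdf F(x) = P(X <= x): running sum of the pmf over sorted values
cdf, total = {}, Fraction(0)
for x in sorted(pmf):
    total += pmf[x]
    cdf[x] = total

print(cdf[6])   # 5/12 (i.e. 15/36)
print(cdf[12])  # 1
```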
Cumulative distribution function
Example: The distribution of the number X of mortgages approved per week at the local branch office of a
bank is given below:

x     0     1     2     3     4     5     6
p(x)  0.10  0.10  0.20  0.30  0.15  0.10  0.05

1) What is the probability that on a given week fewer than 4 home mortgages have been approved?
2) What is the probability that on a given week more than 2 but no more than 5 home mortgages have been approved?
3) Draw the cdf of random variable X.

Solution:
1) P(X < 4) = P(X=0) + P(X=1) + P(X=2) + P(X=3) = 0.7

2) P(2 < X ≤ 5) = P(X=3) + P(X=4) + P(X=5) = 0.55

3) The cdf:

         0,     if x < 0
         0.1,   if 0 ≤ x < 1
         0.2,   if 1 ≤ x < 2
F(x) =   0.4,   if 2 ≤ x < 3
         0.7,   if 3 ≤ x < 4
         0.85,  if 4 ≤ x < 5
         0.95,  if 5 ≤ x < 6
         1,     if 6 ≤ x

11
Cumulative distribution function
Properties of the cdf:
• Non-decreasing, F(−∞) = 0, F(+∞) = 1
• Right-continuous: lim_{x→a+} F(x) = F(a).
• For a discrete rv, how to find the pmf given the cdf?

• Example: Given the following cdf of a discrete rv, find its pmf. The cdf jumps at
RX = {1, 2, 4}: F(x) = 1/7 on [1, 2), 3/7 on [2, 4), and 1 on [4, ∞). Each pmf value is
the size of the jump at that point:

x     1    2    4
p(x)  1/7  2/7  4/7

12
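Recovering a pmf from the jump sizes of a discrete cdf can be sketched in a few lines. (Illustrative Python; the cdf values 1/7, 3/7, 1 are the ones from the example above.)

```python
from fractions import Fraction

# cdf of the example rv with range {1, 2, 4}: value just after each jump
F = {1: Fraction(1, 7), 2: Fraction(3, 7), 4: Fraction(1)}

# pmf at each point of the range = size of the cdf's jump there
pmf, prev = {}, Fraction(0)
for x in sorted(F):
    pmf[x] = F[x] - prev
    prev = F[x]

print(pmf)  # {1: Fraction(1, 7), 2: Fraction(2, 7), 4: Fraction(4, 7)}
```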
Cumulative distribution function
Definition: The cumulative distribution function (cdf) F : ℝ → [0,1] of a rv is

F(x) = P(X ≤ x).

• P(a < X ≤ b) = F(b) − F(a)

• For the rv on page 5 (Z = x + y):
• Find the cdf F(z):

         0,               z ≤ 0
         z²/2,            0 < z ≤ 1
F(z) =   1 − (2 − z)²/2,  1 < z ≤ 2
         1,               z > 2

• Find P(0.5 < Z ≤ 1.5):

P(0.5 < Z ≤ 1.5) = F(1.5) − F(0.5) = (1 − 1/8) − 1/8 = 3/4

13
Outline
• Definition and types of random variables
• Distributions of random variables
• Expectation, variance
• Special random variables
(Each topic is covered for both discrete and continuous random variables.)

14
Population mean / Expectation
Definition: The (population) mean or expectation of a discrete rv X is defined as

E(X) = ∑_{x∈RX} x p(x),

where p(x) is its pmf, provided ∑_{x∈RX} |x| p(x) < +∞, which guarantees that E(X) exists.

• The mean is usually denoted by μX, μ, or EX.

• The mean describes the center of the distribution of rv X.

15
Population mean / Expectation
Definition: The (population) mean or expectation of a discrete rv X is defined as

E(X) = ∑_{x∈RX} x p(x),

where p(x) is its pmf, provided ∑_{x∈RX} |x| p(x) < +∞, which guarantees that E(X) exists.

• The mean is usually denoted by μX, μ, or EX.

• The mean describes the center of the distribution of rv X.
• Population mean vs sample mean:

Let the data values be x1, x2, x3, …, xn; then the sample mean is x̄ = (1/n) ∑_{i=1}^{n} xi.

16
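The population mean vs sample mean distinction can be seen numerically: the population mean is fixed by the pmf, while the sample mean varies with the data. (A quick Python sketch, not part of the slides, using the two-coin-toss rv with E(X) = 1.)

```python
import random

# Population mean of X = number of heads in two fair-coin tosses: E(X) = 1
pmf = {0: 0.25, 1: 0.5, 2: 0.25}
population_mean = sum(x * p for x, p in pmf.items())

# The sample mean changes with the data but approaches E(X) for large samples
random.seed(0)
data = [sum(random.random() < 0.5 for _ in range(2)) for _ in range(10_000)]
sample_mean = sum(data) / len(data)

print(population_mean)              # 1.0
print(abs(sample_mean - 1) < 0.05)  # True
```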
Relation with statistics
• Probability: population → data generating process → observed data (samples).
• Statistics: inference from the observed data back to the population. (Figure adapted from
Wasserman, All of Statistics.)

• Population: tossing the coin to get a Head or Tail (H or T).
  Parameter: probability of Head.
• Samples: results over 10 tosses: HHTHTHHHTH. Fraction of H’s: 7/10.

The value of the sample mean changes with the data we have.

The value of the population mean is a property of the rv and does not change with data.

17
Population mean / Expectation
Example: X is a discrete rv with range {−1, 0, 1}. The pmf of X is: p(−1) = 0.4, p(0) = 0.2, p(1) = 0.4.
• Find E(X). E(X) = (−1)(0.4) + (0)(0.2) + (1)(0.4) = 0.
• Note X² is also a discrete rv (why?); find E(X²).
• Method 1: Find the pmf of X². Its range is {0, 1}. P(X² = 0) = P(X = 0) = 0.2,
P(X² = 1) = P(X = −1 or X = 1) = 0.8.
E(X²) = 0 × 0.2 + 1 × 0.8 = 0.8
• Method 2:
E(X²) = 0 × 0.2 + 1 × 0.8 = 0 × P(X = 0) + 1 × (P(X = −1) + P(X = 1))
      = 0² × P(X = 0) + (−1)² × P(X = −1) + (1)² × P(X = 1)

Proposition: For a discrete rv X and a continuous function g(x), g(X) defines a discrete rv, and its
mean, if it exists, can be calculated as

E(g(X)) = ∑_{x∈RX} g(x) p(x).

18
Properties of E(X)
Proposition (linearity): For any real numbers a, b,

1. E(aX + b) = aE(X) + b

Proof: E(aX + b) = ∑_x (ax + b) p(x)
                 = ∑_x (ax p(x) + b p(x))
                 = a ∑_x x p(x) + b ∑_x p(x)
                 = aE(X) + b

19
Population variance
Definition: The (population) variance of a discrete rv X is defined as

Var(X) = ∑_{x∈RX} (x − μ)² p(x) = E((X − μ)²),

where p(x) is its pmf and μ = E(X), provided ∑_{x∈RX} x² p(x) < +∞, which guarantees that μ and Var(X) exist.

• The variance is usually denoted by σX², σ²; the population standard deviation σ is its square root:
SD(X) := √Var(X).

• The variance describes the dispersion of the distribution of rv X.

• Population variance vs sample variance:

Let the data values be x1, x2, x3, …, xn; then the sample variance is
s²_{n−1} = (1/(n − 1)) ∑_{i=1}^{n} (xi − x̄)².

• Sample standard deviation: s_{n−1} = √(s²_{n−1})

20
Relation with statistics
• Probability: population (tossing the coin to get a Head or Tail; parameter: probability of Head)
→ data generating process → observed data (samples: results over 10 tosses, HHTHTHHHTH;
fraction of H’s: 7/10).
• Statistics: inference from the observed data back to the population. (Figure adapted from
Wasserman, All of Statistics.)

21
Population variance
Definition: The (population) variance of a discrete rv X is defined as

Var(X) = ∑_{x∈RX} (x − μ)² p(x) = E((X − μ)²),

where p(x) is its pmf and μ = E(X), provided ∑_{x∈RX} x² p(x) < +∞, which guarantees that μ and Var(X) exist.

Proposition: Var(X) = E(X²) − (E(X))²
Proof:
E((X − μ)²) = ∑_x (x − μ)² p(x) = ∑_x (x² − 2μx + μ²) p(x)
            = ∑_x x² p(x) − 2μ ∑_x x p(x) + μ² ∑_x p(x)
            = E(X²) − 2μ ∑_x x p(x) + μ² ∑_x p(x)
            = E(X²) − 2μ ⋅ μ + μ²
            = E(X²) − (E(X))²

22
Population variance
Var(X) = ∑_{x∈RX} (x − μ)² p(x) = E((X − μ)²)

Var(X) = E(X²) − (E(X))²

• Example: Tossing a fair coin twice, let X be the number of heads. Find E(X) and Var(X).

Pmf of X: P(X=0) = 1/4, P(X=1) = 1/2, P(X=2) = 1/4

E(X) = (0)(0.25) + (1)(0.5) + (2)(0.25) = 1

E(X²) = (0²)(0.25) + (1²)(0.5) + (2²)(0.25) = 1.5

Var(X) = E(X²) − (E(X))² = 1.5 − 1 = 0.5

23
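The shortcut formula Var(X) = E(X²) − (E(X))² is easy to verify on this example with a few lines of Python (an illustration, not part of the slides):

```python
# X = number of heads in two fair-coin tosses
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

EX  = sum(x * p for x, p in pmf.items())     # E(X)
EX2 = sum(x**2 * p for x, p in pmf.items())  # E(X^2), via E(g(X)) = sum g(x) p(x)
var = EX2 - EX**2                            # Var(X) = E(X^2) - (E(X))^2

print(EX, EX2, var)  # 1.0 1.5 0.5
```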
Population variance
• Example: Which stock to choose? Suppose we are given the estimated probability distribution of the yearly
returns of stocks A and B (the returns table is not reproduced here; its summary statistics are
E(X) = 2.5%, σ(X) = 1.02% for stock A and E(Y) = 4%, σ(Y) = 4.10% for stock B).

• Q: Find the expected returns for the stocks.

24
Population variance
• Example: Which stock to choose? Using the same distribution of the yearly returns of stocks A and B:

• Q: Find the standard deviations of the stocks.

• Q: Assume the risk-free return is 1.5%; calculate the Sharpe ratio of the two stocks.

• Sharpe ratio: (E(X) − R0) / σX, where R0 is the risk-free investment return.

Stock A: (E(X) − 1.5%) / σ(X) = (2.5% − 1.5%) / 1.02% = 0.98
Stock B: (E(Y) − 1.5%) / σ(Y) = (4% − 1.5%) / 4.10% = 0.61

Stock A is better considering the trade-off between risk and return.

25
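The two Sharpe ratios can be reproduced with a tiny helper (a Python sketch; the function name `sharpe` is ours, not from the slides):

```python
# Sharpe ratio (E(X) - R0) / sigma_X, with returns and sds in percent
def sharpe(mean_return, sd_return, risk_free=1.5):
    return (mean_return - risk_free) / sd_return

print(round(sharpe(2.5, 1.02), 2))  # 0.98 (stock A)
print(round(sharpe(4.0, 4.10), 2))  # 0.61 (stock B)
```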
Properties of Var(X)
Proposition: For any real numbers a, b,

Var(aX + b) = a² Var(X)

Proof: Denote E(X) = μ,

Var(aX + b) = E[(aX + b − E(aX + b))²]
            = E[(aX + b − aμ − b)²]
            = E(a² (X − μ)²)
            = a² E((X − μ)²)
            = a² Var(X)
Properties of E(X) and Var(X)
Proposition: For any real numbers a, b,

1. E(aX + b) = aE(X) + b

2. Var(aX + b) = a² Var(X)

Example: Recall the stock problem: we have an expected return of 4% with a standard
deviation of 4.1% for stock B. Now consider a portfolio where we invest 0.7
of the fund in stock B and the remaining 0.3 of the fund in a 1.5% fixed-interest time deposit.

Q: What are the expected return and standard deviation of the portfolio?

Portfolio Z = 0.7Y + 0.3 × 1.5

E(Z) = 0.7 × 4 + 0.3 × 1.5 = 3.25

Var(Z) = 0.7² Var(Y), so SD(Z) = 0.7 × SD(Y) = 0.7 × 4.1 = 2.87

27
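The two linearity rules above are all the computation needs; a minimal sketch (illustrative Python, returns in percent as on the slide):

```python
# Portfolio Z = 0.7 Y + 0.3 * 1.5, with E(Y) = 4, SD(Y) = 4.1 (percent)
EY, SDY = 4.0, 4.1

EZ = 0.7 * EY + 0.3 * 1.5  # E(aY + b) = a E(Y) + b
var_Z = 0.7**2 * SDY**2    # Var(aY + b) = a^2 Var(Y)
SDZ = var_Z ** 0.5         # equals 0.7 * SD(Y)

print(round(EZ, 2), round(SDZ, 2))  # 3.25 2.87
```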
Outline
• Definition and types of random variables
• Distributions of random variables
• Expectation, variance
• Special random variables
(Each topic is covered for both discrete and continuous random variables.)

28
Probability mass function
Definition: The probability mass function (pmf) is p(x) = P(X = x), for each
possible value x of X.

• The range of a rv (same as the range of a function), RX, is the set of all its
possible values.
• The pmf specifies the distribution of a rv when it takes finitely or countably
infinitely many possible values, in which case we call X a discrete random variable.
• When the range of a rv consists of intervals on the real line, we call it a
continuous random variable.
• Is the pmf useful for a continuous rv? What is P(Z = z) for a continuous rv Z?

29
Probability density function
• For a continuous rv, P(X = x) is never physically observed. Instead we consider the
“tendency” for X to be near x:

f(x) := lim_{Δ→0} P(x − Δ/2 < X ≤ x + Δ/2) / Δ = lim_{Δ→0} [F(x + Δ/2) − F(x − Δ/2)] / Δ = F′(x)

30
Probability density function
Definition: A non-negative function f(x) ≥ 0 is called a probability density function (pdf) of a
rv X if, for any a ≤ b (can also be ±∞),

P(a < X ≤ b) = ∫_a^b f(x) dx.

• Proposition: ∫_{−∞}^{+∞} f(x) dx = 1. Along with non-negativity f(x) ≥ 0, these are the
requirements for a pdf to define a continuous rv.

• The cdf is F(a) = ∫_{−∞}^{a} f(x) dx, and at any point a where f is continuous, F is
differentiable with dF(x)/dx |_{x=a} = f(a).

• P(X = a) = ∫_a^a f(x) dx = 0 for any a.

31
Probability density function
Definition: A non-negative function f(x) ≥ 0 is called a probability density function (pdf) of a rv X if, for any
a ≤ b (can also be ±∞),

P(a < X ≤ b) = ∫_a^b f(x) dx.

• Example: Let X be a number that is randomly chosen within [0,1].

• Find a pdf for X.

For all x ∈ [0,1], f(x) = c. We need ∫_0^1 f(x) dx = c = 1, so f(x) = 1 on [0,1] (and 0 elsewhere).

• Find the cdf using this pdf.

For x ∈ [0,1], F(x) = ∫_{−∞}^{x} f(t) dt = x. So

         0,  x < 0
F(x) =   x,  0 ≤ x < 1
         1,  x ≥ 1

• Can we modify the pdf’s value at discrete points?
• Yes: this does not change the probability or distribution of the continuous rv, because an
integral over discrete points is 0.

32
Probability density function
Definition: A non-negative function f(x) ≥ 0 is called a probability density function (pdf) of a rv X if, for any
a ≤ b (can also be ±∞),

P(a < X ≤ b) = ∫_a^b f(x) dx.

• Example: Z is the rv defined on page 5. Find its pdf.

         0,               z ≤ 0
         z²/2,            0 < z ≤ 1
F(z) =   1 − (2 − z)²/2,  1 < z ≤ 2
         1,               z > 2

Differentiating F gives

         0,      z ≤ 0
         z,      0 < z ≤ 1
f(z) =   2 − z,  1 < z ≤ 2
         0,      z > 2

33
Probability density function
Example

1) Find the value of c.


2) For any a and b in (0,2], calculate P(a ≤ X ≤ b)
3) Find and draw the cdf of X.

Solution:

34
Population mean / Expectation
Definition: The (population) mean or expectation of a continuous rv X is defined as

E(X) = ∫_{−∞}^{+∞} x f(x) dx,

where f(x) is its pdf, provided ∫_{−∞}^{+∞} |x| f(x) dx < +∞, which guarantees that E(X) exists.

• The mean is usually denoted by μX, μ, or EX.

• The mean describes the center of the distribution of rv X.

35
Population variance
Definition: The (population) variance of a continuous rv X is defined as

Var(X) = ∫_{−∞}^{+∞} (x − μ)² f(x) dx = E((X − μ)²),

where f(x) is its pdf and μ = E(X), provided ∫_{−∞}^{+∞} x² f(x) dx < +∞, which guarantees
that μ and Var(X) exist.

• The variance is usually denoted by σX², σ²; the population standard deviation σ is its
square root.

• The variance describes the dispersion of the distribution of rv X.

• Population standard deviation: SD(X) := √Var(X)

36
Population mean / Expectation
Proposition: For a continuous rv X and a continuous function g(x), g(X) defines a rv, and its
mean, if it exists, can be calculated as

E(g(X)) = ∫_{−∞}^{+∞} g(x) f(x) dx.

Example: X is a rv with pdf f(x) = cx² for −1 ≤ x ≤ 2, and 0 elsewhere.

• Find E(X²)

Solution: First determine c: 1 = ∫_{−1}^{2} cx² dx = c [x³/3]_{−1}^{2} = 3c. So c = 1/3.

E(X²) = ∫_{−1}^{2} x² ⋅ (1/3) x² dx = (1/3) [x⁵/5]_{−1}^{2} = 33/15 = 11/5.

37
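Both the normalization c = 1/3 and E(X²) = 11/5 can be checked by numerical integration (a Python sketch using a simple midpoint rule; not part of the slides):

```python
def integrate(f, a, b, n=100_000):
    # midpoint rule: accurate enough for smooth integrands like polynomials
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# pdf f(x) = c x^2 on [-1, 2]: c is fixed by requiring the pdf to integrate to 1
c = 1 / integrate(lambda x: x**2, -1, 2)

# E(X^2) = integral of x^2 * f(x) over [-1, 2]
EX2 = integrate(lambda x: x**2 * c * x**2, -1, 2)

print(round(c, 4))    # 0.3333
print(round(EX2, 4))  # 2.2  (= 11/5)
```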
Mean and Variance of Continuous rvs
• Example: Find the mean and variance of the random variable X having the following pdf:

f(x) = 1/(b − a) for a ≤ x ≤ b, and 0 otherwise (every value in the range is equally likely).

Here a < b are given parameters of the distribution. X is said to have a uniform distribution over [a, b].

Solution: E(X) = ∫_a^b x/(b − a) dx = (a + b)/2.
E(X²) = ∫_a^b x²/(b − a) dx = (a² + ab + b²)/3, so
Var(X) = E(X²) − (E(X))² = (b − a)²/12.

38
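The closed forms (a + b)/2 and (b − a)²/12 can be sanity-checked against simulated data (illustrative Python; the choice a = 2, b = 5 is ours):

```python
import random

# Uniform on [a, b]: E(X) = (a + b)/2, Var(X) = (b - a)^2 / 12
a, b = 2.0, 5.0
random.seed(1)
xs = [random.uniform(a, b) for _ in range(200_000)]

# sample mean and sample variance (the n-1 version from the slides)
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

print(abs(mean - (a + b) / 2) < 0.02)         # True
print(abs(var - (b - a) ** 2 / 12) < 0.02)    # True
```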
Mean and Variance of Continuous rvs
Example: X is a rv with pdf f(x) = cx² for −1 ≤ x ≤ 1, and 0 elsewhere.

• Find Var(X)
• Find the pdf of the rv Y = X⁴.

Solution: (1) First determine c: 1 = ∫_{−1}^{1} cx² dx = c [x³/3]_{−1}^{1} = 2c/3. So c = 3/2.

E(X) = 0 (the pdf is symmetric), so Var(X) = E[(X − 0)²] = ∫_{−1}^{1} (3/2) x⁴ dx = 3/5.

(2) Find the cdf of Y first. Note 0 ≤ Y ≤ 1. For y ∈ [0,1],

F(y) = P(X⁴ ≤ y) = P(−y^{1/4} ≤ X ≤ y^{1/4}) = ∫_{−y^{1/4}}^{y^{1/4}} (3/2) x² dx = y^{3/4}.

Thus the pdf is f(y) = dF(y)/dy = (3/4) y^{−1/4} for y ∈ (0,1], and f(y) = 0 for y < 0 or y > 1.

39
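The derived cdf F(y) = y^(3/4) can be checked by simulation: since the cdf of X is F_X(x) = (x³ + 1)/2, X can be sampled by inverse transform as X = (2U − 1)^(1/3). (A Python sketch under those assumptions; the helper `cbrt` handles negative cube roots.)

```python
import math
import random

def cbrt(v):
    # real cube root, valid for negative inputs too
    return math.copysign(abs(v) ** (1 / 3), v)

# Sample X with pdf (3/2) x^2 on [-1, 1] via inverse transform, then Y = X^4
random.seed(2)
ys = [cbrt(2 * random.random() - 1) ** 4 for _ in range(100_000)]

# Compare the empirical cdf of Y with the derived F(y) = y^(3/4)
for y in (0.2, 0.5, 0.9):
    emp = sum(v <= y for v in ys) / len(ys)
    print(abs(emp - y ** 0.75) < 0.01)  # True
```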
Comparison between discrete and continuous rvs

                Discrete Random Variable                Continuous Random Variable

Distribution    Probability mass function p(x)          Probability density function f(x)
Requirements    0 ≤ p(x) ≤ 1, ∑_x p(x) = 1              f(x) ≥ 0, ∫_{−∞}^{∞} f(x) dx = 1
Cdf             F(x) = ∑_{t≤x} p(t) = P(X ≤ x)          F(x) = ∫_{−∞}^{x} f(t) dt = P(X ≤ x)
Expectation     E(X) = ∑_x x p(x)                       E(X) = ∫_{−∞}^{∞} x f(x) dx
                E(g(X)) = ∑_x g(x) p(x)                 E(g(X)) = ∫_{−∞}^{∞} g(x) f(x) dx
Variance        Var(X) = ∑_x (x − μX)² p(x)             Var(X) = ∫_{−∞}^{∞} (x − μX)² f(x) dx
                       = E((X − μX)²)                          = E((X − μX)²)

40
Properties of E(X) and Var(X)

Proposition: For any real numbers a, b,

1. E(aX + b) = aE(X) + b

2. Var(X) = E(X²) − (E(X))²     (same as for discrete rvs)

3. Var(aX + b) = a² Var(X)

41
Properties of E(X) and Var(X)

Proposition: For any discrete rvs X and Y,


E(X + Y) = E(X) + E(Y)

Proof requires the concept of joint pmf: p(x, y) := P({X = x} ∩ {Y = y}).

42
Joint distribution
• Example: We have a bag with 2 cards with “0” on them and 3 cards with “1” on them. We
draw two cards sequentially from the bag without replacement. Let X be the number on the first
draw and Y the number on the second draw.

• What is the probability that X = 1 and Y = 1?

• This can be answered by calculating P({X = 1} ∩ {Y = 1}). More generally this is described
by the joint distribution of (X, Y).

Definition: For two discrete rvs X, Y with ranges RX, RY, their joint pmf is

p(x, y) := P({X = x} ∩ {Y = y}), x ∈ RX, y ∈ RY.

We have 0 ≤ p(x, y) ≤ 1 and ∑_{x∈RX, y∈RY} p(x, y) = 1.

        y = 0   y = 1
x = 0   1/10    3/10
x = 1   3/10    3/10

43
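The joint pmf table can be reproduced by enumerating all ordered draws from the bag (an illustrative Python sketch, not part of the slides):

```python
from fractions import Fraction
from itertools import permutations

# Bag: cards 0, 0, 1, 1, 1; draw two cards in order without replacement
cards = [0, 0, 1, 1, 1]
joint = {}
for i, j in permutations(range(5), 2):  # 5 * 4 = 20 equally likely ordered draws
    xy = (cards[i], cards[j])
    joint[xy] = joint.get(xy, 0) + Fraction(1, 20)

print(joint[(1, 1)])        # 3/10
print(sum(joint.values()))  # 1
```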
Joint distribution
• If we sum the joint pmf across the possible values of one rv, we arrive at the pmf of a single rv:

∑_{y∈RY} p(x, y) = ∑_{y∈RY} P({X = x} ∩ {Y = y}) = P({X = x}) = p(x)

Similarly, ∑_{x∈RX} p(x, y) = p(y).

• Example: We have a bag with 2 cards with “0” on them and 3 cards with “1” on them.
We draw two cards sequentially from the bag without replacement. Let X be the number
on the first draw and Y the number on the second draw.

        y = 0   y = 1
x = 0   1/10    3/10
x = 1   3/10    3/10

• Summing across y: p(X=0) = 1/10 + 3/10 = 2/5, p(X=1) = 3/10 + 3/10 = 3/5.

44
Properties of E(X) and Var(X)
Proposition: For any rvs X and Y,
E(X + Y) = E(X) + E(Y)

• The property also holds for continuous rvs using joint pdf.

45
Joint distribution
Definition: For two continuous rvs X, Y, their joint pdf p(x, y) ≥ 0 satisfies

∫_{−∞}^{a} ∫_{−∞}^{b} p(x, y) dy dx = P(X ≤ a, Y ≤ b),

for any a, b. In particular, ∫∫ p(x, y) dx dy = 1.

• If we integrate the joint pdf across one rv, we arrive at the pdf of a single rv:

∫_{−∞}^{∞} p(x, y) dy = p(x),  ∫_{−∞}^{∞} p(x, y) dx = p(y)

46
Independence of random variables
De nition: Two discrete rvs X, Y are independent if for any x ∈ RX, y ∈ RY

p(x, y) = p(x)p(y)

• Recall that for two events A,B, they are independent if P(A ∩ B) = P(A)P(B). Here the
events A = {X = x}, B = {Y = y} are independent.

• Example: Tossing a fair dice twice, X is the face number of the rst toss, Y is the face
number of the second toss. Prove that X and Y are independent.
1
Proof: For any 1 ≤ x, y ≤ 6 , p(x, y) = P(X = x, Y = y) = .
36
1 1
p(x) = P(X = x) = , p(y) = .
6 6
We have p(x, y) = p(x)p(y), thus X, Y are independent.
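The same check can be carried out by brute force over all 36 outcomes; a small Python sketch (an illustration alongside the proof, not part of it):

```python
from fractions import Fraction as F

# Joint pmf of two fair-die tosses: each of the 36 outcomes has probability 1/36
joint = {(x, y): F(1, 36) for x in range(1, 7) for y in range(1, 7)}

# Marginal pmfs of X and Y
p = {x: sum(v for (a, _), v in joint.items() if a == x) for x in range(1, 7)}
q = {y: sum(v for (_, b), v in joint.items() if b == y) for y in range(1, 7)}

# Verify p(x, y) = p(x) p(y) for every pair, i.e. X and Y are independent
independent = all(joint[(x, y)] == p[x] * q[y] for x in range(1, 7) for y in range(1, 7))
print(independent)  # True
```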
47
fi
fi
Independence of random variables
Using joint pmf and joint pdf we can also show the following important property.

Proposition: For two independent rvs X,Y and any continuous functions f(x) and g(y),

E( f(X)g(Y)) = E( f(X))E(g(Y)).

In particular, E(XY) = E(X)E(Y)

Proof (discrete case):
E(f(X)g(Y)) = ∑_{x∈RX, y∈RY} f(x)g(y)p(x, y) = ∑_{x∈RX, y∈RY} f(x)g(y)p(x)p(y)
= ∑_{x∈RX} ∑_{y∈RY} f(x)p(x)g(y)p(y) = ∑_{x∈RX} f(x)p(x) [∑_{y∈RY} g(y)p(y)] = ∑_{x∈RX} f(x)p(x) E(g(Y))
= E(g(Y)) ∑_{x∈RX} f(x)p(x) = E(f(X)) E(g(Y)).
48
Properties of E(X) and Var(X)
Proposition: For two independent rvs X and Y,

Var(X + Y) = Var(X) + Var(Y)

Proof: Denote E(X) = a, E(Y) = b.

Var(X + Y) = E[(X + Y − E(X + Y))²] = E[(X − a + Y − b)²]
= E[(X − a)² + 2(X − a)(Y − b) + (Y − b)²] = Var(X) + 2E[(X − a)(Y − b)] + Var(Y)
= Var(X) + 2E(X − a) ⋅ E(Y − b) + Var(Y)   (by independence)
= Var(X) + Var(Y), since E(X − a) = E(Y − b) = 0.
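The additivity of variance for independent rvs can be checked concretely on two fair-die tosses; a minimal Python sketch (my own example, computed with exact fractions):

```python
from fractions import Fraction as F

faces = list(range(1, 7))
p = F(1, 6)  # fair die

E = sum(x * p for x in faces)               # E(X) = 7/2
Var = sum((x - E) ** 2 * p for x in faces)  # Var(X) = 35/12

# Var(X + Y) computed directly from the joint pmf of two independent tosses
Es = sum((x + y) * p * p for x in faces for y in faces)
Vs = sum((x + y - Es) ** 2 * p * p for x in faces for y in faces)

print(Vs, Var + Var)  # both equal 35/6
```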

49
Properties of E(X) and Var(X) with multiple rvs
Hold for both discrete and continuous rvs

For any rvs:
• E(X + Y) = E(X) + E(Y)
• E(X1 + X2 + ⋯ + Xn) = E(X1) + E(X2) + ⋯ + E(Xn)

For independent rvs:
• Var(X + Y) = Var(X) + Var(Y)
• Var(X1 + X2 + ⋯ + Xn) = Var(X1) + Var(X2) + ⋯ + Var(Xn)
• E(XY) = E(X) ⋅ E(Y)
• E(f(X)g(Y)) = E(f(X)) ⋅ E(g(Y))

50
Outline
• Continuous • Discrete
• Definition and types of random variables
• Distributions of random variables
• Expectation, variance

• Special random variables

51
fi
Binomial distribution
• Consider n (called the size) repeated trials (e.g., flipping a coin) where for each trial the outcome
is success or failure, denoted by 1 and 0.

• The trials are identical, the probability of success is p.


• The trials are independent. P({trial i=1, trial j=1}) = P({trial i=1}) P({trial j=1}), i ≠ j.
• The number of successes in these n trials X is a binomial rv, denoted by X ∼ B(n, p).

• Example: There are 20 MC questions in an exam, each worth 1 pt with four choices.
Suppose you answer the questions by random guessing. Let X be the total score for these
MC questions. Is X a binomial rv?

52
fl
Binomial distribution
• Example: A coin has probability 0.4 to return a head. Flip this coin 3 times, let X be the
number of heads. Find the pmf of X.

• X ~ B(3, 0.4)

    ω                x    p(x)
    TTT              0    1 × 0.6^3
    HTT, THT, TTH    1    3 × 0.4^1 × 0.6^2
    HHT, HTH, THH    2    3 × 0.4^2 × 0.6^1
    HHH              3    1 × 0.4^3
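The table can be reproduced with the binomial pmf formula; a small Python sketch (the course uses R's dbinom, this is the same computation by hand):

```python
from math import comb

n, p = 3, 0.4
# p(x) = C(n, x) p^x (1-p)^(n-x):
# p(0) = 0.6^3, p(1) = 3*0.4*0.6^2, p(2) = 3*0.4^2*0.6, p(3) = 0.4^3
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
print(pmf)
```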

53
Binomial distribution
Pmf of B(n, p):

p(x) = P(X = x) = (n choose x) p^x (1 − p)^(n − x),  x = 0, 1, …, n.

• Number of ways to choose x items from n distinct items:

(n choose x) = n! / (x!(n − x)!) = n(n − 1)⋯(n − x + 1) / x!,

• k! = k × (k − 1) × ⋯ × 2 × 1

• ∑_{x=0}^{n} p(x) = ∑_{x=0}^{n} (n choose x) p^x (1 − p)^(n − x) = (p + (1 − p))^n = 1.
• Expand (x + y)^3 as an example.
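Spelled out, the binomial theorem with n = 3 reads:

```latex
(x+y)^3 = \binom{3}{0}y^3 + \binom{3}{1}x\,y^2 + \binom{3}{2}x^2 y + \binom{3}{3}x^3
        = y^3 + 3xy^2 + 3x^2 y + x^3
```

Setting y = 1 − x shows the four binomial probabilities sum to 1, matching the B(3, 0.4) table above.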
54
Binomial distribution
Pmf of B(n, p):

p(x) = P(X = x) = (n choose x) p^x (1 − p)^(n − x),  x = 0, 1, …, n.

• Example: A box contains 4 red balls and 6 black balls. We draw a ball from the box and
then put it back (i.e., with replacement). What is the probability that we draw two balls and
both of them are red?

X ∼ B(2,2/5), Find P(X=2)

In R:
dbinom(2, size=2, prob=0.4)
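The same quantity can be computed by hand from the pmf formula; a minimal Python sketch using exact fractions (the R call above is the course's method):

```python
from fractions import Fraction as F
from math import comb

# P(X = 2) for X ~ B(2, 2/5): both draws red, with replacement
n, p = 2, F(2, 5)
prob = comb(n, 2) * p**2 * (1 - p)**0
print(prob)  # 4/25 = 0.16
```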

55
Binomial distribution
Pmf of B(n, p):

p(x) = P(X = x) = (n choose x) p^x (1 − p)^(n − x),  x = 0, 1, …, n.

• Example: 6 students are asked to randomly pick one number between 0 and 9 inclusively.
Let X be the random variable of the number of students who pick the number “8”. What is
the probability more than one student picks the number “8”?

• X ~ B(6, 0.1)
• P(X > 1) = 1 − P(X ≤ 1) = 1 − P(X=0) − P(X=1) = 1 − 0.9^6 − 6 × 0.9^5 × 0.1 = 0.114265

In R:
1 - pbinom(1, size=6, prob=0.1)
pbinom(1, size=6, prob=0.1, lower.tail=F)

56
Binomial distribution
Pmf of B(n, p):

p(x) = P(X = x) = (n choose x) p^x (1 − p)^(n − x),  x = 0, 1, …, n.

• Example: Flip a coin which has probability p to produce a head. If the result is a head, set X=1;
otherwise set X=0. Is X a binomial rv?

• B(1,p) is also called a Bernoulli rv. It is the outcome of a single 0,1 trial.
• E(X) = p, Var(X) = p(1 − p).
• For a B(n,p) rv X, Let Z1, Z2, …, Zn be the Bernoulli rv for the result of each trial, then
X = Z1 + Z2 + ⋯ + Zn

Importantly, Zi are independent and have the same distribution B(1,p).

57
Binomial distribution
For a B(n,p) rv X, let Z1, Z2, …, Zn be the Bernoulli rv for the result of each trial, then

X = Z1 + Z2 + ⋯ + Zn

Importantly, Zi are independent and have the same distribution B(1,p).

Population mean and variance of B(n,p)

E(X) = np, Var(X) = np(1 − p)

Proof: E(X) = E(Z1 + Z2 + ⋯ + Zn) = E(Z1) + E(Z2) + ⋯ + E(Zn) = np


Var(X) = Var(Z1 + Z2 + ⋯ + Zn) = Var(Z1) + Var(Z2) + ⋯ + Var(Zn) = np(1 − p)
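These formulas can also be verified directly from the pmf for a small case; a Python sketch with exact fractions (n = 6, p = 1/10 is my own illustrative choice):

```python
from fractions import Fraction as F
from math import comb

n, p = 6, F(1, 10)
pmf = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

# E(X) = sum x p(x), Var(X) = E(X^2) - E(X)^2
E = sum(x * q for x, q in pmf.items())
Var = sum(x**2 * q for x, q in pmf.items()) - E**2

print(E, Var)  # E = np = 3/5, Var = np(1-p) = 27/50
```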

58
Binomial distribution
Population mean and variance of B(n,p)

E(X) = np, Var(X) = np(1 − p)

Directly using pmf:

Proof:

59
Binomial distribution
Population mean and variance of B(n,p)

E(X) = np, Var(X) = np(1 − p)

Directly using pmf:


Proof:

60
Poisson distribution

Pmf of B(n, p): p(x) = P(X = x) = (n choose x) p^x (1 − p)^(n − x),  x = 0, 1, …, n.
• What if n is large and p is small?
• Example: Number of identical twins born in Hong Kong each year. About 82,500 newborns
each year, and an identical twin rate of 1 in 250 births. B(82500, 0.004)

• Poisson distribution: n → ∞, p → 0, np → λ, then B(n, p) → Pois(λ)

61
Poisson distribution

Relation to Binomial distribution: When n → ∞, p → 0, np = λ, i.e., B(n, p) = B(n, λ/n). For any fixed x,

lim_{n→∞} P(X = x) = lim_{n→∞} (n choose x) p^x (1 − p)^(n − x)

= lim_{n→∞} [n! / (x!(n − x)!)] (λ/n)^x (1 − λ/n)^(n − x)

= lim_{n→∞} [n(n − 1)⋯(n − x + 1) / n^x] (λ^x / x!) (1 − λ/n)^(n − x)

= (λ^x / x!) e^{−λ}

Pmf of Pois(λ):

P(X = k) = e^{−λ} λ^k / k!,  k = 0, 1, 2, …

62
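The limit above can be seen numerically: the B(n, λ/n) pmf approaches the Poisson pmf as n grows. A Python sketch (λ = 5.5 from the later example; x = 3 and the values of n are my own choices):

```python
from math import comb, exp, factorial

lam, x = 5.5, 3
poisson = exp(-lam) * lam**x / factorial(x)

# B(n, lam/n) pmf at x approaches the Poisson pmf as n grows
for n in (10, 100, 10000):
    p = lam / n
    binom = comb(n, x) * p**x * (1 - p)**(n - x)
    print(n, binom)

print(poisson)
```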
Poisson distribution
• The Poisson distribution can be used to describe the probability of counts occurring over a period of time
or space (which are continuous durations/lengths/areas!)
• Examples:
• The number of shooting stars within an hour.
• The number of traffic accidents occurring on a highway in a day.
• The number of customers arriving on each day.
• The number of typos in a 10 page essay.
• Let X be the number of times some event occurs in a given continuous interval. Then X has a Poisson
distribution with parameter λ > 0 if the following conditions are satisfied:
1. The numbers of changes occurring in non-overlapping intervals are independent.
2. The probability of exactly one event occurring in a sufficiently short interval of length h is
approximately λh.
3. The probability of two or more events occurring in a sufficiently short interval is essentially zero.

[Figure: the interval divided into n subintervals 0, 1/n, 2/n, …, (n−1)/n, 1] If 1/n is small enough, we have a sequence of n Bernoulli trials with probability p ≈ λ/n.

64
Poisson distribution

Population mean and variance of B(n, p): E(X) = np, Var(X) = np(1 − p)


• Poisson distribution: n → ∞, p → 0, np → λ, then
B(n, p) → Pois(λ)
Pmf of Pois(λ):

P(X = k) = e^{−λ} λ^k / k!,  k = 0, 1, 2, …

Population mean and variance of Pois(λ):

E(X) = λ, Var(X) = λ
• Method 1: Use pmf.
E(X) = ∑_{k=0}^{∞} k e^{−λ} λ^k / k! = ∑_{k=1}^{∞} e^{−λ} λ^k / (k − 1)! = λ ∑_{k=1}^{∞} e^{−λ} λ^(k−1) / (k − 1)! = λ
• Method 2: Use binomial rv results.
lim_{np→λ} E(X) = lim_{np→λ} np = λ,  lim_{np→λ, p→0} Var(X) = lim_{np→λ, p→0} np(1 − p) = λ.

65
Poisson distribution

Pmf of Pois(λ): P(X = k) = e^{−λ} λ^k / k!,  k = 0, 1, 2, …

• Example: The number of shooting stars (meteors) seen in an hour follows a Poisson distribution with rate
5.5.

• What is the probability that we see at least 3 shooting stars in an hour?


• P(X ≥ 3) = 1 − P(X = 0) − P(X = 1) − P(X = 2) = 1 − e^{−5.5} (5.5^0/0!) − e^{−5.5} (5.5^1/1!) − e^{−5.5} (5.5^2/2!)
= 1 − e^{−5.5} (1 + 5.5 + 15.125) = 0.912

In R:
> 1 - ppois(2, 5.5)
[1] 0.9116236
> ppois(2, 5.5,lower.tail=F)
[1] 0.9116236
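The same tail probability worked out from the pmf directly; a small Python sketch mirroring the R call above:

```python
from math import exp, factorial

lam = 5.5
# P(X >= 3) = 1 - P(X=0) - P(X=1) - P(X=2)
tail = 1 - sum(exp(-lam) * lam**k / factorial(k) for k in range(3))
print(tail)  # matches 1 - ppois(2, 5.5) in R, about 0.9116
```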

66
Normal distribution
• Most important distribution in statistics. In 1823, C. Gauss derived and applied the distribution in
statistical problems. It accurately describes many practical and significant real-world quantities
such as noise and other quantities which result from many small independent random terms.

Pdf of N(μ, σ²):

f(x) = 1/√(2πσ²) · e^{−(x − μ)²/(2σ²)},  −∞ < x < ∞

67

Normal distribution
Pdf of N(μ, σ²):

f(x) = 1/√(2πσ²) · e^{−(x − μ)²/(2σ²)},  −∞ < x < ∞

• μ and σ² > 0 are the two parameters of the distribution.

• Proposition: For X ∼ N(μ, σ²),
E(X) = μ, Var(X) = σ²

E(X) = ∫_{−∞}^{∞} x · 1/√(2πσ²) e^{−(x − μ)²/(2σ²)} dx
= ∫_{−∞}^{∞} (x − μ) · 1/√(2πσ²) e^{−(x − μ)²/(2σ²)} dx + ∫_{−∞}^{∞} μ · 1/√(2πσ²) e^{−(x − μ)²/(2σ²)} dx
= 0 + μ ∫_{−∞}^{∞} 1/√(2πσ²) e^{−(x − μ)²/(2σ²)} dx = μ

68
Normal distribution

Pdf of N(μ, σ²): f(x) = 1/√(2πσ²) · e^{−(x − μ)²/(2σ²)},  −∞ < x < ∞

Probability density function:

P(a < X ≤ b) = ∫_a^b f(x) dx = ∫_a^b 1/√(2πσ²) e^{−(x − μ)²/(2σ²)} dx = ?

It cannot be calculated in closed form.

We need to use R code to compute it!

But first, a simplification of the problem…

69
Standard Normal distribution

Pdf of N(μ, σ²): f(x) = 1/√(2πσ²) · e^{−(x − μ)²/(2σ²)},  −∞ < x < ∞

• Cdf and P(a < X ≤ b) = ∫_a^b f(x) dx.

Theorem: If X is a normal rv, then for any a ≠ 0 and b, aX + b is also normally distributed.

Corollary: If X ∼ N(μ, σ²), then Z = (X − μ)/σ ∼ N(0,1).

• Proof: E(Z) = E((X − μ)/σ) = (E(X) − μ)/σ = 0; Var(Z) = Var((X − μ)/σ) = (1/σ²) Var(X − μ) = (1/σ²) Var(X) = 1

• N(0,1) is called the standard normal distribution.

• P(a < X ≤ b) = P((a − μ)/σ < Z ≤ (b − μ)/σ)
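The standardization recipe can be sketched in code. The course uses R's pnorm; here is a hedged Python equivalent, where `Phi` and `normal_prob` are my own helper names and Φ is expressed through the error function:

```python
from math import erf, sqrt

def Phi(z):
    # Standard normal cdf via the error function: Phi(z) = (1 + erf(z/sqrt(2)))/2
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_prob(a, b, mu, sigma):
    # P(a < X <= b) for X ~ N(mu, sigma^2), by standardizing the endpoints
    return Phi((b - mu) / sigma) - Phi((a - mu) / sigma)

# Example from a later slide: X ~ N(1, 4), P(X > 1.5) = 1 - Phi(0.25)
print(1 - Phi(0.25))  # about 0.40129, matching pnorm in R
```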

70
Standard normal distribution

[Figure: standardizing the interval endpoints, a′ = (a − μ)/σ, b′ = (b − μ)/σ]

71
Normal distribution

ϕ(x) = Φ′(x)

Corollary: If X ∼ N(μ, σ²), then Z = (X − μ)/σ ∼ N(0,1).

• P(a < X ≤ b) = P((a − μ)/σ < Z ≤ (b − μ)/σ) = Φ((b − μ)/σ) − Φ((a − μ)/σ)

• Φ(x) is the cdf of a standard normal rv.

• In R, pnorm(x)

• Example: Let X ~ N(1,4). Find P(X>1.5).

• P(X>1.5) = P(Z>0.25) = 1 − P(Z ≤ 0.25)

> 1 - pnorm(0.25)
[1] 0.4012937
> pnorm(1.5, mean=1, sd=2, lower.tail = F)
[1] 0.4012937
72

Normal distribution
• Example: Let X ∼ N(1,4). Find k such that P(0.5 < X < k) = 0.4
• 0.4 = P(−0.25 < Z < (k − 1)/2) = P(Z < (k − 1)/2) − Φ(−0.25)
• P(Z < (k − 1)/2) = 0.4 + Φ(−0.25) = 0.8012937
• (k − 1)/2 = Φ⁻¹(0.8012937), k = 2.692502
• Definition: the inverse of the cdf is called the (population) quantile function, i.e., the q-quantile is
the value z_q satisfying:
P(X ≤ z_q) = q.

> pnorm(-0.25)
[1] 0.4012937
> qnorm(0.4+pnorm(-0.25))
[1] 0.8462512
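Since Φ is increasing, its inverse (R's qnorm) can be sketched by bisection; `Phi_inv` below is a hypothetical helper, not a standard-library function:

```python
from math import erf, sqrt

def Phi(z):
    # Standard normal cdf via the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def Phi_inv(q, lo=-10.0, hi=10.0):
    # Invert the standard normal cdf by bisection (Phi is increasing)
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if Phi(mid) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# k such that P(0.5 < X < k) = 0.4 for X ~ N(1, 4)
k = 1 + 2 * Phi_inv(0.4 + Phi(-0.25))
print(k)  # about 2.6925, matching the slide
```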

73
Normal distribution

Example:

Solution

74
Normal distribution
• Example: Suppose that the life span of a light bulb is Gaussian distributed with mean 1200 hrs
and sd 300 hrs. A wholesale box contains 30 bulbs. What is the probability of the box containing
more than 2 bulbs that have a life span of less than 600 hrs?

• What distribution to use?


• The number of bulbs <600 hrs in a box of 30, Y, is binomially distributed: B(30, p)
• Let X ~ N(1200, 300²) be the life span of a bulb; p = P(X<600) = P(Z<−2) = pnorm(-2) = 0.02275
• P(Y>2) = 1 − P(Y=0) − P(Y=1) − P(Y=2) = 1 − (1 − p)^30 − 30p(1 − p)^29 − (30·29/2) p²(1 − p)^28

> 1-pbinom(2,size=30,prob=pnorm(-2))
[1] 0.03025764
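The two-stage computation (normal cdf feeding a binomial tail) can be reproduced step by step; a Python sketch of the same arithmetic as the R one-liner above, with `Phi` my own helper for pnorm:

```python
from math import comb, erf, sqrt

def Phi(z):
    # Standard normal cdf via the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# p = P(one bulb lasts < 600 hrs) with life span ~ N(1200, 300^2), i.e. Phi(-2)
p = Phi((600 - 1200) / 300)

# Y ~ B(30, p): P(Y > 2) = 1 - P(Y=0) - P(Y=1) - P(Y=2)
prob = 1 - sum(comb(30, y) * p**y * (1 - p)**(30 - y) for y in range(3))
print(prob)  # matches the R result 0.03025764
```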

75
Normal distribution
Example:

Solution:

By using R code: qnorm(0.995), we get the value 2.5758. Hence 0.005/σ = 2.5758 and we get σ = 0.00194
σ
Other continuous distributions
• Student’s t distribution
A family of distributions with a parameter ν > 0:
Pdf of t distribution*

• It has a similar shape to the standard normal distribution N(0,1).

• When ν = 1, we get the Cauchy distribution on page 40.

• When ν → +∞, we get the standard normal N(0,1).


wikipedia

77
Other continuous distributions
• Chi-squared distribution
A family of distributions with a parameter k > 0:
Pdf of chi-sq distribution*

• Its pdf is asymmetric and nonzero only on the positive real numbers ℝ⁺.

• For integer k, X = Z1² + Z2² + ⋯ + Zk², where Zi are iid standard normal rvs. wikipedia

• E(X) = k
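E(X) = k follows from linearity, since E(Zi²) = Var(Zi) = 1. A quick Monte Carlo check in Python (my own simulation with an arbitrary seed and sample size, not part of the course material):

```python
import random

random.seed(0)
k, n = 3, 100_000

# Simulate X = Z1^2 + ... + Zk^2 with Zi iid N(0,1); the sample mean should be close to k
samples = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k)) for _ in range(n)]
mean = sum(samples) / n
print(mean)  # close to E(X) = k = 3
```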
78
Summary

Discrete and continuous rv

Distribution of a rv
• Cumulative distribution function
• Probability mass/density function

Mean and variance of a rv
• Independent rvs
• Linear transform
• Sum of rvs

Important rvs
• Binomial distribution
• Poisson distribution
• Normal distribution
• t distribution
• Chi-squared distribution

Use R to calculate cdf, pdf/pmf, quantile

79
