Math2101Stat 4

COURSE TITLE: Linear Algebra, Statistics and Probability
COURSE CODE: MATH-2101

Instructor: M. Ershadul Haque

Associate Professor

Department of Statistics, DU
Random Variables and Their Probability Distributions

• Random Variable (Definition): Given an experiment with sample space \( S \), a random
variable (r.v.) is a function from the sample space \( S \) to the real numbers \( \mathbb{R} \).
Random variables are usually denoted by capital letters.

• Example: Consider an experiment where we toss a fair coin twice. The sample space
consists of four possible outcomes: \( S = \{HH, HT, TH, TT\} \). Let \( X \) be the number of Heads.
This is a random variable with possible values 0, 1, and 2. Viewed as a function, \( X \) assigns
the value 2 to the outcome \( HH \), 1 to the outcomes \( HT \) and \( TH \), and 0 to the outcome \( TT \).
That is,
\[ X(HH) = 2, \quad X(HT) = X(TH) = 1, \quad X(TT) = 0. \]
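
To make the "function on the sample space" view concrete, here is a minimal Python sketch of this example. The names `sample_space`, `X`, and `values` are illustrative choices, not notation from the course.

```python
from itertools import product

# Sample space for two tosses of a coin: HH, HT, TH, TT
sample_space = ["".join(tosses) for tosses in product("HT", repeat=2)]

def X(outcome: str) -> int:
    """Random variable: the number of Heads in an outcome such as 'HT'."""
    return outcome.count("H")

# Evaluate the function X at every outcome of the sample space
values = {outcome: X(outcome) for outcome in sample_space}
print(values)
```

The dictionary `values` is the table of the function: each outcome on the left, the real number assigned to it on the right.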
Probability Distributions

• Probability distribution: Most commonly in applications, the support of a discrete r.v. is a
set of integers. In contrast, a continuous r.v. can take on any real value in an interval
(possibly even the entire real line). Given a random variable, we would like to be able to
describe its behavior using the language of probability by means of a probability
distribution.

• For a discrete r.v., the most natural way to describe the behavior of that r.v. is a function,
called the probability mass function, which we now define.

• Probability mass function: The probability mass function (pmf) of a discrete r.v. \( X \) is the
function \( p_X \) given by \( p_X(x) = P(X = x) \). Note that this is positive if \( x \) is in the
support of \( X \), and 0 otherwise.
Probability Distributions (cont…)

• Example: In the example of tossing a fair coin twice, we can find the pmf of the random
variable \( X \), the number of Heads.

• Since \( X \) equals 0 if \( TT \) occurs, 1 if \( HT \) or \( TH \) occurs, and 2 if \( HH \)
occurs, the pmf of \( X \) is the function \( p_X \) given by
\( p_X(0) = P(X = 0) = 1/4 \),
\( p_X(1) = P(X = 1) = 1/2 \),
\( p_X(2) = P(X = 2) = 1/4 \),
\( p_X(x) = 0 \) for all other values of \( x \).
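
The same pmf can be recovered by counting outcomes. The sketch below assumes that the four outcomes are equally likely (a fair coin). `pmf` and `p_X` are illustrative names.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The four equally likely outcomes of two tosses
outcomes = ["".join(tosses) for tosses in product("HT", repeat=2)]

# Count how many outcomes give each number of Heads
counts = Counter(outcome.count("H") for outcome in outcomes)

# P(X = x) = (number of outcomes with x Heads) / (number of outcomes)
pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

def p_X(x):
    """pmf of X: positive on the support {0, 1, 2}, zero elsewhere."""
    return pmf.get(x, Fraction(0))

print(pmf)
print(p_X(3))
```

Using `Fraction` keeps the probabilities exact, so they can be compared directly with the values 1/4, 1/2, 1/4 above.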
Probability Distributions (cont…)
• Valid pmfs: Let \( X \) be a discrete r.v. with support \( x_1, x_2, \ldots, x_n \). The pmf \( p_X \) (or \( f_X \)) of \( X \) must
satisfy the following two criteria:

  ◦ Nonnegative: \( p_X(x) > 0 \) if \( x = x_j \) for some \( j \), and \( p_X(x) = 0 \) otherwise;

  ◦ Sums to 1: \( \sum_{j=1}^{n} p_X(x_j) = 1 \).

• Problem: Find the value of \( k \) that makes the following function a valid pmf. Also find
\( P(X = 16) \) and \( P(10 \le X \le 20) \).

  x      8     12    16    20    24
  f(x)   1/8   1/6   3k    k     1/12

• Problem: Verify that the function \( f(x) = \left( \frac{1}{2} \right)^{x+1} \), \( x = 0, 1, 2, \ldots \), is a pmf.

• Problem: Find the value of \( k \) that makes the following function a valid pmf. Also find
\( P(X < 3) \), \( P(X > 5) \), \( P(1 < X < 5) \), and \( P(2 \le X \le 4) \).
\[ f(x) = k(x + 1), \quad x = 1, 2, 3, 4, 5. \]
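
One way to check answers to the three problems above is to apply the "sums to 1" criterion with exact fractions. This is a sketch rather than a model solution. The helper `prob` and all variable names are illustrative, and the second problem is read as \( f(x) = (1/2)^{x+1} \).

```python
from fractions import Fraction

def prob(pmf, event):
    """Sum the pmf over the support points x for which event(x) is true."""
    return sum((p for x, p in pmf.items() if event(x)), Fraction(0))

# Problem 1: table with f(16) = 3k and f(20) = k.
known = {8: Fraction(1, 8), 12: Fraction(1, 6), 24: Fraction(1, 12)}
k1 = (1 - sum(known.values())) / (3 + 1)   # 3k + k is the remaining mass
pmf1 = {**known, 16: 3 * k1, 20: k1}
p_eq_16 = pmf1[16]
p_10_to_20 = prob(pmf1, lambda x: 10 <= x <= 20)

# Problem 2: f(x) = (1/2)**(x + 1), x = 0, 1, 2, ...
# The first n terms sum to 1 - (1/2)**n, which tends to 1 as n grows.
n = 60
partial_sum = sum(Fraction(1, 2) ** (x + 1) for x in range(n))
gap = 1 - partial_sum                      # mass not yet accounted for

# Problem 3: f(x) = k(x + 1), x = 1, ..., 5.
support = range(1, 6)
k3 = Fraction(1, sum(x + 1 for x in support))
pmf3 = {x: k3 * (x + 1) for x in support}
p_lt_3 = prob(pmf3, lambda x: x < 3)
p_gt_5 = prob(pmf3, lambda x: x > 5)       # no support point exceeds 5
p_1_to_5 = prob(pmf3, lambda x: 1 < x < 5)
p_2_to_4 = prob(pmf3, lambda x: 2 <= x <= 4)

print(k1, p_eq_16, p_10_to_20)
print(gap == Fraction(1, 2) ** n)
print(k3, p_lt_3, p_gt_5, p_1_to_5, p_2_to_4)
```

Under these readings the sketch gives \( k = 5/32 \), \( P(X = 16) = 15/32 \), and \( P(10 \le X \le 20) = 19/24 \) for the table, and \( k = 1/20 \) with probabilities 1/4, 0, 3/5, and 3/5 for the last problem. For the second problem every term is positive and the partial sums approach 1, which is what the two criteria require.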
Cumulative distribution functions

• Another function that describes the distribution of an r.v. is the cumulative distribution
function (cdf). Unlike the pmf, which only discrete r.v.s possess, the cdf is
defined for all r.v.s, both discrete and continuous.

• Definition: The cumulative distribution function (cdf) of an r.v. \( X \) is the function \( F_X \) given by
\( F_X(x) = P(X \le x) \). When there is no risk of ambiguity, we sometimes drop the subscript and
just write \( F \) (or some other letter) for a cdf.

• Continuous random variable: Whereas the possible values of a discrete random variable
can be written down as a list, a continuous r.v. can take on any real value in an interval
(possibly of infinite length, such as \( (0, \infty) \) or the entire real line). A continuous random
variable is a random variable with a continuous distribution.
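
As an illustration of the cdf definition for a discrete r.v., the sketch below builds \( F(x) = P(X \le x) \) from the pmf of the coin-toss example (number of Heads in two tosses). The function name `cdf` is an illustrative choice.

```python
from fractions import Fraction

# pmf of X = number of Heads in two tosses of a fair coin
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def cdf(x):
    """F(x) = P(X <= x): add the pmf over support points not exceeding x."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))

# The cdf is defined for every real x, not only for points of the support
for x in (-1, 0, 0.5, 1, 1.99, 2, 5):
    print(x, cdf(x))
```

The printed values stay constant between support points and increase only at 0, 1, and 2, by exactly the pmf value at each of those points.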
Cumulative distribution functions (cont…)
• The properties of the cdf for discrete and continuous random variables can be described by the
following figure:

• An r.v. has a continuous distribution if its cdf is differentiable. For continuous r.v.s, the cdf is
often convenient to work with, and its derivative is a very useful function, called the
probability density function (pdf).

• For discrete r.v.s, the cdf is awkward to work with because of its jumpiness, and its
derivative is almost useless since it’s undefined at the jumps and 0 everywhere else.
Probability density function

• Definition (Probability density function): For a continuous r.v. 𝑋 with cdf 𝐹 , the
probability density function (pdf) of 𝑋 is the derivative 𝑓 of the cdf, given by 𝑓(𝑥) = 𝐹′(𝑥).
The support of 𝑋, and of its distribution, is the set of all 𝑥 where 𝑓(𝑥) > 0.

• An important way in which continuous r.v.s differ from discrete r.v.s is that for a continuous
r.v. 𝑋, 𝑃(𝑋 = 𝑥) = 0 for all 𝑥. This is because 𝑃(𝑋 = 𝑥) is the height of a jump in the cdf
at 𝑥, but the cdf of 𝑋 has no jumps! Since the pmf of a continuous r.v. would just be 0
everywhere, we work with a pdf instead.

• The pdf is analogous to the pmf in many ways, but there is a key difference: for a pdf 𝑓, the
quantity 𝑓(𝑥) is not a probability, and in fact it is possible to have 𝑓(𝑥) > 1 for some values
of 𝑥. To obtain a probability, we need to integrate the pdf.
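
A small numerical sketch of this point: for the example density \( f(x) = 2(1 - x) \) on \( [0, 1] \), chosen here for illustration, the value \( f(0.1) \) exceeds 1, yet the total area under the curve is 1. The midpoint-rule helper `integrate` is an assumption of this sketch, not a method from the course.

```python
def f(x):
    """Example density: 2(1 - x) on [0, 1], zero elsewhere."""
    return 2.0 * (1.0 - x) if 0.0 <= x <= 1.0 else 0.0

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

print(f(0.1))                  # a density value, not a probability
print(integrate(f, 0.0, 1.0))  # total area under the pdf
print(integrate(f, 0.0, 0.1))  # P(0 < X < 0.1), obtained by integrating
```

The density value is 1.8, while the probability of the interval \( (0, 0.1) \) is about 0.19: only the integral is a probability.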
Probability density function (cont…)

• pdf to cdf: Let \( X \) be a continuous r.v. with pdf \( f \). Then the cdf of \( X \) is given by
\[ F(x) = \int_{-\infty}^{x} f(u) \, du. \]

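• Probability from pdf: For the probability of \( X \) falling into an interval with endpoints \( a \)
and \( b \), it does not matter whether the endpoints are included, since \( P(X = a) = P(X = b) = 0 \):
\[ P(a < X < b) = P(a < X \le b) = P(a \le X < b) = P(a \le X \le b). \]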

• By the definition of the cdf and the fundamental theorem of calculus,
\[ P(a < X \le b) = F(b) - F(a) = \int_{a}^{b} f(x) \, dx. \]
Probability density function (cont…)
• Valid pdfs: The pdf \( f \) of a continuous random variable \( X \) must satisfy the following two criteria:

  ◦ Nonnegative: \( f(x) \ge 0 \),

  ◦ Integrates to 1: \( \int_{-\infty}^{\infty} f(x) \, dx = 1 \).

• Problem: Find the value of the constant \( k \) such that the following function is a valid pdf:
\( f(x) = k(1 - x^3) \), \( 0 < x < 1 \). Also find \( P(X < 0.5) \) and \( P(X > 0.2) \).

• Problem: Suppose that \( X \) is a continuous random variable whose probability density function
(pdf) is given by
\[ f(x) = \begin{cases} 2(1 - x), & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases} \]

  ◦ (a) Find the cdf.

  ◦ (b) Find \( P(X < 0.4) \).

  ◦ (c) Find \( P(X > 0.5) \).

  ◦ (d) Find \( P(0.3 < X < 0.8) \).
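
The two pdf problems above can be checked with exact arithmetic once the antiderivatives are worked out by hand. The sketch below assumes \( \int (1 - x^3) \, dx = x - x^4/4 \) for the first problem and \( F(x) = 2x - x^2 \) on \( [0, 1] \) for the second. The function names `G` and `F` are illustrative.

```python
from fractions import Fraction

# Problem 1: f(x) = k(1 - x**3) on (0, 1).
def G(x):
    """Antiderivative of 1 - x**3 with G(0) = 0."""
    return x - x**4 / 4

k = 1 / G(Fraction(1))                    # k * G(1) must equal 1
p_lt_half = k * G(Fraction(1, 2))         # P(X < 0.5)
p_gt_fifth = 1 - k * G(Fraction(1, 5))    # P(X > 0.2)

# Problem 2: f(x) = 2(1 - x) on [0, 1], zero otherwise.
def F(x):
    """cdf: 0 below 0, 2x - x**2 on [0, 1], 1 above 1."""
    if x < 0:
        return Fraction(0)
    if x > 1:
        return Fraction(1)
    return 2 * x - x**2

p_b = F(Fraction(2, 5))                        # P(X < 0.4)
p_c = 1 - F(Fraction(1, 2))                    # P(X > 0.5)
p_d = F(Fraction(4, 5)) - F(Fraction(3, 10))   # P(0.3 < X < 0.8)

print(k, p_lt_half, p_gt_fifth)
print(p_b, p_c, p_d)
```

Under these assumptions the sketch gives \( k = 4/3 \), \( P(X < 0.5) = 31/48 \), and \( P(X > 0.2) = 1376/1875 \) for the first problem, and 16/25, 1/4, and 9/20 for parts (b), (c), and (d) of the second.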


Mathematical Expectation
• Let \( X \) be a random variable. The mean of a random variable \( X \) is referred to as its expected
value and is denoted by \( E(X) \) or \( \mu_X \) or simply \( \mu \). It is defined by

  ◦ \( E(X) = \sum_{j} x_j \, p(x_j) \), if \( X \) is discrete with mass points \( x_1, x_2, \ldots, x_j, \ldots \)

  ◦ \( E(X) = \int_{-\infty}^{\infty} x f(x) \, dx \), if \( X \) is continuous with pdf \( f(x) \)

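• Let \( X \) be a random variable, and let \( g(\cdot) \) be a function. Then

  ◦ \( E[g(X)] = \sum_{j} g(x_j) \, p(x_j) \), if \( X \) is discrete with mass points \( x_1, x_2, \ldots, x_j, \ldots \)

  ◦ \( E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x) \, dx \), if \( X \) is continuous with pdf \( f(x) \)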
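
For the discrete case, both formulas reduce to a weighted sum over the mass points. The sketch below applies them to the coin-toss pmf (number of Heads in two tosses). The helper `expect` is an illustrative name.

```python
from fractions import Fraction

# pmf of X = number of Heads in two tosses of a fair coin
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def expect(g, pmf):
    """E[g(X)]: sum of g(x_j) * p(x_j) over the mass points x_j."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expect(lambda x: x, pmf)              # E(X), taking g(x) = x
second_moment = expect(lambda x: x**2, pmf)  # E(X^2), taking g(x) = x^2

print(mean, second_moment)
```

Here \( E(X) = 1 \) and \( E(X^2) = 3/2 \), which shows that \( E[g(X)] \) is in general not the same as \( g(E(X)) \).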
Properties of Mathematical Expectation

• Let \( X \) be a random variable with finite mean. Then for any numerical constants \( a \) and \( b \),
\[ E(aX + b) = aE(X) + b. \]

• The expected value of the squared difference of a random variable \( X \) from its mean is called
the variance. It is denoted by \( \sigma^2 \), i.e., \( \mathrm{Var}(X) = \sigma^2 \), and is defined as

  ◦ \( \sigma^2 = E[(X - \mu)^2] \). It can be easily shown that

  ◦ \( \sigma^2 = E(X^2) - \mu^2 = E(X^2) - [E(X)]^2 \).

  ◦ The square root of the variance is called the standard deviation (SD). The standard
    deviation of \( X \) is denoted by \( \mathrm{SD}(X) \) or \( \sigma_X \) or simply \( \sigma \), i.e., \( \sigma = \sqrt{\mathrm{Var}(X)} \).
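
As a quick check that the definition and the shortcut formula for the variance agree, the sketch below computes both for the coin-toss pmf (number of Heads in two tosses). Variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

# pmf of X = number of Heads in two tosses of a fair coin
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

mu = sum(x * p for x, p in pmf.items())

# Definition: expected squared difference from the mean
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Shortcut: E(X^2) - [E(X)]^2
var_shortcut = sum(x**2 * p for x, p in pmf.items()) - mu**2

sd = sqrt(var_definition)   # standard deviation as a float

print(mu, var_definition, var_shortcut, sd)
```

Both routes give a variance of 1/2, so the standard deviation is \( \sqrt{1/2} \), roughly 0.707.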


Mathematical Expectation and Its Properties (cont…)
• Let \( X \) and \( Y \) be two independent random variables. Then
\[ E(XY) = E(X)E(Y). \]

• Let \( X \) and \( Y \) be two independent random variables. Then for any functions \( g(X) \) and
\( h(Y) \),
\[ E[g(X)h(Y)] = E[g(X)] \, E[h(Y)]. \]

• Let \( X \) be a random variable with finite variance. Then for any numerical constants \( a \) and \( b \),
\[ \mathrm{Var}(aX \pm b) = a^2 \, \mathrm{Var}(X). \]
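
These properties can be verified on a small example. The sketch below uses two independent fair dice, which is an assumption made purely for illustration and does not appear in the notes. The constants `a = 3`, `b = -2` and the helpers `mean` and `var` are likewise illustrative.

```python
from fractions import Fraction
from itertools import product

def mean(pmf):
    """E(X) for a pmf given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

def var(pmf):
    """Var(X) = E[(X - mu)^2] for a pmf given as {value: probability}."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# pmf of one fair die (assumed example)
die = {x: Fraction(1, 6) for x in range(1, 7)}

# Independence: the joint probability is the product of the marginals
joint = {(x, y): die[x] * die[y] for x, y in product(die, die)}
E_XY = sum(x * y * p for (x, y), p in joint.items())

# pmf of aX + b; distinct x give distinct values because a != 0
a, b = 3, -2
transformed = {a * x + b: p for x, p in die.items()}

print(E_XY, mean(die) ** 2)
print(var(transformed), a**2 * var(die))
```

Each printed pair matches: \( E(XY) = E(X)E(Y) = 49/4 \), and \( \mathrm{Var}(3X - 2) = 9 \, \mathrm{Var}(X) = 105/4 \). The shift \( b \) has no effect on the variance.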
Mathematical Expectation (cont…)

• Every day, the number of network blackouts, \( X \), has the following distribution (probability mass function):

  x          0     1     2
  P(X = x)   0.7   0.2   0.1

  ◦ (a) Find \( E(X) \) and \( \mathrm{Var}(X) \).

  ◦ (b) A company estimates that each network blackout results in a $500 loss. Compute the
    expectation and variance of this company’s daily loss due to blackouts.

• The installation time, in hours, for a certain software module has the probability density
function \( f(x) = \frac{4}{3}(1 - x^3) \) for \( 0 < x < 1 \).

  ◦ (a) Find the mean installation time.

  ◦ (b) Find the standard deviation of the installation time.

  ◦ (c) If the installation cost is computed as \( 25X + 50 \), find the mean and standard deviation of
    the installation cost.
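
Answers to the two problems above can be checked with the rules \( E(aX + b) = aE(X) + b \) and \( \mathrm{Var}(aX + b) = a^2 \, \mathrm{Var}(X) \). This is a sketch under stated assumptions: the daily loss is taken to be \( 500X \), and the moments of the installation time use the hand-computed integral \( \int_0^1 (t^n - t^{n+3}) \, dt = \frac{1}{n+1} - \frac{1}{n+4} \). All names are illustrative.

```python
from fractions import Fraction
from math import sqrt

# Blackouts: pmf of the daily number of blackouts X
pmf = {0: Fraction(7, 10), 1: Fraction(1, 5), 2: Fraction(1, 10)}
mean_x = sum(x * p for x, p in pmf.items())
var_x = sum(x**2 * p for x, p in pmf.items()) - mean_x**2

# Daily loss assumed to be 500 * X dollars
mean_loss = 500 * mean_x
var_loss = 500**2 * var_x

# Installation time T with pdf (4/3)(1 - t**3) on (0, 1)
def moment(n):
    """E(T**n) = (4/3) * (1/(n + 1) - 1/(n + 4))."""
    return Fraction(4, 3) * (Fraction(1, n + 1) - Fraction(1, n + 4))

mean_t = moment(1)
var_t = moment(2) - mean_t**2
sd_t = sqrt(var_t)

# Installation cost 25 * T + 50
mean_cost = 25 * mean_t + 50
sd_cost = 25 * sd_t          # the shift of 50 does not change the SD

print(mean_x, var_x, mean_loss, var_loss)
print(mean_t, var_t, sd_t, mean_cost, sd_cost)
```

Under these assumptions, \( E(X) = 0.4 \) and \( \mathrm{Var}(X) = 0.44 \), so the daily loss has mean 200 dollars and variance 110000 squared dollars. The installation time has mean 0.4 hours and variance 14/225 (SD about 0.249 hours), which gives a mean cost of 60 and an SD of about 6.24.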
Thank You
