
Virtual University of Pakistan

Lecture No. 23
of the course on
Statistics and Probability
by
Miss Saleha Naghmi Habibullah
IN THE LAST LECTURE,
YOU LEARNT

• Bayes’ Theorem
• Discrete Random Variable
• Discrete Probability Distribution
• Graphical Representation of a Discrete Probability Distribution
• Mean, Standard Deviation and Coefficient of Variation of a Discrete Probability Distribution
• Distribution Function of a Discrete Random Variable
TOPICS FOR TODAY

• Graphical Representation of the Distribution Function of a Discrete Random Variable
• Mathematical Expectation
• Mean, Variance and Moments of a Discrete Probability Distribution
• Properties of Expected Values


First, let us consider
the concept of the
DISTRIBUTION FUNCTION of a discrete random
variable.
As discussed in the last lecture:
DISTRIBUTION FUNCTION

The distribution function of a random variable X, denoted by F(x), is defined by F(x) = P(X ≤ x).

The function F(x) gives the probability of the event that X takes a value LESS THAN OR EQUAL TO a specified value x.

The distribution function is abbreviated to d.f. and is also called the cumulative distribution function (cdf), as it is the cumulative probability function of the random variable X from the smallest value up to a specific value x.
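For readers who like to see this computationally, here is a minimal Python sketch (the function name and the pmf dictionary are illustrative, not part of the lecture) that evaluates F(x) by accumulating f(xi) over all possible values up to x:

```python
# Minimal sketch: evaluate F(x) = P(X <= x) from a discrete pmf.

def cdf(pmf, x):
    """Sum f(xi) over every possible value xi that is <= x."""
    return sum(p for xi, p in pmf.items() if xi <= x)

# pmf of the number of heads in 3 tosses of a balanced coin
# (derived in the example that follows).
pmf = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}

print(cdf(pmf, 1))     # 0.5, i.e. 4/8 = P(at most one head)
print(cdf(pmf, 2.7))   # 0.875, i.e. 7/8: F is flat between the integers
```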
EXAMPLE

Find the probability distribution and distribution


function for the number of heads when 3 balanced coins
are tossed.
Depict both the probability distribution and the
distribution function graphically.
Since the coins are balanced, the equiprobable sample space for this experiment is

S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}.
Let X be the random variable that denotes the
number of heads.
Then the values of X are
0, 1, 2 and 3.
And their probabilities are:

f(0) = P(X = 0)
= P[{TTT}] = 1/8
f(1) = P(X = 1)
= P[{HTT, THT, TTH}] = 3/8
f(2) = P(X = 2)
= P[{HHT, HTH, THH}] = 3/8
f(3) = P(X = 3)
= P[{HHH}] = 1/8
Expressing the above information in the tabular
form, we obtain the desired probability distribution of X
as follows:
Number of Heads    Probability
(xi)               f(xi)
0                  1/8
1                  3/8
2                  3/8
3                  1/8
Total              1
The line chart of the above
probability distribution is as
follows:
[Line chart of f(x) against X: vertical lines of heights 1/8, 3/8, 3/8 and 1/8 at X = 0, 1, 2, 3.]
In order to obtain the distribution function of this
random variable, we compute the cumulative probabilities
as follows:
Number of Heads    Probability    Cumulative Probability
(xi)               f(xi)          F(xi)
0                  1/8            1/8
1                  3/8            1/8 + 3/8 = 4/8
2                  3/8            4/8 + 3/8 = 7/8
3                  1/8            7/8 + 1/8 = 1
Hence the desired distribution function is

F(x) = 0,    for x < 0
     = 1/8,  for 0 ≤ x < 1
     = 4/8,  for 1 ≤ x < 2
     = 7/8,  for 2 ≤ x < 3
     = 1,    for x ≥ 3
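The same cumulative probabilities can be generated mechanically with a running sum; a small Python sketch (the variable names are illustrative):

```python
from itertools import accumulate

values = [0, 1, 2, 3]
probs = [1/8, 3/8, 3/8, 1/8]

# A running sum of the probabilities gives F(xi) at each possible value;
# between these values the distribution function stays flat.
for x, F in zip(values, accumulate(probs)):
    print(f"F({x}) = {F}")   # 0.125, 0.5, 0.875, 1.0 (i.e. 1/8, 4/8, 7/8, 1)
```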
INTERPRETATION:
If x < 0, we have P(X ≤ x) = 0, the reason being that it is not possible for our random variable X to assume a value less than zero. (The minimum number of heads that we can have in tossing three coins is zero.)

If 0 ≤ x < 1, we note that it is not possible for our random variable X to assume any value strictly between zero and one. (We will have no heads or one head, but we will NOT have 1/3 heads or 2/5 heads!)

Hence, the probabilities of all such values will be zero, and we will obtain a situation which can be explained through the following table:
Number of Heads    Probability    Cumulative Probability
(xi)               f(xi)          F(xi)
0                  1/8            1/8
0.2                0              1/8
0.4                0              1/8
0.6                0              1/8
0.8                0              1/8
1                  3/8            1/8 + 3/8 = 4/8
The above table clearly shows that the probability that X is LESS THAN OR EQUAL TO any value lying between zero and 0.9999… will be equal to the probability of X = 0, i.e.

For 0 ≤ x < 1,
P(X ≤ x) = P(X = 0) = 1/8;
Similarly,
• For 1 ≤ x < 2, we have

P(X ≤ x) = P(X = 0) + P(X = 1)
         = 1/8 + 3/8 = 4/8;
• For 2 ≤ x < 3, we have

P(X ≤ x) = P(X = 0) + P(X = 1) + P(X = 2)
         = 1/8 + 3/8 + 3/8 = 7/8;
And, finally, for x ≥ 3, we have

P(X ≤ x) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)
         = 1/8 + 3/8 + 3/8 + 1/8 = 1.
Hence, the graph of the
DISTRIBUTION FUNCTION
is as follows:
[Graph of F(x) against X: a step function rising from 0 to 1, with jumps of 1/8, 3/8, 3/8 and 1/8 at X = 0, 1, 2, 3; the vertical axis is marked at 2/8, 4/8, 6/8 and 1.]
As this graph resembles the steps of a staircase, it
is known as a step function.
It is also known as a jump function (as it takes
jumps at integral values of X).
In some books, the graph of
the distribution function is
given as shown in the following
figure:
[Graph of F(x) against X: the same step function, drawn in the alternative style with the vertical risers joined to the horizontal steps.]
In what way do we interpret
the above distribution function
from a REAL-LIFE point of
view?
If we toss three balanced coins, the probability that we obtain at most one head is 4/8, the probability that we obtain at most two heads is 7/8, and so on.
Let us consider another
interesting example to illustrate
the concepts of a discrete
probability distribution and its
distribution function:
EXAMPLE
A large store places its last 15 clock radios in a clearance sale. Unknown to anyone, 5 of the radios are defective.
If a customer tests 3 different clock radios selected at random, what is the probability distribution of X, where X represents the number of defective radios in the sample?
SOLUTION
We have:

Type of Clock Radio    Number of Clock Radios
Good                   10
Defective              5
Total                  15
The total number of ways of selecting 3 radios out of 15 is C(15, 3).
Also, the total number of ways of selecting 3 good radios (and no defective radio) is C(10, 3) × C(5, 0).
Hence, the probability of X = 0 is

P(X = 0) = C(10, 3) C(5, 0) / C(15, 3) ≈ 0.26.
The probabilities of X = 1, 2, and 3 are computed in
a similar way.
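A short Python sketch (using the standard-library math.comb; the loop and the variable names are mine, not the lecture’s) reproduces all four probabilities at once:

```python
from math import comb

good, defective, n = 10, 5, 3          # stock of radios and sample size
total = comb(good + defective, n)      # C(15, 3): all possible samples

# P(X = x) = C(5, x) * C(10, 3 - x) / C(15, 3) for x defective radios.
for x in range(n + 1):
    p = comb(defective, x) * comb(good, n - x) / total
    print(f"P(X = {x}) = {p:.2f}")     # 0.26, 0.49, 0.22, 0.02
```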
Hence, we obtain the following probability
distribution:
Number of defective clock
radios in the sample    Probability
X                       f(x)
0                       0.26
1                       0.49
2                       0.22
3                       0.02
Total                   0.99 ≈ 1

(The total differs slightly from 1 only because each probability has been rounded to two decimal places.)
The line chart of this
distribution is:
LINE CHART
[Line chart of f(x) against X: vertical lines of heights 0.26, 0.49, 0.22 and 0.02 at X = 0, 1, 2, 3.]
As indicated by the above diagram, it is not
necessary for a probability distribution to be symmetric;
it can be positively or negatively skewed.
The distribution function of
the above probability
distribution is obtained as
follows:
Number of defective clock
radios in the sample    f(x)     F(x)
X
0                       0.26     0.26
1                       0.49     0.75
2                       0.22     0.97
3                       0.02     0.99 ≈ 1
Total                   0.99 ≈ 1
INTERPRETATION
The probability that the sample of 3 clock radios contains at most one defective radio is 0.75, the probability that the sample contains at most two defective radios is 0.97, and so on.
Next, we consider the concept of
MATHEMATICAL EXPECTATION.

Let a discrete random variable X have possible values x1, x2, …, xn with corresponding probabilities f(x1), f(x2), …, f(xn) such that Σ f(xi) = 1.

Then the mathematical expectation or the expectation or the expected value of X, denoted by E(X), is defined as

E(X) = x1 f(x1) + x2 f(x2) + … + xn f(xn) = Σ xi f(xi),

where the sum runs from i = 1 to n.

E(X) is also called the mean of X and is usually denoted by the letter μ.
The expression

E(X) = Σ xi f(xi)

may be regarded as a weighted mean of the variable’s possible values x1, x2, …, xn, each being weighted by the respective probability.
In case the values are equally likely,

E(X) = (1/n) Σ xi,

which represents the ordinary arithmetic mean of the n possible values.
It should be noted that
E(X) is the average value of
the random variable X over
a VERY LARGE number
of trials.
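As a sketch in Python (the function and variable names are illustrative), the definition translates directly into a probability-weighted sum:

```python
def expected_value(values, probs):
    """E(X) = sum of xi * f(xi); assumes the probabilities sum to 1."""
    return sum(x * p for x, p in zip(values, probs))

# Number of heads when three balanced coins are tossed:
print(expected_value([0, 1, 2, 3], [1/8, 3/8, 3/8, 1/8]))   # 1.5
```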
Let us now consider an
interesting example:
EXAMPLE
If it rains, an umbrella salesman can earn $ 30 per
day. If it is fair, he can lose $ 6 per day. What is his
expectation if the probability of rain is 0.3?
SOLUTION
Let X represent the number of
dollars the salesman earns. Then X
is a random variable with possible
values 30 and –6 (where –6 corresponds to the fact that the salesman loses money), and the corresponding probabilities are 0.3 and 0.7 respectively.
Hence, we have:
Event      Amount Earned ($)    Probability
           x                    P(x)
Rain       30                   0.3
No Rain    –6                   0.7
Total                           1
In order to compute the
expected value of X, we carry
out the following computation:
Event      Amount Earned ($)    Probability    x P(x)
           x                    P(x)
Rain       30                   0.3            9.0
No Rain    –6                   0.7            –4.2
Total                           1              4.8
Hence

E(X) = $4.80 per day,

i.e. on the average, the salesman can expect to earn 4.8 dollars per day.
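The long-run interpretation of this figure can be checked by simulation; a minimal sketch (the simulation is my illustration, not part of the lecture):

```python
import random

random.seed(1)                    # reproducible illustration
days = 100_000

# Each day it rains with probability 0.3 (earn $30); otherwise lose $6.
total = sum(30 if random.random() < 0.3 else -6 for _ in range(days))
print(total / days)               # close to the expected $4.80 per day
```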
Until now, we have
considered the mathematical
expectation of the random
variable X.
But, in many situations, we
may be interested in the
mathematical expectation of
some FUNCTION of X:
EXPECTATION OF A
FUNCTION
OF A RANDOM VARIABLE
Let H(X) be a function of the random variable X. Then H(X) is itself a random variable and so also has an expected value (since any function of a random variable is also a random variable).
If X is a discrete random variable with probability
distribution f(x), then, since H(X) takes the value H(xi)
when X = xi, the expected value of the function H(X) is
E[H(X)] = H(x1) f(x1) + H(x2) f(x2) + … + H(xn) f(xn) = Σ H(xi) f(xi),

provided the series converges absolutely.
In particular, if H(X) = X², then E(X²) = Σ xi² f(xi).

It is relevant to note that E(X²) is not the same as [E(X)]².
Again, if H(X) = (X − μ)², where μ is the population mean, then

E(X − μ)² = Σ (xi − μ)² f(xi).

We call this expected value the variance and denote it by Var(X) or σ².
And, since

E(X − μ)² = E(X²) − [E(X)]²,

the short-cut formula for the variance is

σ² = E(X²) − [E(X)]².

The positive square root of the variance, as before, is called the standard deviation.
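Both routes to the variance can be verified numerically; a minimal Python sketch using the three-coin distribution from earlier (variable names are illustrative):

```python
values = [0, 1, 2, 3]
probs = [1/8, 3/8, 3/8, 1/8]

mu = sum(x * p for x, p in zip(values, probs))         # E(X) = 1.5
e_x2 = sum(x**2 * p for x, p in zip(values, probs))    # E(X^2) = 3.0

var_by_definition = sum((x - mu)**2 * p for x, p in zip(values, probs))
var_by_shortcut = e_x2 - mu**2                         # E(X^2) - [E(X)]^2

print(var_by_definition, var_by_shortcut)              # both 0.75
```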
More generally, if H(X) = X^k, k = 1, 2, 3, …, then

E(X^k) = Σ xi^k f(xi),

which we call the kth moment about the origin of the random variable X and denote by μ′k.
Similarly, if H(X) = (X − μ)^k, k = 1, 2, 3, …, then we get an expected value, called the kth moment about the mean of the random variable X, which we denote by μk. That is:

μk = E(X − μ)^k = Σ (xi − μ)^k f(xi).

The skewness of a probability distribution is often measured by

β1 = (μ3)² / (μ2)³,

and kurtosis by

β2 = μ4 / (μ2)².
These moment-ratios assist us in determining the
skewness and kurtosis of our probability distribution in
exactly the same way as was discussed in the case of
frequency distributions.
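A sketch computing the central moments and the two moment-ratios for the three-coin distribution (the helper function is my own illustration):

```python
values = [0, 1, 2, 3]
probs = [1/8, 3/8, 3/8, 1/8]

mu = sum(x * p for x, p in zip(values, probs))

def central_moment(k):
    """mu_k = E[(X - mu)^k], the kth moment about the mean."""
    return sum((x - mu)**k * p for x, p in zip(values, probs))

m2, m3, m4 = central_moment(2), central_moment(3), central_moment(4)

beta1 = m3**2 / m2**3   # 0 here, since this distribution is symmetric
beta2 = m4 / m2**2      # about 2.33
print(beta1, beta2)
```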
Next, we discuss some important properties of
mathematical expectation.
The important properties of the expected values
of a random variable are as follows:
PROPERTIES OF MATHEMATICAL
EXPECTATION

1. If c is a constant, then
E(c) = c.
Thus the expected value of a constant is the constant itself.
This point can be understood
easily by considering the following
interesting example:
Suppose that a very difficult test
was given to students by a professor,
and that every student obtained 2
marks out of 20!
It is obvious that the mean mark is also 2. Since the variable ‘marks’ was a constant, its expected value was equal to itself.
2. If X is a discrete random variable and if a and b
are constants, then

E(aX + b) = a E(X) + b.
Let us verify this from
the following example:
EXAMPLE
Let X represent the number of
heads that appear when three fair coins
are tossed.
The probability distribution of X is:
X P(x)
0 1/8
1 3/8
2 3/8
3 1/8
Total 1
The expected value of X is
obtained as follows:
x P(x) xP(x)
0 1/8 0
1 3/8 3/8
2 3/8 6/8
3 1/8 3/8
Total 1 12/8=1.5
Hence, E(X) = 1.5
Suppose that we are
interested in finding the
expected value of the random
variable 2X+3.
Then we carry out the
following computations:
x 2x+3 P(x) (2x+3)P(x)
0 3 1/8 3/8
1 5 3/8 15/8
2 7 3/8 21/8
3 9 1/8 9/8
Total 1 48/8=6

Hence
E(2X+3) = 6
It should be noted that
E(2X+3) = 6
= 2(1.5) + 3
= 2E(X) + 3
i.e.
E(aX + b) = a E(X) + b.
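The same verification can be written as a short Python check (an illustration using the lecture’s numbers):

```python
values = [0, 1, 2, 3]
probs = [1/8, 3/8, 3/8, 1/8]

e_x = sum(x * p for x, p in zip(values, probs))                  # E(X) = 1.5
e_2x_plus_3 = sum((2*x + 3) * p for x, p in zip(values, probs))  # E(2X + 3)

print(e_2x_plus_3, 2 * e_x + 3)   # both 6.0: E(2X + 3) = 2 E(X) + 3
```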
IN TODAY’S LECTURE,
YOU LEARNT
•Graphical Representation of the
Distribution Function of a Discrete
Random Variable
•Mathematical Expectation
•Mean, Variance and Moments of a
Discrete Probability Distribution
•Properties of Expected Values
IN THE NEXT LECTURE,
YOU WILL LEARN

•Chebychev’s Inequality

•Continuous Probability
Distributions
