Lesson 4: Mean and Variance of a Discrete Random Variable
Recall Lesson 2 for this topic (on the Probability Distribution of Discrete Random Variables). The table below lists the distribution of the number of heads in a toss of three fair coins (or three independent tosses of one fair coin). The third column is the product of the entries of the first and second columns, (X)P(X).

X = number of heads   P(X)   (X)P(X)
0                     1/8    0
1                     3/8    3/8
2                     3/8    6/8
3                     1/8    3/8
Total                        12/8 = 1.5
Definition: Given a discrete random variable X, the mean, denoted by µ, is the sum of the products formed by multiplying the possible values of X with their corresponding probabilities. It is also called the expected value of X and is denoted by E(X). More formally:

µ = E(X) = Σᵢ i·P(X = i)
Recall that empirical probabilities tend toward the theoretical probabilities and that, in consequence, the mean is also a long-run average. This can be observed from the results of the activity: as the number of trials of a statistical experiment increases, the empirical average gets closer and closer to the theoretical average. This is why we can interpret the mean as a long-run average.
If three fair coins are tossed, there are eight equally likely outcomes – HHH, HHT, HTH, HTT, THH, THT, TTH, TTT. If we repeated the toss of these three coins 8000 times, we would expect about 1000 occurrences of each outcome; thus, the expected frequency of 3 heads would be 1000 tosses; of 2 heads, 3000 tosses; of 1 head, 3000 tosses; and of no heads, 1000 tosses. Averaging the number of heads over these tosses gives

[(1000)(3) + (3000)(2) + (3000)(1) + (1000)(0)] / 8000 = 1.5
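This long-run behavior can be checked with a short simulation (a sketch in Python; the trial count and random seed are arbitrary choices):

```python
import random

def simulate_mean_heads(num_trials: int, seed: int = 0) -> float:
    """Toss three fair coins num_trials times; return the average number of heads."""
    rng = random.Random(seed)
    total_heads = 0
    for _ in range(num_trials):
        # Each coin shows heads (1) or tails (0) with probability 1/2.
        total_heads += sum(rng.randint(0, 1) for _ in range(3))
    return total_heads / num_trials

# The empirical average approaches the theoretical mean of 1.5 as trials grow.
print(simulate_mean_heads(100_000))
```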
Although the mean is called the "expected value", this should not be interpreted as the actual result we expect when we perform the experiment once. In the example of tossing three coins, the mean is 1.5, yet when you toss three coins you cannot get 1.5 heads. This indicates that the mean is not necessarily a possible value of the random variable. So we cannot simply say that the mean is the number of heads we expect when we toss three coins. Rather, it is to be interpreted as a long-run average: the mean is the value that we expect the long-run average to approach, not the value of the random variable X that we expect to observe on a single trial.
Next, recall that the average of a given set of data is a measure of central tendency. The expected value, being an average, measures the center of the distribution of the possible values of X.
The mean of a (discrete) random variable X can also be given a physical interpretation. Suppose we imagine that the x-axis is an infinite see-saw extending in each direction and that, at each possible value of X, we place a weight equal to the corresponding probability. Then the mean is the point at which the see-saw balances; in other words, it is the center of gravity of the system.
As an example, consider a loaded die whose faces have the following probability distribution:

i          1        2        3        4        5        6
P(X = i)   (1−θ)/6  (1−θ)/6  (1−θ)/6  (1+θ)/6  (1+θ)/6  (1+θ)/6

µ = E(X) = Σᵢ i·P(X = i)
  = 1·(1−θ)/6 + 2·(1−θ)/6 + 3·(1−θ)/6 + 4·(1+θ)/6 + 5·(1+θ)/6 + 6·(1+θ)/6
  = [6(1−θ) + 15(1+θ)]/6 = (21 + 9θ)/6 = (7 + 3θ)/2

If θ = 0, this reduces to a fair die, for which we would have a long-run average of 7/2 = 3.5 for the number of spots on the upward face.
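A quick numerical check of the mean of the tabulated distribution, computed directly from the definition (a sketch in Python; θ = 0.2 is an arbitrary illustrative value):

```python
theta = 0.2  # illustrative bias parameter; any value in [0, 1) works

# Probabilities for faces 1-6 of the loaded die, as in the table above.
probs = [(1 - theta) / 6] * 3 + [(1 + theta) / 6] * 3

# Mean computed directly from the definition: sum of i * P(X = i).
mean_direct = sum(i * p for i, p in zip(range(1, 7), probs))

# Mean from the closed-form expression (7 + 3*theta) / 2.
mean_formula = (7 + 3 * theta) / 2

print(mean_direct, mean_formula)
```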
II. Practical Example Used in Insurance
An insurance company sells a life insurance policy of Php500,000 for a premium (payment) of Php10,400 per year. Actuarial tables show that the probability of normal death in the year following the purchase of this policy is 0.1%. What is the expected "gain" for this life insurance policy?
There are two simple events here: either the customer will live through the year or will die (a normal death). The probability of normal death, as given by the problem, is 0.001, which yields a negative gain to the insurance company of Php10,400 − Php500,000 = −Php489,600. The probability that the customer will live is 1 − 0.001 = 0.999, in which case the company keeps the premium of Php10,400. Thus, the insurance company's gain X from this insurance policy in the year after the purchase has the following probability distribution:

Gain X   Php10,400   −Php489,600
P(X)     0.999       0.001

and the expected gain is E(X) = (10,400)(0.999) + (−489,600)(0.001) = Php9,900.
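The expected gain can be computed directly as a probability-weighted sum (a minimal sketch in Python, using only the figures given in the problem):

```python
# Possible gains for the insurer and their probabilities, from the problem statement.
gains = [10_400, 10_400 - 500_000]   # customer lives; customer dies
probs = [0.999, 0.001]

# Expected gain: the probability-weighted average of the possible gains.
expected_gain = sum(g * p for g, p in zip(gains, probs))
print(expected_gain)
```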
Returning to the three-coin example, the distribution of the number of heads can be extended with columns for the squared deviations from the mean µ = 1.5 and their probability-weighted values:

X = number of heads   P(X)   (X)P(X)      (X − µ)²            (X − µ)²P(X)
0                     1/8    0            (0 − 1.5)² = 2.25   0.28125
1                     3/8    3/8          (1 − 1.5)² = 0.25   0.09375
2                     3/8    6/8          (2 − 1.5)² = 0.25   0.09375
3                     1/8    3/8          (3 − 1.5)² = 2.25   0.28125
Total                        12/8 = 1.5                       0.75
The total in the last column is called the variance of the random variable, and the square
root, 0.866, is the standard deviation.
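These table computations can be reproduced in a few lines (a sketch in Python, using exact fractions to avoid rounding error):

```python
from fractions import Fraction

# Distribution of the number of heads in three tosses of a fair coin.
values = [0, 1, 2, 3]
probs = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]

# Mean: probability-weighted average of the values.
mean = sum(x * p for x, p in zip(values, probs))

# Variance: probability-weighted average of squared deviations from the mean.
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))

# Standard deviation: square root of the variance.
std_dev = float(variance) ** 0.5

print(mean, variance, round(std_dev, 3))
```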
Now, define the variance of a random variable as the weighted average of the squared deviations of the values of X from the mean, where the weights are the respective probabilities. The variance, usually denoted by the symbol σ², is also written Var(X) and is formally defined as

σ² = Var(X) = Σᵢ (i − µ)² P(X = i)

while the standard deviation σ is the square root of the variance:

σ = √Var(X)
III. Example of Gains in Life Insurance
The following calculations show the "deviations" formed by subtracting the mean from the gains, as well as the squared deviations and the probability-weighted squared deviations.
Solution:
The variance is the sum of the entries in the last column, i.e.,
σ² = 107,817,868 + 240.55692 ≈ 107,818,108
while the standard deviation is the square root of the variance,
σ = 10,383.55
Remember that the standard deviation is the more understandable of the two measures of spread, since the standard deviation is in the same units as X. For example, if X is a random variable representing the number of heads in three tosses of a fair coin, then the unit for the standard deviation is "heads", while the variance is in squared heads (heads²).
Unlike the mean, there is no simple interpretation for the variance or standard deviation. In relative terms, however:
• a small standard deviation (and variance) means that the distribution of the random variable is quite concentrated around the mean;
• a large standard deviation (and variance) means that the distribution is rather spread out, with some chance of observing values at some distance from the mean.
In practice, the variance is usually computed not from the definition but with the following result:

σ² = Var(X) = E(X²) − µ²

Thus, the variance is the difference between the expected value of X² and the square of the mean. Note: this can be derived from the definition, some algebraic expansion of a binomial expression, and some properties of expected values (such as the fact that the mean of a constant is the constant):

Var(X) = E[(X − µ)²] = E[X² − 2µX + µ²] = E(X²) − 2µE(X) + µ² = E(X²) − 2µ² + µ² = E(X²) − µ²
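For the three-coin distribution, the shortcut formula can be checked against the definitional variance numerically (a sketch in Python):

```python
# Distribution of the number of heads in three tosses of a fair coin.
values = [0, 1, 2, 3]
probs = [1/8, 3/8, 3/8, 1/8]

mean = sum(x * p for x, p in zip(values, probs))              # E(X) = 1.5

# Definitional variance: weighted average of squared deviations from the mean.
var_definition = sum((x - mean) ** 2 * p for x, p in zip(values, probs))

# Shortcut: E(X^2) minus the square of the mean.
e_x_squared = sum(x ** 2 * p for x, p in zip(values, probs))  # E(X^2) = 3.0
var_shortcut = e_x_squared - mean ** 2

print(var_definition, var_shortcut)
```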
KEY POINTS:
• The mean (or expected value) of a discrete random variable, say X, is a weighted average of the possible values of the random variable, where the weights are the respective probabilities:

µ = E(X) = Σᵢ i·P(X = i)

• The variance is the expected value of the squared deviations from the mean,

σ² = Var(X) = E[(X − µ)²] = E(X²) − µ²

and the standard deviation is its square root, σ = √Var(X).
Solution:
You should be able to obtain the mean of the team's total time in the relay as the sum of the means,
45.02 + 50.25 + 51.45 + 56.38 = 203.1 seconds,
with a variance equal to the sum of the variances (the squares of the standard deviations), i.e.,
0.20² + 0.26² + 0.24² + 0.22² = 0.2136,
so that the standard deviation is the square root of 0.2136 ≈ 0.46 seconds. The best time of 201.62 seconds is 3.2 standard deviations below the mean, so it would be very unlikely for the team to swim faster than this best time.
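These relay computations can be verified directly (a sketch in Python, assuming the four legs are independent):

```python
# Mean leg times (seconds) and their standard deviations for the four swimmers.
means = [45.02, 50.25, 51.45, 56.38]
std_devs = [0.20, 0.26, 0.24, 0.22]

# Mean of the total time: sum of the individual means.
total_mean = sum(means)

# Variance of the total (legs assumed independent): sum of the variances.
total_var = sum(s ** 2 for s in std_devs)
total_sd = total_var ** 0.5

# How many standard deviations below the mean the best time of 201.62 s lies.
z = (total_mean - 201.62) / total_sd
print(total_mean, round(total_sd, 2), round(z, 1))
```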
EXAMPLE 2: The crucial assumption is the independence of the random variables. Suppose the amount of money a group spends for lunch is represented by the random variable X, and the amount of money the same group spends on afternoon snacks is represented by the variable Y. The variance of the sum X + Y is then not the sum of the variances, since X and Y are not independent random variables.
Consider tossing a fair coin 10 times: what would be the expected number of heads? The answer is 5.
Solution:
Define Xi as 1 if the ith toss comes up heads and 0 if it comes up tails. Assuming in general that the coin has a chance p of yielding heads (with p = 1/2 when the coin is fair), the probability mass function of each Xi is

x           0       1
P(Xi = x)   1 − p   p

so E(Xi) = p, and the expected total number of heads is E(X1 + ⋯ + X10) = 10p = 5 when p = 1/2.
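This indicator-variable argument can be illustrated with a quick simulation (a sketch in Python; the trial count and seed are arbitrary choices):

```python
import random

def expected_heads(num_tosses: int, p: float) -> float:
    """Expected number of heads: the sum of the expectations of the indicators."""
    # Each toss contributes E(Xi) = p to the expected total.
    return num_tosses * p

# Simulation check of the long-run average number of heads in 10 tosses.
rng = random.Random(1)
trials = 50_000
avg = sum(sum(rng.random() < 0.5 for _ in range(10)) for _ in range(trials)) / trials

print(expected_heads(10, 0.5), avg)
```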
Adding or subtracting a constant c shifts the mean by that constant but leaves the variability unchanged:

E(X ± c) = E(X) ± c
Var(X ± c) = Var(X)

Multiplying or dividing the distribution of X by a constant changes the mean by a factor equal to the constant, and the variance by the square of the constant:

E(aX) = a E(X)
Var(aX) = a² Var(X)

The expected value of the sum (difference) of random variables X and Y is the sum (difference) of the expected values,

E(X ± Y) = E(X) ± E(Y),

and, if X and Y are independent, the variance of the sum (or difference) is the sum of the variances:

Var(X ± Y) = Var(X) + Var(Y)
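The constant-shift and scaling properties can be verified on the three-coin distribution (a sketch in Python, taking a = 2 and c = 3 as arbitrary illustrative constants):

```python
# Three-coin distribution: number of heads and probabilities.
values = [0, 1, 2, 3]
probs = [1/8, 3/8, 3/8, 1/8]

def mean(vals):
    return sum(v * p for v, p in zip(vals, probs))

def var(vals):
    m = mean(vals)
    return sum((v - m) ** 2 * p for v, p in zip(vals, probs))

a, c = 2, 3  # arbitrary constants for illustration

# Shifting by c moves the mean by c but leaves the variance unchanged.
shifted = [v + c for v in values]
# Scaling by a scales the mean by a and the variance by a**2.
scaled = [a * v for v in values]

print(mean(shifted), var(shifted))
print(mean(scaled), var(scaled))
```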
2. A Grade 12 student uses the Internet to get information on temperatures in the city where he intends to go for college, and finds the information in degrees Fahrenheit. Determine the equivalent summary statistics on the Celsius scale, given °C = (°F − 32)(5/9).
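As a sketch of the approach (in Python, with made-up Fahrenheit summary statistics purely for illustration): the conversion is a linear transformation aX + b, so the mean transforms like any individual value while the standard deviation is scaled by |a| only.

```python
# Hypothetical summary statistics in degrees Fahrenheit (illustrative values only).
mean_f = 68.0
sd_f = 9.0

# °C = (°F − 32)(5/9) is a linear transformation aX + b with a = 5/9, b = −160/9.
a = 5 / 9

# The mean transforms like any individual value; the sd is scaled by |a|.
mean_c = (mean_f - 32) * a
sd_c = sd_f * a

print(round(mean_c, 2), round(sd_c, 2))
```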
3. Suppose that in a casino, a certain slot machine pays out an average of Php15, with a standard
deviation of Php5000. Every play of the game costs a gambler Php20.