Exercises Probability and Statistics: Bruno Tuffin Inria, France
1. $E \cap F \subset E$, $E \subset E \cup F$;
2. if $E \subset F$ then $F^c \subset E^c$;
3. the commutative laws are valid;
4. the associative laws are valid;
5. $F = (F \cap E) \cup (F \cap E^c)$;
6. $E \cup F = E \cup (E^c \cap F)$;
7. De Morgan's laws are valid.
Exercise 6 Prove that
$P(E^c \cap F^c) = 1 - P(E) - P(F) + P(E \cap F)$.
Exercise 7 A total of 500 married working couples were polled about their
annual salaries, with the following information resulting.
                                  Husband
Wife                  Less than $25000   More than $25000
Less than $25000            212                198
More than $25000             36                 54
Thus, for instance, in 36 of the couples the wife earned more and the husband
earned less than $25000. If one of the couples is randomly chosen, what is
1. the probability that the husband earns less than $25000.
2. the conditional probability that the wife earns more than $25000 given
that the husband earns more than this amount.
3. the conditional probability that the wife earns more than $25000 given
that the husband earns less than this amount.
Exercise 8 There are two local factories that produce radios. Each radio
produced at factory A is defective with probability 0.05, whereas each one
produced at factory B is defective with probability 0.01. Suppose you purchase two radios that were produced at the same factory, which is equally
likely to have been either factory A or factory B. If the first radio that you
check is defective, what is the conditional probability that the other one is
also defective?
Exercise 9 Suppose that an insurance company classifies people into one of
three classes: good risks, average risks, and bad risks. Their records indicate
that the probabilities that good, average, and bad risk persons will be involved
in an accident over a 1-year span are, respectively, 0.05, 0.15, and 0.30. If 20
percent of the population are "good risks," 50 percent are "average risks,"
and 30 percent are "bad risks," what proportion of people have accidents
in a fixed year? If policy holder A had no accidents in 1987, what is the
probability that he or she is a good (average) risk?
Exercise 10 Two cards are drawn from a well-shuffled ordinary deck of 52
cards. Find the probability that they are both aces if the first card is (a)
replaced, (b) not replaced.
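The two sampling schemes in Exercise 10 can be checked numerically with exact fractions. This is only a minimal sketch; the helper name `prob_two_aces` is hypothetical and introduced here for illustration.

```python
from fractions import Fraction


def prob_two_aces(replace: bool, aces: int = 4, deck: int = 52) -> Fraction:
    """Probability that two successive draws are both aces (hypothetical helper)."""
    first = Fraction(aces, deck)
    if replace:
        # The first card goes back, so the second draw sees the full deck again.
        second = Fraction(aces, deck)
    else:
        # One ace and one card are missing for the second draw.
        second = Fraction(aces - 1, deck - 1)
    return first * second


print(prob_two_aces(replace=True))   # case (a), with replacement
print(prob_two_aces(replace=False))  # case (b), without replacement
```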
Exercise 11 A couple has 2 children. What is the probability that both are
girls if the eldest is a girl?
Exercise 12 Let A, B, C be events such that P(A) = 0.2, P(B) = 0.3,
P(C) = 0.4. Find the probability that at least one of the events A and B
occurs if
1. A and B are mutually exclusive;
2. A and B are independent.
Find the probability that all of the events A, B, C occur if
1. A, B, C are independent;
2. A, B, C are mutually exclusive.
Exercise 13
1. Box I contains 3 red and 2 blue marbles while Box II
contains 2 red and 8 blue marbles. A fair coin is tossed. If the coin
turns up heads, a marble is chosen from Box I; if it turns up tails, a
marble is chosen from Box II. Find the probability that a red marble
is chosen.
2. Suppose now that the one who tosses the coin does not reveal whether
it has turned up heads or tails (so that the box from which a marble was
chosen is not revealed) but does reveal that a red marble was chosen.
What is the probability that Box I was chosen (i.e., the coin turned up
heads)?
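Both parts of Exercise 13 follow from the law of total probability and Bayes' rule. The sketch below is one way to check the arithmetic with exact fractions; the variable names are assumptions made for this illustration.

```python
from fractions import Fraction

# Prior: a fair coin selects the box (heads -> Box I, tails -> Box II).
p_box = {"I": Fraction(1, 2), "II": Fraction(1, 2)}
# Chance of a red marble in each box: 3 of 5 in Box I, 2 of 10 in Box II.
p_red_given_box = {"I": Fraction(3, 5), "II": Fraction(2, 10)}

# Part 1: law of total probability.
p_red = sum(p_box[b] * p_red_given_box[b] for b in p_box)

# Part 2: Bayes' rule for the box given that a red marble was seen.
p_box1_given_red = p_box["I"] * p_red_given_box["I"] / p_red

print(p_red, p_box1_given_red)
```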
Exercise 14 In how many ways can 10 people be seated on a bench if only
4 seats are available?
Exercise 15 How many 4-digit numbers can be formed with the 10 digits
0, 1, 2, 3, . . . , 9 if (a) repetitions are allowed, (b) repetitions are not allowed,
(c) the last digit must be zero and repetitions are not allowed?
Exercise 16 In how many ways can 7 people be seated at a round table if
(a) they can sit anywhere, (b) 2 particular people must not sit next to each
other?
Exercise 17 Out of 5 mathematicians and 7 physicists, a committee consisting of 2 mathematicians and 3 physicists is to be formed. In how many
ways can this be done if (a) any mathematician and any physicist can be
included, (b) one particular physicist must be on the committee, (c) two
particular mathematicians cannot be on the committee?
Exercise 18 How many different salads can be made from lettuce, escarole,
endive, watercress, and chicory?
Exercise 19 A box contains 8 red, 3 white, and 9 blue balls. If 3 balls are
drawn at random without replacement, determine the probability that (a) all
3 are red, (b) all 3 are white, (c) 2 are red and 1 is white, (d) at least 1 is
white, (e) 1 of each color is drawn, (f) the balls are drawn in the order red,
white, blue.
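For the unordered cases of Exercise 19, counting with binomial coefficients is enough. The following sketch, which assumes every set of 3 balls is equally likely, shows the pattern for cases (a), (c), (d), and (e); `prob_counts` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

RED, WHITE, BLUE = 8, 3, 9
TOTAL = RED + WHITE + BLUE
DRAWS = 3


def prob_counts(r: int, w: int, b: int) -> Fraction:
    """Probability of exactly r red, w white, b blue balls, order ignored."""
    assert r + w + b == DRAWS
    ways = comb(RED, r) * comb(WHITE, w) * comb(BLUE, b)
    return Fraction(ways, comb(TOTAL, DRAWS))


p_all_red = prob_counts(3, 0, 0)            # case (a)
p_two_red_one_white = prob_counts(2, 1, 0)  # case (c)
# Case (d): "at least one white" is the complement of "no white".
p_at_least_one_white = 1 - Fraction(comb(TOTAL - WHITE, DRAWS), comb(TOTAL, DRAWS))
p_one_of_each = prob_counts(1, 1, 1)        # case (e)

print(p_all_red, p_two_red_one_white, p_at_least_one_white, p_one_of_each)
```

Case (f) depends on the order of the draws, so it needs a product of sequential conditional probabilities rather than this unordered count.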
Exercise 20 A shelf has 6 mathematics books and 4 physics books. Find
the probability that 3 particular mathematics books will be together.
Random variables
c
1 + x2
x/2
2/3
F (x) =
11/12
1
1. Plot this distribution function.
0x1
0x1
fR (x) = 2x
Determine the density function of W .
Exercise 34 A person playing darts finds that the probability of the dart
striking at distance r from the center has density
$f(r) = c\left[1 - (r/a)^2\right]$
where c is a constant, and a is the radius of the target. Find the probability
of hitting the bulls-eye, which is assumed to have radius b. Assume that the
target is always hit.
Exercise 35 Two points are selected at random in the interval [0, 1]. Determine the probability that the sum of their squares is less than 1.
Exercise 36 Find the probability density of the random variable $U = X^2$
where X is the random variable with density $f(x) = 1$ defined for $x \in [0, 1]$.
Exercise 37 Suppose the random variables X and Y have joint density function
$$f(x, y) = \begin{cases} xy/96 & 0 < x < 4,\ 1 < y < 5 \\ 0 & \text{otherwise.} \end{cases}$$
1. Find the density function of $U = X + 2Y$.
2. Find the joint density function of $U = X^2 Y$ and $V = X Y^2$.
Mathematical expectation
for $0 \le x \le 1$
and 0 otherwise.
If E[X] = 3/5, find a and b.
Exercise 41 The lifetime in hours of electronic tubes is a random variable
having a probability density function given by
$f(x) = a^2 x e^{-ax}$
for $x \ge 0$.
$$f(z) = \begin{cases} z - 8 & 8 \le z \le 9 \\ 10 - z & 9 < z \le 10 \\ 0 & \text{otherwise.} \end{cases}$$
1. Calculate the mean and variance of the random variable X.
2. The manufacturer sells the article for a fixed price of $2.00. He guarantees to refund the purchase money to any customer who finds the
weight of his article to be less than 8.25 oz. His cost of production is
related to the weight of the article by the relation x/15 + 0.35. Find
the expected profit per article.
Exercise 49 Let X and Y be independent random variables such that
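$$X = \begin{cases} 1 & \text{with probability } 1/3 \\ 0 & \text{with probability } 2/3 \end{cases}$$
and
$$Y = \begin{cases} 2 & \text{with probability } 3/4 \\ 3 & \text{with probability } 1/4 \end{cases}$$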
Find
1. E[3X + 2Y ]
2. E[(X 2 Y 2 ]
3. E[XY ]
4. $E[X^2 Y]$
Exercise 50 Suppose that the Rockwell hardness X and abrasion loss Y of
a specimen (coded data) have a joint density given by
$$f(x, y) = \begin{cases} x + y & \text{for } 0 \le x, y \le 1, \\ 0 & \text{otherwise.} \end{cases}$$
1. Find the marginal densities of X and Y .
2. Find E[X] and Var[X].
3. Find E[Y ] and Var[Y ].
4. Find Cov[X, Y] and the coefficient of correlation $\rho$.
5. Find E[X|Y ] and E[Y |X].
Exercise 51 If X1 and X2 have the same probability distribution function,
show that
$\mathrm{Cov}[X_1 - X_2, X_1 + X_2] = 0.$
Note that independence is not being assumed.
Exercise 52 Suppose that X has density function
$f(x) = e^{-x}$,
for $x \ge 0$.
Compute the moment generating function of X and use your result to determine its mean and variance. Check your answer for the mean by a direct
calculation.
Exercise 53 Suppose that X has density function
$f(x) = 1$,
for $0 \le x \le 1$.
n, m.
This property is called the memoryless property of the geometric distribution, which is the only discrete distribution that satisfies it. It means that surviving
at least m more steps, given that you have already survived n steps, has the same probability as surviving at
least m steps from the beginning.
Exercise 68 Let X follow an exponential distribution with parameter $\lambda$.
1. Calculate $P(X > x)$ for all $x \ge 0$.
2. Show that
$P(X > x + y \mid X > x) = P(X > y)$
for all $x, y \ge 0$.
This property is called the memoryless property of the exponential distribution, which is the only continuous distribution that satisfies it. It means that
surviving at least y more time units, given that you have already survived x, has the same probability as
surviving at least y time units from the beginning.
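The memoryless identity can be checked numerically. The sketch below assumes the standard survival function of an exponential variable with rate `lam`, namely exp(-lam * x); the rate value and the test points are hypothetical choices for illustration only.

```python
import math


def survival(x: float, lam: float) -> float:
    """P(X > x) for an exponential random variable with rate lam."""
    return math.exp(-lam * x)


def conditional_survival(x: float, y: float, lam: float) -> float:
    """P(X > x + y | X > x), from the definition of conditional probability."""
    return survival(x + y, lam) / survival(x, lam)


lam = 0.5  # hypothetical rate, chosen only for illustration
for x, y in [(1.0, 2.0), (3.0, 0.5)]:
    # The two printed values should agree up to floating-point error.
    print(conditional_survival(x, y, lam), survival(y, lam))
```

A numerical agreement of this kind is a sanity check, not a proof; the exercise still asks for the algebraic argument.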
Exercise 69
1. Prove that if $X_1, X_2, \ldots, X_n$ are independent exponential random variables having respective parameters $\lambda_1, \lambda_2, \ldots, \lambda_n$, then
$\min(X_1, X_2, \ldots, X_n)$ is exponential with parameter $\sum_{i=1}^{n} \lambda_i$.
2. A series system is one that needs all of its components to function in
order for the system itself to be functional. For an n-component series
system in which the component lifetimes are independent exponential
random variables with respective parameters $\lambda_1, \lambda_2, \ldots, \lambda_n$, what is the
probability that the system survives for a time t?
Exercise 70 Compare the Poisson approximation with the correct binomial
probability for the following cases:
1. P(X = 2) when n = 10, p = 0.1;
2. P(X = 0) when n = 10, p = 0.1;
3. P(X = 4) when n = 9, p = 0.2.
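One way to organize the comparison in Exercise 70 is to compute both probability mass functions side by side, using a Poisson mean equal to n times p. This is a minimal sketch; the function names are assumptions introduced here.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Exact binomial probability P(X = k) for n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, mean: float) -> float:
    """Poisson probability P(X = k) with the given mean."""
    return exp(-mean) * mean**k / factorial(k)


# (k, n, p) for the three cases listed in the exercise.
cases = [(2, 10, 0.1), (0, 10, 0.1), (4, 9, 0.2)]
for k, n, p in cases:
    exact = binomial_pmf(k, n, p)
    approx = poisson_pmf(k, n * p)  # Poisson approximation uses mean n*p
    print(f"k={k}, n={n}, p={p}: binomial={exact:.4f}, Poisson={approx:.4f}")
```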
Exercise 71 Show that (a) the moment generating function for a Cauchy
distributed random variable X does not exist but that (b) the characteristic
function does exist.
Exercise 72
1. Compute the moment generating function of a gamma
random variable X with parameters $(\alpha, \lambda)$.
2. Compute the expectation and variance of X.
3. Show that if $X_1$ and $X_2$ are independent gamma random variables
having respective parameters $(\alpha_1, \lambda)$ and $(\alpha_2, \lambda)$, then $X_1 + X_2$ is a
gamma random variable with parameters $(\alpha_1 + \alpha_2, \lambda)$.
Exercise 73 A box contains 5 red balls, 4 white balls, and 3 blue balls. A
ball is selected at random from the box, its color is noted, and then the ball
is replaced. Find the probability that out of 6 balls selected in this manner,
3 are red, 2 are white, and 1 is blue.
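Exercise 73 is a multinomial probability, since each of the 6 draws is independent and has the same three color probabilities. The sketch below evaluates it with exact fractions; `multinomial_pmf` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction
from math import factorial


def multinomial_pmf(counts, probs):
    """Multinomial probability of the given category counts (hypothetical helper)."""
    n = sum(counts)
    # Multinomial coefficient n! / (c1! c2! ... ck!); each division is exact.
    coeff = factorial(n)
    for c in counts:
        coeff //= factorial(c)
    prob = Fraction(coeff)
    for c, p in zip(counts, probs):
        prob *= p**c
    return prob


# 5 red, 4 white, 3 blue balls out of 12, drawn with replacement.
probs = [Fraction(5, 12), Fraction(4, 12), Fraction(3, 12)]
result = multinomial_pmf([3, 2, 1], probs)
print(result, float(result))
```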
1. What is the distribution of $X_{n+1} - \bar{X}_n$?
2. If $\bar{X}_n = 4$, give an interval that, with 90 percent confidence, will contain the value of $X_{n+1}$.
Exercise 86 A civil engineer wishes to measure the compressive strength
of two different types of concrete. A random sample of 10 specimens of the
first type yielded the following data (in psi)
Type 1: 3250, 3268, 4302, 3184, 3266, 3297, 3332, 3502, 3064, 3116
whereas a sample of 10 specimens of the second yielded the data
Type 2: 3094, 3106, 3004, 3066, 2984, 3124, 3316, 3212, 3380, 3018
If we assume that the samples are normal with a common variance, determine a 95 percent two-sided confidence interval for $\mu_1 - \mu_2$.
Exercise 87 Independent random samples are taken from the output of two
machines on a production line. The weight of each item is of interest. From
the first machine, a sample of size 36 is taken, with sample mean weight of
120 grams and a sample variance of 4. From the second machine, a sample
of size 64 is taken, with a sample mean weight of 130 grams and a sample
variance of 5. It is assumed that the weights of items from the first machine
are normally distributed with mean $\mu_1$ and variance $\sigma^2$ and that the weights
of items from the second machine are normally distributed with mean $\mu_2$
and variance $\sigma^2$ (that is, the variances are assumed to be equal). Find a 99
percent confidence interval for $\mu_1 - \mu_2$, the difference in population means.
Exercise 88 The standard deviation of the lifetimes of a sample of 200
electric light bulbs was computed to be 100 hours. Find (a) 95%, (b) 99%
confidence limits for the standard deviation of all such electric light bulbs.
Exercise 89 The standard deviation of the heights of 16 male students chosen at random in a school of 1000 male students is 2.40 inches. Find (a)
95%, (b) 99% confidence limits of the standard deviation for all male students at the school. Assume that height is normally distributed.
Exercise 90 Two samples of sizes 16 and 10, respectively, are drawn at
random from two normal populations. If their variances are found to be 24
and 18, respectively, find (a) 98%, (b) 90% confidence limits for the ratio of
the variances.
Exercise 91 Suppose $X_1, \ldots, X_n$ are independent Poisson random variables each having mean $\lambda$. Determine the maximum likelihood estimator
of $\lambda$.
Simulation
Exercise 94 Write Monte Carlo simulation pseudo-code to estimate the mean of exponential random variables with rate $\lambda$,
$$\int_0^\infty t f(t)\,dt,$$
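A crude Monte Carlo estimator for this mean averages many independent exponential draws. The sketch below is one possible realization in Python rather than pseudo-code; the function name, the sample size, the seed, and the rate value 2.0 are all assumptions chosen for illustration.

```python
import random


def estimate_exponential_mean(rate: float, n_samples: int, seed: int = 0) -> float:
    """Average of n_samples independent exponential draws with the given rate."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(n_samples):
        total += rng.expovariate(rate)
    return total / n_samples


rate = 2.0  # hypothetical rate
estimate = estimate_exponential_mean(rate, 100_000)
# The estimate should be close to the known mean 1 / rate.
print(estimate, 1.0 / rate)
```

The standard error of such an estimator shrinks like one over the square root of the sample size, which is why a large number of draws is used here.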
[Figure: graph of $f(x) = 3x^2$, with $x$ on the horizontal axis and $f(x)$ on the vertical axis.]
Hypothesis testing
Exercise 99 Consider a trial in which a jury must decide between the hypothesis that the defendant is guilty and the hypothesis that he or she is
innocent. (a) In the framework of hypothesis testing and the U.S. legal system, which of the hypotheses should be the null hypothesis? (b) What do you
think would be an appropriate significance level in this situation?
Exercise 100 A population distribution is known to have standard deviation 20. Determine the p-value of a test of the hypothesis that the population
mean is equal to 50, if the average of a sample of 64 observations is (a) 52.5;
(b) 55.0; (c) 57.5.
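With a known standard deviation, the p-value in Exercise 100 comes from a z statistic. The sketch below assumes a two-sided alternative (population mean not equal to 50), which the exercise does not state explicitly; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(sample_mean: float, mu0: float, sigma: float, n: int) -> float:
    """Two-sided p-value of a z-test with known standard deviation sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Probability that a standard normal exceeds |z| in absolute value.
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


for xbar in (52.5, 55.0, 57.5):
    print(xbar, round(two_sided_p_value(xbar, mu0=50.0, sigma=20.0, n=64), 4))
```

For a one-sided alternative the factor 2 would be dropped.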
Exercise 101 Design a decision rule to test the hypothesis that a coin is
fair if a sample of 64 tosses of the coin is taken and if a level of significance
of (a) 0.05, (b) 0.01 is used.
Exercise 102 In a certain chemical process, it is very important that a
particular solution that is to be used as a reactant have a pH of exactly 8.20.
8.6, 9.4, 5.0, 4.4, 3.7, 11.4, 10.0, 7.6, 14.4, 12.2, 11.0, 14.4, 9.3, 10.5,
10.3, 7.7, 8.3, 6.4, 9.2, 5.7, 7.9, 9.4, 9.0, 13.3, 11.6, 10.0, 9.5, 6.6
Exercise 106 Consider a test of $H_0: \mu \le 100$ versus $H_1: \mu > 100$. Suppose that a sample of size 20 has a sample mean of $\bar{X}_{20} = 105$. Determine
the p-value of this outcome if the population standard deviation is known to
equal (a) 5; (b) 10; (c) 15.
Exercise 107 An advertisement for a new toothpaste claims that it reduces
cavities of children in their cavity-prone years. Cavities per year for this age
group are normal with mean 3 and standard deviation 1. A study of 2,500
children who used this toothpaste found an average of 2.95 cavities per child.
Assume that the standard deviation of the number of cavities of a child using
this new toothpaste remains equal to 1.
1. Are these data strong enough, at the 5 percent level of significance, to
establish the claim of the toothpaste advertisement?
2. Do the data convince you to switch to this new toothpaste?
Exercise 108 A car is advertised as having a gas mileage rating of at least
30 miles/gallon in highway driving. If the miles per gallon obtained in 10
independent experiments are 26, 24, 20, 25, 27, 25, 28, 30, 26, 33, should
you believe the advertisement? What assumptions are you making?
Exercise 109 It is claimed that a certain type of bipolar transistor has a
mean value of current gain that is at least 210. A sample of these transistors
is tested. If the sample mean value of current gain is 200 with a sample
standard deviation of 35, would the claim be rejected at the 5 percent level
of significance if (a) the sample size is 25; (b) the sample size is 64?
Exercise 110 The viscosity of two different brands of car oil is measured
and the following data resulted:
Brand 1: 10.62, 10.58, 10.33, 10.72, 10.44, 10.74
Brand 2: 10.50, 10.52, 10.58, 10.62, 10.55, 10.51, 10.53
Test the hypothesis that the mean viscosity of the two brands is equal, assuming that the populations have normal distributions with equal variances.
Exercise 111 It is argued that the resistance of wire A is greater than the
resistance of wire B. You make tests on each wire with the following results
(in ohms).
Wire A: 0.140, 0.138, 0.143, 0.142, 0.144, 0.137
Wire B: 0.135, 0.140, 0.136, 0.142, 0.138, 0.140
What conclusion can you draw at the 10 percent significance level? Explain
what assumptions you are making.
Exercise 112 A professor claims that the average starting salary of industrial engineering graduating seniors is greater than that of civil engineering
graduates. To study this claim, samples of 16 industrial engineers and 16
civil engineers, all of whom graduated in 2006, were chosen and sample
members were queried about their starting salaries. If the industrial engineers had a sample mean salary of $59,700 and a sample standard deviation
of $2,400, and the civil engineers had a sample mean salary of $58,400
and a sample standard deviation of $2,200, has the professor's claim been
verified? Find the appropriate p-value.
We assume that the population distributions are normal and have equal
variances.
Exercise 113 To learn about the feeding habits of bats, 22 bats were tagged
and tracked by radio. Of these 22 bats, 12 were female and 10 were male.
The distances flown (in meters) between feedings were noted for each of the
22 bats, and the following summary statistics were obtained:
n = 180, S 2 = 92
Female bats: n = 12, X
1
Exercise 121 The following data relate x, the moisture of a wet mix of a
certain product, to Y , the density of the finished product.
x_i:  5     6     7     10    12    15    18    20
Y_i:  7.4   9.3   10.6  15.4  18.1  22.2  24.1  24.8
2
21
3
42
8
102
11
130
4
52
5
57
9
105
7
85
5
62
7
90
Exercise 124 A study has shown that a good model for the relationship
between X and Y , the first and second year batting averages of a randomly
chosen major league baseball player, is given by the equation
Y = 0.159 + 0.4X + e
where e is a normal random variable with mean 0. That is, the model is a
simple linear regression with a regression toward the mean.
1. If a player's batting average is 0.200 in his first year, what would you
predict for the second year?
2. If a player's batting average is 0.265 in his first year, what would you
predict for the second year?
3. If a player's batting average is 0.310 in his first year, what would you
predict for the second year?
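Because the error term e has mean 0, a natural point prediction is the conditional mean 0.159 + 0.4x. The sketch below evaluates it for the three first-year averages; the function and constant names are assumptions introduced for this illustration.

```python
INTERCEPT = 0.159  # from the model Y = 0.159 + 0.4 X + e
SLOPE = 0.4


def predict_second_year(first_year: float) -> float:
    """Predicted second-year average: the mean of Y given X = first_year."""
    return INTERCEPT + SLOPE * first_year


for x in (0.200, 0.265, 0.310):
    print(f"first year {x:.3f} -> predicted second year {predict_second_year(x):.3f}")
```

The middle value is the fixed point of the line, since 0.159 / (1 - 0.4) = 0.265; predictions for lower first-year averages move up toward it and predictions for higher ones move down, which is the regression toward the mean mentioned in the exercise.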
Exercise 125 The following data represent the relationship between the
number of alignment errors and the number of missing rivets for 10 different aircraft.
Number of missing rivets x:  13  15  10  22  30   7  25  16  20  15
Number of errors y:           7   7   5  12  15   2  13   9  11   8
$Y = \beta x + e$,
is called regression through the origin since it presupposes that the expected
response corresponding to the input level x = 0 is equal to 0. Suppose that
$(x_i, Y_i)$, $i = 1, \ldots, n$ is a data set from this model.
1. Determine the least squares estimator B of $\beta$.
2. What is the distribution of B?
3. Define SSR and give its distribution.
54.3
61.2
61.8
49.5
72.4
37.6
88.7
28.4
118.6
19.2
194.0
10.1
22.4
0
21.3
1
19.7
2
15.6
3
15.2
4
13.9
5
13.7
6
Analysis of variance
Resin II
0.038
0.035
0.031
0.022
0.012
Resin III
0.031
0.042
0.020
0.018
0.039
Test the hypothesis that there is no difference in the efficiency of the resins.
Exercise 132 A machine shop contains 3 ovens that are used to heat metal
specimens. Subject to random fluctuations, they are all supposed to heat to
the same temperature. To test this hypothesis, temperatures were noted on 15
separate heatings. The following data resulted.
Oven  Temperature
1     492.4, 493.6, 498.5, 488.6, 494
2     488.5, 485.3, 482, 479.4, 478
3     502.1, 492, 497.5, 495.3, 486.7
Sample 2
29
38
34
30
32
Sample 3
44
52
56
1. $\bar{x}_{\cdot\cdot} = (1/m) \sum_{i=1}^{m} \bar{x}_{i\cdot} = (1/n) \sum_{j=1}^{n} \bar{x}_{\cdot j}$
2. If $x_{ij} = a_i + b_j$,
$$\sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij} = n \sum_{i=1}^{m} a_i + m \sum_{j=1}^{n} b_j.$$
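The double-sum identity for additive data of the form a_i + b_j can be checked on a small array before proving it. The values of a and b below are hypothetical and chosen only for illustration.

```python
a = [1.5, -2.0, 4.0]       # hypothetical a_i, so m = 3
b = [0.5, 3.0, -1.0, 2.5]  # hypothetical b_j, so n = 4
m, n = len(a), len(b)

# Build the m-by-n array x[i][j] = a_i + b_j.
x = [[a_i + b_j for b_j in b] for a_i in a]

lhs = sum(sum(row) for row in x)   # double sum over i and j
rhs = n * sum(a) + m * sum(b)      # right-hand side of the identity

print(lhs, rhs)
```

Each a_i appears once in every one of the n columns and each b_j once in every one of the m rows, which is the counting argument behind the identity.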
Exercise 135 The following data refer to the number of deaths per 10,000
adults in a large Eastern city in the different seasons for the years 1982 to
1986.
Year  Winter  Spring  Summer  Fall
1982  33.6    31.4    29.8    32.1
1983  32.5    30.1    28.5    29.9
1984  35.3    33.2    29.5    28.7
1985  34.4    28.6    33.9    30.1
1986  37.3    34.1    28.5    29.4
1. Assuming a two-factor model, estimate the parameters.
2. Test the hypothesis that death rates do not depend on the season. Use
the 5 percent level of significance.
3. Test, at the 5 percent level of significance, the hypothesis that there is
no effect due to the year.
Exercise 136 A study was made as to how the concentration of a certain
drug in the blood, 24 hours after being injected, is influenced by age and
gender. An analysis of the blood samples of 40 people given the drug yielded
the following concentrations (in milligrams per cubic centimeter).