
Exercises

Probability and Statistics


Bruno Tuffin
Inria, France

Sets and probability

Exercise 1 An experiment consists of tossing a coin three times. What
is the sample space of this experiment? Which event corresponds to the
experiment resulting in more heads than tails?
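Remark. For an experiment as small as the one in Exercise 1, the sample space can be enumerated by brute force, which is a convenient way to check a hand-written answer. The Python sketch below is one possible way to do this; encoding each outcome as a string of 'H' and 'T' is an assumption made here for illustration only.

```python
from itertools import product

# All ordered outcomes of three tosses, e.g. "HTH" (assumed encoding).
sample_space = ["".join(tosses) for tosses in product("HT", repeat=3)]

# Event "more heads than tails": outcomes whose head count exceeds the tail count.
more_heads = [w for w in sample_space if w.count("H") > w.count("T")]

print(len(sample_space), sample_space)
print(len(more_heads), more_heads)
```

The same pattern extends to n tosses by changing `repeat`, although the list grows as 2^n.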
Exercise 2 Let $\Omega = \{1, 2, 3, 4, 5, 6, 7\}$, $E = \{1, 3, 5, 7\}$, $F = \{7, 4, 6\}$, $G = \{1, 4\}$. Determine
1. $E \cap F$
2. $E \cap G^c$
3. $E^c \cap (F \cup G)$
4. $E \cup (F \cap G)$
5. $(E \cap G) \cup (F \cap G)$.
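Exercise 3 If not done in class, prove
1. Commutative law
$E \cup F = F \cup E$ and $E \cap F = F \cap E$.
2. Associative law
$(E \cup F) \cup G = E \cup (F \cup G)$ and $(E \cap F) \cap G = E \cap (F \cap G)$.
3. Distributive law
$(E \cup F) \cap G = (E \cap G) \cup (F \cap G)$ and $(E \cap F) \cup G = (E \cup G) \cap (F \cup G)$.
4. $(E \cup F)^c = E^c \cap F^c$
5. $(E \cap F)^c = E^c \cup F^c$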
Exercise 4 Let E, F, G be three events. Find expressions for the events
that, of E, F, G,
1. only E occurs;
2. both E and G but not F occur;
3. at least one of the events occurs;
4. at least two of the events occur;
5. all three occur;
6. none of the events occurs;
7. at most one of them occurs;
8. at most two of them occur;
9. exactly two of them occur;
10. at most three of them occur.
Exercise 5 Show that
1. $E \cap F \subset E$, $E \subset E \cup F$;
2. if $E \subset F$ then $F^c \subset E^c$;
3. the commutative laws are valid;
4. the associative laws are valid;
5. $F = (F \cap E) \cup (F \cap E^c)$;
6. $E \cup F = E \cup (E^c \cap F)$;
7. De Morgan's laws are valid.
Exercise 6 Prove that
$P(E^c \cap F^c) = 1 - P(E) - P(F) + P(E \cap F)$.
Exercise 7 A total of 500 married working couples were polled about their
annual salaries, with the following information resulting.

                     Husband < $25000    Husband > $25000
Wife < $25000              212                 198
Wife > $25000               36                  54

Thus, for instance, in 36 of the couples the wife earned more and the husband
earned less than $25000. If one of the couples is randomly chosen, what is
1. the probability that the husband earns less than $25000.
2. the conditional probability that the wife earns more than $25000 given
that the husband earns more than this amount.
3. the conditional probability that the wife earns more than $25000 given
that the husband earns less than this amount.
Exercise 8 There are two local factories that produce radios. Each radio
produced at factory A is defective with probability 0.05, whereas each one
produced at factory B is defective with probability 0.01. Suppose you purchase two radios that were produced at the same factory, which is equally
likely to have been either factory A or factory B. If the first radio that you
check is defective, what is the conditional probability that the other one is
also defective?
Exercise 9 Suppose that an insurance company classifies people into one of
three classes: good risks, average risks, and bad risks. Their records indicate
that the probabilities that good, average, and bad risk persons will be involved
in an accident over a 1-year span are, respectively, 0.05, 0.15, and 0.30. If 20
percent of the population are "good risks," 50 percent are "average risks,"
and 30 percent are "bad risks," what proportion of people have accidents
in a fixed year? If policy holder A had no accidents in 1987, what is the
probability that he or she is a good (average) risk?
Exercise 10 Two cards are drawn from a well-shuffled ordinary deck of 52
cards. Find the probability that they are both aces if the first card is (a)
replaced, (b) not replaced.
Exercise 11 A couple has 2 children. What is the probability that both are
girls if the eldest is a girl?
Exercise 12 Let A, B, C be events such that P(A) = 0.2, P(B) = 0.3,
P(C) = 0.4. Find the probability that at least one of the events A and B
occurs if
1. A and B are mutually exclusive;
2. A and B are independent.
Find the probability that all of the events A, B, C occur if
1. A, B, C are independent;
2. A, B, C are mutually exclusive.

Exercise 13
1. Box I contains 3 red and 2 blue marbles while Box II
contains 2 red and 8 blue marbles. A fair coin is tossed. If the coin
turns up heads, a marble is chosen from Box I; if it turns up tails, a
marble is chosen from Box II. Find the probability that a red marble
is chosen.
2. Suppose now that the one who tosses the coin does not reveal whether
it has turned up heads or tails (so that the box from which a marble was
chosen is not revealed) but does reveal that a red marble was chosen.
What is the probability that Box I was chosen (i.e., the coin turned up
heads)?
Exercise 14 In how many ways can 10 people be seated on a bench if only
4 seats are available?
Exercise 15 How many 4-digit numbers can be formed with the 10 digits
0, 1, 2, 3, . . . , 9 if (a) repetitions are allowed, (b) repetitions are not allowed,
(c) the last digit must be zero and repetitions are not allowed?
Exercise 16 In how many ways can 7 people be seated at a round table if
(a) they can sit anywhere, (b) 2 particular people must not sit next to each
other?
Exercise 17 Out of 5 mathematicians and 7 physicists, a committee consisting of 2 mathematicians and 3 physicists is to be formed. In how many
ways can this be done if (a) any mathematician and any physicist can be
included, (b) one particular physicist must be on the committee, (c) two
particular mathematicians cannot be on the committee?
Exercise 18 How many different salads can be made from lettuce, escarole,
endive, watercress, and chicory?
Exercise 19 A box contains 8 red, 3 white, and 9 blue balls. If 3 balls are
drawn at random without replacement, determine the probability that (a) all
3 are red, (b) all 3 are white, (c) 2 are red and 1 is white, (d) at least 1 is
white, (e) 1 of each color is drawn, (f ) the balls are drawn in the order red,
white, blue.
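Remark. Several parts of Exercise 19 are counts of unordered draws without replacement, so answers can be cross-checked with binomial coefficients. The sketch below is one possible check and not a prescribed solution; the helper name `prob_unordered_draw` is hypothetical, and it does not cover part (f), where the order of the draws matters.

```python
from fractions import Fraction
from math import comb

# Box contents as stated in Exercise 19.
counts = {"red": 8, "white": 3, "blue": 9}
total = sum(counts.values())

def prob_unordered_draw(wanted, n_drawn=3):
    """Probability that n_drawn balls drawn without replacement contain
    exactly wanted[color] balls of each listed color (order ignored)."""
    if sum(wanted.values()) != n_drawn:
        raise ValueError("wanted counts must add up to n_drawn")
    ways = 1
    for color, k in wanted.items():
        ways *= comb(counts[color], k)  # ways to choose k balls of this color
    return Fraction(ways, comb(total, n_drawn))

p_all_red = prob_unordered_draw({"red": 3})
p_one_of_each = prob_unordered_draw({"red": 1, "white": 1, "blue": 1})
print(p_all_red, p_one_of_each)
```

Exact fractions are used so that the result can be compared directly with a hand computation.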
Exercise 20 A shelf has 6 mathematics books and 4 physics books. Find
the probability that 3 particular mathematics books will be together.

Random variables

Exercise 21 Let X represent the difference between the number of heads
and the number of tails obtained when a coin is tossed n times. What are
the possible values of X?
If the coin is assumed fair, for n = 3, what are the probabilities associated
with the values that X can take on?
Exercise 22 Let X be a random variable giving the number of boys in families with 3 children, assuming equal probabilities for boys and girls. Find
the probability distribution of X.
Exercise 23 A random variable X has the density function (for $x \in \mathbb{R}$):
$$f(x) = \frac{c}{1 + x^2}$$
1. Find the value of the constant c.
2. Find the probability that $X^2$ lies between 1/3 and 1.
3. Find the cumulative distribution function of X.
Exercise 24 The (cumulative) distribution function for a random variable
X is
$$F(x) = \begin{cases} 1 - e^{-2x} & 0 < x < \infty \\ 0 & \text{otherwise.} \end{cases}$$
Find
1. the density function
2. the probability that X > 2
3. the probability that $3 \le X \le 4$.
Exercise 25 The distribution function of the random variable X is given by
$$F(x) = \begin{cases} 0 & x < 0 \\ x/2 & 0 \le x < 1 \\ 2/3 & 1 \le x < 2 \\ 11/12 & 2 \le x < 3 \\ 1 & x \ge 3 \end{cases}$$
1. Plot this distribution function.
2. What is $P(2 < X \le 4)$?


3. What is P(X = 1)?
4. Is it a discrete or continuous random variable?
Exercise 26 Suppose you are given the distribution function F of a random
variable X. Explain how you could determine P(X = 1). (Hint: You will
need to use the concept of a limit.)
Exercise 27 The lifetime in hours of a certain kind of radio tube is a random variable having a probability density function given by (units are hours)
$$f(x) = \begin{cases} 0 & x \le 100 \\ \dfrac{100}{x^2} & x > 100. \end{cases}$$
What is the probability that exactly 2 of 5 such tubes in a radio set will
have to be replaced within the first 150 hours of operation? Assume that the
events $E_i$, i = 1, 2, 3, 4, 5, that the i-th such tube will have to be replaced
within this time are independent.
Exercise 28 The joint probability function of two discrete random variables
X and Y is given by
$p_{x,y} = c(2x + y)$,
where x and y can be all integers such that $0 \le x \le 2$, $0 \le y \le 2$, and
$p_{x,y} = 0$ otherwise.
1. Find the value of the constant c. (Hint: write down a table with all
probabilities.)
2. Give the (marginal) distributions of X and Y.
3. Find P(X = 2, Y = 1).
4. Find $P(X \ge 1, Y \le 2)$.
5. Are X and Y independent?
6. Find P(X = 1 | Y = 2).
7. Find the (conditional) probability distribution of X given Y = 2.
Exercise 29 The joint probability density function of X and Y is given by
$$f(x, y) = \frac{6}{7}\left(x^2 + \frac{xy}{2}\right), \quad 0 < x < 1,\ 0 < y < 2.$$
1. Verify that this is indeed a joint density function.
2. Compute the density function of X.
3. Find P(X > Y ).
4. Compute the conditional density function of X given Y = y.
Exercise 30 The joint density of X and Y is given by
$$f(x, y) = \begin{cases} x e^{-(x+y)} & x > 0,\ y > 0 \\ 0 & \text{otherwise.} \end{cases}$$
1. Compute the density of X.
2. Compute the density of Y .
3. Are X and Y independent?
4. Compute the conditional density function of X given Y = y.
Exercise 31 The joint density of X and Y is given by
$$f(x, y) = \begin{cases} 2 & 0 < x < y,\ 0 < y < 1 \\ 0 & \text{otherwise.} \end{cases}$$
1. Compute the density of X.
2. Compute the density of Y .
3. Are X and Y independent?
4. Compute the conditional density function of X given Y = y.
Exercise 32 Two people agree to meet between 2:00 P.M. and 3:00 P.M.,
with the understanding that each will wait no longer than 15 minutes for the
other. What is the probability that they will meet?
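Remark. An answer to Exercise 32 can be sanity-checked by simulation. The sketch below assumes, as is customary for this problem but not stated explicitly above, that the two arrival times are independent and uniformly distributed over the hour; the function name and its parameters are hypothetical.

```python
import random

def estimate_meeting_probability(n_trials, wait=15.0, window=60.0, seed=0):
    """Monte Carlo estimate of the probability that two people meet.

    Assumption: arrival times (in minutes after 2:00 P.M.) are independent
    and uniform on [0, window]; they meet if they arrive within `wait` minutes.
    """
    rng = random.Random(seed)  # seeded for reproducibility of the sketch
    meetings = 0
    for _ in range(n_trials):
        first = rng.uniform(0.0, window)
        second = rng.uniform(0.0, window)
        if abs(first - second) <= wait:
            meetings += 1
    return meetings / n_trials

print(estimate_meeting_probability(200_000))
```

The estimate should agree with the exact geometric answer up to the usual Monte Carlo error of order 1/sqrt(n_trials).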
Exercise 33 When a current I (measured in amperes) flows through a resistance R (measured in ohms), the power generated (measured in watts) is
given by $W = I^2 R$. Suppose that I and R are independent random variables
with densities
$$f_I(x) = 6x(1 - x), \quad 0 \le x \le 1,$$
$$f_R(x) = 2x, \quad 0 \le x \le 1.$$
Determine the density function of W.
Exercise 34 A person playing darts finds that the probability of the dart
striking at distance r from the center has density
$$f(r) = c\left[1 - (r/a)^2\right]$$
where c is a constant, and a is the radius of the target. Find the probability
of hitting the bull's-eye, which is assumed to have radius b. Assume that the
target is always hit.
Exercise 35 Two points are selected at random in the interval [0, 1]. Determine the probability that the sum of their squares is less than 1.
Exercise 36 Find the probability density of the random variable $U = X^2$
where X is the random variable with density f(x) = 1 defined for $x \in [0, 1]$.
Exercise 37 Suppose the random variables X and Y have joint density function
$$f(x, y) = \begin{cases} xy/96 & 0 < x < 4,\ 1 < y < 5 \\ 0 & \text{otherwise.} \end{cases}$$
1. Find the density function of U = X + 2Y.
2. Find the joint density function of $U = X^2 Y$ and $V = X Y^2$.

Mathematical expectation

Exercise 38 An insurance company writes a policy to the effect that an
amount of money A must be paid if some event E occurs within a year.
If the company estimates that E will occur within a year with probability
p, what should it charge the customer so that its expected profit will be 10
percent of A?
Exercise 39 A total of 4 buses carrying 148 students from the same school
arrive at a football stadium. The buses carry, respectively, 40, 33, 25, and
50 students. One of the students is randomly selected. Let X denote the
number of students that were on the bus carrying this randomly selected
student. One of the 4 bus drivers is also randomly selected. Let Y denote
the number of students on her bus.
1. Which of E[X] or E[Y ] do you think is larger? Why?
2. Compute E[X] and E[Y ].
Exercise 40 The density function of X is given by
$$f(x) = a + bx^2 \quad \text{for } 0 \le x \le 1$$
and 0 otherwise.
If E[X] = 3/5, find a and b.
Exercise 41 The lifetime in hours of electronic tubes is a random variable
having a probability density function given by
$$f(x) = a^2 x e^{-ax} \quad \text{for } x \ge 0.$$
Compute the expected lifetime of such a tube.


Exercise 42 Let $X_1, \ldots, X_n$ be independent random variables having the
common density function f(x) = 1 for $x \in [0, 1]$, and 0 otherwise. Find
1. $E[\max(X_1, \ldots, X_n)]$,
2. $E[\min(X_1, \ldots, X_n)]$.
Exercise 43 If E[X] = 2 and $E[X^2] = 8$, calculate $E[(2 + 4X)^2]$ and $E[X^2 + (X + 1)^2]$.
Exercise 44 A community consists of 100 married couples. If during a
given year 50 of the members of the community die, what is the expected
number of marriages that remain intact? Assume that the set of people who
die is equally likely to be any of the $\binom{200}{50}$ groups of size 50. (Hint: for
i = 1, ..., 100, let $X_i = 1$ if neither member of couple i dies, 0 otherwise.)
Exercise 45 Compute the expectation and variance of the number of successes in n independent trials, each of which results in a success with probability p. Is independence necessary?
Exercise 46 Let X be a random variable with mean $\mu$ and finite moment
of order n.
Prove that $\forall c \in \mathbb{R}$, $E[(X - c)^2] \ge \mathrm{Var}[X]$.
Hence, the best predictor of a random variable, in terms of minimizing
its mean square error, is just its mean.
Exercise 47 Let $p_i = P(X = i)$ and suppose $p_1 + p_2 + p_3 = 1$. If E[X] = 2,
what values of $p_1, p_2, p_3$ minimize the variance? And which values maximize it?
Exercise 48 A random variable X, which represents the weight (in ounces)
of an article, has density function given by
$$f(z) = \begin{cases} z - 8 & 8 \le z \le 9 \\ 10 - z & 9 < z \le 10 \\ 0 & \text{otherwise.} \end{cases}$$
1. Calculate the mean and variance of the random variable X.
2. The manufacturer sells the article for a fixed price of $2.00. He guarantees to refund the purchase money to any customer who finds the
weight of his article to be less than 8.25 oz. His cost of production is
related to the weight of the article by the relation x/15 + 0.35. Find
the expected profit per article.
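Exercise 49 Let X and Y be independent random variables such that
$$X = \begin{cases} 1 & \text{with probability } 1/3 \\ 0 & \text{with probability } 2/3 \end{cases}$$
and
$$Y = \begin{cases} 2 & \text{with probability } 3/4 \\ 3 & \text{with probability } 1/4 \end{cases}$$
Find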
1. E[3X + 2Y ]
2. E[(X 2 Y 2 ]
3. E[XY ]
4. $E[X^2 Y]$
Exercise 50 Suppose that the Rockwell hardness X and abrasion loss Y of
a specimen (coded data) have a joint density given by

$$f(x, y) = \begin{cases} x + y & \text{for } 0 \le x, y \le 1, \\ 0 & \text{otherwise.} \end{cases}$$
1. Find the marginal densities of X and Y .
2. Find E[X] and Var[X].
3. Find E[Y ] and Var[Y ].
4. Find Cov[X, Y] and the coefficient of correlation $\rho$.
5. Find E[X|Y ] and E[Y |X].
Exercise 51 If $X_1$ and $X_2$ have the same probability distribution function,
show that
$\mathrm{Cov}[X_1 - X_2, X_1 + X_2] = 0$.
Note that independence is not being assumed.
Exercise 52 Suppose that X has density function
$$f(x) = e^{-x}, \quad \text{for } x \ge 0.$$

Compute the moment generating function of X and use your result to determine its mean and variance. Check your answer for the mean by a direct
calculation.
Exercise 53 Suppose that X has density function
$$f(x) = 1, \quad \text{for } 0 \le x \le 1.$$
Compute the moment generating function of X. Differentiate to obtain
$E[X^n]$ and check your answer.
Exercise 54 Suppose that X is a random variable with mean and variance
both equal to 20. What can be said about $P(0 \le X \le 40)$?
Exercise 55 From past experience, a professor knows that the test score of
a student taking her final examination is a random variable with mean 75.
1. Give an upper bound to the probability that a student's test score will
exceed 85.
2. Suppose in addition the professor knows that the variance of a student's
test score is equal to 25. What can be said about the probability that a
student will score between 65 and 85?
Exercise 56 Prove that
Var[Y ] = E[Var[Y |X = x]] + Var[E[Y |X = x]].


Particular distribution functions

Exercise 57 A satellite system consists of 4 components and can function
adequately if at least 2 of the 4 components are in working condition. If each
component is, independently, in working condition with probability 0.6, what
is the probability that the system functions adequately?
Exercise 58 Find the probability that in five tosses of a fair die, a 3 will
appear (a) twice, (b) at most once, (c) at least two times.
Exercise 59 Suppose that a particular trait (such as eye color or left-handedness) of a person is classified on the basis of one pair of genes, and suppose
that d represents a dominant gene and r a recessive gene. Thus, a person
with dd genes is pure dominance, one with rr is pure recessive, and one with
rd is hybrid. The pure dominance and the hybrid are alike in appearance.
Children receive 1 gene from each parent. If, with respect to a particular
trait, 2 hybrid parents have a total of 4 children, what is the probability that
3 of the 4 children have the outward appearance of the dominant gene?
Exercise 60 Let X be a binomial random variable with E[X] = 7 and
Var[X] = 2.1.
1. Find P(X = 4)
2. Find P(X > 12).
Exercise 61 Derive the moment generating function of a binomial random
variable and then use your result to verify the formulas for the mean and
variance.
Exercise 62 If U is uniformly distributed on (0, 1), show that $a + (b - a)U$
is uniform on (a, b).
Exercise 63 You arrive at a bus stop at 10 o'clock, knowing that the bus
will arrive at some time uniformly distributed between 10 and 10:30. What
is the probability that you will have to wait longer than 10 minutes? If at
10:15 the bus has not yet arrived, what is the probability that you will have
to wait at least an additional 10 minutes?
Exercise 64 Find the area under the standard normal curve
1. between z = 0 and z = 1.2
2. between z = 0.46 and z = 2.21


3. to the right of z = 1.28.
4. determine the value of z such that the area between 0 and z is 0.3770.
Exercise 65 The Scholastic Aptitude Test mathematics test scores across
the population of high school seniors follow a normal distribution with mean
500 and standard deviation 100. If five seniors are randomly chosen, find
the probability that (a) all scored below 600 and (b) exactly three of them
scored above 640.
Exercise 66 The lifetimes of interactive computer chips produced by a certain semiconductor manufacturer are normally distributed having mean $4.4 \times 10^6$ hours
with a standard deviation of $3 \times 10^5$ hours. If a mainframe manufacturer requires that at least 90 percent of the chips from a large batch
will have lifetimes of at least $4.0 \times 10^6$ hours, should he contract with the
semiconductor firm?
Exercise 67 Let X follow a geometric distribution with parameter p.
1. Calculate P(X > n) for $n \ge 0$.
2. Show that
$$P(X > n + m \mid X > n) = P(X > m) \quad \forall n, m.$$
This property is named the memoryless property of the geometric distribution. It is the only discrete distribution to satisfy it. It means that living
at least m more steps given that you have lived n is the same as living at
least m from the beginning, with the same distribution.
Exercise 68 Let X follow an exponential distribution with parameter $\lambda$.
1. Calculate P(X > x) for $x \ge 0$.
2. Show that
$$P(X > x + y \mid X > x) = P(X > y) \quad \forall x, y \ge 0.$$
This property is named the memoryless property of the exponential distribution. It is the only continuous distribution to satisfy it. It means that
living at least y more time units given that you have lived x is the same as
living at least y from the beginning, with the same distribution.

Exercise 69
1. Prove that if $X_1, X_2, \ldots, X_n$ are independent exponential random variables having respective parameters $\lambda_1, \lambda_2, \ldots, \lambda_n$, then
$\min(X_1, X_2, \ldots, X_n)$ is exponential with parameter $\sum_{i=1}^n \lambda_i$.
2. A series system is one that needs all of its components to function in
order for the system itself to be functional. For an n-component series
system in which the component lifetimes are independent exponential
random variables with respective parameters $\lambda_1, \lambda_2, \ldots, \lambda_n$, what is the
probability that the system survives for a time t?
Exercise 70 Compare the Poisson approximation with the correct binomial
probability for the following cases:
1. P(X = 2) when n = 10, p = 0.1;
2. P(X = 0) when n = 10, p = 0.1;
3. P(X = 4) when n = 9, p = 0.2.
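Remark. The comparison asked for in Exercise 70 is easy to tabulate numerically. The sketch below is one possible way to do it with the Python standard library; the function names are hypothetical, and the Poisson approximation is taken, as usual, with mean n * p.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Exact binomial probability P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Poisson probability P(X = k) for a Poisson variable with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

# (k, n, p) triples taken from the three cases of Exercise 70.
cases = [(2, 10, 0.1), (0, 10, 0.1), (4, 9, 0.2)]
for k, n, p in cases:
    exact = binomial_pmf(k, n, p)
    approx = poisson_pmf(k, n * p)  # approximating Poisson has mean n * p
    print(f"n={n}, p={p}, k={k}: binomial={exact:.4f}, Poisson={approx:.4f}")
```

Printing both columns side by side makes it easy to see in which cases the approximation is tighter.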
Exercise 71 Show that (a) the moment generating function for a Cauchy
distributed random variable X does not exist but that (b) the characteristic
function does exist.
Exercise 72
1. Compute the moment generating function of a gamma
random variable X with parameters $(\alpha, \lambda)$.
2. Compute the expectation and variance of X.
3. Show that if $X_1$ and $X_2$ are independent gamma random variables
having respective parameters $(\alpha_1, \lambda)$ and $(\alpha_2, \lambda)$, then $X_1 + X_2$ is a
gamma random variable with parameters $(\alpha_1 + \alpha_2, \lambda)$.
Exercise 73 A box contains 5 red balls, 4 white balls, and 3 blue balls. A
ball is selected at random from the box, its color is noted, and then the ball
is replaced. Find the probability that out of 6 balls selected in this manner,
3 are red, 2 are white, and 1 is blue.


Sampling theory and main statistical theorems

Exercise 74 A total of 100 people work at company A, whereas a total
of 110 work at company B. Suppose the total employee payroll is larger at
company A than at company B.
1. What does this imply about the median of the salaries at company A
with regard to the median of the salaries at company B?
2. What does this imply about the average of the salaries at company A
with regard to the average of the salaries at company B?
Exercise 75 The following are the percentages of ash content in 12 samples
of coal found in close proximity:
9.2, 14.1, 9.8, 12.4, 16.0, 12.6, 22.7, 18.9, 21.0, 14.5, 20.4, 16.9
Find the
1. sample mean, and
2. sample standard deviation of these percentages.
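Remark. The two quantities in Exercise 75 can be checked with Python's `statistics` module. This is a sketch for verification only; it assumes the sample standard deviation is meant with the n - 1 divisor, which is what `statistics.stdev` computes.

```python
import statistics

# Ash content percentages copied from Exercise 75.
ash = [9.2, 14.1, 9.8, 12.4, 16.0, 12.6, 22.7, 18.9, 21.0, 14.5, 20.4, 16.9]

sample_mean = statistics.mean(ash)
sample_std = statistics.stdev(ash)  # uses the n - 1 divisor (assumed convention)

print(f"mean = {sample_mean:.3f}, standard deviation = {sample_std:.3f}")
```

If the course convention uses the divisor n instead, `statistics.pstdev` is the corresponding function.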
Exercise 76 Give examples of estimators (or estimates) which are (a) unbiased and efficient, (b) unbiased and inefficient, (c) biased and inefficient.
Exercise 77 Measurements of the diameters of a random sample of 200
ball bearings made by a certain machine during one week showed a mean of
0.824 inch and a standard deviation of 0.042 inch. Find (a) 95%, (b) 99%
confidence limits for the mean diameter of all the ball bearings.
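Remark. For a large sample such as the one in Exercise 77, confidence limits for the mean are commonly computed with the normal approximation, mean plus or minus z * s / sqrt(n). The sketch below illustrates that computation under this assumption; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_limits(mean, std, n, level):
    """Two-sided confidence limits for a mean, using the normal approximation.

    Assumption: n is large enough for the sample standard deviation `std`
    to stand in for the population value.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # e.g. about 1.96 for level 0.95
    half_width = z * std / sqrt(n)
    return mean - half_width, mean + half_width

for level in (0.95, 0.99):
    low, high = z_confidence_limits(0.824, 0.042, 200, level)
    print(f"{level:.0%} limits: {low:.4f} to {high:.4f}")
```

The same helper applies to other large-sample exercises in this section by changing the arguments.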
Exercise 78 An electric scale gives a reading equal to the true weight plus
a random error that is normally distributed with mean 0 and standard deviation $\sigma = 0.1$ mg. Suppose that the results of five successive weighings of
the same object are as follows: 3.142, 3.163, 3.155, 3.150, 3.141.
1. Determine a 95 percent confidence interval estimate of the true weight.
2. Determine a 99 percent confidence interval estimate of the true weight.
Exercise 79 The standard deviation of test scores on a certain achievement
test is 11.3. If a random sample of 81 students had a sample mean score of
74.6, find a 90 percent confidence interval estimate for the average score of
all students.
Exercise 80 The sample mean of the annual salaries of a group of 100
accountants who work at a large accounting firm is $130,000 with a sample
standard deviation of $20,000. If a member of this group is randomly chosen,
what can we say about
1. the probability that his or her salary is between $90,000 and $170,000?
2. the probability that his or her salary exceeds $150,000?
Hint: use Chebyshev's inequality.
Exercise 81 An astronomer wants to measure the distance from her observatory to a distant star. However, due to atmospheric disturbances, any
measurement will not yield the exact distance d. As a result, the astronomer
has decided to make a series of measurements and then use their average
value as an estimate of the actual distance. If the astronomer believes that
the values of the successive measurements are independent random variables
with a mean of d light years and a standard deviation of 2 light years, how
many measurements need she make to be at least 95 percent certain that her
estimate is accurate to within 0.5 light years?
Exercise 82 Verify the central limit theorem for a random variable X that
is binomially distributed, and thereby establish the validity of the normal
approximation to the binomial distribution.
Exercise 83 A standardized test is given annually to all sixth-grade students in the state of Washington. To determine the average score of students
in her district, a school supervisor selects a random sample of 100 students.
If the sample mean of these students' scores is 320 and the sample standard
deviation is 16, give a 95 percent confidence interval estimate of the average
score of students in that supervisor's district.
Exercise 84 A sample of 150 brand A light bulbs showed a mean lifetime
of 1400 hours and a standard deviation of 120 hours. A sample of 200 brand
B light bulbs showed a mean lifetime of 1200 hours and a standard deviation
of 80 hours. Find (a) 95%, (b) 99% confidence limits for the difference of
the mean lifetimes of the populations of brands A and B.
Exercise 85 Let $X_1, \ldots, X_n, X_{n+1}$ be a sample from a normal population
having an unknown mean $\mu$ and variance 1. Let $\bar{X}_n = (1/n) \sum_{i=1}^n X_i$ be the
average of the first n of them.
1. What is the distribution of $X_{n+1} - \bar{X}_n$?
2. If $\bar{X}_n = 4$, give an interval that, with 90 percent confidence, will contain the value of $X_{n+1}$.
Exercise 86 A civil engineer wishes to measure the compressive strength
of two different types of concrete. A random sample of 10 specimens of the
first type yielded the following data (in psi)
Type 1: 3250, 3268, 4302, 3184, 3266, 3297, 3332, 3502, 3064, 3116
whereas a sample of 10 specimens of the second yielded the data
Type 2: 3094, 3106, 3004, 3066, 2984, 3124, 3316, 3212, 3380, 3018
If we assume that the samples are normal with a common variance, determine a 95 percent two-sided confidence interval for $\mu_1 - \mu_2$.
Exercise 87 Independent random samples are taken from the output of two
machines on a production line. The weight of each item is of interest. From
the first machine, a sample of size 36 is taken, with sample mean weight of
120 grams and a sample variance of 4. From the second machine, a sample
of size 64 is taken, with a sample mean weight of 130 grams and a sample
variance of 5. It is assumed that the weights of items from the first machine
are normally distributed with mean $\mu_1$ and variance $\sigma^2$ and that the weights
of items from the second machine are normally distributed with mean $\mu_2$
and variance $\sigma^2$ (that is, the variances are assumed to be equal). Find a 99
percent confidence interval for $\mu_1 - \mu_2$, the difference in population means.
Exercise 88 The standard deviation of the lifetimes of a sample of 200
electric light bulbs was computed to be 100 hours. Find (a) 95%, (b) 99%
confidence limits for the standard deviation of all such electric light bulbs.
Exercise 89 The standard deviation of the heights of 16 male students chosen at random in a school of 1000 male students is 2.40 inches. Find (a)
95%, (b) 99% confidence limits of the standard deviation for all male students at the school. Assume that height is normally distributed.
Exercise 90 Two samples of sizes 16 and 10, respectively, are drawn at
random from two normal populations. If their variances are found to be 24
and 18, respectively, find (a) 98%, (b) 90% confidence limits for the ratio of
the variances.
Exercise 91 Suppose $X_1, \ldots, X_n$ are independent Poisson random variables each having mean $\lambda$. Determine the maximum likelihood estimator
of $\lambda$.

Exercise 92 Let $X_1, \ldots, X_n$ be a sample from the distribution whose density function is
$$f(x) = e^{-(x - \theta)}, \quad x \ge \theta$$
and 0 otherwise.
Determine the maximum likelihood estimator of $\theta$.
Exercise 93 Suppose that $X_1, \ldots, X_n$ are normal with mean $\mu_1$, $Y_1, \ldots, Y_n$
are normal with mean $\mu_2$, and $W_1, \ldots, W_n$ are normal with mean $\mu_1 + \mu_2$.
Assuming that all 3n random variables are independent with a common variance, find the maximum likelihood estimators of $\mu_1$ and $\mu_2$.


Simulation

Exercise 94 Write a Monte Carlo simulation pseudo-code to estimate the mean of exponential random variables with rate $\lambda$,
$$\int_0^\infty t f(t)\,dt,$$
with $f(t) = \lambda e^{-\lambda t}$ the density of an exponential distribution defined over $[0, \infty)$.
As an input of the program, you will give the number of independent runs
(trials) that will be used to obtain the estimation and $\lambda$.
It should include a confidence interval for the estimation.
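Remark. As an illustration of the structure such a program might take for Exercise 94, here is a minimal Python sketch. It is one possible design and not the expected answer: it samples by inverse transform from a uniform generator, and it builds an approximate 95 percent confidence interval from the central limit theorem with the quantile 1.96. The function name and the `seed` parameter are assumptions made for the example.

```python
import math
import random
import statistics

def estimate_exponential_mean(n_runs, rate, seed=0):
    """Monte Carlo estimate of the mean of an exponential variable with given rate.

    Returns the sample mean and an approximate 95% confidence interval
    (normal approximation, valid for large n_runs).
    """
    rng = random.Random(seed)
    # Inverse transform: if U is uniform on [0, 1), then -ln(1 - U) / rate
    # follows an exponential distribution with the given rate.
    samples = [-math.log(1.0 - rng.random()) / rate for _ in range(n_runs)]
    mean = statistics.fmean(samples)
    std = statistics.stdev(samples)
    half_width = 1.96 * std / math.sqrt(n_runs)
    return mean, (mean - half_width, mean + half_width)

mean, (low, high) = estimate_exponential_mean(100_000, rate=2.0)
print(f"estimate = {mean:.4f}, 95% CI = [{low:.4f}, {high:.4f}]")
```

The interval width shrinks like 1/sqrt(n_runs), so quadrupling the number of runs roughly halves it.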
Exercise 95 Write a Monte Carlo simulation pseudo-code allowing to estimate the volume of a sphere with radius 1/2 and center (1/2, 1/2, 1/2)
(hint: you can generate points in the unit cube and determine if they fall or
not in the sphere). As an input of the program, you will give the number of
independent runs (trials) that will be used to obtain the estimation.
It should include a confidence interval for the estimation.
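Remark. The hint in Exercise 95 describes a hit-or-miss method, which could be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference solution: points are drawn uniformly in the unit cube, the fraction falling inside the sphere estimates its volume (the cube has volume 1), and the confidence interval uses the normal approximation for a proportion. The function name and the `seed` parameter are hypothetical.

```python
import math
import random

def estimate_sphere_volume(n_trials, seed=0):
    """Hit-or-miss estimate of the volume of the sphere of radius 1/2
    centered at (1/2, 1/2, 1/2), with an approximate 95% confidence interval."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        x, y, z = rng.random(), rng.random(), rng.random()
        # Inside the sphere when the squared distance to the center is at most (1/2)^2.
        if (x - 0.5) ** 2 + (y - 0.5) ** 2 + (z - 0.5) ** 2 <= 0.25:
            hits += 1
    p_hat = hits / n_trials  # also the volume estimate, since the cube has volume 1
    half_width = 1.96 * math.sqrt(p_hat * (1.0 - p_hat) / n_trials)
    return p_hat, (p_hat - half_width, p_hat + half_width)

volume, (low, high) = estimate_sphere_volume(100_000)
print(f"estimate = {volume:.4f}, 95% CI = [{low:.4f}, {high:.4f}]")
```

The estimate can be compared with the closed-form volume of a sphere as a check.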
Exercise 96 We now aim at estimating the surface $I = \int_0^1 3x^2\,dx$ under
the curve $f(x) = 3x^2$ between 0 and 1 (the grey part in Figure 1).
1. Write a pseudo-code (including a confidence interval as an output) estimating I by simulation by considering random points uniformly distributed on a specific domain (a rectangle with known measure) that
includes the grey area.
2. Write a pseudo-code computing I by estimating E[f (X)] with X a
random variable adequately chosen.
3. Compute the variance of those two estimators and compare with the
numerical results. Which estimator is the best? Explain/interpret.
Exercise 97 Given a uniform pseudo-random generator, explain how we
can generate a binomial distribution with parameters (n, p).
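Remark. One standard construction for Exercise 97 uses the fact that a binomial variable is a sum of n independent Bernoulli trials. The Python sketch below illustrates that idea only (other constructions, such as inverting the distribution function, are possible); the function name is hypothetical and `random.Random` plays the role of the uniform generator.

```python
import random

def binomial_from_uniforms(n, p, rng):
    """Return one Binomial(n, p) draw built from n independent Uniform(0, 1) draws."""
    # Each uniform draw U gives a Bernoulli(p) success exactly when U < p;
    # the binomial variate is the number of successes among the n draws.
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(0)  # seeded only to make the illustration reproducible
draws = [binomial_from_uniforms(10, 0.3, rng) for _ in range(20_000)]
print(sum(draws) / len(draws))  # empirical mean, to be compared with n * p
```

This approach costs n uniform draws per variate, which is acceptable for small n.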
Exercise 98 Given a uniform pseudo-random generator, explain how we
can generate a beta distribution with parameters (n, 1).

Figure 1: Computing the surface (plot of $f(x) = 3x^2$ for $0 \le x \le 1$, with the area under the curve shaded grey)

Hypothesis testing

Exercise 99 Consider a trial in which a jury must decide between the hypothesis that the defendant is guilty and the hypothesis that he or she is
innocent. (a) In the framework of hypothesis testing and the U.S. legal system, which of the hypotheses should be the null hypothesis? (b) What do you
think would be an appropriate significance level in this situation?
Exercise 100 A population distribution is known to have standard deviation 20. Determine the p-value of a test of the hypothesis that the population
mean is equal to 50, if the average of a sample of 64 observations is (a) 52.5;
(b) 55.0; (c) 57.5.
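Remark. With a known population standard deviation, the p-value in Exercise 100 comes from a z statistic. The sketch below assumes a two-sided alternative (mean different from 50) and the normal approximation for the sample mean; both are assumptions of this illustration, and the function name is hypothetical.

```python
import math

def two_sided_z_p_value(sample_mean, mu0, sigma, n):
    """p-value of the two-sided z-test of H0: mean = mu0 with known sigma."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # For a standard normal Z, P(|Z| > a) = erfc(a / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

for xbar in (52.5, 55.0, 57.5):
    print(xbar, two_sided_z_p_value(xbar, mu0=50.0, sigma=20.0, n=64))
```

For a one-sided alternative, the corresponding p-value would be half of the value returned here when the sample mean lies on the side of the alternative.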
Exercise 101 Design a decision rule to test the hypothesis that a coin is
fair if a sample of 64 tosses of the coin is taken and if a level of significance
of (a) 0.05, (b) 0.01 is used.
Exercise 102 In a certain chemical process, it is very important that a
particular solution that is to be used as a reactant have a pH of exactly 8.20.
A method for determining pH that is available for solutions of this type
is known to give measurements that are normally distributed with a mean
equal to the actual pH and with a standard deviation of 0.02. Suppose 10
independent measurements yielded the following pH values:
8.18, 8.17, 8.16, 8.15, 8.17, 8.21, 8.22, 8.16, 8.19, 8.18
1. What conclusion can be drawn at the $\alpha = 0.1$ level of significance?
2. What about at the $\alpha = 0.05$ level of significance?
Exercise 103 A British pharmaceutical company, Glaxo Holdings, has recently developed a new drug for migraine headaches. Among the claims
Glaxo made for its drug, called sumatriptan, was that the mean time it
takes for it to enter the bloodstream is less than 10 minutes. To convince the
Food and Drug Administration of the validity of this claim, Glaxo conducted
an experiment on a randomly chosen set of migraine sufferers. To prove its
claim, what should they have taken as the null and what as the alternative
hypothesis?
Exercise 104 In the past a machine has produced washers having a mean
thickness of 0.050 inch. To determine whether the machine is in proper
working order a sample of 10 washers is chosen for which the mean thickness
is 0.053 inch and the standard deviation is 0.003 inch. Test the hypothesis
that the machine is in proper working order using a level of significance of
(a) 0.05, (b) 0.01. (c) Find the p-value of the test.
Exercise 105 In a single-server queueing system in which customers arrive according to a Poisson process, the long-run average queueing delay per
customer depends on the service distribution through its mean and variance.
Indeed, if $\mu$ is the mean service time, and $\sigma^2$ is the variance of a service
time, then the average amount of time that a customer spends waiting in
queue is given by
$$\frac{\lambda(\mu^2 + \sigma^2)}{2(1 - \lambda\mu)}$$
provided that $\lambda\mu < 1$, where $\lambda$ is the arrival rate.
Suppose that the owner of a service station will hire a second server
if it can be shown that the average service time exceeds 8 minutes. The
following data give the service times (in minutes) of 28 customers of this
queueing system. Do they indicate that the mean service time is greater than
8 minutes?

8.6, 9.4, 5.0, 4.4, 3.7, 11.4, 10.0, 7.6, 14.4, 12.2, 11.0, 14.4, 9.3, 10.5,
10.3, 7.7, 8.3, 6.4, 9.2, 5.7, 7.9, 9.4, 9.0, 13.3, 11.6, 10.0, 9.5, 6.6
Exercise 106 Consider a test of $H_0: \mu \le 100$ versus $H_1: \mu > 100$. Suppose
that a sample of size 20 has a sample mean of $\bar{X}_{20} = 105$. Determine
the p-value of this outcome if the population standard deviation is known to
equal (a) 5; (b) 10; (c) 15.
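A p-value of this kind can be sketched with the standard normal distribution. The snippet below is a minimal illustration, assuming a one-sided z-test with the alternative that the mean exceeds 100 and a known population standard deviation; the function name `upper_tail_p_value` is hypothetical.

```python
import math
from statistics import NormalDist

def upper_tail_p_value(sample_mean, mu0, sigma, n):
    """p-value of a one-sided z-test of H0: mu <= mu0 vs H1: mu > mu0."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    return 1.0 - NormalDist().cdf(z)

p_values = {sigma: upper_tail_p_value(105, 100, sigma, 20) for sigma in (5, 10, 15)}
for sigma, p in p_values.items():
    print(f"sigma={sigma}: p-value={p:.6f}")
```

The same sample mean becomes less convincing as the standard deviation grows, because the standard error of the mean grows with it.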
Exercise 107 An advertisement for a new toothpaste claims that it reduces
cavities of children in their cavity-prone years. Cavities per year for this age
group are normal with mean 3 and standard deviation 1. A study of 2,500
children who used this toothpaste found an average of 2.95 cavities per child.
Assume that the standard deviation of the number of cavities of a child using
this new toothpaste remains equal to 1.
1. Are these data strong enough, at the 5 percent level of significance, to
establish the claim of the toothpaste advertisement?
2. Do the data convince you to switch to this new toothpaste?
Exercise 108 A car is advertised as having a gas mileage rating of at least
30 miles/gallon in highway driving. If the miles per gallon obtained in 10
independent experiments are 26, 24, 20, 25, 27, 25, 28, 30, 26, 33, should
you believe the advertisement? What assumptions are you making?
Exercise 109 It is claimed that a certain type of bipolar transistor has a
mean value of current gain that is at least 210. A sample of these transistors
is tested. If the sample mean value of current gain is 200 with a sample
standard deviation of 35, would the claim be rejected at the 5 percent level
of significance if (a) the sample size is 25; (b) the sample size is 64?
Exercise 110 The viscosity of two different brands of car oil is measured
and the following data resulted:
Brand 1: 10.62, 10.58, 10.33, 10.72, 10.44, 10.74
Brand 2: 10.50, 10.52, 10.58, 10.62, 10.55, 10.51, 10.53
Test the hypothesis that the mean viscosity of the two brands is equal, assuming that the populations have normal distributions with equal variances.
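The pooled two-sample t statistic used under the equal-variance assumption can be sketched as below. This is a minimal illustration; the function name `pooled_t_statistic` is hypothetical, and the split of the data into six Brand 1 values and seven Brand 2 values is an assumed reading of the table.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(x, y):
    """Two-sample t statistic under the equal-variance assumption."""
    n, m = len(x), len(y)
    # Pooled estimate of the common variance.
    sp2 = ((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2)
    return (mean(x) - mean(y)) / math.sqrt(sp2 * (1 / n + 1 / m))

brand1 = [10.62, 10.58, 10.33, 10.72, 10.44, 10.74]
brand2 = [10.50, 10.52, 10.58, 10.62, 10.55, 10.51, 10.53]
t = pooled_t_statistic(brand1, brand2)
df = len(brand1) + len(brand2) - 2
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The statistic is then compared with a tabulated critical value of Student's t with `df` degrees of freedom at the chosen significance level.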
Exercise 111 It is argued that the resistance of wire A is greater than the
resistance of wire B. You make tests on each wire with the following results
(in ohm).
Wire A: 0.140, 0.138, 0.143, 0.142, 0.144, 0.137
Wire B: 0.135, 0.140, 0.136, 0.142, 0.138, 0.140
What conclusion can you draw at the 10 percent significance level? Explain
what assumptions you are making.
Exercise 112 A professor claims that the average starting salary of industrial engineering graduating seniors is greater than that of civil engineering
graduates. To study this claim, samples of 16 industrial engineers and 16
civil engineers, all of whom graduated in 2006, were chosen and sample
members were queried about their starting salaries. If the industrial engineers had a sample mean salary of $59,700 and a sample standard deviation
of $2,400, and the civil engineers had a sample mean salary of $58,400
and a sample standard deviation of $2,200, has the professor's claim been
verified? Find the appropriate p-value.
We assume that the population distributions are normal and have equal
variances.
Exercise 113 To learn about the feeding habits of bats, 22 bats were tagged
and tracked by radio. Of these 22 bats, 12 were female and 10 were male.
The distances flown (in meters) between feedings were noted for each of the
22 bats, and the following summary statistics were obtained:
Female bats: $n = 12$, $\bar{X}_n = 180$, $S_1^2 = 92$
Male bats: $m = 10$, $\bar{Y}_m = 136$, $S_2^2 = 86$.
Assuming that the population distributions are normal and have equal
variances, test the hypothesis that the mean distance flown between feedings
is the same for the populations of both male and of female bats. Use the 5
percent level of significance.
Exercise 114 A gun-like apparatus has recently been designed to replace
needles in administering vaccines. The apparatus can be set to inject different amounts of the serum, but because of random fluctuations the actual
amount injected is normally distributed with a mean equal to the setting and
with an unknown variance $\sigma^2$. It has been decided that the apparatus would
be too dangerous to use if $\sigma$ exceeds 0.10. If a random sample of 50 injections resulted in a sample standard deviation of 0.08, should use of the new
apparatus be discontinued? Suppose the level of significance is $\alpha = 0.10$.
Comment on the appropriate choice of a significance level for this problem,
as well as the appropriate choice of the null hypothesis.
Exercise 115 In the past the standard deviation of weights of certain 40.0
oz packages filled by a machine was 0.25 oz. A random sample of 20 packages
showed a standard deviation of 0.32 oz. Is the apparent increase in variability
significant at the (a) 0.05, (b) 0.01 level of significance? (c) What is the P
value of the test?
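A test on a variance of this kind rests on the chi-square statistic $(n - 1)S^2/\sigma_0^2$. The snippet below is a minimal sketch, assuming a normal population and a one-sided (upper-tail) test; the function name `variance_test_statistic` is hypothetical, and the critical values are tabulated rather than computed.

```python
def variance_test_statistic(sample_sd, sigma0, n):
    """Chi-square statistic (n - 1) * s^2 / sigma0^2 for a normal sample."""
    return (n - 1) * sample_sd**2 / sigma0**2

stat = variance_test_statistic(0.32, 0.25, 20)

# Upper-tail critical values of chi-square with 19 degrees of freedom,
# from a standard table (assumed, not computed here).
critical = {0.05: 30.144, 0.01: 36.191}
for alpha, c in critical.items():
    print(f"alpha={alpha}: statistic={stat:.2f}, critical={c}, significant={stat > c}")
```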
Exercise 116 An instructor has two classes, A and B, in a particular subject. Class A has 16 students while class B has 25 students. On the same
examination, although there was no significant difference in mean grades,
class A had a standard deviation of 9 while class B had a standard deviation
of 12. Can we conclude at the (a) 0.01, (b) 0.05 level of significance that
the variability of class B is greater than that of A? (c) What is the P value
of the test?
Exercise 117 Find the probability of getting between 40 and 60 heads inclusive in 100 tosses of a fair coin.
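This probability can be computed exactly from the binomial distribution or approximated with a normal distribution. The snippet below sketches both, assuming a continuity correction for the approximation; the function name `binomial_between` is hypothetical.

```python
import math
from statistics import NormalDist

def binomial_between(n, p, lo, hi):
    """Exact P(lo <= X <= hi) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(lo, hi + 1))

exact = binomial_between(100, 0.5, 40, 60)

# Normal approximation with continuity correction: mean 50, sd 5.
approx_dist = NormalDist(mu=50, sigma=5)
approx = approx_dist.cdf(60.5) - approx_dist.cdf(39.5)
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```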
Exercise 118 Suppose we obtain a sample of n = 120 tosses of a die, with observed face counts x1 =
12, x2 = 24, x3 = 20, x4 = 13, x5 = 18 and x6 = 33. Test the assumption
that we have a fair die at the 5 percent level of significance. What is the p-value?
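The Pearson goodness-of-fit statistic for these counts can be sketched as below. This is a minimal illustration, assuming the six values are the observed counts of faces 1 to 6; the function name `chi_square_statistic` is hypothetical, and the critical value is tabulated rather than computed.

```python
def chi_square_statistic(observed, expected):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [12, 24, 20, 13, 18, 33]
n = sum(observed)
expected = [n / 6] * 6  # a fair die gives each face probability 1/6
stat = chi_square_statistic(observed, expected)

# 5 percent critical value of chi-square with 6 - 1 = 5 degrees of freedom,
# from a standard table (assumed, not computed here).
critical_5pct = 11.070
print(f"statistic = {stat:.2f}, reject fairness at 5%: {stat > critical_5pct}")
```

The p-value is the upper-tail probability of a chi-square variable with 5 degrees of freedom at `stat`, which can be read from a table or obtained from a numerical library.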
Exercise 119 According to the Mendelian theory of genetics, a certain garden pea plant should produce either white, pink, or red flowers, with respective probabilities 1/4, 1/2, 1/4. To test this theory, a sample of 564 peas was
studied with the result that 141 produced white, 291 produced pink, and 132
produced red flowers. Using the chi-square approximation, what conclusion
would be drawn at the 5 percent level of significance?
Exercise 120 Among 100 vacuum tubes tested, 41 had lifetimes of less than
30 hours, 31 had lifetimes between 30 and 60 hours, 13 had lifetimes between
60 and 90 hours, and 15 had lifetimes of greater than 90 hours. Are these
data consistent with the hypothesis that a vacuum tube's lifetime is exponentially distributed with a mean of 50 hours?
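For a fully specified continuous distribution, the expected cell counts come from the distribution function. The snippet below is a minimal sketch for the exponential case; the function name `exponential_cell_probs` is hypothetical, and the 5 percent level is an assumption since the exercise does not state one.

```python
import math

def exponential_cell_probs(mean, cut_points):
    """Probabilities of the cells [0,c1), [c1,c2), ..., [ck,inf) for an exponential."""
    cdf = [0.0] + [1 - math.exp(-c / mean) for c in cut_points] + [1.0]
    return [b - a for a, b in zip(cdf, cdf[1:])]

observed = [41, 31, 13, 15]
probs = exponential_cell_probs(50, [30, 60, 90])
expected = [sum(observed) * p for p in probs]
stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# 4 cells and no estimated parameters give 3 degrees of freedom;
# 7.815 is the tabulated 5 percent critical value (assumed, not computed).
print(f"statistic = {stat:.3f}, reject at 5%: {stat > 7.815}")
```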

Curve fitting and regression

Exercise 121 The following data relate x, the moisture of a wet mix of a
certain product, to Y , the density of the finished product.
$x_i$: 5, 6, 7, 10, 12, 15, 18, 20
$Y_i$: 7.4, 9.3, 10.6, 15.4, 18.1, 22.2, 24.1, 24.8
1. Draw a scatter diagram.
2. Fit a linear curve to the data.
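A least-squares line can be fitted with the usual closed-form estimates. The snippet below is a minimal sketch; the function name `least_squares_line` is hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

moisture = [5, 6, 7, 10, 12, 15, 18, 20]
density = [7.4, 9.3, 10.6, 15.4, 18.1, 22.2, 24.1, 24.8]
a, b = least_squares_line(moisture, density)
print(f"fitted line: Y = {a:.3f} + {b:.3f} x")
```

The same function applies unchanged to any other paired data set in this section.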
Exercise 122 The following data indicate the gain in reading speed versus the number of weeks in the program of 10 students in a speed-reading
program.
Number of weeks: 2, 3, 8, 11, 4, 5, 9, 7, 5, 7
Speed gain (wds/min): 21, 42, 102, 130, 52, 57, 105, 85, 62, 90
1. Draw a scatter diagram to see if a linear relationship is indicated.
2. Find the least squares estimates of the regression coefficients.
3. Estimate the expected gain of a student who plans to take the program
for 7 weeks.
Exercise 123 The following table (next page) relates the number of sunspots
that appeared each year from 1970 to 1983 to the number of auto accident
deaths during that year. Test the hypothesis that the number of auto deaths
is not affected by the number of sunspots. (The sunspot data are from Jastrow and Thompson, Fundamentals and Frontiers of Astronomy, and the
auto death data are from General Statistics of the U.S. 1985.)

Exercise 124 A study has shown that a good model for the relationship
between X and Y , the first and second year batting averages of a randomly
chosen major league baseball player, is given by the equation
Y = 0.159 + 0.4X + e
where e is a normal random variable with mean 0. That is, the model is a
simple linear regression with a regression toward the mean.
1. If a player's batting average is 0.200 in his first year, what would you
predict for the second year?
2. If a player's batting average is 0.265 in his first year, what would you
predict for the second year?
3. If a player's batting average is 0.310 in his first year, what would you
predict for the second year?
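Since the error term has mean 0, a prediction is obtained by evaluating the regression line. The snippet below is a minimal sketch; the function name `predict_second_year` is hypothetical.

```python
def predict_second_year(first_year_avg):
    """Expected second-year average under Y = 0.159 + 0.4 X + e, E[e] = 0."""
    return 0.159 + 0.4 * first_year_avg

for x in (0.200, 0.265, 0.310):
    print(f"first year {x:.3f} -> predicted second year {predict_second_year(x):.3f}")
```

The line crosses Y = X at X = 0.159 / 0.6 = 0.265, so averages below that value are predicted to rise and averages above it are predicted to fall, which is the regression toward the mean that the exercise mentions.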
Exercise 125 The following data represent the relationship between the
number of alignment errors and the number of missing rivets for 10 different aircraft.
Number of missing rivets x: 13, 15, 10, 22, 30, 7, 25, 16, 20, 15
Number of errors y: 7, 7, 5, 12, 15, 2, 13, 9, 11, 8
1. Plot a scatter diagram.
2. Estimate the regression coefficients.


3. Test the hypothesis that = 1.
Exercise 126 It is difficult and time consuming to measure directly the
amount of protein in a liver sample. As a result, medical laboratories often
make use of the fact that the amount of protein is related to the amount of
light that would be absorbed by the sample. As a result, a spectrometer that
emits light is shined on a solution that contains the liver sample and the
amount of light absorbed is then used to estimate the amount of protein.
The above procedure was tried on five samples having known amounts of
protein, with the following data resulting.
Light absorbed: 0.44, 0.82, 1.20, 1.61, 1.83
Amount of protein (mg): 2, 16, 30, 46, 55
1. Determine the coefficient of determination.
2. Does this appear to be a reasonable way of estimating the amount of
protein in a liver sample?
3. What is the estimate of the amount of protein when the light absorbed
is 1.5?
4. Determine a prediction interval, in which we can have 90 percent confidence, for the quantity in part 3.
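For simple linear regression, the coefficient of determination equals the squared sample correlation. The snippet below is a minimal sketch of that computation; the function name `coefficient_of_determination` is hypothetical.

```python
def coefficient_of_determination(xs, ys):
    """R^2 of the simple linear regression of ys on xs."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # For simple linear regression, R^2 = Sxy^2 / (Sxx * Syy).
    return sxy ** 2 / (sxx * syy)

light = [0.44, 0.82, 1.20, 1.61, 1.83]
protein = [2, 16, 30, 46, 55]
print(f"R^2 = {coefficient_of_determination(light, protein):.4f}")
```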
Exercise 127 The regression model
$Y = \beta x + e$, $e \sim N(0, \sigma^2)$,
is called regression through the origin since it presupposes that the expected
response corresponding to the input level x = 0 is equal to 0. Suppose that
$(x_i, Y_i)$, $i = 1, \dots, n$, is a data set from this model.
1. Determine the least squares estimator $B$ of $\beta$.
2. What is the distribution of B?
3. Define SSR and give its distribution.
4. Derive a test of $H_0: \beta = \beta_0$ versus $H_1: \beta \ne \beta_0$.
5. Determine a $100(1 - \alpha)$ percent prediction interval for $Y(x_0)$, the response at input
level $x_0$.
Exercise 128 Derive the normal equations for the least-squares parabola.
Exercise 129 The following table gives experimental values of the pressure
P of a given mass of gas corresponding to various values of the volume V .
Volume V (in$^3$): 54.3, 61.8, 72.4, 88.7, 118.6, 194.0
Pressure P (lb/in$^2$): 61.2, 49.5, 37.6, 28.4, 19.2, 10.1
According to thermodynamic principles, a relationship having the form $P V^{\gamma} = C$, where $\gamma$ and $C$ are constants, should exist between the variables.
1. Find the values of $\gamma$ and $C$.
2. Write the equation connecting P and V.
3. Estimate P when V = 100.0 in$^3$.
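A power-law relationship becomes linear after taking logarithms, so it can be fitted by ordinary least squares on the transformed data. The snippet below is a minimal sketch, assuming the intended relationship is $P V^{\gamma} = C$, so that $\ln P = \ln C - \gamma \ln V$; the function name `fit_power_law` is hypothetical.

```python
import math

def fit_power_law(volumes, pressures):
    """Fit P * V**gamma = C by least squares on ln P = ln C - gamma * ln V."""
    xs = [math.log(v) for v in volumes]
    ys = [math.log(p) for p in pressures]
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    gamma = -slope
    c = math.exp(y_bar + gamma * x_bar)
    return gamma, c

volume = [54.3, 61.8, 72.4, 88.7, 118.6, 194.0]
pressure = [61.2, 49.5, 37.6, 28.4, 19.2, 10.1]
gamma, c = fit_power_law(volume, pressure)
print(f"gamma = {gamma:.3f}, C = {c:.0f}, P(100) = {c / 100.0 ** gamma:.1f}")
```

Fitting on the log scale minimizes squared errors in $\ln P$ rather than in $P$ itself, which is the usual convention for this kind of curve fitting.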
Exercise 130 In 1957 the Dutch industrial engineer J. R. DeJong proposed
the following model for the time it takes to perform a simple manual task as
a function of the number of times the task has been practiced:
$T \approx t s^{-n}$
where T is the time, n is the number of times the task has been practiced,
and t and s are parameters depending on the task and individual. Estimate
t and s for the following data set.
T: 22.4, 21.3, 19.7, 15.6, 15.2, 13.9, 13.7
n: 0, 1, 2, 3, 4, 5, 6

Analysis of variance

Exercise 131 A purification process for a chemical involves passing it, in
solution, through a resin on which impurities are adsorbed. A chemical
engineer wishing to test the efficiency of 3 different resins took a chemical
solution and broke it into 15 batches. She tested each resin 5 times and then
measured the concentration of impurities after passing through the resins.
Her data were as follows:
Resin I: 0.046, 0.025, 0.014, 0.017, 0.043
Resin II: 0.038, 0.035, 0.031, 0.022, 0.012
Resin III: 0.031, 0.042, 0.020, 0.018, 0.039

Test the hypothesis that there is no difference in the efficiency of the resins.
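The F statistic of a one-way analysis of variance can be sketched as below. This is a minimal illustration, assuming normal populations with a common variance; the function name `one_way_anova_f` is hypothetical, and both the critical value and the 5 percent level are assumptions since the exercise does not state a level.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA: between-group over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

resins = [
    [0.046, 0.025, 0.014, 0.017, 0.043],  # Resin I
    [0.038, 0.035, 0.031, 0.022, 0.012],  # Resin II
    [0.031, 0.042, 0.020, 0.018, 0.039],  # Resin III
]
f_stat = one_way_anova_f(resins)
# 3.89 is the tabulated 5 percent critical value of F with (2, 12) degrees
# of freedom; the 5 percent level itself is an assumption.
print(f"F = {f_stat:.4f}, reject equal means at 5%: {f_stat > 3.89}")
```

The function accepts groups of unequal size, so it also covers samples that do not all have the same number of observations.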
Exercise 132 A machine shop contains 3 ovens that are used to heat metal
specimens. Subject to random fluctuations, they are all supposed to heat to
the same temperature. To test this hypothesis, temperatures were noted on 15
separate heatings. The following data resulted.
Oven  Temperature
1     492.4, 493.6, 498.5, 488.6, 494
2     488.5, 485.3, 482, 479.4, 478
3     502.1, 492, 497.5, 495.3, 486.7
Do the ovens appear to operate at the same temperature? Test at the 5 percent
level of significance. What is the p-value?
Exercise 133 Test the hypothesis that the following three independent samples all come from the same normal probability distribution.
Sample 1: 35, 37, 29, 27, 30
Sample 2: 29, 38, 34, 30, 32
Sample 3: 44, 52, 56

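Exercise 134 For data $x_{ij}$ ($1 \le i \le m$, $1 \le j \le n$), show that
1. $\bar{x}_{\cdot\cdot} = \frac{1}{m} \sum_{i=1}^{m} \bar{x}_{i\cdot} = \frac{1}{n} \sum_{j=1}^{n} \bar{x}_{\cdot j}$,
where $\bar{x}_{i\cdot}$, $\bar{x}_{\cdot j}$ and $\bar{x}_{\cdot\cdot}$ denote the row, column and overall means.
2. If $x_{ij} = a_i + b_j$, then
$\sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij} = n \sum_{i=1}^{m} a_i + m \sum_{j=1}^{n} b_j$.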
Exercise 135 The following data refer to the number of deaths per 10,000
adults in a large Eastern city in the different seasons for the years 1982 to
1986.
Year  Winter  Spring  Summer  Fall
1982  33.6    31.4    29.8    32.1
1983  32.5    30.1    28.5    29.9
1984  35.3    33.2    29.5    28.7
1985  34.4    28.6    33.9    30.1
1986  37.3    34.1    28.5    29.4
1. Assuming a two-factor model, estimate the parameters.
2. Test the hypothesis that death rates do not depend on the season. Use
the 5 percent level of significance.
3. Test, at the 5 percent level of significance, the hypothesis that there is
no effect due to the year.
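The parameter estimates of a two-factor model can be sketched as below. This is a minimal illustration, assuming the additive model $X_{ij} = \mu + \alpha_i + \beta_j + e_{ij}$ with row and column effects that each sum to zero (the exercise does not spell the model out); the function name `two_factor_estimates` is hypothetical.

```python
years = [1982, 1983, 1984, 1985, 1986]
seasons = ["Winter", "Spring", "Summer", "Fall"]
rates = [
    [33.6, 31.4, 29.8, 32.1],
    [32.5, 30.1, 28.5, 29.9],
    [35.3, 33.2, 29.5, 28.7],
    [34.4, 28.6, 33.9, 30.1],
    [37.3, 34.1, 28.5, 29.4],
]

def two_factor_estimates(table):
    """Grand mean, row effects and column effects of an additive two-factor model."""
    m, n = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (m * n)
    row_effects = [sum(row) / n - grand for row in table]
    col_effects = [sum(row[j] for row in table) / m - grand for j in range(n)]
    return grand, row_effects, col_effects

mu_hat, year_effects, season_effects = two_factor_estimates(rates)
print(f"grand mean: {mu_hat:.3f}")
print(dict(zip(years, (round(a, 3) for a in year_effects))))
print(dict(zip(seasons, (round(b, 3) for b in season_effects))))
```

Each effect is a row or column mean minus the grand mean, so the estimated effects of each factor sum to zero by construction.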
Exercise 136 A study was made as to how the concentration of a certain
drug in the blood, 24 hours after being injected, is influenced by age and
gender. An analysis of the blood samples of 40 people given the drug yielded
the following concentrations (in milligrams per cubic centimeter).

1. Test the hypothesis of no age and gender interaction.


2. Test the hypothesis that gender does not affect the blood concentration.
3. Test the hypothesis that age does not affect blood concentration.
