HW4 Solutions

Solutions
Math 501 Assignment 4: due Monday 10/7 at 11:59 p.m., submitted via Canvas
in pdf format
Scoring: This assignment is worth 155 points (35 + 15 + 15 + 20 + 15 + 15 +
10 + 30 = 155).
Academic integrity and collaboration policy for this assignment: you may
work together on the reading assignment, which includes dozens of solved
problems, examples, proofs, detailed discussions, theorems, etc., and you are
encouraged to do so, but you must complete the written assignment entirely on
your own. Learn the material well with the reading assignment; prove you have
learned the material well with the written assignment.
You may consult our TAs or me regarding questions about written exercises
if you have worked hard on them and are very stuck, but you may not consult
anyone else until after all parties have submitted the assignment.
You are to figure the exercises out entirely on your own.
If you find solutions to one or more of the written exercises in Weiss’s book,
in Hsu’s book, or in documents I myself have previously posted to Canvas for
our course this semester, you may make use of them, but the write-up should
be your own, and you must cite the source.
If you find solutions to one or more of the written exercises in any source
other than Weiss’s book, Hsu’s book, or documents I myself have previously
posted to Canvas for our course this semester, you may not make use of them
at all, and doing so would be an academic integrity violation.
Reading assignment:
Class notes: Part IV of the class notes (Chapters 28 through 30, inclusive)
(Chapters 1 through 27 were previously assigned.)
Weiss: Section 4.3 (Chapters 1, 2, and 3, and Sections 4.1, 4.2, and 4.4
were previously assigned.)
Schaum’s book (Hwei Hsu): Read Section 1.8 (Sections 1.1 through 1.7
were previously assigned.)
Schaum’s book (Hwei Hsu): Carefully work through exercises 1.63 through
1.73 inclusive. Do not turn these in, but make sure you understand the
details of the calculations and how to solve these exercises.
Written assignment (to be turned in):
Exercise 1) (35 points – 5 points per part) (conditional probability)

Read but do not solve Exercise 4.76 in the book. It is just like our TB
example from class. A person is selected at random and tested. Let $D$ be the
event “the selected person has the disease.” Let $+$ be the event “the selected
person tests + for the disease.”
a) Draw a 2-level tree diagram, like our tree diagram for the TB example from
class, as follows. Draw two line segments, and label them $D$ and $D^c$. Further
subdivide each of those into $+$ and $-$ (where $-$ is the complement of $+$). Then
fill in the appropriate probabilities and conditional probabilities along the line
segments of the tree diagram by using the information provided in Exercise 4.76
from the book.
b) Find the probability a person testing positive actually has the disease. I.e.,
calculate $P(D \mid +)$. Use the tree diagram from part a) to organize your
calculation, and proceed as in class (using the definition of conditional probability
and the Law of Total Probability).
Suppose that 1 in 1000 people has the disease, as in Exercise 4.76 in the
book.
c) Now suppose $P(D) = a$, for some $a$ such that $0 < a < 1$. Proceed in the
same way to show that
$$P(D \mid +) = \frac{0.96a}{0.96a + 0.02(1 - a)}.$$
d) Graph this function of $a$ for $0 < a < 1$ and attach a printout. You may use
your calculator or a computer. Be extra careful to use parentheses around the
numerator and around the denominator when entering it into a computer. For
instance, enter it as
(0.96a) / (0.96a + 0.02(1 - a))
Entering it as
0.96a / 0.96a + 0.02(1 - a)
will result in a graph of the wrong function because of the order of operations
in mathematics: PEMDAS (Parentheses, Exponentiation, Multiplication and
Division left to right, Addition and Subtraction left to right).
If you have trouble printing, you may instead give a hand-drawn graph, but
draw it very carefully and in a lot of detail.
e) Carefully explain the practical importance (to a patient, to a doctor, etc.) of
the graph from part d), and relate this to the TB example from class. Specifically,
comment on the dramatic changes in the value of $P(D \mid +)$ as we change
how common the disease is, and comment on how this may affect medical
recommendations. I recommend evaluating this function at a few specific values,
such as at $a = 1/100$ and $a = 1/3$.
f) Explain what the graph from part d) has to do with the “base rate fallacy.”
g) An article is published in a science magazine about an extremely rare, non-
contagious disease. Anyone who has it will die fairly quickly, but it can be
cured if caught early. The cure is not without consequences, however – the
treatment renders the patient blind every time. Alarmed about the fatality
of the disease, many concerned people write to their Congressperson to ask
that random testing (i.e., testing of people selected at random) for this disease
be done. The Congressperson writes a bill requiring random testing for this
condition and tries to get it passed into law. During the discussion period,
a few scientists who are skilled in probability object quite vehemently, using
arguments based on probability. Why are they objecting? Explain carefully.
NOTE: This is an example of how probability can help determine public policy.
Solution Remarks: We’ll solve part a) first and then part c). Part b) will
then be solved as a special case of c). Then we’ll solve the other parts in order.
Solution to a): Since we are selecting at random from a finite, non-empty set,
the relevant probability model is classical probability. We will proceed this way
in other parts of this problem as well.
We make the tree diagram as described. Putting $D$ above $D^c$ at the first
level, the probabilities at the first level from top to bottom are 0.001 and 0.999,
respectively. Putting $+$ above $-$ in each case at the second level, the second-level
probabilities from top to bottom are 0.96, 0.04, 0.02, and 0.98, respectively.
Solution to c): We are interested in $P(D \mid +)$.
$$\begin{aligned}
P(D \mid +) &= \frac{P(D \cap +)}{P(+)} && \text{(definition of conditional probability)} \\
&= \frac{0.96a}{0.96a + 0.02(1 - a)} && \text{(using the Law of Total Probability twice)} \\
&= \frac{48a}{47a + 1}.
\end{aligned}$$
Solution to b): We are interested in $P(D \mid +)$. Here, $a = 1/1000$, so we
can substitute this into our formula for $P(D \mid +)$ from part c) and simplify to
conclude that
$$P(D \mid +) = \frac{16}{349} = 0.04584\ldots$$
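The tree-diagram calculation in parts b) and c) can be checked with exact arithmetic. The following is a minimal sketch, not part of the original solution; the function name `posterior` and its parameter names are hypothetical, and the default values 0.96 and 0.02 are the second-level probabilities quoted in the solution to part a).

```python
from fractions import Fraction

def posterior(prevalence, sensitivity=Fraction(96, 100), false_positive=Fraction(2, 100)):
    """Return P(D | +) by following the two '+' branches of the tree diagram."""
    # Branch D then +: P(D) * P(+ | D)
    true_pos = prevalence * sensitivity
    # Branch D^c then +: P(D^c) * P(+ | D^c)
    false_pos = (1 - prevalence) * false_positive
    # Law of Total Probability gives P(+); the definition then gives P(D | +).
    return true_pos / (true_pos + false_pos)

a = Fraction(1, 1000)
result = posterior(a)
print(result, float(result))  # exact fraction, then its decimal value
```

With a = 1/1000 the exact result is 16/349, in agreement with part b). Using `Fraction` rather than floats avoids rounding, so the output can be compared directly with the simplified formula 48a/(47a + 1).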
Solution to d): Here is a graph of $y = f(a)$, where
$$f(a) = \frac{0.96a}{0.96a + 0.02(1 - a)}.$$
[Figure: graph of $y = f(a)$; the horizontal axis (labeled x) and the vertical axis (labeled y) each run from 0.0 to 1.0.]
Remark 1 Notice the dramatic changes in the graph of $P(D \mid +)$ as we change
how common the disease is!
Solution to e): When $P(D)$ is low (i.e., when $a$ is small, which is the case for
rare conditions), $P(D \mid +)$ is also low, despite the fact that the test is quite
accurate in the sense that most people with the condition test positive and
most without it test negative. Therefore, if an individual is selected for testing
at random, a positive test result is rarely indicative of disease if the disease is
rare, exactly as in the TB example from class.
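To make the dependence on prevalence concrete, the simplified formula from part c) can be evaluated at a few values of a, including the two suggested in the problem statement. This is an illustrative sketch only; the list of prevalences is an arbitrary choice, and the name `f` simply mirrors the notation of part d).

```python
from fractions import Fraction

def f(a):
    """P(D | +) as a function of the prevalence a = P(D), from part c)."""
    return 48 * a / (47 * a + 1)

prevalences = [Fraction(1, 1000), Fraction(1, 100), Fraction(1, 10), Fraction(1, 3)]
for a in prevalences:
    value = f(a)
    print(f"a = {a}: P(D | +) = {value} (about {float(value):.4f})")
```

The exact values are 16/349, 16/49, 16/19, and 24/25, so the probability that a positive result is correct climbs from under 5% at a = 1/1000 to 96% at a = 1/3.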
This makes it clear that, for instance, random testing of the public for a rare
condition can be potentially a very bad idea (particularly if there are significant
negative consequences associated with giving someone unnecessary treatment,
such as health risks and/or significant costs associated with the treatment).
Of course, if a person has symptoms of a condition, even of a rare one, then
this mathematical model does not apply, since this model is based on random
testing; a symptomatic person differs in important ways from a person selected
at random.
By contrast, when the condition is highly prevalent, random testing can give
extremely useful results.
This entire problem also demonstrates the importance of developing diagnostic
tests having extremely low false positive and false negative rates (the
lower the better), if the condition is rare, as well as the importance of having
multiple, independent tests done to diagnose and confirm a rare condition.
Suppose a person without other symptoms tests positive for a rare condition.
Should insurance companies cover the costs of another test? Should insurance
companies insist on another test before agreeing to pay for treatment? Questions
like these can be addressed prudently by making use of all this knowledge.
Doctors and/or patients with this knowledge can use it to help ensure they
get better treatment. Insurance companies can use it to help ensure that
appropriate testing is done before patients are subjected to possibly unnecessary,
dangerous, and costly treatments or surgeries. Lawmakers can use this information
to ensure that adequate protections are given to residents (to ensure they
are not taken advantage of by insurance companies or doctors).
Thus, we see that probability can also play a major role in health policies.
Solution to f): The “base rate fallacy” is in assuming that $P(D \mid +)$ is high
(near 1) when the false positive and false negative rates are small (as they are
in this problem), without taking into consideration $P(D)$, the prevalence of the
disease. The graph shows that $P(D \mid +)$ is in fact tiny (very close to 0) when
the condition is rare (i.e., when $P(D)$ is small), despite the false positive and
false negative rates being fairly small. The false positive and false negative rates
would have to be extremely tiny for $P(D \mid +)$ to be near 1 when $P(D)$ is tiny.
Remark 2 Observe that $f(0) = 0$ and $f(1) = 1$, of course. Pay particular
attention to the shape of the curve. It rises sharply for small $a$ and then slowly
continues to rise until it reaches $y = 1$ when $a = 1$.
Solution to g): As we saw in class, a positive test result for a very rare
condition (in the absence of any other symptoms or any particular reason to
undergo the testing) can correspond to a very low probability of actually having
the condition.
Random testing of asymptomatic people for a rare condition will end up
misdiagnosing many people as having the condition. Most of them would likely
seek the treatment (since they would view the condition as fatal) and end up
unnecessarily blinded by it, greatly harming them and their families and
dependents. To make matters even worse, the healthcare system would bear
significant costs associated with all these unnecessary and dangerous treatments.
A good understanding of probability is critical for people making public
policy decisions which affect many people.

Exercise 2) (15 points)

This exercise is taken almost verbatim (but with some modifications) from
Walpole, Myers, Myers, and Ye, Probability & Statistics, 7th Edition, a
reasonably good textbook for an undergraduate calculus-based probability and
statistics course for students not majoring in mathematics.
Pollution of the rivers in the United States has been a problem for many
years. Consider the following events:
A = the river is polluted,
B = a tested water sample detects pollution,
C = fishing is permitted.
Suppose
$$\begin{aligned}
P(A) &= 0.3, \\
P(B \mid A) &= 0.75, \\
P(B \mid A^c) &= 0.20, \\
P(C \mid A \cap B) &= 0.20, \\
P(C \mid A^c \cap B) &= 0.15, \\
P(C \mid A \cap B^c) &= 0.80, \\
P(C \mid A^c \cap B^c) &= 0.90.
\end{aligned}$$
a) Find the conditional probability that the river is polluted given that fishing
is permitted and that the sample tested did not detect pollution.
b) Find the probability that pollution will not be detected in a tested sample
and that fishing will be permitted given that the river is polluted.
c) Find the probability that the river is polluted or that a tested water sample
detects pollution, given that fishing is permitted.
Solution to a): We wish to find
$$P(A \mid B^c \cap C),$$
which equals
$$\frac{P(A \cap B^c \cap C)}{P(B^c \cap C)}.$$
We will use the law of total probability twice, once to evaluate the numerator
and a second time to evaluate the denominator.
We draw a 3-level tree diagram with each node splitting into two line segments
at each level (into $A$ and $A^c$ at the first level, into $B$ and $B^c$ at the second
level, and into $C$ and $C^c$ at the third level).
The probabilities (top to bottom, in the order specified, with $A$ on top of $A^c$,
$B$ on top of $B^c$, and $C$ on top of $C^c$) are as follows: first level (0.3, 0.7), second
level (0.75, 0.25, 0.20, 0.80), third level (0.20, 0.80, 0.80, 0.20, 0.15, 0.85, 0.90,
0.10). We can therefore evaluate the requested probabilities using the general
constructions used above.
The event $B^c \cap C$ occurs in two branches ($A \cap B^c \cap C$ and $A^c \cap B^c \cap C$).
In each case, in accordance with the law of total probability, we multiply the
probabilities and conditional probabilities across and add the results. We get
$$P(B^c \cap C) = (0.3)(0.25)(0.8) + (0.7)(0.80)(0.90) = 0.564.$$
$P(A \cap B^c \cap C)$ is the probability of a single branch of the tree diagram,
the branch passing through $A$ and $C$ but not $B$. We find it by multiplying
the probabilities and conditional probabilities along that branch. Putting the
results together gives
$$P(A \mid B^c \cap C) = \frac{P(A \cap B^c \cap C)}{P(B^c \cap C)} = \frac{(0.3)(0.25)(0.8)}{0.564} = 0.1063\ldots$$
Solution to b): We need to find $P(B^c \cap C \mid A)$. We will use the definition of
conditional probability. $P(A \cap B^c \cap C)$ was found in part a), and $P(A)$ was
given. Thus,
$$P(B^c \cap C \mid A) = \frac{P(A \cap B^c \cap C)}{P(A)} = \frac{(0.3)(0.25)(0.8)}{0.3} = 0.2.$$
Solution to c): We need to find $P(A \cup B \mid C)$.
$$P(A \cup B \mid C) = \frac{P((A \cup B) \cap C)}{P(C)} \quad \text{(definition of conditional probability).} \tag{1}$$
We calculate $P((A \cup B) \cap C)$ by using the law of total probability: we multiply
the probabilities and conditional probabilities along each branch in which
the event $(A \cup B) \cap C$ occurs (the 1st, 3rd, and 5th branches from the top):
$$P((A \cup B) \cap C) = (0.3)(0.75)(0.20) + (0.3)(0.25)(0.8) + (0.7)(0.20)(0.15) = 0.126.$$
We calculate $P(C)$ by using the law of total probability: we multiply the
probabilities and conditional probabilities along each branch in which the event
$C$ occurs (the 1st, 3rd, 5th, and 7th branches from the top):
$$P(C) = (0.3)(0.75)(0.20) + (0.3)(0.25)(0.8) + (0.7)(0.20)(0.15) + (0.7)(0.80)(0.9) = 0.63.$$
Thus,
$$P(A \cup B \mid C) = \frac{0.126}{0.63} = 0.2.$$
Warning $P(A \cup (B \mid C))$ does not make any sense at all since $B \mid C$ is not an
event.
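The three answers for the river-pollution problem can be cross-checked by enumerating all eight branches of the 3-level tree. The following is a rough sketch under the probabilities given in the problem statement; the helper names `branch` and `prob` are hypothetical and are not taken from the course materials.

```python
from itertools import product

P_A = 0.3
P_B_given = {True: 0.75, False: 0.20}                  # P(B | A), P(B | A^c)
P_C_given = {(True, True): 0.20, (False, True): 0.15,  # keyed by (A occurs, B occurs)
             (True, False): 0.80, (False, False): 0.90}

def branch(a, b, c):
    """Probability of one root-to-leaf branch: multiply along the branch."""
    p = P_A if a else 1 - P_A
    p *= P_B_given[a] if b else 1 - P_B_given[a]
    p *= P_C_given[(a, b)] if c else 1 - P_C_given[(a, b)]
    return p

def prob(event):
    """Add the branch probabilities over all branches on which `event` holds."""
    outcomes = product([True, False], repeat=3)
    return sum(branch(a, b, c) for a, b, c in outcomes if event(a, b, c))

part_a = prob(lambda a, b, c: a and not b and c) / prob(lambda a, b, c: not b and c)
part_b = prob(lambda a, b, c: a and not b and c) / prob(lambda a, b, c: a)
part_c = prob(lambda a, b, c: (a or b) and c) / prob(lambda a, b, c: c)
print(round(part_a, 4), round(part_b, 4), round(part_c, 4))
```

Up to floating-point rounding, the three values agree with the answers 0.1063..., 0.2, and 0.2 obtained above. Writing each event as a predicate on (a, b, c) mirrors the phrase "each branch in which the event occurs."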
Exercise 3) (15 points)

Look at Exercise 4.21 in the textbook. You will answer part d) of that
problem from the book, but not quite following their outline.
a) First, construct a table like Table 4.3 on page 133, except with “Sex” and
“Marital Status” replaced by “Religion” and “Occupation” (with the choices
for Religion being $R_1$, $R_2$, and $R_3$, and with the choices for Occupation being
W, B, and O).
The top left entry represents $P(R_1 \cap W)$. Since
$$P(W \mid R_1) = \frac{P(W \cap R_1)}{P(R_1)},$$
we can multiply both sides of this equation by $P(R_1)$ to get
$$P(W \cap R_1) = P(R_1)\,P(W \mid R_1).$$
We can use that last formula and the given values to fill in the top left entry
of the table. Proceed in that manner to fill in the inner 3 × 3 table. Then add
across and down to produce the marginal probabilities.
b) Answer the question stated in 4.21d.
c) Why might someone find this sort of use of demographic data (which really
does go on, of course) disturbing?
Solution to a): We proceed exactly as indicated above. For example,
$$P(R_1 \cap W) = P(R_1)\,P(W \mid R_1) = (0.35)(0.12) = 0.042.$$
         W        B        O        Total
R1       0.042    0.2835   0.0245   0.35
R2       0.108    0.432    0.06     0.60
R3       0.0065   0.0375   0.006    0.05
Total    0.1565   0.753    0.0905   1
WRITE-UP NOTE: In situations like this, you need not show the arithmetic
for every calculation. If it’s exactly the same in each case (as it is here), just
explain that (as I did), show the arithmetic for one case in complete detail (as
above), and then write in the answers to the others.
Solution to b):
$$P(R_1 \mid W) = \frac{P(R_1 \cap W)}{P(W)} = \frac{0.042}{0.1565} = 0.26837\ldots,$$
whereas
$$P(R_1) = 0.35.$$
So, the answer is “less likely.” White collar workers are less likely than a
randomly selected member of that community to be of Religion 1. So, the additional
information about worker type in this case gives us additional information about
religion as well.
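The marginal totals and the comparison in part b) can be reproduced directly from the joint table. This is a small sketch, assuming the nine joint probabilities exactly as printed in the solution to part a); the dictionary layout and variable names are illustrative choices.

```python
from fractions import Fraction

# Joint probabilities P(R_i and occupation), copied from the table in part a).
joint = {
    "R1": {"W": Fraction("0.042"), "B": Fraction("0.2835"), "O": Fraction("0.0245")},
    "R2": {"W": Fraction("0.108"), "B": Fraction("0.432"), "O": Fraction("0.06")},
    "R3": {"W": Fraction("0.0065"), "B": Fraction("0.0375"), "O": Fraction("0.006")},
}

# Marginals: add across each row and down each column.
row_totals = {r: sum(cells.values()) for r, cells in joint.items()}
col_totals = {o: sum(joint[r][o] for r in joint) for o in ("W", "B", "O")}

# Definition of conditional probability: P(R1 | W) = P(R1 and W) / P(W).
p_r1_given_w = joint["R1"]["W"] / col_totals["W"]
print(float(row_totals["R1"]), float(col_totals["W"]), float(p_r1_given_w))
```

The row totals recover P(R1) = 0.35, P(R2) = 0.60, and P(R3) = 0.05, and the computed P(R1 | W) is smaller than P(R1), which is the "less likely" conclusion.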
Solution to c): There is no single right answer to this. Of course, conditional
probability can be used to refine probabilities in the presence of additional
information. However, this exercise shows that it may also be used to gain
probabilistic information (about a specific individual, not just about a large
group of people) regarding sensitive personal issues that someone may wish
to keep private, even issues it may be illegal to ask about directly (as in a
job interview), simply by using information about less sensitive issues (such as
worker type, which is something that may be discussed in job interviews, for
instance).
Remark 3 This is in fact a major application of data analysis, one that
businesses across the world have spent and continue to spend a tremendous amount
of money on. Different countries will have to make decisions and laws which
establish an appropriate balance between the perceived benefits to businesses and
large organizations making use of such data and the perceived costs to the
individuals whose data are being used in such a manner. Conditional probability
plays a significant role in such analyses.
Exercise 4) (20 points – 5 points per part) (independence, mutual
exclusivity, conditional probability)
a) Suppose $A$ and $B$ are mutually exclusive. Suppose also that $P(A) > 0$ and
$P(B) > 0$. Can $A$ and $B$ be independent? Why or why not?
b) Give an example of events $A$ and $B$ such that $A$ and $B$ are simultaneously
mutually exclusive and independent. Show carefully, using the definitions, that
the $A$ and $B$ you chose satisfy these properties. You may select any probability
space you wish and choose $A$ and $B$ to be events in that space.
c) Give an example of events $A$ and $B$ which are neither mutually exclusive
nor independent. Show carefully, using the definitions, that the $A$ and $B$ you
chose satisfy these properties. You may select any probability space you wish
and choose $A$ and $B$ to be events in that space.
d) Suppose $P(A \mid B) = a$ and $P(B \mid A) = b$, where $a > 0$ and $b > 0$. In
particular, we are supposing that the conditional probabilities in the preceding
sentence are defined. For which values of $a$ and $b$ (if any) is it possible for $A$
and $B$ to be mutually exclusive? Explain.
Solution to a): We are given that $A$ and $B$ are mutually exclusive, and so
$A \cap B = \emptyset$.
If $A$ and $B$ were also independent, we would have
$$\begin{aligned}
P(A)\,P(B) &= P(A \cap B) && \text{(because of independence)} \\
&= P(\emptyset) && \text{(since $A$ and $B$ are mutually exclusive)} \\
&= 0,
\end{aligned}$$
which cannot happen since $P(A) > 0$ and $P(B) > 0$. Therefore, $A$ and $B$
cannot be independent.
Solution to b): You needed to provide just a single example. I will discuss
how to find one.
We simply need an example of events $A$ and $B$ for which $A \cap B = \emptyset$ and yet
$P(A \cap B) = P(A)\,P(B)$. As we saw in part a), this cannot happen unless we
also have $P(A) = 0$ or $P(B) = 0$. So, we are led to look for examples of events
$A$ and $B$ which have empty intersection and for which at least one of them has
probability zero. If either $P(A) = 0$ or $P(B) = 0$, then $P(A \cap B) = 0$, which
makes each side of the equation $P(A \cap B) = P(A)\,P(B)$ equal to zero, and
thus $A$ and $B$ are also independent.
We can now easily find many examples. Here is one. Let $\Omega = [0, 5]$. Let
$\mathcal{F}$ be the set of all subsets of $[0, 5]$ having a defined length. Let $P$ be the
one-dimensional geometric probability measure in which $P(E)$ is the length
of $E$ divided by the length of $\Omega$. Let $A = \{2\}$, and let $B = \{4\}$. We have
$A \cap B = \emptyset$, so $A$ and $B$ are mutually exclusive. We have $P(A) = 0$, $P(B) = 0$,
$P(A \cap B) = 0$, and so $P(A \cap B) = P(A)\,P(B)$, which implies that $A$ and $B$
are also independent.
Remark 4 The following is true: “If $A$ and $B$ are mutually exclusive, then
they are independent if and only if either $P(A) = 0$ or $P(B) = 0$.” Make sure
you know how to prove that statement.
Solution to c): Almost anything you pick will work for this one: make them
overlap (so as to not be mutually exclusive) but not in the exact proportion of
the equation $P(A \cap B) = P(A)\,P(B)$ (so as to not be independent). There
are infinitely many correct solutions; you need give only one and demonstrate
that it clearly has the indicated properties.
For example, consider a classical probability model with $\Omega = \{1, 2, 3, 4, 5, 6\}$.
Let $A = B = \{1, 2, 3\}$.
Then $A \cap B \neq \emptyset$, so $A$ and $B$ are not mutually exclusive, and furthermore
$$P(A \cap B) = \frac{1}{2} \neq \frac{1}{2} \cdot \frac{1}{2} = P(A)\,P(B),$$
so $A$ and $B$ are also not independent.
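Both definitions can be checked mechanically in a classical (equally likely) model. The sketch below uses the six-outcome sample space from the example above; the helper names `P`, `mutually_exclusive`, and `independent` are hypothetical.

```python
from fractions import Fraction

OMEGA = frozenset({1, 2, 3, 4, 5, 6})  # classical model: all outcomes equally likely

def P(event):
    """Classical probability: favorable outcomes over total outcomes."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def mutually_exclusive(A, B):
    """A and B are mutually exclusive when their intersection is empty."""
    return len(A & B) == 0

def independent(A, B):
    """A and B are independent when P(A and B) = P(A) P(B)."""
    return P(A & B) == P(A) * P(B)

A = B = frozenset({1, 2, 3})
print(mutually_exclusive(A, B), independent(A, B))
```

For A = B = {1, 2, 3} both checks return False, matching the solution. The same helpers can be used to test any other candidate pair of events in this sample space.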
Solution to d): Since $P(A \mid B)$ and $P(B \mid A)$ are defined, it follows that we
have $P(A) > 0$ and $P(B) > 0$. Then
$$a = P(A \mid B) = \frac{P(A \cap B)}{P(B)},$$
which gives (after multiplying both sides by $P(B)$)
$$P(A \cap B) = a\,P(B) > 0,$$
since $a > 0$ and $P(B) > 0$. In particular, $A \cap B \neq \emptyset$ (if $A \cap B$ were empty, we
would have $P(A \cap B) = 0$), so for any $a$ and $b$ with $a > 0$ and $b > 0$ it follows
that $A$ and $B$ are not mutually exclusive.
Remark 5 In fact, an even stronger statement is true: “Suppose $P(A \mid B) = a$
and $P(B \mid A) = b$, where $a > 0$ or $b > 0$. In particular, we are supposing that
the conditional probabilities in the preceding sentence are defined. Then $A$ and $B$
cannot be mutually exclusive.” Make sure you know how to prove that statement.
I gave the proof above for the case $a > 0$. The proof for the case $b > 0$ is similar
(just switch $A$ and $B$, and $a$ and $b$).
Exercise 5) (15 points) (independence for 3 or more events)

We can also define what it means for a collection of 3 or more events to be
independent. See Definition 4.5 on p. 150 of our textbook.
a) Let $A$, $B$, and $C$ be events in a probability space $(\Omega, \mathcal{F}, P)$. Write down the
4 equations which must be true in order for $A$, $B$, and $C$ to be independent.
b) Let $A$, $B$, $C$, and $D$ be events in a probability space $(\Omega, \mathcal{F}, P)$. Write down the
11 equations which must be true in order for $A$, $B$, $C$, and $D$ to be independent.
c) Let $A_1, A_2, \ldots, A_n$ be events in a probability space $(\Omega, \mathcal{F}, P)$. Explain why
$2^n - n - 1$ equations must be satisfied in order for the $A_i$'s to be independent.
Solution to a):
$$\begin{aligned}
P(A \cap B) &= P(A)\,P(B), \\
P(A \cap C) &= P(A)\,P(C), \\
P(B \cap C) &= P(B)\,P(C), \\
P(A \cap B \cap C) &= P(A)\,P(B)\,P(C).
\end{aligned}$$
Solution to b):
$$\begin{aligned}
P(A \cap B) &= P(A)\,P(B), \\
P(A \cap C) &= P(A)\,P(C), \\
P(A \cap D) &= P(A)\,P(D), \\
P(B \cap C) &= P(B)\,P(C), \\
P(B \cap D) &= P(B)\,P(D), \\
P(C \cap D) &= P(C)\,P(D), \\
P(A \cap B \cap C) &= P(A)\,P(B)\,P(C), \\
P(A \cap B \cap D) &= P(A)\,P(B)\,P(D), \\
P(A \cap C \cap D) &= P(A)\,P(C)\,P(D), \\
P(B \cap C \cap D) &= P(B)\,P(C)\,P(D), \\
P(A \cap B \cap C \cap D) &= P(A)\,P(B)\,P(C)\,P(D).
\end{aligned}$$
Solution to c): The multiplication property must hold for each collection of 2
or more events. There are $2^n$ subsets of a set having $n$ events, of which
$\binom{n}{0} = 1$ consists of no events and $\binom{n}{1} = n$ consist of exactly 1 event,
for a total of $2^n - n - 1$ collections consisting of 2 or more events. There is one
multiplication property equation for each of these, for a total of $2^n - n - 1$
equations that must be satisfied.
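The count can be confirmed by listing the sub-collections explicitly. This is a brief sketch, assuming the events are labeled A, B, C, ... purely for display; the function name `independence_conditions` is hypothetical.

```python
from itertools import combinations
from string import ascii_uppercase

def independence_conditions(n):
    """List every sub-collection of 2 or more events among n events."""
    events = ascii_uppercase[:n]  # illustrative labels A, B, C, ... (requires n <= 26)
    return [combo
            for size in range(2, n + 1)
            for combo in combinations(events, size)]

for n in (3, 4, 5):
    conditions = independence_conditions(n)
    # One multiplication equation per sub-collection of size at least 2.
    assert len(conditions) == 2**n - n - 1
    print(n, len(conditions))
```

For n = 3 and n = 4 the lists have 4 and 11 entries, matching the equations written out in parts a) and b).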

Exercise 6) (15 points)

This problem concerns a condition for which one is testing (medical or
otherwise) and a diagnostic test used to detect the presence of the condition.
Carefully consider the definitions of the terms specificity and sensitivity in Exercise
4.76 on p. 165 of our textbook.
a) Suppose a given diagnostic test has a “false positive” rate of 3% and a “false
negative” rate of 2%. What are the specificity and sensitivity? Explain carefully.
b) Let $D$ be the event where the condition being tested for is present. Let $+$
be the event that the diagnostic test indicates the condition is present, and let
$-$ be the complement of $+$.
Answer each of these 5 multiple-choice questions by indicating (by circling it
or by indicating it in any other clear way) which is the correct choice (1 point
each). No work needs to be shown in this case.
$P(D \mid -)$ = (false positive rate, false negative rate, sensitivity, specificity,
none of these).
$P(- \mid D)$ = (false positive rate, false negative rate, sensitivity, specificity,
none of these).
$P(D \mid +)$ = (false positive rate, false negative rate, sensitivity, specificity,
none of these).
$P(+ \mid D)$ = (false positive rate, false negative rate, sensitivity, specificity,
none of these).
$P(- \mid D^c)$ = (false positive rate, false negative rate, sensitivity, specificity,
none of these).
c) Consider the following 8 conditional probabilities:
$$P(+ \mid D), \quad P(+ \mid D^c), \quad P(D^c \mid +), \quad P(D^c \mid -),$$
$$P(D \mid +), \quad P(D \mid -), \quad P(- \mid D), \quad P(- \mid D^c).$$
Suppose that $0 < P(D) < 1$ and that $0 < P(+) < 1$. Which PAIRS of
these probabilities must ALWAYS add up to 1 (which means they must both
be defined and sum to 1)? List all such pairs if there are any. If there aren’t
any, say so.
GRADING NOTE: no work needs to be shown; you will earn 5 points
if ALL such pairs (if any) are correctly identified, and 0 points
otherwise. There is no partial credit. Proceed carefully.
Hint: There is one underlying idea here; if you understand it clearly then you
will get it completely right, and if you do not then you should spend time trying
to sort it out. The grading for this one is designed to encourage you to do that.
Remember what we did with multi-level tree diagrams and labeling conditional
probabilities along line segments. Try to construct a tree diagram similar to the
one we used in class for the Tuberculosis example. That may help.
GRADING NOTE for a): 50% deduction for assuming any specific value for
$P(D)$, or for not showing first that $P_D$ and $P_{D^c}$ must be probability measures.
Solution to a):
The false positive rate is the conditional probability that, if a person does not
have a given condition, he or she will incorrectly get a positive test result. Thus,
it is $P(+ \mid D^c)$. The fact that it is defined implies that $P(D^c) > 0$, so that $P_{D^c}$
is a conditional probability measure. The false positive rate is $P_{D^c}(+)$.
The false negative rate is the conditional probability that, if a person does
have a given condition, he or she will incorrectly get a negative test result. Thus,
it is $P(- \mid D)$. The fact that it is defined implies that $P(D) > 0$, so that $P_D$
is a conditional probability measure. The false negative rate is $P_D(-)$.
We can now write all quantities in terms of these probability measures and
use the fact that $P(+) + P(-) = 1$ for any probability measure $P$ because $+$
and $-$ are complementary events.
The false positive rate is $P(+ \mid D^c) = P_{D^c}(+)$.
The false negative rate is $P(- \mid D) = P_D(-)$.
The sensitivity is $P(+ \mid D) = P_D(+)$.
The specificity is $P(- \mid D^c) = P_{D^c}(-)$.
Since $P_D$ is a probability measure, and since $+$ and $-$ form a partition of
the sample space, we have
$$P_D(+) + P_D(-) = 1,$$
which, in words, is
sensitivity + false negative rate = 1.
The false negative rate is 2%, or 0.02, so the sensitivity is 0.98 or 98%.
Since $P_{D^c}$ is a probability measure, and since $+$ and $-$ form a partition of
the sample space, we have
$$P_{D^c}(+) + P_{D^c}(-) = 1,$$
which, in words, is
false positive rate + specificity = 1.
The false positive rate is 3%, or 0.03, so the specificity is 0.97 or 97%.
GRADING NOTE for b): 1 point each (no partial credit for each).
Solution to b):
$P(D \mid -)$ = none of these,
$P(- \mid D)$ = false negative rate,
$P(D \mid +)$ = none of these,
$P(+ \mid D)$ = sensitivity,
$P(- \mid D^c)$ = specificity.
GRADING NOTE for c): no work needs to be shown; you will earn 5
points if ALL such pairs (if any) are correctly identi…ed, and 0 points
otherwise.
Solution to c): Because $0 < P(D) < 1$ and $0 < P(+) < 1$, it follows that
$P(D) > 0$, $P(D^c) > 0$, $P(+) > 0$, and $P(-) > 0$, and so each of the indicated
conditional probabilities is defined, and $P_D$, $P_{D^c}$, $P_+$, and $P_-$ are conditional
probability measures, hence probability measures, which implies, in particular,
that their values on complementary events must add up to 1.
Thus,
$$P_D(+) + P_D(-) = 1,$$
which is equivalent to
$$P(+ \mid D) + P(- \mid D) = 1.$$
Next,
$$P_{D^c}(+) + P_{D^c}(-) = 1,$$
$$P(+ \mid D^c) + P(- \mid D^c) = 1.$$
Next,
$$P_+(D) + P_+(D^c) = 1,$$
$$P(D \mid +) + P(D^c \mid +) = 1.$$
Next,
$$P_-(D) + P_-(D^c) = 1,$$
$$P(D \mid -) + P(D^c \mid -) = 1.$$
In this way, we have derived the following four equations relating the given
quantities:
$$\begin{aligned}
P(+ \mid D) + P(- \mid D) &= 1, \\
P(+ \mid D^c) + P(- \mid D^c) &= 1, \\
P(D \mid +) + P(D^c \mid +) &= 1, \\
P(D \mid -) + P(D^c \mid -) &= 1.
\end{aligned}$$
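The four identities can be checked numerically on a concrete example. The sketch below uses a made-up joint distribution over (disease status, test result); the four cell values are hypothetical and chosen only so that every conditional probability is defined. It also shows one pair that conditions on two different events and does not sum to 1.

```python
from fractions import Fraction

# Hypothetical joint distribution; any choice with all four cells positive works.
joint = {("D", "+"): Fraction(3, 100), ("D", "-"): Fraction(1, 100),
         ("Dc", "+"): Fraction(6, 100), ("Dc", "-"): Fraction(90, 100)}

def marginal(index, value):
    """P(status = value) if index == 0, P(result = value) if index == 1."""
    return sum(p for key, p in joint.items() if key[index] == value)

def p_result_given_status(t, d):
    """P(t | d), e.g. P(+ | D)."""
    return joint[(d, t)] / marginal(0, d)

def p_status_given_result(d, t):
    """P(d | t), e.g. P(D | +)."""
    return joint[(d, t)] / marginal(1, t)

pair_sums = [
    p_result_given_status("+", "D") + p_result_given_status("-", "D"),
    p_result_given_status("+", "Dc") + p_result_given_status("-", "Dc"),
    p_status_given_result("D", "+") + p_status_given_result("Dc", "+"),
    p_status_given_result("D", "-") + p_status_given_result("Dc", "-"),
]
print(pair_sums)

# Conditioning on two different events: no reason for this sum to equal 1.
mixed = p_result_given_status("+", "D") + p_result_given_status("+", "Dc")
print(mixed)
```

Each of the four sums is exactly 1 because both terms are values of the same conditional probability measure on complementary events. The mixed pair, sensitivity plus false positive rate, equals 13/16 for these made-up numbers. One example cannot prove that a pair always sums to 1, but it can show that a pair does not.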
Exercise 7) (10 points) (conditional probability)

Consider Exercise 4.21 from the textbook. In this problem, we’ll let the
events $R_1$, $R_2$, $R_3$, $W$, $B$, and $O$ mean the same as in the textbook, but ignore
all of the probabilities given in 4.21.
Instead, suppose we know only the following (which hold simultaneously):
$0.1 < P(R_1) < 0.9$, (2)
$0.2 < P(R_2) < 0.8$, (3)
$0.3 < P(R_3) < 0.7$, (4)
$P(W \mid R_1) = 0.20$, (5)
$P(W \mid R_2) = 0.15$, (6)
$P(B \mid R_2) = 0.61$, (7)
$P(B \mid R_3) = 0.58$, (8)
$P(O \mid R_1) = 0.37$, (9)
$P(O \mid R_3) = 0.32$. (10)
What, if anything, can be said about the value of
$P(W \mid R_3) + P(B \mid R_1) + P(O \mid R_2)$? (11)
Explain your answer in careful detail.
Solution: For each $i = 1, 2, 3$, since $P(R_i) > 0$ the conditional probability
measure $P_{R_i}$ is defined and is also a probability measure. Since $W$, $B$, and $O$
partition the sample space, for each $i$ their $P_{R_i}$ measures must add up to 1, so
$$\begin{aligned}
P(W \mid R_3) &= P_{R_3}(W) = 1 - P_{R_3}(B) - P_{R_3}(O) = 0.10, \\
P(B \mid R_1) &= P_{R_1}(B) = 1 - P_{R_1}(W) - P_{R_1}(O) = 0.43, \\
P(O \mid R_2) &= P_{R_2}(O) = 1 - P_{R_2}(W) - P_{R_2}(B) = 0.24,
\end{aligned}$$
and so the sum in (11) equals $0.10 + 0.43 + 0.24 = 0.77$.
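The arithmetic is short, but it can be organized so that each missing conditional probability is recovered from the other two for the same religion. This is a minimal sketch using the six given values (5) through (10); the dictionary `given` and the function `missing` are hypothetical names.

```python
from fractions import Fraction

# Given conditional probabilities (5)-(10), keyed by (occupation, religion).
given = {("W", "R1"): Fraction("0.20"), ("O", "R1"): Fraction("0.37"),
         ("W", "R2"): Fraction("0.15"), ("B", "R2"): Fraction("0.61"),
         ("B", "R3"): Fraction("0.58"), ("O", "R3"): Fraction("0.32")}

def missing(occupation, religion):
    """P(occupation | religion) = 1 minus the other two conditionals for that religion."""
    others = [p for (o, r), p in given.items() if r == religion and o != occupation]
    return 1 - sum(others)

total = missing("W", "R3") + missing("B", "R1") + missing("O", "R2")
print(float(total))
```

Note that the bounds (2) through (4) enter only by guaranteeing that each P(R_i) is positive, so the result does not depend on the actual values of P(R_1), P(R_2), and P(R_3).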
Exercise 8. (30 points – 5 points per part) (independence and conditional
probability)
In this exercise, you will explore various properties which hold when two
events are independent. Parts a. and b. concern a specific probability space.
The other parts concern an arbitrary probability space, not necessarily the one
from parts a. and b.
a. For part a., let $\Omega = [0, 2] \times [0, 2]$, and let $(\Omega, \mathcal{F}, P)$ be the corresponding
2-dimensional geometric probability model.
Our experiment consists of selecting a point from $\Omega$ “at random.” Let us
denote the selected point by
$$\omega = (x, y),$$
so that $x$ and $y$ are the coordinates of the selected point.
Let $E$ be the event that $x$ is in the interval $[3/2, 2]$.
Let $F$ be the event that $y$ is in the interval $[1/4, 2]$.
Are $E$ and $F$ independent events? Why or why not?
Hint: Calculate $P(E)$, $P(F)$, and $P(E \cap F)$.
b. Let $E$ and $F$ be as in part a. Show that $E$ and $F^c$ are independent by
calculating $P(E)$, $P(F^c)$, and $P(E \cap F^c)$ and verifying that the definition of
independence is satisfied with events $E$ and $F$ in the definition replaced by $E$
and $F^c$.
c. For this exercise, let $(\Omega, \mathcal{F}, P)$ be an arbitrary probability space. Part b. of
this exercise might lead you to wonder whether or not it is true that events $E$
and $F$ are independent if and only if events $E$ and $F^c$ are independent. In fact,
this is true in general.
In this part, I will have you complete a proof of the following assertion: “if
$E$ and $F$ are independent, then $E$ and $F^c$ are independent.”
I will provide all of the steps. You are to fill in the reasons for each of the
steps where I write “Why?”.
Claim: Suppose $(\Omega, \mathcal{F}, P)$ is a probability space and that $E$ and $F$ are events.
If $E$ and $F$ are independent, then $E$ and $F^c$ are independent.
Proof: Suppose $E$ and $F$ are independent. Let $a = P(E)$, and let $b = P(F)$.
Step 1: $P(E \cap F) = ab$. Why?
Step 2: Then $P(E \cap F^c) = P(E) - P(E \cap F)$. Why?
Step 3: Then $P(E \cap F^c) = P(E)\,P(F^c)$. Why?
It follows that $E$ and $F^c$ are independent. q.e.d.
d. For this exercise, let $(\Omega, \mathcal{F}, P)$ be an arbitrary probability space. Consider
the following claim:
Claim: Suppose $(\Omega, \mathcal{F}, P)$ is a probability space and that $E$ and $F$ are events.
If $E$ and $F$ are independent, then $E^c$ and $F$ are independent.
Prove this claim by switching the letters $E$ and $F$ in the statement (not in
the proof) of the previous claim. This gives an almost immediate proof. It is an
example of an important method of reasoning called a symmetry argument.
e. Suppose ( ; F; P ) is an arbitrary probability space and that A and B are

independent events with P (A) = 0:4 and P (B) = 0:5: Calculate
P (A [ B) + P (A \ B c ) + P (A [ B c )
by using any legitimate method.

Hint: Remember the inclusion-exclusion principle. It allows you to rewrite the
probability of a union in terms of other probabilities which are easier to work
with when events are independent. For example, if $A$ and $B$ are independent
then so are $A^c$ and $B$ (from the above), and thus,
\begin{align*}
P(A^c \cup B) &= P(A^c) + P(B) - P(A^c \cap B) && \text{(inclusion-exclusion principle)} \\
              &= P(A^c) + P(B) - P(A^c) P(B) && \text{(since $A^c$ and $B$ are independent),}
\end{align*}
which is easy to calculate if we know $P(A)$ and $P(B)$. Proceed similarly with
the assigned problem.
f. Suppose $(\Omega, \mathcal{F}, P)$ is an arbitrary probability space and that $A$ and $B$ are
events. Suppose that $P(A) > 0$. Prove that $A$ and $B$ are independent if and
only if
\[ P(B \mid A) = P(B). \]
Solution to a.: Since we are selecting a point at random from a set having
positive, finite area, we use a 2-dimensional geometric probability model with
$\Omega$ equal to our set, so that
\[ P(A) = \frac{\mathrm{Area}(A)}{\mathrm{Area}(\Omega)} = \frac{1}{4}\,\mathrm{Area}(A) \]
for any event $A$.
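We will find the probabilities of the events $E$, $F$, and $E \cap F$:
\begin{align*}
P(E) &= \frac{\mathrm{Area}(E)}{\mathrm{Area}(\Omega)} = \frac{1}{4}, \\
P(F) &= \frac{\mathrm{Area}(F)}{\mathrm{Area}(\Omega)} = \frac{7/2}{4} = \frac{7}{8}, \\
P(E \cap F) &= \frac{\mathrm{Area}(E \cap F)}{\mathrm{Area}(\Omega)} = \frac{7/8}{4} = \frac{7}{32}.
\end{align*}
We see that $P(E \cap F) = P(E) P(F)$ and so $E$ and $F$ are independent.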
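As a quick check of the arithmetic above, here is a minimal Python sketch using exact fractions. The areas are the ones quoted in the solution; the value Area(E) = 1 is not stated explicitly and is inferred here from P(E) = 1/4 and Area(Ω) = 4. The variable and function names are illustrative only.

```python
from fractions import Fraction

# Areas quoted in the solution; Omega has area 4.
area_omega = Fraction(4)
area_E = Fraction(1)            # inferred from P(E) = 1/4 (assumption)
area_F = Fraction(7, 2)
area_E_and_F = Fraction(7, 8)

def prob(area):
    """Geometric probability: area of the event divided by area of Omega."""
    return area / area_omega

p_E = prob(area_E)
p_F = prob(area_F)
p_E_and_F = prob(area_E_and_F)

# Definition of independence: P(E and F) equals P(E) times P(F).
independent = (p_E_and_F == p_E * p_F)
print(p_E, p_F, p_E_and_F, independent)
```

Exact fractions are used instead of floating-point numbers so that the equality test in the definition of independence is not affected by rounding.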
Warning: It is incorrect to proceed as follows:
\[ P(E) = \frac{\mathrm{Length}([3/2, 2])}{\mathrm{Length}([0, 2])} = \frac{1/2}{2} = \frac{1}{4}. \]
The reason is that we have not shown that the $x$ component of a point
$(x, y)$ selected at random from $\Omega$ must itself be randomly selected from
$[0, 2]$. The selection is made at random from $\Omega$, so $\Omega$ must be our sample
space. It turns out that one can prove that the $x$ component of a point
$(x, y)$ selected at random from $\Omega$ must itself be randomly selected from
$[0, 2]$. However, this would require a technical and fully rigorous proof,
not an intuitive argument (it’s a very major claim to make). Later in the
course, when we consider joint uniform random variables, we will have the
technical tools necessary to prove such a claim carefully.
Solution to b.: We will use the same probability space as in part a., where we
also showed that $P(E) = 1/4$ and $P(F) = 7/8$. It follows that $P(F^c) = 1/8$.
Next,
\[ P(E \cap F^c) = \frac{(1/4)(1/2)}{4} = \frac{1}{32}, \]
which equals $P(E) P(F^c)$, so $E$ and $F^c$ are independent.
Solution to c.:
Step 1 reason: since $E$ and $F$ are independent.
Step 2 reason: since $E \cap F$ and $E \cap F^c$ form a partition of $E$, we have
\[ P(E \cap F) + P(E \cap F^c) = P(E), \]
from which the asserted claim follows immediately.

Step 3 reason:
\begin{align*}
P(E \cap F^c) &= P(E) - P(E \cap F) && \text{(from Step 2)} \\
              &= P(E) - P(E) P(F) && \text{(from Step 1)} \\
              &= P(E) (1 - P(F)) && \text{(algebra)} \\
              &= P(E) P(F^c) && \text{(complement rule).}
\end{align*}
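The three steps can also be checked numerically on a small finite probability space. The sketch below is an illustration, not part of the proof: the sample space (two fair coin flips) and the particular events are hypothetical choices made here, and checking one example does not establish the general claim.

```python
from fractions import Fraction
from itertools import product

# Hypothetical sample space: two fair coin flips, four equally likely outcomes.
omega = set(product("HT", repeat=2))

def prob(event):
    """Probability of an event under the equally-likely model on omega."""
    return Fraction(len(event), len(omega))

E = {w for w in omega if w[0] == "H"}   # first flip is heads
F = {w for w in omega if w[1] == "H"}   # second flip is heads
F_c = omega - F                         # complement of F

# Step 1: independence of E and F.
step1 = prob(E & F) == prob(E) * prob(F)
# Step 2: E & F and E & F_c partition E.
step2 = prob(E & F_c) == prob(E) - prob(E & F)
# Step 3: E and F_c satisfy the definition of independence.
step3 = prob(E & F_c) == prob(E) * prob(F_c)

print(step1, step2, step3)
```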
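Solution to d.: Proof: We are given that $(\Omega, \mathcal{F}, P)$ is a probability space, that
$E$ and $F$ are events, and that $E$ and $F$ are independent. We must show that
$E^c$ and $F$ are independent.
The definition of independence is symmetric in the two events, so $F$ and $E$
are independent. Applying the claim from part c. with the letters $E$ and $F$
switched, we see that $F$ and $E^c$ are independent; that is, $E^c$ and $F$ are
independent, as required. q.e.d.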
Solution to e.: Since $A$ and $B$ are independent, we have
\[ P(A \cap B) = P(A) P(B) = (0.4)(0.5) = 0.2. \]
By the Inclusion-Exclusion Principle,
\begin{align*}
P(A \cup B) &= P(A) + P(B) - P(A \cap B) \\
            &= 0.4 + 0.5 - 0.2 = 0.7.
\end{align*}
Because $B$ and $B^c$ form a partition of $\Omega$, $A \cap B$ and $A \cap B^c$ form a partition
of $A$, and so
\begin{align*}
P(A) &= P(A \cap B) + P(A \cap B^c), \\
0.4 &= 0.2 + P(A \cap B^c), \\
P(A \cap B^c) &= 0.2.
\end{align*}
Above, we found that $P(A \cap B^c) = 0.2$, and the complement rule gives
$P(B^c) = 1 - P(B) = 0.5$. By the Inclusion-Exclusion Principle,
\[ P(A \cup B^c) = P(A) + P(B^c) - P(A \cap B^c) = 0.4 + 0.5 - 0.2 = 0.7. \]

Alternatively, $(A \cup B^c)^c = A^c \cap B = B \cap A^c$, so we will find the probability
of $B \cap A^c$ and then subtract that result from 1.
Because $A$ and $A^c$ form a partition of $\Omega$, $B \cap A$ and $B \cap A^c$ form a partition
of $B$, and so making use of the probability of $B \cap A$ (which we found at the
start of this solution) we have
\begin{align*}
P(B \cap A) + P(B \cap A^c) &= P(B), \\
0.2 + P(B \cap A^c) &= 0.5,
\end{align*}
from which we see that $P(B \cap A^c) = 0.3$. As noted above, this implies that
\[ P(A \cup B^c) = 1 - 0.3 = 0.7. \]
Thus,
\[ P(A \cup B) + P(A \cap B^c) + P(A \cup B^c) = 0.7 + 0.2 + 0.7 = 1.6. \]
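The calculation above can be reproduced in a few lines of Python. This is a minimal sketch that simply replays the steps of the solution with exact fractions; the variable names are illustrative, and the only inputs are the given values P(A) = 0.4 and P(B) = 0.5.

```python
from fractions import Fraction

p_A = Fraction(2, 5)   # P(A) = 0.4
p_B = Fraction(1, 2)   # P(B) = 0.5

p_A_and_B = p_A * p_B                     # independence of A and B
p_A_or_B = p_A + p_B - p_A_and_B          # inclusion-exclusion
p_A_and_Bc = p_A - p_A_and_B              # A & B and A & B^c partition A
p_Bc = 1 - p_B                            # complement rule
p_A_or_Bc = p_A + p_Bc - p_A_and_Bc       # inclusion-exclusion

# Alternative route: complement of (A or B^c) is (B and A^c).
p_B_and_Ac = p_B - p_A_and_B
p_A_or_Bc_alt = 1 - p_B_and_Ac

total = p_A_or_B + p_A_and_Bc + p_A_or_Bc
print(total, float(total))
```

Both routes to P(A ∪ B^c) give the same value, which is a useful consistency check when a problem admits more than one method.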
Solution to f.: We will prove each of the two claims in the given “if and only
if” statement separately.
First, suppose $A$ and $B$ are independent. Then
\[ P(A \cap B) = P(A) P(B), \]
and so
\[ P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(A) P(B)}{P(A)} = P(B). \]
Next, suppose that
\[ P(B \mid A) = P(B). \]
Then
\[ \frac{P(A \cap B)}{P(A)} = P(B), \]
which yields $P(A \cap B) = P(A) P(B)$ after multiplying both sides by $P(A)$.
It follows that $A$ and $B$ are independent.
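To see the equivalence in action, the following Python sketch compares the two conditions on a small example. The sample space (one roll of a fair die) and the events are hypothetical choices made for illustration; the example only shows the two conditions agreeing in two cases and is not a substitute for the proof.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair die.
omega = set(range(1, 7))

def prob(event):
    """Probability of an event under the equally-likely model on omega."""
    return Fraction(len(event), len(omega))

def cond_prob(b, a):
    """P(b | a); only defined when P(a) > 0, as in the exercise."""
    if prob(a) == 0:
        raise ValueError("P(a) must be positive")
    return prob(a & b) / prob(a)

def independent(a, b):
    """Definition of independence: P(a and b) = P(a) P(b)."""
    return prob(a & b) == prob(a) * prob(b)

A = {2, 4, 6}        # roll is even
B = {1, 2, 3, 4}     # independent of A: P(A & B) = 1/3 = (1/2)(2/3)
C = {1, 2, 3}        # not independent of A: P(A & C) = 1/6, but (1/2)(1/2) = 1/4

# For each event, record (independent of A?, P(event | A) == P(event)?).
results = [(independent(A, ev), cond_prob(ev, A) == prob(ev)) for ev in (B, C)]
print(results)
```

In each pair the two entries agree, as the result in part f. says they must.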