HW4 Solutions
Reading assignment:
Class notes: Part IV of the class notes (Chapters 28 through 30, inclusive)
(Chapters 1 through 27 were previously assigned.)
Weiss: Section 4.3 (Chapters 1, 2, and 3, and Sections 4.1, 4.2, and 4.4
were previously assigned.)
Schaum’s book (Hwei Hsu): Read Section 1.8 (Sections 1.1 through 1.7
were previously assigned.)
Schaum’s book (Hwei Hsu): Carefully work through exercises 1.63 through
1.73 inclusive. Do not turn these in, but make sure you understand the
details of the calculations and how to solve these exercises.
event “the selected person has the disease.” Let + be the event “the selected
person tests + for the disease.”
a) Draw a 2-level tree diagram, like our tree diagram for the TB example from
class, as follows. Draw two line segments, and label them $D$ and $D^c$. Further
subdivide each of those into $+$ and $-$ (where $-$ is the complement of $+$). Then
fill in the appropriate probabilities and conditional probabilities along the line
segments of the tree diagram by using the information provided in Exercise 4.76
from the book.
b) Find the probability that a person testing positive actually has the disease. I.e.,
calculate $P(D \mid +)$. Use the tree diagram from part a) to organize your
calculation, and proceed as in class (using the definition of conditional probability
and the Law of Total Probability).
Suppose that 1 in 1000 people has the disease, as in Exercise 4.76 in the
book.
c) Now suppose $P(D) = a$, for some $a$ such that $0 < a < 1$. Proceed in the
same way to show that
\[ P(D \mid +) = \frac{0.96a}{0.96a + 0.02(1 - a)}. \]
d) Graph this function of $a$, for $0 < a < 1$, and attach a printout. You may use
your calculator or a computer. Be extra careful to use parentheses around the
numerator and around the denominator when entering it into a computer. For
instance, enter it as
    (0.96a) / (0.96a + 0.02(1 - a))
Entering it as
    0.96a / 0.96a + 0.02(1 - a)
will result in a graph of the wrong function because of the order of operations
in mathematics: PEMDAS (Parentheses, Exponentiation, Multiplication and
Division left to right, Addition and Subtraction left to right).
If you have trouble printing, you may instead give a hand-drawn graph, but
draw it very carefully and in a lot of detail.
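The order-of-operations warning in part d) can be checked numerically. The following is a minimal Python sketch; the function names `correct` and `wrong` are illustrative and not part of the assignment. Python requires an explicit `*`, so the unparenthesized entry is evaluated strictly left to right; a calculator that supports implicit multiplication may treat it differently, but it will still not produce the intended function.

```python
def correct(a):
    """Intended function: (0.96a) / (0.96a + 0.02(1 - a))."""
    return (0.96 * a) / (0.96 * a + 0.02 * (1 - a))


def wrong(a):
    """The same symbols typed without the outer parentheses."""
    # Evaluated as ((0.96 * a) / 0.96) * a + 0.02 * (1 - a),
    # which simplifies to a**2 + 0.02 * (1 - a).
    return 0.96 * a / 0.96 * a + 0.02 * (1 - a)


# Compare the two at a few illustrative values of a.
for a in (0.001, 0.1, 0.5):
    print(a, correct(a), wrong(a))
```

The two columns disagree at every value tried, which is the point of the warning.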
f) Explain what the graph from part d) has to do with the “base rate fallacy.”
g) An article is published in a science magazine about an extremely rare, non-
contagious disease. Anyone who has it will die fairly quickly, but it can be
cured if caught early. The cure is not without consequences, however – the
treatment renders the patient blind every time. Alarmed about the fatality
of the disease, many concerned people write to their Congressperson to ask
that random testing (i.e., testing of people selected at random) for this disease
be done. The Congressperson writes a bill requiring random testing for this
condition and tries to get it passed into law. During the discussion period,
a few scientists who are skilled in probability object quite vehemently, using
arguments based on probability. Why are they objecting? Explain carefully.
NOTE: This is an example of how probability can help determine public policy.
Solution Remarks: We’ll solve part a) first and then part c). Part b) will
then be solved as a special case of c). Then we’ll solve the other parts in order.
Solution to a): Since we are selecting at random from a finite, non-empty set,
the relevant probability model is classical probability. We will proceed this way
in other parts of this problem as well.
We make the tree diagram as described. Putting $D$ above $D^c$ at the first
level, the probabilities at the first level from top to bottom are 0.001 and 0.999,
respectively. Putting $+$ above $-$ in each case at the second level, the second-level
probabilities from top to bottom are 0.96, 0.04, 0.02, and 0.98, respectively.
Solution to c): We are interested in $P(D \mid +)$.
\begin{align*}
P(D \mid +) &= \frac{P(D \cap +)}{P(+)} && \text{(definition of conditional probability)} \\
&= \frac{0.96a}{0.96a + 0.02(1 - a)} && \text{(using the Law of Total Probability twice)} \\
&= \frac{48a}{47a + 1}.
\end{align*}
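The formula from part c) can be evaluated directly, which also gives the special case needed for part b). This is a minimal Python sketch, not the method used in class; the names `posterior` and `simplified` are illustrative, and the default rates 0.96 and 0.02 are the values appearing in the formula above.

```python
def posterior(a, sensitivity=0.96, false_positive_rate=0.02):
    """P(D | +) when the prevalence is a = P(D)."""
    true_pos = sensitivity * a                  # P(D and +)
    false_pos = false_positive_rate * (1 - a)   # P(D^c and +)
    return true_pos / (true_pos + false_pos)    # P(D and +) / P(+)


def simplified(a):
    """The simplified form 48a / (47a + 1)."""
    return 48 * a / (47 * a + 1)


# Part b): prevalence of 1 in 1000.
p_b = posterior(0.001)
print(p_b)
```

With $a = 0.001$ this gives $0.048 / 1.047$, which is roughly 0.046, so fewer than 1 in 20 people who test positive actually have the disease.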
[Figure: graph for part d), with x on the horizontal axis and y on the vertical axis, each running from 0.0 to 1.0.]
Solution to e): When $P(D)$ is low (i.e., when $a$ is small, which is the case for
rare conditions), $P(D \mid +)$ is also low, despite the fact that the test is quite
accurate in the sense that most people with the condition test positive and
most without it test negative. Therefore, if an individual is selected for testing
at random, a positive test result is rarely indicative of disease if the disease is
rare, exactly as in the TB example from class.
This makes it clear that, for instance, random testing of the public for a rare
condition can potentially be a very bad idea (particularly if there are significant
negative consequences associated with giving someone unnecessary treatment,
such as health risks and/or significant costs associated with the treatment).
Of course, if a person has symptoms of a condition, even of a rare one, then
this mathematical model does not apply, since this model is based on random
testing; a symptomatic person differs in important ways from a person selected
at random.
By contrast, when the condition is highly prevalent, random testing can give
extremely useful results.
This entire problem also demonstrates the importance of developing diagnostic
tests having extremely low false positive and false negative rates (the
lower the better), if the condition is rare, as well as the importance of having
multiple, independent tests done to diagnose and confirm a rare condition.
Suppose a person without other symptoms tests positive for a rare condition.
Should insurance companies cover the costs of another test? Should insurance
companies insist on another test before agreeing to pay for treatment? Questions
like these can be addressed prudently by making use of all this knowledge.
Doctors and/or patients with this knowledge can use it to help ensure they
get better treatment. Insurance companies can use it to help ensure that
appropriate testing is done before patients are subjected to possibly unnecessary,
dangerous, and costly treatments or surgeries. Lawmakers can use this information
to ensure that adequate protections are given to residents (to ensure they
are not taken advantage of by insurance companies or doctors).
Thus, we see that probability can also play a major role in health policies.
Solution to f): The “base rate fallacy” is in assuming that $P(D \mid +)$ is high
(near 1) when the false positive and false negative rates are small (as they are
in this problem), without taking into consideration $P(D)$, the prevalence of the
disease. The graph shows that $P(D \mid +)$ is in fact tiny (very close to 0) when
the condition is rare (i.e., when $P(D)$ is small), despite the false positive and
false negative rates being fairly small. The false positive and false negative rates
would have to be extremely tiny for $P(D \mid +)$ to be near 1 when $P(D)$ is tiny.
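To put a number on "extremely tiny," one can solve the Bayes formula for the false positive rate. The sketch below is illustrative only: the target value 0.95 for $P(D \mid +)$ is an assumption chosen for the example, and the function name is hypothetical. The sensitivity 0.96 and the prevalence 0.001 are the values from this problem.

```python
def max_false_positive_rate(a, target, sensitivity=0.96):
    """Largest P(+ | D^c) for which P(D | +) >= target, given P(D) = a."""
    # Solve  s*a / (s*a + f*(1 - a)) >= target  for f, with s = sensitivity.
    return sensitivity * a * (1 - target) / (target * (1 - a))


# Assumed target: we want P(D | +) to be at least 0.95 when P(D) = 0.001.
f_max = max_false_positive_rate(a=0.001, target=0.95)
print(f_max)
```

Under these assumptions the false positive rate would need to be on the order of 0.005%, far below the 2% in this problem.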
Solution to g): As we saw in class, a positive test result for a very rare
condition – in the absence of any other symptoms or any particular reason to
undergo the testing – can correspond to a very low probability of actually
having the condition.
Random testing of asymptomatic people for a rare condition will end up
misdiagnosing many people as having the condition. Most of them would likely
seek the treatment (since they would view the condition as fatal) and end up
unnecessarily disabled by it, greatly harming them, their families, and their
dependents. To make matters even worse, the healthcare system would bear
significant costs associated with all these unnecessary and dangerous treatments.
A good understanding of probability is critical for people making public
policy decisions which affect many people.
Suppose
\begin{align*}
P(A) &= 0.3, \\
P(B \mid A) &= 0.75, \\
P(B \mid A^c) &= 0.20, \\
P(C \mid A \cap B) &= 0.20, \\
P(C \mid A^c \cap B) &= 0.15, \\
P(C \mid A \cap B^c) &= 0.80, \\
P(C \mid A^c \cap B^c) &= 0.90.
\end{align*}
a) Find the conditional probability that the river is polluted given that fishing
is permitted and that the sample tested did not detect pollution.
b) Find the probability that pollution will not be detected in a tested sample
and that fishing will be permitted given that the river is polluted.
c) Find the probability that the river is polluted or that a tested water sample
detects pollution, given that fishing is permitted.
$P(A \mid B^c \cap C)$,
which equals
\[ \frac{P(A \cap B^c \cap C)}{P(B^c \cap C)}. \]
We will use the law of total probability twice, once to evaluate the numerator
and a second time to evaluate the denominator.
We draw a 3-level tree diagram with each node splitting into two line segments
at each level (into $A$ and $A^c$ at the first level, into $B$ and $B^c$ at the second
level, and into $C$ and $C^c$ at the third level).
The probabilities (top to bottom, in the order specified, with $A$ on top of $A^c$,
$B$ on top of $B^c$, and $C$ on top of $C^c$) are as follows: first level (0.3, 0.7), second
level (0.75, 0.25, 0.20, 0.80), third level (0.20, 0.80, 0.80, 0.20, 0.15, 0.85, 0.90,
0.10). We can therefore evaluate the requested probabilities using the general
constructions used above.
The event $B^c \cap C$ occurs in two branches ($A \cap B^c \cap C$ and $A^c \cap B^c \cap C$).
In each case, in accordance with the law of total probability, we multiply the
probabilities and conditional probabilities across and add the results. We get
the probabilities and conditional probabilities along that branch. Putting the
results together gives
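\begin{align*}
P(A \mid B^c \cap C) &= \frac{P(A \cap B^c \cap C)}{P(B^c \cap C)} \\
&= \frac{0.3 \cdot 0.25 \cdot 0.8}{0.564} \\
&= 0.1063\ldots
\end{align*}
\begin{align*}
P(B^c \cap C \mid A) &= \frac{P(A \cap B^c \cap C)}{P(A)} \\
&= \frac{0.3 \cdot 0.25 \cdot 0.8}{0.3} \\
&= 0.2.
\end{align*}
\[ P(A \cup B \mid C) = \frac{P((A \cup B) \cap C)}{P(C)} \quad \text{(definition of conditional probability)}. \tag{1} \]
\begin{align*}
P((A \cup B) \cap C) &= (0.3)(0.75)(0.20) + (0.3)(0.25)(0.8) + (0.7)(0.20)(0.15) \\
&= 0.126.
\end{align*}
\begin{align*}
P(C) &= (0.3)(0.75)(0.20) + (0.3)(0.25)(0.8) + (0.7)(0.20)(0.15) + (0.7)(0.80)(0.9) \\
&= 0.63.
\end{align*}
Thus,
\[ P(A \cup B \mid C) = \frac{0.126}{0.63} = 0.2. \]
Warning: $P(A \cup (B \mid C))$ does not make any sense at all since $B \mid C$ is not an
event.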
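The three answers for this exercise can be double-checked by enumerating all eight leaves of the 3-level tree. This is a minimal Python sketch, not the written solution method. The dictionary layout and the helper `prob` are illustrative choices, and the reading of $A$ as "river polluted," $B$ as "sample detects pollution," and $C$ as "fishing permitted" is inferred from the wording of parts a) through c).

```python
from itertools import product

p_a = 0.3                                        # P(A)
p_b_given = {True: 0.75, False: 0.20}            # P(B | A), P(B | A^c)
p_c_given = {(True, True): 0.20, (False, True): 0.15,
             (True, False): 0.80, (False, False): 0.90}  # P(C | .), keyed by (A, B)

# Probability of each leaf: multiply the probabilities along its branch.
joint = {}
for a, b, c in product([True, False], repeat=3):
    pa = p_a if a else 1 - p_a
    pb = p_b_given[a] if b else 1 - p_b_given[a]
    pc = p_c_given[(a, b)] if c else 1 - p_c_given[(a, b)]
    joint[(a, b, c)] = pa * pb * pc


def prob(event):
    """Add the leaf probabilities over the leaves where event(a, b, c) is true."""
    return sum(p for (a, b, c), p in joint.items() if event(a, b, c))


# a) P(A | B^c and C)
ans_a = prob(lambda a, b, c: a and not b and c) / prob(lambda a, b, c: not b and c)
# b) P(B^c and C | A)
ans_b = prob(lambda a, b, c: a and not b and c) / prob(lambda a, b, c: a)
# c) P(A or B | C)
ans_c = prob(lambda a, b, c: (a or b) and c) / prob(lambda a, b, c: c)
print(ans_a, ans_b, ans_c)
```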
Exercise 3) (15 points – 5 points per part) (conditional probability)
Look at Exercise 4.21 in the textbook. You will answer part d) of that
problem from the book, but not quite following their outline.
a) First, construct a table like Table 4.3 on page 133, except with “Sex” and
“Marital Status” replaced by “Religion” and “Occupation” (with the choices
for Religion being $R_1$, $R_2$, and $R_3$, and with the choices for Occupation being
W, B, and O).
The top left entry represents $P(R_1 \cap W)$. Since
\[ P(W \mid R_1) = \frac{P(W \cap R_1)}{P(R_1)}, \]
we have
\[ P(W \cap R_1) = P(R_1)\,P(W \mid R_1). \]
We can use that last formula and the given values to fill in the top left entry
of the table. Proceed in that manner to fill in the inner $3 \times 3$ table. Then add
across and down to produce the marginal probabilities.
c) Why might someone find this sort of use of demographic data (which really
does go on, of course) disturbing?
         W        B        O        Total
R1       0.042    0.2835   0.0245   0.35
R2       0.108    0.432    0.06     0.60
R3       0.0065   0.0375   0.006    0.05
Total    0.1565   0.753    0.0905   1
WRITE-UP NOTE: In situations like this, you need not show the arithmetic
for every calculation. If it’s exactly the same in each case (as it is here), just
explain that (as I did), show the arithmetic for one case in complete detail (as
above), and then write in the answers to the others.
Solution to b):
\[ P(R_1 \mid W) = \frac{P(R_1 \cap W)}{P(W)} = \frac{0.042}{0.1565} = 0.26837\ldots, \]
whereas
\[ P(R_1) = 0.35. \]
So, the answer is “less likely.” White collar workers are less likely than a
randomly selected member of that community to be of Religion 1. So, the additional
information about worker type in this case gives us additional information about
religion as well.
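The marginal totals and the comparison in part b) can be reproduced from the joint table. This is a minimal Python sketch using the table entries from the solution to part a); the nested-dictionary layout and the variable names are illustrative choices.

```python
# Joint probabilities P(R_i and occupation), copied from the table in part a).
joint = {
    "R1": {"W": 0.042, "B": 0.2835, "O": 0.0245},
    "R2": {"W": 0.108, "B": 0.432, "O": 0.06},
    "R3": {"W": 0.0065, "B": 0.0375, "O": 0.006},
}

# Marginals: add across each row and down each column.
row_totals = {r: sum(cols.values()) for r, cols in joint.items()}
col_totals = {c: sum(joint[r][c] for r in joint) for c in ("W", "B", "O")}

# Part b): compare P(R1 | W) with the unconditional P(R1).
p_r1_given_w = joint["R1"]["W"] / col_totals["W"]
print(p_r1_given_w, row_totals["R1"])
```

Since the conditional probability comes out below the marginal probability, the answer "less likely" follows.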
Remark 3 This is in fact a major application of data analysis, one that businesses
across the world have spent and continue to spend a tremendous amount
of money on. Different countries will have to make decisions and laws which
establish an appropriate balance between the perceived benefits to businesses and
large organizations making use of such data and the perceived costs to the
individuals whose data are being used in such a manner. Conditional probability
plays a significant role in such analyses.
Solution to a): We are given that $A$ and $B$ are mutually exclusive, and so
$A \cap B = \emptyset$.
If $A$ and $B$ were also independent, we would have
\begin{align*}
P(A)\,P(B) &= P(A \cap B) && \text{(because of independence)} \\
&= P(\emptyset) && \text{(since $A$ and $B$ are mutually exclusive)} \\
&= 0,
\end{align*}
which cannot happen since $P(A) > 0$ and $P(B) > 0$. Therefore, $A$ and $B$
cannot be independent.
Solution to b): You needed to provide just a single example. I will discuss
how to find one.
We simply need an example of events $A$ and $B$ for which $A \cap B = \emptyset$ and yet
$P(A \cap B) = P(A)\,P(B)$. As we saw in part a), this cannot happen unless we
also have $P(A) = 0$ or $P(B) = 0$. So, we are led to look for examples of events
$A$ and $B$ which have empty intersection and for which at least one of them has
probability zero. If either $P(A) = 0$ or $P(B) = 0$, then $P(A \cap B) = 0$, which
makes each side of the equation $P(A \cap B) = P(A)\,P(B)$ equal to zero, and
thus $A$ and $B$ are also independent.
We can now easily find many examples. Here is one. Let $\Omega = [0, 5]$. Let
$\mathcal{F}$ be the set of all subsets of $[0, 5]$ having a defined length. Let $P$ be the
one-dimensional geometric probability measure in which $P(E)$ is the length
of $E$ divided by the length of $\Omega$. Let $A = \{2\}$, and let $B = \{4\}$. We have
$A \cap B = \emptyset$, so $A$ and $B$ are mutually exclusive. We have $P(A) = 0$, $P(B) = 0$,
$P(A \cap B) = 0$, and so $P(A \cap B) = P(A)\,P(B)$, which implies that $A$ and $B$
are also independent.
Remark 4 The following is true: “If $A$ and $B$ are mutually exclusive, then
they are independent if and only if either $P(A) = 0$ or $P(B) = 0$.” Make sure
you know how to prove that statement.
Solution to c): Almost anything you pick will work for this one: make them
overlap (so as to not be mutually exclusive), but not in the exact proportion
required by the equation $P(A \cap B) = P(A)\,P(B)$ (so as to not be independent). There
are infinitely many correct solutions; you need give only one and demonstrate
that it clearly has the indicated properties.
For example, consider a classical probability model with $\Omega = \{1, 2, 3, 4, 5, 6\}$.
Let $A = B = \{1, 2, 3\}$.
Then $A \cap B \neq \emptyset$, so $A$ and $B$ are not mutually exclusive, and furthermore
\[ P(A \cap B) = \frac{1}{2} \neq \frac{1}{2} \cdot \frac{1}{2} = P(A)\,P(B), \]
so $A$ and $B$ are also not independent.
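The example in part c) is small enough to check mechanically. Here is a minimal Python sketch of the classical model on $\{1, \ldots, 6\}$; the helper `prob` and the flag names are illustrative. Exact fractions are used so that the equality test for independence is not affected by rounding.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}          # sample space of the classical model


def prob(event):
    """Classical probability: favorable outcomes over total outcomes."""
    return Fraction(len(event & omega), len(omega))


A = {1, 2, 3}
B = {1, 2, 3}

mutually_exclusive = len(A & B) == 0                  # is A ∩ B empty?
independent = prob(A & B) == prob(A) * prob(B)        # multiplication property
print(mutually_exclusive, independent)
```

Both flags come out false, matching the written argument. Changing `A` and `B` lets you test other candidate examples the same way.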
which gives (after multiplying both sides by $P(B)$)
\[ P(A \cap B) = a\,P(B) > 0, \]
Solution to a):
\begin{align*}
P(A \cap B) &= P(A)\,P(B), \\
P(A \cap C) &= P(A)\,P(C), \\
P(B \cap C) &= P(B)\,P(C), \\
P(A \cap B \cap C) &= P(A)\,P(B)\,P(C).
\end{align*}
Solution to b):
\begin{align*}
P(A \cap B) &= P(A)\,P(B), \\
P(A \cap C) &= P(A)\,P(C), \\
P(A \cap D) &= P(A)\,P(D), \\
P(B \cap C) &= P(B)\,P(C), \\
P(B \cap D) &= P(B)\,P(D), \\
P(C \cap D) &= P(C)\,P(D), \\
P(A \cap B \cap C) &= P(A)\,P(B)\,P(C), \\
P(A \cap B \cap D) &= P(A)\,P(B)\,P(D), \\
P(A \cap C \cap D) &= P(A)\,P(C)\,P(D), \\
P(B \cap C \cap D) &= P(B)\,P(C)\,P(D), \\
P(A \cap B \cap C \cap D) &= P(A)\,P(B)\,P(C)\,P(D).
\end{align*}
Solution to c): The multiplication property must hold for each collection of 2
or more events. There are $2^n$ subsets of a set having $n$ events, of which
$\binom{n}{0} = 1$ consists of no events and $\binom{n}{1} = n$ consist of 1 event,
for a total of $2^n - n - 1$ collections consisting of 2 or more events. There is
one multiplication property equation for each of these, for a total of
$2^n - n - 1$ equations that must be satisfied.
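The count $2^n - n - 1$ can be confirmed by listing the collections directly. The following minimal Python sketch counts every collection of 2 or more events out of $n$; the function name `count_equations` is illustrative.

```python
from itertools import combinations


def count_equations(n):
    """Number of collections of 2 or more events chosen from n events."""
    events = range(n)
    return sum(1 for k in range(2, n + 1) for _ in combinations(events, k))


# n = 3 and n = 4 correspond to parts a) and b).
for n in (3, 4, 5):
    print(n, count_equations(n), 2**n - n - 1)
```

For $n = 3$ and $n = 4$ the counts are 4 and 11, matching the number of equations listed in parts a) and b).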
b) Let $D$ be the event where the condition being tested for is present. Let $+$
be the event that the diagnostic test indicates the condition is present, and let
$-$ be the complement of $+$.
Answer each of these 5 multiple-choice questions by indicating (by circling it
or by indicating it in any other clear way) which is the correct choice (1 point
each). No work needs to be shown in this case.
$P(+ \mid D)$ = (false positive rate, false negative rate, sensitivity, specificity,
none of these).
$P(- \mid D^c)$ = (false positive rate, false negative rate, sensitivity, specificity,
none of these).
\[ P(+ \mid D),\quad P(+ \mid D^c),\quad P(D^c \mid +),\quad P(D^c \mid -), \]
\[ P(D \mid +),\quad P(D \mid -),\quad P(- \mid D),\quad P(- \mid D^c). \]
Suppose that $0 < P(D) < 1$ and that $0 < P(+) < 1$. Which PAIRS of
these probabilities must ALWAYS add up to 1 (which means they must both
be defined and sum to 1)? List all such pairs if there are any. If there aren’t
any, say so.
GRADING NOTE: no work needs to be shown; you will earn 5 points
if ALL such pairs (if any) are correctly identified, and 0 points
otherwise. There is no partial credit. Proceed carefully.
Hint: There is one underlying idea here; if you understand it clearly then you
will get it completely right, and if you do not then you should spend time trying
to sort it out. The grading for this one is designed to encourage you to do that.
Remember what we did with multi-level tree diagrams and labeling conditional
probabilities along line segments. Try to construct a tree diagram similar to the
one we used in class for the Tuberculosis example. That may help.
GRADING NOTE for a): 50% deduction for assuming any specific value for
$P(D)$, or for not showing first that $P_D$ and $P_{D^c}$ must be probability measures.
Solution to a):
The false positive rate is the conditional probability that, if a person does not
have a given condition, he or she will incorrectly get a positive test result. Thus,
it is $P(+ \mid D^c)$. The fact that it is defined implies that $P(D^c) > 0$, so that $P_{D^c}$
is a conditional probability measure. The false positive rate is $P_{D^c}(+)$.
The false negative rate is the conditional probability that, if a person does
have a given condition, he or she will incorrectly get a negative test result. Thus,
it is $P(- \mid D)$. The fact that it is defined implies that $P(D) > 0$, so that $P_D$
is a conditional probability measure. The false negative rate is $P_D(-)$.
We can now write all quantities in terms of these probability measures and
use the fact that $P(+) + P(-) = 1$ for any probability measure $P$ because $+$
and $-$ are complementary events.
The false positive rate is $P(+ \mid D^c) = P_{D^c}(+)$.
The false negative rate is $P(- \mid D) = P_D(-)$.
The sensitivity is $P(+ \mid D) = P_D(+)$.
The specificity is $P(- \mid D^c) = P_{D^c}(-)$.
Since $P_D$ is a probability measure, and since $+$ and $-$ form a partition of
the sample space, we have
\[ P_D(+) + P_D(-) = 1, \]
which, in words, is
\[ \text{sensitivity} + \text{false negative rate} = 1. \]
The false negative rate is 2%, or 0.02, so the sensitivity is 0.98, or 98%.
Since $P_{D^c}$ is a probability measure, and since $+$ and $-$ form a partition of
the sample space, we have
\[ P_{D^c}(+) + P_{D^c}(-) = 1, \]
which, in words, is
\[ \text{false positive rate} + \text{specificity} = 1. \]
The false positive rate is 3%, or 0.03, so the specificity is 0.97, or 97%.
GRADING NOTE for b): 1 point each (no partial credit for each).
Solution to b):
$P(D \mid -)$ = none of these,
$P(- \mid D)$ = false negative rate,
$P(D \mid +)$ = none of these,
$P(+ \mid D)$ = sensitivity,
$P(- \mid D^c)$ = specificity.
GRADING NOTE for c): no work needs to be shown; you will earn 5
points if ALL such pairs (if any) are correctly identified, and 0 points
otherwise.
Solution to c): Because $0 < P(D) < 1$ and $0 < P(+) < 1$, it follows that
$P(D) > 0$, $P(D^c) > 0$, $P(+) > 0$, and $P(-) > 0$, and so each of the indicated
conditional probabilities is defined, and $P_D$, $P_{D^c}$, $P_+$, and $P_-$ are conditional
probability measures, hence probability measures, which implies, in particular,
that their values on complementary events must add up to 1.
Thus,
\[ P_D(+) + P_D(-) = 1, \]
which is equivalent to
\[ P(+ \mid D) + P(- \mid D) = 1. \]
Next,
\[ P_{D^c}(+) + P_{D^c}(-) = 1, \]
which is equivalent to
\[ P(+ \mid D^c) + P(- \mid D^c) = 1. \]
Next,
\[ P_+(D) + P_+(D^c) = 1, \]
which is equivalent to
\[ P(D \mid +) + P(D^c \mid +) = 1. \]
Next,
\[ P_-(D) + P_-(D^c) = 1, \]
which is equivalent to
\[ P(D \mid -) + P(D^c \mid -) = 1. \]
In this way, we have derived the following four equations relating the given
quantities:
\begin{align*}
P(+ \mid D) + P(- \mid D) &= 1, \\
P(+ \mid D^c) + P(- \mid D^c) &= 1, \\
P(D \mid +) + P(D^c \mid +) &= 1, \\
P(D \mid -) + P(D^c \mid -) &= 1.
\end{align*}
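A numerical experiment can support (though not prove) that these four are the only pairs that always sum to 1. The Python sketch below computes all eight conditional probabilities for a few parameter settings and keeps the pairs whose sum is 1 in every setting. The first setting reuses the values from Exercise 1; the other two are arbitrary illustrative assumptions, and the function name `conditionals` and the string labels are hypothetical.

```python
from itertools import combinations


def conditionals(p_d, sens, fpr):
    """All eight conditional probabilities for given P(D), P(+|D), P(+|D^c)."""
    joint = {("D", "+"): p_d * sens, ("D", "-"): p_d * (1 - sens),
             ("Dc", "+"): (1 - p_d) * fpr, ("Dc", "-"): (1 - p_d) * (1 - fpr)}
    p_disease = {"D": p_d, "Dc": 1 - p_d}
    p_test = {t: joint[("D", t)] + joint[("Dc", t)] for t in "+-"}
    out = {}
    for (d, t), p in joint.items():
        out[f"P({t}|{d})"] = p / p_disease[d]   # condition on disease status
        out[f"P({d}|{t})"] = p / p_test[t]      # condition on test result
    return out


# (P(D), sensitivity, false positive rate); the last two rows are assumed values.
settings = [(0.001, 0.96, 0.02), (0.3, 0.7, 0.1), (0.55, 0.81, 0.37)]
tables = [conditionals(*s) for s in settings]

# Keep only the pairs that sum to 1 in every setting.
always_one = [
    (x, y) for x, y in combinations(tables[0], 2)
    if all(abs(t[x] + t[y] - 1) < 1e-12 for t in tables)
]
print(always_one)
```

A pair surviving this filter is only consistent with "always sums to 1"; the argument with conditional probability measures above is what actually establishes it.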
What, if anything, can be said about the value of
\[ P(W \mid R_3) + P(B \mid R_1) + P(O \mid R_2)? \tag{11} \]
a. For part a., let $\Omega = [0, 2] \times [0, 2]$, and let $(\Omega, \mathcal{F}, P)$ be the corresponding
2-dimensional geometric probability model.
Our experiment consists of selecting a point from $\Omega$ “at random.” Let us
denote the selected point by
\[ \omega = (x, y), \]
so that $x$ and $y$ are the coordinates of the selected point.
Let $E$ be the event that $x$ is in the interval $[3/2, 2]$.
Let $F$ be the event that $y$ is in the interval $[1/4, 2]$.
I will provide all of the steps. You are to fill in the reasons for each of the
steps where I write “Why?”.
Claim: Suppose $(\Omega, \mathcal{F}, P)$ is a probability space and that $E$ and $F$ are events.
If $E$ and $F$ are independent, then $E$ and $F^c$ are independent.
Proof: Suppose $E$ and $F$ are independent. Let $a = P(E)$, and let $b = P(F)$.
Step 1: $P(E \cap F) = ab$. Why?
Step 2: Then $P(E \cap F^c) = P(E) - P(E \cap F)$. Why?
Step 3: Then $P(E \cap F^c) = P(E)\,P(F^c)$. Why?
It follows that $E$ and $F^c$ are independent. q.e.d.
\[ P(A \cup B) + P(A \cap B^c) + P(A \cup B^c) \]
which is easy to calculate if we know $P(A)$ and $P(B)$. Proceed similarly with
the assigned problem.
\[ P(A) = \frac{\text{Area}(A)}{\text{Area}(\Omega)} = \frac{1}{4}\,\text{Area}(A) \]
for any event $A$.
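We will find the probabilities of the events $E$, $F$, and $E \cap F$.
\begin{align*}
P(E) &= \frac{\text{Area}(E)}{\text{Area}(\Omega)} = \frac{1}{4}, \\
P(F) &= \frac{\text{Area}(F)}{\text{Area}(\Omega)} = \frac{7/2}{4} = \frac{7}{8}, \\
P(E \cap F) &= \frac{\text{Area}(E \cap F)}{\text{Area}(\Omega)} = \frac{7/8}{4} = \frac{7}{32}.
\end{align*}
We see that $P(E \cap F) = P(E)\,P(F)$ and so $E$ and $F$ are independent.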
Solution to b.: We will use the same probability space as in part a), where we
also showed that $P(E) = 1/4$ and $P(F) = 7/8$. It follows that $P(F^c) = 1/8$.
Next,
\[ P(E \cap F^c) = \frac{(1/4)(1/2)}{4} = \frac{1}{32}, \]
which equals $P(E)\,P(F^c)$, so $E$ and $F^c$ are independent.
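The area calculations in parts a. and b. can be verified with exact arithmetic. This is a minimal Python sketch of the geometric model on $[0, 2] \times [0, 2]$; the helper `rect_prob` is an illustrative name, and it only handles axis-aligned rectangles, which is all this exercise needs.

```python
from fractions import Fraction as Fr


def rect_prob(x_lo, x_hi, y_lo, y_hi):
    """Probability of the rectangle [x_lo, x_hi] x [y_lo, y_hi] in [0, 2] x [0, 2]."""
    return (x_hi - x_lo) * (y_hi - y_lo) / Fr(4)   # area divided by Area(Omega) = 4


p_e = rect_prob(Fr(3, 2), Fr(2), Fr(0), Fr(2))            # x in [3/2, 2]
p_f = rect_prob(Fr(0), Fr(2), Fr(1, 4), Fr(2))            # y in [1/4, 2]
p_e_and_f = rect_prob(Fr(3, 2), Fr(2), Fr(1, 4), Fr(2))   # both conditions
p_e_and_fc = rect_prob(Fr(3, 2), Fr(2), Fr(0), Fr(1, 4))  # x in [3/2, 2], y in [0, 1/4)

print(p_e_and_f == p_e * p_f)            # independence of E and F
print(p_e_and_fc == p_e * (1 - p_f))     # independence of E and F^c
```

The boundary line $y = 1/4$ has area zero, so treating the strip for $F^c$ as a closed rectangle does not change its probability.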
Solution to c.:
Step 1 reason: since $E$ and $F$ are independent.
Step 2 reason: since $E \cap F$ and $E \cap F^c$ form a partition of $E$, we have
\[ P(E \cap F) + P(E \cap F^c) = P(E); \]
Solution to d.: Proof: We are given that $(\Omega, \mathcal{F}, P)$ is a probability space, that
$E$ and $F$ are events, and that $E$ and $F$ are independent. We must show that
$E^c$ and $F$ are independent.
Applying the claim from part c) with the letters $E$ and $F$ switched, we see
that $E^c$ and $F$ are independent, as required. q.e.d.
\begin{align*}
P(A \cup B) &= P(A) + P(B) - P(A \cap B) \\
&= 0.4 + 0.5 - 0.2 = 0.7.
\end{align*}
\begin{align*}
P(A) &= P(A \cap B) + P(A \cap B^c), \\
0.4 &= 0.2 + P(A \cap B^c), \\
P(A \cap B^c) &= 0.2.
\end{align*}
\begin{align*}
P(B \cap A) + P(B \cap A^c) &= P(B), \\
0.2 + P(B \cap A^c) &= 0.5,
\end{align*}
from which we see that $P(B \cap A^c) = 0.3$. As noted above, this implies that
$P(A \cup B^c) = 0.7$.
Thus,
\begin{align*}
P(A \cup B) + P(A \cap B^c) + P(A \cup B^c) &= 0.7 + 0.2 + 0.7 = 1.6.
\end{align*}
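The arithmetic above can be retraced in a few lines. This minimal Python sketch uses the values $P(A) = 0.4$, $P(B) = 0.5$, and $P(A \cap B) = 0.2$ that appear in the calculation; the variable names are illustrative.

```python
p_a, p_b, p_a_and_b = 0.4, 0.5, 0.2

p_a_or_b = p_a + p_b - p_a_and_b     # inclusion-exclusion
p_a_and_bc = p_a - p_a_and_b         # A is partitioned by B and B^c
p_b_and_ac = p_b - p_a_and_b         # B is partitioned by A and A^c
p_a_or_bc = 1 - p_b_and_ac           # A ∪ B^c is the complement of B ∩ A^c

total = p_a_or_b + p_a_and_bc + p_a_or_bc
print(total)
```

Note that the total exceeds 1, which is fine: it is a sum of probabilities of overlapping events, not itself a probability.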
Solution to f.: We will prove each of the two claims in the given “if and only
if” statement separately.
First, suppose $A$ and $B$ are independent. Then
\[ P(A \cap B) = P(A)\,P(B), \]
and so
\[ P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(A)\,P(B)}{P(A)} = P(B). \]
Next, suppose that
\[ P(B \mid A) = P(B). \]
Then
\[ \frac{P(A \cap B)}{P(A)} = P(B), \]
which yields $P(A \cap B) = P(A)\,P(B)$ after multiplying both sides by $P(A)$.
It follows that $A$ and $B$ are independent.