P-Values and Hypothesis Testing
Fall 2003
P-VALUES AND HYPOTHESIS TESTING
The Idea of a p-value. A p-value is something you calculate when you want to evaluate
two competing hypotheses. Given a pair of competing hypotheses, a p-value is
calculated from relevant data you have gathered. The p-value you get from your data will
give you an idea of how plausible the hypotheses you are evaluating are.
Null and Alternative Hypotheses. The hypotheses you are interested in must first be
formulated as one "null hypothesis" (denoted $H_0$) and one "alternative hypothesis"
(denoted $H_A$). I find it helpful to think of $H_0$ as the "default": it is what you will
believe if your data provides no compelling evidence to the contrary. I think of $H_A$ as
the conclusion on which the burden of proof is placed: we will find the alternative
hypothesis convincing only if our data provide compelling support. [1] [2]
E.g.: If you are doing a trial to see whether a drug is effective, the null would be that it is
not effective and the alternative would be that it is effective.
E.g.: If a person is being tried for a crime in an American court, the null hypothesis is
that she is innocent and the alternative is that she is guilty.
An Example. A casino in Atlantic City has a game in which people bet on whether a
coin will come up heads or tails when it is tossed. This game is perfectly legal as long as
the coin is fair, meaning that every time it is tossed there is a 50 percent chance it comes
up heads and a 50 percent chance it comes up tails. But an agent of the NJ Gambling
Commission suspects that the casino has been using a weighted coin that has a greater
probability of coming up heads than of coming up tails. The owner of the casino has in
fact been arrested and is on trial.
The null hypothesis that the casino owner is innocent and the alternative that she is
guilty can be written like this:
$H_0: p = .5$
$H_A: p > .5$
where p represents the probability that the coin comes up heads on any toss.
[1] As we will see, compelling support for the alternative will actually come in the form of compelling
evidence against the null.
[2] The null and alternative must be mutually exclusive. Let's also assume that they are formulated in such a
way that they are mutually exhaustive. There are some subtleties involved in the latter assumption that
might be worth discussing but that I think would distract us from the principal objectives of this first pass
at p-values.
After the null and alternative hypotheses have been stated, some data must be collected.
In our example, suppose the judge tosses the coin in question ten times, and the outcomes
of those tosses are the only evidence available in the trial. And suppose that the sequence
of heads and tails observed in the ten tosses of the coin is HHHHTHHHTH. I will call
this the "raw data."
Now think about a courtroom dialog that takes place between the attorneys for the
prosecution and the defense after this data is observed:
PROSECUTION: Aha! Look at that! Eight heads in ten tosses!?! That coin
must be weighted in favor of heads! It is just not possible that a fair coin would come up
heads eight times in ten tosses.
DEFENSE: "Must be," you say? Impossible that the coin is fair? Nonsense. It
is perfectly possible that the coin is fair and just happened to come up heads eight times
out of ten. This evidence doesn't prove anything.
PROSECUTION: OK, it is possible that a fair coin could come up heads eight
times out of ten, but how likely is that?
It is this last question that underlies the notion of a p-value. The rough conceptualization
is this: We have observed evidence that, on the face of it, looks unfavorable to the null
and favorable to the alternative. If we had observed evidence that could not possibly be
generated if the null were true, then we would know the null is false and the alternative is
true. But that is not generally the case, and it is not the case in this example: it is
possible for a fair coin to come up heads eight times in ten tosses. Nonetheless, this
evidence does cast doubt on the null and provides support for the alternative. In the final
question of the dialog above, the prosecutor has proposed a quantitative measure of how
much doubt the evidence casts on the null: the smaller the probability of getting as
many as eight heads in ten tosses of a fair coin, the less credible the null is in the face of
this evidence, and the more persuaded we will be that we should base future actions (like
convicting the casino owner) on the assumption that the alternative is true.
Let's describe this way of thinking about the problem more formally, but still in the
context of this coin-tossing example. Let's start at the point at which we have formulated
the null and alternative hypotheses and observed the raw data described above. Here's
what we do then to calculate a p-value.
To start, we define a "test statistic," a value that we calculate from our raw data that will
be useful in evaluating the competing hypotheses. In this case, the prosecution has
implicitly invoked the number of heads in ten tosses as the test statistic. That choice of
test statistic is intuitively plausible, and we will see that in fact it works well in this
context. So let us take the number of heads observed in ten tosses as the test statistic.
Next we ask: "Qualitatively speaking, what values of the test statistic would challenge
the null and support the alternative?" In this example, it is large values of the test
statistic that look inconsistent with the null.
Once we have stated what it means for the test statistic to be inconsistent with the null,
we can ask: "If the null hypothesis were true, how likely is it that the data would yield a
test statistic that is as inconsistent with the null hypothesis as the test statistic that we
actually calculated from the data we observed?" This question is almost identical to the
one the prosecution left us with in the dialog above: if the coin were fair, what is the
probability of getting eight heads in ten tosses? But there is one thing to be careful about:
Is it the fact that we saw exactly eight heads that seems suspicious? Would it have been
less suspicious to observe exactly nine heads? No: as observed above, it is really just the
fact that the number of heads is large that makes us suspicious. So when we talk about a
test statistic that is "as inconsistent with the null" as the one we calculated from our
sample, we really mean "at least as inconsistent," or "as inconsistent or more inconsistent
with the null," as the one we calculated from our sample. In this example, getting a test
statistic as inconsistent or more inconsistent with the null as the one we calculated from
our data means observing eight or more heads in ten tosses. So the question we are
asking is: If the coin were fair, what is the probability of getting eight or more heads in
ten tosses of the coin? The answer to this question is the p-value.
If we let X represent the number of heads in ten tosses of the coin, we can write
$p\text{-value} = P(X \ge 8 \mid \text{the null hypothesis is true})$
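As a rough numerical sketch of this probability: if the coin is fair and the tosses are independent, the number of heads in ten tosses is binomial with ten trials and success probability .5, so the tail probability is a sum of three binomial terms. The helper name `binomial_upper_tail` is an illustrative assumption, not something defined in this handout.

```python
from math import comb


def binomial_upper_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) when X ~ Bin(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Raw data from the example: the ten observed tosses.
raw_data = "HHHHTHHHTH"
observed_heads = raw_data.count("H")  # the test statistic

# p-value: probability of at least this many heads if the coin is fair (p = .5).
p_value = binomial_upper_tail(n=len(raw_data), k=observed_heads, p=0.5)
print(f"observed heads = {observed_heads}, p-value = {p_value:.4f}")
```

The sum covers 8, 9, and 10 heads, which gives (45 + 10 + 1)/1024, or about .055.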
To calculate this probability, we somehow need to figure out the probability distribution
of the test statistic X. In this case it is easy: assuming the tosses of the coin are mutually
independent (which is reasonable in this case), the number of heads in ten tosses is a
binomial random variable. But what are the parameters of this binomial random
variable? The number of trials is ten, but do we know the probability of getting heads on
any given trial? If we did, we wouldn't have to do this hypothesis test! So all we can say
is that the probability of heads on any trial is p, where p is the true but unknown
probability of getting heads on any toss. So what we know for sure about the distribution
of X can be written as
$X \sim \mathrm{Bin}(10, p)$.
But the p-value is not simply the probability that X is greater than or equal to eight. The
question is, what would that probability be if the null hypothesis were true? And since
the null hypothesis is that $p = .5$, we can say the following: If the null hypothesis is
true, then $X \sim \mathrm{Bin}(10, .5)$. [You sometimes hear terminology like "under the null
hypothesis ...
... $\mu_0$ is just some number, like 12 or 0 or -2.7.)
Collect some data and calculate a "test statistic."
In the case of hypothesis tests about a population mean, the test statistic is the sample
mean $\bar{X}$. We will use the notation ... $\mu_0$). So, qualitatively speaking, if
the null hypothesis were true, it would be unlikely to observe large values of $\bar{X}$.
Quantitatively speaking, what would be the probability of the realized value of the test
statistic being as (or more) inconsistent with the null hypothesis as the value you
calculated from your data, if the null hypothesis were in fact true?
In the case of this one-sided hypothesis test about ... $\mu_0$?
In symbols, this question is asking us to find
$P(\bar{X} \ge \bar{x} \mid \mu = \mu_0)$.
What is the probability distribution of the test statistic?
Fortunately, we know a lot about the distribution of $\bar{X}$.
First, we know that $E(\bar{X}) = \mu$. (This is going to be useful even though we don't know
what $\mu$ is equal to.)
We also know that $\mathrm{Var}(\bar{X}) = \sigma^2 / n$ (and we are assuming we know $\sigma^2$).
And since we are assuming that n is large, we know by the CLT that $\bar{X}$ is (approximately)
normally distributed.
So we know that
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.
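The three facts about the sample mean can be checked with a small simulation. This is only a sketch under assumed settings that do not come from the handout: the population is Uniform(0, 1), which has mean 1/2 and variance 1/12, with n = 50 observations per sample and 2000 repeated samples. The function name `simulate_sample_means` is hypothetical.

```python
import random
import statistics


def simulate_sample_means(n: int, reps: int, seed: int = 0) -> list:
    """Draw `reps` samples of size `n` from Uniform(0, 1); return each sample's mean."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.random() for _ in range(n)) for _ in range(reps)]


n = 50
means = simulate_sample_means(n=n, reps=2000)

mu = 0.5         # population mean of Uniform(0, 1)
sigma2 = 1 / 12  # population variance of Uniform(0, 1)

# The average of the sample means should be close to mu,
# and their variance should be close to sigma^2 / n.
print("mean of sample means:    ", statistics.fmean(means), "vs", mu)
print("variance of sample means:", statistics.variance(means), "vs", sigma2 / n)
```

A histogram of `means` would also look roughly bell-shaped even though the population is flat, which is the CLT at work.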
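We don't know what $\mu$ is, but the probability we are looking for is conditional on
$\mu = \mu_0$. So when we calculate this conditional probability, we can
assume that
$\bar{X} \sim N\left(\mu_0, \frac{\sigma^2}{n}\right)$. And now we know everything we need to know to
calculate the desired probability: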
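$P(\bar{X} \ge \bar{x} \mid \mu = \mu_0) = P\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \ge \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}\right) = P\left(Z \ge \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}\right)$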
We know the values for $\mu_0$ (it is stated in the null hypothesis), $\sigma^2$,
and n, so we can use the standard normal table to find this probability.
This probability is the p-value.
A slightly different looking but equivalent way of presenting how a p-value is calculated
in this example is as follows. Use the standardized value of the realized sample mean (its
z-score) as the test statistic. Call this test statistic $z$, where
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$. Then use the
standard normal distribution, $Z \sim N(0, 1)$, to calculate p-values as follows:
$p\text{-value} = P(Z \ge z)$
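The z-score recipe can be sketched in code. The standard normal table lookup is replaced here by the complementary error function, using the identity P(Z >= z) = erfc(z / sqrt(2)) / 2. The function names and the numbers in the example (a null value of 10, sigma of 2, n of 64, and a realized sample mean of 10.5) are hypothetical and chosen only for illustration.

```python
from math import erfc, sqrt


def standard_normal_upper_tail(z: float) -> float:
    """Return P(Z >= z) for Z ~ N(0, 1)."""
    return 0.5 * erfc(z / sqrt(2))


def one_sided_z_p_value(x_bar: float, mu_0: float, sigma: float, n: int) -> float:
    """p-value for H0: mu = mu_0 against HA: mu > mu_0 when sigma is known."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))  # standardized realized sample mean
    return standard_normal_upper_tail(z)


# Hypothetical numbers: z = (10.5 - 10) / (2 / 8) = 2.0
p_value = one_sided_z_p_value(x_bar=10.5, mu_0=10.0, sigma=2.0, n=64)
print(f"p-value = {p_value:.4f}")
```

For a test whose alternative points the other way (small values of the sample mean are the suspicious ones), the lower tail P(Z <= z) would be used instead.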
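What if we don't know the population variance? If we don't know $\sigma^2$ (as we usually
won't), we can use an alternative test statistic:
$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$
where s represents the sample standard deviation, the square root of the
sample variance
$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$.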
Something that I call a "generalization of the CLT" tells us that when n is large, and if
the null hypothesis is true,
$t \sim t_{n-1}$. In this example, it is large values of t that are
inconsistent with the null hypothesis, so we calculate
$p\text{-value} = P(t_{n-1} \ge t^*)$
where $t^*$ represents the realized value of t you calculated from your data.
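The t-based calculation can be sketched the same way. The standard library has no t-distribution function, so this sketch integrates the t density numerically with Simpson's rule; in practice a t table or a statistics package would do this step. The function names and the five-point sample are hypothetical and not taken from the handout.

```python
from math import exp, lgamma, log, pi, sqrt
from statistics import fmean, stdev


def t_statistic(sample, mu_0: float) -> float:
    """Realized t: (sample mean - mu_0) / (s / sqrt(n)), with s the sample std. dev."""
    n = len(sample)
    return (fmean(sample) - mu_0) / (stdev(sample) / sqrt(n))


def t_density(x: float, df: int) -> float:
    """Density of the t distribution with `df` degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))


def t_upper_tail(t_star: float, df: int, steps: int = 2000) -> float:
    """Approximate P(t_df >= t_star) by Simpson's rule on [0, |t_star|]."""
    a = abs(t_star)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area = total * h / 3  # area under the density between 0 and |t_star|
    return 0.5 - area if t_star >= 0 else 0.5 + area


# Hypothetical data, testing H0: mu = 2 against HA: mu > 2.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
t_star = t_statistic(sample, mu_0=2.0)
p_value = t_upper_tail(t_star, df=len(sample) - 1)
print(f"t* = {t_star:.4f}, p-value = {p_value:.4f}")
```

The five-point sample is far too small for the large-n argument above and is used only to keep the arithmetic easy to follow.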
Hughes Faculty Seminar on Teaching Statistics
Fall 2003
P-VALUE PROBLEMS
1) Wawa sells "two foot" hoagie rolls. Of course, because there is some variability in the
production process, not each of the rolls is exactly 24 inches long. It is known that for
the entire population of Wawa "two foot" hoagie rolls, the standard deviation in their
lengths is 1.4 inches. A consumers' advocacy group has claimed that the mean length for
the entire population of these rolls is less than 24 inches. The advocacy group has taken
the Wawa Corporation to court to sue them for misrepresenting their product.
a) State the appropriate null and alternative hypotheses to be tested. (As usual, let $\mu$
represent the population mean length of Wawa "two foot" hoagie rolls.)
b) Suppose that in a random sample of 140 "two foot" hoagie rolls the mean length is
23.75 inches. Find the p-value.
2) Suppose a random sample has been taken from a normally distributed population. You
know that the mean of the population is $\mu = 25$ and the variance in the population is
$\sigma^2 = 4$, but you do not know the sample size (call it n, which as usual represents the
number of observations in the sample).
You want to test the following hypotheses about the size of the sample:
$H_0: n = 100$
$H_A: n < 100$
Although you will not be told what the sample size was, you will be told what the
realized value of the sample mean was (as usual, call it $\bar{x}$).
a) Qualitatively, what values of ... $\bar{x} - 25$,
the more inconsistent the data is with the null hypothesis.]
(ii) The more that the realized sample mean falls below the population mean, the
more inconsistent it is with the null hypothesis. [That is, the smaller (more
negative) is $\bar{x} - 25$, the more inconsistent the data is with the null hypothesis.]
Choose (i), (ii), or (iii) as your answer, and briefly explain the reasoning behind your
choice.
b) As usual, let $\bar{X}$ represent the mean of a random sample of size n from the population
described above (with $\mu = 25$ and $\sigma^2 = 4$). If the null hypothesis stated above is true,
then what is the probability distribution of $\bar{X}$?
c) Suppose you are told that in the sample that was taken, the realized value of the sample
mean was $\bar{x} = 25.48$. Find the p-value for the null and alternative hypotheses stated
above.
3) An office supply company hired a salesperson to work for one day. The contract
specified that the salesperson should visit the headquarters of 15 large corporations to try
to sell the company's office products. At each visit to a corporate headquarters, the
probability that the salesperson makes a sale is .4 (so the probability that she doesn't
make a sale is .6). Whether she makes a sale at any office is independent of whether she
made a sale at any other office.
a) Let X denote the number of sales she makes if she visits 15 offices. What is the
probability distribution of X? (Just give the name of the family of distributions that X
belongs to and indicate what the values of the parameters of the distribution are. You
do not need to write out each of the possible realizations of X with their probabilities.)
b) Suppose that at the end of the day, the salesperson has made only 3 sales. The owners
of the company are dismayed at how low this number of sales is and suspect that the
salesperson may have taken it easy and visited fewer than the 15 corporations that her
contract said she was supposed to visit.
The company would therefore like to sue the salesperson for breach of contract. But to
win the case, they must convince a judge that they have strong evidence to show that she
visited fewer than 15 corporations.
Think of this as a hypothesis testing problem, and state the null and alternative
hypotheses that the company would want to test. (You can state these hypotheses in
words or you can use symbols. If you use symbols, be sure you indicate what the symbols
you are using are meant to represent.)
c) (12 points) The only evidence the company has to present to the judge is that the
salesperson made only 3 sales during the day. (Nobody followed her around all day to
directly observe how many offices she actually visited.) Given this data, what is the
p-value for the hypotheses stated in part (b) above? (Assume that the judge, the company,
and the worker all know and agree that, as stated above, the probability of a sale on any
individual call is .4.)
4) To reduce employee theft, a company proposes to screen its workers with a lie detector
test. This test is not perfectly reliable: if a person is really innocent, the test indicates
"guilty" 10% of the time, and if the person is really guilty, the test indicates "innocent"
20% of the time. It is known that 5% of the workers actually are guilty.
Think of this as a hypothesis testing problem. Suppose the company wants to test the
following null and alternative hypotheses:
$H_0$: The worker is innocent
$H_A$: The worker is guilty
Suppose the worker takes the lie detector test and the test result is "guilty." Find the
p-value. (Think of the test result "guilty" as the data you collected.)