P-Values and Hypothesis Testing

Hughes Faculty Seminar on Teaching Statistics

Fall 2003
P-VALUES AND HYPOTHESIS TESTING
The Idea of a p-value. A p-value is something you calculate when you want to evaluate
two competing hypotheses. Given a pair of competing hypotheses, a p-value is
calculated from relevant data you have gathered. The p-value you get from your data will
give you an idea of how plausible the hypotheses you are evaluating are.
Null and Alternative Hypotheses. The hypotheses you are interested in must first be
formulated as one "null hypothesis" (denoted $H_0$) and one "alternative hypothesis"
(denoted $H_A$). I find it helpful to think of $H_0$ as the "default": it is what you will
believe if your data provides no compelling evidence to the contrary. I think of $H_A$ as
the conclusion on which the burden of proof is placed: we will find the alternative
hypothesis convincing only if our data provide compelling support. [1] [2]
E.g.: If you are doing a trial to see whether a drug is effective, the null would be that it is
not effective and the alternative would be that it is effective.
E.g.: If a person is being tried for a crime in an American court, the null hypothesis is
that she is innocent and the alternative is that she is guilty.
An Example. A casino in Atlantic City has a game in which people bet on whether a
coin will come up heads or tails when it is tossed. This game is perfectly legal as long as
the coin is fair, meaning that every time it is tossed there is a 50 percent chance it comes
up heads and a 50 percent chance it comes up tails. But an agent of the NJ Gambling
Commission suspects that the casino has been using a weighted coin that has a greater
probability of coming up heads than of coming up tails. The owner of the casino has in
fact been arrested and is on trial.
The null hypothesis that the casino owner is innocent, and the alternative that she is
guilty, can be written like this:
$$H_0: p = .5$$
$$H_A: p > .5$$
where p represents the probability that the coin comes up heads on any toss.
[1] As we will see, compelling support for the alternative will actually come in the form of compelling
evidence against the null.
[2] The null and alternative must be mutually exclusive. Let's also assume that they are formulated in such a
way that they are collectively exhaustive. There are some subtleties involved in the latter assumption that
might be worth discussing, but that I think would distract us from the principal objectives of this first pass
at p-values.
After the null and alternative hypotheses have been stated, some data must be collected.
In our example, suppose the judge tosses the coin in question ten times and the outcomes
of those tosses are the only evidence available in the trial. And suppose that the sequence
of heads and tails observed in the ten tosses of the coin is HHHHTHHHTH. I will call
this the "raw data."
Now think about a courtroom dialog that takes place between the attorneys for the
prosecution and the defense after this data is observed:
PROSECUTION: Aha! Look at that! Eight heads in ten tosses!?! That coin
must be weighted in favor of heads! It is just not possible that a fair coin would come up
heads eight times in ten tosses.
DEFENSE: "Must be," you say? Impossible that the coin is fair? Nonsense. It
is perfectly possible that the coin is fair and just happened to come up heads eight times
out of ten. This evidence doesn't prove anything.
PROSECUTION: OK, it is possible that a fair coin could come up heads eight
times out of ten, but how likely is that?
It is this last question that underlies the notion of a p-value. The rough conceptualization
is this: We have observed evidence that, on the face of it, looks unfavorable to the null
and favorable to the alternative. If we had observed evidence that could not possibly be
generated if the null were true, then we would know the null is false and the alternative is
true. But that is not generally the case, and it is not the case in this example: it is
possible for a fair coin to come up heads eight times in ten tosses. Nonetheless, this
evidence does cast doubt on the null and provides support for the alternative. In the final
question of the dialog above, the prosecutor has proposed a quantitative measure of how
much doubt the evidence casts on the null: the smaller is the probability of getting as
many as eight heads in ten tosses of a fair coin, the less credible the null is in the face of
this evidence, and the more persuaded we will be that we should base future actions (like
convicting the casino owner) on the assumption that the alternative is true.
Let's describe this way of thinking about the problem more formally, but still in the
context of this coin-tossing example. Let's start at the point at which we have formulated
the null and alternative hypotheses and observed the raw data described above. Here's
what we do then to calculate a p-value.
To start, we define a "test statistic," a value that we calculate from our raw data that will
be useful in evaluating the competing hypotheses. In this case, the prosecution has
implicitly invoked the number of heads in ten tosses as the test statistic. That choice of
test statistic is intuitively plausible, and we will see that in fact it works well in this
context. So let us take the number of heads observed in ten tosses as the test statistic.
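As a small illustration, the test statistic can be computed directly from the raw data of the example. The sketch below is in Python (the handout itself uses no code); the names `raw_data` and `count_heads` are illustrative choices, not part of the original material.

```python
# Raw data from the example: the ten tosses observed at the trial.
raw_data = "HHHHTHHHTH"


def count_heads(tosses: str) -> int:
    """Test statistic: the number of heads ('H') in a sequence of tosses."""
    return tosses.count("H")


test_statistic = count_heads(raw_data)
print(test_statistic)  # 8
```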
Next we ask: "Qualitatively speaking, what values of the test statistic would challenge
the null and support the alternative?" In this example, it is large values of the test
statistic that look inconsistent with the null.
Once we have stated what it means for the test statistic to be inconsistent with the null,
we can ask: "If the null hypothesis were true, how likely is it that the data would yield a
test statistic that is as inconsistent with the null hypothesis as the test statistic that we
actually calculated from the data we observed?" This question is almost identical to the
one the prosecution left us with in the dialog above: if the coin were fair, what is the
probability of getting eight heads in ten tosses? But there is one thing to be careful about:
Is it the fact that we saw exactly eight heads that seems suspicious? Would it have been
less suspicious to observe exactly nine heads? No: as observed above, it is really just the
fact that the number of heads is large that makes us suspicious. So when we talk about a
test statistic that is "as inconsistent with the null" as the one we calculated from our
sample, we really mean "at least as inconsistent" or "as inconsistent or more inconsistent
with the null" as the one we calculated from our sample. In this example, getting a test
statistic as inconsistent or more inconsistent with the null as the one we calculated from
our data means observing eight or more heads in ten tosses. So the question we are
asking is: If the coin were fair, what is the probability of getting eight or more heads in
ten tosses of the coin? The answer to this question is the p-value.
If we let X represent the number of heads in ten tosses of the coin, we can write
$$p\text{-value} = P(X \ge 8 \mid \text{the null hypothesis is true})$$
To calculate this probability, we somehow need to figure out the probability distribution
of the test statistic X. In this case it is easy: assuming the tosses of the coin are mutually
independent (which is reasonable in this case), the number of heads in ten tosses is a
binomial random variable. But what are the parameters of this binomial random
variable? The number of trials is ten, but do we know the probability of getting heads on
any given trial? If we did, we wouldn't have to do this hypothesis test! So all we can say
is that the probability of heads on any trial is p, where p is the true but unknown
probability of getting heads on any toss. So what we know for sure about the distribution
of X can be written as
$$X \sim \mathrm{Bin}(10, p).$$
But the p-value is not simply the probability that X is greater than or equal to eight. The
question is, what would that probability be if the null hypothesis were true? And since
the null hypothesis is that $p = .5$, we can say the following: If the null hypothesis is
true, then $X \sim \mathrm{Bin}(10, .5)$. [You sometimes hear terminology like "under the null
hypothesis, $X \sim \mathrm{Bin}(10, .5)$" or "the null distribution of X is binomial with n = 10 and
p = .5."]
So now we can say more about the p-value. In fact, we can calculate it:
$$\begin{aligned}
p\text{-value} &= P(X \ge 8 \mid \text{the null hypothesis is true}), \text{ where } X \sim \mathrm{Bin}(10, p) \\
&= P(X \ge 8), \text{ assuming } X \sim \mathrm{Bin}(10, .5) \\
&= .0547
\end{aligned}$$
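The figure .0547 can be checked without a binomial table. The following is a minimal Python sketch using only the standard library; the function names `binomial_pmf` and `upper_tail` are illustrative and do not come from the handout.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) when X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k_obs: int, n: int, p: float) -> float:
    """P(X >= k_obs) when X ~ Bin(n, p)."""
    return sum(binomial_pmf(k, n, p) for k in range(k_obs, n + 1))


# Under the null hypothesis the coin is fair, so X ~ Bin(10, .5).
coin_p_value = upper_tail(8, 10, 0.5)
print(round(coin_p_value, 4))  # 0.0547
```

The three terms in the sum correspond to exactly eight, nine, and ten heads: (45 + 10 + 1)/1024 = 56/1024.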
This means that there is just a 5.47 percent chance of getting eight or more heads in ten
tosses of a fair coin. More pointedly, if the coin in question in this trial were fair, there
would be just a 5.47 percent chance of getting as many heads as we did when we tossed it
ten times. This probability is not minuscule, but it is pretty small: we observed
something that would have been pretty unlikely if the null hypothesis had been true. We
haven't proven the null hypothesis is false (we could have gotten as many as eight heads
in ten tosses of a fair coin; in fact, if we do repeated iterations of ten tosses of a fair
coin, we will get eight heads or more in more than five percent of the iterations). But the
lowness of the probability of observing a test statistic as large as we did if the null
hypothesis were true makes us doubt that it is in fact true. The smaller is the p-value, the
greater our doubts.
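The remark about repeated iterations of ten tosses can be illustrated by simulation. This is a rough Python sketch, not part of the original handout: the function name, the number of runs, and the seed are arbitrary choices made for illustration, and the simulated frequency will only land near .0547, not on it.

```python
import random


def simulate_tail_frequency(n_runs: int, n_tosses: int = 10,
                            threshold: int = 8, seed: int = 0) -> float:
    """Fraction of simulated runs of fair-coin tosses with >= threshold heads."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(n_runs):
        # Each toss of a fair coin is heads with probability 0.5.
        heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
        if heads >= threshold:
            hits += 1
    return hits / n_runs


freq = simulate_tail_frequency(100_000)
print(freq)  # should be close to the exact value 56/1024
```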
Defining and Interpreting p-values. A definition of the p-value:
The p-value is the (ex ante) probability with which the value of the test statistic
would be as or more inconsistent with the null hypothesis as the (ex post) value of
the test statistic we calculated from our data, if the null hypothesis were true.
The p-value answers the question: If the null hypothesis had been true, what would have
been the probability of obtaining data that looked as or more inconsistent with it than the
data we observed in our sample? So the smaller is the p-value, the greater is the doubt
that our data casts on the null hypothesis.
How low a p-value must be before one rejects the null hypothesis [i.e., before one takes
an action predicated on the assumption that the null is not true] is a judgment call that
will depend on the context. In the legal context of the preceding example, the question
would be how low the p-value would have to be before we concluded "beyond a
reasonable doubt" that the coin was not fair, and so convicted the casino owner of the
crime. Although some conventions exist with respect to how low a p-value must be to
reject a null hypothesis, there is no objective basis for deciding precisely how low a
p-value must be to constitute evidence "beyond a reasonable doubt."
When we obtain a p-value of .0547, we say: "We can reject the null hypothesis at the
94.53% confidence level." In general, the statement "We can reject the null hypothesis at
the $100(1 - \alpha)\%$ confidence level" is equivalent to the statement "The p-value is equal to
$\alpha$." In symbols, that is
$$P(\text{getting data as inconsistent with } H_0 \text{ as the data we observed in our sample} \mid H_0 \text{ true}) = \alpha$$
It is tempting but not correct to say that when we reject a null hypothesis at the
$100(1 - \alpha)\%$ confidence level, we have found that, given the data we observed, the
probability that the null hypothesis is true is just $\alpha$. In symbols, that would be
$$P(H_0 \text{ true} \mid \text{how inconsistent our data was with } H_0) = \alpha$$
but that is not what a p-value is. And in fact, in the classical framework used here, it is
not even sensible to talk about the probability that the null hypothesis is true, since the
null hypothesis is a statement about a parameter, and since parameters are constants (not
random variables), we can't talk about the probability with which a parameter takes on
certain values (or takes on values in certain intervals).
An Outline of the General Approach to Calculating p-values.
1) State the null and alternative hypotheses.
2) Figure out what kind of relevant data is available or could be collected.
3) Figure out what test statistic you will calculate from the data.
4) Figure out, in a qualitative sense, what values of the test statistic would be inconsistent
with the null hypothesis. That is, ask what values or ranges of values of the test statistic
would be unlikely to be observed if the null hypothesis were true.
5) Figure out what the distribution of the test statistic would be if the null hypothesis is
correct.
6) Obtain or collect the raw data you decided you would need in (2) above.
7) Calculate the test statistic you decided upon in (3) above.
8) Under the assumption that the null hypothesis is true, calculate the (ex ante)
probability of obtaining a sample of data for which the value of the test statistic is as or
more inconsistent with the null hypothesis as the value you actually calculated (ex post)
from your data. (You will use the things you figured out in (4) and (5) above to calculate
this probability.)
9) The probability that you calculate in (8) is the p-value.
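The steps above can be sketched in code for the coin-tossing example. This is only one possible way to organize the calculation, written in Python for a discrete test statistic; the helper `discrete_p_value` and the predicate `larger_is_more_inconsistent` are hypothetical names introduced here, not part of the handout.

```python
from math import comb


def discrete_p_value(null_pmf, observed, at_least_as_inconsistent):
    """Step 8: add up the null probabilities of every value of the test
    statistic that is at least as inconsistent with H0 as the observed one."""
    return sum(prob for value, prob in null_pmf.items()
               if at_least_as_inconsistent(value, observed))


# Steps 1-3: H0: p = .5 vs HA: p > .5; data = ten tosses; statistic = number of heads.
# Step 4: large values of the statistic are inconsistent with the null.
def larger_is_more_inconsistent(value, observed):
    return value >= observed


# Step 5: under the null, the number of heads is Bin(10, .5).
null_pmf = {k: comb(10, k) * 0.5**10 for k in range(11)}

# Steps 6-7: collect the raw data and calculate the test statistic.
observed_heads = "HHHHTHHHTH".count("H")

# Steps 8-9: the resulting probability is the p-value.
outline_p_value = discrete_p_value(null_pmf, observed_heads,
                                   larger_is_more_inconsistent)
print(round(outline_p_value, 4))  # 0.0547
```

Passing the "inconsistent with the null" rule in as a function keeps step 4 separate from step 5, mirroring the order of the outline.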
P-values in Hypothesis Tests about a Population Mean
Suppose we have a sample of n observations and that n is large. Suppose also that,
although we don't know $\mu$ (the population mean), we do know $\sigma^2$ (the population variance).
What are the null and alternative hypotheses?
$$H_0: \mu = \mu_0$$
$$H_A: \mu > \mu_0$$
($\mu_0$ is just some number, like 12 or 0 or -2.7.)
Collect some data and calculate a "test statistic."
In the case of hypothesis tests about a population mean, the test statistic is the sample
mean $\bar{X}$. We will use the notation $\bar{x}$ to represent the particular value of the sample
mean that was found for your data.
Qualitatively, what values of the test statistic would we be unlikely to observe if the null
hypothesis were true? In other words, what values of the test statistic would appear
inconsistent with the null hypothesis? (Note that the notion of inconsistency being used
here is not that a certain value of the test statistic could not possibly be observed if the
null hypothesis were true, but just that it would be unlikely to be observed if the null
hypothesis were true.)
For the particular one-sided hypothesis test being considered here, the null hypothesis
(taken together with the assumption that the null and alternative are exhaustive) states
that the population mean is less than or equal to $\mu_0$ ($\mu \le \mu_0$). So, qualitatively
speaking, if the null hypothesis were true, it would be unlikely to observe large values of $\bar{X}$.
Quantitatively speaking, what would be the probability of the realized value of the test
statistic being as (or more) inconsistent with the null hypothesis as the value you
calculated from your data, if the null hypothesis were in fact true?
In the case of this one-sided hypothesis test about $\mu$, this question is: What would be
the probability of obtaining a sample with a mean $\bar{X}$ as large as (or larger than) the
value $\bar{x}$ calculated from our sample, if in fact the population mean is equal to $\mu_0$?
In symbols, this question is asking us to find
$$P(\bar{X} \ge \bar{x} \mid \mu = \mu_0).$$
What is the probability distribution of the test statistic?
Fortunately, we know a lot about the distribution of $\bar{X}$.
First, we know that $E(\bar{X}) = \mu$. (This is going to be useful even though we don't know
what $\mu$ is equal to.)
We also know that $\mathrm{Var}(\bar{X}) = \sigma^2 / n$ (and we are assuming we know $\sigma^2$).
And since we are assuming that n is large, we know by the CLT that $\bar{X}$ is approximately
normally distributed.
So we know that
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right).$$
We don't know what $\mu$ is really equal to (if we did, we wouldn't have to be testing a
hypothesis about it), but the probability we want to calculate is conditioned on the
assumption that $\mu = \mu_0$. So when we calculate this conditional probability, we can
assume that
$$\bar{X} \sim N\left(\mu_0, \frac{\sigma^2}{n}\right).$$
And now we know everything we need to know to
calculate the desired probability:
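$$P(\bar{X} \ge \bar{x} \mid \mu = \mu_0) = P\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \ge \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}\right) = P\left(Z \ge \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}\right)$$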
We know the values for $\bar{x}$ (we calculated it from our data), $\mu_0$ (it is stated in the null
hypothesis), $\sigma^2$, and n, so we can use the standard normal table to find this probability.
This probability is the p-value.
A slightly different looking but equivalent way of presenting how a p-value is calculated
in this example is as follows. Use the standardized value of the realized sample mean (its
z-score) as the test statistic. Call this test statistic $z^*$, where
$$z^* = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}.$$
Then use the standard normal distribution, $Z \sim N(0, 1)$, to calculate p-values as follows:
$$p\text{-value} = P(Z \ge z^*)$$
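Both presentations can be checked numerically and should agree. The sketch below is in Python, with `statistics.NormalDist` standing in for the standard normal table. The numbers used (mu_0 = 12, sigma = 3, n = 100, x_bar = 12.5) are hypothetical values chosen only for illustration, and the function name `z_test_upper_p_value` is likewise an assumption of this sketch.

```python
from math import sqrt
from statistics import NormalDist


def z_test_upper_p_value(x_bar: float, mu_0: float, sigma: float, n: int) -> float:
    """p-value = P(Z >= z*) for the one-sided test H0: mu = mu_0 vs HA: mu > mu_0,
    where sigma is the known population standard deviation."""
    z_star = (x_bar - mu_0) / (sigma / sqrt(n))
    return 1.0 - NormalDist().cdf(z_star)


# Hypothetical inputs, not taken from the handout.
mu_0, sigma, n, x_bar = 12.0, 3.0, 100, 12.5

# Route 1: standardize first, then use the standard normal distribution.
p_via_z = z_test_upper_p_value(x_bar, mu_0, sigma, n)

# Route 2: work directly with the null distribution of the sample mean,
# N(mu_0, sigma^2 / n), whose standard deviation is sigma / sqrt(n).
p_via_xbar = 1.0 - NormalDist(mu=mu_0, sigma=sigma / sqrt(n)).cdf(x_bar)

print(p_via_z, p_via_xbar)  # the two routes give the same probability
```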
What if we don't know the population variance? If we don't know $\sigma^2$ (as we usually
won't), we can use an alternative test statistic:
$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$
where s represents the sample standard deviation,
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}.$$
Something that I call a "generalization of the CLT" tells us that when n is large, and if
the null hypothesis is true, approximately
$$t \sim t_{n-1}.$$
In this example, it is large values of t that are
inconsistent with the null hypothesis, so we calculate
$$p\text{-value} = P(t_{n-1} \ge t^*)$$
where $t^*$ represents the realized value of t you calculated from your data.
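The t-based calculation can also be sketched in Python using only the standard library. Everything specific here is an assumption made for illustration: the sample values and mu_0 = 12 are hypothetical (and the sample is kept short for readability, even though the text assumes n is large), and Simpson's rule is used in place of a t table to integrate the density of the t distribution.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x: float, df: float) -> float:
    """Density of the t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))


def t_upper_tail(t_star: float, df: float, steps: int = 2000) -> float:
    """P(T >= t_star), using Simpson's rule on [0, |t_star|] and symmetry."""
    a = abs(t_star)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # approximates P(0 <= T <= |t_star|)
    return 0.5 - area if t_star >= 0 else 0.5 + area


def t_statistic(sample, mu_0: float) -> float:
    """t = (x_bar - mu_0) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    x_bar = sum(sample) / n
    s = sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
    return (x_bar - mu_0) / (s / sqrt(n))


# Hypothetical data, not taken from the handout.
sample = [12.9, 11.4, 13.6, 12.2, 14.1, 12.8, 11.9, 13.3, 12.5, 13.0]
t_star = t_statistic(sample, mu_0=12.0)
t_p_value = t_upper_tail(t_star, df=len(sample) - 1)
print(t_star, t_p_value)
```

With one degree of freedom the t distribution is the Cauchy distribution, for which P(T >= 1) is exactly .25; that gives a convenient check on the numerical integration.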
P-VALUE PROBLEMS
1) Wawa sells "two foot" hoagie rolls. Of course, because there is some variability in the
production process, not every roll is exactly 24 inches long. It is known that for
the entire population of Wawa "two foot" hoagie rolls, the standard deviation in their
lengths is 1.4 inches. A consumers' advocacy group has claimed that the mean length for
the entire population of these rolls is less than 24 inches. The advocacy group has taken
the Wawa Corporation to court to sue them for misrepresenting their product.
a) State the appropriate null and alternative hypotheses to be tested. (As usual, let $\mu$
represent the population mean length of Wawa "two foot" hoagie rolls.)
b) Suppose that in a random sample of 140 "two foot" hoagie rolls, the mean length is
23.75 inches. Find the p-value.
2) Suppose a random sample has been taken from a normally distributed population. You
know that the mean of the population is $\mu = 25$ and the variance in the population is
$\sigma^2 = 4$, but you do not know the sample size (call it n, which as usual represents the
number of observations in the sample).
You want to test the following hypotheses about the size of the sample:
$$H_0: n = 100$$
$$H_A: n < 100$$
Although you will not be told what the sample size was, you will be told what the
realized value of the sample mean was (as usual, call it $\bar{x}$).
a) Qualitatively, what values of $\bar{x}$ would be inconsistent with the null hypothesis? In
particular, which of the following would be true:
(i) The more that the realized sample mean exceeds the population mean, the
more inconsistent it is with the null hypothesis. [That is, the greater is $\bar{x} - 25$,
the more inconsistent the data is with the null hypothesis.]
(ii) The more that the realized sample mean falls below the population mean, the
more inconsistent it is with the null hypothesis. [That is, the smaller (more
negative) is $\bar{x} - 25$, the more inconsistent the data is with the null hypothesis.]
(iii) The more that the realized sample mean differs from the population mean,
the more inconsistent it is with the null hypothesis. [That is, the greater is $|\bar{x} - 25|$,
the more inconsistent the data is with the null hypothesis.]
Choose (i), (ii), or (iii) as your answer, and briefly explain the reasoning behind your
choice.
b) As usual, let $\bar{X}$ represent the mean of a random sample of size n from the population
described above (with $\mu = 25$ and $\sigma^2 = 4$). If the null hypothesis stated above is true,
then what is the probability distribution of $\bar{X}$?
c) Suppose you are told that in the sample that was taken, the realized value of the sample
mean was $\bar{x} = 25.48$. Find the p-value for the null and alternative hypotheses stated
above.
3) An office supply company hired a salesperson to work for one day. The contract
specified that the salesperson should visit the headquarters of 15 large corporations to try
to sell the company's office products. At each visit to a corporate headquarters, the
probability that the salesperson makes a sale is .4 (so the probability that she doesn't
make a sale is .6). Whether she makes a sale at any office is independent of whether she
made a sale at any other office.
a) Let X denote the number of sales she makes if she visits 15 offices. What is the
probability distribution of X? (Just give the name of the family of distributions that X
belongs to and indicate what the values of the parameters of the distribution are. You
do not need to write out each of the possible realizations of X with their probabilities.)
b) Suppose that at the end of the day, the salesperson has made only 3 sales. The owners
of the company are dismayed at how low this number of sales is and suspect that the
salesperson may have taken it easy and visited fewer than the 15 corporations that her
contract said she was supposed to visit.
The company would therefore like to sue the salesperson for breach of contract. But to
win the case, they must convince a judge that they have strong evidence to show that she
visited fewer than 15 corporations.
Think of this as a hypothesis testing problem, and state the null and alternative
hypotheses that the company would want to test. (You can state these hypotheses in
words or you can use symbols. If you use symbols, be sure you indicate what the symbols
you are using are meant to represent.)
c) (12 points) The only evidence the company has to present to the judge is that the
salesperson made only 3 sales during the day. (Nobody followed her around all day to
directly observe how many offices she actually visited.) Given this data, what is the
p-value for the hypotheses stated in part (b) above? (Assume that the judge, the company,
and the worker all know and agree that, as stated above, the probability of a sale on any
individual call is .4.)
4) To reduce employee theft, a company proposes to screen its workers with a lie detector
test. This test is not perfectly reliable: if a person is really innocent, the test indicates
"guilty" 10% of the time, and if the person is really guilty, the test indicates "innocent"
20% of the time. It is known that 5% of the workers actually are guilty.
Think of this as a hypothesis testing problem. Suppose the company wants to test the
following null and alternative hypotheses:
$H_0$: The worker is innocent
$H_A$: The worker is guilty
Suppose the worker takes the lie detector test, and the test result is "guilty." Find the
p-value. (Think of the test result "guilty" as the data you collected.)