How To Understand P Values


4 of 9 | JNCI J Natl Cancer Inst, 2016, Vol. 108, No.

Phase 2 trial, by combining the chemotherapy with an experimental enzyme, PEGPH20.

Pancreatic and other cancers amass high levels of hyaluronic acid (HA), a molecule that absorbs water and creates a microenvironment that nurtures the growth of tumor cells and inhibits the effectiveness of many anti-cancer agents. PEGPH20 is a version of the naturally occurring hyaluronidase enzyme that breaks down HA, and which has been modified to be more stable in the body. Breaking through the tumor microenvironment’s HA component lowers the pressure inside the tumors and opens constricted vessels, thus increasing blood flow and access to the tumor.

Hingorani said to date, they have nearly doubled the amount of time certain patients with metastatic pancreatic cancer have before their disease progresses. He cautions that the trial must be completed before any definitive conclusions can be drawn.

The Fred Hutchinson Cancer Research team is not the only one working on T cell immunology vaccines for pancreatic cancer. There are other T cell vaccine approaches – administered alone or combined with other therapies – currently under investigation and enrolling patients with pancreatic cancer in clinical trials. For example, Carl H. June, M.D., professor of pathology and laboratory medicine at the Perelman School of Medicine and director of translational research at the Abramson Cancer Center at the University of Pennsylvania in Philadelphia, is investigating chimeric antigen receptor T cells, which have shown success with another version of the therapy given to patients with advanced leukemia.

Jaffee said all told, there are seven approaches to immunotherapy for pancreatic cancer that have shown promise in early clinical trials. These treatments can be broken into these main categories: checkpoint inhibitors/immune modulators, therapeutic vaccines, adoptive cell transfer, monoclonal antibodies, oncolytic viruses, adjuvant immunotherapies, and cytokines.

As for the Fred Hutchinson Cancer Research Center approach, Hingorani, who lost his father to pancreatic cancer, said the team plans to have the human version of the T cell therapy in clinical trials by the end of the year. He is cautiously optimistic. “We have vetted our strategies as deeply as possible, and of course, the goal is to see it work in a patient; that’s the great proving ground.”

© Oxford University Press 2016. DOI:10.1093/jnci/djw193

“P < 0.05” Might Not Mean What You Think: American Statistical
Association Clarifies P Values
By Beatrice Grabowski

In 2011, the U.S. Supreme Court unanimously ruled in Matrixx Initiatives Inc. v. Siracusano that investors could sue a drug company for failing to report adverse drug effects, even though they were not statistically significant. Describing the case in the April 2, 2011, issue of the Wall Street Journal, Carl Bialik wrote, “A group of mathematicians has been trying for years to have a core statistical concept debunked. Now the Supreme Court might have done it for them.” That conclusion may have been overly optimistic, since misguided use of the P value continued unabated. However, in 2014 concerns about misinterpretation and misuse of P values led the American Statistical Association (ASA) Board to convene a panel of statisticians and experts from a variety of disciplines to draft a policy statement on the use of P values and hypothesis testing. After a year of discussion, ASA published a consensus statement in American Statistician (doi:10.1080/00031305.2016.1154108).

The statement consists of six principles in nontechnical language on the proper interpretation of P values, hypothesis testing, science and policy decision-making, and the necessity for full reporting and transparency of research studies. However, assembling a short, clear statement by such a diverse group took longer and was more contentious than expected. Participants wrote supplementary commentaries, available online with the published statement.

The panel discussed many misconceptions about P values. Test your knowledge: Which of the following is true?

• P > 0.05 is the probability that the null hypothesis is true.
• 1 minus the P value is the probability that the alternative hypothesis is true.
• A statistically significant test result (P ≤ 0.05) means that the test hypothesis is false or should be rejected.
• A P value greater than 0.05 means that no effect was observed.

If you answered “none of the above,” you may understand this slippery concept better than many researchers. The ASA panel defined the P value as “the probability under a specified statistical model that a statistical summary of the data (for example, the sample mean difference between two compared groups) would be equal to or more extreme than its observed value.”

Why is the exact definition so important? Many authors use statistical software that presumably is based on the correct definition. “It’s very easy for researchers to get papers published and survive based on knowledge of what statistical packages are out there but not necessarily how to avoid the problems that statistical packages can create for you if you don’t understand their appropriate use,” said Barnett S. Kramer, M.D., M.P.H., JNCI’s former editor in chief and now director of the National Cancer Institute’s Division of Cancer Prevention. (Kramer was not on the ASA panel.)

Part of the problem lies in how people interpret P values. According to the ASA statement, “A conclusion does not immediately become ‘true’ on one side of the divide and ‘false’ on the other.” Valuable information may be lost because researchers may not pursue “insignificant” results. Conversely, small effects with “significant” P values may be biologically or clinically unimportant. At best, such practices may slow scientific progress and waste resources. At worst, they may cause grievous harm when adverse effects go unreported. The
Supreme Court case involved the drug Zicam, which caused permanent hearing loss in some users. Another drug, rofecoxib (Vioxx), was taken off the market because of adverse cardiovascular effects. The drug companies involved did not report those adverse effects because of lack of statistical significance in the original drug tests (Rev. Soc. Econ. 2016;74:83–97; doi:10.1080/00346764.2016.1150730).

ASA panelists encouraged using alternative methods “that emphasize estimation over testing, such as confidence, credibility, or prediction intervals; Bayesian methods; alternative measures of evidence, such as likelihood ratios or Bayes Factors; and other approaches such as decision-theoretic modeling and false discovery rates.” However, any method can be used invalidly. “If success is defined based on passing some magic threshold, biases may continue to exert their influence regardless of whether the threshold is defined by a P value, Bayes factor, false-discovery rate, or anything else,” wrote panelist John Ioannidis, Ph.D., professor of medicine and of health research and policy at Stanford University School of Medicine in Stanford, Calif.

Some panelists argued that the P value per se is not the problem and that it has its proper uses. A P value can sometimes be “more informative than an interval,” such as when “the predictor of interest is a multicategorical variable,” said Clarice Weinberg, Ph.D., who was not on the panel. “While it is true that P values are imperfect measures of the extent of evidence against the null hypothesis, confidence intervals have a host of problems of their own,” said Weinberg, deputy chief of the Biostatistics and Computational Biology Branch and principal investigator of the National Institute of Environmental Health Sciences in Research Triangle Park, N.C.

Beyond simple misinterpretation of the P value and the associated loss of information, authors consciously or unconsciously but routinely engage in data dredging (aka fishing, P-hacking) and selective reporting. “Any statistical technique can be misused and it can be manipulated especially after you see the data generated from the study,” Kramer said. “You can fish through a sea of data and find one positive finding and then convince yourself that even before you started your study that would have been the key hypothesis and it has a lot of plausibility to the investigator.”

In response to those practices and concerns about replicability in science, some journals have banned the P value and inferential statistics. Others, such as JNCI, require confidence intervals and effect sizes, which “convey what a P value does not: the magnitude and relative importance of an effect,” wrote panel member Regina Nuzzo, Ph.D., professor of mathematics and computer sciences at Gallaudet University in Washington, D.C. (Nature 2014;506:150–2).

How can practice improve? Panel members emphasized the need for full reporting and transparency by authors as well as changes in statistics education. In his commentary, Don Berry, Ph.D., professor of biostatistics at the University of Texas M.D. Anderson Cancer Center in Houston, urged researchers to report every aspect of the study. “The specifics of data collection and curation and even your intentions and motivation are critical for inference. What have you not told the statistician? Have you deleted some data points or experimental units, possibly because they seemed to be outliers?” he wrote.

Kramer advised researchers to “consult a statistician when writing a grant application rather than after the study is finished; limit the number of hypotheses to be tested to a realistic number that doesn’t increase the false discovery rate; be conservative in interpreting the data; don’t consider P = 0.05 as a magic number; and whenever possible, provide confidence intervals.” He also suggested, “Webinars and symposia on this issue will be useful to clinical scientists and bench researchers because they’re often not trained in these principles.” As the ASA statement concludes, “No single index should substitute for scientific reasoning.”

© Oxford University Press 2016. DOI:10.1093/jnci/djw194
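The ASA definition of the P value quoted in the article above can be made concrete with a small simulation. What follows is a minimal sketch, not taken from the article or the ASA statement. It assumes a permutation test as the "specified statistical model" (under the null, group labels are exchangeable), uses the difference in sample means as the statistical summary, and runs on made-up measurements. The function name `permutation_p_value` and its parameters are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided permutation P value for a difference in means.

    Assumed model (the null hypothesis): the group labels are exchangeable,
    so any relabeling of the pooled data was equally likely.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabeling of the pooled measurements
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:  # "equal to or more extreme than" the observed value
            as_extreme += 1

    # Add-one correction: the observed labeling counts as one relabeling,
    # so the estimate can never be exactly zero.
    return (as_extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Made-up illustrative measurements, not data from any study.
    treated = [12, 15, 14, 10, 13, 17]
    control = [11, 9, 12, 10, 8, 13]
    print(permutation_p_value(treated, control))
```

The returned number is the share of relabelings whose mean difference is at least as extreme as the one observed. It is not the probability that the null hypothesis is true, which is the first misconception in the article's quiz.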

Physical Activity Associated With Fewer Cancers


By Susan Jenks

Even moderate leisure-time physical activity may protect against 13 cancers, according to a massive observational study that appeared May 16 in JAMA Internal Medicine (doi:10.1001/jamainternmed.2016.1548).

But which type of exercise brings the most benefit is not yet clear, researchers say, nor is exercise alone likely to account for its association with a lower cancer risk in colon, breast, and endometrial cancers, among others.

“Physical activity is not a stand-alone, magic bullet,” said William McCarthy, Ph.D., adjunct professor in the department of health services in the Fielding School of Public Health at the University of California, Los Angeles. “The biggest bang [in risk reduction] comes when exercise is coupled with a Mediterranean-style diet and not smoking.”

Still, McCarthy said, the recent joint study by researchers at the National
