P Value
Pr(data|H0) ≠ Pr(H0|data)
Pr(clouds|rain) ≠ Pr(rain|clouds)
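The clouds-and-rain inequality can be checked with Bayes' rule. The sketch below is only an illustration: the three probabilities are made-up numbers, and the helper name `posterior` is hypothetical.

```python
def posterior(p_b_given_a, p_a, p_b_given_not_a):
    """Bayes' rule: Pr(A|B) from Pr(B|A), Pr(A), and Pr(B|not A)."""
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a * p_a / p_b

# Hypothetical numbers, chosen only for illustration
p_rain = 0.10               # it rains on 10% of days
p_clouds_given_rain = 0.99  # rainy days are almost always cloudy
p_clouds_given_dry = 0.40   # many dry days are cloudy too

p_rain_given_clouds = posterior(p_clouds_given_rain, p_rain, p_clouds_given_dry)
print(f"Pr(clouds|rain) = {p_clouds_given_rain:.2f}")
print(f"Pr(rain|clouds) = {p_rain_given_clouds:.2f}")
```

With these assumed inputs the two conditional probabilities are far apart, which is the same reason a small Pr(data|H0) does not by itself make Pr(H0|data) small.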
Doing a hypothesis test means making a
decision.
            Reject H0       Retain H0
H0 true     Type I error    OK
H0 false    OK              Type II error
Examples (?)
George stands trial for a crime (e.g., burglary).
What is H0, Ha? Type I error? Type II error?
H0: George is innocent
Ha: He is guilty
Type I error: innocent man convicted
Type II error: guilty man set free
Note: “not guilty” ≠ “innocent”
Note: A grand jury might have looked at several possible
defendants and only agreed to let the DA bring forward
George’s case. I.e., George was not chosen randomly to
stand trial. If we were to randomly choose defendants, then we
would make lots of Type I errors over the many trials.
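The note about randomly chosen defendants is a base-rate argument, and a little arithmetic makes it concrete. Every number below is a hypothetical assumption (the share of guilty people, the court's Type I error rate, and its power), not a figure from any real court system.

```python
# Hypothetical numbers, chosen only to illustrate the base-rate argument
n_trials = 10_000   # randomly chosen defendants
p_guilty = 0.001    # assumed share of random people who did it
alpha = 0.01        # assumed Pr(convict | innocent), the Type I error rate
power = 0.90        # assumed Pr(convict | guilty)

n_guilty = n_trials * p_guilty
n_innocent = n_trials - n_guilty

false_convictions = n_innocent * alpha   # Type I errors
true_convictions = n_guilty * power
share_wrong = false_convictions / (false_convictions + true_convictions)

print(f"false convictions: {false_convictions:.1f}")
print(f"true convictions:  {true_convictions:.1f}")
print(f"share of convictions that are wrong: {share_wrong:.2f}")
```

Under these assumptions most convictions are Type I errors even though the per-trial error rate is small, because nearly everyone who is tried is innocent.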
Susan goes to her doctor because she thinks she is ill.
What is H0, Ha? Type I error? Type II error?
H0: Susan is well
Ha: She is sick
Fred reads that aliens landed at Roswell, NM in 1947.
Should he believe this?
What is H0, Ha? Type I error? Type II error?
H0: No such thing happened
Ha: There is a conspiracy of silence
Expert witness work
Consider the question asked, then give one of
the six acceptable answers:
Yes
No
I don’t know
I don’t remember
Could you please repeat the question?
Green
Not “The car was a green Honda with a sunroof, NY
license plates, and the radio was blaring.”
Q: What color was the car? A: Green
Hypothesis test
No matter what question you wish the test
would answer, a hypothesis test only answers
one question.
Not “This model is probably true.”
Not “The effect of the drug is large.”
Not “People should care about the difference I have
found.”
Q: Are the data consistent with the model (such that
any deviation from the model could reasonably have
happened by chance)? A: Yes (or No)
See the Dance of the P-values
https://www.youtube.com/watch?v=ez4DgdurRPg
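A rough sketch of how p-values "dance" when the same experiment is repeated. This is not the video's own simulation: the sample size, the true effect, the known standard deviation of 1, and the function name `simulate_p_values` are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def simulate_p_values(n_experiments=25, n_per_group=32, true_effect=0.5, seed=1):
    """Repeat the same two-group experiment and collect two-sided z-test p-values."""
    rng = random.Random(seed)
    se = sqrt(2 / n_per_group)  # SE of the mean difference; SD assumed known and equal to 1
    p_values = []
    for _ in range(n_experiments):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        z = (mean(treated) - mean(control)) / se
        p_values.append(2 * (1 - NormalDist().cdf(abs(z))))
    return p_values

p_values = simulate_p_values()
print(" ".join(f"{p:.3f}" for p in p_values))
print(f"smallest p = {min(p_values):.4f}, largest p = {max(p_values):.3f}")
```

The true effect is identical in every replication, yet the p-values land on both sides of 0.05, so a single p-value says little about what the next replication will show.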
Consider testing whether an effect is zero.

                      mean   SE   H0?
Group 1                25    10   Reject
Group 2                10    10   Retain
Group 1 vs Group 2     15    14   Retain!
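The three decisions in the table can be reproduced with a short calculation. This sketch assumes a normal (z) approximation, a two-sided test at the 0.05 level, and independent groups, so that the standard error of the difference is sqrt(10^2 + 10^2), about 14. The helper name `two_sided_p` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(estimate, se):
    """Two-sided p-value for H0: effect = 0, using a normal approximation."""
    z = estimate / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

p_group1 = two_sided_p(25, 10)          # z = 2.5
p_group2 = two_sided_p(10, 10)          # z = 1.0

se_diff = sqrt(10**2 + 10**2)           # assumes the two groups are independent
p_diff = two_sided_p(25 - 10, se_diff)  # z is roughly 1.06

for label, p in [("Group 1", p_group1), ("Group 2", p_group2),
                 ("Group 1 vs Group 2", p_diff)]:
    decision = "Reject" if p < 0.05 else "Retain"
    print(f"{label:20s} p = {p:.3f}  {decision}")
```

One effect is "significant" and the other is not, yet the difference between them is itself not significant.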
Two (different) Ideas
Publication bias
One study looked at 10 years of papers
indexed in PubMed and identified 4970
observational studies of medical treatments.
82% of them had statistically significant results
at the 0.05 level.
Another study looked at 1046 research
articles in three clinical psychology journals.
86% of them used statistical tests; 94% of
these rejected H0 at the 0.05 level.
Ben Goldacre TED MED talk
2005 paper in PLoS Medicine
2013 paper, Statistics in Medicine
“Our experiment provides evidence that the majority of
observational studies would declare statistical
significance when no effect is present. Empirical
calibration was found to reduce spurious results to the
desired 5% level. Applying these adjustments to literature
suggests that at least 54% of findings with p < 0.05 are not
actually statistically significant and should be
reevaluated.”
Garden of Forking Paths
(“researcher degrees of freedom”)
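One simplified way to see why forking paths matter: if an analyst could have taken k different analysis paths, and each were an independent test of a true H0 at the 0.05 level, the chance that at least one comes out "significant" grows quickly with k. Independence is an assumption made here for simplicity (real analysis choices are usually correlated), and the function name is hypothetical.

```python
def prob_any_false_positive(k, alpha=0.05):
    """Pr(at least one p < alpha) across k independent tests when every H0 is true."""
    return 1 - (1 - alpha) ** k

for k in (1, 5, 10, 20):
    print(f"{k:2d} possible analyses -> "
          f"Pr(at least one p < 0.05) = {prob_any_false_positive(k):.2f}")
```

The forking-paths concern is broader than this sketch: the paths need not all be run, since choosing the analysis after seeing the data has a similar effect.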
Note: The Reproducibility Project has its critics.
See http://science.sciencemag.org/content/351/6277/1037.2
And a response:
https://hardsci.wordpress.com/2016/03/03/evaluating-a-new-critique-of-the-reproducibility-project/
NIH new (2015) stat guidelines
See http://www.nih.gov/about/reporting-preclinical-research.htm
for a statement of “principles with the aim of facilitating the
interpretation and repetition of experiments”