Curran Everett1998
There are very few things which we know, which are not capable of being reduc’d to a Mathematical Reasoning, . . . and where a Mathematical Reasoning can be had, it’s as great folly to make use of any other, as to grope for a thing in the dark when you have a Candle standing by you.
John Arbuthnot (1692)

STATISTICS IS ONE KIND of mathematical reasoning. Its concepts and principles are ubiquitous in science: as researchers, we use them to design experiments, analyze data, report results, and interpret the published findings of others. Indeed, it is from this foundation of statistical concepts and principles that scientific knowledge is accumulated. If we fail to understand fully these fundamental statistical concepts and principles, if our statistical reasoning is faulty, then we are more likely to reach wrong scientific conclusions. Wrong conclusions based on faulty reasoning is shoddy science; it is also unethical (1, 21, 30).

Regrettably, faulty reasoning in statistics rears its head in the practice of science: for 60 years, statisticians have documented statistical errors in the scientific literature (3, 4, 17, 33, 50). In part, these errors exist because many introductory textbooks of statistics paradoxically hinder literacy in statistics: they emphasize methods rather than concepts, they contain glaring errors, or they perpetuate misconceptions (4, 11, 12).

In his editorial prelude to a series of statistical papers, Yates (51) wrote that the papers were designed to raise statistical consciousness and thereby reduce statistical errors in journals published by the American Physiological Society. Rather than reinforce concepts, these papers reviewed methods: analysis of variance (20), linear regression (37, 46), mathematical modeling (22, 29, 40), risk assessment (36), and statistical packages (34). The proper use of any statistical technique, however, requires an understanding of the fundamental statistical concepts behind the technique.

How well do physiologists understand fundamental concepts in statistics? One way to answer this question is to examine the empirical incidence of basic statistical quantities such as standard deviations, standard errors, and confidence intervals. These quantities characterize different statistical features: standard deviations characterize variability in the population, whereas standard errors and confidence intervals characterize uncertainty about the estimated values of population parameters, e.g., means. Of the original articles published in 1996 by the American Physiological Society, the overwhelming majority (69–93%, range) report standard errors, apparently not as estimates of uncer-
http://www.jap.org 8750-7587/98 $5.00 Copyright © 1998 the American Physiological Society
776 INVITED REVIEW
Table 1. Manuscripts for the American Physiological Society’s journals in 1996: use of statistics and statisticians

                                                      % Research Manuscripts That Report
                                               Standard   Standard  Confidence  Precise
Journal                                   n    deviation  error     interval    P value*  Statistician†
Am. J. Physiol.
  Cell Physiol.                           43      21         88         0          7           0
  Endocrinol. Metab.                      28      18         86         0          4           4
  Gastrointest. Liver Physiol.            26       8         92         0          4          12
  Heart Circ. Physiol.                    60      17         87         0         10           3
  Lung Cell. Mol. Physiol.                25      20         84         0          4           4
  Regulatory Integrative Comp. Physiol.   41      17         88         0         15          12
  Renal Fluid Electrolyte Physiol.        27      15         93         0          7           4
J. Appl. Physiol.                         62      24         79         0          6          10
J. Neurophysiol.                          58      36         69         2          5           7

n, No. of research manuscripts reviewed. In 1996, these journals published a total of 3,693 original articles. No. of articles reviewed represents a 10% sample (selected by systematic random sampling, fixed start) of articles published by each journal.
* Precise P value: for example, P = 0.02 (rather than P < 0.05) or P = 0.13 (rather than P ≥ 0.05 or P = not significant).
† We assessed collaboration with a statistician using author affiliation and acknowledgments. We recognize that a statistician may be affiliated with another department, e.g., medicine. Using our criterion, however, few articles (0–12%, range) report formal collaboration of a physiologist with a statistician, a partnership that typically reaps great rewards.
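The quantities tallied in Table 1 answer different questions, and a small numerical sketch may make the distinction concrete. The Python below is an illustration only, not part of the review: the ten data values (changes in systolic blood pressure, in mmHg) are invented, the helper name `summarize` is hypothetical, and the critical value 2.262 (the 97.5th percentile of Student's t with 9 degrees of freedom) is hardcoded on the assumption that the observations are a random sample from a normal population.

```python
import math
import statistics


def summarize(sample, t_crit):
    """Return (mean, sd, se, ci) for a sample.

    sd describes the spread of individual observations;
    se and the confidence interval describe uncertainty
    about the estimated population mean.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)        # sample SD, n - 1 denominator
    se = sd / math.sqrt(n)               # standard error of the mean
    ci = (mean - t_crit * se, mean + t_crit * se)
    return mean, sd, se, ci


# Hypothetical data: change in systolic blood pressure (mmHg), n = 10.
sample = [-12.0, -8.0, -15.0, -3.0, -10.0, -7.0, -11.0, -9.0, -14.0, -6.0]

# Assumed critical value: t(0.975, df = 9), hardcoded for n = 10.
T_CRIT_DF9 = 2.262

mean, sd, se, ci = summarize(sample, T_CRIT_DF9)
print(f"mean = {mean:.2f} mmHg")
print(f"SD   = {sd:.2f} mmHg (variability among individuals)")
print(f"SE   = {se:.2f} mmHg (uncertainty about the mean)")
print(f"95% CI for the mean: {ci[0]:.2f} to {ci[1]:.2f} mmHg")
```

With more observations from the same population, the standard deviation would be expected to stay roughly the same, whereas the standard error, which divides by the square root of n, would shrink. That is why a standard error is a poor stand-in for an estimate of variability.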
hypothesis testing and estimation. Most researchers
statistical procedures, including the analysis of variance.
ranted focus on hypothesis testing has blurred the distinction between statistical significance and scientific importance (3, 13, 19). Most investigators appear to reach scientific conclusions that are based not on their knowledge of science but solely on the probabilities of test statistics (16); this is an untenable approach to scientific discovery.

The limited utility of hypothesis testing can be demonstrated with an example. Suppose a clinician wants to assess the impact of a placebo and the β-blockers bisoprolol and metoprolol on heart rate variability in patients with left heart failure. Suppose also that the clinician constructs the null and alternative hypotheses, H0 and H1, as

H0: treatments have identical effects on heart rate variability
H1: treatments have different effects

son procedures is beyond the scope of this review; Refs. 2, 9, 42, and 48 summarize these issues.

For the rest of this review, we focus our attention on several aspects of estimation.

USING SAMPLES TO LEARN ABOUT POPULATIONS

As researchers, we use samples to make inferences about populations. A sample interests us not because of its own merits but because it helps us estimate selected characteristics of the underlying population: for example, the sample mean ȳ estimates the population mean µ.⁵

As an illustration, suppose the random variable Y represents the change in systolic blood pressure after some intervention. Suppose also that the distribution of Y conforms to a normal distribution. A normal distribution is specified completely by two parameters: the mean and variance. The population mean µ conveys the location of the center of the distribution; the population
investigator can be quite certain of a trivial experimental effect.

Whatever the statistical result of a hypothesis test, assessment of the corresponding confidence interval

Table 3. Limitations of statistics: raw data and regression statistics
Drug A    Drug B

that introductory courses in statistics are relevant and sound (7, 44, 50).

In this review, we have reiterated the primary role of statistics within science to be one of estimation: estimation of a population parameter or estimation of the uncertainty about the value of that parameter. Moreover, we have demonstrated the essential distinction between statistical significance and scientific impor-

Consider the linear function L

$$L = k_1 X_1 + k_2 X_2 + \cdots + k_m X_m$$

For $i = 1, 2, \ldots, m$, each $k_i$ is a real constant, and each $X_i \sim N(\mu_i, \sigma_i^2)$. The mean of L, Ave{L}, is

$$\mathrm{Ave}\{L\} = k_1 \mu_1 + k_2 \mu_2 + \cdots + k_m \mu_m = \sum_{i=1}^{m} k_i \mu_i$$
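The expression for the mean of L can be checked numerically. The Python below is a minimal sketch, not part of the original appendix: the constants, means, and standard deviations are invented for illustration, both function names are hypothetical, and the simulation draws each X_i independently as a convenience. Independence is an assumption of the sketch only; the formula for the mean does not depend on it.

```python
import random


def mean_of_linear_combination(k, mu):
    """Ave{L} = sum of k_i * mu_i for L = k_1 X_1 + ... + k_m X_m."""
    if len(k) != len(mu):
        raise ValueError("k and mu must have the same length")
    return sum(ki * mui for ki, mui in zip(k, mu))


def simulate_mean(k, mu, sigma, draws=50_000, seed=0):
    """Estimate Ave{L} by averaging simulated values of L.

    Each X_i is drawn independently from N(mu_i, sigma_i^2);
    the seed is fixed so that the result is reproducible.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        total += sum(
            ki * rng.gauss(mui, si) for ki, mui, si in zip(k, mu, sigma)
        )
    return total / draws


# Hypothetical constants and parameters (m = 3).
k = [0.5, 0.5, -1.0]        # real constants k_i
mu = [10.0, 14.0, 9.0]      # population means mu_i
sigma = [2.0, 3.0, 1.0]     # population standard deviations sigma_i

theoretical = mean_of_linear_combination(k, mu)
simulated = simulate_mean(k, mu, sigma)
print(f"Ave{{L}} from the formula: {theoretical:.3f}")
print(f"Ave{{L}} from simulation:  {simulated:.3f}")
```

Under these assumed values the formula gives 0.5(10) + 0.5(14) - 1(9) = 3, and the simulated average should fall close to that value, differing only by sampling error.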
17. Colditz, G. A., and J. D. Emerson. The statistical content of published medical research: some implications for biomedical education. Med. Educ. 19: 248–255, 1985.
18. Colton, T. Statistics in Medicine. Boston, MA: Little, Brown, 1974.
19. Cox, D. R. Statistical significance tests. Br. J. Clin. Pharmacol. 14: 325–331, 1982.
20. Denenberg, V. H. Some statistical and experimental considerations in the use of the analysis-of-variance procedure. Am. J. Physiol. 246 (Regulatory Integrative Comp. Physiol. 15): R403–R408, 1984.
21. Denham, M. J., A. Foster, and D. A. J. Tyrrell. Work of a district ethical committee. Br. Med. J. 2: 1042–1045, 1979.
22. DiStefano, J. J., III, and E. M. Landaw. Multiexponential, multicompartmental, and noncompartmental modeling. I. Methodological limitations and physiological interpretations. Am. J. Physiol. 246 (Regulatory Integrative Comp. Physiol. 15): R651–R664, 1984.
23. Draper, N. R., and H. Smith. Applied Regression Analysis (2nd ed.). New York: Wiley, 1981.
24. Evans, S. J. W., P. Mills, and J. Dawson. The end of the p value? Br. Heart J. 60: 177–180, 1988.
25. Fisher, R. A. Statistical Methods and Scientific Inference (3rd
35. Hogg, R. V., and A. T. Craig. Introduction to Mathematical Statistics (4th ed.). New York: Macmillan, 1978.
36. Iberall, A. S. The problem of low-dose radiation toxicity. Am. J. Physiol. 244 (Regulatory Integrative Comp. Physiol. 13): R7–R13, 1983.
37. Jackson, T. E. Comparison of a class of regression equations. Am. J. Physiol. 246 (Regulatory Integrative Comp. Physiol. 15): R271–R276, 1984.
38. Kruskal, W. H. Tests of significance. In: International Encyclopedia of the Social Sciences, edited by D. L. Sills. New York: Macmillan & The Free Press, 1968, vol. 14, p. 238–250.
39. Land, T. A., and M. Secic. How to Report Statistics in Medicine. Philadelphia, PA: Am. College Physicians, 1997.
40. Landaw, E. M., and J. J. DiStefano III. Multiexponential, multicompartmental, and noncompartmental modeling. II. Data analysis and statistical considerations. Am. J. Physiol. 246 (Regulatory Integrative Comp. Physiol. 15): R665–R677, 1984.
41. Montgomery, D. C., and G. C. Runger. Applied Statistics and Probability for Engineers. New York: Wiley, 1994, p. 361–363.
42. Moses, L. E. Think and Explain with Statistics. Reading, MA: Addison-Wesley, 1986.
43. Mosteller, F., and J. W. Tukey. Data Analysis and Regression. Reading, MA: Addison-Wesley, 1977.