Main Menu - The Hyperlinks Below Take You To The Appropriate Worksheet

Basic Concepts
Normal Distribution & Standard Deviation
Skewed Distribution
Epidemic curve (how to create one)
Descriptive Statistics (mean, median, mode, 95% confidence interval for a mean, standard deviation, standard error, range)
Incidence Rates & Cumulative Incidence (IR & CI)

Epidemiology/Biostatistics Tools
Wayne W. LaMorte, MD, PhD, MPH
Copyright 2006

Statistical Tests
ANOVA (Analysis of Variance)
Chi Squared Test
Confidence Interval for a Proportion
Correlation & Linear Regression
T-test (Unpaired)
T-test (Paired)
Standardized Rates (Proportions) - Direct Standardization

Standardized Incidence Ratio
Fisher's Exact Test (You need to be online to use this.)
Binomial Probability Calculator

Poisson Probability Calculator
Normal Probability Calculator
Study Analysis
Case-Control
Cohort Studies
Stratified Case-Control
Stratified Cohort Analysis (Cumulative Incidence)
Stratified Cohort Analysis (Incidence Rates)
Screening Test Performance - Sensitivity/Specificity
Sample Size Calculations

Survival Curves
Random Assignment to Groups
Applications
Predicting BMI and Body Weight Change with Activity
Epidemic Curve - Method 1 - Sort & Count
Restaurants Visited in Past 30 Days
ID Date of Onset gender age occupation school Sp. Bakery The Ledge
1 4/28/2004 M 44 mechanic . 1 0
18 4/30/2004 M 13 student private 1 0
19 5/2/2004 F 35 teacher . 1 0
20 5/2/2004 M 27 cashier . 1 1
58 5/4/2004 F 11 student private 1 0
2 5/5/2004 F 9 student Sparta 1 1
22 5/5/2004 F 44 waitress . 1 1
55 5/5/2004 M 10 student Sparta 1 1
16 5/7/2004 M 19 laborer . 1 0
21 5/7/2004 F 22 waitress . 1 0
23 5/8/2004 F 27 pharmacy . 1 1
54 5/8/2004 M 37 teacher . 1 0
3 5/9/2004 F 21 student college 1 1
24 5/10/2004 F 12 student Sparta 0 0
49 5/10/2004 F 44 teacher . 1 0
50 5/10/2004 M 50 writer . 1 1
53 5/10/2004 F 12 student Sparta 0 1
4 5/11/2004 M 34 cashier . 1 0
51 5/11/2004 M 35 nurse . 1 1
17 5/12/2004 F 11 student Sparta 1 0
25 5/12/2004 M 36 pastor . 0 0
26 5/12/2004 F 22 waitress . 1 1
27 5/12/2004 M 11 student Sparta 0 0
28 5/12/2004 M 10 student Sparta 1 1
42 5/12/2004 F 22 cashier . 1 0
43 5/12/2004 M 33 carpenter . 0 1
48 5/13/2004 F 33 sales . 1 1
9 5/14/2004 F 28 grocer . 0 1
10 5/14/2004 F 19 sales . 1 0
29 5/14/2004 F 17 student Sparta 1 1
30 5/14/2004 F 12 student Sparta 1 0
31 5/14/2004 M 18 student Sparta 1 1
32 5/14/2004 M 18 cook . 1 0
41 5/14/2004 F 25 waitress . 1 1
47 5/14/2004 M 39 sales . 1 0
12 5/15/2004 M 15 student Sparta 1 1
13 5/15/2004 M 28 construction . 0 1
33 5/15/2004 F 10 student Sparta 1 1
39 5/15/2004 M 36 librarian . 1 0
40 5/15/2004 F 37 sales . 1 0
46 5/15/2004 F 14 student Sparta 0 0
14 5/16/2004 F 23 sales . 1 0
15 5/16/2004 M 22 sales . 1 1
38 5/16/2004 F 21 student college 1 0
35 5/20/2004 F 12 student private 1 0
Restaurants Visited in Past 30 Days: 1=yes; 0=no
Link to video showing creation of the epidemic curve by sorting and counting.
Jimbo's Chi Chis Chinese Garden Green Tree
1 0 1 0
0 0 0 1 4/27-28 1
1 1 0 0 4/29-30 1
1 0 1 0 5/1-2 2
1 0 0 1 5/3-4 1
1 0 0 0 5/5-6 3
1 0 1 0 5/7-8 7
1 1 0 0 5/9-10 5
0 0 0 0 5/11-12 14
1 0 1 0 5/13-14 10
1 0 0 0 5/15-16 11
1 1 0 1 5/17-18 1
1 0 1 0 5/19-20 2
1 1 0 0 5/21-22 0
1 1 1 0
1 0 0 0
1 1 0 0
1 0 0 0
0 0 0 0
1 1 0 0
1 0 0 0
1 0 0 0
1 0 1 1
0 0 0 0
0 1 0 1
1 0 0 0
1 0 1 1
1 0 1 1
1 1 0 0
1 0 0 0
1 0 0 1
1 1 1 0
0 0 0 1
1 0 0 0
1 0 0 0
1 0 1 0
1 0 1 0
1 0 1 0
0 0 0 1
1 0 0 0
1 0 1 1
1 1 1 0
1 0 1 0
1 0 0 0
1 0 0 0
1 0 0 0
0 0 1 0
1 0 0 1
0 0 0 0
1 0 1 0
1 1 1 0
1 0 1 1
1 0 0 0
1 0 0 0
1 0 0 0
0 0 0 0
1 1 0 1
0 0 0 0
[Chart: New Hepatitis Cases, number of cases by two-day interval of onset; vertical axis 0 to 16]
Epidemic Curve - Method 2 - Pivot Tables
Restaurants Visited in Past 30 Days
ID Date of Onset gender age occupation school Sp. Bakery The Ledge Jimbo's
1 4/28/2004 M 44 mechanic . 1 0 1
2 5/5/2004 F 9 student Sparta 1 1 1
3 5/9/2004 F 21 student college 1 1 1
4 5/11/2004 M 34 cashier . 1 0 1
5 5/12/2004 M 9 student Sparta 0 1 1
6 5/12/2004 F 6 student Sparta 1 1 0
7 5/12/2004 F 8 student Sparta 1 1 0
8 5/13/2004 M 12 student Sparta 0 1 1
9 5/14/2004 F 28 grocer . 0 1 1
10 5/14/2004 F 19 sales . 1 0 1
11 5/15/2004 M 8 student Sparta 1 1 1
12 5/15/2004 M 15 student Sparta 1 1 1
13 5/15/2004 M 28 construction . 0 1 0
14 5/16/2004 F 23 sales . 1 0 1
15 5/16/2004 M 22 sales . 1 1 1
16 5/7/2004 M 19 laborer . 1 0 0
17 5/12/2004 F 11 student Sparta 1 0 1
18 4/30/2004 M 13 student private 1 0 0
19 5/2/2004 F 35 teacher . 1 0 1
20 5/2/2004 M 27 cashier . 1 1 1
21 5/7/2004 F 22 waitress . 1 0 1
22 5/5/2004 F 44 waitress . 1 1 1
23 5/8/2004 F 27 pharmacy . 1 1 1
24 5/10/2004 F 12 student Sparta 0 0 1
25 5/12/2004 M 36 pastor . 0 0 1
26 5/12/2004 F 22 waitress . 1 1 1
27 5/12/2004 M 11 student Sparta 0 0 1
28 5/12/2004 M 10 student Sparta 1 1 1
29 5/14/2004 F 17 student Sparta 1 1 0
30 5/14/2004 F 12 student Sparta 1 0 1
31 5/14/2004 M 18 student Sparta 1 1 1
32 5/14/2004 M 18 cook . 1 0 1
33 5/15/2004 F 10 student Sparta 1 1 1
34 5/15/2004 M 5 student Sparta 0 0 0
35 5/20/2004 F 12 student private 1 0 0
36 5/19/2004 M 9 student Sparta 1 0 1
37 5/17/2004 F 9 student Sparta 1 1 0
38 5/16/2004 F 21 student college 1 0 1
39 5/15/2004 M 36 librarian . 1 0 1
40 5/15/2004 F 37 sales . 1 0 1
41 5/14/2004 F 25 waitress . 1 1 1
42 5/12/2004 F 22 cashier . 1 0 1
43 5/12/2004 M 33 carpenter . 0 1 1
44 5/12/2004 F 7 student Sparta 1 0 0
45 5/12/2004 M 8 student Sparta 1 1 1
46 5/15/2004 F 14 student Sparta 0 0 1
47 5/14/2004 M 39 sales . 1 0 1
48 5/13/2004 F 33 sales . 1 1 1
49 5/10/2004 F 44 teacher . 1 0 1
50 5/10/2004 M 50 writer . 1 1 0
51 5/11/2004 M 35 nurse . 1 1 1
52 5/7/2004 F 15 student Sparta 0 1 1
53 5/10/2004 F 12 student Sparta 0 1 1
54 5/8/2004 M 37 teacher . 1 0 1
55 5/5/2004 M 10 student Sparta 1 1 1
56 5/7/2004 F 8 student Sparta 1 1 1
57 5/7/2004 M 9 student Sparta 0 0 1
58 5/4/2004 F 11 student private 1 0 1
Restaurants Visited in Past 30 Days: 1=yes; 0=no
Link to video showing creation of the epidemic curve using pivot tables.
Chi Chis Chinese Garden Green Tree
0 1 0
0 0 0
0 0 0
0 0 0
0 1 1
0 0 0
1 0 1
0 0 0
0 1 0
0 1 0
0 0 0
0 0 0
0 1 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 1
1 0 0
0 1 0
0 1 0
0 1 0
1 0 0
1 0 0
0 1 1
0 1 1
1 0 0
0 0 0
0 0 1
0 0 0
0 1 1
1 1 0
0 0 1
0 0 0
0 0 0
1 0 1
0 0 0
0 0 0
0 1 0
1 1 0
0 1 0
0 0 1
1 1 0
0 0 1
0 0 0
0 1 1
0 0 0
0 1 0
0 0 0
0 0 0
0 0 0
0 0 0
1 0 0
1 1 0
1 0 0
1 0 1
0 1 0
0 0 1
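The grouping-and-counting step behind the epidemic curve can also be sketched outside the spreadsheet. The following is a minimal Python sketch, not the worksheet's own method: it takes the 58 onset dates from the Method 2 table (in ID order), assumes two-day intervals starting on 4/27/2004 as in the Method 1 count table, and tallies cases per interval. The names `ONSETS` and `two_day_counts` are illustrative.

```python
from collections import Counter
from datetime import date

# Onset dates (month/day, year 2004) for cases 1-58, in ID order,
# transcribed from the Method 2 table.
ONSETS = """
4/28 5/5 5/9 5/11 5/12 5/12 5/12 5/13 5/14 5/14 5/15 5/15 5/15 5/16 5/16
5/7 5/12 4/30 5/2 5/2 5/7 5/5 5/8 5/10 5/12 5/12 5/12 5/12 5/14 5/14
5/14 5/14 5/15 5/15 5/20 5/19 5/17 5/16 5/15 5/15 5/14 5/12 5/12 5/12 5/12
5/15 5/14 5/13 5/10 5/10 5/11 5/7 5/10 5/8 5/5 5/7 5/7 5/4
""".split()


def two_day_counts(onsets, start=date(2004, 4, 27), n_bins=13):
    """Count cases in consecutive two-day intervals beginning at `start`."""
    counts = Counter()
    for text in onsets:
        month, day = (int(part) for part in text.split("/"))
        counts[(date(2004, month, day) - start).days // 2] += 1
    return [counts[i] for i in range(n_bins)]


# One count per interval: 4/27-28, 4/29-30, 5/1-2, ..., 5/21-22.
print(two_day_counts(ONSETS))
```

Plotting these counts as a bar chart, with the intervals on the horizontal axis, gives the epidemic curve.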
Confidence Intervals for a Single Prevalence or Cumulative Incidence
Use these first four rows to compute the confidence limits for a single proportion (i.e., a prevalence or a cumulative incidence in one group). These use Wilson's approximation of the exact limits for a binomial distribution. These are not symmetrically distributed above and below the point estimate. They are more accurate than the approximations that assume large sample size. These formulas can be used when the sample size is small or the proportion is close to 0 or 1. (See the formula in Rothman K: Epidemiology: An Introduction, Oxford University Press, 2002, page 132).

Wilson's Approximation
Columns: Numerator | "N" Denominator | Estimated proportion | 90% CI Lower Limit | 90% CI Upper Limit | 95% CI Lower Limit | 95% CI Upper Limit | 99% CI Lower Limit | 99% CI Upper Limit
4 8 0.5 0.24863 0.75137 0.21521 0.78479 0.16369 0.83660
10 20 0.5 0.32739 0.67261 0.29929 0.70071 0.25063 0.74949
70 100 0.7 0.62016 0.76930 0.60415 0.78105 0.57263 0.80251
7 10000 0.0007 0.00038 0.00129 0.00034 0.00144 0.00027 0.00179

Confidence Intervals for a Single Incidence Rate
The lower section uses Byar's approximation of the exact limits as described by Rothman, K: Epidemiology: An Introduction, Oxford University Press, 2002, page 134. As with Wilson's approximation above, these can be used for small or large samples, and the limits are not necessarily symmetrically distributed above and below the point estimate.

(Byar's Approximation)
Columns: Numerator | Person-Time Denominator | Estimated Rate | 90% CI Lower Limit | 90% CI Upper Limit | 95% CI Lower Limit | 95% CI Upper Limit | 99% CI Lower Limit | 99% CI Upper Limit
3 2500 0.0012 0.00043 0.00281 0.00033 0.00320 0.00019 0.00407
10 2500 0.004 0.00232 0.00653 0.00205 0.00710 0.00160 0.00829
20 2500 0.008 0.00546 0.01139 0.00504 0.01211 0.00428 0.01362
50 2500 0.02 0.01576 0.02509 0.01502 0.02614 0.01363 0.02827
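As an illustration of Wilson's approximation for a single proportion, here is a minimal Python sketch of the standard Wilson score interval. It is not taken from the worksheet's cell formulas; the function name `wilson_interval` and the default z = 1.96 (95% confidence) are assumptions. With 4/8 and 70/100 it reproduces the 95% limits tabulated in this section to the displayed precision.

```python
import math


def wilson_interval(numerator, denominator, z=1.96):
    """Wilson score confidence limits for a proportion (z = 1.96 gives 95%)."""
    p = numerator / denominator
    z2 = z * z
    scale = 1 + z2 / denominator
    centre = (p + z2 / (2 * denominator)) / scale
    half_width = (
        z * math.sqrt(p * (1 - p) / denominator + z2 / (4 * denominator ** 2)) / scale
    )
    return centre - half_width, centre + half_width


lower, upper = wilson_interval(4, 8)
print(round(lower, 5), round(upper, 5))
```

Note that the interval is centred on a value pulled toward 0.5 rather than on the point estimate, which is why the limits are not symmetric about the estimated proportion.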
Chi Squared Test
Example Observed Data Expected Under H0
+ Outcome -Outcome + Outcome -Outcome
Exposed 7 124 131 Exposed 4.99 126.01 131
Non-exposed 1 78 79 Non-exposed 3.01 75.99 79
8 202 210 8 202 210
p-value= 0.13481471 8/210= 0.03809524

The Chi Squared statistic is calculated from the difference between observed and expected values for each cell. The
difference is squared and then divided by the expected value for the cell. This calculation is repeated for each cell and
the results are summed. Note that the chi squared test is a "large sample test"; it should not be used if the number of
expected observations in any of the cells is <5, because it gives falsely low p-values. In this case, Fisher's Exact Test
should be used.
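The calculation described above can be sketched in Python for a 2x2 table. This is a minimal sketch, not the worksheet's formulas: the function name `chi_squared_2x2` is illustrative, and the p-value uses the closed form for one degree of freedom, `erfc(sqrt(chi_sq / 2))`. With the example data (7, 124, 1, 78) it gives a p-value of about 0.1348, in line with the worksheet.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """a, b = exposed with/without outcome; c, d = non-exposed with/without outcome."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d
    chi_sq = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under H0: row total x column total / grand total.
            expected = row_totals[i] * col_totals[j] / n
            chi_sq += (observed[i][j] - expected) ** 2 / expected
    # Upper-tail probability of a chi squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_sq / 2))
    return chi_sq, p_value


chi_sq, p_value = chi_squared_2x2(7, 124, 1, 78)
print(round(chi_sq, 3), round(p_value, 4))
```

Note that one expected count in this example (3.01) is below 5, which is exactly the situation in which the text recommends Fisher's Exact Test instead.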
Enter data into the blue cells to calculate a p-value with the chi squared test.
Observed Data Expected Under H0
+ Outcome -Outcome + Outcome -Outcome
Exposed 0 Exposed #DIV/0! #DIV/0! ###
Non-exposed 0 Non-exposed #DIV/0! #DIV/0! ###
0 0 0 #DIV/0! #DIV/0! ###
Chi Sq= #DIV/0!
p-value= #DIV/0! #DIV/0!
The chi squared test can also be applied to situations with multiple groups and outcomes.
For example, the number of runners who finished a marathon in less than 4 hours among those who trained not at all, a little, moderately, or a lot.
The Excel function CHITEST will calculate the p-value automatically, if you specify the range of actual (observed) frequencies and the range of expected observations. For example,
Observed Data Expected Under H0
Finished Didn't finish Finished Didn't finish
Not at all 2 5 7 3.29 3.71 7
A little 8 30 38 17.86 20.14 38
Moderately 20 15 35 16.45 18.55 35
A lot 25 12 37 17.39 19.61 37
55 62 117 55 62 117
p-value= 0.000280
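For a table with more than two groups, the same observed-versus-expected calculation applies, with (rows - 1) x (columns - 1) degrees of freedom. The following Python sketch is an illustration rather than a substitute for CHITEST: `chi_squared_table` and `p_value_df3` are hypothetical names, and the p-value helper only covers 3 degrees of freedom, using the closed-form upper tail for that case.

```python
import math


def chi_squared_table(observed):
    """Return (chi squared statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi_sq = 0.0
    for row_total, row in zip(row_totals, observed):
        for col_total, obs in zip(col_totals, row):
            expected = row_total * col_total / n
            chi_sq += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_sq, df


def p_value_df3(chi_sq):
    """Upper-tail probability for chi squared with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(chi_sq / 2)) + math.sqrt(
        2 * chi_sq / math.pi
    ) * math.exp(-chi_sq / 2)


# Marathon example: rows = training level, columns = finished / didn't finish.
marathon = [[2, 5], [8, 30], [20, 15], [25, 12]]
chi_sq, df = chi_squared_table(marathon)
print(round(chi_sq, 2), df, round(p_value_df3(chi_sq), 6))
```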
The Chi Squared Test is based on the difference between the frequency distribution that was observed and the frequency distribution that would have been expected under the null hypothesis. In the example above, only 8 of 210 subjects had the outcome of interest (3.8095%). Under the null hypothesis, we would expect 3.8095% of the exposed group to have the outcome, and we would expect 3.8095% of the non-exposed group to have the outcome as well. The 2x2 table to the right calculates the frequencies expected under the null hypothesis.
$\chi^2 = \sum \frac{(O - E)^2}{E}$
Fisher's Exact Test
Case-Control Studies
Enter data into the turquoise cells.
Observed Data
Cases Controls
Exposed 41 28 69
Non-exposed 19 32 51
60 60 120
0.95 Select Confidence Level

Confidence Limits
Lower Limit Upper Limit
Odds Ratio= 2.47 1.17 5.19
Chi Sq= 5.763
p-value= 0.016367
Online Fisher's Exact Test
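The odds ratio and its confidence limits shown above can be reproduced with a short Python sketch. This assumes the usual log-based (Woolf) interval, exp(ln OR ± z x SE), with SE = sqrt(1/a + 1/b + 1/c + 1/d); that assumption matches the se(lnOR) values listed elsewhere on this worksheet, but the function name `odds_ratio_ci` is illustrative.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """a, b = exposed cases, controls; c, d = non-exposed cases, controls."""
    odds_ratio = (a * d) / (b * c)
    se_ln_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_ln_or)
    upper = math.exp(math.log(odds_ratio) + z * se_ln_or)
    return odds_ratio, lower, upper


odds_ratio, lower, upper = odds_ratio_ci(41, 28, 19, 32)
print(round(odds_ratio, 2), round(lower, 2), round(upper, 2))
```

Because the interval is built on the log scale, the limits are not symmetric around the odds ratio itself.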
Stratified Analysis
The area below provides an illustration of a stratified analysis that is limited to two substrata.
For stratified analysis of case-control data for up to 12 substrata use the worksheet entitled: Strat. Case-Control (Rothman)
Stratified Analysis (for 2 Substrata) Crude
Cases Controls
688 650 1338
21 59 80
709 709 1418
Stratum 1
Cases Controls
Exposed 647 622 1269
Non-exposed 2 27 29
649 649 1298
Odds Ratio= 14.04

Chi Sq= 22.044
p-value= 0.000003
Conf. Level= 0.95
Upper CI= 59.30
Lower CI= 3.33
lnOR 2.64 2.6421
The weights in this row are used for the Chi Square test of HOMOGENEITY. See Aschengrau and Seage, p.347-348. (2nd edition)
se(lnOR) 0.734976428
(weight) w 1.851199307
ad/T= 13.45839753 G1
bc/T= 0.958397535 H1
(a+d)/T 0.519260401 P1
(b+c)/T 0.480739599 Q1
Var(lnOR) 0.540190349
ChiSq_het 2.37518211
Expected Under H0
Cases Controls
Exposed 634.50 634.50 1269.00
Non-exposed 14.50 14.50 29.00
649 649 1298
O-E 634.5
O 647
n0n1m0m1/(n2(n-1)= 7.093484965 n0n1m0m1/(n2(n-1)=
Y-Values X-Values
From Ken Rothm p-value Curve 1 Null Bar
0.0037312 0.0037312 0.8019729
z value 0.0051099 0.0051099 0.833647666
2.9 0.0069335 0.0069335 0.86657346
2.8 0.009322 0.009322 0.900799692
2.7 0.0124189 0.0124189 0.936377726
2.6 0.0163947 0.0163947 0.973360952
2.5 0.0214478 0.0214478 1.011804869
2.4 0.0278065 0.0278065 1.051767169
2.3 0.0357284 0.0357284 1.093307823
2.2 0.0454999 0.0454999 1.136489168
2.1 0.0574327 0.0574327 1.181376007
2 0.0718603 0.0718603 1.228035698
1.9 0.0891306 0.0891306 1.276538263
1.8 0.1095982 0.1095982 1.326956488
1.7 0.133614 0.133614 1.379366033
1.6 0.161513 0.161513 1.43384555
1.5 0.1936006 0.1936006 1.490476792
1.4 0.230139 0.230139 1.549344744
1.3 0.2713318 0.2713318 1.610537749
1.2 0.3173102 0.3173102 1.674147636
1.1 0.36812 0.36812 1.740269862
1 0.4237105 0.4237105 1.809003656
0.9 0.4839271 0.4839271 1.880452163
0.8 0.548506 0.548506 1.954722604
0.7 0.6170749 0.6170749 2.031926435
0.6 0.6891563 0.6891563 2.112179512
0.5 0.764177 0.764177 2.195602269
0.4 0.8414805 0.8414805 2.282319896
0.3 0.9203442 0.9203442 2.372462527 1
0.2 1 1 2.466165437 1
0.1 0.9203443 0.9203443 2.563569242
0 0.8414805 0.8414805 2.664820112
0.1 0.7641771 0.7641771 2.770069993
0.2 0.6891564 0.6891564 2.879476828
0.3 0.6170749 0.6170749 2.993204802
0.4 0.5485061 0.5485061 3.111424583
0.5 0.4839271 0.4839271 3.234313578
0.6 0.4237106 0.4237106 3.362056203
0.7 0.36812 0.36812 3.494844158
0.8 0.3173102 0.3173102 3.632876714
0.9 0.2713318 0.2713318 3.776361011
1 0.230139 0.230139 3.925512372
1.1 0.1936007 0.1936007 4.080554622
1.2 0.161513 0.161513 4.24172043
1.3 0.1336141 0.1336141 4.409251652
1.4 0.1095982 0.1095982 4.583399696
1.5 0.0891306 0.0891306 4.7644259
1.6 0.0718603 0.0718603 4.952601926
1.7 0.0574327 0.0574327 5.148210162
1.8 0.0454999 0.0454999 5.351544151
1.9 0.0357285 0.0357285 5.562909031
2 0.0278065 0.0278065 5.78262199
2.1 0.0214478 0.0214478 6.011012743
2.2 0.0163947 0.0163947 6.248424031
2.3 0.0124189 0.0124189 6.495212127
2.4 0.009322 0.009322 6.75174738
2.5 0.0069335 0.0069335 7.018414763
2.6 0.0051099 0.0051099 7.295614455
2.7 0.0037312 0.0037312 7.583762442
2.8
2.9
Expected Under H0
Cases Controls
Exposed 34.50 34.50 69
Non-exposed 25.50 25.50 51
60 60 120
[Chart: p-value function. Vertical axis: p-value (0.00 to 1.00); horizontal axis: Odds Ratio (log scale); reference lines at 80%, 90%, and 95% confidence.]
Strat. Case-Control (Rothman)
Crude Odds Ratio= 2.973773
P-value (HOMOG)= 0.02636859 ok
This is the p-value for the Chi Square Test for homogeneity across strata, used to determine if there is effect measure modification.
Stratum 2
Cases Controls
41 28 69
19 32 51
60 60 120
Odds Ratio= 2.47 ORmh= 4.52 ok

Chi Sq= 5.763 P-value (MH)= 5.97585E-07 ok
p-value= 0.016367 4.52 ok
Conf. Level= 0.95 LL95%CI 2.42 ok
Upper CI= 5.19 UL95%CI 8.47 ok
Lower CI= 1.17
lnOR 0.9027 ln(OR mh)= 1.51 ok
se(lnOR) 0.379455
w 6.9451144
ad/T= 10.933333 G2 0.01146262 ok
bc/T= 4.4333333 H2 + 0.05302419 ok
(a+d)/T 0.6083333 P2 + 0.03778932 ok
(b+c)/T 0.3916667 Q2 Var[ln(ORmh)]= 0.10227613 ok
0.1439861 SE= 0.31980639 ok
ChiSq_het 2.5565032 ChiSQ_HOMG= 4.93168534 ok
Expected Under H0
Cases Controls
34.50 34.50 69.00
25.50 25.50 51.00
60 60 120
34.5 Sum D54+H54 669.00 ok

41 Sum c55+g55 688.00 ok
Difference 19.00 ok
7.392857142857 14.49 ok
ChiSqMH= 24.9200245 ok
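The pooled results for the two strata can be reproduced with a minimal Python sketch. It assumes the standard Mantel-Haenszel estimator (sum of ad/T divided by sum of bc/T) and the Mantel-Haenszel chi square built from observed and expected exposed cases with variance n1 x n0 x m1 x m0 / (T^2 x (T - 1)), which agree with the intermediate values listed above. The function name `mantel_haenszel` is illustrative.

```python
def mantel_haenszel(strata):
    """strata: list of (a, b, c, d) = exposed cases, exposed controls,
    non-exposed cases, non-exposed controls."""
    sum_ad = sum_bc = observed = expected = variance = 0.0
    for a, b, c, d in strata:
        total = a + b + c + d
        n1, n0 = a + b, c + d  # exposed, non-exposed
        m1, m0 = a + c, b + d  # cases, controls
        sum_ad += a * d / total
        sum_bc += b * c / total
        observed += a
        expected += n1 * m1 / total
        variance += n1 * n0 * m1 * m0 / (total * total * (total - 1))
    or_mh = sum_ad / sum_bc
    chi_sq_mh = (observed - expected) ** 2 / variance
    return or_mh, chi_sq_mh


or_mh, chi_sq_mh = mantel_haenszel([(647, 622, 2, 27), (41, 28, 19, 32)])
print(round(or_mh, 2), round(chi_sq_mh, 2))
```

The pooled estimate (about 4.52) differs from the crude odds ratio (about 2.97), which is what the stratified analysis is designed to reveal.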
For p-value function

90% 95% 99%
Lower Bound 1.17
Upper Bound 5.19
RR
2.46616543684019
SE(ln)
0.38736032802159
Curve 1 p-value
0.8019729 0.003731
0.8336477 0.00511
0.8665735 0.006934
0.9007997 0.009322
0.9363777 0.012419
0.973361 0.016395
1.0118049 0.021448
1.0517672 0.027807
1.0933078 0.035728
1.1364892 0.0455
1.181376 0.057433
1.2280357 0.07186
1.2765383 0.089131
1.3269565 0.109598
1.379366 0.133614
1.4338455 0.161513
1.4904768 0.193601
1.5493447 0.230139
1.6105377 0.271332
1.6741476 0.31731
1.7402699 0.36812
1.8090037 0.423711
1.8804522 0.483927
1.9547226 0.548506
2.0319264 0.617075
2.1121795 0.689156
2.1956023 0.764177
2.2823199 0.841481
0 2.3724625 0.920344
1 2.4661654 1
2.5635692 0.920344
2.6648201 0.841481
2.77007 0.764177
2.8794768 0.689156
2.9932048 0.617075
3.1114246 0.548506
3.2343136 0.483927
3.3620562 0.423711
3.4948442 0.36812
3.6328767 0.31731
3.776361 0.271332
3.9255124 0.230139
4.0805546 0.193601
4.2417204 0.161513
4.4092517 0.133614
4.5833997 0.109598
4.7644259 0.089131
4.9526019 0.07186
5.1482102 0.057433
5.3515442 0.0455
5.562909 0.035729
5.782622 0.027807
6.0110127 0.021448
6.248424 0.016395
6.4952121 0.012419
6.7517474 0.009322
7.0184148 0.006934
7.2956145 0.00511
7.5837624 0.003731
0.99
0.95
0.9
0.00
0.00
0
Estimating Cumulative Incidence (CI) from Incidence Rate (IR)
Relationship of Incidence Rate to Cumulative Incidence (Risk)
Cumulative incidence (the proportion of a population at risk that will develop an outcome in a given period of time) provides a measure of risk, and it is an intuitive way to think about possible health outcomes. An incidence rate is less intuitive because it is really an estimate of the instantaneous rate of disease, i.e. the rate at which new cases are occurring at any particular moment. Incidence rate is therefore more analogous to the speed of a car, which is typically expressed in miles per hour. Time has to elapse to measure a car's speed, but we don't have to wait a whole hour; we can glance at the speedometer to see the instantaneous rate of travel. Rather than measuring risk per se, incidence rate measures the rate at which new cases of disease occur per unit of time, and time is an integral part of the calculation of incidence rate. In contrast, cumulative incidence or risk assesses the probability of an event occurring during a stated period of observation. Consequently, it is essential to describe the relevant time period in words when discussing cumulative incidence (risk), but time is not an integral part of the calculation. Despite this distinction, these two ways of expressing incidence are obviously related, and incidence rate can be used to estimate cumulative incidence. At first glance it would seem logical that, if the incidence rate remained constant, the cumulative incidence would be equal to the incidence rate times time:
CI = IR x T
This relationship would hold true if the population were infinitely large, but in a finite population this approximation becomes increasingly inaccurate over time because the size of the population at risk declines over time. Rothman uses the example of a population of 1,000 people who experience a mortality rate of 11 deaths per 1,000 person-years over a period of years; in other words, the rate remains constant. The equation above would lead us to believe that after 50 years the cumulative incidence of death would be CI = IR x T = 11 x 50 = 550 deaths in a population which initially had 1,000 members. In reality, there would only be 423 deaths after 50 years. The problem is that the equation above fails to take into account the fact that the size of the population at risk declines over time. After the first year there have been 11 deaths, and the population now has only 989 people, not 1,000. As a result, the equation above overestimates the cumulative incidence, because there is an exponential decay in the population at risk. A more accurate mathematical expression that takes this into account is:
$CI = 1 - e^{-IR \times T}$, where 'e' = 2.71828
This constant 'e' arises in many mathematical relationships describing growth or decay over time. If you are using an Excel spreadsheet, you could calculate the CI using the formula:
CI = 1 - EXP(-IR x T)

In the graph below the upper blue line shows the predicted number of deaths using the approximation CI = IR x T. The lower line, in red, shows the more accurate projection of cumulative deaths using the exponential equation.

Nevertheless, note that the prediction from CI = IR x T gives quite reasonable estimates in the early years of follow-up.
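The two formulas can be compared directly with a small Python sketch. The function names are illustrative; the example values (a rate of 11 per 1,000 person-years in a population of 1,000) are the ones used in the text.

```python
import math


def cumulative_incidence(ir, t):
    """Exponential formula: CI = 1 - exp(-IR x T)."""
    return 1 - math.exp(-ir * t)


def cumulative_incidence_linear(ir, t):
    """Simple approximation: CI = IR x T (reasonable only while IR x T is small)."""
    return ir * t


population = 1000
for years in (1, 10, 50):
    linear = population * cumulative_incidence_linear(0.011, years)
    exact = population * cumulative_incidence(0.011, years)
    print(years, round(linear, 1), round(exact, 1))
```

At 50 years the linear approximation predicts 550 deaths while the exponential formula predicts about 423, the discrepancy described above; at 1 year the two are nearly identical.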
This means CI=1 minus the constant 'e' (2.71828) raised to the power (-IRxT).
In Excel this can be programmed as "1-exp(-IR*T)" or as "1-POWER(2.71828, -IR*T)".
IR (person-yr) Time (yr) CI=1-e(-IRxT) CI=IRxT
0.001 8 0.008 0.008
0.005 8 0.039 0.040
0.010 8 0.077 0.080
0.020 8 0.148 0.160
0.040 8 0.274 0.320
0.060 8 0.381 0.480
0.080 8 0.473 0.640
0.100 8 0.551 0.800
0.200 8 0.798 1.600
0.400 8 0.959 3.200
0.800 8 0.998 6.400
The CI=IRxT column gives the approximation of CI from the formula CI=IRxT. This approximation is only valid when CI is low, i.e. <0.01 per person-year.
Note that cumulative incidence cannot be greater than 1 (i.e., 100%). In this example, column A lists increasing incidence rates, and column B sets the observation time at T=8 years. CI=IRxT gives good approximations of CI only when the IR is low (i.e. <0.10).
Here, IR is low (0.01 p-yrs) and we observe the

effect of increasing Time on the estimates of CI.
Estimate over time when IR is low (0.01 person-yr)

IR (person-yr) Time (yr) CI=1-e(-IRxT) CI=IRxT
0.010 1 0.010 0.010
0.010 2 0.020 0.020
0.010 3 0.030 0.030
0.010 4 0.039 0.040
0.010 5 0.049 0.050
0.010 8 0.077 0.080
0.010 10 0.095 0.100
0.010 12 0.113 0.120
0.010 14 0.131 0.140
0.010 16 0.148 0.160
0.010 18 0.165 0.180
0.010 20 0.181 0.200
0.010 30 0.259 0.300
0.010 40 0.330 0.400
0.010 50 0.393 0.500
0.010 60 0.451 0.600
0.010 80 0.551 0.800
0.010 100 0.632 1.000
Predicted # Deaths
IR time (yrs) CI=IR*T CI=1-exp(-IR*T)
0.011 1 11.0 10.9
0.011 2 22.0 21.8
0.011 3 33.0 32.5
0.011 4 44.0 43.0
0.011 5 55.0 53.5
0.011 6 66.0 63.9
0.011 7 77.0 74.1
0.011 8 88.0 84.2
0.011 9 99.0 94.3
0.011 10 110.0 104.2
0.011 15 165.0 152.1
0.011 20 220.0 197.5
0.011 25 275.0 240.4
0.011 30 330.0 281.1
0.011 35 385.0 319.5
0.011 40 440.0 356.0
0.011 45 495.0 390.4
0.011 50 550.0 423.1
[Chart: Predicted # Deaths (0 to 600) over 0 to 50 years; series CI=IR*T and CI=1-exp(-IR*T)]
Here, IR is very low (0.001 p-yrs) and

we observe the effect of increasing Time
on the estimates of CI.
Estimate over time when IR is very low (0.001 person-yr)

IR (person-yr) Time (yr) CI=1-e(-IRxT) CI=IRxT
0.001 1 0.001 0.001
0.001 2 0.002 0.002
0.001 3 0.003 0.003
0.001 4 0.004 0.004
0.001 5 0.005 0.005
0.001 8 0.008 0.008
0.001 10 0.010 0.010
0.001 12 0.012 0.012
0.001 14 0.014 0.014
0.001 16 0.016 0.016
0.001 18 0.018 0.018
0.001 20 0.020 0.020
0.001 30 0.030 0.030
0.001 40 0.039 0.040
0.001 50 0.049 0.050
0.001 60 0.058 0.060
0.001 80 0.077 0.080
0.001 100 0.095 0.100
Cohort Studies- Cumulative Incidence
Observed Data
Diseased No Disease
Exposed 75 49275 49350 Exposed
Non-exposed 38 77054 77092 Non-exposed
113 126329 126442
Incidence in exposed= 0.0015 Confidence Limits

Incidence in non-exposed= 0.0005 Lower Limit Upper Limit
Risk Ratio= 3.08 2.09 4.55
Risk Difference= 0.0010 0.00 0.00
Attrib. Prop. (AR%)= 67.6
Chi Sq= 35.531 Population Attributable Fraction=
p-value= 0.000000003
PAF= 0.448447 Online Fisher's Exact Test

# Needed to Treat= 974
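The measures in this cumulative-incidence table can be sketched in Python as follows. The formulas are the conventional ones and are an assumption about what the worksheet computes (log-based confidence limits for the risk ratio; attributable proportion = (RR - 1)/RR; population attributable fraction = proportion of cases exposed x (RR - 1)/RR; number needed to treat = 1 / risk difference). They reproduce the values displayed above, and the name `cohort_risk_measures` is illustrative.

```python
import math


def cohort_risk_measures(a, n1, c, n0, z=1.96):
    """a of n1 exposed and c of n0 non-exposed subjects developed disease."""
    risk_exposed = a / n1
    risk_unexposed = c / n0
    risk_ratio = risk_exposed / risk_unexposed
    se_ln_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    risk_difference = risk_exposed - risk_unexposed
    attributable_fraction = (risk_ratio - 1) / risk_ratio
    return {
        "risk_ratio": risk_ratio,
        "rr_lower": math.exp(math.log(risk_ratio) - z * se_ln_rr),
        "rr_upper": math.exp(math.log(risk_ratio) + z * se_ln_rr),
        "risk_difference": risk_difference,
        "attributable_percent": 100 * attributable_fraction,
        "paf": (a / (a + c)) * attributable_fraction,
        "number_needed": 1 / risk_difference,
    }


for name, value in cohort_risk_measures(75, 49350, 38, 77092).items():
    print(name, round(value, 4))
```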
Cohort Studies- Incidence Rate

Observed Data
Diseased No Disease Person-Time observation
Exposed 261 - 737397 Exposed
Non-exposed 363 - 878,547 Non-exposed
624 1615944
Incidence in exposed= 0.0004 Confidence Limits
Incidence in non-exposed= 0.0004 Lower Limit Upper Limit
Rate Ratio= 0.86 0.73 1.00
Rate Difference= -0.000059 -0.000120 0.000001
Chi Sq= 3.643
p-value= 0.056313 (if p=0.000000 it means that p<0.000001)
NNT= -16882 for a year if the denominator is person-years
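For person-time data, the rate ratio and rate difference can be sketched in Python as below. The standard errors used here, sqrt(1/a + 1/c) for ln(rate ratio) and sqrt(a/PT1^2 + c/PT0^2) for the rate difference, are assumptions about the worksheet's method; they reproduce the limits shown above. The function name `rate_measures` is illustrative.

```python
import math


def rate_measures(a, pt1, c, pt0, z=1.96):
    """a cases in pt1 exposed person-time; c cases in pt0 non-exposed person-time."""
    rate_exposed = a / pt1
    rate_unexposed = c / pt0
    ratio = rate_exposed / rate_unexposed
    se_ln_ratio = math.sqrt(1 / a + 1 / c)
    difference = rate_exposed - rate_unexposed
    se_difference = math.sqrt(a / pt1 ** 2 + c / pt0 ** 2)
    return {
        "rate_ratio": ratio,
        "ratio_lower": math.exp(math.log(ratio) - z * se_ln_ratio),
        "ratio_upper": math.exp(math.log(ratio) + z * se_ln_ratio),
        "rate_difference": difference,
        "difference_lower": difference - z * se_difference,
        "difference_upper": difference + z * se_difference,
    }


for name, value in rate_measures(261, 737397, 363, 878547).items():
    print(name, round(value, 6))
```

In this example the upper limit of the rate ratio sits almost exactly at 1.00, consistent with the p-value of about 0.056.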
Stratified Analysis
For stratified analysis of risk data (cumulative incidence) use the worksheet entitled: Strat. Cohort CI (Rothman)
For stratified analysis of rate data (incidence rates) use the worksheet entitled: Strat. Cohort IR (Rothman)
Lower Bound
Upper Bound
From Ken Rothman's Episheet

Y-Values X-Values
z value p-value Curve 1 Null Bar
2.9 0.003731 0.0037312 1.710405
2.8 0.00511 0.0051099 1.745513
2.7 0.006934 0.0069335 1.781342
2.6 0.009322 0.009322 1.817906
2.5 0.012419 0.0124189 1.855221
2.4 0.016395 0.0163947 1.893301
2.3 0.021448 0.0214478 1.932164
2.2 0.027807 0.0278065 1.971824
2.1 0.035728 0.0357284 2.012298
2 0.0455 0.0454999 2.053603
1.9 0.057433 0.0574327 2.095755
1.8 0.07186 0.0718603 2.138773
1.7 0.089131 0.0891306 2.182674
1.6 0.109598 0.1095982 2.227476
1.5 0.133614 0.133614 2.273198
1.4 0.161513 0.161513 2.319858
1.3 0.193601 0.1936006 2.367476
1.2 0.230139 0.230139 2.416072
1.1 0.271332 0.2713318 2.465664
1 0.31731 0.3173102 2.516275
0.9 0.36812 0.36812 2.567925
0.8 0.423711 0.4237105 2.620635
0.7 0.483927 0.4839271 2.674426
0.6 0.548506 0.548506 2.729322
0.5 0.617075 0.6170749 2.785345
0.4 0.689156 0.6891563 2.842518
0.3 0.764177 0.764177 2.900864
0.2 0.841481 0.8414805 2.960407
0.1 0.920344 0.9203442 3.021173 1 0
0 1 1 3.083187 1 1
0.1 0.920344 0.9203443 3.146473
0.2 0.841481 0.8414805 3.211058
0.3 0.764177 0.7641771 3.276969
0.4 0.689156 0.6891564 3.344233
0.5 0.617075 0.6170749 3.412877
0.6 0.548506 0.5485061 3.482931
0.7 0.483927 0.4839271 3.554422
0.8 0.423711 0.4237106 3.627381
0.9 0.36812 0.36812 3.701838
1 0.31731 0.3173102 3.777822
1.1 0.271332 0.2713318 3.855367
1.2 0.230139 0.230139 3.934503
1.3 0.193601 0.1936007 4.015263
1.4 0.161513 0.161513 4.097682
1.5 0.133614 0.1336141 4.181791
1.6 0.109598 0.1095982 4.267628
1.7 0.089131 0.0891306 4.355226
1.8 0.07186 0.0718603 4.444623
1.9 0.057433 0.0574327 4.535854
2 0.0455 0.0454999 4.628958
2.1 0.035729 0.0357285 4.723973
2.2 0.027807 0.0278065 4.820938
2.3 0.021448 0.0214478 4.919894
2.4 0.016395 0.0163947 5.020881
2.5 0.012419 0.0124189 5.12394
2.6 0.009322 0.009322 5.229116
2.7 0.006934 0.0069335 5.33645
2.8 0.00511 0.0051099 5.445987
2.9 0.003731 0.0037312 5.557772
Expected Under H0
Diseased No Disease
Exposed 44.10 49305.90 49350
Non-exposed 68.90 77023.10 77092
113 126329 126442
Population Attributable Fraction= 0.4484471
[Chart: p-value function, p-value (test-based). Vertical axis: p-value and Confidence Level (0.00 to 1.00); horizontal axis: RR on a log scale (0.01 to 10).]
Expected Under H0
Diseased No Disease
Exposed 284.75 -
Non-exposed 339.25 -
624
From Rothman: Epidemiology: An Introduction, p. 121:
"… significance testing is only a qualitative proposition. The end result is a declaration of 'significant' or 'not significant' that provides no quantitative clue about the size of the effect. [The p value function]...presents a quantitative visual message about the estimated size of the effect. The message comes in two parts, relating to the strength of the effect and the precision. Strength is conveyed by the location of the curve along the horizontal axis and precision by the spread of the function around the point estimate. Because the p value is only one number, it cannot convey two separate quantitative messages."
For p-value function
90% 95% 99%
Lower Bound 2.09
Upper Bound 4.55
RR
3.0831867
SE(ln)
0.203184
Curve 1 p-value
1.7104048 0.0037312
1.745513 0.0051099
1.7813417 0.0069335
1.817906 0.009322
1.8552207 0.0124189
1.8933014 0.0163947
1.9321637 0.0214478
1.9718238 0.0278065
2.0122979 0.0357284
2.0536027 0.0454999
2.0957554 0.0574327
2.1387734 0.0718603
2.1826743 0.0891306
2.2274764 0.1095982
2.2731981 0.133614
2.3198583 0.161513
2.3674762 0.1936006
2.4160715 0.230139
2.4656644 0.2713318
2.5162752 0.3173102
2.5679248 0.36812
2.6206346 0.4237105
2.6744263 0.4839271
2.7293222 0.548506
2.7853449 0.6170749
2.8425175 0.6891563
2.9008637 0.764177
2.9604074 0.8414805
3.0211734 0.9203442
3.0831867 1
3.1464729 0.9203443
3.2110581 0.8414805
3.276969 0.7641771
3.3442328 0.6891564
3.4128773 0.6170749
3.4829308 0.5485061
3.5544222 0.4839271
3.6273811 0.4237106
3.7018375 0.36812
3.7778223 0.3173102
3.8553667 0.2713318
3.9345028 0.230139
4.0152633 0.1936007
4.0976815 0.161513
4.1817915 0.1336141
4.2676279 0.1095982
4.3552262 0.0891306
4.4446225 0.0718603
4.5358538 0.0574327
4.6289578 0.0454999
4.7239728 0.0357285
4.8209382 0.0278065
4.9198938 0.0214478
5.0208807 0.0163947
5.1239404 0.0124189
5.2291156 0.009322
5.3364496 0.0069335
5.4459867 0.0051099
5.5577723 0.0037312
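The "Curve 1" and p-value columns above follow a simple pattern that can be sketched in Python: for each z value from -2.9 to 2.9, the x-value is RR x exp(z x SE(ln)) and the y-value is the two-sided p-value for that z. This is a reading of the tabulated numbers rather than Episheet's own code, and the function name `p_value_function` is illustrative.

```python
import math


def p_value_function(point_estimate, se_ln, z_values=None):
    """Return (effect estimate, two-sided p-value) pairs tracing the curve."""
    if z_values is None:
        z_values = [i / 10 for i in range(-29, 30)]  # -2.9, -2.8, ..., 2.9
    curve = []
    for z in z_values:
        p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
        curve.append((point_estimate * math.exp(z * se_ln), p_value))
    return curve


# RR and SE(ln) as listed in the "For p-value function" block.
curve = p_value_function(3.0831867, 0.203184)
print(curve[0], curve[29], curve[-1])
```

Plotting the pairs with the horizontal axis on a log scale gives the p-value function; the curve peaks at p = 1 over the point estimate.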
Sample Size Calculations
Part I - Sample Size Calculations for Means
Anticipated Values: Put your anticipated means and standard deviations in the blue boxes.
Mean Stan. Dev
Group 1 205 30
Group 2 200 30
Difference in means = 2.44 %
The cells in the table below show the estimated number of subjects needed in each group to demonstrate a statistically significant difference at "p" values ranging from 0.10-0.01 and at varying levels of "power." [Power is the probability of finding a statistically significant difference, assuming it exists, at a given "p" value.]
Sample Size Needed in Each Group

alpha level Power
("p" value) 95% 90% 80% 50%
0.10 778 619 446 194
0.05 936 756 562 274
0.02 1138 936 720 389
0.01 1282 1073 842 475
The red cells indicate the two most commonly used estimates, i.e. based on 90% or 80% power.
=================================================================
Part II - Sample Size Calculations for a Difference in Proportions (frequency)

Anticipated Values: Put your anticipated proportions in the blue boxes.
Proportion with (without)
Group 1 0.4 0.6
Group 2 0.1 0.9
Difference in frequency = 75.00 %
The cells in the table below show the estimated number of subjects needed in each group to demonstrate a statistically significant difference at "p" values ranging from 0.10-0.01 and at varying levels of "power." [Power is the probability of finding a statistically significant difference, assuming it exists, at a given "p" value.]

Sample Size Needed in Each Group
alpha level Power
("p" value) 95% 90% 80% 50%
0.10 40 32 23 10
0.05 48 39 29 14
0.02 58 48 37 20
0.01 65 55 43 24
The red cells indicate the two most commonly used estimates, i.e. using a p-value <0.05 and either 90 or 80% power.
Using Lisa Sullivan's "Essentials of Biostatistics in Public Health", chapter 8
p1 0.6
p2 0.9
overall p 0.75
Effect Size= 0.69282032
Z(1-a/2) 1.96
Z(1-b) 0.84
n 32.6666667
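The values listed above can be reproduced with a short calculation. The sketch below assumes the effect size is the absolute difference in proportions divided by the square root of p(1 - p), where p is the average of the two proportions, and that n per group is 2 x ((Z(1-a/2) + Z(1-b)) / effect size) squared. That reading is inferred from the numbers on the sheet; the function name is illustrative.

```python
import math

def n_per_group_proportions(p1, p2, z_alpha=1.96, z_beta=0.84):
    """Sample size per group for comparing two proportions.

    Assumed formula (inferred from the sheet's values):
    effect size = |p1 - p2| / sqrt(p_bar * (1 - p_bar)),
    n = 2 * ((z_alpha + z_beta) / effect size) ** 2
    """
    p_bar = (p1 + p2) / 2  # "overall p" on the sheet
    effect_size = abs(p1 - p2) / math.sqrt(p_bar * (1 - p_bar))
    n = 2 * ((z_alpha + z_beta) / effect_size) ** 2
    return effect_size, n

effect_size, n = n_per_group_proportions(0.6, 0.9)
print(f"effect size = {effect_size:.8f}, n per group = {n:.4f}")
```

In practice the result would be rounded up to the next whole subject.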
Main Menu
Table for (Z1-alpha/2+Z1-beta)squared

beta
alpha 0.05 0.1 0.2 0.5
0.1 10.8 8.6 6.2 2.7
0.05 13 10.5 7.8 3.8
0.02 15.8 13 10 5.4
0.01 17.8 14.9 11.7 6.6
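As a rough check, the Part I table for means appears to follow n = 2 x (Z1-alpha/2 + Z1-beta)squared x SD squared / (difference in means) squared, using the rounded multipliers from the table above. The sketch below rests on that assumption; the dictionary and function names are illustrative.

```python
# Multipliers copied from the table above, keyed by alpha, then beta (1 - power).
Z_SQUARED = {
    0.10: {0.05: 10.8, 0.1: 8.6, 0.2: 6.2, 0.5: 2.7},
    0.05: {0.05: 13.0, 0.1: 10.5, 0.2: 7.8, 0.5: 3.8},
    0.02: {0.05: 15.8, 0.1: 13.0, 0.2: 10.0, 0.5: 5.4},
    0.01: {0.05: 17.8, 0.1: 14.9, 0.2: 11.7, 0.5: 6.6},
}

def n_per_group_means(sd, difference, alpha, beta):
    """Subjects per group for comparing two means (assumed formula, equal SDs)."""
    return round(2 * Z_SQUARED[alpha][beta] * sd ** 2 / difference ** 2)

# Part I example: means of 205 and 200, SD of 30 in both groups.
for alpha in Z_SQUARED:
    row = [n_per_group_means(30, 205 - 200, alpha, beta) for beta in (0.05, 0.1, 0.2, 0.5)]
    print(alpha, row)
```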
Random Number Generator Main Menu
Number of groups= 4
Enter a seed # 6
Assign to Group: 4
random # 0.824695
Intermediate values shown on the sheet: 82, 328, /100, 3.28, trunc, 4
This program uses a random number generator to assign subjects randomly to a group. You need to specify how many groups you want in the first blue cell. You then need to "spark" the random number generator by entering some number (ANY number) in the 2nd blue cell. Enter a number and click outside the cell; this will generate a random number and specify to which group the subject should be assigned, based on how many groups you specified.
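A minimal sketch of the same idea outside the spreadsheet: draw a uniform random number and map it to one of the groups. The mapping int(r x groups) + 1 is an assumption; it gives group 4 for the sheet's random number 0.824695 with 4 groups, but the workbook's exact formula is not shown. Function names are illustrative.

```python
import random

def group_from_random(r, n_groups):
    """Map a uniform random number in [0, 1) to a group number 1..n_groups."""
    return int(r * n_groups) + 1

def assign_group(n_groups, seed):
    """Seed a generator (like entering a number in the 2nd blue cell) and assign one subject."""
    rng = random.Random(seed)
    r = rng.random()
    return r, group_from_random(r, n_groups)

print(group_from_random(0.824695, 4))  # the sheet's random number with 4 groups
r, group = assign_group(n_groups=4, seed=6)
print(f"random # {r:.6f} -> group {group}")
```

Python's generator will not reproduce Excel's random numbers for the same seed; only the mapping step is comparable.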
Direct Standardization (for Adjusted Rates)
Adapted from Dr. Tim Heeren, Boston University School of Public Health, Dept. of Biostatistics
For specific strata of a population (e.g. age groups) indicate the number of observed events and the number of people in the stratum in columns E and F.
Indicate the distribution of some standard reference population in column C. [Leave a "1" in column F for extra strata to prevent calculation error.]
Distribution of
Reference Number of Number of Proportion
e.g. age Stratum Population Events Subjects or "Rate" SE
<5 1 0.20 200 46000 0.00435 0.00031
5-19 2 0.40 900 23000 0.03913 0.00128
20-44 3 0.40 2000 30000 0.06667 0.00144
45-64 4 0.00 0 1 0.00000 0.00000
65+ 5 0.00 0 1 0.00000 0.00000
6 0.00 0 1 0.00000 0.00000
7 0.00 0 1 0.00000 0.00000
8 0.00 0 1 0.00000 0.00000
Totals 1.00 3100 99005
Crude Rate 0.03131
Standardized Proportion or "Rate" 0.04319
Standard Error 0.00077
95% CI for Standardized Rate 0.04167 0.04470
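The table above can be reproduced with a few lines of code. This sketch assumes the standardized rate is the weighted sum of the stratum-specific rates and that its variance is the sum of weight squared x p(1 - p)/n over strata (a binomial approximation), with a 1.96 multiplier for the 95% CI. These choices match the sheet's numbers, but the workbook's formulas are not shown; names are illustrative.

```python
import math

def direct_standardize(weights, events, subjects, z=1.96):
    """Directly standardized rate, its standard error, and a 95% CI."""
    rates = [e / n for e, n in zip(events, subjects)]
    std_rate = sum(w * r for w, r in zip(weights, rates))
    variance = sum(w ** 2 * r * (1 - r) / n
                   for w, r, n in zip(weights, rates, subjects))
    se = math.sqrt(variance)
    return std_rate, se, (std_rate - z * se, std_rate + z * se)

# The three populated strata from the table above (<5, 5-19, 20-44).
weights = [0.20, 0.40, 0.40]
events = [200, 900, 2000]
subjects = [46000, 23000, 30000]

std_rate, se, (ci_low, ci_high) = direct_standardize(weights, events, subjects)
crude_rate = sum(events) / sum(subjects)
print(f"crude {crude_rate:.5f}, standardized {std_rate:.5f}, SE {se:.5f}")
print(f"95% CI {ci_low:.5f} to {ci_high:.5f}")
```

Empty strata (weight 0) are simply omitted here, which avoids the need for the placeholder "1" in the subjects column.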
Example: Suppose you want to compare Florida and Alaska with respect to death rates from cancer. The problem is that death rates are markedly affected by age, and Florida and Alaska have different age distributions. However, we can calculate age-adjusted rates by using a reference or "standard" distribution to determine what the overall rates for Florida and Alaska would have been if their populations had similar distributions. The calculation uses the age-specific rates observed for each population and calculates a weighted average using the "standard" population's distribution for weighting. In this case, the US age distribution in 1988 was used as a standard, but you can use any other standard. Note that the crude rates for Florida and Alaska differ substantially (1,061 per 100,000 vs. 391 per 100,000), but Florida has a higher percentage of old people. The standardized (age-adjusted) rates are very similar (797 vs. 750 per 100,000).
Florida
                     Distribution of
                     US Population   Number of   Number of   Proportion
e.g. age   Stratum   in 1988         Events      Subjects    or "Rate"    SE
<5         1         0.07            2414        850000      0.00284      0.00006
5-19       2         0.22            1300        2280000     0.00057      0.00002
20-44      3         0.40            8732        4410000     0.00198      0.00002
45-64      4         0.19            21190       2600000     0.00815      0.00006
65+        5         0.12            97350       2200000     0.04425      0.00014
           6         0.00            0           1           0.00000      0.00000
           7         0.00            0           1           0.00000      0.00000
           8         0.00            0           1           0.00000      0.00000
Totals               1.00            130986      12340003
Crude Rate 0.01061
Alaska
                     Distribution of
                     US Population   Number of   Number of   Proportion
e.g. age   Stratum   in 1988         Events      Subjects    or "Rate"    SE
<5         1         0.07            164         60000       0.00273      0.00021
5-19       2         0.22            85          130000      0.00065      0.00007
20-44      3         0.40            450         240000      0.00188      0.00009
45-64      4         0.19            503         80000       0.00629      0.00028
65+        5         0.12            870         20000       0.04350      0.00144
           6         0.00            0           1           0.00000      0.00000
           7         0.00            0           1           0.00000      0.00000
           8         0.00            0           1           0.00000      0.00000
Totals               1.00            2072        530003
Crude Rate 0.00391
Main Menu
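Intermediate calculations for the first example (strata 1-8, then totals):
Weight x Rate   Weight^2 x SE^2
0.00086957      0.00000000
0.01565217      0.00000026
0.02666667      0.00000033
0.00000000      0.00000000
0.00000000      0.00000000
0.00000000      0.00000000
0.00000000      0.00000000
0.00000000      0.00000000
0.04318841      0.00000060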
Florida
                                     % of total   Rate per
Age     Deaths     Pop.         (Weight)     100,000
<5      2,414      850,000      7%           284
5-19    1,300      2,280,000    18%          57
20-44   8,732      4,410,000    36%          198
45-64   21,190     2,600,000    21%          815
>65     97,350     2,200,000    18%          4,425
Tot.    130,986    12,340,000   100%
Crude Rate= 130,986/12,340,000=1,061 per 100,000
US 1988 weight x age-specific rate, by stratum: 0.00019880, 0.00012544, 0.00079202, 0.00154850, 0.00531000; sum = 0.00797476 (standardized rate, 797 per 100,000)
Alaska
                                   % of total   Rate per
Age     Deaths   Pop.        (Weight)     100,000
<5      164      60,000      11%          274
5-19    85       130,000     25%          65
20-44   450      240,000     45%          188
45-64   503      80,000      15%          629
>65     870      20,000      4%           4,350
Tot.    2,072    530,000     100%
Crude Rate= 2,072/530,000=391 per 100,000
US 1988 weight x age-specific rate, by stratum: 0.00019133, 0.00014385, 0.00075000, 0.00119463, 0.00522000; sum = 0.00749980 (standardized rate, 750 per 100,000); variance of the standardized rate = 0.00000003
Standardized Incidence Ratios
SIR is useful for evaluating whether the number of observed cancers in a community exceeds the
overall average rate for the entire state.
                     State Cancer   # People in   Expected #     Observed #
                     Rate           Community     Community      Community
e.g. age   Stratum   (Standard)     Strata        Cancers        Cancers
                                                  (Column CxD)
<20        1         0.00010        74657         7.5            11
20-44      2         0.00020        134957        27.0           25
45-64      3         0.00050        54463         27.2           30
65-74      4         0.00150        25136         37.7           40
75-84      5         0.00180        17012         30.6           30
85+        6         0.00100        6337          6.3            8
           7         0.00000        0             0.0
           8         0.00000        0             0.0
Totals                              312562        136            144

                                     95% CI
observed #   expected   SIR      lower   Upper
144          136.35     105.61   89.06   124.34
p-value: 0.267253325

Calculation of Standardized Incidence Ratios in the MA Cancer Registry
In order to monitor for unusual increases in cancer rates in individual municipalities, cancer incidence data are tabulated by age group and gender. The overall rates of cancer for the entire state are computed and used to generate the 'expected' rates for all communities. These expected rates are then applied to the demographics of each municipality in order to compute the 'expected' number of a particular type of cancer. This is then compared to the actual number of cancers that were observed in a given town. The SIR is the (# of observed cancer cases / # expected) x 100. Thus, if a town's SIR for breast cancer were 125, this would suggest that the town's breast cancer incidence was 25% greater than that of the state overall.
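The SIR and its confidence limits can be sketched as follows. The expected count is the sum of state rate x community population over strata. The confidence limits on the sheet are consistent with Byar's approximation to the exact Poisson limits, so that is what this sketch uses; this is an inference from the numbers, not a documented feature of the workbook. The function name is illustrative.

```python
import math

state_rates = [0.00010, 0.00020, 0.00050, 0.00150, 0.00180, 0.00100]
community_pop = [74657, 134957, 54463, 25136, 17012, 6337]
observed_by_stratum = [11, 25, 30, 40, 30, 8]

def sir_with_byar_ci(observed, expected, z=1.96):
    """SIR (x 100) with Byar's approximate confidence limits (assumed method)."""
    sir = observed / expected * 100
    lower = observed * (1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))) ** 3
    o1 = observed + 1
    upper = o1 * (1 - 1 / (9 * o1) + z / (3 * math.sqrt(o1))) ** 3
    return sir, lower / expected * 100, upper / expected * 100

expected = sum(r * n for r, n in zip(state_rates, community_pop))
observed = sum(observed_by_stratum)
sir, lower, upper = sir_with_byar_ci(observed, expected)
print(f"observed {observed}, expected {expected:.2f}, SIR {sir:.2f} ({lower:.2f} to {upper:.2f})")
```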
Main Menu
The SIR is the ratio of the observed # cases divided by the expected #. The ratio is then multiplied X 100.
The Poisson Probability Calculator
Suppose that in a given population the average (expected) number of leukemia cases in a given time period is 5, but one observes 12 cases during the time period. What is the probability of observing this as a result of random variation? To compute the relevant probabilities, enter the observed number of cases and the average, or usual number of cases.
Enter the observed number of events: 12

Enter the usual (average) number of events: 5
Probability of exactly 12 events = 0.003434

Cumulative probability of < 12 events = 0.994547
Cumulative probability of > 12 events = 0.002019
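The three probabilities above follow directly from the Poisson probability mass function, P(k) = e^(-mean) x mean^k / k!. A minimal sketch using only the standard library (the function name is illustrative):

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events when the expected number is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

observed, usual = 12, 5
p_exact = poisson_pmf(observed, usual)
p_fewer = sum(poisson_pmf(k, usual) for k in range(observed))  # P(X < 12)
p_more = 1 - p_fewer - p_exact                                 # P(X > 12)

print(f"exactly {observed}: {p_exact:.6f}")
print(f"fewer than {observed}: {p_fewer:.6f}")
print(f"more than {observed}: {p_more:.6f}")
```

The probability of 12 or more events, which is usually the quantity of interest for a suspected cluster, is p_exact + p_more.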
Main Menu
n time period is 5, but one observes 12

ariation? To compute the relevant
The Binomial Probability Calculator
Suppose that a town has 12 leukemia cases during a given period of time, and 6 of the cases occurred in a census tract where only 17% of the town's population lives. What is the probability of 6 or more cases in this census tract as a result of random variation? To compute the relevant probabilities, enter the observed number of cases and the average, or usual number of cases.
Probability of "success" on each trial: 0.51

Enter the number of trials: 12
Enter the number of events ("successes"): 9
Probability of exactly 9 events = 0.060415
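The probability shown above is the binomial probability of exactly 9 "successes" in 12 trials when each trial succeeds with probability 0.51. A minimal sketch follows (function names are illustrative). The tail function can also be applied to the census tract question in the text, i.e. 6 or more of 12 cases where the tract holds 17% of the population.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_tail(k, n, p):
    """Probability of k or more successes in n trials."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))

p_exactly_9 = binomial_pmf(9, 12, 0.51)
print(f"P(exactly 9 of 12 | p = 0.51) = {p_exactly_9:.6f}")

# Census tract example from the text: 6 or more of 12 cases, p = 0.17.
print(f"P(6 or more of 12 | p = 0.17) = {binomial_tail(6, 12, 0.17):.6f}")
```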

Main Menu
Normal Probability Calculator
Main Menu
Enter the Mean and Std. Dev. for a population and an observed value (X) from the population. The function returns the probability of values less than the observed value. This can also be interpreted as the "percentile" for the observed value.
Mean Std. Dev. X Cumulative probability

29 6 35 Example 0.841344746068543
Enter your values
Percentile Value Calculator
Enter the desired probability (percentile), e.g., for 90th percentile enter 0.90. Then
enter the mean and standard deviation for the population. The function will return the
value representing that percentile.
Value at the Desired

Percentile Mean Std. Dev Percentile
0.9 28 7 Example 36.9708609588122
Enter your values
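Both calculators correspond to the normal cumulative distribution function and its inverse. A minimal sketch with Python's standard library, reproducing the two example rows:

```python
from statistics import NormalDist

# Normal Probability Calculator: mean 29, SD 6, observed X = 35.
cumulative = NormalDist(mu=29, sigma=6).cdf(35)
print(f"cumulative probability = {cumulative:.6f}")

# Percentile Value Calculator: 90th percentile for mean 28, SD 7.
value_at_90th = NormalDist(mu=28, sigma=7).inv_cdf(0.90)
print(f"value at the 90th percentile = {value_at_90th:.4f}")
```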

Screening Main Menu
Gold Standard
+ -
Test + 4235 13337 17572 PPV= 0.241
Result - 1755 53563 55318 NPV= 0.968
5990 66900 72890
Sensitivity Specificity
0.707 0.801
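The four measures in the table follow from the 2x2 cell counts. A minimal sketch (the function and key names are illustrative):

```python
def screening_performance(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 screening table."""
    return {
        "sensitivity": tp / (tp + fn),  # test + among those with disease
        "specificity": tn / (tn + fp),  # test - among those without disease
        "ppv": tp / (tp + fp),          # disease among test +
        "npv": tn / (tn + fn),          # no disease among test -
    }

# Cell counts from the table above (gold standard + / -, test + / -).
result = screening_performance(tp=4235, fp=13337, fn=1755, tn=53563)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```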
Descriptive Statistics: Mean, Median, Mode, 95% confidence
interval for a mean, Standard Deviation, Standard Error, Range
(minimum and maximum)
Wayne W. LaMorte, MD, PhD, MPH   Copyright 2006   Main Menu
Example Data: 14, 17, 22, 18, 22, 17, 12, 7, 20, 21, 21, 23
N = 12
Mean = 17.83
Variance = 23.06
STD (Stand. Dev.) = 4.80
Std Error = 1.39
Lower 95% confidence limit = 15.12
Upper 95% confidence limit = 20.55
Median 19
Mode 17
Minimum 7
Maximum 23
Variability in the data can be quantified from the variance (σ²), which is essentially the average squared distance between each individual value and the mean (the x with the "bar" over it).
$\sigma^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Standard deviation (σ or SD) is just the square root of the variance, and it is convenient because, for normally distributed data, the mean ± 1 SD captures about 68% of the observations, and the mean ± 2 SD captures about 95% of the observations.
$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
The Standard Error is the Standard Deviation divided by the square root of "n".
$SE = \sigma / \sqrt{n}$
Use Standard Deviation or Standard Error? A standard deviation from a sample is an estimate of the population SD, e.g. the degree of variability of body weight in the population. The SE is a measure of the precision of our estimate of the population's mean. The precision of this estimate will increase as the sample size increases, i.e. the SE will be narrower with larger samples.
If the purpose is to describe a group of patients, for example, to see if they are typical in their variability, one should use SD (e.g. see Tables 2 & 3 in Gottlieb et al.: N. Engl. J. Med. 1981; 305:1425-31). However, if the purpose is to estimate the mean in a group or the prevalence of disease, one should use SE or a confidence interval.
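The summary statistics for the example data can be reproduced with Python's statistics module. One assumption is flagged in the code: the sheet's confidence limits are consistent with mean ± 1.96 x SE (a normal multiplier) rather than a t-based multiplier, so the sketch uses 1.96.

```python
import math
import statistics

data = [14, 17, 22, 18, 22, 17, 12, 7, 20, 21, 21, 23]

n = len(data)
mean = statistics.mean(data)
variance = statistics.variance(data)  # sample variance, n - 1 in the denominator
sd = math.sqrt(variance)
se = sd / math.sqrt(n)                # standard error of the mean
# Assumption: 1.96 multiplier, which matches the limits shown on the sheet.
lower, upper = mean - 1.96 * se, mean + 1.96 * se

print(f"N = {n}, mean = {mean:.2f}, variance = {variance:.2f}")
print(f"SD = {sd:.2f}, SE = {se:.2f}, 95% limits = {lower:.2f} to {upper:.2f}")
print(f"median = {statistics.median(data)}, mode = {statistics.mode(data)}")
print(f"minimum = {min(data)}, maximum = {max(data)}")
```

With only 12 observations, a t-based multiplier would give a somewhat wider interval than the one shown.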
T-critical 2.12
T-critical* 2.94
Skewed Distributions Wayne W. LaMorte, MD, PhD, MPH
Examining the frequency distribution of a data set is an important first step in analysis. It gives an overall picture of the data, and the shape of the distribution determines the appropriate statistical analysis. Many statistical tests rely on the assumption that the data are normally distributed, and this isn't always the case. Below in the green cells is a data set with hospital length of stay (days) for two sets of patients who had femoral bypass surgery. One data set was collected before instituting a new clinical pathway and one set was collected after instituting it.
Question: Was LOS different after instituting the pathway?
We can rapidly get a feel for what is going on here by creating a frequency histogram. The first step is to sort each of the data sets. Begin by selecting the "before" values of LOS. Then, from the top toolbar, click on "Data", "Sort" (if you get a warning about adjacent data, just indicate you want to continue with the current selection). Also, indicate that there is no "header" row and that you want to sort in ascending order. Repeat this procedure for the other data set.
LOS
Before   After
3        3
12       1
2        1
1        5
11       1
4        6
2        1
2        5
3        2
1        3
8        3
2        1
3        5
6        2
1        2
13       2
3        3
8        3
10       7
6        3
4        4
12       1
9        3
7        3
1        2
3        2
3        2
2        4
Mean
5.07     2.86
Your data should now look like this:
LOS, sorted
Before: 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 6, 6, 7, 8, 8, 9, 10, 11, 12, 12, 13
After:  1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7
           Before   After
Mean       5.07     2.86
Variance   14.74    2.57
SD         3.84     1.60
Median     3        3

And you can summarize it by counting the frequency of each LOS.
Summary: Frequency of each LOS
        # of people   # of people
LOS     before        after
1       4             6
2       5             7
3       6             8
4       2             2
5       0             3
6       2             1
7       1             1
8       2             0
9       1             0
10      1             0
11      1             0
12      2             0
13      1             0
14      0             0
15      0             0
total   28            28
[Chart: frequency histogram of LOS (1 to 15 days) before and after, with the mean before, the mean after, and the median marked]
Note that the data is not normally distributed; it is a skewed distribution. As a result the standard deviation is large, relative to the mean. In situations where the distribution is quite skewed the mean and standard deviation are misleading parameters to describe the data, and it is better to simply state the median (half of the observations are above the median and half are below) and the range (minimum & maximum values). Note that in this case one mean is almost twice as large as the other, but the median values are the same. Consequently, it is not clear whether institution of the clinical pathway produced an improvement in hospital stay. With skewed data like this, a common mistake is to compare them using a t-test. This is NOT appropriate, because the validity of the t-test relies on the assumption that the data are normally distributed.
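The sort-and-count summary can also be sketched in code. The example below rebuilds the two data sets from the frequency table of LOS values and then computes the mean, median, and range for each, which shows the contrast between very different means and identical medians. Variable and function names are illustrative.

```python
import statistics
from collections import Counter

# Frequency of each LOS (days): {LOS: number of people}, from the summary table.
freq_before = {1: 4, 2: 5, 3: 6, 4: 2, 6: 2, 7: 1, 8: 2, 9: 1, 10: 1, 11: 1, 12: 2, 13: 1}
freq_after = {1: 6, 2: 7, 3: 8, 4: 2, 5: 3, 6: 1, 7: 1}

def expand(freq):
    """Turn a frequency table back into a sorted list of observations."""
    return sorted(Counter(freq).elements())

before, after = expand(freq_before), expand(freq_after)
for label, los in (("before", before), ("after", after)):
    print(f"{label}: n = {len(los)}, mean = {statistics.mean(los):.2f}, "
          f"SD = {statistics.stdev(los):.2f}, median = {statistics.median(los)}, "
          f"range = {min(los)}-{max(los)}")
```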
T-Tests
Unpaired T-test
Consider the values of body mass index for the two groups below; group 1 was untreated & group 2 represents values in a group that was treated with a regimen of diet and exercise for 4 months. There is variability from person to person. Values range from 22-38, and there is considerable overlap between the two groups.
Not surprisingly, when I perform an unpaired t-test on these data, the differences are not statistically significant (p=0.18).
Group 1   Group 2
BMI       BMI
25        23
25        26
27        24
34        32
38        34
30        30
25        24
28        26
29        22
32        31
27        28
28        25
30        27
31        30
29.21     Mean   27.29
3.70      SD     3.65
p-value for unpaired t-test 0.17682508
[Chart: individual BMI values for Group 1 and Group 2; y-axis from 20 to 40]


However, suppose these were not two independent groups of individuals, but a single group whose BMIs were measured before and after the 4 month treatment. In other words, the data were "paired" in the sense that each person acted as their own control. Much of the "variability" that we are dealing with in the setting of two independent groups is due to the fact that there is substantial person-to-person variability to begin with. However, what we are really interested in is the response to treatment.
Paired T-test
          (before)   (after)
Subject   BMI 1      BMI 2     difference
1         25         23        -2
2         25         26        1
3         27         24        -3
4         34         32        -2
5         38         34        -4
6         30         30        0
7         25         24        -1
8         28         26        -2
9         29         22        -7
10        32         31        -1
11        27         28        1
12        28         25        -3
13        30         27        -3
14        31         30        -1
Mean difference -1.9
p-value with paired t-test 0.004
In this case, it looks like just about all subjects reduced their BMI somewhat, and if you factor out the person-to-person differences, it looks like the treatment regimen had an effect.
In the unpaired t-test the null hypothesis is that the means are the same, but in a paired t-test the null hypothesis is that the mean difference between the pairs is zero.
In Excel a paired t-test is specified just like an unpaired t-test, except that the last parameter is set to 1.
A paired t-test can be used in two circumstances:
1) When doing a "before and after" comparison in each subject or comparing two treatments in each in a clinical trial.
2) In matched case-control studies it is sometimes possible to make comparisons in pairs. [See the methods section in the case-control study by Herbst et al.: Adenocarcinoma of the vagina: association of maternal stilbestrol therapy with tumor appearance in young women, N. Engl. J. Med 1971; 284:878-883.]
A paired t-test relies on the following assumptions:
1) The data are quantitative.
2) The differences (e.g. after-before) are normally distributed.
3) The differences are independent of one another.
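The contrast between the unpaired and paired analyses of the BMI data can be sketched with the standard library alone. The t statistics use the usual formulas (pooled variance for the unpaired test, the mean and SD of the differences for the paired test). The two-tailed p-value is obtained here by numerically integrating the Student t density with Simpson's rule, which is an implementation choice for this sketch and not how Excel computes it; function names are illustrative.

```python
import math
from statistics import mean, variance

def t_two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value from the Student t density, integrated by Simpson's rule."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    b = abs(t)
    h = b / steps
    total = pdf(0) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 1 - 2 * (total * h / 3)  # 1 minus the area between -|t| and +|t|

def unpaired_t(x, y):
    """Equal-variance (pooled) t statistic and degrees of freedom."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny)), nx + ny - 2

def paired_t(before, after):
    """Paired t statistic (on after - before differences) and degrees of freedom."""
    diffs = [a - b for b, a in zip(before, after)]
    return mean(diffs) / math.sqrt(variance(diffs) / len(diffs)), len(diffs) - 1

bmi1 = [25, 25, 27, 34, 38, 30, 25, 28, 29, 32, 27, 28, 30, 31]
bmi2 = [23, 26, 24, 32, 34, 30, 24, 26, 22, 31, 28, 25, 27, 30]

t_u, df_u = unpaired_t(bmi1, bmi2)
t_p, df_p = paired_t(bmi1, bmi2)
p_unpaired = t_two_tailed_p(t_u, df_u)
p_paired = t_two_tailed_p(t_p, df_p)
print(f"unpaired: t = {t_u:.3f}, df = {df_u}, p = {p_unpaired:.4f}")
print(f"paired:   t = {t_p:.3f}, df = {df_p}, p = {p_paired:.4f}")
```

The same data give a much smaller p-value when paired, because the person-to-person variability drops out of the differences.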
Main Menu
The Unpaired T-Test Main Menu
Wayne W. LaMorte, MD, PhD,
MPH Copyright 2006
Unpaired t-tests (comparing two independent means):

For continuous data one is frequently asking the question "Is the mean different for these two groups?" In other words, the null hypothesis is that the groups have the same mean. If the sample size is relatively large (>30) this can be done using z scores and the normal distribution. However, authors frequently use a t-test (even with large samples), and this is particularly appropriate if the sample size is small.
T-tests calculate a "t" statistic that takes into account the difference between the means, the variability in the data, and the number of observations in each group. Based on the "t" statistic and the degrees of freedom (total observations in the two groups minus 2) one can look up the probability of observing a difference this great or greater if the null hypothesis were true.
T-tests are based on several assumptions:

1) that the data are reasonably close to being normally distributed
2) that the two samples have similar variance & standard deviation
3) that the observations are independent of each other.
Consider the WBC counts (in thousands) in two groups of patients:
Group 1   Group 2
4.5       4.2
5.0       7.2
5.3       8.0
5.3       3.5
6.0       6.3
6.0       5.1
7.6       4.6
7.7       4.8
6.4       2.0
7.2       5.0
7.0       5.4
5.6
8.4
8.3
9.5
15        11       N
6.7       5.1      Mean
2.10      2.77     Variance
1.45      1.66     SD
0.37      0.50     SEM (standard error of the mean)
0.02               Two-tailed p-value by t-test for equal variance
0.02               Two-tailed p-value by t-test for unequal variance
From a practical point of view Excel provides built in functions that make t-tests easy. Click on cell C44 to see the function used for a t-test with equal variance. One specifies:
• the cells where the first group's data is found,
• the cells where the second group's data is found,
• then whether it is a 2-tailed test or a 1-tailed test, and
• finally a "2" to indicate a test for equal variance.
If the variance is unequal, there is a modified calculation that one can get by specifying "3" as the last parameter in the function (compare the formulae in cells C44 & C45). As a rule of thumb, if one standard deviation is more than twice the other, you should use the unequal variance test.
Note also that the two groups do not have to have the same number of subjects.
Finally, note that in this case we are estimating the means in each group to test whether they are different; consequently, it is appropriate to calculate SEM, which is SD divided by the square root of N.
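The equal-variance and unequal-variance tests on the WBC data can be sketched side by side. The unequal-variance version below is Welch's t-test with the Welch-Satterthwaite degrees of freedom, which is assumed to correspond to the "3" option described above. As a stdlib-only stand-in for a statistics package, the p-value is computed by numerically integrating the t density; function names are illustrative.

```python
import math
from statistics import mean, variance

def t_two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value from the Student t density (Simpson's rule; df may be fractional)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    b = abs(t)
    h = b / steps
    total = pdf(0) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 1 - 2 * (total * h / 3)

def pooled_t(x, y):
    """Equal-variance t statistic and degrees of freedom (n1 + n2 - 2)."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny)), nx + ny - 2

def welch_t(x, y):
    """Unequal-variance (Welch) t statistic and Welch-Satterthwaite degrees of freedom."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return (mean(x) - mean(y)) / math.sqrt(vx + vy), df

group1 = [4.5, 5.0, 5.3, 5.3, 6.0, 6.0, 7.6, 7.7, 6.4, 7.2, 7.0, 5.6, 8.4, 8.3, 9.5]
group2 = [4.2, 7.2, 8.0, 3.5, 6.3, 5.1, 4.6, 4.8, 2.0, 5.0, 5.4]

p_equal = t_two_tailed_p(*pooled_t(group1, group2))
p_unequal = t_two_tailed_p(*welch_t(group1, group2))
sem1 = math.sqrt(variance(group1) / len(group1))
print(f"equal variance p = {p_equal:.3f}, unequal variance p = {p_unequal:.3f}, SEM group 1 = {sem1:.2f}")
```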
The t-test is a "parametric" test, because it relies on the legitimate use of the means and standard deviations, which are the parameters that define normally distributed continuous variables. If the groups you want to compare are clearly skewed (i.e. do not conform to a Normal distribution), you have two options:
1) Sometimes you can "transform" the data, e.g. by taking the log of each observation; if the log values are normally distributed, you can then do a t-test on the transformed data; this is legitimate.
2) You can use a "non-parametric" statistical test.
Second example: age by outcome
failed: 56, 37, 57, 39, 35, 40, 66, 19
ok: 19, 25, 38
           failed   ok
Mean       43.6     27.3
Variance   227.4    94.3
SD         15.1     9.7
Two-tailed p-value; ttest with unequal variance: 0.08

Age     Freq. failed   Freq OK
10-20   1              1
21-30                  1
31-40   4              1
41-50
51-60   2
61-70   1
[Chart: frequency of "failed" and "OK" by age group (10-20 through 61-70); y-axis from 0 to 4.5]
Correlation, Linear Regression, and the Line of Best Fit Main Menu
Example
"X"     "Y"
Weeks   Savings   Prediction
1       200       50.88679
2       850       566.2075
3       1300      1081.528
4       1500      1596.849
5       1578      2112.17
8       3000      3658.132
9       3600      4173.453
10      5900      4688.774
m= 515.32
b= -464.43
r= 0.942109
r2= 0.89
N= 8
t= 6.882303
p-value= 0.000235
[Chart: Savings (y-axis, 0 to 7000) versus Weeks (x-axis, 0 to 12)]
I used the graphing tool to plot the individual data points (blue diamonds) and the line of best fit (pink line).
The "dependent" variable (outcome of interest) here is savings and the independent variable is time. The data (X and Y values) are contained in the block of cells from B3 to C10. The analysis is performed using several functions that are built into Excel. The SLOPE "m" is calculated in cell H3 using the Excel function "=SLOPE(C3:C10,B3:B10)". So basically you need to specify where the data is, with the "Y" values specified first.
The Y-INTERCEPT "b" is calculated in H4 from the Excel function "=INTERCEPT(C3:C10,B3:B10)"; again, the data block is specified, giving the location of the "Y" values first. From these two parameters, one can now specify the line of best fit using the form Y=b + mX.
To calculate the correlation coefficient for this relationship one would use the Excel function "=CORREL(B3:B10,C3:C10)" which is located in H6. Finally, in H7, I squared what I got in H6 in order to calculate "r-squared", which indicates what percentage of the variability in savings is explained by time.
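The slope, intercept, and correlation coefficient can be reproduced from the usual least-squares sums. The sketch below computes them by hand rather than through a library, and also computes the t statistic as r x sqrt(N - 2) / sqrt(1 - r squared), which matches the value on the sheet. Variable names are illustrative.

```python
import math

weeks = [1, 2, 3, 4, 5, 8, 9, 10]
savings = [200, 850, 1300, 1500, 1578, 3000, 3600, 5900]

n = len(weeks)
mean_x = sum(weeks) / n
mean_y = sum(savings) / n
sxx = sum((x - mean_x) ** 2 for x in weeks)
syy = sum((y - mean_y) ** 2 for y in savings)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, savings))

m = sxy / sxx                      # slope, like =SLOPE(Y, X)
b = mean_y - m * mean_x            # intercept, like =INTERCEPT(Y, X)
r = sxy / math.sqrt(sxx * syy)     # correlation, like =CORREL(X, Y)
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(f"m = {m:.2f}, b = {b:.2f}, r = {r:.6f}, r2 = {r * r:.2f}, t = {t:.6f}")
print("prediction for week 1:", round(b + m * 1, 5))
```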
In order to perform analysis of variance you must first install the Excel "Analysis Tool-Pak". Click on "Tools" (above) and then on "Add-Ins" and select "Analysis Tool Pack" and "Analysis Tool-Pak - VBA"; then click "Ok". After installation, when you click on "Tools," you will see a new selection ("Data Analysis") at the bottom of the Tools menu. When you select "Data Analysis" you will see options for analysis of variance and other procedures.
Analysis of Variance (ANOVA) Main Menu
Copyright 2006
The columns of data below are serum creatinine levels among 4 groups of subjects. A one-factor analysis of variance can be performed to determine whether there are significant differences in the means of these groups.
Select the block of data (including column labels) from B2:E27. Then, from the upper menu, select "Tools", then "Data Analysis", then "Single Factor Analysis of Variance". Check the box for labels, and specify the Output Range as G12. The result is shown in the box below.
The p-value (0.0764) indicates differences in means that do not quite meet the alpha=0.05 criterion for statistical significance.
Controls   Aortoiliac   Fem-AK Pop   Fem-Distal
0.7        1.1          1.5          1.2
1.2        1.3          1.1          0.8
1.1        0.9          0.8          0.7
0.7        0.7          0.9          0.7
1.0        0.8          1.1          8.4
0.5        1.4          0.9          1.8
1.6        0.5          7.0          0.8
0.8        1.1          1.4          1.0
0.6        2.0          0.8          0.7
0.6        0.8          1.1          2.8
0.6        0.7          0.6          1.5
1.3        1.4          1.2          0.6
0.5        1.1          0.6          1.3
1.0        1.5          1.2          0.5
1.0        1.0          0.6          1.2
0.8        0.9          0.8          8.2
0.8        0.9          0.8          0.4
0.6        0.6          1.3          0.6
0.5        0.9          1.3          1.6
0.9        0.9          1.5          0.5
0.7        1.2          1.5          11.4
0.7        1.2          0.4          0.8
0.7        1.3          12.9         0.7
0.7        0.4          1.1          0.6
1.1        0.7          8.6          0.9
Means: 0.828   1.012   2.040   1.988

Anova: Single Factor
SUMMARY
Groups       Count   Sum    Average   Variance
Controls     25      20.7   0.828     0.07626667
Aortoiliac   25      25.3   1.012     0.1261
Fem-AK Pop   25      51     2.04      8.77333333
Fem-Distal   25      49.7   1.988     8.20193333

ANOVA
Source of Variation   SS         df   MS         F            P-value        F crit
Between Groups        30.3779    3    10.12597   2.35794221   0.0764786914   2.699393
Within Groups         412.2632   96   4.294408
Total                 442.6411   99
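The F statistic in the ANOVA table can be rebuilt from the SUMMARY block alone (counts, averages, and variances), which makes the arithmetic behind the table explicit. This is a minimal sketch with illustrative names; it reproduces SS, df, MS, and F, and compares F with the critical value printed on the sheet rather than computing a p-value.

```python
# (count, average, variance) for each group, from the SUMMARY block.
groups = {
    "Controls": (25, 0.828, 0.07626667),
    "Aortoiliac": (25, 1.012, 0.1261),
    "Fem-AK Pop": (25, 2.04, 8.77333333),
    "Fem-Distal": (25, 1.988, 8.20193333),
}

total_n = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * avg for n, avg, _ in groups.values()) / total_n

ss_between = sum(n * (avg - grand_mean) ** 2 for n, avg, _ in groups.values())
ss_within = sum((n - 1) * var for n, _, var in groups.values())
df_between = len(groups) - 1
df_within = total_n - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(f"SS between = {ss_between:.4f}, SS within = {ss_within:.4f}")
print(f"F = {f_stat:.4f} on {df_between} and {df_within} df")
print("exceeds F crit (2.699393, from the sheet):", f_stat > 2.699393)
```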
Survival Curves Main Menu
(Adapted from Kenneth Rothman's "Episheet".)
In the blue cells enter the initial # of subjects at risk (C8), and then the # of events and losses to follow up for each period.
         Initial            Lost to     Effective             Survival   Cumulative    95% Lower   95% Upper
Period   No. at Risk  Events  Follow-up   No. at Risk  Risk     Prob.      Surv. Prob.   Bound       Bound       sum q/pL
0        100          6       4           98.0         0.0612   0.9388     0.9388        0.8728      0.9716      0.000665
1        90           6       5           87.5         0.0686   0.9314     0.8744        0.7931      0.9267      0.001507
2        79           3       2           78.0         0.0385   0.9615     0.8408        0.7535      0.9012      0.002020
3        74           5       7           70.5         0.0709   0.9291     0.7811        0.6854      0.8540      0.003102
4        62           4       7           58.5         0.0684   0.9316     0.7277        0.6254      0.8106      0.004357
5        51           5       2           50.0         0.1000   0.9000     0.6550        0.5459      0.7498      0.006579
6        44           3       6           41.0         0.0732   0.9268     0.6070        0.4947      0.7091      0.008505
7        35           0       3           33.5         0.0000   1.0000     0.6070        0.4947      0.7091      0.008505
8        32           7       3           30.5         0.2295   0.7705     0.4677        0.3493      0.5899      0.018271
9        22           5       4           20.0         0.2500   0.7500     0.3508        0.2364      0.4854      0.034938
10       13           6       7           9.5          0.6316   0.3684     0.1292        0.0517      0.2879      0.215389
11       0
(Periods 12 through 47 are blank.)
[Chart: Survival Probability (y-axis, 0.00 to 1.00) by Time Period]
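The first eight columns of the table follow a standard life-table (actuarial) calculation, sketched below. It assumes the effective number at risk is the initial number minus half of those lost to follow-up in the period, which matches the sheet's values. The confidence bounds are not reproduced because the workbook's formula for them is not shown. Names are illustrative.

```python
def life_table(initial_at_risk, events, lost):
    """Return one row per period: (period, at risk, events, lost, effective, risk, survival, cumulative)."""
    rows = []
    at_risk = initial_at_risk
    cumulative = 1.0
    for period, (d, w) in enumerate(zip(events, lost)):
        effective = at_risk - w / 2   # assumption: losses count as half a period at risk
        risk = d / effective
        survival = 1 - risk
        cumulative *= survival        # product of the period survival probabilities
        rows.append((period, at_risk, d, w, effective, risk, survival, cumulative))
        at_risk -= d + w              # those remaining start the next period
    return rows

events = [6, 6, 3, 5, 4, 5, 3, 0, 7, 5, 6]
lost = [4, 5, 2, 7, 7, 2, 6, 3, 3, 4, 7]
rows = life_table(100, events, lost)
for period, at_risk, d, w, eff, risk, surv, cum in rows:
    print(f"{period:2d} {at_risk:4d} {d:2d} {w:2d} {eff:6.1f} {risk:.4f} {surv:.4f} {cum:.4f}")
```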
[Chart: Survival Curve; series: Cumulative Surv. Prob., 95% Lower Bound, 95% Upper Bound; x-axis: Time Period (1 to 11); y-axis: Survival Probability (0.00 to 1.00)]
Effective Size, periods 0-10: 98.0000, 95.3235, 93.7696, 90.3081, 85.8686, 80.0720, 76.1161, 76.1161, 62.2871, 52.9724, 31.2817
Case-Control Analysis from Ken Rothman's "Episheet.xls" (with permission)
This sheet enables you to perform crude and/or Mantel-Haenszel stratified analysis with up to 12 substrata.
It also computes a MH p-value, testing the null hypothesis OR=1, and it computes a p-value for the chi square test for homogeneity.
Instructions
1. Enter frequencies in the yellow cells of 2x2 tables; check against crude table below; CTL-e to clear.
2. Click button to the right to adjust column width if necessary.
3. The results appear in red. Scroll down or right if needed.
Table 1    Exposed   Unexposed   Total
Cases      31        14          45
Controls   1700      1700        3400
Total      1731      1714        3445
RR= 2.2143
Table 2    Exposed   Unexposed   Total
Cases      65        12          77
Controls   1700      1700        3400
Total      1765      32          1797
RR= 5.4167
[Chart: P-value Function; y-axis: P-value (0 to 1); x-axis: Relative Risk (0.1 to 10, log scale)]
Crude Data   Exposed   Unexposed   Total
Cases        96        26          122
Controls     3400      3400        6800
Total        3496      3426        6922
Crude RR = 3.6923
RRmh = 4.2051
90% Conf. Interv. = 2.8496 - 6.2055
95% Conf. Interv. = 2.6449 - 6.6856
99% Conf. Interv. = 2.2868 - 7.7326
P-value testing RR = 1: 0.5444
P-value for homogeneity: 0.0326
P-value Function Data series

Y X
z valu p-valow/upper bd vert bar
2.9 0 2 0 1 0
2.8 0 2 0 1 1
2.7 0 2 0
2.6 0 2 0
2.5 0 2 0
2.4 0 2 0
2.3 0 2 0
2.2 0 2 0
2.1 0 3 0
2 0 3 0
1.9 0.1 3 0.1
1.8 0.1 3 0.1
1.7 0.1 3 0.1
1.6 0.1 3 0.1
1.5 0.1 3 0.1
1.4 0.2 3 0.2
1.3 0.2 3 0.2
1.2 0.2 3 0.2
1.1 0.3 3 0.3
1 0.3 3 0.3
0.9 0.4 3 0.4
0.8 0.4 3 0.4
0.7 0.5 4 0.5
0.6 1 4 1
0.5 1 4 1
0.4 1 4 1
0.3 1 4 1
0.2 1 4 1
0.1 1 4 1
0 1 4 1
0.1 1 4 1
0.2 1 4 1
0.3 1 5 1
0.4 1 5 1
0.5 1 5 1
0.6 1 5 1
0.7 0.5 5 0.5
0.8 0.4 5 0.4
0.9 0.4 5 0.4
1 0.3 5 0.3
1.1 0.3 5 0.3
1.2 0.2 6 0.2
1.3 0.2 6 0.2
1.4 0.2 6 0.2
1.5 0.1 6 0.1
1.6 0.1 6 0.1
1.7 0.1 6 0.1
1.8 0.1 6 0.1
1.9 0.1 7 0.1
2 0 7 0
2.1 0 7 0
2.2 0 7 0
2.3 0 7 0
2.4 0 7 0
2.5 0 8 0
2.6 0 8 0
2.7 0 8 0
2.8 0 8 0
2.9 0 8 0
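The data series above appears to tabulate a p-value function: for each z value, a two-sided p-value and the relative-risk bound at that z. The following Python sketch regenerates such a series under that assumption, using the RRmh (4.2051) and Var(ln(RRMH)) (0.0559606561599416) reported on this sheet. The function name and the grid arguments are hypothetical; the sheet's own formulas are not shown.

```python
import math
from statistics import NormalDist


def p_value_function(rr_mh, var_ln_rr, z_max=2.9, step=0.1):
    """Return (z, two-sided p-value, RR bound) rows for z from -z_max to +z_max.

    Assumes ln(RR) is approximately normal, so the bound at a given z is
    exp(ln(RRmh) + z * SE), where SE = sqrt(Var(ln(RRmh))).
    """
    se = math.sqrt(var_ln_rr)
    n_steps = round(2 * z_max / step)
    rows = []
    for i in range(n_steps + 1):
        z = round(-z_max + i * step, 10)  # avoid floating-point drift
        p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
        bound = math.exp(math.log(rr_mh) + z * se)
        rows.append((z, p, bound))
    return rows


rows = p_value_function(4.2051, 0.0559606561599416)
for z, p, bound in rows[::10]:
    print(f"z={z:+.1f}  p={p:.4f}  RR bound={bound:.4f}")
```

Rounded to whole numbers, the bounds run from about 2 at the lowest z to about 8 at the highest, which agrees with the low/upper bd column in the series.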
Intermediate calculations (tables 3 to 12 are empty)

table  RR           A    E            V            G=ad/t
1      2.214285714  ###  22.61103048  11.10600133  15.29753266
2      5.416666667  ###  75.62882582  2.54954008   61.49137451
Tot                 ###  98.2398563   13.65554141  76.78890717

table  H=bc/t       P=(a+d)/T    Q=(b+c)/T    GP           GQ+HP        HQ           ln(RR)
1      6.908563135  0.502467344  0.497532656  7.686510603  11.08234942  3.437235766  0.794929875
2      11.35225376  0.982192543  0.952698943  60.39636951  69.73286647  10.81528015  1.68948062
Tot    18.26081689  1.484659887  1.450231599  68.08288012  80.81521589  14.25251592

table  var(ln(RR))  chisq het
1      0.104863107  3.922818633
2      0.099894419  0.641668673
Tot                 4.564487306

RRmh 90% Conf. Interv. = 2.8496 - 6.2055
RRmh 95% Conf. Interv. = 2.6449 - 6.6856
RRmh 99% Conf. Interv. = 2.2868 - 7.7326
MH chi = -0.60612988170105
Var(ln(RRMH)) = 0.0559606561599416
df = 1
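The quantities G=ad/t and H=bc/t used on this sheet are the building blocks of the Mantel-Haenszel summary odds ratio, sum(G)/sum(H). A minimal Python sketch follows, assuming each stratum total T is recomputed from its four cells. The class and function names are hypothetical and this is not the workbook's own code.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    a: int  # exposed cases
    b: int  # unexposed cases
    c: int  # exposed controls
    d: int  # unexposed controls

    @property
    def total(self) -> int:
        return self.a + self.b + self.c + self.d

    @property
    def odds_ratio(self) -> float:
        return (self.a * self.d) / (self.b * self.c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio: sum(a*d/T) / sum(b*c/T) over strata."""
    g = sum(s.a * s.d / s.total for s in strata)
    h = sum(s.b * s.c / s.total for s in strata)
    return g / h


strata = [Stratum(31, 14, 1700, 1700), Stratum(65, 12, 1700, 1700)]
crude = Stratum(96, 26, 3400, 3400)

for i, s in enumerate(strata, start=1):
    print(f"table {i}: OR = {s.odds_ratio:.4f}")
print(f"crude OR = {crude.odds_ratio:.4f}")
print(f"MH OR    = {mantel_haenszel_or(strata):.4f}")
```

The stratum-specific and crude values reproduce the sheet's 2.2143, 5.4167 and 3.6923. The pooled value does not reproduce the sheet's RRmh of 4.2051, because the sheet's table 2 shows a Total of 1797 while the four cells sum to 3477; the sketch uses the cell sum.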
Cohort Cumulative Incidence Analysis from Ken Rothman's "Episheet.xls" (with permission)
This sheet enables you to perform crude and/or Mantel-Haenszel stratified analysis with up to 12 substrata.
It also computes a MH p-value, testing the null hypothesis RR=1, and it computes a p-value for the chi square test for homogeneity.

Instructions
1. Enter frequencies in yellow cells of 2x2 tables; check against crude table below; CTL-e to clear.
2. Click button to the right to adjust column width if necessary.
3. The results appear in red. Scroll down or right if needed.

References:
1. Modern Epidemiol, 3rd Ed., Ch. 15
2. Sato T, Biometrics 1989;45:1323-4

Table 1      Exposed  Unexposed  Total
Cases             31         14     45
Non-cases       1969       1986   3955
Total           2000       2000   4000
RR= 2.21428571
RD= 0.0085

Table 2      Exposed  Unexposed  Total
Cases             90         15    105
Non-cases       1910       1985   3895
Total           2000       2000   4000
RR= 6
RD= 0.0375

Crude Data   Exposed  Unexposed  Total
Cases            121         29    150
Non-cases       3879       3971   7850
Total           4000       4000   8000
Crude RR = 4.1724
Crude RD = 0.0230
RDmh = 0.0230

[Chart: P-value Function. Y-axis: P-value (0 to 1). X-axis: Relative Risk (0.1 to 10).]

P-value Function Data series
Columns: z value, p-value, low/upper bd, p-value (repeated), vert bar (two values, first two rows only)
2.9 0 2 0 1 0
2.8 0 2 0 1 1
2.7 0 2 0
2.6 0 2 0
2.5 0 2 0
2.4 0 3 0
2.3 0 3 0
2.2 0 3 0
2.1 0 3 0
2 0 3 0
1.9 0.1 3 0.1
1.8 0.1 3 0.1
1.7 0.1 3 0.1
1.6 0.1 3 0.1
1.5 0.1 3 0.1
1.4 0.2 3 0.2
1.3 0.2 3 0.2
1.2 0.2 3 0.2
1.1 0.3 3 0.3
1 0.3 3 0.3
0.9 0.4 3 0.4
0.8 0.4 4 0.4
0.7 0.5 4 0.5
0.6 1 4 1
0.5 1 4 1
0.4 1 4 1
0.3 1 4 1
0.2 1 4 1
0.1 1 4 1
0 1 4 1
0.1 1 4 1
0.2 1 4 1
0.3 1 4 1
0.4 1 5 1
0.5 1 5 1
0.6 1 5 1
0.7 0.5 5 0.5
0.8 0.4 5 0.4
0.9 0.4 5 0.4
1 0.3 5 0.3
1.1 0.3 5 0.3
1.2 0.2 5 0.2
1.3 0.2 5 0.2
1.4 0.2 6 0.2
1.5 0.1 6 0.1
1.6 0.1 6 0.1
1.7 0.1 6 0.1
1.8 0.1 6 0.1
1.9 0.1 6 0.1
2 0 6 0
2.1 0 6 0
2.2 0 7 0
2.3 0 7 0
2.4 0 7 0
2.5 0 7 0
2.6 0 7 0
2.7 0 7 0
2.8 0 7 0
2.9 0 8 0
Intermediate calculations (tables 3 to 12 are empty)

table  RD     RR    A    E     V      aNo/T  bN1/T  M1N1N0/T^2 - ab/T  ln(RR)
1      0.009  2.21  31   22.5  11.13  15.5   7      11.1               0.795
2      0.038  6     90   52.5  25.57  45     7.5    25.9               1.792
Tot                 121  75    36.69  60.5   14.5   37.1

table  var(ln(RR))  RR chisq het  N1N0/T  (aN0 - bN1)/T  Sato Pk  Sato Qk  var(RD)   RD chisq het
1      0.1027       3.909023      1000    8.5            -4.25    11.1415  1.11E-05  18.932
2      0.0768       1.718743      1000    37.5           -18.75   25.9125  2.52E-05  8.3402
Tot                 5.627766      2000    46             -23      37.054             27.272

RRmh 90% Conf. Interv. = 2.9755 - 5.8508
RRmh 95% Conf. Interv. = 2.7890 - 6.2421
RRmh 99% Conf. Interv. = 2.4578 - 7.0831
MH chi = 7.59386681387112
Var(ln(RRMH)) = 0.042238814477059
P-value testing RR = 1: 0.0000
P-value for homogeneity: 0.0177

RDmh 90% Conf. Interv. = 0.0180 - 0.0280
RDmh 95% Conf. Interv. = 0.0171 - 0.0289
RDmh 99% Conf. Interv. = 0.0152 - 0.0308
Var(RDMH) = 0.00000913125
P-value testing RD = 0: 0.0000
P-value for homogeneity: 0.0000
df = 1
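The column formulas on this sheet (aNo/T, bN1/T, M1N1N0/T^2 - ab/T, (aN0 - bN1)/T, N1N0/T) can be combined into the Mantel-Haenszel risk ratio, its confidence interval, and the Mantel-Haenszel risk difference. The Python sketch below is one way to do that, assuming a normal approximation on ln(RR). The function name and the tuple layout are hypothetical.

```python
import math
from statistics import NormalDist


def mh_cumulative_incidence(strata, level=0.95):
    """strata: list of (a, n1, b, n0) = exposed cases, exposed total,
    unexposed cases, unexposed total."""
    rr_num = rr_den = var_num = rd_num = rd_den = 0.0
    for a, n1, b, n0 in strata:
        t = n1 + n0
        m1 = a + b
        rr_num += a * n0 / t                           # aN0/T
        rr_den += b * n1 / t                           # bN1/T
        var_num += m1 * n1 * n0 / t**2 - a * b / t     # M1N1N0/T^2 - ab/T
        rd_num += (a * n0 - b * n1) / t                # (aN0 - bN1)/T
        rd_den += n1 * n0 / t                          # N1N0/T
    rr = rr_num / rr_den
    var_ln_rr = var_num / (rr_num * rr_den)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(var_ln_rr)
    return {
        "rr_mh": rr,
        "var_ln_rr": var_ln_rr,
        "rr_ci": (math.exp(math.log(rr) - half_width),
                  math.exp(math.log(rr) + half_width)),
        "rd_mh": rd_num / rd_den,
    }


result = mh_cumulative_incidence([(31, 2000, 14, 2000), (90, 2000, 15, 2000)])
print(result)
```

With the two tables entered on this sheet, the sketch gives Var(ln(RRMH)) of about 0.04224, a 95% interval of about 2.7890 to 6.2421, and RDmh of 0.0230, matching the values the sheet reports.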
Cohort Incidence Rate Analysis from Ken Rothman's "Episheet.xls" (with permission)
It also computes a MH p-value, testing the null hypothesis RR=1, and it computes a p-value for the chi square test for homogeneity.

Instructions
1. Enter events and person-time in yellow cells of tables on the right; check entries against crude table below; CTL-e to clear.
2. Click button to the right to adjust column width if needed.
3. The results appear in red. Scroll down or right if needed.
Ref: Mod. Epid. 3rd ed. Ch. 15

Table 1      Exposed  Unexposed  Total
Cases             32          2     34
Person-time    52407      18790  71197
RR= 5.73663823535
RD= 0.00050416586

Table 2      Exposed  Unexposed  Total
Cases            104         12    116
Person-time    43248      10673  53921
RR= 2.138811814
RD= 0.0012804031

Table 3      Exposed  Unexposed  Total
Cases            206         28    234
Person-time    28612       5710  34322
RR= 1.468240099
RD= 0.002296099

Table 4      Exposed  Unexposed  Total
Cases            186         28    214
Person-time    12663       2585  15248
RR= 1.356059837
RD= 0.003856741

Table 5      Exposed  Unexposed  Total
Cases            102         31    133
Person-time     5317       1462   6779
RR= 0.90473041431
RD= -0.00202008013

Crude Data   Exposed  Unexposed   Total
Cases            630        101     731
Person-time   142247      39220  181467
Crude RR = 1.7198
Crude RD = 0.0019
RDmh = 0.0011439

[Chart: P-value Function. Y-axis: P-value (0 to 1). X-axis: Relative Risk (0.1 to 10).]

P-value Function Data series
Columns: z value, p-value, low/upper bd, p-value (repeated), vert bar (two values, first two rows only)
2.9 0 1 0 1 0
2.8 0 1 0 1 1
2.7 0 1 0
2.6 0 1 0
2.5 0 1 0
2.4 0 1 0
2.3 0 1 0
2.2 0 1 0
2.1 0 1 0
2 0 1 0
1.9 0.1 1 0.1
1.8 0.1 1 0.1
1.7 0.1 1 0.1
1.6 0.1 1 0.1
1.5 0.1 1 0.1
1.4 0.2 1 0.2
1.3 0.2 1 0.2
1.2 0.2 1 0.2
1.1 0.3 1 0.3
1 0.3 1 0.3
0.9 0.4 1 0.4
0.8 0.4 1 0.4
0.7 0.5 1 0.5
0.6 1 1 1
0.5 1 1 1
0.4 1 1 1
0.3 1 1 1
0.2 1 1 1
0.1 1 1 1
0 1 1 1
0.1 1 1 1
0.2 1 1 1
0.3 1 1 1
0.4 1 1 1
0.5 1 2 1
0.6 1 2 1
0.7 0.5 2 0.5
0.8 0.4 2 0.4
0.9 0.4 2 0.4
1 0.3 2 0.3
1.1 0.3 2 0.3
1.2 0.2 2 0.2
1.3 0.2 2 0.2
1.4 0.2 2 0.2
1.5 0.1 2 0.1
1.6 0.1 2 0.1
1.7 0.1 2 0.1
1.8 0.1 2 0.1
1.9 0.1 2 0.1
2 0 2 0
2.1 0 2 0
2.2 0 2 0
2.3 0 2 0
2.4 0 2 0
2.5 0 2 0
2.6 0 2 0
2.7 0 2 0
2.8 0 2 0
2.9 0 2 0

Intermediate calculations (tables 6 to 12 are empty)

table  RD             RR            A    E
1      0.0005041659   5.7366382354  32   25.02686911
2      0.0012804031   2.138811814   104  93.03922405
3      0.0022960986   1.4682400991  206  195.07045044
4      0.003856741    1.3560598369  186  177.72048793
5      -0.0020200801  0.9047304143  102  104.31641835
Tot                                 630  595.17344988

table  V = M1PT1PT0/T^2  G = aPT0/T    H = bPT1/T    RR chisq
1      6.6049815381      8.4452996615  1.4721687712  3.6522151216
2      18.415972224      20.585523266  9.6247473155  1.7760455491
3      32.45301183       34.271312861  23.341763301  0.0223562547
4      30.129030778      31.532660021  23.253147954  0.0593063549
5      22.497507542      21.997934799  24.314353149  4.9017361112
Tot    110.10050391      116.83273061  82.00618049   10.411659392
df = 4

table  PT1PT0/T      WmhRD           num of VarRD  RD chisq
1      13831.025605  6.97313089035   3.3124868945  23.636285138
2      8560.4106749  10.96077595     11.794298357  0.1157396089
3      4760.0524445  10.92954956     25.160064332  1.1955071883
4      2146.7638379  8.27951206716   24.656777197  1.3755452387
5      1146.6962679  -2.3164183508   23.814780406  0.5527425286
Tot    30444.94883   34.8265501168   88.738407186  26.875819702

RRmh 90% Conf. Interv. = 1.1944 - 1.6994
RRmh 95% Conf. Interv. = 1.1547 - 1.7578
RRmh 99% Conf. Interv. = 1.0810 - 1.8776
MH chi = 3.31906534265252
Var(ln(RRMH)) = 0.0114915389874771
P-value testing RR = 1: 0.0009
P-value for homogeneity: 0.0340

RDmh 90% Conf. Interv. = 0.0006349 - 0.0016529
RDmh 95% Conf. Interv. = 0.0005375 - 0.0017504
RDmh 99% Conf. Interv. = 0.0003472 - 0.0019407
Var(RDMH) = 9.57372904933489E-08
P-value testing RD = 0: 0.0009
P-value for homogeneity: 0.0000
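For person-time data the same Mantel-Haenszel idea applies with aPT0/T, bPT1/T and M1PT1PT0/T^2 as the per-stratum terms. The Python sketch below is one way to combine them, assuming a normal approximation on ln(RR) for the interval. The function name and tuple layout are hypothetical.

```python
import math
from statistics import NormalDist


def mh_incidence_rate(strata, level=0.95):
    """strata: list of (a, pt1, b, pt0) = exposed cases, exposed person-time,
    unexposed cases, unexposed person-time."""
    g = h = v = rd_num = rd_den = 0.0
    for a, pt1, b, pt0 in strata:
        t = pt1 + pt0
        g += a * pt0 / t                      # aPT0/T
        h += b * pt1 / t                      # bPT1/T
        v += (a + b) * pt1 * pt0 / t**2       # M1PT1PT0/T^2
        rd_num += (a * pt0 - b * pt1) / t
        rd_den += pt1 * pt0 / t               # PT1PT0/T
    rr = g / h
    var_ln_rr = v / (g * h)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(var_ln_rr)
    return {
        "rr_mh": rr,
        "var_ln_rr": var_ln_rr,
        "rr_ci": (math.exp(math.log(rr) - half_width),
                  math.exp(math.log(rr) + half_width)),
        "rd_mh": rd_num / rd_den,
    }


strata = [
    (32, 52407, 2, 18790),
    (104, 43248, 12, 10673),
    (206, 28612, 28, 5710),
    (186, 12663, 28, 2585),
    (102, 5317, 31, 1462),
]
result = mh_incidence_rate(strata)
print(result)
```

Run on the five tables entered on this sheet, the sketch gives Var(ln(RRMH)) of about 0.01149, a 95% interval of about 1.1547 to 1.7578, and RDmh of about 0.0011439, in line with the sheet's reported results.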

Direct Standardization (Birth Defects in Minnesota vs. Illinois)
Suppose a study was conducted in Illinois to investigate the hypothesis that birth defects occurred more often in Illinois as compared to Minnesota. However, in this new study the authors thought that the type of water consumed could be related to birth defects. They wanted to adjust (standardize) the rates of defects in the two states for water type. Data from the two studies are compared as below.

Minnesota
Column key: Dist. = Distribution of the Combined Populations; % = share of this state's subjects; Rate = Proportion or "Rate"; the last two columns are unlabeled in the sheet and equal Dist. x Rate and (Dist. x SE)^2.

Stratum (e.g. age)  #  Dist.  %     Number of Events  Number of Subjects  Rate     SE       Dist. x Rate  (Dist. x SE)^2
Well water          1  0.29   0.76  93                3379                0.02752  0.00281  0.00794493    0.00000066
City water          2  0.09   0.20  27                874                 0.03089  0.00585  0.00275294    0.00000027
Bottled water       3  0.62   0.05  5                 206                 0.02427  0.01072  0.01510244    0.00004451
Totals                 1.00   1.00  125               4459                                  0.02580031    0.00004544
Crude Rate 0.02803
Standardized Proportion or "Rate" 0.02580
Standard Error 0.00674

Illinois
Stratum  Dist.  %     Number of Events  Number of Subjects  Rate     SE       Dist. x Rate  (Dist. x SE)^2
1        0.29   0.01  2                 100                 0.02000  0.01400  0.00577332    0.00001633
2        0.09   0.03  6                 200                 0.03000  0.01206  0.00267342    0.00000116
3        0.62   0.96  145               7293                0.01988  0.00163  0.01237103    0.00000103
Totals   1.00         153               7593                                  0.02081777    0.00001852
Crude Rate 0.02015

Crude RR 1.39
Adjusted RR 1.24

Exercise
Births by state and water type

                    Minnesota Pesticide Appliers            Illinois Pesticide Appliers
Water Type          Normal (#)  With anomalies (#)  rate*   Normal (#)  With anomalies (#)  rate*
Well water only     3379        93                  26.8    100         2                   ____
City water only     874         27                  30.0    200         6                   ____
Bottled water only  206         5                   23.7    7293        145                 ____
Total               4456        125                 28.0    7593        153                 ____
* per 1000 live births

a. Calculate the crude rate and the water-type specific rates for Illinois. Briefly describe how these two states compare in crude rates of birth anomalies. (4 pts)
b. Using the combined number of live births as a standard, calculate a standardized rate (standardized for water type) for each of the states. Briefly describe how these standardized rates compare with each other and reasons why they may or may not agree with the crude rates. (6 pts)
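The direct standardization in this worksheet weights each stratum-specific rate by that stratum's share of the combined population and sums the results. A minimal Python sketch follows, using the Events and Subjects counts from the Minnesota and Illinois tables. The function names are hypothetical, and the binomial form of the standard error is an assumption that happens to agree with the SE values shown.

```python
import math


def standardized_rate(events, subjects, weights):
    """Directly standardized rate: sum of weight * stratum rate."""
    return sum(w * e / n for w, e, n in zip(weights, events, subjects))


def standardized_se(events, subjects, weights):
    """SE of the standardized rate, assuming binomial variance p(1-p)/n per stratum."""
    var = 0.0
    for w, e, n in zip(weights, events, subjects):
        p = e / n
        var += w**2 * p * (1 - p) / n
    return math.sqrt(var)


# Strata: well water, city water, bottled water
mn_events, mn_subjects = [93, 27, 5], [3379, 874, 206]
il_events, il_subjects = [2, 6, 145], [100, 200, 7293]

# Standard population: the two states combined
combined = [m + i for m, i in zip(mn_subjects, il_subjects)]
weights = [c / sum(combined) for c in combined]

mn_std = standardized_rate(mn_events, mn_subjects, weights)
il_std = standardized_rate(il_events, il_subjects, weights)
mn_se = standardized_se(mn_events, mn_subjects, weights)

crude_rr = (sum(mn_events) / sum(mn_subjects)) / (sum(il_events) / sum(il_subjects))
adjusted_rr = mn_std / il_std

print(f"Minnesota standardized rate = {mn_std:.5f} (SE {mn_se:.5f})")
print(f"Illinois standardized rate  = {il_std:.5f}")
print(f"Crude RR = {crude_rr:.2f}, adjusted RR = {adjusted_rr:.2f}")
```

The sketch reproduces the worksheet's standardized rates (0.02580 and 0.02082), the crude RR of 1.39 and the adjusted RR of 1.24, which shows how adjusting for water type narrows the difference between the states.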
Wishbringer: Calculates how body weight and BMI would diminish when new activities are added.
This worksheet will compute how your body weight and BMI would change over the course of one year if you added certain physical activities to your life. You need to enter your initial body weight, your height, and your gender. [Note: entries for gender and new activities are drop down menus. Click on the appropriate cell, and a down arrow will appear which you click on to see the choices.] Then select 1-3 new activities that you might be able to add to your weekly schedule on a regular basis. Also choose the average duration and the average number of times per week that you would engage in that activity. Assuming no other changes in diet or activity, the program will compute predicted changes in your weight and BMI over the course of a year, assuming that you engage in these activities according to the duration and frequency you selected. The program automatically adjusts for the fact that you burn fewer calories at a given activity as your weight decreases. It also adjusts for changes in your basal metabolic rate, taking into account your age, weekly body weight, height, and gender.

Lamorte, Wayne W: Note that each week we use the previous week's weight to compute calorie expenditure, so this takes into account burning fewer calories during activity as one's weight decreases. In addition, it adjusts for changes in BMR by adding back in the cumulative decline*7 each week.

Please enter your:
Initial Body Wgt (lb)  138
Hgt (inches)           64
Gender                 female
Age (yrs.)             26
Your Initial Body Mass Index (BMI)= 23.7

Select 1 to 3 new activities. Enter duration & frequency for each.

Additional Activities                            # Min. each time  # Times/week  slope
Activity #1  Walking, 4.0 mph, very brisk pace   60                5             1.817431
Activity #2  Basketball, nongame, general        60                5             2.733945
Activity #3  Bicycling, <10mph, leisure          60                1             1.817431

[Chart: Predicted Change in Body Weight Over 1 Year. Y-axis: 100 to 150. X-axis: 0 to 50.]
[Chart: Predicted Change in BMI Over 1 Year. Y-axis: 15.0 to 40.0. X-axis: 0 to 50.]

Wks Pred Wgt Pred BMI Actual Wgt Actual BMI BMR BMR delta
1 137 23.6 220 37.8 1431.6 0
2 136 23.4 1427.4 -4.2
3 136 23.3 1423.2 -4.1
4 135 23.1 1419.2 -4.1
5 134 22.9 1415.2 -4.0
6 133 22.8 1411.2 -3.9
7 132 22.6 1407.3 -3.9
8 131 22.5 1403.5 -3.8
9 130 22.3 1399.8 -3.8
10 129 22.2 1396.1 -3.7
11 128 22.0 1392.4 -3.6
12 128 21.9 1388.8 -3.6
13 127 21.8 1385.3 -3.5
14 126 21.6 1381.8 -3.5
15 125 21.5 1378.4 -3.4
16 124 21.4 1375.1 -3.4
17 124 21.2 1371.8 -3.3
18 123 21.1 1368.5 -3.3
19 122 21.0 1365.3 -3.2
20 122 20.9 1362.1 -3.2
21 121 20.7 1359.0 -3.1
22 120 20.6 1356.0 -3.1
23 119 20.5 1353.0 -3.0
24 119 20.4 1350.0 -3.0
25 118 20.3 1347.1 -2.9
26 117 20.1 1344.2 -2.9
27 117 20.0 1341.4 -2.8
28 116 19.9 1338.6 -2.8
29 115 19.8 1335.9 -2.7
30 115 19.7 1333.2 -2.7
31 114 19.6 1330.5 -2.7
32 114 19.5 1327.9 -2.6
33 113 19.4 1325.3 -2.6
34 112 19.3 1322.8 -2.5
35 112 19.2 1320.3 -2.5
36 111 19.1 1317.9 -2.4
37 111 19.0 1315.5 -2.4
38 110 18.9 1313.1 -2.4
39 110 18.8 1310.8 -2.3
40 109 18.7 1308.5 -2.3
41 109 18.6 1306.2 -2.3
42 108 18.6 1304.0 -2.2
43 108 18.5 1301.8 -2.2
44 107 18.4 1299.6 -2.2
45 107 18.3 1297.5 -2.1
46 106 18.2 1295.4 -2.1
47 106 18.1 1293.3 -2.1
48 105 18.1 1291.3 -2.0
49 105 18.0 1289.3 -2.0
50 104 17.9 1287.4 -2.0
51 104 17.8 1285.4 -1.9
52 103 17.8 1283.5 -1.9
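The week-by-week logic described for this worksheet (calories burned depend on the previous week's weight, and the cumulative decline in BMR times 7 is added back each week) can be sketched in Python as follows. Several values are assumptions that do not appear in the worksheet: 3500 kcal per pound of body weight, a BMR change of 4.35 kcal/day per pound (a Harris-Benedict-style coefficient), and the hypothetical function names. BMI uses the usual 703 x lb / in^2 form. The worksheet's actual BMR formula is not shown, so this is an approximation rather than a reproduction.

```python
def bmi(weight_lb, height_in):
    """Body mass index from pounds and inches."""
    return 703.0 * weight_lb / height_in**2


def simulate(initial_weight_lb, height_in, activities, weeks=52,
             kcal_per_lb=3500.0, bmr_kcal_per_lb=4.35):
    """activities: list of (slope in kcal/hr/lb, minutes each time, times per week).

    kcal_per_lb and bmr_kcal_per_lb are assumed constants, not worksheet values.
    """
    weight = initial_weight_lb
    rows = []
    for week in range(1, weeks + 1):
        # Calories burned this week, based on the previous week's weight
        burned = sum(slope * weight * (minutes / 60.0) * times
                     for slope, minutes, times in activities)
        # Cumulative daily BMR decline since the start, times 7 days
        bmr_decline_week = bmr_kcal_per_lb * (initial_weight_lb - weight) * 7
        weight -= (burned - bmr_decline_week) / kcal_per_lb
        rows.append((week, weight, bmi(weight, height_in)))
    return rows


activities = [
    (1.817431, 60, 5),  # Walking, 4.0 mph, very brisk pace
    (2.733945, 60, 5),  # Basketball, nongame, general
    (1.817431, 60, 1),  # Bicycling, <10mph, leisure
]
rows = simulate(138, 64, activities)
print(f"initial BMI = {bmi(138, 64):.1f}")
for week, weight, index in rows[::13] + [rows[-1]]:
    print(f"week {week:2d}: weight {weight:.1f} lb, BMI {index:.2f}")
```

Under these assumptions the sketch starts from a BMI of 23.7 and ends the year near 103 lb, close to the worksheet's week 52 prediction, although individual early weeks differ slightly.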
Wks Pred BMI cum delta (BMR); "." marks a value missing from the sheet
1 23.59276 0
2 23.42711 -4.2
3 23.26407 -8.3
4 23.10359 -12.4
5 22.94563 -16.4
6 22.79015 -20.3
7 22.63712 -24.2
8 22.48649 -28.0
9 22.33823 -31.8
10 22.1923 -35.5
11 22.04867 -39.1
12 21.9073 -42.7
13 21.76814 -46.2
14 21.63118 -49.7
15 21.49637 -53.1
16 21.36367 -56.5
17 21.23307 -59.8
18 21.10452 .
19 20.97798 .
20 20.85344 .
21 20.73086 -72.5
22 . -75.6
23 . -78.6
24 . -81.6
25 20.25949 -84.5
26 20.14624 -87.4
27 20.03477 -90.2
28 19.92506 -93.0
29 19.81707 -95.7
30 19.71078 -98.4
31 19.60616 -101.0
32 19.50318 -103.7
33 19.40183 -106.2
34 19.30206 -108.7
35 19.20387 -111.2
36 19.10722 -113.7
37 19.01208 -116.1
38 18.91845 -118.5
39 18.82628 -120.8
40 18.73557 -123.1
41 18.64628 -125.4
42 18.55839 -127.6
43 18.47189 -129.8
44 18.38675 -131.9
45 18.30294 -134.1
46 18.22045 -136.2
47 18.13926 -138.2
48 18.05935 -140.2
49 17.98069 -142.2
50 17.90327 -144.2
51 17.82706 -146.1
52 17.75206 -148.0

ACTIVITY BW=130 BW=155 BW=190 slope (kcal/hr/lb)
Aerobics, general 354 422 518 2.733945
Aerobics, high impac 413 493 604 3.182569
Aerobics, low impact 295 352 431 2.266055
Backpacking, genera 413 493 604 3.182569
Basketball, game 472 563 690 3.633028
Basketball, nongame 354 422 518 2.733945
Basketball, officiatin 413 493 604 3.182569
Basketball, shooting 266 317 388 2.033028
Basketball, wheelcha 384 457 561 2.951376
Bicycling, <10mph, l 236 281 345 1.817431
Bicycling, >20mph, r 944 1126 1380 7.266055
Bicycling, 10-11.9mph 354 422 518 2.733945
Bicycling, 12-13.9mp 472 563 690 3.633028
Bicycling, 14-15.9mph 590 704 863 4.549541
Bicycling, 16-19mph, 708 844 1035 5.450459
Bicycling, BMX or m 502 598 733 3.850459
Bicycling, stationary 295 352 431 2.266055
Bicycling, stationary, 325 387 474 2.483486
Bicycling, stationary 413 493 604 3.182569
Bowling 177 211 259 1.366972
Boxing, in ring, gene 708 844 1035 5.450459
Boxing, punching ba 354 422 518 2.733945
Boxing, sparring 531 633 776 4.083486
Broomball 413 493 604 3.182569
Calisthenics (pushups 472 563 690 3.633028
Calisthenics, home, l 266 317 388 2.033028
Canoeing, on campin 236 281 345 1.817431
Canoeing, rowing, >6 708 844 1035 5.450459
Canoeing, rowing, cr 708 844 1035 5.450459
Canoeing, rowing, lig 177 211 259 1.366972
Canoeing, rowing, mo 413 493 604 3.182569
Carpentry, general 207 246 302 1.584404
Carrying heavy loads 472 563 690 3.633028
Circuit training, gene 472 563 690 3.633028
Cleaning, heavy, vigo 266 317 388 2.033028
Cleaning, house, gen 207 246 302 1.584404
Cleaning, light, mode 148 176 216 1.133945
Coaching: football, s 236 281 345 1.817431
Dancing, aerobic, bal 354 422 518 2.733945
Dancing, ballroom, f 325 387 474 2.483486
Dancing, ballroom, s 177 211 259 1.366972
Dancing, general 266 317 388 2.033028
Darts, wall or lawn 148 176 216 1.133945
Diving, springboard 177 211 259 1.366972
Fencing 354 422 518 2.733945
Football or baseball, 148 176 216 1.133945
Football, competitive 531 633 776 4.083486
Football, touch, flag 472 563 690 3.633028
Frisbee playing, gen 177 211 259 1.366972
Frisbee, ultimate 207 246 302 1.584404
Gardening, general 295 352 431 2.266055
Golf, carrying clubs 325 387 474 2.483486
Golf, general 236 281 345 1.817431
Golf, miniature or dr 177 211 259 1.366972
Golf, pulling clubs 295 352 431 2.266055
Golf, using power ca 207 246 302 1.584404
Gymnastics, general 236 281 345 1.817431
Health club exercise 325 387 474 2.483486
Hiking, cross countr 354 422 518 2.733945
Hockey, field 472 563 690 3.633028
Hockey, ice 472 563 690 3.633028
Horse grooming 354 422 518 2.733945
Horse racing, gallop 472 563 690 3.633028
Horseback riding, ge 236 281 345 1.817431
Horseback riding, tro 384 457 561 2.951376
Horseback riding, wa 148 176 216 1.133945
Jogging, general 413 493 604 3.182569
Judo, karate, kick b 590 704 863 4.549541
Kayaking 295 352 431 2.266055
Kickball 413 493 604 3.182569
Lacrosse 472 563 690 3.633028
Moving furniture, ho 354 422 518 2.733945
Moving household ite 531 633 776 4.083486
Moving household it 413 493 604 3.182569
Mowing lawn, genera 325 387 474 2.483486
Mowing lawn, riding 148 176 216 1.133945
Paddleboat 236 281 345 1.817431
Polo 472 563 690 3.633028
Race walking 384 457 561 2.951376
Racquetball, casual, 413 493 604 3.182569
Racquetball, competi 590 704 863 4.549541
Raking lawn 236 281 345 1.817431
Rock climbing, asce 649 774 949 5
Rock climbing, rapell 472 563 690 3.633028
Rope jumping, fast 708 844 1035 5.450459
Rope jumping, moder 590 704 863 4.549541
Rope jumping, slow 472 563 690 3.633028
Rowing, stationary, li 561 669 819 4.299083
Rowing, stationary, 413 493 604 3.182569
Rowing, stationary, v 708 844 1035 5.450459
Rowing, stationary, v 502 598 733 3.850459
Rugby 590 704 863 4.549541
Running, 10 mph (6 944 1126 1380 7.266055
Running, 10.9 mph (5 1062 1267 1553 8.182569
Running, 5 mph (12 472 563 690 3.633028
Running, 5.2 mph (11 531 633 776 4.083486
Running, 6 mph (10 590 704 863 4.549541
Running, 6.7 mph (9 649 774 949 5
Running, 7 mph (8.5 679 809 992 5.217431
Running, 7.5mph (8 738 880 1078 5.666055
Running, 8 mph (7.5 797 950 1165 6.133945
Running, 8.6 mph (7 826 985 1208 6.366972
Running, 9 mph (6.5 885 1056 1294 6.815596
Running, cross coun 531 633 776 4.083486
Running, general 472 563 690 3.633028
Running, in place 472 563 690 3.633028
Running, on a track, 590 704 863 4.549541
Running, stairs, up 885 1056 1294 6.815596
Running, training, p 472 563 690 3.633028
Running, wheeling, 177 211 259 1.366972
Sailing, boat/board, 177 211 259 1.366972
Sailing, in competiti 295 352 431 2.266055
Shoveling snow, by 354 422 518 2.733945
Shuffleboard, lawn b 177 211 259 1.366972
Skateboarding 295 352 431 2.266055
Skating, ice, 9 mph o 325 387 474 2.483486
Skating, ice, general 413 493 604 3.182569
Skating, ice, rapidly 531 633 776 4.083486
Skating, ice, speed, 885 1056 1294 6.815596
Skating, roller 413 493 604 3.182569
Ski jumping (climb up 413 493 604 3.182569
Ski machine, genera 561 669 819 4.299083
Skiing, cross-countr 826 985 1208 6.366972
Skiing, cross-country 472 563 690 3.633028
Skiing, cross-country, 413 493 604 3.182569
Skiing, downhill, light 295 352 431 2.266055
Skiing, downhill, mod 354 422 518 2.733945
Skiing, downhill, vigo 472 563 690 3.633028
Skiing, snow, genera 413 493 604 3.182569
Skiing, water 354 422 518 2.733945
Ski-mobiling, water 413 493 604 3.182569
Skin diving, scuba di 413 493 604 3.182569
Sledding, tobogganin 413 493 604 3.182569
Snorkeling 295 352 431 2.266055
Snow shoeing 472 563 690 3.633028
Snowmobiling 207 246 302 1.584404
Soccer, casual, gene 413 493 604 3.182569
Soccer, competitive 590 704 863 4.549541
Softball or baseball, 295 352 431 2.266055
Softball, officiating 354 422 518 2.733945
Squash 708 844 1035 5.450459
Stair-treadmill ergom 354 422 518 2.733945
Standing-packing/un 207 246 302 1.584404
Stretching, hatha yo 236 281 345 1.817431
Surfing, body or boa 177 211 259 1.366972
Swimming laps, freest 590 704 863 4.549541
Swimming laps, freest 472 563 690 3.633028
Swimming, backstrok 472 563 690 3.633028
Swimming, breaststr 590 704 863 4.549541
Swimming, butterfly, 649 774 949 5
Swimming, leisurely, 354 422 518 2.733945
Swimming, sidestrok 472 563 690 3.633028
Swimming, sychroni 472 563 690 3.633028
Swimming, treading w 590 704 863 4.549541
Swimming, treading w 236 281 345 1.817431
Table tennis, ping p 236 281 345 1.817431
Tai chi 236 281 345 1.817431
Teaching aerobics c 354 422 518 2.733945
Tennis, doubles 354 422 518 2.733945
Tennis, general 413 493 604 3.182569
Tennis, singles 472 563 690 3.633028
Volleyball, beach 472 563 690 3.633028
Volleyball, competit 236 281 345 1.817431
Volleyball, noncomp 177 211 259 1.366972
Walk/run-playing wit 236 281 345 1.817431
Walk/run-playing with 295 352 431 2.266055
Walking, 2.0 mph, s 148 176 216 1.133945
Walking, 3.0 mph, m 207 246 302 1.584404
Walking, 3.5 mph, up 354 422 518 2.733945
Walking, 4.0 mph, ve 236 281 345 1.817431
Walking, carrying inf 207 246 302 1.584404
Walking, grass track 295 352 431 2.266055
Walking, upstairs 472 563 690 3.633028
Walking, using crutc 236 281 345 1.817431
Wallyball, general 413 493 604 3.182569
Water aerobics, wate 236 281 345 1.817431
Water polo 590 704 863 4.549541
Water volleyball 177 211 259 1.366972
Weight lifting or body 354 422 518 2.733945
Weight lifting, light 177 211 259 1.366972
Whitewater rafting, 295 352 431 2.266055

