Principles of Epidemiology

Dona Schneider, PhD, MPH, FACE

Epidemiology Defined

Epi + demos + logos = "that which befalls man"

The study of the distribution and determinants of disease frequency in human populations (MacMahon and Pugh, 1970)

Epidemiology (Schneider)

Epidemiology Defined

The study of the distribution and determinants of health-related states or events in specified populations and the application of this study to the control of health problems (John Last, 1988)

Uses of Epidemiology

Identifying the causes of disease

Legionnaires' disease

Completing the clinical picture of disease

Tuskegee experiment

Determining effectiveness of therapeutic and preventive measures

Mammograms, clinical trials

Identifying new syndromes

Varieties of hepatitis

Uses of Epidemiology

Monitoring the health of a community, region, or nation

Surveillance, accident reports

Identifying risks in terms of probability statements

DES daughters

Studying trends over time to make predictions for the future

Smoking and lung cancer

Estimating health services needs

Life Table of Deaths in London

Age    Deaths    Survivors
0      -         100
6      36        64
16     24        40
26     15        25
36     9         16
46     6         10
56     4         6
66     3         3
76     2         1
80     1         0

Source: Graunt's Observations, 1662
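The survivors column follows from the deaths column by running subtraction from a starting cohort of 100. A minimal Python sketch of that arithmetic, using the figures from the table (the function name `survivors_from_deaths` is made up for this illustration):

```python
# Figures from the life table above; the first deaths entry is 0 because
# the table lists no deaths at the starting age.
ages = [0, 6, 16, 26, 36, 46, 56, 66, 76, 80]
deaths = [0, 36, 24, 15, 9, 6, 4, 3, 2, 1]


def survivors_from_deaths(deaths, cohort=100):
    """Subtract each interval's deaths from the number still alive."""
    survivors = []
    alive = cohort
    for d in deaths:
        alive -= d
        survivors.append(alive)
    return survivors


survivors = survivors_from_deaths(deaths)
for age, alive in zip(ages, survivors):
    print(f"age {age:>2}: {alive:>3} survivors")
```

Each row's survivors equal the previous row's survivors minus that row's deaths, which is why the column reaches 0 at age 80.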

Graunt's Observations

Excess of male births
High infant mortality
Seasonal variation in mortality

Yearly Mortality Bill for 1632: Top 10 Causes of Death

[Bar chart: number of deaths (axis 0 to 2,500) for each cause]

Chrisomes & Infants
Consumption
Fever
Collick, Stone, Strangury
Flox & Small Pox
Bloody Flux, Scowring & Flux
Dropsie & Swelling
Convulsion
Childbed
Liver Grown

Leading Causes of Death in US: 1900

[Bar chart: death rate per 100,000 for each cause]

Pneumonia
Tuberculosis
Diarrhea and enteritis
Heart disease
Chronic nephritis
Unintentional injury
Stroke
Diseases of early infancy
Cancer
Diphtheria

Leading Causes of Death in US: 1990

[Bar chart: death rates per 100,000 for each cause]

Heart disease
Cancer
Stroke
Unintentional injury
Lung diseases
Pneumonia and influenza
Diabetes
Suicide
Liver disease
HIV/AIDS

Endemic Vs. Epidemic

[Figure: number of cases of a disease plotted against time; the endemic level is labeled]


Population Pyramid

Statistics

Statistics: A branch of applied mathematics that uses procedures for condensing, describing, analyzing, and interpreting sets of information

Biostatistics: A subset of statistics used to handle health-relevant information

Statistics (cont.)

Descriptive statistics: Methods of producing quantitative summaries of information

Measures of central tendency
Measures of dispersion

Inferential statistics: Methods of making generalizations about a larger group based on information about a subset (sample) of that group

Populations and Samples

Before we can determine what statistical test to use, we need to know if our information represents a population or a sample.

A sample is a subset which should be representative of a population.

A sample should be representative if selected randomly (i.e., each data point should have the same chance for selection as every other point).

In some cases, the sample may be stratified but then randomized within the strata.

We want a sample that will reflect a population's gender and age:

1. Stratify the data by gender
2. Within each stratum, further stratify by age
3. Select randomly within each gender/age stratum so that the number selected will be proportional to that of the population
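The three steps above can be sketched in Python. This is an illustration only: the population is hypothetical, and the function name `stratified_sample`, the field names, and the 10% sampling fraction are assumptions, not anything specified in the slides.

```python
import random
from collections import defaultdict


def stratified_sample(people, fraction, seed=0):
    """Proportional stratified random sample by (gender, age group)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable

    # Steps 1 and 2: group records by gender, then by age group.
    strata = defaultdict(list)
    for person in people:
        strata[(person["gender"], person["age_group"])].append(person)

    # Step 3: draw randomly within each stratum, in proportion to its size.
    # Rounding can make the total differ slightly from fraction * len(people).
    sample = []
    for key in sorted(strata):
        members = strata[key]
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 60 women and 40 men, each split evenly by age group.
labels = (
    [("F", "<40")] * 30 + [("F", "40+")] * 30
    + [("M", "<40")] * 20 + [("M", "40+")] * 20
)
population = [
    {"id": i, "gender": g, "age_group": a} for i, (g, a) in enumerate(labels)
]

sample = stratified_sample(population, fraction=0.1)
print(len(sample), "people sampled")
```

With these made-up numbers the sample keeps the population's 60/40 gender split and its even age split, which is the point of sampling proportionally within strata.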

Populations and Samples

You can tell if you are looking at statistics on a population or a sample

Greek letters stand for population parameters (unknown but fixed)
Roman letters stand for statistics (known but random)
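The slides give no specific symbols. Under the usual textbook convention (an assumption here), population parameters take Greek letters and the matching sample statistics take Roman letters, for example:

```latex
\begin{array}{lcc}
 & \text{Population parameter} & \text{Sample statistic} \\
\text{Mean} & \mu & \bar{x} \\
\text{Variance} & \sigma^2 & s^2 \\
\text{Standard deviation} & \sigma & s
\end{array}
```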

Classification of Data
Qualitative or Quantitative

Qualitative: non-numeric or categorical

Examples: gender, race/ethnicity

Quantitative: numeric

Examples: age, temperature, blood pressure

Classification of Data
Discrete or Continuous

Discrete: taking only distinct, countable values

Examples: marital status, blood type, number of children

Continuous: able to take any value within a range (infinitely many possible values)

Examples: height, weight, temperature

Qualitative (categorical) data are discrete

Quantitative (numerical) data may be discrete or continuous

Qualitative Data: Nominal

Data which fall into mutually exclusive categories (discrete) for which there is no natural order

Examples:

Race/ethnicity
Gender
Marital status
ICD-10 codes
Dichotomous data such as HIV+ or HIV-; yes or no

Qualitative Data: Ordinal

Data which fall into mutually exclusive categories (discrete data) which have a rank or graded order

Examples:

Grades
Socioeconomic status
Stage of disease
Low, medium, high

Quantitative Data: Interval

Data which are measured in standard units

The scale shows not only that one data point differs from another, but by how much

Examples:

Number of days since onset of illness (discrete)
Temperature in Fahrenheit or Celsius (continuous)

Quantitative Data: Ratio

Data which are measured in standard units where a true zero represents total absence of that unit

Examples:

Number of children (discrete)
Temperature in Kelvin (continuous)
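The practical difference between interval and ratio data is whether ratios mean anything. A small Python sketch using the temperature examples from these slides (the two temperatures are arbitrary illustrative values):

```python
def celsius_to_kelvin(c):
    """Convert degrees Celsius to kelvins."""
    return c + 273.15


cold_c, warm_c = 10.0, 20.0  # arbitrary illustrative temperatures

# Differences are meaningful on both scales (interval property).
diff_c = warm_c - cold_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cold_c)

# Ratios are meaningful only where zero means "none of the quantity".
ratio_c = warm_c / cold_c  # 2.0, yet 20 C is not "twice as hot" as 10 C
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)  # about 1.035

print(diff_c, round(diff_k, 2), ratio_c, round(ratio_k, 3))
```

The 10-degree difference survives the change of scale, but the ratio does not, because 0 on the Celsius scale is not an absence of temperature.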

Review of Descriptive Biostatistics

Mean
Median
Mode and range
Variance and standard deviation
Frequency distributions
Histograms

Mean

Most commonly used measure of central tendency

Arithmetic average

Formula: $\bar{x} = \sum x / n$

Sensitive to outliers

Example: Number of accidents per week

8, 5, 3, 2, 7, 1, 2, 4, 6, 2

$\bar{x} = (8+5+3+2+7+1+2+4+6+2) / 10 = 40 / 10 = 4$
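The same calculation in Python, done by hand and with the standard library's `statistics.mean`. The extra value of 40 is hypothetical, added only to show how one extreme week moves the mean:

```python
from statistics import mean

accidents_per_week = [8, 5, 3, 2, 7, 1, 2, 4, 6, 2]

# Sum of the observations divided by the number of observations.
by_hand = sum(accidents_per_week) / len(accidents_per_week)
by_library = mean(accidents_per_week)

# Hypothetical outlier: one extreme week pulls the mean well above 4.
with_outlier = accidents_per_week + [40]
mean_with_outlier = mean(with_outlier)

print(by_hand, by_library, round(mean_with_outlier, 2))
```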

Median

The value which divides a ranked set into two equal parts

Order the data

If n is even, take the mean of the two middle observations
If n is odd, the median is the middle observation

Given an even number of observations (n=10):

Example: 1, 2, 2, 2, 3, 4, 5, 6, 7, 8
Median = (3+4) / 2 = 3.5

Given an odd number of observations (n=11):

Example: 1, 2, 2, 2, 3, 4, 5, 6, 7, 8, 10
(n+1)/2 = (11+1)/2 = 6th observation
Median = 4
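A short Python sketch of the even/odd rule, checked against the standard library's `statistics.median`. The helper name `median_by_hand` is made up for this illustration:

```python
from statistics import median


def median_by_hand(values):
    """Rank the data, then take the middle value or the mean of the middle two."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # the ((n+1)/2)th observation, counting from 1
    return (ordered[mid - 1] + ordered[mid]) / 2


even_set = [1, 2, 2, 2, 3, 4, 5, 6, 7, 8]
odd_set = [1, 2, 2, 2, 3, 4, 5, 6, 7, 8, 10]

print(median_by_hand(even_set), median(even_set))
print(median_by_hand(odd_set), median(odd_set))
```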

Mode

The number which occurs the most frequently in a set

Example: 1, 2, 2, 2, 3, 4, 5, 6, 7, 8

Mode = 2

Range

The difference between the largest and smallest values in a distribution

Example: 1, 2, 2, 2, 3, 4, 5, 6, 7, 8

Range = 8-1 = 7
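Both the mode and the range can be checked in a few lines of Python with the same example set. This sketch uses `statistics.mode` from the standard library:

```python
from statistics import mode

values = [1, 2, 2, 2, 3, 4, 5, 6, 7, 8]

most_common = mode(values)               # the most frequent value
value_range = max(values) - min(values)  # largest minus smallest

print("mode:", most_common, "range:", value_range)
```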

Variance and Standard Deviation

Measures of dispersion (or scatter) of the values about the mean

If the numbers are near the mean, the variance is small
If the numbers are far from the mean, the variance is large

$V = \sum (x - \bar{x})^2 / (n - 1)$

$V = [(8-4)^2 + (5-4)^2 + (3-4)^2 + (2-4)^2 + (7-4)^2 + (1-4)^2 + (2-4)^2 + (4-4)^2 + (6-4)^2 + (2-4)^2] / (10 - 1)$

$V = 52 / 9 \approx 5.778$
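The variance calculation can be traced step by step in Python with the accident data and compared with `statistics.variance`, which also divides by n - 1:

```python
from statistics import variance

accidents_per_week = [8, 5, 3, 2, 7, 1, 2, 4, 6, 2]

n = len(accidents_per_week)
x_bar = sum(accidents_per_week) / n

# Squared distance of each observation from the mean.
squared_deviations = [(x - x_bar) ** 2 for x in accidents_per_week]

# Sample variance: sum of squared deviations divided by n - 1.
sample_variance = sum(squared_deviations) / (n - 1)

print(sum(squared_deviations), round(sample_variance, 4))
print(round(variance(accidents_per_week), 4))
```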

Standard Deviation

$SD = \sqrt{V}$

$SD = \sqrt{5.778} \approx 2.404$
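In Python, `statistics.stdev` gives the sample standard deviation (n - 1 in the denominator), which matches the slide. `statistics.pstdev` divides by n instead, and would apply only if the ten weeks were treated as the whole population. That second use is an illustration added here, not something the slides do:

```python
from statistics import pstdev, stdev

accidents_per_week = [8, 5, 3, 2, 7, 1, 2, 4, 6, 2]

sample_sd = stdev(accidents_per_week)       # sqrt(52 / 9)
population_sd = pstdev(accidents_per_week)  # sqrt(52 / 10)

print(round(sample_sd, 3), round(population_sd, 3))
```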

Symmetric and Skewed Distributions

[Figure: frequency diagrams of a symmetrical distribution and a skewed distribution, with the mean, median, and mode marked on each]
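One quick numerical check for skew is to compare the three measures of central tendency. They coincide in a symmetric set and separate in a skewed one. In this Python sketch the symmetric set is hypothetical and the skewed set is the example used elsewhere in these slides:

```python
from statistics import mean, median, mode


def center_summary(values):
    """Return the three measures of central tendency for one data set."""
    return {"mean": mean(values), "median": median(values), "mode": mode(values)}


symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]  # hypothetical, symmetric about 3
skewed = [1, 2, 2, 2, 3, 4, 5, 6, 7, 8]  # example set from these slides

print(center_summary(symmetric))
print(center_summary(skewed))
```

For the skewed set the mean (4) lies above the median (3.5), which lies above the mode (2). That ordering is the usual sign of a tail stretching toward higher values.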


12 Patients: 5-Point Anxiety Scale Scores

Patient    Anxiety score
1          4
2          3
3          5
4          1
5          4
6          4
7          2
8          5
9          4
10         3
11         4
12         5

Anxiety score    Frequency
1                1
2                1
3                2
4                5
5                3
Total            12
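The frequency column can be tallied from the twelve raw scores with `collections.Counter`. A minimal Python sketch:

```python
from collections import Counter

# Anxiety scores for patients 1 to 12, in order.
anxiety_scores = [4, 3, 5, 1, 4, 4, 2, 5, 4, 3, 4, 5]

frequency = Counter(anxiety_scores)

for score in range(1, 6):
    print(f"score {score}: {frequency[score]}")
print("total:", sum(frequency.values()))
```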

Frequency Diagram for 12 Psychiatric Patients

Accidents at a summer camp requiring ER treatment

Accidents per week    Frequency    Percent
1                     1            10
2                     3            30
3                     1            10
4                     1            10
5                     1            10
6                     1            10
7                     1            10
8                     1            10
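This table is the frequency distribution of the ten weekly accident counts used earlier (8, 5, 3, 2, 7, 1, 2, 4, 6, 2). A minimal Python sketch that rebuilds the frequency and percent columns from those raw counts:

```python
from collections import Counter

accidents_per_week = [8, 5, 3, 2, 7, 1, 2, 4, 6, 2]

counts = Counter(accidents_per_week)
total = sum(counts.values())

# One row per distinct weekly count: (value, frequency, percent of weeks).
table = [
    (value, counts[value], 100 * counts[value] / total)
    for value in sorted(counts)
]

for value, freq, pct in table:
    print(f"{value} accidents: {freq} week(s), {pct:.0f}%")
```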

[Figure; x-axis: number of accidents per week]

Frequency Polygon
[Figure; x-axis: number of accidents per week]

Frequency Polygon and Histogram

Note: area A = A'; B = B'; C = C'; D = D'; the area under the histogram equals the area under the polygon

[Figure; x-axis: number of accidents per week]
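The equal-area claim can be checked numerically. This Python sketch rests on two assumptions the slides do not state: the bars are one unit wide, and the polygon joins the bar midpoints and drops to zero one unit beyond each end.

```python
accident_counts = [1, 2, 3, 4, 5, 6, 7, 8]  # number of accidents per week
frequency = [1, 3, 1, 1, 1, 1, 1, 1]        # weeks with that many accidents

bar_width = 1  # assumed unit-width bars
histogram_area = sum(f * bar_width for f in frequency)

# Polygon vertices: bar midpoints, closed at zero one step beyond each end.
xs = [0] + accident_counts + [9]
ys = [0] + frequency + [0]

# Area under the polygon by the trapezoid rule.
polygon_area = sum(
    (ys[i] + ys[i + 1]) / 2 * (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)
)

print(histogram_area, polygon_area)
```

Each triangle the polygon cuts off a bar is matched by an equal triangle it adds beside it, so both areas come to 10, the number of weeks observed.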

Descriptive Statistics

Used as a first step to look at health-related outcomes
Examine numbers of cases to identify an increase (epidemic)
Examine patterns of cases to see who gets sick (demographic variables) and where and when they get sick (space/time variables)
