
Principles of Epidemiology for Public Health (EPID600)

Study designs: Cross-sectional studies, ecologic studies (and confidence intervals)
Victor J. Schoenbach, PhD
Department of Epidemiology
Gillings School of Global Public Health
University of North Carolina at Chapel Hill
www.unc.edu/epid600/
2/22/2011

Cross-sectional studies

Signs from around the world


In a Copenhagen airline ticket office:

We take your bags and send them in all directions.

Signs from around the world


In a Norwegian cocktail lounge:

Ladies are requested not to have children in the bar.

Signs from around the world


Rome laundry:
Ladies, leave your clothes here and
spend the afternoon having a good time.

Faster keyboarding - 1
I cdnuolt blveiee taht I cluod aulaclty uesdnatnrd waht I
was rdanieg. The phaonmneal pweor of the hmuan mnid,
aoccdrnig to a rscheearch at Cmabrigde Uinervtisy. It
dn'seot mttaer in waht oredr the ltteers in a wrod are, the
olny iprmoatnt tihng is taht the frist and lsat ltteer be in
the rghit pclae. The rset can be a taotl mses and you can
sitll raed it wouthit a porbelm.
Gary C. Ramseyer's First Internet Gallery of Statistics Jokes
http://davidmlane.com/hyperstat/humorf.html (#162)

Faster keyboarding - 2
Most of my friends could read this with understanding
and rather quickly I might add. Then I had them read a
statistical bit of literature:
Miittluvraae asilyans sattes an idtenossiy ctuoonr epilsle
is the itternoiecsno of a panle pleralal to the xl-yapne and
the sruacfe of a btiiarave nmarol dbttiisruein.
Gary C. Ramseyer's First Internet Gallery of Statistics Jokes
http://davidmlane.com/hyperstat/humorf.html (#162)


Today: outline
Cross-sectional studies (and sampling)
Ecologic studies
Confidence intervals


Cross-sectional studies
Cross-sectional studies include surveys
People are studied at a point in time, without
follow-up.
Can combine a cross-sectional study with follow-up
to create a cohort study.
Can conduct repeated cross-sectional studies to
measure change in a population.

Cross-sectional studies
Number of uninsured Americans rises to 50.7 million.
(USA Today, 9/17/2010; data from Census Bureau)

In 2007-2008, almost one in five children older than 5 years was obese.
(Health, United States, 2010; data from the National Health and Nutrition Examination Survey)

35% (~7.4 million) of births to U.S. women during the preceding 5 years were mistimed or unwanted.
(2002 National Survey of Family Growth, Series 23, No. 25, Table 21)
[Source: www.cdc.gov/nchs/]


Cross-sectional studies
Incidence information is not available from a typical
cross-sectional study
Sometimes can reconstruct incidence from historical
information
Example: the incidence proportion of quitting smoking,
called the quit ratio:
ex-smokers / ever-smokers
is calculated from survey data.
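
As a rough illustration of the quit ratio, here is a minimal Python sketch. The survey counts are hypothetical, and it assumes that ex-smokers plus current smokers make up all ever-smokers.

```python
def quit_ratio(ex_smokers, current_smokers):
    """Quit ratio = ex-smokers / ever-smokers, where ever = ex + current."""
    ever_smokers = ex_smokers + current_smokers
    if ever_smokers == 0:
        raise ValueError("no ever-smokers in the sample")
    return ex_smokers / ever_smokers

# Hypothetical survey counts, for illustration only
ratio = quit_ratio(ex_smokers=450, current_smokers=550)
print(f"Quit ratio: {ratio:.0%}")
```

Both counts come from a single survey, so no follow-up is needed; the "incidence" of quitting is reconstructed from reported smoking history.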

Measure prevalence at point in time


Snapshot of a population, a still life
Can measure attitudes, beliefs, behaviors, personal or
family history, genetic factors, existing or past health
conditions, or anything else that does not require follow-up to assess.
The source of most of what we know about the
population

Population census
A cross-sectional study of an entire
population
Provides the denominator data for
many purposes (e.g., estimation of
rates, assessing generalizability,
projecting from smaller studies)
A huge effort: people can be difficult to
find and to count, and may not want to
provide data
Some countries maintain accurate and
current registries of the entire country

National surveys conducted by NCHS


National Health Interview Survey (NHIS): household interviews
National Health and Nutrition Examination Survey (NHANES): interviews and physical examinations
National Survey of Family Growth (NSFG): household interviews
National Health Care Survey (NHCS): medical records

National surveys
Designed to be representative of the entire country
Modes: household interview, telephone, mail
Employ complex sampling designs to optimize
efficiency (tradeoff between information and cost)
Logistically challenging (answering machines, cellphones, . . .)
See presentation by Dr. Anjani Chandra at

www.minority.unc.edu/institute/2003/materials/slides/Chandra-20030522.ppt


Example: National Health Interview Survey


Conducted every year in U.S. by National
Center for Health Statistics (CDC)
Stratified, multistaged, household survey
that covers the civilian noninstitutionalized
population of the United States
Redesigned every decade to use new
census

multistaged
Improves logistical feasibility and reduces costs
(though reduces precision)
1. Divide population into primary sampling units
(PSUs)
A PSU is a metropolitan statistical area, a county,
or a group of adjacent counties


multistaged
2. Select sample of census block groups (SSUs) within
each selected PSU
3. Map each selected census block group or examine
building permits
4. Select one cluster of 4-8 housing units dispersed
evenly throughout the block
NCHS draws a new representative sample for each
week's interviews
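
The staged selection can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the sampling frame is a toy one, every name in it is hypothetical, and each stage uses simple random sampling, whereas the actual NHIS design is more elaborate.

```python
import random

def multistage_sample(frame, n_psus, n_blocks, cluster_size, seed=0):
    """Three-stage sample from frame: {PSU: {block group: [housing units]}}."""
    rng = random.Random(seed)
    sample = []
    for psu in rng.sample(sorted(frame), n_psus):             # stage 1: PSUs
        blocks = frame[psu]
        for block in rng.sample(sorted(blocks), n_blocks):    # stage 2: block groups
            for unit in rng.sample(blocks[block], cluster_size):  # stage 3: housing units
                sample.append((psu, block, unit))
    return sample

# Toy frame (hypothetical): 6 PSUs x 5 block groups x 20 housing units
frame = {
    f"PSU{p}": {
        f"BG{b}": [f"HU{p}-{b}-{u}" for u in range(20)]
        for b in range(5)
    }
    for p in range(6)
}

sample = multistage_sample(frame, n_psus=2, n_blocks=3, cluster_size=4)
print(len(sample), "housing units selected from",
      len({psu for psu, _, _ in sample}), "PSUs")
```

Interviewers visit only the selected PSUs, which is what makes the design cheaper; the price is that households in the same cluster tend to resemble each other, so precision is lower than in a simple random sample of the same size.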

stratified
US divided into 1,900 PSUs
Largest 52 PSUs are self-representing
Rest of PSUs divided into 73 categories (strata),
based on socioeconomic and demographic variables
Sampling takes place separately within each category
(stratum)


Sample size and Precision


Weighted sampling
(hypothetical)

Age group    Pop (1,000's)   Unweighted sample   Weighted sample
20-39 yrs    18,000          900                 400
40-59 yrs    18,000          900                 400
60-69 yrs     8,000          400                 400
Total        44,000          2,200               1,200

stratified
Also place census blocks into categories and
sample within each
Oversample some strata
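
When some strata are oversampled, each respondent has to be weighted by the number of people he or she represents. The sketch below uses the population and weighted-sample sizes from the hypothetical table on the previous slide; the case counts are invented for illustration and are not from the slides.

```python
# age group: (population in 1,000s, sample size), from the hypothetical table
strata = {
    "20-39": (18_000, 400),
    "40-59": (18_000, 400),
    "60-69": (8_000, 400),
}
# Hypothetical numbers of sampled people with some condition (an assumption)
cases = {"20-39": 20, "40-59": 40, "60-69": 80}

def sampling_weight(pop_thousands, n):
    """People in the population represented by each sampled person."""
    return pop_thousands * 1000 / n

def weighted_prevalence(strata, cases):
    num = sum(sampling_weight(*strata[s]) * cases[s] for s in strata)
    den = sum(sampling_weight(*strata[s]) * strata[s][1] for s in strata)
    return num / den

unweighted = sum(cases.values()) / sum(n for _, n in strata.values())
weighted = weighted_prevalence(strata, cases)
print(f"unweighted: {unweighted:.1%}, weighted: {weighted:.1%}")
```

In this made-up example the oldest group is both oversampled and has the highest prevalence, so ignoring the weights would overstate the population prevalence.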


Defined population
Studies, especially cross-sectional studies, are easiest to
interpret when they are based in a population that has some
existence apart from the study itself (defined population)
1. Political subdivision (city, county, state)
2. Institutional (HMO, employer, profession)
Probability sampling enables statistical generalizability to
the defined population


Surveys of sentinel populations


HIV seroprevalence survey in three county STD
clinics in central NC in 1988
3,000 anonymous, unlinked, leftover sera
Anonymous questionnaire for demographics and
risk factors
[Schoenbach VJ, Landis SE, Weber DJ, Mittal M, Koch GG, Levine PH. HIV
seroprevalence in sexually transmitted disease clients in a low-prevalence southern state.
Ann Epidemiol 1993;3:281-288]


HIV seroprevalence

[Schoenbach VJ, Landis SE, Weber DJ, Mittal M, Koch GG, Levine PH. HIV
seroprevalence in sexually transmitted disease clients in a low-prevalence southern
state. Ann Epidemiol 1993;3:281-288]

Seroprevalence (% HIV+) by risk factors


Characteristic

Gay Hetero Women

Syphilis
(history/current)
Gonorrhea (history)

53

9.0

37

2.6

Anal intercourse

41

1.7

Paid for sex

5.2

[Schoenbach VJ, Landis SE, Weber DJ, Mittal M, Koch GG, Levine PH. HIV
seroprevalence in sexually transmitted disease clients in a low-prevalence southern state.
Ann Epidemiol 1993;3:281-288]

Interpretation
Measures prevalence: if incidence is our real
interest, prevalence is often not a good surrogate
measure
Studies only survivors and stayers
May be difficult to determine whether a cause
came before an effect (exception: genetic
factors)


Other points
Can select participants by exposure or overall
Can select by disease, but the study then may not be
distinguishable from a case-control study with
prevalent cases


Outline
Cross-sectional studies (and sampling)
Ecologic studies
Confidence intervals


Ecologic studies
Most study designs (cross-sectional, case-control,
cohort, intervention trials) can be carried
out with individuals or with groups
Group-level studies that use routinely collected
data are easier and less costly
Group-level studies that involve interventions
may not be easier or less costly

Types of group-level variables


Summary of individual-level variable (e.g.,
median household income, % with high
school diploma)
Property of the aggregate (e.g.,
neighborhood grocery stores, seat belt
legislation, community competence)

Interpretation
Link between summary exposure variable and
individual-level outcome must be inferred
Inference from group to individual is not
always sound
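
A small made-up example shows how the group-level picture can differ from the individual-level one. The three communities and all counts below are hypothetical; they are chosen so that communities with more exposure have higher disease prevalence even though, within every community, exposed people have half the risk of unexposed people.

```python
# community: (exposed n, exposed cases, unexposed n, unexposed cases); hypothetical
communities = {
    "A": (20, 1, 80, 8),
    "B": (50, 5, 50, 10),
    "C": (80, 16, 20, 8),
}

def group_summary(exp_n, exp_cases, unexp_n, unexp_cases):
    """Group-level variables: proportion exposed and overall prevalence."""
    total = exp_n + unexp_n
    return exp_n / total, (exp_cases + unexp_cases) / total

def risk_ratio(exp_n, exp_cases, unexp_n, unexp_cases):
    """Individual-level comparison within one community."""
    return (exp_cases / exp_n) / (unexp_cases / unexp_n)

for name, counts in communities.items():
    pct_exposed, prevalence = group_summary(*counts)
    print(f"{name}: {pct_exposed:.0%} exposed, prevalence {prevalence:.0%}, "
          f"within-community risk ratio {risk_ratio(*counts):.2f}")
```

An analyst who sees only the two group-level columns would infer that exposure raises risk, which is the opposite of what holds for individuals in these data.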


Example: Male Circumcision and HIV

(Slope indicates strength of relationship; r indicates linearity)

Source: Bongaarts J, et al. The relationship between male circumcision and HIV infection in African populations. AIDS 1989; 3(6): 373-7.
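
To make the distinction between slope and r concrete, the sketch below fits a least-squares line to two invented group-level data sets. The numbers are hypothetical and are not taken from Bongaarts et al.; they are constructed so that both data sets have the same slope but differ in how closely the points follow a line.

```python
from math import sqrt

def slope_and_r(xs, ys):
    """Least-squares slope and Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / sqrt(sxx * syy)

# Hypothetical group-level data: x = % exposed, y = % with the outcome
x = [0, 20, 40, 60, 80, 100]
y_tight = [10 - 0.08 * xi for xi in x]        # points exactly on a line
scatter = [2, -3, 1, 1, -3, 2]                # chosen to leave the slope unchanged
y_loose = [yi + e for yi, e in zip(y_tight, scatter)]

slope_tight, r_tight = slope_and_r(x, y_tight)
slope_loose, r_loose = slope_and_r(x, y_loose)
print(f"tight: slope {slope_tight:.3f}, r {r_tight:.2f}")
print(f"loose: slope {slope_loose:.3f}, r {r_loose:.2f}")
```

The outcome falls by the same amount per unit of exposure in both data sets, so the strength of the relationship is the same; only the scatter around the line, and therefore r, differs.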


Outline
Cross-sectional studies (and sampling)
Ecologic studies
Confidence intervals


Confidence intervals
Provide a plausible range for the quantity
being estimated
Width indicates the precision of an estimate
for a given level of confidence
Confidence intervals quantify only random
error from sampling variation, not systematic
error from nonresponse, study design, etc.

Confidence level vs. precision


The more vague my estimate, the more
confident I can be that it includes the
population parameter: "I am 100% confident
that the prevalence of HIV is between 0 and
100%."
The more specific my estimate, the lower my
confidence: "I am 0% confident that the
prevalence of HIV is 5.23%."
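
The same trade-off appears between these two extremes. The sketch below computes intervals at several confidence levels for one hypothetical sample (50 cases among 1,000 people), using the normal-approximation formula, estimate +/- z x standard error. Both the data and the choice of formula are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(cases, n, level):
    """Normal-approximation interval for a proportion at the given confidence level."""
    p = cases / n
    z = NormalDist().inv_cdf(0.5 + level / 2)   # e.g. about 1.96 for 95%
    half_width = z * sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)

# Hypothetical sample: 50 cases among 1,000 people
for level in (0.50, 0.80, 0.95, 0.99):
    lo, hi = wald_interval(50, 1000, level)
    print(f"{level:.0%} confidence: {lo:.3f} to {hi:.3f} (width {hi - lo:.3f})")
```

For a fixed sample, asking for more confidence always widens the interval; the only way to have both high confidence and a narrow interval is to collect more data.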

Confidence intervals: interpretation


Simple interpretations are typically not
precise
Precise interpretations are typically not
simple


Simple but imprecise


"There is 95% confidence that the interval
contains the true value"
True, but begs the question of how to
define "confidence"


Simple but imprecise


"There is a 95% probability that the interval
contains the true value"
Not quite correct: probability (as
conventionally defined) applies to a process,
not to a single instance


Probability applies to a process: example


A 95% confidence interval can be viewed as a
measurement or estimation process that will
be correct (the interval includes the true
value of the parameter) 95% of the time and
incorrect 5% of the time.
Let us make up another estimation process
that will be correct (about) 95% of the time.

Why probability applies to a process


Estimate your gender by flipping a coin 5 times: if the result is 5 heads, estimate your gender to
be its opposite; otherwise estimate your gender to be
what you think it is now.
Probability that estimate will be correct is
(1 - Probability of 5 heads) = 1 - (1/2)^5 = 0.97 = 97%
Probability that estimate will be incorrect is 3%
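
The arithmetic, and the long-run behavior of this estimation process, can be checked with a short Python sketch. It assumes a fair coin; the seed and the number of runs are arbitrary choices.

```python
import random

def estimate_is_correct(rng, flips=5):
    """One run of the process: the estimate is wrong only if every flip is heads."""
    all_heads = all(rng.random() < 0.5 for _ in range(flips))
    return not all_heads

exact = 1 - 0.5 ** 5                      # 1 - 1/32

rng = random.Random(600)                  # arbitrary seed, for reproducibility
runs = 100_000
simulated = sum(estimate_is_correct(rng) for _ in range(runs)) / runs

print(f"exact: {exact:.5f}   simulated over {runs:,} runs: {simulated:.3f}")
```

The 97% describes how often the process is right across many repetitions; it says nothing about any one run taken by itself.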

Why probability applies to a process


So we now have a measurement process that will
be correct 97% of the time. We will use it to
measure your gender.
Flip the coin 5 times, and suppose you get 5 heads
Is there a 97% probability that you are of the
opposite sex?


Precise but not simple


A 95% confidence interval is:
1. obtained by using a procedure that will include
the population parameter being estimated 95%
of the time
2. the set of all population values which are likely
to yield a sample like the one we obtained
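
The first definition can be illustrated by simulation: repeat the same study many times, compute an interval each time, and count how often the interval contains the true value. The sketch below assumes a true prevalence of 15%, studies of 400 people, and the normal-approximation interval (estimate +/- 1.96 x s.e.); all of these are choices made for illustration.

```python
import random
from math import sqrt

def interval_covers(true_p, n, rng, z=1.96):
    """Run one simulated study and report whether its interval contains true_p."""
    cases = sum(rng.random() < true_p for _ in range(n))
    p_hat = cases / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width <= true_p <= p_hat + half_width

rng = random.Random(2011)                 # arbitrary seed
true_p, n, studies = 0.15, 400, 2000
coverage = sum(interval_covers(true_p, n, rng) for _ in range(studies)) / studies
print(f"{coverage:.1%} of {studies} intervals contain the true value")
```

The result should be close to 95%, though not exactly 95%, because the normal approximation is itself approximate and the number of simulated studies is finite.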


Suppose that this line represents the value of the parameter we are trying to estimate
[Figure: a line marking the true value]

Possible estimates of that parameter in N identical studies (shows sampling variation)
[Figure: dot plot of study estimates, piled highest around the true value and thinning out on either side]

One possible true value and how it would manifest, on average, in N identical studies
[Figure: dot plot of study estimates around the true value, with the central 95% of the distribution marked]

Estimate from one study of a given size
[Figure: the estimate marked on the axis; the location of the true value is unknown (?)]

A possible true value with < 2.5% chance of being observed at or beyond the estimate
[Figure: distribution of study estimates around a candidate true value (?), with 95% of the distribution marked; the estimate lies outside it]

A possible true value with > 2.5% probability of being observed at or beyond the estimate
[Figure: distribution of study estimates around a candidate true value (?), with 95% of the distribution marked; the estimate lies within it]

A possible true value with > 2.5% probability of being observed at or beyond the estimate
[Figure: distribution of study estimates around a candidate true value (?) on the other side of the estimate, with 95% of the distribution marked; the estimate lies within it]

A possible true value with < 2.5% probability of being observed at or beyond the estimate
[Figure: distribution of study estimates around a candidate true value (?) on the other side of the estimate, with 95% of the distribution marked; the estimate lies outside it]

What the confidence interval represents
[Figure: overlapping distributions for a series of possible true values (?), with the 95% confidence interval marked]

What the confidence interval represents
[Figure: overlapping distributions for many possible true values, with the 95% confidence interval marked]

One possible true value and how it would manifest, on average, in N identical studies
[Figure: dot plot of study estimates around the true value, with 1.96 x s.e. marked on each side]

Confidence intervals: another take

One possible population
Another possible population
A 3rd possible population
A 4th possible population
A 5th possible population
A 6th possible population
etc.
[Figures: each slide shows a different possible population, with its cases marked (O)]

There are 1.6 x 10^60 possible populations
(from no cases to all cases)
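
The count can be reproduced if the pictured population is assumed to contain 200 individuals, each of whom either is or is not a case. The figure of 200 is an inference from the later slide that equates 2.5% with 5 cases; it is not stated on this slide.

```python
from math import comb

N = 200                      # assumed population size (5 cases = 2.5%)
total = 2 ** N               # each individual is either a case or a non-case
print(f"{total:.2e} possible populations")

# The same total, counted by number of cases (0 cases, 1 case, ..., N cases)
by_case_count = sum(comb(N, k) for k in range(N + 1))
print(by_case_count == total)
```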


Suppose this is the population (prevalence = 15%)
[Figure: the population, with cases (O) scattered through it]

Take a sample (n=10)
[Figure: the same population, from which a sample of 10 is drawn]

The sample


Make point estimate of prevalence


Interval estimate
What are all the possible populations that
would be expected to yield this prevalence in
a sample of size 10?


This one is not possible

Possible, but VERY UNLIKELY

Not quite 2.5% probability (2.1%, in fact)

Yields just about 2.5% (3%, actually) probability of selecting 2 (or more) cases in 10
[Figures: each slide shows a candidate population with its cases marked (O)]
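
These tail probabilities can be computed directly. The sketch below assumes a population of 200 (inferred from "2.5% (5 cases)" on a later slide) and sampling without replacement, so that the number of cases in the sample follows a hypergeometric distribution. The function names are illustrative.

```python
from math import comb

def prob_exactly(k, pop_size, pop_cases, n):
    """Hypergeometric probability of exactly k cases in a sample of n."""
    return comb(pop_cases, k) * comb(pop_size - pop_cases, n - k) / comb(pop_size, n)

def prob_at_least(k, pop_size, pop_cases, n):
    return sum(prob_exactly(j, pop_size, pop_cases, n) for j in range(k, n + 1))

def prob_at_most(k, pop_size, pop_cases, n):
    return sum(prob_exactly(j, pop_size, pop_cases, n) for j in range(k + 1))

POP, N_SAMPLE = 200, 10      # population size is an assumption; sample size is from the slides

for pop_cases in (1, 5, 6):
    p = prob_at_least(2, POP, pop_cases, N_SAMPLE)
    print(f"{pop_cases:3d} cases in population: P(2 or more in sample) = {p:.3f}")

for pop_cases in (109, 110):
    p = prob_at_most(2, POP, pop_cases, N_SAMPLE)
    print(f"{pop_cases:3d} cases in population: P(2 or fewer in sample) = {p:.3f}")
```

Under these assumptions a population with a single case cannot yield 2 cases in the sample (probability 0), populations with 5 and 6 cases give about 2.1% and 3.0%, and populations with 109 and 110 cases give about 2.6% and 2.4% for the other tail.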


One possible true value and how it would manifest, on average, in N identical studies
[Figure: dot plot of study estimates around the true value, with the central 95% of the distribution marked]

Just above 2.5% (actually 2.6%) probability of selecting 2 (or fewer) cases in 10
[Figure: a candidate population with many cases (O)]

Just below 2.5% (actually 2.4%) probability of selecting 2 (or fewer) cases in 10
[Figure: a candidate population with slightly more cases (O)]

Interval estimate for 2/10


Lower bound: 2.5% (5 cases)
Upper bound: 55% (110 cases)
Meaning: Our sample of 10 with 2 cases provides
evidence to exclude, at conventional error
tolerance, populations with fewer than 5 cases or
more than 110 cases. Populations with 5-110 cases
cannot be excluded as likely sources for this
sample.
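
A closely related calculation treats the population as very large, so that the number of cases in the sample is binomial, and searches for the prevalences at which each tail probability equals 2.5%. This is the Clopper-Pearson ("exact") interval, which is a substitute for, not a copy of, the finite-population reasoning in the slides. The bisection helper below is an illustrative implementation.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def bisect_root(f, lo, hi, tol=1e-9):
    """Root of a monotone function on [lo, hi]; f(lo) and f(hi) must differ in sign."""
    f_lo_positive = f(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_interval(cases, n, alpha=0.05):
    """Clopper-Pearson interval: each tail probability equals alpha / 2."""
    lower = 0.0 if cases == 0 else bisect_root(
        lambda p: (1 - binom_cdf(cases - 1, n, p)) - alpha / 2, 0.0, 1.0)
    upper = 1.0 if cases == n else bisect_root(
        lambda p: binom_cdf(cases, n, p) - alpha / 2, 0.0, 1.0)
    return lower, upper

lower, upper = exact_interval(2, 10)
print(f"95% interval for 2/10: {lower:.1%} to {upper:.1%}")
```

The limits obtained this way are close to the 2.5% and 55% on the slide, as would be expected when the sample of 10 is a small fraction of the population.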

Interval estimate for 2/10


Actual population prevalence was 15%,
which in fact is between 2.5% and 55%.
2.5% to 55% is a very wide interval, i.e.,
a very imprecise estimate
To make it more precise, we need a
larger sample
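
How much larger? The sketch below holds the sample prevalence at 20% and varies the sample size, using the normal-approximation interval (estimate +/- 1.96 x s.e.). The sample sizes are hypothetical, and the approximation is an assumption that would not be appropriate for a sample as small as 10.

```python
from math import sqrt

def interval_width(prevalence, n, z=1.96):
    """Width of the normal-approximation 95% interval for a proportion."""
    standard_error = sqrt(prevalence * (1 - prevalence) / n)
    return 2 * z * standard_error

widths = {n: interval_width(0.20, n) for n in (100, 400, 1600)}
for n, width in widths.items():
    print(f"n = {n:5d}: interval width = {width:.3f}")
```

Because the standard error shrinks with the square root of the sample size, each fourfold increase in n cuts the width of the interval only in half.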

Signs from around the world: Germany

A sign posted in Germany's Black Forest:
It is strictly forbidden on our black forest camping
site that people of different sex, for instance, men
and women, live together in one tent unless they
are married with each other for that purpose.

Signs from around the world: Finland

On the faucet in a Finnish washroom:
To stop the drip, turn cock to right.
