
Chapter 2

Types of Studies
Modified by B. Issa, PhD
Contents

2.1 Surveys
2.2 Comparative Studies
Types of Studies
• Surveys are mainly used to describe / quantify
population characteristics (e.g., a study of the
prevalence of hypertension in a population) → §2.1
• Comparative studies determine relationships
between variables (e.g., a study to address whether
weight gain causes hypertension) → §2.2
2.1 Surveys
• Goal: to describe/quantify population characteristics
• The population consists of all entities worthy of
study.
• A study that attempts to collect data from the entire
population is called a census
• Studies usually use a subset of the population
(a sample)
• Samples are used to make inferences / conclusions /
generalizations about population characteristics
2.1 Surveys
• Sampling has many advantages:
– Saves time
– Saves money
– It can be more beneficial, as it allows resources to
be devoted to greater scope and accuracy
• Generally, sampling is the rule (the common
practice) in statistics/research. A census is
rarely carried out.
Illustrative Example: Youth Risk Behavior
Surveillance (YRBS) (page 22, Chapter 2)
YRBS monitors health behaviors in youth and young
adults in the US. Six categories of health-risk behaviors
are monitored. These include:
1. Behaviors that contribute to unintentional injuries
and violence;
2. tobacco use;
3. alcohol and drug use;
4. sexual behaviors;
5. unhealthy dietary behaviors; and
6. physical activity levels and body weight.
Illustrative Example: Youth Risk Behavior
Surveillance (YRBS)
The 2003 report used information from 15,240
questionnaires completed at 158 schools to infer health-
risk behaviors for the public and private school student
populations of the United States and District of Columbia. (a)
The 15,240 students who completed the questionnaires
comprise the sample. This information is used to infer the
characteristics of the several million public and private
school students in the United States for the period in
question.
(a) Grunbaum, J. A., Kann, L., Kinchen, S., Ross, J., Hawkins, J., Lowry, R., et al. (2004). Youth risk behavior surveillance--
United States, 2003. MMWR Surveillance Summaries, 53(2), 1–96.


Random or Probability Sampling
• Samples should be chosen in such a way that
allows for generalizations to be made to the entire
population.
• To achieve this the sampling must entail an
element of chance in the selection of individuals,
therefore a random sample or probability
sampling must be used.
• In the simplest case, each member of the population has an equal
chance / probability of being part of the sample.
Simple Random Sampling (SRS)
• The most fundamental type of probability
sample is the Simple Random Sample (SRS)
• SRS (defined): an SRS of size n is selected so that
all possible combinations of (n) individuals from
the population (N) are equally likely to comprise
the sample
• SRSs demonstrate sampling independence
• Example: All possible combinations to select 3
out of 10 students.
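To make this example concrete, the short Python sketch below counts the possible samples of 3 students out of 10. The student labels 1 to 10 are an assumption made purely for illustration; under an SRS each of the listed combinations is equally likely.

```python
from itertools import combinations
from math import comb

students = range(1, 11)                      # hypothetical labels for the 10 students
all_samples = list(combinations(students, 3))

n_samples = comb(10, 3)                      # number of ways to choose 3 out of 10
prob_each = 1 / n_samples                    # every sample is equally likely under SRS

print(n_samples, len(all_samples))           # 120 120
```

Listing the combinations is only feasible for tiny populations; for realistic N the count grows far too quickly to enumerate.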
Sampling Independence
• Each population member has the same probability
of being selected into the sample
+
• Sampling independence: the selection of any
individual into the sample does not influence the
likelihood of selecting any other individual
Simple Random Sampling Method
• How to do SRS in practice? Identify each population
member (e.g., give it a number), mix all members, then
blindly draw entries
1. Number population members 1, 2, . . ., N
2. Pick an arbitrary spot in the random digit table (Table
A in the Appendix)
3. Go down rows or columns of table to select n
appropriate tuples (e.g. doublet, triplet, ..) of digits
(discard inappropriate tuples)
• Alternatively, use a random number generator (e.g.,
www.random.org or those in software like Excel &
MATLAB) to generate n random numbers between 1 and
N.
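As a rough sketch of the random-number-generator route, Python's standard `random` module can draw n distinct IDs between 1 and N. The function name, the values N = 600 and n = 3, and the seed are illustrative assumptions, not part of the slides.

```python
import random

def simple_random_sample(N, n, seed=None):
    """Return n distinct IDs drawn at random from 1..N (without replacement)."""
    rng = random.Random(seed)                # seed only makes the draw repeatable
    return rng.sample(range(1, N + 1), n)

sample_ids = simple_random_sample(N=600, n=3, seed=2024)
print(sample_ids)
```

`random.sample` never repeats an ID, which matches sampling without replacement; every subset of size n is equally likely to be returned.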
Illustrative Example: Selecting a simple
random sample

Suppose a high school population has 600 students
and you want to choose three students at random
from this population. To select an SRS of n = 3:
1. Get a roster of the school. Assign each student a
unique identifier 1 through 600.
2. Enter Table A at (say) line 15. Line 15 starts with
these digits:
76931 95289 55809 19381 56686
Appendix A *
Illustrative Example: Selecting a
simple random sample
3. The first six triplets of numbers in this line are
769, 319, 528, 955, 809, and 193. Why triplet?
4. The first triplet (769) is excluded because there is
no individual with that number in the population.
The next two triplets (319 and 528) identify the first
two students to enter the sample. The next two
triplets (955 and 809) are excluded because they exceed 600. The last
student to enter the sample is student 193.
The final sample is composed of students with the
IDs 319, 528, and 193.
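The table-reading procedure in this example can be sketched in a few lines of Python. The function name is hypothetical; the digits are the ones quoted from line 15 of Table A. Triplets are used because the largest ID (600) has three digits.

```python
def table_a_sample(digits, N, n):
    """Read fixed-width tuples from a string of random digits and keep
    the first n distinct values that fall between 1 and N."""
    digits = digits.replace(" ", "")
    width = len(str(N))                      # 600 has three digits, hence triplets
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        value = int(digits[i:i + width])
        if 1 <= value <= N and value not in chosen:
            chosen.append(value)             # discard out-of-range or repeated tuples
        if len(chosen) == n:
            break
    return chosen

line_15 = "76931 95289 55809 19381 56686"
print(table_a_sample(line_15, N=600, n=3))   # [319, 528, 193]
```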
Sampling Fraction
• Ratio of size of sample (n) to the population size
(N). Usually quoted as a percentage. A correction
factor may be needed if it is more than 5%.
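A small sketch of the calculation follows. The slide does not say which correction factor is meant; the finite population correction, sqrt((N - n) / (N - 1)), is one commonly used multiplier for standard errors and is assumed here. The values n = 60 and N = 600 are hypothetical.

```python
import math

def sampling_fraction(n, N):
    """Ratio of sample size to population size."""
    return n / N

def finite_population_correction(n, N):
    """sqrt((N - n) / (N - 1)); an assumed choice of correction factor."""
    return math.sqrt((N - n) / (N - 1))

n, N = 60, 600                               # hypothetical sample and population sizes
f = sampling_fraction(n, N)
fpc = finite_population_correction(n, N)

print(f"sampling fraction = {f:.1%}")
if f > 0.05:                                 # the 5% rule of thumb from the slide
    print(f"correction factor = {fpc:.3f}")
```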

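Sampling with or without Replacement
• Both can be used. In sampling with replacement, a
selected member is put back into the mix after being
selected, so a member can appear in the sample
more than once. In sampling without replacement, a
selected member is not returned, so each member can
appear at most once and has a chance (or probability)
of n/N of being in the sample.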
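The difference between the two schemes can be sketched with Python's standard `random` module. The population of 10 numbered members and the seed are assumptions for illustration only.

```python
import random

rng = random.Random(1)                       # arbitrary seed for repeatability
population = list(range(1, 11))              # hypothetical population, N = 10

# With replacement: each draw is made from the full population,
# so the same member may be drawn more than once.
with_replacement = rng.choices(population, k=5)

# Without replacement: a drawn member is not returned,
# so every member appears at most once.
without_replacement = rng.sample(population, k=5)

print(with_replacement)
print(without_replacement)
```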
Cautions when Sampling
• Samples that tend to under-represent or over-
represent certain segments of the population can
bias the results
– E.g.(1) when sampling a population that has a minority group
using SRS, the sample may not represent the views of
this group; therefore another type of sampling is required.
– E.g.(2) when sampling a population of school students where
the percentages of male and female students are 40%
and 60% respectively, an SRS may not reproduce these
proportions; therefore another type of sampling is required.
• Under-coverage: groups in the source population
are left out or underrepresented in the population
list used to select the sample (e.g., an electronic
survey will exclude those who do not have internet
access or computers)
Cautions when Sampling
• Volunteer bias: occurs when self-selected
participants are atypical of the source population
– E.g.(1) Advertising in a newspaper, on a website, or on an ad
stand: only people who see the advertisement have a
chance to volunteer to participate
– E.g.(2) Recruiting in a gym for a study about physical
activity.

• Non-response bias: occurs when a large
percentage of selected individuals refuse to
participate or cannot be contacted.
– Non-responders often differ systematically from
responders.
Other Types of Probability Samples
• When simple random sampling is not possible for
practical or other valid reasons, other types of sampling
might be performed.
• However, if non-random sampling is used, the
generalizability of the study results may be questioned.
• Other types of probability sampling
– Probability sample: a sample in which each member
of the population has a known probability of being
selected into the sample.
– Simple random sampling is the most basic type of
probability sampling, where all individuals in the
population have an equal probability of being in the sample
– However, in practice, as mentioned in the previous
examples, a more complex sampling technique is often
required.
Other Types of Probability Samples
• Stratified random sample: a sample
that draws independent SRSs from within
relatively homogeneous strata or groups
– E.g.(1) sampling a population of school
students where the percentages of male and
female students are 40% and 60%,
respectively.
1. Calculate the sample size required
2. Draw 40% of the required sample by SRS
from the male population
3. Draw 60% of the required sample by SRS
from the female population
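The three steps above can be sketched as a proportional stratified sample. The school of 500 students, the ID labels, the total sample size of 50 and the function name are all hypothetical; only the 40% / 60% split comes from the example.

```python
import random

def stratified_sample(strata, n, seed=None):
    """Draw an independent SRS from each stratum, sized in proportion
    to the stratum's share of the population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(n * len(members) / total)  # stratum's share of the sample
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical school: 200 male (40%) and 300 female (60%) students
strata = {
    "male": [f"M{i}" for i in range(1, 201)],
    "female": [f"F{i}" for i in range(1, 301)],
}
sample = stratified_sample(strata, n=50, seed=7)
print({name: len(ids) for name, ids in sample.items()})  # {'male': 20, 'female': 30}
```

With other stratum sizes the rounded shares may not add up to exactly n, so a real design would need a rule for allocating the remainder.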
Other Types of Probability Samples
• Cluster samples: randomly select large units (clusters)
consisting of smaller units.
• This allows for random sampling when there is no reliable
list of sub-units, but there is one for the clusters
• E.g.(1): Examining the effect of a school-based intervention on
students' eating knowledge and habits
1. Participating schools were randomly selected from the list of
schools in country X
2. Participating students were then selected from these schools
• E.g.(2): Examining the income of families within a particular area:
you don’t have a list of earning individuals in all the houses
within the area, but you have a list of ALL house addresses
1. Participating houses were randomly selected from the list of
house addresses
2. Individual earners within the participating households were
interviewed
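A minimal sketch of the school example as a two-stage cluster sample is shown below. The six schools, their rosters and the function name are hypothetical, and drawing an SRS of students within each selected school is an assumption; the slide only says that students were selected from the chosen schools.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly select clusters. Stage 2: draw an SRS within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical list of schools, each with a roster of 40 students
schools = {f"School {s}": [f"{s}-{i}" for i in range(1, 41)] for s in "ABCDEF"}

sample = two_stage_cluster_sample(schools, n_clusters=2, n_per_cluster=5, seed=3)
for school, students in sample.items():
    print(school, students)
```

Only the list of clusters is needed up front; rosters are required only for the clusters that end up selected.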
Other Types of Probability Samples

M. G. Stabin. Radiation Protection and Dosimetry-Springer


2.2 Comparative Studies
• Comparative designs study the relationship
between an explanatory variable (independent
variable) and response variable (dependent
variable).
Exercise 2.4, 2.5
• Whereas the main concern in survey studies is
inferring population characteristics from a
sample, the main concern in comparative studies
is the comparability of population groups with
respect to a particular characteristic.
• Comparative studies may be experimental or non-
experimental.
2.2 Comparative Studies
• In experimental designs, the investigator assigns
the subjects to groups according to the
explanatory variable (e.g., exposed and
unexposed groups)

• In non-experimental designs, the investigator
does not assign subjects into groups; individuals
are merely classified as “exposed” or “non-
exposed.”
• Non-experimental studies are also called
observational studies.
Exercises
2.4 Explanatory variable and response variable. Identify the explanatory
variable and response variable in each of the studies described here.
• (a) A study of cell phone use and primary brain cancer suggested that cell
phone use was not associated with an elevated risk of brain cancer.
• (b) Records of more than three-quarters of a million surgical procedures
conducted at 34 different hospitals were monitored for anesthetics safety.
The study found a mortality rate of 3.4% for one particular anesthetic. No
other major anesthetic was associated with mortality greater than 1.9%.
• (c) In a landmark study involving more than three-quarters of a million
individuals in the United States, Canada, and Finland, subjects were
randomly given either the Salk polio vaccine or a saline (placebo) injection.
The vaccinated group experienced a polio rate of 28 per 100,000 while the
placebo group had a rate of 69 per 100,000. A third group that refused to
participate had a polio rate of 46 per 100,000.

2.5 Experimental or nonexperimental? Determine whether each of the studies
described in Exercise 2.4 is experimental or nonexperimental.
• Explain your reasoning in each instance.
2.2 Comparative Studies
• Various types of control group may be used in
experimental studies.
– E.g. (1) Placebo control: uses an inert
(impotent, inactive, fake) intervention /
drug intended to deceive the recipient
– E.g. (2) Active control: new intervention vs.
old/known intervention (routine)
2.2 Comparative Studies
– In the case of a placebo (or even other types of control),
even though the treatment is inert, some changes
may occur after its administration. This is called
the “placebo effect”
– The “placebo effect” means that recipients of a
placebo may also report an improvement in their
subjective experience.
Figure 2.1 Experimental and non-experimental study designs
Example of an Experimental
Design
• The Women's Health Initiative study randomly
assigned about half its subjects to a group
that received hormone replacement therapy
(HRT).

• Subjects were followed for ~5 years to
ascertain various health outcomes, including
heart attacks, strokes, the occurrence of
breast cancer, and so on.
Example of a Nonexperimental
Design
• The Nurse's Health study classified individuals
according to whether they received HRT.

• Subjects were followed for ~5 years to
ascertain the occurrence of various health
outcomes.
Please carefully observe the difference
between this slide & the previous one!
2.2 Comparative Studies
• Confounding (external factors, mediating
variable): is a distortion in an association between
an explanatory variable and response variable
brought about by the influence of extraneous
factors
• It is more likely to happen in comparative non-
experimental studies.
Comparison of Experimental and
Nonexperimental Designs
• In both the experimental (WHI) study and
nonexperimental (Nurse’s Health) study, the
relationship between HRT (explanatory variable)
and various health outcomes (response variables)
was studied.
• In the experimental design, the investigators
controlled who was and who was not exposed.
• In the nonexperimental design, the study subjects
(or their physicians) decided on whether or not
subjects were exposed.
Confounding (external factors)
• E.g. 1: A researcher compared two samples of subjects to look for a
correlation between coffee drinking and cancer. The researcher
found a positive correlation. This result was actually influenced by
a confounding factor, smoking: almost all coffee drinkers in the
comparison sample were also smokers, and smokers tended to
drink more coffee.

• E.g. 2: A study aimed to examine the relationship between
background music and task performance among employees at a
packing facility. Music was played in the morning, and the number of
packs produced was compared with the number produced in the
afternoon, when the music was off.
• The researcher found a positive relationship. This result was actually
influenced by a confounding factor, the time of day at which the music
was played: employees are likely to be more active in the morning than
in the afternoon anyway, and therefore more productive.
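The coffee example (E.g. 1) can be illustrated with a small table of made-up counts. All numbers below are hypothetical and were chosen so that coffee has no effect within either smoking stratum, while most coffee drinkers are smokers. The crude comparison still shows an apparent association, which disappears once smokers and non-smokers are examined separately.

```python
# Hypothetical counts: (smoker, coffee drinker) -> (cancer cases, people)
counts = {
    (True, True): (16, 80),
    (True, False): (4, 20),
    (False, True): (1, 20),
    (False, False): (4, 80),
}

def risk(cells):
    """Pooled risk = total cases / total people over the given cells."""
    cases = sum(c for c, _ in cells)
    people = sum(p for _, p in cells)
    return cases / people

def risk_ratio(smoker=None):
    """Risk in coffee drinkers divided by risk in non-drinkers.
    smoker=None pools both smoking strata (the crude comparison)."""
    strata = [True, False] if smoker is None else [smoker]
    exposed = [counts[(s, True)] for s in strata]
    unexposed = [counts[(s, False)] for s in strata]
    return risk(exposed) / risk(unexposed)

crude = risk_ratio()                 # 0.17 / 0.08, about 2.1
among_smokers = risk_ratio(True)     # 0.20 / 0.20 = 1.0
among_nonsmokers = risk_ratio(False) # 0.05 / 0.05 = 1.0
print(round(crude, 3), among_smokers, among_nonsmokers)
```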
Jargon
• A subject ≡ an individual participating in the
experiment
• A factor ≡ an explanatory variable being
studied; experiments may address the effect
of multiple factors
• A treatment ≡ a specific combination of factor levels, i.e., it can
combine the effects of more than one explanatory
variable
Illustrative Example: Hypertension
Trial
• A trial looked at two explanatory factors in the
treatment of hypertension.
• Factor A was a health-education program aimed
at increasing physical activity, improving diet,
and lowering body weight. This factor had two
levels: active treatment or passive treatment.
• Factor B was pharmaceutical treatments at three
levels: medication A, medication B, and placebo.
Illustrative Example: Hypertension
Trial
• Because there were two levels of the health-
education variable and three levels of
pharmacological variable, the experiment
evaluated six treatments, as shown in Table 2.2.
• The response variable was “change in systolic
blood pressure” after six months. One hundred
and twenty (120) subjects were studied in total,
with equal numbers assigned to each group.
• Figure 2.3 is a schematic of the study design.
Table 2.2 Hypertension treatment trial
with two factors and six treatments
• Subjects = 120 individuals who participated in the
study
• Factor A = Health education: active or passive
• Factor B = Medication: Rx A, Rx B, or placebo
• Treatments = the six specific combinations of
factor A and factor B
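As a quick check of the arithmetic, the six treatments are simply every pairing of a Factor A level with a Factor B level. The sketch below enumerates them; the level labels are paraphrased from the slides, and the variable names are illustrative.

```python
from itertools import product

factor_a = ["active education", "passive education"]      # 2 levels
factor_b = ["medication A", "medication B", "placebo"]    # 3 levels

treatments = list(product(factor_a, factor_b))            # 2 x 3 = 6 combinations
subjects_per_group = 120 // len(treatments)               # equal allocation

for number, (education, drug) in enumerate(treatments, start=1):
    print(number, education, "+", drug)
print(len(treatments), subjects_per_group)                # 6 20
```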
Figure 2.3 Study design outline, hypertensive
treatment trial illustrative example

An interaction occurs when factors in combination produce an effect that could not be
predicted by looking at the effect of each factor separately
Random Assignment of Treatments
• Experiments involving human subjects are called
trials.
• Trials with one or more control groups are
controlled trials.
• When the assignment of the treatment is based on
chance, this is a randomized controlled trial.
Three Important Experimentation
Principles
• Controlled comparison
• Randomized
• Blinded
“Controlled” Trial
• The term “controlled” in this context means there is a non-
exposed “control group”
• Having a control group is essential because the effects of a
treatment can be judged only in relation to what would
happen in its absence (previous polio example)
• You cannot judge effects of a treatment without a control
group because:
– Conditions change on their own over time
E.g. studying new mothers’ confidence in their ability to
care for their child after a particular intervention, with no
control group and self-confidence measured at one
month, 6 months, and 1 year. Mothers’ confidence is expected
to improve after 1 year anyway, so we cannot be sure that
the intervention contributed to this unless we have a
control group.
Randomization
• Randomization is the second principle of
experimentation
• Randomization refers to the use of chance
mechanisms to assign treatments
• Randomization balances lurking variables
(confounding variables) among treatment groups,
mitigating their potentially confounding effects
Polio rate per 100,000:
Vaccinated 28
Placebo 69
Refused 46
Without placebo, the vaccine's benefit would have been underestimated

Important illustrative example page 35: Poliomyelitis Trial


Randomization
• Poliomyelitis trial. In the 1954 poliomyelitis field trial
mentioned in Exercise 2.4(c), subjects were randomly
assigned to either the Salk polio vaccine group or a saline
placebo group.
• When the trial was initially proposed, there was a suggestion
that everyone who agreed to participate in the study be given
the vaccine while those who refused to participate serve as
the control group.
• Fortunately, this did not occur and a placebo control group was
included in the study. It was later revealed that “refusers” had
atypically low polio rates.
• Had the refusers been used as the control group, the benefits
of the vaccine would have been greatly underestimated.

Per 100,000: Vaccinated 28; Placebo 69; Refused 46
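The size of the underestimate can be worked out from the three rates quoted above. The sketch below uses relative risk reduction (1 minus the ratio of rates) as the measure of benefit; that choice of measure is an assumption made for illustration.

```python
rates = {"vaccinated": 28, "placebo": 69, "refused": 46}   # polio cases per 100,000

def relative_reduction(treated, control):
    """Proportional reduction in the rate relative to the comparison group."""
    return 1 - treated / control

vs_placebo = relative_reduction(rates["vaccinated"], rates["placebo"])
vs_refusers = relative_reduction(rates["vaccinated"], rates["refused"])

print(f"vs placebo:  {vs_placebo:.0%}")    # 59%
print(f"vs refusers: {vs_refusers:.0%}")   # 39%
```

Against the randomized placebo group the vaccine appears to cut the polio rate by about 59%; against the refusers it would have appeared to cut it by only about 39%.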


Randomization - Example
Consider this study (JAMA 1994;271: 595-600)
• Explanatory variable: Nicotine or placebo patch
• 60 subjects (30 each group)
• Response: Cessation of smoking (yes/no)

Random assignment
→ Group 1 (30 smokers) → Treatment 1: Nicotine patch
→ Group 2 (30 smokers) → Treatment 2: Placebo patch
Compare cessation rates
Randomization – Example
• Number subjects 01,…,60
• Use Table A (or a random number generator)
to select 30 two-tuples between 01 and 60
• If you use Table A, arbitrarily select a different
starting point each time
• For example, if we start in line 19, we see
04247 38798 73286
Randomization, cont.
• We identify random two-tuples, e.g., 04, 24, 73, 87,
etc.
• Random two-tuples greater than 60 are ignored
• The first three individuals in the treatment group are
04, 24, and 28
• Keep selecting random two-tuples until you identify
30 unique individuals
• The remaining subjects are assigned to the control
group
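The procedure can be sketched in Python in two ways: reading line 19 of Table A in pairs of digits, and using a random number generator for the full assignment. The function name and the seed are illustrative assumptions; the digits are those quoted from line 19.

```python
import random

def assign_from_table(digits, n_subjects, n_treatment):
    """Read two-digit tuples and keep the first n_treatment distinct
    subject numbers between 1 and n_subjects."""
    digits = digits.replace(" ", "")
    treatment = []
    for i in range(0, len(digits) - 1, 2):
        subject = int(digits[i:i + 2])
        if 1 <= subject <= n_subjects and subject not in treatment:
            treatment.append(subject)        # tuples above 60 or repeats are ignored
        if len(treatment) == n_treatment:
            break
    return treatment

line_19 = "04247 38798 73286"
first_three = assign_from_table(line_19, n_subjects=60, n_treatment=3)
print(first_three)                           # [4, 24, 28]

# Alternative: let a random number generator pick all 30 treatment subjects.
rng = random.Random(19)                      # arbitrary seed for repeatability
subjects = list(range(1, 61))
treatment_group = sorted(rng.sample(subjects, 30))
control_group = [s for s in subjects if s not in treatment_group]
```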
Blinding
• Blinding is the third principle of experimentation
• Blinding: individuals involved in the study are kept
unaware of the treatment assignment
• Blinding can occur at several levels of a study design
– Single blinding - subjects are unaware of the specific
treatment they are receiving
– Double blinding - subjects and investigators are
blinded
– Triple blinding - the subjects, the investigators, and the
statisticians analyzing the data are blinded.

Important illustrative example page 37:
Ginkgo and memory enhancement Trial
Double blinding: Ginkgo and Memory Enhancement Trial
• Ginkgo biloba is a commonly used herb that is claimed to improve
memory and cognitive function.
• A randomized, double-blinded study in an elderly population was
conducted to evaluate whether this claim is true.
• The treatment group consisted of subjects who took the active
product according to the manufacturer’s recommendation.
• The control group received lactose gelatin capsules that looked and
tasted like the ginkgo pill.
• The study was double blinded, with subjects and evaluators
administering cognitive tests unaware of whether subjects were
receiving the ginkgo or placebo.
• Analysis revealed no difference in any of the cognitive functions
that were measured.
• Double blinding helps avoid biases associated with the reporting and recording of the
study outcomes.
• When errors in measurement do occur, they are likely to occur equally in the groups being
compared, mitigating more serious forms of bias
Ethics
• Informed consent
• Beneficence
• Equipoise
• Independent (IRB) over-sight
Exercises
2.1 Sample and population. For the scenarios presented here,
identify the source population and sample as specifically as
possible. If information is insufficient, do your best to
provide a reasonable description of the population and
sample and then suggest additional “person, place, and
time characteristics” that are needed to better define the
population.
(a) A study that reviewed 125 discharge summaries from a large
university hospital in metropolitan Detroit found that 35% of
the individuals in the hospital received antibiotics during their
stay.
(b) A study of eighteen 35- to 44-year-old diabetic men found a
mean body mass index that was 13% above what is considered
to be normal.
Exercises
2.8 MRFIT. The MRFIT field trial discussed as an
illustrative example studied 12,866 high-risk men
between 35 and 57 years of age. Use Table A
starting in row 03 to identify the first two members
of the treatment group.

2.9 Five-City Project. The Stanford Five-City
Project (Exercise 2.7) randomized cities to either a
treatment or a control group. Number the cities 1
through 5. Use Table A starting in line 17 to
randomly select the two treatment cities.
Exercises
2.10 Controlled-release morphine in patients with
chronic cancer pain.
Warfield reviewed 10 studies comparing the
effectiveness of controlled-release and immediate-
release morphine in cancer patients with chronic
pain. The studies that were reviewed were double
blinded. How would you double blind such studies?
Review Questions
2.1 What is the general goal of a statistical survey?
2.5 What is sampling independence?
2.6 What is a sampling fraction?
2.7 What is sampling bias?
2.9 What is a probability sample?
2.10 What is multistage sampling?
2.16 Select the best response: This is the mixing-up of
the effects of the explanatory factor with that of
extraneous “lurking” variables.
(a) the placebo effect
(b) blinding
(c) confounding
Review Questions
Select the best response: An individual who
participates in a statistical study is often referred to as a
(a) study subject
(b) study factor
(c) study treatment

2.22 Who is usually “kept in the dark” in double-
blinded studies?
2.23 What does equipoise mean?
2.24 What does IRB stand for?
Exercises
2.12 Sampling nurses. You want to survey nurses who
work at a particular hospital. Of the 90 nurses who
work at this hospital, 40 work in the maternity ward,
20 work in the oncology ward, and 30 work in the
surgical ward.
You decide to study 10% of the nurse population so
you choose nine nurses as follows: four nurses are
chosen at random from the 40 maternity nurses, two
are chosen at random from the 20 oncology nurses,
and three are chosen at random from the 30 surgical
nurses.
Is this a simple random sample? Explain your
response.
Exercises
2.15 Four-naughts. Could the number “0000” appear
in a table of random digits? If so, how likely is this?

2.16 Class survey. A simple random sample of
students is selected from students attending a class.
Identify a problem with this sampling method.
Exercises
2.18 How much do Master of Public Health (MPH) students
earn? A university official wants to know how much MPH
students earn from employment during the academic year and
during the summer. The student population at the official’s
school consists of 378 MPH students who have completed at
least one year of MPH study at three different campuses. A
questionnaire will be sent to an SRS of 75 of these students.

(a) You have a list of the current email addresses and
telephone numbers of all the 378 students. Describe how you
would derive an SRS of n = 30 from this population.
(b) Use Table A starting in line 13 to identify the first 3
students in your sample.
