
DKT

Biostatistics
Inferential Statistics
Key Lecture Concepts
• Assess the role of random error (chance) as an influence on the validity of a statistical association
• Identify the role of the p-value in statistical assessments
• Briefly introduce the statistical tests to be undertaken

2
Research Process
Research question

Hypothesis

Identify research design

Data collection

Presentation of data

Data analysis

Interpretation of data

3
Interpreting Results

When evaluating an association between a disease (e.g., cancer) and an exposure (e.g., smoking), we need guidelines to help determine whether there is a true difference in the frequency of disease between the two exposure groups, or just random variation arising from the study sample.
4
Random Error (Chance)
1. Rarely can we study an entire population, so inference is attempted from a sample of the population

2. There will always be random variation from sample to sample

3. In general, smaller samples have less precision, reliability, and statistical power (more sampling variability; see the sketch below)
5
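As a rough illustration of point 3 (a sketch with made-up population values, not part of the original slides), drawing repeated samples of different sizes from the same population shows that smaller samples give more variable estimates:

```python
import numpy as np

rng = np.random.default_rng(42)

def spread_of_sample_means(sample_size, n_samples=1000):
    """Standard deviation of the sample mean across many repeated samples."""
    # Hypothetical population: mean 120, SD 15 (values invented for illustration)
    means = [rng.normal(120, 15, sample_size).mean() for _ in range(n_samples)]
    return np.std(means)

for n in (10, 50, 200):
    print(f"n = {n:3d}: spread of sample means = {spread_of_sample_means(n):.2f}")
# Smaller samples -> larger spread, i.e., more sampling variability (point 3).
```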
Hypothesis Testing
• The process of deciding statistically
whether the findings of an
investigation reflect chance or real
effects at a given level of probability.

6
Elements of Hypothesis Testing
• Null Hypothesis
• Alternative hypothesis
• Identify level of significance
• Test statistic
• Identify p-value / confidence interval
• Conclusion

7
Hypothesis Testing
H0: There is no association between the exposure and disease of interest

H1: There is an association between the exposure and disease of interest

Note: With prudent skepticism, the null hypothesis is given the benefit of the doubt until the data convince us otherwise.
8
Hypothesis Testing
• Because of statistical uncertainty regarding inferences about population parameters based upon sample data, we cannot prove or disprove either the null or the alternative hypothesis as directly representing the population effect.

• Thus, we make a decision based on probability and accept a probability of making an incorrect decision.

9
Associations
• Two types of error can occur that affect the association between exposure and disease

• Type I error: observing a difference when in truth there is none
• Type II error: failing to observe a difference when there is one

10
Interpreting Epidemiologic Results
Four possible outcomes of any epidemiologic study:

                          REALITY
YOUR DECISION             H0 True (no assoc.)     H1 True (yes assoc.)
Do not reject H0          Correct decision        Type II (beta) error
(not stat. sig.)
Reject H0                 Type I (alpha) error    Correct decision
(stat. sig.)
11
Four possible outcomes of a scientific study:

                          REALITY
YOUR DECISION             H0 True (no assoc.)        H1 True (yes assoc.)
Do not reject H0          Correct decision           Failing to find a difference
(not stat. sig.)                                     when one exists
Reject H0                 Finding a difference       Correct decision
(stat. sig.)              when there is none
12
Type I and Type II errors

• α is the probability of committing a Type I error.

• β is the probability of committing a Type II error.

13
“Conventional” Guidelines:

• Set the fixed alpha level (Type I error) to 0.05

  This means that, if the null hypothesis is true, the probability of incorrectly rejecting it is 5% or less.

DECISION (study result)   H0 True                 H1 True
Do not reject H0
(not stat. sig.)
Reject H0                 Type I (alpha error)
(stat. sig.)
14
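The meaning of the fixed alpha level can be checked by simulation; a minimal sketch (with invented data, assuming scipy is available) in which the null hypothesis is true in every simulated study:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n_studies, false_positives = 10_000, 0

for _ in range(n_studies):
    # Both groups are drawn from the same hypothetical population, so H0 is true
    a = rng.normal(0, 1, 30)
    b = rng.normal(0, 1, 30)
    _, p = ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1    # incorrect rejection of a true H0 (Type I error)

print(f"False-positive rate: {false_positives / n_studies:.3f}")  # close to alpha = 0.05
```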
Empirical Rule
For a Normal distribution, approximately:

a) 68% of the measurements fall within one standard deviation around the mean

b) 95% of the measurements fall within two standard deviations around the mean

c) 99.7% of the measurements fall within three standard deviations around the mean
15
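These percentages can be verified directly from the standard Normal distribution; a minimal sketch using scipy (not part of the original slides):

```python
from scipy.stats import norm

# Area under the Normal curve within k standard deviations of the mean
for k in (1, 2, 3):
    coverage = norm.cdf(k) - norm.cdf(-k)
    print(f"within {k} SD: {coverage:.1%}")
# Prints roughly 68.3%, 95.4%, 99.7%, matching the empirical rule above.
```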
Normal Distribution

[Figure: Normal distribution curve, with 50% of the area on each side of the mean; rejection region defined by α (usually set at 5%)]
16
Random Error (Chance)

4. A test statistic is computed to assess "statistical significance", that is, the degree to which the data are compatible with the null hypothesis of no association

5. Given a test statistic and an observed value, you can compute the probability of observing a value as extreme or more extreme than the observed value under the null hypothesis of no association. This probability is called the "p-value" (a sketch follows below)
17
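As a sketch of point 5 (the z statistic of 1.8 is an assumed value for illustration, not taken from the slides), a two-sided p-value can be computed from the standard Normal distribution:

```python
from scipy.stats import norm

z_observed = 1.8  # assumed test statistic value, for illustration only

# Probability of a value as extreme or more extreme than the one observed,
# in either direction, under the null hypothesis (two-sided p-value)
p_value = 2 * norm.sf(abs(z_observed))
print(f"p-value = {p_value:.3f}")  # about 0.072, so not significant at alpha = 0.05
```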
Random Error (Chance)

6. By convention, if p < 0.05, then the association between the exposure and disease is considered to be "statistically significant" (i.e., we reject the null hypothesis (H0) and accept the alternative hypothesis (H1))

18
Random Error (Chance)
• p-value
  • the probability that an effect at least as extreme as that observed could have occurred by chance alone, given there is truly no relationship between exposure and disease (H0)
  • the probability that the observed results occurred by chance
  • the probability that the sample estimates of association differ only because of sampling variability

19
Random Error (Chance)

What does p < 0.05 mean?

Indirectly, it means that we suspect that the magnitude of effect observed is not due to chance alone (in the absence of biased data collection or analysis).

Directly, p = 0.05 means that one test result out of twenty would be expected to occur due to chance (random error) alone.
20
Example:

        D+    D-
E+      15    85       IE+ = 15 / (15 + 85) = 0.15
E-      10    90       IE- = 10 / (10 + 90) = 0.10

RR = IE+ / IE- = 1.5, p = 0.30

Although it appears that the incidence of disease may be higher in the exposed than in the non-exposed (RR = 1.5), the p-value of 0.30 exceeds the fixed alpha level of 0.05. This means that the observed data are relatively compatible with the null hypothesis. Thus, we do not reject H0 in favor of H1 (the alternative hypothesis).
21
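The figures on this slide can be reproduced in Python; the slide does not name the test used, so the chi-square test without continuity correction is an assumption (it gives a p-value close to the quoted 0.30):

```python
from scipy.stats import chi2_contingency

#             D+  D-
table = [[15, 85],   # E+ (exposed)
         [10, 90]]   # E- (non-exposed)

risk_exposed = 15 / (15 + 85)        # 0.15
risk_unexposed = 10 / (10 + 90)      # 0.10
rr = risk_exposed / risk_unexposed   # 1.5

chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"RR = {rr:.1f}, p = {p:.2f}")  # RR = 1.5, p = 0.29 -> do not reject H0
```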
Random Error (Chance)
Take Note:
The p-value reflects both the magnitude of the
difference between the study groups AND the
sample size
• The size of the p-value does not indicate the
importance of the results
• Results may be statistically significant but be
clinically unimportant
• Results that are not statistically significant
may still be important

22
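The first point above (the p-value depends on sample size as well as on the size of the difference) can be seen by scaling up the example from slide 21 while keeping the proportions identical; a sketch, again assuming a chi-square test without continuity correction:

```python
from scipy.stats import chi2_contingency

# Same 15% vs 10% incidence, two hypothetical sample sizes
small = [[15, 85], [10, 90]]        # total n = 200 (as on slide 21)
large = [[150, 850], [100, 900]]    # total n = 2000, identical proportions

for label, table in (("n = 200", small), ("n = 2000", large)):
    chi2, p, dof, expected = chi2_contingency(table, correction=False)
    print(f"{label}: p = {p:.4f}")
# The magnitude of the effect (RR = 1.5) is unchanged, but only the larger
# study reaches "statistical significance".
```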
Sometimes we are more concerned with estimating the size of the true difference than with deciding, on the basis of probability, whether the difference between samples is significant.

23
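Confidence intervals (listed among the elements of hypothesis testing on slide 7) address exactly this: they estimate the size of the difference rather than only testing it. A minimal sketch with invented measurements, using the standard pooled two-sample t interval:

```python
import numpy as np
from scipy import stats

# Hypothetical measurements for two groups (values invented for illustration)
a = np.array([5.1, 4.8, 5.5, 5.0, 5.3])
b = np.array([4.2, 4.5, 4.1, 4.4, 4.6])

diff = a.mean() - b.mean()
df = len(a) + len(b) - 2
pooled_var = ((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)) / df
se = np.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
t_crit = stats.t.ppf(0.975, df)   # critical value for a 95% interval

print(f"difference = {diff:.2f}, "
      f"95% CI = ({diff - t_crit * se:.2f}, {diff + t_crit * se:.2f})")
```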
Selection of Tests of Significance

24
Scale of Data
1. Nominal: Data do not represent an amount or quantity (e.g., marital status, sex)

2. Ordinal: Data represent an ordered series of relationships (e.g., level of education)

3. Interval (Continuous): Data are measured on an interval scale having equal units but an arbitrary zero point (e.g., temperature in Celsius)

4. Ratio (Continuous): Data such as weight, for which one value can be compared meaningfully with another (say, 100 kg is twice 50 kg)
25
Which Test to Use?
Scale of Data                   Test

Nominal                         Chi-square test
Ordinal                         Mann-Whitney U test
Continuous (2 groups)           T-test
Continuous (3 or more groups)   ANOVA
26
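As a hedged sketch (not part of the slides), each row of the table above maps onto a commonly used scipy call; the data values below are invented purely to make the calls runnable:

```python
from scipy.stats import chi2_contingency, mannwhitneyu, ttest_ind, f_oneway

# Nominal data (counts in a contingency table) -> Chi-square test
chi2, p, dof, expected = chi2_contingency([[15, 85], [10, 90]])

# Ordinal data (e.g., education level coded 1-4) in two groups -> Mann-Whitney U test
u_stat, p = mannwhitneyu([1, 2, 2, 3, 4], [2, 3, 3, 4, 4])

# Continuous data, 2 groups -> T-test
t_stat, p = ttest_ind([5.1, 4.8, 5.5, 5.0], [4.2, 4.5, 4.1, 4.4])

# Continuous data, 3 or more groups -> ANOVA
f_stat, p = f_oneway([5.1, 4.8, 5.5], [4.2, 4.5, 4.1], [3.9, 4.0, 3.7])
```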
Example
• Imagine a study on the effectiveness of therapy X in rehabilitation of children with SLI
• Participants:
• 20 children with SLI
• 10 children receive therapy X (therapy group), 10 children do not receive therapy (control group)
• After therapy, receptive & expressive abilities of children are assessed with a
standardized test (e.g., TEDİL)
• The scores of the two groups (therapy & control) are compared
• Statistical test of significance:
• Continuous data (test scores)
• There are 2 groups to compare, so…
• T-test is conducted

SPSS illustration
T-test

Therapy X works!
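The slides show this analysis in SPSS; an equivalent sketch in Python with made-up TEDİL scores (10 children per group, as in the example above) might look like this:

```python
from scipy.stats import ttest_ind

# Hypothetical post-therapy standardized test scores (values invented for illustration)
therapy_group = [78, 82, 75, 88, 91, 84, 79, 86, 90, 83]
control_group = [70, 74, 68, 72, 75, 71, 69, 77, 73, 70]

t_stat, p_value = ttest_ind(therapy_group, control_group)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# If p < 0.05, we reject H0 and conclude the two groups' scores differ.
```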
Protection against Random Error

• Test statistics provide protection from Type I error due to random chance
• Test statistics do not guarantee protection against Type I errors due to bias or confounding
• Statistics demonstrate association, but not causation

29
Resources
• Thomas Songer, PhD. Introduction to Research Methods in the Internet Era. http://www.pitt.edu/~super1/CentralAsia/workshop.htm
