
DKT

Biostatistics
Inferential Statistics
Key Lecture Concepts
• Assess the role of random error (chance) as an influence on the validity of a statistical association
• Identify the role of the p-value in statistical assessments
• Briefly introduce the statistical tests to be undertaken

2
Research Process
Research question

Hypothesis

Identify research design

Data collection

Presentation of data

Data analysis

Interpretation of data

3
Interpreting Results

When evaluating an association between a disease (e.g., cancer) and an exposure (e.g., smoking), we need guidelines to help determine whether there is a true difference in the frequency of disease between the two exposure groups, or just random variation arising from the study sample.
4
Random Error (Chance)
1. Rarely can we study an entire population, so inference is attempted from a sample of the population

2. There will always be random variation from sample to sample

3. In general, smaller samples have less precision, reliability, and statistical power (more sampling variability; see the sketch below)
5
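As a rough illustration of point 3 (a sketch with made-up population values, not part of the original slides), drawing repeated samples of different sizes from the same population shows that smaller samples give more variable estimates:

```python
import numpy as np

rng = np.random.default_rng(42)

def spread_of_sample_means(sample_size, n_samples=1000):
    """Standard deviation of the sample mean across many repeated samples."""
    # Hypothetical population: mean 120, SD 15 (values invented for illustration)
    means = [rng.normal(120, 15, sample_size).mean() for _ in range(n_samples)]
    return np.std(means)

for n in (10, 50, 200):
    print(f"n = {n:3d}: spread of sample means = {spread_of_sample_means(n):.2f}")
# Smaller samples -> larger spread, i.e., more sampling variability (point 3).
```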
Hypothesis Testing
• The process of deciding statistically
whether the findings of an
investigation reflect chance or real
effects at a given level of probability.

6
Elements of Hypothesis Testing
• Null Hypothesis
• Alternative hypothesis
• Identify level of significance
• Test statistic
• Identify p-value / confidence interval
• Conclusion

7
Hypothesis Testing
H0: There is no association between the exposure and disease of interest

H1: There is an association between the exposure and disease of interest

Note: With prudent skepticism, the null hypothesis is given the benefit of the doubt until the data convince us otherwise.
8
Hypothesis Testing
• Because of statistical uncertainty regarding inferences about population parameters based upon sample data, we cannot prove or disprove either the null or the alternative hypothesis as directly representing the population effect.

• Thus, we make a decision based on probability and accept a probability of making an incorrect decision.

9
Associations
• Two types of error can occur that affect the association between exposure and disease

• Type I error: observing a difference when in truth there is none
• Type II error: failing to observe a difference when there is one

10
Interpreting Epidemiologic Results
Four possible outcomes of any epidemiologic study:

                          REALITY
YOUR DECISION             H0 True (no assoc.)     H1 True (yes assoc.)
Do not reject H0          Correct decision        Type II (beta) error
(not stat. sig.)
Reject H0                 Type I (alpha) error    Correct decision
(stat. sig.)
11
Four possible outcomes of a scientific study:

                          REALITY
YOUR DECISION             H0 True (no assoc.)        H1 True (yes assoc.)
Do not reject H0          Correct decision           Failing to find a difference
(not stat. sig.)                                     when one exists
Reject H0                 Finding a difference       Correct decision
(stat. sig.)              when there is none
12
Type I and Type II errors

• α is the probability of committing a Type I error.

• β is the probability of committing a Type II error.

13
“Conventional” Guidelines:

• Set the fixed alpha level (Type I error) to 0.05

  This means that, if the null hypothesis is true, the probability of incorrectly rejecting it is 5% or less.

DECISION (study result)   H0 True                 H1 True
Do not reject H0
(not stat. sig.)
Reject H0                 Type I (alpha error)
(stat. sig.)
14
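The meaning of the fixed alpha level can be checked by simulation; a minimal sketch (with invented data, assuming scipy is available) in which the null hypothesis is true in every simulated study:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n_studies, false_positives = 10_000, 0

for _ in range(n_studies):
    # Both groups are drawn from the same hypothetical population, so H0 is true
    a = rng.normal(0, 1, 30)
    b = rng.normal(0, 1, 30)
    _, p = ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1    # incorrect rejection of a true H0 (Type I error)

print(f"False-positive rate: {false_positives / n_studies:.3f}")  # close to alpha = 0.05
```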
Empirical Rule
For a Normal distribution, approximately:

a) 68% of the measurements fall within one standard deviation around the mean

b) 95% of the measurements fall within two standard deviations around the mean

c) 99.7% of the measurements fall within three standard deviations around the mean
15
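These percentages can be verified directly from the standard Normal distribution; a minimal sketch using scipy (not part of the original slides):

```python
from scipy.stats import norm

# Area under the Normal curve within k standard deviations of the mean
for k in (1, 2, 3):
    coverage = norm.cdf(k) - norm.cdf(-k)
    print(f"within {k} SD: {coverage:.1%}")
# Prints roughly 68.3%, 95.4%, 99.7%, matching the empirical rule above.
```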
Normal Distribution

[Figure: Normal distribution curve, with 50% of the area on each side of the mean; rejection region defined by α (usually set at 5%)]
16
Random Error (Chance)

4. A test statistic is computed to assess "statistical significance", that is, the degree to which the data are compatible with the null hypothesis of no association

5. Given a test statistic and an observed value, you can compute the probability of observing a value as extreme or more extreme than the observed value under the null hypothesis of no association. This probability is called the "p-value" (a sketch follows below)
17
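As a sketch of point 5 (the z statistic of 1.8 is an assumed value for illustration, not taken from the slides), a two-sided p-value can be computed from the standard Normal distribution:

```python
from scipy.stats import norm

z_observed = 1.8  # assumed test statistic value, for illustration only

# Probability of a value as extreme or more extreme than the one observed,
# in either direction, under the null hypothesis (two-sided p-value)
p_value = 2 * norm.sf(abs(z_observed))
print(f"p-value = {p_value:.3f}")  # about 0.072, so not significant at alpha = 0.05
```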
Random Error (Chance)

6. By convention, if p < 0.05, then the association between the exposure and disease is considered to be "statistically significant" (i.e., we reject the null hypothesis (H0) and accept the alternative hypothesis (H1))

18
Random Error (Chance)
• p-value
  • the probability that an effect at least as extreme as that observed could have occurred by chance alone, given there is truly no relationship between exposure and disease (H0)
  • the probability that the observed results occurred by chance
  • the probability that the sample estimates of association differ only because of sampling variability

19
Random Error (Chance)

What does p < 0.05 mean?

Indirectly, it means that we suspect that the magnitude of effect observed is not due to chance alone (in the absence of biased data collection or analysis).

Directly, p = 0.05 means that one test result out of twenty would be expected to occur due to chance (random error) alone.
20
Example:

        D+    D-
E+      15    85       IE+ = 15 / (15 + 85) = 0.15
E-      10    90       IE- = 10 / (10 + 90) = 0.10

RR = IE+ / IE- = 1.5, p = 0.30

Although it appears that the incidence of disease may be higher in the exposed than in the non-exposed (RR = 1.5), the p-value of 0.30 exceeds the fixed alpha level of 0.05. This means that the observed data are relatively compatible with the null hypothesis. Thus, we do not reject H0 in favor of H1 (the alternative hypothesis).
21
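The figures on this slide can be reproduced in Python; the slide does not name the test used, so the chi-square test without continuity correction is an assumption (it gives a p-value close to the quoted 0.30):

```python
from scipy.stats import chi2_contingency

#             D+  D-
table = [[15, 85],   # E+ (exposed)
         [10, 90]]   # E- (non-exposed)

risk_exposed = 15 / (15 + 85)        # 0.15
risk_unexposed = 10 / (10 + 90)      # 0.10
rr = risk_exposed / risk_unexposed   # 1.5

chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"RR = {rr:.1f}, p = {p:.2f}")  # RR = 1.5, p = 0.29 -> do not reject H0
```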
Random Error (Chance)
Take Note:
The p-value reflects both the magnitude of the
difference between the study groups AND the
sample size
• The size of the p-value does not indicate the
importance of the results
• Results may be statistically significant but be
clinically unimportant
• Results that are not statistically significant
may still be important

22
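The first point above (the p-value depends on sample size as well as on the size of the difference) can be seen by scaling up the example from slide 21 while keeping the proportions identical; a sketch, again assuming a chi-square test without continuity correction:

```python
from scipy.stats import chi2_contingency

# Same 15% vs 10% incidence, two hypothetical sample sizes
small = [[15, 85], [10, 90]]        # total n = 200 (as on slide 21)
large = [[150, 850], [100, 900]]    # total n = 2000, identical proportions

for label, table in (("n = 200", small), ("n = 2000", large)):
    chi2, p, dof, expected = chi2_contingency(table, correction=False)
    print(f"{label}: p = {p:.4f}")
# The magnitude of the effect (RR = 1.5) is unchanged, but only the larger
# study reaches "statistical significance".
```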
Sometimes we are more concerned with estimating the size of the true difference than with deciding, on the basis of probability, whether the difference between samples is significant.

23
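Confidence intervals (listed among the elements of hypothesis testing on slide 7) address exactly this: they estimate the size of the difference rather than only testing it. A minimal sketch with invented measurements, using the standard pooled two-sample t interval:

```python
import numpy as np
from scipy import stats

# Hypothetical measurements for two groups (values invented for illustration)
a = np.array([5.1, 4.8, 5.5, 5.0, 5.3])
b = np.array([4.2, 4.5, 4.1, 4.4, 4.6])

diff = a.mean() - b.mean()
df = len(a) + len(b) - 2
pooled_var = ((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)) / df
se = np.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
t_crit = stats.t.ppf(0.975, df)   # critical value for a 95% interval

print(f"difference = {diff:.2f}, "
      f"95% CI = ({diff - t_crit * se:.2f}, {diff + t_crit * se:.2f})")
```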
Selection of Tests of Significance

24
Scale of Data
1. Nominal: Data do not represent an amount or quantity (e.g., marital status, sex)

2. Ordinal: Data represent an ordered series of relationships (e.g., level of education)

3. Interval (Continuous): Data are measured on an interval scale having equal units but an arbitrary zero point (e.g., temperature in Celsius)

4. Ratio (Continuous): Data such as weight, for which one value can be compared meaningfully with another (say, 100 kg is twice 50 kg)
25
Which Test to Use?
Scale of Data                   Test

Nominal                         Chi-square test
Ordinal                         Mann-Whitney U test
Continuous (2 groups)           T-test
Continuous (3 or more groups)   ANOVA
26
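As a hedged sketch (not part of the slides), each row of the table above maps onto a commonly used scipy call; the data values below are invented purely to make the calls runnable:

```python
from scipy.stats import chi2_contingency, mannwhitneyu, ttest_ind, f_oneway

# Nominal data (counts in a contingency table) -> Chi-square test
chi2, p, dof, expected = chi2_contingency([[15, 85], [10, 90]])

# Ordinal data (e.g., education level coded 1-4) in two groups -> Mann-Whitney U test
u_stat, p = mannwhitneyu([1, 2, 2, 3, 4], [2, 3, 3, 4, 4])

# Continuous data, 2 groups -> T-test
t_stat, p = ttest_ind([5.1, 4.8, 5.5, 5.0], [4.2, 4.5, 4.1, 4.4])

# Continuous data, 3 or more groups -> ANOVA
f_stat, p = f_oneway([5.1, 4.8, 5.5], [4.2, 4.5, 4.1], [3.9, 4.0, 3.7])
```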
Example
• Imagine a study on the effectiveness of therapy X in rehabilitation of children with SLI
• Participants:
• 20 children with SLI
• 10 children receive therapy X (therapy group), 10 children do not receive therapy (control group)
• After therapy, receptive & expressive abilities of children are assessed with a
standardized test (e.g., TEDİL)
• The scores of the two groups (therapy & control) are compared
• Statistical test of significance:
• Continuous data (test scores)
• There are 2 groups to compare, so…
• T-test is conducted

SPSS illustration
T-test

Therapy X works!
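The slides show this analysis in SPSS; an equivalent sketch in Python with made-up TEDİL scores (10 children per group, as in the example above) might look like this:

```python
from scipy.stats import ttest_ind

# Hypothetical post-therapy standardized test scores (values invented for illustration)
therapy_group = [78, 82, 75, 88, 91, 84, 79, 86, 90, 83]
control_group = [70, 74, 68, 72, 75, 71, 69, 77, 73, 70]

t_stat, p_value = ttest_ind(therapy_group, control_group)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# If p < 0.05, we reject H0 and conclude the two groups' scores differ.
```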
Protection against Random Error

• Test statistics provide protection from Type I error due to random chance
• Test statistics do not guarantee protection against Type I errors due to bias or confounding
• Statistics demonstrate association, but not causation

29
Resources
• Thomas Songer, PhD. Introduction to Research Methods in the Internet Era. http://www.pitt.edu/~super1/CentralAsia/workshop.htm
