Chapter 5 - Hypothesis Testing Part 1

PRAYER
Lord God,
We thank you that you promise to be with us always. Thank
you that your presence is with us right now.
Today we give you our hearts, our minds and our lives. Come
speak your words of life into our beings.
We pray that you grant us the serenity to accept the things
we cannot change, the courage to change the things we can,
and wisdom to differentiate the two.
Thank you Lord.
Amen.
“What Does it Mean if 2 Companies
Report 95% Efficacy Rates?”
- New York Times, November 20, 2020

“In the case of Pfizer, for example, the
company recruited 43,661 volunteers and
waited for 170 people to come down with
symptoms of COVID-19 and then get a
positive test. Out of these 170, 162 had
received a placebo shot, and just eight had
received the real vaccine.”
- Carl Zimmer (2020)

On Nov. 30, 2020, Moderna (2020) issued a
press release reporting a primary analysis
based on 196 cases of COVID-19 (out of the
30,000 participants in the US), of which 185
cases were observed in the placebo group
versus 11 cases observed in the mRNA-1273
group, resulting in a point estimate of
vaccine efficacy of 94.1%.
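The reported efficacy figures can be roughly reproduced from the case counts alone. The sketch below assumes that the vaccine and placebo groups were about the same size, so that the ratio of case counts stands in for the ratio of attack rates; the actual trial analyses were more detailed than this. The function name `vaccine_efficacy` and its parameters are illustrative only.

```python
def vaccine_efficacy(cases_vaccine, cases_placebo, n_vaccine=1, n_placebo=1):
    """Point estimate: 1 - (attack rate in vaccine arm / attack rate in placebo arm).

    With the default n_vaccine = n_placebo, the arms are ASSUMED to be equal in size,
    so only the case counts matter.
    """
    risk_vaccine = cases_vaccine / n_vaccine
    risk_placebo = cases_placebo / n_placebo
    return 1 - risk_vaccine / risk_placebo


pfizer_ve = vaccine_efficacy(cases_vaccine=8, cases_placebo=162)
moderna_ve = vaccine_efficacy(cases_vaccine=11, cases_placebo=185)

print(f"Pfizer:  {pfizer_ve:.1%}")   # 95.1%
print(f"Moderna: {moderna_ve:.1%}")  # 94.1%
```

Both values are point estimates from a sample. Whether they are convincing evidence about the population is the question that hypothesis testing tries to answer.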
• Hypothesis Testing Intuition
• Null and Alternative Hypotheses
• Type I and Type II Errors
• Significance Level
• Test Statistics
• P-value
• Choosing Statistical Test
Hypothesis Testing:
“The Idea”
Suppose, as a statistics student, you would like
to verify Pfizer's claim that the efficacy rate
of their COVID-19 vaccine is really 95%.
Or, perhaps as a Pfizer researcher, you would
like to find out if your vaccine is really
effective in protecting a vaccinated individual
from acquiring COVID-19.
Let’s recall the steps of a research process
(scientific method)
•Define a problem (based on an observation)
•Gather data
•Generate a theory and formulate a testable
hypothesis (an educated guess)
•Test the hypothesis (experiment)
•Make a conclusion
What is a statistical hypothesis?
In statistical inference, this is an assumption or
prediction about one or more populations expressed
in terms of parameters (e.g. population means,
population proportions or correlations).
What is a statistical hypothesis?
•Say for example, Pfizer vaccines have an
efficacy rate of 95%.
•Or, males have higher IQ level than females.
•Or, intervention X is better than intervention
Y.
•Or, the rating of movies is related to the
number of viewers.
A procedure for deciding, on the basis of
sample data, whether or not to reject a
hypothesis is called a statistical hypothesis
test or significance test.
It uses data to evaluate a hypothesis by
comparing sample point estimates of
parameters to values predicted by the
hypothesis.
How does the scientific method compare with a statistical
test of hypothesis?
• Define a problem
• Gather data
• Formulate a hypothesis
• Test the hypothesis
• Make a conclusion
We create two hypotheses: a hypothesis to be tested, called
the NULL HYPOTHESIS, and a back-up one, called the
ALTERNATIVE HYPOTHESIS.
Null vs Alternative
Hypotheses:
“Why two guesses?”
Types of Hypothesis
• Null Hypothesis (Ho) - always states that there is
no effect in the underlying population.
- by effect we might mean a relationship between
two or more variables, a difference between two
or more different populations or a difference in the
responses of one population under two or more
different conditions.
Types of Hypothesis
• Null Hypothesis (Ho) - is usually formulated for
the purpose of being rejected.
Types of Hypothesis
• Alternative Hypothesis (Research Hypothesis)
(Ha or H1) - the operational statement of the
researcher's hypothesis.
- to be accepted if the null hypothesis is rejected.
- is a prediction of how two variables might be
related to each other.
Types of Hypothesis
• Alternative Hypothesis (Research Hypothesis)
- or, it might be our prediction of how specified
groups of participants might be different from
each other or how one group of participants might
be different when performing under two or more
conditions.
Types of Hypothesis (Remarks)
•The null hypothesis and the alternative
hypothesis are mutually exclusive.
•The alternative (research) hypothesis can be
directional or non-directional.
ALTERNATIVE HYPOTHESES can be…
•Directional research hypothesis
• specifies the direction of the difference or direction
of relationship.
•Non-directional research hypothesis
• does not specify the direction of the difference or
direction of relationship.
Illustration…
• Who is smarter, males or females?
• Is IQ related to academic performance?
• Does in-house training make employees more
productive than out-house training as indicated by
their work performance?
• Is there a significant relationship between the level of
morale of employees and their work performance?
We can restate…
•Who is smarter, males or females?
•Is there a significant difference between
the IQ levels of males and females?
•Is IQ related to academic performance?
•Is there a significant relationship between
IQ and academic achievement?
1. Research Question: Is there a significant difference
between the IQ level of the males and females?
Null Hypothesis: There is no significant difference between
the IQ level of the males and females.
Non-directional alternative hypothesis:
• There is a significant difference between the IQ level of the
males and females.
Directional alternative hypothesis:
• The IQ level of males is higher than that of the females.
2. Research Question: Is there a significant relationship
between IQ and academic achievement?
Null Hypothesis: There is no significant relationship between
IQ and academic achievement.
Non-directional alternative hypothesis:
• There is a significant relationship (or correlation) between
IQ and academic achievement.
Directional alternative hypothesis:
• The higher the IQ of a student, the better his academic
achievement.
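The two sets of hypotheses above can also be written in terms of population parameters. The symbols below are not from the slides and are introduced only for illustration: \mu_M and \mu_F stand for the population mean IQ of males and of females, and \rho stands for the population correlation between IQ and academic achievement.

```latex
% Research question 1: difference between two population means
H_0:\ \mu_M = \mu_F
\qquad
H_a:\ \mu_M \neq \mu_F \ \text{(non-directional)}
\qquad
H_a:\ \mu_M > \mu_F \ \text{(directional)}

% Research question 2: relationship between two variables
H_0:\ \rho = 0
\qquad
H_a:\ \rho \neq 0 \ \text{(non-directional)}
\qquad
H_a:\ \rho > 0 \ \text{(directional)}
```

In each case the null hypothesis states "no effect" as an equality, while the alternative states the effect the researcher expects to find.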
Type I and Type II
Errors:
“The good, the bad, and
the ugly ”
•The decision maker (statistician) relies mainly on
the observed data to decide whether to REJECT the
null hypothesis or ACCEPT (fail to reject) it.
•However, the true state of nature, which is beyond
his control, determines whether his decision
is good or bad.
•Let’s illustrate the possible scenarios for every
decision made against the true state of nature.
Nature of the hypothesis
Decision             Null H is true     Null H is false
Reject the null H    Error              Good decision
Accept the null H    Good decision      Error
Consider the following analogies to
illustrate this scenario.
•Courtroom Trial
•COVID-19 Testing
Courtroom Trial Analogy
•Null Hypothesis: The defendant is not guilty
•Alternative Hypothesis: The defendant is guilty.
•The judge may reject the null and convict the
defendant.
•Or accept the null and acquit him.
Ho: The defendant is not guilty.
Decision            Ho is true (Innocent)    Ho is false (Guilty)
Reject (Convict)    Error                    Good decision
Accept (Acquit)     Good decision            Error
COVID-19 Test Analogy
•Null Hypothesis: The patient is COVID-19 negative.
•Alternative Hypothesis: The patient is COVID-19
positive.
•The test may reject the null and report the patient
as COVID-19 positive.
•Or accept the null and report him as negative.
Ho: The patient is COVID-19 negative.
Decision             Ho is true (Without Virus)    Ho is false (With Virus)
Reject (Positive)    Error                         Good decision
Accept (Negative)    Good decision                 Error
Types of Statistical Errors
• Type I Error (producer’s error) is committed when we
reject the null hypothesis when in fact it is true.
• Type II Error (consumer’s error) is committed when
we accept the null hypothesis when in fact it is false.
Chance of Committing Such Errors
• α (alpha) is the probability of committing a Type I
error.
• β (beta) is the probability of committing a Type II
error.
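One way to see what α and β mean is to simulate many studies in which we already know whether the null hypothesis is true. The sketch below is only an illustration: it uses a two-tailed one-sample z-test with a known standard deviation, and every number in it (the hypothesized mean of 100, the standard deviation of 15, the sample size of 25, the "true" mean of 106) is made up for the demonstration.

```python
import random
from statistics import NormalDist, mean


def rejects_null(sample, mu0, sigma, alpha):
    """Two-tailed one-sample z-test; returns True when H0: mu = mu0 is rejected."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def rejection_rate(true_mu, mu0=100.0, sigma=15.0, n=25, alpha=0.05,
                   trials=4000, seed=1):
    """Share of simulated studies in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        if rejects_null(sample, mu0, sigma, alpha):
            rejections += 1
    return rejections / trials


# H0 is true (true mean equals 100): every rejection is a Type I error.
type1_rate = rejection_rate(true_mu=100.0)

# H0 is false (true mean is 106): every failure to reject is a Type II error.
type2_rate = 1 - rejection_rate(true_mu=106.0)

print(f"Estimated Type I error rate (alpha): {type1_rate:.3f}")
print(f"Estimated Type II error rate (beta): {type2_rate:.3f}")
```

The estimated Type I error rate should land close to the chosen α of 0.05, while the Type II error rate depends on how far the true mean is from the hypothesized one and on the sample size.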
Alpha (𝛼):
“A criterion for
statistical significance”
Alpha (𝛼)
•known as the significance level, it is
interpreted as the allowance for error in
decision making.
•To be useful, the level of significance of a test
must be small.
•By tradition, the most common values of α are
0.05 and 0.01.
Alpha (𝛼)
•It is the probability level that we use as a
cut-off: when the probability of our pattern of
results under the null hypothesis falls below it,
we regard the results as so unlikely that our
research (alternative) hypothesis is more
plausible than the null hypothesis.
Alpha (𝛼)
•On the assumption of the null hypothesis
being true, if the probability of obtaining an
effect due to sampling error is less than 5%
(α = 0.05), then the findings are said to be
‘significant’.
•If this probability is greater than 5%, then the
findings are said to be ‘non-significant’.
Test Statistic:
“The decision maker”
Test Statistic
•A test statistic is a formula (called a decision
maker) used to test the null hypothesis.
•It is used to determine how close a specific
sample result falls to one of the hypotheses
being tested.
•Examples of test statistics: z, χ², t, F
• When we convert our data into a score from a
probability distribution, the score we calculate is
called the test statistic.
• For example, if we were interested in looking for a
difference between two groups, we could convert our
data into a t-value (from the t-distribution). This
t-value is called our test statistic.
• We then calculate the probability of obtaining such a
value by chance factors alone and this represents our
p-value.
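As an illustration of converting data into a t-value, the sketch below computes an independent-samples t statistic for two small groups. It assumes equal population variances (the pooled-variance form of the statistic), and the scores are made up for the example. It stops at the test statistic; turning the t-value into a p-value requires the t-distribution, which is usually left to statistical software.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Independent-samples t statistic, ASSUMING equal population variances."""
    n_a, n_b = len(group_a), len(group_b)
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical scores for two groups (made-up numbers).
group_a = [12, 14, 16, 18, 20]
group_b = [8, 10, 12, 14, 16]

t_value = pooled_t_statistic(group_a, group_b)
degrees_of_freedom = len(group_a) + len(group_b) - 2

print(f"t = {t_value:.3f}, df = {degrees_of_freedom}")  # t = 2.000, df = 8
```

The t-value expresses the difference between the two sample means in units of its standard error, which is what allows it to be compared against a known probability distribution.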
Remark: The values of the test statistic can be
classified in two sets:
• 1. Critical region or rejection region of a test is the set
of values of the test statistic that leads to the
rejection of the null hypothesis.
• 2. Acceptance region is the set of values of the test
statistic that will lead to the acceptance of the null
hypothesis.
•Critical value of the test statistic is the value
which separates the critical region from the
acceptance region.
[Figure: distribution of the test statistic, divided into the REJECTION REGION and the ACCEPTANCE REGION]
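For a z test statistic, the critical values that separate the two regions can be read off the standard normal distribution. The sketch below is one way to do this with Python's standard library; the function names are illustrative, and a two-tailed test is assumed when checking the rejection region.

```python
from statistics import NormalDist


def z_critical_values(alpha, tails=2):
    """Critical value(s) of the standard normal distribution for a given alpha."""
    std_normal = NormalDist()
    if tails == 2:
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    # One-tailed (upper tail) test: a single critical value.
    return (std_normal.inv_cdf(1 - alpha),)


def in_rejection_region(z, alpha=0.05):
    """True when z falls in the rejection region of a two-tailed z-test."""
    lower, upper = z_critical_values(alpha, tails=2)
    return z <= lower or z >= upper


lower_critical, upper_critical = z_critical_values(0.05, tails=2)
(one_tailed_critical,) = z_critical_values(0.05, tails=1)

print(f"Two-tailed critical values: {lower_critical:.3f}, {upper_critical:.3f}")
print(f"One-tailed critical value:  {one_tailed_critical:.3f}")
print(in_rejection_region(2.5), in_rejection_region(1.0))
```

At α = 0.05 the two-tailed critical values are about ±1.96, so a z of 2.5 falls in the rejection region while a z of 1.0 falls in the acceptance region.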
P-value
P-values or attained level of significance
•The p-value is the probability of obtaining the
pattern of results we found in our study (or one
more extreme) if there was no relationship in the
population between the variables in which we
were interested.
•The smaller the p-value, the stronger the
evidence against the null hypothesis.
Remarks
•The type of test to be used depends on the nature
of the research hypothesis.
•In general, if the research hypothesis is
directional, a one-tailed test is used;
•if the research hypothesis is non-directional, a
two-tailed test is used.
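The choice between a one-tailed and a two-tailed test changes the p-value attached to the same test statistic. The sketch below illustrates this for a z statistic; the observed value of 1.80 is made up, and the labels "greater", "less" and "two-sided" are naming choices for this example only.

```python
from statistics import NormalDist


def p_value_from_z(z, alternative="two-sided"):
    """p-value of a z statistic under the standard normal distribution."""
    std_normal = NormalDist()
    if alternative == "greater":   # directional: the effect is expected to be positive
        return 1 - std_normal.cdf(z)
    if alternative == "less":      # directional: the effect is expected to be negative
        return std_normal.cdf(z)
    return 2 * (1 - std_normal.cdf(abs(z)))  # non-directional


z_observed = 1.80  # hypothetical test statistic
alpha = 0.05

one_tailed = p_value_from_z(z_observed, "greater")
two_tailed = p_value_from_z(z_observed)

print(f"One-tailed p-value: {one_tailed:.4f}, reject H0: {one_tailed < alpha}")
print(f"Two-tailed p-value: {two_tailed:.4f}, reject H0: {two_tailed < alpha}")
```

With this statistic the one-tailed test rejects the null hypothesis at α = 0.05 and the two-tailed test does not, which is why the direction of the research hypothesis should be decided before looking at the data.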
Logic of Null Hypothesis Testing
• 1. State the null and the alternative hypotheses.
• 2. Set the level of significance (α) to be used.
• 3. Identify and compute the appropriate test statistic to be
used (e.g., t-statistic).
• 4. Determine the probability value (p-value).
• 5. Make the decision:
• Decision Rule: Reject the null hypothesis if and only if the
p-value is less than the level of significance (α).
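The five steps can be walked through with the Pfizer case counts quoted at the start of the chapter. The sketch below is one possible way to do it, not the analysis used in the actual trial: it assumes the vaccine and placebo groups were equal in size, so that under the null hypothesis each of the 170 cases is equally likely to come from either group, and it uses an exact one-tailed binomial test chosen for this illustration.

```python
from math import comb

# Step 1: State the hypotheses.
#   H0: the vaccine has no effect, so a case is equally likely to come from
#       either group (p = 0.5, ASSUMING equal group sizes).
#   Ha (directional): a case is less likely to come from the vaccine group (p < 0.5).

# Step 2: Set the level of significance.
alpha = 0.05

# Step 3: Compute the test statistic: the number of cases in the vaccine group.
total_cases = 170
vaccine_cases = 8

# Step 4: Determine the p-value: the probability of 8 or fewer of the 170 cases
# falling in the vaccine group if H0 were true (exact binomial with p = 0.5).
p_value = sum(comb(total_cases, k) for k in range(vaccine_cases + 1)) / 2 ** total_cases

# Step 5: Make the decision.
reject_null = p_value < alpha

print(f"p-value = {p_value:.3e}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

The p-value is far below 0.05, so under these assumptions the null hypothesis of "no effect" is rejected in favor of the directional alternative.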
Let’s take a BREAK!!!
Choosing the Appropriate
Test Statistics
(Without asking a
statistician)
Assumptions underlying the use of statistical tests
- Many statistical tests that we use require that our
data have certain characteristics. These
characteristics are called assumptions.
- Many statistical tests are based upon the
estimation of certain parameters relating to the
underlying populations in which we are interested.
These sorts of tests are called parametric tests.
Parametric Tests
- These tests make assumptions that our samples are
similar to underlying probability distributions such as
the standard normal distribution
- scale of the dependent variable should be at least interval
- samples must be drawn from a normally distributed
population
- variances of the population must be approximately equal
(homogeneity of variances)
- no extreme scores
SPSS: Statistics Coach
One of the biggest factors in
determining which statistical
tests you can use to analyse
your data is the way you have
designed your study
The two most common types of tests are
difference tests and
correlational tests
[Table: Overview of the main features of the various research designs]
Research Designs
Another important feature of research designs
is whether you get each participant to take
part in more than one condition.
Between-participants or
Within-participants designs
Research Designs
Between-participants designs are those
where we have different participants
allocated to each condition of the IV.
The participants are called independent or
uncorrelated samples.
Between-participants/Independent Samples
•samples that are randomly selected from distinct
populations
•the sample sizes may or may not be equal.
Examples of Independent Samples
•sample of male students and sample of
female students
•sample of smokers and sample of non-
smokers
•sample of parents, sample of teachers, and
sample of pupils
Limitations of Between-participants design
Different people bring different
characteristics to the experimental
setting, which may increase the presence of
extraneous variables.
Research Designs
Within-participants designs, on the other
hand, are those where each participant is
measured under all conditions of the IV.
The participants are called dependent or
correlated samples.
Within-Participants/Dependent Samples
Dependent samples or correlated samples
usually arise in experimental designs where
the objective is to make sure that the
subjects being compared are comparable in
terms of relevant variables.
Within-Participants/Dependent Samples
•these designs are repeated measures
designs (e.g. pretest-posttest design) and
matched groups design
•the sample sizes of the groups are always
equal.
A. Before & After or Pretest-Posttest Design
(Repeated Measures Designs)
• The two sets of
data are said to be
correlated because
they are taken or
measured from the
same set of
individuals.
MICHAEL JUN M. PONCIANO

B. Matched Groups Design
• We take a sample of
paired individuals and we
randomly split each pair
into two groups.
• The resulting samples are
dependent or correlated.
For instance…
•In an experimental study on the effectiveness of two
methods of teaching, we have to make sure that
the groups of students are comparable in terms of
ability and other relevant characteristics.
•If matching is not done, it may happen that the
groups are not comparable and the internal
validity of the study will be questionable.
Matched group design
•is rarely resorted to by educational
researchers because of the difficulty in
matching individuals.
•the more variables considered, the more
difficult it will be to form a good number of
matched or paired individuals.
Limitations of Within-participants design
•presence of order effects (can be addressed by
counterbalancing)
•demand effects / halo effects
•cannot be used in many quasi-experimental
research designs
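The distinction between independent and dependent samples matters because it changes how the test statistic is computed. The sketch below uses made-up pretest and posttest scores from the same five participants and computes the t statistic two ways: as dependent (paired) samples, and, for comparison only, as if the two sets of scores came from independent groups with equal variances. The function names and data are illustrative.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t_statistic(before, after):
    """t statistic for dependent samples, based on each participant's difference."""
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    return mean(differences) / (stdev(differences) / sqrt(n))


def independent_t_statistic(group_a, group_b):
    """t statistic for independent samples, ASSUMING equal population variances."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))


# Hypothetical scores of the same five participants (made-up numbers).
pretest = [60, 65, 70, 75, 80]
posttest = [64, 68, 75, 79, 84]

paired_t = paired_t_statistic(pretest, posttest)
independent_t = independent_t_statistic(posttest, pretest)

print(f"Treated as dependent samples:   t = {paired_t:.3f}")
print(f"Treated as independent samples: t = {independent_t:.3f}")
```

Because each participant serves as his or her own comparison, the paired statistic removes the differences between people and is much larger here than the independent-samples statistic computed from the same numbers.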
