Group 7 - Hypothesis Testing Two Populations
SUBMITTED BY:
Sherinah Mae P. Abarilles
Jhana B. Juan
SUBMITTED TO:
Republic of the Philippines
POLYTECHNIC UNIVERSITY OF THE PHILIPPINES
Office of the Vice President for Branches and Satellite Campuses
SANTA MARIA BULACAN CAMPUS
LEARNING OBJECTIVES:
PRETEST
Encircle the letter of the correct answer.
1. Statement 1: If confidence interval contains only negative values, then we can conclude
that 𝑝₁ −𝑝₂ <0 and 𝑝₁ <𝑝₂ (with confidence c)
Statement 2: If confidence interval contains only positive values, then we can conclude
that 𝑝₁ −𝑝₂ >0 and 𝑝₁ >𝑝₂ (with confidence c)
Statement 3: If confidence interval for 𝑝₁ −𝑝₂ contains zero, then we cannot say which is
larger.
A. Statement 1 true; Statement 2 false: Statement 3 true
B. Statement 1 false, Statement 2 true: Statement 3 false
C. All statements are correct
D. All statements are incorrect
2. Statement 1: The commonly used levels of confidence are 80%, 85%, and 99%.
Statement 2: A higher confidence level implies a wider interval because it necessitates
accounting for a greater proportion of the potential sampling variability.
A. Statement 1 true; Statement 2 false
B. Statement 1 false, Statement 2 true
C. Both statements are correct
D. Both statements are incorrect
3. A company wants to compare the proportion of customers who prefer Product A versus
Product B. They sampled 200 customers who have purchased Product A and find that 140
prefer it. They also sampled 250 customers who have purchased Product B and find that
180 prefer it. Construct a 95% confidence interval for the difference in proportions.
A. (-0.1156, 0.0756)
B. (0.3452, 0.7490)
C. (-0.3556, 0.2436)
D. (0.1232, 0.7485)
8. Statement 1: The decision to reject the null hypothesis is based on the rejection rule, which
states that if the calculated t-value is more than the critical value -𝑡a, we reject the null
hypothesis in favor of the alternative hypothesis.
Statement 2: For a two-tailed test, if the absolute value of the t-value is less than 𝑡a/2, we
reject the null hypothesis.
A. Statement 1 true; Statement 2 false
B. Statement 1 false, Statement 2 true
C. Both statements are correct
D. Both statements are incorrect
10. Statement 1: In a one-tailed test, the null hypothesis (Ho) states that the difference
between the population means, (𝜇₁ − 𝜇₂), is greater than or equal to 0, while the alternative
hypothesis (Ha) suggests that the difference is less than 0.
Statement 2: In a two-tailed test, the null hypothesis states that the difference between the
population means is equal to 0, and the alternative hypothesis suggests that the difference
is not equal to 0.
A. Statement 1 true; Statement 2 false
B. Statement 1 false, Statement 2 true
C. Both statements are correct
D. Both statements are incorrect
HYPOTHESIS TESTING
Inferential statistics enables us to estimate population values, called parameters, and to
make statements about computed statistics with a stated degree of confidence.
Example: A consumer wants to estimate the average price of similar homes in his city
before putting his home on the market.
Example: A manufacturer wants to know if a new type of steel is more resistant to high
temperatures than an old type was.
Hypothesis Test: Is the new average resistance, µN, greater than the old average
resistance, µ0?
What is a Hypothesis?
A hypothesis is a statement or claim regarding a characteristic of one or more populations.
It is a preconceived idea that is assumed to be true but must be tested for its truth or falsity.
Population mean (example value: 20)
Null hypothesis (Ho)        Alternative hypothesis (Ha)
µ = 20                      µ ≠ 20
µ ≥ 20                      µ < 20
µ ≤ 20                      µ > 20
▪ Ex (null): There is no difference between Coke and Diet Coke
▪ Ex (alternative): There is a difference between Coke and Diet Coke
Note: If you are conducting a research study and you want to use a hypothesis test to support your
claim, the claim must be stated in such a way that it becomes the alternative hypothesis, so it
cannot contain the condition of equality.
2. Two-tailed test. If we are primarily concerned with deciding whether the true value of a
population parameter is different from a specified value, then the test should be two-tailed.
Significance level refers to the percentage of sample means that fall outside certain
prescribed limits.
▪ A test of significance is a problem of deciding between the null and the alternative hypotheses
based on the information contained in a random sample.
▪ The goal will be to reject Ho in favor of Ha, because the alternative is the hypothesis that the
researcher believes to be true. If we are successful in rejecting Ho, we then declare the
results to be “significant”.
Note: It is important to note that we want to set (α) before we start our study because the Type I
error is the more ‘grievous’ error to make. The smaller (α) is, the smaller the region of rejection.
Types of Error
Decision      Ho is actually TRUE              Ho is actually FALSE
Retain Ho     Correct                          Type II Error (false negative)
Reject Ho     Type I Error (false positive)    Correct
P-Value is the smallest level of significance at which Ho will be rejected based on the
information contained in the sample. It is commonly generated by statistical software.
Decision rule: Reject Ho if the p-value is less than or equal to the level of significance (α)
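The decision rule can be sketched in a few lines of Python. This is only an illustration: it assumes a z test statistic, the function names `two_sided_p_value` and `decide` are hypothetical, and the value z = 2.10 is made up for the usage line.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic z."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def decide(p_value: float, alpha: float = 0.05) -> str:
    """Decision rule: reject Ho when the p-value is at most alpha."""
    return "Reject Ho" if p_value <= alpha else "Retain Ho"


# Hypothetical test statistic, used only to show the call.
p = two_sided_p_value(2.10)
print(round(p, 4), decide(p))
```

The level of significance is passed in as `alpha` so that it is fixed before the data are examined.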
Example:
You’re a financial analyst for an investment firm. You want to find out if there is a difference in
dividend yield between stocks listed in NYSE and NASDAQ. You collect the following data:
Solution:
95% Confidence Interval:
In a one-tailed test, the null hypothesis (Ho) states that the difference between the population
means, (𝜇₁ − 𝜇₂), is greater than or equal to 0, while the alternative hypothesis (Ha) suggests that
the difference is less than 0. On the other hand, in a two-tailed test, the null hypothesis states that
the difference between the population means is equal to 0, and the alternative hypothesis suggests
that the difference is not equal to 0.
To perform these tests, we use a test statistic, denoted as z, which is calculated using
the formula z = [(x̄₁ − x̄₂) − (𝜇₁ − 𝜇₂)] / 𝜎(x̄₁ − x̄₂), where x̄₁ and x̄₂ are the sample means,
(𝜇₁ − 𝜇₂) is the difference stated in the null hypothesis (here 0), and 𝜎(x̄₁ − x̄₂) is the standard
deviation of the sampling distribution of the difference between sample means.
In the one-tailed test above, the rejection rule is z < −zα, where zα is the critical value
corresponding to the chosen level of significance. In a two-tailed test, the rejection rule is |z| > zα/2.
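A minimal Python sketch of the z test for two independent means follows. It assumes the population standard deviations are known, so that the standard deviation of the sampling distribution is √(σ₁²/n₁ + σ₂²/n₂); the function names and the numbers in the usage lines are hypothetical and are not the data of the example in this report.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2, hypothesized_diff=0.0):
    """z statistic for the difference between two independent sample means."""
    se = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / se


def reject_lower_tail(z, alpha=0.05):
    """One-tailed rule for Ha: (mu1 - mu2) < 0, reject when z < -z_alpha."""
    return z < -NormalDist().inv_cdf(1 - alpha)


def reject_two_tailed(z, alpha=0.05):
    """Two-tailed rule: reject when |z| > z_(alpha/2)."""
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)


# Hypothetical summary statistics, for illustration only.
z = two_sample_z(mean1=52, mean2=50, sigma1=4, sigma2=3, n1=64, n2=36)
print(round(z, 3), reject_two_tailed(z), reject_lower_tail(z))
```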
Example:
You’re a financial analyst for an investment firm. You want to find out if there is a difference in
dividend yield between stocks listed in NYSE and NASDAQ. You collect the following data:
Test Statistic:
Decision:
Reject Ho at 𝛼 = 0.05
Conclusion:
There is evidence of a difference in average dividend
yield between stocks listed on the NYSE and NASDAQ.
In comparing the means of two independent populations, a common scenario is testing the
difference between two population means, 𝜇₁ and 𝜇₂, assuming equal variances.
In the context of a one-tailed test, we set up the null hypothesis Ho :(𝜇₁ − 𝜇₂) ≥ 0 and the
alternative hypothesis Ha :(𝜇₁ − 𝜇₂) < 0 . This indicates our interest in determining if the mean of the
first population is significantly less than the mean of the second population.
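To conduct this hypothesis test, we calculate the test statistic using the pooled-variance
formula t = [(x̄₁ − x̄₂) − (𝜇₁ − 𝜇₂)] / √[s²p (1/n₁ + 1/n₂)], where
s²p = [(n₁ − 1)s₁² + (n₂ − 1)s₂²] / (n₁ + n₂ − 2) is the pooled sample variance and the statistic
has n₁ + n₂ − 2 degrees of freedom.
The decision to reject the null hypothesis is based on the rejection rule, which states that if
the calculated t-value is less than the critical value −𝑡α, we reject the null hypothesis in favor of the
alternative hypothesis. Alternatively, for a two-tailed test, if the absolute value of the t-value is greater
than 𝑡α/2, we reject the null hypothesis.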
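The equal-variance (pooled) t statistic can be sketched as below. The function name `pooled_t` and the summary statistics in the usage lines are hypothetical; the critical value itself would still be read from a t-table for the resulting degrees of freedom.

```python
from math import sqrt


def pooled_t(mean1, mean2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """Pooled-variance t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    # Pooled sample variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = sqrt(sp2 * (1.0 / n1 + 1.0 / n2))
    return ((mean1 - mean2) - hypothesized_diff) / se, df


# Hypothetical sample summaries, for illustration only.
t_value, df = pooled_t(mean1=20, mean2=24, s1=4, s2=5, n1=15, n2=12)
print(round(t_value, 3), df)
```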
Example:
As a director of training for your company, you are interested in determining whether different
training methods have an effect on productivity. You randomly assign 42 newly hired employees
into two groups of 21. The first group received a computer-assisted, individual-based training. The
other group received a face-to-face team-based training. Upon completion of the training the
employees were evaluated on their performance (measured in time to perform a task).
Test Statistic:
t = 2.14
Decision:
Reject Ho at 𝛼 = 0.05
Conclusion:
There is evidence of a difference in performance between employees
trained in a computer-assisted program and those trained in a
team-based program.
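The decision in this example can be checked with a short sketch. It assumes a two-tailed test at α = 0.05 and takes the critical value 2.021 from a standard t-table for 40 degrees of freedom; the helper name `reject_two_tailed_t` is hypothetical.

```python
def reject_two_tailed_t(t_value, critical_value):
    """Two-tailed rule: reject Ho when |t| exceeds the critical value."""
    return abs(t_value) > critical_value


n1 = n2 = 21                 # two groups of 21 employees
df = n1 + n2 - 2             # degrees of freedom for the pooled t test
t_value = 2.14               # test statistic reported in the example
t_critical = 2.021           # t-table value for alpha/2 = 0.025, df = 40 (assumed two-tailed test)

print(df, reject_two_tailed_t(t_value, t_critical))
```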
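Moreover, the standard deviation of the sampling distribution of (𝑝̂₁ − 𝑝̂₂) is determined by the
formula 𝜎(𝑝̂₁ − 𝑝̂₂) = √[𝑝₁(1 − 𝑝₁)/n₁ + 𝑝₂(1 − 𝑝₂)/n₂]. When both samples are large, this
sampling distribution is approximately normal.
This normality assumption is crucial for conducting hypothesis tests and making inferences
about the population proportions accurately.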
Confidence intervals are a useful statistical tool for estimating population parameters from
sample data. When dealing with proportions or percentages, such as the success rate in two
different groups, confidence intervals for the difference in proportions (p1 - p2) are frequently used.
These intervals define a range of values within which the true difference in proportions between
two populations is most likely to occur.
In this narrative report, we will look at confidence intervals for the difference between two
population proportions, denoted p1 - p2. We'll talk about confidence intervals and how they're
interpreted.
A confidence interval is a range of values that is likely to contain the true population
parameter. The confidence interval for p1 - p2 estimates the likely range of the
difference in proportions between two populations. The interval is typically expressed as
"(p̂1 - p̂2) ± (Z * SE)", where p̂1 and p̂2 are the sample proportions, SE is the standard error of
their difference, and Z represents the critical value for the selected level of confidence.
For example, if we construct a 95% confidence interval for p1 - p2, it means that if we
collect samples repeatedly and construct intervals in the same way, approximately 95% of those
intervals will contain the true difference between the proportions.
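The repeated-sampling interpretation can be illustrated with a small simulation. This is a sketch under assumed values: the true proportions (0.5 and 0.4), the sample sizes, the number of repetitions, and the seed are all hypothetical choices, and the function names are illustrative.

```python
import random
from math import sqrt


def interval_for_difference(x1, n1, x2, n2, z=1.96):
    """Interval (p1_hat - p2_hat) +/- z * SE from success counts."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z * se, diff + z * se


def coverage(p1, p2, n1, n2, repetitions=2000, seed=7):
    """Fraction of simulated intervals that contain the true p1 - p2."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        # Draw one sample from each population and count the successes.
        x1 = sum(rng.random() < p1 for _ in range(n1))
        x2 = sum(rng.random() < p2 for _ in range(n2))
        low, high = interval_for_difference(x1, n1, x2, n2)
        if low <= p1 - p2 <= high:
            hits += 1
    return hits / repetitions


print(coverage(0.5, 0.4, 200, 200))
```

The printed fraction should be close to 0.95, which is the sense in which the interval has "95% confidence".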
The sample sizes (n1 and n2) are critical in determining confidence intervals for p1 - p2. A
larger sample size generally results in a narrower confidence interval, which improves precision in
estimating the difference between population proportions. It is critical to ensure an adequate
sample size in order to obtain reliable and accurate estimates. In general, a larger sample size is
required when the expected difference between proportions is small or a higher level of confidence
is desired.
The level of confidence used determines the degree of certainty associated with the
estimated confidence interval. The commonly used levels of confidence are 90%, 95%, and 99%.
A higher confidence level implies a wider interval because it necessitates accounting for a greater
proportion of the potential sampling variability. It is critical to strike a balance between the desired
degree of confidence and the practical implications of interval width. A narrower interval provides
more precise estimates, but it may result in decreased confidence.
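The effect of the confidence level on interval width can be seen by computing the margin of error Z * SE at the three common levels. The sample proportions and sample sizes below are hypothetical, and `margin_of_error` is an illustrative helper name.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p1_hat, n1, p2_hat, n2, confidence):
    """Half-width Z * SE of the interval for p1 - p2 at a given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return z * se


# Hypothetical sample proportions and sizes, held fixed across levels.
for level in (0.90, 0.95, 0.99):
    print(level, round(margin_of_error(0.60, 100, 0.50, 100, level), 4))
```

With the data held fixed, only Z changes, so the margin grows as the confidence level rises.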
When separate confidence intervals are constructed for p1 and p2, non-overlapping intervals
show that the two population proportions differ statistically significantly. Overlapping
intervals, on the other hand, do not by themselves show that the difference is not significant;
the more direct check is whether the confidence interval for p1 - p2 contains zero. It is important
to understand that statistical significance does not always imply practical significance. When
interpreting the results, it is important to consider the magnitude of the difference as well as the
context of the study.
When constructing confidence intervals for p1 - p2, the samples are assumed to be
independent and representative of their respective populations. Furthermore, it is assumed that the
sampling distribution of the difference in proportions can be approximated by a normal distribution,
which is true under certain conditions (for example, large sample sizes and proportions that are not
too close to 0 or 1). It is critical to be aware of the assumptions and their potential impact on the
reliability of the confidence interval.
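One way to screen for the normal-approximation condition is a rule of thumb on the counts of successes and failures in each sample. The threshold of 10 used here is an assumption (textbooks vary, and 5 is also common), and `normal_approximation_ok` is a hypothetical helper name.

```python
def normal_approximation_ok(successes, n, threshold=10):
    """Rule of thumb: both successes and failures should reach the threshold."""
    failures = n - successes
    return successes >= threshold and failures >= threshold


print(normal_approximation_ok(140, 200))  # plenty of successes and failures
print(normal_approximation_ok(3, 200))    # sample proportion too close to 0
```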
To calculate the sample proportions (p̂1 and p̂2), you need to determine the number of successes
in each sample and divide it by the corresponding sample size.
- Determine the sample sizes for each population: Identify the total number of individuals or
observations in each sample. Denote the sample size of Population 1 as n1 and of
Population 2 as n2.
By performing these calculations, you obtain the sample proportions (p̂1 and p̂2) for the two
populations.
On the other hand, to calculate the standard error (SE) for the difference between two sample
proportions (p̂1 - p̂2), you can use the following formula:
SE = √[p̂1(1 - p̂1)/n1 + p̂2(1 - p̂2)/n2]
The standard error (SE) represents the standard deviation of the sampling distribution of
the difference between the two sample proportions.
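The two calculations described here, the sample proportions and the standard error of their difference, can be sketched as follows. The function names and the counts in the usage lines are hypothetical.

```python
from math import sqrt


def sample_proportion(successes, n):
    """Number of successes divided by the sample size."""
    return successes / n


def standard_error(p1_hat, n1, p2_hat, n2):
    """Standard error of the difference between two sample proportions."""
    return sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)


# Hypothetical counts, for illustration only.
p1_hat = sample_proportion(45, 100)
p2_hat = sample_proportion(30, 100)
se = standard_error(p1_hat, 100, p2_hat, 100)
print(p1_hat, p2_hat, round(se, 4))
```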
Example:
Suppose you want to compare the proportions of male and female customers who made a
purchase on an online shopping platform. You collected data from a random sample of customers
and obtained the following results:
For males:
Sample size (n1): 200
Number of customers who made a purchase (x1): 140
For females:
Sample size (n2): 150
Number of customers who made a purchase (x2): 110
Determine the critical value (Z) for the desired confidence level. Let's assume a 95% confidence
level, corresponding to a Z-value of 1.96.
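Conclusion:
With p̂1 = 140/200 = 0.700, p̂2 = 110/150 ≈ 0.733, and SE ≈ 0.0485, the resulting 95% confidence
interval is approximately (-0.128, 0.062). This means that, with 95% confidence, we estimate that
the true difference in proportions between males and females who made a purchase lies within this
range. Because the interval contains zero, we cannot say which proportion is larger.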
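The calculation for this example can be reproduced with a short sketch that uses the counts given above and Z = 1.96. The function name `difference_interval` is hypothetical.

```python
from math import sqrt


def difference_interval(x1, n1, x2, n2, z=1.96):
    """Confidence interval (p1_hat - p2_hat) +/- z * SE from success counts."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z * se, diff + z * se


# Males: 140 purchases out of 200; females: 110 purchases out of 150.
low, high = difference_interval(140, 200, 110, 150)
print(round(low, 3), round(high, 3))
```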
- If the confidence interval contains only positive values, then we can conclude that 𝑝₁ −𝑝₂ >0
and 𝑝₁ >𝑝₂ (with confidence c)
- If the confidence interval for 𝑝₁ −𝑝₂ contains zero, then we cannot say which is larger.
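These interpretation rules, together with the matching rule for an interval that contains only negative values, can be written as a small helper. The function name `interpret_interval` and the example endpoints are hypothetical.

```python
def interpret_interval(low, high):
    """Interpret a confidence interval (low, high) for p1 - p2."""
    if high < 0:
        return "p1 < p2"      # interval contains only negative values
    if low > 0:
        return "p1 > p2"      # interval contains only positive values
    return "cannot say which is larger"  # interval contains zero


print(interpret_interval(0.02, 0.11))
print(interpret_interval(-0.09, 0.04))
```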
Confidence intervals for the difference between two population proportions (p1 - p2) play a
crucial role in statistical analysis for several reasons:
Estimating the True Difference: Confidence intervals provide an estimate of the range
within which the true difference in proportions between two populations is likely to fall.
Assessing Precision: The width of the confidence interval reflects the precision of our
estimate. A narrower interval indicates more precise estimation, while a wider interval suggests
greater uncertainty.
POSTTEST
Encircle the letter of the correct answer.
2. Suppose you want to compare the proportion of customers who prefer Product A (p1) to the
proportion of customers who prefer Product B (p2). You collect a random sample of 200
customers who have tried both products and find that 120 of them prefer Product A and 80
prefer Product B. Construct a 95% confidence interval for the difference in proportions.
A. (0.464, 0.374)
B. (0.113, 0.287)
C. (0.345, 0.768)
D. (0.729, 0.876)
3. Suppose you compare the effectiveness of two different treatments (Treatment A and
Treatment B) for a medical condition. You randomly assign 500 patients to receive
Treatment A and another 500 patients to receive Treatment B. After the treatments, you
observe that 150 patients in Treatment A group show improvement, while 120 patients in
Treatment B group show improvement. Construct a 90% confidence interval for the
difference in proportions.
A. (0.012, 0.108)
B. (0.985, 0.753)
C. (0.348, 0.862)
D. (0.372, 0.753)
4. Statement 1: A null hypothesis states the hypothesized value of the parameter before
sampling.
Statement 2: An alternative has all possible alternatives other than the null hypothesis.
A. Statement 1 false; Statement 2 true
B. Statement 1 true, Statement 2 false
C. Both statements are incorrect
D. Both statements are correct
6. The ________ refers to the percentage of sample means that fall outside certain
prescribed limits.
A. Significance Level
B. Point estimator
C. Error
D. P-Value
7. Statement 1: In a one-tailed test, the null hypothesis (Ho) states that the difference
between the population means, (𝜇₁ − 𝜇₂), is less than or equal to 0, while the alternative
hypothesis (Ha) suggests that the difference is more than 0.
Statement 2: In a two-tailed test, the null hypothesis states that the difference between the
population means is equal to 0, and the alternative hypothesis suggests that the difference
is not equal to 0.
A. Statement 1 true; Statement 2 false
B. Statement 1 false, Statement 2 true
C. Both statements are correct
D. Both statements are incorrect
10. Statement 1: The decision to reject the null hypothesis is based on the rejection rule, which
states that if the calculated t-value is more than the critical value -𝑡a, we reject the null
hypothesis in favor of the alternative hypothesis.
Statement 2: For a two-tailed test, if the absolute value of the t-value is more than 𝑡a/2, we
reject the null hypothesis.
A. Statement 1 true; Statement 2 false
B. Statement 1 false, Statement 2 true
C. Both statements are correct
D. Both statements are incorrect