
SENIOR HIGH SCHOOL

STATISTICS &
PROBABILITY
QUARTER 4
Module 4

DO_Q4_STATISTICS & PROBABILITY_GRADE 11_MODULE4
RESOURCE TITLE: Statistics & Probability
Alternative Delivery Mode
Quarter 4 – Week 1-10
Revised Edition, 2023

Republic Act 8293, section 176 states that: No copyright shall subsist in any work of
the Government of the Philippines. However, prior approval of the government agency or office
wherein the work is created shall be necessary for the exploitation of such work for a profit.
Such agency or office may, among other things, impose as a condition the payment of
royalties.

Borrowed materials (i.e., songs, stories, poems, pictures, photos, brand names,
trademarks, etc.) included in this module are owned by their respective copyright holders.
Every effort has been exerted to locate and seek permission to use these materials from their
respective copyright owners. The publisher and authors do not represent nor claim ownership
over them.

Published by the Department of Education


Secretary: Sara Z. Duterte-Carpio

Development Team of the Module

Writers: MARY ANN B. ORIO, Dalandanan National SHS, SDO-Valenzuela
JHERALD Q. GABICA, Bignay National SHS, SDO-Valenzuela
VIRGILIO G. VENTURA, Caruhatan National SHS, SDO-Valenzuela
DAISY LYN F. MARIANO, Parada National SHS, SDO-Valenzuela
Reviewers: REBECCA M. BIÑAS
Illustrator: NATHANIEL D.C. DEL MUNDO
Layout Artist: LYNET D. DEL PILAR
Management Team:
MELITON P. ZURBANO, SDS
FILMORE R. CABALLERO, CID Chief
MYRON WILLIE III B. ROQUE, EPS LRMS
EDNA LLANERA, Division SHS Focal Person
MARILYN B. SORIANO, EPS Mathematics

Printed in the Philippines by ________________________

Department of Education – National Capital Region – SDO VALENZUELA

Office Address: Pio Valenzuela St., Marulas, Valenzuela City


Telefax: (02) 292 – 3247
E-mail Address: sdovalenzuela@deped.gov.ph

Introductory Message
For the facilitator:

Welcome to the Statistics and Probability for Grade 11 Alternative Delivery Mode (ADM)
Module on Understanding Hypothesis Testing!

This module was collaboratively designed, developed and reviewed by educators both from
public and private institutions to assist you, the teacher or facilitator in helping the learners
meet the standards set by the K to 12 Curriculum while overcoming their personal, social,
and economic constraints in schooling.

This learning resource hopes to engage the learners in guided and independent learning
activities at their own pace and time. Furthermore, it also aims to help learners acquire
the needed 21st century skills while taking into consideration their needs and circumstances.

In addition to the material in the main text, you will also see this box in the body of the
module:

Notes to the Teacher


This contains helpful tips or strategies that
will help you in guiding the learners.

As a facilitator you are expected to orient the learners on how to use this module. You also
need to keep track of the learners' progress while allowing them to manage their own learning.
Furthermore, you are expected to encourage and assist the learners as they do the tasks
included in the module.

For the learner:

Welcome to the Statistics and Probability for Grade 11 Alternative Delivery Mode (ADM)
Module on Understanding Hypothesis Testing!

The hand is one of the most symbolized parts of the human body. It is often used to depict
skill, action and purpose. Through our hands we may learn, create and accomplish. Hence,
the hand in this learning resource signifies that you as a learner are capable and empowered
to successfully achieve the relevant competencies and skills at your own pace and time. Your
academic success lies in your own hands!

This module was designed to provide you with fun and meaningful opportunities for guided
and independent learning at your own pace and time. You will be enabled to process the
contents of the learning resource while being an active learner.

This module has the following parts and corresponding icons:

What I Need to Know: This will give you an idea of the skills or competencies you are expected to learn in the module.
What I Know: This part includes an activity that aims to check what you already know about the lesson to take. If you get all the answers correct (100%), you may decide to skip this module.
What’s In: This is a brief drill or review to help you link the current lesson with the previous one.
What’s New: In this portion, the new lesson will be introduced to you in various ways such as a story, a song, a poem, a problem opener, an activity or a situation.
What is It: This section provides a brief discussion of the lesson. This aims to help you discover and understand new concepts and skills.
What’s More: This comprises activities for independent practice to solidify your understanding and skills of the topic. You may check the answers to the exercises using the Answer Key at the end of the module.
What I Have Learned: This includes questions or blank sentences/paragraphs to be filled in to process what you learned from the lesson.
What I Can Do: This section provides an activity which will help you transfer your new knowledge or skill into real life situations or concerns.
Assessment: This is a task which aims to evaluate your level of mastery in achieving the learning competency.
Additional Activities: In this portion, another activity will be given to you to enrich your knowledge or skill of the lesson learned. This also aids retention of learned concepts.
Answer Key: This contains answers to all activities in the module.

At the end of this module you will also find:

References: This is a list of all sources used in developing this module.
The following are some reminders in using this module:

1. Use the module with care. Do not put unnecessary mark/s on any part of the module.
Use a separate sheet of paper in answering the exercises.
2. Don’t forget to answer What I Know before moving on to the other activities included
in the module.
3. Read the instructions carefully before doing each task.
4. Observe honesty and integrity in doing the tasks and checking your answers.
5. Finish the task at hand before proceeding to the next.
6. Return this module to your teacher/facilitator once you are through with it.
If you encounter any difficulty in answering the tasks in this module, do not hesitate to
consult your teacher or facilitator. Always bear in mind that you are not alone.

We hope that through this material, you will experience meaningful learning and gain
deep understanding of the relevant competencies. You can do it!

11

STATISTICS &
PROBABILITY
Quarter 4-Module 4
Lesson 1
Understanding Hypothesis
Testing

This module was designed and written with you in mind. It is here to help you master
the basic concepts of hypothesis testing. The scope of this module permits it to be
used in many different learning situations. The language used recognizes the diverse
vocabulary levels of students. The lessons are arranged to follow the standard
sequence of the course, but the order in which you read them can be changed to
correspond with the textbook you are now using.

The module is divided into two lessons, namely:


• Lesson 1 – Illustrating: (a) null hypothesis; (b) alternative hypothesis; (c) level
of significance; (d) rejection region; and (e) types of errors in hypothesis
testing. M11/12SP-IVa-1
• Lesson 2 – Identifying the parameters to be tested given a real-life problem.
M11/12SP-IVa-3

After going through this module, you are expected to:


1. Illustrate: (a) null hypothesis; (b) alternative hypothesis; (c) level of
significance; (d) rejection region; and (e) types of errors in hypothesis testing.
2. Identify the parameters to be tested given a real-life problem

Let’s see how much you already know about this lesson.

MULTIPLE CHOICE. Encircle the letter of your answer.

1. It is also known as a non-directional test.
A. One-tailed test B. Tailed test C. Two-tailed test D. Three-tailed test
2. It refers to a statement that there is no difference between a parameter and a
specific value.
A. Tailed test B. Alternative hypothesis C. Null hypothesis D. Significant difference
3. It is the decision to reject or not to reject the null hypothesis in a given
situation.
A. Conclusion B. Directional test C. Hypothesis D. Significance
4. It refers to a statement that there is a difference between a parameter and a
specific value.
A. Alternative hypothesis B. Tailed test C. Significant difference D. Null hypothesis
5. It is a classification of error wherein the decision to reject the null hypothesis
could be wrong.
A. Correct decision B. Type I error C. Type II error D. Type III error



Lesson 1
Understanding Hypothesis Testing
In this lesson, the learners will understand the concepts of tests of hypotheses on
the population mean and population proportion.

Determine whether the statement is True or False. If False, modify the underlined
word/s to make it true.
_____________1.) The area under the normal curve is 1.
_____________2.) Under the normal curve, there are many z-values.
_____________3.) The level of significance, α = 0.01 gives 99% accuracy.
_____________4.) The level of significance, α = 0.05 gives 0.95% accuracy.
_____________5.) In a given problem, the notations µ and 𝝈 are sample values.

From the activity above, you have decided whether each statement is true or false.
In decision making, you usually follow a certain process: collect evidence, weigh
alternatives, and decide.
In Statistics, decision making starts with a concern about a population
regarding its characteristics denoted by parameter values. We might be interested in
a population parameter such as the mean or the proportion. For example, a
fisherman looks into several factors before deciding to go out to catch fish in the sea.
In the same manner, a farmer’s decision on when to plant his crops and a politician’s
decision to approve an agenda on environmental awareness are
examples that can be addressed by a procedure in Statistics called hypothesis testing.
Hypothesis Testing is another area of Inferential Statistics. It is a decision-making
process for evaluating claims about a population based on the characteristics
of a sample purportedly coming from that population. The decision is whether the
characteristic is acceptable or not.
There are two types of hypotheses: the null hypothesis and alternative
hypothesis.

The null hypothesis is the hypothesis to be tested. It states an exact value for the
parameter. When the null hypothesis is rejected, this leads to another option,
the alternative hypothesis, which allows for the possibility of many values.



Null hypothesis
➢ It is denoted by Ho
➢ It is a statement that states there is no significant difference between a
parameter and a specific value, or that there is no significant difference
between two parameters.
➢ It is a statement that asserts the value to which the population parameter
is equal and is presumed to be true.
➢ It is a statement of equality (=) or one which involves equality (≤ and ≥).
Alternative hypothesis
➢ It is denoted by Ha
➢ It is a statement that there is a significant difference between a parameter
and a specific value, or that there is a significant difference between two
parameters.
➢ It is a statement of inequality such as, ≠, < and >.

Example 1: The mean number of studying hours of a Grade 11 student is 6 hours.


Ho: The mean number of studying hours of a Grade 11 student is 6 hours.
In symbols: Ho: µ = 6.
Ha: The mean number of studying hours of a Grade 11 student is not equal to
6 hours.
In symbols: Ha: µ ≠ 6
Example 2: The mean height of a Grade 12 student is at least 150 cm.
Ho: The mean height of a Grade 12 student is at least 150 cm.
In symbols: Ho: µ ≥ 150
Ha: The mean height of a Grade 12 student is less than 150 cm.
In symbols: Ha: µ < 150
Directional versus Non-directional Test
In example 1 above, we can write the alternative hypothesis as:
a.) The mean number of studying hours of a Grade 11 student is not equal to
6 hours. (In symbols, Ha : µ ≠ 6)
b.) The mean number of studying hours of a Grade 11 student is greater than
6 hours. (In symbols, Ha : µ > 6)
c.) The mean number of studying hours of a Grade 11 student is less than 6
hours. (In symbols, Ha : µ ˂ 6)
The appropriate use of “not equal to”, “greater than”, and “less
than” in the alternative hypothesis depends on the design of the hypothesis test.
The design of a hypothesis test can be: (a.) one-tailed test (also known as
directional test); (b.) two-tailed test (also known as non-directional test).
The two-tailed test (non-directional test) is the standard test used in many
research studies, and it compares the population parameter in both directions (left and right)
of the bell curve. On the other hand, the one-tailed test (directional test) is a test that
determines the relationship between the variables in only one direction, either the
left or the right tail of the curve.
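The link between the symbol used in the alternative hypothesis and the design of the test can be sketched in a few lines of Python. This is only an illustration: the function name `tail_type` and the wording of its labels are made up for this example and do not come from the module.

```python
# Hypothetical helper; the name and the labels are illustrative only.
def tail_type(alt_symbol):
    """Return the design of the test implied by the symbol used in Ha."""
    if alt_symbol == "≠":
        return "two-tailed (non-directional)"
    if alt_symbol == ">":
        return "one-tailed (directional, right tail)"
    if alt_symbol == "<":
        return "one-tailed (directional, left tail)"
    raise ValueError("Ha must use one of: ≠, >, <")

# The three ways of writing Ha in Example 1 (hypothesized mean of 6 hours)
for symbol in ["≠", ">", "<"]:
    print(f"Ha: µ {symbol} 6 is a {tail_type(symbol)} test")
```

Only the symbol in Ha matters here; the hypothesized value (6 hours) does not change the design of the test.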

Level of Significance
The next step in hypothesis testing after the statement of the hypotheses is
the setting of the standard or criterion on which the decision will be based.
There are only two possible decisions to make in the process of
hypothesis testing: either “reject Ho” (accept Ha) or “do not reject Ho” (reject Ha). The
decision to reject the null hypothesis is called significance, and it should be based
on a criterion of judgment called the level of significance, denoted by the
Greek lowercase letter alpha, α.
➢ Significance is reached when the p-value of the statistic is less than the level
of significance.
➢ By convention, the commonly used levels of significance are 1%, 5%, and 10%.
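The decision rule stated above (significance is reached when the p-value is less than α) can be written as a short Python sketch. The function name `decide` and the p-values passed to it are hypothetical and serve only to show the comparison.

```python
# Hypothetical helper illustrating the rule "reject Ho when p-value < alpha".
def decide(p_value, alpha=0.05):
    """Return the decision for a given p-value and level of significance."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must be between 0 and 1")
    return "reject Ho" if p_value < alpha else "do not reject Ho"

# Illustrative p-values checked against the three common levels of significance
for alpha in (0.01, 0.05, 0.10):
    for p_value in (0.003, 0.04, 0.15):
        print(f"alpha = {alpha:.2f}, p-value = {p_value}: {decide(p_value, alpha)}")
```

Note that the same p-value can lead to different decisions under different levels of significance, which is why α is set before the test statistic is computed.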

The Rejection Region


To clarify the rejection or retention of the null hypothesis, a critical region or
rejection region must be defined.
After the level of significance for the hypothesis test is set, the researcher now
computes the test statistic. When the computed test statistic falls within a specific
range of values allowable for the test, the null hypothesis is rejected. This range of
values for the sample statistic that indicates when the null hypothesis should be
rejected is called the rejection region. Figures 1, 2, and 3 show the rejection regions for
directional and non-directional tests.

[Figures 1 to 3: rejection regions for directional and non-directional tests]


The critical region is based on a value called the critical value, which is
usually determined using an appropriate distribution table based on the test
statistic.
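As a rough illustration of how a critical value defines the rejection region, the sketch below computes critical values from the standard normal distribution instead of reading them from a table. It assumes the test statistic is a z-value; the function names `critical_values` and `in_rejection_region` are invented for this example.

```python
from statistics import NormalDist

# Assumption: the test statistic follows the standard normal (z) distribution.
STANDARD_NORMAL = NormalDist(mu=0, sigma=1)

def critical_values(alpha, tail):
    """Critical z-value(s) for a 'left', 'right', or 'two' tailed test."""
    if tail == "left":
        return (STANDARD_NORMAL.inv_cdf(alpha),)
    if tail == "right":
        return (STANDARD_NORMAL.inv_cdf(1 - alpha),)
    if tail == "two":
        upper = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tail must be 'left', 'right', or 'two'")

def in_rejection_region(z, alpha, tail):
    """True when the computed z falls in the rejection region."""
    values = critical_values(alpha, tail)
    if tail == "left":
        return z < values[0]
    if tail == "right":
        return z > values[0]
    return z < values[0] or z > values[1]

# Critical values at the 5% level for each design of the test
for tail in ("left", "right", "two"):
    print(tail, [round(value, 3) for value in critical_values(0.05, tail)])
```

In a two-tailed test the level of significance is split equally between the two tails, so each tail has an area of α/2.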
Decision Errors in Hypothesis Testing
The last step in hypothesis testing is the decision to reject or not to reject the
null hypothesis. Since not all members of the population are considered in the
process of verifying the null hypothesis, there is always a possibility that the decision to
reject or not to reject the null hypothesis is wrong.
Classification of decision errors:
(a.) Type I error: the decision to reject the null hypothesis could be wrong;
(b.) Type II error: the decision not to reject the null hypothesis could be wrong.
Ideally, you reject the null hypothesis only when it is false, and you fail to
reject it only when it is true. Doing otherwise leads to
a decision error. Table 1 below summarizes the four possible outcomes when a
decision is made in hypothesis testing.
Table 1. Four Possible Outcomes of the Decision in Hypothesis Testing
Reality                  | Fail to Reject   | Reject
Null hypothesis is true  | Correct decision | Type I error
Null hypothesis is false | Type II error    | Correct decision
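The four outcomes in Table 1 can also be expressed as a small Python function. This is a sketch for checking your understanding; the name `classify_decision` and its two yes/no inputs are hypothetical.

```python
# Hypothetical helper that mirrors the four cells of Table 1.
def classify_decision(null_is_true, null_was_rejected):
    """Classify a decision as a correct decision, Type I error, or Type II error."""
    if null_is_true and null_was_rejected:
        return "Type I error"       # rejected a true Ho
    if not null_is_true and not null_was_rejected:
        return "Type II error"      # failed to reject a false Ho
    return "Correct decision"

# Print every combination of reality and decision
for null_is_true in (True, False):
    for null_was_rejected in (False, True):
        reality = "Ho is true" if null_is_true else "Ho is false"
        action = "reject" if null_was_rejected else "fail to reject"
        print(f"{reality}, {action}: {classify_decision(null_is_true, null_was_rejected)}")
```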
Example 3:
Maria insists that she is 30 years old, when in fact, she is 32 years old. What
error is Maria committing?
Answer: Maria is rejecting the truth. She is committing a Type I error.

Example 4:
It has been established that a particular teaching strategy improves math
performance. However, the p-value from your experiment was 0.15, tested at an alpha level of
0.05. Thus, you did not reject the null hypothesis and concluded that there
is no significant relationship between the strategy and math performance. What type of decision
is illustrated in this example?
Answer: This illustrates a Type II error because there really is a significant relationship in the
population between the teaching strategy and math performance, but you
did not find any significance in your sample.
Probability of Committing a Type I and Type II error
In the decisions that we make, we form conclusions, and these conclusions are the
bases of our actions. But conclusions in Statistics are not always correct because we make
decisions based on sample information. The best we can do is to control the
probability with which an error occurs.
The probability of committing a Type I error is denoted by the Greek letter α (alpha),
while the probability of committing a Type II error is denoted by β (beta).
The following table shows the probability with which decisions occur.
Table 2. Types of Errors
Error in Decision | Type | Probability | Correct Decision  | Type | Probability
Reject a true Ho  | I    | α           | Accept a true Ho  | A    | 1-α
Accept a false Ho | II   | β           | Reject a false Ho | B    | 1-β
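To see what "the probability of committing a Type I error is α" means in practice, the following sketch repeats a test many times on data for which Ho is actually true and counts how often Ho is rejected. Everything here is an assumption made for illustration: a two-tailed z-test, a population that is normal with mean 0 and standard deviation 1, samples of size 25, 2,000 repetitions, and a fixed random seed.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=25, trials=2000, seed=1):
    """Estimate how often a true Ho: µ = 0 is rejected by a two-tailed z-test."""
    rng = random.Random(seed)                      # fixed seed for repeatable runs
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where Ho is true (µ = 0, σ = 1)
        sample = [rng.gauss(0, 1) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - 0) / (1 / n ** 0.5)     # (x̄ - µ) / (σ / √n)
        if abs(z) > critical:
            rejections += 1                        # a Type I error
    return rejections / trials

print("Estimated Type I error rate:", type_i_error_rate())
```

The estimated rate should come out close to the chosen α, although it will not match it exactly because the result is based on a finite number of simulated samples.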

Parameter
A parameter is defined as any numerical quantity that characterizes a given population
or some of its aspects. This means that a parameter tells us something about the
whole population.
In contrast, a numerical measure that is calculated from a sample is
called a statistic. A statistic is a known number, and it is a variable because it depends on the
portion of the population that is sampled.
A parameter denotes the true value that would be obtained if a census
rather than a sample were undertaken.



Examples of parameters are the measures of central tendency. These tell us
how the data behave on an average basis. For example, mean, median, and mode are
measures of central tendency that give us an idea about where the data concentrate.
Meanwhile, standard deviation tells us how the data are spread from the central
tendency, i.e. whether the distribution is wide or narrow. Such parameters are often
very useful in analysis.
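The difference between a parameter and a statistic can be shown with a tiny made-up data set. In the sketch below, the six heights are treated as the entire population and three of them as a sample; all of the numbers are hypothetical.

```python
from statistics import mean, pstdev, stdev

# Hypothetical data: heights (in cm) of a whole population of six students
population = [150, 152, 155, 158, 160, 165]
# A hypothetical sample of three students taken from that population
sample = [150, 158, 165]

mu = mean(population)       # parameter: population mean
sigma = pstdev(population)  # parameter: population standard deviation
x_bar = mean(sample)        # statistic: sample mean
s = stdev(sample)           # statistic: sample standard deviation

print(f"Parameters: mean = {mu:.2f}, standard deviation = {sigma:.2f}")
print(f"Statistics: mean = {x_bar:.2f}, standard deviation = {s:.2f}")
```

A different sample would give different statistics, but the parameters stay the same because they describe the whole population.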
Identifying parameters to be used:
1. The television habits of children were observed, and it was found that the
standard deviation is 10.2 hours per week.
Parameter to be tested: the standard deviation of children’s television viewing
hours per week is 10.2
Parameter: standard deviation; in symbols: 𝜎 = 10.2
2. A study claims that the mean quarantine period for a certain person is 14 days.
Parameter to be tested: mean quarantine days for a certain person
Parameter: mean; in symbols: 𝜇 = 14

Directions: State the null and the alternative hypotheses of the following
statements.
1. A medical trial is conducted to test whether or not a newly released medicine
reduces uric acid by 40%.
Ho: ____________________________________________________
Ha: ____________________________________________________
2. Suppose we want to test whether the general average of students in Math is
different from 82%.
Ho: ____________________________________________________
Ha: ____________________________________________________
3. We want to test whether the mean height of Grade 7 students is 56
inches.
Ho: ____________________________________________________
Ha: ____________________________________________________
4. We want to test if BNHS students take more than four years to graduate from
high school, on the average.
Ho: ____________________________________________________
Ha: ____________________________________________________
5. We want to test if it takes less than 45 minutes to answer the summative test in
Mathematics.
Ho: ____________________________________________________
Ha: ____________________________________________________



Directions: Complete the statements that follow.
Analyze the possibilities of Jherald’s conclusion. Identify if it is a Type I Error, Type
II Error, or a Correct Decision.

If Jherald finds out that his null hypothesis is …

1. true and he fails to reject it, then he commits a ____________________.

2. true and he rejects it, then he commits a _____________________.

3. false and he fails to reject it, then he commits a __________________.

4. false and he rejects it, then he commits a ______________.

Directions: Determine if one-tailed test or two-tailed test fits the given alternative
hypothesis.

1. The enrolment in junior high schools is not the same as the enrolment in the
senior high schools.
2. The standard deviation of their height is not equal to 7 inches.
3. The average number of internet users this year is significantly higher
compared to last year.
4. Male Grade 8 and Grade 11 students differ in height on average.
5. Miya’s grade is higher compared to her previous grade.

MULTIPLE CHOICE: Encircle the letter of your answer.

1. What kind of parameter is applied in the given situation? “The mean height of all
Grade 10 students is 170 cm.”
A. mean B. variance C. proportion D. standard deviation
2. A licensed teacher claims that more than 40% of all education graduates passed
the licensure examination for teachers. What kind of parameter is used in this
claim?
A. mean B. variance C. proportion D. standard deviation
3. It is a classification of error wherein the decision to reject the null hypothesis
could be wrong.
A. Correct decision B. Type I error C. Type II error D. Type III error
4. It is also known as a non-directional test.
A. One-tailed test B. Tailed test C. Two-tailed test D. Three-tailed test
5. Which of the following is not a parameter?
A. Mean B. Mode C. Summation D. Standard deviation
6. The decision to reject or fail to reject the null hypothesis is called ____________.
A. Conclusion B. Directional test C. Hypothesis D. Significance
7. It refers to any numerical quantity that characterizes a given population or some
of its aspects.
A. Parameter B. Hypothesis C. Median D. Mode

For numbers 8-10, refer to this:

Anna wants to estimate the average shower time of teenagers. From a sample of 50
teenagers, she found out that it takes teenagers an average of 5 minutes to shower.

8. What parameter is to be tested? ____________
9. What parameter is to be used? _______________
10. How are you going to write it in symbols? _______________

Give 5 situations from your life that illustrate: (3) a correct decision, (1) a Type I
error, and (1) a Type II error.

Almeda, Capistrano, Ferry Sarte. (2010). Elementary Statistics. Quezon City: University of the
Philippines Press.

Belecina, R., Baccay, E., & Mateo, E. (2016). Statistics and Probability. Manila, Philippines:
REX Book Store Inc.

Bluman, A. (2018). Elementary Statistics: A Step by Step Approach 10th edition. McGraw
Hill. New York, USA.

Canlapan, R. (2016). Statistics and Probability. Makati, Philippines: Diwa Learning System
Inc.

Keller, Warrack. (2003). Statistics For Management and Economics. California USA:
Thomson Learning, Inc.

Levine, et al. (2005). Statistics: A Handbook for Managers. New Jersey: Prentice Hall.

PERCDC Learnhub

Walpole, R., Myers, R., Myers, S., and Ye, K., (2012). Probability and Statistics for Engineers
and Scientists 9th edition. Pearson Education Inc. Massachusetts, USA.



11
STATISTICS &
PROBABILITY
Quarter 4-Module 4
Lesson 2
Formulating the Appropriate
Null and Alternative
Hypotheses on a Population
Mean



Target:

1. formulate the appropriate null and alternative hypotheses on a population
mean. M11/12SP-IVb-1; and
2. identify the appropriate form of the test statistic when: (a) the population
variance is assumed to be known; (b) the population variance is assumed to
be unknown; and (c) the central limit theorem is to be used.
M11/12SP-IVb-2 - M11/12SP-IVc-1.

Directions: In each problem, underline the population standard deviation or sample
standard deviation and circle the sample size.
1. A sample of 150 people has a mean age of 25 with a population standard
deviation (σ) of 5. Test the hypothesis that the population mean is 24.7 at α = 0.05.

2. An electric lamp manufacturer is testing a new production method that will
be considered acceptable if the lamps produced by this method result in a normal
population with an average life of 1,250 hours and a standard deviation equal to
110. A sample of 100 lamps produced by this method has an average life of 1,150
hours.

3. The cholesterol levels in a certain population have a mean of 200 and a standard
deviation of 20. The cholesterol levels for a random sample of 9 individuals are
measured and the sample mean x̄ is determined. What is the z-score for a sample
mean x̄ = 180?

4. Mapagmahal Elementary School has 1,000 students. The principal of the
school thinks that the average IQ of students at Mapagmahal is at least 110. To
prove her point, she administers an IQ test to 20 randomly selected students.
Among the sampled students, the average IQ is 108 with a standard deviation of
10.

5. A new energy-efficient lawn mower engine was developed by a well-known
inventor. He claims that the engine will run continuously for 5 hours on a single
gallon of regular gasoline. From his stock of 1,000 engines, the inventor selects
a simple random sample of 50 engines for testing. The engines run for an average
of 290 minutes with a standard deviation of 20 minutes.



Lesson 2
Formulating the Appropriate Null and Alternative Hypotheses on a Population Mean

In statistics, hypothesis testing is a way for you to test the results of a survey or
experiment to see if you have meaningful results. You're basically testing whether
your results are valid by figuring out the odds that your results have happened by
chance. In addition, it allows you to collect samples and make decisions based on
facts, not on how you feel or what you think is right. To be able to prove your
assumptions, you must first state the null and alternative hypotheses.

This module will start by recalling your knowledge of the equality/inequality
symbols. This concept will help you understand how to formulate hypotheses.

Direction: Identify the situations which illustrate inequalities. Then write the
inequality or equality model for each situation.

Real-life situation | Inequality/equality model
1. The value of one Philippine peso (p) is less than the value of one US dollar (d). | ____________
2. To get a passing mark in school, a student must have a grade (g) of at least 75. | ____________
3. A taxi has a maximum capacity (c) of 5. | ____________
4. The number of students (s) in a particular section is 50. | ____________
5. Three times the number of male faculty members (m) is less than the number of female teachers (f). | ____________

“The importance of sunlight in plants”


Directions: Examine the pictures below then answer the guide questions that
follow.



Guide Questions:
1. What have you observed between the two figures?

2. Do you think sunlight has an effect on the plant?

3. What do you think are the variables shown in the pictures?

4. Is there any relationship among the variables in Figure 1 and Figure 2?

5. How do these pictures relate to a hypothesis?

A statistical hypothesis is an assertion or conjecture concerning one or more
populations. It is an assumption or statement, which may or may not be true,
concerning one or more populations.
TWO TYPES OF STATISTICAL HYPOTHESIS:
a) Null hypothesis, H0
• It states that there is no difference between population parameters and
the hypothesized value.
• It is a hypothesis that the population mean equals a specific value.
• It contains the “=”, “≥”, and “≤” signs.
b) Alternative hypothesis, Ha or H1
• It is a claim about the population that is contradictory to H0 and is what
we conclude when we reject H0.
• The alternative hypothesis says the population mean is “greater than”
or “less than” or “not equal to” the value we assume is true in the null
hypothesis.
• It contains the “>”, “<”, and “≠” signs.

One-tailed test
- The alternative hypothesis contains the greater than (>) or less than (<) symbol.
- It is directional, either right-tailed or left-tailed.

Two-tailed test
- The alternative hypothesis contains the inequality symbol (≠).
- It has no direction.

Hypothesis
H0: The exposure to sunlight does not affect the growth of the plant.
Ha: The exposure to sunlight does affect the growth of the plant.



To state the null and alternative hypotheses correctly:
1. Identify the parameter in a given problem.
2. Identify the claim to be tested, which may show up in the null or the alternative
hypothesis.
3. Translate the claim into mathematical symbols/notations.
4. Formulate first the null hypothesis (H0), then the alternative hypothesis (Ha),
based on the three different ways of writing the null hypothesis: “H0: µ =”, “H0:
µ ≤”, and “H0: µ ≥”.
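The steps above can be sketched as a short Python function that takes the symbol of the claim and returns the matching pair of hypotheses. This is only one way to organize the steps: the function name `formulate` and the two lookup tables are made up for this example, and the pairing of symbols (= with ≠, ≤ with >, ≥ with <) follows the three ways of writing the null hypothesis listed in step 4.

```python
# Pairs of symbols: each null-hypothesis symbol and its opposite in Ha
NULL_TO_ALT = {"=": "≠", "≤": ">", "≥": "<"}
ALT_TO_NULL = {alt: null for null, alt in NULL_TO_ALT.items()}

def formulate(claim_symbol, value, parameter="µ"):
    """Return (H0, Ha, where the claim appears) for a claim such as 'µ < 12'."""
    if claim_symbol in NULL_TO_ALT:
        # The claim involves equality, so the claim itself is the null hypothesis
        null_symbol, alt_symbol = claim_symbol, NULL_TO_ALT[claim_symbol]
        claim_is = "H0"
    elif claim_symbol in ALT_TO_NULL:
        # The claim is a strict inequality, so the claim is the alternative
        null_symbol, alt_symbol = ALT_TO_NULL[claim_symbol], claim_symbol
        claim_is = "Ha"
    else:
        raise ValueError("symbol must be one of: =, ≤, ≥, ≠, >, <")
    h0 = f"H0: {parameter} {null_symbol} {value}"
    ha = f"Ha: {parameter} {alt_symbol} {value}"
    return h0, ha, claim_is

# Claim: students exercise less than 12 hours per month on average
print(formulate("<", 12))
# Claim: the mean height is at least 150 cm
print(formulate("≥", 150))
```

The third value returned tells you whether the claim ended up in the null or in the alternative hypothesis, which is the point of step 2.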
A test statistic is used to calculate the p-value of your results, helping to decide
whether to reject your null hypothesis.
In general, the larger the absolute value of the test statistic, the smaller the p-value and the more
likely you are to reject the null hypothesis.
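The relationship between the size of the test statistic and the p-value can be checked numerically. The sketch below assumes the test statistic is a z-value from the standard normal distribution; the function name `p_value_from_z` and the z-values used are hypothetical.

```python
from statistics import NormalDist

def p_value_from_z(z, tail):
    """p-value of a z statistic for a 'left', 'right', or 'two' tailed test."""
    cdf = NormalDist().cdf                  # area to the left of z
    if tail == "left":
        return cdf(z)
    if tail == "right":
        return 1 - cdf(z)
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))        # both tails beyond |z|
    raise ValueError("tail must be 'left', 'right', or 'two'")

# As |z| grows, the two-tailed p-value shrinks
for z in (0.5, 1.0, 1.96, 2.5, 3.0):
    print(f"z = {z}: p-value = {p_value_from_z(z, 'two'):.4f}")
```

For a one-tailed test, the statistic must also lie in the direction stated in Ha for the p-value to be small.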
The table below shows the appropriate test statistics to be used:

[Table of test statistics not reproduced here]

Example:
1. A study was conducted to look at the average time students exercise. A
researcher claimed that, on average, students exercise less than 12 hours per
month. In a random sample of size n = 110, it was found that the mean time
students exercise is x̄ = 11.3 hours per month with s = 6.40 hours per month.
Since n = 110, the sample size is large and the population variance is unknown. Hence, the z-test
is the appropriate tool. (Central Limit Theorem)
2. An English teacher wanted to test whether the mean reading speed of
students is 540 words per minute. A sample of 10 students revealed a
sample mean of 520 words per minute with a standard deviation of 5 words
per minute. At the 0.05 significance level, is the reading speed different from 540
words per minute?
The sample size (n) is 10, which is less than 30, and the sample standard
deviation (5 words per minute) was given. Therefore, the appropriate test
is the t-test.
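The reasoning in the two examples can be sketched in Python. The selection rule coded below is an assumption inferred from these examples (z-test when the population variance is known or when n is at least 30, t-test when the population variance is unknown and n is less than 30). The function names are made up, and the formula used is the usual (x̄ - µ0) / (standard deviation / √n).

```python
from math import sqrt

def choose_test(n, population_variance_known):
    """Pick the test statistic under the rule assumed in the lead-in."""
    if population_variance_known:
        return "z-test"
    if n >= 30:
        return "z-test"   # large sample: Central Limit Theorem
    return "t-test"       # small sample and unknown population variance

def compute_statistic(sample_mean, hypothesized_mean, std_dev, n):
    """(x̄ - µ0) / (standard deviation / √n); same form for z and t."""
    return (sample_mean - hypothesized_mean) / (std_dev / sqrt(n))

# Example 1: n = 110, x̄ = 11.3, s = 6.40, hypothesized mean 12
print(choose_test(110, False), round(compute_statistic(11.3, 12, 6.40, 110), 2))
# Example 2: n = 10, x̄ = 520, s = 5, hypothesized mean 540
print(choose_test(10, False), round(compute_statistic(520, 540, 5, 10), 2))
```

The z and t statistics share the same formula; what differs is the distribution used to find the critical value or p-value.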



Direction: Write the null and alternative hypotheses for each of the following and determine
whether the test is one-tailed or two-tailed.
1. Mrs. Queliste claims that her students scored an average of 90 in their
Mathematics quiz. The master teacher wants to know whether the teacher’s claim
is acceptable or not.
2. A manufacturer of soft drinks claims that all labeled 1.5-liter bottles contain an
average of 1.48 liters of soft drinks. A retailer wishes to test whether the mean
amount of soft drinks in a labeled 1.5-liter bottle is less than 1.48 liters.
3. A car manufacturer claims that the mean selling price of all cars manufactured
is only ₱160,000. A consumer agency wants to test whether the mean selling
price of all the cars manufactured exceeds ₱160,000.
4. The average power consumption of an air conditioner is at most 2,500 watts, as
claimed by the owner. A survey made by an electric power company found that
the mean consumption is 3,500 watts with a standard deviation of 225 watts.
5. A bus company in Manila claims that the mean waiting time for a bus during
rush hour is less than 12 minutes. A random sample of 30 waiting times has a
mean of 15 minutes with a standard deviation of 4.8 minutes.

IDENTIFICATION. Identify which is being described.


_________________________1. It is an assumption or statement which may or may not
be true concerning one or more population.
_________________________2. It is used to calculate the p-value of your results, helping
to decide whether to reject your null hypothesis.
_________________________3. It states that there is no difference between population
parameters and the hypothesized value.
_________________________4. This is what we conclude when we reject Ho.

_________________________5. This is the test statistic to be used when n < 30 and the
population variance is unknown.

Directions: Identify the appropriate test statistic to be used in each problem. Write
z-test or t-test on a separate sheet of paper.
___________1. A sample of n=20 is selected from a normal population, mean = 53
and s= 10.



___________2. Based on the report of the school nurse, the average height of Grade
11 students has increased. Five years ago, the average height of Grade 11 students
was 168 cm with standard deviation of 36cm. She took a random sample of 150
students and derived the average height of 159 cm.

___________3. Knowing from a previous study that the average of athletes is 60, an
athletic adviser asked how his soccer players are academically doing as compared to
other student athletes. After an initiative to help improve the average of student
athletes, the adviser randomly selected 15 soccer players and found 80 as the average
with standard deviation of 1.20.

___________4. The CEO of a battery manufacturing company claimed that their
batteries would last an average of 280 hours under normal use. A researcher
randomly selected 15 batteries from the production line and tested them. The tested
batteries had a mean life span of 270 hours with a standard deviation of 40 hours.
Do we have enough evidence to suggest that the claim of an average of 280 hours is
false?

___________5. It was known that the number of tickets purchased by students at the
ticket window for the volleyball match of two popular universities followed a
distribution that has mean of 500 and standard deviation of 8.7. Suppose that a few
hours before the start of one of these matches, there are 100 eager students standing
in line to purchase tickets. If there are 250 tickets remaining, what is the probability
that all 100 students will be able to purchase the tickets they want?

MULTIPLE CHOICE. Encircle the letter of your answer.

1. This hypothesis states that there is no difference between population
parameters and the hypothesized value.
A. hypothesis C. alternative hypothesis
B. null hypothesis D. two-tailed hypothesis
2. When the value of parameter has significant difference with the hypothesized
value, then it is called ________________.
A. one-tailed test C. null hypothesis
B. two-tailed test D. alternative hypothesis
3. What kind of hypothesis is illustrated below? The mean score of all Grade 12
students is higher than 75.
A. one-tailed test C. null hypothesis
B. two-tailed test D. alternative hypothesis
4. The sign of the alternative hypothesis in a left-tailed test is always_________.
A. equal C. not equal
B. less than D. greater than
5. “A modern approach in advertisement will not increase the demand for a
product.” This is an example of _______________ hypothesis.
A. null C. Mean
B. alternative D. right-tailed



A normal smartphone battery manufacturer claims that the mean life of a certain
type of battery is more than 640 hours.

a.) Formulate the null and alternative hypotheses.


b.) Identify whether it is one-tailed or two-tailed.
c.) If the hypothesis is one tailed, identify its direction whether it is left or right.

Almeda, Capistrano, Ferry Sarte. (2010). Elementary Statistics. Quezon City: University of the
Philippines Press.

Belecina, R., Baccay, E., & Mateo, E. (2016). Statistics and Probability. Manila, Philippines:
REX Book Store Inc.

Bluman, A. (2018). Elementary Statistics: A Step by Step Approach 10th edition. McGraw
Hill. New York, USA.

Canlapan, R. (2016). Statistics and Probability. Makati, Philippines: Diwa Learning System
Inc.

Keller, Warrack. (2003). Statistics For Management and Economics. California USA:
Thomson Learning, Inc.

Levine, et. al. (2005). Statistics: A Handbook for Managers. New Jersey: Prentice Hall.

PERCDC Learnhub

Walpole, R., Myers, R., Myers, S., and Ye, K., (2012). Probability and Statistics for Engineers
and Scientists 9th edition. Pearson Education Inc. Massachusetts, USA.



Lesson 3
The Rejection Region and
Critical Values



Target:
1. Identify the appropriate rejection region for a given level of significance
when: (a) the population variance is assumed to be known; (b) the
population variance is assumed to be unknown; and (c) the Central Limit
Theorem is to be used (M11/12SP-IVc-1).

Directions: Read and analyze each item carefully, then circle the letter of the
correct answer from the given choices.

1. What type of error is committed when a true hypothesis is rejected?


A. Type I B. Type II C. Type A D. Type B
2. It is the number of possible combinations of decisions and truth values of a
null hypothesis when we perform hypothesis testing.
A. 4 B. 3 C. 2 D. 1
***Refer to the statements below to answer item numbers 3 and 4.
I. A boy whose height is 5’2” insists that his height is just 5’6”.
II. A police officer accepts unsolicited gifts even if it is wrong to do so.
III. Danilo still insists on working illegally despite knowing its risks.
IV. Julia says that her hair is not black and instead says that it is
just dark.
3. Which of the statement/s above illustrates a Type I error?
A. I only B. I and IV C. III only D. II and III
4. Which of the statement/s above illustrates a Type II error?
A. I only B. I and IV C. III only D. II and III
***For item numbers 5 to 8, determine the critical value that matches the given
condition.
5. Population standard deviation is known, and the confidence level is 90% for a
two-tailed test.
A. ±1.65 B. ±1.96 C. ±2.33 D. ±2.58
6. Population standard deviation is unknown, the confidence level is 95% for a
two-tailed test and the sample size is 16.
A. ±1.761 B. ±1.753 C. ±2.131 D. ±2.145
7. Sample size is 150 but the population standard deviation is unknown, and the
confidence level is 90% for a two-tailed test.
A. ±1.65 B. ±1.96 C. ±2.33 D. ±2.58
8. Population standard deviation is unknown, the confidence level is 99% for a
one-tailed test and the sample size is 10.
A. 2.821 B. 2.764 C. 1.833 D. 1.812



9. If a certain test statistic lies to the left of the critical value, which of the
following must be true?
A. The test statistic is equal to the critical value.
B. The test statistic is less than the critical value.
C. The test statistic is greater than the critical value.
D. The critical value is greater than the test statistic.
***Refer to the situation below to answer item numbers 10 to 12.
A sample of 250 bulbs were taken to verify the claim of its manufacturer that the
average lifespan of their bulbs is 2.5 years.
10. Which distribution must be considered to identify the appropriate rejection
region?
A. t-distribution C. f-distribution
B. z-distribution D. insufficient data
11. What must be the critical value if the claim will be tested using 99% level of
confidence?
A. ±1.65 B. ±1.96 C. ±2.33 D. ±2.58
12. Assuming that a test statistic was determined to be 2.11. Does this value lie on
the rejection region?
A. Yes B. No C. Maybe D. insufficient data
***Refer to the situation below to answer item numbers 13 to 15.
A random sample of Grade 11 students were selected from a population whose
standard deviation is unknown.
13. Which distribution must be considered to identify the appropriate rejection
region?
A. t-distribution C. f-distribution
B. z-distribution D. insufficient data
14. Suppose that there were 25 students, what will be the critical value if α = 0.01?
A. 2.485 B. 2.492 C. 2.787 D. 2.797
15. Assuming that a test statistic was computed as 2.93. Does this value lie on the
rejection region?
A. Yes B. No C. Maybe D. insufficient data



Lesson 3
The Rejection Region and Critical Values

In this lesson, the learners will understand the concept of rejection region and critical
values. Ideas about types of error will also be presented.

When performing hypothesis testing, we come up with a decision of either
rejecting the null hypothesis or not. The decision is made based on how our
computed test statistic relates to the corresponding critical value set at a given
confidence level.

This goes to show that critical values play an important role in establishing
the region/s under the curve where the hypothesis being tested may be rejected or
not. The region in which the hypothesis must be rejected is called the rejection
region.

Aside from establishing the hypothesis and identifying the appropriate test
statistic, there are other elements of hypothesis testing relevant to decision making.
As we can naturally commit mistakes, one of these relevant elements is the concept
of error. It was mentioned above that there are two possible decisions. Also, the null
hypothesis may either be true or false. Hence, there are 4 possible combinations of
decisions and truth values of the null hypothesis.
Interestingly, only two of these four outcomes are correct. The other two are
errors. These errors are named as Type I error and Type II error. Study the diagram
given below.
                Reject Ho           Do not reject Ho
Ho is true      TYPE I ERROR        CORRECT DECISION
Ho is false     CORRECT DECISION    TYPE II ERROR

We can note from the diagram that a Type I error is committed when a true
null hypothesis is rejected, while a Type II error is committed when we fail to reject
a false null hypothesis.
How do these errors relate in real life? Let us see the illustrations below.
Illustration 1: A man insists that he stands 5’10” when, in fact, his height is only
5’8”. In this situation, the man is said to commit a Type I error since he is rejecting
the true idea that he stands just 5’8”.



Illustration 2: A student allows his/her classmates to copy his/her answers
during a test. In this scenario, the student is said to commit a Type II error since he/she
is allowing the act of cheating despite knowing that it is a wrong deed.
The question now is, how likely are we to commit these errors when making
decisions in the hypothesis testing process? We denote the probability of
committing a Type I error as 𝛼 while the probability of committing a Type II error as
𝛽. With the goal of minimizing these errors, we set their values to be relatively small.
For instance, we usually assign an alpha value of 0.05 or 0.01, depending on the
implications of the errors. Of course, the more serious the implications, the less likely
we would like to commit the error. From here we can further say that the probability
of making a correct decision with respect to Type I error is 1 − 𝛼.
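
The meaning of 𝛼 can be illustrated with a short simulation. The following is a minimal Python sketch and is not part of the module; the population (mean 100, standard deviation 15), the sample size, the number of trials, and the function name are all hypothetical choices made only for illustration. It repeatedly samples from a population for which Ho is true and counts how often a two-tailed z-test rejects Ho anyway.

```python
import random
from statistics import NormalDist, mean

def type_one_error_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Estimate how often a TRUE Ho is rejected by a two-tailed z-test."""
    rng = random.Random(seed)
    mu, sigma = 100.0, 15.0  # hypothetical population; Ho: mu = 100 is true
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (mean(sample) - mu) / (sigma / n ** 0.5)
        if abs(z) >= critical:  # test statistic falls in the rejection region
            rejections += 1
    return rejections / trials

rate = type_one_error_rate()
print(rate)  # should be close to alpha = 0.05
```

Under these assumptions, the printed proportion should stay near 0.05, which is what it means to set the probability of a Type I error to 𝛼.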

The probabilities introduced above may be seen graphically in a normal curve.

The figures on the left show the rejection region under the normal curve for a
directional (one-tailed) test. Notice that the entire area under the curve is divided
into two parts by the critical value. The shaded region is the rejection region.

The figure on the left shows the rejection regions under the
normal curve for a non-directional (two-tailed) test. This time,
notice that the entire area under the curve is divided into three
parts by the critical values. The rejection regions are seen on
both tails, which means that 𝛼 has been equally distributed.

Remember that these regions serve as our guide in decision making. If the
computed test statistic falls in the rejection region, then we must reject Ho. If the test
statistic falls outside the rejection region, then we do not reject Ho.
As a remark, the curve to be used is based on whether the population variance
is known or not.

Sample Problem:
1. Assuming that the population standard deviation is known, sketch the rejection
region for a two-tailed test with 95% confidence. Does z = 1.68 fall in the rejection
region?
Solution:
Since the population standard deviation is assumed to be known, we will use
the z-distribution (normal distribution). Also, the 95% confidence level implies that
𝛼 = 0.05. Further, the two-tailed test implies that we must consider 𝛼/2 = 0.025
since the probability is distributed on both tails. Thus, the critical value is the
corresponding z-value for 1 – 0.025 = 0.975. Using the z-table, we find that the
critical values are -1.96 and 1.96. The sketch of the rejection region is shown
below.



Since z = 1.68 lies between -1.96 and 1.96,
z does NOT lie in the rejection region.
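
The z-table lookup in this solution can also be checked numerically. The sketch below is an illustration only and assumes Python's standard `statistics.NormalDist`; the function name `z_critical` is hypothetical. It computes the positive critical value from the confidence level and the number of tails, then checks where z = 1.68 falls.

```python
from statistics import NormalDist

def z_critical(confidence, tails):
    """Return the positive z-critical value for a one- or two-tailed test."""
    alpha = 1 - confidence
    tail_area = alpha / 2 if tails == 2 else alpha  # alpha is split for two tails
    return NormalDist().inv_cdf(1 - tail_area)

# Critical values for the usual confidence levels (one-tailed, two-tailed).
for confidence in (0.90, 0.95, 0.99):
    print(confidence, round(z_critical(confidence, 1), 3), round(z_critical(confidence, 2), 3))

critical = z_critical(0.95, 2)
z_in_rejection_region = abs(1.68) >= critical
print(z_in_rejection_region)  # False: 1.68 lies between -1.96 and 1.96
```

Printed z-tables usually round these values to two decimal places, so 1.645 appears as 1.65, 2.326 as 2.33, and 2.576 as 2.58.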

2. Assuming that the population standard deviation is unknown for a randomly
selected sample whose size is 11, sketch the rejection region for a one-tailed test
with 99% confidence. Does t = 2.86 fall in the rejection region?
Solution:
Since the population standard deviation is assumed to be unknown and the
sample size is small, we will use the t-distribution. Also, the 99% confidence level
implies that 𝛼 = 0.01. Further, the one-tailed test implies that the rejection region
is found on one side of the curve only.
Given that the sample size is 11, it follows that 𝑑𝑓 = 𝑛 − 1 = 11 − 1 = 10. Using
the t-table, we find that the critical value is 2.764. The sketch of the rejection
region is shown below.

It can be seen that t = 2.86 is found to the right of the right-tailed t-critical value.
Thus, t = 2.86 lies in the rejection region.

3. A random sample of 250 bottles of juice drink was taken and was found to have an average content
that is less than the company’s claim that each bottle contains 500 mL of juice drink. Suppose that
an appropriate test statistic revealed a value of -1.75 at 95% confidence. Sketch the rejection region
and locate the test statistic value.
Solution:
It is seen from the problem that the population standard deviation is unknown but with the
sample size of 250, which is large enough, we can make use of the Central Limit Theorem and
consider the z-distribution. With 95% confidence, it shows that 𝛼 = 0.05. Thus, 1 − 𝛼 = 1 −
0.05 = 0.95.
Further, the phrase ‘less than’ indicates that we have a one-tailed (left-tailed) test, so the
critical value is z = -1.65 and the rejection region lies to its left. Thus, we verify now if z
= -1.75 lies in the rejection region or not. The sketch is shown below.

As shown in the figure, the test statistic
z = -1.75 lies in the rejection region.
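
The location of the rejection region depends on the direction of the test. As a rough sketch (the function name and the tail labels "left", "right", and "two" are hypothetical, and the standard normal curve is assumed), the check for a z-test statistic can be written as:

```python
from statistics import NormalDist

def in_rejection_region(z, alpha, tail):
    """Check whether a z-test statistic falls in the rejection region."""
    std_normal = NormalDist()
    if tail == "left":   # rejection region is the left tail only
        return z <= std_normal.inv_cdf(alpha)
    if tail == "right":  # rejection region is the right tail only
        return z >= std_normal.inv_cdf(1 - alpha)
    if tail == "two":    # alpha is divided equally between both tails
        return abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'left', 'right', or 'two'")

print(in_rejection_region(-1.75, 0.05, "left"))  # True, as in Sample Problem 3
print(in_rejection_region(1.68, 0.05, "two"))    # False, as in Sample Problem 1
```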



Directions: Answer the following problems.
1. Assuming that the population standard deviation is unknown for a randomly
selected sample whose size is 11, sketch the rejection region for a one-tailed test
with 90% confidence. Does t = 2.86 fall in the rejection region?
2. A random sample of 250 bottles of juice drink was taken and was found to have
an average content that is less than the company’s claim that each bottle
contains 500 mL of juice drink. Suppose that an appropriate test statistic
revealed a value of -1.75 at 90% confidence. Sketch the rejection region and
locate the test statistic value.

• A Type I error is committed when a true null hypothesis is rejected while a Type II
error is committed when you fail to reject a false null hypothesis.
• If the computed test statistic falls in the rejection region, then we must reject Ho.
If the test statistic falls outside the rejection region, then we do not reject Ho.
• The curve to be used is based on whether the population variance is known or
not. When the population variance is known, we use the z-distribution while
when the population variance is unknown, we use the t-distribution. However,
when the sample size is sufficiently large, the Central Limit Theorem may be
used, and the z-distribution may be considered.
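
The rule in the last bullet can be summarized as a small decision function. This is only a sketch of the module's rule of thumb; the function name is hypothetical, and the cutoff of 30 for a "sufficiently large" sample is the one used in this module.

```python
def choose_distribution(sigma_known, n):
    """Pick the curve to use, following the rule of thumb in this module."""
    if sigma_known:
        return "z"  # population variance is known
    if n >= 30:
        return "z"  # large sample: Central Limit Theorem applies
    return "t"      # small sample and unknown population variance

print(choose_distribution(sigma_known=False, n=250))  # z
print(choose_distribution(sigma_known=False, n=11))   # t
```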

A. Decide whether each statement is TRUE or FALSE. Write T for True and F
for False.
_____1. We use t-distribution when the population standard deviation is known.
_____2. In a one-tailed test, the rejection region is found on both tails of a
distribution.
_____3. The critical values divide the curve into rejection and non-rejection
regions.
_____4. Type II error is committed when a false hypothesis is not rejected.
_____5. The probability of not committing a Type I error is 1 − 𝛼.

B. Identify the type of error illustrated in each of the following.


_____1. Jonas says that he is not bald and instead admits that his hairline is just
receding.
_____2. Vicky said that she is just 26 years old, but the truth is, she is already
40 years old.
_____3. A judge decides to acquit a guilty suspect.
_____4. A man still chooses to stay with friends who have vices.
_____5. The officer detains an innocent man.
Directions: Read and analyze each item carefully, then circle the letter of the
correct answer from the given choices.
1. What type of error is committed when you fail to reject a false null hypothesis?
A. Type I B. Type II C. Type A D. Type B
2. How many possible combinations of decisions and truth values of a null
hypothesis are there when we perform hypothesis testing?
A. 4 B. 3 C. 2 D. 1
***Refer to the statements below to answer item numbers 3 and 4.
I. A man who weighs 80 kilograms argues that his weight is just 75
kilograms.
II. A judge accepts bribe even if it is wrong to do so.
III. Carla still insists on working illegally despite knowing its risks.
IV. A woman says that her skin color is not black and instead says that she is
dark-skinned.
3. Which of the statement/s above illustrates a Type I error?
A. I only B. I and IV C. III only D. II and III
4. Which of the statement/s above illustrates a Type II error?
A. I only B. I and IV C. III only D. II and III
***For item numbers 5 to 8, determine the critical value that matches the given
condition.
5. Population standard deviation is known, and the confidence level is 99% for a
two-tailed test.
A. ±1.65 B. ±1.96 C. ±2.33 D. ±2.58
6. Population standard deviation is unknown, the confidence level is 95% for a
two-tailed test and the sample size is 24.
A. ±1.711 B. ±1.714 C. ±2.064 D. ±2.069
7. Sample size is 101 but the population standard deviation is unknown, and the
confidence level is 95% for a two-tailed test.
A. ±1.65 B. ±1.96 C. ±2.33 D. ±2.58
8. Population standard deviation is unknown, the confidence level is 99% for a
one-tailed test and the sample size is 21.
A. 2.861 B. 2.845 C. 2.528 D. 2.518
9. If a certain test statistic lies to the right of the critical value, which of the
following must be true?
A. The test statistic is equal to the critical value.
B. The test statistic is less than the critical value.
C. The test statistic is greater than the critical value.
D. The critical value is greater than the test statistic.
***Refer to the situation below to answer item numbers 10 to 12.
A sample of 120 batteries were taken to verify the claim of its manufacturer that
the average lifespan of their batteries is 7.6 months.
10. Which distribution must be considered to identify the appropriate rejection
region?
A. t-distribution C. f-distribution
B. z-distribution D. insufficient data



11. What must be the critical value if the claim will be tested using 95% level of
confidence?
A. ±1.65 B. ±1.96 C. ±2.33 D. ±2.58
12. Assuming that a test statistic was determined to be 2.11. Does this value lie on
the rejection region?
A. Yes B. No C. Maybe D. insufficient data
***Refer to the situation below to answer item numbers 13 to 15.
A random sample of Grade 10 students were selected from a population whose
standard deviation is unknown.
13. Which distribution must be considered to identify the appropriate rejection
region?
A. t-distribution C. f-distribution
B. z-distribution D. insufficient data
14. Suppose that there were 25 students, what will be the critical value if α =
0.05?
A. ±1.708 B. ±1.711 C. ±2.060 D. ±2.064
15. Assuming that a test statistic was computed as 2.055. Does this value lie on
the rejection region?
A. Yes B. No C. Maybe D. insufficient data

Directions: Answer the following problems.

1. Locate z = 1.96 under the curve and sketch the rejection region for a one-tailed
test with 99% confidence. Is z found on the rejection region?
2. One hundred packs of potato chips were selected to verify the manufacturer’s
claim that the mean weight of each pack is 36 grams. At 95% confidence, are the
mean weights of the sample and the population significantly different after
knowing that the test statistic is 1.83? Sketch the rejection region and locate the
test statistic.
3. Previous records of a supermarket revealed that their goers have an average
budget of Php 1,550. Suppose the grocery budgets of 20 randomly chosen
supermarket goers were taken and a test statistic value of 2.15 was computed.
Is there enough evidence to say that there is no significant difference between
the sample mean and the population mean at 95% confidence? Locate the test
statistic and sketch the rejection region.

What I Know
1. A 2. A 3. B 4. D 5. A 6. C 7. A 8. A 9. B 10. B
11. D 12. B 13. A 14. D 15. A
What’s More
1. Yes 2.
What I Can Do
A. 1. F 2. F 3. T 4. T 5. T
B. 1. Type I 2. Type I 3. Type II 4. Type II 5. Type I
Assessment
1. B 2. A 3. B 4. D 5. D 6. D 7. B 8. C 9. C 10. B
11. B 12. A 13. A 14. D 15. B
Additional Activities
1. The critical value is 2.33 and the shaded area represents the rejection
region. The green segment represents the location of z = 1.96. Clearly, z is
NOT in the rejection region.
2. Since the sample size is large enough, the Central Limit Theorem may be
applied and so we proceed with the z-distribution. The critical values are ±1.96
and the shaded area represents the rejection region. The green segment represents
the location of z = 1.83. Therefore, z lies in the non-rejection region. Hence, we
fail to reject the null hypothesis.
3. We apply the t-distribution since the population standard deviation is assumed
to be unknown and the sample size is small. The critical values are ±2.093 and the
shaded region shows the rejection region. The orange segment gives the location of
the test statistic t = 2.15 and clearly it is found in the rejection region. Thus, we
reject the null hypothesis.
References:

Belecina, R., Baccay, E., and Mateo, E. (2016). Statistics and Probability. Rex
Publishing House. Manila.
Bluman, A. (2018). Elementary Statistics: A Step by Step Approach 10th edition.
McGraw Hill. New York, USA.
https://www.khanacademy.org/math/statistics-probability/significance-tests-one-
sample/idea-of-significance-tests/v/simple-hypothesis-testing
https://www.khanacademy.org/math/statistics-probability/significance-tests-one-
sample/more-significance-testing-videos/v/hypothesis-testing-and-p-values
https://www.statisticshowto.com/probability-and-statistics/hypothesis-testing/



Lesson 4
The Test Statistic



Targets:
1. Compute for the test-statistic value (population mean) (M11/12SP-IVd-1);
and
2. Draw conclusion about the population mean based on the test-statistic value
and the rejection region (M11/12SP-IVd-2).

Directions: Read and analyze each item carefully and circle the letter of the
correct answer from the given choices.
***Refer to the statements below to answer item numbers 1 and 2.
I. The distribution is normal or approximately normal.
II. The population standard deviation is known.
III. The population standard deviation is unknown.
IV. The sample size is greater than or equal to 30.
1. Which of the conditions above must be met so that one can compute for the z-
test statistic?
A. I and II B. I and III C. I, II and IV D. I, III and IV
2. Which of the conditions above must be met so that one can compute for the t-
test statistic?
A. I and II B. I and III C. I, II and IV D. I, III and IV
3. Which values must be compared in order to make a decision about the null
hypothesis?
A. test statistic and confidence level
B. test statistic and critical value
C. test statistic and degrees of freedom
D. test statistic and significance level
4. Given that 𝑛 = 135; 𝑋̅ = 15; 𝜇 = 8; 𝜎 = 1.23, what test statistic is appropriate to
use?
A. t-test statistic C. either A or B
B. z-test statistic D. insufficient data
5. Given the following: 𝑛 = 75; 𝑋̅ = 6.9; 𝜇 = 6.4; 𝑠 = 1.5, what must be the value of the
appropriate test statistic?
A. z = -2.89 B. z = 2.89 C. t = -2.89 D. t = 2.89
***Refer to the situation below to answer item numbers 6 to 10.
A pool of researchers claims that the average age of schooling among children
in a certain district is 4.8 years with a standard deviation of 0.21. A pre-school
teacher attempted to verify this claim by taking the ages of 360 first-time
schoolers in the said district and found out that the average age is 4.12 years.
6. What test statistic must be computed to test the claim of the researchers?
A. t-test statistic C. either A or B
B. z-test statistic D. insufficient data
7. What is the correct value of the test statistic?
A. -61.44 B. -58.42 C. -0.48 D.-0.02
8. What must be the appropriate critical value if a 99% confidence level was used?
A. ±1.65 B. ±1.96 C. ±2.33 D. ±2.58
9. How do the absolute values of the test statistic (TS) and the critical value (CV)
compare?
A. TS = CV C. TS < CV
B. TS > CV D. insufficient data
10. Which conclusion can be possibly drawn?
A. The claim of the researchers is true.
B. The average age of schooling is 4.12 years.
C. There is not enough evidence to support the claim of the researchers.
D. The sample selected by the pre-school teacher does not correctly represent
the population.
***Refer to the situation below to answer item numbers 11 to 15.
A large-scale survey revealed that the average number of hours that a Senior
High School student spends in doing social media activities is 5.8. In a certain
school, a teacher asked 25 senior high school students and he claimed that the
mean time that they spend in social media is 6.2 hours with a standard deviation
of 0.12 hours.
11. What test statistic must be determined to test the claim of the teacher?
A. z-test statistic C. either A or B
B. t-test statistic D. insufficient data
12. What is the correct value of the test statistic?
A. 16.67 B. 16.33 C. 15.83 D. 15.67
13. Suppose a 95% confidence level was used, what must be the appropriate critical
value?
A. ±2.797 B. ±2.492 C. ±2.156 D. ±2.064
14. How do the absolute values of the test statistic (TS) and the critical value (CV)
compare?
A. TS = CV C. TS < CV
B. TS > CV D. insufficient data
15. What can be possibly concluded from the result above?
A. The claim of the teacher is false.
B. The mean time spent by senior high school students in doing social media
activities is really 5.8 hours.
C. There is enough evidence to believe the claim of the teacher.
D. The teacher lacks on the number of students interviewed.



Lesson 4
The Test Statistic
In this lesson, the learners will be equipped with the skill of computing for the
appropriate test statistic needed in performing hypothesis testing later.

In the previous lesson, we learned about critical values and their role in
establishing the rejection region. To arrive at a decision on whether to reject the
null hypothesis or not, one must also be able to compute the test statistic
accurately.
The calculation of the test statistic mainly depends on whether or not the
population standard deviation is known as well as on the size of the sample. In this
lesson, we introduce the formula and the procedures for computing an appropriate
test statistic which will be used to arrive at a correct decision that may eventually
lead to sound conclusions.

Generally, to compute for a test statistic, we subtract the expected value
from the observed value and divide the result by the standard error. There are two
main statistical tests performed concerning the population mean: the z-test and
the t-test.
On one hand, we use z-test when 𝑛 ≥ 30 or when the population is normally
distributed and 𝜎 is known. The formula for the z-test statistic is given by
$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
where 𝑋̅ is the sample mean,
𝜇 is the hypothesized population mean,
𝜎 is the population standard deviation, and
𝑛 is the sample size.
On the other hand, we use t-test when the population is normal or
approximately normal and 𝜎 is unknown. The formula for the t-test statistic is given
by
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$
where 𝑋̅ is the sample mean,
𝜇 is the hypothesized population mean,
𝑠 is the sample standard deviation, and
𝑛 is the sample size.
Notice that the formulas presented above are very much similar. It is only in
the standard error of the mean that they differ since in the use of t-test, the
population standard deviation is unknown and is replaced by the sample standard
deviation.
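
Because the two formulas differ only in which standard deviation is used, a single helper can compute either statistic. The sketch below is an illustration, not the module's own procedure; the function name and the numbers in the example are hypothetical.

```python
import math

def compute_statistic(sample_mean, hypothesized_mean, std_dev, n):
    """Return (sample mean - hypothesized mean) divided by the standard error.

    Pass sigma as std_dev for a z-test statistic,
    or the sample standard deviation s for a t-test statistic.
    """
    standard_error = std_dev / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Hypothetical values: sample mean 52, hypothesized mean 50, sigma 8, n 64.
z = compute_statistic(52, 50, 8, 64)
print(z)  # 2.0, since the standard error is 8 / 8 = 1
```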



After computing for the appropriate test statistic, a decision must be made.
To come up with a correct decision, we apply the following rule: If the absolute value
of the computed test statistic is greater than or equal to the critical value, the null
hypothesis is rejected; if the absolute value of the computed test statistic is less than
the critical value, we do not reject the null hypothesis.
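
The decision rule above translates directly into code. This is a minimal sketch with a hypothetical function name; it compares absolute values, as the rule states.

```python
def decide(test_statistic, critical_value):
    """Apply the decision rule by comparing absolute values."""
    if abs(test_statistic) >= abs(critical_value):
        return "reject Ho"
    return "do not reject Ho"

print(decide(-6.5, 2.306))  # reject Ho
print(decide(1.68, 1.96))   # do not reject Ho
```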

After deciding, we proceed to writing sound conclusions. These are inferences
that we can draw from the context of the situation based on whether we have rejected
the null hypothesis or not. To illustrate how we compute test statistics and write
conclusions thereafter, study the example below.

Sample Problem:

1) Previous records revealed that the mean salary of the high school teachers in a
municipality is Php 16,250 with a standard deviation of Php 1,400. A sample of
50 teachers was taken and was reported to have a mean salary of Php 18,000. At
95% confidence level, do we have enough evidence to believe what the records
revealed?

Solution:
Since the sample size is 50, which is greater than or equal to 30 and that
the population standard deviation is known, we compute for the z-test statistic.
Substituting the known values to our formula for z-test, we obtain the following:

$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{18\,000 - 16\,250}{1\,400/\sqrt{50}} \approx \frac{1\,750}{197.99} \approx 8.839$$

At the 95% confidence level, the critical values for z are ±1.96. Clearly, the
computed z-test statistic, which is 8.839, is greater than the z-critical value of 1.96.
Thus, we reject the null hypothesis stating that the population mean salary and
the sample mean salary are statistically equal.
Therefore, the sample mean is statistically different from that of the population
mean. This implies that the selected high school teachers have significantly
different salary as compared to the population and so, there is not enough evidence
to believe what the records have revealed.
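
The arithmetic in this solution can be verified with a few lines of Python. This is a sketch for checking the computation only; the variable names are hypothetical, and the critical value is taken from the standard normal curve instead of a printed z-table.

```python
import math
from statistics import NormalDist

sample_mean, claimed_mean = 18_000, 16_250  # values from the problem
sigma, n = 1_400, 50

z = (sample_mean - claimed_mean) / (sigma / math.sqrt(n))
critical = NormalDist().inv_cdf(1 - 0.05 / 2)  # two-tailed, alpha = 0.05
reject_null = abs(z) >= critical

print(round(z, 3), round(critical, 2), reject_null)  # 8.839 1.96 True
```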

2) A medical report claims that the number of infections per week at a certain
hospital in a province is 12.7. A random sample of 9 weeks had a mean number
of 11.4 infections with a standard deviation of 0.6. Is there enough evidence to
support the claim at 95% confidence level?

Solution:
Given that the sample size is 9, which is less than 30 and that the population
standard deviation is unknown, we compute for the t-test statistic. Applying the
formula for this test value, we have the following:
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{11.4 - 12.7}{0.6/\sqrt{9}} = \frac{-1.3}{0.2} = -6.5$$
At the 95% confidence level, 𝛼 = 0.05. Also, when 𝑛 = 9, 𝑑𝑓 = 9 − 1 = 8. Using the
t-table, these lead to the t-critical value of ±2.306. Comparing the absolute values,
we can say that the computed test statistic is greater than the critical value.
Hence, we reject the null hypothesis stating that the population mean and the
sample mean are statistically equal.
Therefore, the sample mean is statistically different from the
population mean. This implies that the selected weeks have a significantly different
number of infections compared to the population, and so there is not enough
evidence to support what the medical report claims.
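
The t-test above can be sketched the same way. Python's standard library has no t-distribution, so this sketch assumes a small hand-typed excerpt of a t-table (two-tailed, 𝛼 = 0.05). The dictionary name and the function name are hypothetical.

```python
from math import sqrt

# Excerpt of a t-table: two-tailed critical values at alpha = 0.05, keyed by df.
# Only a few rows are typed in here for illustration.
T_CRITICAL_05_TWO_TAILED = {8: 2.306, 14: 2.145, 15: 2.131, 19: 2.093, 24: 2.064}


def t_statistic(sample_mean, pop_mean, sample_sd, n):
    """t = (sample mean - population mean) / (s / sqrt(n))."""
    return (sample_mean - pop_mean) / (sample_sd / sqrt(n))


# Values from the hospital infection problem
n = 9
t = t_statistic(sample_mean=11.4, pop_mean=12.7, sample_sd=0.6, n=n)

df = n - 1                                # degrees of freedom
t_critical = T_CRITICAL_05_TWO_TAILED[df]
reject_null = abs(t) >= t_critical        # compare absolute values

print(round(t, 3), df, t_critical, reject_null)
```

A full t-table, or a statistics package if one is available, would replace the excerpt in practice.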

Compute for the appropriate test statistic in each of the following:


a. 𝑛 = 17; 𝑋̅ = 8; 𝜇 = 7.4; 𝑠 = 0.5
b. 𝑛 = 40; 𝑋̅ = 6.24; 𝜇 = 5.11; 𝜎 = 3.6
c. 𝑛 = 100; 𝑋̅ = 11; 𝜇 = 9; 𝜎 = 1.2
d. 𝑛 = 10; 𝑋̅ = 5.9; 𝜇 = 6.3; 𝑠 = 0.125
e. 𝑛 = 48; 𝑋̅ = 10.27; 𝜇 = 9.4; 𝑠 = 2.04

• To compute for a test statistic, we subtract the expected value from the
observed value and divide the result by the standard error.
• We use the z-test when 𝑛 ≥ 30 or when the population is normally distributed and
𝜎 is known. The formula for the z-test statistic is given by
$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$.
• We use the t-test when the population is normal or approximately normal and 𝜎
is unknown. The formula for the t-test statistic is given by
$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$.
• If the absolute value of the computed test statistic is greater than or equal to the
critical value, the null hypothesis is rejected. If the absolute value of the
computed test statistic is less than the critical value, we do not reject the null
hypothesis.
• Conclusions are inferences that we can draw from the context of the situation
based on whether we have rejected the null hypothesis or not.
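
The summary above can be expressed as two small helper functions. This is only a sketch: the names `compute_statistic` and `decide` are made up for illustration, and the choice between z and t is reduced to whether 𝜎 is known. The numbers come from practice item c and from Additional Activities item 2 of this lesson.

```python
from math import sqrt


def compute_statistic(sample_mean, pop_mean, sd, n, sigma_known):
    """Return (label, value); `sd` is sigma when sigma_known is True, else s."""
    value = (sample_mean - pop_mean) / (sd / sqrt(n))
    label = "z" if sigma_known else "t"
    return label, value


def decide(statistic, critical_value):
    """Apply the decision rule on absolute values."""
    if abs(statistic) >= abs(critical_value):
        return "reject H0"
    return "do not reject H0"


# Practice item c: n = 100, sample mean = 11, mu = 9, sigma = 1.2
label, value = compute_statistic(11, 9, 1.2, 100, sigma_known=True)
print(label, round(value, 3))

# Decision for item c at the two-tailed critical value 1.96
print(decide(value, 1.96))

# Additional Activities item 2: t = 2.259 against t-critical = 2.977
print(decide(2.259, 2.977))
```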

Directions: Decide whether each statement is TRUE or FALSE. Write T for True
and F for False.
_______1. We use z-test statistic when the population standard deviation is
unknown.
_______2. The inferences that we can draw out of a decision are called
conclusions.



_______3. If the critical value is greater than the computed test statistic, we do
not reject the null hypothesis.
_______4. We use t-test statistic when the population standard deviation is
known.
_______5. The distribution must be normal or approximately normal so that one
can proceed with computing for either z- or t-test statistic.

Directions: Read and analyze each item carefully and circle the letter of the
correct answer from the given choices.
***Refer to the statements below to answer item numbers 1 and 2.
I. The distribution is normal or approximately normal.
II. The population standard deviation is known.
III. The population standard deviation is unknown.
IV. The sample size is greater than or equal to 30.
1. Which of the conditions above must be met so that one can compute for the t-
test statistic?
A. I and II B. I and III C. I, II and IV D. I, III and IV
2. Which of the conditions above must be met so that one can compute for the z-
test statistic?
A. I and II B. I and III C. I, II and IV D. I, III and IV
3. Which values must be compared to decide about the null hypothesis?
A. test statistic and critical value
B. test statistic and significance level
C. test statistic and confidence level
D. test statistic and degrees of freedom
4. Given that 𝑛 = 50; 𝑋̅ = 12; 𝜇 = 11; 𝜎 = 0.12, what test statistic is appropriate to
use?
A. t-test statistic C. either A or B
B. z-test statistic D. insufficient data
5. Given the following: 𝑛 = 25; 𝑋̅ = 6.9; 𝜇 = 6.4; 𝑠 = 1.5, what must be the value of the
appropriate test statistic?
A. z = -1.667 B. z = 1.667 C. t = -1.667 D. t = 1.667
***Refer to the situation below to answer item numbers 6 to 10.
A pool of researchers claims that the average age of schooling among children
in a certain district is 4.8 years with a standard deviation of 0.21. A pre-school
teacher attempted to verify this claim by taking the ages of 20 first-time schoolers
in the said district and found out that the average age is 4.12 years.
6. In testing the claim of the researchers, what test statistic must be determined?
A. t-test statistic C. either A or B
B. z-test statistic D. insufficient data
7. Which of the following is the correct value of the test statistic?
A. 14.481 B. 10.481 C. -0.032 D. -14.481
8. Suppose that 95% confidence level was used, what must be the appropriate
critical value?
A. ±1.65 B. ±1.96 C. ±2.093 D. ±2.064
9. How do the absolute values of the test statistic (TS) and the critical value (CV)
compare?
A. TS = CV C. TS < CV
B. TS > CV D. insufficient data



10. Which conclusion can be possibly drawn?
A. The claim of the researchers is true.
B. The average age of schooling is 4.12 years.
C. There is not enough evidence to support the claim of the researchers.
D. The sample selected by the pre-school teacher does not correctly represent
the population.
***Refer to the situation below to answer item numbers 11 to 15.
A large-scale survey revealed that the average number of hours that a Senior
High School student spends in doing social media activities is 5.8. In a certain
school, a teacher asked 125 senior high school students and he claimed that the
mean time that they spend in social media is 6.2 hours with a standard deviation
of 0.12 hours.
11. To test the claim of the teacher, what test statistic must be computed?
A. z-test statistic C. either A or B
B. t-test statistic D. insufficient data
12. Which of the following is the correct value of the test statistic?
A. 17.46 B. 26.46 C. 37.27 D. 37.98
13. If a 99% confidence level was used, what must be the appropriate critical value?
A. ±1.65 B. ±1.96 C. ±2.33 D. ±2.58
14. How do the absolute values of the test statistic (TS) and the critical value (CV)
compare?
A. TS = CV C. TS < CV
B. TS > CV D. insufficient data
15. What can be possibly concluded from the result above?
A. The claim of the teacher is false.
B. The mean time spent by senior high school students in doing social media
activities is really 5.8 hours.
C. There is enough evidence to believe the claim of the teacher.
D. The teacher lacks on the number of students interviewed.

Additional Activities

Directions: Answer the following problems.


1. A manufacturing company claims that each pack of their potato chips weighs 40
grams with a standard deviation of 2.5 grams. To verify the manufacturer’s claim,
100 packs of potato chips were selected and were found to have a mean weight of
38.1 grams. At the 95% confidence level, is there enough evidence to reject the claim
of the manufacturing company?
2. According to statista.com, as of April 2, 2020, the mean age of the COVID-19
patients was 53.1 years old. Suppose a pool of researchers randomly took 15
patients with mean age and standard deviation determined to be 55.2 years and
3.6 respectively. Can we conclude with 99% confidence that the population and
sample mean are significantly different?



What I Know
1. C 2. B 3. B 4. B 5. D 6. B 7. A 8. D 9. B 10. C
11. B 12. A 13. D 14. B 15. C
What’s More
a. t=4.948 b. z=1.985 c. z=16.667 d. t=-10.119 e. t=2.955
What I Can Do
1. F 2. T 3. T 4. F 5. T
Assessment
1. B 2. C 3. A 4. B 5. D 6. A 7. D 8. C 9. B 10. C
11. A 12. C 13. D 14. B 15. C
Additional Activities
1. Solution:
Since the sample size is 100, which is greater than or equal to 30 and that
the population standard deviation is known, we compute for the z-test
statistic.
Substituting the known values to our formula for z-test, we obtain the
following:
$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{38.1 - 40}{2.5/\sqrt{100}} = -7.6$$
At the 95% confidence level, the critical value for z is ±1.96. Clearly, the
absolute value of the computed z-test statistic, which is 7.6, is greater than the
z-critical value of 1.96. Thus, we reject the null hypothesis stating that the
population mean weight and the sample mean weight are statistically
equal. Therefore, the sample mean is statistically different from the
population mean. This implies that the selected packs have a significantly
different weight compared to the population.
2. Solution:
Given that the sample size is 15, which is less than 30 and that the
population standard deviation is unknown, we compute for the t-test statistic.
Applying the formula for this test value, we have the following:
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{55.2 - 53.1}{3.6/\sqrt{15}} \approx 2.259$$
At 99% confidence level, 𝛼 = 0.01. Also, when 𝑛 = 15, 𝑑𝑓 = 15 − 1 = 14.
Using the t-table, these lead to the t-critical value of ±2.977. Comparing the
absolute values, we can say that the computed test statistic is less than the
critical value. Hence, we do not reject the null hypothesis. This means that the
mean age of the selected patients does not significantly differ from the
population mean.
References
Belecina, R., Baccay, E., and Mateo E. (2016). Statistics and Probability. Rex
Publishing House. Manila.
Bluman, A. (2018). Elementary Statistics: A Step by Step Approach, 10th edition.
McGraw Hill. New York, USA.
https://www.statista.com/statistics/1104061/philippines-coronavirus-covid-19-patients-by-
age-group/
https://www.khanacademy.org/math/statistics-probability/significance-tests-one-sample/idea-
of-significance-tests/v/simple-hypothesis-testing
https://www.khanacademy.org/math/statistics-probability/significance-tests-one-
sample/more-significance-testing-videos/v/hypothesis-testing-and-p-values
https://www.statisticshowto.com/probability-and-statistics/hypothesis-testing/



11
STATISTICS &
PROBABILITY
Quarter 4-Module 4
Lesson 5
Hypothesis Testing



Targets:
1) Solve problems involving test of hypothesis on the population mean
(M11/12SP-IVe-1);
2) Formulate the appropriate null and alternative hypotheses on a population
proportion (M11/12SP-IVe-2); and
3) Identify the appropriate form of the test-statistic when the Central Limit
Theorem is to be used (M11/12SP-IVe-3).

Directions: Read and analyze each item carefully, then circle the letter of the
correct answer from the given choices.

Refer to the problem below to answer item numbers 1-5.


A table manufacturing company reported that at the end of 2021, the mean
number of delivered tables daily is 245.2. If a random sample of 16 manufacturing
days revealed that the mean number of delivered tables is 250.1 with a standard
deviation of 3.6, test the claim of the company in its report at 0.05 level of
significance.
1. Which of the following shows the correct alternative hypothesis?
A. µ = 245.2 B. µ ≠ 245.2 C. µ > 245.2 D. µ < 245.2
2. Which distribution is appropriate for the desired hypothesis testing?
A. t-distribution B. z-distribution C. f-distribution D. none of these
3. What is the corresponding critical value?
A. ± 2.131 B. ±1.96 C. ± 1.68 D. ± 1.42
4. Which of the following is the correct test statistic?
A. 4.31 B. 5.05 C. 5.44 D. 6.87
5. Which of the following can be concluded from the results of the hypothesis
testing?
A. There is not enough evidence to reject the claim of the company in its report.
B. There is enough evidence to reject the claim of the company in its report.
C. On the average, the company delivers more than 245.2 tables.
D. No valid conclusion can be drawn.
Refer to the problem below to answer item numbers 6-8.
Mr. Mariano reported that the average time of social media exposure of college
students per day is 5.5 hours. If a random sample of 200 college students were
surveyed and showed that they have 4.5 hours social media exposure per day with
0.5 hour of standard deviation, test the claim of Mr. Mariano in his report at 0.05
level of significance.
6. Which of the following shows the correct null hypothesis?
A. µ = 5.5 B. µ ≠ 5.5 C. µ > 5.5 D. µ < 5.5
7. Which distribution is appropriate for the desired hypothesis testing?
A. t-distribution B. z-distribution C. f-distribution D. none of these



8. If the absolute value of the computed test statistics is greater than the critical
value, which can be concluded?
A. There is not enough evidence to reject the claim of Mr. Mariano in his report.
B. There is enough evidence to reject the claim of Mr. Mariano in his report.
C. The average time of social media exposure of college students is 5.5 hours.
D. No valid conclusion can be drawn.
Refer to the problem below to answer item numbers 9-10.
A recent report indicated that 59% of adults have at least 7 hours of sleep in a day.
To verify this claim, a researcher surveyed 100 adults and found that 60 of them
affirm the report.
9. Which of the following represents the null hypothesis?
A. Ho: 𝑝 = 0.59 B. Ho: 𝑝 ≠ 0.59 C. Ho: 𝑝 > 0.59 D. Ho: 𝑝 < 0.59
10. Which of the following represents the alternative hypothesis?
A. H1: 𝑝 = 0.59 B. H1: 𝑝 ≠ 0.59 C. H1: 𝑝 > 0.59 D. H1: 𝑝 < 0.59



Lesson 5
Hypothesis Testing
In this lesson, the learners will become acquainted with the entire procedure for performing
hypothesis testing concerning both the population mean and the population proportion.

In the past lessons, you learned how to formulate null and alternative
hypotheses for population mean, identify critical values, compute test statistic,
decide, and state conclusions based on the results. The test of hypothesis concerning
population mean may be described as a decision-making process about a certain
claim concerning a population.
The previous lessons have allowed you to experience how this entire
process is performed piece by piece. In this lesson, we will look at all
these procedures as one big picture. The test of hypothesis may be conducted
in three ways, namely the traditional method, the p-value method, and the confidence
interval method. In this lesson, we will tackle only the traditional method. Its steps
are summarized below.
Steps in Hypothesis Testing using the Traditional Method
1. State the hypothesis and identify the claim.
2. Find the critical value.
3. Compute for the appropriate test statistic.
4. Make a decision.
5. State the conclusion.

To solve problems involving test of hypothesis concerning population mean,
we follow the steps presented above. As an illustration, let us take the problem below.

Sample Problem 1:
According to the latest data published by the World Health Organization
(WHO) in 2018, the life expectancy of male Filipinos is 66.2 years. A random sample
of 50 recorded deaths among male Filipinos was taken and was found to have a mean
of 64.6 years. Assuming that the population standard deviation is 7.2 years, does
this seem to indicate that the mean life span of male Filipinos is less than 66.2 years?
Use 0.05 level of significance.
Steps and Solution
Step 1. State the hypothesis and identify the claim.
Solution: The following are the hypotheses:
Ho: 𝜇 = 66.2    H1: 𝜇 < 66.2 (claim)
Step 2. Find the critical value.
Solution: Since the population standard deviation is known, the appropriate
distribution is z. Also, the hypotheses are directional, which implies a one-tailed
test with 𝛼 = 0.05. Using the z-table, the critical z-value is -1.65.
Step 3. Compute for the appropriate test statistic.
Solution: As mentioned in Step 2, we use the z-distribution. So, the appropriate
test statistic is z and is computed as follows:
$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{64.6 - 66.2}{7.2/\sqrt{50}} \approx -1.571$$
Step 4. Make a decision.
Solution: Comparing the absolute values of the computed z-test statistic and the
z-critical value, it can be seen that |z_computed| < |z_critical|. Thus, at 0.05 level
of significance, the null hypothesis is not rejected.
Step 5. State the conclusion.
Solution: Based on the decision, there is not enough evidence to support the claim
that the life span of male Filipinos is less than 66.2 years.
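
The five steps of Sample Problem 1 can be traced in a short Python sketch. The variable names are illustrative. The sketch assumes the left-tail critical value is computed with `statistics.NormalDist`, which gives about -1.645 instead of the rounded table value -1.65 used above; the decision is the same either way.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: Ho: mu = 66.2, H1: mu < 66.2 (claim), so the test is left-tailed
mu_0, x_bar, sigma, n, alpha = 66.2, 64.6, 7.2, 50, 0.05

# Step 2: critical value that cuts off an area of alpha in the left tail
z_critical = NormalDist().inv_cdf(alpha)

# Step 3: test statistic
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Step 4: in a left-tailed test, reject Ho only if z is at or below the critical value
reject_null = z <= z_critical

# Step 5: conclusion in words
if reject_null:
    conclusion = "enough evidence to support the claim"
else:
    conclusion = "not enough evidence to support the claim"

print(round(z_critical, 3), round(z, 3), reject_null, conclusion)
```

Comparing the signed values directly (`z <= z_critical`) is equivalent, for a left-tailed test, to the absolute-value comparison in Step 4.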
Sample Problem 2:
In a report by Sanchez (2020), during 2016 the average household electric
consumption per capita in our country is 248.1 kilowatt hours. If a random sample
of 16 households included in a planned study indicated that their consumption is
254.3 kilowatt hours with a standard deviation of 3.6 kilowatt hours, test the claim
of Sanchez in his report at 0.05 level of significance.
Steps and Solution
Step 1. State the hypothesis and identify the claim.
Solution: The following are the hypotheses:
Ho: 𝜇 = 248.1 (claim)    H1: 𝜇 ≠ 248.1
Step 2. Find the critical value.
Solution: Since the population standard deviation is unknown and the sample size is
less than 30, the appropriate distribution is t. Also, the hypotheses are
non-directional, which implies a two-tailed test with 𝛼 = 0.05.
Using the t-table, with 𝑑𝑓 = 16 − 1 = 15, the critical t-value is ±2.131.
Step 3. Compute for the appropriate test statistic.
Solution: As mentioned in Step 2, we use the t-distribution. So, the appropriate
test statistic is t and is computed as follows:
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{254.3 - 248.1}{3.6/\sqrt{16}} \approx 6.889$$
Step 4. Make a decision.
Solution: Comparing the absolute values of the computed t-test statistic and the
t-critical value, it is clear that |t_computed| > |t_critical|. Thus, at 0.05 level of
significance, the null hypothesis is rejected.
Step 5. State the conclusion.
Solution: Based on the decision, there is enough evidence to reject the claim of
Sanchez in his report that the average electric consumption in Filipino households is
248.1 kilowatt hours.



At this point, you should know that hypothesis testing applies not only
to problems that concern the population mean but also to those on population proportion.
To effectively carry out hypothesis testing in these cases, one must be able to
correctly state hypotheses for population proportion.
Instead of actual population mean values, our hypotheses concern a part or a
percentage of a population. Consider the following hypothetical statements:
a. 72% of public elementary school teachers have their own internet connection
at home;
b. 12% of diabetes patients are at risk of having kidney failure; and
c. 16% of female college students graduate with honors.
These statements are examples of claims that involve population proportion,
and thus, may be subjected to hypothesis testing. The establishment of the
hypotheses in these cases is like those which concern population mean. We will use
𝑝 to denote the population proportion.
To illustrate how hypotheses concerning population proportion are formulated, let
us look at the illustrations below.
State the null and alternative hypotheses for each of the following.
a. A recent report indicated that 67% of teenagers spend more than 5 hours
doing social media activities. To verify this claim, a researcher surveyed 90 teenagers
and found that 54 of them affirm the report.

Solution:
In the above problem, we formulate the following hypotheses:
Ho: 𝑝 = 0.67 (claim) H1: 𝑝 ≠ 0.67

b. A survey revealed that more than 46% of working professionals dine in at fast
food stores daily. A pool of researchers tested this survey result by taking 75
working professionals, with 52 of them agreeing with the result.

Solution:
In the above problem, we formulate the following hypotheses:
Ho: 𝑝 = 0.46 H1: 𝑝 > 0.46 (claim)

A recent survey of 200 people revealed that the mean time teenagers spend watching
television is 4.2 hours. Previous national records say that the mean time
was 3.8 hours with a standard deviation of 0.3 hours. Do the survey results
significantly differ from previous records at 0.05 level of significance?

What I Have Learned

• There are five steps in the Traditional Method of Hypothesis Testing. These
are (1) State the hypothesis and identify the claim; (2) Find the critical
value; (3) Compute for the appropriate test statistic; (4) Make a decision;
and (5) State the conclusion.
• Situations in which a percentage of the population is given instead of
means involve population proportion.
• When the Central Limit Theorem is used, the appropriate form of the test
statistic is z.

A. State the null and alternative hypotheses for each of the following.
1. A recent report indicated that 72% of teachers spend more than 8 hours
doing schoolwork in the current work-from-home arrangement. To verify
this claim, a researcher surveyed 80 teachers and found that 63 of them
affirm the report.
2. A survey revealed that less than 36% of children ages 8 to 10 years old are
exposed to computer games daily. A pool of researchers tested this survey
result by taking 105 children in the given age bracket, with 41 of them
agreeing with the result.
B. Read and analyze the problem below then solve.
A random sample of 150 bottles of juice drink was taken and was found to
have an average content of 318 mL with a standard deviation of 2.2 mL. This
average content is less than the company’s claim that each bottle contains 330
mL of juice drink. At 0.05 significance level, do the contents of the juice drink of
the sample significantly differ from that of the population?

Directions: Read and analyze each item carefully, then circle the letter of the
correct answer from the given choices.

Refer to the problem below to answer item numbers 1-5.


A chocolate manufacturing company reported that at the end of 2021, the mean
number of boxes of chocolates delivered daily is 514.3. If a random sample of 25
manufacturing days revealed that the mean number of delivered boxes of chocolates
is 510.5 with a standard deviation of 2.5, test the claim of the company in its report
at 0.05 level of significance.
1. Which of the following shows the correct null hypothesis?
A. µ = 514.3 B. µ ≠ 514.3 C. µ > 514.3 D. µ < 514.3
2. Which distribution is appropriate for the desired hypothesis testing?
A. t-distribution B. z-distribution C. f-distribution D. none of these
3. What is the corresponding critical value?
A. ± 2.064 B. ±2.060 C. ± 1.96 D. ± 1.65
4. Which of the following is the correct test statistic?
A. 7.6 B. 5.05 C. -6.5 D. -7.6
5. Which of the following can be concluded from the results of the hypothesis
testing?
A. There is not enough evidence to reject the claim of the company in its report.
B. There is enough evidence to reject the claim of the company in its report.



C. On the average, the company delivers more than 514.3 boxes of
chocolates.
D. No valid conclusion can be drawn.
Refer to the problem below to answer item numbers 6-8.
Mr. Mariano reported that the average time of social media exposure of senior high
school students per day is 6.5 hours. If a random sample of 100 senior high school
students were surveyed and showed that they have 5.5 hours social media exposure
per day with 0.7 hour of standard deviation, test the claim of Mr. Mariano in his
report at 0.05 level of significance.
6. Which of the following shows the correct alternative hypothesis?
A. µ = 6.5 B. µ ≠ 6.5 C. µ > 6.5 D. µ < 6.5
7. Which distribution is appropriate for the desired hypothesis testing?
A. t-distribution B. z-distribution C. f-distribution D. none of these
8. If the absolute value of the computed test statistics is greater than the critical
value, which can be concluded?
A. There is not enough evidence to reject the claim of Mr. Mariano in his report.
B. There is enough evidence to reject the claim of Mr. Mariano in his report.
C. The average time of social media exposure of college students is 5.5 hours.
D. No valid conclusion can be drawn.
Refer to the problem below to answer item numbers 9-10.
A recent report indicated that more than 61% of teenagers have at least 7.5 hours of
sleep in a day. To verify this claim, a researcher surveyed 100 teenagers and found
that 55 of them affirm the report.
9. Which of the following represents the null hypothesis?
A. Ho: 𝑝 = 0.61 B. Ho: 𝑝 ≠ 0.61 C. Ho: 𝑝 > 0.61 D. Ho: 𝑝 < 0.61
10. Which of the following represents the alternative hypothesis?
A. H1: 𝑝 = 0.61 B. H1: 𝑝 ≠ 0.61 C. H1: 𝑝 > 0.61 D. H1: 𝑝 < 0.61

Read and analyze the problem below then solve.


A car company revealed that their car dealers were able to sell an average of
18 cars last year. To verify this claim, the over-all manager selected 52 car dealers
and found out that the average number of cars sold by these dealers is 20 with a
standard deviation of 1.5. At 0.05 level of significance, is there a significant
difference between the average cars sold by the dealer based on the company’s
report and based on the selection of the over-all manager?



What I Know
1. B 2. A 3. A 4. C 5. B 6. A 7. B 8. B 9. A 10. B
What’s More
Steps and Solution
Step 1. State the hypothesis and identify the claim.
Solution: The following are the hypotheses:
Ho: 𝜇 = 3.8    H1: 𝜇 ≠ 3.8 (claim)
Step 2. Find the critical value.
Solution: Since the population standard deviation is known, the appropriate
distribution is z. Also, the hypotheses are non-directional, which implies a
two-tailed test with 𝛼 = 0.05. Using the z-table, the critical z-value is ±1.96.
Step 3. Compute for the appropriate test statistic.
Solution: As mentioned in Step 2, we use the z-distribution. So, the appropriate
test statistic is z and is computed as follows:
$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{4.2 - 3.8}{0.3/\sqrt{200}} \approx 18.856$$
Step 4. Make a decision.
Solution: Comparing the absolute values of the computed z-test statistic and the
z-critical value, it is clear that |z_computed| > |z_critical|. Thus, at 0.05 level of
significance, the null hypothesis is rejected.
Step 5. State the conclusion.
Solution: Based on the decision, there is enough evidence to support the claim that
the mean time teenagers spend watching television differs significantly from the
3.8 hours a day in previous records.
What I Can Do
A. 1. Ho: 𝑝 = 0.72 H1: 𝑝 ≠ 0.72 2. Ho: 𝑝 = 0.36 H1: 𝑝 < 0.36
B.
Steps and Solution
Step 1. State the hypothesis and identify the claim.
Solution: The following are the hypotheses:
Ho: 𝜇 = 330 (claim)    H1: 𝜇 ≠ 330
Step 2. Find the critical value.
Solution: Since the population standard deviation is unknown but the sample size is
sufficiently large, we apply the Central Limit Theorem. So, the appropriate
distribution is z. Also, the hypotheses are non-directional, which implies a
two-tailed test with 𝛼 = 0.05. Using the z-table, the critical z-value is ±1.96.
Step 3. Compute for the appropriate test statistic.
Solution: As mentioned in Step 2, we use the z-distribution. So, the appropriate
test statistic is z and is computed as follows:
$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{318 - 330}{2.2/\sqrt{150}} \approx -66.804$$
Step 4. Make a decision.
Solution: Comparing the absolute values of the computed z-test statistic and the
z-critical value, it can be seen that |z_computed| > |z_critical|. Thus, at 0.05 level
of significance, the null hypothesis is rejected.
Step 5. State the conclusion.
Solution: Based on the decision, there is enough evidence to reject the claim of the
company that the average content of the juice drink is 330 mL.
Assessment
1. A 2. A 3. A 4. D 5. B 6. B 7. B 8. B 9. A 10. C
Additional Activities
Steps and Solution
Step 1. State the hypothesis and identify the claim.
Solution: The following are the hypotheses:
Ho: 𝜇 = 18 (claim)    H1: 𝜇 ≠ 18
Step 2. Find the critical value.
Solution: Since the population standard deviation is unknown but the sample size is
sufficiently large, we apply the Central Limit Theorem. So, the appropriate
distribution is z. Also, the hypotheses are non-directional, which implies a
two-tailed test with 𝛼 = 0.05. Using the z-table, the critical z-value is ±1.96.
Step 3. Compute for the appropriate test statistic.
Solution: As mentioned in Step 2, we use the z-distribution. So, the appropriate
test statistic is z and is computed as follows:
$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{20 - 18}{1.5/\sqrt{52}} \approx 9.615$$
Step 4. Make a decision.
Solution: Comparing the absolute values of the computed z-test statistic and the
z-critical value, it can be seen that |z_computed| > |z_critical|. Thus, at 0.05 level
of significance, the null hypothesis is rejected.
Step 5. State the conclusion.
Solution: Based on the decision, there is enough evidence to reject the claim that
the average number of cars sold by the car dealers of the company is 18.
References:
Belecina, R., Baccay, E., and Mateo E. (2016). Statistics and Probability. Rex
Publishing House. Manila.
Bluman, A. (2018). Elementary Statistics: A Step by Step Approach, 10th edition.
McGraw Hill. New York, USA.
https://www.statista.com/statistics/1104061/philippines-coronavirus-covid-19-patients-by-
age-group/
https://www.khanacademy.org/math/statistics-probability/significance-tests-one-sample/idea-
of-significance-tests/v/simple-hypothesis-testing
https://www.khanacademy.org/math/statistics-probability/significance-tests-one-
sample/more-significance-testing-videos/v/hypothesis-testing-and-p-values
https://www.statisticshowto.com/probability-and-statistics/hypothesis-testing/



11
STATISTICS &
PROBABILITY
Quarter 4-Module 4
Lesson 6
Hypothesis Testing About A
Population Proportion



TARGETS:
3. Identify the appropriate rejection region for a given level of significance
4. Compute for the test statistic value (Z-test for Proportion)
5. Draw conclusion about the population proportion based on the whole
hypothesis testing procedure
6. Solve problems using Z-test for proportions

Do this Pre-Test: Check the appropriate box (TRUE or FALSE).


TRUE FALSE
1. If n = 20, the Central Limit Theorem applies.
2. If the confidence level is 90%, then α/2 is .05.
3. The area under the curve represents the
probability, proportion, or percentage.
4. When H0 is rejected, it means that a significant
difference does not exist.
5. A sample is small when n<30.



Lesson 6
Hypothesis Testing About a Population Proportion
In this lesson, the learners will understand the concept of hypothesis testing about
a population proportion. They will also learn the procedure and the steps for carrying
it out.

The ABM Coffee Company claims that 20% of the coffee drinkers in Pinalagad,
Malinta prefer their brand, the Ang Barako Mo coffee. To test the claim, a group of CNHS
Grade 11 students conducted a survey and this is what they found out. Out of the 500
randomly selected residents, 95 indicated that ABM coffee is the reason why they wake up
in the morning.
Can we believe the claim? Is the claim true?

If the favorable responses were 50 or 60, it would seem very reasonable to reject the
claim. On the other hand, if the number of people who say they drink ABM coffee every day
reaches 100 (20% of 500) or more in this survey, we can say that the data are consistent
with the claim of the ABM Coffee Company.

But the number that the researchers got was 95, a count very close to the 100 implied by
the claim. Do we reject the claim outright? Can we accept it? Keep in mind that a different
sample of 500 residents could give a different set of responses. In order
to make a correct decision, we need to set up the rejection region.

The REJECTION REGION is a range of values such that if the test statistic (Z, t, or p)
falls into that range, we decide to reject the H0 in favor of the H1. (Keller/Warrack,
p.324)

We will now illustrate how the rejection region is determined when α = .05.

For a Left-Tailed Test (One-Tailed Directional Test), the rejection region will have an area
equal to 0.05 at the extreme left side of the Normal Curve. See the figure below.

The line that divides the Rejection Region and the Non-rejection Region (some books call it
the Acceptance Region) corresponds to a Z value that we call the Critical Value. This critical
value may be found in our Z-table. Since the rejection region (red region) has an area equal to
0.05, we have to look at our Z-table for the corresponding Z value. Were you able to locate
it? It is exactly midway between -1.64 and -1.65. That means the Z value we are looking
for is -1.645.



REMEMBER: We use this diagram if our H1 uses the symbol <.
Using the same argument and also because our Normal Curve is symmetric, our
critical value for a Right-Tailed Test (One-Tailed Directional Test) when α = .05 must be
1.645.

REMEMBER: We use this diagram if our H1 uses the symbol >.


The diagram is a little bit different if we are using a Two-Tailed Test (Directional Test). The
=.05 will be divided equally into two at both tails of the Normal Curve.

REMEMBER: We use the Two-Tailed Test if we are using the symbol ≠ in the H1.

Do you understand? The preceding discussion is based only on =.05 level of significance.
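For other levels of significance, the critical values can be read from the Z-table in the same way. The lookup can also be sketched in Python with the standard library's `statistics.NormalDist`. This is only an illustrative sketch: the function name `critical_values` and the labels "left", "right", and "two" are assumptions made for this example, not part of the module.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Return the critical Z value(s) for a test at significance level alpha.

    tail is "left", "right", or "two" (labels chosen for this sketch).
    """
    std_normal = NormalDist()  # standard normal curve: mean 0, standard deviation 1
    if tail == "left":
        # Z value with an area of alpha to its left
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        # Z value with an area of alpha to its right
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        # alpha is split equally between the two tails
        z = std_normal.inv_cdf(1 - alpha / 2)
        return (-z, z)
    raise ValueError("tail must be 'left', 'right', or 'two'")


for tail in ("left", "right", "two"):
    print(tail, [round(z, 3) for z in critical_values(0.05, tail)])
# left [-1.645]
# right [1.645]
# two [-1.96, 1.96]
```

Calling the same function with a different `alpha` gives the critical values for that level of significance.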

In the beginning of this module, we talked about brands of coffee and
people's preferences. When data are nominal, the only thing we can do to describe
the population or sample is to count the number of occurrences for each category.
From the counts we determine the proportions. (Keller/Warrack, p. 373)

In the following discussion we will perform a hypothesis test by comparing a
sample proportion with a hypothesized proportion. The procedure here is
similar to what you did when you dealt with the sample mean and the population mean.

The sampling distribution of a proportion approximately follows a normal
distribution, so the standardized test statistic can be compared with values from the Z-table. (Levine, p.356)

The test statistic for a proportion is given as follows:

$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}}, \qquad \hat{p} = \frac{x}{n}$$

where
p̂ = sample proportion
x = number of successes
n = sample size
p0 = hypothesized population proportion
q0 = 1 – p0

NOTE: The sampling distribution of p̂ is approximately normal when np0 > 5 and nq0 > 5.


(Keller/Warrack, p. 374)

EXAMPLE 1: A local government official claims that at most 25% of all public
school students in the city own an electronic gadget that can be used for distance
learning, such as a cellphone, tablet, or laptop. To test the claim, a group of Grade 11
Statistics students conducted a survey and found that, out of 1,000 randomly
selected students, 275 indicated that they are ready for online learning. Can we
infer from the data that the official's claim is true? Use α = .05.
SOLUTION:
(1) H0: The proportion of students who own an electronic gadget is at most 25%,
p ≤ 0.25
H1: The proportion of students who own an electronic gadget is more than 25%,
p > 0.25
NOTE: We are using the symbol > in our H1 because we want to show that the
observed count of 275 is significantly greater than 250, which is 25% of 1,000.
(2) One-Tailed Test (Right-Tailed), α = .05
(3) Is np0 > 5? YES! Is nq0 > 5? YES! np0 = (1000)(0.25) = 250, nq0 = (1000)(0.75) = 750
(4) USE Z TEST

$$\hat{p} = \frac{x}{n} = \frac{275}{1000} = 0.275, \qquad q_0 = 1 - p_0 = 1 - 0.25 = 0.75$$

$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} = \frac{0.275 - 0.25}{\sqrt{(0.25)(0.75)/1000}} = 1.83 \quad \text{(computed value)}$$

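(5) Set up the Rejection Region and the Critical Value

[Figure: Normal Curve with the rejection region in the right tail, starting at the critical value 1.645. The computed value 1.83 falls inside the rejection region.]

(6) DECISION RULE: FOR RIGHT-TAILED TEST
If the computed value is greater than or equal to the critical value,
REJECT H0. Otherwise, do NOT reject the Null Hypothesis.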
(7) COMPARE the computed value with the critical value: Since 1.83 > 1.645,
DECISION: REJECT H0.
(8) CONCLUSION: There is enough evidence to reject the claim of the government
official. There is a Significant Difference between the sample proportion and the
hypothesized population proportion. It is safe to say that the proportion of students
who own an electronic gadget in this city is more than 25%.
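The steps of this example can be sketched in Python as a check on the arithmetic. This is a minimal sketch using only the standard library; the function name `one_proportion_z` is an assumption made for this example, and the critical value 1.645 is the Z-table value for a right-tailed test at α = .05.

```python
from math import sqrt


def one_proportion_z(x, n, p0):
    """Z test statistic for a sample proportion x/n against a hypothesized p0."""
    q0 = 1 - p0
    # Normal approximation condition from the NOTE: np0 > 5 and nq0 > 5
    if n * p0 <= 5 or n * q0 <= 5:
        raise ValueError("normal approximation not appropriate: need np0 > 5 and nq0 > 5")
    p_hat = x / n
    return (p_hat - p0) / sqrt(p0 * q0 / n)


z = one_proportion_z(275, 1000, 0.25)  # 275 successes out of 1,000, p0 = 0.25
critical_value = 1.645                 # right-tailed test, alpha = .05
decision = "REJECT H0" if z >= critical_value else "Do NOT reject H0"
print(round(z, 2), decision)  # 1.83 REJECT H0
```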

Let us now solve the problem presented in the beginning of this module: the ABM coffee claim.
SOLUTION
(1) H0: The proportion of residents in Pinalagad who prefer ABM coffee is 20%
or more, p ≥ 0.20
H1: The proportion of residents in Pinalagad who prefer ABM coffee is
less than 20%, p < 0.20
NOTE: We are using the symbol < in our H1 because we want to test whether the observed
count of 95 is significantly less than 100, which is 20% of 500.
(2) One-Tailed Test (Left-Tailed), α = .05
(3) Is np0 > 5? YES! Is nq0 > 5? YES! np0 = (500)(0.20) = 100, nq0 = (500)(0.80) = 400
(4) USE Z TEST

$$\hat{p} = \frac{x}{n} = \frac{95}{500} = 0.19, \qquad q_0 = 1 - p_0 = 1 - 0.20 = 0.80$$

$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} = \frac{0.19 - 0.20}{\sqrt{(0.20)(0.80)/500}} = -0.56 \quad \text{(computed value)}$$

(5) Set up the Rejection Region and the Critical Value

[Figure: Normal Curve with the rejection region in the left tail, ending at the critical value -1.645. The computed value -0.56 falls in the non-rejection region.]

(6) DECISION RULE: FOR LEFT-TAILED TEST
If the computed value is less than or equal to the critical value, REJECT H0.
Otherwise, do NOT reject the Null Hypothesis.
(7) COMPARE the computed value with the critical value: Since -0.56 > -1.645,
DECISION: Do NOT reject H0.
(8) CONCLUSION: There is NOT enough evidence to reject the claim of ABM Coffee
Company. There is NO Significant Difference between the sample proportion and the
hypothesized population proportion. The survey result is consistent with the claim that at least
20% of Pinalagad residents prefer ABM coffee.
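The opening discussion observed that 50 or 60 favorable responses would make rejection seem reasonable, while 95 does not. The cutoff between the two can be worked out from the decision rule. The sketch below is an illustration only: the function name `largest_rejecting_count` is an assumption made for this example. It searches for the largest number of favorable responses, out of 500, that still leads to rejecting H0 in this left-tailed test.

```python
from math import sqrt


def largest_rejecting_count(n, p0, critical_value):
    """Largest count x (out of n) whose Z value is at or below the critical value."""
    standard_error = sqrt(p0 * (1 - p0) / n)
    largest = None
    for x in range(n + 1):
        z = (x / n - p0) / standard_error
        if z <= critical_value:  # decision rule for a left-tailed test
            largest = x
    return largest


print(largest_rejecting_count(500, 0.20, -1.645))  # 85
```

Under these assumptions, a count of 85 or fewer would lead to rejecting the claim at α = .05, while the observed count of 95 would not.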

(a) Set up the Rejection Region and the Critical Values for Left-Tailed, Right-Tailed,
and Two-Tailed tests. Use α = .01.
(b) Set up the Rejection Region and the Critical Values for Left-Tailed, Right-Tailed,
and Two-Tailed tests. Use α = .10.

SOLVE THE FOLLOWING PROBLEM.

It has been claimed that 35% of students in a certain senior high school
dislike Mathematics. When a survey was conducted, 310 out of 800
students indicated a negative perception towards the dreaded subject. Test
the claim at α = .05. Use a One-Tailed Test.

PRETEST: 1) FALSE 2) TRUE 3) TRUE 4) FALSE 5) TRUE
ASSESSMENT a)
b)
Almeda, Capistrano, Ferry Sarte. (2010). Elementary Statistics. Quezon City:
University of the Philippines Press.

Keller, Warrack. (2003). Statistics For Management and Economics. California USA:
Thomson Learning, Inc.

Levine, et. al. (2005). Statistics: A Handbook for Managers. New Jersey: Prentice Hall.

Lesson 7
Constructing and Analyzing a
Scatter Diagram

TARGETS:

1. Illustrate the nature of bivariate data
2. Construct a Scatter Diagram
3. Describe the shape (FORM), trend (DIRECTION), and variation
(STRENGTH) based on the scatter diagram

Do this Pre-Test: Check the appropriate box (TRUE or FALSE).

1. Data which involve a single variable are called univariate data. ( ) TRUE ( ) FALSE
2. Brand of vaccine is an example of a categorical variable. ( ) TRUE ( ) FALSE
3. The dependent variable is usually positioned on the X axis. ( ) TRUE ( ) FALSE
4. When using the p-value approach, we reject the null hypothesis if the computed
p-value is greater than α, the level of significance. ( ) TRUE ( ) FALSE
5. The p-value is equal to the probability of committing a Type I error, or α. ( ) TRUE ( ) FALSE

Lesson 7: Constructing and Analyzing a Scatter Diagram
In this lesson, the learners will learn the nature of bivariate data. They will also
learn how to construct and analyze a scatter diagram.

In our previous lessons we studied graphical techniques for presenting
single sets of data, that is, data that involve only one variable. Those sets of data are called
univariate data.

In this module we will tackle what are known as bivariate data. Bivariate data
involve two variables at a time. We will also learn about the relationship between those
two variables. There are two types of bivariate data: categorical and
numerical. In this module we will focus only on numerical
bivariate data.

In order to see graphically the possible relationship between two variables,
the Scatter Diagram is used. The Scatter Diagram (or Scatter Plot) is a technique
used to describe the relationship between two numerical variables. (Keller/
Warrack, p.58)

When drawing the scatter diagram we need the raw data for our two
variables. Each pair of observations from the two variables is represented by a dot.
This is similar to what you did in your Gen Math class when you plotted points on
the XY plane.

In most cases one variable seems to depend on the other variable. To
cite some examples: an individual's income somewhat depends on the number
of years of education (the higher your educational attainment, the higher you
expect your salary to be); a company's sales depend on the amount spent on
advertising (this is the reason why companies spend a lot of money to advertise
their products); and a student's score in a major exam may depend on the number of
hours spent studying (we sincerely hope you prepare really well for your exams).

HOW TO DRAW A SCATTER DIAGRAM (UST Worktext, p. 149)

1. Draw the X and Y axes.
2. Position the independent variable on the X axis. Use an appropriate scale.
3. Position the dependent variable on the Y axis. Use also an appropriate scale.
4. Plot each ordered pair, (x,y) from the raw data.
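The four steps above can be imitated on a computer. The sketch below is a rough, text-only illustration written with the Python standard library: the function name `text_scatter`, the grid size, and the sample data are all assumptions made for this example. It scales X to the columns and Y to the rows of a character grid (steps 2 and 3) and marks each ordered pair with an asterisk (step 4).

```python
def text_scatter(xs, ys, width=40, height=12):
    """Return a coarse character-grid scatter plot of the (x, y) pairs."""

    def scale(value, low, high, cells):
        # Map value from the interval [low, high] onto a cell index 0 .. cells - 1.
        if high == low:
            return 0
        return round((value - low) / (high - low) * (cells - 1))

    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = scale(x, min(xs), max(xs), width)   # independent variable on the X axis
        row = scale(y, min(ys), max(ys), height)  # dependent variable on the Y axis
        grid[height - 1 - row][col] = "*"         # row 0 is printed at the top
    return "\n".join("".join(line) for line in grid)


# Hypothetical data: hours of practice (X) and quiz score (Y).
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 60, 58, 71, 75, 83]
print(text_scatter(hours, scores))
```

A graphing tool or graphing paper gives a much clearer picture; the grid here only shows how each ordered pair is turned into a position on the two axes.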

EXAMPLE
It is unfortunate that this generation has experienced a dreaded
pandemic. The following are the actual numbers of Covid 19 cases in the Philippines
starting January 30, 2020, when the first case was detected. The data cover a
period of 20 weeks from January 30 to June 30, 2020 (en.m.wikipedia.org). X is the
week number and Y is the number of Covid 19 cases.

X   1     2     3      4      5      6      7      8      9      10
Y   1     2     3      3      5      140    380    2084   4195   5878

X   11    12    13     14     15     16     17     18     19     20
Y   6981  8488  10610  12305  13777  18086  21340  24787  27799  37514

[Figure: Scatter Diagram. X axis: Week 1 to Week 20; Y axis: Number of Covid 19 Cases]

What do you notice about the general direction of the dots in our example?

Take a look at the Example again. You can clearly see that as time passes,
the number of Covid 19 cases also increases. As the independent variable increases,
the dependent variable also increases. When this happens we say that there is a
positive linear relationship between the two variables.

Sometimes the general direction of the dots is downward. As the independent
variable increases, the dependent variable decreases. When this happens we say that
there is a negative linear relationship between the two variables.

Now, let's try a crude technique for determining the strength of a linear
relationship. Get a pencil and a ruler, and try to draw a straight line through
the middle of the scatter diagram. Position the straight line in such a way that you
can "pierce" as many dots as possible. If most of the dots fall close to the line, we
say that there is a strong linear relationship. (Keller/Warrack, p. 61) If you are
having a hard time positioning your straight line and you can't "pierce" even a few
dots, we say that there is a weak linear relationship or none at all.
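The direction of the dots can also be checked with a simple calculation instead of by eye. The sketch below is not the ruler technique described above: it is an assumed numerical shortcut, with a hypothetical function name `direction`, that adds up the products of each point's deviations from the mean of X and the mean of Y. A positive total means the dots generally rise from left to right, and a negative total means they generally fall. It says nothing about strength.

```python
def direction(xs, ys):
    """Classify the general direction of the dots as positive, negative, or neither."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Points above both means or below both means add to the total;
    # points above one mean and below the other subtract from it.
    co_movement = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if co_movement > 0:
        return "positive"
    if co_movement < 0:
        return "negative"
    return "no linear trend"


# Covid 19 data from the Example: week number (X) and number of cases (Y).
weeks = list(range(1, 21))
cases = [1, 2, 3, 3, 5, 140, 380, 2084, 4195, 5878,
         6981, 8488, 10610, 12305, 13777, 18086, 21340, 24787, 27799, 37514]
print(direction(weeks, cases))  # positive
```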

The following are examples of scatter diagrams that show direction.

The following are examples of scatter diagrams that show strength.

In a certain senior high school in Valenzuela City where Gen Math (X) and Statistics (Y) are
offered as core subjects, a sample of 15 students was drawn. The midterm grades for both
subjects were recorded for each student. The data are listed below. (Keller/Warrack, p. 67)

X 65 60 93 68 74 81 60 85 88 75 63 79 80 60 72

Y 74 72 84 71 68 85 63 73 79 65 62 71 74 68 73

a. Draw a scatter diagram of the data.


b. What does the graph tell you about the relationship between the grades in Gen Math and
Statistics?

SOLUTION: (a)

[Figure: Scatter Diagram. X axis: GEN MATH; Y axis: STATISTICS]

(b) There is a positive linear relationship between the two core subjects, and
the relationship is of medium-strength.

In a certain senior high school in Valenzuela City, a random sample of 10
students in Statistics were asked about the number of hours they spent
studying (X) and the scores (Y) they received in the recently concluded Final
Exam. The data are given below. (Acelejado, et al., p. 185)

X 2 2 2 3 3 4 5 5 6 6

Y 57 63 70 72 69 75 73 84 82 89

a. Draw a scatter diagram for the given data.

b. Describe the relationship between the two variables with respect to direction and
strength.

1. Choose the scatterplot that best fits this description: "There is a strong,
positive, linear association between the two variables." For each choice, explain
why it is or is not a solution to the problem. (Khan Academy)

PRETEST: 1. True 2. True 3. False 4. False 5. False
ASSESSMENT: (a)
(b) There is a Medium Strength Positive linear relationship between the number of
hours spent in studying by the students and their scores in the Statistics Final Exam.
Almeda, Capistrano, Ferry Sarte. (2010). Elementary Statistics. Quezon City: University of the
Philippines Press.

Belecina, R., Baccay, E., & Mateo, E. (2016). Statistics and Probability. Manila,Philippines:
REX Book Store Inc.

Keller, Warrack. (2003). Statistics For Management and Economics. California USA:
Thomson Learning, Inc.

Levine, et. al. (2005). Statistics: A Handbook for Managers. New Jersey: Prentice Hall.

Lesson 8
Calculating the Pearson’s
Sample Correlation Coefficient

Targets:
1. calculate the Pearson's sample correlation coefficient;
2. solve problems involving correlation analysis.

Do this Pre-Test: Write True if the statement is correct, False otherwise. Write
your answer in your notebook.
____________1.) 1.001 can be a value of the correlation coefficient r.
____________2.) Negative relationship means direct relationship.
____________3.) The first step in computing Pearson's sample correlation coefficient r
is to get the sum of all entries in all columns.
____________4.) If the coefficient of correlation falls between 0.51 and 0.74, there is a
high negative correlation.
____________5.) In the Pearson r, n represents the sum of the x-values.

Lesson 8: Calculating the Pearson's Sample Correlation Coefficient

In this lesson, the learners will learn how to calculate the Pearson's Sample
Correlation Coefficient. They will also learn how to solve problems involving
correlation analysis.

In our previous lesson we studied how to construct and analyze a scatter diagram.
In this lesson we focus on using the Pearson's Correlation Coefficient to measure the
relationship between sets of sample data.

The correlation coefficient, r, between two sets of data is a measure of how well
they are related. It is a measure of the strength of the linear relationship between the
variables.

The strength of correlation is indicated by the coefficient of correlation. There are
several coefficients of correlation. The one most commonly used in linear
correlation is the Pearson product-moment correlation coefficient.
The formula of the Pearson correlation:

$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$$

where: n = number of paired observations
x = first variable
y = other variable

Note: the value of the correlation coefficient r is rounded to three
decimal places.

The type of relationship is represented by the correlation coefficient:

r = +1          perfect positive correlation
0 < r < +1      positive relationship
r = 0           no linear relationship
-1 < r < 0      negative relationship
r = -1          perfect negative correlation

The correlation coefficient is bounded by –1 and +1. The closer the coefficient is to
–1 or +1, the stronger the correlation.

The following example will guide you on how to compute the Pearson's sample
correlation coefficient r.

EXAMPLE 1: Find the coefficient of correlation and interpret the relationship
between the two sets of test scores in Algebra and Geometry of ten (10) students as
shown below:

Student   Algebra (x)   Geometry (y)   xy     x²     y²
1         18            19             342    324    361
2         15            17             255    225    289
3         13            14             182    169    196
4         16            15             240    256    225
5         13            14             182    169    196
6         10            11             110    100    121
7         13            12             156    169    144
8         15            14             210    225    196
9         10            13             130    100    169
10        14            17             238    196    289
∑         137           146            2045   1933   2186

Solution:

Since the table is already completed, substitute the required values into the formula:

$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} = \frac{10(2045) - (137)(146)}{\sqrt{\left[10(1933) - (137)^2\right]\left[10(2186) - (146)^2\right]}} = 0.811$$

Interpretation: Since r is close to +1, there is a strong positive correlation
between the students' test scores in Algebra and Geometry.
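The table-and-substitute procedure of this example can be sketched in Python. This is a minimal sketch using only the standard library; the function name `pearson_r` is an assumption made for this example. It builds the same five sums as the table (∑x, ∑y, ∑xy, ∑x², ∑y²) from the raw Algebra and Geometry scores and substitutes them into the formula.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson sample correlation coefficient computed from the column sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


algebra = [18, 15, 13, 16, 13, 10, 13, 15, 10, 14]
geometry = [19, 17, 14, 15, 14, 11, 12, 14, 13, 17]
print(round(pearson_r(algebra, geometry), 3))  # 0.811
```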

Example 2. Correlation Coefficient

Below are the data for six participants giving their number of years in college
(X) and their subsequent monthly income (Y). What best
describes the correlation between X and Y?

# of Years of College (x)   0    1    3    4    4    6
Income (y)                  15   15   20   25   30   35

Solution:
Step 1: Complete the table

# of Years of College (x)   Income (y)   x²    y²     xy
0                           15           0     225    0
1                           15           1     225    15
3                           20           9     400    60
4                           25           16    625    100
4                           30           16    900    120
6                           35           36    1225   210
∑x = 18                     ∑y = 140     ∑x² = 78   ∑y² = 3600   ∑xy = 505

Step 2: Substitute the values obtained through summations

$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} = \frac{6(505) - (18)(140)}{\sqrt{\left[6(78) - (18)^2\right]\left[6(3600) - (140)^2\right]}} = 0.950$$

Step 3: Interpret

Since r is close to +1, there is a strong positive correlation between the number of
years of college and the monthly income.
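As a check on the arithmetic, the substitution in Step 2 can be carried out in stages. The block below only spells out the intermediate values, using the sums from the completed table:

```latex
\begin{aligned}
r &= \frac{6(505) - (18)(140)}{\sqrt{\left[6(78) - (18)^2\right]\left[6(3600) - (140)^2\right]}} \\
  &= \frac{3030 - 2520}{\sqrt{(468 - 324)(21600 - 19600)}} \\
  &= \frac{510}{\sqrt{(144)(2000)}} \approx \frac{510}{536.656} \approx 0.950
\end{aligned}
```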

The most common measure of correlation is the Pearson Correlation, with the formula
below:

$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$$

where: n = number of paired observations
x = first variable
y = other variable

Activity 1. Complete the table below. Fill in the blanks in the formula to arrive at the
computed Pearson r. Then interpret the result.

X Y XY X2 Y2
15 5 225
23 3
11 8 64
9 10 100
15 8 64
20 20 400
∑X = ∑Y = ∑ 𝒙𝟐 = ∑ 𝒚𝟐 = ∑ 𝒙𝒚 =

Directions: Solve the following problem:

The time x, in years, that an employee has spent at a company and the employee's hourly
pay, y, for 5 employees are listed in the table below. Calculate and interpret the
correlation coefficient r.

x      y      x²     y²     xy
5      25     25     625    125
3      20     9      400    60
4      21     16     441    84
10     35     100    1225   350
15     38     225    1444   570
∑x = 37   ∑y = 139   ∑x² = (1)____   ∑y² = (2)____   ∑xy = (3)____

(4) Calculate the correlation coefficient r.

(5) Interpret the result.


Mary Ann comes from a family with a genetic history of diabetes. But she is in denial of
it and wants to prove that diabetes comes with age. She randomly selects
six relatives and records their ages and glucose levels to support her claim. Here
are the data from her record:

Subject     Age (x)   Glucose Level (y)
Gener       43        99
Rolinda     21        65
Arianne     25        79
Susana      42        75
Elizabeth   57        87
Alfred      59        81

(a) Test whether there is a relationship, using α = 0.05. (b) Describe the
relationship between the variables.

Pre-Test:
1. False
2. False
3. False
4. False
5. False

What I Can Do:
r = 0.04
Interpretation: Since r is very close to 0, there is no relationship
between the variables.

ASSESSMENT:
1. 375
2. 4135
3. 1189
4. r = 0.97
5. There is a strong positive correlation between the number of
years an employee has worked and the salary, since r is very close to 1.
Almeda, Capistrano, Ferry Sarte. (2010). Elementary Statistics. Quezon City: University of the
Philippines Press.

Belecina, R., Baccay, E., & Mateo, E. (2016). Statistics and Probability. Manila,Philippines:
REX Book Store Inc.

Bluman, A. (2018). Elementary Statistics: A Step by Step Approach 10th edition. McGraw
Hill. New York, USA.

Canlapan, R. (2016). Statistics and Probability. Makati, Philippines: Diwa Learning System
Inc.

Keller, Warrack. (2003). Statistics For Management and Economics. California USA:
Thomson Learning, Inc.

Levine, et. al. (2005). Statistics: A Handbook for Managers. New Jersey: Prentice Hall.

PERCDC Learnhub

Walpole, R., Myers, R., Myers, S., and Ye, K., (2012). Probability and Statistics for Engineers
and Scientists 9th edition. Pearson Education Inc. Massachusetts, USA.

Lesson 9-10
Dependent and Independent
Variables & Regression
Analyses

Targets:
1. calculate the slope and y-intercept of the regression line;
2. interpret the calculated slope and y-intercept of the regression line;
3. predict the value of the dependent variable given the value of the independent
variable; and
4. solve problems involving regression analysis.

Do this Pre-Test: Write True if the statement is correct, False otherwise. Write
your answer in your notebook.
____________1.) The y-intercept is the value of y when x = 0.
____________2.) Correlation is used to determine the existence, strength, and direction
of the relationship between bivariate data.
____________3.) In regression analysis, a response variable is also known as the
dependent variable.
____________4.) The equation for the straight line that is used to estimate y based on
x is referred to as a linear equation.
____________5.) The independent variable is also known as the output variable.

Lesson 9-10: Dependent and Independent Variables & Regression Analyses
In this lesson, the learners will learn how to calculate and interpret the
slope and y-intercept of the regression line. They will also learn how to predict the value
of the dependent variable given the value of the independent variable.

In the previous lesson we focused on using the Pearson's Correlation Coefficient to
measure the relationship between sets of sample data. This lesson, on the other hand,
is all about the regression line and predicting the value of the dependent
variable given the value of the independent variable.

Let us first review the Dependent and Independent Variables. The Dependent Variable
is the variable whose value is predicted from, or depends on, the predictor. It is
sometimes called the outcome or response variable. The examples below illustrate this:

• How well you perform in a race depends on your training.
• How much you weigh depends on your diet.
• How much you earn depends on the number of hours you work.

The Independent Variables, on the other hand, are the variables that are manipulated
or changed by researchers and whose effects are measured and compared. They are
also called predictors or inputs.
In the equation y = 3x, can you tell which is the dependent and which is the independent variable?
Yes, y is the dependent variable while x represents the independent variable.

The technique used to develop the equation for a straight line and make predictions
about the relationship of two variables is called Regression Analysis. The equation for
the straight line that is used to estimate y based on x is referred to as the regression
equation.
The equation of the regression line is written as y = a + bx, where a = y-intercept and
b = slope.
The formulas used to generate the Regression Equation (least squares method) are:

$$b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}, \qquad a = \frac{\sum y}{n} - b\left(\frac{\sum x}{n}\right)$$
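The two formulas can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function name `regression_line` is an assumption made for this example, and the data are the years-of-college (x) and monthly-income (y) values used in the previous lesson.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least squares line y = a + bx."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
    a = sum_y / n - b * (sum_x / n)                               # y-intercept
    return a, b


years = [0, 1, 3, 4, 4, 6]
income = [15, 15, 20, 25, 30, 35]
a, b = regression_line(years, income)
print(round(a, 3), round(b, 3))  # 12.708 3.542
print(round(a + b * 5, 2))       # predicted y when x = 5: 30.42
```

The slope b is computed first because the formula for the y-intercept a uses it.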

The following example will help you identify the independent and dependent
variables in some research questions:

Example 1:

1. To what extent does remote working increase job satisfaction?
   Independent Variable: remote working
   Dependent Variable: job satisfaction
2. What is the effect of intermittent fasting on blood sugar levels?
   Independent Variable: intermittent fasting
   Dependent Variable: blood sugar levels
3. Do stressful experiences increase the likelihood of headaches?
   Independent Variable: stressful experiences
   Dependent Variable: likelihood of headaches
4. How does time of day affect someone's alertness?
   Independent Variable: time of day
   Dependent Variable: someone's alertness
5. Is it true that women are more attracted to men without earrings than to men with earrings?
   Independent Variable: wearing of earrings among men
   Dependent Variable: women's attraction

Example 2. Regression Analysis: To further describe the relationship between the dependent and
independent variables, regression analysis can be used. See the example below:

The Chief of the Admission Office of a certain university wanted to determine if
the Exit Exam Rating (EER) is a good indicator of the Grade Point Average (GPA) of
the 16 academic scholars selected at random from the graduating class. Their GPA
and EER are shown below:

Student   1     2     3     4     5     6     7     8
GPA (y)   1.52  1.85  1.86  1.79  1.67  2.96  2.05  2.79
EER (x)   85    76    69    75    106   61    70    59

Student   9     10    11    12    13    14    15    16
GPA (y)   2.63  2.71  2.12  1.94  2.11  1.95  2.59  2.45
EER (x)   56    54    62    73    64    57    92    85

Describe the relationship of EER to GPA. What is a point estimate of a graduating
GPA when EER is 85?

Solution:

GPA (y)   EER (x)   xy        y²      x²
1.52      85        129.20    2.31    7225
1.85      76        140.60    3.42    5776
1.86      69        128.34    3.46    4761
1.79      75        134.25    3.20    5625
1.67      106       177.02    2.79    11236
2.96      61        180.56    8.76    3721
2.05      70        143.50    4.20    4900
2.79      59        164.61    7.78    3481
2.63      56        147.28    6.92    3136
2.71      54        146.34    7.34    2916
2.12      62        131.44    4.49    3844
1.94      73        141.62    3.76    5329
2.11      64        135.04    4.45    4096
1.95      57        111.15    3.80    3249
2.59      92        238.28    6.71    8464
2.45      85        208.25    6.00    7225
34.99     1144      2457.48   79.42   84984   (totals)

$$b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} = \frac{16(2457.48) - (1144)(34.99)}{16(84984) - (1144)^2} = -0.0139 \approx -0.014$$

$$a = \frac{\sum y}{n} - b\left(\frac{\sum x}{n}\right) = \frac{34.99}{16} - (-0.0139)\left(\frac{1144}{16}\right) = 3.181$$

The fitted equation describing the relationship between GPA and EER has been found to be:

$$y = 3.181 - 0.014x$$

The quantity 0.014 preceded by a negative sign indicates that as EER
increases, the GPA value decreases. On a grading scale where a lower GPA value means a
better grade, a good EER therefore goes with a good GPA.

The point estimate of the graduating GPA when EER is 85 is: y = 3.181 − 0.014(85) =
1.99

Regression analysis is a powerful statistical method that allows you to examine the
relationship between two or more variables of interest. While there are many types
of regression analysis, at their core they all examine the influence of one or more
independent variables on a dependent variable.

Solve this:

Connect the dependent and independent variables to form a correct sentence.
The first one is given as an example.

Example.
   Independent Variable: number of hours spent in practice
   Dependent Variable: winning the contest
   Correct Sentence: Winning the contest depends on the number of hours spent in practice.
1. Independent Variable: cubic meter used in a household
   Dependent Variable: water bill
   Correct Sentence: The water bill depends on the cubic meter used in a household.
2. Independent Variable: screen time spent daily
   Dependent Variable: eye health status of a person
   Correct Sentence: The eye health status of a person depends on the screen time spent daily.
3. Independent Variable: participation of learners
   Dependent Variable: academic performance
   Correct Sentence: The academic performance depends on the participation of learners.

Directions: Solve the following problem:

The time x, in years, that an employee has spent at a company and the employee's hourly
pay, y, for 5 employees are listed in the table below:

(1) What is the independent variable?
(2) What is the dependent variable?
(3) Describe the relationship between the variables.
(4) Find the equation of the regression line.

x      y      x²     y²     xy
5      25     25     625    125
3      20     9      400    60
4      21     16     441    84
10     35     100    1225   350
15     38     225    1444   570
37     139    375    4135   1189   (totals)

In the problem in the Assessment part, how much would an employee's hourly pay be if
he has already spent 20 years in the company? Round off your answer to the nearest whole
number.

Pre-Test:
1. True
2. True
3. True
4. False
5. False

What I Can Do:
1. The water bill depends on the cubic meter used in a household.
2. The eye health status of a person depends on the screen time spent daily.
3. The academic performance depends on the participation of learners.

ASSESSMENT:
1. time in years that an employee spent at a company
2. the employee's hourly pay
3. As the time in years that an employee spent at a company increases, the
employee's hourly pay also increases.
4. y = 1.58x + 16.11

ADDITIONAL ACTIVITIES
1. 48
Almeda, Capistrano, Ferry Sarte. (2010). Elementary Statistics. Quezon City: University of the
Philippines Press.

Belecina, R., Baccay, E., & Mateo, E. (2016). Statistics and Probability. Manila,Philippines:
REX Book Store Inc.

Bluman, A. (2018). Elementary Statistics: A Step by Step Approach 10th edition. McGraw
Hill. New York, USA.

Canlapan, R. (2016). Statistics and Probability. Makati, Philippines: Diwa Learning System
Inc.

Keller, Warrack. (2003). Statistics For Management and Economics. California USA:
Thomson Learning, Inc.

Levine, et. al. (2005). Statistics: A Handbook for Managers. New Jersey: Prentice Hall.

PERCDC Learnhub

Walpole, R., Myers, R., Myers, S., and Ye, K., (2012). Probability and Statistics for Engineers
and Scientists 9th edition. Pearson Education Inc. Massachusetts, USA.

For inquiries or feedback, please write or call:
Department of Education – SDO Valenzuela
Office Address: Pio Valenzuela Street, Marulas, Valenzuela City
Telefax: (02) 8292-4340
Email Address: sdovalenzuela@deped.gov.ph