
Statistics and

Probability
Quarter 4 Module 1:
Testing Hypothesis
2. What is the average daily usage of social media of her friends? Compare
it with the previous average usage.
3. Which of the two claims could probably be true? Why?
4. If Sofia computed the average daily internet usage of her friends to be
higher than the global survey, do you think it would be significantly
higher?
5. What is your idea of an average value being significantly higher than the
global average value?
6. What do you think is the difference between simple comparison of data
and hypothesis testing?

What Is It

Hypothesis testing is a statistical method applied in making decisions
using experimental data. Hypothesis testing is basically testing an
assumption that we make about a population.

A hypothesis is a proposed explanation, assertion, or assumption about a
population parameter or about the distribution of a random variable.

Here are the examples of questions you can answer with a hypothesis test:
Does the mean height of Grade 12 students differ from 66 inches?
Do male and female Grade 7 and Grade 12 students differ in height
on average?

higher than that of senior female students?

Key Terms and Concepts Used in Test Hypothesis

The Null and Alternative Hypothesis


The null hypothesis is an initial claim based on previous analyses,
which the researcher tries to disprove, reject, or nullify. It shows no
significant difference between two parameters. It is denoted by H₀.
The alternative hypothesis is contrary to the null hypothesis, which
shows that observations are the result of a real effect. It is denoted
by Hₐ.

Note: You can think of the null hypothesis as the current value of the
population parameter, which you hope to disprove in favor of your
alternative hypothesis.
Take a look at this example.
The school record claims that the mean score in Math of the incoming
Grade 11 students is 81. The teacher wishes to find out if the claim is true.
She tests if there is a significant difference between the batch mean score
and the mean score of students in her class.

Solution:
Let μ be the population mean score and x̄ be the mean score of
students in her class.
You may select any of the following statements as your null and
alternative hypothesis as shown in Option 1 and Option 2.

Option 1:
H₀: The mean score of the incoming Grade 11 students is 81 or μ = 81.
Hₐ: The mean score of the incoming Grade 11 students is not 81 or μ ≠ 81.

Option 2:
H₀: The mean score of the incoming Grade 11 students has no significant
difference with the mean score of her students or μ = x̄.
Hₐ: The mean score of the incoming Grade 11 students has a significant
difference with the mean score of her students or μ ≠ x̄.

formulate two hypotheses about the global average usage (μ) and the average
usage of her friends (x̄) on the blanks provided below.
H₀: _____________________________________________
Hₐ: _____________________________________________
You can verify your answer to your teacher and start working on the next
activity.
Here is another key term you should know!

Level of Significance
The level of significance, denoted by alpha or α, refers to the degree of
significance at which we accept or reject the null hypothesis.
100% accuracy is not possible in accepting or rejecting a hypothesis.
α is also the probability of making the wrong decision
when the null hypothesis is true.

Do you know that the most common levels of significance used are 1%, 5%,
or 10%?
Some statistics books can provide us table of values for these levels of
significance.

Take a look at this example.
Maria uses 5% level of significance in proving that there is no
significant change in the average number of enrollees in the 10 sections for
the last two years. It means that the chance that the null hypothesis (H₀)
would be rejected when it is true is 5%.

[Figure: normal curve over z-values from -3 to 3, with the rejection region shaded.]

α is actually the area under the normal curve within the rejection region.
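
Because the level of significance is an area under the normal curve, it can be turned into a boundary value on the horizontal axis. The sketch below is only an illustration: it assumes a standard normal (z) test statistic, and the function name `critical_z` and its `tails` argument are made up for this example. It uses `NormalDist` from Python's standard `statistics` module to find the z-value that cuts off the chosen area.

```python
from statistics import NormalDist


def critical_z(alpha, tails="two"):
    """Return the positive critical z-value for a level of significance.

    Assumption for this sketch: the test statistic follows the
    standard normal distribution.
    """
    if tails == "two":
        # A two-tailed test splits alpha equally between both tails.
        return NormalDist().inv_cdf(1 - alpha / 2)
    # A one-tailed test places all of alpha in a single tail.
    return NormalDist().inv_cdf(1 - alpha)


# The three most common levels of significance named in the text.
for alpha in (0.01, 0.05, 0.10):
    one = round(critical_z(alpha, "one"), 3)
    two = round(critical_z(alpha, "two"), 3)
    print(f"alpha = {alpha}: one-tailed z = {one}, two-tailed z = {two}")
```

A smaller level of significance pushes the boundary farther from the center, so the rejection region shrinks.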

If Sofia used a 0.10 level of significance, what are the chances that she
would have a wrong conclusion if the two values have no significant
difference?

Here is another key term you should know!


Two-Tailed Test vs One-Tailed Test
When the alternative hypothesis is two-sided like Hₐ: μ ≠ μ₀, it is
called a two-tailed test.
When the given statistical hypothesis assumes a less than or greater
than value, it is called a one-tailed test.

Here are some examples.


The school registrar believes that the average number of enrollees this
school year is not the same as the previous school year.
In the above situation,
let μ₀ be the average number of enrollees last year.
H₀: μ = μ₀
Hₐ: μ ≠ μ₀

However, if the school registrar believes that the average number of enrollees
this school year is less than the previous school year, then you will have:
H₀: μ = μ₀
Hₐ: μ < μ₀

Use the left-tailed test when Hₐ contains the symbol <.

On the other hand, if the school registrar believes that the average number
of enrollees this school year is greater than the previous school year, then
you will have:
H₀: μ = μ₀
Hₐ: μ > μ₀

Use the right-tailed test when Hₐ contains the symbol >.

Now back to the two claims of Sofia, what do you think should be the type of
test in her following claims?
Claim A: The average daily usage of social media of her friends is
the same as the global average usage.

Claim B: The average daily usage of social media of her friends is
higher than the global average usage.

Here is the other concept!


Illustration of the Rejection Region
The rejection region (or critical region) is the set of all values of the test
statistic that causes us to reject the null hypothesis.
The non-rejection region (or acceptance region) is the set of all values of
the test statistic that causes us to fail to reject the null hypothesis.
The critical value is a point (boundary) on the test distribution that is
compared to the test statistic to determine if the null hypothesis would
be rejected.

[Figure: normal curve over values from -3 to 3 showing the non-rejection region, the shaded rejection region, and the critical value at the boundary between them.]

Illustrative Example 1:

online usage of her friends is the same as the global usage (H₀).

She computed for the t-value using the formula t = (x̄ − μ) / (s / √n),
where μ = 142, x̄ = 152, s = 19.855, and n = 10.

This t-test formula was discussed in the last chapter.
Use a scientific calculator to verify the computed t-value.
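
The computed t-value can also be checked with a few lines of code instead of a calculator. This is a minimal sketch that assumes the usual one-sample formula t = (x̄ − μ) / (s / √n), and that 152 is the mean of Sofia's friends while 142 is the global mean (this assignment reproduces the value used in this example). The function name `one_sample_t` is illustrative only.

```python
import math


def one_sample_t(sample_mean, pop_mean, sample_sd, n):
    """Compute t = (x-bar - mu) / (s / sqrt(n)) for one sample."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error


# Values from Illustrative Example 1 (role of each mean is assumed as stated above).
t_value = one_sample_t(sample_mean=152, pop_mean=142, sample_sd=19.855, n=10)
print(round(t_value, 3))  # 1.593
```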

From the table of t-values, determine the critical value. Use df = n-1 = 9,
one-tailed test at 5% level of significance.
The critical t-value is 1.833.
How did we get that value?
Look at this illustration!

The table of t-values can be found at the last part of this module.

Now, you can sketch a t distribution curve and label the rejection
area (shaded part), the non-rejection region, the critical value, and the
computed t-value. This is how your t distribution curve should look!

[Figure: t-distribution curve over values from -3 to 3 showing the non-rejection region and the shaded rejection region; the computed value 1.593 lies to the left of the critical value 1.833.]

As you can see from your previous illustration, the computed t-value
of 1.593 is at the left of the critical value 1.833. So, in
which region do you think the computed value falls?

The computed value is less than the critical value.

H₀: The average online usage of her friends is the same as the global usage.
Hₐ: The average online usage of her friends is higher than the global usage.

The computed t-value is at the non-rejection region.
We fail to reject the null hypothesis, H₀.

Illustrative Example 2:
A medical trial is conducted to test whether or not a certain drug reduces
cholesterol level. Upon trial, the computed z-value of 2.715 lies in the
rejection area.

[Figure: normal curve over z-values from -3 to 3, with the computed z-value of 2.715 marked in the rejection region.]

The computed value is greater than the critical value.

H₀: The certain drug is effective in reducing cholesterol level by 60%.
Hₐ: The certain drug is not effective in reducing cholesterol level by 60%.

The computed z-value is at the rejection region.
We reject the null hypothesis, H₀, in favour of Hₐ.

Illustrative Example 3:
Sketch the rejection region of the test hypothesis with critical values of
±1.753 and determine if the computed t-value of 1.52 lies in that region.

Solution:

Draw a t-distribution curve. Since there are two critical values, it is a
two-tailed test. Locate the critical values and shade the rejection regions.

Now, locate the computed t-value of 1.52. You can clearly see that it is not
in the rejection region as shown in the following figure. The computed t-value
is in the non-rejection region. Therefore, we fail to reject the null hypothesis,
H₀.

[Figure: t-distribution curve with rejection regions shaded beyond the critical values -1.753 and 1.753; the computed value 1.52 lies between them.]
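
The comparison of a computed value with a critical value follows the same rule in every example above, so it can be written once. The sketch below is an illustration only: the function name `decide` and the labels "right", "left", and "two" are assumptions made for this example, and the critical value is passed as a positive number.

```python
def decide(test_stat, critical_value, tail):
    """Return the decision on H0 given the tail of the test.

    critical_value is the positive boundary read from the z- or t-table.
    """
    if tail == "right":
        reject = test_stat > critical_value
    elif tail == "left":
        reject = test_stat < -critical_value
    elif tail == "two":
        # Reject when the statistic is extreme in either direction.
        reject = abs(test_stat) > critical_value
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return "reject H0" if reject else "fail to reject H0"


# Illustrative Example 1: right-tailed, t = 1.593, critical value 1.833.
print(decide(1.593, 1.833, "right"))
# Illustrative Example 3: two-tailed, t = 1.52, critical values of ±1.753.
print(decide(1.52, 1.753, "two"))
```

Both calls give "fail to reject H0", which agrees with the sketches of the rejection regions.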

Type I and Type II Errors

Rejecting the null hypothesis when it is true is called a Type I
error with probability denoted by alpha (α). In hypothesis testing,
the normal curve that shows the critical region is called the alpha
region.
Accepting the null hypothesis when it is false is called a Type II
error with probability denoted by beta (β). In hypothesis testing,
the normal curve that shows the acceptance region is called the
beta region.
For a fixed sample size, the larger the value of alpha, the smaller is the value of beta.

[Figure: curve for the region where H₀ is true, with the region of Type I error marked.]
α = P[Type I error] = P[H₀ is true, Reject H₀]

[Figure: curve for the region where H₀ is false, with the region of Type II error marked.]
β = P[Type II error] = P[H₀ is false, Fail to reject H₀]

To summarize the difference between the Type I and Type II errors, take a
look at the table below.

Null Hypothesis   Fail to Reject H₀                    Reject H₀
True              Correct Decision                     Type I Error
                  - Failed to reject H₀ when           - Rejected H₀ when
                    it is true                           it is true
False             Type II Error                        Correct Decision
                  - Failed to reject H₀ when           - Rejected H₀ when
                    it is false                          it is false

Now, complete the statements that follow.

Type I Error, Type II Error, or a Correct Decision.

1. If H₀ is true and she fails to reject it, then she commits a ____________________.
2. If H₀ is true and she rejects it, then she commits a _____________________.
3. If H₀ is false and she fails to reject it, then she commits a __________________.
4. If H₀ is false and she rejects it, then she commits a _____________________.
Your answers should be: 1) Correct Decision, 2) Type I Error, 3) Type II
Error, and 4) Correct Decision.
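
The four cases can be summarized as a small lookup on two yes-or-no facts: whether the null hypothesis is actually true, and whether it was rejected. The function below is a sketch for checking your answers; the name `classify_decision` and its arguments are illustrative, not part of the module.

```python
def classify_decision(h0_is_true, h0_rejected):
    """Name the outcome of a hypothesis test decision."""
    if h0_is_true and h0_rejected:
        return "Type I Error"       # rejected a true null hypothesis
    if not h0_is_true and not h0_rejected:
        return "Type II Error"      # failed to reject a false null hypothesis
    return "Correct Decision"


# The four statements, in the same order as the exercise.
for truth, rejected in [(True, False), (True, True), (False, False), (False, True)]:
    print(truth, rejected, "->", classify_decision(truth, rejected))
```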

Illustrative Example:

Bryan is starting his own food cart business and he is choosing cities where he
will run his business. He wants to survey residents and test at 5% level of
significance whether or not the demand is high enough to support his business
before he applies for the necessary permits to operate in his selected city. He
will only choose a city if there is strong evidence that the demand there is
high enough. We can state the null hypothesis for his test as:
H₀: The demand is high enough.

What would be the consequence of a Type I error in this setting?


_____ He doesn't choose a city where demand is actually high enough.
_____ He chooses a city where demand is actually high enough.
_____ He chooses a city where demand isn't actually high enough.

The Type I error is the first statement because he rejected the true
null hypothesis.

What would be the consequence of a Type II error in this setting?


_____ He doesn't choose a city where demand is actually high enough.
_____ He chooses a city where demand is actually high enough.
_____ He chooses a city where demand isn't actually high enough.

The Type II error is the third statement because he failed to
reject the false null hypothesis.

What is the probability of Type I error?


_____ 0.10 _____ 0.25 _____ 0.05 _____ 0.01
The probability of Type I error is 0.05 because it is the level of
significance used.
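
One way to see why the probability of a Type I error equals the level of significance is to repeat a test many times in a situation where the null hypothesis is really true and count how often it gets rejected. The simulation below is a rough sketch under stated assumptions: it uses a two-tailed z-test with a known population standard deviation, and the values `mu0=100`, `sigma=15`, `n=30`, and the number of trials are hypothetical choices for illustration only.

```python
import math
import random
from statistics import NormalDist, fmean


def type_i_error_rate(alpha=0.05, mu0=100.0, sigma=15.0, n=30, trials=2000, seed=1):
    """Estimate how often a true H0: mu = mu0 is rejected at level alpha."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 is exactly true.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(rate)  # expected to be close to 0.05, not exactly equal
```

The estimated rate varies a little from run to run with a different seed, but it stays near the chosen level of significance.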

15. If the computed z-value is 1.915 and the critical value is 1.812, which of
the following statements could be true?
A. It lies in the rejection region, H₀ must be rejected.
B. It lies in the rejection region, hence we fail to reject H₀.
C. It lies in the non-rejection region, H₀ must be rejected.
D. It lies in the non-rejection region, hence we fail to reject H₀.

Additional Activities

A medical trial is conducted to test whether or not a certain drug can treat a
certain allergy. Upon trial, the t-value is computed as 1.311. Sketch and
complete the table below to discuss the findings of the medical trial.

[Figure: blank curve over values from -3 to 3, for sketching the rejection region.]

H₀: ___________
Hₐ: ___________
The computed t-value is at the ___________ region.
Decision: ___________

Justify your decision by writing an explanation in 5-10 sentences.

Statistics and
Probability
Quarter 4 Module 2:
Identifying Parameters for
Testing in Given Real-Life
Problems
Activity 2: Grouping!
Directions: Group the following symbols into two. Place the first group
inside Box A and the second group in Box B.

A B

Guide Questions:
1. What are the symbols that you placed in Box A? Box B?
2. How did you categorize each symbol or notation?
3. What mathematical principle did you consider in answering the
activity?
4. Which symbols seemed to be familiar to you and which are not?

What Is It

Parameters in statistics are an important component of any statistical
analysis. In simple words, a parameter is any numerical quantity that
characterizes a given population or some of its aspects. This means the
parameter tells us something about the whole population.
However, the numerical measure that is calculated from the sample is
called a statistic. A statistic is a known number and a variable that depends on
the portion of the population.
A parameter denotes the true value that would be obtained if a census
rather than a sample was undertaken.
Examples of parameters are the measures of central tendency. These
tell us how the data behave on an average basis. For
example, mean, median, and mode are measures of central tendency that
give us an idea about where the data concentrate. Meanwhile, standard
deviation tells us how the data are spread from the central tendency, i.e.
whether the distribution is wide or narrow. Such parameters are often very
useful in analysis.
In the normal distribution, there are two parameters that can
characterize a distribution - the mean and standard deviation. By varying
these two parameters, you can get different kinds of normal distribution.

Different symbols are used to denote parameters. Based on Activity 2,
symbols are grouped as indicated in the table below.
Measure              Statistic      Parameter
Mean                 x̄ (x-bar)      μ (mu)
Variance             s²             σ² (sigma squared)
Standard deviation   s              σ (sigma)
Proportion           p̂ (p hat)      p

Mean and standard deviation


are two common parameters.

Identifying Parameter to be Tested


Illustrative Examples:
1. The average height of adult Filipinos 20 years and older is 163 cm for
males.
Parameter: the average height of adult Filipinos 20 years and older
In hypothesis testing, the parameter will be translated into symbols such
as μ = 163 where μ is the symbol for mean/average and 163 is the value
that pertains to the average height.

2. A Grade 11 researcher reported that the average allowance of Senior High
School students is 100. A sample of 40 students has a mean allowance of
120. At test, it was claimed that the students had an allowance
of 100. The standard deviation of the population is 50.
Parameters: the average allowance of Senior High School students is
100 or μ = 100

In this claim, there are different parameters used but the parameter
to be tested in this hypothesis would be the average allowance of Senior
High School students since it relates to the population, not the sample.
A statistical hypothesis is a conjecture about a population parameter. That is
why you will look for the population mean, population standard deviation, or
population proportion but not sample mean.

3. According to a survey, 63% of the parents are willing to spend extra


health and education matters.
Parameter: the percentage/proportion of parents willing to spend

To identify the parameters to be tested:


1. Just look for mean/average, standard deviation, variance,
and proportion of population.
2. Determine the value that pertains to the given parameter,
then translate them in symbols for hypothesis testing.

Activity 3. Translate It!
Directions: Determine the notation of the given parameter, inequality
symbol, or value of the parameter.

Notation Symbols Value


Parameter (

1. Average salary of Polytechnic University of


the Philippines (PUP) graduates is at most
324,000. _____ _____
2. The standard deviation of adults riding a
bus is 1.5. _____ _____
3. Filipino employers offer a mean of 15 days
of paid vacation for sick leave. _____ _____ 15
4. Survival rate of breast cancer in the
Philippines is below 50%. _____ _____ .50
5. Mean number of vehicles in households is
at most 1.9 personal vehicles. _____ _____

Activity 4-5. What Is Your Parameter?

Directions: Determine the parameter to be tested in each situation by


writing your answer on a separate sheet of paper. Translate it into symbols.

1. The television habits of children were observed and found out that the
standard deviation is 12.4 hours per week.
2. A newspaper article stated that students in the country take an average
of 4 years to finish their undergraduate degrees. Suppose that you
believe the mean time is longer, so you conduct a survey of 49 students.
The result is a sample mean of 5 with a sample standard deviation
of 1.2.
3. According to DOLE, registered nurses in government earned an average
monthly salary of 9,700. For that same year, a survey was conducted on
41 registered nurses to determine if the mean salary is higher than the
previous survey. The sample average was 10,000 with a sample
standard deviation of 2,500.
4. Records of the Department of Health (DOH) revealed that 14.7% of the
country's Filipino smokers have maintained their habit of smoking.

Answer Key

What I Know
1. B   2. D   3. D   4. C   5. D   6. C   7. D   8. B
9. D   10. C  11. D  12. D  13. A  14. C  15. D

What's In
Activity 1
1. A   2. B   3. B   4. D   5. D

What's New
Activity 2
A or B and vice versa

Activity 3
1.   2.   3.   4.   5.

Activity 4
1. habits hours per week is 12.4
2. an average of 4 years to finish undergraduate degrees
3. an average monthly salary of registered government nurse is 9,700
4. smokers maintain their smoking habits

Activity 5
1.   2.   3.   4. p = 0.147

Additional Activities
Activity 6
1. average life of 2,600 hours (µ)
2. average price of Honda Vios is at least 662,000.00 (µ)

Assessment
1. C   2. A   3. C   4. A   5. B   6. C   7. B   8. B
9. D   10. B  11. C  12. C  13. D  14. B  15. A
that random sample of 15 similar vehicles has the mean price of
640,000.00 and standard deviation of 24,000.00. Is there enough
evidence to reject the dealer's claim at
Statistics
Quarter 4 Module 3:
Formulating Appropriate Null
and Alternative Hypotheses on a
Population Mean
Guide Questions:

1. What have you observed between the two figures?
2. Do you think the fertilizer has an effect on the plant?
3. What do you think are the variables shown in the pictures?
4. Is there any relationship among the variables in Figure 1 and Figure
2?
5. How do these pictures relate to hypothesis?

What Is It

A statistical hypothesis is a statement about a parameter and deals with
evaluating the value of the parameter.

In statistical hypothesis testing, there are always two hypotheses: the null
and alternative hypotheses. Below is a comparison between the two.

Null Hypothesis (H₀)
- It states that there is no difference between population
  parameters (such as mean, standard deviation, and so on)
  and the hypothesized value.
- There is no observed effect.
- The null hypothesis is often an initial claim that is based on
  previous analyses or specialized knowledge.

Alternative Hypothesis (Hₐ)
- It states that the population parameter has some statistical
  significance (smaller, greater, or different than) with the
  hypothesized value.
- There is an observed effect.
- The alternative hypothesis is what you might believe to be
  true or hope to prove true.

To state the null and alternative hypotheses correctly:


1. Identify the parameter in a given problem.
2. Identify the claim to be tested that may show up in null or alternative
hypothesis.
3. Translate the claim into mathematical symbols/notations.
4. Formulate first the null hypothesis (H₀) then alternative hypothesis (Hₐ)
based on the three different ways in writing hypothesis as illustrated
below:

Hypothesis-Testing Common Phrases

=                              ≠
is equal to                    is not equal to
is the same as                 is not the same as
is exactly the same as         is different from
has not changed from           has changed from

>                              <
is increased                   is decreased
is greater than                is less than
is higher than                 is lower than
is above                       is below
is bigger than                 is smaller than
is longer than                 is decreased or reduced from
is more than

≥                              ≤
is at least                    is at most
is not less than               is not more than
is greater than or equal to    is less than or equal to
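
Translating a worded claim into a symbol is a lookup in the table of common phrases. The sketch below shows the idea with a small dictionary. It covers only some of the phrases, and the names `PHRASE_TO_SYMBOL` and `symbol_for` are invented for this illustration.

```python
# A partial phrase-to-symbol lookup based on the common phrases table.
PHRASE_TO_SYMBOL = {
    "is equal to": "=", "is the same as": "=", "has not changed from": "=",
    "is not equal to": "≠", "is different from": "≠", "has changed from": "≠",
    "is greater than": ">", "is higher than": ">", "is above": ">", "is more than": ">",
    "is less than": "<", "is lower than": "<", "is below": "<",
    "is at least": "≥", "is not less than": "≥",
    "is at most": "≤", "is not more than": "≤",
}


def symbol_for(claim):
    """Return the symbol for the first known phrase found in the claim."""
    text = claim.lower()
    # Try longer phrases first so the most specific wording is matched.
    for phrase in sorted(PHRASE_TO_SYMBOL, key=len, reverse=True):
        if phrase in text:
            return PHRASE_TO_SYMBOL[phrase]
    return None  # no known phrase in the claim


print(symbol_for("The number of students is less than 20"))
print(symbol_for("The average net worth is at least 730,000"))
```

A real claim may use wording that is not in the table, so the result should always be checked against the meaning of the sentence.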

Let us take an example from your previous activity.


that the number of students (n) who have parents
with a house of their own is less than 20.

The claim used the phrase less than which, as seen in the table above,
corresponds to the symbol <. Therefore, the answer is n < 20.

Note:
H₀ always has an = symbol in it. Hₐ never has an = symbol in it. The choice of
symbol depends on the wording of the hypothesis test. However, be aware
that many researchers use = (equal sign) in the null hypothesis, even with
> or < as the symbol in the alternative hypothesis. Notice also that the
notation of alternative hypothesis complements the null hypothesis.

Illustrative Examples:

1. The average height of all Grade 11 students in Senior High School is
169 cm. Is this claim true?

Solution: First, identify the parameter which is the mean height of all
Grade 11 students. Since it is a population mean, use the notation μ.
The claim in this example is that the average height is 169 cm which
translates to μ = 169 and is considered as null hypothesis. To formulate
the alternative hypothesis, write the complement/opposite of the null
hypothesis which is the average height is not equal to 169 cm.

H₀: The average height of all Grade 11 students is 169 cm. / μ = 169 (claim)
Hₐ: The average height of all Grade 11 students is not 169 cm. / μ ≠ 169

2. The average price per square meter of residential lot in an exclusive


subdivision is above 15,000
claim.

Solution: In this hypothesis, the parameter is the average. Therefore,
you will use the symbol µ. The claim is above 15,000 can be written as
µ > 15,000 and greater than falls at alternative hypothesis,
Hₐ: µ > 15,000. Since you have already formulated the alternative, the null
hypothesis will be µ ≤ 15,000 as complement of >. You can also write
your null hypothesis as µ = 15,000.
H₀: µ ≤ 15,000 or µ = 15,000
Hₐ: µ > 15,000 (claim)

3. Holistic Fitness Center claims that their members reduced an


average of 13 pounds after joining the center. An independent
agency wanted to check this claim took sample of 40 members and
found that they reduced an average of 12 pounds with the standard
deviation of 4 pounds. Determine the null and alternative
hypothesis.

Solution: In this example, the parameter to be tested is the average and
the claim is reduced an average of 13 pounds. The claim that pertains to the
parameter has the notation of <. Therefore, the claim is found at the
alternative hypothesis and can be written as Hₐ: µ < 13. The null
hypothesis would be µ ≥ 13 or µ = 13.
H₀: µ ≥ 13 or µ = 13
Hₐ: µ < 13 (claim)

4. The treasurer of a municipality claims that the average net worth


of families in the municipality is at least 730,000. A random
sample of 50 families from this area produced a mean net worth of
860,000 with standard deviation of 65,000. What are the null and
alternative hypotheses?

Solution: In this example, the parameter is the average and the claim
is that the average is at least 730,000. The phrase at least has the
notation of ≥ which means that the claim is at the null hypothesis. In
the alternative hypothesis, you will use (<) as its complement.
Therefore:
H₀: µ ≥ 730,000 or µ = 730,000 (claim)
Hₐ: µ < 730,000

5.
time is at most 240 minutes per day, on average. Another survey
was conducted to find whether the claim is true. The group took a
random sample of 30 students and found a mean study time of 300
minutes with standard deviation of 90 minutes. What are the null
and alternative hypotheses?

Solution: The parameter used in this example is average (µ) and the
claim is that average is at most 240 minutes. The phrase at most has
the notation of ≤ which means that claim is at the null hypothesis.
The null hypothesis would be µ ≤ 240. To formulate the alternative,
use the notation > as the complement of ≤. Therefore, alternative
hypothesis is µ > 240.
H₀: µ ≤ 240 or µ = 240 (claim)
Hₐ: µ > 240

One-Tailed and Two-Tailed Test

The alternative hypothesis can take another form depending on the
value of the parameter. The parameter may increase, decrease, or change
from the null value. When an alternative hypothesis predicts not only the
difference of sample mean from the population mean but also how it would
be different in a specific direction (lower or higher), the test is called
a directional or one-tailed test because the rejection region is entirely
within one tail of the distribution.

On the other hand, some hypotheses predict only that one value will
be different from another, without additionally predicting which will be
higher. The test of such a hypothesis is nondirectional or two-
tailed because an extreme test statistic in either tail of the distribution
(positive or negative) will lead to the rejection of the null hypothesis of no
difference.
One-Tailed                               Two-Tailed
Alternative hypothesis contains          Alternative hypothesis contains the
the greater than (>) or less than        not equal to (≠) symbol.
(<) symbols.
It is directional (either right-tailed   It has no direction.
or left-tailed).

The table below shows the null and alternative hypotheses stated
together with the directional test.

                         Two-Tailed Test   Right-Tailed Test    Left-Tailed Test
Null Hypothesis          μ = μ₀            μ = μ₀ or μ ≤ μ₀     μ = μ₀ or μ ≥ μ₀
Alternative Hypothesis   μ ≠ μ₀            μ > μ₀               μ < μ₀
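
The pairing of each alternative symbol with its complement and its tail can be sketched in code. This is only an illustration: the function name `formulate` is hypothetical, and it writes the null hypothesis with the complement symbol (≤ or ≥), although writing the null with = is also accepted, as noted earlier.

```python
def formulate(alt_symbol, value):
    """Build H0, Ha, and the tail of the test from the symbol used in Ha."""
    # Each alternative symbol maps to (complement for H0, type of test).
    table = {
        "≠": ("=", "two-tailed"),
        ">": ("≤", "right-tailed"),
        "<": ("≥", "left-tailed"),
    }
    if alt_symbol not in table:
        raise ValueError("Ha must contain one of: ≠, >, <")
    null_symbol, tail = table[alt_symbol]
    return f"H0: μ {null_symbol} {value}", f"Ha: μ {alt_symbol} {value}", tail


# A claim that a mean is more than 650 places > in the alternative.
print(formulate(">", 650))
```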
Illustrative Examples:
Determine the hypotheses and the hypothesis test.
1. Teacher A wants to know if mathematical games affect the
performance of the students in learning Mathematics. A class of 45
students was used in the study. The mean score was 90 and the
standard deviation was 3. A previous study revealed that and
the standard deviation .
The parameter is the population mean . You can write the
hypotheses into symbols: and . The phrase affects
performance has no clue of the direction of the study, so it implies
either increase or decrease in performance. This tells you that the test is
two-tailed test.
and (two-tailed test)

2. A piggery owner believes that using organic feeds on his pigs will
yield greater income. His average income from the previous year
was 120,000. State the hypothesis and identify the directional
test.
In this example, the null hypothesis is H₀: μ = 120,000. You may
notice that the hypothesis used the phrase greater income that is
associated with greater than. Therefore, Hₐ: μ > 120,000. This
hypothesis uses an inequality symbol so it is a one-tailed test and it uses
greater than which specifically calls for the right-tailed test.
H₀: μ = 120,000 and Hₐ: μ > 120,000 (right-tailed test)

3. The average waiting time of all customers in a restaurant before
being served is less than 20 minutes. Determine the hypotheses and
the directional test.
You may notice that the hypothesis used the phrase less than
which denotes that the alternative hypothesis is Hₐ: μ < 20. This
hypothesis uses an inequality symbol so it is a one-tailed test and it uses
less than which specifically calls for the left-tailed test. In this
example, the null hypothesis is H₀: μ = 20.
H₀: μ = 20 and Hₐ: μ < 20 (left-tailed test)

Activity 4. One-Tailed or Two-Tailed!

Directions: Identify whether the given hypothesis is one-tailed or two-tailed.


Write ONE if it is one-tailed and TWO if it is two-tailed test.

1. A used car dealer says that the mean price of a car in the Philippines is at
least 350,000.

2. PAG-ASA reported that the mean annual rainfall in the Philippines is at


most 4,064mm.

3. According to the survey, the average cost of visiting doctors is 500.

4. The mean age of students in a university in the previous years was 27


years old. An instructor thinks the mean age for students is older than
27. She randomly surveys 56 students and finds that the sample mean is
29 with a standard deviation of 2.

5. The mean work week for engineers in a new company is believed to be


about 40 hours. A newly hired engineer hopes that it is shorter. She asks
10 engineering friends for the lengths of their mean work weeks. Based
on the results, should she count on the mean work week to be shorter
than 40 hours?

Activity 5. Formu-Tail

Directions: Formulate the null and alternative hypotheses. Identify whether


it is one-tailed or two-tailed. If the hypothesis is one tailed, identify its
direction whether it is left or right. Write your answer on a separate sheet of
paper.

1. The average salary of an accountant is 24,620 per month in the


Philippines.
________________ __________________ _______-tailed test

2. A normal smartphone battery manufacturer claims that the mean life of


a certain type of battery is more than 650 hours.
________________ __________________ _______-tailed test

3. According to an international shipping company, a package from US can


arrive to Manila in an average of less than 8 business days.
________________ _________________ _______- tailed test

4. The average price of a certain type of car is greater than 600,000.
_________________ _________________ _______- tailed test

5. A research organization reports that the mean of adult grocery shoppers


who never buy the store brand in Metro Manila is 300.
_________________ _________________ _______- tailed test

6. A study claims that the mean survival period for certain cancer patients
treated immediately with chemotherapy and radiation is 24 months.
_________________ _________________ _______- tailed test

7. The average pre-school cost for tuition fees last year was 15,500. The
following year, 20 schools had a mean of 13,100 and standard
deviation of 2,500.
_________________ _________________ _______- tailed test

8. A magazine reports that a typical shopper spends less than 10 minutes


in line waiting to check out. A sample of 30 shoppers at the DM
Supermarket showed mean of 9.5 minutes with standard deviation of 2.7
minutes.
________________ __________________ _______-tailed test

9. The principal of Mabundok High School claims that the students in his

IQ scores have a mean score of 113. The mean population IQ is 100 with
a standard deviation of 15. Is there evidence to support his claim?
________________ __________________ _______-tailed test

10. The owner of BYD manufacturer claims that their batteries last an
average of at most 350 hours under normal use. A researcher randomly
selected 20 batteries from the production line and tested them. The
tested batteries had a mean life span of 270 hours with a standard
deviation of 50 hours.
________________ __________________ _______-tailed test

Answer Key

What I Know
1. C   2. A   3. C   4. C   5. B   6. B   7. C   8. C
9. D   10. A  11. D  12. A  13. C  14. D  15. D

What's In
Activity 1
1. A   2. B   3. D   4. C   5. A

Activity 2
1. Mean
2.
3. Average
4. mean weight time is at most 8.7
5.

Activity 3
1.   2.   3.   4.   5.

Activity 4
1. ONE   2. ONE   3. TWO   4. ONE   5. ONE

Activity 5
1. two-tailed     2. right-tailed   3. left-tailed   4. right-tailed
5. two-tailed     6. two-tailed     7. two-tailed    8. left-tailed
9. right-tailed   10. right-tailed

Additional Activities
1. b. Right-tailed test
2. b. Left-tailed test

Assessment
1. B   2. D   3. B   4. D   5. A   6. C   7. D   8. A
9. B   10. A  11. C  12. D  13. A  14. C  15. A
Statistics and
Probability
Quarter 4 Module 4:
Identifying Appropriate Test
Statistics Involving Population
Mean
What Is It

Before we move forward to the different test statistics, it is important to
define the following terms:
A population includes all of the elements from a set of data.
A sample consists of one or more observations drawn from the population.
Sample mean (x̄) is the mean of sample values collected.
Population mean (µ) is the mean of all the values in the population.
If the sample is randomly selected and sample size is large, then the
sample mean would be a good estimate of the population mean.
Population standard deviation (σ) is a parameter. It is a measure of
variability with a fixed value, calculated from every individual in the
population.
Sample standard deviation (s) is a statistic. This
measure of variability is calculated from only some of the individuals in a
population.
Population variance (σ²), in the same sense, indicates how the
population data points are spread out. It is the average of the squared
distances from each data point in the population to the population mean.
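
The difference between the population measures and the sample measures can be checked numerically. The sketch below uses Python's standard `statistics` module; the five allowance values and the three-student sample are hypothetical numbers chosen only for illustration and do not come from the module.

```python
import statistics

# Hypothetical allowances of a small population of five students.
population = [80, 90, 100, 110, 120]

mu = statistics.fmean(population)          # population mean (µ)
sigma2 = statistics.pvariance(population)  # population variance (σ²): divides by N
sigma = statistics.pstdev(population)      # population standard deviation (σ)

# Hypothetical sample of three students drawn from that population.
sample = [80, 100, 120]
x_bar = statistics.fmean(sample)           # sample mean (x̄)
s = statistics.stdev(sample)               # sample standard deviation (s): divides by n - 1

print("mu =", mu, "sigma^2 =", sigma2, "sigma =", round(sigma, 4))
print("x_bar =", x_bar, "s =", round(s, 4))
```

Note that `pvariance` and `pstdev` treat the data as the whole population, while `stdev` treats the data as a sample, which is why the two standard deviations differ.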

Since we have already defined the important terms for identifying the test
statistic in hypothesis testing, let us now identify those concepts when
given a problem. Let us use the example in Activity 2.

Example:

A Grade 11 researcher reported that the average allowance of
Senior High School students was 100. A sample of 40 students has a
mean allowance of 120. In the test, it was claimed that the
students had an allowance of more than 100. The standard deviation of the
population is 50.

µ = 100 the average allowance of the population (Senior High School students)
n = 40 the number of students taken from all Senior High School students
x̄ = 120 the mean allowance of the sample
σ = 50 the standard deviation of the population

Now you already know how to get the data needed in choosing test
statistics. This time, you will determine what test statistic is appropriate in
computing test value in the hypothesis testing.

A test statistic is a random variable that is calculated from sample
data and used in a hypothesis test. You can use the test statistic to determine
whether to reject or fail to reject the null hypothesis. The test statistic compares
your data with what is expected under the null hypothesis.
To identify the test statistic, you must consider whether the
population standard deviation/variance is known or unknown. If the
population standard deviation is known, then the standardized sample mean has a normal
distribution. Use z-test. If the population standard deviation is unknown,
then the standardized sample mean has a t-distribution. Use t-test, with the sample
standard deviation in place of the population standard deviation.
z-test
In a z-test, the sample is assumed to be normally distributed. A z-score
is calculated with population parameters such as the population mean (µ)
and the population standard deviation (σ). It is used to validate a
hypothesis that the sample drawn belongs to the same population. When the
variance is known and either the distribution is normal or sample size is
large, use a z-test statistic.
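
As a minimal sketch, the standard one-sample z statistic, z = (x̄ − µ0) / (σ / √n), can be computed for the allowance example given earlier (µ0 = 100, x̄ = 120, σ = 50, n = 40). The function name `z_statistic` and its parameter names are illustrative choices, not part of the module.

```python
import math

def z_statistic(x_bar, mu_0, sigma, n):
    """Return z = (x_bar - mu_0) / (sigma / sqrt(n))."""
    standard_error = sigma / math.sqrt(n)  # spread of the sample mean
    return (x_bar - mu_0) / standard_error

# Allowance example: claimed mean 100, sample mean 120, sigma 50, n = 40.
z = z_statistic(x_bar=120, mu_0=100, sigma=50, n=40)
print(round(z, 2))  # 2.53
```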
t-test
Like a z-test, a t-test also assumes a normal distribution of the
sample. A t-test is used when the population variance or standard deviation
is not known. When the variance is unknown and the sample size is less
than 30, use a t-test statistic, assuming that the population is normal or
approximately normal.

Central Limit Theorem

By the Central Limit Theorem, if the population is normally distributed
or the sample size is large, and the true population mean is µ = µ0 (the
value stated in the null hypothesis), then z has
a standard normal distribution.
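
The effect described by the Central Limit Theorem can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is taken to be exponential with mean 1 and standard deviation 1 (a strongly skewed, non-normal choice made for illustration), the sample size is 50, and 5,000 samples are drawn. If the theorem applies, the z values should have a mean near 0 and a standard deviation near 1.

```python
import random
import statistics

random.seed(0)   # fixed seed so the run is repeatable

POP_MEAN = 1.0   # an exponential population with rate 1 has mean 1
POP_SD = 1.0     # and standard deviation 1, but it is strongly skewed
n = 50           # sample size, treated here as "large"
trials = 5000    # number of repeated samples

z_values = []
for _ in range(trials):
    sample = [random.expovariate(1.0) for _ in range(n)]
    x_bar = statistics.fmean(sample)
    # Standardize each sample mean with the known population values.
    z_values.append((x_bar - POP_MEAN) / (POP_SD / n ** 0.5))

print("mean of z:", round(statistics.fmean(z_values), 2))
print("sd of z:  ", round(statistics.stdev(z_values), 2))
```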
When the population standard deviation is not known, we may still use
the z-score by replacing the population standard deviation with its estimate,
the sample standard deviation s. Since the sample is large, the resulting test
statistic still has a distribution that is approximately standard normal.
Historically, this was very useful, as most statisticians in the past did not
have access to t-table values for a very large number of degrees of
freedom. With modern computers, using a t-test with a very large
sample size is not a problem at all.
However, since you will be using a t-table with only a limited number of
degrees of freedom, you will use the z-test when the sample size is large even
though the population standard deviation is unknown.
When sample sizes are small, the Central Limit Theorem does not
apply. You must then impose stricter assumptions on the population to give
statistical validity to the test procedure. One common assumption is that
the population from which the sample is taken has a normal probability
distribution to begin with. Under such circumstances, if the population
standard deviation is known, then the test statistic still has the
standard normal distribution.

The table shows what test statistic is appropriate when:

| Population Variance Is Known | Population Variance Is Unknown | Central Limit Theorem (CLT) |
|---|---|---|
| Population is normally distributed. | Population is normal or nearly normally distributed. | Population may not be normally distributed. |
| | | n ≥ 30 or considered sufficiently large |
| Population standard deviation (σ) is known. | Sample standard deviation (s) is known. | Variance is known/unknown. |
| z-test | t-test | Population standard deviation (σ) is unknown. Use z-test by replacing population standard deviation (σ) by sample standard deviation in the formula. |
[Figure: Identifying Appropriate Test Statistic. Outcomes shown: z-test, z-test, z-test, t-test]

Illustrative Examples:
1. A manufacturer claimed that the average life of batteries used in their
electronic games is 150 hours. It is known that the standard deviation of
this type of battery is 20 hours. A consumer wished to test the
manufacturer's claim using a sample of 100 of
the batteries. It was found that the sample mean is equal to 144 hours.
Here, the sample size (n) is 100 (large) and the population
standard deviation (20 hours) is known, so the appropriate test
statistic to be used is z-test.

2. An English teacher wanted to test whether the mean reading speed of
students is 550 words per minute. A sample of 12 students revealed a
sample mean of 540 words per minute with a standard deviation of 5
words per minute. At 0.05 significance level, is the reading speed
different from 550 words per minute?

The sample size (n) is 12, which is less than 30, and only the sample
standard deviation (5 words per minute) is given. Therefore, the
appropriate test is t-test.
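
For reference, the standard one-sample t statistic, t = (x̄ − µ0) / (s / √n) with n − 1 degrees of freedom, can be sketched for this reading-speed example (µ0 = 550, x̄ = 540, s = 5, n = 12). The function name `t_statistic` is an illustrative choice; the module itself only asks for the type of test here.

```python
import math

def t_statistic(x_bar, mu_0, s, n):
    """Return (t, df) where t = (x_bar - mu_0) / (s / sqrt(n)) and df = n - 1."""
    t = (x_bar - mu_0) / (s / math.sqrt(n))
    return t, n - 1

# Reading-speed example: claimed mean 550, sample mean 540, s = 5, n = 12.
t, df = t_statistic(x_bar=540, mu_0=550, s=5, n=12)
print(round(t, 2), df)  # -6.93 11
```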

3. A study was conducted to look at the average time students exercise. A
researcher claimed that on average, students exercise less than 15 hours
per month. In a random sample of size n = 115, it was found that the mean
time students exercise is hours per month with s = 6.43 hours
per month.
Since n=115, the sample size is large and variance is unknown.
Hence, z-test is the appropriate tool. (Central Limit Theorem)
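
The decision rule applied in the three illustrative examples can be summarized in a short sketch. The function name `choose_test` and the cutoff parameter `large_n=30` are assumptions made for illustration; the cutoff follows the module's statement that a sample size of less than 30 is small, and the rule presumes the population is at least approximately normal whenever the sample is small.

```python
def choose_test(n, sigma_known, large_n=30):
    """Pick the test statistic using the module's rule of thumb.

    Assumes an approximately normal population when n is small.
    """
    if sigma_known:
        return "z-test"   # population standard deviation is known
    if n >= large_n:
        return "z-test"   # sigma unknown but sample is large (CLT)
    return "t-test"       # sigma unknown and sample is small

# The three illustrative examples: (sample size, is sigma known?)
examples = {
    "1. batteries": (100, True),
    "2. reading speed": (12, False),
    "3. exercise time": (115, False),
}
for name, (n, known) in examples.items():
    print(name, "->", choose_test(n, known))
```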

Note:
The illustrative examples above used standard deviations instead of
variances. Variance is the square of the standard deviation and conversely,
the standard deviation is the square root of the variance. Hence, if the
standard deviation is known in the problem, then basically, variance is also
known.

Activity 3: Mark My Numbers!

Directions: In each problem, underline the population standard
deviation/sample standard deviation and circle the sample size.

1. A sample of 160 people has a mean age of 27 with a population standard

.
2. An electric lamp manufacturer is testing a new production method that
will be considered acceptable if the lamps produced by this method result
in a normal population with an average life of 1,300 hours and a
standard deviation equal to 120. A sample of 100 lamps produced by this
method has an average life of 1,250 hours.

3. The cholesterol levels in a certain population have a mean of 210 and
standard deviation 21. The cholesterol levels for a random sample of 9
individuals are measured and the sample mean x̄ is determined. What is
the z-score for a sample mean x̄ = 180?
4. Mabunga Elementary School has 1,000 students. The principal of the
school thinks that the average IQ of students at Mabunga is at least 110.
To prove her point, she administers an IQ test to 20 randomly selected
students. Among the sampled students, the average IQ is 108 with a
standard deviation of 10.
5. A new energy-efficient lawn mower engine was developed by a well-known
inventor. He claims that the engine will run continuously for 5 hours on
a single gallon of regular gasoline. From his stock of 2,000 engines, the
inventor selects a simple random sample of 50 engines for testing. The
engines run for an average of 295 minutes with a standard deviation of
20 minutes.
Activity 4. Check It Out!

Directions: Read and analyze each problem. On the table below, put a
check on the columns of the criteria that correspond to the given problem.

1. It is claimed that the average age of working students in a certain
university is 35. A researcher selected a random sample of 25 working
students. The computation of their ages resulted in an average of 32
years with a standard deviation of 10 years.
2. A manufacturer of tires claims that their tire has a mean life of at least
50,000 km. A random sample of 28 of these tires is tested and the
sample mean is 33,000 km. Assume that the population standard
deviation is 3,000 km and the lives of the tires are approximately
normally distributed.
3. On average, a drink vending machine is adjusted so it dispenses
240 ml of fruit juice. However, the machine tends to go out of adjustment
and periodic checks are made to determine the average amount of fruit
juice being dispensed. A sample of 28 plastic cup drinks with a standard
deviation of 15 ml is taken to test the adjustment of the machine.
4. Uber company claims that the mean time to rent a car on their app is 60
seconds with a standard deviation of 30 seconds. A random sample of 36
customers attempted to rent a car on the app. The mean time of renting
was 75 seconds. Is this enough evidence to contradict the company's
claim?
5. The waiting time to be seated at the restaurant has population standard
deviation of 10 minutes. An expensive restaurant claims that the average
waiting time for dinner is approximately 1 hour, but we suspect that this
claim is inflated to make the restaurant appear more exclusive and
successful. A random sample of 30 customers yielded a sample average
waiting time of 50 minutes.

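| Problem | σ is known. | σ is unknown. | z-test | t-test |
|---|---|---|---|---|
| 1. | | | | |
| 2. | | | | |
| 3. | | | | |
| 4. | | | | |
| 5. | | | | |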

Activity 5. Which is Which?

Directions: Identify the appropriate test statistic to be used in each
problem. Write z-test or t-test on a separate sheet of paper.
___________1. A sample of n=25 is selected from a normal population,
and s .

___________2. Based on the report of the school nurse, the average height of
Grade 11 students has increased. Five years ago, the average height of
Grade 11 students was 170cm with standard deviation of 38cm. She took a
random sample of 150 students and derived the average height of 165cm.

___________3. Knowing from a previous study that the average of athletes is
80, an athletic adviser asked how his soccer players are doing academically
as compared to other student athletes. After an initiative to help improve the
average of student athletes, the adviser randomly selected 15 soccer players
and found 85 as the average with a standard deviation of 1.25.

___________4. The CEO of a battery manufacturing company claimed that
their batteries would last an average of 280 hours under normal use. A
researcher randomly selected 20 batteries from the production line and
tested them. The tested batteries had a mean life span of 250 hours with a
standard deviation of 40 hours. Do we have enough evidence to suggest that
the claim of an average of 280 hours is false?

___________5. It was known that the number of tickets purchased by
students at the ticket window for the volleyball match of two popular
universities followed a distribution that has mean of 500 and standard
deviation of 8.9. Suppose that a few hours before the start of one of these
matches, there are 100 eager students standing in line to purchase tickets.
If there are 250 tickets remaining, what is the probability that all 100
students will be able to purchase the tickets they want?

Answer Key

Activity 4
| Problem | σ is known | σ is unknown | z-test | t-test |
|---|---|---|---|---|
| 1. | | | | |
| 2. | | | | |
| 3. | | | | |
| 4. | | | | |
| 5. | | | | |

Activity 5
1. t-test
2. z-test
3. t-test
4. t-test
5. z-test

Additional Activities
Activity 6
1. a. df=11
   b. t-test
2. a. left-tailed
   b. z-test

Assessment
1. B   2. D   3. B   4. C   5. A
6. C   7. B   8. A   9. A   10. B
11. B  12. B  13. A  14. A  15. A
