Lesson 12 - Hypothesis Testing and Interpretation

Learning Competencies

At the end of this lesson, you are expected to:

 Explain the concept of hypothesis testing;
 Describe the steps in hypothesis testing;
 Infer meaning from the results of the hypothesis testing; and
 Draw conclusions from the results of the hypothesis testing.

A hypothesis functions as an answer to the research question and guides data collection and
interpretation. A hypothesis enables researchers not only to discover a relationship between
variables, but also to predict a relationship based on theoretical guidelines and/or empirical
evidence. To give meaning to the hypothesis, you need to test it using the gathered data. In
this lesson, you are going to learn how to test the hypothesis using statistical software (e.g.,
Excel or SPSS) or manually. As you go through the rudiments of hypothesis testing, you will
appreciate the logic of inferential analysis, which is essential to the correct interpretation of the
results. Samples are provided to help you pick up the process easily.

Hypothesis Testing and Interpretation

Inferential questions in your research demand the use of inferential statistics to analyze the
gathered data. Remember that in the previous lesson, descriptive statistics describes data (for
example, a chart or graph) and inferential statistics allows you to make predictions (“inferences”)
from that data. With inferential statistics, you take data from samples and make generalizations
about a population.

For example, you might stand at the entrance of Novo Mall and ask a sample of 100 people if they
like Chinese products. You could make a bar chart of the yes or no answers (that would be descriptive
statistics), or you could use your research data (and inferential statistics) to reason that around
75-80% of the population like Chinese products, although you interviewed only 100 people.

There are two main areas of inferential statistics:

1. Estimating parameters. This means taking a statistic from your sample data (for example the
sample mean) and using it to say something about a population parameter (i.e. the
population mean).

[Figure: a population (N = 5) with members 1, 2, 3, 4, 5, and the possible samples of size n = 2:
1,2; 1,3; 1,4; 1,5; 2,3; 2,4; 2,5; 3,4; 3,5; and 4,5.]

As shown above, there is a population consisting of 5 members. If you draw random samples
of 2 members each, by the rule of combinations there are 10 possible pairs. If we get the
mean of each pair, we have a sampling distribution of 10 sample means, each computed from
two members, shown below:

Pairs of Samples (n = 2)    Sample Mean
1, 2                        1.5
1, 3                        2.0
1, 4                        2.5
1, 5                        3.0
2, 3                        2.5
2, 4                        3.0
2, 5                        3.5
3, 4                        3.5
3, 5                        4.0
4, 5                        4.5

Population mean (N = 5): 15/5 = 3

Using the logic above, the sample means range from 1.5 to 4.5, while the population mean is 3.
Each of the 10 sample means is an estimate of the population mean. Some estimates (e.g., 3.0)
hit it exactly and others (e.g., 1.5 or 4.5) miss it by as much as 1.5, but the average of the
10 sample means is exactly 3.
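
The listing of pairs and sample means above can be reproduced with a short script. This is only a
sketch using the 5-member population from the example; the variable names are chosen here for
illustration.

```python
from itertools import combinations

population = [1, 2, 3, 4, 5]                      # N = 5
samples = list(combinations(population, 2))       # every possible sample of size n = 2
sample_means = [sum(pair) / len(pair) for pair in samples]

for pair, m in zip(samples, sample_means):
    print(pair, m)

population_mean = sum(population) / len(population)
mean_of_sample_means = sum(sample_means) / len(sample_means)

print("number of possible samples:", len(samples))
print("population mean:", population_mean)
print("mean of the sample means:", mean_of_sample_means)
```

Any single sample mean may miss the population mean, but the sample means taken together are
centered on it.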

2. Hypothesis tests. This is where you use sample data to answer research questions. For
example, you might be interested in knowing if exposure to mass media will increase the
productive vocabulary of high school students, or if the number of hours spent
reviewing the lessons increases the retention of the key concepts learned by the learners.

To explain the steps of hypothesis testing, let us compare the mean reading
comprehension scores of two groups (Male and Female) of students.

N = 150; sample: Male = 13, Female = 17

Using this research scenario, let us learn how to conduct hypothesis testing. Testing
the hypothesis involves several steps:

1. Specify the null hypothesis.

As you learned in the previous lessons, the null hypothesis states the proposition that
two or more groups do not differ or variables X and Y are not related.

In the situation given, the null hypothesis could be: There is no difference in the
reading comprehension scores of the students grouped by sex.

Ho (symbol for the null hypothesis, read as “H null”)

x̅ or “x-bar” (the group sample mean)

So the null hypothesis is expressed mathematically or symbolically as:

Ho: x̅_Male - x̅_Female = 0

This is read as “Mean Male minus Mean Female is equal to zero.”

2. Specify the alternative hypothesis.

The corresponding alternative hypothesis (which we presume to be true if the null
hypothesis is NOT true) is: There is a difference in the reading comprehension
scores of the students grouped by sex.

Ha: x̅_Male - x̅_Female ≠ 0

This is read as “Mean Male minus Mean Female is not equal to zero.” The
difference is significantly different from zero.

3. Identify the appropriate statistical tool needed to test the hypothesis.

Inferential statistical tools, as you learned in Lesson 11, vary according to the
hypothesis set, the variables used, and the scales on which the variables are measured.
The matrix below will guide you on the appropriate statistical tool to use.

Hypothesis                        Variable                       Measurement  Number of Groups          Statistical Tool
                                                                 Scale        Involved

There is no difference in the     Reading scores                 Interval     Two (Male vs Female)      t-test for independent groups
reading scores of students        Sex                            Nominal
grouped by sex.

There is no difference in the     Pretest scores                 Interval     One (Grade 9 students)    t-test for dependent or paired groups
pretest and post-test scores of   Post-test scores               Interval
the Grade 9 students.

There is no difference in the     Communication anxiety scores   Interval     More than two groups      One-way ANOVA
communication anxiety scores of   College affiliation            Nominal      (Education, Agriculture,
the third-year students grouped                                               CICS, CHIM)
by college.

There is no relationship between  Number of hours spent          Ratio        One group                 Pearson product-moment correlation
the number of hours spent         reviewing lessons                           (Grade 7 students)
reviewing lessons and the number  Number of concepts understood  Interval
of concepts understood by the
Grade 7 students.

There is no relationship between  Mobile phone ownership         Nominal      One group                 Chi-square analysis
the ownership of a mobile phone   Place of residence             Nominal      (Grade 8 students)
and the place of residence of
the Grade 8 students.
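
The matrix can also be read as a simple decision rule. The sketch below covers only the five
situations listed in the matrix; the function name `suggest_tool` and its parameters are
hypothetical, made up here for illustration, and real studies may call for tools not listed.

```python
def suggest_tool(question, n_groups=1, paired=False, scale="interval"):
    """Return the tool that the matrix pairs with a research design.

    question: "difference" or "relationship"
    n_groups: number of independent groups compared (for "difference")
    paired:   True when one group is measured twice (pretest and post-test)
    scale:    scale of the variables being related ("interval", "ratio", "nominal")
    """
    if question == "difference":
        if paired:
            return "t-test for dependent or paired groups"
        if n_groups == 2:
            return "t-test for independent groups"
        if n_groups > 2:
            return "One-way ANOVA"
    elif question == "relationship":
        if scale == "nominal":
            return "Chi-square analysis"
        return "Pearson product-moment correlation"
    raise ValueError("design not covered by the matrix")


print(suggest_tool("difference", n_groups=2))          # reading scores by sex
print(suggest_tool("difference", paired=True))         # pretest vs post-test
print(suggest_tool("difference", n_groups=4))          # anxiety scores by college
print(suggest_tool("relationship", scale="ratio"))     # hours reviewing vs concepts
print(suggest_tool("relationship", scale="nominal"))   # phone ownership vs residence
```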

4. Set the significance level

The significance level, denoted by the Greek letter alpha (α), is generally set at 0.05 in
educational research. This means that you accept a 5% chance of rejecting the null hypothesis
(and accepting your alternative hypothesis) when the null hypothesis is actually true. In other
words, if the null hypothesis is TRUE and you repeat the same study 100 times, you would expect
to reject it wrongly about 5 times and to retain it correctly about 95 times.

The smaller the significance level, the greater the burden of proof needed to
reject the null hypothesis, or in other words, to support the alternative hypothesis. If
you set your alpha or significance level at 0.01, you accept only a 1% chance of this error:
if the null hypothesis is TRUE and the study is repeated 100 times, you would expect to reject
it wrongly only about 1 time. It means therefore that if you set a smaller significance level,
the chance of rejecting a true null hypothesis is very small or slim. Thus, if you reject the null
hypothesis at the .05 level, you are making your decision at the 95% level of confidence that
there is a SIGNIFICANT difference or relationship. If the null hypothesis is rejected at the
0.01 level, your level of confidence in making the decision is 99%.
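
What the significance level controls can be seen in a small simulation. The sketch below is
not part of the lesson's procedure: it assumes two groups of 30 drawn from the SAME
hypothetical population (mean 50, standard deviation 10), so the null hypothesis is true by
construction, and it uses a simple z-test with a known standard deviation instead of the t-test
to keep the code short. It counts how often the null hypothesis is wrongly rejected.

```python
import random
from statistics import NormalDist


def false_rejection_rate(alpha, trials=2000, n=30, seed=12):
    """Share of repeated studies that reject a TRUE null hypothesis."""
    rng = random.Random(seed)                       # fixed seed so the run is repeatable
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    std_error = 10 * (2 / n) ** 0.5                 # known SD = 10 in both groups (assumed)
    rejections = 0
    for _ in range(trials):
        group_a = [rng.gauss(50, 10) for _ in range(n)]
        group_b = [rng.gauss(50, 10) for _ in range(n)]
        z = (sum(group_a) / n - sum(group_b) / n) / std_error
        if abs(z) > z_critical:
            rejections += 1
    return rejections / trials


rate_05 = false_rejection_rate(0.05)
rate_01 = false_rejection_rate(0.01)
print("share of false rejections at alpha = 0.05:", rate_05)
print("share of false rejections at alpha = 0.01:", rate_01)
```

The first share should come out close to 0.05 and the second close to 0.01, which shows that a
smaller alpha makes a false rejection rarer.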

5. Calculate the test statistic and the corresponding p-value.

A test statistic is a random variable that is calculated from sample data and used
in a hypothesis test. You can use test statistics to determine whether to reject the null
hypothesis. The test statistic compares your data with what is expected under the null
hypothesis.

Statistical Tool                             Statistic Computed   Name

t-test (independent or dependent groups)     t                    t-computed value
One-way ANOVA                                F                    F-value or F-ratio
Pearson product-moment correlation           r                    Correlation coefficient
Chi-square                                   χ²                   Chi-square value

The p-value describes the probability of obtaining, by chance alone, a sample statistic as
extreme as or more extreme than the one you observed, if your null hypothesis is true. This
p-value is determined based on the result of your test statistic. Your conclusions about the
hypothesis are based on your p-value and your significance level. Examples:

P-value = 0.01. This will happen 1 in 100 times by pure chance if your null
hypothesis is true, so it is not likely to happen strictly by chance. If your test statistic
has a p-value at or below the significance level you set (e.g., .000 to .05 when α = 0.05),
you have to REJECT the null hypothesis and ACCEPT the alternative hypothesis. Then, you
claim in your study that there is a significant difference or relationship, whichever the
hypothesis states.

P-value = 0.75. This will happen 75 in 100 times by pure chance if your null
hypothesis is true. It is very likely to occur strictly by chance. If your test statistic has
a p-value greater than 0.05, you have to ACCEPT your null hypothesis (more precisely, you
fail to reject it).

NOTE: There are two ways to check whether the computed test statistic leads you to accept
or reject the null hypothesis.

a. Using the p-value associated with the test statistic

The p-values are indicated in all statistical software computations, such as Excel
Data Analysis, SPSS, etc. If the significance level you set in Step 4 is 0.05 and the
p-value of the computed test statistic is lower, you need to
REJECT the null hypothesis. For example, p-values of .04, .039, .025, or
.019 mean that the groups are significantly
different, so REJECT the null hypothesis. In correlation analysis, it would mean
X is significantly related to Y. Any p-value greater than 0.05 (e.g., .051, .06,
.073, etc.) indicates that you have to ACCEPT the null hypothesis, “There is no
significant difference between or among groups” or “There is no significant
relationship between X and Y.”

b. Using the computed test statistic and the tabular/critical value

If you are MANUALLY computing the test statistic, you can make a
statistical decision by comparing the computed value of the statistic with the
tabular value (found at the back of any statistics book or online) at the
designated degrees of freedom (df). See Annexes A to E for the manual
computation of the different test statistics.

If the computed value (in absolute value) is greater than the tabular/critical value,
your decision is to REJECT the null hypothesis. If your computed value is smaller than the
tabular value, your decision is to ACCEPT the null hypothesis.

Examine the Excel results of the t-test for independent groups below. Both the p-value of the
statistic t and the critical/tabular value are presented, and both lead to the same
decision to reject the null hypothesis.

Data: Reading comprehension scores of male and female students

Sample   Male     Sample   Female

1        15       1        15
2        13       2        12
3        14       3        18
4        16       4        19
5        12       5        18
6        13       6        16
7        15       7        15
8        19       8        20
9        13       9        21
10       14       10       19
11       16       11       14
12       12       12       15
13       15       13       17
                  14       18
                  15       20
                  16       15
                  17       14

These are the results of the t-test for independent groups using Excel Data Analysis:

t-Test: Two-Sample Assuming Unequal Variances

                                Male            Female
Mean                            14.38461538     16.82352941
Variance                        3.756410256     6.529411765
Observations                    13              17
Hypothesized Mean Difference    0
df                              28
t Stat                          -2.972876048
P(T<=t) one-tail                0.003003304
t Critical one-tail             1.701130934
P(T<=t) two-tail                0.006006609
t Critical two-tail             2.048407142

The p-value of the test statistic t = 2.973 (rounded off, in absolute value) is p = 0.006.
This value is lower than the .05 significance level set, so you have to REJECT the null
hypothesis “There is no difference between the reading scores of students grouped by sex.”
Look also at the comparison between the computed and the critical values: 2.973 vs. 2.048.
The computed t-value is greater than 2.048, so the same decision of
rejecting the null hypothesis is reached. So, whether you use the p-value or the
critical value as the point of comparison to make a statistical inference, the same
result is obtained. In the t-test, ignore the sign of the computed value; always use the
absolute value. The sign (+ or -) depends on which group you enter first
and which last. If the first mean is bigger than the second, the t-value is positive; if
reversed, the t-value is negative.
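
The Excel figures can be checked by hand or with a short script. The sketch below uses the
scores from the data table and the unequal-variances formulas that Excel names in its output
header; the function name `welch_t` is chosen here for illustration, and the critical value is
simply copied from the Excel output rather than computed.

```python
from statistics import mean, variance

male = [15, 13, 14, 16, 12, 13, 15, 19, 13, 14, 16, 12, 15]
female = [15, 12, 18, 19, 18, 16, 15, 20, 21, 19, 14, 15, 17, 18, 20, 15, 14]


def welch_t(a, b):
    """t statistic and degrees of freedom, assuming unequal variances."""
    va = variance(a) / len(a)          # sample variance divided by group size
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


t_stat, df = welch_t(male, female)
T_CRITICAL = 2.048                     # two-tail critical value from the Excel output

# Compare the absolute computed value with the critical value.
decision = "REJECT" if abs(t_stat) > T_CRITICAL else "ACCEPT"

print("mean (male, female):", round(mean(male), 2), round(mean(female), 2))
print("t =", round(t_stat, 3), " df =", round(df))
print("decision on the null hypothesis:", decision)
```

Swapping the order of the two groups in `welch_t` changes only the sign of t, not the decision.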

Note: Even if you are manually computing the test statistic, you can find the
associated probability by searching online for a probability (p-value) calculator
for the test you used (t-test, Pearson r, chi-square). The calculator asks for
the computed test statistic and the degrees of freedom and, in a click, gives
the p-value.
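
What such a calculator does for the t-test can be sketched in a few lines. This is an
illustration only, not the method of any particular website: it assumes the standard formula of
the t distribution and approximates the area under the curve numerically (Simpson's rule). The
function names are made up for this example.

```python
import math


def t_density(x, df):
    """Height of the t distribution curve at x for the given degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value: the area outside -|t| and +|t| (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_density(0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = 2 * (h / 3) * total     # area between -|t| and +|t|
    return 1 - central_area


p_value = two_tailed_p(2.973, 28)          # computed t and df from the example
print("two-tailed p-value:", round(p_value, 3))
```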

6. Interpret and draw conclusions

After you make a decision to reject or accept the null hypothesis, you have to
interpret the results. If the t-test gives a significant result (the null hypothesis is
rejected), identify which group has the higher mean. Then, explain the
probable factors/reasons that made the difference significant. Use your review of
related literature to support your finding and argument. From your arguments, you
can come up with a valid and reliable conclusion.

Sample Interpretation of the Findings Using the Example Above

Comparison Between the Reading Comprehension Scores of Students Grouped by Sex

The study hypothesized that there is no difference between the reading
comprehension scores of the Grade 7 students grouped by sex. Results of the t-test
for independent groups revealed that the computed t-value of 2.973 has an
associated probability of 0.006, which is lower than the significance level set in the
study; thus, the null hypothesis is rejected (Table 3). It means that there is a
significant difference in the reading comprehension scores of the students.

As shown in the table, female students, with a mean score of 16.82, outscored
the males, who obtained a mean score of 14.38. It indicates that female students
understand what they read better than their male counterparts. A study conducted by
Logan and Johnston (2009:200) on the relationship between reading
comprehension and gender found that girls are better in
reading comprehension than boys, girls read more frequently than boys do, and
girls have a more positive attitude to reading. In another study, females were found to be
better than males in foreign language comprehension: “in terms of language
comprehension, several studies have demonstrated female superiority.” (Saidi,

Table 3. Comparison between the reading comprehension scores of students grouped by sex.

Group     Mean      Variance   t-value    p-value
Female    16.824    6.529
Male      14.385    3.756      2.973**    0.006
** = significant at 0.01

Note: In designating levels of significance, the following should be observed.

A single asterisk (*) means significant at the 0.05 level. Just state “There is a significant
difference . . . .”

Double asterisks (**) mean highly significant at the 0.01 level. Just state “There is a
highly significant difference . . . .”
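
The asterisk convention above can be written as a small rule. This is a sketch only; the function
name `significance_label` is hypothetical, and the use of "at or below" for the cut-offs is an
assumption of this example.

```python
def significance_label(p_value):
    """Return the asterisk mark and wording for a p-value, following the note above."""
    if p_value <= 0.01:
        return "**", "highly significant"
    if p_value <= 0.05:
        return "*", "significant"
    return "", "not significant"


mark, wording = significance_label(0.006)   # p-value from Table 3
print("2.973" + mark, "-", wording)
```
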
Activity 1. In your own words, explain the major steps in hypothesis testing.

Activity 2. In your own words, explain the two ways to make a statistical decision (rejecting or
accepting the null hypothesis).

Activity 3. Explain how best to explain the results of a statistical analysis.

Activity 4. With your chosen research topic/title and inferential research question, complete
the following table by supplying the needed information. Note: for every variable, identify the
measurement scale and the statistical tool to be used.

Hypothesis    Variables/Data to Gather    Measurement Scale    Number of Groups    Statistical Test to Analyze the Data


Directions: Identify the key term described in the statements below:

1. This function of statistical inference allows the researcher to make statements about the population
when only a portion of it was observed.
2. This value is the one computed from the sample data which is the estimate of the population
parameter.
3. This value, found in the annexes of a statistics book, is the reference value to determine if the
computed value permits the rejection or acceptance of the null hypothesis.
4. This value is set in advance to determine when to reject or accept the null hypothesis.
5. This is the decision to take if the computed value is greater than the critical value.

Directions: Write Yes if the researcher did the correct action; No if it is wrong.

6. Researcher Joan rejected the null hypothesis when the computed value has a p-value of 0.07.
7. Researcher Mark used chi-square to analyze his data to test the hypothesis “There is no difference in
the average grades of students from central and barangay high schools.”
8. Researcher Jayden accepted the null hypothesis when the computed value is 4.15 but the
critical/tabular value is 8.11.
9. Researcher Marionette used t-test in testing the hypothesis that “There is no association between
average grades and number of hours to study assignments.”
10. In a correlation analysis, Researcher Toni found a positive coefficient (r = 0.789; p = .001) for
the number of hours spent text messaging and the number of productive vocabulary words used by the
students. He concludes that the longer a student spends sending text messages, the more vocabulary
words the student uses.

