New Normal MPA Statistics Chapter 2


CARLOS HILADO MEMORIAL STATE COLLEGE

COLLEGE OF ARTS AND SCIENCES


MASTERS OF PUBLIC ADMINISTRATION

First Semester 2021 - 2022


Chapter 2. Hypothesis Testing
Hypothesis testing was introduced by Ronald Fisher, Jerzy Neyman, Karl Pearson and Pearson’s son, Egon
Pearson. Hypothesis testing is a statistical method used to make statistical decisions from experimental data.
A statistical hypothesis is basically an assumption that we make about a population parameter (SS, 2013).

According to Majaski (2019), hypothesis testing is an act in statistics whereby an analyst tests an assumption
regarding a population parameter. The methodology employed by the analyst depends on the nature of the data used
and the reason for the analysis. Hypothesis testing uses sample data to infer whether a hypothesis about the larger
population is plausible.

Why Should You Perform Statistical Hypothesis Testing? (Frost, 2019)


Hypothesis testing is a form of inferential statistics that allows us to draw conclusions about an entire
population based on a representative sample. You gain tremendous benefits by working with a sample. In most cases, it
is simply impossible to observe the entire population to understand its properties. The only alternative is to collect a
random sample and then use statistics to analyze it.

While samples are much more practical and less expensive to work with, there are tradeoffs. When you
estimate the properties of a population from a sample, the sample statistics are unlikely to equal the actual population
value exactly. For instance, your sample mean is unlikely to equal the population mean. The difference between the
sample statistic and the population value is the sampling error.
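
The gap between a sample mean and the population mean can be seen in a minimal Python sketch. The population values, the sample size of 5, and the seed below are hypothetical and chosen only for illustration; they do not come from the text.

```python
import random
from statistics import mean

# Hypothetical population of 12 values (illustration only, not from the text).
population = [52, 61, 58, 70, 66, 49, 73, 64, 55, 68, 60, 57]

def sampling_error(population, sample_size, seed=None):
    """Return (sample mean, population mean, sample mean minus population mean)."""
    rng = random.Random(seed)                      # seeded so the draw can be repeated
    sample = rng.sample(population, sample_size)   # simple random sample without replacement
    sample_mean = mean(sample)
    population_mean = mean(population)
    return sample_mean, population_mean, sample_mean - population_mean

sample_mean, population_mean, error = sampling_error(population, 5, seed=1)
print(f"sample mean = {sample_mean:.2f}, "
      f"population mean = {population_mean:.2f}, error = {error:.2f}")
```

Drawing the whole population (sample size 12) gives an error of zero, which is the case that is usually impossible in practice.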

Types of Statistical Hypothesis


There are basically two types, namely, the null hypothesis (Ho) and the alternative hypothesis (H1 or Ha). Research
generally starts with a problem. These hypotheses then provide the researcher with specific restatements and
clarifications of the research problem (SS, 2013).
According to Nigam (2018), a null hypothesis proposes that no significant difference exists in a set of given
observations. For the tests in this chapter, in general:
Null (Ho): The means of the two groups are equal.
Alternative (Ha): The means of the two groups are not equal.

According to Beers (2019), the p-value is the probability of obtaining results at least as extreme as the observed
results of a test, assuming that the null hypothesis is correct. It is the level of marginal significance within a
statistical hypothesis test, representing the probability of the occurrence of a given event.

Moreover, the p-value is used as an alternative to rejection points to provide the smallest level of significance at
which the null hypothesis would be rejected. A smaller p-value means that there is stronger evidence in favor of the
alternative hypothesis.
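
A minimal Python sketch of how a p-value shrinks as the test statistic moves away from the null value. It assumes a z statistic with a standard normal null distribution, which is a simplification (the t-tests later in the chapter use the t distribution instead), and the z values are hypothetical.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Two-sided p-value for a z statistic, assuming a standard normal null distribution."""
    # Area in both tails beyond |z|.
    return 2 * (1 - NormalDist().cdf(abs(z)))

for z in (1.0, 1.96, 2.58):  # hypothetical test statistics
    print(f"z = {z:.2f} -> p = {two_sided_p_value(z):.4f}")
```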

What Is the Significance Level (Alpha)? (Ellis et al., 2015)


The significance level, also denoted as alpha or α, is the probability of rejecting the null hypothesis when it is
true. For example, a significance level of 0.05 indicates a 5% risk of concluding that a difference exists when there is no
actual difference.
The significance level determines how far out from the null hypothesis value we'll draw that line on the graph. To
graph a significance level of 0.05, we need to shade the 5% of the distribution that is furthest away from the null
hypothesis.
In Figure 1, the two shaded areas are equidistant from the null hypothesis value and each area has a probability of
0.025, for a total of 0.05. In statistics, we call these shaded areas the critical region for a two-tailed test. If the
population mean is 260, we’d expect to obtain a sample mean that falls in the critical region 5% of the time. The
critical region defines how far away our sample statistic must be from the null hypothesis value before we can say it
is unusual enough to reject the null hypothesis.

Figure 1
In Figure 2, the sample mean (330.6) falls within the critical region, which indicates it is statistically
significant at the 0.05 level. We can also see if it is statistically significant using the other common significance
level of 0.01.

The two shaded areas each have a probability of 0.005, which adds up to a total probability of 0.01. This time our
sample mean does not fall within the critical region and we fail to reject the null hypothesis.

Figure 2

This comparison shows why you need to choose your significance level before you begin your study: the same sample
mean is significant at the 0.05 level but not at the 0.01 level. Using the graph, we can determine that our results are
statistically significant at the 0.05 level without using a P value. However, when using the numeric output produced
by statistical software, we compare the P value to the significance level to make this determination.
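
A short Python sketch of the two ideas in this section: where the two-tailed critical region begins for a given alpha, and the decision rule that compares a P value with alpha. It assumes a standard normal distribution for the cutoffs; the P value of 0.03 is hypothetical and was picked only because it is significant at 0.05 but not at 0.01.

```python
from statistics import NormalDist

def critical_z(alpha):
    """Cutoff z such that the two tails beyond -z and +z together have area alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def reject_null(p_value, alpha):
    """Decision rule: reject the null hypothesis when the P value is at most alpha."""
    return p_value <= alpha

for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: critical region is |z| > {critical_z(alpha):.3f}")

p = 0.03  # hypothetical P value from software output
print("reject at 0.05:", reject_null(p, 0.05))
print("reject at 0.01:", reject_null(p, 0.01))
```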

What is the difference between a parametric and a nonparametric test?


(https://help.xlstat.com/s/article/what-is-the-difference-between-a-parametric-and-a-nonparametric-test?language=en_US)

• Parametric tests assume underlying statistical distributions in the data. Therefore, several conditions of validity
must be met so that the result of a parametric test is reliable. For example, Student’s t-test for two independent
samples is reliable only if each sample follows a normal distribution and if sample variances are homogeneous.

• Nonparametric tests do not assume a specific distribution for the data. They can thus be applied even if parametric
conditions of validity are not met.

What is the advantage of using a nonparametric test?


Nonparametric tests are more robust than parametric tests. In other words, they are valid in a broader range of
situations (fewer conditions of validity).

What is the advantage of using a parametric test?


The advantage of using a parametric test instead of a nonparametric equivalent is that, when its conditions of
validity are met, the former will have more statistical power than the latter. In other words, a parametric test is
more likely to reject H0 when H0 is false. Most of the time, the p-value associated with a parametric test will be
lower than the p-value associated with a nonparametric equivalent that is run on the same data.

Parametric tests often have nonparametric equivalents. (Campbell and Swinscow, 2009)

                                Parametric test                  Nonparametric test
A. One sample group             One-sample t-test                Chi-square test of goodness-of-fit
B. Group comparison             Independent t-test               Mann-Whitney U test
                                One-Way ANOVA (Scheffe test)     Kruskal-Wallis H test
C. Repeated measures            Paired-samples t-test            Wilcoxon signed rank test
D. Correlation and regression   Pearson Product Moment (PPM)     Spearman rank correlation
                                Correlation Coefficient
                                Linear Regression                Chi-square test of association

Steps in Hypothesis Testing of Difference


Step 1: Determine the variable of interest X. State the null hypothesis (Ho) and the alternative hypothesis (H1) in
words and in symbols.
Step 2: Choose the level of significance (usually 0.05 or 0.01).
Step 3: Determine the statistical tool.
Step 4: Compute the test statistic.
Step 5: Make the decision and state the conclusion.

Assumptions in Parametric Statistics

Assumptions for Parametric and Nonparametric tests

Basis of Comparison           Parametric test                         Nonparametric test
Meaning                       A statistical test in which specific    A statistical test used in the
                              assumptions are made about the          case of non-metric independent
                              population parameter.                   variables.
Distribution / Data           Normal                                  Not normal (skewed data)
Measurement level             Interval or ratio                       Nominal or ordinal
Sampling                      Random                                  Nonrandom
Measure of central tendency   Assess group means                      Assess group medians
Sample size                   Big                                     Small

Test statistics (Statistical test)

A test statistic is used in a hypothesis test when you are deciding whether to reject or fail to reject the null
hypothesis. The test statistic takes your data from an experiment or survey and compares your results to the results
you would expect under the null hypothesis (WP, 2020).

One sample t-test

The single sample t method tests a null hypothesis that the population mean is equal to a specified value. If this
value is zero (or not entered) then the confidence interval for the sample mean is given (Altman, 1991; Armitage and
Berry, 1994).

Example: Consider 20 first-year resident female doctors drawn at random from one area. Their resting systolic blood
pressures, measured using an electronic sphygmomanometer, were:

Sample   BP     Sample   BP
1.       128    11.      127
2.       118    12.      115
3.       144    13.      142
4.       133    14.      140
5.       132    15.      131
6.       111    16.      132
7.       149    17.      122
8.       139    18.      119
9.       136    19.      129
10.      126    20.      128

Can we conclude that the instrument is effective? (Use the 0.05 level of significance.)

Steps of Hypothesis Testing

Does the mean systolic blood pressure of the participants differ significantly from the test value of 0?

1. State your Null and Alternative Hypothesis


Ho: The mean systolic blood pressure of the participants is equal to 0.
H1: The mean systolic blood pressure of the participants is not equal to 0.

2. Level of Significance
α = 0.05

3. Statistical Tool
One sample t-test

4. Computation

Using JASP

1. Open the data

2. Click t-test and select one sample t-test

3. Direct the data to variables

4. Go to Results
One Sample T-Test
t df p
BP 58.392 19 < .001

Note. Student's t-test.

5. Making of decision and conclusion.

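Based on the results, the mean systolic blood pressure of the participants differed significantly from the test value
of 0 [t(19) = 58.39, p < 0.001] at the 0.05 level of significance. Because no reference blood pressure was entered as
the test value, this result shows only that the mean reading is not zero. Concluding that the instrument is effective
would require testing the readings against a known reference value.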
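
The t statistic in the JASP output can be checked by hand. The sketch below uses only the Python standard library and assumes a test value of 0, which is the value that reproduces the reported t of 58.392; the function name `one_sample_t` is made up for this illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Resting systolic blood pressures of the 20 participants in the example.
bp = [128, 118, 144, 133, 132, 111, 149, 139, 136, 126,
      127, 115, 142, 140, 131, 132, 122, 119, 129, 128]

def one_sample_t(data, test_value=0.0):
    """Return (t, df) for the null hypothesis that the population mean equals test_value."""
    n = len(data)
    standard_error = stdev(data) / sqrt(n)   # sample SD (n - 1 denominator) over sqrt(n)
    t = (mean(data) - test_value) / standard_error
    return t, n - 1

t, df = one_sample_t(bp)
print(f"mean = {mean(bp):.2f}, t({df}) = {t:.3f}")
```

Passing a different `test_value` (for example, a reference blood pressure) changes the hypothesis being tested and therefore the t statistic.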

Activity:

1. A weight reduction program claims to be effective in treating obesity. To test this claim, 12 PNP personnel were put
on the program, and the number of pounds of weight gained or lost was recorded for each person after two years. Can we
conclude that the weight loss program is effective? (Use the 0.05 level of significance.)

Subject   Weight Loss    Subject   Weight Loss
1         12             7         12
2         15             8         -8
3         -5             9         20
4         7              10        8
5         1              11        -2
6         -10            12        -5

Independent sample t-test


The independent samples t-test is a test that compares two groups on the mean value of a continuous (i.e.,
interval or ratio), normally distributed variable. The model assumes that a difference in the mean score of the
dependent variable is found because of the influence of the independent variable that distinguishes the two groups (SS,
2013).

Example: Researchers give each of a random sample of 15 employees with high absenteeism records (Group A) a test to
measure their level of hostility. They give the same test to an independent random sample of 22 employees with low
absenteeism records (Group B). Is there a significant difference in the level of hostility of the employees when
grouped according to absenteeism record? (Use the 0.05 level of significance.)

Absenteeism Hostility
Group A 62
Group A 93
Group A 71
Group A 90
Group A 69
Group A 90
Group A 71
Group A 76
Group A 86
Group A 71
Group A 81
Group A 84
Group A 65
Group A 61
Group A 69
Group B 67
Group B 66
Group B 64
Group B 42
Group B 59
Group B 70
Group B 75
Group B 69
Group B 72
Group B 74
Group B 55
Group B 55
Group B 56
Group B 57
Group B 60
Group B 48
Group B 60
Group B 53
Group B 65
Group B 64
Group B 46
Group B 41

Steps of Hypothesis Testing

Is there a significant difference in the level of hostility of the employees when grouped according to absenteeism
record?

1. State the null and alternative hypothesis


Ho: There is no significant difference in the level of hostility of the employees when grouped according to
absenteeism record.
H1: There is a significant difference in the level of hostility of the employees when grouped according to
absenteeism record.

2. Level of significance
α = 0.05

3. Statistical Tool
Independent sample t-test

4. Computation

Using JASP

1. Open the data


2. Click t-test and select independent sample t-test

3. Direct Absenteeism to Grouping Variables and Hostility to Dependent Variables, and click Descriptives

4. Go to Results

Independent Samples T-Test
            t       df       p
Hostility   4.704   35.000   < .001

Note. Student's t-test.

Group Descriptives
            Group     N    Mean     SD       SE
Hostility   Group A   15   75.933   10.640   2.747
            Group B   22   59.909   9.851    2.100

5. Making of decision and conclusion.

The results show that there was a significant difference in the level of hostility of the employees when grouped
according to absenteeism record [t(35) = 4.704, p < 0.001] at the 0.05 level of significance. This implies that the
participants with high absenteeism records (mean = 75.933) are, on average, more hostile than the participants with
low absenteeism records (mean = 59.909).
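
The same t statistic can be reproduced with the Python standard library. This is a sketch of Student's pooled-variance t-test, which matches the "Student's t-test" note in the JASP output; the function name `independent_t` is made up for this illustration.

```python
from math import sqrt
from statistics import mean, variance

# Hostility scores from the example, split by absenteeism group.
group_a = [62, 93, 71, 90, 69, 90, 71, 76, 86, 71, 81, 84, 65, 61, 69]
group_b = [67, 66, 64, 42, 59, 70, 75, 69, 72, 74, 55,
           55, 56, 57, 60, 48, 60, 53, 65, 64, 46, 41]

def independent_t(x, y):
    """Student's t-test with pooled variance (assumes equal variances); returns (t, df)."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_variance = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean(x) - mean(y)) / standard_error, df

t, df = independent_t(group_a, group_b)
print(f"mean A = {mean(group_a):.3f}, mean B = {mean(group_b):.3f}, t({df}) = {t:.3f}")
```

A positive t here means Group A has the higher mean, since the difference is computed as Group A minus Group B.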

Activity:
A research study was conducted to examine the differences between older and younger adults on perceived life
satisfaction. A pilot study was conducted to examine this hypothesis. Ten older adults (over the age of 70) and ten
younger adults (between 20 and 30) were given a life satisfaction test (known to have high reliability and validity).
Scores on the measure range from 0 to 60, with high scores indicative of high life satisfaction and low scores
indicative of low life satisfaction. Is there a significant difference in perceived life satisfaction when the
participants are grouped according to age? (Use the 0.05 level of significance.)
(http://faculty.webster.edu/woolflm/ttest.html)

Participants Scores
Older 45
Older 38
Older 52
Older 48
Older 25
Older 39
Older 51
Older 46
Older 55
Older 46
Younger 34
Younger 22
Younger 15
Younger 27
Younger 37
Younger 41
Younger 24
Younger 19
Younger 26
Younger 36

Paired t-test

The Paired Samples t-test compares two means that are from the same individual, object, or related units
(https://libguides.library.kent.edu/SPSS/PairedSamplestTest). The two means can represent things like:

• A measurement taken at two different times (e.g., pre-test and post-test with an intervention administered
between the two time points)
• A measurement taken under two different conditions (e.g., completing a test under a "control" condition and an
"experimental" condition)
• Measurements taken from two halves or sides of a subject or experimental unit (e.g., measuring hearing loss in a
subject's left and right ears).

Moreover, the purpose of the test is to determine whether there is statistical evidence that the mean difference
between paired observations on a particular outcome is significantly different from zero. The Paired Samples t Test is a
parametric test. This test is not appropriate for analyses involving the following:

1) unpaired data;
2) comparisons between more than two units/groups;
3) a continuous outcome that is not normally distributed; and
4) an ordinal/ranked outcome.
Example:

A dose of the drug Captopril, designed to lower systolic blood pressure, is administered to 10 randomly selected
volunteers, with the following results. Test the effectiveness of the drug. (Use the 0.01 level of significance.)

Before After

120 118
136 122
160 143
98 105
115 98
110 98
180 180
190 175
138 105
128 112

Steps of hypothesis testing

Is there a significant difference in the blood pressure of the volunteers before and after taking Captopril?

1. State the null and alternative hypothesis

Ho: There is no significant difference in the blood pressure of the volunteers before and after taking
Captopril.

H1: There is a significant difference in the blood pressure of the volunteers before and after taking
Captopril.

2. Level of significance

α = 0.01

3. Statistical tool

Paired sample t-test

4. Computation

Using JASP

1. Open the Data


2. Click t-test and select paired t-test

3. Direct the Before and After Data to the Variable Pairs, and click Descriptives

4. Go to Results

Paired Samples T-Test
Measure 1   Measure 2   t       df   p
Before    - After       3.366   9    0.008

Note. Student's t-test.

Descriptives
         N    Mean      SD       SE
Before   10   137.500   30.351   9.598
After    10   125.600   30.424   9.621

5. Making of decision and conclusion.

The results show that there was a significant difference in the systolic blood pressure of the participants before
and after the administration of Captopril [t(9) = 3.366, p = 0.008] at the 0.01 level of significance. Mean systolic
blood pressure fell from 137.5 before to 125.6 after, which implies that Captopril is effective in lowering the
systolic blood pressure of the participants.
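
The paired t statistic is a one-sample t-test applied to the before-minus-after differences, as the following standard-library sketch shows. The function name `paired_t` is made up for this illustration, and the critical value 3.250 is the two-tailed cutoff for alpha = 0.01 with 9 degrees of freedom taken from a standard t table rather than from the text.

```python
from math import sqrt
from statistics import mean, stdev

# Systolic blood pressure of the 10 volunteers before and after the dose.
before = [120, 136, 160, 98, 115, 110, 180, 190, 138, 128]
after = [118, 122, 143, 105, 98, 98, 180, 175, 105, 112]

def paired_t(x, y):
    """Return (t, df) for the null hypothesis that the mean paired difference is zero."""
    differences = [a - b for a, b in zip(x, y)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1

T_CRITICAL = 3.250  # two-tailed, alpha = 0.01, df = 9 (standard t table; an added assumption)

t, df = paired_t(before, after)
print(f"mean difference = {mean(before) - mean(after):.1f}, t({df}) = {t:.3f}")
print("reject H0 at 0.01:", abs(t) > T_CRITICAL)
```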

Activity:

Based on “An Analysis of Factors that Contribute to the Efficacy of Hypnotic Analgesia”, by Price and Barber,
Journal of Abnormal Psychology, Vol. 96, No. 1. A study was conducted to investigate the effectiveness of hypnotism in
reducing pain. Results for randomly selected subjects are given below. At the 0.05 level of significance, test the
claim that the sensory measurements are lower after hypnotism. (The values are before and after hypnosis; the
measurements are in centimeters on a pain scale.) Does hypnotism appear to be effective in reducing pain?

Subject Before After


1 6.6 6.8
2 6.5 2.4
3 9.0 7.4
4 10.3 8.5
5 11.3 8.1
6 8.1 6.1
7 6.3 3.4
8 11.6 2.0

One-way ANOVA

The one-way analysis of variance (ANOVA) is used to determine whether there are any statistically significant
differences between the means of two or more independent (unrelated) groups (although you tend to only see it used
when there is a minimum of three, rather than two groups) (Leard, 2018).

Furthermore, the one-way ANOVA is an omnibus test statistic and cannot tell you which specific groups were
statistically significantly different from each other; it only tells you that at least two groups were different. Since you
may have three, four, five or more groups in your study design, determining which of these groups differ from each
other is important. You can do this using a post hoc test.

Example: Researchers wish to study the effect of crowding on the productivity of office workers. Office workers of the
same age, sex, and level of training and experience are randomly assigned to one of three groups representing three
levels of crowding: Severe, Moderate, or None. The results are shown below. Can we conclude from these data that
crowding affects productivity? Let α = 0.05.

Crowding Productivity
Severe 22
Severe 49
Severe 32
Severe 37
Severe 32
Severe 22
Moderate 31
Moderate 30
Moderate 43
Moderate 30
Moderate 46
None 68
None 73
None 78
None 47
None 56
None 59

Steps of hypothesis testing

Is there a significant difference in the level of productivity of the participants when grouped according to
crowding?

1. State null and alternative hypothesis

Ho: There is no significant difference in the level of productivity of the participants when grouped
according to crowding.

H1: There is a significant difference in the level of productivity of the participants when grouped according
to crowding.

2. Level of significance
α = 0.05

3. Statistical tool
one-way ANOVA

4. Computation

Using JASP

1. Open the Data

2. Click ANOVA and select ANOVA


3. Direct Crowding to Fixed Factors and Productivity to Dependent Variable, and click Descriptives

4. Click Post Hoc Test and direct Crowding to the other box

5. Click Scheffe and go to Results

ANOVA - Productivity
Cases       Sum of Squares   df   Mean Square   F        p
Crowding    3415.284         2    1707.642      16.732   < .001
Residuals   1428.833         14   102.060

Note. Type III Sum of Squares

Descriptives - Productivity
Crowding   Mean     SD       N
Moderate   36.000   7.842    5
None       63.500   11.572   6
Severe     32.333   10.132   6

Post Hoc Comparisons - Crowding
                    Mean Difference   SE      t        p scheffe
Moderate   None     -27.500           6.117   -4.495   0.002
           Severe   3.667             6.117   0.599    0.837
None       Severe   31.167            5.833   5.343    < .001

Note. P-value adjusted for comparing a family of 3

5. Making of decision and conclusion.

Based on the results, there was a significant difference in the level of productivity when the participants are
grouped according to crowding [F(2, 14) = 16.732, p < 0.001] at the 0.05 level of significance.
Using the Scheffe method, the group with no crowding differed significantly from the moderate and severe crowding
groups, with mean differences of 27.5 and 31.167, respectively, while the moderate and severe groups did not differ
significantly from each other (p = 0.837). This implies that workers in an uncrowded office are more productive than
those in moderately or severely crowded working areas.
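
The F statistic in the ANOVA table is the ratio of the between-group mean square to the within-group mean square. A minimal standard-library sketch of that computation follows; the function name `one_way_anova` is made up for this illustration, and the sketch does not compute the p-value or the Scheffe post hoc comparisons.

```python
from statistics import mean

# Productivity scores from the example, grouped by level of crowding.
groups = {
    "Severe": [22, 49, 32, 37, 32, 22],
    "Moderate": [31, 30, 43, 30, 46],
    "None": [68, 73, 78, 47, 56, 59],
}

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict mapping group name to observations."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

f, df_between, df_within = one_way_anova(groups)
print(f"F({df_between}, {df_within}) = {f:.3f}")
```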

Activity:
A firm wishes to compare four programs for training workers to perform a certain manual task. Twenty new
employees are randomly assigned to the training programs, with 5 in each program. At the end of the training period, a
test is conducted to see how quickly trainees can perform the task. The number of times the task is performed per
minute is recorded for each trainee, with the following results:

Program Score
Program 1 9
Program 1 12
Program 1 14
Program 1 11
Program 1 13
Program 2 10
Program 2 6
Program 2 9
Program 2 9
Program 2 10
Program 3 12
Program 3 14
Program 3 11
Program 3 13
Program 3 11
Program 4 9
Program 4 8
Program 4 11
Program 4 7
Program 4 8

Using α = .05, determine whether the treatments differ in their effectiveness.

References:

Beers, B. (2019), P-Value Definition, https://www.investopedia.com/terms/p/p-value.asp, Date retrieved January 9,
2020, Thursday.

Campbell and Swinscow (2009), Parametric and Non-parametric tests for comparing two or more groups,
https://www.healthknowledge.org.uk/public-health-textbook/research-methods/1b-statistical-methods/parametric-nonparametric-tests,
Date retrieved January 9, 2020, Thursday.

Ellise et al. (2015), Understanding Hypothesis Tests: Significance Levels (Alpha) and P values in Statistics,
https://blog.minitab.com/blog/adventures-in-statistics-2/understanding-hypothesis-tests-significance-levels-alpha-and-p-values-in-statistics,
Date retrieved January 9, 2020.

Mangiafico, S. (2016), Assumptions in parametric statistics, https://rcompanion.org/handbook/I_01.html, Date
retrieved January 9, 2020, Thursday.

Nigam, V. (2018), Statistical Tests — When to use Which?,
https://towardsdatascience.com/statistical-tests-when-to-use-which-704557554740, Date retrieved January 9, 2020,
Thursday.

Statistics Solutions (SS) (2013), Hypothesis Testing [WWW Document]. Retrieved from
http://www.statisticssolutions.com/academic-solutions/resources/directory-of-statistical-analyses/hypothesis-testing/

W.P. Word Press (2020), Test Statistic: What is it? Types of Test Statistic,
https://www.statisticshowto.datasciencecentral.com/test-statistic/, Date retrieved January 9, 2020, Thursday.
