
Essentials of Statistics for the Behavioral Sciences, 9th Edition (Gravetter)
Solutions Manual
Chapter 8: Introduction to Hypothesis Testing

Chapter Outline

8.1 The Logic of Hypothesis Testing


The Elements of a Hypothesis Test
The Four Steps of a Hypothesis Test
Step 1: State the Hypothesis
Step 2: Set the Criteria for a Decision
Step 3: Collect Data and Compute Sample Statistics
Step 4: Make a Decision
A Closer Look at the z-Score Statistic (Recipe and Ratio)
8.2 Uncertainty and Errors in Hypothesis Testing
Type I Errors
Type II Errors
Selecting an Alpha Level
8.3 More about Hypothesis Tests
A Summary of the Hypothesis Test
In the Literature: Reporting the Results of the Statistical Test
Factors That Influence a Hypothesis Test
Assumptions for Hypothesis Tests with z-Scores
8.4 Directional (One-Tailed) Hypothesis Tests
The Hypotheses for a Directional Test
The Critical Region for Directional Tests
Comparison of One-Tailed versus Two-Tailed Tests
8.5 Concerns about Hypothesis Testing: Measuring Effect Size
Measuring Effect Size
8.6 Statistical Power
Power and Effect Size
Other Factors that Affect Power

Learning Objectives and Chapter Summary

1. Students should understand the purpose, logic, and steps involved in hypothesis testing.

In the generic hypothesis testing situation, a sample is selected from a population, a
treatment is administered to the sample, and the individuals in the sample are measured.
If the sample mean differs significantly from the original population mean, then we have
evidence that the treatment had an effect.

2. Students should be able to state/identify the null and alternative hypotheses and locate the
critical region.

A hypothesis test always begins with a null hypothesis stating that the treatment has no
effect (no change, no difference, no relationship, etc.). The next step is to determine what
kind of sample data would be reasonable if this hypothesis is true, and what kind of
sample data would be very unlikely. The term “very unlikely” is defined by the alpha
level for the test. The critical region consists of the set of sample outcomes that would be
very unlikely to occur if the null hypothesis is true. If the research study produces a
sample in the critical region, we conclude that the sample data are not consistent with the
null hypothesis, and we reject the null hypothesis.
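As a small illustration of how the alpha level fixes the critical region, the sketch below converts alpha into z-score boundaries using Python's standard-library `statistics.NormalDist`. The function name `critical_z` and its `tails` parameter are illustrative choices, not notation from the text. Exact boundaries can differ slightly from rounded table values (for example, 1.64 here versus the 1.65 commonly read from a unit normal table).

```python
from statistics import NormalDist


def critical_z(alpha: float, tails: int = 2) -> float:
    """Return the positive z-score boundary of the critical region.

    With tails=2 the alpha level is split evenly between the two tails;
    with tails=1 all of alpha sits in a single tail.
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1 - alpha / tails)


print(round(critical_z(0.05), 2))           # 1.96
print(round(critical_z(0.01), 2))           # 2.58
print(round(critical_z(0.05, tails=1), 2))  # 1.64
```

Sample outcomes with z-scores beyond these boundaries fall in the critical region and lead to rejecting the null hypothesis.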

3. Students should be able to conduct a hypothesis test using a z-score statistic and make a
statistical decision.

As noted earlier, if the mean for the treated sample is noticeably different from the mean
for the original population, then we conclude that the treatment had an effect. The
problem, however, is that an observed difference between the sample mean and the
population mean may be due to chance (sampling error). One goal of a hypothesis test is
to rule out chance as a plausible explanation for the mean difference. To accomplish this
goal, we first calculate how much difference is reasonable to expect between M and µ if
there is no treatment effect (the standard error). Then, we compare the actual obtained
difference with this value. The z-score statistic reflects this comparison.

\[
z = \frac{M - \mu}{\sigma_M} = \frac{\text{actual mean difference}}{\text{standard difference between } M \text{ and } \mu}
\]

A large value for the z-statistic (as defined by the critical region) means that we can
conclude that the obtained mean difference is more than can be explained by chance, and
reject the null hypothesis.
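The comparison described above can be sketched in a few lines of Python. This is a minimal example, assuming a two-tailed test with a known population standard deviation; the function name `z_test` is hypothetical, and the input values are illustrative (they reproduce a standard error of 1.5 and a mean difference of 3.8, with sigma = 15 inferred from those figures).

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu, sigma, n, alpha=0.05):
    """Two-tailed z-test; returns (z, reject_null)."""
    # Standard error: how much difference between M and mu is expected by chance.
    standard_error = sigma / sqrt(n)
    # z compares the actual mean difference with the standard error.
    z = (sample_mean - mu) / standard_error
    # Boundary of the critical region for the chosen alpha level.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return z, abs(z) > critical


# Illustrative values: mu = 50, sigma = 15, n = 100, M = 53.8
z, reject = z_test(sample_mean=53.8, mu=50, sigma=15, n=100)
print(round(z, 2), reject)  # 2.53 True
```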

4. Students should be able to define and differentiate Type I and Type II errors.

A hypothesis test uses limited information from a sample to make a general conclusion
about a population. It is always possible that the sample information is misleading or not
representative, leading us to an incorrect conclusion. Sometimes the sample appears to
show evidence of a treatment effect, when in fact the treatment has no effect. In this case,
the researcher falsely concludes that the treatment has an effect, making a Type I error
(i.e., incorrectly rejects the null hypothesis). It is also possible that the treated sample
does not appear to be noticeably different from the original population, even though the
treatment did have an effect. In this case, the researcher falsely concludes that the
treatment does not have a significant effect, a Type II error (i.e., incorrectly fails to
reject the null hypothesis).
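A simulation can make the two error types concrete for students. The sketch below is only an illustration under assumed values (a normal population with mu = 50, sigma = 10, samples of n = 25, and a hypothetical 3-point treatment effect); the function name `rejection_rate` is also an assumption. When the null hypothesis is true, the proportion of rejections estimates the Type I error rate, which should be close to alpha. When there is a real effect, the proportion of non-rejections estimates the Type II error rate.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejection_rate(true_mean, mu0=50.0, sigma=10.0, n=25,
                   alpha=0.05, trials=4000, seed=0):
    """Proportion of simulated two-tailed z-tests that reject H0: mu = mu0."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    standard_error = sigma / sqrt(n)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw one sample of n scores from the (possibly treated) population.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / standard_error
        if abs(z) > critical:
            rejections += 1
    return rejections / trials


# H0 is true: every rejection is a Type I error.
type_i_rate = rejection_rate(true_mean=50.0)
# H0 is false (3-point effect): every failure to reject is a Type II error.
type_ii_rate = 1 - rejection_rate(true_mean=53.0)
print(type_i_rate, type_ii_rate)
```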

5. Students should understand the purpose of measuring effect size and power, and they should
be able to compute Cohen’s d.

A hypothesis test determines whether the mean difference obtained in a research study is
greater than is expected simply by chance (i.e., due to sampling error). The standard error
is used to determine how much difference is reasonable to expect. However, in some
cases, especially with large samples, the standard error can be very small. In these cases,
a tiny mean difference may be enough to be statistically significant. Thus, concluding
that a treatment effect is “significant” does not tell you anything about the actual size of
the effect and does not imply that the effect is large. To gain information about the size of
the treatment effect, it is recommended that researchers also report a measure of effect
size.

Cohen’s d is a measure of effect size. It is computed by dividing the obtained mean
difference by the standard deviation. Thus, a d value of 0.50 indicates that the difference
between the sample mean (after treatment) and the original population mean (before
treatment) is equal to one-half of a standard deviation.
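The contrast between statistical significance and effect size can be shown with a short computation. The example below uses hypothetical values (a 1-point mean difference with sigma = 15) and hypothetical function names; it shows that the z-score grows with sample size while Cohen's d stays the same.

```python
from math import sqrt


def cohens_d(sample_mean, mu, sigma):
    """Cohen's d: the mean difference measured in standard deviation units."""
    return (sample_mean - mu) / sigma


def z_score(sample_mean, mu, sigma, n):
    """z-score: the mean difference measured in standard error units."""
    return (sample_mean - mu) / (sigma / sqrt(n))


# Hypothetical 1-point treatment effect, sigma = 15, two sample sizes.
for n in (25, 10_000):
    print(n, round(z_score(51, 50, 15, n), 2), round(cohens_d(51, 50, 15), 3))
# 25 0.33 0.067
# 10000 6.67 0.067
```

With n = 10,000 the z-score is far beyond 1.96, so the result is significant, yet d = 0.067 shows that the effect is very small.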

6. Students should be able to incorporate a directional prediction into the hypothesis test and
conduct a directional (one-tailed) test.

If the expected treatment effect is an increase in scores, the null hypothesis for a
directional test simply states that there is no increase (no effect). Sample data that show
an increase (large values in the right-hand tail) tend to refute this null hypothesis, thus the
critical region consists entirely of values in one tail of the distribution.
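The one-tailed decision rule can be sketched as follows. This is an illustrative example only: the function name `one_tailed_reject` and the `direction` parameter are assumptions, and the z-scores in the usage lines are hypothetical.

```python
from statistics import NormalDist


def one_tailed_reject(z, alpha=0.05, direction="increase"):
    """Decide a directional test: the critical region lies in one tail only."""
    boundary = NormalDist().inv_cdf(1 - alpha)  # all of alpha in a single tail
    if direction == "increase":
        return z > boundary       # critical region in the right-hand tail
    if direction == "decrease":
        return z < -boundary      # critical region in the left-hand tail
    raise ValueError("direction must be 'increase' or 'decrease'")


print(one_tailed_reject(1.80))                  # True: beyond the one-tailed boundary
print(one_tailed_reject(-1.80))                 # False: the change is in the wrong direction
print(abs(1.80) > NormalDist().inv_cdf(0.975))  # False: not beyond the two-tailed boundary
```

The same z = 1.80 is significant with a one-tailed test but not with a two-tailed test at the same alpha level.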

7. Students should understand the concept of power and the factors that affect it.

The power of a hypothesis test is the probability that the test will reject the null
hypothesis when there is a real treatment effect. As the size of the treatment effect
increases, the power of the test also increases. Other factors that increase power are
increasing the sample size, increasing the alpha level, and switching from a two-tailed to
a one-tailed test.
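The factors listed above can be explored with a short power calculation. The sketch below is a simplified illustration, assuming a positive treatment effect and counting only the upper-tail critical region (the tiny probability in the opposite tail is ignored). The function name `power` is hypothetical, and the values mu = 40 and sigma = 12 are inferred from the standard errors and critical boundary given in the answer to Problem 18, not stated in the text.

```python
from math import sqrt
from statistics import NormalDist


def power(mu0, effect, sigma, n, alpha=0.05, tails=2):
    """Approximate power for a positive treatment effect (upper tail only)."""
    standard_error = sigma / sqrt(n)
    z_critical = NormalDist().inv_cdf(1 - alpha / tails)
    # Sample mean that marks the critical boundary when H0 is true.
    boundary = mu0 + z_critical * standard_error
    # Distribution of sample means if the treatment adds `effect` points.
    treated = NormalDist(mu0 + effect, standard_error)
    return 1 - treated.cdf(boundary)


base = power(mu0=40, effect=6, sigma=12, n=9)
print(round(base, 2))                          # 0.32
print(round(power(40, 6, 12, 16), 2))          # 0.52 (larger sample)
print(power(40, 6, 12, 9, alpha=0.10) > base)  # True (larger alpha)
print(power(40, 6, 12, 9, tails=1) > base)     # True (one-tailed test)
print(power(40, 9, 12, 9) > base)              # True (larger effect)
```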

Other Lecture Suggestions

1. The general purpose for a hypothesis test can be demonstrated using Figure 1.2, which
introduces the concept of sampling error. In Chapter 8, we are administering a treatment to a
sample and want to determine if the observed difference between the sample mean and μ is
caused by the treatment. To justify this conclusion, however, we must demonstrate that the
observed mean difference is significantly larger than can be explained by chance or sampling
error (as in Figure 1.2). This is the job for a hypothesis test. In simple terms, the goal for a
hypothesis test is to rule out chance (random, unsystematic factors) as a plausible explanation for
the research results.

Answers to Even-Numbered Problems

2. The alpha level is a small probability value that defines the concept of “very unlikely.” The
critical region consists of outcomes that are very unlikely to occur if the null hypothesis is true,
where “very unlikely” is defined by the alpha level.

4. a. Lowering the alpha level causes the boundaries of the critical region to move farther out
into the tails of the distribution.
b. Lowering α reduces the probability of a Type I error.

6. a. The null hypothesis states that studying on an electronic screen has no effect on final
exam scores.
b. H0: μ = 77 (even with studying on a screen, the mean is still 77). H1: μ ≠ 77 (the
mean has changed). The critical region consists of z-scores beyond ±1.96. For these data,
the standard error is 2 and z = –4.5/2 = –2.25. Reject the null hypothesis. Using an
electronic screen for study does have a significant effect on final exam scores.

8. a. The null hypothesis states that participation in sports, cultural groups, and youth groups
has no effect on self-esteem. H0: µ = 50, even with participation. With n = 100, the
standard error is 1.5 points and z = 3.8/1.5 = 2.53. This is beyond the critical value of
1.96, so we conclude that there is a significant effect.
b. Cohen’s d = 3.8/15 = 0.253.
c. The results indicate that group participation has a significant effect on self-esteem, z =
2.53, p < .05, d = 0.253.

10. a. With n = 36, the standard error is 1, and z = –3/1 = –3.00. Reject H0.
b. With n = 9, the standard error is 2, and z = –3/2 = –1.50. Fail to reject H0.
c. A larger sample increases the likelihood of rejecting the null hypothesis.

12. The null hypothesis states that adding a statement about sense of humor to the description
will have no effect on the ratings; the mean will still be μ = 4.0. The standard error is 0.15 and
the z-score for this sample is z = 2.80. For a one-tailed test, the critical value is z = 1.65. Reject
the null hypothesis and conclude that a sense of humor has a significant effect on attraction
rating scores.

14. With n = 4, the standard error is 0.95 and the sample mean corresponds to z = 2.79. This is
well beyond the critical boundary of 1.65. Reject the null hypothesis and conclude that the
number of 90-degree days in the past four years is significantly higher than the overall mean
of µ = 9.6.

16. a. H0: μ ≤ 1.85 (not more than average) For the males, the standard error is 0.2 and z =
3.00. With a critical value of z = 2.33, reject the null hypothesis.
b. H0: μ ≥ 1.85 (not fewer than average) For the females, the standard error is 0.24 and z =
–2.38. With a critical value of z = –2.33, reject the null hypothesis.

18. a. For a sample of n = 9 the standard error is 4 points, and the critical boundary for
z = 1.96 corresponds to a sample mean of M = 47.84. With a 6-point effect, the
distribution of sample means would be centered at μ = 46. In this distribution, the
critical boundary of M = 47.84 corresponds to z = 0.46. The power for the test is
p(z > 0.46) = 0.3228 or 32.28%.
b. For a sample of n = 16 the standard error would be 3 points, and the critical boundary for
z = 1.96 corresponds to a sample mean of M = 45.88. With a 6-point effect, the
distribution of sample means would be centered at μ = 46. In this distribution, the
critical boundary of M = 45.88 corresponds to z = –0.04. The power for the test is
p(z > –0.04) = 0.5160 or 51.60%.

20. a. With no treatment effect the distribution of sample means is centered at μ = 100 with a
standard error of 3 points. The critical boundary of z = 1.96 corresponds to a sample
mean of M = 105.88. With a 7-point treatment effect, the distribution of sample
means is centered at μ = 107. In this distribution a mean of M = 105.88 corresponds to z
= −0.37. The power for the test is the probability of obtaining a z-score greater than
−0.37, which is p = 0.6443.
b. With a one-tailed test, a critical boundary of z = 1.65 corresponds to a sample mean of M
= 104.95. With a 7-point treatment effect, the distribution of sample means is centered at
μ = 107. In this distribution a mean of M = 104.95 corresponds to z = −0.68. The power
for the test is the probability of obtaining a z-score greater than −0.68, which is p =
0.7517.

22. a. Increasing alpha increases power.
b. Changing from one- to two-tailed decreases power.

