TYPE I ERRORS, TYPE II ERRORS, AND STATISTICAL POWER

In Chapter 5 we explained that the hypothetico‐deductive method requires hypotheses to be falsifiable. For this reason, null hypotheses are developed. These null hypotheses (H0) are thus set up to be rejected in order to support the alternate hypothesis, termed HA.

The null hypothesis is presumed true until statistical evidence, in the form of a hypothesis
test, indicates otherwise. The required statistical evidence is provided by inferential statistics,
such as regression analysis or MANOVA. Inferential statistics help us to draw conclusions (or to
make inferences) about the population from a sample.

The purpose of hypothesis testing is to determine whether the null hypothesis can be rejected in favor of the alternate hypothesis. Based on the sample data, the researcher can reject the null hypothesis (and therefore accept the alternate hypothesis) with only a certain degree of confidence: there is always a risk that the inference drawn about the population is incorrect.

There are two kinds of errors (or two ways in which a conclusion can be incorrect), classified as type I errors and type II errors. A type I error occurs when the null hypothesis is rejected even though it is actually true; the probability of making this error is referred to as alpha (α). In the Excelsior Enterprises example introduced in Chapter 14, a type I error would occur if we concluded, based on the data, that burnout affects intention to leave when, in fact, it does not. The probability of a type I error, also known as the significance level, is determined by the researcher. Typical significance levels in business research are 5% (p < 0.05) and 1% (p < 0.01).

A type II error occurs when we fail to reject the null hypothesis even though the alternate hypothesis is actually true; the probability of making this error is referred to as beta (β). In the same example, a type II error would occur if we concluded, based on the data, that burnout does not affect intention to leave when, in fact, it does. For a given sample, the probability of a type II error is inversely related to the probability of a type I error: the smaller the risk of one of these types of error, the higher the risk of the other type of error.
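To make these two error types concrete, the following minimal Python sketch (our illustration, not part of the text; it assumes NumPy and SciPy are installed) simulates many studies in which the null hypothesis is true by construction. With alpha set to 0.05, roughly 5% of the simulated tests will reject the null hypothesis anyway; each of these rejections is a type I error.

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha = 0.05            # significance level chosen by the researcher
n_studies = 10_000      # number of simulated studies
false_rejections = 0

for _ in range(n_studies):
    # Both samples come from the SAME population, so H0 is true by construction.
    a = rng.normal(loc=0.0, scale=1.0, size=30)
    b = rng.normal(loc=0.0, scale=1.0, size=30)
    _, p_value = stats.ttest_ind(a, b)
    if p_value < alpha:
        false_rejections += 1   # rejecting a true H0: a type I error

# The observed rate should be close to alpha (about 0.05).
print(f"Observed type I error rate: {false_rejections / n_studies:.3f}")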

A third important concept in hypothesis testing is statistical power (1 − β). Statistical power, or just power, is the probability of correctly rejecting the null hypothesis when it is false. In other words, power is the probability that statistical significance will be indicated if it is present.
Statistical power depends on:

1. Alpha (α): the statistical significance criterion used in the test. If alpha moves closer to zero
(for instance, if alpha moves from 5% to 1%), then the probability of finding an effect when
there is an effect decreases. This implies that the lower the α (i.e., the closer α moves to zero)
the lower the power; the higher the alpha, the higher the power.
2. Effect size: the effect size is the size of a difference or the strength of a relationship in the population: a large difference (or a strong relationship) in the population is more likely to be detected than a small difference (or a weak relationship).
3. The size of the sample: at a given level of alpha, a larger sample produces more power, because it yields more accurate parameter estimates and thus a higher probability of detecting the effect we are looking for. However, increasing the sample size can also lead to too much power, in the sense that even trivially small effects will turn out to be statistically significant.

Along these lines, there are four interrelated components that affect the inferences you might
draw from a statistical test in a research project: the power of the test, the alpha, the effect size,
and the sample size. Given the values for any three of these components, it is thus possible to
calculate the value of the fourth. Generally, it is recommended to establish the power, the alpha,
and the required precision (effect size) of a test first, and then, based on the values of these
components, determine an appropriate sample size.
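As an illustration, the following sketch uses the TTestIndPower class from the statsmodels library to solve for whichever of the four components is left unspecified. The particular values (alpha = 0.05, power = 0.80, a "medium" effect size of Cohen's d = 0.5) are conventional choices for illustration, not values taken from the text.

from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Establish power, alpha, and effect size first, then solve for the sample
# size (per group) of an independent samples t-test.
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(f"Required sample size per group: {n_per_group:.1f}")   # about 64

# Leaving a different argument unspecified solves for that component instead,
# e.g., the power achieved with only 30 participants per group:
achieved_power = analysis.solve_power(effect_size=0.5, alpha=0.05, nobs1=30)
print(f"Power with n = 30 per group: {achieved_power:.2f}")   # about 0.48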

CHOOSING THE APPROPRIATE STATISTICAL TECHNIQUE

After you have selected an acceptable level of statistical significance to test your
hypotheses, the next step is to decide on the appropriate method to test the hypotheses. The
choice of the appropriate statistical technique largely depends on the number of (independent and
dependent) variables you are examining and the scale of measurement (metric or nonmetric) of
your variable(s). Other aspects that play a role are whether the assumptions of parametric tests
are met and the size of your sample.

Univariate statistical techniques are used when you want to examine two‐variable
relationships. For instance, if you want to examine the effect of gender on the number of candy
bars that students eat per week, univariate statistics are appropriate. If, on the other hand, you are
interested in the relationships between many variables, such as in the Excelsior Enterprises case,
multivariate statistical techniques are required. The appropriate univariate or multivariate test
largely depends on the measurement scale you have used, as Figure 15.1 illustrates.

Chi‐square analysis was discussed in the previous chapter. This chapter will discuss the other techniques listed in Figure 15.1. Note that some techniques are discussed more elaborately than others; a detailed discussion of all these techniques is beyond the scope of this book.

Testing a hypothesis about a single mean

The one sample t‐test is used to test the hypothesis that the mean of the population from
which a sample is drawn is equal to a comparison standard. Assume that you have read that the
average student studies 32 hours a week. From what you have observed so far, you think that
students from your university (the population from which your sample will be drawn) study
more. Therefore, you ask 20 classmates how long they study in an average week. The average
study time per week turns out to be 36.2 hours, 4 hours and 12 minutes more than the study time
of students in general. The question is: is this a coincidence?

In the above example, the average for the sample of students from your university differs from that of the typical student. What you want to know, however, is whether your fellow students come from a different population than the rest of the students. In other words, did you select a group of motivated students by chance? Or is there a “true” difference between students from your university and students in general?

In this example the null hypothesis is:

H0: The number of study hours of students from our university is equal to the number of study
hours of students in general.
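To show what this test looks like in practice, here is a minimal Python sketch using scipy.stats.ttest_1samp. The 20 observations are hypothetical values we constructed so that their mean equals the 36.2 hours in the example; they are not data from the text.

import numpy as np
from scipy import stats

# Hypothetical responses of 20 classmates (weekly study hours),
# constructed so that the sample mean is 36.2.
study_hours = np.array([
    38, 38, 31, 35, 40, 36, 33, 39, 41, 37,
    34, 37, 30, 40, 36, 35, 39, 32, 37, 36,
])

comparison_standard = 32  # average weekly study hours of students in general

t_stat, p_value = stats.ttest_1samp(study_hours, popmean=comparison_standard)
print(f"sample mean = {study_hours.mean():.1f} hours")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# If p is below the chosen significance level (e.g., 0.05), we reject H0 and
# conclude that students at our university study a different number of hours
# than students in general.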

Figure 15.1 Overview of univariate and multivariate statistical techniques:

Univariate techniques:

Testing a hypothesis about a single mean:
   Metric data: one sample t-test
   Nonmetric data: chi-square

Testing hypotheses about two means:
   Independent samples:
      Metric data: independent samples t-test
      Nonmetric data: chi-square; Mann-Whitney U test
   Related samples:
      Metric data: paired samples t-test
      Nonmetric data: chi-square; Wilcoxon; McNemar

Testing hypotheses about several means:
   Metric data: one-way analysis of variance
   Nonmetric data: chi-square

Multivariate techniques:

One metric dependent variable: analysis of variance and covariance; multiple regression analysis; conjoint analysis
One nonmetric dependent variable: discriminant analysis; logistic regression
More than one metric dependent variable: multivariate analysis of variance; canonical correlation
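As a small illustration of this decision logic, the following hypothetical helper (our sketch; the function and the data are not from the text) applies Figure 15.1 to the two independent samples case: metric data that meet the parametric assumptions get an independent samples t-test, while nonmetric (or otherwise non-normal) data get the Mann-Whitney U test.

from scipy import stats

def compare_two_independent_groups(a, b, metric=True):
    """Return (test name, statistic, p-value) for two independent samples."""
    if metric:
        stat, p = stats.ttest_ind(a, b)
        return "independent samples t-test", stat, p
    stat, p = stats.mannwhitneyu(a, b, alternative="two-sided")
    return "Mann-Whitney U test", stat, p

# Hypothetical data: candy bars eaten per week by male and female students.
men = [4, 6, 5, 7, 3, 5, 6, 4]
women = [3, 2, 4, 3, 5, 2, 3, 4]

name, stat, p = compare_two_independent_groups(men, women, metric=True)
print(f"{name}: statistic = {stat:.2f}, p = {p:.4f}")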
