Unit 5 Tutorials: Hypothesis Testing With Z-Tests, T-Tests, and ANOVA
Sampling Distributions
Hypothesis Testing
Statistical Significance
Type I/II Errors
Significance Level and Power of a Hypothesis Test
One-Tailed and Two-Tailed Tests
Test Statistic
Pick Your Inference Test
T-Tests
How to Find a Critical T Value
How to Find a P-Value from a T-Test Statistic
Confidence Intervals Using the T-Distribution
© 2023 SOPHIA Learning, LLC. SOPHIA is a registered trademark of SOPHIA Learning, LLC. Page 1
Calculating Standard Error of a Sample Mean
Analysis of Variance/ANOVA
One-Way ANOVA/Two-Way ANOVA
Chi-Square Statistic
Chi-Square Test for Goodness-of-Fit
Chi-Square Test for Homogeneity
Chi-Square Test for Association and Independence
WHAT'S COVERED
This tutorial will explain the distinction between sample statistics and population parameters, with a
review of sampling distributions. Our discussion breaks down as follows:
1. Sample Statistics
2. Population Parameters
1. Sample Statistics
When you take a sample, it is important to try to obtain values that are accurate and represent the true values
for the population. A measure of an attribute of a sample is called a sample statistic.
EXAMPLE In election season, suppose we took a simple random sample of 500 people from a
town of 10,000 and found that in this particular poll, 285 of those 500 plan to vote for Candidate Y.
That would mean that our best guess for the proportion of the town that will vote for Candidate Y,
when the election actually does happen, is 285 out of 500, or 57%. This 57% is a sample statistic.
We don't know the real proportion of people who will vote for Candidate Y. We will only know that
after election day. For now, though, this is our best guess as to the proportion that will vote for
Candidate Y. We are using the results of our sample to estimate the value for the population.
In general, the following notations are for the sample statistics that we generate most often. The sample
proportion is shown as p-hat. The sample mean is shown as x-bar. Lastly, a sample standard deviation is
shown as s.
Sample proportion = p̂ = (number in the sample with the attribute) / (sample size)
Sample mean = x̄ = (sum of the sample values) / (sample size)
HINT
Basically, the sample mean is the sum of a certain attribute of a sample divided by the sample size.
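As a quick illustration, here is how those two statistics are computed for the election example. The voter ages in the second half are made-up values, added only to show the sample-mean arithmetic:

```python
# Sample proportion from the election example:
# 285 of 500 sampled voters plan to vote for Candidate Y.
votes_for_y = 285
sample_size = 500

p_hat = votes_for_y / sample_size
print(p_hat)  # 0.57

# The sample mean follows the same pattern: the sum of the values
# divided by the sample size (these ages are hypothetical data).
ages = [34, 29, 51, 45, 38]
x_bar = sum(ages) / len(ages)
print(x_bar)  # 39.4
```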
TERMS TO KNOW
Sample Statistic
A measure of an attribute of a sample.
Sample Mean
A mean obtained from a sample of a given size. Denoted as x̄.
2. Population Parameters
A sample statistic is a measurement from a sample, and a population parameter is the corresponding
measurement for the population. A statistic is something we can compute from a sample; a parameter is not.
The only way to determine a parameter exactly is to take a census.
EXAMPLE In our previous example, the sample proportion was 57%. The population proportion,
however, is unknown; we won't know it until election day. In applied statistics, it is often a goal to use
sample statistics to better understand unknowable population parameters.
Measure Statistic Parameter
Proportion p̂ p
Mean x̄ μ
Standard deviation s σ
A population proportion is denoted as p (without the hat). A population mean is denoted as the Greek letter
mu, and a population standard deviation is shown with the Greek letter sigma.
Population proportion = p = (number in the population with the attribute) / (population size)
Population mean = μ = (sum of all values in the population) / (population size)
HINT
Population mean is basically the sum of a certain attribute of a population divided by the population size.
IN CONTEXT
The mean GPA of all 2,000 students at a school is 2.9, while the mean GPA of a sample of 50
students is 3.1.
Based on the information provided, we can identify the following parameters and statistics:
Population size = 2,000
Population mean = μ = 2.9
Sample size = 50
Sample mean = x̄ = 3.1
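A short sketch can make the statistic/parameter distinction concrete. The 2,000 GPAs below are simulated stand-ins for the school in the example, so the exact values are hypothetical; the point is that the parameter needs the whole population, while the statistic needs only the sample:

```python
import random
import statistics

random.seed(1)

# A made-up "population" of 2,000 student GPAs (hypothetical data,
# generated so the population mean lands near the 2.9 in the example).
population = [round(random.uniform(1.8, 4.0), 2) for _ in range(2000)]

# The parameter requires every member of the population (a census).
mu = statistics.mean(population)

# The statistic uses only a sample of 50 students.
sample = random.sample(population, 50)
x_bar = statistics.mean(sample)

# The statistic estimates the parameter.
print(round(mu, 2), round(x_bar, 2))
```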
TERMS TO KNOW
Population Parameters
Summary values for the population. These are often unknown.
Population Mean
A mean for all values in the population. Denoted as μ.
SUMMARY
Statistics are sample measures that we can use to estimate parameters, which are the corresponding
population measures. It's important to remember that this only works when the sampling is carried out
well. For instance, if there's bias, then the statistics won't accurately reflect the population measures.
Good luck!
TERMS TO KNOW
Population Mean
A mean for all values in the population. Denoted as μ.
Population Parameters
Summary values for the population. These are often unknown.
Sample Mean
A mean obtained from a sample of a given size. Denoted as x̄.
Sample Statistics
Summary values obtained from a sample.
Sampling With or Without Replacement
by Sophia
WHAT'S COVERED
This tutorial will cover sampling, both with and without replacement, and when sampling without
replacement can be treated as nearly independent.
Typically, one big requirement for statistical inference is that the individuals, the values from the sample, are
independent: one doesn't affect any of the others. When sampling with replacement, each trial is
independent.

EXAMPLE Consider a standard deck of 52 cards. On the first draw, the probability of selecting a spade
is 13 out of 52, or one-fourth. Suppose you pull the Ten of Spades, but then you put it back into the deck.
Now, what's the probability of a spade on the second draw?

It's one-fourth again. It's the same 52 cards, so you have the same likelihood of selecting a spade.
TERM TO KNOW
Sampling With Replacement
A sampling plan where each observation that is sampled is replaced after each time it is sampled,
resulting in an observation being able to be selected more than once.
EXAMPLE You wouldn't call a person twice for their opinion in a poll, so you don't put someone
back into the population to see if you can sample them again.
Most situations involve sampling without replacement, which means that each observation is not put
back once it's selected; once selected, it's out and cannot be selected again.
EXAMPLE Let's go back to the example with the standard deck of 52 cards. What is the probability
that you select a spade on the first draw?
On the first draw, you have all 52 cards available, so the probability of drawing a spade is 13 out of 52, or
one-fourth, as we had found before.
Suppose you drew the Ten of Spades and did not place it back in the deck of cards. Now, what's the
probability of a spade on the second draw?

Now there are only 12 spades left out of 51 cards, so the probability of a spade on the second draw is
12/51, or about 0.235, which is not equal to one-fourth.
This means that the first draw and the second draw are dependent. The probability of a spade on the
second draw changed after knowing that you got a spade on the first draw and did not replace it before
drawing again.
⭐ BIG IDEA
Even though the sampling that happens in real life doesn't technically fit the definition for independent
observations, there's going to be a workaround.
Suppose that your population was very large. Suppose you had four decks of cards, totaling 208 different
cards.
What is the probability of drawing a diamond?
There are 52 diamonds out of 208 cards, so the probability of a diamond on the first draw is one-fourth,
the same as if there were one deck.

Suppose the worst case scenario happened in terms of independence, and every card you picked was
the same suit. Take four diamonds from the group and do not replace them.
The larger population has an effect now. The probability is about 0.24, which is different from
0.25, but not dramatically, even after five draws. The probability of a diamond didn't change particularly
much from the first to the last draw.
When you sample without replacement, if the population is large enough, then the probabilities don't shift
very much as you sample. The sampling without replacement becomes almost independent because the
probabilities don't change very much.
The question is, when is the population large enough? How large is considered a large population? You're
going to institute a rule.
CONCEPT TO KNOW
For independence, a large population is going to be at least 10 times larger than the sample.
If that's the case, then you're going to say that the probabilities don't shift very much when you sample "n"
items from the population. Therefore, you can treat the sampling as being almost independent.
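A small sketch can check both the worst-case card probabilities from above and the 10-times rule (the helper function name is our own):

```python
from fractions import Fraction

# Worst case: four diamonds drawn and not replaced before the fifth draw.
# One deck: 13 diamonds in 52 cards.
one_deck_fifth_draw = Fraction(13 - 4, 52 - 4)     # 9/48 = 0.1875
# Four decks: 52 diamonds in 208 cards.
four_decks_fifth_draw = Fraction(52 - 4, 208 - 4)  # 48/204, about 0.235

print(float(one_deck_fifth_draw), float(four_decks_fifth_draw))

# The rule of thumb: treat draws as nearly independent when the
# population is at least 10 times the sample size.
def nearly_independent(population_size: int, sample_size: int) -> bool:
    return population_size >= 10 * sample_size

print(nearly_independent(208, 5))   # True: 208 >= 50
print(nearly_independent(52, 10))   # False: 52 < 100
```

With four decks, the fifth-draw probability stays close to one-fourth; with one deck it drifts much further, which is exactly what the 10-times rule guards against.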
TERM TO KNOW
Sampling Without Replacement
A sampling plan in which each observation is not replaced after it is sampled, meaning that no
observation can be selected more than once.
SUMMARY
Sampling with replacement is the gold standard, in a sense. It always creates independent trials. The
probability of particular events doesn't change at all from trial to trial. However, in real life, when you
sample without replacement, the probabilities do change. Your workaround is that if the
population from which you're sampling is at least 10 times larger than the sample that you're drawing,
the trials can be considered nearly independent.
Good luck!
TERMS TO KNOW
Sampling With Replacement
A sampling plan where each observation that is sampled is replaced after each time it is sampled,
resulting in an observation being able to be selected more than once.
Sampling Without Replacement
A sampling plan in which each observation is not replaced after it is sampled, meaning that no
observation can be selected more than once.
Sampling Error and Sample Size
by Sophia
WHAT'S COVERED
This tutorial will focus on sampling error and how sample size relates to sampling error. Our
discussion breaks down as follows:
1. Sampling Error
2. The Effect of Sampling Size on Sampling Error
1. Sampling Error
Sampling error simply relates to the variability within the sampling distribution. It is the amount by which the
sample statistic differs from the population parameter.
EXAMPLE Suppose that you have taken sampling distributions of several sample sizes from a spinner
whose outcomes follow this distribution: the number 1 occurs about three-eighths of the time, 2 occurs about
one-eighth of the time, 3 occurs about two-eighths of the time, and 4 also occurs about two-eighths of the
time.
These are the different sampling distributions.
You can see that their means are all the same. However, you can also notice that their standard
deviations, which are the lengths of the arrows, decrease as the sample size increases. That means that
the larger the sample, the closer on average the sample statistic will be to the right answer. This also
means that you will be closer to the population mean, represented by the blue line down the middle of
the graphs above.
What you'll notice is that some of the sample means from samples of size 4 are way up near four or down
near one, when the true population mean is two and three-eighths. Meanwhile, when you look at samples
of size 20, the vast majority of these samples are between two and three--very close to the population
mean of two and three-eighths. So, the distribution of sample means has a smaller standard deviation
with a larger sample size.
⭐ BIG IDEA
The sampling error is an amount by which the sample statistic, like a sample mean, is off from the
population parameter (a fixed value that we're trying to estimate). With larger samples, this sampling error
decreases.
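You can see this shrinking sampling error in a quick simulation of the spinner (a sketch; the number of repetitions and the random seed are arbitrary choices):

```python
import random
import statistics

random.seed(42)

# The spinner from the example: 1 occurs 3/8 of the time, 2 occurs 1/8,
# and 3 and 4 each occur 2/8.
spinner = [1, 1, 1, 2, 3, 3, 4, 4]

def sample_means(n, reps=5000):
    """Draw `reps` samples of size n and return the sample means."""
    return [statistics.mean(random.choices(spinner, k=n)) for _ in range(reps)]

means4 = sample_means(4)
means20 = sample_means(20)

# Both distributions center near the population mean of 2.375, but the
# spread (the sampling error) shrinks as the sample size grows.
print(round(statistics.stdev(means4), 2))
print(round(statistics.stdev(means20), 2))
```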
When we calculate a margin of error, we're approximating the sampling error. We are saying that our
sample statistic is probably within a certain margin of error of the right answer, which we don't know.
EXAMPLE Think of a poll that states that 60% of people are going to vote for a particular
candidate for office, reported with a margin of error of 7%. This is saying that the sample gave us
60%, and the real population proportion of people who will vote for that candidate is probably within
7 percentage points of that, somewhere between 53% and 67%.
TERMS TO KNOW
Sampling Error
The amount by which the sample statistic differs from the population parameter.
Sample Size
The size of a sample of a population of interest.
2. The Effect of Sampling Size on Sampling Error
Through no fault of your own, sampling error occurs whenever you use a statistic, like a sample mean or
sample proportion, to estimate a parameter, like a population mean or population proportion. It's important
to note from the previous sampling distributions, though, that since the sampling error decreases as the
sample size increases, you would like as large a sample as possible.
HINT
Sometimes getting a large sample is precluded by practical concerns like money or time. Perhaps you
simply don't have the money or time to take a large sample, so you are confined to a small sample. That is
fine--just keep in mind that you would like a larger sample if you can get one.
An increased sample size must be coupled with well-collected data; a large sample size does not rescue
poorly collected data. If your data are biased, you can't simply double the sample size and assume that
everything will be okay because there will be less sampling error. It doesn't work that way.
If the questions are poorly worded or there's non-response or other biases like response bias, then the data
aren’t going to become any more accurate. They're not going to accurately approach the population
parameters that you're trying to estimate by taking a larger sample.
⭐ BIG IDEA
Once you've collected your data poorly, you might as well throw it out. An increased sample size does not
rescue it.
SUMMARY
Sample statistics estimate population parameters. They do it more accurately when the sample size is
large. Often, we don't know what the population parameter is, which is why we've taken the sample in
the first place--to try and estimate the parameter. If the data were properly collected and the sample
size is large--which is ideal--you can be fairly sure that the statistic that we get is close to the right
answer, the parameter for the population. When you calculate a margin of error, you're approximating
the sampling error.
Good luck!
TERMS TO KNOW
Sample Size
The size of a sample of a population of interest.
Sampling Error
The amount by which the sample statistic differs from the population parameter.
Distribution of Sample Means
by Sophia
WHAT'S COVERED
This tutorial will cover the distribution of sample means. Our discussion breaks down as follows:
1. Distribution of Sample Means
a. Mean
b. Standard Deviation
c. Shape
Suppose you spun the spinner from the previous tutorial four times to obtain a sample mean. The first spin
was a 2, the second was a 4, the third was a 3, and the fourth was a 1. The sample mean, then, would be
2.50. There are many possible samples of size 4 that could be taken from this spinner, and many possible
means that could arise from those samples, as shown below:
Sample → Sample Mean
x̄ of {2, 4, 3, 1} = 2.50
x̄ of {1, 4, 3, 1} = 2.25
x̄ of {4, 2, 4, 4} = 3.50
x̄ of {2, 2, 3, 1} = 2.00
x̄ of {3, 1, 1, 1} = 1.50
x̄ of {1, 1, 1, 2} = 1.25
Step 1: First, take these sample means and graph them. Draw out an axis. For this spinner, it should go
from 1 to 4, because a sample can't average anything higher than four or lower than one.
Step 2: Take the average value, for example, the mean of 2.5, and put a dot at 2.5 on the x-axis, much like
a dot plot. Do this for all the sample means that you have found.
Step 3: You can keep doing this over and over again. Ideally, you would do this hundreds or thousands of
times, to show the distribution of all possible samples that could be taken of size four. Once you’ve
enumerated every possible sample of size four from this spinner, then the sampling distribution looks like
this:
On the graph, the lowest number you can get is one, and the highest number you can get is four. On the
far right of the graph is the point that represents a spin of 4 fours, {4, 4, 4, 4}. On the far left is the point
that represents a spin of 4 ones, {1, 1, 1, 1}. Notice that 4 ones happens more than 4 fours. Why is that? If
you take a look at the spinner, you'll see that there are more ones on the spinner than there are fours.
You can also notice that, since there are more ones, this actually pulls the average down a bit. The most
frequent average is 2.25, not 2.5, which would be the exact middle between 1 and 4. Therefore, this
distribution is skewed slightly to the right because the numbers on the spinner are not evenly distributed.
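Because the spinner has only eight equally likely sectors, you can enumerate every possible sample of size four and check the claim that 2.25 is the most frequent sample mean:

```python
from collections import Counter
from itertools import product

# The spinner's eight equally likely sectors: three 1s, one 2, two 3s, two 4s.
spinner = [1, 1, 1, 2, 3, 3, 4, 4]

# Enumerate every possible ordered sample of size 4 (8**4 = 4096 samples).
all_means = [sum(s) / 4 for s in product(spinner, repeat=4)]

counts = Counter(all_means)
print(counts.most_common(1))            # [(2.25, 656)] -- the most frequent mean
print(sum(all_means) / len(all_means))  # 2.375 -- the mean of all sample means
```

Note that the average of all 4,096 sample means comes out to 2.375, a fact the next section returns to.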
TERM TO KNOW
Sampling Distribution of Sample Means
The distribution of all possible sample means for samples of a given size.
1a. Mean
For the spinner above, the following are the histograms for a sample size of 1 spin, 4 spins, 9 spins, and
20 spins.
With the sampling distribution when the sample size was 1, you'll see that 1 occurs about 3/8 of the time, 2
occurs about 1/8 of the time, 3 occurs about a fourth of the time, and 4 occurs about a fourth of the time.
This produces a mean of: μ = 1(3/8) + 2(1/8) + 3(2/8) + 4(2/8) = 19/8 = 2.375
You'll notice that when the sample size is 4, the shape of the distribution of sample means is significantly
different from when the sample size was 1. However, there are some similarities and differences that you can
recognize here about all four of these sampling distributions. The similarities are their centers--all of them
are centered at 2.375. You'll notice that some of these are more tightly packed around that number--for
instance, the samples of size 20 are more tightly packed around 2.375 than the samples of size 1--but
they all are centered at that very same number.
What we can see here is that the mean of the sampling distribution of sample means is the same as the
mean for the population. In this case, it was 2.375.
FORMULA
Mean of a Distribution of Sample Means: μ_x̄ = μ
1b. Standard Deviation

Notice the arrows on the first distribution are very wide, and they seem to diminish in size as each
distribution is graphed. When we get to the lowest distribution where the sample size was 20, its spread
is much, much less.
The rule that's being followed is that the standard deviation of a distribution of sample means is the
standard deviation of the population divided by the square root of sample size.
FORMULA
Standard Deviation of a Distribution of Sample Means (Standard Error): σ_x̄ = σ / √n
What that indicates is that when the sample size is 4, the standard deviation of that sampling distribution
of sample means is going to be half as large as it was when the sample size was one. When the sample
size is 9, it's going to be a third the size of the original standard deviation. And when n is 20, it's going to
be the original standard deviation divided by the square root of 20.
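The scaling rule can be verified directly from the spinner's population distribution (a sketch; σ here is computed from the probabilities given earlier):

```python
from math import sqrt

# Spinner population: value -> probability.
dist = {1: 3/8, 2: 1/8, 3: 2/8, 4: 2/8}

mu = sum(v * p for v, p in dist.items())  # population mean, 2.375
sigma = sqrt(sum(p * (v - mu) ** 2 for v, p in dist.items()))  # population sd

# Standard error of the sample mean: sigma / sqrt(n).
for n in (1, 4, 9, 20):
    print(n, round(sigma / sqrt(n), 3))
```

For n = 4 the standard error is exactly half of σ, and for n = 9 exactly a third, matching the text.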
HINT
The standard deviation of the sampling distribution is also called the standard error.
TERMS TO KNOW
Standard Error
The standard deviation of the sampling distribution of sample means.
1c. Shape
Lastly, having measured the center and the spread, let's describe the shape of these distributions.
You'll notice that the shape becomes more and more like the normal distribution as the sample size
increases. There's a theorem that describes this, called the Central Limit Theorem.
The Central Limit Theorem states that when the sample size is large (at least 30 for most distributions
with a finite standard deviation), the sampling distribution of the sample means is approximately normal.
This means we can use the normal distribution to calculate probabilities on them, which is nice because
normal calculations are easy to do.
Therefore, it's going to be normal, or approximately normal, with a mean of the same as that of the
population, and a standard deviation equal to the standard deviation of the population divided by the
square root of sample size.
TERM TO KNOW
Central Limit Theorem
A theorem stating that when the sample size is large, the sampling distribution of sample means is
approximately normal.
SUMMARY
The distribution of sample means is called a sampling distribution of sample means. The sampling
distribution of sample means has an approximately normal sampling distribution when the sample
size is large. This is the Central Limit Theorem. The mean of the sampling distribution is the mean of
the population. The standard deviation of the sampling distribution, which is also called the standard
error, is the standard deviation of the population divided by the square root of the sample size.
Good luck!
TERMS TO KNOW
Standard Error
The standard deviation of the sampling distribution of sample means.
FORMULAS TO KNOW
Mean of a Distribution of Sample Means: μ_x̄ = μ
Standard Deviation of a Distribution of Sample Means (Standard Error): σ_x̄ = σ / √n
Distribution of Sample Proportions
by Sophia
WHAT'S COVERED
This tutorial will cover the distribution of sample proportions, which is called a sampling distribution.
Our discussion breaks down as follows:
1. Sample Proportions
a. Mean
b. Standard Deviation
c. Shape
1. Sample Proportions
Many different situations can provide you with proportions.
EXAMPLE Suppose that you were taking a poll during a political season, and you calculated the
proportion of people that were going to vote for a particular candidate.
However, proportions like this are typically sample proportions. The only way to obtain the true population
proportion, which is the parameter we're trying to estimate, is by taking a census. If you had some
binomial-type question (meaning, are you going to vote for one candidate or the other?) and you took a
census, you would be able to know the parameter.
In most cases, you only deal with samples. You will want to figure out what the distribution of sample
proportions actually looks like, which is the distribution of all possible sample proportions for a certain size, n.
EXAMPLE Consider flipping a coin ten times. Obviously, you would expect 50% heads and 50%
tails; however, it doesn't always work out exactly that way.
Suppose the first time you flipped ten coins, you got 6 heads, a proportion of 60% heads.
The next time you flipped ten coins, you got 70% heads. So it seems like the proportion of heads might
change from trial to trial, or sample to sample rather. First time, you got 60% heads in your sample. The
second time you got 70% heads in your sample. Suppose you do this a lot of times, and obtain sample
proportions of heads every time.
p̂ for HHHTHTTHHT = 0.6
p̂ for HTHHHTHTHH = 0.7
p̂ for HHTHHHTHTT = 0.6
p̂ for TTHTHTTHTH = 0.4
p̂ for TTTTHHTHHH = 0.5
p̂ for HHHHHTTTTH = 0.6
Next, you can start to graph those sample proportions on a dot plot. Take the 0.6 and graph it, and then
the 0.7, then the 0.6 again, stacking up the second dot on top of the first dot.
Repeat this process for every possible sample of size ten. Eventually, you would obtain a distribution that
looks like this:
This is the distribution of the sample proportions of heads. This is what is called a sampling distribution of
proportions.
TERM TO KNOW
Sampling Distribution of Sample Proportions
The distribution of all possible sample proportions for samples of a given size.
1a. Mean
For the scenario above, notice that it peaks at 0.5, exactly where you would expect. Also, notice that it
sort of falls in almost a normal-looking shape off to each side. Very rarely did you get all of them being
heads (a sample proportion of one) and very rarely did you get none of them being heads (a sample
proportion of zero).
Notice that the mean of the distribution of sample proportions is the value of p, which is the actual
probability of getting heads, 0.5. It centers around what the proportion, or probability, of heads is going to
be for a single trial.
FORMULA
Mean of a Distribution of Sample Proportions: μ_p̂ = p
1b. Standard Deviation
The number of successes is a binomial variable: each trial results in either a success or a failure, each trial
is independent, and all of the other requirements for a binomial setting are met. Since this is the case, when
we graph the proportion of successes, which is the number of successes divided by the sample size n, the
standard deviation will be the standard deviation of the binomial distribution divided by n.

Therefore, the standard deviation of a distribution of sample proportions is the square root of n times p times
q, divided by n. After some algebra, this simplifies to the square root of p times q over n. This is also known as
the standard error.
FORMULA
Standard Deviation of a Distribution of Sample Proportions (Standard Error): σ_p̂ = √(npq) / n = √(pq / n)

TERMS TO KNOW
Standard Error
The standard deviation of the sampling distribution of sample proportions.
1c. Shape
For a distribution of sample proportions, we have discussed that the mean is equal to the probability of
success and the standard deviation is equal to the square root of p times q over n.
The binomial connection also determines the shape. Since the sampling distribution of sample proportions
is a binomial variable divided by a constant, that is, some number of successes divided by n, the rules for its
shape follow those of the binomial distribution.
That is, it's going to be skewed to the left when the value of p is high and the sample size is low. It's going to
be skewed to the right when the probability of success is low and the sample size is low. Then, when the
sample size is large, it will be approximately normal.
Again, how large is large? When n times p is at least ten and when n times q is at least ten, the distribution of
sample proportions will be approximately normal, with the mean of p and the standard deviation of the square
root of p times q over n.
This is going to be one of our conditions for inference if you're going to use normal calculations, which you'll
want to do because they're easy to deal with. You're going to require that n times p is at least ten, and n times
q is also at least ten.
⭐ BIG IDEA
A condition for inference with a distribution of sample proportions states that n times p is at least ten and
n times q is at least ten.
np ≥ 10 AND nq ≥ 10
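Both the condition and the standard-error formula are easy to sketch in code (the function names here are our own):

```python
from math import sqrt

def normal_approx_ok(n: int, p: float) -> bool:
    """Check the condition for inference: np >= 10 and nq >= 10."""
    q = 1 - p
    return n * p >= 10 and n * q >= 10

def standard_error(n: int, p: float) -> float:
    """Standard deviation of the distribution of sample proportions."""
    return sqrt(p * (1 - p) / n)

# Ten coin flips (p = 0.5): np = 5, so a normal shape is not guaranteed.
print(normal_approx_ok(10, 0.5), round(standard_error(10, 0.5), 4))
# A poll of 500 voters with p = 0.57 easily meets the condition.
print(normal_approx_ok(500, 0.57), round(standard_error(500, 0.57), 4))
```

Notice that the ten-flip coin example from earlier actually fails the condition, which is why its dot plot is only roughly, not exactly, normal.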
SUMMARY
You've learned about the distribution of sample proportions, the standard deviation of a distribution of
sample proportions, and standard error, which is the same thing as the standard deviation of the
sampling distribution. The sampling distribution of sample proportions has an approximately normal
sampling distribution when the number of trials is large, referring to the shape. Its mean is the
proportion of successes in the population--that's the center. In addition, the standard deviation of the
sampling distribution, which is also called standard error, is the square root of the product of the
probabilities of success and failure, divided by the number of trials. That's the spread.
Good luck!
TERMS TO KNOW
Standard Error
The standard deviation of the sampling distribution of sample proportions.
FORMULAS TO KNOW
Mean of a Distribution of Sample Proportions: μ_p̂ = p
Standard Deviation of a Distribution of Sample Proportions (Standard Error): σ_p̂ = √(pq / n)
Hypothesis Testing
by Sophia
WHAT'S COVERED
This tutorial will cover the basics of hypothesis testing. Our discussion breaks down as follows:
1. Hypothesis Testing
2. Null and Alternative Hypotheses
3. Reject or Fail to Reject the Null Hypothesis
1. Hypothesis Testing
Hypothesis testing is the standard procedure in statistics for testing a hypothesis, or claim, about population
parameters.
IN CONTEXT
Suppose a Liter O'Cola company has a new Diet Liter O'Cola, which they claim is indistinguishable
from Classic Liter O'Cola. They obtain 120 individuals to do a taste test.
If their claim is true, some people will still identify the diet soda just by guessing correctly.
How many people would that be? Probably around 60 people, which is
50%: 50% would guess correctly and 50% would guess incorrectly, simply based on guessing, even
if the Diet Cola were indistinguishable from the Classic Cola.
Now, suppose that you didn't get an exact 50/50 split. Suppose 61 people correctly identified the
diet Cola. Would that be evidence against the company's claim? Well, it's more than half, but it's not
that much more than half. We would say no; sixty-one isn't that different from 60. Therefore, it's not
really evidence that more than half of people can correctly identify the diet soda.
Suppose that 102 people of the group were able to identify the diet cola correctly. Is that evidence
against the company's claim? In this case, 102 is significantly more than half. We would say that this
would be evidence that at least some of the people could taste the difference. Even if some of those
102 were guessing, it's evidence that at least some of those 102 can taste the difference.
Now, the question posed to us with the 102 is if the people were guessing randomly just by chance,
what would be the probability that we would get 102 correct answers or more? Isn't it possible that
102 out of 120 could correctly pick the diet cola just by chance? Anything is possible.
However, if this is a low probability, then the evidence doesn't really support the hypothesis of
guessing. In fact, it would appear that some people can taste the difference.
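Under the guessing hypothesis, the number of correct identifications is binomial with n = 120 and p = 0.5, so the probabilities posed above can be computed exactly (a sketch; the function name is our own):

```python
from math import comb

def prob_at_least(n: int, k: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# If all 120 tasters are guessing (p = 0.5):
print(prob_at_least(120, 61))   # about 0.46: 61 correct is unremarkable
print(prob_at_least(120, 102))  # astronomically small: strong evidence against guessing
```

Getting 61 or more correct happens nearly half the time under pure guessing, while 102 or more is so unlikely that the guessing hypothesis looks untenable.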
TERMS TO KNOW
Hypothesis Testing
The standard procedure in statistics for testing claims about population parameters.
Hypothesis
A claim about a population parameter.
2. Null and Alternative Hypotheses

Null Hypothesis: A claim about a particular value of a population parameter that serves as the starting
assumption for a hypothesis test.
Alternative Hypothesis: A claim that a population parameter differs from the value claimed in the null
hypothesis.
EXAMPLE Refer back to the competing hypotheses from above. The null hypothesis is Liter
O'Cola's claim that 50% of people will correctly select the diet cola. We state the null
hypothesis as: the true proportion of people who can correctly identify the diet soda, p, is equal to 1/2.
The suspicion is that perhaps over 50% of people will select the diet cola--some of those by chance, and
some of those because they can actually taste the difference. This is called the alternative hypothesis,
which in essence is a "something is going on here" type of assumption.
HINT
The notation is H subscript 0 for the null hypothesis (H0), and H subscript a for the alternative hypothesis
(Ha).
The null hypothesis is always an equality, and the alternative hypothesis can be expressed in several ways,
depending on the problem: with a "less than" symbol, a "greater than" symbol, or a strict "not equal
to" symbol.
TERMS TO KNOW
Null Hypothesis
A claim about a particular value of a population parameter that serves as the starting assumption for a
hypothesis test.
Alternative Hypothesis
A claim that a population parameter differs from the value claimed in the null hypothesis.
3. Reject or Fail to Reject the Null Hypothesis
In this example, if significantly more than half of the cola drinkers in our sample of 120 can correctly select the
diet soda, we would reject the null hypothesis where Liter O'Cola claims that 50% of people will correctly
select diet cola by chance.
If we reject the null hypothesis, then we are saying that we are in favor of the alternative hypothesis, which
states that there is convincing evidence that more than half of people will correctly identify the diet cola.
Now, significantly more than half is a loose term. How many is that? It was decided that 102 was probably
significant, while 61 probably wasn't that significant. We'll leave that definition for another time. On the other
hand, if not significantly more than half of the participants select the diet soda, then you would fail to reject the
null hypothesis. For instance, the 61 is not significantly more than half of the participants, and so you'd fail to
reject the null hypothesis.
HINT
Notice that you don't say you "accept" the null hypothesis. Why not? Why do you fail to reject the null
hypothesis rather than accept it? There's a very good reason for that.
When you run an experiment like this, you start by assuming the null hypothesis and try to find evidence
against it. If there isn't strong enough evidence to reject it, then all you can do is not reject it. You haven't
proven that the null hypothesis is true; you just haven't presented strong enough evidence to prove it false.
SUMMARY
You learned about the hypotheses in the hypothesis test: the null and alternative hypotheses. You pit
those against each other and calculate probabilities in order to make a decision about the population.
Hypothesis testing involves several steps. You start by stating your assumption about the population,
which is the null hypothesis, denoted H subscript 0. You then determine whether the evidence gathered
contradicts that assumption, leading you to reject the null hypothesis in favor of the alternative
hypothesis, H subscript a. You can calculate conditional probabilities by asking how likely you would be
to obtain statistics at least as extreme as these from a sample if the null hypothesis were, in fact, true.
Good luck!
TERMS TO KNOW
Alternative Hypothesis
A claim that a population parameter differs from the value claimed in the null hypothesis.
Hypothesis
A claim about a population parameter.
Hypothesis Testing
The standard procedure in statistics for testing claims about population parameters.
Null Hypothesis
A claim about a particular value of a population parameter that serves as the starting assumption for a
hypothesis test.
Statistical Significance
by Sophia
WHAT'S COVERED
This tutorial will cover statistical significance, which is an important concept in hypothesis testing. Our
discussion breaks down as follows:
1. Statistical Significance
2. Practical Significance
1. Statistical Significance
When you run a significance test, you need to determine what level of departure is considered a significant
departure from what you would have expected to have happened.
IN CONTEXT
Suppose you work in research at Liter O'Cola company. They've developed a new diet cola that
they believe is indistinguishable from the classic cola. Therefore, you obtain 120 individuals to do a
taste test. If the claim is true, what percent of people should select the correct cola just by random
chance, by guessing?
Well, if Liter O'Cola's claim is correct, about 50% of people would just guess correctly and 50%
would guess incorrectly if presented with the two options. So now the question is, at what point are
we going to stop believing Liter O'Cola's claim?
Suppose 61 people were able to pick the diet cola. Is this evidence against the claim? Well, 61 is not
that different from 60, so you're going to say no. This is not significantly different from what you
would expect.
Conversely, suppose 102 people were able to pick the diet cola correctly. Would that be evidence
against the company's claim?
In this case, you would probably say yes--102 is significantly over 60, and 60 is what you would
expect had they been randomly guessing. It's fairly unusual that you would see 102 people get it
right by randomly guessing out of 120. Therefore, this is evidence that some people can taste the
difference.
This is the whole idea of statistical significance. The result of 61 out of 120 is not a significant result, meaning
that it is not evidence against the claim, or null hypothesis. Conversely, the 102 would be evidence against
the null hypothesis, because it's so much higher than what we would have expected. Statistical significance
means that you doubt that the results you obtained are due to chance.
Instead, you believe that it's part of some larger trend. For instance, in the cola example, you don't believe the
null hypothesis that people can't distinguish. You believe that the trend is that people, in fact, can distinguish.
So, if 61 people correctly identify it, you're not convinced that more than half can identify the diet cola. The
difference might be due only to chance; in fact, it probably is. On the other hand, a difference of 42 from
what you expect is probably not due to chance. That would be called statistically significant.
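This intuition can be checked directly. The sketch below, a hypothetical calculation using only the Python standard library, computes the chance of seeing at least 61 (or at least 102) correct picks out of 120 if everyone were purely guessing (p = 0.5):

```python
from math import comb

def binom_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the chance of k or more
    correct picks out of n if each pick succeeds with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 61 of 120 correct: quite likely under pure guessing, so not significant
print(binom_tail(61, 120))
# 102 of 120 correct: essentially impossible under pure guessing
print(binom_tail(102, 120))
```

At least 61 correct happens nearly half the time under pure guessing, while at least 102 correct is essentially impossible by chance, which is exactly why 102 is statistically significant and 61 is not.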
TERM TO KNOW
Statistical Significance
The statistic obtained is so different from the hypothesized value that we are unable to attribute the
difference to chance variation.
2. Practical Significance
Practical significance is whether or not something is meaningful in the real world. With practical significance,
we can ask ourselves, in practice, does this affect our lives?
It's important to make the distinction between practical significance and statistical significance. They're not
necessarily the same thing.
Suppose you had a very large sample. With a big enough sample size, even something as close to 50% as
50.1% correct guessing could be considered statistically significant, even though 50.1% is hardly different
from 50%.
The statistical significance argument is based largely on sample size and how far off from the claimed 50%
you are. If the sample size is big, you don't need to be very far off; if the sample size is small, you need
to be further off in order to claim significance. So with a big sample, you might get something like
50.1%, which is statistically significant but not practically significant.
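A quick sketch illustrates how the same 50.1% result changes status as the sample grows. The sample sizes here are made up for illustration:

```python
from math import sqrt

def z_for_proportion(p_hat, p0, n):
    """z-statistic measuring how far a sample proportion p_hat
    sits from a claimed proportion p0, in standard errors."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# The same 50.1% result drifts toward "significant" as n grows,
# even though 50.1% stays practically indistinguishable from 50%.
for n in (1_000, 100_000, 10_000_000):
    print(n, round(z_for_proportion(0.501, 0.5, n), 2))
```

At n = 1,000 the z-statistic is nowhere near the usual cutoff of about 2, but at ten million it is far past it, even though 50.1% is no more practically meaningful than before.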
IN CONTEXT
A state survey of all high school students finds that 15% of 10th graders drink regularly. A town
randomly selects 100 students and finds that 18% of their 10th graders drink regularly.
By running a statistical test at a chosen significance level, we can determine whether this result is
statistically significant.
Whether it is practically significant depends on whether it affects our lives in the real world. For this
town, even if the test found no statistical significance and the 18% result could be chance, you may still
want to act on the report, because it concerns something serious and may still have meaning in the real
world.
So, without running a test, we cannot say whether this is statistically significant, but it may well be
practically significant.
TERM TO KNOW
Practical Significance
An arbitrary assessment of whether observations reflect a practical real-world use.
SUMMARY
You learned about statistical significance and how to measure it versus practical significance. You
also learned how those two are not necessarily the same. Statistical significance is the extent to which
a sample measurement is evidence of a trend, like being able to taste the difference between regular
cola and diet cola, and whether the difference can be attributed to chance. Sometimes very small
differences can be statistically significant, though not have a lot of real-life meaning, which is practical
significance.
Good luck!
TERMS TO KNOW
Practical Significance
An arbitrary assessment of whether observations reflect a practical real-world use.
Statistical Significance
The statistic obtained is so different from the hypothesized value that we are unable to attribute the
difference to chance variation.
Type I/II Errors
by Sophia
WHAT'S COVERED
This tutorial will cover the difference between a Type I error and a Type II error in a hypothesis test.
Our discussion breaks down as follows:
HINT
1. You could fail to reject the null hypothesis that the drug is not effective.
2. You could reject the null hypothesis in favor of the alternative hypothesis that the drug is effective.
However, only one of these matches reality. Suppose these are the four different possibilities;
two of them are the correct decisions.
                                         Reality
                              Drug is effective    Drug is not effective
Decision
  Reject H0;
  decide drug is              Correct Decision
  effective

  Fail to reject H0;
  decide drug                                      Correct Decision
  isn't effective
With the two correct decisions: if the drug is effective, you should reject the null hypothesis and decide that
the drug is effective. Likewise, if the drug isn't effective, you should fail to reject the null hypothesis and
decide that there isn't enough evidence that the drug works.
The other two possibilities are considered a Type I error or a Type II error.
                                         Reality
                              Drug is effective    Drug is not effective
Decision
  Reject H0;
  decide drug is              Correct Decision     Type I Error
  effective

  Fail to reject H0;
  decide drug                 Type II Error        Correct Decision
  isn't effective
A Type I error is an error that occurs when a true null hypothesis is rejected. In the example above, a Type I
error would happen when the drug is not effective, but you decide that it is effective. The drug is not effective,
but you rejected the null hypothesis anyway. Based on your data, you thought that you had enough evidence
to reject the null hypothesis, but, in fact, the drug is not effective.
A Type II error is an error that occurs when a false null hypothesis is not rejected. As you can see in the chart
above, the drug was effective, but the data didn't make it clear enough, so you failed to reject the null
hypothesis. This incorrect decision would be considered a Type II error.
TERMS TO KNOW
Type I Error
An error that occurs when a true null hypothesis is rejected.
Type II Error
An error that occurs when a false null hypothesis is not rejected.
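One way to make the table concrete is a small simulation. This is only a sketch with invented numbers: it assumes the test rejects H0 when more than 71 of 120 subjects respond, and that the drug's true response rate is 0.65 when it actually works:

```python
import random

random.seed(42)  # make the simulated rates reproducible

def rejection_rate(true_p, n=120, cutoff=71, trials=10_000):
    """Fraction of simulated studies (n subjects each) in which more
    than `cutoff` subjects respond, i.e. in which we reject H0: p = 0.5."""
    rejections = 0
    for _ in range(trials):
        successes = sum(random.random() < true_p for _ in range(n))
        if successes > cutoff:
            rejections += 1
    return rejections / trials

# H0 true (drug no better than chance, p = 0.5): any rejection is a Type I error.
type_i_rate = rejection_rate(true_p=0.5)

# H0 false (drug actually works, p = 0.65): any failure to reject is a Type II error.
type_ii_rate = 1 - rejection_rate(true_p=0.65)

print(type_i_rate, type_ii_rate)
```

Both error rates are nonzero at once: the cutoff that keeps Type I errors rare still lets some effective drugs slip through undetected.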
The consequence of a Type I error would be approving the drug and allowing the public to have it, even
though it's not effective. You're also unleashing all the potential negative side effects that this drug might
have. There's really no upside here, only negative consequences.
With a Type II error, you would not allow the drug to go to market because you think it's not effective when, in
fact, it is. You would deny an effective drug to a public who might need it, because your data didn't show
it was effective. This is another negative consequence. These errors always have negative
consequences.
⚙ THINK ABOUT IT
Which one are you more easily able to reconcile with yourself? In this case, probably a Type II error. It
would be difficult to deal with the idea of unleashing something that might hurt people just because you
think it might be effective. Typically, you need some hard evidence--if there's not hard evidence, you
would deny the drug.
IN CONTEXT
In the criminal justice system, juries are told to presume that someone is innocent until proven guilty,
meaning the null hypothesis is that the suspect is innocent, and the prosecution has to prove its
case. What would a Type I and Type II error look like in this context?
A Type I error would be that the person is innocent, but they're convicted anyway.
A Type II error would be that the person is guilty, but the result of the trial is that they're acquitted.
Obviously, both of these are problematic, but the criminal justice system in America puts a lot of
safeguards in place to make sure that a Type I error doesn't happen very often. In fact, the criminal
justice system allows a Type II error to happen fairly frequently in order to reduce a Type I error.
You may think a Type I error is absolutely the worst thing you can do in this particular case, but it's
not always this way. Sometimes a Type II error is worse. It depends on the situation, and so you have
to analyze each situation to determine which one is a worse mistake to make.
SUMMARY
When you use a hypothesis test as a decision-making tool, you might be making an error in
your judgment. It's not that you made a mistake in the procedure; rather, the decision you reach might not
match what is really the case. A Type I error is when the null hypothesis is rejected when, in fact, it's true. A
Type II error is when the null hypothesis is not rejected when, in fact, it's false. The severity
of these errors depends on the context. In both of the examples covered in the tutorial, a Type I error
was worse. However, there are conceivably some scenarios where a Type II error might be worse.
Good luck!
TERMS TO KNOW
Type I Error
In a hypothesis test, when the null hypothesis is rejected when it is, in fact, true.
Type II Error
In a hypothesis test, when the null hypothesis is not rejected when it is, in fact, false.
Significance Level and Power of a Hypothesis
Test
by Sophia
WHAT'S COVERED
This tutorial will cover how to identify factors that influence the significance level and power of a
hypothesis test. Our discussion breaks down as follows:
1. Significance Level
a. Selecting an Appropriate Significance Level
b. Cautions about Significance Level
2. Power of a Hypothesis Test
1. Significance Level
Before we begin, you should first understand what is meant by statistical significance. When you calculate a
test statistic in a hypothesis test, you can calculate the p-value. The p-value is the probability that you would
have obtained a statistic at least as extreme as the one you got, if the null hypothesis is true. It's a
conditional probability.
Sometimes you're willing to attribute whatever difference you found between your statistic and your
hypothesized parameter to chance. If that is the case, you fail to reject the null hypothesis.
If you’re not, meaning it's just too far away from the mean to attribute to chance, then you’re going to reject
the null hypothesis in favor of the alternative.
The hypothesized mean is right in the center of the normal distribution. Anything that is considered to be too
far away--something like two standard deviations or more away--you would reject the null hypothesis.
Anything you might attribute to chance, within the two standard deviations, you would fail to reject the null
hypothesis. Again, this is assuming that the null hypothesis is true.
However, think about this: the entire curve assumes that the null hypothesis is true, yet you decide to
reject the null hypothesis anyway if the statistic you got is far away, because such a statistic would rarely
happen by chance. Technically, though, if the null hypothesis really is true, rejecting it is the wrong decision.
The amount of this error we're comfortable making is called the significance level.
The probability of rejecting the null hypothesis in error, in other words, rejecting the null hypothesis when it is,
in fact, true, is called a Type I Error.
Fortunately, you get to choose how big you want this error to be. You could have set three standard
deviations from the mean on either side as "too far away." Or, for instance, you could say you only want to be
wrong 1% of the time, or 5% of the time, meaning that you reject the null hypothesis in error that often.
This value is known as the significance level. It is the probability of making a Type I error. We denote it with
the Greek letter alpha (α).
TERM TO KNOW
Significance Level
The probability of making a Type I error. Abbreviated with the symbol alpha (α).
The alpha, in this case, is 0.05. Recall that the 68-95-99.7 rule says that 95% of values fall within two
standard deviations of the mean, meaning that 5% of values fall outside those two standard
deviations. You will reject the null hypothesis 5% of the time: the most extreme 5% of cases are the ones
you are not willing to attribute to chance variation from the hypothesized mean.
1a. Selecting an Appropriate Significance Level
The level of significance you choose will also depend on the type of experiment that you're doing.
EXAMPLE Suppose you are trying to bring a drug to market. You want to be extremely cautious
about how often you reject the null hypothesis. You will reject the null hypothesis if you’re fairly certain
that the drug will work. You don't want to reject the null hypothesis of the drug not working in error,
thereby giving the public a drug that doesn't work.
If you want to be really cautious and not reject the null hypothesis in error very much, you'll choose a low
significance level, like 0.01. This means that only the most extreme 1% of cases will have the null hypothesis
rejected.
If you don't believe a Type I error is going to be that bad, you might allow the significance level to be
something higher, like 0.05 or 0.10. Those still seem like low numbers, but think about what they mean:
one out of every 20, or one out of every 10, samples of that particular size will have the null
hypothesis rejected even when it's true. Are you willing to make that mistake once every 20 samples, or
once every 10? Or only once every 100? Setting this value really low reduces the probability that
you make that error.
1b. Cautions about Significance Level
It is important to note that you don't want the significance level to be too low. The problem with setting it really
low is that as you lower the probability of a Type I error, you actually increase the probability of a Type II error.
A Type II Error is failing to reject the null hypothesis when a difference does exist. This reduces the power or
the sensitivity of your significance test, meaning that you will not be able to detect very real differences from
the null hypothesis when they actually exist if your alpha level is set too low.
Consider the curves below, where μ0 is the hypothesized mean and μA is the actual mean. The actual
mean is different from the null hypothesis' mean; therefore, you should reject the null hypothesis. What you
end up with is a curve identical in shape to the original normal curve, but centered at the actual mean.
The curve below illustrates the way the data is actually behaving, versus the way you
thought it should behave based on the null hypothesis. The cutoff, the "line in the sand," still exists.
Because we should be rejecting the null hypothesis, the area in orange, where we fail to reject it,
represents a mistake.
Failing to reject the null hypothesis is wrong if the actual mean differs from the null
hypothesis' mean. This is a Type II error.
Now, the area in yellow on the other side, where you correctly reject the null hypothesis when a
difference is present, is called the power of a hypothesis test. Power is the probability of correctly
rejecting the null hypothesis, that is, rejecting when the null hypothesis is false, which is a correct decision.
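Power can be sketched numerically for a right-tailed z-test. The numbers below (hypothesized mean 100, actual mean 103, sigma 10, n = 36) are invented for illustration:

```python
from statistics import NormalDist

def power_one_tailed(mu0, mu_a, sigma, n, alpha=0.05):
    """Power of a right-tailed z-test of H0: mu = mu0, i.e. the
    probability of rejecting H0 when the true mean is actually mu_a."""
    z_star = NormalDist().inv_cdf(1 - alpha)   # critical value for this alpha
    se = sigma / n ** 0.5                      # standard error of x-bar
    cutoff = mu0 + z_star * se                 # the "line in the sand"
    # Probability that x-bar lands beyond the cutoff under the true mean
    return 1 - NormalDist(mu_a, se).cdf(cutoff)

# Lowering alpha (a stricter test) shrinks the power.
print(power_one_tailed(100, 103, 10, 36, alpha=0.05))
print(power_one_tailed(100, 103, 10, 36, alpha=0.01))
```

Notice that dropping alpha from 0.05 to 0.01 roughly halves the power here, which is exactly the trade-off between Type I and Type II errors described above.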
TERM TO KNOW
Power
The probability of correctly rejecting the null hypothesis; that is, rejecting it when it is false.
SUMMARY
The probability of a Type I error is a value that you get to choose in a hypothesis test. It is called the
significance level and is denoted with the Greek letter alpha. Choosing a large significance level allows
you to reject the null hypothesis more often, though sometimes you reject it in error: you say a
difference exists when it really doesn't. If you choose a very small one, you reject the null hypothesis
less often, and sometimes you fail to reject it in error instead. There's no foolproof method here. Usually,
you want to keep your significance level low, such as 0.05 or 0.01. Note that 0.05 is the default choice
for most hypothesis tests.
Good luck!
TERMS TO KNOW
Significance Level
The probability of making a Type I error. Abbreviated with the symbol alpha (α).
One-Tailed and Two-Tailed Tests
by Sophia
WHAT'S COVERED
This tutorial will cover the difference between a one-tailed and a two-tailed test in a hypothesis test.
Our discussion breaks down as follows:
1. One-Tailed Test
a. Right-Tailed Test
b. Left-Tailed Test
2. Two-Tailed Test
a. One-Tailed vs. Two-Tailed Tests
1. One-Tailed Test
A one-tailed test is a test for when you have reason to believe the population parameter is higher or lower
than the assumed parameter value of the null hypothesis.
Right-Tailed Test
Left-Tailed Test
TERM TO KNOW
One-Tailed Test
A test for when you have reason to believe the population parameter is higher or lower than the
assumed parameter value of the null hypothesis.
IN CONTEXT
Suppose you have your favorite soda, Liter O'Cola, and it's come out with new Diet Liter O'Cola.
They think that it's indistinguishable from their regular cola, so they obtain 120 individuals to do the
taste test. If the claim is true, you would expect about 50%, or 60 people, to guess correctly simply
by chance, since the tastes would be indistinguishable.
However, what if some people can taste the difference? What would you expect the proportion of
people correctly selecting the diet cola to be? You would likely say some number over 50%:
more than half of the people would be able to correctly identify which cup is the diet cola.
Your null hypothesis says that p, the true proportion of people who can correctly identify the diet
cola, is 1/2. Your alternative hypothesis suspects that more than half of
people will be able to select the diet cola correctly.
Since you're only interested in testing whether or not the true proportion of people who can guess
correctly or identify which one is the diet cola is over half, this will be considered a right-tailed test, a
specific type of a one-tailed test. You don't care if it's under half. If it's under half, that actually works
in Liter O'Cola's favor.
The distribution of a right-tailed test would look similar to the following curve:
We are looking at the values higher than the assumed value, which is the section to the right of this value.
TERM TO KNOW
Right-tailed Test
A hypothesis test where the alternative hypothesis only states that the parameter is higher than the
stated value from the null hypothesis.
IN CONTEXT
Suppose you suspect that Liter O'Cola is under-filling their bottles. Unsurprisingly, the bottles are
supposed to contain one liter of cola.
This is another example of a one-tailed test, more specifically a left-tailed test. The null hypothesis
says that the average amount of cola in the bottle is one liter for all the bottles that Liter O'Cola
makes. The alternative is that perhaps it's less than one liter--they're under-filling the bottles. The
average amount is less than one liter.
If the average amount, μ, was greater than one liter, you wouldn't really have a claim against Liter
O'Cola because you're actually getting more soda than they claim they're providing. You're only
going to give them trouble if they're under-filling their bottles.
The distribution of a left-tailed test would look similar to the following curve:
We are looking at the values lower than the assumed value, which is the section to the left of this value.
TERM TO KNOW
Left-tailed Test
A hypothesis test where the alternative hypothesis only states that the parameter is lower than the
stated value from the null hypothesis.
2. Two-Tailed Test
A two-tailed test is for when we have reason to believe the population parameter is different from the
assumed parameter value of the null hypothesis.
IN CONTEXT
Liter O'Cola also claims 35 grams of sugar in its bottles of cola. Anything over that and the soda will
taste too sweet. Anything under that and the soda won't taste quite sweet enough. Consumers won't
get the refreshing Liter O'Cola taste that they have come to expect. We suspect that Liter O'Cola
might have altered their formula recently because it tastes different.
What do you think the null and alternative hypotheses will be here with respect to sugar?
Here, the null hypothesis is that the mean grams of sugar will be the same as it was before, 35.
What about the alternative hypothesis? If they've changed their formula, you don't know whether they
added more sugar or less. However, they're only going to be in trouble if they put in a
different amount of sugar than before. The alternative hypothesis will state that the mean grams of
sugar in the bottle is different from 35, so this is considered a two-tailed test. They're going to be in
trouble if they put in significantly more than 35 grams or significantly less than 35 grams.
The distribution of a two-tailed test would look similar to the following curve:
We are looking at the values that are much lower or much higher than the assumed value.
TERM TO KNOW
Two-tailed Test
A test for when you have reason to believe the population parameter is different from the assumed
parameter value of the null hypothesis.
Let's take a look visually at what a one-tailed test and a two-tailed test look like. This is what a one-tailed test
with a p-value of 5% would look like.
Both with a p-value of 5%.
With the one-tailed test, this would be under an alternative hypothesis that a parameter is less than a
particular number, like a mean less than 1. You end up with a single tail area of about 5%.
You're only going to get them in trouble if the result is far below what you would have expected.
With the two-tailed test, you are interested in the probability of getting a result at least as extreme, on
either side of the value, as the one you got from your sample. It could be either extremely low or extremely
high: anything extremely different from what you would have expected.
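The two pictures correspond to two different p-value calculations. A minimal sketch, assuming a positive z-statistic (the value 1.8 below is arbitrary):

```python
from statistics import NormalDist

def p_values(z):
    """One-tailed (right) and two-tailed p-values for a positive z-statistic."""
    one_tailed = 1 - NormalDist().cdf(z)   # area in the right tail only
    two_tailed = 2 * one_tailed            # symmetric curve: double the tail
    return one_tailed, two_tailed

one, two = p_values(1.8)
print(round(one, 3), round(two, 3))  # → 0.036 0.072
```

A z of 1.8 illustrates why the choice of test matters: it clears a 0.05 cutoff one-tailed but not two-tailed.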
SUMMARY
One-tailed tests only test whether or not there is evidence of a statistic being significantly higher or
lower than a claimed parameter, like mu or p. Two-tailed tests will test whether or not the statistic
obtained, x-bar or p-hat, is significantly different from the claimed parameter. You learned about one-
tailed tests, which have two versions, a left-tailed test, where you say in the alternative hypothesis
that it's less than a claimed parameter; and a right-tailed test, which means that it's larger than the
claimed parameter. There can also be a two-sided test, where we simply claim that the true value is
not equal to the claimed parameter.
Good luck!
TERMS TO KNOW
Left-tailed test
A hypothesis test where the alternative hypothesis only states that the parameter is lower than the
stated value from the null hypothesis.
One-tailed test
A hypothesis test where the alternative hypothesis only states that the parameter is higher (or lower)
than the stated value from the null hypothesis.
Right-tailed test
A hypothesis test where the alternative hypothesis only states that the parameter is higher than the
stated value from the null hypothesis.
Two-tailed test
A hypothesis test where the alternative hypothesis states that the parameter is different from the stated
value from the null hypothesis; that is, the parameter's value is either higher or lower than the value
from the null hypothesis.
Test Statistic
by Sophia
WHAT'S COVERED
This tutorial will cover test statistics, the values we calculate from our sample statistics when
running a hypothesis test. The tutorial will also cover how to determine whether to reject a null
hypothesis from a given p-value and significance level. Our discussion breaks down as follows:
1. Test Statistics
a. Z-Statistic for Means
b. Z-Statistic for Proportions
2. p-Value
3. Critical Values
1. Test Statistics
A test statistic is the relative distance of the statistic obtained from the sample from the hypothesized value of
the parameter in the null hypothesis. It is measured in terms of the number of standard deviations from the
mean of the sampling distribution.
When we have a hypothesized value for the parameter from the null hypothesis, the statistic we get may
differ from that number. The test statistic measures how far it is from that parameter.
⭐ BIG IDEA
FORMULA
Test Statistic

test statistic = (sample statistic − hypothesized parameter) / (standard deviation of the statistic)
TERM TO KNOW
Test Statistic
A measurement, in standardized units, of how far a sample statistic is from the assumed parameter if
the null hypothesis is true.
1a. Z-Statistic for Means
When dealing with means, we can use the following values: the sample statistic is x-bar, the hypothesized
parameter is the mean mu from the null hypothesis, and the standard deviation of x-bar is sigma divided by
the square root of n.
Therefore, the z-statistic for sample means, which you can calculate as your test statistic, is equal to x-bar
minus mu, divided by the standard deviation of x-bar.
FORMULA
Z-Statistic of Means

z = (x̄ − μ) / (σ / √n)
1b. Z-Statistic for Proportions
HINT
The standard deviation of the p-hat statistic is the square root of p times q (where q is 1 minus p)
over n.
Therefore, the z-statistic for sample proportions, which you can calculate as your test statistic, is equal to p-
hat minus the p from the null hypothesis, divided by the standard deviation of p-hat.
FORMULA
Z-Statistic of Proportions

z = (p̂ − p) / √(pq / n)
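Applying the formula to the earlier cola example (102 correct out of 120, claimed p = 0.5) gives a sense of scale; this is a small illustrative sketch:

```python
from math import sqrt

def z_proportion(p_hat, p, n):
    """z-statistic for a sample proportion p_hat against a claimed p."""
    q = 1 - p
    return (p_hat - p) / sqrt(p * q / n)

# Taste test: 102 of 120 pick the diet cola; the claim is p = 0.5.
z = z_proportion(102 / 120, 0.5, 120)
print(round(z, 2))  # roughly 7.7 standard errors above the claimed value
```

A z-statistic that large sits far beyond any common critical value, matching the earlier conclusion that 102 correct picks is statistically significant.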
2. p-Value
Both these situations have conditions under which they're normally distributed. You can use the normal
distribution to analyze and make a decision about the null hypothesis.
The normal curve below operates under the assumption that the null hypothesis is, in fact, true.
Suppose you are dealing with means. In the following graph, the parameter mean is indicated by mu, the
standard deviation of the sampling distribution is sigma over the square root of n, and your statistic x-bar
lies over to the right, as indicated. The test statistic becomes a z-score of means.
You are going to find what is called a p-value: the probability that you would get an x-bar at least as high as
the one you got, if the mean really is mu. In this particular case, it's a one-sided test.
TERM TO KNOW
P-Value
The probability that the test statistic is that value or more extreme in the direction of the alternative
hypothesis.
3. Critical Values
Another way to determine statistical significance, without using a p-value, is with what's called a critical
value. This corresponds to the number of standard deviations away from the mean that you're willing to
attribute to chance.
EXAMPLE You might say that anything within this green area here is a typical value for x-bar.
You are willing to attribute any deviations from mu to chance if it's in this green region. This is the most
typical 95 percent of values. If it's outside that region, it would be within the most unusual 5%. You would
be more willing to reject the null hypothesis in that case.
A test statistic, meaning a z-statistic, that's far from 0 provides evidence against the null hypothesis. One
way would be to say that if it's farther than two standard deviations, which means it's in the outermost 5%,
then you're going to reject the null hypothesis. If it's in the innermost 95%, you will fail to reject
the null hypothesis.
With two-tailed tests like the image above, the critical values are actually symmetric around the mean. That
means that if you use positive 2 on the right side, you would be using negative 2 on the left side.
There are some very common critical values that we use. The most common cutoff points are at 5%, 1%, and
10%, and you can see their corresponding critical values, which is the number of standard deviations away
from the mean that you're willing to attribute to chance.
Tail Area
Two-Tailed    One-Tailed    Critical Value (z*)
0.20          0.10          1.282
0.10          0.05          1.645
0.05          0.025         1.96
0.02          0.01          2.326
So, for a two-tailed test with 0.05 as your significance level, the critical value is 1.96 standard deviations away from the mean.
If you were doing a one-tailed test with 0.05 as your significance level or a two-tailed test with rejecting the
null hypothesis if it's among the most 10% extreme values, you'd use a z-statistic critical value of 1.645.
If you were doing a one-tailed test and you wanted to reject the most extreme 10% of values on one side,
you'd use 1.282 for your critical value.
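These critical values can be recovered from the inverse of the standard normal CDF. A sketch using Python's `statistics.NormalDist`:

```python
from statistics import NormalDist

def critical_value(alpha, two_tailed=True):
    """z* that cuts off the outermost `alpha` of the standard normal curve."""
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)

print(round(critical_value(0.05), 2))                    # 1.96
print(round(critical_value(0.10), 3))                    # 1.645
print(round(critical_value(0.10, two_tailed=False), 3))  # 1.282
```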
When you run a hypothesis test with the critical value, you should state it as a decision rule. For instance, you
would say something like, "I will reject the null hypothesis if the test statistic, z, is greater than 2.33". That's the
same as saying that on a right-tailed test, reject the null hypothesis if the sample mean is among the highest
1% of all sample means that would occur by chance. Note this is one-tailed because you're saying that the
rejection region is on the high side of the normal curve.
The area within the blue box is what you're not willing to attribute to chance.
The area within the red box is what you are willing to attribute to chance.
The decision rule, the area where the red and blue boxes overlap, is your line in the sand. For any test statistic
less than 2.33, you will fail to reject the null hypothesis and attribute whatever difference from mu exists to
chance. For any test statistic higher than 2.33, you will reject the null hypothesis and not attribute the
difference from mu to chance.
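That decision rule can be sketched as a small function; this one implements a right-tailed test at the 1% level, matching the z > 2.33 rule:

```python
from statistics import NormalDist

def decide(z, alpha=0.01):
    """Right-tailed decision rule: reject H0 if z exceeds the critical value z*."""
    z_star = NormalDist().inv_cdf(1 - alpha)  # about 2.33 when alpha = 0.01
    return "reject H0" if z > z_star else "fail to reject H0"

print(decide(2.5))  # reject H0
print(decide(1.8))  # fail to reject H0
```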
TERM TO KNOW
Critical Value
A value that can be compared to the test statistic to decide the outcome of a hypothesis test
SUMMARY
We learned about test statistics, both of which were z's. We also learned about p-values, which were
the probabilities that you would get a statistic as extreme as what you got by chance, and the critical
values, which are our lines in the sand whereby if we exceed that number with our test statistic, we'll
reject the null hypothesis. When we are running a hypothesis test, we convert our sample statistic
obtained (either x-bar or p-hat) into a test statistic, both of which are z's. If the sampling distribution is
approximately normal, we can use the normal distribution to determine if our sample statistic is
unusual or not--unusually high or unusually low or just unusually different--given that the null
hypothesis is true. We can decide on different critical values for different levels of "unusual", where if
our test statistic exceeds the critical value, we reject the null hypothesis--and that's our decision rule.
Good luck!
TERMS TO KNOW
Critical Value
A value that can be compared to the test statistic to decide the outcome of a hypothesis test
P-value
The probability that the test statistic is that value or more extreme in the direction of the alternative
hypothesis
Test Statistic
A measurement, in standardized units, of how far a sample statistic is from the assumed parameter if the
null hypothesis is true
FORMULAS TO KNOW
Test Statistic
z-statistic of Means
z-statistic of Proportions
Pick Your Inference Test
by Sophia
WHAT'S COVERED
This tutorial will help explain which inference test should be used based upon the data set. Our
discussion breaks down as follows:
1. Overview
2. Qualitative or Categorical Data
a. One-Proportion Z-Test
b. Chi-Squared Test for Goodness-of-Fit
c. Chi-Squared Test for Homogeneity
d. Chi-Squared Test for Association and Independence
3. Quantitative Data
a. One-Way ANOVA
b. Two-Way ANOVA
c. One-Sample T-Test Vs. One-Sample Z-Test
1. Overview
Let's take a look at how to determine what type of hypothesis testing or inference test we should perform on a
given data set. First, we need to ask ourselves if we're dealing with qualitative or quantitative data.
Type of Data | How Many Population Proportions or Population Means? | Test
Qualitative or Categorical Data | One Population Proportion | One-Proportion Z-Test; model the data using a normal distribution.
Qualitative or Categorical Data | Two or More Population Proportions | Chi-squared test; determine if we are testing for goodness of fit, homogeneity, or association and independence.
Quantitative Data | Two Population Means | Special type of student t-test, which will not be addressed in this tutorial.
Quantitative Data | Three or More Population Means | ANOVA test. If it has two or more characteristics, use a two-way ANOVA test.
Another way to determine the type of test is through this inference test decision tree, which is available to
view or download as a PDF at the end of this tutorial.
2. Qualitative or Categorical Data
2a. One-Proportion Z-Test
Suppose a claim states that 4 out of 5 dentists recommend a certain product. In a survey, 75 out of 100 dentists recommended it.
Was the claim accurate? What kind of tests are you going to use to try and figure this out?
We need to note that we're dealing with categorical data here. We're looking at dentists and if they
recommend something or don't recommend something. We're not really dealing with calculating means.
We also need to think about how many proportions we have. Here we only have one proportion: 75 out
of 100 dentists. Therefore, we're going to perform a one proportion z-test.
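A sketch of that one-proportion z-test in Python. The claimed proportion of 0.8 (4 out of 5) is an assumption, since the exact claim is not stated in this excerpt:

```python
from math import sqrt
from statistics import NormalDist

p0, p_hat, n = 0.8, 0.75, 100          # p0 = 0.8 is an assumed claim (4 out of 5)
se = sqrt(p0 * (1 - p0) / n)           # standard deviation of p-hat under H0
z = (p_hat - p0) / se                  # -1.25
p_val = 2 * NormalDist().cdf(-abs(z))  # two-tailed p-value, about 0.21

print(round(z, 2), round(p_val, 2))
```

With a p-value around 0.21, the survey would not be strong evidence against the assumed claim.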
2b. Chi-Squared Test for Goodness-of-Fit
Suppose you flip a coin 100 times and record the result of each flip:

           Heads   Tails
Expected     50      50
Observed     30      70
So, how can you tell if the coin that you're flipping is fair? And what tests should we use?
We need to consider the type of data that we're dealing with. Notice here, we have heads and tails to
record, which are categorical data because the data just falls into two categories: heads or tails.
We're also dealing with two population proportions: heads and tails. Therefore, we're going to use a
chi-squared test.
But what kind of chi-squared test should we be using? We're comparing observed data to expected data.
Because we are looking to see if the sample distribution matches the population distribution, we're going
to be using a chi-squared test for goodness-of-fit.
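A sketch of the goodness-of-fit calculation for the coin data, using only Python's standard library (for 1 degree of freedom, the chi-squared tail probability reduces to a normal-tail calculation):

```python
from math import sqrt
from statistics import NormalDist

observed = [30, 70]
expected = [50, 50]

# Chi-squared statistic: sum of (O - E)^2 / E over the categories.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))  # 16.0

# With 1 degree of freedom, a chi-squared variable is Z squared, so
# P(X > x) = P(|Z| > sqrt(x)) = 2 * (1 - Phi(sqrt(x))).
p = 2 * (1 - NormalDist().cdf(sqrt(chi2)))

print(chi2, p)  # p is far below 0.05: strong evidence the coin is not fair
```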
2c. Chi-Squared Test for Homogeneity
Suppose you want to determine the effectiveness of the flu vaccine in preventing the chance of someone
getting the flu. You gather data on 500 people where 250 had the flu vaccine, and 250 didn't get the flu
vaccine. You also record who got the flu and who did not get it.
                              Got the Flu   Did Not Get the Flu   Total
Received Flu Vaccine               115               135            250
Did Not Receive Flu Vaccine        120               130            250
Total                              235               265            500
What type of tests would you use to determine if the flu vaccine was effective or not?
We need to ask ourselves again what kind of data we are dealing with. We're looking at those that got the
flu vaccine and those who did not, as well as the number of people who caught the flu. Both are
categorical data.
Notice here that we're dealing with two population proportions, those that got the flu vaccine and those who
didn't. Therefore, we're going to use a chi-squared test again.
We are also trying to determine if the flu vaccine was effective or not across the two populations we're
considering. Because we're seeing if there is a difference in this variable across two populations,
we're going to use a chi-squared test for homogeneity.
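The homogeneity calculation can be sketched with the flu-vaccine counts; expected counts come from the row and column totals:

```python
from math import sqrt
from statistics import NormalDist

#                got flu  no flu
table = [[115, 135],   # received flu vaccine
         [120, 130]]   # did not receive flu vaccine

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Expected count for each cell: (row total * column total) / grand total.
chi2 = 0.0
for i, row in enumerate(table):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / n
        chi2 += (obs - exp) ** 2 / exp

# A 2x2 table has (2-1)*(2-1) = 1 degree of freedom.
p = 2 * (1 - NormalDist().cdf(sqrt(chi2)))

print(round(chi2, 2), round(p, 2))  # small chi2, large p: no evidence of a difference
```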
2d. Chi-Squared Test for Association and Independence
Suppose we want to determine if gender affects whether or not someone likes an apple, orange, or banana.
We need to ask ourselves what kind of data we are dealing with. In this case, we're dealing with data that
can be categorized by names--apples, oranges, and bananas, which are categorical data.
We also notice that we're dealing with two population proportions, men and women. Therefore, we're
going to use a chi-squared test.
We're trying to determine how apples, oranges, and bananas are related to each population. Because we
are looking for an association between two or more variables in a single population, we're going to use a
chi-squared test for association or independence.
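A sketch with hypothetical gender-by-fruit counts; the numbers below are made up for illustration, since the tutorial gives no data for this example:

```python
from math import exp

# Hypothetical counts (made up for illustration):
#               apple  orange  banana
table = [[20, 15, 15],   # men
         [15, 20, 15]]   # women

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

chi2 = sum((table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
           / (row_totals[i] * col_totals[j] / n)
           for i in range(len(table)) for j in range(len(table[0])))

# df = (rows - 1) * (columns - 1) = 2; for 2 degrees of freedom the
# chi-squared tail probability is exactly exp(-x / 2).
p = exp(-chi2 / 2)

print(round(chi2, 2), round(p, 2))
```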
3. Quantitative Data
3a. One-Way ANOVA
Suppose you're trying to determine if the overall standardized test scores on a given test across different
states are equal for high school students trying to enter college.
Notice that we are dealing with mean test scores here, which are quantitative data. Remember, that's the
first thing you should always ask yourself: What kind of data am I dealing with?
We're also dealing with several population means--in this case, 50 population means, one for each state.
That means we're going to be using an ANOVA f-test.
In this case, we're looking at one characteristic of the data, which is the overall test scores. Because we're
just looking at one characteristic--overall test scores--we're going to use a one-way ANOVA f-test.
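The one-way ANOVA F-statistic can be computed by hand. A sketch with three hypothetical state samples (the scores are made up for illustration):

```python
from statistics import mean

# Hypothetical test scores from three states (made up for illustration):
groups = [[80, 85, 90], [70, 75, 80], [75, 80, 85]]

k = len(groups)                             # number of groups
n = sum(len(g) for g in groups)             # total number of observations
grand = mean(x for g in groups for x in g)  # grand mean of all scores

# Between-group and within-group sums of squares:
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

# F = mean square between / mean square within.
f = (ss_between / (k - 1)) / (ss_within / (n - k))
print(f)  # compare to the F critical value with (k - 1, n - k) degrees of freedom
```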
3b. Two-Way ANOVA
Suppose you want to determine how students in different states are performing on the Math and English
sections of the exam.
We need to think about what kind of data we are dealing with. Here we're dealing with mean test scores,
which are quantitative data.
We're also dealing with multiple populations--in this case, up to 50 population means, because we're
having one for each state. Again, because we have so many population means, we're going to use an
ANOVA f-test.
In this case, we're looking at two characteristics of the data tests: test scores on the math and the English
section. So, we're going to use a two-way ANOVA f-test.
3c. One-Sample T-Test Vs. One-Sample Z-Test
Suppose we're concerned with the test scores of students in Minnesota taking a given standardized test.
Again, what kind of data are we dealing with? Here we're dealing with mean test scores, which are
quantitative data.
We're also dealing with one population mean--in this case, Minnesota's population mean. So, we're going
to use a one-sample test.
In this case, we're looking at one characteristic of the data, the overall test score. Therefore, if we don't
know the standard deviation of the entire population that took the test, we would use a one-sample t-test.
If, however, we did know the standard deviation of the population that took the test, then we would use a
one-sample z-test.
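The t-versus-z choice can be sketched as follows; the sample scores and the "known" sigma are hypothetical:

```python
from math import sqrt
from statistics import stdev

# Hypothetical Minnesota sample (scores made up for illustration):
scores = [490, 510, 520, 530, 500, 515, 505, 495, 525, 510]
mu = 500                 # hypothesized population mean
n = len(scores)
x_bar = sum(scores) / n  # sample mean: 510.0

# If the population standard deviation sigma is known, use a z-statistic:
sigma = 15               # assumed known for the z branch
z = (x_bar - mu) / (sigma / sqrt(n))

# If sigma is unknown, estimate it with the sample standard deviation s
# and use a t-statistic with n - 1 degrees of freedom instead:
s = stdev(scores)
t = (x_bar - mu) / (s / sqrt(n))

print(round(z, 2), round(t, 2))  # 2.11 2.45
```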
SUMMARY
This lesson explored how to perform different types of hypothesis or inference tests that you're likely
to encounter when you're in a statistics course, and when to apply one over the other.
Standard Normal Table Review
by Sophia
WHAT'S COVERED
This tutorial will review the standard normal table. Our discussion breaks down as follows:
1. The table value itself gives you the percent of observations below a particular z-score.
2. You can find the percent above a particular z-score by subtracting the table value from 100% because the
table value always gives the area to the left.
3. You can find the percent of observations between two z-scores by subtracting the table values.
4. You can find the percent of values outside of two z-scores by finding both the percent above the higher
number and the percent below the lower number, which is sort of a combination of these other options.
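All four lookups can be reproduced with Python's `statistics.NormalDist` instead of a printed table; the z-scores below are the ones used in the worked examples in this review:

```python
from statistics import NormalDist

phi = NormalDist().cdf  # standard normal CDF: area to the LEFT of z

below = phi(-1.5)                       # 1. percent below a z-score
above = 1 - phi(1.33)                   # 2. percent above a z-score
between = phi(0.33) - phi(-0.67)        # 3. percent between two z-scores
outside = phi(-0.83) + (1 - phi(0.83))  # 4. percent outside two z-scores

print(round(below, 4), round(above, 4), round(between, 4), round(outside, 4))
# 0.0668 0.0918 0.3779 0.4065
```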
First we need to find the z-score by using the following formula:

z = (x − μ) / σ = (63.5 − 68) / 3 = −1.5

The z-score ends up being negative 1.5; it's 1.5 standard deviations below the mean of 68.
You can use the negative z-score table, and go to the negative 1.5 row and the zero hundredths column, and
find that the probability is 0.0668.
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
-3.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
-3.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
-3.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
-3.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
-3.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
-2.9 0.0019 0.0018 0.0017 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
-2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
-2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
-2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
-2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
-2.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
-2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
-2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
-2.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
-2.0 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
-1.9 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
-1.8 0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
-1.7 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
-1.6 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455
-1.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
-1.4 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
-1.3 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
-1.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
-1.1 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
-1.0 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
This means that about 7% of men are shorter than 63.5 inches.
Here's the normal distribution. 72 inches is the cutoff value, and you want the percent of men that are taller
than that.
The 72 inches standardizes to positive 1.33 for a z-score.
We can also take the normal distribution, centered at 68 with a standard deviation of 3, and convert it into a
distribution with a mean of 0 and a standard deviation of 1. The image below is called the standard normal curve.
Our z-score was positive 1.33, so you will look in the positive z-score table. Positive z-scores deal with the
tenths place and the hundredths place. Because your z-score was positive 1.33, you will go to the 1.3 row
(tenths) and the 0.03 column (hundredths).
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
At that intersection, you will find 0.9082, which is the area to the left of 1.33. But the question was asking for
the area above, so you simply subtract from 100%.
This tells us that 9.18% of adult men have heights over 72 inches.
Something like this is a little trickier. When you standardize the values of 66 and 69, you end up with these
two z-scores.
To find the probability of the area between these two numbers, you actually need to find the probabilities of
both z-scores.
First, for the area corresponding to the z-score of positive 0.33, look in the positive z-score table at 0.3 row
and 0.03 column to find that the orange area shown below is 0.6293.
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
Now, we need to consider our second z-score, negative 0.67. When you look at the negative z-score table for
the negative 0.67 z-score, you find that its probability in the negative 0.6 row and the 0.07 column is 0.2514,
shown in the green area below.
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
-3.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
-3.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
-3.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
-3.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
-3.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
-2.9 0.0019 0.0018 0.0017 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
-2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
-2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
-2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
-2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
-2.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
-2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
-2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
-2.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
-2.0 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
-1.9 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
-1.8 0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
-1.7 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
-1.6 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455
-1.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
-1.4 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
-1.3 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
-1.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
-1.1 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
-1.0 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
-0.9 0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
-0.8 0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
-0.7 0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
-0.6 0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
-0.5 0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
The area between 66 inches and 69 inches is the area below the 0.33 z-score but not below the -0.67 z-score.
Therefore, we need to subtract the green area from the orange area to obtain the area between the two values,
shown in blue below.
The orange area is equal to 0.6293, the green area is equal to 0.2514, so 0.6293 minus 0.2514 is 0.3779,
which tells us that about 38% of men are between those two heights.
All you do is add the two probabilities of the area below 65.5 and the area above 70.5. First, convert both of
these to z-scores.
We get z-scores of negative 0.83 and positive 0.83. Since these two values are the same distance away
from the mean, and because of the symmetry of the normal curve, you can actually just find one of these two
areas and double it. In general, you wouldn't be able to do that if they were different distances from the
mean.
Let's find the probability of negative 0.83 in the table.
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
-3.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
-3.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
-3.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
-3.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
-3.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
-2.9 0.0019 0.0018 0.0017 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
-2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
-2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
-2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
-2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
-2.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
-2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
-2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
-2.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
-2.0 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
-1.9 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
-1.8 0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
-1.7 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
-1.6 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455
-1.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
-1.4 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
-1.3 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
-1.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
-1.1 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
-1.0 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
-0.9 0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
-0.8 0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
-0.7 0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
-0.6 0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
-0.5 0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
You would find the area below the negative 0.83 z-score, which is 0.2033. Normally you would find the area
above the positive 0.83 z-score, but you don't have to do that, because it's the same as the area below the
negative 0.83 z-score. Just use the symmetry and double it to obtain about 41% of men being outside that
range.
SUMMARY
It's possible to use the standard normal table to find the percent of values above or below a particular
value, or between two values, or even outside two values, using z-scores on the normal distribution.
The normal probability table, also called the z-table or the standard normal table, can be used to find these
percents by finding the percent of values below a certain z-score and subtracting as necessary.
Good luck!
Z-Test for Population Means
by Sophia
WHAT'S COVERED
This tutorial will cover how to perform a z-test for population means. Our discussion breaks down as
follows:
This type of z-test is not done often because it is unlikely that we would know the population standard
deviation without knowing the population mean.
When calculating a z-test for population means, you need the following information:
- The sample mean, x̄
- The sample size, n
- The population mean, μ (from the null hypothesis)
- The population standard deviation, σ
This information will be plugged into the formula for a z-statistic of population means:
FORMULA

z = (x̄ − μ) / (σ / √n)
IN CONTEXT
The average weight of newborn babies is 7.2 pounds, with a standard deviation of 1.1 pounds. A
local hospital has recorded the weights of all 285 babies born in a month, and the average weight
was 6.9 pounds.
We know the average weight is 7.2 pounds, with a standard deviation of 1.1 pounds. Because we
know the population standard deviation and it is quantitative data, we can use the normal
distribution and find a z-score. We also know the average weight was 6.9 pounds. We can plug this
information into the following formula to calculate our z-test statistic:
We have 6.9, which is our sample mean, minus the population mean of 7.2, divided by the
population standard deviation, 1.1, divided by the square root of our sample size, 285. This gives us a
z-test statistic of negative 4.604. We should expect to get a negative z-score because our sample
mean was less than the population mean.
If we were to put this on a normal distribution, it's centered at the population mean, which is 7.2
pounds. The average weight of the babies at the hospital was 6.9, which is less than 7.2 pounds, so
it should fall in the lower part of our distribution.
The corresponding z-score is all the way down at the negative 4.604. At this hospital, the average
weight of the babies was definitely far below the average weights of the babies of the population.
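The calculation can be checked in a couple of lines:

```python
from math import sqrt

mu, sigma = 7.2, 1.1  # population mean and standard deviation (pounds)
x_bar, n = 6.9, 285   # sample mean and sample size

z = (x_bar - mu) / (sigma / sqrt(n))
print(round(z, 3))  # -4.604
```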
TERM TO KNOW
Z-Test for Population Means
A hypothesis test that compares a hypothesized mean from the null hypothesis to a sample mean,
when the population standard deviation is known.
STEP BY STEP
Step 1: State the null and alternative hypotheses, both in symbols and in words.
Step 2: Check the conditions necessary in order to actually perform the inference that you're trying to do.
Step 3: Calculate the test statistic--in this case, a z-statistic--and calculate the p-value based on the normal
sampling distribution.
Step 4: Compare your test statistic to your chosen critical value, or your p-value to your chosen significance
level. Those are both acceptable approaches. Based on how they compare, state a decision regarding the null
hypothesis: either reject it or fail to reject it based on your evidence. Your conclusion should also be stated in
the context of the problem.
Assuming the distribution of bag weights is approximately normal, and the standard deviation of all M&M's
bags is 0.22 grams, is this evidence that the bags do not contain the claimed amount of 47.9 grams in each
bag?
This could mean that it's either higher than 47.9 grams or lower than 47.9 grams. If you take a look, some of
the weights in the sample are fairly off, some by almost a full gram.
You are also assuming that you know the standard deviation of all M&M's bags, which is not always a
reasonable assumption, but is for this example.
Let's walk through each of the steps of running a hypothesis test with our M&M's example.
STEP BY STEP
Step 1: State the null and alternative hypotheses. The null hypothesis is that the mean weight of all M&M's
bags is the claimed 47.9 grams. The alternative hypothesis is that it is not 47.9 grams; this is going to be a
two-sided test based on the "not equal to" symbol.
You should also state what your alpha level or significance level is going to be. By stating that alpha equals
0.05, which is the most common significance level, you are saying if the p-value is less than 0.05, reject the
null hypothesis. If this is above 0.05, you should fail to reject it.
Criteria | Description
Randomness | The randomness should be stated somewhere in the problem. Think about the way the data was collected.
Independence (Population ≥ 10n) | You want to make sure that the population is at least 10 times as large as the sample size. This is your workaround for independence. If the population is sufficiently large, then taking out the number of bags that you took doesn't make a huge difference.
Normality | There are two ways to verify normality. Either the parent distribution has to be normal, or the central limit theorem is going to have to apply. The central limit theorem says that for most distributions, when the sample size is greater than 30, the sampling distribution will be approximately normal.
Randomness: In the problem, it does say the bags were randomly selected. So, thinking about the way
that the data was collected in the problem is important.
Independence: We can also assume there are at least 140 bags of M&M's, which is a reasonable
assumption. Why 140? Because there were 14 bags in our sample. So you're going to assume that the
population of all bags of M&M's is at least 10 times that size.
Normality: Finally, the distribution of bag weights is in fact approximately normal as stated in the problem.
Step 3: Calculate the test statistic and calculate the p-value based on the normal sampling distribution.
In this problem, your test statistic is going to be a z-statistic. How is this done? Take the sample mean minus
the hypothesized population mean of 47.9 from the null hypothesis, and divide by the standard error, which is
the standard deviation of the population divided by the square root of sample size.
When you do all of that and input the numbers, you get a z-statistic of positive 2.89. Look at where that lies on
the normal distribution that you're using. A z-statistic of 2.89 on the standard normal distribution centered at 0
is between two and three standard deviations above the mean. Because this is a two-sided test, find the
probability that your z-statistic is above positive 2.89 and the probability that it's below negative 2.89.
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
-3.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
-3.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
-3.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
-3.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
-3.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
-2.9 0.0019 0.0018 0.0017 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
-2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
-2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
-2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
-2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
Step 4: Compare your test statistic to your chosen critical value, or your p-value to your chosen significance level.
This actually contains three parts: the comparison, the decision, and the conclusion. Since your p-value of
0.0038 is less than the significance level of 0.05, your decision is to reject the null hypothesis. There is
evidence to conclude that the M&M's bags are not filled to a mean of 47.9 grams.
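The tail-doubling step in this p-value calculation can be reproduced in code. This is a minimal sketch using only Python's standard library; the z-statistic of 2.89 comes from the problem above, and the standard normal CDF is built from the error function:

```python
from math import erf, sqrt

def normal_cdf(z):
    # Standard normal cumulative distribution via the error function
    return 0.5 * (1 + erf(z / sqrt(2)))

def two_sided_p(z):
    # Two-sided p-value: the area beyond |z| in both tails
    return 2 * (1 - normal_cdf(abs(z)))

print(round(two_sided_p(2.89), 4))  # 0.0039, close to the table's 2 x 0.0019 = 0.0038
```

The small difference from 0.0038 comes from the z-table rounding each tail area to four decimal places.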
SUMMARY
The steps in any hypothesis test--not just a z-test for population means--are the same. First, state the null and alternative hypotheses, both in symbols and in words. Second, state and verify the conditions necessary for inference. Third, calculate the test statistic from your sample statistics and calculate its p-value. Finally, compare your p-value to the alpha level that you've chosen, or your test statistic to the critical value, and make a decision about the null hypothesis. State your conclusion in the context of the problem. Because the test statistic here was a z, this tutorial focused on a z-test for population means. That test requires the population standard deviation to be known, which is not very common; you'll have other ways to address it when you don't know the population standard deviation.
Good luck!
Z-Test for Population Proportions
by Sophia
WHAT'S COVERED
This tutorial will cover how to calculate a hypothesis test for population proportions. Our discussion
breaks down as follows:
1. Calculating a Z-Test for Population Proportions
2. Conducting a Z-Test for Population Proportions
⭐ BIG IDEA
This information will be plugged into the formula for a z-statistic of population proportions:
FORMULA
z = (p̂ − p) / √(pq / n)
where p̂ is the sample proportion, p is the hypothesized population proportion, q = 1 − p, and n is the sample size.
IN CONTEXT
Approximately 10% of the population is left-handed, with a standard deviation of 3.13%. Of 100
randomly selected people, 14 claimed to be left-handed.
Find the z-test statistic for this data set.
This type of data is qualitative data; people are answering either yes or no. They're either left-
handed or not left-handed. We're placing the answers into categories, which is why it's also called
categorical data.
Since we know the population standard deviation, we can use the formula for population proportions
to find the z-score. We have p-hat, which is the proportion of successes from our sample. In this
case, a success is being left-handed, which is 14 out of 100, or 14%, or 0.14. Now, p is the population
proportion of successes, which is 10%, or 0.10. Then, we have q, which is the complement of p: the proportion of right-handed people, 90%, or 0.90. Our sample size was 100.
We've got our 14% minus the population proportion of 10%. Then we divide by the standard error: the square root of 0.10 times 0.90, all over our sample size of 100. We end up getting a z-test statistic of 1.33.
We can use a normal distribution because we know the population standard deviation. This
distribution is centered at 10%. Our sample rendered 14% of people being left-handed, which was
1.33 standard deviations above the mean.
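That calculation can be checked with a few lines of Python; the function name `z_prop` is just an illustrative choice, not part of any library:

```python
from math import sqrt

def z_prop(p_hat, p0, n):
    # z-statistic for a population proportion: (p-hat - p) / sqrt(p*q / n)
    q0 = 1 - p0
    return (p_hat - p0) / sqrt(p0 * q0 / n)

# Left-handedness example: 14 of 100 in the sample, hypothesized proportion 10%
print(round(z_prop(0.14, 0.10, 100), 2))  # 1.33
```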
The question that we need to consider: is this significantly less than 80%? Is this evidence that, in fact, less than 80% of all items at the supermarket have a price ending in 9 or 5?
STEP BY STEP
The null hypothesis states that the true proportion of prices ending in 9 or 5 is 80%. Conversely, the alternative hypothesis suspects that something is amiss--that it is actually less than 80%. We are going to say that p, the true proportion of prices ending in 9 or 5, is below 80%.
In this problem, choose a significance level of 0.10. With the decision rule, if the p-value is less than 0.10, you’ll
reject the null hypothesis in favor of the alternative.
Randomness: The randomness should be stated somewhere in the problem. Think about the way the data was collected.
Independence (population ≥ 10n): Make sure that the population is at least 10 times the size of the sample, because you're sampling without replacement.
Normality (np ≥ 10 and nq ≥ 10): Because you're using the sampling distribution of p-hat instead of x-bar, there are different conditions for normality. Use the conditions np is at least 10 and nq is at least 10. We can't use the central limit theorem here because this is not the sampling distribution of x-bar; it's the sampling distribution of p-hat, sample proportions.
Randomness: In the problem, it does say that the items were randomly selected, so the simple random
sample condition is acceptable.
Independence: Assume the independence piece--that the population of all items at the grocery store is at
least 1,150. That seems reasonable.
Normality: You know what n is, and you know what p is. So, p is the value from the null hypothesis--it's the
80% that you're believing is the center of the distribution; n was the sample size, 115. Multiply 0.80 times
115 to get 92 for n times p. That's greater than 10. In addition, you get 23 for n times q, which is also
greater than 10.
All three conditions have been checked, and we're good to go.
Step 3: Calculate the test statistic and calculate the p-value based on the normal sampling distribution.
Now you can perform the z-test for population proportions. It's going to be the statistic (88 over 115) minus the
hypothesized parameter (0.80) over the standard error. The standard error in this case is the square root of p
times q divided by n.
When you evaluate the fraction, you get a z-score of negative 0.93. Then, you can find negative 0.93 on the
normal distribution that is the sampling distribution for p-hat, and find the tail probability by using a normal z-
table.
The probability that your sample proportion would be less than the one that you got, the 88 out of 115, equals the probability that the z-statistic would be less than negative 0.93.
You can find that area using the normal table, and you get 0.1762, or about 18% of the time. This means that if
the null hypothesis was true and this distribution was really centered at 0.8, meaning the true proportion of
prices ending in 9 or 5 was 0.8, you would find something at least as low as we got about 18% of the time.
Step 4: Compare your test statistic to your chosen critical value, or your p-value to your chosen significance level.
Based on how your p-value compares to your chosen significance level, which you may recall was 0.10, you're
going to make a decision about the null hypothesis and state the conclusion. In your case, 0.1762 is greater
than 0.10. Your decision, then, is that you fail to reject the null hypothesis. The conclusion is that there's not
sufficient evidence to conclude that less than 80% of supermarket prices end in 9 or 5. You don't have strong
enough evidence to reject the claim of the consumer report.
SUMMARY
The steps in any hypothesis test are always the same. You start by stating your null and alternative
hypotheses, which is where you would also state your alpha level. Next, state and verify the
conditions. Calculate the test statistic and the p-value. Finally, based on your p-value, compare it to
your alpha level and make a decision about the null hypothesis and state it in the context of the
problem. In this case, we did a z-test for population proportions, and it's analogous to any other hypothesis test that you do. The only thing that changed was how you verified the normality condition, because you needed np to be at least 10 and nq to be at least 10.
Good luck!
Source: Adapted from Sophia tutorial by Jonathan Osters.
FORMULAS TO KNOW
z-statistic of Proportions
z = (p̂ − p) / √(pq / n)
How to Find a Critical Z Value
by Sophia
WHAT'S COVERED
This tutorial will cover how to find the critical z-value for the following tests:
1. Left-Tailed Tests
a. Graphing Calculator
b. Z-Table
c. Excel
2. Right-Tailed Tests
a. Graphing Calculator
b. Z-Table
c. Excel
3. Two-Sided Tests
a. Graphing Calculator
b. Z-Table
c. Excel
1. Left-Tailed Tests
For a left-tailed test, suppose we need to find the critical z-value for a hypothesis test that would reject the
null hypothesis (H0) at a 2.5% significance level. To do this, we want to find, on our normal distribution, the
cutoff on the left tail that corresponds to the lower 2.5% of our distribution.
1a. Graphing Calculator
On a TI-84 Plus, open the distribution (DISTR) menu and scroll down to the third function, which is inverse norm (invNorm). This is the inverse of the normal distribution. We're going to hit "Enter".
Next, we're going to input 0.025, because this is a left-tailed test, so we're looking at the lower 2.5% of our
distribution. For this specific calculator (TI-84 Plus), we need to type 0.025 for the area, and 0 for mu and 1 for
sigma, because these are the values that correspond to the standard normal distribution. Hit "Enter", and we get a critical z-value of negative 1.96.
At about -1.96, this is the cutoff for the lower 2.5% of our data.
Basically, any z-score below negative 1.96 means we're going to reject the null hypothesis. Any z-score above negative 1.96 is going to fall in this unshaded region of our distribution. This means that
we're willing to accept the variation in our sample from the center of our distribution due to chance, and we're
going to fail to reject the null hypothesis.
1b. Z-Table
The second method is using a z-table. When using the z-table, we look for our significance level in the table. In
this case, remember we were looking at a left-tailed test. This means we need to use the negative z-table with
negative z-scores, not positive, because we're looking at the lower half of the distribution. Remember, the
significance level is 0.025, or 2.5%, so we are going to look for that value or the closest thing to it. Here it is
on a z-table:
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
-3.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
-3.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
-3.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
-3.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
-3.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
-2.9 0.0019 0.0018 0.0017 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
-2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
-2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
-2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
-2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
-2.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
-2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
-2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
-2.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
-2.0 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
-1.9 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
-1.8 0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
-1.7 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
-1.6 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455
-1.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
A significance level of 0.025 corresponds to a z-score of negative 1.96. Therefore, our z critical value is
negative 1.96.
1c. Excel
A third way to find the critical z-value that corresponds to a 2.5% significance level for a left-tailed test is in
Excel. All we have to do is go to our "Formulas" tab. We're going to insert under the "Statistical" column. We're
looking for "NORM.S.INV", which is right here:
This is for the inverse of the normal distribution, and because it's a left-tailed test, we're looking at the lower
half of our distribution. We're going to put in the 0.025 for the lower 2.5%. Hit "Enter", and notice how we get
the same critical z-value that we did using the calculator and table:
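The invNorm, z-table, and NORM.S.INV lookups above all compute the same inverse CDF. In Python, the standard library's `statistics.NormalDist` offers an equivalent:

```python
from statistics import NormalDist

# Critical z for a left-tailed test at the 2.5% significance level:
# the point with 2.5% of the distribution's area to its left
z_crit = NormalDist().inv_cdf(0.025)
print(round(z_crit, 2))  # -1.96
```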
TERM TO KNOW
Critical Value
A value that can be compared to the test statistic to decide the outcome of a hypothesis test
2. Right-Tailed Tests
For a right-tailed test, suppose we need to find the critical z-value for a hypothesis test that would reject the
null (H0) at a 5% significance level. To do this, we want to find, on our normal distribution, the cutoff on the
upper part of the distribution where we are not going to attribute the difference in proportion due to chance.
2a. Graphing Calculator
Using the invNorm function again, this time enter 0.95 for the area, since the upper 5% corresponds to the 95th percentile. This returns a critical z-value of about 1.645.
Any z-test statistic that is greater than 1.645 falls in the upper 5% of our distribution, and therefore we would
reject the null hypothesis.
2b. Z-Table
The second method uses the z-table. Because we're looking at a right-tailed test, we're going to have positive
z-scores since we're looking at the upper half of the distribution. We'll use the positive z-table that
corresponds with positive z-scores.
The significance level was 5%, but it was the upper 5%. Remember, this corresponds to the 95th percentile on
our distribution. In the table, we need to look for the closest thing to 95%, or 0.95.
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
This actually falls in between these two values, 0.9495 and 0.9505. This value corresponds to a z-score of 1.6
in the left column, and it falls between the 0.04 and the 0.05 in the top row. When we take the average of 1.64
and 1.65, we get a critical z-value of 1.645.
2c. Excel
A third way to find the critical z-value that corresponds to a 5% significance level for an upper tail test, or a
right-tailed test, is by using Excel. Again, go to "Formulas" tab. We're going to insert under the "Statistical"
column our "NORM.S.INV", but we're not going to put in 0.05 for the 5%. Because we're looking at the upper
part of our distribution, this is going to correspond to the 95th percentile. We're going to enter 0.95, and
notice how we get the same critical value we did from our table and our calculator, which is a positive 1.645.
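In code, the right-tailed case uses the same inverse-CDF call, with the upper 5% expressed as the 95th percentile:

```python
from statistics import NormalDist

# Right-tailed test at a 5% significance level: the cutoff is the 95th percentile
z_crit = NormalDist().inv_cdf(0.95)
print(round(z_crit, 3))  # 1.645
```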
3. Two-Sided Tests
For a two-sided test, suppose we want to find the critical z-score for a hypothesis test that would reject the
null at a 1% significance level. Because it's a two-sided test, we have to divide that 1% into each tail. Therefore,
1% divided by 2 means we're going to be looking for the cutoff at the lower 0.5% of the distribution, and the
upper 0.5% of our distribution.
3a. Graphing Calculator
Using invNorm with an area of 0.005, the lower 0.5%, gives us a corresponding z-score of negative 2.576. In a distribution, this falls right about here, negative 2.576.
The shaded region corresponds to the lower 0.5% of the distribution. If we do this correctly, we should get the
same z-score, but a positive value for the upper portion of our distribution for that 0.5% cut off.
Let's go ahead and do inverse norm again on our calculator. But for the upper cutoff, we can't put in 0.005, because remember, invNorm measures area from the left of the distribution, from 0% to 100%. We actually have to do 100% minus 0.5%, or 99.5%. We are going to put in 0.995 and get a positive 2.576.
This positive 2.576 corresponds to the upper 0.5% of our distribution.
Any z-score that we would calculate that would be greater than a positive 2.576 or less than a negative 2.576
means we would reject the null hypothesis.
3b. Z-Table
Using our z-table, we first look for the corresponding critical value for the lower half of our distribution, since
it's a two-sided test. Remember, we're not going to look for the closest thing to 1%, but we're going to look for
the closest value to 0.5%, or 0.005.
Let's use the table to find the lower critical value for our two-sided test at a 1% significance level. We're going to find the closest thing to 0.5%.
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
-3.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
-3.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
-3.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
-3.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
-3.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
-2.9 0.0019 0.0018 0.0017 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
-2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
-2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
-2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
-2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
-2.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
-2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
-2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
-2.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
-2.0 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
The closest value to 0.005 is between these two values, 0.0051 and 0.0049. This corresponds to a negative
2.5 in the left column, and in between the 0.07 and the 0.08 in the top row. If we're using the table, we're
going to get an average critical z-value of negative 2.575, which is quite close to what the calculator gave us.
Remember, sometimes the table can just give us an estimate.
Let's use the table to find the upper critical value for our two-sided test at a 1% significance level. We're going to try to find the closest value to 100% minus 0.5%, or 99.5%.
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
The closest value to 99.5%, or 0.995, is in between these two values, 0.9949 and 0.9951. This corresponds to
a positive 2.5 in the left column, and falling between the 0.07 and the 0.08 in the top row. If we're using the
table, we would get a critical z-value of a positive 2.575, taking the average between those two values.
3c. Excel
In Excel, we're going to find the two critical z-values that correspond to the 1% significance level for our two-
sided test. Again, go under your "Formulas" tab. We're going to insert the "NORM.S.INV" under the "Statistical"
column. We'll first find the lower critical value that corresponds to the lower 0.5%, so enter 0.005.
You can see that we get our first critical z-value of negative 2.576. Now, if we do this correctly, we should get
a positive 2.576. Again, we're going to insert to get the second critical value for the upper part of our
distribution. The upper percentage that corresponds to the top 0.5% is going to be our 99.5%, so 0.995.
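Both two-sided cutoffs can be computed the same way in code; splitting the 1% significance level leaves 0.5% in each tail:

```python
from statistics import NormalDist

alpha = 0.01
z_low = NormalDist().inv_cdf(alpha / 2)        # lower 0.5% cutoff
z_high = NormalDist().inv_cdf(1 - alpha / 2)   # upper 0.5% cutoff
print(round(z_low, 3), round(z_high, 3))  # -2.576 2.576
```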
SUMMARY
We calculated a critical z-score for a left-tailed, right-tailed, and two-tailed test, utilizing three methods
for each test: graphing calculator, z-table, and Excel.
Good luck!
TERMS TO KNOW
Critical Value
A value that can be compared to the test statistic to decide the outcome of a hypothesis test
How to Find a P-Value from a Z-Test Statistic
by Sophia
WHAT'S COVERED
This tutorial will explain how to find a p-value when given the z-test statistic, by using either a z-table, a graphing calculator, or Excel. Our discussion breaks down as follows:
1. Two-Sided Tests
a. Z-Table
b. Graphing Calculator
c. Excel
2. Left-Tailed Tests
a. Z-Table
b. Graphing Calculator
c. Excel
3. Right-Tailed Tests
a. Z-Table
b. Graphing Calculator
c. Excel
1. Two-Sided Tests
Suppose a pharmaceutical company manufactures ibuprofen pills. They need to perform some quality
assurance to ensure they have the correct dosage, which is supposed to be 500 milligrams. This is a two-
sided test because if the company's pills are deviating significantly in either direction, meaning there are more
than 500 milligrams or less than 500 milligrams, this will indicate a problem.
In a random sample of 125 pills, there is an average dose of 499.3 milligrams with a standard deviation of 6
milligrams. Because this is quantitative data, 500 mg is the hypothesized population mean. We can use the following formula to calculate the z-score:
z = (x̄ − μ) / (s / √n)
We get a z-score of negative 1.304. Because this is a two-sided test, it is not enough to just look at the left tail.
We also have to look at the equivalent of the right tail, or a positive 1.304.
Now that we have the z-score, we can use a variety of methods to find the probability, or p-value.
1a. Z-Table
The first way to find the p-value is to use the z-table. In the z-table, the left column will show values to the
tenths place, while the top row will show values to the hundredths place. If we have a z-score of -1.304, we
need to round this to the hundredths place, or -1.30. In the left column, we will first find the tenths place, or -1.3.
In the top row, we will find the hundredths place, or 0.
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
-3.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
-3.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
-3.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
-3.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
-3.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
-2.9 0.0019 0.0018 0.0017 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
-2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
-2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
-2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
-2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
-2.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
-2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
-2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
-2.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
-2.0 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
-1.9 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
-1.8 0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
-1.7 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
-1.6 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455
-1.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
-1.4 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
-1.3 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
-1.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
-1.1 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
-1.0 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
This results in a p-value of 0.0968, or 9.68%, for a z-score of negative 1.304. We also need to take the positive
1.304 into account, which is the upper right tail.
To calculate the true p-value, we just need to multiply 0.0968 by two, or 0.1936. This would be a p-value of
19.36%.
1b. Graphing Calculator
Using a calculator's normal cdf function with the unrounded z-score, we get a value of 0.0961, which is about the same value as we got in the table. Again, we need to take both tails into account, so we can simply multiply this value by two to get a p-value of 0.1922, or 19.22%.
1c. Excel
The third method to find the p-value is to use Excel. First, select "Formulas", choose the "Statistical" option,
and pick "NORM.DIST". The first value we are going to input is the mean of the sample, which was 499.3, then the population mean which we are testing against, or 500, and finally the standard error: the standard deviation, which was 6, divided by the square root of the sample size, n = 125. We can find the square root under the "Math and Trigonometry" option in "Formulas". The last value that we need to enter is "TRUE".
We get about the same value as we did with the table and the calculator. Since this is a two-sided test, we need to multiply the value by two, which gives a p-value of about 0.192.
2. Left-Tailed Test
In this next example, we'll look at the proportion of students who suffer from test anxiety. We want to test the
claim that fewer than half of students suffer from test anxiety.
In this case, we will have a left-tailed test. Because this is qualitative data, meaning the students answer yes or
no to suffering from test anxiety, this is a population proportion, and we can use the following formula to calculate the z-test statistic:
z = (p̂ − p) / √(pq / n)
In a random sample of 1000 students, 450 students claimed to have test anxiety. This will be p-hat, or the
sample proportion. We can calculate this by dividing 450 by 1000, or 0.45. The population proportion, p, is
50%, or 0.50. The complement of p, or q, can be found by calculating 1 minus 0.50, or 0.50. The sample size
is 1000.
The corresponding z-score is negative 3.162. Testing against the claim that half, or 50%, of students suffer from test anxiety, we get the following shaded region all the way to the left of our curve:
2a. Z-Table
The first way to find the p-value is with the z-table. Remember, we can only go up to the hundredths place, so
we will need to round -3.162 to -3.16. In the left column, we will first find the tenths place, or -3.1. In the top row,
we will find the hundredths place, or 0.06.
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
-3.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
-3.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
-3.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
-3.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
-3.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
-2.9 0.0019 0.0018 0.0017 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
-2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
The table shows a value of 0.0008; a more precise calculation gives a p-value of 0.00078, or 0.078%.
2c. Excel
In Excel, select "Formulas", choose the "Statistical" option, and pick "NORM.DIST". The first value we are going
to input is the sample proportion, "0.45", then the population proportion, "0.50", and finally the standard
deviation, which was the square root of pq divided by n, or 0.50 times 0.50 divided by 1000. We can find the
square root under the "Math and Trigonometry" option in "Formulas". The standard deviation should be input
as "SQRT((0.50*0.50)/1000)". The last value that we need to enter is "TRUE".
We get about the same p-value as we did with the z-table and the calculator.
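What Excel's NORM.DIST is doing here is evaluating the cumulative normal distribution. A minimal Python equivalent (a sketch using the standard library rather than Excel; `norm_cdf` is defined here, not built in):

```python
import math

def norm_cdf(x, mean, sd):
    # Cumulative area to the left of x, like Excel's NORM.DIST(x, mean, sd, TRUE)
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

sd = math.sqrt((0.50 * 0.50) / 1000)   # standard error: SQRT((0.50*0.50)/1000)
p_value = norm_cdf(0.45, 0.50, sd)
print(round(p_value, 5))   # 0.00078
```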
3. Right-Tailed Test
In this final example, we will be testing the claim that women in a certain town are taller than the average U.S.
height, which is 63.8 inches.
From a random sample of 50 women, we get an average height of 64.7 inches with a standard deviation of
2.5 inches. Height is a quantitative variable, so the 63.8 inches is a population mean. We will then use
the following formula to calculate the z-score: z = (x̄ − μ) / (σ/√n)
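As a sketch of that calculation in Python (my own illustration, not from the tutorial):

```python
import math

x_bar = 64.7   # sample mean height
mu = 63.8      # population mean being tested
sigma = 2.5    # standard deviation
n = 50         # sample size

# z = (x-bar - mu) / (sigma / sqrt(n))
z = (x_bar - mu) / (sigma / math.sqrt(n))
print(round(z, 3))   # 2.546
```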
3a. Z-Table
The first way to find the p-value is to use the z-table. In the z-table, the left column will show values to the
tenths place, while the top row will show values to the hundredths place. If we have a z-score of 2.546, we
need to round this to the hundredths place, or 2.55. In the left column, we will first find the tenths place, or 2.5.
In the top row, we will find the hundredths place, or 0.05.
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
However, when we are performing an upper-tailed, or right-tailed, test, the table value is the area reading from
the left of the distribution. The 0.9946 from the table corresponds to the 99.46% of the curve that is
unshaded.
To get the percent that is shaded under the curve, we just need to calculate 100% minus 99.46%. This gives
us the p-value of 0.54%, or 0.0054.
3c. Excel
In Excel, first, select "Formulas", choose the "Statistical" option, and again pick "NORM.DIST". The first value
we are going to input is the mean of the sample, which was 64.7, then the population mean which we are
testing against, or 63.8, and finally the standard deviation, which was 2.5, divided by the square root of the
sample size of 50. We can find the square root under the "Math and Trigonometry" option in "Formulas". The
last value that we need to enter is "TRUE".
Notice that we do not get the same p-value as the graphing calculator. Since this is a right-tailed test, the
issue is that Excel's NORM.DIST always returns the cumulative area from the left of the distribution. The whole
distribution is 100%, so to get the upper portion of the distribution, we subtract this value from 100%, or 1.
We get the same p-value, which is about 0.0054, or 0.54%.
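That right-tail subtraction can be sketched in Python with the normal CDF (illustration only; `norm_cdf` is defined here rather than built in):

```python
import math

def norm_cdf(x, mean, sd):
    # Cumulative area to the left of x (what Excel's NORM.DIST returns with TRUE)
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

standard_error = 2.5 / math.sqrt(50)
left_area = norm_cdf(64.7, 63.8, standard_error)
p_value = 1 - left_area          # right-tailed test: area above the sample mean
print(round(p_value, 4))
```

This gives about 0.0055; the 0.0054 in the tutorial reflects rounding the z-score to 2.55 before using the table.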
SUMMARY
Today we calculated the p-value from a given z-test statistic, for a two-sided test, a left-tailed test, and
a right-tailed test. For each test, we performed the calculation using three different methods: z-table,
graphing calculator, and Excel.
Good luck!
TERMS TO KNOW
P-value
The probability that the test statistic is that value or more extreme in the direction of the alternative
hypothesis
Test Statistic
A measurement, in standardized units, of how far a sample statistic is from the assumed parameter if the
null hypothesis is true
Confidence Intervals
by Sophia
WHAT'S COVERED
This tutorial will cover the basics of confidence intervals, focusing on how to identify the z-critical
value needed for a given confidence interval. Our discussion breaks down as follows:
1. Confidence Intervals
2. Margin of Error and the Effect of Confidence Level and Sample Size
3. Confidence Interval Formulas
a. For Sample Means
b. For Sample Proportions
4. Finding Z*
1. Confidence Intervals
Before we begin, it's important to note that sampling error is the inherent variability in the process of sampling.
In a random sample, it occurs when you use a statistic, like a sample mean, to estimate the parameter, like a
population mean. You won't always get exact accuracy with the sample mean, but you can use it to estimate
the population mean. The idea is that you can be close.
When you take a larger sample, you're going to be, on average, closer. The sampling error, which is the
amount by which the sample statistic is off from the population parameter, decreases. You get more
consistently close values to the parameter when you take samples. When you calculate a margin of error in a
study, you are approximating the sampling error.
When you take a sample, you try to obtain values that accurately represent what's going on in the population.
EXAMPLE For example, suppose you took a simple random sample of 500 people getting ready
for an upcoming election in a town of 10,000, and found that 285 of those 500 plan to vote for a
particular candidate. Your best guess, for the true proportion, in the population of the town that will
vote for candidate y is the proportion that you got in your sample--285 out of 500, which is 57% of the
town. That's your best guess, but you might be off by a little bit.
You don't know if the true proportion of people who will vote for that candidate is 57%, and that's why
you report a margin of error in your poll.
From the margin of error, you can create what is called a confidence interval. A confidence interval is an
interval that contains the likely values for a parameter. We base the confidence interval on our point estimate,
and the width of the interval is affected by the confidence level and sample size.
FORMULA
Confidence Interval
point estimate ± margin of error
The confidence interval is your point estimate, which is your best guess from your simple random sample, plus
or minus the margin of error. You believe you are within a certain amount of the right answer with your point
estimate.
TERM TO KNOW
Confidence Interval
An interval that contains likely values for a parameter. We base our confidence interval on our point
estimate, and the width of the interval is affected by confidence level and sample size.
Sample size: You knew this from before when you said that a larger sample size results in less sampling
error, and therefore a lower margin of error.
Confidence level: You're going to learn more about this, but a higher confidence level results in a larger
margin of error.
EXAMPLE If you want to be very confident that you're going to accurately describe what percent
of people are going to vote for that particular candidate, you have to go out a little bit further on each
side. Maybe you have to go out plus or minus 5%, as opposed to plus or minus 3%.
IN CONTEXT
95% Confidence
If the sampling distribution of p-hat is approximately normal, it will be centered at p, the population
parameter. 95% of all sample proportions will be within two standard deviations of p.
So p plus or minus two standard deviations will contain 95% of all p-hats. This is called 95%
confidence. Approximately 19 out of every 20 samples that you take will, in the long term, be within
two standard deviations of the right answer: 95% of all p-hats are within two standard
deviations of p.
99% Confidence
For instance, 99% of all p-hats will be within 2.58 standard deviations of p. This means that when
you take a sample proportion, 99% of sample proportions will be within 2.58 standard deviations of
the right answer, the value of p.
Take your p-hat value, and plus or minus 2.58 standard deviations, and you're 99% likely to capture
the value of p.
In C% of samples, the parameter will be within z* standard errors of the sample statistic.
This is the general interpretation; in a typical write-up, the placeholders (C, z*, and the standard error) are
replaced with actual numbers.
For C% of the time, the parameter μ will be contained in the interval x̄ ± z*·(σ/√n).
FORMULA
Confidence Interval for a Population Mean
x̄ ± z*·(σ/√n)
FORMULA
Confidence Interval for a Population Proportion
p̂ ± z*·√(p̂q̂ / n)
4. Finding Z*
The confidence level determines the value of z*; whatever confidence level you choose sets the critical value.
To find the z* critical value, we can use a z-table. For a confidence interval, we can follow the same steps as a
two-sided test.
EXAMPLE If we have a 95% confidence interval, this is actually the same as a 5% significance
level. However, this is split between two tails, the lower and upper part of the distribution. Each tail will
have 2.5%, or 0.025.
We can use the upper limit to find the critical z-score. Remember, a distribution is 100%, so to find the upper
limit, we can subtract 0.025 from 1, which gives us 0.975. Now, we can use a z-table.
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
In a z-table, the value 0.975 corresponds with a 1.9 in the left column and 0.06 in the top row. This tells us that
the z-score is 1.96.
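The same lookup can be done in reverse with Python's standard library (a sketch; `NormalDist` is available in Python 3.8+):

```python
from statistics import NormalDist

# z* for 95% confidence: the z-score with an area of 0.975 to its left
z_star = NormalDist().inv_cdf(0.975)
print(round(z_star, 2))   # 1.96
```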
Another way is to use a t-table, which you will learn more about in a later tutorial. We don't use the t-
distribution for proportions; however, we can use the last row in this table to find the confidence levels.
Tail Probability, p
One-tail 0.25 0.20 0.15 0.10 0.05 0.025 0.02 0.01 0.005 0.0025 0.001 0.0005
Two-tail 0.50 0.40 0.30 0.20 0.10 0.05 0.04 0.02 0.01 0.005 0.002 0.001
df
1 1.000 1.376 1.963 3.078 6.314 12.71 15.89 31.82 63.66 127.3 318.3 636.6
2 0.816 1.080 1.386 1.886 2.920 4.303 4.849 6.965 9.925 14.09 22.33 31.60
3 0.765 0.978 1.250 1.638 2.353 3.182 3.482 4.541 5.841 7.453 10.21 12.92
4 0.741 0.941 1.190 1.533 2.132 2.776 2.999 3.747 4.604 5.598 7.173 8.610
5 0.727 0.920 1.156 1.476 2.015 2.571 2.757 3.365 4.032 4.773 5.893 6.869
6 0.718 0.906 1.134 1.440 1.943 2.447 2.612 3.143 3.707 4.317 5.208 5.959
7 0.711 0.896 1.119 1.415 1.895 2.365 2.517 2.998 3.499 4.029 4.785 5.408
8 0.706 0.889 1.108 1.397 1.860 2.306 2.449 2.896 3.355 3.833 4.501 5.041
9 0.703 0.883 1.100 1.383 1.833 2.262 2.398 2.821 3.250 3.690 4.297 4.781
10 0.700 0.879 1.093 1.372 1.812 2.228 2.359 2.764 3.169 3.581 4.144 4.587
11 0.697 0.876 1.088 1.363 1.796 2.201 2.328 2.718 3.106 3.497 4.025 4.437
12 0.695 0.873 1.083 1.356 1.782 2.179 2.303 2.681 3.055 3.428 3.930 4.318
13 0.694 0.870 1.079 1.350 1.771 2.160 2.282 2.650 3.012 3.372 3.852 4.221
14 0.692 0.868 1.076 1.345 1.761 2.145 2.264 2.624 2.977 3.326 3.787 4.140
15 0.691 0.866 1.074 1.341 1.753 2.131 2.249 2.602 2.947 3.286 3.733 4.073
16 0.690 0.865 1.071 1.337 1.746 2.120 2.235 2.583 2.921 3.252 3.686 4.015
17 0.689 0.863 1.069 1.333 1.740 2.110 2.224 2.567 2.898 3.222 3.646 3.965
18 0.688 0.862 1.067 1.330 1.734 2.101 2.214 2.552 2.878 3.197 3.610 3.922
19 0.688 0.861 1.066 1.328 1.729 2.093 2.205 2.539 2.861 3.174 3.579 3.883
20 0.687 0.860 1.064 1.325 1.725 2.086 2.197 2.528 2.845 3.153 3.552 3.850
21 0.686 0.859 1.063 1.323 1.721 2.080 2.189 2.518 2.831 3.135 3.527 3.819
22 0.686 0.858 1.061 1.321 1.717 2.074 2.183 2.508 2.819 3.119 3.505 3.792
23 0.685 0.858 1.060 1.319 1.714 2.069 2.177 2.500 2.807 3.104 3.485 3.767
24 0.685 0.857 1.059 1.318 1.711 2.064 2.172 2.492 2.797 3.091 3.467 3.745
25 0.684 0.856 1.058 1.316 1.708 2.060 2.167 2.485 2.787 3.078 3.450 3.725
26 0.684 0.856 1.058 1.315 1.706 2.056 2.162 2.479 2.779 3.067 3.435 3.707
27 0.684 0.855 1.057 1.314 1.703 2.052 2.158 2.473 2.771 3.057 3.421 3.690
28 0.683 0.855 1.056 1.313 1.701 2.048 2.154 2.467 2.763 3.047 3.408 3.674
29 0.683 0.854 1.055 1.311 1.699 2.045 2.150 2.462 2.756 3.038 3.396 3.659
30 0.683 0.854 1.055 1.310 1.697 2.042 2.147 2.457 2.750 3.030 3.385 3.646
40 0.681 0.851 1.050 1.303 1.684 2.021 2.123 2.423 2.704 2.971 3.307 3.551
50 0.679 0.849 1.047 1.299 1.676 2.009 2.109 2.403 2.678 2.937 3.261 3.496
60 0.679 0.848 1.045 1.296 1.671 2.000 2.099 2.390 2.660 2.915 3.232 3.460
80 0.678 0.846 1.043 1.292 1.664 1.990 2.088 2.374 2.639 2.887 3.195 3.416
100 0.677 0.845 1.042 1.290 1.660 1.984 2.081 2.364 2.626 2.871 3.174 3.390
1000 0.675 0.842 1.037 1.282 1.646 1.962 2.056 2.330 2.581 2.813 3.098 3.300
>1000 0.674 0.841 1.036 1.282 1.645 1.960 2.054 2.326 2.576 2.807 3.091 3.291
50% 60% 70% 80% 90% 95% 96% 98% 99% 99.5% 99.8% 99.9%
The z critical values for each confidence level are found in the last row of this t-table, under the infinity value,
or ">1000". Essentially, the normal distribution is the t-distribution with infinite degrees of freedom. Looking in
this row, we find the same z critical value of 1.96 that we previously got.
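As a check on that last row, the z critical values for common confidence levels can be computed directly (a sketch using Python's standard library, not part of the original tutorial):

```python
from statistics import NormalDist

def z_critical(confidence):
    # Two-sided critical value: put (1 - confidence)/2 in each tail,
    # so the area to the left of z* is (1 + confidence)/2
    return NormalDist().inv_cdf((1 + confidence) / 2)

for c in (0.90, 0.95, 0.99):
    print(f"{c:.0%}: {z_critical(c):.3f}")   # prints 1.645, 1.960, 2.576
```

These match the 90%, 95%, and 99% entries in the ">1000" row of the t-table above.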
SUMMARY
When you take a sample, you obtain a sample statistic that is a point estimate of the population
parameter. You can then create a confidence interval and be a certain percent confident that the
parameter lies within it. This means that that percent of sample statistics in the sampling distribution
fall within the margin of error of the parameter; for example, 95% of all the x-bars in the sampling
distribution of x-bar will be within the margin of error of the true parameter μ. The same percent of
confidence intervals will contain the parameter: if you took samples over and over again and built a
confidence interval each time, 90% or 95% of those intervals (depending on the confidence level)
would contain μ or p, or whatever parameter you are trying to estimate.
Good luck!
Source: Adapted from Sophia tutorial by Jonathan Osters.
TERMS TO KNOW
Confidence Interval
An interval that contains likely values for a parameter. We base our confidence interval on our point
estimate, and the width of the interval is affected by confidence level and sample size.
FORMULAS TO KNOW
Confidence Interval
point estimate ± margin of error
Confidence Interval for Population Proportion
by Sophia
WHAT'S COVERED
This tutorial will cover how to calculate confidence intervals for a population proportion. Our
discussion breaks down as follows:
For a confidence interval for population proportions, the statistic is the sample proportion and population
parameter is the population proportion. The following can be used to calculate the confidence interval:
FORMULA
Confidence Interval for a Population Proportion
p̂ ± z*·√(p̂q̂ / n)
HINT
We will use p-hat and q-hat because we do not have an assumed population proportion.
TERM TO KNOW
To construct a confidence interval for population proportions, the following steps must be followed:
STEP BY STEP
EXAMPLE Obecalp is a popular prescription drug but is thought to cause headaches as a side
effect. In a random sample of 206 patients taking Obecalp, 23 experienced headaches.
Construct a 95% confidence interval for the proportion of all Obecalp users that would experience
headaches.
Step 1: Verify the conditions necessary for inference.
Stating the conditions isn't enough, and it's not just a formality--you must verify them. Recall the conditions
needed:
Condition Description
Randomness The sample must be taken randomly
Independence The population must be at least 10 times the sample size
Normality np ≥ 10 and nq ≥ 10
Let's go back to our example to check the requirements of randomness, independence, and normality.
Randomness: The sample of Obecalp users was a random sample, so that is verified.
Independence: The sample of Obecalp users taken was a small fraction of the population of Obecalp
users. There's no way to verify that empirically unless you had the whole list of people taking the drug.
You're going to have to assume there are at least ten times the sample size, or 2,060 people taking this
drug.
Normality: The "np is greater than or equal to 10" condition is a little harder to figure out. You don't know
p, the true proportion of people who will get headaches, and you don't have a best guess for it from a null
hypothesis; there is no null hypothesis in this problem. What you do have, as a point estimate for p, is p-hat,
so verify normality using p-hat instead of p: n times p-hat has to be at least 10. In this case, n times p-hat is
206 times 23/206, which is 23, bigger than 10; n times q-hat is 183, which is also bigger than 10.
HINT
Recall that you need to use sample statistic, p-hat, to verify the normality condition because you don't
know population parameter, p.
Step 2: Calculate the confidence interval.
To do this, we will take the point estimate, p-hat, plus or minus the z* critical value times the standard error of
p-hat, which is the square root of p-hat times q-hat, over n. The population proportion is not known, so you'll
use p-hat for the standard error.
First, let's find the corresponding z* critical value for a 95% confidence interval by using a z-table. For a
confidence interval, we can follow the same steps as a two-sided test. If we have a 95% confidence interval,
this is actually the same as a 5% significance level. However, this is split between two tails, the lower and
upper part of the distribution. Each tail will have 2.5%.
We can use the upper limit to find the critical z-score. Remember, a distribution is 100%, so to find the upper
limit, we can subtract 0.025 from 1, which gives us 0.975. Now, we can use a z-table.
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
In a z-table, the value 0.975 corresponds with a 1.9 in the left column and 0.06 in the top row. This tells us that
the z-score is 1.96.
Another way is to use a t-table, which you will learn more about in a later lesson but is available to view at the
end of this tutorial. We don't use the t-distribution for proportions; however, we can use the last row in this
table to find the confidence levels. Z confidence level, critical values, are found in the last row of this t table,
under the infinity value, or ">1000". Essentially, the normal distribution is the t distribution with infinite degrees
of freedom. We're going to look in this row to find the z critical value that we should use, which is the same as
the 1.96 we got from before.
Now that we have the corresponding z* critical value, we need p-hat, which is 23 out of 206; q-hat, which
is the complement of p-hat; and the sample size, n, which is 206. We put all this information into the formula:
From this formula, we obtain 0.112, which was our p-hat, plus or minus 0.043, which is the margin of error.
When we evaluate the interval, it's going to be from 0.069 all the way up to 0.155.
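The whole interval calculation can be sketched in Python (my own restatement of the steps above, not from the tutorial):

```python
import math

n = 206
p_hat = 23 / n        # sample proportion, about 0.112
q_hat = 1 - p_hat     # complement of p-hat
z_star = 1.96         # critical value for 95% confidence

# Normality condition: n*p-hat and n*q-hat must both be at least 10
assert n * p_hat >= 10 and n * q_hat >= 10

margin_of_error = z_star * math.sqrt(p_hat * q_hat / n)
lower = p_hat - margin_of_error
upper = p_hat + margin_of_error
print(round(lower, 3), round(upper, 3))   # 0.069 0.155
```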
The confidence interval of 0.069 to 0.155 means we're 95% certain that if everyone who was taking Obecalp
was in the study, the true proportion of all Obecalp users who would experience headaches is somewhere
between 6.9% and 15.5%. We don't know exactly where in that range, but the true proportion is probably
somewhere in this range.
SUMMARY
You can create point estimates for population proportions, which is your sample proportion, and then
use that sample proportion to determine the margin of error for a confidence interval. First, verify the
conditions for inference are met, then construct and interpret a confidence interval based on the data
that you've gathered and the statistics that you've calculated.
Good luck!
TERMS TO KNOW
FORMULAS TO KNOW
Calculating Standard Error of a Sample
Proportion
by Sophia
WHAT'S COVERED
This tutorial will explain how to calculate the standard error of a sample proportion, both when the
population proportion is known and when it is unknown. Our discussion breaks down as
follows:
These students are either answering yes or no on the survey: "Yes, I have drunk some amount of alcohol" or "No,
I have not drunk any alcohol." That is qualitative data, also known as categorical data. Therefore,
we're dealing with a sample proportion.
Whenever we're dealing with a sample proportion, the next question we need to ask ourselves is, "Do I know
the population proportion?" In this case, we do not have that information. Therefore, the
formula to calculate the standard error is p-hat times q-hat, divided by n, all under the square root.
FORMULA
Standard Error of a Sample Proportion (p unknown)
√(p̂q̂ / n)
We're actually going to use the data that was given to us, which are estimates--that's what the hat indicates--in
order to calculate the standard error.
The first thing we need to do is figure out p-hat, based on the information given to us. In this case,
p-hat is what we're interested in: how many answered yes to participating in underage
drinking. That would be 188 out of 523 students, or 188/523, which is about 36% of the students.
Now, we also need the complement, q-hat, which is written as 1 minus p-hat. One minus 188/523, or
1 - 0.36, tells us that 0.64, or 64%, of the students have not participated in underage drinking. To check
that the math is correct, remember that p-hat and q-hat should add up to 1, because they are
complements of each other.
We have 0.36 for p-hat, 0.64 for q-hat, and the total sample, n, was 523 students. This calculates to a
standard error that is 0.021.
We are still looking at the students who were surveyed about underage drinking, but notice that this scenario
adds that the proportion of underage drinkers nationally is 39%. We're still calculating the standard error
of the sample proportion, but in this case, we know the population proportion, which is 39%. We're
going to use the formula of the square root of pq over n.
FORMULA
Standard Error of a Sample Proportion (p known)
√(pq / n)
We do not need to use p-hat, which is the 188 out of 523, to make the estimate for the standard error. We
actually know p, which is 39%, or 0.39.
In this case, we're going to use 0.39 for p. This is another way of indicating population proportion. We can
then use this to find q, which is the complement of p. The complement of 0.39 is calculated by 1 minus 0.39,
which equals 0.61, or 61%. Sometimes we'll see this written as p subscript 0 and q subscript 0.
The standard error is 0.021.
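Both versions of the standard error calculation can be sketched in Python (illustration only, using the numbers from this tutorial):

```python
import math

n = 523

# Population proportion unknown: estimate with p-hat and q-hat
p_hat = 188 / n
se_estimated = math.sqrt(p_hat * (1 - p_hat) / n)

# Population proportion known (39% nationally): use p and q directly
p = 0.39
se_known = math.sqrt(p * (1 - p) / n)

print(round(se_estimated, 3), round(se_known, 3))   # 0.021 0.021
```

Here both approaches happen to round to the same 0.021, since the sample proportion (36%) is close to the national proportion (39%).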
SUMMARY
Today we learned how to calculate the standard error of a sample proportion, and practiced identifying
which formula to use based on whether the population proportion is known or unknown.
TERMS TO KNOW
Standard Error
The standard deviation of the sampling distribution of sample means.
FORMULAS TO KNOW
Standard Error
Sample Means: σ/√n (population standard deviation known) or s/√n (unknown)
Sample Proportions: √(pq/n) (population proportion known) or √(p̂q̂/n) (unknown)
T-Tests
by Sophia
WHAT'S COVERED
In this tutorial, you will learn about t-tests, and how to determine key characteristics of a t-distribution.
Our discussion breaks down as follows:
FORMULA
z = (x̄ − μ) / (σ/√n)
However, the z-statistic was based on the fact that the population standard deviation was known. If the
population standard deviation is not known, we need a new statistic. We're going to use our sample standard
deviation, s, instead.
FORMULA
t = (x̄ − μ) / (s/√n)
HINT
This "s" over the square root of n value, replacing the sigma over square root of n value, is called the
standard error.
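The t-statistic is computed just like the z-statistic, with s in place of σ. A sketch in Python, reusing the height numbers from the earlier example as if the 2.5 inches were a sample standard deviation (a hypothetical reuse, not from this tutorial):

```python
import math

x_bar = 64.7   # sample mean
mu_0 = 63.8    # hypothesized population mean
s = 2.5        # SAMPLE standard deviation (sigma unknown)
n = 50

standard_error = s / math.sqrt(n)   # s / sqrt(n)
t = (x_bar - mu_0) / standard_error
print(round(t, 3))   # 2.546
```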
The only problem with using the sample standard deviation (s) as opposed to the population standard
deviation (σ) is that the value of s can vary considerably from sample to sample. Sigma (σ) is fixed, so we can
base our normal distribution on it.
The sample standard deviation is more variable than the population standard deviation, and much more
variable for small samples than for large samples. For large samples, s and sigma are very close, but with
small samples particularly, the value of s can vary wildly.
Because s is so variable, the test statistics follow a new distribution, much like the normal distribution,
known as Student's t-distribution, or sometimes just the t-distribution.
The only difference is that the t-distribution is a heavier-tailed distribution. If we used the normal distribution
instead, it would underestimate the proportion of extreme values in the sampling distribution.
The t-distribution is actually a family of distributions. They all are a little bit shorter than the standard normal
distribution and a little heavier on the tails. As the sample size gets larger, the t-distribution does get close to
the normal distribution. It doesn't diminish as quickly in the tails when the sample size is small, but gets very
close to the normal distribution when n is large.
TERM TO KNOW
T-Distribution
A family of distributions that are centered at zero and symmetric like the standard normal distribution,
but heavier in the tails. Depending on the sample size, it does not diminish towards the tails as fast. If
the sample size is large, the t-distribution approximates the normal distribution.
2. Conducting a T-Test
We're going to conduct a t-test for population means much like we conducted a z-test for population means.
Recall that when running a hypothesis test, there are four parts:
STEP BY STEP
The only difference between these two tests is that the test statistic is a t-statistic instead of a z-statistic.
Because we're using the t-distribution instead of the z-distribution, we're going to obtain a different
p-value.
Therefore, we will need a new table instead of the standard normal table. Below is the t-distribution table.
The tail probabilities, which give the potential p-values, appear in the top rows, and the t-values appear in
the body of the table.
Tail Probability, p
One-tail 0.25 0.20 0.15 0.10 0.05 0.025 0.02 0.01 0.005 0.0025 0.001 0.0005
Two-tail 0.50 0.40 0.30 0.20 0.10 0.05 0.04 0.02 0.01 0.005 0.002 0.001
df
1 1.000 1.376 1.963 3.078 6.314 12.71 15.89 31.82 63.66 127.3 318.3 636.6
2 0.816 1.080 1.386 1.886 2.920 4.303 4.849 6.965 9.925 14.09 22.33 31.60
3 0.765 0.978 1.250 1.638 2.353 3.182 3.482 4.541 5.841 7.453 10.21 12.92
4 0.741 0.941 1.190 1.533 2.132 2.776 2.999 3.747 4.604 5.598 7.173 8.610
5 0.727 0.920 1.156 1.476 2.015 2.571 2.757 3.365 4.032 4.773 5.893 6.869
6 0.718 0.906 1.134 1.440 1.943 2.447 2.612 3.143 3.707 4.317 5.208 5.959
7 0.711 0.896 1.119 1.415 1.895 2.365 2.517 2.998 3.499 4.029 4.785 5.408
8 0.706 0.889 1.108 1.397 1.860 2.306 2.449 2.896 3.355 3.833 4.501 5.041
9 0.703 0.883 1.100 1.383 1.833 2.262 2.398 2.821 3.250 3.690 4.297 4.781
10 0.700 0.879 1.093 1.372 1.812 2.228 2.359 2.764 3.169 3.581 4.144 4.587
11 0.697 0.876 1.088 1.363 1.796 2.201 2.328 2.718 3.106 3.497 4.025 4.437
12 0.695 0.873 1.083 1.356 1.782 2.179 2.303 2.681 3.055 3.428 3.930 4.318
13 0.694 0.870 1.079 1.350 1.771 2.160 2.282 2.650 3.012 3.372 3.852 4.221
14 0.692 0.868 1.076 1.345 1.761 2.145 2.264 2.624 2.977 3.326 3.787 4.140
15 0.691 0.866 1.074 1.341 1.753 2.131 2.249 2.602 2.947 3.286 3.733 4.073
16 0.690 0.865 1.071 1.337 1.746 2.120 2.235 2.583 2.921 3.252 3.686 4.015
17 0.689 0.863 1.069 1.333 1.740 2.110 2.224 2.567 2.898 3.222 3.646 3.965
18 0.688 0.862 1.067 1.330 1.734 2.101 2.214 2.552 2.878 3.197 3.610 3.922
19 0.688 0.861 1.066 1.328 1.729 2.093 2.205 2.539 2.861 3.174 3.579 3.883
20 0.687 0.860 1.064 1.325 1.725 2.086 2.197 2.528 2.845 3.153 3.552 3.850
21 0.686 0.859 1.063 1.323 1.721 2.080 2.189 2.518 2.831 3.135 3.527 3.819
22 0.686 0.858 1.061 1.321 1.717 2.074 2.183 2.508 2.819 3.119 3.505 3.792
23 0.685 0.858 1.060 1.319 1.714 2.069 2.177 2.500 2.807 3.104 3.485 3.767
24 0.685 0.857 1.059 1.318 1.711 2.064 2.172 2.492 2.797 3.091 3.467 3.745
25 0.684 0.856 1.058 1.316 1.708 2.060 2.167 2.485 2.787 3.078 3.450 3.725
26 0.684 0.856 1.058 1.315 1.706 2.056 2.162 2.479 2.779 3.067 3.435 3.707
27 0.684 0.855 1.057 1.314 1.703 2.052 2.158 2.473 2.771 3.057 3.421 3.690
28 0.683 0.855 1.056 1.313 1.701 2.048 2.154 2.467 2.763 3.047 3.408 3.674
29 0.683 0.854 1.055 1.311 1.699 2.045 2.150 2.462 2.756 3.038 3.396 3.659
30 0.683 0.854 1.055 1.310 1.697 2.042 2.147 2.457 2.750 3.030 3.385 3.646
40 0.681 0.851 1.050 1.303 1.684 2.021 2.123 2.423 2.704 2.971 3.307 3.551
50 0.679 0.849 1.047 1.299 1.676 2.009 2.109 2.403 2.678 2.937 3.261 3.496
60 0.679 0.848 1.045 1.296 1.671 2.000 2.099 2.390 2.660 2.915 3.232 3.460
80 0.678 0.846 1.043 1.292 1.664 1.990 2.088 2.374 2.639 2.887 3.195 3.416
100 0.677 0.845 1.042 1.290 1.660 1.984 2.081 2.364 2.626 2.871 3.174 3.390
1000 0.675 0.842 1.037 1.282 1.646 1.962 2.056 2.330 2.581 2.813 3.098 3.300
>1000 0.674 0.841 1.036 1.282 1.645 1.960 2.054 2.326 2.576 2.807 3.091 3.291
Confidence Level C 50% 60% 70% 80% 90% 95% 96% 98% 99% 99.5% 99.8% 99.9%
This table is actually one-sided: each entry in the t-table represents the probability of a value lying above t, in the upper tail.
The one new wrinkle that we're adding for the t-distribution is the value df (the far-left column), called the degrees of freedom. For our purposes, it is the sample size minus 1. We find our t-statistic in whatever row matches our degrees of freedom. If the statistic falls between two table values, our p-value is between the corresponding two tail probabilities.
EXAMPLE According to their bags, a standard bag of M&M's candies is supposed to weigh 47.9
grams. Suppose we randomly select 14 bags and got this distribution.
Assuming the distribution of bag weights is approximately normal, is this evidence that the bags do not
contain the amount of candy that they say that they do?
Step 1: State the null and alternative hypotheses.
The null is that the mean is 47.9 grams. The alternative is the mean is not 47.9 grams. We can select a
significance level of 0.05, which means that if the p-value is less than 0.05, reject the null hypothesis.
Criteria to verify:
Randomness: How were the data collected? The randomness should be stated somewhere in the problem. Think about the way the data were collected.
Independence: Population ≥ 10n. You want to make sure that the population is at least 10 times as large as the sample size.
Normality: There are two ways to verify normality. Either the parent distribution has to be normal, or the central limit theorem has to apply. The central limit theorem says that for most distributions, when the sample size is greater than 30, the sampling distribution will be approximately normal.
Randomness: It says in the problem that the bags were randomly selected.
Independence: We're going to go ahead and assume that there are at least 140 bags of M&M's. That's 10
times as large as the 14 bags in our sample.
Normality: Finally, it does say in the problem that the distribution of bag weights is approximately normal,
and so normality will be verified for our sampling distribution.
In Excel, we can type "=AVERAGE(", highlight the 14 values, and press "Enter" to get an average of 48.07. We can also find the standard deviation easily by typing "=STDEV.S(" and highlighting the 14 values. Both functions are also available under the Formulas tab, in the Statistical option.
This gives a standard deviation of 0.60.
Now, we can plug the known values into the t-statistic formula:

t = (x̄ − μ) / (s / √n) = (48.07 − 47.9) / (0.60 / √14) ≈ 1.06

By plugging in all the numbers that we have, we obtain a t-statistic of positive 1.06. What exactly does this tell us? We need to calculate the probability that we get a t-statistic of 1.06 or larger.
In the table, we also need to identify the degrees of freedom (df), which is the sample size minus one. The
sample was 14 bags, so our degrees of freedom is 14 minus 1, or 13.
Let's look at the t-table in row 13 to find the closest value to 1.06:
Tail Probability, p
One-tail 0.25 0.20 0.15 0.10 0.05 0.025 0.02 0.01 0.005 0.0025 0.001 0.0005
Two-tail 0.50 0.40 0.30 0.20 0.10 0.05 0.04 0.02 0.01 0.005 0.002 0.001
df
1 1.000 1.376 1.963 3.078 6.314 12.71 15.89 31.82 63.66 127.3 318.3 636.6
2 0.816 1.080 1.386 1.886 2.920 4.303 4.849 6.965 9.925 14.09 22.33 31.60
3 0.765 0.978 1.250 1.638 2.353 3.182 3.482 4.541 5.841 7.453 10.21 12.92
4 0.741 0.941 1.190 1.533 2.132 2.776 2.999 3.747 4.604 5.598 7.173 8.610
5 0.727 0.920 1.156 1.476 2.015 2.571 2.757 3.365 4.032 4.773 5.893 6.869
6 0.718 0.906 1.134 1.440 1.943 2.447 2.612 3.143 3.707 4.317 5.208 5.959
7 0.711 0.896 1.119 1.415 1.895 2.365 2.517 2.998 3.499 4.029 4.785 5.408
8 0.706 0.889 1.108 1.397 1.860 2.306 2.449 2.896 3.355 3.833 4.501 5.041
9 0.703 0.883 1.100 1.383 1.833 2.262 2.398 2.821 3.250 3.690 4.297 4.781
10 0.700 0.879 1.093 1.372 1.812 2.228 2.359 2.764 3.169 3.581 4.144 4.587
11 0.697 0.876 1.088 1.363 1.796 2.201 2.328 2.718 3.106 3.497 4.025 4.437
12 0.695 0.873 1.083 1.356 1.782 2.179 2.303 2.681 3.055 3.428 3.930 4.318
13 0.694 0.870 1.079 1.350 1.771 2.160 2.282 2.650 3.012 3.372 3.852 4.221
14 0.692 0.868 1.076 1.345 1.761 2.145 2.264 2.624 2.977 3.326 3.787 4.140
15 0.691 0.866 1.074 1.341 1.753 2.131 2.249 2.602 2.947 3.286 3.733 4.073
In all likelihood, the statistic is not one of the values listed in the row, but between two of them. Here, 1.06 falls between 0.870 and 1.079, which means that the p-value is between 0.40 and 0.30 on the two-tail row. Recall that we are testing to see if the mean weight of the M&M's bags is anything other than the hypothesized 47.9 grams, so this is a two-tailed test.
HINT
You will need to consider whether the problem calls for a one-tailed or two-tailed test. If the alternative hypothesis is only that the mean is less than (or only greater than) the hypothesized mean, use a one-tailed test. If we are looking for any value that is different from the hypothesized mean, use a two-tailed test.
Now, we can, in fact, use technology to nail down the p-value more exactly. We don't have to use this table,
although we can still use the table to answer the question about the null hypothesis.
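One way to nail the p-value down is a short Python sketch (standard library only) that recomputes the t-statistic from the summary numbers above (x̄ = 48.07, s = 0.60, n = 14) and approximates the two-tailed p-value by numerically integrating the t-density. Statistical software uses exact routines, so treat this as an illustration:

```python
import math

def t_pdf(x, df):
    # Student's t density with df degrees of freedom
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1.0 + x * x / df) ** (-(df + 1) / 2)

def t_tail(t, df, hi=60.0, steps=20000):
    # P(T > t) via the trapezoid rule on [t, hi]
    h = (hi - t) / steps
    area = 0.5 * (t_pdf(t, df) + t_pdf(hi, df))
    for i in range(1, steps):
        area += t_pdf(t + i * h, df)
    return area * h

# Summary statistics from the M&M's example
x_bar, mu, s, n = 48.07, 47.9, 0.60, 14
t_stat = (x_bar - mu) / (s / math.sqrt(n))  # about 1.06
p_two_tail = 2 * t_tail(t_stat, n - 1)      # lands between 0.30 and 0.40
```

The p-value lands between 0.30 and 0.40, agreeing with the bracketing we read off the df = 13 row, and is far above the 0.05 significance level, so we fail to reject the null hypothesis.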
Finally, the conclusion is that there's not sufficient evidence to conclude that M&M's bags are being filled to a
mean other than 47.9 grams.
SUMMARY
In cases where the population standard deviation is not known--which is almost always--we should
use the t-distribution to account for the additional variability introduced by using the sample standard
deviation in the test statistic. A t-test means that the value will be a "t" statistic instead of a "z" statistic.
The steps in the hypothesis test are the same as they are in a z-test: first stating the null and alternative
hypotheses, stating and verifying the conditions of the test, calculating the test statistic and the p-value,
and then finally, comparing the p-value to alpha and making a decision about the null hypothesis.
Good luck!
TERMS TO KNOW
T-Distribution/Student's T-Distribution
A family of distributions that are centered at zero and symmetric like the standard normal distribution,
but heavier in the tails. Depending on the sample size, it does not diminish towards the tails as fast. If
the sample size is large, the t-distribution approximates the normal distribution.
FORMULAS TO KNOW
T-Statistic for Population Means

t = (x̄ − μ) / (s / √n)
How to Find a Critical T Value
by Sophia
WHAT'S COVERED
This tutorial will explain how to find a critical t-value by using either a t-table or technology. Our
discussion breaks down as follows:
1. Left-Tailed Test
a. Graphing Calculator
b. T-Table
c. Excel
2. Right-Tailed Test
a. Graphing Calculator
b. T-Table
c. Excel
3. Two-Sided Test
a. Graphing Calculator
b. T-Table
c. Excel
1. Left-Tailed Test
Remember, a critical value is a value that corresponds to the number of standard deviations away from the
mean that we're willing to attribute to chance. How far from the center of our distribution can a t-test statistic
fall? We'll decide to either fail to reject the null hypothesis or reject the null hypothesis.
For a left-tailed test, let's find the critical t* for a hypothesis test, with eight degrees of freedom, that would
reject the H0 at a 2.5% significance level.
1a. Graphing Calculator
On a graphing calculator, entering "invT(0.025,8)" gives a corresponding critical t-value of negative 2.306. This falls on the distribution where the lower shaded region corresponds to the lower 2.5% of our distribution, or negative 2.306.
Any t-test statistic that we calculate for this corresponding hypothesis test that is less than negative 2.306
means we would reject the null hypothesis. Anything greater than that critical value is in this safe region that is
unshaded. We would just attribute it to chance and we would fail to reject the null hypothesis.
1b. T-Table
Using the t-table to find our critical t-value--remember, this is a lower-tail test--we're going to locate the closest
thing to 2.5%, or 0.025, for the one-tail probability. We also know that we have eight degrees of freedom.
t-Distribution Critical Values
Tail Probability, p
One-tail 0.25 0.20 0.15 0.10 0.05 0.025 0.02 0.01 0.005 0.0025 0.001 0.0005
Two-tail 0.50 0.40 0.30 0.20 0.10 0.05 0.04 0.02 0.01 0.005 0.002 0.001
df
1 1.000 1.376 1.963 3.078 6.314 12.71 15.89 31.82 63.66 127.3 318.3 636.6
2 0.816 1.080 1.386 1.886 2.920 4.303 4.849 6.965 9.925 14.09 22.33 31.60
3 0.765 0.978 1.250 1.638 2.353 3.182 3.482 4.541 5.841 7.453 10.21 12.92
4 0.741 0.941 1.190 1.533 2.132 2.776 2.999 3.747 4.604 5.598 7.173 8.610
5 0.727 0.920 1.156 1.476 2.015 2.571 2.757 3.365 4.032 4.773 5.893 6.869
6 0.718 0.906 1.134 1.440 1.943 2.447 2.612 3.143 3.707 4.317 5.208 5.959
7 0.711 0.896 1.119 1.415 1.895 2.365 2.517 2.998 3.499 4.029 4.785 5.408
8 0.706 0.889 1.108 1.397 1.860 2.306 2.449 2.896 3.355 3.833 4.501 5.041
9 0.703 0.883 1.100 1.383 1.833 2.262 2.398 2.821 3.250 3.690 4.297 4.781
10 0.700 0.879 1.093 1.372 1.812 2.228 2.359 2.764 3.169 3.581 4.144 4.587
A tail probability of 0.025 and eight degrees of freedom is going to correspond to a critical t-value of 2.306.
⭐ BIG IDEA
Now, one thing with the t-table--unlike the z-table where we have a positive table and a negative table--is
that it's all positive. However, we can use it for both positive and negative values.
Since this is a left-tailed test, we have to recognize that it's a lower-tail test; we're lower than the mean of the
distribution. So the critical value should actually be a negative 2.306. Always be careful of that when using the
t-table.
1c. Excel
To find a critical t-value in Excel, go under the Formulas tab. We're going to insert a function. Under the
Statistical column, we're going to look for T.INV, or inverse of T. We're going to put in the corresponding
significance level, 0.025, comma, degrees of freedom, which was 8. Hit Enter.
Notice how we get the same value we did on the calculator and in the table--a negative 2.306.
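The calculator's invT and Excel's T.INV can be imitated by inverting the tail probability numerically. The Python sketch below (standard library only; bisection over a trapezoid-rule tail integral is our own rough approximation, not the real algorithm) reproduces the df = 8, 2.5% lower-tail critical value:

```python
import math

def t_pdf(x, df):
    # Student's t density with df degrees of freedom
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1.0 + x * x / df) ** (-(df + 1) / 2)

def t_tail(t, df, hi=60.0, steps=4000):
    # P(T > t) via the trapezoid rule on [t, hi]
    h = (hi - t) / steps
    area = 0.5 * (t_pdf(t, df) + t_pdf(hi, df))
    for i in range(1, steps):
        area += t_pdf(t + i * h, df)
    return area * h

def upper_critical_t(alpha, df):
    """Positive t* with P(T > t*) = alpha, found by bisection."""
    low, high = 0.0, 15.0
    for _ in range(60):
        mid = (low + high) / 2
        if t_tail(mid, df) > alpha:
            low = mid
        else:
            high = mid
    return (low + high) / 2

# Left-tailed test, alpha = 0.025, df = 8: negate by symmetry
t_star = -upper_critical_t(0.025, 8)  # about -2.306
```

Bisection works because the tail probability shrinks steadily as t* grows; by symmetry of the t-distribution, negating the upper-tail value gives the lower-tail critical value, just as the text does with the table.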
2. Right-Tailed Test
For this second example, we're going to look at a right-tailed test and find the critical t* for a hypothesis test
with only four degrees of freedom that would reject the null hypothesis at 5% significance level.
Because it's a right-tailed test, we need to find the cutoff for our t-scores. This will be on the upper part of this
distribution that corresponds to the top 5% of our distribution.
2a. Graphing Calculator
Our calculator, using "invT(0.95,4)", would give us a corresponding critical t-value of 2.132. This falls right about here on the distribution and corresponds to the top 5%.
It makes sense that this is a positive value, because we're above the mean. So, any t-test statistic that is above
2.132 for this particular hypothesis test means we would reject the null. Anything below that value, we'd
attribute to chance and we'd fail to reject.
2b. T-Table
Now we're going to use our t-table to find our critical t-value with a significance level of 5% for an upper-tail test. Again, looking at our top row, which has the one-tail probabilities, we have 5%, or 0.05. We are also looking at four degrees of freedom.
t-Distribution Critical Values
Tail Probability, p
One-tail 0.25 0.20 0.15 0.10 0.05 0.025 0.02 0.01 0.005 0.0025 0.001 0.0005
Two-tail 0.50 0.40 0.30 0.20 0.10 0.05 0.04 0.02 0.01 0.005 0.002 0.001
df
1 1.000 1.376 1.963 3.078 6.314 12.71 15.89 31.82 63.66 127.3 318.3 636.6
2 0.816 1.080 1.386 1.886 2.920 4.303 4.849 6.965 9.925 14.09 22.33 31.60
3 0.765 0.978 1.250 1.638 2.353 3.182 3.482 4.541 5.841 7.453 10.21 12.92
4 0.741 0.941 1.190 1.533 2.132 2.776 2.999 3.747 4.604 5.598 7.173 8.610
5 0.727 0.920 1.156 1.476 2.015 2.571 2.757 3.365 4.032 4.773 5.893 6.869
6 0.718 0.906 1.134 1.440 1.943 2.447 2.612 3.143 3.707 4.317 5.208 5.959
7 0.711 0.896 1.119 1.415 1.895 2.365 2.517 2.998 3.499 4.029 4.785 5.408
That will correspond to a critical t*, of 2.132. Because it is an upper-tail test, we're above the center of the distribution, so it should be positive. We're going to leave the critical value as a positive 2.132.
2c. Excel
Now let's use Excel to find our critical t-value. Again, go under our Formulas tab and look under the Statistical
column for T.INV, or inverse T. In this case, remember that because it's an upper-tail test, we have to put in
95% or 0.95 since that corresponds to the upper 5% from a cutoff value. Next, enter a comma, and then our
degrees of freedom, which was 4.
We get the same critical t-value we did in our calculator and on our table--a positive 2.132.
3. Two-Sided Test
For this last example, we're going to look at a two-sided test and find the critical t-value for a hypothesis test
with 13 degrees of freedom that would reject the null hypothesis at 1% significance level.
Because it's a two-sided test, we have to divide the significance level of 1%, or 0.01, onto both sides of our
distribution. Half of 0.01 is equal to 0.005. This means we're going to be finding that critical value, that cutoff,
for the lower 0.5%, or 0.005, of the distribution and the upper 0.5% of our distribution.
3a. Graphing Calculator
Entering "invT(0.005,13)" on the graphing calculator gives a corresponding critical t-value of negative 3.012, which is about here on our distribution.
This would correspond to the lower 0.5% of our distribution, and it should be negative since it's below the
mean. We're going to do the same thing, but for the upper 0.5% of our distribution. Remember, in our
distribution, we always read left to right from 0% to 100%. So, upper 0.5% actually corresponds to 99.5%, or
0.995, of our distribution. On the graphing calculator, we would enter "invT(0.995,13)".
For this particular hypothesis test, if we got a t-test statistic that was above a positive 3.012 or below a
negative 3.012, we would reject the null. Anything in between, we'd attribute to chance and we would fail to
reject the null.
3b. T-Table
Now we're going to use our t-table to find our critical t-value. We had 13 degrees of freedom and our
significance level was 1%. Keep in mind that this is the tail probability for two tails, meaning each tail will have
0.5%.
If you take a look at our table, we actually have both one-tailed and two-tailed probabilities listed. You can use either row and get the same critical value.
Go down to the row that shows 13 degrees of freedom and find the corresponding critical value.
t-Distribution Critical Values
Tail Probability, p
One-tail 0.25 0.20 0.15 0.10 0.05 0.025 0.02 0.01 0.005 0.0025 0.001 0.0005
Two-tail 0.50 0.40 0.30 0.20 0.10 0.05 0.04 0.02 0.01 0.005 0.002 0.001
df
1 1.000 1.376 1.963 3.078 6.314 12.71 15.89 31.82 63.66 127.3 318.3 636.6
2 0.816 1.080 1.386 1.886 2.920 4.303 4.849 6.965 9.925 14.09 22.33 31.60
3 0.765 0.978 1.250 1.638 2.353 3.182 3.482 4.541 5.841 7.453 10.21 12.92
4 0.741 0.941 1.190 1.533 2.132 2.776 2.999 3.747 4.604 5.598 7.173 8.610
5 0.727 0.920 1.156 1.476 2.015 2.571 2.757 3.365 4.032 4.773 5.893 6.869
6 0.718 0.906 1.134 1.440 1.943 2.447 2.612 3.143 3.707 4.317 5.208 5.959
7 0.711 0.896 1.119 1.415 1.895 2.365 2.517 2.998 3.499 4.029 4.785 5.408
8 0.706 0.889 1.108 1.397 1.860 2.306 2.449 2.896 3.355 3.833 4.501 5.041
9 0.703 0.883 1.100 1.383 1.833 2.262 2.398 2.821 3.250 3.690 4.297 4.781
10 0.700 0.879 1.093 1.372 1.812 2.228 2.359 2.764 3.169 3.581 4.144 4.587
11 0.697 0.876 1.088 1.363 1.796 2.201 2.328 2.718 3.106 3.497 4.025 4.437
12 0.695 0.873 1.083 1.356 1.782 2.179 2.303 2.681 3.055 3.428 3.930 4.318
13 0.694 0.870 1.079 1.350 1.771 2.160 2.282 2.650 3.012 3.372 3.852 4.221
14 0.692 0.868 1.076 1.345 1.761 2.145 2.264 2.624 2.977 3.326 3.787 4.140
15 0.691 0.866 1.074 1.341 1.753 2.131 2.249 2.602 2.947 3.286 3.733 4.073
We get a corresponding critical t-value of 3.012. But because it's a two-sided test, it's both the positive and the
negative 3.012.
3c. Excel
Now we're going to use Excel to find our critical t-value for our two-sided test. This one is a little different from the previous two, which were one-tailed tests. Again, go under the Statistical column. In this case, we're going to use T.INV.2T, the inverse of T for two tails.
In this case, we do not have to divide our significance level into the two halves; we can just put 0.01. Excel
knows--because we indicated it's a two-tailed test--to automatically divide that 1%. We were also at 13 degrees
of freedom. Therefore, it gives us the positive corresponding critical t-value of 3.012. However, we know that
it's not only a positive 3.012, but also a negative 3.012.
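The two-sided critical value can be found the same way as before: split the 1% significance level in half and invert the resulting 0.005 upper-tail probability at df = 13. A Python sketch (standard library only; the numerical inversion is an approximation of what invT and T.INV.2T compute):

```python
import math

def t_pdf(x, df):
    # Student's t density with df degrees of freedom
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1.0 + x * x / df) ** (-(df + 1) / 2)

def t_tail(t, df, hi=60.0, steps=4000):
    # P(T > t) via the trapezoid rule on [t, hi]
    h = (hi - t) / steps
    area = 0.5 * (t_pdf(t, df) + t_pdf(hi, df))
    for i in range(1, steps):
        area += t_pdf(t + i * h, df)
    return area * h

def upper_critical_t(alpha, df):
    """Positive t* with P(T > t*) = alpha, found by bisection."""
    low, high = 0.0, 15.0
    for _ in range(60):
        mid = (low + high) / 2
        if t_tail(mid, df) > alpha:
            low = mid
        else:
            high = mid
    return (low + high) / 2

# Two-sided test at 1% significance: each tail gets 0.005
t_star = upper_critical_t(0.01 / 2, 13)  # about 3.012
critical_values = (-t_star, t_star)
```

Note that the code divides the significance level by two itself, matching the by-hand method; T.INV.2T does that division internally.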
SUMMARY
We learned how to calculate the critical t-value using either a t-table or Excel, for a left-tailed test,
right-tailed test, and two-sided test. At the end of this lesson, we've attached a PDF where you can try
some examples for yourself.
Good luck!
TERMS TO KNOW
Critical Value
A value that can be compared to the test statistic to decide the outcome of a hypothesis test
How to Find a P-Value from a T-Test Statistic
by Sophia
WHAT'S COVERED
In this tutorial, you will learn how to find a p-value from a t-test statistic using the calculator, the t-table,
and Excel. Our discussion breaks down as follows:
1. Right-Tailed Test
a. Graphing Calculator
b. T-Table
c. Excel
2. Two-Tailed Test
a. Graphing Calculator
b. T-Table
c. Excel
1. Right-Tailed Test
The average ACT score in Illinois is 20.9. One high school in particular believes its students scored significantly better than the state average. Because the school believes that its students performed better, this is an upper-tail test.
In order to test this hypothesis, the school took a random sample of 15 students' scores and got an average of
22.5, with a standard deviation of 2.3.
Let's first ask ourselves which type of data we are dealing with. In this case, we're dealing with quantitative
data, and we do not know the population standard deviation. We just know the sample standard deviation, so
we're going to do a t-test.
Computing the test statistic:

t = (x̄ − μ) / (s / √n) = (22.5 − 20.9) / (2.3 / √15) ≈ 2.694

This gives us a t-score of 2.694. This score is plotted on the t-distribution below.
When we're entering in the values for a "tcdf," it is always going to be the lower boundary of the shaded
region, the upper boundary of the shaded region, then the degrees of freedom. Remember, the shape of this
distribution changes with the sample size. In this case, the lower boundary of our shaded region is 2.694. The
upper boundary is the top portion of our distribution, which is positive infinity. In order to indicate positive
infinity to our calculator, we just enter a positive 99. Our degrees of freedom will be 14 because the degrees of
freedom is sample size minus one, or 15 minus 1.
1b. T-Table
We can also use the t-table. Now, tables sometimes can only get us an estimated p-value. They can't get us
an exact p-value, like the calculator or Excel. But sometimes it's all we have, and it's definitely sufficient. In this
case, remember we had a t-test statistic of 2.694. The values inside the t-table are all t-scores: the left-hand column lists the degrees of freedom, and the top row contains the corresponding tail probabilities (p-values).
What we're going to do is look for the corresponding degrees of freedom for our hypothesis test. In this case
we had 14 degrees of freedom. In that same row, we're going to look for the closest thing possible to the t-score of 2.694.
Tail Probability, p
One-tail 0.25 0.20 0.15 0.10 0.05 0.025 0.02 0.01 0.005 0.0025 0.001 0.0005
Two-tail 0.50 0.40 0.30 0.20 0.10 0.05 0.04 0.02 0.01 0.005 0.002 0.001
df
1 1.000 1.376 1.963 3.078 6.314 12.71 15.89 31.82 63.66 127.3 318.3 636.6
2 0.816 1.080 1.386 1.886 2.920 4.303 4.849 6.965 9.925 14.09 22.33 31.60
3 0.765 0.978 1.250 1.638 2.353 3.182 3.482 4.541 5.841 7.453 10.21 12.92
4 0.741 0.941 1.190 1.533 2.132 2.776 2.999 3.747 4.604 5.598 7.173 8.610
5 0.727 0.920 1.156 1.476 2.015 2.571 2.757 3.365 4.032 4.773 5.893 6.869
6 0.718 0.906 1.134 1.440 1.943 2.447 2.612 3.143 3.707 4.317 5.208 5.959
7 0.711 0.896 1.119 1.415 1.895 2.365 2.517 2.998 3.499 4.029 4.785 5.408
8 0.706 0.889 1.108 1.397 1.860 2.306 2.449 2.896 3.355 3.833 4.501 5.041
9 0.703 0.883 1.100 1.383 1.833 2.262 2.398 2.821 3.250 3.690 4.297 4.781
10 0.700 0.879 1.093 1.372 1.812 2.228 2.359 2.764 3.169 3.581 4.144 4.587
11 0.697 0.876 1.088 1.363 1.796 2.201 2.328 2.718 3.106 3.497 4.025 4.437
12 0.695 0.873 1.083 1.356 1.782 2.179 2.303 2.681 3.055 3.428 3.930 4.318
13 0.694 0.870 1.079 1.350 1.771 2.160 2.282 2.650 3.012 3.372 3.852 4.221
14 0.692 0.868 1.076 1.345 1.761 2.145 2.264 2.624 2.977 3.326 3.787 4.140
15 0.691 0.866 1.074 1.341 1.753 2.131 2.249 2.602 2.947 3.286 3.733 4.073
Now, 2.694 falls somewhere in between these two values, 2.624 and 2.977. These t-scores correspond to the
two p-values, 0.01 and 0.005, or 1% and 0.5%. Since it falls somewhere in the middle, we are actually just
going to take the average of these two p-values.
To do this, take 1% plus 0.5%; divide that by 2, and we get an estimated p-value of 0.0075, or 0.75% when
using the t-table. It is not exactly the same as the value we found using the calculator; however, it is very close.
⭐ BIG IDEA
The t-table only shows positive values; however, you can still use this same method for a left-tailed test by looking up the absolute value of your t-statistic.
1c. Excel
To convert our t-test statistic into a p-value using Excel, go under the Formulas tab. We are going to insert a
formula that falls under the Statistical column. We are looking for t-distribution dot rt, or T.DIST.RT, because
we're performing a right-tail test.
Notice how there is no T.DIST.LT. If we're performing a left-tail test, we would just use the T.DIST. But since
we're performing an upper-tail test, we are going to use T.DIST.RT. The first thing we are going to put in is the
t-score, which was a positive 2.694, then 14 degrees of freedom. Hit Enter.
Notice how we get the same p-value of 0.0087, or 0.87%, as when we used our calculator.
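The whole right-tailed calculation, from sample summary to p-value, can also be sketched in a few lines of Python (standard library only; the trapezoid-rule tail integral is a rough stand-in for tcdf or T.DIST.RT, not their actual implementation):

```python
import math

def t_pdf(x, df):
    # Student's t density with df degrees of freedom
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1.0 + x * x / df) ** (-(df + 1) / 2)

def t_tail(t, df, hi=60.0, steps=20000):
    # P(T > t) via the trapezoid rule on [t, hi]
    h = (hi - t) / steps
    area = 0.5 * (t_pdf(t, df) + t_pdf(hi, df))
    for i in range(1, steps):
        area += t_pdf(t + i * h, df)
    return area * h

# ACT example: sample mean 22.5, s = 2.3, n = 15, hypothesized mean 20.9
x_bar, mu, s, n = 22.5, 20.9, 2.3, 15
t_stat = (x_bar - mu) / (s / math.sqrt(n))  # about 2.694
p_value = t_tail(t_stat, n - 1)             # right-tailed, about 0.0087
```

This reproduces the 0.0087 from the calculator and Excel, and also shows why the table's averaged estimate of 0.0075 is close but not exact.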
2. Two-Tailed Test
The quality assurance team at a soda company wants to make sure that its machines are working correctly in filling its 12-ounce cans of soda. Because it would be a problem if the machines were significantly over-filling or under-filling the cans of soda, the company needs to perform a two-sided test.
The company took a random sample of 10 cans of its soda off the production line and got an average of 11.8
ounces with a standard deviation of 1.2 ounces.
We're looking at quantitative data, which is ounces of soda, and we also do not know the population standard
deviation. Therefore, we're going to perform a t-test.
Computing the test statistic:

t = (x̄ − μ) / (s / √n) = (11.8 − 12) / (1.2 / √10) ≈ −0.527

The t-test statistic is negative 0.527. This score is plotted on the t-distribution below.
The lower boundary of the shaded region for this problem is negative infinity. In order for the calculator to
recognize negative infinity, we put in a negative 99. The upper boundary of the shaded region is the t-score,
which is a negative 0.527. We have nine degrees of freedom, because our sample size was 10. We get a
corresponding p-value of 0.3055, or 30.55%.
However, this is not our final answer. This was a two-sided test, and this is only the p-value that's associated
with the left-tail, or the left-shaded region. We also have to get the corresponding upper-tail as well, which
would fall at a positive 0.527.
Luckily, because these two areas are equal, all we have to do is take this p-value that we got on the
calculator, and multiply it by two. The p-value for this problem is 0.3055 times 2, or 61.1%.
2b. T-Table
Now, let's find the p-value using the t-table. To get the p-value from the t-test statistic of negative 0.527, remember, we're going to look at the corresponding degrees of freedom, which in this case is nine, and we're going to find the closest t-score we can to what we calculated. Our value of 0.527 falls below even the first entry in the row, 0.703.
Tail Probability, p
One-tail 0.25 0.20 0.15 0.10 0.05 0.025 0.02 0.01 0.005 0.0025 0.001 0.0005
Two-tail 0.50 0.40 0.30 0.20 0.10 0.05 0.04 0.02 0.01 0.005 0.002 0.001
df
1 1.000 1.376 1.963 3.078 6.314 12.71 15.89 31.82 63.66 127.3 318.3 636.6
2 0.816 1.080 1.386 1.886 2.920 4.303 4.849 6.965 9.925 14.09 22.33 31.60
3 0.765 0.978 1.250 1.638 2.353 3.182 3.482 4.541 5.841 7.453 10.21 12.92
4 0.741 0.941 1.190 1.533 2.132 2.776 2.999 3.747 4.604 5.598 7.173 8.610
5 0.727 0.920 1.156 1.476 2.015 2.571 2.757 3.365 4.032 4.773 5.893 6.869
6 0.718 0.906 1.134 1.440 1.943 2.447 2.612 3.143 3.707 4.317 5.208 5.959
7 0.711 0.896 1.119 1.415 1.895 2.365 2.517 2.998 3.499 4.029 4.785 5.408
8 0.706 0.889 1.108 1.397 1.860 2.306 2.449 2.896 3.355 3.833 4.501 5.041
9 0.703 0.883 1.100 1.383 1.833 2.262 2.398 2.821 3.250 3.690 4.297 4.781
10 0.700 0.879 1.093 1.372 1.812 2.228 2.359 2.764 3.169 3.581 4.144 4.587
There's nothing that we can estimate. We can't take an estimate between two values, because our t-score
falls below the very first value. Because we're doing a two-sided test, we can just look at the row for two-tailed
and get a p-value of 0.5, or 50%. We could have also looked at the one-tailed row and doubled the value of
0.25 to get the same answer.
2c. Excel
To convert our t-score into a p-value using Excel, we're going to go under the Formulas tab and insert a
formula again under the Statistical column. But because this is a two-sided test, we want t-distribution dot 2t,
or T.DIST.2T, for two tails. Even though the t-score that we calculated was negative 0.527, in Excel you always put in the positive value. We're going to go ahead and put in the positive 0.527 with our degrees of freedom of 9.
Notice how we get the same p-value of 61.1% that we did when using our calculator.
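The two-tailed logic described above, computing one tail and doubling it by symmetry, can be sketched in Python (standard library only; the tail integral is a numerical approximation, not the calculator's actual tcdf routine):

```python
import math

def t_pdf(x, df):
    # Student's t density with df degrees of freedom
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1.0 + x * x / df) ** (-(df + 1) / 2)

def t_tail(t, df, hi=60.0, steps=20000):
    # P(T > t) via the trapezoid rule on [t, hi]
    h = (hi - t) / steps
    area = 0.5 * (t_pdf(t, df) + t_pdf(hi, df))
    for i in range(1, steps):
        area += t_pdf(t + i * h, df)
    return area * h

# Soda example: sample mean 11.8 oz, s = 1.2 oz, n = 10, hypothesized 12 oz
x_bar, mu, s, n = 11.8, 12.0, 1.2, 10
t_stat = (x_bar - mu) / (s / math.sqrt(n))   # about -0.527
# Two-tailed: double the single-tail area, using |t| by symmetry
p_two_tail = 2 * t_tail(abs(t_stat), n - 1)  # about 0.611
```

Doubling the one-tail area is valid here because the t-distribution is symmetric about zero, which is the same shortcut the calculator and T.DIST.2T rely on.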
SUMMARY
Today we learned how to find a p-value from a t-test statistic using the calculator, the t-table, and
Excel. We did this for two types of tests: a right-tailed test and a two-tailed test.
Good luck!
TERMS TO KNOW
P-value
The probability that the test statistic is that value or more extreme in the direction of the alternative
hypothesis
Test Statistic
A measurement, in standardized units, of how far a sample statistic is from the assumed parameter if the
null hypothesis is true
Confidence Intervals Using the T-Distribution
by Sophia
WHAT'S COVERED
This tutorial will discuss confidence intervals, specifically those that use the t-distribution. Our
discussion breaks down as follows:
When calculating the confidence interval for a sampling distribution, you would normally take the sample mean plus or minus some number of standard deviations times the standard error, or x̄ ± z*(σ/√n).
However, the only problem is that this formula has a sigma in it, which is the population standard deviation. For situations where we don't know what the population standard deviation is, you have to replace this formula with one that uses "s", the sample standard deviation.
Since you're using s as a stand-in for sigma, you need to use the t-distribution instead, which gives the following formula:
FORMULA
Confidence Interval of Population Mean

x̄ ± t*(s/√n)
TERMS TO KNOW
Confidence Interval
An interval we are some percent certain (e.g., 90%, 95%, or 99%) will contain the population parameter,
given the value of our sample statistic.
T-Distribution
A distribution similar to the normal distribution but with fatter tails. Depending on the sample size, it
does not diminish toward the tails as fast.
T-Distribution
To construct a confidence interval for population means using the t-distribution, the following steps must be
followed:
STEP BY STEP
Step 1: Verify the conditions for inference.
Step 2: Calculate the confidence interval.
Step 3: Interpret the confidence interval.
EXAMPLE Many times, consumers pay attention to the nutritional contents on packaged food, so
it's important that the labels accurately reflect what the food product actually contains. Suppose, for
example, that the stated calorie content for a particular frozen dinner was 240.
A random sample of 12 frozen dinners was selected, and the calorie content of each one was
determined.
One of the boxes actually contained 255 calories' worth of food, whereas another one contained only 225
calories' worth of food. We can quickly calculate the mean and standard deviation by using Excel.
First, enter all 12 values. Go to the Formulas tab, and we will use the formula AVERAGE under the Statistical
option to find the sample mean and highlight all the values.
The sample mean is 244.33 calories. To find the sample standard deviation, again go under the Formulas tab,
and select STDEV.S under the Statistical option. Highlight all 12 values.
The sample standard deviation is 12.38.
Suppose you want to construct a 90% confidence interval for the true mean number of calories. This means
that you want to construct a confidence interval such that you’re 90% confident that the true mean of all the
packaged frozen dinners lies within the interval.
Stating the conditions isn't enough, and it's not just a formality: you must verify them. Recall the conditions
needed:
Condition Description
Randomness The data come from a random sample
Independence The population is at least 10 times the size of the sample
Normality n ≥ 30 or normal parent distribution
Let's go back to our example to check the requirements of randomness, independence, and normality.
Randomness: It was a random sample, as was said in the problem.
Independence: Is the population of all frozen dinners at least 10 times the size of your sample? That's
reasonable to believe. Assume there are at least 120 frozen dinners in all of this company's frozen dinner
line.
Normality: This one's a little tricky. Your sample size isn't 30 or larger, so the Central Limit Theorem
doesn't apply to this problem. Is the parent distribution normal? You don't know that either. You need to
determine if this is plausible. You can do that by graphing the actual data that you have.
You can see that the parent distribution might be normal since the data that you got from the population
are single peaked and approximately symmetric. It's possible that the population parent distribution is
normal. You can proceed under the assumption of normality. You can’t verify it 100%, but assume it is for
the purposes of this problem.
Step 2: Calculate the confidence interval.
Reviewing the formula, we need the sample mean, the sample standard deviation, the sample size, and the t-
critical value. We have already figured out the information about the sample with the help from Excel:
We know that 244.33 is the sample mean and the standard deviation is 12.38 when we used the data of the 12
dinners. This information also tells us that the sample size is 12. What we still need to do is figure out what that
t* value is going to be.
To find this value, we need a t-distribution table. We need a t* that will capture the middle 90% of the
t-distribution. A 90% confidence interval means that there is 10% remaining for the two tails combined, or 5% for each tail.
Looking at the table, we can match this information with the values 0.05 in the row of one-tailed or 0.10 in the
row of two-tailed. We can also look all the way down at the bottom and see that there is a row that says
"Confidence Interval." There is 50%, 60%, 70%, 80%, 90%, etc. Either one of those justifications is reason
enough to use this column.
We also need to know the degrees of freedom to determine which number from this column we're going to
use. In this problem, we have 11 degrees of freedom because we had 12 dinners in our sample, and the
degrees of freedom is n minus 1.
So we need to look for the corresponding value inside that table that matches with 11 degrees of freedom and
90% confidence interval.
Tail Probability, p
One-tail 0.25 0.20 0.15 0.10 0.05 0.025 0.02 0.01 0.005 0.0025 0.001 0.0005
Two-tail 0.50 0.40 0.30 0.20 0.10 0.05 0.04 0.02 0.01 0.005 0.002 0.001
df
1 1.000 1.376 1.963 3.078 6.314 12.71 15.89 31.82 63.66 127.3 318.3 636.6
2 0.816 1.080 1.386 1.886 2.920 4.303 4.849 6.965 9.925 14.09 22.33 31.60
3 0.765 0.978 1.250 1.638 2.353 3.182 3.482 4.541 5.841 7.453 10.21 12.92
4 0.741 0.941 1.190 1.533 2.132 2.776 2.999 3.747 4.604 5.598 7.173 8.610
5 0.727 0.920 1.156 1.476 2.015 2.571 2.757 3.365 4.032 4.773 5.893 6.869
6 0.718 0.906 1.134 1.440 1.943 2.447 2.612 3.143 3.707 4.317 5.208 5.959
7 0.711 0.896 1.119 1.415 1.895 2.365 2.517 2.998 3.499 4.029 4.785 5.408
8 0.706 0.889 1.108 1.397 1.860 2.306 2.449 2.896 3.355 3.833 4.501 5.041
9 0.703 0.883 1.100 1.383 1.833 2.262 2.398 2.821 3.250 3.690 4.297 4.781
10 0.700 0.879 1.093 1.372 1.812 2.228 2.359 2.764 3.169 3.581 4.144 4.587
11 0.697 0.876 1.088 1.363 1.796 2.201 2.328 2.718 3.106 3.497 4.025 4.437
12 0.695 0.873 1.083 1.356 1.782 2.179 2.303 2.681 3.055 3.428 3.930 4.318
13 0.694 0.870 1.079 1.350 1.771 2.160 2.282 2.650 3.012 3.372 3.852 4.221
14 0.692 0.868 1.076 1.345 1.761 2.145 2.264 2.624 2.977 3.326 3.787 4.140
15 0.691 0.866 1.074 1.341 1.753 2.131 2.249 2.602 2.947 3.286 3.733 4.073
16 0.690 0.865 1.071 1.337 1.746 2.120 2.235 2.583 2.921 3.252 3.686 4.015
17 0.689 0.863 1.069 1.333 1.740 2.110 2.224 2.567 2.898 3.222 3.646 3.965
18 0.688 0.862 1.067 1.330 1.734 2.101 2.214 2.552 2.878 3.197 3.610 3.922
19 0.688 0.861 1.066 1.328 1.729 2.093 2.205 2.539 2.861 3.174 3.579 3.883
20 0.687 0.860 1.064 1.325 1.725 2.086 2.197 2.528 2.845 3.153 3.552 3.850
21 0.686 0.859 1.063 1.323 1.721 2.080 2.189 2.518 2.831 3.135 3.527 3.819
22 0.686 0.858 1.061 1.321 1.717 2.074 2.183 2.508 2.819 3.119 3.505 3.792
23 0.685 0.858 1.060 1.319 1.714 2.069 2.177 2.500 2.807 3.104 3.485 3.767
24 0.685 0.857 1.059 1.318 1.711 2.064 2.172 2.492 2.797 3.091 3.467 3.745
25 0.684 0.856 1.058 1.316 1.708 2.060 2.167 2.485 2.787 3.078 3.450 3.725
26 0.684 0.856 1.058 1.315 1.706 2.056 2.162 2.479 2.779 3.067 3.435 3.707
27 0.684 0.855 1.057 1.314 1.703 2.052 2.158 2.473 2.771 3.057 3.421 3.690
28 0.683 0.855 1.056 1.313 1.701 2.048 2.154 2.467 2.763 3.047 3.408 3.674
29 0.683 0.854 1.055 1.311 1.699 2.045 2.150 2.462 2.756 3.038 3.396 3.659
30 0.683 0.854 1.055 1.310 1.697 2.042 2.147 2.457 2.750 3.030 3.385 3.646
40 0.681 0.851 1.050 1.303 1.684 2.021 2.123 2.423 2.704 2.971 3.307 3.551
50 0.679 0.849 1.047 1.299 1.676 2.009 2.109 2.403 2.678 2.937 3.261 3.496
60 0.679 0.848 1.045 1.296 1.671 2.000 2.099 2.390 2.660 2.915 3.232 3.460
80 0.678 0.846 1.043 1.292 1.664 1.990 2.088 2.374 2.639 2.887 3.195 3.416
100 0.677 0.845 1.042 1.290 1.660 1.984 2.081 2.364 2.626 2.871 3.174 3.390
1000 0.675 0.842 1.037 1.282 1.646 1.962 2.056 2.330 2.581 2.813 3.098 3.300
>1000 0.674 0.841 1.036 1.282 1.645 1.960 2.054 2.326 2.576 2.807 3.091 3.291
50% 60% 70% 80% 90% 95% 96% 98% 99% 99.5% 99.8% 99.9%
Look in the 11 degrees of freedom row and the 90% confidence column until we obtain a t* of 1.796.
Now we have all the information needed in order to create our confidence interval. Construct it as x-bar plus
or minus the t critical value times the sample standard deviation divided by the square root of sample size.
When we do that, we obtain 244.33 plus or minus 6.42. When we evaluate the interval, it goes from 237.91
all the way up to 250.75.
What does this confidence interval actually mean? How can you interpret the interval? The interpretation is
that we're 90% confident that the true mean calorie content of all frozen dinners is between about 238 and
251 calories. We're 90% confident that the real value is somewhere in there, and that the 240 value that they
were purporting at the beginning of the problem is, in fact, plausible.
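The arithmetic above can be sketched in Python. The 12 individual calorie values are not reproduced in the text, so the rounded summary statistics and the table value t* = 1.796 are taken as given:

```python
import math

# Sample statistics for the 12 frozen dinners (computed earlier in Excel)
x_bar = 244.33   # sample mean, in calories
s = 12.38        # sample standard deviation
n = 12           # sample size
t_star = 1.796   # t* for 90% confidence with df = n - 1 = 11, from the t-table

# Margin of error: t* times the standard error s / sqrt(n)
margin = t_star * s / math.sqrt(n)

# Confidence interval: x-bar plus or minus the margin of error
lower, upper = x_bar - margin, x_bar + margin

print(round(margin, 2))                  # 6.42
print(round(lower, 2), round(upper, 2))  # 237.91 250.75
```

If SciPy is available, the critical value can be computed directly with scipy.stats.t.ppf(0.95, df=11) instead of being read from the table.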
SUMMARY
Today we learned about confidence intervals specifically for means using the t-distribution. We can
create point estimates for the population means using x-bar, and determine the margin of error. That
margin of error is the "t* times s over the square root of n" piece of the confidence interval. First, we
verify that conditions are met. Then, we construct and interpret the confidence interval.
Good luck!
TERMS TO KNOW
Confidence Interval
An interval we are some percent certain (e.g., 90%, 95%, or 99%) will contain the population parameter,
given the value of our sample statistic.
t-distribution
A family of distributions similar to the standard normal distribution, except that they are fatter in the tails,
due to the increased variability associated with using the sample standard deviation instead of the
population standard deviation in the formula for the test statistic.
FORMULAS TO KNOW
Confidence Interval of a Population Mean
CI = x̄ ± t* · (s / √n)
Calculating Standard Error of a Sample Mean
by Sophia
WHAT'S COVERED
This tutorial will explain how to calculate standard error for both sample means and sample
proportions, covering cases where the population standard deviation is unknown and where it is known.
FORMULA
Standard Error of a Sample Mean
SE = s / √n
It is the sample standard deviation, s, over the square root of the sample size, n.
EXAMPLE The amount of fallen snow, in inches, is recorded for one week in Minneapolis.
At first, we have 1.5 inches of snow, so type 1.5, then hit Enter. Next it snowed 3 inches, then it snowed a
lot, 4.75 inches up to 8 inches. Then it tapered off to 0.3 inches. On Friday, there were 2 inches of snow.
Finally, on Saturday, there were 2.95 inches of snow.
Once we've entered all of that data, we can exit out of this screen by hitting 2nd Mode. Now, to get s, the
standard deviation of this sample, we need to get the sample statistics from that list of data. To do so, hit
the Stat button, scroll over to Calc, and select the first function, 1-Var Stats (one-variable statistics).
Go ahead and hit Enter. We want it for List One, so we're going to hit 2nd 1, and we can see the L1 in the
upper left hand corner above the one button. Hit Enter, and we get all sorts of useful data for this set of
data.
We have the x-bar, which is the mean. We also have s of x and sigma of x, as well as the sample size,
which is seven.
We also get the five number summary, which is quite useful for other types of problems. However, for this
problem, we're interested in the standard deviation of the sample. Remember, this is a sample, not a
population, so we want to use the value for s of x, not sigma of x. Remember, population parameters are
denoted with Greek letters.
In this case, since it is just a sample, we're going to use the 2.526 (which is rounded). We'll divide 2.526 by
the square root of 7 to calculate the standard error.
Moving to Excel, notice how we get the same value for s that we got on our calculator under s of x. Now, we're going to finish
calculating the standard error, which was s divided by the square root of n. In a new cell, we'll type the
equal sign and then click on A1 with our mouse since that is where the s-value is. This automatically inserts
that value. Next, divide by the square root of n. To get the square root, we have to insert a formula. The
square root is just under "Math and Trigonometry", and is indicated with SQRT. Enter the sample size of 7.
Notice how we get the same value as we did on our calculator for standard error, 0.955.
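The same computation the calculator and Excel perform can be sketched with the Python standard library:

```python
import math
import statistics

# Snowfall (inches) recorded for the seven days in Minneapolis
snow = [1.5, 3, 4.75, 8, 0.3, 2, 2.95]

# Sample standard deviation: this is the s-of-x value, not sigma-of-x,
# because the week of data is a sample rather than a population
s = statistics.stdev(snow)

# Standard error of the sample mean: s divided by the square root of n
se = s / math.sqrt(len(snow))

print(round(s, 3))   # 2.526
print(round(se, 3))  # 0.955
```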
FORMULA
Standard Error of a Sample Proportion (population standard deviation unknown)
SE = √(p̂ · q̂ / n)
The formula to calculate the standard error is p-hat times q-hat, divided by n, all under the square root. We're
actually going to use the data that was given to us, which are estimates -- that's what the hat indicates.
EXAMPLE A survey is conducted at the local high schools to find out about underage drinking. Of
the 523 students who replied to the survey, 188 replied that they have consumed some amount of
alcohol.
We have 0.36 for p-hat, 0.64 for q-hat, and the total sample, n, was 523 students. This calculates to a
standard error that is 0.021.
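A quick sketch of this computation, using the exact fraction 188/523 rather than the rounded 0.36 (the result agrees to three decimal places):

```python
import math

# Survey: 188 of the 523 students reported having consumed alcohol
n = 523
p_hat = 188 / n      # sample proportion, about 0.36
q_hat = 1 - p_hat    # complement, about 0.64

# Standard error of the sample proportion: sqrt(p-hat * q-hat / n)
se = math.sqrt(p_hat * q_hat / n)
print(round(se, 3))  # 0.021
```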
FORMULA
Standard Error of a Sample Proportion (population standard deviation known)
SE = √(p · q / n)
We do not need to use the sample data, p-hat, to make the estimate for the standard error. When we actually
know the population parameters, we can use this information instead.
EXAMPLE Revisiting our prior example, a survey is conducted at the local high schools to find out
about underage drinking. Of the 523 students who replied to the survey, 188 replied that they have
consumed some amount of alcohol. The proportion of underage drinkers nationally is 39%.
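With the national proportion available, the sketch changes only in using p = 0.39 in place of the sample estimate:

```python
import math

# Known national proportion of underage drinkers, used instead of p-hat
p = 0.39
q = 1 - p    # 0.61
n = 523      # sample size from the survey

# Standard error using the known population proportion: sqrt(p * q / n)
se = math.sqrt(p * q / n)
print(round(se, 3))  # 0.021
```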
SUMMARY
Today we learned how to calculate standard error, and practiced identifying which formula to use,
based on the information given for these three formulas: Sample Means, Sample Proportion
(population standard deviation is unknown), and Sample Proportion (population standard deviation is
known).
TERMS TO KNOW
Standard Error
The standard deviation of the sampling distribution of sample means.
FORMULAS TO KNOW
Standard Error
Sample Means: SE = s / √n
Sample Proportions (population standard deviation unknown): SE = √(p̂ · q̂ / n)
Sample Proportions (population standard deviation known): SE = √(p · q / n)
Analysis of Variance/ANOVA
by Sophia
WHAT'S COVERED
This tutorial will cover tests for three or more population means and the process for analysis of
variance (ANOVA). Our discussion breaks down as follows:
1. ANOVA
a. Conditions
b. Null and Alternative Hypothesis
c. F-Statistic
d. Concluding the ANOVA Test
1. ANOVA
Comparing three or more means requires a new hypothesis test called analysis of variance (ANOVA): the AN
is for "analysis," the O is for "of," and the VA is for "variance." For ANOVA, we compare the means by
analyzing the sample variances from the independently selected samples.
EXAMPLE Suppose a factory supervisor wants to know whether it takes his workers different
amounts of time to complete a task based on their proficiency level. The factory employs apprentices,
novices, and masters. The supervisor randomly selects ten workers from each group and has them
perform the task.
The summary of the data, which is the time in minutes to complete the task, is shown in this table here:
Proficiency n x̄ s
1a. Conditions
There are a few conditions necessary for an ANOVA test:
1. Independent samples from the populations.
2. Each population has to be normally distributed.
3. The variances, and therefore the standard deviations of all those normal distributions, are the same.
For the above factory scenario, let's assume that the above three conditions are met.
1b. Null and Alternative Hypothesis
Null Hypothesis: H0: μA = μN = μM; the mean time required to complete the task is the same for the
masters, the novices, and the apprentices.
Alternative Hypothesis: Ha: At least one of the mean times is different from another.
Alpha Level: α = 0.05
1c. F-Statistic
When you do an ANOVA test, the statistic that you use is not going to be a z or t, as you have been using in
the past. Instead, you will use what is called an "F". An F statistic is calculated by taking the quotient of the
variability between the samples and the variability within each sample.
FORMULA
F-Statistic
F = (variability between the samples) / (variability within each sample)
⭐ BIG IDEA
A small F is consistent with the null hypothesis, versus a large F statistic, which is evidence against the
null hypothesis. You wouldn't reject it if F was small.
Almost always, you will calculate the ANOVA F statistic and the p-value with technology. All but the most
simple, straightforward problems will be calculated using technology.
In our factory scenario, the F statistic, calculated with technology, is 1.418. That is not a very large value of F.
The corresponding p-value is 0.26, which is a very large p-value.
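Since the factory's raw task times are not reproduced here, the following is a minimal sketch of how an F statistic is assembled, run on three small hypothetical samples (in practice you would use technology such as scipy.stats.f_oneway):

```python
import statistics

def anova_f(groups):
    """One-way ANOVA F statistic: variability between samples over variability within samples."""
    k = len(groups)                              # number of groups
    n = sum(len(g) for g in groups)              # total number of observations
    means = [statistics.mean(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n

    # Between-groups mean square: how far the group means sit from the grand mean
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    msb = ssb / (k - 1)

    # Within-groups mean square: spread of each sample around its own mean
    ssw = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    msw = ssw / (n - k)
    return msb / msw

# Three small hypothetical samples of task times, in minutes
f_stat = anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_stat)  # 3.0
```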
TERM TO KNOW
F Statistic
The test statistic in an ANOVA test. It is the ratio of the variability between the samples to the
variability within each sample. If the null hypothesis is true, the F statistic will probably be small.
1d. Concluding the ANOVA Test
If the p-value is less than the significance level, you would reject the null hypothesis.
If the p-value is greater than the significance level, you would fail to reject the null hypothesis.
In the factory scenario, since the p-value of 0.26 is very large, greater than the 0.05 significance level, you fail
to reject the null hypothesis. There's no evidence that suggests that the time required to complete the task
differs significantly with proficiency level.
SUMMARY
ANOVA, or analysis of variance, allows you to compare three or more means by comparing the
variability within each sample to the variability between the samples. The null hypothesis is that all the
means are the same, and the alternative hypothesis is that at least one of them is different. A small F
is consistent with the null hypothesis, versus a large F statistic, which is evidence against the null
hypothesis. The F and the p-value are almost always calculated with technology.
Good luck!
TERMS TO KNOW
F statistic
The test statistic in an ANOVA test. It is the ratio of the variability between the samples to the variability
within each sample. If the null hypothesis is true, the F statistic will probably be small.
One-Way ANOVA/Two-Way ANOVA
by Sophia
WHAT'S COVERED
This tutorial will cover the difference between a one-way ANOVA test and a two-way ANOVA test. Our
discussion breaks down as follows:
1. Types of ANOVA Tests
a. One-Way ANOVA
b. Two-Way ANOVA
EXAMPLE Suppose that you had a 10-point cleanliness scale that you were ranking detergents on.
Detergent Average Cleanliness
Tide 8.3
All 6.4
Era 5.5
Based on this one factor, detergent, you are trying to see how clean the clothes get on average. You
would need more information, such as sample size and standard deviation, but this is a situation which
would lead you to an ANOVA test.
Because we're only looking at the one factor of detergent affecting cleanliness, this case would be
considered a one-way ANOVA.
TERM TO KNOW
One-Way ANOVA
A hypothesis test that compares three or more population means with respect to a single
characteristic or factor.
EXAMPLE Consider the scenario from above that compared cleanliness of different types of
detergents. Now, suppose that you included another factor: water temperature.
Water Temperature
It's possible that some of these detergents do a better job of cleaning in different temperatures of water.
Now that you have all of this additional information, you're actually looking at 12 treatments, four
detergents and three water temperatures for each detergent.
There are two factors contributing to the cleanliness score:
Type of Detergent
Water Temperature
Because there are two factors that are affecting the cleanliness score, we can still do an ANOVA test, but
this time, it's called a two-way ANOVA.
TERM TO KNOW
Two-Way ANOVA
A hypothesis test that compares three or more population means with respect to multiple
characteristics or factors.
SUMMARY
In one-way ANOVA, you can consider population means that are based on just one characteristic. In
two-way ANOVA, you consider the comparisons based on multiple characteristics or factors.
Good luck!
TERMS TO KNOW
One-Way ANOVA
A hypothesis test that compares three or more population means with respect to a single characteristic
or factor.
Two-Way ANOVA
A hypothesis test that compares three or more population means with respect to multiple characteristics
or factors.
Chi-Square Statistic
by Sophia
WHAT'S COVERED
This tutorial will cover the chi-square statistic, discussing how to calculate the observed frequency
and expected frequency of a data set.
The observed frequency is the number of observations we actually see for a value, or what actually
happened. The expected frequency is what we would expect to happen. It is the number of observations we
would see for a value if the null hypothesis was true.
HINT
In this tutorial, you will not run any significance tests because the chi-square tests have many different
versions, and each of them will have their own tutorial. This tutorial is going to focus on how the statistic is
calculated, as it's calculated the same regardless of the test you're running.
To measure the discrepancy between what you observed and what you expected, we need to calculate the
chi-square statistic, which is calculated this way:
STEP BY STEP
Step 1: For each category, subtract the expected frequency from the observed frequency.
Step 2: Square each of those differences.
Step 3: Divide each squared difference by the expected frequency for that category.
Step 4: Add up all of those values.
FORMULA
Chi-Square Statistic
χ² = Σ (O − E)² / E, where O is the observed frequency and E is the expected frequency
EXAMPLE Suppose you have a tin of colored beads, and you claim that the tin contains the
colored beads in these proportions: 35% blue, 35% green, 15% yellow, and 15% red. These will be used
to find the expected frequencies.
You draw 10 beads from the tin: 4 red, 3 blue, 1 green, and 2 yellow. These will be your observed
frequencies.
Is what you drew consistent with the percentages you claimed or not? Why or why not?
If the claim were true, we would have expected that out of 10 beads, 3 1/2 of them would be blue, 3 1/2
green, 1 1/2 yellow, and 1 1/2 red. These are called the expected frequencies and can be calculated by
multiplying the sample size by the hypothesized proportion.
Color Percentage Expected out of 10
Blue 35% 3.5
Green 35% 3.5
Yellow 15% 1.5
Red 15% 1.5
You can't actually pull 3 1/2 blue beads, because you can't have half of a bead. Therefore, this is an
idealized scenario, representative of what you might expect in the long-term in samples of 10.
In your one sample of 10 beads, what you actually got was: 3 blue, 1 green, 2 yellow, and 4 red. The two
yellow beads drawn seem fairly consistent with the 15% claim. However, the four red beads that were drawn
do not seem consistent with the 15% claim for red.
How can you measure that discrepancy? We can calculate the chi-square statistic using the above
formula. First, subtract each expected frequency from the corresponding observed frequency, square that
value, and divide by the expected frequency. Finally, add up all of those calculations.
χ² = (3 − 3.5)²/3.5 + (1 − 3.5)²/3.5 + (2 − 1.5)²/1.5 + (4 − 1.5)²/1.5 = 0.0714 + 1.7857 + 0.1667 + 4.1667 = 6.1905
The 3 1/2, 3 1/2, 1 1/2, and 1 1/2 were the expected frequencies and the observed frequencies were the 3, 1,
2, and 4. Using the formula, we get a chi-square statistic value of 6.1905.
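The bead calculation can be reproduced in a few lines:

```python
# Observed draws of 10 beads and the expected frequencies under the claim
observed = [3, 1, 2, 4]          # blue, green, yellow, red
expected = [3.5, 3.5, 1.5, 1.5]  # 35%, 35%, 15%, 15% of 10 beads

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_square, 4))  # 6.1905
```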
So what do we do with this chi-square statistic? We can find this value, along with the degrees of
freedom, in a chi-square distribution table to determine whether we reject or fail to reject the null hypothesis
by comparing it to the critical value for the predetermined significance level.
HINT
You can use a table to calculate the chi-square statistic or you can use technology.
Now, it's worth noting that in this case, the conditions for inference with a chi-square test are not met. This
is only meant to illustrate how a chi-square statistic would be calculated, although you can't do any real
chi-square inference on this because the sample size isn't large enough.
TERMS TO KNOW
Chi-Square Statistic
The sum of the ratios of the squared differences between the expected and observed counts to the
expected counts.
Observed Frequencies
The number of occurrences that were observed within each of the categories in a qualitative
distribution.
Expected Frequencies
The number of occurrences we would have expected within each of the categories in a qualitative
distribution if the null hypothesis were true.
IN CONTEXT
Suppose there are four flavors of candy in a bag: cherry, lemon, orange, and strawberry. The
company claims the flavors are equally distributed in each bag.
After opening a bag of candy and sorting the flavors, the following counts were produced:
Flavor Observed
Cherry 11
Lemon 15
Orange 12
Strawberry 12
Total 50
For an equal distribution, it is helpful to think of the proportions of each flavor and then make a
hypothesis based on those proportions. For the null hypothesis, we can assume that the proportions
for the four flavors are the same. The alternative hypothesis would state that this is not true; that the
proportions are not the same.
H0: pC = pL = pO = pS
Ha: The proportions of the flavors are not the same.
Next, we need to compare the observed frequency with the expected frequency. The observed
frequencies are the same as the above counts.
To find the expected frequency, we need to find the number of occurrences if the null hypothesis is
true, which in this case, was that the flavor proportions are equal, or if the four flavor categories
were all evenly distributed. Counting up all the flavors in that bag of candy gives us a total of 50
candies. If the flavor categories were evenly distributed among the 50 candies, we would need to
divide the total candies evenly between the four flavors, so 50 divided by 4, or 12.5 candies. This
means we would expect 12.5 candies in each flavor.
Flavor Observed Expected
Cherry 11 12.5
Lemon 15 12.5
Orange 12 12.5
Strawberry 12 12.5
We can then use the chi-square formula to calculate the chi-square statistic to compare the
discrepancy between the expected and observed frequencies.
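Carrying out that calculation for the candy counts gives a small chi-square value, consistent with the company's equal-proportions claim:

```python
# Observed flavor counts; expected is 50 / 4 = 12.5 per flavor under the null
observed = [11, 15, 12, 12]   # cherry, lemon, orange, strawberry
expected = 50 / 4

chi_square = sum((o - expected) ** 2 / expected for o in observed)
print(round(chi_square, 2))  # 0.72
```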
A middle school is gathering information on its after-school clubs because it was assumed that the
students in each grade were evenly distributed across the clubs, meaning there were the same number
of 6th graders in each club, the same number of 7th graders in each club, and the same number of 8th
graders in each club.
This table lists the number of students from each grade participating in each club.
Club 6th Grade 7th Grade 8th Grade
Coding Club 12 14 8
Photography Club 7 11 15
Debate Club 9 5 13
Suppose we want to find the observed frequency for 7th graders participating in the photography
club. Using the chart, we can directly see the observed frequency for 7th graders participating in the
photography club is 11.
To find the expected frequency for 7th graders participating in the photography club, we need to
find the number of occurrences if the null hypothesis is true, which in this case, was that the three
options are equally likely, or if the students in each grade were all evenly distributed across the
clubs.
If each of these three clubs were evenly distributed among the 30 7th graders, we would need to
divide the total evenly between the three options: 30 ÷ 3 = 10.
This means we would expect 10 7th graders to participate in the coding club, 10 7th graders to
participate in the photography club, and 10 7th graders to participate in the debate club.
In summary, the observed and expected frequencies for 7th graders participating in photography
club is:
Observed: 11
Expected: 10
SUMMARY
The chi-square statistic is a measure of discrepancy across categories from what you would have
expected in categorical data. You can only use it for data that appear in categories or qualitative data.
The expected values may not be whole numbers since the expected values are long-term average
values.
Good luck!
TERMS TO KNOW
Chi-Square Statistic
The sum of the ratios of the squared differences between the expected and observed counts to the
expected counts.
Expected Frequencies
The number of occurrences we would have expected within each of the categories in a qualitative
distribution if the null hypothesis were true.
Observed Frequencies
The number of occurrences that were observed within each of the categories in a qualitative
distribution.
FORMULAS TO KNOW
Chi-Square Statistic
χ² = Σ (O − E)² / E
Chi-Square Test for Goodness-of-Fit
by Sophia
WHAT'S COVERED
This tutorial will cover how to calculate a chi-square test statistic for a chi-square test of goodness-of-fit.
The chi-square distribution is a right-skewed distribution that measures the discrepancy between a
sample of categorical data and what that sample would look like if you had an idea of what the population
should look like in those categories.
The p-value is always the area in the chi-square distribution to the right of your particular chi-square statistic
that we end up calculating. The values on the left (low values of chi-square) are likely to happen by chance,
and high values of chi-square are unlikely to happen by chance.
Just like the t-distribution, the chi-square distribution is actually a family of curves. The shape changes a little
bit, based on the degrees of freedom, but it's always skewed to the right.
HINT
The degrees of freedom for the chi-square distribution is the number of categories minus 1.
The conditions for using the chi-square distribution are:
1. The data are counts for the categories of a qualitative variable from a simple random sample.
2. Independence: the population is at least 10 times the size of the sample.
3. The expected frequency in each category is at least 5.
EXAMPLE In the book Outliers, Malcolm Gladwell outlines a trend that he finds in professional
hockey, related to birth month. Suppose a random sample of 512 professional hockey players was
taken and their birth month was recorded.
Given the following information about birth month for the population, what would you expect for the
number of hockey players born in each month?
Month % of Population Observed # of Hockey Players
January 8% 51
February 7% 46
March 8% 61
April 8% 49
May 8% 46
June 8% 49
July 9% 36
August 9% 41
September 9% 36
October 9% 34
November 8% 33
December 9% 30
Total 512
Are the recorded values what you would have expected, given the general population? It certainly appears
that the earlier months of the year have larger numbers of NHL players born in them, which is not very
consistent with the nearly uniform distribution of the population.
What you would have expected is that, of those 512 professional hockey players, 8% of them would have
been born in January, 7% of them would have been born in February, etc. We can find the expected value for
each month based on the given percentages of the population to get the following values:
Month % of Population Expected # of Hockey Players Observed # of Hockey Players
January 8% 40.96 51
February 7% 35.84 46
March 8% 40.96 61
April 8% 40.96 49
May 8% 40.96 46
June 8% 40.96 49
July 9% 46.08 36
August 9% 46.08 41
September 9% 46.08 36
October 9% 46.08 34
November 8% 40.96 33
December 9% 46.08 30
Total 512 512
We would have expected 9% of the players, or 46.08, to have been born in each of July, August, September, October, and December. However, apparently just 30 were born in December.
Let's perform a chi-square goodness-of-fit test for this set of data to determine whether the discrepancy is statistically significant:
Simple Random Sample: You can treat it as such. This was a sample of hockey players born between 1980 and 1990, and there's no reason to imagine that it's going to be particularly different or unrepresentative. Therefore, you can treat this as a random sample of players who have played or will play professional hockey.

Independence: You have to assume that there are at least 10 times as many players who have ever played pro hockey as there were in our sample, so that the observations can be treated as independent. That would mean assuming there are at least 5,120 players who have ever played pro hockey.
FORMULA
Chi-Square Test
χ² = Σ (Observed − Expected)² / Expected
The chi-square statistic is the observed count minus the expected count for each month, squared, divided by the expected count for that month, and summed across all twelve months.
Month       Expected # of Hockey Players   Observed # of Hockey Players   (O − E)²/E
January     40.96                          51                             2.46
February    35.84                          46                             2.88
March       40.96                          61                             9.80
April       40.96                          49                             1.58
May         40.96                          46                             0.62
June        40.96                          49                             1.58
July        46.08                          36                             2.20
August      46.08                          41                             0.56
September   46.08                          36                             2.20
October     46.08                          34                             3.17
November    40.96                          33                             1.55
December    46.08                          30                             5.61
Sum                                                                       34.21
(Components are rounded to two decimal places.)
When you add all of those components together, you get the chi-square value of 34.21.
In this case, it's also a good idea to state the degrees of freedom, which is the number of categories minus 1. There were 512 hockey players, but there were 12 categories (months). So the degrees of freedom is 12 minus 1, or 11.
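These calculations can be sketched in a few lines of Python. This is just an illustration, not part of the original tutorial; the percentages and observed counts are taken from the tables above.

```python
# Chi-square goodness-of-fit statistic for the hockey birth-month data.
percents = [8, 7, 8, 8, 8, 8, 9, 9, 9, 9, 8, 9]              # % of population, Jan-Dec
observed = [51, 46, 61, 49, 46, 49, 36, 41, 36, 34, 33, 30]  # hockey players, Jan-Dec

n = sum(observed)                             # 512 players in the sample
expected = [p / 100 * n for p in percents]    # e.g., 8% of 512 = 40.96

# Sum the (observed - expected)^2 / expected components across all 12 months.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                        # number of categories minus 1

print(round(chi_square, 2), df)               # chi-square is about 34.2 with 11 df
```

Note that the expected counts here are long-term averages and need not be whole numbers.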
The p-value can be obtained from technology or with a table. When using a table, you go down to the line that
corresponds with the degrees of freedom and look for the chi-square value.
χ2 Critical Values
Degrees Tail Probability Values, p
of freedom
(df) 0.250 0.200 0.150 0.100 0.050 0.025 0.020 0.010 0.005 0.0025 0.001 0.0005
1 1.32 1.64 2.07 2.71 3.84 5.02 5.41 6.63 7.88 9.14 10.83 12.12
2 2.77 3.22 3.79 4.61 5.99 7.38 7.82 9.21 10.60 11.98 13.82 15.20
3 4.11 4.64 5.32 6.25 7.81 9.35 9.84 11.34 12.84 14.32 16.27 17.73
4 5.39 5.99 6.74 7.78 9.49 11.14 11.67 13.23 14.86 16.42 18.47 20.00
5 6.63 7.29 8.12 9.24 11.07 12.83 13.33 15.09 16.75 18.39 20.51 22.11
6 7.84 8.56 9.45 10.64 12.59 14.45 15.03 16.81 18.55 20.25 22.46 24.10
7 9.04 9.80 10.75 12.02 14.07 16.01 16.62 18.48 20.28 22.04 24.32 26.02
8 10.22 11.03 12.03 13.36 15.51 17.53 18.17 20.09 21.95 23.77 26.12 27.87
9 11.39 12.24 13.29 14.68 16.92 19.02 19.63 21.67 23.59 25.46 27.88 29.67
10 12.55 13.44 14.53 15.99 18.31 20.48 21.16 23.21 25.19 27.11 29.59 31.42
11 13.70 14.63 15.77 17.29 19.68 21.92 22.62 24.72 26.76 28.73 31.26 33.14
12 14.85 15.81 16.99 18.55 21.03 23.34 24.05 26.22 28.30 30.32 32.91 34.82
13 15.98 16.98 18.20 19.81 22.36 24.74 25.47 27.69 29.82 31.88 34.53 36.48
14 17.12 18.15 19.40 21.06 23.68 26.12 26.87 29.14 31.32 33.43 36.12 38.11
Going down to the line for 11 degrees of freedom, our chi-square statistic of 34.21 exceeds 33.14, the critical value for a tail probability of 0.0005. That means the p-value is less than 0.0005, far below the significance level of 0.05.
Step 4: Compare your test statistic to your chosen critical value, or your p-value to your chosen
significance level. Based on how they compare, state a decision about the null hypothesis and conclusion
in the context of the problem.
Since your p-value (less than 0.0005) is so low, you can't attribute the difference from the "norm" to chance alone.
This means that you must reject the null hypothesis in favor of the alternative and conclude that the
distribution of birth months for professional hockey players differs significantly from the birth month
distribution for the general populace.
IN CONTEXT
A manufacturing company claims that they expect five defects per day. This means that they believe
the defects are evenly distributed across Monday through Friday.
A manager collects data on the days of the week and records the following information:

Day         Expected Defects   Observed Defects
Monday      5                  6
Tuesday     5                  8
Wednesday   5                  4
Thursday    5                  2
Friday      5                  5
Let's perform a chi-square test for goodness of fit to determine whether the variation we see in the observations is due to random chance, or whether the defects truly deviate from an even distribution.
Step 2: Check conditions. Let's check the three conditions for this hypothesis test.
Simple Random Sample: We can assume that the manager collected data randomly
throughout the days of the week.
Independence: You have to assume that there have been at least 10 times as many defects
at this manufacturing company as there were in our sample, such that we can assume that
independence piece. That would mean that you have to assume that there have been at
least 250 defects in this company's history.
Expected Counts At Least 5: When you look at the entire row of expected values, all of them are 5, so this condition is satisfied.
We can use the chi-square formula to calculate the test statistic: take the observed minus the expected, square those values, divide by the expected, and finally sum everything that we find.

Day         Expected   Observed   (O − E)²/E
Monday      5          6          0.2
Tuesday     5          8          1.8
Wednesday   5          4          0.2
Thursday    5          2          1.8
Friday      5          5          0.0
Sum                               4.0
The chi-square test statistic for this data set is equal to 4. We can use a chi-square table or technology to find the p-value that corresponds to this value. We also need the degrees of freedom, which is the number of categories minus 1, or 5 minus 1. So, in this case, the chi-square statistic and the degrees of freedom are both 4. Applying this information and using technology, we find a p-value of 0.40601.
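The p-value here can also be computed without a table: when the degrees of freedom are even, the chi-square tail probability has a closed form, P(X > x) = e^(−x/2) · Σ (x/2)^k / k! for k = 0 to df/2 − 1. A small Python sketch (an illustration, not part of the original tutorial):

```python
import math

# Expected (5 per weekday) and observed defect counts, Monday through Friday.
expected = [5, 5, 5, 5, 5]
observed = [6, 8, 4, 2, 5]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                  # number of categories minus 1 = 4

# For even df, the chi-square tail probability has a closed form:
# P(X > x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
half = chi_square / 2
p_value = math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

print(chi_square, df, round(p_value, 5))   # 4.0 4 0.40601
```

With x = 4 and df = 4, this reduces to e^(−2)(1 + 2) = 0.40601, matching the value found with technology.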
Step 4: Compare your test statistic to your chosen critical value, or your p-value to your
chosen significance level. Based on how they compare, state a decision about the null
hypothesis and conclusion in the context of the problem.
Remember, our significance level was 0.05. In this case, our p-value is greater than our significance level, so we cannot reject the null hypothesis; the observed variation in defects is consistent with an even distribution across the week.
TERM TO KNOW
SUMMARY
The chi-square statistic is a measure of discrepancy across categories from what we would have
expected in our categorical data. The expected values might not be whole numbers, since each
expected value is a long term average. The chi-square distribution is a skewed right distribution, and
chi-square statistics near zero are more common if the null hypothesis is true. The goodness-of-fit test
is used to see if the distribution across categories for data fit a hypothesized distribution across
categories.
Good luck!
TERMS TO KNOW
FORMULAS TO KNOW
Chi-Square Statistic
χ² = Σ (Observed − Expected)² / Expected
Chi-Square Test for Homogeneity
by Sophia
WHAT'S COVERED
This tutorial is going to run through a chi-square test of homogeneity. Our discussion breaks down as
follows:
Instead of comparing the distributions to some hypothesized distribution, you compare whether or not two
sample distributions are significantly different from each other.
STEP BY STEP
EXAMPLE Suppose that two colleges, the U and State, are worried about student drinking behavior, so they both independently choose random samples of their students. The results are given in the table here:
Drinking Level   The U   State   Total
High             63      16      79
The question is, does there appear to be a difference in drinking behaviors between the two colleges? In both schools, those who drink a lot represent the smallest category and those who drink a little represent the largest, so perhaps the schools are not that different. You can run a test, though, to confirm or dispute whether that's the case.
H0: The distribution of drinking levels is the same for The U as it is for State.
Ha: The distribution of drinking levels is not the same for The U as it is for State.
α: 0.05
The idea here is that if the two distributions were homogeneous, then 16.2% of students at The U would not drink at all and 16.2% at State would not drink at all.
When you calculated the expected value for "None" at The U, you divided 326 by 2017 to get the 16.2%, and then multiplied by 981. In other words, we multiplied the total of "None" by the total of "The U" and divided all of that by the grand total.
In general, what we can say is that the expected values for each cell are going to be the row total times the
column total over the grand total.
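As a quick check of that rule in Python (an illustration only; the totals 326 for "None", 79 for "High", 981 for The U, and 2017 overall are the ones used in this example):

```python
# Expected cell count for a two-way table: row total * column total / grand total.
def expected_count(row_total, col_total, grand_total):
    return row_total * col_total / grand_total

grand = 2017                  # all students sampled at both schools
the_u = 981                   # column total for The U

none_u = expected_count(326, the_u, grand)   # "None" row at The U
high_u = expected_count(79, the_u, grand)    # "High" row at The U (smallest cell)

print(round(none_u, 2), round(high_u, 2))    # 158.56 38.42
```

The second value, 38.42, is the smallest expected count in the table, which is why it is the one checked against the at-least-five condition below.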
FORMULA
Expected Count
Expected Count = (row total × column total) / grand total
From that, it's not too hard to create an entire table of expected values.
[Two tables with columns Drinking Level, The U, and State: the observed counts on the left, and the expected counts on the right.]
The table on the left is what you observed; the table on the right is what you expected. Again, these values
don't have to be integers.
The conditions for this hypothesis test are met: you have two independent random samples and all cell
counts in the expected table are at least five, the smallest one being 38.42.
FORMULA
Chi-Square Test
χ² = Σ (Observed − Expected)² / Expected
HINT
You can also use technology to calculate the chi-square test statistic and the p-value.
The chi-square test statistic that you would obtain is 96.6.
The degrees of freedom, in this case, can be found by multiplying the number of rows minus one by the number of columns minus one. This is actually the general rule, and it can be applied to the previous chi-square tests as well.
FORMULA
Chi-Square Degrees of Freedom
df = (number of rows − 1) × (number of columns − 1)
Drinking Level   The U   State
High             63      16
In this case, there were four rows (none, low, moderate, and high) and two columns (The U and State):
So, the degrees of freedom is going to be equal to three. Using technology with the chi-square statistic of 96.6 and 3 degrees of freedom, we get a p-value of less than 0.001. This is a very low value, well below 0.05.
Step 4: Compare your test statistic to your chosen critical value, or your p-value to your chosen
significance level. Based on how they compare, state a decision about the null hypothesis and conclusion
in the context of the problem.
Since the p-value is lower than the significance level, you reject the null hypothesis and conclude that there is a difference in drinking behavior between the students at The U and the students at State.
TERM TO KNOW
SUMMARY
The chi-square test of homogeneity allows you to test whether two populations have significantly different distributions across the categories. The expected count for each cell is the product of the row total and the column total divided by the grand total. The conditions are the same as they are for a goodness-of-fit test, in that all the expected counts have to be at least five.
Good luck!
TERMS TO KNOW
FORMULAS TO KNOW
Chi-square Degrees of Freedom
df = (number of rows − 1) × (number of columns − 1)
Chi-Square Test for Association and
Independence
by Sophia
WHAT'S COVERED
This tutorial will cover the chi-square test of independence. Our discussion breaks down as follows:
STEP BY STEP
EXAMPLE Suppose 335 students of different backgrounds (rural, suburban, and urban schools)
were asked to pick one thing about school that was most important to them: getting good grades,
being popular, or being good at sports. Here is the distribution of responses:
School Locations

Goal      Rural   Suburban   Urban
Grades    57      87         24
Popular   50      42         6
Sports    42      22         5
The question is, does there appear to be an association between the geographic location of the school and
the answer choice to the question (the goal)? This is an ideal time to run a chi-square test for association or
independence. This can tell you if the distribution of goals (grades, popular, and sports) differ significantly for
each school location. Are they associated or are they independent?
Remember, the expected value is equal to that particular cell's row total, times its column total, divided by the
grand total for all the cells.
FORMULA
Expected Count
Expected Count = (row total × column total) / grand total
For example, if we wanted the expected value for "Grades" and "Rural", we would multiply the row total for "Grades" by the column total for "Rural", and divide by the total number of values in the table.
For the row with "Grades", there was a total of 57 plus 87 plus 24, or 168 students. For "Rural", there was a total of 149 students. We were told at the beginning there were a total of 335 students; we could also add up all the values in the table to get this same number. The expected count is therefore 168 × 149 / 335, or about 74.72.
We can continue using this formula for each cell and get the expected table of results:
Observed

Goal      Rural   Suburban   Urban
Grades    57      87         24
Popular   50      42         6
Sports    42      22         5

Expected

Goal      Rural   Suburban   Urban
Grades    74.72   75.73      17.55
Popular   43.59   44.17      10.24
Sports    30.69   31.10      7.21

What you are interested in is whether or not all the expected counts are at least 5. The smallest one is 7.21, so the conditions are met.
To find the corresponding p-value, we first need to find the degrees of freedom, which can be found by multiplying the number of rows minus one by the number of columns minus one.
FORMULA
Chi-Square Degrees of Freedom
df = (number of rows − 1) × (number of columns − 1)
In this case, there were three rows (grades, popular, and sports) and three columns (rural, suburban, and
urban):
In this case, the degrees of freedom is going to be equal to four. Using technology and plugging in the chi-square statistic of 18.564 and 4 degrees of freedom, we obtain a very small p-value of approximately 0.001.
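Putting the whole test together in Python (a sketch for illustration; the observed counts and row/column labels are the ones from the table above):

```python
# Observed counts: rows = goal (grades, popular, sports),
# columns = school location (rural, suburban, urban).
observed = [
    [57, 87, 24],   # grades
    [50, 42, 6],    # popular
    [42, 22, 5],    # sports
]

row_totals = [sum(row) for row in observed]            # 168, 98, 69
col_totals = [sum(col) for col in zip(*observed)]      # 149, 151, 35
grand = sum(row_totals)                                # 335 students

# Expected count for each cell: row total * column total / grand total.
expected = [[r * c / grand for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (O - E)^2 / E over every cell.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)      # (3 - 1)(3 - 1) = 4

print(round(chi_square, 3), df)                        # 18.564 4
```

The smallest expected cell this produces is about 7.21 (Sports and Urban), matching the condition check above.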
Step 4: Compare your test statistic to your chosen critical value, or your p-value to your chosen
significance level. Based on how they compare, state a decision about the null hypothesis and conclusion
in the context of the problem.
You need to link your p-value to a decision about the null hypothesis. Since the p-value is smaller than 0.05,
you reject the null hypothesis in favor of the alternative and conclude that there is an association between the
two categorical variables of school location and goal.
TERM TO KNOW
SUMMARY
The chi-square test of independence tests whether two qualitative variables have an association or
not, so it's sometimes called the chi-square test of association. The expected value for each cell is
equal to that particular cell's row total, times its column total, divided by the grand total for all the cells.
Good luck!
TERMS TO KNOW
FORMULAS TO KNOW
Terms to Know
Alternative Hypothesis
A claim that a population parameter differs from the value claimed in the null hypothesis.
Chi-Square Statistic
The sum of the ratios of the squared differences between the expected and observed
counts to the expected counts.
Confidence Interval
An interval we are some percent certain (e.g., 90%, 95%, or 99%) will contain the population
parameter, given the value of our sample statistic.
Critical Value
A value that can be compared to the test statistic to decide the outcome of a hypothesis test.
Distribution of Sample Means
A distribution where each data point consists of a mean of a collected sample. For a given
sample size, every possible sample mean will be plotted in the distribution.
Expected Frequencies
The number of occurrences we would have expected within each of the categories in a
qualitative distribution if the null hypothesis were true.
F statistic
The test statistic in an ANOVA test. It is the ratio of the variability between the samples to
the variability within each sample. If the null hypothesis is true, the F statistic will probably
be small.
Hypothesis
A claim about a population parameter.
Hypothesis Testing
The standard procedure in statistics for testing claims about population parameters.
Left-tailed test
A hypothesis test where the alternative hypothesis only states that the parameter is lower
than the stated value from the null hypothesis.
Null Hypothesis
A claim about a particular value of a population parameter that serves as the starting
assumption for a hypothesis test.
Observed Frequencies
The number of occurrences that were observed within each of the categories in a
qualitative distribution.
One-Way ANOVA
A hypothesis test that compares three or more population means with respect to a single
characteristic or factor.
One-tailed test
A hypothesis test where the alternative hypothesis only states that the parameter is higher
(or lower) than the stated value from the null hypothesis.
P-value
The probability that the test statistic is that value or more extreme in the direction of the alternative hypothesis.
Population Mean
A mean for all values in the population. Denoted as μ.
Population Parameters
Summary values for the population. These are often unknown.
Practical Significance
An arbitrary assessment of whether observations reflect a practical real-world use.
Right-tailed test
A hypothesis test where the alternative hypothesis only states that the parameter is higher
than the stated value from the null hypothesis.
Sample Mean
A mean obtained from a sample of a given size. Denoted as x̄.
Sample Size
The size of a sample of a population of interest.
Sample Statistics
Summary values obtained from a sample.
Sampling Error
The amount by which the sample statistic differs from the population parameter.
Sampling With Replacement
A sampling plan where each observation that is sampled is replaced after each time it is
sampled, resulting in an observation being able to be selected more than once.
Significance Level
The probability of making a type I error. Abbreviated with the symbol alpha (α).
Standard Error
The standard deviation of the sampling distribution of sample means.
Statistical Significance
The statistic obtained is so different from the hypothesized value that we are unable to
attribute the difference to chance variation.
T-Distribution/Student's T-Distribution
A family of distributions that are centered at zero and symmetric like the standard normal
distribution, but heavier in the tails. Depending on the sample size, it does not diminish
towards the tails as fast. If the sample size is large, the t-distribution approximates the
normal distribution.
Test Statistic
A measurement, in standardized units, of how far a sample statistic is from the assumed parameter if the null hypothesis is true.
Two-Way ANOVA
A hypothesis test that compares three or more population means with respect to multiple
characteristics or factors.
Two-tailed test
A hypothesis test where the alternative hypothesis states that the parameter is different
from the stated value from the null hypothesis; that is, the parameter's value is either higher
or lower than the value from the null hypothesis.
Type I Error
In a hypothesis test, when the null hypothesis is rejected when it is in fact, true.
Type II Error
In a hypothesis test, when the null hypothesis is not rejected when it is, in fact, false.
t-distribution
A family of distributions similar to the standard normal distribution, except that they are
fatter in the tails, due to the increased variability associated with using the sample standard
deviation instead of the population standard deviation in the formula for the test statistic.
Formulas to Know
Chi-Square Statistic
χ² = Σ (Observed − Expected)² / Expected
Confidence Interval
CI = sample statistic ± (critical value) × (standard error)
Standard Error
Sample Means: σ/√n (use s/√n when the population standard deviation σ is unknown)
T-Statistic For Population Means
t = (x̄ − μ) / (s/√n)
Test Statistic
Test Statistic = (sample statistic − hypothesized parameter) / standard error
z-statistic of Means
z = (x̄ − μ) / (σ/√n)
z-statistic of Proportions
z = (p̂ − p) / √(p(1 − p)/n)