AP Review Part IV - Harrison
I. Why do we do statistics? To answer a question. So the first step is to decide what question we’re trying to
answer. If we don’t have a burning question, then statistics is not needed. Now the only way to really know the
truth is to study the entire population, but we don’t have time for that. So studying a sample to answer our
question works for us.
Confidence Interval – When you don’t know the population characteristic you want and you wish you
did, this is your choice. Used to estimate the population parameter. The CI is an interval of plausible
values that we hope will capture the value of the population characteristic. Helps us to get an idea of
what we want to know.
Hypothesis Test – When you have a set standard and want to know if your sample data meets that
standard (or not), this is your choice. Helps to settle arguments of the type “yes, it does”, “no it
doesn’t”, “yes, it does”, “no it doesn’t”. You should expect a bit of a difference, but a hypothesis test
will help you decide if the difference is significant.
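[Flowchart not reproduced: choosing an inference procedure. Branches: categorical or quantitative data, then 1 or 2 samples. Surviving labels: t-test or paired t-test; t-test; linear regression.]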
TO RECEIVE FULL CREDIT FOR A HYPOTHESIS TEST YOU MUST:
1) Write the null and alternative hypotheses and define each variable
2) Write which test you are using in words or with the appropriate formula and why you chose that test
3) Write and check all conditions for that test
4) Give the test statistic, the p-value, and df, if applicable
5) Reject or fail to reject Ho based on the p-value (or critical value)
6) Write a conclusion in terms of the problem
You either have enough evidence to claim whatever the alternative hypothesis represents (reject Ho) or you do
not have enough evidence to claim whatever the alternative hypothesis represents (fail to reject Ho).
*******************************************************************
Type I errors, Type II errors, and Power
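Power = 1 – β, so when the probability of a Type II error goes up, power goes down (and vice versa). The Type I error
probability and power move together: raising α raises power, all else being equal.
P(Type I Error) = α (rejecting Ho when Ho is true)
P(Type II Error) = β (failing to reject Ho when Ho is false)
Power = 1 – β (correctly rejecting Ho when Ho is false)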
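The relationships among α, β, and power can be checked numerically. The sketch below is only an illustration: it assumes a one-sided (upper-tail) z-test for a mean with a known σ, and the numbers (mu0 = 0, true mean 0.5, sigma = 1, n = 25) are hypothetical, not from any problem on this sheet.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(alpha, mu0, mu1, sigma, n):
    """Power of an upper-tail z-test of Ho: mu = mu0 when the true mean is mu1."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)            # reject Ho when the z statistic exceeds this
    shift = (mu1 - mu0) / (sigma / sqrt(n))  # distance from mu0 to mu1 in standard errors
    return 1 - z.cdf(z_crit - shift)

# Hypothetical values chosen only for illustration.
power_05 = power_one_sided_z(0.05, mu0=0, mu1=0.5, sigma=1, n=25)
power_01 = power_one_sided_z(0.01, mu0=0, mu1=0.5, sigma=1, n=25)
beta_05 = 1 - power_05  # P(Type II error) at alpha = 0.05

print(f"alpha = 0.05: power = {power_05:.3f}, beta = {beta_05:.3f}")
print(f"alpha = 0.01: power = {power_01:.3f}")
```

Lowering α from 0.05 to 0.01 makes Type I errors rarer but also lowers power, which is the trade-off the summary describes.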
For each of the following scenarios, determine the type of inference procedure to use. Then, proceed with the
inference procedure. (will need to do this on another page)
1. (2011B #5) During a flu vaccine shortage in the United States, it was believed that 45 percent of vaccine-eligible
people received a flu vaccine. The results of a survey given to a random sample of 2,350 vaccine-eligible people
indicated that 978 of the 2,350 people had received flu vaccine.
(a) Construct a 99 percent confidence interval for the proportion of vaccine-eligible people who had received flu
vaccine. Use your confidence interval to comment on the belief that 45 percent of the vaccine-eligible people had
received flu vaccine.
(b) Suppose a similar survey will be given to vaccine-eligible people in Canada by Canadian health officials. A 99
percent confidence interval for the proportion of people who will have received flu vaccine is to be constructed. What
is the smallest sample size that can be used to guarantee that the margin of error will be less than or equal to 0.02?
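One way to check the arithmetic for problem 1 is sketched below. It assumes the one-proportion z-interval is the intended procedure (random sample, and both 978 successes and 1,372 failures are at least 10) and, for part (b), the conservative guess p = 0.5. The function names are illustrative only.

```python
from math import ceil, sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence):
    """One-proportion z-interval: p_hat +/- z* sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def min_sample_size(margin, confidence, p_guess=0.5):
    """Smallest n with z* sqrt(p (1 - p) / n) <= margin; p = 0.5 is the worst case."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z_star**2 * p_guess * (1 - p_guess) / margin**2)

low, high = proportion_ci(978, 2350, 0.99)   # part (a)
n_needed = min_sample_size(0.02, 0.99)       # part (b)

print(f"99% CI: ({low:.3f}, {high:.3f})")
print(f"0.45 inside the interval? {low <= 0.45 <= high}")
print(f"Minimum sample size: {n_needed}")
```

The code uses the unrounded z* (about 2.5758). Working by hand with the table value 2.576 gives a slightly larger quotient, so the rounded-up sample size can differ by one.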
2. (2013 #1b) An environmental group conducted a study to determine whether crows in a certain region were ingesting
food containing unhealthy levels of lead. A biologist classified lead levels greater than 6.0 parts per million (ppm) as
unhealthy. The lead levels of a random sample of 23 crows in the region were measured and recorded. The data are
shown in the stemplot below.
The mean lead level of the 23 crows in the sample was 4.90 ppm and the standard deviation was 1.12 ppm. Construct and
interpret a 95 percent confidence interval for the mean lead level of crows in the region.
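The interval for problem 2 can be checked with a short calculation. This sketch assumes a one-sample t-interval is appropriate (random sample, and a stemplot with no strong skew or outliers). Python's standard library has no t distribution, so the critical value t* = 2.074 for df = 22 is entered by hand from a t-table.

```python
from math import sqrt

n = 23
x_bar = 4.90      # sample mean lead level (ppm)
s = 1.12          # sample standard deviation (ppm)
t_star = 2.074    # t-table value for 95% confidence with df = n - 1 = 22

standard_error = s / sqrt(n)
margin = t_star * standard_error
low, high = x_bar - margin, x_bar + margin

print(f"95% CI for the mean lead level: ({low:.2f}, {high:.2f}) ppm")
```

The interpretation still has to be written in context, for example by stating that we are 95 percent confident the mean lead level of all crows in the region lies in this interval.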
3. (2015 #4) A researcher conducted a medical study to investigate whether taking a low-dose aspirin reduces the chance
of developing colon cancer. As part of the study, 1,000 adult volunteers were randomly assigned to one of two groups.
Half of the volunteers were assigned to the experimental group that took a low-dose aspirin each day, and the other half
were assigned to the control group that took a placebo each day. At the end of six years, 15 of the people who took the
low-dose aspirin had developed colon cancer and 26 of the people who took the placebo had developed colon cancer. At
the significance level α = 0.05, do the data provide convincing evidence that taking a low-dose aspirin each day would
reduce the chance of developing colon cancer among all people similar to the volunteers?
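The test statistic and p-value for problem 3 can be sketched as follows, assuming a two-proportion z-test with a pooled proportion and the one-sided alternative Ha: p_aspirin < p_placebo. The conditions (random assignment, and all four counts 15, 485, 26, and 474 large enough) still need to be stated in a full answer.

```python
from math import sqrt
from statistics import NormalDist

cancer_aspirin, n_aspirin = 15, 500
cancer_placebo, n_placebo = 26, 500

p_aspirin = cancer_aspirin / n_aspirin
p_placebo = cancer_placebo / n_placebo
p_pooled = (cancer_aspirin + cancer_placebo) / (n_aspirin + n_placebo)

# Standard error under Ho: p_aspirin = p_placebo, using the pooled proportion
standard_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n_aspirin + 1 / n_placebo))
z_stat = (p_aspirin - p_placebo) / standard_error
p_value = NormalDist().cdf(z_stat)   # lower tail, since Ha is p_aspirin < p_placebo

print(f"z = {z_stat:.2f}, p-value = {p_value:.4f}")
print("Reject Ho" if p_value < 0.05 else "Fail to reject Ho")
```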
Is there statistically convincing evidence that electricity production by the windmill is related to wind velocity? Explain.
Categorical or Quantitative?_____ Number of samples? ______ Number of variables? _____
5. (1998 #5) A large university provides housing for 10 percent of its graduate students to live on campus. The
university's housing office thinks that the percentage of graduate students looking for housing on campus may be more
than 10 percent. The housing office decides to survey a random sample of graduate students, and 62 of the 481
respondents say that they are looking for housing on campus.
On the basis of the survey data, would you recommend that the housing office consider increasing the amount of housing
on campus available to graduate students? Give appropriate evidence to support your recommendation.
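A quick check of the numbers for problem 5, assuming a one-proportion z-test of Ho: p = 0.10 against Ha: p > 0.10. The large-counts condition holds, since 481(0.10) and 481(0.90) are both at least 10.

```python
from math import sqrt
from statistics import NormalDist

looking, n = 62, 481
p0 = 0.10                                # proportion currently housed (null value)

p_hat = looking / n
standard_error = sqrt(p0 * (1 - p0) / n) # uses p0 because Ho is assumed true
z_stat = (p_hat - p0) / standard_error
p_value = 1 - NormalDist().cdf(z_stat)   # upper tail, since Ha is p > 0.10

print(f"p_hat = {p_hat:.3f}, z = {z_stat:.2f}, p-value = {p_value:.4f}")
```

Whether this p-value justifies a recommendation depends on the significance level chosen, which the problem leaves to the student.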
6. (2006 #4) Patients with heart-attack symptoms arrive at an emergency room either by ambulance or self-transportation
provided by themselves, family, or friends. When a patient arrives at the emergency room, the time of arrival is recorded.
The time when the patient’s diagnostic treatment begins is also recorded.
An administrator of a large hospital wanted to determine whether the mean wait time (time between arrival and diagnostic
treatment) for patients with heart-attack symptoms differs according to the mode of transportation. A random sample of
150 patients with heart-attack symptoms who had reported to the emergency room was selected. For each patient, the
mode of transportation and wait time were recorded. Summary statistics for each mode of transportation are shown in the
table below.
(a) Use a 99 percent confidence interval to estimate the difference between the mean wait times for ambulance-
transported patients and self-transported patients at this emergency room.
(b) Based only on this confidence interval, do you think the difference in the mean wait times is statistically significant?
Justify your answer.
Can one conclude that the mean manual dexterity for people who have completed the 6-week training program has
significantly increased? Support your conclusion with appropriate statistical evidence.
8. (2013 #4) The Behavioral Risk Factor Surveillance System is an ongoing health survey system that tracks health
conditions and risk behaviors in the United States. In one of their studies, a random sample of 8,866 adults answered the
question “Do you consume five or more servings of fruits and vegetables per day?” The data are summarized by response
and by age-group in the frequency table below.
Do the data provide convincing statistical evidence that there is an association between age-group and whether or not a
person consumes five or more servings of fruits and vegetables per day for adults in the United States?
11. Which of the following is a criterion for choosing a t-test rather than a z-test when making an inference
about the mean of a population?
(A) The standard deviation of the population is unknown.
(B) The mean of the population is unknown.
(C) The sample may not have been a simple random sample.
(D) The population is not normally distributed.
(E) The sample size is less than 100.
12. A large-sample 98 percent confidence interval for the proportion of hotel reservations that are canceled on the
intended arrival day is (0.048, 0.112). What is the point estimate for the proportion of hotel reservations that are canceled
on the intended arrival day from which this interval was constructed?
(A) 0.032
(B) 0.064
(C) 0.080
(D) 0.160
(E) It cannot be determined from the information given.
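For problem 12, a large-sample interval for a proportion is symmetric about the point estimate, so the estimate can be recovered as the midpoint. A small sketch of that arithmetic:

```python
low, high = 0.048, 0.112

point_estimate = (low + high) / 2   # center of the interval
margin_of_error = (high - low) / 2  # half the interval width
width = high - low

print(f"point estimate = {point_estimate:.3f}")
print(f"margin of error = {margin_of_error:.3f}, width = {width:.3f}")
```

Two of the answer choices match the margin of error and the full width, so it helps to keep the three quantities separate.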
13. When using a one-sample t-procedure to construct a confidence interval for the mean of a finite population, a
condition is that the population size be at least 10 times the sample size. The reason for the condition is to ensure that
14. A random sample of 50 students at a large high school resulted in a 95 percent confidence interval for the mean
number of hours of sleep per day of (6.73, 7.67). Which of the following statements best summarizes the meaning of this
confidence interval?
(A) About 95% of all random samples of 50 students from this population would result in a 95% confidence interval
of (6.73, 7.67).
(B) About 95% of all random samples of 50 students from this population would result in a 95% confidence interval
that covered the population mean number of hours of sleep per day.
(C) 95% of the students in the survey reported sleeping between 6.73 and 7.67 hours per day.
(D) 95% of the students in this high school sleep between 6.73 and 7.67 hours per day.
(E) A student selected at random from this population sleeps between 6.73 and 7.67 hours per day for 95% of the
time.
15. In order to plan its next advertising campaign, the Trendy Motor Vehicle company is investigating whether the type
of vehicle and the color of vehicle are related. Each person in a random sample of size 275 selected from the company’s
mailing list was classified according to the type (car or truck) and the color of vehicle he or she drove. The data are
shown in the table below.
Which of the following procedures would be most appropriate to use for investigating whether there is a relationship
between vehicle type and color?
16. A random sample of 432 voters revealed that 100 are in favor of a certain bond issue. A 95 percent confidence
interval for the proportion of the population of voters who are in favor of the bond issue is
Amber wants to compute a 95 percent confidence interval for the slope of the least squares regression line in the
population of all students in her major field of study. Assuming that conditions for inference are satisfied, which of the
following gives the margin of error for the confidence interval?
17. Perchlorate is a chemical used in rocket fuel. People who live near a former rocket-testing site are concerned that
perchlorate is present in unsafe amounts in their drinking water. Drinking water is considered safe when the average level
of perchlorate is 24.5 parts per billion (ppb) or less. A random sample of 28 water sources in this area produces a mean
perchlorate measure of 25.3 ppb. Which of the following is an appropriate alternative hypothesis that addresses their
concern?
18. A manufacturer claims its Brand A battery lasts longer than its competitor’s Brand B battery. Nine batteries of each
brand are tested independently, and the hours of battery life are shown in the table below.
Provided that the assumptions for inference are met, which of the following tests should be conducted to determine if
Brand A batteries do, in fact, last longer than Brand B batteries?