
STAT 730 – Final Exam Due December 20

You will turn in your responses as a pdf file. I do not require you to show work, but for calculation
problems, I cannot assign partial credit for incorrect answers if I cannot see your work. I recommend
showing calculations using the equation editor in Word or turning in a separate R script (with comments
to make it easy to find each question).
For statistical analyses:
• Include appropriate visualization(s).
• Indicate the null and alternative hypotheses for all tests.
• Address whether assumptions are met.
• Report effect size (difference or trend), its 95% CI, the test statistic, any associated parameters
(such as degrees of freedom), and the p-value.
• Indicate your conclusion.

1. The bill lengths of a population of male blue jays follow an approximately normal distribution with
mean equal to 25.4 mm and standard deviation equal to 0.8 mm.
a. (4) You mist net a male blue jay. What’s the probability that his bill is longer than 26.2 mm?
b. (4) You mist net 5 male blue jays. What’s the probability that 3 of them have bills shorter than
24 mm and the other 2 have bills longer than 24 mm?
c. (4) What’s the probability that the mean bill length of the 5 male blue jays you caught is
between 25 and 26 mm?
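One way to set up the three calculations in problem 1 is sketched below in Python, using only the standard library. It is a sketch under stated assumptions rather than an answer key: it reads part (b) as "exactly 3 of the 5 birds are shorter than 24 mm" and treats the 5 birds as independent draws from the population. The variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

# Population of male blue jay bill lengths (mm), as given in the problem.
bill = NormalDist(mu=25.4, sigma=0.8)

# (a) P(X > 26.2): upper tail for a single bird.
p_longer = 1 - bill.cdf(26.2)

# (b) Assumed reading: exactly 3 of 5 independent birds are shorter than 24 mm.
p_short = bill.cdf(24.0)
p_three_of_five = comb(5, 3) * p_short**3 * (1 - p_short) ** 2

# (c) The mean of n = 5 birds is normal with standard error sigma / sqrt(n).
mean_of_five = NormalDist(mu=25.4, sigma=0.8 / sqrt(5))
p_mean_between = mean_of_five.cdf(26.0) - mean_of_five.cdf(25.0)

print(f"(a) {p_longer:.4f}  (b) {p_three_of_five:.6f}  (c) {p_mean_between:.4f}")
```

Part (c) uses the sampling distribution of the mean, not the distribution of individual bills, which is why its standard deviation is divided by the square root of 5.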

2. (10) Biologists noticed that some stream fishes are most often found in pools, while others prefer
riffles. They captured fish at 15 locations along a river. At each location, they recorded the number
and species of fish captured in a riffle and the number and species of fish captured in an adjacent
pool. They used these data to calculate the Shannon Diversity Index for each sample, shown in the
table below.
Do pools and riffles support equal levels of fish diversity? Conduct an appropriate statistical analysis
and report your results.
Location  Pool  Riffle
1         1.79  0.98
2         2.12  1.02
3         1.10  1.06
4         2.16  1.43
5         1.49  0.71
6         0.73  0.68
7         2.06  0.84
8         1.96  0.78
9         0.00  0.59
10        0.99  0.60
11        1.36  0.99
12        1.66  0.00
13        1.27  1.03
14        1.75  0.63
15        1.31  1.17
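Because each pool was sampled next to a riffle at the same location, one plausible setup is a paired analysis of the within-location differences. The Python sketch below assumes that choice and computes the mean difference, its standard error, and the paired t statistic. It does not check the assumption that the differences are approximately normal, and the p-value and 95% CI still require a t distribution with the stated degrees of freedom (from a table or a statistics package).

```python
from math import sqrt
from statistics import mean, stdev

# Shannon Diversity Index by location (1-15), copied from the table above.
pool = [1.79, 2.12, 1.10, 2.16, 1.49, 0.73, 2.06, 1.96,
        0.00, 0.99, 1.36, 1.66, 1.27, 1.75, 1.31]
riffle = [0.98, 1.02, 1.06, 1.43, 0.71, 0.68, 0.84, 0.78,
          0.59, 0.60, 0.99, 0.00, 1.03, 0.63, 1.17]

# Paired design (an assumption): analyze pool minus riffle at each location.
diffs = [p - r for p, r in zip(pool, riffle)]
n = len(diffs)
mean_diff = mean(diffs)
se_diff = stdev(diffs) / sqrt(n)  # stdev() is the sample standard deviation
t_stat = mean_diff / se_diff
df = n - 1

print(f"mean difference = {mean_diff:.3f}, SE = {se_diff:.3f}, "
      f"t = {t_stat:.3f}, df = {df}")
```

If the differences look clearly non-normal, a nonparametric alternative such as the Wilcoxon signed-rank test would be the usual fallback.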
3. (6) John is a hunter who wants to know how acorn mass this year compares to last year because he
knows deer rely on acorns. John knows how to interpret confidence intervals and hypothesis tests,
and he assumes that 𝛼 = 0.05. You have random samples from trees in each year and are to analyze
the data and write a report. You plan to report the two sample means, but you aren’t sure what to
say about how they compare. You seek advice from four people and get the following feedback:
i. “Conduct an 𝛼 = 0.05 test of H0: 𝜇1 = 𝜇2 versus HA: 𝜇1 ≠ 𝜇2 and tell John whether or not you
reject H0 at the 𝛼 = 0.05 level.”
ii. “Report a 95% confidence interval for 𝜇1 − 𝜇2 .”
iii. “Conduct a test of H0: 𝜇1 = 𝜇2 versus HA: 𝜇1 ≠ 𝜇2 and report to John the p-value from the test.”
iv. “Compare the mean of last year to the mean of this year. If last year’s mean was higher, then
test H0: 𝜇1 = 𝜇2 versus HA: 𝜇1 > 𝜇2 . If last year’s mean was lower, then test H0: 𝜇1 = 𝜇2 versus
HA: 𝜇1 < 𝜇2 . Use 𝛼 = 0.05 and tell John whether or not you reject H0.”
Rank the four pieces of advice from worst to best and explain why you rank them as you do. That is,
explain what makes one better than another.

4. Researchers took random samples from two populations and applied a two-sample t test to the data
using 𝛼 = 0.10. The p-value for the test, using a nondirectional alternative, was 0.053. For each of
the following, say whether the statement is true or false and explain why.
a. (3) There is a 5.3% chance that the two population distributions actually are the same.
b. (3) If the two population distributions actually are the same, then a difference between the two
samples as extreme as what these researchers observed would only happen 5.3% of the time.
c. (3) If a new study were done that compared the two populations, there is a 94.7% chance that
H0 would be rejected again.
d. (3) If a directional alternative with 𝛼 = 0.05 were used and the data diverged in the direction
hypothesized, then H0 would be rejected.

5. A confused statistics student finds a quarter on the sidewalk and wants to test whether it’s fair (i.e.
probability of heads = 0.5). The quarter is an ordinary, fair quarter, but he doesn’t know that. He
tosses the coin 100 times, finds the p-value for a goodness-of-fit test, and compares it to 𝛼 = 0.05. If
he can’t reject the null (because the p-value is too high), he discards the first sample and repeats the
experiment. If he again can’t reject the null, he discards the second sample and repeats the
experiment. He then accepts the result, either way.
a. (3) What’s the probability that he will make a Type I error if he follows this procedure?
b. (2) In general, if he is willing to perform the experiment up to 𝑥 times, what’s the probability
that he will make a Type I error?
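The effect of repeating a test until it rejects can be sketched with the complement rule: the student avoids a Type I error only if every attempt fails to reject. The Python sketch below assumes each attempt is independent and has a Type I error rate of exactly 𝛼. With 100 tosses the test statistic is discrete, so the true per-test rate may differ slightly from 0.05. The function name is hypothetical.

```python
def repeated_test_type1(alpha: float, attempts: int) -> float:
    """Chance of at least one false rejection in `attempts` independent tests.

    Assumes the null hypothesis is true and each test rejects with
    probability exactly `alpha`.
    """
    return 1 - (1 - alpha) ** attempts

# The procedure described above allows up to three attempts.
for x in (1, 2, 3):
    print(x, round(repeated_test_type1(0.05, x), 6))
```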

6. For each of the following statements, say whether it is true or false and explain why.
a. (3) If the independent variable is truly unrelated to the dependent variable, a larger study would
have more power than a smaller one.
b. (3) Changing the value of 𝛼 will affect the p-value of the test.
c. (3) If the independent variable is really associated with the dependent variable, we would have
a better chance of detecting this relationship by choosing a small 𝛼 rather than a large one.
d. (3) If the independent variable is really associated with the dependent variable, we would have
a better chance of detecting this relationship with a large sample rather than a small one.
e. (3) A t-test results in a p-value of 0.022. If 𝛼 = 0.01, we could be making a Type I error.

Problems 7-11

Conservationists in Tasmania are working hard to combat and control the devil facial tumor disease
(DFTD) that has drastically reduced Tasmanian devil populations since its appearance in the mid-1990s.
Unaffected populations have been established in which healthy devils are taken from the wild,
quarantined, and then kept and bred in captivity.

7. (3) Researchers are trying to estimate the prevalence of DFTD in the wild devil population of West
Pencil Pine. They trap 23 devils, and find that 18 of them are infected. Provide an estimate of
disease prevalence and its 95% confidence interval.
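A prevalence estimate and interval of this kind can be sketched as below in Python. The sketch assumes the normal-approximation (Wald) interval. With only 23 devils, a course may instead expect the Wilson or Agresti-Coull interval, which would give somewhat different bounds. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(successes: int, n: int, level: float = 0.95):
    """Return (estimate, lower, upper) using the normal approximation."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    # Clip to [0, 1], since a proportion cannot fall outside that range.
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

p_hat, lower, upper = wald_interval(18, 23)
print(f"prevalence = {p_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```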

For problems 8-10, assume your estimate from problem 7 is the true population prevalence.

8. (3) Researchers trap 3 more devils. What’s the probability that exactly 2 of them are infected?

9. A conservationist would like to establish a new breeding pair at an existing quarantine site. Assume
the sex ratio is even and both sexes have the same disease prevalence.
a. (3) If they trap 2 devils, what’s the probability that they will get an appropriate pair (i.e. one
uninfected female and one uninfected male)?
b. (5) If they trap 5 devils, what’s the probability that they will get exactly one appropriate pair?
(Hint: Work your way up starting at 3 trapped devils if you’re having trouble figuring out where
to start.)

10. (3) Scientists develop a new screening tool to detect DFTD before visible symptoms are apparent.
The test correctly identifies 98% of devils with DFTD as infected. It correctly identifies 93% of devils
without DFTD as uninfected. The test is administered to a new devil with unknown disease status,
and it comes back negative. What’s the probability that the devil is truly uninfected? (Hint: The
disease prevalence does matter in this problem.)
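The hint about prevalence points to Bayes' rule: the chance that a negative result is correct depends on how common the disease is. The Python sketch below assumes the prevalence is the point estimate from problem 7 (18 of 23), as the instructions for problems 8-10 state. The function and parameter names are illustrative.

```python
def prob_uninfected_given_negative(prevalence: float,
                                   sensitivity: float,
                                   specificity: float) -> float:
    """P(uninfected | negative test) by Bayes' rule."""
    true_negative = specificity * (1 - prevalence)    # uninfected, tests negative
    false_negative = (1 - sensitivity) * prevalence   # infected, tests negative
    return true_negative / (true_negative + false_negative)

npv = prob_uninfected_given_negative(prevalence=18 / 23,
                                     sensitivity=0.98,
                                     specificity=0.93)
print(f"P(uninfected | negative) = {npv:.4f}")
```

Rerunning the function with a lower prevalence shows the result rising toward 1, which is the sense in which prevalence matters here.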

11. Wildlife biologists at Hobart trap 43 devils, and find that 26 of them have DFTD.
a. (3) Provide an estimate of disease prevalence and its 95% confidence interval.
b. (6) Is disease prevalence the same at West Pencil Pine and Hobart? Conduct an appropriate
statistical analysis and report the results.

Problems 12-14

Use the “GPD.csv” file from Canvas. This is a modified version of the data you used for HW 12.

12. (10) Excluding Antarctica, the southern hemisphere has a greater proportion of land closer to the
equator than the northern hemisphere. Is the average net primary productivity (NPP) for plants in
the southern hemisphere greater than in the northern hemisphere? Conduct an appropriate
statistical analysis and report the results.

13. The leaf area index is a dimensionless measure of the total single-sided green leaf area of a plant
divided by the ground area underneath the plant.
a. (2) Propose a biological explanation for why leaf area index would be related to NPP.
b. (2) Write a prediction based on your proposed explanation in a.
c. (10) Does the evidence support your hypothesis? Conduct an appropriate statistical analysis and
report the results.

14. (10) Is a plant’s NPP dependent on its growth form (herb, shrub, tree)? Conduct an appropriate
statistical analysis and report the results. Instead of effect size and CI, report the mean NPP and 95%
CI for each growth form.
