Same Guidelines I Gave in ISQS 5347
2. Give p(y| X = 3) for problem 1. (Note: it is a probability distribution that puts 100%
probability on a single number; this is called a degenerate probability distribution.)
3. How could you change the problem statement in problem 1 so that the amount of
money you spend after X = 3 visits has a non-degenerate probability distribution?
Keep the problem real: it should still be about the health club and the money Y. Do not
discuss simulation or other fake data.
4. Suppose the classical regression model holds, with β0 = 10, β1 = 1 and σ = 3. Using R,
graph the distribution of Y when X = 5, and when X = 10 on the same axes, with different
plot line types. Include labels.
5. Using the model in problem 4, find the probability that Y | X = 5 is between 10 and 20. Use R.
6. Simulate data from the model in problem 4 as follows: (a) generate n = 10 X values from the
Poisson distribution with λ = 12; (b) using the X values in (a), generate Y values
corresponding to the model.
7. Is the model in problem 6 a fixed-X or a random-X model? Explain, using the definitions in the
book.
8. Using the simulated data in problem 6, draw the scatterplot and overlay (i) the true regression
function, (ii) the least squares estimate of the true regression function, and (iii) the
LOESS estimate of the true regression function. Which of these three functions is
random? Which is fixed? How do you know for sure?
12. Problems 10 and 11 have larger sample sizes. Based on your answers, what benefit do you see
in having a larger sample size?
14. (i) Define the “true regression function” using the definition that involves conditional
distributions in problem 13. (ii) Why did you not overlay this true regression function in
problem 13? (iii) Which estimate do you think is closer to the true regression function,
the least squares line or the LOWESS (LOESS) fit? (There is no clearly correct answer
here. See the book’s discussion, and use your judgment.)
16. (i) Define the “true regression function” using the definition that involves conditional
distributions in problem 15. (ii) Why did you not overlay this true regression function in
problem 15? (iii) Which estimate do you think is closer to the true regression function,
the least squares line or the LOWESS (LOESS) fit? (There is a clearly correct answer here.
See the book’s discussion.)
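
For reference in problems 4 through 6, the classical model with the stated parameter values can be written out as below. This is only a sketch: it assumes the classical regression model includes normality of the conditional distributions (check the book's definition), and the symbols μ(x) and Φ (the standard normal cdf) are notation introduced here, not taken from the problem statements.

```latex
% Classical regression model with beta_0 = 10, beta_1 = 1, sigma = 3
% (assumes normal conditional distributions)
Y \mid X = x \;\sim\; N\!\left(\mu(x),\, \sigma^2\right), \qquad
\mu(x) = \beta_0 + \beta_1 x = 10 + x, \qquad \sigma = 3

% Conditional distributions needed in problem 4
Y \mid X = 5 \sim N(15,\, 3^2), \qquad Y \mid X = 10 \sim N(20,\, 3^2)

% Setup for problem 5: standardize with mean 15 and standard deviation 3
P(10 < Y < 20 \mid X = 5)
  = \Phi\!\left(\tfrac{20 - 15}{3}\right) - \Phi\!\left(\tfrac{10 - 15}{3}\right)
```

When carrying this over to R, note that dnorm and pnorm take the standard deviation (3), not the variance (9), in their sd argument.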