
November 5, 2021

Econ 226-004

Jonathan L. Graves Assignment 3


1. Find a well-shuffled deck of regular playing cards (52 cards, no jokers!). Let’s call Ri the
dummy variable associated with whether or not card i is red:

$$R_i = \begin{cases} 1 & \text{if card } i \text{ is red} \\ 0 & \text{otherwise} \end{cases}$$

This deck will represent the population of card draws.


(a) (5 points) What is P(Ri = 1)? What is P(Ri = 0)? What is µRi = E[Ri] in the
population? Explain. (Hint: $E[R_i] = \sum_{r} r \, P(R_i = r)$, where the sum runs over all possible values r of Ri)
P(Ri = 1) = 26/52 = 0.5;
P(Ri = 0) = 26/52 = 0.5;
The population mean is µRi = E[Ri] = 1 × 0.5 + 0 × 0.5 = 0.5, because half of the 52 cards are red.
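The expectation in (a) can be checked by applying the hint's formula directly. The following is a minimal Python sketch, not part of the assignment; the names `pmf` and `expected_value` are illustrative.

```python
from fractions import Fraction

# Population distribution of R_i for a 52-card deck with 26 red cards.
pmf = {1: Fraction(26, 52), 0: Fraction(26, 52)}

# E[R_i] = sum over all possible values r of r * P(R_i = r).
expected_value = sum(r * p for r, p in pmf.items())

print(expected_value)  # 1/2
```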
(b) (5 points) Draw a sample of n = 10 cards (with replacement, shuffling carefully
each time) from the population (deck). In a table, like the one below, record the
value of Ri for each observation. Repeat this process ten times, for 10 total
samples. Finally, compute the mean of Ri for each of your samples.
SAMPLE   1  2  3  4  5  6  7  8  9  10  MEAN
1        0  0  1  0  1  1  1  0  0  1   5/10
2        1  0  0  1  0  1  0  0  0  0   3/10
3        0  0  0  1  1  0  1  1  0  1   5/10
4        1  1  0  1  1  1  0  0  1  1   7/10
5        0  1  0  0  0  1  1  0  1  1   5/10
6        1  1  0  0  0  1  0  1  1  1   6/10
7        0  0  0  1  0  0  1  1  0  0   3/10
8        1  1  0  0  1  0  1  1  1  0   6/10
9        0  0  0  1  1  1  1  0  0  1   5/10
10       0  1  1  0  0  0  0  1  1  0   4/10
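The sample means in the table, and their overall mean, can be recomputed from the recorded draws. This is a small Python sketch using the values exactly as they appear in the table; the variable names are illustrative.

```python
from fractions import Fraction

# Observations of R_i copied from the table (one row per sample).
samples = [
    [0, 0, 1, 0, 1, 1, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 1, 1, 0, 1],
    [1, 1, 0, 1, 1, 1, 0, 0, 1, 1],
    [0, 1, 0, 0, 0, 1, 1, 0, 1, 1],
    [1, 1, 0, 0, 0, 1, 0, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1, 0, 0, 1],
    [0, 1, 1, 0, 0, 0, 0, 1, 1, 0],
]

# Mean of R_i within each sample, kept as exact fractions.
sample_means = [Fraction(sum(row), len(row)) for row in samples]

# Mean of the ten sample means.
mean_of_means = sum(sample_means) / len(sample_means)

print(mean_of_means)  # 49/100
```

Exact fractions are used here only to avoid floating-point rounding in the printed result.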

(c) (5 points) What is the mean of the means you computed, taken across all your
samples? How does it compare to your answer to (a)? Explain, using the ideas we
developed in class.

The mean of the means computed above, taken across all ten samples, is 0.49. Each sample
is a fresh set of draws from the population (the deck), not a bootstrap resample. The sample
mean is an unbiased estimator of the population mean, so the sample means are centred on
µRi = 26/52 = 0.5. According to the Law of Large Numbers, if the sample is representative
of the population, then as the sample size increases, the sample mean becomes a closer
approximation of the population mean. Thus 0.49 is a close approximation of the population
mean from (a), but not exactly equal to it, because n (here, 10) and the number of samples
(also 10) are not large enough to remove sampling variation.
(d) (5 points) If n was 1000 instead of 10, what do you think would happen if you
repeated (b)? Explain, relating your answer to the ideas developed in class.
If n were 1000 instead of 10, each sample mean would be an even closer approximation of
the population mean, 0.5, and the sample means would be far less spread out than the
values between 0.3 and 0.7 recorded in (b). This follows from the Law of Large Numbers,
which states that if the sample is representative of the population, then as the sample
size n becomes large, the sample mean becomes a closer approximation of the population
mean. The standard deviation of the sample mean, sqrt(0.25/n), falls from about 0.158
when n = 10 to about 0.016 when n = 1000.
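The effect of a larger n can be illustrated with a short simulation. The Python sketch below is an illustration, not the assignment's method: it replaces physical card draws with a pseudo-random generator, and the seed, the number of repetitions (200) and the function name `spread_of_means` are arbitrary assumptions.

```python
import random
from statistics import pstdev

rng = random.Random(226)  # arbitrary seed, used only for reproducibility


def spread_of_means(n, repetitions=200):
    """Standard deviation of the sample mean across repeated samples of size n."""
    means = []
    for _ in range(repetitions):
        # Each draw is red (1) with probability 0.5, drawn with replacement.
        reds = sum(1 for _ in range(n) if rng.random() < 0.5)
        means.append(reds / n)
    return pstdev(means)


spread_small = spread_of_means(10)    # theory: sqrt(0.25 / 10), about 0.158
spread_large = spread_of_means(1000)  # theory: sqrt(0.25 / 1000), about 0.016

print(spread_small, spread_large)
```

The simulated spreads will not match the theoretical values exactly, but the spread for n = 1000 should be roughly ten times smaller than for n = 10.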
Total for Question 1: 20

2. As in Question 1, suppose you knew that a deck of cards was composed of 26 red, 26
black cards, and 2 (colourless) jokers, so that and . Suppose
that you drew a large, representative sample (n = 2,500) from this deck of cards (with
replacement).
(a) (5 points) What is the sampling distribution of R̄i? Be specific, and explain.
By the Central Limit Theorem, the sampling distribution of the sample mean R̄i is
approximately a Normal (Gaussian) distribution. This is because the sample is
representative of the population and n is large. Here, its mean would be
26/54 ≈ 0.481. The population variance of Ri is (26/54)(28/54) = 728/2916 ≈ 0.2497,
so the variance of R̄i is (728/2916)/2500 ≈ 0.0000999, and its standard deviation
(the standard error) comes out to be about 0.00999.
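The mean, variance and standard deviation quoted for this sampling distribution follow from the formulas for a 0/1 variable. The Python sketch below reproduces the arithmetic; the variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

n = 2500
p_red = Fraction(26, 54)      # P(R_i = 1) in a 54-card deck with 26 red cards

mean_rbar = p_red             # mean of the sample mean equals E[R_i]
var_ri = p_red * (1 - p_red)  # population variance of a 0/1 variable: p(1 - p)
var_rbar = var_ri / n         # variance of the sample mean
se_rbar = sqrt(var_rbar)      # standard deviation of the sample mean

print(var_ri == Fraction(728, 2916))  # True
print(round(se_rbar, 5))              # 0.00999
```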
(b) (5 points) Suppose that in your sample you found R̄i = 0.61. How likely was
this? Explain how you know. (Hint: you may want to use R to calculate something
here)
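It is highly unlikely that R̄i would be 0.61. The z score is
(0.61 − 26/54)/0.00999 ≈ 12.9, so 0.61 lies almost 13 standard deviations above the
mean of the sampling distribution. The code
‘pnorm(0.61, mean=26/54, sd=sqrt((728/2916)/2500), lower.tail=FALSE)’ gives the
probability of a sample mean at least this large, which is effectively zero.
According to the empirical rule, about 99.7% of values lie within 3 standard deviations
of the mean, thus we can say that it is highly unlikely that R̄i could be 0.61.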
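The z score and the upper-tail probability can also be recomputed outside R. The Python sketch below is an equivalent calculation under the same normal approximation; it uses `math.erfc` for the tail because subtracting a normal CDF value from 1 would round to zero this far out. The variable names are illustrative.

```python
from math import erfc, sqrt

n = 2500
p_red = 26 / 54                       # mean of the sampling distribution
se = sqrt(p_red * (1 - p_red) / n)    # standard deviation of the sample mean

observed = 0.61
z = (observed - p_red) / se           # distance from the mean in standard errors

# Upper-tail probability P(Z >= z) for a standard normal variable.
upper_tail = 0.5 * erfc(z / sqrt(2))

print(round(z, 2))   # 12.86
print(upper_tail)    # a positive number far below 1e-30
```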

(c) (5 points) Suppose you weren’t sure whether the deck truly contained 26 red
cards. Is your sample (in (b)) evidence for or against your original belief about the
composition of the deck? Explain why.
If we weren’t sure whether the deck truly contained 26 red cards, our sample is evidence
against our original belief about the composition of the deck, as the z score is too
large (about 12.9): a sample mean of 0.61 would almost never occur if 26 of the 54
cards were red.
Total for Question 2: 15

3. (10 points) Your friend is talking about statistical analysis, and they say the following:
Econometrics relies on a lot of assumptions in order to work, most of
which are completely impossible to believe. For example, we use the normal
distribution to learn things about the average value of wages. But wages are
definitely not normally distributed! This means our conclusions are
completely meaningless.

Write a short response to your friend, identifying what parts of their statement are
correct, and which ones aren’t. Relate your answer to the ideas developed in class.

The above statement is partly correct and partly incorrect. Econometrics does rely
on assumptions in order to work, as it tells us about the theoretical relationships we should
expect to exist (or not) in the population. And we cannot actually see a population, thus it
does depend on assumptions. So, we use statistics from the sample to learn about the
sampling distribution, and thereby the population. Moreover, the claim that wages are
definitely not normally distributed is correct. However, calling econometrics’ conclusions
completely meaningless is an overstatement.

When we estimate the average value of wages, our analysis entails taking repeated samples
from the sample with replacement, calculating the sample mean each time, and finally
calculating the confidence interval within which the population mean should lie.
According to the Law of Large Numbers, if the sample is representative of the population,
then as the sample size increases, the sample mean becomes a closer approximation
of the population mean. Thus, since n is large when looking at wages, the sample mean would
be a close approximation of the population mean. According to the Central Limit Theorem,
the sampling distribution of the sample mean of wages, X̄, has a known mean, equal to the
population mean of X; a known variance, equal to the variance of X divided by n; and a
known, approximately normal, distribution when n is large, whatever the distribution of X.
So by using the properties of the sampling distribution we learn things about the original
variable, that is the wages, without having any information about their distribution. Thus,
the normal distribution is applied to the sample mean, not to individual wages, and
judgments about the average wage based on it are a good approximation when the sample is
large and representative.
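The point that the sample mean can be close to normal even when wages are not can be illustrated with a simulation. The Python sketch below is purely hypothetical: the lognormal wage distribution, its parameters, the seed, and the sample sizes are all assumptions chosen for illustration and are not taken from the assignment or from any wage data.

```python
import random
from statistics import pstdev

rng = random.Random(3)  # arbitrary seed, used only for reproducibility


def draw_wage():
    # Hypothetical right-skewed wage distribution (assumed, not real data).
    return rng.lognormvariate(3.0, 0.6)


def average(values):
    return sum(values) / len(values)


def skewness(values):
    """Average cubed standardized deviation; near 0 for a symmetric distribution."""
    m, s = average(values), pstdev(values)
    return average([((v - m) / s) ** 3 for v in values])


# Individual wages: strongly right-skewed.
wages = [draw_wage() for _ in range(5000)]

# Means of 500 independent samples of n = 200 wages each.
sample_means = [average([draw_wage() for _ in range(200)]) for _ in range(500)]

skew_wages = skewness(wages)
skew_means = skewness(sample_means)

print(skew_wages, skew_means)
```

Under these assumptions the skewness of the sample means should be much closer to zero than the skewness of individual wages, which is the sense in which the normal approximation applies to the mean rather than to the wages themselves.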

Bonus Problems
6. (10 points (bonus)) In problems (1) and (2), above, we always drew with
replacement. Why? Explain, relating to the ideas developed in class.

Answer 6: In problems 1 and 2, we always drew with replacement, because of the concept
of “Bootstrapping”, that is, the idea of drawing a new sample of the same size repeatedly
from the existing sample with replacement. Bootstrapping allows us to deal with the
problem associated with empirically applying sampling distributions, namely that we can’t
normally create new samples. Thus, Bootstrapping lets us simulate samples using the
empirical distribution. It gives us a sense of what the sampling distribution might look like.
It’s not the actual sampling distribution, as we are not really generating new samples, but it
is a good approximation.
We can do this using the sample itself, by randomly drawing a new sample of the same size
repeatedly from the existing sample with replacement. We repeat this step about a
hundred or a thousand times to create a collection of bootstrap samples. Then, we
compute the sampling distribution of the statistic from this collection. Here, replacement is
crucial: if we drew a new sample of the same size without replacement, we would simply
recover the same cards as in the original sample and hence get the same result every time.
Replacement also keeps each draw independent with the same probability of red, which is
what the Law of Large Numbers and the Central Limit Theorem assume.
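The role of replacement in resampling can be seen in a short Python sketch. It resamples the ten observations recorded for sample 1 in Question 1(b); the seed and the number of resamples (1000) are arbitrary assumptions, and the variable names are illustrative.

```python
import random

rng = random.Random(6)  # arbitrary seed, used only for reproducibility

# Observations of R_i recorded for sample 1 in Question 1(b).
sample = [0, 0, 1, 0, 1, 1, 1, 0, 0, 1]
n = len(sample)

# Resampling n values WITH replacement: the mean varies across resamples.
with_replacement = [sum(rng.choices(sample, k=n)) / n for _ in range(1000)]

# Resampling n values WITHOUT replacement only reorders the same ten cards,
# so every resample has the same mean as the original sample.
without_replacement = [sum(rng.sample(sample, k=n)) / n for _ in range(1000)]

print(sorted(set(without_replacement)))  # [0.5]
print(len(set(with_replacement)) > 1)    # True
```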

End of Assignment
