
Solutions to the exercises of chapter 2

1.
a. If we use only one observation X₁, then n·X₁ = ∑ Xᵢ, where all sample variables X₁,…, Xₙ take
the same value X₁. In other words, they are (strongly) dependent: the effect of cancelling observations
with positive and negative deviations from μ disappears, causing a much larger variability.
b. 𝐸(∑ 𝑋𝑖 ) = ∑ 𝐸(𝑋𝑖 ) = 𝑛 ∙ 𝜇 and 𝐸(𝑛 ∙ 𝑋1 ) = 𝑛 ∙ 𝐸(𝑋1 ) = 𝑛 ∙ 𝜇
In case of independence (different values) we have: 𝑣𝑎𝑟(∑ 𝑋𝑖 ) = ∑ 𝑣𝑎𝑟(𝑋𝑖 ) = 𝑛 ∙ 𝜎 2 .
But if we have only one observation: 𝑣𝑎𝑟(𝑛 ∙ 𝑋1 ) = 𝑛2 ∙ 𝑣𝑎𝑟(𝑋1 ) = 𝑛2 ∙ 𝜎 2
(note that: 𝑣𝑎𝑟(𝑛 ∙ 𝑋1 ) = 𝑣𝑎𝑟(𝑋1 + 𝑋1 + ⋯ + 𝑋1 ) ≠ 𝑣𝑎𝑟(𝑋1 ) + ⋯ + 𝑣𝑎𝑟(𝑋1 ), because of the
dependence)
In conclusion: the expectations of ∑ Xᵢ and n·X₁ are the same, but their variances differ by a factor n:
using the same observation over and over again makes the variance increase much more rapidly.
Note that if the variables 𝑋1,…, 𝑋𝑛 are 𝑁(𝜇, 𝜎 2 ), then ∑ 𝑋𝑖 ~𝑁(𝑛𝜇, 𝑛𝜎 2 ) and 𝑛𝑋1 ~𝑁(𝑛𝜇, 𝑛2 𝜎 2 ).
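A small simulation can illustrate this difference in variability. The following Python/NumPy sketch is not part of the original solution; the values n = 10, μ = 5 and σ = 2 are arbitrary illustration choices.

# Illustrative sketch: compare the variability of the sum of n independent
# observations with that of n times a single observation (arbitrary mu, sigma).
import numpy as np

rng = np.random.default_rng(0)
n, mu, sigma, reps = 10, 5.0, 2.0, 100_000

samples = rng.normal(mu, sigma, size=(reps, n))
sum_indep = samples.sum(axis=1)       # sum of n independent observations
n_times_one = n * samples[:, 0]       # n copies of the same observation

print(sum_indep.var())    # close to n * sigma^2   = 40
print(n_times_one.var())  # close to n^2 * sigma^2 = 400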
2.
a. x̄ = 21.1 and s ≈ 13.19
b. Order the observations: 3, 8, 12, 15, 17, 18, 27, 29, 37, 45. The median m = (x₍₅₎ + x₍₆₎)/2 = (17 + 18)/2 = 17.5.
c. The standard deviation of the sample mean X̄ is σ_X̄ = σ/√n, so an estimate of σ_X̄ can be given by s/√n,
using the sample standard deviation s. If n = 100 we have s/√n = 13.19/10 = 1.319 and for n = 1000:
s/√1000 = 13.19/√1000 ≈ 0.4171 (the larger n is, the smaller the standard error of X̄).
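As a check, the statistics of parts a–c can be reproduced with a short Python/NumPy sketch (not part of the original solution):

# Sketch: reproduce the sample statistics of exercise 2 for the given data.
import numpy as np

x = np.array([3, 8, 12, 15, 17, 18, 27, 29, 37, 45])

print(x.mean())                        # 21.1
print(x.std(ddof=1))                   # sample standard deviation, about 13.19
print(np.median(x))                    # 17.5
print(x.std(ddof=1) / np.sqrt(100))    # estimated standard error for n = 100, about 1.319
print(x.std(ddof=1) / np.sqrt(1000))   # estimated standard error for n = 1000, about 0.417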

3.
a. If we define 𝑋 = “the weight of an egg”, where 𝑋~𝑁(𝜇, 𝜎 2 ), we have to determine 𝑃(𝑋 > 68.5):
P(X > 68.5) = P(Z > (68.5 − μ)/σ) = 1 − Φ((68.5 − μ)/σ) (a function of the unknown μ and σ)
b. The estimate at hand for this probability is found by simply replacing 𝜇 and 𝜎 by their estimates:
P(X > 68.5) ≈ 1 − Φ((68.5 − 56.3)/7.6) ≈ 1 − Φ(1.61) = 1 − 0.9463 ≈ 5.4%
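The plug-in estimate in part b can be evaluated directly, for instance with the following SciPy sketch (assuming, as in the exercise, the estimates 56.3 for μ and 7.6 for σ; not part of the original solution):

# Sketch: plug-in estimate of P(X > 68.5) with mu and sigma replaced by estimates.
from scipy.stats import norm

mu_hat, sigma_hat = 56.3, 7.6
print(1 - norm.cdf(68.5, loc=mu_hat, scale=sigma_hat))   # about 0.054, i.e. roughly 5.4%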

4. The solutions to a. and b. are as follows for the 4 estimators:


1. 𝐸(𝑇1 ) = 𝐸(𝑋1 ) = 𝜇 and 𝑣𝑎𝑟(𝑇1 ) = 𝑣𝑎𝑟(𝑋1 ) = 𝜎 2 , this is an unbiased estimator.
So the Mean Squared Error is: 𝐸(𝑇1 − 𝜇)2 = 𝑣𝑎𝑟(𝑋1 ) = 𝜎 2
2. E(T₂) = E((X₁ + X₂)/2) = ½(EX₁ + EX₂) = ½(μ + μ) = μ (T₂ is an unbiased estimator!) and
var(T₂) = var((X₁ + X₂)/2) = ¼·var(X₁ + X₂) = ¼·(var(X₁) + var(X₂)) = ½σ², so
E(T₂ − μ)² = var(T₂) = ½σ² (< var(T₁), so T₂ is better than T₁)
3. 𝐸(𝑇3 ) = 𝐸𝑋1 + 𝐸𝑋2 + ⋯ + 𝐸𝑋10 = 10𝜇 (not an unbiased estimator of µ) and
(because of independence:) 𝑣𝑎𝑟(𝑇3 ) = 𝑣𝑎𝑟(𝑋1 ) + ⋯ + 𝑣𝑎𝑟(𝑋10 ) = 10𝜎 2 .
Because of the bias we will use the formula 𝐸(𝑇 − 𝜃)2 = (𝐸𝑇 − 𝜃)2 + 𝑣𝑎𝑟(𝑇) :
𝐸(𝑇3 − 𝜇)2 = (𝐸𝑇3 − 𝜇)2 + 𝑣𝑎𝑟(𝑇3 ) = (10𝜇 − 𝜇)2 + 10𝜎 2 = 10𝜎 2 + 81𝜇 2
4. E(T₄) = E((X₁ + X₂ + ⋯ + X₁₀)/10) = (EX₁ + EX₂ + ⋯ + EX₁₀)/10 = 10μ/10 = μ (an unbiased estimator of μ) and
var(T₄) = var((X₁ + X₂ + ⋯ + X₁₀)/10) = (1/10²)·(var(X₁) + ⋯ + var(X₁₀)) = (1/100)·10σ² = σ²/10.
So E(T₄ − μ)² = var(T₄) = σ²/10
In conclusion: the sample mean T₄ = X̄ is the estimator based on 10 variables that has the smallest
Mean Squared Error, and therefore it is the best of these four estimators.
(Remark: T₁ and T₂ are sample means as well, but based on only 1 or 2 observations (larger variance);
T₃ has both a bias and a large variance.)
c. Since µ is unknown, this is not an estimator: an estimator should yield a real number (the
estimate) once we replace the sample variables by their observed values. In this case it would remain a
function of the unknown µ.
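The Mean Squared Errors of the four estimators can also be approximated by simulation; the Python/NumPy sketch below is only an illustration (μ = 3 and σ = 2 are arbitrary choices, so σ² = 4) and not part of the original solution.

# Sketch: estimate the MSE of T1, ..., T4 by simulation (arbitrary mu and sigma).
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, reps = 3.0, 2.0, 200_000
X = rng.normal(mu, sigma, size=(reps, 10))

T1 = X[:, 0]                      # a single observation
T2 = (X[:, 0] + X[:, 1]) / 2      # mean of two observations
T3 = X.sum(axis=1)                # sum of all ten observations (biased)
T4 = X.mean(axis=1)               # the sample mean

for name, T in [("T1", T1), ("T2", T2), ("T3", T3), ("T4", T4)]:
    print(name, np.mean((T - mu) ** 2))
# expected values: sigma^2 = 4, sigma^2/2 = 2, 10*sigma^2 + 81*mu^2 = 769, sigma^2/10 = 0.4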
5.
a. If the estimator is unbiased then 𝐸(𝑇) = 𝜇 , which is true for 𝑇1 and 𝑇2 :
E(T₁) = E(½(X̄ + Ȳ)) = ½(EX̄ + EȲ) = ½(μ + μ) = μ and E(T₂) = (m·E(X̄) + n·E(Ȳ))/(m + n) = (m + n)μ/(m + n) = μ
b. Since both estimators are unbiased, the best is the one with the smallest variance (= the Mean Squared
Error for unbiased estimators):
var(T₁) = ¼·var(X̄ + Ȳ) = ¼·(var(X̄) + var(Ȳ)) = ¼·(σ²/m + σ²/n) = (1/(4m) + 1/(4n))·σ² and
var(T₂) = var(m/(m+n)·X̄ + n/(m+n)·Ȳ) = m²/(m+n)²·var(X̄) + n²/(m+n)²·var(Ȳ) = m²/(m+n)²·σ²/m + n²/(m+n)²·σ²/n
= (m/(m+n)² + n/(m+n)²)·σ² = (m+n)/(m+n)²·σ² = 1/(m+n)·σ²

var(T₁) < var(T₂) if 1/(4m) + 1/(4n) < 1/(m+n), or: ((m+n)n + (m+n)m − 4mn)/(4(m+n)mn) = (n² + m² − 2mn)/(4(m+n)mn) < 0,
so if n² + m² − 2mn = (m − n)² < 0. This inequality has no solutions for m and n.
So 𝑇2 is the best estimator of µ, except if 𝑚 = 𝑛.
In the latter case they are equally good (then 𝑇2 = 𝑇1 ).
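For concrete sample sizes the two variances can simply be compared numerically; a minimal Python sketch (with σ = 1, so the printed numbers are the factors of σ²; the pairs (m, n) are arbitrary):

# Sketch: compare var(T1) and var(T2) as multiples of sigma^2 for some m and n.
for m, n in [(5, 5), (5, 20), (2, 50)]:
    var_T1 = 1 / (4 * m) + 1 / (4 * n)
    var_T2 = 1 / (m + n)
    print(m, n, var_T1, var_T2)   # var_T2 <= var_T1, equal only when m == n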

6.
a. The trials (each person votes for party A or not) should be independent (Bernoulli trials): that is indeed
the case if the draws are with replacement. Even if we draw without replacement, we can apply the
binomial distribution approximately, but then we should have a relatively small sample drawn
from a large population: not replacing a drawn element will then only marginally affect the proportion of
successes and failures, so the draws are approximately independent.
b. See the answer in part a.: when sampling without replacement the population should be sufficiently
large. In probability theory we mentioned a population size of at least 5n² (for sample size n).
c. X ~ B(n, p), so E(X/n) = (1/n)·E(X) = (1/n)·np = p. So X/n is an unbiased estimator of p.
var(X/n) = (1/n²)·var(X) = (1/n²)·np(1 − p) = p(1 − p)/n
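These two properties can be checked by simulation; the following Python/NumPy sketch is not part of the original solution (n = 50 and p = 0.3 are arbitrary illustration values):

# Sketch: check that X/n is unbiased for p and has variance p(1-p)/n.
import numpy as np

rng = np.random.default_rng(2)
n, p, reps = 50, 0.3, 200_000
p_hat = rng.binomial(n, p, size=reps) / n

print(p_hat.mean())   # close to p = 0.3
print(p_hat.var())    # close to p*(1-p)/n = 0.0042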
d. X/n is approximately N(p, p(1 − p)/n)-distributed (for sufficiently large n).
We have to determine n such that P(p − 0.1 ≤ X/n ≤ p + 0.1) ≥ 0.95.
Standardizing X/n we find: P(−0.1/√(p(1 − p)/n) ≤ Z ≤ 0.1/√(p(1 − p)/n)) ≥ 0.95
Since P(−1.96 ≤ Z ≤ 1.96) = 0.95, we have: 0.1/√(p(1 − p)/n) = 0.1√n/√(p(1 − p)) ≥ 1.96,
so n ≥ 19.6²·p(1 − p).
Using the property p(1 − p) ≤ ¼ we find: n ≥ 96.04, so for n-values of at least 96 or 97 (remember we
applied an approximation to find n) the condition is fulfilled.
e. Replacing 0.1 by 0.01 in the same computation, we find: n ≥ 196²·p(1 − p), so (using p(1 − p) ≤ ¼) n should be at least 9604.
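The sample-size bounds of parts d and e follow from the same small computation; a Python sketch (not part of the original solution) using the worst case p(1 − p) = ¼ and the quantile 1.96:

# Sketch: required n so that X/n is within the margin of p with probability 0.95.
import math

z = 1.96
for margin in (0.1, 0.01):
    n_required = (z / margin) ** 2 * 0.25   # worst case p(1-p) = 1/4
    print(margin, math.ceil(n_required))    # 0.1 -> 97, 0.01 -> 9604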

7.
a.
(Figure: the density function of the yearly return X.)
P(X < 0) = P((X − μ)/(2μ) < (0 − μ)/(2μ)) = P(Z < −½) = 1 − P(Z ≤ 0.50) = 1 − Φ(0.50) = 30.85%
(here σ = 2μ, since σ² = 4μ²)
(𝑍 is 𝑁(0, 1))
b. X̄ ~ N(μ, σ²/n) and since σ² = 4μ², we have X̄ ~ N(μ, 4μ²/n).
c. Y is an unbiased estimator of μ if E(Y) = μ.
For the sample mean, E(X̄) = μ is always true.
The unbiasedness of Y implies E(Y) = μ:
E(Y) = E(aX̄) = a·E(X̄) = aμ = μ, so a = 1.
d. For Y we have: E(Y) = aμ (which is not equal to µ if a ≠ 1)
and var(Y) = var(aX̄) = a²·var(X̄) = a²·4μ²/10 = (2/5)·a²μ² (if a decreases, var(Y) decreases).
(Note that E(Y − μ)² = (E(Y) − μ)² + var(Y) really splits the Mean Squared Error into the
squared bias (E(Y) − μ)² and the variance of Y: if we choose a smaller than 1, the second term var(Y) will
decrease, but on the other hand the first term (E(Y) − μ)² will increase (it becomes larger than 0).
We are searching for the value of a such that the sum of these two effects is smallest.)
The Mean Squared Error is, expressed in the unknown (fixed) μ and a:
E(Y − μ)² = (E(Y) − μ)² + var(Y) = (aμ − μ)² + (2a²/5)·μ² = μ²·[(a − 1)² + 2a²/5]
This Mean Squared Error takes on its smallest value if f(a) = (a − 1)² + 2a²/5 is minimal
(assuming that μ ≠ 0).
The derivative is 0 at the extreme values of f: f′(a) = 2a − 2 + 4a/5 = 2.8a − 2 = 0,
or: 2.8a = 2, so a = 5/7.
5 5
Considering the signs of the derivative: if 𝑎 < 7 , 𝑓 `(𝑎) < 0 (decreasing) and increasing if 𝑎 > 7
(or computing the second derivative f ``(5/7) = 2.8 > 0),
5
we can draw the conclusion that 𝑓 attains its minimum at this point 𝑎 = 7 .
Therefore Y = (5/7)·X̄ = (1/14)·(X₁ + ⋯ + X₁₀) is the best estimator of μ, with:
E(Y) = (5/7)·μ and var(Y) = (2a²/5)·μ² = 10μ²/49.
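The optimum a = 5/7 can also be found numerically; the SciPy sketch below (not part of the original solution) minimizes the factor f(a) of the Mean Squared Error:

# Sketch: minimize f(a) = (a-1)^2 + 2a^2/5 numerically and compare with a = 5/7.
from scipy.optimize import minimize_scalar

f = lambda a: (a - 1) ** 2 + 2 * a ** 2 / 5
res = minimize_scalar(f)
print(res.x, 5 / 7)         # both approximately 0.714
print(2 * res.x ** 2 / 5)   # var(Y) as a multiple of mu^2: 10/49, about 0.204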
