Ch-5 Solution
Statistical Estimation
5.1 Introduction
5.2 The Methods of finding Point estimators
5.3 Some Desirable Properties of Point Estimators
5.4 Confidence Interval for
5.5 A Confidence Interval for the Population Variance
5.6 Confidence Interval Concerning Two Population Parameters
5.7 Chapter Summary
5.8 Computer Examples
Projects for Chapter 5
Exercise 5.2
5.2.1
(a) We can equate the first sample moment $\bar{x}$ to the first moment of our distribution and solve for the parameter we wish to estimate, giving
$\hat{p} = \dfrac{1}{\bar{x}}$.
However, this estimate only holds if all $x_i \ge 1$.
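The method-of-moments step described above can be sketched in Python. This is a minimal illustration assuming a geometric-type model with $E(X) = 1/p$; the function name is ours, not from the text.

```python
def mom_geometric(sample):
    """Method-of-moments estimate p-hat = 1 / x-bar, assuming E(X) = 1/p."""
    if any(x < 1 for x in sample):
        raise ValueError("estimate only valid when all x_i >= 1")
    xbar = sum(sample) / len(sample)  # first sample moment
    return 1.0 / xbar                 # solve x-bar = 1/p for p

print(mom_geometric([1, 2, 1, 4, 2]))  # x-bar = 2, so p-hat = 0.5
```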
5.2.2
(a)
(b)
(c)
(d)
= =1.4814
5.2.3
(a)
$E(X) = \displaystyle\int_a^b \frac{x}{b-a}\,dx = \frac{1}{2}(a+b)$
where
= =12.492
5.2.4
(a)
(b)
5.2.5
(a) Here,
Let,
Then
Now,
is the method of moment estimator of θ.
(b)
We have,
Then,
.
5.2.8
We are given,
Now, we have from (i)
5.2.9
(a)
Then,
(b)
5.2.10
We know,
5.2.11
For a normal distribution
Now,
Again,
By invariance property
5.2.12
We know that for beta distribution,
Then,
Let
Also,
We can write
Similarly,
5.2.13
For a normal distribution
Now,
Again,
5.2.14
Tossing a coin $n$ times yields a binomial distribution, i.e. $n$ Bernoulli trials, represented below:
Now,
The factor is a fixed constant and does not affect the MLE. So,
By Invariance property,
5.2.15
We have,
Now,
5.2.16
Now,
5.2.17
Now,
The factor is a fixed constant and does not affect the MLEs. So,
5.2.18
Now,
The above two equations cannot be solved analytically. Numerical computation is required.
5.2.19
Now,
5.2.20
We have,
Here, the likelihood is not differentiable, so we maximize it directly with respect to the parameter. We obtain
= and
5.2.21
we have to maximize the likelihood with respect to the parameter. We minimize the relevant term so that the likelihood is maximized. Hence,
Similarly,
5.2.22
5.2.23
5.2.24
It is difficult to solve the above equation analytically for the MLE. We can use the Newton-Raphson method to solve it numerically and obtain the MLE.
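A generic Newton-Raphson iteration for a score equation $g(\theta) = 0$ might look like the sketch below; the cubic used here is purely illustrative, not the exercise's likelihood equation.

```python
def newton_raphson(g, g_prime, theta0, tol=1e-10, max_iter=100):
    """Iterate theta <- theta - g(theta)/g'(theta) until the step is tiny."""
    theta = theta0
    for _ in range(max_iter):
        step = g(theta) / g_prime(theta)
        theta -= step
        if abs(step) < tol:
            return theta
    raise RuntimeError("Newton-Raphson did not converge")

# Illustrative equation: solve t^3 - 2 = 0 (root is the cube root of 2).
root = newton_raphson(lambda t: t**3 - 2, lambda t: 3 * t**2, 1.0)
print(root)  # about 1.2599
```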
5.2.25
The pdf for a normal distribution is:
The likelihood function is:
=50.189
=
Now,
The above two equations cannot be solved analytically to obtain separate estimates for and .
Now,
5.2.28
5.2.29
We are given,
Now,
5.2.30
Exercise 5.3
5.3.1
(a) $E(X) = \displaystyle\int_\theta^\infty x\, e^{-(x-\theta)}\,dx$. With the substitution $u = x - \theta$,
$E(X) = \displaystyle\int_0^\infty (u+\theta)e^{-u}\,du = \Gamma(2) + \theta\int_0^\infty e^{-u}\,du = 1 - \theta\left[e^{-u}\right]_0^\infty = 1 - \theta\left(e^{-\infty} - e^{0}\right) = 1 + \theta$
Now,
$E(\bar{X}) = E\!\left(\dfrac{1}{n}\sum_{i=1}^{n} X_i\right) = \dfrac{1}{n}\sum_{i=1}^{n} E(X_i) = \dfrac{1}{n}\sum_{i=1}^{n}(1+\theta) = 1 + \theta$
(b) As seen from part (a), $E(\bar{X}) - (1+\theta) = 0$. Hence, $\bar{X}$ is an unbiased estimator for $1+\theta$.
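The unbiasedness result can also be checked by simulation. This sketch assumes the shifted-exponential model $X = \theta + \mathrm{Exp}(1)$ from part (a), with an arbitrary illustrative value of $\theta$.

```python
import random

random.seed(0)
theta = 2.0          # true parameter (chosen for illustration)
n, reps = 50, 20000  # sample size and number of replications

# Average of many sample means; should be near E(X-bar) = 1 + theta.
grand_mean = sum(
    sum(theta + random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(reps)
) / reps
print(grand_mean)  # close to 1 + theta = 3.0
```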
5.3.2
Now,
Then,
5.3.3
Sample Standard deviation
5.3.4
Now,
We have
Then we get,
5.3.5
$c_1 X_1 + \cdots + c_n X_n$.
5.3.6
Now,
Therefore,
(a) The first population moment is $E(X) = \dfrac{\theta}{2}$ and the first sample moment is $\bar{X}$.
Hence, $\dfrac{\theta}{2} = \bar{X} \;\Rightarrow\; \hat{\theta}_2 = 2\bar{X}$.
(b) $E(\hat{\theta}_1) = E(Y_{(n)}) = \dfrac{n\theta}{n+1} \neq \theta$. Hence, $\hat{\theta}_1$ is a biased estimator for $\theta$.
Now,
Hence, it is a biased estimator.
5.3.9
We have,
is the unbiased estimator. By definition, the unbiased estimator that minimizes the variance is the MVUE.
5.3.10
Also,
Now,
Since for a normal distribution the population median equals the mean, $E(M) = E(\bar{X}) = \mu$. Hence, both are unbiased estimators for $\mu$.
$\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$ and $\mathrm{Var}(M) = \dfrac{\pi\sigma^2}{2(n-1)}$. Next, we can see that $\dfrac{\mathrm{Var}(\bar{X})}{\mathrm{Var}(M)} = \dfrac{\sigma^2/n}{\pi\sigma^2/\big(2(n-1)\big)} = \dfrac{2(n-1)}{n\pi} < 1$. Hence, $\mathrm{Var}(\bar{X}) < \mathrm{Var}(M)$.
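The variance comparison between the sample mean and the sample median can also be seen numerically. A simulation sketch for standard normal data (sample size and replication count are illustrative):

```python
import random
import statistics

random.seed(1)
n, reps = 25, 5000
means, medians = [], []
for _ in range(reps):
    sample = [random.gauss(0, 1) for _ in range(n)]
    means.append(statistics.mean(sample))
    medians.append(statistics.median(sample))

var_mean = statistics.pvariance(means)      # roughly sigma^2/n = 0.04
var_median = statistics.pvariance(medians)  # roughly pi*sigma^2/(2n) = 0.063
print(var_mean, var_median)
```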
5.3.12
is sufficient for .
5.3.13
$f(x;\sigma) = \dfrac{1}{2\sigma}\exp\!\left(-\dfrac{|x|}{\sigma}\right) \;\Rightarrow\; f(x;\sigma) = \exp\!\left(-\dfrac{|x|}{\sigma} - \ln(2\sigma)\right)$
Hence, according to Theorem 5.3.6, $|X|$ is a sufficient statistic for $\sigma$.
5.3.14
If a statistic is sufficient for the parameter, then by the Neyman factorization theorem the joint distribution factors so that the likelihood depends on the data only through that statistic. Since the maximizer of the likelihood is the MLE, the MLE must be a function of the sufficient statistic.
5.3.15
Now,
and
Therefore,
is sufficient for .
Similarly,
and
Therefore,
is sufficient for .
5.3.16
(a)
Now,
Therefore,
From question 16, the two sufficient statistics for the parameter are
and
From the above data,
=49.8
=1.992
5.3.17
is sufficient for .
5.3.18
Now,
Therefore,
is sufficient for .
5.3.19
is sufficient for
5.3.20
Therefore,
is sufficient for .
5.3.21
We have,
5.3.22
Therefore,
is sufficient for
5.3.23
is sufficient for .
Exercise 5.4
5.4.1
(a) We are 99% confident that the true population parameter lies in our confidence interval.
(b) The 99% confidence interval is the wider one: holding everything else constant (e.g., sample size and parameters), increasing the confidence level increases the interval width.
(c) If the population variance is unknown, we use a t-distribution; if the population variance is known and the sample size is large enough for the CLT to apply, we use the standard normal distribution.
(d) As the sample size increases, the confidence interval becomes narrower.
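Point (d) is easy to see numerically: the z-interval half-width $z\,\sigma/\sqrt{n}$ shrinks as $n$ grows. The values of $z$ and $\sigma$ below are illustrative, not from the text.

```python
import math

z, sigma = 1.96, 10.0  # 95% normal quantile and an assumed population sigma
widths = [z * sigma / math.sqrt(n) for n in (25, 100, 400)]
print(widths)  # roughly [3.92, 1.96, 0.98]: halves each time n quadruples
```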
5.4.2
For as a pivot
,
This implies
,
98% confidence interval for is
5.4.3
(a) $Z$ is a standard normal random variable, so we treat $-2.81$ and $2.75$ as standard normal quantiles.
$P\!\left(\bar{x} - 2.81\dfrac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + 2.75\dfrac{\sigma}{\sqrt{n}}\right) = 0.9945$
(c) 0.9945
(d)
5.4.4
a. Here
b.
, ,
Therefore,
5.4.7
we construct the pivotal quantity, whose distribution does not depend on the parameter. Now, we can construct our confidence interval as follows:
$P\!\left(\dfrac{2\hat{\theta}}{\Gamma_{n,2,0.975}} \le \theta \le \dfrac{2\hat{\theta}}{\Gamma_{n,2,0.025}}\right) = 0.95$
5.4.8
(a)
Now,
(b)
5.4.9
(a)
(b) If the population variance is unknown, we use the sample variance in the confidence interval and the t-distribution as the sampling distribution. Thus, the confidence interval is
5.4.10
If is known and is used as an estimator of the mean , then confidence
interval is given by
i.e.
5.4.11
Then we construct our pivotal quantity . Thus our confidence interval is:
.
This assumes CLT and that our sample size is large enough.
Exercise 5.5
5.5.1
a.
b.
= (0.572, 0.628)
c.
= (0.13, 0.17)
d.
Here
b)
c)
5.5.4
Assumption: normality, justified by the large sample size, with the population variance known.
5.5.5
5.5.6
Here
5.5.7
We assume that .
95% confidence interval is
5.5.8
5.5.9
5.5.10
95% confidence interval is given by
Proportion of defectives
(b) Yes, the assumption of normality is valid, based on the sampling distribution of the proportion and the central limit theorem.
(c)
5.5.12
95% confidence interval is given by
5.5.13
5.5.15
(b)
= (296.268, 503.75)
(c)
With 95% confidence, we can say that there are between approximately 3 and 4 people living in the household. Similarly, monthly food expenses are approximately in the range of 296 to 503 for the household.
5.5.17
For the 90% confidence interval for the mean, we have the following:
5.5.18
and
The 95% confidence interval is given by
5.5.19
We are 95% confident that the true proportion of women at least 35 years of age who are pregnant with a fetus affected by Down syndrome and who will receive positive test results from this procedure lies between 0.781 and 0.953.
5.5.20
(a)
Now,
(b)
From the given data,
5.5.21
We have,
5.5.22
, we get
5.5.24
Here, we have
5.5.25
(a) The t -distribution is more spread out (i.e. has more area in the tails) than the normal distribution.
(b) This increases the size of confidence interval
(c) Yes, sample size makes a difference. The greater the sample size the less additional area in the
tails as compared to the normal curve. The t-distribution approaches the z-distribution as a limit when
n increases.
(d) The population distribution of the variable is normally distributed
5.5.26
(a) We have,
(b) We have,
From t-table, we get
(c) We have,
5.5.27
We have,
(a)
(b)
5.5.29
We are 98% confident that the population mean lies between (-3.03, -1.41).
5.5.30
5.5.31
Now, 95% confidence interval is given by
5.5.32
5.5.33
We are 99% confident that the actual average Hb level in children with chronic diarrhea for this city lies between 11.99 and 15.41.
In the boxplot, the data seem slightly skewed, and in the normal plot the data deviate a bit from the normal line at the ends. This skewness could be due to the small random sample size.
5.5.34
Now, 99% confidence interval is given by
5.5.35
We are 95% confident that the mean peak CK activity lies between 237.65 and 584.21.
5.5.36
5.5.37
= =97.24
(b)
5.5.39
We are 98% confident that the population mean lies between (2.69, 5.15).
5.5.40
(a)
In the boxplot, the data seem slightly skewed, and in the normal plot the data deviate a bit from the normal line at the ends. This skewness could be due to the small random sample size.
(b)
(c)
Looking at the normal plot, the points are mostly around the normal line, suggesting that the data come from a normal population.
(b)
We are 95% confident that the population mean stopping distance μ lies between 147.2095 and 149.1904.
5.5.42
(a)
(b)
Looking at the normal plot, the points are mostly around the normal line, suggesting that the data come from a normal population.
Exercise 5.6
5.6.1
90% confidence interval for is given by
5.6.2
5.6.3
5.6.4
$\left[\dfrac{(n-1)s^2}{\chi^2_{17,0.005}},\; \dfrac{(n-1)s^2}{\chi^2_{17,0.995}}\right] = \left[\dfrac{17 \times 1.02}{35.7185},\; \dfrac{17 \times 1.02}{5.6972}\right] = [0.4855,\, 3.0436]$
We are 99% confident that the true variance is between 0.4855 and 3.0436.
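The interval arithmetic above can be reproduced with a short script. The chi-square quantiles are taken from tables (as in the solution), not computed here.

```python
n, s2 = 18, 1.02
chi2_upper, chi2_lower = 35.7185, 5.6972  # chi^2_{17,0.005} and chi^2_{17,0.995} from tables

lower = (n - 1) * s2 / chi2_upper
upper = (n - 1) * s2 / chi2_lower
print(round(lower, 4), round(upper, 4))  # 0.4855 3.0436
```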
5.6.6
We have,
$\left[\dfrac{(n-1)s^2}{\chi^2_{24,0.005}},\; \dfrac{(n-1)s^2}{\chi^2_{24,0.995}}\right] = \left[\dfrac{24 \times 148.44}{45.5585},\; \dfrac{24 \times 148.44}{9.8862}\right] = [78.1975,\, 360.3556]$
c) We are 99% confident that the true variance is between 78.1975 and 360.3556.
Assumption: The population is normal.
5.6.8
a.
We are 95% confident that the true variance is between 26.6371 and 123.604.
5.6.10
= (6.05,12.03)
5.6.11
(a)
We are 99% confident that the true variance is between 2.068 and 11.658.
(b)
Looking at the normal plot, the points are mostly around the normal line, suggesting that the data come from a normal population.
Exercise 5.7
5.7.1
$n_1 = 10,\; n_2 = 10,\; \bar{x}_1 = 98.4,\; \bar{x}_2 = 95.4,\; s_1^2 = 235.6,\; s_2^2 = 87.16,\; C = 98\%$
Assuming a common variance, the pooled sample variance is
$s_p^2 = \dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2} = \dfrac{9(235.6) + 9(87.16)}{18} = 161.38$
$\therefore s_p = \sqrt{161.38} = 12.7035$
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,18}\, s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}} = (98.4 - 95.4) \pm 2.552 \times 12.7035\sqrt{\dfrac{1}{10} + \dfrac{1}{10}} = [-11.4984,\, 17.4984]$
We are 98% confident that the true difference in the mean number of components assembled by the two methods is between −11.4984 and 17.4984.
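As a numerical cross-check of the pooled interval in 5.7.1 (the t quantile 2.552 is taken from tables, as in the solution):

```python
import math

n1, n2 = 10, 10
x1, x2 = 98.4, 95.4
s1_sq, s2_sq = 235.6, 87.16
t_crit = 2.552  # t_{0.01, 18} from tables

sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)  # pooled variance
half_width = t_crit * math.sqrt(sp_sq) * math.sqrt(1 / n1 + 1 / n2)
diff = x1 - x2
print(diff - half_width, diff + half_width)  # about -11.498 and 17.498
```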
5.7.2
= (-2.53, 10.41)
We are 95% confident that the true difference is between −2.53 and 10.41.
5.7.3
, ,
, ,
5.7.4
, ,
, ,
= (-7.5668, -1.275)
We are 99% confident that the true difference is between −7.5668 and −1.275.
5.7.5
$n_1 = 25,\; n_2 = 23,\; \bar{x}_1 = 58550,\; \bar{x}_2 = 53700,\; s_1 = 4000,\; s_2 = 3200,\; C = 90\%$
$s_p = \sqrt{13245217.39} = 3639.3979$
The 90% confidence interval for $\mu_1 - \mu_2$ is
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,46}\, s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}} = (58550 - 53700) \pm 1.6787 \times 3639.3979\sqrt{\dfrac{1}{25} + \dfrac{1}{23}} = [3084.8187,\, 6615.1813]$
We are 90% confident that the true difference between the male and female mean salaries is between \$3084.82 and \$6615.18.
5.7.6
5.7.7
$n_1 = 40,\; n_2 = 32,\; \bar{x}_1 = 28.4,\; \bar{x}_2 = 25.6,\; s_1 = 4.1,\; s_2 = 4.5,\; C = 99\%$
a) Assuming that the two samples are from normal distributions, $\bar{x}_1 \sim N\!\left(\mu_1, \dfrac{\sigma_1^2}{n_1}\right)$ and $\bar{x}_2 \sim N\!\left(\mu_2, \dfrac{\sigma_2^2}{n_2}\right)$.
Then we have $\bar{x}_1 - \bar{x}_2 \sim N\!\left(\mu_1 - \mu_2,\; \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}\right)$.
So the joint density function is
$f(\bar{x}_1, \bar{x}_2) = \dfrac{1}{\sqrt{2\pi\left(\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}\right)}} \exp\!\left[\dfrac{-1}{2\left(\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}\right)}\big((\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)\big)^2\right]$
The log-likelihood function is
$\ln L(\mu_1, \mu_2) = \dfrac{-1}{2\left(\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}\right)}\big((\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)\big)^2 - \dfrac{1}{2}\ln\!\left[2\pi\left(\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}\right)\right]$
$\dfrac{\partial \ln L(\mu_1, \mu_2)}{\partial(\mu_1 - \mu_2)} = \dfrac{\big((\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)\big)}{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} = 0$
$(\hat{\mu}_1 - \hat{\mu}_2)_{MLE} = \bar{x}_1 - \bar{x}_2$
b) Since $n_1, n_2 > 30$, by the large-sample approximation
$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = (28.4 - 25.6) \pm 2.58\sqrt{\dfrac{4.1^2}{40} + \dfrac{4.5^2}{32}} = [0.1524,\, 5.4476]$
We are 99% confident that the true difference in means is between 0.1524 and 5.4476.
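The part (b) arithmetic can be verified directly ($z_{0.005} = 2.58$ is taken from tables, as in the solution):

```python
import math

n1, n2 = 40, 32
x1, x2 = 28.4, 25.6
s1, s2 = 4.1, 4.5
z = 2.58  # z_{0.005} from tables

se = math.sqrt(s1**2 / n1 + s2**2 / n2)  # large-sample standard error
diff = x1 - x2
print(diff - z * se, diff + z * se)  # about 0.152 and 5.448
```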
5.7.8
, ,
, ,
= (1.0736, 1.7263)
We are 99% confident that the true difference in means is between 1.0736 and 1.7263.
5.7.9
$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = (148822 - 155908) \pm 2.33\sqrt{\dfrac{21000^2}{100} + \dfrac{23000^2}{150}} = [-13650.10,\, -521.90]$
We are 98% confident that the true difference in prices of houses in 2000 and 2001 is between −\$13650.10 and −\$521.90. This means that the average prices of houses in 2000 were lower than in 2001.
5.7.10
= (0.0145, 0.1028)
5.7.11
$n_1 = 11,\; n_2 = 13,\; \bar{x}_1 = 35.18,\; \bar{x}_2 = 38.69,\; s_1^2 = 19.76,\; s_2^2 = 12.69,\; C = 90\%$
Using the F table, we have $F_{n_1-1,\,n_2-1,\,1-\alpha/2} = F_{10,12,0.95} = 2.75$ and
$\dfrac{1}{F_{n_1-1,\,n_2-1,\,\alpha/2}} = F_{n_2-1,\,n_1-1,\,1-\alpha/2} = F_{12,10,0.95} = 2.91$
The 90% confidence interval for $\dfrac{\sigma_1^2}{\sigma_2^2}$ is
$\left[\left(\dfrac{s_1^2}{s_2^2}\right)\dfrac{1}{F_{n_1-1,\,n_2-1,\,1-\alpha/2}},\; \left(\dfrac{s_1^2}{s_2^2}\right)\dfrac{1}{F_{n_1-1,\,n_2-1,\,\alpha/2}}\right] = \left[\dfrac{19.76}{12.69}\times\dfrac{1}{2.75},\; \dfrac{19.76}{12.69}\times 2.91\right] = [0.5662,\, 4.5313]$
We are 90% confident that the ratio of the true variances, $\dfrac{\sigma_1^2}{\sigma_2^2}$, lies between 0.5662 and 4.5313.
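The F-interval computation above can be reproduced as follows (F quantiles from tables, as in the solution):

```python
s1_sq, s2_sq = 19.76, 12.69
F_upper = 2.75  # F_{10,12,0.95} from tables
F_swap = 2.91   # F_{12,10,0.95} = 1 / F_{10,12,0.05} from tables

ratio = s1_sq / s2_sq
lower = ratio / F_upper  # divide by the upper-tail F quantile
upper = ratio * F_swap   # multiply by the swapped-df F quantile
print(round(lower, 4), round(upper, 4))  # 0.5662 4.5313
```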
5.7.12
, ,
, ,
We are 90% confident that the ratio of the true variances, $\dfrac{\sigma_1^2}{\sigma_2^2}$, lies between 0.1625 and 0.7839.
5.7.13
$n_1 = 12,\; n_2 = 12,\; \bar{x}_1 = 68.92,\; \bar{x}_2 = 80.67,\; s_1 = 16.77,\; s_2 = 10.86,\; C = 95\%$
$\nu = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1-1} + \dfrac{(s_2^2/n_2)^2}{n_2-1}} = \dfrac{\left(\dfrac{16.77^2}{12} + \dfrac{10.86^2}{12}\right)^2}{\dfrac{(16.77^2/12)^2}{11} + \dfrac{(10.86^2/12)^2}{11}} = 18.85 \approx 18 \text{ (rounded down)}$
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\nu}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = (68.92 - 80.67) \pm 2.101\sqrt{\dfrac{16.77^2}{12} + \dfrac{10.86^2}{12}} = [-23.8676,\, 0.3676]$
We are 95% confident that the true difference in scores of the average and gifted students is between −23.8676 and 0.3676.
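The Welch (Satterthwaite) degrees-of-freedom step above can be sketched as:

```python
import math

n1, n2 = 12, 12
s1, s2 = 16.77, 10.86

a, b = s1**2 / n1, s2**2 / n2  # per-sample variance contributions
nu = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
print(nu, math.floor(nu))  # about 18.85, rounded down to 18
```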
$\dfrac{1}{F_{n_1-1,\,n_2-1,\,\alpha/2}} = F_{n_2-1,\,n_1-1,\,1-\alpha/2} = F_{11,11,0.975} = 3.47$
The 95% confidence interval for $\dfrac{\sigma_1^2}{\sigma_2^2}$ is
$\left[\left(\dfrac{s_1^2}{s_2^2}\right)\dfrac{1}{F_{n_1-1,\,n_2-1,\,1-\alpha/2}},\; \left(\dfrac{s_1^2}{s_2^2}\right)\dfrac{1}{F_{n_1-1,\,n_2-1,\,\alpha/2}}\right] = \left[\dfrac{16.77^2}{10.86^2}\times\dfrac{1}{3.47},\; \dfrac{16.77^2}{10.86^2}\times 3.47\right] = [0.6872,\, 8.2744]$
We are 95% confident that the ratio of the variances in test scores for regular and gifted students, $\dfrac{\sigma_1^2}{\sigma_2^2}$, lies between 0.6872 and 8.2744.
c) Assumptions: The populations are normal and the samples are independent.
[Figure: Normal Q-Q plots of the two samples; sample quantiles range roughly from 50 to 100.]
According to the Q-Q plots, the normality assumption is satisfied for both samples, since the data points follow a straight line.
5.7.14
Now,
Similarly,
Hence, both are unbiased estimators of the variance.
Again,
And