211MAT1302 Unit-4
LARGE SAMPLES
4.1 Sampling
Definition 4.1.1. A population consists of a collection of individual units,
which may be persons or experimental outcomes, whose characteristics are to
be studied.
Types of Sampling
Some of the commonly known and frequently used sampling methods are:
1. Purposive sampling
2. Random sampling
3. Simple sampling
4. Stratified sampling
Sample number    Statistic
                 $t$      $\bar{x}$      $s^2$
1                $t_1$    $\bar{x}_1$    $s_1^2$
2                $t_2$    $\bar{x}_2$    $s_2^2$
3                $t_3$    $\bar{x}_3$    $s_3^2$
...              ...      ...            ...
k                $t_k$    $\bar{x}_k$    $s_k^2$
The set of values of the statistic so obtained, one for each sample, is called
the sampling distribution of the statistic.
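The idea of a sampling distribution can be illustrated with a small simulation. This is a minimal sketch, not part of the original notes: the population (normal with mean 50 and S.D. 10), the sample size, the number of samples and the function name `sampling_distribution_of_mean` are all hypothetical choices made for illustration.

```python
import random
import statistics


def sampling_distribution_of_mean(population_mean, population_sd, n, k, seed=0):
    """Draw k samples of size n and return the k sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(k):
        sample = [rng.gauss(population_mean, population_sd) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


# Hypothetical population: normal with mean 50 and S.D. 10 (an assumption).
means = sampling_distribution_of_mean(50.0, 10.0, n=100, k=2000)

# The k values in `means` form the (simulated) sampling distribution of the mean.
print("mean of the sample means:", round(statistics.fmean(means), 3))
print("S.D. of the sample means:", round(statistics.stdev(means), 3))
```

The spread of the simulated sample means should come out close to σ/√n = 10/√100 = 1, which is the standard error of the mean discussed next.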
Standard Error
Note 4.1.6. For large samples, the S.E. of some well-known statistics are
given below:
Statistic                                                    S.E.
Sample mean ($\bar{x}$)                                      $\sigma/\sqrt{n}$
Sample S.D. ($s$)                                            $\sqrt{\sigma^2/(2n)}$
Sample variance ($s^2$)                                      $\sigma^2\sqrt{2/n}$
Sample proportion ($p$)                                      $\sqrt{PQ/n}$
Difference of two sample means ($\bar{x}_1 - \bar{x}_2$)     $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
Difference of two sample S.D.s ($s_1 - s_2$)                 $\sqrt{\sigma_1^2/(2n_1) + \sigma_2^2/(2n_2)}$
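The large-sample standard errors listed in Note 4.1.6 translate directly into small helper functions. The sketch below is illustrative only: the function names and the numerical inputs are hypothetical, and each function assumes the population parameters (σ, P) are known.

```python
import math


def se_mean(sigma, n):
    """S.E. of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def se_sd(sigma, n):
    """S.E. of the sample S.D.: sqrt(sigma^2 / (2n))."""
    return math.sqrt(sigma**2 / (2 * n))


def se_variance(sigma, n):
    """S.E. of the sample variance: sigma^2 * sqrt(2 / n)."""
    return sigma**2 * math.sqrt(2 / n)


def se_proportion(P, n):
    """S.E. of the sample proportion: sqrt(PQ / n) with Q = 1 - P."""
    return math.sqrt(P * (1 - P) / n)


def se_diff_means(sigma1, n1, sigma2, n2):
    """S.E. of the difference of two sample means."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)


def se_diff_sds(sigma1, n1, sigma2, n2):
    """S.E. of the difference of two sample S.D.s."""
    return math.sqrt(sigma1**2 / (2 * n1) + sigma2**2 / (2 * n2))


# Hypothetical inputs chosen only to exercise the formulas.
print(se_mean(10, 100))
print(se_proportion(0.5, 100))
print(se_diff_means(3, 9, 4, 16))
```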
Note 4.1.7. S.E. plays a very important role in large sample theory. If t is
any statistic, then for large samples
$$Z = \frac{t - E(t)}{SE(t)} \sim N(0, 1).$$
Test of Significance
Null Hypothesis
Definition 4.1.9. For applying the test of significance, we first set up a
hypothesis, that is, a definite statement about the population parameter. Such a
hypothesis, which is a hypothesis of no difference, is called the null hypothesis
and it is denoted by H0.
Alternative Hypothesis
Definition 4.1.10. Any hypothesis which is complementary to the null
hypothesis is called an alternative hypothesis and it is denoted by H1.
Example: If we want to test the null hypothesis that the population has a
specified mean (µ0, say), i.e., H0 : µ = µ0, then the alternative hypothesis
would be
(i) H1 : µ ≠ µ0 (ii) H1 : µ > µ0 (iii) H1 : µ < µ0
The alternative hypothesis (i) is known as a two tailed alternative, (ii) is
known as a right tailed alternative and (iii) is known as a left tailed
alternative.
Errors in Sampling
Definition 4.1.11. There are two types of errors in sampling:
i) Type I Error: Reject H0 when it is true.
ii) Type II Error: Accept H0 when it is false.
Critical Region and Level of significance (LOS)
Definition 4.1.12. A region (corresponding to a statistic t) in the sample
space S which amounts to rejection of H0 is termed the critical region or
region of rejection.
If w is the critical region and t is the value of the statistic based on a random
sample of size n, then $P(t \in w \mid H_0) = \alpha$ and $P(t \in \bar{w} \mid H_1) = \beta$, where $\bar{w}$,
the complementary set of w, is called the acceptance region.
The probability α that a random value of the statistic t belongs to the critical
region is known as the level of significance.
Note 4.1.15. When $Z = \frac{t - E(t)}{SE(t)} \sim N(0, 1)$, we have P(|Z| < 1.96) = 95%
and P(|Z| > 1.96) = 5%. Thus Z = ±1.96 separates the critical region and
the acceptance region at 5% LOS for a two tailed test.
Note 4.1.16. The critical value of Z for a single tailed test (right or left) at
LOS α is the same as that for a two tailed test at LOS 2α.
Note 4.1.17. The critical values for some standard LOSs are given below
for large samples.
Nature of test    1% (0.01)      2% (0.02)       5% (0.05)       10% (0.10)
Two tailed        |Zα| = 2.58    |Zα| = 2.33     |Zα| = 1.96     |Zα| = 1.645
Right tailed      Zα = 2.33      Zα = 2.055      Zα = 1.645      Zα = 1.28
Left tailed       Zα = −2.33     Zα = −2.055     Zα = −1.645     Zα = −1.28
For a single tailed test (right or left), we compare the computed value
of |Z| with 1.645 (at 5% LOS) or 2.33 (at 1% LOS) and accept or
reject H0 accordingly.
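The tabulated critical values can be reproduced from the standard normal distribution. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `critical_value` and its `tail` argument are hypothetical. The table entries are rounded, so the computed values agree only to about two decimal places.

```python
from statistics import NormalDist


def critical_value(alpha, tail="two"):
    """Critical value of Z at level of significance alpha.

    tail is "two", "right" or "left" (a naming convention assumed here).
    """
    std_normal = NormalDist()  # mean 0, S.D. 1
    if tail == "two":
        return std_normal.inv_cdf(1 - alpha / 2)  # value of |Z_alpha|
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right' or 'left'")


for alpha in (0.01, 0.02, 0.05, 0.10):
    print(
        alpha,
        round(critical_value(alpha, "two"), 3),
        round(critical_value(alpha, "right"), 3),
        round(critical_value(alpha, "left"), 3),
    )
```

The sketch also illustrates Note 4.1.16: `critical_value(0.05, "right")` and `critical_value(0.10, "two")` return the same number.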
Interval Estimation of Population parameter
We have $P(|Z| \le 1.96) = 0.95$
$\Rightarrow P\left(\left|\frac{t - E(t)}{SE(t)}\right| \le 1.96\right) = 0.95$
$\Rightarrow P\left(t - 1.96\,SE(t) \le E(t) \le t + 1.96\,SE(t)\right) = 0.95$
This means that, with 95% confidence, the parameter E(t) lies between
t − 1.96 SE(t) and t + 1.96 SE(t). Thus {t − 1.96 SE(t), t + 1.96 SE(t)}
are the 95% confidence limits for E(t).
Similarly, {t − 2.58 SE(t), t + 2.58 SE(t)} are the 99% confidence limits for E(t)
and {t − 2.33 SE(t), t + 2.33 SE(t)} are the 98% confidence limits for E(t).
Here, p = 63/640 = 0.0984, P = 17.26% = 0.1726, Q = 1 − P = 0.8274 and
n = 640.
Since |Z| > |Zα|, we reject the null hypothesis H0 and accept the
alternative hypothesis H1, i.e., the difference between p and P is significant.
That is, the hospital is efficient in bringing down the fatality rate of typhoid
patients at 1% LOS.
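The value of the test statistic for this problem is not shown in the text above. A hedged sketch of how it could be computed follows, assuming the usual large-sample statistic Z = (p − P)/√(PQ/n) implied by Note 4.1.7 and the S.E. of a sample proportion, and assuming a left tailed alternative H1 : p < P with the critical value −2.33 at 1% LOS. The function name is hypothetical.

```python
import math


def z_single_proportion(p, P, n):
    """Z = (p - P) / sqrt(PQ / n), with Q = 1 - P."""
    Q = 1 - P
    return (p - P) / math.sqrt(P * Q / n)


# Values quoted in the solution: 63 of 640 patients, P = 17.26%.
z = z_single_proportion(p=63 / 640, P=0.1726, n=640)

# Assumption: left tailed test at 1% LOS, critical value from the table.
critical = -2.33

print("Z =", round(z, 4))
print("reject H0" if z < critical else "accept H0")
```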
Problem 4.2.3. A random sample of 500 apples was taken from a large
consignment and 60 were found to be defective. Obtain the 98% confidence
limits for the percentage of bad apples in the consignment.
Solution: Given n = 500, p = proportion of bad apples in the sample
= 60/500 = 0.12, q = 1 − p = 0.88.
The 98% confidence limits for the population proportion are
$$\left(p - 2.33\sqrt{\frac{pq}{n}},\; p + 2.33\sqrt{\frac{pq}{n}}\right)$$
i.e.,
$$\left(0.12 - 2.33\sqrt{\frac{(0.12)(0.88)}{500}},\; 0.12 + 2.33\sqrt{\frac{(0.12)(0.88)}{500}}\right)$$
i.e., (0.0861, 0.1539).
∴ The 98% confidence limits for the percentage of bad apples in the consignment are
(8.61, 15.39).
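The arithmetic in Problem 4.2.3 can be checked with a few lines of code. This is only a sketch; the function name `proportion_confidence_limits` is hypothetical, and the multiplier 2.33 is the 98% value used in the solution.

```python
import math


def proportion_confidence_limits(p, n, z):
    """Return (p - z*sqrt(pq/n), p + z*sqrt(pq/n)) with q = 1 - p."""
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# Problem 4.2.3: 60 defective apples in a sample of 500, 98% limits (z = 2.33).
lower, upper = proportion_confidence_limits(p=60 / 500, n=500, z=2.33)
print(f"{100 * lower:.2f}% to {100 * upper:.2f}%")
```

Because the code carries full floating-point precision, the last printed digit may differ slightly from a hand computation that rounds intermediate values.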
Null Hypothesis, H0 : p = P
Alternative Hypothesis, H1 : p > P (one tailed (right) test)
Problem 4.2.5. A random sample of 400 men and 600 women were asked
whether they would like to have a flyover near their residence. 200 men
and 325 women were in favour of the proposal. Test the hypothesis that the
proportions of men and women in favour of the proposal are the same at the 5%
level.
where
$$P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{(400)(0.5) + (600)(0.5417)}{400 + 600} = 0.525$$
and Q = 1 − P = 1 − 0.525 = 0.475.
$$\therefore Z = \frac{0.5 - 0.5417}{\sqrt{(0.525)(0.475)\left(\frac{1}{400} + \frac{1}{600}\right)}} = -1.29$$
Since |Z| = 1.29 < 1.96, we accept the null hypothesis at 5% LOS.
i.e., There is no significant difference of opinion between men and women as
far as the proposal of flyover is concerned.
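The two-proportion computation in Problem 4.2.5 can be sketched as below. The function name `z_two_proportions` is hypothetical; the formula is the pooled-proportion statistic used in the solution.

```python
import math


def z_two_proportions(x1, n1, x2, n2):
    """Return (Z, P) for testing p1 = p2 with the pooled proportion P."""
    p1, p2 = x1 / n1, x2 / n2
    P = (n1 * p1 + n2 * p2) / (n1 + n2)  # pooled proportion
    Q = 1 - P
    z = (p1 - p2) / math.sqrt(P * Q * (1 / n1 + 1 / n2))
    return z, P


# Problem 4.2.5: 200 of 400 men and 325 of 600 women in favour.
z, P = z_two_proportions(200, 400, 325, 600)
print("P =", round(P, 3), " Z =", round(z, 2))
print("accept H0" if abs(z) < 1.96 else "reject H0")  # two tailed, 5% LOS
```

Because it uses the unrounded proportion 325/600, the last digit of Z may differ slightly from a hand computation with rounded inputs; the conclusion at 5% LOS is unaffected.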
Proportion of smokers in PG, p2 = 20% = 0.20
Null Hypothesis, H0 : p1 = p2
Alternative Hypothesis, H1 : p1 < p2 (one tailed)
Given the LOS is 1%, ∴|Zα | = 2.33 (one tailed)
where
$$P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{(1600)(0.155) + (900)(0.20)}{1600 + 900} = 0.1712$$
$$\therefore Z = \frac{0.155 - 0.20}{\sqrt{(0.1712)(0.8288)\left(\frac{1}{1600} + \frac{1}{900}\right)}} = -2.8671$$
Since |Z| = 2.8671 > 2.33 = |Zα |, we reject the null hypothesis at 1% LOS.
i.e., We accept the alternative hypothesis, H1 .
i.e., We conclude that, the proportion of smokers in UG is less than the
proportion of smokers in PG.
Type III: Test of significance for single mean
To test whether a given sample of size n and mean $\bar{x}$ has been drawn from a
population with mean µ, we set up the null hypothesis that there is no difference
between $\bar{x}$ and µ. The test statistic is
$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
where σ is the standard deviation of the population and n is the sample size.
Problem 4.2.7. A sample of 900 members has a mean of 3.4 cm and an s.d. of 2.61
cm. Is the sample drawn from a large population of mean 3.25 cm and s.d.
2.61 cm?
mean µ = 3.25 (or) $\bar{x} = \mu$
Alternative Hypothesis, H1 : $\bar{x} \ne \mu$ (Two tailed)
Let the LOS be 5%. ∴|Zα | = 1.96.
The test statistic
$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{3.4 - 3.25}{2.61/\sqrt{900}} = 1.7241$$
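The single-mean statistic for Problem 4.2.7 can be evaluated with a short sketch. The function name `z_single_mean` is hypothetical; the comparison uses the two tailed 5% critical value 1.96 chosen in the solution.

```python
import math


def z_single_mean(x_bar, mu, sigma, n):
    """Z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / (sigma / math.sqrt(n))


# Problem 4.2.7: n = 900, sample mean 3.4, population mean 3.25, s.d. 2.61.
z = z_single_mean(x_bar=3.4, mu=3.25, sigma=2.61, n=900)
print("Z =", round(z, 4))
print("accept H0" if abs(z) < 1.96 else "reject H0")  # two tailed, 5% LOS
```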
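Note: If the samples have been drawn from the same population, then
$\sigma_1^2 = \sigma_2^2 = \sigma^2$
$$\therefore Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}}} = \frac{\bar{x}_1 - \bar{x}_2}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
$$\sigma^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}$$
where $s_1^2$ and $s_2^2$ are the variances of sample 1 and sample 2 respectively.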
Problem 4.2.9. The means of two large samples of 1000 and 2000 members
are 67.5 inches and 68.0 inches respectively. Can the samples be regarded as
drawn from the same population of standard deviation 2.5 inches?
Null Hypothesis, H0 : The samples have been drawn from the same
population of S.D. 2.5 inches. (or) µ1 = µ2 and σ = 2.5 inches.
Alternative Hypothesis, H1 : µ1 ≠ µ2 (two tailed test)
Let the LOS be 5%. ∴|Zα | = 1.96.
The test statistic
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{67.5 - 68}{2.5\sqrt{\frac{1}{1000} + \frac{1}{2000}}} = -5.1640$$
Since |Z| = 5.1640 > 1.96, we reject the null hypothesis at 5% LOS.
i.e., The samples cannot be regarded as drawn from the same population of S.D. 2.5 inches.
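For two samples from a population with a known common S.D., the statistic in Problem 4.2.9 can be checked as follows. This is a minimal sketch and the function name `z_two_means_common_sd` is hypothetical.

```python
import math


def z_two_means_common_sd(x1_bar, n1, x2_bar, n2, sigma):
    """Z = (x1_bar - x2_bar) / (sigma * sqrt(1/n1 + 1/n2))."""
    return (x1_bar - x2_bar) / (sigma * math.sqrt(1 / n1 + 1 / n2))


# Problem 4.2.9: means 67.5 and 68.0 inches, n1 = 1000, n2 = 2000, sigma = 2.5.
z = z_two_means_common_sd(67.5, 1000, 68.0, 2000, sigma=2.5)
print("Z =", round(z, 4))
print("reject H0" if abs(z) > 1.96 else "accept H0")  # two tailed, 5% LOS
```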
Null Hypothesis, H0 : There is no significant difference between the mean
height of Englishmen and the mean height of Americans. i.e., µ1 = µ2 .
Alternative Hypothesis, H1 : µ1 < µ2 (one tailed test) Let the LOS be 1%.
∴|Zα | = 2.33 (one tailed test).
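The test statistic
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
where
$$\sigma^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2} = \frac{6400 \times 6.4^2 + 1600 \times 6.3^2}{6400 + 1600} = 40.706$$
$$\therefore Z = \frac{170 - 172}{\sqrt{40.706}\,\sqrt{\frac{1}{6400} + \frac{1}{1600}}} = -11.2152$$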
Since |Z| = 11.2152 > |Zα | = 2.33, we reject the null hypothesis at 1% LOS.
i.e., we accept the alternative hypothesis.
We conclude that, on an average, Americans are taller than Englishmen.
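When σ is computed from the two samples, the calculation has two steps: combine the sample variances, then form Z. The sketch below follows the numbers quoted in the solution (n1 = 6400, s1 = 6.4, mean 170 and n2 = 1600, s2 = 6.3, mean 172); the function names are hypothetical.

```python
import math


def combined_variance(n1, s1, n2, s2):
    """sigma^2 = (n1*s1^2 + n2*s2^2) / (n1 + n2)."""
    return (n1 * s1**2 + n2 * s2**2) / (n1 + n2)


def z_two_means(x1_bar, n1, x2_bar, n2, sigma2):
    """Z = (x1_bar - x2_bar) / (sqrt(sigma2) * sqrt(1/n1 + 1/n2))."""
    return (x1_bar - x2_bar) / (math.sqrt(sigma2) * math.sqrt(1 / n1 + 1 / n2))


sigma2 = combined_variance(6400, 6.4, 1600, 6.3)
z = z_two_means(170, 6400, 172, 1600, sigma2)

print("sigma^2 =", round(sigma2, 3), " Z =", round(z, 4))
print("reject H0" if abs(z) > 2.33 else "accept H0")  # one tailed, 1% LOS
```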