
Probability & Statistics

Dr. Santosh Kumar Yadav


Assistant Professor

Department of Mathematics
Lovely Professional University, Punjab.

Dr. Santosh Yadav, LPU Punjab 1 / 45


UNIT-5
Testing of Hypothesis
Types of Error, Student's t-test for single mean and difference
of means, Z-test for single mean and difference of means,
F-test, goodness of fit, Chi-Square Test



Statistical Hypothesis?
A hypothesis is a statement or assertion about a population
that we want to test (or verify) on the basis of the information
available from a sample.

Null Hypothesis: A hypothesis which is tested for possible
rejection is called the null hypothesis, denoted by H0.
H0 is a hypothesis of no difference: the previous information or
observation that we already have is taken as H0.

Alternate hypothesis: A hypothesis which is contradictory
or complementary to the null hypothesis is called the alternate
hypothesis, i.e., anything other than the null hypothesis. It is
denoted by H1 or sometimes Ha.



Some Remarks

Whenever we test a null hypothesis (H0) against an alternate
hypothesis, the alternate hypothesis can be two-tailed or
one-tailed (left-tailed or right-tailed).

To decide whether H0 will be rejected or accepted, we divide
the sample space into two regions: the critical region and the
acceptance region.



Acceptance and Rejection region

The set of sample values for which H0 is rejected is called the
rejection region or critical region, denoted by W.

The set of sample values for which H0 is accepted is called the
acceptance region, denoted by W̄ (the complement of W).



Types of Errors

The decision to accept or reject H0 is made on the basis of the
information from the sample observations. This means the decision
(conclusion) based on the sample may not always be true for the
population.

Hence any testing problem involves two types of errors.



Type I Error

The error of rejecting H0 when it is true is called type I error.
Its probability is denoted by α.

$$
\begin{aligned}
\alpha &= \text{Probability of type I error} \\
       &= \text{Probability of rejecting } H_0 \text{ when it is true} \\
       &= P(x \in W \mid H_0 \text{ is true}) \\
       &= \int_{W} L_0 \, dx
\end{aligned}
$$

where L0 is the likelihood function of the sample observations under
H0.



Type II Error
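The error of accepting H0 (i.e., rejecting H1) when H0 is false is
called type II error. Its probability is denoted by β.

$$
\begin{aligned}
\beta &= \text{Probability of type II error} \\
      &= \text{Probability of accepting } H_0 \text{ when it is false} \\
      &= P(x \in \overline{W} \mid H_0 \text{ is false}) \\
      &= P(x \in \overline{W} \mid H_1 \text{ is true}) \\
      &= \int_{\overline{W}} L_1 \, dx
\end{aligned}
$$

where W̄ is the acceptance region (the complement of the critical
region W) and L1 is the likelihood function of the sample observations
under H1.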
Cont...

Level of Significance: The constant α is also called the level
of significance, i.e., the probability of rejecting H0 when it is
true.
Level of Confidence: C = (1 − α) × 100%. This means that if
LoS α = 0.05, then LoC is 95%.
Power of the Test: 1 − β, the probability of rejecting H0 when it
is false.



Problem 1
If x ≥ 1 is the critical region for testing H0 : θ = 2 against
the alternative H1 : θ = 1, on the basis of a single observation
from the population

$$ f(x, \theta) = \theta e^{-\theta x}, \quad x \ge 0, $$

find the sizes of the type I and type II errors.


α = P(x ∈ W | H0 is true) = P(x ≥ 1 | θ = 2)
β = P(x ∈ W̄ | H1 is true) = P(x < 1 | θ = 1)
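
The slide stops at the set-up of the two probabilities. A minimal sketch of how they could be evaluated is given here, assuming the exponential density stated in the problem; the helper name `exp_cdf` is illustrative and not part of the slides.

```python
import math

def exp_cdf(x, theta):
    """CDF of f(x, theta) = theta * exp(-theta * x) for x >= 0."""
    return 1.0 - math.exp(-theta * x)

# Critical region W: x >= 1
alpha = 1.0 - exp_cdf(1.0, theta=2.0)  # P(X >= 1 | theta = 2) = e^(-2)
beta = exp_cdf(1.0, theta=1.0)         # P(X < 1 | theta = 1) = 1 - e^(-1)

print(round(alpha, 4), round(beta, 4))
```

Under these assumptions the type I error is e^(−2) and the type II error is 1 − e^(−1).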



Problem 2
For the distribution

$$ f(x, \theta) = \frac{1}{\theta}, \quad 0 \le x \le \theta, $$

you are testing the null hypothesis H0 : θ = 1 against H1 : θ = 2
by means of a single observed value of x. Find the sizes of the
type I and type II errors if you choose the interval x ≥ 0.5 as
the critical region.
Ans: α = 0.5 and β = 0.25



Steps of Hypothesis Testing

1. Define the null and alternate hypotheses.
2. Test statistic: compute it from the formula valid under H0.
3. Level of significance (α): find the tabulated (critical) value.
4. Decision: reject or accept H0.



Z-test for single mean
Define H0 : µ = µ0 and H1 : µ ≠ µ0.

The test statistic under the null hypothesis,

$$ Z = \frac{\bar{x} - E(\bar{X})}{\text{S.E.}(\bar{X})} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}, $$

follows the standard normal distribution (with µ = µ0 under H0).

Level of significance: α, which fixes the tabulated z-value (1.96 for
a two-tailed test at α = 5%).
Decision: If calculated |z| < tabulated z-value at α LoS,
then accept the null hypothesis H0.
If calculated |z| > tabulated z-value at α LoS, then reject the null
hypothesis.
t-test for single mean
Define H0 : µ = µ0 and H1 : µ ≠ µ0.

The test statistic under the null hypothesis,

$$ t = \frac{\bar{x} - E(\bar{X})}{\text{S.E.}(\bar{X})} = \frac{\bar{x} - \mu}{S/\sqrt{n}} = \sqrt{n}\left(\frac{\bar{x} - \mu}{S}\right), $$

follows the t-distribution with (n − 1) degrees of freedom. Here
$S^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2$ is an unbiased estimate of
the population variance.

Level of significance: α, which fixes the tabulated value t_{α,(n−1)}.
Decision: If calculated |t| < tabulated t_{α,(n−1)}, then
accept H0.
If calculated |t| > tabulated t_{α,(n−1)}, then reject H0.
Example 1
A sample of 900 members has a mean of 3.4 cm and s.d. 2.61
cm. Test whether the sample is taken from a large population
with mean 3.25 cm and s.d. 2.61 cm. (α = 5%)

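H0 : µ = 3.25 cm, H1 : µ ≠ 3.25 cm
Test statistic under H0:

$$ Z = \frac{3.40 - 3.25}{2.61/\sqrt{900}} = \frac{0.15}{2.61/30} \approx 1.72 $$

Decision: Since calculated |Z| = 1.72 < 1.96 (the tabulated value at
5% LoS), accept the null hypothesis at 5% LoS. This means the sample
may be regarded as drawn from a population with mean 3.25 cm.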
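
The calculation in Example 1 can be checked with a short sketch. The function names `normal_cdf` and `z_test_single_mean` are illustrative, and the two-sided p-value is an extra that the slides do not use; it assumes the normal CDF expressed through `math.erf`.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def z_test_single_mean(xbar, mu0, sigma, n):
    """Return the Z statistic and the two-sided p-value."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    p_two_sided = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p_two_sided

# Example 1: n = 900, sample mean 3.40, population mean 3.25, s.d. 2.61
z, p = z_test_single_mean(3.40, 3.25, 2.61, 900)
print(round(z, 2), abs(z) < 1.96)  # |z| below 1.96 means H0 is accepted
```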



Problem 2
The heights of 10 males of a society are 70, 67, 62, 68, 61,
68, 70, 64, 64, 66 inches. Is it reasonable to believe that the
average height is greater than 64 inches? Test at 5% LoS (given that
for 9 degrees of freedom, t = 1.83).

H0 : µ = 64 inches
H1 : µ > 64 inches (right-tailed)
Test statistic under H0:

$$ t = \frac{\bar{x} - \mu}{S/\sqrt{n}} $$

For the given heights, X̄ = 660/10 = 66 and S² = 90/9 = 10, hence
t = (66 − 64)/√(10/10) = 2.
From the table, the t-value for 9 d.f. at 5% LoS (right-tailed
alternative) is 1.83.
Decision: Since calculated t = 2 > 1.83, reject H0, i.e., accept H1.
Hence we can say that the average height of males is greater than
64 inches.
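
As a check on the arithmetic, the same statistic can be computed directly from the raw heights. This is a sketch only; the variable names are illustrative.

```python
import math

heights = [70, 67, 62, 68, 61, 68, 70, 64, 64, 66]
mu0 = 64          # hypothesised mean under H0
t_table = 1.83    # tabulated t for 9 d.f., 5% LoS, right-tailed (from the slide)

n = len(heights)
xbar = sum(heights) / n
# Unbiased estimate of the population variance (divisor n - 1)
s2 = sum((h - xbar) ** 2 for h in heights) / (n - 1)
t = (xbar - mu0) / math.sqrt(s2 / n)

print(xbar, s2, t, t > t_table)
```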



Problem 3

A normal population has a mean of 0.1 and s.d. of 2.1. Find
the probability that the mean of a sample of size 900 will be
negative.
Ans: P(x̄ < 0) = P(Z < −1.43) = 0.0764
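
The answer follows from standardising the sample mean with S.E. = σ/√n = 2.1/30 = 0.07. A small sketch, assuming the normal CDF written via `math.erf` (the helper name is illustrative):

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n = 0.1, 2.1, 900
z = (0.0 - mu) / (sigma / math.sqrt(n))  # standardised value of x-bar = 0
p = normal_cdf(z)                        # P(x-bar < 0)

# Without rounding z, p is slightly above the table-based 0.0764
print(round(z, 2), round(p, 3))
```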



Problem 4

A random sample of size 16 has 53 as its mean. The sum of
squares of the deviations taken from the mean is 135. Can this
sample be regarded as taken from a population having 56 as
its mean? (Given that LoS = 5% and t_{15, 0.05} = 2.131)
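
The slides leave this problem unsolved. One possible working, assuming a two-tailed test of H0 : µ = 56 with the single-mean t statistic defined earlier, is sketched here; the variable names are illustrative.

```python
import math

n = 16
xbar = 53.0
sum_sq_dev = 135.0   # sum of squared deviations from the sample mean
mu0 = 56.0
t_table = 2.131      # tabulated t for 15 d.f. at 5% LoS (from the problem)

s2 = sum_sq_dev / (n - 1)             # unbiased variance estimate: 135 / 15 = 9
t = (xbar - mu0) / math.sqrt(s2 / n)  # (53 - 56) / (3 / 4)

reject_h0 = abs(t) > t_table
print(t, reject_h0)
```

Under these assumptions |t| exceeds the tabulated value, so H0 would be rejected.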



Problem 5

A machinist is making engine parts with axle diameters of
0.700 inch. A random sample of 10 parts shows a mean
diameter of 0.742 inch with a s.d. of 0.040 inch. Compute the
statistic you would use to test whether the work is meeting
the specifications. (Given that LoS = 5% and t_{9, 0.05} = 2.26)



Z-test for difference of means

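H0 : µ1 = µ2
Test statistic under the null hypothesis:

$$ Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} $$

follows the standard normal distribution.

Decision: Accept or reject H0 according to whether the difference is
significant or not.
Note: If both populations have the same s.d. σ, the denominator
becomes σ√(1/n1 + 1/n2). If only the sample s.d.'s s1 and s2 are given
(large samples), they replace σ1 and σ2 in the formula.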



t-Test for difference of means

Test statistic is

$$ t = \frac{\bar{x} - \bar{y}}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} $$

where x̄, ȳ are the sample means and

$$ S^2 = \frac{1}{n_1 + n_2 - 2}\left[\sum_i (x_i - \bar{x})^2 + \sum_j (y_j - \bar{y})^2\right] $$

is an unbiased estimate of the common population variance σ². The
statistic t follows Student's t-distribution with (n1 + n2 − 2)
degrees of freedom.

Problem 1
The means of two large samples of 1000 and 2000
members are 67.5 inches and 68.0 inches respectively. Can
the samples be regarded as drawn from the same population
of s.d. 2.5 inches? (α = 5%).
H0 : µ1 = µ2 and s.d. = 2.5
H1 : µ1 ≠ µ2 and s.d. = 2.5
Test statistic under the null hypothesis:

$$ Z = \frac{67.5 - 68.0}{2.5\sqrt{\dfrac{1}{1000} + \dfrac{1}{2000}}} \approx -5.16, \qquad |Z| \approx 5.16 $$

Decision: Since calculated |Z| > 1.96, reject the null hypothesis.



Example 2
The average hourly wage of a sample of 150 workers in
plant A was Rs. 2.56 with a s.d. of Rs. 1.08. The average
hourly wage of a sample of 200 workers in plant B was Rs.
2.87 with a s.d. of Rs. 1.28. Can an applicant safely assume
that the hourly wages paid by plant B are higher than those
paid by plant A? (α = 5%).

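Let x1 and x2 be the hourly wages in plants A and B. Then n1 = 150,
n2 = 200, x̄1 = 2.56, x̄2 = 2.87, S1 = 1.08 and S2 = 1.28.
H0 : µ1 = µ2
H1 : µ1 < µ2 (left-tailed)
Test statistic under the null hypothesis:

$$ Z = \frac{2.56 - 2.87}{\sqrt{\dfrac{1.08^2}{150} + \dfrac{1.28^2}{200}}} \approx -2.45 $$

Decision: Since Z = −2.45 < −1.645 (the tabulated value for a
left-tailed test at 5% LoS), reject H0 and accept H1.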
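
Both large-sample problems above use the same statistic, so they can be checked with one small function. This is a sketch; `two_sample_z` is an illustrative name, and the sample s.d.'s are used in place of σ1 and σ2 as the note on the Z-test slide allows for large samples.

```python
import math

def two_sample_z(mean1, mean2, sd1, sd2, n1, n2):
    """Z statistic for the difference of two means (large samples)."""
    return (mean1 - mean2) / math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

# Problem 1: common s.d. 2.5, two-tailed test (critical value 1.96)
z_height = two_sample_z(67.5, 68.0, 2.5, 2.5, 1000, 2000)

# Example 2: plant A vs plant B, left-tailed test (critical value -1.645)
z_wage = two_sample_z(2.56, 2.87, 1.08, 1.28, 150, 200)

print(round(z_height, 2), abs(z_height) > 1.96)
print(round(z_wage, 2), z_wage < -1.645)
```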



Example 3
Samples of two types of electric bulbs were tested for
length of life in hours, and the following data were obtained:
(n1, x̄1, s1) = (8, 1234, 36) for type I and
(n2, x̄2, s2) = (7, 1036, 40) for type II.
Is the difference in the means sufficient to
warrant that type I is superior to type II regarding the length
of life? (The tabulated value of t at 5% LoS and 13 d.f. for a
single-tailed test is 1.77.)



Example 4
The heights of six randomly chosen sailors in inches are 63,
65, 68, 69, 71 and 72. Those of 10 randomly selected soldiers
are 61, 62, 65, 66, 69, 69, 70, 71, 72 and 73. Test
whether the given data suggest that sailors are on average
taller than soldiers. (The tabulated t-value for 14 d.f. at 5%
LoS for a single-tailed test is 1.76.)
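
The slides give no solution for Example 4. A possible working, assuming the pooled-variance t statistic from the "t-Test for difference of means" slide and the right-tailed alternative H1 : µ1 > µ2, is sketched here; the function name `pooled_t` is illustrative.

```python
import math

def pooled_t(x, y):
    """Two-sample t statistic with pooled variance, plus degrees of freedom."""
    n1, n2 = len(x), len(y)
    xbar, ybar = sum(x) / n1, sum(y) / n2
    ss_x = sum((v - xbar) ** 2 for v in x)
    ss_y = sum((v - ybar) ** 2 for v in y)
    s2 = (ss_x + ss_y) / (n1 + n2 - 2)   # unbiased estimate of common variance
    t = (xbar - ybar) / math.sqrt(s2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

sailors = [63, 65, 68, 69, 71, 72]
soldiers = [61, 62, 65, 66, 69, 69, 70, 71, 72, 73]

t, df = pooled_t(sailors, soldiers)
print(round(t, 3), df, t > 1.76)  # t > 1.76 would mean rejecting H0
```

Under these assumptions the calculated t is far below 1.76, so the data would not suggest that sailors are taller on average.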



F-Test

This test is used to make a decision about the equality of two
population variances.
OR
To test whether two estimates of the population variance are
homogeneous or not (i.e., whether their difference is significant
or not).

$$ H_0 : \sigma_x^2 = \sigma_y^2 = \sigma^2 $$



Cont...

Let X and Y be samples of sizes n1 and n2 drawn from
normal populations with means µ1 and µ2 and variances σx²
and σy².

Test statistic under H0 : σx² = σy² = σ²:

$$ F = \frac{S_x^2}{S_y^2} = \frac{\text{larger estimate of variance}}{\text{smaller estimate of variance}} $$

follows the F-distribution with (n1 − 1, n2 − 1) degrees of freedom,
where Sx² and Sy² are unbiased estimates of the common
population variance σ² obtained from the two independent
samples.



Cont...

This test is also known as the "variance ratio test".

The greater of the two variances is taken in the numerator, and
n1 corresponds to the greater variance.
LoS: find the tabulated value F_{α,(n1−1, n2−1)}.

Decision: The null hypothesis is accepted or rejected
depending on whether the difference is significant or not, i.e.,
H0 is rejected if calculated F > tabulated F.



Problem 1

In one sample of 8 observations, the sum of the squares
of deviations of the sample values from the sample mean
was 84.4, and in another sample of 10 observations it was
102.6. Test whether this difference is significant at 5%
LoS, given that F_{0.05,(7,9)} = 3.29.
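
The slides leave this problem unsolved. A sketch of the variance ratio computed from the sums of squares, with the larger estimate placed in the numerator as the F-test slides require, might look as follows; `variance_ratio` is an illustrative name.

```python
def variance_ratio(ss1, n1, ss2, n2):
    """Return (F, df_num, df_den) with the larger variance estimate on top."""
    v1 = ss1 / (n1 - 1)   # unbiased variance estimate from sample 1
    v2 = ss2 / (n2 - 1)   # unbiased variance estimate from sample 2
    if v1 >= v2:
        return v1 / v2, n1 - 1, n2 - 1
    return v2 / v1, n2 - 1, n1 - 1

F, df_num, df_den = variance_ratio(84.4, 8, 102.6, 10)
print(round(F, 2), (df_num, df_den), F > 3.29)  # F > 3.29 would reject H0
```

Here the estimates are 84.4/7 and 102.6/9, so the degrees of freedom are (7, 9), matching the tabulated value given in the problem, and the calculated F stays below it.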



Problem 2

Two random samples are drawn from two normal
populations and their values are

A: 16, 17, 25, 26, 32, 34, 38, 40, 42
B: 14, 16, 24, 28, 32, 35, 37, 42, 43, 45, 47.

Test whether the two populations have the same
variance at LoS α = 5%.



Cont...

Given that n1 = 9, n2 = 11 and α = 5%.

H0 : σx² = σy² = σ² (i.e., the estimates of σ² given by the samples
are homogeneous)

Test statistic under H0:

$$ F = \frac{\text{larger estimate of variance}}{\text{smaller estimate of variance}} $$



Cont...
 X    X − X̄   (X − X̄)²      Y    Y − Ȳ   (Y − Ȳ)²
16     -14      196        14     -19      361
17     -13      169        16     -17      289
25      -5       25        24      -9       81
26      -4       16        28      -5       25
32       2        4        32      -1        1
34       4       16        35       2        4
38       8       64        37       4       16
40      10      100        42       9       81
42      12      144        43      10      100
 -       -        -        45      12      144
 -       -        -        47      14      196

Totals: ΣX = 270, Σ(X − X̄)² = 734, ΣY = 363, Σ(Y − Ȳ)² = 1298,
so X̄ = 270/9 = 30 and Ȳ = 363/11 = 33.



Cont...
$$ S_X^2 = \frac{734}{9 - 1} = 91.75 $$

$$ S_Y^2 = \frac{1298}{11 - 1} = 129.8 $$

Taking the larger estimate in the numerator,

$$ F = \frac{S_Y^2}{S_X^2} = \frac{129.8}{91.75} = 1.41 $$

with (10, 8) degrees of freedom.

Tabulated F-value with (10, 8) d.f. at 5% LoS is
F_{(10,8), 0.05} = 3.35.
Decision: Calculated F-value (1.41) < tabulated F-value (3.35). This
means that the difference is not significant and we can accept
H0. Thus, the samples may be regarded as drawn from populations
having the same variance σ².
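
The sums of squares and the variance ratio can be recomputed from the raw data of samples A and B. This is a sketch for checking the table; the helper name `sum_sq_dev` is illustrative.

```python
def sum_sq_dev(data):
    """Sum of squared deviations from the sample mean."""
    mean = sum(data) / len(data)
    return sum((v - mean) ** 2 for v in data)

A = [16, 17, 25, 26, 32, 34, 38, 40, 42]
B = [14, 16, 24, 28, 32, 35, 37, 42, 43, 45, 47]

ss_a, ss_b = sum_sq_dev(A), sum_sq_dev(B)
var_a = ss_a / (len(A) - 1)   # unbiased variance estimate for A
var_b = ss_b / (len(B) - 1)   # unbiased variance estimate for B

# Larger estimate in the numerator; here that is sample B
F = max(var_a, var_b) / min(var_a, var_b)
print(ss_a, ss_b, var_a, var_b, round(F, 2))
```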



Chi-Square Test

This test is used when no information about the parent
population is given and some categorical data is given.


That is, it is a non-parametric test, meaning that it is free
from parameters and distributional assumptions, and we apply the
hypothesis directly to the given data.

It is used for the test of goodness of fit, the test of independence
of attributes, the test of homogeneity, etc.

It is a test to judge whether the observed data fit (match)
the expected data, i.e., testing the significance of the
discrepancy between theory and experiment.



Chi-Square Test of Goodness of Fit

The test statistic (Karl Pearson's chi-square):

$$ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}, $$

where f_o denotes the observed frequency (i.e., the given sample
data) and f_e denotes the expected frequency (the theoretical
data). The expected frequencies are either given in the problem or
calculated using a suitable probability distribution, depending on
the nature of the problem.
Degrees of freedom = n − 1 (number of frequencies given minus 1).



Cont..

Decision: If calculated χ² < tabulated χ²_{α,(n−1)}, then
accept H0.
If calculated χ² > tabulated χ²_{α,(n−1)}, then reject H0.



Problem 1
The demand for a particular spare part in a factory was found
to vary from day to day. In a sample study the following
information was obtained:

Day     No. of parts demanded
Mon     1124
Tue     1125
Wed     1110
Thu     1120
Fri     1126
Sat     1115
Total   6720

Test the hypothesis that the number of parts demanded does not
depend on the day of the week. (Given: the χ² values at 5, 6 and 7
d.f. are 11.07, 12.59 and 14.07 respectively at 5% LoS.)

Cont...
H0 : demand does not depend on the day of the week.
Under H0, the expected frequency of spare parts demanded on each of
the six days would be

$$ e_i = \frac{\sum x_i}{n} = \frac{6720}{6} = 1120 $$

Calculated χ² = 202/1120 ≈ 0.180.
Tabulated χ²_{0.05} = 11.07 for 5 degrees of freedom.
Decision: Since calculated χ² < tabulated χ², accept H0: the number
of parts demanded may be regarded as the same over the six
days of the week.
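
The calculated χ² can be verified with a few lines. This sketch assumes equal expected frequencies under H0, as in the solution above; `chi_square_uniform` is an illustrative name.

```python
def chi_square_uniform(observed):
    """Pearson chi-square against equal expected frequencies."""
    expected = sum(observed) / len(observed)
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    return chi2, len(observed) - 1   # statistic and degrees of freedom

demand = [1124, 1125, 1110, 1120, 1126, 1115]   # Mon to Sat
chi2, df = chi_square_uniform(demand)
print(round(chi2, 3), df, chi2 < 11.07)  # below 11.07 means H0 is accepted
```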



Problem 2
The distribution of digits in numbers chosen from a telephone
directory is:

Digit   Frequency
0       1026
1       1107
2        997
3        966
4       1075
5        933
6       1107
7        972
8        964
9        853

Test whether the digits may be taken to occur equally
frequently in the directory.
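
No solution is given on the slides. A possible working, assuming the goodness-of-fit statistic defined earlier with equal expected frequencies, is sketched below. The critical value 16.92 for 9 d.f. at 5% LoS is taken from standard χ² tables, not from the slides.

```python
observed = [1026, 1107, 997, 966, 1075, 933, 1107, 972, 964, 853]

total = sum(observed)             # 10000 digits in all
expected = total / len(observed)  # 1000 per digit under H0
chi2 = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1

chi2_table = 16.92   # standard table value for 9 d.f. at 5% LoS (assumption)
reject_h0 = chi2 > chi2_table
print(round(chi2, 3), df, reject_h0)
```

Under these assumptions the calculated χ² is far above the tabulated value, so the hypothesis that the digits occur equally frequently would be rejected.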
THANK YOU
