
Hypothesis Testing

Sampling distribution (t, chi-square, z-test and ANOVA)


Testing of hypothesis
Hypothesis: A quantitative statement (assumption) about a population
parameter is called a hypothesis. Verifying the validity of such an
assumption on the basis of sample evidence is called testing of
hypothesis.
Types of hypothesis
Two complementary hypotheses are set up for a
single problem
• Null hypothesis: An assumption made about a population
parameter, set up to test its validity for possible
acceptance, is called the null hypothesis. It is denoted by H0
e.g. H0 : μ=μ0 (the population mean has some specified value μ0)

• Alternative hypothesis: A hypothesis set up against the null
hypothesis is called the alternative hypothesis. It is denoted by H1
e.g. H1 : μ≠μ0 OR
H1 : μ>μ0 OR
H1 : μ<μ0
One tailed and two tailed tests
• For the alternative hypothesis:
H1 : μ ≠ μ0 (μ<μ0 or μ>μ0): two tailed test
OR
H1 : μ < μ0: left tailed test (sample statistic <
population parameter, or first sample statistic <
second sample statistic)
OR
H1 : μ > μ0: right tailed test (sample statistic >
population parameter, or first sample statistic >
second sample statistic)
Figure of one tailed and two tailed test
Procedure of testing of hypothesis
• Step 1- set up the hypotheses
• Step 2- compute the appropriate test statistic (e.g. Z-test, t-test,
F-test etc.)
• Step 3- choose the level of significance α
• Step 4- find the critical value of the test statistic, i.e. find Zα from the table
• Step 5- make the conclusion: if the calculated value of the test statistic is
less than or equal to the tabulated value at the chosen level of significance,
the null hypothesis is accepted; otherwise the null hypothesis is rejected
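The decision rule in Step 5 can be sketched in Python as follows. This is only a minimal illustration, assuming the test statistic is a Z value compared against the standard normal table; the helper names `critical_z` and `decide` are hypothetical and not part of the original slides.

```python
from statistics import NormalDist


def critical_z(alpha, tail="two"):
    """Tabulated Z for level of significance alpha ('two', 'left' or 'right')."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        # alpha is split equally between the two tails
        return std_normal.inv_cdf(1 - alpha / 2)
    if tail in ("left", "right"):
        # the whole of alpha sits in one tail
        return std_normal.inv_cdf(1 - alpha)
    raise ValueError("tail must be 'two', 'left' or 'right'")


def decide(z_calc, alpha=0.05, tail="two"):
    """Compare the calculated Z with the tabulated Z and return the conclusion."""
    z_tab = critical_z(alpha, tail)
    if tail == "two":
        significant = abs(z_calc) > z_tab
    elif tail == "right":
        significant = z_calc > z_tab
    else:  # left tailed: only large negative values are significant
        significant = z_calc < -z_tab
    return "reject H0" if significant else "accept H0"


# Tabulated values at common levels of significance
for alpha in (0.01, 0.05, 0.10):
    two = round(critical_z(alpha, "two"), 3)
    one = round(critical_z(alpha, "right"), 3)
    print(f"alpha={alpha}: two tailed {two}, one tailed {one}")
```

For example, `decide(2.0)` compares |2.0| with the two tailed 5% value and leads to rejection of H0, while `decide(-2.0, tail="right")` does not, because a right tailed test only treats large positive values as significant.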
Test of significance of a single mean
• Step 1: Null hypothesis H0 : μ=μ0 (the population mean has
some specified value μ0). In other words, there is no
significant difference between the sample mean (X̅) and the
population mean (μ), or the sample has been drawn from the
given large population with mean μ.
• Step 2: Alternative hypothesis
H1 : μ≠μ0 (two tailed test). Population mean is not equal
to μ0. There is significant difference….
OR
H1 : μ>μ0 (right tailed test). Population mean is greater
than μ0.
OR
H1 : μ<μ0 (left tailed test). Population mean is less than
μ0.
Step 3: Test statistic
$$Z = \frac{\bar{X} - \mu}{\mathrm{S.E.}(\bar{X})} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$
Step 4: Select the level of significance α and obtain the
tabulated value of Z.
Step 5: Conclusion.
If the calculated value of |Z| ≤ tabulated value of Z, it is
not significant and H0 is accepted. In other words, there is no
significant difference between the sample mean and the
population mean. Otherwise, H0 is rejected.
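A short worked sketch of the large sample test of a single mean is given below. The sample values (n = 100, sample mean 52, s = 10, μ0 = 50) and the 5% level of significance are hypothetical numbers chosen only for illustration; they do not come from the slides.

```python
import math
from statistics import NormalDist

# Hypothetical sample (illustrative numbers only)
n = 100             # sample size (large sample, n > 30)
sample_mean = 52.0  # sample mean
s = 10.0            # sample standard deviation, used in place of sigma
mu0 = 50.0          # population mean specified under H0
alpha = 0.05        # level of significance (assumed)

# Test statistic: Z = (sample mean - mu0) / (s / sqrt(n))
standard_error = s / math.sqrt(n)
z_calc = (sample_mean - mu0) / standard_error

# Two tailed test: tabulated Z leaves alpha/2 in each tail
z_tab = NormalDist().inv_cdf(1 - alpha / 2)

conclusion = "accept H0" if abs(z_calc) <= z_tab else "reject H0"
print(f"Z = {z_calc:.2f}, tabulated Z = {z_tab:.2f}: {conclusion}")
```

With these numbers the calculated Z is 2.00, which exceeds the tabulated 1.96, so H0 would be rejected at the 5% level.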
Z-test Vs t-test
• The Z-test is applied for large samples (n > 30), whereas the t-test is
applied for small samples (n ≤ 30)
• The t distribution can be used even for large sample sizes, but
large sample theory cannot be used for small samples.
• The t distribution is flatter than the normal distribution (Z), i.e. it has heavier tails
Degree of freedom
• The number of values in a sample that we can choose freely is called
the degrees of freedom. It is denoted by d.f. or ν (nu)
e.g. for estimating the population mean, the number of degrees of
freedom = n-1
In other cases, d.f. = sample size – number of population parameters
which must be estimated from the sample observations.
Test of significance of a single mean (small sample, t-test)
• Step 1: Null hypothesis H0 : μ=μ0 (the population mean has
some specified value μ0). In other words, there is no
significant difference between the sample mean (X̅) and the
population mean (μ), or the sample has been drawn from the
given large population with mean μ.
• Step 2: Alternative hypothesis
H1 : μ≠μ0 (two tailed test). Population mean is not equal
to μ0. There is significant difference….
OR
H1 : μ>μ0 (right tailed test). Population mean is greater
than μ0.
OR
H1 : μ<μ0 (left tailed test). Population mean is less than
μ0.
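• Test statistic:
$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{\bar{X} - \mu}{s/\sqrt{n-1}}$$
where S = unbiased estimate of the population standard
deviation, with divisor n-1 (when raw data is given, we need to calculate S)
s = sample standard deviation, with divisor n
degrees of freedom = n-1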
• A sample of 26 bulbs gives a mean life of 990 hours
with a standard deviation of 20 hours. The
manufacturer claims that the mean life of bulbs is 1000
hours. Is the sample not up to the standard?

Think on it !!!
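One possible way to work the bulb problem is sketched below. It rests on several assumptions that the problem does not state: the quoted 20 hours is treated as the sample standard deviation s (divisor n), so that t = (X̅ − μ0) / (s / √(n−1)); the test is left tailed (H1 : μ < 1000); and the level of significance is 5%. The tabulated value 1.708 is the one tailed 5% point of the t table for 25 degrees of freedom.

```python
import math

n = 26               # sample size (small sample, so the t-test applies)
sample_mean = 990.0  # sample mean life in hours
s = 20.0             # sample standard deviation in hours (assumed divisor n)
mu0 = 1000.0         # mean life claimed by the manufacturer (H0 : mu = 1000)

# t = (sample mean - mu0) / (s / sqrt(n - 1)), with n - 1 degrees of freedom
t_calc = (sample_mean - mu0) / (s / math.sqrt(n - 1))
df = n - 1

# Tabulated t for a one tailed test at the 5% level with 25 d.f. (from the t table)
t_tab = 1.708

conclusion = "accept H0" if abs(t_calc) <= t_tab else "reject H0"
print(f"t = {t_calc:.2f}, d.f. = {df}, tabulated t = {t_tab}: {conclusion}")
```

Under these assumptions |t| = 2.5 exceeds 1.708, so H0 is rejected and the sample would be judged not up to the standard. If the 20 hours were instead the unbiased estimate S, the denominator would be S/√n and the statistic would change slightly.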
Chi-square test
• It is a non-parametric test because it depends only on the set of observed
and expected frequencies and the degrees of freedom.
• Good for nominal or ordinal scales of measurement, e.g. data dealing
with male and female, head and tail, juniors and seniors
• Used for analysis of qualitative variables such as a person's opinion or
smoking habits.
Application of χ²-test
• χ²-test for goodness of fit
• χ²-test for independence of attributes
• χ²-test for population variance
χ²-test for goodness of fit
Different test language
• Test of significant difference between observed and expected
frequencies
• Test of consistency of data
• Data support the theory or not
• Best fit of binomial and Poisson distributions
• χ²-test
Null hypothesis: H0 : There is no significant
difference between observed and expected
frequencies, i.e. the given data support the theory.

Alternative hypothesis: H1 : There is a significant
difference between observed and expected
frequencies, i.e. the given data do not support the
theory.
χ²-test can be used only when
• The total frequency N is large (N > 50)
• Sample observations are independent
• Total observed frequency = total expected frequency (ΣO = ΣE)
• Each theoretical (expected) frequency should ideally be larger than 10; in
practice, no expected frequency should be less than 5. If an expected
frequency is less than 5, it should be pooled with the succeeding or
preceding frequencies so that the resulting sum is 5 or more, and
the degrees of freedom adjusted accordingly.
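A minimal sketch of the goodness of fit test follows, using the standard statistic χ² = Σ (O − E)² / E with (number of classes − 1) degrees of freedom when no parameter is estimated from the data. The die-rolling frequencies are hypothetical, and the tabulated value 11.070 is the 5% point of the χ² table for 5 degrees of freedom.

```python
# Hypothetical data: a die rolled 120 times, observed count of each face
observed = [15, 25, 18, 22, 20, 20]

# H0: the die is fair, so every face has the same expected frequency
n_total = sum(observed)
expected = [n_total / len(observed)] * len(observed)

# Conditions from the slides: N > 50 and no expected frequency below 5
conditions_ok = n_total > 50 and all(e >= 5 for e in expected)

# Chi-square statistic: sum of (O - E)^2 / E over all classes
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Tabulated chi-square at the 5% level for 5 d.f. (from the chi-square table)
chi_tab = 11.070

conclusion = "accept H0" if chi_sq <= chi_tab else "reject H0"
print(f"chi-square = {chi_sq:.2f}, d.f. = {df}: {conclusion}")
```

Here the calculated value (about 2.9) is well below 11.070, so these hypothetical data would support the theory that the die is fair.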
χ²-test for independence of attributes
• Used to test whether two or more attributes are associated or not, i.e.
whether the attributes are related or independent.
e.g. To test whether there is any association between the height and
weight of a person
To test whether there is any association between marriage and
failure.
Hypothesis setting
• Null hypothesis: Two attributes (two categorical variables) A and B are
independent. There is no relationship between them.
• Alternative hypothesis: Two attributes (two categorical variables) A
and B are dependent. There is a relationship (association) between
them.
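The test of independence can be sketched as below for a 2 × 2 contingency table. The counts (gender against smoking habit) are hypothetical. The sketch uses the standard results that the expected frequency of a cell is (row total × column total) / N and that d.f. = (rows − 1)(columns − 1); no continuity correction is applied. The tabulated value 3.841 is the 5% point of the χ² table for 1 degree of freedom.

```python
# Hypothetical 2 x 2 table: rows = male, female; columns = smoker, non-smoker
observed = [
    [30, 20],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n_total = sum(row_totals)

# Expected frequency under H0 (independence): row total * column total / N
expected = [[r * c / n_total for c in col_totals] for r in row_totals]

# Chi-square statistic summed over every cell of the table
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Tabulated chi-square at the 5% level for 1 d.f. (from the chi-square table)
chi_tab = 3.841

conclusion = "accept H0" if chi_sq <= chi_tab else "reject H0"
print(f"chi-square = {chi_sq:.2f}, d.f. = {df}: {conclusion}")
```

With these counts the calculated χ² is 4.0, which exceeds 3.841, so the two attributes would be judged dependent at the 5% level.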
Analysis of Variance (ANOVA)
• ANOVA is a statistical technique specially designed to test whether
the means of more than two quantitative populations are equal.
• It tests whether all the samples have come from the same normal
population having the same mean.
• Samples are randomly selected and independent.
ANOVA-test
• ANOVA in one way classification, or one factor analysis of
variance (studies the influence of one factor on different sample
groups)
• ANOVA in two way classification, or two factor analysis of variance
(the effects of two variables are studied simultaneously)
One way ANOVA
• Null hypothesis
H0 : μ1 =μ2 =μ3 = ……… =μk . The k population means are
equal. In other words, there is no significant difference
between the k sample means.

• Alternative hypothesis
H1 : not all of μ1, μ2, μ3, ………, μk are equal, i.e. at least one
population mean differs from the others. In other words, there is
a significant difference between the k sample means.
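• Test statistic:
$$F = \frac{MSC}{MSE}$$
d.f. = (k-1, n-k), where k = number of samples and n = total number of observations
where MSC = mean sum of squares between samples
(columns) = $\frac{SSC}{k-1}$
MSE = mean sum of squares within samples (errors) = $\frac{SSE}{n-k}$
SSC = sum of squares between samples
SSE = sum of squares within samples
SST = total sum of squares
SST = SSC + SSE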
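A minimal sketch of the one way ANOVA computation is given below, assuming the usual definitions: SSC is the sum of squared deviations of the sample means from the grand mean, weighted by sample size; SSE is the sum of squared deviations of the observations from their own sample mean; and F = MSC / MSE with MSC = SSC / (k − 1) and MSE = SSE / (n − k). The three samples are hypothetical, and 3.89 is the tabulated 5% value of F for (2, 12) degrees of freedom.

```python
# Hypothetical data: k = 3 samples of 5 observations each
samples = [
    [3, 4, 5, 4, 4],
    [5, 6, 7, 6, 6],
    [7, 8, 9, 8, 8],
]

k = len(samples)                   # number of samples (columns)
n = sum(len(g) for g in samples)   # total number of observations
grand_mean = sum(sum(g) for g in samples) / n
sample_means = [sum(g) / len(g) for g in samples]

# Sum of squares between samples
ssc = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(samples, sample_means))
# Sum of squares within samples
sse = sum((x - m) ** 2 for g, m in zip(samples, sample_means) for x in g)
# Total sum of squares, computed directly to check SST = SSC + SSE
sst = sum((x - grand_mean) ** 2 for g in samples for x in g)

msc = ssc / (k - 1)
mse = sse / (n - k)
f_calc = msc / mse

# Tabulated F at the 5% level for (2, 12) d.f. (from the F table)
f_tab = 3.89

conclusion = "accept H0" if f_calc <= f_tab else "reject H0"
print(f"SSC = {ssc}, SSE = {sse}, SST = {sst}, F = {f_calc:.2f}: {conclusion}")
```

For these data SSC = 40, SSE = 6 and SST = 46, which confirms SST = SSC + SSE. The calculated F of 40 far exceeds 3.89, so the hypothesis of equal means would be rejected.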
Two-way ANOVA
• Null hypothesis
H0 : μ1 =μ2 =μ3 = ……… =μc . There is no significant difference in the c population
means due to the column factor.
H0 : μ1 =μ2 =μ3 = ……… =μr . There is no significant difference in the r population
means due to the row factor.

• Alternative hypothesis
H1 : not all of μ1, μ2, μ3, ………, μc are equal. There is a significant difference in
the c population means due to the column factor.
H1 : not all of μ1, μ2, μ3, ………, μr are equal. There is a significant difference in
the r population means due to the row factor.
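The two way case can be sketched as follows. The slides do not give the formulas, so this assumes the common layout with one observation per cell and no interaction term: SST is split into a row part (SSR), a column part (SSC) and a residual (SSE = SST − SSR − SSC) with (r − 1)(c − 1) degrees of freedom, and each factor is tested with its own F ratio against MSE. The table is hypothetical, and 6.94 is the tabulated 5% value of F for (2, 4) degrees of freedom.

```python
# Hypothetical 3 x 3 table: rows = levels of the row factor,
# columns = levels of the column factor, one observation per cell
table = [
    [5, 7, 9],
    [6, 9, 9],
    [7, 8, 12],
]

r = len(table)      # number of rows
c = len(table[0])   # number of columns
grand_mean = sum(sum(row) for row in table) / (r * c)
row_means = [sum(row) / c for row in table]
col_means = [sum(col) / r for col in zip(*table)]

# Total, row, column and residual (error) sums of squares
sst = sum((x - grand_mean) ** 2 for row in table for x in row)
ssr = c * sum((m - grand_mean) ** 2 for m in row_means)
ssc = r * sum((m - grand_mean) ** 2 for m in col_means)
sse = sst - ssr - ssc

df_error = (r - 1) * (c - 1)
mse = sse / df_error
f_rows = (ssr / (r - 1)) / mse   # d.f. = (r - 1, (r - 1)(c - 1))
f_cols = (ssc / (c - 1)) / mse   # d.f. = (c - 1, (r - 1)(c - 1))

# Tabulated F at the 5% level for (2, 4) d.f. (from the F table)
f_tab = 6.94

row_conclusion = "accept H0" if f_rows <= f_tab else "reject H0"
col_conclusion = "accept H0" if f_cols <= f_tab else "reject H0"
print(f"rows: F = {f_rows:.2f}, {row_conclusion}")
print(f"columns: F = {f_cols:.2f}, {col_conclusion}")
```

In this illustration the column factor gives F = 12 (significant) while the row factor gives F = 3 (not significant), showing how the two null hypotheses are tested separately.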
