
“CHI SQUARE TEST”

PRESENTED TO:

“RESPECTED MAM FAIZA ZUBAIR”


PRESENTED BY:

TANZEELA SHOUKAT
NIDA AKBAR
HAJRA ARSHAD
AFIA KHALID
ANSA MANZOR
M.ILYAS KHAN
AMBREEN AKHTAR
CONTENTS:

• Important terms
• Introduction
• Characteristics of the test
• Chi square distribution
• Applications of chi square test
• Calculation of chi square test
• Condition for the application of the test
• Example
• Yates correction for continuity
• Limitations of the test
IMPORTANT TERMS

1. PARAMETRIC TEST
A parametric test is used in statistics when assumptions or inferences are
made about population parameters. It normally involves data expressed in
absolute numbers or values rather than ranks.

2. NON PARAMETRIC TEST


There may be situations where the assumptions and conditions of
parametric statistical procedures cannot be met. In such situations we
must apply non-parametric statistics.

The first meaning of non-parametric covers techniques that do not rely
on data belonging to any particular distribution. The statistics are
therefore not based on any assumed distribution of the population.

Basically, non-parametric statistics:

• can deal with small sample sizes
• are bound by fewer assumptions than parametric statistics
• are user-friendly compared with parametric statistics and
economical in time.
3. HYPOTHESIS
A hypothesis is an educated guess, a possible answer, or a predictive
statement that can be tested by scientific methods; that is, it is
scientifically testable and measurable.

This statement is based on our previous experience of the topic, on
existing knowledge, or on a review of the literature.

4. NULL HYPOTHESIS
Symbolized as H0, the null hypothesis is a statistical hypothesis that
states that there is no difference between a parameter and a specific
value, or that there is no difference between two parameters.

5. ALTERNATIVE HYPOTHESIS
Symbolized as H1, the alternative hypothesis is a statistical hypothesis
that states the existence of a difference between a parameter and a
specific value, or states that there is a difference between two parameters.

When the calculated value is more than the tabulated value, we reject
the null hypothesis and accept the alternative hypothesis.

6. DEGREE OF FREEDOM
It denotes the extent of independence (freedom) enjoyed by a given
set of observed frequencies. Suppose we are given a set of n observed
frequencies arranged in a table with r rows and c columns. Then

df = (r-1)(c-1)

where

r = the number of rows

c = the number of columns

7. CONTINGENCY TABLE
A table of data in which the row entries tabulate the data according to
one variable and the column entries tabulate it according to another
variable and which is used especially in the study of the correlation
between variables.

INTRODUCTION
• The chi square test is an important test amongst the several tests of
significance developed by statisticians.
• It was developed by Karl Pearson in 1900.
• The chi square test is a non-parametric test: it does not assume that
any variable follows a particular distribution.
• In general, the test we use to measure the differences between what
is observed and what is expected according to an assumed
hypothesis is called the CHI SQUARE TEST.
The chi-square test is usually done
for nominal and ordinal data
• Nominal data is data whose categories have no natural order, often
just two outcomes, like pass/fail, male/female, good/bad.
• Ordinal data is data whose categories have some order, for example
hygiene status ordered as very poor, poor, fair, good.

The chi-square test is done to compare the expected with the actual
(observed) values, so we need something to compare with our expectations.
To do the chi-square test we must
have
• Expectations or assumptions.
• The values of the expected and observed frequencies, i.e. the
number of times something happens.
• For example, if you upload a picture and get 20 comments, good
and bad, the number of times good comments occur is expressed
as a frequency.
• It is about how many times the outcome is good and how many
times it is bad, so basically the test looks at the difference
between two things: the observed frequency and the expected
frequency.
Types of Chi-square test
1: Test of Independence
2: Goodness of Fit
Test of Independence:

• Used when we have two variables and have to check the
association between them.

• It tells us the level of independence.

Goodness of fit:

• Used when we have only one variable and are comparing it
with the expected observations.

For example,
if you are expecting 71% in exams and you actually get 65%,
the comparison of the actual 65% with the expected 71% is
goodness of fit.
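
As a rough illustration of goodness of fit in terms of frequencies, the following Python sketch compares observed counts for one variable with expected counts. The hygiene-status counts are hypothetical and chosen only for illustration, the equal-expectation assumption is ours, and the names `observed`, `expected` and `chi_square` are not from the slides. It uses the standard statistic, the sum of (observed - expected)² / expected, with k - 1 degrees of freedom for k categories.

```python
# Hypothetical counts for 100 people by hygiene status (illustration only).
categories = ["very poor", "poor", "fair", "good"]
observed = [15, 20, 30, 35]

# Assumed expectation: all four categories are equally likely.
total = sum(observed)
expected = [total / len(categories)] * len(categories)

# Chi-square statistic: sum over categories of (O - E)^2 / E.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For goodness of fit with k categories, df = k - 1.
df = len(categories) - 1

critical_value = 7.815  # tabulated chi-square value for df = 3, alpha = 0.05

print(chi_square)                   # 10.0
print(df)                           # 3
print(chi_square > critical_value)  # True
```

With these made-up counts the calculated value (10.0) is more than the tabulated value (7.815), so the null hypothesis of equal frequencies would be rejected at the 5% level.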
Application of chi square test
Chi square test as a Test of Homogeneity
• In a certain community, a random sample of 50 men
and another of 50 women over 21 years of age were asked
about their educational background, classified as
junior high, senior high and college. The results are:

          Junior high   Senior high   College

Male          13            25          12

Female        23            20           7
• Test whether the two samples are homogeneous in
respect of educational background level
• Let α = 0.05 (a 95% confidence level)
H0: The two samples are homogeneous
H1: The two samples are not homogeneous
              Junior high   Senior high   College   Total

Male (ai)         13            25          12       50 (A)

Female            23            20           7       50 (B)

Total (ci)        36            45          19      100 (N)


Calculations:
α = 0.05

df = (no. of columns - 1) × (no. of rows - 1)
   = (3-1) × (2-1) = (2)(1)
   = 2
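
The slides stop at the degrees of freedom, so the remaining steps can only be sketched under the usual method: each expected frequency is (row total × column total) / N, and the statistic is the sum of (observed - expected)² / expected. The Python sketch below applies this to the table above. The variable names are assumptions made for illustration, and 5.991 is the standard tabulated value for df = 2 at α = 0.05.

```python
import math

# Observed frequencies from the table: junior high, senior high, college.
observed = [
    [13, 25, 12],  # male
    [23, 20, 7],   # female
]

row_totals = [sum(row) for row in observed]        # [50, 50]
col_totals = [sum(col) for col in zip(*observed)]  # [36, 45, 19]
n = sum(row_totals)                                # 100

# Expected frequency of each cell = row total * column total / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all cells of (O - E)^2 / E.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(col_totals) - 1)
critical_value = 5.991  # tabulated chi-square value for df = 2, alpha = 0.05

# For df = 2 only, the upper-tail probability has the closed form exp(-x / 2).
p_value = math.exp(-chi_square / 2)

print(round(chi_square, 3))         # 4.649
print(df)                           # 2
print(chi_square > critical_value)  # False
```

Under these assumptions the calculated value (about 4.649) is less than the tabulated value (5.991), so H0 is not rejected: the data do not show that the two samples differ in educational background at the 5% level.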
G4 AFIA KHALID
CONDITIONS FOR THE APPLICATION OF χ2 TEST
The following conditions should be satisfied before the χ2 test can be
applied:

THE DATA MUST BE IN THE FORM OF FREQUENCIES.

THE FREQUENCY DATA MUST HAVE A PRECISE NUMERICAL VALUE AND MUST BE
ORGANISED INTO CATEGORIES OR GROUPS.

OBSERVATIONS RECORDED AND USED ARE COLLECTED ON A RANDOM BASIS.

TWO CATEGORICAL VARIABLES (for example, a woman's marital status and
her education): the test checks how much they depend on each other.

TWO OR MORE GROUPS (categories) FOR EACH VARIABLE. For example, the
categories may be married and unmarried women.

With these variables we check the women's qualifications and how far
these depend on their marital status.
ALL THE ITEMS IN THE SAMPLE MUST BE INDEPENDENT.

• There is no relationship between the subjects in each group.
• The categorical variables are not paired in any way (e.g.
pre-test/post-test observations).

NOTE: Independent samples are samples that are selected randomly so
that their observations do not depend on the values of other
observations. Many statistical analyses are based on the assumption
that samples are independent. Others are designed to assess samples
that are not independent.
NO GROUP SHOULD CONTAIN VERY FEW ITEMS, say less than 10.

• In cases where the frequencies are less than 10, regrouping is done
by combining the frequencies of adjoining groups so that the new
frequencies become 10 or more.

(Some statisticians take this number as 5, but 10 is regarded as better
by most of them.)
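
The regrouping described above can be sketched in Python as follows. This is only one possible way to combine adjoining groups, not a procedure given in the slides: the function name `regroup`, the left-to-right merging order, the choice to fold a too-small final remainder into the last group, and the sample frequencies are all assumptions made for illustration.

```python
def regroup(freqs, minimum=10):
    """Combine adjoining groups until every group has at least `minimum` items."""
    merged = []
    running = 0
    for f in freqs:
        running += f
        # Close the current group once it is large enough.
        if running >= minimum:
            merged.append(running)
            running = 0
    # A remainder that is still too small is folded into the last group.
    if running:
        if merged:
            merged[-1] += running
        else:
            merged.append(running)
    return merged


# Hypothetical frequencies of six adjoining groups (illustration only).
print(regroup([3, 4, 12, 15, 6, 2]))             # [19, 23]
print(regroup([3, 4, 12, 15, 6, 2], minimum=5))  # [7, 12, 15, 8]
```

The total number of items is unchanged by regrouping, but the number of groups, and therefore the degrees of freedom, becomes smaller.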
THE OVERALL NUMBER OF ITEMS MUST ALSO BE REASONABLY LARGE. It
should normally be at least 50.
