Chi Square Test


Inferential Statistics

Details of t-test & chi-square test

1
INTRODUCTION
• T-TEST
- Numerical data
- Comparing the means of two groups

• CHI-SQUARE
- Categorical data

• BOTH TESTS ARE USED FOR UNPAIRED OBSERVATIONS
2
T-TEST
Also referred to as Student's t-test.
Suppose we have taken a sample of 100 women and found a mean Hb of 11.4 gm. The question is whether this sample comes from a population whose mean Hb is 12 gm and whose standard deviation is 2 gm.
• What should we be looking at?
• Is there a difference between the population mean and the sample mean?
• In this case there is a difference.
• The next question is: is this difference significant, or is it due to chance alone?
3
• Of course, the difference should be large enough to be called significant.
• The difference can be written as
Sample mean – Population mean

• What other factor can affect the importance of the difference?
• If there is great variation (a large standard deviation with a small sample size), the difference would not be of much importance.
• If there is small variation (a small standard deviation with a large sample size), even a slight difference can become important.

4
So we should include this factor as well when deciding on the importance of the difference.
• The formula becomes
t = (Sample mean – Population mean) / (Standard deviation / √N)

This is given the name of the t-test.

5
Hence the t-test statistic is

t = (Sample mean – Population mean) / (Standard deviation / √N)

Putting the values in the formula:
t = (11.4 - 12) / (2 / √100) = -0.6 / (2 / 10)
Calculated value of t = -0.6 / 0.2 = -3.0
Critical value (cut-off) = ±2.00
What should be the conclusion: reject H0 or not?
As the calculated value (3.0, ignoring the sign) is larger than the critical value (2.00), we reject the chance explanation and conclude that the sample is not from a population whose mean Hb level is 12 gm.
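The Hb calculation can be checked with a few lines of Python. This is only a minimal sketch: the function name `one_sample_t` is an assumption made for illustration, and the cut-off of 2.00 is simply the value quoted on the slide. Strictly speaking, a statistic computed with a known population standard deviation is usually called a z statistic; with a sample of 100 the two cut-offs are nearly the same.

```python
import math

def one_sample_t(sample_mean, population_mean, sd, n):
    """Return (sample mean - population mean) / (sd / sqrt(n))."""
    standard_error = sd / math.sqrt(n)
    return (sample_mean - population_mean) / standard_error

# Values from the Hb example: 100 women, mean 11.4, population mean 12, SD 2
t = one_sample_t(11.4, 12.0, 2.0, 100)
critical = 2.00  # cut-off quoted on the slide

print(round(t, 2))
# Compare the size of t (sign ignored) with the cut-off
print("reject H0" if abs(t) > critical else "fail to reject H0")
```

The sign of t only shows the direction of the difference; the decision uses its absolute value.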
6
Example 2:
In a certain province the proportion of women who are delivered by caesarean section is very high. A study is therefore conducted to discover why this is the case. As small height is known to be one of the risk factors related to difficult deliveries, the researcher may want to find out whether there is a difference between the mean height of women in this province who had normal deliveries and that of women who had caesarean sections.

1. Calculating the t-value
2. Using a t-table
3. Interpreting the result
7
USING THE T-TABLE
To determine whether the null hypothesis is rejected or not:
• Significance level (p-value): 0.05
• Degrees of freedom
- a measure derived from the sample size
- for Student's t-test the number of degrees of freedom is calculated as the sum of the two sample sizes minus 2
d.f. = 60 + 52 - 2 = 110
• The t-value belonging to the chosen p-value and the degrees of freedom is located in the table
- reading the nearest tabulated row (d.f. = 120) at p = 0.05, we find a t-value of 1.98

8
INTERPRETING THE RESULT
• Decision rule
• If the calculated t-value is larger than the value derived from the table, p is smaller than the value indicated at the top of the column. We then reject the null hypothesis and conclude that there is a statistically significant difference between the two means.
• If the calculated t-value is smaller than the table value, p is larger than the value indicated. We fail to reject the null hypothesis and conclude that the observed difference is not statistically significant.

9
INTERPRETING THE RESULT
• The t-value calculated in step 1 (3.6) is larger than the t-value derived in step 2 (1.98).
• p is smaller than 0.05, so we reject the null hypothesis.
• The observed difference of 2 cm between the mean heights of women with normal deliveries and women with caesarean sections is statistically significant.
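The table lookup and decision rule for the height example can be sketched as follows. The function names `degrees_of_freedom` and `is_significant` are hypothetical helpers introduced here; the numbers (60, 52, 1.98 and 3.6) are the ones quoted on the slides.

```python
def degrees_of_freedom(n1, n2):
    """Degrees of freedom for an unpaired two-sample t-test: n1 + n2 - 2."""
    return n1 + n2 - 2

def is_significant(t_calculated, t_table):
    """Decision rule: significant when |t| exceeds the table value."""
    return abs(t_calculated) > t_table

df = degrees_of_freedom(60, 52)  # sample sizes from the height example
t_table = 1.98                   # table value quoted for p = 0.05
t_calculated = 3.6               # value quoted for step 1

print(df)
print(is_significant(t_calculated, t_table))
```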

10
CHI-SQUARE (χ²) TEST
Suppose you are presented with two vaccines, A and B, for the prevention of measles. Both manufacturers claim that their vaccine is better. How would you resolve this issue?
To judge the vaccines objectively, you would ask both manufacturers to provide some vaccine for a clinical trial. 100 randomly selected children were given vaccine A and the same number of children were given vaccine B. Of those vaccinated with A, fifteen (15) developed infection; 20 of the children vaccinated with B developed infection.
Which vaccine is better?

11
If we say vaccine A, the producer of vaccine B may claim that this difference is by chance.
However, the producer of vaccine A will call this difference significant.
The dispute will not be resolved, even if the difference is large, until a cut-off value is decided by some neutral party, and that neutral umpire is the test of significance.

12
By putting in the information from the data, the following 2x2 table can be filled:

Type of vaccine   Infected   Not infected   Total
A                 15 (a)     85 (b)         100
B                 20 (c)     80 (d)         100
Total             35         165            200

16
• According to the null hypothesis
If both vaccines had the same effect, then we should expect the same number of infections in both groups. Accordingly, the following table shows what we should expect if the null hypothesis is true.

17
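If both vaccines are the same, then half of the infections will be in the vaccine A group and half in the vaccine B group, and the table will look like this:

Type of vaccine   Infected   Not infected   Total
A                 17.5       82.5           100
B                 17.5       82.5           100
Total             35         165            200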

18
Alternatively, we can determine the expected value in cell a by the unitary method, i.e.
Out of 200 children the number of infections is 35
Out of 1 child the number of infections will be 35/200
Out of 100 children the number of infections will be
(35/200) x 100 = 17.5
19
If you look carefully, you will observe that in the first step we divided the column total of cell a by the grand total, and then multiplied the result by the row total of cell a. So this two-step procedure for the expected value can be written as a single step:
Expected value (EV) of cell a = (Column total x Row total) / Grand total
Thus we can use this formula to calculate the expected value of any cell.
Now we fill the table with all expected values.
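As an illustration, the expected-value formula can be applied to every cell at once. The sketch below assumes the observed table is held as a list of rows; `expected_counts` is a name chosen here for convenience and does not come from the slides.

```python
def expected_counts(observed):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Rows: vaccine A, vaccine B; columns: infected, not infected
observed = [[15, 85],
            [20, 80]]
expected = expected_counts(observed)
print(expected)
```

Because both vaccine groups have 100 children, the two rows of expected values come out identical.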
20
According to the null hypothesis we should not find any difference between the observed values in cells a, b, c and d and their expected values. Here we find some difference. This difference may be small (insignificant) or large (significant). Hence the next logical step is to subtract the expected value from the observed value to find the difference. We would get a table like this:
21
Cell    Observed (O)   Expected (E)   O – E   (O – E)²   (O – E)² / E
a       15             17.5           -2.5    6.25       6.25 / 17.5 = 0.357
b       85             82.5            2.5    6.25       6.25 / 82.5 = 0.076
c       20             17.5            2.5    6.25       6.25 / 17.5 = 0.357
d       80             82.5           -2.5    6.25       6.25 / 82.5 = 0.076
Total                                                    0.866

Prof. Dr. M. I. Siddiqui – Clinision Workshop
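The cell-by-cell working for the vaccine trial can be reproduced with a short loop. This is a sketch for checking the arithmetic, with the observed and expected values typed in by hand from the table. Keeping more decimal places gives a total of about 0.866, which does not change the conclusion.

```python
# (observed, expected) for each cell of the vaccine table
cells = {
    "a": (15, 17.5),
    "b": (85, 82.5),
    "c": (20, 17.5),
    "d": (80, 82.5),
}

chi_square = 0.0
for name, (o, e) in cells.items():
    contribution = (o - e) ** 2 / e  # (O - E)^2 / E for this cell
    chi_square += contribution
    print(name, o, e, o - e, round(contribution, 3))

print("Total =", round(chi_square, 3))
```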


Table 4.12

D.F.   0.50    0.10    0.05    0.02    0.01    0.005   0.001

1      0.45    2.71    3.84    5.41    6.64    7.88    10.83
2      1.39    4.61    5.99    7.82    9.21    10.60   13.82
3      2.37    6.25    7.82    9.84    11.34   12.84   16.27
4      3.36    7.78    9.49    11.67   13.28   14.86   18.47
5      4.35    9.24    11.07   13.39   15.09   16.75   20.52
6      5.35    10.65   12.59   15.03   16.81   18.55   22.46
7      6.35    12.02   14.07   16.62   18.48   20.28   24.32
8      7.34    13.36   15.51   18.17   20.09   21.96   26.13
9      8.34    14.68   16.92   19.68   21.67   23.59   27.88
10     9.34    15.99   18.31   21.16   23.21   25.19   29.59
(Source: J. E. Park, K. Park, Textbook of Preventive and Social Medicine)

28
Calculated chi-square value = 0.87 (0.866 before rounding)
Critical value = 3.84 from the chi-square table at the 5% level of significance (d.f. = 1)
Decision?
As the calculated value is less than the minimum value needed to be called significant, we conclude that there is no significant difference between the two vaccines and we fail to reject the null hypothesis.
In simple English, the trial gives no evidence that one vaccine is better than the other; the observed difference can be explained by chance, so we can decide to buy the one which is cheaper.

29
CHI-SQUARE (χ²) TEST
Example:
Suppose that in a study of the factors affecting the utilization of antenatal clinics you found that 51 women out of 80 who lived within 10 km of the clinic came for antenatal care, compared to only 35 out of 75 of those who lived more than 10 km away. This suggests that antenatal care (ANC) is used more often by women who live close to the clinics.

1) Calculate the χ² value
2) Use a χ² table
3) Interpret the result
30
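Distance from clinic   Used ANC   Did not use ANC   Total
Within 10 km           51         29                80
More than 10 km        35         40                75
Total                  86         69                155
31
By putting in the expected value of each cell from our calculation:

Distance from clinic   Used ANC   Did not use ANC   Total
Within 10 km           44.4       35.6              80
More than 10 km        41.6       33.4              75
Total                  86         69                155
32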
Out of 155 women, ANC was used by 86
Out of 1 woman, ANC would be used by 86/155
Out of 80 women, ANC would be used by
80 x 86 / 155 = 44.4 (equation 1)
We can write this expression as
80 = row total (RT)
86 = column total (CT)
155 = grand total (GT)
So we can write equation 1 as follows:

EV = RT x CT / GT
Hence we can calculate the expected value of any cell with this formula.

33
CALCULATE χ² VALUE
• Calculate the expected frequency (E) for each cell:
E = row total x column total / grand (overall) total
• For each cell, subtract the expected frequency from the observed frequency (O):
O - E
• For each cell, square the result of (O - E) and divide by the expected frequency E.
• Add the results of the above step for all the cells.

Cont…
34
CALCULATE χ² VALUE
• Formula for calculating the chi-square value:
χ² = ∑ (O - E)² / E
O is the observed frequency (indicated in the table)
E is the expected frequency (to be calculated)
∑ (the sum of) directs you to add together the values of (O - E)² / E for all the cells of the table
• For a two-by-two table (which contains 4 cells) the formula is
χ² = [(O1 - E1)² / E1] + [(O2 - E2)² / E2] + [(O3 - E3)² / E3] + [(O4 - E4)² / E4]

35
USING χ² TABLE
• Decide on a p-value, for example 0.05
• Degrees of freedom
d.f. = (r - 1) x (c - 1)
where r is the number of rows and c the number of columns
for a 2 by 2 table the number of d.f. is 1
(i.e. d.f. = (2 - 1) x (2 - 1) = 1)
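The degrees-of-freedom rule and the table lookup can be sketched together. The dictionary `CRITICAL_05` below is an assumption of this example: it simply copies the first few entries of the p = 0.05 column of the chi-square table, and `chi_square_df` is a helper name made up here. A 3 x 2 table is included as a second illustration.

```python
# 5% critical values of chi-square, copied from the p = 0.05 column of the table
CRITICAL_05 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49}

def chi_square_df(rows, cols):
    """Degrees of freedom for an r x c table: (r - 1) * (c - 1)."""
    return (rows - 1) * (cols - 1)

df_2x2 = chi_square_df(2, 2)
df_3x2 = chi_square_df(3, 2)

print(df_2x2, CRITICAL_05[df_2x2])
print(df_3x2, CRITICAL_05[df_3x2])
```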

36
INTERPRETING THE RESULT
The null hypothesis is rejected if p < 0.05, which is the case if the calculated χ² is larger than the theoretical χ² in the table.

• Step 1 (a)
Expected frequency for each cell, for example
E1 = 86 x 80 / 155 = 44.4

Cont…
37
INTERPRETING THE RESULT
• Step 1 (b) to 1 (d)
χ² = (51 - 44.4)² / 44.4 + …… + ……
= 0.98 + 1.22 + 1.05 + 1.30 = 4.55
• Step 2
- the number of degrees of freedom (d.f.) is 1
- the chosen p-value is 0.05
- in the chi-square table we look along the row for d.f. = 1 to the column where p = 0.05. This gives a value of 3.84. Our value of 4.55 is > 3.84, which means that the p-value is < 0.05

Contd..
38
INTERPRETING THE RESULT

• Step 3
We can now conclude that the women living within 10 km of the clinic used antenatal care significantly more often than women living more than 10 km away.
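Steps 1 to 3 for the antenatal care example can be sketched in one short script. The function name `chi_square_statistic` and the layout of the observed table as a list of rows are assumptions made for illustration. Because the script keeps unrounded expected frequencies, the statistic comes out near 4.57 rather than the hand-calculated 4.55; both exceed 3.84, so the conclusion is the same.

```python
def chi_square_statistic(observed):
    """Sum of (O - E)^2 / E over all cells, with E = row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            total += (o - e) ** 2 / e
    return total

# Rows: within 10 km, more than 10 km; columns: used ANC, did not use ANC
observed = [[51, 80 - 51],
            [35, 75 - 35]]
chi2 = chi_square_statistic(observed)
critical = 3.84  # table value for d.f. = 1 at p = 0.05

print(round(chi2, 2))
print("significant" if chi2 > critical else "not significant")
```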
39
INTERPRETATION OF EXAMPLE
• 64% of the women living within 10 km of the clinic used antenatal care during pregnancy, compared to only 47% of the women living more than 10 km from the nearest clinic. This difference is statistically significant (χ² = 4.55; p < 0.05).

40
NOTE
• The χ² test should be applied only if the sample is large enough
• General rule: the total should be at least 40 and the expected frequency in each of the cells should be at least 5
• The χ² test can be used to compare more than two groups
• A table with 3 or more rows or columns would then be designed rather than a two-by-two table
• In the above example, data for < 5 km, 5 - 10 km and > 10 km would be put in a 3 x 2 table
• d.f. would be (3 - 1) x (2 - 1) = 2

41
QUICK FORMULA
• For two-by-two tables there is a quick method for calculating the chi-square value, which can replace step 1 described above.
If the various numbers in the cross-table are represented by the following letters

                 Condition
Exposure         +      -      Total
Yes              a      b      e
No               c      d      f
Total            g      h      n

the quick formula for calculating the chi-square value is
χ² = n (ad - bc)² / (e f g h)
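A small sketch of the quick formula follows, applied to both worked examples. It assumes the usual lettering, in which a, b, c, d are the four inner cells, e and f the row totals, g and h the column totals, and n the grand total; the function name `quick_chi_square` is introduced here for illustration only.

```python
def quick_chi_square(a, b, c, d):
    """Chi-square for a 2x2 table: n * (ad - bc)^2 / (e * f * g * h)."""
    e, f = a + b, c + d  # row totals
    g, h = a + c, b + d  # column totals
    n = e + f            # grand total
    return n * (a * d - b * c) ** 2 / (e * f * g * h)

anc = quick_chi_square(51, 29, 35, 40)      # antenatal care example
vaccine = quick_chi_square(15, 85, 20, 80)  # vaccine example

print(round(anc, 2), round(vaccine, 2))
```

The quick formula uses no rounded intermediate values, so its results can differ in the second decimal place from a cell-by-cell hand calculation.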
42
Objectives
By the end of the session the participant will be able to:
• Define a test of significance
• Name the test of significance according to the type of data
• Calculate a t-test
• Compute χ²
• Interpret the result
• Interpret statistical results in simple English

43
Tutorial
Exercise for test of significance
A surgeon wants to compare two surgical procedures for hysterectomy, abdominal hysterectomy (AH) and vaginal hysterectomy (VH), with a research question about the infection rate of the two procedures. She operates on 400 patients with the VH approach and 40 develop infection. Out of 100 AH cases, 20 develop infection.
Answer the following questions:

a) What should be the title of the research for publication in a journal?
b) Frame the null hypothesis in this case.
c) What level of significance will you set?
d) What test of significance will you apply, and why?
e) Calculate the test statistic.
f) Interpret your result in statistical language.
g) Describe the result in simple English.
Home Assignment

Do exercises 12.1, 12.2, 12.3, 12.10, 12.13, 12.19

45
