Module 10

BUSINESS RESEARCH METHODS
Prof. Radhika Kiran Kumar
Indira Institute of Business Management

Chi-square Analysis
Chi-Square Test
» Karl Pearson introduced a test to distinguish whether an observed set of
frequencies differs from a specified frequency distribution.

» This method is applicable to categorical (discrete) data only.


Chi-square test
A chi-square test is a statistical test commonly used for testing independence and goodness of fit.
Testing independence determines whether two categorical variables observed on the same sample are dependent on each other (that is, whether one variable helps to estimate the other).
Testing for goodness of fit determines whether an observed frequency distribution matches a theoretical frequency distribution.
Chi-Square Test
» Parametric: testing / comparing variance
» Non-Parametric: Test for Independence, Test for Goodness of Fit
Conditions for the application of the χ2 test
» Observations must be recorded and collected on a random basis.

» All items in the sample must be independent.

» No group should contain very few items, say fewer than 10. Some statisticians take this number as 5, but 10 is regarded as better by most statisticians.

» The total number of items should be large, say at least 50.
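The two numeric conditions above (minimum group size and minimum total) can be checked mechanically. The following is a minimal sketch, not a standard routine: the function name `check_chi_square_conditions` and its default thresholds are illustrative assumptions, and the randomness and independence conditions cannot be verified from counts alone.

```python
def check_chi_square_conditions(counts, min_cell=5, min_total=50):
    """Return a list of messages for violated sample-size conditions.

    `counts` is a flat list of group frequencies. The thresholds are
    assumptions taken from the rule of thumb in the slide (5 or 10 per
    group, at least 50 in total).
    """
    problems = []
    total = sum(counts)
    if total < min_total:
        problems.append(f"total {total} is below {min_total}")
    for index, count in enumerate(counts):
        if count < min_cell:
            problems.append(f"group {index} has only {count} items")
    return problems


# Counts from the performance-evaluation example in this deck.
print(check_chi_square_conditions([63, 45, 72], min_cell=10))

# Hypothetical counts that violate both conditions.
print(check_chi_square_conditions([12, 3, 20]))
```

Passing `min_cell=10` applies the stricter of the two thresholds mentioned above.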


Chi-Square Test as a Non-Parametric Test

» Test of Goodness of Fit

» Test of Independence

Test of Goodness of fit


» It enables us to see how well an assumed theoretical distribution (such as the Binomial, Poisson or Normal distribution) fits the observed data.
» When the calculated value of χ2 is less than the table value at a given level of significance, the fit is considered good; if the calculated value is greater than the table value, the fit is not considered good.

Example
As personnel director, you want to test the perception of fairness of three methods of performance evaluation. Of 180 employees, 63 rated Method 1 as fair, 45 rated Method 2 as fair, and 72 rated Method 3 as fair. At the 0.05 level of significance, is there a difference in perceptions?

Observed        Expected        (O-E)    (O-E)2    (O-E)2 / E
frequency (O)   frequency (E)
63              60                3         9        0.15
45              60              -15       225        3.75
72              60               12       144        2.40
Total                                                6.30


» H0: p1 = p2 = p3 = 1/3
» H1: At least one proportion is different
» α = 0.05
» df = 3 - 1 = 2; critical value = 5.991

Test Statistic: χ2 = 6.3
Decision: Reject H0 at significance level 0.05, since 6.3 > 5.991
Conclusion: At least one proportion is different
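The arithmetic of this goodness-of-fit example can be reproduced in a few lines. This is a sketch in plain Python rather than a statistics library; the constant `CRITICAL_0_05_DF2` is the tabulated χ2 value for α = 0.05 and 2 degrees of freedom, hard-coded here as an assumption instead of being computed.

```python
# Observed counts of employees rating each method as fair.
observed = [63, 45, 72]
n = sum(observed)                                   # 180 employees in total

# Under H0 each method is rated fair by an equal share of employees.
expected = [n / len(observed)] * len(observed)      # 60 per method

# Pearson statistic: sum of (O - E)^2 / E over all categories.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

CRITICAL_0_05_DF2 = 5.991   # tabulated value, alpha = 0.05, df = 2 (assumed)
reject_h0 = chi2 > CRITICAL_0_05_DF2

print(f"chi2 = {chi2:.2f}, df = {df}, reject H0: {reject_h0}")
# prints: chi2 = 6.30, df = 2, reject H0: True
```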

Test of Independence of Attributes
» The χ2 test enables us to determine whether or not two attributes are associated. Testing independence determines whether two categorical variables are dependent on each other (that is, whether one variable helps to estimate the other).
» If the calculated value is less than the table value at a certain level of significance for the given degrees of freedom, we conclude that the null hypothesis stands, which means that the two attributes are independent or not associated. If the calculated value is greater than the table value, we reject the null hypothesis and conclude that the two attributes are associated.

» Determine the hypotheses:

» H0: The two variables are independent

» H1: The two variables are associated

» Calculate the expected frequencies


χ2-Test Methodology

» Given a data set, it is customary to draw a contingency table, whose structure is given above.

χ2-Test Methodology
Entry into Contingency Table: Observed Frequency
In a contingency table, an entry Oij denotes the observed frequency, that is, the number of observations in which attribute A takes on value ai and attribute B takes on value bj (i.e., A = ai, B = bj).

χ2-Test Methodology
Entry into Contingency Table: Expected Frequency
In a contingency table, an entry eij denotes the expected frequency, which can be calculated as

$$e_{ij} = \frac{\mathrm{Count}(A = a_i) \times \mathrm{Count}(B = b_j)}{\text{Grand Total}} = \frac{A_i \times B_j}{N}$$
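The expected-frequency rule (row total times column total, divided by the grand total) can be sketched as a small function. The name `expected_frequencies` and the 2 × 3 table below are hypothetical, chosen only so that the result is easy to verify by hand.

```python
def expected_frequencies(observed):
    """Return e_ij = (row total i * column total j) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical observed counts: 2 rows (values of A), 3 columns (values of B).
observed = [[20, 30, 50],
            [30, 20, 50]]

expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

Each row total here is 100 and the column totals are 50, 50 and 100 out of 200, so every expected row is 25, 25, 50.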
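
χ2-Test

Definition 7.3: χ2-Value

The χ2 value (also known as Pearson's χ2 statistic) can be computed as

$$\chi^2 = \sum_{i=1}^{n} \sum_{j=1}^{m} \frac{(O_{ij} - e_{ij})^2}{e_{ij}}$$

where Oij is the observed frequency and eij is the expected frequency of the cell (A = ai, B = bj).
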

χ2-Test
» The cells that contribute the most to the χ2 value are those whose actual count is very different from the expected count.

» The χ2 statistic tests the hypothesis that A and B are independent. The test is carried out at a chosen significance level, with (n-1) × (m-1) degrees of freedom for a contingency table of size n × m.

» If the hypothesis can be rejected, then we say that A and B are statistically related or associated.
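To see which cells drive the statistic, one can list each cell's contribution (O - e)^2 / e alongside the degrees of freedom. The sketch below uses a hypothetical 2 × 3 table; none of its numbers come from the slides.

```python
# Hypothetical observed counts (2 rows, 3 columns), for illustration only.
observed = [[10, 20, 10],
            [30, 20, 10]]

row_totals = [sum(row) for row in observed]          # [40, 60]
col_totals = [sum(col) for col in zip(*observed)]    # [40, 40, 20]
grand_total = sum(row_totals)                        # 100

# Per-cell contribution to the chi-square value, keyed by (row, column).
contributions = {}
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        contributions[(i, j)] = (o - e) ** 2 / e

total = sum(contributions.values())
largest = max(contributions, key=contributions.get)

# Degrees of freedom for an n x m table: (n - 1) * (m - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"chi2 = {total:.2f}, df = {df}, largest contribution at cell {largest}")
```

In this table the first cell (observed 10 against expected 16) contributes 2.25 of the total 6.25, more than any other cell.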

Example
Survey on Gender versus Hobby.
» Suppose a survey was conducted among a population of size 1500. In this survey, the gender of each person and their hobby, either “book” or “computer”, was noted. The survey result was obtained in a table like the following.

» We have to find whether there is any association between the Gender and Hobby of the people surveyed, that is, we are to test whether “gender” and “hobby” are correlated.

χ2-Test
Example: Survey on Gender versus Hobby.
» From the survey table, the observed frequencies are counted and entered into the contingency table, which is shown below.

                      GENDER
HOBBY         Male    Female    Total
Book
Computer
Total

χ2-Test
» From the survey table, the expected frequencies are calculated and entered into the contingency table, which is shown below.

              Male    Female    Total
Book
Computer
Total
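
χ2-Test

» Using the equation for χ2 computation, we get

$$\chi^2 = \frac{(O_{11}-e_{11})^2}{e_{11}} + \frac{(O_{12}-e_{12})^2}{e_{12}} + \frac{(O_{21}-e_{21})^2}{e_{21}} + \frac{(O_{22}-e_{22})^2}{e_{22}}$$
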

χ2-Test
» This value needs to be compared with the tabulated value of χ2 (available in any standard book on statistics) with 1 degree of freedom (for a table of m × n, the degrees of freedom is (m-1) × (n-1); here m = 2, n = 2).

» For 1 degree of freedom, the χ2 value needed to reject the hypothesis at the 0.001 significance level is 10.828. Since our computed value is above this, we reject the hypothesis that “Gender” and “Hobby” are independent and hence conclude that the two attributes are strongly correlated for the given group of people.
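Tabulated critical values for 1 degree of freedom can be cross-checked without a statistics library, because a χ2 variable with 1 degree of freedom is the square of a standard normal variable, so its upper-tail probability is erfc(sqrt(x / 2)). The helper name `chi2_sf_df1` is an assumption for illustration, and the shortcut applies only when df = 1.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability P(chi2 > x) for 1 degree of freedom only."""
    return math.erfc(math.sqrt(x / 2.0))


# Commonly tabulated critical values for df = 1.
for critical_value in (3.841, 6.635, 10.828):
    p = chi2_sf_df1(critical_value)
    print(f"chi2 = {critical_value}: tail probability is about {p:.4f}")
```

The three values correspond to tail probabilities of roughly 0.05, 0.01 and 0.001 respectively.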

χ2-Test

» You’re a marketing research analyst. You ask a random sample of 286 consumers if they purchase Diet Pepsi or Diet Coke. At the 0.05 level of significance, is there evidence of a relationship?

                 Diet Pepsi
Diet Coke      No     Yes    Total
No             84      32      116
Yes            48     122      170
Total         132     154      286

χ2 TEST OF INDEPENDENCE SOLUTION

» Eij >= 5 in all cells

                      Diet Pepsi
                  No               Yes
Diet Coke      Obs.   Exp.      Obs.   Exp.     Total
No              84    53.5       32    62.5       116
Yes             48    78.5      122    91.5       170
Total          132              154               286

Exp.(No, No) = 116·132 / 286        Exp.(No, Yes) = 116·154 / 286
Exp.(Yes, No) = 170·132 / 286       Exp.(Yes, Yes) = 170·154 / 286

χ2 TEST OF INDEPENDENCE SOLUTION

» H0: No Relationship
» H1: Relationship
» α = 0.05
» df = (2 - 1)(2 - 1) = 1
» Critical Value: 3.841

Test Statistic: χ2 = 54.29
Decision: Reject H0 at significance level 0.05
Conclusion: There is evidence of a relationship
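The Diet Coke versus Diet Pepsi computation can be reproduced as follows. This is a sketch in plain Python; the helper name `chi_square` is illustrative. Note that the slide's value of 54.29 results from rounding the expected frequencies to one decimal first; with unrounded expected frequencies the statistic is about 54.15. Both are far above the critical value of 3.841, so the decision is the same.

```python
# Observed counts: rows are Diet Coke (No, Yes), columns are Diet Pepsi (No, Yes).
observed = [[84, 32],
            [48, 122]]

row_totals = [sum(row) for row in observed]          # [116, 170]
col_totals = [sum(col) for col in zip(*observed)]    # [132, 154]
n = sum(row_totals)                                  # 286

# Expected frequency of each cell: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(obs, exp):
    """Pearson statistic: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(obs, exp)
               for o, e in zip(obs_row, exp_row))


chi2_exact = chi_square(observed, expected)

# Reproduce the slide's figure by rounding expected values to one decimal.
expected_rounded = [[round(e, 1) for e in row] for row in expected]
chi2_rounded = chi_square(observed, expected_rounded)

CRITICAL_0_05_DF1 = 3.841   # critical value quoted in the slide
print(f"unrounded: {chi2_exact:.2f}, rounded expected: {chi2_rounded:.2f}")
print("reject H0:", chi2_exact > CRITICAL_0_05_DF1)
```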

χ2-Test

Example: Hypothesis on “accident proneness” versus “driver’s handedness”.

» Consider the following data on car accidents among left-handed and right-handed drivers, with a sample size of 175.

» The hypothesis is that “fatality of accidents is independent of driver’s handedness”.

                        HANDEDNESS
FATALITY      Left-Handed    Right-Handed    Total
Non-Fatal
Fatal
Total

» Find the correlation between Fatality and Handedness and test the significance of the correlation at the 0.1% significance level.

THANKS!
Any questions?
