
Analysis Of Categorical Data

An attribute is a quality or characteristic that cannot be measured numerically but can be
identified by its presence or absence. For instance, literacy, honesty, nationality and beauty
are attributes. Attributes are studied with the objective of finding out whether any kind of
association exists between them. For example, we may want to know whether a student's education
and area of residence are related, or whether inoculation (vaccination) and prevention of a
disease are associated. Given an attribute, the population can be divided into two classes: one
possessing that attribute and the other not possessing it. Such a classification into two
classes is called dichotomous.
Association of Attributes
Let
N denote the total number of items under study.
A and B denote the two attributes under study. Each attribute divides the items into two
disjoint classes.
A and α denote the presence and absence of attribute A, while B and β denote the presence and
absence of attribute B respectively.
(A) denote the number of items possessing attribute A and (α) the number of items not
possessing attribute A. Similarly, (B) and (β) denote the number of items possessing and not
possessing attribute B respectively.
∴ (A) + (α) = N
and (B) + (β) = N
(AB) denotes the number of items possessing both A and B. Similarly, (Aβ), (αB) and (αβ)
denote the number of items possessing A but not B, possessing B but not A, and possessing
neither A nor B, respectively.
∴ (AB) + (Aβ) = (A)
(αB) + (αβ) = (α)
(AB) + (αB) = (B)
(Aβ) + (αβ) = (β)

This information can be presented in a 2 x 2 table, called a contingency table:


          B       β       Total
A         (AB)    (Aβ)    (A)
α         (αB)    (αβ)    (α)
Total     (B)     (β)     N
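
As an illustration of how the table hangs together, the following minimal Python sketch fills in the four cells from N, (A), (B) and (AB) using the relations above. The function name, the dictionary keys (with "a" and "b" standing in for α and β) and the sample figures are assumptions made for this example only.

```python
def contingency_table(n, a, b, ab):
    """Fill in the 2 x 2 table from N, (A), (B) and (AB).

    Keys use lowercase 'a' and 'b' for alpha and beta (absence of A and B).
    """
    return {
        "AB": ab,              # (AB): items possessing both A and B
        "Ab": a - ab,          # (A beta)     = (A) - (AB)
        "aB": b - ab,          # (alpha B)    = (B) - (AB)
        "ab": n - a - b + ab,  # (alpha beta) = N - (A) - (B) + (AB)
    }


# Hypothetical data: N = 100, (A) = 40, (B) = 30, (AB) = 15
table = contingency_table(n=100, a=40, b=30, ab=15)
print(table)

# The four cells must add up to N
print(sum(table.values()))
```

With these figures the cells are (AB) = 15, (Aβ) = 25, (αB) = 15 and (αβ) = 45, which add up to N = 100.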


Two attributes A and B are said to be independent if there is no association between them,
i.e., attributes A and B are independent if and only if
(AB)/(B) = (Aβ)/(β), or (AB)(αβ) = (Aβ)(αB), or (AB) = (A)(B)/N
For example, intelligence and gender.
Attributes A and B are said to be positively associated
if (AB) > (A)(B)/N
For example, extra coaching and good results.
Attributes A and B are said to be negatively associated
if (AB) < (A)(B)/N
For example, vaccination and occurrence of disease.
If any class frequency in the table is negative, the data are said to be inconsistent.
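
The comparison of (AB) with (A)(B)/N can be sketched in Python as follows. This is only an illustration: the function name and the sample figures are assumptions, and the comparison is done as (AB)·N against (A)·(B) so that integer counts are compared exactly without division.

```python
def nature_of_association(n, a, b, ab):
    """Compare (AB) with (A)(B)/N, written as (AB)*N versus (A)*(B)."""
    observed = ab * n  # (AB) * N
    expected = a * b   # (A) * (B)
    if observed > expected:
        return "positive"
    if observed < expected:
        return "negative"
    return "independent"


# Hypothetical data with N = 100, (A) = 40, (B) = 30, so (A)(B)/N = 12
print(nature_of_association(100, 40, 30, 12))  # (AB) equals 12
print(nature_of_association(100, 40, 30, 20))  # (AB) above 12
print(nature_of_association(100, 40, 30, 5))   # (AB) below 12
```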

Measures of association between two Attributes


Yule's Coefficient of Association
Yule's coefficient of association determines not only the nature of the association but also
the degree or extent to which the two attributes are associated. It is denoted by Q and is given by
Q = [ (AB)(αβ) - (Aβ)(αB) ] / [ (AB)(αβ) + (Aβ)(αB) ]
1) Q always lies between -1 and +1
2) When (AB)(αβ) = (Aβ)(αB) then Q = 0 and we say A and B are independent
3) When (Aβ) =0 or (αB) =0 then Q= +1 and we say that A and B are completely
associated
4) When (AB)=0 or (αβ)=0 then Q= -1 and we say that A and B are completely
dissociated
5) When A and B are negatively associated then -1 < Q < 0
6) When A and B are positively associated then 0 < Q < 1
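
A minimal Python sketch of Q, written directly from the formula above, may help to check the listed properties numerically. The function and argument names and the sample cell frequencies are assumptions made for this illustration.

```python
def yules_q(ab, a_beta, alpha_b, alpha_beta):
    """Yule's Q from the four cell frequencies of the 2 x 2 table."""
    concordant = ab * alpha_beta   # (AB)(alpha beta)
    discordant = a_beta * alpha_b  # (A beta)(alpha B)
    denominator = concordant + discordant
    if denominator == 0:
        raise ValueError("Q is undefined when both cross products are zero")
    return (concordant - discordant) / denominator


# Hypothetical tables illustrating the properties listed above
print(yules_q(12, 28, 18, 42))  # equal cross products: independent
print(yules_q(10, 0, 5, 20))    # (A beta) = 0: complete association
print(yules_q(0, 10, 5, 20))    # (AB) = 0: complete dissociation
print(yules_q(15, 25, 15, 45))  # positive association
```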

Coefficient of Colligation
It is another measure of association with properties similar to those of Yule's coefficient of
association Q. The coefficient of colligation is denoted by Y and is given by the formula
Y = [ √((AB)(αβ)) - √((Aβ)(αB)) ] / [ √((AB)(αβ)) + √((Aβ)(αB)) ]

The interpretation of Y values between -1 and +1 is the same as that of Q (Yule's coefficient of
association).
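
The coefficient of colligation can be computed in the same way as Q. The sketch below is illustrative only; the function and argument names and the sample frequencies are assumptions.

```python
import math


def colligation_y(ab, a_beta, alpha_b, alpha_beta):
    """Coefficient of colligation Y from the four cell frequencies."""
    p = math.sqrt(ab * alpha_beta)   # sqrt of (AB)(alpha beta)
    q = math.sqrt(a_beta * alpha_b)  # sqrt of (A beta)(alpha B)
    if p + q == 0:
        raise ValueError("Y is undefined when both cross products are zero")
    return (p - q) / (p + q)


# Hypothetical table: (AB) = 16, (A beta) = 4, (alpha B) = 4, (alpha beta) = 16
# p = sqrt(256) = 16 and q = sqrt(16) = 4, so Y = 12 / 20
print(colligation_y(16, 4, 4, 16))
```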

Relationship between Yule's coefficient of association and coefficient of colligation (done
in class)
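
The notes do not reproduce the classroom derivation. For reference, the standard relation Q = 2Y / (1 + Y²) follows from the two formulas above; the shorthand p and q below is introduced only for this sketch.

```latex
\text{Let } p = \sqrt{(AB)(\alpha\beta)}, \qquad q = \sqrt{(A\beta)(\alpha B)},
\qquad\text{so that}\qquad
Y = \frac{p - q}{p + q}, \qquad Q = \frac{p^{2} - q^{2}}{p^{2} + q^{2}}.

1 + Y^{2} = \frac{(p + q)^{2} + (p - q)^{2}}{(p + q)^{2}}
          = \frac{2\,(p^{2} + q^{2})}{(p + q)^{2}}

\frac{2Y}{1 + Y^{2}}
  = \frac{2\,(p - q)}{p + q} \cdot \frac{(p + q)^{2}}{2\,(p^{2} + q^{2})}
  = \frac{(p - q)(p + q)}{p^{2} + q^{2}}
  = \frac{p^{2} - q^{2}}{p^{2} + q^{2}}
  = Q
```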
Consistency of Data
Conditions for consistency of data for two attributes
(i) (AB) ≥ 0
(ii) (A) ≥ (AB)
(iii) (B) ≥ (AB)
(iv) (AB) ≥ (A) + (B) – N
(v) Max{ 0, (A) + (B) – N } ≤ (AB) ≤ Min {(A) , (B) }
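
Condition (v) gives a direct range check for (AB). A minimal Python sketch, with function names and sample figures assumed for illustration:

```python
def ab_bounds(n, a, b):
    """Range allowed for (AB): Max{0, (A)+(B)-N} <= (AB) <= Min{(A), (B)}."""
    return max(0, a + b - n), min(a, b)


def is_consistent_two(n, a, b, ab):
    """True when (AB) lies inside the range allowed by condition (v)."""
    lower, upper = ab_bounds(n, a, b)
    return lower <= ab <= upper


# Hypothetical data: N = 100, (A) = 60, (B) = 70
print(ab_bounds(100, 60, 70))              # lower and upper limits for (AB)
print(is_consistent_two(100, 60, 70, 40))  # (AB) inside the limits
print(is_consistent_two(100, 60, 70, 25))  # (AB) below the lower limit
```

Here (AB) must lie between 30 and 60, so (AB) = 25 would make (αβ) = N - (A) - (B) + (AB) negative.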
Conditions for consistency for three Attributes
(i) (ABC) ≥ 0
(ii) (AB) ≥ (ABC)
(iii) (AC) ≥ (ABC)
(iv) (BC) ≥ (ABC)
(v) (ABC) ≥ (AB) + (BC) - (B)
(vi) (ABC) ≥ (AB) +(AC) - (A)
(vii) (ABC) ≥ (AC) + (BC) – (C)
(viii) (ABC) ≤ (AB) + (BC) + (AC) – (A) – (B) - (C) + N
(ix) Max{ 0, (AC) + (BC) – (C), (AB) + (BC) – (B), (AB) + (AC) – (A) } ≤ (ABC) ≤
Min{ (AB), (AC), (BC), (AB) + (BC) + (AC) – (A) – (B) – (C) + N }
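
The combined condition (ix) can be checked in the same way. The sketch below is illustrative, with assumed function names and made-up frequencies. It tests only the range for (ABC) and assumes the first-order and second-order frequencies are themselves consistent.

```python
def abc_bounds(n, a, b, c, ab, ac, bc):
    """Lower and upper limits for (ABC) from condition (ix)."""
    lower = max(0, ac + bc - c, ab + bc - b, ab + ac - a)
    upper = min(ab, ac, bc, ab + bc + ac - a - b - c + n)
    return lower, upper


def is_consistent_three(n, a, b, c, ab, ac, bc, abc):
    """True when (ABC) lies inside the limits given by condition (ix)."""
    lower, upper = abc_bounds(n, a, b, c, ab, ac, bc)
    return lower <= abc <= upper


# Hypothetical data: N = 100, (A) = 50, (B) = 60, (C) = 50,
# (AB) = 30, (AC) = 25, (BC) = 30
print(abc_bounds(100, 50, 60, 50, 30, 25, 30))
print(is_consistent_three(100, 50, 60, 50, 30, 25, 30, 10))
print(is_consistent_three(100, 50, 60, 50, 30, 25, 30, 3))
```

For these figures (ABC) must lie between 5 and 25. If the lower limit ever exceeds the upper limit, no value of (ABC) is consistent with the given data.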
