Theory 1

Theme: The analysis of the qualitative characteristics.

Contingency table

In statistics, a contingency table (also known as a cross tabulation or crosstab) is a type of table in a matrix format that displays the (multivariate) frequency distribution of the variables. Such tables provide a basic picture of the interrelation between two variables and can help find interactions between them. The term contingency table was first used by Karl Pearson.
Pearson's chi-squared test (χ²) is a statistical test applied to sets of categorical data to evaluate how likely it is that any observed difference between the sets arose by chance. It is suitable for unpaired data from large samples. It is the most widely used of many chi-squared tests (e.g., Yates, likelihood ratio, portmanteau test in time series, etc.), that is, statistical procedures whose results are evaluated by reference to the chi-squared distribution. It tests a null hypothesis stating that the frequency distribution of certain events observed in a sample is consistent with a particular theoretical distribution. The events considered must be mutually exclusive and have total probability 1. A common case for this is where the events each cover an outcome of a categorical variable.

Steps:
1. Make a 2x2 contingency table with a, b, c, d – observed frequencies.

2. State your null hypothesis and your alternative hypothesis.


3. Choose a significance level of the test, α.
4. Calculate the expected frequencies a0, b0, c0 and d0 by the following formulas:
$$a_0 = \frac{(a+b)(a+c)}{n}; \quad b_0 = \frac{(a+b)(b+d)}{n}; \quad c_0 = \frac{(c+d)(a+c)}{n}; \quad d_0 = \frac{(c+d)(b+d)}{n},$$
where n = a + b + c + d is the total number of observations.
5. Calculate the chi-square statistic:
$$X^2_{st} = \frac{(a-a_0)^2}{a_0} + \frac{(b-b_0)^2}{b_0} + \frac{(c-c_0)^2}{c_0} + \frac{(d-d_0)^2}{d_0}$$
If at least one cell of the table has an expected count smaller than 5, use Yates's correction for the statistic instead:
$$X^2_{Yates} = \frac{(|a-a_0|-0.5)^2}{a_0} + \frac{(|b-b_0|-0.5)^2}{b_0} + \frac{(|c-c_0|-0.5)^2}{c_0} + \frac{(|d-d_0|-0.5)^2}{d_0}$$
6. Find the critical value X²cr(α, k) from a chi-square table, where k is the number of degrees of freedom. For a 2x2 table, k = (rows − 1)(columns − 1) = (2 − 1)(2 − 1) = 1.
7. Compare the values of X²st and X²cr, then draw a conclusion and interpret the results:
i. If X²st ≤ X²cr, the null hypothesis H0 is retained (not rejected).
ii. If X²st > X²cr, the null hypothesis H0 is rejected and the alternative hypothesis H1 is accepted.
8. If the null hypothesis H0 is rejected and the variables are dependent, calculate Yule's coefficient of association Q to measure the strength of the relationship between the two variables:
$$Q = \frac{a \cdot d - b \cdot c}{a \cdot d + b \cdot c}$$
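Steps 4 and 5 can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names `expected_counts` and `chi_square_2x2` and the sample counts are made up for this sketch, and it assumes no row or column total is zero (otherwise an expected count would be zero and the division would fail).

```python
def expected_counts(a, b, c, d):
    """Step 4: expected frequencies a0, b0, c0, d0 under independence."""
    n = a + b + c + d
    return (
        (a + b) * (a + c) / n,
        (a + b) * (b + d) / n,
        (c + d) * (a + c) / n,
        (c + d) * (b + d) / n,
    )


def chi_square_2x2(a, b, c, d):
    """Step 5: chi-square statistic; Yates's correction is applied
    when at least one expected count is smaller than 5."""
    observed = (a, b, c, d)
    expected = expected_counts(a, b, c, d)
    use_yates = min(expected) < 5
    shift = 0.5 if use_yates else 0.0
    stat = sum((abs(o - e) - shift) ** 2 / e
               for o, e in zip(observed, expected))
    return stat, use_yates


# Hypothetical small table: every expected count is 2, so Yates applies.
stat, use_yates = chi_square_2x2(3, 1, 1, 3)
print(stat, use_yates)  # 0.5 True
```

Returning the `use_yates` flag alongside the statistic makes it visible which of the two formulas was actually used.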
Example: Suppose there are two variables, sex (male or female) and handedness (right or
left handed). Further suppose that 60 individuals are randomly sampled from a very large
population as part of a study of sex differences in handedness. A contingency table can be
created to display the numbers of individuals who are male and right handed, male and left
handed, female and right handed, and female and left handed.

Observation Handedness Sex Observation Handedness Sex Observation Handedness Sex
1 Right handed Male 21 Right handed Male 41 Right handed Male
2 Right handed Female 22 Right handed Male 42 Right handed Female
3 Left handed Female 23 Right handed Female 43 Right handed Male
4 Right handed Male 24 Right handed Male 44 Right handed Male
5 Right handed Male 25 Left handed Female 45 Right handed Male
6 Right handed Male 26 Left handed Female 46 Left handed Female
7 Right handed Male 27 Left handed Female 47 Right handed Male
8 Right handed Female 28 Right handed Male 48 Right handed Female
9 Left handed Female 29 Right handed Male 49 Right handed Male
10 Right handed Male 30 Left handed Female 50 Right handed Male
11 Left handed Female 31 Left handed Male 51 Right handed Female
12 Right handed Male 32 Right handed Male 52 Left handed Female
13 Left handed Male 33 Left handed Female 53 Left handed Female
14 Left handed Male 34 Left handed Male 54 Right handed Male
15 Left handed Female 35 Left handed Female 55 Left handed Male
16 Right handed Male 36 Right handed Female 56 Right handed Female
17 Right handed Female 37 Right handed Male 57 Right handed Male
18 Right handed Male 38 Left handed Female 58 Left handed Female
19 Left handed Female 39 Left handed Male 59 Left handed Male
20 Left handed Male 40 Left handed Male 60 Left handed Male

Step1.
The contingency table is shown below.

Sex \ Handedness    Right handed    Left handed    Total
Male                25 (a)          10 (b)         35
Female              9 (c)           16 (d)         25
Total               34              26             60

The numbers of males, females, and right- and left-handed individuals are called marginal totals. The grand total (i.e. the total number of individuals represented in the contingency table) is the number in the bottom right corner.
The table allows users to see at a glance that the proportion of men who are right handed (25/35 ≈ 71%) is considerably larger than the proportion of women who are right handed (9/25 = 36%). The significance of the difference between the two proportions can be assessed with a variety of statistical tests including Pearson's chi-squared test, provided the entries in the table represent individuals randomly sampled from the population about which conclusions are to be drawn. If the proportions of individuals in the different columns vary significantly between rows (or vice versa), it is said that there is a contingency between the two variables. In other words, the two variables are not independent. If there is no contingency, it is said that the two variables are independent.
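The tally in Step 1 can be reproduced with a short Python sketch. The two-letter coding of the 60 observations ("RM" for a right-handed male, "LF" for a left-handed female, and so on) is an assumption introduced here for compactness; the values themselves are transcribed from the observation list above.

```python
from collections import Counter

# Observations 1-60, coded as handedness (R/L) followed by sex (M/F).
observations = (
    "RM RF LF RM RM RM RM RF LF RM LF RM LM LM LF RM RF RM LF LM "  # 1-20
    "RM RM RF RM LF LF LF RM RM LF LM RM LF LM LF RF RM LF LM LM "  # 21-40
    "RM RF RM RM RM LF RM RF RM RM RF LF LF RM LM RF RM LF LM LM"   # 41-60
).split()

counts = Counter(observations)
a, b = counts["RM"], counts["LM"]  # male row: right handed, left handed
c, d = counts["RF"], counts["LF"]  # female row: right handed, left handed

print("Male  ", a, b, a + b)
print("Female", c, d, c + d)
print("Total ", a + c, b + d, a + b + c + d)

# Row proportions of right-handed individuals.
share_right_male = a / (a + b)
share_right_female = c / (c + d)
print(round(share_right_male, 2), round(share_right_female, 2))
```

Comparing the two row proportions is the informal check that the chi-squared test then formalizes.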

Step2. H0: X and Y are independent variables, i.e. there is no relationship between them. Handedness does not depend on sex.
H1: X and Y are dependent variables, i.e. there is a relationship between them. Handedness depends on sex.

Step3. The significance level is α = 0.05.

Step4.

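$$a_0 = \frac{(a+b)(a+c)}{n} = \frac{35 \cdot 34}{60} = 19.83; \quad c_0 = \frac{(c+d)(a+c)}{n} = \frac{25 \cdot 34}{60} = 14.17;$$
$$b_0 = \frac{(a+b)(b+d)}{n} = \frac{35 \cdot 26}{60} = 15.17; \quad d_0 = \frac{(c+d)(b+d)}{n} = \frac{25 \cdot 26}{60} = 10.83.$$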

Step5. All expected frequencies a0, b0, c0 and d0 are greater than 5, so Yates's correction is not needed, and the chi-square statistic is calculated by the formula:

$$X^2_{st} = \frac{(a-a_0)^2}{a_0} + \frac{(b-b_0)^2}{b_0} + \frac{(c-c_0)^2}{c_0} + \frac{(d-d_0)^2}{d_0} = 7.45$$
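As a check on the arithmetic, the four terms of the sum can be written out. The values below were recomputed with the unrounded expected frequencies (for example 19.8333 rather than 19.83) and are rounded to three decimals, so the last digit may differ slightly from a hand calculation with rounded inputs.

```latex
X^2_{st} = \frac{(25-19.83)^2}{19.83} + \frac{(10-15.17)^2}{15.17}
         + \frac{(9-14.17)^2}{14.17} + \frac{(16-10.83)^2}{10.83}
         \approx 1.346 + 1.760 + 1.884 + 2.464 = 7.454 \approx 7.45
```

In a 2x2 table every observed count differs from its expected count by the same absolute amount (here about 5.17), so the four numerators are equal and only the denominators change.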

Step6. $X^2_{cr}(\alpha, k) = X^2_{cr}(0.05, 1) = 3.84$
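If no chi-square table is at hand, the critical value for one degree of freedom can be obtained with the Python standard library alone. The sketch below relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable; it does not generalize to larger tables, and the variable names are chosen for this illustration.

```python
import math
from statistics import NormalDist

alpha = 0.05
x2_st = 7.45  # statistic obtained in Step 5

# For 1 degree of freedom the critical value is the squared
# two-sided standard normal quantile.
x2_cr = NormalDist().inv_cdf(1 - alpha / 2) ** 2

# p-value: probability that a chi-square variable with 1 degree
# of freedom exceeds the observed statistic.
p_value = math.erfc(math.sqrt(x2_st / 2))

reject_h0 = x2_st > x2_cr
print(round(x2_cr, 2), reject_h0)  # 3.84 True
```

Comparing the p-value with α leads to the same decision as comparing the statistic with the critical value.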

Step7. $X^2_{st} = 7.45 > X^2_{cr} = 3.84$

Conclusion: The null hypothesis H0 is rejected and the alternative hypothesis H1 is accepted. X and Y are dependent variables and they have a relationship. Handedness depends on sex.

Step8. The strength of the relationship between the two variables:

$$Q = \frac{a \cdot d - b \cdot c}{a \cdot d + b \cdot c} = \frac{25 \cdot 16 - 10 \cdot 9}{25 \cdot 16 + 10 \cdot 9} = 0.63$$

Step9. Odds ratio. Another important statistic that can be calculated from the contingency table is the odds ratio (OR):

$$OR = \frac{a \cdot d}{b \cdot c}$$

The odds ratio compares the odds of an outcome (for example, the development of an unwanted side effect after using a drug) in one group with the odds of the same outcome in the other group. If OR = 1 (or very close to 1), the odds of the event are almost the same in both groups.
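For the handedness table, the odds ratio can be computed as in the sketch below. Treating "right handed" as the outcome and sex as the grouping factor is an assumption made here for illustration. The sketch also shows that Yule's Q is a rescaling of the odds ratio, Q = (OR − 1)/(OR + 1), which follows from dividing the numerator and denominator of Q by b·c.

```python
a, b, c, d = 25, 10, 9, 16  # observed frequencies from the example

odds_male = a / b      # odds of being right handed among males
odds_female = c / d    # odds of being right handed among females
odds_ratio = odds_male / odds_female  # same as (a * d) / (b * c)

# Yule's Q expressed through the odds ratio.
yule_q = (odds_ratio - 1) / (odds_ratio + 1)

print(round(odds_ratio, 2), round(yule_q, 2))  # 4.44 0.63
```

An odds ratio well above 1 and a Q well above 0 point in the same direction as the rejected null hypothesis.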
Step10. Relative risk. Risk is the probability of a certain outcome, such as illness or injury,
depending on a factor. The risk can range from 0 (there is no probability of an outcome) up to 1
(in all cases, an unfavorable outcome is expected). Relative risk is the ratio of the frequency of
outcomes among subjects influenced by the studied factor to the frequency of outcomes among
subjects not influenced by this factor. In the scientific literature, the abbreviated name of the
indicator is often used - RR ("relative risk"). We find the value of the relative risk using the
following formula:
$$RR = \frac{a/(a+b)}{c/(c+d)} = \frac{a \cdot (c+d)}{c \cdot (a+b)}$$

where a, b, c, d are the observed frequencies in the cells of the contingency table.
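Applied to the handedness table, the formula can be sketched as follows. Handedness is not a risk in the medical sense, so this is only an arithmetic illustration: "right handed" is assumed to play the role of the outcome and "male" the role of the exposed group.

```python
a, b, c, d = 25, 10, 9, 16  # observed frequencies from the example

risk_male = a / (a + b)     # share of right-handed individuals among males
risk_female = c / (c + d)   # share of right-handed individuals among females
relative_risk = risk_male / risk_female

print(round(risk_male, 2), round(risk_female, 2), round(relative_risk, 2))
# 0.71 0.36 1.98
```

A value of RR close to 1 would indicate that the outcome is about equally frequent in both groups.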
