Chi-Squared Test: Goodness of Fit and Independence of Attributes
• The Chi-squared test can be used to see if your data follows a well-known
theoretical probability distribution like the Normal or Poisson distribution.
• The Chi-squared test allows you to assess your trained regression
model's goodness of fit on the training, validation, and test data sets.
▪ Chi-square is most commonly used by researchers who are studying
survey response data because it applies to categorical variables.
▪ Demography, consumer and marketing research, political science, and
economics are all examples of this type of research.
Independence of Attributes
▪ The chi-square test of independence, also known as the chi-square test of
association, is used to determine whether two categorical variables are
associated.
▪ It is considered a non-parametric test.
▪ Non-parametric tests are tests that do not require assumptions about the
distribution of the underlying population. They do not rely on the data belonging
to any particular parametric family of probability distributions. For this reason,
non-parametric methods are also called distribution-free tests.
▪ It is mostly used to test statistical independence.
▪ For this test, the data must meet the following requirements:
• Two categorical variables
• Relatively large sample size
• Two or more categories for each variable
• Independence of observations
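Formulas
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
With degrees of freedom = (r-1)(c-1)
Where r – number of rows, c – number of columns
O11, O12, O13, ..., Orc – Observed values for every cell
E11, E12, E13, ..., Erc – Expected values for every cell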
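As an illustration of the independence statistic, the following minimal Python sketch computes the expected frequencies, the chi-squared statistic, and the degrees of freedom for an r x c table. It assumes the usual expected count under independence, E_ij = (row total x column total) / N, which is not written out in the slides. The function name `chi_square_independence` and the 2 x 3 table of counts are hypothetical and serve only as an example.

```python
def chi_square_independence(observed):
    """Return (statistic, degrees_of_freedom, expected) for an r x c table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # Expected count under independence (assumed standard formula):
    # E_ij = row_total_i * col_total_j / N
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

    # Sum of (O_ij - E_ij)^2 / E_ij over every cell
    stat = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )

    # Degrees of freedom = (r - 1)(c - 1)
    dof = (len(observed) - 1) * (len(col_totals) - 1)
    return stat, dof, expected


# Hypothetical 2 x 3 table of counts, for illustration only
table = [[20, 30, 50],
         [30, 30, 40]]
stat, dof, expected = chi_square_independence(table)
print(f"chi-squared = {stat:.3f}, degrees of freedom = {dof}")
```

The statistic would then be compared with the tabulated chi-squared critical value for the computed degrees of freedom at the chosen significance level.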
H1 : Grades in Fluid Mechanics and Dynamics of Machinery are dependent on each other
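$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
With degrees of freedom = k-p-1
Where k – number of intervals, p – number of parameters estimated from the data
O1, O2, O3, ..., Ok – Observed values for every interval
E1, E2, E3, ..., Ek – Expected values for every interval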
Flaws                 1    2    3    4    5    6    7    8
Observed Frequency    1   11    8   13   11   12   10    9
H0 : The distribution of the flaws follows a Poisson distribution
H1 : The distribution of the flaws does not follow a Poisson distribution
The expected frequencies can be calculated using the Poisson distribution,
$$P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$$
Where $\lambda$ is the parameter of the Poisson distribution and can be estimated from
the sample mean
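The calculation of the expected frequencies can be sketched in Python as follows. The sketch takes the flaw counts and observed frequencies from the data above, estimates the Poisson parameter as the frequency-weighted sample mean, and multiplies each Poisson probability by the number of specimens. The variable names and the helper `poisson_pmf` are illustrative choices, not part of the original material.

```python
import math

# Data from the example: number of flaws and how many specimens showed that count
flaws = [1, 2, 3, 4, 5, 6, 7, 8]
observed = [1, 11, 8, 13, 11, 12, 10, 9]

n = sum(observed)  # total number of specimens

# Estimate the Poisson parameter from the sample mean (weighted by frequency)
lam = sum(x * o for x, o in zip(flaws, observed)) / n


def poisson_pmf(x, lam):
    """Poisson probability P(X = x) = exp(-lam) * lam**x / x!"""
    return math.exp(-lam) * lam ** x / math.factorial(x)


probs = [poisson_pmf(x, lam) for x in flaws]
expected = [p * n for p in probs]  # expected frequency = probability x no. of specimens

for x, o, p, e in zip(flaws, observed, probs, expected):
    print(f"{x}  {o:2d}  {p:.3f}  {e:.3f}")
```

With this data the sample mean is 368/75, roughly 4.91, and the printed probabilities and expected frequencies should agree with the tabulated values up to rounding.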
Flaws   Observed    Probability by         Expected Frequency =
        Frequency   Poisson Distribution   (prob. x no. of specimens)
1        1          0.036                   2.721
2       11          0.089                   6.677
3        8          0.145                  10.921
4       13          0.179                  13.397
5       11          0.175                  13.146
6       12          0.143                  10.753
7       10          0.101                   7.538
8        9          0.062                   4.624
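Classes 1 and 2, and classes 7 and 8, are merged so that every expected frequency is at least 5.
Flaws   Observed         Probability by         Expected Frequency (Ei) =     |Oi-Ei|   (Oi-Ei)^2/Ei
        Frequency (Oi)   Poisson Distribution   (prob. x no. of specimens)
1,2     12               0.125                   9.398                        2.602     0.720
3        8               0.145                  10.921                        2.921     0.782
4       13               0.179                  13.397                        0.397     0.012
5       11               0.175                  13.146                        2.146     0.351
6       12               0.143                  10.753                        1.247     0.145
7,8     19               0.163                  12.162                        6.838     3.845
Chi-squared statistic (sum of the last column) = 5.8545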
The degrees of freedom for the data are k - p - 1 = 6 - 1 - 1 = 4 (k = 6 classes after
merging, p = 1 for Poisson)
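The last steps of the example can be checked with a short Python sketch that recomputes the chi-squared statistic from the merged classes and derives the degrees of freedom. The significance level of 0.05 and its tabulated critical value 9.488 are assumptions made here for illustration; the slides do not state which level was used. The p-value uses the closed-form upper tail of the chi-squared distribution, which holds only for an even number of degrees of freedom.

```python
import math

# Merged classes (1,2), 3, 4, 5, 6, (7,8) taken from the worked table
observed = [12, 8, 13, 11, 12, 19]
expected = [9.398, 10.921, 13.397, 13.146, 10.753, 12.162]

# Chi-squared statistic: sum of (O_i - E_i)^2 / E_i
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

k = len(observed)  # number of classes after merging
p = 1              # one parameter (the Poisson mean) estimated from the data
dof = k - p - 1

# Upper-tail probability; this closed form is valid only when dof is even:
# P(X > x) = exp(-x/2) * sum_{i=0}^{dof/2 - 1} (x/2)^i / i!
half = chi2 / 2
p_value = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(dof // 2))

# ASSUMPTION: 5% significance level; 9.488 is the tabulated critical value for 4 dof
critical = 9.488
decision = "reject H0" if chi2 > critical else "fail to reject H0"

print(f"chi-squared = {chi2:.3f}, dof = {dof}, p-value = {p_value:.3f}, {decision}")
```

Because the expected frequencies are entered rounded to three decimals, the recomputed statistic differs slightly from 5.8545. Under the assumed 5% level the statistic lies below the critical value, so this sketch would not reject the hypothesis that the flaws follow a Poisson distribution.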