5-Chi Square Analysis Tutorial


Chi Square Test

Chi Square (χ²) for goodness of fit

The χ² test is a statistical method that tests whether a given set of data fits a hypothesis.

It tests the probability of the fit: not whether the hypothesis is correct, but how probable it is that deviations as large as those observed would arise by chance if the hypothesis were true.

Can the test accept a hypothesis that is wrong? Can we reject a correct hypothesis? You bet! That’s why we call it probability and not certainty!

But in the following pages you will see how statistics can gauge
the confidence of our answer.

To perform the χ² statistical test you must:

1. Make a null hypothesis concerning your data
2. Predict the outcome of the data if your null hypothesis were correct
3. Establish the χ² value
4. Determine the degrees of freedom for the test
5. Determine the probability that your null hypothesis is correct
6. Accept or reject your null hypothesis based on the probability
7. Draw a conclusion

These 7 steps will be explained in detail in the following example.

PS – these are the steps I look for and score on any exam…

Example: If we observed 99 purple and 45 white flowers, is this a 3:1 ratio? Use χ² to test this hypothesis.

So we can write the problem as:

A 3:1 ratio is indicative of a monohybrid cross:


Pp X Pp -> 3 P_ : 1 pp

99 purple and 45 white flowers


total = 144

Are these progeny in a 3:1 ratio representative of a monohybrid cross?

In the next slides we will outline the χ² analysis for this process.

Step 1. State null hypothesis in detail

Hypothesis: These progeny are in a 3:1 ratio representative of a monohybrid cross.

Pp X Pp gives 3 Purple : 1 White

That is, the cross would give progeny P_ and pp in a 3:1 ratio.

• This is one of the most important steps that students often overlook
• Remember that if you don’t know your hypothesis you don’t know what
you are accepting or rejecting in the end

Step 2. Determine rules of probability to predict expected values

If this is a 3:1 ratio, then we would expect ¾ purple and ¼ white:

99 purple + 45 white = 144 total progeny


- So, we would expect for the Purple ¾ of 144 = 108.
- We would expect for the White ¼ of 144 = 36.

- 108 purple and 36 white are the expected values

- The given data are the observed values (i.e., in this case 99 purple and 45 white)
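The arithmetic in Step 2 can be sketched in a few lines of Python. This is only an illustration: the variable names (`observed`, `ratio`, `expected`) are made up for this sketch, and the proportions are simply the ¾ and ¼ stated above.

```python
from fractions import Fraction

# Observed counts from the example.
observed = {"purple": 99, "white": 45}

# Proportions predicted by the 3:1 null hypothesis.
ratio = {"purple": Fraction(3, 4), "white": Fraction(1, 4)}

total = sum(observed.values())  # 99 + 45 = 144

# Expected count for each class = predicted proportion x total progeny.
expected = {cls: ratio[cls] * total for cls in observed}

for cls in observed:
    print(f"{cls}: observed {observed[cls]}, expected {expected[cls]}")
```

Using `Fraction` keeps the expected counts exact, which avoids rounding questions when the total is not divisible by 4.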

Step 3. Establish the χ² value

First, determine the deviation of the observed data from the expected:

For each class:
Find the difference from the expected (observed - expected):
99 - 108 = -9 purple
45 - 36 = 9 white
Square the difference:
(-9)² = 81 purple
(9)² = 81 white
Divide by the expected:
81/108 = 0.75 purple
81/36 = 2.25 white

Total the results for all the classes of progeny:

χ² = 0.75 + 2.25 = 3.0

We can summarize Step 3 by the equation:

χ² = Σ (observed - expected)² / expected

So now that we have a number, what does this value mean?

To answer that question we have to remember that χ² is determined from the deviations from the expected values. Therefore, the smaller the number, the smaller the deviation from the expected values:
If χ² = 0, the data fit the hypothesis exactly and no difference is seen.
Conversely, the larger the number, the larger the deviation from the expected values.
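As a rough check on Step 3, the sum of squared deviations divided by the expected values can be computed directly. The helper name `chi_square` is hypothetical and chosen only for this sketch; the counts are the observed and expected values from the example.

```python
def chi_square(observed, expected):
    """Sum (observed - expected)^2 / expected over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [99, 45]   # purple, white
expected = [108, 36]  # 3/4 and 1/4 of 144

chi2 = chi_square(observed, expected)
print(chi2)  # 0.75 + 2.25
```

Passing identical observed and expected lists returns 0, which matches the statement that a value of 0 means the data fit the hypothesis exactly.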

Step 4. Determine the degrees of freedom for the test

The degrees of freedom are the number of classes that can vary independently.

With the total fixed, a single phenotype class cannot vary at all and so would have 0 degrees of freedom. With two phenotypes, once one count is known the other is determined, so there is 1 degree of freedom, etc. We can therefore summarize this by saying that:

Degrees of freedom = # of classes - 1

For our example, the degrees of freedom = 2 (purple and white) - 1 = 1



Step 5. Determine the probability that your null hypothesis is correct


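Up to now we still have not determined the “probability” that the data fit the hypothesis, only the sum of how far each class has deviated from the expected. Strictly speaking, the probability we are after is the chance of seeing a deviation at least this large if the null hypothesis is correct.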

We will take a little detour to demonstrate that probability depends on sample size:

(for 4 slides after this…)


Probability depends on sample size
Let’s look at 3 sample sizes: 4, 8, and 40. Here we are looking at tall vs. dwarf trees. Now let’s look at the predicted distribution of trees: the number of tall trees is on the left for each category and the probability of each outcome is on the right. Since tall is dominant, it is not surprising that outcomes with more tall trees are more probable. You will notice, though, that the larger the sample, the more closely the outcome is expected to match the prediction that ¾ should be tall.
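The predicted distribution described here can be approximated with a short calculation. This sketch assumes each tree is independently tall with probability ¾ (a binomial model); the function name `tall_distribution` and its parameter are illustrative only.

```python
from math import comb

def tall_distribution(n, p_tall=0.75):
    """Probability of each possible number of tall trees (0..n) among n progeny."""
    return [comb(n, k) * p_tall**k * (1 - p_tall)**(n - k) for k in range(n + 1)]

for n in (4, 8, 40):
    dist = tall_distribution(n)
    # Index of the largest probability = most likely number of tall trees.
    most_likely = max(range(n + 1), key=lambda k: dist[k])
    print(f"n={n}: most likely number of tall trees = {most_likely}, "
          f"probability = {dist[most_likely]:.3f}")
```

For every sample size the most likely outcome sits at ¾ tall, but the probability is spread over many more possible outcomes as n grows, which is why the plotted curve looks smoother.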

Let’s look at it in pictures.

As n gets larger, the curve gets smoother

So when you plot the results of the previous table you see that as you increase the sample size, the curve gets smoother, and the probability changes less from one outcome to the next. With 40 trees you get a bell-shaped curve. Let’s examine the last one in greater detail.
Step 5. Determine the probability that your null hypothesis is correct
Ratios within 95% limits are supportive of hypothesis

How do we know what probability is most likely to be correct?

If we look at the graph, 95% of the area under the curve lies in the central region and 5% is in the shoulder regions. This 5% is the data that is most in doubt. Do results that fall into this region mean that the hypothesis is incorrect? No. Do results that fall within the 95% area mean that the hypothesis is correct? No. But, depending upon sample size, if we are within those 95% confidence limits we accept the hypothesis.
Chi Square Test
• Degrees of freedom (df) are listed in the outer columns
• Probabilities head the interior columns
• The numbers in those columns are χ² values

If we go from the graph to the χ² table below we see columns with probabilities as headings. Within the body of the table are the χ² values that correspond to the probability at the top.

Table label: < 5% chance of being the correct hypothesis (bell curve shoulders)

Rarely will you find a χ² that is exact; usually it falls between 2 columns. This means that the probability is between 2 numbers as well.

For example:
0.05 < p < 0.1

This is the correct way to write probability.
End of

Step 5. Determine the probability that your null hypothesis is correct

To return to our example, to determine the probability that our data fit the null hypothesis, we will use both the degrees of freedom that you determined (1) and the χ² value (3.0) in the χ² table shown below.

• Find the df line you determined
• Locate the 2 columns that span the χ² value from your analysis
• Go to the top of the column for the probability (p)


Since we have 1 df and our χ² is 3.0, our probability is between 0.1 and 0.05.
We would write this as 0.1 > p > 0.05
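The table lookup can be mimicked in code as a sanity check. The two critical values below (2.706 and 3.841) are the standard 1-df entries for p = 0.10 and p = 0.05; the dictionary name `CRITICAL_DF1` is made up for this sketch. The closed-form tail probability using `erfc` holds only for 1 degree of freedom and is an addition for illustration, not part of the table method.

```python
from math import erfc, sqrt

# Critical chi-square values for 1 degree of freedom, keyed by probability.
# Only the two columns needed here are listed, from standard tables.
CRITICAL_DF1 = {0.10: 2.706, 0.05: 3.841}

chi2 = 3.0

# Table lookup: 3.0 lies between the p = 0.10 and p = 0.05 columns.
between = CRITICAL_DF1[0.10] < chi2 < CRITICAL_DF1[0.05]
print("0.1 > p > 0.05:", between)

# For 1 degree of freedom only, the exact tail probability has a closed form.
p_exact = erfc(sqrt(chi2 / 2))
print(f"p = {p_exact:.4f}")
```

The exact value falls inside the bracket read from the table, so both approaches lead to the same decision in Step 6.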

Step 6. Accept or reject the null hypothesis based on the probability

Since p is above 0.05, we accept (that is, fail to reject) the hypothesis.

Step 7. Draw the conclusion

These data for purple and white are consistent with a 3:1 ratio.

The χ² test can be used to test any type of genetic hypothesis using these 7 steps: monohybrids, dihybrids, testcrosses, and, as we’ll see, linkage of 2 genes.
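Because the same steps apply to any hypothesized ratio, Steps 2 through 6 can be wrapped in one small function. This is a minimal sketch, not a full statistics routine: `goodness_of_fit` and `CRITICAL_05` are hypothetical names, the critical values are the standard p = 0.05 table entries for 1 to 3 degrees of freedom, and the decision is a simple comparison against that column.

```python
# Critical chi-square values at p = 0.05 from standard tables, keyed by df.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}

def goodness_of_fit(observed, ratio):
    """Steps 2-6 for any number of phenotype classes.

    observed: counts per class; ratio: hypothesized ratio, e.g. (3, 1).
    """
    total = sum(observed)
    parts = sum(ratio)
    expected = [total * r / parts for r in ratio]                     # Step 2
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))  # Step 3
    df = len(observed) - 1                                            # Step 4
    accept = chi2 < CRITICAL_05[df]                                   # Steps 5-6
    return chi2, df, accept

print(goodness_of_fit([99, 45], (3, 1)))
```

A dihybrid cross would be tested the same way by passing four observed counts and the ratio `(9, 3, 3, 1)`, which gives 3 degrees of freedom.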
Here are 2 of Mendel’s experiments to practice with. You can use the answers to check yourself.

Here you see the same data assuming a hypothesis of 1:1. See how different the outcome is?
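To see how much the choice of hypothesis matters, the same comparison can be tried on the flower example from earlier. The practice data sets are not reproduced in this text, so this sketch uses the 99 purple and 45 white counts instead. The `results` dictionary is an illustrative name, and the `erfc` formula for the probability is valid only for 1 degree of freedom.

```python
from math import erfc, sqrt

observed = [99, 45]  # purple, white
total = sum(observed)

results = {}
for name, ratio in (("3:1", (3, 1)), ("1:1", (1, 1))):
    expected = [total * r / sum(ratio) for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p = erfc(sqrt(chi2 / 2))  # exact tail probability, valid for 1 df only
    results[name] = (chi2, p)
    print(f"{name}: chi-square = {chi2:.2f}, p = {p:.2g}")
```

Under the 1:1 hypothesis the expected counts are 72 and 72, the deviations are much larger, and the probability drops far below 0.05, so that hypothesis would be rejected for these data.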
