Goodness of fit: How to decide which ratio to deal with?

Tyto alba · Mar 13, 2017

The problem statement, all variables and given/known data with attempts
While solving goodness-of-fit problems, I keep running into the same issue: how do I decide which ratio to use when computing the test statistic from a given set of observations?
E.g. 1
A supplied sample contains four types of seeds and the total number is 64. The types of seeds are large red 42, large white 8, small red 10 and small white 4. Calculate goodness of fit.

Attempts & Problem: Since df = 3, should the ratio be 9:3:3:1 or 1:1:1:1?
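One way to see what is at stake is to compute the statistic under both candidate ratios. The following is a minimal Python sketch, not a definitive method. It assumes the 9:3:3:1 classes line up as large red : large white : small red : small white (that is, large and red are taken as the dominant traits, which the problem does not state) and uses the tabulated 5% critical value 7.815 for df = 3. The helper name `chi_square` is my own.

```python
# Example 1 counts, in the order: large red, large white, small red, small white.
observed = [42, 8, 10, 4]


def chi_square(observed, ratio):
    """Pearson chi-square statistic for observed counts against a hypothesized ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Tabulated upper 5% critical value of chi-square with df = 4 - 1 = 3.
CRITICAL_DF3 = 7.815

# ASSUMPTION: the 9 goes with large red and the 1 with small white.
candidate_ratios = {"9:3:3:1": [9, 3, 3, 1], "1:1:1:1": [1, 1, 1, 1]}

results = {}
for name, ratio in candidate_ratios.items():
    stat = chi_square(observed, ratio)
    results[name] = stat
    verdict = "reject" if stat > CRITICAL_DF3 else "do not reject"
    print(f"{name}: chi2 = {stat:.3f} -> {verdict} at the 5% level")
```

With these counts the statistic is about 2.67 for 9:3:3:1 and 57.5 for 1:1:1:1, so only the 1:1:1:1 hypothesis is rejected at the 5% level.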

E.g. 2
You are supplied with two different varieties of plant samples: tall 76 and short 24. Determine the observed numbers and apply the chi-square test to state whether they agree with the expected ratio.

Attempts & Problem: Since df = 1, should the ratio be 3:1 or 1:1?
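The same comparison can be sketched for the tall/short data. This is only an illustration: it assumes the 3 in 3:1 belongs to tall, and it uses the tabulated 5% critical value 3.841 for df = 1. The helper name `chi_square` is my own.

```python
# Example 2 counts, in the order: tall, short.
observed = [76, 24]


def chi_square(observed, ratio):
    """Pearson chi-square statistic for observed counts against a hypothesized ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Tabulated upper 5% critical value of chi-square with df = 2 - 1 = 1.
CRITICAL_DF1 = 3.841

# ASSUMPTION: tall is the class expected in the larger proportion.
candidate_ratios = {"3:1": [3, 1], "1:1": [1, 1]}

results = {}
for name, ratio in candidate_ratios.items():
    stat = chi_square(observed, ratio)
    results[name] = stat
    verdict = "reject" if stat > CRITICAL_DF1 else "do not reject"
    print(f"{name}: chi2 = {stat:.4f} -> {verdict} at the 5% level")
```

Here the statistic is about 0.053 for 3:1 and 27.04 for 1:1, so the data are consistent with 3:1 but not with 1:1.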

My professor told me, rather vaguely, that we should choose whichever ratio seems to apply (by logical guess) and use it to determine the expected values and thus the statistic.

I couldn't find any good reading on this, apart from sources that are full of mistakes. I've been reading statistics blogs to understand the concepts, but they didn't cover these typical biology problems, and the ones that did already stated the ratio.

I have another question: in an experiment it may also happen that one of the four seed types doesn't appear at all (given that mating is random and the progeny come from dihybrid crosses), so working out the underlying ratio becomes more difficult because df = 2 ≠ 3!
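On the degrees-of-freedom worry, here is a small hedged sketch. In a goodness-of-fit test the degrees of freedom come from the number of classes in the hypothesized ratio, so a class with an observed count of zero still counts as a class. The counts below are invented purely for illustration and do not come from either problem. The helper name `chi_square` is my own.

```python
def chi_square(observed, ratio):
    """Pearson chi-square statistic for observed counts against a hypothesized ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# HYPOTHETICAL counts, made up for illustration only:
# the fourth class was never observed in this imaginary sample of 64.
hypothetical_observed = [40, 12, 12, 0]
ratio = [9, 3, 3, 1]

stat = chi_square(hypothetical_observed, ratio)

# df is set by the hypothesized classes (4 - 1 = 3), not by how many
# classes happened to appear in the sample.
df = len(ratio) - 1

print(f"chi2 = {stat:.3f} with df = {df}")
```

The empty class still contributes (0 - 4)^2 / 4 = 4 to the statistic, so it is not dropped from the test.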

FactChecker · Mar 14, 2017

The Chi-squared goodness of fit test will test how well the data fits a hypothesized theoretical distribution. So you need to hypothesize a theoretical distribution. In problem 1, I can think of 5 possibilities:
1: Colors and sizes of seed equally likely and independent of each other. That would give 16 seeds expected in each category (small red, small white, large red, large white).
2: Colors as the data shows, sizes equally likely, independent. That would give expected totals of 52 red, 12 white, 32 large and 32 small (26 red small, 26 red large, 6 white small, 6 white large).
3: Colors equally likely, sizes as the data shows, independent. That would give expected totals of 32 red, 32 white, 50 large and 14 small (25 red large, 25 white large, 7 red small, 7 white small).
4: Colors and sizes as the data shows, independent. That would give expected totals of 52 red, 12 white, 50 large and 14 small (40.625 red large, 11.375 red small, 9.375 white large, 2.625 white small), rounded to (41 red large, 11 red small, 9 white large, 3 white small).
5: Colors and sizes as the data shows, and they are dependent. This would be the same as the sample data, and there is nothing to test the data against.

I would pick the first option simply because nothing in it is derived from the sample data. Basing any part of the theoretical distribution on the sample data is complicated and not covered by any statistical test that I know of.
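For reference, options 1 to 4 can be put side by side in a short Python sketch. Two things here are assumptions that go beyond the post: the unrounded expected values are used for option 4, and one degree of freedom is subtracted for each proportion estimated from the sample (the usual textbook adjustment), which leaves df = 3, 2, 2 and 1. The critical values are the tabulated 5% points. All names are my own.

```python
observed = {"large red": 42, "large white": 8, "small red": 10, "small white": 4}
n = sum(observed.values())

# Marginal proportions taken from the sample (52/64 red, 50/64 large).
obs_p_red = (observed["large red"] + observed["small red"]) / n
obs_p_large = (observed["large red"] + observed["large white"]) / n


def expected_counts(p_red, p_large):
    """Expected counts when color and size are independent."""
    return {
        "large red": n * p_red * p_large,
        "large white": n * (1 - p_red) * p_large,
        "small red": n * p_red * (1 - p_large),
        "small white": n * (1 - p_red) * (1 - p_large),
    }


def chi_square(expected):
    return sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)


# Tabulated upper 5% critical values of chi-square.
CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815}

# option -> (P(red), P(large), number of proportions estimated from the data)
models = {
    1: (0.5, 0.5, 0),
    2: (obs_p_red, 0.5, 1),
    3: (0.5, obs_p_large, 1),
    4: (obs_p_red, obs_p_large, 2),
}

stats = {}
for option, (p_red, p_large, n_estimated) in models.items():
    stat = chi_square(expected_counts(p_red, p_large))
    # ASSUMPTION: lose one df per proportion estimated from the sample.
    df = len(observed) - 1 - n_estimated
    stats[option] = (stat, df)
    verdict = "reject" if stat > CRITICAL_5PCT[df] else "do not reject"
    print(f"option {option}: chi2 = {stat:.3f}, df = {df} -> {verdict}")
```

Under these assumptions, option 4 amounts to the ordinary chi-square test of independence on a 2×2 table, and it is the only one of the four that is not rejected at the 5% level.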

Buzz Bloom · Mar 14, 2017

FactChecker said:

I can think of 5 possibilities:

Hi @FactChecker:

I think you omitted four other plausible models based on four plausible Mendelian assumptions:
(A) Color has Red (R) recessive and White (W) dominant, or vice versa. (i) With R dominant, the ratio of R to W would be 3:1. (ii) With W dominant, the ratio would be 1:3.
(B) Size has Small (S) recessive and Large (L) dominant, or vice versa. (i) With S dominant, the ratio of S to L would be 3:1. (ii) With L dominant, the ratio would be 1:3.
For all four of these assumptions it would also be assumed that color and size are independent.
The four models are as follows.
6. Ai and Bi: RS 36, RL 12, WS 12, WL 4
7. Ai and Bii: RL 36, RS 12, WL 12, WS 4
8. Aii and Bi: WS 36, WL 12, RS 12, RL 4
9. Aii and Bii: WL 36, WS 12, RL 12, RS 4

Regards,
Buzz
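The four Mendelian models above can be checked against the observed seed counts with a few lines of Python. This is a sketch only: the expected counts are copied from models 6 to 9, the observed counts are those of E.g. 1, and 7.815 is the tabulated 5% critical value for df = 3. The variable names are my own.

```python
# Observed counts from E.g. 1 (R = red, W = white, L = large, S = small).
observed = {"RL": 42, "WL": 8, "RS": 10, "WS": 4}

# Expected counts for the four Mendelian models, as listed in the post.
models = {
    "6": {"RS": 36, "RL": 12, "WS": 12, "WL": 4},
    "7": {"RL": 36, "RS": 12, "WL": 12, "WS": 4},
    "8": {"WS": 36, "WL": 12, "RS": 12, "RL": 4},
    "9": {"WL": 36, "WS": 12, "RL": 12, "RS": 4},
}

# Tabulated upper 5% critical value of chi-square with df = 4 - 1 = 3.
CRITICAL_DF3 = 7.815

stats = {}
for name, expected in models.items():
    stat = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
    stats[name] = stat
    verdict = "reject" if stat > CRITICAL_DF3 else "do not reject"
    print(f"model {name}: chi2 = {stat:.3f} -> {verdict} at the 5% level")

# Model with the smallest statistic, i.e. the closest fit of the four.
best = min(stats, key=stats.get)
print("closest fit: model", best)
```

Only model 7 (red and large both dominant) gives a statistic below the critical value, at about 2.67. The other three come out above 100.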

FactChecker · Mar 14, 2017

Buzz Bloom said:

Hi @FactChecker: I think you omitted four other plausible models based on four plausible Mendelian assumptions:

Good point. I was only thinking in terms of general statistics and forgot about recessive / dominant. I don't really know about that.

