Goodness of fit: How to decide which ratio to deal with?

  • Thread starter: Tyto alba
  • Tags: Fit, Ratio
SUMMARY

The discussion centers on determining the appropriate ratio for calculating the Chi-squared goodness of fit test statistic in biological data sets. Participants analyze two examples: one involving four types of seeds and another with tall and short plant varieties. They emphasize the importance of hypothesizing a theoretical distribution based on Mendelian genetics principles, specifically addressing ratios such as 9:3:3:1 and 3:1. The conversation highlights the complexity of selecting ratios when actual observations deviate from expected distributions.

PREREQUISITES
  • Understanding of Chi-squared goodness of fit tests
  • Familiarity with Mendelian genetics and inheritance ratios
  • Knowledge of statistical hypothesis testing
  • Ability to analyze biological data sets
NEXT STEPS
  • Study Chi-squared goodness of fit test applications in biological research
  • Learn about Mendelian inheritance patterns and their implications for data analysis
  • Explore statistical software tools for performing Chi-squared tests, such as R or Python's SciPy library
  • Investigate case studies that illustrate the application of theoretical distributions in biological experiments
USEFUL FOR

Biologists, geneticists, statisticians, and researchers analyzing genetic data who require a deeper understanding of statistical methods for hypothesis testing in biological contexts.

Tyto alba
The problem statement, all variables and given/known data with attempts
While solving goodness-of-fit problems, I run into an issue: how do I decide which ratio to use to compute the test statistic from a given set of observations?
E.g. 1
A supplied sample contains four types of seeds and the total number is 64. The types of seeds are large red 42, large white 8, small red 10 and small white 4. Calculate goodness of fit.

Attempts & Problem: Since df = 3, should the ratio be 9:3:3:1 or 1:1:1:1?

E.g.2
You are supplied with two different varieties of plant samples: tall (76) and short (24). Determine the observed number and apply the Chi-squared test to state whether it agrees with the expected ratio.

Attempts & Problem: Since df = 1, should the ratio be 3:1 or 1:1?

My professor told me, rather vaguely, that we should choose whichever ratio seems to apply (by logical guess) and use it to determine the expected values and thus the statistic.
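To make the choice concrete, here is a minimal Python sketch that computes the Pearson statistic for both examples under each candidate ratio. The helper name `chi_square_stat` is hypothetical, and the assignment of the 9:3:3:1 terms to the classes (large red : large white : small red : small white) is an assumption for illustration, since the problem does not say which traits are dominant. The critical values are the standard 5% points of the chi-squared distribution.

```python
def chi_square_stat(observed, ratio):
    """Pearson chi-squared statistic for observed counts against a hypothesized ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    # Expected count = total * (share of the hypothesized ratio).
    expected = [total * r / ratio_sum for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Standard 5% critical values of the chi-squared distribution, keyed by df.
CRITICAL_5PCT = {1: 3.841, 3: 7.815}

# E.g. 1: large red, large white, small red, small white (df = 4 - 1 = 3).
# Assumption: the 9:3:3:1 terms are assigned in this same order.
seeds = [42, 8, 10, 4]
seeds_9331 = chi_square_stat(seeds, [9, 3, 3, 1])
seeds_1111 = chi_square_stat(seeds, [1, 1, 1, 1])

# E.g. 2: tall, short (df = 2 - 1 = 1).
plants = [76, 24]
plants_31 = chi_square_stat(plants, [3, 1])
plants_11 = chi_square_stat(plants, [1, 1])

for label, stat, df in [
    ("seeds 9:3:3:1", seeds_9331, 3),
    ("seeds 1:1:1:1", seeds_1111, 3),
    ("plants 3:1", plants_31, 1),
    ("plants 1:1", plants_11, 1),
]:
    verdict = "reject" if stat > CRITICAL_5PCT[df] else "do not reject"
    print(f"{label}: chi2 = {stat:.3f}, df = {df}, {verdict} at the 5% level")
```

With these counts the Mendelian ratios give small statistics (about 2.67 and 0.05), while the equal ratios give large ones (57.5 and 27.04), so under this sketch only the equal-ratio hypotheses are rejected at the 5% level.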

I couldn't find any good reading on this, except sources that are full of mistakes. I've been reading statistics blogs to understand the concepts, but they didn't cover these typical biological problems, and those that did had the ratio already given.

I have another question in mind: in an experimental result it can also happen that one of the four types of seeds does not appear at all (given that mating is random and the progeny come from dihybrid crosses), so determining the underlying ratio becomes more difficult, since df = 2 ≠ 3!
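On this last point, a class with zero observations is still a class of the hypothesized distribution: its expected count stays positive, it contributes (0 - E)²/E = E to the statistic, and the degrees of freedom remain (number of hypothesized classes - 1) = 3. A small sketch with made-up counts (the `observed` values below are hypothetical, not taken from any real experiment, and the 9:3:3:1 hypothesis is assumed):

```python
# Hypothetical counts for a dihybrid cross in which no small white seeds appeared.
observed = [40, 12, 12, 0]      # large red, large white, small red, small white
ratio = [9, 3, 3, 1]            # assumed Mendelian hypothesis

total = sum(observed)
expected = [total * r / sum(ratio) for r in ratio]

# Every hypothesized class contributes, including the empty one.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2 = sum(contributions)

# df comes from the hypothesis (4 classes), not from how many classes were seen.
df = len(ratio) - 1

print("expected:", expected)
print("contributions:", contributions)
print(f"chi2 = {chi2:.3f}, df = {df}")
```

The empty class is compared against its expected count like any other, so the test is still run with df = 3.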
 
The Chi-squared goodness of fit test will test how well the data fits a hypothesized theoretical distribution. So you need to hypothesize a theoretical distribution. In problem 1, I can think of 5 possibilities:
1: Colors and sizes of seed equally likely and independent of each other. That would give 16 seeds expected in each category (small red, small white, large red, large white).
2: Colors as the data shows, sizes equally likely, independent. That would give expected totals of 52 red, 12 white, 32 large and 32 small (26 red small, 26 red large, 6 white small, 6 white large).
3: Colors equally likely, sizes as the data shows, independent. That would give expected totals of 32 red, 32 white, 50 large and 14 small (25 red large, 25 white large, 7 red small, 7 white small).
4: Colors and sizes as the data shows, independent. That would give expected totals of 52 red, 12 white, 50 large, 14 small (40.625 red large, 11.375 red small, 9.375 white large, 2.625 white small), rounded to (41 red large, 11 red small, 9 white large, 3 white small).
5: Colors and sizes as the data shows, and they are dependent. This would be the same as the sample data, and there is nothing to test the data against.

I would pick the first option simply because nothing in it is derived from the sample data. Basing any part of the theoretical distribution on the sample data is complicated and not covered by any statistical test that I know of.
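The expected counts quoted for options 1 to 4 can be reproduced with a short Python sketch. It assumes independence, so each expected cell is n × P(size) × P(color); the helper names `marginal` and `expected` are hypothetical and only illustrate the arithmetic.

```python
# Observed counts from example 1, keyed by (size, color).
observed = {
    ("large", "red"): 42,
    ("large", "white"): 8,
    ("small", "red"): 10,
    ("small", "white"): 4,
}
n = sum(observed.values())

def marginal(index):
    """Proportion of each level of one trait (0 = size, 1 = color) in the sample."""
    totals = {}
    for key, count in observed.items():
        totals[key[index]] = totals.get(key[index], 0) + count
    return {level: subtotal / n for level, subtotal in totals.items()}

def expected(size_p, color_p):
    """Expected cell counts when size and color are independent."""
    return {(s, c): n * size_p[s] * color_p[c] for s in size_p for c in color_p}

size_from_data = marginal(0)     # large 50/64, small 14/64
color_from_data = marginal(1)    # red 52/64, white 12/64
uniform_size = {"large": 0.5, "small": 0.5}
uniform_color = {"red": 0.5, "white": 0.5}

options = {
    1: expected(uniform_size, uniform_color),
    2: expected(uniform_size, color_from_data),
    3: expected(size_from_data, uniform_color),
    4: expected(size_from_data, color_from_data),
}

for number, cells in options.items():
    print(number, cells)
```

Only option 1 uses no information from the sample; options 2 to 4 take one or both sets of marginal proportions from the data.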
 
FactChecker said:
I can think of 5 possibilities:
Hi @FactChecker:

I think you omitted four other plausible models based on four plausible Mendelian assumptions:
(A) Color has Red (R) recessive and White (W) dominant, or vice versa. (i) With R dominant, the ratio of R to W would be 3:1. (ii) With W dominant, the ratio would be 1:3.
(B) Size has Small (S) recessive and Large (L) dominant, or vice versa. (i) With S dominant, the ratio of S to L would be 3:1. (ii) With L dominant, the ratio would be 1:3.
For all four of these assumptions it would also be assumed that color and size are independent.
The four models are as follows.
6. Ai and Bi: RS 36, RL 12, WS 12, WL 4
7. Ai and Bii: RL 36, RS 12, WL 12, WS 4
8. Aii and Bi: WS 36, WL 12, RS 12, RL 4
9. Aii and Bii: WL 36, WS 12, RL 12, RS 4

Regards,
Buzz
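As a rough illustration of models 6 to 9, the following Python sketch builds the expected counts from each pair of dominance assumptions and compares them with the counts from example 1. The function names are hypothetical, and the sketch assumes each trait segregates 3:1 independently, as stated above.

```python
from itertools import product

# Observed counts from example 1, keyed by (color, size).
observed = {("R", "L"): 42, ("W", "L"): 8, ("R", "S"): 10, ("W", "S"): 4}
total = sum(observed.values())

def mendelian_expected(dominant_color, dominant_size):
    """Expected counts when each trait independently shows 3:1 for its dominant form."""
    expected = {}
    for color, size in observed:
        p_color = 0.75 if color == dominant_color else 0.25
        p_size = 0.75 if size == dominant_size else 0.25
        expected[(color, size)] = total * p_color * p_size
    return expected

def chi_square_stat(expected):
    """Pearson chi-squared statistic of the observed counts against `expected`."""
    return sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)

# Order of iteration: (R, S), (R, L), (W, S), (W, L), i.e. models 6, 7, 8, 9.
results = {}
for dominant_color, dominant_size in product("RW", "SL"):
    model = mendelian_expected(dominant_color, dominant_size)
    results[(dominant_color, dominant_size)] = chi_square_stat(model)

# Print the models from best to worst fit.
for (dc, ds), stat in sorted(results.items(), key=lambda item: item[1]):
    print(f"{dc} dominant, {ds} dominant: chi2 = {stat:.2f}")
```

With these counts, only the model with red and large both dominant (model 7) gives a statistic below the 5% critical value of 7.815 for df = 3. Picking the best-fitting model after looking at the data is itself data-dependent, so knowledge of which traits are dominant should ideally fix the hypothesis beforehand.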
 
Buzz Bloom said:
Hi @FactChecker: I think you omitted four other plausible models based on four plausible Mendelian assumptions:
Good point. I was only thinking in terms of general statistics and forgot about recessive / dominant. I don't really know about that.
 
