Comparing model fits for many populations - F statistic

In summary, the conversation discusses using an F-test to compare the fits of 2-parameter and 5-parameter models in protein dynamics research. However, this approach only allows for model comparison within each experiment and does not consider the overall behavior of the data. The speaker suggests using a meta-analysis or mixed effects model to address this issue, and also mentions the potential of consolidating experiments using a hierarchical model.
  • #1
bamm7382
Hey all,

I'm doing some research on protein dynamics which involves fitting models to data. Basically, we think that our system is better described by a more complex, 5-parameter model than by the generally accepted 2-parameter model. The data we're working with were acquired from experiments where we added a chemical species to a solution containing a receptor. The amount of binding was then quantified by fluorescence that occurs only upon binding. We ran many experiments of this type since there is a lot of error involved. The experiments were run in slightly different ways, with different concentrations of some chemical species.

For each experiment, we fit curves describing both models to the data by a least squares approach. Naively comparing the sum of squared errors between the two fits won't work, since the more complex model will always fit at least as well. So, we employed an F-test to compare the fits, since the two models are nested. This produces good results that support our hypothesis in most cases. However, we must run an F-test for each experiment and can only look at model comparison within the experiment itself.

What we want to do now is perform a single, final comparison that takes all of the data (all experiments) into account, allowing us to say with what certainty we can choose the more complex model. I'm very clueless as to how to do this and I'm not sure that it can even be done at all! Biological systems are inherently different, even when they are identical on a genetic level. So, perhaps it is fundamentally flawed to ask about the relationship among independent samples.

Is there a way to compare F statistics for slightly dissimilar, unique experiments? If not, is there a way to consolidate these experiments before testing a hypothesis in order to look at overall behavior? Perhaps the best we can do is simply compare p-values for each experiment.

Thank you for any help - I'm sure there is much statistical intuition I can gain from working this out.
 
  • #2


Hello,

Thank you for sharing your research with us. Your approach of using an F-test to compare the fits of the 2-parameter and 5-parameter models seems appropriate, as the models are nested and the F-test is designed for such comparisons. However, as you mentioned, this approach only allows for model comparison within each experiment and does not take into account the overall behavior of the data.

To address this issue, one approach you could consider is a meta-analysis. This involves combining the results from multiple studies (in this case, experiments) into an overall estimate of the effect, here the improvement in fit of the 5-parameter model over the 2-parameter model. This can be done in statistical software such as R or Stata, both of which have packages designed for meta-analysis.
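One of the simplest meta-analytic tools is Fisher's method for combining p-values. The sketch below is only an illustration, not something the original poster used: it assumes the experiments are independent and that each per-experiment F-test p-value is valid. The function name and the example p-values are hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method."""
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    # Under the null hypothesis, X = -2 * sum(ln p_i) follows a chi-square
    # distribution with 2k degrees of freedom. We work with X / 2 directly.
    half_x = -sum(math.log(p) for p in p_values)
    # For an even number (2k) of degrees of freedom the chi-square upper
    # tail has the closed form exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# Hypothetical per-experiment p-values from the individual F-tests
per_experiment_p = [0.04, 0.20, 0.01, 0.08]
combined_p = fisher_combined_p(per_experiment_p)
print(f"combined p-value = {combined_p:.4g}")
```

A small combined p-value says that the simpler model is rejected in at least some experiments. It does not say that the more complex model is better in every experiment, so it is worth reading alongside the individual results.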

Another option is to use a mixed effects model, which takes into account the variation between experiments and allows for the comparison of the two models across all experiments. This approach may be more appropriate if the experiments were not completely independent, as it allows for the inclusion of random effects to account for potential correlations between experiments.

In terms of consolidating the experiments before testing a hypothesis, this could also be done using a mixed effects model or by using a hierarchical model, which allows for the inclusion of both fixed and random effects. This would allow you to compare the models while taking into account the differences between experiments.
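Short of a full mixed effects or hierarchical model, a simpler way to consolidate the experiments is a pooled F-test: add up the residual sums of squares and the degrees of freedom over all experiments and compute one F statistic. The sketch below assumes that each experiment gets its own parameter set under both models and that the measurement noise has the same variance in every experiment. Neither assumption comes from the original post, and the function name and numbers are hypothetical.

```python
def pooled_f_statistic(experiments, p_simple=2, p_complex=5):
    """Pooled extra-sum-of-squares F statistic over several experiments.

    experiments: list of (rss_simple, rss_complex, n_points) tuples,
    one tuple per experiment, from separate least-squares fits.
    """
    k = len(experiments)
    if k == 0:
        raise ValueError("need at least one experiment")
    rss_simple_total = sum(e[0] for e in experiments)
    rss_complex_total = sum(e[1] for e in experiments)
    n_total = sum(e[2] for e in experiments)
    # Every experiment contributes its own extra parameters and residual dof.
    df_num = k * (p_complex - p_simple)
    df_den = n_total - k * p_complex
    if df_den <= 0:
        raise ValueError("not enough data points for the complex model")
    f_stat = ((rss_simple_total - rss_complex_total) / df_num) / (
        rss_complex_total / df_den
    )
    return f_stat, df_num, df_den


# Hypothetical values for two experiments with 25 data points each
experiments = [(10.0, 6.0, 25), (8.0, 5.0, 25)]
pooled_f, df_num, df_den = pooled_f_statistic(experiments)
print(f"pooled F = {pooled_f:.3f} on ({df_num}, {df_den}) degrees of freedom")
```

If the noise level clearly differs between experiments, this pooling is not appropriate as written, and a mixed effects or hierarchical model is the better route.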

I hope this helps and provides some insight into potential approaches for your research. Good luck with your study!
 

What is the F statistic and why is it used in comparing model fits for many populations?

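The F statistic is a ratio of two variance estimates. In analysis of variance it is used to assess whether two or more groups have different means. In nested model comparison, as in this thread, it is used to assess whether the extra parameters of the more complex model reduce the residual sum of squares by more than would be expected by chance. With many populations (here, experiments), the test applies to each data set separately unless the data are pooled or modeled jointly.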

How is the F statistic calculated?

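For two nested least-squares fits, the F statistic is the reduction in the residual sum of squares per added parameter, divided by the residual variance of the more complex model: F = [(RSS_1 - RSS_2) / (p_2 - p_1)] / [RSS_2 / (n - p_2)], where RSS_1 and p_1 belong to the simpler model, RSS_2 and p_2 to the more complex model, and n is the number of data points. In analysis of variance the same kind of ratio appears as the between-group mean square divided by the within-group mean square.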
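For the nested-model comparison discussed in this thread, the calculation for a single experiment can be sketched as follows. This is a minimal illustration that assumes the residual sums of squares from both least-squares fits are already available. The function name and the example numbers are hypothetical; the default parameter counts (2 and 5) are the ones from the original post.

```python
def nested_f_statistic(rss_simple, rss_complex, n_points, p_simple=2, p_complex=5):
    """Extra-sum-of-squares F statistic for two nested least-squares fits."""
    df_num = p_complex - p_simple   # number of extra parameters
    df_den = n_points - p_complex   # residual degrees of freedom, complex fit
    if df_num <= 0 or df_den <= 0:
        raise ValueError("need p_complex > p_simple and n_points > p_complex")
    # Improvement in fit per extra parameter, relative to the residual
    # variance estimated from the more complex model.
    f_stat = ((rss_simple - rss_complex) / df_num) / (rss_complex / df_den)
    return f_stat, df_num, df_den


# Hypothetical numbers for one experiment with 25 data points
f_stat, df_num, df_den = nested_f_statistic(
    rss_simple=10.0, rss_complex=6.0, n_points=25
)
print(f"F = {f_stat:.3f} on ({df_num}, {df_den}) degrees of freedom")
```

The p-value is the upper tail of the F distribution with (df_num, df_den) degrees of freedom. If SciPy is available, `scipy.stats.f.sf(f_stat, df_num, df_den)` returns it.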

What does a high F statistic indicate?

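In nested model comparison, a high F statistic indicates that the more complex model reduces the residual error by more than its extra parameters would be expected to by chance, which is evidence in favor of the more complex model for that data set. In analysis of variance, it indicates that the between-group variance is large relative to the within-group variance. Whether a given value counts as high depends on the degrees of freedom, so it is judged through the corresponding p-value.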

What is the significance level for the F statistic?

The significance level is a threshold chosen before the test, not something determined by the data. The p-value is the probability of obtaining an F statistic at least as extreme as the one observed if the null hypothesis is true (here, that the simpler model is adequate). A conventional significance level is 0.05, meaning that a p-value below 0.05 is considered statistically significant.

Can the F statistic be used to compare any type of model?

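No, not any pair of models. The F-test described here requires the models to be nested, meaning the simpler model is a special case of the more complex one, and both must be fit to the same data by least squares. The test is exact for linear models with independent, normally distributed errors of constant variance. For nonlinear least-squares fits, such as the ones in this thread, it is commonly used but holds only approximately. Models that are not nested need a different criterion.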
