Sample variances & ANOVA: How different is too different?

  • Context: Graduate 
  • Thread starter: Rasalhague
  • Tags: anova

Discussion Overview

The discussion revolves around the application of analysis of variance (ANOVA) in the context of varying sample variances from different teaching methods. Participants explore the assumptions involved in performing ANOVA and question the criteria for determining when sample variances are considered "too different" for analysis.

Discussion Character

  • Exploratory
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Koosis presents a scenario with three groups of students and their respective sample variances, questioning the assumptions needed for ANOVA and how to determine if the variances are too different.
  • Some participants suggest that the populations should be viewed as either infinite or finite sets of students, raising philosophical questions about the nature of the populations involved.
  • One participant mentions the central limit theorem and various statistical rules of thumb, seeking clarification on what specific criterion Koosis might be using to assess the differences in sample variances.
  • Another participant expresses skepticism about the rigor of applying statistics to real-world problems, suggesting that the assumption of a normal population complicates the interpretation of results.
  • A later reply references the Bonferroni correction as a potential consideration, although it is noted that this may not align with Koosis's reasoning.

Areas of Agreement / Disagreement

Participants express differing views on the nature of populations in statistical analysis and the criteria for applying ANOVA, indicating that multiple competing perspectives remain without a consensus on the question of how different is too different.

Contextual Notes

Participants highlight the limitations of applying statistical methods to finite populations and the assumptions underlying the normality of distributions, which remain unresolved in the discussion.

Rasalhague
Koosis: Statistics..., 4th ed., p. 177:

A number of students is assigned randomly to three classes with three different teaching methods. The following statistics summarize the performance of the three groups [...] Can you perform an analysis of variance with these data? What assumptions are involved?

Group 1: n = 10, s^2 = 100.
Group 2: n = 11, s^2 = 81.
Group 3: n = 8, s^2 = 64.

The given answer is yes. "You must be prepared to assume the populations from which you are sampling are normally distributed" and that "the teaching method is the only reasonable explanation of the differences between groups."

In the next problem, the situation is the same, but the data are different. Now the sample variances are 144, 81 and 64. Can ANOVA be done on these data? The given answer is no: the sample variances are too different.

The obvious question: How different is too different?

One thought I had was to do an F test, with the null hypothesis that the population variances are equal and the alternative that the population with the largest sample variance has a larger population variance than the one with the smallest. But in Excel, at a 5% significance level, I get a critical value of 3.68. The F ratios of largest to smallest sample variance in both problems are below this: 100/64 = 1.56 and 144/64 = 2.25. So I guess this isn't the criterion. But why not? And what is?
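For anyone who wants to reproduce that check without Excel, here is a minimal sketch in plain Python. It assumes the degrees of freedom are n - 1 for the groups with the largest and smallest sample variances (9 and 7), and it gets the upper 5% point of the F distribution by numerically integrating the beta density and bisecting. The helper names (`beta_cdf`, `f_cdf`, `f_critical`) are made up for this illustration, and this is only one way to do the calculation, not necessarily what Koosis has in mind.

```python
import math


def beta_cdf(x, a, b, steps=2000):
    """Regularized incomplete beta I_x(a, b) via Simpson's rule (assumes a, b > 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    h = x / steps

    def density(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    integral = total * h / 3
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return integral / math.exp(log_beta)


def f_cdf(f_value, df1, df2):
    """CDF of the F distribution, expressed through the incomplete beta function."""
    x = df1 * f_value / (df1 * f_value + df2)
    return beta_cdf(x, df1 / 2, df2 / 2)


def f_critical(alpha, df1, df2):
    """Upper-tail critical value: the f with P(F > f) = alpha, found by bisection."""
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if f_cdf(mid, df1, df2) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Largest variance comes from group 1 (n = 10), smallest from group 3 (n = 8).
critical = f_critical(0.05, df1=10 - 1, df2=8 - 1)
ratio_first_problem = 100 / 64
ratio_second_problem = 144 / 64

print(f"critical value: {critical:.2f}")
for label, ratio in [("first", ratio_first_problem), ("second", ratio_second_problem)]:
    verdict = "reject" if ratio > critical else "do not reject"
    print(f"{label} problem: F = {ratio:.2f}, {verdict} equal variances")
```

Under these assumptions neither ratio exceeds the critical value, which agrees with the Excel figures quoted above.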

Another question: what are the populations in this case: three sets each consisting of a hypothetical continuum of infinitely many identical students? Or three copies of the same finite set of actual students, depending on context? Or some ill-defined three copies of the same large, but finite set of all students in history who might conceivably be taught, or have been taught, by these methods, whose population parameters are only approximated by the normal probability measure? Or is it not advisable to think too hard about what population means in such cases?
 
You have a very thorough and rigorous approach to studying mathematics, so I can't resist asking why you are bothering to study statistics. Applying statistics to anything is a largely subjective and non-rigorous activity!

If a test involving sampling assumes "a normal population" then the bottom line for the population must be that independently drawn samples from it have a normal distribution, so the mathematically simplest way to visualize the population of students would be to visualize an infinite population of them, having a continuum of values. Of course, taking that idea seriously would rule out applying statistics to many real world problems, so your thought that "it is not advisable to think too hard" about the population is the one that is usually applied.
 
Okay, thought center deactivated : )

But, philosophy aside, I guess there's some rule of thumb though, at least? I read that the central limit theorem applies for a given, fixed sample size of at least 30, and that the binomial distribution is a good approximation for the hypergeometric when the population is at least 20 times larger than the sample size, and that each cell should have a value of at least 5 for the chi square test to give reasonable results. What rule of thumb is Koosis applying in this case? By what criterion - however rough and subjective - would you decide whether sample variances were too different to apply ANOVA?
 
I don't know the answer to question "how different is too different". Tutorials on the web apply the F test, just as you did to investigate the equality of variances.

I doubt the following is Koosis's reason, but it would be an excuse to reactivate the thought center: Bonferroni correction: http://en.wikipedia.org/wiki/Bonferroni_correction
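To make the Bonferroni idea concrete, here is a small sketch of what the correction would do if one ran a variance-ratio test on every pair of groups rather than only on the largest versus the smallest. The choice to test all pairs, and the 5% overall level, are assumptions for illustration; nothing here is claimed to be Koosis's criterion.

```python
import math

groups = 3
overall_alpha = 0.05

# Number of distinct pairs of groups that could be compared.
pairwise_tests = math.comb(groups, 2)

# Bonferroni: run each individual test at alpha / m so that the chance of
# at least one false rejection across all m tests stays at or below alpha.
per_test_alpha = overall_alpha / pairwise_tests

print(f"{pairwise_tests} pairwise tests, each at alpha = {per_test_alpha:.4f}")
```

A smaller per-test level makes each individual test harder to reject, so the correction would push the threshold for "too different" up, not down.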
 
