MHB Confusion about greater variance in the numerator for F ratio

SUMMARY

The discussion centers on the F ratio and its relationship with the F distribution, specifically the confusion about which variance belongs in the numerator when forming the ratio of sample variances from two normally distributed populations. The replies clarify that the ratio has to be defined consistently, with the same sample's variance in the numerator for every pair of samples. Putting the variance expected to be larger on top is only a convenience: typical F-tables list values greater than 1, and a ratio below 1 can be handled by looking up $\frac 1 F$ instead.

PREREQUISITES
  • Understanding of F Distribution and its properties
  • Knowledge of sample variances and their calculation
  • Familiarity with statistical hypothesis testing
  • Basic concepts of normal distribution and variance
NEXT STEPS
  • Study the derivation and properties of the F Distribution
  • Learn how to interpret F-tables for hypothesis testing
  • Explore the implications of variance selection in statistical analysis
  • Investigate the relationship between sample size and variance estimation
USEFUL FOR

Statisticians, data analysts, and researchers involved in hypothesis testing and variance analysis will benefit from this discussion, particularly those working with F ratios in their statistical models.

dhiraj
Hi,

I am studying the F ratio and how, as a random variable, it follows the F distribution. Let me explain what confuses me.

This is what the theory says -- We draw two random samples $sample_x$ and $sample_y$ from two different Normally distributed populations with equal variance $\sigma^2$. Let the sample variances of these samples be $s^2_x$ and $s^2_y$ respectively. The sample size for $sample_x$ is $n_x$ and the sample size for $sample_y$ is $n_y$. Then we form the random variable $\frac{s^2_x}{s^2_y}$, such that the greater variance (whichever is the greater variance in that sample pair) must appear as the numerator. This is what I am not able to understand.

If it's a random variable with a sampling distribution for that ratio, it means that if we draw a random sample pair (x, y) with fixed sizes $n_x$ and $n_y$ many times from their respective parent populations (say 1000 times), we will get 1000 pairs of sample variances ($s^2_x$, $s^2_y$). Now if we have to draw a histogram for the F distribution, we have to calculate 1000 ratios, one from each of the 1000 variance pairs ($s^2_x$, $s^2_y$). And the theory says that the greater variance has to appear as the numerator in the ratio. Now how can it be fixed? Across all the 1000 pairs it may change: in some of the pairs sample x (the first) may have the higher variance, and in some of them sample y (the second) may have the greater variance. If we have to have a common fixed formula for the random variable (supposedly $\frac{s^2_x}{s^2_y}$), how can it change from pair to pair? It has to remain fixed for all the 1000 instances. This is my dilemma.

Can you try to explain?

Thanks,
Dhiraj
 
Hi dhiraj! Welcome to MHB! (Smile)

We don't have to put the largest variance on top - it's a convenience.

Note that if the two sample variances are the same, we have:
$$F=\frac{s_x^2}{s_y^2}=1$$
If the sample variances are different, we will either have $F>1$ or $F<1$.
It's just that typical $F$-tables only list $F$-values greater than $1$, which makes sense because we can also look up $\frac 1 F$, which is what we have if we put the other variance on top.
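
To spell out the $\frac 1 F$ lookup, here is the standard reciprocal property of the F distribution, written with the sample variances $s_x^2$, $s_y^2$ and sample sizes $n_x$, $n_y$ from the question. The quantile notation $F_{p}(d_1,d_2)$ (the value below which a fraction $p$ of the $F(d_1,d_2)$ distribution lies) is my own shorthand and may differ from the notation in a particular table:
$$F=\frac{s_x^2}{s_y^2}\sim F(n_x-1,\,n_y-1)\quad\Longrightarrow\quad\frac 1 F=\frac{s_y^2}{s_x^2}\sim F(n_y-1,\,n_x-1)$$
$$F_{\alpha}(d_1,d_2)=\frac{1}{F_{1-\alpha}(d_2,d_1)}$$
So a critical value below $1$ for one ordering can be read off a table of values above $1$ by swapping the two degrees of freedom and taking the reciprocal.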

So yes, we should always put the same variance on top, because we should indeed be consistent.
And initially (or afterwards) we might make an 'educated guess', which variance we think will be bigger, and put it on top, just so we get 'nice' numbers (that are mostly greater than $1$).
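
The 1000-pair experiment described in the question can be sketched with a small simulation. This is only an illustration under assumed values: the sample sizes (10 and 15), the common standard deviation (2.0), the seed, and the function name `simulate_variance_ratios` are all hypothetical choices, not something given in the thread. It compares the fixed ratio $s_x^2/s_y^2$ with the "larger on top" ratio computed separately for each pair.

```python
import random
import statistics


def simulate_variance_ratios(n_x=10, n_y=15, sigma=2.0, trials=1000, seed=0):
    """Draw `trials` sample pairs from two normal populations with equal
    variance and return two lists of variance ratios.

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    fixed = []   # always s_x^2 / s_y^2 (same variance on top every time)
    folded = []  # larger sample variance on top, decided pair by pair
    for _ in range(trials):
        x = [rng.gauss(0.0, sigma) for _ in range(n_x)]
        y = [rng.gauss(0.0, sigma) for _ in range(n_y)]
        s2_x = statistics.variance(x)  # sample variance, divisor n_x - 1
        s2_y = statistics.variance(y)  # sample variance, divisor n_y - 1
        fixed.append(s2_x / s2_y)
        folded.append(max(s2_x, s2_y) / min(s2_x, s2_y))
    return fixed, folded


fixed, folded = simulate_variance_ratios()

# The fixed ratio falls on both sides of 1, as an F-distributed variable should.
share_below_one = sum(r < 1 for r in fixed) / len(fixed)
print(f"fixed ratio:  share below 1 = {share_below_one:.3f}, "
      f"mean = {statistics.mean(fixed):.3f}")

# The pair-by-pair 'larger on top' ratio can never fall below 1,
# so it is a different random variable from the fixed ratio.
print(f"folded ratio: minimum = {min(folded):.3f}, "
      f"mean = {statistics.mean(folded):.3f}")
```

With the same variance always on top, the simulated ratios land on both sides of $1$, and their average should sit near the theoretical mean of $F(n_x-1,\,n_y-1)$, which is $\frac{n_y-1}{n_y-3}$ when $n_y>3$. Swapping numerator and denominator from pair to pair produces values that are never below $1$, which is why the formula has to stay fixed across all pairs.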
 
"Across all the 1000 pairs it may change, in some of the pairs the sample x (the first) may have higher variance, and in some of them the sample y (the second) can have the greater variance."
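It helps to separate the population variances from the sample variances here. The probability distribution for x has a single variance and the probability distribution for y has a single variance; those do not change from sample to sample. The sample variances $s^2_x$ and $s^2_y$ are computed from the data, so they do change from pair to pair, and either one can be the larger in a given pair.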
 
