# Casella & Berger: Why is the distribution of the F-statistic in ANOVA not t^2?

In summary, Theorem 11.2.8 in Casella & Berger expresses the ANOVA statistic as the supremum of a T^2 statistic taken over all contrasts. For any single fixed contrast, the term being squared follows a t distribution. The supremum of the square, however, follows a (k-1) F(k-1, n-k) distribution rather than a t^2 distribution; the two coincide only when there are two groups.

#### shaikh22ammar

Theorem 11.2.8 in Casella & Berger expresses the ANOVA statistic as the supremum of the $T_a^2$ statistic over all contrasts $a$:
$$\sup_{\sum a_i = 0} T_a^2 = \sup_{\sum a_i = 0} \left( \left( S^2_p \sum a_i^2 / n_i \right)^{-1/2} \left( \sum a_i \bar Y_{i \cdot} - \sum a_i \theta_i\right) \right)^2 = \left( S^2_p \right)^{-1} \sum n_i \left( \bar Y_{i \cdot} - \bar{\bar Y} - \theta_i + \bar{\theta} \right)^2$$
where all the summations run from 1 to $k$, the number of treatments, $n = \sum n_i$ is the total number of observations, and $S^2_p, n_i, \theta_i, \bar Y_{i \cdot}$ are the pooled sample variance, the number of observations of treatment $i$, its mean, and its sample mean, respectively. For a fixed $a$, the term inside the square between the equals signs follows a t distribution, but for some reason the supremum of the square follows $(k-1) F(k-1, n-k)$ rather than $t^2$. Why?
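The identity in the theorem can be checked numerically. The sketch below is illustrative only: it assumes the null case where all $\theta_i$ are equal (so $\sum a_i \theta_i = 0$ for any contrast), takes $\bar{\bar Y}$ to be the grand mean of all observations, and uses made-up simulated data. The helper names `t_a_squared`, `closed_form`, and `random_contrast` are hypothetical, not from the book.

```python
import random

def t_a_squared(a, groups):
    """T_a^2 for contrast a, assuming all theta_i are equal (null case)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    # Pooled variance S_p^2 on n - k degrees of freedom.
    ssw = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    s2p = ssw / (n - k)
    num = sum(ai * m for ai, m in zip(a, means)) ** 2
    den = s2p * sum(ai ** 2 / len(g) for ai, g in zip(a, groups))
    return num / den

def closed_form(groups):
    """Right-hand side of the theorem: sum n_i (Ybar_i - Ybarbar)^2 / S_p^2."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand = sum(sum(g) for g in groups) / n
    ssw = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    s2p = ssw / (n - k)
    return sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means)) / s2p

def random_contrast(k, rng):
    """Random vector whose entries sum to zero."""
    a = [rng.gauss(0.0, 1.0) for _ in range(k)]
    centre = sum(a) / k
    return [ai - centre for ai in a]

rng = random.Random(1)
groups = [[rng.gauss(0.0, 1.0) for _ in range(m)] for m in (5, 7, 6, 4)]

n = sum(len(g) for g in groups)
means = [sum(g) / len(g) for g in groups]
grand = sum(sum(g) for g in groups) / n

# Candidate maximizer: a_i proportional to n_i (Ybar_i - Ybarbar).
a_star = [len(g) * (m - grand) for g, m in zip(groups, means)]
best = t_a_squared(a_star, groups)

# No randomly drawn contrast should beat the closed-form value.
max_random = max(t_a_squared(random_contrast(len(groups), rng), groups)
                 for _ in range(2000))

print(best, closed_form(groups), max_random)
```

Note that `a_star` is built from the sample means, so the contrast attaining the supremum changes from sample to sample.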

$$F_{1,\,n-2} = t_{n-2}^2$$
holds only when there are two groups ($k = 2$).
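A sketch of why the supremum ends up as a scaled F rather than a $t^2$, under the usual one-way ANOVA assumptions (independent normal observations with common variance $\sigma^2$). The shorthand $U_i$ and $\bar U$ is introduced here for illustration, and $\bar{\bar Y}$ and $\bar\theta$ are taken to be the $n_i$-weighted means. Write $U_i = \bar Y_{i \cdot} - \theta_i$ and $\bar U = \sum n_i U_i / n$ with $n = \sum n_i$. For any $a$ with $\sum a_i = 0$, the Cauchy-Schwarz inequality gives

$$\left( \sum a_i U_i \right)^2 = \left( \sum \frac{a_i}{\sqrt{n_i}} \sqrt{n_i} \left( U_i - \bar U \right) \right)^2 \le \left( \sum \frac{a_i^2}{n_i} \right) \left( \sum n_i \left( U_i - \bar U \right)^2 \right)$$

with equality when $a_i \propto n_i (U_i - \bar U)$, which is a valid contrast because $\sum n_i (U_i - \bar U) = 0$. Dividing by $S_p^2 \sum a_i^2 / n_i$ yields the closed form in the theorem. Since $\sum n_i (U_i - \bar U)^2 / \sigma^2 \sim \chi^2_{k-1}$ independently of $(n-k) S_p^2 / \sigma^2 \sim \chi^2_{n-k}$,

$$\sup_{\sum a_i = 0} T_a^2 = (k-1) \, \frac{\sum n_i \left( U_i - \bar U \right)^2 / (k-1)}{S_p^2} \sim (k-1) F_{k-1,\, n-k}$$

The maximizing $a$ depends on the data, so the supremum is not the square of a single t variable with a fixed contrast. For $k = 2$ there is only one contrast up to scale, and $(k-1) F_{1,\, n-2}$ reduces to $t_{n-2}^2$.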

## 1. Why is the distribution of F-statistic used in ANOVA instead of T^2?

For a single contrast $a$ chosen in advance, $T_a^2$ does follow a $t_{n-k}^2 = F_{1,\,n-k}$ distribution. The ANOVA statistic is instead the supremum of $T_a^2$ over all contrasts, and the contrast that attains the supremum depends on the data, so the result is no longer the square of a single t variable. It follows $(k-1) F_{k-1,\,n-k}$, and dividing by $k-1$ gives the usual F-statistic, which compares the variance between groups to the variance within groups and tests whether all $k$ group means are equal.
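One way to see the difference between the two distributions is a small simulation. This is a rough sketch under assumed settings (4 groups of 6 standard normal observations, equal true means, 20000 replications, all chosen arbitrarily); `sup_t2` and `simulate_mean` are hypothetical helper names. It compares the simulated mean of the supremum statistic with the known means $E[(k-1)F_{k-1,\,n-k}] = (k-1)(n-k)/(n-k-2)$ and $E[t_{n-k}^2] = (n-k)/(n-k-2)$.

```python
import random

def sup_t2(groups):
    """Closed-form sup of T_a^2 over contrasts when all theta_i are equal."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand = sum(sum(g) for g in groups) / n
    ssw = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    s2p = ssw / (n - k)  # pooled variance
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    return ssb / s2p

def simulate_mean(k, n_i, reps=20000, seed=0):
    """Average of the supremum statistic over simulated null data sets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        groups = [[rng.gauss(0.0, 1.0) for _ in range(n_i)] for _ in range(k)]
        total += sup_t2(groups)
    return total / reps

k, n_i = 4, 6
n = k * n_i
sim = simulate_mean(k, n_i)
mean_scaled_f = (k - 1) * (n - k) / (n - k - 2)  # mean of (k-1) F(k-1, n-k)
mean_t2 = (n - k) / (n - k - 2)                  # mean of t^2 with n-k df

print(sim, mean_scaled_f, mean_t2)
```

The simulated mean should sit close to the scaled-F value and well above the $t^2$ value. Matching means is only a sanity check, not a proof that the distributions agree.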

## 2. How is the F-statistic calculated in ANOVA?

The F-statistic is calculated by dividing the between-group variance by the within-group variance. In other words, it is the ratio of the mean square between (MSB) to the mean square within (MSW). This calculation allows us to determine whether the differences between groups are larger than the differences within groups, indicating a significant difference in means.
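The MSB/MSW ratio can be written out directly. A minimal sketch with a hypothetical helper `anova_f` and made-up data, using only the definitions above (between-group sum of squares on $k-1$ degrees of freedom, within-group sum of squares on $n-k$):

```python
def anova_f(groups):
    """One-way ANOVA F-statistic: mean square between / mean square within."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand = sum(sum(g) for g in groups) / n
    # Between-group sum of squares, weighted by group size.
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares around each group's own mean.
    ssw = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    msb = ssb / (k - 1)
    msw = ssw / (n - k)
    return msb / msw

# Made-up data: group means 2, 3, 6; SSB = 26, SSW = 6, so F = 13 / 1.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = anova_f(groups)
print(f_stat)  # about 13
```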

## 3. Can the F-statistic be used to compare two groups?

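Yes. With two groups the ANOVA F-statistic equals the square of the pooled two-sample t-statistic, $F_{1,\,n-2} = t_{n-2}^2$, so the F-test and the two-sided equal-variance t-test give the same p-value. The t-test is usually preferred for two groups because it also allows one-sided alternatives.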
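The relationship between the pooled two-sample t-statistic and the one-way F-statistic for two groups can be checked on any data set. The sketch below uses made-up numbers and hypothetical helpers `pooled_t` and `anova_f`; it assumes the equal-variance (pooled) form of the t-test.

```python
import math

def pooled_t(x, y):
    """Two-sample t-statistic with pooled (equal-variance) standard error."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    s2p = ss / (nx + ny - 2)
    return (mx - my) / math.sqrt(s2p * (1 / nx + 1 / ny))

def anova_f(groups):
    """One-way ANOVA F-statistic: mean square between / mean square within."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand = sum(sum(g) for g in groups) / n
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ssw = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ssb / (k - 1)) / (ssw / (n - k))

# Made-up samples of unequal size.
x = [4.1, 5.0, 6.2, 5.5]
y = [6.8, 7.4, 5.9, 8.1, 7.0]

t_squared = pooled_t(x, y) ** 2
f_two_groups = anova_f([x, y])
print(t_squared, f_two_groups)  # the two values agree up to rounding
```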

## 4. What are the assumptions for using the F-statistic in ANOVA?

The F-test assumes that the observations within each group are normally distributed, that all groups share a common variance, and that the observations are independent of each other. Violations of these assumptions mean the statistic no longer follows an exact F distribution, which can lead to inaccurate results.

## 5. How can the F-statistic be interpreted in ANOVA?

The F-statistic is used to determine the p-value: the probability, if there is no true difference between the group means, of obtaining an F-statistic at least as large as the one observed. If the p-value is less than the chosen significance level (commonly 0.05), we reject the null hypothesis and conclude that at least one group mean differs from the others.