Exploring Chi Square Tests for Independence: Understanding Expected Frequencies

  • Thread starter: Faiq
  • Tags: Statistical

Homework Help Overview

The discussion revolves around understanding the expected frequencies in chi-square tests for independence, specifically the formula used to calculate these frequencies in a two-way table context.

Discussion Character

  • Conceptual clarification, Mathematical reasoning

Approaches and Questions Raised

  • Participants explore the reasoning behind the formula for expected frequencies, questioning the assumptions of independence between attributes. Some provide detailed explanations of the components of the formula, while others seek clarification on the terminology used.

Discussion Status

The discussion includes multiple interpretations of the expected frequency formula and its derivation. Some participants offer detailed mathematical reasoning, while others question the clarity of the original post. There is an ongoing exploration of the concepts without a clear consensus reached.

Contextual Notes

Participants note potential confusion regarding the definitions and setup of the problem, particularly in relation to the totals in a two-way table and the implications of independence in the context of the chi-square test.

Faiq

Homework Statement


Can someone tell me why, when testing for independence using chi-square tests, the expected frequency of a cell is given by the formula
$$ \frac{(\text{row total}) \times (\text{column total})}{\text{grand total}} $$
 
Is this a human cell or a plant cell? Or perhaps an Excel cell? :smile: Some more description might make your question a bit clearer.
 
In a two-way table, if ##R_i## is the total number in row ##i## and ##C_j## the total number in column ##j## then ##f_i = R_i/N## is the estimated probability of the event for row ##i## and ##g_j = C_j/N## is the estimated probability of the event for column ##j##. Here, ##N = \sum_i R_i = \sum_j C_j## is the total number of observations. Under the hypothesis of independence between rows and columns, the estimated probability of the cell ##(i,j)## is ##\bar{p}_{ij} = f_i \,g_j = R_i C_j/N^2.## Thus, the expected frequency of cell ##(i,j)## is ##E_{ij} = N \bar{p}_{ij} = R_i C_j/N.##
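
The derivation above can be sketched in a few lines of Python. This is only an illustration: the function name `expected_frequencies` and the 2x3 table of counts are made up for the example and do not come from the thread.

```python
def expected_frequencies(observed):
    """Return the table of expected counts E_ij = R_i * C_j / N."""
    row_totals = [sum(row) for row in observed]         # R_i
    col_totals = [sum(col) for col in zip(*observed)]   # C_j
    n = sum(row_totals)                                 # N, the grand total
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical observed counts (illustrative numbers only).
observed = [
    [10, 20, 30],
    [20, 30, 40],
]

expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

Note that each row of `expected` sums to the same row total as the observed table, and likewise for the columns: the expected table keeps the margins and only redistributes the counts as independence would predict.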
 
The null hypothesis is that the two attributes are independent. Call the attributes A (rows: A and not A) and B (columns: B and not B). If they are independent, then the fraction having attribute A multiplied by the fraction having attribute B should approximately equal the fraction having both A and B. That is, under independence we expect #(A & B) / total = (#A / total) * (#B / total), so #(A & B) = #A * #B / total.
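
As a quick numerical illustration of this argument (the counts are invented for the example, not taken from the thread): suppose there are 100 subjects in total, 40 of whom have attribute A and 30 of whom have attribute B. Under independence one would expect
$$ \frac{\#(A \,\&\, B)}{100} \approx \frac{40}{100} \cdot \frac{30}{100} = 0.12 \quad\Longrightarrow\quad \#(A \,\&\, B) \approx \frac{40 \times 30}{100} = 12. $$
This is the (row total) × (column total) / (grand total) formula applied to the cell "A and B".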
 
Ray Vickson said:
In a two-way table, if ##R_i## is the total number in row ##i## and ##C_j## the total number in column ##j## then ##f_i = R_i/N## is the estimated probability of the event for row ##i## and ##g_j = C_j/N## is the estimated probability of the event for column ##j##. Here, ##N = \sum_i R_i = \sum_j C_j## is the total number of observations. Under the hypothesis of independence between rows and columns, the estimated probability of the cell ##(i,j)## is ##\bar{p}_{ij} = f_i \,g_j = R_i C_j/N^2.## Thus, the expected frequency of cell ##(i,j)## is ##E_{ij} = N \bar{p}_{ij} = R_i C_j/N.##
Shouldn't it be ##N = \sum_i R_i + \sum_j C_j##?
 
Faiq said:
Shouldn't it be ##N = \sum_i R_i + \sum_j C_j##
No. Try it for yourself on a simple example:
$$\begin{array}{ccc|l}
& & &\text{tot.} \\ \hline
1 & 2 & 3 &6\\
4 & 5 & 6 & 15 \\
7 & 8 & 9 & 24\\ \hline
12 & 15 & 18 & 45\; \leftarrow \text{totals}
\end{array}
$$
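
The same check can be run in code. The snippet below is a minimal sketch that uses the 3x3 table from this example; the variable names are arbitrary.

```python
# The 3x3 table of counts from the example above.
table = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

row_totals = [sum(row) for row in table]          # R_i
col_totals = [sum(col) for col in zip(*table)]    # C_j
n = sum(sum(row) for row in table)                # N, counted cell by cell

# Each set of marginal totals adds up to N on its own ...
print(sum(row_totals) == n, sum(col_totals) == n)

# ... so adding both sets together counts every observation twice.
print(sum(row_totals) + sum(col_totals) == 2 * n)
```

Both lines print `True`: the row totals and the column totals are two different ways of partitioning the same 45 observations.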
 