Exploring Chi Square Tests for Independence: Understanding Expected Frequencies

  • Thread starter: Faiq
  • Tags: Statistical
AI Thread Summary
The discussion focuses on the formula for expected frequencies in chi-square tests for independence, $$E_{ij} = \frac{R_i C_j}{N}$$ where ##R_i## and ##C_j## are the totals of row ##i## and column ##j##, and ##N## is the total number of observations. Under the null hypothesis of independence, the probability of a cell is the product of its row and column probabilities, ##R_i/N## and ##C_j/N##, so the expected frequency is ##N## times that product. Participants clarify that ##N## is not the sum of the row totals plus the sum of the column totals: the row totals alone, or the column totals alone, already add up to ##N##, as a small 3×3 example shows.
Faiq

Homework Statement


Can someone tell me, when testing for independence using chi-square tests, why the expected frequency of a cell is given by the formula
$$ \frac{\sum \text{row} \times \sum \text{column}}{\sum \text{total}} $$
 
Is this a human cell or a plant cell? Or perhaps an excell? :smile: Some more description might make your question a bit clearer, perhaps?
 
Faiq said:

Homework Statement


Can someone tell me, when testing for independence using chi-square tests, why the expected frequency of a cell is given by the formula
$$ \frac{\sum \text{row} \times \sum \text{column}}{\sum \text{total}} $$

In a two-way table, if ##R_i## is the total number in row ##i## and ##C_j## the total number in column ##j##, then ##f_i = R_i/N## is the estimated probability of the event for row ##i## and ##g_j = C_j/N## is the estimated probability of the event for column ##j##. Here, ##N = \sum_i R_i = \sum_j C_j## is the total number of observations. Under the hypothesis of independence between rows and columns, the estimated probability of the cell ##(i,j)## is ##\bar{p}_{ij} = f_i \,g_j = R_i C_j/N^2.## Thus, the expected frequency of cell ##(i,j)## is ##E_{ij} = N \bar{p}_{ij} = R_i C_j/N.##
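
The derivation above can be sketched in a few lines of Python. This is only a minimal illustration, not anything posted in the thread: the function name `expected_frequencies` and the 2x2 table of counts are made up for the example.

```python
def expected_frequencies(observed):
    """Return the table of expected counts E[i][j] = R_i * C_j / N."""
    row_totals = [sum(row) for row in observed]        # R_i
    col_totals = [sum(col) for col in zip(*observed)]  # C_j
    n = sum(row_totals)                                # N, total observations
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical observed counts, chosen only for illustration.
observed = [[10, 20],
            [30, 40]]

expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

Each row of the expected table adds up to the corresponding observed row total (and likewise for columns), which is a handy sanity check on the formula.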
 
Faiq said:

Homework Statement


Can someone tell me, when testing for independence using chi-square tests, why the expected frequency of a cell is given by the formula
$$ \frac{\sum \text{row} \times \sum \text{column}}{\sum \text{total}} $$

The null hypothesis is that the two attributes are independent. Call the attributes A (rows representing A and not A) and B (columns representing B and not B). If they are independent, then the fraction having attribute A multiplied by the fraction having attribute B should approximately equal the fraction having both attributes A and B. That is, #(A & B) / total = (#A / total) * (#B / total), so #(A & B) = #A * #B / total.
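
As a quick numerical illustration, with made-up numbers that are not from the thread: suppose there are 100 observations in total, 40 of which have attribute A and 30 of which have attribute B. Under independence the expected count with both attributes would be
$$ \#(A \,\&\, B) = \frac{\#A \times \#B}{\text{total}} = \frac{40 \times 30}{100} = 12 $$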
 
Ray Vickson said:
In a two-way table, if ##R_i## is the total number in row ##i## and ##C_j## the total number in column ##j##, then ##f_i = R_i/N## is the estimated probability of the event for row ##i## and ##g_j = C_j/N## is the estimated probability of the event for column ##j##. Here, ##N = \sum_i R_i = \sum_j C_j## is the total number of observations. Under the hypothesis of independence between rows and columns, the estimated probability of the cell ##(i,j)## is ##\bar{p}_{ij} = f_i \,g_j = R_i C_j/N^2.## Thus, the expected frequency of cell ##(i,j)## is ##E_{ij} = N \bar{p}_{ij} = R_i C_j/N.##

Shouldn't it be ##N = \sum_i R_i + \sum_j C_j##?
 
Faiq said:
Shouldn't it be ##N = \sum_i R_i + \sum_j C_j##?

No. Try it for yourself on a simple example:
$$\begin{array}{ccc|l}
& & &\text{tot.} \\ \hline
1 & 2 & 3 &6\\
4 & 5 & 6 & 15 \\
7 & 8 & 9 & 24\\ \hline
12 & 15 & 18 & 45\; \leftarrow \text{totals}
\end{array}
$$
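
The same check can be done programmatically. The sketch below uses the 3×3 table above; the variable names are chosen here for illustration. It shows that the row totals alone, and the column totals alone, already sum to ##N = 45##, so adding the two sums would count every observation twice.

```python
# The 3x3 example table from the post above.
table = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

row_totals = [sum(row) for row in table]        # R_i: 6, 15, 24
col_totals = [sum(col) for col in zip(*table)]  # C_j: 12, 15, 18
n = sum(sum(row) for row in table)              # N: 45 observations

print(sum(row_totals), sum(col_totals), n)      # all three agree
print(sum(row_totals) + sum(col_totals))        # 2N: every cell counted twice

# Expected frequency of the top-left cell under independence: R_1 * C_1 / N.
e_11 = row_totals[0] * col_totals[0] / n
print(e_11)
```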
 
