Exploring Chi Square Tests for Independence: Understanding Expected Frequencies

Faiq · Feb 17, 2017

Homework Statement

Can someone tell me why, when testing for independence using chi-square tests, the expected frequency of a cell is given by the formula
$$ \frac{\text{row total} \times \text{column total}}{\text{grand total}} $$

BvU · Feb 17, 2017

Is this a human cell or a plant cell? Or perhaps an excell?

Some more description might make your question a bit clearer, perhaps?

Ray Vickson · Feb 17, 2017

Faiq said:

Homework Statement

Can someone tell me why, when testing for independence using chi-square tests, the expected frequency of a cell is given by the formula
$$ \frac{\text{row total} \times \text{column total}}{\text{grand total}} $$

In a two-way table, if ##R_i## is the total count in row ##i## and ##C_j## is the total count in column ##j##, then ##f_i = R_i/N## is the estimated probability of the event for row ##i## and ##g_j = C_j/N## is the estimated probability of the event for column ##j##. Here, ##N = \sum_i R_i = \sum_j C_j## is the total number of observations. Under the hypothesis of independence between rows and columns, the estimated probability of cell ##(i,j)## is ##\bar{p}_{ij} = f_i \,g_j = R_i C_j/N^2##. Thus, the expected frequency of cell ##(i,j)## is ##E_{ij} = N \bar{p}_{ij} = R_i C_j/N##.
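As a rough illustration of the formula ##E_{ij} = R_i C_j/N##, here is a minimal Python sketch. The function name `expected_frequencies` and the 2x2 table of counts are made up for this example and are not from the original question.

```python
def expected_frequencies(observed):
    """Return E[i][j] = R_i * C_j / N for a two-way table of counts."""
    row_totals = [sum(row) for row in observed]        # R_i
    col_totals = [sum(col) for col in zip(*observed)]  # C_j
    n = sum(row_totals)                                # N, the grand total
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2x2 table of counts (illustrative numbers only).
observed = [[10, 20],
            [30, 40]]

expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

Each row of `expected` sums to the corresponding observed row total, and likewise for columns, which is a handy sanity check on the arithmetic.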

haruspex · Feb 17, 2017

Faiq said:

Homework Statement

Can someone tell me why, when testing for independence using chi-square tests, the expected frequency of a cell is given by the formula
$$ \frac{\text{row total} \times \text{column total}}{\text{grand total}} $$

The null hypothesis is that the two attributes are independent. Call the attributes A (rows representing A and not A) and B (columns representing B and not B). If they are independent, then the fraction having attribute A multiplied by the fraction having attribute B should approximately equal the fraction having both A and B. I.e., #(A & B)/total = (#A/total) * (#B/total), so #(A & B) = #A * #B/total.
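To make this concrete with made-up numbers (not from the thread): suppose 200 items were observed, 80 of them have attribute A and 50 have attribute B. Under independence, the expected count having both would be
$$ \#(A \,\&\, B) = 200 \times \frac{80}{200} \times \frac{50}{200} = \frac{80 \times 50}{200} = 20. $$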

Faiq · Feb 17, 2017

Ray Vickson said:

In a two-way table, if ##R_i## is the total count in row ##i## and ##C_j## is the total count in column ##j##, then ##f_i = R_i/N## is the estimated probability of the event for row ##i## and ##g_j = C_j/N## is the estimated probability of the event for column ##j##. Here, ##N = \sum_i R_i = \sum_j C_j## is the total number of observations. Under the hypothesis of independence between rows and columns, the estimated probability of cell ##(i,j)## is ##\bar{p}_{ij} = f_i \,g_j = R_i C_j/N^2##. Thus, the expected frequency of cell ##(i,j)## is ##E_{ij} = N \bar{p}_{ij} = R_i C_j/N##.

Shouldn't it be ##N = \sum_i R_i + \sum_j C_j##?

Ray Vickson · Feb 17, 2017

Faiq said:

Shouldn't it be ##N = \sum_i R_i + \sum_j C_j##?

No. Try it for yourself on a simple example:
$$\begin{array}{ccc|l}
& & &\text{tot.} \\ \hline
1 & 2 & 3 &6\\
4 & 5 & 6 & 15 \\
7 & 8 & 9 & 24\\ \hline
12 & 15 & 18 & 45\; \leftarrow \text{totals}
\end{array}
$$
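The same check can be done in a few lines of Python. This is only a sketch using the 3x3 table above; the variable names are chosen for this example.

```python
# The 3x3 table from the example above.
table = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

row_totals = [sum(row) for row in table]        # R_i
col_totals = [sum(col) for col in zip(*table)]  # C_j
grand_total = sum(sum(row) for row in table)    # N, counting every cell once

print(row_totals, sum(row_totals))
print(col_totals, sum(col_totals))
print(grand_total)

# Adding the row totals to the column totals counts every cell twice.
print(sum(row_totals) + sum(col_totals))
```

The row totals and the column totals each add up to the grand total on their own, so adding the two sums together gives twice the number of observations.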
