G-Test: Is it Dependent on Total Amount of Observations?

  • Context: Graduate 
  • Thread starter: mnb96
SUMMARY

The G-test is an alternative to the chi-squared test for categorical data, defined by G = 2∑_i O_i · log(O_i/E_i), where O_i and E_i are the observed and expected counts, respectively. The thread's central observation is that, for fixed observed proportions, G is directly proportional to the total number of observations N. For example, a coin that lands heads 10% of the time gives G≈7.36 with N=10 tosses and G≈73.6 with N=100. The rejection threshold itself does not depend on N: it is set by the degrees of freedom and the chosen significance level (α), so the same proportions count as stronger evidence against the null hypothesis as N grows.

PREREQUISITES
  • Understanding of G-Test and Chi-squared test
  • Familiarity with contingency tables
  • Knowledge of statistical significance and p-values
  • Concept of degrees of freedom in statistical tests
NEXT STEPS
  • Explore the implications of sample size on G-Test results
  • Learn about the relationship between G-Test and Chi-squared test
  • Investigate how to calculate degrees of freedom for various tests
  • Study the interpretation of p-values in hypothesis testing
USEFUL FOR

Statisticians, data analysts, researchers in social sciences, and anyone involved in hypothesis testing and statistical analysis.

mnb96
Hello,

it is claimed that the so-called G-test can be used as a replacement for the well-known chi-squared test. The G-test is defined as:

$$G = 2\sum_i O_i \cdot \log \left( \frac{O_i}{E_i}\right)$$

where O_i and E_i are the observed and expected counts in cell i of a contingency table.

I see a big problem with this.
Namely, the value of G is directly proportional to the total number N of observations!

This is easily seen even with the most trivial example of a coin toss. Suppose we want to test whether a coin is fair or not. We collect N=10 samples and we obtain {1 head, 9 tails}. Thus, according to the above formula, G≈7.36.
Now suppose we collect N=100 samples and we obtain {10 heads, 90 tails}. Well, according to the above formula we now get G≈73.6, exactly ten times more.
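
The two values quoted above can be checked with a few lines of Python. This is only a minimal sketch: the helper name `g_statistic` is made up for illustration, the logarithm is taken to be the natural log (which is what reproduces 7.36), and the expected counts are N/2 per side under the fair-coin hypothesis.

```python
import math

def g_statistic(observed, expected):
    """G = 2 * sum(O_i * ln(O_i / E_i)).

    Cells with O_i == 0 are skipped, since O * ln(O) tends to 0 as O -> 0.
    """
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )

# Fair-coin null hypothesis: expected counts are N/2 heads and N/2 tails.
g_10 = g_statistic([1, 9], [5, 5])        # N = 10:  {1 head, 9 tails}
g_100 = g_statistic([10, 90], [50, 50])   # N = 100: {10 heads, 90 tails}

print(f"N=10:  G = {g_10:.2f}")           # 7.36
print(f"N=100: G = {g_100:.2f}")          # 73.61
print(f"ratio  = {g_100 / g_10:.1f}")     # 10.0
```

The ratio is 10 because the observed proportions are identical in both samples and only N differs.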

So, what is the threshold value for G above which we reject the null-hypothesis that the coin is fair?
 
mnb96 said:
exactly ten times more.

What's bad about that? Intuitively, more trials provide more evidence of a trend toward tails.
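
One way to make this intuition explicit is sketched below. The symbols \hat{p}_i (observed proportion in cell i) and \pi_i (proportion expected under the null hypothesis) are notation introduced here, not taken from the thread. Writing O_i = N\hat{p}_i and E_i = N\pi_i in the definition of G gives

$$G = 2\sum_i N\hat{p}_i \log\left(\frac{N\hat{p}_i}{N\pi_i}\right) = 2N\sum_i \hat{p}_i \log\left(\frac{\hat{p}_i}{\pi_i}\right)$$

For fixed observed proportions the sum does not depend on N, so G grows linearly with the number of observations. In the coin example, with \hat{p} = (0.1, 0.9) and \pi = (0.5, 0.5):

$$\sum_i \hat{p}_i \log\left(\frac{\hat{p}_i}{\pi_i}\right) = 0.1\log(0.2) + 0.9\log(1.8) \approx 0.368, \qquad G \approx 0.736\,N$$

which reproduces G≈7.36 for N=10 and G≈73.6 for N=100.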


So, what is the threshold value for G above which we reject the null-hypothesis that the coin is fair?

What α do you want to use?

"The" chi-square distribution is actually a family of distributions. You have to specify the "degrees of freedom" to specify a particular distribution.
 
Stephen Tashi said:
What α do you want to use?

"The" chi-square distribution is actually a family of distributions. You have to specify the "degrees of freedom" to specify a particular distribution.

Well, let's say I want to set α=0.005.
If we stick with the coin-toss example from my previous post, we have only 1 degree of freedom, for which the critical value of the chi-squared distribution at α=0.005 is 7.879. Thus in the first case, where we had only 10 tosses, we do not yet reject the hypothesis that the coin is fair (G was ~7.36).
In the second case when we have 100 tosses (more evidence), we obtained G≈73.6 which is more than enough to reject the hypothesis that the coin is fair.
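
The comparison above can be reproduced with the Python standard library alone. The sketch below relies on a special property of the 1-degree-of-freedom case: a chi-squared variable with 1 degree of freedom is the square of a standard normal variable, so its tail probability and critical value follow from `math.erfc` and `statistics.NormalDist`. The helper names are hypothetical, and the shortcut does not carry over to more degrees of freedom.

```python
import math
from statistics import NormalDist

def chi2_sf_1df(x):
    """Upper-tail probability P(X > x) for chi-squared with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def chi2_critical_1df(alpha):
    """Critical value c such that P(X > c) = alpha, for 1 degree of freedom."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided normal quantile
    return z * z

alpha = 0.005
critical = chi2_critical_1df(alpha)
print(f"critical value at alpha={alpha}: {critical:.3f}")  # 7.879

# G values taken from the posts above.
for n, g in [(10, 7.36), (100, 73.6)]:
    p_value = chi2_sf_1df(g)
    decision = "reject" if g > critical else "do not reject"
    print(f"N={n}: G={g}, p-value={p_value:.3g}, {decision} the fair-coin hypothesis")
```

Keep in mind that the chi-squared distribution is only an approximation to the distribution of G, and with just 10 tosses that approximation is rough, so the N=10 p-value should be read as indicative rather than exact.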

Stephen Tashi said:
What's bad about that? Intuitively, more trials provide more evidence of a trend toward tails.

Yes, now it makes sense. I was just missing the correct interpretation.
I believe that what confused me is that in both scenarios we had 90% tails and 10% heads, and I wrongly expected to get the same G-value.
 
