Chi-Square Test: Solving Doubt w/ Kepler

SUMMARY

The discussion revolves around applying the Chi-Square test when the sum of the observed values differs from the sum of the control values. Kepler presents a dataset of observed frequencies and control values and asks how to run the Chi-Square test under these conditions. The respondent's view is that a Chi-Square test is not the natural fit for this data, and a Kolmogorov-Smirnov test is recommended instead for comparing an observed distribution against a reference distribution. If a Chi-Square test is required anyway, the suggestion is to build a hypothesized distribution from the control group proportions.

PREREQUISITES
  • Understanding of Chi-Square tests and their application in statistical analysis.
  • Familiarity with Kolmogorov-Smirnov tests for comparing distributions.
  • Knowledge of observed and expected frequencies in statistical contexts.
  • Basic principles of hypothesis testing in statistics.
NEXT STEPS
  • Research the implementation of the Kolmogorov-Smirnov test in statistical software like R or Python.
  • Learn how to create hypothesized distributions based on control group data for Chi-Square tests.
  • Explore the assumptions and limitations of Chi-Square tests in statistical analysis.
  • Study the differences between parametric and non-parametric tests in statistics.
USEFUL FOR

Statisticians, data analysts, and researchers who are involved in hypothesis testing and statistical analysis, particularly those working with frequency data and distribution comparisons.

cptolemy
Good afternoon,

I'm glad I've joined this forum. Here's my question: I have a series of values in a table like this:

Case    Observed  Control
Case 1  34        55
Case 2  23        10
Case 3  55        40
etc...

Here 34 is the observed value, 55 is the control group value, and so on. It's easy to do the test of course if...

The problem is: if the sum of the observed values is different from the sum of the control group, how do I execute the test?

Should I use percentages and then, for instance, use a mean value from the sums...?

Kind regards,

Kepler
 
I like Serena
Hi Kepler! Welcome to MHB! ;)

A chi-square test only applies to frequencies, that is, counts of how often some condition occurs.
That doesn't seem to be the case with your data. Can you clarify?
Otherwise, a linear regression may be more appropriate...
 
Hi,

Thanks for the reply :) Actually, they are frequencies: counts of how often cases 1, 2, 3... occur. The control values are the regular, normal frequencies. The difference, and the problem, is that the observed frequencies are being measured against a previous distribution, so the sums are different (the control cases were fewer). The chi-square test relies on squared differences, so I think I must choose the right proportion of N.

I would very much like your opinion.

Kind regards,

Kepler
 
I like Serena
A chi-square test typically compares observed frequencies against a hypothesized distribution.
Your control values are not a hypothesized distribution, but separate observations from a group that is hypothesized to be different.

That means a Kolmogorov-Smirnov test is more appropriate. It compares an observed distribution against a reference distribution, neither of which needs known distribution parameters.
Is it an option to use the Kolmogorov-Smirnov test?
Or does it have to be a chi-square test?
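
As a rough illustration of the Kolmogorov-Smirnov idea mentioned above, the sketch below computes only the two-sample KS statistic (the largest gap between the two cumulative proportions) from binned counts. It assumes the cases have a meaningful order, which the thread does not confirm, and it uses just the three rows quoted in the question. The function name `ks_statistic_from_counts` is made up for this example. The KS test is designed for continuous data, so with binned counts the result is only approximate.

```python
from itertools import accumulate

# Illustrative counts: only the first three rows quoted in the thread.
observed = [34, 23, 55]
control = [55, 10, 40]

def ks_statistic_from_counts(observed, control):
    """Largest absolute gap between the two cumulative proportions.

    Assumes both lists use the same categories in the same, meaningful order.
    """
    n_obs = sum(observed)
    n_ctrl = sum(control)
    # Cumulative proportions, so different totals no longer matter.
    cdf_obs = [c / n_obs for c in accumulate(observed)]
    cdf_ctrl = [c / n_ctrl for c in accumulate(control)]
    return max(abs(a - b) for a, b in zip(cdf_obs, cdf_ctrl))

d_stat = ks_statistic_from_counts(observed, control)
print(f"KS statistic D = {d_stat:.4f}")
```

Because both groups are converted to proportions first, the different sums (N and N1) are not an obstacle here. Turning D into a p-value needs the KS distribution, which a statistics library would normally supply.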
 
Hi,

Thanks for the reply. Actually, you might be right. My observed values under condition 1, belonging to a group of type A, are compared to another group of the same type (A) without that condition. Under condition 2, the observed group B is tested against another value from the same group without condition 2, and so on.

The difference, and the problem, is that the observed cases sum to a sample of N individuals, while the other group sums to N1, and N is not equal to N1.

To summarize: I have several groups of individuals by type, say A to F. For a given condition, I have my observed values that satisfy the condition (in a sample totalling N1 subjects) and a control value (from the same type of group) that does not satisfy the condition; the number of control subjects, N2, is different from N1.

But I must solve this with a chi-square test.

Any help is appreciated.

Kind regards,

Kepler
 
I like Serena
Is the observed group of type A the same as the observed group of type B?

If you really want to use a chi-square test, I think we will have to create a hypothesized distribution based on the control group.
We get that by dividing each observed frequency of the control group by the number of people in the control group. That gives us a proportion.
Then we can estimate the expected frequency by multiplying this proportion by the number of people in the observed group.
This approach is sensitive to errors in the measurements of the control group though, so it would only be acceptable if the control group is very large.
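
The scaling described above can be sketched in a few lines. This is a minimal illustration, not a full analysis. It uses only the three rows quoted in the question (the real table has more cases), and the function name `scaled_chi_square` is made up for this example. It computes the usual statistic, the sum of (observed - expected)^2 / expected, with expected counts taken from the control proportions.

```python
# Illustrative counts: only the first three rows quoted in the thread.
observed = [34, 23, 55]
control = [55, 10, 40]

def scaled_chi_square(observed, control):
    """Chi-square statistic with expected counts scaled from control proportions."""
    n_obs = sum(observed)
    n_ctrl = sum(control)
    # Step 1: turn each control count into a proportion of the control total.
    proportions = [c / n_ctrl for c in control]
    # Step 2: scale the proportions to the size of the observed group.
    expected = [p * n_obs for p in proportions]
    # Step 3: sum the squared differences relative to the expected counts.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Goodness-of-fit degrees of freedom: number of categories minus one.
    dof = len(observed) - 1
    return expected, statistic, dof

expected, statistic, dof = scaled_chi_square(observed, control)
print("expected:", [round(e, 2) for e in expected])
print(f"chi-square = {statistic:.3f} with {dof} degrees of freedom")
```

By construction the expected counts sum to the same total as the observed counts, which is what resolves the mismatch between the two sums. The statistic can then be compared against a chi-square table or passed to a statistics library for a p-value. As noted above, this treats the control proportions as exact, which is only reasonable when the control group is large.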
 
