Appropriate statistical test for this situation?

In summary, the conversation discusses how to determine if two distributions are consistent using statistical tests. The data provided includes counts for two states (A and B) across three groups/regions/populations. The speaker suggests using a [itex]\chi^2[/itex] test to calculate the p-value and determine the significance of any differences between the distributions. They also clarify that the null hypothesis is that the two distributions are the same and caution against misinterpreting a p-value as the probability of the distributions being different. The conversation ends with a question about calculating the [itex]\chi^2[/itex] statistic on a contingency table.
  • #1
Jean Tate
Can anyone help me with this, please?

It's about how you go about trying to decide if two distributions are consistent, statistically speaking; specifically, what statistical test, or tests, is (are) most appropriate to use.

Here's the data:

N(A) N(B) G/R/P
0043 0046 #101
0264 0235 #102
0033 0029 #103

N(A) N(B) G/R/P
0172 0201 #201
1686 1496 #202
1444 1336 #203

Astronomical observations were made, and reduced to data. By two quite different teams, using different telescopes, cameras, data reduction routines, etc. In the first two columns (N(A) and N(B)) are counts, with leading zeros to ensure everything lines up nicely. "A" and "B" are two states, or conditions, or ... they are distinct and - for the purposes of this question - unambiguous. So the first cell of the first table says 43 cases of A (or with condition A) were observed.

The third column (G/R/P) is the name/label of the group/region/population observed. The two teams each observed the same group/region/population; the first table is the first team's data, the second the second.

There is nothing to say what the underlying ("true") distribution is, or should be. Nor any way to compare what the two teams observed: the 43 could be a proper subset of the 172 (first column, first row of the second table), an overlap, or disjoint. However, assume no mistakes at all in the assignment of "A" and "B".

Clearly, the two distributions - of states A and B, across the three groups/regions/populations - are different. However, is that difference statistically significant? What test - or tests - are appropriate to use, here?

More details? Consider these:

i) what's observed is white dwarf stars, in three different clusters; A is DA white dwarfs, B DB ones
ii) globular clusters, in three different galaxies; A is 'red' GCs, B 'blue'
iii) spiral galaxies, in three different galaxy clusters; A is 'anti-clockwise', B 'clockwise'
iv) radio galaxies, in three different redshift bins; A is 'FR-I', B 'FR-II'
v) GRBs, in three different RA bins; A is 'long', B is 'short'

(I don't think the details matter, in terms of the type of statistical test to use; am I right?)
 
  • #2
To clarify something; namely, what 'distribution' am I asking about?

Express the data as ratios, of N(A)/N(B) (two significant figures only):

A/B G/R/P
0.93 #101
1.12 #102
1.14 #103

A/B G/R/P
0.86 #201
1.13 #202
1.08 #203

Now 0.93 != 0.86, 1.12 != 1.13, and 1.14 != 1.08, so the two teams' values of the three ratios are not the same (duh!)

The ratio in group/region/population #01 is not, necessarily, the same as that in G/R/P #02 (ditto #03).

Given the underlying data - which is counts, not ratios - is the ordered triple* (0.93, 1.12, 1.14) inconsistent with the ordered triple (0.86, 1.13, 1.08)?

Oh, and I should have asked about "inconsistent with" ...

* am I using the term correctly?
 
  • #3
Jean Tate said:
Clearly, the two distributions - of states A and B, across the three groups/regions/populations - are different. However, is that difference statistically significant? What test - or tests - are appropriate to use, here?
A [itex]\chi^2[/itex] test would likely be appropriate here. Once you've computed the [itex]\chi^2[/itex], the associated p-value can be used to determine the significance, [itex]\alpha[/itex], of the test. The p-value is computed assuming that the two distributions are actually the same, and gives the probability of obtaining differences at least as large as those observed purely by chance. The conventional reading of a test that gives a p-value better than a significance level of, say, [itex]\alpha = 0.05[/itex], is that there is only a 5% chance that differences this large would arise as a statistical fluke. Often, this is termed as "there is a 5% chance of falsely rejecting the null hypothesis" (null hypothesis = the hypothesis assumed in the significance test -- that the two distributions are the same). This is known as a Type I error, or false positive. People are often tempted to invert this, and say that there is a 95% chance that the two distributions are different, but strictly speaking this is incorrect and sloppy.
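
For concreteness, a minimal sketch (in Python, using scipy; the numbers are made up purely for illustration) of turning a [itex]\chi^2[/itex] statistic and its degrees of freedom into a p-value:

[code]
from scipy.stats import chi2

# Hypothetical values, purely for illustration.
chi2_stat = 4.7   # chi-square statistic from the test
dof = 2           # degrees of freedom of the test

# p-value: probability of a chi-square value at least this large under the null.
p_value = chi2.sf(chi2_stat, dof)
print(f"p = {p_value:.3f}")   # compare against a chosen significance level, e.g. 0.05
[/code]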
 
  • #4
bapowell said:
A [itex]\chi^2[/itex] test would likely be appropriate here. Once you've computed the [itex]\chi^2[/itex], the associated p-value can be used to determine the significance, [itex]\alpha[/itex], of the test. The p-value is computed assuming that the two distributions are actually the same, and gives the probability of obtaining differences at least as large as those observed purely by chance. The conventional reading of a test that gives a p-value better than a significance level of, say, [itex]\alpha = 0.05[/itex], is that there is only a 5% chance that differences this large would arise as a statistical fluke. Often, this is termed as "there is a 5% chance of falsely rejecting the null hypothesis" (null hypothesis = the hypothesis assumed in the significance test -- that the two distributions are the same). This is known as a Type I error, or false positive. People are often tempted to invert this, and say that there is a 95% chance that the two distributions are different, but strictly speaking this is incorrect and sloppy.
Thanks!

Calculate the [itex]\chi^2[/itex] statistic on the following (a contingency table)? Or something else?

NA.1 NB.1 NA.2 NB.2 G/R/P
0043 0046 0172 0201 #01
0264 0235 1686 1496 #02
0033 0029 1444 1336 #03
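
One possible sketch (in Python, using scipy; whether this is the right framing is exactly what I'm unsure about) would be to compare the two teams' A:B splits group by group, treating each group as its own 2x2 contingency table:

[code]
from scipy.stats import chi2_contingency

# Counts from the tables above:
# (N(A) team 1, N(B) team 1, N(A) team 2, N(B) team 2)
groups = {
    "#01": (43, 46, 172, 201),
    "#02": (264, 235, 1686, 1496),
    "#03": (33, 29, 1444, 1336),
}

for name, (a1, b1, a2, b2) in groups.items():
    table = [[a1, b1],   # team 1: A, B
             [a2, b2]]   # team 2: A, B
    chi2, p, dof, expected = chi2_contingency(table)
    print(f"{name}: chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}")
[/code]

Alternatively, the full table above could be tested as a whole; the per-group version at least makes it explicit which group, if any, drives a discrepancy.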
 
  • #5
The appropriate statistical test to use in this situation would be a chi-square test for independence. This test is used to determine if there is a significant relationship between two categorical variables. In this case, the two variables are the states A and B, and the groups/regions/populations observed. The chi-square test will determine if there is a significant difference in the distribution of states A and B across the three groups/regions/populations observed.

To perform the chi-square test, the observed frequencies (N(A) and N(B)) and expected frequencies (based on the overall proportion of A and B in the data) would need to be calculated. The test will then determine if the observed frequencies are significantly different from the expected frequencies, indicating a significant relationship between the variables.
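
As a sketch of that calculation (in Python with numpy/scipy, shown here for the first team's 3x2 table of groups versus states as one possible arrangement), the expected count in each cell is the row total times the column total divided by the grand total:

[code]
import numpy as np
from scipy.stats import chi2

# Team 1's counts: rows = groups #101-#103, columns = states A, B.
observed = np.array([[43, 46],
                     [264, 235],
                     [33, 29]])

row_totals = observed.sum(axis=1, keepdims=True)   # totals per group
col_totals = observed.sum(axis=0, keepdims=True)   # totals per state
grand_total = observed.sum()

# Expected count under independence: E_ij = (row_i total) * (col_j total) / N
expected = row_totals @ col_totals / grand_total

chi2_stat = ((observed - expected) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_value = chi2.sf(chi2_stat, dof)
print(f"chi2 = {chi2_stat:.2f}, dof = {dof}, p = {p_value:.3f}")
[/code]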

Other statistical tests, such as a two-sample t-test or ANOVA, would only apply if the outcome were a continuous, approximately normally distributed measurement with roughly equal variances across groups. Here the data are counts of a categorical outcome, so the chi-square test is more appropriate and does not require those assumptions.

The details provided about the observations (white dwarf stars, globular clusters, spiral galaxies, etc.) do not impact the type of statistical test to use. The key factor is that the data is categorical and the goal is to determine if there is a significant relationship between two variables.
 

What is the appropriate statistical test for this situation?

The appropriate statistical test for a situation depends on the type of data being analyzed and the research question being addressed. Generally, if the data is continuous and normally distributed, a parametric test such as a t-test or ANOVA would be appropriate. If the data is non-normal, non-parametric tests such as the Mann-Whitney U test or Kruskal-Wallis test may be more suitable.
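
For example, a minimal sketch in Python (scipy), using made-up samples, of a parametric test alongside its non-parametric counterpart:

[code]
import numpy as np
from scipy.stats import ttest_ind, mannwhitneyu

rng = np.random.default_rng(0)
group_a = rng.normal(loc=10.0, scale=2.0, size=30)   # made-up continuous data
group_b = rng.normal(loc=11.0, scale=2.0, size=30)

t_stat, t_p = ttest_ind(group_a, group_b)       # parametric: assumes normality
u_stat, u_p = mannwhitneyu(group_a, group_b)    # non-parametric alternative
print(f"t-test p = {t_p:.3f}, Mann-Whitney p = {u_p:.3f}")
[/code]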

How do I determine if my data is normally distributed?

You can visually assess the normality of your data by creating a histogram or a Q-Q plot. Additionally, you can use statistical tests such as the Shapiro-Wilk or Kolmogorov-Smirnov test to check for normality. If the p-value is greater than 0.05, there is no significant evidence against normality (which is not the same as proving the data are normally distributed).
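
For example, a quick sketch in Python (scipy) of the Shapiro-Wilk check on a made-up sample:

[code]
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(1)
sample = rng.normal(loc=0.0, scale=1.0, size=100)   # made-up data

stat, p = shapiro(sample)
# p > 0.05: no significant evidence against normality (not proof of normality)
print(f"Shapiro-Wilk: W = {stat:.3f}, p = {p:.3f}")
[/code]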

What is the difference between a parametric and non-parametric test?

A parametric test assumes that the data follow a specific distribution, most commonly the normal distribution. Non-parametric tests, on the other hand, make no assumptions about the underlying distribution of the data. They are more robust to violations of those assumptions and can be used with non-normal data.

When should I use a one-tailed vs. two-tailed test?

A one-tailed test is used when the research question has a specific direction or hypothesis, while a two-tailed test is used when the research question does not have a specific direction. For example, if the research question is "Is there a difference in mean scores between Group A and Group B?", a two-tailed test would be appropriate. But if the research question is "Is the mean score of Group A higher than the mean score of Group B?", a one-tailed test would be more suitable.
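
A sketch in Python (scipy), on made-up data, of the same comparison run two-tailed and then one-tailed:

[code]
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(2)
group_a = rng.normal(loc=10.5, scale=2.0, size=30)   # made-up scores
group_b = rng.normal(loc=10.0, scale=2.0, size=30)

# Two-tailed: "is there any difference between the group means?"
_, p_two_sided = ttest_ind(group_a, group_b, alternative="two-sided")

# One-tailed: "is the mean of group A greater than that of group B?"
_, p_one_sided = ttest_ind(group_a, group_b, alternative="greater")

print(f"two-tailed p = {p_two_sided:.3f}, one-tailed p = {p_one_sided:.3f}")
[/code]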

What is the appropriate sample size for my study?

The appropriate sample size for a study depends on several factors, including the research question, the effect size, and the desired level of power. Generally, a larger sample size allows for more accurate and reliable results. It is important to conduct a power analysis to determine the minimum sample size needed to detect a significant effect.
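
A sketch of such a power analysis in Python, using the statsmodels package; the effect size, significance level, and power below are made-up planning values:

[code]
from statsmodels.stats.power import TTestIndPower

# Made-up planning values: medium effect size, alpha = 0.05, desired power = 0.8.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8,
                                    alternative="two-sided")
print(f"minimum sample size per group: {n_per_group:.1f}")
[/code]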
