Does Returning Marbles After Each Draw Affect Chi-square Test Results?

  • Context: Undergrad 
  • Thread starter: JFS321

Discussion Overview

The discussion revolves around the application of the Chi-square test in the context of a classroom activity involving drawing colored marbles from a bag. Participants explore how returning marbles after each draw affects the test results compared to counting all marbles at once. The conversation touches on statistical principles, sampling methods, and the implications of drawing with replacement versus without.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant questions how returning marbles after each draw differs from counting all marbles at once for the Chi-square test.
  • Another participant explains that the Chi-square test is used to assess hypotheses about proportions when the entire population cannot be tested, emphasizing that counting all marbles eliminates the need for the test.
  • There is a discussion about the implications of sampling methods, with one participant cautioning that non-random selection of bags could introduce uncertainty in the results.
  • Participants discuss the concept of statistical power, with one asking if drawing 50 times increases the likelihood of obtaining a ratio close to the actual ratio in the bag.
  • Clarifications are made regarding the interpretation of Chi-square results, particularly that the confidence statistic does not indicate the probability of obtaining the same results from another sample.
  • Concerns are raised about the possibility of repeatedly drawing the same marbles, which could affect the representativeness of the sample.

Areas of Agreement / Disagreement

Participants express varying views on the implications of sampling methods and the interpretation of Chi-square test results. There is no consensus on the best approach to the activity or the statistical implications of returning marbles versus counting them all at once.

Contextual Notes

Participants note limitations regarding the randomness of marble selection and the potential for repeated sampling of the same marbles, which could affect the validity of the Chi-square test results.

Who May Find This Useful

This discussion may be useful for educators, students, and individuals interested in statistical methods, particularly in the context of hypothesis testing and sampling techniques in experimental design.

JFS321
Hi all, I'm a high school physics / AP biology teacher looking to expand my understanding of the Chi-square test some. I planned an activity in which students are randomly drawing colored marbles out of a bag in order to see if they match predicted ratios (2:1, 1:1, others). I'm having them draw blindly 50x (there are around 20-30 marbles per bag), returning the marble into the bag after each draw.

I'm wondering, though -- how does performing the test in this manner differ from simply having them dump all the marbles out and count the actual numbers and doing the Chi-square test with those exact values? If you do a Chi-sq test on flipping a coin, for example, this seems to be similar to returning the marble after each draw. They are all independent events. So, hopefully my question makes sense -- how does the Chi-square test "change" between these two scenarios? Does it?

Thanks. Please note, I am no mathematician.
 
JFS321 said:
I'm wondering, though -- how does performing the test in this manner differ from simply having them dump all the marbles out and count the actual numbers and doing the Chi-square test with those exact values?
A Chi-square test is designed to test a hypothesis about proportions in a population. It is needed only when the entire population cannot be tested, either because it is too big or because testing is intrusive or unpleasant. So a sample is tested, and the Chi-square test tells us how confident we can be that the sample is consistent with the hypothesised proportions in the population.

If our population is the marbles in the bag and we take them all out and count the proportions of each colour then we know the exact, correct proportions for the population and no Chi Square test is needed.

But if we only take out a few marbles from the bag then we use the sample with a Chi Square test to get confidence levels for our hypothesis about the whole bag.

The hypothesis in this case might be something like 'The bag contains equal numbers of each colour'.

If instead the population were all marbles produced by the factory and sold in bags of that type, then we cannot test the whole population. If we examine the entire contents of one bag, that is a sample of the population and Chi Square may be used, just as it could be if we only inspected some of the marbles from the bag.

But I'd advise caution about using that second approach, because the selection of bag is not random. It was bought from a particular shop in a particular area at a particular time. For all we know, the proportions in bags may have changed over time, or may differ between different distribution markets. Chi Square approaches always have a cloud of uncertainty over them when selection of the sample is not sufficiently random. Drawing marbles out of a bag while not looking, and having first given them a good mixing, is a pretty good random selection method where the population is just what's in the bag.
 
Ok, thanks a lot. I think I can make sense of that -- basically, if we did dump out all the marbles (I'm not going to do it that way, but...), those marbles would theoretically represent a random sample from the larger population. Then the Chi sq can tell us the probability of getting those results based on our random sampling efforts -- in other words, the likelihood of receiving another sample at least as extreme as that one. Any issues there?
 
The confidence statistic one gets from a Chi square test (the p-value) is not the probability that one would get the same results from another random sample. Rather, it is the probability of seeing a result at least as extreme as the one observed if the Null Hypothesis is true. In a situation like this, the Null Hypothesis would typically be that all colours have the same frequency in the whole population.

If there are three colours Red, Green and Blue and we draw out twenty marbles and see that eight are red, two are green and ten are blue, we perform a Chi square test. The Chi square value is 5.2 and there are two degrees of freedom. The p-value is about 7.4%. That tells us that the probability of getting a Chi square value that high or higher from a sample of twenty is only about 7.4% if the three colours are equally frequent. People sometimes say this as 'the probability that the three colours are equally frequent is 7.4%', which is a paraphrase that will send some statisticians into fits of rage, but it may be OK if you are talking to school students.

Note that the test says nothing about the probability of getting another sample with the same proportions.
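
For anyone who wants to check the red/green/blue example by hand, here is a minimal Python sketch using only the standard library. The function name `chi_square_gof` is made up for illustration, and the shortcut p = exp(-x/2) is valid only for two degrees of freedom (three categories); for other cases a statistics library or a Chi-square table would be needed.

```python
import math

def chi_square_gof(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [8, 2, 10]        # red, green, blue counts from the example
n = sum(observed)            # 20 draws in total
expected = [n / 3] * 3       # null hypothesis: colours equally frequent

stat = chi_square_gof(observed, expected)

# Shortcut that holds only for 2 degrees of freedom:
# the chi-square upper-tail probability is exp(-x / 2).
p_value = math.exp(-stat / 2)

print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")
```

The statistic comes out at 5.2 and the p-value at roughly 0.074, i.e. the chance of a Chi-square value this large or larger when the colours really are equally frequent.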
 
Also, let me be sure I am clear on this. There are only about 25 marbles in the bag -- by having them draw/replace 50x, am I increasing my statistical power because we are much more likely to get a ratio that is closest to the actual ratio in the bag? I think it's the "pretending" part that is getting me...sampling 25 marbles 50x is basically sampling the whole population, but we are pretending it's just a sample. Perhaps I should have done 100 marbles per bag and none of these questions would have jumped in my mind!
 
JFS321 said:
There are only about 25 marbles in the bag -- by having them draw/replace 50x, am I increasing my statistical power because we are much more likely to get a ratio that is closest to the actual ratio in the bag?
Yes. But it's by virtue of the number of samples drawn, not by virtue of the ratio of that to the number in the bag. The statistical power is the same regardless of whether there are 25 in the bag or 25,000. It is only the number of samples and the number of different colours that matters.
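
A rough simulation can illustrate this point. The sketch below assumes a bag with a true 2:1 red:blue ratio and tests it against a (false) 1:1 hypothesis; those choices, the bag sizes, and the helper name `rejection_rate` are all illustrative assumptions, not part of the original activity. The cutoff 3.841 is the usual 5% critical value for one degree of freedom.

```python
import random

CRITICAL_1DF_5PCT = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05

def rejection_rate(bag, draws, trials, rng):
    """Fraction of simulated experiments that reject a 1:1 null hypothesis."""
    rejections = 0
    expected = draws / 2  # expected count per colour under the 1:1 null
    for _ in range(trials):
        sample = rng.choices(bag, k=draws)  # draw with replacement
        red = sample.count("red")
        blue = draws - red
        stat = (red - expected) ** 2 / expected + (blue - expected) ** 2 / expected
        if stat > CRITICAL_1DF_5PCT:
            rejections += 1
    return rejections / trials

rng = random.Random(0)  # fixed seed so the run is repeatable
small_bag = ["red"] * 20 + ["blue"] * 10        # 30 marbles, 2:1 (assumed)
large_bag = ["red"] * 20000 + ["blue"] * 10000  # 30,000 marbles, 2:1 (assumed)

for draws in (20, 50, 200):
    small = rejection_rate(small_bag, draws, 2000, rng)
    large = rejection_rate(large_bag, draws, 2000, rng)
    print(f"{draws:>3} draws: small bag {small:.2f}, large bag {large:.2f}")
```

Under these assumptions the two bags should reject the false hypothesis at about the same rate for a given number of draws, and that rate should climb as the number of draws goes up.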
JFS321 said:
sampling 25 marbles 50x is basically sampling the whole population,
It feels like that, but that is not the case. For all we know we picked only as many different marbles as the number of different colours we sampled. If all our samples were the same colour, it might have even been the same marble every time.
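
To put a number on how many different marbles actually get touched, here is a small sketch assuming a bag of 25 marbles, 50 draws with replacement, and a well-mixed bag so that every marble is equally likely on each draw. The function names are made up for illustration. The formula follows from each marble having a (1 - 1/N)^k chance of never being picked in k draws.

```python
import random

def expected_distinct(bag_size, draws):
    """Expected number of different marbles picked at least once (with replacement)."""
    return bag_size * (1 - (1 - 1 / bag_size) ** draws)

def simulated_distinct(bag_size, draws, trials, rng):
    """Average number of different marbles seen across simulated experiments."""
    total = 0
    for _ in range(trials):
        # Each marble is identified by an index; the set keeps only distinct ones.
        seen = {rng.randrange(bag_size) for _ in range(draws)}
        total += len(seen)
    return total / trials

rng = random.Random(0)  # fixed seed so the run is repeatable
print(f"formula:    {expected_distinct(25, 50):.2f}")
print(f"simulation: {simulated_distinct(25, 50, 10000, rng):.2f}")
```

With these numbers the average comes out a little under 22 of the 25 marbles, so even 50 draws typically leave a few marbles never seen, and any single run can miss more.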
 
Thanks. All of this makes good sense.

When you said "for all we know we picked only as many different marbles as the number of different colors..." ... Do you mean if we had 100 red, 50 blue, and 10 green, we may have picked the same 3 red, blue, and green marble each sampling event?
 
JFS321 said:
Thanks. All of this makes good sense.

When you said "for all we know we picked only as many different marbles as the number of different colors..." ... Do you mean if we had 100 red, 50 blue, and 10 green, we may have picked the same 3 red, blue, and green marble each sampling event?
Yes, it may be that, by coincidence, we picked the same red marble 100 times, the same blue one 50 times and the same green one 10 times.
 
Thanks for all of the help!
 
