- #1
alan2
Hi guys. I'm not a statistician, although I use statistics enough that I'm surprised this is bothering me. I'm doing hypothesis testing on a population of more than 100,000. What I'm wondering is whether there is any difference whatsoever between performing multiple tests on several samples and doing one test on a larger sample. For example, is one test on a sample of size 100 equivalent in all respects to 4 tests on samples of size 25 each? Is there any additional information to be gained by one method versus the other? If so, I can't find a reference (which leads me to believe there is no difference).
A bit of explanation might be helpful. This is an economics issue with essentially an infinite number of assets. There is some number of participants, each of whom may randomly choose some small fixed number of assets, say 20, from that infinite number available. What I would like to be able to say is, for example, "each participant has a 95% chance of choosing a set whose mean of property x lies in some interval", as opposed to "with 95% confidence, the population mean of property x lies in some interval". So it somehow seems to me that pulling multiple samples of size 20 and testing those would give me a better indication of the distribution of sample means than pulling one large sample. On the other hand, that seems dumb, and the two methods should be equivalent. Any guidance would be appreciated.
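To make the two statements concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: the population is made-up normal data, and the function names `sample_mean_interval` and `population_mean_ci` are hypothetical. It compares the interval that holds 95% of the means of size-20 samples with a 95% confidence interval for the population mean from one large sample.

```python
import random
import statistics


def sample_mean_interval(population, n=20, draws=2000, coverage=0.95, seed=0):
    """Empirical interval holding `coverage` of the means of size-n samples."""
    rng = random.Random(seed)
    # Each draw mimics one participant picking n assets at random.
    means = sorted(
        statistics.fmean(rng.sample(population, n)) for _ in range(draws)
    )
    lo_idx = round((1 - coverage) / 2 * draws)
    hi_idx = round((1 + coverage) / 2 * draws) - 1
    return means[lo_idx], means[hi_idx]


def population_mean_ci(sample, z=1.96):
    """Normal-approximation confidence interval for the population mean."""
    m = statistics.fmean(sample)
    se = statistics.stdev(sample) / len(sample) ** 0.5
    return m - z * se, m + z * se


# Hypothetical population of property x (assumed normal, mean 0.05, sd 0.20).
rng = random.Random(42)
population = [rng.gauss(0.05, 0.20) for _ in range(100_000)]

per_participant = sample_mean_interval(population, n=20)
big_sample_ci = population_mean_ci(rng.sample(population, 2000))

print("95% of size-20 sample means fall in:", per_participant)
print("95% CI for the population mean:     ", big_sample_ci)
```

Under these assumptions the first interval is much wider than the second, because its width scales with the standard deviation divided by the square root of 20 rather than the square root of the large sample's size. The two intervals describe different things, so they are not interchangeable even when both are built from the same population.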