How can I test the difference between two sample means from SAME population?

In summary, the thread discusses two samples, SET 1 and SET 2, each of size 500, which the original poster initially treats as coming from the same population. The mean price that doctors in SET 1 are willing to pay for a new drug is $20, while in SET 2, composed of doctors, nurses, and patients, the mean price is $40. The questions are how to test for a difference between the two sample means, whether a separate survey of nurses with the same sample size would give a mean price lower than that of SET 2, and whether the mean price of SET 1 is always lower than that of SET 2. The expert advises treating the groups as different populations and using a two-sample t-test.
  • #1
adgalo
Here is the scenario: Let's say that two samples (SET 1 and SET 2) are drawn from the same population (N=1000). SET 1 and SET 2 have the same sample size (n=500). SET 1 is composed of doctors only, and the average (mean) price that they are willing to pay for a certain product (a new drug) is $20. SET 2 is composed of doctors, nurses, patients, etc., and the average (mean) price that they are willing to pay for the drug is $40.

My questions are the following:

1. How can I test if there is a difference between two sample means from the SAME population?
2. If I run another survey for NURSES only, with the same sample size (n=500), will I get a sample mean price lower than the sample mean price of SET 2 (which is $40)?
3. Is the sample mean of SET 1, which is just a subset of SET 2 in terms of composition, always lower than the sample mean of SET 2?

Thank you very much
 
  • #2
Hey adgalo and welcome to the forums.

For your problem as you have described it, I would argue that these populations are not the same but different in terms of their characteristics.

Also, you have to be careful about your terminology, and I think it's a good idea for me to say a few words so that you don't misuse the statistical terminology in the future.

The first thing is that you don't have one population with 1000 data points: you have two. It's like sampling a country of 20 million people, made up of 10 million men and 10 million women, and saying that the men and the women reflect the same population.

This is not correct. While the men and the women are indeed subsets of the entire population, their characteristics are not the same, and for that reason you have two populations. If the characteristics were exactly the same, then yes, they would be part of the same population, but since they are not, they are not. It's important you realize this because making this error could be very costly for you in the future.

If you want to test whether these two samples are consistent with the same underlying population mean, then you need to use a two-sample t-test. If you had more than two groups, you would use what is called an ANOVA.

What you have to do is show that there is evidence either that the two samples come from the same population, or that there is evidence that they don't come from the same population under some kind of credibility or 'confidence' constraint. You can't just say they come from the same distribution: it doesn't work like that. You have to show enough evidence that they 'may' come from the same distribution because even though you may get evidence, it doesn't mean that they do. If you don't understand this, then you need to in order to understand what statistics is all about.

You may now ask what characteristics have to be the same? The answer depends on your question. You will have variation in some way, and the question will determine what you will actually be comparing and analyzing. As long as you have variation in some sense between variables, you will always have a hypothesis to test: again, you can't just say things come from the same distribution if there is variation in the characteristics! You have to show evidence for it! You can't just assume it! I stress this point because it is fundamental to understanding what statistics can and can't do and why we even use it in the first place.

With that said, let's go to the questions.

For 1. You need to treat them as two different populations and then do a two-sample t-test. There are variations of the test depending on whether you assume 'equal variances' and whether the data is paired (each element at index i in set A is 'linked' to the element at the same index in set B).
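To make the unequal-variance version concrete, here is a minimal sketch of Welch's two-sample t-test computed from summary statistics, using only the Python standard library. The means ($20, $40) and sample sizes (500 each) are from the original post; the standard deviations (8 and 15) are not given anywhere in the thread and are purely hypothetical. The p-value uses a normal approximation to the t distribution, which is only reasonable because the samples are large.

```python
import math
from statistics import NormalDist


def welch_t_from_stats(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of mean 1
    v2 = sd2 ** 2 / n2  # squared standard error of mean 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def approx_two_sided_p(t):
    """Two-sided p-value via the normal approximation (large samples only)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


# Means and n are from the post; the standard deviations are HYPOTHETICAL.
t_stat, df = welch_t_from_stats(20.0, 8.0, 500, 40.0, 15.0, 500)
p_value = approx_two_sided_p(t_stat)
print(f"t = {t_stat:.2f}, df = {df:.0f}, approximate p = {p_value:.3g}")
```

With real data you would plug in the actual sample standard deviations. For small samples, the p-value should come from the t distribution with the computed degrees of freedom rather than from the normal approximation used here.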

For 2. You have to actually do the experiment to find out! You can also test if these are likely to be from the same population using the kinds of procedures above, but if you have more than two groups I would recommend an ANOVA procedure for testing means.

For 3. Set 1 is not a subset of Set 2. They are independent samples and they refer to different sets of data. You can't just say because doctors are in set 1 and partially in set 2 that they are subsets: it's not true.
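One way to see why questions 2 and 3 cannot be answered without data: the mean of a mixed group is a weighted average of its subgroup means. The sketch below assumes, purely for illustration, that the doctors in the mixed group also average $20; the group shares and the nurse and patient means are hypothetical and do not come from the thread. Both scenarios give an overall mean of $40, yet the nurses' mean is below $40 in one and above it in the other.

```python
def mixed_mean(shares, means):
    """Weighted average of subgroup means; the shares must sum to 1."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(shares[g] * means[g] for g in shares)


# HYPOTHETICAL composition of the mixed group (not given in the thread).
shares = {"doctors": 0.25, "nurses": 0.25, "patients": 0.50}

# Two HYPOTHETICAL sets of subgroup means, both consistent with $40 overall.
scenario_a = {"doctors": 20.0, "nurses": 30.0, "patients": 55.0}
scenario_b = {"doctors": 20.0, "nurses": 60.0, "patients": 40.0}

for name, means in (("A", scenario_a), ("B", scenario_b)):
    overall = mixed_mean(shares, means)
    print(f"scenario {name}: overall = {overall:.2f}, nurses = {means['nurses']:.2f}")
```

So knowing only the doctors' mean and the overall mean does not pin down where the nurses' mean falls, which is why the survey actually has to be run.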

There's a lot more to this when you are doing sampling, and I'm not going to get into it right now because at the moment I see that you are having a bit of confusion with statistics and how you should think about it.

If you are doing an actual course, I would really talk with your teacher about these issues because these are really important.

If you are a researcher, scientist, engineer or some analyst trying to analyze data then you should talk to a statistician before you do anything else.

If you have specific questions, I will do my best to answer them, but again your understanding of statistics, how it works, and how it's used needs to be addressed.
 
  • #3
Thank you very much, chiro. I guess my problem boils down to how I treat them as separate populations. Lately, I realized that SET 1 and SET 2 are indeed from different populations based on their characteristics. Before I posted my problem here in the forum, I was thinking that ANOVA should be the method to test the difference between the two means, but my first assumption was that these two means came from the SAME population, and so I decided that ANOVA was not a good idea. Then you mentioned that these sets should be treated as separate populations, and I realized that my assumption was wrong. ^_^
 
  • #4
The ANOVA technique is used to generalize the t-test to more than two comparisons. If you are only comparing two groups then the t-test is suitable in its current form.
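A small sketch of that relationship: with exactly two groups, the one-way ANOVA F statistic equals the square of the pooled-variance (equal-variance) t statistic. The price lists below are hypothetical toy data, not values from the thread, and the function names are just for illustration.

```python
from statistics import mean, variance


def pooled_t(a, b):
    """Equal-variance (pooled) two-sample t statistic."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5


def one_way_anova_f(*groups):
    """One-way ANOVA F statistic for any number of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# HYPOTHETICAL prices for two small groups.
doctors = [18.0, 22.0, 19.0, 21.0, 20.0]
mixed = [35.0, 45.0, 38.0, 42.0, 40.0]

t = pooled_t(doctors, mixed)
f = one_way_anova_f(doctors, mixed)
print(f"t^2 = {t * t:.4f}, F = {f:.4f}")
```

Note that the equivalence holds for the pooled t-test only; Welch's unequal-variance version does not match the classical ANOVA F in general.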
 
  • #5
As you said earlier, a two-sample t-test can show evidence that two samples are from the same underlying population. What should my assumptions be here? Do I have to assume that my sample sizes (SET 1 and SET 2) are equal and that my population variances are unknown?
 

Question 1: What is the purpose of testing the difference between two sample means from the same population?

The purpose of testing the difference between two sample means is to judge whether the observed difference is larger than what random sampling variation alone would be expected to produce. This helps us decide whether the two groups plausibly share the same population mean or whether there is evidence that they differ.

Question 2: What statistical test can be used to compare the means of two samples from the same population?

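For two independent samples, as in this thread, the appropriate test is the two-sample t-test (Welch's version if the variances may differ). The paired t-test applies only when each observation in one sample is matched to a specific observation in the other, for example the same respondents measured twice. It then works on the within-pair differences and so accounts for the correlation between the two sets of measurements.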

Question 3: How do I know if the difference between two sample means from the same population is statistically significant?

To determine if the difference between two sample means is statistically significant, we compare the calculated p-value to the chosen significance level (usually 0.05). If the p-value is less than the significance level, we reject the null hypothesis of equal means and conclude that there is a significant difference between the two sample means. If it is not, we have failed to find evidence of a difference, which is not the same as showing that the means are equal.

Question 4: What assumptions need to be met for the paired t-test to be valid?

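The paired t-test assumes that the observations come in matched pairs, that the pairs are independent of one another, and that the within-pair differences are approximately normally distributed. It does not require the two sets of measurements to have equal variances or to be linearly related, because the test is carried out on the differences.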
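As an illustration of what "paired" means in practice, here is a minimal sketch of the paired t statistic, which is simply a one-sample t statistic computed on the within-pair differences. The data are hypothetical: imagine the same five respondents quoting a price on two occasions. Nothing like this appears in the thread, where the two samples are independent.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Paired t statistic and degrees of freedom, based on second - first."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# HYPOTHETICAL prices from the same five respondents on two occasions.
first = [20.0, 25.0, 18.0, 30.0, 22.0]
second = [23.0, 27.0, 22.0, 31.0, 27.0]

t_stat, df = paired_t(first, second)
print(f"paired t = {t_stat:.3f} with {df} degrees of freedom")
```

The statistic would then be compared against a t distribution with n - 1 degrees of freedom to get a p-value.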

Question 5: Can I use the paired t-test if my sample sizes are small?

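Yes, the paired t-test can be used with small samples as long as the assumptions are met, but the normality of the differences matters more when there are few pairs because there is less data to average out departures from it. If the number of pairs is very small (fewer than about 10) or the differences are clearly non-normal, a non-parametric alternative such as the Wilcoxon signed-rank test may be more appropriate.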
