Tests for Difference in Mean VERSUS Tests for Difference in Median

In summary, if both samples plausibly come from normal distributions, a test for a difference in mean (such as a two-sample t-test) is appropriate. If you do not have evidence of normality, or the data contain outliers or are only ordinal, a non-parametric test for a difference in median is usually the better choice. Whether the two variances can be treated as equal decides between the pooled and unpooled forms of the t-test, and paired observations call for a paired t-test.
  • #1
number0
Sup everyone,

Assume I have two sample observations.

I am wondering when I should use a test for a difference in mean and when I should use a test for a difference in median.

Should I test for mean if the distributions of both samples are normal?
Should I test for median otherwise?

I am confused!

Any help would be greatly appreciated.
 
  • #2
If you have sufficient evidence that the underlying distribution is normal, it would probably be better to base the test on the sampling distribution of the mean, which in that case is also normal.

There is no silver bullet answer for your problem because it depends on the assumptions you have and what you are trying to do: it's not completely a plug and chug mechanical process.

Also, you have to realize that your sample size is not great no matter what test you are doing. When you have as few observations as you have, it's probably better to use techniques that incorporate prior information, like those found in Bayesian statistics.

Also, I think I have misunderstood you: when you say two 'samples', do you mean two distinct collections of observations or do you mean two observations only? If it's the first, how many observations are in each sample?
 
  • #3
chiro said:
If you have sufficient evidence that the underlying distribution is normal, it would probably be better to base the test on the sampling distribution of the mean, which in that case is also normal.

There is no silver bullet answer for your problem because it depends on the assumptions you have and what you are trying to do: it's not completely a plug and chug mechanical process.

Also, you have to realize that your sample size is not great no matter what test you are doing. When you have as few observations as you have, it's probably better to use techniques that incorporate prior information, like those found in Bayesian statistics.

Also, I think I have misunderstood you: when you say two 'samples', do you mean two distinct collections of observations or do you mean two observations only? If it's the first, how many observations are in each sample?

Oh, my apologies! I meant to say that I have two distinct collections of observations (16 observations in total, 8 per category). Other than that information (aside from the actual data), I am not given any conditions to work with.
 
  • #4
Well, above you mentioned an assumption, namely that your data is approximately normal.

If the data is (or is approximately) normally distributed with an unknown population variance, a good choice would be a t-test.

Now again, more assumptions enter the equation. If the two sample variances are not significantly different from each other, you would use a pooled variance. If they are, you would not, and would use the unequal-variance (Welch) form of the t-test instead. Also, if the two processes are linked (or are thought to be) in a kind of "cause-effect" manner between pairs of observations, then you would consider a paired t-test.
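For reference, here are the standard textbook forms of the three t statistics mentioned above. The notation is assumed for illustration and is not taken from the thread: sample means \bar{x}_1 and \bar{x}_2, sample variances s_1^2 and s_2^2, sample sizes n_1 and n_2, and, for the paired case, the mean \bar{d} and standard deviation s_d of the n pairwise differences.

```latex
\[
s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2},
\qquad
t_{\text{pooled}} = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}
\quad (n_1 + n_2 - 2 \text{ degrees of freedom})
\]
\[
t_{\text{Welch}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}
\]
\[
t_{\text{paired}} = \frac{\bar{d}}{s_d / \sqrt{n}}
\quad (n - 1 \text{ degrees of freedom})
\]
```

With 8 observations per group, the pooled test has 14 degrees of freedom and the paired test has 7. The Welch test uses an approximate (Welch-Satterthwaite) degrees of freedom that depends on the two sample variances.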

All of the above tests also have distributional assumptions, and if these are not met, you may need to use tests that are a lot more complicated.

If you want to test whether a normal approximation is reasonable, there are several tests for this, but the main one is the Shapiro-Wilk test. Any decent statistical software package will do this very easily and quickly.

There are other tests, but I a) am not familiar with them and b) don't understand enough about their differences to give specific advice.

For this problem, I would check normality assumptions for both samples and then do a two-sample t-test. That said, if you want to draw conclusions that are statistically significant and useful, I would take a bit of time to either learn the statistics or to ask a statistician for advice.

If this is for some kind of research, I strongly recommend you get some advice. If this is a homework question, I would be interested to hear what course this is for and what statistical background you have, so that I can put your problem into the proper context.
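As a rough illustration of the workflow described in this post, here is a minimal Python sketch using SciPy. The function name `compare_means`, the `alpha` threshold, and the use of Levene's test to decide between pooled and unpooled variances are assumptions made for the example. The thread names only the Shapiro-Wilk test and the t-test.

```python
import numpy as np
from scipy import stats


def compare_means(a, b, alpha=0.05):
    """Check normality and variances, then run a two-sample t-test.

    `compare_means` and `alpha` are illustrative names, not from the thread.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Shapiro-Wilk: a small p-value is evidence against normality.
    # With only 8 observations per group the test has little power.
    _, shapiro_p_a = stats.shapiro(a)
    _, shapiro_p_b = stats.shapiro(b)

    # Levene's test (an assumed choice): a small p-value is evidence
    # that the two variances differ.
    _, levene_p = stats.levene(a, b)
    equal_var = bool(levene_p > alpha)

    # equal_var=True gives the pooled t-test, False gives Welch's t-test.
    t_stat, t_p = stats.ttest_ind(a, b, equal_var=equal_var)

    return {
        "shapiro_pvalues": (float(shapiro_p_a), float(shapiro_p_b)),
        "levene_pvalue": float(levene_p),
        "equal_var": equal_var,
        "t_statistic": float(t_stat),
        "t_pvalue": float(t_p),
    }
```

For paired observations, `stats.ttest_rel(a, b)` would take the place of `stats.ttest_ind`. Choosing the test from the outcome of preliminary tests is itself a judgment call, so treat the output as a starting point rather than a verdict.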
 
  • #5


Hello,

I understand your confusion and I am happy to provide some clarification on when to use a test for difference in mean versus a test for difference in median.

Firstly, it is important to understand the difference between mean and median. Mean is the average value of a dataset, while median is the middle value of a dataset when arranged in ascending or descending order. Mean is more sensitive to extreme values, while median is more robust to outliers.

When to use a test for difference in mean:
1. When the distribution of the data is approximately normal or bell-shaped.
2. When the data is measured on a continuous or interval scale.
3. When the data does not have a large number of outliers.

When to use a test for difference in median:
1. When the data is not normally distributed.
2. When the data is measured on an ordinal scale.
3. When the data has a large number of outliers.

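In general, if the data meet the assumptions of a parametric test (approximate normality and, for the pooled t-test, equal variances), a test for difference in mean is recommended. If they do not, a non-parametric test such as the Mann-Whitney U test or Mood's median test is usually the safer choice. Note that the Mann-Whitney U test is strictly a test for a difference in median only when the two distributions have the same shape.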
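For the non-parametric route, a minimal sketch with SciPy might look like the following. The choice of the Mann-Whitney U test and Mood's median test, and the function name `compare_locations_nonparametric`, are assumptions for illustration. The post above does not name a specific test.

```python
from scipy import stats


def compare_locations_nonparametric(a, b):
    """Run two common rank-based / median-based two-sample tests.

    The function name is hypothetical and chosen only for this example.
    """
    # Mann-Whitney U (Wilcoxon rank-sum): compares the two samples by ranks.
    u_stat, u_p = stats.mannwhitneyu(a, b, alternative="two-sided")

    # Mood's median test: counts observations above and below the grand median.
    mood_stat, mood_p, grand_median, _table = stats.median_test(a, b)

    return {
        "mannwhitney_u": float(u_stat),
        "mannwhitney_pvalue": float(u_p),
        "mood_statistic": float(mood_stat),
        "mood_pvalue": float(mood_p),
        "grand_median": float(grand_median),
    }
```

Mood's median test tests equality of medians directly but tends to have lower power. The Mann-Whitney U test is usually more powerful but is sensitive to differences in shape as well as location.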

I hope this helps to clarify when to use each type of test. It is also important to carefully consider the type of data and the research question being addressed when deciding on the appropriate test. If you are still unsure, consulting with a statistician or conducting further research on the topic may be beneficial. Best of luck with your analysis!
 

What is the difference between tests for difference in mean and tests for difference in median?

The main difference between these two types of tests is the measure of central tendency they are comparing. Tests for difference in mean compare the average value of a variable between two groups, while tests for difference in median compare the middle value of a variable between two groups.

When should I use a test for difference in mean versus a test for difference in median?

The decision to use one type of test over the other depends on the distribution of the data. In general, if the data is normally distributed, it is appropriate to use a test for difference in mean. If the data is skewed or not normally distributed, a test for difference in median may be more appropriate.

What are the assumptions for tests for difference in mean versus tests for difference in median?

The assumptions for tests for difference in mean include that the data are approximately normally distributed, that the observations are independent and, for the pooled t-test, that the variances of the two groups are equal. For tests for difference in median, the assumptions include that the data are at least ordinal and that the observations are independent.

Which type of test is more robust to outliers?

Tests for difference in median are generally more robust to outliers because the median is not affected by extreme values as much as the mean is. Therefore, if a dataset has a few extreme values, a test for difference in median may be more appropriate to use.
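A small numeric illustration of this point, using made-up values: replacing one observation with a gross outlier moves the mean a long way but leaves the median unchanged.

```python
import statistics

# Hypothetical sample of 8 observations.
clean = [10, 11, 12, 13, 14, 15, 16, 17]

# Same sample with the largest value replaced by a gross outlier.
contaminated = clean[:-1] + [170]

mean_clean = statistics.mean(clean)               # 13.5
median_clean = statistics.median(clean)           # 13.5
mean_contaminated = statistics.mean(contaminated)      # 32.625
median_contaminated = statistics.median(contaminated)  # 13.5

print(mean_clean, median_clean)
print(mean_contaminated, median_contaminated)
```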

Can I use both tests for difference in mean and tests for difference in median on the same dataset?

Yes, it is possible to use both types of tests on the same dataset. This may be helpful in determining which test is more appropriate or in comparing the results from both tests. However, it is important to consider the assumptions and limitations of each test before making a decision.
