What is the correct way of describing this change - mean or median?

  • Context: Undergrad 
  • Thread starter: musicgold
  • Tags: Change, Mean, Median

Discussion Overview

The discussion revolves around the appropriate statistical method for describing changes in a set of variable values, specifically whether to use the mean or median. Participants explore the implications of each approach in the context of central tendency, particularly in the presence of potential outliers.

Discussion Character

  • Debate/contested
  • Technical explanation
  • Conceptual clarification

Main Points Raised

  • One participant suggests four methods for describing the change: using the mean or median of the change values, or the differences between specific cells in an Excel file.
  • Another participant argues against using the median of the differences, stating that both distributions appear symmetric and without outliers, thus favoring the mean.
  • A later reply questions the reasoning behind dismissing the median, asking if it relates to the symmetry of the distribution of change values.
  • Another participant expresses uncertainty about the advantages of using the median of differences, suggesting it may only be beneficial under specific conditions involving strong correlations.
  • One participant emphasizes a preference for the median in the presence of outliers, indicating a different perspective on when to use each measure.
  • There is a discussion about the concept of 'measurement series' and its relevance to the analysis, with some participants seeking clarification on terminology.

Areas of Agreement / Disagreement

Participants express differing opinions on the appropriateness of using the mean versus the median to describe changes. There is no consensus on which method is superior, and the discussion remains unresolved regarding the best approach.

Contextual Notes

Participants mention the potential influence of outliers and the symmetry of distributions, but these factors are not fully explored or agreed upon. The discussion also touches on the relevance of ignoring certain variables in the analysis.

musicgold
Hi,

Please see the attached Excel file.

The list shows the old and new values of a set of variables. I am trying to understand which is the better way, average or median, to describe the change in the values of the set. I want to describe the true central tendency of the change.

1. I think there are four ways I can describe the change in the values of the set. Which one is the most accurate?

a) the average value changed by 0.15% (the mean of the change values shown in Cell E63)
b) the average value changed by 0.10% (the median of the change values shown in Cell E64)
c) the average value changed by 0.14% (the difference of Cell D63 and C63)
d) the average value changed by 0.25% (the difference of Cell D64 and C64)
2. I am ignoring the change values of variables X12 and X39, as these variables did not change. Is that correct?

Thanks.
 

Certainly not the median of the differences (b).

Both distributions look nice, not too asymmetric and without outliers. I think I would compare the average values. a and c should give the same result here.

musicgold said:
2. I am ignoring the change values of variables X12 and X39, as these variables did not change. Is that correct?
Don't ignore them, they are measured values! This could just be by chance, and also a result of your measurement resolution.
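
To make the four options concrete, here is a minimal sketch in Python. The numbers are hypothetical and are not taken from the attached Excel file; the variable names are illustrative only. It shows that the mean of the differences (a) and the difference of the means (c) agree when both are computed over the same rows, that the two median-based options (b) and (d) generally do not agree, and that dropping rows with zero change shifts the mean of the differences.

```python
from statistics import mean, median

# Hypothetical old/new values (NOT the data from the attached file).
old = [2.0, 3.5, 1.0, 4.0, 2.5, 3.0]
new = [2.4, 3.6, 1.0, 4.1, 3.5, 3.0]  # rows 3 and 6 did not change

# Per-row change, one value per variable.
diffs = [n - o for o, n in zip(old, new)]

mean_of_diffs = mean(diffs)                   # option (a)
median_of_diffs = median(diffs)               # option (b)
diff_of_means = mean(new) - mean(old)         # option (c)
diff_of_medians = median(new) - median(old)   # option (d)

# Same mean as (a), but with the unchanged rows thrown away.
nonzero_diffs = [d for d in diffs if d != 0]
mean_without_zeros = mean(nonzero_diffs)

print("(a) mean of differences:  ", round(mean_of_diffs, 4))
print("(b) median of differences:", round(median_of_diffs, 4))
print("(c) difference of means:  ", round(diff_of_means, 4))
print("(d) difference of medians:", round(diff_of_medians, 4))
print("(a) with zero rows dropped:", round(mean_without_zeros, 4))
```

In this toy data, (a) and (c) match up to floating-point rounding, while leaving out the two unchanged rows raises the mean change. Whether that accounts for the gap between 0.15% and 0.14% in the original file is only a guess, since the file's contents are not shown here.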
 
mfb said:
Certainly not the median of the differences (b).

Both distributions look nice, not too asymmetric and without outliers. I think I would compare the average values. a and c should give the same result here.

Can you explain why you think B should not be used? Is it because the distribution of change values is not symmetrical?
 
I cannot imagine a scenario where the median of the differences would have an advantage over anything else, unless the correlation between your measurement series is much stronger than the correlation within the series (so you have something like 0.10 -> 0.11, 45343.44 -> 45343.45 and similar things, together with some outliers so the mean cannot be used). But then you should not compare the series like that anyway.
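
A minimal sketch of the scenario described above, using hypothetical numbers: values spread over several orders of magnitude, each shifted by the same small amount between the two columns, plus one gross outlier. The data and names are assumptions made for illustration only.

```python
from statistics import mean, median

# Hypothetical first column: values spanning several orders of magnitude.
old = [0.10, 5.20, 120.00, 980.75, 45343.44]

# Second column: every value shifted by the same small amount...
shift = 0.01
new = [o + shift for o in old]
# ...except one row, which is turned into a gross outlier.
new[1] = old[1] + 500.0

diffs = [n - o for o, n in zip(old, new)]

median_of_diffs = median(diffs)               # stays near the common shift
mean_of_diffs = mean(diffs)                   # dragged up by the outlier
diff_of_medians = median(new) - median(old)   # also distorted in this example

print("median of differences:", median_of_diffs)
print("mean of differences:  ", mean_of_diffs)
print("difference of medians:", diff_of_medians)
```

Here the median of the differences stays close to the common shift, while the mean of the differences and the difference of the medians are both dominated by the single outlier row. That is the narrow case in which the median of the differences has an edge.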
 
I am not sure what you are saying here.

I prefer the median over the mean when there are outliers in the data.

mfb said:
I cannot imagine a scenario where the median of the differences would have an advantage over anything else, unless the correlation between your measurement series is much stronger than the correlation within the series (so you have something like 0.10 -> 0.11, 45343.44 -> 45343.45
Are you talking about auto-correlation here?
Also, I don't know what you mean by 'measurement series'.
 
musicgold said:
I prefer the median over the mean when there are outliers in the data.
Right.

musicgold said:
Are you talking about auto-correlation here?
It is related to that.

musicgold said:
Also, I don't know what you mean by 'measurement series'.
Columns in your Excel file.
 
