Standard deviation versus absolute mean deviation

In summary, the conversation discusses the advantages of the mean absolute deviation (called MAD in one reply and MD in another) over the standard deviation (SD), particularly when slight errors may occur in the data. It is contrasted with the median absolute deviation (also abbreviated MAD), which is a more robust measure of variability. The conversation also identifies the original poster's pairwise variation as the Gini mean difference and distinguishes two versions: GiniA, which averages the n*(n-1) differences between distinct entries, and GiniB, which also counts the n zero self-differences and divides by n*n. Whether GiniB is useful is left as an open question.
  • #1
Twinbee
What are the advantages of using the absolute mean deviation over the standard deviation? Is it possible to show a simple example where the former is more (or less) appropriate?

Also, related to the mean deviation is my own variation. Does it have a name? Instead of calculating the absolute differences from the mean for each number, my technique would instead find the average of all the absolute differences for each number against each other number.

So for example: given the numbers 3,7,7,19
Average is: 9
Absolute Mean deviation is: 5
My 'special' deviation is: 6

This is found thusly:

(|3-3| + |7-3| + |7-3| + |19-3| +
|3-7| + |7-7| + |7-7| + |19-7| +
|3-7| + |7-7| + |7-7| + |19-7| +
|3-19| + |7-19| + |7-19| + |19-19| ) / 16

= 6

As you can see, everything is compared against everything else. What do people here think? One could also remove the 3-3, 7-7, 7-7, and 19-19 bits, and then divide by 12 for a similar variation (results in 8 by the way).
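The arithmetic above can be checked with a short Python sketch. The function names are made up for illustration, and the `include_self` flag is just one way of switching between the divide-by-16 and divide-by-12 variants described in the post.

```python
from itertools import product
from statistics import mean


def mean_abs_deviation(xs):
    """Average absolute distance of each value from the arithmetic mean."""
    m = mean(xs)
    return sum(abs(x - m) for x in xs) / len(xs)


def pairwise_mean_difference(xs, include_self=True):
    """Average |a - b| over all ordered pairs of entries in xs.

    include_self=True  keeps the n zero self-pairs and divides by n*n.
    include_self=False drops them and divides by n*(n-1).
    The self-pairs add nothing to the sum, so only the divisor changes.
    """
    n = len(xs)
    total = sum(abs(a - b) for a, b in product(xs, repeat=2))
    return total / (n * n if include_self else n * (n - 1))


data = [3, 7, 7, 19]
print(mean(data))                                          # 9
print(mean_abs_deviation(data))                            # 5.0
print(pairwise_mean_difference(data, include_self=True))   # 6.0
print(pairwise_mean_difference(data, include_self=False))  # 8.0
```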

Could this method be usefully applied in stats?
 
  • #2
Your variation is essentially the Gini Mean difference, if I understand your explanation correctly.

The Mean Absolute Deviation (MAD), which is

[tex]
\frac 1 n \sum |x_i - \bar x |
[/tex]

was proposed as an estimate of variation, but in the case of normally distributed data it is neither unbiased nor particularly efficient, compared to the usual estimates.

Note that there are other, better (more robust) measures of variability. The median absolute deviation (another MAD) is

[tex]
MAD = \text{median}_i \left( | x_i - \text{median}_j (x_j)|\right)
[/tex]

(The same name is also given to this estimate: [tex] 1.4826 \cdot MAD [/tex]. Multiplying by 1.4826 makes it a consistent estimate of [tex] \sigma [/tex] when the data are normally distributed. Here also [tex] MAD [/tex] refers to the median absolute deviation.)
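A minimal Python sketch of the median absolute deviation, using only the standard library. The function name and the `scale` parameter are illustrative assumptions; the constant 1.4826 is the one quoted in the post.

```python
from statistics import median

# Scale factor quoted in the post above; roughly 1 / 0.6745 for normal data.
NORMAL_SCALE = 1.4826


def median_abs_deviation(xs, scale=1.0):
    """Median of the absolute deviations from the median, times `scale`."""
    center = median(xs)
    return scale * median([abs(x - center) for x in xs])


data = [3, 7, 7, 19]  # the example values from post #1
raw = median_abs_deviation(data)
scaled = median_abs_deviation(data, scale=NORMAL_SCALE)
print(raw)     # 2.0
print(scaled)  # the raw value multiplied by 1.4826
```

With `scale=1.0` this is the plain median absolute deviation; passing `NORMAL_SCALE` gives the rescaled version that is comparable to a standard deviation under normality.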
 
  • #3
Ah, the Gini Mean Difference looks like the one. I wonder what applications it should be used for over SD or MD.

Quoting #2: "was proposed as an estimate of variation, but in the case of normally distributed data it is neither unbiased nor particularly efficient, compared to the usual estimates."
Interesting you say that it's biased. Doesn't it depend on the distribution? In that sense, the standard deviation would appear biased for uniformly distributed (non-normal) data.
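
As a side note on the bias question (these are standard textbook results, not taken from the thread): the mean absolute deviation about the mean equals a fixed fraction of [tex] \sigma [/tex], and that fraction depends on the distribution. For a normal distribution and for a uniform distribution on [tex] [a, b] [/tex]:

[tex]
E|X - \mu| = \sigma \sqrt{2/\pi} \approx 0.798\,\sigma \quad \text{(normal)}, \qquad E|X - \mu| = \frac{b - a}{4} = \frac{\sqrt{3}}{2}\,\sigma \approx 0.866\,\sigma \quad \text{(uniform)}
[/tex]

So whether the mean deviation counts as "biased" depends on what it is meant to estimate: taken as an estimate of [tex] \sigma [/tex] it needs a rescaling factor, and that factor differs from one distribution to another.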

There's this page which sings the praises of the MD over the SD and says it should be used in most cases of 'real data' where even slight errors may creep in. One other advantage apparently is that when outliers (long tail data) are squared, this creates bias, and the MD avoids this. Of course, I'm not sure how much all this is true, but here's the page:

http://www.leeds.ac.uk/educol/documents/00003759.htm

Thanks for the reply.
 
  • #4
An old thread, but a goodie. My first post did indeed describe the Gini mean difference, but I described two different versions which I'll call GiniA and GiniB:

GiniA(3,7,7,19) = 8
GiniB(3,7,7,19) = 6

For the above values, 8 is the correct answer according to the standard Wikipedia definition. However, I think a 'better?' value may be 6, as GiniB gives, since it includes the 4 missing self-differences (3-3, 7-7, 7-7 and 19-19) and divides the result by n*n instead of n*(n-1). This is demonstrated in my first post as well (the same sum divided by 16 instead of by 12).

Is there any reason to believe GiniB could be useful? It seems natural to divide by 16, since one could say that each value is an 'error of itself' (i.e. no error) as well as errors of each other.
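
A short worked relation between the two versions may help here. GiniA and GiniB are this thread's names, not standard terminology. Because every self-difference is zero, both versions share the same numerator and differ only in the divisor:

[tex]
\text{GiniB} = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} |x_i - x_j| = \frac{n-1}{n} \cdot \frac{1}{n(n-1)} \sum_{i \ne j} |x_i - x_j| = \frac{n-1}{n}\,\text{GiniA}
[/tex]

For the example, n = 4 gives (3/4) * 8 = 6, matching the values above. The factor (n-1)/n tends to 1 as n grows, so the two versions agree for large samples. This resembles the choice between dividing by n and by n-1 in the sample variance.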
 
  • #5



Standard deviation and absolute mean deviation are two common measures of variability used in statistics. The standard deviation is the square root of the average squared deviation from the mean, while the absolute mean deviation, as defined earlier in this thread, is the average absolute deviation from the mean. (A variant measured about the median also exists.) Both measures have their own advantages and can be useful in different situations.

One advantage of using absolute mean deviation over standard deviation is that it is less sensitive to extreme values or outliers in the data. Standard deviation uses the squared differences between the data values and the mean, so a single extreme value can dominate it. Absolute mean deviation uses the absolute differences, so an outlier contributes in proportion to its distance rather than to its distance squared. It is still not fully robust, since one sufficiently extreme value can move it arbitrarily far; the median absolute deviation is the more robust choice.

To illustrate this, consider the following set of data: 2, 4, 6, 8, 10, 100. The mean is about 21.67 and the median is 7. The standard deviation is about 35.1 (dividing by n) or 38.5 (dividing by n-1), and it is dominated by the extreme value of 100. The absolute mean deviation is about 26.1: smaller than the standard deviation, but still pulled far above the typical spacing of the other five values. The median absolute deviation, by contrast, is only 3, which reflects the spread of the bulk of the data.
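
The figures for this data set can be recomputed with Python's standard `statistics` module. This is a sketch for checking the arithmetic; the dictionary keys are just descriptive labels, and both the divide-by-n and divide-by-(n-1) standard deviations are shown because the text does not say which one is meant.

```python
from statistics import mean, median, pstdev, stdev

data = [2, 4, 6, 8, 10, 100]

m = mean(data)
med = median(data)

summary = {
    "mean": m,
    "median": med,
    "standard deviation (divide by n)": pstdev(data),
    "standard deviation (divide by n-1)": stdev(data),
    # Average absolute distance from the mean.
    "mean absolute deviation": sum(abs(x - m) for x in data) / len(data),
    # Median absolute distance from the median.
    "median absolute deviation": median([abs(x - med) for x in data]),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```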

In terms of your own variation, it is not the median absolute deviation. As noted in #2, averaging the absolute differences of each number against each other number gives the Gini mean difference, which measures spread without reference to any central value such as the mean or median.

This method can be useful in situations where the data contains extreme values or if you want a measure of variability that is not as influenced by these values. However, it may not be as commonly used in statistics as standard deviation or absolute mean deviation. It is always important to consider the nature of your data and the purpose of your analysis when choosing a measure of variability.
 

1. What is the difference between standard deviation and absolute mean deviation?

Standard deviation is the square root of the mean squared deviation from the mean, while absolute mean deviation, as defined in this thread, is the mean of the absolute deviations from the mean.

2. Which measure of variability should I use - standard deviation or absolute mean deviation?

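This depends on the specific characteristics of your data and the purpose of your analysis. Standard deviation squares each deviation, so it is more sensitive to extreme values, while absolute mean deviation weights each deviation in proportion to its size. For normally distributed data, the standard deviation is the more efficient estimate of spread.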

3. Can I use both standard deviation and absolute mean deviation in my analysis?

Yes, it is possible to use both measures of variability in your analysis. They provide different insights into the spread of the data and can complement each other.

4. Is standard deviation always a larger value than absolute mean deviation?

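It is never smaller. When both are computed about the mean with the same divisor n, the standard deviation is greater than or equal to the absolute mean deviation, because the root mean square of a set of non-negative numbers is at least their arithmetic mean. The two are equal only when every value lies at the same distance from the mean, as in the data set 1, 3.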

5. How do I interpret the values of standard deviation and absolute mean deviation?

Standard deviation and absolute mean deviation both measure the spread of data, with larger values indicating a larger spread, and both are expressed in the same units as the original data. Absolute mean deviation is the average distance of a value from the mean; standard deviation is the root-mean-square distance, which gives more weight to large deviations.
