Why is the mode usually not as useful?

  • Context: High School 
  • Thread starter: Cheesycheese213
  • Tags: Mode

Discussion Overview

The discussion centers around the utility of the mode as a measure of central tendency in various contexts, including theoretical models and practical applications. Participants explore its reliability compared to other measures like the mean and median, and discuss specific scenarios where the mode may or may not be informative.

Discussion Character

  • Debate/contested
  • Technical explanation
  • Conceptual clarification

Main Points Raised

  • Some participants note that the mode can be unreliable, particularly in experimental data where it may not reflect the distribution well.
  • Others argue that in asymmetric distributions, the mode may not provide meaningful information, citing the mode of an exponential distribution as an example.
  • A participant suggests that the mean, median, mode, and range collectively characterize a dataset, indicating that the mode can be useful in specific contexts, such as generating datasets for software testing.
  • Some participants highlight that the Central Limit Theorem gives the mean statistical properties that the mode lacks, making the mean more significant in many cases.
  • One participant discusses the example of birthdays, arguing that the mode may not be useful with small sample sizes and can be undefined in certain cases, while the median tends to provide more consistent information as sample size increases.
  • Another participant points out that the mode can indicate days with higher birth rates, contrasting it with the mean and median, which may not reflect this information accurately.
  • A participant mentions that in a uniform distribution, the mode can be particularly uninformative, as it may fluctuate widely regardless of sample size.

Areas of Agreement / Disagreement

Participants express a range of views on the usefulness of the mode, with some agreeing on its limitations while others highlight specific contexts where it may still hold value. The discussion remains unresolved regarding the overall utility of the mode compared to other measures of central tendency.

Contextual Notes

Participants acknowledge that the effectiveness of the mode can depend on the nature of the data distribution, sample size, and specific applications, but do not resolve these complexities.

Cheesycheese213
I know that there are some cases where the mode just isn’t very helpful for finding central tendency, but I have never heard any real specific reason why other than it isn’t too reliable.
 
* In experimental data, the mode doesn't have to be anywhere close to most of the distribution
* Even in a theoretical model: You can have extremely asymmetric distributions, where the mode doesn't tell you much. The mode of an exponential distribution is 0. How does that help?
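To make the exponential example concrete, here is a minimal sketch in Python. The rate, sample size, seed, and bin width are arbitrary choices for illustration, not values from the thread. For rate λ the density λe^(-λx) is largest at x = 0, while the mean is 1/λ and the median is ln(2)/λ, so the mode sits at the edge of the distribution rather than near its bulk.

```python
import math
import random
import statistics

def exponential_summary(rate=0.5, n=100_000, bin_width=0.25, seed=0):
    """Estimate mean, median, and modal histogram bin of Exp(rate) samples."""
    rng = random.Random(seed)
    samples = [rng.expovariate(rate) for _ in range(n)]
    # Count samples per bin; the fullest bin approximates the mode.
    counts = {}
    for x in samples:
        b = int(x // bin_width)
        counts[b] = counts.get(b, 0) + 1
    modal_bin = max(counts, key=counts.get)
    return {
        "mean": statistics.fmean(samples),
        "median": statistics.median(samples),
        "modal_bin_start": modal_bin * bin_width,
    }

if __name__ == "__main__":
    print(exponential_summary())
    print("theory: mean =", 1 / 0.5, "median =", math.log(2) / 0.5, "mode = 0")
```

The histogram bin is only a stand-in for the mode of continuous data; a different bin width would give a different estimate, which is itself one reason the mode is awkward for continuous measurements.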
 
I think the mean, median, mode and range help to characterize the collection of numbers you have, and so the mode is useful in that sense. I could imagine someone wanting a special collection of numbers for software testing where they specify these four values and you are left to generate the collection.

I’ve done something like this to create a fake customer transaction database for a web store. I created lookup lists of codes. The list matched the desired statistical breakdown and then I would generate a random index into the list to select a value for the record I was creating.

As an example, my list of grades might be ‘aabbbbcccccccccdddf’, so I’d query the list with a random key from 0 to 18 (the list as written has 19 elements) to get a letter grade to assign to a student. The list was created with a median of c, a mode of c (9 c’s), and an average slightly above a c.
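The lookup-list approach can be sketched in a few lines of Python. This is an illustration rather than the poster's actual code: the grade-point mapping (a=4 down to f=0) and the helper names are assumptions made here so that an average can be computed. The index range is taken from the length of the list, so it stays correct whatever the list contains.

```python
import random
import statistics

# Lookup list with the breakdown from the post: 2 a's, 4 b's, 9 c's, 3 d's, 1 f.
GRADES = "a" * 2 + "b" * 4 + "c" * 9 + "d" * 3 + "f"
# Hypothetical grade-point mapping, used only to compute an average.
POINTS = {"a": 4, "b": 3, "c": 2, "d": 1, "f": 0}

def random_grade(rng):
    """Pick a grade by drawing a random index into the lookup list."""
    return GRADES[rng.randrange(len(GRADES))]

def describe(grades):
    """Return (mode, median letter, mean grade points) for letter grades."""
    points = sorted(POINTS[g] for g in grades)
    by_points = {v: k for k, v in POINTS.items()}
    median_letter = by_points[statistics.median_low(points)]
    return statistics.mode(grades), median_letter, statistics.fmean(points)

if __name__ == "__main__":
    rng = random.Random(1)
    print(len(GRADES), describe(GRADES))
    print([random_grade(rng) for _ in range(10)])
```

Because every draw is an independent pick from the list, a large generated dataset approaches the list's own breakdown, while a small one can differ from it noticeably.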
 
mfb said:
* Even in a theoretical model: You can have extremely asymmetric distributions, where the mode doesn't tell you much. The mode of an exponential distribution is 0. How does that help?

This basically is the maximum likelihood criterion though

(I am not a big fan of it, but still)
 
Cheesycheese213 said:
I know that there are some cases where the mode just isn’t very helpful for finding central tendency, but I have never heard any real specific reason why other than it isn’t too reliable.

"Central tendency" is a vague property unless you specify a quantitative measure for it. There are more well-known theorems about the mean of a probability distribution than about its mode. So if you use the mean in your work, you have, in a manner of speaking, more guarantees about the performance of the tool than if you use the mode.
 
Cheesycheese213 said:
I know that there are some cases where the mode just isn’t very helpful for finding central tendency, but I have never heard any real specific reason why other than it isn’t too reliable.
You can have bimodal distributions.

I typically only use the mode for categorical/nominal data.
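Both points can be illustrated with Python's standard `statistics` module (Python 3.8 or later for `multimode`). The data below are made up for the example.

```python
import statistics

# Hypothetical bimodal sample: 2 and 8 each occur three times.
values = [1, 2, 2, 2, 5, 8, 8, 8, 9]
print(statistics.multimode(values))  # every value tied for most frequent
print(statistics.mean(values), statistics.median(values))

# Hypothetical nominal data: values cannot be averaged or ordered,
# so the mode is the only one of the three measures that applies.
colors = ["red", "blue", "blue", "green", "blue", "red"]
print(statistics.mode(colors))
```

In the numeric sample no single value is "the" mode, and the mean and median both land between the two peaks. For the color labels the situation is reversed: the mode is well defined and the other two measures are not.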
 
If you ask random people to choose one of the numbers 1, 2, 3, and 4 based only on intuition, prior knowledge of the mode of a large number of responses can perhaps give some ability to predict which number a newly asked person is more or less likely to choose.
 
1) The Central Limit Theorem gives the mean a significance and statistical properties that the median and mode do not have.

2) Consider the example of the birthdays (1...365) in a group of people (see https://en.wikipedia.org/wiki/Birthday_problem ). Each birthday has a uniform distribution in (1...365).
Consider the mean of a sample of birthdays. One would want to say that a measure of the center of the distribution is at (1 + 365)/2 = 183. That would be the mean. It tells you something about the distribution.

Now consider the mode. With a sample of fewer than 23 birthdays, there is probably not a mode since there are probably no duplicate birthdays. So that tells you nothing. With a sample of 23, there probably is one duplicate and a mode, but it can be anywhere in (1..365). With a sample of 31, there are probably two duplicates, so the mode is again undefined. With a significantly larger sample, there is probably a unique mode which is equally likely to be anywhere within (1..365). The mode is not very useful for indicating anything about the sample.

Now consider the median. The median would tend toward the mean as the sample grows. It has some use.
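The threshold of 23 can be checked with a short calculation. The sketch below assumes, as the post does, independent birthdays that are uniform over 365 days; the function name is chosen here for illustration.

```python
def prob_shared_birthday(n, days=365):
    """Exact probability that at least two of n people share a birthday,
    assuming independent, uniformly distributed birthdays."""
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct

if __name__ == "__main__":
    for n in (10, 22, 23, 31, 50):
        print(n, round(prob_shared_birthday(n), 3))
```

With 22 people a duplicate is slightly less likely than not, and with 23 it is slightly more likely, so below that size the sample most often has every value tied at one occurrence and no meaningful mode.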
 
FactChecker said:
Each birthday has a uniform distribution in (1...365).
It turns out birthdays don't have a uniform distribution. The mode tells you something about days with a higher number of births (e.g. 1.1., 2.2., 3.3., ...), but mean and median only give a comparison between the first and second half of the year. In addition, they depend on the arbitrary definition of the start of a year, while the mode does not.
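The dependence on where the year starts can be shown with a small sketch. The birthdays below are invented and the 10-day shift is arbitrary; `relabel` and `unlabel` are helper names introduced here. Each statistic is computed under both numbering conventions and then converted back to the original day numbers for comparison.

```python
import statistics

DAYS = 365

def relabel(day, shift):
    """Day number if the year is taken to start `shift` days earlier."""
    return (day - 1 + shift) % DAYS + 1

def unlabel(day, shift):
    """Inverse of relabel; also works for fractional day numbers."""
    return (day - 1 - shift) % DAYS + 1

# Hypothetical birthdays, clustered around the turn of the year.
birthdays = [1, 1, 1, 100, 200, 364, 365]
shift = 10
shifted = [relabel(d, shift) for d in birthdays]

mode_a = statistics.mode(birthdays)
mode_b = unlabel(statistics.mode(shifted), shift)
mean_a = statistics.fmean(birthdays)
mean_b = unlabel(statistics.fmean(shifted), shift)

if __name__ == "__main__":
    print(mode_a, mode_b)  # same calendar day under both conventions
    print(mean_a, mean_b)  # different calendar days
```

The mode points to the same calendar day either way, while the mean moves by more than three months, because days 364 and 365 wrap around to small numbers once the start of the year is shifted.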
 
  • #10
mfb said:
It turns out birthdays don't have a uniform distribution.

I'll buy that. I didn't think of that.

mfb said:
The mode tells you something about days with a higher number of births (e.g. 1.1., 2.2., 3.3., ...)

It tells you something about the one day with the highest number of births, but nothing about the other days.

mfb said:
but mean and median only give a comparison between the first and second half of the year.

I would say that the median tells you the location of the first half of the probability, not necessarily the first half of the year.

mfb said:
In addition, they depend on the arbitrary definition of the start of a year, while the mode does not.

I guess that the same things that influence the probability distribution of birth dates, and therefore the mode, would be reflected in some way in both the mean and the median.
 
  • #11
FactChecker said:
I would say that the median tells you the location of the first half of the probability, not necessarily the first half of the year.
What I meant: A median or mean that is a few days before the middle of the year tells you the first half of the year has more births. The median also allows a rough estimate of how many more births there are in the first half.
 
  • #12
An example where the mode tells you virtually nothing is the uniform distribution, which occurs (or is assumed) very often. No matter how large the sample size is, the mode will keep jumping around within the entire range of the distribution. Here is an example where the values are uniformly distributed on the integers 0, 1, ..., 9. As the sample size increases to 1000 in increments of 10, the mean and median settle down quickly but the mode never does:
[Figure: meanMedianModeExample.png (mean, median, and mode versus sample size)]
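An experiment of this kind can be reproduced roughly as follows. This is a sketch, not the code behind the attached plot: the seed, the function name, and the tie rule are choices made here. On Python 3.8 or later, `statistics.mode` returns the first value encountered among ties.

```python
import random
import statistics

def running_summaries(max_n=1000, step=10, seed=0):
    """Draw uniformly from 0..9 and summarize the first n draws
    for n = step, 2*step, ..., max_n."""
    rng = random.Random(seed)
    draws = [rng.randrange(10) for _ in range(max_n)]
    rows = []
    for n in range(step, max_n + 1, step):
        sample = draws[:n]
        rows.append((
            n,
            statistics.fmean(sample),
            statistics.median(sample),
            statistics.mode(sample),  # first value encountered if tied
        ))
    return rows

if __name__ == "__main__":
    for n, mean, median, mode in running_summaries()[::10]:
        print(f"n={n:4d}  mean={mean:.2f}  median={median}  mode={mode}")
```

The mean should approach 4.5 and the median should settle at 4, 4.5, or 5, since those follow from the whole sample. The mode depends only on which of ten nearly equal counts happens to be ahead, so a small change in the sample can move it anywhere from 0 to 9.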
