High School Why is the mode usually not as useful?

  • Thread starter: Cheesycheese213
  • Tags: Mode
The mode is often less useful for determining central tendency due to its lack of reliability, especially in asymmetric distributions where it may not represent the data well. In experimental data, the mode can be far from the majority of the distribution, making it less informative than the mean or median. The Central Limit Theorem supports the mean's significance, providing more guarantees about its performance compared to the mode. While the mode can be useful for categorical data and identifying specific occurrences, it often fails to provide meaningful insights in larger samples or uniform distributions. Overall, the mode's variability and limited applicability in certain contexts diminish its effectiveness as a measure of central tendency.
Cheesycheese213
I know that there are some cases where the mode just isn’t very helpful for finding central tendency, but I have never heard any real specific reason why other than it isn’t too reliable.
 
* In experimental data, the mode doesn't have to be anywhere close to most of the distribution
* Even in a theoretical model: You can have extremely asymmetric distributions, where the mode doesn't tell you much. The mode of an exponential distribution is 0. How does that help?
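A quick simulation makes the exponential example concrete. This is only a sketch: the rate of 1, the bin width of 0.25, the sample size and the function name are illustrative choices, not anything from the posts. For continuous data the mode has to be estimated, and here it is taken as the left edge of the fullest histogram bin.

```python
import random
import statistics


def summarize_exponential(n=20_000, rate=1.0, width=0.25, seed=0):
    """Draw n exponential samples and compare mean, median and a binned mode."""
    rng = random.Random(seed)
    samples = [rng.expovariate(rate) for _ in range(n)]

    # Count samples per histogram bin; the fullest bin is a crude mode estimate.
    counts = {}
    for x in samples:
        bin_index = int(x / width)
        counts[bin_index] = counts.get(bin_index, 0) + 1
    fullest = max(counts, key=counts.get)

    return {
        "mean": statistics.fmean(samples),       # theory: 1 / rate
        "median": statistics.median(samples),    # theory: ln(2) / rate
        "modal_bin_start": fullest * width,      # theory: the mode is 0
    }


if __name__ == "__main__":
    print(summarize_exponential())
```

The mean and median land near 1 and 0.69 and change if you change the rate, while the estimated mode sits at the left edge no matter what the rate is, so it says nothing about the scale of the data.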
 
I think the mean, median, mode and range help to characterize the collection of numbers you have, and so the mode is useful in that sense. I could imagine someone wanting a special collection of numbers for software testing, where they specify these four values and you are left to generate the collection.

I’ve done something like this to create a fake customer transaction database for a web store. I created lookup lists of codes. The list matched the desired statistical breakdown and then I would generate a random index into the list to select a value for the record I was creating.

As an example, my list of grades might be ‘aabbbbcccccccccdddf’, so I’d query the list with a random index from 0 to 18 (the list has 19 elements) to get a letter grade to assign to a student. The list was built so that the median is c, the mode is c (9 c’s), and the average is slightly above a c.
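The lookup-list trick can be sketched in a few lines. The letter counts below are my reading of the list in the post, and the 4-to-0 grade-point scale and the function names are hypothetical, added only so that a mean can be computed. The code indexes with the list's own length, so nobody has to count letters by hand.

```python
import random
import statistics
from collections import Counter

# Letter counts as read from the example list in the post (an assumption).
GRADE_COUNTS = {"a": 2, "b": 4, "c": 9, "d": 3, "f": 1}
# Hypothetical grade-point scale, used only to compute a mean and median.
GRADE_POINTS = {"a": 4, "b": 3, "c": 2, "d": 1, "f": 0}


def build_lookup(counts):
    """Repeat each letter according to its desired frequency."""
    return "".join(letter * n for letter, n in counts.items())


def draw_grades(lookup, how_many, seed=None):
    """Pick grades by generating a random index into the lookup string."""
    rng = random.Random(seed)
    return [lookup[rng.randrange(len(lookup))] for _ in range(how_many)]


lookup = build_lookup(GRADE_COUNTS)
points = sorted(GRADE_POINTS[g] for g in lookup)

if __name__ == "__main__":
    print("list length:", len(lookup))
    print("mode:", Counter(lookup).most_common(1)[0][0])
    print("median points:", statistics.median(points))
    print("mean points:", round(statistics.fmean(points), 2))
    print("sample:", "".join(draw_grades(lookup, 30, seed=1)))
```

Grades drawn this way follow the list's proportions only on average, so a small batch of students can easily show a different mode than the list itself.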
 
mfb said:
* Even in a theoretical model: You can have extremely asymmetric distributions, where the mode doesn't tell you much. The mode of an exponential distribution is 0. How does that help?

This basically is the maximum likelihood criterion though

(I am not a big fan of it, but still)
 
Cheesycheese213 said:
I know that there are some cases where the mode just isn’t very helpful for finding central tendency, but I have never heard any real specific reason why other than it isn’t too reliable.

"Central tendency" is a vague property unless you specify a quantitative measure for it. There are more well-known theorems about the mean of a probability distribution than about its mode. So if you use the mean in your work, you have, in a manner of speaking, more guarantees about the performance of the tool than if you use the mode.
 
Cheesycheese213 said:
I know that there are some cases where the mode just isn’t very helpful for finding central tendency, but I have never heard any real specific reason why other than it isn’t too reliable.
You can have bimodal distributions.

I typically only use the mode for categorical/nominal data.
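Both points can be illustrated with Python's standard `statistics` module (`multimode` needs Python 3.8 or later). The numbers and colours below are made-up data chosen for illustration, not anything from the thread.

```python
import statistics

# Made-up bimodal sample: one cluster around 2, another around 8.
bimodal = [1, 2, 2, 2, 3, 7, 8, 8, 8, 9]

# multimode returns every value tied for the highest count.
modes = statistics.multimode(bimodal)
mean = statistics.mean(bimodal)
median = statistics.median(bimodal)

# Made-up nominal data: a mean or median of colours is meaningless,
# but the most common category is still well defined.
colours = ["red", "blue", "red", "green", "red", "blue"]
top_colour = statistics.multimode(colours)

if __name__ == "__main__":
    print("modes:", modes, "mean:", mean, "median:", median)
    print("most common colour:", top_colour)
```

In the bimodal sample there are two modes rather than one, and the mean and median both fall in the gap between the clusters where no observation lies. For the colours, the mode is the only one of the three measures that can be computed at all.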
 
If you ask random people to choose one of the numbers 1, 2, 3, and 4 based only on intuition, prior knowledge of the mode of a large number of responses can perhaps give some predictability about which number a newly asked person is more or less likely to choose.
 
1) The Central Limit Theorem gives the mean a significance and statistical properties that the median and mode do not have.

2) Consider the example of the birthdays (1...365) in a group of people (see https://en.wikipedia.org/wiki/Birthday_problem ). Each birthday has a uniform distribution in (1...365).
Consider the mean of a sample of birthdays. One would want to say that a measure of the center of the distribution is at (1 + 365)/2 = 183. That would be the mean. It tells you something about the distribution.

Now consider the mode. With a sample of fewer than 23 birthdays, there is probably not a mode since there are probably no duplicate birthdays. So that tells you nothing. With a sample of 23, there probably is one duplicate and a mode, but it can be anywhere in (1...365). With a sample of 31, there are probably two duplicates, so the mode is again undefined. With a significantly larger sample, there is probably a unique mode which is equally likely to be anywhere within (1...365). The mode is not very useful for indicating anything about the sample.

Now consider the median. The median would tend toward the mean as the sample grows. It has some use.
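The birthday argument is easy to try out numerically. The sketch below assumes, as the post does, that all 365 days are equally likely. The function names, seeds and trial counts are illustrative choices.

```python
import random
import statistics

DAYS = 365  # assumption from the post: every day of the year is equally likely


def draw_birthdays(n, rng):
    """Return n birthdays as day numbers 1..365."""
    return [rng.randint(1, DAYS) for _ in range(n)]


def share_with_repeat(n, trials, rng):
    """Fraction of simulated groups of size n with at least one shared birthday."""
    hits = 0
    for _ in range(trials):
        days = draw_birthdays(n, rng)
        if len(set(days)) < n:
            hits += 1
    return hits / trials


def summarize(n, rng):
    """Mean, median and list of tied modes for one group of size n."""
    days = draw_birthdays(n, rng)
    return statistics.fmean(days), statistics.median(days), statistics.multimode(days)


if __name__ == "__main__":
    rng = random.Random(1)
    print("P(repeat) for n=23:", share_with_repeat(23, 2000, rng))
    for n in (10, 23, 100, 1000, 5000):
        mean, median, modes = summarize(n, rng)
        print(n, round(mean, 1), median, modes[:5])
```

When no birthday repeats, `multimode` returns every day in the sample, which is the code's way of saying there is no useful mode. In larger groups the modes it reports can fall anywhere in the year, while the mean and median drift toward the middle.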
 
FactChecker said:
Each birthday has a uniform distribution in (1...365).
It turns out birthdays don't have a uniform distribution. The mode tells you something about days with a higher number of births (e.g. 1.1., 2.2., 3.3., ...), but mean and median only give a comparison between the first and second half of the year. In addition, they depend on the arbitrary definition of the start of a year, while the mode does not.
 
  • #10
mfb said:
It turns out birthdays don't have a uniform distribution.
I'll buy that. I didn't think of that.

mfb said:
The mode tells you something about days with a higher number of births (e.g. 1.1., 2.2., 3.3., ...)
It tells you something about the one day with the highest number of births, but nothing about the other days.

mfb said:
but mean and median only give a comparison between the first and second half of the year.
I would say that the median tells you the location of the first half of the probability, not necessarily the first half of the year.

mfb said:
In addition, they depend on the arbitrary definition of the start of a year, while the mode does not.
I guess that the same things that influence the probability distribution of birth dates, and therefore the mode, would be reflected some way in both the mean and the median.
 
  • #11
FactChecker said:
I would say that the median tells you the location of the first half of the probability, not necessarily the first half of the year.
What I meant: A median or mean that is a few days before the middle of the year tells you the first half of the year has more births. The median also allows a rough estimate of how many more births there are in the first half.
 
  • #12
An example where the mode tells you virtually nothing is the uniform distribution, which occurs (or is assumed) very often. No matter how large the sample size is, the mode will keep jumping around within the entire range of the distribution. Here is an example where the samples are uniformly distributed on the integers 0, 1, ..., 9. As the sample size increases to 1000 in increments of 10, the mean and median settle down quickly but the mode never does:
[Attachment: meanMedianModeExample.png, a plot of the mean, median and mode against sample size]
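For anyone who wants to reproduce an experiment like this, here is a minimal sketch. It is not the code behind the attached plot. It assumes a single sample that grows by 10 draws at a time, and it breaks ties between modes by taking the smallest one. Both are my assumptions, as are the seed and the function name.

```python
import random
import statistics


def running_summaries(max_size=1000, step=10, seed=0):
    """Grow one sample of uniform digits 0..9 and record its statistics.

    Returns a list of (size, mean, median, mode) tuples, one per step.
    """
    rng = random.Random(seed)
    sample = []
    rows = []
    for size in range(step, max_size + 1, step):
        sample.extend(rng.randrange(10) for _ in range(step))
        # Several digits can tie for most frequent; report the smallest (assumption).
        mode = min(statistics.multimode(sample))
        rows.append((size, statistics.fmean(sample), statistics.median(sample), mode))
    return rows


if __name__ == "__main__":
    for size, mean, median, mode in running_summaries()[::10]:
        print(f"n={size:4d}  mean={mean:.2f}  median={median}  mode={mode}")
```

Changing the seed moves the mode to different digits, while the mean and median stay close to 4.5 once the sample has a few hundred values.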
