How is the harmonic mean affected by additional data points?

In summary, the thread discusses how the harmonic mean (HM) responds when new data points are added to an existing data series. The harmonic mean is skewed towards smaller values, and new points raise or lower it depending on whether their own harmonic mean is above or below the current one. The thread also gives the formula for combining the harmonic means of two series.
  • #1
Feynstein100
We have a collection of 8 discrete data points. They are:
10, 20, 30, 20, 30, 40, 30, 40
In increasing order:
10, 20*2, 30*3, 40*2
The harmonic mean of this data series is 22.86
I read on Wikipedia that the harmonic mean is skewed towards the smaller values i.e. smaller values will affect the HM more than larger values. So I thought that if we add 2 additional data points 20 and 30, our HM would be even smaller. And yet, when I calculated the HM of this new data series with 10 points:
10, 20*3, 30*4, 40*2
it turned out to be 23.08 i.e. higher than the previous case. Why did that happen?
One of our new points was lower than the HM whereas the other was higher. I thought the HM would be more skewed toward the lower value and thus would bring the overall mean down. Ah is it because the second datapoint was much higher than the HM?
In general, I'm interested in the question of how adding new datapoints will affect the HM of the existing data series.
We're not changing the endpoints, they remain constant. So any new point added will lie somewhere inside the bounds of the data series. In our example, that's 10 and 40.
So I think the answer is quite simple. If New point < HM, it lowers the HM. If New point > HM, it increases the HM.
It seems quite straightforward for adding one datapoint but what if we add multiple? In essence appending another data series to the existing one. Can we predict in advance if the new HM will be higher or lower?
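As a quick check of the numbers in this post, here is a minimal Python sketch. The helper name `harmonic_mean` is only for illustration and does not come from the thread; it recomputes both harmonic means and tries the one-point rule on the original series.

```python
def harmonic_mean(values):
    """Harmonic mean: number of items divided by the sum of reciprocals."""
    return len(values) / sum(1.0 / v for v in values)

original = [10, 20, 30, 20, 30, 40, 30, 40]
extended = original + [20, 30]

hm_original = harmonic_mean(original)   # about 22.86
hm_extended = harmonic_mean(extended)   # about 23.08

# One-point rule: a new point below the current HM lowers it,
# a new point above the current HM raises it.
hm_with_20 = harmonic_mean(original + [20])  # 20 is below 22.86
hm_with_30 = harmonic_mean(original + [30])  # 30 is above 22.86

print(round(hm_original, 2), round(hm_extended, 2))
print(hm_with_20 < hm_original < hm_with_30)
```

Adding 20 alone pulls the HM down to 22.5 and adding 30 alone pushes it up to about 23.48; adding both gives 23.08, so the upward pull wins in this example.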
 
  • #2
It looks like a simple quantitative question. One point at a time will give a predictable result. More than one - results?
 
  • #3
Feynstein100 said:
It seems quite straightforward for adding one datapoint but what if we add multiple? In essence appending another data series to the existing one. Can we predict in advance if the new HM will be higher or lower?
Yes. If the harmonic mean of the new points is lower than the current HM, the HM of all the points will be lower; if it is higher, the new HM will be higher. This is easy to see, since the HM of [tex] \frac {1}{\sum {\frac{1}{a_i}}} [/tex] and [tex] \frac {1}{\sum {\frac{1}{b_i}}} [/tex] is [tex] \frac {1}{ \sum {\frac{1}{a_i}} + \sum {\frac{1}{b_i}} } [/tex]
 
  • #4
mathman said:
It looks like a simple quantitative question. One point at a time will give a predictable result. More than one - results?
That's............kind of what I'm asking 😂
 
  • #5
willem2 said:
Yes. If the harmonic mean of the new points is lower than the current HM, the HM of all the points will be lower; if it is higher, the new HM will be higher. This is easy to see, since the HM of [tex] \frac {1}{\sum {\frac{1}{a_i}}} [/tex] and [tex] \frac {1}{\sum {\frac{1}{b_i}}} [/tex] is [tex] \frac {1}{ \sum {\frac{1}{a_i}} + \sum {\frac{1}{b_i}} } [/tex]
Thanks for the reply. I worked it out myself and it turns out that the combined harmonic mean Hc of two series, one with harmonic mean H1 and m items and the other with harmonic mean H2 and n items, is
Hc = (m + n)/(m/H1 + n/H2)
i.e. the weighted harmonic mean of H1 and H2. And by the general property of all means, Hc will lie somewhere between H1 and H2. Thus, if the HM of the new points is lower than the existing HM, the overall HM will be lower as well. And if the HM of the new points is higher, the overall HM will be higher as well.
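For reference, the formula follows in two steps. This is a sketch that assumes all values are positive and writes a_i for the m items of the first series and b_j for the n items of the second (notation borrowed from the quoted post):

[tex] H_1 = \frac{m}{\sum_{i=1}^{m} \frac{1}{a_i}} \;\Rightarrow\; \sum_{i=1}^{m} \frac{1}{a_i} = \frac{m}{H_1}, \qquad H_2 = \frac{n}{\sum_{j=1}^{n} \frac{1}{b_j}} \;\Rightarrow\; \sum_{j=1}^{n} \frac{1}{b_j} = \frac{n}{H_2} [/tex]

[tex] H_c = \frac{m+n}{\sum_{i=1}^{m} \frac{1}{a_i} + \sum_{j=1}^{n} \frac{1}{b_j}} = \frac{m+n}{\frac{m}{H_1} + \frac{n}{H_2}} [/tex]

With the numbers from post #1 (m = 8, H1 = 8/0.35 ≈ 22.86, n = 2, H2 = 24) this gives Hc = 10/(0.35 + 0.0833) ≈ 23.08.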

Btw, your third formula has a mistake. It should be 2/(1/H1 + 1/H2), not 1/(sum of inverses), and even that is only the special case where both series have the same number of items. The general formula is the weighted harmonic mean.
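A minimal Python sketch that checks the weighted formula against a direct computation on the data from post #1. The function names `harmonic_mean` and `combined_hm` are illustrative choices, not anything defined in the thread.

```python
def harmonic_mean(values):
    """Harmonic mean: number of items divided by the sum of reciprocals."""
    return len(values) / sum(1.0 / v for v in values)

def combined_hm(h1, m, h2, n):
    """Weighted harmonic mean of two group HMs with m and n items."""
    return (m + n) / (m / h1 + n / h2)

first = [10, 20, 30, 20, 30, 40, 30, 40]   # existing series
second = [20, 30]                          # appended points

h1 = harmonic_mean(first)
h2 = harmonic_mean(second)
hc = combined_hm(h1, len(first), h2, len(second))
direct = harmonic_mean(first + second)

print(abs(hc - direct) < 1e-9)            # formula agrees with direct computation
print(min(h1, h2) <= hc <= max(h1, h2))   # combined HM lies between the two
```

Here the two appended points have a harmonic mean of 24, which is above the existing 22.86, so the combined value moves up.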
 

1. How does adding more data points affect the harmonic mean?

Adding more data points can either increase or decrease the harmonic mean, depending on the values of the additional data points. If the harmonic mean of the new points is above the current harmonic mean, the overall value rises; if it is below, the overall value falls; if it is equal, nothing changes. How far the value moves depends on how many points are added relative to the existing count, because the combined value is the weighted harmonic mean of the two groups.

2. Does the number of data points affect the harmonic mean?

Not by itself. The harmonic mean is the number of data points divided by the sum of their reciprocals, so the count acts only as a normalizer: repeating every value in a series leaves the harmonic mean unchanged. What matters is the values of the points. The count does determine how much weight an existing series carries when new points are appended, since the combined harmonic mean weights each group by its number of items.
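One way to see the role of the count is to repeat every value in a series, which doubles the number of points without changing the harmonic mean. A minimal Python sketch, with an illustrative helper name:

```python
def harmonic_mean(values):
    """Harmonic mean: number of items divided by the sum of reciprocals."""
    return len(values) / sum(1.0 / v for v in values)

data = [10, 20, 30, 20, 30, 40, 30, 40]
doubled = data + data   # twice as many points, same values

# Both the count and the sum of reciprocals double, so the ratio is unchanged.
print(len(doubled) == 2 * len(data))
print(abs(harmonic_mean(doubled) - harmonic_mean(data)) < 1e-9)
```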

3. How does the distribution of data points affect the harmonic mean?

The distribution of data points strongly affects the harmonic mean. Because it averages reciprocals, the harmonic mean is pulled towards the small values in the data, and for positive values that are not all equal it lies below the arithmetic mean. The more spread out the data are, and especially the more small values they contain, the further the harmonic mean falls below the arithmetic mean. In the thread's example the arithmetic mean is 27.5 while the harmonic mean is 22.86.

4. Can outliers affect the harmonic mean?

Yes, but not symmetrically. Outliers are data points that are significantly higher or lower than the rest of the data. Since the harmonic mean is built from reciprocals, an unusually small value has a large reciprocal and can pull the harmonic mean down sharply, while an unusually large value has a reciprocal close to zero and raises it only slightly.
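The asymmetry can be illustrated with a short Python sketch. The outlier values 1 and 400 are arbitrary choices for illustration, as is the helper name `harmonic_mean`.

```python
def harmonic_mean(values):
    """Harmonic mean: number of items divided by the sum of reciprocals."""
    return len(values) / sum(1.0 / v for v in values)

data = [10, 20, 30, 20, 30, 40, 30, 40]
base = harmonic_mean(data)

drop = base - harmonic_mean(data + [1])     # effect of one very small outlier
rise = harmonic_mean(data + [400]) - base   # effect of one very large outlier

# Upper bound on the rise from any single added point: as the new value
# grows without limit its reciprocal tends to 0, so only the count changes.
max_rise = (len(data) + 1) / sum(1.0 / v for v in data) - base

print(drop > rise)
print(rise < max_rise)
```

With this data the small outlier lowers the harmonic mean by about 16.2, while the large one raises it by about 2.7, and no single added value can raise it by more than about 2.86.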

5. Is the harmonic mean affected by the order of the data points?

No, the harmonic mean is not affected by the order of the data points. It depends only on the number of points and the sum of their reciprocals, and a sum does not change when its terms are reordered. The same holds when new data points are added: only their values matter, not where in the series they are inserted.
