Statistics - Standard Deviation, Standard Error and Mean

Discussion Overview

The discussion revolves around statistical analysis, specifically focusing on calculating averages from a data set while considering the influence of outliers. Participants explore methods to derive a more representative mean by potentially excluding extreme values.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant seeks a method to calculate the average of four numbers from a set of six, expressing a desire to exclude outliers that skew the mean.
  • Another participant suggests calculating the average of all possible combinations of four numbers, indicating that the average would depend on the specific numbers chosen.
  • A different viewpoint proposes using robust statistical measures, such as the median and median absolute deviation (MAD), as alternatives to the mean to better handle outliers.
  • Additionally, a participant comments on the limitations of variance as a measure of spread in non-symmetrical distributions or those with outliers, recommending visual representations like dot plots or stemplots.

Areas of Agreement / Disagreement

Participants express differing opinions on the best approach to handle outliers and calculate a representative average, indicating that multiple competing views remain without a consensus.

Contextual Notes

Participants highlight the potential limitations of traditional measures of central tendency and variability in the presence of outliers, but do not resolve the specific mathematical steps or assumptions involved in their proposed methods.

Hello,

Just had a question regarding statistical analysis.

I'm trying to calculate the average of 4 numbers from a data set of 6 numbers in Excel, without manually selecting which 4 numbers to average.

e.g.


85 20 32 45 27 3 (total mean = 35.3, desired mean = 31)
100 30 27 40 21 1
...etc

The middle 4 numbers represent a more realistic result whilst the two end numbers are irrelevant.

I've tried looking into weighted averages; however, I'm unsure how to apply them in an Excel sheet.

I'd prefer a formula that discounts numbers that vary wildly from the average and focuses only on those that don't.

Any help would be greatly appreciated.
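
What the question describes (drop the largest and smallest value, average the rest) is usually called a trimmed mean. Below is a minimal Python sketch of that idea, using the two rows from the post as input. The function name `trimmed_mean` and its `trim` parameter are made up for illustration. Excel has a built-in `TRIMMEAN(array, percent)` function that looks like the closest equivalent, but check its documentation for how `percent` is rounded before relying on it.

```python
def trimmed_mean(values, trim=1):
    """Average after dropping the `trim` smallest and `trim` largest values.

    `trimmed_mean` and `trim` are hypothetical names for this sketch.
    """
    ordered = sorted(values)
    if len(ordered) <= 2 * trim:
        raise ValueError("not enough values left after trimming")
    kept = ordered[trim:len(ordered) - trim]
    return sum(kept) / len(kept)


row1 = [85, 20, 32, 45, 27, 3]
row2 = [100, 30, 27, 40, 21, 1]

print(trimmed_mean(row1))  # 31.0, matches the "desired mean" in the post
print(trimmed_mean(row2))  # 29.5
```

Note that this always drops one value from each end, whether or not those values are really outliers.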
 
You have six numbers - if you picked four of them at random, what would be their average?
Is that about it?

Clearly that depends on the four numbers - but you can work out each possible combination of 4, and work out the average for each one. The expectation value of the averages will be your answer.
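
As a rough check of this suggestion, the sketch below enumerates every combination of 4 values from the first example row and averages the resulting means. Each value appears in the same number of combinations (10 of the 15), so the expected value comes out equal to the plain mean of all six numbers, and on its own this approach would not discount the extreme values. The variable names are illustrative only.

```python
import math
from itertools import combinations
from statistics import mean

data = [85, 20, 32, 45, 27, 3]

# Mean of every possible choice of 4 values out of the 6.
subset_means = [mean(combo) for combo in combinations(data, 4)]

# Expected value of the mean when all choices are equally likely.
expected = mean(subset_means)

print(len(subset_means))                   # 15 combinations
print(math.isclose(expected, mean(data)))  # True: same as the overall mean
```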
 
A better strategy than excluding observations arbitrarily would be to compute "robust" estimates of the distribution, e.g. the median instead of the mean and the median absolute deviation (MAD) as an estimator for the variability.
See
http://en.wikipedia.org/wiki/Robust_statistics
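
For illustration, here is a small Python sketch of the median and the (unscaled) median absolute deviation for the first example row. The helper name `mad` is an assumption of this sketch, and no consistency constant is applied to the MAD.

```python
from statistics import median


def mad(values):
    """Median absolute deviation: median of |x - median(values)|, unscaled."""
    centre = median(values)
    return median([abs(x - centre) for x in values])


data = [85, 20, 32, 45, 27, 3]

print(median(data))  # 29.5
print(mad(data))     # 12.5
```

For this row the median (29.5) is close to the mean of the middle four values (31) and is unaffected by how extreme 85 or 3 are.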
 
As a general comment, to reinforce what was already said: in distributions that are not symmetrical or not unimodal, and/or distributions with outliers, the variance is not a good measure of spread. You may also want to represent your data with a dot plot or stemplot:

http://en.wikipedia.org/wiki/Stem-and-leaf_plot
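
A text stemplot is easy to produce by hand or in code. The sketch below is one possible way to build it in Python, assuming non-negative integers with tens as stems and units as leaves. The function name `stem_and_leaf` is made up for this example.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return stemplot lines (tens = stem, units = leaf).

    Assumes non-negative integers; empty stems are kept so gaps stay visible.
    """
    leaves = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        leaves[stem].append(leaf)

    lines = []
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(leaf) for leaf in leaves[stem])
        lines.append(f"{stem} | {row}".rstrip())
    return lines


for line in stem_and_leaf([85, 20, 32, 45, 27, 3]):
    print(line)
```

For the first example row, the empty stems between 4 and 8 make the value 85 stand out from the rest.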
 
