Mean with standard deviation or Median with IQR?

In summary, the data table has a wide, right-skewed distribution: most values fall in the lowest groups, with a long tail reaching up to 60. The median is a better representation than the mean here because it is resistant to that tail. The rule of thumb that 68% of the values are within 1 SD of the mean, and 95% within 2 SD, holds only for approximately normal data, which this is not.
  • #1
3vo
Hi guys,

I hope someone is able to help me with this. I'm currently stuck on a problem.

1. I was given some data (in continuous, grouped form) regarding phone call times for a call center agent and asked to represent the data using the most accurate form of average.
I initially calculated an estimate of the median and IQR using interpolation, followed by estimation of the mean with standard deviation using midpoints for each group.

The answers however were very different.
Median 3.44 with IQR 9.5
Mean 8.6 with standard deviation of 14.5


I now need to justify which is the better representation of the data, mean or median?
My ogive graph seems to indicate that the distribution of the data is wide and uneven. I was always under the impression that if the distribution is skewed, it is better to use the median, as it is not affected by outliers. However, I was informed by someone that the mean was the more accurate representation in this case. Any ideas why this may be?

Can anyone explain to me what to do with the standard deviation value or the IQR? I understand they are both measures of spread, but what do they tell me about the data? All my textbooks keep reiterating that they are measures of spread without explaining what to do with them regarding accuracy.


2. Below is a copy of the data table I've compiled:
t group         mid(x)     f    c.f      fx      fx²
0 ≤ t < 2          1      80     80      80       80
2 ≤ t < 4          3      53    133     159      477
4 ≤ t < 6          5      19    152      95      475
6 ≤ t < 10         8      22    174     176     1408
10 ≤ t < 20       15      31    205     465     6975
20 ≤ t < 30       25      16    221     400    10000
30 ≤ t < 60       45      15    236     675    30375
Total              -     236    236    2050    49790
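
As a cross-check on the figures quoted in post #1, the grouped estimates can be recomputed from the table above. This is only a minimal Python sketch, not the poster's own working: it assumes values are spread evenly within each class (linear interpolation), takes the quartiles at positions n/4, n/2 and 3n/4, uses the population form of the SD, and the function names are made up for illustration.

```python
from math import sqrt

# Class boundaries and frequencies copied from the table above.
bounds = [(0, 2), (2, 4), (4, 6), (6, 10), (10, 20), (20, 30), (30, 60)]
freqs = [80, 53, 19, 22, 31, 16, 15]

def grouped_mean_sd(bounds, freqs):
    """Estimate the mean and population SD using class midpoints."""
    n = sum(freqs)
    mids = [(lo + hi) / 2 for lo, hi in bounds]
    sum_fx = sum(f * x for f, x in zip(freqs, mids))
    sum_fx2 = sum(f * x * x for f, x in zip(freqs, mids))
    mean = sum_fx / n
    variance = sum_fx2 / n - mean ** 2   # mean of squares minus squared mean
    return mean, sqrt(variance)

def grouped_quantile(bounds, freqs, p):
    """Estimate the p-quantile by linear interpolation within its class."""
    n = sum(freqs)
    target = p * n          # assumed position convention: p * n
    cum = 0
    for (lo, hi), f in zip(bounds, freqs):
        if cum + f >= target:
            return lo + (target - cum) / f * (hi - lo)
        cum += f
    return bounds[-1][1]

mean, sd = grouped_mean_sd(bounds, freqs)
q1, median, q3 = (grouped_quantile(bounds, freqs, p) for p in (0.25, 0.5, 0.75))
print(f"mean = {mean:.2f}, SD = {sd:.2f}")
print(f"median = {median:.2f}, IQR = {q3 - q1:.2f}")
```

Under these assumptions the sketch gives a mean of about 8.69 and a median of about 3.43 with an IQR of about 9.49, close to the quoted figures. The SD comes out near 11.6 rather than 14.5. The value 14.5 is what sqrt(49790/236) gives when the squared mean is not subtracted, so that step may be worth rechecking.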


3. I believe that, due to the uneven distribution, the median may be the better representation for this data. However, I have also noticed that the spread is very wide, and I understand that the median is more to do with central tendency. I'm unsure whether the median would then disregard the high frequency in the first group, which might explain why the median is not the best representation for wide distributions rather than just uneven ones?
 

Attachments: photo 1.jpg, photo 2.jpg
  • #2
IMHO, it makes no sense to ask what is the best way to represent data without first understanding how the representation will be used to make decisions.
 
  • #3
Hi Haruspex,

Thank you for your reply.

The main part of my assignment brief was to show I was able to calculate an estimate for both mean and median with measures of spread.

However, for the final part I only need to justify which of the two averages is the better representation for this data as a whole. No further conclusions or decisions would be made or required from this data.

I'm now stuck on which of the two measures best represents this data set.

My understanding (and please correct me if I am wrong) is that 68% of the values are less than one SD from the mean value, and 95% are less than 2 SD.

Looking at my cumulative frequency graph I can see that most of the data does fall within one SD from the mean value of 8. However everything I've read either online or in my textbook also seems to suggest if the distribution is ever uneven, to always use median. I've also calculated that outliers are present after the 25.25 value and this would normally affect the mean value. Would it also have an impact on SD or is SD resistant to outliers?

I understand the median gives a better idea of central tendency than the mean and is resistant to the presence of outliers. Would this be enough justification to use the median as a better representation than the mean?
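
One way to test the 68%/95% rule of thumb against this particular table is to estimate, from the cumulative frequencies, what share of the values falls within one and two SD of the mean. The Python sketch below is only an illustration. It assumes values are spread evenly inside each class, it uses the mean and SD quoted in post #1 (8.6 and 14.5), and the helper names `cum_freq_at` and `share_within` are made up for this example. The Q3 of 11 in the last step is also an assumption, inferred from the quoted IQR of 9.5 and the 25.25 cut-off.

```python
# Class boundaries and frequencies from the table in post #1.
bounds = [(0, 2), (2, 4), (4, 6), (6, 10), (10, 20), (20, 30), (30, 60)]
freqs = [80, 53, 19, 22, 31, 16, 15]
n = sum(freqs)

def cum_freq_at(t):
    """Estimated number of values below t, reading the ogive linearly."""
    total = 0.0
    for (lo, hi), f in zip(bounds, freqs):
        if t >= hi:
            total += f                        # whole class lies below t
        elif t > lo:
            total += f * (t - lo) / (hi - lo)  # part of the class lies below t
    return total

def share_within(mean, sd, k):
    """Estimated fraction of values within k SD of the mean."""
    return (cum_freq_at(mean + k * sd) - cum_freq_at(mean - k * sd)) / n

mean, sd = 8.6, 14.5   # figures quoted in post #1
within_1sd = share_within(mean, sd, 1)
within_2sd = share_within(mean, sd, 2)
print(f"within 1 SD: {within_1sd:.0%}, within 2 SD: {within_2sd:.0%}")

# Upper outlier fence, Q3 + 1.5 * IQR (Q3 = 11 is an inferred assumption).
upper_fence = 11 + 1.5 * 9.5
print(f"upper fence = {upper_fence}")
```

With these inputs roughly 89% of the values fall within one SD of the mean and about 95% within two, so the 68% figure does not carry over to this skewed data set. The lower end of each interval is also negative, which is impossible for a call time. Both points suggest the mean and SD describe this distribution poorly.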
 

1. What is the purpose of calculating mean with standard deviation or median with IQR?

The purpose of calculating these measures is to summarize the center and spread of a set of data. The mean and standard deviation use every value, so they are sensitive to extreme values. The median and IQR depend only on the rank order of the data, so they are more resistant to extreme values and outliers.

2. When should I use mean with standard deviation instead of median with IQR?

Mean with standard deviation is more appropriate for data that follows a normal distribution, while median with IQR is better for skewed or non-normal data. Median with IQR is also preferred when dealing with outliers.

3. How do I interpret the values of mean with standard deviation and median with IQR?

The mean with standard deviation represents the average value and spread of the data. The median with IQR represents the middle value and the range of the middle 50% of the data. A smaller standard deviation or IQR indicates a more compact distribution, while a larger standard deviation or IQR indicates a more spread out distribution.

4. Can mean with standard deviation and median with IQR be used for categorical data?

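Not in general. The mean and standard deviation require numerical data. A median can still be defined for ordered (ordinal) categories, but not for unordered ones. For categorical data, other measures such as mode and frequency are more suitable.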

5. How can I determine which measure is more appropriate for my data?

This depends on the type of data and the distribution of the data. It is best to visualize the data using histograms or box plots to get an idea of the distribution, and then choose the appropriate measure based on the shape of the data.
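
The point about outliers in the answers above can be illustrated with a small experiment. The Python sketch below uses a made-up list of values (not the thread's data) and the standard library `statistics` module to compare both pairs of measures before and after one extreme value is added. Note that `statistics.quantiles` uses its default "exclusive" method here, which is only one of several quartile conventions.

```python
import statistics

def summarize(values):
    """Return (mean, sample SD, median, IQR) for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)   # quartile cut points
    return (statistics.mean(values), statistics.stdev(values),
            statistics.median(values), q3 - q1)

# Hypothetical values; the second list adds a single extreme value.
base = [1, 1, 2, 2, 3, 3, 4, 5, 6, 8]
with_outlier = base + [60]

for label, data in (("base", base), ("with outlier", with_outlier)):
    mean, sd, median, iqr = summarize(data)
    print(f"{label:>12}: mean={mean:.2f} SD={sd:.2f} "
          f"median={median} IQR={iqr:.2f}")
```

In this example the single extreme value more than doubles the mean and inflates the SD several times over, while the median stays the same and the IQR barely moves. SD is therefore not resistant to outliers, which is the usual argument for reporting the median with IQR on skewed data.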
