Standard deviation of aggregated data

Discussion Overview

The discussion revolves around the calculation of standard deviation for aggregated data, specifically when combining records that include mean speed, standard deviation of speed, and number of observations over different intervals. The scope includes statistical methods and potential challenges in applying these methods to datasets with varying interval lengths.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant seeks clarification on how to calculate the standard deviation for aggregated intervals after computing the weighted mean.
  • Another participant suggests that the second moments can be combined using the same method as the means, and that the standard deviation can then be calculated from the weighted mean and weighted second moment.
  • A later reply acknowledges the previous suggestion but raises a concern about the varying lengths of intervals used for measuring average speed, questioning if this affects the calculation.
  • This participant also emphasizes the need for an unbiased estimator for the standard deviation, referencing a formula given in a Wikipedia article that they re-derived independently.

Areas of Agreement / Disagreement

Participants express differing views on the implications of varying interval lengths on the calculation of standard deviation, indicating that the discussion remains unresolved regarding the best approach to take in such cases.

Contextual Notes

There are limitations regarding assumptions about the data distribution and the applicability of the proposed methods, particularly in relation to the unbiased estimator for standard deviation when intervals differ in length.

KThy
This might be embarrassingly easy or impossible; I've been a computer programmer for too long since my statistics classes to tell for sure :blushing:

I have a set of records with the following data for each record: interval, mean speed, standard deviation of speed, number of observations. For example:

Code:
1, 77.2, 1.75, 10
2, 75.9, 2.05, 12

Now, suppose I want to aggregate the data to get the mean speed and standard deviation over two or more intervals (1 and 2 above). Calculating the weighted mean is no problem, but how (if it is possible at all) do I calculate the standard deviation for the aggregated intervals?
 
Given the std. dev. and mean of each interval, you can easily get its second moment (the variance plus the square of the mean). Combine the second moments by the same procedure you used for the means. Finally calculate the std. dev. from the weighted mean and weighted second moment.
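
The procedure described above can be sketched in Python roughly as follows. This is only an illustration, not code from the thread: the function name `combine_intervals` and the tuple layout are made up here, the weights are taken to be the observation counts, and each reported standard deviation is assumed to be a population value (divisor n, not n - 1).

```python
import math

def combine_intervals(records):
    """Combine per-interval statistics via second moments.

    records: list of (mean, std_dev, n) tuples, one per interval.
    Assumption: each std_dev is a population value (divisor n).
    """
    total_n = sum(n for _, _, n in records)
    # Weighted mean, using the observation counts as weights.
    mean = sum(n * m for m, _, n in records) / total_n
    # Per-interval second moment is sd^2 + mean^2; combine it
    # with the same weights that were used for the means.
    second_moment = sum(n * (s * s + m * m) for m, s, n in records) / total_n
    # Variance = second moment minus squared mean (clamped against rounding).
    variance = max(second_moment - mean * mean, 0.0)
    return mean, math.sqrt(variance), total_n

# The two example records from the question: (mean speed, std. dev., observations).
records = [(77.2, 1.75, 10), (75.9, 2.05, 12)]
mean, sd, n = combine_intervals(records)
print(mean, sd, n)  # approximately 76.49, 2.03, 22
```

Note that the combined standard deviation comes out larger than a plain weighted average of the two variances would suggest, because the difference between the two interval means also contributes to the spread.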
 
mathman said:
Combine the second moments by the same procedure you used for the means. Finally calculate the std. dev. from the weighted mean and weighted second moment.

Right, of course. As I said, embarrassingly simple - thanks!
 
I recently encountered a similar problem at work.

However, in my case some of the intervals used for measuring average speed cover different lengths of time. I believe the problem is otherwise the same?

Since we consider the data to be a sample, we want the unbiased estimator for the standard deviation, which takes the form given in the Wikipedia article linked below (I derived it independently to make sure).


http://en.wikipedia.org/wiki/Mean_square_weighted_deviation
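
For reference, here is a minimal Python sketch of one common way to aggregate when the data are treated as a sample. It is not the formula from the linked article, and it does not weight by interval length. It assumes each reported standard deviation is a sample value (divisor n - 1) and that the observation counts are the weights; the function name `pooled_sample_sd` is hypothetical.

```python
import math

def pooled_sample_sd(records):
    """Aggregate (mean, sample_sd, n) tuples into an overall mean and sample SD.

    Assumptions: each sample_sd uses the n - 1 divisor, and the
    total number of observations is at least 2.
    """
    total_n = sum(n for _, _, n in records)
    grand_mean = sum(n * m for m, _, n in records) / total_n
    # Total sum of squared deviations about the grand mean:
    # within-interval part (n - 1) * s^2 plus
    # between-interval part n * (m - grand_mean)^2.
    ss = sum((n - 1) * s * s + n * (m - grand_mean) ** 2 for m, s, n in records)
    # Dividing by N - 1 gives the usual Bessel-corrected variance.
    return grand_mean, math.sqrt(ss / (total_n - 1))

grand_mean, sd = pooled_sample_sd([(77.2, 1.75, 10), (75.9, 2.05, 12)])
print(grand_mean, sd)  # approximately 76.49 and 1.99
```

The n - 1 divisor makes the variance estimate unbiased; its square root is the usual sample standard deviation, which is not exactly unbiased itself.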

Cheers.
 
