What Changed in Standard Deviation Calculation?

Discussion Overview

The discussion centers on the calculation of standard deviation, specifically the differences between using n and n-1 in the denominator. Participants explore the implications of these formulas in statistical contexts, including when to use each and the reasoning behind these choices.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants note that the standard deviation calculation has changed, highlighting the difference between using n and n-1.
  • Others argue that both formulas are still in use, with the choice depending on whether the mean is known or estimated from the sample.
  • One participant questions why using n-1 corrects for statistical error and whether this correction is negligible for large sample sizes.
  • Another participant explains that n-1 is used to adjust for variability in samples, as using n tends to underestimate population variance.
  • It is mentioned that using n-1 makes the sample variance an unbiased estimator of the population variance, although some participants question the necessity of an unbiased estimator.
  • A later reply introduces the idea that minimizing mean squared error could be better achieved using (n+1) instead of (n-1) for normal distributions.
  • One participant clarifies that the denominator reflects the number of degrees of freedom, which is reduced when calculating the mean.

Areas of Agreement / Disagreement

Participants express multiple competing views regarding the use of n versus n-1, and the discussion remains unresolved with no consensus on the necessity or implications of using either formula.

Contextual Notes

Some limitations include the dependence on definitions of mean and variance, as well as the varying interpretations of unbiased estimators and degrees of freedom in statistical calculations.

The Bob
Changes to Standard Deviation??

How many of you know that Standard Deviation has changed?

It used to be: [tex]\sqrt{\frac{\Sigma(x_i - \overline{x})^2}{n}}[/tex]

And now it is: [tex]\sqrt{\frac{\Sigma(x_i - \overline{x})^2}{n - 1}}[/tex]

It is the square root of the variance of the data:

[tex]s^2 = \frac{\Sigma(x_i - \overline{x})^2}{n - 1}[/tex] becomes [tex]s = \sqrt{\frac{\Sigma(x_i - \overline{x})^2}{n - 1}}[/tex]
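For anyone who wants to see the two formulas side by side, here is a minimal Python sketch. The function names and the data set are made up for illustration; the only library calls are `statistics.pstdev` (divides by n) and `statistics.stdev` (divides by n - 1) from the standard library, used as a cross-check.

```python
import math
import statistics


def sd_divide_by_n(xs):
    """Standard deviation with n in the denominator."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))


def sd_divide_by_n_minus_1(xs):
    """Standard deviation with n - 1 in the denominator."""
    if len(xs) < 2:
        raise ValueError("need at least two values to divide by n - 1")
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (len(xs) - 1))


# Illustrative data: mean is 5, sum of squared deviations is 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(sd_divide_by_n(data))          # sqrt(32 / 8) = 2.0
print(sd_divide_by_n_minus_1(data))  # sqrt(32 / 7), slightly larger

# Cross-check against the standard library.
print(statistics.pstdev(data), statistics.stdev(data))
```

On the same data the n - 1 version is always the larger of the two, since the numerator is identical and the denominator is smaller.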

Not really anything important, just wanted people to know and comment (if necessary) on the fact that it has changed.

The Bob (2004 ©)
 
Actually, both formulae are used... I forget the reasons for using n instead of n-1, though.
 
It depends on what you are using for the mean. If you know the mean, then you divide by n. If you estimate the mean from the sample, then you use n-1, because the estimated mean has a statistical error.
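A rough simulation can illustrate the known-mean versus estimated-mean distinction. This is only a sketch under stated assumptions: normally distributed data with true mean 0 and true variance 4, samples of size 5, and a fixed seed. The function name and parameters are hypothetical.

```python
import random


def average_variance_estimates(n, trials, mu=0.0, sigma=2.0, seed=0):
    """Average three variance estimates over many simulated samples.

    Returns the averages of:
      - squared deviations from the KNOWN mean mu, divided by n
      - squared deviations from the SAMPLE mean, divided by n
      - squared deviations from the SAMPLE mean, divided by n - 1
    """
    rng = random.Random(seed)
    known_n = est_n = est_n1 = 0.0
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        ss_known = sum((x - mu) ** 2 for x in xs)
        ss_est = sum((x - xbar) ** 2 for x in xs)
        known_n += ss_known / n
        est_n += ss_est / n
        est_n1 += ss_est / (n - 1)
    return known_n / trials, est_n / trials, est_n1 / trials


known_mean_n, sample_mean_n, sample_mean_n1 = average_variance_estimates(
    n=5, trials=20000
)
print("true variance:            4.0")
print("known mean, divide by n:  ", known_mean_n)
print("sample mean, divide by n: ", sample_mean_n)
print("sample mean, divide by n-1:", sample_mean_n1)
```

With the known mean, dividing by n should average out close to the true variance of 4. With the sample mean, dividing by n should come out near (n - 1)/n times that, about 3.2 here, and dividing by n - 1 brings it back to roughly 4.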
 
Hurkyl said:
Actually, both formulae are used... I forget the reasons for using n instead of n-1, though.
I do understand that both are still used but I didn't realize why until:
mathman said:
It depends on what you are using for the mean. If you know the mean, then you divide by n. If you estimate the mean from the sample, then you use n-1, because the estimated mean has a statistical error.
- Mathman came along and said why.

Cheers guys.

The Bob (2004 ©)
 
mathman said:
It depends on what you are using for the mean. If you know the mean, then you divide by n. If you estimate the mean from the sample, then you use n-1, because the estimated mean has a statistical error.
Why does using n-1 instead correct the error? Is this negligible for large values of n?
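On the large-n part of the question, a quick sketch may help. For the same data, the two standard deviations differ only by the factor sqrt(n / (n - 1)), so the gap can be tabulated directly. The function name below is made up for illustration.

```python
import math


def sd_ratio(n):
    """Ratio of the (n - 1)-denominator SD to the n-denominator SD
    computed from the same n data points."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return math.sqrt(n / (n - 1))


for n in (2, 10, 100, 1000):
    percent_larger = (sd_ratio(n) - 1) * 100
    print(f"n = {n:5d}: n-1 version is {percent_larger:.3f}% larger")
```

The difference is roughly 41% at n = 2 and shrinks to about 0.05% at n = 1000, so for large samples the choice of denominator barely matters numerically.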
 
n-1

n-1 is used for samples to adjust for the fact that the data set does not include all possible events.

Using n tends to produce an underestimate of the population variance, so we use n-1 in the denominator to correct for this tendency.

to sum up:
when using populations, use n as the denominator.
else
use n-1

hope that helps!
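The underestimate can be checked exactly on a tiny example, with no randomness involved. The sketch below assumes the "population" is the six faces of a fair die and enumerates every possible sample of size 2 drawn with replacement; the variable and function names are illustrative only.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]  # assumed population: faces of a fair die
n = 2                            # sample size


def mean(xs):
    return Fraction(sum(xs), len(xs))


def sum_sq_dev(xs):
    """Sum of squared deviations from the mean of xs (exact arithmetic)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


# True population variance: divide by the population size.
pop_var = sum_sq_dev(population) / len(population)

# Every equally likely sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
avg_div_n1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)

print("population variance:       ", pop_var)     # 35/12
print("average of divide-by-n:    ", avg_div_n)   # 35/24
print("average of divide-by-(n-1):", avg_div_n1)  # 35/12
```

Averaged over all 36 samples, dividing by n gives exactly half the true variance in this case (the factor (n - 1)/n with n = 2), while dividing by n - 1 recovers it exactly.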
 
The factor (n-1) is used to make the sample variance an "unbiased estimator" of the population variance.

There's no particular reason you need an unbiased estimator, though. For example, if you want to minimize mean squared error, it turns out that it's much better to use (n+1) instead of (n-1) (in case of a normal distribution). See http://www-laplace.imag.fr/Jaynes/prob.html , chapter 17.
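The mean-squared-error claim can be checked with a short calculation. The sketch below relies on a standard result for independent normal data: the sum of squared deviations S has expectation (n - 1)σ² and variance 2(n - 1)σ⁴. From that, the MSE of S/d as an estimate of σ² follows as variance plus squared bias. The function name is hypothetical, and results are expressed in units of σ⁴.

```python
from fractions import Fraction


def mse_in_units_of_sigma4(n, d):
    """MSE of S/d as an estimator of sigma^2, divided by sigma^4.

    Assumes S = sum((x_i - xbar)^2) for n independent normal values, so
    E[S] = (n - 1) sigma^2 and Var[S] = 2 (n - 1) sigma^4.
    """
    variance_term = Fraction(2 * (n - 1), d * d)
    bias = Fraction(n - 1, d) - 1
    return variance_term + bias ** 2


n = 10
for d in (n - 1, n, n + 1):
    mse = mse_in_units_of_sigma4(n, d)
    print(f"divide by {d:2d}: MSE / sigma^4 = {mse} ~ {float(mse):.4f}")

# Search a range of integer divisors for the smallest MSE.
best_d = min(range(1, 5 * n), key=lambda d: mse_in_units_of_sigma4(n, d))
print("divisor with the smallest MSE:", best_d)
```

For n = 10 this gives 2/9 for n - 1, 19/100 for n, and 2/11 for n + 1, so n + 1 does have the lowest MSE, though the size of the gain shrinks as n grows.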
 
The denominator is not the sample size, but the number of degrees of freedom. Initially the two are equal, but when you do the mean (sum over n) you "fix" or "lose" one degree of freedom. So when you then go to use the mean in the sd calculation, you have only n-1 degrees of freedom left.

Degrees of freedom are a thorny thing to teach, and they only become essential to consider in things like ANOVA, so they are frequently skipped in teaching simple statistics.
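One way to see the "lost" degree of freedom concretely: deviations from the sample mean always sum to zero, so once n - 1 of them are known, the last one is forced. A minimal sketch with a made-up data set:

```python
from fractions import Fraction

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only

sample_mean = Fraction(sum(data), len(data))
deviations = [x - sample_mean for x in data]

# Deviations from the sample mean always add up to exactly zero.
print("sum of deviations:", sum(deviations))

# So the last deviation carries no new information: it is determined
# by the other n - 1 deviations.
last_from_others = -sum(deviations[:-1])
print("last deviation:          ", deviations[-1])
print("recovered from the rest: ", last_from_others)
```

Only n - 1 of the n deviations are free to vary, which is the count that ends up in the denominator.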
 
