
Standard deviation

  1. Aug 5, 2004 #1
    Why is the standard deviation [tex]\sigma =\sqrt{\frac{\sum_{i=1}^{n}\left( \overline{x}-x_{i}\right) ^{2}}{n-1}}[/tex] and not [tex]\sigma =\frac{\sum_{i=1}^{n}\left| \overline{x}-x_{i}\right| }{n}[/tex] or at least [tex]\sigma =\sqrt{\frac{\sum_{i=1}^{n}\left( \overline{x}-x_{i}\right) ^{2}}{n}}[/tex]?

    I suppose that it has something to do with the normal distribution but I'm not sure in what way.

    Thanks for the input.
  2. jcsd
  3. Aug 6, 2004 #2


    Science Advisor

    First, the "root-mean-square" is used because it coincides with our usual formula for distance between two points. "Standard deviation" is basically a distance measure.
    That said, the sum of the absolute values of coordinate differences CAN be used as a distance (as can "max of absolute value of coordinate differences") and can be used as a measure of "standard deviation". Since it's not the usual one, all your formulas (including the one for the normal distribution) would have to be changed, and that's a pain.
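To make the "distance" picture concrete, here is a rough sketch in Python. The data set is made up purely for illustration. It treats the deviations from the mean as the coordinates of a vector and measures its length three ways: the usual Euclidean distance, the sum of absolute values, and the max of absolute values.

```python
import math

# Hypothetical data, chosen only so the arithmetic comes out clean.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
mean = sum(data) / n

# Deviations from the mean, viewed as coordinates of one vector.
deviations = [x - mean for x in data]

# Three ways to measure the length of that vector.
euclidean = math.sqrt(sum(d * d for d in deviations))  # usual distance
taxicab = sum(abs(d) for d in deviations)              # sum of absolute values
chebyshev = max(abs(d) for d in deviations)            # max of absolute values

# Rescaling by n gives the familiar spread measures.
rms_deviation = euclidean / math.sqrt(n)  # root-mean-square (divide-by-n form)
mean_abs_deviation = taxicab / n          # mean absolute deviation

print(rms_deviation, mean_abs_deviation, chebyshev)
```

All three are legitimate measures of spread; they just weight large deviations differently, with the squared version penalizing outliers more than the absolute-value version.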

    Secondly, the formula [tex]\sigma =\sqrt{\frac{\sum_{i=1}^{n}\left( \overline{x}-x_{i}\right) ^{2}}{n-1}}[/tex] is for the standard deviation of a SAMPLE from some infinite population.

    [tex]\sigma =\sqrt{\frac{\sum_{i=1}^{n}\left( \overline{x}-x_{i}\right) ^{2}}{n}}[/tex] is correct for the standard deviation of a finite population.

    There are technical reasons for the "n-1" (it gives an "unbiased estimator") but I like to think of it as just making the "spread" a little larger to reflect the fact that, since we are using a sample, not the entire population, we have more uncertainty.
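If you want to see the "unbiased estimator" point numerically, a small simulation can help. This is only a sketch under assumed settings: normally distributed data with true variance 4, samples of size 5, and a fixed random seed. The function name and parameters are my own, not anything standard. It averages the divide-by-n and divide-by-(n-1) variance estimates over many samples.

```python
import random
import statistics

def average_variance_estimates(n=5, trials=20000, sigma=2.0, seed=0):
    """Average the n and n-1 variance estimates over many samples of size n."""
    rng = random.Random(seed)
    total_n = 0.0
    total_n_minus_1 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        total_n += statistics.pvariance(sample)         # divides by n
        total_n_minus_1 += statistics.variance(sample)  # divides by n - 1
    return total_n / trials, total_n_minus_1 / trials

biased, unbiased = average_variance_estimates()
print("true variance: 4.0")
print(f"average of divide-by-n estimates:     {biased:.3f}")
print(f"average of divide-by-(n-1) estimates: {unbiased:.3f}")
```

On average the divide-by-n version should come out near (n-1)/n times the true variance (about 3.2 here), while the n-1 version should land near 4. Note that this is a statement about the variance; the square root of the n-1 estimate is still a slightly biased estimate of the standard deviation itself.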
  4. Aug 6, 2004 #3


    Staff Emeritus
    Science Advisor
    Gold Member

    Also, using squares to measure distance, instead of absolute values, tends to be significantly easier to manipulate. It also places the standard deviation (ok, ok, I mean the variance) among a class of things called the moments about the mean, where you raise the deviations to the power k for any positive integer k, instead of just squaring them.
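As a quick illustration of moments about the mean, here is a minimal sketch. The helper name `central_moment` and the data are made up for the example, and it uses the divide-by-n (population) form.

```python
def central_moment(data, k):
    """k-th moment about the mean, using the divide-by-n (population) form."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n

# Hypothetical data, chosen only so the numbers come out clean.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

first = central_moment(data, 1)   # always zero, up to rounding
second = central_moment(data, 2)  # the variance
third = central_moment(data, 3)   # sign indicates the direction of skew

print(first, second, third)
```

The k = 2 case is the variance, and its square root is the standard deviation from the formulas above.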
  5. Aug 8, 2004 #4
    Thanks for your input. Can you go over what an unbiased estimator is and what maximum likelihood estimators are? I'd really appreciate it. Thanks.
  6. Aug 8, 2004 #5
    And also the use of n vs n-1 with the normal distribution. If I wanted to assume the data were normally distributed and use z-scores and such to estimate the probability of data lying within a range, why use n or n-1 (and which do I use)? Thanks again.