Why Standard Deviation is Calculated Differently: Explained

phoenixthoth
Why is the standard deviation \sigma =\sqrt{\frac{\sum_{i=1}^{n}\left( \overline{x}-x_{i}\right) ^{2}}{n-1}} and not \sigma =\frac{\sum_{i=1}^{n}\left| \overline{x}-x_{i}\right| }{n} or at least \sigma =\sqrt{\frac{\sum_{i=1}^{n}\left( \overline{x}-x_{i}\right) ^{2}}{n}}?

I suppose that it has something to do with the normal distribution but I'm not sure in what way.

Thanks for the input.
 
First, the "root-mean-square" is used because it coincides with our usual formula for distance between two points. "standard deviation" is basically a distance measure.
That said, the sum of the absolute values of coordinate differences CAN be used as a distance (as can "max of absolute value of coordinate differences") and can be used as a measure of "standard deviation". Since it's not the usual one, all your formulas (including that form normal distribution would have to be changed and that's a pain.
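To make the analogy concrete, here is a quick Python sketch (the data values are just made up for illustration) that measures the spread of the same data in three ways, mirroring the Euclidean, taxicab, and max norms:

```python
# Compare three "distance-like" measures of spread on a toy data set.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
mean = sum(data) / n

deviations = [x - mean for x in data]

# Root-mean-square deviation (Euclidean / L2-style spread, dividing by n)
rms = (sum(d ** 2 for d in deviations) / n) ** 0.5

# Mean absolute deviation (taxicab / L1-style spread)
mad = sum(abs(d) for d in deviations) / n

# Maximum absolute deviation (max / L-infinity-style spread)
max_dev = max(abs(d) for d in deviations)

print(f"mean = {mean}")
print(f"root-mean-square deviation = {rms:.4f}")
print(f"mean absolute deviation    = {mad:.4f}")
print(f"maximum absolute deviation = {max_dev:.4f}")
```

All three are legitimate summaries of how far the data sit from the mean; they just measure "far" with different norms, and the root-mean-square one is the choice that the rest of the standard machinery is built around.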

Secondly, the formula \sigma =\sqrt{\frac{\sum_{i=1}^{n}\left( \overline{x}-x_{i}\right) ^{2}}{n-1}} is for the standard deviation of a SAMPLE from some infinite population.

\sigma =\sqrt{\frac{\sum_{i=1}^{n}\left( \overline{x}-x_{i}\right) ^{2}}{n}} is correct for the standard deviation of a finite population.

There are technical reasons for the "n-1" (dividing by n-1 makes the sample variance an unbiased estimator of the population variance), but I like to think of it as just making the "spread" a little larger to reflect the fact that, since we are using a sample rather than the entire population, we have more uncertainty.
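A quick way to see what the n-1 does is a small simulation (a rough sketch; the population, sample size, and number of trials are arbitrary choices): draw many small samples from a population whose variance is known, and compare the average of the two variance formulas against the true variance.

```python
import random

# Simulate repeated sampling to compare dividing by n and by n-1.
random.seed(0)
true_var = 1.0          # variance of a standard normal population
sample_size = 5
trials = 200_000

sum_var_n = 0.0
sum_var_n_minus_1 = 0.0
for _ in range(trials):
    sample = [random.gauss(0.0, 1.0) for _ in range(sample_size)]
    m = sum(sample) / sample_size
    ss = sum((x - m) ** 2 for x in sample)
    sum_var_n += ss / sample_size
    sum_var_n_minus_1 += ss / (sample_size - 1)

print(f"true variance:         {true_var}")
print(f"average of sum/n:      {sum_var_n / trials:.4f}")        # systematically too small
print(f"average of sum/(n-1):  {sum_var_n_minus_1 / trials:.4f}")  # close to 1.0
```

Dividing by n underestimates the population variance on average (by a factor of (n-1)/n), which is exactly what the n-1 corrects.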
 
Also, using squares to measure distance, instead of absolute values, tends to be significantly easier to manipulate algebraically. It also places the standard deviation (ok, ok, I mean the variance) in a class of quantities called the moments about the mean, where you raise the deviations to the k-th power for any positive integer k instead of just squaring them.
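For reference, a minimal sketch of those moments (the function name is just my own choice): the variance is the k = 2 case, and higher values of k give the moments used for skewness and kurtosis.

```python
def central_moment(data, k):
    """k-th moment about the mean: the average of (x - mean)**k."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(central_moment(data, 2))  # variance (k = 2)
print(central_moment(data, 3))  # enters the skewness
print(central_moment(data, 4))  # enters the kurtosis
```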
 
Thanks for your input. Can you go over what an unbiased estimator is and what maximum likelihood estimators are? I'd really appreciate it. Thanks.
 
Could you also go over the use of n vs. n-1 when working with the normal distribution? For example, if I wanted to assume the data were normally distributed and use z-scores to estimate the probability of the data lying within some range, why would I use n or n-1 (and which one do I use)? Thanks again.
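For reference, here is the kind of calculation I mean, sketched in Python with made-up numbers; the question is whether s below should be computed with n or n-1:

```python
import math

# The kind of z-score calculation I have in mind (made-up sample data).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
mean = sum(data) / n
ss = sum((x - mean) ** 2 for x in data)

s_n = math.sqrt(ss / n)          # divide by n
s_n1 = math.sqrt(ss / (n - 1))   # divide by n - 1

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

a, b = 3.0, 7.0
for label, s in (("n", s_n), ("n-1", s_n1)):
    prob = normal_cdf((b - mean) / s) - normal_cdf((a - mean) / s)
    print(f"using {label}: P({a} < X < {b}) is about {prob:.3f}")
```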
 