Why is the degrees of freedom for variance calculation n-1 instead of n?

PRodQuanta · Sep 18, 2004

Well, we're trying to get caught up to where we were in school last year, and we are going over standard deviation. I just have a simple question.

Why, when you are examining only a sample do you (when trying to find the variance) use n-1, and when you are examining the population, you use n?

Let's see if I can't get this in latex...:

\sqrt{\frac{1}{n-1}\sum_{i=1}^n (x_i-\bar{x})^2}

instead of

\sqrt{\frac{1}{n}\sum_{i=1}^n (x_i-\bar{x})^2}

Sorry if this question insults your intelligence. I just can't see the reason.

Paden Roder

TenaliRaman · Sep 18, 2004

Don't worry, it's a problem for many, not just you.
You may like to go through this page:
http://mathworld.wolfram.com/Variance.html

-- AI

mathman · Sep 18, 2004

When you compute the average of the squared deviations from the sample mean (i.e. the sample variance), the mathematical expectation of that quantity equals the theoretical variance only if you divide by n-1, not n.
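A short derivation may make this concrete. It is only a sketch, and it assumes something not stated above: that the x_i are independent draws from a population with mean \mu and variance \sigma^2 (these two symbols are introduced here for illustration). It uses the fact that the sample mean \bar{x} then has variance \sigma^2/n.

```latex
\begin{align*}
\sum_{i=1}^n (x_i-\bar{x})^2
  &= \sum_{i=1}^n (x_i-\mu)^2 - n(\bar{x}-\mu)^2 \\
E\left[\sum_{i=1}^n (x_i-\bar{x})^2\right]
  &= n\sigma^2 - n\cdot\frac{\sigma^2}{n} = (n-1)\sigma^2
\end{align*}
```

So dividing the sum of squares by n-1 gives an expectation of exactly \sigma^2, while dividing by n gives \frac{n-1}{n}\sigma^2, which is slightly too small.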

gerben · Sep 18, 2004

You want to estimate the variance of the whole population. If you have the whole population, you can just calculate it. But when you have only a small portion (a sample) of the population, the spread measured around the sample's own mean tends to come out a bit lower than the variance of the whole population, because the sample mean sits closer to the sample points than the true population mean does. To correct for this, you divide by one less than the number of data points in your sample, which gives a somewhat higher estimate of the population variance. If the sample gets large enough there is hardly any difference (it does not matter very much whether you divide by 1000 or by 999).
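A quick simulation can illustrate this. The sketch below is only an illustration under assumed settings: the population is taken to be normal with variance 1, the sample size is 5, and the function name `average_variance_estimates` is made up for this example. It draws many small samples and averages both versions of the estimate.

```python
import random

def average_variance_estimates(n=5, trials=20000, seed=0):
    """Average the divide-by-n and divide-by-(n-1) estimates over many samples.

    Assumption for illustration: the population is normal with mean 0
    and variance 1, so the true variance being estimated is 1.0.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_div_n = 0.0
    total_div_n_minus_1 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(sample) / n
        # Sum of squared deviations from the sample's own mean.
        sum_sq = sum((x - mean) ** 2 for x in sample)
        total_div_n += sum_sq / n
        total_div_n_minus_1 += sum_sq / (n - 1)
    return total_div_n / trials, total_div_n_minus_1 / trials

avg_div_n, avg_div_n_minus_1 = average_variance_estimates()
print("divide by n:    ", avg_div_n)
print("divide by n - 1:", avg_div_n_minus_1)
```

With these settings the divide-by-n average should land near 0.8 (that is, (n-1)/n times the true variance), while the divide-by-(n-1) average should land near the true value of 1.0.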

PRodQuanta · Sep 19, 2004

Thanks guys. That makes sense.

I liked your post, TenaliRaman. After reading mathman's and gerben's "layman's" explanations, the page was pretty easy to read, and I could see mathematically what was going on.

Thanks.

Paden Roder

selfAdjoint · Sep 19, 2004

Let me just make an addition. When you compute the variance, you have to know the mean. When you computed the mean, the number of degrees of freedom you had to account for in your finite sample was just the number of members in it, say n. Now when you come to use the mean in your variance calculation, you have used up one of your degrees of freedom and bundled it up into the value of the mean. So to divide out the number of degrees of freedom, you only have n-1 of them left.
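One way to see the "used up" degree of freedom is that the deviations from the sample mean always sum to zero, so once you know n-1 of them the last one is forced. A minimal sketch, using made-up numbers chosen only for illustration:

```python
import math

# Made-up sample, purely for illustration.
sample = [2.0, 4.0, 4.0, 5.0, 10.0]
n = len(sample)
mean = sum(sample) / n

# Deviations from the sample mean; these always add up to zero.
deviations = [x - mean for x in sample]
print("sum of deviations:", sum(deviations))

# Knowing the first n-1 deviations pins down the last one,
# so only n-1 of them are free to vary.
recovered_last = -sum(deviations[:-1])
print("last deviation:", deviations[-1], "recovered:", recovered_last)

# Sample variance: divide the sum of squares by the n-1 free deviations.
sample_variance = sum(d ** 2 for d in deviations) / (n - 1)
print("sample variance:", sample_variance)
```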
