Why are the degrees of freedom for the variance calculation n-1 instead of n?

  • Context: High School 
  • Thread starter: PRodQuanta
  • Tags: Statistics

Discussion Overview

The discussion centers on the rationale behind using n-1 instead of n when calculating the variance of a sample, as opposed to that of a population. It explores theoretical and practical implications of this choice in the context of statistics and data analysis.

Discussion Character

  • Conceptual clarification
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant questions the reasoning behind using n-1 for sample variance, expressing confusion about the distinction between sample and population variance.
  • Another participant suggests that the expectation of the sample variance equals the theoretical variance when using n-1, indicating a mathematical basis for this approach.
  • A different viewpoint emphasizes that using n-1 compensates for the bias in estimating population variance from a sample, as sample variance tends to underestimate the true population variance.
  • One participant adds that when calculating variance, using the sample mean consumes one degree of freedom, thus leaving n-1 degrees of freedom for the variance calculation.

Areas of Agreement / Disagreement

Participants express varying levels of understanding and agreement regarding the rationale for using n-1. While some explanations resonate with others, no consensus is reached on a singular, definitive explanation.

Contextual Notes

Some participants reference mathematical expectations and degrees of freedom without fully resolving the underlying assumptions or implications of these concepts. The discussion remains open to interpretation and further exploration.

Who May Find This Useful

This discussion may be useful for students and individuals seeking to understand statistical concepts related to variance, particularly in distinguishing between sample and population calculations.

PRodQuanta
Well, we're trying to get caught up to where we were in school last year. And we are going over standard deviation. I just have a simple question.

Why, when you are examining only a sample do you (when trying to find the variance) use n-1, and when you are examining the population, you use n?

Let's see if I can't get this in latex...:

[tex]\sqrt{\frac{1}{n-1}\sum_{i=1}^n ({x_i}-{\bar{x}})^2}[/tex] instead of [tex]\sqrt{\frac{1}{n}\sum_{i=1}^n ({x_i}-{\bar{x}})^2}[/tex]

Sorry if this question insults your intelligence. I just can't see the reason.

Paden Roder
 
When you average the squared deviations from the sample mean (i.e. compute the sample variance), the mathematical expectation of the result equals the theoretical variance only if you divide by n-1, not by n.
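
A sketch of why the expectation comes out that way, assuming the [tex]x_i[/tex] are independent draws from a population with mean [tex]\mu[/tex] and variance [tex]\sigma^2[/tex] (these two symbols are added here for illustration; they do not appear in the thread). First split the sum of squares around the true mean:

[tex]\sum_{i=1}^n (x_i-\bar{x})^2 = \sum_{i=1}^n (x_i-\mu)^2 - n(\bar{x}-\mu)^2[/tex]

Taking expectations, the first term contributes [tex]n\sigma^2[/tex], and since the variance of [tex]\bar{x}[/tex] is [tex]\sigma^2/n[/tex], the second contributes [tex]\sigma^2[/tex]:

[tex]E\left[\sum_{i=1}^n (x_i-\bar{x})^2\right] = n\sigma^2 - n\cdot\frac{\sigma^2}{n} = (n-1)\sigma^2[/tex]

So dividing the sum by n-1 gives an expectation of [tex]\sigma^2[/tex], while dividing by n gives [tex]\frac{n-1}{n}\sigma^2[/tex], which is slightly too small.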
 
You want to estimate the variance of the whole population. If you have the whole population, you can just calculate it. But when you have only a small portion (a sample) of the population, the spread you measure around the sample's own mean tends to be a bit lower than the variance of the whole population, because the sample mean is by construction the point closest to the sample values. So, to make your estimate of the population variance more suitable, you divide by one less than the number of data points in your sample, which gives a somewhat higher number. If the sample gets large enough there is hardly any difference (it does not matter very much whether you divide by 1000 or by 999).
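
If you would rather see this than take it on faith, here is a rough simulation sketch in Python. The setup is an assumption made for illustration only: samples of size 5 drawn from a normal distribution with true variance 4.0, and the helper names `variance` and `average_estimates` are made up for this example. It averages both versions of the estimate over many samples.

```python
import random

def variance(sample, ddof):
    """Sum of squared deviations from the sample mean, divided by n - ddof."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / (n - ddof)

def average_estimates(n=5, trials=20000, sigma=2.0, seed=0):
    """Average the divide-by-n and divide-by-(n-1) estimates over many samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_n = 0.0
    total_n1 = 0.0
    for _ in range(trials):
        # Illustrative assumption: normal population, mean 0, variance sigma**2
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        total_n += variance(sample, ddof=0)   # divide by n
        total_n1 += variance(sample, ddof=1)  # divide by n-1
    return total_n / trials, total_n1 / trials

biased, unbiased = average_estimates()
print("divide by n:  ", biased)
print("divide by n-1:", unbiased)
print("true variance:", 2.0 ** 2)
```

On average the divide-by-n figure should land near (n-1)/n times the true variance, and the divide-by-(n-1) figure near the true variance itself. Raising `n` shrinks the gap, which matches the 1000 versus 999 remark.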
 
Thanks guys. That makes sense.

I liked your post TenaliRaman. After reading mathman and gerben's "laymen" explanation, it was pretty easy to read and discover mathematically what was going on.

Thanks.

Paden Roder
 
Let me just make an addition. When you compute the variance, you have to know the mean. When you computed the mean, the number of degrees of freedom you had to account for in your finite sample was just the number of members in it, say n. Now when you come to use the mean in your variance calculation, you have used up one of your degrees of freedom and bundled it up into the value of the mean. So to divide out the number of degrees of freedom, you only have n-1 of them left.
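
One way to see the "used up" degree of freedom concretely: the deviations from the sample mean always add up to zero, so once you know n-1 of them, the last one is forced. A small Python sketch, using made-up numbers and a hypothetical helper `deviations`:

```python
def deviations(sample):
    """Return each value's deviation from the sample mean."""
    mean = sum(sample) / len(sample)
    return [x - mean for x in sample]

sample = [2.0, 4.0, 4.0, 5.0, 10.0]  # made-up data, n = 5
devs = deviations(sample)

# The deviations from the sample mean sum to zero (up to rounding).
total = sum(devs)

# So the last deviation can be rebuilt from the other n-1 of them.
rebuilt_last = -sum(devs[:-1])

print(devs)
print(total)
print(rebuilt_last, devs[-1])
```

Only n-1 of the deviations are free to vary, which is the sense in which the sum of squares carries n-1 degrees of freedom.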
 
