Confused about the intuitive explanation of degrees of freedom

SUMMARY

The discussion centers on the concept of degrees of freedom (D.F.) in statistics, particularly in relation to estimating population variance using sample variance. It is established that when calculating sample variance, the divisor is n-1 to ensure an unbiased estimate, as the sample mean m is derived from the n numbers. The confusion arises from whether the sample mean should also be considered a degree of freedom, leading to debates about the intuitive understanding of D.F. The consensus is that n-1 represents the degrees of freedom when n numbers are involved, as the nth number is determined by the others to maintain the condition that all residuals sum to zero.

PREREQUISITES
  • Understanding of sample variance and population variance
  • Familiarity with statistical concepts such as residuals and unbiased estimators
  • Basic knowledge of probability theory
  • Awareness of statistical tests like the chi-square test
NEXT STEPS
  • Study the derivation of sample variance and its relationship with degrees of freedom
  • Learn about the chi-square test and how degrees of freedom apply in hypothesis testing
  • Explore intuitive explanations and common misconceptions about degrees of freedom in statistics
  • Review additional resources on statistical estimation techniques and their implications
USEFUL FOR

Statisticians, mathematicians, data analysts, and anyone seeking a deeper understanding of statistical concepts related to degrees of freedom and sample variance.

kotreny
One common explanation of the concept of D.F. is this:

Suppose you have n numbers (a, b, c, ...) that make up a sample of a population. You want to estimate the variance of the population with the sample variance. But the sample mean m is being calculated from these numbers, so when determining the variance ((a-m)^2 + (b-m)^2 + (c-m)^2 + ...)/n, only n-1 numbers are free to vary. The n-th number must be chosen so that the mean of all n numbers comes out to m. Thus, there are only n-1 "degrees of freedom."

But wait--shouldn't m be free to vary in this case? The value of the n-th number is a function of the other numbers and m. Fair enough, but that means m must become the n-th degree of freedom!
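
To make the counting in this post concrete, here is a minimal Python sketch. The sample values and the helper name `last_value` are made up for illustration and appear in none of the sources discussed. It shows that once m and the first n-1 numbers are fixed, the n-th number is forced.

```python
# Hypothetical sample; the values are made up for illustration only.
sample = [4.0, 7.0, 1.0, 8.0]
n = len(sample)
m = sum(sample) / n  # sample mean


def last_value(first_values, mean, size):
    """Return the one value that makes `size` numbers average to `mean`.

    Illustrative helper (not from any library): it solves
    (sum(first_values) + x) / size == mean for x.
    """
    return size * mean - sum(first_values)


# Fix m and the first n-1 numbers; the n-th number is then determined.
recovered = last_value(sample[:-1], m, n)
print(recovered)
```

The function takes both the n-1 free numbers and m as inputs, which is exactly the dependence the question above points at.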
 
I am not sure what your point is. However, in estimating the variance, the sample variance uses the divisor n-1 so that it is an unbiased estimate of the true variance.
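
The unbiasedness claim can be checked numerically. The following is a rough simulation sketch, not a proof. The function name, the choice of a standard normal population (true variance 1), the sample size, the trial count, and the seed are all assumptions made for illustration.

```python
import random


def mean_of_variance_estimates(n, divisor, trials=20000, seed=0):
    """Average the estimate sum((x - m)^2) / divisor over many samples.

    Samples of size n are drawn from a standard normal population,
    so the true variance is 1 (an assumption of this sketch).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        m = sum(xs) / n                       # sample mean
        ss = sum((x - m) ** 2 for x in xs)    # sum of squared residuals
        total += ss / divisor
    return total / trials


n = 5
with_n = mean_of_variance_estimates(n, divisor=n)
with_n_minus_1 = mean_of_variance_estimates(n, divisor=n - 1)
print(with_n, with_n_minus_1)
```

With these settings the divisor n should average close to (n-1)/n = 0.8, and the divisor n-1 close to 1, up to simulation noise.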
 
Sorry, I forgot to add that this is a common intuitive explanation for why the n-1 creates an unbiased sample variance. I take it it's a bad one? Regardless, n-1 is generally said to be the number of degrees of freedom in the case of n numbers whose residuals must sum to zero. Supposedly, only n-1 numbers are useful as information because they are free to vary. The nth number is completely determined by the previous n-1 numbers and the condition that all n residuals sum to zero. Sometimes the explanation describes the sample mean as the condition. My argument is that either of these additional conditions qualifies as a degree of freedom itself, making it n degrees of freedom no matter what.
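
The sum-to-zero condition on the residuals can be sketched in a few lines of Python. The sample values are made up, and the snippet is only an illustration of the condition as stated, not a resolution of the question. It shows two things: the last residual is fixed by the other n-1 residuals, and shifting every number changes m but leaves the residuals unchanged.

```python
# Hypothetical sample; the values are made up for illustration only.
sample = [4.0, 7.0, 1.0, 8.0]
n = len(sample)
m = sum(sample) / n
residuals = [x - m for x in sample]

# The residuals sum to zero, so the last one follows from the first n-1.
implied_last = -sum(residuals[:-1])
print(residuals, implied_last)

# Shift every number by the same amount: the mean moves, the residuals do not.
shifted = [x + 10.0 for x in sample]
m_shifted = sum(shifted) / n
residuals_shifted = [x - m_shifted for x in shifted]
print(m, m_shifted, residuals_shifted)
```

Read this way, the n raw numbers and the n residuals are different objects: the mean can move freely without affecting the residuals that enter the variance.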

Here is a small sample of links with the D.F. explanation I am questioning. Either all are wrong (not likely), I misinterpreted them, or my own reasoning is naive. Please, clear up the situation for me if you can.

http://en.wikipedia.org/wiki/Degrees_of_freedom_(statistics)#Linear_regression

http://www.tufts.edu/~gdallal/dof.htm

http://arnoldkling.com/apstats/df.html

 
As a mathematician specializing in probability theory (not statistics), I have not worked with the concept of degrees of freedom. However, the justification for n-1 comes directly from computing the expected value of the sample variance: for that expectation to equal the true variance, the divisor has to be n-1.
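
For reference, the expectation argument can be written out as follows. This is the standard derivation under assumptions the thread does not state explicitly: the observations x_1, ..., x_n are independent draws from a population with mean \mu and finite variance \sigma^2 (these symbols are introduced here), and m is the sample mean as above.

```latex
\begin{align*}
\sum_{i=1}^{n} (x_i - m)^2
  &= \sum_{i=1}^{n} (x_i - \mu)^2 - n\,(m - \mu)^2, \\
\mathbb{E}\!\left[\sum_{i=1}^{n} (x_i - \mu)^2\right]
  &= n\,\sigma^2, \qquad
\mathbb{E}\!\left[n\,(m - \mu)^2\right]
  = n \cdot \frac{\sigma^2}{n} = \sigma^2, \\
\mathbb{E}\!\left[\sum_{i=1}^{n} (x_i - m)^2\right]
  &= (n - 1)\,\sigma^2
\quad\Longrightarrow\quad
\mathbb{E}\!\left[\frac{1}{n-1}\sum_{i=1}^{n} (x_i - m)^2\right] = \sigma^2 .
\end{align*}
```

The n-1 appears because measuring deviations from m rather than from \mu removes, on average, exactly one \sigma^2 from the sum of squares.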
 
Thank you for replying anyway. I am familiar with the proof you speak of, but some people have said that the n-1 "makes sense because it is the number of degrees of freedom." I rather doubt this claim; in fact, as I said twice, I doubt the entire claim that n-1 is even the number of D.F. to begin with.
 
I, too, have been struggling with this concept. I don't think degrees of freedom really work in an intuitive manner, so I'm settling for using n-1 in the sample variance simply because it makes the estimator unbiased.
 
Hi mezza8, thanks for the input and welcome to the forums. Even if we discard completely the D.F. connection to the sample variance, D.F. is still an important concept in statistics. It is applied in the chi-square test for example. A lot of people say that degrees of freedom is an intuitive concept, and make the questionable argument seen in my links and discussed above. (Check the YouTube one for a particularly clear demonstration of this dubious reasoning. If the link doesn't work for any reason, the uploader's name is jdeisenberg. You can search that with "degrees of freedom.") I hope I have made clear why I think this argument is false. When using an estimated parameter to justify removing a D.F., the parameter itself becomes the so-called removed D.F.
 
