# To find the mean we divide the sum of numbers by 'n'; for variance, why 'n-1'?

1. Aug 16, 2013

### dexterdev

Hi all,
To find the average, we take the sum of a sequence of numbers and divide by the number of elements. Why, for the variance, does this change to the number of elements minus 1?

-Devanand T

2. Aug 16, 2013

### Stephen Tashi

The definition of the formula for the "sample variance" varies from book to book. Many books prefer the formula that uses division by n-1 because this is also the formula for the "unbiased estimator" of the population variance. Do you know about the definition of an "unbiased" estimator? - or the definition of an "estimator" for that matter?
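A quick way to see what "unbiased" means in practice is a Monte Carlo check: average each estimator over many repeated samples and compare against the known population variance. This is a minimal Python sketch (the population, sample size, and trial count are my own choices for illustration):

```python
import random

random.seed(0)

TRUE_VAR = 4.0          # population is N(0, 2): variance 2**2 = 4
n, trials = 5, 200_000  # small samples, many repetitions

sum_biased = sum_unbiased = 0.0
for _ in range(trials):
    xs = [random.gauss(0, 2) for _ in range(n)]
    m = sum(xs) / n
    ss = sum((x - m) ** 2 for x in xs)   # sum of squared deviations
    sum_biased += ss / n                 # divide by n
    sum_unbiased += ss / (n - 1)         # divide by n-1 (Bessel's correction)

biased_mean = sum_biased / trials        # ≈ (n-1)/n * 4 = 3.2, too small
unbiased_mean = sum_unbiased / trials    # ≈ 4.0, matches TRUE_VAR
```

Dividing by n systematically underestimates the population variance by the factor (n-1)/n; dividing by n-1 removes that bias on average.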

3. Aug 16, 2013

### chiro

To add to what Stephen Tashi said, there are general formulas for an arbitrary number of degrees of freedom.

If you do enough statistics you'll see divisors like n-1, n-2, n-4 instead of n, and the reason is that you are estimating a quantity after fitting p parameters from the data, which is why you divide by n-p to get an unbiased estimator.

You'll learn this when you look at degrees of freedom in depth.
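To illustrate the n-p point numerically, here is a hedged sketch: fit a straight line by least squares (p = 2 fitted parameters, slope and intercept) and check that the residual sum of squares divided by n-2, not n, recovers the true noise variance on average. The model and the numbers are my own choices:

```python
import random

random.seed(1)

# True model: y = 2 + 0.5*x + noise, noise ~ N(0, 1), so sigma^2 = 1
n, trials = 10, 50_000
xs = [float(i) for i in range(n)]
xbar = sum(xs) / n
sxx = sum((x - xbar) ** 2 for x in xs)

acc_n = acc_np = 0.0
for _ in range(trials):
    ys = [2.0 + 0.5 * x + random.gauss(0, 1) for x in xs]
    ybar = sum(ys) / n
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    a = ybar - b * xbar
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    acc_n += rss / n            # divide by n: biased
    acc_np += rss / (n - 2)     # divide by n-p with p = 2: unbiased

mean_n = acc_n / trials         # ≈ (n-2)/n * 1 = 0.8, too small
mean_np = acc_np / trials       # ≈ 1.0, the true noise variance
```

Each fitted parameter uses up one degree of freedom, so the residuals carry only n-2 degrees of freedom's worth of scatter; the plain sample mean is just the special case p = 1.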

4. Aug 16, 2013

### dexterdev

Thank you guys, I will look up what an estimator and degrees of freedom are; I have no idea about these yet.

5. Aug 24, 2013

### haruspex

Let me try to show why n-1 is right in certain situations.
Suppose there is a total population N with unknown mean μ and variance σ². From this, you draw a sample S = {x_i} of size n. You can calculate the mean μ_S = Σx_i/n and variance σ²_S = Σ(x_i − μ_S)²/n of the sample (just treated as a set of numbers).
Consider the function V_S(y) = Σ(x_i − y)²/n.
If somehow you knew the real value of μ you could write down an unbiased estimate for σ² as V_S(μ).
The mean of the sample, μ_S, is the number y which minimises V_S(y). Since the true mean of the population is likely to be a little different, V_S(μ) tends to be greater than V_S(μ_S).
One way to think of this is that taking all the samples relative to μ_S removes one degree of freedom, leaving only n-1 degrees for how the samples are scattered around it.
Anyway, it's not hard to show that σ²_S · n/(n-1) is an unbiased estimator for σ².
Sadly, the terminology is very misleading. "Sample variance" generally refers to the estimated variance of the population given the sample (i.e. the n-1 form), when it sounds like it should be the variance of the sample taken as an abstract set of numbers (the n form).
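The argument above can be checked numerically: averaged over many samples, V_S(μ) recovers σ², V_S(μ_S) comes up short by exactly the factor (n-1)/n, and multiplying by n/(n-1) repairs it. A small sketch (the distribution, sample size, and trial count are my own choices):

```python
import random

random.seed(2)

mu, sigma2 = 5.0, 9.0   # true mean and variance (sd = 3)
n, trials = 4, 200_000

v_true = v_samp = 0.0
for _ in range(trials):
    xs = [random.gauss(mu, 3) for _ in range(n)]
    m = sum(xs) / n                                  # sample mean μ_S
    v_true += sum((x - mu) ** 2 for x in xs) / n     # V_S(μ), uses true mean
    v_samp += sum((x - m) ** 2 for x in xs) / n      # V_S(μ_S), uses sample mean

vs_mu = v_true / trials            # ≈ 9.0: V_S(μ) is already unbiased
vs_ms = v_samp / trials            # ≈ (n-1)/n * 9 = 6.75: biased low
corrected = vs_ms * n / (n - 1)    # ≈ 9.0: the n/(n-1) factor fixes it
```

The gap between vs_mu and vs_ms is exactly the bias the sample mean introduces by minimising V_S(y), which is why the n-1 divisor compensates for it.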