- #1


I was just wondering why we use (n1-1) etc instead of using n1 and n2 and then dividing by n1 + n2?

thanks


- Thread starter adamg

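For context, the setup the question seems to describe is the two-sample pooled variance, where each sample's sum of squared deviations about its own mean is divided by the combined degrees of freedom (n1 - 1) + (n2 - 1) rather than by n1 + n2. A minimal sketch in Python (function and variable names are illustrative, not from the thread):

```python
import numpy as np

def pooled_variance(x1, x2):
    """Pooled estimate of a common variance from two samples.

    Each sample contributes its sum of squared deviations about its
    own mean; the total is divided by (n1 - 1) + (n2 - 1), the
    combined degrees of freedom, not by n1 + n2.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    ss1 = np.sum((x1 - x1.mean()) ** 2)
    ss2 = np.sum((x2 - x2.mean()) ** 2)
    return (ss1 + ss2) / ((len(x1) - 1) + (len(x2) - 1))
```

This is algebraically the same as the textbook form ((n1-1)s1² + (n2-1)s2²)/(n1+n2-2), with s1², s2² the per-sample variances computed with the n-1 denominator; the replies below explain why the -1 appears in the first place.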

- #2

uart

Science Advisor


adamg said:

I was just wondering why we use (n1-1) etc instead of using n1 and n2 and then dividing by n1 + n2?

thanks

When you calculate the sample variance using "

In your example above I assume that


- #3

uart

Science Advisor


Here is the above in a bit more detail:

[tex]\[ s_n^2 = \frac{1}{n} \sum (x_i-\bar{x})^2 \][/tex]

[tex]\[ = \frac{1}{n} \sum \left[ \left( (x_i-\mu) - (\bar{x}-\mu) \right)^2 \right] \][/tex]

[tex]\[ = \frac{1}{n} \sum \left[ (x_i-\mu)^2 - 2 (x_i-\mu)(\bar{x}-\mu) + (\bar{x}-\mu)^2 \right] \][/tex]

[tex]\[ = \frac{1}{n} \sum (x_i-\mu)^2 - (\bar{x}-\mu)^2 \][/tex]

(the cross term collapses because [tex]\frac{1}{n} \sum (x_i-\mu) = \bar{x}-\mu[/tex], so it contributes [tex]-2(\bar{x}-\mu)^2[/tex], which combines with the [tex]+(\bar{x}-\mu)^2[/tex] term)
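That last line is a purely algebraic identity, so it holds for any data and any reference point; a quick numerical sanity check (the data and the value of mu below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=10)   # arbitrary sample
mu = 1.7                  # any reference point works; the identity is algebraic

# (1/n) sum (x_i - xbar)^2  ==  (1/n) sum (x_i - mu)^2 - (xbar - mu)^2
lhs = np.mean((x - x.mean()) ** 2)
rhs = np.mean((x - mu) ** 2) - (x.mean() - mu) ** 2

assert np.isclose(lhs, rhs)
```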

So,

[tex]\[ E[s_n^2] = \frac{1}{n} \sum E[(x_i-\mu)^2] - E[(\bar{x}-\mu)^2] \][/tex]

[tex]\[ = E[(x-\mu)^2] - E[(\bar{x}-\mu)^2] \][/tex]

[tex]\[ = \sigma^2 - \{\rm{term\ greater\ than\ or\ equal\ to\ zero}\} \][/tex]

This shows that the sample variance [tex]s_n^2[/tex] underestimates the population variance [tex]\sigma^2[/tex] on average.

You can further show (assuming all the samples are independent) that [tex]E[(\bar{x}-\mu)^2][/tex] is equal to [tex]\sigma^2/n[/tex], and hence,
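That identity, E[(x̄-μ)²] = σ²/n, is easy to check by simulation; a sketch below (the distribution, sample size, and seed are arbitrary choices, not from the thread):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, trials = 0.0, 2.0, 5, 200_000

# Draw many independent samples of size n and average
# the squared error of the sample mean about mu.
samples = rng.normal(mu, sigma, size=(trials, n))
mean_sq_err = np.mean((samples.mean(axis=1) - mu) ** 2)

print(mean_sq_err)  # close to sigma**2 / n = 0.8
```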

[tex]\[ E[s_n^2] = \sigma^2 - \sigma^2/n = \frac{n-1}{n} \sigma^2 \][/tex]

So not only is [tex]s_n^2[/tex] a biased estimator of [tex]\sigma^2[/tex], it is too small by a factor of exactly [tex](n-1)/n[/tex]. Using [tex](n-1)[/tex] instead of [tex]n[/tex] in the denominator fixes this and makes the expectation of the modified sample variance, [tex]E[s_{n-1}^2][/tex], equal to the population variance [tex]\sigma^2[/tex].
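The bias factor can also be seen directly by simulation. The sketch below (normal data, arbitrary parameters and seed) compares the n-denominator and (n-1)-denominator estimators averaged over many samples, using NumPy's `ddof` argument to switch between them:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma2, n, trials = 4.0, 5, 200_000

samples = rng.normal(0.0, np.sqrt(sigma2), size=(trials, n))

biased   = samples.var(axis=1, ddof=0).mean()  # divides by n
unbiased = samples.var(axis=1, ddof=1).mean()  # divides by n - 1

print(biased)    # near (n-1)/n * sigma2 = 3.2
print(unbiased)  # near sigma2 = 4.0
```

With n = 5 the biased estimator averages about 3.2 against a true variance of 4.0, i.e. short by the factor (n-1)/n = 4/5 derived above, while the n-1 version averages the true value.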

