- #1

I was just wondering why we use (n1-1) etc. instead of using n1 and n2 and then dividing by n1 + n2?

Thanks

- Thread starter: adamg

- #2

uart

Science Advisor


adamg said:

> I was just wondering why we use (n1-1) etc instead of using n1 and n2 and then dividing by n1 + n2?
>
> thanks

When you calculate the sample variance using

In your example above I assume that


- #3

uart

Science Advisor


Here is the above in a bit more detail:

[tex]\[ s_n^2 = \frac{1}{n} \sum (x_i-\bar{x})^2 \][/tex]

[tex]\[= \frac{1}{n} \sum [ ( (x_i-\mu) - (\bar{x}-\mu) )^2 ]\][/tex]

[tex]\[= \frac{1}{n} \sum [(x_i-\mu)^2 - 2 (x_i-\mu)(\bar{x}-\mu) + (\bar{x}-\mu)^2 ] \][/tex]

[tex]\[= \frac{1}{n} \sum (x_i-\mu)^2 - (\bar{x}-\mu)^2, \][/tex]

since [tex]\frac{1}{n} \sum (x_i-\mu) = \bar{x}-\mu[/tex], so the cross term equals [tex]-2(\bar{x}-\mu)^2[/tex] and partly cancels the [tex](\bar{x}-\mu)^2[/tex] term.

So,

[tex]\[E[s_n^2] = \frac{1}{n} \sum E[(x_i-\mu)^2] - E[(\bar{x}-\mu)^2] \][/tex]

[tex]\[= E[(x-\mu)^2] - E[(\bar{x}-\mu)^2] \][/tex]

[tex]\[= \sigma^2 - \{\rm{term\ greater\ than\ or\ equal\ to\ zero}\} \][/tex]

This shows that the sample variance [tex]s_n^2[/tex] always under-estimates the population variance [tex]\sigma^2[/tex].

You can further show (assuming all the samples are independent) that [tex]E[(\bar{x}-\mu)^2][/tex] is equal to [tex]\sigma^2/n[/tex], and hence
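(A quick sketch of that step, assuming the [tex]x_i[/tex] are independent with common mean [tex]\mu[/tex] and variance [tex]\sigma^2[/tex]: expand the square of the average, and note the cross terms have zero expectation by independence.)

[tex]\[E[(\bar{x}-\mu)^2] = E\left[\left(\frac{1}{n}\sum (x_i-\mu)\right)^2\right] = \frac{1}{n^2}\sum E[(x_i-\mu)^2] = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}. \][/tex]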

[tex]\[E[s_n^2] = \sigma^2 - \sigma^2/n = \frac{n-1}{n} \sigma^2. \][/tex]

So not only is [tex]s_n^2[/tex] a biased estimator of [tex]\sigma^2[/tex], it is too small by a factor of precisely [tex](n-1)/n[/tex]. Clearly, using [tex](n-1)[/tex] instead of [tex]n[/tex] in the denominator fixes this and makes the expectation of the modified sample variance, [tex]E(s_{n-1}^2)[/tex], equal to the population variance [tex]\sigma^2[/tex].
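As a quick sanity check (not from the thread), the bias can be seen numerically: simulate many samples of a fixed size n from a population with known variance, and compare the average of the divide-by-n estimator against the divide-by-(n-1) one. The sample size and trial count below are arbitrary choices for illustration.

```python
import random

random.seed(0)
n = 5             # sample size (arbitrary, small to make the bias visible)
trials = 200_000  # number of simulated samples
sigma2 = 1.0      # population variance of a standard normal

sum_biased = 0.0
sum_unbiased = 0.0
for _ in range(trials):
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    sum_biased += ss / n          # s_n^2: divides by n
    sum_unbiased += ss / (n - 1)  # s_{n-1}^2: divides by n-1

mean_biased = sum_biased / trials
mean_unbiased = sum_unbiased / trials
print(mean_biased)    # close to (n-1)/n * sigma^2 = 0.8
print(mean_unbiased)  # close to sigma^2 = 1.0
```

With n = 5 the divide-by-n average comes out near 0.8, matching the [tex](n-1)/n[/tex] factor derived above, while the divide-by-(n-1) average comes out near 1.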

