# pooled variance

Here is the above in a bit more detail:

$$$s_n^2 = \frac{1}{n} \sum (x_i-\bar{x})^2$$$
$$$= \frac{1}{n} \sum [ (x_i-\mu) - (\bar{x}-\mu) ]^2$$$
$$$= \frac{1}{n} \sum [ (x_i-\mu)^2 - 2 (x_i-\mu)(\bar{x}-\mu) + (\bar{x}-\mu)^2 ]$$$
$$$= \frac{1}{n} \sum (x_i-\mu)^2 - (\bar{x}-\mu)^2$$$

The last step uses $$\sum (x_i-\mu) = n(\bar{x}-\mu)$$, so the cross term averages to $$-2(\bar{x}-\mu)^2$$.

So,

$$$E[s_n^2] = \frac{1}{n} \sum E[(x_i-\mu)^2] - E[(\bar{x}-\mu)^2]$$$
$$$= E[(x-\mu)^2] - E[(\bar{x}-\mu)^2]$$$
$$$= \sigma^2 - \{\rm{term\ greater\ than\ or\ equal\ to\ zero}\}$$$

This shows that, on average, the sample variance $$s_n^2$$ under-estimates the population variance $$\sigma^2$$.

You can further show (assuming all the samples are independent) that $$E[(\bar{x}-\mu)^2]$$, which is just the variance of the sample mean, is equal to $$\sigma^2/n$$, and hence

$$$E[s_n^2] = \sigma^2 - \sigma^2/n = \frac{n-1}{n} \sigma^2$$$

So not only is $$s_n^2$$ a biased estimator of $$\sigma^2$$, its expectation is too small by a factor of precisely $$(n-1)/n$$. Using $$(n-1)$$ instead of $$n$$ in the denominator fixes this and makes the expectation of the modified sample variance, $$E[s_{n-1}^2]$$, equal to the population variance $$\sigma^2$$.
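If you want to convince yourself numerically, the $$(n-1)/n$$ factor is easy to check with a small simulation. The sketch below is only an illustration, not part of the derivation: the function names, the choice of normally distributed samples, and the parameter values ($$n = 5$$, $$\sigma = 2$$) are all assumptions made for the example. It averages both estimators over many independent samples and compares the averages with $$\frac{n-1}{n}\sigma^2$$ and $$\sigma^2$$.

```python
import random


def biased_variance(xs):
    """Sample variance with n in the denominator (s_n^2)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def unbiased_variance(xs):
    """Sample variance with n - 1 in the denominator (s_{n-1}^2)."""
    n = len(xs)
    return biased_variance(xs) * n / (n - 1)


def average_estimates(n=5, trials=100_000, sigma=2.0, seed=0):
    """Average both estimators over many independent samples of size n.

    Samples are drawn from a normal distribution with mean 0 and standard
    deviation sigma; this distribution is an assumption for the illustration,
    since the derivation only needs independent samples with variance sigma^2.
    """
    rng = random.Random(seed)
    total_biased = 0.0
    total_unbiased = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        total_biased += biased_variance(xs)
        total_unbiased += unbiased_variance(xs)
    return total_biased / trials, total_unbiased / trials


if __name__ == "__main__":
    n, sigma = 5, 2.0
    mean_biased, mean_unbiased = average_estimates(n=n, sigma=sigma)
    # Theory: E[s_n^2] = (n-1)/n * sigma^2 and E[s_{n-1}^2] = sigma^2.
    print("average s_n^2     :", mean_biased, "theory:", (n - 1) / n * sigma**2)
    print("average s_{n-1}^2 :", mean_unbiased, "theory:", sigma**2)
```

With a small $$n$$ the gap between the two averages is easy to see; as $$n$$ grows the factor $$(n-1)/n$$ approaches 1 and the two estimators become nearly indistinguishable.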