Expected value of sample variance


musicgold
Jun23-09, 05:27 PM
Hi,

My question is related to this web page. http://en.wikipedia.org/wiki/Estimator_bias

In the Examples section, note the equation for the expected value of sample variance.

{E}(S^2)=\frac{n-1}{n} \sigma^2


Could anybody please show me the steps to go from the sample variance equation (given below) to the above equation?

S^2=\frac{1}{n}\sum_{i=1}^n(X_i-\overline{X}\,)^2


Thanks

MG.

mXSCNT
Jun23-09, 06:15 PM
Well, that "sample variance" was defined for the purposes of that page. The usual sample variance divides by n-1 instead of by n, so it is unbiased. This page (http://en.wikipedia.org/wiki/Variance#Population_variance_and_sample_variance) includes a derivation of that fact.

mathman
Jun23-09, 07:47 PM
The essential point for the use of n-1 rather than n is that the sample variance makes use of the sample mean, not the theoretical mean.

Specifically, let x be one sample, m the theoretical mean and a the statistical average.
Then E(x-a)^2 = E(x-m+m-a)^2 = E(x-m)^2 + E(m-a)^2 + 2E((x-m)(m-a)).
When you plow through the details, the factor shows up.
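
For reference, here is a sketch of those details. It assumes the n samples are independent with common mean m and variance \sigma^2, and writes x_1 for the single sample x and x_j for the others (these labels are not in the post above). Since m - a = -\frac{1}{n}\sum_j (x_j - m), only the j = 1 term of the cross product has a nonzero expectation:

```latex
\begin{align*}
E(x_1-m)^2 &= \sigma^2 \\
E(m-a)^2 &= \mathrm{Var}(a) = \frac{\sigma^2}{n} \\
2E\big((x_1-m)(m-a)\big) &= -\frac{2}{n}\sum_{j=1}^n E\big((x_1-m)(x_j-m)\big) = -\frac{2\sigma^2}{n} \\
E(x_1-a)^2 &= \sigma^2 + \frac{\sigma^2}{n} - \frac{2\sigma^2}{n} = \frac{n-1}{n}\,\sigma^2
\end{align*}
```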

musicgold
Jun23-09, 10:24 PM
Thanks folks. However, my question is not about the use of n-1 in the denominator. I understand the concept of the degrees of freedom.

I wish to know the operations/steps I need to perform on the Sample Variance equation to get the expected value equation.

Thanks again,

MG.

mXSCNT
Jun24-09, 02:34 AM
I gave you the answer.

statdad
Jun24-09, 07:49 AM
Is this what you're looking for?

First consider (I'll bring in the 1/n later)


\sum (x_i - \bar x)^2 = \sum x_i^2 - n\bar{x}^2
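
In case this identity is not obvious, a short sketch of the expansion follows. It uses only \sum x_i = n\bar x, with all sums running over i = 1, ..., n:

```latex
\begin{align*}
\sum (x_i - \bar x)^2 &= \sum x_i^2 - 2\bar x \sum x_i + n\bar x^2 \\
&= \sum x_i^2 - 2\bar x (n \bar x) + n \bar x^2 \\
&= \sum x_i^2 - n\bar x^2
\end{align*}
```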


The expected value of this expression is


\begin{align*}
E\left(\sum(x_i - \bar x)^2\right) &= \sum E(x_i^2) - n E\left( \bar{x}^2\right)\\
& = \sum \left(\mu^2 + \sigma^2\right) - n \frac 1 {n^2} \left(\sum E(x_i^2) + \sum_{i \ne j} E(x_i x_j) \right) \\
& = n\mu^2 + n \sigma^2 - \frac 1 n \left( n \mu^2 + n \sigma^2 + n(n-1) \mu^2 \right) \\
& = n\mu^2 + n \sigma^2 - \mu^2 - \sigma^2 - (n-1) \mu^2 \\
& = n\mu^2 + n\sigma^2 - n \mu^2 - \sigma^2 \\
& = (n-1) \sigma^2
\end{align*}
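
As a cross-check on the E(\bar{x}^2) term, here is a shorter route to the same result. It is a sketch that assumes the x_i are independent with mean \mu and variance \sigma^2, so that Var(\bar x) = \sigma^2/n, and it applies E(Y^2) = Var(Y) + (E(Y))^2 to Y = \bar x:

```latex
\begin{align*}
E(\bar x^2) &= \mathrm{Var}(\bar x) + \left(E(\bar x)\right)^2 = \frac{\sigma^2}{n} + \mu^2 \\
\sum E(x_i^2) - nE(\bar x^2) &= n(\mu^2 + \sigma^2) - \sigma^2 - n\mu^2 = (n-1)\sigma^2
\end{align*}
```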


Now

\begin{align*}
S^2 & = \frac 1 n \sum (x_i - \bar{x})^2 \\
E(S^2) & = \frac 1 n E\left(\sum (x_i - \bar{x})^2 \right) \\
& = \left(\frac 1 n \right) (n-1) \sigma^2 = \frac{n-1} n \sigma^2
\end{align*}


and from this last line we see that in order to obtain an unbiased estimate of \sigma^2 , the maximum likelihood (for normal distributions) estimator S^2 needs to be multiplied by (n)/(n-1) to get


\frac 1 {n-1} \sum (x_i - \bar{x})^2
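
A quick simulation can sanity-check E(S^2) = \frac{n-1}{n}\sigma^2 numerically. The sketch below uses only the Python standard library and draws normal samples. The function names, the choice n = 5 and \sigma = 2, the trial count and the seed are illustrative assumptions, not part of the derivation.

```python
import random


def biased_sample_variance(xs):
    """S^2 with the 1/n divisor, as defined in the thread."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def mean_biased_variance(n, sigma, trials, seed=0):
    """Average S^2 over many independent samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        total += biased_sample_variance(xs)
    return total / trials


n, sigma = 5, 2.0
estimate = mean_biased_variance(n, sigma, trials=20000)
expected = (n - 1) / n * sigma ** 2  # (n-1)/n * sigma^2
print(estimate, expected)
```

The simulated average should land close to the theoretical value and visibly below \sigma^2 itself; multiplying it by n/(n-1) should bring it back near \sigma^2.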

musicgold
Jun24-09, 01:06 PM
Statdad,

Thanks a lot. That is what I was looking for. Though some steps are not crystal clear to me, I can dig up more to understand them.

The attached file shows more detailed calculations. I found it here:
http://journal.lib.uoguelph.ca/index.php/surg/article/viewFile/407/660

Thanks again,

MG.

musicgold
Jun24-09, 01:29 PM
Statdad

I am not clear about just one step.

How do I get

\left(\mu^2 + \sigma^2\right) from E(x_i^2)


Thanks

MG.

P.S. How do you manage to write so many equations efficiently using LaTeX? Do you have an advanced editor?

statdad
Jun24-09, 01:56 PM
First:
Since

Var(X) = \sigma^2 = E(X - \mu)^2 = E(X^2) - \mu^2


a simple re-arrangement gives


E(X^2) = \sigma^2 + \mu^2


Second question: if you want to have several equations nicely aligned inside a display, use the \begin{align*} and \end{align*} pair inside the tex delimiters. Without the tex info, if I have

f(x) & = x^2 + 5x + 6 \\
& = (x+3)(x+2)

inside the delimiters, the compiled result is


\begin{align*}
f(x) & = x^2 + 5x + 6 \\
& = (x+3)(x+2)
\end{align*}


* the "&" sign causes the equations to be aligned at the start of the next symbol ("=" in my example)
* the "\\" terminates a line and tells tex to begin a new line

If you click on any displayed formula you should see, in a pop-up window, the underlying code.

Edited to note: some older tex manuals will discuss the use of the "eqnarray" (I think I have the name correct, but since I don't use it I'm not going to claim 100% accuracy here) environment for doing what I've done with align*. Don't use eqnarray - the spacing is (to state it as nicely as possible) horrific.
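
Outside the forum, the same display can be compiled as a standalone document. The minimal sketch below assumes a standard LaTeX installation; align* is provided by the amsmath package, so that package has to be loaded in the preamble.

```latex
\documentclass{article}
\usepackage{amsmath} % provides the align* environment
\begin{document}
\begin{align*}
f(x) & = x^2 + 5x + 6 \\
     & = (x+3)(x+2)
\end{align*}
\end{document}
```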

musicgold
Jun25-09, 03:09 PM
Statdad,

Thanks a lot. I really appreciate your help.

Also,


\begin{align*}
Var(X) & = E [ X - E(X) ]^2 \\
& = E [ X^2 - 2X E(X) + E(X)^2] \\
& = E(X^2) - 2 E(X) E(X) + E(X)^2 \\
& = E(X^2) - 2 E(X)^2 + E(X)^2 \\
& = E(X^2) - E(X)^2.
\end{align*}