# Statistics: What is the efficient estimator of sigma^2?

1. Feb 26, 2014

### sanctifier

1. The problem statement, all variables and given/known data

$X_1, X_2, \ldots, X_n$ are sampled from a normal random variable $x$ with mean $\mu = 0$ and variance $4\sigma^2$.

Question: What is the efficient estimator of $\sigma^2$?

2. Relevant equations
Nothing Special.

3. The attempt at a solution
When $\sigma$ is given, the joint probability density function (p.d.f.) of $X_1, X_2, \ldots, X_n$ is

$f(X|\sigma^2) = \frac{1}{(\sqrt{2\pi}\, 2\sigma)^n} \exp\left\{-\frac{1}{2}\sum_{i=1}^{n} \frac{X_i^2}{4\sigma^2}\right\}$

where $X$ denotes the vector with components $X_1, X_2, \ldots, X_n$.

Let $\lambda(X|\sigma^2) = -n \ln(\sqrt{2\pi}\, 2\sigma) - \frac{1}{2}\sum_{i=1}^{n} \frac{X_i^2}{4\sigma^2}$

Namely, $\lambda(X|v) = -n \ln(\sqrt{8\pi v}) - \frac{1}{2}\sum_{i=1}^{n} \frac{X_i^2}{4v}$ when $v=\sigma^2$

$\lambda' (X|v) = \frac{d\lambda (X|v)}{dv} = - \frac{n}{2v} + \frac{n\overline{X} _n}{2v^2}$

Where $\overline{X} _n = \sum_{i=1}^{n} X_i / n$

Hence, the efficient estimator of $\sigma^2$ is

$\overline{X} _n = \frac{2v^2}{n} \lambda' (X|v) + v = \frac{2\sigma^4}{n}\lambda' (X|\sigma^2) + \sigma^2$

Is this correct?

Last edited: Feb 26, 2014
2. Feb 26, 2014

### haruspex

Maybe it's my ignorance of the topic, but the question makes no sense to me. If you are given σ, why do you need an estimator for it?

3. Feb 27, 2014

### marcusl

I think that he's asking for the best estimate of sigma^2 from the data samples X_i. I'm no mathematician so discount the following comments appropriately.

First, I am not following your approach, and your final result makes no sense to me: in the last line, you have equated a mean on the left to a variance on the right.

To point in a different direction, I know that the sample variance $$\hat{\sigma}^2=\frac{1}{N}\sum_{i=1}^{N}X_i^2-\overline{X}^2,$$ where $\overline{X}$ is the sample mean, is related to the true population variance $\sigma^2$ by $$E[\hat{\sigma}^2]=\frac{N-1}{N}\sigma^2.$$ Also I seem to recall that the sample variance is the maximum likelihood estimate of the population variance for the case of normally distributed random variables. Perhaps this will help (and I'll leave the demonstrations and derivations to you...).
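The bias factor $(N-1)/N$ quoted above can be checked numerically. The following is only a minimal sketch using Python's standard library; the function names, sample size, and seed are illustrative choices, not anything from the thread.

```python
import random


def biased_sample_variance(xs):
    """Mean of squares minus squared sample mean (divides by N, not N - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum(x * x for x in xs) / n - mean * mean


def average_biased_variance(n, pop_sd, trials, seed=0):
    """Average the biased sample variance over many simulated normal samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, pop_sd) for _ in range(n)]
        total += biased_sample_variance(xs)
    return total / trials


if __name__ == "__main__":
    n, pop_sd = 5, 2.0  # population variance is pop_sd**2 = 4
    avg = average_biased_variance(n, pop_sd, trials=20000)
    print("simulated mean of sample variance:", avg)
    print("(N-1)/N * population variance:    ", (n - 1) / n * pop_sd ** 2)
```

With a small $N$ the gap between the two variances is easy to see; as $N$ grows the factor $(N-1)/N$ approaches 1 and the bias disappears.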

4. Feb 27, 2014

### Ray Vickson

You are not actually given σ; you are given that there is such a σ. This is a very standard type of question, and in the field the "..but σ is not known.." part of the statement is usually understood. However, calling the variance $4 \sigma^2$ instead of $\sigma^2$ is just silly, and is, I suspect, introduced in order to try to trick the student.

5. Feb 28, 2014

### sanctifier

Sorry for causing confusion. "σ is given but unknown" just means that, although σ appears in the formula, it is the unknown quantity and needs to be estimated from the known values of $X_1, X_2, \ldots, X_n$.

Thank you for mentioning the M.L.E.

Actually, an efficient estimator is based on the Cramér–Rao inequality. Equality holds in this inequality when the estimator $T$ satisfies a certain condition, namely $T=u(\theta) \lambda_n'(X|\theta)+v(\theta)$. Here $X$ denotes the vector with components $X_1, X_2, \ldots, X_n$, so $\lambda_n(X|\theta)$ is a function of $X_1, X_2, \ldots, X_n$ whose parameter $\theta$ is unknown. $u(\theta)$ and $v(\theta)$ are functions that do not involve any component of $X$, and

$\lambda_n(X|\theta) = \ln f(X|\theta)$

$\lambda_n'(X|\theta) = \frac{d \ln f(X|\theta)}{d\theta}$
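As a sketch of how this condition can be applied to the problem as stated (mean $0$, variance $4\sigma^2$), take $\theta = \sigma^2$ and keep the sum of squares explicit. This is a worked illustration of the condition, not a quotation from any textbook solution.

$$\lambda_n(X|\theta) = -\frac{n}{2}\ln(8\pi\theta) - \frac{1}{8\theta}\sum_{i=1}^{n} X_i^2$$

$$\lambda_n'(X|\theta) = -\frac{n}{2\theta} + \frac{1}{8\theta^2}\sum_{i=1}^{n} X_i^2$$

$$\frac{1}{4n}\sum_{i=1}^{n} X_i^2 = \frac{2\theta^2}{n}\,\lambda_n'(X|\theta) + \theta$$

The left-hand side is a statistic $T$ that depends only on the data, and the right-hand side has the required form with $u(\theta) = 2\theta^2/n$ and $v(\theta) = \theta$. Since $E[X_i^2] = 4\theta$, this $T$ has expectation $\theta = \sigma^2$.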

Ray knows the rules.

Can you tell me whether the answer is correct or not? Thank you in advance.

Last edited: Feb 28, 2014
6. Feb 28, 2014

### Ray Vickson

If you are trying to find the maximum of $\lambda$ then no, it is not correct. You need $0 = \lambda' = (n/2)(-w + w^2 S)$ where $w = 1/v$ and $S = \frac{1}{4n}\sum X_i^2$ (the factor $1/4$ comes from the variance being $4\sigma^2$). Also, your final line makes no sense: you are giving a formula for an estimator of $\sigma^2$ that contains the value of $\sigma$ itself. The whole point of an estimator is that it contains only the observed quantities $X_i$ and $n$, but not the true, underlying parameters of the distribution.
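Setting $\lambda' = 0$ for the stated variance $4\sigma^2$ leads to the candidate $\hat{v} = \frac{1}{4n}\sum X_i^2$, which contains only the observed $X_i$ and $n$. The sketch below is a rough simulation check, under the assumption that the data really are $N(0, 4\sigma^2)$, that this candidate is unbiased for $\sigma^2$ and that its variance is close to the Cramér–Rao bound $2\sigma^4/n$. The function names and parameter values are illustrative only.

```python
import random


def estimate_sigma2(xs):
    """Candidate estimator T = sum(x_i^2) / (4 n); uses only the data."""
    return sum(x * x for x in xs) / (4 * len(xs))


def simulate(n, sigma, trials, seed=0):
    """Return the empirical mean and variance of T over repeated samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    estimates = []
    for _ in range(trials):
        # Variance 4*sigma^2 means standard deviation 2*sigma.
        xs = [rng.gauss(0.0, 2.0 * sigma) for _ in range(n)]
        estimates.append(estimate_sigma2(xs))
    mean = sum(estimates) / trials
    var = sum((t - mean) ** 2 for t in estimates) / (trials - 1)
    return mean, var


if __name__ == "__main__":
    n, sigma = 20, 1.5
    mean, var = simulate(n, sigma, trials=20000)
    print("mean of T:", mean, " target sigma^2:", sigma ** 2)
    print("var of T: ", var, " Cramer-Rao bound 2*sigma^4/n:", 2 * sigma ** 4 / n)
```

The true `sigma` is used here only to generate the data and to compute the reference values; `estimate_sigma2` itself never sees it.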