# Maximum likelihood

1. Jul 15, 2012

### kimkibun

Is it possible to estimate all parameters of n observations (X₁, ..., Xₙ) with the same mean, μ, but different variances (σ₁², σ₂², ..., σₙ²)? If we assume that the σᵢ² are known for all i in {1, ..., n}, what is the MLE of μ?

2. Jul 15, 2012

### chiro

It is possible to do so, but you will have to weight the variances correctly for your estimator.

We know that E[Sum(Xi)/N] = mu, since all the distributions have the same mean.

In terms of the variance, we know that the variance of the sample mean of n independent normal observations with common variance sigma^2 is sigma^2/n. Because variances of independent samples add, the variance of the overall sample mean here is Sum(n_i sigma_i^2)/N^2, where sigma_i is the standard deviation at index i, n_i is the number of samples from that particular distribution, and N = Sum(n_i) is the total number of samples. If all sigma's are the same, this simplifies to sigma^2/N (I'd double check for yourself).

So with that said, our estimator will have the distribution (Xbar - mu)/SQRT(Sum(n_i sigma_i^2)/N^2) ~ N(0,1) under CLT assumption, where Xbar = Sum(Xi)/N.

Now here is the thing though: if you are using classical statistical tests, you are going to assume the Central Limit Theorem which assumes you have enough samples for it to be normally distributed (or close enough to it) so that you can estimate the mean.

Because you have lots of sigma's, if you don't have enough samples with respect to the number of sigma's or if the sigma's are wildly different, then the assumption might break down.

There is even more to consider, but the above should give you an idea in terms of using just the population parameters of expectation.

If you are using things like t-tests, start from the formulation above and rework the derivation so that the variance of the estimator of the mean combines the individual variances. I'm guessing the S^2 will look similar, but you'd have to make sure.

3. Jul 15, 2012

### kimkibun

I forgot to tell you that the n observations were drawn from a normal population. Anyway, is it possible that the different variances affect the maximum likelihood estimate of the mean?

4. Jul 15, 2012

### Parlyne

They do. If you go through the maximum likelihood logic with different variances, you'll find that the maximum likelihood estimator for the mean is the standard weighted average:
$$\frac{\sum_{i=1}^N w_i x_i}{\sum_{i=1}^N w_i}$$
with
$$w_i = \frac{1}{\sigma_i^{\phantom{i}2}}.$$
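To make the formula concrete, here is a minimal Python sketch of the weighted average with weights $w_i = 1/\sigma_i^2$. The function name `weighted_mean` and the sample numbers are made up for illustration; they are not from the thread.

```python
def weighted_mean(xs, sigmas):
    """Inverse-variance weighted mean: sum(w_i * x_i) / sum(w_i), with w_i = 1 / sigma_i**2."""
    if len(xs) != len(sigmas) or len(xs) == 0:
        raise ValueError("xs and sigmas must be non-empty and of equal length")
    weights = [1.0 / s ** 2 for s in sigmas]
    return sum(w * x for w, x in zip(weights, xs)) / sum(weights)


# Hypothetical data: two measurements, the second one twice as uncertain.
xs = [10.0, 12.0]
sigmas = [1.0, 2.0]
print(weighted_mean(xs, sigmas))
```

The more precise measurement (smaller $\sigma_i$) pulls the estimate toward itself, and when all $\sigma_i$ are equal the result reduces to the ordinary arithmetic mean.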

5. Jul 17, 2012

### kimkibun

can you please show me how you get that?

the mle for μ that i got is this,

$$\left(\sum x_i \Big/ \sum \sigma_i^2\right)\left(\sum 1/\sigma_i^2\right)$$

is it possible to find the mle of the parameter σᵢ²?

6. Jul 17, 2012

### Parlyne

Presuming Gaussian errors, the probability of measuring a value within $dx_i$ of $x_i$ given mean $\mu$ and width $\sigma_i$ is
$$\frac{dx_i}{\sigma_i\sqrt{2\pi}}e^{-\frac{(x_i-\mu)^2}{2\sigma_i^{\phantom{i}2}}}.$$

The probability of the full data set is the product of the probabilities for each value, which will be
$$\frac{\prod_{i=1}^N dx_i}{(2\pi)^\frac{N}{2}\prod_{i=1}^N \sigma_i} e^{-\frac{1}{2}\chi^2},$$
with
$$\chi^2 = \sum_{i=1}^N\left(\frac{x_i-\mu}{\sigma_i}\right)^2.$$

Since $\chi^2$ is the only part of the probability expression which depends on $\mu$, maximizing the probability over $\mu$ is equivalent to minimizing $\chi^2$. So, we take the derivative of $\chi^2$ with respect to $\mu$ and set it to 0 as follows.
\begin{align} 0 &= \frac{d}{d\mu}\sum_{i=1}^N \frac{x_i^{\phantom{1}2}-2\mu x_i +\mu^2}{\sigma_i^{\phantom{i}2}}\\ &= \sum_{i=1}^N \frac{-2x_i+2\mu}{\sigma_i^{\phantom{i}2}}\\ &= 2 \mu \sum_{i=1}^N \frac{1}{\sigma_i^{\phantom{i}2}} - 2 \sum_{i=1}^N \frac{x_i}{\sigma_i^{\phantom{1}2}}\\ \mu &= \frac{\sum_{i=1}^N \frac{x_i}{\sigma_i^{\phantom{i}2}}}{\sum_{i=1}^N \frac{1}{\sigma_i^{\phantom{i}2}}}\end{align}
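As a rough sanity check on the result of this derivation, the sketch below compares the closed-form $\mu$ with a brute-force scan of $\chi^2$ over a grid. The data values, the grid range, and the helper names (`chi2`, `closed_form_mu`, `grid_minimizer`) are all assumptions chosen for illustration.

```python
def chi2(mu, xs, sigmas):
    """chi^2 = sum over i of ((x_i - mu) / sigma_i)^2."""
    return sum(((x - mu) / s) ** 2 for x, s in zip(xs, sigmas))


def closed_form_mu(xs, sigmas):
    """Closed-form minimizer: sum(x_i / sigma_i^2) / sum(1 / sigma_i^2)."""
    num = sum(x / s ** 2 for x, s in zip(xs, sigmas))
    den = sum(1.0 / s ** 2 for s in sigmas)
    return num / den


def grid_minimizer(xs, sigmas, lo, hi, steps=10001):
    """Scan evenly spaced mu values in [lo, hi] and return the one with the smallest chi^2."""
    best_mu, best_val = lo, chi2(lo, xs, sigmas)
    for k in range(1, steps):
        mu = lo + (hi - lo) * k / (steps - 1)
        val = chi2(mu, xs, sigmas)
        if val < best_val:
            best_mu, best_val = mu, val
    return best_mu


# Hypothetical measurements with known, unequal standard deviations.
xs = [9.8, 10.4, 11.1]
sigmas = [0.5, 1.0, 2.0]

mu_exact = closed_form_mu(xs, sigmas)
mu_grid = grid_minimizer(xs, sigmas, lo=5.0, hi=15.0)  # grid spacing is 0.001
print(mu_exact, mu_grid)
```

The two values should agree to within the grid spacing, since $\chi^2$ is a parabola in $\mu$ with a single minimum.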

As for $\sigma_i$, since there's a different value for each measurement and only one measurement per value, there's nothing to estimate; each is taken as exact. What you can estimate is the width of the distribution of the mle of $\mu$, which can be found through error propagation in the equation for the mle of $\mu$, taking the associated $\sigma_i$ as the error in $x_i$.
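For reference, the error propagation step can be sketched as follows, assuming independent measurements and writing $\hat\mu$ for the mle of $\mu$ and $\sigma_{\hat\mu}$ for its width (both symbols are introduced here for convenience). The mle is linear in the $x_i$, so
$$\frac{\partial\hat\mu}{\partial x_i} = \frac{1/\sigma_i^2}{\sum_{j=1}^N 1/\sigma_j^2},$$
and propagating the errors gives
$$\sigma_{\hat\mu}^2 = \sum_{i=1}^N \left(\frac{\partial\hat\mu}{\partial x_i}\right)^2 \sigma_i^2 = \frac{\sum_{i=1}^N 1/\sigma_i^2}{\left(\sum_{j=1}^N 1/\sigma_j^2\right)^2} = \frac{1}{\sum_{i=1}^N 1/\sigma_i^2}.$$
If every $\sigma_i$ equals the same $\sigma$, this reduces to the familiar $\sigma^2/N$.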

7. Jul 18, 2012

### kimkibun

thank you so much sir! i really appreciate your efforts! God Bless you!