MLE of μ for X₁, ..., Xₙ with Known σᵢ²

kimkibun
Is it possible to estimate all the parameters of an n-observation sample (X₁, ..., Xₙ) with the same mean μ but different variances (σ₁², σ₂², ..., σₙ²)? If we assume the σᵢ² are known for all i in {1, ..., n}, what is the MLE of μ?
 
kimkibun said:
Is it possible to estimate all the parameters of an n-observation sample (X₁, ..., Xₙ) with the same mean μ but different variances (σ₁², σ₂², ..., σₙ²)? If we assume the σᵢ² are known for all i in {1, ..., n}, what is the MLE of μ?

It is possible to do so, but you will have to weight the variances correctly for your estimator.

We know that E[Σ Xᵢ / N] = μ, since all of the observations have the same mean.

In terms of the variance, we know that variances of independent observations add. So the unweighted mean X̄ = Σ Xᵢ / N has variance (1/N²) Σ σᵢ², where σᵢ is the standard deviation of the i-th observation. If all the σᵢ equal a common σ, this simplifies to the familiar σ²/N (I'd double-check this for yourself).

So with that said, our estimator satisfies (X̄ − μ)/√((1/N²) Σ σᵢ²) ~ N(0, 1) under the CLT assumption.
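A minimal simulation sketch of that claim (Python/NumPy; the σ values here are made up for illustration), checking that the standardized mean comes out approximately N(0, 1):

Code:
import numpy as np

rng = np.random.default_rng(0)
mu = 5.0
sigma = np.array([1.0, 2.0, 0.5, 3.0])  # known but unequal sigma_i (made up)
N = len(sigma)

# Many replications: one observation from each N(mu, sigma_i^2) per replication.
reps = 100_000
x = rng.normal(mu, sigma, size=(reps, N))

xbar = x.mean(axis=1)                    # unweighted mean of each replication
se = np.sqrt((sigma**2).sum()) / N       # sqrt of Var(xbar) = (1/N^2) * sum sigma_i^2
z = (xbar - mu) / se

print(z.mean(), z.std())                 # should be close to 0 and 1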

Now here is the thing, though: if you are using classical statistical tests, you are going to invoke the Central Limit Theorem, which assumes you have enough samples for the estimator to be normally distributed (or close enough to it) so that you can estimate the mean.

Because you have many different σᵢ's, that assumption might break down if you don't have enough samples relative to the number of distinct σᵢ, or if the σᵢ are wildly different.

There is even more to consider, but the above should give you an idea of what you can do using just the population parameters.

If you are using things like t-tests, you should start from this kind of formulation and carry the result above, that the variance of the mean estimator is the (scaled) sum of the individual variances, through the derivation. I'm guessing the S² will look similar, but you'd have to make sure.
 
I forgot to tell you that the n observations were drawn from a normal population. Anyway, is it possible that the different variances affect the maximum likelihood estimate of the mean?
 
kimkibun said:
I forgot to tell you that the n observations were drawn from a normal population. Anyway, is it possible that the different variances affect the maximum likelihood estimate of the mean?

They do. If you go through the maximum likelihood logic with different variances, you'll find that the maximum likelihood estimator for the mean is the standard weighted average:
\frac{\sum_{i=1}^N w_i x_i}{\sum_{i=1}^N w_i}
with
w_i = \frac{1}{\sigma_i^2}.
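A minimal numerical sketch of that estimator (Python/NumPy; the data and σ values are made up for illustration):

Code:
import numpy as np

x = np.array([4.8, 5.3, 4.9, 6.1])      # observations (made up)
sigma = np.array([1.0, 2.0, 0.5, 3.0])  # known standard deviations (made up)

w = 1.0 / sigma**2                      # weights w_i = 1 / sigma_i^2
mu_hat = (w * x).sum() / w.sum()        # inverse-variance weighted average
print(mu_hat)

Note how the observation with the smallest σᵢ dominates the weighted average, as it should: it carries the most information about μ.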
 
Parlyne said:
They do. If you go through the maximum likelihood logic with different variances, you'll find that the maximum likelihood estimator for the mean is the standard weighted average:
\frac{\sum_{i=1}^N w_i x_i}{\sum_{i=1}^N w_i}
with
w_i = \frac{1}{\sigma_i^2}.

Can you please show me how you get that?

The MLE for μ that I got is this:

(Σ xᵢ/σᵢ²) / (Σ 1/σᵢ²)

Is it possible to find the MLE of the parameters σᵢ²?
 
kimkibun said:
Can you please show me how you get that?

The MLE for μ that I got is this:

(Σ xᵢ/σᵢ²) / (Σ 1/σᵢ²)

Is it possible to find the MLE of the parameters σᵢ²?

Presuming Gaussian errors, the probability of measuring a value within dx_i of x_i given mean \mu and width \sigma_i is
\frac{dx_i}{\sigma_i\sqrt{2\pi}}e^{-\frac{(x_i-\mu)^2}{2\sigma_i^2}}.

The probability of the full data set is the product of the probabilities for each value, which will be
\frac{\prod_{i=1}^N dx_i}{(2\pi)^\frac{N}{2}\prod_{i=1}^N \sigma_i} e^{-\frac{1}{2}\chi^2},
with
\chi^2 = \sum_{i=1}^N\left(\frac{x_i-\mu}{\sigma_i}\right)^2.

Since \chi^2 is the only part of the probability expression which depends on \mu, maximizing the probability over \mu is equivalent to minimizing \chi^2. So, we take the derivative of \chi^2 with respect to \mu and set it to 0 as follows.
\begin{align}
0 &= \frac{d}{d\mu}\sum_{i=1}^N \frac{x_i^2 - 2\mu x_i + \mu^2}{\sigma_i^2}\\
&= \sum_{i=1}^N \frac{-2x_i + 2\mu}{\sigma_i^2}\\
&= 2\mu \sum_{i=1}^N \frac{1}{\sigma_i^2} - 2\sum_{i=1}^N \frac{x_i}{\sigma_i^2}\\
\mu &= \frac{\sum_{i=1}^N \frac{x_i}{\sigma_i^2}}{\sum_{i=1}^N \frac{1}{\sigma_i^2}}
\end{align}
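As a quick numerical sanity check of this closed form (a Python sketch using scipy.optimize; the data values are made up), minimizing \chi^2(\mu) directly lands on the same weighted average:

Code:
import numpy as np
from scipy.optimize import minimize_scalar

x = np.array([4.8, 5.3, 4.9, 6.1])      # made-up observations
sigma = np.array([1.0, 2.0, 0.5, 3.0])  # known sigma_i (made up)

def chi2(mu):
    return (((x - mu) / sigma) ** 2).sum()

numeric = minimize_scalar(chi2).x       # direct minimization of chi^2
closed_form = (x / sigma**2).sum() / (1.0 / sigma**2).sum()
print(numeric, closed_form)             # the two agree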

As for the \sigma_i: since there's a different value for each measurement, there's nothing to estimate; each is taken as exact. What you can find an estimator for is the width of the distribution of the estimate of \mu, which comes from error propagation in the equation for the MLE of \mu, taking the associated \sigma_i as the error in x_i.
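Sketching that error-propagation step explicitly (a standard result, stated here for completeness): since \hat\mu is linear in the x_i,
\mathrm{Var}(\hat\mu) = \frac{\sum_{i=1}^N w_i^2\,\sigma_i^2}{\left(\sum_{i=1}^N w_i\right)^2} = \frac{1}{\sum_{i=1}^N 1/\sigma_i^2}, \qquad w_i = \frac{1}{\sigma_i^2},
so the width of the estimator's distribution is \left(\sum_{i=1}^N \sigma_i^{-2}\right)^{-1/2}.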
 
Thank you so much, sir! I really appreciate your efforts! God bless you!
 