MLE, Uniform Distribution, missing data

AI Thread Summary
The discussion centers on determining the maximum likelihood estimate (MLE) for k in a uniform distribution U(0,k) with missing data, represented by the sample X={1,3,*}. The consensus is that the MLE for k is 3, the largest observed value, since the only certainty is that k >= 3. Various methods, including Expectation-Maximization (EM), are explored, with the conclusion that the estimate converges to 3 regardless of how the missing draw is treated. Participants raise concerns about potential bias from ignoring the missing data, but with such a small sample the estimate is about the best achievable. The conversation also touches on the limits of inference from such a small dataset and on applying the EM algorithm to other families of distributions.
sopsku
I would like to determine the MLE for k in U(0,k) where U is the uniform pdf constant on the interval [0,k] and zero elsewhere. I would like this estimate in the case of missing data. To be specific, what is the MLE for k given the three draws X={1,3,*} where * is unknown.
 
sopsku said:
I would like to determine the MLE for k in U(0,k) where U is the uniform pdf constant on the interval [0,k] and zero elsewhere. I would like this estimate in the case of missing data. To be specific, what is the MLE for k given the three draws X={1,3,*} where * is unknown.

The only thing we can say for certain is that k >= 3. So what do you think the MLE of k would be?
 
Yes. I think it should be the largest measured value, in this case three. Thank you for the verification.

I had tried to look at it by doing Expectation-Maximization: start by assuming k is large; the expected value of the missing draw x* under U(0,k) is k/2, so I re-estimate k as the maximum of the observed values and k/2. If k/2 is greater than 3, I iterate again using it as my new k; if it is less than 3, then k = 3. This will ultimately converge to the largest recorded value (= 3); a sketch of the iteration is below. Is this a valid argument?
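A minimal Python sketch of that iteration (the starting value and tolerance are arbitrary choices):

observed = [1.0, 3.0]

def iterate_k(k0, tol=1e-9, max_iter=100):
    k = k0
    for _ in range(max_iter):
        x_star = k / 2.0                    # impute the missing draw by its mean under U(0, k)
        k_new = max(max(observed), x_star)  # MLE of k for the completed sample {1, 3, x*}
        if abs(k_new - k) < tol:
            return k_new
        k = k_new
    return k

print(iterate_k(100.0))  # 3.0 -- the iteration settles at the largest recorded value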

I was troubled by the fact that I have information (additional measurement(s)) that is being ignored. I guess that means that the MLE with missing information is even more biased by the fact that this information is ignored.
 
sopsku said:
This will ultimately converge to the largest recorded value (= 3). Is this a valid argument?

It should be, as long as your likelihood function ranges over the distribution parameters for the data you actually have. I believe the MLE is biased toward underestimating k.

sopsku said:
I was troubled by the fact that I have information (additional measurement(s)) that is being ignored. I guess that means that the MLE with missing information is even more biased by the fact that this information is ignored.

What information are you ignoring? If a datum is missing, the only alternative is to interpolate or simulate it. For this you might use the sample mean, which is 2. I think the sample would be too small for MME (method-of-moments estimation), but you could try it for n = 3 if in fact the missing datum was included in the sample but lost.
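For what it's worth, the method-of-moments estimate for U(0,k) comes from E[X] = k/2, so k is estimated by twice the sample mean; a quick sketch on the observed pair:

observed = [1.0, 3.0]
mean = sum(observed) / len(observed)  # sample mean = 2.0
k_mme = 2 * mean                      # for U(0, k), E[X] = k/2, so k_hat = 2 * mean
print(k_mme)                          # 4.0 -- in general the MME can even fall below the sample maximum, unlike the MLE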
 
I think I am ignoring the fact that the missing datum could be greater than 3 if the true value of K is greater than 3. Perhaps the likelihood looks something like

If[x > 3, (3/k) (1/x)^3, 0]+If[x > k, ((k - 3)/k) (1/x)^3, 0]

(where k is my assumed value of K > 3), which is maximized at x = k = 3, implying K = 3 as above, but I am not sure of my likelihood. I am trying to stay with MLE, since I got sidetracked into this while looking at the Expectation-Maximization algorithm.
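One check I can make is to integrate the missing draw out: the complete likelihood is (1/k)^3 when all three values lie in [0,k], so the observed-data likelihood of {1, 3} is the integral of (1/k)^3 over x* in [0,k], which is 1/k^2 for k >= 3 and 0 otherwise. That is decreasing in k, hence maximized at k = 3. A small numerical sketch of this (the grid is an arbitrary choice):

import numpy as np

# Observed-data likelihood after integrating the missing draw out:
# L(k) = integral_0^k (1/k)^3 dx* = 1/k^2, valid for k >= 3 (zero below 3).
ks = np.linspace(3.0, 10.0, 701)  # grid of candidate k values
L = 1.0 / ks**2

print(ks[np.argmax(L)])  # 3.0 -- L(k) is decreasing, so the MLE is the sample maximum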
 
sopsku said:
I think I am ignoring the fact that the missing datum could be greater than 3 if the true value of K is greater than 3.

Well, of course, K > 3 is possible, but I assume you have a good random sample, if a very small one. Given the small sample with a missing data point, I agree that this MLE is about as good as you are going to be able to achieve.

Choosing to interpolate the missing data point as I described will not change your estimate; it will simply increase its power a bit. Of course, this technique is usually used on much larger data sets (which are expensive to develop) where a few data points get "lost" somehow and you want to maximize the power of your estimate. If you were to do it on this set, it would only be as an experiment, not for statistical inference. You really can't make any inferences from this tiny data set.
 
I agree with all that you are saying. I am not really trying to "improve" the estimate. What I am interested in is a functional form for the likelihood, and I thought that if I understood the MLE for my toy n = 3, U(0,K) example I would be one step closer to this real goal. Given the functional form, I wanted to formally apply the EM algorithm. I thought the toy example would be insightful for learning about the EM algorithm, but the piecewise-continuous nature of U(0,K) has led me rather astray. I am back to using the exponential family of distributions to investigate the EM algorithm, which makes the expectation step more straightforward.

I want to thank you very much for your kind help.
 
sopsku said:
I want to thank you very much for your kind help.

You're welcome.
 
sopsku said:
I had tried to look at it by doing Expectation-Maximization: start by assuming k is large; the expected value of the missing draw x* under U(0,k) is k/2, so I re-estimate k as the maximum of the observed values and k/2. If k/2 is greater than 3, I iterate again using it as my new k; if it is less than 3, then k = 3. This will ultimately converge to the largest recorded value (= 3). Is this a valid argument?

As far as I understand it, the EM algorithm works like this: for the E step, since x* is U(0,k) with k >= 3, the complete-data log-likelihood given x* and the two other data points is log(1/k^3) = -3 log(k), so the expected log-likelihood is also -3 log(k) over the feasible range k >= 3. For the M step, this decreasing function is maximized at k = 3, so EM stops after one iteration.
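A rough numerical sketch of those two steps (the grid and its upper bound are arbitrary choices):

import numpy as np

obs_max = 3.0                          # largest observed draw
ks = np.linspace(obs_max, 10.0, 1401)  # candidate values of k

def em_step(k_cur):
    # E step: with x* ~ U(0, k_cur), the expected complete-data
    # log-likelihood is -3*log(k) wherever k >= max(3, k_cur) (the
    # support of U(0, k) must cover x*), and -inf otherwise.
    Q = np.where(ks >= max(obs_max, k_cur), -3.0 * np.log(ks), -np.inf)
    # M step: maximize Q over the grid.
    return ks[np.argmax(Q)]

k = obs_max        # start at the MLE of the observed data
print(em_step(k))  # 3.0 -- a fixed point, so EM stops after one iteration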

Also, for uniform estimation it's worth taking a look at minimum-variance unbiased estimators. Wikipedia has a good article on how the German tank problem was solved in this way.
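For the continuous U(0,k) case, the analogous estimator is k_hat = (n + 1)/n * max(X), which corrects the downward bias of the raw maximum; on the two observed draws here:

observed = [1.0, 3.0]
n = len(observed)
k_mvue = (n + 1) / n * max(observed)  # unbiased, since E[max] = n/(n+1) * k for U(0, k)
print(k_mvue)                         # 4.5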
 