Question about Maximum Likelihood

In maximum likelihood estimation you choose the parameters that maximize the probability of the observed error term, rather than minimizing it, in order to find the model most likely to have produced the observed data.
  • #1
senmeis
Hello,

I have a question about Maximum Likelihood Estimation. The typical form of the MLE looks like this:

[tex] X = H\theta + W, [/tex] where [itex] W [/itex] is Gaussian with distribution [itex] N(0, C) [/itex].
[tex] \hat{\theta}_{ML} = (H^T C^{-1} H)^{-1} H^T C^{-1} X [/tex]

I think [itex] \hat{\theta}_{ML} [/itex] can only be calculated after many measurements have been made, that is, once there are plenty of samples of H and X. Put another way, it is impossible to obtain [itex] \hat{\theta}_{ML} [/itex] if only information about θ is known. Do I understand this correctly?
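
For what it's worth, here is a minimal numerical sketch of that closed form. Everything in it is invented for the simulation: the sizes of H, the diagonal C, and the "true" θ are assumptions, not part of the question.

[code]
import numpy as np

rng = np.random.default_rng(0)

# Simulated setup: n measurements, p unknown parameters (sizes invented).
n, p = 100, 3
H = rng.normal(size=(n, p))                  # known observation matrix
theta_true = np.array([1.0, -2.0, 0.5])      # "true" theta, used only to simulate
C = np.diag(rng.uniform(0.5, 2.0, size=n))   # assumed noise covariance (diagonal here)

# One realization of the model X = H theta + W, with W ~ N(0, C)
W = rng.multivariate_normal(np.zeros(n), C)
X = H @ theta_true + W

# Closed-form Gaussian MLE: theta_ml = (H^T C^-1 H)^-1 H^T C^-1 X
Cinv = np.linalg.inv(C)
theta_ml = np.linalg.solve(H.T @ Cinv @ H, H.T @ Cinv @ X)
print(theta_ml)  # lands near theta_true when n >> p, and is noisy when n is small
[/code]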

Senmeis
 
  • #2
The typical form of MLE is this: you have some random variable X that depends on a parameter [itex] \theta [/itex] and has a density [itex] p(x,\theta) [/itex]. Then, given some samples [tex] x_1, \ldots, x_n [/tex] of X, you find the value of [itex] \theta [/itex] maximizing
[tex] \prod_{j=1}^{n} p(x_j, \theta) [/tex]

or something to that effect (it might be different if your samples are dependent, or you have samples from different random variables etc.). You seem to have a very specific application of this to a Gaussian model. You can do the calculation with any number of samples, but the more samples you have the better odds you have that your MLE estimate is a good estimate of the real value of the parameter.
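
As a quick sketch of this recipe with invented numbers (unit-variance Gaussian samples with unknown mean), you can maximize the log of the product above, which has the same maximizer:

[code]
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
samples = rng.normal(loc=3.0, scale=1.0, size=50)  # x_1, ..., x_n (invented data)

# Negative log of prod_j p(x_j, theta) for a unit-variance Gaussian density,
# dropping additive constants, which do not move the maximizer.
def neg_log_likelihood(theta):
    return 0.5 * np.sum((samples - theta) ** 2)

result = minimize_scalar(neg_log_likelihood)
print(result.x, samples.mean())  # the MLE of the mean is the sample mean
[/code]

Here the maximization happens to have a closed form (it is just the sample mean), but the numerical route is the one you would take for a less convenient density.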
 
  • #3
A measurement vector can be written as:

[itex] \phi = Du + \epsilon [/itex], where [itex] \epsilon [/itex] is a zero-mean Gaussian random vector.

The MLE is [itex] D_{ML} [/itex] when [itex] P(\epsilon) [/itex] is maximized, but why maximized? I think the probability of [itex] \epsilon [/itex] should be as small as possible. I know I must be making a mistake in my understanding. Can anyone point it out?

Senmeis
 
  • #4
In words, by picking D to maximize P(ε), you are saying "My choice of D indicates that the events I just witnessed were not unusual in any way," whereas if you try to minimize P(ε), you are saying "My choice of D indicates that the events I just witnessed will never happen again in the history of the universe."

To give a simple example, let's say I flip one hundred coins and all of them come up heads. I then ask you for an MLE of the probability that the coin lands on heads. If you want to maximize the probability that one hundred heads and no tails come up, you'll end up saying "the coin has a probability of 1 of landing on heads," because in that case the probability that I get 100 heads in a row is 1. If you wanted to minimize the probability that the coin comes up heads 100 times in a row, you would tell me "the coin has a probability of 0 of landing on heads," and 100 heads coming up in a row would have a probability of 0. Which sounds more reasonable?
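
To put numbers on this (a throwaway sketch):

[code]
import numpy as np

p = np.linspace(0.0, 1.0, 11)
likelihood = p ** 100  # probability of seeing 100 heads in a row, given p
for pi, Li in zip(p, likelihood):
    print(f"p = {pi:.1f}  ->  P(100 heads) = {Li:.3e}")
# The values climb monotonically and peak at p = 1.0, while p = 0.0 gives
# probability 0 -- exactly the two extremes contrasted above.
[/code]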
 
  • #5
senmeis said:
A measurement vector can be written as:

[itex] \phi = Du + \epsilon [/itex], where [itex] \epsilon [/itex] is a zero-mean Gaussian random vector.

The MLE is [itex] D_{ML} [/itex] when [itex] P(\epsilon) [/itex] is maximized, but why maximized? I think the probability of [itex] \epsilon [/itex] should be as small as possible. I know I must be making a mistake in my understanding. Can anyone point it out?

Senmeis

You want to find the model that most likely produced the data that you have. That is the goal of the MLE. If you have to choose between a model that is very unlikely to have produced your data and one that is likely to have given those results, you pick the more likely one. If you have tossed a coin 50 times and got 50 heads, you would pick the model that says the coin is rigged for heads, not the model that says the coin is fair.
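
Putting numbers on that coin comparison (a quick sketch):

[code]
# Likelihood of 50 heads out of 50 tosses under two candidate models
fair   = 0.5 ** 50   # roughly 8.9e-16
rigged = 1.0 ** 50   # exactly 1
print(fair, rigged)  # the rigged-for-heads model is vastly more likely
[/code]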
 

1. What is maximum likelihood estimation?

Maximum likelihood estimation is a statistical method used to estimate the parameters of a probability distribution by finding the set of values that maximize the likelihood of the observed data.

2. How is maximum likelihood used in scientific research?

Maximum likelihood is used in various fields of science, such as biology, psychology, and physics, to estimate the parameters of a model based on observed data. It is used to make predictions and test hypotheses about a population or system.

3. What is the difference between maximum likelihood and least squares?

Maximum likelihood and least squares are both methods used to estimate parameters in a model, but they differ in their approach. Maximum likelihood assumes a probability distribution for the data, while least squares does not. Additionally, maximum likelihood aims to maximize the likelihood of the data, while least squares minimizes the squared differences between the data and the model.
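
In fact, for a Gaussian model like the one in post #1 the two coincide: the log-likelihood is

[tex] \ln p(X \mid \theta) = -\tfrac{1}{2} (X - H\theta)^T C^{-1} (X - H\theta) + \text{const}, [/tex]

so maximizing the likelihood is the same as minimizing the weighted squared error, and setting the gradient to zero gives the closed form [itex] \hat{\theta}_{ML} = (H^T C^{-1} H)^{-1} H^T C^{-1} X [/itex] quoted in post #1.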

4. What are the assumptions of maximum likelihood estimation?

The assumptions of maximum likelihood estimation include a correctly specified probability distribution for the data, independent and identically distributed data points, and a sufficient amount of data to accurately estimate the parameters of the model.

5. What are some limitations of maximum likelihood estimation?

Some limitations of maximum likelihood estimation include its sensitivity to outliers in the data, the need for a large sample size to accurately estimate parameters, and the potential for biased estimates if the underlying assumptions are not met. Additionally, maximum likelihood may not be appropriate for non-normal or non-parametric data.
