Maximum Likelihood to recover the original data or to estimate parameters directly

In summary: equations (1) and (3) have the same linear form — an observation vector equal to a matrix times an unknown vector plus noise. The difference is what the unknown represents: in (1) it is the original signal ##x##, with ##H## the PSF written as a convolution matrix, while in (3) it is the PSF parameter vector ##\theta##, with ##H## the corresponding design matrix.
  • #1
fab13
312
6
I am confused between using Maximum Likelihood to recover (approximately) the original signal from the observed data, and using Maximum Likelihood to estimate the parameters of the PSF.

1) First task: find (approximately) the original signal:

I start from this general definition (in discretized form): ##y = H\,x + w\quad(1)## (with ##w## white noise).

How can I derive this relation? It seems we should start from a discrete convolution, so the correct expression would rather be ##y = H * x + w##, with ##*## the convolution product.

For the estimation, I have to maximize the likelihood, which for Gaussian noise amounts to minimizing the quadratic form:

##\phi(x) = (H \cdot x - y)^{T} \cdot W \cdot (H \cdot x - y)##

with ##H## the PSF (written as a matrix), ##W## the inverse of the covariance matrix of the data ##y##, and ##y## the observed data; the goal is to find the original signal ##x##.

So the estimator is given by : ##x^{\text{(ML)}} = (H^{T}\cdot W\cdot H)^{-1}\cdot H^{T}\cdot W \cdot y\quad(2)##

For this task, I don't know how to compute this estimator ##x^{\text{(ML)}}## in practice. Is equation (2) correct?
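A minimal sketch of how (2) could be computed numerically, assuming ##H##, ##W## and ##y## are already available as arrays (the names, sizes and NumPy usage below are illustrative, not from the thread); solving the normal equations avoids forming the explicit inverse:

```python
import numpy as np

# Illustrative sizes only: n observed samples, m unknowns.
n, m = 100, 50
rng = np.random.default_rng(0)
H = rng.normal(size=(n, m))                  # stand-in for the PSF/system matrix
x_true = rng.normal(size=m)                  # stand-in for the original signal
W = np.eye(n)                                # inverse noise covariance (identity for white noise)
y = H @ x_true + 0.01 * rng.normal(size=n)   # observed data

# Equation (2): x_ML = (H^T W H)^{-1} H^T W y,
# computed by solving the normal equations rather than inverting H^T W H.
A = H.T @ W @ H
b = H.T @ W @ y
x_ml = np.linalg.solve(A, b)
```

With ##W## the identity (white noise of equal variance), this reduces to ordinary least squares, i.e. `np.linalg.lstsq(H, y, rcond=None)`.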

2) Second task: next, I have to find the parameters of the PSF model, with ##\theta = [a, b]## the parameter vector; this gives the best-fit parameters of the PSF as a function of the data ##y##.

Is this step well formulated ?

I am working with the following PSF:

[attached image qaJcr.png — the PSF formula; it is written out explicitly in post #3 below]

And for this second task, I have to find the parameters ##a## and ##b## of:

[attached image 3uU7X.png — the model containing the parameters ##a## and ##b##]

knowing ##(r_0, c_0)##.

In practice, I have used MATLAB's `\` operator to perform a least-squares fit between the model (the PSF without noise) and the raw data (the PSF with noise).

In this way, I can find the two affine parameters ##a## and ##b## (actually, I think this is called a linear regression).

I saw that we could take the vector of parameters ##\theta = [a, b]## and use the following relation:

##y=H\theta + w\quad (3)##

(with ##H## the response of the system, ##\theta## the vector of parameters to estimate, and ##w## the white noise). How can I derive this important relation? And what is the difference between ##(1)## and ##(3)##?
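To make (3) concrete for the second task: if the noiseless model is affine in the parameters, i.e. data ##\approx a\cdot\mathrm{PSF} + b## (my reading of the attached model image; treat it as an assumption), then each pixel contributes one row ##[\mathrm{PSF}_k,\; 1]## of a design matrix ##H## with ##\theta = [a, b]^T##, so ##y = H\theta + w## is just the stacked set of pixel equations. A hedged NumPy sketch, using `lstsq` as the analogue of MATLAB's `\`:

```python
import numpy as np

# Assumed affine model (an assumption, not stated in the thread): data ≈ a * psf_model + b.
rng = np.random.default_rng(1)
i, j = np.meshgrid(np.arange(-16, 17), np.arange(-16, 17), indexing="ij")
alpha, beta = 5.0, 2.0                                  # illustrative PSF parameters
psf_model = ((1 + (i**2 + j**2) / alpha**2) ** (-beta)).ravel()        # noiseless PSF, flattened
data = 3.0 * psf_model + 0.5 + 0.01 * rng.normal(size=psf_model.size)  # noisy "raw data"

# Relation (3): y = H @ theta + w, with H the design matrix and theta = [a, b].
H = np.column_stack([psf_model, np.ones_like(psf_model)])
theta, *_ = np.linalg.lstsq(H, data, rcond=None)        # analogue of MATLAB's  H \ y
a_hat, b_hat = theta
```

So (1) and (3) have the same algebraic form; what changes is whether the unknown vector is the image ##x## or the parameter vector ##\theta##.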

Finally, I would appreciate remarks or help on understanding the differences between the two tasks and which method is the right one to apply to each.

Regards
 

  • #2
fab13 said:
So the estimator is given by : ##x^{\text{(ML)}} =(H^{T}\cdot W\cdot H)^{-1}\cdot H^{T}\cdot W \cdot y\quad(2)##

For this task, I don't know how to compute this estimator ##x^{\text{(ML)}}## in practice. Is equation (2) correct?
You would need to write the PSF as a matrix ##H## that converts a vector ##x## into a vector ##y##: ##y = Hx##. Then, the computation would work as you have written it.

fab13 said:
I saw that we could take the vector of parameters ##\theta = [a, b]## and use the following relation:

##y=H\theta + w\quad (3)##

(with ##H## the response of the system, ##\theta## the vector of parameters to estimate, and ##w## the white noise). How can I derive this important relation? And what is the difference between ##(1)## and ##(3)##?

Finally, I would appreciate remarks or help on understanding the differences between the two tasks and which method is the right one to apply to each.

Regards

As far as computation is concerned, with Gaussian noise the only practical difference between the MLE and Least Squares methods is the value of W. For Least Squares, it is the identity matrix.
 
  • #3
@tnich

Thanks for your answer. How can I generate a matrix ##H## given the PSF I am using, i.e.:

##PSF(i,j) = \bigg(1+ \dfrac{i^2+j^2}{\alpha^2}\bigg)^{-\beta}##

?

Indeed, I can't see how to translate this PSF into a matrix form that allows me to write:

##y=H\,x## with ##y## and ##x## vectors.

?
I want to clarify an important point: when I write ##y = H*x + w##, does ##x## represent the vector of the real image? Or does ##x## represent the vector of parameters ##\theta = [\alpha, \beta]##, with ##\alpha, \beta## the parameters to estimate with Maximum Likelihood?

Finally, does the vector ##y## represent the observed data? Sorry, but I am confusing the direct inversion with:

##x^{\text{(ML)}} = (H^{T}\cdot W\cdot H)^{-1}\cdot H^{T}\cdot W \cdot y\quad(2)##

and the estimation of ##\alpha,\beta## with Maximum likelihood.

Regards
 
  • #4
Is there really nobody who can help me? Are my questions badly formulated?
 
  • #5
fab13 said:
@tnich

Thanks for your answer. How can I generate a matrix ##H## given the PSF I am using, i.e.:

##PSF(i,j) = \bigg(1+ \dfrac{i^2+j^2}{\alpha^2}\bigg)^{-\beta}##

?

Indeed, I can't see how to translate this PSF into a matrix form that allows me to write:

##y=H\,x## with ##y## and ##x## vectors. In this case ##x## represents the signal.

?
If your signal is an ##m \times m## matrix ##S##, and your image is an ##m \times m## matrix ##T##, just map them to vectors ##x## and ##y## as ##x_{mv+u} = S_{u,v}## and ##y_{mt+s} = T_{s,t}##. You will also need to map your PSF to a matrix ##H## where ##H_{mt+s,mv+u} = PSF(s-u, t-v)##. (For simplicity, I assume here that the indices ##s,t,u,v## run from ##0## to ##m-1##.)
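A small sketch of that mapping, under the same assumptions (square ##m \times m## image, indices from ##0## to ##m-1##; the PSF parameters and sizes below are illustrative):

```python
import numpy as np

def psf(di, dj, alpha=5.0, beta=2.0):
    # PSF from the thread, evaluated at integer offsets (alpha, beta are illustrative values).
    return (1 + (di**2 + dj**2) / alpha**2) ** (-beta)

m = 8  # illustrative image size
H = np.zeros((m * m, m * m))
for t in range(m):
    for s in range(m):
        for v in range(m):
            for u in range(m):
                # H_{m*t + s, m*v + u} = PSF(s - u, t - v)
                H[m * t + s, m * v + u] = psf(s - u, t - v)

# A signal S (m x m) maps to x with x[m*v + u] = S[u, v] (column-major flatten),
# and the blurred image T is recovered from y = H @ x with T[s, t] = y[m*t + s].
S = np.random.default_rng(2).normal(size=(m, m))
x = S.flatten(order="F")
y = H @ x
T = y.reshape((m, m), order="F")
```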
fab13 said:
I want to clarify an important point: when I write ##y = H*x + w##, does ##x## represent the vector of the real image? Or does ##x## represent the vector of parameters ##\theta = [\alpha, \beta]##, with ##\alpha, \beta## the parameters to estimate with Maximum Likelihood?

Finally, does the vector ##y## represent the observed data? Sorry, but I am confusing the direct inversion with:

##x^{\text{(ML)}} = (H^{T}\cdot W\cdot H)^{-1}\cdot H^{T}\cdot W \cdot y\quad(2)##

and the estimation of ##\alpha,\beta## with Maximum likelihood.

Regards
You will need to convert ##y## and ##H## to matrix form as in the previous problem.
 
  • #6
The vector ##x## represents whatever you want to estimate, whether it is the signal or parameters of the PSF.
 
  • #7
Why do you have an inverse covariance matrix in the ML estimator? You said the noise is white, and I assume it's also Gaussian. Is this the case? If yes, then you don't need the inverse covariance matrix, and your problem is reduced to finding ##\mathbf{x}## that minimizes

[tex]||\mathbf{y}-\mathbf{H}\mathbf{x}||^2[/tex]

This could be computationally expensive for general ##\mathbf{H}## where you cannot decouple the individual elements of ##\mathbf{x}##. For example, if ##\mathbf{x}## is n-dimensional binary vector, then you need to search the entire ##2^n## vector space, which is prohibitive, unless ##n## is very small.
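To make the cost concrete, here is a brute-force sketch for a small binary ##\mathbf{x}## (all sizes and names are illustrative): it enumerates all ##2^n## candidates and keeps the one minimizing ##\|\mathbf{y}-\mathbf{H}\mathbf{x}\|^2##, which quickly becomes infeasible as ##n## grows.

```python
import itertools
import numpy as np

rng = np.random.default_rng(3)
n = 10                                      # already 2**10 = 1024 candidates
H = rng.normal(size=(20, n))
x_true = rng.integers(0, 2, size=n)
y = H @ x_true + 0.1 * rng.normal(size=20)

# Exhaustive ML search over all binary vectors: O(2**n) residual evaluations.
best_x, best_cost = None, np.inf
for bits in itertools.product((0, 1), repeat=n):
    x = np.array(bits)
    cost = np.sum((y - H @ x) ** 2)
    if cost < best_cost:
        best_x, best_cost = x, cost
```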
 
  • #8
EngWiPy said:
Why do you have an inverse covariance matrix in the ML estimator? You said the noise is white, and I assume it's also Gaussian. Is this the case? If yes, then you don't need the inverse covariance matrix, and your problem is reduced to finding ##\mathbf{x}## that minimizes

[tex]||\mathbf{y}-\mathbf{H}\mathbf{x}||^2[/tex]

This could be computationally expensive for general ##\mathbf{H}## where you cannot decouple the individual elements of ##\mathbf{x}##. For example, if ##\mathbf{x}## is n-dimensional binary vector, then you need to search the entire ##2^n## vector space, which is prohibitive, unless ##n## is very small.
True, but you should be able to apply Levinson recursion to alleviate the computational load.
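One way this can be exploited (a sketch under an extra assumption, not something stated in the thread): for a 1-D, shift-invariant blur with the full zero-padded convolution matrix ##H##, the normal-equations matrix ##H^T H## is symmetric Toeplitz — its ##(i,j)## entry is the kernel autocorrelation at lag ##i-j## — so ##H^T H x = H^T y## can be solved by Levinson recursion, e.g. with `scipy.linalg.solve_toeplitz`:

```python
import numpy as np
from scipy.linalg import solve_toeplitz

rng = np.random.default_rng(4)
n, k = 200, 15
h = np.exp(-np.linspace(-2, 2, k) ** 2)      # illustrative 1-D blur kernel
x_true = rng.normal(size=n)
y = np.convolve(x_true, h) + 0.01 * rng.normal(size=n + k - 1)  # full linear convolution + noise

# H^T H is symmetric Toeplitz: entry (i, j) is the kernel autocorrelation at lag i - j.
autocorr = np.correlate(h, h, mode="full")   # lags -(k-1) .. (k-1)
first_col = np.zeros(n)
first_col[:k] = autocorr[k - 1:]             # lags 0 .. k-1; larger lags are zero

rhs = np.correlate(y, h, mode="valid")       # H^T y: correlate the data with the kernel
x_ls = solve_toeplitz(first_col, rhs)        # Levinson recursion, O(n^2) instead of O(n^3)
```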
 
  • #9
tnich said:
True, but you should be able to apply Levinson recursion to alleviate the computational load.
In fact, @fab13 may be on the wrong track with MLE. A little research on the use of deconvolution methods in astronomy might be helpful.
 

1. What is maximum likelihood estimation?

Maximum likelihood estimation is a statistical method used to find the parameters of a model that best fit a given set of data. It involves finding the values of the model's parameters that maximize the likelihood of the observed data.

2. How does maximum likelihood estimation work?

In maximum likelihood estimation, the likelihood function is constructed using the observed data and the model's parameters. The goal is to find the values of the parameters that make the likelihood function as large as possible. This is typically done using optimization algorithms to iteratively improve the parameter estimates until the maximum likelihood is achieved.
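As a toy illustration of this (generic, not specific to this thread), the mean and standard deviation of a Gaussian can be estimated by numerically minimizing the negative log-likelihood:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(5)
data = rng.normal(loc=2.0, scale=0.7, size=500)    # synthetic sample

def neg_log_likelihood(params):
    mu, log_sigma = params                         # optimize log(sigma) to keep sigma positive
    return -np.sum(norm.logpdf(data, loc=mu, scale=np.exp(log_sigma)))

result = minimize(neg_log_likelihood, x0=np.array([0.0, 0.0]))
mu_hat, sigma_hat = result.x[0], np.exp(result.x[1])
```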

3. What is the difference between maximum likelihood and least squares?

The main difference between maximum likelihood and least squares is their objective. Maximum likelihood aims to find the parameters that maximize the likelihood of the observed data, while least squares aims to minimize the sum of squared differences between the observed data and the model's predictions.

4. What are the advantages of using maximum likelihood estimation?

Maximum likelihood estimation has several advantages, including being a robust and efficient method for finding the parameters of a model. It also provides a measure of uncertainty for the estimated parameters, making it useful for statistical inference.

5. What are some common applications of maximum likelihood estimation?

Maximum likelihood estimation is commonly used in fields such as statistics, machine learning, and data science. It is used to estimate parameters in regression models, time series analysis, and in the fitting of probability distributions to data.
