Maximum likelihood: recovering the original data or estimating parameters directly

AI Thread Summary
The discussion focuses on using Maximum Likelihood Estimation (MLE) to recover an original signal from observed data while also estimating parameters of the Point Spread Function (PSF). The relationship between the observed data, the original signal, and the PSF is expressed through equations involving convolution and matrix operations. Participants seek clarification on the correct formulation of these equations and the practical computation of the MLE estimator. There is confusion regarding the representation of the signal and parameters in the equations, as well as the necessity of an inverse covariance matrix when dealing with Gaussian noise. The conversation suggests that simplifying the problem may be possible by minimizing the least squares error instead of relying solely on MLE.
fab13
I am confused between using Maximum Likelihood to find (approximately) the original signal from the observed data, and using Maximum Likelihood to estimate the parameters of the PSF.

1) First task: find (approximately) the original signal:

I start from this general definition (in discretized form): ##y=H\,x+w\quad(1)## (with ##w## white noise).

How can I derive this relation? It seems that we should start from a discrete convolution product, shouldn't we? In that case, wouldn't the correct expression rather be ##y=H*x+w##, with ##*## the convolution product?
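(For intuition, here is a minimal 1-D MATLAB sketch with made-up values: building ##H## so that each column is a shifted copy of the PSF kernel makes the matrix product coincide with the discrete convolution, which is why (1) can be written either way.)

```matlab
% 1-D toy example: convolution as matrix multiplication (hypothetical values)
h = [0.25 0.5 0.25];        % example PSF kernel
x = randn(8, 1);            % example original signal
n = length(x); k = length(h);
H = zeros(n + k - 1, n);    % convolution matrix
for j = 1:n
    H(j:j+k-1, j) = h(:);   % each column is a shifted copy of h
end
y1 = H * x;                 % matrix form
y2 = conv(h, x);            % MATLAB's discrete convolution
max(abs(y1 - y2(:)))        % ~0 up to rounding: the two forms agree
```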

For the estimation, I have to maximize the likelihood, i.e. minimize the following objective (minus the log-likelihood, up to constants):

##\phi(x) = (H \cdot x - y)^{T} \cdot W \cdot (H \cdot x - y)##

with ##H## the PSF matrix, ##W## the inverse of the covariance matrix of the data ##y##, and ##y## the observed data: the goal is to find the original signal ##x##.

So the estimator is given by: ##x^{\text{(ML)}} = (H^{T}\cdot W\cdot H)^{-1}\cdot H^{T}\cdot W \cdot y\quad(2)##

For this task, I don't know how to compute this estimator ##x^{\text{(ML)}}## in practice. Is equation (2) correct?
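(As an aside, a minimal MATLAB sketch of equation (2) with synthetic stand-in data; all sizes and values here are hypothetical. Using the backslash operator avoids forming the explicit inverse, which is numerically safer.)

```matlab
% Sketch of x_ML = (H'*W*H)^(-1) * H'*W*y with synthetic inputs
H = randn(20, 8);                    % stand-in for the PSF matrix (hypothetical)
W = eye(20);                         % inverse noise covariance (identity for white noise)
x_true = randn(8, 1);                % stand-in original signal
y = H * x_true + 0.01*randn(20, 1);  % synthetic noisy observation
A = H' * W * H;                      % normal-equations matrix
b = H' * W * y;                      % right-hand side
x_ml = A \ b;                        % solves A*x = b without computing inv(A)
```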

2) Second task: next, I have to find the parameters of the function, with ##\theta=[a,b]## the parameter vector: this allows finding the best parameters of the PSF, i.e. the best fit to the data ##y##.

Is this step well formulated?

I am working with the following PSF:

[Attached image qaJcr.png: the PSF formula, restated later in the thread as ##PSF(i,j) = \bigg(1+ \dfrac{i^2+j^2}{\alpha^2}\bigg)^{-\beta}##.]

And for this second task, I have to find the parameters ##a## and ##b## of:

[Attached image 3uU7X.png: the model containing the affine parameters ##a## and ##b##.]

knowing ##(r_0, c_0)##.

In practice, I have used MATLAB's `\` operator to perform a least-squares fit between the noiseless PSF model and the raw data (the PSF with noise).

In this way, I can find the two affine parameters ##a## and ##b## (actually, I think this is called linear regression); a sketch of this fit follows.
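(A minimal MATLAB sketch of such an affine fit with synthetic data; the true values ##a=3##, ##b=0.1## and all other numbers are hypothetical. Note that the design matrix ##G## here plays exactly the role of ##H## in equation (3) below, with ##\theta=[a;b]##.)

```matlab
% Affine fit y ~ a*g + b by least squares (synthetic, hypothetical data)
alpha = 2; beta = 1.5;                      % assumed PSF shape parameters
[I, J] = meshgrid(-8:8, -8:8);              % pixel offsets from (r0, c0)
g = (1 + (I.^2 + J.^2)/alpha^2).^(-beta);   % noiseless PSF model
g = g(:);
y = 3*g + 0.1 + 0.01*randn(size(g));        % synthetic noisy data, a = 3, b = 0.1
G = [g, ones(numel(g), 1)];                 % design matrix: one column per parameter
theta = G \ y;                              % backslash = least-squares solution
a = theta(1), b = theta(2)                  % should be close to 3 and 0.1
```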

I saw that we could take the vector of parameters ##\theta=[a,b]## and use the following relation:

##y=H\theta + w\quad (3)##

(with ##H## the response of the system, ##\theta## the vector of parameters to estimate, and ##w## the white noise). How can this relation be derived? And what is the difference between ##(1)## and ##(3)##?

Finally, I would appreciate remarks or help on the differences between the two tasks and on the right method to apply to each one.

Regards
 

fab13 said:
So the estimator is given by: ##x^{\text{(ML)}} = (H^{T}\cdot W\cdot H)^{-1}\cdot H^{T}\cdot W \cdot y\quad(2)##

For this task, I don't know how to compute this estimator ##x^{\text{(ML)}}## in practice. Is equation (2) correct?
You would need to write the PSF as a matrix ##H## that converts a vector ##x## into a vector ##y##: ##y = Hx##. Then, the computation would work as you have written it.

fab13 said:
I saw that we could take the vector of parameters ##\theta=[a,b]## and use the following relation:

##y=H\theta + w\quad (3)##

(with ##H## the response of the system, ##\theta## the vector of parameters to estimate, and ##w## the white noise). How can this relation be derived? And what is the difference between ##(1)## and ##(3)##?

Finally, I would appreciate remarks or help on the differences between the two tasks and on the right method to apply to each one.

Regards

As far as computation is concerned, with Gaussian noise the only practical difference between the MLE and Least Squares methods is the value of ##W##. For Least Squares, it is the identity matrix.
 
@tnich

Thanks for your answer. How can I generate a matrix ##H## given the PSF I am using, i.e.:

##PSF(i,j) = \bigg(1+ \dfrac{i^2+j^2}{\alpha^2}\bigg)^{-\beta}##

Indeed, I can't see how to translate this PSF into a matrix form that allows writing:

##y=H\,x## with ##y## and ##x## vectors.
I want to clarify an important point: when I write ##y=H*x+w##, does ##x## represent the vector of the real image, or does ##x## represent the vector of parameters ##\theta=[\alpha,\beta]##, with ##\alpha,\beta## the parameters to estimate by Maximum Likelihood?

Finally, does the vector ##y## represent the observed data? Sorry, but I am confusing the direct inversion:

##x^{\text{(ML)}} = (H^{T}\cdot W\cdot H)^{-1}\cdot H^{T}\cdot W \cdot y\quad(2)##

with the estimation of ##\alpha,\beta## by Maximum Likelihood.

Regards
 
Is there really nobody who can help me? Are my questions badly formulated?
 
fab13 said:
@tnich

Thanks for your answer. How can I generate a matrix ##H## given the PSF I am using, i.e.:

##PSF(i,j) = \bigg(1+ \dfrac{i^2+j^2}{\alpha^2}\bigg)^{-\beta}##

Indeed, I can't see how to translate this PSF into a matrix form that allows writing:

##y=H\,x## with ##y## and ##x## vectors. In this case ##x## represents the signal.
If your signal is an ##m \times m## matrix ##S##, and your image is an ##m \times m## matrix ##T##, just map them to vectors ##x## and ##y## as ##x_{mv+u} = S_{u,v}## and ##y_{mt+s} = T_{s,t}##. You will also need to map your PSF to a matrix ##H## where ##H_{mt+s,mv+u} = PSF(s-u, t-v)##. (For simplicity, I assume here that your indices ##s,t,u,v## run from ##0## to ##m-1##.) A sketch of this construction is given below.
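(A minimal MATLAB sketch of this construction; the image size and PSF parameters are hypothetical, and the 0-based indices in the formula above are shifted by one for MATLAB's 1-based indexing.)

```matlab
% Build H with H(m*t+s, m*v+u) = PSF(s-u, t-v) (0-based), shifted to 1-based
alpha = 2; beta = 1.5;                       % example PSF parameters (hypothetical)
psf = @(i, j) (1 + (i.^2 + j.^2)/alpha^2).^(-beta);
m = 16;                                      % example image size (hypothetical)
H = zeros(m^2, m^2);
for t = 0:m-1
    for s = 0:m-1
        for v = 0:m-1
            for u = 0:m-1
                H(m*t + s + 1, m*v + u + 1) = psf(s - u, t - v);
            end
        end
    end
end
% Then y = H*x with x = S(:) and y = T(:); MATLAB's column-major S(:)
% matches x_{mv+u} = S_{u,v} for 0-based u, v.
```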
fab13 said:
I want to clarify an important point: when I write ##y=H*x+w##, does ##x## represent the vector of the real image, or does ##x## represent the vector of parameters ##\theta=[\alpha,\beta]##, with ##\alpha,\beta## the parameters to estimate by Maximum Likelihood?

Finally, does the vector ##y## represent the observed data? Sorry, but I am confusing the direct inversion:

##x^{\text{(ML)}} = (H^{T}\cdot W\cdot H)^{-1}\cdot H^{T}\cdot W \cdot y\quad(2)##

with the estimation of ##\alpha,\beta## by Maximum Likelihood.

Regards
You will need to convert ##y## and ##H## to matrix form as in the previous problem.
 
The vector ##x## represents whatever you want to estimate, whether it is the signal or parameters of the PSF.
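(When the unknowns are the PSF shape parameters ##\alpha,\beta##, the model is nonlinear in them, so the closed form (2) does not apply directly. As a hedged illustration with synthetic data, one can minimize the squared error over ##(\alpha,\beta)## numerically, e.g. with MATLAB's fminsearch; all values below are hypothetical.)

```matlab
% Sketch: estimating [alpha, beta] by minimizing the squared error
[I, J] = meshgrid(-8:8, -8:8);                         % example pixel grid (hypothetical)
model = @(th) (1 + (I.^2 + J.^2)/th(1)^2).^(-th(2));   % PSF model from the thread
y_obs = model([2, 1.5]) + 0.01*randn(size(I));         % synthetic noisy observation
cost = @(th) sum((y_obs(:) - reshape(model(th), [], 1)).^2);
theta_hat = fminsearch(cost, [1, 1])                   % should approach [2, 1.5]
```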
 
Why do you have an inverse covariance matrix in the ML estimator? You said the noise is white, and I assume it's also Gaussian. Is this the case? If yes, then you don't need the inverse covariance matrix, and your problem is reduced to finding ##\mathbf{x}## that minimizes

##\|\mathbf{y}-\mathbf{H}\mathbf{x}\|^2##

This could be computationally expensive for a general ##\mathbf{H}## where you cannot decouple the individual elements of ##\mathbf{x}##. For example, if ##\mathbf{x}## is an ##n##-dimensional binary vector, then you need to search the entire space of ##2^n## vectors, which is prohibitive unless ##n## is very small.
 
EngWiPy said:
Why do you have an inverse covariance matrix in the ML estimator? You said the noise is white, and I assume it's also Gaussian. Is this the case? If yes, then you don't need the inverse covariance matrix, and your problem is reduced to finding ##\mathbf{x}## that minimizes

##\|\mathbf{y}-\mathbf{H}\mathbf{x}\|^2##

This could be computationally expensive for a general ##\mathbf{H}## where you cannot decouple the individual elements of ##\mathbf{x}##. For example, if ##\mathbf{x}## is an ##n##-dimensional binary vector, then you need to search the entire space of ##2^n## vectors, which is prohibitive unless ##n## is very small.
True, but you should be able to apply Levinson recursion to alleviate the computational load.
 
tnich said:
True, but you should be able to apply Levinson recursion to alleviate the computational load.
In fact, @fab13 may be on the wrong track with MLE. A little research on the use of deconvolution methods in astronomy might be helpful.
 
