Conditional & unconditional MSE (in MMSE estimation)

In summary, the minimum mean squared error (MMSE) estimator is the mean of the posterior density p(x|Z), and the minimum MSE itself is the trace of the posterior covariance matrix. The trace of the conditional covariance is the minimum MSE, while the trace of the unconditional covariance is not necessarily equal to it. MMSE estimation amounts to finding the function h(Z) of the observations that minimizes the MSE.
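As a minimal illustration (a NumPy sketch under an assumed scalar jointly Gaussian model, not taken from the thread), the posterior mean achieves a lower MSE than, say, using the raw observation as the estimate:

[code]
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Assumed scalar model: x ~ N(0, 1), z = x + v with v ~ N(0, sigma_v^2)
sigma_v = 0.5
x = rng.normal(0.0, 1.0, n)
z = x + rng.normal(0.0, sigma_v, n)

# Posterior mean E[x|z] = z / (1 + sigma_v^2) for this model (the MMSE estimate)
x_mmse = z / (1.0 + sigma_v**2)

mse_mmse = np.mean((x - x_mmse) ** 2)  # theory: sigma_v^2 / (1 + sigma_v^2) = 0.2
mse_raw = np.mean((x - z) ** 2)        # theory: sigma_v^2 = 0.25

print(mse_mmse, mse_raw)  # the posterior mean gives the smaller MSE
[/code]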
  • #1
kasraa
Hi,

1- Please explain conditional & unconditional mean square error, and their difference.
2- Which one is the solution for minimum MSE estimation? (that is conditional expectation: [tex] E \left[ X|Y \right] [/tex]. I meant which one is minimized by selecting the conditional expectation.)
3- What is the relation between these two and covariance matrix in Kalman Filter? IMO, the trace of Kalman's covariance (error covariance matrix) is one of these MSEs, but I don't know which one.
4- Is there any other interpretation of Kalman's covariance matrix than the one I mentioned above? (of course there is. I meant I don't know any other and please help me :))

Thanks a lot.
 
  • #2
kasraa said:
Hi,

1- Please explain conditional & unconditional mean square error, and their difference.
2- Which one is the solution for minimum MSE estimation? (that is conditional expectation: [tex] E \left[ X|Y \right] [/tex]. I meant which one is minimized by selecting the conditional expectation.)
3- What is the relation between these two and covariance matrix in Kalman Filter? IMO, the trace of Kalman's covariance (error covariance matrix) is one of these MSEs, but I don't know which one.
4- Is there any other interpretation of Kalman's covariance matrix than the one I mentioned above? (of course there is. I meant I don't know any other and please help me :))

Thanks a lot.

I usually don't refer questions to Wikipedia, but it has a fairly comprehensive discussion of the Kalman filter and the associated Bayesian analysis. I suggest you read it and then come back if you have unanswered questions.

You can minimize the MSE by minimizing the trace of the posterior error estimate covariance matrix. The trace is minimized when its matrix derivative is zero.
 
  • #3
SW VandeCarr said:
I usually don't refer questions to Wikipedia, but it has a fairly comprehensive discussion of the Kalman filter and the associated Bayesian analysis. I suggest you read it and then come back if you have unanswered questions.

You can minimize the MSE by minimizing the trace of the posterior error estimate covariance matrix. The trace is minimized when its matrix derivative is zero.
Thanks for your reply.
Actually I've read it.

My question is about MMSE estimation in general (and the Kalman filter only as one of its implementations in a particular case).

Let me explain more. As I asked in (1) and (2), I'm not sure what conditional and unconditional MSE exactly are (and which one is minimized by the MMSE estimator), but I think they are something like:

[tex] E \left[ \left( x - \hat{x} \right) \left( x - \hat{x} \right)^{T} | Z \right] [/tex]
and
[tex] E \left[ \left( x - \hat{x} \right) \left( x - \hat{x} \right)^{T} \right] [/tex]

(where [tex] Z [/tex] is the observation (or sequence of observations, as in the Kalman filter) and [tex] \hat{x}=E \left[ x | Z \right] [/tex]). Again, if we look at the Kalman filter as an implementation of the MMSE estimator, some references expand the conditional MSE to reach the Kalman covariances, while others use the unconditional MSE to do so.

(BTW, I won't be surprised if someone shows that they're equal in the Gaussian/linear case, so that both sets of references are right.)

Thanks a lot.
 
  • #4
kasraa said:
Thanks for your reply.
Actually I've read it.

My question is about MMSE estimation in general (and the Kalman filter only as one of its implementations in a particular case).

[tex] E \left[ \left( x - \hat{x} \right) \left( x - \hat{x} \right)^{T} \right] [/tex]

(where [tex] Z [/tex] is the observation (or sequence of observations, as in the Kalman filter) and [tex] \hat{x}=E \left[ x | Z \right] [/tex]). Again, if we look at the Kalman filter as an implementation of the MMSE estimator, some references expand the conditional MSE to reach the Kalman covariances, while others use the unconditional MSE to do so.

(BTW, I won't be surprised if someone shows that they're equal in the Gaussian/linear case, so that both sets of references are right.)

Thanks a lot.

I think this article may help.

http://cnx.org/content/m11267/latest/

I take it that p(Z) is your unconditional probability density and p(Z|x) is your likelihood function. Then, taking the joint density p(x)p(Z|x), you can use Bayes' theorem for the posterior density, which is the conditional p(x|Z)=p(Z|x)p(x)/p(Z).

I'm not sure why you think the unconditional and conditional probability densities would be equal unless, of course, the prior density and the posterior density were equal. It appears that the MMSE estimate applies to the posterior density p(x|Z).
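As a concrete (assumed) conjugate Gaussian example, the posterior p(x|Z) follows in closed form from the prior p(x) and the likelihood p(Z|x); a short Python sketch:

[code]
# Assumed scalar Gaussian example: prior x ~ N(mu0, var0), likelihood Z | x ~ N(x, var_z)
mu0, var0 = 0.0, 4.0   # prior mean and variance
var_z = 1.0            # observation noise variance
Z = 2.5                # a single observation

# Posterior p(x|Z) is Gaussian; precisions (inverse variances) add
post_var = 1.0 / (1.0 / var0 + 1.0 / var_z)
post_mean = post_var * (mu0 / var0 + Z / var_z)

print(post_mean, post_var)  # posterior mean = MMSE estimate, post_var = its conditional MSE
[/code]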

EDIT: The link is a bit slow, but it was working when I last checked.
 
  • #5
SW VandeCarr said:
I think this article may help.

http://cnx.org/content/m11267/latest/

I take it that p(Z) is your unconditional probability density and p(Z|x) is your likelihood function. Then, taking the joint density p(x)p(Z|x), you can use Bayes' theorem for the posterior density, which is the conditional p(x|Z)=p(Z|x)p(x)/p(Z).

I'm not sure why you think the unconditional and conditional probability densities would be equal unless, of course, the prior density and the posterior density were equal. It appears that the MMSE estimate applies to the posterior density p(x|Z).

EDIT: The link is a bit slow, but it was working when I last checked.

Part one:

The posterior [tex] p \left( x|Z \right) [/tex], has a mean and a (co)variance. Its mean is the MMSE estimator, [tex] E \left[ x|Z \right] [/tex], and its variance (or the trace of its covariance matrix, if it's a random vector) is the minimum mean squared error. Am I right?

So the trace of conditional (co)variance ((co)variance of conditional pdf), that is the trace of
[tex] E \left[ \left( x - E \left[ x|Z \right] \right) \left( x - E \left[ x|Z \right] \right)^{T} | Z \right] [/tex]
is the minimum MSE (and
[tex] E \left[ \left(x-E \left[ x|Z \right] \right)^2 | Z \right] [/tex]
for the case of a scalar RV).
Is it correct?

And then what is the trace of
[tex] E \left[ \left( x - E \left[ x|Z \right] \right) \left( x - E \left[ x|Z \right] \right)^{T}\right] [/tex]
?
(or
[tex] E \left[ \left(x-E \left[ x|Z \right] \right)^2 \right] [/tex]
for the case of a scalar RV).




Part Two:

As I understand it, MMSE estimation is about finding the function [tex] h \left( . \right) [/tex] that minimizes the MSE,
[tex] E \left[ \left( x - h \left( Z \right) \right)^2 \right] [/tex].
And the answer is [tex] h \left( Z \right) = E \left[ x | Z \right] [/tex].

So the MMSE is
[tex] E \left[ \left(x-E \left[ x|Z \right] \right)^2 \right] [/tex].

Can you see the problem?




And a new one :D Maybe it's the answer.

Orthogonality principle implies [tex] E \left[ \left( x - E \left[ x|Z \right] \right)Z \right] = 0 [/tex], which implies
[tex] E \left[ \left( x - E \left[ x|Z \right] \right)| Z \right] = E \left[ \left( x - E \left[ x|Z \right] \right) \right] [/tex].

Does it also imply:
[tex] E \left[ \left( x - E \left[ x|Z \right] \right) ^2 | Z \right] = E \left[ \left( x - E \left[ x|Z \right] \right)^2 \right] [/tex]?
Is it correct?

Thanks.
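A small Monte Carlo sketch of the point at issue (assuming a scalar jointly Gaussian model and NumPy; nothing here is specific to the Kalman filter): by the law of total expectation, the unconditional MSE is the average of the conditional MSE over Z, and in the jointly Gaussian case the conditional variance does not depend on the realization of Z, so the two coincide.

[code]
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

# Assumed jointly Gaussian pair: x ~ N(0, 1), z = x + v with v ~ N(0, 1)
x = rng.normal(size=n)
z = x + rng.normal(size=n)

x_hat = z / 2.0            # E[x|z] for this model
err2 = (x - x_hat) ** 2

# Unconditional MSE: average over everything (theory: 0.5)
print("unconditional:", err2.mean())

# Conditional MSE, approximated by averaging within narrow bins of z
for lo, hi in [(-0.5, 0.5), (1.0, 2.0), (2.5, 3.5)]:
    sel = (z > lo) & (z < hi)
    print("z in", (lo, hi), ":", err2[sel].mean())  # ~0.5 in every bin, independent of z
[/code]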
 
  • #6
kasraa said:
Part one:

The posterior [tex] p \left( x|Z \right) [/tex], has a mean and a (co)variance. Its mean is the MMSE estimator, [tex] E \left[ x|Z \right] [/tex], and its variance (or the trace of its covariance matrix, if it's a random vector) is the minimum mean squared error. Am I right?
Thanks.

I don't think so. For a random vector of observations, the MMSE for the posterior estimate is the minimized trace of the covariance matrix. This is consistent with the discussion in the link I provided. As for the rest, I'm not following you; I don't understand why you're double conditioning on Z, for instance. Someone else will have to try to help you.
 
  • #7
SW VandeCarr said:
I don't think so. For a random vector of observations, the MMSE for the posterior estimate is the minimized trace of the covariance matrix. This is consistent with the discussion in the link I provided. As for the rest, I'm not following you; I don't understand why you're double conditioning on Z, for instance. Someone else will have to try to help you.

I believe that when [tex] x [/tex] and [tex] Z [/tex] are jointly Gaussian, the covariance matrix of [tex] p \left( x | Z \right) [/tex] is
[tex] R_{XX}-R_{XZ}R_{ZZ}^{-1}R_{ZX} [/tex]
whose trace is the *minimum* MSE.
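That formula is easy to check numerically; a sketch (assumed covariance numbers, NumPy) comparing the empirical MSE of the estimator [tex] R_{XZ}R_{ZZ}^{-1}Z [/tex] against the trace of [tex] R_{XX}-R_{XZ}R_{ZZ}^{-1}R_{ZX} [/tex]:

[code]
import numpy as np

rng = np.random.default_rng(2)
n = 400_000

# Assumed joint covariance of (x, Z): x and Z each 2-dimensional, zero mean
R = np.array([[2.0, 0.5, 0.5, 0.3],
              [0.5, 1.5, 0.2, 0.6],
              [0.5, 0.2, 1.0, 0.1],
              [0.3, 0.6, 0.1, 1.2]])
Rxx, Rxz = R[:2, :2], R[:2, 2:]
Rzx, Rzz = R[2:, :2], R[2:, 2:]

samples = rng.multivariate_normal(np.zeros(4), R, n)
x, Z = samples[:, :2], samples[:, 2:]

# MMSE estimator for zero-mean jointly Gaussian variables: x_hat = R_xz R_zz^{-1} Z
W = Rxz @ np.linalg.inv(Rzz)
x_hat = Z @ W.T

mse_empirical = np.mean(np.sum((x - x_hat) ** 2, axis=1))
mse_formula = np.trace(Rxx - Rxz @ np.linalg.inv(Rzz) @ Rzx)

print(mse_empirical, mse_formula)  # agree up to Monte Carlo error
[/code]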

I believe the minimization took place when you selected [tex] E \left[ x|Z \right] [/tex] as your estimator.


About double conditioning, that's the part I do not fully understand either. But you can find it in many references. For example: "Estimation with Applications to Tracking and Navigation" by Bar-Shalom.

http://books.google.com/books?id=xz...DNS5bQDp6QkASajLGgBw&cd=1#v=onepage&q&f=false

See the bottom of page 204, for example. (There are plenty of these in this book, and in others; I just found one that is included in Google's preview.)


Thanks again.

Any other ideas?
 
  • #8
Not really. I was thinking of the discussion re the Kalman filter where the trace is minimized using the Kalman gain [tex]K_{k}[/tex] and setting:

[tex]\frac{\partial tr(P_{k|k})}{\partial K_{k}}= 0 [/tex]
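A quick way to see that in code (a scalar sketch with assumed numbers; the update below uses the Joseph form, which is valid for any gain): scan candidate gains and check that the posterior variance is minimized at the textbook Kalman gain.

[code]
import numpy as np

# Assumed scalar numbers: predicted covariance P_pred, measurement z = H x + v, v ~ N(0, R)
P_pred, H, R = 2.0, 1.0, 0.5

def P_post(K):
    # Posterior covariance for an arbitrary gain K (Joseph form)
    return (1 - K * H) ** 2 * P_pred + K ** 2 * R

# Scan candidate gains for the minimizer of the posterior variance (the trace, in the scalar case)
Ks = np.linspace(0.0, 1.0, 10_001)
K_best = Ks[np.argmin(P_post(Ks))]

# Textbook Kalman gain: K = P_pred H (H P_pred H + R)^{-1}
K_kalman = P_pred * H / (H * P_pred * H + R)

print(K_best, K_kalman)   # both ~0.8
print(P_post(K_kalman))   # equals (1 - K H) P_pred = 0.4 at the optimum
[/code]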
 
  • #9
SW VandeCarr said:
Not really. I was thinking of the discussion re the Kalman filter where the trace is minimized using the Kalman gain [tex]K_{k}[/tex] and setting:

[tex]\frac{\partial tr(P_{k|k})}{\partial K_{k}}= 0 [/tex]
Sorry, but I can't quite follow your last post (I don't get the phrasing: not minimizing the trace of the covariance matrix to find the Kalman gain ...).

What I understand is that the Kalman filter and MMSE estimation are related (in fact, I think the Kalman filter is the MMSE estimator for Gaussian variables, or the linear MMSE estimator without the Gaussian assumption, given linear state (process) and observation equations (models)). Did you see the book?
 
  • #11
kasraa said:
Sorry, but I can't quite follow your last post (I don't get the phrasing: not minimizing the trace of the covariance matrix to find the Kalman gain ...).

What I understand is that the Kalman filter and MMSE estimation are related (in fact, I think the Kalman filter is the MMSE estimator for Gaussian variables, or the linear MMSE estimator without the Gaussian assumption, given linear state (process) and observation equations (models)). Did you see the book?

Yes. There's a lot there to look at. Thanks.

If you go back to the wiki article and go down to "Kalman gain derivation" you'll see the equation I wrote. This is how the author suggests minimizing the trace of [tex]P_{k|k}[/tex] (posterior estimate covariance matrix).

http://en.wikipedia.org/wiki/Kalman_filter

And yes, the Kalman filter is an MMSE estimator.
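For reference, a minimal one-step predict/update sketch (a 1-D model with assumed numbers); under the linear-Gaussian assumptions, x_post below is the posterior mean E[x_k | z_1..z_k] and P_post is the conditional covariance P_{k|k}:

[code]
# Assumed 1-D constant-position model: x_k = x_{k-1} + w_k, z_k = x_k + v_k
F, H = 1.0, 1.0
Q, R = 0.01, 0.25          # process and measurement noise variances

x_post, P_post = 0.0, 1.0  # initial posterior mean and covariance

for z in [0.9, 1.1, 1.0]:  # a few made-up measurements
    # Predict
    x_pred = F * x_post
    P_pred = F * P_post * F + Q
    # Update (K is the Kalman/MMSE gain)
    K = P_pred * H / (H * P_pred * H + R)
    x_post = x_pred + K * (z - H * x_pred)
    P_post = (1 - K * H) * P_pred
    print(x_post, P_post)  # posterior mean and conditional covariance after each measurement
[/code]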
 
  • #12
So you're confused about conditional/unconditional MSE too (just like me), right? :D
 
  • #13
kasraa said:
So you're confused about conditional/unconditional MSE too (just like me), right? :D

I didn't think so, but maybe I am. Using your notation, p(Z) is the unconditional probability density and p(Z|x) is the likelihood function. The joint density is p(Z|x)p(x), and the conditional density is p(x|Z), which we obtain from p(x|Z)=p(Z|x)p(x)/p(Z). What's wrong with this?

EDIT: If you're reading this in your email, go to the forum; the post has been edited. The calculation in the Wiki link is specific to the Kalman filter.
 
  • #14
In my notation, [tex] X [/tex] is the RV we're trying to estimate, so the prior (the unconditional pdf, which, in the case of the Kalman filter, is our estimate from the previous step) is [tex] p(x) [/tex].

Actually, nothing is wrong with it (using Bayes' theorem to reach the posterior). I believe I explained my confusion clearly, especially in post #5.

What do you think about my statement at the end of that post? Is it true?

Thanks a lot.

BTW, does anyone else have any ideas about our discussion?
 
  • #15
kasraa said:
Does it also imply:
[tex] E \left[ \left( x - E \left[ x|Z \right] \right) ^2 | Z \right] = E \left[ \left( x - E \left[ x|Z \right] \right)^2 \right] [/tex]?
Is it correct?

Thanks.

As I said, I don't know what the double conditioning on Z means. I can only guess that it might mean something like [tex]P_{k|k}[/tex], which indicates the successor state to [tex]P_{k|k-1}[/tex]. If so, you need to introduce a system of subscripts.

Also, I don't see any problem in getting the MSE from any sample vector. It's the MMSE that can be a challenge.
 

1. What is the difference between conditional and unconditional MSE in MMSE estimation?

Conditional MSE refers to the average squared error when estimating a random variable with knowledge of other random variables. Unconditional MSE, on the other hand, refers to the average squared error when estimating a random variable without any additional knowledge.

2. How is conditional MSE calculated in MMSE estimation?

Conditional MSE is calculated by taking the average squared difference between the actual value and the estimated value, given the knowledge of other random variables. In MMSE estimation, this is done by minimizing the conditional mean squared error.

3. What is the importance of conditional and unconditional MSE in MMSE estimation?

Conditional and unconditional MSE are important metrics in MMSE estimation because they measure the accuracy of the estimated values. These values can help determine the effectiveness of the estimation process and can be used to compare different estimation methods.

4. How does the use of conditional and unconditional MSE affect the MMSE estimation process?

The use of conditional and unconditional MSE can affect the MMSE estimation process by influencing the choice of estimation method. For example, if the goal is to minimize the conditional MSE, then the MMSE estimation method will be different from when the goal is to minimize the unconditional MSE.

5. Are there any limitations to using conditional and unconditional MSE in MMSE estimation?

Yes, there are limitations. The MSE criterion weights all errors quadratically, so it can be a poor summary for heavy-tailed or strongly non-Gaussian errors, and the convenient linear (closed-form) MMSE estimator coincides with the true MMSE estimator only in the jointly Gaussian case. Additionally, MSE does not take into account the cost or consequences of particular estimation errors, which may be important in certain applications.
