Rearranging a formula (Multivariate Gaussian function)

Homework Help Overview

The discussion revolves around rearranging the multivariate Gaussian likelihood function, specifically the expression p(y|θ). Participants are attempting to manipulate the equation to match a specified form involving a constant and an exponential function of a quadratic expression in θ.

Discussion Character

  • Exploratory, Mathematical reasoning, Problem interpretation

Approaches and Questions Raised

  • Participants have expanded the original function and proposed equivalences involving constants and matrix forms. They are exploring how to match coefficients in the expanded expressions and questioning the sufficiency of terms to derive certain components.

Discussion Status

There are multiple lines of reasoning being explored, with some participants providing insights into the relationships between the variables and matrices involved. However, there is no explicit consensus on the next steps or the resolution of the problem, as participants continue to seek clarification and additional leads.

Contextual Notes

Participants note constraints such as the non-invertibility of certain matrices and the need to reconcile different forms of the equations. There are also references to specific equations and conditions that must hold true for the rearrangement to be valid.

Pi-Bond

Homework Statement


See image, p(y|θ) is the Likelihood function which has to be rearranged in the form of equation (3). θ is a vector variable.
[Attached image: 1zobx9u.png, showing the equations referred to as (2), (3) and (5)]


Homework Equations


None?

The Attempt at a Solution


I first expanded the exponent in the original function, equation (2).

(b-A\theta)^T(b-A\theta)=b^Tb - b^TA\theta - \theta^T A^T b + \theta^T A^T A \theta

Now suppose I can write the function equivalently as

C \exp\left( -\frac{1}{2} (\theta - \theta_0)^T L (\theta - \theta_0) \right)

where C represents the same constant multiplying with exp in equation (2). In this case, the exponential parts must be the same. So:

b^Tb - b^TA\theta - \theta^T A^T b + \theta^T A^T A \theta = \theta^TL\theta - \theta^TL\theta_0 - \theta_0^TL\theta + \theta_0^TL\theta_0

If this expression is true for all ##\theta##, then the coefficients of ##(\dots)\theta##, ##\theta^T(\dots)##, ##\theta^T(\dots)\theta## and the constants must all match.

So ##L = A^TA## and ##\theta_0 = L^{-1}A^Tb##.
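A quick numerical sanity check of this coefficient matching, as a sketch only: the matrix sizes, the random seed and the variable names are assumptions for illustration and do not come from the problem statement. It checks that the linear and quadratic terms agree, and computes the gap between the two constant terms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: A is tall (more observations than parameters).
A = rng.normal(size=(6, 3))
b = rng.normal(size=6)

L = A.T @ A                            # coefficient of theta^T (...) theta
theta0 = np.linalg.solve(L, A.T @ b)   # solves L theta0 = A^T b

theta = rng.normal(size=3)             # arbitrary test point

# Linear terms on each side of the matched expansion
lhs_linear = b @ A @ theta + theta @ A.T @ b
rhs_linear = theta0 @ L @ theta + theta @ L @ theta0

# Quadratic terms on each side
lhs_quad = theta @ A.T @ A @ theta
rhs_quad = theta @ L @ theta

# Constant terms: b^T b versus theta0^T L theta0
const_gap = b @ b - theta0 @ L @ theta0

print(np.isclose(lhs_linear, rhs_linear), np.isclose(lhs_quad, rhs_quad))
print(const_gap)
```

With a non-square A the constant gap is generally not zero, which suggests the two exponents can only agree up to a θ-independent constant rather than exactly.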

Now I don't know how to get ##\mathcal{L}_0## (equation (5)). I only have this condition left from my assumption about the constants above:

b^Tb = \theta_0^T L \theta_0

There just don't seem to be enough terms in either exponent to allow the exponential part of ##\mathcal{L}_0##. I don't think I can add and subtract anything either...

Any ideas?
 
Try to show that
(b-A\theta)^T(b-A\theta) = (\theta - \theta_0)^T L (\theta - \theta_0) + K for some matrix L and some constant K, and for ##\theta_0## as given in the question.
 
Ok.

(b-A\theta)^T(b-A\theta)=b^Tb - b^TA\theta - \theta^T A^T b + \theta^T A^T A \theta
= b^Tb - (L L^{-1}A^T b)^T\theta - \theta^T L( L^{-1}A^T b) + \theta^T L \theta
= b^Tb - \theta_0^T L \theta - \theta^T L \theta_0 + \theta^T L \theta
= b^Tb + (\theta-\theta_0)^T L \theta - \theta^T L \theta_0
= b^Tb + (\theta-\theta_0)^T L \theta - (\theta-\theta_0)^T L \theta_0 -\theta_0^T L \theta_0
= b^Tb + (\theta-\theta_0)^T L (\theta-\theta_0)-\theta_0^T L \theta_0

I used the fact that ##L^T = L = A^TA##.

On the basis of this it seems ##K = b^Tb - \theta_0^T L \theta_0##.

I still can't see the origin of ##\mathcal{L}_0## here though...
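One possible way to connect K to the exponential, sketched under an assumption: the image is not reproduced here, so this takes the exponent in equation (5) to be built from ##(b - A\theta_0)^T(b - A\theta_0)##, as a later reply suggests. Using ##A^Tb = L\theta_0##, so that ##b^TA\theta_0 = \theta_0^TL\theta_0##:

```latex
\begin{aligned}
(b - A\theta_0)^T (b - A\theta_0)
  &= b^T b - b^T A\theta_0 - \theta_0^T A^T b + \theta_0^T A^T A \theta_0 \\
  &= b^T b - 2\,\theta_0^T L \theta_0 + \theta_0^T L \theta_0 \\
  &= b^T b - \theta_0^T L \theta_0 = K
\end{aligned}
```

If that reading is right, ##\exp(-K/2)## is a factor that does not depend on ##\theta## and can be grouped with the constant C.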
 
Bump. I haven't been able to find any leads. Anyone have an idea?
 
I'm somewhat confused by the fact that if L = A^TA and \theta_0 = L^{-1}A^Tb then L^{-1} = A^{-1}(A^T)^{-1} so that
A\theta_0 = AL^{-1}A^Tb = A(A^{-1}(A^T)^{-1})A^Tb = b
which means that b - A\theta_0 = b - b = 0. So I'm at a loss to explain why the author has included the exponential in \mathcal{L}_0, when its value appears to be \exp(0) = 1.

On the other hand, I was able to show (using the additional fact that L is symmetric so that (L^T)^{-1} = (L^{-1})^T = L^{-1}) that
(\theta - \theta_0)^TL(\theta - \theta_0) = (A\theta - b)^T(A\theta - b)
= (b - A\theta)^T(b - A\theta)
It's just a case of expanding the left hand side, substituting the definitions of L and \theta_0 and simplifying.
 
pasmith said:
I'm somewhat confused by the fact that if L = A^TA and \theta_0 = L^{-1}A^Tb then L^{-1} = A^{-1}(A^T)^{-1}
Ignore that: A is not square, so cannot be invertible.
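To illustrate the point about a non-square A, here is a minimal sketch with made-up data (the shapes, seed and names are assumptions, not from the problem). For a tall A with full column rank, ##L = A^TA## is still invertible, but ##A\theta_0## is generally not equal to b, so the exponent built from ##b - A\theta_0## does not vanish.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical tall matrix: 5 observations, 2 parameters (not square).
A = rng.normal(size=(5, 2))
b = rng.normal(size=5)

L = A.T @ A                            # 2x2, invertible when A has full column rank
theta0 = np.linalg.solve(L, A.T @ b)   # theta_0 = L^{-1} A^T b

residual = b - A @ theta0              # generally nonzero when A is not square
exponent = residual @ residual         # (b - A theta_0)^T (b - A theta_0)

# theta0 coincides with the least-squares solution of A theta ~ b
theta_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

print(exponent)                        # positive here, so exp(-exponent / 2) < 1
print(np.allclose(theta0, theta_ls))
```

The residual is orthogonal to the columns of A (that is, ##A^T(b - A\theta_0) = 0##), which is the property that makes the cross terms cancel in the expansion.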

The aim is to show that (b - A\theta)^T(b - A\theta) = (b - A\theta_0)^T(b - A\theta_0) + (\theta - \theta_0)^TL(\theta - \theta_0)

Now the left hand side is
b^Tb - b^TA\theta - \theta^TA^Tb + \theta^TA^TA\theta

The right hand side is
b^Tb - b^TA\theta_0 - \theta_0^TA^Tb + \theta_0^TA^TA\theta_0 + \theta^TL\theta - \theta^TL\theta_0 - \theta_0^TL\theta + \theta_0^TL\theta_0
= b^Tb - (b^TA\theta_0 + \theta_0^TA^Tb) + 2\theta_0^TL\theta_0 + \theta^TL\theta - \theta^TL\theta_0 - \theta_0^TL\theta
= b^Tb - 2b^TAL^{-1}A^Tb + 2b^TAL^{-1}A^Tb + \theta^TA^TA\theta - \theta^TA^Tb - b^TA\theta
= b^Tb - b^TA\theta - \theta^TA^Tb + \theta^TA^TA\theta
as required.
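For anyone who wants to check the decomposition numerically before working through the algebra, a small sketch follows. It tests (b - A\theta)^T(b - A\theta) = (b - A\theta_0)^T(b - A\theta_0) + (\theta - \theta_0)^TL(\theta - \theta_0) with ##L = A^TA## and ##\theta_0 = L^{-1}A^Tb## at random points; the data, shapes and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical problem data: 7 observations, 3 parameters.
A = rng.normal(size=(7, 3))
b = rng.normal(size=7)

L = A.T @ A
theta0 = np.linalg.solve(L, A.T @ b)

def lhs(theta):
    """(b - A theta)^T (b - A theta)"""
    r = b - A @ theta
    return r @ r

def rhs(theta):
    """(b - A theta0)^T (b - A theta0) + (theta - theta0)^T L (theta - theta0)"""
    r0 = b - A @ theta0
    d = theta - theta0
    return r0 @ r0 + d @ L @ d

# Compare both sides at 100 random values of theta.
thetas = rng.normal(size=(100, 3))
max_gap = max(abs(lhs(t) - rhs(t)) for t in thetas)

print(max_gap)  # expected to be at floating-point rounding level
```

A numerical check like this is not a proof, but it is a cheap way to catch a sign or factor error before redoing the expansion by hand.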
 
Edit: there seems to be a mistake between the second and third lines. You go from ##-(b^TA\theta_0 + \theta_0^TA^Tb)## to ##-2b^T(\dots) + 2b^T(\dots)##.

The plus sign should be minus, and I'm not sure of where the factor of 2 comes from. Anyway I will investigate your approach.
 